PDF Genie

Tagged PDF

A PDF that includes hidden structural tags — headings, paragraphs, lists, tables — that describe the content's logical hierarchy for screen readers and data extraction.

A tagged PDF contains an invisible structure tree that labels each piece of content with its semantic role: this text is a heading, this block is a paragraph, these cells form a table, this image has this alt text. Screen readers use the tag tree to navigate and read the document intelligently for users with visual impairments.
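As a rough illustration, the structure tree can be modeled as nested nodes, each carrying a role plus optional text or alt text. The sketch below is a simplified in-memory model, not a PDF parser: the `Tag` class and the `walk` and `outline` functions are hypothetical names invented for this example, and the sample document is made up. The role names (`H1`, `P`, `Figure`) follow the standard PDF structure types.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Tag:
    """One node in a simplified, hypothetical model of a PDF structure tree."""
    role: str                      # semantic role, e.g. "H1", "P", "Figure"
    text: str = ""                 # text content, if any
    alt: Optional[str] = None      # alternate text, typically for figures
    children: List["Tag"] = field(default_factory=list)


def walk(tag: Tag) -> Iterator[Tag]:
    """Yield every node in logical reading order (depth-first, pre-order)."""
    yield tag
    for child in tag.children:
        yield from walk(child)


def outline(root: Tag) -> List[str]:
    """Collect heading text, roughly how a screen reader builds a heading list."""
    headings = {"H1", "H2", "H3", "H4", "H5", "H6"}
    return [t.text for t in walk(root) if t.role in headings]


# Made-up sample document for illustration only.
doc = Tag("Document", children=[
    Tag("H1", "Annual Report"),
    Tag("P", "Revenue grew this year."),
    Tag("Figure", alt="Bar chart of revenue by quarter"),
    Tag("H2", "Outlook"),
    Tag("P", "We expect further growth."),
])

print(outline(doc))
```

Because each node states its role, a reader can jump between headings or announce a figure's alt text without guessing from the visual layout.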

What tagging enables

  • Screen reader accessibility — JAWS, NVDA, and VoiceOver can announce headings, navigate by chapter, and read tables cell-by-cell using the tag structure
  • Reflowable content — a tagged PDF can be reflowed for small screens (phones, tablets) because the reader knows which text is body content and which is decorative
  • Data extraction — tools that extract tables from PDFs have a much easier time when the PDF is tagged with proper `<Table>`, `<TR>`, and `<TD>` structures
  • Accessibility compliance — meeting WCAG, Section 508 (USA), and EN 301 549 (EU) in practice requires a tagged structure for PDFs published for public use
Untagged PDFs

Most PDFs produced by print-to-PDF drivers, generic PDF exports, or older Word-to-PDF conversions are not tagged. The visual content is there, but the semantic structure is missing. Screen readers can still read them, but only as a raw text stream with no notion of headings or tables, which makes listening slow and loses information.
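One quick way to tell the two kinds of file apart is to look at the document catalog. In the PDF specification, a tagged PDF has a `/MarkInfo` dictionary with `/Marked` set to true and a `/StructTreeRoot` entry. The sketch below assumes the catalog has already been read into a plain Python dictionary; how to get it out of a real file depends on your PDF library and is not shown. The function name `is_tagged` and both sample catalogs are hypothetical.

```python
from typing import Any, Mapping


def is_tagged(catalog: Mapping[str, Any]) -> bool:
    """Return True if a document catalog declares itself as a tagged PDF.

    Checks two catalog entries: /MarkInfo with /Marked set to true,
    and the presence of a /StructTreeRoot entry.
    """
    mark_info = catalog.get("/MarkInfo") or {}
    marked = mark_info.get("/Marked", False) is True
    has_tree = catalog.get("/StructTreeRoot") is not None
    return marked and has_tree


# Hypothetical catalogs, written as plain dicts for illustration.
tagged_catalog = {
    "/Type": "/Catalog",
    "/MarkInfo": {"/Marked": True},
    "/StructTreeRoot": {"/Type": "/StructTreeRoot"},
}
printed_catalog = {"/Type": "/Catalog"}  # typical print-to-PDF output

print(is_tagged(tagged_catalog))
print(is_tagged(printed_catalog))
```

A positive result only means the file claims to be tagged. It says nothing about whether the tags are correct or complete, which still needs an accessibility checker or a manual review.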

Making a PDF accessible

Converting an untagged PDF to a properly tagged one is non-trivial: the structure must be added manually or inferred with AI tools. Adobe Acrobat Pro has an automated tagging feature; dedicated accessibility remediation tools do more careful work. Producing a fully tagged, WCAG-compliant PDF is a meaningful investment.

Related tools