PDF Glossary
Plain-language definitions of the PDF terms you encounter when working with digital documents. Click any term to see a longer explanation with examples and related tools.
A
A4, Letter, Legal (page sizes)
The three most common PDF page sizes: A4 (210 × 297 mm, the ISO 216 standard used in most of the world), Letter (8.5 × 11 in, the US default), and Legal (8.5 × 14 in, a longer US document size).
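PDF page dimensions are expressed in points, where the default unit is 1/72 inch. The sketch below converts the three sizes to points; the names `PAGE_SIZES`, `mm_to_points`, and `inches_to_points` are illustrative and not taken from any particular library.

```python
# Convert common page sizes to PDF points (default unit: 1 point = 1/72 inch).
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72


def mm_to_points(mm: float) -> float:
    """Millimetres to PDF points."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


def inches_to_points(inches: float) -> float:
    """Inches to PDF points."""
    return inches * POINTS_PER_INCH


# (width, height) in points, portrait orientation.
PAGE_SIZES = {
    "A4": (mm_to_points(210), mm_to_points(297)),
    "Letter": (inches_to_points(8.5), inches_to_points(11)),
    "Legal": (inches_to_points(8.5), inches_to_points(14)),
}

for name, (width, height) in PAGE_SIZES.items():
    print(f"{name}: {width:.2f} x {height:.2f} pt")
# A4: 595.28 x 841.89 pt
# Letter: 612.00 x 792.00 pt
# Legal: 612.00 x 1008.00 pt
```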
AES Encryption
The Advanced Encryption Standard — the cipher used to protect modern password-protected PDFs. AES-256 is the strongest commonly deployed version.
Annotation (PDF)
A note, highlight, shape, or comment added on top of a PDF without modifying the underlying content — used for review, markup, and collaboration.
B
Bates Numbering
Sequential identification numbers stamped onto each page of a PDF, commonly used in legal discovery to give every page a unique reference.
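A minimal sketch of how such labels might be generated, assuming a fixed prefix and zero-padded counter. The function name `bates_labels`, the prefix `ABC`, and the six-digit width are hypothetical choices for illustration; real Bates schemes vary by firm and case.

```python
def bates_labels(page_count: int, prefix: str = "ABC",
                 start: int = 1, digits: int = 6) -> list[str]:
    """Return one sequential, zero-padded label per page."""
    return [f"{prefix}{n:0{digits}d}" for n in range(start, start + page_count)]


print(bates_labels(3))
# ['ABC000001', 'ABC000002', 'ABC000003']

# A second document in the same set continues where the first stopped.
print(bates_labels(2, start=4))
# ['ABC000004', 'ABC000005']
```

Stamping the label onto each page is a separate step that depends on the PDF library in use.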
Bookmarks (PDF)
A navigable outline inside a PDF — the table of contents that appears in the sidebar of most PDF readers, letting you jump to specific sections.
C
Compression (PDF)
Techniques for reducing PDF file size — by re-encoding images at lower quality, removing unused objects, or both.
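Alongside the lossy image re-encoding mentioned above, PDF streams are commonly stored with lossless Flate (zlib/deflate) compression. The sketch below uses Python's standard `zlib` module on a made-up, repetitive content stream to show the effect; it illustrates lossless stream compression only, not image recompression or object removal.

```python
import zlib

# A hypothetical page content stream: the same text-drawing operators repeated.
content_stream = b"BT /F1 12 Tf 72 720 Td (Hello, PDF) Tj ET\n" * 200

compressed = zlib.compress(content_stream, 9)  # 9 = maximum compression level
restored = zlib.decompress(compressed)

print(f"original:   {len(content_stream)} bytes")
print(f"compressed: {len(compressed)} bytes")

# Flate is lossless: decompressing yields the exact original bytes.
assert restored == content_stream
```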
CropBox & MediaBox
Two of the PDF boxes that define what part of each page is visible and printable — the MediaBox is the physical page, the CropBox is the visible window.
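A minimal sketch of how the two boxes relate, assuming the behaviour described in the PDF specification: a missing CropBox defaults to the MediaBox, and a CropBox that extends past the MediaBox is reduced to their intersection. `Box` and `effective_crop_box` are illustrative names, not a library API.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """A PDF rectangle in points: lower-left and upper-right corners."""
    left: float
    bottom: float
    right: float
    top: float


def effective_crop_box(media_box: Box, crop_box: Optional[Box] = None) -> Box:
    """Return the visible region of a page."""
    if crop_box is None:
        # No CropBox on the page: the whole MediaBox is visible.
        return media_box
    # Clip the CropBox to the MediaBox (rectangle intersection).
    return Box(
        max(media_box.left, crop_box.left),
        max(media_box.bottom, crop_box.bottom),
        min(media_box.right, crop_box.right),
        min(media_box.top, crop_box.top),
    )


letter = Box(0, 0, 612, 792)
print(effective_crop_box(letter))
# Box(left=0, bottom=0, right=612, top=792)
print(effective_crop_box(letter, Box(36, 36, 700, 756)))
# Box(left=36, bottom=36, right=612, top=756)
```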
F
Flattening (PDF)
Permanently merging interactive elements — form fields, annotations, signatures — into the static page content of a PDF so they can no longer be edited.
Font Embedding
Packaging the fonts used in a PDF inside the file itself, so the document renders correctly on any device even if those fonts aren't installed.
Form Fields (PDF)
Interactive input areas in a PDF — text boxes, checkboxes, dropdowns, and signature fields — that users can fill in electronically.
L
Layer (PDF)
An optional content group in a PDF — text, images, or annotations organized into independent layers that can be shown, hidden, or printed separately.
LibreOffice
A free, open-source office suite (Writer, Calc, Impress) that can convert between Word, Excel, PowerPoint and PDF — widely used as a server-side document conversion engine.
P
PDF
The Portable Document Format — a file format that preserves layout, fonts, and graphics so a document looks identical on every device.
PDF Encryption
Protecting a PDF with a password, typically using AES-128 or AES-256 encryption, so the file cannot be opened without that password. A separate owner (permissions) password can restrict editing, printing, and copying.
PDF File Size
The byte count of a PDF file — driven mostly by embedded images and fonts. Reducing file size usually means compressing images, not text.
pdf-lib
A pure-JavaScript library for creating and modifying PDF documents. It runs in any JavaScript environment, including entirely in the browser with no server round-trip.
PDF.js
Mozilla's open-source JavaScript PDF renderer: the engine behind Firefox's built-in PDF viewer and many browser-based PDF preview tools.
PDF/A
An ISO-standardized variant of PDF (ISO 19005) designed for long-term archiving: fonts must be embedded, external dependencies are disallowed, and the file is intended to render identically decades from now.
R
Rasterization
Converting vector content (text, shapes, lines) into a grid of pixels — a raster image. Often a deliberate step for redaction or compression; sometimes an unwanted side effect.
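The pixel grid produced by rasterization depends on the chosen resolution. The sketch below estimates the pixel dimensions and uncompressed RGB size of a page at a given DPI, assuming the default PDF unit of 1/72 inch; `raster_size` and `raw_rgb_bytes` are hypothetical helper names.

```python
def raster_size(width_pt: float, height_pt: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a page rendered at the given DPI (72 points per inch)."""
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


def raw_rgb_bytes(width_px: int, height_px: int) -> int:
    """Uncompressed size of an 8-bit RGB image: 3 bytes per pixel."""
    return width_px * height_px * 3


# A Letter page (612 x 792 pt) rendered at 300 DPI.
width_px, height_px = raster_size(612, 792, 300)
print(width_px, height_px)                 # 2550 3300
print(raw_rgb_bytes(width_px, height_px))  # 25245000
```

This is why rasterizing a text-only page can make a file larger, not smaller, unless the resulting image is compressed aggressively.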
Redaction
Permanently removing sensitive text or images from a PDF so they cannot be recovered by copying, searching, or inspecting the file.
T
Tagged PDF
A PDF that includes hidden structural tags — headings, paragraphs, lists, tables — that describe the content's logical hierarchy for screen readers and data extraction.
Tesseract
The most widely used open-source OCR (optical character recognition) engine. It was originally developed at HP, later sponsored by Google, and is now maintained as a community open-source project.