
Document Processing Glossary

Decipher the technical jargon. Understand the technology powering your workflows.

#OCR (Optical Character Recognition)

Technology that converts images of text, such as scanned paper documents, image-only PDF files, or photos taken with a digital camera, into editable and searchable text.

#PDF Flattening

The process of merging all layers of a PDF, including form fields and annotations, into a single static layer. This prevents further editing of the form data and helps the document look the same in every viewer and in print.

#Lossless Compression

A class of data compression algorithms that allows the original data to be perfectly reconstructed from the compressed data. It is a good fit for text documents, where every character must survive unchanged.
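To make "perfectly reconstructed" concrete, here is a minimal sketch using Python's standard `zlib` module. The sample text and the helper name `is_lossless_round_trip` are made up for illustration; PDF tools may use other lossless schemes.

```python
import zlib

# Hypothetical sample data: repetitive text compresses well.
original = ("Invoice 0001: total due 100.00 USD\n" * 50).encode("utf-8")

# Compress at the highest level, then decompress again.
compressed = zlib.compress(original, 9)
restored = zlib.decompress(compressed)


def is_lossless_round_trip(data: bytes) -> bool:
    """Return True if compressing and decompressing gives back the same bytes."""
    return zlib.decompress(zlib.compress(data)) == data


print("original bytes:  ", len(original))
print("compressed bytes:", len(compressed))
print("identical after round trip:", restored == original)
```

The key property is the last line: the restored bytes are identical to the input, not merely similar.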

#AES-256 Encryption

Advanced Encryption Standard with a 256-bit key length. It is a widely adopted standard for securing data and is used by governments and financial institutions to protect sensitive PDF documents.
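The "256" refers only to the size of the key. The sketch below generates a random key of that size with Python's standard `secrets` module. It does not perform any encryption (the standard library has no AES cipher, so a vetted cryptography library would be needed for that), and the names `AES_256_KEY_BYTES` and `generate_aes256_key` are illustrative.

```python
import secrets

# 256 bits divided by 8 bits per byte gives a 32-byte key.
AES_256_KEY_BYTES = 256 // 8


def generate_aes256_key() -> bytes:
    """Return a cryptographically random key of AES-256 size (illustrative helper)."""
    return secrets.token_bytes(AES_256_KEY_BYTES)


key = generate_aes256_key()
print("key length in bits:", len(key) * 8)
```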

#eSignature (Electronic Signature)

Data in electronic form that is attached to or logically associated with other data in electronic form and that the signatory uses to sign. In many jurisdictions it can be legally binding in a way similar to a handwritten signature.

#PDF/A

An ISO-standardized version of the Portable Document Format (PDF) specialized for use in the archiving and long-term preservation of electronic documents.

#Raster vs Vector

Raster images are made of pixels (like JPG), while vector images are described by mathematical formulas (like the fonts in a PDF). Vector content in a PDF can be scaled to any size without losing quality, whereas a raster image becomes blocky or blurry when enlarged.
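The difference can be sketched with two toy functions. This is an illustration only, assuming a shape stored as a list of points and an image stored as a grid of pixel values; both helper names are hypothetical and no real PDF or image library is involved.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def scale_vector(points: List[Point], factor: float) -> List[Point]:
    """Scale a vector shape by multiplying its coordinates; no detail is lost."""
    return [(x * factor, y * factor) for x, y in points]


def upscale_raster(pixels: List[List[int]], factor: int) -> List[List[int]]:
    """Nearest-neighbour upscaling: each pixel becomes a factor x factor block."""
    return [
        [value for value in row for _ in range(factor)]
        for row in pixels
        for _ in range(factor)
    ]


triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
checker = [[0, 1], [1, 0]]

# The vector shape is still three exact points after scaling.
print(scale_vector(triangle, 10))
# The raster grid only repeats existing pixels, which is why it looks blocky.
for row in upscale_raster(checker, 2):
    print(row)
```

Scaling the vector shape produces new exact coordinates, while enlarging the raster grid can only duplicate the pixels it already has.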

#Metadata

Data that provides information about other data. In PDFs, this includes the author, creation date, modification history, and software used to create the file.
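As a rough illustration, PDF metadata can be pictured as a small set of key/value pairs. The dictionary below is hypothetical sample data using the standard document information keys (`Author`, `Producer`, `CreationDate`, `ModDate`). The helper `parse_pdf_date` is a simplified sketch that assumes a full `D:YYYYMMDDHHmmSS` date and ignores any time zone suffix.

```python
from datetime import datetime

# Hypothetical metadata, shaped like a PDF document information dictionary.
info = {
    "Author": "Jane Example",
    "Producer": "Example PDF Library 1.0",
    "CreationDate": "D:20240115093000Z",
    "ModDate": "D:20240220161500Z",
}


def parse_pdf_date(value: str) -> datetime:
    """Parse the date part of a PDF date string (time zone suffix is ignored)."""
    digits = value[2:16] if value.startswith("D:") else value[:14]
    return datetime.strptime(digits, "%Y%m%d%H%M%S")


created = parse_pdf_date(info["CreationDate"])
modified = parse_pdf_date(info["ModDate"])
print("author:  ", info["Author"])
print("created: ", created.isoformat())
print("modified:", modified.isoformat())
```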

#Watermark

A recognizable image or pattern in paper or digital documents that appears as various shades of lightness/darkness. It is used to identify ownership or copyright.

#Redaction

The process of permanently removing visible text or graphics from a document. Unlike 'masking' (covering with a black box), redaction deletes the underlying data.
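The distinction can be shown with a toy page model. This is a sketch under simplifying assumptions, not how any particular PDF tool works: the `Page` class, `mask`, and `redact` are hypothetical, and a real page stores text and graphics in a far more complex form.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    """Toy page: extractable text runs plus drawings layered on top."""
    text_runs: List[str] = field(default_factory=list)
    overlays: List[str] = field(default_factory=list)

    def extract_text(self) -> str:
        # Copy/paste and search tools read text runs and ignore overlays.
        return " ".join(self.text_runs)


def mask(page: Page, secret: str) -> Page:
    """Cover the secret with a black box; the underlying text stays in the file."""
    return Page(list(page.text_runs), page.overlays + [f"black box over {secret!r}"])


def redact(page: Page, secret: str) -> Page:
    """Remove the secret's text run itself, so nothing is left to extract."""
    kept = [run for run in page.text_runs if run != secret]
    return Page(kept, list(page.overlays))


page = Page(text_runs=["Account", "holder:", "123-45-6789"])
masked = mask(page, "123-45-6789")
redacted = redact(page, "123-45-6789")

print("masked text:  ", masked.extract_text())
print("redacted text:", redacted.extract_text())
```

In this model the masked page still yields the hidden value through text extraction, while the redacted page does not.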

Don't see a term?

Our blog covers many of these topics in deeper detail.