Overview
Document Classification enables you to automatically organize and route documents based on their type and structure. This capability helps streamline document workflows by automatically identifying document categories and extracting specific sections from multi-part documents.
Document Classification is coming soon. This feature is currently in development and will be available in a future release. Contact [email protected] to be notified when Classification becomes available.
Classification Modes
Document Type Classification
Classify documents into predefined categories using custom rules (Coming Soon)
Document Splitting
Split multi-section documents into logical parts based on your rules (Coming Soon)
How It Works
Document Type Classification
Automatically categorize documents into predefined types based on rules you define.
1
Define Categories
Create a list of document types you need to identify (e.g., Invoice, Receipt, Contract, ID Card)
2
Configure Rules
Define classification rules that identify each document type based on content patterns, layout features, or specific text
3
Submit Document
Upload or reference a document for classification
4
Receive Classification
Get back the document type classification with confidence scores
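Because Classification has not been released, no API for it is published yet. The four steps above can still be illustrated locally with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Rule` and `Classification` types, the `classify` function, the regex-based rules, and the confidence score (the fraction of a rule's patterns that match) are hypothetical and do not describe the product's actual behavior.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical rule: a document type plus regex patterns that suggest it."""
    doc_type: str
    patterns: list


@dataclass
class Classification:
    doc_type: str        # best match, or "unclassified"
    confidence: float    # assumed score: share of the winning rule's patterns that matched
    alternatives: list = field(default_factory=list)  # other (type, score) pairs with a nonzero score


def classify(text, rules, threshold=0.5):
    """Score every rule against the text and return the best match above the threshold."""
    scores = []
    for rule in rules:
        if not rule.patterns:
            continue  # a rule without patterns can never match
        hits = sum(1 for p in rule.patterns if re.search(p, text, re.IGNORECASE))
        scores.append((rule.doc_type, hits / len(rule.patterns)))
    scores.sort(key=lambda s: s[1], reverse=True)
    if not scores or scores[0][1] < threshold:
        return Classification("unclassified", 0.0, [s for s in scores if s[1] > 0])
    best, rest = scores[0], scores[1:]
    return Classification(best[0], best[1], [s for s in rest if s[1] > 0])


# Steps 1 and 2: define categories and the rules that identify them.
RULES = [
    Rule("Invoice", [r"\binvoice\b", r"\bamount due\b", r"\binvoice (no|number)\b"]),
    Rule("Receipt", [r"\breceipt\b", r"\bpaid\b", r"\bchange\b"]),
    Rule("Contract", [r"\bagreement\b", r"\bparties\b", r"\bhereby\b"]),
]

# Steps 3 and 4: submit document text and read back the classification.
result = classify("INVOICE\nInvoice No: 1042\nAmount due: $310.00", RULES)
print(result.doc_type, result.confidence, result.alternatives)
```

In this sketch a document that falls below the threshold for every rule is returned as `unclassified`, mirroring the behavior described for documents that match no rules.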
Document Splitting
Split complex multi-section documents into separate logical parts.
1
Define Sections
Specify the section types you want to extract (e.g., Balance Sheet, Income Statement, Cash Flow Statement)
2
Create Splitting Rules
Define rules to identify section boundaries based on headers, page breaks, or content patterns
3
Submit Document
Upload the multi-section document for splitting
4
Receive Split Results
Get back page ranges for each identified section, with sections that don’t match any rules marked as unclassified
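The splitting flow can be sketched the same way. The example below is a local illustration only, not the product's API: `SECTION_RULES`, `split_document`, and the choice to detect boundaries with header regexes are all assumptions. It also assumes that a page without a recognized header continues the section before it, and that pages preceding the first recognized header are marked `unclassified`.

```python
import re

# Hypothetical rules: section label -> regex that marks the first page of that section.
SECTION_RULES = {
    "Balance Sheet": r"^\s*balance sheet\b",
    "Income Statement": r"^\s*income statement\b",
    "Cash Flow Statement": r"^\s*(statement of )?cash flows?\b",
}


def split_document(pages, rules):
    """Return (label, first_page, last_page) tuples with 1-based, inclusive page numbers."""
    sections = []
    for page_number, text in enumerate(pages, start=1):
        # Find the first rule whose header pattern appears at the start of a line.
        label = next(
            (name for name, pattern in rules.items()
             if re.search(pattern, text, re.IGNORECASE | re.MULTILINE)),
            None,
        )
        if label is None:
            # No header on this page: extend the current section if there is one.
            label = sections[-1][0] if sections else "unclassified"
        if sections and sections[-1][0] == label:
            sections[-1] = (label, sections[-1][1], page_number)
        else:
            sections.append((label, page_number, page_number))
    return sections


pages = [
    "Annual Report 2023\nLetter to shareholders",
    "Balance Sheet\nAssets ...",
    "Liabilities ...",
    "Income Statement\nRevenue ...",
    "Statement of Cash Flows\nOperating activities ...",
]

for label, first, last in split_document(pages, SECTION_RULES):
    print(f"{label}: pages {first}-{last}")
```

Returning page ranges rather than page contents keeps the result small and lets each section be processed independently afterwards.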
Use Cases
Invoice Processing
Classify incoming documents as Invoice, Purchase Order, Receipt, or Credit Note to route to appropriate accounting workflows
Identity Verification
Identify document types such as Passport, Driver’s License, or National ID to apply document-specific extraction templates
Financial Statements
Split comprehensive financial reports into Balance Sheet, Income Statement, and Cash Flow Statement for targeted analysis
Legal Contracts
Divide lengthy contracts into Preamble, Terms & Conditions, Signatures, and Appendices for efficient review
Healthcare Records
Sort medical documents into Lab Results, Prescriptions, Insurance Claims, and Referrals for proper patient record routing
Batch Scanning
Split bulk scanned documents into individual files based on separator pages or content detection
Key Capabilities
User-Defined Rules
Classification and splitting are based on rules you define, giving you precise control over how documents are categorized and divided. Rules can leverage:
- Content patterns and specific text markers
- Document layout and structure
- Header and section formatting
- Page boundaries and separators
Classification Output
When classifying documents, you’ll receive:
- Identified document type from your predefined categories
- Confidence score indicating classification certainty
- Alternative classifications if multiple types match
- Unclassified status for documents that don’t match any rules
Splitting Output
When splitting documents, you’ll receive:
- Page ranges for each identified section
- Section type labels based on your definitions
- Unmatched pages that don’t fit any section criteria
- Ability to process each section independently with targeted templates
Best Practices
Design Clear Classification Rules
When Classification becomes available, define clear, specific rules for each document type. Use distinctive features like headers, layouts, or specific text patterns to ensure accurate classification. Test rules with diverse document samples to handle variations in format and quality.
Plan Your Document Categories
Prepare your classification taxonomy now by identifying the document types you need to distinguish and the criteria that differentiate them. A well-organized category structure will help you hit the ground running when the feature launches.
Define Section Boundaries Carefully
For document splitting, establish clear rules for where sections begin and end. Consider using multiple indicators (headers, page numbers, content patterns) to improve split accuracy, especially for documents with varying formats.
Combine with Templates
Plan to use Classification together with Templates for powerful workflows. Once documents are classified or split, apply type-specific or section-specific extraction templates to extract relevant data from each part.
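As a planning aid, the routing step might look something like the sketch below. The template IDs, the `choose_template` function, and the 0.8 confidence cutoff are hypothetical placeholders, since neither the Classification output format nor its integration with Templates has been published.

```python
# Hypothetical mapping from a classification label to an extraction template ID.
TEMPLATES = {
    "Invoice": "tmpl_invoice",
    "Receipt": "tmpl_receipt",
    "Balance Sheet": "tmpl_balance_sheet",
}


def choose_template(doc_type, confidence, min_confidence=0.8):
    """Return a template ID, or None to signal that the document needs manual review."""
    if doc_type == "unclassified" or confidence < min_confidence:
        return None
    # Labels without a configured template also fall through to manual review.
    return TEMPLATES.get(doc_type)


print(choose_template("Invoice", 0.93))
print(choose_template("Invoice", 0.41))
print(choose_template("unclassified", 0.0))
```

Sending low-confidence and unclassified documents to manual review, rather than guessing a template, keeps extraction errors from propagating into downstream workflows.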