Processing Modes
Fintelite AI offers three primary processing modes to handle different use cases:
- Predict: Extract structured data using templates or schemas
- Parse: Extract full document text and layout information
- Fraud Detection: Analyze documents for authenticity and tampering
Predict Mode
Extract specific data fields from documents using AI-powered templates or custom schemas.
When to Use
- Extracting invoice details (line items, totals, dates)
- Processing ID cards and passports
- Parsing financial statements
- Any structured data extraction task
How It Works
1. Upload Document: Upload your document or reference an existing file by ID
2. Define Schema: Specify what data to extract using a JSON schema or template
3. AI Processing: The AI analyzes the document and extracts the requested fields
4. Get Results: Receive structured JSON with extracted data and confidence scores
For complete request/response examples and API details, see the Predict API Reference. To understand job statuses and response formats, see the Jobs Concept Guide.
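The steps above can be sketched in Python as follows. This is only an illustration: the field names in the schema, the payload keys `file_id` and `schema`, and the helper `build_predict_payload` are all assumptions made for the example, not documented names. The Predict API Reference remains the source of truth for the actual request format.

```python
import json

# Hypothetical invoice schema; field names are illustrative, not a documented template.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string", "description": "Invoice identifier as printed"},
        "invoice_date": {"type": "string", "description": "Issue date of the invoice"},
        "total": {"type": "number", "description": "Grand total including tax"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number"},
                },
            },
        },
    },
}


def build_predict_payload(file_id: str, schema: dict) -> str:
    """Serialize a request body. The keys "file_id" and "schema" are assumed names."""
    return json.dumps({"file_id": file_id, "schema": schema})


# "file_123" is a placeholder for the ID of a previously uploaded file.
payload = build_predict_payload("file_123", invoice_schema)
```

Giving each field a description mirrors the schema design advice later on this page: specific names and descriptions tell the model what to look for.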
Parse Mode
Extract complete document text with layout information, including blocks and markdown.
When to Use
- Full document text extraction
- Document analysis and search
- Building document databases
- RAG applications (document parsed into markdown)
- Pre-processing for another pipeline
Features
- Block Detection: Identifies headers, footers, tables, figures, and text blocks
- Markdown Output: Converts document to markdown format
- Multi-page Support: Handles documents with any number of pages
For complete request/response examples and API details, see the Parse API Reference. To understand job statuses and response formats, see the Jobs Concept Guide.
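For RAG or search use cases, the markdown output is typically split into smaller sections before indexing. The sketch below assumes only that Parse mode yields a markdown string; the helper `split_markdown_sections` and the sample text are hypothetical and not part of the Fintelite API.

```python
import re


def split_markdown_sections(markdown: str) -> list[dict]:
    """Split markdown into sections, starting a new one at each ATX heading (#, ##, ...)."""
    sections = []
    current = {"heading": None, "lines": []}
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            # Close the previous section if it has a heading or any content.
            if current["heading"] is not None or current["lines"]:
                sections.append(current)
            current = {"heading": match.group(2).strip(), "lines": []}
        else:
            current["lines"].append(line)
    if current["heading"] is not None or current["lines"]:
        sections.append(current)
    return [
        {"heading": s["heading"], "text": "\n".join(s["lines"]).strip()}
        for s in sections
    ]


# Made-up markdown standing in for a parsed document.
sample = "# Invoice\nAcme Corp\n\n## Items\n- Widget: 10.00\n"
chunks = split_markdown_sections(sample)
```

This simple splitter does not account for `#` characters inside fenced code blocks, so a production pipeline would likely need a proper markdown parser.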
Document Formats Supported
- PDF: Native PDF parsing and OCR for scanned PDFs
- Images: JPG, PNG, HEIC formats with OCR
- Multi-page TIFF: Process multi-page TIFF documents
- URLs: Process documents from public URLs
Synchronous vs Asynchronous
- Sync: Returns results immediately. Best for small documents such as ID cards, receipts, and single-page invoices
- Async: Returns a job ID; check the status later via /status/{job_id}. Best for large documents and batch processing
For job management details, see the Jobs Concept Guide.
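An asynchronous workflow usually involves polling the status endpoint until the job finishes. The following is a minimal sketch: `fetch_status` stands in for an HTTP GET on /status/{job_id}, and the `status` key with its `completed` and `failed` values are assumptions made for illustration. The real field names are described in the Jobs Concept Guide.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal state or the timeout elapses.

    The "status" key and the "completed"/"failed" values are assumed names.
    """
    waited = 0.0
    while True:
        job = fetch_status(job_id)
        if job.get("status") in ("completed", "failed"):
            return job
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} still pending after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s


# Usage with a stub in place of the real HTTP call:
responses = iter([{"status": "processing"}, {"status": "completed", "result": {}}])
job = wait_for_job(lambda _id: next(responses), "job_123", sleep=lambda _s: None)
```

Passing the status fetcher and the sleep function as parameters keeps the loop independent of any HTTP client and easy to test.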
File Input Methods
Fintelite AI supports three ways to provide document files:
1. File Upload: Upload a new file directly in the request.
2. File ID Reference: Reference a previously uploaded file.
3. URL Reference: Process a document from a public URL.
Configuration Options
Control document processing behavior with parser, chunking, and citation settings.
- Parser Configuration: Control parsing behavior and quality modes
- Chunking Strategies: Split large documents for better processing
- Citations: Track the source of extracted data
For all configuration options and examples, see the Configuration Overview.
Confidence Scores
Every extraction includes confidence scores on a 0-100 scale. The confidence structure mirrors the nested structure of your extraction data.
Fields with confidence below 80 are more likely to contain errors and should be manually reviewed.
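Because the confidence structure mirrors the extraction, a small recursive walk can collect every field that falls below the review threshold. The sketch below assumes the confidence object is a nested combination of dicts, lists, and numeric scores; the sample data and the helper `low_confidence_fields` are hypothetical.

```python
def low_confidence_fields(confidence, threshold=80, path=""):
    """Return (path, score) pairs for every leaf score below the threshold."""
    flagged = []
    if isinstance(confidence, dict):
        for key, value in confidence.items():
            child = f"{path}.{key}" if path else key
            flagged.extend(low_confidence_fields(value, threshold, child))
    elif isinstance(confidence, list):
        for index, value in enumerate(confidence):
            flagged.extend(low_confidence_fields(value, threshold, f"{path}[{index}]"))
    elif (
        isinstance(confidence, (int, float))
        and not isinstance(confidence, bool)
        and confidence < threshold
    ):
        flagged.append((path, confidence))
    return flagged


# Hypothetical confidence object mirroring an invoice extraction.
confidence = {
    "invoice_number": 97,
    "total": 64,
    "line_items": [{"description": 91, "amount": 72}],
}
review = low_confidence_fields(confidence)
# review == [("total", 64), ("line_items[0].amount", 72)]
```

The returned paths can be used to route individual fields, rather than whole documents, to manual review.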
Citations
Track which document blocks each extracted value came from for verification and debugging.
Key Points:
- Requires use_parser: true
- Disabled automatically in parser_mode: LITE
- Changes output format to include value and citations fields
- Citation format: b.X = block number, w.X = word number
For complete citation details, format, and usage examples, see the Citations Configuration.
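As a rough illustration of the citation format, the sketch below pulls block and word numbers out of a citation string. It assumes only that citations contain tokens of the form b.X and w.X as listed above; how tokens are combined in real responses is not covered here, so treat `parse_citation` as a hypothetical helper and check the Citations Configuration for the exact format.

```python
import re

# Matches tokens such as "b.3" (block 3) or "w.12" (word 12).
CITATION_TOKEN = re.compile(r"\b([bw])\.(\d+)\b")


def parse_citation(citation: str) -> dict:
    """Collect block (b.X) and word (w.X) numbers found in a citation string."""
    refs = {"blocks": [], "words": []}
    for kind, number in CITATION_TOKEN.findall(citation):
        refs["blocks" if kind == "b" else "words"].append(int(number))
    return refs


block_ref = parse_citation("b.3")
word_ref = parse_citation("w.12")
```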
Best Practices
Document Quality
- Use high-resolution scans (300 DPI minimum)
- Ensure good lighting and contrast
- Avoid skewed or rotated images
- Remove artifacts and noise
Schema Design
- Be specific with field names and descriptions
- Use appropriate data types (string, number, integer, boolean, array, object)
- Use enum for fields with predefined values
- Include nested structures for complex data (max 3 levels)
- Test with sample documents first
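The nesting guideline can be checked before a schema is submitted. The sketch below is one possible reading of "max 3 levels": it counts the root object as level 1 and does not count arrays as a level of their own. Both choices are assumptions, and the schema and the helper `nesting_depth` are illustrative only.

```python
def nesting_depth(schema: dict) -> int:
    """Count nested object levels in a JSON-schema-like dict (root object = 1)."""
    kind = schema.get("type")
    if kind == "array":
        # Arrays are treated as transparent; only their item type counts.
        return nesting_depth(schema.get("items", {}))
    if kind != "object":
        return 0
    children = schema.get("properties", {}).values()
    return 1 + max((nesting_depth(child) for child in children), default=0)


# Hypothetical schema using an enum and three levels of nesting.
schema = {
    "type": "object",
    "properties": {
        "status": {"type": "string", "enum": ["paid", "unpaid", "overdue"]},
        "customer": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "address": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                },
            },
        },
    },
}
depth = nesting_depth(schema)  # 3 under the counting convention above
```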
Performance Optimization
- Reuse uploaded files via file IDs
- Use async mode for large documents
- Implement webhooks instead of polling
- Cache frequently used templates
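Reusing file IDs can be handled on the client with a small cache keyed by content hash, so identical files are uploaded only once. In the sketch below, `upload` is a placeholder for whatever function performs the real upload and returns a file ID; the class `FileIdCache` is a hypothetical helper, not part of any Fintelite SDK.

```python
import hashlib
from typing import Callable


class FileIdCache:
    """Map file contents to previously returned file IDs."""

    def __init__(self, upload: Callable[[bytes], str]):
        self._upload = upload
        self._ids: dict[str, str] = {}

    def file_id_for(self, content: bytes) -> str:
        # Identical bytes produce the same digest, so the upload runs only once.
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self._ids:
            self._ids[digest] = self._upload(content)
        return self._ids[digest]


# Usage with a stub upload function that records each call:
uploads = []


def fake_upload(content: bytes) -> str:
    uploads.append(content)
    return f"file_{len(uploads)}"


cache = FileIdCache(fake_upload)
first = cache.file_id_for(b"invoice bytes")
second = cache.file_id_for(b"invoice bytes")
```

This cache lives in memory only; a long-running service would likely persist the mapping and account for any expiry of uploaded files.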
Error Handling
- Check confidence scores
- Implement retry logic for failures
- Validate extracted data
- Handle missing or null fields
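The retry and null-handling advice can be combined in a small wrapper. The sketch below retries a call with exponential backoff on transient network errors and then lists fields that came back empty. The choice of exception types, the delay values, and the result shape are assumptions for illustration; `with_retries` is a hypothetical helper.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on transient errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay_s * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


# Usage with a stub that fails twice before returning a result:
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return {"invoice_number": "INV-1", "total": None}


result = with_retries(flaky, sleep=lambda _s: None)
# Fields that came back null should be reviewed rather than silently used.
missing = [name for name, value in result.items() if value is None]
```

Only errors that are plausibly transient are retried here; validation failures or rejected requests should generally fail fast instead.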