
What are Jobs?

Jobs represent document processing tasks in Fintelite AI. Every processing request (predict, parse, or fraud detection) creates a job that tracks the status, results, and metadata of the operation.

Job Types

PREDICT

Data extraction jobs using templates or schemas

PARSE

Full document parsing and OCR jobs

FRAUD

Document fraud detection and analysis jobs

Job Lifecycle

  1. IN_QUEUE: Job is queued and waiting to be processed by a worker
  2. IN_PROGRESS: Worker has picked up the job and is actively processing it
  3. COMPLETED: Job finished successfully with results available
  4. FAILED: Job encountered an error during processing
  5. CANCELLED: Job was manually cancelled before completion
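Client code often needs to know when a job has stopped changing state. The following is a minimal sketch that treats COMPLETED, FAILED, and CANCELLED as final; that grouping and the helper name isTerminalStatus are assumptions for illustration and are not part of the API.

```javascript
// Status names come from the lifecycle list; treating these three as final
// is an assumption (a retried job may start a new lifecycle).
const TERMINAL_STATUSES = new Set(['COMPLETED', 'FAILED', 'CANCELLED']);

// Returns true once a job is not expected to change state again.
function isTerminalStatus(status) {
  return TERMINAL_STATUSES.has(status);
}
```

A caller would typically stop waiting for a job as soon as isTerminalStatus(job.status) returns true.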

Response Structure by Job Type

Each job type returns a different response structure when completed. Understanding these differences helps you parse results correctly.

PREDICT Jobs - Data Extraction

PREDICT jobs extract structured data based on templates or schemas. The response includes:
  • config: Configuration used for extraction (version, parser_mode, enable_citations)
  • output.extraction: Extracted data, keyed by field name
  • output.confidences: Confidence scores for extracted fields

Without Citations

When enable_citations is false (default), fields contain direct values:
{
  "output": {
    "type": "json",
    "extraction": {
      "invoice_number": "INV-2024-001",
      "invoice_date": "2024-01-15",
      "total_amount": 1500.50,
      "line_items": [
        {
          "description": "Widget A",
          "quantity": 10,
          "unit_price": 50.00,
          "total": 500.00
        }
      ]
    },
    "confidences": {
      "average": 92.5,
      "fields": {
        "invoice_number": 95.0,
        "invoice_date": 88.0,
        "total_amount": 96.5,
        "line_items": [
          {
            "description": 93.0,
            "quantity": 91.0,
            "unit_price": 94.5,
            "total": 95.0
          }
        ]
      }
    }
  }
}

With Citations Enabled

When enable_citations is true, each field includes source references:
{
  "output": {
    "type": "json",
    "extraction": {
      "invoice_number": {
        "value": "INV-2024-001",
        "citations": ["b.1"]
      },
      "invoice_date": {
        "value": "2024-01-15",
        "citations": ["b.1"]
      },
      "total_amount": {
        "value": 1500.50,
        "citations": ["b.5"]
      },
      "line_items": [
        {
          "description": {
            "value": "Widget A",
            "citations": ["b.10"]
          },
          "quantity": {
            "value": 10,
            "citations": ["b.10"]
          },
          "unit_price": {
            "value": 50.00,
            "citations": ["b.10", "b.11"]
          },
          "total": {
            "value": 500.00,
            "citations": ["b.11"]
          }
        }
      ]
    },
    "confidences": {
      "average": 92.5,
      "fields": {
        "invoice_number": 95.0,
        "invoice_date": 88.0,
        "total_amount": 96.5,
        "line_items": [
          {
            "description": 93.0,
            "quantity": 91.0,
            "unit_price": 94.5,
            "total": 95.0
          }
        ]
      }
    }
  }
}
Citation References:
  • b.X: Reference to block index X in the document annotations
  • w.X: Reference to word index X in the document annotations
Citations are automatically disabled when using parser_mode: LITE. Use PLUS or PRO parser modes to enable citations.
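Because the field shape changes with enable_citations, downstream code may want one plain-value view of the extraction. The sketch below assumes a cited field is always an object with a value key and a citations array, as in the example above; stripCitations and parseCitation are hypothetical helper names, not API functions.

```javascript
// True for objects shaped like { value, citations: [...] }.
function isCitedField(node) {
  return node !== null && typeof node === 'object' && !Array.isArray(node)
    && 'value' in node && Array.isArray(node.citations);
}

// Recursively replaces cited fields with their plain values, so the result
// has the same shape as a response produced without citations.
function stripCitations(node) {
  if (isCitedField(node)) return node.value;
  if (Array.isArray(node)) return node.map(stripCitations);
  if (node !== null && typeof node === 'object') {
    return Object.fromEntries(
      Object.entries(node).map(([key, child]) => [key, stripCitations(child)])
    );
  }
  return node;
}

// Splits a reference such as "b.10" into its kind and numeric index.
// Returns null for strings that do not match the b.X / w.X pattern.
function parseCitation(ref) {
  const match = /^([bw])\.(\d+)$/.exec(ref);
  if (!match) return null;
  return { kind: match[1] === 'b' ? 'block' : 'word', index: Number(match[2]) };
}
```

Passing output.extraction through stripCitations lets the rest of an application ignore whether citations were enabled for a given job.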

PARSE Jobs - Document Parsing

PARSE jobs extract full document text and structure. The response includes:
  • output.annotations: Detailed document structure with content blocks
  • output.markdown: Array of markdown-formatted text for each page
  • output.metadata: Processing metadata including mode and page mapping
{
  "output": {
    "type": "json",
    "annotations": {
      "blocks": [
        {
          "type": "SECTION_HEADER",
          "content": "SERVICE AGREEMENT",
          "bbox": [0.35, 0.08, 0.65, 0.17],
          "page": 1,
          "block_num": 1,
          "confidence": "HIGH",
          "confidence_val": 0.98
        },
        {
          "type": "TEXT",
          "content": "This Service Agreement is entered into on January 15, 2024\nbetween the parties listed below.",
          "bbox": [0.1, 0.2, 0.9, 0.35],
          "page": 1,
          "block_num": 2,
          "confidence": "HIGH",
          "confidence_val": 0.95
        },
        {
          "type": "TABLE",
          "content": "Item | Quantity | Price\nWidget A | 10 | $100.00\nWidget B | 5 | $75.00",
          "bbox": [0.15, 0.4, 0.85, 0.65],
          "page": 2,
          "block_num": 3,
          "confidence": "MEDIUM",
          "confidence_val": 0.87
        },
        {
          "type": "IMAGE",
          "content": "Company Logo",
          "bbox": [0.2, 0.7, 0.8, 0.9],
          "page": 2,
          "block_num": 4,
          "confidence": "HIGH",
          "confidence_val": 0.96,
          "image_url": "image_1"
        }
      ],
      "references": {
        "image_1": "data:image/png;base64,iVBORw0KGgo..."
      }
    },
    "markdown": [
      "# SERVICE AGREEMENT\n\nThis Service Agreement is entered into on January 15, 2024\nbetween the parties listed below.",
      "## Terms and Conditions\n\nItem | Quantity | Price\n--- | --- | ---\nWidget A | 10 | $100.00\nWidget B | 5 | $75.00",
      "## Signatures\n\n_____________________"
    ],
    "metadata": {
      "mode": "PLUS",
      "page_mapping": [
        {"original": 1, "processed": 1},
        {"original": 2, "processed": 2}
      ]
    }
  }
}
Block Structure:
  • type: Block type classification (SECTION_HEADER, HEADER, FOOTER, TABLE, IMAGE, TEXT)
  • content: Text content extracted from the block (lines combined with newlines). For IMAGE blocks, contains image caption if available
  • bbox: Normalized bounding box [x_min, y_min, x_max, y_max] (0-1 scale)
  • page: Original page number from source document
  • page_seq: Sequential page number in processed document (only present when using page ranges, e.g., pages: "1-5,10-15")
  • block_num: Sequential block number (used for citations)
  • confidence: Confidence level (HIGH, MEDIUM, LOW)
  • confidence_val: Numeric confidence score (0-1 scale, typically 0.85-0.95)
  • image_url: Reference key for IMAGE blocks (points to entry in references map)
Additional Fields:
  • references: Map of image references (image_id → base64 encoded image data)
Metadata Fields:
  • mode: Parser mode used (LITE, PRO, … )
  • page_mapping: Array of {"original": page_num, "processed": page_num} objects
total_pages is in the file object at the top level of the job response, not in output.metadata.
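Normalized bounding boxes usually have to be scaled before they can be drawn over a rendered page. The sketch below assumes the origin is the top-left corner of the page and that the caller knows the rendered page size in pixels; neither is stated above, and bboxToPixels and groupBlocksByPage are hypothetical helper names.

```javascript
// Converts a normalized [x_min, y_min, x_max, y_max] box (0-1 scale)
// into pixel units for a page of the given rendered size.
function bboxToPixels(bbox, pageWidth, pageHeight) {
  const [xMin, yMin, xMax, yMax] = bbox;
  return {
    left: Math.round(xMin * pageWidth),
    top: Math.round(yMin * pageHeight),
    width: Math.round((xMax - xMin) * pageWidth),
    height: Math.round((yMax - yMin) * pageHeight),
  };
}

// Groups annotation blocks by their original page number,
// preserving the order in which blocks appear.
function groupBlocksByPage(blocks) {
  const pages = new Map();
  for (const block of blocks) {
    if (!pages.has(block.page)) pages.set(block.page, []);
    pages.get(block.page).push(block);
  }
  return pages;
}
```

For example, groupBlocksByPage(output.annotations.blocks).get(2) would return the TABLE and IMAGE blocks from the sample response.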

FRAUD Jobs - Fraud Detection

FRAUD jobs analyze documents for authenticity issues. The response includes:
  • output.output[0].anomalies: Detected fraud indicators with severity levels
  • output.output[0].font_analysis: Font consistency checks
  • output.output[0].image_analysis: Image manipulation detection
  • output.output[0].manipulation_analysis: Document tampering indicators
  • output.output[0].metadata_analysis: File metadata verification
  • output.output[0].summary: Overall risk assessment
{
  "output": {
    "type": "json",
    "output": [
      {
        "anomalies": [
          {
            "type": "total_mismatch",
            "severity": "MEDIUM",
            "location": "Page 1, Region: (7.3, 34.5, 85.3, 2.5)",
            "details": "Content data mismatch detected",
            "evidence": "Critical calculation mismatch for 'Ordinary Pay'. The listed Rate (2000.0000) multiplied by Hours (40.00) should be $80,000.00..."
          }
        ],
        "font_analysis": [
          {
            "number": 1,
            "indicator": "Font Overview",
            "status": "PASS",
            "description": "Summary of fonts used in the document",
            "value": {
              "total_fonts": 2,
              "embedded_fonts": 2,
              "external_fonts": 0
            }
          }
        ],
        "image_analysis": [
          {
            "number": 1,
            "indicator": "Image Analysis Overview",
            "status": "PASS",
            "description": "Summary of image analysis findings",
            "value": {
              "total_images": 1,
              "clean_images": 1,
              "suspicious_images": 0
            }
          }
        ],
        "manipulation_analysis": [
          {
            "number": 1,
            "indicator": "Visual Content Analysis",
            "status": "PASS",
            "description": "Analyzes document content for inconsistencies",
            "value": {
              "layout_consistency": true,
              "consistency_anomalies": []
            }
          }
        ],
        "metadata_analysis": [
          {
            "number": 1,
            "indicator": "Document Timeline",
            "status": "PASS",
            "description": "Checks if the document has been modified after creation",
            "value": {
              "created_at": "D:20251216074741+13'00'",
              "last_modified": "Not Available",
              "file_type": ".pdf",
              "file_size": 7463
            }
          }
        ],
        "summary": {
          "risk_level": "WARNING",
          "issues_detected": ["Image Manipulation"],
          "metrics_summary": "1 of 4 triggered review-level indicators."
        }
      }
    ]
  }
}
Risk Levels:
  • TRUSTED: No significant issues detected
  • WARNING: Minor inconsistencies found
  • HIGH_RISK: Critical fraud indicators detected

Understanding Output Types: JSON vs URL

Job results can be delivered in two formats depending on the response size.

JSON Output (type: "json")

For smaller responses, data is provided directly in the response:
{
  "id": "job-uuid",
  "status": "COMPLETED",
  "output": {
    "type": "json",
    "extraction": {
      "field1": "value1",
      "field2": "value2"
    },
    "confidences": {
      "average": 95.0
    }
  }
}
Access data directly from the output field based on job type:
  • PREDICT: output.extraction
  • PARSE: output.markdown and output.annotations
  • FRAUD: output.output[0] (anomalies, analysis sections, and summary)

URL Output (type: "url")

For large responses, results are stored in S3 and the response contains a presigned URL for downloading them:
{
  "id": "job-uuid",
  "status": "COMPLETED",
  "output": {
    "type": "url",
    "url": "https://s3.amazonaws.com/results/job-uuid.json",
    "expires_at": "2024-01-16T10:30:00Z",
    "metadata": {
      "result_size_bytes": 3145728
    }
  }
}
Handling URL Output:
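// API_KEY is assumed to be defined elsewhere (for example, read from an environment variable)
async function getJobResults(jobId) {
  const response = await fetch(`https://api-vision.fintelite.ai/status/${jobId}`, {
    headers: { 'X-API-Key': API_KEY }
  });
  if (!response.ok) {
    throw new Error(`Status request failed with HTTP ${response.status}`);
  }
  const job = await response.json();

  // Results exist only once the job has completed
  if (job.status !== 'COMPLETED') {
    throw new Error(`Job ${jobId} is ${job.status}; no results available`);
  }

  if (job.output.type === 'url') {
    // Fetch results from presigned URL
    const results = await fetch(job.output.url);
    if (!results.ok) {
      throw new Error(`Result download failed with HTTP ${results.status}`);
    }
    return results.json();
  }

  // Inline data: return the whole output object so PREDICT, PARSE, and FRAUD
  // callers can each read their own fields
  return job.output;
}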
URL Expiration: Presigned URLs expire after 24 hours. Download and store results before expiration if needed for later use.
Output Type Control: The API determines output type based on:
  1. force_url: true in request → Always returns URL output
  2. Response size > 2MB → Automatically returns URL output
  3. Otherwise → Returns JSON output
{
  "files": "https://example.com/large-document.pdf",
  "template_id": "INVOICE",
  "force_url": true  // Force URL output regardless of size
}
Always check the type field and handle both json and url output formats in your code, as large responses will automatically use URL format even without force_url.
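Since presigned URLs expire, it can help to check expires_at before attempting a download. The sketch below assumes expires_at is an ISO 8601 timestamp, as in the example above; the function name isUrlOutputUsable and the 60-second safety margin are illustrative choices.

```javascript
// Returns true when a url-type output can still be downloaded, leaving a
// safety margin so the download does not start just before expiry.
function isUrlOutputUsable(output, now = new Date(), safetyMarginMs = 60000) {
  if (!output || output.type !== 'url') return false;
  const expiresAt = Date.parse(output.expires_at);
  if (Number.isNaN(expiresAt)) return false; // missing or unparseable timestamp
  return now.getTime() + safetyMarginMs < expiresAt;
}
```

When the check returns false for a url-type output, the stored link should be treated as stale rather than retried.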

Job Status Examples

Common Job Statuses

These statuses apply to all job types (PREDICT, PARSE, FRAUD) with the same structure. Only the type field differs.
IN_QUEUE: Job is queued and waiting to be processed.
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "PREDICT",  // or "PARSE", "FRAUD"
  "status": "IN_QUEUE",
  "execution_time": 0,
  "delay_time": 0,
  "file": {
    "id": "file-123",
    "filename": "invoice.pdf",
    "type": "application/pdf",
    "total_pages": 3
  }
}
IN_PROGRESS: Job is currently being processed.
{
  "id": "550e8400-e29b-41d4-a716-446655440001",
  "type": "PREDICT",  // or "PARSE", "FRAUD"
  "status": "IN_PROGRESS",
  "execution_time": 0,
  "delay_time": 1245,
  "file": {
    "id": "file-124",
    "filename": "receipt.jpg",
    "type": "image/jpeg",
    "total_pages": 1
  }
}
FAILED: Job processing failed with error details.
{
  "id": "550e8400-e29b-41d4-a716-446655440005",
  "type": "PREDICT",  // or "PARSE", "FRAUD"
  "status": "FAILED",
  "execution_time": 1200,
  "delay_time": 450,
  "file": {
    "id": "file-128",
    "filename": "corrupted.pdf",
    "type": "application/pdf",
    "total_pages": 0
  },
  "config": {
    "template_id": "template-uuid",
    "parser_mode": "PLUS",
    "enable_citations": false
  },
  "error": {
    "code": "PROCESSING_ERROR",
    "message": "Failed to parse document: file is corrupted or unreadable"
  }
}
CANCELLED: Job was cancelled while in queue.
{
  "id": "550e8400-e29b-41d4-a716-446655440006",
  "type": "PREDICT",  // or "PARSE", "FRAUD"
  "status": "CANCELLED",
  "execution_time": 0,
  "delay_time": 5000,
  "file": {
    "id": "file-129",
    "filename": "cancelled-doc.pdf",
    "type": "application/pdf",
    "total_pages": 10
  }
}
For COMPLETED status examples with full output structures, see the job-type-specific sections above: PREDICT Jobs, PARSE Jobs, and FRAUD Jobs.

Synchronous vs Asynchronous

Synchronous Processing

Returns results immediately in the response:
curl -X POST https://api-vision.fintelite.ai/predict \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "files=id://file-uuid" \
  -F "template_id=template-uuid"
Response includes job details and results:
{
  "id": "job-uuid",
  "status": "COMPLETED",
  "output": {
    "type": "json",
    "extraction": {...},
    "confidences": {...}
  }
}
Best for:
  • Small documents (1-5 pages)
  • Real-time applications
  • Interactive user flows
Not recommended for production at scale. Synchronous endpoints may time out on large documents or under high load. Use asynchronous endpoints (/predict-async, /parse-async, /fraud-async) for production deployments.

Asynchronous Processing

Returns a job ID immediately; check the status later:
curl -X POST https://api-vision.fintelite.ai/predict-async \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "files": "id://file-uuid",
    "template_id": "template-uuid",
    "webhook": {
      "url": "https://your-app.com/webhook",
      "metadata": {
        "order_id": "12345"
      }
    }
  }'
Response with job ID:
{
  "id": "job-uuid",
  "status": "IN_QUEUE"
}
Best for:
  • Large documents (more than 5 pages)
  • Batch processing
  • Background workflows
  • Non-blocking operations
For API operations (checking status, listing jobs, retrying, cancelling), see the Job Management API Reference.
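Where a webhook is not available, a job can be polled until it reaches a final status. The following is a rough sketch, not an official client: getStatus stands for any async function you supply that returns the job object (for instance a wrapper around the status endpoint), and the interval and attempt limits are arbitrary defaults.

```javascript
const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls getStatus(jobId) repeatedly until the job is COMPLETED, FAILED, or
// CANCELLED, waiting intervalMs between checks. Throws if the job is still
// running after maxAttempts checks.
async function waitForJob(getStatus, jobId, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const job = await getStatus(jobId);
    if (['COMPLETED', 'FAILED', 'CANCELLED'].includes(job.status)) return job;
    if (attempt < maxAttempts) await pause(intervalMs);
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} status checks`);
}
```

Injecting getStatus keeps the loop independent of any particular HTTP client and makes it easy to exercise with a stub.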

Webhook Notifications

Get notified when async jobs complete.

Configure Webhook

Include webhook configuration in async requests:
curl -X POST https://api-vision.fintelite.ai/predict-async \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "files": "id://file-uuid",
    "template_id": "template-uuid",
    "webhook": {
      "url": "https://your-app.com/webhook/fintelite",
      "metadata": {
        "customer_id": "cust-123",
        "order_id": "order-456",
        "internal_ref": "ref-789"
      }
    }
  }'

Webhook Payload

When the job completes, your endpoint receives:
{
  "job_id": "job-uuid",
  "status": "COMPLETED",
  "job_type": "PREDICT",
  "metadata": {
    "customer_id": "cust-123",
    "order_id": "order-456",
    "internal_ref": "ref-789"
  },
  "completed_at": "2024-01-15T10:05:30Z",
  "execution_time": 3450,
  "results": {
    "type": "url",
    "url": "https://s3.amazonaws.com/presigned-url...",
    "expires_at": "2024-01-16T10:05:30Z"
  }
}

Webhook Best Practices

Verify that webhooks are coming from Fintelite AI by checking the signature header (implementation depends on your webhook configuration).
Return a 200 OK response within 5 seconds. Process the payload asynchronously if needed.
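const express = require('express');

const app = express();
app.use(express.json()); // Parse JSON bodies so req.body holds the webhook payload

app.post('/webhook/fintelite', (req, res) => {
  // Send 200 response immediately
  res.status(200).send('OK');

  // Process async; processWebhook is your own handler function
  processWebhook(req.body).catch(console.error);
});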
Fintelite will retry failed webhook deliveries up to 3 times. Implement idempotency to handle duplicate deliveries.
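// In-memory store for illustration only: it is lost on restart and is not
// shared between instances. Use a database or cache in production.
const processedJobs = new Set();

async function processWebhook(payload) {
  if (processedJobs.has(payload.job_id)) {
    return; // Already processed or in progress
  }
  // Record the job before any await so concurrent duplicates are skipped
  processedJobs.add(payload.job_id);

  try {
    // Process...
  } catch (error) {
    // Forget the job so a retried delivery can be processed again
    processedJobs.delete(payload.job_id);
    throw error;
  }
}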
Secure your webhook endpoint:
  • Use HTTPS
  • Implement authentication/verification
  • Validate payload structure
  • Rate limit requests
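Payload validation can be as simple as checking the handful of fields the handler relies on. The sketch below is based only on the sample payload shown earlier; the accepted status and job type lists, and the function name validateWebhookPayload, are assumptions rather than a published schema.

```javascript
// Accepted values are taken from the lifecycle and job type lists; which
// statuses are actually delivered by webhook is not specified here.
const KNOWN_STATUSES = ['IN_QUEUE', 'IN_PROGRESS', 'COMPLETED', 'FAILED', 'CANCELLED'];
const KNOWN_JOB_TYPES = ['PREDICT', 'PARSE', 'FRAUD'];

// Returns a list of problems found in the payload; an empty list means
// the fields checked here look as expected.
function validateWebhookPayload(payload) {
  if (payload === null || typeof payload !== 'object' || Array.isArray(payload)) {
    return ['payload must be a JSON object'];
  }
  const errors = [];
  if (typeof payload.job_id !== 'string' || payload.job_id.length === 0) {
    errors.push('job_id must be a non-empty string');
  }
  if (!KNOWN_STATUSES.includes(payload.status)) {
    errors.push(`unexpected status: ${String(payload.status)}`);
  }
  if (!KNOWN_JOB_TYPES.includes(payload.job_type)) {
    errors.push(`unexpected job_type: ${String(payload.job_type)}`);
  }
  const results = payload.results;
  if (results !== undefined && results !== null) {
    if (results.type === 'url' && typeof results.url !== 'string') {
      errors.push('results.url must be a string when results.type is "url"');
    }
  }
  return errors;
}
```

A handler could log and discard any payload for which the returned list is not empty.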

Job Priority

Jobs are processed in the order they’re received (FIFO - First In, First Out). Priority handling may be available on enterprise plans.

Error Handling

When jobs fail, they transition to FAILED status with error details in the error field.

Common Failure Reasons

Document Quality

Poor scan quality, low resolution, or unclear text

Format Issues

Corrupted file, unsupported format, or password-protected documents

Processing Timeout

Document too large, too complex, or processing exceeded time limits

Service Errors

Temporary service unavailability or rate limit exceeded

Best Practices

  • Use webhooks instead of polling to get notified of failures
  • Implement retry logic with exponential backoff for transient errors
  • Monitor error patterns to identify systematic issues
  • Validate documents before submission to catch format issues early
For retry and error handling API operations, see the Job Management API Reference.
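Retry with exponential backoff can be sketched as follows. The base delay, cap, and attempt limit are illustrative values, not documented limits, and deciding which errors count as transient is left to the caller through the hypothetical isTransient callback.

```javascript
// Delay before retry number `attempt` (1-based): baseMs * 2^(attempt - 1),
// capped at maxMs.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Runs `operation` until it succeeds, the error is not transient, or
// maxAttempts is reached. The last error is rethrown on failure.
async function withRetry(operation, { maxAttempts = 4, baseMs = 1000, isTransient = () => true } = {}) {
  for (let attempt = 1; ; attempt += 1) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= maxAttempts || !isTransient(error)) throw error;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt, baseMs)));
    }
  }
}
```

With the defaults, the waits between attempts are 1, 2, and 4 seconds. Adding random jitter to each delay is a common refinement when many clients retry at once.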

Performance Optimization

Use Webhooks

Avoid polling by implementing webhook notifications

Batch Processing

Submit multiple jobs in parallel for batch operations

File Reuse

Upload files once, reference by ID for multiple jobs

Template Caching

Reuse templates instead of providing schemas each time
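Submitting jobs in parallel is easier to keep within rate limits when the number of in-flight requests is bounded. The helper below is a generic sketch: mapWithConcurrency is a hypothetical name, worker is whatever async function submits one job (for example a call to /predict-async), and a suitable limit depends on your plan.

```javascript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as the input items.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index.
  async function run() {
    while (next < items.length) {
      const index = next;
      next += 1;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, run));
  return results;
}
```

If any worker call rejects, the returned promise rejects with that error, so a worker that should tolerate individual failures needs to catch them itself.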

Monitoring and Analytics

Track job metrics through your account dashboard:
  • Success Rate: Percentage of completed vs failed jobs
  • Average Processing Time: Mean execution time by job type
  • Queue Depth: Number of jobs waiting in queue
  • Credit Usage: Credits consumed by job type (Coming Soon)
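Similar figures can be approximated locally from a list of job objects. This sketch assumes success rate means completed jobs as a share of completed plus failed jobs, and it averages execution_time as given (its unit is not stated above); the dashboard's exact formulas may differ, and summarizeJobs is a hypothetical helper name.

```javascript
// Computes a success rate (percent), the mean execution_time of completed
// jobs per job type, and the number of jobs still waiting in the queue.
function summarizeJobs(jobs) {
  const completed = jobs.filter((job) => job.status === 'COMPLETED');
  const failed = jobs.filter((job) => job.status === 'FAILED');
  const finishedCount = completed.length + failed.length;

  const totals = {};
  for (const job of completed) {
    if (!totals[job.type]) totals[job.type] = { count: 0, total: 0 };
    totals[job.type].count += 1;
    totals[job.type].total += job.execution_time;
  }

  const averageExecutionTime = {};
  for (const [type, { count, total }] of Object.entries(totals)) {
    averageExecutionTime[type] = total / count;
  }

  return {
    successRate: finishedCount === 0 ? null : (completed.length / finishedCount) * 100,
    averageExecutionTime,
    queueDepth: jobs.filter((job) => job.status === 'IN_QUEUE').length,
  };
}
```

Cancelled and in-progress jobs are left out of the success rate here, which is one of several reasonable conventions.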
