Job Management

What are Jobs?

Jobs represent document processing tasks in Fintelite AI. Every processing request (predict, parse, or fraud detection) creates a job that tracks the status, results, and metadata of the operation.

Job Types

PREDICT

Data extraction jobs using templates or schemas

PARSE

Full document parsing and OCR jobs

FRAUD

Document fraud detection and analysis jobs

Job Lifecycle

IN_QUEUE

Job is queued and waiting to be processed by a worker

IN_PROGRESS

Worker has picked up the job and is actively processing it

COMPLETED

Job finished successfully with results available

FAILED

Job encountered an error during processing

CANCELLED

Job was manually cancelled before completion
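
Client code often needs to tell statuses that can still change apart from final ones. The following is a minimal sketch, assuming COMPLETED, FAILED, and CANCELLED are the only final statuses; the helper name isTerminalStatus is illustrative and not part of the API.

```javascript
// Statuses from the lifecycle above. Treating the last three as final
// is an assumption based on their descriptions.
const TERMINAL_STATUSES = new Set(['COMPLETED', 'FAILED', 'CANCELLED']);
const ACTIVE_STATUSES = new Set(['IN_QUEUE', 'IN_PROGRESS']);

// Returns true when a job in this status will not change any further.
function isTerminalStatus(status) {
  if (!TERMINAL_STATUSES.has(status) && !ACTIVE_STATUSES.has(status)) {
    throw new Error(`Unknown job status: ${status}`);
  }
  return TERMINAL_STATUSES.has(status);
}
```

Throwing on an unknown status makes it visible if a status appears that this list does not cover.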

Response Structure by Job Type

Each job type returns a different response structure when completed. Understanding these differences helps you parse results correctly.

PREDICT Jobs - Data Extraction

PREDICT jobs extract structured data based on templates or schemas. The response includes:

config: Configuration used for extraction (version, parser_mode, enable_citations)
output.extraction: Extracted data, keyed by field name
output.confidences: Confidence scores for extracted fields

Without Citations

When enable_citations is false (default), fields contain direct values:

{
  "output": {
    "type": "json",
    "extraction": {
      "invoice_number": "INV-2024-001",
      "invoice_date": "2024-01-15",
      "total_amount": 1500.50,
      "line_items": [
        {
          "description": "Widget A",
          "quantity": 10,
          "unit_price": 50.00,
          "total": 500.00
        }
      ]
    },
    "confidences": {
      "average": 92.5,
      "fields": {
        "invoice_number": 95.0,
        "invoice_date": 88.0,
        "total_amount": 96.5,
        "line_items": [
          {
            "description": 93.0,
            "quantity": 91.0,
            "unit_price": 94.5,
            "total": 95.0
          }
        ]
      }
    }
  }
}

With Citations Enabled

When enable_citations is true, each field includes source references:

{
  "output": {
    "type": "json",
    "extraction": {
      "invoice_number": {
        "value": "INV-2024-001",
        "citations": ["b.1"]
      },
      "invoice_date": {
        "value": "2024-01-15",
        "citations": ["b.1"]
      },
      "total_amount": {
        "value": 1500.50,
        "citations": ["b.5"]
      },
      "line_items": [
        {
          "description": {
            "value": "Widget A",
            "citations": ["b.10"]
          },
          "quantity": {
            "value": 10,
            "citations": ["b.10"]
          },
          "unit_price": {
            "value": 50.00,
            "citations": ["b.10", "b.11"]
          },
          "total": {
            "value": 500.00,
            "citations": ["b.11"]
          }
        }
      ]
    },
    "confidences": {
      "average": 92.5,
      "fields": {
        "invoice_number": 95.0,
        "invoice_date": 88.0,
        "total_amount": 96.5,
        "line_items": [
          {
            "description": 93.0,
            "quantity": 91.0,
            "unit_price": 94.5,
            "total": 95.0
          }
        ]
      }
    }
  }
}

Citation References:

b.X: Reference to block index X in the document annotations
w.X: Reference to word index X in the document annotations

Citations are automatically disabled when using parser_mode: LITE. Use PLUS or PRO parser modes to enable citations.
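
Because the field shape changes with enable_citations, code that only needs the values may want to normalize both shapes. The sketch below assumes every cited field is an object with exactly the value and citations keys shown above; unwrapCitations and parseCitation are hypothetical helper names.

```javascript
// Returns true for the { value, citations } wrapper used when
// enable_citations is true.
function isCitedField(node) {
  return (
    node !== null &&
    typeof node === 'object' &&
    !Array.isArray(node) &&
    'value' in node &&
    Array.isArray(node.citations)
  );
}

// Recursively replaces cited fields with their plain values, so the
// result has the same shape as a response without citations.
function unwrapCitations(node) {
  if (isCitedField(node)) return node.value;
  if (Array.isArray(node)) return node.map(unwrapCitations);
  if (node !== null && typeof node === 'object') {
    return Object.fromEntries(
      Object.entries(node).map(([key, child]) => [key, unwrapCitations(child)])
    );
  }
  return node;
}

// Splits a reference such as "b.10" into its kind and index.
function parseCitation(ref) {
  const match = /^([bw])\.(\d+)$/.exec(ref);
  if (!match) throw new Error(`Unrecognized citation reference: ${ref}`);
  return { kind: match[1] === 'b' ? 'block' : 'word', index: Number(match[2]) };
}
```

One caveat of this approach: a schema field that is itself an object with value and citations keys would be mistaken for a citation wrapper.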

PARSE Jobs - Document Parsing

PARSE jobs extract full document text and structure. The response includes:

output.annotations: Detailed document structure with content blocks
output.markdown: Array of markdown-formatted text for each page
output.metadata: Processing metadata including mode and page mapping

{
  "output": {
    "type": "json",
    "annotations": {
      "blocks": [
        {
          "type": "SECTION_HEADER",
          "content": "SERVICE AGREEMENT",
          "bbox": [0.35, 0.08, 0.65, 0.17],
          "page": 1,
          "block_num": 1,
          "confidence": "HIGH",
          "confidence_val": 0.98
        },
        {
          "type": "TEXT",
          "content": "This Service Agreement is entered into on January 15, 2024\nbetween the parties listed below.",
          "bbox": [0.1, 0.2, 0.9, 0.35],
          "page": 1,
          "block_num": 2,
          "confidence": "HIGH",
          "confidence_val": 0.95
        },
        {
          "type": "TABLE",
          "content": "Item | Quantity | Price\nWidget A | 10 | $100.00\nWidget B | 5 | $75.00",
          "bbox": [0.15, 0.4, 0.85, 0.65],
          "page": 2,
          "block_num": 3,
          "confidence": "MEDIUM",
          "confidence_val": 0.87
        },
        {
          "type": "IMAGE",
          "content": "Company Logo",
          "bbox": [0.2, 0.7, 0.8, 0.9],
          "page": 2,
          "block_num": 4,
          "confidence": "HIGH",
          "confidence_val": 0.96,
          "image_url": "image_1"
        }
      ],
      "references": {
        "image_1": "data:image/png;base64,iVBORw0KGgo..."
      }
    },
    "markdown": [
      "# SERVICE AGREEMENT\n\nThis Service Agreement is entered into on January 15, 2024\nbetween the parties listed below.",
      "## Terms and Conditions\n\nItem | Quantity | Price\n--- | --- | ---\nWidget A | 10 | $100.00\nWidget B | 5 | $75.00",
      "## Signatures\n\n_____________________"
    ],
    "metadata": {
      "mode": "PLUS",
      "page_mapping": [
        {"original": 1, "processed": 1},
        {"original": 2, "processed": 2}
      ]
    }
  }
}

Block Structure:

type: Block type classification (SECTION_HEADER, HEADER, FOOTER, TABLE, IMAGE, TEXT)
content: Text content extracted from the block (lines combined with newlines). For IMAGE blocks, contains image caption if available
bbox: Normalized bounding box [x_min, y_min, x_max, y_max] (0-1 scale)
page: Original page number from source document
page_seq: Sequential page number in processed document (only present when using page ranges, e.g., pages: "1-5,10-15")
block_num: Sequential block number (used for citations)
confidence: Confidence level (HIGH, MEDIUM, LOW)
confidence_val: Numeric confidence score (0-1 scale, typically 0.85-0.95)
image_url: Reference key for IMAGE blocks (points to entry in references map)
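
To draw a block on a rendered page, the normalized bbox has to be scaled to pixel coordinates. The sketch below assumes the origin is the top-left corner of the page and that you know the pixel size of your own page rendering; bboxToPixels is an illustrative name, not an API function.

```javascript
// Converts a normalized [x_min, y_min, x_max, y_max] box (0-1 scale)
// into pixel coordinates for a page image of the given size.
function bboxToPixels(bbox, pageWidth, pageHeight) {
  const [xMin, yMin, xMax, yMax] = bbox;
  const left = Math.round(xMin * pageWidth);
  const top = Math.round(yMin * pageHeight);
  return {
    left,
    top,
    width: Math.round(xMax * pageWidth) - left,
    height: Math.round(yMax * pageHeight) - top,
  };
}
```

For the SECTION_HEADER block in the example, bbox [0.35, 0.08, 0.65, 0.17] on a 1000 x 2000 pixel page gives left 350, top 160, width 300, and height 180.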

Additional Fields:

references: Map of image references (image_id → base64 encoded image data)

Metadata Fields:

mode: Parser mode used (LITE, PRO, … )
page_mapping: Array of {"original": page_num, "processed": page_num} objects

total_pages is in the file object at the top level of the job response, not in output.metadata.

FRAUD Jobs - Fraud Detection

FRAUD jobs analyze documents for authenticity issues. The response includes:

output.output[0].anomalies: Detected fraud indicators with severity levels
output.output[0].font_analysis: Font consistency checks
output.output[0].image_analysis: Image manipulation detection
output.output[0].manipulation_analysis: Document tampering indicators
output.output[0].metadata_analysis: File metadata verification
output.output[0].summary: Overall risk assessment

{
  "output": {
    "type": "json",
    "output": [
      {
        "anomalies": [
          {
            "type": "total_mismatch",
            "severity": "MEDIUM",
            "location": "Page 1, Region: (7.3, 34.5, 85.3, 2.5)",
            "details": "Content data mismatch detected",
            "evidence": "Critical calculation mismatch for 'Ordinary Pay'. The listed Rate (2000.0000) multiplied by Hours (40.00) should be $80,000.00..."
          }
        ],
        "font_analysis": [
          {
            "number": 1,
            "indicator": "Font Overview",
            "status": "PASS",
            "description": "Summary of fonts used in the document",
            "value": {
              "total_fonts": 2,
              "embedded_fonts": 2,
              "external_fonts": 0
            }
          }
        ],
        "image_analysis": [
          {
            "number": 1,
            "indicator": "Image Analysis Overview",
            "status": "PASS",
            "description": "Summary of image analysis findings",
            "value": {
              "total_images": 1,
              "clean_images": 1,
              "suspicious_images": 0
            }
          }
        ],
        "manipulation_analysis": [
          {
            "number": 1,
            "indicator": "Visual Content Analysis",
            "status": "PASS",
            "description": "Analyzes document content for inconsistencies",
            "value": {
              "layout_consistency": true,
              "consistency_anomalies": []
            }
          }
        ],
        "metadata_analysis": [
          {
            "number": 1,
            "indicator": "Document Timeline",
            "status": "PASS",
            "description": "Checks if the document has been modified after creation",
            "value": {
              "created_at": "D:20251216074741+13'00'",
              "last_modified": "Not Available",
              "file_type": ".pdf",
              "file_size": 7463
            }
          }
        ],
        "summary": {
          "risk_level": "WARNING",
          "issues_detected": ["Image Manipulation"],
          "metrics_summary": "1 of 4 triggered review-level indicators."
        }
      }
    ]
  }
}

Risk Levels:

TRUSTED: No significant issues detected
WARNING: Minor inconsistencies found
HIGH_RISK: Critical fraud indicators detected
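
A FRAUD result is easier to act on once it is condensed into a few fields. The sketch below works on one entry of output.output and assumes that indicator statuses other than PASS exist (only PASS appears in the example above) and that anything other than TRUSTED should be reviewed, which is a policy choice rather than an API rule.

```javascript
// Collects the indicators whose status is not "PASS" across the four
// analysis arrays, plus the anomalies, for one entry of output.output.
function summarizeFraudResult(result) {
  const sections = [
    'font_analysis',
    'image_analysis',
    'manipulation_analysis',
    'metadata_analysis',
  ];
  const flagged = [];
  for (const section of sections) {
    for (const check of result[section] ?? []) {
      if (check.status !== 'PASS') {
        flagged.push({ section, indicator: check.indicator, status: check.status });
      }
    }
  }
  const riskLevel = result.summary?.risk_level ?? null;
  return {
    riskLevel,
    // Assumption: only TRUSTED documents skip manual review.
    needsReview: riskLevel !== 'TRUSTED',
    anomalies: (result.anomalies ?? []).map(a => ({ type: a.type, severity: a.severity })),
    flagged,
  };
}
```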

Understanding Output Types: JSON vs URL

Job results can be delivered in two formats, depending on the response size and the force_url request option.

JSON Output (`type: "json"`)

For smaller responses, data is provided directly in the response:

{
  "id": "job-uuid",
  "status": "COMPLETED",
  "output": {
    "type": "json",
    "extraction": {
      "field1": "value1",
      "field2": "value2"
    },
    "confidences": {
      "average": 95.0
    }
  }
}

Access data directly from the output field based on job type:

PREDICT: output.extraction
PARSE: output.markdown and output.annotations
FRAUD: output.output[0]
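
The per-type access can be sketched as a single dispatch function. This assumes the completed job response carries the same top-level type field shown in the status examples, and it reads FRAUD results from output.output[0] as documented in the FRAUD Jobs section; pickInlineResult is a hypothetical helper name.

```javascript
// Picks the documented result fields out of an inline (type: "json") output.
function pickInlineResult(job) {
  const output = job.output;
  if (!output || output.type !== 'json') {
    throw new Error('Expected inline JSON output');
  }
  switch (job.type) {
    case 'PREDICT':
      return { extraction: output.extraction, confidences: output.confidences };
    case 'PARSE':
      return {
        markdown: output.markdown,
        annotations: output.annotations,
        metadata: output.metadata,
      };
    case 'FRAUD':
      // The FRAUD Jobs section documents results under output.output[0].
      return output.output[0];
    default:
      throw new Error(`Unknown job type: ${job.type}`);
  }
}
```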

URL Output (`type: "url"`)

For large responses, results are stored in S3 with a presigned URL:

{
  "id": "job-uuid",
  "status": "COMPLETED",
  "output": {
    "type": "url",
    "url": "https://s3.amazonaws.com/results/job-uuid.json",
    "expires_at": "2024-01-16T10:30:00Z",
    "metadata": {
      "result_size_bytes": 3145728
    }
  }
}

Handling URL Output:

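async function getJobResults(jobId) {
  const response = await fetch(`https://api-vision.fintelite.ai/status/${jobId}`, {
    headers: { 'X-API-Key': API_KEY }
  });
  if (!response.ok) {
    throw new Error(`Status request failed with HTTP ${response.status}`);
  }
  const job = await response.json();

  // Results are only present once the job has completed
  if (job.status !== 'COMPLETED') {
    throw new Error(`Job ${jobId} has status ${job.status}; no results available`);
  }

  if (job.output.type === 'url') {
    // Fetch results from presigned URL
    const results = await fetch(job.output.url);
    if (!results.ok) {
      throw new Error(`Result download failed with HTTP ${results.status}`);
    }
    return results.json();
  }

  // Inline data: return the whole output so PREDICT (extraction),
  // PARSE (markdown, annotations) and FRAUD (output) results are all reachable
  return job.output;
}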

URL Expiration: Presigned URLs expire after 24 hours. Download and store results before expiration if needed for later use.
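
Before downloading from a stored URL, it can be worth checking expires_at first. The sketch below assumes expires_at is an ISO 8601 timestamp as in the example above; the five-minute safety margin and the helper name isUrlExpiring are illustrative choices.

```javascript
// Returns true when the presigned URL has expired or will expire within
// safetyMarginMs. `now` is injectable to make the check testable.
function isUrlExpiring(output, safetyMarginMs = 5 * 60 * 1000, now = Date.now()) {
  const expiresAt = Date.parse(output.expires_at);
  if (Number.isNaN(expiresAt)) {
    throw new Error(`Invalid expires_at: ${output.expires_at}`);
  }
  return expiresAt - now <= safetyMarginMs;
}
```

If the check returns true, requesting the job status again may yield a fresh URL, though whether the API re-issues presigned URLs is not stated here.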

Output Type Control: The API determines output type based on:

force_url: true in request → Always returns URL output
Response size > 2MB → Automatically returns URL output
Otherwise → Returns JSON output

{
  "files": "https://example.com/large-document.pdf",
  "template_id": "INVOICE",
  "force_url": true  // Force URL output regardless of size
}

Always check the type field and handle both json and url output formats in your code, as large responses will automatically use URL format even without force_url.

Job Status Examples

Common Job Statuses

These statuses apply to all job types (PREDICT, PARSE, FRAUD) with the same structure. Only the type field differs.

IN_QUEUE

Job is queued and waiting to be processed.

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "PREDICT",  // or "PARSE", "FRAUD"
  "status": "IN_QUEUE",
  "execution_time": 0,
  "delay_time": 0,
  "file": {
    "id": "file-123",
    "filename": "invoice.pdf",
    "type": "application/pdf",
    "total_pages": 3
  }
}

IN_PROGRESS

Job is currently being processed.

{
  "id": "550e8400-e29b-41d4-a716-446655440001",
  "type": "PREDICT",  // or "PARSE", "FRAUD"
  "status": "IN_PROGRESS",
  "execution_time": 0,
  "delay_time": 1245,
  "file": {
    "id": "file-124",
    "filename": "receipt.jpg",
    "type": "image/jpeg",
    "total_pages": 1
  }
}

FAILED

Job processing failed with error details.

{
  "id": "550e8400-e29b-41d4-a716-446655440005",
  "type": "PREDICT",  // or "PARSE", "FRAUD"
  "status": "FAILED",
  "execution_time": 1200,
  "delay_time": 450,
  "file": {
    "id": "file-128",
    "filename": "corrupted.pdf",
    "type": "application/pdf",
    "total_pages": 0
  },
  "config": {
    "template_id": "template-uuid",
    "parser_mode": "PLUS",
    "enable_citations": false
  },
  "error": {
    "code": "PROCESSING_ERROR",
    "message": "Failed to parse document: file is corrupted or unreadable"
  }
}

CANCELLED

Job was cancelled while in queue.

{
  "id": "550e8400-e29b-41d4-a716-446655440006",
  "type": "PREDICT",  // or "PARSE", "FRAUD"
  "status": "CANCELLED",
  "execution_time": 0,
  "delay_time": 5000,
  "file": {
    "id": "file-129",
    "filename": "cancelled-doc.pdf",
    "type": "application/pdf",
    "total_pages": 10
  }
}

For COMPLETED status examples with full output structures, see the job-type-specific sections above: PREDICT Jobs, PARSE Jobs, and FRAUD Jobs.
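
Where webhooks are not an option, the statuses above can be polled until the job reaches a final state. This is a minimal sketch: fetchStatus is a function you supply (for example, a wrapper around the status request used in getJobResults), and the default interval and attempt limit are arbitrary assumptions, not API recommendations.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Polls until the job reaches COMPLETED, FAILED, or CANCELLED.
// `fetchStatus` is any async function that returns the job object for an ID.
async function waitForJob(fetchStatus, jobId, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  const terminal = new Set(['COMPLETED', 'FAILED', 'CANCELLED']);
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await fetchStatus(jobId);
    if (terminal.has(job.status)) return job;
    // Do not sleep after the final attempt.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} attempts`);
}
```

The caller still has to inspect the returned status, since FAILED and CANCELLED jobs are returned rather than thrown.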

Synchronous vs Asynchronous

Synchronous Processing

Returns results immediately in the response:

curl -X POST https://api-vision.fintelite.ai/predict \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "files=id://file-uuid" \
  -F "template_id=template-uuid"

Response includes job details and results:

{
  "id": "job-uuid",
  "status": "COMPLETED",
  "output": {
    "type": "json",
    "extraction": {...},
    "confidences": {...}
  }
}

Best for:

Small documents (1-5 pages)
Real-time applications
Interactive user flows

Not recommended for production at scale. Synchronous endpoints may time out on large documents or under high load. Use asynchronous endpoints (/predict-async, /parse-async, /fraud-async) for production deployments.

Asynchronous Processing

Returns job ID immediately, check status later:

curl -X POST https://api-vision.fintelite.ai/predict-async \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "files": "id://file-uuid",
    "template_id": "template-uuid",
    "webhook": {
      "url": "https://your-app.com/webhook",
      "metadata": {
        "order_id": "12345"
      }
    }
  }'

Response with job ID:

{
  "id": "job-uuid",
  "status": "IN_QUEUE"
}

Best for:

Large documents (more than 5 pages)
Batch processing
Background workflows
Non-blocking operations

For API operations (checking status, listing jobs, retrying, cancelling), see the Job Management API Reference.

Webhook Notifications

Get notified when async jobs complete:

Configure Webhook

Include webhook configuration in async requests:

curl -X POST https://api-vision.fintelite.ai/predict-async \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "files": "id://file-uuid",
    "template_id": "template-uuid",
    "webhook": {
      "url": "https://your-app.com/webhook/fintelite",
      "metadata": {
        "customer_id": "cust-123",
        "order_id": "order-456",
        "internal_ref": "ref-789"
      }
    }
  }'

Webhook Payload

When the job completes, your endpoint receives:

{
  "job_id": "job-uuid",
  "status": "COMPLETED",
  "job_type": "PREDICT",
  "metadata": {
    "customer_id": "cust-123",
    "order_id": "order-456",
    "internal_ref": "ref-789"
  },
  "completed_at": "2024-01-15T10:05:30Z",
  "execution_time": 3450,
  "results": {
    "type": "url",
    "url": "https://s3.amazonaws.com/presigned-url...",
    "expires_at": "2024-01-16T10:05:30Z"
  }
}

Webhook Best Practices

Verify Webhook Signature

Verify that webhooks are coming from Fintelite AI by checking the signature header (implementation depends on your webhook configuration).

Respond Quickly

Return a 200 OK response within 5 seconds. Process the payload asynchronously if needed.

app.post('/webhook/fintelite', (req, res) => {
  // Send 200 response immediately
  res.status(200).send('OK');

  // Process async
  processWebhook(req.body).catch(console.error);
});

Handle Retries

Fintelite will retry failed webhook deliveries up to 3 times. Implement idempotency to handle duplicate deliveries.

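// In-memory store for illustration; use a shared store (e.g. a database) in production
const processedJobs = new Set();

async function processWebhook(payload) {
  if (processedJobs.has(payload.job_id)) {
    return; // Already processed or in progress
  }
  // Mark before processing so a concurrent duplicate delivery is skipped
  processedJobs.add(payload.job_id);

  try {
    // Process...
  } catch (err) {
    // Allow a later retry to process this job again
    processedJobs.delete(payload.job_id);
    throw err;
  }
}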

Secure Your Endpoint

Use HTTPS
Implement authentication/verification
Validate payload structure
Rate limit requests
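
Payload validation can be sketched as a function that checks the fields from the webhook payload example before any processing happens. Which fields are mandatory is an assumption here, as is the set of statuses a webhook can carry; validateWebhookPayload is an illustrative name.

```javascript
const KNOWN_STATUSES = ['IN_QUEUE', 'IN_PROGRESS', 'COMPLETED', 'FAILED', 'CANCELLED'];
const KNOWN_JOB_TYPES = ['PREDICT', 'PARSE', 'FRAUD'];

// Checks the fields shown in the webhook payload example.
// Returns a list of problems; an empty list means the payload looks valid.
function validateWebhookPayload(payload) {
  if (payload === null || typeof payload !== 'object' || Array.isArray(payload)) {
    return ['payload must be a JSON object'];
  }
  const problems = [];
  if (typeof payload.job_id !== 'string' || payload.job_id.length === 0) {
    problems.push('job_id must be a non-empty string');
  }
  if (!KNOWN_STATUSES.includes(payload.status)) {
    problems.push(`unexpected status: ${payload.status}`);
  }
  if (!KNOWN_JOB_TYPES.includes(payload.job_type)) {
    problems.push(`unexpected job_type: ${payload.job_type}`);
  }
  const results = payload.results;
  if (results !== undefined) {
    if (results === null || typeof results !== 'object') {
      problems.push('results must be an object when present');
    } else if (results.type === 'url' && !/^https:\/\//.test(String(results.url))) {
      problems.push('results.url must be an https URL');
    }
  }
  return problems;
}
```

A handler could log the returned problems and skip processing when the list is not empty.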

Job Priority

Jobs are processed in the order they’re received (FIFO - First In, First Out). Priority handling may be available on enterprise plans.

Error Handling

When jobs fail, they transition to FAILED status with error details in the error field.

Common Failure Reasons

Document Quality

Poor scan quality, low resolution, or unclear text

Format Issues

Corrupted file, unsupported format, or password-protected documents

Processing Timeout

Document too large, too complex, or processing exceeded time limits

Service Errors

Temporary service unavailability or rate limit exceeded

Best Practices

Use webhooks instead of polling to get notified of failures
Implement retry logic with exponential backoff for transient errors
Monitor error patterns to identify systematic issues
Validate documents before submission to catch format issues early
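
Retry with exponential backoff can be sketched as a small wrapper around any async operation, such as resubmitting a job. The delays, the attempt limit, and the isTransient callback are assumptions for illustration; which errors count as transient depends on the error details your jobs actually return.

```javascript
const wait = ms => new Promise(resolve => setTimeout(resolve, ms));

// Retries `operation` with exponentially growing delays.
// `isTransient` decides which errors are worth retrying.
async function retryWithBackoff(
  operation,
  { maxAttempts = 4, baseDelayMs = 500, isTransient = () => true } = {}
) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (!isTransient(err) || attempt === maxAttempts - 1) break;
      // Delay doubles each attempt (baseDelayMs, 2x, 4x, ...) plus up to 20% jitter.
      const delay = baseDelayMs * 2 ** attempt;
      await wait(delay + Math.random() * delay * 0.2);
    }
  }
  throw lastError;
}
```

Errors such as a corrupted or password-protected file will not succeed on retry, so isTransient should return false for them.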

For retry and error handling API operations, see the Job Management API Reference.

Performance Optimization

Use Webhooks

Avoid polling by implementing webhook notifications

Batch Processing

Submit multiple jobs in parallel for batch operations

File Reuse

Upload files once, reference by ID for multiple jobs

Template Caching

Reuse templates instead of providing schemas each time
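
Parallel submission for batch operations can be sketched with a simple concurrency limit, so a large batch does not send every request at once. submitJob is a function you supply (for example, one that posts to an async endpoint and returns the job ID); the default limit of 5 is an arbitrary assumption, not a documented API limit.

```javascript
// Runs `submitJob` for every item with at most `concurrency` requests in
// flight, and reports successes and failures separately per item.
async function submitBatch(items, submitJob, concurrency = 5) {
  const results = new Array(items.length);
  let next = 0;

  // Each worker pulls the next unclaimed index until none are left.
  async function worker() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { status: 'fulfilled', value: await submitJob(items[index]) };
      } catch (reason) {
        results[index] = { status: 'rejected', reason };
      }
    }
  }

  const workerCount = Math.min(Math.max(1, concurrency), items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Results keep the order of the input items, so a failed submission can be matched back to its file.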

Monitoring and Analytics

Track job metrics through your account dashboard:

Success Rate: Percentage of completed vs failed jobs
Average Processing Time: Mean execution time by job type
Queue Depth: Number of jobs waiting in queue
Credit Usage: Credits consumed by job type (Coming Soon)
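
If you also keep your own record of job responses, similar figures can be derived locally. The sketch below is an assumption about how such metrics could be computed, not a description of how the dashboard calculates them: it counts only COMPLETED and FAILED jobs toward the success rate and averages execution_time over completed jobs.

```javascript
// Derives success rate, average execution time per job type, and queue
// depth from an array of job objects shaped like the status examples.
function computeJobMetrics(jobs) {
  const completed = jobs.filter(job => job.status === 'COMPLETED');
  const failed = jobs.filter(job => job.status === 'FAILED');
  const finished = completed.length + failed.length;

  // Sum execution_time per job type for completed jobs.
  const totals = {};
  for (const job of completed) {
    if (!totals[job.type]) totals[job.type] = { sum: 0, count: 0 };
    totals[job.type].sum += job.execution_time;
    totals[job.type].count += 1;
  }
  const averageExecutionTime = Object.fromEntries(
    Object.entries(totals).map(([type, { sum, count }]) => [type, sum / count])
  );

  return {
    // null when no job has finished yet, to avoid dividing by zero.
    successRate: finished === 0 ? null : (completed.length / finished) * 100,
    averageExecutionTime,
    queueDepth: jobs.filter(job => job.status === 'IN_QUEUE').length,
  };
}
```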

Next Steps

Document Processing

Learn about processing modes

Jobs API

Explore job management endpoints
