Bank Statement Engine
Overview
The Bank Statement Engine is a core service within LoanPilot designed to automate the extraction, classification, and risk assessment of financial data from bank statements. It leverages AI providers to transform unstructured documents (PDFs or images) into structured data for credit underwriting.
The engine doesn't just extract text; it performs complex calculations for daily balances, identifies recurring transactions (EMIs), detects potential fraud, and generates a comprehensive financial health score.
Analysis Endpoint
To trigger a bank statement analysis, send a POST request to the analysis route.
Endpoint: POST /api/analyze/bank-statement
Request Body
The request must be sent as application/json and accepts the following fields:
| Field | Type | Required | Description |
| :--- | :--- | :--- | :--- |
| documentId | string | Yes | The UUID of the uploaded document in the loan_documents table. |
| applicationId | string | Yes | The UUID of the associated loan application. |
| organizationId | string | Yes | The UUID of the organization (bank/lender). |
| aiProvider | string | No | AI provider to use (e.g., openai, anthropic). Defaults to the system-configured provider. |
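
The field rules in the table can be checked on the client before the request is sent. The sketch below is a minimal illustration, assuming the three IDs are canonical hyphenated UUIDs; `validateAnalysisRequest` is a hypothetical helper, not part of LoanPilot.

```javascript
// Hypothetical pre-flight check. Field names come from the table above;
// the function name and error messages are illustrative assumptions.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const REQUIRED_FIELDS = ['documentId', 'applicationId', 'organizationId'];

function validateAnalysisRequest(body) {
  const errors = [];
  for (const field of REQUIRED_FIELDS) {
    const value = body[field];
    if (typeof value !== 'string' || !UUID_PATTERN.test(value)) {
      errors.push(`${field} must be a UUID string`);
    }
  }
  // aiProvider is optional, but must be a non-empty string when present.
  if (
    body.aiProvider !== undefined &&
    (typeof body.aiProvider !== 'string' || body.aiProvider.length === 0)
  ) {
    errors.push('aiProvider must be a non-empty string when provided');
  }
  return errors; // An empty array means the body looks well-formed.
}
```

Returning a list of messages rather than throwing lets a form display every problem at once.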
Example Usage
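```javascript
// Placeholder UUIDs; substitute the IDs of your own document, application, and organization.
const response = await fetch('/api/analyze/bank-statement', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    documentId: '11111111-1111-4111-8111-111111111111',
    applicationId: '22222222-2222-4222-8222-222222222222',
    organizationId: '33333333-3333-4333-8333-333333333333',
  }),
});

// Surface HTTP-level failures before parsing the body.
if (!response.ok) {
  throw new Error(`Analysis request failed with status ${response.status}`);
}
const result = await response.json();
```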
Data Extraction & Metrics
The engine extracts and persists a wide array of financial data points into the bank_statement_analysis schema.
1. Summary Metrics
Basic financial health indicators calculated across the statement period:
- Average/Minimum/Maximum Balance: Snapshot of liquidity.
- Negative Balance Days: Counts how many days the account was overdrawn.
- Net Income: Total credits minus total debits.
- Transaction Volume: Total value of all credit and debit entries.
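
To make the definitions above concrete, here is a minimal sketch of how these figures could be derived. It assumes transaction amounts are positive numbers with a `type` of `credit` or `debit`, and that each daily balance entry has a `balance` field; the function name and the returned property names are illustrative, not the engine's actual implementation.

```javascript
// Illustrative computation of the summary metrics listed above.
// Assumed shapes: transactions = [{ amount, type }], dailyBalances = [{ date, balance }].
function summarizeStatement(transactions, dailyBalances) {
  if (dailyBalances.length === 0) {
    throw new Error('At least one daily balance is required');
  }
  const balances = dailyBalances.map((day) => day.balance);
  const sumOfType = (type) =>
    transactions
      .filter((tx) => tx.type === type)
      .reduce((sum, tx) => sum + tx.amount, 0);

  const totalCredits = sumOfType('credit');
  const totalDebits = sumOfType('debit');

  return {
    averageBalance: balances.reduce((sum, b) => sum + b, 0) / balances.length,
    minimumBalance: Math.min(...balances),
    maximumBalance: Math.max(...balances),
    // Days on which the closing balance was below zero.
    negativeBalanceDays: balances.filter((b) => b < 0).length,
    netIncome: totalCredits - totalDebits,
    transactionVolume: totalCredits + totalDebits,
  };
}
```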
2. Transaction Classification
Every transaction is parsed and categorized. The engine returns:
- Transactions Array: A list containing date, description, amount, type (credit/debit), and category.
- Category Summary: A key-value map showing total spend per category (e.g., Utilities: 500.00, Rent: 2000.00).
- Daily Balances: A mapped array of dates and the closing balance for each day in the period.
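
The two derived structures can be rebuilt from the transactions array. The sketch below is an illustration under stated assumptions: "spend" is taken to mean debits only, dates are ISO strings (YYYY-MM-DD), and an opening balance is supplied by the caller. The function names are hypothetical.

```javascript
// Total debit amount per category.
function categorySummary(transactions) {
  const totals = {};
  for (const tx of transactions) {
    if (tx.type !== 'debit') continue; // Assumption: spend excludes credits.
    totals[tx.category] = (totals[tx.category] || 0) + tx.amount;
  }
  return totals;
}

// Closing balance for every calendar day from periodStart to periodEnd inclusive.
function dailyBalances(transactions, openingBalance, periodStart, periodEnd) {
  // Net movement per date: credits add, debits subtract.
  const netByDate = {};
  for (const tx of transactions) {
    const signed = tx.type === 'credit' ? tx.amount : -tx.amount;
    netByDate[tx.date] = (netByDate[tx.date] || 0) + signed;
  }

  const result = [];
  let balance = openingBalance;
  const end = new Date(`${periodEnd}T00:00:00Z`);
  // Step in UTC so daylight-saving changes cannot skip or repeat a day.
  for (
    let day = new Date(`${periodStart}T00:00:00Z`);
    day <= end;
    day.setUTCDate(day.getUTCDate() + 1)
  ) {
    const date = day.toISOString().slice(0, 10);
    balance += netByDate[date] || 0;
    result.push({ date, balance });
  }
  return result;
}
```

Days without transactions carry the previous closing balance forward, so the output has one entry per day in the period.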
3. Enhanced Analysis (Risk & Underwriting)
For advanced credit decisioning, the engine provides "Enhanced Analysis" fields:
| Metric | Description |
| :--- | :--- |
| Fraud Risk Score | A calculated score (0-100) based on detected anomalies. |
| EMI Detection | Identifies recurring loan repayment patterns and calculates a total monthly EMI obligation. |
| Customer Concentration | Analyzes if the majority of inflows come from a single source (risk of dependency). |
| Cashflow Volatility | Measures the consistency of inflows and outflows over the period. |
| Health Score | An aggregate score indicating the overall financial strength of the applicant. |
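
The engine's actual scoring rules are not documented here, so the following is only a rough sketch of how two of these metrics could be approximated. It assumes the transaction description identifies the payer or lender, and it treats a debit as recurring when the same description and amount appear in at least three distinct calendar months. Both function names and the threshold are illustrative assumptions.

```javascript
// Share of total inflows contributed by the single largest source.
// Assumption: the description field identifies the payer.
function inflowConcentration(transactions) {
  const bySource = new Map();
  let total = 0;
  for (const tx of transactions) {
    if (tx.type !== 'credit') continue;
    bySource.set(tx.description, (bySource.get(tx.description) || 0) + tx.amount);
    total += tx.amount;
  }
  if (total === 0) return { topSource: null, share: 0 };

  let topSource = null;
  let topAmount = 0;
  for (const [source, amount] of bySource) {
    if (amount > topAmount) {
      topSource = source;
      topAmount = amount;
    }
  }
  return { topSource, share: topAmount / total };
}

// Flags debits with the same description and amount seen in at least
// `minMonths` distinct calendar months, and sums them as a monthly obligation.
function detectRecurringDebits(transactions, minMonths = 3) {
  const groups = new Map();
  for (const tx of transactions) {
    if (tx.type !== 'debit') continue;
    const key = `${tx.description.trim().toLowerCase()}|${tx.amount.toFixed(2)}`;
    if (!groups.has(key)) {
      groups.set(key, { description: tx.description, amount: tx.amount, months: new Set() });
    }
    groups.get(key).months.add(tx.date.slice(0, 7)); // YYYY-MM
  }
  const recurring = [...groups.values()]
    .filter((g) => g.months.size >= minMonths)
    .map((g) => ({ description: g.description, amount: g.amount, monthsSeen: g.months.size }));
  const totalMonthlyEmi = recurring.reduce((sum, g) => sum + g.amount, 0);
  return { recurring, totalMonthlyEmi };
}
```

Exact matching on amount is deliberately strict; a production detector would likely tolerate small variations in amount and date.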
Response Schema
A successful analysis returns a JSON object containing the internal analysis ID and the extracted metrics.
```json
{
  "success": true,
  "analysisId": "8888-8888-8888-8888",
  "data": {
    "period_start": "2023-01-01",
    "period_end": "2023-03-31",
    "summary": {
      "average_balance": 15400.50,
      "total_credits": 45000.00,
      "total_debits": 32000.00
    },
    "overall_health_score": 85,
    "fraud_risk_score": 5
  }
}
```
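
A caller will usually want to check the `success` flag and pull out the fields it needs. The sketch below reads only the keys shown in the sample response; the shape of a failed response is not documented here, so treating anything other than `success: true` as a failure is an assumption, and `readAnalysisResult` is a hypothetical helper.

```javascript
// Hypothetical reader for the response shown above.
function readAnalysisResult(payload) {
  if (!payload || payload.success !== true || !payload.data) {
    throw new Error('Analysis failed or returned no data');
  }
  const { data } = payload;
  const summary = data.summary || {};
  return {
    analysisId: payload.analysisId,
    periodStart: data.period_start,
    periodEnd: data.period_end,
    averageBalance: summary.average_balance,
    // Net income as defined under Summary Metrics: credits minus debits.
    netIncome: summary.total_credits - summary.total_debits,
    healthScore: data.overall_health_score,
    fraudRiskScore: data.fraud_risk_score,
  };
}
```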
Implementation Notes
- Authentication: This endpoint is protected. You must provide a valid session cookie or Supabase auth token.
- File Handling: The engine automatically retrieves the file from the configured storage provider (Local or S3) based on the documentId.
- Processing Time: Because the request involves AI-driven OCR and analysis, it can take 15–45 seconds. Show a loading indicator in the UI while the request is pending, or run the analysis as a background job in high-volume environments.
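
Given the long processing time, a client may want an explicit timeout rather than relying on defaults. The sketch below wraps the call with an AbortController. Sending the Supabase token as a Bearer Authorization header is an assumption, the 60-second limit is an arbitrary value chosen to leave headroom over the 15–45 second range, and both function names are hypothetical.

```javascript
const ANALYZE_URL = '/api/analyze/bank-statement';

// Builds the fetch options. Assumption: token-based callers pass the
// Supabase auth token as a Bearer header; cookie-based sessions omit it.
function buildAnalysisRequest(payload, token) {
  const headers = { 'Content-Type': 'application/json' };
  if (token) headers.Authorization = `Bearer ${token}`;
  return { method: 'POST', headers, body: JSON.stringify(payload) };
}

// Sends the request and aborts it if no response arrives within timeoutMs.
async function analyzeWithTimeout(
  payload,
  { token, timeoutMs = 60000, fetchImpl = globalThis.fetch } = {}
) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(ANALYZE_URL, {
      ...buildAnalysisRequest(payload, token),
      signal: controller.signal,
    });
    if (!response.ok) {
      throw new Error(`Analysis request failed with status ${response.status}`);
    }
    return await response.json();
  } finally {
    clearTimeout(timer); // Always release the timer, on success or failure.
  }
}
```

While the promise is pending, the UI can show its loading state; an aborted request rejects with an AbortError that the caller can catch to offer a retry.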