MedhaOS API Documentation
Version: 1.0
Date: 2025-01-19
Base URL: https://api.example.com/api/v1/medha-os
Authentication: API Key via X-API-Key header
Overview
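MedhaOS provides a REST API for AI-powered document processing: classification, extraction, chat, summarization, form auto-fill, semantic search, forecasting, and model explainability. All endpoints are tenant-isolated, and extraction schemas, categories, and validation rules are configured per tenant, so the same API adapts to different verticals.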
Key Features
- Schema-Driven Extraction - Configure extraction fields per tenant
- Configurable Categories - Define classification categories per tenant
- Business Rules - Configure validation rules per tenant
- Multi-Tenant Isolation - Complete data isolation per tenant
- Universal Adaptability - Works for any vertical without code changes
Authentication
All API requests require authentication via API key:
X-API-Key: your-api-key-here
The API key identifies the tenant and provides access to tenant-specific data and configurations.
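A minimal sketch of attaching the key to a request, using only the Python standard library. The `build_request` helper and the `BASE_URL` constant are illustrative names, not part of any MedhaOS SDK, and the request is built but not sent.

```python
import json
import urllib.request

# Base URL as given at the top of this document.
BASE_URL = "https://api.example.com/api/v1/medha-os"


def build_request(method, path, api_key, body=None):
    """Build (but do not send) an authenticated request for `path`."""
    headers = {"X-API-Key": api_key}
    data = None
    if body is not None:
        # JSON bodies are sent as UTF-8 with an explicit content type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


req = build_request("GET", "/config/schemas", "your-api-key-here")
# Send with urllib.request.urlopen(req) once a real key is in place.
```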
Configuration API
Schema Management
Create Extraction Schema
Define fields to extract from documents.
Endpoint: POST /config/schemas
Request Body:
{
"name": "Invoice Schema",
"description": "Schema for extracting invoice data",
"fields": [
{
"name": "vendor_name",
"type": "string",
"required": true,
"description": "Name of the vendor",
"validation_rules": ["not_empty", "max_length:100"]
},
{
"name": "invoice_number",
"type": "string",
"required": true,
"description": "Invoice number",
"validation_rules": ["regex:^INV-\\d+$"]
},
{
"name": "invoice_date",
"type": "date",
"required": true,
"description": "Date of invoice"
},
{
"name": "total_amount",
"type": "number",
"required": true,
"description": "Total amount",
"validation_rules": ["min:0", "max:1000000"]
}
]
}
Response:
{
"success": true,
"schema_id": "schema_123",
"message": "Schema created successfully"
}
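This document does not define how the `validation_rules` strings are evaluated. The sketch below shows one plausible client-side reading of the rules used in the request above (`not_empty`, `max_length:N`, `regex:PATTERN`, `min:N`, `max:N`). The semantics and the helper names `check_rule` and `failed_rules` are assumptions for illustration, not the service's implementation.

```python
import re


def check_rule(rule, value):
    """Return True if `value` satisfies one rule string such as 'max_length:100'."""
    # Split on the first colon only, so regex arguments may contain colons.
    name, _, arg = rule.partition(":")
    if name == "not_empty":
        return isinstance(value, str) and value.strip() != ""
    if name == "max_length":
        return len(value) <= int(arg)
    if name == "regex":
        return re.search(arg, value) is not None
    if name == "min":
        return value >= float(arg)
    if name == "max":
        return value <= float(arg)
    raise ValueError(f"unknown rule: {name}")


def failed_rules(field, value):
    """List the rules of one schema field that `value` violates."""
    return [r for r in field.get("validation_rules", []) if not check_rule(r, value)]


# Field definition taken from the request body above (JSON "\\d" decodes to "\d").
invoice_number = {
    "name": "invoice_number",
    "type": "string",
    "validation_rules": [r"regex:^INV-\d+$"],
}
print(failed_rules(invoice_number, "INV-12345"))
print(failed_rules(invoice_number, "12345"))
```

Checking values locally before upload can catch obvious mismatches early, but the server-side result remains authoritative.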
Get Schema
Endpoint: GET /config/schemas/{schema_id}
Response:
{
"success": true,
"schema": {
"schema_id": "schema_123",
"tenant_id": "tenant_456",
"name": "Invoice Schema",
"description": "Schema for extracting invoice data",
"fields": [...],
"created_at": "2025-01-19T00:00:00Z",
"updated_at": "2025-01-19T00:00:00Z"
}
}
Update Schema
Endpoint: PUT /config/schemas/{schema_id}
Request Body: Same as create schema
Response:
{
"success": true,
"message": "Schema updated successfully"
}
Delete Schema
Endpoint: DELETE /config/schemas/{schema_id}
Response:
{
"success": true,
"message": "Schema deleted successfully"
}
List Schemas
Endpoint: GET /config/schemas
Response:
{
"success": true,
"schemas": [
{
"schema_id": "schema_123",
"name": "Invoice Schema",
"created_at": "2025-01-19T00:00:00Z"
},
{
"schema_id": "schema_124",
"name": "Receipt Schema",
"created_at": "2025-01-19T00:00:00Z"
}
]
}
Category Management
Create Categories
Define classification categories for documents.
Endpoint: POST /config/categories
Request Body:
{
"categories": [
{
"name": "invoice",
"description": "Invoice documents",
"parent": null
},
{
"name": "receipt",
"description": "Receipt documents",
"parent": null
},
{
"name": "contract",
"description": "Contract documents",
"parent": null
},
{
"name": "other",
"description": "Other documents",
"parent": null
}
]
}
Response:
{
"success": true,
"message": "Categories created successfully",
"category_count": 4
}
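The request above sets every `parent` to null. Assuming `parent` refers to another category by name (this document does not say so explicitly), a payload with a hierarchy could be checked before submission as sketched below. The `credit_note` category and the `undefined_parents` helper are hypothetical.

```python
def undefined_parents(categories):
    """Return parent names that no category in the list defines."""
    names = {c["name"] for c in categories}
    parents = {c["parent"] for c in categories if c.get("parent") is not None}
    return sorted(parents - names)


payload = {
    "categories": [
        {"name": "invoice", "description": "Invoice documents", "parent": None},
        # Hypothetical child category, added here for illustration only.
        {"name": "credit_note", "description": "Credit notes", "parent": "invoice"},
    ]
}

# An empty list means every parent reference resolves within the payload.
print(undefined_parents(payload["categories"]))
```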
List Categories
Endpoint: GET /config/categories
Response:
{
"success": true,
"categories": [
{
"category_id": "cat_123",
"name": "invoice",
"description": "Invoice documents",
"created_at": "2025-01-19T00:00:00Z"
},
...
]
}
Update Category
Endpoint: PUT /config/categories/{category_id}
Request Body:
{
"name": "invoice",
"description": "Updated description"
}
Delete Category
Endpoint: DELETE /config/categories/{category_id}
Business Rule Management
Create Validation Rule
Define validation rules for extracted fields.
Endpoint: POST /config/rules
Request Body:
{
"name": "Tax ID Validation",
"type": "regex",
"pattern": "^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$",
"applies_to": ["tax_id", "gstin"],
"error_message": "Invalid tax ID format",
"description": "Validates tax ID format"
}
Response:
{
"success": true,
"rule_id": "rule_123",
"message": "Rule created successfully"
}
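Before registering a regex rule it can help to try the pattern locally. The sketch below compiles the pattern from the request above with Python's `re` module. The sample identifiers are made up purely to exercise the pattern, and the server's regex engine may differ from Python's in edge cases.

```python
import re

# Pattern copied from the "pattern" field of the request body above.
TAX_ID_PATTERN = re.compile(
    r"^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$"
)


def is_valid_tax_id(value):
    """True if `value` matches the tax ID pattern."""
    return TAX_ID_PATTERN.match(value) is not None


print(is_valid_tax_id("22AAAAA0000A1Z5"))  # True
print(is_valid_tax_id("22aaaaa0000a1z5"))  # False: lowercase letters are rejected
```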
List Rules
Endpoint: GET /config/rules
Response:
{
"success": true,
"rules": [
{
"rule_id": "rule_123",
"name": "Tax ID Validation",
"type": "regex",
"applies_to": ["tax_id"],
"created_at": "2025-01-19T00:00:00Z"
}
]
}
Update Rule
Endpoint: PUT /config/rules/{rule_id}
Delete Rule
Endpoint: DELETE /config/rules/{rule_id}
AI Capabilities API
Document Classification
Classify documents into tenant-defined categories.
Endpoint: POST /classify
Request Body:
{
"file_id": "file_123",
"schema_id": "schema_456" // Optional
}
Response:
{
"success": true,
"document_type": "invoice",
"confidence": 0.95,
"categories": ["invoice"],
"processing_time": 1.2
}
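A caller will usually want to act on `confidence` rather than trust `document_type` blindly. The sketch below routes low-confidence results to a fallback category. The 0.8 threshold, the choice of `other` as fallback, and the `route_document` name are assumptions, not documented behavior.

```python
def route_document(result, min_confidence=0.8, fallback="other"):
    """Pick a category, falling back when the call failed or confidence is low."""
    if not result.get("success"):
        return fallback
    if result.get("confidence", 0.0) < min_confidence:
        return fallback
    return result["document_type"]


# Response body from the example above.
sample = {
    "success": True,
    "document_type": "invoice",
    "confidence": 0.95,
    "categories": ["invoice"],
    "processing_time": 1.2,
}
print(route_document(sample))
```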
Content Extraction
Extract structured data from documents using tenant-configured schema.
Endpoint: POST /extract
Request Body:
{
"file_id": "file_123",
"schema_id": "schema_456" // Uses default if not provided
}
Response:
{
"success": true,
"extracted_data": {
"vendor_name": "Acme Corp",
"invoice_number": "INV-12345",
"invoice_date": "2025-01-19",
"total_amount": 1000.00
},
"confidence_scores": {
"vendor_name": 0.98,
"invoice_number": 0.95,
"invoice_date": 0.99,
"total_amount": 0.97
},
"fields_requiring_review": [],
"processing_time": 2.5
}
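The per-field `confidence_scores` can be combined with `fields_requiring_review` to decide what a human should check. A minimal sketch follows, assuming a caller-chosen threshold (0.9 here) that is not part of the API.

```python
def fields_to_review(result, min_confidence=0.9):
    """Fields flagged by the service plus any field scoring below the threshold."""
    flagged = set(result.get("fields_requiring_review", []))
    for name, score in result.get("confidence_scores", {}).items():
        if score < min_confidence:
            flagged.add(name)
    return sorted(flagged)


# Scores from the example response above.
sample = {
    "success": True,
    "confidence_scores": {
        "vendor_name": 0.98,
        "invoice_number": 0.95,
        "invoice_date": 0.99,
        "total_amount": 0.97,
    },
    "fields_requiring_review": [],
}
print(fields_to_review(sample))                       # nothing below 0.9
print(fields_to_review(sample, min_confidence=0.96))  # invoice_number is flagged
```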
Chat with Documents
Ask natural-language questions about a document.
Endpoint: POST /chat
Request Body:
{
"message": "What is the total amount?",
"file_id": "file_123",
"session_id": "session_456" // Optional
}
Response:
{
"success": true,
"response": "The total amount is $1,000.00",
"session_id": "session_456",
"sources": ["file_123"]
}
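The response echoes a `session_id`, which suggests that sending it back continues the same conversation. That reading is an assumption. The `ChatSession` class below is a hypothetical helper that only builds request payloads and remembers the ID; it performs no network calls.

```python
class ChatSession:
    """Carries the session_id from one /chat response into the next request."""

    def __init__(self, file_id):
        self.file_id = file_id
        self.session_id = None  # unknown until the first response arrives

    def next_payload(self, message):
        """Request body for the next POST /chat call."""
        payload = {"message": message, "file_id": self.file_id}
        if self.session_id is not None:
            payload["session_id"] = self.session_id
        return payload

    def record_response(self, response):
        """Remember the session ID and return the answer text."""
        if response.get("success"):
            self.session_id = response.get("session_id", self.session_id)
        return response.get("response")


session = ChatSession("file_123")
first = session.next_payload("What is the total amount?")  # no session_id yet
```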
Document Summarization
Summarize document content. This endpoint supports both synchronous and asynchronous (long-running) execution.
Endpoint: POST /ai/summarize
Request Body:
{
"file_id": "file_123",
"max_length": 512, // Optional: maximum summary length in tokens (default: 512)
"type": "extractive" // Optional: extractive, abstractive, or key_points
}
Response (Synchronous):
{
"success": true,
"summary": "This invoice from Acme Corp dated January 19, 2025...",
"length": 512,
"processing_time": 3.2
}
Response (Asynchronous/Long-running): When executed as a background task, the response includes a task ID:
{
"success": true,
"task_id": "task_123",
"status": "processing",
"message": "Summarization task queued"
}
Note: For long-running summarization tasks, use the task status endpoint to check completion.
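Client code has to handle both response shapes. The sketch below returns the summary directly when it is present and otherwise polls. Because the task status endpoint is not specified in this document, polling is delegated to a caller-supplied `fetch_status` function, and the assumption that a finished task's status contains a `summary` field is hypothetical.

```python
import time


def resolve_summary(response, fetch_status, poll_seconds=2.0, max_polls=30):
    """Return the summary text from a sync or async /ai/summarize response."""
    if "summary" in response:
        # Synchronous response: nothing to wait for.
        return response["summary"]
    task_id = response["task_id"]
    for _ in range(max_polls):
        status = fetch_status(task_id)  # caller-supplied; endpoint not documented here
        if "summary" in status:
            return status["summary"]
        time.sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

Bounding the number of polls keeps a stuck task from blocking the caller indefinitely.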
Form Auto-Fill
AI-powered form field auto-fill with confidence scores.
Endpoint: POST /ai/form-fill
Request Body:
{
"file_id": "file_123",
"auto_fill_all": true, // Auto-fill all detected form fields
"form_field_id": "field_456", // Optional: fill specific field
"context": "Additional context for field filling" // Optional
}
Response:
{
"success": true,
"suggestions": [
{
"field_id": "field_123",
"field_name": "email",
"suggested_value": "contact@example.com",
"confidence": 0.95, // Model-derived confidence score (0.0-1.0)
"source": "email_extraction" // email_extraction, date_extraction, text_extraction, no_match, error
}
],
"fields_count": 5
}
Confidence Calculation:
- Confidence is calculated based on:
  - Keyword matching quality between field name and document content
  - Field type validation (email format, date format, etc.)
  - Document context match
- Higher confidence (≥0.8) indicates high-quality matches
- Lower confidence (<0.5) indicates uncertain or missing matches
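The thresholds above can be applied to the `suggestions` array as sketched below. The label for the 0.5 to 0.8 range ("medium") and the policy of auto-accepting only high-confidence values are assumptions, as are the helper names.

```python
def confidence_band(confidence):
    """Map a 0.0-1.0 score to a band using the documented 0.8 and 0.5 cut-offs."""
    if confidence >= 0.8:
        return "high"
    if confidence < 0.5:
        return "low"
    return "medium"  # label for the in-between range is an assumption


def split_suggestions(suggestions):
    """Separate suggestions to accept automatically from those needing review."""
    accepted, review = [], []
    for s in suggestions:
        target = accepted if confidence_band(s["confidence"]) == "high" else review
        target.append(s)
    return accepted, review


suggestions = [
    {"field_name": "email", "suggested_value": "contact@example.com", "confidence": 0.95},
    {"field_name": "phone", "suggested_value": "", "confidence": 0.2},  # made-up entry
]
accepted, review = split_suggestions(suggestions)
```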
Semantic Search
Search documents using semantic similarity.
Endpoint: POST /search/semantic
Request Body:
{
"query": "invoices from last month",
"file_ids": ["file_123", "file_456"], // Optional: search specific files
"limit": 10
}
Response:
{
"success": true,
"results": [
{
"file_id": "file_123",
"score": 0.95,
"snippet": "Invoice from Acme Corp...",
"relevance": "high"
}
],
"total_results": 5
}
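A small sketch of building the request body and post-filtering results on the client. The `min_score` cut-off and both helper names are illustrative and not part of the API.

```python
def build_search_payload(query, limit=10, file_ids=None):
    """Request body for POST /search/semantic; file_ids is omitted when empty."""
    payload = {"query": query, "limit": limit}
    if file_ids:
        payload["file_ids"] = list(file_ids)
    return payload


def top_hits(response, min_score=0.5):
    """Results at or above `min_score`, best match first."""
    hits = [r for r in response.get("results", []) if r["score"] >= min_score]
    return sorted(hits, key=lambda r: r["score"], reverse=True)


payload = build_search_payload("invoices from last month", file_ids=["file_123", "file_456"])
```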
ML Analytics Forecasting
Generate time series forecasts using ML models.
Endpoint: POST /analytics/forecast
Request Body:
{
"metric_name": "document_uploads",
"forecast_period": "7d", // 1d, 7d, or 30d
"tenant_id": "tenant_123" // Optional
}
Response:
{
"success": true,
"forecast": [
{
"timestamp": "2025-01-28T00:00:00",
"value": 125.5,
"confidence": 0.85 // Model-derived confidence based on R² score
}
],
"accuracy": 0.87, // Forecast accuracy score
"model_used": "LinearRegression"
}
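The response names `LinearRegression` and an R²-based confidence. As a worked illustration of those two quantities (not the service's actual code), the sketch below fits an ordinary least-squares line to made-up daily counts and computes R² as 1 - SS_res / SS_tot.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on the same data."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


days = [0, 1, 2, 3, 4]
uploads = [100, 104, 111, 113, 122]  # made-up daily document_uploads counts
slope, intercept = fit_line(days, uploads)
r2 = r_squared(days, uploads, slope, intercept)
next_day = slope * 5 + intercept  # one-step-ahead forecast
print(round(slope, 2), round(r2, 3), round(next_day, 1))
```

An R² close to 1 means the line explains most of the variance in the history, which is why it is a natural basis for a forecast confidence value.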
Model Explainability
Counterfactual Explanations
Generate counterfactual explanations showing minimal changes needed to reach target prediction.
Endpoint: POST /ai/explain/counterfactual
Request Body:
{
"model_id": "model_123",
"input_data": {
"feature_1": 0.5,
"feature_2": 0.3,
"feature_3": 0.7
},
"target_prediction": 1,
"max_changes": 3 // Maximum number of features to change
}
Response:
{
"success": true,
"counterfactual_id": "cf_123",
"original_input": {...},
"counterfactual_input": {...},
"original_prediction": "0",
"counterfactual_prediction": "1",
"changes": {
"feature_1": {
"original": 0.5,
"new": 0.8,
"change": 0.3
}
},
"distance": 0.15, // L1 distance from original
"confidence": 0.9
}
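To make the `changes` and `distance` fields concrete, the sketch below derives a per-feature diff and an unscaled L1 distance from two inputs. The helper names are hypothetical, and the service may scale or normalize features before measuring distance, so its reported value need not equal this raw sum.

```python
def diff_inputs(original, counterfactual):
    """Per-feature changes, shaped like the "changes" object in the response."""
    changes = {}
    for name, old in original.items():
        new = counterfactual[name]
        if new != old:
            changes[name] = {
                "original": old,
                "new": new,
                "change": round(new - old, 6),  # rounding hides float noise
            }
    return changes


def l1_distance(original, counterfactual):
    """Sum of absolute per-feature differences (no scaling applied)."""
    return sum(abs(counterfactual[k] - original[k]) for k in original)


original = {"feature_1": 0.5, "feature_2": 0.3, "feature_3": 0.7}
counterfactual = {"feature_1": 0.8, "feature_2": 0.3, "feature_3": 0.7}
changes = diff_inputs(original, counterfactual)
distance = l1_distance(original, counterfactual)
```

Comparing `len(changes)` with the `max_changes` value sent in the request is a quick sanity check on a returned counterfactual.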
Global Explanations
Generate global model explanations using SHAP or permutation importance.
Endpoint: POST /ai/explain/global
Request Body:
{
"model_id": "model_123",
"model_version": "1.0",
"explanation_method": "SHAP", // SHAP or PERMUTATION
"sample_size": 1000 // Optional: number of samples for explanation
}
Response:
{
"success": true,
"explanation_id": "exp_123",
"model_id": "model_123",
"feature_importances": [
{
"feature_name": "feature_1",
"importance_score": 0.85,
"rank": 1,
"contribution": 0.85,
"direction": "positive",
"confidence": 0.85
}
],
"attribution_scores": {
"feature_1": 0.85,
"feature_2": 0.62
}
}
Fairness and Bias Analysis
Analyze model fairness and detect bias.
Endpoint: POST /ai/explain/fairness
Request Body:
{
"model_id": "model_123",
"explanations": [...] // List of model explanations to analyze
}
Response:
{
"success": true,
"fairness": {
"overall_fairness": "good",
"fairness_score": 0.85,
"metrics": {
"demographic_parity": 0.85,
"equalized_odds": 0.82,
"equal_opportunity": 0.88,
"disparate_impact": 0.91,
"statistical_parity": 0.87
},
"recommendations": [
"Monitor model performance across different demographic groups"
]
},
"bias": {
"overall_bias": "low",
"bias_indicators": {
"gender_bias": 0.15,
"age_bias": 0.22,
"race_bias": 0.08,
"socioeconomic_bias": 0.31
},
"recommendations": [
"Address socioeconomic bias through data augmentation"
]
}
}
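A report like the one above can be reduced to a short list of items that need attention. The 0.3 cut-off for bias indicators is an assumed value chosen for illustration; the document does not state which thresholds the service uses for its own labels.

```python
def flagged_indicators(report, threshold=0.3):
    """Bias indicators whose score exceeds `threshold`, in alphabetical order."""
    indicators = report["bias"]["bias_indicators"]
    return sorted(name for name, score in indicators.items() if score > threshold)


def weakest_metric(report):
    """Name of the fairness metric with the lowest score."""
    metrics = report["fairness"]["metrics"]
    return min(metrics, key=metrics.get)


# Values copied from the example response above.
report = {
    "fairness": {
        "metrics": {
            "demographic_parity": 0.85,
            "equalized_odds": 0.82,
            "equal_opportunity": 0.88,
            "disparate_impact": 0.91,
            "statistical_parity": 0.87,
        }
    },
    "bias": {
        "bias_indicators": {
            "gender_bias": 0.15,
            "age_bias": 0.22,
            "race_bias": 0.08,
            "socioeconomic_bias": 0.31,
        }
    },
}
print(flagged_indicators(report))  # ['socioeconomic_bias']
print(weakest_metric(report))      # equalized_odds
```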
Health Check
Endpoint: GET /health
Response:
{
"status": "healthy",
"service": "medha-os",
"version": "1.0.0"
}
Error Responses
All errors follow this format:
{
"success": false,
"error": "Error message",
"error_code": "ERROR_CODE",
"status_code": 400
}
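Because every error shares this shape, a client can convert it into an exception in one place. In the sketch below, `MedhaOSError` and `raise_for_error` are hypothetical names, and the fallback values used when a field is missing are assumptions.

```python
class MedhaOSError(Exception):
    """Raised when a response body has "success": false."""

    def __init__(self, message, error_code, status_code):
        super().__init__(f"{error_code} ({status_code}): {message}")
        self.error_code = error_code
        self.status_code = status_code


def raise_for_error(payload):
    """Return `payload` unchanged on success, raise MedhaOSError otherwise."""
    # Bodies without a "success" key (such as /health) are treated as successful.
    if payload.get("success", True):
        return payload
    raise MedhaOSError(
        payload.get("error", "Unknown error"),
        payload.get("error_code", "INTERNAL_ERROR"),  # assumed default
        payload.get("status_code", 500),              # assumed default
    )
```

Callers can then branch on `exc.error_code`, for example retrying on `RATE_LIMIT_EXCEEDED` and failing fast on `INVALID_API_KEY`.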
Common Error Codes
- INVALID_API_KEY - API key is missing or invalid
- SCHEMA_NOT_FOUND - Schema ID not found
- VALIDATION_ERROR - Request validation failed
- TENANT_NOT_FOUND - Tenant not found
- RATE_LIMIT_EXCEEDED - Rate limit exceeded
- INTERNAL_ERROR - Internal server error
Rate Limiting
Default rate limit: 100 requests per minute per API key.
Rate limit headers:
- X-RateLimit-Limit - Maximum requests allowed
- X-RateLimit-Remaining - Remaining requests
- X-RateLimit-Reset - Time when limit resets
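A client can use these headers to pause before it would otherwise be rejected. The format of `X-RateLimit-Reset` is not specified here; the sketch below assumes Unix epoch seconds, which should be verified against real responses. The `seconds_until_reset` helper is an illustrative name.

```python
import time


def seconds_until_reset(headers, now=None):
    """Seconds to wait before the next request, or 0.0 if quota remains."""
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    now = time.time() if now is None else now
    # Assumption: the reset header holds a Unix timestamp in seconds.
    reset_at = float(headers["X-RateLimit-Reset"])
    return max(0.0, reset_at - now)


headers = {
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1060",
}
print(seconds_until_reset(headers, now=1000))  # 60.0
```

Header names are looked up with the exact casing shown above; HTTP libraries that normalize header case may need a case-insensitive lookup instead.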
Examples
Complete Workflow: Accounting Software
- Create Extraction Schema:
curl -X POST https://api.example.com/api/v1/medha-os/config/schemas \
-H "X-API-Key: your-api-key" \
-H "Content-Type: application/json" \
-d '{
"name": "Invoice Schema",
"fields": [
{"name": "vendor", "type": "string", "required": true},
{"name": "invoice_number", "type": "string", "required": true},
{"name": "amount", "type": "number", "required": true}
]
}'
- Create Categories:
curl -X POST https://api.example.com/api/v1/medha-os/config/categories \
-H "X-API-Key: your-api-key" \
-H "Content-Type: application/json" \
-d '{
"categories": [
{"name": "invoice", "description": "Invoice documents"},
{"name": "receipt", "description": "Receipt documents"}
]
}'
- Extract Data:
curl -X POST https://api.example.com/api/v1/medha-os/extract \
-H "X-API-Key: your-api-key" \
-H "Content-Type: application/json" \
-d '{
"file_id": "file_123",
"schema_id": "schema_456"
}'
- Classify Document:
curl -X POST https://api.example.com/api/v1/medha-os/classify \
-H "X-API-Key: your-api-key" \
-H "Content-Type: application/json" \
-d '{
"file_id": "file_123"
}'
SDKs
JavaScript SDK
import { MedhaOSClient } from '@medhaos/sdk';
const client = new MedhaOSClient({
apiKey: 'your-api-key',
baseURL: 'https://api.example.com'
});
// Create schema
const schema = await client.config.createSchema({
name: 'Invoice Schema',
fields: [...]
});
// Extract data
const result = await client.extract({
file_id: 'file_123',
schema_id: schema.schema_id
});
Python SDK
from medhaos import MedhaOSClient
client = MedhaOSClient(
api_key='your-api-key',
base_url='https://api.example.com'
)
# Create schema
schema = client.config.create_schema(
name='Invoice Schema',
fields=[...]
)
# Extract data
result = client.extract(
file_id='file_123',
schema_id=schema['schema_id']
)
Onboarding Endpoints
POST /api/v1/onboard
Create a new tenant and receive API keys.
Request:
{
"name": "Acme Inc.",
"domain": "acme.com",
"email": "admin@acme.com",
"use_case": "e-commerce"
}
Response:
{
"success": true,
"data": {
"tenant_id": "uuid-here",
"api_key_public": "medha_pk_...",
"api_key_secret": "medha_sk_...",
"endpoint": "https://api.medhaos.com/api/v1",
"status": "active"
}
}
Rate Limit: 5 requests/hour per IP
Example:
curl -X POST https://api.example.com/api/v1/onboard \
-H "Content-Type: application/json" \
-d '{
"name": "Acme Inc.",
"domain": "acme.com",
"email": "admin@acme.com",
"use_case": "e-commerce"
}'
GET /api/v1/onboard/status/{tenant_id}
Get onboarding status for a tenant.
Response:
{
"success": true,
"data": {
"status": "active",
"tenant_id": "uuid-here",
"name": "Acme Inc.",
"domain": "acme.com",
"email": "admin@acme.com",
"domains": [
{
"domain": "acme.com",
"origin": "https://acme.com",
"is_verified": true
}
]
}
}
POST /api/v1/onboard/domains
Register an additional domain for the authenticated tenant.
Headers:
X-API-Key: Your secret API key (required)
Request:
{
"domain": "subdomain.acme.com",
"origin": "https://subdomain.acme.com"
}
Response:
{
"success": true,
"data": {
"domain": "subdomain.acme.com",
"origin": "https://subdomain.acme.com"
}
}
GET /api/v1/onboard/domains
List all registered domains for the authenticated tenant.
Headers:
X-API-Key: Your secret API key (required)
Response:
{
"success": true,
"data": {
"domains": [
{
"domain": "acme.com",
"origin": "https://acme.com",
"is_verified": true
}
]
}
}
Admin APIs
Admin-only endpoints (config, routing, learning, shadow log, etc.) are documented in FEATURES.md (section 23. Admin). Key paths include:
- POST /api/v1/admin/learning/export - Trigger SFT export (body: days?, min_samples?, export_only?)
- GET /api/v1/admin/learning/status - Learning status (events count, last export)
- GET /api/v1/admin/shadow-log - Tail of shadow log (query: limit)
See FEATURES.md for the full admin API list and docs/ADMIN_MODEL_RD.md for the Model R&D workflow.
Support
For API support, contact: support@medhaos.com
Last Updated: 2025-01-19