MedhaOS API Documentation

Version: 1.0
Date: 2025-01-19
Base URL: https://api.example.com/api/v1/medha-os
Authentication: API Key via X-API-Key header


Overview

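MedhaOS provides a REST API for AI-powered document classification, extraction, chat, summarization, form auto-fill, semantic search, forecasting, and model explainability. Every endpoint is scoped to the tenant identified by the API key, and extraction schemas, categories, and validation rules are configured per tenant rather than hard-coded.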

Key Features

  • Schema-Driven Extraction - Configure extraction fields per tenant
  • Configurable Categories - Define classification categories per tenant
  • Business Rules - Configure validation rules per tenant
  • Multi-Tenant Isolation - Complete data isolation per tenant
  • Universal Adaptability - Works for any vertical without code changes

Authentication

All API requests require authentication via API key:

X-API-Key: your-api-key-here

The API key identifies the tenant and provides access to tenant-specific data and configurations.
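
As a minimal sketch of attaching the key, the Python snippet below builds an authenticated request with the standard library without sending it. The helper name build_request is hypothetical; the base URL and header name are the ones given at the top of this document.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com/api/v1/medha-os"


def build_request(method, path, api_key, body=None):
    """Build (but do not send) an authenticated request for the given path."""
    headers = {"X-API-Key": api_key}
    data = None
    if body is not None:
        # JSON bodies are sent as UTF-8 bytes with an explicit content type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


req = build_request("GET", "/config/schemas", "your-api-key-here")
print(req.get_method(), req.full_url)
# Sending the request is one more call: urllib.request.urlopen(req)
```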


Configuration API

Schema Management

Create Extraction Schema

Define fields to extract from documents.

Endpoint: POST /config/schemas

Request Body:

{
  "name": "Invoice Schema",
  "description": "Schema for extracting invoice data",
  "fields": [
    {
      "name": "vendor_name",
      "type": "string",
      "required": true,
      "description": "Name of the vendor",
      "validation_rules": ["not_empty", "max_length:100"]
    },
    {
      "name": "invoice_number",
      "type": "string",
      "required": true,
      "description": "Invoice number",
      "validation_rules": ["regex:^INV-\\d+$"]
    },
    {
      "name": "invoice_date",
      "type": "date",
      "required": true,
      "description": "Date of invoice"
    },
    {
      "name": "total_amount",
      "type": "number",
      "required": true,
      "description": "Total amount",
      "validation_rules": ["min:0", "max:1000000"]
    }
  ]
}

Response:

{
  "success": true,
  "schema_id": "schema_123",
  "message": "Schema created successfully"
}
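
The validation_rules strings above can be checked locally before a schema is submitted. The sketch below is one possible reading of the rule syntax shown in the request ("name" or "name:argument"); the exact semantics the server applies are not documented here, so the helper names and the interpretation of each rule are assumptions.

```python
import re


def check_rule(rule, value):
    """Return True if value satisfies one rule string such as 'max_length:100'."""
    # Split on the first colon only, so regex arguments may contain colons.
    name, _, arg = rule.partition(":")
    if name == "not_empty":
        return bool(str(value).strip())
    if name == "max_length":
        return len(str(value)) <= int(arg)
    if name == "regex":
        return re.search(arg, str(value)) is not None
    if name == "min":
        return float(value) >= float(arg)
    if name == "max":
        return float(value) <= float(arg)
    raise ValueError(f"unknown rule: {rule}")


def failed_rules(field, value):
    """List the rule strings of a schema field that value does not satisfy."""
    return [r for r in field.get("validation_rules", []) if not check_rule(r, value)]


invoice_number = {"name": "invoice_number", "validation_rules": [r"regex:^INV-\d+$"]}
total_amount = {"name": "total_amount", "validation_rules": ["min:0", "max:1000000"]}

print(failed_rules(invoice_number, "INV-12345"))  # []
print(failed_rules(total_amount, -5))             # ['min:0']
```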

Get Schema

Endpoint: GET /config/schemas/{schema_id}

Response:

{
  "success": true,
  "schema": {
    "schema_id": "schema_123",
    "tenant_id": "tenant_456",
    "name": "Invoice Schema",
    "description": "Schema for extracting invoice data",
    "fields": [...],
    "created_at": "2025-01-19T00:00:00Z",
    "updated_at": "2025-01-19T00:00:00Z"
  }
}

Update Schema

Endpoint: PUT /config/schemas/{schema_id}

Request Body: Same as create schema

Response:

{
  "success": true,
  "message": "Schema updated successfully"
}

Delete Schema

Endpoint: DELETE /config/schemas/{schema_id}

Response:

{
  "success": true,
  "message": "Schema deleted successfully"
}

List Schemas

Endpoint: GET /config/schemas

Response:

{
  "success": true,
  "schemas": [
    {
      "schema_id": "schema_123",
      "name": "Invoice Schema",
      "created_at": "2025-01-19T00:00:00Z"
    },
    {
      "schema_id": "schema_124",
      "name": "Receipt Schema",
      "created_at": "2025-01-19T00:00:00Z"
    }
  ]
}

Category Management

Create Categories

Define classification categories for documents.

Endpoint: POST /config/categories

Request Body:

{
  "categories": [
    {
      "name": "invoice",
      "description": "Invoice documents",
      "parent": null
    },
    {
      "name": "receipt",
      "description": "Receipt documents",
      "parent": null
    },
    {
      "name": "contract",
      "description": "Contract documents",
      "parent": null
    },
    {
      "name": "other",
      "description": "Other documents",
      "parent": null
    }
  ]
}

Response:

{
  "success": true,
  "message": "Categories created successfully",
  "category_count": 4
}

List Categories

Endpoint: GET /config/categories

Response:

{
  "success": true,
  "categories": [
    {
      "category_id": "cat_123",
      "name": "invoice",
      "description": "Invoice documents",
      "created_at": "2025-01-19T00:00:00Z"
    },
    ...
  ]
}

Update Category

Endpoint: PUT /config/categories/{category_id}

Request Body:

{
  "name": "invoice",
  "description": "Updated description"
}

Delete Category

Endpoint: DELETE /config/categories/{category_id}

Business Rule Management

Create Validation Rule

Define validation rules for extracted fields.

Endpoint: POST /config/rules

Request Body:

{
  "name": "Tax ID Validation",
  "type": "regex",
  "pattern": "^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$",
  "applies_to": ["tax_id", "gstin"],
  "error_message": "Invalid tax ID format",
  "description": "Validates tax ID format"
}

Response:

{
  "success": true,
  "rule_id": "rule_123",
  "message": "Rule created successfully"
}
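
Before registering a regex rule, it can help to try the pattern against sample values. The sketch below compiles the pattern from the request above with Python's re module; the sample IDs are illustrative values, not real tax IDs, and the helper name is hypothetical.

```python
import re

# Pattern copied from the "Tax ID Validation" rule above.
TAX_ID_PATTERN = re.compile(
    r"^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$"
)


def is_valid_tax_id(value):
    """Return True if value matches the 15-character layout the rule expects."""
    return TAX_ID_PATTERN.match(value) is not None


print(is_valid_tax_id("22AAAAA0000A1Z5"))  # True
print(is_valid_tax_id("22aaaaa0000a1z5"))  # False: lowercase letters
print(is_valid_tax_id("22AAAAA0000A1Z"))   # False: only 14 characters
```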

List Rules

Endpoint: GET /config/rules

Response:

{
  "success": true,
  "rules": [
    {
      "rule_id": "rule_123",
      "name": "Tax ID Validation",
      "type": "regex",
      "applies_to": ["tax_id", "gstin"],
      "created_at": "2025-01-19T00:00:00Z"
    }
  ]
}

Update Rule

Endpoint: PUT /config/rules/{rule_id}

Delete Rule

Endpoint: DELETE /config/rules/{rule_id}


AI Capabilities API

Document Classification

Classify documents into tenant-defined categories.

Endpoint: POST /classify

Request Body:

{
  "file_id": "file_123",
  "schema_id": "schema_456"  // Optional
}

Response:

{
  "success": true,
  "document_type": "invoice",
  "confidence": 0.95,
  "categories": ["invoice"],
  "processing_time": 1.2
}

Content Extraction

Extract structured data from documents using tenant-configured schema.

Endpoint: POST /extract

Request Body:

{
  "file_id": "file_123",
  "schema_id": "schema_456"  // Uses default if not provided
}

Response:

{
  "success": true,
  "extracted_data": {
    "vendor_name": "Acme Corp",
    "invoice_number": "INV-12345",
    "invoice_date": "2025-01-19",
    "total_amount": 1000.00
  },
  "confidence_scores": {
    "vendor_name": 0.98,
    "invoice_number": 0.95,
    "invoice_date": 0.99,
    "total_amount": 0.97
  },
  "fields_requiring_review": [],
  "processing_time": 2.5
}
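
A client may want its own review threshold on top of fields_requiring_review. The sketch below merges the server-provided list with any field whose confidence score falls under a caller-chosen cutoff; the function name and the cutoff values are assumptions for illustration.

```python
def fields_to_review(response, threshold):
    """Return field names flagged by the server or scoring below threshold."""
    scores = response.get("confidence_scores", {})
    flagged = set(response.get("fields_requiring_review", []))
    flagged.update(name for name, score in scores.items() if score < threshold)
    return sorted(flagged)


# Sample response copied from the extraction example above.
response = {
    "success": True,
    "extracted_data": {
        "vendor_name": "Acme Corp",
        "invoice_number": "INV-12345",
        "invoice_date": "2025-01-19",
        "total_amount": 1000.00,
    },
    "confidence_scores": {
        "vendor_name": 0.98,
        "invoice_number": 0.95,
        "invoice_date": 0.99,
        "total_amount": 0.97,
    },
    "fields_requiring_review": [],
}

print(fields_to_review(response, 0.96))  # ['invoice_number']
print(fields_to_review(response, 0.90))  # []
```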

Chat with Documents

Ask natural-language questions about a document. The response echoes the session_id and lists the files used as sources.

Endpoint: POST /chat

Request Body:

{
  "message": "What is the total amount?",
  "file_id": "file_123",
  "session_id": "session_456"  // Optional
}

Response:

{
  "success": true,
  "response": "The total amount is $1,000.00",
  "session_id": "session_456",
  "sources": ["file_123"]
}

Document Summarization

Summarize document content. This endpoint supports both synchronous and asynchronous (long-running) execution.

Endpoint: POST /ai/summarize

Request Body:

{
  "file_id": "file_123",
  "max_length": 512,  // Optional: maximum summary length in tokens (default: 512)
  "type": "extractive"  // Optional: extractive, abstractive, or key_points
}

Response (Synchronous):

{
  "success": true,
  "summary": "This invoice from Acme Corp dated January 19, 2025...",
  "length": 512,
  "processing_time": 3.2
}

Response (Asynchronous/Long-running): When executed as a background task, the response includes a task ID:

{
  "success": true,
  "task_id": "task_123",
  "status": "processing",
  "message": "Summarization task queued"
}

Note: For long-running summarization tasks, use the task status endpoint to check completion.
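
A client can handle both response shapes with one helper. The sketch below returns a synchronous response unchanged and polls when a task_id is present. The task status endpoint is not documented in this section, so the lookup is passed in as a hypothetical fetch_status callable, and the "completed" status value in the demo is an assumption.

```python
import time


def wait_for_task(response, fetch_status, interval=2.0, max_polls=30, sleep=time.sleep):
    """Return the final payload for a summarize call, polling if it was queued."""
    if "task_id" not in response:
        return response  # synchronous response already carries the summary
    task_id = response["task_id"]
    for _ in range(max_polls):
        status = fetch_status(task_id)
        if status.get("status") != "processing":
            return status
        sleep(interval)
    raise TimeoutError(f"task {task_id} still processing after {max_polls} polls")


# Demo with a fake status lookup instead of a network call.
replies = iter([
    {"status": "processing"},
    {"status": "completed", "summary": "..."},  # assumed terminal status
])
queued = {"success": True, "task_id": "task_123", "status": "processing"}
final = wait_for_task(queued, lambda task_id: next(replies), sleep=lambda s: None)
print(final["status"])  # completed
```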

Form Auto-Fill

AI-powered form field auto-fill with confidence scores.

Endpoint: POST /ai/form-fill

Request Body:

{
  "file_id": "file_123",
  "auto_fill_all": true,  // Auto-fill all detected form fields
  "form_field_id": "field_456",  // Optional: fill specific field
  "context": "Additional context for field filling"  // Optional
}

Response:

{
  "success": true,
  "suggestions": [
    {
      "field_id": "field_123",
      "field_name": "email",
      "suggested_value": "contact@example.com",
      "confidence": 0.95,  // Model-derived confidence score (0.0-1.0)
      "source": "email_extraction"  // email_extraction, date_extraction, text_extraction, no_match, error
    }
  ],
  "fields_count": 5
}

Confidence Calculation:

  • Confidence is calculated based on:
    • Keyword matching quality between field name and document content
    • Field type validation (email format, date format, etc.)
    • Document context match
  • Higher confidence (≥0.8) indicates high-quality matches
  • Lower confidence (<0.5) indicates uncertain or missing matches
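
The thresholds above can be applied client-side to decide which suggestions to accept automatically. In the sketch below, the "medium" band for scores between 0.5 and 0.8 is an assumption (the source names only the high and low ends), and the due_date and po_number entries are illustrative sample data.

```python
def confidence_band(confidence):
    """Map a 0.0-1.0 confidence score to a coarse band."""
    if confidence >= 0.8:
        return "high"
    if confidence < 0.5:
        return "low"
    return "medium"  # assumption: the source does not name this middle band


def group_suggestions(suggestions):
    """Group form-fill suggestions by confidence band, keyed by field_name."""
    groups = {"high": [], "medium": [], "low": []}
    for s in suggestions:
        groups[confidence_band(s["confidence"])].append(s["field_name"])
    return groups


suggestions = [
    {"field_name": "email", "suggested_value": "contact@example.com", "confidence": 0.95},
    {"field_name": "due_date", "suggested_value": "2025-02-01", "confidence": 0.62},
    {"field_name": "po_number", "suggested_value": "", "confidence": 0.10},
]
print(group_suggestions(suggestions))
# {'high': ['email'], 'medium': ['due_date'], 'low': ['po_number']}
```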

Semantic Search

Search documents using semantic similarity.

Endpoint: POST /search/semantic

Request Body:

{
  "query": "invoices from last month",
  "file_ids": ["file_123", "file_456"],  // Optional: search specific files
  "limit": 10
}

Response:

{
  "success": true,
  "results": [
    {
      "file_id": "file_123",
      "score": 0.95,
      "snippet": "Invoice from Acme Corp...",
      "relevance": "high"
    }
  ],
  "total_results": 5
}

ML Analytics Forecasting

Generate time series forecasts using ML models.

Endpoint: POST /analytics/forecast

Request Body:

{
  "metric_name": "document_uploads",
  "forecast_period": "7d",  // 1d, 7d, or 30d
  "tenant_id": "tenant_123"  // Optional
}

Response:

{
  "success": true,
  "forecast": [
    {
      "timestamp": "2025-01-28T00:00:00",
      "value": 125.5,
      "confidence": 0.85  // Model-derived confidence based on R² score
    }
  ],
  "accuracy": 0.87,  // Forecast accuracy score
  "model_used": "LinearRegression"
}

Model Explainability

Counterfactual Explanations

Generate counterfactual explanations showing minimal changes needed to reach target prediction.

Endpoint: POST /ai/explain/counterfactual

Request Body:

{
  "model_id": "model_123",
  "input_data": {
    "feature_1": 0.5,
    "feature_2": 0.3,
    "feature_3": 0.7
  },
  "target_prediction": 1,
  "max_changes": 3  // Maximum number of features to change
}

Response:

{
  "success": true,
  "counterfactual_id": "cf_123",
  "original_input": {...},
  "counterfactual_input": {...},
  "original_prediction": "0",
  "counterfactual_prediction": "1",
  "changes": {
    "feature_1": {
      "original": 0.5,
      "new": 0.8,
      "change": 0.3
    }
  },
  "distance": 0.15,  // L1 distance from original
  "confidence": 0.9
}
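
The changes object can be replayed on the original input to rebuild the counterfactual input locally. The sketch below does that and checks the result against max_changes; the helper names are hypothetical, and it uses the "new" value of each change rather than recomputing it from "original" plus "change".

```python
def apply_changes(original_input, changes):
    """Return a copy of original_input with each changed feature set to its new value."""
    updated = dict(original_input)
    for feature, delta in changes.items():
        updated[feature] = delta["new"]
    return updated


def within_budget(changes, max_changes):
    """True if the explanation altered no more features than requested."""
    return len(changes) <= max_changes


original_input = {"feature_1": 0.5, "feature_2": 0.3, "feature_3": 0.7}
changes = {"feature_1": {"original": 0.5, "new": 0.8, "change": 0.3}}

print(apply_changes(original_input, changes))
# {'feature_1': 0.8, 'feature_2': 0.3, 'feature_3': 0.7}
print(within_budget(changes, max_changes=3))  # True
```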

Global Explanations

Generate global model explanations using SHAP or permutation importance.

Endpoint: POST /ai/explain/global

Request Body:

{
  "model_id": "model_123",
  "model_version": "1.0",
  "explanation_method": "SHAP",  // SHAP or PERMUTATION
  "sample_size": 1000  // Optional: number of samples for explanation
}

Response:

{
  "success": true,
  "explanation_id": "exp_123",
  "model_id": "model_123",
  "feature_importances": [
    {
      "feature_name": "feature_1",
      "importance_score": 0.85,
      "rank": 1,
      "contribution": 0.85,
      "direction": "positive",
      "confidence": 0.85
    }
  ],
  "attribution_scores": {
    "feature_1": 0.85,
    "feature_2": 0.62
  }
}

Fairness and Bias Analysis

Analyze model fairness and detect bias.

Endpoint: POST /ai/explain/fairness

Request Body:

{
  "model_id": "model_123",
  "explanations": [...]  // List of model explanations to analyze
}

Response:

{
  "success": true,
  "fairness": {
    "overall_fairness": "good",
    "fairness_score": 0.85,
    "metrics": {
      "demographic_parity": 0.85,
      "equalized_odds": 0.82,
      "equal_opportunity": 0.88,
      "disparate_impact": 0.91,
      "statistical_parity": 0.87
    },
    "recommendations": [
      "Monitor model performance across different demographic groups"
    ]
  },
  "bias": {
    "overall_bias": "low",
    "bias_indicators": {
      "gender_bias": 0.15,
      "age_bias": 0.22,
      "race_bias": 0.08,
      "socioeconomic_bias": 0.31
    },
    "recommendations": [
      "Address socioeconomic bias through data augmentation"
    ]
  }
}
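
A client might surface only the indicators that exceed an internal limit. The sketch below filters bias_indicators by a caller-chosen threshold; the 0.3 default and the reading that higher scores mean more bias are assumptions, since the source does not define the scale.

```python
def flagged_bias(response, threshold=0.3):
    """Return the names of bias indicators whose score exceeds threshold."""
    indicators = response.get("bias", {}).get("bias_indicators", {})
    return sorted(name for name, score in indicators.items() if score > threshold)


# Bias portion of the sample response above.
response = {
    "success": True,
    "bias": {
        "overall_bias": "low",
        "bias_indicators": {
            "gender_bias": 0.15,
            "age_bias": 0.22,
            "race_bias": 0.08,
            "socioeconomic_bias": 0.31,
        },
    },
}

print(flagged_bias(response))        # ['socioeconomic_bias']
print(flagged_bias(response, 0.2))   # ['age_bias', 'socioeconomic_bias']
```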

Health Check

Endpoint: GET /health

Response:

{
  "status": "healthy",
  "service": "medha-os",
  "version": "1.0.0"
}

Error Responses

All errors follow this format:

{
  "success": false,
  "error": "Error message",
  "error_code": "ERROR_CODE",
  "status_code": 400
}

Common Error Codes

  • INVALID_API_KEY - API key is missing or invalid
  • SCHEMA_NOT_FOUND - Schema ID not found
  • VALIDATION_ERROR - Request validation failed
  • TENANT_NOT_FOUND - Tenant not found
  • RATE_LIMIT_EXCEEDED - Rate limit exceeded
  • INTERNAL_ERROR - Internal server error
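
Because every error shares one envelope, a client can convert it into an exception in one place. The sketch below is a minimal example; the MedhaOSError class and raise_for_error helper are hypothetical names, not part of a published SDK.

```python
class MedhaOSError(Exception):
    """Raised when a response body has "success": false."""

    def __init__(self, message, error_code, status_code):
        super().__init__(f"{error_code} ({status_code}): {message}")
        self.error_code = error_code
        self.status_code = status_code


def raise_for_error(payload):
    """Return payload unchanged, or raise MedhaOSError for an error envelope."""
    if payload.get("success") is False:
        raise MedhaOSError(
            payload.get("error", "unknown error"),
            payload.get("error_code", "UNKNOWN"),
            payload.get("status_code", 0),
        )
    return payload


try:
    raise_for_error({
        "success": False,
        "error": "Schema ID not found",
        "error_code": "SCHEMA_NOT_FOUND",
        "status_code": 404,
    })
except MedhaOSError as exc:
    print(exc.error_code, exc.status_code)  # SCHEMA_NOT_FOUND 404
```

Responses without a success field, such as the health check, pass through unchanged.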

Rate Limiting

Default rate limit: 100 requests per minute per API key.

Rate limit headers:

  • X-RateLimit-Limit - Maximum requests allowed
  • X-RateLimit-Remaining - Remaining requests
  • X-RateLimit-Reset - Time when limit resets
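
These headers let a client pause before it runs into the limit. The sketch below computes how long to wait; it assumes X-RateLimit-Reset is a Unix timestamp in seconds, which this document does not specify, so adjust the parsing if the API uses another format.

```python
import time


def seconds_until_retry(headers, now=None):
    """Return how long to wait before the next request, in seconds."""
    if now is None:
        now = time.time()
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0  # quota left, no need to wait
    # Assumption: the reset header is a Unix timestamp in seconds.
    reset_at = float(headers["X-RateLimit-Reset"])
    return max(0.0, reset_at - now)


headers = {
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1737244860",
}
print(seconds_until_retry(headers, now=1737244800))  # 60.0
```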

Examples

Complete Workflow: Accounting Software

  1. Create Extraction Schema:
curl -X POST https://api.example.com/api/v1/medha-os/config/schemas \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Invoice Schema",
    "fields": [
      {"name": "vendor", "type": "string", "required": true},
      {"name": "invoice_number", "type": "string", "required": true},
      {"name": "amount", "type": "number", "required": true}
    ]
  }'
  2. Create Categories:
curl -X POST https://api.example.com/api/v1/medha-os/config/categories \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "categories": [
      {"name": "invoice", "description": "Invoice documents"},
      {"name": "receipt", "description": "Receipt documents"}
    ]
  }'
  3. Extract Data:
curl -X POST https://api.example.com/api/v1/medha-os/extract \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "file_id": "file_123",
    "schema_id": "schema_456"
  }'
  4. Classify Document:
curl -X POST https://api.example.com/api/v1/medha-os/classify \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "file_id": "file_123"
  }'

SDKs

JavaScript SDK

import { MedhaOSClient } from '@medhaos/sdk';

const client = new MedhaOSClient({
  apiKey: 'your-api-key',
  baseURL: 'https://api.example.com'
});

// Create schema
const schema = await client.config.createSchema({
  name: 'Invoice Schema',
  fields: [...]
});

// Extract data
const result = await client.extract({
  file_id: 'file_123',
  schema_id: schema.schema_id
});

Python SDK

from medhaos import MedhaOSClient

client = MedhaOSClient(
    api_key='your-api-key',
    base_url='https://api.example.com'
)

# Create schema
schema = client.config.create_schema(
    name='Invoice Schema',
    fields=[...]
)

# Extract data
result = client.extract(
    file_id='file_123',
    schema_id=schema['schema_id']
)

Onboarding Endpoints

POST /api/v1/onboard

Create a new tenant and receive API keys.

Request:

{
  "name": "Acme Inc.",
  "domain": "acme.com",
  "email": "admin@acme.com",
  "use_case": "e-commerce"
}

Response:

{
  "success": true,
  "data": {
    "tenant_id": "uuid-here",
    "api_key_public": "medha_pk_...",
    "api_key_secret": "medha_sk_...",
    "endpoint": "https://api.medhaos.com/api/v1",
    "status": "active"
  }
}

Rate Limit: 5 requests/hour per IP

Example (curl):

curl -X POST https://api.example.com/api/v1/onboard \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Acme Inc.",
    "domain": "acme.com",
    "email": "admin@acme.com",
    "use_case": "e-commerce"
  }'

GET /api/v1/onboard/status/{tenant_id}

Get onboarding status for a tenant.

Response:

{
  "success": true,
  "data": {
    "status": "active",
    "tenant_id": "uuid-here",
    "name": "Acme Inc.",
    "domain": "acme.com",
    "email": "admin@acme.com",
    "domains": [
      {
        "domain": "acme.com",
        "origin": "https://acme.com",
        "is_verified": true
      }
    ]
  }
}

POST /api/v1/onboard/domains

Register an additional domain for the authenticated tenant.

Headers:

  • X-API-Key: Your secret API key (required)

Request:

{
  "domain": "subdomain.acme.com",
  "origin": "https://subdomain.acme.com"
}

Response:

{
  "success": true,
  "data": {
    "domain": "subdomain.acme.com",
    "origin": "https://subdomain.acme.com"
  }
}

GET /api/v1/onboard/domains

List all registered domains for the authenticated tenant.

Headers:

  • X-API-Key: Your secret API key (required)

Response:

{
  "success": true,
  "data": {
    "domains": [
      {
        "domain": "acme.com",
        "origin": "https://acme.com",
        "is_verified": true
      }
    ]
  }
}

Admin APIs

Admin-only endpoints (config, routing, learning, shadow log, etc.) are documented in FEATURES.md (section 23. Admin). Key paths include:

  • POST /api/v1/admin/learning/export — Trigger SFT export (body: days?, min_samples?, export_only?)
  • GET /api/v1/admin/learning/status — Learning status (events count, last export)
  • GET /api/v1/admin/shadow-log — Tail of shadow log (query: limit)

See FEATURES.md for the full admin API list and docs/ADMIN_MODEL_RD.md for the Model R&D workflow.

Support

For API support, contact: support@medhaos.com


Last Updated: 2025-01-19