Genesis — How /comprendre Works

API reference for POST /monce-haiku, /monce-sonnet, /comprendre, /deterministic, /stream

Endpoints

POST /monce-haiku      → orchestrate, Haiku fallback if trust < 65
POST /monce-sonnet     → orchestrate, Sonnet fallback if trust < 65
POST /comprendre       → alias for /monce-haiku
POST /deterministic    → fan-out only, never calls LLM, always $0
POST /stream           → SSE: deterministic event first, then LLM event if needed

Request Body

{
  "text": "Bonjour, je cherche du feuilleté 44.2...",
  "factory_id": 3,
  "anthropic": false     // default false — controls LLM fallback
}

anthropic: false (the default) means the orchestrator never calls an LLM, even when trust is low. Set anthropic: true to allow the LLM fallback whenever the aggregate trust drops below the 65 threshold.
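The interaction between the anthropic flag and the trust threshold can be sketched as follows (the function name is illustrative; the 65 threshold is taken from the descriptions above):

```python
TRUST_THRESHOLD = 65  # aggregate trust below this triggers the LLM fallback

def should_invoke_llm(trust_aggregate: int, anthropic: bool) -> bool:
    """Return True only when the caller opted in AND aggregate trust is below threshold."""
    if not anthropic:
        return False  # default: never call the LLM, regardless of trust
    return trust_aggregate < TRUST_THRESHOLD
```

Note that an aggregate of exactly 65 stays deterministic: the fallback fires only on trust strictly below the threshold, matching the "trust < 65" wording in the endpoint list.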

Response Shape

{
  "version": "0.1.0",
  "mode": "deterministic" | "haiku_enhanced" | "sonnet_enhanced",
  "llm_invoked": false,
  "llm_model": null,
  "trust_aggregate": 67,
  "trust_assessment": {
    "aggregate": 67,
    "needs_llm": false,
    "reason": "Aggregate trust 67 >= 65 — deterministic sufficient",
    "ok_services": [...],
    "high_trust": [...],
    "low_trust": [...],
    "failed": [...]
  },
  "trust_breakdown": {
    "emailclassifier": {"trust": 65, "status": "ok", "label": "Email", "latency_ms": 42},
    ...
  },
  "dominant": {
    "service": "businessclassifier",
    "label": "Dispatch",
    "trust": 100,
    "prediction": "demande_devis"
  },
  "consensus": {
    "emailclassifier": "Devis",
    "businessclassifier": "demande_devis",
    ...
  },
  "synthesis": null,           // null if deterministic, LLM text if enhanced
  "classifiers": { ... },      // full raw response per service
  "errors": { ... },           // errors for failed services
  "latency_ms": 287,
  "fanout_ms": 287,
  "llm_ms": null
}
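Since synthesis is null for deterministic responses, a client can branch on it directly. A minimal sketch (field names from the shape above; the helper name is hypothetical):

```python
def final_text(resp: dict) -> str:
    """Prefer the LLM synthesis when present; otherwise fall back to the dominant prediction."""
    if resp["synthesis"] is not None:
        return resp["synthesis"]          # haiku_enhanced / sonnet_enhanced path
    return resp["dominant"]["prediction"]  # deterministic path
```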

SSE Stream (/stream)

The /stream endpoint returns Server-Sent Events for the landing page:

event: deterministic
data: { full deterministic response }

event: llm_start          // only if needs_llm
data: {"model": "haiku", "reason": "..."}

event: llm_done           // only if LLM succeeded
data: { full enhanced response }

event: complete           // only if deterministic was sufficient
data: {"reason": "Aggregate trust 67 >= 65"}
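The server-side ordering above can be sketched as a generator of (event, data) pairs; this is an illustrative model of the sequence, not the service code (function names and the run_llm callback are assumptions):

```python
import json
from typing import Callable, Iterator, Tuple

def stream_events(det: dict, run_llm: Callable = None) -> Iterator[Tuple[str, str]]:
    """Yield SSE (event, data) pairs in the order /stream emits them."""
    # The deterministic result always streams first.
    yield "deterministic", json.dumps(det)
    assessment = det["trust_assessment"]
    if assessment["needs_llm"]:
        # LLM path: announce the model, then stream the enhanced response.
        yield "llm_start", json.dumps({"model": "haiku", "reason": assessment["reason"]})
        yield "llm_done", json.dumps(run_llm(det))
    else:
        # Deterministic was sufficient: close out with a complete event.
        yield "complete", json.dumps({"reason": assessment["reason"]})
```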

Fan-out Behavior

All 10 downstream calls are made concurrently via asyncio.gather. Every call sends anthropic: false. A slow or failing service does not block the others. Timeout: 10s per service, 12s global.
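A minimal sketch of that fan-out pattern, assuming the timeouts above (10s per service, 12s global); the function names are illustrative:

```python
import asyncio

SERVICE_TIMEOUT = 10.0  # per-service cap
GLOBAL_TIMEOUT = 12.0   # cap on the whole fan-out

async def call_service(name: str, coro) -> tuple:
    """Wrap one downstream call so a failure or timeout cannot sink the batch."""
    try:
        result = await asyncio.wait_for(coro, timeout=SERVICE_TIMEOUT)
        return name, result, None
    except Exception as exc:
        return name, None, str(exc)

async def fan_out(calls: dict) -> dict:
    """calls: {service_name: awaitable}. Runs everything concurrently via gather."""
    tasks = [call_service(name, coro) for name, coro in calls.items()]
    done = await asyncio.wait_for(asyncio.gather(*tasks), timeout=GLOBAL_TIMEOUT)
    return {name: (result, error) for name, result, error in done}
```

Because each call is wrapped individually, a slow or failing classifier surfaces as an error entry instead of blocking the others.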

Trust Extraction

Each classifier reports trust in a different shape. The orchestrator normalizes them in priority order:

1. response.trust_score (int)
2. response.trust.score (nested dict)
3. response.trust (int)
4. Fallback: max(Probability) * 100
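The four-step normalization above can be sketched as a single function. This is an assumption-laden sketch: the field names come from the list, but whether Probability is a list or a dict of class→probability is not specified, so both are handled:

```python
def extract_trust(response: dict):
    """Normalize the four trust shapes, checked in priority order; None if absent."""
    # 1. top-level trust_score (int)
    if isinstance(response.get("trust_score"), (int, float)):
        return float(response["trust_score"])
    trust = response.get("trust")
    # 2. nested trust dict with a score field
    if isinstance(trust, dict) and "score" in trust:
        return float(trust["score"])
    # 3. bare trust int
    if isinstance(trust, (int, float)):
        return float(trust)
    # 4. fallback: highest class probability, scaled to 0-100
    probs = response.get("Probability")
    if probs:
        values = probs.values() if isinstance(probs, dict) else probs
        return max(values) * 100
    return None
```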