
What this page is for

Use this page as the public contract for Context’s managed query runtime. Both of Context’s public Query surfaces route through the same runtime:
  • /api/v1/query — the SDK / API surface for programmatic callers
  • /api/chat — the first-party conversational surface
You should expect identical answer behavior across both. Transport, streaming, and session state differ by surface; the managed runtime that produces the answer does not.

What the runtime guarantees

Grounded answers

Every paid response is assembled from tool results the runtime actually retrieved. If the marketplace can’t answer, the runtime returns capability_miss instead of fabricating numbers.

Honest ambiguity

If a prompt is under-specified, the agentic runtime proceeds with the best grounded interpretation and discloses any assumption it made in the answer. It returns capability_miss only when the marketplace genuinely can’t answer.

Full collection preservation

Comparison queries return the complete relevant set rather than a cherry-picked subset, so your model can reason over the whole collection.

Per-response settlement

Paid responses settle in USDC on Base after the answer is delivered. Failed executions do not trigger a charge.

Public controls

Both surfaces expose the same underlying runtime through different wire shapes. The public controls that actually affect answer behavior are:
| Control | Surface | Description |
| --- | --- | --- |
| query | /api/v1/query | Natural-language question to answer |
| tools | /api/v1/query | Optional pinned tool shortlist (overrides marketplace discovery) |
| responseShape | /api/v1/query | answer_with_evidence (default) or evidence_only |
| includeData | /api/v1/query | Optional bounded inline execution data. Large payloads return a preview plus full-data references |
| agentModelId | /api/v1/query | Public model knob for the main librarian agent loop |
| selectedChatModel | /api/chat | Legacy chat wire name for the same user-selected agent model |
| includeDeveloperTrace | /api/v1/query | Opt in to structured diagnostics for your own debugging |

All other runtime behavior — discovery, selection, ambiguity resolution, execution, retrieval, and post-execution computation via the managed Python code_interpreter (see Code Interpreter) — is managed by the server. Callers do not pick between internal lanes or budgets.
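
As a rough illustration of how the public controls fit together, the sketch below assembles a request body for /api/v1/query. It assumes the controls travel as top-level JSON fields and that tools is a list of string identifiers; neither detail is confirmed on this page, and build_query_payload is a hypothetical helper, not part of any SDK.

```python
from __future__ import annotations

from typing import Any, Optional

# The two response shapes named in the controls table.
ALLOWED_SHAPES = {"answer_with_evidence", "evidence_only"}


def build_query_payload(
    query: str,
    tools: Optional[list[str]] = None,
    response_shape: str = "answer_with_evidence",
    include_data: bool = False,
    agent_model_id: Optional[str] = None,
    include_developer_trace: bool = False,
) -> dict[str, Any]:
    """Assemble a request body from the public controls.

    Assumption: every control is a top-level JSON field. Optional
    controls are omitted unless the caller sets them.
    """
    if not query.strip():
        raise ValueError("query must be a non-empty question")
    if response_shape not in ALLOWED_SHAPES:
        raise ValueError(f"unknown responseShape: {response_shape!r}")

    payload: dict[str, Any] = {"query": query, "responseShape": response_shape}
    if tools:
        payload["tools"] = list(tools)  # pinned shortlist; assumed to be string ids
    if include_data:
        payload["includeData"] = True
    if agent_model_id:
        payload["agentModelId"] = agent_model_id
    if include_developer_trace:
        payload["includeDeveloperTrace"] = True
    return payload
```

The helper only builds the body; sending it (method, headers, API key placement) is left out because this page does not specify those details.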

Response shapes

| Shape | Who it’s for | Contents |
| --- | --- | --- |
| answer_with_evidence | First-party chat, human-facing apps (default) | Natural-language answer plus structured evidence, freshness, confidence, and optional view payloads |
| evidence_only | External agents that don’t need prose | Bounded evidence, computed artifacts, and full-data references without a synthesized prose answer |

In evidence_only mode the agent harness receives the structured evidence package and computed artifacts without prose synthesis. Full execution data is referenced by artifacts.canonicalDataRef and, when requested, dataUrl; includeData: true adds a bounded preview rather than unbounded raw rows.

Artifacts vs. computed artifacts. Structured charts and dataframes the runtime computes (for example, via the managed Python code_interpreter) are returned in computedArtifacts. The artifacts field carries provenance metadata: source references and dataset handles for the evidence behind the answer.

When the runtime returns capability_miss, the response still carries typed fields your client can branch on, so no prose-parsing is required.
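
A client-side branch over these fields might look like the sketch below. The names computedArtifacts, artifacts.canonicalDataRef, and dataUrl come from this page; the status field used to detect capability_miss, the answer field for prose, and the exact location of dataUrl are assumptions made for illustration, so check the real response schema before relying on them.

```python
from typing import Any, Dict


def summarize_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a query response to the fields a client typically branches on."""
    # Assumption: a capability miss is signalled by a typed `status` field.
    if response.get("status") == "capability_miss":
        return {"kind": "capability_miss"}

    artifacts = response.get("artifacts") or {}
    computed = response.get("computedArtifacts") or []
    return {
        "kind": "answer",
        # Assumption: prose lives under `answer`; absent in evidence_only mode.
        "has_prose": bool(response.get("answer")),
        "computed_count": len(computed),
        "full_data_ref": artifacts.get("canonicalDataRef"),
        # Assumption: dataUrl may sit beside canonicalDataRef or at top level.
        "data_url": artifacts.get("dataUrl") or response.get("dataUrl"),
    }
```

Branching on typed fields like this keeps the client independent of how the prose answer is worded.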

How the two surfaces differ

The runtime contract is identical across surfaces. What differs:
| Property | /api/v1/query | /api/chat |
| --- | --- | --- |
| Transport | JSON response | Streamed UI events |
| Session state | Caller-managed | Server-managed chat thread |
| Auth | API key | User wallet session |

If an answer renders correctly through /api/v1/query but breaks in chat, the chat adapter is usually the first place to look — the shared runtime is the same on both sides.

Debugging with developer traces

Pass includeDeveloperTrace: true to /api/v1/query to receive a structured diagnostics payload alongside the answer. Traces are the supported public way to inspect what the runtime did on a given request.

The chat app exposes a sibling surface: enable Developer Mode in Settings → Developer Settings, and a Developer Logs card appears at the bottom of each response with the same iterative-runtime signals (orchestration mode, tool call history, execution trace, verification, tool-registry stats).

The two surfaces are kept conceptually aligned: the chat card and the SDK developerTrace tell the same story for the same turn class. The SDK trace additionally includes a timeline, summary rollups, cost, and contributorSearches that the chat card does not surface. See the Debugging Guide for the full section list.

The trace is intentionally a typed, stable surface, so internal intermediate diagnostics can change across rollouts without breaking it.
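
To check which SDK-only trace sections a given response carries, something like the following sketch could be used. It assumes the trace arrives under a top-level developerTrace object and that the sections are keyed timeline, summary, cost, and contributorSearches; only developerTrace and contributorSearches appear verbatim on this page, so the other key names are guesses.

```python
from typing import Any, Dict, List

# Sections this page says the SDK trace has beyond the chat card.
# Assumption: "timeline", "summary", and "cost" are the literal key names.
SDK_ONLY_SECTIONS = ("timeline", "summary", "cost", "contributorSearches")


def sdk_only_trace_sections(response: Dict[str, Any]) -> List[str]:
    """List the SDK-only trace sections present in a response, in a fixed order."""
    trace = response.get("developerTrace")
    if not isinstance(trace, dict):
        # No trace was requested, or the field is missing or malformed.
        return []
    return [name for name in SDK_ONLY_SECTIONS if name in trace]
```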

Chat streaming UI extensions (first-party /chat)

Beyond message text, the chat stream emits structured UI payloads used by the web client:
| Data part | Purpose |
| --- | --- |
| data-reasoningStep | Incremental librarian reasoning chunks, optionally keyed with relatedExecutionCallId to align with tool execution. |
| data-executionToolCalls | Final per-call timeline (arguments + truncated results + executionCallId) for the docked execution sidecar. |
| data-chartDelta, data-codeDelta, data-sheetDelta, data-textDelta, data-imageDelta | Artifact document streaming payloads; buffered client-side next to structural data-id/data-kind/data-finish parts. |

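
The client-side buffering of artifact deltas can be pictured with the sketch below. The wire shape is an assumption: each stream part is treated as a dict with a type and a data field, data-id opens a document, data-kind labels it, and data-finish closes it. The part names come from the table above, but their payload layout and ordering are not documented here, and assemble_documents is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, Optional

DELTA_TYPES = {
    "data-chartDelta",
    "data-codeDelta",
    "data-sheetDelta",
    "data-textDelta",
    "data-imageDelta",
}


def assemble_documents(parts: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Group delta payloads under the document opened by the latest data-id part."""
    documents: Dict[str, Dict[str, Any]] = {}
    current_id: Optional[str] = None
    for part in parts:
        part_type = part.get("type")
        data = part.get("data")
        if part_type == "data-id":
            if data is None:
                continue  # ignore a malformed id part
            current_id = str(data)
            documents[current_id] = {"kind": None, "chunks": [], "finished": False}
        elif current_id is None:
            continue  # nothing open yet; skip unrelated or early parts
        elif part_type == "data-kind":
            documents[current_id]["kind"] = data
        elif part_type in DELTA_TYPES:
            # Chunks are kept as a list: whether deltas are incremental or
            # cumulative is not stated, so no concatenation is attempted.
            documents[current_id]["chunks"].append(data)
        elif part_type == "data-finish":
            documents[current_id]["finished"] = True
            current_id = None
    return documents
```

Parts such as data-reasoningStep and data-executionToolCalls pass through untouched in this sketch, since they feed other parts of the UI.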
Users can wipe all server-side conversations with DELETE /api/history (authenticated).