Installation
Requirements
- Python 3.10+
- httpx (async HTTP)
- pydantic (type validation)
- pyjwt[crypto] (JWT verification)
Prerequisites
Before using the API, complete setup at ctxprotocol.com:
Quick Start
Want per-call pricing and spending limits? The SDK also supports Execute mode for direct method calls inside session budgets. See Two SDK Modes below.
Two SDK Modes
The SDK offers two payment models:
| Mode | Method | Payment Model | Use Case |
|---|---|---|---|
| Query | client.query.run() | Pay-per-response | Complex questions, multi-tool synthesis, curated intelligence |
| Execute | client.tools.execute() | Per call (with spending limit) | Deterministic pipelines, raw outputs, explicit cost control |
You have access to both modes. Pick the one that fits your use case.
- Use Query (client.query.run()) when you want a managed librarian contract where Context handles discovery/orchestration (up to 100 MCP calls per response turn) and returns answer_with_evidence or evidence_only. Pay-per-response (~$0.10).
- Use Execute (client.tools.execute()) when your app/agent is the librarian and you want per-call pricing with spending limits (~$0.001/call).
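To make the trade-off between the two modes concrete, the sketch below compares rough costs using only the approximate prices quoted above (~$0.10 per response, ~$0.001 per call). The function names are hypothetical, and real prices vary per listing, so treat this as back-of-the-envelope arithmetic rather than a billing model.

```python
from decimal import Decimal

# Approximate prices quoted in this section; actual listing prices vary.
QUERY_PRICE_PER_RESPONSE = Decimal("0.10")
EXECUTE_PRICE_PER_CALL = Decimal("0.001")


def estimate_costs(responses: int, calls_per_response: int) -> dict[str, Decimal]:
    """Rough cost of serving the same workload in each mode."""
    query_cost = QUERY_PRICE_PER_RESPONSE * responses
    execute_cost = EXECUTE_PRICE_PER_CALL * responses * calls_per_response
    return {"query": query_cost, "execute": execute_cost}


def break_even_calls() -> Decimal:
    """Number of Execute calls that cost as much as one Query response."""
    return QUERY_PRICE_PER_RESPONSE / EXECUTE_PRICE_PER_CALL


if __name__ == "__main__":
    print(estimate_costs(responses=10, calls_per_response=5))
    print(break_even_calls())
```

At these approximate prices, one Query response costs about as much as 100 Execute calls. The comparison ignores the discovery, orchestration, and synthesis work that Query performs for you.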
Execute Quick Start
Mixed listings are first-class: one listing can expose methods to both modes. Methods without explicit execute pricing remain Query-only until pricing metadata is added.
Compatibility: payload fields like price and price_per_query are kept for backward compatibility. In Query mode, they represent listing-level price per response turn. A future major release can add response-named aliases (for example, price_per_response) before deprecating legacy names.
Configuration
Client Options
| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| api_key | str | Yes | - | Your Context Protocol API key |
| base_url | str | No | https://www.ctxprotocol.com | API base URL (for development) |
Always use the client as an async with context manager, or call await client.close() when done, so that resources are released properly.
The Python SDK automatically retries transient failures (HTTP 5xx, transport errors, and timeouts) with exponential backoff.
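For readers unfamiliar with the retry behavior mentioned above, here is a minimal sketch of exponential backoff in general. It is not the SDK's internal implementation: `TransientError`, `retry_with_backoff`, `make_flaky`, and the delay values are all hypothetical names and numbers chosen for illustration.

```python
import asyncio
import random


class TransientError(Exception):
    """Hypothetical stand-in for an HTTP 5xx, transport error, or timeout."""


async def retry_with_backoff(operation, max_attempts=4, base_delay=0.5, max_delay=8.0):
    """Await operation(), retrying transient failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return await operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The delay ceiling doubles on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Random jitter keeps many clients from retrying in lockstep.
            await asyncio.sleep(random.uniform(0, ceiling))


def make_flaky(failures: int):
    """Demo helper: an operation that fails `failures` times, then succeeds."""
    remaining = {"count": failures}

    async def operation():
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise TransientError("simulated transient failure")
        return "ok"

    return operation


if __name__ == "__main__":
    print(asyncio.run(retry_with_backoff(make_flaky(2), base_delay=0.01)))
```

Because the SDK already retries on its own, a wrapper like this is usually only relevant for your own surrounding calls.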
API Reference
Discovery
client.discovery.search(query, limit?)
Search for tools matching a query string, with optional mode-aware filters.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | str | Yes | Search query |
| limit | int | No | Maximum results to return (1-50) |
| mode | "query" \| "execute" | No | Discovery mode with billing semantics |
| surface | "answer" \| "execute" \| "both" | No | Method mode filter |
| query_eligible | bool | No | Require methods marked query eligible |
| require_execute_pricing | bool | No | Require explicit method execute pricing |
| exclude_latency_classes | list["instant" \| "fast" \| "slow" \| "streaming"] | No | Exclude by latency class |
| exclude_slow | bool | No | Convenience filter for query mode |
| favorites_only | bool | No | Override the account-level Favorites-Only Auto Mode for this request |
Returns: list[Tool]
If you enable Favorites-Only Auto Mode in Settings, SDK discovery and query requests made with the same API key inherit that account default automatically. Set favorites_only=True or favorites_only=False per request to override it. On Query requests with explicit tools, manual tool selection wins and favorites_only is ignored.
client.discovery.get_featured(limit?, ...)
Get featured/popular tools.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| limit | int | No | Maximum results to return |
| mode | "query" \| "execute" | No | Optional discovery mode |
| surface | "answer" \| "execute" \| "both" | No | Optional method mode filter |
| query_eligible | bool | No | Optional query-safe filter |
| require_execute_pricing | bool | No | Optional execute-price filter |
Returns: list[Tool]
client.discovery.get(tool_id)
Fetch one tool listing by UUID.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| tool_id | str | Yes | Marketplace tool UUID |
Returns: Tool
Tools (Execute Mode)
client.tools.execute(tool_id, tool_name, args?)
Execute a single tool method. Execute calls can run inside a session budget (max_spend_usd) with automatic payment after delivery.
Parameters:
| Option | Type | Required | Description |
|---|---|---|---|
| tool_id | str | Yes | UUID of the tool |
| tool_name | str | Yes | Name of the method to call |
| args | dict | No | Arguments matching the tool’s inputSchema |
| idempotency_key | str | No | Optional idempotency key (UUID recommended) |
| mode | "execute" | No | Explicit mode label (defaults to "execute") |
| session_id | str | No | Execute session ID to accrue spend against |
| max_spend_usd | str | No | Optional inline session budget (if no session_id) |
| close_session | bool | No | Request session closure after this call settles |
Returns: ExecutionResult
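The idempotency_key parameter above is easiest to use correctly if you generate it once per logical request and reuse it on every retry. The sketch below shows that pattern with the standard library only. `build_execute_kwargs` is a hypothetical helper, passing these names as keyword arguments is an assumption based on the parameter table, and the tool ID, method name, and arguments are placeholders.

```python
import uuid


def new_idempotency_key() -> str:
    """Generate a fresh UUID4 string, as recommended for idempotency keys."""
    return str(uuid.uuid4())


def build_execute_kwargs(tool_id, tool_name, args=None, idempotency_key=None):
    """Assemble call arguments using the parameter names from the table above."""
    return {
        "tool_id": tool_id,
        "tool_name": tool_name,
        "args": args or {},
        # Create the key here, once, so that retries can reuse it.
        "idempotency_key": idempotency_key or new_idempotency_key(),
    }


# Build the request once per logical operation (placeholder values).
request = build_execute_kwargs("<tool-uuid>", "<method-name>", {"symbol": "ETH"})

# Every retry of the same operation sends the same key.
first_attempt = dict(request)
retry_attempt = dict(request)
```

Generating a new key inside the retry loop would defeat the purpose, because each attempt would then look like a distinct request.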
client.tools.start_session(max_spend_usd)
Start an execute session budget envelope.
client.tools.get_session(session_id)
Fetch current execute session status/spend.
client.tools.close_session(session_id)
Close an execute session and trigger final flush behavior.
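The session methods above describe a budget envelope: calls accrue spend against max_spend_usd until the session is closed. The following local model illustrates that idea. `SessionBudget` is a hypothetical class, not part of the SDK, and the server's real accounting may differ. It uses Decimal because max_spend_usd is documented as a string, which avoids float rounding.

```python
from decimal import Decimal


class SessionBudget:
    """Local model of a spend envelope; not the SDK's implementation."""

    def __init__(self, max_spend_usd: str):
        self.max_spend = Decimal(max_spend_usd)
        self.spent = Decimal("0")
        self.closed = False

    @property
    def remaining(self) -> Decimal:
        return self.max_spend - self.spent

    def can_afford(self, price_usd: str) -> bool:
        """True if the session is open and the call fits in the remaining budget."""
        return not self.closed and Decimal(price_usd) <= self.remaining

    def record_call(self, price_usd: str) -> Decimal:
        """Accrue one call's price and return the remaining budget."""
        if not self.can_afford(price_usd):
            raise RuntimeError("session is closed or the call exceeds the budget")
        self.spent += Decimal(price_usd)
        return self.remaining

    def close(self) -> None:
        self.closed = True


# Ten calls at an illustrative $0.001 each exactly exhaust a $0.01 budget.
budget = SessionBudget("0.01")
for _ in range(10):
    budget.record_call("0.001")
```

A client-side check like this can help you stop early, but the session budget enforced by Context remains the source of truth.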
Developer
client.developer.update_tool(tool_id, ...)
Update contributor-owned tool metadata programmatically. This is for developers managing their own listings, not for buyer query execution.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| tool_id | str | Yes | Tool UUID |
| name | str | No | New display name |
| description | str | No | New description |
| suggested_prompts | list[str] | No | Suggested buyer prompts |
| category | str | No | Listing category |
Returns: dict[str, Any]
Query (Pay-Per-Response)
The Query API is Context’s response marketplace. Instead of buying raw API calls, you’re buying curated intelligence. Ask a question, pay once, and get a grounded, managed answer.
client.query.run(query, tools?, favorites_only?, agent_model_id?, response_shape?, include_data?, include_data_url?, include_developer_trace?, idempotency_key?)
Run an agentic query. The managed runtime handles tool discovery, ambiguity resolution, multi-tool execution (up to 100 MCP calls per response turn as a safety cap), and grounding — and returns the selected Query response contract (answer_with_evidence or evidence_only, default answer_with_evidence). Query billing is pay-per-response with automatic payment after delivery.
Parameters:
| Option | Type | Required | Description |
|---|---|---|---|
| query | str | Yes | Natural-language question |
| tools | list[str] | No | Tool IDs to use (auto-discover if omitted) |
| favorites_only | bool | No | Override the account-level Favorites-Only Auto Mode for this request |
| agent_model_id | str | No | Main librarian agent model ID (omit to use DEFAULT_AGENT_MODEL_ID) |
| response_shape | "answer_with_evidence" \| "evidence_only" | No | Structured response mode (default: answer_with_evidence) |
| include_data | bool | No | Include bounded execution data inline. Large payloads return a preview plus full-data references |
| include_data_url | bool | No | Persist execution data to blob and return a URL |
| include_developer_trace | bool | No | Include optional developer trace + orchestration diagnostics |
| idempotency_key | str | No | Optional idempotency key (UUID recommended) |
| resume_from | QueryAttemptReference | No | Resume from a previous Query attempt handle |
| fork_from | QueryForkReference | No | Fork a previous Query attempt into a new branch |
Returns: QueryResult
agent_model_id lets headless users choose the main librarian agent model explicitly. If omitted, the API uses its managed default agent model. Internal tool selection remains managed by the server.
If response_shape is evidence_only, Context skips the extra prose synthesis layer, but the librarian agent still runs to fetch, compute, and ground the result.
Import AGENT_MODEL_IDS and DEFAULT_AGENT_MODEL_ID from ctxprotocol to see the current supported slugs. Omit agent_model_id to use the managed default (DEFAULT_AGENT_MODEL_ID).
The query runtime now exposes a single managed executor surface. The server decides internal budgets, ambiguity handling, and exploration policy from the query itself instead of asking SDK callers to choose a lane.
developer_trace and orchestration_metrics are optional diagnostic surfaces that are both gated on include_developer_trace=True; they are None on the response envelope when the flag is omitted. Their inner fields are typed but may evolve across rollouts as the managed runtime changes, so treat them as debugging signals rather than a stable execution contract.
The developer_trace object mirrors the chat app’s Developer Logs card (same iterative-runtime signals):
- orchestration_mode="query"
- summary (tool calls, retries, loop steps)
- timeline
- tool_call_history (with is_code_interpreter flags on Python sandbox calls)
- execution_trace
- verification (with bounded_answer_reason / bounded_answer_data_gap when a safety guardrail stopped retries)
- diagnostics (selection, execution contract, cost, contributor searches, tool-registry stats, retry budget, stage timing)
The initial_code / final_code fields hold the sentinel "// iterative execution: no generated code", kept for trace compatibility with the previous VM-based runtime. No JavaScript is generated or executed.
Structured Response Shapes
Query is Context’s managed librarian contract. You can choose how much structure you want back:
| response_shape | Best for | Behavior |
|---|---|---|
| answer_with_evidence | First-party chat, human-facing apps (default) | Prose answer plus structured evidence, artifacts, freshness, confidence, and usage metadata |
| evidence_only | External agents, downstream automation | Bounded evidence, computed artifacts, and full-data references without prose synthesis |
answer_with_evidence, but it is using the same Query contract you get in the SDK.
Query Envelope Fields
When response_shape is answer_with_evidence or evidence_only, the result may include:
| Field | What it contains |
|---|---|
| summary | Short machine-friendly summary of the answer |
| evidence | Canonical facts, source refs, assumptions, known unknowns, retrieval reason codes, and optional wedge-specific market_intelligence |
| artifacts | data_url, canonical dataset metadata, and stage-artifact kinds |
| view | Optional UI/render hint such as table, leaderboard, heatmap, or timeseries, plus optional metrics, columns, and rows |
| outcome | Public outcome label, tone, stop_reason, and optional issue_class |
| controller | Public bounded-controller summary including actions_taken, next_action, and patch-preservation flags |
| freshness | as_of, source timestamps, and freshness note |
| confidence | Confidence level, reason, fact counts, and gap signals |
| usage | Duration, cost, tools used, outcome type, and optional orchestration metrics |
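Because every field in the table above is optional ("may include"), downstream code should check for presence before reading. The sketch below assumes the result is available as a plain dict, which is an assumption about how you hold the data and not a statement about the QueryResult type. The sample payload and its inner key `level` are made up for illustration.

```python
from typing import Any

# Field names taken from the table above, in the same order.
ENVELOPE_FIELDS = (
    "summary", "evidence", "artifacts", "view", "outcome",
    "controller", "freshness", "confidence", "usage",
)


def present_fields(result: dict[str, Any]) -> list[str]:
    """Return the envelope fields that are present and not None, in table order."""
    return [name for name in ENVELOPE_FIELDS if result.get(name) is not None]


def get_field(result: dict[str, Any], name: str, default: Any = None) -> Any:
    """Read one envelope field, falling back to a default when it is absent."""
    if name not in ENVELOPE_FIELDS:
        raise KeyError(f"{name!r} is not a documented envelope field")
    value = result.get(name)
    return default if value is None else value


# Hypothetical payload for illustration only; real values come from the API.
sample = {"summary": "example summary", "confidence": {"level": "high"}, "view": None}

if __name__ == "__main__":
    print(present_fields(sample))
    print(get_field(sample, "view", default={}))
```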
High-Fidelity Rehydration (Retrieval-First Synthesis)
When retrieval-first rollout is enabled in the deployment, the query runtime can switch synthesis context assembly from baseline truncation to retrieval-first slices for full-data or truncation-sensitive requests.
- Stage artifacts are emitted in request-scoped internal storage (selection, execution, synthesis). The scout stage is a legacy artifact slot: scout probe execution is disabled in the current iterative runtime, so no scout stage artifacts are produced.
- Retrieval primitives (path lookup, array windows/sampling, keyword slices, top-K relevance) are used to build a bounded context pack from canonical execution data.
- Final synthesis still passes through the existing synthesis safety contract.
include_data returns a bounded inline preview when needed, and include_data_url / artifacts.canonical_data_ref reference the same canonical execution dataset used by retrieval-first assembly.
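The retrieval primitives named above (path lookup, array windows, keyword slices, top-K relevance) can be illustrated on ordinary JSON-like data. This is a simplified sketch of the general techniques and not the runtime's actual implementation. All function names, the term-count relevance score, and the sample data are hypothetical.

```python
from typing import Any


def path_lookup(data: Any, path: str) -> Any:
    """Follow a dotted path such as 'markets.0.name' through dicts and lists."""
    current = data
    for part in path.split("."):
        current = current[int(part)] if isinstance(current, list) else current[part]
    return current


def array_window(items: list, start: int, size: int) -> list:
    """Return a bounded window of a large array."""
    return items[start:start + size]


def keyword_slice(rows: list[dict], keyword: str) -> list[dict]:
    """Keep only rows in which any value mentions the keyword."""
    needle = keyword.lower()
    return [row for row in rows if any(needle in str(v).lower() for v in row.values())]


def top_k(rows: list[dict], query: str, k: int) -> list[dict]:
    """Rank rows by how many query terms they mention (a crude relevance score)."""
    terms = query.lower().split()

    def score(row: dict) -> int:
        text = " ".join(str(v).lower() for v in row.values())
        return sum(term in text for term in terms)

    return sorted(rows, key=score, reverse=True)[:k]


# Made-up execution data for illustration.
data = {"markets": [
    {"name": "ETH-USD", "note": "high volume"},
    {"name": "BTC-USD", "note": "high volume, low spread"},
    {"name": "DOGE-USD", "note": "thin book"},
]}
```

Each primitive returns a small, bounded piece of the dataset, which is what lets a context pack stay within a size limit while still drawing on the full data.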
Resume and Fork
Every Query response can include query_session handles. Use resume_from to continue from a previous attempt, or fork_from to branch from an attempt while preserving lineage.
client.query.stream(query, tools?, favorites_only?, agent_model_id?, response_shape?, include_data?, include_data_url?, include_developer_trace?, idempotency_key?)
Same as run() but streams events in real-time via SSE.
Returns: AsyncGenerator of stream events
Use the same idempotency_key when retrying the same logical request after network or timeout errors.
If you stream with response_shape="evidence_only", expect the structured result on the final done event and few or no text-delta events.
query.run() uses the streaming path internally and has a default stream timeout of 600 seconds. Non-streaming SDK requests default to 300 seconds. For complex chart-heavy requests, use async jobs.
Async Query Jobs
Use async jobs when a query may exceed a single blocking SDK request.
Types
Import Types
Tool
McpTool
For argument guidance, use standard JSON Schema fields directly inside inputSchema properties. Put fallback values in default and sample invocations in examples. Do not rely on custom _meta.inputExamples.
ExecutionResult (Execute Mode)
ExecuteSessionSpend
QueryResult (Pay-Per-Response)
Context Requirement Types
For MCP server contributors building tools that need user context:
Why Context Injection Matters:
- No Auth Required: Public blockchain/user data is fetched by the platform
- Security: Your MCP server never handles private keys
- Simplicity: You receive structured, type-safe data
Python reference implementation: Hummingbot contributor server.
For practical guidance on these pacing hints, see Tool Metadata.
Injected Context Types
HyperliquidContext
PolymarketContext
WalletContext
Contributor Search Helpers
If you are building a contributor for a search-hard venue, the Python SDK ships an optional helper surface at ctxprotocol.contrib.search.
Use it only when the venue’s upstream search is weak enough that deterministic retrieval plus a bounded model judge materially improves candidate resolution. Do not use it for venues that already expose reliable direct search.
- contributor-side intent shaping, candidate normalization, shortlist construction, and validated resolution
- provider-agnostic judge injection with stable override knobs for provider, model, timeout, budget, and disabled
- machine-readable artifact generation via build_contributor_search_validation_artifact(...)
- runtime trace inspection via extract_contributor_searches_from_developer_trace(trace) or result.developer_trace.diagnostics.contributor_searches
- extra judge spend is contributor-owned in this rollout, so recover it through your own listing response price and/or execute pricing
- keep deterministic validation around every judge result; malformed, timed-out, over-budget, or contradictory judgments must degrade honestly
- save replayable validation artifacts alongside your contributor examples. Current reference fixtures live under examples/client/validation/
Error Handling
The SDK raises ContextError with specific error codes:
Error Codes
| Code | Description | Handling |
|---|---|---|
| unauthorized | Invalid API key | Check configuration |
| no_wallet | Wallet not set up | Direct user to help_url |
| insufficient_allowance | Spending cap not set | Direct user to help_url |
| payment_failed | USDC payment failed | Check balance |
| execution_failed | Tool error | Retry with different args |
Securing Your Tool (MCP Contributors)
If you’re building an MCP server, verify incoming requests using ctxprotocol.
If you wrap a search-hard venue whose upstream API cannot reliably resolve the right market or entity, follow Optional Contributor Search Helpers. That pattern stays contributor-side, keeps provider credentials contributor-owned, and does not change client.query.run() or _meta.
Free vs Paid Security Requirements:
| Tool Type | Security Middleware | Rationale |
|---|---|---|
| Free Tools ($0.00) | Optional | Great for distribution and adoption |
| Paid Tools ($0.01+) | Mandatory | We cannot route payments to insecure endpoints |
Option 1: FastMCP (Recommended)
FastMCP is the fastest way to build MCP servers. Use ctxprotocol middleware:
Option 2: Raw FastAPI
For more control, use FastAPI with our middleware:
Manual Verification
For more control, use the lower-level utilities:
Verification Options
| Option | Type | Required | Description |
|---|---|---|---|
| authorization_header | str | Yes | Full Authorization header (e.g., "Bearer eyJ...") |
| audience | str | No | Expected audience claim for stricter validation |
MCP Security Model
| MCP Method | Auth Required | Why |
|---|---|---|
| initialize | ❌ No | Session setup |
| tools/list | ❌ No | Discovery - agents need to see your schemas |
| resources/list | ❌ No | Discovery |
| prompts/list | ❌ No | Discovery |
| tools/call | ✅ Yes | Execution - costs money, runs your code |
What this means in practice:
- ✅ https://your-mcp.com/mcp + initialize → Works without auth
- ✅ https://your-mcp.com/mcp + tools/list → Works without auth
- ❌ https://your-mcp.com/mcp + tools/call → Requires Context Protocol JWT
Payment Flow
Context supports two settlement timings:
- Query mode (client.query.*) uses deferred settlement after the response is delivered
- Execute mode (client.tools.execute) accrues per-call method spend into execute sessions with automatic batch payment
- In both modes, spending caps are enforced via ContextRouter allowance checks
- 90% goes to the tool developer, 10% goes to the protocol
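The 90/10 split above is simple arithmetic, sketched here with Decimal to avoid float rounding. `split_payment` is a hypothetical helper. Rounding the developer share to six decimal places (USDC's precision) and assigning the remainder to the protocol are assumptions about rounding, not documented settlement behavior.

```python
from decimal import Decimal, ROUND_HALF_UP

DEVELOPER_SHARE = Decimal("0.90")  # 90% to the tool developer, per the list above
MICRO_USDC = Decimal("0.000001")   # assumption: round to USDC's 6 decimal places


def split_payment(amount_usd: str) -> tuple[Decimal, Decimal]:
    """Split a payment into (developer, protocol) amounts that sum to the total."""
    amount = Decimal(amount_usd)
    developer = (amount * DEVELOPER_SHARE).quantize(MICRO_USDC, rounding=ROUND_HALF_UP)
    protocol = amount - developer  # the remainder, so nothing is lost to rounding
    return developer, protocol


if __name__ == "__main__":
    # A Query response priced at the approximate $0.10 quoted earlier.
    print(split_payment("0.10"))
```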

