Installation
Requirements
- Python 3.10+
- httpx (async HTTP)
- pydantic (type validation)
- pyjwt[crypto] (JWT verification)
Prerequisites
Before using the API, complete setup at ctxprotocol.com:
Quick Start
Want per-call pricing and spending limits? The SDK also supports Execute mode for direct method calls inside session budgets. See Two SDK Modes below.
Two SDK Modes
The SDK offers two payment models:

| Mode | Method | Payment Model | Use Case |
|---|---|---|---|
| Query | client.query.run() | Pay-per-response | Complex questions, multi-tool synthesis, curated intelligence |
| Execute | client.tools.execute() | Per call (with spending limit) | Deterministic pipelines, raw outputs, explicit cost control |
You have access to both modes — pick the one that fits your use case.
- Use Query (client.query.run()) when you want a managed librarian contract: Context handles discovery and orchestration (up to 100 MCP calls per response turn) and can return a plain answer, answer_with_evidence, or evidence_only. Billing is pay-per-response (~$0.10).
- Use Execute (client.tools.execute()) when your app or agent is the librarian and you want per-call pricing with spending limits (~$0.001/call).
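To make the trade-off between the two modes concrete, here is a minimal cost sketch. It assumes the approximate prices quoted above (~$0.10 per Query response, ~$0.001 per Execute call); real listing prices vary, and the function names are illustrative, not part of the SDK.

```python
from decimal import Decimal

# Approximate prices quoted in this section; actual listing prices vary.
QUERY_PRICE_PER_RESPONSE = Decimal("0.10")
EXECUTE_PRICE_PER_CALL = Decimal("0.001")


def estimate_cost(mode: str, responses: int = 0, calls: int = 0) -> Decimal:
    """Estimate spend in USD for a workload under one of the two modes."""
    if mode == "query":
        return QUERY_PRICE_PER_RESPONSE * responses
    if mode == "execute":
        return EXECUTE_PRICE_PER_CALL * calls
    raise ValueError(f"unknown mode: {mode!r}")


def break_even_calls() -> int:
    """Number of Execute calls that cost as much as one Query response."""
    return int(QUERY_PRICE_PER_RESPONSE / EXECUTE_PRICE_PER_CALL)
```

At these example prices, one Query response costs about as much as 100 Execute calls, so Execute tends to pay off when your own pipeline needs only a handful of deterministic calls per answer.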
Execute Quick Start
Mixed listings are first-class: one listing can expose methods to both modes. Methods without explicit execute pricing remain Query-only until pricing metadata is added.
Compatibility: payload fields like price and price_per_query are kept for backward compatibility. In Query mode, they represent the listing-level price per response turn.
A future major release can add response-named aliases (for example, price_per_response) before deprecating legacy names.
Configuration
Client Options
| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| api_key | str | Yes | (none) | Your Context Protocol API key |
| base_url | str | No | https://ctxprotocol.com | API base URL (for development) |
Always use the async with context manager or call await client.close() when done to properly release resources.
The Python SDK automatically retries transient failures (HTTP 5xx, transport errors, and timeouts) with exponential backoff.
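For readers unfamiliar with the retry pattern mentioned above, the following is a generic sketch of exponential backoff. It is not the SDK's internal implementation: the TransientError class, the delay values, and the jitter are all assumptions chosen for illustration.

```python
import asyncio
import random


class TransientError(Exception):
    """Stand-in for an HTTP 5xx, transport error, or timeout."""


def backoff_delays(retries: int, base: float = 0.5, cap: float = 8.0) -> list[float]:
    """Delay before each retry: base * 2**attempt, capped at `cap` seconds."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]


async def call_with_retry(operation, retries: int = 3, base: float = 0.5, cap: float = 8.0):
    """Await `operation()`, retrying transient failures with exponential backoff."""
    delays = backoff_delays(retries, base, cap)
    for attempt in range(retries + 1):
        try:
            return await operation()
        except TransientError:
            if attempt == retries:
                raise  # retries exhausted; surface the last error
            # Jitter spreads out retries coming from many clients at once.
            await asyncio.sleep(delays[attempt] * random.uniform(0.5, 1.0))
```

Because the SDK already retries for you, a wrapper like this is only relevant for your own calls outside the client.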
API Reference
Discovery
client.discovery.search(query, limit?)
Search for tools matching a query string, with optional mode-aware filters.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | str | Yes | Search query |
| limit | int | No | Maximum results to return (1-50) |
| mode | "query" \| "execute" | No | Discovery mode with billing semantics |
| surface | "answer" \| "execute" \| "both" | No | Method mode filter |
| query_eligible | bool | No | Require methods marked query eligible |
| require_execute_pricing | bool | No | Require explicit method execute pricing |
| exclude_latency_classes | list["instant" \| "fast" \| "slow" \| "streaming"] | No | Exclude by latency class |
| exclude_slow | bool | No | Convenience filter for query mode |

Returns: list[Tool]
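The filter values in the table are easy to mistype, so a small local check can help before calling the API. This is a sketch under assumptions: build_search_params is a hypothetical helper (not an SDK function), and it assumes the search filters are accepted as keyword arguments.

```python
from typing import Any, Optional

ALLOWED_MODES = {"query", "execute"}
ALLOWED_SURFACES = {"answer", "execute", "both"}
ALLOWED_LATENCY_CLASSES = {"instant", "fast", "slow", "streaming"}


def build_search_params(
    query: str,
    limit: Optional[int] = None,
    mode: Optional[str] = None,
    surface: Optional[str] = None,
    require_execute_pricing: Optional[bool] = None,
    exclude_latency_classes: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Validate filters locally and drop the ones that were not set."""
    if limit is not None and not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    if mode is not None and mode not in ALLOWED_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if surface is not None and surface not in ALLOWED_SURFACES:
        raise ValueError(f"unknown surface: {surface!r}")
    unknown = set(exclude_latency_classes or []) - ALLOWED_LATENCY_CLASSES
    if unknown:
        raise ValueError(f"unknown latency classes: {sorted(unknown)}")
    params = {
        "query": query,
        "limit": limit,
        "mode": mode,
        "surface": surface,
        "require_execute_pricing": require_execute_pricing,
        "exclude_latency_classes": exclude_latency_classes,
    }
    # Only send filters the caller actually set.
    return {key: value for key, value in params.items() if value is not None}
```

The resulting dict could then be unpacked into the call, for example await client.discovery.search(**params), if the SDK accepts these names as keyword arguments.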
client.discovery.get_featured(limit?, ...)
Get featured/popular tools.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| limit | int | No | Maximum results to return |
| mode | "query" \| "execute" | No | Optional discovery mode |
| surface | "answer" \| "execute" \| "both" | No | Optional method mode filter |
| query_eligible | bool | No | Optional query-safe filter |
| require_execute_pricing | bool | No | Optional execute-price filter |

Returns: list[Tool]
Tools (Execute Mode)
client.tools.execute(tool_id, tool_name, args?)
Execute a single tool method. Execute calls can run inside a session budget (max_spend_usd) with automatic payment after delivery.
Parameters:
| Option | Type | Required | Description |
|---|---|---|---|
| tool_id | str | Yes | UUID of the tool |
| tool_name | str | Yes | Name of the method to call |
| args | dict | No | Arguments matching the tool’s inputSchema |
| idempotency_key | str | No | Optional idempotency key (UUID recommended) |
| mode | "execute" | No | Explicit mode label (defaults to "execute") |
| session_id | str | No | Execute session ID to accrue spend against |
| max_spend_usd | str | No | Optional inline session budget (if no session_id) |
| close_session | bool | No | Request session closure after this call settles |

Returns: ExecutionResult
client.tools.start_session(max_spend_usd)
Start an execute session budget envelope.
client.tools.get_session(session_id)
Fetch current execute session status/spend.
client.tools.close_session(session_id)
Close an execute session and trigger final flush behavior.
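The session budget idea can be pictured with a small local model. This sketch is not how the server tracks spend: LocalSessionBudget and BudgetExceeded are hypothetical names, and the only detail taken from this page is that max_spend_usd is passed as a string.

```python
from decimal import Decimal


class BudgetExceeded(Exception):
    """Raised when a call would push session spend past the budget."""


class LocalSessionBudget:
    """Client-side mirror of an execute session budget (illustrative only)."""

    def __init__(self, max_spend_usd: str) -> None:
        # Decimal avoids float rounding on small per-call prices.
        self.max_spend = Decimal(max_spend_usd)
        self.spent = Decimal("0")

    @property
    def remaining(self) -> Decimal:
        return self.max_spend - self.spent

    def charge(self, price_usd: str) -> Decimal:
        """Record one call's price; refuse it if the budget would be exceeded."""
        price = Decimal(price_usd)
        if price < 0:
            raise ValueError("price must be non-negative")
        if self.spent + price > self.max_spend:
            raise BudgetExceeded(f"{price} exceeds remaining {self.remaining}")
        self.spent += price
        return self.remaining
```

A mirror like this lets an agent stop issuing calls before the server-side session would reject them. The authoritative spend is still whatever client.tools.get_session(session_id) reports.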
Query (Pay-Per-Response)
The Query API is Context’s response marketplace — instead of buying raw API calls, you’re buying curated intelligence. Ask a question, pay once, and get a managed answer contract backed by multi-tool data aggregation, error recovery, and completeness checks.
client.query.run(query, tools?, answer_model_id?, response_shape?, include_data?, include_data_url?, include_developer_trace?, query_depth?, debug_scout_deep_mode?, idempotency_key?)
Run an agentic query. The server applies discovery-first orchestration (discover/probe -> plan-from-evidence -> execute -> bounded fallback) with up to 100 MCP calls per response turn as a runtime safety cap, then returns the selected Query response contract (answer, answer_with_evidence, or evidence_only). The active runtime now has one real completeness-oriented deep lane plus the lower-latency fast lane. deep stays metadata-first before planning, and fast remains one-shot biased. Query billing is pay-per-response with automatic payment after delivery.
Parameters:
| Option | Type | Required | Description |
|---|---|---|---|
| query | str | Yes | Natural-language question |
| tools | list[str] | No | Tool IDs to use (auto-discover if omitted) |
| answer_model_id | str | No | Final synthesis model ID (e.g. kimi-model-thinking, glm-model) |
| response_shape | "answer" \| "answer_with_evidence" \| "evidence_only" | No | Structured response mode for Query answers |
| include_data | bool | No | Include execution data inline in the response |
| include_data_url | bool | No | Persist execution data to blob and return a URL |
| include_developer_trace | bool | No | Include optional developer trace + orchestration diagnostics |
| query_depth | "fast" \| "auto" \| "deep" | No | Query orchestration depth (fast lower latency, auto server-routed, deep completeness-oriented) |
| debug_scout_deep_mode | "deep" | No | Development/testing only internal deep lane override; legacy deep-light / deep-heavy aliases are temporarily accepted |
| idempotency_key | str | No | Optional idempotency key (UUID recommended) |

Returns: QueryResult
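The idempotency_key parameter is most useful when the same key is reused across retries of one logical request. A minimal sketch of that pattern follows; run_with_stable_key and the send callable are hypothetical stand-ins (send plays the role of a call such as client.query.run), and the choice of which exceptions to retry is an assumption.

```python
import uuid
from typing import Any, Awaitable, Callable


async def run_with_stable_key(
    send: Callable[..., Awaitable[Any]],
    attempts: int = 3,
    **request: Any,
) -> Any:
    """Call `send` up to `attempts` times, reusing one idempotency key throughout."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    key = str(uuid.uuid4())  # generated once per logical request
    for attempt in range(attempts):
        try:
            return await send(idempotency_key=key, **request)
        except (TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise  # out of attempts; let the caller decide
```

Generating the key outside the retry loop is the important part: a fresh key per attempt would make each retry look like a new request.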
answer_model_id lets headless users choose the final synthesis model explicitly. If omitted, the API uses its managed default answer model.
If response_shape is evidence_only, synthesis is skipped and no answer model runs for that request.
Current platform IDs: kimi-model-thinking, glm-model, gemini-flash-model, claude-sonnet-model, claude-opus-model.
query_depth is available in both run() and stream():
- fast: lower-latency path for simple lookups.
- auto: server routes to either fast or deep using query intent and selected tool metadata quality.
- deep: completeness-oriented path (default when omitted).
include_developer_trace and orchestration_metrics are optional diagnostics.
debug_scout_deep_mode remains a test-only override in non-production environments.
Inside deep, the runtime currently uses one real metadata-first deep path. Legacy deep-light and deep-heavy debug values are normalized to deep when accepted for backwards compatibility.
Selection diagnostics can show initial vs final lane decisions, Scout probe adequacy, bounded pre-plan probe call counts, and whether pre-plan evidence changed the initial plan.
Structured Response Shapes
Query is Context’s managed librarian contract. You can choose how much structure you want back:

| response_shape | Best for | Behavior |
|---|---|---|
| answer | Backward compatibility | Natural-language answer only |
| answer_with_evidence | First-party chat, human-facing apps | Prose answer plus structured evidence, artifacts, freshness, confidence, and usage metadata |
| evidence_only | External agents, downstream automation | Machine-friendly summary plus the same structured evidence package without depending on prose synthesis |

answer_with_evidence, but it is using the same Query contract you get in the SDK.
Query Envelope Fields
When response_shape is answer_with_evidence or evidence_only, the result may include:
| Field | What it contains |
|---|---|
| summary | Short machine-friendly summary of the answer |
| evidence | Canonical facts, source refs, assumptions, known unknowns, retrieval reason codes, and optional wedge-specific market_intelligence |
| artifacts | data_url, canonical dataset metadata, and stage-artifact kinds |
| view | Optional UI/render hint such as table, leaderboard, heatmap, or timeseries, plus optional metrics, columns, and rows |
| outcome | Public outcome label, tone, stop_reason, and optional issue_class |
| controller | Public bounded-controller summary including actions_taken, next_action, and patch-preservation flags |
| freshness | as_of, source timestamps, and freshness note |
| confidence | Confidence level, reason, fact counts, and gap signals |
| usage | Duration, cost, tools used, outcome type, and optional orchestration metrics |
High-Fidelity Rehydration (Retrieval-First Synthesis)
When retrieval-first rollout is enabled in the deployment, the query runtime can switch synthesis context assembly from baseline truncation to retrieval-first slices for full-data or truncation-sensitive requests.
- Stage artifacts are emitted in request-scoped internal storage (selection, planning, execution, completeness, synthesis).
- Retrieval primitives (path lookup, array windows/sampling, keyword slices, top-K relevance) are used to build a bounded context pack from canonical execution data.
- Final synthesis still passes through the existing synthesis safety contract.
- include_data and include_data_url continue to reference the same canonical execution dataset used by retrieval-first assembly.
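To give a feel for what retrieval primitives such as path lookup, array windows, and keyword slices do, here is a simplified sketch over plain JSON-like data. It is an illustration of the general technique only: the function names, the dotted-path syntax, and the character budget are assumptions, and the runtime's actual implementation is not shown on this page.

```python
import json
from typing import Any


def path_lookup(data: Any, path: str) -> Any:
    """Follow a dotted path such as 'markets.1.name' through dicts and lists."""
    current = data
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]
        else:
            current = current[part]
    return current


def array_window(items: list, start: int, size: int) -> list:
    """Return a contiguous window of at most `size` items."""
    return items[start:start + size]


def keyword_slice(items: list[dict], keyword: str) -> list[dict]:
    """Keep items whose serialized form mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [item for item in items if needle in json.dumps(item).lower()]


def build_context_pack(slices: list[Any], max_chars: int) -> list[str]:
    """Serialize slices in order, stopping before the character budget is exceeded."""
    pack: list[str] = []
    used = 0
    for piece in slices:
        text = json.dumps(piece, sort_keys=True)
        if used + len(text) > max_chars:
            break
        pack.append(text)
        used += len(text)
    return pack
```

The point of the budget in build_context_pack is that the pack stays bounded no matter how large the underlying dataset is, which is the property the retrieval-first approach relies on.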
client.query.stream(query, tools?, answer_model_id?, response_shape?, include_data?, include_data_url?, include_developer_trace?, query_depth?, debug_scout_deep_mode?, idempotency_key?)
Same as run() but streams events in real-time via SSE.
Returns: AsyncGenerator of stream events
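Consuming the stream typically means accumulating text deltas and keeping the final result. The sketch below uses a fake generator in place of client.query.stream(); the event names text-delta and done come from this page, but the dict layout (type, delta, result keys) is an assumption made for illustration.

```python
from typing import Any, AsyncIterator, Optional


async def fake_stream() -> AsyncIterator[dict[str, Any]]:
    """Stand-in for client.query.stream(); the event shape is an assumption."""
    yield {"type": "text-delta", "delta": "Gas is "}
    yield {"type": "text-delta", "delta": "12 gwei."}
    yield {"type": "done", "result": {"summary": "gas=12 gwei"}}


async def consume(events: AsyncIterator[dict[str, Any]]) -> tuple[str, Optional[Any]]:
    """Collect streamed text and the structured result from the final event."""
    chunks: list[str] = []
    final: Optional[Any] = None
    async for event in events:
        if event["type"] == "text-delta":
            chunks.append(event["delta"])
        elif event["type"] == "done":
            final = event["result"]
        # Other event types are ignored in this sketch.
    return "".join(chunks), final
```

With an evidence_only request the text part may come back empty, so code that depends on the structured result should read it from the final event and not from the accumulated text.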
Use the same idempotency_key when retrying the same logical request after network or timeout errors.
If you stream with response_shape="evidence_only", expect the structured result on the final done event and few or no text-delta events.
Types
Import Types
Tool
McpTool
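As an illustration of argument guidance carried in a tool's inputSchema, the sketch below shows a hypothetical schema that uses the standard JSON Schema default and examples keywords, plus a small helper that applies defaults. The property names and the apply_defaults helper are invented for this example and are not part of the SDK.

```python
from typing import Any

# Hypothetical method schema; the property names are illustrative only.
input_schema: dict[str, Any] = {
    "type": "object",
    "properties": {
        "symbol": {
            "type": "string",
            "description": "Ticker symbol to look up",
            "examples": ["BTC", "ETH"],  # sample invocation values
        },
        "limit": {
            "type": "integer",
            "minimum": 1,
            "maximum": 50,
            "default": 10,  # fallback when the caller omits it
        },
    },
    "required": ["symbol"],
}


def apply_defaults(schema: dict[str, Any], args: dict[str, Any]) -> dict[str, Any]:
    """Fill in missing arguments from each property's `default`."""
    filled = dict(args)
    for name, spec in schema["properties"].items():
        if name not in filled and "default" in spec:
            filled[name] = spec["default"]
    return filled
```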
For argument guidance, use standard JSON Schema fields directly inside inputSchema properties. Put fallback values in default and sample invocations in examples. Do not rely on custom _meta.inputExamples.
ExecutionResult (Execute Mode)
ExecuteSessionSpend
QueryResult (Pay-Per-Response)
Context Requirement Types
For MCP server contributors building tools that need user context:
Why Context Injection Matters:
- No Auth Required: Public blockchain/user data is fetched by the platform
- Security: Your MCP server never handles private keys
- Simplicity: You receive structured, type-safe data
Python reference implementation: Hummingbot contributor server.
For practical guidance on these pacing hints, see Tool Metadata.
Injected Context Types
HyperliquidContext
PolymarketContext
WalletContext
Contributor Search Helpers
If you are building a contributor for a search-hard venue, the Python SDK ships an optional helper surface at ctxprotocol.contrib.search.
Use it only when the venue’s upstream search is weak enough that deterministic retrieval plus a bounded model judge materially improves candidate resolution. Do not use it for venues that already expose reliable direct search.
- contributor-side intent shaping, candidate normalization, shortlist construction, and validated resolution
- provider-agnostic judge injection with stable override knobs for provider, model, timeout, budget, and disabled
- machine-readable artifact generation via build_contributor_search_validation_artifact(...)
- runtime trace inspection via extract_contributor_searches_from_developer_trace(trace) or result.developer_trace.diagnostics.contributor_searches
- extra judge spend is contributor-owned in this rollout, so recover it through your own listing response price and/or execute pricing
- keep deterministic validation around every judge result; malformed, timed-out, over-budget, or contradictory judgments must degrade honestly
- save replayable validation artifacts alongside your contributor examples. Current reference fixtures live under examples/client/validation/
Error Handling
The SDK raises ContextError with specific error codes:
Error Codes
| Code | Description | Handling |
|---|---|---|
| unauthorized | Invalid API key | Check configuration |
| no_wallet | Wallet not set up | Direct user to help_url |
| insufficient_allowance | Spending cap not set | Direct user to help_url |
| payment_failed | USDC payment failed | Check balance |
| execution_failed | Tool error | Retry with different args |
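One way to act on these codes is a small dispatcher that turns each one into the handling the table suggests. The sketch defines a local stand-in for ContextError so it runs on its own; the code and help_url attribute names are assumptions about the real exception, so check them against the SDK before relying on them.

```python
from typing import Optional


class ContextError(Exception):
    """Local stand-in for the SDK's ContextError; attribute names are assumptions."""

    def __init__(self, message: str, code: str, help_url: Optional[str] = None) -> None:
        super().__init__(message)
        self.code = code
        self.help_url = help_url


def suggest_action(error: ContextError) -> str:
    """Translate an error code into the handling suggested by the table."""
    if error.code == "unauthorized":
        return "Check the API key configuration."
    if error.code in ("no_wallet", "insufficient_allowance"):
        return f"Direct the user to {error.help_url or 'the help URL'}."
    if error.code == "payment_failed":
        return "Check the USDC balance."
    if error.code == "execution_failed":
        return "Retry with different arguments."
    return f"Unhandled error code: {error.code}"
```

In application code this would sit inside an except ContextError block around calls such as client.query.run() or client.tools.execute().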
Securing Your Tool (MCP Contributors)
If you’re building an MCP server, verify incoming requests using ctxprotocol.
If you wrap a search-hard venue whose upstream API cannot reliably resolve the right market or entity, follow Optional Contributor Search Helpers. That pattern stays contributor-side, keeps provider credentials contributor-owned, and does not change client.query.run() or _meta.
Free vs Paid Security Requirements:
| Tool Type | Security Middleware | Rationale |
|---|---|---|
| Free Tools ($0.00) | Optional | Great for distribution and adoption |
| Paid Tools ($0.01+) | Mandatory | We cannot route payments to insecure endpoints |
Option 1: FastMCP (Recommended)
FastMCP is the fastest way to build MCP servers. Use ctxprotocol middleware:
Option 2: Raw FastAPI
For more control, use FastAPI with our middleware:
Manual Verification
For more control, use the lower-level utilities:
Verification Options
| Option | Type | Required | Description |
|---|---|---|---|
| authorization_header | str | Yes | Full Authorization header (e.g., "Bearer eyJ...") |
| audience | str | No | Expected audience claim for stricter validation |
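Since the verification utilities take the full Authorization header, it can help to see what a well-formed header looks like when split apart. The sketch below only parses the header with the standard library; it does not verify the JWT signature or claims, which remains the job of the SDK's verification utilities. The function name is hypothetical.

```python
def extract_bearer_token(authorization_header: str) -> str:
    """Split 'Bearer <token>' and return the token; raise if the header is malformed.

    This performs no cryptographic verification of the token.
    """
    scheme, _, token = authorization_header.strip().partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        raise ValueError("expected an Authorization header of the form 'Bearer <token>'")
    return token
```

A check like this is useful mainly for returning a clear 401 early when the header is missing or uses a different scheme.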
MCP Security Model
| MCP Method | Auth Required | Why |
|---|---|---|
| initialize | ❌ No | Session setup |
| tools/list | ❌ No | Discovery - agents need to see your schemas |
| resources/list | ❌ No | Discovery |
| prompts/list | ❌ No | Discovery |
| tools/call | ✅ Yes | Execution - costs money, runs your code |
What this means in practice:
- ✅ https://your-mcp.com/mcp + initialize → Works without auth
- ✅ https://your-mcp.com/mcp + tools/list → Works without auth
- ❌ https://your-mcp.com/mcp + tools/call → Requires Context Protocol JWT
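The security model above amounts to a per-method gate on the JSON-RPC request body. A minimal sketch of that gate follows; it is not the ctxprotocol middleware. Treating methods that are not listed in the table as protected is a conservative assumption of this example, and has_valid_jwt stands in for the result of real JWT verification.

```python
import json

# Methods the table marks as open; anything else is treated as protected here.
PUBLIC_METHODS = {"initialize", "tools/list", "resources/list", "prompts/list"}


def requires_auth(method: str) -> bool:
    """True when the MCP method must carry a Context Protocol JWT."""
    return method not in PUBLIC_METHODS


def check_request(body: str, has_valid_jwt: bool) -> int:
    """Return an HTTP-style status: 200 if allowed, 401 if a JWT is required but absent."""
    method = json.loads(body)["method"]
    if requires_auth(method) and not has_valid_jwt:
        return 401
    return 200
```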
Payment Flow
Context supports two settlement timings:
- Query mode (client.query.*) uses deferred settlement after the response is delivered
- Execute mode (client.tools.execute) accrues per-call method spend into execute sessions with automatic batch payment
- In both modes, spending caps are enforced via ContextRouter allowance checks
- 90% goes to the tool developer, 10% goes to the protocol
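The 90/10 split can be worked through with a short calculation. This is an arithmetic sketch only: the rounding rule and the choice to round to six decimal places (the precision USDC uses) are assumptions, and the protocol's actual settlement logic is not described on this page.

```python
from decimal import Decimal, ROUND_HALF_UP

DEVELOPER_SHARE = Decimal("0.90")  # 90% to the tool developer
USDC_PRECISION = Decimal("0.000001")  # six decimal places


def split_payment(amount_usd: str) -> tuple[Decimal, Decimal]:
    """Return (developer_amount, protocol_amount) for a payment in USD."""
    amount = Decimal(amount_usd)
    developer = (amount * DEVELOPER_SHARE).quantize(USDC_PRECISION, rounding=ROUND_HALF_UP)
    # The protocol receives the remainder, so the two parts always sum to the total.
    protocol = amount - developer
    return developer, protocol
```

For a $0.10 Query response, this gives $0.09 to the developer and $0.01 to the protocol.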
Links
- Context Protocol — Main website
- PyPI Package
- GitHub (Python SDK)
- TypeScript SDK — For Node.js

