# Protocol Architecture

> How the marketplace, settlement layer, and safety model work together

## Overview

Context is built around a **managed runtime**. The server discovers relevant tools, executes retrieval against the marketplace, and settles economics according to the active mode (Query or Execute).

The architecture consists of five layers:

1. **The Marketplace:** supply of tools from developers
2. **The Agent:** demand from users and developers via Query + Execute modes
3. **The Handshake Layer:** secure user-approval flow for signatures, transactions, and OAuth ([guide](/guides/handshake-architecture))
4. **The Protocol:** settlement layer on Base
5. **Context Injection:** user data for personalized tools

***

## The Marketplace (Supply)

Developers register **Tools** (the marketplace listing) powered by standard Model Context Protocol servers. Paste your endpoint URL and Context auto-discovers methods via `tools/list`.

### Terminology

| Term       | Definition                                                                           |
| ---------- | ------------------------------------------------------------------------------------ |
| **Tool**   | The marketplace listing (what users see in discovery/search)                         |
| **Method** | A concrete MCP method (`tools/call`) exposed under a listing                         |
| **Mode**   | Where a method is eligible: `answer`, `execute`, or `both` (set via `_meta.surface`) |

### How It Works

1. You build an MCP server exposing your data/APIs
2. You register it as an MCP listing with a listing response price and optional method-level execute pricing metadata
3. Query runtime selects query-eligible methods; Execute runtime selects methods with explicit per-call pricing
4. You get paid in USDC on Base through deferred settlement flows (query turns and execute-session batches)
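
As a rough illustration of step 3, the sketch below filters a listing's methods by mode. The `MethodDescriptor` shape, the method names, the numeric type of `executeUsd`, and the default of `both` for a missing `_meta.surface` are assumptions made for this example; only the `_meta.surface` values and the `_meta.pricing.executeUsd` key come from this page.

```typescript
// Hypothetical shapes for illustration; not the SDK's actual types.
type Surface = "answer" | "execute" | "both";

interface MethodDescriptor {
  name: string;
  _meta?: {
    surface?: Surface;
    pricing?: { executeUsd?: number }; // assumed numeric
  };
}

// Assumption: a method with no declared surface is treated as "both".
function surfaceOf(method: MethodDescriptor): Surface {
  return method._meta?.surface ?? "both";
}

// Query runtime: any method whose surface includes "answer".
function isQueryEligible(method: MethodDescriptor): boolean {
  return surfaceOf(method) !== "execute";
}

// Execute runtime: surface includes "execute" AND an explicit per-call price is set.
function isExecuteEligible(method: MethodDescriptor): boolean {
  const price = method._meta?.pricing?.executeUsd;
  return (
    surfaceOf(method) !== "answer" &&
    typeof price === "number" &&
    Number.isFinite(price)
  );
}

// Made-up method names, for demonstration only.
const exampleMethods: MethodDescriptor[] = [
  { name: "get_funding_rates", _meta: { surface: "both", pricing: { executeUsd: 0.001 } } },
  { name: "summarize_market", _meta: { surface: "answer" } },
  { name: "get_open_interest", _meta: { surface: "both" } }, // unpriced: Query only
];

const queryMethods = exampleMethods.filter(isQueryEligible).map((m) => m.name);
const executeMethods = exampleMethods.filter(isExecuteEligible).map((m) => m.name);
```

In this sketch all three methods are available to Query, while only the priced method is visible to Execute.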

***

## The Data Broker Standard

<Warning>
  Context is not a text-based chat platform. It is a **Structured Data Marketplace**. We treat your tool like a financial API, not a conversational bot.
</Warning>

### Why `outputSchema` and `structuredContent` Matter

The MCP specification defines `outputSchema` and `structuredContent` as optional features for tools. **Context requires them for all paid tools** because structured data enables powerful marketplace features:

| Feature            | Without Schema                 | With Schema                       |
| ------------------ | ------------------------------ | --------------------------------- |
| AI Code Generation | Agent guesses response format  | Agent writes precise parsing code |
| Type Safety        | Runtime errors, broken parsing | Guaranteed structure              |
| Dispute Resolution | Manual review required         | Auto-adjudicated on-chain         |
| Trust Signal       | Unknown reliability            | "Data Broker" verified            |

### Why Context Requires These Fields

While optional in vanilla MCP, Context requires structured outputs because:

* **AI Agent Benefit**: Structured outputs give the runtime stable fields to compose, compare, and verify across tools.
* **Payment Verification**: Our smart contracts can verify that your returned JSON matches your promised schema.
* **Dispute Resolution**: Schema mismatches can be auto-adjudicated on-chain without manual review.

<Info>
  **Result:** You are not just a "Prompt Engineer." You are a **Data Broker** selling verifiable information on-chain.
</Info>
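
To make the schema requirement concrete, here is a minimal off-chain sketch of comparing `structuredContent` against a declared `outputSchema`. It covers only a small subset of JSON Schema (flat objects with primitive fields). The `OutputSchema` type, the `findSchemaMismatches` helper, and the sample fields are hypothetical, and the sketch does not describe how the contracts perform verification.

```typescript
// A deliberately small subset of JSON Schema: a flat object with primitive fields.
type PrimitiveType = "string" | "number" | "boolean";

interface OutputSchema {
  type: "object";
  properties: Record<string, { type: PrimitiveType }>;
  required?: string[];
}

// Returns human-readable mismatches; an empty array means the payload conforms.
function findSchemaMismatches(
  schema: OutputSchema,
  structuredContent: Record<string, unknown>,
): string[] {
  const problems: string[] = [];
  for (const key of schema.required ?? []) {
    if (!(key in structuredContent)) {
      problems.push(`missing required field "${key}"`);
    }
  }
  for (const key of Object.keys(schema.properties)) {
    if (!(key in structuredContent)) continue; // absent optional fields are fine
    const expected = schema.properties[key].type;
    const actual = typeof structuredContent[key];
    if (actual !== expected) {
      problems.push(`field "${key}" should be ${expected}, got ${actual}`);
    }
  }
  return problems;
}

// Hypothetical schema and payloads for a funding-rate method.
const fundingSchema: OutputSchema = {
  type: "object",
  properties: { symbol: { type: "string" }, fundingRate: { type: "number" } },
  required: ["symbol", "fundingRate"],
};

const conforming = findSchemaMismatches(fundingSchema, { symbol: "BTC", fundingRate: 0.0001 });
const mismatched = findSchemaMismatches(fundingSchema, { symbol: "BTC", fundingRate: "0.0001" });
```

A production server would more likely rely on a full JSON Schema validator; the point here is only that a declared schema makes mismatches mechanically detectable.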

***

## The Agent (Demand)

When a user asks a complex question (e.g., *"Is it profitable to arb Uniswap vs Aave?"*), the managed runtime handles everything between the question and the answer. Callers only see the public contract; the server owns orchestration.

<Note>
  For the public contract — inputs, response shapes, and debugging surfaces — see [Query and Chat Runtime](/architecture/query-chat-agentic-flow).
</Note>

<Steps>
  <Step title="Find the right tools">
    The runtime discovers relevant tools from the marketplace, or uses the ones you pin.
  </Step>

  <Step title="Resolve ambiguity if needed">
    If the prompt is under-specified, the runtime proceeds with the best grounded interpretation and discloses the assumption it made, or returns `capability_miss` when the marketplace genuinely can't answer.
  </Step>

  <Step title="Execute and verify">
    The runtime issues tool calls, preserves full returned collections, and recovers in-line when a call fails or drifts off-domain.
  </Step>

  <Step title="Ground the answer">
    Verified tool output is composed into the response contract you asked for: `answer_with_evidence` or `evidence_only`.
  </Step>

  <Step title="Settle">
    After delivery, payment is settled in the background via `ContextRouter` — deferred settlement in USDC on Base. Tool fees are waived if execution fails.
  </Step>
</Steps>

<Note>
  **Composability is the superpower of Context.** Any frontier model can stitch together disparate tools into a coherent workflow, creating infinite new use cases from your single MCP server.
</Note>

***

## The Librarian Model (Canonical)

Context runs one marketplace with two modes:

* **User A (Query):** Context is the librarian. You ask a question and pay once for a curated response.
* **User B (Execute):** Your app/agent is the librarian. You select methods directly and pay per execute call within a session budget.
* **Both in one app:** SDK clients can invoke both modes side-by-side.

<Note>
  Compatibility: some API/SDK fields still use legacy names like `pricePerQuery`. In this rollout, Query billing semantics are listing-level **per response turn**.
  A future major release can add response-named aliases (for example, `pricePerResponse`) before deprecating legacy names.
</Note>

***

## The Protocol (Settlement)

All value flows through `ContextRouter.sol` on Base, with mode-specific settlement semantics:

* **Query mode (`/api/v1/query`)** keeps deferred settlement after the response is delivered.
* **Execute mode (`/api/v1/tools/execute`)** uses execute sessions (`maxSpendUsd`) with automatic payment in deferred batches.

### How Settlement Works

For **Query mode**:

1. User sets a **spending cap** (ERC-20 allowance on ContextRouter) as a one-time setup
2. Before execution, the server performs a **read-only pre-flight check** (balance + allowance) with no gas cost
3. Tools execute and the AI response is streamed to the user
4. **After the response is fully delivered:** the server settles payment in the background via the operator wallet
5. If settlement fails, it is retried automatically (up to 5 attempts with exponential backoff)
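
The retry behaviour in step 5 can be sketched as follows. This is an illustration under stated assumptions rather than the server's implementation: the `settleWithRetry` and `backoffDelaysMs` names, the 1-second base delay, and the doubling factor are all hypothetical. Only the limit of 5 attempts and the use of exponential backoff come from the list above.

```typescript
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Delays before retries 1..n-1: base, 2x base, 4x base, ...
function backoffDelaysMs(maxAttempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < maxAttempts - 1; retry++) {
    delays.push(baseDelayMs * Math.pow(2, retry));
  }
  return delays;
}

// Calls `settle` until it succeeds or `maxAttempts` is exhausted.
// `sleep` is injectable so the schedule can be exercised without waiting.
async function settleWithRetry(
  settle: () => Promise<void>,
  maxAttempts: number = 5,
  baseDelayMs: number = 1000,
  sleep: Sleep = realSleep,
): Promise<{ settled: boolean; attempts: number }> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await settle();
      return { settled: true, attempts: attempt };
    } catch {
      if (attempt < maxAttempts) {
        await sleep(baseDelayMs * Math.pow(2, attempt - 1));
      }
    }
  }
  // All attempts failed; the caller decides what happens next.
  return { settled: false, attempts: maxAttempts };
}
```

With these assumed defaults, the waits between the five attempts would be 1 s, 2 s, 4 s, and 8 s.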

For **Execute mode**:

1. User spending cap (allowance) is checked as the long-lived global ceiling
2. Client starts an execute session with `maxSpendUsd` (short-lived run budget)
3. Each execute call accrues method-level price (`_meta.pricing.executeUsd`) into session spend
4. Session spend is flushed to pending settlement in deferred batches (threshold/TTL/close)
5. Session responses expose `methodPrice`, `spent`, `remaining`, and `maxSpend`
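
A minimal sketch of the session accounting described in steps 2 to 5, assuming that a call which would exceed `maxSpendUsd` is rejected. The `ExecuteSession` class is hypothetical and the rejection rule is an assumption; only the field names `methodPrice`, `spent`, `remaining`, and `maxSpend` are taken from the list above.

```typescript
// Amounts are tracked as integer micro-USD (1 USD = 1,000,000) to avoid
// floating-point drift when many small prices are summed.
const MICRO_PER_USD = 1000000;
const toMicro = (usd: number): number => Math.round(usd * MICRO_PER_USD);

interface SessionSnapshot {
  methodPrice: number;
  spent: number;
  remaining: number;
  maxSpend: number;
}

// Hypothetical in-memory model of one execute session.
class ExecuteSession {
  private spentMicro = 0;
  private readonly maxSpendMicro: number;

  constructor(maxSpendUsd: number) {
    this.maxSpendMicro = toMicro(maxSpendUsd);
  }

  // Accrues one call's price, or throws if it would exceed the session budget.
  charge(executeUsd: number): SessionSnapshot {
    const priceMicro = toMicro(executeUsd);
    if (this.spentMicro + priceMicro > this.maxSpendMicro) {
      throw new Error("execute session budget exceeded");
    }
    this.spentMicro += priceMicro;
    return {
      methodPrice: priceMicro / MICRO_PER_USD,
      spent: this.spentMicro / MICRO_PER_USD,
      remaining: (this.maxSpendMicro - this.spentMicro) / MICRO_PER_USD,
      maxSpend: this.maxSpendMicro / MICRO_PER_USD,
    };
  }
}
```

For example, a session opened with a budget of 0.003 USD would accept three calls priced at 0.001 USD and reject a fourth.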

<Info>
  **Key benefit:** Both modes avoid blocking users on on-chain settlement. Query responses and execute results return first, then settlement is handled asynchronously. Failed executions are not billed as though they had succeeded.
</Info>

### Payment Distribution

| Recipient         | Share |
| ----------------- | ----- |
| Tool Developer    | 90%   |
| Protocol Treasury | 10%   |

When multiple tools are used, settlements can be **batched** (query turn batches and execute-session batches).
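
The 90/10 split can be worked through in USDC base units (USDC uses 6 decimals). This is an arithmetic sketch only: the `splitPayment` helper is hypothetical, and sending any rounding remainder to the treasury is an assumption, not a statement about `ContextRouter`.

```typescript
const DEVELOPER_SHARE_BPS = 9000; // 90% expressed in basis points
const BPS_DENOMINATOR = 10000;

// Splits an amount given in USDC base units (1 USDC = 1,000,000 units).
function splitPayment(amountBaseUnits: number): { developer: number; treasury: number } {
  if (!Number.isInteger(amountBaseUnits) || amountBaseUnits < 0) {
    throw new RangeError("amount must be a non-negative integer of base units");
  }
  const developer = Math.floor((amountBaseUnits * DEVELOPER_SHARE_BPS) / BPS_DENOMINATOR);
  // Assumption: whatever is left after rounding down goes to the treasury.
  return { developer, treasury: amountBaseUnits - developer };
}

// A 0.10 USDC response fee is 100,000 base units.
const responseFeeSplit = splitPayment(100000);
```

Here `responseFeeSplit` is 90,000 units (0.09 USDC) for the developer and 10,000 units (0.01 USDC) for the treasury.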

***

## Staking System

All tools, including free ones, require a minimum stake, enforced on-chain.

| Tool Type  | Minimum Stake                                                  |
| ---------- | -------------------------------------------------------------- |
| Free Tools | \$10 USDC                                                      |
| Paid Tools | \$10 USDC or 100× listing response price (whichever is higher) |
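
The rule in the table reduces to a single `max`. A small sketch, with a hypothetical helper name:

```typescript
const BASE_STAKE_USD = 10;
const STAKE_MULTIPLIER = 100;

// Minimum stake for a listing: $10, or 100x the listing response price if higher.
// A free tool has a price of 0 and therefore stakes the $10 base amount.
function minimumStakeUsd(listingResponsePriceUsd: number): number {
  if (listingResponsePriceUsd < 0) {
    throw new RangeError("price must not be negative");
  }
  return Math.max(BASE_STAKE_USD, STAKE_MULTIPLIER * listingResponsePriceUsd);
}
```

Under this rule, a tool priced at \$0.25 per response would stake \$25, while any tool priced at \$0.10 or less stays at the \$10 floor.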

### Stake Properties

* **Fully refundable** with a 7-day withdrawal delay
* **Creates accountability** and enables slashing for fraud
* **On-chain enforcement** via smart contracts

***

## Security: Financial Protocol Architecture

We use **Asymmetric Request Signing (RS256)** to secure the marketplace. This is not just "auth"; it is the foundation of a financial protocol.

### Free vs Paid Requirements

| Tool Type                | Security Middleware | Rationale                                      |
| ------------------------ | ------------------- | ---------------------------------------------- |
| **Free Tools (\$0.00)**  | **Optional**        | Great for distribution and adoption            |
| **Paid Tools (\$0.01+)** | **Mandatory**       | We cannot route payments to insecure endpoints |

<Info>
  If you're building a free tool, you can skip security middleware entirely. But for paid tools, the middleware ensures that only legitimate, paid requests from the Context platform can execute your code.
</Info>

### Why This Matters (for Paid Tools)

| Approach                | Result                                         |
| ----------------------- | ---------------------------------------------- |
| Without Auth            | A toy: anyone can `curl` your endpoint         |
| With Shared Secrets     | A security liability: secrets can leak         |
| With Asymmetric Signing | Professional infrastructure (like Stripe/Visa) |

### How It Works

1. **Platform signs requests** with a private key (stored securely in environment)
2. **Tool servers verify requests** using the public key (distributed via SDK)
3. **Short-lived tokens** (2-minute expiration) limit the window for replay attacks

### JWT Claims

| Claim    | Description                          |
| -------- | ------------------------------------ |
| `iss`    | `https://ctxprotocol.com` (issuer)   |
| `aud`    | Tool endpoint URL (audience)         |
| `toolId` | Database ID of the tool being called |
| `iat`    | Issue timestamp                      |
| `exp`    | Expiration (2 minutes from issue)    |
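
For intuition, the claim checks implied by this table might look like the sketch below. It assumes the token's RS256 signature has already been verified and the payload decoded; it is not the SDK middleware's implementation. The `ContextClaims` type, the `checkClaims` helper, and the string type for `toolId` are assumptions.

```typescript
interface ContextClaims {
  iss: string;
  aud: string;
  toolId: string; // assumed to be a string; the table only says "Database ID"
  iat: number; // issued-at, seconds since the Unix epoch
  exp: number; // expiry, seconds since the Unix epoch
}

const EXPECTED_ISSUER = "https://ctxprotocol.com";
const MAX_TOKEN_LIFETIME_SEC = 120; // 2 minutes

// Returns null when the claims are acceptable, otherwise a rejection reason.
function checkClaims(
  claims: ContextClaims,
  expectedAudience: string,
  nowSec: number,
): string | null {
  if (claims.iss !== EXPECTED_ISSUER) return "unexpected issuer";
  if (claims.aud !== expectedAudience) return "audience mismatch";
  if (claims.exp <= nowSec) return "token expired";
  if (claims.iat > nowSec) return "token issued in the future";
  if (claims.exp - claims.iat > MAX_TOKEN_LIFETIME_SEC) return "lifetime exceeds 2 minutes";
  return null;
}
```

Checking `aud` against your own endpoint URL is what stops a token minted for one tool from being replayed against another.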

### For MCP Tool Developers

Secure your endpoint with a single line:

```typescript
import express from "express";
import { createContextMiddleware } from "@ctxprotocol/sdk";

const app = express(); // assumes an Express app hosts your MCP endpoint

// One line to secure your endpoint
app.use("/mcp", createContextMiddleware());
```

***

## Context Injection (User Data)

For tools that need access to user-specific data (e.g., *"Analyze my positions"* or *"Check my balances"*), Context uses a **Context Injection** pattern. The platform fetches user data client-side and injects it directly into your tool's arguments. Your server never sees private keys or credentials.

| Context Type    | Description                             |
| --------------- | --------------------------------------- |
| `"hyperliquid"` | Hyperliquid perpetuals & spot positions |
| `"polymarket"`  | Polymarket prediction markets           |
| `"wallet"`      | Generic EVM wallet balances             |

<Card title="Context Injection Guide" icon="syringe" href="/guides/tool-metadata#context-injection">
  Full guide: declaring requirements, handling data, supported context types
</Card>

***

## Tool Safety Limits

Each Query turn runs inside a **managed execution environment** with bounded tool-call budgets, timeout enforcement, and deferred settlement.

### MCP Tool Limits

| Limit                       | Value            | Purpose                                            |
| --------------------------- | ---------------- | -------------------------------------------------- |
| Max calls per response turn | 100              | Prevent runaway loops in Query runtime             |
| Runtime timeout budget      | 5000ms           | Per-step safety guardrail inside managed execution |
| **MCP execution timeout**   | **\~60 seconds** | Encourages pre-computed products                   |

Every response-turn MCP call increments an internal counter. Once `executionCount >= 100`, the platform throws an error for that turn.
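
The counter described above could be modelled roughly as follows. The `TurnCallBudget` class is hypothetical, and the sketch assumes the check happens before each call is dispatched, so 100 calls succeed and the 101st attempt throws.

```typescript
const MAX_CALLS_PER_TURN = 100;

// Hypothetical per-turn budget; create one instance per response turn.
class TurnCallBudget {
  private executionCount = 0;

  // Call before dispatching each MCP call in the turn.
  recordCall(): number {
    if (this.executionCount >= MAX_CALLS_PER_TURN) {
      throw new Error(
        `tool-call budget of ${MAX_CALLS_PER_TURN} exhausted for this turn`,
      );
    }
    this.executionCount += 1;
    return this.executionCount;
  }
}
```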

<Warning>
  **The \~60 second timeout is intentional.** This limit is enforced at the platform/client level (not by MCP itself) and serves as a quality forcing function. It encourages data brokers to build pre-computed insight products rather than raw data access tools.

  See [Build & List Your Tool](/guides/build-tools#execution-limits--product-design) for detailed guidance on product architecture.
</Warning>

### Runtime Cancellation and Retry Semantics

* Query execution timeouts propagate cancellation to in-flight MCP calls, so requests do not continue burning API quota after the turn has timed out.
* Infrastructure-style failures (for example: `429`, auth failures, upstream timeout/unavailable responses) are treated as non-healable for that turn and are not retried.
* Likely code/data-shape issues (property mismatch, parsing errors) are retried when a retry can plausibly improve the answer.
* If a result indicates a capability or tier constraint (for example: the required metric is unavailable on the current plan), the runtime returns best-effort output with explicit limitations or a structured `capability_miss` rather than looping on the same tool.
* Tool contributors can publish pacing hints in `_meta.rateLimit` / `_meta.rateLimitHints` (`maxRequestsPerMinute`, `cooldownMs`, `maxConcurrency`, `supportsBulk`, `recommendedBatchTools`, `notes`), which the runtime uses for pacing and planning.
* Runtime pacing combines declared hints with adaptive feedback: on rate-limit / timeout / abort signals it increases cooldowns, and after stable successes it gradually relaxes back toward declared baselines.
* Example implementation: [Coinglass contributor server](https://github.com/ctxprotocol/sdk/tree/main/examples/server/coinglass-contributor).
* Implementation guide: [Tool Metadata](/guides/tool-metadata#rate-limit-hints).
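
The adaptive pacing bullet can be illustrated with a small sketch: back off on throttle signals, then relax toward the declared baseline after successes. The `AdaptivePacer` class, the doubling and halving factors, and the 60-second ceiling are assumptions chosen for the example; the runtime's actual policy is not specified on this page.

```typescript
// Only `cooldownMs` is used here; it is one of the hint fields listed above.
interface RateLimitHints {
  cooldownMs: number;
}

class AdaptivePacer {
  private readonly baselineMs: number;
  private readonly ceilingMs: number;
  private currentMs: number;

  constructor(hints: RateLimitHints, ceilingMs: number = 60000) {
    this.baselineMs = hints.cooldownMs;
    this.ceilingMs = Math.max(ceilingMs, hints.cooldownMs);
    this.currentMs = hints.cooldownMs;
  }

  // Rate-limit, timeout, or abort signal: double the cooldown, up to the ceiling.
  onThrottleSignal(): number {
    this.currentMs = Math.min(this.currentMs * 2, this.ceilingMs);
    return this.currentMs;
  }

  // Success: halve the excess over the baseline, never dropping below it.
  onSuccess(): number {
    const excessMs = this.currentMs - this.baselineMs;
    this.currentMs = this.baselineMs + Math.floor(excessMs / 2);
    return this.currentMs;
  }
}
```

With a declared `cooldownMs` of 500, two throttle signals raise the cooldown to 2000 ms, and each later success moves it closer to 500 ms again (1250, 875, and so on).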

### Economic Model

| Scenario                 | Behavior                                                                                                                     |
| ------------------------ | ---------------------------------------------------------------------------------------------------------------------------- |
| Free tools (\$0.00)      | Can be used immediately without payment                                                                                      |
| Query mode               | Pay-per-response (\~\$0.10), query-eligible method selection, up to 100 MCP calls per turn, automatic payment after delivery |
| Execute mode             | Per-call method pricing (\~\$0.001) inside execute sessions, automatic payment in deferred batches                           |
| Unpriced execute methods | **Query-only:** invisible in Execute mode until `_meta.pricing.executeUsd` is set                                            |
| Settlement fails         | Retried automatically (up to 5 attempts), then the platform absorbs the cost                                                |

<Info>
  **Why execute pricing is \~1/100 of response pricing:** A single Query response can invoke up to 100 method calls per turn, all covered by one flat response fee. When SDK consumers pay per-execute-call, the per-call price must be proportionally lower. A common starting point is execute price = listing response price / 100.
</Info>
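
The suggested starting point is simple arithmetic. A sketch with a hypothetical helper, rounding in micro-USD so that small prices do not pick up floating-point noise:

```typescript
const CALLS_COVERED_PER_RESPONSE = 100;
const MICRO_USD = 1000000;

// Starting-point execute price: listing response price divided by 100,
// rounded to the nearest micro-USD.
function suggestedExecutePriceUsd(listingResponsePriceUsd: number): number {
  const responseMicro = Math.round(listingResponsePriceUsd * MICRO_USD);
  return Math.round(responseMicro / CALLS_COVERED_PER_RESPONSE) / MICRO_USD;
}
```

A listing priced at \$0.10 per response would start at \$0.001 per execute call, matching the approximate figures in the table above.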

<Warning>
  **Execute mode gating:** Methods without explicit `_meta.pricing.executeUsd` do not appear in SDK-level Execute discovery. They remain usable in Query mode only. Contributors must set execute pricing (either via the contribute form or in MCP server `_meta`) to unlock Execute visibility. See [Tool Metadata](/guides/tool-metadata#pricing--execute-mode-eligibility) for configuration details.
</Warning>

***

## SDK Payment Modes

The SDKs expose two modes, reflecting Context's role as both a response marketplace and an execution marketplace:

| Mode                                   | Endpoint                | Payment Model    | Settlement Timing                         | What You Get                                                                                       |
| -------------------------------------- | ----------------------- | ---------------- | ----------------------------------------- | -------------------------------------------------------------------------------------------------- |
| **Execute** (`client.tools.execute()`) | `/api/v1/tools/execute` | Per execute call | Session accrual + automatic batch payment | Raw data + explicit spending limit                                                                 |
| **Query** (`client.query.run()`)       | `/api/v1/query`         | Pay-per-response | Deferred (post-response)                  | Managed librarian output: `answer_with_evidence` or `evidence_only` from the full agentic pipeline |

**Execute** is for precision: you call specific methods with explicit execute pricing and get raw data + session spend visibility. Good for custom pipelines and deterministic orchestration.

**Query** is for intelligence: you ask a natural-language question, and the managed runtime handles everything: tool discovery, ambiguity resolution, multi-tool execution (up to 100 MCP calls per response turn), and settlement. One flat fee, with a structured response contract: `answer_with_evidence` (prose plus grounding, chat parity) or `evidence_only` (bounded evidence, computed artifacts, and full-data references for agent harnesses). This is the same managed runtime that powers the client app.

<Info>
  **Why offer both?** Different consumers have different needs. An agent building a custom trading strategy wants raw data and full control (Execute). An agent answering user questions wants curated intelligence without managing orchestration (Query). Context serves both.
</Info>
