Claude Desktop & Cowork¶

MoE Sovereign implements the full Anthropic Third-Party Inference Gateway specification. This means Claude Desktop (including Claude Cowork and the embedded Claude Code CLI) can route all inference through your self-hosted MoE pipeline — with no changes to the client.

How It Works¶

Claude Desktop connects to MoE Sovereign like any other Anthropic-compatible gateway. The handshake is:

Claude Desktop reads the thirdPartyInference.baseUrl from claude_desktop_config.json.
On startup it queries GET /v1/models to discover available models — any entry whose ID starts with claude- or anthropic- appears as a "From gateway" item in the model picker, labeled with the human-readable display_name field.
All inference requests go to POST /v1/messages. The anthropic-beta and anthropic-version headers are forwarded transparently.
Context-budget warnings use POST /v1/messages/count_tokens.

Everything that Claude Desktop routes to Anthropic by default is instead processed by the MoE pipeline on your own hardware.

Quick Start (Automated)¶

The fastest path is the bundled setup script:

# From the moe-infra checkout or server
bash scripts/setup-claude-desktop.sh

The script will:

Detect the claude_desktop_config.json location on macOS, Linux, and WSL
Prompt for your MoE Sovereign URL and API key
Run a live connectivity check against /v1/models
Create a timestamped backup of the existing config
Merge the gateway block without touching other settings

After the script finishes, restart Claude Desktop.

Quick Start (Manual)¶

Locate claude_desktop_config.json:

OS	Path
macOS	`~/Library/Application Support/Claude/claude_desktop_config.json`
Linux	`~/.config/Claude/claude_desktop_config.json`
Windows	`%APPDATA%\Claude\claude_desktop_config.json`

Add (or merge) the following block:

{
  "thirdPartyInference": {
    "type":    "gateway",
    "baseUrl": "https://your-moe-instance.example.com",
    "apiKey":  "moe-sk-xxxxxxxx..."
  }
}

Restart Claude Desktop. The gateway becomes active immediately.

Selecting the Gateway in Claude Desktop UI¶

After restarting, open Developer → Configure Third-Party Inference. Claude Desktop shows the gateway URL you configured.

The /model picker in Claude Code (embedded in Claude Desktop) will list all models from /v1/models that match the claude- prefix — these are your CC Profile-enabled API keys' model aliases. They appear as:

Claude Sonnet 4.6 → MoE (Gateway)    [From gateway]
Claude Opus 4.6 → MoE (Gateway)      [From gateway]
...

The label is the display_name field returned by MoE Sovereign.

Claude Cowork¶

Claude Cowork (the web-based workspace inside Claude Desktop) uses the same gateway configuration. All Cowork conversations route through your MoE pipeline. This satisfies data-residency requirements: prompts and completions never reach Anthropic's infrastructure.

Connectors in Cowork

Anthropic-native Cowork connectors (e.g. Google Workspace, Jira) depend on Anthropic's cloud layer and are not available when using a third-party gateway. Use MCP servers as an alternative for tool integrations.

Feature Compatibility¶

Feature	Status	Notes
Claude Code (embedded CLI)	Fully supported	`ANTHROPIC_BASE_URL` set automatically
Claude Cowork chat	Fully supported	All conversations route through MoE
Model picker (`/model`)	Fully supported	`claude-*` aliases with `display_name`
Context-budget warnings	Fully supported	`/v1/messages/count_tokens` endpoint
Streaming (SSE)	Fully supported	Real-time token output
Tool use / file edits	Fully supported	MCP tool pass-through
Thinking blocks (`<think>`)	Supported	Requires `moe_reasoning` CC Profile
Cowork connectors (Google, Jira…)	Not available	Use MCP servers instead
Web search (Anthropic-native)	Not available	MoE uses SearXNG internally

CC Profile Binding¶

When Claude Desktop authenticates with an API key that has a CC Profile bound in the Admin UI, the profile controls which MoE routing mode is used:

moe_native — single-model fast path (best for interactive chat)
moe_reasoning — reasoning expert with thinking blocks
moe_orchestrated — full MoE pipeline (all experts, GraphRAG, MCP tools)

Create multiple API keys with different profiles for different use-cases and switch between them in claude_desktop_config.json.

See Claude Code Profiles for full configuration details.

Troubleshooting¶

Gateway not showing up after restart¶

Verify JSON syntax in claude_desktop_config.json:

python3 -c "import json; json.load(open('/path/to/claude_desktop_config.json'))" && echo OK

"Invalid API Key"¶

curl https://your-moe-instance.example.com/v1/models \
  -H "x-api-key: moe-sk-xxxxxxxx..."

A 200 response confirms the key is valid. If you receive 401, check the key in the User Portal under API Keys.

Models not appearing in picker¶

The model picker only shows entries whose ID starts with claude- or anthropic-. These are shown to users whose API key has a CC Profile (cc_profile permission). Ask your admin to grant this permission, or create a key with an explicit CC Profile in the Admin UI.

Context-budget warnings¶

If Claude Code warns about context limits, this is normal — the /v1/messages/count_tokens endpoint provides an estimate (chars / 3.5). It is accurate enough for budget warnings but not identical to Anthropic's exact tokenizer.