Claude Desktop & Cowork¶
MoE Sovereign implements the full Anthropic Third-Party Inference Gateway specification. This means Claude Desktop (including Claude Cowork and the embedded Claude Code CLI) can route all inference through your self-hosted MoE pipeline — with no changes to the client.
How It Works¶
Claude Desktop connects to MoE Sovereign like any other Anthropic-compatible gateway. The handshake is:
- Claude Desktop reads the
thirdPartyInference.baseUrlfromclaude_desktop_config.json. - On startup it queries
GET /v1/modelsto discover available models — any entry whose ID starts withclaude-oranthropic-appears as a "From gateway" item in the model picker, labeled with the human-readabledisplay_namefield. - All inference requests go to
POST /v1/messages. Theanthropic-betaandanthropic-versionheaders are forwarded transparently. - Context-budget warnings use
POST /v1/messages/count_tokens.
Everything that Claude Desktop routes to Anthropic by default is instead processed by the MoE pipeline on your own hardware.
Quick Start (Automated)¶
The fastest path is the bundled setup script:
The script will:
- Detect the
claude_desktop_config.jsonlocation on macOS, Linux, and WSL - Prompt for your MoE Sovereign URL and API key
- Run a live connectivity check against
/v1/models - Create a timestamped backup of the existing config
- Merge the gateway block without touching other settings
After the script finishes, restart Claude Desktop.
Quick Start (Manual)¶
Locate claude_desktop_config.json:
| OS | Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Linux | ~/.config/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
Add (or merge) the following block:
{
"thirdPartyInference": {
"type": "gateway",
"baseUrl": "https://your-moe-instance.example.com",
"apiKey": "moe-sk-xxxxxxxx..."
}
}
Restart Claude Desktop. The gateway becomes active immediately.
Selecting the Gateway in Claude Desktop UI¶
After restarting, open Developer → Configure Third-Party Inference. Claude Desktop shows the gateway URL you configured.
The /model picker in Claude Code (embedded in Claude Desktop) will list
all models from /v1/models that match the claude- prefix — these are
your CC Profile-enabled API keys' model aliases. They appear as:
The label is the display_name field returned by MoE Sovereign.
Claude Cowork¶
Claude Cowork (the web-based workspace inside Claude Desktop) uses the same gateway configuration. All Cowork conversations route through your MoE pipeline. This satisfies data-residency requirements: prompts and completions never reach Anthropic's infrastructure.
Connectors in Cowork
Anthropic-native Cowork connectors (e.g. Google Workspace, Jira) depend on Anthropic's cloud layer and are not available when using a third-party gateway. Use MCP servers as an alternative for tool integrations.
Feature Compatibility¶
| Feature | Status | Notes |
|---|---|---|
| Claude Code (embedded CLI) | Fully supported | ANTHROPIC_BASE_URL set automatically |
| Claude Cowork chat | Fully supported | All conversations route through MoE |
Model picker (/model) |
Fully supported | claude-* aliases with display_name |
| Context-budget warnings | Fully supported | /v1/messages/count_tokens endpoint |
| Streaming (SSE) | Fully supported | Real-time token output |
| Tool use / file edits | Fully supported | MCP tool pass-through |
Thinking blocks (<think>) |
Supported | Requires moe_reasoning CC Profile |
| Cowork connectors (Google, Jira…) | Not available | Use MCP servers instead |
| Web search (Anthropic-native) | Not available | MoE uses SearXNG internally |
CC Profile Binding¶
When Claude Desktop authenticates with an API key that has a CC Profile bound in the Admin UI, the profile controls which MoE routing mode is used:
moe_native— single-model fast path (best for interactive chat)moe_reasoning— reasoning expert with thinking blocksmoe_orchestrated— full MoE pipeline (all experts, GraphRAG, MCP tools)
Create multiple API keys with different profiles for different use-cases and
switch between them in claude_desktop_config.json.
See Claude Code Profiles for full configuration details.
Troubleshooting¶
Gateway not showing up after restart¶
Verify JSON syntax in claude_desktop_config.json:
"Invalid API Key"¶
A 200 response confirms the key is valid. If you receive 401, check
the key in the User Portal under API Keys.
Models not appearing in picker¶
The model picker only shows entries whose ID starts with claude- or
anthropic-. These are shown to users whose API key has a CC Profile
(cc_profile permission). Ask your admin to grant this permission, or
create a key with an explicit CC Profile in the Admin UI.
Context-budget warnings¶
If Claude Code warns about context limits, this is normal — the
/v1/messages/count_tokens endpoint provides an estimate (chars / 3.5).
It is accurate enough for budget warnings but not identical to Anthropic's
exact tokenizer.