Go-LLM-proxy v0.3 released – translating proxy for Claude Code and Codex

3 pointsposted 19 hours ago
by yatesdr

1 Comments

yatesdr

19 hours ago

Happy to report v0.3 released for go-llm-proxy!

Great for connecting your local LLM coding and vision models to Claude Code and Codex.

General improvements

> Vision pipeline - images described by your vision model, transparent to the client

> Dual OCR pipeline - smart routing for PDFs and tool output (text extraction first, vision fallback for scanned docs). Dedicated OCR models like

> PaddleOCR-VL are ~17x faster than general vision models on document pages

> Brave & Tavily search integration - native behavior for Claude Code and Codex when configured on the proxy

> Per-model processor routing - override vision, OCR, and search settings per model

> Context window auto-detection from backends SSE keepalive improvements during pipeline processing Full MCP SSE endpoint for web search on OpenCode, Qwen Code, Claw, and other MCP-compatible agents Docker update for easier deployment (limited testing so far)

Codex-specific

> Full Responses API translation - Chat Completions under the hood, your local backend doesn't need to support /v1/responses

> Reasoning token display - reasoning_summary_text.delta events so Codex shows thinking natively

> Native search UI - emits web_search_call output items so Codex renders "Searched N results" in its interface

> Structured tool output - Codex's view_image returns arrays/objects, not strings. The proxy handles all three formats

> mcp_tool_call_output and mcp_list_tools input types handled (Codex sends these, other backends choke on them)

> Config generator produces config.toml with provider, reasoning effort, context window, and optional Tavily MCP

Claude Code-specific:

> Full Messages API translation - Anthropic protocol to Chat Completions, so Claude Code works with vLLM/llama-server

> Thinking blocks - backend reasoning tokens wrapped as thinking/signature_delta content blocks so Claude Code renders them

> web_search_20250305 server tool intercepted and executed proxy-side

> PDF type: "document" blocks extracted to text before forwarding

> Streaming search with server_tool_use + web_search_tool_result blocks so Claude Code shows "Did N searches"

> /anthropic/v1/messages explicit route for clients that use the Anthropic base URL convention

> Config generator produces settings.json with Sonnet/Opus/Haiku tier selectors, thinking toggles, and start scripts