more openAI compatibility, tools support etc. #12
hyorman wants to merge 5 commits into lutzleonhardt:master
Conversation
Pull request overview
This PR significantly expands the copilot-proxy extension to provide broader OpenAI API compatibility. It adds support for tools/function calling, implements the Assistants API, and includes legacy completions and models endpoints. The extension now auto-starts the server and persists assistant state.
Changes:
- Added comprehensive OpenAI-compatible type definitions including tools, embeddings, models, responses, and assistants APIs
- Implemented full Assistants API with threads, messages, runs, and tool calling via prompt engineering
- Added legacy completions endpoint, models listing, and embeddings stub
- Introduced state persistence for assistants with debounced auto-save
- Changed extension behavior to auto-start server on activation
- Added web-based chat client with Flask backend
Reviewed changes
Copilot reviewed 25 out of 27 changed files in this pull request and generated 21 comments.
Show a summary per file
| File | Description |
|---|---|
| src/types.ts | Added extensive type definitions for OpenAI API compatibility including tools, assistants, embeddings, and responses |
| src/server.ts | Expanded with models endpoints, completions API, responses API, tool parsing, and assistants router integration |
| src/extension.ts | Added state persistence, auto-start behavior, model listing command, and improved system message handling |
| src/assistants/*.ts | New comprehensive assistants module with types, state management, tool utilities, run execution engine, and routes |
| package.json | Updated version, commands, engine requirements, removed vsce dependency, changed activation events |
| client/* | New web client with HTML/CSS/JS frontend and Flask proxy backend, plus Python streaming client |
| package-lock.json | Deleted (breaks reproducible builds) |
| docs/specs/archive/* | Removed planning documents |
Comments suppressed due to low confidence (1)
src/extension.ts:180
The `extractMessageContent` function now handles `null` and `undefined` values, but there's a potential issue: when `content` is an array of `StructuredMessageContent`, the function returns the concatenated text from the items. However, in the types, `StructuredMessageContent` has a `type` property, while the function assumes it has a `text` property. This could lead to `undefined` being returned for valid structured content that doesn't match the expected shape.
```typescript
function extractMessageContent(content: string | StructuredMessageContent[] | null | undefined): string {
  if (content === null || content === undefined) {
    return '';
  }
  if (typeof content === 'string') {
    return content;
  }
  if (Array.isArray(content)) {
    return content.map(item => item.text).join('\n');
  }
  return String(content);
}
```
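A defensive variant could check each item's shape before concatenating, so structured items without a `text` property are skipped rather than rendered as `undefined`. A minimal sketch (not the PR's code; `extractMessageContentSafe` is a hypothetical name):

```typescript
interface StructuredMessageContent {
  type: string;
  text?: string;
}

// Sketch: only concatenate items that actually carry a string `text`,
// instead of assuming every item has one.
function extractMessageContentSafe(
  content: string | StructuredMessageContent[] | null | undefined
): string {
  if (content === null || content === undefined) return '';
  if (typeof content === 'string') return content;
  if (Array.isArray(content)) {
    return content
      .filter(item => typeof item.text === 'string')
      .map(item => item.text)
      .join('\n');
  }
  return String(content);
}
```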
src/extension.ts
Outdated
```typescript
// Auto-start the server on extension activation
if (!serverInstance) {
  const configPort = vscode.workspace.getConfiguration("copilotProxy").get("port", 3000);
  serverInstance = startServer(configPort);
  outputChannel.appendLine(`Express server auto-started on port ${configPort}.`);
}
```
The server now auto-starts on extension activation (lines 89-93), which is a significant behavior change. This could cause issues if the configured port is already in use, potentially blocking the extension from loading. Consider adding error handling and user notification if the server fails to start, or making auto-start optional via configuration.
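One way to guard the auto-start is to attach an `error` listener and catch synchronous failures, so an occupied port produces a warning instead of breaking activation. A sketch under the assumption that `startServer` returns a Node `http.Server` (the helper name and `warn` callback are hypothetical):

```typescript
import * as http from 'http';

// Sketch: wrap auto-start so an occupied port surfaces as a warning
// instead of silently breaking extension activation.
function autoStartServer(
  startServer: (port: number) => http.Server,
  port: number,
  warn: (msg: string) => void
): http.Server | undefined {
  try {
    const server = startServer(port);
    server.on('error', (err: NodeJS.ErrnoException) => {
      if (err.code === 'EADDRINUSE') {
        warn(`copilot-proxy: port ${port} is already in use; server not started.`);
      } else {
        warn(`copilot-proxy: server error: ${err.message}`);
      }
    });
    return server;
  } catch (err) {
    warn(`copilot-proxy: failed to start server: ${String(err)}`);
    return undefined;
  }
}
```

In an extension, `warn` would typically forward to `vscode.window.showWarningMessage` or the output channel.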
src/server.ts
Outdated
````typescript
function parseToolCalls(content: string, tools: FunctionTool[]): { text: string; toolCalls: ResponseFunctionCallItem[] } {
  const toolCalls: ResponseFunctionCallItem[] = [];
  let remainingText = content;

  // Look for JSON-formatted tool calls in the response
  // Common patterns: <tool_call>, ```json, or direct JSON objects
  const toolCallPatterns = [
    /<tool_call>([\s\S]*?)<\/tool_call>/g,
    /```(?:json)?\s*\n?({[\s\S]*?"name"[\s\S]*?"arguments"[\s\S]*?})\s*\n?```/g,
    /\{\s*"tool_calls?"\s*:\s*\[([\s\S]*?)\]\s*\}/g
  ];

  for (const pattern of toolCallPatterns) {
    let match;
    while ((match = pattern.exec(content)) !== null) {
      try {
        let parsed = JSON.parse(match[1] || match[0]);

        // Handle both single tool call and array of tool calls
        const calls = Array.isArray(parsed) ? parsed : (parsed.tool_calls || [parsed]);

        for (const call of calls) {
          if (call.name && tools.some(t => t.function.name === call.name)) {
            const toolCall: ResponseFunctionCallItem = {
              type: 'function_call',
              id: generateId('fc'),
              call_id: generateId('call'),
              name: call.name,
              arguments: typeof call.arguments === 'string' ? call.arguments : JSON.stringify(call.arguments || {}),
              status: 'completed'
            };
            toolCalls.push(toolCall);
            remainingText = remainingText.replace(match[0], '').trim();
          }
        }
      } catch (e) {
        // Not valid JSON, continue
      }
    }
  }
````
The parseToolCalls function uses regular expressions with global flags but doesn't reset lastIndex between iterations or patterns. When using .exec() with global regex, the lastIndex property persists across calls, which can cause the regex to miss matches or behave unexpectedly. Consider using String.prototype.matchAll() or resetting the regex between uses.
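Using `String.prototype.matchAll` sidesteps the shared `lastIndex` state entirely, since it returns a fresh iterator per call. A minimal sketch of the extraction step (the helper name is hypothetical, and only the `<tool_call>` pattern is shown):

```typescript
// Sketch: matchAll creates a new iterator on each call, so no
// lastIndex state leaks across patterns or repeated invocations.
function findToolCallPayloads(content: string): string[] {
  const patterns = [
    /<tool_call>([\s\S]*?)<\/tool_call>/g, // matchAll requires the g flag
  ];
  const payloads: string[] = [];
  for (const pattern of patterns) {
    for (const match of content.matchAll(pattern)) {
      payloads.push((match[1] ?? match[0]).trim());
    }
  }
  return payloads;
}
```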
src/server.ts
Outdated
```typescript
              status: 'completed'
            };
            toolCalls.push(toolCall);
            remainingText = remainingText.replace(match[0], '').trim();
```
The tool parsing logic removes matched tool calls from the content by calling replace(match[0], '') on line 231. However, remainingText is defined once at the start (line 201) and never reassigned when tool calls are found. This means the text content isn't actually being stripped of tool call markers as intended - the replace operation is performed on a variable that's immediately reassigned.
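If the marker-stripping result really is being discarded, the usual fix is to keep assigning each `replace` result back into the accumulator. A sketch of the intended behavior in isolation (the helper name is hypothetical):

```typescript
// Sketch: strip each matched tool-call marker from the text that will
// be returned, keeping the result of every replace call.
function stripMatches(text: string, matchedSpans: string[]): string {
  let remaining = text;
  for (const span of matchedSpans) {
    // Reassign, so each removal builds on the previous one.
    remaining = remaining.replace(span, '').trim();
  }
  return remaining;
}
```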
src/server.ts
Outdated
```typescript
            }
          };
          toolCalls.push(toolCall);
          remainingText = remainingText.replace(match[0], '').trim();
```
The parseChatToolCalls function has the same issue as parseToolCalls - the remainingText variable at line 523 is initialized but line 550 performs a replace operation that doesn't get assigned back to remainingText. This means tool call markers won't be removed from the returned text content.
src/assistants/tools.ts
Outdated
```typescript
export class ToolCallBuffer {
  private content: string = '';
  private inToolCall: boolean = false;
  private toolCallDepth: number = 0;

  /**
   * Add content to buffer
   * Returns content that can be safely emitted (not part of a tool call)
   */
  append(chunk: string): { safeContent: string; complete: boolean } {
    this.content += chunk;

    // Check for tool call markers
    const openCount = (this.content.match(/<tool_call>/gi) || []).length;
    const closeCount = (this.content.match(/<\/tool_call>/gi) || []).length;

    this.inToolCall = openCount > closeCount;

    if (!this.inToolCall && openCount === closeCount) {
      // All tool calls are complete (or there are none)
      return { safeContent: '', complete: true };
    }

    // We're in the middle of a tool call, don't emit anything yet
    return { safeContent: '', complete: false };
  }

  /**
   * Get the full accumulated content
   */
  getContent(): string {
    return this.content;
  }

  /**
   * Check if we're currently inside a tool call block
   */
  isInToolCall(): boolean {
    return this.inToolCall;
  }

  /**
   * Reset the buffer
   */
  reset(): void {
    this.content = '';
    this.inToolCall = false;
    this.toolCallDepth = 0;
  }
}
```
The ToolCallBuffer class tracks tool call depth but never uses the toolCallDepth property (line 194). It's incremented and reset but not used in any logic. Either implement proper depth tracking for nested tool calls or remove the unused property to avoid confusion.
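If depth tracking were kept, it could be fed incrementally and drive completion detection for nested markers. A sketch under the same `<tool_call>` marker convention (the class name is hypothetical, and it assumes a marker is never split across chunks):

```typescript
// Sketch: track nesting depth incrementally and report completion only
// when the outermost <tool_call> block has closed. Assumes markers
// arrive unsplit within a single chunk.
class NestedToolCallTracker {
  private depth = 0;
  private sawAny = false;

  /** Feed a streamed chunk; returns true once a complete outermost call exists. */
  feed(chunk: string): boolean {
    const opens = (chunk.match(/<tool_call>/gi) || []).length;
    const closes = (chunk.match(/<\/tool_call>/gi) || []).length;
    if (opens > 0) this.sawAny = true;
    this.depth += opens - closes;
    return this.sawAny && this.depth === 0;
  }
}
```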
```typescript
// POST /v1/responses - Create a model response (new OpenAI API)
app.post<{}, {}, CreateResponseRequest>('/v1/responses', async (req, res) => {
  const { model, input, instructions, stream, temperature, max_output_tokens, metadata, tools, tool_choice } = req.body;
```
Unused variable `tool_choice`.
```diff
- const { model, input, instructions, stream, temperature, max_output_tokens, metadata, tools, tool_choice } = req.body;
+ const { model, input, instructions, stream, temperature, max_output_tokens, metadata, tools } = req.body;
```
```diff
  app.post<{}, {}, ChatCompletionRequest>('/v1/chat/completions', async (req, res) => {
-   const { model, stream } = req.body;
+   const { model, stream, tools, tool_choice } = req.body;
```
Unused variable `tool_choice`.
```diff
- const { model, stream, tools, tool_choice } = req.body;
+ const { model, stream, tools } = req.body;
```
```javascript
        else if(choices.length && choices[0].delta) content = choices.map(c=>c.delta?.content||'').join('');
        else if(data.text) content = data.text;
        else content = JSON.stringify(data);
      }catch(e){ content = JSON.stringify(data) }
```
Avoid automated semicolon insertion (96% of all statements in the enclosing function have an explicit semicolon).
```diff
- }catch(e){ content = JSON.stringify(data) }
+ }catch(e){ content = JSON.stringify(data); }
```
```python
    return send_from_directory('web', 'index.html')

@app.route('/api/chat', methods=['POST'])
def api_chat():
```
`api_chat` returns a tuple of size 2 and a tuple of size 3.
```python
@app.route('/api/models', methods=['GET'])
def api_models():
```
`api_models` returns a tuple of size 2 and a tuple of size 3.
PR Type
Enhancement

Description
- Implements the complete OpenAI Assistants API with an async generator-based run execution engine supporting streaming and tool calling
- Adds comprehensive Express routes for CRUD operations on assistants, threads, messages, and runs with SSE streaming support
- Extends OpenAI API compatibility with `/v1/models`, `/v1/responses`, and enhanced `/v1/chat/completions` endpoints featuring tool calling
- Implements prompt-based tool calling utilities with XML marker parsing and a `ToolCallBuffer` for streaming support (the VS Code LM API lacks native function calling)
- Adds in-memory state management with debounced persistence to VS Code `globalState` for assistants, threads, messages, runs, and run steps
- Integrates state persistence and model discovery in the extension, with an auto-start server and a model listing command
- Creates a complete web UI with a Flask proxy server, a Python streaming client, and an interactive chat application with persistent message storage
- Defines comprehensive TypeScript types for the Assistants API, tool calling, and the new response formats
- Updates the package configuration to require VS Code `^1.95.0` and refactors command IDs to camelCase

Diagram Walkthrough
File Walkthrough

14 files

| File | Description |
|---|---|
| src/assistants/runner.ts | Run execution engine with streaming and tool support |
| src/assistants/routes.ts | OpenAI Assistants API Express routes |
| src/server.ts | Extended OpenAI API compatibility: `/v1/models` endpoints for listing and retrieving available models, a `/v1/embeddings` stub returning 501 Not Implemented, a `/v1/completions` endpoint wrapping to chat completions, the `/v1/responses` API with tool calling support and streaming, and `/v1/chat/completions` with tool calling and function definitions |
| src/assistants/state.ts | In-memory state management with persistence for assistants, threads, messages, runs, and run steps |
| src/assistants/types.ts | OpenAI Assistants API TypeScript type definitions (including `file_search`) |
| src/extension.ts | State persistence and model discovery integration: persists to `globalState` with debounced saves and adds a `getAvailableModels()` function to query the VS Code LanguageModel API |
| src/types.ts | Extended type definitions for tools and new APIs (`FunctionTool`, `ToolCall`); extends `ChatCompletionRequest` with tool parameters |
| src/assistants/index.ts | Assistants API module exports, including routes |
| src/assistants/tools.ts | Tool calling utilities for the VS Code LM API, which lacks native function calling support: prompts with detailed parameter documentation, XML markers and JSON extraction, and a `ToolCallBuffer` class for streaming that detects complete tool calls before parsing |
| client/copilot_proxy.py | Python streaming client for chat completions against the API server; parses `data:`-prefixed lines and prints fragments in real time |
| client/server.py | Flask proxy server for API requests: forwards to the API at `http://localhost:3000/v1`, provides an `/api/chat` endpoint to forward chat completion requests with proper error handling and an `/api/models` endpoint to fetch and return available models from the upstream API, and serves the `web` directory |
| client/web/app.js | Interactive chat application frontend: persists chats in localStorage, loads models from the `/api/models` endpoint with fallback options, and includes error handling |
| client/web/styles.css | Dark theme styling for the chat interface: accent colors; layout for the header, message container, and form; message appearance |
| client/web/index.html | HTML structure for the chat application: title, model selector, new chat button, and message history |

1 file

| File | Description |
|---|---|
| client/README_WEB.md | Web UI setup and usage documentation, including dependencies |

2 files

| File | Description |
|---|---|
| package.json | Bumps the version from `1.0.2` to `1.0.4` and improves package metadata with `displayName` and `readme` fields; raises the VS Code engine requirement from `^1.70.0` to `^1.95.0`; renames command IDs to camelCase (`copilotProxy.startServer`, `copilotProxy.stopServer`, etc.) and adds a new `listModels` command; changes `activationEvents` to the wildcard `*` and removes specific command activation events; switches to the `@vscode/vsce` package instead of the standalone `vsce` dependency |
| .vscodeignore | VS Code extension packaging exclusions; omits the `node_modules` directory from the packaged extension |

1 file

| File | Description |
|---|---|
| client/requirements-web.txt | Python dependencies for the web server (`Flask>=2.0` and `requests>=2.25`) |

7 files