
more openAI compatibility, tools support etc. #12

Closed
hyorman wants to merge 5 commits into lutzleonhardt:master from hyorman:some-improvements

Conversation


@hyorman hyorman commented Jan 28, 2026

PR Type

Enhancement


Description

  • Implements complete OpenAI Assistants API with async generator-based run execution engine supporting streaming and tool calling

  • Adds comprehensive Express routes for CRUD operations on assistants, threads, messages, and runs with SSE streaming support

  • Extends OpenAI API compatibility with /v1/models, /v1/responses, and enhanced /v1/chat/completions endpoints featuring tool calling

  • Implements prompt-based tool calling utilities with XML marker parsing and ToolCallBuffer for streaming support (VS Code LM API lacks native function calling)

  • Adds in-memory state management with debounced persistence to VS Code globalState for assistants, threads, messages, runs, and run steps

  • Integrates state persistence and model discovery in extension with auto-start server and model listing command

  • Creates complete web UI with Flask proxy server, Python streaming client, and interactive chat application with persistent message storage

  • Defines comprehensive TypeScript types for Assistants API, tool calling, and new response formats

  • Updates package configuration to require VS Code ^1.95.0 and refactors command IDs to camelCase


Diagram Walkthrough

flowchart LR
  VSCode["VS Code Extension"]
  Server["Express Server"]
  AssistantsAPI["Assistants API<br/>Routes & Runner"]
  ToolUtils["Tool Calling<br/>Utilities"]
  State["State<br/>Management"]
  WebUI["Web UI<br/>Flask + Chat App"]
  
  VSCode -->|"Persistence & Models"| Server
  Server -->|"Mount Routes"| AssistantsAPI
  AssistantsAPI -->|"Use Tools"| ToolUtils
  AssistantsAPI -->|"Manage State"| State
  WebUI -->|"Proxy Requests"| Server
  State -->|"Debounced Save"| VSCode

File Walkthrough

Relevant files
Enhancement
14 files
runner.ts
Run execution engine with streaming and tool support         

src/assistants/runner.ts

  • Implements async generator-based run execution engine with streaming
    and non-streaming modes
  • Supports tool calling with prompt-based parsing and validation against
    available tools
  • Handles run cancellation, step tracking, and state management
    throughout execution
  • Implements continuation logic for runs after tool outputs are
    submitted
+933/-0 
routes.ts
OpenAI Assistants API Express routes                                         

src/assistants/routes.ts

  • Implements complete Express routes for OpenAI Assistants API endpoints
  • Supports CRUD operations for assistants, threads, messages, and runs
  • Handles streaming and non-streaming run execution with SSE
  • Implements tool output submission and run cancellation endpoints
+762/-0 
server.ts
Extended OpenAI API compatibility with tools and responses

src/server.ts

  • Adds /v1/models endpoints for listing and retrieving available models
  • Implements /v1/embeddings stub endpoint returning 501 Not Implemented
  • Adds legacy /v1/completions endpoint wrapping to chat completions
  • Implements new /v1/responses API with tool calling support and
    streaming
  • Enhances /v1/chat/completions with tool calling and function
    definitions
  • Mounts assistants router and adds health check and 404 handler
+631/-6 
state.ts
In-memory state management with persistence                           

src/assistants/state.ts

  • Implements in-memory state management for assistants, threads,
    messages, runs, and run steps
  • Provides debounced persistence callback for VS Code globalState
    integration
  • Supports serialization/deserialization for state restoration
  • Manages pending tool contexts for runs awaiting tool outputs
+434/-0 
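The debounced-persistence pattern described above can be sketched as follows. This is an illustrative sketch only; `makeDebouncedSave` and `SaveFn` are hypothetical names, not the actual exports of src/assistants/state.ts, and the real code wires the callback to VS Code globalState.

```typescript
// Hypothetical sketch of debounced persistence: rapid state mutations
// collapse into a single save once writes go quiet for `delayMs`.
type SaveFn = () => void;

function makeDebouncedSave(save: SaveFn, delayMs: number): SaveFn {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return () => {
    if (timer !== undefined) {
      clearTimeout(timer); // restart the window on every new mutation
    }
    // Only the last call within the delay window triggers a save,
    // so a burst of CRUD operations produces one write.
    timer = setTimeout(save, delayMs);
  };
}
```

In the extension, the real callback would serialize the in-memory maps and hand them to `context.globalState.update(...)`.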
types.ts
OpenAI Assistants API TypeScript type definitions               

src/assistants/types.ts

  • Defines comprehensive TypeScript types for OpenAI Assistants API
  • Includes types for assistants, threads, messages, runs, and run steps
  • Defines tool calling types and streaming event types
  • Provides types for future extensibility (code_interpreter,
    file_search)
+379/-0 
extension.ts
State persistence and model discovery integration               

src/extension.ts

  • Adds state persistence using VS Code globalState with debounced saves
  • Implements getAvailableModels() function to query VS Code Language
    Model API
  • Auto-starts server on extension activation
  • Adds command to list available LLM models via VS Code picker
  • Improves system message handling by prepending to first user message
+120/-13
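The system-message handling mentioned in the last bullet can be sketched as below, assuming the OpenAI chat message shape; `foldSystemMessage` is a hypothetical helper name, not the actual function in src/extension.ts.

```typescript
// Sketch: the VS Code LM API has no dedicated system role, so the system
// message is folded into the first user message before sending.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

function foldSystemMessage(messages: ChatMessage[]): ChatMessage[] {
  const system = messages.find(m => m.role === 'system');
  if (!system) {
    return messages;
  }
  const rest = messages.filter(m => m.role !== 'system');
  const firstUser = rest.findIndex(m => m.role === 'user');
  if (firstUser === -1) {
    return rest; // no user message to attach the instructions to
  }
  return rest.map((m, i) =>
    i === firstUser ? { ...m, content: `${system.content}\n\n${m.content}` } : m
  );
}
```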
types.ts
Extended type definitions for tools and new APIs                 

src/types.ts

  • Adds tool/function calling types (FunctionTool, ToolCall)
  • Implements legacy completions API types for backward compatibility
  • Adds embeddings API stub types
  • Implements new Responses API types with tool calling support
  • Extends ChatCompletionRequest with tool parameters
+238/-3 
index.ts
Assistants API module exports                                                       

src/assistants/index.ts

  • Creates module barrel export for assistants API functionality
  • Exports types, state management, run execution, tool utilities, and
    routes
+31/-0   
tools.ts
Tool calling utilities for VS Code LM API                               

src/assistants/tools.ts

  • Implements prompt-based tool calling utilities for VS Code LM API
    which lacks native function calling support
  • Provides tool definition formatting for system prompt injection with
    detailed parameter documentation
  • Includes tool call parsing from model output using XML markers and
    JSON extraction
  • Implements ToolCallBuffer class for streaming support to detect
    complete tool calls before parsing
+240/-0 
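The XML-marker parsing described above can be sketched roughly as follows. The `<tool_call>` marker format matches what the PR discussion shows, but the types and function name here are illustrative, not the actual API of src/assistants/tools.ts.

```typescript
// Sketch: extract prompt-based tool calls emitted by the model as
// <tool_call>{"name": ..., "arguments": {...}}</tool_call> blocks.
interface ParsedToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

function parseToolCallMarkers(output: string): ParsedToolCall[] {
  const calls: ParsedToolCall[] = [];
  const regex = /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/g;
  let match: RegExpExecArray | null;
  while ((match = regex.exec(output)) !== null) {
    try {
      const parsed = JSON.parse(match[1]);
      if (typeof parsed.name === 'string') {
        calls.push({ name: parsed.name, arguments: parsed.arguments ?? {} });
      }
    } catch {
      // Malformed JSON between the markers is skipped, not thrown.
    }
  }
  return calls;
}
```

The streaming `ToolCallBuffer` wraps the same idea: it counts opening versus closing markers across chunks and only hands text to this parser once the markers balance.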
copilot_proxy.py
Python streaming client for chat completions                         

client/copilot_proxy.py

  • Creates a Python client that streams chat completions from the local
    API server
  • Implements SSE (Server-Sent Events) parsing with proper handling of
    data: prefixed lines
  • Handles JSON decoding errors gracefully and outputs streamed content
    fragments in real-time
+60/-0   
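The `data:`-prefixed line handling described above can be sketched as follows. The actual client is Python (client/copilot_proxy.py); this TypeScript sketch just illustrates the SSE field parsing, with `[DONE]` being the OpenAI-style stream terminator.

```typescript
// Sketch: pull the JSON payloads out of a run of SSE lines, skipping
// comment lines and stopping at the [DONE] terminator.
function extractSsePayloads(lines: string[]): string[] {
  const payloads: string[] = [];
  for (const line of lines) {
    if (!line.startsWith('data:')) {
      continue; // comments (": ...") and other SSE fields are ignored
    }
    const data = line.slice('data:'.length).trim();
    if (data === '[DONE]') {
      break; // end-of-stream sentinel used by OpenAI-compatible servers
    }
    payloads.push(data);
  }
  return payloads;
}
```

Each returned string is then fed to a JSON parser; decode failures on partial fragments are the bug the review's SSE suggestion below the fold addresses.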
server.py
Flask proxy server for API requests                                           

client/server.py

  • Implements Flask web server that proxies requests to the underlying
    API at http://localhost:3000/v1
  • Provides /api/chat endpoint to forward chat completion requests with
    proper error handling
  • Provides /api/models endpoint to fetch and return available models
    from upstream API
  • Serves static web files from the web directory
+41/-0   
app.js
Interactive chat application frontend                                       

client/web/app.js

  • Implements interactive chat UI with persistent message storage using
    localStorage
  • Dynamically loads available models from /api/models endpoint with
    fallback options
  • Handles form submission with Enter key support (Shift+Enter for
    newlines)
  • Parses various API response formats and displays assistant responses
    with error handling
+150/-0 
styles.css
Dark theme styling for chat interface                                       

client/web/styles.css

  • Defines dark theme CSS variables for background, panel, text, and
    accent colors
  • Implements responsive chat UI layout with flexbox for header, messages
    container, and form
  • Styles message bubbles with different backgrounds for user vs AI
    messages
  • Applies gradient backgrounds and subtle borders for modern dark theme
    appearance
+22/-0   
index.html
HTML structure for chat application                                           

client/web/index.html

  • Creates HTML structure for chat application with header containing
    title, model selector, and new chat button
  • Implements messages container section for displaying conversation
    history
  • Includes chat form with textarea for user input and send button
  • Links external CSS and JavaScript files for styling and functionality
+30/-0   
Documentation
1 file
README_WEB.md
Web UI setup and usage documentation                                         

client/README_WEB.md

  • Adds documentation for minimal Flask web UI server
  • Provides setup instructions for Python virtual environment and
    dependencies
  • Explains how to configure API endpoint and access the chat interface
+22/-0   
Configuration changes
2 files
package.json
Package configuration and metadata updates                             

package.json

  • Updates version from 1.0.2 to 1.0.4 and improves package metadata with
    displayName and readme fields
  • Changes description to reflect OpenAI compatibility focus
  • Updates VS Code engine requirement from ^1.70.0 to ^1.95.0
  • Refactors command IDs to use camelCase (copilotProxy.startServer,
    copilotProxy.stopServer, etc.) and adds new listModels command
  • Simplifies activationEvents to use wildcard * and removes specific
    command activation events
  • Updates build script to use @vscode/vsce package instead of standalone
    vsce dependency
+22/-19 
.vscodeignore
VS Code extension packaging exclusions                                     

.vscodeignore

  • Defines files and directories to exclude from VS Code extension
    package
  • Excludes Python cache, source files, tests, and configuration files
  • Preserves node_modules directory in the packaged extension
+22/-0   
Dependencies
1 file
requirements-web.txt
Python dependencies for web server                                             

client/requirements-web.txt

  • Specifies Python dependencies for the web server: Flask>=2.0 and
    requests>=2.25
+2/-0     
Additional files
7 files
plan-rest-api.md +0/-47   
plan-ui-settings.md +0/-183 
plan-use-vscode-llm.md +0/-81   
plan-vscode-extension.md +0/-130 
client.py +0/-34   
requirements.txt +0/-1     
run_client.sh +0/-5     

Halil Yasin Orman and others added 2 commits January 27, 2026 20:00
Copilot AI review requested due to automatic review settings January 28, 2026 21:54

qodo-code-review bot commented Jan 28, 2026

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Missing authentication

Description: The new OpenAI-compatible HTTP API endpoints (e.g., /v1/models, /v1/responses,
/v1/chat/completions, and mounted assistants routes) are exposed without any
authentication/authorization checks, enabling any network-reachable client to
create/read/update/delete assistants/threads/messages/runs and trigger model executions,
which is a realistic security risk if the server is not strictly bound to localhost or is
reachable via port-forwarding.
server.ts [54-681]

Referred Code
app.get('/v1/models', async (req: Request, res: Response) => {
  try {
    const models = await getAvailableModels();
    const response: ModelsListResponse = {
      object: 'list',
      data: models.map(m => ({
        id: m.family,
        object: 'model' as const,
        created: Math.floor(Date.now() / 1000),
        owned_by: m.vendor
      }))
    };
    res.json(response);
  } catch (error) {
    console.error('Error listing models:', error);
    res.status(500).json(errorResponse('Failed to list models', 'server_error'));
  }
});

// GET /v1/models/:model - Get specific model
app.get('/v1/models/:model', async (req: Request, res: Response) => {


 ... (clipped 607 lines)
DoS via SSE

Description: The newly added Assistants API CRUD and run execution endpoints (including SSE streaming
at /v1/threads/:thread_id/runs and
/v1/threads/:thread_id/runs/:run_id/submit_tool_outputs) accept untrusted client input and
can be invoked repeatedly without throttling/limits, creating a realistic
denial-of-service vector via many concurrent long-lived SSE connections and repeated run
creation/execution.
routes.ts [87-760]

Referred Code
router.post('/v1/assistants', (req: Request, res: Response) => {
  const body = req.body as CreateAssistantRequest;

  const validationError = validateRequired(body, ['model']);
  if (validationError) {
    return res.status(400).json(errorResponse(validationError, 'invalid_request_error', 'model'));
  }

  const assistant: Assistant = {
    id: state.generateAssistantId(),
    object: 'assistant',
    created_at: Math.floor(Date.now() / 1000),
    name: body.name ?? null,
    description: body.description ?? null,
    model: body.model,
    instructions: body.instructions ?? null,
    tools: body.tools ?? [],
    metadata: body.metadata ?? {}
  };

  state.createAssistant(assistant);


 ... (clipped 653 lines)
Ticket Compliance
🎫 No ticket provided
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🔴
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Missing audit logs: Critical CRUD actions over assistants/threads/messages/runs are performed without any
audit logging that includes actor identity, timestamped action context, and outcome.

Referred Code
// Create assistant
router.post('/v1/assistants', (req: Request, res: Response) => {
  const body = req.body as CreateAssistantRequest;

  const validationError = validateRequired(body, ['model']);
  if (validationError) {
    return res.status(400).json(errorResponse(validationError, 'invalid_request_error', 'model'));
  }

  const assistant: Assistant = {
    id: state.generateAssistantId(),
    object: 'assistant',
    created_at: Math.floor(Date.now() / 1000),
    name: body.name ?? null,
    description: body.description ?? null,
    model: body.model,
    instructions: body.instructions ?? null,
    tools: body.tools ?? [],
    metadata: body.metadata ?? {}
  };



 ... (clipped 614 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Missing input validation: External inputs are used without robust validation (e.g., prompt normalization allows
undefined and tool-call JSON parsing swallows parse errors), which can cause undefined
behavior and opaque failures.

Referred Code
app.post<{}, {}, CompletionRequest>('/v1/completions', async (req: Request, res: Response) => {
  const { model, prompt, stream, ...rest } = req.body;

  // Normalize prompt to string
  const promptText = Array.isArray(prompt) ? prompt.join('\n') : prompt;

  // Remove vendor prefixes
  const cleanModel = model.split('/').pop()!;

  // Convert to chat completion request
  const chatRequest: ChatCompletionRequest = {
    model: cleanModel,
    messages: [{ role: 'user', content: promptText }],
    stream: stream ?? false
  };

  if (stream) {
    res.setHeader('Content-Type', 'text/event-stream');
    res.setHeader('Cache-Control', 'no-cache');
    res.setHeader('Connection', 'keep-alive');



 ... (clipped 176 lines)


Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
Leaks internal errors: Streaming error events and run last_error propagate error.message to clients, potentially
exposing internal implementation details instead of a generic user-facing message.

Referred Code
} catch (error) {
  console.error('Run execution error:', error);
  state.updateRun(threadId, runId, {
    status: 'failed',
    failed_at: Math.floor(Date.now() / 1000),
    last_error: {
      code: 'server_error',
      message: error instanceof Error ? error.message : 'Unknown error'
    }
  });

  if (streaming) {
    yield createEvent('error', {
      error: {
        message: error instanceof Error ? error.message : 'Unknown error',
        code: 'server_error'
      }
    });
    yield createEvent('thread.run.failed', state.getRun(threadId, runId));
    yield createEvent('done', '[DONE]');
  }


Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
No authz checks: The newly added public CRUD endpoints (including destructive operations and run
execution/cancellation) implement no authentication/authorization or caller identity
checks, allowing any caller to read/write/delete state.

Referred Code
// Create assistant
router.post('/v1/assistants', (req: Request, res: Response) => {
  const body = req.body as CreateAssistantRequest;

  const validationError = validateRequired(body, ['model']);
  if (validationError) {
    return res.status(400).json(errorResponse(validationError, 'invalid_request_error', 'model'));
  }

  const assistant: Assistant = {
    id: state.generateAssistantId(),
    object: 'assistant',
    created_at: Math.floor(Date.now() / 1000),
    name: body.name ?? null,
    description: body.description ?? null,
    model: body.model,
    instructions: body.instructions ?? null,
    tools: body.tools ?? [],
    metadata: body.metadata ?? {}
  };



 ... (clipped 407 lines)


Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status:
Generic identifiers: Several generic names (e.g., body, rest, parsed, calls, match) reduce clarity in complex
parsing/translation logic and may warrant refactoring for readability.

Referred Code
// POST /v1/completions - Wrap as chat completion
app.post<{}, {}, CompletionRequest>('/v1/completions', async (req: Request, res: Response) => {
  const { model, prompt, stream, ...rest } = req.body;

  // Normalize prompt to string
  const promptText = Array.isArray(prompt) ? prompt.join('\n') : prompt;

  // Remove vendor prefixes
  const cleanModel = model.split('/').pop()!;

  // Convert to chat completion request
  const chatRequest: ChatCompletionRequest = {
    model: cleanModel,
    messages: [{ role: 'user', content: promptText }],
    stream: stream ?? false
  };

  if (stream) {
    res.setHeader('Content-Type', 'text/event-stream');
    res.setHeader('Cache-Control', 'no-cache');
    res.setHeader('Connection', 'keep-alive');


 ... (clipped 425 lines)


Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Unstructured error logs: The new code logs raw error objects via console.error(...)/console.warn(...) without
structured logging or redaction, which may leak sensitive request or model data depending
on upstream error contents.

Referred Code
  } catch (error) {
    console.error('Error listing models:', error);
    res.status(500).json(errorResponse('Failed to list models', 'server_error'));
  }
});

// GET /v1/models/:model - Get specific model
app.get('/v1/models/:model', async (req: Request, res: Response) => {
  try {
    const models = await getAvailableModels();
    const model = models.find(m => m.family === req.params.model);

    if (!model) {
      return res.status(404).json(
        errorResponse(`Model '${req.params.model}' not found`, 'invalid_request_error', 'model', 'model_not_found')
      );
    }

    const response: ModelObject = {
      id: model.family,
      object: 'model',


 ... (clipped 427 lines)


Compliance status legend:
🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label


qodo-code-review bot commented Jan 28, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

Possible issue
Implement streaming logic for tool continuations

Implement the missing streaming logic in the continueRunWithToolOutputs function
to correctly handle async iterators from processChatRequest and prevent a
runtime error.

src/assistants/runner.ts [742-760]

-// Non-streaming continuation (for now - streaming follows same pattern as executeRun)
-const response = await processChatRequest(request) as any;
+let fullContent = '';
+let promptTokens = 0;
+let completionTokens = 0;
 
-const responseContent = response.choices[0]?.message?.content;
-if (!responseContent) {
-  state.updateRun(threadId, runId, {
-    status: 'failed',
-    failed_at: Math.floor(Date.now() / 1000),
-    last_error: { code: 'server_error', message: 'Empty response from model' }
-  });
-  return;
+if (streaming) {
+  // Create message in progress
+  const assistantMessage: Message = {
+    id: messageId,
+    object: 'thread.message',
+    created_at: Math.floor(Date.now() / 1000),
+    thread_id: threadId,
+    status: 'in_progress',
+    role: 'assistant',
+    content: [{ type: 'text', text: { value: '', annotations: [] } }],
+    assistant_id: assistant.id,
+    run_id: runId,
+    // ... other fields
+  };
+  state.addMessage(threadId, assistantMessage);
+  yield createEvent('thread.message.created', assistantMessage);
+  yield createEvent('thread.message.in_progress', assistantMessage);
+
+  const streamIterator = await processChatRequest(request) as AsyncIterable<ChatCompletionChunk>;
+  let deltaIndex = 0;
+  for await (const chunk of streamIterator) {
+    // (Add cancellation check here)
+    const content = chunk.choices[0]?.delta?.content ?? '';
+    if (content) {
+      fullContent += content;
+      const delta: MessageDelta = {
+        id: messageId,
+        object: 'thread.message.delta',
+        delta: { content: [{ index: deltaIndex++, type: 'text', text: { value: content } }] }
+      };
+      yield createEvent('thread.message.delta', delta);
+    }
+  }
+  // Rough token estimation for streaming
+  completionTokens = fullContent.length;
+  promptTokens = chatMessages.reduce((sum, m) => sum + (typeof m.content === 'string' ? m.content.length : 0), 0);
+
+} else {
+  // Non-streaming continuation
+  const response = await processChatRequest(request) as any;
+
+  const responseContent = response.choices[0]?.message?.content;
+  if (!responseContent) {
+    state.updateRun(threadId, runId, {
+      status: 'failed',
+      failed_at: Math.floor(Date.now() / 1000),
+      last_error: { code: 'server_error', message: 'Empty response from model' }
+    });
+    return;
+  }
+
+  fullContent = typeof responseContent === 'string' 
+    ? responseContent 
+    : JSON.stringify(responseContent);
+
+  promptTokens = response.usage?.prompt_tokens ?? 0;
+  completionTokens = response.usage?.completion_tokens ?? fullContent.length;
 }
 
-fullContent = typeof responseContent === 'string' 
-  ? responseContent 
-  : JSON.stringify(responseContent);
-
-promptTokens = response.usage?.prompt_tokens ?? 0;
-completionTokens = response.usage?.completion_tokens ?? fullContent.length;
-
Suggestion importance[1-10]: 9


Why: The suggestion correctly identifies a significant bug where the streaming path in continueRunWithToolOutputs is unimplemented, which would cause a runtime error, and it provides a correct implementation pattern.

High
Flush buffer on complete

Modify the ToolCallBuffer.append method to return the buffered content as
safeContent when tool calls are complete, preventing data loss and ensuring text
is correctly flushed.

src/assistants/tools.ts [200-216]

 append(chunk: string): { safeContent: string; complete: boolean } {
   this.content += chunk;
   const openCount = (this.content.match(/<tool_call>/gi) || []).length;
   const closeCount = (this.content.match(/<\/tool_call>/gi) || []).length;
   this.inToolCall = openCount > closeCount;
   if (!this.inToolCall && openCount === closeCount) {
-    return { safeContent: '', complete: true };
+    const safe = this.content;
+    this.reset();
+    return { safeContent: safe, complete: true };
   }
   return { safeContent: '', complete: false };
 }
Suggestion importance[1-10]: 9


Why: This suggestion fixes a critical bug in the ToolCallBuffer where it would drop all non-tool-call content, leading to empty responses. The fix ensures all content is correctly processed.

High
Add null check to prevent crash

Add a null/undefined check at the beginning of extractMessageContent to prevent
a potential TypeError when processing messages that have no content.

src/extension.ts [168-175]

-function extractMessageContent(content: string | StructuredMessageContent[]): string {
+function extractMessageContent(content: string | StructuredMessageContent[] | null | undefined): string {
+  if (content === null || content === undefined) {
+    return '';
+  }
   if (typeof content === 'string') {
     return content;
   }
   return content.map(item => item.text).join('\n');
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8


Why: The suggestion correctly identifies that the extractMessageContent function will crash if passed null or undefined content, which is a valid scenario, and provides a simple fix to prevent the error.

Medium
Fix multi-line SSE data parsing

Correct the Server-Sent Events (SSE) parsing logic to properly handle fragmented
messages by concatenating only the content of data: lines, preventing JSON
decoding errors.

client/copilot_proxy.py [53-55]

 elif line.startswith("data:"):
-    # append JSON after "data:"
-    buffer += (line + "\n")
+    # append JSON part after "data:"
+    buffer += line[len("data:"):].strip()
Suggestion importance[1-10]: 8

__

Why: This suggestion fixes a significant bug in the SSE parsing logic that would cause JSON decoding to fail for fragmented messages, making the client more robust.

Medium
Remove racy run cancellation timeout

Remove the setTimeout block from the run cancellation route handler to eliminate
a race condition and rely on the executeRun function for the final state
transition to cancelled.

src/assistants/routes.ts [504-512]

-// After a short delay, mark as cancelled
-setTimeout(() => {
-  const currentRun = state.getRun(thread_id, run_id);
-  if (currentRun?.status === 'cancelling') {
-    state.updateRun(thread_id, run_id, { status: 'cancelled' });
-  }
-}, 100);
-
 res.json(updated);
Suggestion importance[1-10]: 7


Why: The suggestion correctly identifies a race condition in the run cancellation logic and proposes removing the unreliable setTimeout to rely on the robust state handling within the run executor.

Medium
Align tool choice with spec

Align the tool_choice object shape in CreateResponseRequest with the OpenAI
specification by nesting the function name within a function object, matching
the ChatCompletionRequest interface.

src/types.ts [204-218]

 export interface CreateResponseRequest {
   model: string;
   input: string | ResponseInputItem[];
   instructions?: string;
   stream?: boolean;
   temperature?: number;
   max_output_tokens?: number;
   top_p?: number;
   store?: boolean;
   metadata?: Record<string, string>;
   tools?: FunctionTool[];
-  tool_choice?: 'none' | 'auto' | 'required' | { type: 'function'; name: string };
+  tool_choice?: 'none' | 'auto' | 'required' | { type: 'function'; function: { name: string } };
   previous_response_id?: string;
   parallel_tool_calls?: boolean;
 }
Suggestion importance[1-10]: 7


Why: The suggestion correctly identifies an inconsistency between the CreateResponseRequest and ChatCompletionRequest interfaces, which would likely cause validation errors and bugs.

Medium
General
Use object for tool arguments

Update the ToolCall interface to use Record<string, unknown> for the arguments
field instead of string to improve type safety and avoid double JSON
stringification.

src/types.ts [43-49]

 export interface ToolCall {
   id: string;
   type: 'function';
   function: {
     name: string;
-    arguments: string;
+    arguments: Record<string, unknown>;
   };
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 6


Why: The suggestion improves type safety and code clarity by changing arguments from a string to an object, which avoids unnecessary JSON stringification and aligns the type with its actual usage.

Low
Enforce case-sensitive tool call parsing

Remove the case-insensitive (i) flag from the tool call parsing regular
expression to enforce the strict, case-sensitive format specified in the system
prompt.

src/assistants/tools.ts [113]

-const regex = /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/gi;
+const regex = /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/g;
Suggestion importance[1-10]: 5


Why: The suggestion correctly points out that the regex should be case-sensitive to match the strict format instructed in the prompt, improving parsing accuracy.

Low


Copilot AI left a comment


Pull request overview

This PR significantly expands the copilot-proxy extension to provide broader OpenAI API compatibility. It adds support for tools/function calling, implements the Assistants API, and includes legacy completions and models endpoints. The extension now auto-starts the server and persists assistant state.

Changes:

  • Added comprehensive OpenAI-compatible type definitions including tools, embeddings, models, responses, and assistants APIs
  • Implemented full Assistants API with threads, messages, runs, and tool calling via prompt engineering
  • Added legacy completions endpoint, models listing, and embeddings stub
  • Introduced state persistence for assistants with debounced auto-save
  • Changed extension behavior to auto-start server on activation
  • Added web-based chat client with Flask backend

Reviewed changes

Copilot reviewed 25 out of 27 changed files in this pull request and generated 21 comments.

Show a summary per file
File Description
src/types.ts Added extensive type definitions for OpenAI API compatibility including tools, assistants, embeddings, and responses
src/server.ts Expanded with models endpoints, completions API, responses API, tool parsing, and assistants router integration
src/extension.ts Added state persistence, auto-start behavior, model listing command, and improved system message handling
src/assistants/*.ts New comprehensive assistants module with types, state management, tool utilities, run execution engine, and routes
package.json Updated version, commands, engine requirements, removed vsce dependency, changed activation events
client/* New web client with HTML/CSS/JS frontend and Flask proxy backend, plus Python streaming client
package-lock.json Deleted (breaks reproducible builds)
docs/specs/archive/* Removed planning documents
Comments suppressed due to low confidence (1)

src/extension.ts:180

  • The extractMessageContent function now handles null and undefined values, but there's a potential issue: when content is an array of StructuredMessageContent, the function returns the concatenated text from items. However, in the types, StructuredMessageContent has a type property, but the function assumes it has a text property. This could lead to undefined being returned for valid structured content that doesn't match the expected shape.
function extractMessageContent(content: string | StructuredMessageContent[] | null | undefined): string {
  if (content === null || content === undefined) {
    return '';
  }
  if (typeof content === 'string') {
    return content;
  }
  if (Array.isArray(content)) {
    return content.map(item => item.text).join('\n');
  }
  return String(content);
}
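A defensive variant along the lines this comment suggests would only concatenate items that actually carry a string text field. The sketch below is illustrative, not the PR's code; the StructuredMessageContent shape and the function name are assumptions:

```typescript
// Hypothetical defensive rewrite: items without a string `text` field are
// skipped instead of producing the literal string "undefined" in the output.
interface StructuredMessageContent {
  type: string;
  text?: string;
}

function extractMessageContentSafe(
  content: string | StructuredMessageContent[] | null | undefined
): string {
  if (content === null || content === undefined) {
    return '';
  }
  if (typeof content === 'string') {
    return content;
  }
  if (Array.isArray(content)) {
    const parts: string[] = [];
    for (const item of content) {
      // Only keep items whose text is actually a string.
      if (typeof item.text === 'string') {
        parts.push(item.text);
      }
    }
    return parts.join('\n');
  }
  return String(content);
}
```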



src/extension.ts Outdated
Comment on lines 88 to 93
// Auto-start the server on extension activation
if (!serverInstance) {
  const configPort = vscode.workspace.getConfiguration("copilotProxy").get("port", 3000);
  serverInstance = startServer(configPort);
  outputChannel.appendLine(`Express server auto-started on port ${configPort}.`);
}
Copilot AI Jan 28, 2026

The server now auto-starts on extension activation (lines 89-93), which is a significant behavior change. This could cause issues if the configured port is already in use, potentially blocking the extension from loading. Consider adding error handling and user notification if the server fails to start, or making auto-start optional via configuration.
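One way to surface the failure mode this comment describes is a small helper that classifies the listen error before notifying the user. This is a sketch, not the PR's code: `describeServerError` and the `ErrnoLike` shape are hypothetical, and the wiring comment assumes `startServer` returns a Node `http.Server`.

```typescript
// Minimal error shape, mirroring Node's ErrnoException without requiring @types/node.
interface ErrnoLike {
  code?: string;
  message: string;
}

// Hypothetical helper: map common bind failures to user-facing messages.
function describeServerError(err: ErrnoLike, port: number): string {
  if (err.code === 'EADDRINUSE') {
    return `Port ${port} is already in use; the proxy server was not started.`;
  }
  if (err.code === 'EACCES') {
    return `Insufficient permissions to bind port ${port}.`;
  }
  return `Proxy server failed to start on port ${port}: ${err.message}`;
}

// Wiring sketch inside activate() (assumption: startServer returns http.Server):
//   serverInstance = startServer(configPort);
//   serverInstance.on('error', err =>
//     vscode.window.showErrorMessage(describeServerError(err, configPort)));
```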

src/server.ts Outdated
Comment on lines 199 to 238
function parseToolCalls(content: string, tools: FunctionTool[]): { text: string; toolCalls: ResponseFunctionCallItem[] } {
  const toolCalls: ResponseFunctionCallItem[] = [];
  let remainingText = content;

  // Look for JSON-formatted tool calls in the response
  // Common patterns: <tool_call>, ```json, or direct JSON objects
  const toolCallPatterns = [
    /<tool_call>([\s\S]*?)<\/tool_call>/g,
    /```(?:json)?\s*\n?({[\s\S]*?"name"[\s\S]*?"arguments"[\s\S]*?})\s*\n?```/g,
    /\{\s*"tool_calls?"\s*:\s*\[([\s\S]*?)\]\s*\}/g
  ];

  for (const pattern of toolCallPatterns) {
    let match;
    while ((match = pattern.exec(content)) !== null) {
      try {
        let parsed = JSON.parse(match[1] || match[0]);

        // Handle both single tool call and array of tool calls
        const calls = Array.isArray(parsed) ? parsed : (parsed.tool_calls || [parsed]);

        for (const call of calls) {
          if (call.name && tools.some(t => t.function.name === call.name)) {
            const toolCall: ResponseFunctionCallItem = {
              type: 'function_call',
              id: generateId('fc'),
              call_id: generateId('call'),
              name: call.name,
              arguments: typeof call.arguments === 'string' ? call.arguments : JSON.stringify(call.arguments || {}),
              status: 'completed'
            };
            toolCalls.push(toolCall);
            remainingText = remainingText.replace(match[0], '').trim();
          }
        }
      } catch (e) {
        // Not valid JSON, continue
      }
    }
  }
Copilot AI Jan 28, 2026

The parseToolCalls function uses regular expressions with global flags but doesn't reset lastIndex between iterations or patterns. When using .exec() with global regex, the lastIndex property persists across calls, which can cause the regex to miss matches or behave unexpectedly. Consider using String.prototype.matchAll() or resetting the regex between uses.
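The pitfall described here can be reproduced in isolation. A small sketch (the pattern and inputs are illustrative, not taken from the PR):

```typescript
// A /g regex carries its scan position in lastIndex across exec() calls,
// so reusing one regex object against a new string can silently skip matches.
const reused = /<tool_call>/g;
const first = reused.exec('xxxx<tool_call>'); // matches at index 4, lastIndex advances to 15
const second = reused.exec('<tool_call>');    // scanning starts at stale lastIndex 15 -> null

// matchAll() avoids the shared state: each call returns a fresh iterator
// and never mutates the regex between uses.
function extractCalls(input: string): string[] {
  return [...input.matchAll(/<tool_call>([\s\S]*?)<\/tool_call>/g)].map(m => m[1]);
}
```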

src/server.ts Outdated
              status: 'completed'
            };
            toolCalls.push(toolCall);
            remainingText = remainingText.replace(match[0], '').trim();
Copilot AI Jan 28, 2026

The tool parsing logic removes matched tool calls from the content by calling replace(match[0], '') on line 231. However, remainingText is defined once at the start (line 201) and never reassigned when tool calls are found. This means the text content isn't actually being stripped of tool call markers as intended - the replace operation is performed on a variable that's immediately reassigned.

src/server.ts Outdated
            }
          };
          toolCalls.push(toolCall);
          remainingText = remainingText.replace(match[0], '').trim();
Copilot AI Jan 28, 2026

The parseChatToolCalls function has the same issue as parseToolCalls - the remainingText variable at line 523 is initialized but line 550 performs a replace operation that doesn't get assigned back to remainingText. This means tool call markers won't be removed from the returned text content.

Comment on lines 191 to 240
export class ToolCallBuffer {
  private content: string = '';
  private inToolCall: boolean = false;
  private toolCallDepth: number = 0;

  /**
   * Add content to buffer
   * Returns content that can be safely emitted (not part of a tool call)
   */
  append(chunk: string): { safeContent: string; complete: boolean } {
    this.content += chunk;

    // Check for tool call markers
    const openCount = (this.content.match(/<tool_call>/gi) || []).length;
    const closeCount = (this.content.match(/<\/tool_call>/gi) || []).length;

    this.inToolCall = openCount > closeCount;

    if (!this.inToolCall && openCount === closeCount) {
      // All tool calls are complete (or there are none)
      return { safeContent: '', complete: true };
    }

    // We're in the middle of a tool call, don't emit anything yet
    return { safeContent: '', complete: false };
  }

  /**
   * Get the full accumulated content
   */
  getContent(): string {
    return this.content;
  }

  /**
   * Check if we're currently inside a tool call block
   */
  isInToolCall(): boolean {
    return this.inToolCall;
  }

  /**
   * Reset the buffer
   */
  reset(): void {
    this.content = '';
    this.inToolCall = false;
    this.toolCallDepth = 0;
  }
}
Copilot AI Jan 28, 2026

The ToolCallBuffer class tracks tool call depth but never uses the toolCallDepth property (line 194). It's incremented and reset but not used in any logic. Either implement proper depth tracking for nested tool calls or remove the unused property to avoid confusion.
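One way to make depth tracking meaningful is to maintain the nesting level incrementally and emit depth-0 text as it streams, holding back a short tail in case a marker is split across chunks. The sketch below is illustrative only; the class name and API are hypothetical and not the PR's implementation:

```typescript
// Illustrative sketch: incremental depth tracking that actually gates
// streaming output on <tool_call> nesting.
class DepthToolCallBuffer {
  private depth = 0;    // current <tool_call> nesting level
  private buffer = '';  // text not yet classified as safe or tool-call

  /** Feed a chunk; returns text that is provably outside any tool call. */
  append(chunk: string): string {
    this.buffer += chunk;
    let safe = '';
    let cursor = 0;
    const marker = /<\/?tool_call>/g;
    let m: RegExpExecArray | null;
    while ((m = marker.exec(this.buffer)) !== null) {
      if (this.depth === 0) {
        safe += this.buffer.slice(cursor, m.index);
      }
      this.depth += m[0] === '<tool_call>' ? 1 : -1;
      cursor = m.index + m[0].length;
    }
    const tail = this.buffer.slice(cursor);
    // Hold back enough characters to cover a partially streamed marker.
    const holdback = '</tool_call>'.length - 1;
    if (this.depth === 0 && tail.length > holdback) {
      safe += tail.slice(0, tail.length - holdback);
      this.buffer = tail.slice(tail.length - holdback);
    } else {
      this.buffer = tail;
    }
    return safe;
  }

  /** Flush trailing text once the stream ends. */
  flush(): string {
    const rest = this.depth === 0 ? this.buffer : '';
    this.buffer = '';
    return rest;
  }
}
```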


// POST /v1/responses - Create a model response (new OpenAI API)
app.post<{}, {}, CreateResponseRequest>('/v1/responses', async (req, res) => {
  const { model, input, instructions, stream, temperature, max_output_tokens, metadata, tools, tool_choice } = req.body;
Copilot AI Jan 28, 2026

Unused variable tool_choice.

Suggested change
const { model, input, instructions, stream, temperature, max_output_tokens, metadata, tools, tool_choice } = req.body;
const { model, input, instructions, stream, temperature, max_output_tokens, metadata, tools } = req.body;


app.post<{}, {}, ChatCompletionRequest>('/v1/chat/completions', async (req, res) => {
const { model, stream } = req.body;
const { model, stream, tools, tool_choice } = req.body;
Copilot AI Jan 28, 2026

Unused variable tool_choice.

Suggested change
const { model, stream, tools, tool_choice } = req.body;
const { model, stream, tools } = req.body;

else if(choices.length && choices[0].delta) content = choices.map(c=>c.delta?.content||'').join('');
else if(data.text) content = data.text;
else content = JSON.stringify(data);
}catch(e){ content = JSON.stringify(data) }
Copilot AI Jan 28, 2026

Avoid automated semicolon insertion (96% of all statements in the enclosing function have an explicit semicolon).

Suggested change
}catch(e){ content = JSON.stringify(data) }
}catch(e){ content = JSON.stringify(data); }

    return send_from_directory('web', 'index.html')

@app.route('/api/chat', methods=['POST'])
def api_chat():
Copilot AI Jan 28, 2026

api_chat returns a tuple of size 2 on one code path and a tuple of size 3 on another. Flask interprets these shapes differently ((body, status) vs. (body, status, headers)), so the return values should use a consistent shape.



@app.route('/api/models', methods=['GET'])
def api_models():
Copilot AI Jan 28, 2026

api_models returns a tuple of size 2 on one code path and a tuple of size 3 on another. Flask interprets these shapes differently ((body, status) vs. (body, status, headers)), so the return values should use a consistent shape.

@hyorman hyorman closed this Feb 5, 2026
@hyorman hyorman deleted the some-improvements branch February 5, 2026 20:51
@hyorman hyorman restored the some-improvements branch February 5, 2026 20:51