/v1/chat/completions is no longer OpenAI Spec #383

@timothycarambat

Description

Foundry Version

0.8.117

In AnythingLLM we support Foundry Local and are building out more of a "first-party" experience that will carry over to our Desktop offering as well.

Previously, we were able to reuse the standard OpenAI response handler that works for many providers. After pulling the latest SDK, responses still stream to the UI but never close or report metrics. This appears to come from #369: some standard properties are missing, and there are new non-standard keys such as IsDelta that do not seem to mean anything or reflect the real state of the response.

Reproduction

curl --location 'http://127.0.0.1:51597/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen2.5-0.5b-instruct-generic-gpu:4",
    "messages": [
        {"role": "user", "content": "Hello! What is your name?"}
    ]
}'
{
    "model": "qwen2.5-0.5b-instruct-generic-gpu:4",
    "choices": [
        {
            "delta": {
                "role": "assistant",
                "content": "Hello! My name is Qwen, an AI language model created by Alibaba Cloud. I am here to assist you in various ways: answer questions, generate text content, engage in conversations with humans and other conversational agents, provide information on a wide range of topics, perform tasks that involve problem-solving or data analysis, etc. How can I help you today?",
                "tool_calls": []
            },
            "message": {
                "role": "assistant",
                "content": "Hello! My name is Qwen, an AI language model created by Alibaba Cloud. I am here to assist you in various ways: answer questions, generate text content, engage in conversations with humans and other conversational agents, provide information on a wide range of topics, perform tasks that involve problem-solving or data analysis, etc. How can I help you today?",
                "tool_calls": []
            },
            "index": 0,
            "finish_reason": "stop"
        }
    ],
    "created": 1768603693,
    "CreatedAt": "2026-01-16T22:48:13+00:00",
    "id": "chat.id.3",
    "IsDelta": false,
    "Successful": true,
    "HttpStatusCode": 0,
    "object": "chat.completion"
}
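
To make the deviation concrete, here is a rough sketch that diffs the keys of the response above against the fields usually documented for an OpenAI `chat.completion` object. The two field sets and the helper name `unexpected_keys` are assumptions made for illustration; they are not taken from the Foundry Local SDK or copied verbatim from the spec, and the message content is shortened.

```python
# Assumed field sets, paraphrased from the commonly documented OpenAI
# chat.completion object. Illustrative only, not authoritative.
SPEC_TOP_LEVEL = {"id", "object", "created", "model", "choices", "usage",
                  "system_fingerprint", "service_tier"}
SPEC_CHOICE = {"index", "message", "finish_reason", "logprobs"}


def unexpected_keys(response: dict) -> dict:
    """Return keys in the response that are not in the assumed field sets."""
    top = sorted(set(response) - SPEC_TOP_LEVEL)
    per_choice = sorted(
        {key for choice in response.get("choices", []) for key in choice}
        - SPEC_CHOICE
    )
    return {"top_level": top, "choice": per_choice}


# Trimmed copy of the non-streaming response above (content shortened).
response = {
    "model": "qwen2.5-0.5b-instruct-generic-gpu:4",
    "choices": [{
        "delta": {"role": "assistant", "content": "Hello!", "tool_calls": []},
        "message": {"role": "assistant", "content": "Hello!", "tool_calls": []},
        "index": 0,
        "finish_reason": "stop",
    }],
    "created": 1768603693,
    "CreatedAt": "2026-01-16T22:48:13+00:00",
    "id": "chat.id.3",
    "IsDelta": False,
    "Successful": True,
    "HttpStatusCode": 0,
    "object": "chat.completion",
}

report = unexpected_keys(response)
print(report)
print("usage present:", "usage" in response)
```

Run against the response above, this flags CreatedAt, HttpStatusCode, IsDelta, and Successful at the top level, plus delta inside a non-streaming choice, and shows that no usage object is present. The missing usage object may be related to metrics not being reported, though that is a guess.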

Streaming chunk

{
    "model": "qwen2.5-0.5b-instruct-generic-gpu:4",
    "choices": [
        {
            "delta": {
                "role": "assistant",
                "content": "!",
                "tool_calls": []
            },
            "message": {
                "role": "assistant",
                "content": "!",
                "tool_calls": []
            },
            "index": 0
        }
    ],
    "created": 1768604314,
    "CreatedAt": "2026-01-16T22:58:34+00:00",
    "id": "chat.id.4",
    "IsDelta": false, // <-- This is always false even when mid-stream
    "Successful": true,
    "HttpStatusCode": 0,
    "object": "chat.completion.chunk"
}
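
As a possible client-side stopgap, a handler could ignore IsDelta and infer chunk state from the `object` field, which does differ between the two payloads above. The sketch below is only an illustration: the helper names are hypothetical, and treating a non-null `finish_reason` as the close signal is an assumption about how a generic OpenAI-style handler behaves, not something confirmed from AnythingLLM or Foundry Local source.

```python
def is_stream_chunk(payload: dict) -> bool:
    """Infer chunk state from `object` instead of the IsDelta flag."""
    return payload.get("object") == "chat.completion.chunk"


def chunk_text(payload: dict) -> str:
    """Join delta content across choices; missing or null content is empty."""
    return "".join(
        (choice.get("delta") or {}).get("content") or ""
        for choice in payload.get("choices", [])
    )


def is_final(payload: dict) -> bool:
    """Assumed close condition: some choice carries a non-null finish_reason."""
    return any(c.get("finish_reason") for c in payload.get("choices", []))


# Copy of the streaming chunk above, without the inline comment.
chunk = {
    "model": "qwen2.5-0.5b-instruct-generic-gpu:4",
    "choices": [{
        "delta": {"role": "assistant", "content": "!", "tool_calls": []},
        "message": {"role": "assistant", "content": "!", "tool_calls": []},
        "index": 0,
    }],
    "created": 1768604314,
    "CreatedAt": "2026-01-16T22:58:34+00:00",
    "id": "chat.id.4",
    "IsDelta": False,
    "Successful": True,
    "HttpStatusCode": 0,
    "object": "chat.completion.chunk",
}

print("IsDelta flag:", chunk["IsDelta"])
print("inferred chunk:", is_stream_chunk(chunk))
print("text:", chunk_text(chunk))
print("final:", is_final(chunk))
```

For the chunk above, the inferred value disagrees with the IsDelta flag, which is the mismatch this issue describes. This only works around the flag; it does not restore any properties that are missing from the response.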
