Skip to content

Can't run glm-4-9b-chat on cuda 12 #18

@alabulei1

Description

@alabulei1

When I run glm-4-9b-chat-Q5_K_M.gguf on the Cuda 12 machine, the API server can be started successfully. However, when I send a question, the API server will crash.

The command I used to start the API server is as follows:

wasmedge --dir .:. --nn-preload default:GGML:AUTO:glm-4-9b-chat-Q5_K_M.gguf \
  llama-api-server.wasm \
  --prompt-template glm-4-chat \
  --ctx-size 4096 \
  --model-name glm-4-9b-chat

Here is the error message

[2024-07-11 07:50:12.036] [wasi_logging_stdout] [info] llama_core: llama_core::chat in llama-core/src/chat.rs:1315: Get the model metadata.
[2024-07-11 07:50:12.036] [wasi_logging_stdout] [error] llama_core: llama_core::chat in llama-core/src/chat.rs:1349: The model `internlm2_5-7b-chat` does not exist in the chat graphs.
[2024-07-11 07:50:12.036] [wasi_logging_stdout] [error] chat_completions_handler: llama_api_server::backend::ggml in llama-api-server/src/backend/ggml.rs:392: Failed to get chat completions. Reason: The model `internlm2_5-7b-chat` does not exist in the chat graphs.
[2024-07-11 07:50:12.036] [wasi_logging_stdout] [error] response: llama_api_server::error in llama-api-server/src/error.rs:25: 500 Internal Server Error: Failed to get chat completions. Reason: The model `internlm2_5-7b-chat` does not exist in the chat graphs.
[2024-07-11 07:50:12.036] [wasi_logging_stdout] [info] chat_completions_handler: llama_api_server::backend::ggml in llama-api-server/src/backend/ggml.rs:399: Send the chat completion response.
[2024-07-11 07:50:12.036] [wasi_logging_stdout] [error] response: llama_api_server in llama-api-server/src/main.rs:518: version: HTTP/1.1, body_size: 133, status: 500, is_informational: false, is_success: false, is_redirection: false, is_client_error: false, is_server_error: true

Versions

[2024-07-11 07:47:51.368] [wasi_logging_stdout] [info] server_config: llama_api_server in llama-api-server/src/main.rs:131: server version: 0.12.3

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions