
Eval bug: Automatic parser generation failed #11

@Edgar-I

Description


Name and Version

/media/SSD_2T/Projects/llama_stepfun.cpp/build/bin/llama-cli --version
ggml_cuda_init: found 4 ROCm devices:
Device 0: AMD Instinct MI50/MI60, gfx906:sramecc+:xnack- (0x906), VMM: no, Wave Size: 64
Device 1: AMD Instinct MI50/MI60, gfx906:sramecc+:xnack- (0x906), VMM: no, Wave Size: 64
Device 2: AMD Instinct MI50/MI60, gfx906:sramecc+:xnack- (0x906), VMM: no, Wave Size: 64
Device 3: AMD Instinct MI50/MI60, gfx906:sramecc+:xnack- (0x906), VMM: no, Wave Size: 64
version: 8039 (e384c6f)
built with GNU 13.3.0 for Linux x86_64

Compiled with

HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)"
rm -rf build &&
cmake -S . -B build -DGGML_HIP=ON -DLLAMA_BUILD_SERVER=ON -DLLAMA_BUILD_EXAMPLES=ON -DGPU_TARGETS=gfx906 -DCMAKE_BUILD_TYPE=Release &&
cmake --build build --config Release -- -j 32

Run with

LLAMA_SET_ROWS=1 /media/SSD_2T/Projects/llama_stepfun.cpp/build/bin/llama-server \
    --model /media/SSD_2T/models/Step3.5/Step-3.5-Flash-IQ4_XS-00001-of-00004.gguf \
    --ctx-size 130000 \
    --temp 1.0 \
    --repeat-penalty 1.0 \
    --min-p 0.01 \
    --spec-type ngram-mod --spec-ngram-size-n 24 --draft-min 48 --draft-max 64 \
    --batch-size 1928 \
    --host 0.0.0.0 --port 1235

Operating systems

Linux

GGML backends

HIP

Hardware

MI50 32GB * 4, DDR4 512GB, Epyc 7532

Models

https://huggingface.co/ubergarm/Step-3.5-Flash-GGUF/tree/main/IQ4_XS

This mainline-compatible quant does not use an imatrix.

Problem description & steps to reproduce

llama-server crashes instantly when launched with the command above.

First Bad Commit

No response

Relevant log output


error_log.txt
