Description
I am using the meta-llama/Llama-2-70b-chat-hf model on a data frame with 3000 rows, each containing a ~500-token text. After about 10 rows are processed, I get the following error:
```
in call_llama2_api(self, messages)
     79 def call_llama2_api(self, messages):
     80     huggingface.prompt_builder = "llama2"
---> 81     response = huggingface.ChatCompletion.create(
     82         model="meta-llama/Llama-2-70b-chat-hf",
     83         messages=messages,

/usr/local/lib/python3.10/dist-packages/easyllm/clients/huggingface.py in create(messages, model, temperature, top_p, top_k, n, max_tokens, stop, stream, frequency_penalty, debug)
    205     generated_tokens = 0
    206     for _i in range(request.n):
--> 207         res = client.text_generation(
    208             prompt,
    209             details=True,

/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py in text_generation(self, prompt, details, stream, model, do_sample, max_new_tokens, best_of, repetition_penalty, return_full_text, seed, stop_sequences, temperature, top_k, top_p, truncate, typical_p, watermark, decoder_input_details)
   1063         decoder_input_details=decoder_input_details,
   1064     )
-> 1065     raise_text_generation_error(e)
   1066
   1067 # Parse output

/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_text_generation.py in raise_text_generation_error(http_error)
    472         raise IncompleteGenerationError(message) from http_error
    473     if error_type == "overloaded":
--> 474         raise OverloadedError(message) from http_error
    475     if error_type == "validation":
    476         raise ValidationError(message) from http_error

OverloadedError: Model is overloaded
```
Is there any way to fix this problem, for example by increasing the rate limit?
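The only workaround I can think of so far is retrying with exponential backoff whenever the endpoint reports it is overloaded. A minimal sketch (the wrapper name `call_with_backoff` and its parameters are hypothetical; `OverloadedError` is imported from the `huggingface_hub` module that appears in the traceback above):

```python
import time

from huggingface_hub.inference._text_generation import OverloadedError


def call_with_backoff(api, messages, max_retries=5, base_delay=2.0):
    """Retry the Llama 2 call while the model is overloaded."""
    for attempt in range(max_retries):
        try:
            # call_llama2_api is the method from my code in the traceback above
            return api.call_llama2_api(messages)
        except OverloadedError:
            # Wait 2 s, 4 s, 8 s, ... before retrying so the hosted
            # endpoint has time to shed load.
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"Model still overloaded after {max_retries} retries")
```

Even with backoff, this slows processing of 3000 rows considerably, so a way to raise the rate limit itself would be preferable.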