Description
I wrote a prompt that basically said something like "scrape me the header of this site: sitename_here", and I got TENS of responses. As far as I know there is no way to control that, and you are billed for every token returned. It is not a small bill; it adds up fast. Why can't you instruct the assistant that a request of this specific type should take no more than, for example, 3 responses? You can limit tokens, as far as I know, but I don't want to do that. I also don't like the idea of getting literally TENS of short responses with similar content for a simple question and being billed for every token TENS of times (which also means being billed TENS of times for the prompt and for the instructions).

It is horrific design from OpenAI. In its current state I don't see any use for it for me as a software developer. I will look into the Chat Completions API now, as at first glance it looks more reliable.
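For reference, this is roughly what I mean by the Chat Completions alternative: one request produces one completion, and you can put a hard cap on the output size. A minimal sketch, assuming the official `openai` Python SDK; the model name and token limit here are just placeholders, not a recommendation.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One request, one completion: the prompt and the response are each
# billed once, instead of the prompt being re-billed for every one of
# the TENS of assistant responses.
response = client.chat.completions.create(
    model="gpt-4o-mini",   # placeholder model name
    max_tokens=300,        # hard cap on the size of the completion
    messages=[
        {
            "role": "user",
            "content": "Scrape me the header of this site: sitename_here",
        },
    ],
)

print(response.choices[0].message.content)
```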