I call ChatGPT from Python using openai_client.beta.chat.completions.parse(..., response_format=MyClass).
It spits back a giant response comprising some 750 tokens, but I'm only interested in response.choices[0].message.parsed, which is a more modest 300 tokens. While having all the extra junk doesn't hurt the code, it does hurt my wallet.
Is there a way to just get the parsed message?
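For reference, here is roughly what the call looks like (the model name and schema are simplified placeholders; my real schema has more fields):

```python
from pydantic import BaseModel
from openai import OpenAI

class MyClass(BaseModel):
    # Placeholder schema standing in for my real one.
    summary: str
    keywords: list[str]

openai_client = OpenAI()
response = openai_client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",  # placeholder model name
    messages=[{"role": "user", "content": "..."}],
    response_format=MyClass,
)
print(response.choices[0].message.parsed)  # the ~300 tokens I actually want
```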
1 Answer
You can’t tell OpenAI’s API “only bill me for a subset of the response.” You are billed for every token the model generates, including tokens you never end up using. If you truly want only the function call/parsed data, without the extra “explanatory” text, you need the model to produce fewer tokens in the first place (e.g., by forcing a function call, or by instructing the model to respond with minimal or no text). Merely discarding extra text after the fact does not reduce token usage or cost.
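You can verify this from the response itself: the usage block reports the token counts that billing follows, regardless of how large the serialized response object looks. A minimal sketch, assuming a client and a Pydantic schema like the one in your question:

```python
from pydantic import BaseModel
from openai import OpenAI

class MyClass(BaseModel):  # placeholder schema standing in for yours
    summary: str

client = OpenAI()
response = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",  # placeholder model name
    messages=[{"role": "user", "content": "..."}],
    response_format=MyClass,
)

# Billing follows these counts, not the byte size of the response object.
print(response.usage.prompt_tokens)      # input tokens you paid for
print(response.usage.completion_tokens)  # generated tokens you paid for
print(response.usage.total_tokens)
```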
You have two options:
Instruct the model to return minimal output
- For example: “Return no other text besides the JSON for the function call.” The model may still produce some extraneous text, but usually much less if you emphasize it; see the sketch below.
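A sketch of this first option, reusing the client and MyClass schema from above; the system-message wording is illustrative, not magic:

```python
response = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",  # placeholder model name
    messages=[
        {"role": "system",
         "content": "Return only the JSON object. No explanation or extra text."},
        {"role": "user", "content": "..."},
    ],
    response_format=MyClass,
)
print(response.usage.completion_tokens)  # should now be closer to the size of the parsed JSON
```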
Use the function_call feature
- Define a function schema describing the JSON structure you want.
- Call the model with function_call={"name": "your_function"} so it returns only the JSON arguments in message.function_call.arguments.
- Reference: https://platform.openai.com/docs/guides/function-calling?api-mode=chat&example=get-weather
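A sketch of this second option using the functions/function_call parameters described above (newer SDK versions express the same idea with tools and tool_choice); the function name and schema are placeholders:

```python
import json
from openai import OpenAI

client = OpenAI()

functions = [{
    "name": "your_function",  # placeholder name, as in the bullet above
    "description": "Return the structured data.",
    "parameters": {
        "type": "object",
        "properties": {
            "summary": {"type": "string"},
            "keywords": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["summary", "keywords"],
    },
}]

response = client.chat.completions.create(
    model="gpt-4o-2024-08-06",  # placeholder model name
    messages=[{"role": "user", "content": "..."}],
    functions=functions,
    function_call={"name": "your_function"},  # force this exact function
)

# The model returns only the JSON arguments; there is no surrounding prose to pay for.
args = json.loads(response.choices[0].message.function_call.arguments)
print(args)
```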