I created a fresh project in Azure AI using the portal, with all new Azure resources associated with the project.
I created a deployment of gpt-4o-mini with the default deployment options.
Deployment type: Global Standard
Rate limit (Tokens per minute): 8,000
Rate limit (Requests per minute): 80
Provisioning state: Succeeded
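For reference, here is a minimal sketch of what these two limits work out to, assuming the quota is simply counted per minute. I don't know how the service actually meters agent runs, and the function names and the 1,000-token request size are my own, chosen only for illustration.

```python
# Deployment limits as shown in the portal (taken from the settings above).
TOKENS_PER_MINUTE = 8_000
REQUESTS_PER_MINUTE = 80


def tokens_per_request_at_full_rate(tpm: int, rpm: int) -> float:
    """Average tokens each request may use if every allowed request is sent."""
    return tpm / rpm


def requests_per_minute_for_size(tpm: int, rpm: int, tokens_per_request: int) -> int:
    """Requests per minute that fit under both limits for a given request size."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(rpm, tpm // tokens_per_request)


if __name__ == "__main__":
    print(tokens_per_request_at_full_rate(TOKENS_PER_MINUTE, REQUESTS_PER_MINUTE))
    # 1,000 tokens per request is a hypothetical size, not a measured value.
    print(requests_per_minute_for_size(TOKENS_PER_MINUTE, REQUESTS_PER_MINUTE, 1_000))
```

Under that assumption, the token limit rather than the request limit is the tighter one for any request larger than 100 tokens.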
Looking at the metrics of this model deployment, I see:
As you can see, I am far from any of the rate limits.
I then created a new Agent in the portal (using the Assistants API within the Azure AI Agent Service preview).
I started a playground session for this agent in the portal. A thread was created, and I entered some user input (any input will do).
As you can see, I am getting an error message claiming I have hit a rate limit.
This makes no sense to me. When I start a playground session directly on the same model deployment and use the model there, I see no rate-limiting issues.
What is going on here? I don't seem to be able to make any progress with AI Agent Service. What am I missing?
Thank you.