This page mentions "rate limits".
What does the (current) 4M tokens/minute limit apply to? Is it per project? Per region? I am looking for a precise definition. To me, the online documentation mixes quotas with limits while also stating that these are distinct concepts.
This is what Gemini itself says about it. I want to double-check with a human:
It's per project. This means that the limit applies to all requests made to the model from a single Google Cloud project, regardless of the region.