Hi Nemanja,
Thanks for reporting this.
In Azure AI OpenAI services, quotas are tracked per model, per region, and they can sometimes reset or auto-adjust when backend capacity changes, especially for newer or high-demand regions/models like GPT-4-o.
A few points that might help clarify the situation:
- Quota enforcement is region-specific: Each model deployment has its own quota limits in each region, and they do not aggregate across regions. Even if a quota increase shows successfully applied, the portal can still reset the quota if that capacity is not actually available for your subscription in that region. Microsoft Learn
- Automatic quota adjustment: Azure may adjust available quota behind the scenes based on capacity availability or regional constraints, particularly for high-traffic regions like Sweden Central. This can sometimes result in the appearance that a quota is “reset” without a clear notification. Tracking quota limits via the Azure portal’s Quotas blade can help verify what’s actually assigned at any moment.
- Not isolated to one region: As you noted seeing the same in France, this behavior can occur in multiple regions where demand exceeds provisioned capacity for a specific model version.
Please share any logs or error messages you see after the reset, and we can help you outline the support ticket if needed.
Thankyou!