google cloud platform - Vertex AI quota limit error despite my usage not near the limit - Stack Overflow

I'm trying to trigger a Vertex AI custom job with this Python script:

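from google.cloud import aiplatform

# timestamp, TEXT_PROCESSOR_IMAGE, bucket_name and file_name are assumed to be
# defined elsewhere in the script.
print("Creating custom job...")
job = aiplatform.CustomJob(
    display_name=f"process-text-{timestamp}",
    worker_pool_specs=[{
        "machine_spec": {
            "machine_type": "n1-standard-4",
        },
        "replica_count": 1,
        "container_spec": {
            "image_uri": TEXT_PROCESSOR_IMAGE,
            "args": [bucket_name, file_name]
        },
    }]
)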

I'm getting an error:

grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.RESOURCE_EXHAUSTED
    details = "The following quota metrics exceed quota limits: aiplatform.googleapis.com/custom_model_training_cpus"
    debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.152.95:443 {created_time:"2025-02-24T13:58:36.944174121+00:00", grpc_status:8, grpc_message:"The following quota metrics exceed quota limits: aiplatform.googleapis.com/custom_model_training_cpus"}"
>

The above exception was the direct cause of the following exception:

    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.ResourceExhausted: 429 The following quota metrics exceed quota limits: aiplatform.googleapis.com/custom_model_training_cpus

From this page in the docs I thought that I had to increase my quota:

https://cloud.google.com/vertex-ai/docs/training/neural-architecture-search/environment-setup?_gl=1

However, in the GCP console, when I go to https://console.cloud.google.com/iam-admin/quotas and sort by "Current usage percentage", nothing is over 70%.

Following the "Request additional device quota for the project" section of the docs more closely, I still don't see any resources that are close to my limits:

[Screenshot of the quotas page]

asked Feb 24 at 14:30 by Evanss, edited Mar 3 at 17:38 by desertnaut
  • In the screenshot above, try filtering by the metric aiplatform.googleapis.com/custom_model_training_cpus. Check this image. – Raghavendra N, commented Mar 2 at 8:29
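
To filter the Quotas page as the comment suggests, you need the exact metric name from the error. The following is a minimal sketch, not part of the original post, that pulls metric names out of a ResourceExhausted message. The helper name extract_quota_metrics is hypothetical, and the sketch assumes the message keeps the "exceed quota limits:" wording shown in the traceback.

```python
import re

# Matches tokens shaped like "<service>/<metric>", e.g. the name in the traceback.
_METRIC_RE = re.compile(r"[A-Za-z0-9_.-]+/[A-Za-z0-9_]+")


def extract_quota_metrics(message: str) -> list[str]:
    """Return the quota metric names listed after 'exceed quota limits:'.

    Hypothetical helper: it relies only on the message wording seen in the
    question and returns an empty list when that wording is absent.
    """
    _, _, tail = message.partition("exceed quota limits:")
    return _METRIC_RE.findall(tail)


if __name__ == "__main__":
    msg = (
        "429 The following quota metrics exceed quota limits: "
        "aiplatform.googleapis.com/custom_model_training_cpus"
    )
    for metric in extract_quota_metrics(msg):
        print(metric)
```

In practice you would pass str(exc) from the caught exception and paste the printed name into the filter box on the Quotas page.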
1 Answer

You used the machine type n1-standard-4, which is only available in the regions listed on this website. When setting up aiplatform (i.e., when calling aiplatform.init), select a region where this machine type is available using the location parameter. Google's documentation on CPU quotas may also help: the quota is a total for a given project and region, so usage that looks low in one region says nothing about another.
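
As a rough illustration of this advice, the sketch below pins the region explicitly before creating the job. It is not the asker's actual script: the function names, the project ID, the staging bucket and the choice of "us-central1" in the usage note are all assumptions for illustration, so substitute a region where your project has custom_model_training_cpus quota. The staging_bucket argument is included because CustomJob expects one to be configured.

```python
from datetime import datetime, timezone


def build_worker_pool_specs(image_uri, args, machine_type="n1-standard-4"):
    """Return the single-replica worker pool spec used in the question."""
    return [{
        "machine_spec": {"machine_type": machine_type},
        "replica_count": 1,
        "container_spec": {"image_uri": image_uri, "args": list(args)},
    }]


def run_custom_job(project, location, staging_bucket, image_uri, args):
    """Create and run the job in an explicitly chosen region (hypothetical helper)."""
    # Imported lazily so the spec helper above works without the SDK installed.
    from google.cloud import aiplatform

    # Quota is counted per project and per region, so set the region explicitly
    # instead of relying on the SDK default.
    aiplatform.init(
        project=project,
        location=location,
        staging_bucket=staging_bucket,
    )

    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    print("Creating custom job...")
    job = aiplatform.CustomJob(
        display_name=f"process-text-{timestamp}",
        worker_pool_specs=build_worker_pool_specs(image_uri, args),
    )
    job.run()
    return job
```

A call might look like run_custom_job("my-project", "us-central1", "gs://my-staging-bucket", TEXT_PROCESSOR_IMAGE, [bucket_name, file_name]), where every argument value is a placeholder. If the 429 persists, check the quota for that same region on the Quotas page.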