
openai api - My AzureOpenAI batch job shows complete, but only a few rows (not all rows) are completed in the batch file, without errors


I uploaded a batch job with a unique custom_id for each row in my input file. The job gets validated but completes very quickly, and when I check it, only 276 of 4096 requests (as shown in the example below) are completed.

I'm unsure what is going wrong here. There is no error. I previously thought it might be a duplicate custom_id issue, but even after resolving that I still face the same problem.

Below is an example of the batch data. The status shows completed, yet request_counts=BatchRequestCounts(completed=276, failed=0, total=4096) shows that only 276 requests were completed.

Batch(id='batch_02d3c78a-ba97-40e3-8646-e83099ba5dbb', completion_window='24h', created_at=1742003026, endpoint='/chat/completions', input_file_id='file-083b8e3f2f024dc9a34a06d6014679c9', object='batch', status='completed', cancelled_at=None, cancelling_at=None, completed_at=1742003648, error_file_id='file-c93301a5-0199-479e-8660-71426f22ce2d', errors=None, expired_at=None, expires_at=1742089426, failed_at=None, finalizing_at=1742003528, in_progress_at=1742003279, metadata=None, output_file_id='file-34f7b43e-956c-4bc7-9d0a-6699db9333c6', request_counts=BatchRequestCounts(completed=276, failed=0, total=4096))

I was expecting the batch output to return all responses in a single file with 4096 rows, but only 276 came back in the output file.

My rate limits are also really high, so I don't think that's an issue.
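For reference, this is roughly how I check the error file referenced by error_file_id on the batch (a minimal sketch using the SDK's files.content call; `client` is the AzureOpenAI client from the pseudocode further below, and the hard-coded batch ID is the job shown above):

import json

# Minimal sketch: download the error file attached to a batch and print each
# recorded error row. `client` is the AzureOpenAI client shown further below;
# the batch ID is the job shown above.
batch = client.batches.retrieve("batch_02d3c78a-ba97-40e3-8646-e83099ba5dbb")

if batch.error_file_id:
    error_text = client.files.content(batch.error_file_id).text
    for line in error_text.splitlines():
        if not line.strip():
            continue
        row = json.loads(line)
        # Each row carries the custom_id plus error/response details for that request.
        print(row.get("custom_id"), row.get("error"), row.get("response"))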

I tried running GPT-4o-mini (Global Batch) on batches of ~5K training samples. Below is a single instance taken from one input file, which contains 4096 samples and is ~139 MB in size:

{
    "custom_id": "0_18_v2_fever_958611a10d8432c7ca51a59fca384dbc",
    "method": "POST",
    "url": "/chat/completions",
    "body": {
        "model": "gpt-4o-mini-batch",
        "messages": [
            {"role": "user", "content": "<my prompt here>"}
        ],
        "max_completion_tokens": 4096,
        "temperature": 0.1
    }
}
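To rule out the duplicate custom_id theory, I sanity-check the input JSONL (total rows and repeated IDs) before uploading. A minimal sketch, where "batch_input.jsonl" is a placeholder for my actual input file:

import json
from collections import Counter

# Minimal sketch: count rows in the batch input JSONL and flag duplicated
# custom_id values. "batch_input.jsonl" is a placeholder for the real file.
with open("batch_input.jsonl", "r", encoding="utf-8") as f:
    custom_ids = [json.loads(line)["custom_id"] for line in f if line.strip()]

duplicates = [cid for cid, n in Counter(custom_ids).items() if n > 1]
print(f"rows: {len(custom_ids)}, duplicate custom_ids: {len(duplicates)}")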

I also tried deploying a new model and uploading the files again, but I got the same result: almost exactly the same number of rows completed, roughly 270. Below are all the jobs run so far; only the very first run (the last one listed, with 5243/5243 completed) returned all the required rows.

Batch(id='batch_02d3c78a-ba97-40e3-8646-e83099ba5dbb', completion_window='24h', created_at=1742003026, endpoint='/chat/completions', input_file_id='file-083b8e3f2f024dc9a34a06d6014679c9', object='batch', status='completed', cancelled_at=None, cancelling_at=None, completed_at=1742003648, error_file_id='file-c93301a5-0199-479e-8660-71426f22ce2d', errors=None, expired_at=None, expires_at=1742089426, failed_at=None, finalizing_at=1742003528, in_progress_at=1742003279, metadata=None, output_file_id='file-34f7b43e-956c-4bc7-9d0a-6699db9333c6', request_counts=BatchRequestCounts(completed=276, failed=0, total=4096))
Batch(id='batch_15b67c2a-d941-43fa-94dd-d433ebdc94c4', completion_window='24h', created_at=1742003010, endpoint='/chat/completions', input_file_id='file-9dc4bf26713147aa98e19621fa0f907e', object='batch', status='completed', cancelled_at=None, cancelling_at=None, completed_at=1742003745, error_file_id='file-2be4eb9a-8abc-4312-801c-8e258fac607d', errors=None, expired_at=None, expires_at=1742089410, failed_at=None, finalizing_at=1742003631, in_progress_at=1742003279, metadata=None, output_file_id='file-417f3e4f-f299-4b3a-a195-827c8f4db1ca', request_counts=BatchRequestCounts(completed=265, failed=0, total=5000))
Batch(id='batch_9dccaf7c-1394-4778-aa07-aa02b280c772', completion_window='24h', created_at=1742002991, endpoint='/chat/completions', input_file_id='file-d7aaa72ebff440b19722cb9c9b8d205f', object='batch', status='completed', cancelled_at=None, cancelling_at=None, completed_at=1742003646, error_file_id='file-c2634ad4-99fc-4575-a6b3-d7b68492b97a', errors=None, expired_at=None, expires_at=1742089391, failed_at=None, finalizing_at=1742003527, in_progress_at=1742003293, metadata=None, output_file_id='file-6b9c6de8-4514-4b94-a42c-b1a1cbb1c635', request_counts=BatchRequestCounts(completed=275, failed=0, total=5000))
Batch(id='batch_e945eb8c-31a9-4a07-a415-356d1a064fe2', completion_window='24h', created_at=1742002970, endpoint='/chat/completions', input_file_id='file-e337e5ac8c06408ba67cae7b624f0c28', object='batch', status='completed', cancelled_at=None, cancelling_at=None, completed_at=1742003757, error_file_id='file-414ee824-b070-4bb1-9d80-14dd65a4aea2', errors=None, expired_at=None, expires_at=1742089370, failed_at=None, finalizing_at=1742003631, in_progress_at=1742003281, metadata=None, output_file_id='file-07ff6f4e-ab4b-4d15-9932-b84058f70736', request_counts=BatchRequestCounts(completed=272, failed=0, total=5000))
Batch(id='batch_cf622edb-2670-4ff2-9ee9-bd25bb35a3a0', completion_window='24h', created_at=1742002952, endpoint='/chat/completions', input_file_id='file-c88451a73f434a3b85bc0f95c21be385', object='batch', status='completed', cancelled_at=None, cancelling_at=None, completed_at=1742003753, error_file_id='file-f6d2900c-fb52-4c13-9bba-4cce1ffaec14', errors=None, expired_at=None, expires_at=1742089352, failed_at=None, finalizing_at=1742003631, in_progress_at=1742003289, metadata=None, output_file_id='file-9af3869e-fff3-45a4-b7a4-c38c9f2f9bac', request_counts=BatchRequestCounts(completed=270, failed=0, total=5000))
Batch(id='batch_40d32e10-a5a9-4c8c-93fc-03179afcfe78', completion_window='24h', created_at=1742002931, endpoint='/chat/completions', input_file_id='file-f1f64a0a175044e180afc9e9396d67d9', object='batch', status='completed', cancelled_at=None, cancelling_at=None, completed_at=1742003640, error_file_id='file-6d878431-9fca-44f8-a849-6972505d4d4a', errors=None, expired_at=None, expires_at=1742089331, failed_at=None, finalizing_at=1742003528, in_progress_at=1742003278, metadata=None, output_file_id='file-6f8cd787-c3b2-41bf-b7ef-0eba738899e8', request_counts=BatchRequestCounts(completed=277, failed=0, total=5000))
...
Batch(id='batch_23204b19-99e7-4ac3-8dd1-a70929280323', completion_window='24h', created_at=1741826257, endpoint='/chat/completions', input_file_id='file-3e06cf290fb24b0592154510e54b8809', object='batch', status='completed', cancelled_at=None, cancelling_at=None, completed_at=1741828981, error_file_id='file-da5bbdd7-6872-4379-b0fb-4a0376b6e6cc', errors=None, expired_at=None, expires_at=1741912657, failed_at=None, finalizing_at=1741828830, in_progress_at=1741828482, metadata=None, output_file_id='file-95378806-a0bc-4d80-a27a-eb3171ad9f70', request_counts=BatchRequestCounts(completed=5243, failed=0, total=5243))

I uploaded the batch with a Python script; here is the (pseudo)code:

import os

from openai import AzureOpenAI


def submit_batch(final_filepath, endpoint=None, api_key=None, api_version=None):
    # Fall back to environment variables when no explicit overrides are passed in.
    client = AzureOpenAI(
        azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT") if endpoint is None else endpoint,
        api_key=os.getenv("AZURE_OPENAI_API_KEY") if api_key is None else api_key,
        api_version=os.getenv("AZURE_OPENAI_API_VERSION") if api_version is None else api_version,
    )

    # Upload the JSONL input file for batch processing.
    with open(final_filepath, "rb") as f:
        batch_input_file = client.files.create(file=f, purpose="batch")
    file_id = batch_input_file.id

    # Create the batch job against the chat completions endpoint.
    batch_response = client.batches.create(
        input_file_id=file_id,
        endpoint="/chat/completions",
        completion_window="24h",
    )
    return batch_response
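After submission I poll the job until it reaches a terminal status, then download the output file and count the returned rows; that count is where I only see ~270 rows instead of the full input. A minimal sketch of that part (it reuses the same AzureOpenAI client configuration as above, and the batch ID comes from the batch_response returned by submit_batch):

import time

# Minimal sketch: poll a batch until it reaches a terminal status, then compare
# request_counts against the number of rows actually present in the output file.
# `client` is an AzureOpenAI client configured as in submit_batch above, and
# `batch_id` is the id from the batch_response returned by that function.
def wait_and_count(client, batch_id):
    while True:
        batch = client.batches.retrieve(batch_id)
        if batch.status in ("completed", "failed", "cancelled", "expired"):
            break
        time.sleep(60)

    print(batch.status, batch.request_counts)

    if batch.output_file_id:
        output_text = client.files.content(batch.output_file_id).text
        output_rows = [line for line in output_text.splitlines() if line.strip()]
        print(f"rows in output file: {len(output_rows)}")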
