I have a Step Functions that starts a Glue job run. This step functions is invoked via Event Bridge when new files are created at a specific S3 location. Sometimes 2 or more files are created in quick succession. So when the first execution (in response to first file creation) is in progress, the second execution starts(in response to the second file creation), and when it reaches the Glue step, it fails with a "Concurrent runs exceeded".
I want the Glue StartJobRun step in the 2nd invocation to wait until Glue run in the first invocation is complete. I understand setting JobRunQueuingEnabled to true will achieve this.
This is my definition:
{
"Type": "Task",
"Resource": "arn:aws:states:::glue:startJobRun.sync",
"Arguments": {
"JobName": "test-glue-job",
"Arguments": {
"--JobRunQueuingEnabled": "true",
"JobRunQueuingEnabled": "true",
"--jobRunQueuingEnabled": "true",
"--jobrunqueuingenabled": "true"
}
},
"End": true
}
The various variants of JobRunQueuingEnabled are my attempts to see if any one of them will do the trick (some of the alternatives are inspired by this question). But none of them work, and "Concurrent runs exceeded" continues to occur.
Here is another alternative I tried:
{
"Type": "Task",
"Resource": "arn:aws:states:::glue:startJobRun.sync",
"Arguments": {
"JobName": "test-glue-job",
"JobRunQueuingEnabled": "true",
...
Step Functions will not permit a save with the following error:
/States/Map/ItemProcessor/States/Glue StartJobRun/Arguments: The field "JobRunQueuingEnabled" is not supported by Step Functions
A similar error occurs when "--JobRunQueuingEnabled": "true"
is tried.
Could it be that the API supports it but Step Functions does not? Or am I doing something wrong?
I have a Step Functions that starts a Glue job run. This step functions is invoked via Event Bridge when new files are created at a specific S3 location. Sometimes 2 or more files are created in quick succession. So when the first execution (in response to first file creation) is in progress, the second execution starts(in response to the second file creation), and when it reaches the Glue step, it fails with a "Concurrent runs exceeded".
I want the Glue StartJobRun step in the 2nd invocation to wait until Glue run in the first invocation is complete. I understand setting JobRunQueuingEnabled to true will achieve this.
This is my definition:
{
"Type": "Task",
"Resource": "arn:aws:states:::glue:startJobRun.sync",
"Arguments": {
"JobName": "test-glue-job",
"Arguments": {
"--JobRunQueuingEnabled": "true",
"JobRunQueuingEnabled": "true",
"--jobRunQueuingEnabled": "true",
"--jobrunqueuingenabled": "true"
}
},
"End": true
}
The various variants of JobRunQueuingEnabled are my attempts to see if any one of them will do the trick (some of the alternatives are inspired by this question). But none of them work, and "Concurrent runs exceeded" continues to occur.
Here is another alternative I tried:
{
"Type": "Task",
"Resource": "arn:aws:states:::glue:startJobRun.sync",
"Arguments": {
"JobName": "test-glue-job",
"JobRunQueuingEnabled": "true",
...
Step Functions will not permit a save with the following error:
/States/Map/ItemProcessor/States/Glue StartJobRun/Arguments: The field "JobRunQueuingEnabled" is not supported by Step Functions
A similar error occurs when "--JobRunQueuingEnabled": "true"
is tried.
Could it be that the API supports it but Step Functions does not? Or am I doing something wrong?
Share Improve this question asked Mar 27 at 2:40 RamRam 9391 gold badge8 silver badges24 bronze badges1 Answer
Reset to default 1You may try to update your glue job configuration to enable queuing as below:
Go to AWS Glue Console
Select your job → "Edit job"
Under "Job details", set "Job Concurrency" to a value > 1 (e.g., 2)
Enable "Job run queuing" (this is the JobRunQueuingEnabled
setting)
I think this will make Glue automatically queue runs when the concurrency limit is hit. Just have a try.