
amazon web services - AWS lambda SQS does not allow to process 1 message per 1 invocation - Stack Overflow


I am trying to set up what I thought initially was an easy pipeline.

I have 2 SQS queues and 2 lambdas that should work sequentially.

The problem is that my Lambdas do heavy work, so I want 1 message processed per Lambda invocation. For this reason, I have set the Batch size in the Lambda SQS triggers to 1, and my Lambdas start as follows:

import json

# @idempotent comes from AWS Lambda Powertools; persistence_layer is defined elsewhere
@idempotent(persistence_store=persistence_layer)
def lambda_handler(event, context):
    # Batch size is set to 1, so only the first record is expected
    record = event['Records'][0]
    body = json.loads(record['body'])

    task_id = body['taskId']

    set_state_in_dynamodb(task_id, 'Running')

    # Do all the work

    set_state_in_dynamodb(task_id, 'Success')

In my world, everything should have worked as follows:

N messages go into Queue 1, each triggering an invocation of Lambda 1. If there are limits on parallel execution, I am fine with that, though not a limit of 1. After Lambda 1 finishes its work, it puts M messages into Queue 2, which triggers Lambda 2 in the same way.

Instead, I see the following behavior:

I put 10 messages into Queue 1; they are all dequeued, but only the state of a random subset of the tasks is updated, i.e. set to Running. After the visibility timeout expires, the same thing happens again.

As far as I understand, AWS dequeues more than one message and invokes the corresponding Lambda with only the first one; the others remain invisible, and we must wait until the visibility timeout expires before they are dequeued in the same manner. So, after a few such iterations, I can end up with a message that was never processed but already has a receive count of, say, 3.

The behavior gets even worse on Queue 2 and Lambda 2.

I tried to change the type of the queue from Standard to FIFO and provided a different MessageGroupId for each message, which resulted in the same behavior.

What am I doing wrong?

I just want my messages to be processed in parallel and only once.


asked Jan 18 at 15:50 by Vahagn
  • 3 The batch of messages that the Lambda function is invoked with should never exceed the configured batch size limit. You can easily verify this by logging the length of event['Records'] and raising an exception if it's not what you expected. Have you verified this? Your symptoms (as far as I can tell from your problem description) suggest that you have configured the default batch size of 10 and your Lambda function is only processing the first one, so the remaining 9 are made visible in the queue again. Did you deploy the Lambda function after you reduced the batch size limit to 1? – jarmod Commented Jan 18 at 17:50
  • 1 You might want to add print(event) to the start of your Lambda function so you can see what is being passed to the function. Also, please note that if you are viewing the content of messages via the SQS management console, then you are actually consuming messages from the queue, which will reappear after the invisibility timeout expires. It is safer to do your investigation from log files without using the SQS console to view message information. – John Rotenstein Commented Jan 19 at 7:17
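The verification step suggested in the first comment can be sketched as follows. This is a minimal stub, not the asker's actual handler: the `taskId` field comes from the question, and the idempotency decorator and DynamoDB calls are omitted.

```python
import json

def lambda_handler(event, context):
    records = event.get('Records', [])
    # Fail loudly if the event source mapping delivered more than one message,
    # which would indicate the batch size is not actually 1.
    if len(records) != 1:
        raise ValueError(f"Expected exactly 1 record, got {len(records)}")
    body = json.loads(records[0]['body'])
    return body['taskId']
```

If CloudWatch logs show this exception firing, the event source mapping is still delivering the default batch of up to 10 messages.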

2 Answers 2


As already mentioned in a comment, the setup you describe should work fine. Are you sure you have set Batch size = 1 in the Lambda event source mapping?

Your description perfectly matches the default Batch size = 10, where your Lambda ignores the other 9 events.

By the way, I strongly suggest you also support multiple events in a single Lambda invocation; this will reduce overall execution time and lower costs.
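A minimal sketch of such a batch-aware handler, assuming the same `taskId` payload as in the question (the `set_state_in_dynamodb` calls and heavy work are elided):

```python
import json

def lambda_handler(event, context):
    # Handle however many records SQS delivers in one invocation
    processed = []
    for record in event['Records']:
        body = json.loads(record['body'])
        task_id = body['taskId']
        # ... do the heavy work for task_id here ...
        processed.append(task_id)
    return {'processed': processed}
```

With this shape, the handler works correctly whether the batch size is 1 or 10.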

The issue was the account-level limit on concurrent Lambda executions, which was 10 by default for my account. I requested an increase to 1000 and everything works well now.
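If you suspect the same limit, the account-level concurrency quota can be checked with the AWS CLI (assuming credentials and region are configured); increases are requested through Service Quotas or AWS Support:

```shell
# Shows the account-wide concurrent execution limit for Lambda
aws lambda get-account-settings --query 'AccountLimit.ConcurrentExecutions'
```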
