Azure Data Factory - Pipeline re-runs until condition has been met, then it ends successfully - Stack Overflow

We have a pipeline that, when it runs, calls an API for a set date range, retrieves the data and processes it. One of the fields returned in the data is "limitreached", which indicates whether our call returned the maximum number of records. If that value is true, the call didn't return the entire date range, because we hit the maximum number of records.

Our pipeline then records the maximum date we got up to in a load control and completes.

What I would like is this: when the "limitreached" value of the latest run (recorded in a load control) is true, the pipeline should start again once it has "completed" its run. The next call would then use the latest maximum date as the starting point and run again, this time ideally returning "false" for "limitreached".

The problem I am having with ADF is that it seems to completely hate this idea. I have tried:

  • An IfCondition at the end of the orchestration pipeline that checks whether "limitreached" = TRUE and, if so, runs the API call pipeline again. However, this would run a nested pipeline, which risks nesting 4 or 5 layers deep if we hit the limit repeatedly in one day.

  • An IfCondition that returns to the start of the pipeline if the condition is TRUE, but ends if FALSE. ADF doesn't allow a cycle like this. It gives the error "Pipeline cannot have cycles. Activity 'name' is in a cycle."

Our process looks like this: the first pipeline calls the API and returns a JSON file of data. The second pipeline shreds that data into Snowflake tables, one of which holds the value "limitreached".

If that value "limitreached" is TRUE, I want this process to run again.

I have tried a setup whereby a Script activity checks the limitreached value and an IfCondition asks "Does limitreached = true?". If it did, it would start the process again; if it didn't, it would fail the activity. This doesn't work because you cannot loop a pipeline like this.

I also tried creating a third pipeline to do the check and, if the limit was reached, kick off the starting pipeline, but that returns an error.

Is there any way to trigger some sort of "if condition is met, rerun the pipeline"? The pipeline is built in such a way that it starts and ends perfectly; I just need it to check whether it should run again, without manual intervention. Is that possible?

Asked Mar 24 at 12:03 by MikeLanglois (edited Mar 24 at 12:49). 3 comments:
  • Are you getting any error? Can you explain your ask with some sample data and the activities you have tried? – Rakesh Govindula Commented Mar 24 at 12:19
  • A specific one is "Pipeline 'pl-name' cannot have cycles. Activity 'pl-name' is in a cycle." when trying to have it loop back to the start after an IFCondition checks if it is true or false. – MikeLanglois Commented Mar 24 at 12:22
  • @RakeshGovindula I have updated my question with error messaging and current things attempted – MikeLanglois Commented Mar 24 at 12:57

1 Answer

Currently, ADF pipelines do not support cycles in the activity flow.

To achieve your requirement, you can use an Until activity and place your pipeline design inside it.

First, initialize a Boolean variable named flag to false, for example with @bool('false').

In the Until activity, use the expression below.

@equals(variables('flag'),true)

Set up your pipeline flow inside the Until activity. After the last activity, add another Set Variable activity and update the flag with an expression like the one below.

@if(equals(5,5),bool('true'),bool('false'))

Here I have used a sample condition, equals(5,5), but you need to check your Script activity output and assign true or false to the flag variable accordingly. Since the loop stops when flag is true, set it to true only when limitreached comes back false.

In each iteration the pipeline flow is executed, and once the condition is met the iterations stop.
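To make the control flow concrete, here is a minimal Python model of the loop described above. It is only a sketch and not ADF code: `call_api`, `ApiResult`, `run_until_complete`, the three-day cap per call, the sample dates and the `max_iterations` safety cap (standing in for the Until activity's timeout) are all hypothetical names and values chosen for illustration. The part it is meant to show is that the body runs first and the flag is tested afterwards, and that each pass starts from the maximum date recorded by the previous one.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ApiResult:
    max_date: date       # latest date covered by this call
    limitreached: bool   # True when the call hit the record cap


def call_api(start: date, end: date, max_days_per_call: int = 3) -> ApiResult:
    """Hypothetical stand-in for the API pipeline: covers at most
    max_days_per_call days per call and reports whether it was cut short."""
    reachable = start + timedelta(days=max_days_per_call)
    if reachable < end:
        return ApiResult(max_date=reachable, limitreached=True)
    return ApiResult(max_date=end, limitreached=False)


def run_until_complete(start: date, end: date, max_iterations: int = 20) -> list[ApiResult]:
    """Model of the Until activity: run the body, then test the flag."""
    flag = False              # the Boolean variable, initialised to false
    load_control = start      # stands in for the load-control table
    runs: list[ApiResult] = []
    while True:
        result = call_api(load_control, end)  # body of the Until activity
        runs.append(result)
        load_control = result.max_date        # record the maximum date reached
        flag = not result.limitreached        # the final Set Variable step
        # Until condition (flag is true), plus an assumed safety cap
        if flag or len(runs) >= max_iterations:
            break
    return runs


# Sample date range, chosen only for illustration
runs = run_until_complete(date(2024, 3, 1), date(2024, 3, 10))
for r in runs:
    print(r.max_date, r.limitreached)
```

With these sample values the loop makes three passes: the first two report limitreached as true and the third finishes the range, which sets the flag and ends the loop. No pipeline ever calls itself, so there is no nesting and no cycle.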
