
Azure Data Factory - Multiple IF Statements and/or ForEach Loops - Stack Overflow


The area I work in takes inbound files and loads them to an on-prem SQL Server. We use SSIS to loop through the files, and we run multiple validations on the files to ensure that they are formatted correctly. We run one of the validations, then, if it passes, move to the next; if the validation fails, we fail the file and do processing on it for the failure (sending emails, moving to correct location, etc.).

We are migrating from on-prem to the cloud (Snowflake, specifically). Due to this migration, we have to move away from SSIS and use Azure Data Factory. Unfortunately, ADF does not support multiple nested IF statements, nor does it support a nested IF inside a ForEach loop. This is a major drawback, and I am struggling to figure out how to go about getting this to work without having to write some ugly billion-line stored procedure in Snowflake. As an example, our SSIS package effectively does this:

  1. Check inbound file name
  2. If file name passes, move to the next validation. If the file name does not pass, fail the file.
  3. Check the inbound file has the correct header.
  4. If the header passes, move to the next validation. If the header does not pass, fail the file.

I wish I could provide screenshots here to give you an example of this. Something like this, perhaps:

    Check File Name ---if pass--- Check Header ---if pass--- Next Validation
           |                           |
           |                           |
        if fail                     if fail
           |                           |
           |                           |
           -----file failure routine----

And so on. What I struggle with here is being able to accomplish this in ADF. As mentioned, you cannot nest IF statements, and you cannot nest an IF inside a ForEach. Can someone point me to a tutorial or a page or something that gives a step-by-step on how to do this? Keep in mind that, after we get through the validations, each file can follow a separate load routine based upon the type of file it is. Yes, I know this is complex. But it seems like Microsoft designed ADF simply to charge the crap out of people for running multiple pipelines... which still doesn't get me where I need to go because of the multiple IF PASS statements I need.

If someone can help, that would be keen.

asked Mar 5 at 15:36 by Corey Livermore

1 Answer

You can use the following flow.

Within a ForEach loop (one iteration per file), run these activities sequentially:

  1. Use a Get Metadata activity to get the file details from the path (file name, file type, header, etc.).

  2. Use an If Condition activity to check whether the file name is as expected.

    1. If it is, leave the True branch empty.

    2. If it is not, add a Fail activity to the False branch to throw an error, which fails the iteration so the loop moves on to the next file.

  3. Use another If Condition activity to check whether the header is as expected.

    1. The True and False branches are the same as above.
  4. Add any additional validation steps.

  5. Depending on the file type, use a Switch activity to route the file to the appropriate load routine.

    Note that in this scenario you have to explicitly fail the iteration in order to move on to the next file; this is what works around the nested-If limitation.
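To make the flow concrete, here is a rough sketch of how the ForEach body might look as pipeline JSON. ADF pipelines are authored as JSON; the sketch builds that JSON as Python dictionaries only so it can be printed and sanity-checked. Everything specific in it is an assumption for illustration: the activity names, the `InboundFile` dataset and its `fileName` parameter, the upstream `ListInboundFiles` activity, the `LoadCsvFile` pipeline, the `inbound_` prefix, and the five-column header check (a stand-in for whatever header rule you actually need).

```python
import json


def expr(value):
    """Wrap a string as an ADF dynamic-content expression object."""
    return {"value": value, "type": "Expression"}


def after(activity_name):
    """Dependency entry: run only if the named activity succeeded."""
    return [{"activity": activity_name, "dependencyConditions": ["Succeeded"]}]


def fail_unless(name, condition, message, error_code, previous):
    """If Condition with an empty True branch and a Fail activity in False."""
    return {
        "name": name,
        "type": "IfCondition",
        "dependsOn": after(previous),
        "typeProperties": {
            "expression": expr(condition),
            "ifTrueActivities": [],
            "ifFalseActivities": [{
                "name": f"Fail_{name}",
                "type": "Fail",
                "typeProperties": {"message": message, "errorCode": error_code},
            }],
        },
    }


# Step 1: Get Metadata for the current file (dataset name is hypothetical).
get_details = {
    "name": "GetFileDetails",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {
            "referenceName": "InboundFile",
            "type": "DatasetReference",
            "parameters": {"fileName": expr("@item().name")},
        },
        "fieldList": ["itemName", "columnCount"],
    },
}

# Steps 2 and 3: sibling If activities chained with "Succeeded" dependencies,
# so a Fail in one skips everything after it in this iteration.
check_name = fail_unless(
    "CheckFileName",
    "@startswith(activity('GetFileDetails').output.itemName, 'inbound_')",
    "Unexpected file name", "BAD_FILE_NAME", previous="GetFileDetails")
check_header = fail_unless(
    "CheckHeader",
    "@equals(activity('GetFileDetails').output.columnCount, 5)",  # placeholder rule
    "Unexpected header", "BAD_HEADER", previous="CheckFileName")

# Step 5: Switch on the file extension to pick a load routine.
route = {
    "name": "RouteByFileType",
    "type": "Switch",
    "dependsOn": after("CheckHeader"),
    "typeProperties": {
        "on": expr("@toLower(last(split(activity('GetFileDetails').output.itemName, '.')))"),
        "cases": [{
            "value": "csv",
            "activities": [{
                "name": "LoadCsv",
                "type": "ExecutePipeline",
                "typeProperties": {
                    "pipeline": {"referenceName": "LoadCsvFile", "type": "PipelineReference"},
                    "waitOnCompletion": True,
                },
            }],
        }],
        "defaultActivities": [],
    },
}

# The loop itself; ListInboundFiles is an assumed upstream Get Metadata
# activity that returns childItems for the inbound folder.
for_each = {
    "name": "ForEachInboundFile",
    "type": "ForEach",
    "dependsOn": after("ListInboundFiles"),
    "typeProperties": {
        "items": expr("@activity('ListInboundFiles').output.childItems"),
        "isSequential": True,
        "activities": [get_details, check_name, check_header, route],
    },
}

print(json.dumps(for_each, indent=2))
```

The point of the layout is that the two If activities sit side by side inside the ForEach rather than one inside the other, so no If is nested in another If.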

    If you want to avoid failing the iteration and still need nested-If behavior, call an Execute Pipeline activity from within the If Condition activity and put the second If Condition activity in the child pipeline.
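A minimal sketch of that child-pipeline variant follows, again as Python dictionaries that serialize to pipeline JSON. The pipeline names (`ValidateHeader`, `FileFailureRoutine`, `LoadFile`), the `InboundFile` dataset, the `fileName` parameter, and both conditions are hypothetical placeholders, and the outer If is assumed to sit inside a ForEach so that `item().name` is available.

```python
import json


def expr(value):
    """Wrap a string as an ADF dynamic-content expression object."""
    return {"value": value, "type": "Expression"}


def run_pipeline(name, pipeline_name, file_name_expr):
    """Execute Pipeline activity that passes the file name to another pipeline."""
    return {
        "name": name,
        "type": "ExecutePipeline",
        "typeProperties": {
            "pipeline": {"referenceName": pipeline_name, "type": "PipelineReference"},
            "parameters": {"fileName": expr(file_name_expr)},
            "waitOnCompletion": True,
        },
    }


# Parent (inside the ForEach): the first If. Neither branch holds another If;
# the True branch hands off to the child pipeline, the False branch runs the
# failure routine instead of a Fail activity.
outer_if = {
    "name": "IfFileNameValid",
    "type": "IfCondition",
    "typeProperties": {
        "expression": expr("@startswith(item().name, 'inbound_')"),
        "ifTrueActivities": [
            run_pipeline("RunHeaderValidation", "ValidateHeader", "@item().name")],
        "ifFalseActivities": [
            run_pipeline("RunFailureRoutine", "FileFailureRoutine", "@item().name")],
    },
}

# Child pipeline: holds the second If, which is top-level here, not nested.
param = "@pipeline().parameters.fileName"
child_pipeline = {
    "name": "ValidateHeader",
    "properties": {
        "parameters": {"fileName": {"type": "string"}},
        "activities": [
            {
                "name": "GetFileDetails",
                "type": "GetMetadata",
                "typeProperties": {
                    "dataset": {
                        "referenceName": "InboundFile",
                        "type": "DatasetReference",
                        "parameters": {"fileName": expr(param)},
                    },
                    "fieldList": ["columnCount"],
                },
            },
            {
                "name": "IfHeaderValid",
                "type": "IfCondition",
                "dependsOn": [{"activity": "GetFileDetails",
                               "dependencyConditions": ["Succeeded"]}],
                "typeProperties": {
                    # Placeholder header rule: expect five columns.
                    "expression": expr(
                        "@equals(activity('GetFileDetails').output.columnCount, 5)"),
                    "ifTrueActivities": [run_pipeline("RunLoad", "LoadFile", param)],
                    "ifFalseActivities": [
                        run_pipeline("RunFailureRoutine", "FileFailureRoutine", param)],
                },
            },
        ],
    },
}

print(json.dumps({"parentActivity": outer_if, "childPipeline": child_pipeline}, indent=2))
```

Each further validation level would need its own child pipeline in this pattern, so it trades the explicit Fail activity for more pipelines to maintain.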
