
azure data factory - How to count all the files that exist in folder and its sub-folders using ADF pipeline - Stack Overflow


I have an Azure file share directory that contains files, folders, and files within those folders:

FOLDER1 -- file1, file2, file3 ...
FILE1
FILE2
FOLDER2 -- file1, file2 ...
FOLDER3 -- file1, file2, file3 ...

I would like to develop a pipeline that counts all the files in the file share folder itself as well as the files inside its subfolders.

My solution so far: I used a Get Metadata activity to list all the childItems, but I am stuck on what to do next. I am a beginner with Azure.

asked Nov 16, 2024 at 23:19 by Shedrack
  • Do the files exist only in the last subfolder, or in any parent or intermediate folder as well? – Rakesh Govindula Commented Nov 17, 2024 at 3:30
  • The files exist in any parent or intermediate folder as well. – Shedrack Commented Nov 17, 2024 at 6:42

1 Answer


How to count all the files that exist in folder and its sub-folders using ADF pipeline

To count the files in a folder and its subfolders, combine the Get Metadata, Filter, and ForEach activities as follows:

  • First, use a Get Metadata activity to get the child items of the file share directory.
  • Then use Filter activities on the Get Metadata output to separate the folders from the files:
For folders - @equals(item().type,'Folder')
For files - @equals(item().type,'File')
  • Pass the output of the folder filter to a ForEach loop.
  • Inside that ForEach loop, repeat the same Get Metadata and Filter steps for each subfolder, and store each subfolder's file count in an array variable with an Append Variable activity.
  • Add a second ForEach loop that iterates over the array of subfolder file counts.
  • Inside the second loop, use two variables, subfilecount and subfilecounttemp, to sum the subfolder file counts.
  • Finally, add this sum to the count of files in the directory itself, which was separated out from the first Get Metadata output.
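
The steps above can be sketched outside ADF in plain Python, purely to illustrate what each activity computes. The listings below are hypothetical stand-ins for the childItems output of Get Metadata, and the function names are invented for this sketch. Like the pipeline in this answer, it only counts files one level below the root.

```python
# Hypothetical stand-in for the first Get Metadata output: each entry mirrors
# the {"name": ..., "type": "File" | "Folder"} shape of a childItems element.
ROOT_CHILD_ITEMS = [
    {"name": "FOLDER1", "type": "Folder"},
    {"name": "FOLDER2", "type": "Folder"},
    {"name": "FILE1", "type": "File"},
    {"name": "FILE2", "type": "File"},
]

# Hypothetical per-subfolder listings (what Get Metadata2 would return).
SUBFOLDER_CHILD_ITEMS = {
    "FOLDER1": [{"name": f"file{i}", "type": "File"} for i in (1, 2, 3)],
    "FOLDER2": [{"name": f"file{i}", "type": "File"} for i in (1, 2)],
}


def filter_by_type(child_items, item_type):
    """Mimic a Filter activity: keep items whose type equals item_type."""
    return [item for item in child_items if item["type"] == item_type]


def count_files_one_level(root_items, subfolder_items):
    """Count files in the root plus files directly inside each subfolder."""
    folders = filter_by_type(root_items, "Folder")              # Filter1
    root_file_count = len(filter_by_type(root_items, "File"))   # Filter2

    subfolder_counts = []                                       # array variable
    for folder in folders:                                      # ForEach1
        children = subfolder_items.get(folder["name"], [])      # Get Metadata2
        file_count = len(filter_by_type(children, "File"))      # Filter4
        subfolder_counts.append(file_count)                     # Append variable1

    return root_file_count + sum(subfolder_counts)


total = count_files_one_level(ROOT_CHILD_ITEMS, SUBFOLDER_CHILD_ITEMS)
print(total)
```

Note that the type comparison is an exact string match here, which is why the filter values are written as 'File' and 'Folder'.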

My pipeline.json:

{
    "name": "pipeline3",
    "properties": {
        "activities": [
            {
                "name": "Get Metadata1",
                "type": "GetMetadata",
                "dependsOn": [],
                "policy": {
                    "timeout": "0.12:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "dataset": {
                        "referenceName": "DelimitedText2",
                        "type": "DatasetReference"
                    },
                    "fieldList": [
                        "childItems"
                    ],
                    "storeSettings": {
                        "type": "AzureBlobStorageReadSettings",
                        "enablePartitionDiscovery": false
                    },
                    "formatSettings": {
                        "type": "DelimitedTextReadSettings"
                    }
                }
            },
            {
                "name": "Filter1",
                "type": "Filter",
                "dependsOn": [
                    {
                        "activity": "Get Metadata1",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "userProperties": [],
                "typeProperties": {
                    "items": {
                        "value": "@activity('Get Metadata1').output.childItems",
                        "type": "Expression"
                    },
                    "condition": {
                        "value": "@equals(item().type,'Folder')",
                        "type": "Expression"
                    }
                }
            },
            {
                "name": "Filter2",
                "type": "Filter",
                "dependsOn": [
                    {
                        "activity": "Get Metadata1",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "userProperties": [],
                "typeProperties": {
                    "items": {
                        "value": "@activity('Get Metadata1').output.childItems",
                        "type": "Expression"
                    },
                    "condition": {
                        "value": "@equals(item().type,'File')",
                        "type": "Expression"
                    }
                }
            },
            {
                "name": "ForEach1",
                "type": "ForEach",
                "dependsOn": [
                    {
                        "activity": "Filter1",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "userProperties": [],
                "typeProperties": {
                    "items": {
                        "value": "@activity('Filter1').output.value",
                        "type": "Expression"
                    },
                    "activities": [
                        {
                            "name": "Get Metadata2",
                            "type": "GetMetadata",
                            "dependsOn": [],
                            "policy": {
                                "timeout": "0.12:00:00",
                                "retry": 0,
                                "retryIntervalInSeconds": 30,
                                "secureOutput": false,
                                "secureInput": false
                            },
                            "userProperties": [],
                            "typeProperties": {
                                "dataset": {
                                    "referenceName": "DelimitedText3",
                                    "type": "DatasetReference",
                                    "parameters": {
                                        "subfoldername": {
                                            "value": "@item().name",
                                            "type": "Expression"
                                        }
                                    }
                                },
                                "fieldList": [
                                    "childItems"
                                ],
                                "storeSettings": {
                                    "type": "AzureBlobStorageReadSettings",
                                    "enablePartitionDiscovery": false
                                },
                                "formatSettings": {
                                    "type": "DelimitedTextReadSettings"
                                }
                            }
                        },
                        {
                            "name": "Filter3",
                            "type": "Filter",
                            "dependsOn": [
                                {
                                    "activity": "Get Metadata2",
                                    "dependencyConditions": [
                                        "Succeeded"
                                    ]
                                }
                            ],
                            "userProperties": [],
                            "typeProperties": {
                                "items": {
                                    "value": "@activity('Get Metadata2').output.childItems",
                                    "type": "Expression"
                                },
                                "condition": {
                                    "value": "@equals(item().type,'Folder')",
                                    "type": "Expression"
                                }
                            }
                        },
                        {
                            "name": "Filter4",
                            "type": "Filter",
                            "dependsOn": [
                                {
                                    "activity": "Get Metadata2",
                                    "dependencyConditions": [
                                        "Succeeded"
                                    ]
                                }
                            ],
                            "userProperties": [],
                            "typeProperties": {
                                "items": {
                                    "value": "@activity('Get Metadata2').output.childItems",
                                    "type": "Expression"
                                },
                                "condition": {
                                    "value": "@equals(item().type,'File')",
                                    "type": "Expression"
                                }
                            }
                        },
                        {
                            "name": "Append variable1",
                            "type": "AppendVariable",
                            "dependsOn": [
                                {
                                    "activity": "Filter4",
                                    "dependencyConditions": [
                                        "Succeeded"
                                    ]
                                }
                            ],
                            "userProperties": [],
                            "typeProperties": {
                                "variableName": "subfolder files",
                                "value": {
                                    "value": "@activity('Filter4').output.FilteredItemsCount",
                                    "type": "Expression"
                                }
                            }
                        }
                    ]
                }
            },
            {
                "name": "Set variable3",
                "type": "SetVariable",
                "dependsOn": [
                    {
                        "activity": "Filter2",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    },
                    {
                        "activity": "ForEach2",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "policy": {
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "variableName": "subfilecounttemp",
                    "value": {
                        "value": "@add(activity('Filter2').output.FilteredItemsCount,variables('subfilecount'))",
                        "type": "Expression"
                    }
                }
            },
            {
                "name": "ForEach2",
                "type": "ForEach",
                "dependsOn": [
                    {
                        "activity": "ForEach1",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "userProperties": [],
                "typeProperties": {
                    "items": {
                        "value": "@variables('subfolder files')",
                        "type": "Expression"
                    },
                    "isSequential": true,
                    "activities": [
                        {
                            "name": "Set variable4",
                            "type": "SetVariable",
                            "dependsOn": [],
                            "policy": {
                                "secureOutput": false,
                                "secureInput": false
                            },
                            "userProperties": [],
                            "typeProperties": {
                                "variableName": "subfilecount",
                                "value": {
                                    "value": "@add(item(),variables('subfilecounttemp'))",
                                    "type": "Expression"
                                }
                            }
                        },
                        {
                            "name": "Set variable5",
                            "type": "SetVariable",
                            "dependsOn": [
                                {
                                    "activity": "Set variable4",
                                    "dependencyConditions": [
                                        "Succeeded"
                                    ]
                                }
                            ],
                            "policy": {
                                "secureOutput": false,
                                "secureInput": false
                            },
                            "userProperties": [],
                            "typeProperties": {
                                "variableName": "subfilecounttemp",
                                "value": {
                                    "value": "@variables('subfilecount')",
                                    "type": "Expression"
                                }
                            }
                        }
                    ]
                }
            }
        ],
        "variables": {
            "subfolder": {
                "type": "String"
            },
            "subfolder files": {
                "type": "Array"
            },
            "subfilecounttemp": {
                "type": "Integer"
            },
            "fies": {
                "type": "Array"
            },
            "subfilecount": {
                "type": "Integer"
            }
        },
        "annotations": []
    }
}
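
The ForEach2 loop in the pipeline keeps a running sum in two variables, presumably because a Set Variable activity in ADF cannot reference the variable it is setting. A rough Python analogue of that pattern follows. The function name and the example counts are made up, and starting both variables at 0 is an assumption of this sketch.

```python
def sum_with_temp_variable(subfolder_file_counts, root_file_count):
    """Mirror ForEach2 and Set variable3 using two variables for the sum."""
    subfilecount = 0       # assumed starting value
    subfilecounttemp = 0   # assumed starting value

    for item in subfolder_file_counts:           # ForEach2 (sequential)
        subfilecount = item + subfilecounttemp   # Set variable4
        subfilecounttemp = subfilecount          # Set variable5

    # Set variable3: add the root-level file count from Filter2.
    subfilecounttemp = root_file_count + subfilecount
    return subfilecounttemp


# Hypothetical counts: three subfolders with 3, 2 and 3 files, 2 files in the root.
print(sum_with_temp_variable([3, 2, 3], 2))
```

The loop has to run sequentially, as it does in the pipeline, because each iteration reads the value written by the previous one.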
