最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

python - Issue Passing Dynamic Variables CSV from Shell Script into Dataform (v2.6.7) - Stack Overflow

programmeradmin3浏览0评论

Title: Issue Passing Dynamic Variables from Shell Script into Dataform (v2.6.7)

Post:

Hi Dataform Community,

I'm encountering issues dynamically passing variables (specifically dataset names) from a shell script into my Dataform project (version 2.6.7). Here’s a simplified and anonymized example clearly illustrating my scenario:

Scenario Overview

I have four key files:

1. datasets.csv

datasetA,datasetB

2. dataform.json

{
  "warehouse": "bigquery",
  "defaultSchema": "sample_schema",
  "defaultDatabase": "sample_project",
  "vars": {
    "PROJECT_ID": "sample_project",
    "SCHEMA": "sample_schema",
    "DATASETS": ""
  }
}

3. filter_data.sqlx

config {
  type: "table",
  schema: "sample_schema"
}

SELECT *
FROM ${ref("source_table")}
WHERE dataset_id IN (${dataform.projectConfig.vars.DATASETS.split(",").map(d => `'${d.trim()}'`).join(",")})

4. run_dataform.sh

#!/bin/bash

DATASETS=$(awk 'NF' datasets.csv | paste -sd, -)

cp dataform.json dataform.json.bak

jq --arg DATASETS "$DATASETS" '.vars.DATASETS=$DATASETS' dataform.json > temp.json && mv temp.json dataform.json

dataform run --credentials "./envs/.df-credentials.json"

mv dataform.json.bak dataform.json

Issue Explanation

When I execute the script locally, the variables update correctly, and the Dataform job runs successfully. However, once deployed in a cloud workflow environment, Dataform executes successfully but returns empty tables. It seems like the variables from datasets.csv aren't being recognized or passed correctly in the workflow context.

Desired Outcome

  • I want the compiled SQL in Dataform to reflect the dynamic datasets passed from datasets.csv:
dataset_id IN ('datasetA','datasetB')

My questions to the community

  • Is passing dynamic variables via modification of dataform.json considered best practice?
  • Is there a better, officially supported method (perhaps environment variables or CLI parameters) to dynamically pass variables into Dataform at runtime?

Any help, advice, or examples would be greatly appreciated. Thanks in advance!

发布评论

评论列表(0)

  1. 暂无评论