Title: Issue Passing Dynamic Variables from Shell Script into Dataform (v2.6.7)
Post:
Hi Dataform Community,
I'm running into an issue dynamically passing variables (specifically dataset names) from a shell script into my Dataform project (version 2.6.7). Here's a simplified, anonymized example that illustrates my scenario:
Scenario Overview
I have four key files:
1. datasets.csv
datasetA,datasetB
2. dataform.json
{
  "warehouse": "bigquery",
  "defaultSchema": "sample_schema",
  "defaultDatabase": "sample_project",
  "vars": {
    "PROJECT_ID": "sample_project",
    "SCHEMA": "sample_schema",
    "DATASETS": ""
  }
}
3. filter_data.sqlx
config {
  type: "table",
  schema: "sample_schema"
}

SELECT *
FROM ${ref("source_table")}
WHERE dataset_id IN (${dataform.projectConfig.vars.DATASETS.split(",").map(d => `'${d.trim()}'`).join(",")})
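For clarity, with DATASETS set to "datasetA,datasetB" the inline JavaScript should compile to exactly the filter I want; with the default empty string, "".split(",") yields [""], so the clause degenerates into one that matches nothing:

-- DATASETS = "datasetA,datasetB"
WHERE dataset_id IN ('datasetA','datasetB')

-- DATASETS = "" (the default committed in dataform.json)
WHERE dataset_id IN ('')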
4. run_dataform.sh
#!/bin/bash
# Join the non-empty lines of datasets.csv into one comma-separated string
DATASETS=$(awk 'NF' datasets.csv | paste -sd, -)

# Back up dataform.json, then inject the dataset list into its "vars" block
cp dataform.json dataform.json.bak
jq --arg DATASETS "$DATASETS" '.vars.DATASETS=$DATASETS' dataform.json > temp.json && mv temp.json dataform.json

# Run the project, then restore the original dataform.json
dataform run --credentials "./envs/.df-credentials.json"
mv dataform.json.bak dataform.json
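For what it's worth, locally I sanity-check the substitution by compiling before running and searching the output for the generated clause (assuming dataform compile --json prints the compiled graph including each action's SQL, which I believe it does in 2.x):

# After the jq step: confirm the dataset list made it into the compiled SQL
dataform compile --json | grep -o "dataset_id IN ([^)]*)"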
Issue Explanation
When I execute the script locally, the variables update correctly and the Dataform job runs successfully. However, once deployed in a cloud workflow environment, Dataform executes successfully but returns empty tables. It seems the values from datasets.csv aren't being recognized or passed along in the workflow context; my suspicion is that the workflow compiles against the committed dataform.json, where DATASETS is still "", so the IN clause matches nothing, but I haven't been able to confirm this.
Desired Outcome
- I want the compiled SQL in Dataform to reflect the dynamic datasets passed from datasets.csv:

dataset_id IN ('datasetA','datasetB')
My questions to the community
- Is passing dynamic variables by rewriting dataform.json in place considered best practice?
- Is there a better, officially supported method (perhaps environment variables or CLI parameters) to dynamically pass variables into Dataform at runtime? (See the sketch below for the kind of thing I mean.)
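For instance, I've seen references to a --vars flag on the CLI that would avoid rewriting dataform.json entirely, though I'm not certain it's available in 2.6.7, and since the flag itself seems to use commas to separate multiple key=value pairs, my comma-separated value might need a different delimiter. Something like:

# Hypothetical: pass the variable at invocation time instead of editing dataform.json
DATASETS=$(awk 'NF' datasets.csv | paste -sd, -)
dataform run --credentials "./envs/.df-credentials.json" --vars=DATASETS="$DATASETS"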
Any help, advice, or examples would be greatly appreciated. Thanks in advance!