Using DBT Cloud with BigQuery, I have a dataset called "production" where all the production DBT stuff goes, and another dataset called "sandbox" where DBT puts stuff that I run locally to test. It would be super convenient if I could somehow tell DBT to pull from the production dataset even when building tables in the sandbox. Is that possible?
Asked Mar 11 at 14:52 by brunt_fca, edited Mar 11 at 15:58 by Sourav Dutta.
1 Answer
You can do whatever you want as long as the data is in BigQuery; it is only a matter of permissions and authorization.
So, grant the service account that runs your BigQuery queries permission to read your production data, and grant it permission to write to the sandbox dataset. Here is the doc: https://cloud.google.com/bigquery/docs/control-access-to-resources-iam
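As a rough illustration of the two grants described above, here is a minimal sketch that only builds the BigQuery `GRANT` statements as strings; it does not run them. The project ID and service account email are made-up placeholders, and the choice of `roles/bigquery.dataViewer` for reading and `roles/bigquery.dataEditor` for writing is an assumption about what fits your setup, so adjust it to your own access policy.

```python
def build_grants(project, service_account,
                 read_dataset="production", write_dataset="sandbox"):
    """Return BigQuery GRANT statements for one service account.

    Assumption: dataViewer is enough to read the production dataset and
    dataEditor is enough to create tables in the sandbox dataset.
    """
    member = f"serviceAccount:{service_account}"
    return [
        # Read-only access to the production dataset.
        f"GRANT `roles/bigquery.dataViewer` "
        f"ON SCHEMA `{project}`.{read_dataset} "
        f'TO "{member}";',
        # Write access to the sandbox dataset.
        f"GRANT `roles/bigquery.dataEditor` "
        f"ON SCHEMA `{project}`.{write_dataset} "
        f'TO "{member}";',
    ]


if __name__ == "__main__":
    # Hypothetical project ID and service account, for illustration only.
    for statement in build_grants(
        "my-project",
        "dbt-runner@my-project.iam.gserviceaccount.com",
    ):
        print(statement)
```

The printed statements can be pasted into the BigQuery console by someone allowed to change access on both datasets. The same grants can also be made through the IAM page or the dataset sharing dialog, as described in the doc linked above.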
---
Now, beyond the technical (and easy) aspect, there are architecture, process and security questions:
Do you really want to copy your production data into a sandbox environment?
Does your production data contain sensitive information (PII or other)?
Do you need an extra process (with DBT or not) to copy/anonymise/secure your prod data into a sandbox environment?