database - Data loading and transformation from gcs to bigquery - Stack Overflow

I want to load some unstructured JSON files, up to 200GB in size, into BigQuery without using any ETL tools. I'm looking for a simple way to transform the data in GCS into a properly structured JSON format, and to apply some other custom transformation logic, before loading it into BigQuery. The challenge is how to achieve this without any high-compute resources or ETL tool.

asked Jan 20 at 14:16 by rah
  • cloud.google.com/bigquery/docs/… – Damião Martins Commented Jan 20 at 17:02
  • What is an unstructured JSON file? I have always thought of JSON as being "semi structured" meaning that it is syntactically well formed but the content of a document doesn't have to conform to a specific schema. – Kolban Commented Jan 21 at 15:12

1 Answer

The idea is to break the 200GB file into smaller pieces and then use Cloud Functions. The way I see it, you can either split it by deploying a Cloud Run service (it has a memory cap of 16GB) or break it up manually. Then use a Cloud Function to transform each piece so you can load it into BigQuery.
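A minimal sketch of the splitting step, assuming the source file is newline-delimited JSON (one record per line, so it can be split on line boundaries) and streaming it from GCS so memory stays bounded; the bucket name, object names, and chunk size are placeholders, not part of the answer:

```python
from google.cloud import storage

BUCKET = "my-bucket"           # placeholder bucket name
SOURCE = "raw/big-file.json"   # placeholder name of the 200GB object
CHUNK_LINES = 1_000_000        # records per output chunk; tune as needed

def flush_chunk(bucket, idx, lines):
    """Write one chunk of records back to GCS under a chunks/ prefix."""
    out = bucket.blob(f"chunks/part-{idx:05d}.json")
    out.upload_from_string("".join(lines))

def split_object():
    client = storage.Client()
    bucket = client.bucket(BUCKET)

    chunk_idx, lines = 0, []
    # Stream the object line by line instead of downloading it whole,
    # so memory stays bounded even for a 200GB file.
    with bucket.blob(SOURCE).open("r") as reader:
        for line in reader:
            lines.append(line)
            if len(lines) >= CHUNK_LINES:
                flush_chunk(bucket, chunk_idx, lines)
                chunk_idx, lines = chunk_idx + 1, []
    if lines:  # remainder
        flush_chunk(bucket, chunk_idx, lines)

if __name__ == "__main__":
    split_object()
```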

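And a hedged sketch of the transform-and-load step as a GCS-triggered Cloud Function: it rewrites each chunk with the custom logic applied and then hands the result to a BigQuery load job, so BigQuery's own ingestion does the heavy lifting rather than the function. `transform_record` and the dataset and table names are illustrative assumptions:

```python
import json
from google.cloud import bigquery, storage

DATASET = "my_dataset"   # placeholder dataset
TABLE = "my_table"       # placeholder table

def transform_record(record: dict) -> dict:
    # Placeholder for the custom transform logic mentioned in the
    # question, e.g. flattening nested fields or renaming keys.
    record["source"] = "gcs"
    return record

def handle_chunk(event, context):
    """Background Cloud Function triggered by a GCS object finalize
    event: transform one chunk, write it back as newline-delimited
    JSON, then hand it to a BigQuery load job."""
    name = event["name"]
    if not name.startswith("chunks/"):
        return  # ignore objects outside the chunk prefix

    bucket = storage.Client().bucket(event["bucket"])
    out_name = name.replace("chunks/", "transformed/", 1)

    # Transform line by line so memory use is bounded by one record.
    with bucket.blob(name).open("r") as src, \
         bucket.blob(out_name).open("w") as dst:
        for line in src:
            if line.strip():
                dst.write(json.dumps(transform_record(json.loads(line))) + "\n")

    # Let BigQuery's own ingestion do the actual loading; no ETL tool.
    bq = bigquery.Client()
    job = bq.load_table_from_uri(
        f"gs://{event['bucket']}/{out_name}",
        f"{bq.project}.{DATASET}.{TABLE}",
        job_config=bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
            autodetect=True,
        ),
    )
    job.result()  # block until the load job finishes
```

Since each chunk is handled independently, an invocation only ever holds one record in memory at a time, which keeps the whole pipeline inside ordinary Cloud Functions limits without any ETL tool or large compute resource.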