最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

duckdb - How to read data from a CSV file where one column contains JSON data? - Stack Overflow

programmeradmin1浏览0评论

Is there a way to read the CSV column data as JSON?

I'm using PSDuckDB...and currently I'm using pure PowerShell to select this 'AuditData' column with JSON data and write it to a new JSON file before I can start with DuckDB queries.

Import-Csv -Path 'C:\M365-UnifiedAuditLog.csv' | Select-Object -ExpandProperty AuditData | Out-File 'C:\AuditData.json' -Encoding UTF8NoBOM

I'm new to DuckDB, but the speed is very promising!

Is there a way to read the CSV column data as JSON?

I'm using PSDuckDB...and currently I'm using pure PowerShell to select this 'AuditData' column with JSON data and write it to a new JSON file before I can start with DuckDB queries.

Import-Csv -Path 'C:\M365-UnifiedAuditLog.csv' | Select-Object -ExpandProperty AuditData | Out-File 'C:\AuditData.json' -Encoding UTF8NoBOM

I'm new to DuckDB, but the speed is very promising!

Share Improve this question edited Feb 7 at 7:37 user3022917 asked Feb 7 at 7:07 user3022917user3022917 6192 gold badges11 silver badges21 bronze badges
Add a comment  | 

1 Answer 1

Reset to default 0

If the CSV is valid, and if there is a properly encoded JSON column (i.e., encoded according to the CSV encoding rules), then reading the JSON column can be done trivially.

Example: CSV with header, with column j as encoded JSON

id,j
1,"[1,2]"
2,"{""a"":3}"
$ duckdb
D from read_csv('json-in-csv.csv', types={'j': 'JSON'} );
┌───────┬─────────┐
│  id   │    j    │
│ int64 │  json   │
├───────┼─────────┤
│     1 │ [1,2]   │
│     2 │ {"a":3} │
└───────┴─────────┘
D 

One can also use columns or auto_type_candidates, e.g. as follow:

from read_csv('json-in-csv.csv', columns={'id': int64, 'j':JSON});
from read_csv('json-in-csv.csv', auto_type_candidates=['json', 'BIGINT']);

Things would of course be much simpler if you were dealing with a TSV file, since raw tabs cannot appear in valid JSON.

发布评论

评论列表(0)

  1. 暂无评论