database - Create DuckDB table with primary key from parquet

I am trying to set up a simple but large DuckDB database with a single column of unique values read from a parquet file. To speed up single-point existence checks (WHERE id = test_id), I want to convert the parquet file to a DuckDB table (as recommended) and add a primary key to it.

I tried it as follows:

CREATE OR REPLACE TABLE data (
    id UUID PRIMARY KEY
)
AS SELECT DISTINCT id FROM 'my-parquet-file.parquet'

but got the error:

duckdb.duckdb.ParserException: Parser Error: syntax error at or near "AS"

So what is the correct way of setting up the table for efficient single-point existence checking on huge tables?

asked Jan 19 at 9:27 by Bram Vanroy, edited Jan 19 at 10:09 by NateDhaliwal

1 Answer

You can add the primary key with alter table ... add primary key ... after creating the table:

create or replace table data
as
select distinct id
from 'my-parquet-file.parquet';

alter table data add primary key (id);

Or create the table with a primary key first and then insert the data into it:

create or replace table data (
    id uuid primary key
);

insert into data (id)
select distinct id
from 'my-parquet-file.parquet';
