最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

snowflake - header not detected by infer schema - Stack Overflow

programmeradmin1浏览0评论

The below code ignores the header in the csv file and the schema is being inferred as c1,c2,c3. The header columns are produced as just another data row. Below is sample data of relevant file

Series_reference    Period  Data_value  Suppressed  STATUS  UNITS   Magnitude   Subject Group   Series_title_1
BDCQ.SEA1AA 2011.06 80078       F   Number  0   Business Data Collection - BDC  Industry by employment variable Filled jobs
create or replace file format mybusinesscsv
type = csv
RECORD_DELIMITER = '\n'
skip_header = 0;
SELECT *
  FROM TABLE(
    INFER_SCHEMA(
      LOCATION=>'@quarterdatastg/employment_1.csv.gz',
      file_format=>'mybusinesscsv'

      )
    );

The below code ignores the header in the csv file and the schema is being inferred as c1,c2,c3. The header columns are produced as just another data row. Below is sample data of relevant file

Series_reference    Period  Data_value  Suppressed  STATUS  UNITS   Magnitude   Subject Group   Series_title_1
BDCQ.SEA1AA 2011.06 80078       F   Number  0   Business Data Collection - BDC  Industry by employment variable Filled jobs
create or replace file format mybusinesscsv
type = csv
RECORD_DELIMITER = '\n'
skip_header = 0;
SELECT *
  FROM TABLE(
    INFER_SCHEMA(
      LOCATION=>'@quarterdatastg/employment_1.csv.gz',
      file_format=>'mybusinesscsv'

      )
    );
Share Improve this question asked Feb 6 at 19:12 MatMat 6033 gold badges11 silver badges23 bronze badges
Add a comment  | 

2 Answers 2

Reset to default 0

The point Nick is pointing to is the 'parse header' option also is required.

we had two file formats, one for the LOAD and the other for the INFER stage:

create or replace file format DB_NAME.SCHEMA_NAME.CSV_FORMAT 
    type = CSV RECORD_DELIMITER ='\n', FIELD_DELIMITER=',',
    SKIP_HEADER=1, PARSE_HEADER=false, ESCAPE_UNENCLOSED_FIELD='\\', 
   FIELD_OPTIONALLY_ENCLOSED_BY='\"';

create or replace file format DB_NAME.SCHEMA_NAME.CSV_FORMAT_HEADER 
   type = CSV RECORD_DELIMITER ='\n', FIELD_DELIMITER=',',
   SKIP_HEADER=0, PARSE_HEADER=true, ESCAPE_UNENCLOSED_FIELD='\\', 
   FIELD_OPTIONALLY_ENCLOSED_BY='\"';

thus the INFER was like:

            FROM TABLE(INFER_SCHEMA(
                      LOCATION => ''$$ || S3_FOLDER || $$'',
                      FILE_FORMAT => ''DB_NAME.SCHEMA_NAME.CSV_FORMAT_HEADER'',
                      IGNORE_CASE => TRUE
                ))

and where used in COPY's like:

 COPY INTO DB_NAME.SCHEMA_NAME.TABLE_NAME
      FROM @DB_NAME.SCHEMA_NAME.MEANINGFUL_NAME_STAGE/table_path_tokens
      FILE_FORMAT = DB_NAME.SCHEMA_NAME.CSV_FORMAT;

This is covered in the documentation: https://docs.snowflake.com/en/sql-reference/functions/infer_schema#usage-notes

发布评论

评论列表(0)

  1. 暂无评论