snowflake - header not detected by infer schema

The below code ignores the header in the csv file and the schema is being inferred as c1,c2,c3. The header columns are produced as just another data row. Below is sample data of relevant file

Series_reference    Period  Data_value  Suppressed  STATUS  UNITS   Magnitude   Subject Group   Series_title_1
BDCQ.SEA1AA 2011.06 80078       F   Number  0   Business Data Collection - BDC  Industry by employment variable Filled jobs

create or replace file format mybusinesscsv
type = csv
RECORD_DELIMITER = '\n'
skip_header = 0;

SELECT *
  FROM TABLE(
    INFER_SCHEMA(
      LOCATION=>'@quarterdatastg/employment_1.csv.gz',
      file_format=>'mybusinesscsv'

      )
    );

The below code ignores the header in the csv file and the schema is being inferred as c1,c2,c3. The header columns are produced as just another data row. Below is sample data of relevant file

Series_reference    Period  Data_value  Suppressed  STATUS  UNITS   Magnitude   Subject Group   Series_title_1
BDCQ.SEA1AA 2011.06 80078       F   Number  0   Business Data Collection - BDC  Industry by employment variable Filled jobs

create or replace file format mybusinesscsv
type = csv
RECORD_DELIMITER = '\n'
skip_header = 0;

SELECT *
  FROM TABLE(
    INFER_SCHEMA(
      LOCATION=>'@quarterdatastg/employment_1.csv.gz',
      file_format=>'mybusinesscsv'

      )
    );

Share Improve this question asked Feb 6 at 19:12 Mat 6033 gold badges11 silver badges23 bronze badges

Add a comment |

2 Answers 2

Sorted by: Reset to default 0

The point Nick is pointing to is the 'parse header' option also is required.

we had two file formats, one for the LOAD and the other for the INFER stage:

create or replace file format DB_NAME.SCHEMA_NAME.CSV_FORMAT 
    type = CSV RECORD_DELIMITER ='\n', FIELD_DELIMITER=',',
    SKIP_HEADER=1, PARSE_HEADER=false, ESCAPE_UNENCLOSED_FIELD='\\', 
   FIELD_OPTIONALLY_ENCLOSED_BY='\"';

create or replace file format DB_NAME.SCHEMA_NAME.CSV_FORMAT_HEADER 
   type = CSV RECORD_DELIMITER ='\n', FIELD_DELIMITER=',',
   SKIP_HEADER=0, PARSE_HEADER=true, ESCAPE_UNENCLOSED_FIELD='\\', 
   FIELD_OPTIONALLY_ENCLOSED_BY='\"';

thus the INFER was like:

            FROM TABLE(INFER_SCHEMA(
                      LOCATION => ''$$ || S3_FOLDER || $$'',
                      FILE_FORMAT => ''DB_NAME.SCHEMA_NAME.CSV_FORMAT_HEADER'',
                      IGNORE_CASE => TRUE
                ))

and where used in COPY's like:

 COPY INTO DB_NAME.SCHEMA_NAME.TABLE_NAME
      FROM @DB_NAME.SCHEMA_NAME.MEANINGFUL_NAME_STAGE/table_path_tokens
      FILE_FORMAT = DB_NAME.SCHEMA_NAME.CSV_FORMAT;

This is covered in the documentation: https://docs.snowflake.com/en/sql-reference/functions/infer_schema#usage-notes

科技改变生活-雨落星辰 - 所有的伟大,都源于一个勇敢的开始

snowflake - header not detected by infer schema - Stack Overflow

2 Answers 2

与本文相关的文章

评论列表(0)