
Postgresql copy command problem with escaped delimiter - Stack Overflow


I'm working on a Java program that imports a CSV file into a PostgreSQL database. Some of the lines in the CSV file contain escaped commas, and commas are also the delimiter. When I import this file with MySQL it works fine, but with PostgreSQL I get an error saying there is more data in the line than there are columns to import it into (org.postgresql.util.PSQLException: ERROR: extra data after last expected column).

How can I get COPY to import these lines with the escaped commas treated as text instead of as delimiters?

My COPY command is "COPY temporary_%s (%s) FROM STDIN with csv";

And for comparison, the MySQL version, which works:

"LOAD DATA LOCAL INFILE ? "
          + "INTO TABLE temporary_%s "
          + "FIELDS TERMINATED BY ',' ENCLOSED BY '\"' "
          + "LINES TERMINATED BY '\n' "
          + "(%s)";

I've been trying to append something like ESCAPE '\\' to the PostgreSQL statement, but it doesn't appear to do anything. Solutions I've seen suggest wrapping every field in quotes, but I don't have control over the contents of the CSV file, and it's quite large (up to ~8 GB), so I'd prefer not to pre-process it if that's not necessary.

And the Java:

    BaseConnection physicalConnection =
        (BaseConnection) connection.unwrap(Class.forName("org.postgresql.core.BaseConnection"));

    CopyManager copyManager = new CopyManager(physicalConnection);
    FileReader fileReader = new FileReader(fileName);
    copyManager.copyIn(String.format(PROCEDURE_POSTGRES, tableName, columns), fileReader);

And a sample line of data with this problem (the \,):

0.2\,4,000088797e02a729c72ff07bb9c5a84ef27e6b7c,1541592473000,1545585645000,0.3.0,1,NULL,NULL

Edit to answer questions: the data type for these columns is varchar(255). What I'd like to see is the commas being inserted as data, e.g. in this case 0.2,4 should be the data in the first column.

Full error message (with names scrubbed for privacy)

2025-02-04 12:00:40 2025-02-04 17:00:40.374 UTC [89725] STATEMENT:  COPY table_name (***, ***, ***, ***, ***, ***, ***, ***, ***) FROM STDIN with csv"
2025-02-04 12:03:48 2025-02-04 17:03:48.733 UTC [89890] ERROR:  extra data after last expected column
2025-02-04 12:03:48 2025-02-04 17:03:48.733 UTC [89890] CONTEXT:  COPY table_name, line 10: "  0.2\,4,000088797e02a729c72ff07bb9c5a84ef27e6b7c,1541592473000,1..."

Also, I found a solution(?) where I don't use CSV. I don't know why this is working, but it is: all the lines are being imported and they look correct. This might turn out to be all I need (doing more testing).

"COPY temporary_%s (%s) FROM STDIN DELIMITER ','";

asked Feb 4 at 17:29 by b12, edited Feb 4 at 18:18
  • 1) For CSV format DELIMITER ',' is redundant as that is the default for CSV. 2) ESCAPE is for escaping the QUOTE value which by default for CSV is ". 3) What type is 0.2\,4 supposed to end up as? 4) What is the complete error message from the Postgres log? Add information as update to question text. – Adrian Klaver Commented Feb 4 at 17:59
  • When 0.2\,4 should be a value for a single column, you must put this content between double quotes " in your csv. Or don't treat it as csv, since it's not csv, what seems to work. – Frank Heikens Commented Feb 4 at 18:25
  • 1) If the editing is correct, you are specifying 9 column names in the COPY command but only want to enter 8 values, if I count correctly. 2) Your second form is using the text format, which per the COPY docs: "Backslash characters (\) can be used in the COPY data to quote data characters that might otherwise be taken as row or column delimiters." – Adrian Klaver Commented Feb 4 at 18:28
  • 1.) Most lines import successfully, just the ones with \, don't. I may have got the column count wrong in transcription while removing columns to simplify the question and get rid of proprietary data. 2.) Edit: ahh, I think I see what you're saying now. So in the CSV format, backslash can't be used to escape data characters, but in the text format, it can? – b12 Commented Feb 4 at 18:30
  • You cannot use a backslash to escape a delimiter in a CSV file. It has to look like "0.2,4". You won't be able to load that file with COPY. – Laurenz Albe Commented Feb 4 at 21:00

1 Answer

From the COPY documentation:

File Formats

Text Format

Backslash characters (\) can be used in the COPY data to quote data characters that might otherwise be taken as row or column delimiters. In particular, the following characters must be preceded by a backslash if they appear as part of a column value: backslash itself, newline, carriage return, and the current delimiter character.

Therefore, using psql:

\! cat backslash.csv 
0.2\,4,000088797e02a729c72ff07bb9c5a84ef27e6b7c,1541592473000
.5,000088797e02a729c72ff07bb9c5a84ef27e6b7c,1545585645000

create table csv_import (fld1 varchar, fld2 varchar, fld3 varchar);

\copy csv_import from backslash.csv delimiter ','
COPY 2

select * from csv_import ;
 fld1  |                   fld2                   |     fld3      
-------+------------------------------------------+---------------
 0.2,4 | 000088797e02a729c72ff07bb9c5a84ef27e6b7c | 1541592473000
 .5    | 000088797e02a729c72ff07bb9c5a84ef27e6b7c | 1545585645000

\copy is a client-side (psql) instance of the COPY command, per:

https://www.postgresql.org/docs/current/app-psql.html

Performs a frontend (client) copy. This is an operation that runs an SQL COPY command, but instead of the server reading or writing the specified file, psql reads or writes the file and routes the data between the server and the local file system. This means that file accessibility and privileges are those of the local user, not the server, and no SQL superuser privileges are required.
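
To mirror the answer's SELECT check from the Java side of the program, a hypothetical sanity query after copyIn could look something like this (the temporary_%s table-name pattern and the LIMIT are illustrative assumptions, not code from the question or answer):

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class CopyCheck {

        // Print the first few imported rows; where the file had the escaped
        // value 0.2\,4, the first column should now contain the plain "0.2,4".
        public static void printSample(Connection connection, String tableName) throws Exception {
            String sql = String.format("SELECT * FROM temporary_%s LIMIT 5", tableName);
            try (Statement stmt = connection.createStatement();
                 ResultSet rs = stmt.executeQuery(sql)) {
                int columnCount = rs.getMetaData().getColumnCount();
                while (rs.next()) {
                    StringBuilder row = new StringBuilder();
                    for (int i = 1; i <= columnCount; i++) {
                        if (i > 1) {
                            row.append(" | ");
                        }
                        row.append(rs.getString(i));
                    }
                    System.out.println(row);
                }
            }
        }
    }

If the import behaves the way the psql demo above shows, the affected rows should print 0.2,4 in the first column rather than 0.2\,4.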
