javascript - Node.js: remove specific columns from CSV file - Stack Overflow

I have a CSV file that can contain around a million records. How can I remove the columns whose names start with _ and generate the resulting CSV?

For the sake of simplicity, consider the CSV below:

Sr.No Col1 Col2 _Col3   Col4 _Col5
1     txt  png  676766  win  8787
2     jpg  pdf  565657  lin  8787
3     pdf  jpg  786786  lin  9898

I would want the output to be:


Sr.No Col1 Col2 Col4
1     txt  png  win 
2     jpg  pdf  lin 
3     pdf  jpg  lin

Do I need to read the entire file to achieve this, or is there a better approach?

const csv = require('csv-parser');
const fs = require('fs');

fs.createReadStream('data.csv')
  .pipe(csv())
  .on('data', (row) => {
    // generate a new csv with removing specific column
  })
  .on('end', () => {
    console.log('CSV file successfully processed');
  });

Any help on how I can achieve this would be appreciated.

Thanks.

Asked Jul 1, 2020 at 10:40 by opensource-developer, edited Jul 1, 2020 at 11:10

4 Answers

To anyone who stumbles on this post:

I was able to transform the CSVs with the code below, using the fs and csv modules.

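const fs = require('fs');
const csv = require('csv');

// m.path and transformedPath come from the surrounding code (input and output file paths).
fs.createReadStream(m.path)
      .pipe(csv.parse({delimiter: '\t', columns: true}))
      .pipe(csv.transform((input) => {
        // Remove the unwanted column from each parsed row.
        delete input['_Col3'];
        return input;
      }))
      .pipe(csv.stringify({header: true}))
      .pipe(fs.createWriteStream(transformedPath))
      .on('finish', () => {
        console.log('finish....');
      }).on('error', (err) => {
        // Only errors from the write stream reach this handler.
        console.log('error.....', err);
      });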

Source: https://gist.github.com/donmccurdy/6cbcd8cee74301f92b4400b376efda1d
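For columns chosen by prefix rather than by name, the same streaming idea can be sketched with only Node's built-in modules. This is a rough sketch, not a drop-in replacement for a CSV parser: it assumes tab-delimited input with the header on the first line and no quoted fields containing tabs or newlines. The helper names (keptIndices, pickFields, stripColumns) are made up for illustration.

```javascript
const fs = require('fs');
const readline = require('readline');
const { once } = require('events');

// Indices of the header cells that do NOT start with the given prefix.
function keptIndices(headerLine, delimiter = '\t', prefix = '_') {
  return headerLine
    .split(delimiter)
    .map((name, i) => (name.trim().startsWith(prefix) ? -1 : i))
    .filter((i) => i !== -1);
}

// Keep only the fields at the given indices (missing fields become empty strings).
function pickFields(line, indices, delimiter = '\t') {
  const fields = line.split(delimiter);
  return indices.map((i) => fields[i] ?? '').join(delimiter);
}

// Hypothetical helper: copy inputPath to outputPath one line at a time,
// so the whole file is never held in memory.
async function stripColumns(inputPath, outputPath, delimiter = '\t') {
  const rl = readline.createInterface({
    input: fs.createReadStream(inputPath),
    crlfDelay: Infinity,
  });
  const out = fs.createWriteStream(outputPath);
  let indices = null;
  for await (const line of rl) {
    if (line === '') continue; // skip blank lines
    // The first non-blank line is treated as the header.
    if (indices === null) indices = keptIndices(line, delimiter);
    // Wait for 'drain' when the write buffer is full (backpressure).
    if (!out.write(pickFields(line, indices, delimiter) + '\n')) {
      await once(out, 'drain');
    }
  }
  out.end();
  await once(out, 'finish');
}
```

Calling stripColumns('data.csv', 'out.csv') would then write the filtered file; both file names are placeholders. The column indices are computed once from the header and reused for every row, which keeps the per-line work small.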

Try this with the csv library:

const csv = require('csv');
const fs = require('fs');

const csvString=`col1,col2
               value1,value2`

csv.parse(csvString, {columns: true})
   .pipe(csv.transform(({col1,col2}) => ({col1}))) // remove col2
   .pipe(csv.stringify({header:true}))
   .pipe(fs.createWriteStream('./file.csv'))

Actually, you can handle this with two npm packages.

Use https://www.npmjs.com/package/csvtojson to convert your CSV to JSON format,

then use https://www.npmjs.com/package/json2csv to convert it back to CSV.

With the second library, if you know exactly which fields you want, you can pass parameters to select only those fields.

const { Parser } = require('json2csv');
 
const fields = ['field1', 'field2', 'field3'];
const opts = { fields };
 
try {
  const parser = new Parser(opts);
  const csv = parser.parse(myData);
  console.log(csv);
} catch (err) {
  console.error(err);
}

Or you can modify the JSON objects manually to drop those columns.
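A minimal sketch of that manual step, assuming each row is a plain object keyed by header name (the shape csvtojson is expected to produce). The dropPrefixedKeys helper and the sample rows are illustrative and not part of either package:

```javascript
// Return a copy of a row object without the keys that start with the prefix.
function dropPrefixedKeys(row, prefix = '_') {
  return Object.fromEntries(
    Object.entries(row).filter(([key]) => !key.startsWith(prefix))
  );
}

// Sample rows shaped like the question's data.
const rows = [
  { 'Sr.No': '1', Col1: 'txt', Col2: 'png', _Col3: '676766', Col4: 'win', _Col5: '8787' },
  { 'Sr.No': '2', Col1: 'jpg', Col2: 'pdf', _Col3: '565657', Col4: 'lin', _Col5: '8787' },
];

// Each cleaned row keeps only the columns without the underscore prefix.
const cleaned = rows.map((row) => dropPrefixedKeys(row));
console.log(cleaned);
```

The cleaned array can then be handed to the json2csv parser without listing the fields by hand. Note that this route keeps all rows in memory, so the streaming approach may suit very large files better.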

With this function I accomplished the column removal from a CSV string:

// Note: splits on plain commas, so quoted fields that contain commas are not supported.
function removeCol(csv, col) {
   const lines = csv.split("\n");
   // Locate the column by its trimmed header name.
   const index = lines[0].split(",").findIndex(h => h.trim() === col);
   // Column not found: return the input unchanged
   // (splice(-1, 1) would otherwise drop the last column).
   if (index === -1) return csv;
   return lines
       .filter(line => line.length > 0) // ignore blank lines, e.g. after a trailing newline
       .map((line) => {
           const fields = line.split(",");
           fields.splice(index, 1);
           return fields.join(',');
       })
       .join('\n') + '\n';
}