I have a .txt file that is space-delimited and it contains dupes. I want to remove the dupes, but I'm not finding it an easy task.
The file contains: orange orange apple apple pear
At first, I was getting an error because of the .txt extension. I updated my main file to contain:
const fs = require('fs');
require.extensions['.txt'] = function (module, filename) {
  module.exports = fs.readFileSync(filename, 'utf8');
};
That helped with the errors and I was able to create a const after that:
const fruitList = require('../support/fruitList.txt');
However, I am still unable to remove the dupes. I tried neek and that was not working either.
4 Answers
You can use a Set to remove the duplicates from your array.
let fruitList = ["orange", "orange", "apple", "apple", "pear"];
let fruitSet = new Set(fruitList); // Set {"orange", "apple", "pear"}
// convert back to an array
const newArray = [...fruitSet]; // ["orange", "apple", "pear"]
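Since the .txt require hook returns the file contents as a single string, you need to split it into an array first. A minimal sketch of tying the two steps together, assuming the space-delimited format and the ../support/fruitList.txt path from the question:
const fruitList = require('../support/fruitList.txt');
// split the space-delimited string into an array of words
const fruits = fruitList.trim().split(/\s+/);
// de-duplicate via a Set and spread back into an array
const uniqueFruits = [...new Set(fruits)];
console.log(uniqueFruits); // ["orange", "apple", "pear"]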
An important thing is to catch any errors thrown by readFileSync so you can find out why your file isn't being read. Depending on how your data is formatted, you'll usually want to handle all the delimiters: tabs, spaces and newlines. The code below uses a regex in split to do that and puts all your values in an array. The following filter then uses indexOf to throw out duplicates. Try this:
const fs = require('fs')
try {
  let data = fs.readFileSync('test.txt', 'utf8')
  // split data by tabs, newlines and spaces
  data = data.split(/[\n\t ]+/)
  // keep only the first occurrence of each item, removing duplicates
  const result = data.filter((item, pos) => data.indexOf(item) === pos)
  console.log(result)
} catch (e) {
  console.log('Error:', e.stack)
}
Spreading a Set, as shown in Juan's answer, is a considerably faster method than filter for removing duplicates:
let data = 'orange orange apple apple pear orange orange apple apple pear orange orange apple apple pear orange orange apple apple pear orange orange apple apple pear orange orange apple apple pear orange orange apple apple pear'
data = data.split(/[\n\t ]+/)
console.time('method1')
const firstArr = data.filter((item, pos, arr) => arr.indexOf(item) === pos)
console.timeEnd('method1')
console.time('method2')
const secondArr = [...new Set(data)]
console.timeEnd('method2')
console.log('method1', firstArr, 'method2', secondArr)
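The gap grows with input size: filter with indexOf rescans the array for every element, which is roughly O(n²) overall, while a Set is built in a single pass with near-constant-time lookups, roughly O(n).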
You can do it in a single line:
const fruitList = [...new Set(require('../support/fruitList.txt').split(' '))];
See the thorough discussion in this question.
I have just written a function for my gulp configuration to remove duplicated lines. In my case the separator is \n: I use it to split the file's text into an array I can process, then join the unique lines with the same \n and write them back to the very same file. You can use another separator, for example whitespace (\s) or a symbol such as ',' or ';', to turn your text into an array and remove the duplicated items.
import fs from 'fs';

export async function removeDuplicates() {
  const filePath = './src/ads.txt';
  try {
    // read the file, split it into lines, and keep only the unique ones
    const data = fs.readFileSync(filePath, 'utf-8');
    const lines = data.split('\n');
    const uniqueLines = Array.from(new Set(lines));
    const result = uniqueLines.join('\n');
    // write the de-duplicated text back to the same file
    fs.writeFileSync(filePath, result, 'utf-8');
    console.log('Duplicated lines have been successfully removed.');
  } catch (error) {
    console.error('Error while processing the operation: ', error);
  }
}

export { removeDuplicates as rmvreplicas };
After adding the function to your gulpfile, you can start it by executing:
$ gulp rmvreplicas
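For the space-delimited file from the question, a minimal variant of the same idea would split on whitespace and join with a single space instead of '\n'. The ./support/fruitList.txt path here is hypothetical, used only for illustration:
import fs from 'fs';

export function removeDuplicateWords() {
  const filePath = './support/fruitList.txt'; // hypothetical path for illustration
  const data = fs.readFileSync(filePath, 'utf-8');
  // split on any whitespace instead of '\n'
  const words = data.trim().split(/\s+/);
  // de-duplicate and write the words back, space-separated
  fs.writeFileSync(filePath, [...new Set(words)].join(' '), 'utf-8');
}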