I'm developing code that, after the user selects a directory, displays a table of the files in that location with their details (name, type, size, ...).
A directory may contain a lot of files.
I have managed to do that. My problem is that I also want to display the number of lines in each file. I can get the number of lines using this JavaScript code:
var reader = new FileReader();
var textFile = $("#file").get(0).files[0];
reader.readAsText(textFile);
$(reader).on('load', processFile);
/* And in processFile() I use this line to get the number of lines: */
nbLines = (file.split("\n")).length;
The above code works as expected and gives me what I want, but it may be a heavy process if there are many files in the selected directory!
The question: is there a way to get the number of lines in a text file without reading it?
Regards!
asked Aug 11, 2017 at 8:18 by Hamza Abdaoui
4 Answers
You can't count the number of lines in a file without reading it. The operating systems your code runs on do not store the number of lines as some kind of metadata. They don't even generally distinguish between binary and text files! You just have to read the file and count the newlines.
However, you can probably do this faster than you are doing it now, if your files have a large number of lines.
This line of code is what I'm worried about:
nbLines = (file.split("\n")).length;
Calling split here creates a large number of memory allocations, one for each line in the file.
My hunch is that it would be faster to count the newlines directly in a for loop:
function lineCount( text ) {
var nLines = 0;
for( var i = 0, n = text.length; i < n; ++i ) {
if( text[i] === '\n' ) {
++nLines;
}
}
return nLines;
}
This counts the newline characters without any memory allocations, and most JavaScript engines should do a good job of optimizing this code.
You may also want to adjust the final count slightly depending on whether the file ends with a newline or not, according to how you want to interpret that. But don't do that inside the loop, do it afterward.
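To illustrate that last point, here is a minimal sketch of one possible interpretation: a final line that lacks a trailing newline still counts as a line. The name countLines and this choice of convention are my own assumptions, so adapt them to how you want to treat the end of the file.

```javascript
// Counts lines in a string. Convention assumed here (one of several
// reasonable ones): a non-empty last line without "\n" still counts.
function countLines(text) {
  var newlines = 0;
  for (var i = 0, n = text.length; i < n; ++i) {
    if (text[i] === '\n') {
      ++newlines;
    }
  }
  // Adjust once, after the loop, not inside it.
  if (text.length > 0 && text[text.length - 1] !== '\n') {
    return newlines + 1;
  }
  return newlines;
}
```

With this convention an empty string has 0 lines, and "a\nb" and "a\nb\n" both have 2.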
There is no way to know the number of lines without opening the file. The performance issue you are seeing most probably comes from .split(). You load the file into memory as a string and then generate as many strings as there are lines in the file. If a file contains 1000 lines, the resulting RAM usage is 1 string (the whole file) plus 1000 strings (one per line).
I would recommend changing this to a count using a regular expression. Here's an example:
var file = ("this\nis a string\n with new\nlines");
var match = file.match(/\r?\n/g) || []; // match() returns null when there is no newline
alert(match.length);
Keep in mind that a different regex might be required depending on your files. This should improve performance, since each match is a very short string rather than a whole line.
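As an example of a "different regex", here is a sketch that also accepts a lone "\r" as a line break. It assumes your files may mix Unix, Windows and old Mac line endings; the names LINE_BREAK and countLineBreaks are hypothetical.

```javascript
// Matches "\r\n" first so a Windows line ending is counted once,
// then a lone "\n" (Unix) or a lone "\r" (classic Mac OS).
var LINE_BREAK = /\r\n|\n|\r/g;

function countLineBreaks(text) {
  // String.prototype.match returns null when nothing matches.
  var matches = text.match(LINE_BREAK);
  return matches ? matches.length : 0;
}
```

For instance, countLineBreaks("this\nis a string\n with new\nlines") gives 3, the same as the example above.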
Update for 2021:
Reading the whole file as text and splitting it is rarely a good idea. Modern JavaScript engines are very fast at running loops, so looping over the data will generally be faster than loading the text into memory and splitting it.
For Node.js, see the readline module. Although such operations are not usually recommended in Node because it is single-threaded, I can read big CSVs pretty fast using readline.
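One way to apply this in the browser is to skip the text decoding and count newline bytes chunk by chunk. This is only a sketch: it assumes the files are ASCII or UTF-8 (where byte 10 is always "\n"), that Blob.prototype.slice and Blob.prototype.arrayBuffer are available, and the names countNewlineBytes and countLinesInBlob plus the 1 MiB chunk size are my own illustrative choices.

```javascript
// Count 0x0A ("\n") bytes in one chunk of raw file data.
function countNewlineBytes(bytes) {
  var count = 0;
  for (var i = 0, n = bytes.length; i < n; ++i) {
    if (bytes[i] === 10) {
      ++count;
    }
  }
  return count;
}

// Read a File/Blob slice by slice so only one chunk is held at a time.
// chunkSize is an arbitrary default, not a tuned value.
async function countLinesInBlob(blob, chunkSize) {
  chunkSize = chunkSize || 1024 * 1024;
  var total = 0;
  for (var offset = 0; offset < blob.size; offset += chunkSize) {
    var buffer = await blob.slice(offset, offset + chunkSize).arrayBuffer();
    total += countNewlineBytes(new Uint8Array(buffer));
  }
  return total;
}
```

You would call countLinesInBlob with each entry of the file input's files list. It would not work for UTF-16 files, where a newline is two bytes.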
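A rough sketch of what that could look like with Node's built-in fs and readline modules. The function name countLinesInFile is hypothetical, and note that readline also emits a final line that has no trailing newline.

```javascript
const fs = require('fs');
const readline = require('readline');

// Resolves with the number of 'line' events readline emits for the file.
function countLinesInFile(filePath) {
  return new Promise(function (resolve, reject) {
    var count = 0;
    var input = fs.createReadStream(filePath);
    input.on('error', reject);
    var rl = readline.createInterface({
      input: input,
      crlfDelay: Infinity // treat "\r\n" as a single line break
    });
    rl.on('line', function () { ++count; });
    rl.on('close', function () { resolve(count); });
  });
}
```

The file is streamed, so it is never held in memory as a whole.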
A text editor usually shows a status line at the bottom of the screen that displays the line and column where the cursor is located. In that case, if you place the cursor on the last character of the file, the line number shown is the total number of lines.
wc -l gives the number of lines in a file. – Dimitri, commented Aug 11, 2017 at 8:21