I have already parsed a string up to index idx. My next parse step uses a RegExp. It needs to match the next part of the string, i.e. starting from position idx. How do I do this efficiently?
For example:
let myString = "<p>ONE</p><p>TWO</p>"
let idx
// some code not shown here parses the first paragraph
// and updates idx
idx = 10
// next parse step must continue from idx
let myRegex = /<p>[^<]*<\/p>/
let subbed = myString.substring(idx)
let result = myRegex.exec(subbed)
console.log(result) // "<p>TWO</p>", not "<p>ONE</p>"
But myString.substring(idx) seems like a quite expensive operation.
Are there no regex operations like this: result = myRegex.execFromIndex(idx, myString);?
In general, I want to start regex matching from different indexes so I can exclude parts of the string and avoid matching what has already been parsed. One time it might be from myString[0], another time from myString[51], and so on.
Is there a way to do this efficiently? I'm parsing hundreds of thousands of lines and want to do this as cheaply as possible.
asked Jan 18, 2017 at 15:43 by mottosson, edited Feb 9, 2022 at 6:25 by Inigo
2 Answers
Use RegExp.exec and lastIndex
- Create a RegExp with the y or g flag
  - with the y flag, the match must start exactly at the specified start index
  - with the g flag, the match can occur anywhere at or after the specified index
- Set its lastIndex property to the start index
- Call exec
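To make the difference between the two flags concrete, here is a minimal sketch. The input string (with two filler characters between the paragraphs) and the variable names are made up for illustration.

```javascript
// Illustrative input: "xx" sits between the two paragraphs at indexes 10-11.
const text = "<p>ONE</p>xx<p>TWO</p>";

const stickyRe = /<p>[^<]*<\/p>/y; // y: match must start exactly at lastIndex
const globalRe = /<p>[^<]*<\/p>/g; // g: match may start at lastIndex or later

stickyRe.lastIndex = 10; // index 10 holds "x", not "<"
globalRe.lastIndex = 10;

const stickyResult = stickyRe.exec(text); // null: nothing matches exactly at 10
const globalResult = globalRe.exec(text); // scans forward and finds the second paragraph

console.log(stickyResult);       // null
console.log(globalResult[0]);    // "<p>TWO</p>"
console.log(globalResult.index); // 12
```

Note that a failed exec resets lastIndex to 0, so it has to be set again before the next attempt.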
I've applied the above steps to your example code:
let myString = "<p>ONE</p><p>TWO</p>"
let idx
// some code not shown here parses the first paragraph
// and updates idx
idx = 10
// next parse step must continue from idx
let myRegex = /<p>[^<]*<\/p>/y // sticky flag: the match must start exactly at lastIndex
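The remaining two steps (setting lastIndex and calling exec) could look like the following sketch. It is self-contained, so it repeats the setup from the question. The execFromIndex helper is hypothetical: it only mirrors the call shape asked about in the question and is not a built-in method.

```javascript
const myString = "<p>ONE</p><p>TWO</p>";
const idx = 10; // position where the previous parse step stopped

// Sticky regex: a match must begin exactly at lastIndex.
const myRegex = /<p>[^<]*<\/p>/y;

// Start matching at idx directly on the original string, with no substring copy.
myRegex.lastIndex = idx;
const result = myRegex.exec(myString);

console.log(result[0]);         // "<p>TWO</p>"
console.log(result.index);      // 10
console.log(myRegex.lastIndex); // 20, where the next parse step would continue

// Hypothetical helper (not part of RegExp). The regex passed in must carry
// the y or g flag, otherwise exec ignores lastIndex and searches from 0.
function execFromIndex(regex, startIndex, str) {
  regex.lastIndex = startIndex;
  return regex.exec(str);
}

const first = execFromIndex(myRegex, 0, myString);
console.log(first[0]); // "<p>ONE</p>"
```

After a successful match, lastIndex points just past the match, so it can serve as the updated idx for the next parse step.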
myString.length – Pranav C Balan Commented Jan 18, 2017 at 15:45
.lastIndex property. Read the documentation. – Pointy Commented Jan 18, 2017 at 15:49