I am trying to capture one or two pieces of information per line. On regexr my expression works and captures as it should, but when I run it in code (on the same data as in regexr) it only captures from a single string and returns null for the rest.
I have also tried building the expression here
When I switch to the JS flavor, the color overlays suggest the capturing groups are not working, but the side pane shows them captured correctly. Even the simplest capturing group seems not to work.
What am I missing?
Input is:
<@U0BUPU9QQ> 49
50
<@U0BUPU9QQ>
<@U0BUPU9QQ> noget 49 noget andet tekst 5 40
<@U0BUPU9QQ> noget andet tekst 5 40
<@U0BUPU9QQ|mn> has joined the channel
Output:
Should be the ID inside the <> (except the @) and the last group of digits in the line; if there is no ID, then only the digits.
- what is the expected input and output? – vks Commented Oct 20, 2015 at 12:42
- use .exec and loop it. – YOU Commented Oct 20, 2015 at 12:43
- When using the code generator from regex101, I get something like what you describe. It still produces the "wrong" output. – Mathias Nielsen Commented Oct 20, 2015 at 12:56
- Please write in words what you expect to obtain from, say, <@U0BUPU9QQ> noget 49 noget andet tekst 5 40. Both U0BUPU9QQ and 40? – Wiktor Stribiżew Commented Oct 20, 2015 at 12:57
- @MathiasNielsen stackoverflow./questions/432493/… you need to capture groups – vks Commented Oct 20, 2015 at 12:57
1 Answer
Do not pay attention to the highlighting groups on regex101 for JS: if you see them in the MATCH INFORMATION pane on the right, they are matched and captured correctly.
In JS, here is the code that will fetch the capture groups (note that m[1] is the first capture group text, m[2] is the second group text, etc.):
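// m flag: ^ matches at the start of every line; g flag: exec() can be called in a loop.
var re = /^(?:<@([A-Z0-9]+)>)?.*\b([0-9]+)/gm;
var str = '<@U0BUPU9QQ> 49\n50\n<@U0BUPU9QQ>\n<@U0BUPU9QQ> noget 49 noget andet tekst 5 40\n<@U0BUPU9QQ> noget andet tekst 5 40\n<@U0BUPU9QQ|mn> has joined the channel';
var m;
// Each exec() call resumes at re.lastIndex and returns null once no match is left.
while ((m = re.exec(str)) !== null) {
  // m[1] is undefined when the line has no <@ID> prefix.
  document.write(m[1] + "<br/>" + m[2] + "<br/><br/>");
}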
Notes on the regex itself:
- ^ - Start matching at the beginning of the line (due to the m modifier)
- (?:<@([A-Z0-9]+)>)? - an optional (due to the ? quantifier) group matching
  - <@ - literal <@ symbols
  - ([A-Z0-9]+) - (Capture group 1) 1 or more uppercase letters or digits
  - > - closing angle bracket
- .* - 0 or more characters other than a newline (as many as possible)
- \b([0-9]+) - (Capture group 2) 1 or more digits that are preceded by a word boundary
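To see the two groups in isolation, here is a minimal sketch that applies the same pattern to one line at a time instead of looping over the whole string. The helper name parseLine and the { id, number } result shape are illustrative assumptions, not part of the original answer.

```javascript
// Hypothetical helper: returns { id, number } for a line, or null if the line
// contains no standalone digit sequence.
function parseLine(line) {
  // Same pattern as above, without g/m because a single line is tested.
  const re = /^(?:<@([A-Z0-9]+)>)?.*\b([0-9]+)/;
  const m = re.exec(line);
  if (m === null) {
    return null;
  }
  // m[1] is undefined when the optional <@ID> group did not participate.
  return { id: m[1] === undefined ? null : m[1], number: m[2] };
}

// Sample lines taken from the question.
const lines = [
  '<@U0BUPU9QQ> 49',
  '50',
  '<@U0BUPU9QQ>',
  '<@U0BUPU9QQ> noget 49 noget andet tekst 5 40',
  '<@U0BUPU9QQ> noget andet tekst 5 40',
  '<@U0BUPU9QQ|mn> has joined the channel'
];

const results = lines.map(parseLine);
console.log(results);
```

Lines without a standalone digit sequence (the third and the last sample line) yield null, which is how the pattern behaves on them, so callers need to handle that case.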
You can adjust the regex as per your requirements. Right now, it will match the ID (= the symbols inside the optional <@...>) and the last digit sequence on a line. If you need the first digit sequence, use lazy matching .*? instead of the greedy one (.*).
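The difference between the greedy and lazy variants can be checked on one of the sample lines. This is a small sketch; the variable names are made up for illustration.

```javascript
// One of the sample lines from the question, containing several digit groups.
const sample = '<@U0BUPU9QQ> noget 49 noget andet tekst 5 40';

// Greedy .* consumes as much as possible, so group 2 is the LAST digit sequence.
const greedy = /^(?:<@([A-Z0-9]+)>)?.*\b([0-9]+)/.exec(sample);

// Lazy .*? consumes as little as possible, so group 2 is the FIRST digit sequence.
const lazy = /^(?:<@([A-Z0-9]+)>)?.*?\b([0-9]+)/.exec(sample);

console.log(greedy[1], greedy[2]); // U0BUPU9QQ 40
console.log(lazy[1], lazy[2]);     // U0BUPU9QQ 49
```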