I have the following string:
[Example] öäüß asdf 1234 (1aö) (not necessary),
Explanation:
- [Example] - optional, not needed
- öäüß asdf 1234 - the most important part, which I need. Any character, number, or special character, including German characters like äÄöÖüÜß, can be found here. A greedy selection might be the best solution to avoid problems with characters like the German ones, right?
- (1aö) - optional and needed
- (not necessary) - optional, not needed. If it appears, it could be (not ...) or (unusual)
- , - the comma can be optional, too, but is also not needed.
I use the following RegEx: /(?:\[.*\]\s)?(?<name>.*?)(?:\s\([not|unusual].*?\))?\,/g
The problems:
- When I use the optional quantifier ? at the comma, it splits the whole string into separate characters.
- When I change the non-greedy selection in the name group to a greedy one, the optional comma is separated. But now the example string starting with ö is selected up to the end.
- The string inside the first standard brackets () can start with upper or lower case. At this moment I can only recognize upper case.
Here's my attempt at regex101 with a bunch of examples: https://regex101.com/r/Lx2anw/1
Sorry for the quite specific question, but I'm at the end with my knowledge ...
Does anyone have suggestions what I can do here?
Asked Oct 22, 2022 at 10:43 by GePu; edited Oct 22, 2022 at 16:41 by colinD.
- Are you using a programming language here? – Tim Biegeleisen Commented Oct 22, 2022 at 10:46
- Yes. I'm using this in JavaScript to match it out of a selected text. – GePu Commented Oct 22, 2022 at 10:50
- How about something like this with use of negated classes. – bobble bubble Commented Oct 22, 2022 at 11:07
- @GePu Like this? regex101.com/r/ZMnnoh/1 – The fourth bird Commented Oct 22, 2022 at 11:09
- @Luuk Thank you, indeed. I've not read about that before :) Maybe just like this one or the pattern from the 4th bird. – bobble bubble Commented Oct 22, 2022 at 11:14
3 Answers
You can use
^(?:\[.*?]\s)?(?<name>.*?)(?:\s\((?:not|unusual)[^()]*\))?,?\s*$
See the regex demo.
Details:
^ - start of string
(?:\[.*?]\s)? - an optional sequence of [...] and a whitespace
(?<name>.*?) - Group "name": any zero or more chars, as few as possible
(?:\s\((?:not|unusual)[^()]*\))? - an optional sequence of a whitespace, (, not or unusual, then zero or more chars other than ( and ), and then a ) char
,? - an optional comma
\s* - zero or more whitespaces
$ - end of string
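As a rough illustration of how this pattern could be used from JavaScript (the language mentioned in the comments), the sketch below applies it to one line at a time and reads the name group. The helper name extractName and the second and third sample inputs are made up for this illustration; only the first input comes from the question.

```javascript
// Pattern from the answer above, applied to a single line (no g or m flag).
const namePattern = /^(?:\[.*?]\s)?(?<name>.*?)(?:\s\((?:not|unusual)[^()]*\))?,?\s*$/;

// Hypothetical helper: returns the "name" group, or null if the line does not match.
function extractName(line) {
  const match = namePattern.exec(line);
  return match ? match.groups.name : null;
}

console.log(extractName("[Example] öäüß asdf 1234 (1aö) (not necessary),"));
// "öäüß asdf 1234 (1aö)"
console.log(extractName("öäüß asdf 1234 (1aö)"));
// "öäüß asdf 1234 (1aö)"
console.log(extractName("asdf (unusual),"));
// "asdf"
```

For a multi-line selection, one option is to split the text on line breaks and pass each line to the helper.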
Your pattern matches the rest of the line in group 1 because all that follows in the pattern after group name is optional.
Note that you use a character class [not|unusual], but you should use a group such as (?:not|unusual) if you want to match one of the alternatives.
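Both points can be checked quickly in JavaScript. The sketch below uses the question's pattern with a greedy name group and an optional comma (reconstructed from the question's description, so treat it as an assumption) and then compares the character class with a group. The strings "(alpha)" and "(Alpha)" are illustrative and do not come from the question.

```javascript
// Question's pattern with a greedy name group and an optional comma
// (reconstructed from the question's description; an assumption).
const greedyPattern = /(?:\[.*\]\s)?(?<name>.*)(?:\s\([not|unusual].*?\))?,?/;
const sampleLine = "[Example] öäüß asdf 1234 (1aö) (not necessary),";

// Everything after the name group is optional, so .* keeps the rest of the line.
const greedyName = greedyPattern.exec(sampleLine).groups.name;
console.log(greedyName); // "öäüß asdf 1234 (1aö) (not necessary),"

// [not|unusual] is a character class: it matches ONE of the characters
// n, o, t, |, u, s, a, l, not the words "not" or "unusual".
const charClass = /^\([not|unusual]/;
const wordGroup = /^\((?:not|unusual)/;

console.log(charClass.test("(alpha)"));         // true: "a" is in the class
console.log(charClass.test("(Alpha)"));         // false: "A" is not in the class
console.log(wordGroup.test("(alpha)"));         // false
console.log(wordGroup.test("(not necessary)")); // true
```

This may also be related to the upper/lower case behaviour described in the question, since several lowercase letters happen to be members of the class.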
You might also match any character except parentheses or a comma at the end of the string, and then match an optional part between parentheses.
^(?:\[[^\][\n]*\]\s)?(?<name>(?:(?!,\s*$)[^\n()])+(?:\([^()\n]*\))?)
Explanation
^ Start of string
(?:\[[^\][\n]*\]\s)? Optionally match [...]
(?<name> Group name
(?: Non capture group
(?!,\s*$)[^\n()] If we are not looking at a trailing comma, match any character except ( ) or a newline
)+ Close the non capture group and repeat 1 or more times to not match an empty line
(?:\([^()\n]*\))? Optionally match a part from (...)
) Close group name
Regex demo
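A minimal JavaScript sketch of this pattern, assuming the + quantifier described in the explanation and one line of input per call. The helper name matchName and the second and third sample inputs are hypothetical.

```javascript
// Pattern from this answer (with + after the non-capturing group).
// No m flag is set, so $ means end of input.
const negatedClassPattern =
  /^(?:\[[^\][\n]*\]\s)?(?<name>(?:(?!,\s*$)[^\n()])+(?:\([^()\n]*\))?)/;

// Hypothetical helper: returns the "name" group, or null if nothing matches.
function matchName(line) {
  const match = negatedClassPattern.exec(line);
  return match ? match.groups.name : null;
}

console.log(matchName("[Example] öäüß asdf 1234 (1aö) (not necessary),"));
// "öäüß asdf 1234 (1aö)"
console.log(matchName("öäüß asdf 1234,"));
// "öäüß asdf 1234"
console.log(matchName("asdf (unusual),"));
// "asdf (unusual)" because this variant accepts any first parenthesized part
```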
If the first part between parentheses should not start with the words not or unusual, you can assert that using a negative lookahead: (?!not\b|unusual\b)
^(?:\[[^\][\n]*\]\s)?(?<name>(?:(?!,\s*$)[^\n()])+(?:\((?!not\b|unusual\b)[^()\n]*\))?)
Regex demo
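The lookahead variant can be tried the same way. One thing this sketch suggests: when the parenthesized part is rejected by the lookahead, the captured name still ends with the space that preceded it, so trimming the result may be useful. The helper name matchTrimmedName and the sample inputs other than the question's example string are assumptions for illustration.

```javascript
// Variant with the negative lookahead (?!not\b|unusual\b) inside the parentheses.
const lookaheadPattern =
  /^(?:\[[^\][\n]*\]\s)?(?<name>(?:(?!,\s*$)[^\n()])+(?:\((?!not\b|unusual\b)[^()\n]*\))?)/;

// Raw capture, kept separate to show the trailing space.
const rawName = lookaheadPattern.exec("asdf (unusual),").groups.name;
console.log(JSON.stringify(rawName)); // "asdf " (note the trailing space)

// Hypothetical helper: returns the trimmed "name" group, or null if nothing matches.
function matchTrimmedName(line) {
  const match = lookaheadPattern.exec(line);
  return match ? match.groups.name.trim() : null;
}

console.log(matchTrimmedName("asdf (unusual),"));
// "asdf"
console.log(matchTrimmedName("[Example] öäüß asdf 1234 (1aö) (not necessary),"));
// "öäüß asdf 1234 (1aö)"
console.log(matchTrimmedName("asdf (notable),"));
// "asdf (notable)" because \b requires "not" to be a whole word
```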
It will work if you put every expression that you don't want to capture in its own non-capturing group. Your expression will look like this:
/(?:\[.*\]\s)?(?<name>.+?)(?:\s\(not \w+\))?(?:\s\(unusual\))?,?$/gm
https://regex101.com/r/a7Qtvw/1
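Since this pattern carries the g and m flags, it can be run over a multi-line text in one go. The sketch below does that with String.prototype.matchAll; the second and third sample lines are invented for illustration. It also hints at one limitation: in JavaScript \w only covers A-Z, a-z, 0-9 and _, so a suffix like (not nötig) is not removed.

```javascript
// Pattern from this answer, including its g and m flags.
const groupedPattern =
  /(?:\[.*\]\s)?(?<name>.+?)(?:\s\(not \w+\))?(?:\s\(unusual\))?,?$/gm;

// Sample text: the first line is from the question, the others are illustrative.
const sampleText = [
  "[Example] öäüß asdf 1234 (1aö) (not necessary),",
  "asdf (unusual)",
  "qwer (not nötig),",
].join("\n");

// Collect the "name" group of every match.
const names = Array.from(sampleText.matchAll(groupedPattern), (m) => m.groups.name);

console.log(names);
// [ "öäüß asdf 1234 (1aö)", "asdf", "qwer (not nötig)" ]
```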