pandas - Extract the names and regexes of individual regex groups

For example, given regex

r"(?P<D>.8)(?P<foo>\d+)(?P<bar>[a-z]+)"

I want to return the dictionary

{'D': r'.8', 'foo': r'\d+', 'bar': r'[a-z]+'}

that maps group name to group regex. Getting the groups is straightforward (with groupindex), but there doesn't seem to be a direct way to get the corresponding regexes.
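For reference, a minimal sketch of what `groupindex` provides on a compiled pattern: it maps each group name to its group number, which is why it does not answer the question on its own. The variable names here are illustrative only.

```python
import re

compiled = re.compile(r"(?P<D>.8)(?P<foo>\d+)(?P<bar>[a-z]+)")

# groupindex maps group name -> group number, not group name -> sub-pattern
name_to_index = dict(compiled.groupindex)
print(name_to_index)  # {'D': 1, 'foo': 2, 'bar': 3}
```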

Asked Mar 27 at 0:27 by RobertHannah89

2 Answers

I believe this is what you are after:

import re

pattern = r"(?P<D>.8)(?P<foo>\d+)(?P<bar>[a-z]+)"

# Regex pattern to capture named groups
group_pattern = re.compile(r"\(\?P<(?P<name>\w+)>(?P<regex>.*?)\)")

# Extract named groups and their regex patterns
group_patterns = {match.group("name"): match.group("regex") for match in group_pattern.finditer(pattern)}

print(group_patterns)
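Note that the lazy `.*?` stops at the first `)`, so the snippet above works for flat groups like those in the question but would cut short a group whose body itself contains parentheses. One possible alternative, sketched below, walks the pattern string and tracks nesting depth instead. The helper name `named_group_patterns` is hypothetical, and the sketch assumes the pattern compiles and uses no `(?#...)` comments or verbose-mode `#` comments.

```python
import re

# Matches the opening of a named group such as "(?P<foo>"
_HEADER = re.compile(r"\(\?P<(\w+)>")

def named_group_patterns(pattern):
    """Map each named group in `pattern` to the source text of its body.

    Hypothetical helper: a sketch, not part of the `re` module.
    """
    re.compile(pattern)      # raises re.error early if the pattern is invalid
    result = {}
    stack = []               # one (name or None, body start) entry per open paren
    in_class = False         # True while inside a [...] character class
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2           # skip the escaped character
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
            i += 1
            if pattern.startswith("^", i):
                i += 1
            if pattern.startswith("]", i):
                i += 1       # a leading ']' is a literal member of the class
            continue
        elif ch == "(":
            m = _HEADER.match(pattern, i)
            if m:
                stack.append((m.group(1), m.end()))
                i = m.end()
                continue
            stack.append((None, i + 1))
        elif ch == ")":
            name, start = stack.pop()
            if name is not None:
                result[name] = pattern[start:i]
        i += 1
    return result

flat = named_group_patterns(r"(?P<D>.8)(?P<foo>\d+)(?P<bar>[a-z]+)")
nested = named_group_patterns(r"(?P<outer>a(?P<inner>\(b\))c)")
print(flat)
print(nested)
```

With nested groups, the inner group closes first, so it is inserted into the dictionary before the group that encloses it.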

Check whether the regular expression \(\?P<(.+?)>((?:[^)]|(?<=\\)\))+)\) meets your requirements. Unlike a plain lazy match, it allows escaped closing parentheses inside a group body.
An explanation of the regular expression can be found at https://regex101.com/r/d7MvHp/1

import re

regex = r"\(\?P<(.+?)>((?:[^)]|(?<=\\)\))+)\)"
test_str = r"(?P<D>.8)(?P<foo>\(\d+\))(?P<bar>[a-z]+)"  # raw string avoids invalid-escape warnings
d = {m.group(1) : m.group(2) for m in re.finditer(regex, test_str)}
print(d)

Output: {'D': '.8', 'foo': '\\(\\d+\\)', 'bar': '[a-z]+'}

The snippet can be checked at https://rextester.com/KTMZ92055
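A quick way to sanity-check the extracted dictionary, sketched here with illustrative variable names, is to compare its keys against `groupindex` and confirm that each extracted fragment compiles on its own:

```python
import re

source = r"(?P<D>.8)(?P<foo>\(\d+\))(?P<bar>[a-z]+)"
extractor = r"\(\?P<(.+?)>((?:[^)]|(?<=\\)\))+)\)"
extracted = {m.group(1): m.group(2) for m in re.finditer(extractor, source)}

# Every extracted name should be a real group name of the compiled pattern
names_match = set(extracted) == set(re.compile(source).groupindex)

# Every extracted fragment should be a valid regex by itself
for name, fragment in extracted.items():
    re.compile(fragment)  # raises re.error if the fragment is malformed

print(names_match)
```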
