I have the following Python code that uses regex to find the substrings "¬[", "[", "¬(", "(", ")", "]" and get their positions (in the result I record "¬[" and "¬(" as "[" and "("):
import re
expression = "¬[P∧¬(¬T∧R)]∧(T→¬P)"
# [[0 "¬["], [4 "¬("], [13 "("], [10 ")"], [18 ")"], [11 "]"]]
lsqb = [[match.start(), "["] for match in re.finditer("\¬\[|\[", expression)]
lpar = [[match.start(), "("] for match in re.finditer("\¬\(|\(", expression)]
rpar = [[match.start(), ")"] for match in re.finditer("\)", expression)]
rsqb = [[match.start(), "]"] for match in re.finditer("\]", expression)]
all = lsqb + lpar + rpar + rsqb
print(lsqb) # [[0, '[']]
print(lpar) # [[4, '('], [13, '(']]
print(rpar) # [[10, ')'], [18, ')']]
print(rsqb) # [[11, ']']]
print(all) # [[0, '['], [4, '('], [13, '('], [10, ')'], [18, ')'], [11, ']']]
The issue is that I'm iterating over the string four times, once for each type of bracket whose positions I want. I'd like to get rid of the per-bracket variables and keep only "all", iterating over the string just once while still getting [[0, '['], [4, '('], [13, '('], [10, ')'], [18, ')'], [11, ']']] as the result.
Asked Mar 6 at 16:29 by user29917130
1 Answer
Use a single regular expression that matches all the patterns. You can use a capture group to extract the parenthesis after ¬. Then loop over all the matches, generating the appropriate string in the result based on what was matched.
import re
expression = "¬[P∧¬(¬T∧R)]∧(T→¬P)"
# Group 1: an opening bracket, optionally preceded by ¬. Group 2: a closing bracket.
pattern = r'¬?([\[(])|([\])])'
all_matches = [(match.start(), match.group(1) or match.group(2))
               for match in re.finditer(pattern, expression)]
print(all_matches)
# [(0, '['), (4, '('), (10, ')'), (11, ']'), (13, '('), (18, ')')]
Each match comes from only one side of the pipe, so exactly one of the two groups is set and the other is None. That is why match.group(1) or match.group(2) selects the matched parenthesis.
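Note that the single pass returns the matches in position order, while the question's expected output is grouped by bracket type. If that grouping matters, a stable sort on the bracket type should reproduce it. The following is only a sketch: the names TYPE_ORDER and bracket_positions are made up for illustration, and the order is assumed to mirror lsqb + lpar + rpar + rsqb from the question.

```python
import re

expression = "¬[P∧¬(¬T∧R)]∧(T→¬P)"
pattern = re.compile(r'¬?([\[(])|([\])])')

# Hypothetical ordering table: mirrors lsqb + lpar + rpar + rsqb in the question.
TYPE_ORDER = {"[": 0, "(": 1, ")": 2, "]": 3}

def bracket_positions(text):
    """Scan text once, then group the hits by bracket type."""
    # One pass over the string; start() is the position of ¬ when it is present.
    found = [[m.start(), m.group(1) or m.group(2)] for m in pattern.finditer(text)]
    # sorted() is stable, so positions stay ascending within each bracket type.
    return sorted(found, key=lambda item: TYPE_ORDER[item[1]])

brackets = bracket_positions(expression)
print(brackets)
# [[0, '['], [4, '('], [13, '('], [10, ')'], [18, ')'], [11, ']']]
```

The sort works on the short list of matches, not on the string, so the string itself is still scanned only once.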
all is a built-in function that you are shadowing at the moment. – JonSG Commented Mar 6 at 16:48
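To illustrate the comment above, here is a small sketch of what goes wrong once the name all is rebound to a list. The function name uses_shadowed_name is hypothetical and exists only to keep the shadowing local to the example.

```python
def uses_shadowed_name():
    # Rebinding the name hides the built-in all() inside this scope.
    all = [[0, "["], [4, "("]]
    try:
        # This now tries to call the list, not the built-in function.
        all(x for x in (True, True))
    except TypeError as exc:
        return type(exc).__name__
    return None

shadow_error = uses_shadowed_name()
print(shadow_error)  # TypeError
```

Choosing a different name, such as all_matches in the answer, avoids the problem.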