I'm working on part of a project that replaces http URLs with https URLs where possible.
The problem is that the regular expressions for this are written for the JavaScript regex engine, but I'm using them in Python. To be compatible, I want to rewrite each regex during parsing into a valid Python regex.
For example, I am given this expression:
https://$1wikimediafoundation/
and I would like an expression like this:
https://\1wikimediafoundation/
My problem is that I don't know how to do that (converting $ into \).
This code doesn't work:
'https://$1wikimediafoundation/'.replace('$', '\')
It generates the following error:
SyntaxError: EOL while scanning string literal
This code runs without error:
'https://$1wikimediafoundation/'.replace('$', '\\')
but generates the wrong output:
'https://\\1wikimediafoundation/'
asked Sep 14, 2014 at 20:57 by pointhi
Your substitution is correct; you're probably being confused by the way you display the result. Print it out with print and you'll only see one backslash. – alexis Commented Sep 14, 2014 at 21:01
4 Answers
You can test your regex at https://regex101.com/ and then convert it to Python.
Additionally, to replace the matched group, you can use re.sub from the re module along these lines:
re.sub(r"'([^']*)'", r'{\1}', col)
This replaces
'Protein_Expectation_Value_Log(e)', 'Protein_Intensity_Log(I)'
with
{Protein_Expectation_Value_Log(e)}, {Protein_Intensity_Log(I)}
You can find more details here.
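To make the re.sub call above runnable on its own, here is a minimal sketch. It assumes col is a plain string holding the quoted names (the answer does not say what col actually is) and uses Python 3 syntax.

```python
import re

# Assumption: `col` is a plain string containing the quoted column names.
col = "'Protein_Expectation_Value_Log(e)', 'Protein_Intensity_Log(I)'"

# Capture the text between single quotes and wrap it in braces.
# \1 in the replacement refers to the first capture group.
result = re.sub(r"'([^']*)'", r'{\1}', col)

print(result)  # {Protein_Expectation_Value_Log(e)}, {Protein_Intensity_Log(I)}
```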
Actually it works:
>>> 'https://$1wikimediafoundation/'.replace('$', '\\')
'https://\\1wikimediafoundation/'
>>> print 'https://$1wikimediafoundation/'.replace('$', '\\')
https://\1wikimediafoundation/
When you evaluate 'https://$1wikimediafoundation/'.replace('$', '\\') in the interactive interpreter, it displays the __repr__ (representation) of the resulting string, in which special characters such as the backslash are shown escaped.
By printing it, you get the __str__, the readable version, which contains a single backslash. (See this answer on __str__ vs __repr__.)
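A quick way to convince yourself that the string really contains only one backslash is to compare the two views and count the characters. This is a small sketch in Python 3 syntax (print is a function there, unlike in the session above); the variable name converted is just for illustration.

```python
# '\\' is a string literal for exactly one backslash character.
converted = 'https://$1wikimediafoundation/'.replace('$', '\\')

print(repr(converted))        # repr view: the backslash is shown escaped (doubled)
print(converted)              # str view: the actual characters, one backslash
print(converted.count('\\'))  # 1

# The replacement swaps one character for one character,
# so the length of the string does not change.
print(len(converted) == len('https://$1wikimediafoundation/'))  # True
```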
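You might be tempted to try a raw string:
'https://$1wikimediafoundation/'.replace('$', r'\')
but this fails as well: a raw string literal cannot end in a single backslash, so r'\' raises the same SyntaxError.
Write the backslash as '\\' instead. That is a string containing exactly one backslash character,
which is what you are trying to insert.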
Note that $& in replacement patterns should be converted to \g<0>, since in a Python replacement string \0 is interpreted as the NUL character (\x00) rather than as the whole match.