python - Regex: Substitute pattern in string multiple times without leftovers - Stack Overflow

I'm trying to write a regex to match the pattern (##a.+?#a##b.+?#b) in strings where it might appear more than one time:

foo##abar#a##bfoo#bbar##afoo#a##bbar#bfoobar

What I would like to do is to substitute the whole string with just the pattern, so I've tried the following using regex101 (python flavor):

Regex:

.*?(##a.+?#a##b.+?#b).*?

The problem is that, if the last occurrence of the pattern is followed by some text, as in the example above, the last portion is not matched, giving:

##abar#a##bfoo#b##afoo#a##bbar#bfoobar 

Expected output:

##abar#a##bfoo#b##afoo#a##bbar#b (just the matched pattern w/out extra text)

Is there a workaround for this kind of task? Many thanks in advance for any suggestion.
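
The behaviour described above can be reproduced outside regex101 with a short Python sketch. The replacement string \1 is an assumption, since the question does not show it, and the variable names are only illustrative.

```python
import re

# Sample string and attempted pattern, both copied from the question.
text = "foo##abar#a##bfoo#bbar##afoo#a##bbar#bfoobar"
attempt = r".*?(##a.+?#a##b.+?#b).*?"

# Assumption: every match is replaced by its first capture group.
result = re.sub(attempt, r"\1", text)

# The trailing "foobar" contains no further occurrence of the pattern,
# so no match covers it and re.sub leaves it untouched.
print(result)
```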

Asked Feb 3 at 18:20 by eazyezy; edited Feb 3 at 21:07 by zer00ne.

Comments:
  • Unclear? Can you add real samples? – aaa Commented Feb 3 at 18:24
  • What's the expected output? I don't understand what "substitute the whole string with just the pattern" means. – Fravadona Commented Feb 3 at 18:33
  • Just extract the matches and concat them. What is the programming language? – Wiktor Stribiżew Commented Feb 3 at 18:42
  • Yes @trincot, that did it, thank you very much! – eazyezy Commented Feb 3 at 21:30
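
The suggestion in the comments to extract the matches and concatenate them could look roughly like this in Python. This is a sketch of that idea rather than anything posted in the thread, and the helper name extract_and_join is hypothetical.

```python
import re

# Pattern from the question, without the surrounding .*? parts.
PATTERN = re.compile(r"##a.+?#a##b.+?#b")

def extract_and_join(text: str) -> str:
    """Return all non-overlapping matches of PATTERN, concatenated."""
    # The pattern has no capture groups, so findall returns whole matches.
    return "".join(PATTERN.findall(text))

sample = "foo##abar#a##bfoo#bbar##afoo#a##bbar#bfoobar"
print(extract_and_join(sample))
```

This avoids substitution altogether, so text before, between, and after the occurrences is dropped simply because it is never collected.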

1 Answer

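You could add |$ as an alternative inside the group and drop the trailing .*?, so that other characters are only matched before the group. After the last occurrence, the leading .*? then consumes the remaining text up to the end of the string, where $ matches and the group captures an empty string. That way the suffix is removed as well: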

.*?(##a.+?#a##b.+?#b|$)

See it on regex101
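
As a quick check outside regex101, the pattern above can be tried with Python's re.sub. This is a minimal sketch: the replacement string \1 is assumed (the thread does not state it), and the function name keep_only_matches is hypothetical.

```python
import re

# Pattern from the answer: the group matches either one occurrence or
# the end of the string.
PATTERN = re.compile(r".*?(##a.+?#a##b.+?#b|$)")

def keep_only_matches(text: str) -> str:
    """Replace each match with group 1, dropping the text around occurrences."""
    # When the $ alternative matches, group 1 is empty, so the trailing
    # text consumed by the leading .*? is replaced with nothing.
    return PATTERN.sub(r"\1", text)

sample = "foo##abar#a##bfoo#bbar##afoo#a##bbar#bfoobar"
print(keep_only_matches(sample))
```

Note that . does not match newlines by default, so multi-line input would likely need the re.DOTALL flag or a different approach.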
