Help regex. Use of findall

Postby stefun06 » Thu Feb 21, 2013 4:00 pm

Hello!

I would like to find everything between two occurrences of 'GH'. For example, for 'GH ab H b G bc GH', I would like to get ' ab H b G bc '.

Here is what I tried:

Code: Select all
import re
A=r'GH ab H b G bc GH'
pattern_2a=re.compile(r'GH'r'(?P<ContOpt>[^GH]{0,})'r'GH')             
ans=pattern_2a.findall(A)
print ans


But this does not work. I guess it is because [^GH] means any character except G or H, whereas I want anything except GH as a block of two characters.

If anyone could help me with this, that would be very nice! :D

I also tried to use findall to find everything between AA and BC when there are some B and/or C characters in between.

E.g. str=r'AA aaaBaaaCaaa BC' => ans =' aaaBaaaCaaa '

To try to do so, I used:

Code: Select all
import re
str=r'AA aaaBaaaCaaa BC'
pattern_2a=re.compile(r'AA'r'(?P<ContOpt>[^BC]{0,})'r'BC')
ans=pattern_2a.findall(str)

print ans


But this does not work either, probably because [^BC] means neither B nor C, whereas I want to exclude only the two-character block BC.
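
One way to check that guess is to compare the negated character class with a negative lookahead, which is one possible way of saying "any character, as long as the block BC does not start here". This is only a minimal sketch: the variable names are illustrative, and it is not necessarily the approach suggested in the replies.

```python
import re

text = 'AA aaaBaaaCaaa BC'

# [^BC] is a character class: it rejects any single B and any single C,
# so the scan stops at the first lone B and the pattern cannot match.
char_class = re.findall(r'AA([^BC]*)BC', text)

# (?:(?!BC).)* consumes one character at a time, but only at positions
# where the two-character block "BC" does not start.
not_block = re.findall(r'AA((?:(?!BC).)*)BC', text)

print(char_class)  # []
print(not_block)   # [' aaaBaaaCaaa ']
```

The same lookahead form applied to the first example, `r'GH((?:(?!GH).)*)GH'`, should return `[' ab H b G bc ']`.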

Thanks!
Stephane

PS: Of course, something like this works, but it is not what I need:

Code: Select all
import re
A=r'GH ab  b G bc H'
pattern_2a=re.compile(r'GH'r'(?P<ContOpt>[^H]{0,})'r'H')             
ans=pattern_2a.findall(A)
print ans


PPS: Of course, if I change the closing delimiter to a single character, then it works:

Code: Select all
import re
str=r'AA aaaBaaaaaa C'
pattern_2a=re.compile(r'AA'r'(?P<ContOpt>[^C]{0,})'r'C')
ans=pattern_2a.findall(str)

print ans
ans=[' aaaBaaaaaa ']
Last edited by Yoriz on Thu Feb 21, 2013 5:23 pm, edited 1 time in total.
Reason: Merged two similar post best that i could and added code tags

Re: Help regex. Use of findall

Postby Yoriz » Thu Feb 21, 2013 5:52 pm

Hi, welcome to the forum.
Please have a read of the following. It uses lookbehind and lookahead assertions, which are described below the code.

Code: Select all
import re
# Greedy .* between a lookbehind and a lookahead for the delimiter.
print(re.findall(r'(?<=GH).*(?=GH)', 'GH ab H b G bc GH'))
# [' ab H b G bc ']
print(re.findall(r'(?<=AA).*(?=BC)', 'AA aaaBaaaCaaa BC'))
# [' aaaBaaaCaaa ']


(?<=...)
Matches if the current position in the string is preceded by a match for ... that ends at the current position. This is called a positive lookbehind assertion.

(?=...)
Positive lookahead assertion. This succeeds if the contained regular expression, represented here by ..., successfully matches at the current location, and fails otherwise. But, once the contained expression has been tried, the matching engine doesn’t advance at all; the rest of the pattern is tried right where the assertion started.
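
To illustrate that both assertions are zero-width, here is a small sketch comparing the match span with and without them. The names `m` and `plain` are just for this example.

```python
import re

text = 'GH ab H b G bc GH'

# With lookarounds the delimiters are checked but not consumed,
# so they are not part of the match.
m = re.search(r'(?<=GH).*(?=GH)', text)
print(repr(m.group()))  # ' ab H b G bc '
print(m.span())         # (2, 15)

# Without lookarounds the delimiters are consumed and included.
plain = re.search(r'GH.*GH', text)
print(repr(plain.group()))  # 'GH ab H b G bc GH'
print(plain.span())         # (0, 17)
```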

This might be better: a non-greedy capture group, which also copes with several delimited sections in the same string.
Code: Select all
import re
print(re.findall(r'AA(.*?)BC', 'AA aaaBaaaCaaa BC not this AA ab H b G bc BC'))
# [' aaaBaaaCaaa ', ' ab H b G bc ']
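
The difference between the greedy and non-greedy forms shows up as soon as the string holds more than one section. The sketch below, using the same sample string, compares the two and also shows that the named group `ContOpt` from the original attempt could be kept, assuming `finditer` is acceptable in place of `findall`.

```python
import re

text = 'AA aaaBaaaCaaa BC not this AA ab H b G bc BC'

# Greedy: .* runs on to the last BC, swallowing everything in between.
greedy = re.findall(r'AA(.*)BC', text)

# Non-greedy: .*? stops at the first BC that lets the pattern match.
lazy = re.findall(r'AA(.*?)BC', text)

# The named group can be read back from each match object.
named = [m.group('ContOpt')
         for m in re.finditer(r'AA(?P<ContOpt>.*?)BC', text)]

print(greedy)  # [' aaaBaaaCaaa BC not this AA ab H b G bc ']
print(lazy)    # [' aaaBaaaCaaa ', ' ab H b G bc ']
print(named)   # [' aaaBaaaCaaa ', ' ab H b G bc ']
```

Note that `.` does not match a newline by default. If the text between the delimiters can span several lines, passing `re.DOTALL` changes that.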

Re: Help regex. Use of findall

Postby stefun06 » Thu Feb 21, 2013 7:46 pm

Hello!!

Thank you that is very useful! :D

Very nice forum ! ;)

Stephane