How to match string properly


Postby dragonix » Thu Jun 06, 2013 8:18 am

Hi all

I have a question about how to match a string with regex.
The problem I have is that the string can have multiple options, like

abc1
abc2
abc3a
abc3b
abc4a
abc4c
abc5
For example: abc1b is not possible, etc.

So the first part is always the same, but the last one or two characters can change (as you can see, the second character is optional).
I used Google and Python documentation, but I can't get this to work :( ..

What I have so far:
Code:
import re
import sys

PATTERN = re.compile('abc[1-2] | abc3[a-b] | abc4[a-c] | abc5', re.IGNORECASE)

TEST = sys.argv[1]
if PATTERN.match(TEST):
    print 'INFO Correct argument has been given: %s' % TEST
else:
    print 'INFO Incorrect argument has been given %s' % TEST


But this isn't working the way I want, and I suspect there is a simpler way to do it.
Can someone help me a bit on this?

Thanks!

Dragonix
dragonix
 
Posts: 9
Joined: Wed May 29, 2013 8:21 am

Re: How to match string properly

Postby dragonix » Thu Jun 06, 2013 8:45 am

OK, after some more Googling it seems to work now.
Code:
import re
import sys

PATTERN = re.compile(r'\babc+[1-2]\b|\babc3+[a-b]\b|\babc4+[a-c]\b|\babc5a\b')

TEST = sys.argv[1]
if PATTERN.match(TEST):
    print 'INFO Correct argument has been given: %s' % TEST
else:
    print 'INFO Incorrect argument has been given %s' % TEST


Is there a way to improve the readability of the compile part?
Also, why do I need the 'r' in front of the pattern here?
Code:
r'\babc+[1-2]\b|\babc3+[a-b]\b|\babc4+[a-c]\b|\babc5a\b'

Re: How to match string properly

Postby hansn » Thu Jun 06, 2013 9:13 am

dragonix wrote:Why do I need the 'r' in
Code:
r'\babc+[1-2]\b|\babc3+[a-b]\b|\babc4+[a-c]\b|\babc5a\b'

The 'r' stands for 'raw string' (I think?). It stops Python from interpreting backslash escape sequences in the string, so the backslashes stay in the string exactly as you typed them.

Code:
>>> print '\n'


>>> print r'\n'
\n
>>>
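To make the difference concrete for regex patterns, here is a small sketch comparing a normal and a raw string literal. The variable names are only for illustration. The point is that '\b' in a normal string is the backspace character, while the raw version keeps the backslash and the b as two separate characters.

```python
# Normal string literal: Python turns the escape sequence into ONE character.
normal = '\b'
# Raw string literal: the backslash is kept as-is, so this is TWO characters.
raw = r'\b'

print(len(normal))        # 1
print(len(raw))           # 2
print(normal == '\x08')   # True: '\b' is the backspace character
print(raw == '\\b')       # True: same as doubling the backslash by hand
```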
hansn
 
Posts: 87
Joined: Thu Feb 21, 2013 8:46 pm

Re: How to match string properly

Postby dragonix » Thu Jun 06, 2013 10:37 am

Ooh, OK :D
Thanks!!

Re: How to match string properly

Postby MichelFJM » Thu Jun 06, 2013 10:48 am

Hello

I'm not sure what you want, but I think the '+' is a mistake.
Do you want to allow abc333a?
And abc3?
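A quick way to check is to run the pattern from the earlier post against a few candidates. This is only a sketch: the list of test strings is made up for illustration, and the print call is written so it runs on both Python 2 and 3.

```python
import re

# Pattern as posted earlier in the thread, including the '+' quantifiers.
PATTERN = re.compile(r'\babc+[1-2]\b|\babc3+[a-b]\b|\babc4+[a-c]\b|\babc5a\b')

# Hypothetical inputs, chosen to show what '+' lets through.
candidates = ['abc3a', 'abc333a', 'abcc1', 'abc3']

results = {}
for text in candidates:
    results[text] = bool(PATTERN.match(text))
    print('%-8s %s' % (text, results[text]))
```

With this pattern abc3a, abc333a and abcc1 are all accepted, because '3+' and 'c+' allow the preceding character to repeat. abc3 is rejected, since a letter is still required after the 3.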
MichelFJM
 
Posts: 19
Joined: Wed May 22, 2013 1:41 pm

Re: How to match string properly

Postby dragonix » Thu Jun 06, 2013 12:10 pm

I only want to allow abc3a for instance.
Not abc333a nor abc3.

Did I do something wrong?

Another question.
I have this function:
Code:
import glob

def AMOUNT():
    # number of entries in the current directory whose names start with 'abc'
    return len(glob.glob('./abc*'))


In the directory where I run it, I have the following files:
abc
abc1
abc2
abc3a

Then I run:
Code:
if AMOUNT > 0:
    print 'INFO %d files were found' % AMOUNT

But I get this error:
Code:
TypeError: %d format: a number is required, not function


Why is that?
I'm quite new to Python, so bear with me ;)

Thanks!

Re: How to match string properly

Postby MichelFJM » Thu Jun 06, 2013 12:25 pm

Hello

For your error: AMOUNT is the function itself; AMOUNT() calls it and gives you the value it returns.
For the regexp: the '+' means one or more of the preceding character. Remove them!
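The function-versus-call point can be sketched like this. The variable name count is only for illustration; the function body is the one from the question. As far as I know, Python 2 allows comparing a function with a number without complaint, which would explain why the error only appears on the print line and not on the if.

```python
import glob

def AMOUNT():
    # Count the entries in the current directory whose names start with 'abc'.
    return len(glob.glob('./abc*'))

# AMOUNT (no parentheses) is the function object.
# AMOUNT() calls the function and yields the integer it returns.
count = AMOUNT()

if count > 0:
    print('INFO %d files were found' % count)
```

Storing the result in a variable also avoids calling glob twice.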

Re: How to match string properly

Postby MichelFJM » Thu Jun 06, 2013 12:41 pm

I propose the following alternative: abc and \b are written only once, each \ is doubled to \\ so that the 'r' prefix can be dropped, and the \b at the beginning is omitted (it is not useful with the match method, which only matches at the start of the string):
Code:
import re
import sys
PATTERN = re.compile('abc([12]|3[ab]|4[a-c]|5a)\\b')
TEST = sys.argv[1]
if PATTERN.match(TEST):
    print 'Correct: %s' % TEST
else:
    print 'NOT allowed %s' % TEST
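To see how this pattern behaves, one can run it over a list of inputs. The sketch below uses the same pattern written as a raw string; the two lists of test strings are hypothetical and only cover the cases discussed in this thread.

```python
import re

# Same pattern as above, written as a raw string.
PATTERN = re.compile(r'abc([12]|3[ab]|4[a-c]|5a)\b')

# Hypothetical test inputs: the first list should pass, the second should fail.
allowed = ['abc1', 'abc2', 'abc3a', 'abc3b', 'abc4a', 'abc4c', 'abc5a']
rejected = ['abc1b', 'abc3', 'abc333a', 'abc4d', 'abc', 'xabc1']

for text in allowed + rejected:
    verdict = 'Correct' if PATTERN.match(text) else 'NOT allowed'
    print('%s: %s' % (verdict, text))
```

Note that \b only requires a word boundary after the match, so an input such as abc1-old would still pass. Whether that matters depends on what the argument is used for.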

Re: How to match string properly

Postby snippsat » Thu Jun 06, 2013 5:42 pm

changed by \\ to remove the 'r' before the string definition,

MichelFJM, in a regex pattern you should always use a raw string (the 'r' prefix).
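A small sketch of what can go wrong without the prefix. Doubling every backslash, as in the post above, does work; the risk is forgetting one, in which case Python silently turns '\b' into a backspace character and the pattern stops matching. The names broken and working are only for illustration.

```python
import re

text = 'abc1'

# No r prefix and a single backslash: '\b' becomes a backspace character,
# so the pattern expects a backspace after the digit.
broken = re.compile('abc[12]\b')

# With the r prefix the regex engine receives \b and treats it as a word boundary.
working = re.compile(r'abc[12]\b')

print(bool(broken.match(text)))    # False
print(bool(working.match(text)))   # True
```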
snippsat
 
Posts: 171
Joined: Thu Feb 21, 2013 12:04 am

Re: How to match string properly

Postby dragonix » Fri Jun 07, 2013 6:33 am

MichelFJM wrote:Hello

For your error : AMOUNT is the function. AMOUNT() is the return.
For the regexp : the + means one or more of the preceding character. Remove them !


Indeed, the AMOUNT issue was a stupid mistake.
You are also right about the '+' ;) I removed it; it was definitely unnecessary!
Thanks for the pointers.

Re: How to match string properly

Postby dragonix » Fri Jun 07, 2013 6:35 am

snippsat wrote:
changed by \\ to remove the 'r' before the string definition,

MichelFJM in regex pattern you should always use raw string('r').


If I should always use a raw string, what should the pattern look like then?
Like this, or was mine correct (apart from the mistakes MichelFJM pointed out)?
Code:
PATTERN = re.compile(r'abc([12]|3[ab]|4[a-c]|5a)\\b')

Re: How to match string properly

Postby MichelFJM » Fri Jun 07, 2013 7:26 am

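With a raw string, write a single backslash: \b, not \\b (in a raw string, \\b would look for a literal backslash followed by b). These two patterns are equivalent:
Code: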
PATTERN = re.compile(r'abc([12]|3[ab]|4[a-c]|5a)\b')
PATTERN = re.compile(r'abc[12]\b|abc3[ab]\b|abc4[a-c]\b|abc5a\b')

(thank you snippsat for the advice)
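The sketch below checks the two spellings against each other and also shows what the raw string with the doubled backslash from the question does. The re.VERBOSE variant is an addition, not something proposed in the thread: it is one possible answer to the earlier readability question, since that flag lets the pattern be spread over several lines with comments. All names and sample strings are for illustration only.

```python
import re

grouped = re.compile(r'abc([12]|3[ab]|4[a-c]|5a)\b')
spelled_out = re.compile(r'abc[12]\b|abc3[ab]\b|abc4[a-c]\b|abc5a\b')

# Raw string AND doubled backslash: looks for a literal backslash followed by 'b'.
doubled = re.compile(r'abc([12]|3[ab]|4[a-c]|5a)\\b')

# re.VERBOSE ignores whitespace in the pattern and allows '#' comments.
verbose = re.compile(r'''
    abc
    (   [12]      # abc1, abc2
      | 3[ab]     # abc3a, abc3b
      | 4[a-c]    # abc4a, abc4b, abc4c
      | 5a        # abc5a
    )
    \b            # word boundary: no letter, digit or underscore may follow
''', re.VERBOSE)

samples = ['abc1', 'abc3b', 'abc4c', 'abc5a', 'abc1b', 'abc3', 'abc333a']
for text in samples:
    row = (bool(grouped.match(text)),
           bool(spelled_out.match(text)),
           bool(verbose.match(text)))
    print('%-8s %s' % (text, row))

print(doubled.match('abc1'))   # None: there is no literal backslash in 'abc1'
```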