trying to avoid using regexp

Postby metulburr » Tue Sep 24, 2013 12:43 pm

I am essentially tinkering with making my own language, and I am still writing the lexer. Basic strings like the following would all have to be matched as an ID, an operator, an int or string, and a semicolon terminating the line:
Code: Select all
a=10.12;

Code: Select all
a = 10;

Code: Select all
a = (20 - 4) *  2 + 4;

Code: Select all
a     =      10    ;

Code: Select all
var = 10;

Code: Select all
var = "string";

I am trying to determine whether regex is the easiest solution or not. I mean, I can come up with str methods to use, but I think this might be a case where regexes make it easier to assign a tag to each token. Especially since this is just an assignment line, let alone other lines that would have to account for while/for loops and whatnot.
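For comparison, here is a minimal sketch of the regex route, using the named-group alternation pattern from the standard `re` module. The tag names (`FLOAT`, `INT`, `STRING`, `ID`, `OP`, `END`) and the `tokenize` helper are assumptions made up for illustration; they are not part of any existing lexer in this thread.

```python
import re

# Hypothetical tag names. Order matters: FLOAT must be tried before INT,
# otherwise "10.12" would be split into INT "10", an error for ".", and INT "12".
TOKEN_SPEC = [
    ("FLOAT",  r"\d+\.\d+"),
    ("INT",    r"\d+"),
    ("STRING", r'"[^"]*"'),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[=+\-*/()]"),
    ("END",    r";"),
    ("SKIP",   r"\s+"),
    ("ERROR",  r"."),          # catch-all for anything not listed above
]
TOKEN_RE = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def tokenize(line):
    """Return a list of (tag, text) pairs for one source line."""
    tokens = []
    for match in TOKEN_RE.finditer(line):
        tag = match.lastgroup      # name of the alternative that matched
        if tag == "SKIP":
            continue               # whitespace carries no meaning here
        if tag == "ERROR":
            raise SyntaxError("unexpected character %r" % match.group())
        tokens.append((tag, match.group()))
    return tokens


for sample in ("a=10.12;", "a     =      10    ;", 'var = "string";'):
    print(tokenize(sample))
```

Each sample line comes back as a flat list of tagged tokens, so later stages can check the tag sequence rather than re-inspecting raw text. Adding keywords such as while/for would probably mean one more entry in the table placed before `ID`.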

Re: trying to avoid using regexp

Postby metulburr » Tue Sep 24, 2013 1:54 pm

Maybe I can do it just as easily without regexp.

Code: Select all
text = '''
a = 1
b=2.2
ccc = "string"
long = (20 - 4) *  2 + 4
spaced     =      10
a = 11
'''

env = {}

for line in text.split('\n'):
    if '=' in line:
        # split only on the first '=' so the value itself may contain '='
        ID, value = (part.strip() for part in line.split('=', 1))

        # membership test: env.get(ID) would treat a stored 0 as "not defined"
        if ID not in env:
            # convert to int/float if needed
            try:
                value = int(value)
            except ValueError:
                try:
                    value = float(value)
                except ValueError:
                    pass  # strings and expressions are kept as raw text

            env[ID] = value
        else:
            print('ERROR: {}: "{}" is already defined'.format(line, ID))

print(env)

Re: trying to avoid using regexp

Postby micseydel » Tue Sep 24, 2013 6:12 pm

Regular expressions handle regular languages, which cannot contain arbitrarily nested parentheses. (I know the re module is more powerful than regular languages, but I'm not sure by how much.) I took a compilers class where we learned about all kinds of this stuff, but I don't remember much of it. You should see if someone has written a free book on the topic, or done a MOOC on it or something.
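To make the nesting point concrete, here is a rough sketch of how the two jobs are often split: a regex does the flat tokenizing, and a small recursive-descent parser handles the parentheses by calling itself. The `evaluate` function and its grammar are assumptions for illustration only, not something taken from a compilers course or an existing library.

```python
import re


def evaluate(expression):
    """Evaluate +, -, *, / and parentheses with a recursive-descent parser."""
    # Flat tokenizing is a regular problem, so a regex is enough for it.
    # Note: findall silently skips characters that match none of the patterns.
    tokens = re.findall(r"\d+\.\d+|\d+|[-+*/()]", expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():    # expr := term (('+' | '-') term)*
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():    # term := factor (('*' | '/') factor)*
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():  # factor := NUMBER | '(' expr ')'
        token = advance()
        if token == "(":
            value = expr()   # the recursion is what copes with any nesting depth
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is None or token in "+-*/)":
            raise SyntaxError("expected a number or '('")
        return float(token) if "." in token else int(token)

    result = expr()
    if peek() is not None:
        raise SyntaxError("unexpected token %r" % peek())
    return result


print(evaluate("(20 - 4) *  2 + 4"))    # 36
print(evaluate("((1 + 2) * (3 + 4))"))  # 21
```

The regex never has to count parentheses. Matching each "(" with its ")" falls out of `factor` calling `expr` and then demanding a closing parenthesis, which is the part a pure regular expression cannot express for arbitrary depth.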

Re: trying to avoid using regexp

Postby ochichinyezaboombwa » Tue Sep 24, 2013 9:02 pm

Take a look at Lex & Yacc.

Re: trying to avoid using regexp

Postby micseydel » Tue Sep 24, 2013 9:08 pm

ochichinyezaboombwa wrote: Take a look at Lex & Yacc.

+1