Regular Expression

Postby Fish » Wed Nov 06, 2013 10:24 pm

I'm trying to extract the address (http://www.google.com, cnn.fr, or any other domain name that appears in that position) from strings like the following:

string: '0 0 1 Tuesday 12:00:00 AM None http://www.google.com 30:12:12'
string 2: '0 0 1 Wednesday 13:00:00 AM None cnn.fr 30:12:12'

My regular expression doesn't capture the whole http://www.google.com, nor other addresses such as cnn.fr.

Can someone help me out with my regular expression? I'm pretty rusty at these things!

'[\w]\.\[a-zA-Z0-9\w]

Thanks!
Last edited by micseydel on Wed Nov 06, 2013 10:30 pm, edited 1 time in total.
Reason: First post lock.

Re: Regular Expression

Postby metulburr » Wed Nov 06, 2013 10:37 pm

Assuming it stays in that same format, you can avoid using regex altogether:
Code:
a = "string: '0 0 1 Tuesday 12:00:00 AM None http://www.google.com 30:12:12'"
b = "string 2: '0 0 1 Wednesday 13:00:00 AM None cnn.fr 30:12:12'"
print(a.split()[-2])
print(b.split()[-2])


--output--
Code:
http://www.google.com
cnn.fr
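
This works because split() with no arguments breaks the line on any run of whitespace, and index -2 counts from the end, so the trailing quote stuck to the last field doesn't matter. A minimal sketch of the same idea wrapped in a helper, assuming the address is always the second-to-last field (the function name second_to_last_field is made up for illustration):

```python
def second_to_last_field(line):
    """Return the second-to-last whitespace-separated field, or None."""
    fields = line.split()
    # Guard against short or empty lines instead of raising IndexError.
    if len(fields) < 2:
        return None
    return fields[-2]


a = "string: '0 0 1 Tuesday 12:00:00 AM None http://www.google.com 30:12:12'"
b = "string 2: '0 0 1 Wednesday 13:00:00 AM None cnn.fr 30:12:12'"
print(second_to_last_field(a))
print(second_to_last_field(b))
```

If the number of fields after the address can change, this breaks, and indexing from the front or a regex would be the safer bet.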


Re: Regular Expression

Postby micseydel » Wed Nov 06, 2013 10:38 pm

Please repost your regex split onto multiple lines with implicit string concatenation and comments, for example:
Code:
pat = (
    r"[\w]"  # exactly one word character (letter, digit, or underscore)
    r"\."  # a literal dot
    r"\[a-zA-Z0-9\w]"  # a literal "[", the literal text "a-zA-Z0-9", one word character, then a literal "]"
)

Also post your full code (how you're using the regex) as well as the exact result that you want and what you're actually getting.

EDIT: Just saw metulburr's suggestion. Definitely better. Thanks buddy :)
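
For anyone who still wants the regex route: the original pattern has no quantifiers, so it can match only one word character before the dot, and the backslash in front of the second bracket turns it into a literal "[" rather than the start of a character class. Below is a rough sketch of a pattern that handles the two sample strings, written with re.VERBOSE so each piece can carry a comment. The names DOMAIN_RE and find_domain are made up for illustration, and the pattern is an assumption about what counts as an address here, not a general URL validator.

```python
import re

# Hypothetical pattern: an optional scheme, dotted labels, then a TLD.
DOMAIN_RE = re.compile(
    r"""
    (?:https?://)?      # optional scheme such as http:// or https://
    (?:[\w-]+\.)+       # one or more labels, each followed by a literal dot
    [a-zA-Z]{2,}        # top-level domain: at least two letters
    """,
    re.VERBOSE,
)


def find_domain(line):
    """Return the first address-like token in line, or None if there is none."""
    match = DOMAIN_RE.search(line)
    return match.group(0) if match else None


print(find_domain("0 0 1 Tuesday 12:00:00 AM None http://www.google.com 30:12:12"))
print(find_domain("0 0 1 Wednesday 13:00:00 AM None cnn.fr 30:12:12"))
```

The timestamps are skipped because they contain colons rather than dots. Paths, ports, and query strings are not captured, so for fixed-format lines the split approach is still simpler.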

Re: Regular Expression

Postby ochichinyezaboombwa » Wed Nov 06, 2013 10:40 pm

Looks like your data is nicely formatted and you only need to extract the column at index 7 (0-based) from each line.

EDIT: Oops, metulburr was first :-)
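
To see where index 7 comes from, here is a small sketch that numbers the fields of one sample line. It assumes the bare line, without the "string: '" wrapper from the original post; with that wrapper the indices shift.

```python
line = "0 0 1 Tuesday 12:00:00 AM None http://www.google.com 30:12:12"

fields = line.split()

# Print each field next to its 0-based index to locate the address column.
for index, field in enumerate(fields):
    print(index, field)

address = fields[7]
print(address)
```

With nine fields per line, index 7 and index -2 point at the same column.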
Last edited by ochichinyezaboombwa on Wed Nov 06, 2013 10:42 pm, edited 1 time in total.

Re: Regular Expression

Postby Fish » Wed Nov 06, 2013 10:41 pm

Ah, that worked a lot better! Thanks!

