Search in range


Postby Dareios » Fri Sep 27, 2013 6:26 pm

I have a bunch of text files, each containing specific parts of a larger text. Each line is linked to the composite text by a referrer (e.g., "A 42" or "A 200"). The code below works perfectly for a single specified referrer (here "A 42") and prints its occurrences in the aforementioned text files one by one:

Code: Select all
    for score in os.listdir(path):
        with open(os.path.join(path, score), "rb") as text:
            for prev_line, line in linePairs(text):
                if re.search("A 42", line):
                    print line, prev_line


But I would like to search a larger range (e.g., "A 1" to "A 100") across all these files and have the results printed in blocks: all occurrences of "A 1", then all occurrences of "A 2", and so forth. I tried the following, but it doesn't work. I also tried passing ( "A", i, line ) to the search call, but this produces a syntax error.

Code: Select all
    for score in os.listdir(path):
        with open(os.path.join(path, score), "rb") as text:
            for i in range( 10, 100 ):
               for prev_line, line in linePairs(text):
                   if re.search( i, line ):
                          print line


It says: "first argument must be string or compiled pattern". I simply don't get it. I am unfortunately new to Python, but appreciate its functionality. In all fairness, I had help with the first attempt. Now, I am trying to expand everything.

Thanks for any advice,
D.
Last edited by Yoriz on Fri Sep 27, 2013 8:50 pm, edited 1 time in total.
Reason: First post lock

Re: Search in range

Postby Yoriz » Fri Sep 27, 2013 8:50 pm

Try using this line in your code:
Code: Select all
if re.search('A {}'.format(i), line):

Find out about string formatting in these links:
Strings
String Formatting
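
The error message comes from the fact that re.search expects a string (or compiled pattern) as its first argument, while i from range is an integer. Below is a minimal sketch of building the pattern from the number. The helper name make_pattern and the trailing \b word boundary are assumptions added for illustration, not part of the suggestion above; the boundary is there because a plain 'A 4' would also match a line containing "A 42".

```python
import re

def make_pattern(i):
    # Hypothetical helper: compiled pattern for the referrer "A <i>".
    # The trailing \b is an added assumption so "A 4" does not match "A 42".
    return re.compile(r'A {}\b'.format(i))

line = '>>A 42'

# Plain formatting, as in the suggested line: finds the referrer.
print(bool(re.search('A {}'.format(42), line)))

# Plain formatting with 4 also matches, because "A 4" is a prefix of "A 42".
print(bool(re.search('A {}'.format(4), line)))

# With the word boundary the partial match is rejected.
print(bool(make_pattern(4).search(line)))
```

Whether the boundary is needed depends on the range being searched; for a range such as 1 to 100 it probably is.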

Re: Search in range

Postby Dareios » Fri Sep 27, 2013 10:36 pm

Thanks, it works, but it just returns the first number in the range, as if the variable i stays the same. I tried to add +1 to the range-properties, but it doesn't work.

I tried the following:

Code: Select all
    for score in os.listdir(path):
        with open(os.path.join(path, score), "rb") as text:
            for i in range( 55, 60, +1 ):
               for prev_line, line in linePairs(text):
                     if re.search('A {}'.format(i), line):
                          print line, prev_line
                     i = i + 1


Now, I get the range of lines A 55 to A 60 for each file separately. Is there a way to group all A 55 together followed by A 56 and so on?

Thanks for any further assistance.
D.

Re: Search in range

Postby micseydel » Fri Sep 27, 2013 11:19 pm

It would help me a lot in understanding the problem if you posted example input and output. To keep things simple, the output can just be the value you want pulled out of the file and into a variable.

Re: Search in range

Postby Dareios » Sat Sep 28, 2013 7:56 am

I have several text files in one folder. Here are three examples:

Code: Select all
 #text file no 1
1. line 1
>>A 33
2. line 2
>>A 34
3. line 3
>>A 35


Code: Select all
 #text file no 2
1. line 1
>>A 34
2. line 2
>>A 35
3. line 3
>>A 36


Code: Select all
 #text file no 3
1. line 1
>>A 36
2. line 2
>>A 37
3. line 3
>>A 38


If I now search a range of "A" numbers, I would like a result like this (ideally with each A referrer shown only once, followed by its occurrences, but that's not so important):

Code: Select all
 #Output
>>A 33
1. Line 1

>>A 34
2. Line 2
>>A 34
1. Line 1

>>A 35
3. Line 3
>>A 35
2. Line 2

>>A 36
3. Line 3
>>A 36
1. Line 1

>>A 37
2. Line 2

>>A 38
3. Line 3

Re: Search in range

Postby tnknepp » Tue Oct 01, 2013 2:57 pm

Dareios wrote:Thanks, it works, but it just returns the first number in the range, as if the variable i stays the same. I tried to add +1 to the range-properties, but it doesn't work.

I tried the following:

Code: Select all
    for score in os.listdir(path):
        with open(os.path.join(path, score), "rb") as text:
            for i in range( 55, 60, +1 ):
               for prev_line, line in linePairs(text):
                     if re.search('A {}'.format(i), line):
                          print line, prev_line
                     i = i + 1


Now, I get the range of lines A 55 to A 60 for each file separately. Is there a way to group all A 55 together followed by A 56 and so on?

Thanks for any further assistance.
D.



The for loop automatically increments <i>, so you do not need the "i = i + 1" line. This leads to confusion (not just for humans, but the computer as well).
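
A small sketch of two effects that may be at play here, using an in-memory file (io.StringIO, with made-up contents) as a stand-in for one of the text files. First, reassigning i inside the loop body is simply overwritten at the next iteration. Second, a file object can only be iterated once, so, assuming <linePairs> just reads lines from the file it is given, the inner loop would find nothing after the first value of i unless the file is rewound or reopened.

```python
import io

# Stand-in for one opened text file (contents are made up for illustration).
text = io.StringIO(u">>A 55\n>>A 56\n")

seen = []
for i in range(55, 57):
    # Iterating over the file consumes it; the second pass gets no lines.
    matches = [line.strip() for line in text if 'A {}'.format(i) in line]
    seen.append((i, matches))
    i = i + 100   # no effect on the loop: the next iteration rebinds i

print(seen)

# Rewinding makes the lines available again.
text.seek(0)
print([line.strip() for line in text])
```

Swapping the loops (numbers outside, files inside) or collecting matches in a dictionary avoids having to rewind at all.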

Why not use a dictionary? If I understand your end-goal, you just want to identify the line number of each file that corresponds to a given "A #" value (in your code I do not understand where <prev_line> comes into play, nor do I know what <linePairs> is, but I will give it a go anyway).

Code: Select all
import os

# Make dictionary: one empty list per "A #" value
tmp = dict( [(r,[]) for r in range(55,60)] )

for score in os.listdir(path):
    # Assuming EOL = '\n' and "A #" is only contents of each line; can modify as needed
    with open(os.path.join(path,score),'r') as f:
        text = f.read().split('\n')
    for r in range(55,60): # Bad form to use <i>, in general
        if 'A ' + str(r) in text:
            # I don't know how you define the file number, so the file name (score) is used here
            tmp[r].append( (score,text.index('A '+str(r))) )
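
For the sample files posted earlier in the thread, the grouping could be sketched roughly as below. This assumes each referrer line (">>A 34") directly follows the text line it belongs to, as in those examples. The names line_pairs, group_by_referrer and print_groups are hypothetical; line_pairs is only a guess at what the unseen <linePairs> helper does. Each file is read once, matches are collected in a dictionary keyed by the number, and printing happens afterwards in numeric order.

```python
import os
import re
from collections import defaultdict

REFERRER = re.compile(r'A (\d+)\b')   # captures the number in "A 34"

def line_pairs(lines):
    # Hypothetical stand-in for linePairs: yields (previous line, current line).
    prev = None
    for line in lines:
        line = line.rstrip('\r\n')
        if prev is not None:
            yield prev, line
        prev = line

def group_by_referrer(path, first, last):
    # Maps each number in first..last to the text lines found above its referrer.
    groups = defaultdict(list)
    for score in sorted(os.listdir(path)):
        with open(os.path.join(path, score), 'r') as text:
            for prev_line, line in line_pairs(text):
                match = REFERRER.search(line)
                if match and first <= int(match.group(1)) <= last:
                    groups[int(match.group(1))].append(prev_line)
    return groups

def print_groups(groups):
    for number in sorted(groups):
        print('>>A {}'.format(number))   # referrer once, then its occurrences
        for occurrence in groups[number]:
            print(occurrence)
        print('')
```

A call such as print_groups(group_by_referrer(path, 33, 38)) would then print one block per referrer, with files visited in sorted name order.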

Re: Search in range

Postby ochichinyezaboombwa » Tue Oct 01, 2013 4:58 pm

It would help you a lot if you sorted all your files and put the sorted contents into one file. On ****x, it's done by the command:
Code: Select all
cat file*.txt | sort -r > all.srt

(assuming you have meaningful file names like file1.txt .. file100.txt).

Then, throw away all of your code and make it work with just one file "all.srt".

