.isfile help

Postby always_stuck » Fri Sep 20, 2013 11:13 am

Hi,

I'm trying to write a script that reads in directories, file names and extensions from a csv file. It then builds a file path, checks whether the file exists, and does something with each file. My script is not giving me the results I expect: it fails to find over half my files. I've manually checked that these files exist, and I can't see any problems with the file paths created (possible escape character issues?). Any help/suggestions would be much appreciated. Thanks

part of code;
Code: Select all
with open('EcoDocs TK pdfs.csv', 'rb') as pdf_in:
    pdflist = csv.reader(pdf_in, quotechar='"')
    for row in pdflist:
        if row[1].endswith(row[2]): #check to see if file name includes extension
            pathname = os.path.join(row[0:2])
        else:
            pathname = os.path.join(row)
        if os.path.isfile(pathname):       
            filehash = md5.md5(file(pathname).read()).hexdigest()


example csv input file;

Code: Select all
c:\directory\sub-directory\final directory\    filename.pdf    .pdf
c:\directory2\sub-directory2\final directory2\    filename2.doc    .doc

resulting file paths;

    c:\directory\sub-directory\final directory\filename.pdf
    c:\directory2\sub-directory2\final directory2\filename2.doc
Last edited by Yoriz on Fri Sep 20, 2013 11:25 am, edited 1 time in total.
Reason: First post lock

Re: .isfile help

Postby Yoriz » Fri Sep 20, 2013 11:26 am

Hi
Welcome to the forum.
Please ensure you have read the 'new user read this' link in my signature.

Re: .isfile help

Postby stranac » Fri Sep 20, 2013 12:55 pm

By default, the csv module uses ',' as separator.
To use a different separator, you'll need to provide the appropriate option when creating the reader/writer object.
More details in the docs: http://docs.python.org/2/library/csv.ht ... fmt-params
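
For reference, a minimal sketch of passing a different separator to csv.reader. The tab-separated sample row is made up for illustration, and the snippet is written for Python 3 rather than the Python 2 used in this thread; delimiter and quotechar are standard formatting parameters of the csv module.

```python
import csv
import io

# Hypothetical tab-separated row standing in for a real file on disk.
sample = "c:\\dir\\final directory\\\tfilename.pdf\t.pdf\n"

# The default delimiter is ','; pass delimiter= to split on something else.
reader = csv.reader(io.StringIO(sample), delimiter="\t", quotechar='"')
rows = list(reader)
print(rows)
```

Backslashes in the fields come through unchanged, because csv.reader does not treat them as escape characters unless an escapechar is set.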

Re: .isfile help

Postby always_stuck » Fri Sep 20, 2013 1:17 pm

Hi stranac.

Yeah, I'm aware of how csv separates fields. The example input I showed was meant to represent the column separation when the csv file is opened in Excel. It is still a csv file and therefore separated by commas. As you can see from the example file paths created by this script, csv.reader is handling the input OK. My main confusion is whether the file paths created are failing because of escape characters. Roughly half of the file paths are recognized by .isfile, and I can't see any obvious differences from the half that aren't.

Re: .isfile help

Postby stranac » Fri Sep 20, 2013 2:27 pm

Oh, sorry, I misunderstood your problem then.

No, since the strings are being read from a file, everything should already be escaped properly.
Try printing repr(pathname) right after it's created. Maybe that will give you a clue about what's happening.

If that doesn't give you any useful info, attach your csv file.
Maybe we'll be able to see something you're missing.
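
A small sketch of why repr() helps here, assuming a made-up path with a trailing space (the kind of thing a csv cell can carry unnoticed). Plain print hides the space, while repr() wraps the value in quotes and shows every backslash doubled, so stray characters become visible. Written for Python 3.

```python
# Hypothetical value; the trailing space is deliberate.
pathname = "c:\\dir\\filename.doc "

shown = repr(pathname)
print(pathname)  # the trailing space is invisible here
print(shown)     # the closing quote exposes the trailing space
```

The doubled backslashes in the repr() output are only how Python displays a single backslash; the string itself still contains one backslash per separator.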

Re: .isfile help

Postby always_stuck » Fri Sep 20, 2013 3:12 pm

Thanks stranac. repr(pathname) produced the following result;

Code: Select all
['c:\\directory\\sub-directory\\final','filename.doc']


I guess this is what I would expect (although I'm still unsure, so feel free to tell me otherwise).

If that is as expected, I have attached an example csv file with the same escape characters as seen in the real file paths I'm working with (I couldn't attach the real files as they're commercially sensitive). The top two files aren't recognised by .isfile despite definitely existing, while the bottom two are.
Attachments
Example csv.csv

Re: .isfile help

Postby stranac » Fri Sep 20, 2013 5:35 pm

That looks... wrong...
The code you've shown uses os.path.join to create filepath. That function should return a string, not a list.

I can't take a look at the csv file atm, since my phone doesn't know how to open it, and it won't let me change the extension :s
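
One possible explanation, offered as an assumption rather than a confirmed diagnosis: os.path.join expects each component as a separate argument, and the posted code passes it a single list. Older Python 2 releases may simply hand a lone argument back unchanged, which would produce the list seen above; Python 3 raises a TypeError instead. The sketch below unpacks the list with * and uses ntpath (the Windows flavour of os.path) so it behaves the same on any platform. The row values are taken from the example in this thread.

```python
import ntpath  # Windows implementation of os.path, importable everywhere

row = ["c:\\directory\\sub-directory\\final", "filename.doc", ".doc"]

# Unpack the slice so each element becomes its own argument to join().
pathname = ntpath.join(*row[0:2])
print(repr(pathname))
```

On Windows itself, os.path.join(*row[0:2]) would do the same thing.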

Re: .isfile help

Postby micseydel » Fri Sep 20, 2013 6:45 pm

stranac wrote:I can't take a look at the csv file atm, since my phone doesn't know how to open it, and it won't let me change the extension :s

Code: Select all
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\,5_l B.xls,.xls,doesn't work
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\,5_l A.pdf,.pdf,doesn't work
c:\2dir\12 03 15\,12 03 15.docx,.docx,works
c:\2dir\12 03 15\,13 03 15.xls,.xls,works


Please post the code you use repr() with to get that output.

Re: .isfile help

Postby stranac » Fri Sep 20, 2013 7:59 pm

You just told me the data is comma delimited, but it's not.
The file uses space as the delimiter, and the data seems to be unquoted strings which also contain spaces.
That doesn't really seem like a good format...

Re: .isfile help

Postby always_stuck » Fri Sep 20, 2013 8:48 pm

I might be missing something blindingly obvious but what suggests space is the delimiter? I thought all csv generated files were automatically comma delimited? Also csv.reader defaults to comma delimited, and printing the indexed elements of each row in pdflist shows it is being read correctly, suggesting it is comma delimited...

Re: .isfile help

Postby stranac » Sat Sep 21, 2013 7:26 am

Yes, sorry, I was wrong. :oops:
It's just that dots and commas look exactly the same on my phone.

Re: .isfile help

Postby always_stuck » Sat Sep 21, 2013 9:48 am

I've just realised I've posted the wrong version of my code. os.path.join creates pathnames identical to the results of repr(pathname) that I posted. The code I used to create the pathnames in my original post used ''.join() instead of os.path.join. So you're right that os.path.join should be returning a string, but it is actually returning a list of strings. I don't understand why it's doing this though...

Re: .isfile help

Postby stranac » Sat Sep 21, 2013 10:13 am

Well, I'm pretty sure there is no situation in which os.path.join() would return a list.
You should post the actual code you're using right now, along with the pathnames you get when running it.

Re: .isfile help

Postby always_stuck » Sat Sep 21, 2013 5:28 pm

OK, this is my code and the pathnames returned by it.

Code: Select all
with open('EcoDocs TK pdfs.csv', 'rb') as pdf_in:
    pdflist = csv.reader(pdf_in, quotechar='"')
    for row in pdflist:
        if row[1].endswith(row[2]): #check to see if file name includes extension
            pathname = ''.join(row[0:2])
        else:
            pathname = ''.join(row)
        if os.path.isfile(pathname):       
            filehash = md5.md5(file(pathname).read()).hexdigest()




example csv input file;

Code: Select all
c:\directory\sub-directory\final directory\    filename.pdf    .pdf
c:\directory2\sub-directory2\final directory2\    filename2.doc    .doc

resulting file paths;

c:\directory\sub-directory\final directory\filename.pdf
c:\directory2\sub-directory2\final directory2\filename2.doc

Re: .isfile help

Postby always_stuck » Mon Sep 23, 2013 9:58 am

Just to update, printing repr(pathname) immediately after pathname is created results in the following;

Code: Select all
'c:\\directory\\sub-directory\\final\\filename.doc'


Is this more what should be expected?

Still haven't had any luck figuring this one out.
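
One way to narrow this kind of problem down, sketched here as a generic debugging aid rather than a fix confirmed in this thread: walk the path from the root downwards and report the first part that the operating system says does not exist. That shows whether the directory or the file name is the piece being rejected. The helper name first_missing_component and the temporary directory are made up for the demonstration; the snippet is written for Python 3.

```python
import os
import tempfile


def first_missing_component(pathname):
    """Return the shortest leading part of pathname that does not exist, or None."""
    # Collect the path and each of its ancestors, longest first.
    prefixes = []
    current = pathname
    while True:
        prefixes.append(current)
        parent = os.path.dirname(current)
        if not parent or parent == current:
            break
        current = parent
    # Check from the root downwards and stop at the first failure.
    for prefix in reversed(prefixes):
        if not os.path.exists(prefix):
            return prefix
    return None


# Demonstration with a throwaway directory and a deliberately missing subdirectory.
base = tempfile.mkdtemp()
missing = os.path.join(base, "no such dir", "file.txt")
result = first_missing_component(missing)
print(repr(result))
```

Calling first_missing_component(pathname) on one of the paths that .isfile rejects would show how far down the path the lookup gets before failing.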