Encoding issue when renaming filenames from english => greek

Post by NikosGr » Tue Jun 04, 2013 5:36 pm

Code:
-------------------------------------------------
import os

print( '''Content-type: text/html; charset=utf-8\n''' )

# Compute a set of current fullpaths
fullpaths = set()
path = "/home/nikos/www/data/apps/"

# Walk the tree and record the full path of every file found
for root, dirs, files in os.walk(path):
    for fullpath in files:
        fullpaths.add( os.path.join(root, fullpath) )


I don't have to deal with the files' contents, only with the filenames themselves.

Code:
root@nikos [~]# ls -l /home/nikos/www/data/apps/
total 368548
drwxr-xr-x 2 nikos nikos     4096 Jun  4 14:49 ./
drwxr-xr-x 6 nikos nikos     4096 May 26 21:13 ../
-rwxr-xr-x 1 nikos nikos 13157283 Mar 17 12:57 100\ Mythoi\ tou\ Aiswpou.pdf*
-rwxr-xr-x 1 nikos nikos 29524686 Mar 11 18:17 Anekdotologio.exe*
-rw-r--r-- 1 nikos nikos 42413964 Jun  2 20:29 Battleship.exe
-rw-r--r-- 1 nikos nikos   236032 Jun  4 14:10 \323\352\335\370\357\365\ \335\355\341\355\ \341\361\351\350\354\374.exe
-rwxr-xr-x 1 nikos nikos 66896732 Mar 17 13:13 Kosmas\ o\ Aitwlos\ -\ Profiteies.pdf*
-rw-r--r-- 1 nikos nikos 51819750 Jun  2 20:04 Luxor\ Evolved.exe
-rw-r--r-- 1 nikos nikos 60571648 Jun  2 14:59 Monopoly.exe
-rw-r--r-- 1 nikos nikos  3511233 Jun  4 14:11 \305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3
-rwxr-xr-x 1 nikos nikos  1788164 Mar 14 11:31 Online\ Movie\ Player.zip*
-rw-r--r-- 1 nikos nikos  5277287 Jun  1 18:35 O\ Nomos\ tou\ Merfy\ v1-2-3.zip
-rwxr-xr-x 1 nikos nikos 16383001 Jun 22  2010 Orthodoxo\ Imerologio.exe*
-rw-r--r-- 1 nikos nikos  6084806 Jun  1 18:22 Pac-Man.exe
-rw-r--r-- 1 nikos nikos 25476584 Jun  2 19:50 Scrabble.exe
-rwxr-xr-x 1 nikos nikos 49141166 Mar 17 12:48 To\ 1o\ mou\ vivlio\ gia\ to\ skaki.pdf*
-rwxr-xr-x 1 nikos nikos  3298310 Mar 17 12:45 Vivlos\ gia\ Atheofovous.pdf*
-rw-r--r-- 1 nikos nikos  1764864 May 29 21:50 V-Radio\ v2.4.msi
root@nikos [~]#
-------------------------------------------------


As you can see, the subdirectory 'apps' contains both English and Greek-lettered filenames.
Are both of those Unicode? Are the filenames of the actual files also encoded as byte streams, much like the contents inside them?
If they are Unicode, then I really see no trouble when trying to:

cur.execute('''SELECT url FROM files WHERE url = %s''', ( fullpath, ) )

but this is what I have been getting for days now:

Code:
[Tue Jun 04 20:33:28 2013] [error] [client 46.12.95.59] ValueError: underlying buffer has been detached
[Tue Jun 04 20:33:28 2013] [error] [client 46.12.95.59]
[Tue Jun 04 20:33:28 2013] [error] [client 46.12.95.59] Original exception was:
[Tue Jun 04 20:33:28 2013] [error] [client 46.12.95.59] Traceback (most recent call last):
[Tue Jun 04 20:33:28 2013] [error] [client 46.12.95.59]   File "files.py", line 72, in <module>
[Tue Jun 04 20:33:28 2013] [error] [client 46.12.95.59]     cur.execute('''SELECT url FROM files WHERE url = %s''', (fullpath,) )
[Tue Jun 04 20:33:28 2013] [error] [client 46.12.95.59]   File "/usr/local/lib/python3.3/site-packages/PyMySQL3-0.5-py3.3.egg/pymysql/cursors.py", line 108, in execute
[Tue Jun 04 20:33:28 2013] [error] [client 46.12.95.59]     query = query.encode(charset)
[Tue Jun 04 20:33:28 2013] [error] [client 46.12.95.59] UnicodeEncodeError: 'utf-8' codec can't encode character '\\udcc5' in position 61: surrogates not allowed
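
For reference, the last line of the traceback can apparently be reproduced without the database at all. This is only a minimal sketch: it assumes the .mp3 name consists of exactly the bytes that ls printed above, and that Python decodes filenames as UTF-8 with the "surrogateescape" error handler (which is what os.walk() does on a UTF-8 locale). The helper can_encode_utf8 is a hypothetical name used only for illustration.

```python
# Raw bytes of the .mp3 name, copied from the octal escapes in the ls output.
raw = b"\305\365\367\336 \364\357\365 \311\347\363\357\375.mp3"

# Assumption: filenames are decoded as UTF-8 with "surrogateescape", so every
# byte that is not valid UTF-8 becomes a lone surrogate U+DC80..U+DCFF.
name = raw.decode("utf-8", "surrogateescape")
first = name[0]  # stands in for the first byte, 0xC5 (octal 305)


def can_encode_utf8(text):
    """Hypothetical helper: True if text survives a strict UTF-8 encode."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


print(ascii(first))           # '\udcc5', the character named in the traceback
print(can_encode_utf8(name))  # False: strict UTF-8 refuses lone surrogates

# The original bytes are still recoverable with the same error handler.
print(name.encode("utf-8", "surrogateescape") == raw)  # True
```

If that assumption holds, the '\udcc5' in the log is not a real character but a placeholder for a filename byte that was not valid UTF-8, which would mean the two Greek names on disk are stored in some other encoding.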


What is the problem in your opinion? Since everything is encoded in UTF-8, because I'm using Python 3.3.2, what does this error mean?
Please tell me what to try; I am hopeless and very tired of this issue.
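
One thing that could be tried, sketched here under a loud assumption: that the two Greek names were uploaded in a legacy Greek encoding such as ISO-8859-7 (nothing in the listing confirms this; Windows-1253 is another candidate). The idea is to walk the directory with a bytes path, so os.walk() hands back raw bytes, and to work out which names would need re-encoding to UTF-8. The functions to_utf8_name and plan_renames are hypothetical names, not part of files.py.

```python
import os


def to_utf8_name(raw_name, legacy="iso-8859-7"):
    """Return raw_name unchanged if it is valid UTF-8, else re-encode it.

    Assumption: names that are not valid UTF-8 are in the `legacy` encoding.
    """
    try:
        raw_name.decode("utf-8")
        return raw_name
    except UnicodeDecodeError:
        # Raises UnicodeDecodeError again if the guess about `legacy` is wrong
        # and the name contains a byte that encoding does not define.
        return raw_name.decode(legacy).encode("utf-8")


def plan_renames(top):
    """Yield (old_path, new_path) bytes pairs for names needing re-encoding.

    `top` must be bytes so that os.walk() yields bytes names, not str.
    """
    for root, dirs, files in os.walk(top):
        for name in files:
            fixed = to_utf8_name(name)
            if fixed != name:
                yield os.path.join(root, name), os.path.join(root, fixed)


# Usage sketch: only list what would change; nothing is renamed here.
for old, new in plan_renames(os.fsencode("/home/nikos/www/data/apps/")):
    print(ascii(old), "->", ascii(new))
```

Passing each pair to os.rename(old, new) would apply the change, but it seems safer to inspect the printed plan first, since a wrong guess about the legacy encoding would produce wrong Greek names.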
NikosGr
 
Posts: 48
Joined: Thu Mar 28, 2013 6:31 pm
Location: Thessaloniki

Re: Encoding issue when renaming filenames from english => g

Post by NikosGr » Tue Jun 04, 2013 7:38 pm

Someone plz?
NikosGr
 
Posts: 48
Joined: Thu Mar 28, 2013 6:31 pm
Location: Thessaloniki

