Storing jpg array in dictionary

Postby tnknepp » Thu Dec 05, 2013 2:49 pm

I have a camera that records two images per minute, resulting in 2880 images per day at ~180 KB/image. Over a period of several months this adds up to many images, which makes transferring the data cumbersome. To get around this I tar the images into monthly tar files (e.g. 201301.tar, 201302.tar). I use the images in my analysis work, so I do need access to the data, and I hit two problems:

1. To load an individual JPG I need to un-tar the month's images (I would like to avoid the time involved in un-tarring).
2. Reading in individual JPGs is slow when I have to read in several days' worth of data.

Is there a practical way of storing the image data in a single file (more likely I will store data on a month-by-month basis again)? I think storing as a dictionary (key=datetime value, value=array (size=480 x 660 x 3)) makes sense, but I have issues in re-loading the data.

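Code:
# Example
import datetime as dt
import numpy as np
from scipy.misc import imread as ir  # available in the poster's SciPy; removed from later SciPy releases

dts = dt.datetime.strptime

# Read the JPEG as a 480 x 660 x 3 colour array
image = ir('skycam-current.jpg', flatten=False)

data = {}
data[dt.datetime(2013, 1, 1)] = image  # plain 1, not 01: leading zeros are a syntax error in Python 3
np.savez('test.dic.npz', data)  # the dict is saved as one unnamed entry, 'arr_0'

# Then, to re-load the data
data = np.load('test.dic.npz')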


When I re-load the data, I get a dictionary, of sorts:
Code:
>>> type(data)
numpy.lib.npyio.NpzFile

>>> data.keys()
['arr_0']


If I try to recover the dictionary, I get a 0-d array:
Code:
>>> a = data['arr_0']
>>> a.shape
()


Can anyone recommend a better method of either storing the dictionary, or storing the image data?
Python: 2.7 via Anaconda
Numpy: 1.7
Pandas: 0.11
OS: Windows 7
IDE: Spyder/IPython
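
The 0-d array shown above is an object array that wraps the pickled dictionary, so calling .item() on it should return the original dict. A minimal sketch of that, plus an alternative that stores one named array per image, follows. It assumes a recent NumPy, where np.load needs allow_pickle=True to unpickle objects (the NumPy 1.7 listed above predates that argument and unpickles by default). The zero-filled image, the temporary path, and the '%Y%m%d_%H%M%S' key format are placeholders chosen for illustration.

```python
import os
import tempfile
import datetime as dt
import numpy as np

# Stand-in for a decoded 480 x 660 x 3 JPEG (assumption: uint8 RGB).
image = np.zeros((480, 660, 3), dtype=np.uint8)
data = {dt.datetime(2013, 1, 1): image}

# Temporary location so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), 'test.dic.npz')

# Option 1: savez pickles the dict into a 0-d object array;
# .item() unwraps that array back into the dict.
np.savez(path, data)
with np.load(path, allow_pickle=True) as npz:
    recovered = npz['arr_0'].item()

# Option 2: one named array per image, keyed by a timestamp string,
# so no pickling is involved at all.
keyed = {k.strftime('%Y%m%d_%H%M%S'): v for k, v in data.items()}
np.savez(path, **keyed)
with np.load(path) as npz:
    restored = {dt.datetime.strptime(k, '%Y%m%d_%H%M%S'): npz[k]
                for k in npz.files}
```

The second form lets np.load read a single image by key without loading the rest, though every image is stored decoded (about 950 KB per 480 x 660 x 3 uint8 array, versus ~180 KB as a JPEG).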

Re: Storing jpg array in dictionary

Postby micseydel » Thu Dec 05, 2013 7:47 pm

I would imagine that sqlite or mongodb or really any lightweight database could do what you want and be quite fast. Or if you have enough memory for it and don't want to go to the effort of learning the database stuff, you could simply use JSON. I'm not sure how long it would take to load a 16 GB+ JSON file into memory, but once you'd done so it should be very fast to access things.

I suspect using a database is probably what you want though, and luckily Python has simple options here. I think you might like mongodb in particular because you can store Python dictionaries in it quite transparently.
Join the #python-forum IRC channel on irc.freenode.net!

Please do not PM members regarding questions which are meant to be discussed publicly. The point of the forum is so that others can benefit from it. We don't want to help you over PMs or emails.

Re: Storing jpg array in dictionary

Postby tnknepp » Fri Dec 06, 2013 7:04 pm

I am trying to avoid databases since I have no experience with them. This may be worth checking into though. Thanks for the tip.

Re: Storing jpg array in dictionary

Postby micseydel » Sat Dec 07, 2013 12:11 am

I can appreciate that, but mongodb is really intuitive in terms of being able to just stick dictionaries in it. Once you get it working you probably won't regret it; don't be discouraged by the little bit of work it takes to get there!

Re: Storing jpg array in dictionary

Postby tnknepp » Mon Dec 09, 2013 3:08 pm

I am looking at the pymongo docs and playing around, trying to create a simple database. It looks like you have to make a connection, e.g.

Code:
>>> from pymongo import MongoClient  # Connection is deprecated in favour of MongoClient
>>> connection = MongoClient()
>>> db = connection['test-database']
>>> collection = db['test-collection']


Can I not just make a local database without connecting to a server?

Re: Storing jpg array in dictionary

Postby micseydel » Mon Dec 09, 2013 6:01 pm

IIRC it's typical of a database to have to do that. If you need to dump it to a file to store it you can do so though.
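
sqlite, mentioned earlier in the thread, is one option that needs no server: Python's built-in sqlite3 module keeps the whole database in a single local file. The sketch below stores the encoded JPEG bytes as a BLOB keyed by an ISO timestamp. The table name, column names, and helper functions are hypothetical, and ':memory:' is used only so the example leaves no file behind.

```python
import sqlite3
import datetime as dt

# In-memory database for illustration; a filename would persist it to disk.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE images (taken_at TEXT PRIMARY KEY, jpeg BLOB NOT NULL)')

def store_image(conn, taken_at, jpeg_bytes):
    """Insert one encoded JPEG under its ISO-formatted timestamp."""
    conn.execute('INSERT OR REPLACE INTO images VALUES (?, ?)',
                 (taken_at.isoformat(), sqlite3.Binary(jpeg_bytes)))
    conn.commit()

def load_image(conn, taken_at):
    """Return the stored JPEG bytes, or None if the timestamp is absent."""
    row = conn.execute('SELECT jpeg FROM images WHERE taken_at = ?',
                       (taken_at.isoformat(),)).fetchone()
    return None if row is None else bytes(row[0])

# Placeholder bytes standing in for the contents of a real .jpg file.
fake_jpeg = b'\xff\xd8\xff\xe0 placeholder \xff\xd9'
stamp = dt.datetime(2013, 1, 1, 0, 0, 30)
store_image(conn, stamp, fake_jpeg)
```

Passing a filename (for example a hypothetical '201301.db') to sqlite3.connect instead of ':memory:' would give one transferable file per month, and the images stay JPEG-compressed inside it.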

Re: Storing jpg array in dictionary

Postby ochichinyezaboombwa » Tue Dec 10, 2013 6:48 am

1) tar does NOT save any space; it just joins a bunch of files together.
2) If you need to save some space locally, use zip / gzip, not tar.
3) You don't need to tar up several files (say, a month's worth) together. Just put a month's worth of files in a separate directory and either a) give them unambiguous names, such as 2013_12/01.15.jpg, or b) create a separate text file that maps a timestamp to a filename, and use it.
4) "Reading in individual JPGs is slow when I have to read in several days' worth of data": it might be because you read many files instead of one.
Un-zipping one file whose original size is 180 KB with modern hardware/software is EXTREMELY FAST (milliseconds), so don't create a problem where it doesn't exist.

I bet all the time your program spends is in processing the JPG data.
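
The "unambiguous names" idea in point 3 can be sketched as a pair of helpers that convert between a capture time and a path. The directory layout, the helper names, and the 'skycam' root are assumptions for illustration; seconds are included in the name because the camera takes two images per minute.

```python
import datetime as dt
import os

def image_path(root, taken_at):
    """Map a capture time to <root>/<YYYY_MM>/<DD.HH.MM.SS>.jpg (hypothetical layout)."""
    return os.path.join(root,
                        taken_at.strftime('%Y_%m'),
                        taken_at.strftime('%d.%H.%M.%S') + '.jpg')

def parse_path(path):
    """Recover the capture time from a path produced by image_path."""
    month_dir = os.path.basename(os.path.dirname(path))
    stem = os.path.splitext(os.path.basename(path))[0]
    return dt.datetime.strptime(month_dir + ' ' + stem, '%Y_%m %d.%H.%M.%S')

stamp = dt.datetime(2013, 12, 1, 15, 0, 30)
path = image_path('skycam', stamp)
```

With names like these, a separate timestamp-to-filename index is not needed, since the mapping can be computed in both directions.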

Re: Storing jpg array in dictionary

Postby tnknepp » Tue Dec 10, 2013 2:54 pm

The files were put into a tar file not to save space (zipping JPGs saves virtually no space, so no need to waste time with that), but to make transferring the files much quicker since, as you say, transferring many small files takes much longer than one big one.

You are right that I can tar/untar a 180kb file quite quickly, so I may just go the route of pulling jpgs out of tar as needed.

Re: Storing jpg array in dictionary

Postby micseydel » Tue Dec 10, 2013 5:07 pm

tnknepp wrote: You are right that I can tar/untar a 180kb file quite quickly, so I may just go the route of pulling jpgs out of tar as needed.

I thought the problem was that getting a small file out of a large tar meant you had to untar the entire multi-gigabyte tar?

Re: Storing jpg array in dictionary

Postby tnknepp » Tue Dec 10, 2013 7:45 pm

micseydel wrote: I thought the issue was that getting that small file out of a large tar was an issue because you had to untar the entire multi-gigabyte tar?

So did I. I tried pulling out just one file at a time and that works. However, to do this I must extract the file, save it to disk, then re-read it with scipy.misc.imread(). This is slower than reading the non-tarred file from disk, so I will continue to look for other options, but this may work as a last resort.
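
The intermediate save to disk may be avoidable: the standard library's tarfile.extractfile returns a file-like object for a single member, so its bytes can be read straight into memory. The sketch below builds a tiny demo archive in memory and reads one member back. The member names and payloads are made up, and decoding the bytes into an array (for example with an image library that accepts io.BytesIO) is left out as an assumption about the reader's setup.

```python
import io
import tarfile

def build_demo_tar(members):
    """Create an uncompressed tar in memory from {name: bytes} (demo data only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w') as tar:
        for name, payload in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf

def read_member(fileobj, member_name):
    """Return one member's bytes straight from the archive, with no temp file."""
    with tarfile.open(fileobj=fileobj, mode='r') as tar:
        # extractfile gives a file-like object, or None for non-file members.
        member = tar.extractfile(member_name)
        if member is None:
            raise KeyError(member_name)
        return member.read()

archive = build_demo_tar({
    '20130101_000000.jpg': b'first image bytes',
    '20130101_000030.jpg': b'second image bytes',
})
jpeg_bytes = read_member(archive, '20130101_000030.jpg')
```

For a real monthly archive, tarfile.open('201301.tar') would replace the in-memory buffer. When many images are needed, keeping the TarFile open across reads avoids rescanning the archive headers each time.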

Re: Storing jpg array in dictionary

Postby hrs » Tue Dec 10, 2013 9:41 pm

Does this transferring of files occur over the network? If so, it might be worth setting up something like rsync and never worrying about it again.

