Handling a large text file in python to populate a matrix


Post by shinigamiryuk » Sat Apr 19, 2014 3:19 pm

I am fairly new to Python and am trying to read a text file that contains distances between pairs of points. The file's contents look like this:
Code:
1 2 -2.5
2 3 7.8
6 10143 -3.86


The first two values on each line are integers (the point IDs) and the third is a float (the distance between those points). The text file is huge, with over 5M lines. What is the fastest way to read it and populate a matrix (say, a numpy matrix)?

So far I have done something like this:
Code:
with open("distances.txt") as f:  # text mode; the file is closed automatically
    for line in f:
        ls = line.split()  # split on any whitespace; also drops the trailing newline
        A = int(ls[0])
        B = int(ls[1])
        C = float(ls[2])
        # Write further code for storing data in distance matrix


But this is too slow. Please help.

Re: Handling a large text file in python to populate a matri

Post by stranac » Sat Apr 19, 2014 3:30 pm

There is numpy.loadtxt(); see if that helps.
But a huge file isn't going to load super fast no matter what.
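
A minimal sketch of that suggestion, assuming the three-column, whitespace-separated layout shown in the question. The helper name load_distance_matrix, the NaN fill for unlisted pairs, and sizing the matrix by the largest point ID are all assumptions made for illustration; none of them come from the thread.

```python
import io

import numpy as np


def load_distance_matrix(source):
    """Read 'i j distance' rows and return a dense square matrix.

    Hypothetical helper: assumes non-negative integer point IDs and
    leaves pairs that never appear in the file as NaN.
    """
    # ndmin=2 keeps the result 2-D even if the file has a single row
    data = np.loadtxt(source, ndmin=2)
    rows = data[:, 0].astype(int)
    cols = data[:, 1].astype(int)

    # Allocate the full matrix once, then fill it in one vectorized step
    size = max(rows.max(), cols.max()) + 1
    matrix = np.full((size, size), np.nan)
    matrix[rows, cols] = data[:, 2]
    return matrix


# Small in-memory stand-in for distances.txt
sample = io.StringIO("1 2 -2.5\n2 3 7.8\n6 10 -3.86\n")
dist = load_distance_matrix(sample)
print(dist.shape)   # (11, 11)
print(dist[1, 2])   # -2.5
```

With point IDs in the ten-thousands, as in the sample data, a dense float64 matrix of this kind would need several hundred megabytes, so it is worth checking the largest ID before allocating.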

Re: Handling a large text file in python to populate a matri

Post by tnknepp » Thu Apr 24, 2014 5:11 pm

Your file is not so large that it should take a prohibitively long time to load (~35MB, right?). Most of your time will be spent appending to a matrix unless you pre-allocate the matrix with the right number of rows. In addition to stranac's suggestion, I would recommend reading the file into a pandas DataFrame with pandas.read_csv(), then saving the data as a pickle, which will load much faster than re-reading the text file.
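
Roughly, that workflow might look like the sketch below. The column names a, b and dist, the temporary file paths, and the zero-filled square matrix are assumptions for illustration only; the sample rows stand in for the real distances.txt.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in for distances.txt so the sketch is self-contained
workdir = tempfile.mkdtemp()
txt_path = os.path.join(workdir, "distances.txt")
pkl_path = os.path.join(workdir, "distances.pkl")
with open(txt_path, "w") as f:
    f.write("1 2 -2.5\n2 3 7.8\n6 10 -3.86\n")

# One-time parse of the text file; the column names are illustrative
df = pd.read_csv(txt_path, sep=" ", header=None, names=["a", "b", "dist"])
df.to_pickle(pkl_path)

# Later runs can load the pickle instead of re-parsing the text
df = pd.read_pickle(pkl_path)

# Pre-allocate the matrix once, then fill it with one vectorized assignment
# rather than growing it row by row
size = int(max(df["a"].max(), df["b"].max())) + 1
matrix = np.zeros((size, size))
matrix[df["a"].values, df["b"].values] = df["dist"].values
print(matrix.shape)
```

The text file only needs to be parsed once; after that, pd.read_pickle() restores the DataFrame with its dtypes intact.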
Python: 2.7 via Anaconda
Numpy: 1.7
Pandas: 0.11
OS: Windows 7
IDE: Spyder/IPython