Output array dimensions with numpy.mean()


I continue to try to migrate from MatLab to being fully Python based in my analysis, but some (ok, MANY) things still do not make sense.
In MatLab if I want the first column of a matrix I simply do:
`first = b(:,1);`

which will yield a new matrix with dimensions of [r,1]. However, if I try something similar in Python:
`from numpy import *`
`first = b[:,0]`

I get a new array with dimensions [r,], a one-dimensional array. This is ok, until I try to join this new array with another of same length:
`new = hstack( (b[:,0], b[:,100:106]) )`

This elicits a very angry response from Python regarding a dimension mismatch. This can be overcome, in a somewhat ugly fashion, by:
`new = hstack( (b[:,0:1], b[:,100:106]) )`

Whereas in MatLab this is a very simple operation:
`new = [b(:,1),b(:,100:105)]`
since MatLab does not automatically output a 1d matrix.

Question 1: Is there a more elegant way of stacking the arrays in Python? I doubt it, and this is not a big deal since I can get around this minor inconvenience.
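
One possible answer to Question 1, as a minimal sketch. The array below is a made-up stand-in for `b` (its 4 x 120 shape is an assumption, not something from the post). `column_stack` treats 1-d inputs as columns, and indexing with a list of column numbers avoids stacking altogether:

```python
import numpy as np

# Hypothetical stand-in for the poster's array `b`: 4 rows, 120 columns.
b = np.arange(4 * 120).reshape(4, 120)

# column_stack turns 1-d inputs into columns, so b[:, 0] needs no reshaping.
new_a = np.column_stack((b[:, 0], b[:, 100:106]))

# Indexing with a list of column numbers picks everything in one step.
cols = [0] + list(range(100, 106))
new_b = b[:, cols]

# The 0:1 slice workaround from the post gives the same result.
new_c = np.hstack((b[:, 0:1], b[:, 100:106]))

print(new_a.shape, new_b.shape, new_c.shape)
```

All three results should have the same shape and contents, so which one counts as "elegant" is mostly a matter of taste.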

Now, when I continue on with my array work, I am hitting another issue with dimensions, this time using mean(). If I take my array and want to perform a mean over the columns (i.e. axis = 1):
`new = mean(old,axis=1)`

I get, wait for it, a 1-d array! Again, I run into the same problem as above when trying to stack this "new" array with another array. I can force the array into 2-d via:
`new = atleast_2d( mean(old,axis=1) ) # Transposes (not really a transpose since 1-d) from shape of [r,] to [1,r]`
`new = new.T # Fixes transpose issue`

Again, this can be handled, but it is inelegant. I can't understand WHY numpy would default everything to 1-d, as this is super annoying.

Question 2: Is there any way to preserve the two dimensions of my array after applying the mean, without invoking atleast_2d and .T?
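
A hedged sketch for Question 2: `mean()` accepts a `keepdims` argument (I believe from NumPy 1.7 on, but check your version) that leaves the reduced axis in place with length 1. Indexing with `np.newaxis` is an alternative that does not depend on `keepdims`. The array `old` here is made-up example data:

```python
import numpy as np

# Made-up example data standing in for `old`: 3 rows, 4 columns.
old = np.arange(12, dtype=float).reshape(3, 4)

flat = np.mean(old, axis=1)                 # 1-d result, shape (3,)
kept = np.mean(old, axis=1, keepdims=True)  # stays 2-d, shape (3, 1)
also = old.mean(axis=1)[:, np.newaxis]      # same shape via np.newaxis

# The 2-d result stacks next to the original array without further fixes.
stacked = np.hstack((kept, old))

print(flat.shape, kept.shape, also.shape, stacked.shape)
```

Either form replaces the `atleast_2d` plus `.T` pair with a single step.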
Python: 2.7 via Anaconda
Numpy: 1.7
Pandas: 0.11
OS: Windows 7
IDE: Spyder/IPython

tnknepp

Posts: 153
Joined: Mon Mar 11, 2013 7:41 pm

Re: Output array dimensions with numpy.mean()

Two-dimensional arrays are not automatically matrices in numpy. You can get a matrix from an array using
`In [1]: import numpy as np`
`In [2]: arr = np.array([[1,2], [3,4], [5,6]])`
`In [3]: m = np.asmatrix(arr)`

I believe this should give you the sort of behaviors you're looking for (though I'm not familiar with MatLab):
`In [4]: m[:,0]`
`Out[4]:`
`matrix([[1],`
`        [3],`
`        [5]])`
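
To connect this back to the `mean()` question, here is a small sketch using the same `arr` and `m` as above. The names `row_means` and `joined` are just illustrative. With a matrix, both the column slice and the mean stay two-dimensional, so they can be stacked directly:

```python
import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6]])
m = np.asmatrix(arr)

# Reductions on a matrix keep two dimensions: this has shape (3, 1).
row_means = m.mean(axis=1)

# The column slice is also (3, 1), so hstack works without reshaping.
joined = np.hstack((m[:, 0], row_means))

print(row_means.shape, joined.shape)
```

Keep in mind that `*` on matrices means matrix multiplication rather than element-wise multiplication, which is one reason people often stick with plain arrays.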

You might also want to look at the pandas library (it's implemented on top of numpy). Again, I'm not familiar with MatLab, so I don't know how it compares, but it's pretty similar to R.
setrofim

Posts: 288
Joined: Mon Mar 04, 2013 7:52 pm

Re: Output array dimensions with numpy.mean()

Thanks for the tip. I was trying to avoid numpy matrices due to their limitations and limited use. However, they do work well when you are careful.

I'll have to check out Pandas. Thanks for the tip.

tnknepp


Re: Output array dimensions with numpy.mean()

The creator of pandas gave a presentation on it at the recent PyData conference. It's a very good introduction to the library and well worth checking out.
setrofim
