python - How to take column-slices of dataframe in pandas

ID : 10170

viewed : 53

Tags : python, pandas, numpy, dataframe, slice

Top 5 Answers for python - How to take column-slices of dataframe in pandas

votes : 90

2017 Answer - pandas 0.20: .ix is deprecated. Use .loc

See the deprecation in the docs

.loc uses label-based indexing to select both rows and columns. The labels are the values of the index (for rows) or of the columns. Unlike positional slicing, slicing with .loc includes the last element.

Let's assume we have a DataFrame with the following columns:
foo, bar, quz, ant, cat, sat, dat.

    # selects all rows and all columns beginning at 'foo' up to and including 'sat'
    df.loc[:, 'foo':'sat']
    # foo bar quz ant cat sat
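To make the inclusive endpoint concrete, here is a minimal sketch that compares a label slice with a positional slice. The one-row frame and its values are assumptions made up for illustration; only the column names come from the answer.

```python
import pandas as pd

# Hypothetical one-row frame with the seven column names used above.
cols = ['foo', 'bar', 'quz', 'ant', 'cat', 'sat', 'dat']
df = pd.DataFrame([[1, 2, 3, 4, 5, 6, 7]], columns=cols)

# Label slice with .loc: the stop label 'sat' is included.
by_label = df.loc[:, 'foo':'sat']

# Positional slice with .iloc: the stop position 5 is excluded, like a Python list.
by_position = df.iloc[:, 0:5]

print(list(by_label.columns))     # ['foo', 'bar', 'quz', 'ant', 'cat', 'sat']
print(list(by_position.columns))  # ['foo', 'bar', 'quz', 'ant', 'cat']
```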

.loc accepts the same slice notation that Python lists do, for both rows and columns. The slice notation is start:stop:step.

    # slice from 'foo' to 'cat' by every 2nd column
    df.loc[:, 'foo':'cat':2]
    # foo quz cat

    # slice from the beginning to 'bar'
    df.loc[:, :'bar']
    # foo bar

    # slice from 'quz' to the end by 3
    df.loc[:, 'quz'::3]
    # quz sat

    # attempt from 'sat' to 'bar'
    df.loc[:, 'sat':'bar']
    # no columns returned

    # slice from 'sat' to 'bar' in reverse order
    df.loc[:, 'sat':'bar':-1]
    # sat cat ant quz bar

    # slice notation is syntactic sugar for the slice function
    # slice from 'quz' to the end by 2 with slice function
    df.loc[:, slice('quz', None, 2)]
    # quz cat dat

    # select specific columns with a list
    # select columns foo, bar and dat
    df.loc[:, ['foo', 'bar', 'dat']]
    # foo bar dat
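The slice variants above can be checked against each other. The following is a small sketch, assuming a throwaway one-row frame with the same column names. It also shows one way to get the same result positionally, using `Index.get_loc` to translate a label into a position.

```python
import pandas as pd

# Hypothetical frame; only the column names matter here.
cols = ['foo', 'bar', 'quz', 'ant', 'cat', 'sat', 'dat']
df = pd.DataFrame([list(range(7))], columns=cols)

# Slice notation and an explicit slice object select the same columns.
with_notation = df.loc[:, 'quz'::2]
with_slice_obj = df.loc[:, slice('quz', None, 2)]

# Positional equivalent: look up the position of 'quz', then slice with .iloc.
start = df.columns.get_loc('quz')
with_iloc = df.iloc[:, start::2]

# A negative step walks the columns backwards, still including both end labels.
reversed_cols = df.loc[:, 'sat':'bar':-1]

print(list(with_notation.columns))  # ['quz', 'cat', 'dat']
print(list(reversed_cols.columns))  # ['sat', 'cat', 'ant', 'quz', 'bar']
```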

You can slice by rows and columns at the same time. For instance, if you have 5 rows with labels v, w, x, y, z:

    # slice from 'w' to 'y' and 'foo' to 'ant' by 3
    df.loc['w':'y', 'foo':'ant':3]
    #    foo ant
    # w
    # x
    # y
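For a version of the row-and-column slice that can be run end to end, here is a sketch with an assumed 5x7 frame. The integer values from `np.arange` are invented purely so the selected cells are easy to identify.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row labels v..z, the seven column names from above,
# filled with the integers 0..34 so each cell is recognizable.
rows = ['v', 'w', 'x', 'y', 'z']
cols = ['foo', 'bar', 'quz', 'ant', 'cat', 'sat', 'dat']
df = pd.DataFrame(np.arange(35).reshape(5, 7), index=rows, columns=cols)

# Rows 'w' through 'y' (inclusive), every 3rd column from 'foo' through 'ant'.
sub = df.loc['w':'y', 'foo':'ant':3]
print(sub)
#    foo  ant
# w    7   10
# x   14   17
# y   21   24
```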

votes : 80

Note: .ix has been deprecated since Pandas v0.20. You should instead use .loc or .iloc, as appropriate.

DataFrame.ix is the indexer you want here. It's a little confusing (I agree that pandas indexing is perplexing at times!), but the following seems to do what you want:

    >>> import numpy as np
    >>> from pandas import DataFrame
    >>> df = DataFrame(np.random.rand(4,5), columns = list('abcde'))
    >>> df.ix[:,'b':]
              b         c         d         e
    0  0.418762  0.042369  0.869203  0.972314
    1  0.991058  0.510228  0.594784  0.534366
    2  0.407472  0.259811  0.396664  0.894202
    3  0.726168  0.139531  0.324932  0.906575

where .ix[row slice, column slice] is what is being interpreted. More on Pandas indexing here: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-advanced
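Since .ix is deprecated (and, as far as I know, removed entirely in pandas 1.0), the same selection can be sketched with .loc or .iloc. This is an illustrative rewrite of the example above, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same shape of data as the .ix example: 4 rows, columns a..e.
df = pd.DataFrame(np.random.rand(4, 5), columns=list('abcde'))

# Label-based: all rows, columns from 'b' to the end.
from_b_by_label = df.loc[:, 'b':]

# Position-based: all rows, columns from position 1 to the end.
from_b_by_position = df.iloc[:, 1:]

print(list(from_b_by_label.columns))  # ['b', 'c', 'd', 'e']
```

Both forms return the same four columns here; .loc is usually the closer replacement when the original .ix call used labels.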

votes : 71

Let's use the titanic dataset from the seaborn package as an example:

    # Load dataset (pip install seaborn)
    # Note: seaborn.apionly was removed from recent seaborn releases; import seaborn directly
    >> import seaborn as sns
    >> titanic = sns.load_dataset('titanic')

using the column names

>> titanic.loc[:,['sex','age','fare']] 

using the column indices

>> titanic.iloc[:,[2,3,6]] 

using ix (pandas versions before 0.20)

>> titanic.ix[:,['sex','age','fare']] 

or

>> titanic.ix[:,[2,3,6]] 

using the reindex method

>> titanic.reindex(columns=['sex','age','fare']) 
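One practical difference between reindex and .loc shows up when a requested column does not exist. The sketch below uses a tiny made-up frame rather than the titanic data (so it runs without a download) and assumes pandas 1.0 or later, where .loc raises a KeyError for missing labels.

```python
import pandas as pd

# Hypothetical stand-in for a few titanic columns; the values are invented.
people = pd.DataFrame({
    'sex': ['male', 'female'],
    'age': [22.0, 38.0],
    'fare': [7.25, 71.28],
})

wanted = ['sex', 'age', 'deck']  # 'deck' is not a column of this small frame

# reindex tolerates the missing label and fills the new column with NaN.
reindexed = people.reindex(columns=wanted)

# .loc with a list refuses missing labels (assumption: pandas >= 1.0).
try:
    people.loc[:, wanted]
    loc_raised = False
except KeyError:
    loc_raised = True

print(reindexed)
print(loc_raised)
```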

votes : 60

Also, given a DataFrame data as in your example, if you would like to extract only columns a and d (i.e. the 1st and the 4th columns), the iloc method of the pandas DataFrame is what you need and can be used very effectively. All you need to know is the positions of the columns you would like to extract. For example:

>>> data.iloc[:,[0,3]] 

will give you

              a         d
    0  0.883283  0.100975
    1  0.614313  0.221731
    2  0.438963  0.224361
    3  0.466078  0.703347
    4  0.955285  0.114033
    5  0.268443  0.416996
    6  0.613241  0.327548
    7  0.370784  0.359159
    8  0.692708  0.659410
    9  0.806624  0.875476
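If you know the column names but not their positions, the positions can be looked up first. This is a minimal sketch using `Index.get_indexer`; the frame is a random stand-in for the `data` frame in the question.

```python
import numpy as np
import pandas as pd

# Random stand-in for the question's frame: 10 rows, columns a..e.
data = pd.DataFrame(np.random.rand(10, 5), columns=list('abcde'))

# Translate column names into integer positions, then select by position.
positions = data.columns.get_indexer(['a', 'd'])
subset = data.iloc[:, positions]

print(list(positions))       # [0, 3]
print(list(subset.columns))  # ['a', 'd']
```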

votes : 51

You can slice along the columns of a DataFrame by referring to the names of each column in a list, like so:

    import numpy as np
    import pandas

    data = pandas.DataFrame(np.random.rand(10,5), columns = list('abcde'))
    data_ab = data[list('ab')]    # columns a and b
    data_cde = data[list('cde')]  # columns c, d and e
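For contiguous columns, selecting by a list of names gives the same result as a label slice or as dropping the other columns. A short sketch, assuming the same kind of random frame:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.random.rand(10, 5), columns=list('abcde'))

# Three ways to get columns c, d and e.
by_list = data[list('cde')]              # explicit list of names
by_slice = data.loc[:, 'c':'e']          # inclusive label slice
by_drop = data.drop(columns=list('ab'))  # remove the columns you do not want

print(by_list.equals(by_slice) and by_list.equals(by_drop))  # True
```

The list form is the only one of the three that also works for non-adjacent columns in an arbitrary order.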
