python - Subtract first row from all rows in Pandas DataFrame
I have a pandas DataFrame:
    a = pd.DataFrame(rand(5,6)*10, index=pd.DatetimeIndex(start='2005', periods=5, freq='A'))
    a.columns = pd.MultiIndex.from_product([('a','b'),('a','b','c')])
I want to subtract the row a['2005'] from a. I've tried this:
    In [22]: a - a.ix['2005']
    Out[22]:
                  a           b
                  a   b   c   a   b   c
    2005-12-31    0   0   0   0   0   0
    2006-12-31  NaN NaN NaN NaN NaN NaN
    2007-12-31  NaN NaN NaN NaN NaN NaN
    2008-12-31  NaN NaN NaN NaN NaN NaN
    2009-12-31  NaN NaN NaN NaN NaN NaN
which doesn't work, because pandas lines up the index while doing the operation. This works:
    In [24]: pd.DataFrame(a.values - a['2005'].values, index=a.index, columns=a.columns)
    Out[24]:
                       a                             b
                       a         b         c         a         b         c
    2005-12-31  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000
    2006-12-31 -3.326761 -7.164628  8.188518 -0.863177  0.519587 -3.281982
    2007-12-31  3.529531 -4.719756  8.444488  1.355366  7.468361 -4.023797
    2008-12-31  3.139185 -8.420257  1.465101 -2.942519  1.219060 -5.146019
    2009-12-31 -3.459710  0.519435 -1.049617 -2.779370  4.792227 -1.922461
But I don't want to have to form a new DataFrame every time I do this kind of operation. I've tried the apply() method like this: a.apply(lambda x: x-a['2005'].values), but I get:
    ValueError: cannot copy sequence with size 6 to array axis with dimension 5
I'm not sure how to proceed. Is there a simple way that I'm not seeing? I think there should be an easy way to do this in place, so I don't have to build a new DataFrame each time. I also tried the sub() method, but the subtraction is only applied to the first row, whereas I want to subtract the first row from each row in the DataFrame.
Pandas is great at aligning by index. So when you want pandas to ignore the index, you need to drop the index. You can do that by converting the DataFrame a.loc['2005'] to a 1-dimensional NumPy array:
    In [56]: a - a.loc['2005'].values.squeeze()
    Out[56]:
                       a                             b
                       a         b         c         a         b         c
    2005-12-31  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000
    2006-12-31  0.325968  1.314776 -0.789328 -0.344669 -2.518857  7.361711
    2007-12-31  0.084203  2.234445 -2.838454 -6.176795 -3.645513  8.955443
    2008-12-31  3.798700  0.299529  1.303325 -2.770126 -1.284188  3.093806
    2009-12-31  1.520930  2.660040  0.846996 -9.437851 -2.886603  6.705391
The squeeze method converts the NumPy array a.loc['2005'].values, of shape (1, 6), to an array of shape (6,). This allows the array to be broadcast (during the subtraction) as desired.
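For anyone who wants to try this out, here is a minimal sketch of the alignment problem and the squeeze fix. It makes a few assumptions that are not in the original post: the index is built with pd.to_datetime instead of the pd.DatetimeIndex(start=...) call from the question (which newer pandas versions no longer accept), the random data comes from a seeded generator, and the names aligned, first_row, result and result_alt are just illustrative. The last line shows a possible alternative that is not part of the answer above.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's data (assumed): 5 year-end dates, 6 columns.
index = pd.to_datetime(
    ["2005-12-31", "2006-12-31", "2007-12-31", "2008-12-31", "2009-12-31"]
)
columns = pd.MultiIndex.from_product([("a", "b"), ("a", "b", "c")])
rng = np.random.default_rng(0)
a = pd.DataFrame(rng.random((5, 6)) * 10, index=index, columns=columns)

# DataFrame minus a one-row DataFrame aligns on the row labels, so every
# row except 2005 has no partner and comes out as NaN.
aligned = a - a.loc["2005"]

# Drop the labels: .values gives a (1, 6) array, squeeze() makes it (6,).
first_row = a.loc["2005"].values.squeeze()

# The (6,) array is broadcast against each row of the (5, 6) frame.
result = a - first_row

# Alternative (not from the answer above): a Series is matched against
# the column labels and then broadcast down the rows.
result_alt = a - a.iloc[0]
```

Both result and result_alt keep the index and columns of a, so there is no need to rebuild the DataFrame by hand. Note that either form still returns a new object rather than modifying a in place.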
python numpy pandas dataframes