pandas - Using scalar values in series as variables in user defined function
I want to define a function that is applied element-wise to each row of a DataFrame, comparing each element to a scalar value held in a separate Series. I started with the function below.
    def greater_than(array, value):
        g = array[array >= value].count(axis=1)
        return g
But it is applying the mask along axis 0, and I need it to be applied along axis 1. What can I do?
E.g.
    In [3]: df = pd.DataFrame(np.arange(16).reshape(4,4))

    In [4]: df
    Out[4]:
        0   1   2   3
    0   0   1   2   3
    1   4   5   6   7
    2   8   9  10  11
    3  12  13  14  15

    In [26]: s
    Out[26]: array([   1, 1000, 1000, 1000])

    In [25]: greater_than(df,s)
    Out[25]:
    0    0
    1    1
    2    1
    3    1
    dtype: int64

    In [27]: g = df[df >= s]

    In [28]: g
    Out[28]:
          0   1   2   3
    0   NaN NaN NaN NaN
    1   4.0 NaN NaN NaN
    2   8.0 NaN NaN NaN
    3  12.0 NaN NaN NaN
The result should look like:
    In [29]: greater_than(df,s)
    Out[29]:
    0    3
    1    0
    2    0
    3    0
    dtype: int64
as 1, 2 and 3 in the first row are all >= 1, and none of the remaining values are greater than or equal to 1000.
Your best bet may be to do some transposes (no copies are made, if that's a concern):
    In [164]: df = pd.DataFrame(np.arange(16).reshape(4,4))

    In [165]: s = np.array([ 1, 1000, 1000, 1000])

    In [171]: df.T[(df.T>=s)].T
    Out[171]:
        0    1    2    3
    0 NaN  1.0  2.0  3.0
    1 NaN  NaN  NaN  NaN
    2 NaN  NaN  NaN  NaN
    3 NaN  NaN  NaN  NaN

    In [172]: df.T[(df.T>=s)].T.count(axis=1)
    Out[172]:
    0    3
    1    0
    2    0
    3    0
    dtype: int64
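As far as I understand it, the transpose is needed because comparing a DataFrame with a 1-D NumPy array lines the array up with the columns, so `s[j]` gets compared with column `j`. A small sketch of that behaviour, reusing `df` and `s` from above (the names `by_column` and `by_row` are only for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape(4, 4))
s = np.array([1, 1000, 1000, 1000])

# Plain comparison: s[j] is matched with column j of df.
by_column = df >= s

# After transposing, the original rows become the columns, so s[i] is
# matched with original row i. Transposing back restores the layout.
by_row = (df.T >= s).T

# Per-row counts of True values for each variant.
print(by_column.sum(axis=1).tolist())  # [0, 1, 1, 1]
print(by_row.sum(axis=1).tolist())     # [3, 0, 0, 0]
```

The first list is the unwanted result from the question and the second is the one being asked for.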
You can sum the mask directly too, if the count is all you're after.
    In [173]: (df.T>=s).sum(axis=0)
    Out[173]:
    0    3
    1    0
    2    0
    3    0
    dtype: int64
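If you'd rather avoid the transposes, a possible alternative (not part of the answer above) is `DataFrame.ge`, which takes an `axis` argument saying which axis the other operand should be matched against. Here is a minimal sketch of `greater_than` written that way. It assumes there is one threshold per row, in row order, and the `frame`, `values` and `thresholds` names are my own:

```python
import numpy as np
import pandas as pd


def greater_than(frame, values):
    """Count, per row, the entries that are >= that row's threshold."""
    # Assumption: `values` holds one threshold per row, in row order.
    # Wrapping it in a Series that shares the frame's index pairs
    # values[i] with row i.
    thresholds = pd.Series(np.asarray(values), index=frame.index)
    # axis=0 makes ge() match the Series against the row index
    # instead of the column labels.
    mask = frame.ge(thresholds, axis=0)
    # Summing a boolean mask across each row counts the True entries.
    return mask.sum(axis=1)


df = pd.DataFrame(np.arange(16).reshape(4, 4))
s = np.array([1, 1000, 1000, 1000])

result = greater_than(df, s)
print(result)
```

For the example data this should give the same counts as the transposed version, i.e. 3 for the first row and 0 for the others.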