python - Using pandas to create a summary table
How can I use pandas to obtain a summary table from the data below?
    id     condition  confirmed
    d0119  bad        yes
    d0119  good       no
    d0117  bad        yes
    d0110  bad        undefined
    d1011  bad        yes
    d1011  good       yes
    d1001  bad        yes
    d1001  bad        yes
Required output:

    id     condition  confirmed  %bad
    d0119  bad,good   yes, no    50
    d0117  bad        yes        100
    d0110  bad        undefined  0
    d1011  bad,good   yes, yes
    d1001  bad,bad    yes, yes   100
Can anyone help? Thanks.
You can do it this way:
    In [123]: (df.assign(bad=df.condition=='bad')
         ...:    .groupby('id')
         ...:    .agg({'condition':pd.Series.tolist,
         ...:          'confirmed':pd.Series.tolist,
         ...:          'bad':'mean'})
         ...: )
         ...:
    Out[123]:
           bad    condition    confirmed
    id
    d0110  1.0        [bad]  [undefined]
    d0117  1.0        [bad]        [yes]
    d0119  0.5  [bad, good]    [yes, no]
    d1001  1.0   [bad, bad]   [yes, yes]
    d1011  0.5  [bad, good]   [yes, yes]
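For reference, here is a self-contained sketch of the same idea that joins the grouped values into comma-separated strings, closer to the layout requested in the question. It uses named aggregation, which needs pandas 0.25 or later. The DataFrame construction, including the 'good' values, is an assumption reconstructed from the output above, and the column name pct_bad is a hypothetical choice.

```python
import pandas as pd

# Sample data reconstructed from the question (the 'good' values are assumed).
df = pd.DataFrame({
    "id": ["d0119", "d0119", "d0117", "d0110",
           "d1011", "d1011", "d1001", "d1001"],
    "condition": ["bad", "good", "bad", "bad",
                  "bad", "good", "bad", "bad"],
    "confirmed": ["yes", "no", "yes", "undefined",
                  "yes", "yes", "yes", "yes"],
})

summary = (
    df.assign(bad=df["condition"].eq("bad"))   # boolean flag per row
      .groupby("id")
      .agg(
          condition=("condition", ",".join),   # e.g. "bad,good"
          confirmed=("confirmed", ", ".join),  # e.g. "yes, no"
          pct_bad=("bad", "mean"),             # fraction of 'bad' rows
      )
)
summary["pct_bad"] = summary["pct_bad"] * 100  # convert fraction to percent
print(summary)
```

Note that this measures the share of rows whose condition is 'bad', so d0110 comes out as 100, as in the output above, rather than the 0 shown in the question.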
Vertical variant:
    In [113]: df
    Out[113]:
          id condition  confirmed
    0  d0119       bad        yes
    1  d0119      good         no
    2  d0117       bad        yes
    3  d0110       bad  undefined
    4  d1011       bad        yes
    5  d1011      good        yes
    6  d1001       bad        yes
    7  d1001       bad        yes

    In [114]: g = df.assign(bad=df.condition=='bad').groupby('id')

    In [115]: df['bad'] = df['id'].map((g.sum().div(g.size(), 0)*100).bad)

    In [116]: df
    Out[116]:
          id condition  confirmed    bad
    0  d0119       bad        yes   50.0
    1  d0119      good         no   50.0
    2  d0117       bad        yes  100.0
    3  d0110       bad  undefined  100.0
    4  d1011       bad        yes   50.0
    5  d1011      good        yes   50.0
    6  d1001       bad        yes  100.0
    7  d1001       bad        yes  100.0
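On recent pandas versions, calling sum() on the whole group may also try to aggregate the string columns, so a variant that works only on the boolean flag can be more robust. The following is a minimal sketch of that alternative using groupby(...).transform, not the answer's original code. The data, including the 'good' values, is again an assumption reconstructed from the output above.

```python
import pandas as pd

# Sample data reconstructed from the question (the 'good' values are assumed).
df = pd.DataFrame({
    "id": ["d0119", "d0119", "d0117", "d0110",
           "d1011", "d1011", "d1001", "d1001"],
    "condition": ["bad", "good", "bad", "bad",
                  "bad", "good", "bad", "bad"],
    "confirmed": ["yes", "no", "yes", "undefined",
                  "yes", "yes", "yes", "yes"],
})

# 1 where the condition is 'bad', 0 otherwise.
is_bad = df["condition"].eq("bad").astype(int)

# transform('mean') broadcasts each group's mean back onto its rows,
# so the result lines up with the original index.
df["bad"] = is_bad.groupby(df["id"]).transform("mean") * 100
print(df)
```

Because transform returns a result aligned with the original rows, no separate map step is needed.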