python - Using pandas to create a summary table


How can I use pandas to obtain a summary table from the data below?

    id      condition   confirmed
    d0119   bad         yes
    d0119               no
    d0117   bad         yes
    d0110   bad         undefined
    d1011   bad         yes
    d1011               yes
    d1001   bad         yes
    d1001   bad         yes

Required output:

    id      condition       confirmed   %bad
    d0119   bad,good        yes, no     50
    d0117   bad,yes                     100
    d0110   bad,undefined               0
    d1011   bad,good        yes, yes
    d1001   bad,bad         yes, yes    100

Can anyone help? Thanks.

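You can do it this way: add a boolean `bad` column, group by `id`, collect `condition` and `confirmed` into lists, and take the mean of `bad` (the fraction of bad rows per id):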

    In [123]: (df.assign(bad=df.condition=='bad')
         ...:    .groupby('id')
         ...:    .agg({'condition':pd.Series.tolist,
         ...:          'confirmed':pd.Series.tolist,
         ...:          'bad':'mean'})
         ...: )
         ...:
    Out[123]:
           bad    condition    confirmed
    id
    d0110  1.0        [bad]  [undefined]
    d0117  1.0        [bad]        [yes]
    d0119  0.5  [bad, good]    [yes, no]
    d1001  1.0   [bad, bad]   [yes, yes]
    d1011  0.5  [bad, good]   [yes, yes]
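If you want comma-joined strings and a percentage, closer to the layout in the question, the same idea can be sketched with named aggregation. This is only a sketch under assumptions: the sample frame is rebuilt by hand from the question, the blank conditions are filled with `good` (as in the required output), and the column name `pct_bad` is my own choice.

```python
import pandas as pd

# Sample data rebuilt from the question. Assumption: the blank conditions
# are 'good', as the required output suggests.
df = pd.DataFrame({
    "id": ["d0119", "d0119", "d0117", "d0110", "d1011", "d1011", "d1001", "d1001"],
    "condition": ["bad", "good", "bad", "bad", "bad", "good", "bad", "bad"],
    "confirmed": ["yes", "no", "yes", "undefined", "yes", "yes", "yes", "yes"],
})

summary = (
    df.assign(bad=df["condition"].eq("bad"))   # True where the row is 'bad'
      .groupby("id", sort=False)               # keep ids in order of first appearance
      .agg(
          condition=("condition", ",".join),   # join the values instead of building lists
          confirmed=("confirmed", ", ".join),
          pct_bad=("bad", lambda s: s.mean() * 100),  # hypothetical name for the %bad column
      )
      .reset_index()
)

print(summary)
```

Note that this counts a row as bad purely from `condition`, exactly as the answer above does, so d0110 comes out as 100 rather than the 0 shown in the question.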

Vertical variant (keeps the original rows and adds the percentage of bad rows per id as a new column):

    In [113]: df
    Out[113]:
          id condition  confirmed
    0  d0119       bad        yes
    1  d0119                   no
    2  d0117       bad        yes
    3  d0110       bad  undefined
    4  d1011       bad        yes
    5  d1011                  yes
    6  d1001       bad        yes
    7  d1001       bad        yes

    In [114]: g = df.assign(bad=df.condition=='bad').groupby('id')

    In [115]: df['bad'] = df['id'].map((g.sum().div(g.size(), 0)*100).bad)

    In [116]: df
    Out[116]:
          id condition  confirmed    bad
    0  d0119       bad        yes   50.0
    1  d0119                   no   50.0
    2  d0117       bad        yes  100.0
    3  d0110       bad  undefined  100.0
    4  d1011       bad        yes   50.0
    5  d1011                  yes   50.0
    6  d1001       bad        yes  100.0
    7  d1001       bad        yes  100.0
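A possible alternative for the vertical variant is `transform`, which broadcasts the per-group mean back onto every row without the intermediate `map`. This is a sketch, not the method used above: the frame is rebuilt by hand from the `In [113]` output, with the blank conditions kept as empty strings. Only the boolean series is grouped, so the string columns are never aggregated.

```python
import pandas as pd

# Frame rebuilt from the In [113] output; blank conditions are empty strings.
df = pd.DataFrame({
    "id": ["d0119", "d0119", "d0117", "d0110", "d1011", "d1011", "d1001", "d1001"],
    "condition": ["bad", "", "bad", "bad", "bad", "", "bad", "bad"],
    "confirmed": ["yes", "no", "yes", "undefined", "yes", "yes", "yes", "yes"],
})

# Boolean flag per row, grouped by id; transform('mean') returns one value
# per original row (the group's fraction of bad rows), scaled to a percentage.
is_bad = df["condition"].eq("bad")
df["bad"] = is_bad.groupby(df["id"]).transform("mean") * 100

print(df)
```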
