python - Get average counts per minute by hour
I have a dataframe with a timestamp index and a column of labels:
import pandas as pd
from datetime import datetime

df = pd.DataFrame({'time': [
    datetime(2015,11,2,4,41,10), datetime(2015,11,2,4,41,39), datetime(2015,11,2,4,41,47), datetime(2015,11,2,4,41,59),
    datetime(2015,11,2,4,42,4), datetime(2015,11,2,4,42,11), datetime(2015,11,2,4,42,15), datetime(2015,11,2,4,42,30),
    datetime(2015,11,2,4,42,39), datetime(2015,11,2,4,42,41), datetime(2015,11,2,5,2,9), datetime(2015,11,2,5,2,10),
    datetime(2015,11,2,5,2,16), datetime(2015,11,2,5,2,29), datetime(2015,11,2,5,2,51), datetime(2015,11,2,5,9,1),
    datetime(2015,11,2,5,9,21), datetime(2015,11,2,5,9,31), datetime(2015,11,2,5,9,40), datetime(2015,11,2,5,9,55)],
    'label': [2,0,0,0,1,0,0,1,1,1,1,3,0,0,3,0,1,0,1,1]}).set_index(['time'])
I want the average number of times each label appears per distinct minute within each distinct hour.
For example, label 0 appears 3 times in hour 4 in minute 41 and 2 times in hour 4 in minute 42,
then 2 times in hour 5 in minute 2 and 2 times in hour 5 in minute 9. So the average count per minute in hour 4 is
(2+3)/2=2.5
and the average count per minute in hour 5 is
(2+2)/2=2
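The arithmetic for label 0 can be checked with a short sketch. It assumes the sample data from the question, rebuilt here compactly from (hour, minute, second) tuples; the names stamps, zero, per_minute and avg are only illustrative.

```python
from datetime import datetime

import pandas as pd

# Sample data from the question, as (hour, minute, second) on 2015-11-02.
stamps = [(4, 41, 10), (4, 41, 39), (4, 41, 47), (4, 41, 59), (4, 42, 4),
          (4, 42, 11), (4, 42, 15), (4, 42, 30), (4, 42, 39), (4, 42, 41),
          (5, 2, 9), (5, 2, 10), (5, 2, 16), (5, 2, 29), (5, 2, 51),
          (5, 9, 1), (5, 9, 21), (5, 9, 31), (5, 9, 40), (5, 9, 55)]
labels = [2, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 3, 0, 0, 3, 0, 1, 0, 1, 1]

df = pd.DataFrame({'time': [datetime(2015, 11, 2, h, m, s) for h, m, s in stamps],
                   'label': labels}).set_index('time')

# Keep only the rows for label 0 and count them per (hour, minute).
zero = df[df['label'] == 0]
hour = zero.index.hour.rename('hour')
minute = zero.index.minute.rename('minute')
per_minute = zero.groupby([hour, minute]).size()

# Average the per-minute counts within each hour.
avg = per_minute.groupby(level='hour').mean()
print(per_minute)
print(avg)
```

This shortcut only works here because label 0 occurs in every minute that has data. A label that is missing from some minute needs an explicit count of 0 for that minute before averaging.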
The output I'm looking for:
Hour 1
label  avg
0      2.5
1      2
2      0.5
3      0
Hour 2
label  avg
0      2
1      2
2      0
3      1
What I have so far:
df['hour'] = df.index.hour
hour_grp = df.groupby(['hour'], as_index=False)
Then I can do:
res = []
for key, value in hour_grp:
    res.append(value)
Then group by minute:
res[0].groupby(pd.Grouper(freq='1min'))['label'].value_counts()  # pd.TimeGrouper('1min') in older pandas
But I'm stuck, not to mention that this is not efficient.
Access the minute of the DatetimeIndex:
mn = df.index.minute
Access the hour of the DatetimeIndex:
hr = df.index.hour
Perform a groupby using the two variables obtained above as keys. Compute value_counts of the contents under label and unstack, filling missing values with 0. Finally, average across the index level containing the hour values:
df.groupby([mn,hr])['label'].value_counts().unstack(fill_value=0).mean(level=1)
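Recent pandas releases no longer accept level= in mean(), so the one-liner may need a small adjustment there. The following is a minimal sketch of the same idea using groupby(level=...).mean() instead. It assumes the sample data from the question (rebuilt compactly below), and the names stamps, counts and avg are illustrative only.

```python
from datetime import datetime

import pandas as pd

# Sample data from the question, as (hour, minute, second) on 2015-11-02.
stamps = [(4, 41, 10), (4, 41, 39), (4, 41, 47), (4, 41, 59), (4, 42, 4),
          (4, 42, 11), (4, 42, 15), (4, 42, 30), (4, 42, 39), (4, 42, 41),
          (5, 2, 9), (5, 2, 10), (5, 2, 16), (5, 2, 29), (5, 2, 51),
          (5, 9, 1), (5, 9, 21), (5, 9, 31), (5, 9, 40), (5, 9, 55)]
labels = [2, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 3, 0, 0, 3, 0, 1, 0, 1, 1]

df = pd.DataFrame({'time': [datetime(2015, 11, 2, h, m, s) for h, m, s in stamps],
                   'label': labels}).set_index('time')

# Name the grouping keys so the index levels can be referred to by name.
hr = df.index.hour.rename('hour')
mn = df.index.minute.rename('minute')

# One row per (hour, minute), one column per label; absent labels count as 0.
counts = (
    df.groupby([hr, mn])['label']
      .value_counts()
      .unstack(fill_value=0)
      .sort_index(axis=1)
)

# Average the per-minute counts within each hour.
avg = counts.groupby(level='hour').mean()
print(avg)
```

For label 0 this gives 2.5 in hour 4 and 2.0 in hour 5, matching the worked example in the question. Note that only minutes containing at least one row are averaged; a minute with no rows at all does not contribute a zero.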