r - Dynamic and conditional inserting of new rows that come after a certain Date -
following data frames like(thanks nice edits experienced community):
library(data.table) df <- fread('account date blue red amount 1/1/2016 1 0 100 2/1/2016 1 1 200 b 1/10/2016 0 1 300 b 2/10/2016 1 1 400') df[, date := as.date(date, format="%m/%d/%y")] blue <- fread('date amount 6/1/2015 55 1/31/2016 55 2/28/2016 65 3/31/2016 75') blue[, date := as.date(date, format="%m/%d/%y")] red <- fread('date amount 12/31/2015 43 1/15/2016 47 2/15/2016 67 3/15/2016 77') red[, date := as.date(date, format="%m/%d/%y")]
in primary dataframe df, blue , red fields depict category account belongs @ given point in time.for example, of 1/1/2016, account belong blue category. blue , red dataframes depict @ dates cash given out accounts in blue , red category. want insert new rows in original df rows come after date field in df blue , red dataframes based on whether account belongs blue or red or both.
the output looking looks this:
account date blue red amount 1/1/2016 1 0 100 1/31/2016 1 0 55 2/1/2016 1 1 200 2/15/2016 1 1 67 2/28/2016 1 1 65 3/15/2016 1 1 77 3/31/2016 1 1 75 b ..............................
in output, of 1/1/2016 account a belongs blue category. goal find date after 1/1/2016 in blue table 1/31/2016 , insert it. don't want insert 1/15/2016 red table because account not category red of 1/1/2016. ok blue , red fields showing na inserted fields.
my thought trying rbind(df, blue, red), by="account")
dont know how incorporate conditions of inserting later dates based on category account belongs @ given point in time.
a possible approach:
# combine 'blue' & 'red' 1 , create 'colcat' column on fly br <- rbindlist(list(blue, red), idcol = 'colcat')[, colcat := c('blue','red')[colcat]] # loop on rows of 'df', select needed rows 'bluered' # , punt result list brlist <- lapply(df$date, function(x) br[date > x][order(date)]) # loop on rows, select needed rows 'bluered' & bind them lst <- lapply(1:nrow(df), function(i) { idx <- c('blue','red')[c(c(1)[!!df[i][['blue']]], c(2)[!!df[i][['red']]])] incs <- brlist[[i]][colcat %in% idx][, .sd[1], colcat][, .(account = df$account[i], date, blue = df$blue[i], red = df$red[i], amount)] rbind(df[i],incs) }) # bind resulting list 1 'data.table' again dt <- rbindlist(lst)
which gives:
> dt account date blue red amount 1: 2016-01-01 1 0 100 2: 2016-01-31 1 0 55 3: 2016-02-01 1 1 200 4: 2016-02-15 1 1 67 5: 2016-02-28 1 1 65 6: b 2016-01-10 0 1 300 7: b 2016-01-15 0 1 47 8: b 2016-02-10 1 1 400 9: b 2016-02-15 1 1 67 10: b 2016-02-28 1 1 65
Comments
Post a Comment