R – useful tips

Given a data frame my.df, replace every value that contains Blah or blurb with NA, using dplyr:

# funs() is deprecated since dplyr 0.8.0; a formula lambda does the same job
mutate_all(my.df, ~ ifelse(grepl('Blah|blurb', .), NA, .))

Given a data frame my.df with columns A, B, C, D, E, swap A and B and keep the rest in their original order:

my.df %>% select(B,A,everything())

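Reshape wide data to long (many columns into fewer) with gather(), and long data to wide (fewer columns into more) with spread(), both from library(tidyr). In tidyr 1.0.0 and later, pivot_longer() and pivot_wider() supersede them.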
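
A minimal sketch of the round trip, assuming a small made-up table (the data frame wide and its columns id, y2015 and y2016 are illustrative, not from the original notes):

```r
library(tidyr)

# Hypothetical wide table: one row per id, one column per year
wide <- data.frame(id = c(1, 2), y2015 = c(10, 20), y2016 = c(11, 21))

# Wide to long: fold the two year columns into a key/value pair
long <- gather(wide, key = "year", value = "value", y2015, y2016)

# Long to wide: spread the key column back out into separate columns
wide.again <- spread(long, key = year, value = value)
```

Here long has one row per id and year, and wide.again has the same columns as wide.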

Explode the rows of a data frame that hold list-columns with unnest() from tidyr.
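
As a rough illustration, assuming a made-up data frame whose tags column holds comma-separated strings (the names my.df, id, tags and exploded are hypothetical):

```r
library(dplyr)
library(tidyr)

# Hypothetical data: each row holds a comma-separated string of tags
my.df <- data.frame(id = c(1, 2), tags = c("a,b,c", "d"),
                    stringsAsFactors = FALSE)

exploded <- my.df %>%
  mutate(tags = strsplit(tags, ",")) %>%  # each string becomes a list element
  unnest(tags)                            # one output row per list element
```

The id value is repeated for every tag that came from the same original row.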

How to turn intervals into rows of sequences:

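library(dplyr)
library(tidyr)

set.seed(0)  # make the random sample reproducible

# Build 20 random things, each with a random [from, to] interval,
# then expand every interval into one row per month.
thingSet <- data.frame(thing = sample(1:1000,20)) %>% rowwise() %>%
  mutate(date1 = sample(seq(as.Date('2014/01/01'), as.Date('2017/01/01'), by="day"), 1),
         date2 = sample(seq(as.Date('2014/01/01'), as.Date('2017/01/01'), by="day"), 1)) %>%
  # put the two dates in order; if_else() keeps them as Date
  mutate(from = if_else(date1 >  date2, date2, date1),
         to   = if_else(date2 >= date1, date2, date1)) %>%
  # snap both ends to the 15th so seq.Date() steps over whole months
  mutate(from = as.Date(format(from,"%Y-%m-15")),
         to = as.Date(format(to,"%Y-%m-15"))) %>%
  rowwise() %>%
  # one output row per month in each interval
  do(data.frame(name = .$thing, month = seq.Date(.$from, .$to, by='month'))) %>% as.data.frame()

# Count how many intervals cover each month
freq <- thingSet %>% group_by(month) %>% summarise(n = n())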

Use if_else() rather than base ifelse() to preserve dates in dplyr: ifelse() strips the Date class and returns plain numbers.
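
A small sketch of the difference, using two made-up date vectors (d1, d2 and the result names are illustrative only):

```r
library(dplyr)

d1 <- as.Date(c("2016-03-01", "2015-07-10"))
d2 <- as.Date(c("2015-01-20", "2016-11-05"))

# Base ifelse() drops the Date class and returns the underlying day counts
earliest.base <- ifelse(d1 > d2, d2, d1)
class(earliest.base)   # "numeric"

# dplyr's if_else() keeps the class of its inputs
earliest <- if_else(d1 > d2, d2, d1)
class(earliest)        # "Date"
```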

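Drop the columns of a data frame df whose names start with X:

# base R: anchor the pattern with ^ so only names that start with X match;
# a logical index also keeps every column when nothing matches
test <- df[, !grepl("^X", colnames(df)), drop = FALSE]
# dplyr version (starts_with() ignores case by default)
test <- df %>% select(-starts_with("X"))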