R - String replacement using the sub function
I am attempting to extract the names of NBA players from a column in a database. However, the names in the names column have the following format:
    "lebron james\\jamesle01"
I used the following regular expression inside the sub function to attempt to keep only the name portion:
    sub("([a-z]\\w+\\s*-*'*[a-z]*\\s*\\.*|[a-z]\\.\\s*)\\*\\*[a-z]*\\d*\\d*", replacement = "\\1", x = nba_salaries$names)
The expression is meant to take into account unusual names that contain more than alphanumeric characters (e.g. michael kidd-gilchrist, de'andre jordan, luc mbah moute, etc.).
However, when I run the following,
    head(nba_salaries$names)
the names end up being in the same format.
I have used regexr.com to ensure the regex expression captures the strings properly.
How about this: you can split the text on the "\\" string, and then take the first element:
    text <- c( "lebron james\\jamesle01", "michael jordan\\jamesle01" )
    sapply( strsplit( text, "\\\\" ), "[", 1 )
which gives
    [1] "lebron james"   "michael jordan"
To explain: "[" is a function*, being called within sapply. You pass the result of strsplit as X in sapply, and then apply the [ function to it with the parameter 1 to take the 1st element. Here's another way to put it:
    text <- strsplit( text, "\\\\" )
This outputs a list, with each list element containing a vector whose first element is the text before the "\\" string and whose second element is the text after it. Then use the "[" function*, passing the parameter 1, to take the first element of each of those vectors:
    text <- sapply( X = text, FUN = "[", 1 )
Edited to add: using the magrittr pipe for things like this can make it a little more readable:
    library( magrittr )
    text <- strsplit( x = text, split = "\\\\" ) %>% sapply( FUN = "[", 1 )
* The "[" function is the function called when you subset with []. E.g. vector[1:3], or in this case vector[1] (thanks @mathewlundberg for the suggestion here).
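For completeness, the sub approach from the question can probably be made to work as well. In the original pattern, "\\*" is the regex \*, which matches a literal asterisk rather than a backslash, so the pattern likely never matches and sub returns the input unchanged. Below is a minimal sketch, assuming every entry has the form name, one backslash, then an ID, as in the example data above. The variable names players and players_fixed are only for illustration.

```r
# Example data copied from the answer above. Each string holds ONE literal
# backslash, which has to be written as "\\" in R source code.
text <- c("lebron james\\jamesle01", "michael jordan\\jamesle01")

# "\\\\" in R source is the two-character regex \\ and matches one literal
# backslash. ".*$" then consumes everything after it, so the ID is dropped.
players <- sub("\\\\.*$", "", text)
print(players)

# Alternative that avoids regex escaping: with fixed = TRUE the split string
# is taken literally, so a single backslash ("\\" in source) is enough.
players_fixed <- sapply(strsplit(text, "\\", fixed = TRUE), "[", 1)

# Both approaches should agree on this data.
stopifnot(identical(players, players_fixed))
```

Stripping everything from the backslash onward avoids having to describe every possible name shape (hyphens, apostrophes, multiple words) in the pattern. If some entries contain no backslash, sub leaves them untouched.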