For the past few hours I've been banging my head against the same problem over and over. I had to deal with strings of hexadecimal values coming from several sources. Sounds easy enough, just use the iconv function… but no, it does not work at all for certain strings.
I tried to figure it out by hand and found this Stack Overflow discussion, which pointed me in the right direction:
s <- '1271763355662E324375203137'
h <- sapply(seq(1, nchar(s), by=2), function(x) substr(s, x, x+1))
rawToChar(as.raw(strtoi(h, 16L)))
## [1] "\022qv3Uf.2Cu 17"
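To make the individual steps visible, here is a minimal sketch of the same idea with the intermediate values written out. The input string and the variable names (`starts`, `pairs`, `codes`, `text`) are my own choices for illustration, and I use the vectorised `substring()` instead of `sapply()`:

```r
# Illustrative input: the hex encoding of the ASCII text "Hello"
s <- "48656C6C6F"

# Start position of every two-character hex pair
starts <- seq(1, nchar(s), by = 2)

# Cut the string into pairs: "48" "65" "6C" "6C" "6F"
pairs <- substring(s, starts, starts + 1)

# Interpret each pair as a base-16 number: 72 101 108 108 111
codes <- strtoi(pairs, base = 16L)

# Turn the numbers into raw bytes and the bytes into a string
text <- rawToChar(as.raw(codes))
text
## [1] "Hello"
```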
But once again it did not work for most strings… I kept trying other approaches, until I finally took a closer look at the error message:
rawToChar(as.raw(c(65:68, 0, 70)))
## Error in rawToChar(as.raw(c(65:68, 0, 70))) : embedded nul in string: 'ABCD\0F'
Aha! So that was the culprit: the odd "00" byte. An embedded nul breaks almost any string function in R. I tinkered around to catch it and finally figured out the easiest way to handle it. Here you are:
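hexToText <- function(msg){
  # split the input into two-character hex pairs
  hex <- sapply(seq(1, nchar(as.character(msg)), by=2), function(x) substr(msg, x, x+1))
  # drop nul bytes, which rawToChar() cannot handle
  hex <- subset(hex, !hex == "00")
  # convert to text and strip non-printable characters
  gsub('[^[:print:]]+', '', rawToChar(as.raw(strtoi(hex, 16L))))
}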
So my code still does mostly the same as the initial snippet; I simply remove the nuls from the set of hex pairs, and everything works fine.
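For a quick check, here is a small variant that does the nul filtering on the raw bytes instead of on the hex strings. It is only a sketch: `hex_to_text` is a hypothetical name of mine, not the function above, and the first test string is one I made up (the letters of "Hello", each followed by a 00 byte). It assumes a non-empty input with an even number of hex digits.

```r
# Hypothetical variant: same idea, but nul bytes are dropped after
# the conversion to raw rather than by comparing hex strings.
hex_to_text <- function(msg) {
  msg <- as.character(msg)
  starts <- seq(1, nchar(msg), by = 2)
  bytes <- as.raw(strtoi(substring(msg, starts, starts + 1), base = 16L))
  bytes <- bytes[bytes != as.raw(0)]           # remove embedded nuls
  gsub("[^[:print:]]+", "", rawToChar(bytes))  # strip non-printables
}

# Made-up input with a 00 byte after every character
res_nul <- hex_to_text("480065006C006C006F00")
res_nul
## [1] "Hello"

# The string from the first snippet; the leading \022 is stripped
res_first <- hex_to_text("1271763355662E324375203137")
res_first
## [1] "qv3Uf.2Cu 17"
```

Keep in mind that dropping every 00 byte and every non-printable character is lossy, so this only makes sense when you want the readable text rather than an exact round trip of the bytes.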
A Data Science consultant working at Sopra Steria. He occasionally blogs about data and related topics here and is the host of the Dortmund Data Science Meetup.