MapReduce in R - how can I implement a "loop if" in the reduce step?
Given this illustration dataset:
x <- c("a1", "a1", "a1", "a2", "a2", "a2", "a2", "a3")
y <- c(5347, 5347, 5347, 1819, 1758, 1212, 1212, 1456)
I can't work out how to process the dataset as input to the MapReduce reduce step after "map|sort", because the fields are separated by \t and each row first has to be split (a necessary step in MapReduce):
fields <- unlist(strsplit(line, "\t"))
where line is an input row with 2 fields:
fields[[1]] = the value from column x
fields[[2]] = the value from column y
I want this result:
id   count of unique numbers
a1   1 (only 5347)
a2   3 (1819, 1758, 1212)
a3   1 (only 1456)
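How can I count this? That is, how do I loop over the rows, watch column x, and for as long as x stays the same look for new numbers in column y, so that I end up with the count of unique numbers in column y for each unique value in column x?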
The question is not clear (maybe because of an English language problem). The expected result, I think, looks like:
tapply(y, x, function(t) length(unique(t)))
a1 a2 a3
 1  3  1
which in plain English is: computing the number of unique y values for each x.
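The tapply call works when x and y are already in memory. Inside a reducer the rows arrive one at a time as "x\ty" lines sorted by x, so the same count can be built with a loop and an if that detects when the key changes. The following is only a minimal sketch under that assumption; the function name count_unique_by_key and its variables are hypothetical and not part of any MapReduce library.

```r
# Hypothetical helper: `lines` is a character vector of "x<TAB>y" rows,
# assumed to be already sorted by x (as they are after "map | sort").
count_unique_by_key <- function(lines) {
  keys <- character(0)      # finished keys, in order of appearance
  counts <- integer(0)      # number of distinct y values per finished key
  current_key <- NULL       # key of the group being processed
  seen <- character(0)      # distinct y values seen for current_key

  for (line in lines) {
    fields <- unlist(strsplit(line, "\t"))
    if (length(fields) < 2) next          # skip malformed rows
    key <- fields[[1]]
    value <- fields[[2]]

    if (is.null(current_key) || key != current_key) {
      # The key changed: store the finished group, then start a new one.
      if (!is.null(current_key)) {
        keys <- c(keys, current_key)
        counts <- c(counts, length(seen))
      }
      current_key <- key
      seen <- character(0)
    }

    # Remember the value only if it is new for this key.
    if (!(value %in% seen)) seen <- c(seen, value)
  }

  # Store the last group, which no key change follows.
  if (!is.null(current_key)) {
    keys <- c(keys, current_key)
    counts <- c(counts, length(seen))
  }

  setNames(counts, keys)
}

# Simulate the sorted reducer input from the illustration dataset.
x <- c("a1", "a1", "a1", "a2", "a2", "a2", "a2", "a3")
y <- c(5347, 5347, 5347, 1819, 1758, 1212, 1212, 1456)
result <- count_unique_by_key(paste(x, y, sep = "\t"))
print(result)
```

For the illustration dataset this gives 1, 3 and 1 for a1, a2 and a3, the same counts as the tapply call. In an actual streaming job the input could be read with readLines(file("stdin")) instead of being built with paste; that reads all rows into memory at once, so a very large input would need to be read line by line instead.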
r loops if-statement mapreduce