java - Calculating the 'n' maximum values in Hadoop




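I have a scenario in which a second job consumes the output of a previous job (job1).

In the next job I need to find the i keys having the maximum values. E.g. for i=3, the 3 keys having the maximum values (i would be a custom parameter).

How should I approach this?

Should the max be calculated in the job2 mapper, since the keys will be unique because the output comes from the previous reducer, or should I find the max in the second job's reducer? But then again, how do I find the i keys?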

Update

I tried it this way: instead of emitting the value as the value in the reducer, I emitted the value as the key, so that I get the values in ascending order. Then I wrote the next MR job, where the mapper simply emits the key/value.

The reducer finds the max of the keys, but once again I am stuck: this cannot be done when I try to get the id, because only the id is unique; the values are not unique.

How can I solve this?

Can anyone suggest a solution?

Thanks in advance.

You can find the top i keys with a PriorityQueue. Here is some simple code to illustrate the idea:

// Imports for the enclosing source file:
import java.io.IOException;
import java.util.PriorityQueue;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Mapper;

public static class TopNMapper extends Mapper<IntWritable, DoubleWritable, IntWritable, DoubleWritable> {

    private static class MyPair implements Comparable<MyPair> {
        public final int key;
        public final double value;

        MyPair(int key, double value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public int compareTo(MyPair o) {
            // Ascending by value, without a leading '-': PriorityQueue is a min-heap,
            // so poll() removes the smallest value and the largest ones stay in the queue.
            return Double.compare(value, o.value);
        }
    }

    private final PriorityQueue<MyPair> topN = new PriorityQueue<MyPair>();

    @Override
    protected void map(IntWritable key, DoubleWritable value, Context context)
            throws IOException, InterruptedException {
        if (Double.isNaN(value.get())) {
            return; // not a number
        }
        topN.add(new MyPair(key.get(), value.get()));
        if (topN.size() <= 50) { // simple optimization: trim the queue only once it grows past 50
            return;
        }
        while (topN.size() > 3) { // retain only the top 3 elements in the queue
            topN.poll();
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        while (topN.size() > 3) {
            topN.poll(); // retain only the top 3 elements in the queue
        }
        for (MyPair myPair : topN) { // write the top 3 elements (in no particular order)
            context.write(new IntWritable(myPair.key), new DoubleWritable(myPair.value));
        }
    }
}
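The direction of the comparison decides which elements poll() throws away, so it is worth checking outside Hadoop. The following is a minimal sketch using only the JDK; the class name HeapOrderDemo and the helper retain are made up for this illustration and are not part of the mapper. It assumes nothing beyond the documented behaviour of java.util.PriorityQueue, whose head is the least element under its ordering.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class HeapOrderDemo {

    /**
     * Adds every value to a PriorityQueue with the given ordering, polling
     * whenever the queue holds more than n elements. Returns the survivors
     * sorted ascending so the result is easy to read.
     */
    static List<Double> retain(double[] values, int n, Comparator<Double> order) {
        PriorityQueue<Double> queue = new PriorityQueue<>(order);
        for (double v : values) {
            queue.add(v);
            if (queue.size() > n) {
                queue.poll(); // removes the head: the least element under 'order'
            }
        }
        List<Double> kept = new ArrayList<>(queue);
        Collections.sort(kept);
        return kept;
    }

    public static void main(String[] args) {
        double[] values = {4.0, 9.5, 1.0, 7.0, 3.0};
        // Natural (ascending) order: the smallest value is polled, the largest remain.
        System.out.println(retain(values, 3, Comparator.naturalOrder())); // [4.0, 7.0, 9.5]
        // Reversed order: the largest value is polled, the smallest remain.
        System.out.println(retain(values, 3, Comparator.reverseOrder())); // [1.0, 3.0, 4.0]
    }
}
```

So, to keep the maximum values, the queue should be ordered ascending by value; a negated comparison would keep the minimum values instead.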

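If you run only one mapper (one input), you should get the 3 keys with the maximum values. With more than one mapper, each mapper emits its own local top 3, so the results still have to be merged, for example by a single reducer that applies the same queue logic.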
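When the input is split across several mappers, each one can only know its local top i, so the partial results need one more selection pass. The sketch below shows that merge step in plain Java rather than as Hadoop code; the class TopNMerge, the helpers topN and entry, and the sample data are all hypothetical. In a real job the same loop could sit in a single reducer, with i read from the job configuration. Because values are not unique, ties at the cut-off are broken arbitrarily here.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class TopNMerge {

    // Orders (key, value) pairs ascending by value.
    private static final Comparator<Map.Entry<Integer, Double>> BY_VALUE =
            Comparator.comparingDouble(e -> e.getValue());

    /** Convenience factory for a (key, value) pair. */
    static Map.Entry<Integer, Double> entry(int key, double value) {
        return new SimpleEntry<>(key, value);
    }

    /** Returns the n pairs with the largest values, largest first. */
    static List<Map.Entry<Integer, Double>> topN(
            List<List<Map.Entry<Integer, Double>>> partials, int n) {
        // Min-heap on value: the head is the smallest pair kept so far.
        PriorityQueue<Map.Entry<Integer, Double>> heap = new PriorityQueue<>(BY_VALUE);
        for (List<Map.Entry<Integer, Double>> partial : partials) {
            for (Map.Entry<Integer, Double> pair : partial) {
                heap.add(pair);
                if (heap.size() > n) {
                    heap.poll(); // drop the current smallest
                }
            }
        }
        List<Map.Entry<Integer, Double>> result = new ArrayList<>(heap);
        result.sort(BY_VALUE.reversed());
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical local top-3 lists emitted by two mappers.
        List<Map.Entry<Integer, Double>> mapperA =
                Arrays.asList(entry(1, 9.0), entry(2, 7.5), entry(3, 7.0));
        List<Map.Entry<Integer, Double>> mapperB =
                Arrays.asList(entry(4, 8.0), entry(5, 6.0), entry(6, 2.0));
        System.out.println(topN(Arrays.asList(mapperA, mapperB), 3)); // [1=9.0, 4=8.0, 2=7.5]
    }
}
```

The heap never holds more than n + 1 pairs, so the merge stays cheap even if many mappers contribute partial lists.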

java hadoop mapreduce max
