hadoop - Hive UDF With Parameters -




I want to write a custom UDF (or UDAF/UDTF) that can take in a constant parameter.

For example, I want to write a function max(col, i), where col is the collection of values to find the max value of, and i is the position (i.e. i = 1 finds the highest, i = 2 finds the second highest, etc.), such that the Hive query looks like:

select max(value, 2) from table;

This isn't just for max. I need a general way of being able to do this, so sorting and selecting from the sorted collection will not work.

You can use ConstantObjectInspectors for constant values passed as parameters. In the initialize() method of a GenericUDF, or init() in a GenericUDAFEvaluator, check to see if the specified ObjectInspector is an instance of ConstantObjectInspector. If it is, cast it; otherwise throw an exception.

For example:

    public ObjectInspector init(Mode m, ObjectInspector[] parameters) throws HiveException {
        ......
        if (!(parameters[1] instanceof ConstantObjectInspector)) {
            throw new HiveException("Position parameter must be constant.");
        }
        ConstantObjectInspector posOI = (ConstantObjectInspector) parameters[1];
        pos = ((IntWritable) posOI.getWritableConstantValue()).get();
        ......
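Once pos has been read from the constant parameter, the evaluator still has to use it while aggregating. The following is a minimal plain-Java sketch of one way that could look for the max(col, i) example: a bounded min-heap that keeps only the pos largest values seen so far. It deliberately uses no Hive classes. The class name KthLargestBuffer is hypothetical, and its iterate, merge and terminate methods only mirror the roles of the GenericUDAFEvaluator methods with the same names; they are not the Hive signatures. It also assumes integer values and counts duplicates as separate positions.

```java
import java.util.PriorityQueue;

// Hypothetical stand-in for a UDAF aggregation buffer; no Hive classes are used.
class KthLargestBuffer {
    private final int pos;                                              // 1 = largest, 2 = second largest, ...
    private final PriorityQueue<Integer> heap = new PriorityQueue<>();  // min-heap of the pos largest values

    KthLargestBuffer(int pos) {
        if (pos < 1) {
            throw new IllegalArgumentException("pos must be >= 1");
        }
        this.pos = pos;
    }

    // Plays the role of iterate(): fold one row's value into the buffer.
    void iterate(int value) {
        if (heap.size() < pos) {
            heap.add(value);
        } else if (value > heap.peek()) {
            heap.poll();        // drop the smallest retained value
            heap.add(value);
        }
    }

    // Plays the role of merge(): combine a partial result from another task.
    void merge(KthLargestBuffer other) {
        for (int v : other.heap) {
            iterate(v);
        }
    }

    // Plays the role of terminate(): null when fewer than pos values were seen.
    Integer terminate() {
        return heap.size() < pos ? null : heap.peek();
    }
}
```

Because the heap never holds more than pos values, memory stays bounded no matter how many rows are aggregated, and partial buffers can be merged in any order.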

For the specific use case here, check out collect_max in Brickhouse (http://github.com/klout/brickhouse), which collects the top N keys and max values.

hadoop hive apache-pig user-defined-functions user-defined-aggregate
