hadoop - Hive UDF With Parameters -
I want to write a custom UDF (UDAF/UDTF) that can take in a constant parameter.
For example, I want to write a function max(col, i), where col is the collection of values in which to find the max value and i is the position (i.e. i = 1 finds the highest, i = 2 finds the second highest, etc.), so that the Hive query looks like:
select max(value, 2) from table;
This isn't just about max. I need a general way of being able to do this, so sorting and selecting from the sorted collection will not work.
You can use ConstantObjectInspectors for constant values passed as parameters. In the initialize() method of GenericUDF, or init() in GenericUDAFEvaluator, check whether the specified ObjectInspector is an instance of ConstantObjectInspector. If it is, cast it; otherwise throw an exception.
For example:
    public ObjectInspector init(Mode m, ObjectInspector[] parameters) throws HiveException {
        ...
        if (!(parameters[1] instanceof ConstantObjectInspector)) {
            throw new HiveException("Position parameter must be a constant.");
        }
        ConstantObjectInspector posOI = (ConstantObjectInspector) parameters[1];
        pos = ((IntWritable) posOI.getWritableConstantValue()).get();
        ...
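Once the position has been read from the constant parameter, the aggregation itself does not have to sort the whole column. As a rough sketch only, the buffer could keep just the pos largest values in a min-heap. The class below is plain Java with no Hive dependencies. NthMaxBuffer and its methods are hypothetical names that merely mirror the iterate/merge/terminate steps of a UDAF evaluator, and it assumes that duplicates count as separate values and that the column holds long values.

```java
import java.util.PriorityQueue;

/**
 * Hypothetical helper (not part of Hive): keeps only the `pos` largest
 * values seen so far, so the pos-th largest can be returned without
 * sorting the full input.
 */
final class NthMaxBuffer {
    private final int pos;
    // Min-heap: the smallest of the retained values sits at the head.
    private final PriorityQueue<Long> heap = new PriorityQueue<>();

    NthMaxBuffer(int pos) {
        if (pos < 1) {
            throw new IllegalArgumentException("pos must be >= 1");
        }
        this.pos = pos;
    }

    /** Called once per input value. */
    void iterate(long value) {
        if (heap.size() < pos) {
            heap.add(value);
        } else if (value > heap.peek()) {
            // The new value displaces the smallest retained value.
            heap.poll();
            heap.add(value);
        }
    }

    /** Folds in a partial result computed elsewhere (e.g. by another task). */
    void merge(NthMaxBuffer other) {
        for (long v : other.heap) {
            iterate(v);
        }
    }

    /** Returns the pos-th largest value, or null if fewer than pos values were seen. */
    Long terminate() {
        return heap.size() < pos ? null : heap.peek();
    }

    public static void main(String[] args) {
        NthMaxBuffer buf = new NthMaxBuffer(2);
        for (long v : new long[] {5, 1, 9, 7}) {
            buf.iterate(v);
        }
        // Prints the second-highest of the four values.
        System.out.println(buf.terminate());
    }
}
```

In a real evaluator, pos would come from the ConstantObjectInspector check in init(), and the heap would live in the aggregation buffer. Memory stays proportional to pos rather than to the number of rows.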
For the specific use case here, check out collect_max in Brickhouse (http://github.com/klout/brickhouse), which collects the top N keys with the max values.
hadoop hive apache-pig user-defined-functions user-defined-aggregate