Pig UDF


For custom processing, Pig provides user-defined functions (UDFs). Pig UDFs can be written in several languages, including Java, Python, and Ruby.

  • A Java eval UDF must extend "org.apache.pig.EvalFunc".
  • It must implement the "exec" method, which Pig calls once for each input tuple.

Let's look at a simple eval function that converts a string to upper case.

    package myudfs;

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    public class UPPER extends EvalFunc<String> {
        // Called by Pig for each input tuple; returns the first field in upper case.
        @Override
        public String exec(Tuple input) throws IOException {
            // Nothing to convert for a missing or empty tuple
            if (input == null || input.size() == 0)
                return null;
            try {
                String str = (String) input.get(0);
                return str.toUpperCase();
            } catch (Exception e) {
                throw new IOException("Caught exception processing input row ", e);
            }
        }
    }
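
To see how this logic treats different inputs without a Pig installation, it can be exercised in plain Java. This is only a sketch under stated assumptions: the class UpperSketch and its exec(List<Object>) method are hypothetical stand-ins, with java.util.List taking the place of Pig's Tuple. Neither is part of the Pig API.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the UPPER UDF: a List<Object> plays the role of Pig's Tuple.
class UpperSketch {

    // Mirrors the UDF logic: null for missing input, upper-cased first field otherwise.
    static String exec(List<Object> input) throws IOException {
        if (input == null || input.size() == 0) {
            return null;
        }
        try {
            String str = (String) input.get(0);
            return str.toUpperCase();
        } catch (Exception e) {
            // A null or non-string first field ends up here
            throw new IOException("Caught exception processing input row ", e);
        }
    }

    public static void main(String[] args) throws IOException {
        // One row shaped like (name, age, gpa), then an empty row
        System.out.println(exec(Arrays.<Object>asList("john", 18, 4.0f)));
        System.out.println(exec(Collections.<Object>emptyList()));
    }
}
```

An empty or missing row yields null rather than an error, while a first field that is null or not a string is reported as an IOException.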

Compile the class above and package it into a jar named myudfs.jar.

Now write the Pig script and save it in a file with a .pig extension. Here I am using script.pig.

    -- script.pig
    REGISTER myudfs.jar;
    A = LOAD 'data' AS (name: chararray, age: int, gpa: float);
    B = FOREACH A GENERATE myudfs.UPPER(name);
    DUMP B;

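Finally, run the script from the terminal, for example with pig -x local script.pig for local mode. DUMP B prints one tuple per input row containing the upper-cased name.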
