Pig Split command
============SPLIT==================
Emp1 = LOAD '/user/cloudera/N_EMP1.txt' USING
PigStorage(',') as ( sno:int , name:chararray , role:chararray ,
salary:chararray , company:chararray , exp:int);
Emp2 = LOAD '/user/cloudera/N_EMP2.txt' USING
PigStorage(',') as ( sno:int , name:chararray , role:chararray , salary:int ,
company:chararray , exp:int);
The SPLIT operator is used to split a
relation into two or more relations.
SPLIT Emp1 into X if (exp > 3), Y if
(exp < 3);
dump X;
==========CROSS========
c = CROSS Emp1,Emp2;
dump c;
Join:
TXN = LOAD '/user/cloudera/txns.txt' USING
PigStorage(',') as (txnno:INT,
txndate:CHARARRAY, custno:INT,
amount:INT,
category:CHARARRAY, product:CHARARRAY, city:CHARARRAY, state:CHARARRAY,
spendby:CHARARRAY);
CUTS = LOAD '/user/cloudera/custs' USING
PigStorage(',') as (cusid:int, firstname:
CHARARRAY, lastname: CHARARRAY, age:int,profession:CHARARRAY);
J = Join TXN by custno, CUTS by cusid;
L = Limit J 20;
DUMP L;
· Left Outer
Joins:
A
left outer join means records from the left side willbe included even when they
do not have a match on the right side.
LJ= JOIN F by custno LEFT OUTER, F1 by
cusid;
LJ1= LIMIT LJ 15;
DUMP LJ1;
· Right Outer
Join:
A
right outer joins means records from the right side will be included even when
they do nothave a match on the left side.
RJ= JOIN F by custno RIGHT
OUTER, F1 by cusid;
RJ1= LIMIT RJ 15;
DUMP RJ1;
· Full Outer Join:
A full outer join means records from both sides are taken
even when they do not have matches.
FJ =
JOIN F by custno FULL OUTER , F1 by cusid;
DUMP FJ;
· Self Join:
Self joins are supported, though the data must
be loaded twice.
trans =
LOAD '/user/cloudera/txns.txt' AS (txnno:chararray,txndate:chararray,custno:int,amount:chararray,category:chararray,product:chararray,city:chararray,state:chararray,spendby:chararray);
trans1 =
LOAD '/user/cloudera/txns.txt' AS (txnno:chararray,txndate:chararray,custno:int,amount:chararray,category:chararray,product:chararray,city:chararray,state:chararray,spendby:chararray);
SF = JOIN
trans by custno, trans1 by custno;
SF1=
LIMIT SF 15;
DUMP
SF1;
Comments
Post a Comment