hadoop - Pig Multi-Query Optimization issue


We are running into issues with Pig's multi-query optimizer not working as expected.

As I understand it, the script below should run as one MR job, but it runs as two jobs on our cluster. I think multi-query optimization should be on by default; am I missing something here? If I replace the group by with a "filter" statement, it works as a single MR job.

data = load 'input' as (a:chararray, b:int, c:int);
a = group data by b;
b = group data by c;
store a into 'output1';
store b into 'output2';
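
For comparison, the filter variant mentioned in the question might look roughly like the sketch below. The question does not give the actual filter conditions, so the aliases `x` and `y` and the predicates `b > 0` and `c > 0` are hypothetical placeholders.

```pig
-- Same input and schema as the script in the question.
data = load 'input' as (a:chararray, b:int, c:int);

-- Hypothetical filter conditions; the question does not state the real ones.
x = filter data by b > 0;
y = filter data by c > 0;

-- Two stores fed from a single load, as in the group by version.
store x into 'output1';
store y into 'output2';
```

According to the question, this form runs as a single MR job, while the version with two group by statements runs as two.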

I'm using the Pig 0.1.0 packaged with CDH, on Hadoop 2.0.0.

If 0.1.0 is the real version of your Pig installation, it's very old. The latest version is 0.11.1.

See the performance page in the 0.11.1 docs: http://pig.apache.org/docs/r0.11.1/perf.html

