hadoop - Unable to identify the bug in my Reducer join code
I have 2 datasets:
users:
bobby 06 amsterdam
sunny 07 rotterdam
steven 08 liverpool
jamie 23 liverpool
macca 91 liverpool
messi 10 barcelona
pique 04 barcelona
suarez 09 barcelona
neymar 11 brazil
klopp 12 liverpool
userlogs:
sunny newplayer 12.23.14.421
klopp crazy 88.33.44.555
bobby newplayer 99.12.11.222
steven captain 99.55.66.777
jamie local 88.99.33.232
suarez spain 77.55.66.444
I want to join these 2 datasets using a reducer join. I wrote the classes this way:
Mapper class:

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MapperClass {

        // Emits (name, "userfile\t<city>") for each line of the users dataset.
        public static class UserMap extends Mapper<LongWritable, Text, Text, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String line = value.toString();
                String[] tokens = line.split(" ");
                String name = tokens[0];
                String city = tokens[2];
                context.write(new Text(name), new Text("userfile" + "\t" + city));
            }
        }

        // Emits (name, "userlogs\t<ip>") for each line of the userlogs dataset.
        public static class UserLogs extends Mapper<LongWritable, Text, Text, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String line = value.toString();
                String[] tokens = line.split(" ");
                String name = tokens[0];
                String ip = tokens[2];
                context.write(new Text(name), new Text("userlogs" + "\t" + ip));
            }
        }
    }
Reducer class:

    public class ReducerClass extends Reducer<Text, Text, Text, Text> {
        @Override
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            String city = null;
            String ip = null;
            for (Text t : values) {
                String[] parts = t.toString().split("\t");
                if (parts[0].equals("userfile")) {
                    city = parts[1];
                }
                if (parts[0].equals("userlogs")) {
                    ip = parts[1];
                } else {
                    ip = "ip address not found";
                }
            }
            context.write(key, new Text(city + "\t" + ip));
        }
    }
Driver class:

    public class MainClass {
        public static void main(String[] args)
                throws IOException, InterruptedException, ClassNotFoundException {
            Job job = new Job();
            job.setJarByClass(MainClass.class);
            job.setOutputKeyClass(Text.class);
            job.setReducerClass(ReducerClass.class);
            job.setOutputValueClass(Text.class);
            job.setInputFormatClass(TextInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);
            MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, UserMap.class);
            MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, UserLogs.class);
            FileOutputFormat.setOutputPath(job, new Path(args[2]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
The output should look like this:
bobby amsterdam 99.12.11.222
sunny rotterdam 12.23.14.421
klopp liverpool 88.33.44.555
steven liverpool 99.55.66.777
jamie liverpool 88.99.33.232
suarez barcelona 77.55.66.444
Instead, I'm getting this output:
bobby amsterdam ip address not found
jamie liverpool 88.99.33.232
klopp liverpool ip address not found
macca liverpool ip address not found
messi barcelona ip address not found
neymar brazil ip address not found
pique barcelona ip address not found
steven liverpool 99.55.66.777
suarez barcelona ip address not found
sunny rotterdam 12.23.14.421
I can't understand what mistake I made here. Can anyone help me fix this problem? Any kind of help is appreciated.
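There is an error in your reducer: it overwrites the IP address depending on the order of the values. The else belongs only to the second if, so a "userfile" value that arrives after a "userlogs" value resets ip to "ip address not found". Try this one: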
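    import java.io.IOException;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class ReducerClass extends Reducer<Text, Text, Text, Text> {
        @Override
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            String city = null;
            String ip = null;
            for (Text t : values) {
                String[] parts = t.toString().split("\t");
                // Each tagged value is handled in exactly one branch,
                // so the order of the values no longer matters.
                if (parts[0].equals("userfile")) {
                    city = parts[1];
                } else if (parts[0].equals("userlogs")) {
                    ip = parts[1];
                }
            }
            // Emit only keys present in both datasets (inner join).
            if (ip != null && city != null) {
                context.write(key, new Text(city + "\t" + ip));
            }
        }
    }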
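To see the order dependence without running a Hadoop job, here is a minimal plain-Java sketch of both loops. The class name JoinOrderDemo and the methods buggyJoin and fixedJoin are made up for illustration and are not part of the original code. Each method applies one version of the reducer loop to a list of tagged values for a single key. Hadoop does not guarantee the order of the values for a key unless you set up a secondary sort, so the reducer has to cope with either arrival order.

```java
import java.util.Arrays;
import java.util.List;

public class JoinOrderDemo {

    // Original loop: the else is attached only to the second if, so a
    // "userfile" value also reaches it and replaces any IP found earlier.
    public static String buggyJoin(List<String> values) {
        String city = null;
        String ip = null;
        for (String v : values) {
            String[] parts = v.split("\t");
            if (parts[0].equals("userfile")) {
                city = parts[1];
            }
            if (parts[0].equals("userlogs")) {
                ip = parts[1];
            } else {
                ip = "ip address not found";
            }
        }
        return city + "\t" + ip;
    }

    // Corrected loop: each tag is handled in exactly one branch.
    // Returns null when either side is missing (inner join, nothing emitted).
    public static String fixedJoin(List<String> values) {
        String city = null;
        String ip = null;
        for (String v : values) {
            String[] parts = v.split("\t");
            if (parts[0].equals("userfile")) {
                city = parts[1];
            } else if (parts[0].equals("userlogs")) {
                ip = parts[1];
            }
        }
        return (city != null && ip != null) ? city + "\t" + ip : null;
    }

    public static void main(String[] args) {
        // The same two values for key "bobby", in both possible arrival orders.
        List<String> logFirst = Arrays.asList("userlogs\t99.12.11.222", "userfile\tamsterdam");
        List<String> userFirst = Arrays.asList("userfile\tamsterdam", "userlogs\t99.12.11.222");

        System.out.println("buggy, log first:  " + buggyJoin(logFirst));
        System.out.println("buggy, user first: " + buggyJoin(userFirst));
        System.out.println("fixed, log first:  " + fixedJoin(logFirst));
        System.out.println("fixed, user first: " + fixedJoin(userFirst));
    }
}
```

With the log record first, the buggy loop loses the IP, which matches the "bobby amsterdam ip address not found" line in the question. The fixed loop gives the same result for both orders, and it returns null for users such as macca or messi who have no log entry, so they are left out of the output as in the expected result.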