Spark 2.0 - Outer Join Java Example

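In Spark 2, Datasets do not have dedicated methods such as leftOuterJoin() or rightOuterJoin() the way RDDs do. To perform an outer join on two Datasets, we instead call join() and pass the join type as a string argument. The accepted values are:
"inner", "outer", "full", "fullouter", "leftouter", "left", "rightouter", "right", "leftsemi", "leftanti".
Several of these are synonyms: "outer", "full" and "fullouter" all request a full outer join, "leftouter" and "left" a left outer join, and "rightouter" and "right" a right outer join. The example below does a left outer join of customers with their orders on the CustomerId column.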

import java.util.ArrayList;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class OuterJoinExample {

	public static void main(String[] args)
	{
		// Local session that uses all available cores
		SparkSession session = SparkSession.builder().appName("test 2.0").master("local[*]").getOrCreate();

		// Read both CSV files, treating the first line of each as the header
		Dataset<Row> customers = session.read().option("header", true).csv("/home/myname/customer.csv");
		Dataset<Row> orders = session.read().option("header", true).csv("/home/myname/order.csv");

		// Names of the columns to join on; they must exist in both Datasets
		ArrayList<String> joinColList = new ArrayList<String>();
		joinColList.add("CustomerId");

		// join() expects a Scala Seq, so the Java list is converted first.
		// "leftouter" keeps every customer, including those without orders.
		Dataset<Row> joinedData = customers.join(orders, scala.collection.JavaConversions.asScalaBuffer(joinColList), "leftouter");

		joinedData.show();

		session.stop();
	}
}

customer.csv

CustomerId,Name,City
1,Harish,Bangalore
2,Naresh,Mumbai
3,Suresh,New Delhi
4,Mahesh,Calcutta

order.csv

CustomerId,OrderId,Item
1,111,Laptop
1,222,Printer
3,333,Monitor

Output

+----------+------+---------+-------+-------+
|CustomerId|  Name|     City|OrderId|   Item|
+----------+------+---------+-------+-------+
|         1|Harish|Bangalore|    222|Printer|
|         1|Harish|Bangalore|    111| Laptop|
|         2|Naresh|   Mumbai|   null|   null|
|         3|Suresh|New Delhi|    333|Monitor|
|         4|Mahesh| Calcutta|   null|   null|
+----------+------+---------+-------+-------+
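
To make the null rows in the output easier to reason about, here is a minimal plain-Java sketch of what a left outer join does, using the same sample data. It does not use Spark at all; the class name LeftOuterJoinSketch and the helper leftOuterJoin() are made up for this illustration, and the nested loop is only meant to show the semantics, not how Spark executes the join. The rows are the same as in the output above, although the order may differ, since Spark does not guarantee row order for a join.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of left outer join semantics; Spark is not involved.
// The class and method names are hypothetical, chosen for this example only.
class LeftOuterJoinSketch {

    // Each row is a String[] whose first element is the join key (CustomerId).
    // Every left row is kept: it is paired with each matching right row, or
    // padded with nulls for the right-hand columns when nothing matches.
    static List<String[]> leftOuterJoin(List<String[]> left, List<String[]> right) {
        List<String[]> result = new ArrayList<>();
        for (String[] l : left) {
            boolean matched = false;
            for (String[] r : right) {
                if (l[0].equals(r[0])) {
                    // key, remaining left columns, remaining right columns
                    result.add(new String[] {l[0], l[1], l[2], r[1], r[2]});
                    matched = true;
                }
            }
            if (!matched) {
                result.add(new String[] {l[0], l[1], l[2], null, null});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Same rows as customer.csv (CustomerId, Name, City)
        List<String[]> customers = Arrays.asList(
                new String[] {"1", "Harish", "Bangalore"},
                new String[] {"2", "Naresh", "Mumbai"},
                new String[] {"3", "Suresh", "New Delhi"},
                new String[] {"4", "Mahesh", "Calcutta"});

        // Same rows as order.csv (CustomerId, OrderId, Item)
        List<String[]> orders = Arrays.asList(
                new String[] {"1", "111", "Laptop"},
                new String[] {"1", "222", "Printer"},
                new String[] {"3", "333", "Monitor"});

        for (String[] row : leftOuterJoin(customers, orders)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Customers 2 and 4 have no orders, so they appear once each with null in the OrderId and Item positions, while customer 1 appears twice, once per order.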
