Thursday, 4 February 2016

Reading an RDD Array of Array in Spark

Suppose we have an Array[Array[Int]] as shown below:

val values = Array(Array(1,2),Array(3,4)) 

Converting 'values' to an RDD:

val valuesRDD = sc.parallelize(values)
('sc' is the SparkContext instance that the Spark shell makes available by default)

This will create an RDD[Array[Int]]: each element of the RDD is a whole Array[Int], not an individual Int.
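Since parallelize simply distributes a local collection, the element type is preserved. The nested structure can be inspected without a cluster by iterating over the local array itself (a plain Scala sketch, no Spark required, using the same 'values' as above):

```scala
object NestedArrayDemo extends App {
  val values = Array(Array(1, 2), Array(3, 4))

  // Every element of the outer array is an inner Array[Int];
  // sc.parallelize(values) keeps exactly this element type.
  values.foreach(inner => println(inner.mkString(",")))
  // prints:
  // 1,2
  // 3,4
}
```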

Now, let us print each element in the above RDD. Note that collect() pulls the entire RDD back to the driver, so this is only advisable for small data sets:

valuesRDD.collect().foreach(array => array.foreach(element => println(element)))

The result will be:

1
2
3
4
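A common follow-up is flattening the nested arrays into a single sequence of Ints. On the RDD this is done with flatMap, as in valuesRDD.flatMap(x => x), which yields an RDD[Int]. The same operation on the driver-side array behaves identically and can be tried without Spark (a plain Scala sketch using the 'values' array from above):

```scala
object FlattenDemo extends App {
  val values = Array(Array(1, 2), Array(3, 4))

  // flatMap with the identity function removes one level of nesting;
  // the RDD version, valuesRDD.flatMap(x => x), works the same way.
  val flat: Array[Int] = values.flatMap(x => x)

  println(flat.mkString(" "))  // 1 2 3 4
}
```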