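3. Row Storage vs. Column Storage
Two schemes are currently available for big data storage: row-based storage and column-based storage. ...It is a read-only table; elements cannot be added to it during computation.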
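To make the difference between the row-based and column-based schemes mentioned above more concrete, here is a minimal sketch in plain Scala that can be pasted into the Scala REPL or spark-shell. It does not use Spark. The names `Record`, `RecordColumns`, `rows`, and `columns`, and all of the values, are hypothetical and exist only for illustration. It is not how any particular storage engine is implemented.

```scala
// Hypothetical record type, for illustration only.
case class Record(id: Int, name: String, amount: Int)

// Row-based layout: all fields of one record are kept together.
val rows: Vector[Record] = Vector(
  Record(1, "a", 10),
  Record(2, "b", 20),
  Record(3, "c", 30)
)

// Column-based layout: all values of one field are kept together.
case class RecordColumns(ids: Vector[Int], names: Vector[String], amounts: Vector[Int])

val columns = RecordColumns(
  rows.map(_.id),
  rows.map(_.name),
  rows.map(_.amount)
)

// Fetching a whole record is a single lookup in the row layout.
val secondRow: Record = rows(1)

// Aggregating a single field only touches one vector in the column layout.
val totalAmount: Int = columns.amounts.sum
```

In this sketch, reading one complete record is cheapest with `rows`, while a sum over a single field only needs `columns.amounts`. That is the usual reason analytical workloads tend to favor column-based layouts.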
...RDD.toDF("column name")
scala> val rdd = sc.parallelize(List(1,2,3,4,5,6))
rdd: org.apache.spark.rdd.RDD[Int]...scala> sc.parallelize(List( (1,"beijing"),(2,"shanghai") ) )
res3: org.apache.spark.rdd.RDD[(Int, String...| 2|shanghai|
+---+--------+
For example, with 3 columns:
scala> sc.parallelize(List( (1,"beijing",100780),(2,"shanghai