Environment: Windows + IntelliJ IDEA + Maven + Scala.
I am learning Spark and am just trying to run the following word-count code:
import org.apache.spark.{SparkConf, SparkContext}

object wordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("WordCount")
    val sc = new SparkContext(conf)
    sc.setLogLevel("WARN")

    // read the input file and split each line into words
    val fileRDD = sc.textFile("datas/words")
    val wordRDD = fileRDD.flatMap(_.split(" "))

    // map each word to (word, 1) and sum the counts per word
    val word2OneRDD = wordRDD.map((_, 1))
    val word2CountRDD = word2OneRDD.reduceByKey(_ + _)
    word2CountRDD.foreach(println)

    val collect = word2CountRDD.collect().toBuffer
    collect.foreach(println)

    // write the result out as a single partition
    word2CountRDD.repartition(1).saveAsTextFile("datas/words/result")
    println("save to datas/words/result")

    sc.stop()
    println("done")
  }
}
And here is my pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>untitled1</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.12</artifactId>
            <version>3.3.0</version>
        </dependency>
    </dependencies>
</project>
My guess is that the problem comes from the Hadoop dependency that Maven pulls in, which is not set up properly on Windows, but I cannot work out how to deal with it.
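In case it clarifies what I mean, below is a minimal sketch of the workaround I have seen suggested for running Spark locally on Windows: pointing hadoop.home.dir at a folder containing bin\winutils.exe before the SparkContext is created. The C:\hadoop path is only a placeholder, not my actual setup.

import org.apache.spark.{SparkConf, SparkContext}

object wordCountWithHadoopHome {
  def main(args: Array[String]): Unit = {
    // Assumption: winutils.exe has been placed in C:\hadoop\bin (placeholder path).
    // Hadoop locates its Windows shell utilities through hadoop.home.dir (or HADOOP_HOME),
    // so the property has to be set before the SparkContext starts.
    System.setProperty("hadoop.home.dir", "C:\\hadoop")

    val conf = new SparkConf().setMaster("local[*]").setAppName("WordCount")
    val sc = new SparkContext(conf)
    sc.setLogLevel("WARN")

    // ... same word-count logic as above ...

    sc.stop()
  }
}

Is setting hadoop.home.dir like this the right direction, or is something else wrong in my setup?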