Contents
1. Install Spark on macOS
Omitted.
2. Install sbt
brew install sbt
3. Write the word count program in Scala
import org.apache.spark.{SparkConf, SparkContext}

object SparkWordCount {
  // Prefix of the output directory; a timestamp is appended at run time.
  val FILE_NAME: String = "word_count_results_"

  def main(args: Array[String]): Unit = {
    if (args.length < 1) {
      println("Usage: SparkWordCount FileName")
      System.exit(1)
    }
    val conf = new SparkConf().setAppName("Spark Exercise: Spark Version Word Count Program")
    val sc = new SparkContext(conf)
    val textFile = sc.textFile(args(0))
    // Split each line on spaces, pair every word with 1, then sum the counts per word.
    val wordCounts = textFile
      .flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .reduceByKey((a, b) => a + b)
    wordCounts.saveAsTextFile(FILE_NAME + System.currentTimeMillis())
    println("Word Count program running results are successfully saved.")
  }
}
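To see what the flatMap / map / reduceByKey chain computes without starting Spark, the same steps can be sketched on an ordinary in-memory Seq. This is only an illustration: `LocalWordCount` and `count` are made-up names, and `groupBy` followed by a sum stands in for `reduceByKey`, which does not exist on plain Scala collections.

```scala
object LocalWordCount {
  // Same split / pair / sum-per-key steps as the Spark job, on a local Seq.
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))  // one element per word
      .map(word => (word, 1))            // pair each word with a count of 1
      .groupBy(_._1)                     // local stand-in for reduceByKey's grouping
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    // Print the counts in alphabetical order of the word.
    count(Seq("a b a", "b c")).toSeq.sortBy(_._1).foreach(println)
  }
}
```

Because the split is on a single space, consecutive spaces in a line produce empty strings that are counted as words, both here and in the Spark version.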
3. The sbt build file
name := "SparkWordCount"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"
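Change the Scala and Spark versions here to match the versions installed on your machine.
4. Directory structure
Only the entries marked with arrows matter; ignore the rest, which is generated by the build.
5. Compile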
sbt package
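The jar that sbt package produces is named after the settings in the build file, which is where the path passed to spark-submit comes from. A rough sketch of that naming for Scala 2.x builds follows; `JarName`, `binaryVersion` and `jarPath` are hypothetical helpers written for illustration, not part of sbt, and the sketch assumes the project name contains only letters and digits.

```scala
object JarName {
  // "2.11.8" -> "2.11": Scala 2.x artifacts are tagged with the binary version.
  def binaryVersion(scalaVersion: String): String =
    scalaVersion.split('.').take(2).mkString(".")

  // Assumed layout: target/scala-<binary>/<lowercased name>_<binary>-<version>.jar
  def jarPath(name: String, version: String, scalaVersion: String): String = {
    val bin = binaryVersion(scalaVersion)
    s"target/scala-$bin/${name.toLowerCase}_$bin-$version.jar"
  }

  def main(args: Array[String]): Unit =
    println(jarPath("SparkWordCount", "1.0", "2.11.8"))
}
```

If you change scalaVersion or version in the build file, the jar path given to spark-submit has to change with it.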
6. Submit to Spark and run
spark-submit --class SparkWordCount ./target/scala-2.11/sparkwordcount_2.11-1.0.jar /usr/local/Cellar/spark-2.3.0/README.md
The file passed in as the argument must exist on your local machine.
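The job writes its results with saveAsTextFile, so the output is a directory whose name starts with word_count_results_, containing part files with one tuple per line, such as (Spark,16). A minimal sketch for reading those lines back and picking the most frequent words could look like this; `ResultLine`, `parse` and `topN` are names invented for the example, and the sample lines in the test are made up rather than real output.

```scala
import scala.util.Try

object ResultLine {
  // Parse one line such as "(Spark,16)". The word itself may contain commas,
  // so the split is done at the last comma.
  def parse(line: String): Option[(String, Int)] = {
    val t = line.trim
    if (t.length < 2 || !t.startsWith("(") || !t.endsWith(")")) None
    else {
      val inner = t.substring(1, t.length - 1)
      val idx = inner.lastIndexOf(',')
      if (idx < 0) None
      else Try(inner.substring(idx + 1).toInt).toOption
        .map(n => (inner.substring(0, idx), n))
    }
  }

  // Keep the n entries with the highest counts; unparsable lines are skipped.
  def topN(lines: Seq[String], n: Int): Seq[(String, Int)] =
    lines.flatMap(l => parse(l)).sortBy { case (_, c) => -c }.take(n)
}
```

The lines can be loaded with scala.io.Source.fromFile on each part file and then passed to topN.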