Installing a Spark Cluster on Windows
On Windows, setting up a Spark cluster is more involved than on other platforms. To do it right, without mistakes and without wasting time, work carefully and go step by step.
1. Open a Command Prompt (Win+R, then type cmd)
2. Change to the Spark installation directory (for example, c:\spark-1.6.0)
3. Start the master server by executing:
./bin/spark-class org.apache.spark.deploy.master.Master
4. Note the HOST:PORT from the URL the master prints (spark://HOST:PORT); this is the address used to connect workers and applications
5. Start one or more worker processes and connect them to the master:
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://IP:PORT
6. Connect an application to the cluster:
./bin/spark-shell --master spark://IP:PORT
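Putting the steps above together, the full sequence looks roughly like this. This is a sketch only: the installation path c:\spark-1.6.0 and the address 192.168.1.10:7077 are placeholders (7077 is Spark's default master port), and the Unix-style ./bin/... paths assume a shell such as Git Bash on Windows — from plain cmd, use backslashes (bin\spark-class ...) instead.

```shell
# Terminal 1: start the master. It logs a URL like spark://192.168.1.10:7077
# and serves a web UI on port 8080 by default.
cd /c/spark-1.6.0
./bin/spark-class org.apache.spark.deploy.master.Master

# Terminal 2: start a worker, pointing it at the URL the master printed.
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://192.168.1.10:7077

# Terminal 3: connect an interactive Spark shell to the cluster.
./bin/spark-shell --master spark://192.168.1.10:7077
```

Each process keeps its terminal occupied, which is why the master, the worker, and the shell each need their own Command Prompt window.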
Run a simple application and view the results
See the Spark Quick Start guide: http://spark.apache.org/docs/latest/quick-start.html
Type the following code into the Spark shell and run it:
scala> val textFile = sc.textFile("c:\\spark-1.6.0\\README.md") // path to the file
scala> textFile.count() // Number of items in this RDD
scala> textFile.first() // First item in this RDD
scala> val linesWithSpark = textFile.filter(line => line.contains("Spark"))
scala> textFile.filter(line => line.contains("Spark")).count() // number of lines containing "Spark"
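The RDD calls above mirror the standard Scala collection API, which is a good way to understand what they compute. As a rough local analogy (no Spark needed — the object name and sample lines below are made up for illustration), the same filter-and-count logic on a plain collection looks like this:

```scala
// Local sketch: plain Scala collections mirror the RDD calls above.
// An RDD distributes this work across the cluster; a Vector does it in-process.
object FilterCountSketch {
  val lines = Vector(
    "Apache Spark",
    "Lightning-fast cluster computing",
    "Run Spark on Windows"
  )
  val count = lines.size                                  // like textFile.count()
  val first = lines.head                                  // like textFile.first()
  val linesWithSpark = lines.filter(_.contains("Spark"))  // like the RDD filter

  def main(args: Array[String]): Unit =
    println(s"$count lines; first = '$first'; ${linesWithSpark.size} contain 'Spark'")
}
```

The difference is where the work happens: on an RDD, filter is recorded lazily and only executed across the workers when an action such as count() is called.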
[Video: the installation and a test application]
[Screenshots: the Spark Master, Spark Job, Spark Environment, and Spark Executors pages]
This topic is not new, but many of us have never run Spark on Windows. So don't worry: practice carefully and go step by step. Remember the commands and how a Spark application connects to a cluster. From the Master web UI you get full information about processes, memory, the environment, executors, and more.
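The web UIs mentioned here are served by the Spark processes themselves. The port numbers below are Spark's standalone-mode defaults (both are configurable, e.g. via SPARK_MASTER_WEBUI_PORT for the master); start opens a URL in the default browser on Windows:

```shell
# Master web UI: cluster-wide view of workers and running applications.
start http://localhost:8080
# Driver/application UI: jobs, stages, storage, environment, executors.
start http://localhost:4040
```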
1. Quick start - Spark-1.6.0: http://spark.apache.org/docs/latest/quick-start.html
2. Spark Standalone Mode: http://spark.apache.org/docs/latest/spark-standalone.html#starting-a-cluster-manually