In this workshop we will explain what Spark is and how it works. We will then cover the basics of running Spark on your own computer and programming it through an IPython notebook. We will write some Python to perform text mining with Spark on a small data set. At the end of the workshop, we will demonstrate the same code running distributed across a cluster on a much larger data set, to show how Spark parallelizes and distributes computation.

Software Configuration