Apache Spark is an open source cluster computing system that makes data analytics fast to write and fast to run. Spark helps tackle big datasets quickly through simple APIs in Python, Java, and Scala. It can handle a range of workloads, including interactive queries, streaming, machine learning, and graph processing.
All the code in the book: https://github.com/databricks/learning-spark
Reasons for this tight integration: