Previous posts have covered using Spark in a Hadoop cluster. Besides running on YARN, Spark can also run in a standalone cluster. This lets you take advantage of Spark in environments where running it as part of YARN doesn’t make sense, such as when accessing data in a Cassandra cluster (which we will get to in a future post).
These instructions will help you get started with running Spark in a standalone cluster.