Setup Spark Locally - Mac
Let us understand how to set up Spark locally on Mac.
Here are the prerequisites to set up Spark locally on Mac.
At least 8 GB of RAM is highly recommended.
Make sure JDK 1.8 is set up.
Make sure you have Python 3. If you do not have it, you can install it using Homebrew.
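The Python and JDK prerequisites above can be checked with a short script. This is only a minimal sketch: the helper names `java_version_banner` and `check_prerequisites` are made up for illustration, and it assumes that `java -version` exits with a non-zero status when no JDK is available. It does not check the amount of RAM.

```python
import subprocess
import sys


def java_version_banner():
    """Return the first line printed by `java -version`, or None if Java is unavailable."""
    try:
        proc = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        # The java executable is missing or did not respond in time.
        return None
    if proc.returncode != 0:
        return None
    # `java -version` usually writes its banner to stderr rather than stdout.
    output = (proc.stderr or proc.stdout).strip()
    return output.splitlines()[0] if output else None


def check_prerequisites():
    """Collect a small report about the Python and Java prerequisites."""
    banner = java_version_banner()
    return {
        "python3": sys.version_info.major >= 3,
        "java_found": banner is not None,
        "java_banner": banner,
    }


if __name__ == "__main__":
    for name, value in check_prerequisites().items():
        print(f"{name}: {value}")
```

For JDK 1.8 the banner normally contains a version string that starts with 1.8, so it is worth reading the printed `java_banner` value rather than relying on `java_found` alone.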
Here are the steps to set up PySpark and validate it.
Create a Python virtual environment -
python3 -m venv spark-venv
Activate the virtual environment -
source spark-venv/bin/activate
pip install pyspark==2.4.6 to install Spark 2.4.6.
pyspark to launch the Spark CLI using Python as the programming language.
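Besides launching the CLI, the installation can be validated from a plain Python script. The following is a minimal sketch, not a definitive check: the function name `spark_smoke_test` and the application name are hypothetical, and the function returns None instead of failing when `pyspark` or a `java` executable cannot be found.

```python
import importlib.util
import shutil

# Sum of the integers 1..100, used as the expected result of the Spark job.
EXPECTED_TOTAL = sum(range(1, 101))


def spark_smoke_test(app_name="local-validation"):
    """Run a tiny local Spark job and return its result, or None if Spark cannot start."""
    if importlib.util.find_spec("pyspark") is None:
        return None  # pyspark is not installed in this environment
    if shutil.which("java") is None:
        return None  # no java executable on PATH, so the JVM cannot be launched

    from pyspark.sql import SparkSession

    # local[*] runs Spark inside this process using all available cores.
    spark = (
        SparkSession.builder.master("local[*]").appName(app_name).getOrCreate()
    )
    try:
        # spark.range(1, 101) creates a single column named id with values 1..100.
        row = spark.range(1, 101).selectExpr("sum(id) AS total").first()
        return row["total"]
    finally:
        spark.stop()


if __name__ == "__main__":
    total = spark_smoke_test()
    if total is None:
        print("Spark could not be started; check the pyspark and JDK setup.")
    else:
        print("Spark job result:", total, "expected:", EXPECTED_TOTAL)
```

Run the script inside the activated virtual environment so that it picks up the `pyspark` package installed there.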
Here are some of the limitations of running Spark locally.
By default, Spark runs in local mode on your Mac, so you will not get the feel of working with Big Data on a cluster.
Actual production implementations run on multinode clusters, which are managed by YARN, Spark Standalone, or Mesos.
You can understand the development process, but you will not be able to explore best practices for building effective large-scale data engineering solutions.
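One visible difference between local mode and the cluster managers listed above is the master URL that an application is started with. The sketch below builds these strings for illustration only. The helper `master_url` is hypothetical and not part of PySpark, and the host name and ports in the usage lines are placeholders.

```python
def master_url(mode, host=None, port=None, cores="*"):
    """Build a Spark master URL string for a few common deployment modes."""
    if mode == "local":
        # local[N] runs Spark in a single process with N threads; * means all cores.
        return f"local[{cores}]"
    if mode == "yarn":
        # For YARN the cluster location comes from the Hadoop configuration.
        return "yarn"
    if mode in ("standalone", "mesos"):
        if host is None or port is None:
            raise ValueError(f"{mode} mode needs both host and port")
        scheme = "spark" if mode == "standalone" else "mesos"
        return f"{scheme}://{host}:{port}"
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(master_url("local"))
    print(master_url("yarn"))
    # master.example.com is a placeholder host, not a real cluster.
    print(master_url("standalone", host="master.example.com", port=7077))
    print(master_url("mesos", host="master.example.com", port=5050))
```

On a Mac set up as described here, only the `local[...]` form can actually be used. The other values need a running cluster.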