Different Spark Modules
Let us understand the different Spark modules. We will focus on the high-level modules available in Spark 2.2 and later.
Here are the different Spark Modules.
Spark Core - RDD and Map Reduce APIs
Spark Data Frames and Spark SQL
Spark Structured Streaming
Spark MLlib (Data Frame based)
As engineers, we need not focus too much on Spark Core libraries to build Data Pipelines. We should focus on Spark Data Frames as well as Spark SQL.