Different Spark Modules

Let us go through the different Spark modules. We will focus on the high-level modules available in Spark 2.2 and later.

  • Here are the different Spark Modules.

    • Spark Core - RDD and Map Reduce APIs

    • Spark Data Frames and Spark SQL

    • Spark Structured Streaming

    • Spark MLlib (Data Frame based)

  • As engineers, we need not focus too much on the Spark Core libraries to build Data Pipelines. We should focus on Spark Data Frames and Spark SQL instead.