Overview of Spark Metastore

Let us get an overview of Spark Metastore and how we can leverage it to manage databases and tables on top of Big Data file systems such as HDFS, S3, etc.

  • Quite often we need to deal with structured data, and the most popular way of processing structured data is by using databases, tables, and SQL.

  • Spark Metastore (similar to Hive Metastore) lets us manage databases and tables.

  • Typically, the Metastore is set up using a traditional relational database such as Oracle, MySQL, or Postgres.
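
To make the idea of a Metastore backed by a relational database more concrete, here is a toy sketch in Python. It uses an in-memory SQLite database as a stand-in for Oracle, MySQL, or Postgres. The table names (`dbs`, `tbls`), the columns, the helper functions, and the sample paths are all hypothetical and chosen only for illustration; this is not the actual Hive or Spark Metastore schema. The point it tries to show is that the Metastore holds only metadata (database names, table names, storage locations), while the data itself stays in files on HDFS, S3, etc.

```python
import sqlite3

# In-memory SQLite stands in for the relational database behind a metastore.
# Assumption: this schema is a simplified illustration, not the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dbs (
        db_id INTEGER PRIMARY KEY,
        name  TEXT UNIQUE NOT NULL
    );
    CREATE TABLE tbls (
        tbl_id   INTEGER PRIMARY KEY,
        db_id    INTEGER NOT NULL REFERENCES dbs(db_id),
        name     TEXT NOT NULL,
        location TEXT NOT NULL,  -- where the data files live
        UNIQUE (db_id, name)
    );
""")


def create_database(name):
    """Register a database name in the toy metastore."""
    conn.execute("INSERT INTO dbs (name) VALUES (?)", (name,))


def create_table(db_name, tbl_name, location):
    """Register a table and the location of its files; no data is copied."""
    row = conn.execute(
        "SELECT db_id FROM dbs WHERE name = ?", (db_name,)
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown database: {db_name}")
    conn.execute(
        "INSERT INTO tbls (db_id, name, location) VALUES (?, ?, ?)",
        (row[0], tbl_name, location),
    )


def list_tables(db_name):
    """Return (table name, location) pairs for a database, sorted by name."""
    return conn.execute(
        "SELECT t.name, t.location FROM tbls t "
        "JOIN dbs d ON t.db_id = d.db_id "
        "WHERE d.name = ? ORDER BY t.name",
        (db_name,),
    ).fetchall()


# Hypothetical database, tables, and paths.
create_database("retail_demo")
create_table("retail_demo", "orders", "hdfs:///warehouse/retail_demo.db/orders")
create_table("retail_demo", "customers", "s3://example-bucket/retail_demo/customers")

for name, location in list_tables("retail_demo"):
    print(name, location)
```

In a real Spark session we would not query the backing database directly. Metadata is normally accessed through Spark SQL commands or the catalog API (for example `spark.catalog.listDatabases()` and `spark.catalog.listTables()`), and Spark takes care of talking to the Metastore.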