Sharing Databricks Hive Metastore

Prosenjit Chakraborty
Jul 21, 2021

A Databricks workspace contains an internal Hive metastore, accessible by all of its clusters, to persist table metadata. However, instead of using its own metastore, Databricks can also connect to an external Hive metastore.

An external Hive metastore can be connected to either through its Thrift service or directly through the underlying metastore database. In both cases, the entries below go into the cluster's Configuration > Advanced Options > Spark > Spark Config.

Spark config property to connect via the Thrift service:

spark.hadoop.hive.metastore.uris thrift://<hive-thrift-server-connection-url>:<thrift-server-port>
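
In addition to the connection property, the cluster generally needs a Hive metastore client that matches the shared metastore's version; a minimal sketch, assuming the shared metastore runs Hive 2.3.x (the version value is a placeholder — for other versions, adjust it and point spark.sql.hive.metastore.jars at matching jars, e.g. via maven):

spark.sql.hive.metastore.version 2.3.7
spark.sql.hive.metastore.jars builtin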

Spark config properties to connect directly to the metastore database:

spark.hadoop.javax.jdo.option.ConnectionURL <hive-metastore-db-jdbc-connection-string>
spark.hadoop.javax.jdo.option.ConnectionDriverName <hive-metastore-db-jdbc-driver-class>
spark.hadoop.javax.jdo.option.ConnectionUserName {{secrets/<my-secret-scope>/<hive-conn-userid-key-name>}}
spark.hadoop.javax.jdo.option.ConnectionPassword {{secrets/<my-secret-scope>/<hive-conn-pass-key-name>}}
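
As a concrete illustration, assuming the metastore database is an Azure SQL Database named hivemetastore, with credentials kept in a secret scope (the server, database, scope, and key names below are all placeholders):

spark.hadoop.javax.jdo.option.ConnectionURL jdbc:sqlserver://my-metastore-server.database.windows.net:1433;database=hivemetastore
spark.hadoop.javax.jdo.option.ConnectionDriverName com.microsoft.sqlserver.jdbc.SQLServerDriver
spark.hadoop.javax.jdo.option.ConnectionUserName {{secrets/hive-scope/hive-user}}
spark.hadoop.javax.jdo.option.ConnectionPassword {{secrets/hive-scope/hive-password}}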

If we also want to read data from ADLS Gen2, we can append storage-account credentials to the Spark config.
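
A minimal sketch, assuming OAuth with an Azure AD service principal whose secret lives in a Databricks secret scope (the storage account, application ID, tenant ID, and secret names are placeholders):

spark.hadoop.fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net OAuth
spark.hadoop.fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
spark.hadoop.fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net <application-id>
spark.hadoop.fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net {{secrets/<my-secret-scope>/<sp-secret-key-name>}}
spark.hadoop.fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net https://login.microsoftonline.com/<tenant-id>/oauth2/token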
