Azure Databricks with Azure Key Vaults
Why?
When connecting to external storage or systems from Databricks, we need to provide user IDs, passwords, or access keys. These secrets sit in the notebook in clear text, so anyone with access to the Databricks workspace can see them!
A few examples below:
1. Connection setting to Azure Blob Storage
%scala
spark.conf.set("fs.azure.account.key.<storage_account>.blob.core.windows.net", "<storage_account_access_key in clear text>")
2. Connection setting to Azure SQL DW
%scala
val df = spark.read
.format("com.databricks.spark.sqldw")
.option("url", "jdbc:sqlserver://<server-name>:1433;database=<database_name>;user=<user>;password=<password in clear text>;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;")
.option("tempdir", "wasbs://<container>@<storage_account>.blob.core.windows.net/<container>")
.option("forward_spark_azure_storage_credentials", "true")
.option("query", "SELECT * FROM MyTable WHERE PrimaryKey = 123456")
.load()
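For contrast, the same connections can fetch their secrets at runtime with `dbutils.secrets.get`, so no credential ever appears in the notebook. A minimal sketch, assuming a Key Vault-backed secret scope named `my-kv-scope` already exists and holds secrets named `storage-account-key` and `sqldw-password` (both names are illustrative, not from the original):

%scala
// Assumed scope and secret names — create the Key Vault-backed scope
// and store the secrets in Azure Key Vault before running this.
val storageKey = dbutils.secrets.get(scope = "my-kv-scope", key = "storage-account-key")
val sqlDwPassword = dbutils.secrets.get(scope = "my-kv-scope", key = "sqldw-password")

// Blob Storage: same setting as example 1, but the key is resolved at runtime
spark.conf.set("fs.azure.account.key.<storage_account>.blob.core.windows.net", storageKey)

// SQL DW: build the JDBC URL from the fetched password instead of hard-coding it
val jdbcUrl = s"jdbc:sqlserver://<server-name>:1433;database=<database_name>;user=<user>;password=$sqlDwPassword;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"

If anyone tries to print such a value, Databricks redacts it in the notebook output (it shows as [REDACTED]), which is exactly the protection the clear-text examples above lack.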