Azure Databricks with Azure Key Vaults

Prosenjit Chakraborty
4 min read · Dec 5, 2018


Why?

While connecting from Databricks to any output storage or system, we need to provide user IDs/passwords or access keys. These secrets sit in the notebooks in clear text, and anyone who has access to the Databricks workspace can see them!

We can’t restrict a user from viewing a particular notebook if she/he has access to the workspace.

A few examples below:

1. Connection setting to Azure Blob Storage

%scala
spark.conf.set("fs.azure.account.key.<storage_account>.blob.core.windows.net", "<storage_account_access_key in clear text>")
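With a Key Vault-backed secret scope, the same configuration can be set without the key ever appearing in the notebook. A minimal sketch, assuming a secret scope named `key-vault-secrets` containing a secret named `storage-account-key` (both names are hypothetical):

```scala
%scala
// Fetch the storage account key from the Key Vault-backed secret scope
// instead of pasting it into the notebook in clear text.
// Scope and secret names below are assumptions for illustration.
val storageAccountKey = dbutils.secrets.get(scope = "key-vault-secrets", key = "storage-account-key")

spark.conf.set(
  "fs.azure.account.key.<storage_account>.blob.core.windows.net",
  storageAccountKey)
```

Anyone reading the notebook now sees only the scope and secret names; if the value is printed in a notebook cell, Databricks shows it as [REDACTED].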

2. Connection setting to Azure SQL DW

%scala
val df = spark.read
  .format("com.databricks.spark.sqldw")
  .option("url", "jdbc:sqlserver://<server-name>:1433;database=<database_name>;user=<user>;password=<password in clear text>;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;")
  .option("tempdir", "wasbs://<container>@<storage_account>.blob.core.windows.net/<container>")
  .option("forward_spark_azure_storage_credentials", "true")
  .option("query", "SELECT * FROM MyTable WHERE PrimaryKey = 123456")
  .load()
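The JDBC password can be pulled from the same secret scope and spliced into the connection string, so it never lives in the notebook. A sketch, again assuming a hypothetical scope `key-vault-secrets` and secret `sqldw-password`:

```scala
%scala
// Retrieve the SQL DW password from the secret scope rather than
// embedding it in the JDBC URL. Scope/secret names are assumptions.
val sqlDwPassword = dbutils.secrets.get(scope = "key-vault-secrets", key = "sqldw-password")

val df = spark.read
  .format("com.databricks.spark.sqldw")
  // Build the URL with string interpolation so only the variable,
  // not the literal password, appears in the notebook source.
  .option("url", s"jdbc:sqlserver://<server-name>:1433;database=<database_name>;user=<user>;password=$sqlDwPassword;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;")
  .option("tempdir", "wasbs://<container>@<storage_account>.blob.core.windows.net/<container>")
  .option("forward_spark_azure_storage_credentials", "true")
  .option("query", "SELECT * FROM MyTable WHERE PrimaryKey = 123456")
  .load()
```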
