Azure Databricks User Token Management — building a token expiry notification and auto-rotation system

Prosenjit Chakraborty
6 min read · Mar 8, 2019

Azure Databricks can be connected to in several ways, and all of them need a valid Databricks user token to authenticate and invoke jobs. Using a user token is very straightforward; maintaining it, however, takes some extra effort as of now.

Below are the three most popular ways to connect to Databricks:

1. Azure Data Factory (ADF) v2 — Linked Services

First we need to create an ADF-to-Databricks linked service; the user token is added there as the access token.

Data Factory > your factory name > Connections > Select Access token
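For reference, here is a minimal sketch of what such a linked service definition looks like as JSON; the name, domain and cluster id are placeholders, and the user token sits in typeProperties as a SecureString:

{
    "name": "AzureDatabricksLinkedService",
    "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": "https://<region>.azuredatabricks.net",
            "accessToken": {
                "type": "SecureString",
                "value": "<valid user token>"
            },
            "existingClusterId": "<cluster-id>"
        }
    }
}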

2. Azure Databricks Rest API calls

A REST POST call carries an Authorization header, which needs the user token:

Authorization = Bearer <valid user token>
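For example, here is a minimal Python sketch that lists clusters through the REST API; the host and token are placeholders:

import requests

HOST = "https://<region>.azuredatabricks.net"  # placeholder workspace URL
TOKEN = "<valid user token>"                   # placeholder user token

# Every call carries the user token in the Authorization header.
response = requests.get(
    HOST + "/api/2.0/clusters/list",
    headers={"Authorization": "Bearer " + TOKEN},
)
response.raise_for_status()
print(response.json())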

3. Using JDBC-ODBC driver

The JDBC-Hive connection string contains the user token; see the Databricks JDBC/ODBC documentation for details. Note that this option is available in the Azure Databricks Premium tier only.
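The connection string looks roughly like this (region, workspace id and cluster id are placeholders); the UID is the literal word "token" and the user token itself goes into PWD:

jdbc:hive2://<region>.azuredatabricks.net:443/default;transportMode=http;ssl=true;httpPath=sql/protocolv1/o/<workspace-id>/<cluster-id>;AuthMech=3;UID=token;PWD=<valid user token>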

Since a user token is tied to a specific 'user', the real problem arises when that user gets deactivated (e.g. leaves the organization or is blocked) or the token expires. (We could make a token 'never expiring', but that is not a good practice!)

We would then have to identify every client application (ADF, REST client, JDBC-Hive client) and update each one with the new token. This task could be very difficult and time consuming if a lot of ADF pipelines and REST clients are using the expired token. After updating them with the new token, we would also need to take care of the jobs that failed in the meantime 😟.

A better approach would be to keep the user token in Azure Key Vault (as a secret value) and use the secret name to retrieve it. When a new user token is generated, only the Key Vault secret value needs to be updated (manually, for now), and all of the Databricks clients using the secret would get the new token automatically on their next read.
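As a rough sketch of where this can go (and of the auto-rotation idea from the title), the Python below reads the current token from Key Vault, mints a replacement via the Databricks Token API, and publishes it back as the latest secret version. It uses the azure-identity and azure-keyvault-secrets packages; the vault URL, secret name and host are assumptions:

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
import requests

VAULT_URL = "https://<your-vault>.vault.azure.net"  # placeholder vault URL
SECRET_NAME = "databricks-user-token"               # assumed secret name
HOST = "https://<region>.azuredatabricks.net"       # placeholder workspace

# Clients read the token from Key Vault instead of hard-coding it.
kv = SecretClient(vault_url=VAULT_URL, credential=DefaultAzureCredential())
current_token = kv.get_secret(SECRET_NAME).value

# Rotation: mint a replacement token with a bounded lifetime
# (90 days here) using the Databricks Token API ...
resp = requests.post(
    HOST + "/api/2.0/token/create",
    headers={"Authorization": "Bearer " + current_token},
    json={"lifetime_seconds": 90 * 24 * 3600, "comment": "auto-rotated"},
)
resp.raise_for_status()
new_token = resp.json()["token_value"]

# ... and publish it as the latest secret version; clients pick it up
# on their next read. The old token can later be revoked through
# /api/2.0/token/delete once every client has switched over.
kv.set_secret(SECRET_NAME, new_token)

Running something like this on a schedule (e.g. from an Azure Function or an ADF pipeline) before the token's expiry date is what turns manual secret updates into the notification-and-rotation system this article is aiming at.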
