Azure Databricks User Token Management: Towards a Key Expiry Notification and Auto-Rotation System

Azure Databricks can be connected to in several ways, and all of them need a valid Databricks user token to connect and invoke jobs. Using a user token is very straightforward; maintaining the token, however, takes some extra effort as of now.


Below are the three most popular ways to connect to Databricks:

1. Azure Data Factory (ADF) v2 — Linked Services

First, we need to create an ADF-Databricks Linked Service, where we can add the user token used to connect.

Data Factory > your factory name > Connections > Select Access token

2. Azure Databricks REST API calls

The REST POST call carries an Authorization header, which needs the user token:

Authorization = Bearer <valid user token>
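For illustration, a minimal Python sketch of such a call against the Jobs run-now endpoint; the workspace URL and job ID are placeholders, and the token is assumed to be available in an environment variable:

```python
# Minimal sketch: invoking a Databricks job via the REST API with a
# bearer token. Workspace URL and job ID are placeholders.
import os
import requests

DATABRICKS_URL = "https://<your-region>.azuredatabricks.net"  # placeholder
TOKEN = os.environ["DATABRICKS_TOKEN"]  # the user (personal access) token

response = requests.post(
    f"{DATABRICKS_URL}/api/2.0/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},  # Authorization = Bearer <valid user token>
    json={"job_id": 42},  # placeholder job id
)
response.raise_for_status()
print(response.json())  # e.g. {"run_id": ...}
```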

3. Using JDBC-ODBC driver

The JDBC-Hive connection string contains the user token; the Databricks JDBC/ODBC documentation covers the details. Note that this option is available in the Azure Databricks Premium tier only.
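As a sketch, the connection string can be assembled like this; the host and httpPath values are placeholders, and the exact parameter set depends on your driver version:

```python
# Sketch: assembling the JDBC-Hive URL for Azure Databricks.
# Host, org id and cluster id are placeholders.
import os

host = "<your-region>.azuredatabricks.net"             # placeholder
http_path = "sql/protocolv1/o/<org-id>/<cluster-id>"   # placeholder

jdbc_url = (
    f"jdbc:hive2://{host}:443/default;"
    "transportMode=http;ssl=true;"
    f"httpPath={http_path}"
)

# The driver authenticates with user "token" and the Databricks user
# token as the password (e.g. passed as JDBC connection properties).
user = "token"
password = os.environ["DATABRICKS_TOKEN"]  # the user token
```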

As a user token is tied to a ‘user’, the real problem arises if that user gets deactivated (e.g. leaves the organization or is blocked) or the token expires (we could make a token ‘never expiring’, but that’s not good practice)!

We would then have to identify all the client applications (ADF / REST clients / JDBC-Hive clients) and update them with a new token. This can be very difficult and time consuming if many ADF pipelines or REST clients are using the expired token, and after updating the token we would also need to take care of the failed jobs 😟.

A better approach is to keep the user token in Azure Key Vault (as a Secret value) and use the Secret name to retrieve it. When a new user token is generated, only the Azure Key Vault secret value needs to be updated, and all Databricks clients using the secret pick up the latest token without any further intervention.

Next, we’ll see how we can do that.

1. Azure Data Factory (ADF) v2 — Linked Services

Data Factory > your factory name > Connections > Select Azure Key Vault

Instead of hard-coding the Databricks user token, we can store it in Azure Key Vault as a Secret and reference that from the Data Factory Linked Service.
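For illustration, such a Linked Service can be deployed through the ADF management REST API. Below is a minimal Python sketch; the subscription, resource group, factory, linked-service names, secret name "databricks-token" and the ARM token are all placeholder assumptions:

```python
# Minimal sketch: a Databricks Linked Service whose accessToken is an
# AzureKeyVaultSecret reference instead of a hard-coded token.
import requests

definition = {
    "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": "https://<your-region>.azuredatabricks.net",
            "accessToken": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "AzureKeyVaultLS",  # the Key Vault linked service (assumed name)
                    "type": "LinkedServiceReference",
                },
                "secretName": "databricks-token",  # the secret holding the user token (assumed name)
            },
            "existingClusterId": "<cluster-id>",  # placeholder
        },
    }
}

url = (
    "https://management.azure.com/subscriptions/<sub-id>"
    "/resourceGroups/<rg>/providers/Microsoft.DataFactory"
    "/factories/<factory>/linkedservices/DatabricksLS"
    "?api-version=2018-06-01"
)
requests.put(url, json=definition, headers={"Authorization": "Bearer <ARM token>"})
```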

2/3. REST API calls / Using the JDBC-ODBC driver

High-level flow to retrieve the Databricks user token dynamically from Azure Key Vault

Here, we store the Databricks user token in Azure Key Vault and retrieve it each time before calling the Databricks REST API or constructing the JDBC-Hive connection string.
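A minimal Python sketch of this retrieval, assuming a Service Principal that has been granted secret read access on the vault; all IDs, the vault name and the secret name "databricks-token" are placeholders:

```python
import requests

TENANT_ID = "<tenant-id>"          # placeholder
CLIENT_ID = "<sp-application-id>"  # placeholder
CLIENT_SECRET = "<sp-key>"         # placeholder
VAULT = "https://<your-key-vault>.vault.azure.net"

# Get an OAuth2 access token for Key Vault using the Service Principal
# (client credentials grant). The response carries "expires_in": 3600.
aad = requests.post(
    f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/token",
    data={
        "grant_type": "client_credentials",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "resource": "https://vault.azure.net",
    },
).json()

# Read the Databricks user token from the Key Vault secret.
secret = requests.get(
    f"{VAULT}/secrets/databricks-token?api-version=7.0",
    headers={"Authorization": f"Bearer {aad['access_token']}"},
).json()
databricks_token = secret["value"]
```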

A few points to note:

(i) The OAuth2 token received in Step 4 lives for only an hour ("expires_in": 3600).

expires_in = 3600

(ii) The Service Principal key, the Azure Key Vault secret and the Databricks user token should all have expiry dates, generally aligned with the organization’s password expiration policy.

Azure Active Directory > App registrations > select the app > Settings > Keys
Key Vaults > your key vault name > Secrets > Create a secret
Azure Databricks > select the Account icon > User Settings > Access Tokens

(iii) In Step 3, the Service Principal should have only the required access to the Azure Key Vault.

Your Key Vault > Access policies > Add new > Add access policy > Secret permissions

User Token Refreshment

With the above approach, if we now need to update the user token (because the user has been decommissioned or blocked), a new user token from a separate user can be stored as a new Secret version. Any further REST calls (Step 5 in the flow diagram above) will fetch the new token, and ADFv2 pipeline invocations will likewise refer to the updated Azure Key Vault Secret value. So we no longer need to manually update the user token in any client code or configuration.

Your Key Vault > Secrets > your secret > New Version

Now there could be two scenarios:

(i) At the client end, we can store the user token in a local database and refresh it whenever an authentication exception / invalid access token exception is received (see the sketch after these points). We need to be extra cautious when storing the token locally.

(ii) Otherwise, we can retrieve the token from Key Vault every time a Databricks REST API needs to be invoked (without storing it at the client side).

Note that the OAuth2 token lives for only an hour ("expires_in": 3600), so we may need to refresh it again before refreshing the Databricks user token.
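A minimal sketch of scenario (i), assuming a hypothetical fetch_from_key_vault() helper that wraps the Key Vault retrieval shown earlier:

```python
# Sketch: cache the Databricks token locally and refresh it from
# Key Vault only when a call is rejected.
import requests

DATABRICKS_URL = "https://<your-region>.azuredatabricks.net"  # placeholder

_cached_token = None

def databricks_call(path, payload):
    global _cached_token
    if _cached_token is None:
        _cached_token = fetch_from_key_vault()  # hypothetical helper (see earlier sketch)
    resp = requests.post(
        f"{DATABRICKS_URL}{path}",
        headers={"Authorization": f"Bearer {_cached_token}"},
        json=payload,
    )
    if resp.status_code in (401, 403):  # expired / invalid access token
        _cached_token = fetch_from_key_vault()  # refresh once and retry
        resp = requests.post(
            f"{DATABRICKS_URL}{path}",
            headers={"Authorization": f"Bearer {_cached_token}"},
            json=payload,
        )
    return resp
```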

Key / Secret / User Token Expiration

Instead of managing only the Databricks user token expiration, I now have to manage the expiration of three keys (Service Principal, Azure KV and Databricks) 😮!

As of now, Azure does not send any alert when any of these is about to expire! So we need a solution to update the keys/secrets before they expire.

A simple manual solution would be to maintain an offline list of the keys/secrets with their expiry dates, review it regularly and change or create a new one before expiry. But that goes against the world of automation! Besides, manual key expiry tracking and rotation becomes a daunting task as we adopt more cloud services.

To overcome that (until an ‘off-the-shelf’ cloud service becomes available), we can use the existing REST APIs or Azure PowerShell commands to renew the keys/secrets.

Below are a few of the ways we have:

Databricks User Token Expiration:

High-level flow diagram to monitor the Databricks user token expiry, create a new token and update the Azure Key Vault secret, so clients using the KV secret can use the latest token seamlessly

Updating the Azure Key Vault secret requires the secrets/set permission.
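A minimal Python sketch of this rotation, using the Databricks Token API (token/list, token/create) and the Key Vault set-secret call; the warning window, token lifetime and all names are placeholder assumptions:

```python
import time
import requests

DB_URL = "https://<your-region>.azuredatabricks.net"   # placeholder
VAULT = "https://<your-key-vault>.vault.azure.net"     # placeholder

def rotate_if_needed(db_token, kv_bearer, days_left=7):
    # List the workspace's tokens; expiry_time is in epoch milliseconds
    # (and -1 for never-expiring tokens, which we skip).
    tokens = requests.get(
        f"{DB_URL}/api/2.0/token/list",
        headers={"Authorization": f"Bearer {db_token}"},
    ).json()["token_infos"]
    soon = time.time() * 1000 + days_left * 86400 * 1000
    if any(0 < t["expiry_time"] < soon for t in tokens):
        # Create a replacement token (assumed 90-day lifetime).
        new = requests.post(
            f"{DB_URL}/api/2.0/token/create",
            headers={"Authorization": f"Bearer {db_token}"},
            json={"lifetime_seconds": 90 * 86400, "comment": "auto-rotated"},
        ).json()
        # Push it into Key Vault as a new secret version
        # (needs the secrets/set permission).
        requests.put(
            f"{VAULT}/secrets/databricks-token?api-version=7.0",
            headers={"Authorization": f"Bearer {kv_bearer}"},
            json={"value": new["token_value"]},
        )
```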


Azure Key Vault Secret Expiration:

High-level flow diagram to monitor Azure KV secret expiry and update if required, so clients can use the secrets seamlessly

Listing and updating the Azure Key Vault secrets requires the secrets/list and secrets/set permissions.
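A minimal sketch of the monitoring half: list the vault’s secrets and flag any whose exp attribute falls inside a warning window (the vault name and window are assumptions; nextLink pagination is omitted for brevity):

```python
import time
import requests

VAULT = "https://<your-key-vault>.vault.azure.net"  # placeholder

def secrets_expiring_soon(kv_bearer, days_left=7):
    # Needs the secrets/list permission on the vault.
    resp = requests.get(
        f"{VAULT}/secrets?api-version=7.0",
        headers={"Authorization": f"Bearer {kv_bearer}"},
    ).json()
    cutoff = time.time() + days_left * 86400  # "exp" is a Unix timestamp
    return [
        s["id"]
        for s in resp.get("value", [])
        if s.get("attributes", {}).get("exp") and s["attributes"]["exp"] < cutoff
    ]
```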


Service Principal Key Expiration:

Currently there is an open bug that throws an error while trying to create a new Service Principal key.

I have yet to find up-to-date REST API documentation for Service Principal key creation, though the underlying REST call details can be inspected by appending the --debug global parameter to az ad app credential reset.


Expiry Notification

Though the above approaches can rotate the keys/secrets/tokens without much human interaction, we also need a notification system to alert the appropriate support group after an auto-rotation completes. There are some good examples of this available online.

In a similar way, notifications can also be sent if keys are about to expire.
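As a sketch, the notification can be as simple as posting to a webhook endpoint (e.g. a Logic App or Teams incoming webhook); the URL and payload shape below are placeholder assumptions:

```python
import requests

WEBHOOK_URL = "https://<your-webhook-endpoint>"  # placeholder

def notify(subject, detail):
    # Post a simple JSON message to the support group's webhook.
    requests.post(WEBHOOK_URL, json={"subject": subject, "detail": detail})

notify("Databricks token rotated",
       "Key Vault secret 'databricks-token' updated with a new version.")
```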

Solving the Misleading Identity Problem

Databricks user tokens are created by a user, so all Databricks job invocation logs will show that user’s ID as the job invoker. This can create confusion.

As of now, there is no option to integrate an Azure Service Principal with Databricks as a system ‘user’.

As a workaround, we can create a dummy ‘user’ account with a valid email ID and add it to the Azure Active Directory tenant.

Finally, we shouldn’t create unnecessary keys/secrets, and we shouldn’t auto-rotate everything indiscriminately. Check which keys/secrets are set to expire, and only consider the ones that are really required and could take down the system if they expired.
