Improving Resiliency with Databricks Delta Lake & Azure

Prosenjit Chakraborty
6 min readApr 22, 2020

Resiliency is one of the most important aspects to consider when building a data lake. Azure Storage provides several features that improve resiliency, and on top of these, Databricks Delta Lake adds a useful feature called time travel that makes the lake more resilient and easier to recover.

In this blog, we’ll discuss a few features that help protect our data from corruption or deletion and make it easy to restore in case of any issues.

Right Access Permission

The first thing to consider is providing the right access. Only the resource administrator should have Owner access, developers should have Read access, and applications can have Contributor access. This way, data can only be deleted by the resource administrator or by a process, e.g. by Databricks or by Azure Data Factory pipelines.
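As a sketch, role assignments like these can be made with the Azure CLI. The scope, group IDs, and names below are placeholders, and the specific built-in roles (Storage Blob Data Reader/Contributor) are one reasonable choice, not the only one:

```shell
# Placeholder scope pointing at the data lake's storage account
SCOPE="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>"

# Developers: read-only access to the data
az role assignment create \
  --assignee "<developers-group-object-id>" \
  --role "Storage Blob Data Reader" \
  --scope "$SCOPE"

# Applications (e.g. Databricks or ADF service principals): read/write on blob data
az role assignment create \
  --assignee "<app-service-principal-id>" \
  --role "Storage Blob Data Contributor" \
  --scope "$SCOPE"
```

Note that the blob *data* roles are separate from the management-plane Owner/Contributor roles, so an application can write data without being able to delete the storage account itself.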

Accidental Delete Protection

To avoid accidental deletion, we should always add a delete lock on our data lake.

Adding a ‘Delete’ lock on the Storage Account.

If someone tries to delete it by mistake, they’ll get a prompt to remove the lock first!
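The same lock can be created from the Azure CLI instead of the portal; a minimal sketch, with the resource group, account, and lock names as placeholders:

```shell
# Add a CanNotDelete lock on the storage account so it cannot be
# deleted until the lock is explicitly removed.
az lock create \
  --name "datalake-delete-lock" \
  --lock-type CanNotDelete \
  --resource-group "<rg>" \
  --resource-name "<account>" \
  --resource-type "Microsoft.Storage/storageAccounts"
```

With the lock in place, any delete attempt (portal, CLI, or ARM) fails until someone with permission removes the lock, which adds a deliberate extra step before destructive actions.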
