MLflow & Azure Databricks
Databricks is one of the top choices among data scientists for running their ML code. To help them manage their code and models, MLflow has been integrated with Databricks.
MLflow is an open source platform for managing the end-to-end machine learning lifecycle. Azure Databricks provides a fully managed and hosted version of MLflow, integrated with enterprise security features, high availability, and other Azure Databricks workspace features.
The components of MLflow are shown below, along with a few other important components that have been added alongside them.
In this blog, we’ll cover a few of these components: tracking, models, and model registry.
Prerequisite
I’m using Azure Databricks Runtime for Machine Learning, specifically 8.3 ML Beta, throughout this blog.
Data Preparation
We use the Pima Indians Diabetes dataset (download it from here; for details, refer here).
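# `spark` is the SparkSession that Databricks notebooks provide by default
df = spark.table('pima_indians_diabetes')
print(f"There are {df.count()} records in the dataset.")
# Preview the first five rows
df.show(5)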