MLflow
Updated at 2018-07-19 23:07
MLflow is an open-source machine learning platform by Databricks that you can host yourself. It aims to manage three stages of the machine learning lifecycle: 1) data preparation, 2) model training, and 3) model deployment. Alternatively, you can pay Databricks to host the system for you.
MLflow...
- is designed to work with any machine learning tool.
- is built around a REST API.
- has three major components: tracking, projects, and models.
MLflow Tracking
import mlflow

# Record the hyperparameters of the current run as key-value pairs.
mlflow.log_param("num_dimensions", 8)
mlflow.log_param("regularization", 0.1)
MLflow Projects
name: My Project
conda_env: conda.yaml
entry_points:
  main:
    parameters:
      data_file: path
      regularization: {type: float, default: 0.1}
    command: "python train.py -r {regularization} {data_file}"
  validate:
    parameters:
      data_file: path
    command: "python validate.py {data_file}"
mlflow run example/project -P alpha=0.5
mlflow run git@github.com:user/repository.git -P alpha=0.5
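To make the relationship between the declared parameters and the command template concrete, here is a minimal Python sketch of the substitution step for the main entry point shown above. It is not MLflow's own code: the `entry_point` dict and the `build_command` helper are hypothetical names used only for illustration, and the sketch assumes that values passed with -P override declared defaults and that parameters without a default are required.

```python
# Hypothetical illustration of parameter substitution; not MLflow's implementation.
# The dict mirrors the "main" entry point of the project file above.
entry_point = {
    "parameters": {
        "data_file": "path",
        "regularization": {"type": "float", "default": 0.1},
    },
    "command": "python train.py -r {regularization} {data_file}",
}


def build_command(entry_point, user_params):
    """Fill the command template with user-supplied values and declared defaults."""
    values = {}
    for name, spec in entry_point["parameters"].items():
        if name in user_params:
            # A value given on the command line (-P name=value) wins.
            values[name] = user_params[name]
        elif isinstance(spec, dict) and "default" in spec:
            # Otherwise fall back to the default declared in the project file.
            values[name] = spec["default"]
        else:
            raise ValueError(f"missing required parameter: {name}")
    return entry_point["command"].format(**values)


print(build_command(entry_point, {"data_file": "data.csv"}))
```

In this sketch, omitting data_file raises an error because the project file declares no default for it, while regularization falls back to 0.1.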
MLflow Models
Used to package a machine learning model in a format that can carry multiple representations of the same model, called flavors, so that different tools can load it.
time_created: 2018-02-21T13:21:34.12
flavors:
  sklearn:
    sklearn_version: 0.19.1
    pickled_model: model.pkl
  python_function:
    loader_module: mlflow.sklearn
    pickled_model: model.pkl
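The point of listing several flavors is that a tool can use whichever one it understands. The following Python sketch shows that idea under simple assumptions: the metadata above is written out as a plain dict (to avoid a YAML dependency), and `pick_flavor` is a hypothetical helper, not part of the MLflow API.

```python
# The model metadata from above, written as a Python dict for illustration.
mlmodel = {
    "time_created": "2018-02-21T13:21:34.12",
    "flavors": {
        "sklearn": {"sklearn_version": "0.19.1", "pickled_model": "model.pkl"},
        "python_function": {
            "loader_module": "mlflow.sklearn",
            "pickled_model": "model.pkl",
        },
    },
}


def pick_flavor(mlmodel, supported):
    """Return the first flavor in `supported` that the model provides, or None."""
    for name in supported:
        if name in mlmodel["flavors"]:
            return name
    return None


# A tool that only knows generic Python functions still finds a usable flavor.
print(pick_flavor(mlmodel, ["tensorflow", "python_function"]))
```

A scikit-learn-aware tool would list sklearn first and get the native flavor, while a tool that supports neither flavor gets None and can report that it cannot load the model.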