ruk·si

RiseML

Updated at 2018-06-28 12:58

RiseML is a self-hosted data science platform.

You install RiseML on your own Kubernetes cluster. If you don't have one, you need to build one from scratch or from templates. You log in to a specific API server with riseml login.

RiseML provides a higher-level abstraction than Kubeflow, closer to Google Cloud ML. It is still vastly different, though: for example, you have to host RiseML yourself and set up the Kubernetes cluster.

RiseML uses Kubernetes because many companies already run it. RiseML used Mesos before, but currently only runs Kubernetes Jobs.

RiseML concepts:

RiseML Frontend
	> RiseML Backend
		> Experiment
			> Job
				> Container
			> Job
				> Container
		> Persistent Storage via NFS
		> Kubernetes
			> Node
				> GPU
				> GPU
			> Node
				> GPU
				> GPU
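
The hierarchy above can be read as a small data model. The following is only an illustrative sketch: the class and field names are hypothetical and are not taken from RiseML's code or API.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical names, used only to mirror the outline above.
@dataclass
class Container:
    image: str


@dataclass
class Job:
    name: str
    containers: List[Container] = field(default_factory=list)


@dataclass
class Experiment:
    name: str
    jobs: List[Job] = field(default_factory=list)


@dataclass
class Node:
    name: str
    gpus: int = 0


@dataclass
class Cluster:
    nodes: List[Node] = field(default_factory=list)

    def total_gpus(self) -> int:
        # Sum the GPUs over all nodes in the cluster.
        return sum(node.gpus for node in self.nodes)


# Two nodes with two GPUs each, as in the outline.
cluster = Cluster(nodes=[Node("node-1", gpus=2), Node("node-2", gpus=2)])
experiment = Experiment(
    "mnist-example",
    jobs=[Job("train", [Container("tensorflow/tensorflow:1.2.0-gpu")])],
)
print(cluster.total_gpus())
```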

RiseML uses riseml.yaml to define what to run. Starting a training run first builds the Docker image.

project: mnist-example
train:
  framework: TensorFlow
  tensorflow:
    version: 1.2.1
  install:
    - apt-get update && apt-get install -y curl git
    - pip install -r requirements.txt
  resources:
    cpus: 3
    mem: 4096
    gpu: 1
run:
  - python mnist.py --epochs 2

riseml train    # start "train" job
riseml logs     # container logs
riseml status   # job statuses
riseml monitor  # node utilization
riseml system   # information about your clusters

Hyperparameter optimization works as sweeps defined in the YAML.

...
concurrency: 12
parameters:
  lr:
    - 0.0001
    - 0.001
  epochs:
    range:
      min: 2
      max: 6
      step: 1
run:
  - python mnist.py --epochs {{epochs}} --lr {{lr}}

riseml train -f riseml_hyper.yml
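
To make the sweep concrete, here is a minimal Python sketch of how a parameter block like the one above could expand into individual runs. It is not RiseML's implementation. It assumes a plain grid (one run per combination of values) and an inclusive max in range, and the function names expand_range and expand_sweep are hypothetical.

```python
from itertools import product


def expand_range(min_value, max_value, step):
    """Return the values of a range parameter (assumption: max is inclusive)."""
    values = []
    current = min_value
    while current <= max_value:
        values.append(current)
        current += step
    return values


def expand_sweep(parameters, run_template):
    """Return one rendered command per combination of parameter values."""
    names = list(parameters)
    value_lists = []
    for name in names:
        spec = parameters[name]
        if isinstance(spec, dict) and "range" in spec:
            r = spec["range"]
            value_lists.append(expand_range(r["min"], r["max"], r["step"]))
        else:
            value_lists.append(list(spec))

    commands = []
    for combo in product(*value_lists):
        command = run_template
        for name, value in zip(names, combo):
            # Substitute the {{name}} placeholder with the chosen value.
            command = command.replace("{{" + name + "}}", str(value))
        commands.append(command)
    return commands


# Same values as the YAML above.
parameters = {
    "lr": [0.0001, 0.001],
    "epochs": {"range": {"min": 2, "max": 6, "step": 1}},
}
commands = expand_sweep(
    parameters, "python mnist.py --epochs {{epochs}} --lr {{lr}}"
)
for command in commands:
    print(command)
```

Under these assumptions the example yields 2 x 5 = 10 runs, which is below the concurrency of 12, so all of them could be scheduled at once if the cluster has the resources.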

Custom Docker image:

project: mnist-example
train:
  framework: TensorFlow
  image:
    name: tensorflow/tensorflow:1.2.0-gpu
...

RiseML can host TensorBoard for you.

project: mnist-example
train:
  framework: TensorFlow
  tensorflow:
    tensorboard: true
...
