Updated at 2018-01-14 23:50

Ray is an open source framework for distributed computation, aimed at data science and machine learning workloads. It is not production ready yet, but it is an interesting project to follow.

  • Works with TensorFlow, PyTorch, MXNet etc.
  • Supports millisecond level tasks.
  • Supports millions of tasks per second.
  • Allows parallelizing tasks within tasks.
  • Task dependencies can be determined at runtime.
  • Tasks can operate on shared mutable state, such as neural network weights.
  • Supports heterogeneous resources, e.g. tasks that need a GPU and tasks that do not.
  • Uses Python as the main interface.
Install it with pip:

pip install ray

You can interact with Ray in two ways:

  1. low-level APIs
  2. high-level libraries like Ray RLlib or Ray.tune

The low-level APIs are based on dynamic task graphs, which you build from ordinary Python functions.

import ray
import numpy as np

ray.init()

# Define two remote functions.
# Invocations of these functions create tasks that are executed remotely.
@ray.remote
def multiply(x, y):
    return np.dot(x, y)

@ray.remote
def zeros(size):
    return np.zeros(size)

# Start two tasks in parallel.
# These return futures and the tasks are executed in the background.
x_id = zeros.remote((100, 100))
y_id = zeros.remote((100, 100))

# Start a third task.
# This will not be scheduled until the first two tasks have completed.
z_id = multiply.remote(x_id, y_id)

# Get the result. This will block until the third task completes.
z = ray.get(z_id)

To access shared mutable state, you use actors.

import ray
import gym

ray.init()

@ray.remote
class Simulator(object):
    def __init__(self):
        self.env = gym.make("Pong-v0")
        self.env.reset()

    def step(self, action):
        return self.env.step(action)

# Create a simulator. This starts a remote process that will run
# all methods for this actor.
simulator = Simulator.remote()

observations = []
for _ in range(4):
    # Take action 0 in the simulator.
    # This call does not block and it returns a future.
    observations.append(simulator.step.remote(0))

Under the hood, Ray uses one or more Redis databases to store the pickled Python functions, task results, and system configuration.