Ray is an open source framework for distributed data science computation. It is not yet production ready, but it is an interesting project to follow.
- Works with TensorFlow, PyTorch, MXNet etc.
- Supports millisecond level tasks.
- Supports millions of tasks per second.
- Allows parallelizing tasks within tasks.
- Task dependencies can be determined at runtime.
- Tasks can operate on shared mutable states like neural network weights.
- Supports heterogeneous resources, e.g. scheduling tasks on machines with or without GPUs.
- Uses Python as the main interface.
```shell
pip install ray
```
You can interact with Ray in two ways:
- low-level APIs
- high-level libraries such as Ray RLlib and Ray Tune
The low-level APIs work with dynamic task graphs, which you build from ordinary Python code.
```python
import numpy as np
import ray

ray.init()

# Define two remote functions.
# Invocations of these functions create tasks that are executed remotely.
@ray.remote
def multiply(x, y):
    return np.dot(x, y)

@ray.remote
def zeros(size):
    return np.zeros(size)

# Start two tasks in parallel.
# These return futures, and the tasks are executed in the background.
x_id = zeros.remote((100, 100))
y_id = zeros.remote((100, 100))

# Start a third task.
# This will not be scheduled until the first two tasks have completed.
z_id = multiply.remote(x_id, y_id)

# Get the result. This will block until the third task completes.
z = ray.get(z_id)
```
To access shared mutable state, you use actors.
```python
import gym
import ray

ray.init()

@ray.remote
class Simulator(object):
    def __init__(self):
        self.env = gym.make("Pong-v0")
        self.env.reset()

    def step(self, action):
        return self.env.step(action)

# Create a simulator. This starts a remote process that runs
# all methods for this actor.
simulator = Simulator.remote()

observations = []
for _ in range(4):
    # Take action 0 in the simulator.
    # This call does not block; it returns a future.
    observations.append(simulator.step.remote(0))

# Block and fetch the actual observations when they are needed.
observations = ray.get(observations)
```
Under the hood, Ray uses one or more Redis servers to store the pickled Python functions, results, and system configuration.
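The pickling step can be illustrated on its own. Ray serializes functions with the `cloudpickle` library so their code can be shipped between processes; this is a rough sketch assuming `cloudpickle` is installed, and `add` is just an example function:

```python
import cloudpickle

def add(a, b):
    return a + b

# Serialize the function body to bytes, roughly as Ray does before
# storing it for workers to fetch and execute.
blob = cloudpickle.dumps(add)

# Any process holding the bytes can reconstruct and call the function.
restored = cloudpickle.loads(blob)
result = restored(2, 3)
# result == 5
```

Unlike the standard library's `pickle`, which serializes module-level functions only by reference, `cloudpickle` captures the function's code itself, which is what makes shipping arbitrary functions to remote workers possible.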