ruk·si

🐍 PyWren

Updated at 2018-01-15 16:31

PyWren executes stateless functions in a serverless context. It is a prototype implementation aimed at simpler distributed computing.

How PyWren works:

  1. The user writes a Python function that takes an input and produces an output.
  2. The function is serialized with cloudpickle.
  3. Inputs are serialized and placed in S3 under unique keys.
  4. One function invocation is launched per item in the inputs.
  5. Each invocation executes the function in a stateless remote container.
  6. Outputs are serialized and placed in S3 at a predefined key.
  7. The existence of this output file signals that the execution is done.
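
The flow above can be sketched locally. This is only a rough illustration, not PyWren's code: a plain dict named fake_s3 stands in for the S3 bucket, the "remote container" is an ordinary function call, and the standard-library marshal module replaces cloudpickle (it only handles simple functions without closures or global references). All helper names and the key layout are hypothetical.

```python
import builtins
import marshal
import pickle
import types
import uuid

# Assumption: a dict mapping keys to bytes stands in for an S3 bucket.
fake_s3 = {}


def serialize_function(func):
    # Ship the function's bytecode by value. cloudpickle does much more
    # (closures, referenced globals); marshal is a stdlib stand-in here.
    return marshal.dumps(func.__code__)


def deserialize_function(payload):
    code = marshal.loads(payload)
    return types.FunctionType(code, {"__builtins__": builtins})


def remote_worker(func_key, input_key, output_key):
    # Stateless: everything the worker needs is read from storage.
    func = deserialize_function(fake_s3[func_key])
    item = pickle.loads(fake_s3[input_key])
    # Writing the output key doubles as the "done" signal.
    fake_s3[output_key] = pickle.dumps(func(item))


def is_done(output_key):
    return output_key in fake_s3


def run_map(func, inputs):
    job_id = uuid.uuid4().hex
    func_key = f"{job_id}/func"
    fake_s3[func_key] = serialize_function(func)

    output_keys = []
    for index, item in enumerate(inputs):
        input_key = f"{job_id}/input/{index}"    # unique key per input
        output_key = f"{job_id}/output/{index}"  # predefined output key
        fake_s3[input_key] = pickle.dumps(item)
        remote_worker(func_key, input_key, output_key)  # one launch per item
        output_keys.append(output_key)

    return [pickle.loads(fake_s3[key]) for key in output_keys if is_done(key)]


def square(x):
    return x * x


results = run_map(square, [1, 2, 3])
print(results)
```

In a real deployment the workers run concurrently and the client polls for the output keys instead of calling the worker directly.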

Serializing the function is important: it lets one registered Lambda function execute many different Python functions. This lowers function registration latency, and you won't hit the Lambda function size limit as quickly.
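
A minimal sketch of that idea, assuming a single generic entry point that receives the serialized function as part of its payload. The names generic_handler and make_event and the event layout are hypothetical, and marshal is again a standard-library stand-in for cloudpickle that only covers simple functions.

```python
import base64
import builtins
import marshal
import pickle
import types


def generic_handler(event, context=None):
    # Hypothetical entry point that would be registered once.
    # It knows nothing about the user code until the payload arrives.
    code = marshal.loads(base64.b64decode(event["func"]))
    func = types.FunctionType(code, {"__builtins__": builtins})
    arg = pickle.loads(base64.b64decode(event["arg"]))
    return func(arg)


def make_event(func, arg):
    # Pack function bytecode and argument into a JSON-friendly payload.
    return {
        "func": base64.b64encode(marshal.dumps(func.__code__)).decode("ascii"),
        "arg": base64.b64encode(pickle.dumps(arg)).decode("ascii"),
    }


# The same handler runs two unrelated user functions.
doubled = generic_handler(make_event(lambda x: x * 2, 21))
shouted = generic_handler(make_event(lambda s: s.upper(), "wren"))
print(doubled, shouted)
```

Because the user code travels with each request, nothing has to be re-registered when the function changes.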

Problems with this approach:

  • Lambda function launch overhead.
  • Lambda function invocation rate limit.
  • None of the function-as-a-service platforms have GPU support.
  • Each worker gets 30 MB/s of bandwidth.
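
To get a feel for the bandwidth figure, here is a back-of-the-envelope calculation. It assumes aggregate throughput scales linearly with the number of workers and ignores launch overhead and storage throttling; both are simplifying assumptions, and the function names are made up for this note.

```python
PER_WORKER_MB_S = 30  # per-worker bandwidth from the list above


def aggregate_gb_s(workers, per_worker_mb_s=PER_WORKER_MB_S):
    # Assumes perfectly linear scaling across workers.
    return workers * per_worker_mb_s / 1000


def transfer_seconds(total_gb, workers, per_worker_mb_s=PER_WORKER_MB_S):
    # Time to move total_gb when the data is split evenly across workers.
    total_mb = total_gb * 1000
    return total_mb / (workers * per_worker_mb_s)


print(transfer_seconds(30, 1))    # 1000.0 seconds with one worker
print(transfer_seconds(30, 100))  # 10.0 seconds with 100 workers
print(aggregate_gb_s(1000))       # 30.0 GB/s across 1000 workers
```

A single worker is slow, so the approach only pays off when the job can be split across many workers.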

The S3 backend is good but not perfect:

  • How to get append functionality?
  • How could we use Redis for some parts of the storage?

Sources

  • Occupy the Cloud: Distributed Computing for the 99%, Eric Jonas, Qifan Pu, Shivaram Venkataraman, Ion Stoica, Benjamin Recht