Semi-automatic parallelization of sequential Python code for clusters and the cloud


Talk at OSDI '14

We will present Pydron at OSDI'14 in Broomfield, CO in the technical session "Hammers and Saws" on Wednesday.

What is Pydron?

Pydron is a system for the automatic parallelization of sequential Python code. It provides a transparent, simple-to-use interface between the application domain and the parallel computing domain.

Why is Pydron needed?

We target astronomy data processing, where data volumes are huge and heavy number crunching is required. In such fields the economies of scale often do not allow the use of systems such as MapReduce, which would require the software to be rewritten.

Astronomy processing codes change continuously as the scientists work with their data and refine their methods. With Pydron, the scientists can write sequential code, which is easier to maintain and adapt and so increases their productivity. Pydron then semi-automatically parallelizes the code and scales the execution to the cores of the local machine, the nodes of a cluster, or a cloud.

How does it work?

Pydron analyses the Python code and translates it internally into a data-flow graph. The function calls and other expressions are then executed in parallel if there is no data dependency between them.
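
To make the idea concrete, the following minimal sketch writes out by hand what such a data-flow view implies. It is not Pydron's implementation: it uses only Python's standard concurrent.futures module, and the functions load, calibrate and combine are hypothetical stand-ins for real processing steps. The two load/calibrate chains share no data, so they may run concurrently, while combine must wait for both.

```python
from concurrent.futures import ThreadPoolExecutor


def load(name):
    # Hypothetical stand-in for reading an image from disk.
    return [len(name), 2, 3]


def calibrate(pixels):
    # Hypothetical stand-in for a per-image processing step.
    return [p * 2 for p in pixels]


def combine(a, b):
    # Depends on the results of both chains.
    return [x + y for x, y in zip(a, b)]


def pipeline():
    with ThreadPoolExecutor() as pool:
        # No data dependency between these two chains: submit both at once.
        first = pool.submit(lambda: calibrate(load("a.fits")))
        second = pool.submit(lambda: calibrate(load("bb.fits")))
        # combine consumes both results, so it has to wait for them.
        return combine(first.result(), second.result())


result = pipeline()
print(result)
```

In the sequential version of this code the two chains would simply appear one after the other; the point of the data-flow graph is that the absence of a dependency between them can be discovered from the code rather than stated by the programmer.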

Since automatically parallelizing sequential code is impossible in general, we need the help of the developer. With one Python decorator (annotation) the developer informs Pydron of functions that are free of side effects. With another, Pydron is instructed to analyse a specific method for parallelism.

Those two decorators are the complete API of Pydron. Everything else happens automatically: cloud nodes are started, the code is sent to the instances, tasks are executed remotely, and the results are transferred back to the developer's workstation.
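
The shape of such a two-decorator interface might look roughly like the sketch below. The decorator names side_effect_free and parallelize are hypothetical placeholders defined here only so that the example runs; they are not Pydron's actual names, and they merely tag the functions instead of performing any analysis or remote execution.

```python
def side_effect_free(func):
    # Hypothetical stand-in: marks a function as free of side effects.
    func.side_effect_free = True
    return func


def parallelize(func):
    # Hypothetical stand-in: a real system would analyse `func` for
    # parallelism here; this sketch leaves the function unchanged.
    func.analysed_for_parallelism = True
    return func


@side_effect_free
def calibrate(image):
    # Pure function: the result depends only on the argument.
    return [pixel - 1 for pixel in image]


@parallelize
def process_all(images):
    # Ordinary sequential code. The calls to calibrate are independent
    # of each other, so a parallelizing system could run them concurrently.
    results = []
    for image in images:
        results.append(calibrate(image))
    return results


output = process_all([[1, 2], [3, 4]])
print(output)
```

The body of process_all stays plain sequential Python; only the two annotations carry the information that the developer has to supply.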

About us

Pydron is developed by Stefan C. Müller as part of his PhD at ETH Zürich.

It is the product of a collaboration between the Systems Group at ETH and the Institute of 4D Technologies at the University of Applied Sciences and Arts Northwestern Switzerland.

We also work together with the Institute for Astronomy, which gives us the opportunity to test Pydron on real scientific data and codes.


Release Plans

Our prototype is working. We are planning to release Pydron as soon as possible under an open source licence.

We currently estimate that we will be ready for the release in the first quarter of 2015.

Receive Announcements

We have set up a mailing list on which we make important announcements, most notably when we release Pydron.

To subscribe, click here and send the e-mail: Subscribe

You can unsubscribe at any time: Unsubscribe