Iacopo Colonnelli
Ph.D. student in Modeling and Data Science at Università di Torino
Iacopo Colonnelli is a Ph.D. student in Modeling and Data Science at Università di Torino. He received his master’s degree in Computer Engineering from Politecnico di Torino with a thesis on a high-performance parallel tracking algorithm for the ALICE experiment at CERN.
His research focuses on both the statistical and computational aspects of large-scale data analysis, and on workflow modeling and management in heterogeneous distributed architectures.
OpenDeepHealth: Crafting a Deep Learning Platform as a Service with Kubernetes
Have you ever seen a Distributed Deep Learning Platform as a Service? Probably not: building one is challenging! Join this session to discover OpenDeepHealth, a PaaS built on top of Kubernetes and designed from first principles with a multi-tenancy-first approach!
OpenDeepHealth (ODH) is a hybrid HPC/cloud infrastructure designed and developed by the University of Torino within the DeepHealth European project. The goal was to provide a self-service platform for Deep Learning, allowing domain experts to bring their own data and run training and inference workflows in a multi-tenant, container-native environment. Kubernetes, the de-facto standard for container orchestration, is the natural framework for building such a distributed system, optimising resource usage and allowing the infrastructure to scale horizontally.
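As a rough sketch of what this looks like from a tenant's perspective, a training job could be submitted to a namespace of the cluster through the official Kubernetes Python client. The namespace, image, and resource names below are illustrative placeholders, not ODH's actual configuration:

    from kubernetes import client, config

    # Authenticate against the cluster using the local kubeconfig.
    config.load_kube_config()

    # A batch Job wrapping a single training container, confined to the
    # tenant's namespace. Names, image, and GPU request are hypothetical.
    job = client.V1Job(
        metadata=client.V1ObjectMeta(name="train-example", namespace="tenant-a"),
        spec=client.V1JobSpec(
            template=client.V1PodTemplateSpec(
                spec=client.V1PodSpec(
                    restart_policy="Never",
                    containers=[
                        client.V1Container(
                            name="trainer",
                            image="registry.example.com/tenant-a/trainer:latest",
                            command=["python", "train.py"],
                            resources=client.V1ResourceRequirements(
                                limits={"nvidia.com/gpu": "1"},
                            ),
                        )
                    ],
                )
            )
        ),
    )

    client.BatchV1Api().create_namespaced_job(namespace="tenant-a", body=job)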
StreamFlow, the ODH workflow engine, can schedule and coordinate the different steps of a workflow on top of a diverse set of execution environments, ranging from single Pods to entire HPC centres. As a result, each step of a complex data analysis pipeline can be scheduled on the most efficient infrastructure, while the underlying runtime layer automatically takes care of workers' lifecycles, data transfers, and fault tolerance.
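In StreamFlow, this mapping is expressed declaratively: a workflow description (e.g. in CWL) is paired with a streamflow.yml file that binds each step to a target deployment. The sketch below generates such a file from Python; the step names, deployment types, and connector options are illustrative assumptions, not a pipeline from the talk:

    import yaml

    # Illustrative streamflow.yml content: each workflow step is bound to
    # the execution environment best suited to run it.
    streamflow_config = {
        "version": "v1.0",
        "workflows": {
            "dl-pipeline": {
                "type": "cwl",
                "config": {"file": "main.cwl", "settings": "config.yml"},
                "bindings": [
                    # Lightweight preprocessing stays on Kubernetes...
                    {"step": "/preprocess", "target": {"deployment": "k8s"}},
                    # ...while GPU-hungry training goes to an HPC facility.
                    {"step": "/training", "target": {"deployment": "hpc"}},
                ],
            }
        },
        "deployments": {
            # Connector options here are hypothetical placeholders.
            "k8s": {"type": "helm", "config": {"chart": "./preprocess-chart"}},
            "hpc": {"type": "slurm",
                    "config": {"hostname": "hpc.example.com", "username": "user"}},
        },
    }

    # Write the configuration where the streamflow CLI expects to find it.
    with open("streamflow.yml", "w") as f:
        yaml.safe_dump(streamflow_config, f, sort_keys=False)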
ODH implements a novel form of multi-tenancy called “HPC Secure Multi-Tenancy”, specifically designed to support AI applications on critical data. Thanks to Capsule, the multi-tenant Kubernetes operator, ODH can enforce multi-tenancy at the cluster level, avoiding privilege escalations and exploits, minimising operational costs, and enforcing custom policies to access external HPC facilities.
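To give an idea of how this is wired up, a tenant can be declared through Capsule's cluster-scoped Tenant custom resource, created here via the Kubernetes Python client. The owner, quota, and tenant name are illustrative, not ODH's real policies:

    from kubernetes import client, config

    config.load_kube_config()

    # A Capsule Tenant: the owner can self-service namespaces, but stays
    # confined by the cluster-level policies Capsule enforces. The name,
    # owner, and namespace quota are hypothetical examples.
    tenant = {
        "apiVersion": "capsule.clastix.io/v1beta2",
        "kind": "Tenant",
        "metadata": {"name": "tenant-a"},
        "spec": {
            "owners": [{"name": "alice", "kind": "User"}],
            # Cap the number of namespaces this tenant can create.
            "namespaceOptions": {"quota": 3},
        },
    }

    client.CustomObjectsApi().create_cluster_custom_object(
        group="capsule.clastix.io",
        version="v1beta2",
        plural="tenants",
        body=tenant,
    )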
Finally, ODH provides multi-tenant distributed Jupyter Notebooks as a service through the Dossier platform. This feature gives domain experts a high-level, well-known programming model to write portable and reproducible Deep Learning pipelines, augmenting standard notebooks with resource segregation, data protection and computation offloading capabilities.
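Dossier's API is not detailed here, but the programming model can be pictured with a purely hypothetical IPython cell magic that marks a notebook cell for execution on a remote, StreamFlow-managed deployment rather than the local kernel:

    from IPython.core.magic import register_cell_magic

    # Hypothetical sketch: a real implementation would serialise the cell,
    # ship it to the chosen deployment, and stream the results back.
    @register_cell_magic
    def offload(line, cell):
        target = line.strip() or "local"
        print(f"[sketch] submitting cell to deployment '{target}'")
        # Fall back to local execution so the sketch stays runnable.
        exec(cell, globals())

A notebook cell would then opt into offloading with a one-line annotation (model and train_data being whatever the user defined earlier):

    %%offload hpc
    model.fit(train_data, epochs=10)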