Talks
Invited Talks
2024
WSCC 2024
Abstract
Thanks to their generality, workflow models represent a powerful abstraction for designing complex applications and executing them on large-scale distributed architectures. However, several additional challenges appear when transitioning from cloud/HPC environments to the entire compute continuum. Continuum execution environments are fully distributed and modular, and modules can be heterogeneous and independent of each other. In addition, continuum workflows often rely on multiple intercommunicating agents that form complex micro-services architectures. Different agents deal with different communication and parallelization paradigms: network-based stream processing at the edge and file-based batch processing on HPC facilities. Finally, support for efficient interactive workflows in the continuum remains an open research problem. This talk explores these challenges and provides insights on how to deal with them. A ready-to-use software library accompanies each proposed solution to facilitate the reproducibility and reusability of the presented concepts.
2024
Cross-Facility Federated Learning - Part II
2024
CWL in the HPC Ecosystem
2023
ITADATA 2023
2023
OSA2Micro
2023
Standardised Workflows at EBRAINS
Abstract
A hands-on training session on Standardised Workflows in EBRAINS. A short presentation will serve as an introduction, while the main hands-on session will cover writing and executing Standardised Workflows. TC will give some guidelines so that attendees can experiment with writing CWL tools and workflows; they will then be given access to a VM to execute these workflows. The Workflows Dashboard will also be presented during the same session, giving attendees the opportunity to understand its different functionalities, use it with TC support, and provide useful comments.
2023
CWLCon 2023
Abstract
Modern HPC applications are becoming so heterogeneous and complex that a modular approach to their design, deployment and orchestration is now necessary. This talk explores the benefits of using a vendor-agnostic workflow language (CWL) coupled with a hybrid workflow management system (StreamFlow) in the HPC ecosystem. It also examines the requirements needed to model HPC applications effectively, CWL’s readiness to meet such requirements, and the proposals made to improve the language where needed. Four real use cases will drive the discussion: the ACROSS Project (G.A. n. 955648), where CWL is the primary interface to model three HPC workflows, and the EUPEX Project (G.A. n. 101033975), where StreamFlow will be used for the rapid prototyping of a seismic engineering HPC application for a Modular Supercomputing Architecture (MSA) system.
2022
CINI HPC-KTT: HPC Key Technologies and Tools National Lab
2022
2nd HealthyCloud Workshop
Abstract
When providing data analysis as a service, one must tackle several problems. Data privacy and protection by design are crucial when working on sensitive data. Performance and scalability are fundamental for compute-intensive workloads, e.g. training Deep Neural Networks. User-friendly interfaces and fast prototyping tools are essential to allow domain experts to experiment with new techniques. Portability and reproducibility are necessary to assess the actual value of results. Kubernetes is the best platform to provide reliable, elastic, and maintainable services. However, Kubernetes alone is not enough to achieve large-scale multi-tenant reproducible data analysis. OOTB support for multi-tenancy is too rough, with only two levels of segregation (i.e. the single namespace or the entire cluster). Offloading computation to off-cluster resources is non-trivial and requires the user's manual configuration. Also, Jupyter Notebooks per se cannot provide much scalability (they execute locally and sequentially) and reproducibility (users can run cells in any order and any number of times). The Dossier platform allows system administrators to manage multi-tenant distributed Jupyter Notebooks at the cluster level in the Kubernetes way, i.e. through CRDs. Namespaces are aggregated in Tenants, and all security and accountability aspects are managed at that level. Each Notebook spawns into a user-dedicated namespace, subject to all Tenant-level constraints. Users can rely on provisioned resources, either in-cluster worker nodes or external resources like HPC facilities. Plus, they can plug their computing nodes in a BYOD fashion. Notebooks are interpreted as distributed workflows, where each cell is a task that one can offload to a different location in charge of its execution.
2021
CWLCon 2021
Abstract
We'll present a methodology to run DNN pipelines on hybrid cloud+HPC infrastructure. We'll also define a "universal pipeline" for medical images. The pipeline can reproduce all state-of-the-art DNNs to diagnose COVID-19 pneumonia that appeared in the literature during the first Italian lockdown and the following months. We can run all of them (across cloud+HPC platforms) and compare their performance in terms of sensitivity and specificity to set a baseline to evaluate future progress in the automated diagnosis of COVID-19. Also, the pipeline makes existing DNNs explainable by way of adversarial training. The pipeline is easily portable and can run across different infrastructures, adapting the performance-urgency trade-off. The methodology builds on two novel software programs: the StreamFlow workflow system and the AI-sandbox concept (parallel container with user-space encrypted file system). We reach over 92% accuracy in diagnosing COVID pneumonia.
Regular Talks
2025
eScience 2025
Abstract
Modern scientific discovery is frequently backed by large-scale scientific experiments, which cannot do without the unparalleled computational capabilities provided by high-performance computing systems. However, large data centers usually prioritize system efficiency over user accessibility, posing challenges for researchers without advanced computer science expertise. This work introduces BookedSlurm, a secure and user-focused extension of the Slurm workload manager, aiming to democratize HPC access across interdisciplinary research domains. BookedSlurm enables a partially decentralized regulation of fine-grained advanced resource reservations through a novel credit-based framework, ensuring fair and predictable access to computing resources. Its modular architecture leverages dedicated microservices to manage reservations, credit handling, and accounting. These components are exposed through a secure REST API and an intuitive web-based dashboard, enhancing system usability for novice and expert users. While the dashboard simplifies interactions for non-specialists, advanced users can directly access agent-level APIs for more complex and automated operations. The effectiveness of BookedSlurm is validated on a real-world bioinformatics use case from the SUS-MIRRI.IT project, showcasing its ability to enhance usability, optimize job scheduling, and streamline execution workflows.
2025
2025 ICSC Spoke 1 Event
2024
EuroHPC 2024
2024
CWLCon 2024
Abstract
This presentation introduces the new CWL Working Groups initiative, describing what a Working Group actually is, which Working Groups already exist in the CWL community, and how anybody can create a new officially recognized Working Group. Then, the presentation will explore the CWL4HPC Working Group, using it as an example of how a CWL Working Group can actually work.
2023
EuroHPC 2023
2023
2023 Workflows Community BoF: Modern Workflows for Continuum and Cross-Facility Computing
2023
WORKS 2023
Abstract
An entire ecosystem of methodologies and tools revolves around scientific workflow management. They cover crucial non-functional requirements that standard workflow models fail to target, such as interactive execution, energy efficiency, performance portability, Big Data management, and intelligent orchestration in the Computing Continuum. Characterizing and monitoring this ecosystem is crucial to developing an informed view of current and future research directions. This work conducts a systematic mapping study of the Italian workflow research community, analyzing 25 tools and 10 applications from several scientific domains in the context of the “National Research Centre for HPC, Big Data, and Quantum Computing” (ICSC). The study aims to outline the main current research directions and determine how they address the critical needs of modern scientific applications. The findings highlight a variegated research ecosystem of tools, with a prominent interest in advanced workflow orchestration and still immature but promising efforts toward energy efficiency.
2023
2023 CINI HPC-KTT National Assembly
2022
EAGE 2022
Abstract
Large-scale scientific applications are facing an irreversible transition from monolithic, high-performance-oriented codes to modular and polyglot deployments of specialised (micro-)services. The reasons behind this transition are many: coupling of standard solvers with Deep Learning techniques, offloading of data analysis and visualisation to Cloud, and the advent of specialised hardware accelerators. Topology-aware Workflow Management Systems (WMSs) play a crucial role. In particular, topology-awareness allows an explicit mapping of workflow steps onto heterogeneous locations, enabling automated executions on top of hybrid architectures (e.g., cloud+HPC or classical+quantum). Plus, topology-aware WMSs can offer non-functional requirements OOTB, e.g. components’ life-cycle orchestration, secure and efficient data transfers, fault tolerance, and cross-cluster execution of urgent workloads. Augmenting interactive Jupyter Notebooks with distributed workflow capabilities allows domain experts to prototype and scale applications using the same technological stack, while relying on a feature-rich and user-friendly web interface. This abstract will showcase how these general methodologies can be applied to a typical geoscience simulation pipeline based on the Full Waveform Inversion (FWI) technique. In particular, a prototypical Jupyter Notebook will be executed interactively on Cloud. Preliminary data analyses and post-processing will be executed locally, while the computationally demanding optimisation loop will be scheduled on a remote HPC cluster.
2022
ITWSHPC 2022
2022
J on The Beach 2022
JupyterFlow: Jupyter Notebooks at scale
Abstract
Jupyter Notebooks are widely used in both industry and academia as a tool for teaching, prototyping, and exploratory analysis. Unfortunately, the standard Jupyter runtime is not powerful enough to sustain real workloads, and often the only solution is to rewrite the code from scratch in a technology with HPC support. By integrating the Jupyter stack with StreamFlow (https://streamflow.di.unito.it/), it is possible to create Notebooks through a web interface on the cloud and run them transparently on a remote GPU-equipped VM or on HPC nodes.
2019
Workshop GARR 2019
2019
PDP 2019
Abstract
This work presents a novel approach to distributed training of deep neural networks (DNNs) that aims to overcome the issues related to mainstream approaches to data parallel training. Established techniques for data parallel training are discussed from both a parallel computing and deep learning perspective, then a different approach is presented that is meant to allow DNN training to scale while retaining good convergence properties. Moreover, an experimental implementation is presented as well as some preliminary results.