Jupyter Archives - Research Software Engineering

Running Jupyter notebooks on Imperial College’s compute cluster

Mark Woodbridge, Research Software Engineer

16 February 2020

We were really glad to see James Howard (NHLI, Faculty of Medicine) announcing on Twitter that he’d published a Kaggle kernel to accompany his recent publication on X-ray image analysis for cardiac pacemaker identification using neural networks via PyTorch and torchvision. Sharing code like this is a great way to promote open research, enable reproducibility and encourage re-use.

Figure 3 from Cardiac Rhythm Device Identification Using Neural Networks

We thought it might be helpful to explain how to run similar notebooks on Imperial’s cluster compute service, given that it can provide some benefits while you’re developing code:

Your code and data remain securely on-premise, thanks to the RCS Jupyter Service and Research Data Store
You can run parallel interactive and non-interactive jobs that span several days, across multiple GPUs

With James’ permission we’ve lightly modified his notebook and published it in an exemplar repository alongside some instructions to run it on the compute cluster. We hope this can help others to use a combination of Conda, Jupyter and PBS in order to conduct GPU-accelerated machine learning on infrastructure managed by the College’s Research Computing Service – without incurring any cost at the point of use.
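As a rough illustration of how Conda, Jupyter and PBS can fit together for a non-interactive run, the sketch below renders a PBS job script that executes a notebook from start to finish. It is not taken from the exemplar repository: the function name `render_pbs_script`, the environment name `torch-env`, the notebook name `train.ipynb` and every resource value are placeholders, and the way Conda is activated on the cluster may differ, so treat the repository's instructions as authoritative.

```python
from textwrap import dedent


def render_pbs_script(notebook, conda_env, walltime="24:00:00",
                      ncpus=4, mem_gb=24, ngpus=1):
    """Return the text of a PBS job script that runs a notebook non-interactively.

    All defaults are illustrative placeholders, not recommended settings.
    """
    return dedent(f"""\
        #!/bin/bash
        #PBS -l walltime={walltime}
        #PBS -l select=1:ncpus={ncpus}:mem={mem_gb}gb:ngpus={ngpus}

        # Start in the directory the job was submitted from.
        cd "$PBS_O_WORKDIR"

        # Activate the (hypothetical) Conda environment holding Jupyter and PyTorch.
        eval "$(conda shell.bash hook)"
        conda activate {conda_env}

        # Execute every cell; nbconvert saves the executed copy as a new notebook.
        jupyter nbconvert --to notebook --execute {notebook}
        """)


script = render_pbs_script("train.ipynb", "torch-env")
print(script)
```

The printed text could then be saved to a file and submitted with `qsub`. Keeping the resource request in function arguments makes it easy to ask for a longer walltime or more GPUs without editing the script by hand.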

Many thanks to James Howard for sharing his notebook and reviewing our instructions.

Quilting with Julia, or how to combine parallelism and derived types for high performance computing

Mayeul d'Avezac de Castera, Senior Research Software Engineer

13 January 2020

Research and quilting have a similar Zen in that both combine and build upon multiple prior works. But the workflow is difficult to reproduce in research software: how can we combine group X’s state-of-the-art ODE solver with group Z’s state-of-the-art parallel linear algebra to create group Y’s new biology model when they all use different libraries and conventions? This is the problem that Julia tackles head on, thanks to its innovative type system and multiple dispatch. In “Shared Memory Parallelization of Banded Block-Banded Matrices” we describe how to combine the parallelization capabilities of one package (SharedArrays) with the specialized matrix type of another (BlockBandedMatrices.jl), without modifying the internals of either.
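For readers unfamiliar with multiple dispatch, the toy sketch below shows the idea: the implementation is chosen from the types of all arguments, so a third party can add a specialized method for a combination of types it does not own. Python has no built-in multiple dispatch, so the `Dispatcher` class here is a hand-rolled stand-in, and `Matrix`, `BandedMatrix`, `Storage`, `SharedStorage` and `multiply_plan` are hypothetical names, not the APIs of SharedArrays or BlockBandedMatrices.jl.

```python
from itertools import product


class Dispatcher:
    """Toy multiple dispatch: pick a method from the types of ALL arguments."""

    def __init__(self, name):
        self.name = name
        self._methods = {}

    def register(self, *types):
        def decorator(func):
            self._methods[types] = func
            return func
        return decorator

    def __call__(self, *args):
        # Try combinations of each argument's class hierarchy, most specific
        # first. Ties favour the first argument; Julia's real rules are stricter.
        for signature in product(*(type(a).__mro__ for a in args)):
            if signature in self._methods:
                return self._methods[signature](*args)
        raise TypeError(f"no method of {self.name} matches the arguments")


# Hypothetical "package A": matrix types.
class Matrix: ...
class BandedMatrix(Matrix): ...

# Hypothetical "package B": storage types.
class Storage: ...
class SharedStorage(Storage): ...

multiply_plan = Dispatcher("multiply_plan")


@multiply_plan.register(Matrix, Storage)
def _generic(matrix, storage):
    return "serial generic loop"


# A third party specializes the combination without editing either package.
@multiply_plan.register(BandedMatrix, SharedStorage)
def _banded_shared(matrix, storage):
    return "parallel loop over bands"


print(multiply_plan(BandedMatrix(), SharedStorage()))
print(multiply_plan(BandedMatrix(), Storage()))
```

The first call selects the specialized method because both argument types match it exactly; the second falls back to the generic method. In Julia this selection is part of the language, which is what lets independently written packages compose.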

This work follows on from a NumFOCUS-sponsored collaboration at Imperial College between the Research Computing Service and Sheehan Olver in the Department of Mathematics.