Blog posts

Introducing MicroLens: Using a design approach to understand what brain tumour patients want from their images 

 “What if patients could see all of their images, whenever they wanted? What would they do with them? What would they need to make that useful?”  

MicroLens started with that question: it is an exploration of what image access would mean, and the different ways it could work, for the staff, patients and carers in our service. We are publishing this to summarise what we have done so far – but also to invite you to contribute! If you are interested in adding your thoughts, please see the link at the end of this blog post. 

Specifically, we wanted to bring a design-led approach to thinking about some new technology. By that we mean that, instead of jumping in feet first and writing some code, we wanted to explore use and needs first, and then let the technology follow, in line with the general Design-Centred AI approach that we have outlined here. 

 Starting point 

Historically, patients had images taken (X-rays, ultrasound, CT) at a hospital. These were then reported by a radiologist, and the report went to whoever had requested the imaging. The patient then came back to discuss the results with their treating team. More recently, we have had the ability to show those images to patients in clinic, so we can explain and demonstrate what the report shows. Over the last few years, some centres have started to offer patients routine access to imaging reports through “Patient Portals”, which let patients see their blood test results, letters, reports, etc. We have one of these at Imperial, and it is being rolled out London-wide. However, as computers have developed, patients can now also easily look at their own imaging at home, by taking the images home on CD and downloading free software to view them. 

Technology is now at a place where patients can look at their own images outside of a clinical consultation, but the available evidence shows that there is a mismatch between what patients and users want and what clinicians and medical teams think patients and users need or want. The MicroLens project is about us taking the time, over a five-month period, to hear from a wide range of people (patients, caregivers, researchers, multidisciplinary team members – radiologists, CNSs, oncologists, surgeons – and technically minded people) about what would be practical, needed and feasible, and how they would best use and understand their images, instead of starting with the technical aspects. This is part of a wider move in the lab that we explore in this post on Design-Centred AI. 

We looked at six main areas: 

  1. Current tools being used by patients/ caregivers to view images; 
  2. Functionalities of the current tools/ systems and their accessibility; 
  3. Benefits of having images with the scan reports; 
  4. When the scans/ images should be made available for patients to view; 
  5. Platform – web based, app or hybrid approach? What security is needed, and the sign-up and access process; 
  6. Professional/ workforce views. 


1. Current tools being used by patients/ caregivers to view images 

How do people currently view their images? 

  • Scans uploaded onto CDs 
  • Linking a CD player to a bigger screen (e.g. a TV) 
  • Screenshots of scans and viewing them as photos on their phone 
  • Platforms (e.g. Idonia) 


Patient feedback on having access to their images: 

“This is what it looked like last time and this is what it looks like now and they’re perfectly aligned, and we zoom through it, and you can see from here this is what’s going on.” 

“It was me that wrote the comment about putting it on the telly, and I don’t remember the technicalities of how we did it, but it was useful being able to put it on the bigger screen with my mum. She was at the time almost 80, so the fact that we were able to view the pictures in a bigger medium really helped to point things out to her.” 


Patients’ & caregivers’ views on images being uploaded onto CDs 

“Nowadays many people don’t have access to computers/ laptops that have CD drives, so when their images are provided on CDs, if they want to view the images, they have had to purchase external CD drives.” 


“There is a variation across teams and services in the time taken from requesting your scans to receiving them on a CD. Having the images on a CD allows you to take your images to other consultations (worldwide), but they can also be used to display your images on your TV to explain a diagnosis to other family members/ friends.” 


“Images that are uploaded onto CDs have no anatomical signposting, nor do they have information about which slices are the most appropriate to view or what is actually normal vs abnormal. Patients find it difficult to interpret the imaging, as they don’t have the training or the knowledge for this. The current process doesn’t allow you to compare serial images, and each image needs to be looked at individually.” 


Platforms patients use to view images 

We also discussed the use of a platform called Idonia, designed to host medical imaging. You register for free and can then view your images on the platform. If you have your images on CD, you can drag them into the platform so that all your images are stored in one place. You can share your images by sharing a PIN code, with no geographical limit, and you can send images via email. The platform has some built-in tools that allow you to view the images in more detail, and you can also store medical history on the platform and make notes on the images. It can act as a repository for all your images to date. The disadvantage of such a platform is that the NHS trust or treating hospital needs to purchase the rights to use Idonia, limiting its reach to all patients. 


Advantages and disadvantages of the different methods currently being used to view images outside of clinical appointments 

Overall having a type of platform to view images is more practical than CDs as it avoids the variability in time to access images across services but also avoids the issue around requiring a laptop/ PC with a CD drive and the relevant software to then view the images.  A platform allows for the user to add personalised notes and share with a wide range of people quickly and safely and could act as a form of repository for the scans to date.   


2. Functionalities of the current tools/ systems and their accessibility 

The group discussed the reasons they would access their images outside of appointments; these included sharing images with other clinicians, viewing scans for their own records/ understanding, explaining a diagnosis to family members/ friends, and having a digital “photo album”. This highlighted that a tool would need to be personalised to a range of different needs, but would need common functionalities, including the ability to share images, a place to store screenshots of images, and key anatomical points. 


3. Benefits of having images with the scan reports 

Almost all the people present at this meeting agreed that having some notes/ reports alongside the images is as beneficial as the image itself, as it would help them to navigate what the area of concern is. There was a discussion around whether the imaging team would be able to provide an additional imaging report which summarised the scan in lay terms, so that it was accessible to all. We also discussed the idea that, if an area was mentioned on the report, it could be highlighted on the scan, or an arrow used to indicate that region. 

“For me the imaging report is equally important. Images are all well and good, but the radiologist’s report – the words – is very important. Images to the uninitiated can be a dangerous thing, so it needs to be teamed up with something.” 


4. When should the scans/ images be made available for patients to view? 

The response to this question was split amongst the group, with some people wanting to have their images prior to the consultation, whilst others would rather the images became available after the consultation. 

For those that wanted the images prior to their consultations, they felt the benefits were: 

“Able to come up with questions to ask my consultant rather than having to think of them in real time” 

“Gives me time to process images as opposed to being in shock and not fully understanding what is being said to me. By seeing them earlier I feel less anxiety and feel more in control. I think if I’ve already seen them, I could be more prepared.” 

For those that wanted the images after their consultations, they felt the benefits were: 

“There was less anxiety or cause for concern between viewing the image and knowing the next steps in treatment therefore I would prefer for the MDT to decide on management and then for me to view scans at consultation and then know the next steps. I would however like to view my scans for a historical catalogue basis. I wouldn’t necessarily need to have that information before the appointment because I believe that would just fundamentally put a lot more stress on the team and myself.  The purpose of having our images is not so that we go diagnosing ourselves, but it would be nice to have access to the images and information after you guys have done your work.” 


5. Platform – web based, app or hybrid approach? What security is needed, and the sign-up and access process 

The group felt that a hybrid process would be the best type of platform for patients to access images remotely as it would allow for them to access the full series of images via the web but then view snapshots of images using an App.  The patients and caregivers felt the advantage of access to images via the web allowed for easier access to comparing scans but also allowed for the images to be displayed on a TV screen to share with other family members.  The benefits of the app were that it was convenient and allowed for quick access to images. 

Again, the viewpoints of the group were variable, with some of the group opting for a dual-authentication process and others saying that this is not needed, and that a sign-up process like Patients Know Best or the NHS App would be sufficient. There was also discussion around how other family members should get access to the images: should the patient share access to their images through the platform, or should only one access route be provided, with family members sharing the log-in? 


6. Professional/ workforce views 

There were concerns around the extra workload that may arise as a result of patients/ caregivers having access to their own images.  The imaging team provided some feedback from their experiences when patients/ caregivers have had access to images.   

“Some patients end up viewing their images and get themselves all worked up, and have a lot more concerns, as what is normal anatomy is perceived by them as something abnormal.” 

“On other occasions, patients have had access to their images and imaging reports, and the report states the scan is normal, but then the patient sees something “abnormal”, so as a team we have to go back to the radiologist to get to the bottom of what is actually the correct situation.” 


What we have learned so far 

I have spent the last 5 years in a range of roles across Imperial College and Imperial College NHS Trust, working in and around brain tumour patients. Across that, I have become more and more interested, and involved in, Patient & Public Involvement (PPI). Some of that has felt difficult, and at times scary, but as a result I now lead most of the neuro-oncology PPIE work at both IC and ICHT.  

This project has helped us to understand that a tool for imaging is not going to be one-size-fits-all: it will need elements of individualisation. It will be important to provide tools within the platform that help patients and caregivers to navigate through their own images and make some sense of them. The ability to link the report to the image, highlighting key areas, would also be of great value. One of the key concepts raised by most people who attended these groups is that the platform should, at a minimum, provide them with a chronological photo album of their images for their own reference. 

Following our PPI sessions, we were lucky enough to talk our ideas through with Rachel Coldicutt from Careful Industries. 

This helped us to think critically about our project, and she asked us useful questions such as: “How do they think they will use it? And how do they then actually use it?” 

She also covered the concept of the image as an “object” – carrying no judgement and no emotion – which allows the patient/ caregiver to add the contextual and emotional aspects to it. 

She was able to provide us with a different outlook and perspective on the project and the next steps. For instance: provide the patients with access to scans/ images and ask them to provide daily input (voice memos/ journals/ thumbs up or down), and then follow up with face-to-face/ virtual feedback on how they used it and why they rated it the way they did. This approach allows for more insightful feedback and sensitivity, as it is hard for someone to comment on something until they actually have it or have used it. One of her really useful examples was around imaging in pregnancy: what scans and images mean to expectant mothers, and their effect on those who have unfortunately experienced a miscarriage. 

 NOTE: This post was written by our PPIE lead, Lillie Pakzad-Shahabi; it is being posted by MW for technical reasons

How to get involved 

We would love to hear your views on what you would like from a tool that would give you access to your images. Please complete the survey, and if you would like us to send out a design box so you can get creative and go back to basics to create your “ideal platform”, please email me and I would be happy to arrange for a box to be sent out to you. 



Towards Design-centred AI in Healthcare

This blog is a place for us to explore some of the early-stage ideas we are working on. So, with an acknowledgement that this is still half-formed, I wanted to write a little about our thinking around Design-centred AI in Healthcare. I also need to acknowledge that these thoughts are not mine alone – they are the result of lots of discussion, over many years, in the lab, and so I am presenting this as our collective thoughts. 

The pace of technical development in AI is astonishing: every month seems to bring some new technical development, some new deep learning architecture, some new piece of MedTech. But, at the same time, change in medicine seems to take place quite slowly. Current clinical practice does not feel very different from 5 years ago, whilst during the same time period the iPhone has gone from v8 to v14 (and quadrupled its number of transistors). Whilst there are many reasons for this, including slow translation and regulatory hurdles, we think there may be more to it. 

There is still a focus on using new technology to drive new MedTech; this isn’t surprising – new technology has always opened new medical possibilities. But that focus means that we end up thinking about uses for technology, rather than needs that could be served by technology. This is where design-centered AI and patient-centered AI are critical.  


An alternative rendering of the Design Council's Double Diamond approach to design
The “Double Diamond” approach to design; note the emphasis on discovery before development. Image credits: Wikipedia 


We refer to our attempt to change this as “Design-Centred AI”. We take an explicit design approach – based on work such as Donald Norman’s “The Design of Everyday Things”, and pictured in the UK Design Council’s “Double Diamond” diagram – to try and understand and explore user needs, and to put those needs at the fore of the work while putting the technology in the background. 

The other point to note is we talk about users, not patients. Obviously, many of our users are patients, but some of them might be caregivers and families, and some of our work is designed to be used by patients with staff, and so understanding staff views is important as well. 

This sounds very simple but is a relatively new approach to designing MedTech products. There are many MedTech products that “work” very nicely… but don’t solve a user problem. Existing approaches, such as patient involvement and engagement, are useful for helping us talk to patients, but don’t capture the iterative development that we need to explore, discuss and refine products. 

We are not the only people working in this space: there are quite a few groups interested in exploring human-led approaches to technology. Imperial College London already has an existing healthcare/design group in the Helix Centre, and we have had some useful conversations with several people, including Careful Industries. And then there was a recent Twitter thread discussing some of the same issues. 

As we said at the beginning, this work is still at an early stage, and will evolve over time. For now, we are exploring the work in a small, BRC-funded project, MicroLens, which explicitly uses the design approach to work with users to explore patient access to imaging. If you are interested in hearing more, then look at the MicroLens blog post, follow our work in the lab, or pick up the discussion on Twitter (@Matthwilliams).








Brain imaging, neural networks and a year in research

As the last conference (for now) has passed, I have found myself in a position to bring the blog back to life after a bit of a pause over the summer. As this month also marks the end of my first year working in research, I thought I’d familiarize you with some of the work that got me into the lab and formed a large part of my first year.

A year in research it has been, and I still remember vividly my first days within the lab, which started even further back – during the last year of my BSc degree at Imperial College London. Given that an online Python course over the summer and a statistics course in R during my studies were the only things connecting me to coding or computer science, it was a slightly frightening situation I had got myself into: doing a final-year project (on which a major part of my final mark depended) training a machine learning algorithm to segment a muscle in brain images. Fast forward half a year, and there I was with a degree in my hands (figuratively speaking, as Covid restrictions led to me finishing my degree from home) and a trained neural network which, although I did not know it then, would form the basis of an important part of my work later on.

But let’s not go down memory lane for too long and focus on what’s important: what is a neural network, and why did I need to train one to segment a muscle in the brain? A neural network is a type of machine learning algorithm which mimics the human brain in its architecture and uses training data to solve a problem, improving its performance over time without input from a user. As training progresses, the network determines the best parameters to produce the desired output – in our case a segmentation (a version of the original image where all of the pixels denoting the desired object are assigned one value, and all of the pixels denoting the background a different one).
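To make the idea of a segmentation mask concrete, here is a toy sketch in plain Python (this is only an illustration of what the output looks like – a simple thresholding rule, nothing to do with how the actual network computes its masks):

```python
# Toy illustration of a segmentation mask: each pixel is assigned 1 if it
# belongs to the object (here, intensities above a threshold) and 0 if it
# is background. A trained network learns a far more complex mapping from
# the data, but its output has exactly this form.

def segment_by_threshold(image, threshold):
    """Return a binary mask: 1 for object pixels, 0 for background."""
    return [[1 if pixel > threshold else 0 for pixel in row]
            for row in image]

# A tiny 4x4 "scan" with a bright 2x2 object in the middle.
image = [
    [10, 12, 11, 10],
    [11, 90, 95, 12],
    [10, 92, 97, 11],
    [12, 10, 11, 10],
]

mask = segment_by_threshold(image, threshold=50)
# mask is:
# [[0, 0, 0, 0],
#  [0, 1, 1, 0],
#  [0, 1, 1, 0],
#  [0, 0, 0, 0]]
```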

In our case, we wanted to segment the temporalis muscle, which is seen on routinely performed brain scans in brain tumour patients. We chose this muscle for a reason: there is evidence that it could be used to assess sarcopenia, the loss of muscle mass and/or function. Furtner et al. showed that temporal muscle thickness is predictive of overall and progression-free survival in glioblastoma patients, and in a paper by Zakaria et al., temporalis muscle width accurately predicted 30- and 90-day mortality as well as overall survival from diagnosis. You can find a quick overview of sarcopenia and its assessment in a previous blog post by Dr James Wang. Using the temporalis muscle is advantageous in a brain tumour patient cohort, as the other types of scans on which sarcopenia could be assessed are not usually available as part of their general cancer treatment. Thus, we wanted to train a neural network to automatically segment the temporalis muscle, which would be much quicker than manually measuring the width in each patient’s scan.

During my final-year project, I successfully trained a network to segment the temporalis muscle. I used an existing implementation of a U-Net (a type of neural network architecture) from GitHub; however, as I was not getting the required output, I had to find ways to make it work. Finally, after deciding to use a different loss function (one more suitable for our task, where the segmentation object occupies only a small proportion of the whole image) and trying hyperparameter optimisation, I managed to get a well-performing model just in time before the end of my placement.
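As an illustration, one loss commonly used when the segmentation object occupies only a small proportion of the image is the Dice loss (shown here as a plain-Python sketch over flat binary masks; this is illustrative rather than necessarily the exact loss we used):

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice overlap between two flat binary masks (1 = perfect overlap)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)

def dice_loss(pred, target):
    """Loss to minimise: 1 - Dice. Unlike plain pixel-wise accuracy, it
    is not dominated by the (many) background pixels."""
    return 1.0 - dice_coefficient(pred, target)

# The object is only 2 of 8 pixels: an all-background prediction is 75%
# "accurate" pixel-wise, but the Dice loss exposes the complete miss.
target = [0, 0, 1, 1, 0, 0, 0, 0]
all_background = [0] * 8
good_pred = [0, 0, 1, 1, 0, 0, 0, 0]

dice_loss(all_background, target)  # close to 1.0 (complete miss)
dice_loss(good_pred, target)       # close to 0.0 (perfect overlap)
```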

The U-Net was trained to segment the temporalis muscle in 2D slices; in the real world, however, brain images come in 3D. Thus, we needed to rethink and expand on this work. We wanted to automate the whole process: from a 3D brain scan, to a 2D segmentation at a specific level of the scan, to the area of the muscle. We decided on a slice-based approach with a 2D segmentation for several reasons: it is commonly used in medical imaging tasks; it is very difficult to outline the temporalis muscle at some levels of the 3D scan; it is not clear how much additional information a full 3D segmentation would provide compared to taking one slice where the muscle is wide and clearly distinguishable from surrounding tissues; and it would have taken an enormous amount of time to do manual segmentations on a larger number of images. For our approach, we created a pipeline combining two neural networks: one network was trained to segment the eyeball, and the other to segment the temporalis muscle as previously. We used the output of the first network to select one slice per patient at a desired level, based on a threshold. Then, once we had one slice at the desired level per patient, we fed these slices into the second neural network, which segments the muscle and outputs its area.
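The pipeline logic can be sketched as follows. The two segmentation models are passed in as plain functions (slice → binary mask), so the control flow can be shown without any deep-learning framework; the function names and the simple area-threshold rule are illustrative stand-ins for the trained networks and the real slice-selection criterion:

```python
# Sketch of the two-network pipeline: (1) segment the eyeball on every
# slice, (2) pick a slice whose eyeball area passes a threshold, standing
# in for "select one slice per patient at a desired level", (3) segment
# the temporalis muscle on that slice and report its area.

def mask_area(mask):
    """Number of object pixels in a binary mask (list of rows)."""
    return sum(sum(row) for row in mask)

def run_pipeline(volume, eye_model, muscle_model, eye_area_threshold):
    """volume: list of 2D slices making up a 3D scan."""
    for index, slice_2d in enumerate(volume):
        if mask_area(eye_model(slice_2d)) > eye_area_threshold:
            muscle_mask = muscle_model(slice_2d)
            return index, muscle_mask, mask_area(muscle_mask)
    raise ValueError("No slice reached the eyeball-area threshold")

# Stub "models" for illustration only: each maps a 2D slice to a mask.
eye_model = lambda s: s                    # pretend slices are eye masks
muscle_model = lambda s: [[1, 0], [0, 0]]  # pretend muscle segmentation

volume = [
    [[0, 0], [0, 0]],   # slice 0: no eyeball visible
    [[1, 1], [1, 0]],   # slice 1: eyeball area of 3 pixels
]
index, muscle_mask, area = run_pipeline(volume, eye_model, muscle_model,
                                        eye_area_threshold=2)
# index == 1, area == 1
```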

The road to the final pipeline was not easy or without obstacles, as was my first year in research. I’ve learned the importance of keeping track of every file and folder, of the methods used, and of the reasoning behind choosing a specific approach. The final product is not perfect – the code sometimes selects images at an incorrect level, so there’s still room for improvement. Nonetheless, some of my colleagues have already started using this tool in their research, and it is nice to see something that started off as my final-year project slowly moulding into an actual tool.

Using CTGAN to synthesise fake patient data

Being a member of the Computational Oncology lab with no more than my A levels (which I never sat!) and an unconditional offer to study Computer Science at Imperial College London in October 2021 has been a great opportunity which I am extremely grateful for, albeit a bit daunting, sitting alongside everyone with their variety of PhDs.

A large issue in the medical world is that patient data is highly confidential and private, making getting our hands on this limited resource difficult. One potential solution to this problem is using a GAN (Generative Adversarial Network) called CTGAN to generate realistic synthetic patient data, based on real, private patient data. The goal of the project I have been working on is to take in real tabular patient data, train a CTGAN model on the real data, and have the model output synthetic data that preserves the correlations between the various columns of the real data. This model can generate as many synthetic patients as one desires, the synthetic data can undergo the same analysis techniques that researchers would use on real patient data, and it can be made publicly available, as no private data is accessible through it.

There are many constraints that can be placed on the GAN to make the synthetic data more realistic. As of right now, there are four constraints that can be placed on the model. First, there is the ‘Custom Formula’ constraint, which could be used to preserve a formula such as ‘years taking prescription = age – age when prescription started’. Next, the ‘Greater Than’ constraint would ensure that ‘age’ is always greater than ‘age when prescription started’, and the ‘Unique Combinations’ constraint could be used to restrict ‘City’ to only be synthesised alongside the appropriate ‘Country’ value. Finally, there is a constraint called APII (Anonymising Personally Identifiable Information), which ensures that no private information is copied from the real data into the synthetic data. APII works with many different confidential fields, such as Name, Address, Country or Telephone Number, by replacing these fields with pre-set, non-existent entries from a large database called Faker Providers.

I have been working on adding a new constraint called ‘Custom ID’. This constraint applies when two columns are just encodings of each other, which occurs most frequently with ID columns. Without this constraint, the 100% correlation between a discrete column and its respective ID column would not be preserved. The constraint works by first comparing each column against every other column in the data; if any encodings are found, the discrete column is kept and the numeric ID column is removed. This is done because the ID column is an encoding of a discrete column but will be identified by CTGAN as continuous, and the column would therefore be modelled incorrectly. Once the ID column is removed, a lookup table is created which links each value in the discrete column to its respective ID. The CTGAN model is then trained on the data without the ID column. Finally, when sampling synthetic data, the IDs are added back into the synthetic data using the lookup table.
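The core of this idea can be sketched in plain Python (an illustrative re-implementation, not the actual CTGAN/SDV code; the function names and the example columns are made up for the sketch):

```python
# Sketch of the 'Custom ID' idea: detect a column that is a one-to-one
# encoding of a discrete column, drop it before training, and restore it
# from a lookup table after sampling.

def is_encoding(col_a, col_b):
    """True if the two columns form a consistent one-to-one mapping."""
    forward, backward = {}, {}
    for a, b in zip(col_a, col_b):
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True

def remove_id_column(table, discrete_name, id_name):
    """Drop the ID column; return (reduced_table, lookup_table)."""
    lookup = dict(zip(table[discrete_name], table[id_name]))
    reduced = {k: v for k, v in table.items() if k != id_name}
    return reduced, lookup

def restore_id_column(sampled, discrete_name, id_name, lookup):
    """Add the ID column back to sampled data via the lookup table."""
    sampled = dict(sampled)
    sampled[id_name] = [lookup[v] for v in sampled[discrete_name]]
    return sampled

table = {
    "gender": ["F", "M", "F", "M"],
    "gender_id": [0, 1, 0, 1],
    "age": [34, 51, 29, 62],
}

assert is_encoding(table["gender"], table["gender_id"])
reduced, lookup = remove_id_column(table, "gender", "gender_id")
# ...train the generative model on `reduced`, then sample synthetic rows...
synthetic = {"gender": ["M", "F"], "age": [45, 38]}
synthetic = restore_id_column(synthetic, "gender", "gender_id", lookup)
# synthetic["gender_id"] == [1, 0]
```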

This solution has the advantage of running quickly, as its time complexity does not depend on the number of rows in the real data. It is also easy to use, as it can be turned on and off with one input. Finally, my solution will identify all of the ID columns in the data and create lookup tables for each of them. One limitation is that the Custom ID constraint will not detect three columns that are correlated (e.g. ‘Gender’, ‘Gender Abbreviation’ and ‘Gender ID’), although this situation occurs rarely.

As I have the long-term goal of synthesising patient data, the synthetic data must be secure, in the sense that it must not be possible to reverse-engineer the real data from the synthetic data. One test, available in SDGym, is the LogisticDetection metric, which randomly combines the real and synthetic data and passes them to a discriminator, which attempts to flag each incoming row as real or synthetic. This test showed that the data can be correctly identified as real or synthetic just over 50% of the time. However, when it comes to analysis of medical data, I feel there are still steps to be taken to make the synthetic data more accurate before serious analysis can begin.

Dipping my toes into the world of machine learning has been extremely fascinating, and I have learned many new things, about both machine learning and coding more generally. I realise how important the complexity of my code is: if the code has a bad time complexity, the program could take days to run, which is not practical. Another lesson I learned the hard way is not to be afraid to restart. I completely rewrote my code after finishing a previous, working solution to the Custom ID constraint, because the code was too complex and was taking hours to run. This allowed me to learn from my mistakes and reach a much better solution.

I now hope to train a CTGAN model on the GlioCova dataset, which contains the medical records of over 50,000 cancer patients, and to measure how well CTGAN performs on a large relational database.





COVID and the return to Research

The 14th of March marked the end of my redeployment to a support role in Intensive Care, and the second interruption of my PhD due to the pandemic. Academic papers and lines of data were replaced by disembodied voices as I endeavoured to keep two wards’ worth of family members updated on their loved ones’ progress. With strict restrictions on visitation, this daily conversation was often the only insight into how their relatives were recovering, and in many cases, how they weren’t. Whether I’ll be due a third pandemic-related sabbatical remains to be seen. In the past few weeks, I’ve personally witnessed the steady downtick of COVID-related admissions. Beds filled with ventilated patients are now replaced by those in need of ITU-level monitoring following delayed essential procedures. Things are no less busy, but the grip of COVID has loosened and, at the very least, there is a measure of respite.

Today, I replace my ITU hat with the academic hat I hung up two months ago. My oncology hat continues to gather dust, awaiting its eventual turn. Reflecting on my time as a clinician, one of my motivations for delving into the world of code and computational solutions is their ability to capture and manipulate data that is often overlooked in day-to-day practice. Medical data is costly, in both time and manpower. Request forms, going through the scanning process, having labs do bloodwork, waiting for a report to be generated: these are all steps taken to produce what is often a singular data point, which is subsequently consigned to medical archives. As our technology advances, so too has the information we capture from investigations, as well as our ability to store and read it on a larger, integrated scale. This could enable the discovery of complex relationships that would otherwise not have fit onto a blackboard or spreadsheet. Pairing this with the zeitgeist that is the renewed interest in artificial intelligence, we now have the technology to realise complex manipulation of large datasets at a level previously unattainable, bursting open the barriers that previously held us back.

One avenue of unused data lies in opportunistic imaging. Cross-sectional imaging such as Magnetic Resonance Imaging (MRI) or Computed Tomography (CT) is commonly used in cancer care. These scans reconstruct “slices” through the scanned body, which clinicians can scroll through to visualise internal structures. In cancer, the main reason to do this is to evaluate how the cancer is responding to a treatment. Simply put, if a cancer is in the lung, the focus of imaging and attention will be the lung. But scanning the chest of a lung cancer patient invariably picks up other organs as well, including the heart, bones, muscle and fatty tissue present in all of us. Unless there is an obvious abnormality in these other organs (such as a grossly enlarged heart, or cancer deposits elsewhere in the chest), they are barely commented on and become unused by-products. There is information to be gained from reviewing these other organs, but until now there have not been the tools to fully realise it.

In a landmark 2009 study, Prado et al. measured muscle mass in obese cancer patients using CT scans obtained routinely during their cancer management. Due to its correlation with total muscle mass, the muscle area was measured at the level of the third lumbar vertebra (L3). This was subsequently corrected for height to give a skeletal muscle index. Prado et al. found a relationship between low muscle index and survival, creating the label of sarcopenic obesity, with a diagnostic cut-off determined by optimum stratification (<55 cm2/m2 for men, <39 cm2/m2 for women). Sarcopenia was a new concept in cancer care at that point, having previously been used mainly in an ageing context to define frailty associated with poor muscle mass and function. In the frailty literature, where there is limited access to CT or MRI imaging, assessments were usually functional (such as a defined set of exercises) or involved plain X-ray imaging of limbs, for reasons of practicality, cost and radiation dose.
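As a worked example, the skeletal muscle index is simply the L3 muscle area corrected for height squared, compared against the sex-specific cut-offs quoted above (a sketch; the variable and function names are my own):

```python
# Skeletal muscle index (SMI) as in the Prado et al. approach:
# cross-sectional muscle area at L3 (cm2) divided by height squared (m2),
# compared against the sex-specific optimum-stratification cut-offs.

CUTOFFS = {"male": 55.0, "female": 39.0}  # cm2/m2

def skeletal_muscle_index(l3_muscle_area_cm2, height_m):
    return l3_muscle_area_cm2 / height_m ** 2

def is_sarcopenic(l3_muscle_area_cm2, height_m, sex):
    return skeletal_muscle_index(l3_muscle_area_cm2, height_m) < CUTOFFS[sex]

smi = skeletal_muscle_index(150.0, 1.75)  # about 49.0 cm2/m2
is_sarcopenic(150.0, 1.75, "male")    # True  (49.0 < 55)
is_sarcopenic(150.0, 1.75, "female")  # False (49.0 >= 39)
```

Note how the same measured area falls on different sides of the cut-off depending on sex, which is exactly why sex-specific thresholds are needed.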

For cancer sarcopenia, assessment of muscle index has been repeated by other groups in single-centre studies across a variety of tumour types and geographic locations. Even when correcting for sex and height, there are enough other uncorrected factors that the range of reported cut-offs for pathological sarcopenia is too wide to be of practical utility (29.6-41 cm2/m2 in women and 36-55.4 cm2/m2 in men). Another limitation is that tumour sites do not always share imaging practices: in brain tumours, for instance, there is less need to look for extracranial disease, and thus no imaging is available at the level of the third lumbar vertebra for analysis.

In the current age of personalised medicine, being able to create individualised risk profiles based on the incidental information gained from necessary clinical imaging would add utility to scan results without adding clinical effort. For my PhD, the goal is to overcome this challenge using transfer learning from an existing detailed dataset. We’ve had the fortune to secure access to the UK Biobank, a medical compendium of half a million UK participants. The biobank includes results of biometrics, genomics, blood tests, imaging as well as medical history. Such a rich dataset is ripe for machine learning tasks.

I have therefore been working to integrate several high-dimensional datasets, applying a convolutional neural network to the imaging data and a deep neural network to the non-imaging data. A dimensionality reduction technique, such as an autoencoder, will subsequently be applied to generate a clinically workable model. Being primarily a clinician, I am able to bring clinical rationale to the model and intuit the origin of certain biases in my prototype pipelines. On the flip side, I have struggled to become fluent in the computational code necessary to tackle these problems, and often still feel like I am at the equivalent level of asking for directions to the bathroom back in my GCSE German days.
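The fusion idea can be sketched very crudely: each branch reduces its modality to a small embedding, and the embeddings are concatenated before a final predictive head. The toy numpy forward pass below is a stand-in for the real networks (shapes, weights and data are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative inputs: one 2D image slice and a vector of non-imaging features
image = rng.normal(size=(64, 64))    # e.g. a single CT/MRI slice
tabular = rng.normal(size=(10,))     # e.g. bloods, biometrics

# Imaging branch: one convolution plus global pooling, standing in for a CNN
kernel = rng.normal(size=(3, 3))
conv = np.zeros((62, 62))
for i in range(62):
    for j in range(62):
        conv[i, j] = np.sum(image[i:i + 3, j:j + 3] * kernel)
img_embedding = np.array([conv.mean(), conv.std()])   # crude global pooling

# Non-imaging branch: one dense ReLU layer, standing in for a deep network
W = rng.normal(size=(4, 10))
tab_embedding = np.maximum(W @ tabular, 0)

# Fusion: concatenate the two embeddings ahead of the predictive head
fused = np.concatenate([img_embedding, tab_embedding])
print(fused.shape)   # (6,)
```

In the real pipeline each branch is a trained network and the fused representation is what the autoencoder would subsequently compress.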

In becoming a hybrid scientist, I’ve long since acknowledged that I will not be the best coder in the room. I am still climbing the steep learning curve of computer languages and code writing, grateful for this opportunity to realise my potential in this field. Machine learning is ever encroaching not just on our daily lives but also on our clinical practice, usually for the better. I imagine that as early adopters turn into the early majority, those of us who have chosen to embrace this technology will be in a position to better develop and understand the tools that will benefit our future patients. After all, soon we will not just be collaborating with each other but also with Dr Siri and Dr Alexa.


Sleep walking into clinical data science

If I’ve learned one thing over the last five months in the Computational Oncology lab, it is that real world data is a whole different ball game.

As a fresh-out-of-undergrad master’s student with a background in cognitive neuroscience and biological sciences, a data science project wasn’t entirely in my toolkit. My programming skills amounted to a working knowledge of Java and some online courses in Python and machine learning picked up over the summer. When it came to MRes project choices, I found myself with a unique opportunity to gain experience in a lab, working with highly experienced researchers, clinicians and students. I figured a project in the Computational Oncology lab would be a challenge – but what better way to learn how to analyse large datasets or apply machine learning methods than to immerse myself in a project?

For the past 5 months, I have been working under Dr Seema Dadhania on a project from the ongoing BrainWear clinical study. Specifically, I was tasked with analysing sleep in patients with High Grade Gliomas, malignant primary brain tumours. Sleep disturbance is one of the most commonly experienced symptoms for patients with High Grade Gliomas. In fact, a 2018 study by Garg et al. found that disrupted sleeping behaviours were three times more prevalent in patients with primary brain tumours than in healthy controls, and were linked to decreased quality of life. Despite how pervasive and burdensome sleep problems can be in brain tumour patients, research remains scarce, and the studies that have been done use self-reported measures such as the EORTC QLQ-C30 and MDASI-BT, questionnaires which assess quality of life or symptom severity. These measures run the risk of subjectivity: patients may quantify difficulty with sleep in different ways, which skews the translatability of findings to real life. With the collection of longitudinal accelerometer data, BrainWear gives us the opportunity to objectively understand sleep patterns and the changes that occur with treatment.
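To give a flavour of what objective sleep analysis from accelerometry can look like (a toy heuristic of my own, not the method used in BrainWear): sustained runs of low-movement epochs can be flagged as a candidate sleep period, with the threshold and epoch length here entirely illustrative.

```python
import numpy as np

# Toy per-epoch movement summaries (e.g. standard deviation of acceleration
# within each one-minute epoch, in g). Low values suggest the wrist is still.
epoch_sd = np.array([0.21, 0.18, 0.03, 0.02, 0.01, 0.02,
                     0.01, 0.02, 0.15, 0.22, 0.19, 0.20])

THRESHOLD = 0.05   # illustrative cut-off, not a validated value
is_still = epoch_sd < THRESHOLD

# The longest run of consecutive still epochs is a crude candidate
# main sleep period
best_len, run = 0, 0
for still in is_still:
    run = run + 1 if still else 0
    best_len = max(best_len, run)

print(best_len)   # 6 consecutive still epochs
```

Real sleep-detection algorithms add wear-time checks, posture information and smoothing, but the underlying logic of finding sustained low-movement windows is the same.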


Patients and research

Where do patients fit in research?

How to brilliantly take over from Dr Seema Dadhania’s blog post when I am definitely not someone who keeps diaries (like Anne Frank or Carrie Bradshaw)?

Some suggest writing about yourself; others, about your research. Well, that is a shame, because there is nothing sufficiently interesting about me to make public, and my research is scattered over so many projects and obligations that it would be difficult to summarise in 900 words. Then the idea came to me after a patient and public involvement meeting organised and chaired by the talented Miss Lillie Pakzad-Shahabi: how do patients influence my work? Do I work for clinical staff or for patients in the National Health Service (NHS)?



BrainWear – correlating wearable data with clinical outcomes in brain tumour patients

A New Year turns, and with it brings my very first blog post for the Computational Oncology Group, in pole position on Matt Williams’ highly anticipated ‘2021 blog rota’… I am the chosen one. I will use this opportunity to share with you some highlights and lessons from my PhD thus far, which I hope will give you an insight into my work within the lab and with our patients, for whom we strive to deliver.

BrainWear (ISRCTN34351424) is a clinical study collecting wearable data, in the form of a wrist-worn Axivity AX3 accelerometer, from patients with primary and secondary brain tumours. It is the brainchild of one of our very own patients, who wanted to share his wearable data with us whilst having treatment, as an additional monitoring tool. One of the strengths of traversing both a computational and clinical PhD is learning that patients are an invaluable source of knowledge; they have undoubtedly helped shape the research we do and the questions we ask of our data. Our first BrainWear patient was recruited in October 2018 and we have since recruited 60+ patients, some providing more than a year of accelerometer data, and for the first time we will be able to objectively understand how daily activity changes with surgery, chemoradiotherapy and at the point of disease progression.

In essence, BrainWear is a feasibility study and asks simple questions: Is it possible for brain tumour patients to wear a wrist-worn device whilst having treatment? Do they find it acceptable? Is longitudinal data collection possible? But as we delve deeper into this first-of-its-kind dataset, there are layers of information in simple accelerometer data about patients’ physical activity levels, sleep patterns, gait (walking style) and quality of life. The passive nature of the data (collected without intervention from either patient or clinician) represents the patient objectively, and it is this aspect of the data, and its correlation with existing clinical parameters, which I have found so attractive. Our work now is to understand how these data change over time and with disease activity, but more interestingly, whether there are any early indicators of ill health, i.e. disease progression or hospitalisation. Can these data be used to explain, influence and/or predict health-related outcomes, and in turn be translated into a digital biomarker to guide clinical decision making?

Accelerometry data captured at a sampling frequency of 100 Hz (100 readings per second) rapidly expands in size and requires powerful data analysis and manipulation tools, for which I have used Pandas, built on top of Python 3, under the watchful eye of our lab data analyst. Whilst keen to explore how we can use machine learning (ML) methods to predict worsening disease, particularly around gait and sleep in patients with high grade gliomas, I am currently taking some time to understand the more traditional statistical methods of longitudinal data analysis, using mixed effects models. A recent systematic review by Christodoulou et al. showed no performance benefit of machine learning over logistic regression for clinical prediction models on low-dimensional data. I believe one of the strengths of training clinicians to manipulate and analyse data computationally is our ability to understand the clinical impact of the question being asked of the data; when relaying our findings to the clinical community, it is important that our methods are robust and represent the more traditional statistical approaches as well as novel ML methods. We have, however, taken the opportunity to capitalise on the ease of access to accelerometer data with our MSc students, developing support vector machine classifiers to identify pure walking bouts and extending existing work using neural networks to classify clean walking and gait characteristics. The UK Biobank, and the work done by Aidan Doherty’s team in processing and publishing findings on just under 100,000 participants’ 7-day accelerometry readings, have given us the opportunity to compare patients with high grade glioma to healthy UK Biobank participants. This will allow us to understand in greater depth how patients with brain tumours are impacted by their disease.
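As a rough sketch of the kind of processing the 100 Hz data goes through before any modelling, the UK Biobank accelerometer pipeline summarises tri-axial readings with ENMO (Euclidean norm minus one) averaged over epochs. The data below are simulated, and the epoch length is illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated 10 seconds of tri-axial data at 100 Hz (units of g)
samples = rng.normal(loc=0.0, scale=0.05, size=(1000, 3))
samples[:, 2] += 1.0   # gravity roughly along one axis when the wrist is still

# ENMO: Euclidean norm of the three axes, minus 1 g, negatives truncated to 0.
magnitude = np.linalg.norm(samples, axis=1)
enmo = np.maximum(magnitude - 1.0, 0.0)

# Average over 1-second (100-sample) epochs to shrink the data volume
epoch_means = enmo.reshape(-1, 100).mean(axis=1)
print(epoch_means.shape)   # (10,)
```

Even this simple reduction turns 300 numbers per second into one summary value per epoch, which is what makes week-long recordings from 60+ patients tractable.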

Digital remote health monitoring is becoming more mainstream in clinical and trials settings, fuelled by rapid development in sensor technology and in trying to provide increasingly patient centred care, particularly in the era of COVID-19. Correctly utilising these devices and data has the potential to provide a method of monitoring the patient in motion rather than the episodic snapshot we currently see and in turn improve our clinical decision making and patient outcomes.  In that spirit, in my next blog post I hope to update you on some preliminary analysis from our high grade glioma patient subset of BrainWear, and will discuss how I am going to tackle the non-trivial task of incorporating wearable data into clinical decision making models.

An Introduction

An Introduction to Computational Oncology

…..and – why a Blog?


These starter posts are always hard, so…..this blog is the companion to the lab website. But that is static, and tells you who is who and what papers we have published, but doesn’t let us talk about the work we are doing now, or the software we build that doesn’t get “published” (but is released), or the stuff that never gets published, or the stuff that is published but is complicated and would benefit from some explanation.

Hence a blog

It is also a way to force us all to write something more regularly than just papers and grant submissions. I am a great believer that writing IS research: the process of writing involves shaping your thoughts, deciding what you think is true from what you have evidence for, deciding how to structure an argument, and working out how to present that to an audience. I think that sounds like research.

It is also a good way to let others know what else we are doing; across Imperial, and more widely. It can be difficult from reading just published work to decide what people are really interested in. It also gives an opportunity for everyone in the lab to contribute – not just to writing, but also reviewing, editing and disseminating that work.

It also lets us rehost some older content that appeared on the (currently defunct) Computational Medicine blog that I ran with Caroline Morton: some of that content is still available, but before it vanishes, we will repost it here (who can fail to be interested in my take on why Jane Austen has some lessons for how we think about the deployment of AI-based systems in healthcare). Which leads us to the question of what Computational Oncology is.

Broadly, we define it as the mathematical and computational approaches to clinical problems in cancer. Our basic unit of analysis is the patient, rather than genetic data (as in bioinformatics) or populations (as in epidemiology), but we look to apply computational approaches. We do this because computational approaches let us do things we can’t normally do: they let us scale up, collect things we can’t normally collect, monitor in ways we can’t normally monitor, and extract previously hidden information from routinely acquired data.

This is a vague definition, and is as much a set of things it isn’t as what it is, but it has coherence, and we are beginning to see how our projects link together, linking work on automated imaging interpretation with patients who are enrolled in a clinical trial (more of this later). It also misses some of the most important things about our work: the focus on patient-centred work, the Patient & Public Involvement work we do, and the fact that a lot of the work is based around brain tumours.

We will update this every couple of weeks, as people post about updates of their work, and if you are interested, feel free to drop us a line: matthew.williams -at-