Castleman disease: does machine learning hold the key to better treatment?
Combining large omic and clinical trial data sets with machine learning algorithms has allowed Medidata and CDCN to identify six new patient subgroups of Castleman disease. How can this discovery advance diagnostics and drug development for this rare disease? Allie Nawrat finds out.
After hearing a talk by Castleman Disease Collaborative Network co-founder Dr David Fajgenbaum as part of a lecture series about challenging topics in the life sciences, artificial intelligence firm Medidata decided to stay in touch with Fajgenbaum and his company.
Medidata chief data officer David Lee explains: “One day we were talking [with Dr Fajgenbaum] about how far the [Rave Omics] product and machine learning has come, and he was telling us about how much data he has been able to collect on [Castleman disease].
“Then as a proof of concept, we decided to work on this collaboration together.”
Combining CDCN’s large data sets from patients with this rare, inflammatory disease with Rave Omics’ machine learning capabilities has yielded novel insights into the disease. These insights help researchers to understand which patients will respond best to existing treatments and to identify new drug targets for patients who respond less well.
R&D challenges facing rare diseases
There are significant challenges to successful research and development (R&D) for rare diseases, including a lack of participants for clinical trials and limited investor interest in committing the resources that discoveries require.
Castleman disease also faces additional R&D barriers, which is why Fajgenbaum and his co-founder Dr Frits van Rhee created the CDCN in 2012. CDCN accelerates research into and treatments for the disease with the help of crowdsourced funding.
The disease is difficult for physicians to identify and diagnose because it is not just one condition and its symptoms often overlap with other conditions. “People haven’t really come across the condition before, it is often misdiagnosed as rheumatoid arthritis or cancer when it is not either of those,” says Lee.
Castleman disease comprises a group of three inflammatory diseases linked to the lymph nodes, in which inflammatory cells become hyperactivated when fighting off an infection and produce excess chemokines and cytokines. This leads to flu-like symptoms, enlarged lymph nodes and vital organ dysfunction.
The most deadly subtype of Castleman disease, idiopathic multicentric Castleman disease (iMCD), has “lagged far behind”. Fajgenbaum, who was diagnosed with iMCD while at medical school, said: “iMCD stumped my doctors, and they didn’t think I would survive.”
Using machine learning to overcome rare disease challenges
During the collaboration with CDCN, Medidata’s Rave Omics system was used to “integrate the omic data and the clinical trial data for all the studies that [Fajgenbaum] had access to” before machine learning was run on top to identify biological patterns in the data sets, according to Lee.
Fajgenbaum says CDCN had access to “1,300 analytes in the serum from 100 iMCD patients in active disease and 100 control samples.”
Using machine learning to connect omic data and clinical trial data allowed the system to identify six distinct omic signatures and, therefore, “six groups of [iMCD] patients that were completely novel to the medical community”, one of which “had triple the response rate to the available treatment than all the other groups,” Lee explains.
This is possible because omic data captures biological signatures from molecules such as genes and proteins, which researchers use to understand the larger structures, functions and dynamics of whole organisms and how complex diseases work.
Medidata relies upon omic data “generated by machine, sequencer machines, and they curl off hundreds or thousands of features of each patient”; Rave Omics then reduces this into “more simple analyses that can be actionable,” according to Lee.
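To make the workflow Lee describes more concrete, here is a minimal sketch in Python of its two stages: integrating omic and clinical records by patient, then grouping patients by omic profile and comparing response rates across the groups. This is an illustration only, not Medidata's method. The article does not say which algorithms Rave Omics uses, so k-means clustering stands in here as an assumption, and every name and value below (`omic`, `clinical`, `integrate`, `kmeans`, `response_rates`, the patient IDs and analyte levels) is hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data; none of these values come from the CDCN study.
# omic: patient ID -> analyte levels; clinical: patient ID -> responded to treatment?
omic = {
    "p1": [0.9, 0.1],
    "p2": [1.0, 0.2],
    "p3": [0.1, 0.9],
    "p4": [0.2, 1.0],
    "p5": [0.8, 0.0],  # no matching clinical record
}
clinical = {"p1": True, "p2": True, "p3": False, "p4": True}


def integrate(omic, clinical):
    """Keep only patients present in both sources, keyed by patient ID."""
    return {pid: (omic[pid], clinical[pid]) for pid in sorted(omic) if pid in clinical}


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, start, iterations=10):
    """Plain k-means with caller-supplied starting centroids (assumed stand-in)."""
    centroids = [list(c) for c in start]  # copy so the caller's data is untouched
    labels = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def response_rates(labels, responded):
    """Fraction of responders within each cluster."""
    totals = defaultdict(lambda: [0, 0])  # label -> [responders, patients]
    for label, ok in zip(labels, responded):
        totals[label][0] += int(ok)
        totals[label][1] += 1
    return {label: hits / n for label, (hits, n) in totals.items()}


merged = integrate(omic, clinical)
points = [features for features, _ in merged.values()]
responded = [flag for _, flag in merged.values()]
labels = kmeans(points, start=[points[0], points[2]])
rates = response_rates(labels, responded)
print(labels, rates)
```

The integration step runs first because a patient without a matching clinical record cannot contribute to a response-rate comparison, however the groups are found. A real analysis would involve far more analytes and patients than this toy example.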
Discussing the collaboration’s findings, Lee says: “We are extremely excited about the results; they signal to the world what machine learning can be used for in order to tease out these important patterns in omic data types, which will have an immediate impact on the way diseases should be treated.”
Fajgenbaum notes that these results can also be used to “further probe into what's going on in the other five subgroups that aren't improving with that drug, so we can identify new treatments to help them.”
Lee continues: “The other win is that it demonstrates that data management in the age of big data is still supremely important; if we didn’t take the time to integrate the omic data with the clinical trial data properly then we would never have been able to find these signals no matter what types of algorithms we had used.”
The future of the Castleman disease collaboration
The next stage for the Medidata-CDCN partnership is to publish these results and then to “show that the patterns we identified replicate in other data sets for patients that we haven’t seen before”, described by Lee as “the validation step”.
“If we can validate that we can use omic data to identify which patients will improve on the current drug and which patients are likely not to improve, then we are getting a step closer to personalised medicine.”
Fajgenbaum agrees: “The study will have an important impact on the iMCD patient and research and drug development for iMCD.”
Lee also believes the insights from the collaboration can aid the diagnosis of Castleman disease. He says: “There are some early results that we can help with diagnosing Castleman patients in the first place… if we could use the big omic data results to help positively identify Castleman patients, even if their clinician has never seen a Castleman patient before, this will help to get patients on to the right treatment path from day one.”