Feature

Too much data: a burden or a blessing?

Has the data see-saw tipped the other way and are sponsors collecting too much data? Abigail Beaney investigates.

Credit: eamesBot / Shutterstock

As technology evolves and data becomes ever easier to collect, the question now is whether data collection has tipped too far.

Extra data can be incredibly helpful for sponsors - the standout example being Pfizer’s studies of Viagra as a blood pressure drug before the discovery of its potential in erectile dysfunction. But on the flip side, collecting extra blood samples or running extra tests puts strain not only on patients but also on sites.

The number of data points being collected by sponsors has increased dramatically over the past decade, with research conducted by the Duke-Margolis Health Policy Center showing a 283.2% increase in data points collected during Phase III trials.

Many argue that the data is not necessary and is complicating trials, but in earlier phases it is important for sponsors to be able to understand the candidate, says Dr. Michael Murphy, chief medical officer and co-founder of contract research organisation (CRO) Worldwide Clinical Trials.

“Some feel clinical programs are too complex, citing concerns like patient burden, increasing expense, and sample sizes which are not sufficient to yield statistically significant effects or to evaluate exploratory endpoints. This is very prominent in early-phase research, where sponsors are learning about their candidate,” Murphy says.

But how much data is too much data, and have we reached a point where it has become more of a burden than a blessing?

A burden?

Part of the problem is not the amount of data being collected but how it is used. Ali Pashazadeh, CEO of Treehill Partners, a global healthcare strategic and transaction advisory firm, says the company recently conducted a study showing that despite the increasing amount of data being collected, it is used poorly.

Research by Treehill showed that 85% of studies are not run on appropriate data sets and that 50% of pivotal trials fail due to badly designed protocols.

“The availability of data and the accuracy of data is critical to running the correct studies,” Pashazadeh explains. 

He goes on to say that of the 1,200 Phase II and III studies examined in the research, just 5% had been designed to be commercially relevant.

Part of the problem is not the amount of data that is being collected but how it is used.

Ali Pashazadeh, CEO of Treehill Partners

The burden does not sit only with sponsors, who hire staff or outsource services to data analysts, but also with others in the trial timeline, explains Catherine Gregor, chief clinical trials officer at Florence Healthcare, a clinical trial software vendor.

“It is a bit of a double-edged sword. Of course, it's nice to have all these data points and figure out as much as we can from a patient, but the trouble is that when we do that, it creates burden for both sites and patients,” Gregor says.

Patient burden can be an even bigger consideration for sponsors involved in rare disease trials, especially for patients who are already heavily burdened by their disease, as additional testing takes a greater toll on their health, says Alexander Seyf, CEO of Autolomous, a software development company specialising in cell and gene therapy manufacturing.

“When you are dealing with patients who are in their fourth or fifth line of therapy and potentially this therapy could be their last hope, it is important. They want to know that all the procedures they are going through will be beneficial for their disease. The industry needs to inform them of how their data will be used to both incentivise and ignite hope in these patients,” Seyf explains.

“It also creates more opportunity for deviation if you can't hit those data points and the data management burden is real. When we talk about reducing burden, and focusing on what matters, there's a big cultural shift now to ‘let's focus on the primary endpoints’,” Gregor adds.

However, on the other side of the coin, the increasing number of endpoints can also help inform considerations beyond marketing.

“Different measures or additional endpoints can be informative, especially in adaptive trial designs, where the information can be used to make decisions about the treatment process,” Murphy adds.

A blessing?

It may seem like nobody can win in the data minefield, but looking to the future, keeping historical data can be massively beneficial for sponsors.

“That's the Catch-22: if you have that data, you can use historical controls rather than having a control arm. Not just that, but you can look for signals that indicate something else,” Gregor explains. “The challenge is striking the right balance – do you want to overburden the patient today to help more patients in the future?”

One company now starting to harness previously collected data is diabetes giant Novo Nordisk. Speaking at the Veeva Europe R&D Summit in Madrid in June 2024, Thomas Senderovitz, senior vice president of data science, emphasised why all big pharma companies need to be considering this.

“Most, if not all, big pharma companies are walking on gold mines and it's pretty stupid to have gold mines if you don't mine the gold,” said Senderovitz. “We have data in a variety of areas where we may not have thought about patterns. When we were executing our clinical trial program, we were focused on diabetes, but there's a lot of information from those studies that we are now mining for.” 

Novo Nordisk is using historical data to decide where it can pivot next with Ozempic/Wegovy (semaglutide).

The challenge is striking the right balance – do you want to overburden the patient today to help more patients in the future?

Catherine Gregor, chief clinical trials officer at Florence Healthcare

Data collected in this way cannot be taken at face value, however, because of the lack of control over these outcomes, says Pina D'Angelo, vice president of biometrics at CRO Innovaderm.

“Using data like this could result in a type I error. The control is there so that we don’t make an error in the conclusion, so you have to be mindful when using the retrospective perspective,” D’Angelo warns. “Having said that, it is a wealth of opportunity and a great path to open up multiple channels.”

Artificial intelligence (AI) is being utilised by companies across the globe and, given its speed and power, may appear to be the solution to evaluating the vast amounts of data, but it cannot act alone and must be managed properly.

“We have always blamed the molecule for the failure, as opposed to considering whether we as humans designed the wrong study. In that space, AI will have a material impact. I think we will end up with AI becoming another voice in the room rather than a standalone tool,” Pashazadeh explains, though he adds that he believes the industry is around a decade away from utilising AI in this way.

The only way to improve AI systems is to feed them data, and while it is understandable that sponsors want to protect their data, if everyone is to mutually benefit from AI in data management, everyone needs to contribute.

“As a CRO, we use all the data we have, including historical data from previous studies to counsel sponsors on what the best endpoint might be or whether something needs to be evaluated separately,” D'Angelo explains.

Given how young the field of cell and gene therapy is, Seyf concludes that if the industry comes together to share data now, it is not too late for everyone to gain mutual benefit.

“We need to be more open in sharing data - we need to be more collaborative. Right now, the cell and gene therapy industry has such an immense and invaluable opportunity to really start sharing data. The industry is so young, and we can avoid all making the same mistakes as we learn together.”