Leveraging big data to transform health care through predictive modelling

This week, I’ll be presenting at the Erasmus+ Jean Monnet Conference on Competition, Big Data and Fundamental Rights on the topic of leveraging big data to transform health care through predictive modelling. I will look at how big data sets are already being used in predictive models to improve health care. I’ll also be exploring the challenges and opportunities of scaling up current usage to improve quality of care, health care demand forecasting and resource planning across entire health systems.

Big data sources for health care

Broadly speaking, there are three categories of big data that are already starting to be used in predictive models to transform health care. These are:

  1. Health system data from interactions patients have with the health system, including electronic medical records, admissions data and condition-specific data.
  2. Everyday data recorded about individuals as they go about their daily lives through wearable devices, social media, mobile apps and Internet of Things devices.
  3. Genomic data from both large-scale population-based genetic sequencing programs, such as the NHS Genomic Medicine Service, and direct-to-consumer genetic testing kits.

These big data sets are helping to transform health care through improving health care delivery, reducing the disease burden and achieving cost savings and efficiency gains.

Improving patient flow

Hospital admissions data can be used to improve the forecasting of demand for health care services. For example, the Patient Admission Prediction Tool has been developed to forecast hourly patient admissions by gender, medical condition and severity across 27 hospitals in Queensland, Australia. The model uses more than 10 years’ worth of patient admissions data as its input, combined with other attributes such as public events and the times of year when infectious disease outbreaks are most likely. This model has enabled hospital bed managers to better plan resources for the coming week based on forecast occupancy levels. There has been interest in scaling up the use of this model to a national level to help improve patient flow around the health service. The aim would be to prevent bottlenecks arising from overcrowding, for example by discharging lower-risk patients from hospital care into primary or community care to free up capacity during extreme surges in demand.
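To make the idea of forecasting hourly admissions from historical data concrete, here is a minimal sketch of a seasonal baseline. It is not the Patient Admission Prediction Tool's actual method: the function names, the hour-of-week averaging and the `event_uplift` multiplier are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import timedelta
from statistics import mean


def fit_hour_of_week_baseline(history):
    """Average past admissions for each (weekday, hour) slot.

    `history` is an iterable of (datetime, admissions_count) pairs,
    one per hour. This hypothetical baseline ignores condition,
    gender and severity, which a real model would include.
    """
    buckets = defaultdict(list)
    for timestamp, count in history:
        buckets[(timestamp.weekday(), timestamp.hour)].append(count)
    return {slot: mean(counts) for slot, counts in buckets.items()}


def forecast_admissions(baseline, start, hours, event_uplift=None):
    """Forecast `hours` hourly values starting at `start`.

    `event_uplift` is an assumed mapping from a date to a multiplier,
    standing in for attributes such as public events or outbreak season.
    """
    event_uplift = event_uplift or {}
    forecast = []
    for offset in range(hours):
        timestamp = start + timedelta(hours=offset)
        expected = baseline.get((timestamp.weekday(), timestamp.hour), 0.0)
        multiplier = event_uplift.get(timestamp.date(), 1.0)
        forecast.append((timestamp, expected * multiplier))
    return forecast
```

A bed manager could fit the baseline on past admissions and then call `forecast_admissions` for the next 168 hours to get a week-ahead view, raising the multiplier on dates with known public events.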

Improving patient outcomes

Data from health care apps is being used to improve patient outcomes through supporting clinical decision making, enabling individuals to better self-manage chronic conditions and advancing medical research.

The clinical app Streams, developed by Google’s DeepMind in collaboration with the Royal Free NHS Trust, has been used to improve prediction of outcomes for patients admitted to hospital with suspected acute kidney injury. The app used NHS health care data from approximately 1.6m patients to build algorithms that enable speedier diagnosis and treatment for those with acute kidney injury. The app escalates the results of clinical tests to medical specialists where the probability of the patient having acute kidney injury is high. This helps to identify patients in a critical condition more quickly, allowing faster medical intervention. Further development of this technology could help medical staff to improve clinical decision making and to prioritise patients according to how critical their condition is predicted to be, using data from medical tests in combination with the patient’s electronic medical records.

The everyday app MyCareCentric is being used in the UK to improve condition management for those with epilepsy. This app has been developed by Poole NHS Foundation Trust in collaboration with Microsoft to identify when the risk of an epileptic seizure is high. The app collates data from wearable devices worn by individuals to track factors including their sleep, exercise and heart rate. Algorithms analyse this data, and if an individual’s risk of an epileptic seizure is predicted to be high, the app remotely contacts their doctor so that preventative measures can be taken. Among the 50 people currently enrolled in the trial of the app, a 30% reduction in epilepsy-related hospital admissions has been seen. This shows the promise of this type of predictive technology in enabling individuals to better manage their chronic conditions outside of the hospital setting by using everyday data to help track their health.
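The general pattern described here, scoring risk from wearable data and alerting a clinician when the score crosses a threshold, can be sketched as follows. This is purely illustrative and not how MyCareCentric works internally: the logistic form, the feature names, the weights and the threshold are all made-up assumptions with no clinical basis.

```python
import math

# Hypothetical weights for illustration only; not clinically derived.
WEIGHTS = {
    "sleep_hours": -0.4,         # less sleep raises the score
    "resting_heart_rate": 0.05,  # higher heart rate raises the score
    "exercise_minutes": -0.01,   # more exercise lowers the score
}
INTERCEPT = -1.0


def seizure_risk(features):
    """Map wearable-derived features to a probability-like score in (0, 1)."""
    z = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def should_alert_clinician(features, threshold=0.5):
    """Return True when the predicted risk meets the assumed threshold."""
    return seizure_risk(features) >= threshold
```

In practice the threshold would trade off missed events against unnecessary alerts, and any real model would need to be fitted and validated on patient data.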

In terms of genomic data, there is potential for genetic data to further medical research. Apple’s ResearchKit has seen the launch of its first app that directly asks individuals for consent to collect genomic data, with researchers aiming to investigate a genetic basis for post-natal depression. Combining genomic data with health service and everyday data could help researchers develop more targeted treatments, for example through personalised medicine.

Forecasting and controlling epidemics

Big data is also being used to help improve forecasting of epidemics and management of epidemic outbreaks. International researchers developed a model to predict cholera outbreaks in Yemen up to 4 weeks in advance. Data inputs include rainfall radar forecasts that can predict rainfall within a 10 km radius, population density, access to clean water and seasonal temperature. This model has been used to send health workers to the areas with the highest likelihood of an imminent cholera outbreak to limit the spread of disease by providing sanitation advice, chlorine tablets and a range of other health measures. There is also the future potential to target vaccination campaigns if the model’s prediction window can be increased from 4 to 8 weeks. Looking ahead, researchers also want to extend the model to more regions and to help predict a wider range of diseases such as dengue fever and malaria.

Crowd-sourced sickness insights have also been highly valuable in controlling epidemics. Phone location data is being used to predict how disease spreads in an epidemic by tracking population movements, for example in Ebola outbreaks in recent years. Health services could couple location-based crowd-sourced sickness data with their own internal data to create more flexible epidemic forecasts. Based on these, medical care could be directed more effectively to high-risk regions and decision-makers could take proactive measures to limit the spread of disease, such as travel restrictions.

Social media and other third-party data sources can also be powerful indicators of illness. The Sickweather app uses language recognition technologies to scan social media networks to identify sickness outbreaks, and users of the app can also self-declare their health status. The app provides users with their location-based probability of sickness (e.g. cough, common cold, flu). If health services can couple this type of location-based crowd-sourced sickness data with their own internal statistical data on patient admissions, this could help to create more flexible patient flow forecasts that allow decision-makers to be more dynamic in their responses to changes in the health environment.

Efficiency gains & cost savings

Although the development of predictive models can be costly, significant investment returns are possible. An example of this is the Veterans Health Administration (VHA), the largest integrated health care system in the US. The VHA invested an estimated $1bn in the development of a data warehouse to centralise more than 30 years’ worth of patients’ electronic data across its national health system. Using this data, algorithms were developed to help predict patient outcomes including mortality and hospitalisation risks. The VHA made a net return on this investment of an estimated $3bn over a 7-year period. Centralisation of data means that re-testing is often no longer required if a patient attends a different hospital, and frontline care managers can use patient risk scores to help guide medical service provision.
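The arithmetic behind these figures can be worked through in a few lines. This sketch assumes that "net return" means total benefits minus the original investment, and that the return accrued evenly over the period; neither assumption is confirmed by the source, and the function name is hypothetical.

```python
def roi_summary(investment_bn, net_return_bn, years):
    """Summarise a return on investment, with all amounts in $bn.

    Assumes net return = gross benefit - investment, spread evenly
    across the period (an illustrative simplification).
    """
    return {
        "gross_benefit_bn": investment_bn + net_return_bn,
        "roi_multiple": net_return_bn / investment_bn,
        "net_return_per_year_bn": net_return_bn / years,
    }


# Estimated VHA figures quoted in the text: $1bn invested, $3bn net over 7 years.
vha = roi_summary(investment_bn=1.0, net_return_bn=3.0, years=7)
```

On these assumptions the investment returned three times its cost net, implying roughly $4bn in gross benefits, or a little over $0.4bn in net return per year.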

Long-term: transforming health care

If health service, everyday and genomic data sets could be pooled together and stored in a common language that allows data to be exchangeable across the various stakeholders in the health system, there is the potential to scale up these uses of predictive health care modelling.

To enable the use of big data for predictive modelling on a larger scale, it will be important to address data quality, data interoperability and data privacy concerns. These models will also need to be dynamic in responding to an ever-changing health landscape, such as the emergence of new diseases.

If these challenges can be overcome, then in the long term predictive modelling could be used to transform health care. A network-level strategy could be implemented across the health system to:

  • Better predict individuals’ anticipated health care needs.
  • Help suggest those treatments that have higher probability of efficacy, for example through personalised medicine.
  • More effectively direct patients into primary and secondary care taking into consideration the forecasted demand and resource constraints of the health service.

Predictive models that leverage big data in this way have the potential to facilitate significant improvements in the quality of health care delivered and in aiding stakeholders to better plan and manage resources.
