At a glance

By leveraging advanced analytics and generating forecasts, we strive to enhance responses to public health emergencies. This involves recognizing historical patterns in signals that may indicate changes in human behavior. Such insights can assist decision-makers in anticipating substantial shifts in healthcare usage.

We analyze, in real time, how the general public searches for disease-specific information on the Internet. Specifically, we monitor anonymized search activity for terms such as “causes of fever” or “flu virus”. Our real-time systems have identified upcoming surges in hospitalizations 2 to 6 weeks before they were reported by traditional disease surveillance systems.

How Internet searches can improve monitoring of respiratory infection activity

Internet search activity related to respiratory diseases has been shown to closely mirror a population's health status. We frequently observe that when a high proportion of the population is affected by a respiratory disease (for example, influenza), Internet searches related to that disease also tend to increase. When disease levels are very low, we observe very little related search activity. An increase in searches for symptoms associated with respiratory infections in a particular region could thus indicate the beginning of a disease outbreak.

There are several advantages to using timely Google searches in respiratory disease surveillance systems. First, Google search trends are available in near-real time, allowing our surveillance efforts to monitor the population’s awareness of an imminent rise in infection cases without the delays that affect sources of traditional surveillance data. Second, search data can capture important changes in human behavior that can differ across demographic and geographic groups. Finally, this search data is widely accessible and can therefore be used by researchers, public health officials, and other stakeholders.

When someone begins exhibiting symptoms, such as a fever, cough, or body aches, they often turn to the internet for answers. They may search for information about their symptoms to determine the type of disease they have developed. These individual searches, when aggregated with countless others, form a collective pool of data. This accumulation of searches serves as an early indicator or "signal" of respiratory disease activity within a population.

Internet-related data can help us forecast hospital admissions

Respiratory disease-related search volumes complement existing surveillance systems and have been shown to be most effective when combined with other types of data. We are currently developing Early Warning Systems that leverage internet-search data, along with official health reports on respiratory diseases (such as influenza, RSV, and COVID-19), to address the following tasks:

1. Anticipating the start of potential epidemic surges

2. Estimating the timing of a surge’s peak

How our Early Warning System works

Our proposed methods leverage information from multiple internet-based data sources, commonly called digital traces because they are collected as people navigate the internet and serve as proxies of human behavior. The early warning system (EWS) framework is designed to anticipate sharp increases in the transmission of respiratory diseases (such as influenza, RSV, and COVID-19), as identified by changes in the effective reproduction number (Rt), an outbreak indicator preferred by the community of epidemiologists.

The early warning system operates in the following way. First, we gather raw data from respiratory disease-related internet searches along with the currently available hospital admission reports from the CDC. Next, we pre-process this raw data to remove noise and extraneous information, and then quantify the trend of each proxy over the past six weeks. Once the trends are established, we pinpoint the moments when exponential growth in respiratory disease activity begins. These critical points signal the potential onset of an outbreak. Using these insights, we construct a data-driven model that projects the likelihood that an epidemic surge will start in a specific location within the upcoming six weeks, providing a timely alert for preventive actions.
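The trend and onset steps described above can be illustrated with a minimal sketch. It is not our production method: the function names, the 3-week smoothing window, the growth threshold (min_slope), and the persistence rule are all illustrative assumptions. Only the six-week trend window comes from the description above. The sketch smooths a weekly proxy series, fits a straight line to the logarithm of the last six weeks (a positive slope on the log scale corresponds to exponential growth), and flags the first week at which that growth is sustained.

```python
import math

def moving_average(values, window=3):
    """Trailing moving average; early points average over what is available."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def log_slope(values):
    """Least-squares slope of log(1 + value) against the week index."""
    n = len(values)
    xs = range(n)
    ys = [math.log1p(v) for v in values]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

def find_onset(series, trend_weeks=6, min_slope=0.15, persist=2):
    """Return the first week index at which the six-week log-scale trend
    exceeds min_slope for `persist` consecutive weeks, or None.
    min_slope and persist are hypothetical tuning values."""
    smoothed = moving_average(series)
    run = 0
    for t in range(trend_weeks - 1, len(smoothed)):
        slope = log_slope(smoothed[t - trend_weeks + 1: t + 1])
        run = run + 1 if slope > min_slope else 0
        if run >= persist:
            return t - persist + 1
    return None

# Made-up weekly proxy: flat for ten weeks, then growing 50% per week.
toy_series = [10.0] * 10 + [10.0 * 1.5 ** k for k in range(1, 9)]
onset_week = find_onset(toy_series)
```

In practice a rule like this would be applied to each proxy (search terms, hospital admissions) separately, and the resulting onset signals would feed the model that estimates the likelihood of a surge.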

How useful and accurate is the Early Warning System?

A retrospective analysis of the 2023-2024 season, our most recent, shows that the system can anticipate national and state-level influenza surges: it detected the start of the national epidemic and of the state-level epidemics in 47 of the 50 states up to 6 weeks in advance.

Additionally, a retrospective analysis performed with data from 2010 to 2024 shows that our Early Warning System accurately detected 80% of state and national influenza surges up to 6 weeks in advance.
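To make a figure like "80% of surges detected up to 6 weeks in advance" concrete, the sketch below shows one way such a detection rate could be computed. The scoring rule is an assumption for illustration: a surge counts as detected if at least one alert was issued between 0 and 6 weeks before its onset. The function name and the toy data are hypothetical and do not reproduce our results.

```python
def detection_rate(onsets, alerts, max_lead_weeks=6):
    """Fraction of surges with an alert issued 0 to max_lead_weeks weeks
    before onset (assumed scoring rule).

    onsets: dict mapping location -> week index of the surge onset
    alerts: dict mapping location -> list of week indices with an alert
    """
    if not onsets:
        return 0.0
    detected = 0
    for location, onset in onsets.items():
        leads = [onset - alert for alert in alerts.get(location, [])]
        if any(0 <= lead <= max_lead_weeks for lead in leads):
            detected += 1
    return detected / len(onsets)

# Made-up example: only location "A" receives an alert inside the window.
toy_onsets = {"A": 20, "B": 30, "C": 25}
toy_alerts = {"A": [16], "B": [35], "C": [10]}
rate = detection_rate(toy_onsets, toy_alerts)
```

A complete evaluation would also track false alarms (alerts not followed by a surge), which this sketch does not measure.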