Generative AI and unstructured audio data for precision public health

  • 1Center for Interventional Oncology, Radiology and Imaging Sciences, NIH Clinical Center, Bethesda, USA.
  • 2Computational Health Informatics Lab, Oxford Institute of Biomedical Engineering, University of Oxford, Oxford, UK.
  • 3Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam.
  • 4Morsani College of Medicine, University of South Florida, Tampa, FL USA.
  • 5College of Engineering, University of South Florida, Tampa, FL USA.
  • 6Department of Computer Science, McCormick School of Engineering, Northwestern University, Evanston, IL USA.
  • 7National Library of Medicine, National Institutes of Health, Bethesda, MD USA.
  • 8Feinberg School of Medicine, Northwestern University, Chicago, IL USA.

Abstract

In this study, transcribed videos about personal experiences with COVID-19 were used for variant classification. The o1 LLM was used to summarize the transcripts, excluding references to dates, vaccinations, testing methods, and other variables that were correlated with specific variants but unrelated to changes in the disease. This step was necessary to effectively simulate model deployment in the early days of a pandemic when subtle changes in symptomatology may be the only viable biomarkers of disease mutations. The embedded summaries were used for training a neural network to predict the variant status of the speaker as "Omicron" or "Pre-Omicron", resulting in an AUROC score of 0.823. This was compared to a neural network model trained on binary symptom data, which obtained a lower AUROC score of 0.769. Results of the study illustrated the future value of LLMs and audio data in the design of pandemic management tools for health systems.
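For readers unfamiliar with the metric, the AUROC values quoted above (0.823 and 0.769) can be read as the probability that a randomly chosen "Omicron" speaker receives a higher predicted score than a randomly chosen "Pre-Omicron" speaker. The sketch below is only an illustration of that definition, not the study's evaluation code; the function name `auroc` and the toy labels and scores are hypothetical.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 class labels (here, 1 could stand for "Omicron").
    scores: iterable of model scores, higher meaning more likely class 1.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 3 of 4 pairs are ordered correctly -> 0.75
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the two models in the abstract are compared.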
