


Compressive Big Data Analytics: An ensemble meta-algorithm for high-dimensional multisource datasets.

Simeone Marino (1, 2), Yi Zhao (1), Nina Zhou (1)

  • (1) Statistics Online Computational Resource, Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, Michigan, United States of America.

PLOS ONE | August 29, 2020
Summary
This summary is machine-generated.

This study introduces Compressive Big Data Analytics (CBDA) 2.0, an enhanced machine learning method for analyzing large health datasets. CBDA 2.0 improves feature and model mining for reproducible biomedical discovery and clinical applications.

Area of Science:

  • Biomedical and Clinical Sciences
  • Data Science
  • Machine Learning

Background:

  • Advancing health research requires innovative methods for data-driven discovery.
  • Open-science and team-based approaches are crucial for managing complex, large-scale health data.
  • Reproducibility, replicability, and data curation are essential for translating health data into actionable knowledge.

Purpose of the Study:

  • To expand the functionality of Compressive Big Data Analytics (CBDA), an ensemble semi-supervised machine learning technique.
  • To enhance CBDA's capability in feature mining (identifying biomarkers) and model mining (selecting predictive algorithms) for high-dimensional health data.
  • To validate CBDA 2.0 using synthetic and real-world large-scale clinical data, including the UK Biobank.

Main Methods:

  • Utilized an ensemble semi-supervised machine learning technique (CBDA) with iterative subsampling, function optimization, and statistical inference.
  • Implemented novel features in CBDA 2.0 for handling extremely large datasets, generalizing validation, expanding base-learners, automating specification selection, and assessing convergence and accuracy.
  • Validated CBDA 2.0 on synthetic datasets and the UK Biobank, addressing challenges like data heterogeneity, missingness, and multicollinearity.
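
The iterative subsampling idea described above can be illustrated with a small sketch. This is not the authors' CBDA implementation: the function name `subsample_feature_mining`, its parameters, the ordinary least squares base learner, and the synthetic data are all assumptions chosen for illustration. Each iteration draws a random subset of rows and features, fits the base learner, and scores it on the held-out rows; features are then ranked by how often they appear in the best-scoring subsamples.

```python
import numpy as np


def subsample_feature_mining(X, y, n_iter=1000, row_frac=0.5, n_feat=3,
                             top_frac=0.2, seed=0):
    """Count how often each feature appears in the best random subsamples.

    Hypothetical helper for illustration only; not the CBDA API.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    n_train = int(row_frac * n_rows)
    records = []  # one (validation MSE, feature indices) pair per iteration

    for _ in range(n_iter):
        # Random split of rows into training and validation parts
        perm = rng.permutation(n_rows)
        train, valid = perm[:n_train], perm[n_train:]
        # Random subset of features for this iteration
        feats = rng.choice(n_cols, size=n_feat, replace=False)

        # Base learner (assumed): ordinary least squares with an intercept
        A_train = np.column_stack([np.ones(train.size), X[np.ix_(train, feats)]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)

        # Score the fitted model on the held-out rows
        A_valid = np.column_stack([np.ones(valid.size), X[np.ix_(valid, feats)]])
        mse = float(np.mean((A_valid @ coef - y[valid]) ** 2))
        records.append((mse, feats))

    # Keep the top fraction of subsamples with the lowest validation error
    records.sort(key=lambda r: r[0])
    n_top = max(1, int(top_frac * n_iter))

    counts = np.zeros(n_cols, dtype=int)
    for _, feats in records[:n_top]:
        counts[feats] += 1
    return counts


# Synthetic example (assumed): only features 2 and 7 drive the outcome
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 20))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + 0.5 * data_rng.normal(size=200)

counts = subsample_feature_mining(X, y)
ranking = np.argsort(counts)[::-1]  # feature indices, most frequent first
print("Most frequently selected features:", ranking[:5])
```

In this toy setting the informative features should dominate the frequency ranking, because subsamples that happen to include them achieve lower validation error. A fuller treatment would swap in multiple base learners and handle missing data, which this sketch does not attempt.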

Main Results:

  • Demonstrated the scalability, efficiency, and usability of CBDA 2.0 in interrogating complex health data.
  • Successfully predicted various health outcomes, including mood disorders and irritability, using UK Biobank data.
  • The enhanced CBDA 2.0 facilitates the identification, tracking, and treatment of mental health conditions and aging-related diseases.

Conclusions:

  • Compressive Big Data Analytics 2.0 offers a powerful and scalable solution for analyzing large, complex biomedical datasets.
  • The method supports reproducible research and collaborative discovery by providing robust feature and model mining capabilities.
  • Open-science principles are upheld by sharing protocols and code, enabling independent validation and further research in translational health.