

Large-scale composite hypothesis testing procedure for omics data analyses.

Annaïg De Walsche1,2, Franck Gauthier2, Nathalie Boissot3

  • 1Mathématiques et Informatique Appliquées Paris-Saclay, AgroParisTech, INRAE, Université Paris-Saclay, 91120 Palaiseau, France.

NAR Genomics and Bioinformatics, September 8, 2025
Summary
This summary is machine-generated.

The new qch_copula method effectively tests composite hypotheses using summary statistics, improving scalability and accurately detecting joint associations across multiple traits or omics levels.


Area of Science:

  • Bioinformatics and computational biology focusing on composite hypothesis testing.
  • Statistical genetics applied to multi-omic data integration and association mapping.

Background:

Genomic researchers frequently use summary statistics to evaluate how individual markers influence diverse phenotypes or molecular layers across multiple biological conditions. Established composite hypothesis testing frameworks identify complex association patterns across conditions or traits by aggregating information from several studies. These traditional procedures assume that markers behave independently across omics levels, which simplifies the underlying mathematical models and reduces computational work. However, many current algorithms hit computational bottlenecks when processing the massive datasets produced by modern high-throughput sequencing, which can involve millions of genetic variants. Existing software also often fails to control false positives when the analyzed traits are strongly correlated, leading to high error rates. These limitations motivated the creation of a scalable methodology that accounts for trait dependencies while maintaining rigorous error control in multi-omic data.

Purpose Of The Study:

The investigators developed the qch_copula framework to address the limitations of existing composite hypothesis testing methods in large-scale omics datasets. This novel approach seeks to integrate sophisticated mixture models with a flexible copula function to represent the joint distribution of multiple traits while accounting for statistical dependencies. By capturing the underlying dependencies between different molecular levels, the algorithm aims to provide more accurate P-values for complex biological hypotheses involving multiple markers. The researchers intended to create a tool that maintains high sensitivity for detecting joint association patterns without sacrificing the computational efficiency required for genomic analyses. Another primary objective involved optimizing the Expectation-Maximization (EM) algorithm to handle significantly larger numbers of markers and traits than previously possible with existing software. Ultimately, the work provides a robust statistical foundation for researchers exploring the multifaceted relationships between genetic variants and complex phenotypic landscapes in the big data era.

Main Methods:

The team implemented the qch_copula method by combining multivariate mixture models with a specific copula function to model trait-to-trait correlations within a unified framework. This mathematical architecture allows for the derivation of rigorously defined P-values for any given composite hypothesis involving multiple omics levels or phenotypic traits. To evaluate performance, the scientists conducted a comprehensive benchmark comparing their approach against eight distinct state-of-the-art statistical methods currently used for multi-trait association testing. The computational efficiency of the Expectation-Maximization (EM) algorithm was specifically tested by varying the number of traits and markers processed to determine software limits. Memory usage metrics were recorded during these simulations to quantify the scalability improvements offered by the new software implementation relative to other mixture model-based approaches. The final software package, named qch, was developed for the R programming environment and made publicly available through the Comprehensive R Archive Network (CRAN).
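To make the notion of a composite hypothesis concrete, the following minimal Python sketch enumerates the null/alternative configurations of a marker across traits and sums per-configuration posterior probabilities over a composite alternative of the form "associated with at least q traits". The function names and the example posteriors are hypothetical and are not taken from the qch package; how the package turns such quantities into P-values is not described in this summary and is not shown here.

```python
from itertools import product


def configurations(n_traits):
    """All 2^Q null/alternative configurations; 1 marks an associated trait."""
    return list(product((0, 1), repeat=n_traits))


def composite_alternative(n_traits, min_associated):
    """Configurations in which at least `min_associated` traits are associated."""
    return [c for c in configurations(n_traits) if sum(c) >= min_associated]


def composite_posterior(config_posteriors, n_traits, min_associated):
    """Sum one marker's per-configuration posteriors over the composite alternative."""
    h1 = composite_alternative(n_traits, min_associated)
    return sum(config_posteriors.get(c, 0.0) for c in h1)


# Hypothetical posteriors for a single marker across 3 traits (they sum to 1).
example = {c: 0.0 for c in configurations(3)}
example[(0, 0, 0)] = 0.50
example[(1, 0, 0)] = 0.20
example[(1, 1, 0)] = 0.25
example[(1, 1, 1)] = 0.05

# Posterior mass on "associated with at least 2 of the 3 traits".
print(composite_posterior(example, n_traits=3, min_associated=2))
```

The number of configurations doubles with every added trait, which is one reason the number of traits matters for scalability.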

Main Results:

Benchmarking results demonstrated that the qch_copula approach effectively controls Type I error rates across a wide range of simulated scenarios, even with high trait correlation. The method significantly enhanced the detection of joint association patterns compared to traditional procedures that ignore trait dependencies, providing higher statistical power for identifying variants. Computational analysis revealed that the new algorithm notably reduces memory consumption during the execution of the Expectation-Maximization (EM) process, facilitating the analysis of larger datasets. The software successfully processed datasets containing up to 20 distinct traits and between 100,000 and 1,000,000 individual genetic markers without exceeding standard computational resource limits. Performance gains were particularly evident in cases where strong correlations existed between the omics levels being investigated, where other methods often failed to control errors. The qch_copula framework consistently outperformed existing mixture model-based approaches in terms of both statistical accuracy and resource efficiency across all tested benchmark parameters.
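A rough back-of-the-envelope calculation suggests why memory can become the limiting factor at this scale. The sketch below assumes a naive implementation that stores one double-precision value per marker and per configuration in a dense matrix; this storage layout is an assumption made for illustration, not a description of how qch or any competing software works.

```python
def naive_posterior_bytes(n_markers, n_traits, bytes_per_value=8):
    """Bytes needed to hold one value per marker per configuration (dense storage)."""
    n_configs = 2 ** n_traits  # each trait is either null or associated
    return n_markers * n_configs * bytes_per_value


# Marker counts and trait count taken from the ranges reported above.
for n_markers in (10**5, 10**6):
    gib = naive_posterior_bytes(n_markers, 20) / 2**30
    print(f"{n_markers:>9,} markers x 2^20 configurations: {gib:,.0f} GiB")
```

Under this assumption the dense matrix would far exceed the memory of a typical workstation, so an implementation at this scale presumably has to avoid materializing it in full.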

Conclusions:

The researchers conclude that integrating copula functions into composite hypothesis testing provides a superior method for analyzing multi-omic datasets where dependencies between traits are significant. This statistical advancement allows for more reliable identification of pleiotropic effects and complex genetic architectures across diverse biological domains, from human disease to agricultural research. The authors state that the improved scalability of the qch_copula algorithm makes it suitable for modern large-scale genome-wide association studies involving high-dimensional phenotypic data. Future research may leverage this framework to explore the dependencies between transcriptomic, proteomic, and metabolomic data layers and so build a more integrated understanding of biological systems. The availability of the qch package on the Comprehensive R Archive Network (CRAN) facilitates the widespread adoption of these rigorous testing procedures by the scientific community. The study's findings emphasize the necessity of accounting for trait correlations to ensure the validity of complex hypothesis testing in genomics.

According to the study's authors, the copula function captures dependencies between traits or omics levels. This integration allows the mixture model to provide rigorously defined P-values, ensuring effective control of Type I error rates while enhancing the detection of joint association patterns across multiple molecular layers.
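As an illustration of how a copula can encode dependence between two traits, the sketch below evaluates the density of a bivariate Gaussian copula. The summary does not say which copula family the authors use, so the Gaussian choice, the function name, and the correlation values are all assumptions made for this example.

```python
from math import exp, sqrt
from statistics import NormalDist


def gaussian_copula_density(u, v, rho):
    """Bivariate Gaussian copula density at (u, v), for 0 < u, v < 1 and |rho| < 1."""
    std_normal = NormalDist()
    # Map the uniform margins to standard normal scores.
    x = std_normal.inv_cdf(u)
    y = std_normal.inv_cdf(v)
    one_minus_r2 = 1.0 - rho * rho
    exponent = -(rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * one_minus_r2)
    return exp(exponent) / sqrt(one_minus_r2)


# With rho = 0 the density is flat: the two traits are treated as independent.
print(gaussian_copula_density(0.01, 0.01, rho=0.0))
# With positive rho, jointly small values are more likely than independence predicts.
print(gaussian_copula_density(0.01, 0.01, rho=0.8))
# ...and discordant values (one small, one large) are less likely.
print(gaussian_copula_density(0.01, 0.99, rho=0.8))
```

A density above 1 marks combinations that occur more often than they would under independence, and a density below 1 marks combinations that occur less often. A procedure that assumes independence between correlated traits ignores exactly this information.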

The researchers demonstrate that their approach notably reduces memory usage during the EM algorithm. This optimization allows the software to analyze up to 20 distinct traits and between 10^5 and 10^6 markers, significantly exceeding the capacity of other mixture model-based procedures.

The EM algorithm was optimized to overcome memory usage bottlenecks common in existing mixture model-based approaches. This refinement enables the qch_copula method to handle large-scale omics data analyses involving up to 10^6 markers, as validated through benchmarks against eight state-of-the-art statistical methods.

The effectiveness of the qch_copula framework has been demonstrated only in the validation cases presented in the study. Specifically, the researchers confirmed the method's utility through two application cases in human and plant genetics, showing that it identifies complex association patterns within these specific biological systems.

The study's authors propose that the method be widely adopted for large-scale omics data analyses. They have made the procedure accessible by releasing the qch R package on CRAN, allowing other researchers to implement these rigorously defined P-value calculations in their own genetic studies.