



Structured Matrix Completion with Applications to Genomic Data Integration.

Tianxi Cai (1), T Tony Cai (2), Anru Zhang (3)

  • (1) Professor of Biostatistics, Department of Biostatistics, Harvard University, Boston, MA.

Journal of the American Statistical Association | January 3, 2017
Summary
This summary is machine-generated.

We introduce structured matrix completion (SMC), a framework for recovering an approximately low-rank matrix when only a subset of its rows and columns is observed. Applied to genomic data integration, the method combines several ovarian cancer studies and improves prediction of patient survival.

Keywords:
Constrained minimization, genomic data integration, low-rank matrix, matrix completion, singular value decomposition, structured matrix completion


Area of Science:

  • Statistics
  • Applied Mathematics
  • Electrical Engineering
  • Genomics
  • Bioinformatics

Background:

  • Matrix completion is widely used across scientific fields, but most existing methods assume that the observed entries are sampled independently.
  • This assumption is restrictive when the missingness pattern follows from how the data were collected, as when integrating genomic datasets that measure different sets of variables.
  • Structured missingness, where the observed entries are not sampled independently, therefore calls for a different approach to matrix completion.
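
To make the distinction concrete, the sketch below builds two observation masks for a hypothetical 6 x 8 matrix: one in which each entry is observed independently at random, and one with a structured pattern in which whole rows and columns are observed and a contiguous block is missing. The matrix size, the sampling probability, and the function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def random_mask(shape, p, rng):
    """Independent sampling: each entry is observed with probability p."""
    return rng.random(shape) < p

def structured_mask(shape, m1, m2):
    """Structured missingness: the first m1 rows and the first m2 columns
    are fully observed; the remaining bottom-right block is missing."""
    mask = np.zeros(shape, dtype=bool)
    mask[:m1, :] = True
    mask[:, :m2] = True
    return mask

rng = np.random.default_rng(0)
uniform = random_mask((6, 8), p=0.5, rng=rng)
blocked = structured_mask((6, 8), m1=3, m2=4)

# 1 = observed, 0 = missing
print(blocked.astype(int))
```

In the structured mask, the missing entries form a single block, so no entry of that block is ever observed directly. Methods that rely on randomly scattered observations have nothing to work with inside the block.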

Purpose of the Study:

  • To propose a new framework, structured matrix completion (SMC), to address structured missingness in matrix recovery.
  • To develop an efficient method for recovering approximately low-rank matrices when only subsets of rows and columns are observed.
  • To provide theoretical guarantees and demonstrate practical utility in genomic data integration for improved survival prediction.
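
The recovery problem can be written compactly. As a sketch, with block notation introduced here for illustration (it is an assumption, not quoted from the paper), order the rows and columns so that the observed ones come first:

```latex
A =
\begin{pmatrix}
A_{11} & A_{12} \\
A_{21} & A_{22}
\end{pmatrix},
\qquad
\text{observed: } A_{11},\ A_{12},\ A_{21};
\qquad
\text{missing: } A_{22}.
```

In the idealized case where $A$ has exact rank $r$ and the observed corner satisfies $\operatorname{rank}(A_{11}) = r$, the missing block is determined by the observed ones:

```latex
A_{22} = A_{21} A_{11}^{\dagger} A_{12},
```

where $A_{11}^{\dagger}$ is the Moore-Penrose pseudoinverse. The approximately low-rank setting described above relaxes this identity, which is where the estimation error bounds come in.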

Main Methods:

  • Developed a novel structured matrix completion (SMC) framework designed for matrices with structured missing entries.
  • Focused on efficient recovery of approximately low-rank matrices when only a subset of the rows and columns is observed.
  • Provided theoretical analysis, including lower bounds for estimation errors, to establish optimal recovery rates.
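
The idea of recovering a missing block from observed rows and columns can be illustrated with a minimal NumPy sketch. This is not the authors' exact SMC algorithm, whose details (including how the rank is chosen) are not given in this summary. It assumes an exactly low-rank matrix, a rank supplied by the caller, and a rank-truncated pseudoinverse of the observed corner block; the function name `recover_missing_block` and all dimensions are hypothetical.

```python
import numpy as np

def recover_missing_block(A11, A12, A21, rank):
    """Estimate the unobserved block A22 as A21 @ pinv_r(A11) @ A12.

    pinv_r is a pseudoinverse of A11 truncated to its top `rank` singular
    values. The rank is an input here (an assumption of this sketch);
    choosing it from the data is outside the scope of the example.
    """
    U, s, Vt = np.linalg.svd(A11, full_matrices=False)
    U_r, s_r, Vt_r = U[:, :rank], s[:rank], Vt[:rank, :]
    # Truncated pseudoinverse: V_r diag(1 / s_r) U_r^T
    A11_pinv = Vt_r.T @ np.diag(1.0 / s_r) @ U_r.T
    return A21 @ A11_pinv @ A12

# Synthetic exactly rank-3 matrix, partitioned into four blocks.
rng = np.random.default_rng(0)
p1, p2, r = 30, 40, 3   # full matrix dimensions and true rank
m1, m2 = 10, 12         # number of observed rows and columns
A = rng.standard_normal((p1, r)) @ rng.standard_normal((r, p2))

A11, A12 = A[:m1, :m2], A[:m1, m2:]
A21, A22 = A[m1:, :m2], A[m1:, m2:]   # A22 is treated as unobserved

A22_hat = recover_missing_block(A11, A12, A21, rank=r)
rel_err = np.linalg.norm(A22_hat - A22) / np.linalg.norm(A22)
print(f"relative error: {rel_err:.2e}")
```

In this noiseless example the recovery is exact up to floating-point error. With noisy or only approximately low-rank data, truncating the small singular values keeps the inverse from amplifying noise, and the choice of rank becomes the main tuning decision.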

Main Results:

  • Established theoretical justification for the proposed SMC method, showing that it attains the optimal rate of recovery over certain classes of approximately low-rank matrices.
  • Simulation studies confirmed the method's strong performance in finite samples across various configurations.
  • Demonstrated successful application in integrating multiple ovarian cancer genomic studies, enhancing prediction accuracy for patient survival.

Conclusions:

  • Structured matrix completion (SMC) offers an effective solution for matrix recovery problems with structured missing data.
  • The method provides theoretical guarantees and practical advantages, particularly in complex biological data integration.
  • SMC enables the construction of more accurate predictive models, as evidenced by its application to ovarian cancer survival prediction.