Data Synthesis Reinvented: Preserving Missing Patterns for Enhanced Analysis.

Xinyue Wang1, Hafiz Asif2, Shashank Gupta3

  • 1Renmin University, Beijing, China.

IEEE Transactions on Knowledge and Data Engineering | July 29, 2025
Summary
This summary is machine-generated.

This study introduces methods for generating synthetic data that preserve missing-data patterns, which standard techniques discard. Retaining these patterns keeps information that is otherwise lost and improves the utility of the synthetic data.

Keywords:
GAN, Missing Data, Synthetic Data Generation

Area of Science:

  • Data Science
  • Computer Science
  • Statistics

Background:

  • Synthetic data generation is crucial for privacy-preserving data sharing in healthcare, finance, and telecommunications.
  • Real-world data frequently contains missing values that encode significant behavioral information.
  • Current synthetic data methods often discard missing data, losing valuable insights.

Purpose of the Study:

  • To develop novel methods for generating synthetic data that preserve both observable and missing data distributions.
  • To retain the information encoded in missing data patterns within synthetic datasets.
  • To address limitations of existing synthetic data generation techniques.

Main Methods:

  • Proposed methods generate synthetic data by modeling and preserving the distributions of both observed and missing data.
  • The approach accommodates various missing data mechanisms and integrates with existing generation frameworks.
  • Techniques focus on retaining the statistical properties inherent in missing data patterns.
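To make the idea of modeling observed and missing data together more concrete, here is a minimal sketch of one common way to expose missingness to a tabular generator. It is not the method from the paper, whose details this summary does not give. It assumes missing values are represented as `None`, and the names `encode`, `decode`, and `FILL` are hypothetical. Each row is extended with one 0/1 flag per column, so a generator trained on the encoded rows sees the missing pattern as ordinary data, and decoding puts the gaps back.

```python
FILL = 0.0  # placeholder stored where a value is missing (an assumption of this sketch)


def encode(rows):
    """Turn rows containing None into rows of values followed by 0/1 missing flags."""
    encoded = []
    for row in rows:
        values = [FILL if v is None else v for v in row]
        mask = [1 if v is None else 0 for v in row]
        encoded.append(values + mask)
    return encoded


def decode(encoded_rows):
    """Invert encode(): put None back wherever the mask flag is set."""
    decoded = []
    for row in encoded_rows:
        half = len(row) // 2
        values, mask = row[:half], row[half:]
        # A generator may emit soft flags, so threshold them at 0.5.
        decoded.append([None if m >= 0.5 else v for v, m in zip(values, mask)])
    return decoded


real = [[1.0, None], [2.0, 5.0], [None, 7.0]]
augmented = encode(real)
# A tabular generator (not shown) would be fitted on `augmented` and sampled here;
# decoding its samples yields synthetic rows that carry their own missing pattern.
round_trip = decode(augmented)
```

Because the flags travel with the values, any dependence between what is missing and what is observed is at least available to the generator, which would not be the case if incomplete rows were dropped or imputed before training.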

Main Results:

  • Empirical evaluations on diverse datasets confirm the effectiveness of the proposed methods.
  • Generated synthetic data successfully preserves the information contained in missing data patterns.
  • The value of retaining missing data distribution in synthetic data is quantitatively demonstrated.

Conclusions:

  • The proposed methods offer a significant advancement in synthetic data generation by preserving missing data information.
  • This approach enhances the utility and fidelity of synthetic data for research and application.
  • Preserving missing data distributions is essential for creating more representative and informative synthetic datasets.