Missing value replacement in strings and applications.

Giulia Bernardini (1), Chang Liu (2), Grigorios Loukides (3)

  • (1) Department of Mathematics, Informatics and Geosciences, University of Trieste, Trieste, Italy.

Data Mining and Knowledge Discovery | January 27, 2025
Summary
This summary is machine-generated.

This study introduces a new algorithm to efficiently replace missing values in sequential data, minimizing introduced letters while respecting context and forbidden patterns. The method effectively sanitizes private strings and preserves clustering quality.

Keywords:
Forbidden patterns, Missing value replacement, String algorithms, String sanitization

Area of Science:

  • Computer Science
  • Bioinformatics
  • Data Science

Background:

  • Missing values are common in sequential data due to measurement errors, flexible modeling, or privacy concerns.
  • Analyzing such data requires efficient and effective methods for replacing missing values with valid characters.
  • Existing methods may not adequately address constraints like context and forbidden patterns.

Purpose of the Study:

  • To formalize the problem of replacing missing values in sequential data as a combinatorial optimization problem.
  • To develop an efficient algorithm for solving this problem, considering context and forbidden patterns.
  • To apply the algorithm for sanitizing private strings and clustering collections of strings.
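
To make the optimization view concrete, here is a toy brute-force formulation in Python. It is only an illustrative sketch, not the paper's formalization: the `#` marker for a missing value, the choice to let each marker be replaced by a short string (possibly empty), the `max_fill` bound, and the function names are all assumptions made for this example. It enumerates every replacement and keeps a valid result with the fewest introduced letters.

```python
from itertools import product

MISSING = "#"  # hypothetical marker for a missing value


def contains_forbidden(s, forbidden):
    """True if any forbidden pattern occurs as a substring of s."""
    return any(p in s for p in forbidden)


def brute_force_replace(s, forbidden, alphabet, max_fill=2):
    """Replace each MISSING by a string of length 0..max_fill.

    Returns a result with no forbidden pattern and the fewest introduced
    letters, or None if no combination of fills is valid.
    """
    parts = s.split(MISSING)
    gaps = len(parts) - 1
    # All candidate fills, shortest first.
    fills = ["".join(t)
             for n in range(max_fill + 1)
             for t in product(alphabet, repeat=n)]
    best = None
    for choice in product(fills, repeat=gaps):
        candidate = parts[0] + "".join(f + p for f, p in zip(choice, parts[1:]))
        if contains_forbidden(candidate, forbidden):
            continue
        if best is None or len(candidate) < len(best):
            best = candidate
    return best
```

The search is exponential in the number of missing values, so it only serves as a reference for what "valid" and "optimal" mean on very small inputs.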

Main Methods:

  • Formalizing the problem as finding shortest paths in graphs with forbidden edges.
  • Designing a linear-time algorithm for strings over constant-sized alphabets.
  • Applying techniques from formal languages and combinatorial pattern matching.
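
The shortest-path idea can be sketched for a single missing value as follows. This is a simplified illustration under stated assumptions, not the authors' linear-time algorithm: all forbidden patterns are assumed to have the same length `k`, the context is taken to be the `k-1` letters on each side of the gap, and `shortest_fill` is a hypothetical name. Nodes are strings of length `k-1`, appending a letter follows an edge labeled by a length-`k` string, and forbidden patterns are deleted edges; a breadth-first search then finds a shortest fill.

```python
from collections import deque


def kmers(s, k):
    """Yield all length-k substrings of s."""
    return (s[i:i + k] for i in range(len(s) - k + 1))


def shortest_fill(left, right, forbidden, alphabet, k):
    """Return a shortest x such that left + x + right has no forbidden
    length-k substring, or None if no such x exists.

    left and right are the (k-1)-letter contexts around the gap.
    """
    assert k >= 2 and len(left) == k - 1 and len(right) == k - 1
    forbidden = set(forbidden)

    def joins_right(node):
        # The only k-mers that overlap `right` lie inside node + right.
        return all(m not in forbidden for m in kmers(node + right, k))

    parent = {left: None}  # node -> (previous node, appended letter)
    queue = deque([left])
    while queue:
        node = queue.popleft()
        if joins_right(node):
            letters = []
            while parent[node] is not None:
                node, c = parent[node]
                letters.append(c)
            return "".join(reversed(letters))
        for c in alphabet:
            edge = node + c   # k-mer created by appending c
            nxt = edge[1:]    # new last k-1 letters
            if edge not in forbidden and nxt not in parent:
                parent[nxt] = (node, c)
                queue.append(nxt)
    return None
```

Because whether a fill can be joined to the right context depends only on the current node, the first node at which the join succeeds yields a shortest fill. The graph has one node per string of length `k-1`, so this sketch is practical only for small `k` and small alphabets.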

Main Results:

  • A linear-time algorithm for efficient missing value replacement in sequential data.
  • Demonstration of the algorithm's effectiveness in fully sanitizing private strings.
  • A methodology for sanitizing and clustering private string collections that preserves clustering quality.
  • Experimental results showing superior performance over state-of-the-art methods.

Conclusions:

  • The proposed algorithm efficiently handles missing values in sequential data under complex constraints.
  • The methodology offers effective privacy protection for string datasets while maintaining data utility for clustering.
  • This work advances the state of the art in data sanitization and analysis of sequential data.