



Choosing non-redundant representative subsets of protein sequence data sets using submodular optimization.

Maxwell W Libbrecht [1], Jeffrey A Bilmes [2], William Stafford Noble [1,3]

  • [1] Department of Genome Sciences, University of Washington, Seattle, Washington.

Proteins | January 19, 2018
Summary
This summary is machine-generated.

We introduce a submodular optimization approach for selecting representative protein sequences. The selected subsets are more structurally diverse than those produced by existing heuristic algorithms.

Keywords:
discrete optimization, diversity, protein sequence analysis, redundancy, representative subsets, submodular maximization


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Structural Biology

Background:

  • Selecting non-redundant sequence subsets is crucial for bioinformatics tasks like training models and metagenomics analysis.
  • Existing methods (e.g., CD-HIT, PISCES, UCLUST) use heuristic, threshold-based algorithms lacking theoretical guarantees.
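
To make "heuristic, threshold-based" concrete, here is a minimal sketch of a generic greedy threshold filter. It is not the algorithm of CD-HIT, PISCES, or UCLUST, which differ in many details; the k-mer Jaccard similarity, the longest-first ordering, and the names `kmer_set`, `jaccard`, and `threshold_filter` are all assumptions chosen for illustration.

```python
def kmer_set(seq, k=3):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (defined as 0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def threshold_filter(seqs, threshold=0.5, k=3):
    """Keep a sequence only if its similarity to every kept one is below threshold.

    Sequences are visited longest first (an assumed ordering); ties keep input order.
    Returns the indices of the kept sequences.
    """
    order = sorted(range(len(seqs)), key=lambda i: -len(seqs[i]))
    kept = []
    for i in order:
        ks = kmer_set(seqs[i], k)
        if all(jaccard(ks, kmer_set(seqs[j], k)) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy input: the first two sequences differ only in the last residue.
seqs = ["MKTAYIAKQR", "MKTAYIAKQK", "GGSGGSGGSG"]
print(threshold_filter(seqs, threshold=0.5))
```

The result depends entirely on the chosen threshold and visiting order, and the size of the output cannot be set directly. That is the kind of behavior the "no theoretical guarantees" remark refers to.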

Purpose of the Study:

  • To develop a new, theoretically grounded approach for selecting representative sequence subsets.
  • To improve the structural diversity of selected protein sequence sets.

Main Methods:

  • Utilized submodular optimization, a discrete optimization technique analogous to continuous convex optimization.
  • Applied the framework to select representative protein sequence subsets using the SCOPe library as a benchmark.
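
As a rough illustration of what selecting a subset by submodular optimization can look like, the sketch below greedily maximizes a facility-location objective over a pairwise similarity matrix. The choice of facility location, the toy integer "percent identity" matrix, and the names `facility_location` and `greedy_select` are assumptions for this example; the summary above does not specify the study's objective or implementation.

```python
def facility_location(sim, subset):
    """f(S) = sum over all items i of max_{j in S} sim[i][j]; f(empty set) = 0."""
    if not subset:
        return 0
    return sum(max(row[j] for j in subset) for row in sim)


def greedy_select(sim, k, objective=facility_location):
    """Pick up to k items, each time adding the one with the largest marginal gain.

    Ties are broken in favor of the lowest index.
    """
    selected = []
    remaining = set(range(len(sim)))
    for _ in range(min(k, len(sim))):
        base = objective(sim, selected)
        best = max(sorted(remaining),
                   key=lambda j: objective(sim, selected + [j]) - base)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy symmetric similarity matrix (hypothetical percent identities):
# items 0-1 form one group, items 2-4 another, with item 3 at its center.
sim = [
    [100,  90,  10,  10,  10],
    [ 90, 100,  10,  10,  10],
    [ 10,  10, 100,  80,  60],
    [ 10,  10,  80, 100,  80],
    [ 10,  10,  60,  80, 100],
]
print(greedy_select(sim, 2))  # one representative from each group
```

Unlike a threshold filter, the subset size `k` is requested directly, and the objective scores how well the chosen items represent the whole data set.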

Main Results:

  • Submodular optimization yielded protein sequence subsets with greater structural diversity than existing methods.
  • The approach consistently identified more SCOPe domain families in subsets of equivalent size compared to competing methods.
  • A flexible mixture objective function was designed to perform well for both large and small representative sets.
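
A mixture objective can be sketched as a weighted combination of simpler set functions; a non-negative weighted sum of submodular functions is itself submodular. The two component terms below (a facility-location coverage term and a pairwise redundancy penalty) and the `weight` parameter are assumptions for illustration, not the objective reported in the study.

```python
from itertools import combinations


def coverage(sim, subset):
    """Facility-location term: how well the subset represents every item."""
    return sum(max(row[j] for j in subset) for row in sim) if subset else 0


def redundancy_penalty(sim, subset):
    """Negative sum of pairwise similarities among the chosen items."""
    return -sum(sim[i][j] for i, j in combinations(subset, 2))


def mixture(sim, subset, weight=0.5):
    """Convex combination of the two terms; `weight` in [0, 1] is hypothetical."""
    return (weight * coverage(sim, subset)
            + (1 - weight) * redundancy_penalty(sim, subset))


# Toy symmetric similarity matrix (hypothetical percent identities).
sim = [
    [100,  90,  10,  10,  10],
    [ 90, 100,  10,  10,  10],
    [ 10,  10, 100,  80,  60],
    [ 10,  10,  80, 100,  80],
    [ 10,  10,  60,  80, 100],
]
# A spread-out pair scores higher than two near-duplicates from the same group.
print(mixture(sim, [3, 0]), mixture(sim, [2, 3]))
```

In this sketch, shifting `weight` toward 1 emphasizes representing the whole data set, while shifting it toward 0 emphasizes avoiding similar pairs within the subset.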

Conclusions:

  • Submodular optimization offers a theoretically sound and flexible framework for representative sequence selection.
  • This method surpasses existing heuristic approaches in capturing structural diversity and identifying diverse protein families.
  • The framework supports polynomial-time algorithms with provable approximation guarantees, making it a practical and intuitive tool for bioinformatics workflows.
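
For reference, the standard definition behind "theoretically sound" can be stated briefly. The notation ($V$ for the full set of sequences, $f$ for the objective, $k$ for the subset size) is introduced here for illustration. The guarantee shown is the classical result for monotone, non-negative submodular functions under a cardinality constraint; whether it applies unchanged to the objectives used in the study is not stated in this summary.

```latex
% Submodularity (diminishing returns): for all A \subseteq B \subseteq V
% and every element v \in V \setminus B,
f(A \cup \{v\}) - f(A) \;\ge\; f(B \cup \{v\}) - f(B).

% Classical greedy guarantee (monotone, non-negative f, constraint |S| \le k):
f(S_{\text{greedy}}) \;\ge\; \left(1 - \frac{1}{e}\right) \max_{|S| \le k} f(S).
```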