An Improved Similarity-Based Clustering Algorithm for Multi-Database Mining.

Salim Miloudi¹, Yulin Wang¹, Wenjia Ding¹

  • ¹School of Computer Science, Wuhan University, Wuhan 430072, China.

Entropy (Basel, Switzerland) | May 5, 2021 | PubMed
Summary
This summary is machine-generated.

This study introduces a novel learning algorithm to enhance multi-database mining (MDM) clustering by reducing similarity matrix fuzziness. The new method improves clustering accuracy and efficiency, outperforming existing algorithms.

Keywords:
binary entropy loss, clustering, coordinate descent, fuzziness, multi-database mining, similarity matrix

Area of Science:

  • Data Mining and Machine Learning
  • Database Systems
  • Artificial Intelligence

Background:

  • Clustering algorithms for multi-database mining (MDM) often struggle with indecisiveness when pairwise database similarities are near the mean.
  • This indecisiveness leads to trivial clustering results: either all databases fall into one cluster, or every database ends up in its own singleton cluster.
  • Existing gradient-based clustering methods can be sensitive to learning rates and require numerous iterations for convergence.
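
The indecisiveness described above can be illustrated with a small sketch. It is not the study's algorithm: it simply groups databases by thresholding a similarity matrix and taking connected components. The matrix `fuzzy_sim`, the function `threshold_clusters`, and both thresholds are hypothetical values chosen for illustration.

```python
from itertools import combinations

def threshold_clusters(sim, threshold):
    """Group items whose pairwise similarity is >= threshold (connected components)."""
    n = len(sim)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if sim[i][j] >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical similarity matrix: every off-diagonal value is close to 0.5.
fuzzy_sim = [
    [1.00, 0.52, 0.49, 0.51],
    [0.52, 1.00, 0.50, 0.48],
    [0.49, 0.50, 1.00, 0.51],
    [0.51, 0.48, 0.51, 1.00],
]

one_cluster = threshold_clusters(fuzzy_sim, 0.48)  # every pair passes
singletons = threshold_clusters(fuzzy_sim, 0.53)   # no pair passes
print(one_cluster)
print(singletons)
```

With similarities this close together, a threshold shift of only 0.05 moves the result from one trivial clustering to the other, which is the kind of outcome the study aims to avoid.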

Purpose of the Study:

  • To develop a learning algorithm that reduces the fuzziness of the similarity matrix in MDM.
  • To improve the certainty and accuracy of clustering algorithms in identifying optimal database clusters.
  • To propose a learning-rate-free algorithm for efficient candidate clustering assessment.
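
One common way to quantify the fuzziness of a single similarity value s in (0, 1) is its binary entropy. The form below is the standard unweighted definition, given only for orientation; the weighting used in the study is not stated in this summary.

```latex
H(s) = -s \log_2 s - (1 - s)\log_2(1 - s), \qquad
H(0.5) = 1, \qquad
\lim_{s \to 0^+} H(s) = \lim_{s \to 1^-} H(s) = 0 .
```

Under this reading, a similarity near 0.5 is maximally fuzzy, and pushing values toward 0 or 1 lowers the entropy.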

Main Methods:

  • A learning algorithm minimizes a weighted binary entropy loss function using gradient descent and back-propagation to reduce similarity matrix fuzziness.
  • A learning-rate-free algorithm utilizing coordinate descent (CD) and back-propagation is proposed for efficient clustering.
  • A max-heap data structure is employed within the CD algorithm to optimize variable selection and minimize a convex clustering quality measure L(θ) in fewer than (n^2-n)/2 iterations.
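
The first method can be sketched as follows, under stated assumptions. The sketch minimizes the plain (unweighted) binary entropy of each similarity value by gradient descent on its logit, with the chain rule through the sigmoid written out by hand. The study's weights, loss details, and stopping rule are not given in this summary, so `sharpen`, `learning_rate`, `steps`, and the sample values are all hypothetical.

```python
import math

def binary_entropy(s):
    """Binary entropy in bits: 0 at s = 0 or 1, and 1 at s = 0.5."""
    if s <= 0.0 or s >= 1.0:
        return 0.0
    return -s * math.log2(s) - (1.0 - s) * math.log2(1.0 - s)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(s, eps=1e-6):
    # Clip so the logit stays finite for values at exactly 0 or 1.
    s = min(max(s, eps), 1.0 - eps)
    return math.log(s / (1.0 - s))

def sharpen(similarities, learning_rate=1.0, steps=200):
    """Gradient descent on the binary entropy of sigmoid(z), one logit per value."""
    logits = [logit(s) for s in similarities]
    for _ in range(steps):
        for k, z in enumerate(logits):
            s = sigmoid(z)
            # Chain rule in natural-log units: dH/ds = ln((1 - s) / s) = -z
            # and ds/dz = s * (1 - s), so dH/dz = -z * s * (1 - s).
            # The 1/ln(2) factor for bits is absorbed into the learning rate.
            grad = -z * s * (1.0 - s)
            logits[k] = z - learning_rate * grad
    return [sigmoid(z) for z in logits]

values = [0.52, 0.49, 0.51, 0.48, 0.50]  # hypothetical near-mean similarities
sharpened = sharpen(values)
before = sum(binary_entropy(s) for s in values)
after = sum(binary_entropy(s) for s in sharpened)
print(before, after)
print(sharpened)
```

Values above 0.5 drift toward 1 and values below 0.5 drift toward 0, so total entropy falls. A value of exactly 0.5 has zero gradient in this unweighted form and does not move. The result also depends on `learning_rate` and `steps`, the sensitivity that the learning-rate-free variant is meant to remove.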

Main Results:

  • The proposed learning algorithm successfully reduces similarity matrix fuzziness, leading to improved clustering certainty and identification of optimal database clusters.
  • The learning-rate-free CD algorithm converges within an upper-bounded number of iterations, fewer than traditional gradient-based methods require.
  • Experimental results demonstrate that the novel algorithm outperforms existing clustering algorithms for MDM.
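
The bound (n^2-n)/2 equals the number of unordered database pairs, that is, the off-diagonal entries in the upper triangle of a symmetric n×n similarity matrix. The sketch below shows how a max-heap over those pairs caps the work at that count. It is a hypothetical greedy scan, not the study's coordinate descent, and it does not evaluate the quality measure L(θ); `candidate_clusterings` and the sample matrix are assumptions made for illustration.

```python
import heapq
from itertools import combinations

def candidate_clusterings(sim):
    """Pop pairs from most to least similar, recording each new clustering."""
    n = len(sim)
    # heapq is a min-heap, so similarities are negated to pop the largest first.
    heap = [(-sim[i][j], i, j) for i, j in combinations(range(n), 2)]
    heapq.heapify(heap)

    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    candidates, pops, components = [], 0, n
    while heap and components > 1:
        neg_s, i, j = heapq.heappop(heap)
        pops += 1
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1
            groups = {}
            for k in range(n):
                groups.setdefault(find(k), []).append(k)
            candidates.append((-neg_s, sorted(groups.values())))
    return candidates, pops

# Hypothetical matrix with two clear groups: {0, 1} and {2, 3}.
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
candidates, pops = candidate_clusterings(sim)
max_pairs = (len(sim) ** 2 - len(sim)) // 2
for similarity, clusters in candidates:
    print(similarity, clusters)
print(pops, "pops out of at most", max_pairs)
```

Each pop handles one pair, so the scan can never exceed (n^2-n)/2 steps, and it stops earlier once all databases have merged into a single cluster.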

Conclusions:

  • The developed learning algorithm effectively addresses the indecisiveness issue in MDM clustering by enhancing similarity matrix clarity.
  • The learning-rate-free approach offers a more efficient and robust method for database clustering, reducing computational complexity.
  • This research provides a significant advancement in MDM, offering improved accuracy and performance for database partitioning.