Classifiers and their Metrics Quantified.

¹Kyoto University Graduate School of Medicine, Laboratory of Molecular Biosciences, 606-8501, E-109 Konoemachi, Sakyo, Kyoto, Japan.

Molecular Informatics

January 24, 2018

Summary

This summary is machine-generated.

Standard evaluations of classification models in molecular modeling can overestimate performance. This study proposes a new analysis of evaluation metrics to improve predictions for prospective experiments and to guide metric selection for better study design.

Keywords:

Classifiers, metrics, modeling, performance assessment, prediction

Area of Science:

Computational chemistry
Cheminformatics
Bioinformatics

Background:

Classification models are widely used in molecular modeling for predicting binary outcomes like bioactivity or protein interactions.
Common evaluation metrics (e.g., accuracy, true positive rate) may overestimate model performance on real-world prospective datasets.
Retrospective or artificially generated datasets can lead to misleading performance assessments.
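
The overestimation described above can be illustrated with a small sketch. The dataset sizes are hypothetical and the helper name `confusion_metrics` is an assumption made for this example, not something taken from the study. It compares accuracy with two balance-aware metrics for a trivial model that labels every compound inactive.

```python
import math

def confusion_metrics(tp, fn, tn, fp):
    """Compute a few common binary-classification metrics from confusion counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate (recall)
    tnr = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate (specificity)
    balanced_accuracy = (tpr + tnr) / 2
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: report MCC as 0.0 when it is undefined (zero denominator).
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "tpr": tpr,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }

# Hypothetical screening set: 50 actives and 950 inactives.
# A trivial model that predicts "inactive" for every compound:
trivial = confusion_metrics(tp=0, fn=50, tn=950, fp=0)

for name, value in trivial.items():
    print(f"{name}: {value:.2f}")
```

In this sketch the accuracy is high even though the model finds no actives, while the true positive rate, balanced accuracy, and MCC expose the failure. This is one simple way to see why a single metric read without regard to data balance can mislead.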

Purpose of the Study:

To address the overestimation of predictive model performance in molecular modeling.
To propose a novel method for analyzing metric performance based on data balance.
To provide guidance on selecting appropriate evaluation metrics for study design.

Main Methods:

Systematic analysis of how data balance influences the generation of metric value surfaces.
Development and application of an inverse cumulative distribution function over metric surfaces.
Theoretical analysis complemented by a practical chemogenomic virtual screening example.
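
A minimal sketch of the kind of analysis listed above, under stated assumptions: the metric surface is taken to be the metric evaluated over a grid of (true positive rate, true negative rate) pairs for a fixed fraction of positives, and the inverse cumulative distribution is read as the share of that surface at or above a threshold. The function names, grid resolution, and class fractions are all hypothetical, and the study's exact procedure may differ.

```python
def accuracy_at(tpr, tnr, pos_fraction):
    """Accuracy implied by a (TPR, TNR) pair for a given fraction of positives."""
    return pos_fraction * tpr + (1.0 - pos_fraction) * tnr

def metric_surface(metric, pos_fraction, steps=101):
    """Evaluate a metric on an evenly spaced (TPR, TNR) grid; returns a flat list."""
    grid = [i / (steps - 1) for i in range(steps)]
    return [metric(tpr, tnr, pos_fraction) for tpr in grid for tnr in grid]

def fraction_at_least(surface, threshold):
    """Share of surface points whose metric value is at or above the threshold."""
    return sum(1 for value in surface if value >= threshold) / len(surface)

# Hypothetical data balances: 50% positives versus 5% positives.
balanced = metric_surface(accuracy_at, pos_fraction=0.5)
skewed = metric_surface(accuracy_at, pos_fraction=0.05)

for label, surface in (("balanced", balanced), ("skewed", skewed)):
    share = fraction_at_least(surface, 0.9)
    print(f"{label}: share of surface with accuracy >= 0.9 is {share:.3f}")
```

Under these assumptions, a larger share of the surface reaches an accuracy of 0.9 when the data are skewed than when they are balanced, so the same accuracy value is easier to obtain and less informative on imbalanced data. Swapping `accuracy_at` for another metric function with the same signature would allow metrics to be compared on this basis.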

Main Results:

Demonstrated how data balance influences metric value surfaces.
Introduced a distribution analysis method for evaluating classification model performance.
Highlighted the critical importance of careful metric selection and interpretation in virtual screening.

Conclusions:

Standard performance metrics can be unreliable for prospective predictions.
The proposed distribution analysis offers a more robust approach to metric evaluation.
Informed metric selection is crucial for reliable molecular modeling and virtual screening outcomes.