Applicability domain based on ensemble learning in classification and regression analyses.

Hiromasa Kaneko¹, Kimito Funatsu

  • ¹Department of Chemical Systems Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan.

Journal of Chemical Information and Modeling | August 15, 2014
PubMed
Summary
This summary is machine-generated.

This study introduces a new method for defining applicability domains (ADs) in classification and regression using ensemble learning and data density. This approach improves the reliability of predictions for new data in machine learning models.

Area of Science:

  • Machine Learning
  • Cheminformatics
  • Data Science

Background:

  • Applicability domains (ADs) are crucial for assessing the reliability of predictive models in classification and regression.
  • Existing AD methods in classification can be overly broad, limiting their practical utility.
  • Ensemble learning offers potential for refining AD definitions.

Purpose of the Study:

  • To propose a novel method for setting applicability domains (ADs) in classification and regression analyses.
  • To enhance the reliability of predictions for new chemical compounds or data points.
  • To address limitations of existing AD methods, particularly in classification tasks.

Main Methods:

  • Developing an AD approach integrating ensemble learning with data density.
  • Establishing a data density threshold; predictions for data below the threshold are treated as unreliable.
  • Applying ensemble learning for reliability assessment only to data whose density exceeds the threshold.
  • Validating the proposed AD method using numerical simulations and quantitative structure-activity relationship (QSAR) data.
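
The two-stage check described in the methods above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the summary does not say which density estimator or ensemble the study used, so the k-nearest-neighbor density proxy, the bootstrap ensemble of straight-line fits, the toy data, and the threshold values (`density_min`, `std_max`) are all assumptions made for the example.

```python
import random
import statistics


def knn_density(x, train_x, k=3):
    """Density proxy (assumed): inverse of the mean distance to the k nearest training points."""
    nearest = sorted(abs(x - t) for t in train_x)[:k]
    return 1.0 / (sum(nearest) / k + 1e-12)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a * x + b; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def bootstrap_ensemble(xs, ys, n_models=30, seed=0):
    """Ensemble (assumed): one line fit per bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    while len(models) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip degenerate resamples that cannot define a slope
        models.append(fit_line(bx, [ys[i] for i in idx]))
    return models


def assess(x, train_x, models, density_min, std_max, k=3):
    """Stage 1: density check. Stage 2: ensemble spread, only if stage 1 passes."""
    if knn_density(x, train_x, k) < density_min:
        return {"in_ad": False, "reason": "low data density",
                "prediction": None, "std": None}
    preds = [a * x + b for a, b in models]
    std = statistics.pstdev(preds)
    reason = "ensemble agrees" if std <= std_max else "ensemble disagrees"
    return {"in_ad": std <= std_max, "reason": reason,
            "prediction": statistics.fmean(preds), "std": std}


# Toy training data (hypothetical): y = 2x + 1 plus small bounded noise.
noise = random.Random(0)
train_x = [i * 0.5 for i in range(11)]  # 0.0, 0.5, ..., 5.0
train_y = [2 * x + 1 + noise.uniform(-0.2, 0.2) for x in train_x]
models = bootstrap_ensemble(train_x, train_y)

inside = assess(2.5, train_x, models, density_min=0.5, std_max=0.5)
outside = assess(50.0, train_x, models, density_min=0.5, std_max=0.5)
print(inside)
print(outside)
```

A query at 2.5 lies among the training points, so it passes the density check and is then judged by the spread of the ensemble predictions. A query at 50.0 is far from all training data and is rejected at the density stage without consulting the ensemble. For classification, the spread could be swapped for the fraction of ensemble members voting for the majority class, again as an assumption rather than the paper's stated procedure.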

Main Results:

  • The proposed method effectively sets appropriate ADs for both regression and classification analyses.
  • Data density combined with ensemble learning provides a more refined and reliable AD than traditional methods.
  • The approach successfully identifies regions where predictions are less trustworthy.

Conclusions:

  • The integration of ensemble learning and data density offers a robust strategy for defining accurate applicability domains.
  • This method enhances the trustworthiness of machine learning models in scientific applications.
  • The validated approach provides a valuable tool for cheminformatics and related fields.