Applicability domain based on ensemble learning in classification and regression analyses.

Hiromasa Kaneko¹, Kimito Funatsu

¹Department of Chemical Systems Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan.

Journal of Chemical Information and Modeling

August 15, 2014

Summary

This summary is machine-generated.

This study introduces a new method for defining applicability domains (ADs) in classification and regression that combines ensemble learning with data density. The approach indicates where a machine learning model's predictions for new data can be trusted.

Area of Science:

Machine Learning
Cheminformatics
Data Science

Background:

Applicability domains (ADs) are crucial for assessing the reliability of predictive models in classification and regression.
Existing AD methods in classification can be overly broad, limiting their practical utility.
Ensemble learning offers potential for refining AD definitions.

Purpose of the Study:

To propose a novel method for setting applicability domains (ADs) in classification and regression analyses.
To enhance the reliability of predictions for new chemical compounds or data points.
To address limitations of existing AD methods, particularly in classification tasks.

Main Methods:

Developing an AD approach integrating ensemble learning with data density.
Establishing a data density threshold below which predictions are treated as unreliable.
Applying ensemble learning for reliability assessment only to data points whose density exceeds the threshold.
Validating the proposed AD method using numerical simulations and quantitative structure-activity relationship (QSAR) data.
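
The two-stage check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the density measure (inverse mean distance to the k nearest training points), the base learner (a one-dimensional least-squares line fitted on bootstrap resamples), the toy data, and both threshold values are hypothetical choices made only for this example. The function names `knn_density`, `bootstrap_ensemble`, and `assess` are likewise invented here.

```python
import random
import statistics


def knn_density(x, train_x, k=3):
    """Density proxy (an assumption): inverse mean distance to the k nearest training points."""
    nearest = sorted(abs(x - t) for t in train_x)[:k]
    return 1.0 / (statistics.fmean(nearest) + 1e-12)


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:  # degenerate resample: fall back to a constant model
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def bootstrap_ensemble(xs, ys, n_models=30, seed=0):
    """Fit one line per bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def assess(x, train_x, models, density_threshold, std_threshold, k=3):
    """Stage 1: reject low-density queries. Stage 2: judge the rest by ensemble spread."""
    if knn_density(x, train_x, k) < density_threshold:
        return {"in_domain": False, "prediction": None, "std": None}
    preds = [a * x + b for a, b in models]
    std = statistics.pstdev(preds)  # disagreement among ensemble members
    return {
        "in_domain": std <= std_threshold,
        "prediction": statistics.fmean(preds),
        "std": std,
    }


# Toy data (hypothetical): y = 2x + 1 with small alternating noise.
train_x = [float(i) for i in range(10)]
train_y = [2.0 * x + 1.0 + (0.1 if i % 2 == 0 else -0.1) for i, x in enumerate(train_x)]
models = bootstrap_ensemble(train_x, train_y)

inside = assess(4.5, train_x, models, density_threshold=0.5, std_threshold=1.0)
outside = assess(100.0, train_x, models, density_threshold=0.5, std_threshold=1.0)
```

In this sketch the query at 4.5 sits among the training points, so it passes the density check and is then judged by the spread of the ensemble predictions, while the query at 100.0 is rejected at the density stage without consulting the ensemble. For classification, the spread could plausibly be replaced by the fraction of ensemble members that agree on the predicted class, though that substitution is also an assumption rather than a detail given in this summary.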

Main Results:

The proposed method effectively sets appropriate ADs for both regression and classification analyses.
Data density combined with ensemble learning provides a more refined and reliable AD than traditional methods.
The approach successfully identifies regions where predictions are less trustworthy.

Conclusions:

The integration of ensemble learning and data density offers a robust strategy for defining accurate applicability domains.
This method enhances the trustworthiness of machine learning models in scientific applications.
The validated approach provides a valuable tool for cheminformatics and related fields.