



Data quantity governance for machine learning in materials science.

Yue Liu1,2, Zhengwei Yang1, Xinxin Zou1

  • 1 School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China.

National Science Review | June 16, 2023
Summary
This summary is machine-generated.

Machine learning (ML) in materials science struggles with limited data. This study reviews data governance strategies and proposes a domain knowledge-integrated approach to improve ML model performance for accelerated materials discovery.

Keywords: data governance, data quantity, machine learning, materials science


Area of Science:

  • Materials Science
  • Data Science
  • Machine Learning

Background:

  • Machine learning (ML) is crucial for materials science, aiding in structure-activity relationship analysis, performance optimization, and materials design.
  • A significant challenge in applying ML to materials science is the scarcity of data, leading to a mismatch between feature space dimensionality and sample size, or between model parameters and sample size.
  • This data limitation often results in poor ML model performance.
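
The mismatch between sample size and feature dimensionality can be made concrete with a simple ratio check. The sketch below is only illustrative: the function names and the `min_ratio=10.0` default are assumptions introduced here (a common rule of thumb, not a threshold taken from the study).

```python
def samples_per_feature(n_samples: int, n_features: int) -> float:
    """Return how many samples are available per feature."""
    if n_features <= 0:
        raise ValueError("n_features must be positive")
    return n_samples / n_features


def is_data_scarce(n_samples: int, n_features: int, min_ratio: float = 10.0) -> bool:
    """Flag a dataset whose samples-per-feature ratio falls below min_ratio.

    min_ratio is a hypothetical rule-of-thumb threshold, not a value from the paper.
    """
    return samples_per_feature(n_samples, n_features) < min_ratio


# Hypothetical dataset: 120 measured compounds described by 60 descriptors.
ratio = samples_per_feature(120, 60)
scarce = is_data_scarce(120, 60)
print(f"samples per feature: {ratio}, scarce: {scarce}")
```

A ratio well below the chosen threshold suggests either reducing the feature space or adding samples before training a model.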

Purpose of the Study:

  • To review existing strategies for addressing data limitations in ML for materials science, including feature reduction, sample augmentation, and specialized ML approaches.
  • To highlight the importance of balancing sample size with feature dimensionality or model parameters in data quantity governance.
  • To propose a novel synergistic data quantity governance framework incorporating materials domain knowledge.
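
Two of the strategy families named above, feature reduction and sample augmentation, can be sketched in a few lines. This is a generic illustration under stated assumptions, not the authors' framework: variance-based filtering and Gaussian jitter stand in for the broader families, and every name, threshold, and data value below is hypothetical.

```python
import random
import statistics


def drop_low_variance_features(rows, threshold=1e-8):
    """Feature reduction: keep only columns whose population variance exceeds threshold."""
    columns = list(zip(*rows))
    kept = [j for j, col in enumerate(columns) if statistics.pvariance(col) > threshold]
    reduced = [[row[j] for j in kept] for row in rows]
    return reduced, kept


def augment_with_jitter(rows, copies=2, sigma=0.01, seed=0):
    """Sample augmentation: append noisy copies of each row (Gaussian jitter).

    sigma is an assumed noise scale; in practice it would need to respect
    the physical meaning and measurement error of each feature.
    """
    rng = random.Random(seed)
    augmented = [list(row) for row in rows]
    for row in rows:
        for _ in range(copies):
            augmented.append([x + rng.gauss(0.0, sigma) for x in row])
    return augmented


# Hypothetical toy data: 4 samples, 3 features; the middle feature is constant.
X = [[1.0, 5.0, 0.2], [2.0, 5.0, 0.1], [3.0, 5.0, 0.4], [4.0, 5.0, 0.3]]
X_reduced, kept = drop_low_variance_features(X)
X_augmented = augment_with_jitter(X_reduced, copies=2)
print(kept, len(X_augmented))
```

Reducing features and adding samples both shift the samples-per-feature balance in the same direction, which is why the two are naturally considered together.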

Main Methods:

  • Literature review of techniques for tackling data scarcity in materials ML.
  • Analysis of the interplay between sample size, feature space, and model parameters.
  • Development of a data quantity governance flow integrating materials domain knowledge.

Main Results:

  • Existing methods like feature reduction and sample augmentation are discussed as ways to mitigate data limitations.
  • The critical need for careful consideration of the balance between data quantity and model complexity is emphasized.
  • Incorporating materials domain knowledge into ML data governance schemes is shown to offer significant advantages.

Conclusions:

  • Effective data quantity governance is essential for successful ML applications in materials science.
  • A synergistic approach combining data governance with materials domain knowledge can significantly enhance ML model performance.
  • This work provides a pathway to generating high-quality data, accelerating ML-driven materials design and discovery.