A Kernel Theory of Modern Data Augmentation.

Tri Dao¹, Albert Gu¹, Alexander J Ratner¹

  • ¹Department of Computer Science, Stanford University.

Proceedings of Machine Learning Research | November 29, 2019
PubMed
Summary
This summary is machine-generated.

This paper models data augmentation, a standard technique for enlarging training sets, as a Markov process and analyzes its effect on kernel classifiers. The resulting framework helps explain how augmentation works and how to apply it more efficiently.

Area of Science:

  • Machine Learning
  • Artificial Intelligence
  • Theoretical Computer Science

Background:

  • Data augmentation is a common technique for expanding training datasets in machine learning.
  • Understanding the theoretical underpinnings of data augmentation is crucial for applying it effectively.
  • Prior work lacks a unified theoretical framework for analyzing the impact of augmentation.

Purpose of the Study:

  • To establish a theoretical framework for understanding data augmentation.
  • To analyze the effects of data augmentation on kernel classifiers.
  • To connect data augmentation theory with existing concepts like invariant kernels and robust optimization.

Main Methods:

  • Modeling data augmentation as a Markov process.
  • Analyzing augmentation's effect on kernel classifiers using feature averaging and variance regularization.
  • Developing theoretical frameworks to connect different machine learning concepts.
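
The first method above, treating augmentation as a Markov process, can be illustrated with a small simulation. The sketch below is an assumption about what such a chain might look like, not the construction used in the paper: at each step the chain either resets to a uniformly chosen training example (with a hypothetical probability `reset_prob`) or applies a random transformation to the current point. The names `augmentation_chain` and `jitter` are illustrative only.

```python
import numpy as np

def augmentation_chain(X, transform, steps, reset_prob=0.3, seed=0):
    """Simulate a toy Markov chain over augmented data points.

    X          : (n, d) array of original training examples.
    transform  : callable (x, rng) -> transformed copy of x.
    reset_prob : hypothetical probability of jumping back to a
                 uniformly chosen original example.
    """
    rng = np.random.default_rng(seed)
    state = X[rng.integers(len(X))].copy()
    samples = []
    for _ in range(steps):
        if rng.random() < reset_prob:
            # Reset: restart from a randomly chosen original example.
            state = X[rng.integers(len(X))].copy()
        else:
            # Augment: the next state depends only on the current state.
            state = transform(state, rng)
        samples.append(state.copy())
    return np.stack(samples)

def jitter(x, rng, scale=0.1):
    """Example transformation (an assumption): add small Gaussian noise."""
    return x + rng.normal(scale=scale, size=x.shape)

if __name__ == "__main__":
    X = np.arange(12, dtype=float).reshape(4, 3)
    chain = augmentation_chain(X, jitter, steps=5)
    print(chain.shape)
```

The samples visited by such a chain play the role of the augmented dataset; a larger `reset_prob` keeps them closer to the original examples.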

Main Results:

  • Kernels naturally emerge from the Markov process model of augmentation.
  • The effect of data augmentation can be approximated by two components: first-order feature averaging and second-order variance regularization.
  • Novel connections are established between data augmentation, invariant kernels, tangent propagation, and robust optimization.
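
The feature-averaging and variance-regularization result can be checked numerically on a toy problem. The sketch below assumes a linear model on explicit features and a logistic loss, with Gaussian jitter standing in for the transformations; these choices and all function names are illustrative assumptions, not the paper's experimental setup. It compares the loss averaged over transformed copies of one example with a first-order approximation (loss at the averaged features) and a second-order approximation (adding a penalty based on the feature covariance).

```python
import numpy as np

def logistic_loss(z, y):
    """Logistic loss log(1 + exp(-y z)) for a label y in {-1, +1}."""
    return np.logaddexp(0.0, -y * z)

def logistic_loss_second_derivative(z):
    """Second derivative of the logistic loss in z (identical for y = +1 and -1)."""
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)

def augmented_loss(w, feats, y):
    """Loss averaged over the features of all transformed copies."""
    return float(np.mean(logistic_loss(feats @ w, y)))

def feature_averaged_loss(w, feats, y):
    """First-order approximation: loss evaluated at the averaged features."""
    psi = feats.mean(axis=0)
    return float(logistic_loss(psi @ w, y))

def variance_regularized_loss(w, feats, y):
    """Second-order approximation: averaged-feature loss plus a variance penalty."""
    psi = feats.mean(axis=0)
    centered = feats - psi
    cov = centered.T @ centered / len(feats)  # feature covariance over copies
    z = psi @ w
    penalty = 0.5 * logistic_loss_second_derivative(z) * (w @ cov @ w)
    return float(logistic_loss(z, y) + penalty)

# Toy data: one example, 5000 jittered copies (assumed transformation).
rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, -1.0])
w = np.array([0.5, -0.3, 0.2])
feats = x + 0.2 * rng.normal(size=(5000, 3))

exact = augmented_loss(w, feats, y=1)
first = feature_averaged_loss(w, feats, y=1)
second = variance_regularized_loss(w, feats, y=1)
print(f"augmented={exact:.6f} first-order={first:.6f} second-order={second:.6f}")
```

Because the logistic loss is convex, the feature-averaged loss can only underestimate the augmented loss; for small transformations the variance penalty accounts for most of the remaining gap.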

Conclusions:

  • The proposed theoretical framework provides a deeper understanding of data augmentation.
  • The theory can accelerate machine learning workflows, for example by predicting the utility of a transformation before training and by reducing the computation needed to train on augmented data.
  • This work bridges theoretical analysis with practical applications in machine learning.