A Kernel Theory of Modern Data Augmentation.

Tri Dao¹, Albert Gu¹, Alexander J Ratner¹

¹Department of Computer Science, Stanford University.

Proceedings of Machine Learning Research

November 29, 2019

Summary

This summary is machine-generated.

The paper models data augmentation, a common machine learning technique, as a Markov process and analyzes its effect on kernel classifiers. The result is a theoretical framework for understanding and optimizing data augmentation.

Area of Science:

Machine Learning
Artificial Intelligence
Theoretical Computer Science

Background:

Data augmentation is a common technique to expand training datasets in machine learning.
Understanding the theoretical underpinnings of data augmentation is crucial for optimizing its application.
Prior work lacks a unified theoretical framework for analyzing the impact of augmentation.

Purpose of the Study:

To establish a theoretical framework for understanding data augmentation.
To analyze the effects of data augmentation on kernel classifiers.
To connect data augmentation theory with existing concepts like invariant kernels and robust optimization.

Main Methods:

Modeling data augmentation as a Markov process.
Analyzing augmentation's effect on kernel classifiers using feature averaging and variance regularization.
Developing theoretical frameworks to connect different machine learning concepts.
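
The feature averaging and variance regularization view can be sketched as a second-order Taylor expansion. The notation here is an assumption made for illustration, not taken from this summary: T(x) is the distribution of augmented copies of a training point x, \phi is a feature map, w is a weight vector, and \ell is a loss that is twice differentiable in its first argument.

```latex
\psi(x) = \mathbb{E}_{z \sim T(x)}\big[\phi(z)\big], \qquad
\Sigma(x) = \mathrm{Cov}_{z \sim T(x)}\big[\phi(z)\big]

\mathbb{E}_{z \sim T(x)}\big[\ell(w^\top \phi(z);\, y)\big]
\;\approx\; \ell\big(w^\top \psi(x);\, y\big)
\;+\; \tfrac{1}{2}\, \ell''\big(w^\top \psi(x);\, y\big)\, w^\top \Sigma(x)\, w
```

Expanding around w^\top \psi(x), the first-order term vanishes because \mathbb{E}[\phi(z) - \psi(x)] = 0. The leading term is the loss on averaged features, and the second-order term penalizes the variance of the prediction across augmented copies.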

Main Results:

Kernels naturally emerge from the Markov process model of augmentation.
Data augmentation effects can be approximated by feature averaging and variance regularization.
Novel connections are established between data augmentation, invariant kernels, tangent propagation, and robust optimization.
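
As a rough numerical illustration of the feature averaging and variance regularization result, the sketch below checks the decomposition for squared loss, where it holds exactly rather than approximately. The augmentation (Gaussian noise), the feature map, the weights, and all function names are hypothetical choices for this example, not the authors' setup.

```python
import random


def augment(x, rng, noise=0.1):
    """Hypothetical augmentation: add small Gaussian noise to each coordinate."""
    return [xi + rng.gauss(0.0, noise) for xi in x]


def features(x):
    """Hypothetical feature map phi: raw coordinates plus their squares."""
    return list(x) + [xi * xi for xi in x]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def averaged_features(samples):
    """psi(x): coordinate-wise mean of phi(z) over augmented copies z."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def variance_penalty(samples, w):
    """w^T Sigma w: variance of the score w . phi(z) over augmented copies."""
    scores = [dot(w, s) for s in samples]
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / len(scores)


def augmented_squared_loss(samples, w, y):
    """Average squared loss over all augmented copies of one training point."""
    return sum((dot(w, s) - y) ** 2 for s in samples) / len(samples)


rng = random.Random(0)
x, y = [1.0, -2.0], 1.0          # one illustrative training point and label
w = [0.5, -0.25, 0.1, 0.05]      # arbitrary weights for the 4 features

samples = [features(augment(x, rng)) for _ in range(200)]
psi = averaged_features(samples)

# Loss averaged over augmented copies ...
lhs = augmented_squared_loss(samples, w, y)
# ... equals loss on averaged features plus the variance penalty.
rhs = (dot(w, psi) - y) ** 2 + variance_penalty(samples, w)
print(abs(lhs - rhs))
```

For squared loss the second derivative is constant, so the two sides agree up to floating-point error. For other losses the variance term would be weighted by the loss curvature and the match would only be approximate.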

Conclusions:

The proposed theoretical framework provides a deeper understanding of data augmentation.
The theory can accelerate machine learning workflows by predicting the utility of a transformation before training and by reducing the computation needed to train on augmented data.
This work bridges theoretical analysis with practical applications in machine learning.