How to systematically and quantifiably remove meaning?

Frida Proschinger Åström¹, Arend Hintze¹,²

¹Data Analytics, School of Information and Engineering, Dalarna University, Falun, Sweden.

Frontiers in Artificial Intelligence | May 29, 2026

Summary

This summary is machine-generated.

We developed a framework to measure how semantic erosion impacts large language model (LLM) performance. Findings show degradation varies by erosion type and domain, offering insights into LLM failure points.

Keywords:

large language models; meaning and semantics; meaning degradation; robustness evaluation; semantic erosion

Area of Science:

Artificial Intelligence
Computational Linguistics
Cognitive Science

Background:

Large language models (LLMs) are increasingly used in real-world applications.
Current methods for evaluating LLM robustness against input meaning degradation are insufficient.
A systematic approach is needed to quantify performance decline due to semantic erosion.

Purpose of the Study:

To develop and validate a framework for semantically eroding input meaning.
To quantify the intensity of semantic erosion and its impact on LLM performance.
To identify domain-specific vulnerabilities in LLM processing.

Main Methods:

Developed five theoretically motivated semantic erosion methods: omission, lexical substitution, abstraction, structural obfuscation, and logical error injection.
Applied erosion operators across five distinct domains (e.g., code generation, news, instructions).
Quantified LLM performance degradation using a publicly available model and two-way Analysis of Variance (ANOVA).
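
To make the idea of an erosion operator with a tunable intensity concrete, here is a minimal sketch of the simplest of the five methods, omission. It is an illustration only, not the authors' implementation: the function name `erode_by_omission`, the `intensity` and `seed` parameters, and the choice to drop whitespace-separated tokens uniformly at random are all assumptions made for this example.

```python
import random


def erode_by_omission(text: str, intensity: float, seed: int = 0) -> str:
    """Hypothetical omission operator: drop a fraction of tokens.

    `intensity` is the fraction of whitespace-separated tokens to remove
    (0.0 keeps everything, 1.0 removes everything). Token-level random
    deletion is an assumption of this sketch, not the paper's definition.
    """
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be in [0, 1]")
    tokens = text.split()
    n_drop = round(intensity * len(tokens))
    # A seeded generator keeps the erosion reproducible across runs.
    rng = random.Random(seed)
    dropped = set(rng.sample(range(len(tokens)), n_drop))
    # Surviving tokens keep their original order.
    return " ".join(t for i, t in enumerate(tokens) if i not in dropped)


# Usage: erode the same prompt at increasing intensity levels.
prompt = "write a function that returns the sum of two numbers"
for level in (0.0, 0.3, 0.6):
    print(level, "->", erode_by_omission(prompt, level))
```

Sweeping `intensity` over a grid and scoring the model's output at each level would give one degradation curve per operator and domain, which is the kind of data a two-way ANOVA can then compare.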

Main Results:

Significant main effects of both domain and erosion method on LLM performance were observed.
A significant interaction effect indicated that the impact of semantic degradation depends on both the erosion type and domain-specific information.
Logical errors severely impacted code generation, while structural obfuscation most affected news and instruction tasks.
Pairwise erosion combinations showed both synergistic and compensatory effects.
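
The distinction between main effects and an interaction effect can be sketched with the standard sums-of-squares decomposition for a balanced two-way ANOVA. The code below is a rough illustration under stated assumptions: the function name `two_way_anova`, the nested-dictionary layout, and the toy scores are invented for this example and are not data or code from the study. It computes F statistics only; p-values would require an F distribution from a statistics library.

```python
from statistics import mean


def two_way_anova(scores):
    """Balanced two-way ANOVA with interaction.

    scores[domain][method] is a list of replicate scores; every cell must
    hold the same number of replicates (at least 2) and the within-cell
    variance must be non-zero, otherwise the F ratios are undefined.
    """
    domains = list(scores)
    methods = list(scores[domains[0]])
    a, b = len(domains), len(methods)
    n = len(scores[domains[0]][methods[0]])  # replicates per cell
    cells = [(d, m) for d in domains for m in methods]

    grand = mean(x for d, m in cells for x in scores[d][m])
    dom_mean = {d: mean(x for m in methods for x in scores[d][m]) for d in domains}
    met_mean = {m: mean(x for d in domains for x in scores[d][m]) for m in methods}
    cell_mean = {(d, m): mean(scores[d][m]) for d, m in cells}

    ss = {
        "domain": n * b * sum((dom_mean[d] - grand) ** 2 for d in domains),
        "method": n * a * sum((met_mean[m] - grand) ** 2 for m in methods),
        # Interaction: what the cell means show beyond the two additive effects.
        "interaction": n * sum(
            (cell_mean[d, m] - dom_mean[d] - met_mean[m] + grand) ** 2
            for d, m in cells
        ),
        "error": sum((x - cell_mean[d, m]) ** 2 for d, m in cells for x in scores[d][m]),
    }
    df = {
        "domain": a - 1,
        "method": b - 1,
        "interaction": (a - 1) * (b - 1),
        "error": a * b * (n - 1),
    }
    ms_error = ss["error"] / df["error"]
    f = {k: (ss[k] / df[k]) / ms_error for k in ("domain", "method", "interaction")}
    return ss, df, f


# Made-up toy scores (NOT results from the study): each erosion method
# hurts a different domain, so the effect appears only as an interaction.
toy = {
    "code": {"logical_error": [0, 2], "obfuscation": [4, 6]},
    "news": {"logical_error": [4, 6], "obfuscation": [0, 2]},
}
ss, df, f = two_way_anova(toy)
print(ss, df, f)
```

In this toy layout the domain and method averages are identical, so both main-effect sums of squares are zero and all of the explained variance sits in the interaction term. That is the pattern a "which erosion hurts which domain" finding would produce in its purest form.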

Conclusions:

LLM performance degradation is highly sensitive to the type of semantic erosion and the specific domain.
Robust LLM evaluation requires domain-specific vulnerability profiles rather than generic perturbations alone.
Semantic erosion provides a principled method for analyzing how LLMs process and degrade meaning, aiding in understanding model failure modes.