


R-KV: Redundancy-aware KV Cache Compression for Reasoning Models.

Zefan Cai (1), Wen Xiao (2), Hanshi Sun (3)

  • (1) University of Wisconsin - Madison.

Advances in Neural Information Processing Systems | May 20, 2026
Summary
This summary is machine-generated.

We introduce Redundancy-aware KV Cache Compression for Reasoning models (R-KV), a method that reduces the memory used by the Key-Value (KV) cache of reasoning models. R-KV preserves nearly 100% of full KV cache performance with only 10% of the KV cache, outperforming existing KV cache compression methods.

Area of Science:

  • Artificial Intelligence
  • Natural Language Processing
  • Machine Learning

Background:

  • Reasoning models excel at complex tasks but generate lengthy outputs, increasing inference costs due to large Key-Value (KV) caches.
  • Existing KV cache compression methods struggle with chain-of-thought reasoning, leading to performance degradation and reasoning failures.
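The link between output length and cache size can be made concrete with a little arithmetic. The sketch below is only illustrative: the function name `kv_cache_bytes` and the model shape (32 layers, 8 KV heads, head dimension 128, 2 bytes per value) are assumptions chosen for the example and are not taken from the study. It shows that an uncompressed KV cache grows linearly with the number of tokens held in context.

```python
def kv_cache_bytes(num_tokens, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # Factor 2: one key vector and one value vector per token, per layer, per KV head.
    return 2 * num_tokens * num_layers * num_kv_heads * head_dim * bytes_per_value


# Hypothetical model shape, chosen only for illustration.
shape = dict(num_layers=32, num_kv_heads=8, head_dim=128)

short = kv_cache_bytes(1_000, **shape)   # a short answer
long = kv_cache_bytes(16_000, **shape)   # a long chain-of-thought trace

print(short)         # 131072000 bytes
print(long)          # 2097152000 bytes
print(long / short)  # 16.0: cache size scales with the token count
```

Under these assumed dimensions, a 16,000-token reasoning trace needs 16 times the cache of a 1,000-token answer, which is why long chain-of-thought outputs raise inference cost.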

Purpose of the Study:

  • To develop a novel KV cache compression technique tailored for reasoning models to mitigate excessive memory usage.
  • To improve the efficiency of chain-of-thought reasoning without compromising performance.

Main Methods:

  • Propose Redundancy-aware KV Cache Compression for Reasoning models (R-KV), a method focusing on compressing redundant tokens within reasoning processes.
  • Evaluate R-KV's effectiveness against existing KV cache compression baselines on mathematical reasoning tasks.
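This summary does not describe how R-KV scores or selects tokens, so the following is not the authors' algorithm. It is a minimal sketch of one generic way to keep a fixed budget of cached tokens while avoiding near-duplicates: a greedy selection in the style of maximal marginal relevance. The names `select_tokens`, `importance`, and `alpha`, and the use of cosine similarity between key vectors as the redundancy measure, are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_tokens(keys, importance, budget, alpha=0.5):
    """Greedily pick up to `budget` token indices to keep in the cache.

    Each step picks the candidate with the best trade-off between its
    importance score and its similarity to tokens already kept.
    `alpha` is a hypothetical weight between the two terms.
    """
    kept = []
    candidates = list(range(len(keys)))
    while candidates and len(kept) < budget:
        def score(i):
            # Redundancy: highest similarity to any token already kept.
            redundancy = max((cosine(keys[i], keys[j]) for j in kept), default=0.0)
            return alpha * importance[i] - (1 - alpha) * redundancy

        best = max(candidates, key=score)
        kept.append(best)
        candidates.remove(best)
    return sorted(kept)


# Toy data: tokens 0, 1, and 3 have nearly identical keys; token 2 is distinct.
keys = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
importance = [0.9, 0.8, 0.3, 0.7]

print(select_tokens(keys, importance, budget=2))  # [0, 2]
```

In the toy data, token 2 is kept despite its low importance because every other remaining token nearly duplicates token 0. A selection based on importance alone would have kept tokens 0 and 1, which carry the same key.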

Main Results:

  • R-KV preserves nearly 100% of full KV cache performance using only 10% of the cache, whereas existing baselines reach only about 60% of full-cache performance.
  • With 16% of the KV cache, R-KV achieves 105% of full KV cache performance, exceeding the uncompressed model.
  • R-KV reduces memory use by 90% and increases throughput by 6.6x compared with standard chain-of-thought inference.
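The percentages above can be read as simple ratios. The sketch below assumes that the memory saving refers to the fraction of KV cache entries removed and that "performance" is a task score for the compressed model divided by the score for the full-cache model. The helper names and the example scores are hypothetical and are not results from the study.

```python
def memory_saving(cache_budget):
    """Fraction of KV cache memory saved when only `cache_budget` of entries is kept."""
    return 1.0 - cache_budget


def relative_performance(compressed_score, full_score):
    """Score with a compressed cache as a fraction of the full-cache score."""
    return compressed_score / full_score


# Keeping 10% of the cache removes 90% of the cached entries.
print(f"{memory_saving(0.10):.0%}")  # 90%

# Hypothetical scores: a compressed run that scores slightly higher than
# the full-cache run gives a ratio above 100%.
print(f"{relative_performance(0.525, 0.50):.0%}")  # 105%
```

A ratio above 100% means the compressed model scored higher than the uncompressed one on the evaluated task.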

Conclusions:

  • R-KV effectively compresses KV caches for reasoning models, addressing the challenge of long outputs and high memory consumption.
  • The proposed method offers substantial memory savings and throughput improvements while maintaining or enhancing performance on complex reasoning tasks.