Updated: Feb 25, 2026

Aligning Few-Step Diffusion Models with Dense Reward Difference Learning.

Ziyi Zhang, Li Shen, Sen Zhang

IEEE transactions on pattern analysis and machine intelligence

February 23, 2026

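Summary

This summary is machine-generated.

Stepwise Diffusion Policy Optimization (SDPO) improves few-step diffusion models so that image synthesis aligns better with downstream objectives. This reinforcement learning framework improves efficiency and sample quality in the low-step regime.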


Scientific fields:

Artificial intelligence
Computer vision
Machine learning

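Background:

Few-step diffusion models offer efficient high-resolution image synthesis.
Existing reinforcement learning (RL) methods struggle to align few-step diffusion models because the low-step regime offers only a limited set of states and lower sample quality.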

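Purpose of the study:

Introduce Stepwise Diffusion Policy Optimization (SDPO), a novel RL framework for few-step diffusion models.
Address the limitations of aligning diffusion models with downstream objectives in the low-step regime.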

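Main methods:

SDPO uses a dual-state trajectory sampling mechanism (noisy and clean states) to obtain dense reward feedback.
A dense reward prediction strategy based on latent similarity minimizes costly reward queries.
SDPO uses dense reward difference learning, stepwise advantage estimation, temporal importance weighting, and step-shuffled gradient updates.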
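
The idea of predicting dense rewards from latent similarity can be illustrated with a small sketch. This is not the authors' implementation: the function names, the cosine-similarity weighting, and the choice of which steps are queried are all assumptions made for illustration. It only shows how rewards queried at a few anchor steps could be spread to the remaining steps, so that the reward model is not called at every step.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_dense_rewards(latents, anchor_rewards):
    """Fill in one reward per step from a few queried anchor steps.

    latents: list of latent vectors, one per denoising step.
    anchor_rewards: dict mapping step index -> reward actually queried
        from the reward model (hypothetical setup, not from the paper).
    """
    if not anchor_rewards:
        raise ValueError("at least one anchor reward is required")

    dense = []
    for step, latent in enumerate(latents):
        if step in anchor_rewards:
            # Queried steps keep their true reward.
            dense.append(anchor_rewards[step])
            continue
        # Assumed weighting: cosine similarity rescaled from [-1, 1] to [0, 1].
        weights = {
            s: (cosine_similarity(latent, latents[s]) + 1.0) / 2.0
            for s in anchor_rewards
        }
        total = sum(weights.values())
        if total == 0.0:
            # No anchor is similar at all: fall back to the plain mean.
            dense.append(sum(anchor_rewards.values()) / len(anchor_rewards))
        else:
            dense.append(
                sum(weights[s] * anchor_rewards[s] for s in anchor_rewards) / total
            )
    return dense


if __name__ == "__main__":
    # Three steps; only the first and last are scored by the reward model.
    toy_latents = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    print(predict_dense_rewards(toy_latents, {0: 0.0, 2: 1.0}))
```

In the toy call, the middle latent is equally similar to both anchors, so its predicted reward lands halfway between the two queried values.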

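Main results:

Through more frequent and finer-grained policy updates, SDPO achieves low-variance, mixed-step optimization.
Experiments show consistently superior reward alignment across a variety of few-step settings.
The method shows enhanced long-term dependency, low-step priority, and gradient stability.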

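Conclusions:

SDPO effectively overcomes the limitations of conventional RL methods when optimizing few-step diffusion models.
The proposed framework significantly improves the alignment of synthesized images with specific downstream objectives.
SDPO represents a significant advance in efficient and effective high-resolution image synthesis.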