



Updated: Jun 3, 2026


Dynamic Micro-Batch and Token-Budget Scheduling for IoT-Scale Pipeline-Parallel LLM Inference.

Juncheol Ahn¹, Yubin Son¹, Daemin Kim¹

  • ¹System Software Laboratory, Department of Computer Engineering, Keimyung University, Daegu 42601, Republic of Korea.

Sensors (Basel, Switzerland) | February 27, 2026
Summary
This summary is machine-generated.

We developed a runtime-adaptive scheduler for pipeline-parallel large language model (LLM) inference in IoT-edge-cloud settings. By adjusting token budgets and micro-batch sizes at runtime, it reduces GPU idle time by up to 55% and improves throughput by up to 1.61 times over vLLM and SGLang.

Keywords:
GPU scheduling; IoT; cloud inference; edge computing; large language models; micro-batching; pipeline parallelism


Area of Science:

  • Computer Science
  • Artificial Intelligence
  • Distributed Systems

Background:

  • Large language models (LLMs) in IoT-edge-cloud environments handle diverse, unpredictable requests.
  • Pipeline-parallel LLM inference is susceptible to micro-batch imbalance and communication delays, leading to GPU idleness and Service Level Objective (SLO) violations.
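
To make the link between micro-batch imbalance and GPU idleness concrete, the following is a minimal sketch of a toy pipeline model. It is not taken from the paper: the function name `pipeline_idle_fraction` is hypothetical, every stage is assumed to spend the same time on a given micro-batch, and communication cost is ignored. With equal micro-batches the model reproduces the familiar bubble fraction (p - 1) / (m + p - 1) for p stages and m micro-batches; a single oversized micro-batch pushes the idle share higher.

```python
def pipeline_idle_fraction(micro_batch_times, num_stages):
    """Idle fraction of a toy synchronous pipeline.

    Assumption (not from the paper): each stage takes the same time for a
    given micro-batch, and transfers between stages are free.
    """
    # finish_prev_stage[i] = time micro-batch i left the previous stage
    finish_prev_stage = [0.0] * len(micro_batch_times)
    for _ in range(num_stages):
        finish = []
        stage_free_at = 0.0
        for i, t in enumerate(micro_batch_times):
            # A micro-batch starts once the stage is free and its input is ready.
            start = max(stage_free_at, finish_prev_stage[i])
            stage_free_at = start + t
            finish.append(stage_free_at)
        finish_prev_stage = finish
    makespan = finish_prev_stage[-1]
    busy = num_stages * sum(micro_batch_times)
    return 1.0 - busy / (num_stages * makespan)


balanced = pipeline_idle_fraction([1.0, 1.0, 1.0, 1.0], num_stages=4)
imbalanced = pipeline_idle_fraction([0.5, 0.5, 0.5, 2.5], num_stages=4)
print(f"balanced: {balanced:.3f}, imbalanced: {imbalanced:.3f}")
```

Both inputs carry the same total work, so the gap between the two results comes only from how the work is split across micro-batches.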

Purpose of the Study:

  • To propose a novel runtime-adaptive scheduler for optimizing LLM inference in resource-constrained IoT-edge-cloud settings.
  • To address micro-batch imbalance and communication stalls in pipeline-parallel LLM inference.

Main Methods:

  • Developed a scheduler that dynamically adjusts token budgets and micro-batch sizes.
  • Optimized the balance between prefill and decoding workloads.
  • Minimized pipeline bubbles under varying network and compute conditions.
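
The methods above can be pictured with a small sketch of token-budget packing. This is an illustration under stated assumptions, not the authors' algorithm: `Request`, `build_micro_batch`, `adjust_budget`, and the clamp bounds are all hypothetical names and values. The sketch gives each decoding request one token first, spends the leftover budget on prefill chunks, and rescales the budget in proportion to how far the observed stage latency is from a target.

```python
from dataclasses import dataclass


@dataclass
class Request:
    req_id: str
    prefill_left: int  # prompt tokens not yet processed; 0 means decoding


def build_micro_batch(requests, token_budget):
    """Pack one micro-batch as {req_id: tokens}, never exceeding token_budget."""
    batch = {}
    remaining = token_budget
    # Decoding requests first: one token each per iteration.
    for r in requests:
        if r.prefill_left == 0 and remaining > 0:
            batch[r.req_id] = 1
            remaining -= 1
    # Leftover budget goes to prefill chunks, in arrival order.
    for r in requests:
        if r.prefill_left > 0 and remaining > 0:
            chunk = min(r.prefill_left, remaining)
            batch[r.req_id] = chunk
            remaining -= chunk
    return batch


def adjust_budget(token_budget, observed_ms, target_ms, lo=64, hi=8192):
    """Scale the budget toward the latency target, clamped to [lo, hi].

    Assumes observed_ms > 0. The proportional rule is a stand-in for
    whatever control policy a real scheduler would use.
    """
    scaled = int(token_budget * target_ms / observed_ms)
    return max(lo, min(hi, scaled))


queue = [Request("a", 0), Request("b", 300), Request("c", 0), Request("d", 500)]
batch = build_micro_batch(queue, token_budget=256)
print(batch)
print(adjust_budget(256, observed_ms=40.0, target_ms=20.0))
```

Capping every micro-batch at the same token count keeps per-stage compute time roughly even, which is one plausible way to limit the imbalance that creates pipeline bubbles.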

Main Results:

  • Achieved up to 55% reduction in GPU idle time.
  • Improved throughput by up to 1.61 times over existing serving systems such as vLLM and SGLang.
  • Enhanced Time-To-First-Token (TTFT) and Iteration Latency (ITL) SLO satisfaction.
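
As a rough illustration of how SLO satisfaction of this kind might be measured, the sketch below computes the share of requests that meet both a TTFT and an ITL threshold. The function name `slo_attainment`, the trace layout, and the thresholds are assumptions for illustration. ITL is treated here as the gap between consecutive output tokens, which may differ from the paper's exact definition.

```python
def slo_attainment(traces, ttft_slo, itl_slo):
    """Fraction of requests meeting both SLOs.

    traces: list of (arrival_time, [token_emit_times]) in seconds, with at
    least one emitted token per request (hypothetical layout).
    """
    ok = 0
    for arrival, token_times in traces:
        ttft = token_times[0] - arrival
        gaps = [b - a for a, b in zip(token_times, token_times[1:])]
        worst_itl = max(gaps, default=0.0)  # single-token replies have no gap
        if ttft <= ttft_slo and worst_itl <= itl_slo:
            ok += 1
    return ok / len(traces)


traces = [
    (0.0, [0.2, 0.25, 0.30]),  # meets both thresholds
    (0.0, [0.9, 0.95]),        # first token arrives too late
    (1.0, [1.1, 1.4]),         # gap between tokens is too long
]
rate = slo_attainment(traces, ttft_slo=0.5, itl_slo=0.1)
print(f"{rate:.2f}")
```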

Conclusions:

  • Dynamic scheduling is crucial for efficient and stable LLM inference in IoT-edge-cloud systems.
  • The proposed adaptive scheduler effectively mitigates performance bottlenecks in heterogeneous request environments.