Updated: Sep 9, 2025

High-Order Spatial Interactions for Visual Perception

Zuyan Liu, Yongming Rao, Wenliang Zhao

IEEE transactions on pattern analysis and machine intelligence

August 28, 2025

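Summary

This summary is machine-generated.

The researchers developed recursive gated convolution (g^nConv), which uses convolutions to efficiently implement key capabilities of Vision Transformers. The new operation strengthens a range of vision models and improves performance on image recognition, 3D analysis, and vision-language tasks.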

Scientific fields:

Computer vision
Deep learning
Artificial intelligence

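Background:

Vision Transformers (ViTs) owe their success to the spatial modeling provided by self-attention.
Convolutional neural networks (CNNs) are a foundation of computer vision.
Integrating the strengths of ViTs into CNNs is an active area of research.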

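Research objectives:

Introduce a convolution-based framework that replicates the spatial modeling of ViTs.
Develop recursive gated convolution (g^nConv), a new operation for high-order spatial interactions.
Create versatile backbones (HorNet, Hor3D, HorCLIP) for diverse vision tasks.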

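Main methods:

Proposed recursive gated convolution (g^nConv) for efficient high-order spatial interactions.
Developed general-purpose vision backbones: HorNet (image recognition), Hor3D (point clouds), and HorCLIP (vision-language).
Integrated g^nConv into existing architectures as a plug-and-play module.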
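
The idea of a recursive gated convolution can be illustrated with a toy sketch. The code below is not the authors' implementation: it assumes a single 1D channel, a fixed hand-picked kernel, and no input/output projections or normalization, and the function names (`depthwise_conv1d`, `recursive_gated_conv`) are hypothetical. It only shows how repeated gating (an element-wise product between a convolved feature and the running feature) raises the order of the spatial interaction by one per step.

```python
from typing import List, Sequence


def depthwise_conv1d(x: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Zero-padded, same-length 1D cross-correlation over one channel."""
    half = len(kernel) // 2
    n = len(x)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < n:  # positions outside the signal count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def recursive_gated_conv(
    p0: Sequence[float],
    qs: Sequence[Sequence[float]],
    kernel: Sequence[float],
) -> List[float]:
    """Apply one gating step per feature in `qs` (illustrative assumption).

    Each step computes p_next = conv(q) * p element-wise, so after k steps
    every output position is a product of k neighbourhood sums and p0.
    """
    p = list(p0)
    for q in qs:
        gate = depthwise_conv1d(q, kernel)
        p = [g * v for g, v in zip(gate, p)]
    return p


if __name__ == "__main__":
    # One gating step with a box kernel: gate = [2, 3, 2], p0 = [1, 2, 3].
    print(recursive_gated_conv([1, 2, 3], [[1, 1, 1]], [1, 1, 1]))  # [2.0, 6.0, 6.0]
```

In this sketch the number of entries in `qs` plays the role of the interaction order; a real backbone would obtain `p0` and the `qs` from learned projections of the input and use learned depthwise kernels over 2D feature maps.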

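Key results:

HorNet outperforms Swin Transformers and ConvNeXt on ImageNet, COCO, and ADE20K.
g^nConv improves dense prediction tasks while requiring less computation.
Hor3D proves effective for 3D semantic segmentation, and HorCLIP excels at vision-language tasks.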

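Conclusions:

g^nConv effectively combines the merits of ViTs and CNNs, providing a new basic operation for visual modeling.
The HorNet family offers strong performance and scalability.
High-order spatial interactions via g^nConv are beneficial across a variety of visual modalities and tasks.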