
Updated: Jul 3, 2025

Weakly-Supervised Scientific Document Classification via Retrieval-Augmented Multi-Stage Training.

Ran Xu1, Yue Yu2, Joyce Ho1

  • 1Emory University, Atlanta, GA, USA.

International ACM SIGIR Conference on Research and Development in Information Retrieval. Annual International ACM SIGIR Conference on Research & Development in Information Retrieval | February 14, 2024
Summary
This summary is machine-generated.

This study introduces WanDeR, a method for classifying scientific documents using only label names as supervision. WanDeR combines dense retrieval with label name expansion, which avoids the cost of human-labeled training data and improves classification accuracy over existing baselines.

Keywords:
Retrieval, Scientific Document Classification, Weak Supervision


Area of Science:

  • Computer Science
  • Information Science
  • Artificial Intelligence

Background:

  • Scientific document classification is essential but hindered by high costs of human-labeled data.
  • Existing methods struggle when label names contain domain-specific terms absent in the document corpus.

Purpose of the Study:

  • To develop an effective scientific document classification method using only label names.
  • To address the challenge of matching documents with semantically rich but potentially sparse label names.

Main Methods:

  • Proposed WanDeR, a method employing dense retrieval for semantic matching in embedding space.
  • Incorporated a label name expansion module to enrich label representations.
  • Utilized a self-training step for refining classification predictions.
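
The three steps above can be illustrated with a minimal sketch. This is not the authors' implementation: a toy bag-of-words encoder stands in for a trained dense encoder, label expansion is approximated by appending the text of each label's nearest documents, and self-training is reduced to re-estimating class prototypes from pseudo-labels. All function names (`embed`, `expand_label`, `classify`, `self_train`) and parameters are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a dense encoder (assumption): L2-normalised word counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()}


def cosine(u, v):
    """Dot product of two normalised sparse vectors."""
    return sum(x * v.get(w, 0.0) for w, x in u.items())


def expand_label(label, docs, top_docs=2):
    """Enrich a label name with the text of its nearest documents."""
    query = embed(label)
    ranked = sorted(docs, key=lambda d: cosine(query, embed(d)), reverse=True)
    return " ".join([label] + ranked[:top_docs])


def classify(docs, labels, expand=True):
    """Assign each document to the label whose representation is most similar."""
    reps = {l: embed(expand_label(l, docs) if expand else l) for l in labels}
    return [max(labels, key=lambda l: cosine(embed(d), reps[l])) for d in docs]


def self_train(docs, labels, pseudo, rounds=1):
    """Simplified self-training: rebuild prototypes from pseudo-labels, then re-predict."""
    for _ in range(rounds):
        protos = {}
        for l in labels:
            members = [d for d, p in zip(docs, pseudo) if p == l]
            protos[l] = embed(" ".join(members)) if members else embed(l)
        pseudo = [max(labels, key=lambda l: cosine(embed(d), protos[l])) for d in docs]
    return pseudo
```

In this sketch, a document such as "gene expression dynamics" shares no word with the label name "protein biology", so the unexpanded label cannot match it; after expansion the label representation includes terms from nearby documents and the match succeeds. A real system would replace `embed` with a learned dense encoder so that matching happens in a semantic embedding space rather than on surface words.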

Main Results:

  • WanDeR demonstrated superior performance compared to existing baselines.
  • Achieved an 11.9% improvement in classification accuracy across three experimental datasets.
  • The approach effectively captures label semantics for improved document matching.

Conclusions:

  • WanDeR offers a cost-effective and accurate solution for scientific document classification.
  • Leveraging dense retrieval and label expansion enhances the model's ability to understand label semantics.
  • The proposed method shows significant potential for real-world applications in scientific information management.