Benchmarking for biomedical natural language processing tasks with a domain specific ALBERT.

Usman Naseem¹, Adam G Dunn², Matloob Khushi³,⁴

¹School of Computer Science, The University of Sydney, Sydney, Australia. usman.naseem@sydney.edu.au.

BMC Bioinformatics

April 22, 2022

Summary

This summary is machine-generated.

BioALBERT, a domain-specific language model, achieves state-of-the-art results on 5 out of 6 common biomedical natural language processing (BioNLP) tasks, demonstrating robust performance and generalizability.

Keywords:

BioNLP; Bioinformatics; Biomedical text mining; Domain-specific language model

Area of Science:

Biomedical Natural Language Processing (BioNLP)
Machine Learning
Computational Linguistics

Background:

The growing volume of biomedical text necessitates advanced Natural Language Processing (NLP) tools.
Most existing domain-specific language models (LMs) are built on the BERT architecture, which has known limitations, and their generalizability is unproven.
A lack of baseline results hinders progress in common BioNLP tasks.

Purpose of the Study:

To develop and evaluate BioALBERT, a domain-specific adaptation of ALBERT (A Lite Bidirectional Encoder Representations from Transformers).
To establish new state-of-the-art benchmarks for common BioNLP tasks.
To provide a robust and generalizable model for the BioNLP community.
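
The "lite" in ALBERT refers in part to factorized embedding parameterization: instead of one vocabulary-by-hidden-size embedding table, the model uses a smaller table followed by a projection. (ALBERT also shares parameters across layers, which is not shown here.) The sketch below works through the arithmetic. The sizes V = 30,000, H = 768, and E = 128 are illustrative assumptions, not settings reported in this summary, and the function name is hypothetical.

```python
from typing import Optional


def embedding_params(vocab_size: int, hidden_size: int,
                     embedding_size: Optional[int] = None) -> int:
    """Count token-embedding weights (biases ignored).

    embedding_size=None: a single V x H table (BERT-style).
    embedding_size=E:    a V x E table plus an E x H projection (ALBERT-style).
    """
    if embedding_size is None:
        return vocab_size * hidden_size
    return vocab_size * embedding_size + embedding_size * hidden_size


# Illustrative sizes only (assumed, not taken from the summary).
V, H, E = 30_000, 768, 128

bert_style = embedding_params(V, H)        # 30,000 * 768
albert_style = embedding_params(V, H, E)   # 30,000 * 128 + 128 * 768

print(f"BERT-style embedding weights:   {bert_style:,}")
print(f"ALBERT-style embedding weights: {albert_style:,}")
print(f"Reduction factor: {bert_style / albert_style:.2f}x")
```

Under these assumed sizes the factorized table needs roughly one sixth of the embedding weights, which is one reason an ALBERT-based model can be cheaper to train and store than a BERT-based one of the same hidden size.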

Main Methods:

Trained 8 variants of BioALBERT on biomedical (PubMed, PubMed Central) and clinical (MIMIC-III) corpora.
Fine-tuned BioALBERT variants on 6 different BioNLP tasks across 20 benchmark datasets.
Evaluated model performance against existing state-of-the-art methods.

Main Results:

A large BioALBERT variant trained on PubMed achieved state-of-the-art performance on 5 out of 6 BioNLP tasks.
Significant improvements were observed in named-entity recognition (+11.09% BLURB score) and question answering (+2.83% BLURB score).
Five BioALBERT variants outperformed previous models on 17 out of 20 benchmark datasets, indicating robustness and generalizability.
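
A dataset-level tally such as "17 out of 20" can be computed by comparing each model's score with the previous best score on every dataset. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code. The dataset names and scores are placeholders invented for illustration and are not results from the paper, and it assumes that a higher score is better on every dataset.

```python
from typing import Dict


def count_wins(model_scores: Dict[str, float],
               baseline_scores: Dict[str, float]) -> int:
    """Count datasets on which the model strictly beats the baseline.

    Only datasets present in both dictionaries are compared.
    """
    shared = model_scores.keys() & baseline_scores.keys()
    return sum(model_scores[name] > baseline_scores[name] for name in shared)


# Placeholder values for illustration only; NOT results from the paper.
previous_best = {"dataset_a": 0.80, "dataset_b": 0.75, "dataset_c": 0.90}
candidate = {"dataset_a": 0.85, "dataset_b": 0.74, "dataset_c": 0.92}

wins = count_wins(candidate, previous_best)
print(f"Candidate beats the previous best on {wins} of {len(previous_best)} datasets")
```

A task-level claim such as "5 out of 6 tasks" would additionally require aggregating dataset scores within each task, for example by averaging, before comparing.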

Conclusions:

BioALBERT demonstrates superior performance and generalizability across a wide range of BioNLP tasks.
The model establishes new state-of-the-art results, providing valuable baselines for future research.
BioALBERT is freely available, which spares the BioNLP community the computational cost of training a comparable model and supports further work in the field.