



Improving dictionary-based named entity recognition with deep learning.

Katerina Nastou¹, Mikaela Koutrouli¹, Sampo Pyysalo²

  • ¹Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3, Copenhagen, 2200, Denmark.

Bioinformatics (Oxford, England) | September 4, 2024
Summary
This summary is machine-generated.

This study introduces an automated method to improve biomedical named entity recognition (NER) by generating better blocklists. The new approach enhances text mining precision for genes, diseases, species, and chemicals.


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Natural Language Processing

Background:

  • Dictionary-based named entity recognition (NER) is crucial for normalizing biomedical terms.
  • Current methods for creating blocklists of problematic names are manual, inefficient, and do not scale well.
  • Adapting NER to new entity types requires extensive manual curation of dictionaries and blocklists.
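
To make the role of a blocklist concrete, here is a minimal sketch of dictionary-based tagging in Python. It is not the tagger used in the study. The dictionary entries, the identifiers (`GENE:0001` and so on), the blocklist, and the function `tag` are all hypothetical and chosen only for illustration. The example relies on the well-known problem that some gene symbols (such as "WAS") collide with common English words.

```python
import re

# Hypothetical toy dictionary: lowercase name -> (entity type, made-up identifier).
DICTIONARY = {
    "tp53": ("gene", "GENE:0001"),
    "was": ("gene", "GENE:0002"),      # gene symbol that collides with an English word
    "aspirin": ("chemical", "CHEM:0001"),
}

# Hypothetical blocklist: names that should never be tagged, even if in the dictionary.
BLOCKLIST = {"was"}


def tag(text, dictionary=DICTIONARY, blocklist=frozenset()):
    """Return (start, end, surface form, entity type, identifier) for each match.

    Matching is a case-insensitive lookup of single alphanumeric tokens,
    which is a simplification assumed for this sketch.
    """
    matches = []
    for m in re.finditer(r"[A-Za-z0-9]+", text):
        name = m.group().lower()
        if name in dictionary and name not in blocklist:
            entity_type, identifier = dictionary[name]
            matches.append((m.start(), m.end(), m.group(), entity_type, identifier))
    return matches


text = "Aspirin was given to TP53 mutant mice."
without_blocklist = tag(text)                    # also tags the verb "was" as a gene
with_blocklist = tag(text, blocklist=BLOCKLIST)  # the false positive is suppressed
```

Without the blocklist the verb "was" is normalized to a gene identifier. With it, only "Aspirin" and "TP53" are tagged, which is the kind of false positive reduction the summary describes.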

Purpose of the Study:

  • To develop an automated approach for generating improved blocklists for biomedical NER.
  • To enhance the precision of text mining by reducing false positives in entity recognition.
  • To create context-aware blocklists for more accurate entity identification across different documents.

Main Methods:

  • Generated a large dataset of over 12.5 million text spans by comparing three established biomedical NER systems.
  • Developed positive and negative context examples for four entity types (genes, diseases, species, chemicals).
  • Trained a Transformer-based model (BioBERT) for entity type classification to identify names requiring blocking.
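
The summary does not say how classifier output is turned into blocklist entries. One plausible way, sketched below in Python, is to count how often the predicted type for a name agrees with the type its dictionary assigns, and to block names that rarely agree. The function `build_blocklists`, the input tuple layout, and the thresholds `min_support` and `max_match_rate` are assumptions made for this sketch and are not taken from the study.

```python
from collections import defaultdict


def build_blocklists(predictions, min_support=2, max_match_rate=0.5):
    """Aggregate per-mention predictions into corpus-wide and per-document blocklists.

    predictions: iterable of (doc_id, name, dictionary_type, predicted_type).
    A name is blocked when it has at least `min_support` mentions and fewer than
    `max_match_rate` of them are predicted to have the dictionary's type.
    Both thresholds are assumed values, not ones reported in the summary.
    """
    corpus = defaultdict(lambda: [0, 0])    # name -> [agreeing mentions, all mentions]
    per_doc = defaultdict(lambda: [0, 0])   # (doc_id, name) -> same counts per document
    for doc_id, name, dict_type, pred_type in predictions:
        hit = int(dict_type == pred_type)
        for counter in (corpus[name], per_doc[(doc_id, name)]):
            counter[0] += hit
            counter[1] += 1

    def blocked(counts):
        return {
            key
            for key, (hits, total) in counts.items()
            if total >= min_support and hits / total < max_match_rate
        }

    corpus_block = blocked(corpus)
    # Per-document entries are only needed for names not already blocked corpus-wide.
    doc_block = {key for key in blocked(per_doc) if key[1] not in corpus_block}
    return corpus_block, doc_block
```

In this sketch a name that is wrong almost everywhere lands in the corpus-wide set, while a name that is wrong only in particular documents is blocked in those documents alone, which is one reading of "context-aware".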

Main Results:

  • The trained BioBERT model achieved an F1-score of 96.7% for entity type classification.
  • The automated method generated a blocklist about twice the size of the previous corpus-wide list.
  • Text mining precision increased by approximately 5.5% on average (over 8.5% for chemicals, 7.5% for genes) with a minimal recall drop of 0.6%.
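
For readers less familiar with the metrics quoted above, the following Python sketch shows how precision, recall, and F1 are computed from match counts, and why removing false positives raises precision at a small cost in recall. The counts are invented for illustration and are not the study's data.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Invented counts: blocking removes 20 false positives but also 1 true positive,
# which becomes a false negative.
before = precision_recall_f1(tp=90, fp=30, fn=10)
after = precision_recall_f1(tp=89, fp=10, fn=11)
```

With these invented numbers, precision rises from 0.75 to about 0.90 while recall falls from 0.90 to 0.89, the same direction of trade-off as the reported results.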

Conclusions:

  • Automated generation of context-aware blocklists substantially improves biomedical NER performance.
  • The developed method effectively reduces false positives, enhancing precision in biological databases like STRING.
  • This approach offers a scalable and efficient solution for maintaining and improving biomedical NER systems.