Natural Language Processing Methods Automate Molecular Marker Extraction From Glioma Pathology Reports.

Nader I Maarouf1, David Reinecke1,2, Andrew Smith1

  • 1Department of Neurosurgery, New York University Langone Medical Center, New York, New York, USA.

Neurosurgery | March 27, 2026
Summary
This summary is machine-generated.

Simple Natural Language Processing (NLP) methods, such as Regular Expressions (RegEx), accurately extract molecular markers (IDH, ATRX) from glioma pathology reports. These methods require fewer computational resources than complex deep learning models, which can accelerate biomarker research.

Keywords: ATRX, BERT, EHR, IDH, NLP, Regex, TF-IDF

Area of Science:

  • Computational pathology
  • Bioinformatics
  • Natural Language Processing (NLP)

Background:

  • Accurate molecular marker status, specifically isocitrate dehydrogenase (IDH) and alpha-thalassemia/mental retardation syndrome X-linked (ATRX), is critical for glioma classification and treatment.
  • Manual extraction of these markers from pathology reports presents a significant bottleneck for research.
  • Evaluating the performance of NLP approaches with varying computational complexity is essential for optimizing research workflows.

Purpose of the Study:

  • To compare the effectiveness of three Natural Language Processing (NLP) approaches for extracting IDH and ATRX molecular markers from glioma pathology reports: Regular Expressions (RegEx), Term Frequency-Inverse Document Frequency (TF-IDF), and Bidirectional Encoder Representations from Transformers (BERT).
  • To determine if more computationally intensive NLP methods offer significant performance advantages over simpler methods in computational pathology research.
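
As an illustration of the simplest of the three approaches, the following is a minimal RegEx sketch in Python. The phrase lists, the label names, and the `extract_marker` function are hypothetical and are not the rules used in the study. The sketch also assumes that a "negative" IDH result means wild-type, and it does not handle negation (for example, "no loss of ATRX").

```python
import re

# Hypothetical phrase lists for illustration only; a real rule set would be
# built from the wording actually found in each institution's reports.
_STATUS = {
    "IDH": {
        "wildtype": re.compile(
            r"\bidh[-\s]?[12]?\b[^.;]*?\b(?:wild[-\s]?type|negative)\b", re.I),
        "mutant": re.compile(
            r"\bidh[-\s]?[12]?\b[^.;]*?\b(?:mutant|mutated|positive)\b", re.I),
    },
    "ATRX": {
        "lost": re.compile(
            r"\batrx\b[^.;]*?\b(?:lost|loss)\b|\bloss of\b[^.;]*?\batrx\b", re.I),
        "retained": re.compile(
            r"\batrx\b[^.;]*?\b(?:retained|intact|preserved)\b", re.I),
    },
}


def extract_marker(report: str, marker: str) -> str:
    """Return the first status label whose pattern matches, else 'unknown'.

    Each pattern looks for the marker name and a status word inside the same
    sentence-like span (no '.' or ';' between them).
    """
    for label, pattern in _STATUS[marker].items():
        if pattern.search(report):
            return label
    return "unknown"


if __name__ == "__main__":
    text = "IDH1 (R132H): negative. ATRX: nuclear expression retained."
    print(extract_marker(text, "IDH"), extract_marker(text, "ATRX"))
```

Rules of this kind are cheap to run and easy to audit, which is consistent with the low memory footprint reported for RegEx, but they would need to be checked against each institution's report wording.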

Main Methods:

  • Analysis of pathology reports from 404 patients (Institution A) and, for external validation, 197 patients (Institution B).
  • Application of identical preprocessing steps, including text normalization and terminology standardization, to all evaluated NLP approaches.
  • Performance evaluation using standard classification metrics (accuracy, AUC) and memory usage benchmarks on both internal and external datasets.
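
The steps above can be sketched with the Python standard library. This is an assumption-laden illustration, not the study's pipeline: the synonym map, the function names, and the use of `tracemalloc` (which tracks only Python-level allocations) are all hypothetical choices, and the summary does not say how memory was actually measured.

```python
import re
import tracemalloc

# Hypothetical terminology rules; the study's actual standardization
# steps are not described in this summary.
_SYNONYMS = [
    (re.compile(r"\bwild[\s-]?type\b|\bwt\b"), "wildtype"),
    (re.compile(r"\bidh[\s-]?([12])\b"), r"idh\1"),
]


def normalize(text: str) -> str:
    """Lowercase, collapse whitespace, and map spelling variants to one token."""
    text = re.sub(r"\s+", " ", text.lower()).strip()
    for pattern, replacement in _SYNONYMS:
        text = pattern.sub(replacement, text)
    return text


def accuracy(predicted, expected) -> float:
    """Fraction of reports whose predicted label equals the reference label."""
    pairs = list(zip(predicted, expected))
    return sum(p == e for p, e in pairs) / len(pairs)


def auc(scores, labels) -> float:
    """AUC as the chance a random positive outscores a random negative.

    Labels are 1 (positive) or 0 (negative); tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def peak_memory_mb(func, *args):
    """Run func(*args); return (result, peak traced Python memory in MB)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / (1024 * 1024)


if __name__ == "__main__":
    cleaned, mb = peak_memory_mb(normalize, "IDH-1  Wild Type\nATRX retained")
    print(cleaned, f"{mb:.3f} MB")
```

Applying the same `normalize` step before every method, as the study did with its own preprocessing, keeps the comparison about the extraction approach rather than about differences in text cleanup.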

Main Results:

  • Simpler NLP approaches, RegEx and TF-IDF, outperformed complex BERT-based models in accuracy and AUC for both IDH and ATRX marker extraction on external validation data.
  • RegEx achieved near-perfect accuracy (99-100%) and TF-IDF maintained high accuracy (94.2-98.0%) for both markers.
  • BERT-based approaches required substantially more memory (1825-1953 MB) than RegEx (0.82-5.52 MB) and TF-IDF (17.27-34.89 MB).
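
To put the memory figures in proportion, the short calculation below derives lower and upper bounds on how many times more memory BERT used, using only the ranges reported above. The function name `ratio_bounds` is an illustrative choice. Because the summary gives ranges rather than per-task pairs, only bounds can be stated.

```python
def ratio_bounds(big, small):
    """Bounds on big/small when each is known only as a (low, high) range."""
    return big[0] / small[1], big[1] / small[0]


# Memory ranges in MB, as reported in the results above.
BERT_MB = (1825, 1953)
REGEX_MB = (0.82, 5.52)
TFIDF_MB = (17.27, 34.89)

for name, rng in (("RegEx", REGEX_MB), ("TF-IDF", TFIDF_MB)):
    low, high = ratio_bounds(BERT_MB, rng)
    print(f"BERT vs {name}: {low:.0f}x to {high:.0f}x")
```

On these figures, BERT used somewhere between roughly 330 and 2,380 times the memory of RegEx, and between roughly 52 and 113 times the memory of TF-IDF.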

Conclusions:

  • Simple NLP approaches, particularly RegEx, provide a highly accurate and computationally efficient solution for automating molecular marker extraction from pathology reports.
  • The findings suggest that simpler NLP methods are sufficient for many computational pathology research tasks, enabling larger sample sizes and multi-institutional analyses.
  • Future research should focus on validating these findings across larger datasets and integrating NLP tools for broader application in biomarker research.