



Sharing models and tools for processing German clinical texts.

Johannes Hellrich¹, Franz Matthies¹, Erik Faessler¹

¹Jena University Language & Information Engineering (JULIE) Lab, Friedrich-Schiller-Universität Jena, Jena, Germany.

Studies in Health Technology and Informatics

May 21, 2015

Summary

This summary is machine-generated.

Developing NLP tools for German clinical texts is difficult because public language resources are scarce. This study shares statistical models trained on protected data; on the same dataset, the models outperform existing tools for sentence splitting, tokenization, and POS tagging.


Area of Science:

Natural Language Processing (NLP)
Computational Linguistics
Medical Informatics

Background:

Automatic processing of non-English clinical documents is hindered by a scarcity of public medical language resources.
Existing NLP tools often lack sufficient training data for specialized domains like German clinical texts.

Purpose of the Study:

To address the lack of resources for German clinical NLP by proposing the sharing of statistical models.
To develop and evaluate NLP components for sentence splitting, tokenization, and Part-of-Speech (POS) tagging of German clinical documents.

Main Methods:

Training of sentence splitting, tokenization, and POS tagging models using the confidential FRAMED corpus.
Utilizing statistical models derived from access-protected German clinical documents.
Comparative evaluation against established NLP toolkits (OpenNLP, Stanford POS tagger).
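
The comparative evaluation listed above can be pictured with a small scoring sketch. It assumes token-level accuracy as the metric, which this summary does not actually name, and it uses invented toy sentences with STTS-style tags rather than anything from the FRAMED corpus. The names `tagging_accuracy`, `tagger_a`, and `tagger_b` are hypothetical placeholders, not components from the study.

```python
from typing import Sequence


def tagging_accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must align token by token")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Invented toy example (assumption): one sentence with STTS-style gold tags.
tokens = ["Der", "Patient", "klagt", "über", "Schmerzen", "."]
gold_tags = ["ART", "NN", "VVFIN", "APPR", "NN", "$."]

# Hypothetical outputs of two systems on the same tokens.
system_outputs = {
    "tagger_a": ["ART", "NN", "VVFIN", "APPR", "NN", "$."],
    "tagger_b": ["ART", "NN", "NN", "APPR", "NN", "$."],
}

# Score every system against the same gold standard and rank them.
scores = {
    name: tagging_accuracy(gold_tags, tags)
    for name, tags in system_outputs.items()
}
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
```

Scoring all systems against one shared gold standard is what makes the comparison fair; the same pattern would apply to sentence splitting and tokenization if boundaries were compared instead of tags.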

Main Results:

The models trained on the FRAMED corpus outperformed the alternative components from OpenNLP and the Stanford POS tagger on the same dataset.
The study provides functional NLP components for sentence splitting, tokenization, and POS tagging of German clinical text.

Conclusions:

Sharing statistical models derived from protected data is a viable strategy to overcome resource limitations in specialized NLP tasks.
The proposed models offer improved performance for fundamental NLP tasks in the German clinical domain.
Sharing such models facilitates further research and development in clinical NLP for under-resourced languages.
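
The idea of sharing a model instead of the protected data can be sketched as follows. This is a deliberately simple illustration, not the study's approach: it assumes a most-frequent-tag (unigram) tagger and JSON as the exchange format, and the training sentences are invented stand-ins for confidential documents. All function and variable names are hypothetical.

```python
import json
from collections import Counter, defaultdict


def train_unigram_tagger(tagged_sentences):
    """Map each lowercased word form to its most frequent tag."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {
        word: tag_counts.most_common(1)[0][0]
        for word, tag_counts in counts.items()
    }


def tag(model, tokens, default="NN"):
    """Tag tokens with the model, falling back to a default for unseen words."""
    return [model.get(token.lower(), default) for token in tokens]


# Invented stand-in for protected training data (assumption); it stays local.
protected_sentences = [
    [("Der", "ART"), ("Patient", "NN"), ("klagt", "VVFIN"),
     ("über", "APPR"), ("Schmerzen", "NN"), (".", "$.")],
    [("Die", "ART"), ("Patientin", "NN"), ("klagt", "VVFIN"),
     ("über", "APPR"), ("Übelkeit", "NN"), (".", "$.")],
]

model = train_unigram_tagger(protected_sentences)

# Only the serialized model is shared, never the sentences themselves.
shared = json.dumps(model, ensure_ascii=False, sort_keys=True)

# A recipient restores the model and tags new text without the training data.
restored = json.loads(shared)
predicted = tag(restored, ["Die", "Patientin", "klagt", "über", "Schmerzen", "."])
print(predicted)
```

Even a model this simple exposes the word forms it was trained on, so whether a particular model is safe to release remains a separate judgment from whether it is useful.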