Large-scale event extraction from literature with multi-level gene normalization.

Sofie Van Landeghem¹, Jari Björne, Chih-Hsuan Wei

  • ¹Department of Plant Systems Biology, VIB, Gent, Belgium.

PLOS ONE | April 25, 2013
Summary
This summary is machine-generated.

This study introduces an automated text mining system for life sciences, linking biological concepts to database identifiers across millions of articles. The resulting comprehensive dataset aids database curation and pathway analysis.

Area of Science:

  • Biomedical Informatics
  • Computational Biology
  • Text Mining

Background:

  • Automated text mining is crucial for life sciences, aiding database curation, knowledge summarization, and information retrieval.
  • Scaling text mining tools to millions of articles and linking analyses to biomolecular databases (e.g., UniProt, KEGG) is essential for comprehensive coverage.

Purpose of the Study:

  • To develop and evaluate a text mining strategy that normalizes biological concepts in text to database identifiers.
  • To create a large-scale, publicly available resource of biomolecular events and gene/protein mentions from biomedical literature.

Main Methods:

  • Combined and improved two state-of-the-art text mining components for normalization and event extraction.
  • Processed 21.9 million PubMed abstracts and 460,000 PubMed Central open access full-text articles.
  • Mapped biological concepts to identifiers at varying granularity levels.
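
As a rough illustration of what mapping mentions to identifiers "at varying granularity levels" could look like, the sketch below tries a species-specific gene identifier first and falls back to a coarser family-level identifier. This is not the study's implementation: the lexicons, the identifiers (`GENE:0001`, `FAMILY:ESR`), and the `normalize` function are hypothetical placeholders invented for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy lexicon: (lowercased symbol, species) -> gene-level identifier.
# The identifiers are placeholders, not real database entries.
GENE_LEXICON = {
    ("esr1", "human"): "GENE:0001",
    ("esr1", "mouse"): "GENE:0002",
}

# Hypothetical coarser lexicon: lowercased symbol -> family-level identifier.
FAMILY_LEXICON = {
    "esr1": "FAMILY:ESR",
}


@dataclass(frozen=True)
class Normalization:
    mention: str               # the text as it appeared in the article
    identifier: Optional[str]  # resolved identifier, or None
    level: str                 # "gene", "family", or "unresolved"


def normalize(mention: str, species: Optional[str] = None) -> Normalization:
    """Resolve a mention at the finest granularity the context allows."""
    key = mention.strip().lower()
    # Finest level: a species-specific gene, only possible if the species is known.
    if species is not None:
        gene_id = GENE_LEXICON.get((key, species.lower()))
        if gene_id is not None:
            return Normalization(mention, gene_id, "gene")
    # Coarser level: a species-independent family identifier.
    family_id = FAMILY_LEXICON.get(key)
    if family_id is not None:
        return Normalization(mention, family_id, "family")
    return Normalization(mention, None, "unresolved")
```

The fallback order is the design point: a mention whose species cannot be determined still receives a usable, if coarser, identifier instead of being discarded.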

Main Results:

  • Generated a dataset of 40 million biomolecular events involving 76 million gene/protein mentions across 5032 species.
  • Linked mentions to 122,000 distinct genes.
  • Demonstrated promising results for database and pathway curation.
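
To make the relationship between events, mentions, and distinct genes concrete, here is a minimal sketch of how normalized events might be represented and aggregated per gene, for example to rank genes for curation. The `Mention` and `Event` classes, the event type strings, and the identifiers are assumptions made for illustration; they do not reflect the dataset's actual schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Mention:
    text: str               # gene/protein mention as written in the article
    gene_id: Optional[str]  # normalized identifier, or None if unresolved


@dataclass(frozen=True)
class Event:
    event_type: str                 # illustrative label, e.g. "Binding"
    arguments: Tuple[Mention, ...]  # mentions taking part in the event


def distinct_genes(events: Iterable[Event]) -> Set[str]:
    """Collect every normalized gene identifier that appears in any event."""
    return {
        m.gene_id
        for e in events
        for m in e.arguments
        if m.gene_id is not None
    }


def events_per_gene(events: Iterable[Event]) -> Counter:
    """Count, for each gene, the number of events it takes part in."""
    counts: Counter = Counter()
    for e in events:
        # Use a set so a gene mentioned twice in one event is counted once.
        for gene_id in {m.gene_id for m in e.arguments if m.gene_id is not None}:
            counts[gene_id] += 1
    return counts
```

Because many mentions can resolve to the same identifier, the number of distinct genes is expected to be far smaller than the number of mentions, which matches the proportions reported above.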

Conclusions:

  • The developed text mining approach and resulting dataset offer significant value for life science research and database curation.
  • The software components are open-source, and the dataset is freely accessible via API and bulk download, promoting further bioinformatic analyses.