Multiple Confabulations Found in Bioinformatics Tasks Carried Out by Several Free Large Language Models.

Diego A Forero¹

¹School of Health and Sport Sciences, Fundación Universitaria del Área Andina, Bogotá, Colombia.

Current Genomics

May 21, 2026

Summary

This summary is machine-generated.

This study evaluated six Large Language Models (LLMs) for bioinformatics tasks, finding numerous inaccuracies (confabulations) in their outputs. Automatic code generation showed promise, but further research is needed to address LLM errors in scientific applications.

Keywords:

Bioinformatics; computational genomics; generative artificial intelligence; large language models

Area of Science:

Bioinformatics and computational genomics
Application of Large Language Models (LLMs) in health sciences

Background:

Bioinformatics relies heavily on computational analysis of experimental data.
Evaluating the accuracy of general-purpose LLMs in scientific contexts is crucial.
Confabulations (inaccurate answers) are a concern when using LLMs for research.

Purpose of the Study:

To assess the performance of six freely available LLMs in common bioinformatics and computational genomics tasks.
To identify the types and frequency of errors, specifically confabulations, generated by these LLMs.
To explore the potential of LLMs in automating bioinformatics workflows.

Main Methods:

Six LLMs (Gemini, ChatGPT, Grok, Claude, Llama, DeepSeek) were tested.
Tasks included identifier conversion, DNA sequence simulation, polymorphism analysis, orthologue retrieval, gene ontology/pathway identification, volcano plot interpretation, and R code generation.
Performance was evaluated based on accuracy and the presence of confabulations.
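
The summary does not describe how answers were scored, so the following is only a minimal sketch of one way an accuracy check for a task such as identifier conversion could look. It assumes a trusted reference mapping is available (for example, one exported from a curated database) and compares each LLM answer against it. The function name `score_answers`, the `Score` class, and the example LLM answers are hypothetical and are not taken from the study.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Score:
    """Counts of answer outcomes for one model on one task."""
    correct: int
    wrong: int      # an answer was given but disagrees with the reference
    missing: int    # no answer was given

    @property
    def accuracy(self) -> float:
        total = self.correct + self.wrong + self.missing
        return self.correct / total if total else 0.0


def score_answers(reference: dict[str, str],
                  answers: dict[str, Optional[str]]) -> Score:
    """Compare LLM answers against a trusted reference mapping.

    Hypothetical helper: a wrong answer is counted as a candidate
    confabulation and should still be reviewed manually.
    """
    correct = wrong = missing = 0
    for key, expected in reference.items():
        got = answers.get(key)
        if got is None or not got.strip():
            missing += 1
        elif got.strip() == expected:
            correct += 1
        else:
            wrong += 1
    return Score(correct, wrong, missing)


if __name__ == "__main__":
    # Gene symbol -> NCBI Gene ID; verify against the database before reuse.
    reference = {"TP53": "7157", "BRCA1": "672", "EGFR": "1956"}
    # Made-up model output, for illustration only.
    llm_answers = {"TP53": "7157", "BRCA1": "675", "EGFR": None}
    result = score_answers(reference, llm_answers)
    print(result, f"accuracy={result.accuracy:.2f}")
```

Separating wrong answers from missing ones matters here, because a model that declines to answer is behaving differently from one that returns a confident but incorrect identifier.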

Main Results:

A significant number of confabulations and errors were observed across various tasks for all tested LLMs.
Performance varied depending on the task's complexity and the specific LLM used.
Automatic generation of R code for data visualization emerged as a potentially reliable application.

Conclusions:

Current general-purpose LLMs exhibit notable inaccuracies when applied to bioinformatics tasks.
Automatic code generation is a promising area for LLM application in bioinformatics.
Further investigation is required to understand and mitigate confabulations in LLMs for scientific research.
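
As an illustration of what mitigation could mean in practice, the sketch below checks an LLM-produced DNA sequence against the properties that were requested in the prompt. This is an assumption about one possible safeguard, not a method reported in the study; the function `check_dna_sequence` and its parameters are hypothetical.

```python
from __future__ import annotations


def check_dna_sequence(seq: str,
                       expected_length: int,
                       gc_target: float,
                       gc_tolerance: float = 0.05) -> list[str]:
    """Return a list of problems found in a simulated DNA sequence.

    An empty list means the sequence passed all three checks:
    valid alphabet, requested length, and GC fraction within tolerance.
    """
    problems: list[str] = []
    # Remove whitespace and line breaks that chat output often contains.
    cleaned = "".join(seq.split()).upper()

    invalid = sorted(set(cleaned) - set("ACGT"))
    if invalid:
        problems.append(f"invalid characters: {''.join(invalid)}")

    if len(cleaned) != expected_length:
        problems.append(
            f"length {len(cleaned)} != requested {expected_length}")

    if cleaned:
        gc = (cleaned.count("G") + cleaned.count("C")) / len(cleaned)
        if abs(gc - gc_target) > gc_tolerance:
            problems.append(
                f"GC fraction {gc:.2f} outside "
                f"{gc_target:.2f} ± {gc_tolerance:.2f}")

    return problems


if __name__ == "__main__":
    # Made-up model output, for illustration only.
    print(check_dna_sequence("ACGTACGTAC", expected_length=10, gc_target=0.5))
    print(check_dna_sequence("ACGTNN", expected_length=10, gc_target=0.5))
```

Checks like this only catch errors that can be verified mechanically; answers about gene function or pathways would still need comparison with curated databases.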