Multiple Confabulations Found in Bioinformatics Tasks Carried Out by Several Free Large Language Models.

Diego A Forero¹

  • ¹ School of Health and Sport Sciences, Fundación Universitaria del Área Andina, Bogotá, Colombia.

Current Genomics | May 21, 2026
Summary
This summary is machine-generated.

This study evaluated six Large Language Models (LLMs) for bioinformatics tasks, finding numerous inaccuracies (confabulations) in their outputs. Automatic code generation showed promise, but further research is needed to address LLM errors in scientific applications.

Keywords:
Bioinformatics, computational genomics, generative artificial intelligence, large language models

Area of Science:

  • Bioinformatics and computational genomics
  • Application of Large Language Models (LLMs) in health sciences

Background:

  • Bioinformatics relies heavily on computational analysis of experimental data.
  • Evaluating the accuracy of general-purpose LLMs in scientific contexts is crucial.
  • Confabulations (inaccurate answers) are a concern when using LLMs for research.

Purpose of the Study:

  • To assess the performance of six freely available LLMs in common bioinformatics and computational genomics tasks.
  • To identify the types and frequency of errors, specifically confabulations, generated by these LLMs.
  • To explore the potential of LLMs in automating bioinformatics workflows.

Main Methods:

  • Six LLMs (Gemini, ChatGPT, Grok, Claude, Llama, DeepSeek) were tested.
  • Tasks included identifier conversion, DNA sequence simulation, polymorphism analysis, orthologue retrieval, gene ontology/pathway identification, volcano plot interpretation, and R code generation.
  • Performance was evaluated based on accuracy and the presence of confabulations.
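
As an illustration only, one way to check answers for a task such as identifier conversion is to compare each LLM answer with a trusted reference mapping and count the outcomes. The Python sketch below is not the scoring procedure used in the study, which this summary does not describe. The function name `score_answers`, the outcome labels, and the example LLM answers are all assumptions made for this example. A mismatch is only a candidate confabulation and would still need manual review.

```python
from collections import Counter

# Trusted reference: HGNC gene symbol -> Ensembl gene ID.
REFERENCE = {
    "TP53": "ENSG00000141510",
    "BRCA1": "ENSG00000012048",
    "EGFR": "ENSG00000146648",
}

# Hypothetical LLM output, invented for this illustration only.
LLM_ANSWERS = {
    "TP53": "ENSG00000141510",   # matches the reference
    "BRCA1": "ENSG00000139618",  # well-formed ID that does not match the reference
    "EGFR": "",                  # no usable answer returned
}


def score_answers(reference, answers):
    """Count correct, mismatched, and missing answers against a reference."""
    counts = Counter()
    for query, expected in reference.items():
        given = (answers.get(query) or "").strip()
        if not given:
            counts["missing"] += 1
        elif given.upper() == expected.upper():
            counts["correct"] += 1
        else:
            # Plausible-looking but wrong: a candidate confabulation.
            counts["mismatch"] += 1
    return counts


results = score_answers(REFERENCE, LLM_ANSWERS)
accuracy = results["correct"] / len(REFERENCE)
print(dict(results), f"accuracy={accuracy:.2f}")
```

Keeping missing answers separate from mismatches is a deliberate choice in this sketch: an empty answer is a failure, but only a confident wrong answer fits the description of a confabulation given above.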

Main Results:

  • A significant number of confabulations and errors were observed across various tasks for all tested LLMs.
  • Performance varied depending on the task's complexity and the specific LLM used.
  • Automatic generation of R code for data visualization emerged as a potentially reliable application.

Conclusions:

  • Current general-purpose LLMs exhibit notable inaccuracies when applied to bioinformatics tasks.
  • Automatic code generation is a promising area for LLM application in bioinformatics.
  • Further investigation is required to understand and mitigate confabulations in LLMs for scientific research.