vir2vec: A Viral Genome-Wide Viral Embedding.

Simone Rancati¹,²,³, Pablo Arozarena Donelli¹, Giovanna Nicora¹

  • ¹Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Via Adolfo Ferrata 5, Pavia, 27100, Italy.

bioRxiv : the preprint server for biology | December 22, 2025
Summary
This summary is machine-generated.

We developed vir2vec, a large genomic language model trained on diverse viral genomes, and vGUE, a benchmark for evaluating viral genome understanding. vir2vec achieved superior balanced accuracy on seven of eight viral classification tasks, outperforming human-DNA-trained and existing viral-specific models and supporting genomic surveillance and discovery.

Area of Science:

  • Genomics
  • Bioinformatics
  • Machine Learning

Background:

  • Genomic language models (gLMs) show promise for DNA analysis but lack viral-specific architectures and benchmarks.
  • Existing models often focus on human DNA or limited viral datasets.
  • A comprehensive evaluation framework for viral genome representation learning is needed.

Purpose of the Study:

  • Introduce vir2vec, a large-scale genomic language model pretrained on a diverse pan-viral corpus.
  • Present vGUE, a unified benchmark for assessing viral genome understanding and representation learning.
  • Evaluate vir2vec's performance on various viral genomic prediction tasks.

Main Methods:

  • Continual pretraining of Mistral-DNA on 565,747 complete viral genomes from 295 species to create vir2vec (422M parameters).
  • Development of vGUE, a benchmark in which vir2vec embeddings are fed into classifiers (logistic regression, SVM, random forests) and evaluated under nested cross-validation.
  • Assessment of prediction tasks including organism discrimination, evolutionary fingerprinting, species separation, variant typing, and phenotypic context detection.
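
The evaluation protocol above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' pipeline: the embeddings are synthetic Gaussian clusters standing in for vir2vec outputs, and the fold counts, hyperparameter grids, and names (`make_toy_embeddings`, `nested_cv_scores`, `CANDIDATES`) are hypothetical. The inner loop tunes hyperparameters; the outer loop estimates balanced accuracy on data the tuning never saw.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_toy_embeddings(n_per_class=40, dim=16, seed=0):
    """Synthetic stand-in for genome embeddings: two Gaussian clusters."""
    rng = np.random.default_rng(seed)
    X = np.vstack([
        rng.normal(0.0, 1.0, size=(n_per_class, dim)),
        rng.normal(1.0, 1.0, size=(n_per_class, dim)),
    ])
    y = np.repeat([0, 1], n_per_class)
    return X, y


# Assumed classifiers and grids; the source names the model families only.
CANDIDATES = {
    "logreg": (
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.1, 1.0, 10.0]},
    ),
    "svm": (
        make_pipeline(StandardScaler(), SVC()),
        {"svc__C": [0.1, 1.0, 10.0]},
    ),
    "rf": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100]},
    ),
}


def nested_cv_scores(X, y, seed=0):
    """Return the mean outer-fold balanced accuracy for each classifier."""
    inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed + 1)
    results = {}
    for name, (estimator, grid) in CANDIDATES.items():
        # Inner loop: hyperparameter search on each outer training split.
        search = GridSearchCV(estimator, grid, cv=inner,
                              scoring="balanced_accuracy")
        # Outer loop: score the tuned model on held-out folds.
        scores = cross_val_score(search, X, y, cv=outer,
                                 scoring="balanced_accuracy")
        results[name] = float(scores.mean())
    return results


X, y = make_toy_embeddings()
scores = nested_cv_scores(X, y)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Wrapping `GridSearchCV` inside `cross_val_score` is what makes the procedure nested: each outer fold reruns the hyperparameter search on its own training portion only, so the reported score is not inflated by tuning on the test data.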

Main Results:

  • vir2vec achieved superior balanced accuracy on seven out of eight diverse viral classification tasks.
  • The model consistently outperformed human-DNA-trained and existing viral-specific genomic foundation models.
  • Embeddings from vir2vec effectively captured biologically relevant viral variation across multiple tasks.
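
Balanced accuracy, the metric reported above, is the mean of per-class recall, so it does not reward a classifier for simply predicting the majority class. A small self-contained sketch with made-up labels (not data from the study) shows how it differs from plain accuracy; the function name and labels are illustrative assumptions.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced labels: 8 "common" genomes and 2 "rare" ones.
y_true = ["common"] * 8 + ["rare"] * 2
# A trivial classifier that always predicts the majority class.
y_pred = ["common"] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(f"plain accuracy: {plain_accuracy:.2f}, balanced accuracy: {balanced:.2f}")
```

Here the majority-class predictor is right on 8 of 10 samples, yet its recall is 1.0 on "common" and 0.0 on "rare", so its balanced accuracy is only 0.5, the chance level for two classes.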

Conclusions:

  • vir2vec and vGUE establish a robust foundation for viral genomic modeling, surveillance, and discovery.
  • The developed tools offer improved capabilities for understanding viral diversity and evolution.
  • Responsible deployment of vir2vec requires ethical consideration and governance oversight because of its dual-use potential.