



BetaAlign: a deep learning approach for multiple sequence alignment.

Edo Dotan1,2, Elya Wygoda1, Noa Ecker1

  • 1The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel.

Bioinformatics (Oxford, England) | January 8, 2025
Summary
This summary is machine-generated.

Artificial intelligence (AI) using natural language processing (NLP) offers a novel approach to multiple sequence alignment (MSA). This AI-based method shows accuracy comparable to or exceeding current tools, advancing bioinformatics and phylogenomics.


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Multiple sequence alignment (MSA) is crucial for biological sequence analysis, including phylogenetics and protein structure prediction.
  • Traditional MSA methods face challenges with complex evolutionary dynamics.
  • The integration of artificial intelligence (AI) presents a new avenue for improving MSA inference.

Purpose of the Study:

  • To introduce and evaluate an AI-based methodology for multiple sequence alignment (MSA) using natural language processing (NLP) techniques.
  • To demonstrate the potential of NLP algorithms to address limitations in conventional MSA computation.
  • To improve the accuracy and efficiency of sequence alignment for various biological applications.

Main Methods:

  • Developed an AI-based approach, BetaAlign, leveraging NLP transformer models to infer MSAs.
  • Trained the AI model on simulated alignments to capture specific evolutionary dynamics.
  • Investigated the impact of training data size, transformer architectures, and subspace learning on alignment accuracy.
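One way to see how alignment can be posed as a language task is to write both the unaligned input and the aligned output as token sequences, so that a sequence-to-sequence model "translates" one into the other. The sketch below is illustrative only: the separator token, the column-by-column target layout, and the function names `encode_source`, `encode_target`, and `decode_target` are assumptions made for this example, not details confirmed by this summary.

```python
from typing import List

GAP = "-"
SEP = "|"  # hypothetical separator token between unaligned sequences


def encode_source(seqs: List[str]) -> List[str]:
    """Flatten unaligned sequences into one token list, joined by SEP."""
    tokens: List[str] = []
    for i, seq in enumerate(seqs):
        if i > 0:
            tokens.append(SEP)
        tokens.extend(seq)
    return tokens


def encode_target(aligned: List[str]) -> List[str]:
    """Flatten an alignment into tokens, reading it column by column."""
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("all aligned rows must have the same length")
    return [row[c] for c in range(width) for row in aligned]


def decode_target(tokens: List[str], n_seqs: int) -> List[str]:
    """Rebuild aligned rows from a column-by-column token list."""
    if len(tokens) % n_seqs != 0:
        raise ValueError("token count is not a multiple of the row count")
    return ["".join(tokens[i::n_seqs]) for i in range(n_seqs)]


# A toy "simulated" training pair: the true alignment is known,
# and the unaligned input is obtained by stripping the gaps.
true_alignment = ["AC-G", "A-TG"]
unaligned = [row.replace(GAP, "") for row in true_alignment]

source_tokens = encode_source(unaligned)
target_tokens = encode_target(true_alignment)
```

A model trained on many such pairs would emit target tokens that `decode_target` turns back into aligned rows. The round trip through `encode_target` and `decode_target` is lossless, which is the property a representation like this needs.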

Main Results:

  • BetaAlign achieved high accuracy in MSA inference, performing comparably to and sometimes outperforming state-of-the-art alignment tools.
  • The study characterized the performance of BetaAlign, identifying key factors influencing its accuracy.
  • A novel technique was introduced, leading to performance improvements over previous iterations of the AI aligner.
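Alignment accuracy is commonly measured by comparing an inferred alignment against a reference alignment. This summary does not state which metric the study used, so the sketch below assumes one widely used choice, the column score: the fraction of reference columns that appear unchanged in the inferred alignment. The helper names `columns` and `column_score` are hypothetical.

```python
from typing import List, Optional, Tuple

Column = Tuple[Optional[int], ...]


def columns(aligned: List[str]) -> List[Column]:
    """Describe each column by the residue index it holds in every row.

    A gap is recorded as None, so two columns are equal only if they
    pair up exactly the same residues of the same sequences.
    """
    counters = [0] * len(aligned)
    result: List[Column] = []
    for c in range(len(aligned[0])):
        col: List[Optional[int]] = []
        for r, row in enumerate(aligned):
            if row[c] == "-":
                col.append(None)
            else:
                col.append(counters[r])
                counters[r] += 1
        result.append(tuple(col))
    return result


def column_score(inferred: List[str], reference: List[str]) -> float:
    """Fraction of reference columns reproduced exactly in the inferred MSA."""
    ref_cols = columns(reference)
    if not ref_cols:
        raise ValueError("reference alignment has no columns")
    inferred_cols = set(columns(inferred))
    return sum(col in inferred_cols for col in ref_cols) / len(ref_cols)


reference = ["AC-G", "A-TG"]
perfect = column_score(["AC-G", "A-TG"], reference)
gapless = column_score(["ACG", "ATG"], reference)
```

In this toy case the gapless alignment recovers only the first and last reference columns, so it scores 0.5, while an identical alignment scores 1.0.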

Conclusions:

  • AI-based methods, particularly those built on NLP, show significant promise for improving sequence alignment.
  • These NLP solutions can potentially replace or augment traditional algorithms for MSA and other complex inference tasks in phylogenomics.
  • The findings highlight the growing importance of AI in advancing bioinformatics and comparative genomics.