BetaAlign: a deep learning approach for multiple sequence alignment.

Edo Dotan¹,², Elya Wygoda¹, Noa Ecker¹

¹The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel.

Bioinformatics (Oxford, England)

January 8, 2025

Summary

This summary is machine-generated.

Natural language processing (NLP) models offer a novel, AI-based approach to multiple sequence alignment (MSA). The method's accuracy is comparable to, and in some cases exceeds, that of current alignment tools, which may benefit bioinformatics and phylogenomics.

Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

Multiple sequence alignment (MSA) is crucial for biological sequence analysis, including phylogenetics and protein structure prediction.
Traditional MSA methods face challenges with complex evolutionary dynamics.
The integration of artificial intelligence (AI) presents a new avenue for improving MSA inference.

Purpose of the Study:

To introduce and evaluate an AI-based methodology for multiple sequence alignment (MSA) using natural language processing (NLP) techniques.
To demonstrate the potential of NLP algorithms to address limitations in conventional MSA computation.
To improve the accuracy and efficiency of sequence alignment for various biological applications.

Main Methods:

Developed an AI-based approach, BetaAlign, leveraging NLP transformer models to infer MSAs.
Trained the AI model on simulated alignments to capture specific evolutionary dynamics.
Investigated the impact of training data size, transformer architectures, and subspace learning on alignment accuracy.
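To make the idea of treating alignment as a language task more concrete, here is a minimal sketch of how an MSA problem could be serialized into a "source sentence" and a "target sentence" for a sequence-to-sequence transformer. The summary does not describe the encoding BetaAlign actually uses, so the separator token, the column-wise target layout, and all function names below are assumptions made for illustration only.

```python
from typing import List

SEP = "|"  # hypothetical separator token between unaligned sequences
GAP = "-"  # gap character in the aligned output


def encode_source(unaligned: List[str]) -> str:
    """Join the unaligned sequences into a single 'source sentence'."""
    return SEP.join(unaligned)


def encode_target(alignment: List[str]) -> str:
    """Serialize an alignment column by column into one 'target sentence'."""
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all alignment rows must have the same length")
    return "".join(row[c] for c in range(width) for row in alignment)


def decode_target(target: str, n_seqs: int) -> List[str]:
    """Rebuild the alignment rows from a column-wise target sentence."""
    if len(target) % n_seqs != 0:
        raise ValueError("target length is not a multiple of the sequence count")
    return ["".join(target[i::n_seqs]) for i in range(n_seqs)]


def is_valid_alignment(unaligned: List[str], alignment: List[str]) -> bool:
    """A model output is usable only if stripping gaps restores the inputs."""
    return [row.replace(GAP, "") for row in alignment] == list(unaligned)


if __name__ == "__main__":
    seqs = ["AAG", "ACGG"]
    msa = ["A-AG", "ACGG"]  # a toy alignment, e.g. produced by a simulator
    print(encode_source(seqs))
    print(encode_target(msa))
    print(decode_target(encode_target(msa), len(seqs)))
```

Training pairs of this kind could be generated from simulated alignments, since the simulator knows the true alignment for every set of unaligned sequences. The validity check matters because a generative model is not guaranteed to emit a well-formed alignment.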

Main Results:

BetaAlign achieved high accuracy in MSA inference, performing comparably to and sometimes outperforming state-of-the-art alignment tools.
The study characterized the performance of BetaAlign, identifying key factors influencing its accuracy.
A novel technique was introduced, leading to performance improvements over previous iterations of the AI aligner.
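The summary does not say which metric was used to measure alignment accuracy. One widely used option is the column score: the fraction of columns in a reference alignment that are reproduced exactly in the inferred alignment. The sketch below illustrates that metric under this assumption; the function names are hypothetical and not taken from the study.

```python
from typing import List, Set, Tuple

GAP = "-"


def columns(alignment: List[str]) -> Set[Tuple[int, ...]]:
    """Describe each column by the residue index it holds per sequence (-1 = gap)."""
    next_index = [0] * len(alignment)  # next residue index for each sequence
    cols = set()
    for c in range(len(alignment[0])):
        col = []
        for s, row in enumerate(alignment):
            if row[c] == GAP:
                col.append(-1)
            else:
                col.append(next_index[s])
                next_index[s] += 1
        cols.add(tuple(col))
    return cols


def column_score(reference: List[str], inferred: List[str]) -> float:
    """Fraction of reference columns that also appear in the inferred alignment."""
    ref_cols = columns(reference)
    return len(ref_cols & columns(inferred)) / len(ref_cols)


if __name__ == "__main__":
    reference = ["A-AG", "ACGG"]
    inferred = ["AA-G", "ACGG"]
    print(column_score(reference, inferred))
```

Comparing columns by residue index rather than by character means that two columns count as equal only when they pair up the same positions of the same sequences, not merely the same letters.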

Conclusions:

AI-based methods, particularly those utilizing NLP, show significant promise for revolutionizing sequence alignment.
These NLP solutions can potentially replace or augment traditional algorithms for MSA and other complex inference tasks in phylogenomics.
The findings highlight the growing importance of AI in advancing bioinformatics and comparative genomics.