Related Concept Videos

Genomic DNA in Eukaryotes

Eukaryotes have large genomes compared to prokaryotes. To fit their genomes into a cell, eukaryotic DNA is packaged extraordinarily tightly inside the nucleus. To achieve this, DNA is tightly wound around proteins called histones, which are packaged into nucleosomes that are joined by linker DNA and coil into chromatin fibers. Additional fibrous proteins further compact the chromatin, which is recognizable as chromosomes during certain phases of cell division.

Genomics

Genomics is the science of genomes: the study of all the genetic material of an organism. In humans, the genome consists of information carried in 23 pairs of chromosomes in the nucleus, as well as mitochondrial DNA. In genomics, both coding and non-coding DNA are sequenced and analyzed. Genomics allows a better understanding of all living things, their evolution, and their diversity. It has a myriad of uses: for example, to build phylogenetic trees, to improve productivity and...

Genome Size and the Evolution of New Genes

While every living organism has a genome of some kind (be it RNA or DNA), genome sizes vary considerably. One major factor that affects genome size is whether the organism is prokaryotic or eukaryotic. In prokaryotes, the genome contains little to no non-coding sequence, so genes are tightly clustered in groups or operons arranged sequentially along the chromosome. Conversely, the genes in eukaryotes are punctuated by long stretches of non-coding sequence.

Gene Duplication and Divergence

The seminal work of Ohno in 1970 popularized the idea of gene duplication and divergence. DNA sequence comparison studies reveal that a large portion of the genes in bacteria, archaebacteria, and eukaryotes was generated by gene duplication and divergence, indicating its critical role in evolution.
The duplicated copies of a gene are called paralogs. Paralogs with similar sequences and functions form a gene family. A large number of gene families have been characterized across several species.

Genome Annotation and Assembly

The genome refers to all of the genetic material in an organism. It can range from a few million base pairs in microbial cells to several billion base pairs in many eukaryotic organisms. Genome assembly refers to the process of taking the DNA sequencing data and putting it all back together in a correct order to create a close representation of the original genome. This is followed by the identification of functional elements on the newly assembled genome, a process called genome annotation.

Related Articles

Articles linked to this work by shared authors, journal, and citation graph.

Same author

On the state of protein function prediction: a report on the fourth CAFA challenge.

bioRxiv : the preprint server for biology·2026

Same author

Advances in Protein Function Prediction from the Fifth CAFA Challenge.

bioRxiv : the preprint server for biology·2026

Same author

Whole-genome prediction of bacterial pathogenic capacity on novel bacteria using protein language models with PathogenFinder2.

Bioinformatics (Oxford, England)·2026

Same author

Biocentral: Embedding-based Protein Predictions.

Journal of molecular biology·2026

Same author

Toxin data quality: a critical examination of bacterial exotoxins and animal toxins.

BMC research notes·2025

Same author

FlatProt: 2D visualization eases protein structure comparison.

BMC bioinformatics·2025

Same journal

conMItion: an R package adjusting confounding factors for associations in multi-omics.

Bioinformatics (Oxford, England)·2026

Same journal

SpaMFG: a Spatial Multi-omics Integration Method based on Feature Grouping.

Bioinformatics (Oxford, England)·2026

Same journal

CSCN: Inference of Cell-Specific Causal Networks Using Single-Cell RNA-Seq Data.

Bioinformatics (Oxford, England)·2026

Same journal

Sparse CCA-Based Mediation Analysis with High-Dimensional Exposures and Mediators.

Bioinformatics (Oxford, England)·2026

Same journal

Enhancing Cross-Context Generalization in Drug Perturbation Prediction with a Multimodal Conditional Diffusion Framework.

Bioinformatics (Oxford, England)·2026

Same journal

Primer Design through Submodular Function Estimation.

Bioinformatics (Oxford, England)·2026

Related Experiment Video

Updated: Jun 26, 2026

Hi-C: A Method to Study the Three-dimensional Architecture of Genomes.

Published on: May 7, 2010

Target space for structural genomics revisited.

Jinfeng Liu¹, Burkhard Rost

¹Department of Pharmacology, Columbia University, 630 West 168th Street, New York, NY 10032, USA.

Bioinformatics (Oxford, England)

July 16, 2002

Summary

This summary is machine-generated.

Structural genomics aims to determine protein structures comprehensively. The authors estimate that about 48% of proteins may need to be targeted, and they identify over 18,000 fragment clusters in eukaryotes as potential targets for structural studies.

More Related Videos

Comprehensive Spatial Profiling of Species-agnostic Transcriptomes via Stereo-seq

Published on: October 31, 2025

Mining Spatial Transcriptomics Datasets using DeepSpaceDB

Published on: September 5, 2025

Area of Science:

Proteomics
Structural Biology
Bioinformatics

Background:

Structural genomics seeks comprehensive protein structure determination.
Initial efforts focus on globular proteins for broad coverage.
Key questions concern how many proteins to target and how many to exclude.

Purpose of the Study:

To inform target selection for the North-East Structural Genomics Consortium (NESG).
To estimate the proportion of proteins requiring structural determination.
To identify suitable protein families for structural genomics initiatives.

Main Methods:

Analysis of existing structural information and protein databases.
Estimation of the fraction of proteins with available structural data (6-38%).
Calculation of non-globular protein regions and of the overall fraction of proteins needing targeting (48%).
Clustering protein sequence space to identify target families.
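
The clustering step listed above can be pictured with a small sketch. This is not the procedure used in the paper, which this summary does not describe; it assumes a simple single-linkage scheme in which any two fragments joined by a similarity link end up in the same cluster. The function name `cluster_by_links`, the fragment identifiers, and the links are all hypothetical.

```python
from collections import defaultdict


def cluster_by_links(ids, similar_pairs):
    """Single-linkage clustering (an assumed, illustrative scheme):
    any two items joined by a similarity link share a cluster."""
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for i in ids:
        clusters[find(i)].append(i)
    # Largest clusters first; ties broken by member names.
    return sorted(
        (sorted(members) for members in clusters.values()),
        key=lambda c: (-len(c), c),
    )


# Hypothetical toy data: five fragments and three pairwise similarity links.
fragments = ["f1", "f2", "f3", "f4", "f5"]
links = [("f1", "f2"), ("f2", "f3"), ("f4", "f5")]
clusters = cluster_by_links(fragments, links)
print(clusters)
```

In practice the links would come from a sequence comparison with a chosen similarity threshold, and the choice of threshold and linkage rule strongly affects how many clusters result.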

Main Results:

Structural information exists for 6-38% of proteins.
Approximately 48% of proteins, or 52% of residues, may need targeting.
Over 18,000 fragment clusters were identified as potential targets in eukaryotes.
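
The protein-level and residue-level percentages above are different quantities, because proteins vary in length. A minimal sketch of the arithmetic, assuming each protein is described by its length and the number of its residues that lack structural coverage, and assuming a protein counts as a target if any residue does. The function name `targeting_fractions`, that counting rule, and the toy numbers are hypothetical and are not taken from the paper.

```python
def targeting_fractions(proteins):
    """proteins: list of (length, residues_needing_targeting) tuples.

    Returns (fraction of proteins with any residue to target,
             fraction of all residues to target).
    The 'any residue' rule is an assumption made for illustration.
    """
    n_target_proteins = sum(1 for _, need in proteins if need > 0)
    total_residues = sum(length for length, _ in proteins)
    target_residues = sum(need for _, need in proteins)
    return n_target_proteins / len(proteins), target_residues / total_residues


# Hypothetical toy proteome: the two longest proteins lack structural coverage.
toy = [(100, 0), (200, 0), (300, 300), (400, 400)]
protein_fraction, residue_fraction = targeting_fractions(toy)
print(f"{protein_fraction:.0%} of proteins, {residue_fraction:.0%} of residues")
```

In this toy case half of the proteins but 70% of the residues need targeting, since the uncovered proteins happen to be the longer ones.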

Conclusions:

Structural genomics must address a significant portion of the proteome.
Clustering strategies aid in identifying tractable targets.
The findings guide efficient resource allocation in structural biology projects.