Target space for structural genomics revisited.

Jinfeng Liu¹, Burkhard Rost

  • ¹Department of Pharmacology, Columbia University, 630 West 168th Street, New York, NY 10032, USA.

Bioinformatics (Oxford, England) | July 16, 2002
Summary
This summary is machine-generated.

Structural genomics aims at comprehensive protein structure determination. The authors estimate that about 48% of proteins (52% of residues) may need to be targeted, and they identify over 18,000 fragment clusters in eukaryotes as potential targets for structural studies.

Area of Science:

  • Proteomics
  • Structural Biology
  • Bioinformatics

Background:

  • Structural genomics seeks comprehensive protein structure determination.
  • Initial efforts focus on globular proteins for broad coverage.
  • Key questions are how many proteins need to be targeted and which can be excluded.

Purpose of the Study:

  • To inform target selection for the North-East Structural Genomics Consortium (NESG).
  • To estimate the proportion of proteins requiring structural determination.
  • To identify suitable protein families for structural genomics initiatives.

Main Methods:

  • Analysis of existing structural information and protein databases.
  • Estimation of proteins with available structural data (6-38%).
  • Calculation of non-globular protein regions and overall targeting needs (48%).
  • Clustering protein sequence space to identify target families.
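The clustering step can be pictured with a minimal sketch. This is not the authors' procedure: the ungapped identity measure, the 0.8 threshold, the single-linkage rule, and all names and sequences below are hypothetical choices made only for illustration. The idea shown is to group sequences whose pairwise similarity passes a threshold, then keep the clusters that contain no member with known structure.

```python
from __future__ import annotations
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Toy similarity: fraction of matching positions over the shorter sequence (no gaps)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def cluster(seqs: dict[str, str], threshold: float = 0.8) -> list[set[str]]:
    """Single-linkage clustering: merge any pair at or above the (assumed) threshold."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


def candidate_targets(clusters: list[set[str]], solved: set[str]) -> list[set[str]]:
    """Keep clusters in which no member already has a known structure."""
    return [c for c in clusters if not (c & solved)]


# Hypothetical toy data, not taken from the study.
toy_seqs = {
    "p1": "MKTAYIAKQR",
    "p2": "MKTAYIAKQK",
    "p3": "GSHMLEDPVA",
    "p4": "GSHMLEDPVV",
    "p5": "AAAAWWWWCC",
}
toy_solved = {"p1"}  # pretend p1 has a solved structure

if __name__ == "__main__":
    clusters = cluster(toy_seqs)
    print(candidate_targets(clusters, toy_solved))
```

With real data, the similarity would come from sequence alignment rather than position-by-position matching, and the threshold would be chosen so that cluster members can plausibly be modelled from one solved structure.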

Main Results:

  • Structural information exists for 6-38% of proteins.
  • Approximately 48% of proteins, or 52% of residues, may need targeting.
  • Over 18,000 fragment clusters identified as potential targets in eukaryotes.
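The per-protein and per-residue figures above differ because they count different units. A small sketch, using hypothetical lengths and flags rather than any data from the study, shows one way the two fractions can diverge: when the proteins flagged for targeting are longer than average, the residue fraction exceeds the protein fraction.

```python
from __future__ import annotations


def targeting_fractions(proteins: list[tuple[str, int, bool]]) -> tuple[float, float]:
    """Return (fraction of proteins targeted, fraction of residues targeted).

    Each entry is (name, length in residues, needs_targeting). Treating a
    protein as targeted or not as a whole is a simplifying assumption.
    """
    n_targeted = sum(1 for _, _, flag in proteins if flag)
    residues_targeted = sum(length for _, length, flag in proteins if flag)
    residues_total = sum(length for _, length, _ in proteins)
    return n_targeted / len(proteins), residues_targeted / residues_total


# Hypothetical toy proteome: the two targeted proteins are longer than average.
toy_proteome = [
    ("a", 100, False),
    ("b", 300, True),
    ("c", 300, True),
    ("d", 300, False),
]

if __name__ == "__main__":
    by_protein, by_residue = targeting_fractions(toy_proteome)
    print(f"proteins: {by_protein:.0%}, residues: {by_residue:.0%}")
```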

Conclusions:

  • Structural genomics must address a significant portion of the proteome.
  • Clustering strategies aid in identifying tractable targets.
  • The findings guide efficient resource allocation in structural biology projects.