



Genes sharing the protein family domain decrease the performance of classification with RNA-seq genomic signatures.

Anna Leśniewska¹, Joanna Zyprych-Walczak², Alicja Szabelska-Beręsewicz²

  • ¹ Department of Computer Science, Poznan University of Technology, Piotrowo 2, Poznan, 60-965, Poland.

Biology Direct | February 23, 2018
Summary
This summary is machine-generated.

Machine learning classification results vary with the type of analysis and the genes selected. Genes that share protein domains tend to be more strongly correlated with one another, which can reduce classifier accuracy in neuroblastoma studies.

Keywords:
Biomarkers; Data Analysis; Genomic signatures; Machine Learning; Protein domains; RNA sequencing; Statistics


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Machine Learning

Background:

  • Classification analyses on neuroblastoma datasets reveal result variability.
  • Gene selection and analysis type significantly impact machine learning outcomes.
  • Factors influencing downstream analysis include the type of primary analysis, the choice of classifier, and correlation among genes that share protein domains.

Purpose of the Study:

  • To identify factors affecting machine learning analysis in neuroblastoma classification.
  • To investigate the influence of shared protein domains among genes on classification performance.
  • To compile a gene-domain database for comparative analysis.

Main Methods:

  • Performed various classification analyses on the CAMDA neuroblastoma dataset.
  • Compiled a gene-domain database to differentiate genes based on shared domains.
  • Analyzed the impact of genes sharing protein domains versus other genes on classification.
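
The split between genes that share a protein domain and all other genes can be illustrated with a minimal Python sketch. It assumes the gene-domain database is available as a mapping from gene to a set of domain identifiers. The gene and domain names below are hypothetical placeholders, and the function is an illustration, not the authors' implementation.

```python
from collections import defaultdict


def split_by_shared_domain(gene_domains):
    """Partition genes into those that share at least one domain
    with another gene and those that do not."""
    # Invert the mapping: domain -> set of genes annotated with it.
    domain_to_genes = defaultdict(set)
    for gene, domains in gene_domains.items():
        for domain in domains:
            domain_to_genes[domain].add(gene)

    # A gene is "shared-domain" if any of its domains occurs in another gene.
    shared = set()
    for genes in domain_to_genes.values():
        if len(genes) > 1:
            shared |= genes

    others = set(gene_domains) - shared
    return shared, others


# Hypothetical toy gene-domain mapping (not taken from the study).
toy_gene_domains = {
    "GENE_A": {"DOM_1"},
    "GENE_B": {"DOM_1", "DOM_2"},
    "GENE_C": {"DOM_3"},
    "GENE_D": set(),
}

shared_genes, other_genes = split_by_shared_domain(toy_gene_domains)
print(sorted(shared_genes), sorted(other_genes))
```

In this toy mapping only GENE_A and GENE_B have a domain in common, so they form the shared-domain group and the remaining genes form the comparison group.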

Main Results:

  • Genes sharing a domain exhibit higher correlation coefficients.
  • Increased correlation among shared-domain genes leads to lower predictive power and higher misclassification rates.
  • Classifier performance generally worsens when using genes sharing domains in the training set.
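
One way to check the first result on one's own data is to compare the average pairwise correlation within each gene group. The sketch below assumes Pearson correlation on per-sample expression values and uses the mean absolute coefficient as the summary. The summary does not state which coefficient or summary statistic the study used, so both are assumptions, and the expression values are made-up toy numbers.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant expression")
    return cov / (sd_x * sd_y)


def mean_abs_correlation(expression, genes):
    """Mean absolute Pearson correlation over all gene pairs in `genes`."""
    pairs = list(combinations(sorted(genes), 2))
    if not pairs:
        raise ValueError("need at least two genes")
    total = sum(abs(pearson(expression[g], expression[h])) for g, h in pairs)
    return total / len(pairs)


# Hypothetical expression values per gene across four samples.
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [1.0, -1.0, -1.0, 1.0],
    "GENE_D": [0.0, 1.0, 2.0, 3.0],
}

shared_corr = mean_abs_correlation(toy_expression, {"GENE_A", "GENE_B"})
other_corr = mean_abs_correlation(toy_expression, {"GENE_C", "GENE_D"})
print(shared_corr > other_corr)
```

The toy values are constructed so that the shared-domain pair is perfectly correlated and the other pair is uncorrelated. Real data would show a much less extreme contrast.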

Conclusions:

  • The observed effect of shared domains likely stems from biological co-expression rather than artifacts.
  • Shared-domain genes can negatively impact RNA sequencing analysis and biomarker utility.
  • Gene signature biomarker sets should be depleted of shared-domain genes for improved classification performance.
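
A minimal sketch of how a signature might be depleted of shared-domain genes follows. It assumes a ranked gene list and a gene-to-domain mapping, and it keeps the highest-ranked gene per domain while dropping later genes that share a domain with one already kept. Whether to keep one representative or to remove every shared-domain gene is not specified in the summary, so this rule, like the gene and domain names, is an assumption.

```python
def deplete_shared_domain_genes(signature, gene_domains):
    """Return the signature, in ranked order, without genes that share
    a domain with a gene kept earlier in the list."""
    kept = []
    seen_domains = set()
    for gene in signature:
        domains = gene_domains.get(gene, set())
        if domains & seen_domains:
            # Shares a domain with a higher-ranked gene: drop it.
            continue
        kept.append(gene)
        seen_domains |= domains
    return kept


# Hypothetical ranked signature and gene-domain mapping.
toy_signature = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
toy_domains = {
    "GENE_A": {"DOM_1"},
    "GENE_B": {"DOM_1", "DOM_2"},
    "GENE_C": {"DOM_3"},
    "GENE_D": set(),
}

depleted = deplete_shared_domain_genes(toy_signature, toy_domains)
print(depleted)
```

Genes without any domain annotation are always kept under this rule, since they cannot overlap with a domain seen earlier.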