A balanced iterative random forest for gene selection from microarray data.

Ali Anaissi¹, Paul J Kennedy, Madhu Goyal

  • ¹ Centre for Quantum Computation & Intelligent Systems (QCIS), Faculty of Engineering and Information Technology (FEIT), University of Technology, Sydney (UTS), Broadway, New South Wales 2007, Australia. ali.anaissi@uts.edu.au

BMC Bioinformatics | August 29, 2013
Summary
This summary is machine-generated.

This study introduces the Balanced Iterative Random Forest (BIRF) algorithm for identifying disease biomarkers from imbalanced gene expression data. BIRF selects informative genes and outperforms the compared methods (SVM-RFE, MSVM-RFE, RF, and NB), most clearly on imbalanced datasets.

Related Experiment Videos

  • Generating the Transcriptional Regulation View of Transcriptomic Features for Prediction Task and Dark Biomarker Detection on Small Datasets (03:37). Published on: March 1, 2024
  • Selecting Multiple Biomarker Subsets with Similarly Effective Binary Classification Performances (07:35). Published on: October 11, 2018
  • A Protocol for Using Gene Set Enrichment Analysis to Identify the Appropriate Animal Model for Translational Research (09:35). Published on: August 16, 2017

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • High-throughput microarray technologies generate complex, high-dimensional gene expression datasets.
  • Imbalanced class distribution in biological data poses challenges for biomarker discovery.
  • Identifying informative genes is crucial for disease diagnosis and understanding.
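
A small worked example may help show why class imbalance is a problem. The sketch below uses a made-up cohort of 90 "common" and 10 "rare" samples (the labels and counts are assumptions for illustration, not data from the study). A classifier that always predicts the majority class scores 90% accuracy while never identifying a single rare sample, which is why per-class performance matters in this setting.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Return {label: fraction of that label's samples predicted correctly}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical imbalanced cohort: 90 majority samples, 10 minority samples.
y_true = ["common"] * 90 + ["rare"] * 10
# A degenerate classifier that always predicts the majority class.
y_pred = ["common"] * 100

overall = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
print(overall)  # 0.9
print(recall)   # {'common': 1.0, 'rare': 0.0}
```

Overall accuracy hides the failure on the rare class, so reporting recall per class gives a more honest picture on imbalanced data.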

Purpose of the Study:

  • Introduce the Balanced Iterative Random Forest (BIRF) algorithm.
  • Select relevant genes from imbalanced high-throughput gene expression microarray data.
  • Validate the selected genes as reliable biomarkers.

Main Methods:

  • Application of the BIRF algorithm on four cancer microarray datasets.
  • Comparison of BIRF performance against Support Vector Machine-Recursive Feature Elimination (SVM-RFE), Multi-class SVM-RFE (MSVM-RFE), Random Forest (RF), and Naive Bayes (NB).
  • Validation of selected informative biomarkers through repeated training experiments.
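
This summary does not describe BIRF's internals, so the following is not the authors' algorithm. It is a minimal sketch of two generic ideas the name suggests, under stated assumptions: a balanced bootstrap (drawing equally many samples per class, with the draw size set to the smallest class) and an iterative loop that re-scores genes and keeps the top fraction each round. The function names, the `keep_fraction` parameter, and the toy `mean_gap` scorer, which stands in for a random forest importance measure, are all hypothetical.

```python
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Draw, with replacement, equally many sample indices from every class.

    Assumption: the per-class draw size is the size of the smallest class.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    n_per_class = min(len(idxs) for idxs in by_class.values())
    sample = []
    for idxs in by_class.values():
        sample.extend(rng.choices(idxs, k=n_per_class))
    return sample


def mean_gap(rows, labels):
    """Toy two-class scorer: absolute gap between class means, per gene.

    A stand-in for random forest importance, used only to keep this runnable.
    """
    a, b = sorted(set(labels))
    gaps = []
    for g in range(len(rows[0])):
        va = [r[g] for r, lab in zip(rows, labels) if lab == a]
        vb = [r[g] for r, lab in zip(rows, labels) if lab == b]
        gaps.append(abs(sum(va) / len(va) - sum(vb) / len(vb)))
    return gaps


def iterative_selection(X, labels, score_genes, n_rounds, keep_fraction, rng):
    """Each round: score surviving genes on a balanced bootstrap, keep the best."""
    genes = list(range(len(X[0])))
    for _ in range(n_rounds):
        rows = balanced_bootstrap(labels, rng)
        sub_X = [[X[r][g] for g in genes] for r in rows]
        sub_y = [labels[r] for r in rows]
        ranked = sorted(zip(score_genes(sub_X, sub_y), genes), reverse=True)
        n_keep = max(1, int(len(genes) * keep_fraction))
        genes = sorted(g for _, g in ranked[:n_keep])
    return genes


# Hypothetical data: gene 0 separates the classes, genes 1-3 are constant.
labels = ["major"] * 8 + ["minor"] * 2
X = [[0.0, 0.5, 0.5, 0.5] for _ in range(8)] + [[1.0, 0.5, 0.5, 0.5] for _ in range(2)]

rng = random.Random(0)
sample = balanced_bootstrap(labels, rng)
class_counts = Counter(labels[i] for i in sample)
selected = iterative_selection(X, labels, mean_gap, n_rounds=2, keep_fraction=0.5, rng=rng)
print(class_counts)  # two draws from each class
print(selected)      # [0]
```

Because every bootstrap contains both classes in equal numbers, the minority class contributes as much to each round's scores as the majority class does.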

Main Results:

  • BIRF outperforms the compared methods (SVM-RFE, MSVM-RFE, RF, and NB), particularly on imbalanced datasets.
  • BIRF achieved 7% to 12% higher accuracy than MSVM-RFE on a childhood leukaemia dataset and improved prediction for the minority class.
  • 64% of top genes consistently appeared across validation experiments, indicating robust biomarker selection.
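
The summary does not say exactly how "consistently appeared" was measured. One plausible reading, sketched below as an assumption, is the fraction of genes in a reference top-gene list that also show up in the top list of every repeated run. The gene identifiers and list contents are invented placeholders, not results from the study.

```python
def consistent_fraction(reference, repeats):
    """Return (fraction, genes) for reference genes present in every repeat."""
    repeat_sets = [set(run) for run in repeats]
    consistent = [g for g in reference if all(g in s for s in repeat_sets)]
    return len(consistent) / len(reference), consistent


# Hypothetical top-5 gene lists: one reference run and three repeated runs.
reference = ["g1", "g2", "g3", "g4", "g5"]
repeats = [
    ["g1", "g2", "g3", "g9", "g5"],
    ["g2", "g1", "g7", "g3", "g8"],
    ["g3", "g1", "g2", "g5", "g6"],
]

fraction, stable_genes = consistent_fraction(reference, repeats)
print(fraction)      # 0.6
print(stable_genes)  # ['g1', 'g2', 'g3']
```

A looser criterion, such as appearing in most rather than all repeats, would give a higher figure, so the threshold should be stated alongside any reported percentage.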

Conclusions:

  • The BIRF algorithm is effective for gene selection from imbalanced high-throughput gene expression data.
  • BIRF demonstrates superior performance compared to existing methods, especially in handling class imbalance.
  • BIRF facilitates distinguishing truly predictive genes from those that appear predictive by chance.