Feature selection with the R package MXM.

Michail Tsagris (1,2,3), Ioannis Tsamardinos (2,4,5)

  • (1) Department of Economics, University of Crete, Rethymnon, 74100, Greece.

F1000Research | October 30, 2019
Summary
This summary is machine-generated.

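The R package MXM provides feature selection algorithms for diverse data types and large datasets. Its advantages over other R feature selection packages include support for a wide array of target variable types, detection of statistically equivalent feature sets, and memory-efficient handling of large-volume data.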

Keywords:
Feature selection, R package, algorithms, computational efficiency

Area of Science:

  • Computational statistics
  • Machine learning
  • Data science

Background:

  • Feature selection is crucial for identifying optimal predictors.
  • Existing R packages for feature selection have limitations in algorithm variety and data handling.
  • The MXM package aims to address these limitations.
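To make the idea of selecting predictors concrete, here is a minimal, language-neutral sketch in Python of a generic greedy forward selection. It is not MXM's algorithm or API: the function names (`pearson`, `forward_select`), the correlation threshold, and the toy data are all assumptions made for illustration. At each step it picks the remaining feature most correlated with the current residual, removes that feature's linear contribution, and stops once no remaining feature is correlated strongly enough.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def forward_select(features, target, threshold=0.3):
    """Greedy forward selection on residuals (hypothetical helper).

    features: list of columns (each a list of numbers)
    target:   list of numbers
    Returns the indices of the selected columns, in selection order.
    """
    n = len(target)
    mean_t = sum(target) / n
    residual = [t - mean_t for t in target]  # centered target
    selected = []
    remaining = list(range(len(features)))
    while remaining:
        # Candidate with the strongest absolute correlation to the residual.
        best = max(remaining, key=lambda j: abs(pearson(features[j], residual)))
        if abs(pearson(features[best], residual)) <= threshold:
            break  # nothing left that explains the residual well enough
        x = features[best]
        mx = sum(x) / n
        sxx = sum((a - mx) ** 2 for a in x)
        # Univariate least-squares slope of the residual on the chosen feature.
        slope = sum((a - mx) * r for a, r in zip(x, residual)) / sxx
        residual = [r - slope * (a - mx) for a, r in zip(x, residual)]
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data (made up): four mutually orthogonal +/-1 columns.
features = [
    [1, -1, 1, -1, 1, -1, 1, -1],   # x0: relevant
    [1, 1, -1, -1, 1, 1, -1, -1],   # x1: irrelevant
    [1, -1, -1, 1, 1, -1, -1, 1],   # x2: relevant
    [1, 1, 1, 1, -1, -1, -1, -1],   # x3: irrelevant
]
# Unexplained variation, orthogonal to every feature column.
noise = [1, -1, -1, 1, -1, 1, 1, -1]
target = [2 * a - 3 * c + 0.5 * e
          for a, c, e in zip(features[0], features[2], noise)]

print(forward_select(features, target))  # [2, 0]
```

The toy columns are orthogonal by construction, so the residual update is exact and the two relevant features are recovered in order of effect size. Real data would call for a proper statistical test in place of the fixed threshold, and for a method that accounts for correlated predictors.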

Purpose of the Study:

  • To introduce and evaluate the R package MXM for feature selection.
  • To compare MXM's capabilities with existing feature selection packages.
  • To demonstrate MXM's utility with real-world high-dimensional data.

Main Methods:

  • Qualitative comparison of MXM with other R feature selection packages.
  • Demonstration of MXM's algorithms using diverse, high-dimensional datasets.
  • Use of memory-efficient algorithms for handling large-volume data in R.

Main Results:

  • MXM supports a wide array of target variable types (continuous, survival, categorical, etc.).
  • It integrates various regression models for different data types.
  • MXM includes algorithms for detecting statistically equivalent feature sets and handling big data.

Conclusions:

  • MXM offers a versatile and powerful feature selection solution.
  • Its unique features provide advantages for complex and large-scale data analysis.
  • The package enhances predictive modeling capabilities in R.