Related Concept Videos

Expected Frequencies in Goodness-of-Fit Tests

A goodness-of-fit test determines whether the observed frequencies are statistically consistent with the frequencies expected for the dataset. Suppose the expected frequencies are equal across categories, as when predicting how often each number appears when rolling a fair die. In that case, the expected frequency of each category is the ratio of the total number of observations (n) to the number of categories (k).
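As a minimal sketch of the equal-frequency case, the snippet below computes the expected frequency n/k for a set of die-roll counts and then the Pearson chi-square statistic that is commonly used to compare observed and expected frequencies. The counts and function names are hypothetical, chosen only for illustration, and the chi-square statistic is a standard choice that the passage above does not name.

```python
def expected_equal_frequencies(observed):
    """Return the expected count n / k for each of k equally likely categories."""
    n = sum(observed)   # total number of observations
    k = len(observed)   # number of categories
    return [n / k] * k


def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for faces 1..6 over 60 rolls (illustrative data only).
observed_rolls = [8, 12, 9, 11, 10, 10]
expected_rolls = expected_equal_frequencies(observed_rolls)
statistic = chi_square_statistic(observed_rolls, expected_rolls)

print(expected_rolls)  # each entry is 60 / 6
print(statistic)
```

A small statistic suggests the observed counts are close to the expected ones; judging significance would additionally require the chi-square distribution with k - 1 degrees of freedom.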

Relative Frequency Distribution

A relative frequency distribution reports, for each value or class, the proportion or fraction of times it occurs in a data set. To find the relative frequencies, divide each frequency by the total number of data points in the sample. It is very similar to a regular frequency distribution, except that instead of reporting how many data values fall in a class, it reports the fraction of data values that fall in that class. These fractions or proportions are called relative frequencies...
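The calculation described above can be sketched in a few lines. The sample values and the function name are hypothetical; the only step taken from the text is dividing each frequency by the total number of data points.

```python
from collections import Counter


def relative_frequencies(values):
    """Map each distinct value to its share of the sample."""
    counts = Counter(values)   # ordinary frequency distribution
    total = len(values)        # total number of data points
    return {value: count / total for value, count in counts.items()}


# Hypothetical sample of category labels (illustrative data only).
sample = ["A", "B", "A", "C", "A", "B", "A", "C"]
rel = relative_frequencies(sample)
print(rel)
```

The relative frequencies always sum to 1, which makes a quick sanity check for any table built this way.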

Determination of Expected Frequency

Suppose one wants to test independence between the two variables of a contingency table. The values in the table constitute the observed frequencies of the dataset. But how does one determine the expected frequency of the dataset? One of the important assumptions is that the two variables are independent, which means the variables do not influence each other. For independent variables, the statistical probability of any event involving both variables is calculated by multiplying the individual...
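As a hedged illustration of the independence assumption, the sketch below uses the standard formula for a contingency table, in which the expected count of a cell is its row total times its column total divided by the grand total. The table values and the function name are hypothetical, and the formula is the textbook one rather than a quotation from the truncated passage above.

```python
def expected_frequencies(table):
    """Expected cell counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    # Under independence, P(row and column) = P(row) * P(column), so the
    # expected count is row_total * column_total / grand_total.
    return [[r * c / grand_total for c in column_totals] for r in row_totals]


# Hypothetical 2x2 table of observed frequencies (illustrative data only).
observed_table = [[20, 30],
                  [30, 20]]
expected_table = expected_frequencies(observed_table)
print(expected_table)
```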

Construction of Frequency Distribution

A frequency distribution table can be constructed using the steps given below.
First, make a table with two columns: one titled with the data that needs to be organized, and the other for frequency. (Add a third column for tally marks if needed.) Then, look at the items in the data set and decide whether an ungrouped or a grouped frequency distribution table would be more suitable. If there are large sets of different values, then it is...
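The steps above can be sketched for both table types. This is an illustration under stated assumptions: the data are hypothetical non-negative integers, the class width of 10 is an arbitrary choice, and the function names are invented for the example.

```python
from collections import Counter


def ungrouped_frequency_table(data):
    """One row per distinct value: (value, frequency), sorted by value."""
    return sorted(Counter(data).items())


def grouped_frequency_table(data, class_width):
    """One row per class: ((lower, upper), frequency), assuming integer data."""
    # Each value is assigned to the class that starts at a multiple of class_width.
    counts = Counter((value // class_width) * class_width for value in data)
    return [((start, start + class_width - 1), counts[start])
            for start in sorted(counts)]


# Hypothetical scores (illustrative data only).
scores = [3, 7, 7, 12, 15, 15, 15, 21, 24, 28]
ungrouped = ungrouped_frequency_table(scores)
grouped = grouped_frequency_table(scores, class_width=10)
print(ungrouped)
print(grouped)
```

With few distinct values the ungrouped table is readable as is; with many distinct values, grouping into classes keeps the table short.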

Hardy-Weinberg Principle

Diploid organisms have two alleles of each gene, one from each parent, in their somatic cells. Therefore, each individual contributes two alleles to the gene pool of the population. The gene pool of a population is the sum of every allele of all genes within that population and has some degree of variation. Genetic variation is typically expressed as a relative frequency, which is the percentage of the total population that has a given allele, genotype or phenotype.
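To make the idea of allele relative frequency concrete, the sketch below counts alleles from genotype counts for a single gene with two alleles, A and a. The genotype counts and function names are hypothetical. The second function applies the standard Hardy-Weinberg proportions (p², 2pq, q²), which the passage above does not state explicitly.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Relative frequencies (p, q) of alleles A and a from genotype counts."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)   # each diploid individual carries two alleles
    p = (2 * n_AA + n_Aa) / total_alleles      # frequency of allele A
    q = (2 * n_aa + n_Aa) / total_alleles      # frequency of allele a
    return p, q


def hardy_weinberg_genotype_frequencies(p, q):
    """Expected genotype frequencies (AA, Aa, aa) under Hardy-Weinberg equilibrium."""
    return p * p, 2 * p * q, q * q


# Hypothetical genotype counts in a population of 100 (illustrative data only).
p, q = allele_frequencies(n_AA=36, n_Aa=48, n_aa=16)
expected_genotypes = hardy_weinberg_genotype_frequencies(p, q)
print(p, q)
print(expected_genotypes)
```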

F Distribution

The F distribution was named after Sir Ronald Fisher, an English statistician. The F statistic is a ratio (a fraction) with two sets of degrees of freedom: one for the numerator and one for the denominator. The F distribution is closely related to the Student's t distribution: when the numerator has one degree of freedom, the values of the F distribution are the squares of the corresponding values of the t distribution. One-way ANOVA extends the t test to comparisons of more than two groups. The scope of that derivation is beyond the level of this...

Related Experiment Video

Updated: Apr 11, 2026

Heterogeneity Mapping of Protein Expression in Tumors using Quantitative Immunofluorescence

Published on: October 25, 2011

Estimating diversity via frequency ratios.

Amy Willis¹, John Bunge¹

¹Department of Statistical Science, Cornell University, Ithaca, New York, U.S.A.

June 4, 2015

Summary

This summary is machine-generated.

This study introduces a new nonlinear regression model for estimating total species diversity from sample counts, particularly for high-diversity populations. The method gives accurate diversity estimates and outperforms existing approaches on a microbial ecology dataset.

Keywords:

Alpha diversity; Biodiversity; Capture-recapture; Characterization of distributions; Microbial ecology; Species richness

More Related Videos

Heuristic Mining of Hierarchical Genotypes and Accessory Genome Loci in Bacterial Populations

Published on: December 7, 2021

Using the Race Model Inequality to Quantify Behavioral Multisensory Integration Effects

Published on: May 10, 2019

Area of Science:

Ecology
Statistics
Computational Biology

Background:

Estimating total species diversity from limited sample data is challenging, especially with high latent diversity.
Classical methods often rely on mixed Poisson models, which may not capture complex diversity patterns.

Purpose of the Study:

To develop a novel statistical approach for estimating total population diversity from sample counts.
To address limitations of existing models, particularly for datasets with high species richness.

Main Methods:

Constructed a nonlinear regression model based on ratios of consecutive frequency counts.
Utilized probability theory for distributions on integers.
Applied the model to analyze high diversity datasets, including those from next-generation sequencing in microbial ecology.
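The inputs to such a model can be sketched as follows. This is an illustration only: it assumes the usual definition of a frequency count, f_j, as the number of species observed exactly j times, and it computes the ratios f_(j+1)/f_j of consecutive counts. It does not implement the study's nonlinear regression or its diversity estimator, whose form is not given here. The abundance data and function names are hypothetical.

```python
from collections import Counter


def frequency_counts(abundances):
    """Map j to f_j, the number of species observed exactly j times."""
    return Counter(a for a in abundances if a > 0)


def consecutive_ratios(freq_counts):
    """Map j to f_(j+1) / f_j wherever both counts are present."""
    ratios = {}
    for j in sorted(freq_counts):
        if j + 1 in freq_counts:   # skip gaps where f_(j+1) was not observed
            ratios[j] = freq_counts[j + 1] / freq_counts[j]
    return ratios


# Hypothetical per-species abundances in one sample (illustrative data only).
abundances = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 7]
f = frequency_counts(abundances)
ratios = consecutive_ratios(f)
observed_species = sum(f.values())

print(dict(f))
print(ratios)
print(observed_species)
```

The unobserved count f_0, the number of species present but never sampled, is the quantity a diversity estimator has to infer; total diversity is then the observed species count plus the estimate of f_0.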

Main Results:

The proposed method provides accurate estimates of total diversity.
The model demonstrates good data fits and reasonable standard errors.
The method outperformed competing methods on a specific microbial ecology dataset.

Conclusions:

This nonlinear regression approach offers a new, geometrically intuitive method for diversity estimation.
The method is well-suited for analyzing complex, high-diversity ecological datasets.
The approach departs from the traditional mixed Poisson models used in diversity estimation.