


Updated: Nov 27, 2025

Improved Parsimonious Topic Modeling Based on the Bayesian Information Criterion.

Hang Wang¹, David Miller¹

  • ¹Electrical Engineering and Computer Science Department, The Pennsylvania State University, State College, PA 16802, USA.

Entropy (Basel, Switzerland) | December 8, 2020
Summary
This summary is machine-generated.

The enhanced Parsimonious Topic Model (PTM) improves text analysis by increasing model sparsity and optimizing parameter learning. It outperforms the original PTM on multiple performance measures and yields a sparser topic representation.

Keywords:
Bayesian information criterion; expectation maximization algorithm; medical abstracts; topic model


Area of Science:

  • Natural Language Processing
  • Machine Learning
  • Computational Linguistics

Background:

  • Previous work introduced the Parsimonious Topic Model (PTM) for text corpora, which, unlike Latent Dirichlet Allocation (LDA), gives a sparse topic representation.
  • PTM determines salient words per topic and relevant topics per document, using a Bayesian Information Criterion (BIC) for unsupervised model selection.
  • The original PTM determined topic-specific words, document-specific topics, parameters, and topic count in an unsupervised manner.
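
As a rough illustration of BIC-based model selection, the sketch below uses the textbook criterion, BIC = k ln(n) - 2 ln(L), where lower is better. It is not the customized BIC used by PTM; the function names, candidate models, and numbers are hypothetical.

```python
import math


def bic(log_likelihood: float, num_params: int, num_samples: int) -> float:
    """Textbook BIC: k * ln(n) - 2 * ln(L). Lower values are preferred."""
    return num_params * math.log(num_samples) - 2.0 * log_likelihood


def select_model(candidates: dict, num_samples: int) -> str:
    """Return the name of the candidate with the lowest BIC.

    candidates maps a model name to (log_likelihood, num_params).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], num_samples))


# Hypothetical candidates: the larger model fits slightly better
# but pays a much larger complexity penalty.
CANDIDATES = {
    "small": (-1050.0, 10),
    "large": (-1040.0, 50),
}

best = select_model(CANDIDATES, num_samples=1000)
```

With these made-up numbers the smaller model is selected, because the gain in log-likelihood does not offset the penalty for 40 extra parameters.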

Purpose of the Study:

  • To propose and evaluate extensions to the Parsimonious Topic Model (PTM) for improved text corpus analysis.
  • To enhance model sparsity and optimize the parameter learning algorithm for more robust topic modeling.
  • To achieve superior performance and sparser topic representations compared to the original PTM.

Main Methods:

  • Modified the BIC objective function using a lossless coding scheme to penalize non-salient words, increasing model sparsity.
  • Developed a new parameter learning strategy that jointly optimizes a word's switches across topics, rather than updating them one at a time, to reduce the risk of poor local optima.
  • Applied the extended PTM to several document datasets for performance evaluation.
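
The difference between sequential and joint switch updates can be sketched on a toy problem. This is a minimal illustration, not the authors' algorithm: the cost table is hypothetical, and exhaustive enumeration stands in for whatever joint optimization the paper actually uses.

```python
from itertools import product


def sequential_update(cost, start):
    """Greedy coordinate descent: flip one switch at a time while it lowers cost."""
    current = tuple(start)
    improved = True
    while improved:
        improved = False
        for i in range(len(current)):
            flipped = current[:i] + (1 - current[i],) + current[i + 1:]
            if cost(flipped) < cost(current):
                current, improved = flipped, True
    return current


def joint_update(cost, num_topics):
    """Score every on/off pattern of one word's switches and keep the cheapest."""
    return min(product((0, 1), repeat=num_topics), key=cost)


# Hypothetical cost of one word's switches over two topics.
# Turning on either switch alone raises the cost; turning on both lowers it.
TOY_COST = {(0, 0): 5.0, (1, 0): 6.0, (0, 1): 6.0, (1, 1): 3.0}


def toy_cost(switches):
    return TOY_COST[tuple(switches)]


stuck = sequential_update(toy_cost, (0, 0))  # stays at the starting pattern
joint = joint_update(toy_cost, 2)            # finds the lower-cost pattern
```

Here the one-at-a-time update never leaves its starting point because every single flip looks worse, while the joint search considers both switches together and reaches the lower-cost configuration.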

Main Results:

  • The proposed PTM extensions resulted in increased model sparsity and enabled the selection of more topics at a lower BIC cost.
  • The joint optimization of word switches was less susceptible to poor local minima than sequential updates.
  • The enhanced PTM outperformed the original PTM across multiple performance measures on document datasets.

Conclusions:

  • The developed modeling and algorithmic extensions significantly improve Parsimonious Topic Model (PTM) performance and sparsity.
  • The new methods provide a more robust and effective approach to unsupervised topic modeling in text corpora.
  • The enhanced PTM offers a sparser and more accurate representation of topics within documents.