Applying test case prioritization to software microbenchmarks.

Christoph Laaber¹, Harald C Gall¹, Philipp Leitner²

  • ¹ Department of Informatics, University of Zurich, Zurich, Switzerland.

Empirical Software Engineering | November 15, 2021
Summary
This summary is machine-generated.

Test case prioritization (TCP) techniques can reorder software microbenchmarks so that performance regressions are detected earlier in a suite run. The total greedy strategy and dynamic-coverage methods were the most effective, making TCP a viable option for performance regression testing with manageable overhead.

Keywords:
JMH; performance testing; regression testing; software microbenchmarking; test case prioritization

Area of Science:

  • Software Engineering
  • Software Testing
  • Performance Analysis

Background:

  • Regression testing is crucial for software evolution, but performance regression testing, especially for microbenchmarks, is under-researched.
  • Microbenchmark suites are time-consuming to execute, making efficient fault detection critical.

Purpose of the Study:

  • To empirically investigate the effectiveness and efficiency of coverage-based test case prioritization (TCP) techniques for software microbenchmarks.
  • To compare different TCP strategies (total vs. additional greedy) and coverage types (static vs. dynamic).

Main Methods:

  • Empirical study of 54 unique coverage-based TCP technique instantiations.
  • Application of total and additional greedy strategies across multiple parameterization dimensions.
  • Evaluation using average percentage of fault-detection on performance (APFD-P) and analysis of runtime overhead.
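The difference between the total and additional greedy strategies can be illustrated with a small sketch. This is not the study's implementation: the benchmark names and coverage sets are invented, ties are broken by name to keep the output deterministic, and the additional strategy here does not reset its covered set once everything is covered (all of these are assumptions).

```python
from typing import Dict, List, Set


def total_greedy(coverage: Dict[str, Set[str]]) -> List[str]:
    """Order benchmarks by how many units each covers in total, largest first."""
    return sorted(coverage, key=lambda b: (-len(coverage[b]), b))


def additional_greedy(coverage: Dict[str, Set[str]]) -> List[str]:
    """Repeatedly pick the benchmark covering the most not-yet-covered units."""
    remaining = dict(coverage)
    covered: Set[str] = set()
    order: List[str] = []
    while remaining:
        # Most new units first; ties fall back to the benchmark name.
        best = min(remaining, key=lambda b: (-len(remaining[b] - covered), b))
        order.append(best)
        covered |= remaining.pop(best)
    return order


# Hypothetical coverage data: benchmark -> methods it reaches.
coverage = {
    "benchA": {"m1", "m2", "m3", "m4"},
    "benchB": {"m1", "m2", "m3"},
    "benchC": {"m5", "m6"},
    "benchD": {"m4"},
}

total_order = total_greedy(coverage)
additional_order = additional_greedy(coverage)
print(total_order)       # ['benchA', 'benchB', 'benchC', 'benchD']
print(additional_order)  # ['benchA', 'benchC', 'benchB', 'benchD']
```

In this toy input, the additional strategy promotes benchC because it reaches methods that benchA has not already covered, whereas the total strategy ranks purely by coverage size.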

Main Results:

  • TCP techniques achieved a mean APFD-P between 0.54 and 0.71.
  • Finding the three largest performance regressions required executing between 29% and 66% of the microbenchmark suite.
  • The most effective TCP technique incurred an 11% runtime overhead.
  • The total strategy outperformed the additional strategy, and dynamic-coverage was generally preferred over static-coverage.
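To make these effectiveness measures concrete, the sketch below computes the classic APFD metric and the share of a suite that must run before the k largest changes have been found. The exact definition of APFD-P is not given in this summary, so the classic fault-based formula stands in for it; the benchmark names, faults, and change sizes are all invented for illustration.

```python
from typing import Dict, List


def apfd(order: List[str], faults: Dict[str, str]) -> float:
    """Classic APFD: 1 - sum(TF_i) / (n * m) + 1 / (2 * n).

    `faults` maps each fault to the single benchmark that reveals it
    (a simplification made for this sketch).
    """
    n, m = len(order), len(faults)
    position = {bench: i + 1 for i, bench in enumerate(order)}
    first_detection_sum = sum(position[b] for b in faults.values())
    return 1 - first_detection_sum / (n * m) + 1 / (2 * n)


def top_k_fraction(order: List[str], change_size: Dict[str, float], k: int = 3) -> float:
    """Fraction of the ordered suite executed once the k largest changes are all found."""
    top = set(sorted(change_size, key=change_size.get, reverse=True)[:k])
    last_position = max(i + 1 for i, bench in enumerate(order) if bench in top)
    return last_position / len(order)


# Hypothetical prioritized order and measurements.
order = ["b1", "b2", "b3", "b4", "b5"]
faults = {"f1": "b1", "f2": "b3"}
change_size = {"b1": 0.30, "b2": 0.01, "b3": 0.12, "b4": 0.25, "b5": 0.02}

apfd_value = apfd(order, faults)
top3 = top_k_fraction(order, change_size, k=3)
print(round(apfd_value, 2))
print(top3)  # 0.8: the third-largest change is found at position 4 of 5
```

Higher APFD values and lower top-k fractions both indicate that an ordering surfaces important changes earlier.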

Conclusions:

  • Test case prioritization is a viable technique for performance regression testing of microbenchmarks.
  • Dynamic-coverage TCP techniques are recommended when analysis time permits, while static-coverage offers an alternative for time-constrained scenarios.
  • The total greedy strategy is superior for performance regression detection in microbenchmarks.