Static test flakiness prediction: How Far Can We Go?

Valeria Pontillo¹, Fabio Palomba¹, Filomena Ferrucci¹

¹Software Engineering (SeSa) Lab - Department of Computer Science, University of Salerno, Fisciano, Italy.

Empirical Software Engineering

October 6, 2022

Summary

This summary is machine-generated.

This study predicts test flakiness using only static metrics, achieving performance comparable to existing methods. Production code characteristics can influence the accuracy of flaky test prediction models.

Keywords:

Flaky tests; Machine learning; Software testing

Area of Science:

Software Engineering
Machine Learning
Software Testing

Background:

Test flakiness, where tests non-deterministically pass or fail, is a significant challenge in software development.
Existing detection methods often rely on computationally expensive dynamic analysis, limiting scalability.
Machine learning has been explored for predicting test flakiness using mixed static and dynamic metrics.
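
To make the cost of dynamic analysis concrete, here is a minimal sketch of rerun-based detection, assuming a test is any callable that raises `AssertionError` on failure. The function name `rerun_detect` and the counter-driven test are hypothetical stand-ins (the counter simulates non-determinism so the example is reproducible); they are not taken from the study.

```python
def rerun_detect(test, runs=20):
    """Flag a test as flaky if repeated executions disagree.

    The cost grows linearly with `runs`, which is why rerun-based
    detection becomes expensive on large test suites.
    """
    outcomes = set()
    for _ in range(runs):
        try:
            test()
            outcomes.add("pass")
        except AssertionError:
            outcomes.add("fail")
    return len(outcomes) > 1


# Hypothetical tests. The shared counter stands in for hidden state
# (timing, ordering, network) that makes real tests non-deterministic.
calls = {"n": 0}


def sometimes_fails():
    calls["n"] += 1
    assert calls["n"] % 3 != 0  # fails on every third execution


def always_passes():
    assert 1 + 1 == 2


flaky_flag = rerun_detect(sometimes_fails)
stable_flag = rerun_detect(always_passes)
print(flaky_flag, stable_flag)
```

A static approach avoids these repeated executions entirely, at the price of relying only on what can be read from the source code.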

Purpose of the Study:

To investigate the prediction of test flakiness using exclusively static metrics.
To assess the performance of a static-only approach against state-of-the-art methods.
To analyze the impact of production code characteristics on flaky test prediction.

Main Methods:

A large-scale experiment was conducted on 70 Java projects from the iDFlakies and FlakeFlagger datasets.
A statistical analysis of 25 code metrics and smells was performed to differentiate flaky from non-flaky tests.
A machine learning model using only static features was developed and evaluated for predicting test flakiness.
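
As a rough illustration of what "static features" can mean, the sketch below derives a few metrics from the text of a Java test without running it. The three metrics and the regular expressions are assumptions chosen for brevity; the summary does not list the 25 metrics and smells used in the study, and a real extractor would parse the code rather than pattern-match it.

```python
import re

# Illustrative patterns only; not the study's metric definitions.
ASSERT_CALL = re.compile(r"\bassert\w*\s*\(")
SLEEP_CALL = re.compile(r"\bThread\.sleep\s*\(")


def static_metrics(test_source: str) -> dict:
    """Compute a few illustrative static metrics from Java test source text."""
    # Count non-blank lines that are not pure line comments.
    code_lines = [
        line for line in test_source.splitlines()
        if line.strip() and not line.strip().startswith("//")
    ]
    return {
        "loc": len(code_lines),
        "assertions": len(ASSERT_CALL.findall(test_source)),
        "sleeps": len(SLEEP_CALL.findall(test_source)),
    }


# Hypothetical Java test used as input text.
example = """
@Test
public void fetchesReply() throws Exception {
    client.send("ping");
    // wait for the server
    Thread.sleep(100);
    assertEquals("pong", client.lastReply());
}
"""

metrics = static_metrics(example)
print(metrics)
```

Feature vectors of this kind, one per test, are what a static-only classifier would be trained on.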

Main Results:

The static-only approach demonstrated performance comparable to existing baseline methods.
Analysis revealed significant differences between flaky and non-flaky tests based on static metrics.
Production code characteristics were found to influence the effectiveness of flaky test prediction models.
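
One way to quantify how strongly a static metric separates the two groups is an effect size such as Cliff's delta. The sketch below is an assumption for illustration: the summary does not state which statistical tests the authors applied, and the sample values are invented.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs, in [-1, 1]."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Made-up lines-of-code values for two small groups of tests.
flaky_loc = [30, 42, 55, 61]
stable_loc = [10, 12, 30, 35]

delta = cliffs_delta(flaky_loc, stable_loc)
print(delta)
```

A value near 1 or -1 means the metric almost completely separates the groups, while a value near 0 means the distributions overlap heavily.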

Conclusions:

Predicting test flakiness using solely static metrics is a viable and scalable approach.
Static metrics alone can offer competitive performance in identifying flaky tests.
Future research should consider production code attributes for enhanced flaky test prediction accuracy.