Putting Psychology to the Test: Rethinking Model Evaluation Through Benchmarking and Prediction.

Roberta Rocca¹,², Tal Yarkoni¹

¹Department of Psychology, University of Texas at Austin, Austin, Texas, USA.

Advances in Methods and Practices in Psychological Science

May 13, 2024

Summary

This summary is machine-generated.

Psychology needs common benchmarks for evaluating scientific models. Implementing these shared standards will improve model assessment and drive cumulative research progress.

Keywords:

benchmarking, machine learning, model evaluation, open data, open materials, psychology

Area of Science:

Psychology
Scientific Methodology

Background:

Psychology lacks standardized metrics for evaluating scientific models and theories.
Current evaluation practices are often inconsistent and fail to assess model generalizability, hindering the reliable discrimination between effective and ineffective models.
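
The generalizability problem described above can be illustrated with a small simulation. This is a minimal sketch, not a method from the article: the data-generating process, the two toy models, and every function name here are assumptions made for illustration. It shows how a model that looks perfect on the data it was fitted to can still predict new cases worse than a simpler model.

```python
import random


def make_data(n, rng, slope=2.0, noise_sd=1.0):
    """Simulate n (x, y) pairs from y = slope * x + Gaussian noise (hypothetical data)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    return [(x, slope * x + rng.gauss(0.0, noise_sd)) for x in xs]


def fit_linear(train):
    """Ordinary least squares with one predictor; returns a predict function."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_memorizer(train):
    """1-nearest-neighbour 'model' that recalls the outcome of the closest training case."""
    return lambda x: min(train, key=lambda pair: abs(pair[0] - x))[1]


def mse(predict, data):
    """Mean squared prediction error of a model on a set of (x, y) cases."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


def generalization_report(seed=0, n=500):
    """Score both models on the data they were fitted to and on fresh held-out data."""
    rng = random.Random(seed)
    train, test = make_data(n, rng), make_data(n, rng)
    models = {"linear": fit_linear(train), "memorizer": fit_memorizer(train)}
    return {
        name: {"train_mse": mse(model, train), "test_mse": mse(model, test)}
        for name, model in models.items()
    }


for name, scores in generalization_report().items():
    print(name, scores)
```

Judged on the fitting data alone, the memorizer appears flawless because it reproduces every training outcome. Only the held-out scores reveal that the linear model predicts new observations better, which is the kind of discrimination that in-sample evaluation cannot provide.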

Purpose of the Study:

To advocate for the adoption of common benchmarks in psychology for model evaluation.
To propose principles for effective benchmark design and implementation.
To address potential challenges in establishing community-wide evaluation standards.

Main Methods:

Draws inspiration from machine learning and statistical genetics.
Discusses principles that make benchmarks useful.
Identifies practical steps toward community adoption.
Addresses concerns about implementation.

Main Results:

Psychology lacks reliable communal metrics for evaluating model performance.
Evaluation practices are idiosyncratic and have shortcomings such as poor assessment of generalizability.
Common benchmarks could improve discrimination between models and foster progress.
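
One way to picture a common benchmark is as a fixed set of held-out cases paired with a single agreed metric, so that any candidate model receives a directly comparable score. The sketch below is illustrative only: the `Benchmark` class, the `leaderboard` helper, the toy cases, and the two candidate models are all hypothetical and do not come from the article.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Benchmark:
    """A shared evaluation target: fixed held-out cases plus one agreed metric."""
    name: str
    test_cases: Tuple[Tuple[float, float], ...]  # (input, observed outcome)

    def score(self, predict: Callable[[float], float]) -> float:
        """Mean absolute error on the held-out cases (lower is better)."""
        errors = [abs(predict(x) - y) for x, y in self.test_cases]
        return sum(errors) / len(errors)


def leaderboard(
    benchmark: Benchmark, models: Dict[str, Callable[[float], float]]
) -> List[Tuple[str, float]]:
    """Rank every submitted model by its score on the same benchmark, best first."""
    scores = [(name, benchmark.score(model)) for name, model in models.items()]
    return sorted(scores, key=lambda item: item[1])


# Hypothetical benchmark data and candidate models, for illustration only.
toy = Benchmark(
    name="toy-benchmark",
    test_cases=((1.0, 2.5), (2.0, 3.5), (3.0, 6.5), (4.0, 7.5)),
)
candidates = {
    "constant": lambda x: 5.0,    # always predicts the same value
    "linear": lambda x: 2.0 * x,  # predicts an outcome proportional to the input
}

print(leaderboard(toy, candidates))
# prints [('linear', 0.5), ('constant', 2.0)]
```

Because the cases and the metric are frozen inside the benchmark, no individual study can pick an evaluation that flatters its own model, and scores reported by different groups remain comparable.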

Conclusions:

Implementing common evaluation benchmarks is crucial for advancing psychological science.
Consensus on benchmarks will enhance the practical utility and reliability of scientific models.
Adoption of benchmarks can lead to more robust and cumulative psychological research.