Making Inferences About Teacher Observation Scores Over Time.

Derek C Briggs¹, Jessica L Alzen¹

¹University of Colorado Boulder, Boulder, CO, USA.

Educational and Psychological Measurement

July 14, 2020

Summary

This summary is machine-generated.

To accurately measure teacher practice growth, consider the timing of observations. At least eight observations over two years are needed to reliably distinguish changes in teaching practices.

Keywords:

generalizability theory, growth modeling, reliability, teacher observation protocol

Area of Science:

Educational Measurement
Psychometrics
Teacher Professional Development

Background:

Teacher observation scores are frequently used to evaluate teaching practices.
Sources of measurement error in observation scores include the rater, the lesson, and the time of year.
The temporal facet of measurement is often overlooked, impacting score generalizability.

Purpose of the Study:

To apply a generalizability theory framework to understand measurement facets in teacher observations.
To investigate the reliability of inferring teacher practice growth over time.
To determine the optimal number and spacing of observations for reliable growth measurement.

Main Methods:

Utilized longitudinal observation scores from the Measures of Effective Teaching project.
Employed a generalizability theory framework to model score variance.
Analyzed data based on the Danielson Framework for Teaching.
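
As a rough illustration of how a generalizability coefficient depends on the number of observations, the sketch below uses a simplified single-facet design. It is not the study's model, which involves several facets (rater, lesson, time) and growth over time. The function names and the variance components in the usage lines are hypothetical and are not taken from the study.

```python
import math


def g_coefficient(var_teacher: float, var_error: float, n_obs: int) -> float:
    """Generalizability coefficient for a mean over n_obs observations.

    Simplified single-facet form: true (teacher) variance divided by
    true variance plus error variance averaged over n_obs observations.
    """
    if n_obs < 1:
        raise ValueError("n_obs must be at least 1")
    return var_teacher / (var_teacher + var_error / n_obs)


def min_observations(var_teacher: float, var_error: float, target: float) -> int:
    """Smallest number of observations whose coefficient reaches target."""
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    # Solve var_teacher / (var_teacher + var_error / n) >= target for n.
    n = target * var_error / ((1 - target) * var_teacher)
    return max(1, math.ceil(n))


# Hypothetical variance components, chosen only to show the calculation.
single = g_coefficient(var_teacher=1.0, var_error=4.0, n_obs=1)
needed = min_observations(var_teacher=1.0, var_error=4.0, target=0.5)
print(f"one observation: {single:.2f}; observations needed for 0.50: {needed}")
```

Averaging over more observations shrinks the error term, which is why the coefficient rises with the number of observations. A full analysis would first estimate each variance component from the data.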

Main Results:

Identified time as a significant, yet often overlooked, facet in teacher observation scores.
Demonstrated that inferences about teacher growth are possible with longitudinal data.
Found that at least eight observations across two years are needed to distinguish teacher growth, with a reported reliability coefficient of .39.

Conclusions:

Teacher practice evaluation should account for the temporal dimension to ensure accurate growth assessment.
Longitudinal observation designs are crucial for understanding teacher development trajectories.
The study provides empirical evidence for the number of observations needed for reliable teacher growth measurement.