Scoring Coreference Partitions of Predicted Mentions: A Reference Implementation.

Sameer Pradhan¹, Xiaoqiang Luo², Marta Recasens³

¹Harvard Medical School, Boston, MA.

Proceedings of the Conference. Association for Computational Linguistics. Meeting

November 7, 2017

Summary

This summary is machine-generated.

Coreference resolution metrics such as B³ and CEAF have underspecified definitions when applied to predicted mentions. This study clarifies how these metrics apply in that setting, provides an open-source reference implementation, and rescores shared task results so that algorithms can be compared on a consistent basis.

Area of Science:

Natural Language Processing
Computational Linguistics

Background:

The coreference resolution metrics B³ and CEAF have ambiguous definitions when applied to predicted mentions.
Existing variants of these metrics manipulate the key and predicted mentions to obtain a one-to-one mapping, and BLANC was defined only for key mentions.
Accurate evaluation is crucial for advancing coreference resolution algorithms.

Purpose of the Study:

To clarify and standardize the definitions of coreference evaluation metrics for predicted mentions.
To argue against unnecessary mention manipulation in scoring predicted mentions.
To provide a reliable, open-source implementation of coreference evaluation measures.

Main Methods:

Analyzing and clarifying the definitions of the B³, CEAF, and BLANC metrics.
Illustrating the application of these metrics to predicted mentions without harmful manipulation.
Developing and releasing an open-source reference implementation of coreference evaluation tools.
Rescoring CoNLL-2011/2012 shared task systems using the standardized implementation.
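
To make the idea of scoring predicted mentions without manipulation concrete, here is a minimal Python sketch of B³ computed directly from the overlaps between key and response entities. It is an illustration under stated assumptions, not the released reference implementation: the function name `b_cubed`, the toy mention labels, and the choice to return 0.0 for empty input are all hypothetical, and the formulation used (squared overlap divided by entity size, normalized by the total number of mentions on the side being scored) is one common way to state B³ when the two mention sets differ. Each mention is assumed to belong to at most one entity per partition.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Entity = Set[Hashable]


def _directed_score(side_a: List[Entity], side_b: List[Entity]) -> float:
    """Sum |A_i & B_j|^2 / |A_i| over all pairs, divided by the mentions in side_a."""
    total_mentions = sum(len(a) for a in side_a)
    if total_mentions == 0:
        # Assumption for this sketch: an empty side scores 0.0.
        return 0.0
    overlap = sum(len(a & b) ** 2 / len(a) for a in side_a if a for b in side_b)
    return overlap / total_mentions


def b_cubed(
    key: Iterable[Iterable[Hashable]],
    response: Iterable[Iterable[Hashable]],
) -> Tuple[float, float, float]:
    """Return (recall, precision, F1) for two partitions of mentions.

    Neither partition is altered: mentions found on only one side simply
    contribute no overlap, so no mentions are added or dropped before scoring.
    """
    key_entities = [set(entity) for entity in key]
    response_entities = [set(entity) for entity in response]

    recall = _directed_score(key_entities, response_entities)
    precision = _directed_score(response_entities, key_entities)
    if recall + precision == 0:
        return recall, precision, 0.0
    f1 = 2 * recall * precision / (recall + precision)
    return recall, precision, f1


# Toy example: the response splits one key entity and adds a spurious mention "d".
recall, precision, f1 = b_cubed([{"a", "b", "c"}], [{"a", "b"}, {"c", "d"}])
print(f"recall={recall:.3f} precision={precision:.3f} f1={f1:.3f}")
# prints: recall=0.556 precision=0.625 f1=0.588
```

In the toy example the spurious mention "d" lowers precision and the split entity lowers recall, with no need to insert "d" into the key or remove it from the response before scoring.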

Main Results:

Demonstrated that mention manipulation for predicted mentions is unnecessary and can yield counterintuitive results.
Provided a unified framework for applying coreference metrics to predicted mentions.
Released a robust, open-source implementation for coreference evaluation.
Generated rescored results for the CoNLL-2011/2012 shared task systems, facilitating direct comparison.

Conclusions:

The proposed clarifications and open-source implementation enable more accurate and consistent measurement of coreference resolution algorithms.
Standardized evaluation will accelerate progress in the field of end-to-end coreference resolution.
Researchers can now better benchmark and compare novel coreference resolution systems.