Coding with the machines: machine-assisted coding of rare event data.

Henry David Overos¹, Roman Hlatky², Ojashwi Pathak¹

¹Government and Politics, University of Maryland at College Park, College Park, MD, USA.

May 20, 2024

Summary

This summary is machine-generated.

Large language models (LLMs) show promise for machine coding, but validation remains crucial. Of the models tested, only GPT-4 approaches expert-level performance, and mainly in familiar contexts, which underscores the need for rigorous LLM evaluation.

Keywords:

BERT, GPT, machine coding, machine learning, political event data

Area of Science:

Political Science
Computational Social Science
Artificial Intelligence

Background:

Machine coding with LLMs has advanced significantly, but concerns remain about the reliability and validation of LLM classifications.
LLM performance varies with the prompt, tuning, subject area, and task, especially in zero-shot applications.

Purpose of the Study:

To evaluate the performance of supervised and semi-supervised machine coding algorithms in political science.
To compare the performance of three LLMs against each other and against trained human experts.
To assess the impact of prompt engineering and data pre-processing on LLM coding accuracy.

Main Methods:

Comparative analysis of three LLMs using supervised and semi-supervised learning on political data.
Multiple iterations of model performance testing with varying prompt engineering and data pre-processing techniques.
Assessment of LLM performance on an updated dataset to mitigate pre-training bias concerns.
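
Comparing machine codes against trained human coders usually comes down to an agreement statistic. The summary does not say which metric the authors used, so the sketch below is only an illustration: it computes raw accuracy and Cohen's kappa (a common chance-corrected choice) on made-up event labels. The function names and the example labels are hypothetical.

```python
from collections import Counter


def accuracy(expert, model):
    """Share of items where the model label matches the expert label."""
    if len(expert) != len(model) or not expert:
        raise ValueError("label lists must be non-empty and the same length")
    return sum(e == m for e, m in zip(expert, model)) / len(expert)


def cohen_kappa(expert, model):
    """Chance-corrected agreement between two coders (Cohen's kappa)."""
    n = len(expert)
    observed = accuracy(expert, model)
    expert_counts, model_counts = Counter(expert), Counter(model)
    # Agreement expected if both coders labeled independently
    # according to their own marginal label frequencies.
    expected = sum(
        expert_counts[label] * model_counts[label]
        for label in expert_counts.keys() | model_counts.keys()
    ) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels, not data from the study.
expert_codes = ["protest", "protest", "riot", "none"]
model_codes = ["protest", "riot", "riot", "none"]

print(accuracy(expert_codes, model_codes))
print(cohen_kappa(expert_codes, model_codes))
```

Kappa is reported alongside accuracy here because rare event data are typically imbalanced, and raw accuracy can look high when a coder simply favors the majority class.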

Main Results:

GPT-4 demonstrated performance comparable to trained human experts in coding familiar contexts.
LLM consistency in coding varied across different contexts, with GPT-4 showing higher consistency.
Prompt engineering and data pre-processing influenced LLM performance, but expert-level coding was primarily achieved by GPT-4 in specific scenarios.
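
One possible way to quantify coding consistency is to send the same items to a model several times and measure how often each run agrees with the most common label for that item. The summary does not describe how the authors measured consistency, so the following is a minimal sketch under that assumption; `run_consistency` and the example runs are hypothetical.

```python
from collections import Counter


def run_consistency(runs):
    """Mean share of runs that agree with each item's modal label.

    `runs` is a list of label lists, one list per repeated run,
    aligned so that position i refers to the same item in every run.
    """
    if not runs or not runs[0]:
        raise ValueError("need at least one non-empty run")
    if any(len(run) != len(runs[0]) for run in runs):
        raise ValueError("all runs must cover the same items")
    per_item = []
    for labels in zip(*runs):
        # Count of the most frequent label for this item across runs.
        modal_count = Counter(labels).most_common(1)[0][1]
        per_item.append(modal_count / len(labels))
    return sum(per_item) / len(per_item)


# Hypothetical output of three repeated runs over three items.
repeated_runs = [
    ["protest", "riot", "none"],
    ["protest", "riot", "protest"],
    ["protest", "none", "protest"],
]

print(run_consistency(repeated_runs))
```

A score of 1.0 means every run returned the same label for every item; lower values indicate that the model's coding changes between otherwise identical calls.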

Conclusions:

Only GPT-4 approaches the performance of trained expert coders for political data, particularly in familiar contexts.
LLM coding offers potential benefits but requires careful validation and consideration of drawbacks for reliable application.
Further research is needed to refine LLM validation methods and ensure consistent performance across diverse coding tasks.