Coding with the machines: machine-assisted coding of rare event data.

Henry David Overos¹, Roman Hlatky², Ojashwi Pathak¹

  • ¹ Government and Politics, University of Maryland at College Park, College Park, MD, USA.

PNAS Nexus | May 20, 2024
Summary
This summary is machine-generated.

Large language models (LLMs) show promise for machine coding, but validation remains crucial. GPT-4 demonstrates expert-level performance in familiar contexts, highlighting the need for rigorous LLM evaluation.

Keywords:
BERT, GPT, machine coding, machine learning, political event data

Area of Science:

  • Political Science
  • Computational Social Science
  • Artificial Intelligence

Background:

  • Machine coding with LLMs has advanced significantly, but concerns remain about the reliability and validation of LLM classifications.
  • LLM performance varies with the prompt, tuning, subject area, and task, especially in zero-shot applications.

Purpose of the Study:

  • To evaluate the performance of supervised and semi-supervised machine coding algorithms in political science.
  • To compare the performance of three LLMs against each other and against trained human experts.
  • To assess the impact of prompt engineering and data pre-processing on LLM coding accuracy.

Main Methods:

  • Comparative analysis of three LLMs using supervised and semi-supervised learning on political data.
  • Multiple iterations of model performance testing with varying prompt engineering and data pre-processing techniques.
  • Assessment of LLM performance on an updated dataset to mitigate pre-training bias concerns.
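
One way to picture the comparison between machine codes and expert codes is a simple agreement check. The sketch below is illustrative only: this summary does not say which metrics the study used, so accuracy and Cohen's kappa are assumptions, as are the function names and the toy event labels.

```python
from collections import Counter


def accuracy(expert, model):
    """Share of items where the model's code matches the expert's code."""
    if len(expert) != len(model) or not expert:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(e == m for e, m in zip(expert, model)) / len(expert)


def cohen_kappa(expert, model):
    """Chance-corrected agreement between two coders (Cohen's kappa)."""
    n = len(expert)
    p_observed = accuracy(expert, model)
    expert_counts = Counter(expert)
    model_counts = Counter(model)
    # Agreement expected if both coders labeled at random with their own
    # marginal label frequencies.
    p_expected = sum(
        expert_counts[label] * model_counts[label]
        for label in expert_counts.keys() | model_counts.keys()
    ) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: expert codes vs. model codes for six documents.
expert_codes = ["protest", "protest", "none", "none", "riot", "none"]
model_codes = ["protest", "none", "none", "none", "riot", "none"]

print(f"accuracy = {accuracy(expert_codes, model_codes):.3f}")
print(f"kappa    = {cohen_kappa(expert_codes, model_codes):.3f}")
```

Kappa is included alongside raw accuracy because rare event data are usually dominated by one label, so a coder that always answers "none" can score high accuracy while adding no information.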

Main Results:

  • GPT-4 demonstrated performance comparable to trained human experts in coding familiar contexts.
  • LLM consistency in coding varied across different contexts, with GPT-4 showing higher consistency.
  • Prompt engineering and data pre-processing influenced LLM performance, but expert-level coding was primarily achieved by GPT-4 in specific scenarios.
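
Coding consistency can be checked by sending the same items to a model several times and seeing how often the labels agree. The following is a minimal sketch under that assumption; the summary does not describe how consistency was measured in the study, and the function name, the two measures, and the sample labels are hypothetical.

```python
from collections import Counter


def run_consistency(runs):
    """Summarize label stability across repeated runs over the same items.

    runs: list of label lists, one list per run, all in the same item order.
    Returns (share of items labeled identically in every run,
             mean share of runs agreeing with each item's most common label).
    """
    if not runs or not runs[0] or any(len(r) != len(runs[0]) for r in runs):
        raise ValueError("runs must be non-empty and of equal length")
    n_items = len(runs[0])
    unanimous = 0
    modal_share_total = 0.0
    for labels in zip(*runs):  # all labels assigned to one item
        top_count = Counter(labels).most_common(1)[0][1]
        modal_share_total += top_count / len(runs)
        if top_count == len(runs):
            unanimous += 1
    return unanimous / n_items, modal_share_total / n_items


# Hypothetical labels from three repeated runs over four documents.
repeated_runs = [
    ["protest", "none", "protest", "riot"],
    ["protest", "none", "none", "riot"],
    ["protest", "none", "protest", "riot"],
]

unanimous_share, mean_modal_share = run_consistency(repeated_runs)
print(f"unanimous items:  {unanimous_share:.2f}")
print(f"mean modal share: {mean_modal_share:.2f}")
```

The two numbers answer slightly different questions: the first counts only perfectly stable items, while the second gives partial credit when most runs agree.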

Conclusions:

  • Only GPT-4 approaches the performance of trained expert coders for political data, particularly in familiar contexts.
  • LLM coding offers potential benefits but requires careful validation and consideration of drawbacks for reliable application.
  • Further research is needed to refine LLM validation methods and ensure consistent performance across diverse coding tasks.