Preprocessing Large-Scale Conversational Datasets: A Framework and Its Application to Behavioral Health Transcripts.

Paz Mor Naim¹, Shiri Sadeh-Sharvit²,³, Samuel Jefroykin²

¹Department of Cognitive and Brain Sciences, Hebrew University of Jerusalem, Mount Scopus, Jerusalem, 9190500, Israel, 972 025882888.

JMIR Formative Research

October 24, 2025

Summary

This summary is machine-generated.

This study introduces a framework using large language models (LLMs) to filter noisy conversational transcripts, improving data quality for behavioral health research. The hybrid approach effectively distinguishes therapy sessions from non-sessions, enhancing data usability.

Keywords:

artificial intelligence; behavioral health; clinical documentation; clinical texts; conversational transcripts; data preprocessing; data quality assessment; health informatics; health information systems; large language models; natural language processing; psychotherapy; text classification

Area of Science:

Computational Linguistics
Health Informatics
Artificial Intelligence

Background:

Automatic transcription of conversations generates noisy datasets with errors and unintended recordings.
Preprocessing and filtering are crucial for the research utility of large conversational transcript datasets.
Accurate conversation representation is vital for deriving insights in behavioral health contexts.

Purpose of the Study:

To present a framework for preprocessing and filtering large conversational transcript datasets.
To remove non-session transcripts unrelated to behavioral treatment sessions.
To enhance the utility of behavioral health transcripts for research.

Main Methods:

Integrated feature extraction, human annotation, and large language models (LLMs).
Used LLM perplexity to measure transcript noise and zero-shot LLM prompting to classify transcripts as sessions or non-sessions.
Prioritized data security and anonymity throughout the process.
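
The two LLM-based steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's implementation: the summary does not name the model, the prompt wording, or any thresholds, so `PROMPT_TEMPLATE`, `classify_zero_shot`, and the `llm` callable are hypothetical. Perplexity is computed with its standard definition, the exponential of the negative mean log-probability per token, from log-probabilities that some language model is assumed to have already produced.

```python
import math
from typing import Callable, Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Standard perplexity: exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Hypothetical prompt wording; the study's actual prompt is not given in this summary.
PROMPT_TEMPLATE = (
    "You will read a transcript of a recorded conversation.\n"
    "Answer with exactly one word: SESSION if it is a behavioral treatment "
    "session, or NON-SESSION otherwise.\n\n"
    "Transcript:\n{transcript}\n\nAnswer:"
)


def classify_zero_shot(transcript: str, llm: Callable[[str], str]) -> str:
    """Zero-shot classification: the prompt holds an instruction but no labeled examples.

    `llm` is any function that maps a prompt string to the model's text reply
    (an assumption; plug in whichever client is approved for the data).
    """
    reply = llm(PROMPT_TEMPLATE.format(transcript=transcript)).strip().upper()
    if reply.startswith("NON"):
        return "non-session"
    if reply.startswith("SESSION"):
        return "session"
    return "unparsed"  # leave ambiguous replies for human review
```

Passing the model in as a plain callable keeps the sketch independent of any vendor API, which also makes it easier to run against a locally hosted model when transcripts must not leave a secure environment.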

Main Results:

Approximately one-third of transcripts contained errors, including incomprehensible segments and speaker diarization issues.
LLM perplexity was higher for non-sessions but, used alone, gave only moderate classification performance.
Zero-shot LLM prompting achieved high agreement with expert ratings (κ=0.71) in distinguishing sessions from non-sessions.
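
For readers unfamiliar with the agreement statistic reported above, Cohen's kappa corrects raw agreement between two raters for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below computes it for two label sequences. The function name and the toy labels are illustrative assumptions; they are not the study's data.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1 - expected)


# Toy labels (made up for illustration): 1 = session, 0 = non-session.
expert_labels = [1, 1, 1, 1, 0, 0, 0, 0]
model_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_kappa = cohen_kappa(expert_labels, model_labels)
```

In this toy case the raters agree on 6 of 8 items (p_o = 0.75) and chance agreement is 0.5, so kappa is 0.5, lower than the raw agreement rate suggests.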

Conclusions:

The hybrid approach effectively characterizes errors and distinguishes text types in conversational datasets.
Provides a foundation for ensuring data quality and usability in mental health research.
Emphasizes integrating clinical experts with AI tools while prioritizing data security.