Preprocessing Large-Scale Conversational Datasets: A Framework and Its Application to Behavioral Health Transcripts.

Paz Mor Naim1, Shiri Sadeh-Sharvit2,3, Samuel Jefroykin2

  • 1Department of Cognitive and Brain Sciences, Hebrew University of Jerusalem, Mount Scopus, Jerusalem, 9190500, Israel, 972 025882888.

JMIR Formative Research | October 24, 2025
Summary
This summary is machine-generated.

This study introduces a framework using large language models (LLMs) to filter noisy conversational transcripts, improving data quality for behavioral health research. The hybrid approach effectively distinguishes therapy sessions from non-sessions, enhancing data usability.

Keywords:
artificial intelligence, behavioral health, clinical documentation, clinical texts, conversational transcripts, data preprocessing, data quality assessment, health informatics, health information systems, large language models, natural language processing, psychotherapy, text classification

Area of Science:

  • Computational Linguistics
  • Health Informatics
  • Artificial Intelligence

Background:

  • Automatic transcription of conversations generates noisy datasets with errors and unintended recordings.
  • Preprocessing and filtering are crucial for the research utility of large conversational transcript datasets.
  • Accurate conversation representation is vital for deriving insights in behavioral health contexts.

Purpose of the Study:

  • To present a framework for preprocessing and filtering large conversational transcript datasets.
  • To remove non-session transcripts unrelated to behavioral treatment sessions.
  • To enhance the utility of behavioral health transcripts for research.

Main Methods:

  • Integrated feature extraction, human annotation, and large language models (LLMs).
  • Utilized LLM perplexity to measure transcript noise and zero-shot prompting for classification.
  • Prioritized data security and anonymity throughout the process.
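
As a rough illustration of the perplexity measure named above, the following minimal sketch computes perplexity from per-token log-probabilities. It assumes those log-probabilities have already been obtained from some language model. The function name `perplexity` and the example values are hypothetical, and the sketch is not the study's actual scoring pipeline.

```python
import math


def perplexity(token_logprobs):
    """Return exp of the negative mean natural-log probability per token.

    Lower values mean the model found the text more predictable.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Hypothetical log-probabilities, for illustration only.
fluent_text = [math.log(0.5)] * 4  # every token assigned probability 0.5
noisy_text = [math.log(0.1)] * 4   # every token assigned probability 0.1

print(perplexity(fluent_text))
print(perplexity(noisy_text))
```

In this toy case the less predictable text receives the higher perplexity, which is the property that makes the score a plausible proxy for transcript noise.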

Main Results:

  • Approximately one-third of transcripts contained errors, including incomprehensible segments and speaker diarization issues.
  • LLM perplexity scores were higher for non-sessions, but perplexity alone yielded only moderate classification performance.
  • Zero-shot LLM prompting achieved high agreement with expert ratings (κ=0.71) in distinguishing sessions from non-sessions.
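
The agreement statistic reported above, Cohen's κ, corrects raw agreement for the agreement expected by chance. The sketch below shows the standard two-rater calculation on made-up labels. The function name `cohen_kappa` and the label lists are hypothetical and are not the study's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for ten transcripts, for illustration only.
expert = ["session"] * 6 + ["non-session"] * 4
llm = ["session"] * 5 + ["non-session"] * 4 + ["session"]

print(round(cohen_kappa(expert, llm), 3))
```

Here the two raters agree on 8 of 10 items (observed agreement 0.80) while chance agreement is 0.52, giving κ of roughly 0.58.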

Conclusions:

  • The hybrid approach effectively characterizes errors and distinguishes text types in conversational datasets.
  • Provides a foundation for ensuring data quality and usability in mental health research.
  • Emphasizes integrating clinical experts with AI tools while prioritizing data security.