A Data Element-Function Conceptual Model for Data Quality Checks.

James R Rogers (1), Tiffany J Callahan (2), Tian Kang (1)

  • (1) Department of Biomedical Informatics, Columbia University, US.

EGEMS (Washington, DC) | May 9, 2019
PubMed
Summary
This summary is machine-generated.

This study introduces a data element-function model and natural language processing (NLP) to categorize data quality (DQ) checks. The approach effectively classifies checks, revealing heterogeneity in DQ assessment across different health data initiatives.

Keywords:
clinical data research networks; data quality; electronic healthcare records; knowledge acquisition; natural language processing

Area of Science:

  • Health Informatics
  • Data Science
  • Natural Language Processing

Background:

  • Existing data quality (DQ) checks are fragmented across heterogeneous formats, hindering comparison, categorization, and indexing.
  • A standardized approach is needed to manage and understand the vast landscape of DQ checks.

Purpose of the Study:

  • To develop and evaluate a data element-function conceptual model for classifying DQ checks.
  • To explore the use of natural language processing (NLP) for automated knowledge acquisition from DQ check narratives.

Main Methods:

  • A conceptual model defining "data element" and "function" was developed.
  • NLP techniques were applied to extract data elements and functions from 172 Observational Health Data Sciences and Informatics (OHDSI) checks and 3,434 Kaiser Permanente Center for Effectiveness and Safety Research (CESR) checks.
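To make the data element-function idea concrete, the sketch below annotates a DQ check narrative with the data elements it mentions and the functions it applies. It is only an illustration under stated assumptions: the keyword lists, the function names (`missingness`, `range`, `uniqueness`), and the helper `annotate_check` are hypothetical, and simple keyword matching stands in for the study's NLP pipeline, which this summary does not describe.

```python
import re
from dataclasses import dataclass

# Hypothetical vocabularies for illustration only; the study's actual
# lexicons and NLP methods are not given in this summary.
FUNCTION_PATTERNS = {
    "missingness": re.compile(r"\b(missing|null|not populated)\b", re.I),
    "range": re.compile(r"\b(between|greater than|less than|out of range)\b", re.I),
    "uniqueness": re.compile(r"\b(unique|duplicate)\b", re.I),
}
DATA_ELEMENTS = ["birth date", "person id", "drug exposure start date"]


@dataclass(frozen=True)
class CheckAnnotation:
    """A DQ check reduced to the data elements and functions it mentions."""
    data_elements: tuple
    functions: tuple


def annotate_check(narrative: str) -> CheckAnnotation:
    """Tag a check narrative using case-insensitive keyword matching."""
    text = narrative.lower()
    # A data element counts as present if its name occurs as a substring.
    elements = tuple(e for e in DATA_ELEMENTS if e in text)
    # A function counts as present if any of its trigger phrases occurs.
    functions = tuple(
        name for name, pattern in FUNCTION_PATTERNS.items()
        if pattern.search(narrative)
    )
    return CheckAnnotation(elements, functions)
```

For example, `annotate_check("Birth date must not be missing")` yields one data element (`birth date`) and one function (`missingness`). A real system would need a far richer vocabulary and handling of synonyms and negation.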

Main Results:

  • The model successfully classified all analyzed DQ checks.
  • 751 unique data elements and 24 unique functions were extracted.
  • Frequent data element-function pairings varied between OHDSI and CESR, highlighting differences in DQ check focus.
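One way to compare pairing frequencies between sources is to count data element-function pairs per source and rank them. The sketch below assumes each check has already been annotated as a `(source, data_elements, functions)` triple; the helper names `pairing_counts` and `top_pairings` and the input shape are hypothetical, not taken from the study.

```python
from collections import Counter
from itertools import product


def pairing_counts(checks):
    """Count (data element, function) pairs per source.

    `checks` is an iterable of (source, data_elements, functions) triples,
    a hypothetical input shape chosen for this illustration.
    """
    counts = {}
    for source, elements, functions in checks:
        counter = counts.setdefault(source, Counter())
        # Every element named in a check is paired with every function in it.
        counter.update(product(elements, functions))
    return counts


def top_pairings(checks, n=3):
    """Return the n most frequent pairings for each source."""
    return {
        source: counter.most_common(n)
        for source, counter in pairing_counts(checks).items()
    }
```

Calling `top_pairings` on annotated checks from two sources returns a per-source ranking that can be compared side by side. Pairing every element with every function within a check is a simplifying assumption that may overcount when a check applies different functions to different elements.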

Conclusions:

  • The data element-function model is effective for classifying DQ checks.
  • NLP shows promise for scalable knowledge acquisition in this domain.
  • Significant heterogeneity exists in DQ checks, reflecting variations in intrinsic data properties and use-case specific "fitness-for-use" requirements.