The interaction effect between data discretization and data resampling for class-imbalanced medical datasets.

Min-Wei Huang1,2,3, Chih-Fong Tsai4, Wei-Chao Lin5,6,7

  • 1Kaohsiung Municipal Kai-Syuan Psychiatric Hospital, Kaohsiung.

Technology and health care : official journal of the European Society for Engineering and Medicine
|March 19, 2025
PubMed
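Summary
This summary is machine-generated.

Combining data discretization with resampling can improve classifier performance on imbalanced medical data. The best strategy depends on the dataset type, and oversampling generally improves results over baseline methods.

Keywords:
Class imbalance; data mining; data resampling; discretization; machine learning.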

Scientific fields:

  • Data mining
  • Machine learning
  • Medical informatics

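Background:

  • Data discretization converts continuous features into discrete ones, which helps certain data mining algorithms.
  • Class-imbalanced medical datasets make accurate classification difficult.
  • Data resampling techniques (oversampling, undersampling, and hybrid methods) are used to balance the training data.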
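
As a rough illustration of the first two points, the Python sketch below bins a continuous feature and measures how skewed the class labels are. It is a minimal sketch under stated assumptions: equal-width binning is a simple stand-in (the study itself uses ChiMerge and MDLP), and the function names and toy data are hypothetical.

```python
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # min(...) keeps the maximum value inside the last bin
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical toy data: one continuous feature and a rare positive class
glucose = [4.0, 5.0, 5.5, 6.0, 6.5, 7.0, 9.0, 10.0]
label = [0, 0, 0, 0, 0, 0, 1, 1]

bins = equal_width_bins(glucose, n_bins=3)   # [0, 0, 0, 1, 1, 1, 2, 2]
ratio = imbalance_ratio(label)               # 3.0
```

A ratio of 3.0 means the majority class has three times as many samples as the minority class, which is the kind of skew that resampling tries to correct.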

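Purpose of the study:

  • To evaluate how combining data discretization and resampling affects classifier performance on imbalanced medical datasets.
  • To compare the order in which the discretization and resampling steps are applied.
  • To identify the best preprocessing strategy for improving classification accuracy.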
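
The question of step order can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not the study's code: `discretize` uses equal-width binning as a stand-in for ChiMerge or MDLP, `oversample` is a deterministic midpoint interpolation standing in for SMOTE, and the toy data are invented. It only shows that the two orders can produce different training sets; it says nothing about which order performs better.

```python
def discretize(X, y, n_bins=3):
    """Equal-width binning of a single feature (stand-in discretizer)."""
    lo, hi = min(X), max(X)
    width = (hi - lo) / n_bins or 1.0  # avoid division by zero for constant X
    codes = [min(int((x - lo) / width), n_bins - 1) for x in X]
    return codes, list(y)


def oversample(X, y, minority=1):
    """Add the midpoint between neighbouring minority values (SMOTE-like stand-in)."""
    pts = sorted(x for x, lab in zip(X, y) if lab == minority)
    synth = [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    return list(X) + synth, list(y) + [minority] * len(synth)


def run(X, y, steps):
    """Apply preprocessing steps in the given order."""
    for step in steps:
        X, y = step(X, y)
    return X, y


# Hypothetical toy data: class 1 is the minority
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 0, 0, 1]

disc_first = run(X, y, [discretize, oversample])
resample_first = run(X, y, [oversample, discretize])
```

In this toy case, discretizing first leaves the synthetic sample at 1.5, halfway between two bin codes, whereas resampling first assigns it the valid bin code 2.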

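Main methods:

  • Experiments were run on 11 two-class and 3 multiclass imbalanced medical datasets.
  • Discretization algorithms: ChiMerge and the minimum description length principle (MDLP).
  • Resampling algorithms: Tomek links undersampling, the synthetic minority oversampling technique (SMOTE), and SMOTE-Tomek.
  • Classifiers: support vector machine (SVM), C4.5 decision tree, and random forest (RF).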
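
The two resampling ideas named above can be sketched in a few lines of standard-library Python. This is a simplified illustration, not the implementation used in the study: `smote` interpolates between a minority sample and one of its nearest minority neighbours, and `tomek_links` finds opposite-class pairs that are each other's nearest neighbour. The function names, the parameters `k` and `seed`, and the toy points are all assumptions made for the example.

```python
import math
import random


def nearest(i, X, candidates):
    """Index among candidates (excluding i) closest to X[i]."""
    return min((j for j in candidates if j != i),
               key=lambda j: math.dist(X[i], X[j]))


def smote(X, y, minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by linear interpolation."""
    rng = random.Random(seed)
    idx = [i for i, lab in enumerate(y) if lab == minority]
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(idx)
        # k nearest neighbours of X[i] within the minority class
        neighbours = sorted((j for j in idx if j != i),
                            key=lambda j: math.dist(X[i], X[j]))[:k]
        j = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from X[i] to X[j]
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
    return synthetic


def tomek_links(X, y):
    """Pairs (i, j) of opposite-class points that are mutual nearest neighbours."""
    all_idx = range(len(X))
    nn = [nearest(i, X, all_idx) for i in all_idx]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


# Hypothetical toy data: class 0 is the majority, class 1 the minority
X = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2), (0.9, 1.0),
     (1.0, 1.0), (1.4, 1.0), (1.5, 1.5)]
y = [0, 0, 0, 0, 0, 1, 1, 1]

synthetic = smote(X, y, minority=1, n_new=3)
links = tomek_links(X, y)
# Tomek links undersampling drops the majority member of each link
drop = {i if y[i] == 0 else j for i, j in links}
```

A hybrid in the spirit of SMOTE-Tomek would first add the synthetic points and then clean Tomek links from the enlarged set; exactly which member of each link a given library removes is a configuration detail not covered here.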

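Main results:

  • Compared with the baseline methods, the combined approaches achieved a higher area under the ROC curve (AUC): 0.8%-3.5% higher for two-class datasets and 0.9%-2.5% higher for multiclass datasets.
  • For two-class data, MDLP discretization followed by SMOTE oversampling achieved the highest AUC at the lowest computational cost.
  • For multiclass data, SMOTE or SMOTE-Tomek resampling applied before ChiMerge discretization gave the best performance.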
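
Since the results are reported as AUC, a short worked example of the metric may help. The sketch below computes binary AUC from its rank interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The scores and labels are invented for illustration and are not taken from the study.

```python
def auc(scores, labels):
    """Binary AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores on an imbalanced test set (two positives)
labels = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]

value = auc(scores, labels)  # 7 of 8 pairs ranked correctly -> 0.875
```

Because the metric depends only on how positives rank against negatives, it is not inflated by a classifier that simply predicts the majority class, which is why it is a common choice for imbalanced data.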

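Conclusions:

  • Oversampling techniques generally improve classifier performance over the baseline methods.
  • Discretizing the data alone does not guarantee better classifier performance.
  • Combining discretization and resampling can yield a higher AUC on imbalanced medical datasets.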