Transformers deep learning models for missing data imputation: an application of the ReMasker model on a psychometric scale
Summary
This summary is machine-generated. Transformer models like ReMasker significantly improve missing data imputation in psychometric research. This advanced method outperforms traditional techniques, enhancing data reliability and study validity.
Area Of Science
- Psychometrics
- Data Science
- Machine Learning
Background
- Missing data is a significant challenge in psychometric research, potentially compromising study reliability and validity.
- Traditional imputation methods (e.g., mean imputation, Expectation-Maximization) rely on assumptions that psychological data often violate, which can lead to biased results.
- Developing robust methods for handling missing data is crucial for accurate psychometric analysis.
Purpose Of The Study
- To evaluate the effectiveness of transformer-based deep learning for imputing missing data in psychometric research.
- To compare a novel transformer model, ReMasker, against conventional and other machine learning imputation techniques.
- To assess the performance of different imputation methods using a real-world psychometric dataset.
Main Methods
- A masking autoencoding transformer model (ReMasker) was developed and compared with mean/median imputation, Expectation-Maximization (EM), K-nearest neighbors (KNN), MissForest, and Artificial Neural Networks (ANN).
- A psychometric dataset from the COVID distress repository was utilized for the evaluation.
- Imputation performance was quantified using the Root Mean Squared Error (RMSE) between original and imputed data matrices.
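The RMSE-based evaluation described above can be illustrated with a minimal sketch. It is not the study's code: the synthetic 5-point Likert data, the 20% missing rate, the completely-at-random masking, and the helper names (`mask_completely_at_random`, `impute_column_means`, `rmse`) are all assumptions made for illustration, and mean imputation stands in for any of the compared methods. The summary does not say whether RMSE was computed over the whole matrix or only over the masked entries, so the sketch supports both.

```python
import numpy as np


def mask_completely_at_random(data, missing_rate, rng):
    """Return a copy of `data` with a random share of entries set to NaN, plus the mask."""
    mask = rng.random(data.shape) < missing_rate
    corrupted = data.astype(float).copy()
    corrupted[mask] = np.nan
    return corrupted, mask


def impute_column_means(corrupted):
    """Baseline imputer: fill each missing entry with the observed mean of its column."""
    col_means = np.nanmean(corrupted, axis=0)
    imputed = corrupted.copy()
    rows, cols = np.where(np.isnan(imputed))
    imputed[rows, cols] = col_means[cols]
    return imputed


def rmse(original, imputed, mask=None):
    """RMSE between two matrices; if `mask` is given, only masked entries are scored."""
    diff = original - imputed
    if mask is not None:
        diff = diff[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Hypothetical data: 200 respondents answering 10 items on a 1-5 scale.
rng = np.random.default_rng(0)
original = rng.integers(1, 6, size=(200, 10)).astype(float)

corrupted, mask = mask_completely_at_random(original, missing_rate=0.2, rng=rng)
imputed = impute_column_means(corrupted)

print("RMSE over the full matrix:   ", rmse(original, imputed))
print("RMSE over masked entries only:", rmse(original, imputed, mask))
```

Any other imputer can be scored the same way by swapping it in for `impute_column_means`. Because observed entries are left unchanged, the full-matrix RMSE is never larger than the masked-entry RMSE, so results are only comparable across methods when the same convention is used.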
Main Results
- Transformer-based models, specifically ReMasker, demonstrated superior performance in data reconstruction compared to conventional imputation methods.
- Machine learning approaches, including ReMasker, consistently outperformed traditional techniques across all tested scenarios.
- ReMasker achieved the lowest reconstruction error, indicating more accurate data imputation.
Conclusions
- Transformer-based deep learning models offer a robust and effective solution for addressing missing data in psychometric research.
- The superior performance of ReMasker highlights the potential of advanced AI techniques to enhance data integrity and the generalizability of findings in psychological studies.
- These findings advocate for the adoption of advanced imputation methods to improve the quality of psychometric research outcomes.

