Improving Survey Inference Using Administrative Records Without Releasing Individual-Level Continuous Data
Summary
This summary is machine-generated. This study introduces a novel two-step method to reduce bias in survey estimates caused by nonresponse. By utilizing confidential continuous auxiliary data to estimate response propensity, the approach improves the accuracy of statistical inference for survey data users.
Area Of Science
- Statistics
- Survey Methodology
- Data Science
Background
- Increasing nonresponse rates in probability surveys lead to biased statistical inference.
- Auxiliary information can reduce estimation bias, but confidentiality concerns often lead to discretization of continuous variables, weakening their utility.
- Discretized auxiliary data may not fully capture the relationship with survey outcomes, limiting improvements in survey estimates.
Purpose Of The Study
- To propose a novel two-step strategy to effectively utilize confidential continuous auxiliary data for improving survey estimates.
- To address the challenge of weakened utility of auxiliary information due to discretization for confidentiality.
- To develop a method that enhances the accuracy and efficiency of statistical inference in surveys with high nonresponse.
Main Methods
- A two-step strategy is proposed: statistical agencies use confidential continuous auxiliary data to estimate response propensity scores.
- These propensity scores are incorporated into a modified population dataset for data users.
- Data users employ a Bayesian model with splines, including discretized variables and propensity scores, for predictive survey inference.
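The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation and not the AuxSurvey API: the population is simulated, the names `fit_logistic` and `run_sketch` are hypothetical, the agency's propensity model is assumed to be a plain logistic regression, and the data user's Bayesian spline model is replaced by an ordinary least-squares fit with a quadratic term in the propensity score to keep the example short.

```python
import numpy as np


def expit(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, r, n_iter=25):
    """Fit a logistic regression by Newton-Raphson and return the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (r - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def run_sketch(seed=0, n_pop=20000):
    """Return (true mean, naive respondent mean, propensity-adjusted estimate)."""
    rng = np.random.default_rng(seed)

    # Simulated population (an assumption for illustration only):
    # x is the confidential continuous auxiliary variable, y the survey outcome.
    x = rng.normal(size=n_pop)
    y = 2.0 + 1.5 * x + rng.normal(size=n_pop)
    # Response depends on x, so respondents are not representative.
    responded = rng.random(n_pop) < expit(-0.5 + x)

    # Step 1 (agency side): estimate response propensity from the continuous x.
    X_agency = np.column_stack([np.ones(n_pop), x])
    beta = fit_logistic(X_agency, responded.astype(float))
    logit_ps = X_agency @ beta  # propensity score on the logit scale

    # The released dataset holds only a discretized x (terciles here)
    # plus the propensity score, never the continuous x itself.
    x_cat = np.digitize(x, np.quantile(x, [1 / 3, 2 / 3]))

    # Step 2 (data-user side): model the outcome among respondents using the
    # discretized variable and the propensity score, then predict for everyone.
    # The quadratic term is a simple stand-in for the spline in the source.
    design = np.column_stack([
        np.ones(n_pop),
        x_cat == 1,
        x_cat == 2,
        logit_ps,
        logit_ps ** 2,
    ]).astype(float)
    coef, *_ = np.linalg.lstsq(design[responded], y[responded], rcond=None)

    adjusted = float((design @ coef).mean())
    naive = float(y[responded].mean())
    return float(y.mean()), naive, adjusted


if __name__ == "__main__":
    true_mean, naive, adjusted = run_sketch()
    print(f"true={true_mean:.3f} naive={naive:.3f} adjusted={adjusted:.3f}")
```

In this toy setup the propensity score carries the information in the continuous variable that the tercile indicators lose, which is why the data user can correct for nonresponse without ever seeing the individual-level continuous values.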
Main Results
- Simulations demonstrate that the proposed method yields more efficient estimates of population means.
- The method provides better coverage for 95% credible intervals compared to alternative approaches.
- The approach was successfully illustrated using the Ohio Army National Guard Mental Health Initiative (OHARNG-MHI) dataset.
Conclusions
- The proposed two-step strategy effectively leverages confidential continuous auxiliary data to mitigate nonresponse bias in survey estimates.
- The method enhances the precision and reliability of statistical inference, offering improved credible interval coverage.
- The developed methods are accessible through the R package AuxSurvey, promoting wider application in survey research.

