New world of big data-new challenges for evidence synthesis: impact of data duplication on estimates generated by meta-analyses and the development of a framework for its identification and management
Summary
This summary is machine-generated. Duplicate data in meta-analyses can significantly skew results; in the case examined, it led to venous thromboembolic event (VTE) incidence being underestimated by over 22%. A new framework helps identify and manage duplicated data from registries to improve accuracy.
Area Of Science
- Medical research methodology
- Biostatistics
- Health informatics
Background
- Meta-analyses synthesize existing research but can be compromised by data duplication.
- Using data from the same registries in multiple studies can lead to overlapping or duplicated entries.
- Accurate pooled estimates are crucial for evidence-based medicine.
Purpose Of The Study
- To evaluate the impact of duplicated data from registries on meta-analysis results.
- To develop and present a structured framework for identifying and managing duplicated data.
- To assess the incidence of venous thromboembolic events (VTE) post-bariatric surgery.
Main Methods
- Secondary analysis of a meta-analysis on 30-day VTE incidence after metabolic and bariatric surgery.
- Sensitivity analysis comparing uncorrected (all studies) and corrected (deduplicated) samples.
- Development of a decision tree framework to identify duplicated data based on source, timeframe, and inclusion criteria.
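The summary names the three criteria the decision tree uses (source, timeframe, inclusion criteria) but not its exact branches, so the following is only a minimal sketch of how such a check and the uncorrected-versus-corrected comparison might look. The `Study` fields, the `may_overlap` and `deduplicate` helpers, the keep-the-largest-study rule, and all numbers are illustrative assumptions, not the authors' method or data. Crude pooling (total events over total patients) stands in for whatever meta-analytic model the original analysis used.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Study:
    """Hypothetical record of one included study."""
    name: str
    registry: str               # data source, e.g. a named registry
    start_year: int             # first year of data collection
    end_year: int               # last year of data collection (inclusive)
    procedures: FrozenSet[str]  # inclusion criteria reduced to procedure labels
    patients: int
    events: int                 # VTE events within 30 days


def may_overlap(a: Study, b: Study) -> bool:
    """Ask three questions in order: same source, overlapping timeframe,
    overlapping inclusion criteria. Any 'no' rules out duplication."""
    if a.registry != b.registry:
        return False
    if a.end_year < b.start_year or b.end_year < a.start_year:
        return False
    return bool(a.procedures & b.procedures)


def deduplicate(studies: List[Study]) -> List[Study]:
    """Assumed rule: keep the largest study first and drop any later
    study that may overlap with one already kept."""
    kept: List[Study] = []
    for s in sorted(studies, key=lambda s: s.patients, reverse=True):
        if not any(may_overlap(s, k) for k in kept):
            kept.append(s)
    return kept


def crude_incidence(studies: List[Study]) -> float:
    """Total events divided by total patients (no weighting model)."""
    return sum(s.events for s in studies) / sum(s.patients for s in studies)


# Made-up studies for illustration only.
studies = [
    Study("A", "RegistryX", 2015, 2018, frozenset({"sleeve", "bypass"}), 100_000, 300),
    Study("B", "RegistryX", 2016, 2017, frozenset({"sleeve"}), 40_000, 100),
    Study("C", "RegistryY", 2016, 2017, frozenset({"sleeve"}), 5_000, 30),
]

corrected = deduplicate(studies)
print("kept:", [s.name for s in corrected])
print("uncorrected incidence:", crude_incidence(studies))
print("corrected incidence:  ", crude_incidence(corrected))
```

In this toy data, study B is flagged because it shares a registry, years, and a procedure with study A, and dropping it raises the crude pooled incidence. A flag from such a check only marks potential duplication; confirming it would still require reading the studies or contacting their authors.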
Main Results
- Inadvertently including duplicated data led to VTE incidence being underestimated by 22.06% of total VTE events.
- The decision tree framework effectively identified potentially duplicated data.
- Inter-rater reliability for the framework was perfect (κ = 1.00), though author verification was limited.
- A significant lack of geographical diversity in the included study data was observed.
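For readers unfamiliar with the κ statistic reported above, here is a small sketch of Cohen's kappa for two raters. It assumes each rater labels every study as duplicated or unique. The `cohens_kappa` function and the example labels are hypothetical and are not taken from the study.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Proportion of items on which the two raters gave the same label.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1 - expected)


# Hypothetical labels for four studies.
full_agreement = cohens_kappa(
    ["dup", "unique", "dup", "unique"],
    ["dup", "unique", "dup", "unique"],
)
partial_agreement = cohens_kappa(
    ["dup", "dup", "unique", "unique"],
    ["dup", "unique", "unique", "unique"],
)
print(full_agreement, partial_agreement)
```

A value of 1.00 means the raters agreed on every item, while values near 0 mean agreement no better than chance.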
Conclusions
- Duplicated data in meta-analyses leads to substantially inaccurate pooled estimates.
- The proposed decision tree framework offers a systematic approach for researchers to manage duplicated data.
- Applying this framework enhances the reliability of meta-analyses, particularly those using data registries.

