
Diversity in Big Data: A Review.

Marina Drosou¹, H V Jagadish², Evaggelia Pitoura¹

  • ¹ Department of Computer Science, University of Ioannina, Ioannina, Greece.

Big Data | June 21, 2017
PubMed
Summary
This summary is machine-generated.

Big data presents opportunities and risks. This work explores diversity in data selection, linking it to fairness and advocating for its central role in responsible data practices for ethical and practical benefits.

Keywords:
data; diversity; empirical studies; models and algorithms; responsibly

Area of Science:

  • Computer Science
  • Data Science
  • Ethics

Background:

  • Big data technologies offer significant societal and individual benefits.
  • These technologies also pose risks, particularly to underrepresented groups.
  • Ensuring inclusivity in data-driven systems is crucial.

Purpose of the Study:

  • To review technical advancements in diversity within data selection tasks.
  • To explore the relationship between diversity and fairness in big data.
  • To propose future research directions for integrating diversity into data responsibility.

Main Methods:

  • Overview of recent technical literature on diversity in selection algorithms.
  • Analysis of conceptual links between diversity metrics and fairness definitions.
  • Identification of key challenges and opportunities for future research.
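
To make "diversity in selection algorithms" concrete, here is a minimal sketch of one widely used idea, greedy max-min (farthest-point) selection. It is an illustration only and is not taken from the review: the function name `greedy_max_min`, the choice of Euclidean distance, and the arbitrary seed item are all assumptions made for this example.

```python
from math import dist  # Euclidean distance between two coordinate tuples


def greedy_max_min(points, k):
    """Pick up to k points, each time adding the point that lies
    farthest from everything selected so far (hypothetical helper)."""
    if k <= 0 or not points:
        return []
    selected = [0]  # assumption: seed with the first item
    # min_dist[i] = distance from point i to its nearest selected point
    min_dist = [dist(p, points[0]) for p in points]
    min_dist[0] = -1.0  # sentinel so a selected point is never picked again
    while len(selected) < min(k, len(points)):
        # the most "different" remaining point is the one whose nearest
        # selected neighbour is farthest away
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist = [min(d, dist(p, points[nxt])) for d, p in zip(min_dist, points)]
        min_dist[nxt] = -1.0
    return [points[i] for i in selected]


# Three near-duplicates plus two outliers: a diverse pick of 3 skips the cluster.
items = [(0, 0), (0.1, 0), (0.2, 0.1), (5, 5), (10, 0)]
chosen = greedy_max_min(items, 3)
```

A plain "top-k by relevance" pick on the same data could return the three clustered items; the max-min rule instead spreads the selection across the space, which is the intuition behind linking diversity to inclusion of underrepresented items.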

Main Results:

  • Diversity in data selection is technically feasible and offers advantages.
  • Fairness in big data is intrinsically linked to the concept of diversity.
  • Current technical work provides a foundation for more inclusive data practices.

Conclusions:

  • Diversity must be a central consideration in big data development and deployment.
  • Prioritizing diversity mitigates exclusion risks and enhances data analysis utility.
  • Future work should focus on operationalizing diversity for a data-responsible society.