Usages of Spark Framework with Different Machine Learning Algorithms.

Mohamed Ali Mohamed¹, Ibrahim Mahmoud El-Henawy¹, Ahmad Salah¹

¹Computer Science Department, Faculty of Computers and Informatics, Zagazig University, Zagazig, Egypt.

Computational Intelligence and Neuroscience

August 9, 2021

Summary

This summary is machine-generated.

The Internet of Things generates massive data. This study introduces Apache Spark as a platform to enhance machine learning for big data analysis, focusing on regression, classification, and clustering.

Area of Science:

Data Science
Computer Science
Artificial Intelligence

Background:

The proliferation of connected devices and digital platforms generates unprecedented volumes of data, termed big data.
Big data analysis is crucial for uncovering trends, relationships, and insights to inform decision-making.
Traditional machine learning methods struggle with the scale and complexity of big data.

Purpose of the Study:

To introduce Apache Spark as a scalable platform for big data processing.
To demonstrate how machine learning algorithms can be effectively implemented on Spark.
To explore the application of regression, classification, and clustering techniques within the Spark ecosystem.

Main Methods:

Introduction to Apache Spark architecture for distributed data processing.
Application of machine learning algorithms (regression, classification, clustering) on the Spark platform.
Focus on addressing challenges in designing and executing big data systems using Spark.
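
The distributed-processing idea behind these methods can be illustrated with a minimal sketch. It does not use Spark and is not the authors' code: it is a plain-Python stand-in, under the assumption that the data is already split into partitions, showing how a simple linear regression can be fitted by summarizing each partition independently (the "map" step) and then merging the summaries (the "reduce" step). The function names `partition_stats`, `merge_stats`, and `fit_line` and the sample data are hypothetical.

```python
from functools import reduce


def partition_stats(rows):
    """Map step: reduce one partition of (x, y) pairs to five running sums."""
    n = sx = sy = sxx = sxy = 0.0
    for x, y in rows:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    return (n, sx, sy, sxx, sxy)


def merge_stats(a, b):
    """Reduce step: summaries from two partitions combine by element-wise addition."""
    return tuple(u + v for u, v in zip(a, b))


def fit_line(partitions):
    """Fit y = slope * x + intercept by ordinary least squares from merged sums."""
    n, sx, sy, sxx, sxy = reduce(merge_stats, map(partition_stats, partitions))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1, split into three partitions.
data = [(float(x), 2.0 * x + 1.0) for x in range(12)]
partitions = [data[0:4], data[4:8], data[8:12]]
slope, intercept = fit_line(partitions)
```

Because each partition is summarized on its own and only the small tuples of sums are merged, the full dataset never has to sit in one place. In Spark itself, this role would typically be played by a ready-made estimator such as `LinearRegression` in `pyspark.ml.regression` rather than hand-written sums.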

Main Results:

Spark provides a robust framework for handling large-scale datasets.
Machine learning models can be efficiently deployed on Spark for big data analysis.
Regression, classification, and clustering were each shown to be applicable on Spark.
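
Clustering fits the same partition-then-merge pattern. The sketch below is a plain-Python illustration of k-means, not Spark code and not the study's implementation: each partition reports per-cluster coordinate sums and counts, the partial results are merged, and the centroids are recomputed. The helper names, the toy points, and the starting centroids are all assumptions made for the example.

```python
from functools import reduce


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def partition_sums(points, centroids):
    """Map step: per-cluster (coordinate sums, count) for one partition."""
    sums = {}
    for p in points:
        i = nearest(p, centroids)
        s, n = sums.get(i, ([0.0] * len(p), 0))
        sums[i] = ([a + b for a, b in zip(s, p)], n + 1)
    return sums


def merge(a, b):
    """Reduce step: add together the partial sums of two partitions."""
    out = dict(a)
    for i, (s, n) in b.items():
        if i in out:
            s0, n0 = out[i]
            out[i] = ([x + y for x, y in zip(s0, s)], n0 + n)
        else:
            out[i] = (s, n)
    return out


def kmeans(partitions, centroids, iterations=10):
    """Run a fixed number of k-means iterations over partitioned data."""
    for _ in range(iterations):
        total = reduce(merge, (partition_sums(p, centroids) for p in partitions))
        # A cluster that received no points keeps its previous centroid.
        centroids = [
            [x / total[i][1] for x in total[i][0]] if i in total else centroids[i]
            for i in range(len(centroids))
        ]
    return centroids


# Hypothetical 2-D points forming two well-separated groups, in two partitions.
partitions = [
    [(0.0, 0.0), (10.0, 10.0), (1.0, 0.0), (11.0, 10.0)],
    [(0.0, 1.0), (10.0, 11.0), (1.0, 1.0), (11.0, 11.0)],
]
centers = kmeans(partitions, [[0.0, 0.0], [10.0, 10.0]])
```

Only the per-cluster sums and counts cross partition boundaries in each iteration, which is what makes the algorithm suitable for distributed execution. In Spark, the `KMeans` estimator in `pyspark.ml.clustering` would normally be used instead of a hand-rolled loop like this one.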

Conclusions:

Apache Spark is a suitable platform for overcoming big data challenges in machine learning.
The integration of machine learning with Spark enables advanced data analysis and prediction.
Further research can explore more complex machine learning models and applications within Spark.