



Pearson Correlation-Based Feature Selection for Document Classification Using Balanced Training.

Inzamam Mashood Nasir¹, Muhammad Attique Khan¹, Mussarat Yasmin²

  • ¹Department of Computer Science, HITEC University, Taxila 47080, Pakistan.

Sensors (Basel, Switzerland) | December 2, 2020
Summary
This summary is machine-generated.

This study introduces a deep convolutional neural network (DCNN) technique for efficient classification of digital documents. The technique reduces the effect of adverse image issues, such as signatures and handwritten notes, and removes redundant features, achieving 93.1% classification accuracy on the Tobacco3482 dataset.

Keywords:
data augmentation; deep learning; document classification; feature selection; imbalanced dataset


Area of Science:

  • Computer Science
  • Information Science
  • Machine Learning

Background:

  • Digital document storage is prevalent, necessitating efficient retrieval methods.
  • Traditional document handling is impractical, uneconomical, and ecologically unsound.
  • Adverse image issues like signatures and handwritten notes hinder accurate digital document classification.

Purpose of the Study:

  • To present a real-time supervised learning technique for document classification using deep convolutional neural networks (DCNNs).
  • To mitigate the impact of image imperfections on document classification accuracy.
  • To enhance the efficiency and reliability of retrieving digitally stored documents.

Main Methods:

  • A novel data augmentation technique was developed to balance imbalanced training data, using the RVL-CDIP dataset.
  • Deep convolutional neural network (DCNN) features were extracted using VGG19 and AlexNet models.
  • Feature fusion and selection were performed using a Pearson correlation coefficient-based method to remove redundancy.
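
The fusion and selection step above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the serial (concatenation) fusion, the greedy keep-or-drop rule, and the 0.9 threshold are all assumptions made for illustration, and the tiny hand-written vectors stand in for real VGG19 and AlexNet features. The idea shown is only that a feature whose Pearson correlation with an already kept feature is very high carries little new information and can be dropped as redundant.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient r between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant feature as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def fuse(feats_a: List[List[float]], feats_b: List[List[float]]) -> List[List[float]]:
    """Serial fusion (assumed): concatenate the two feature vectors of each sample."""
    return [list(a) + list(b) for a, b in zip(feats_a, feats_b)]


def select_uncorrelated(features: List[List[float]], threshold: float = 0.9) -> List[int]:
    """Greedily keep a feature column only if |r| with every kept column
    is at most `threshold` (hypothetical rule and hypothetical default)."""
    n_cols = len(features[0])
    columns = [[row[j] for row in features] for j in range(n_cols)]
    kept: List[int] = []
    for j in range(n_cols):
        if all(abs(pearson(columns[j], columns[k])) <= threshold for k in kept):
            kept.append(j)
    return kept


# Toy stand-ins for per-document features from two networks (4 documents).
vgg_like = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]  # column 1 = 2 * column 0
alex_like = [[5.0], [1.0], [4.0], [2.0]]

fused = fuse(vgg_like, alex_like)
kept_columns = select_uncorrelated(fused, threshold=0.9)
print(kept_columns)  # [0, 2]: column 1 is dropped as redundant with column 0
```

In practice the correlation matrix would be computed with a numerical library on thousands of features, and the threshold would be tuned on validation data; the greedy order used here favors earlier columns, which is a design simplification rather than a recommendation.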

Main Results:

  • The proposed technique achieved a classification accuracy of 93.1% on the Tobacco3482 dataset.
  • The method effectively reduced the impact of adverse document image issues.
  • Feature optimization significantly improved classification performance.

Conclusions:

  • The developed DCNN-based technique offers a robust solution for digital document classification.
  • The novel data augmentation and feature selection methods enhance classification accuracy.
  • This approach validates the effectiveness of deep learning for practical document image analysis.