Stratification-Based Outlier Detection over the Deep Web.

Xuefeng Xian1, Pengpeng Zhao2, Victor S Sheng3

  • 1Department of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215002, China; School of Computer Engineering, Suzhou Vocational University, Suzhou, Jiangsu 215104, China.

Computational Intelligence and Neuroscience | June 18, 2016
Summary
This summary is machine-generated.

This study introduces a data mining technique for outlier detection designed specifically for the deep web. The method identifies rare instances in deep web data sources, which can be reached only through queries.

Area of Science:

  • Computer Science
  • Data Mining
  • Information Retrieval

Background:

  • Outlier detection is crucial for identifying rare instances, often more valuable than common patterns.
  • Traditional outlier detection methods are unsuitable for the deep web due to its query-based data access.
  • The deep web presents unique challenges for data mining, requiring specialized approaches.

Purpose of the Study:

  • To develop a new data mining method for effective outlier detection over the deep web.
  • To address the limitations of existing outlier detection techniques in the context of deep web data sources.

Main Methods:

  • The proposed approach stratifies the query space of deep web data sources using a pilot sample.
  • Neighborhood sampling and uncertainty sampling techniques are employed to enhance detection performance.
  • Stratification is utilized to improve both recall and precision in outlier identification.
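
The stratification idea in the list above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the toy data source, the fixed-range outlier rule, the two strata, the 80/20 budget split, and every function name are assumptions made for illustration. Neighborhood sampling and uncertainty sampling are left out, and records fetched in the pilot may be fetched again later.

```python
import random

def make_source():
    """Hypothetical hidden database: query value -> list of records."""
    source = {}
    for q in range(10):
        # Ordinary records lie between 45 and 55.
        records = [50 + (i % 11) - 5 for i in range(200)]
        # Assumption: queries 7-9 also hide a few extreme records.
        if q >= 7:
            records += [120 + i for i in range(20)]
        source[q] = records
    return source

def query(source, q, k, rng):
    """Simulated query interface: returns at most k random records for q."""
    return rng.sample(source[q], min(k, len(source[q])))

def is_outlier(x, lo=25, hi=75):
    """Placeholder outlier rule (fixed range); any detector could go here."""
    return x < lo or x > hi

def stratified_outlier_search(source, pilot_k=30, budget=300, seed=1):
    rng = random.Random(seed)
    # 1. Pilot sample: probe every query to estimate its outlier rate.
    rate = {}
    for q in sorted(source):
        pilot = query(source, q, pilot_k, rng)
        rate[q] = sum(is_outlier(x) for x in pilot) / len(pilot)
    # 2. Stratify the query space by the estimated rate.
    strata = {"high": [], "low": []}
    for q, r in rate.items():
        strata["high" if r > 0 else "low"].append(q)
    # 3. Spend most of the budget on the stratum likely to hold outliers.
    share = {"high": 80, "low": 20}  # percent of budget, an arbitrary choice
    found, fetched = [], 0
    for name, queries in strata.items():
        if not queries:
            continue
        per_query = (budget * share[name] // 100) // len(queries)
        for q in queries:
            batch = query(source, q, per_query, rng)
            fetched += len(batch)
            found.extend(x for x in batch if is_outlier(x))
    return {"strata": strata, "outliers": found, "fetched": fetched}

result = stratified_outlier_search(make_source())
print(result["strata"], len(result["outliers"]), result["fetched"])
```

Because the pilot sample is small, a query that hides outliers can still land in the low stratum. In this sketch that is why the low stratum keeps a share of the budget rather than being skipped.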

Main Results:

  • The developed algorithm demonstrates effectiveness in detecting outliers within deep web environments.
  • Performance evaluation confirms the practical utility and accuracy of the proposed outlier detection method.
  • The approach successfully overcomes the challenges posed by the query-driven nature of deep web data.
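
Recall and precision, the two measures named under Main Methods, can be computed as in the short sketch below. It uses the standard definitions; the function name and the record identifiers are hypothetical and do not come from the study.

```python
def precision_recall(detected, actual):
    """Precision and recall of a set of detected outliers.

    detected: identifiers flagged as outliers by a method
    actual:   identifiers of the true outliers
    """
    detected, actual = set(detected), set(actual)
    true_positives = len(detected & actual)
    # Precision: fraction of flagged records that are true outliers.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: fraction of true outliers that were flagged.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical example: 4 records flagged, 3 true outliers, 2 in common.
p, r = precision_recall({1, 2, 3, 4}, {2, 3, 5})
print(p, r)
```

A method that flags every record reaches full recall at very low precision, so the two measures are usually reported together.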

Conclusions:

  • The novel data mining method offers a significant advancement for outlier detection in deep web applications.
  • The study highlights the importance and feasibility of deep web-specific outlier detection strategies.
  • The proposed techniques provide a robust solution for uncovering rare instances in complex deep web datasets.