Identifying Outliers via Local Granular-Ball Density
View abstract on PubMed
Summary
This summary is machine-generated.This study introduces local Granular-Ball Density-based Outlier (GBDO) detection, a novel method improving outlier detection by processing data at multiple granularities. GBDO enhances robustness and reduces computational complexity compared to traditional methods.
Area Of Science
- Data Mining
- Machine Learning
- Computational Intelligence
Background
- Existing density-based outlier detection methods operate at a single data granularity, leading to high sensitivity to noise and failure to capture multi-level information.
- These methods often overlook data uncertainty, such as fuzziness, hindering effective outlier detection.
Purpose Of The Study
- To propose a novel outlier detection method, local Granular-Ball Density-based Outlier (GBDO), leveraging Granular-Ball Computing (GBC).
- To enhance the performance of density-based outlier detection by addressing limitations of single-granularity processing and uncertainty handling.
Main Methods
- Utilizing Granular-Ball Computing (GBC) for multi-granularity data processing.
- Identifying k-similarity Granular-Ball (GB) neighborhoods based on fuzzy relations.
- Calculating local reachability similarity density for GBs.
- Computing local GB outlier factors for samples.
Main Results
- GBDO adopts a multi-granularity processing paradigm using GBs as fundamental units.
- This approach reduces computational complexity and increases robustness against noisy data.
- Experimental results validate the effectiveness of GBDO against state-of-the-art methods.
Conclusions
- GBDO offers an effective solution for outlier detection by integrating multi-granularity processing and robustness.
- The method successfully overcomes the limitations of traditional single-granularity density-based approaches.
- The proposed GBDO method demonstrates superior performance in identifying outliers.
Related Concept Videos
Sometimes, a data set can have a recorded numerical observation that greatly deviates from the rest of the data. Assuming that the data is normally distributed, a statistical method called the Grubbs test can be used to determine whether the observation is truly an outlier. To perform a two-tailed Grubbs test, first, calculate the absolute difference between the outlier and the mean. Then, calculate the ratio between this difference and the standard deviation of the sample. This...
Outliers are observed data points that are far from the least squares line. They have unusual values and need to be examined carefully. Though an outlier may result from erroneous data, at other times, it may hold valuable information about the population under study and should be included in the data. Hence, it is crucial to examine what causes a data point to be an outlier.
The z score is used to find outliers or unusual values. It should be noted that any values beyond -2 and +2 are...
When one or more data points appear far from the rest of the data, there is a need to determine whether they are outliers and whether they should be eliminated from the data set to ensure an accurate representation of the measured value. In many cases, outliers arise from gross errors (or human errors) and do not accurately reflect the underlying phenomenon. In some cases, however, these apparent outliers reflect true phenomenological differences. In these cases, we can use statistical methods...
An outlier is an observation of data that does not fit the rest of the data. It is sometimes called an extreme value. When you graph an outlier, it will appear not to fit the pattern of the graph. Some outliers are due to mistakes (for example, writing down 50 instead of 500), while others may indicate that something unusual is happening. Outliers are present far from the least squares line in the vertical direction. They have large "errors," where the "error" or residual is the...
Bulk density refers to the mass of aggregate particles that would fill a unit volume. The concept of bulk density originates from the inability to pack aggregate particles in a manner that completely eliminates void spaces. Hence, the term bulk refers to the volume that encompasses both the aggregates and the voids. This measurement is crucial when aggregates are batched by volume and is used to convert quantities by mass to volume.
Most natural mineral aggregates, like sand and gravel,...
Aggregates typically contain pores, which can be either permeable or impermeable. Considering the pores in the aggregates, the specific gravity of aggregates is defined in three different forms, namely, bulk or gross specific gravity, apparent specific gravity, and absolute specific gravity.
Bulk or gross specific gravity is calculated by taking the ratio of the mass of aggregates in the saturated surface-dry state to the total volume that includes both the solids and the voids within the...

