

Efficient string similarity join in multi-core and distributed systems.

Cairong Yan¹, Xue Zhao¹, Qinglong Zhang¹

¹School of Computer Science and Technology, Donghua University, Shanghai, China.

March 10, 2017

Summary

This summary is machine-generated.

This study introduces a parallel processing framework for efficient string similarity join in big data. The proposed Para-Join and Pada-Join algorithms significantly improve performance and ensure complete results.


Area of Science:

Computer Science
Data Engineering
Big Data Analytics

Background:

String similarity join is crucial for big data analysis but computationally intensive.
Existing methods struggle with efficiency and scalability for large datasets.

Purpose of the Study:

To propose a novel parallel processing framework for efficient string similarity join.
To develop algorithms that enhance performance and ensure result completeness.

Main Methods:

Input data is split into subsets based on string distributions.
A filter-verification strategy with pruning is applied to reduce candidate pairs.
The join is executed in parallel, using multi-threading (Para-Join) and Spark (Pada-Join).
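
The three steps listed under Main Methods can be illustrated with a minimal Python sketch. It is not the authors' Para-Join implementation: the summary does not name the similarity measure, the partitioning rule, or the pruning conditions, so the sketch assumes an edit-distance threshold `tau`, partitions strings by length, and prunes with a length filter plus early termination during verification. All function names are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def edit_distance_within(a, b, tau):
    """Verification step: True if the Levenshtein distance of a and b is <= tau."""
    # Length filter (assumed pruning rule): the distance is at least the length gap.
    if abs(len(a) - len(b)) > tau:
        return False
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        # Row minima never decrease, so the pair can be abandoned early.
        if min(cur) > tau:
            return False
        prev = cur
    return prev[-1] <= tau


def join_bucket(length, groups, tau):
    """Join one length bucket with itself and with the next tau longer buckets.

    Each bucket only looks "upward", so every candidate pair is examined once.
    """
    out = []
    own = groups[length]
    for i, s in enumerate(own):
        for t in own[i + 1:]:
            if edit_distance_within(s, t, tau):
                out.append((s, t))
        for other in range(length + 1, length + tau + 1):
            for t in groups.get(other, ()):
                if edit_distance_within(s, t, tau):
                    out.append((s, t))
    return out


def similarity_self_join(strings, tau, workers=4):
    """Split the input by string length, then process the buckets on a thread pool."""
    groups = defaultdict(list)
    for s in sorted(set(strings)):  # drop exact duplicates (an assumption)
        groups[len(s)].append(s)
    groups = dict(groups)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda n: join_bucket(n, groups, tau), sorted(groups))
        return sorted(tuple(sorted(p)) for part in parts for p in part)


if __name__ == "__main__":
    words = ["kitten", "sitten", "sitting", "mitten", "apple"]
    print(similarity_self_join(words, tau=1))
```

Strings whose lengths differ by more than `tau` are never compared, which is how the partitioning reduces the number of candidate pairs in this sketch. In CPython, threads do not speed up CPU-bound work such as this because of the global interpreter lock; the thread pool only shows the shape of the parallel decomposition, and a process pool or a Spark job would take the same per-bucket tasks.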

Main Results:

Para-Join demonstrates high efficiency and outperforms state-of-the-art approaches on multi-core systems.
Pada-Join effectively handles large datasets on cluster systems.
Both algorithms avoid redundant computations and guarantee complete results.

Conclusions:

The proposed parallel framework offers an efficient solution for string similarity join in big data.
Para-Join and Pada-Join provide scalable and reliable methods for similarity analysis.