Updated: May 24, 2026

Semi-supervised hashing for large-scale search.

Jun Wang¹, Sanjiv Kumar, Shih-Fu Chang

  • ¹Business Analytics and Mathematical Sciences Department, IBM T.J. Watson Research Center, RM 31-229, 1101 Kitchawan Rd, Rte. 134, Yorktown Heights, NY 10598, USA. wangjun@us.ibm.com

IEEE Transactions on Pattern Analysis and Machine Intelligence | February 15, 2012
PubMed
Summary
This summary is machine-generated.

This study introduces a semi-supervised hashing framework for efficient approximate nearest neighbor search. The proposed methods improve accuracy and robustness on large-scale datasets compared with existing hashing techniques.

Area of Science:

  • Computer Science
  • Machine Learning
  • Data Mining

Background:

  • Hashing-based approximate nearest neighbor (ANN) search is crucial for large databases.
  • Existing methods like Locality Sensitive Hashing and Spectral Hashing have limitations in accuracy and efficiency.
  • Supervised hashing methods struggle with small or noisy labeled data, leading to overfitting.
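
To make the background concrete, here is a minimal sketch of hashing-based ANN search using random hyperplanes, a common form of Locality Sensitive Hashing. It is an illustration only, not code from the paper: the toy data, the 16-bit code length, and the helper names `lsh_codes` and `hamming_rank` are all assumptions.

```python
import numpy as np

def lsh_codes(X, W):
    """Binary codes from random hyperplanes: a bit is 1 where the projection is positive."""
    return (X @ W > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query, plus the distances."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable"), dists

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 32))   # toy database: 1000 points (rows), 32 dimensions
X -= X.mean(axis=0)                   # zero-center so hyperplanes through the origin split the data
W = rng.standard_normal((32, 16))     # 16 random hyperplanes -> 16-bit codes (assumed length)

codes = lsh_codes(X, W)
order, dists = hamming_rank(codes[0], codes)   # use the first point as the query
```

Because the hyperplanes are drawn at random, this baseline ignores any label information, which is the gap that supervised and semi-supervised hashing try to close.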

Purpose of the Study:

  • To propose a novel semi-supervised hashing (SSH) framework for improved ANN search.
  • To develop robust hashing methods that leverage both labeled and unlabeled data.
  • To extend the hashing paradigm to unsupervised domains.

Main Methods:

  • Developed an SSH framework that minimizes empirical error on the labeled data while applying an information-theoretic regularizer over both labeled and unlabeled data.

  • Introduced three SSH methods: orthogonal, nonorthogonal, and sequential hashing.
  • Demonstrated sequential hashing's error-correction capabilities and its extension to unsupervised learning.
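
The orthogonal variant listed above can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' implementation: it assumes the empirical-error term is expressed through a pairwise label matrix `S` (+1 for similar pairs, -1 for dissimilar pairs, 0 for unknown), that the information-theoretic regularizer is relaxed to a variance term over all data, and that orthogonal projections are then obtained as the top eigenvectors of the combined matrix. The weight `eta`, the toy data, and the function names are hypothetical.

```python
import numpy as np

def ssh_orthogonal(X, X_l, S, n_bits, eta=1.0):
    """Learn orthogonal projections from labeled pairs plus a variance regularizer.

    X   : (n, d) zero-centered data, one point per row (labeled and unlabeled)
    X_l : (l, d) labeled subset of X
    S   : (l, l) pairwise labels: +1 similar, -1 dissimilar, 0 unknown
    eta : assumed weight of the variance (regularizer) term
    """
    M = X_l.T @ S @ X_l + eta * (X.T @ X)   # supervised term + variance term
    M = (M + M.T) / 2                       # symmetrize against round-off
    _, eigvecs = np.linalg.eigh(M)          # eigenvalues come back in ascending order
    return eigvecs[:, ::-1][:, :n_bits]     # keep eigenvectors of the n_bits largest eigenvalues

def encode(X, W):
    """Binary codes: the sign of each projection, stored as 0/1."""
    return (X @ W > 0).astype(np.uint8)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))
X -= X.mean(axis=0)
X_l = X[:10]                                   # pretend the first 10 points are labeled
labels = np.array([0] * 5 + [1] * 5)           # hypothetical class labels
S = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)

W = ssh_orthogonal(X, X_l, S, n_bits=4)
codes = encode(X, W)
```

Setting `eta` to zero would leave only the labeled term, which is where overfitting to a small or noisy labeled set becomes a concern; the variance term pulls in the unlabeled data.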

Main Results:

  • The proposed SSH methods outperform state-of-the-art supervised and unsupervised hashing techniques.
  • Sequential hashing generates particularly robust codes by correcting previous errors.
  • Experiments on datasets up to 80 million samples validate the superior performance.

Conclusions:

  • The novel SSH framework offers a significant advancement in approximate nearest neighbor search.
  • Semi-supervised and unsupervised hashing methods can effectively handle large-scale, complex data.
  • The proposed sequential learning paradigm provides a robust approach to hashing.
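
One plausible way to realize the sequential, error-correcting idea is sketched below. The summary does not give the update rule, so everything specific here is an assumption made for illustration: each new bit is learned as the top eigenvector of a label-plus-variance matrix, labeled pairs that the new bit gets wrong have their weights increased by a step `alpha`, and the direction just used is removed from the data before the next bit is learned.

```python
import numpy as np

def ssh_sequential(X, X_l, S, n_bits, eta=1.0, alpha=0.5):
    """Learn projections one bit at a time, up-weighting label pairs the last bit violated.

    X, X_l : (n, d) and (l, d) zero-centered data, one point per row
    S      : (l, l) pairwise labels: +1 similar, -1 dissimilar, 0 unknown
    eta, alpha : assumed regularizer weight and correction step size
    """
    X_res = X.copy()
    S_cur = S.astype(float).copy()
    W = []
    for _ in range(n_bits):
        M = X_l.T @ S_cur @ X_l + eta * (X_res.T @ X_res)
        M = (M + M.T) / 2
        w = np.linalg.eigh(M)[1][:, -1]        # eigenvector of the largest eigenvalue
        W.append(w)
        p = X_l @ w
        S_tilde = np.outer(p, p)               # signed agreement of each labeled pair under this bit
        violated = (S_tilde * S_cur) < 0       # pairs whose projections contradict their label
        S_cur = S_cur - alpha * np.where(violated, S_tilde, 0.0)   # increase weight of violated pairs
        X_res = X_res - np.outer(X_res @ w, w) # remove the direction just used from the data
    return np.column_stack(W)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))
X -= X.mean(axis=0)
X_l = X[:10]                                   # pretend the first 10 points are labeled
labels = np.array([0] * 5 + [1] * 5)           # hypothetical class labels
S = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)

W_seq = ssh_sequential(X, X_l, S, n_bits=4)
codes_seq = (X @ W_seq > 0).astype(np.uint8)
```

Pairs with an unknown label (zero entries in `S`) are never marked as violated, so only labeled pairs drive the correction, while the variance term continues to draw on all of the data.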