Learning from the data: mining of large high-throughput screening databases.

S Frank Yan¹, Frederick J King, Yun He

  • ¹ Genomics Institute of the Novartis Research Foundation, 10675 John Jay Hopkins Drive, San Diego, California 92121, USA. syan@gnf.org

Journal of Chemical Information and Modeling | November 28, 2006
Summary
This summary is machine-generated.

We developed an ontology-based pattern identification (OPI) algorithm to mine high-throughput screening (HTS) data. This method reliably identifies scaffold families with significant structure-activity relationships, distinguishing between true biological activity and screening artifacts.

Area of Science:

  • Drug discovery and development
  • Computational chemistry
  • Bioinformatics

Background:

  • Pharmaceutical companies possess vast high-throughput screening (HTS) datasets from millions of compounds.
  • Systematic data mining methods to extract actionable knowledge from HTS data are underdeveloped.
  • Understanding structure-activity relationships is crucial for drug development.

Purpose of the Study:

  • To develop and apply a systematic data mining approach for extracting knowledge from HTS data.
  • To identify compound scaffolds with statistically significant structure-activity profiles.
  • To differentiate between genuine biological activity and screening artifacts.

Main Methods:

  • Development of an ontology-based pattern identification (OPI) algorithm.

  • Application of OPI to an in-house high-throughput screening database.
  • Utilizing statistical tests (Kruskal-Wallis, ANOVA) for scaffold analysis.
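
As a rough illustration of the kind of scaffold-level test listed above, the sketch below computes a Kruskal-Wallis H statistic for activity values grouped by scaffold family. It is not the OPI algorithm, which this summary does not describe. The scaffold names, the activity values, and the helper functions `midranks`, `kruskal_h`, and `permutation_p` are all hypothetical. To stay within the Python standard library, the p-value comes from a label-permutation test rather than the usual chi-square approximation, and no tie correction is applied to H.

```python
import random
from collections import defaultdict


def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_h(values, labels):
    """Kruskal-Wallis H statistic (without tie correction)."""
    n = len(values)
    rank_sums = defaultdict(float)
    sizes = defaultdict(int)
    for rank, group in zip(midranks(values), labels):
        rank_sums[group] += rank
        sizes[group] += 1
    spread = sum(rank_sums[g] ** 2 / sizes[g] for g in sizes)
    return 12.0 / (n * (n + 1)) * spread - 3 * (n + 1)


def permutation_p(values, labels, n_perm=2000, seed=0):
    """Return (H, p), where p is estimated by shuffling the group labels."""
    rng = random.Random(seed)
    observed = kruskal_h(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if kruskal_h(values, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical normalized activities in one assay, grouped by scaffold family.
activity = {
    "scaffold_A": [0.90, 0.80, 0.85, 0.95, 0.70],
    "scaffold_B": [0.10, 0.20, 0.15, 0.05, 0.30],
    "scaffold_C": [0.12, 0.25, 0.18, 0.08, 0.22],
}
values = [v for group in activity.values() for v in group]
labels = [name for name, group in activity.items() for _ in group]

h_stat, p_value = permutation_p(values, labels)
print(f"H = {h_stat:.2f}, permutation p = {p_value:.4f}")
```

A rank-based test such as this one makes no normality assumption about the activity values, which is one common reason to run it alongside ANOVA. With many scaffold families tested at once, some form of multiple-testing control would also be needed; that step is omitted here.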

Main Results:

  • Identified nearly 1500 scaffold families with significant structure-HTS activity profile relationships.
  • Characterized dozens of scaffolds as artifacts related to screening technology.
  • Classified compound scaffolds into four types: tumor cytotoxic, general toxic, reporter gene assay artifact, and target family specific.

Conclusions:

  • The OPI approach reliably identifies compounds with similar structures and shared biological activity profiles.
  • Discovered scaffolds are valuable for designing diversity libraries and in silico biological annotations.
  • This method provides novel target family-specific scaffolds for focused library design.