Learning from the data: mining of large high-throughput screening databases.

S Frank Yan¹, Frederick J King, Yun He

¹Genomics Institute of the Novartis Research Foundation, 10675 John Jay Hopkins Drive, San Diego, California 92121, USA. syan@gnf.org

Journal of Chemical Information and Modeling

November 28, 2006

Summary

This summary is machine-generated.

We developed an ontology-based pattern identification (OPI) algorithm to mine high-throughput screening (HTS) data. This method reliably identifies scaffold families with significant structure-activity relationships, distinguishing between true biological activity and screening artifacts.

Area of Science:

Drug discovery and development
Computational chemistry
Bioinformatics

Background:

Pharmaceutical companies possess vast high-throughput screening (HTS) datasets from millions of compounds.
Systematic data mining methods to extract actionable knowledge from HTS data are underdeveloped.
Understanding structure-activity relationships is crucial for drug development.

Purpose of the Study:

To develop and apply a systematic data mining approach for extracting knowledge from HTS data.
To identify compound scaffolds with statistically significant structure-activity profiles.
To differentiate between genuine biological activity and screening artifacts.

Main Methods:

Development of an ontology-based pattern identification (OPI) algorithm.
Application of OPI to an in-house high-throughput screening database.
Utilizing statistical tests (Kruskal-Wallis, ANOVA) for scaffold analysis.
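
As a rough illustration of the Kruskal-Wallis test named above, the following Python sketch computes the H statistic for activity scores grouped by scaffold family. This summary does not describe how OPI itself applies the test, so the grouping, the function names, and the toy data are all assumptions made for illustration. The p-value is estimated by a permutation test rather than the usual chi-square approximation, purely to keep the sketch free of external dependencies.

```python
import random
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def permutation_p_value(groups, n_perm=2000, seed=0):
    """Estimate a p-value for H by shuffling scores across the groups."""
    rng = random.Random(seed)
    observed = kruskal_h(groups)
    pooled = list(chain.from_iterable(groups))
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if kruskal_h(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical activity scores for three scaffold families (toy data).
    scaffold_scores = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    h_stat, p_value = permutation_p_value(scaffold_scores)
    print(f"H = {h_stat:.2f}, permutation p = {p_value:.4f}")
```

A small p-value here would suggest that at least one group's scores are shifted relative to the others. With real HTS data, tie correction and multiple-testing control across many scaffolds would also need attention.
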

Main Results:

Identified nearly 1500 scaffold families with significant structure-HTS activity profile relationships.
Characterized dozens of scaffolds as artifacts related to screening technology.
Classified compound scaffolds into four types: tumor cytotoxic, general toxic, reporter gene assay artifact, and target family specific.

Conclusions:

The OPI approach reliably identifies compounds with similar structures and shared biological activity profiles.
Discovered scaffolds are valuable for designing diversity libraries and in silico biological annotations.
This method provides novel target family-specific scaffolds for focused library design.