



Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach.

Nadine Schneider¹, Nikolas Fechner¹, Gregory A Landrum²

  • ¹Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland.

Journal of Chemical Information and Modeling | July 18, 2017
Summary
This summary is machine-generated.

We introduce CheTo, a novel topic modeling method for organizing large chemical data sets. This approach intuitively assigns molecules to "chemical topics," enhancing data organization and retrieval in medicinal chemistry research.


Area of Science:

  • Medicinal Chemistry
  • Cheminformatics
  • Data Science

Background:

  • Big data presents significant challenges in organizing and searching vast molecular datasets generated by modern technologies.
  • Existing methods for handling large molecule sets often compromise result interpretability.
  • Medicinal chemistry faces increasing data volumes from DNA encoded libraries, peptide libraries, text mining, and in silico methods.

Purpose of the Study:

  • To develop an intuitive and interpretable method for organizing large molecular datasets.
  • To implement and evaluate a probabilistic topic modeling framework for chemical data.
  • To enable the identification and retrieval of chemical series and relationships within large molecule sets.

Main Methods:

  • Adoption and adaptation of the probabilistic topic modeling framework from text-mining for chemical data.
  • Development of the CheTo (Chemical Topic) open-source implementation.
  • Evaluation of the method's performance on various experiments and the ChEMBL22 dataset (1.6 million molecules).
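
The idea of carrying a topic model over from text mining to molecules can be sketched as follows: each molecule is treated as a "document" and its substructure fragments as "words". This is a minimal sketch only, not the CheTo implementation. It assumes latent Dirichlet allocation (LDA) fitted by collapsed Gibbs sampling as the topic model, since the summary does not name the specific model, and the fragment tokens, the function name fit_lda, and the hyperparameters alpha and beta are hypothetical placeholders chosen for illustration.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative, not CheTo's code).

    docs: list of token lists, one list of fragment tokens per molecule.
    Returns (theta, phi, vocab): molecule-topic and topic-fragment
    probability tables plus the sorted fragment vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    w_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables updated during sampling.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Random initial topic assignment for every fragment occurrence.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = w_id[w]
                z = assignments[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample the topic from its conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed probability estimates from the final counts.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return theta, phi, vocab


# Hypothetical, hand-written fragment tokens; a real pipeline would derive
# fragments from the molecular structures with a cheminformatics toolkit.
molecules = {
    "mol_a": ["steroid_core", "hydroxyl", "ketone", "steroid_core"],
    "mol_b": ["steroid_core", "ketone", "methyl", "steroid_core"],
    "mol_c": ["purine", "ribose", "phosphate", "phosphate"],
    "mol_d": ["pyrimidine", "ribose", "phosphate", "purine"],
}
theta, phi, vocab = fit_lda(list(molecules.values()), n_topics=2)
for name, row in zip(molecules, theta):
    print(name, [round(p, 2) for p in row])
```

Each row of theta gives one molecule's distribution over topics, and each row of phi gives one topic's distribution over fragments. The fragments with the highest probability in a topic are what would make that topic interpretable to a chemist.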

Main Results:

  • Successful assignment of large molecule sets to meaningful 'chemical topics' (e.g., 'proteins', 'DNA', 'steroids').
  • Demonstrated ability to reproduce human-assigned concepts and retrieve specific chemical series.
  • Creation of an intuitive visualization of chemical topics, which makes the output easier to interpret than that of the unsupervised clustering methods commonly used to group molecules.
  • Efficient modeling of 1.6 million molecules into 100 topics within approximately one hour.
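
One way to picture retrieving a chemical series from a fitted topic model is to group molecules by their most probable topic. The sketch below is illustrative only and is not the procedure reported for CheTo. The function name group_by_dominant_topic, the min_prob cutoff, and the example probabilities are all hypothetical.

```python
def group_by_dominant_topic(names, topic_probs, min_prob=0.6):
    """Group molecules by their most probable topic.

    names: molecule identifiers.
    topic_probs: one list of topic probabilities per molecule.
    min_prob: assumed cutoff; molecules whose best topic falls below it
    are left unassigned rather than forced into a group.
    """
    groups = {}
    for name, row in zip(names, topic_probs):
        best = max(range(len(row)), key=row.__getitem__)
        if row[best] >= min_prob:
            groups.setdefault(best, []).append(name)
    return groups


# Made-up molecule-topic probabilities for four molecules and two topics.
example_names = ["mol_a", "mol_b", "mol_c", "mol_d"]
example_probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.45, 0.55]]
series = group_by_dominant_topic(example_names, example_probs)
print(series)  # {0: ['mol_a', 'mol_b'], 1: ['mol_c']}
```

Here mol_d is left out of both groups because its best topic probability (0.55) is below the assumed cutoff of 0.6.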

Conclusions:

  • Topic modeling offers a powerful and interpretable approach for managing large chemical datasets.
  • CheTo provides a robust and efficient solution for organizing and exploring molecular data in medicinal chemistry.
  • The open-source implementation (CheTo) and datasets facilitate further research and application in cheminformatics.