
Entropy-based byte patching transformer for self-supervised pretraining of SMILES data.

Medard Edmund Mswahili¹, JunHa Hwang¹, Kyuri Jo¹

  • ¹ Chungbuk National University, Department of Computer Engineering, Cheongju 28644, South Korea.

iScience | February 2, 2026
Summary
This summary is machine-generated.

The novel SMILES Byte-Patch Transformer (SMiBPT) enhances molecular representations by adaptively segmenting chemical strings, improving large language model performance in chemical learning.

Keywords:
Artificial intelligence, Chemistry, Computer science, Molecules

Area of Science:

  • Computational Chemistry
  • Machine Learning
  • Bioinformatics

Background:

  • Transformer models are advancing molecular representation learning.
  • Capturing localized and hierarchical chemical structures remains a challenge for current models.

Purpose of the Study:

  • To introduce the SMILES Byte-Patch Transformer (SMiBPT) for improved molecular representation learning.
  • To develop an adaptive model for dynamic segmentation of chemical strings.

Main Methods:

  • SMiBPT uses entropy-based byte patching to segment SMILES and DeepSMILES strings into chemically meaningful substructures.
  • The model integrates self-supervised pretraining, chemical motif-aware encoding, adaptive entropy-aware masking, and rotary position embeddings.
  • The model was pretrained on ~216 million unlabeled molecules from PubChem without truncation.
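
To make the idea of entropy-based byte patching more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy bigram count model as a stand-in for whatever entropy estimator SMiBPT actually uses, and it opens a new patch wherever the estimated next-byte Shannon entropy, H = -Σ p·log2(p), exceeds a threshold. The function names, the bigram model, and the default threshold are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count next-byte frequencies per preceding byte.

    Assumption: a bigram count table is only a toy stand-in for a
    learned entropy model; it is not taken from the paper.
    """
    counts = defaultdict(Counter)
    for s in corpus:
        data = s.encode("ascii")  # SMILES strings are ASCII
        for prev, nxt in zip(data, data[1:]):
            counts[prev][nxt] += 1
    return counts


def next_byte_entropy(counts, prev):
    """Shannon entropy (in bits) of the next-byte distribution after `prev`."""
    dist = counts.get(prev)
    if not dist:
        # Unseen context: assume maximal uncertainty over 256 byte values.
        return 8.0
    total = sum(dist.values())
    return -sum((c / total) * math.log2(c / total) for c in dist.values())


def entropy_patches(smiles, counts, threshold=1.0):
    """Split a SMILES string into variable-length patches.

    A new patch starts at byte i when the entropy of the prediction made
    from byte i-1 exceeds `threshold` (a hypothetical hyperparameter).
    """
    data = smiles.encode("ascii")
    if not data:
        return []
    patches, start = [], 0
    for i in range(1, len(data)):
        if next_byte_entropy(counts, data[i - 1]) > threshold:
            patches.append(data[start:i].decode("ascii"))
            start = i
    patches.append(data[start:].decode("ascii"))
    return patches


if __name__ == "__main__":
    # Tiny illustrative corpus; a real estimator would be fit on far more data.
    corpus = ["CCO", "c1ccccc1", "CC(=O)O", "CCN"]
    model = train_bigram(corpus)
    print(entropy_patches("CC(=O)Oc1ccccc1C(=O)O", model))
```

In this sketch, predictable runs of bytes stay together in one patch, while positions where the next byte is hard to predict become patch boundaries, so patch length adapts to the string rather than being fixed in advance. Concatenating the patches always reproduces the input string.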

Main Results:

  • SMiBPT outperforms existing models such as ChemBERTa, SMILES-BERT, and MoLFormer in predictive accuracy and efficiency.
  • The adaptive patching strategy preserves molecular semantics and enhances feature extraction.
  • SMiBPT also demonstrated superior zero-shot transfer capabilities.

Conclusions:

  • SMiBPT offers a parameter-efficient and effective approach to molecular representation learning.
  • The adaptive segmentation method addresses limitations of fixed tokenization in chemical learning.
  • This model advances the application of large language models in chemistry.