Code4ML: a large-scale dataset of annotated Machine Learning code.

Anastasia Drozdova¹, Ekaterina Trofimova¹, Polina Guseva¹

¹Department of Computer Science, NRU Higher School of Economics, Moscow, Russia.

PeerJ Computer Science

June 22, 2023

Summary

This summary is machine-generated.

Researchers developed the Code4ML corpus, a large dataset of machine learning (ML) code snippets collected from Kaggle, part of which is human-annotated. The resource supports ML research on code by providing labeled snippets for tasks such as semantic code classification and code generation.

Keywords:

Jupyter code snippets, ML code dataset

Area of Science:

Computer Science
Machine Learning
Software Engineering
Data Science

Background:

Program code is increasingly used as a data source in data science, for tasks such as semantic classification and program generation.
Applying machine learning models to code is hindered by the lack of annotated code snippet datasets.

Purpose of the Study:

To address the scarcity of annotated code datasets for machine learning.
To introduce the Code4ML corpus, a comprehensive collection of annotated ML code snippets.

Main Methods:

Collected approximately 2.5 million machine learning code snippets from 100,000 Jupyter notebooks hosted on Kaggle.
Annotated a representative fraction of these code snippets using a custom-designed, user-friendly interface.
Included associated metadata such as task summaries, competition details, and dataset descriptions.
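
The collection step above can be sketched as follows. This is a minimal illustration, assuming notebooks stored in the standard Jupyter nbformat 4 JSON layout (a top-level `cells` list whose entries carry `cell_type` and `source`). The helper name `extract_code_snippets` and the rule of one snippet per code cell are assumptions made for illustration, not the authors' pipeline.

```python
import json


def extract_code_snippets(notebook_json: str) -> list[str]:
    """Return the source of each non-empty code cell in an nbformat-4 notebook.

    Hypothetical helper: it treats one code cell as one snippet, which is an
    assumption and not necessarily how Code4ML splits notebooks.
    """
    notebook = json.loads(notebook_json)
    snippets = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # nbformat allows `source` to be a single string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            snippets.append(source)
    return snippets


# Tiny in-memory notebook used only to demonstrate the helper.
example_notebook = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Load data"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": ["import pandas as pd\n", "df = pd.read_csv('train.csv')"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None, "source": ""},
    ],
})

snippets = extract_code_snippets(example_notebook)
print(len(snippets))
```

Only the standard library is used here so the sketch stays self-contained; a real pipeline would more likely read files from disk and attach the competition and dataset metadata to each snippet.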

Main Results:

The Code4ML corpus provides a large-scale, annotated dataset of ML code snippets.
The dataset is derived from publicly available data from Kaggle, a leading data science competition platform.
Human annotation was performed on a representative fraction of the collected code snippets.

Conclusions:

The Code4ML dataset offers a valuable resource for data science and software engineering research.
It can facilitate data-driven approaches to challenges such as semantic code classification, code auto-completion, and natural language-based code generation for ML tasks.
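
To make the semantic code classification use case concrete, the sketch below trains a toy token-overlap classifier on a handful of labeled snippets. Everything in it is an assumption for illustration: the label names (`data_import`, `model_train`), the example snippets, and the most-shared-tokens rule are not taken from the Code4ML paper or its annotation scheme.

```python
import re
from collections import Counter, defaultdict


def tokenize(code: str) -> list[str]:
    """Split code into identifier-like tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[A-Za-z_]\w*", code)


def train(labeled_snippets: list[tuple[str, str]]) -> dict[str, Counter]:
    """Build one token-count profile per label."""
    profiles: dict[str, Counter] = defaultdict(Counter)
    for code, label in labeled_snippets:
        profiles[label].update(tokenize(code))
    return dict(profiles)


def classify(code: str, profiles: dict[str, Counter]) -> str:
    """Pick the label whose profile shares the most tokens with the snippet."""
    tokens = Counter(tokenize(code))

    def overlap(label: str) -> int:
        profile = profiles[label]
        return sum(min(count, profile[tok]) for tok, count in tokens.items())

    # Sorting the labels makes tie-breaking deterministic.
    return max(sorted(profiles), key=overlap)


# Hypothetical labeled snippets; the labels are illustrative only.
training_data = [
    ("df = pd.read_csv('train.csv')", "data_import"),
    ("data = pd.read_json('items.json')", "data_import"),
    ("model.fit(X_train, y_train)", "model_train"),
    ("clf = RandomForestClassifier(); clf.fit(X, y)", "model_train"),
]

profiles = train(training_data)
print(classify("test = pd.read_csv('test.csv')", profiles))
print(classify("model.fit(X, y)", profiles))
```

A baseline like this mainly shows the shape of the task (snippet in, semantic label out); a learned model trained on a large annotated corpus would replace the overlap rule.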