Updated: Sep 9, 2025

GATmath and GATLc: Comprehensive benchmarks for evaluating Arabic large language models.

Safa AlBallaa¹, Nora AlTwairesh¹, Abdulmalik AlSalman¹

¹Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia.

September 2, 2025

Summary

This summary is machine-generated.

Developing Arabic Large Language Models (LLMs) is challenging due to limited benchmarks. New datasets, GATmath and GATLc, offer large-scale reasoning and language tasks to drive progress in Arabic AI.

Area of Science:

Artificial Intelligence
Natural Language Processing
Computational Linguistics

Background:

Large Language Models (LLMs) have advanced AI, but their development requires robust evaluation.
Assessing Arabic LLMs is hindered by a lack of comprehensive benchmarks and evaluation tools.
This scarcity limits the progress and real-world application of Arabic language models.

Purpose of the Study:

Introduce GATmath (7k questions) and GATLc (9k questions), novel Arabic benchmarks for multitask reasoning and language understanding.
Provide the first large-scale, comprehensive reasoning dataset specifically designed for the Arabic language.
Facilitate rigorous evaluation and drive the advancement of Arabic LLMs.

Main Methods:

Created two large-scale Arabic datasets, GATmath and GATLc, derived from the General Aptitude Test (GAT).
Datasets encompass diverse categories requiring reasoning, semantic analysis, language comprehension, and mathematical problem-solving.
Evaluated seven prominent LLMs on these newly developed benchmarks.
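
As a rough illustration of how accuracy on a benchmark like these could be scored, the sketch below assumes each question has a single gold answer label and that a model's output has already been reduced to one label per question. The function name, the label format, and the toy data are hypothetical and are not taken from the GATmath or GATLc paper.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Return the percentage of predictions that exactly match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty benchmark")
    # Count exact matches after trimming surrounding whitespace.
    correct = sum(p.strip() == g.strip() for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Hypothetical toy data, NOT drawn from GATmath or GATLc.
gold_labels = ["A", "C", "B", "D"]
model_answers = ["A", "C", "D", "D"]

score = accuracy(model_answers, gold_labels)
print(f"accuracy: {score:.1f}%")
```

Exact-match scoring is only one possible choice; the study's actual answer-extraction and scoring procedure is not described in this summary.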

Main Results:

The highest-performing LLM reached only 66.9% accuracy on GATmath and 64.3% on GATLc.
These results highlight the significant difficulty posed by the GATmath and GATLc datasets.
Current state-of-the-art LLMs demonstrate substantial limitations in Arabic reasoning and language understanding.

Conclusions:

The GATmath and GATLc datasets present a considerable challenge for existing Arabic LLMs.
There is substantial room for improvement in developing more capable Arabic language models.
These benchmarks are crucial for advancing research and development in Arabic AI.