



Morphologically-analyzed and syntactically-annotated Quran dataset.

Majdi Sawalha1,2, Faisal Al-Shargi3, Sane Yagi4,5

  • 1College of Engineering, Al-Ain University, Abu Dhabi, UAE.

Data in Brief | January 20, 2025
Summary
This summary is machine-generated.

The Morphologically-Analyzed and Syntactically-Annotated Quran (MASAQ) dataset offers detailed linguistic annotations for Quranic Arabic, advancing Natural Language Processing (NLP) research and tools for this classical language.

Keywords:
Syntactic annotation; analysis; i'rab إعراب (ʾi‘rāb); morphological annotation; semantic relations; syntactic relations; tagset


Area of Science:

  • Computational Linguistics
  • Digital Humanities
  • Corpus Linguistics

Background:

  • Classical Arabic, particularly the Quran, presents significant linguistic complexities for Natural Language Processing (NLP).
  • There is a notable scarcity of comprehensive, annotated corpora for Quranic Arabic, hindering NLP model development.
  • Existing resources may lack the detailed morphological and syntactic information required for advanced linguistic analysis.

Purpose of the Study:

  • To introduce the Morphologically-Analyzed and Syntactically-Annotated Quran (MASAQ) dataset, a novel resource for Quranic Arabic.
  • To address the gap in annotated Quranic corpora and support the creation of sophisticated NLP applications.
  • To provide a high-quality, linguistically rich dataset for research in Arabic NLP.

Main Methods:

  • Annotated the entire Quranic text with detailed morphological and syntactic information.
  • Utilized a rigorously verified Quranic text from Tanzil.net.
  • Relied on expert Arabic linguists applying traditional i'rab methodology for accurate annotation, resulting in over 131K morphological entries and 123K syntactic function instances.
  • Structured the dataset in multiple formats (TSV, SQLite3, CSV, JSON) for accessibility.
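
As a rough illustration of how a tabular release in these formats might be consumed, the sketch below parses tab-separated text, loads it into an in-memory SQLite table, and serializes the same rows as JSON. The column names (`sura`, `verse`, `word`, `segment`, `morph_tag`, `syntactic_role`), the table name `segments`, and the sample rows are all placeholders invented for this example; they are not the actual MASAQ schema or content.

```python
import csv
import io
import json
import sqlite3

# Placeholder rows: column names and values are illustrative assumptions,
# not the actual MASAQ schema or content.
SAMPLE_TSV = (
    "sura\tverse\tword\tsegment\tmorph_tag\tsyntactic_role\n"
    "1\t1\t1\tseg_a\tPREP\tROLE_X\n"
    "1\t1\t1\tseg_b\tNOUN\tROLE_Y\n"
    "1\t1\t2\tseg_c\tNOUN\tROLE_Y\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def to_sqlite(rows):
    """Load rows into an in-memory SQLite table named 'segments' (hypothetical name)."""
    conn = sqlite3.connect(":memory:")
    columns = list(rows[0].keys())
    conn.execute("CREATE TABLE segments (%s)" % ", ".join(columns))
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        "INSERT INTO segments VALUES (%s)" % placeholders,
        [tuple(row[c] for c in columns) for row in rows],
    )
    conn.commit()
    return conn


rows = read_tsv(SAMPLE_TSV)
conn = to_sqlite(rows)

# Example query: count segments carrying a given (placeholder) morphological tag.
noun_count = conn.execute(
    "SELECT COUNT(*) FROM segments WHERE morph_tag = ?", ("NOUN",)
).fetchone()[0]

# The same rows serialized as JSON; ensure_ascii=False keeps Arabic script readable.
as_json = json.dumps(rows, ensure_ascii=False)
```

With a real file, the `SAMPLE_TSV` string would be replaced by the contents of the downloaded TSV, and the column names adjusted to whatever the published header row actually contains.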

Main Results:

  • The MASAQ dataset comprises over 131,000 morphological entries and 123,000 syntactic function instances.
  • The dataset features a tagset of 72 syntactic roles, detailed morphological analysis, and context-specific annotations.
  • The dataset is available in various formats, facilitating diverse research applications.
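
A simple sanity check on such a dataset is to tally how often each syntactic role occurs and how many distinct roles appear. The sketch below does this with `collections.Counter`. The field name `syntactic_role` and the tag values are hypothetical placeholders, not the real MASAQ tagset; on the full dataset one would expect the distinct count to match the 72 roles reported above, though that depends on how the released files encode the tags.

```python
from collections import Counter


def role_frequencies(rows, column="syntactic_role"):
    """Count how often each non-empty value of `column` occurs across rows."""
    return Counter(row[column] for row in rows if row.get(column))


# Placeholder annotations; segment ids and tag names are invented for illustration.
sample_rows = [
    {"segment": "seg_a", "syntactic_role": "ROLE_X"},
    {"segment": "seg_b", "syntactic_role": "ROLE_Y"},
    {"segment": "seg_c", "syntactic_role": "ROLE_Y"},
    {"segment": "seg_d", "syntactic_role": ""},  # segment with no role assigned
]

freq = role_frequencies(sample_rows)
distinct_roles = len(freq)
most_common_role, most_common_count = freq.most_common(1)[0]
```

Rows with an empty role are skipped here so that unannotated segments do not register as a tag of their own; whether that is the right treatment depends on how the dataset marks missing values.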

Conclusions:

  • MASAQ significantly enhances the availability of annotated Quranic Arabic data for NLP research.
  • The dataset is poised to advance Arabic NLP tasks such as dependency parsing, machine translation, and grammar checking.
  • MASAQ offers valuable resources for both NLP development and the study of Arabic linguistics, promoting more accurate language processing tools.