Arabic Punctuation Dataset

Sane Yagi¹, Ashraf Elnagar², Esra Yaghi³

¹Department of Foreign Languages, University of Sharjah, the United Arab Emirates.

February 13, 2024

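Summary

This summary is machine-generated.

Inconsistent punctuation in Arabic hinders natural language processing (NLP). The Arabic Punctuation Dataset (APD) provides annotated Modern Standard Arabic text for training models that identify sentence boundaries and predict punctuation, improving downstream Arabic NLP tasks.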

Keywords:

Automatic punctuation; punctuation corpus; sentence boundary identification; theme-rheme; topic and comment

Scientific fields:

Computational linguistics
Natural language processing

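Background:

Arabic text shows marked inconsistency in punctuation, which poses challenges for natural language processing (NLP) applications.
Developing robust NLP tools for Arabic requires addressing this variation in punctuation.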

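Purpose of the study:

To introduce the Arabic Punctuation Dataset (APD), a new resource for improving Arabic NLP.
To facilitate the training of machine learning models for sentence boundary identification and punctuation prediction in Modern Standard Arabic.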

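Main methods:

The Arabic Punctuation Dataset (APD) was created using the "theme-rheme completion" principle, which ties punctuation to grammar.
APD contains 312 million words and 12 million sentences, comprising manually annotated book chapters (ABC), parallel translations (CBT), and scrambled sentences (SSAC-UNPC).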

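Main results:

APD provides a large-scale annotated corpus for training NLP models specific to Arabic punctuation.
The dataset's different components serve a range of NLP tasks, from basic boundary identification to complex punctuation restoration.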
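
To make the punctuation-prediction task concrete, the sketch below shows one common way punctuated text could be turned into token/label pairs for training, with sentence boundaries read off the labels. It is an illustration only, not APD's file format or the authors' pipeline: the function names, the label "O" for "no mark follows", and the chosen set of marks are all assumptions made for this example.

```python
# Hypothetical illustration: none of these names or labels come from APD itself.
# Marks considered here (an assumed set): Latin . ! : and Arabic comma,
# Arabic semicolon, Arabic question mark.
PUNCTUATION = {".", "!", ":", "\u060C", "\u061B", "\u061F"}


def to_token_labels(text):
    """Return (token, label) pairs; the label is the mark following the token, or 'O'."""
    pairs = []
    for raw in text.split():
        # Peel trailing punctuation off the whitespace-delimited token.
        end = len(raw)
        while end > 0 and raw[end - 1] in PUNCTUATION:
            end -= 1
        word, marks = raw[:end], raw[end:]
        if word:
            pairs.append((word, marks[0] if marks else "O"))
        elif marks and pairs and pairs[-1][1] == "O":
            # A free-standing mark is attached to the preceding unlabeled token.
            pairs[-1] = (pairs[-1][0], marks[0])
    return pairs


def sentence_boundaries(pairs, terminals=(".", "!", "\u061F")):
    """Indices of tokens whose label is a sentence-final mark."""
    return [i for i, (_, label) in enumerate(pairs) if label in terminals]


# Example usage on a short Modern Standard Arabic passage.
sample = "ذهب الولد إلى المدرسة، ثم عاد. هل رأيته؟"
pairs = to_token_labels(sample)
print(pairs)
print(sentence_boundaries(pairs))
```

A model for punctuation prediction would see only the tokens and learn to output the labels; sentence boundary identification is the simpler case where only the sentence-final labels matter.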

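Conclusions:

The Arabic Punctuation Dataset (APD) is a foundational resource for advancing Arabic NLP.
APD's grammar-based approach improves the clarity of machine-generated text, benefiting applications such as machine translation and speech recognition.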