



Updated: Jul 3, 2025


Arabic Punctuation Dataset

Sane Yagi1, Ashraf Elnagar2, Esra Yaghi3

  • 1Department of Foreign Languages, University of Sharjah, the United Arab Emirates.

Data in brief
|February 13, 2024
PubMed
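Summary
This summary is machine-generated.

Inconsistent punctuation in Arabic hinders natural language processing (NLP). The Arabic Punctuation Dataset (APD) provides annotated Modern Standard Arabic text for training models that identify sentence boundaries and predict punctuation, which supports downstream Arabic NLP tasks.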

Keywords:
Automatic punctuation; Punctuation corpus; Sentence boundary identification; Theme-rheme; Topic and comment


Scientific fields:

  • Computational linguistics
  • Natural language processing

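Background:

  • Punctuation in Arabic is markedly inconsistent, which poses a challenge for natural language processing (NLP) applications.
  • Developing robust NLP tools for Arabic requires addressing this variation in punctuation.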

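Purpose of the study:

  • To introduce the Arabic Punctuation Dataset (APD), a new resource for improving Arabic NLP.
  • To facilitate the training of machine learning models for sentence boundary identification and punctuation prediction in Modern Standard Arabic.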

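Main methods:

  • The Arabic Punctuation Dataset (APD) was created using the principle of "theme-rheme completion", which ties punctuation to grammar.
  • The APD contains 312 million words and 12 million sentences, comprising manually annotated book chapters (ABC), parallel translations (CBT), and scrambled sentences (SSAC-UNPC).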

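Main results:

  • The APD provides a large-scale annotated corpus for training NLP models specific to Arabic punctuation.
  • The dataset's components serve different NLP tasks, from basic sentence boundary identification to complex punctuation restoration.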
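
Sentence boundary identification and punctuation restoration are often framed as labeling each word with the punctuation mark that follows it. The Python sketch below illustrates that framing only. The punctuation inventory, the function names `to_labeled_tokens` and `boundary_labels`, and the sample sentence are assumptions made for illustration; they are not taken from the APD or its documentation.

```python
import re

# Illustrative inventory of Arabic and Latin marks (an assumption, not APD's tag set).
PUNCT = "،؛؟.!:,;?"
SENTENCE_END = {".", "؟", "!", "?"}

# A token is either a run of non-space, non-punctuation characters or one mark.
_TOKEN = re.compile(rf"[^\s{re.escape(PUNCT)}]+|[{re.escape(PUNCT)}]")


def to_labeled_tokens(text):
    """Return (word, label) pairs; label is the mark right after the word, else "O"."""
    pairs = []
    for tok in _TOKEN.findall(text):
        if tok in PUNCT:
            # Attach the first mark after a word to that word; ignore extra marks.
            if pairs and pairs[-1][1] == "O":
                pairs[-1] = (pairs[-1][0], tok)
        else:
            pairs.append((tok, "O"))
    return pairs


def boundary_labels(pairs):
    """Reduce punctuation labels to binary sentence-boundary labels."""
    return [1 if label in SENTENCE_END else 0 for _, label in pairs]


if __name__ == "__main__":
    sample = "ذهب الولد إلى المدرسة، ثم عاد. هل وصل؟"
    pairs = to_labeled_tokens(sample)
    for word, label in pairs:
        print(word, label)
    print(boundary_labels(pairs))
```

A model for punctuation prediction would be trained to recover the labels from the unpunctuated word sequence. The binary labels correspond to the simpler boundary-identification task.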

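Conclusions:

  • The Arabic Punctuation Dataset (APD) is a foundational resource for advancing Arabic NLP.
  • The APD's grammar-based approach improves the clarity of machine-generated text, benefiting applications such as machine translation and speech recognition.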