Vector representation based on a supervised codebook for Nepali documents classification.

Chiranjibi Sitaula¹, Anish Basnet², Sunil Aryal¹

  • ¹Deakin University, Geelong, VIC, Australia.

PeerJ Computer Science | April 5, 2021
Summary
This summary is machine-generated.

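This study introduces a supervised codebook for representing Nepali documents. The codebook filters out outlier tokens and keeps only semantically relevant ones, which improves document classification accuracy. Evaluated with a Support Vector Machine on four Nepali datasets, the method outperforms widely used representation methods on three datasets and gives comparable results on the fourth.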

Keywords:
Classification, Codebook, Feature extraction, Machine learning, Nepali documents, Text classification

Area of Science:

  • Natural Language Processing
  • Computational Linguistics
  • Machine Learning

Background:

  • Document representation methods often struggle with outlier tokens, negatively impacting classification performance.
  • Existing Nepali document representation techniques frequently lack effective outlier filtering strategies.
  • Outlier tokens introduce uncertainty, hindering the accurate semantic understanding of documents.

Purpose of the Study:

  • To propose a novel document representation method for Nepali text that effectively handles outlier tokens.
  • To develop a supervised codebook containing only semantically relevant tokens for improved document representation.
  • To enhance document classification performance for Nepali language data.

Main Methods:

  • Developed a domain-specific supervised codebook based on token-class label similarity.
  • Utilized probability-based word embedding for word representation within the codebook.
  • Evaluated the method on four Nepali text datasets (A1, A2, A3, A4) using Support Vector Machine (SVM).
  • Compared performance against established methods like Bag of Words, Latent Dirichlet Allocation, LSTM, Word2Vec, and BERT.
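
The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the summary does not specify how token-class similarity is computed, so the code assumes it is the empirical probability P(class | token) estimated from labeled training documents. The function names (`build_codebook`, `represent`), the `threshold` and `min_count` parameters, and the toy English tokens are all hypothetical. The resulting document vectors would then be passed to a classifier such as an SVM, which is omitted here to keep the example free of external dependencies.

```python
from collections import Counter, defaultdict


def build_codebook(docs, labels, threshold=0.6, min_count=2):
    """Build a supervised codebook: {token: [P(class | token), ...]}.

    Assumption (not from the paper): a token is kept only if it occurs at
    least `min_count` times and its highest class probability reaches
    `threshold`. Everything else is treated as an outlier token.
    """
    classes = sorted(set(labels))
    counts = defaultdict(Counter)  # token -> Counter(class -> occurrences)
    for tokens, label in zip(docs, labels):
        for tok in tokens:
            counts[tok][label] += 1

    codebook = {}
    for tok, per_class in counts.items():
        total = sum(per_class.values())
        if total < min_count:
            continue  # too rare to estimate a reliable probability
        probs = [per_class[c] / total for c in classes]
        if max(probs) >= threshold:
            codebook[tok] = probs  # probability-based token embedding
    return classes, codebook


def represent(tokens, codebook, n_classes):
    """Average the embeddings of the codebook tokens found in a document."""
    vecs = [codebook[t] for t in tokens if t in codebook]
    if not vecs:
        return [0.0] * n_classes  # no codebook token present
    return [sum(col) / len(vecs) for col in zip(*vecs)]


# Toy training data (hypothetical, English tokens for readability).
docs = [
    ["match", "goal", "team", "the"],
    ["team", "goal", "player", "the"],
    ["vote", "party", "election", "the"],
    ["party", "vote", "minister", "the"],
]
labels = ["sports", "sports", "politics", "politics"]

classes, codebook = build_codebook(docs, labels)
doc_vector = represent(["the", "goal", "team"], codebook, len(classes))
```

In this toy run, the token "the" is split evenly between both classes and falls below the threshold, so it is excluded from the codebook, while class-specific tokens such as "goal" and "vote" are kept. Each document vector has one dimension per class, which keeps the representation compact regardless of vocabulary size.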

Main Results:

  • Reached classification accuracies of 77.46% (A1), 67.53% (A2), 80.54% (A3), and 89.58% (A4) on the four Nepali datasets.
  • Outperformed widely used existing document representation methods on three of the four datasets and showed comparable results on the fourth.
  • Introduced the largest Nepali document dataset, NepaliLinguistic dataset (A4), to the research community.

Conclusions:

  • The proposed supervised codebook method offers a robust approach to Nepali document representation by mitigating the impact of outlier tokens.
  • In the reported experiments, the method improves document classification accuracy over widely used representation methods on three of the four Nepali datasets and is comparable on the fourth.
  • The introduction of the NepaliLinguistic dataset facilitates future research in Nepali language processing and computational linguistics.