A Memory-Efficient Encoding Method for Processing Mixed-Type Data on Machine Learning.

Ivan Lopez-Arevalo (1), Edwin Aldana-Bobadilla (2), Alejandro Molina-Villegas (3)

  • (1) Centro de Investigación y de Estudios Avanzados del I.P.N., Unidad Tamaulipas, Victoria 87130, Mexico.

Entropy (Basel, Switzerland) | December 15, 2020
Summary
This summary is machine-generated.

This study introduces a method for encoding mixed-type data (numerical and categorical) for machine learning. Compared with one-hot encoding and feature hashing, the method uses less memory while preserving the information content of the original data.

Keywords: categorical data; data preprocessing; machine learning

Area of Science:

  • Machine Learning
  • Data Science
  • Information Theory

Background:

  • Many machine learning algorithms expect numerical input, so datasets that mix numerical and categorical data must be encoded before use.
  • Existing encoding methods such as one-hot encoding and feature hashing increase the dimensionality of the dataset, which leads to an excessive number of variables and noisy data.
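
The dimensionality growth described above can be illustrated with a minimal sketch. It uses only the Python standard library; the helper names `one_hot` and `feature_hash`, the toy `colors` column, and the bucket count are assumptions made for illustration, and the hashing shown is a simplified, unsigned variant rather than any particular library's implementation.

```python
import hashlib

def one_hot(values):
    """One-hot encode a categorical column: one output column per distinct category."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows

def feature_hash(values, n_buckets):
    """Hash each category into one of n_buckets columns (collisions are possible)."""
    rows = []
    for v in values:
        digest = hashlib.md5(v.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % n_buckets
        row = [0] * n_buckets
        row[bucket] = 1
        rows.append(row)
    return rows

# Hypothetical single categorical column with 4 distinct values.
colors = ["red", "green", "blue", "green", "red", "yellow"]

categories, encoded = one_hot(colors)
hashed = feature_hash(colors, n_buckets=8)

print(len(encoded[0]))  # 4: one column became four
print(len(hashed[0]))   # 8: one column became n_buckets columns
```

In this sketch a single input column grows to as many columns as there are distinct categories (one-hot) or to the chosen number of buckets (hashing), so a dataset with several high-cardinality columns widens quickly.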

Purpose of the Study:

  • To propose a novel encoding approach for mixed-type data that addresses the limitations of current methods.
  • To map mixed-type data into an information space, using Shannon's information theory to model the information content of the data.
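
For reference, the standard quantities from Shannon's theory are given below. The summary does not state which of them the authors use or how they are combined, so this is background notation rather than the paper's formulation; the symbols $X$, $x$, and $p(x)$ are introduced here for illustration.

```latex
% Self-information (in bits) of observing value x with probability p(x)
I(x) = -\log_2 p(x)

% Shannon entropy of a discrete variable X: the expected self-information
H(X) = -\sum_{x} p(x) \log_2 p(x)

% Worked example: a category observed with probability 1/4
I(x) = -\log_2 \tfrac{1}{4} = 2 \text{ bits}
```

Rare values carry more information than common ones: a value with probability 1/2 carries 1 bit, while a value with probability 1/4 carries 2 bits.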

Main Methods:

  • Developed an encoding technique based on Shannon's information theory to represent the information in mixed-type data.
  • Evaluated the proposed method on ten UCI repository datasets and two real-world datasets.
  • Applied the encoding for classification, regression, and clustering tasks.
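
As a rough illustration of encoding by information content, the sketch below replaces each categorical value with its empirical self-information, so one categorical column stays one numeric column. This is not the authors' algorithm, whose details are not given in this summary; the function names `information_encode` and `entropy` and the toy data are assumptions.

```python
import math
from collections import Counter

def information_encode(values):
    """Replace each category with its empirical self-information, -log2(p), in bits."""
    counts = Counter(values)
    n = len(values)
    info = {c: -math.log2(k / n) for c, k in counts.items()}
    return [info[v] for v in values], info

def entropy(values):
    """Empirical Shannon entropy of the column, in bits."""
    counts = Counter(values)
    n = len(values)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())

# Hypothetical categorical column: p(red) = 1/2, p(green) = p(blue) = 1/4.
colors = ["red", "red", "green", "blue"]

codes, table = information_encode(colors)
print(codes)            # [1.0, 1.0, 2.0, 2.0]
print(entropy(colors))  # 1.5
```

A limitation of this naive mapping is visible in the output: categories with equal frequency (here `green` and `blue`) receive the same code, so a practical method would need an additional step to keep them distinguishable.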

Main Results:

  • The novel encoding approach demonstrated promising results across various datasets and tasks.
  • Used less memory than one-hot and feature-hashing encoding.
  • Successfully preserved the information content of the original mixed-type data.

Conclusions:

  • The proposed encoding method offers a significant improvement over traditional techniques for handling mixed-type data.
  • This approach enhances memory efficiency and information preservation in machine learning preprocessing.
  • The method is effective for classification, regression, and clustering tasks.