
Predicting vector distribution in Europe: at what sample size are species distribution models reliable?

Lianne Mitchel1,2, Guy Hendrickx3, Ewan T MacLeod1

  • 1Deanery of Biomedical Sciences, College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh, United Kingdom.

Frontiers in Veterinary Science
|June 13, 2025

Summary
This summary is machine-generated.

Choosing an adequate sample size for Random Forest species distribution models is crucial for reliable vector-borne disease surveillance. In this simulation study, balanced (50:50) samples required 750-1,000 data points and 40:60 samples required 1,100-1,300, while 20:80 samples consistently failed to produce reliable models.

Keywords:
machine learning, random forest, sample ratio, sample size, species distribution model, surveillance, vector-borne diseases, virtual species

Area of Science:

  • Ecological modeling
  • Epidemiology
  • Machine learning applications

Background:

  • Species distribution models (SDMs) predict where disease vectors occur from environmental data.
  • Climate change and rising vector-borne diseases necessitate improved surveillance in Europe.
  • Current SDM practices lack standardization, particularly regarding optimal sample size.

Purpose of the Study:

  • To determine the optimum sample size for Random Forest models.
  • To evaluate the impact of different sample ratios on model reliability.
  • To inform standardized practices for vector distribution modeling.

Main Methods:

  • A simulated vector with a known distribution was used across 10 European test sites.
  • 9,000 Random Forest models were trained with varying sample sizes (10-5,000) and ratios (50:50, 20:80, 40:60).
  • Model performance was assessed using five metrics, with optimum sample size defined by 25th percentile performance thresholds.
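
The sampling design described above (a fixed sample size drawn at a fixed ratio from a species with a known distribution) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the grid of cells, the 30% prevalence, the helper `draw_sample`, and the reading of each ratio as presence:absence are all assumptions made for the example.

```python
import random

def draw_sample(cells, n, presence_share, rng):
    """Draw n cells with a fixed share of presences.

    cells: list of (cell_id, is_present) pairs for a simulated species.
    presence_share: fraction of the sample that is presences (0.5 for 50:50).
    """
    presences = [c for c in cells if c[1]]
    absences = [c for c in cells if not c[1]]
    n_pres = round(n * presence_share)
    n_abs = n - n_pres
    if n_pres > len(presences) or n_abs > len(absences):
        raise ValueError("not enough cells of one class for this design")
    # Sample each class without replacement, then combine.
    return rng.sample(presences, n_pres) + rng.sample(absences, n_abs)

rng = random.Random(42)
# Hypothetical virtual species: 10,000 cells, present in roughly 30% of them.
cells = [(i, rng.random() < 0.3) for i in range(10_000)]

# Ratios named in the study; treating the first number as presences is an assumption.
ratios = {"50:50": 0.5, "40:60": 0.4, "20:80": 0.2}
sample_sizes = [10, 100, 1000, 5000]  # illustrative points within the 10-5,000 range

# Record how many presences each (ratio, size) combination contains.
design = {}
for label, share in ratios.items():
    for n in sample_sizes:
        sample = draw_sample(cells, n, share, rng)
        design[(label, n)] = sum(1 for _, present in sample if present)
```

Each entry of `design` corresponds to one training set; in a study like this one, a model would then be fitted to each set and repeated across sites and replicates.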

Main Results:

  • For balanced samples (50:50), optimum sample sizes ranged from 750-1,000.
  • Unbalanced samples (40:60 ratio) required 1,100-1,300 samples for reliable models.
  • Unbalanced samples with a 20:80 ratio consistently failed to produce reliable models.
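
These ranges depend on how "optimum" is defined. The summary states only that 25th percentile performance thresholds were used, so the sketch below shows one possible reading: take the smallest sample size at which the 25th percentile of a metric across replicate models meets a chosen threshold. The function name, the scores, and the 0.85 threshold are invented for illustration and are not the study's data or exact rule.

```python
from statistics import quantiles

def optimum_sample_size(scores_by_size, threshold):
    """Smallest sample size whose 25th-percentile score meets the threshold.

    scores_by_size: dict mapping sample size -> list of metric values,
    one per replicate model. Returns None if no size qualifies.
    """
    for size in sorted(scores_by_size):
        # quantiles(..., n=4) returns the three quartile cut points;
        # the first one is the 25th percentile.
        q25 = quantiles(scores_by_size[size], n=4)[0]
        if q25 >= threshold:
            return size
    return None

# Invented scores for five replicate models per sample size (not study data).
scores = {
    100: [0.60, 0.65, 0.70, 0.72, 0.75],
    250: [0.74, 0.78, 0.80, 0.83, 0.85],
    500: [0.86, 0.88, 0.90, 0.91, 0.92],
    1000: [0.90, 0.92, 0.93, 0.94, 0.95],
}

best = optimum_sample_size(scores, threshold=0.85)
```

Using a lower percentile rather than the mean makes the rule conservative: a sample size qualifies only when most replicate models, not just the average one, perform adequately.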

Conclusions:

  • This study provides the first estimates of optimum sample size for Random Forest models at high resolution and extent using simulated data.
  • Findings can enhance the reliability of SDMs, optimize field sampling, and improve vector surveillance.
  • Further research should validate these findings with real vector data and explore model transferability.