Sample size determination for training set optimization in genomic prediction.

Po-Ya Wu1,2, Jen-Hsiang Ou1,3, Chen-Tuo Liao4

  • 1Department of Agronomy, National Taiwan University, Taipei, Taiwan.

TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik | March 13, 2023
Summary
This summary is machine-generated.

Determining the optimal training set size is crucial for genomic prediction (GP) studies. This research presents a cost-effective method using logistic growth curves to find the ideal sample size for selective phenotyping, aiding breeders in economical genotype selection.

Area of Science:

  • Quantitative genetics
  • Animal and plant breeding
  • Statistical genomics

Background:

  • Genomic prediction (GP) uses genome-wide marker data to compute genomic estimated breeding values (GEBVs), which guide the selection of genotypes in breeding programs.
  • Establishing an optimal training set size for GP models is critical but often unresolved.
  • Current practices often overlook cost-effectiveness and resource constraints in training set determination.

Purpose of the Study:

  • To develop a practical and cost-effective approach for determining the optimal training set size in genomic prediction studies.
  • To provide a method for optimizing selective phenotyping strategies.
  • To facilitate the efficient selection of genotypes with economical sample sizes.

Main Methods:

  • Applied logistic growth curve analysis to model prediction accuracy in relation to training set size.
  • Utilized three real genome datasets to validate the proposed approach.
  • Developed an R function to enable widespread application of the sample size determination method.
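
As a rough illustration of the logistic growth curve idea listed above, the following Python sketch fits a three-parameter logistic curve to (training set size, prediction accuracy) pairs and then reports the smallest size whose predicted accuracy reaches a chosen fraction of the curve's asymptote. Everything here is an assumption made for illustration: the parametrization `K / (1 + b * exp(-c * n))`, the grid-search fitting procedure, the 95% cutoff, and the function names `fit_logistic` and `size_for_fraction`. It is not the authors' R function, and the summary does not state their exact criterion. The sketch also assumes all accuracies lie strictly between 0 and 1.

```python
import math


def logistic(n, K, b, c):
    """Assumed three-parameter logistic growth curve: K / (1 + b * exp(-c * n))."""
    return K / (1.0 + b * math.exp(-c * n))


def fit_logistic(sizes, accuracies, k_steps=2000):
    """Fit (K, b, c) by a grid search over the asymptote K.

    For a fixed K, ln(K / y - 1) = ln(b) - c * n is linear in n, so b and c
    come from ordinary least squares on the transformed accuracies. The K
    with the smallest squared error on the original scale is kept.
    Assumes 0 < accuracy < 1 for every observation.
    """
    y_max = max(accuracies)
    n_bar = sum(sizes) / len(sizes)
    sxx = sum((n - n_bar) ** 2 for n in sizes)
    best = None
    for i in range(1, k_steps + 1):
        K = y_max + i * (1.0 - y_max) / k_steps  # candidate asymptote in (y_max, 1]
        z = [math.log(K / y - 1.0) for y in accuracies]
        z_bar = sum(z) / len(z)
        slope = sum((n - n_bar) * (zi - z_bar) for n, zi in zip(sizes, z)) / sxx
        b = math.exp(z_bar - slope * n_bar)
        c = -slope
        sse = sum((logistic(n, K, b, c) - y) ** 2 for n, y in zip(sizes, accuracies))
        if best is None or sse < best[0]:
            best = (sse, K, b, c)
    return best[1], best[2], best[3]


def size_for_fraction(b, c, fraction=0.95):
    """Smallest integer n with logistic(n) >= fraction * K (requires c > 0).

    Solving K / (1 + b * exp(-c * n)) >= fraction * K for n gives
    n >= ln(b * fraction / (1 - fraction)) / c; K cancels out.
    """
    return max(0, math.ceil(math.log(b * fraction / (1.0 - fraction)) / c))


# Synthetic, noise-free demonstration data generated from known parameters
# (K=0.8, b=4, c=0.01); these are NOT results from the paper.
demo_sizes = list(range(50, 501, 50))
demo_acc = [logistic(n, 0.8, 4.0, 0.01) for n in demo_sizes]
K_hat, b_hat, c_hat = fit_logistic(demo_sizes, demo_acc)
n_star = size_for_fraction(b_hat, c_hat, fraction=0.95)
```

With real data, `demo_sizes` and `demo_acc` would be replaced by the training set sizes tried and the prediction accuracies observed at each size. The cutoff fraction is a tuning choice: a lower value yields a smaller, cheaper training set at the cost of some accuracy.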

Main Results:

  • A method was established to identify a cost-effective optimal training set size for genomic prediction.
  • The approach effectively balances prediction accuracy with the economic constraints of phenotyping.
  • The utility of the method was demonstrated on three real genome datasets.

Conclusions:

  • The developed approach provides a practical solution for optimizing training set sizes in genomic prediction.
  • The accompanying R function simplifies the implementation for breeders seeking economical phenotyping strategies.
  • This facilitates more efficient and cost-effective breeding programs through informed genotype selection.