Generating random graphs with prescribed graphlet frequency bounds derived from probabilistic networks.

Bram Mornie1, Didier Colle1, Pieter Audenaert1

  • 1IDLab, Department of Information Technology, Ghent University - imec, Ghent, Belgium.

Plos One | August 26, 2025
Summary
This summary is machine-generated.

This study introduces a novel algorithm for generating realistic biological networks, controlling subgraph patterns and edge uncertainty. The method efficiently creates large graphs with specific motif frequencies, crucial for accurate bioinformatics algorithm testing.

Area of Science:

  • Bioinformatics and computational network analysis.
  • Theoretical computer science focused on probabilistic graph generation.
  • Systems biology applications for benchmarking network algorithms.

Background:

Benchmarking network algorithms in bioinformatics requires diverse datasets that reflect the structural properties of real networks; otherwise the resulting computational predictions cannot be trusted. Prior research has shown that biological interactions are frequently stochastic events, so they are better modeled as probabilistic edges than as deterministic connections. Traditional synthetic graph generative models often fail to account for the distribution of complex subgraph patterns, commonly referred to as motifs or graphlets. Ignoring edge uncertainty during network analysis can lead to fundamentally incorrect conclusions about the topological properties of biological systems. Existing frameworks typically prioritize global metrics such as degree distribution while neglecting the local connectivity that defines functional biological modules. This gap motivated the development of a methodology that derives rigorous bounds on graphlet counts directly from uncertain, probabilistic target networks.
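To make the idea of graphlet counts on a probabilistic network concrete, the following minimal sketch computes the expected number of triangles when every edge exists independently with a given probability, and draws a few sampled realizations for comparison. It is an illustration only: the independence assumption, the function names `expected_triangles` and `sample_triangle_counts`, and the toy network are assumptions introduced here, and this is not the bound derivation used in the study.

```python
import random
from itertools import combinations


def expected_triangles(prob):
    """Expected triangle count, assuming edges exist independently.

    `prob` maps frozenset({u, v}) to the probability of edge u-v.
    Missing pairs are treated as probability 0.
    """
    nodes = sorted({n for edge in prob for n in edge})
    total = 0.0
    for a, b, c in combinations(nodes, 3):
        # A triangle exists only if all three of its edges exist.
        total += (prob.get(frozenset((a, b)), 0.0)
                  * prob.get(frozenset((b, c)), 0.0)
                  * prob.get(frozenset((a, c)), 0.0))
    return total


def sample_triangle_counts(prob, n_samples, seed=0):
    """Triangle counts of sampled deterministic realizations (hypothetical helper)."""
    rng = random.Random(seed)
    nodes = sorted({n for edge in prob for n in edge})
    counts = []
    for _ in range(n_samples):
        # Keep each edge with its own probability.
        present = {e for e, p in prob.items() if rng.random() < p}
        counts.append(sum(
            1 for a, b, c in combinations(nodes, 3)
            if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= present
        ))
    return counts


# Toy probabilistic network (made-up values for illustration).
toy = {
    frozenset(("a", "b")): 0.9,
    frozenset(("b", "c")): 0.5,
    frozenset(("a", "c")): 0.8,
    frozenset(("c", "d")): 1.0,
}
print(expected_triangles(toy))
print(sample_triangle_counts(toy, 5))
```

The only possible triangle in the toy network is a-b-c, so the expectation is simply the product of its three edge probabilities. The brute-force loop over all node triples is chosen for readability and would not scale to large networks.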

Purpose Of The Study:

This research develops a specialized algorithm that produces random graphs adhering to prescribed graphlet frequency bounds extracted from probabilistic network models. The investigators aimed to resolve the discrepancy between simplified synthetic data and the inherent uncertainty in experimental biological interaction datasets, an uncertainty that often complicates the analysis of real-world systems. By establishing mathematical constraints for both graphlet counts and degree distributions, the study provides a robust input mechanism for generative processes. The team focused on creating a system capable of growing graphs incrementally, allowing for precise control over structural evolution at each step. Researchers intended to provide a benchmarking tool that allows bioinformatics specialists to test their algorithms against networks with high-fidelity subgraph architectures. The study prioritizes the integration of local motif distributions and global probabilistic constraints to enhance the realism of synthetic network benchmarks.

Main Methods:

The proposed generative engine constructs synthetic networks through an incremental growth process that applies minor topological modifications during every discrete iteration. This stepwise modification strategy facilitates the implementation of an efficient graphlet counting method that avoids the computational overhead of full network re-analysis. For networks characterized as sparse, the time complexity for updating these subgraph frequencies remains entirely independent of the total node count, thereby significantly reducing the overall processing requirements for large datasets. The methodology relies on derived bounds for three-node and four-node motifs, using these values as the primary steering parameters for the generation algorithm. Experimental validation involved testing the model on a diverse array of synthetic and real-world networks featuring varying scales and uncertain interaction data. The researchers utilized specific graphlet counting techniques to ensure that the generated outputs matched the target degree distributions and subgraph frequencies.
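The locality of such an update can be illustrated for three-node graphlets. The sketch below maintains the number of triangles and induced two-edge paths while edges are added or removed; each update inspects only the neighborhoods of the two endpoints, so its cost depends on their degrees rather than on the total number of nodes. The class name `ThreeNodeCounter` and its interface are hypothetical, and the study's own counting method (which also covers four-node graphlets) may differ.

```python
from collections import defaultdict


class ThreeNodeCounter:
    """Incrementally tracks triangles and induced 2-edge paths (a sketch)."""

    def __init__(self):
        self.adj = defaultdict(set)
        self.triangles = 0
        self.paths = 0  # induced paths on three nodes (open wedges)

    def _delta(self, u, v):
        """Change in (triangles, paths) caused by edge u-v, given that
        u and v are currently NOT adjacent."""
        common = len(self.adj[u] & self.adj[v])
        # Each common neighbor w turns the open path u-w-v into a triangle.
        # Each non-common neighbor of u or v creates a new open path via u-v.
        d_paths = (len(self.adj[u]) - common) + (len(self.adj[v]) - common) - common
        return common, d_paths

    def add_edge(self, u, v):
        if u == v or v in self.adj[u]:
            return
        d_tri, d_paths = self._delta(u, v)
        self.triangles += d_tri
        self.paths += d_paths
        self.adj[u].add(v)
        self.adj[v].add(u)

    def remove_edge(self, u, v):
        if v not in self.adj[u]:
            return
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # Removal is the exact inverse of insertion.
        d_tri, d_paths = self._delta(u, v)
        self.triangles -= d_tri
        self.paths -= d_paths


counter = ThreeNodeCounter()
for edge in [(1, 2), (2, 3), (1, 3), (3, 4)]:
    counter.add_edge(*edge)
print(counter.triangles, counter.paths)
```

After the four insertions the graph contains the triangle 1-2-3 and the open paths 1-3-4 and 2-3-4. Computing the change from the endpoint neighborhoods alone is what avoids re-analyzing the whole network after every modification.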

Main Results:

The experimental results indicate that the algorithm can successfully generate complex graphs with more than 10,000 edges in under sixty minutes, demonstrating the practical scalability of the proposed computational framework. Precise regulation of the frequencies for all three-node and four-node graphlets was maintained throughout the generation of diverse network topologies. While the total computation time is heavily influenced by the size of the graphlets being modeled, the system remains efficient for standard motif analysis. The model demonstrated high performance across both synthetic benchmarks and real-world biological datasets with significant edge uncertainty. The incremental update mechanism effectively preserved the target degree distribution while simultaneously converging on the desired graphlet frequency bounds. Data analysis confirmed that the generated synthetic networks provide a realistic representation of the structural properties found in probabilistic biological interactions.

Conclusions:

Incorporating uncertain edge data into synthetic graph generation establishes a more reliable framework for evaluating the performance of bioinformatics algorithms than deterministic models. The ability to prescribe specific graphlet frequency bounds ensures that synthetic networks retain the functional motif signatures that underpin biological realism and help explain the underlying logic of cellular signaling. These findings highlight the necessity of accounting for interaction uncertainty to prevent the derivation of misleading conclusions in network science research. The developed tool serves as a practical and efficient solution for researchers requiring high-fidelity datasets for large-scale network benchmarking. Future applications of this work could involve extending the generative constraints to include higher-order subgraphs or more complex probabilistic weighting schemes. This research provides a significant advancement in the synthesis of realistic networks that balance local connectivity patterns with global stochastic properties.

Based on the study's findings, the derived graphlet frequency bounds serve as steering parameters that constrain the incremental growth process, ensuring the final synthetic topology replicates the specific three-node and four-node subgraph distributions found in probabilistic target networks.
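How a bound can steer incremental growth is sketched below in a deliberately simplified form: random edges are proposed one at a time, and a proposal is rejected if it would push the triangle count above a prescribed upper bound. The function `grow_with_triangle_bound`, its parameters, and the rejection rule are assumptions made for illustration. The study constrains all three-node and four-node graphlets as well as the degree distribution, and its actual growth procedure is not reproduced here.

```python
import random


def grow_with_triangle_bound(n_nodes, n_edges, tri_upper, seed=0, max_tries=100_000):
    """Grow a random graph edge by edge, never exceeding `tri_upper` triangles.

    Hypothetical sketch: returns (adjacency dict, triangle count, edge count).
    Growth stops early if `max_tries` proposals are exhausted.
    """
    rng = random.Random(seed)
    adj = {i: set() for i in range(n_nodes)}
    triangles = 0
    edges = 0
    tries = 0
    while edges < n_edges and tries < max_tries:
        tries += 1
        u, v = rng.sample(range(n_nodes), 2)
        if v in adj[u]:
            continue  # edge already present
        # Local update: new triangles equal the number of common neighbors.
        new_triangles = len(adj[u] & adj[v])
        if triangles + new_triangles > tri_upper:
            continue  # proposal would violate the bound, so reject it
        adj[u].add(v)
        adj[v].add(u)
        triangles += new_triangles
        edges += 1
    return adj, triangles, edges


adj, triangles, edges = grow_with_triangle_bound(30, 60, tri_upper=5, seed=1)
print(edges, triangles)
```

Because each accept-or-reject decision uses only the common neighbors of the two endpoints, the bound can be enforced at every step without recounting graphlets over the whole graph.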

Based on this study's findings, the algorithm generated graphs with more than 10,000 edges in under one hour while maintaining precise control over the frequencies of all three-node and four-node graphlets.

The researchers utilized an incremental growth approach because it allows for an efficient graphlet counting method where updates are performed in a time independent of the total node number on sparse graphs.

The researchers observed that while the algorithm is efficient for sparse graphs, the total computation times strongly depend on the specific size of the graphlets being considered during the incremental modification steps.

The authors state that modeling biological interactions as uncertain events is necessary because ignoring this uncertainty in practice can lead to incorrect conclusions about the fundamental properties of biological networks.