Comprehensive genome analysis and variant detection at scale using DRAGEN

Affiliations
  • 1 Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
  • 2 Illumina, Inc., San Diego, CA, USA. scatreux@illumina.com.
  • 3 Illumina, Inc., San Diego, CA, USA.
  • 4 Illumina, Inc., San Diego, CA, USA. jhan6@illumina.com.
  • 5 Illumina, Inc., San Diego, CA, USA. rmehio@illumina.com.
  • 6 Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA. fritz.sedlazeck@bcm.edu.
  • 7 Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA. fritz.sedlazeck@bcm.edu.
  • 8 Department of Computer Science, Rice University, Houston, TX, USA. fritz.sedlazeck@bcm.edu.

Abstract

Research and medical genomics require comprehensive, scalable methods for the discovery of novel disease targets, evolutionary drivers and genetic markers with clinical significance. This necessitates a framework to identify all types of variants independent of their size or location. Here we present DRAGEN, which uses multigenome mapping with pangenome references, hardware acceleration and machine learning-based variant detection to provide insights into individual genomes, with ~30 min of computation time from raw reads to variant detection. DRAGEN outperforms current state-of-the-art methods in speed and accuracy across all variant types (single-nucleotide variations, insertions or deletions, short tandem repeats, structural variations and copy number variations) and incorporates specialized methods for analysis of medically relevant genes. We demonstrate the performance of DRAGEN across 3,202 whole-genome sequencing datasets by generating fully genotyped multisample variant call format files and demonstrate its scalability, accuracy and innovation to further advance the integration of comprehensive genomics. Overall, DRAGEN marks a major milestone in sequencing data analysis and will provide insights across various diseases, including Mendelian and rare diseases, with a highly comprehensive and scalable platform.
