Random access and semantic search in DNA data storage enabled by Cas9 and machine-guided design
Summary
This summary is machine-generated. CRISPR-Cas9 enables fast, selective DNA data retrieval for digital storage. This method simplifies data access and reduces costs, paving the way for efficient molecular data extraction.
Area Of Science
- Biotechnology
- Bioinformatics
- Molecular Biology
Background
- DNA offers high data density and longevity for digital storage.
- Efficient data retrieval is crucial for practical DNA storage systems.
- Current methods for molecular data extraction can be complex and time-consuming.
Purpose Of The Study
- To introduce CRISPR-Cas9 as a tool for multiplexed, low-latency molecular data extraction.
- To develop a user-friendly system for selective data retrieval from DNA.
- To enable efficient searching and retrieval of data stored in DNA.
Main Methods
- Developed a one-pot, multiplexed random access method using CRISPR-Cas9 for targeted DNA cleavage.
- Utilized nanopore sequencing for data retrieval after CRISPR-Cas9 cleavage.
- Combined machine learning (deep neural network) with Cas9-based retrieval for molecular similarity search.
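The selection step behind Cas9-based random access can be illustrated in software. The sketch below is not the study's pipeline: it assumes a hypothetical strand layout in which each file carries a 20-nt address immediately followed by an NGG PAM (the motif recognized by S. pyogenes Cas9), it checks only the forward strand, and it requires an exact address match. The names `find_target`, `random_access`, and the example sequences are invented for illustration.

```python
import re

# SpCas9 recognizes an NGG PAM directly 3' of the target sequence.
PAM = re.compile(r"[ACGT]GG")

def find_target(strand: str, guide: str) -> int:
    """Return the index of the first exact guide match followed by an NGG PAM, or -1."""
    start = strand.find(guide)
    while start != -1:
        pam = strand[start + len(guide): start + len(guide) + 3]
        if PAM.fullmatch(pam):
            return start
        start = strand.find(guide, start + 1)
    return -1

def random_access(pool, guides):
    """Multiplexed selection: map each guide name to the strands it would target."""
    hits = {name: [] for name in guides}
    for strand in pool:
        for name, guide in guides.items():
            if find_target(strand, guide) != -1:
                hits[name].append(strand)
    return hits

# Hypothetical 20-nt file addresses (assumptions, not sequences from the study).
guide_a = "ACGTACGTACGTACGTACGT"
guide_b = "TTGACCATGGCATTAGCCAA"

s1 = "GATC" + guide_a + "TGG" + "AAAACCCC"  # file A: address plus valid PAM
s2 = "GATC" + guide_b + "AGG" + "CCCCTTTT"  # file B: address plus valid PAM
s3 = "GATC" + guide_a + "TTT" + "GGGGAAAA"  # address present, but no NGG PAM

hits = random_access([s1, s2, s3], {"file_a": guide_a, "file_b": guide_b})
```

Because every guide is evaluated against the same pool in a single pass, the loop loosely mirrors the one-pot, multiplexed idea: several files are selected at once, and a strand without a valid PAM (`s3`) is left untouched.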
Main Results
- Validated the CRISPR-Cas9 approach on 1.6 million DNA sequences (25 unique data files).
- Successfully mapped 1.74 million images into a reduced-dimensional embedding encoded as Cas9 target sequences.
- Demonstrated high-fidelity retrieval of molecular addresses for semantically related image clusters using Cas9's off-target activity.
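The idea of using off-target activity for similarity search can be sketched with a deliberately simplified model. The example below assumes that a guide retrieves any address within a fixed number of mismatches (Hamming distance); real Cas9 mismatch tolerance depends on mismatch position and sequence context, and the study's learned embedding is not reproduced here. The function names, the `max_mismatches` threshold, and the example addresses are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def similarity_search(query: str, addresses: dict, max_mismatches: int = 2):
    """Return names of addresses within max_mismatches of the query, closest first.

    Simplifying assumption: every mismatch counts equally, unlike real Cas9 binding.
    """
    scored = sorted((hamming(query, addr), name) for name, addr in addresses.items())
    return [name for dist, name in scored if dist <= max_mismatches]

# Hypothetical 20-nt addresses: related items get nearby sequences.
addresses = {
    "cat_1": "ACGTACGTACGTACGTACGT",  # identical to the query
    "cat_2": "ACGTACGTACGTACGTACGA",  # 1 mismatch
    "dog_1": "ACGTTCGTACCTACGTAGGT",  # 3 mismatches
    "car_1": "TTTTGGGGCCCCAAAATTTT",  # 15 mismatches
}

query = "ACGTACGTACGTACGTACGT"
related = similarity_search(query, addresses, max_mismatches=2)
```

With this toy threshold the query pulls back both "cat" addresses and skips the others, which is the behavior a similarity-preserving encoding aims for: semantically related items map to target sequences that a single guide can tolerate.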
Conclusions
- CRISPR-Cas9 provides a simplified and rapid method for DNA data retrieval.
- Together, the one-pot random access method and the Cas9-based similarity search broaden how data stored in DNA can be accessed.
- These advancements address key challenges in DNA-based data storage and retrieval.

