๐ฎ
๐ฎ
The Ethereal
Cover Your Bases: How to Minimize the Sequencing Coverage in DNA Storage Systems
May 09, 2023 ยท The Ethereal ยท ๐ International Symposium on Information Theory
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Daniella Bar-Lev, Omer Sabary, Ryan Gabrys, Eitan Yaakobi
arXiv ID
2305.05656
Category
cs.DM: Discrete Mathematics
Cross-listed
cs.IT,
math.PR
Citations
21
Venue
International Symposium on Information Theory
Last Checked
2 months ago
Abstract
Although the expenses associated with DNA sequencing have been rapidly decreasing, the current cost of sequencing information stands at roughly $120/GB, which is dramatically more expensive than reading from existing archival storage solutions today. In this work, we aim to reduce not only the cost but also the latency of DNA storage by initiating the study of the DNA coverage depth problem, which aims to reduce the required number of reads to retrieve information from the storage system. Under this framework, our main goal is to understand the effect of error-correcting codes and retrieval algorithms on the required sequencing coverage depth. We establish that the expected number of reads that are required for information retrieval is minimized when the channel follows a uniform distribution. We also derive upper and lower bounds on the probability distribution of this number of required reads and provide a comprehensive upper and lower bound on its expected value. We further prove that for a noiseless channel and uniform distribution, MDS codes are optimal in terms of minimizing the expected number of reads. Additionally, we study the DNA coverage depth problem under the random-access setup, in which the user aims to retrieve just a specific information unit from the entire DNA storage system. We prove that the expected retrieval time is at least k for [n,k] MDS codes as well as for other families of codes. Furthermore, we present explicit code constructions that achieve expected retrieval times below k and evaluate their performance through analytical methods and simulations. Lastly, we provide lower bounds on the maximum expected retrieval time. Our findings offer valuable insights for reducing the cost and latency of DNA storage.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Discrete Mathematics
๐ฎ
๐ฎ
The Ethereal
An Introduction to Temporal Graphs: An Algorithmic Perspective
๐ฎ
๐ฎ
The Ethereal
Guarantees for Greedy Maximization of Non-submodular Functions with Applications
๐ฎ
๐ฎ
The Ethereal
A note on the triangle inequality for the Jaccard distance
๐ฎ
๐ฎ
The Ethereal
Fast clique minor generation in Chimera qubit connectivity graphs
๐ฎ
๐ฎ
The Ethereal