Open preprint reviews by Yaniv Erlich

Portable and Error-Free DNA-Based Data Storage

S. M. Hossein Tabatabaei Yazdi, Ryan Gabrys, Olgica Milenkovic

Review posted on 06th October 2016

This is an interesting manuscript that considers the possibility of retrieving information from DNA storage using Oxford Nanopore sequencing. The authors synthesized 17 gBlocks, each about 1000nt long, and encoded 3.5Kbyte of data. By sequencing each gBlock about 200 times on ONT, building a consensus sequence, and applying several error-correcting strategies, they reached zero errors.

One question is the scalability of this method. The method hinges on synthesizing long DNA fragments using the IDT gBlock technology. According to the authors, the ~17000bp of gBlock DNA cost over $2000. This price is approximately 400x higher than that of array-based oligo synthesis and translates to nearly $1 for storing 10bits of information, meaning that one would need an R01 to store a 1Mbyte file. In addition, the method offers no guarantees against oligo dropouts. While PCR amplification of 17 gBlocks is simple, amplifying tens of thousands of oligos is error prone and likely to leave a small number of sequences unrepresented in the final reaction.
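
The cost argument above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes 1 Kbyte = 1000 bytes and 1 Mbyte = 10^6 bytes, and it takes $2000 as the synthesis cost even though the manuscript reports "over $2000", so the results are lower bounds. The function and variable names are illustrative and do not come from the manuscript.

```python
# Figures quoted in the review; the unit conventions are assumptions.
SYNTHESIS_COST_USD = 2000.0   # "over $2000", so treated as a lower bound
DATA_BYTES = 3.5 * 1000       # 3.5 Kbyte, assuming 1 Kbyte = 1000 bytes


def cost_per_bit(total_cost_usd: float, data_bytes: float) -> float:
    """Synthesis cost in USD per stored bit."""
    return total_cost_usd / (data_bytes * 8)


def cost_to_store(n_bytes: float, usd_per_bit: float) -> float:
    """Cost in USD to store n_bytes at a fixed per-bit price (no economies of scale)."""
    return n_bytes * 8 * usd_per_bit


usd_per_bit = cost_per_bit(SYNTHESIS_COST_USD, DATA_BYTES)
cost_10_bits = 10 * usd_per_bit
cost_1_mbyte = cost_to_store(1_000_000, usd_per_bit)  # assuming 1 Mbyte = 10**6 bytes

print(f"Cost per 10 bits: ${cost_10_bits:.2f}")    # about $0.71 at the lower bound
print(f"Cost per 1 Mbyte: ${cost_1_mbyte:,.0f}")   # about $571,429 at the lower bound
```

At the $2000 lower bound this gives roughly $0.71 per 10 bits and more than half a million dollars per Mbyte, which is in line with the "nearly $1" and R01-scale figures once the actual price above $2000 is taken into account.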

Another issue is the claim that the encoding strategy is superior in terms of information density (bit/nt) to previous publications. It is important to note that the coding potential is actually smaller than that of other methods: it reaches only 1.67bit/nt, which is below the Grass et al. technique (1.7bit/nt) and our work (1.98bit/nt). The increase in density is achieved solely because the gBlocks are very long relative to the barcode. However, translating the presented method to the lengths of array-synthesized oligos (200nt) yields only 1.49bit/nt and no protection against oligo dropouts. Thus, any conclusion about the density of the strategy compared to Illumina-based methods needs more careful attention.
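
To illustrate why fragment length matters for this comparison, the sketch below models net density as the coding potential scaled by the fraction of each fragment that carries payload. It is a simplified model, not the calculation behind the 1.49bit/nt figure above: the fixed per-fragment overhead of 20nt (barcode or address) is a purely hypothetical value chosen for illustration, and the function name is not taken from the manuscript.

```python
def net_density(coding_potential: float, fragment_len: int, overhead_nt: int) -> float:
    """Net bits per nucleotide when overhead_nt positions of each fragment
    carry no payload (for example a barcode) and the rest is encoded at
    coding_potential bits per nucleotide."""
    if not 0 <= overhead_nt < fragment_len:
        raise ValueError("overhead must be non-negative and shorter than the fragment")
    return coding_potential * (fragment_len - overhead_nt) / fragment_len


CODING_POTENTIAL = 1.67   # bit/nt, as quoted in the review
OVERHEAD_NT = 20          # hypothetical fixed overhead, NOT from the manuscript

for length in (200, 1000):
    density = net_density(CODING_POTENTIAL, length, OVERHEAD_NT)
    print(f"{length}nt fragments: {density:.3f} bit/nt")
```

Under this assumption the same fixed overhead costs 10% of a 200nt oligo but only 2% of a 1000nt gBlock, so a density advantage can come from fragment length alone rather than from the code itself.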
