Oxford Nanopore MinION – USB stick-sized DNA sequencer

dnautics · on June 14, 2014

The error rate in the "good part" is about 15%. That is quite frankly very very very poor. This might be a good alternative to pacbio, (unless the other technologies can increase their read lengths).

Typically (I chatted with my contacts at the Venter Institute)... best results are gotten by making long reads using pacbio or (maybe minion, they are looking into that currently) and using that to generate the scaffold and miseq/iontorrent to fill in the rest.

Honestly in my opinion (and I have been saying this for years) Pacbio and minion will never get over their error rate for "basic chemistry" reasons. These techniques use single-molecule sequencing and that is fraught with problems. Sometimes, it's better to average over a large population.

dnautics · on June 14, 2014

More information: For personal (as in human genome) sequencing, I'm not entirely sure how useful this is... We already have a scaffold for the human genome (that would be Mr. Venter himself) and MAYBE you could get some haploid information out of it, but you'd worry that the base pair you care about for any given individual is in one of the wrong stretches.

For organism sequencing, I think this could be even worse than pacbio; there are modestly sized gaps in the sequence compared to relative to the known pseudomonas sequence; imputing the correct sequence when you have no available template is going to be a nightmare. It's easy to say "15% error" when you know what the 100% correct result is. But if you have no reference correct sequence and all of your parts are up to 15% wrong, the difficulty is compounded.

owlmonkey · on June 15, 2014

Ya the MinIon is more of a DNA sensor than a sequencer in its current incarnation, but awesome for portability. Super expensive per base though. If they can't improve the cost significantly along with the accuracy it will never be a competitive sequencer.

You're wrong about the PacBio though, the raw reads are random and cancel out easily with consensus. PacBio is the least biased and therefore the most consensus accurate next-gen sequencer available. See my other comments for some citations.

petercooper · on June 14, 2014

Naive idiot alert.. could you process the "same" DNA numerous times then take the mode in each case? Or are the errors essentially ones that would be repeated each time?

mbreese · on June 14, 2014

Actually PacBio does just that to get better a better error rate. They basically have X amount of sequencing that can be done. You can spend that X however you'd like. If you want a sequence that is X long, you'll have a higher error rate. If you want a chunk that is X/10 long, you can circularize it, and thus sequence it ten times. This gets you better accuracy with the redundancy. DNA is pretty robust.

In practice though, even with these "circular consensus sequencing" reads, the error model is significantly higher than other technologies.

owlmonkey · on June 15, 2014

One doesn't need CCS anymore, though it is still an option for shorter insert lengths. The errors are more random than other platforms, so they resolve easily with consensus. You just need 20x coverage or more, depending upon the application. With enough coverage you generally get better accuracy than any other platform because it's the least biased technology.

CCS still works really well if you want incredibly high accuracy. See this paper and figure 3 for Q90 quality reads: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3811116/

But most people don't need that for their applications. So just regular consensus using PacBio data alone is sufficient for excellent (Q60 or better) consensus accuracy: https://github.com/PacificBiosciences/GenomicConsensus/blob/...

sillywabbit · on June 15, 2014

I don't think they push CCS much anymore. The focus seems to be on generating long reads and then using Quiver to call the consensus (https://github.com/PacificBiosciences/GenomicConsensus/blob/...).

mbreese · on June 15, 2014

That's probably wise of them. The last time I looked at data from them, the CCS reads were still very noisy. It's better to rely on your strengths. In their case, it's long reads.

sillywabbit · on June 15, 2014

Yup. Just keeps getting better (https://pbs.twimg.com/media/BpUs_MAIYAEHb2o.png:large).

dekhn · on June 15, 2014

Is there an accuracy problem in PacBio's published de-novo assembly human genome: http://blog.pacificbiosciences.com/2014/02/data-release-54x-...

it was assembled de-novo (IE, without HuRef), and the accuracy stats aren't horrible.

owlmonkey · on June 15, 2014

PacBio is ultimately more accurate than Illumina given consensus. Illumina has some systemic bias so the ultimate accuracy tops out around Q40-Q50 (same errors just repeat themselves) and PacBio will get you closer to Q60-Q70 because the random errors resolve out. So it's the most accurate platform actually. This singuler MinIon read however was barely mappable to the reference, so it looks pretty poor quality. But without more reads we don't quite yet know what the systemic error profile will be though for it. That's why they're adding more pore types, they say, to get multiple systemic error profiles to try to cancel some out.

bayesianhorse · on June 19, 2014

All depends on the application and the price per base. If it's cheap enough, you can average over more reads, even with higher error rates. The error rate should come down though. I guess the price will also come down a lot.

I think the single molecule approach has both advantages and disadvantages. Also I think that Illumina's technology in some sense also only reads parts of single molecules, and the averaging is done later in-silico, much as the nanopore would do also.

ihodes · on June 14, 2014

The error rate of 15% (I actually believe it's more like 30% in practice, which is 2x PacBio's) is actually excellent and comparable to PacBio's in that the error is primarily randomly distributed, and not systematic like current Illumina or IonTorrent systems (which have admittedly lower error rates, at 1-3%). The short of it is, absolute error rate is not the whole story.

When looking at what ends up being just a massive string of 4 characters, having 15% error on a particular read doesn't matter too much; you end up getting many reads (fragments of a full sequence) which overlap the same area of the genome. With enough reads (and having many, say 15-100x coverage as it's called, is not uncommon practice in sequencing), this error is obviated by consensus. The long read length is especially useful for this, as the lengths which overlap are much greater than with current NSG reads of 150-200bps.

Additionally, I think the goal of the MinION were more to display a sort of disposable (sub-$500) machine which could be used on-site and without sample-prep. A forensic kit, really (though probably more for creatures such as bacteria and fungus rather than humans—I recall that it would take 2-5 MinIONs to sequence a human genome to even a short depth, but I may be misremembering).

In short, this is a cool proof-of-concept for Oxford Nanopore, as their technology has been talked about for a long time with nothing to show until now. It will be interesting to see where it goes next, particularly if the reads lengths keep climbing and the error profile drops even more.

dkural · on June 15, 2014

I entirely disagree. Illumina actual error rates (as opposed to base quality) is less systematic than current Oxford Nanopore reads, which there are admittedly very few in public.

Pac Bio SMRT kits are super cheap and highly multiplexed (ie produce a lot more data) in comparison to Oxford Nanopore. The error rates are lower, average read lengths higher.

I don't see any large market so far for ONP. Of course with better data and lower error rates things might change..

pcrh · on June 14, 2014

As it stands, it looks good enough for identification purposes (for example of a pathogen), but nowhere near good enough for genetic studies where single base pair variations are the most sought-after aspect.

For human genetic studies, I guess it also depends on how readily multiple reads can be obtained, and whether the errors are randomly distributed.

owlmonkey · on June 15, 2014

Totally agree. It is at best a semi-portable sensor (with wet prep) not a sequencer. Cost would be another big obstacle for using MinIon on human data. At about 100 megabases and $1000 per cell that's three orders of magnitude more expensive per base than a HiSeq X. (edit: spelling)

tkinom · on June 14, 2014

Very nice! I worked on the firmware for the "Applied Biosystems ABI Prism 310 DNA Sequencer". I remembered it cost >$60K each when it was released ~1996, 1997 time frame.

It is good to see DNA sequencer tech progress faster than Semiconductor.

kanzure · on June 15, 2014

> I worked on the firmware for the "Applied Biosystems ABI Prism 310 DNA Sequencer".

Do you have copies of that firmware or know where I could pick it up? I'd like to poke around, try some reverse engineering, etc.

tkinom · on June 17, 2014

Sorry, it was not open source. BTW, it is just standard embedded firmware that control 3 axis motors + pump, CCD camera, etc. Highly couple to that particular instrument' HW. I am not sure what you can learn from it even if it was open source.

waiquoo · on June 14, 2014

I would like to see what the real raw data looks like. There doesn't look to be any noise in that wiggle plot and the signal has already been processed to a degree (see the hard edges and squareness of the features). That's really the hard part though, going from raw to digital, and the decisions that are made regarding how that transformation is achieved will have a lot to do with the quality of the data.

facepalm · on June 14, 2014

What does it mean? I don't suppose I can put a drop of blood onto that sequencer and receive a sequence?

hyperbovine · on June 14, 2014

No, it does not. You still need a wetlab to do sample preparation. There are a couple reasons why this is exciting though. Oxford Nanopore has seemed like bogus vaporware for several years now, and many people were skeptical that they had a viable product. This data release is the strongest suggestion yet that the thing actually works (but is still in no way conclusive). Two, assuming it does work, being able to sequence 5-10kb fragments will be a game changer for decoding complex structural variation and repetitive regions of the genome. Current sequencing technology (read lengths on the order of 100-200bp) is simply not up to the task, and yet tons of interesting phenomena are believed to be hiding out in those regions. For example, there is mounting evidence that telomere shortening is strongly associated with aging. Existing NGS machines are powerless to interrogate telomeres in humans, because they consist of several kilobases of the repetitive sequence TTAGGG. If Oxford Nanopore's claims hold true you could in theory sequence the whole thing in one go. Being able to do this would advance our understanding of aging by leaps and bounds, I suspect.

sitkack · on June 14, 2014

How portable is a "wetlab"? Could I take this with me into the jungles of south america or the se asia?

rcthompson · on June 14, 2014

A wetlab doesn't refer to any specific set of equipment. It refers to all the stuff that you do that involves transferring liquids between test tubes via pipets, heating and cooling them, etc., in contrast to "dry lab", which refers to computational data anlaysis. Depending on the type of sample, the wetlab preparation will vary. For example, if you have an extremely small amount of starting material (e.g. you want to sequence the DNA of individual cells), the prep will be different, or if you want to sequence RNA from blood, you need a step to remove all the mRNA that encodes hemoglobin, which is 50-75% of the total mRNA.

I do dry lab exclusively, so I'm not sure what the most bsaic sample prep procedure consists of, and whether the equipment needed for that prep can be made portable. But in the field, you're going to have to contend with contamination issues, since tiny amounts of contaminating DNA can mess things up big time. I don't think you can just carry around a portable autoclave, so maybe the solution would be individually-wrapped sterile prep kits, maybe with the possibility to sterilize, reset, and re-wrap them back at home.

bayesianhorse · on June 19, 2014

It might be possible to do the preparation steps with portable gear. "Classic" DNA extraction with multiple rounds of electrophoresis or PCR aren't strictly necessary for all applications.

Still, it would be easier to carry the sample back with you rather than carry the equipment into the djungle.

On the other hand a lab in a developing country might have less trouble using the MinION rather than sending away the samples for sequencing.

pcrh · on June 14, 2014

A wet lab is a lab as you might imagine it on CSI, i.e. the whole nine yards.

Sometimes, components of this can be miniaturized to create a "lab on a chip", but that takes another level of technological development.

rflrob · on June 14, 2014

In theory you could, at least according to the hype. You would probably be better off taking a drop of blood, mixing with various reagents to lyse the cells and lightly fragment the DNA, but that's the dream.

Of course, thanks to the accelerated Moore's law that applies to sequencing over the last couple decades [1], it'll probably be the case that the computer power will be more expensive than the actual sequencing.

[1] http://www.genome.gov/sequencingcosts/

owlmonkey · on June 15, 2014

Totally agree with you on the computation costs.

In some slides in February ONT hinted at why they had to go to a full wet sample prep and it related to input DNA amounts I believe. Their new sample prep they said allowed them to reduce their input DNA amounts to similar amounts as competitor sequencers, but without that new prep it took 10,000x more DNA to load. I guess raw DNA just doesn't load into the nanopores very easily, so it took that much higher concentration. Two years ago they claimed you could load raw blood into a chip after a 5 minute prep, but I'm guessing they will back away from that claim now since this wet chemistry is needed to get reasonable loading.

This to me implies that all nanopore technology under development will encounter similar problems. They really need biological or magnetic bead processes to "load" DNA through the pores in one form or another - or else need prohibitively high DNA concentrations and amounts - so that will mean wet chemistry.

davecap1 · on June 14, 2014

Looks like we're getting pretty close to solving the sequencing problem... the next massive challenge is in using genetic code to actually understand health. If you're an engineer looking for some meaningful work, SolveBio wants to hear from you!