Shared posts

05 Jan 09:08

Blockchain, Data Storage: The Future Is Decentralized

Dataconomy | As we look ahead to a new, decentralized internet, it is important to consider one of its most important elements: decentralized storage.
04 Aug 10:01

Bioinformatics vs Computational Biology

Diamond Age Data Science | The world of quantitative biology is large, diffuse and sometimes overwhelming. It's hard sometimes to even figure out what someone means when they say "bioinformatics". This can make it hard to figure out what part of the field someone works in.
04 Aug 10:01

First Human Embryos Edited in U.S.

MIT Technology Review | Researchers have demonstrated they can efficiently improve the DNA of human embryos.
16 Jan 08:27

HiSeq move over, here comes Nova! A first look at Illumina NovaSeq

by biomickwatson

Illumina have announced NovaSeq, an entirely new sequencing system that completely disrupts their existing HiSeq user-base.  In my opinion, if you have a HiSeq and you are NOT currently planning to migrate to NovaSeq, then you will be out of business in 1-2 years’ time.  It’s not quite the death knell for HiSeqs, but it’s pretty close, and moving to NovaSeq over the next couple of years is now the only viable option if you see Illumina as an important part of your offering.

Illumina have done this before, it’s what they do, so no-one should be surprised.

The stats

I’ve taken the stats from the spec sheet linked above and produced the following.  If there are any mistakes let me know.

There are two machines – the NovaSeq 5000 and 6000 – and 4 flowcell types – S1, S2, S3 and S4.  The 6000 will run all four flowcell types and the 5000 will only run the first two.  Not all flowcell types are immediately available, with S4 scheduled for 2018 (see below).

Flowcell / instrument         S1       S2       S3       S4       2500 HO  4000     X
Reads per flowcell (billion)  1.6      3.3      6.6      10       2        2.8      3.44
Lanes per flowcell            2        2        4        4        8        8        8
Reads per lane (million)      800      1650     1650     2500     250      350      430
Throughput per lane (Gb)      240      495      495      750      62.5     105      129
Throughput per flowcell (Gb)  480      990      1980     3000     500      840      1032
Total lanes                   4        4        8        8        16       16       16
Total flowcells               2        2        2        2        2        2        2
Run throughput (Gb)           960      1980     3960     6000     1000     1680     2064
Run time (days)               2-2.5    2-2.5    2-2.5    2-2.5    6        3.5      3

For X Ten, simply multiply X figures by 10.  These are maximum figures, and assume maximum read lengths.
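As a rough cross-check, the throughput rows of the table can be rebuilt from the reads-per-lane and lane counts. This is a minimal sketch, not anything from the spec sheet: the function name and the bases-per-read-pair values are assumptions, inferred from the table itself (300 for 2×150bp; 250 for the 2500 HO, which is what 62.5 Gb from 250 million reads implies).

```python
# Per-platform figures taken from the table: (reads per lane in millions,
# lanes per flowcell, bases per read pair).
# ASSUMPTION: the bases-per-pair values are inferred from the table's own
# numbers (e.g. 240 Gb / 800 M reads = 300 bases), not quoted from Illumina.
PLATFORMS = {
    "S1": (800, 2, 300),
    "S2": (1650, 2, 300),
    "S3": (1650, 4, 300),
    "S4": (2500, 4, 300),
    "2500 HO": (250, 8, 250),
    "4000": (350, 8, 300),
    "X": (430, 8, 300),
}
FLOWCELLS_PER_RUN = 2  # every column in the table lists 2 flowcells per run


def run_throughput_gb(reads_per_lane_million, lanes_per_flowcell,
                      bases_per_pair, flowcells=FLOWCELLS_PER_RUN):
    """Maximum run throughput in Gb from per-lane figures."""
    # millions of reads x bases per read pair = megabases; / 1000 gives Gb
    gb_per_lane = reads_per_lane_million * bases_per_pair / 1000
    return gb_per_lane * lanes_per_flowcell * flowcells


if __name__ == "__main__":
    for name, spec in PLATFORMS.items():
        print(f"{name}: {run_throughput_gb(*spec):g} Gb per run")
```

Under those assumptions every "Run throughput" value in the table is reproduced, which suggests the table is internally consistent.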

Read lengths available on NovaSeq are 2×50, 2×100 and 2×150bp.  This is unfortunate as the sweet spot for RNA-Seq and exomes is 2×75bp.

As you can see from the stats, the massive innovation here is the cluster density, which has hugely increased. We also have shorter run times.

So what does this all mean?

Well let’s put this to bed straight away – HiSeq X installations are still viable.  This from an Illumina tech on Twitter:


We learn two things from this – first, that HiSeq X is still going to be cheaper for human genomes until S4 comes out, and S4 won’t be out until 2018.

So Illumina won’t sell any more HiSeq X, but current installations are still viable and still the cheapest way to sequence genomes.

I also have this from an un-named source:

speculation from Illumina rep “X’s will be king for awhile. Cost per GB on those will likely be adjusted to keep them competitive for a long time.”

So X is OK, for a while.

What about HiSeq 4000? Well to understand this, you need to understand 4000 and X.

The HiSeq 4000 and HiSeq X

First off, the HiSeq X IS NOT a human genome only machine.  It is a genome-only machine.  You have been able to do non-human genomes for about a year now.  Anything you like as long as it’s a whole genome and it’s 30X or above.  The 4000 is reserved for everything else because you cannot do exomes, RNA-Seq, ChIP-Seq etc on the HiSeq X.  HiSeq 4000 reagents are more expensive, which means that per-Gb every assay is more expensive than genome sequencing on Illumina.

However, no such restrictions exist on the NovaSeq – which means that every assay will now cost the same on NovaSeq.   This is what led me to say this on Twitter:

At Edinburgh Genomics, roughly speaking, we charge approx. 2x as much for a 4000 lane as we do for an X lane.  Therefore, per Gb, RNA-Seq is approx. twice as expensive as genome sequencing.  NovaSeq promises to make this per-Gb cost the same, so does that mean RNA-Seq will be half price?  Not quite.  Of course no-one does a whole lane of RNA-Seq; we multiplex multiple samples in one lane.  When you do this, library prep costs begin to dominate, and for most of my own RNA-Seq samples, library prep is about 50% of the per-sample cost, and 50% is sequencing.  NovaSeq promises to halve the sequencing costs, which means the per-sample cost will come down by 25%.
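The 25% figure is just the sequencing share of the per-sample cost multiplied by the cut in sequencing cost. A minimal sketch of that arithmetic, using the rough 50/50 split quoted above (the function and parameter names are made up for illustration):

```python
def per_sample_saving(sequencing_share, sequencing_cost_cut):
    """Fractional drop in total per-sample cost.

    sequencing_share: fraction of per-sample cost that is sequencing
    sequencing_cost_cut: fractional reduction in the sequencing cost
    Library prep is assumed to cost the same before and after.
    """
    return sequencing_share * sequencing_cost_cut


# Rough numbers from the text: sequencing is ~50% of the per-sample cost,
# and NovaSeq is expected to halve it.
saving = per_sample_saving(sequencing_share=0.5, sequencing_cost_cut=0.5)
print(f"Per-sample cost falls by {saving:.0%}")
```

The same function makes it easy to see that the more library prep dominates a sample's cost, the less a cheaper sequencer matters.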

These are really rough numbers, but they will do for now.  To be honest, I think this will make a huge difference to some facilities, but not for others.  Larger centers will absolutely need to grab that 25% reduction to remain competitive, but smaller, boutique facilities may be able to ignore it for a while.

Capital outlay

Expect to pay $985k for a NovaSeq 6000 and $850k for a 5000.

Time issues

One supposedly big advantage is that NovaSeq takes 40 hours to run, compared to the existing 3 days for a HiSeq X.   Comparing like with like that’s 40 hours vs 72 hours.  This might be important in the clinical space, but not for much else.

Putting this in context, when you send your samples to a facility, they will be QC-ed first, then put in the library prep queue, then put in the sequencing queue, then QC-ed bioinformatically before finally being delivered.  Let’s be generous and say this takes 2 weeks.  Of that, sequencing time is 3 days.  So instead of waiting 14 days, you’re waiting 13 days.  Who cares?

Clinically having the answer 1 day earlier may be important, but let’s not forget, even on our £1M cluster, at scale the BWA+GATK pipeline itself takes 3 days.  So again you’re looking at 5 days vs 6 days.  Is that a massive advantage?  I’m not sure.  Of course you could buy one of the super-fast bioinformatics solutions, and maybe then the 40 hour run time will count.
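The turnaround comparison can be written out explicitly. This is a back-of-envelope sketch using only the figures quoted above (a 2 week facility turnaround that includes 72 hours of sequencing, and a 3 day BWA+GATK pipeline); the function name is hypothetical.

```python
HOURS_PER_DAY = 24


def turnaround_days(other_steps_days, sequencing_hours):
    """Total turnaround in days: everything else plus the sequencing run."""
    return other_steps_days + sequencing_hours / HOURS_PER_DAY


# Facility case: 14 days in total, of which 3 days (72 h) is sequencing,
# so the other steps (QC, queues, library prep, delivery) take 11 days.
facility_hiseq_x = turnaround_days(11, 72)
facility_novaseq = turnaround_days(11, 40)

# Clinical case: sequencing followed by a 3 day BWA+GATK pipeline.
clinical_hiseq_x = turnaround_days(3, 72)
clinical_novaseq = turnaround_days(3, 40)

print(f"Facility: {facility_hiseq_x:.1f} vs {facility_novaseq:.1f} days")
print(f"Clinical: {clinical_hiseq_x:.1f} vs {clinical_novaseq:.1f} days")
```

The saving is 32 hours in both cases, which is where the rounded "14 vs 13 days" and "6 vs 5 days" comparisons come from.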

Colours and quality

NovaSeq marks a switch from the traditional HiSeq 4 colour chemistry to the quicker NextSeq 2 colour chemistry.  As Brian Bushnell has noted on this blog, NextSeq data quality is quite a lot worse than HiSeq 2500, so we may see a dip in data quality, though Illumina claim 85% above Q30.


10 Jan 20:01

Illumina Unveils HiSeq Successor NovaSeq

by Keith Robison
At today's J.P. Morgan Healthcare Conference Illumina made a number of small announcements -- some new partnerships, Firefly on track for launch later this year, launch of the single cell workflow partnered with Bio-Rad.  Then CEO Francis deSouza dropped the big news: a new high-end sequencer architecture to ultimately replace all of the HiSeq instruments.  It sounds like an interesting evolution of the Illumina product line, but unfortunately too many headlines and tweets have focused on a distant goal of $100 human genomes.  Worse, not only did some commentators misconstrue the announcement as delivering on $100 genomes, but some also touted a sequencing speed of one hour for a genome which isn't remotely true.

Read more »
07 Jan 14:24

BioTeam’s Berman Charts 2017 HPC Trends in Life Sciences

HPCwire | Twenty years ago high performance computing was nearly absent from life sciences. Today it's used throughout life sciences and biomedical research. Genomics and the data deluge from modern lab instruments are the main drivers, but so is the longer-term desire to perform predictive simulation in support of Precision Medicine (PM).
05 Jan 08:37

A Genetically Modified Malaria Vaccine Has Passed an Important Hurdle

Researchers have tested a modified malaria parasite in humans that has been shown to be safe and to trigger an immune response.
04 Jan 10:18

University of California Cries "Thief!" on Genia Patents

by Keith Robison
As I noted in my last post, the University of California has filed suit against Genia claiming that Genia co-founder Roger Chen misappropriated intellectual property from UC Santa Cruz and the laboratory of Mark Akeson (filings include a bunch of  other well-known nanopore scientists, including David Deamer and Dan Branton).  While the filings are mostly dry, they are enlivened occasionally by such colorful language as "evasive tactics", "aided and abetted" and "stonewalled".  Goaded by Mick Watson, I've dug into the court filings and some of the patents (and obtaining those filings apparently cost me some real money, perhaps approaching $1.0e01 dollars).

Read more »
02 Jan 12:22

University of California makes legal move against Roger Chen (and Genia)

by biomickwatson

The relationship between sequencing companies is frosty and beset by legal issues, which I’ve covered before here and here.  Keith Robison tends to cover in more detail 😉

Most recently, PacBio moved against Oxford Nanopore, we think claiming that ONT’s 2D technology violated their patent on CCR (link).

Well now the absolute latest is a filing by the University of California against Roger Chen and therefore Genia. If you click through to the documents (requires registration) you’ll see that UC claim Chen, with others, produced key inventions whilst at UC that he later assigned to Genia, but which should have automatically been assigned to UC according to UC’s “oath of allegiance”, which Chen signed as a UC employee.

It remains to be seen how important this is and no doubt Chen/Genia/Roche will fight tooth and nail; however, if the courts decide in UC’s favour it could spell the end of Genia, and at the very least see a large cash settlement with UC.

Fascinating times!

25 Dec 14:36

The smartphone steals the silence

by Arnoud Wokke
You know the feeling: you have to wait ten seconds for the elevator and you immediately reach for your phone. Why, actually? And does it do any harm?
23 Dec 09:11

This young scientist retracted a paper. And it didn’t hurt his career

by Ivan Oransky and Adam Marcus

Nathan Georgette experienced the peaks and troughs of a life in science, all before he was old enough to buy a beer.

And despite the typical stigma of retracting a scientific paper, Georgette, a fourth-year student at Harvard Medical School, is doing just fine — serving as a model to those many decades his senior.

Read the rest...

20 Dec 06:15

Can Reproduction Be Ageless?

News that a woman has given birth via an ovary frozen when she was nine years old is just one example of how technology is altering the limits of fertility.
16 Dec 07:43

Roche Abruptly Breaks Off PacBio Partnership

by Keith Robison
This morning was a solid block of meetings, but in a pause I checked my phone and saw the shocking headline: Roche Diagnostics had suddenly terminated their partnership with Pacific Biosciences to commercialize the Sequel instrument for clinical applications. Based on the few things I've read and a conversation with Bio-IT World's Allison Profitt, I've formed a few ideas, but certainly still find this a bit mystifying.  Perhaps the first part of next year, with the JP Morgan Conference and AGBT, will see Roche revealing a bit more about why they decided to break up with their partner.  Particularly when Roche had already made all their milestone payments; going forward
Read more »
14 Dec 08:50

New Funds for ONT, Self-Sequencing Challenge from Clive Brown

Bio-IT World | Oxford Nanopore Technologies yesterday announced $126 million in new funding through a private placement of ordinary shares. The investment round was led by new investor GT Healthcare, a pan-Asian fund with special reach in China, and existing investor Woodford Investment Management on behalf of its clients. Clive Brown, Oxford Nanopore's CTO, also released the "Cliveome" yesterday, his own self-sequenced genome.
13 Dec 12:01

‘Dear plagiarist’: A scientist calls out his double-crosser

by Adam Marcus and Ivan Oransky

It’s a researcher’s worst nightmare: Pour five years, and at least 4,000 hours, of sweat and tears into a study, only to have the work stolen from you — by someone who was entrusted to confidentially review the manuscript.

But unlike many sordid tales of academia, this one is being made public. Dr. Michael Dansinger, of Tufts Medical Center, has taken to print to excoriate a group of researchers in Italy who stole his data and published it as their own.

Read the rest...

12 Dec 08:57

Is the long read sequencing war already over?

by biomickwatson

My enthusiasm for nanopore sequencing is well known; we have some awesome software for working with the data; we won a grant to support this work; and we successfully assembled a tricky bacterial genome.  This all led to Nick and me writing an editorial for Nature Methods.

So, clearly some bias towards ONT from me.

Having said all of that, when PacBio announced the Sequel, I was genuinely excited.   Why?  Well, revolutionary and wonderful as the MinION was at the time, we were getting ~100Mb runs.  Amazing technology, mobile sequencer, tri-corder, just incredible engineering – but 100Mb was never going to change the world.  Some uses, yes; but for other uses we need more data.  Enter Sequel.

However, it turns out Sequel isn’t really delivering on promises.  Rather than 10Gb runs, folk are getting between 3 and 5Gb from the Sequel:

At the same time, MinION has been coming along great guns:

Whilst we are right to be skeptical about ONT’s claims about their own sequencer, other people who use the MinION have backed up these claims and say they regularly get figures similar to this. If you don’t believe me, go get some of the World’s first Nanopore human data here.

PacBio also released some data for Sequel here.

So how do they stack up against one another?  I won’t deal with accuracy here, but we can look at #reads, read length and throughput.

To be clear, we are comparing “rel2-nanopore-wgs-216722908-FAB42316.fastq.gz”, a fairly middling run from the NA12878 release, and “m54113_160913_184949.subreads.bam”, one of the Sequel SMRT cell datasets released.

Read length histograms:


As you can see, the longer reads are roughly equivalent in length, but MinION has far more reads at shorter read lengths.  I know the PacBio samples were size selected on Blue Pippin, but unsure about the MinION data.

The MinION dataset includes 466,325 reads, over twice as many as the Sequel dataset at 208,573 reads.

In terms of throughput, MinION again came out on top, with 2.4Gbases of data compared to just 2Gbases for the Sequel.

We can limit to reads >1000bp, and see a bit more detail:


  • The MinION data has 326,466 reads greater than 1000bp summing to 2.37Gb.
  • The Sequel data has 192,718 reads greater than 1000bp, summing to 2Gb.

Finally, for reads over 10,000bp:

  • The MinION data has 84,803 reads greater than 10000bp summing to 1.36Gb.
  • The Sequel data has 83,771 reads greater than 10000bp, summing to 1.48Gb.
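The post does not say how these counts were produced. One way to get the same kind of numbers from a FASTQ file is sketched below using only the Python standard library. It assumes plain four-line FASTQ records (no wrapped sequence lines), the function names are made up for illustration, and the Sequel BAM file would need a BAM-aware tool instead.

```python
import gzip
import sys


def fastq_read_lengths(path):
    """Yield the length of every read in a FASTQ file (gzipped or plain).

    ASSUMPTION: each record is exactly four lines, so the sequence is
    always the second line of the record.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:
                yield len(line.rstrip("\r\n"))


def summarise(lengths, min_length=0):
    """Return (read count, total bases) for reads longer than min_length."""
    kept = [n for n in lengths if n > min_length]
    return len(kept), sum(kept)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python read_stats.py reads.fastq.gz
    lengths = list(fastq_read_lengths(sys.argv[1]))
    for cutoff in (0, 1000, 10000):
        count, bases = summarise(lengths, cutoff)
        print(f">{cutoff}bp: {count:,} reads, {bases / 1e9:.2f} Gb")
```

Holding the lengths in a list keeps the sketch short; for much larger runs a streaming tally per cutoff would use less memory.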

These are very interesting stats!

This is pretty bad news for PacBio.  If you add in the low cost of entry for MinION, and the £300k cost of the Sequel, the fact that MinION is performing as well as, if not better than, Sequel is incredible.  Both machines have a long way to go – PacBio will point to their roadmap, with longer reads scheduled and improvements in chemistry and flowcells.  In response, ONT will point to the incredible development path of MinION, increased sequencing speeds and bigger flowcells.  And then there is PromethION.

So is the war already over?   Not quite yet.  But PacBio are fighting for their lives.

05 Dec 06:33

People are wrong about sequencing costs on the internet again

by biomickwatson

People are wrong about sequencing costs on the internet again and it hurts my face, so I had to write a blog post.

Phil Ashton, whom I like very much, posted this blog:

But the words are all wrong 😉  I’ll keep this short:

  • COST is what it COSTS to do something.  It includes all COSTS.  The clue is in the name.  COST. It’s right there.
  • PRICE is what a consumer pays for something.

These are not the same thing.

As a service provider, if the PRICE you charge to users is lower than your COST, then you are either SUBSIDISED or LOSING MONEY, and are probably NOT SUSTAINABLE.

COST, amongst other things, includes:

  • Reagents
  • Staff time
  • Capital cost or replacement cost (sequencer and compute)
  • Service and maintenance fees
  • Overheads
  • Rent

Someone is paying these, even if it’s not the consumer.  So please – when discussing sequencing – distinguish between PRICE and COST.

Thank you 

23 Nov 06:38

Microsoft Spends Big to Build a Computer Out of Science Fiction

The New York Times | The computer giant says it's ready to start planning a prototype quantum computer, a superpowerful device that relies on subatomic particles instead of transistors.
10 Nov 10:05

How high-protein diets cause weight loss

A common end-product of digested protein -- phenylalanine -- triggers hormones that make rodents feel less hungry and leads to weight loss, according to a new study. A better understanding of the mechanism by which protein diets cause weight loss could lead to the development of drugs and diets that tackle the growing obesity epidemic.
31 Oct 10:20

Can CRISPR Save Ben Dupree?

Scientists are rushing to figure out how to use the gene-editing tool to stop devastating diseases like muscular dystrophy.
27 Oct 11:28

72 million profit for Rijk Zwaan

Rijk Zwaan Zaadteelt en Zaadhandel achieved net revenue of 388 million euros in the 2015/2016 financial year. That is 14% higher than in the previous financial year. Rijk Zwaan continues to opt for an autonomous growth strategy. This allows the family business from De Lier to compete with listed multinationals and.....
20 Oct 05:24

The Discovery of DNA Structure – Who Stayed in the Shadows of a Nobel?

by Rita Silva, United Academics
In 1962, the Nobel Prize of Medicine was given to Watson, Crick and Wilkins, for their finding of the double-helical structure of the DNA molecule. But who were the scientists overshadowed by the names of Watson and Crick?...

Leslie Pray. (2008) Discovery of DNA structure and function: Watson and Crick. Nature Education, 1(1).

13 Oct 06:34

Vitamins A and C help erase cell memory

by Dr. Jekyll, Lunatic Laboratories
Vitamins A and C aren't just good for your health, they affect your DNA too. Researchers have discovered how vitamins A and C act to modify the epigenetic 'memory' held by cells; insight which is significant for regenerative medicine and our ability to reprogramme cells from one identity to another. ...

Hore, T., von Meyenn, F., Ravichandran, M., Bachman, M., Ficz, G., Oxley, D., Santos, F., Balasubramanian, S., Jurkowski, T., & Reik, W. (2016) Retinol and ascorbate drive erasure of epigenetic memory and enhance reprogramming to naïve pluripotency by complementary mechanisms. Proceedings of the National Academy of Sciences, 201608679. DOI: 10.1073/pnas.1608679113  


10 Oct 10:40

As DNA reveals its secrets, scientists are assembling a new picture of humanity

by Carl Zimmer

When Benedict Paten stares at his computer monitor, he sometimes gazes at what looks like a map of the worst subway system in the world. The screen is sprinkled with little circles that look like stations. Some are joined by straight lines — sometimes a single path from one circle to the next, sometimes a burst of spokes radiating out in many directions. And sometimes the lines bend into sweeping curves that soar off on express routes to distant stations.

A rainbow palette of colors makes it a little easier to digest the complexity. But if you stare a little too long, vertigo sets in.

Read the rest...

03 Oct 06:19

Oxford Nanopore Announces New Pores, Kits and Updates on Projects

Bio-IT World | The Chief Technology Officer at Oxford Nanopore gave a technical Oxford Nanopore update this morning, outlining updates on MinION, PromethION, SmidgeION, hardware and chemistries. 
26 Sep 08:00

The curious case of the $9,500 skin gel

by Ed Silverman

Even in an age when prescription drugs are increasingly expensive, a $9,500 tube of gel to combat scaly skin can gain notice — especially when the price spikes 128 percent overnight.

That’s what happened earlier this month when a little-known company called Novum Pharma suddenly hiked wholesale prices for all three of its dermatology products by whopping amounts.

Read the rest...

19 Sep 05:16

Largest-ever study reveals environmental impact of genetically modified crops

by Dr. Jekyll, Lunatic Laboratories
According to new research, widespread adoption of genetically modified crops has decreased the use of insecticides, but increased the use of weed-killing herbicides as weeds become more resistant. This is the largest study of genetically modified crops and pesticide use to date. The team of economists studied annual data from more than 5,000 soybean and 5,000 maize farmers in the U.S. from 1998 to 2011, far exceeding previous studies that have been limited to one or two years of data. ...

09 Sep 05:50

Ethics Issues Raised by Human Enhancement

Over the last 30 years, the evolutionary status and trajectory of the human species has been brought into question by rapid progress within the fields of nanotechnology, biotechnology, information technology and cognitive science.