Shared posts

19 May 01:22

Railing about rails again: No, Science, it’s NOT THE SAME SPECIES!

by whyevolutionistrue

UPDATE: Science has now corrected its post by issuing the addendum below.  As you’ll see in the comments below, author Alex Fox credits this post for the correction, which is gentlemanly of him. Thanks to reader Barry for the spot.

***************

It is a truth universally acknowledged that the two most prestigious science journals in the world are Science, published in the U.S., and Nature, published in England. One would think, then, that their science reporting would be more accurate than the slipshod stuff you see in the science pages of the major media (the NYT is an exception). But Science slipped up this time in reporting that flightlessness evolved independently on the island of Aldabra twice: first in an ancient white-throated rail that colonized the island and went extinct when sea levels rose, and then in more modern times (i.e., several hundred thousand years ago), when birds from the same flying lineage colonized Aldabra again and once again evolved flightlessness. (Islands lack predators, so flying, which is metabolically expensive, can often be dispensed with to gain other advantages.)

A few days ago I wrote about how nearly all the major media—tabloids and respectable papers alike—misreported this finding, saying that the two flightless rails were really the same species, one that had been “resurrected” or “had come back from the dead.” In reality, the three white-throated rails (Dryolimnas cuvieri) are designated as subspecies, so even that reporting is wrong. But that’s minor compared to the repeated claim (see my earlier post for screenshots of the distorted headlines) that the very same species had evolved twice.

This was a big boo-boo because calling the modern flightless rail and its extinct flightless analogue members of “the same species” depended only on the similarity of two bones: a wing bone and a leg bone. There was no other fossil evidence, of course, about what the extinct rail looked like, how it behaved, or anything about the rest of its skeleton, its habits, its DNA, or its physiology. It’s simply a misleading whopper to assert that the “same species” evolved twice.

Further, the species concept used by nearly all evolutionary biologists deems two individuals members of the same species if, where they meet in nature, they can mate and produce fertile offspring. It’s a concept based on reproductive compatibility and incompatibility. Doing such a test is not possible in this case because the extinct species never had a chance to cohabit with the modern species. Just as we can’t say whether modern Homo sapiens are members of the same biological species as Homo erectus (note that they’re even given different names, but that’s based on physical differences), so we can’t say whether the ancient and modern flightless rails are members of the same biological species—much less subspecies.

As someone who spent his whole career working on speciation, including species concepts, I was thus disheartened to see this news report in the journal Science:

Note that while the report does call this “iterative evolution” (“convergent evolution” would be clearer to evolutionists), and notes the independent evolution of flightlessness, it also passes on Gizmodo’s report that evolution had “resurrected the lost species.”

Nope, that’s not true. We know nothing about the genetics, morphology, behavior, and physiology of the extinct species compared to the new one. Science had no business talking about “resurrection”, but it did.

Of course only a petulant evolutionary biologist who works on speciation would single out this error. But it’s pretty bad when one of the world’s best science journals makes a totally unwarranted claim like this.

18 May 15:39

An Intro to Deep Learning

by Derek Lowe

I wanted to mention a timely new book, Deep Learning for the Life Sciences, that I’ve received a copy of. It’s by Bharath Ramsundar at Computable, Peter Eastman at Stanford, Pat Walters at Relay, and Vijay Pande at Andreessen Horowitz, and I’ve been using it to shore up my knowledge in this area. From what I can see, there are not too many people who have much understanding of what deep learning/machine learning really entails – not that it stops folks from delivering their opinions on it. So actually obtaining some will make you stand out from the crowd (!)

This book is written for those of us out in biology and chemistry who would like to get up to speed on the topic; it’s not a detailed dive into any one area. But I think that’s a large market: if you would like to know in brief about (say) what a neural network is, the general scheme by which it processes inputs and generates outputs, and how one goes about applying such a thing to a pile of chemical structures or cell images, this would be an excellent place to start. The authors recommend further reading at many points, since they touch on a whole range of topics that have far more detail to them than they’re trying to cover.

Several parts of the book make use of the open-source DeepChem toolbox – there are examples of processing chemical structure and property data, genomic data, protein structural information, imaging data, and so on. Since it’s written for a wide audience, there are introductory sections throughout explaining to the non-life-science computational types (for example) what pi-stacking is and how a SMILES string is generated, and explaining (for example) to the chemists and biologists what a convolutional neural network is and how it might be less susceptible to overfitting errors than some other architectures. A good feature is that the authors have a realistic view of the problems:

At present [the PDB] contains over 142,000 structures. . .that may seem like a lot, but it is far less than we really want. The number of known proteins is orders of magnitude larger, with more being discovered all the time. For any protein that you want to study, there is a good chance that its structure is still unknown. And you really want many structures for each protein, not just one. Many proteins can exist in multiple functionally different states. . .the PDB is a fantastic resource, but the field is still in its “low data” stage. We have far less data than we want, and a major challenge is figuring out how to make the most of what we have. That is likely to remain true for decades.

The book also highlights the limits of what software can accomplish, and when it needs human assistance. That same section quoted above goes on to warn people that PDB files often contain problematic regions where the protein or ligand is not modeled well, and advises that (at present) there’s no substitute for having an experienced modeler look over the structure for a reality check. Similarly, from the other end, in the chapter on image processing, it’s noted that generating good segmentation masks (read the book!) is often not feasible without some human input as well. That’s something that people outside the field don’t always realize: these things are not 100% machine, but rather what Garry Kasparov calls centaur systems, using humans and machines in tandem, with each doing what it does best.

As someone without much expertise (compared to the authors!), I’ve been particularly enjoying the discussions of “meta” topics such as choosing between different architectures (and how to evaluate such choices), interpretability of the results (and how to quantify that), and testing the validity of output datasets. You may not be surprised to know that some of these topics are complex enough that they are candidates for deep-learning approaches of their own, a recursive feature that will cause you to think of what techniques are then appropriate to evaluate the evaluations. The answer to the question of quis custodiet ipsos custodes turns out, perhaps, to be “this subroutine right over here”, but these are human judgment calls as well.

So overall, this book should make you much more able to digest what people are talking about when they start talking deep learning, and if you’re motivated to try some yourself, it will show you how to get started and where to learn more. And it will also (perhaps paradoxically) reassure you about the current limits of the technique in general and the continued need for intelligent human oversight and intervention. Making ourselves a bit more intelligent about that is no bad thing.

18 May 15:36

What’s Artificial Life, Anyway?

by Derek Lowe

Do you know the Ship of Theseus problem? That one was first stated in its canonical form by Plutarch in his Parallel Lives, speaking of the ship that the hero used to return to Athens from Crete after slaying the Minotaur. Here we go:

The ship on which Theseus sailed with the youths and returned in safety, the thirty-oared galley, was preserved by the Athenians down to the time of Demetrius Phalereus. They took away the old timbers from time to time, and put new and sound ones in their places, so that the vessel became a standing illustration for the philosophers in the mooted question of growth, some declaring that it remained the same, others that it was not the same vessel.

That one’s been kicking around in philosophical discussions ever since. Thomas Hobbes, for example, wondered whether, if the old boards had been stored as they were removed and eventually used to build another ship, that vessel would have a better claim to being the original, and so on. You can set off all sorts of arguments about what’s authentic, what’s original, and whether there’s a definable threshold for such descriptions at all.

Why am I starting us off in ancient Athens, as described later in ancient Rome? Because we’re in the middle of just another such question: what is artificial life? We’ve made all sorts of modifications to living cells and entire living creatures, with tools of increasing power and specificity. We actually started long before genetic engineering, with an extraordinary example being the long, complex, multicentury breeding of a wild Mexican grass into what we know as corn (maize). Even the most old-fashioned heirloom variety of corn you can find is nowhere near a “natural” plant; it never evolved in the wild and is entirely a human creation.

And these days we have far faster and better-defined techniques, all sorts of ways to introduce mutations into plants and animals, ranging from the sledgehammer (radiation, colchicine) to the surgical instruments of CRISPR and the like. The number of engineered cells and whole organisms that have been produced is surely beyond our ability to specify. Are they artificial? How about if you introduce genes (and their associated mRNAs and protein products) from completely different organisms (as is done all the time)? Artificial? There has been a great deal of work put into engineering strains of Mycoplasma, in an effort to see how far its genome can be pared down and still leave a living creature, and also to transfer a completely human-synthesized genome into the cells themselves. Now those, are they artificial life, or not?

I ask because there’s a new paper out that takes the latter technique even further: this one works with the (much larger) E. coli genome, and the new replacement DNA was not only synthesized, but substantially reworked (here’s coverage at Stat). The number of redundant codons coding for a given amino acid has been reduced: specifically, the serine codons TCG and TCA have been replaced throughout by existing synonyms, a modification that has previously been shown to work in shorter stretches of the genome. This recoded DNA was introduced in sections, with the eventual production of a completely recoded bacterium whose genome was synthesized from scratch. There were plenty of hitches along the way, as the paper details – some codons were tricky, because they’re involved in downstream regulation of other genes (promoter and enhancer effects), and a lot of these had to be addressed one by one. Swapping out to an upgraded memory chip, this is not.
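At the sequence level, the basic move is simple to picture. Here’s a minimal sketch (mine, not the paper’s pipeline) of what replacing TCG and TCA with synonyms means; AGC and AGT are the serine synonyms reported for this recoding scheme, and everything else (names, toy sequence) is illustrative only. The naive substitution below is also exactly what breaks down when a codon overlaps a regulatory element, which is why so many sites had to be fixed by hand.

    # Toy illustration of codon compression: walk a coding sequence in frame
    # and swap every TCG/TCA serine codon for a synonymous serine codon.
    RECODING = {"TCG": "AGC", "TCA": "AGT"}  # serine -> serine synonyms

    def recode(cds):
        """Recode a coding sequence codon by codon (assumes the frame starts at 0)."""
        codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
        return "".join(RECODING.get(codon, codon) for codon in codons)

    original = "ATGTCGAAATCACTGTAA"      # Met-Ser-Lys-Ser-Leu-Stop
    print(recode(original))              # ATGAGCAAAAGTCTGTAA: same protein, fewer codon types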

What the team at Cambridge ends up with, though, is a living bacterium (Syn61) that is capable of reproduction. It looks a bit funny, to be sure – it’s longer than the original type, and reproduces more slowly. But its protein expression profile is very close to the original. And it passes some important tests: the serT gene is essential for handling the TCA codon in wild-type bacteria, but you can delete it with impunity in Syn61, because it doesn’t have any TCA codons any more. And if you try to reassign the TCG codon to use a noncanonical amino acid (an experiment of a kind that’s had a great deal of work put into it over the years), that’s quite toxic to the wild-type but has no effect on Syn61, which has no endogenous TCGs to get messed up. This also means that it’s quite possible that such recoded cells are resistant to most (perhaps all) viruses. After all, the viral infection machinery is expecting those codons to still be in place (why wouldn’t they be?), and attempts to hijack the cellular machinery to produce viral proteins might well just bog down.

Is this artificial life, then? There are headlines all over the place using the term, just as there have been for all the stages leading up to this point. And as there no doubt will be for the experiments to come. But I have no idea, because I don’t know where to draw that line. I don’t know when the ship in Athens’ harbor became different, and I don’t know when precisely these organisms did, either. But if I’d walked up to a ship of Theseus that contained no original part whatsoever from the one that sailed back from Crete, I would have to wonder. And when I encounter a bacterium whose original parts have all been replaced?

20 Feb 13:10

2018's flu season surpasses 2009's Swine Flu pandemic, and other takeaways from the latest CDC Flu Report

by Scott McPherson

The weekly CDC flu report has been released, and it is full of important information. There were several things that jumped out at me, and I wanted to bring them to your attention.

First:  Flu remains everywhere.  Oregon seems to have a dip in cases, but I question that.  However, 48 other states are positively bathed in flu.  It has not abated one whit.

Second, and the CDC has just admitted this:  This year's flu epidemic, in terms of hospital visits, has just surpassed 2009's H1N1 swine flu pandemic.  Think of that!  A non-pandemic strain of flu has sent more Americans to the hospital than the first pandemic in almost fifty years.  While of course I am hopeful pediatric deaths will not reach the level seen in 2009, we are seeing pediatric and young adult sickness and death at a rate seldom seen in a flu season.

Third: Flu season is a marathon, not a sprint.  And while the cumulative ratio of A/H3N2 to B has been roughly 80/20 this season, we are seeing a surge in cases of Influenza B.  This is nothing to celebrate:  As you recall, the 12-year-old boy from West Palm who died was typed, via reverse-transcription PCR, as having had Influenza B.

Fourth:  The strain of B that is the strongest -- the Yamagata strain -- is antigenically very similar to what is in the current flu shot!  So despite what you have heard, or read, about this year's weenie performance of the shot, it is still a very, very good idea to have one.  Because the B protection is significant, and since B's jockey is showing the horse the whip, it might just continue to gain market share.  How many metaphors can I mix here?

Fifth, and this one is a biggie:  Laboratory-confirmed influenza-related hospitalizations of persons aged 65 years or older are Off. The. Charts. This acceleration began around Christmas and has not abated one bit.  Second place:  Persons 50-64. Their ascent is not as dramatic but it is significant.

Sixth: Deaths from flu this season are substantially higher than the epidemic threshold. Ten percent of all the deaths in this country were from flu and accompanying pneumonia. The normal epidemic threshold is just above seven percent.  And because of regional lags in inputting death records, the final figure will be worse.

Seventh:  It's amazing what changes happen when you haven't been paying attention.  Loyal readers of this blog will instantly recognize the name BioCryst, the pharma company once HQ'ed in Birmingham, Alabama, and which relocated to the Research Triangle of North Carolina. I came across a chart showing antiviral resistance, and listed were three antivirals:  Oseltamivir (Tamiflu), zanamivir (Relenza), and -- peramivir! Peramivir is the injectable antiviral created by BioCryst!  Apparently they were granted FDA approval back in 2014.  I had stopped blogging by then, so it was nice to see that another antiviral had been added to the mix -- and that I was right to keep track of BioCryst.

The complete report is available here.  This flu epidemic continues to confound and frustrate.

20 Aug 04:36

Are Stem Cell Companies Abusing ClinicalTrials.gov?

by Ricki Lewis, PhD

I’m often asked about the safety of treatments that purport to inject stem cells into painful body parts. The reputation of stem cells seems to exceed their reach, with companies touting treatments that aren’t FDA approved or even being tested.

Back in March, an alarming article in the New England Journal of Medicine described three women blinded by stem cell treatments – two of the patients reported seeing a reference on the company website to registration at the National Institutes of Health’s well-respected ClinicalTrials.gov and assumed it applied to their treatment. It didn’t.

In what is perhaps a modern version of hawking snake oil, companies can indeed register certain clinical trials without breaking any rules – but desperate patients might not know that.

“There is no doubt that some patients have misinterpreted a study’s listing on ClinicalTrials.gov as a stamp of legitimacy, federal review, and compliance. In this way, treatments with no safety or efficacy data, no prior clinical study, and no ongoing clinical trials under FDA review, appear to have federal approval. Such a misunderstanding can lead to disastrous outcomes for patients,” said Thomas Albini, MD, of the Bascom Palmer Eye Institute of the University of Miami, who treated the blinded women.

When I wrote about the disaster here at DNA Science and at Medscape Medical News, my Medscape editor asked me to take a closer look at criteria for listing investigations at ClinicalTrials.gov. It proved an interesting exercise, but I declined to write an article, fearing lawsuits if I named companies.

ClinicalTrials.gov is where research groups, in academia and pharma/biotech, describe protocols to evaluate the safety and efficacy of new drugs, biologics, and devices, which FDA regulates, typically in randomized, controlled trials. But for an “observational” study that just follows what happens after a treatment, no such thumbs-up is required; no investigational new drug (IND) designation or investigational device exemption (IDE) need be filed. And that creates a loophole that companies are happily jumping through – luring patients in pain, who may know little about clinical trial design, and who perhaps place too much trust in the companies and the doctors offering these services.

It’s easy to see how people are fooled. One company claims that ‘By providing access to registered clinical studies through the NIH, we are providing patients with the ability to choose a stem cell treatment center with the highest standard of care.’ If the treatment is experimental, how can there even be a standard of care?

MOST STUDIES LEGIT

I love ClinicalTrials.gov – it’s packed with information about all manner of conditions, with contacts and references. I started my investigation by searching for studies that sounded bogus.

I began with a treatment that epitomizes pseudoscience: magnets. But I was fooled. Other than legit uses in medical devices, my “magnet” search mostly called up an acronym of sorts for the “Mothers and Girls Dancing Together Trial,” a well-designed study on preventing childhood obesity, with a decent sample size and controls.

I also thought the “randomised crossover trial of the acute effects of a deep-fried Mars bar or porridge on the cerebral vasculature” was fake, but it turned out to be a medical student’s project, well done, and published in the Scottish Medical Journal.

But trial NCT02833532, sponsored by a Korean pharmaceutical company, was likely a joke, with the stated purpose of “temporary penile enhancement” and one of the investigators having the first name Dong. Participants must answer the question “How do you rate your penile size? Very small/small/normal/big/very big” to enroll. Those accepted get to try something made of hyaluronic acid, which is found, coincidentally, in cocks’ combs.

Searching ClinicalTrials.gov for “stem cells” returns more than 4,000 entries, so I gave up. Fortunately, Leigh Turner, PhD, associate professor at the Center for Bioethics at the University of Minnesota, wasn’t afraid of lawyers and took a more measured, scholarly approach. He recently published the intriguing findings in Regenerative Medicine, where you can find nice tables naming the stem cell companies that use and possibly abuse ClinicalTrials.gov.

AN ACADEMIC INVESTIGATION

Dr. Turner searched ClinicalTrials.gov for “stem cells” along with “patient-sponsored,” “patient-funded,” and “self-funded” – because expecting patients to pay is a red flag. Only a very few real clinical trials charge patients, and those that do must have FDA approval to do so.

He found 7 such pay-as-you-go clinical trials, each enrolling more than 100 people, at the government website, and another 11 in a database of companies that provide direct-to-consumer stem-cell-based treatments. The “DTC” label indicates that the treatments aren’t part of a real experimental protocol. One of them had signed up more than 3,000 gullible people.

The companies that charge patients yet proclaim a ClinicalTrials.gov listing are having their proverbial cake and eating it too – borrowing the governmental veneer of a sanctioned clinical trial, while collecting fees. And many health care consumers aren’t even aware they’re being bamboozled.

Another red flag in a stem cell pitch is an everything-but-the-kitchen-sink list of targets. Stem Cell Network, for example, claims to be able to treat, using stem cells grown from a patient’s fat, some 28 conditions, including the vague “knee problems,” and also “muscular dystrophy,” “ankle problems,” “neuropathy,” “asthma,” and “alopecia areata.” Also be wary of stem cells derived from one body part – like butt fat – being injected into another body part – such as eyeballs.

“We’d like people to protect themselves by going to a reliable website, like ClinicalTrials.gov, to distinguish legitimate from bogus claims of stem cell clinics. But the findings of this paper challenge that advice because this valuable resource, which is designed to promote transparency and to help people find clinical trials, lists unlicensed and unproven stem cell interventions that companies turn into personal marketing platforms. So if you have ALS, MS, Parkinson’s disease, a ClinicalTrials.gov listing looks like any other study on the NIH website. Many people think a listing is credible,” Dr. Turner told me.

“There is an urgent need for careful screening of clinical studies before they are registered with ClinicalTrials.gov,” Dr. Turner’s paper concludes. But in the current climate of a nuclear threat, a health care system in disarray, and possible cuts to the CDC, FDA, and NIH, ramping up scrutiny at ClinicalTrials.gov is unlikely to have priority, if the President even has a clue what it is.

“It’s not possible to slash, burn, defund, and deregulate at every turn and think that federal agencies are going to improve how they function. But no administration is forever, no budget is forever, deregulatory moments don’t last forever, and perhaps problems that are ignored or neglected now will be addressed in the future, with collateral damage along the way while nothing is done,” warns Dr. Turner.

Those seeking stem cell treatments should check out the International Society for Stem Cell Research (ISSCR) Patient Handbook on Stem Cell Therapies and stemcells.nih.gov. Alas, much of the media is still somewhat unfamiliar with the biology of stem cells: they are not merely “cells that can turn into any cell type”; they “self-renew,” jettisoning a new stem cell at every division. That’s what makes them stem cells, not the ability to spawn specialized cells.

So I tell people who ask me if they should have stem cells shot into their aching knees or backs to do so only if they wouldn’t object to an abnormal growth – cancer – forming there.

When it comes to stem cell therapies, it’s caveat emptor – buyer beware!

26 Aug 05:18

Why does Particle Physics matter? [Starts With A Bang]

by Ethan

“[S]cience is not a consumption good to be expanded in good times and restricted in bad times. The doing of science as well as the supporting of science is an expression of faith in the future. It would have been possible to have told Newton and Faraday, Maxwell and Einstein, Bohr and Heisenberg that, given the poverty and squalor around them, their researches were luxuries which could not be afforded. To have done so would be to destroy the economic progress that came out of their science and which was the main factor in relieving that poverty and squalor. We seem to be on the verge of saving one percent but sacrificing untold scientific discoveries and their unpredictable economic benefits.” -Leon Lederman

The Universe is a remarkable place, full of wonder on scales large, small, and everywhere in between.

Image credit: NASA, ESA, the Hubble Heritage (STScI/AURA)-ESA/Hubble Collaboration, and A. Evans.

What we learn about those scales isn’t limited by our curiosity; history has proven pretty solidly that so long as there are unexplained phenomena or unanswered questions, people will try and find the best explanations and answers. Even when we have good ones, we’ll always be searching for simpler, more elegant, and more complete solutions.

Image credit: Contemporary Physics Education Project (CPEP).

I’ve talked at length about why I think we should invest in science, but what I don’t often talk about is my first experience with science and scientific research.

Image credit: CERN, via http://kjende.web.cern.ch/kjende/en/wpath_lhcphysics1.htm.

Before I became an astrophysicist, before the largest scales and questions in the Universe became my passion, before any of that, I had to learn about “the basics,” which meant getting a solid foundational education in all of physics. And the first hands-on research opportunity that came my way — the first chance I had to work with real data, real equipment, the full, modern theories and real, current research — came at what was then the most powerful particle accelerator in the world: Fermilab.

Image credit: Fermilab, Reidar Hahn.

(Full disclosure: the main injector ring, shown in the foreground, was not completed when I worked there!)

That was back in 1997. Sixteen years later, Fermilab has been surpassed in energy by the Large Hadron Collider at CERN, and its accelerator was shut down a little under two years ago. Since that time, there has been no effort to probe the energy frontier in the United States, and it looks unlikely that there will be one going forward. Despite hopes for the dream machine that I’d advocate for, there has been very little political or economic initiative to keep cutting-edge particle physics alive in this country.

Image credit: Wikipedia user Ich weiß es nicht, of the defunct SSC site.

But particle physics matters! It’s the way we understand, at the smallest, highest-energy, most fundamental level, what makes up all the matter and energy of the Universe! And there’s been a wonderful initiative undertaken by Fermilab, SLAC, and the U.S. Department of Energy to bring some of the best reasons — the reasons why particle physics matters — to everyone.

Image credit: Symmetry Magazine.

Starting today, you can publicly view videos from the 29 brave particle physicists who’ve stepped up to explain, in their own (very personal) words, why particle physics matters to them. Some of these have been inspiring to me, and I’m pleased to be able to share them with you, and to highlight some of my favorites.

There’s Markus Luty’s take, which I can totally relate to. When you have passion for what you do, for what you’re trying to do, you not only want to be able to do it, you want others to share in that joy as well.

There’s Robin Erbacher’s passionate plea, which (at least at the time this article’s being written) is the most viewed — and deservedly so, IMO — of all the videos up there.

There’s Heidi Schellman’s words of wisdom, which hold a special place in my heart, considering she was one of my very first physics professors at college. Because of her, I’ve always known exactly why relations like Gauss’ Law and Birkhoff’s Theorem cannot apply to systems of discrete particles, and I hope I’ve been able to pass on what I’ve learned from her (and others) in just as clear a way as she passed it on to me.

There’s also JoAnne Hewett’s tale of why particle physics matters to her. You may recognize JoAnne as a science communicator in her own right from her contributions at Cosmic Variance (among other places), and she gives a great, timeless perspective here.

Symmetry has also put together a best-of compilation from all 29 videos, which you can watch below.

And finally, there’s a contest! Symmetry has chosen five videos to compete for the best explanation of why particle physicists do what they do, and why it’s important for everyone. Here they are, in no particular order, with links to the voting below each video:

Breese Quinn, from the University of Mississippi.

Elizabeth Worcester, from Brookhaven National Laboratory.

Hugh Lippincott, from Fermi National Accelerator Laboratory.

Peter Winter, from Argonne National Laboratory.

Dhiman Chakraborty, from Northern Illinois University.

Go ahead and vote for your favorite and please, enjoy all the videos, and remember why scientists (and physicists in particular) do what we do, and how important it is, not just to our development of knowledge, but to the long-term growth and prosperity of our world, and all the people in it.

18 Aug 03:17

A single amino acid change switches avian influenza H5N1 and H7N9 viruses to human receptors

by Vincent Racaniello

Two back-to-back papers were published last week that provide a detailed analysis of what it would take for avian influenza H5N1 and H7N9 viruses to switch to human receptors.

Influenza virus initiates infection by attaching to the cell surface, a process mediated by binding of the viral hemagglutinin protein (HA) to sialic acid. This sugar is found on glycoproteins, which are polypeptide chains decorated with chains of sugars. The way that sialic acid is linked to the next sugar molecule determines what kind of influenza viruses will bind. Human influenza viruses prefer to attach to sialic acids linked to the second sugar molecule via alpha-2,6 linkages, while avian influenza viruses prefer to bind to alpha-2,3 linked sialic acids. (In the image, influenza HA is shown in blue on the virion (left) and as a single polypeptide at right. Alpha-2,3 linked sialic acid is shown at top).

Adaptation of avian influenza viruses to efficiently infect humans requires that the viral HA quantitatively switch to human receptor binding – defined as high relative binding affinity to human versus avian receptors. Such a switch is caused by amino acid changes in the receptor binding site of the HA protein. The HAs of the H1N1, H2N2, and H3N2 pandemic viruses are all derived from avian influenza viruses that underwent such a quantitative switch in binding from avian to human sialic acid receptors.

Avian H5N1 influenza viruses have not undergone a quantitative switch to human receptor binding, which is one of the reasons why these viruses do not undergo sustained human-to-human transmission. It has been possible to introduce specific amino acid changes in the H5 HA protein that enable these viruses to recognize human sialic acid receptors. Such changes were required to select variants of influenza H5N1 virus that transmit via aerosol among ferrets. However, none of these viruses have quantitatively switched to human receptor specificity.

In the H5N1 paper, the authors compared the structure of an H5 HA bound to alpha-2,3 linked sialic acid with the structure of an H2 HA (its closest phylogenetic neighbor) bound to alpha-2,6 linked sialic acid, revealing substantial differences in the receptor binding site. To predict which residues could be changed in the H5 HA to overcome these differences, the authors developed a metric to identify amino acids within the receptor binding site that either contact the receptor or might influence the interaction. They examined these amino acids in different H5 HAs and identified residues that might switch the H5 HA to human receptor specificity. As a starting point they picked two H5 viruses that have already undergone amino acid changes believed to be important for human receptor binding. The changes were introduced by mutagenesis into a currently circulating H5 HA, and binding of the HAs to purified sialic acids and to human tracheal and alveolar tissues was then determined.

The HA receptor binding site amino acid changes required for aerosol transmission of H5N1 viruses in ferrets did not quantitatively switch receptor binding of a currently circulating H5 HA from avian to human (the ferret studies were done using H5N1 viruses that circulated in 2004/05). The authors note that “These residues alone cannot be used as reference points to analyze the switch in receptor specificity of currently circulating and evolving H5N1 strains”.

However, introducing other amino acid changes which the authors predicted would be important did switch the H5 HA completely to human receptor binding. Only one or two amino acid changes are required for this switch in recently circulating H5 HAs.

This work is important because it defines structural features in the receptor binding site of H5 HA that are critical for quantitative switching from avian to human receptor binding, a necessary step in the acquisition of human to human transmissibility. These specific residues can be monitored in circulating H5N1 strains as indicators of a quantitative switch to human receptor specificity.

Remember that switching of H5 HA to human receptor specificity is not sufficient to gain human to human transmissibility; what other changes are needed, in which genes and how many, is anyone’s guess.

These authors have also published (in the same issue of Cell) a similar analysis of the recent avian influenza H7N9 virus which has emerged in China to infect humans for the first time. They model the binding of sialic acid in the H7 HA receptor binding site, and predict that the HA would have lower binding to human receptors compared with human-adapted H3 HAs (its closest phylogenetic neighbor). This prediction was validated by studies of the binding of the H7N9 virus to sections of human trachea: staining of these tissues is less intense and extensive than staining by viruses with human-adapted HAs. They predict and demonstrate that a single amino acid change in the H7 HA (G228S) increases binding to human sialic acid receptors. This mutant virus stains tracheal sections better than the H7 parental virus.

These results mean that the H7N9 virus circulating in China might be one amino acid change away from acquiring higher binding to human alpha-2,6 sialic acid receptors. I wonder why a virus with this mutation has not yet been isolated. Perhaps the one amino acid change in the viral HA exerts a fitness cost that prevents it from infecting birds or humans. Of course, as discussed above, a switch in receptor specificity is likely not sufficient for human to human transmission; changes in other genes are certainly needed. In other words, the failure of influenza H7N9 virus to transmit among humans can be partly, but not completely, explained by its binding properties to human receptors.

02 Jun 13:46

(pre)Historical genetics still has to be historical

by Razib Khan

Credit: Albozagros


The genetics and history of Tibet are fascinating to many. To be honest the primary reason here is elevation. The Tibetan plateau has served as a fortress for populations who have adapted biologically and culturally to the extreme conditions. Naturally this means that there has been a fair amount of population genetics on Tibetans, as hypoxia is a side effect of high-altitude living which dramatically impacts fitness. I have discussed papers on this topic before. And I will probably talk more about it in the future, considering rumblings at ASHG 2012.

But to understand the character of the effect of natural selection on a population it is often very important to keep in mind the phylogenetic context. By this, I mean that evolutionary processes occur over history, and those historical events shape the course of subsequent phenomena. Concretely, to understand how the Tibetans came to be adapted to high altitudes one must understand who they are related to, and what their long-term history is. There is a paper in Molecular Biology and Evolution which attempts to do just that, Genetic evidence of Paleolithic colonization and Neolithic expansion of modern humans on the Tibetan Plateau:

Tibetans live on the highest plateau in the world, their current population size is nearly 5 million, and most of them live at an altitude exceeding 3,500 meters. Therefore, the Tibetan Plateau is a remarkable area for cultural and biological studies of human population history. However, the chronological profile of the Tibetan Plateau’s colonization remains an unsolved question of human prehistory. To reconstruct the prehistoric colonization and demographic history of modern humans on the Tibetan Plateau, we systematically sampled 6,109 Tibetan individuals from 41 geographic populations across the entire region of the Tibetan Plateau and analyzed the phylogeographic patterns of both paternal (n = 2,354) and maternal (n = 6,109) lineages as well as genome-wide SNP markers (n = 50) in Tibetan populations. We found that there have been two distinct, major prehistoric migrations of modern humans into the Tibetan Plateau. The first migration was marked by ancient Tibetan genetic signatures dated to around 30,000 years ago, indicating that the initial peopling of the Tibetan Plateau by modern humans occurred during the Upper Paleolithic rather than Neolithic. We also found evidences for relatively young (only 7-10 thousand years old) shared Y chromosome and mitochondrial DNA haplotypes between Tibetans and Han Chinese, suggesting a second wave of migration during the early Neolithic. Collectively, the genetic data indicate that Tibetans have been adapted to a high altitude environment since initial colonization of the Tibetan Plateau in the early Upper Paleolithic, before the Last Glacial Maximum, followed by a rapid population expansion that coincided with the establishment of farming and yak pastoralism on the Plateau in the early Neolithic.

The two major salient points I think need emphasis are:

1) Massive sample sizes for mtDNA and, to a lesser extent, Y-chromosomal lineages

2) Tibetans are a compound of agriculturalists who arrived on the plateau >10,000 years ago and hunter-gatherers who date back to the Paleolithic

Citation: Cai, Xiaoyun, et al. “Human migration through bottlenecks from Southeast Asia into East Asia during Last Glacial Maximum revealed by Y chromosomes.” PloS one 6.8 (2011): e24282.

There are many issues with this paper that bother me. The broadest interpretation of their thesis is one I find creditable, but in the details I’m left skeptical, confused, and more curious than when I began. Also, I need to add that I talked to the people who presented a poster on this paper at ASHG 2012, though I do not know if they were the authors. They seemed nice, but also not necessarily totally focused on the questions they were exploring, as opposed to obtaining huge sample sizes and applying standard methods to them. Speaking of which, the first thing that jumps out is that their sample is skewed toward what is today Tibet proper, the autonomous province. But Tibetan people have historically lived as far away as Sichuan. Only 50% of ethnic Tibetans live in the autonomous region, but well over 90% of their samples are from this area. In terms of exploring adaptation to altitude this is fine, but if you are going to do phylogeography you need better geographic coverage, I would think.

But that’s only a minor aside. The bulk of the paper consists of a laundry list of Y and mtDNA haplogroups, and coalescence times. Some of the results are very persuasive to me. There are some Y lineages which exhibit a “star shaped” phylogeny, which usually connotes a recent rapid population expansion. Using other methods the authors have inferred that there was indeed an expansion of population after the introduction of agriculture >10,000 years ago. There is no great reason on prior grounds to be skeptical of this finding. Nevertheless, drilling down produces great confusion, and I am not sure that the coalescence times and phylogenies actually mean what the authors assume they mean.

For example, here is a standard sort of analysis presented in this paper:

We identified a molecular signature of recent population expansion during the early Neolithic time in both paternal (Y-chromosomal D3a-P47 and O3a3c1-M117) and maternal (M9a1a and M9a1b1) lineages (10-7 kya) (table 1). The detailed analysis of haplotype sharing and time of divergence between Tibetans and Han Chinese suggests that the Neolithic population expansion on the Plateau was likely caused by the dispersal of the earliest Neolithic Han Chinese agriculturalists originating about 10 kya in what is now northwestern China….

O3a3c1-M117 is present at frequencies of nearly 30%, and is connected with the Chinese as you can see above. This dovetails with other recent research which implies relatively recent common ancestors between Tibetans and Chinese. This result can be reconciled with the presence of Paleolithic roots via the fact that admixed populations will give you average results between the two extremes. The problem I have is that I am skeptical that Han Chinese existed 10,000 years ago, just as I am skeptical that Greeks existed 10,000 years ago.

A quick literature search yields the fact that M117 is modal in particular non-Han ethnic groups resident in southern China and northern Southeast Asia. I am not here proposing that the Hmong introduced M117 to the Tibetans. Rather, I am suggesting that we best be careful in assuming that we know the ethnic distribution of genetic haplogroups 6,500 years before there were any written records from a given region! To me the fact that there is a putative Sino-Tibetan group of languages is strongly indicative of diversification >10,000 years, not the existence of a Han ethnicity ~10,000 year ago. The historical records are clear that ~3,000 years ago the Yangzi river, now the informal dividing line between North China and South China, was the boundary of the zone where Han were demographically dominant. And even then there were clearly pockets of “barbarian” people on the North China plain itself! It simply does not stand up to the test of basic plausibility that the agricultural expansions ~10,000 years B.P. were Han as we would understand Han. The demographic and cultural dominance of the Han in Northeast Asia is a phenomenon of the last 3,000 years, perhaps 4,000 most generously (South China became Sinicized to some extent after the fall of the Latter Han Dynasty ~200 AD, and especially the Tang period ~600-950 AD).

Much of the argumentation is creaky because of these anachronistic assumptions and the casual inference of contemporary haplogroup frequencies back toward ancient geographical demographic distributions. Ancient DNA has highlighted the danger of this in Europe, and that should update our priors as to the robustness of this sort of analysis. For example, the authors are curious as to the lack of structure of Y chromosomal lineages, combined with the fact of their deep coalescence times across Tibet. Why is this an issue? Because if these Y chromosomal lineages are Paleolithic, then the deep divergences across the branches should also correspond to geographic differences. But they don’t. To me the simplest explanation is that the last 10,000 years have seen a great deal of population movement, and sharply differentiated populations were brought together as agriculture opened up the Tibetan plateau. This presents a problem, though, for inferring ancient geographic connections from present distributions, since it opens up the possibility of migration and radical genetic-demographic turnover.

Overall I would say that this paper is interesting and useful, but you should read it closely and not take the authors’ inferences too much to heart. Those inferences are grounded in assumptions which may be built on false foundations.

Addendum: Also, a “gap” on a PCA plot does not necessarily mean long-term isolation, as they say in the text. It might simply be a function of inadequate sampling. See above. There are many unsupported assertions such as that. But, I would like to add that the authors found a large number of “exotic” haplogroups in Lhasa itself, which aligns with what we know about the cultural history of Tibet. Tibetan Buddhism actually is influenced more by extinct variants of South Asian (particularly, Bengali) Buddhism than by Chinese Buddhism. Though the demographic pump along the Himalayan border seems to go from the highlands to the lowlands, there were exceptions. And these exceptions tended to be found in Lhasa.

23 Apr 02:18

François Jacob has died

by PZ Myers

Jacob and Jacques Monod, who won the Nobel Prize in 1965 for their work on the lac operon, were the fellows who really put gene regulation on the map, working out the mechanisms behind switching genes off and on in response to environmental cues. I always talk about their work on day one of my developmental biology courses; everything else in molecular genetics and development is built on the foundation they laid down.

And now the Nobel laureate and Resistance fighter François Jacob has died (Monod died in 1976).

By the way, in addition to being a great scientist, Jacob was also a great atheist. This is a loss to all of us.


Carl Zimmer has written a lovely tribute to Jacob’s work.

18 Apr 14:44

Where Do Organisms Get Their Energy?

by noreply@blogger.com (Laurence A. Moran)

I've been thinking a lot about fundamental concepts in biochemistry. One of them has to be energy—where do cells and organisms get the energy to grow and divide?

The metabolism sections of most biochemistry courses in North America are taught from an anthropomorphic, fuel-metabolism perspective. That's understandable, since the purpose of such courses is mostly to prepare students for the MCAT exam. (Medical school entrance exam.) I prefer an evolutionary approach to teaching biochemistry, but that's not very popular these days.

By the time the course is over, students will have learned that humans get their energy from food, especially glucose. The next step is to ask where the glucose comes from. The simple answer is that food (i.e. glucose) comes from plants. The next question is where do plants get the energy to make glucose? The answer is, of course, sunlight. This should lead to an explanation of photosynthesis but that rarely happens in introductory biochemistry courses.

This description leads to the classic "food chain" as shown in the figure (above) from FT Exploring Science and Technology [The Flow of Energy Through Plants and Animals]. This is conceptually sound biochemistry as far as it goes. As long as students understand how sunlight can be used to make ATP and how ATP can be used to make macromolecules (including glucose), then they will understand that humans ultimately get their energy from sunlight. I would be happy if all biochemistry students could explain this food chain at the molecular level.

But in order to make sure that students really understand this process, I go one step further. I explain that there are many species of bacteria that are chemoautotrophs. Chemoautotrophs are incapable of photosynthesis yet they are able to grow and divide in the absence of any organic compounds. Their carbon source is CO2, just like photosynthetic organisms. These bacteria have a basic metabolism that teaches us what primitive life forms must have been like. Knowing how they get their energy helps students understand evolution.

Where do chemoautotrophs get their energy? I'm interested in knowing how many readers have taken biochemistry and are able to answer that question. Please let me know in the comments before you read the answer in these posts [Carbon Dioxide Fixation in the Dark Ocean] [Core Concepts: Pathways and Transformations of Energy and Matter] [Ubiquinone and the Proton Pump].


07 Apr 15:43

Understanding Bayes Theorem With Ratios

by kalid

My first intuition about Bayes Theorem was “take evidence and account for false positives”. Does a lab result mean you’re sick? Well, how rare is the disease, and how often do healthy people test positive? Misleading signals must be considered.

This helped me muddle through practice problems, but I couldn’t think with Bayes. The big obstacles:

Percentages are hard to reason with. Odds compare the relative frequency of scenarios (A:B) while percentages use a part-to-whole “global scenario” [A/(A+B)]. A coin has equal odds (1:1) or a 50% chance of heads. Great. What happens when heads are 18x more likely? Well, the odds are 18:1, can you rattle off the decimal percentage? (I’ll wait…) Odds require less computation, so let’s start with them.

Equations miss the big picture. Here’s Bayes Theorem, as typically presented:

\displaystyle{\Pr(A \mid X) = \frac{\Pr(X \mid A)\Pr(A)}{\Pr(X \mid A)\Pr(A) + \Pr(X \mid \sim A)\Pr(\sim A)}}

It reads right-to-left, with a mess of conditional probabilities. How about this version:

original odds * evidence adjustment = new odds

Bayes is about starting with a guess (1:3 odds for rain:sunshine), taking evidence (it’s July in the Sahara, sunshine 1000x more likely), and updating your guess (1:3000 chance of rain:sunshine). The “evidence adjustment” is how much better, or worse, we feel about our odds now that we have extra information (if it was December in Seattle, you might say rain was 1000x as likely).

Let’s start with ratios and sneak up to the complex version.

Caveman Statistician Og

Og just finished his CaveD program, and runs statistical research for his tribe:

  • He saw 50 deer and 5 bears overall (50:5 odds)
  • At night, he saw 10 deer and 4 bears (10:4 odds)

What can he deduce? Well,

original odds * evidence adjustment = new odds

or

evidence adjustment = new odds / original odds

At night, he realizes deer are 1/4 as likely as they were previously:

10:4 / 50:5 = 2.5 / 10 = 1/4

(Put another way, bears are 4x as likely at night)

Let’s cover ratios a bit. A:B describes how much A we get for every B (imagine miles per gallon as the ratio miles:gallon). Compare values with division: going from 25:1 to 50:1 means you doubled your efficiency (50/25 = 2). Similarly, we just discovered how our “deers per bear” amount changed.

Og happily continues his research:

  • By the river, bears are 20x more likely (he saw 2 deer and 4 bears, so 2:4 / 50:5 = 1:20)
  • In winter, deer are 3x as likely (30 deer and 1 bear, 30:1 / 50:5 = 3:1)

He takes a scenario, compares it to the baseline, and computes the evidence adjustment.

Caveman Clarence subscribes to Og’s journal, and wants to apply the findings to his forest (where deer:bears are 25:1). Suppose Clarence hears an animal approaching:

  • His general estimate is 25:1 odds of deer:bear
  • It’s at night, with bears 4x as likely => 25:4
  • It’s by the river, with bears 20x as likely => 25:80
  • It’s in the winter, with deer 3x more likely => 75:80

Clarence guesses “bear” with near-even odds (75:80) and tiptoes out of there.
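Clarence’s bookkeeping is easy to mechanize. Here’s a minimal sketch in Python (my own illustration, not anything from the original post) that runs the same odds-update pipeline using exact fractions:

    from fractions import Fraction

    # Prior odds of deer:bear in Clarence's forest
    odds = Fraction(25, 1)

    # Og's evidence adjustments, expressed as deer-per-bear multipliers
    odds *= Fraction(1, 4)    # night: bears 4x as likely   -> 25:4
    odds *= Fraction(1, 20)   # river: bears 20x as likely  -> 25:80
    odds *= Fraction(3, 1)    # winter: deer 3x as likely   -> 75:80

    print(odds)                      # 15/16, i.e. 75:80 odds of deer:bear
    print(float(odds / (odds + 1)))  # ~0.484 chance of deer, if you want a percentage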

That’s Bayes. In fancy language:

  • Start with a prior probability, the general odds before evidence
  • Collect evidence, and determine how much it changes the odds
  • Compute the posterior probability, the odds after updating

Bayesian Spam Filter

Let’s build a spam filter based on Og’s Bayesian Bear Detector.

First, grab a collection of regular and spam email. Record how often a word appears in each:

             spam      normal
hello          3         3
darling        1         5
buy            3         2
viagra         3         0
...

(“hello” appears equally, but “buy” skews toward spam)

We compute odds just like before. Let’s assume incoming email has 9:1 chance of spam, and we see “hello darling”:

  • A generic message has 9:1 odds of spam:regular
  • Adjust for “hello” => keep the 9:1 odds (“hello” is equally-likely in both sets)
  • Adjust for “darling” => 9:5 odds (“darling” appears 5x as often in normal emails)
  • Final chances => 9:5 odds of spam

We’re leaning towards spam (9:5 odds). However, it’s less spammy than our starting odds (9:1), so we let it through.

Now consider a message like “buy viagra”:

  • Prior belief: 9:1 chance of spam
  • Adjust for “buy”: 27:2 (3:2 adjustment towards spam)
  • Adjust for (“viagra”): …uh oh!

“Viagra” never appeared in a normal message. Is it a guarantee of spam?

Probably not: we should intelligently adjust for new evidence. Let’s assume there’s a regular email, somewhere, with that word, and make the “viagra” odds 3:1. Our chances become 27:2 * 3:1 = 81:2.

Now we’re getting somewhere! Our initial 9:1 guess shifts to 81:2. Now is it spam?

Well, how horrible is a false positive?

81:2 odds imply for every 81 spam messages like this, we’ll incorrectly block 2 normal emails. That ratio might be too painful. With more evidence (more words or other characteristics), we might wait for 1000:1 odds before calling a message spam.
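The whole filter fits in a few lines. This sketch (mine; the word counts come from the toy table above, and the smoothing is the pretend-we-saw-one-normal-email trick just described) multiplies the prior spam odds by each word’s adjustment:

    from fractions import Fraction

    # (spam count, normal count) per word, from the table above
    COUNTS = {"hello": (3, 3), "darling": (1, 5), "buy": (3, 2), "viagra": (3, 0)}
    PRIOR = Fraction(9, 1)  # 9:1 odds of spam:regular for a generic message

    def spam_odds(message):
        """Multiply prior spam odds by each known word's spam:normal adjustment."""
        odds = PRIOR
        for word in message.lower().split():
            if word in COUNTS:
                spam, normal = COUNTS[word]
                # Smoothing: pretend each word was seen at least once in each pile,
                # so a never-seen-in-normal word like "viagra" gives 3:1, not infinity.
                odds *= Fraction(max(spam, 1), max(normal, 1))
        return odds

    print(spam_odds("hello darling"))  # 9/5  -> 9:5, less spammy than the 9:1 prior
    print(spam_odds("buy viagra"))     # 81/2 -> 81:2, block it (or wait for 1000:1)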

Exploring Bayes Theorem

We can check our intuition by seeing if we naturally ask leading questions:

  • Is evidence truly independent? Are there links between animal behavior at night and in the winter, or words that appear together? Sure. We “naively” assume evidence is independent (and yet, in our bumbling, create effective filters anyway).

  • How much evidence is enough? Is seeing 2 deer & 1 bear the same 2:1 evidence adjustment as 200 deer and 100 bears?

  • How accurate were the starting odds in the first place? Prior beliefs change everything. (“A Bayesian is one who, vaguely expecting a horse, and catching a glimpse of a donkey, strongly believes he has seen a mule.”)

  • Do absolute probabilities matter? We usually need the most-likely theory (“Deer or bear?”), not the global chance of this scenario (“What’s the probability of deers at night in the winter by the river vs. bears at night in the winter by the river?”). Many Bayesian calculations ignore the global probabilities, which cancel when dividing, and essentially use an odds-centric approach.

  • Can our filter be tricked? A spam message might add chunks of normal text to appear innocuous and “poison” the filter. You’ve probably seen this yourself.

  • What evidence should we use? Let the data speak. Email might have dozens of characteristics (time of day, message headers, country of origin, HTML tags…). Give every characteristic a likelihood factor and let Bayes sort ‘em out.

Thinking With Ratios and Percentages

The ratio and percentage approaches ask slightly different questions:

Ratios: Given the odds of each outcome, how does evidence adjust them?

The evidence adjustment just skews the initial odds, piece-by-piece.

Percentages: What is the chance of an outcome after supporting evidence is found?

In the percentage case,

  • “% Bears” is the overall chance of a bear appearing anywhere
  • “% Bears Going to River” is how likely a bear is to trigger the “river” data point
  • “% Bear at River” is the combined chance of having a bear, and it going to the river. In stats terms, P(event and evidence) = P(event) * P(event implies evidence) = P(event) * P(evidence|event). I see conditional probabilities as “Chances that X implies Y” not the twisted “Chances of Y, given X happened”.

Let’s redo the original cancer example:

  • 1% of the population has cancer
  • 9.6% of healthy people test positive, 80% of people with cancer do

If you see a positive result, what’s the chance of cancer?

Ratio Approach:

  • Cancer:Healthy ratio is 1:99
  • Evidence adjustment: 80/100 : 9.6/100 = 80:9.6 (80% of sick people are “at the river”, and 9.6% of healthy people are).
  • Final odds: 1:99 * 80:9.6 = 80:950.4 (roughly 1:12 odds of cancer, ~7.7% chance)

The intuition: the initial 1:99 odds are pretty skewed. Even with an 8.3x (80:9.6) boost from a positive test result, cancer remains unlikely.

Percentage Approach:

  • Cancer chance is 1%
  • Chance of true positive = 1% * 80% = .008
  • Chance of false positive = 99% * 9.6% = .09504
  • Chance of having cancer = .008 / (.008 + .09504) = 7.7%

When written with percentages, we start from absolute chances. There’s a global 0.8% chance of finding a sick patient with a positive result, and a global 9.504% chance of a healthy patient with a positive result. We then compute the chance these global percentages indicate something useful.
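
As a sanity check that the two routes agree, here they are in a few lines of Python (a sketch; the variable names are mine):

    # Ratio approach: skew the 1:99 prior by the 80:9.6 test adjustment.
    cancer, healthy = 1 * 80, 99 * 9.6        # 80 : 950.4
    print(cancer / (cancer + healthy))        # 0.0776...

    # Percentage approach: compare the two global positive-test outcomes.
    true_pos  = 0.01 * 0.80                   # sick and positive    = 0.008
    false_pos = 0.99 * 0.096                  # healthy and positive = 0.09504
    print(true_pos / (true_pos + false_pos))  # 0.0776..., the same ~7.7%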

Let the approaches be complements: percentages for a bird’s-eye view, and ratios for seeing how individual odds are adjusted. We’ll save the myriad other interpretations for another day.

Happy math.

05 Apr 06:41

Are there whole-grain options that are gluten-free?

Avoiding gluten doesn't have to mean skipping out on all grains. Consider these five foods that are whole-grain and gluten-free.

03 Apr 10:16

Second Gene Causes Retinoblastoma

by Ricki Lewis, PhD

This beautiful little boy has heritable retinoblastoma. The white spots in his eyes are from light reflecting off of tumors.

In a list of famous genes, RB1 would probably be #1. It’s the tumor suppressor gene whose “loss of function” is behind the childhood eye cancer retinoblastoma, and that Alfred Knudson investigated to deduce the 2-hit mechanism of cancer.

In 1971, the idea that a gene’s normal function could be to prevent cancer was revolutionary. Now a new study finds that an amplified oncogene can cause the eye cancer too — with just one hit.

TWO ROUTES TO A TUMOR
Like the earth being flat, proteins being the genetic material, and genes being contiguous, the idea that mutations in RB1 are the sole cause of retinoblastoma has vanished. A multinational team of researchers, led by Brenda Gallie of Impact Genetics and the Toronto Western Hospital Research Institute, discovered that in a very small proportion of children with retinoblastoma, an oncogene, MYCN, is the cause. Their report is in Lancet Oncology. (Dr. Gallie, who supplied the photos here, heads an international effort to bring care for RB to the 92% of the thousands of children with the disease who live in developing countries.)

Untreated retinoblastoma, circa 1806.

Retinoblastoma has a long history. A 2000 B.C. Mayan stone carving shows a child with a bulging eye. A Dutch anatomist provided the earliest clinical description, a growth “the size of two fists.” In 1886 researchers noted that the cancer can be inherited, and in those families, secondary tumors can arise, usually in bone. Once flash photography was invented, parents would notice the disease as white spots in the pupil, from light reflecting off a tumor.

About 1 in 20,000 infants has RB. In 40% of cases, both eyes develop tumors. This means that the child inherited a predisposition mutation in all cells, and the cancer develops when a second mutation “hits” a retinal cell. Of the 60% of cases where only one eye is affected, most (85%) result from 2 hits in the same retinal cell, without the inherited (aka germline or constitutional) mutation.

The road to finding the new gene began when the investigators in Toronto noticed that 7 children who had a tumor in just one eye didn’t have the expected RB1 mutation. Instead, one of the genes that typically comes into play later in the disease was present in extra copies – the oncogene MYCN. It’s known to cause neuroblastoma, another cancer that begins so early that babies are born with it. Just one amplified MYCN, like a genetic stutter, is sufficient to trigger the eye cancer. In genetics lingo, it’s a dominant, “gain-of-function” mutation.

Rosette structures are seen in classic RB tumors, but not in the newly discovered kind.

Researchers from New Zealand, France, Germany, and the Netherlands added cases, for a total of 1068, that included more children with the newly-recognized form of the disease, dubbed MYCN RB. The MYCN tumor itself, for those who like to look at such things, appears different. The rosette-like structures and coalesced nuclei seen in classic RB are absent, and the cells look like those of neuroblastoma. “It’s a subtle enough distinction that pathologists who received an eye removed under suspicion of RB would be excused if they say it looks like RB,” Timothy Corson PhD, assistant professor of ophthalmology, biochemistry and molecular biology at the Indiana University School of Medicine, told me.

The tumors of MYCN RB are large, invasive, and so aggressive that they usually appear before a child is six months old. In fact, 18% of one-eye retinoblastomas in kids this young are of the new type.

CLINICAL IMPLICATIONS
The finding of a new form of retinoblastoma has implications both specific and general.

Knowledge improves the lives of patients and their families. “If an infant with a single tumor is diagnosed with MYCN RB, then it’s not heritable. These kids don’t have the risk to pass it on to offspring, nor do they face a greater risk for subsequent cancers,” Corson said. That’s because the new form of RB arises anew in one gene in a retinal cell, not in sperm or eggs.

Follow-up is easier. “The exams through childhood are not simply going to an ophthalmologist and he or she looks in the eye. An infant must be put under general anesthesia to be still enough to carefully evaluate the eyes for tumors,” Corson says.

Treatment changes too. The first eye of a child with a family history of the disease is usually treated with chemo, rather than being removed, because the second eye is likely to develop tumors too. Preserving the eye saves sight. But in MYCN RB, “since the disease is particularly aggressive, and the other eye will not become involved, surgical removal of the diseased eye is the correct therapeutic course,” says Joan O’Brien, MD, chair of ophthalmology at the University of Pennsylvania School of Medicine.

A retinoblastoma tumor grows on the retina.

In addition to RB1 being a famous gene, the new finding enhances what is already a medical success story. “In ophthalmology we don’t often deal in life and death issues. Retinoblastoma is uniformly fatal if not treated. It’s wonderful that most of these young kids are cured nowadays,” shares Shalom Kieval, MD, associate clinical professor of ophthalmology, Albany Medical College, president of RetinaCare Consultants, Albany, New York, and chief of ophthalmology at Albany Memorial Hospital. One of his patients, whom he first met as a premature infant, brings her three kids to see him now. The mother and two of her kids have been successfully treated for RB. “The retinoblastoma story is another unsung but intensely gratifying accomplishment of modern medical science,” he adds.

THE BIGGER PICTURE

“Genetic heterogeneity” is when more than one genotype (gene variant combo) is responsible for the same phenotype (disease or trait). Hearing loss, for example, has more than 100 different genetic causes. And Leber congenital amaurosis, the main story in my gene therapy book, comes in at least 18 guises, LCA itself being a subtype of retinitis pigmentosa.

A broken bone in a patient with OI type V. (Shakata GaNai)

Genetic heterogeneity has explained false accusations of child abuse. In osteogenesis imperfecta (OI), aka “brittle bone disease,” bones break, often before birth. Like retinoblastoma, OI has left marks on history. An Egyptian mummy from 1000 B.C. had it, as did the 9th century Viking “Ivar the Boneless,” who was reportedly carried into battle on a shield and whose remains were exhumed and burnt by King William I, forever obscuring the true diagnosis.

Until 1979, mutation in a single collagen gene was thought to cause OI. But cases arose of parents bringing babies with shattered bones to hospitals, and charges of child abuse were filed. If social workers or other medical people knew about OI, the parents would be tested – but sometimes the tests were negative, even though the distraught parents insisted they hadn’t harmed their child.

Discovery of the second OI gene absolved some parents, but false accusations still happened. Then a third OI gene was found. I just checked Online Mendelian Inheritance in Man – the bible of geneticists – and found 14 types, but it’s confusing. Different types reflect mutations in the same gene, yet the same type can embrace different genes. The 14 types arise from 7 chromosomal addresses.

Whatever the terminology, the origins of discovering some of the recessive types are fascinating, often involving consanguinity (genetics speak for inbreeding).

OI type II was traced to the Mozabites, a polygamous sect in southern Algeria that had brother-sister/husband-wife pairs. Type VII came from a First Nations community in northern Quebec, where two generations of three interrelated families had many babies born with broken bones. Type IX families are from Pakistan, Senegal, and the Irish Traveller ethnic group, and type XI derives from 5 consanguineous families from northern Turkey. Inbred families often illuminate rare single-gene diseases because parents share mutations inherited from their shared ancestors. The mutations are extremely rare in the outbred population in the area.

Annotation of the human genome is revealing more instances of genetic heterogeneity, explaining people who have clear symptoms of a known genetic disease yet test negative – like in retinoblastoma and osteogenesis imperfecta.

I hope the genetic testing industry, both clinical and direct-to-consumer, can keep up, and that patients/customers aren’t falsely reassured by low risks that don’t tell the whole story – because we may not know all of it, as was the case for retinoblastoma.

So while we’re busy sequencing all those exomes and genomes, keep in mind that we don’t know all there is to know. Single genes can still surprise us.

(For a more technical version of this information see Medscape.)

01 Apr 09:53

Welcome to the Nature Genetics iCOGS collection

by Orli Bahcall

We are pleased to publish today a Focus issue on cancer risk, including findings from the COGS (Collaborative Oncological Gene-environment Study) consortium, published as 13 coordinated papers in this Nature Genetics iCOGS collection.

At Nature Genetics, we give voice to leading efforts to understand the genetic basis of disease.  Over the past six years, we have seen mass surveys of genetic variants across the human genome, called genome-wide association studies, yield key insights into hundreds of common diseases.

Today, we’re proud to see how COGS, extending this approach to oncology, has doubled the number of genetic regions implicated in breast, ovarian and prostate cancers.  As such, these 13 papers represent a milestone in our understanding of these common cancers, and exemplify what’s needed in such discovery efforts.

To provide a brief summary before you dive in, several overall findings from this collection bear highlighting:

1) This collection implicates 74 new genomic regions in these 3 cancers, doubling the number of reported associations.

2) This work finds a substantial contribution from common genetic variation to cancer risk heritability, explaining up to roughly 1/3 of the familial relative risk for these cancers.

3) The studies find that variation in some parts of our genomes is associated with multiple hormone-sensitive cancers, suggesting shared mechanisms.

4) This collection also establishes a general framework for refining initial association findings through fine-mapping and functional annotation.

5) Finally, these efforts have pointed to ways that such findings may find use in personal healthcare and genetic risk prediction.

These studies also exemplify what’s needed in such genetic discovery efforts, including these three important factors:

1) This work requires very large samples, with these new papers summarizing data from more than 200,000 people, making this the largest cancer genotyping discovery effort.

2) These studies entail collaboration, here bringing together researchers from more than 160 institutions worldwide, in four cancer-specific consortia, to execute over 40 studies.

3) This collection of studies pools efforts across distinct but related diseases – here, 3 hormone-related cancers. This allows more efficient use of hard-won data, for faster and more precise insights into the shared versus unique genetic underpinnings of particular diseases.

In supporting this collaborative effort, we were pleased to work directly with the COGS authors to coordinate these 13 publications across 5 leading journals, working with editors at The American Journal of Human Genetics, Human Molecular Genetics, Nature Communications, and PLoS Genetics.

And, to best convey the findings, Nature Genetics is debuting a new online publishing format that builds from the single paper as the unit of publication to a broad, unified view into the entire iCOGS collection.

We hope that this iCOGS explorer microsite will help our readers quickly understand and access materials across the set of coordinated papers.  This iCOGS explorer includes a series of essays, called Primers, which interactively guide readers through the studies, summarizing the main findings and themes of the collection, offering direct glimpses into each paper, and perspectives on how they relate to other work in the field.

This new online publishing format builds on the Threads used in the ENCODE explorer.  These Primers now interlace “threads” (direct quotes from the papers) with editorial commentary in order to provide more context and analysis.

We also hope that this provides an example to help the public understand current efforts to characterize the genetic basis of common diseases, as we review here how and why such studies are performed, and what insights are possible.

Without further delay, we welcome you to jump right into the Nature Genetics iCOGS collection.  Not sure where to start or overwhelmed with this stack of materials?  I recommend picking your favorite from these questions below, linked to answers in our Primers.

 

We hope that you find this collection and the iCOGS explorer useful, and that this proves of benefit in accessing the information in these current studies, as well as promoting continuing collaborative research.

 

Orli G. Bahcall
Senior Editor, Nature Genetics
Twitter: @obahcall