10 Dec 21:29

Viewpoint: What Can we Say about a Photon’s Past?

An experiment demonstrates that even when physicists think a quantum particle has followed a single path it might not have.

Published Mon Dec 09, 2013

09 Dec 19:40

Judging A Book By Its Coverage

by Pip

Flaws in constitution

Cropped from book page.

Rebecca Goldstein is the author of the book Incompleteness: The Proof and Paradox of Kurt Gödel. She obtained her PhD in Philosophy from Princeton University, and has also written several novels set in academia, including The Mind-Body Problem and Properties of Light: A Novel of Love, Betrayal and Quantum Physics. The latter draws on the life and concerns of the physicist David Bohm.

Today Ken and I wish to talk about Kurt Gödel’s journey in getting his USA citizenship, and his journey since then in the interpretation and implications of his research.

Gödel’s citizenship interview happened on Thursday the 5th of December, 1947—over fifty years ago. Even over sixty years ago, come to think of it. Past a certain age it becomes better to focus on the wider part of the calendar than the four-digit number at the top.

I bought Goldstein’s book years ago and started to read it. But somehow the initial few pages were not that compelling, or I was distracted by doing something else. In any event, I recently had a long plane flight and took the book—yes I still read hard-copy printed books—along. Partially because it was small, partially because it was on Gödel, and partially by randomness.

It turns out the book is a mixed bag. It was a fun read, with many interesting insights into the life of Gödel. It was also filled with strange errors that I easily noticed, even flying at 36,000 feet without any access to Google search. Yet I did enjoy the book, and am sorry I had not read it before. Well not completely—without it the plane flight would have been longer, since reading helps shrink the time of a flight.

The Story

Here is the story, according to Goldstein, of the day Gödel went to Trenton to get sworn in as an American citizen. Gödel had prepared well for his hearing, and had further discovered that the U.S. Constitution has a flaw that could allow it to become a dictatorship.

Oskar Morgenstern and Albert Einstein drove Gödel to Trenton for his hearing before the judge. On the car ride Einstein tried to distract Gödel with jokes: “Well, are you ready for your next-to-the-last test?” Gödel answered “What do you mean, ‘next-to-the-last’?” Einstein aded, “Very simple. The last will be when you step into your grave.”

Einstein continued on till they reached the court where the judge was Philip Forman, who was a friend of Einstein besides having administered Einstein’s own citizenship oath. The judge moved them quickly into his private chambers. Einstein and the judge chatted while Gödel sat mute. Finally the judge said to Gödel, “Up to now you have held German citizenship.” Gödel corrected him: Austrian citizenship. The judge added, “In any case, it was under an evil dictatorship. Fortunately, that is not possible in America.”

As Goldstein says, this was what Gödel was waiting for. Gödel started to explain how it could happen here because of the flaw in the Constitution. The judge interrupted and said “You needn’t go into all that.” The rest when smoothly and after the oath Gödel become a US citizen. Later in a letter to his mother, Gödel remarked that Forman was a “very sympathetic person.”

The Lost Story

It was known that Morgenstern had written an account of that day, but when his widow was interviewed in 1983 by John Dawson, she had been unable to locate it. Dawson used her recollections in his 1997 biography of Gödel. In 2006 the Institute for Advanced Study hailed the centennial of Gödel’s birth in its spring newsletter. This included a sidebar titled “Gödel, Einstein, and the Immigration Service,” later reproduced on their Gödel page, but with a story quite different from what Dawson had heard. Moreover, the IAS gave the year as 1948. Perhaps they followed my advice about calendars.

Mathematician and author Jeffrey Kegler, who based a novel on Gödel’s two lost notebooks, tells the full story on a neat page with links to all sources, including his own blog posts. While editing Wikipedia’s Gödel page in November 2008, he found another account that “rang true” more than the existing hearsay accounts, and resembled the IAS version. He was convinced the latter had to be based on a true original. He contacted Dawson, who in turn prompted the Institute to find and release it.

Morgenstern in fact mentions only the year 1946. Here is part of what he wrote:

…[Gödel] rather excitedly told me that in looking at the Constitution, to his distress, he had found some inner contradictions and that he could show how in a perfectly legal manner it would be possible for somebody to become a dictator and set up a Fascist regime… I tried to persuade him that he should avoid bringing up such matters at the examination before the court in Trenton, and I also told Einstein about it: he was horrified that such an idea had occurred to Gödel, and he also told him he should not worry about these things nor discuss that matter.

Many months went by and finally the date for the examination in Trenton came. … While we were driving, Einstein turned around a little and said, “Now Gödel, are you really well prepared for this examination?” Of course, this remark upset Gödel tremendously, which was exactly what Einstein intended and he was greatly amused when he saw the worry on Gödel’s face. …

When we came to Trenton, we were ushered into a big room, and while normally the witnesses are questioned separately from the candidate, because of Einstein’s appearance, an exception was made and all three of us were invited to sit down together, Gödel, in the center. The examiner first asked Einstein and then me whether we thought Gödel would make a good citizen. We assured him that this would certainly be the case, that he was a distinguished man, etc. And then he turned to Gödel and said,

“Now, Mr. Gödel, where do you come from?”

Gödel: “Where I come from? Austria.”

The Examiner: “What kind of government did you have in Austria?”

Gödel: “It was a republic, but the constitution was such that it finally was changed into a dictatorship.”

The Examiner: “Oh! This is very bad. This could not happen in this country.”

Gödel: “Oh, yes, I can prove it.”

So of all the possible questions, just that critical one was asked by the Examiner. Einstein and I were horrified during this exchange; the Examiner was intelligent enough to quickly quieten Gödel… and broke off the examination at this point, greatly to our relief. …

Then off to Einstein’s home again, and he turned back once more toward Gödel, and said, “Now, Gödel, this was your last-but-one examination;” Gödel: “Goodness, is there still another one to come?” and he was already worried. And then Einstein said, “Gödel, the next examination is when you step into your grave.” Gödel: “But Einstein, I don’t step into my grave.” and then Einstein said, “Gödel, that’s just the joke of it!” and with that he departed. I drove Gödel home. Everybody was relieved that this formidable affair was over; Gödel had his head free again to go about problems of philosophy and logic.

The Lost Flaw

Maddeningly left out is what exactly the “inner contradictions” were. There have been various speculations, even a paper, most revolving around the Constitution’s providing the power to amend itself. Kegler has his own hypothesis.

Here I—Ken writing this—must confess I am unable to locate the webpage with what I took to be the flaw when I did background reading for our first “interview” with Gödel two years ago. I’ve alas never picked up the index-card habit. What struck my memory, however, was the source’s reference to the Senate and the judiciary.

Trying to reconstruct it, I think the path to dictatorship Gödel feared starts with something like this: The President of the Senate declares that a rules issue is a Constitutional question. This enables a bare majority, exploiting the gaps in Article I, to rewrite the rules of the Senate. Such a rule change can enable the uncontested appointment of Federal judges. Those judges in turn can… Well, anyway, nothing like that would ever actually happen.

Back to Dick and to Goldstein’s book, which to be fair, came out a few months before the IAS newsletter with Morgenstern’s account.

The Book

The book is—as I stated already—a mixed-bag, at best. I liked the history and insights into Gödel’s life. Yet it has many errors—both small errors that were almost just typos, and major errors. I had my thoughts, but Ken found the tough review by Solomon Feferman, so let’s quote that:

As to the core of Goldstein’s book, anyone familiar with Gödel’s work has to flinch. Dozens of errors could have been avoided by an expert vetting of the manuscript. At the very least we would not have had ‘Kreisl’ for ‘Kreisel,’ ‘Kline’ for ‘Kleene,’ and ‘Tannenbaum’ for ‘Teitelbaum’ (the birth surname of Alfred Tarski, the great logician, whose significant interaction with Gödel barely merits Goldstein’s notice).

In the air, flying way above the clouds, I certainly wondered if I was dreaming when I saw the reference to “Kreisl.” At first I wondered did she mean someone other than the famous logician Georg Kreisel? I could not believe that there could be another. Kreisel worked on proof theory and is known for many things including this amazing conjecture:

Suppose that Peano Arithmetic (PA) proves ${A(S^{n}(0))}$ in ${O(1)}$ steps for all ${n}$ , then PA proves ${\forall x A(x)}$ .

Note: ${S^{n}(0)}$ is the successor function applied to ${0}$ a total of ${n}$ times: it is ${n}$ in unary. There is some evidence for and against it; the latter two papers are by the same author, Pavel Hrubeš.

Errors aside, the book does have some interesting bits of history about Gödel and other mathematicians of his era. Many of the stories are known, perhaps well known. The book is much more about people and their history than a primer of the Incompleteness Theorems. One story that I knew but l like a lot is about Einstein’s salary negotiation with the head of IAS:

Einstein asked for a salary of $3,000, and the head “countered” with an offer of $16,000.

A very interesting example of negotiation. Quoting Feferman again:

What she does very well is to provide a vivid biographical picture of Gödel, beginning mid-stream with his touching relationship with Albert Einstein at the Institute for Advanced Study in Princeton, where, over a period of 15 years until Einstein’s death in 1955, they were often seen walking and talking together.

But he ends with:

Those who are fascinated by Gödel’s theorems—and the general idea of limits to what we can know—may still hunger for a more universal view of their possible significance. But they should not be satisfied with Goldstein’s ‘vast and messy’ goulash; hers is not a recipe for true understanding.

Indeed Feferman most loudly criticizes her signing on to the “view [t]hat Gödel’s theorems were designed to refute the formalist program of David Hilbert.” Both Ken and I have been careful to portray Gödel in harmony with Hilbert, and even as compressing rather than expanding the implications of his own theorems. Of course we have conjured our own fictionalizations of Gödel, and however well sourced, they may have errors. If so, we will amend them. Scrupulousness even made this post a day late.

How Many Unprovable Statements Are There?

While we are talking about Gödel’s Incompleteness Theorems, Tim Gowers has raised a question about unprovable statements in mathematics. In essence it is: Why do we as practicing theorem provers seem to be able to avoid the unprovability issues of Gödel? Or do we?

I have an answer that I am sure Gowers saw, but thought I would share. Consider all true statements ${\phi}$ in Peano Arithmetic of size ${N}$ in some standard encoding. I claim that there is a positive ${\delta}$ so that at least a ${\delta}$ fraction of these true statements cannot be proved in PA. The proof is quite simple. Pick any single unprovable statement ${A}$ . Then consider the set of statements of the form:

$\displaystyle \phi = A \wedge B$

for any true ${B}$ . None of these are provable in PA, and they form a positive fraction of all the true statements of length ${N}$ . Statements ${A \vee B}$ where ${A}$ is provable yield a similar upper bound separated from ${1}$ on the proportion of unprovable statements.

Open Problems

A natural question is: in the limit are there more unprovable than provable statements of size ${N}$ as ${N}$ goes to infinity? This depends on encoding details but should be a robust enough question under reasonable conditions. Is it clear that there is a limit? Of course the above construction leads to many uninteresting statements. So the second question might be: can we sharpen the question, for instance by associating to a provable ${\phi}$ the idea of minimizing the size of ${\psi}$ such that ${\psi \longrightarrow \phi}$ has a “trivial” proof?

08 Dec 04:56

Peter Higgs: “Today I wouldn’t get an academic job. It’s as simple as that”

by woit

The Guardian has an interesting piece about Peter Higgs, evidently their reporter talked to him on his way to the Nobel Prize ceremonies this week in Stockholm. Higgs will be speaking tomorrow (Sunday), and I’m curious to hear what he will have to say. His talk will be available live at the Nobel Prize website.

Higgs points out that the kind of work he was awarded the prize for was done in an environment that no longer exists:

He doubts a similar breakthrough could be achieved in today’s academic culture, because of the expectations on academics to collaborate and keep churning out papers. He said: “It’s difficult to imagine how I would ever have enough peace and quiet in the present sort of climate to do what I did in 1964.”

By the time he retired in 1996, he was glad to be out of academia:

After I retired it was quite a long time before I went back to my department. I thought I was well out of it. It wasn’t my way of doing things any more. Today I wouldn’t get an academic job. It’s as simple as that. I don’t think I would be regarded as productive enough.

Higgs has definitely not been a careerist sort, turning down a knighthood in 1999:

I’m rather cynical about the way the honours system is used, frankly. A whole lot of the honours system is used for political purposes by the government in power.

He thinks he likely would have been fired by his university back in the 1980s if there hadn’t been a prospect of him getting a Nobel.

The work Higgs did in 1964 was on a rather unpopular topic. At the time the reigning ideology was “S-matrix theory”, which argued that local quantum field theory was a hopeless subject, so one should be working on formulating basic physics just in terms of S-matrix amplitudes, using their holomorphicity properties (this idea has had somewhat of a comeback in recent years). The 1960s however was a time of a great expansion in the number of university positions, so people like Higgs could make a career despite working on unpopular topics.

Progress in particle theory slowed dramatically after the early 1970s. One reason for this of course has been the huge success of the Standard Model, as well as the inherent difficulties involved in getting experimental access to higher energy scales. One wonders though whether the post-1970 collapse of the HEP theory job market and very different environment that ensued might have had something to do with this. As Higgs himself is well-aware, if he had come along 10 years later, he would not have found a job in the field.

In the UK today, things seem to be getting even worse, with strong pressures from the government to only fund work likely to have an immediate economic payoff. For more about this, see this commentary at Physicsfocus by Philip Moriarty on The Spirit-Crushing Impact of Impact. The UK has just announced the founding of a new Higgs Centre for Innovation, to be built in Edinburgh and opened in 2016. It will be devoted though not to the kind of research Higgs had success with, but to “big data” and “space”, considered by the government to be among the most promising technologies for the future. It’s rather ironic that Higgs is the sort of scientist who would not be employable by the Higgs Centre.

Update: For the acceptance speech by Higgs, see here, and see here for an official interview. For a different point of view, from one of the experimenters who made the award to Higgs possible, see here.

25 Nov 17:51

Toward a better music theory

by Ethan Hein

I seem to have touched a nerve with my rant about the conventional teaching of music theory and how poorly it serves practicing musicians. I thought it would be a good idea to follow that up with some ideas for how to make music theory more useful and relevant. The goal of music theory should be to explain common practice music. I don’t mean “common practice” in its present pedagogical sense. I mean the musical practices that are most prevalent in a given time and place, like America in 2013. Rather than trying to identify a canonical body of works and a bounded set of rules defined by that canon, we should take an ethnomusicological approach. We should be asking: what is it that musicians are doing that sounds good? What patterns can we detect in the broad mass of music being made and enjoyed out there in the world?

I have my own set of ideas about what constitutes common practice music in America in 2013, but I also come with my set of biases and preferences. It would be better to have some hard data on what we all collectively think makes for valid music. Trevor de Clerq and David Temperley have bravely attempted to build just such a data set, at least within one specific area: the harmonic practices used in rock, as defined by Rolling Stone magazine’s list of the 500 Greatest Songs of All Time. Temperley and de Clerq transcribed the top 20 songs from each decade between 1950 and 2000. You can see the results in their paper, “A corpus analysis of rock harmony.” They also have a web site where you can download their raw data and analyze it yourself. The whole project is a masterpiece of descriptivist music theory, as opposed to the bad prescriptivist kind.

Of course, the Rolling Stone Top 500 has some problems as a data set. First of all, there’s no common agreement as to what the word “rock” even refers to. Temperley and de Clerq identify two main senses of the word. There’s the sense Rolling Stone uses, an umbrella term for late twentieth century Anglo-American popular music. By this definition, rock includes soul/R&B standards, disco hits, middle-of-the-road pop and a few iconic country, jazz and hip-hop songs. On the other hand, there’s the more narrow and descriptive sense of the word “rock” that includes Led Zeppelin and Aerosmith but specifically excludes jazz, hip-hop and so on. Taking this view, the Rolling Stone list is not really a list of rock songs; it’s a list of “the greatest songs of the rock era.” Temperly and de Clerq don’t get too bogged down in the semantics; the Rolling Stone list is as complete a consensus mainstream pop collection as exists, so it’s a good enough place to start.

When you put "rock" into Google Image Search, you get this

A few results jump out from the study. As you’d expect, the tonic I is the most commonly-used chord in the Rolling Stone corpus. However, the next most common chord is IV, and it most frequently precedes I. Right away, we have a conflict with traditional classical theory, where the most basic tonal building block is the V-I cadence. Rock uses plenty of V-I, but it uses even more IV-I. And the third most common pre-tonic chord in rock is not ii, like you’d expect if you went to music school; it’s bVII, reflecting rock musicians’ love of mixolydian mode. These same three chords, IV, V and bVII, are also the ones most likely to follow the tonic in rock, again very much at odds with classical practice. Temperley and de Clerq observe:

In light of this data, one might conclude that rock is not governed by rules of ‘progression’ at all; rather, there is simply an overall hierarchy of preference for certain harmonies over others, regardless of context.

In common-practice music, conventional theory dictates that certain root patterns are preferred over others: ascending motion by fourths is especially normative (much more so than descending fourth motion); descending thirds are favored over ascending thirds, and ascending seconds over descending seconds (Schoenberg 1969). Are these principles observed in rock as well? It can be seen immediately that the norms of common-practice music do not hold. For each interval, the ascending and descending forms are roughly equal in frequency. The ascending perfect fourth is almost exactly as common as the descending perfect fourth; for other intervals, too, a similar pattern is seen. The frequency of intervals decreases in a very regular way as circle-of-fifths distance increases.

Blues is a central pillar of rock, and blues violates quite a few tenets of common-practice classical harmony. The biggest one is the distinction between major and minor. The sound of blues is in large part the sound of minor melodies and chord extensions over major chord progressions. The more blues-oriented flavors of rock are similarly ambiguous in their major/minor identity. A lot of the time, rock chords are neither major nor minor, like the famous power chord, which is just root-fifth-root.

The harmonic situation gets more complicated still if you include hip-hop in the data set. The Rolling Stone list includes “Bring The Noise” by Public Enemy, which doesn’t have any triadic harmony at all. Temperley and de Clerq dealt with that by just not including the track in their analysis, which strikes me as cowardly. A real theory of contemporary music would have to deal with hip-hop, which may not have triads, but does have strongly melodic unpitched vocal lines, modal harmonies and, sometimes, very crunchy dissonances and microtones.

To my mind, the most intriguing idea put forth by de Clerq and Temperley is the supermode, the collection of pitches most frequently used in rock melodies:

Temperley explains:

The supermode could be viewed as the union of the Ionian (major) and Aeolian (natural minor) modes; one might also think of it as a set of adjacent scale degrees on the line of fifths, extending from flatscale degree 6 to scale degree 7. In enharmonic terms, this collection excludes just two scale degrees, sharpscale degree 4/flatscale degree 5 and sharpscale degree 1/flatscale degree 2—precisely the same degrees that are outside the “global” scale collection of common-practice music.

I like the idea of the supermode. Classical music’s obsession with the major scale runs counter to most Americans’ intuition. Sure, we like the major scale fine, but it doesn’t feel like the One True Generative Scale that classical music holds it to be. Flat sevenths sound as “natural” to me as natural sevenths. (Actually, flat sevenths are a lot lower in the overtone series; you could make a case that mixolydian should be the One True Scale.) I think the best idea would be to just teach kids the supermode, rather than hitting them with the confusing idea that you have to modify the major scale to get the sounds you’re used to.

See the followup post: can science make a better music theory?

Yohanjohn likes this

18 Nov 20:34

Synchronization Across Sensory Cortical Areas by Electrical Microstimulation is Sufficient for Behavioral Discrimination

by Manzur, H. E., Alvarez, J., Babul, C., Maldonado, P. E.

The temporal correlation hypothesis proposes that cortical neurons engage in synchronized activity, thus configuring a general mechanism to account for a range of cognitive processes from perceptual binding to consciousness. However, most studies supporting this hypothesis have only provided correlational, but not causal evidence. Here, we used electrical microstimulation of the visual and somatosensory cortices of the rat in both hemispheres, to test whether rats could discriminate synchronous versus asynchronous patterns of stimulation applied to the same cortical sites. To disambiguate synchrony from other related parameters, our experiments independently manipulated the rate and intensity of stimulation, the spatial locations of stimulation, the exact temporal sequence of stimulation patterns, and the degree of synchrony across stimulation sites. We found that rats reliably distinguished between 2 microstimulation patterns, differing in the spatial arrangement of cortical sites stimulated synchronously. Also, their performance was proportional to the level of synchrony in the microstimulation patterns. We demonstrated that rats can recognize artificial current patterns containing precise synchronization features, thus providing the first direct evidence that artificial synchronous activity can guide behavior. Such precise temporal information can be used as feedback signals in machine interface arrangements.

18 Nov 19:57

Mega-churches for atheists

by Minnesotastan

From the StarTribune:

Nearly three dozen gatherings dubbed "atheist mega-churches" by supporters and detractors have sprung up around the U.S. and Australia — with more to come — after finding success in Great Britain earlier this year. The movement fueled by social media and spearheaded by two prominent British comedians is no joke...

They don't bash believers but want to find a new way to meet likeminded people, engage in the community and make their presence more visible in a landscape dominated by faith...

"If you think about church, there's very little that's bad. It's singing awesome songs, hearing interesting talks, thinking about improving yourself and helping other people — and doing that in a community with wonderful relationships. What part of that is not to like?"..

Sunday Assembly — whose motto is Live Better, Help Often, Wonder More — taps into that universe of people who left their faith but now miss the community church provided... It also plays into a feeling among some atheists that they should make themselves more visible...

"In the U.S., there's a little bit of a feeling that if you're not religious, you're not patriotic. I think a lot of secular people say, 'Hey, wait a minute. We are charitable, we are good people, we're good parents and we are just as good citizens as you and we're going to start a church to prove it.."

During the service, attendees stomped their feet, clapped their hands and cheered as Jones and Evans led the group through rousing renditions of "Lean on Me," ''Here Comes the Sun" and other hits that took the place of gospel songs. Congregants dissolved into laughter at a get-to-know-you game that involved clapping and slapping the hands of the person next to them and applauded as members of the audience spoke about community service projects they had started in LA.

18 Nov 19:53

Pi vs. Tau

Conveniently approximated as e+2, Pau is commonly known as the Devil's Ratio (because in the octal expansion, '666' appears four times in the first 200 digits while no other run of 3+ digits appears more than once.)

Weronika, 桂红刚 and 2 others like this

14 Nov 00:02

Topology of viral evolution [Mathematics]

by Chan, J. M., Carlsson, G., Rabadan, R.

The tree structure is currently the accepted paradigm to represent evolutionary relationships between organisms, species or other taxa. However, horizontal, or reticulate, genomic exchanges are pervasive in nature and confound characterization of phylogenetic trees. Drawing from algebraic topology, we present a unique evolutionary framework that comprehensively captures both clonal and...

Erik.rikard.daniel likes this

13 Nov 23:58

Phase Space Crystals: A New Way to Create a Quasienergy Band Structure

by Lingzhen Guo, Michael Marthaler, and Gerd Schön

Author(s): Lingzhen Guo, Michael Marthaler, and Gerd Schön

A novel way to create a band structure of the quasienergy spectrum for driven systems is proposed based on the discrete symmetry in phase space. The system, e.g., an ion or ultracold atom trapped in a potential, shows no spatial periodicity, but it is driven by a time-dependent field coupling highly...

[Phys. Rev. Lett. 111, 205303] Published Wed Nov 13, 2013

12 Nov 18:32

California Lawmakers Want Porn Stars to Wear Safety Goggles

by Jess Remington

California’s workplace safety guardians have proposed an amendment to a bill that would require porn stars to wear protective goggles while filming.

The bill, which has so far stalled in the state senate, establishes numerous mandates for the porn industry to follow with the goal of curbing the spread of sexually transmitted diseases. Among these mandates is the requirement that “personal protective equipment" be used to "prevent contact of an employee's eye, skin, mucous membranes, or genitals with the blood or OPIM-STI of another." (OPIM-STI includes pre-ejaculate, semen, vaginal secretions, and fecal matter.)

According to the bill, which was originally posted on a NSFW adult entertainment blog:

The employer shall provide, at no cost to the employee, appropriate personal protective equipment such as, but not limited to, condoms, gloves for cleaning, and, if contact of the eyes with OPIM-STI is reasonably anticipated, eye protection.

So basically, cum shots sans eyegear would be illegal.

The bill would also require employees to wear condoms during vaginal and anal sex, gloves when touching “contaminated laundry,” and create a specific exemption for condom wearing during oral sex to be reviewed in January 2018.

Porn actors are actually already required to follow these rules, says Deborah Gold, the deputy chief for health of California’s Division of Occupational Safety and Health. They just rarely do. In an interview with Salon, Gold said that “these draft guidelines are an attempt to tailor existing workplace-safety rules relating to blood-borne pathogens specifically to the adult industry.”

The bill follows in the footsteps of last year's Safer Sex in the Adult Film Industry Act (or Measure B), the much-discussed, voter-approved measure that requires porn actors in Los Angeles County to wear condoms on set.

James Deen, an award-winning porn star and a staunch opponent of condom mandates, spoke with the Huffington Post after the measure passed:

"It will be interesting to see what happens next. People will most likely move production out of Los Angeles and take out tax money with us. Hopefully this measure passing will help us [adult entertainers] get more organized in the future and that, along with Los Angeles losing our business, will allow people in politics to start seeing us as an asset."

There's evidence that Deen’s prediction came true: After the measure passed, the number of requests for porn production permits in Los Angeles dropped from an annual norm of 500 to two. Now it appears that California politicians are looking to implement even more expansive protective barrier mandates.

In a satirical video protesting Measure B, James Deen and co-star Jessica Drake show what a scene with mandatory safety goggles and latex barriers everywhere could look like. NSFW! You've been warned. Watch it after the jump.

11 Nov 18:21

November 08, 2013

Emails yelling at me will commence in 3... 2... 1...

RaptoR, Alvetica and 11 others like this

31 Oct 16:46

When Hollywood collaborated with the Nazis

by Minnesotastan

You read the title correctly. The reason for the logical disconnect is that the time period in question is the 1930s, before the war and the most overt atrocities, as explained in a Harvard Magazine review of a new book:

Based on nearly nine years of archival research in Germany and the United States, the book reveals a surprisingly cooperative relationship between studio executives and German officials throughout the 1930s... MGM head Louis B. Mayer made changes to films at the request of the German consul in Los Angeles in the 1930s...

In 1932, six months before Hitler came to power, Germany adopted a law stipulating that any film company caught making anti-German (or later, anti-Nazi) films would be prohibited from doing business in the country. For studio executives who feared losing access to German audiences, it was a powerful threat. Before World War I, Germany had been the second-largest market for U.S. films. By the 1930s, the studios were no longer making money there, but they hoped business would improve in time. Urwand says Hollywood executives also worried that if they left Germany and Hitler started a war, they would be expelled from any countries he invaded. So studio heads, many of whom were Jewish, collectively boycotted a proposed film, The Mad Dog of Europe, about the mistreatment of European Jews, and agreed to fire most of their Jewish salesmen in Germany...

Commentators have drawn parallels between the Nazi collaboration that Urwand describes and Hollywood’s current relationship with China, a burgeoning market for American films. Urwand stresses that “China isn’t Nazi Germany,” but he acknowledges some potential parallels. “Hollywood is not going to make a strongly anti-Chinese film at this point, just as it didn’t make anti-German films when it was trying to preserve its business with Germany.”

More at the link.

Photo: ©Hulton-Deutsch Collection/Corbis Images

29 Oct 14:38

tl;dr

by noreply@blogger.com (Jay Ackroyd (@jayackroyd))

Glennzilla gives good interview.

But if you want a highlight:

So, for the top national security official in the United States to go to the Senate and lie to their faces and deny that the NSA is doing exactly that which our reporting proved that the NSA was in fact doing is plainly a crime, and of course he should be prosecuted, and would be prosecuted if we lived under anything resembling the rule of law, where everybody is held and treated equally under the law, regardless of position or prestige. Of course, we don’t have that kind of system, which is why no Wall Street executives have been prosecuted, no top-level Bush officials were prosecuted for torture or warrantless eavesdropping, and why James Clapper hasn’t been prosecuted despite telling an overt lie to Congress. And what’s even more amazing, though, Amy, is that not only has James Clapper not been prosecuted, he hasn’t even lost his job. He’s still the director of national intelligence many months after his lie was revealed, because there is no accountability for the top-level people in Washington.

And the final thing to say about that is, there’s all kinds of American journalists who love to go on television and accuse Edward Snowden of committing all these grave and horrible crimes. They’re so brave when it comes to declaring Edward Snowden to be a criminal and calling for [inaudible]. Not one of them has ever gone on television and said, "James Clapper committed crimes, and he ought to be prosecuted." The question that you just asked journalistically is such an important and obvious one, yet not—none of the David Gregorys or Jeffrey Toobins or all these American journalists who fancy themselves as aggressive, tough reporters, would ever dare utter the idea that James Clapper ought to be arrested or prosecuted for the crimes that he committed, because they’re there to serve those interests and not to challenge or be adversarial to them.

Update: Commenters said if the Amy Goodman interview wasn't too longer enough for you, see the email exchange with Keller.

23 Oct 20:09

Things Change

by noreply@blogger.com (Atrios)

In 20 years no one will admit to have ever opposed legal gay marriage. The pace of change amazes me. I do wonder how many people still admit to ever opposing interracial marriage. A majority did, until 1997.

23 Oct 20:01

Uruguay Plans to Impose Price Cap on Legal Marijuana at $1/Gram

by Jess Remington

Uruguay is on its way to becoming the first country in the world to legalize marijuana and it’s taking a unique approach. The country’s drug czar, Jose Calzada, told a local newspaper on Sunday the national government plans to set up a regulatory agency for overseeing the sale and distribution of marijuana at roughly $1/gram.

According to Calzada, marijuana sales will likely start in the middle of 2014. Unlike Colorado and Washington’s approach of taxing the drug for revenue, Calzada said the state will impose a price cap in order to undercut the black market, where pot sells for roughly $1.40 (30 pesos) per gram.

"The illegal market is very risky and of poor quality," Calzada said. The state "is going to offer a safe place to buy a quality product.”

As the AP reports: That's an eighth or less of what marijuana costs at legal medical dispensaries in some U.S. states.

However, before Uruguay’s pot distributors can (legally) sell anything, the Senate first needs to approve a legalization bill that is coming up to vote later this year. The bill is expected to pass, as Uruguay’s lawmakers generally support drug policy reform. President José Mujica has campaigned for legalization on the grounds that it will dramatically cut cartel violence. And Senate leaders have said they have a “comfortable majority” willing to vote in favor.

The legislators’ main hurdle is the public, 64% of whom oppose the bill. Many are worried that legalization could open the door for harder drug use and that it could turn the country into a hub for marijuana tourism.

If the bill does pass, Uruguay will set up an Institute for the Regulation and Control of Cannabis and implement provisions regulating the purchase and production of recreational marijuana. A few of the provisions, according to RT:

Uruguayan citizens will be required to register in a private database… and will be restricted to 40 grams each per month.

Citizens will legally be allowed to cultivate six marijuana plants per head or band together and organize a club of up to 45 members with the possibility of growing 99 plants.

The law will also limit sales to residents, which should be a boon for Uruguayans concerned about pot tourism and a disappointing drawback for said potential pot tourists.

22 Oct 20:35

Minecraft gets quantum blocks in Google mod

Google's quantum lab has released a version of the computer game designed to teach players about the weird principles of quantum mechanics

22 Oct 17:17

Henhouse

by noreply@blogger.com (Jay Ackroyd (@jayackroyd))

One of the hallmarks of the technocratic centrists is the belief that industry regulation is passé--that policy makers must rely on private enterprises because that is where you find the real experts. Which leads to stuff like this:

Quality Software Services Inc., or Q.S.S.I., a unit of the UnitedHealth Group, developed the identity management system, another major component that allowed consumers to register and establish accounts.

I suppose it's too way too late to bring this up again, but the bill was called the Patient Protection and Affordable Care Act. "Protection" largely referring to the insurance companies.

17 Oct 06:31

Law of Urination: all mammals empty their bladders over the same duration. (arXiv:1310.3737v3 [physics.flu-dyn] UPDATED)

by Patricia J. Yang, Jonathan C. Pham, Jerome Choo, David L. Hu

Many urological studies rely upon animal models such as rats and pigs whose urination physics and correlation to humans are poorly understood. Here we elucidate the hydrodynamics of urination across five orders of magnitude in animal mass. Using high-speed videography and flow rate measurement at Zoo Atlanta, we discover the "Law of Urination," which states animals empty their bladders over nearly constant duration of average 21 seconds (standard deviation 13 seconds). This feat is made possible by larger animals having longer urethras, thus higher gravitational force and flow speed. Smaller mammals are challenged during urination due to high viscous and surface tension forces that limit their urine to single drops. Our findings reveal the urethra constitutes as a flow enhancing device, enabling the urinary system to be scaled up without compromising its function. This study may help in the diagnosis of urinary problems in animals and in inspiring the design of scalable hydrodynamic systems based on those in nature.

16 Oct 04:40

Holding a country to ransom

by gowers

Here is a quick thought about the mathematics of the US shutdown, not to be taken too seriously (the thought I mean — the shutdown obviously is to be taken seriously). It’s for the benefit of anyone who is puzzled that the Tea Party can have such a large influence, and more generally how a political system can be stable when almost nobody likes it. I’m going to prove that in a country of $n$ people, it is possible to devise a democratic system in which $n^\alpha$ of those people control the decisions, where $\alpha=\log 2/\log 3$ . For example, in a population of 100,000,000, all you need is a band of fanatics with about 112,000 people — or approximately 0.1% of the population. Although we do not have such a system and the distribution is unlikely, the systems and distributions we do have still allow a minority to have undue influence, and for similar reasons. What I’m about to describe is the extreme case.

The system I have in mind works as follows. It’s a multilevel representative democracy. Suppose for convenience that $n=3^k$ for some positive integer $k$ . (It is easy, but slightly tedious, to modify what I am about to write to take care of more general $n$ .) Suppose that the country is divided into three “super-constituencies”, each of which gets a vote in the top-level decision-making body (known as the triumvirate). Suppose that decisions in that body are passed by a majority vote. A group of people that wants to control the country can do so as long as it can control at least two votes in the triumvirate.

How are the members of the triumvirate chosen? They are elected by another triumvirate one level down. The representative in the top-level triumvirate is representing the views of the three people in the triumvirate one level down, and is worried about stepping out of line, since then he/she risks being deselected by the three people in the level-2 body.

So if a merry band of fanatics wants to control a representative in the top-level triumvirate, it is enough to control at least two of the representatives in the second-level triumvirate that selects the top-level representative.

Of course, we can iterate this argument. So how many people do we need to control the country? We need two at the top level, and therefore four at the second level, and so on. Therefore, we need $2^k$ at the bottom level. (Note that the representatives do not have to be fanatics themselves — if they don’t vote in the way that the fanatics want, then they get deselected by the people one level down, losing all those lovely perks that go with a high-level job in politics.) If $n=3^k$ , then $2^k=n^{\log 2/\log 3}$ , so we’re done.

One might want to make small adjustments to the bound to allow all the different levels of influence to be disjoint. So then $n=1+3+3^2+\dots+3^k$ . But this is within a constant of $3^k$ . Similarly, if we start with some $n$ that is not of that precise form, that again affects the estimate by just a constant factor.

So the conclusion is that in principle $Cn^{\log 2/\log 3}$ people can mess up a country with population $n$ . If you have more people than that, then the main thing you want is a system with a few levels of groups within groups — not necessarily formal at every level — and a distribution that is not too concentrated and not too diffuse. (If it is too concentrated, then you’ll end up wasting a lot of votes on controlling representatives who are already controlled, but if it is too diffuse, then you won’t control anybody except at very low levels. In the extreme case, what you want is to be arranged in what can be viewed as a discrete approximation to the Cantor set: in less extreme cases you still want to be somewhat “fractal” and “Cantor-like”.)

16 Oct 04:36

Is the NSA Blackmailing Officials Into Supporting Snooping?

by J.D. Tuccille

Restricted data Why are the likes of Sen. Dianne Feinstein so supportive of wide-reaching National Security Agency surveillance even as polls show a majority of Americans horrified by such intrusions? At the risk of venturing into paranoid territory, could it be that the NSA has gone all J. Edgar and compiled compromising information about officials who might otherwise be a bit less enthusiastic about snooping? That's what Jay Stanley, senior policy analyst for the American Civil Liberties Union wonders, and he has some evidence to support his theory.

Writes Stanley:

Sometimes when I hear public officials speaking out in defense of NSA spying, I can’t help thinking, even if just for a moment, “what if the NSA has something on that person and that’s why he or she is saying this?”

Of course it’s natural, when people disagree with you, to at least briefly think, “they couldn’t possibly really believe that, there must be some outside power forcing them to take that position.” Mostly I do not believe that anything like that is now going on.

But I cannot be 100% sure, and therein lies the problem. The breadth of the NSA’s newly revealed capabilities makes the emergence of such suspicions in our society inevitable. Especially given that we are far, far away from having the kinds of oversight mechanisms in place that would provide ironclad assurance that these vast powers won’t be abused. And that highlights the highly corrosive nature of allowing the NSA such powers. Everyone has dark suspicions about their political opponents from time to time, and Americans are highly distrustful of government in general. When there is any opening at all for members of the public to suspect that officials from the legislative and judicial branches could be vulnerable to leverage from secretive agencies within the executive branch—and when those officials can even suspect they might be subject to leverage—that is a serious problem for our democracy.

Stanley has more than speculation to go on. He points to an interview with former NSA analyst and whistleblower Russ Tice, who claims the Bush administration unleashed the NSA on Barack Obama back in 2004. He also told an interviewer just this summer that surveillance of high-ranking officials was a common procedure for his former employers.

From Washington's Blog:

Tice: Okay. They went after–and I know this because I had my hands literally on the paperwork for these sort of things–they went after high-ranking military officers; they went after members of Congress, both Senate and the House, especially on the intelligence committees and on the armed services committees and some of the–and judicial. But they went after other ones, too. They went after lawyers and law firms. All kinds of–heaps of lawyers and law firms. They went after judges. One of the judges is now sitting on the Supreme Court that I had his wiretap information in my hand. Two are former FISA court judges. They went after State Department officials. They went after people in the executive service that were part of the White House–their own people. They went after antiwar groups. They went after U.S. international–U.S. companies that that do international business, you know, business around the world. They went after U.S. banking firms and financial firms that do international business. They went after NGOs that–like the Red Cross, people like that that go overseas and do humanitarian work.

Tice explicitly says such scrutiny made subjects susceptible to blackmail. And while his experience is some years old, other whistleblowers suggest the practice continues.

William Binney, another former NSA officer, told frequent spy-documenter James Bamford that he and J. Kirk Wiebe approached the Obama administration about putting safeguards on data collection and were brushed off. “We are, like, that far from a turnkey totalitarian state,” Binney told Bamford.

That's a soothing thought on the day that the Washington Post reveals that the NSA harvests personal contact lists from around the world, building roadmaps of our connections and relationships that can be very revealing about our lives. Our contacts and connections are just the sort of information that can become part of an awkward dossier. The sort of awkward dossier that elicits cooperative behavior from officials who'd rather their lives be kept under wraps.

It's not like blackmail has never been used by government officials before. The FBI's J. Edgar Hoover was said to be quite the master of turning inconvenient secrets into cooperative behavior.

14 Oct 20:05

Dominic Steinitz: Backpropogation is Just Steepest Descent with Automatic Differentiation

Preface

The intended audience of this article is someone who knows something about Machine Learning and Artifical Neural Networks (ANNs) in particular and who recalls that fitting an ANN required a technique called backpropagation. The goal of this post is to refresh the reader’s knowledge of ANNs and backpropagation and to show that the latter is merely a specialised version of automatic differentiation, a tool that all Machine Learning practitioners should know about and have in their toolkit.

Introduction

The problem is simple to state: we have a (highly) non-linear function, the cost function of an Artificial Neural Network (ANN), and we wish to minimize this so as to estimate the parameters / weights of the function.

In order to minimise the function, one obvious approach is to use steepest descent: start with random values for the parameters to be estimated, find the direction in which the the function decreases most quickly, step a small amount in that direction and repeat until close enough.

But we have two problems:

We have an algorithm or a computer program that calculates the non-linear function rather than the function itself.
The function has a very large number of parameters, hundreds if not thousands.

One thing we could try is bumping each parameter by a small amount to get partial derivatives numerically

$\displaystyle \frac{\partial E(\ldots, w, \ldots)}{\partial w} \approx \frac{E(\ldots, w + \epsilon, \ldots) - E(\ldots, w, \ldots)}{\epsilon}$

But this would mean evaluating our function many times and moreover we could easily get numerical errors as a result of the vagaries of floating point arithmetic.

As an alternative we could turn our algorithm or computer program into a function more recognisable as a mathematical function and then compute the differential itself as a function either by hand or by using a symbolic differentiation package. For the complicated expression which is our mathematical function, the former would be error prone and the latter could easily generate something which would be even more complex and costly to evaluate than the original expression.

The standard approach is to use a technique called backpropagation and the understanding and application of this technique forms a large part of many machine learning lecture courses.

Since at least the 1960s techniques for automatically differentiating computer programs have been discovered and re-discovered. Anyone who knows about these techniques and reads about backpropagation quickly realises that backpropagation is just automatic differentiation and steepest descent.

This article is divided into

Refresher on neural networks and backpropagation;
Methods for differentiation;
Backward and forward automatic differentiation and
Concluding thoughts.

The only thing important to remember throughout is the chain rule

$\displaystyle (g \circ f)'(a) = g'(f(a))\cdot f'(a)$

in alternative notation

$\displaystyle \frac{\mathrm{d} (g \circ f)}{\mathrm{d} x}(a) = \frac{\mathrm{d} g}{\mathrm{d} y}(f(a)) \frac{\mathrm{d} f}{\mathrm{d} x}(a)$

where $y = f(x)$ . More suggestively we can write

$\displaystyle \frac{\mathrm{d} g}{\mathrm{d} x} = \frac{\mathrm{d} g}{\mathrm{d} y} \frac{\mathrm{d} y}{\mathrm{d} x}$

where it is understood that $\mathrm{d} g / \mathrm{d} x$ and $\mathrm{d} y / \mathrm{d} x$ are evaluated at $a$ and $\mathrm{d} g / \mathrm{d} y$ is evaluated at $f(a)$ .

For example,

$\displaystyle \frac{\mathrm{d}}{\mathrm{d} x} \sqrt{3 \sin(x)} = \frac{\mathrm{d}}{\mathrm{d} x} (3 \sin(x)) \cdot \frac{\mathrm{d}}{\mathrm{d} y} \sqrt{y} = 3 \cos(x) \cdot \frac{1}{2\sqrt{y}} = \frac{3\cos(x)}{2\sqrt{3\sin(x)}}$

Acknowledgements

Sadly I cannot recall all the sources I looked at in order to produce this article but I have made heavy use of the following.

Justin Domke’s blog
Conal Elliot’s website
Jerzy Karczmarczuk’s papers
The Community Portal for Automatic Differentiation
Alexey Radul’s blog

Neural Network Refresher

Here is our model, with $\boldsymbol{x}$ the input, $\hat{\boldsymbol{y}}$ the predicted output and $\boldsymbol{y}$ the actual output and $w^{(k)}$ the weights in the $k$ -th layer. We have concretised the transfer function as $\tanh$ but it is quite popular to use the $\text{logit}$ function.

$\displaystyle \begin{aligned} a_i^{(1)} &= \sum_{j=0}^{N^{(1)}} w_{ij}^{(1)} x_j \\ z_i^{(1)} &= \tanh(a_i^{(1)}) \\ a_i^{(2)} &= \sum_{j=0}^{N^{(2)}} w_{ij}^{(2)} z_j^{(1)} \\ \dots &= \ldots \\ a_i^{(L-1)} &= \sum_{j=0}^{N^{(L-1)}} w_{ij}^{(L-1)} z_j^{(L-2)} \\ z_j^{(L-1)} &= \tanh(a_j^{(L-1)}) \\ \hat{y}_i &= \sum_{j=0}^{N^{(L)}} w_{ij}^{(L)} z_j^{(L-1)} \\ \end{aligned}$

with the loss or cost function

$\displaystyle E(\boldsymbol{w}; \boldsymbol{x}, \boldsymbol{y}) = \frac{1}{2}\|(\hat{\boldsymbol{y}} - \boldsymbol{y})\|^2$

The diagram below depicts a neural network with a single hidden layer.

In order to apply the steepest descent algorithm we need to calculate the differentials of this latter function with respect to the weights, that is, we need to calculate

$\displaystyle \Delta w_{ij} = \frac{\partial E}{\partial w_{ij}}$

Applying the chain rule

$\displaystyle \Delta w_{ij} = \frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial a_i}\frac{\partial a_i}{\partial w_{ij}}$

Since

$\displaystyle a_j^{(l)} = \sum_{i=0}^N w_{ij}^{(l)}z_i^{(l-1)}$

we have

$\displaystyle \frac{\partial a_i^{(l)}}{\partial w_{ij}^{(l)}} = \frac{\sum_{k=0}^M w_{kj}^{(l)}z_k^{(l-1)}}{\partial w_{ij}^{(l)}} = z_i^{(l-1)}$

Defining

$\displaystyle \delta_j^{(l)} \equiv \frac{\partial E}{\partial a_j^{(l)}}$

we obtain

$\displaystyle \Delta w_{ij}^{(l)} = \frac{\partial E}{\partial w_{ij}^{(l)}} = \delta_j^{(l)} z_i^{(l-1)}$

Finding the $z_i$ for each layer is straightforward: we start with the inputs and propagate forward. In order to find the $\delta_j$ we need to start with the outputs a propagate backwards:

For the output layer we have (since $\hat{y}_j = a_j$ )

$\displaystyle \delta_j = \frac{\partial E}{\partial a_j} = \frac{\partial E}{\partial y_j} = \frac{\partial}{\partial y_j}\bigg(\frac{1}{2}\sum_{i=0}^M (\hat{y}_i - y_i)^2\bigg) = \hat{y}_j - y_j$

For a hidden layer using the chain rule

$\displaystyle \delta_j^{(l-1)} = \frac{\partial E}{\partial a_j^{(l-1)}} = \sum_k \frac{\partial E}{\partial a_k^{(l)}}\frac{\partial a_k^{(l)}}{\partial a_j^{(l-1)}}$

Now

$\displaystyle a_k^{(l)} = \sum_i w_{ki}^{(l)}z_i^{(l-1)} = \sum_i w_{ki}^{(l)} f(a_i^{(l-1)})$

so that

$\displaystyle \frac{\partial a_k^{(l)}}{\partial a_j^{(l-1)}} = \frac{\sum_i w_{ki}^{(l)} f(a_i^{(l-1)})}{\partial a_j^{(l-1)}} = w_{kj}^{(l)}\,f'(a_j^{(l-1)})$

and thus

$\displaystyle \delta_j^{(l-1)} = \sum_k \frac{\partial E}{\partial a_k^{(l)}}\frac{\partial a_k^{(l)}}{\partial a_j^{(l-1)}} = \sum_k \delta_k^{(l)} w_{kj}^{(l)}\, f'(a_j^{(l-1)}) = f'(a_j^{(l-1)}) \sum_k \delta_k^{(l)} w_{kj}^{(l)}$

Summarising

We calculate all $a_j$ and $z_j$ for each layer starting with the input layer and propagating forward.
We evaluate $\delta_j^{(L)}$ in the output layer using $\delta_j = \hat{y}_j - y_j$ .
We evaluate $\delta_j$ in each layer using $\delta_j^{(l-1)} = f'(a_j^{(l-1)})\sum_k \delta_k^{(l)} w_{kj}^{(l)}$ starting with the output layer and propagating backwards.
Use $\partial E / \partial w_{ij}^{(l)} = \delta_j^{(l)} z_i^{(l-1)}$ to obtain the required derivatives in each layer.

For the particular activation function $\tanh$ we have $f'(a) = \tanh' (a) = 1 - \tanh^2(a)$ . And finally we can use the partial derivatives to step in the right direction using steepest descent

$\displaystyle w' = w - \gamma\nabla E(w)$

where $\gamma$ is the step size aka the learning rate.

Differentiation

So now we have an efficient algorithm for differentiating the cost function for an ANN and thus estimating its parameters but it seems quite complex. In the introduction we alluded to other methods of differentiation. Let us examine those in a bit more detail before moving on to a general technique for differentiating programs of which backpropagation turns out to be a specialisation.

Numerical Differentiation

Consider the function $f(x) = e^x$ then its differential $f'(x) = e^x$ and we can easily compare a numerical approximation of this with the exact result. The numeric approximation is given by

$\displaystyle f'(x) \approx \frac{f(x + \epsilon) - f(x)}{\epsilon}$

In theory we should get a closer and closer approximation as epsilon decreases but as the chart below shows at some point (with $\epsilon \approx 2^{-26}$ ) the approximation worsens as a result of the fact that we are using floating point arithmetic. For a complex function such as one which calculates the cost function of an ANN, there is a risk that we may end up getting a poor approximation for the derivative and thus a poor estimate for the parameters of the model.

Symbolic Differentiation

Suppose we have the following program (written in Python)

import numpy as np

def many_sines(x):
    y = x
    for i in range(1,7):
        y = np.sin(x+y)
    return y

When we unroll the loop we are actually evaluating

$\displaystyle f(x) = \sin(x + \sin(x + \sin(x + \sin(x + \sin(x + \sin(x + x))))))$

Now suppose we want to get the differential of this function. Symbolically this would be

$\displaystyle \begin{aligned} f'(x) &= (((((2\cdot \cos(2x)+1)\cdot \\ &\phantom{=} \cos(\sin(2x)+x)+1)\cdot \\ &\phantom{=} \cos(\sin(\sin(2x)+x)+x)+1)\cdot \\ &\phantom{=} \cos(\sin(\sin(\sin(2x)+x)+x)+x)+1)\cdot \\ &\phantom{=} \cos(\sin(\sin(\sin(\sin(2x)+x)+x)+x)+x)+1)\cdot \\ &\phantom{=} \cos(\sin(\sin(\sin(\sin(\sin(2x)+x)+x)+x)+x)+x) \end{aligned}$

Typically the non-linear function that an ANN gives is much more complex than the simple function given above. Thus its derivative will correspondingly more complex and therefore expensive to compute. Moreover calculating this derivative by hand could easily introduce errors. And in order to have a computer perform the symbolic calculation we would have to encode our cost function somehow so that it is amenable to this form of manipulation.

Automatic Differentiation

Reverse Mode

Traditionally, forward mode is introduced first as this is considered easier to understand. We introduce reverse mode first as it can be seen to be a generalization of backpropagation.

Consider the function

$\displaystyle f(x) = \exp(\exp(x) + (\exp(x))^2) + \sin(\exp(x) + (\exp(x))^2)$

Let us write this a data flow graph.

We can thus re-write our function as a sequence of simpler functions in which each function only depends on variables earlier in the sequence.

$\displaystyle \begin{aligned} u_7 &= f_7(u_6, u_5, u_4, u_3, u_2, u_1) \\ u_6 &= f_6(u_5, u_4, u_3, u_2, u_1) \\ \ldots &= \ldots \\ u_1 &= f_1(u_1) \end{aligned}$

$\displaystyle \begin{aligned} \mathrm{d}u_7 &= \frac{\partial f_7}{\partial u_6} \mathrm{d} u_6 + \frac{\partial f_7}{\partial u_5} \mathrm{d} u_5 + \frac{\partial f_7}{\partial u_4} \mathrm{d} u_4 + \frac{\partial f_7}{\partial u_3} \mathrm{d} u_3 + \frac{\partial f_7}{\partial u_2} \mathrm{d} u_2 + \frac{\partial f_7}{\partial u_1} \mathrm{d} u_1 \\ \mathrm{d}u_6 &= \frac{\partial f_6}{\partial u_5} \mathrm{d} u_5 + \frac{\partial f_6}{\partial u_4} \mathrm{d} u_4 + \frac{\partial f_6}{\partial u_3} \mathrm{d} u_3 + \frac{\partial f_6}{\partial u_2} \mathrm{d} u_2 + \frac{\partial f_6}{\partial u_1} \mathrm{d} u_1 \\ \ldots &= \ldots \\ \mathrm{d}u_1 &= \frac{\partial f_1}{\partial u_1} \mathrm{d} u_1 \end{aligned}$

In our particular example, since $u_1, \dots, u_5$ do not depend on $u_6$

$\displaystyle \begin{aligned} \frac{\mathrm{d}u_7}{\mathrm{d}u_6} &= 1 \end{aligned}$

Further $u_6$ does not depend on $u_5$ so we also have

$\displaystyle \begin{aligned} \frac{\mathrm{d}u_7}{\mathrm{d}u_5} &= 1 \\ \end{aligned}$

Now things become more interesting as $u_6$ and $u_5$ both depend on $u_4$ and so the chain rule makes an explicit appearance

$\displaystyle \begin{aligned} \frac{\mathrm{d}u_7}{\mathrm{d}u_4} &= \frac{\mathrm{d}u_7}{\mathrm{d}u_6}\frac{\mathrm{d}u_6}{\mathrm{d}u_4} + \frac{\mathrm{d}u_7}{\mathrm{d}u_5}\frac{\mathrm{d}u_5}{\mathrm{d}u_4} \\ &= \frac{\mathrm{d}u_7}{\mathrm{d}u_6}\exp{u_4} + \frac{\mathrm{d}u_7}{\mathrm{d}u_5}\cos{u_5} \end{aligned}$

Carrying on

$\displaystyle \begin{aligned} \frac{\mathrm{d}u_7}{\mathrm{d}u_3} &= \frac{\mathrm{d}u_7}{\mathrm{d}u_4}\frac{\mathrm{d}u_4}{\mathrm{d}u_3} \\ &= \frac{\mathrm{d}u_7}{\mathrm{d}u_4} \\ \frac{\mathrm{d}u_7}{\mathrm{d}u_2} &= \frac{\mathrm{d}u_7}{\mathrm{d}u_4}\frac{\mathrm{d}u_4}{\mathrm{d}u_2} + \frac{\mathrm{d}u_7}{\mathrm{d}u_3}\frac{\mathrm{d}u_3}{\mathrm{d}u_2} \\ &= \frac{\mathrm{d}u_7}{\mathrm{d}u_4} + 2u_2\frac{\mathrm{d}u_7}{\mathrm{d}u_4} \\ \frac{\mathrm{d}u_7}{\mathrm{d}u_1} &= \frac{\mathrm{d}u_7}{\mathrm{d}u_2}\frac{\mathrm{d}u_2}{\mathrm{d}u_1} \\ &=\frac{\mathrm{d}u_7}{\mathrm{d}u_2}\exp{u_2} \end{aligned}$

Note that having worked from top to bottom (the forward sweep) in the graph to calculate the function itself, we have to work backwards from bottom to top (the backward sweep) to calculate the derivative.

So provided we can translate our program into a call graph, we can apply this procedure to calculate the differential with the same complexity as the original program.

The pictorial representation of an ANN is effectively the data flow graph of the cost function (without the final cost calculation itself) and its differential can be calculated as just being identical to backpropagation.

Forward Mode

An alternative method for automatic differentiation is called forward mode and has a simple implementation. Let us illustrate this using Haskell 98. The actual implementation is about 20 lines of code.

First some boilerplate declarations that need not concern us further.

> {-# LANGUAGE NoMonomorphismRestriction #-}
> 
> module AD (
>     Dual(..)
>   , f
>   , idD
>   , cost
>   , zs
>   ) where
> 
> default ()

Let us define dual numbers

> data Dual = Dual Double Double
>   deriving (Eq, Show)

We can think of these pairs as first order polynomials in the indeterminate $\epsilon$ , $x + \epsilon x'$ such that $\epsilon^2 = 0$

Thus, for example, we have

$\displaystyle \begin{aligned} (x + \epsilon x') + (y + \epsilon y') &= ((x + y) + \epsilon (x' + y')) \\ (x + \epsilon x')(y + \epsilon y') &= xy + \epsilon (xy' + x'y) \\ \log (x + \epsilon x') &= \log x (1 + \epsilon \frac {x'}{x}) = \log x + \epsilon\frac{x'}{x} \\ \sqrt{(x + \epsilon x')} &= \sqrt{x(1 + \epsilon\frac{x'}{x})} = \sqrt{x}(1 + \epsilon\frac{1}{2}\frac{x'}{x}) = \sqrt{x} + \epsilon\frac{1}{2}\frac{x'}{\sqrt{x}} \\ \ldots &= \ldots \end{aligned}$

Notice that these equations implicitly encode the chain rule. For example, we know, using the chain rule, that

$\displaystyle \frac{\mathrm{d}}{\mathrm{d} x}\log(\sqrt x) = \frac{1}{\sqrt x}\frac{1}{2}x^{-1/2} = \frac{1}{2x}$

And using the example equations above we have

$\displaystyle \begin{aligned} \log(\sqrt {x + \epsilon x'}) &= \log (\sqrt{x} + \epsilon\frac{1}{2}\frac{x'}{\sqrt{x}}) \\ &= \log (\sqrt{x}) + \epsilon\frac{\frac{1}{2}\frac{x'}{\sqrt{x}}}{\sqrt{x}} \\ &= \log (\sqrt{x}) + \epsilon x'\frac{1}{2x} \end{aligned}$

Notice that dual numbers carry around the calculation and the derivative of the calculation. To actually evaluate $\log(\sqrt{x})$ at a particular value, say 2, we plug in 2 for $x$ and 1 for $x'$

$\displaystyle \log (\sqrt(2 + \epsilon 1) = \log(\sqrt{2}) + \epsilon\frac{1}{4}$

Thus the derivative of $\log(\sqrt{x})$ at 2 is $1/4$ .

With a couple of helper functions we can implement this rule ( $\epsilon^2 = 0$ ) by making Dual an instance of Num, Fractional and Floating.

> constD :: Double -> Dual
> constD x = Dual x 0
> 
> idD :: Double -> Dual
> idD x = Dual x 1.0

Let us implement the rules above by declaring Dual to be an instance of Num. A Haskell class such as Num simply states that it is possible to perform a (usually) related collection of operations on any type which is declared as an instance of that class. For example, Integer and Double are both types which are instances on Num and thus one can add, multiply, etc. values of these types (but note one cannot add an Integer to a Double without first converting a value of the former to a value of the latter).

As an aside, we will never need the functions signum and abs and declare them as undefined; in a robust implementation we would specify an error if they were ever accidentally used.

> instance Num Dual where
>   fromInteger n             = constD $ fromInteger n
>   (Dual x x') + (Dual y y') = Dual (x + y) (x' + y')
>   (Dual x x') * (Dual y y') = Dual (x * y) (x * y' + y * x')
>   negate (Dual x x')        = Dual (negate x) (negate x')
>   signum _                  = undefined
>   abs _                     = undefined

We need to be able to perform division on Dual so we further declare it to be an instance of Fractional.

> instance Fractional Dual where
>   fromRational p = constD $ fromRational p
>   recip (Dual x x') = Dual (1.0 / x) (-x' / (x * x))

We want to be able to perform the same operations on Dual as we can on Float and Double. Thus we make Dual an instance of Floating which means we can now operate on values of this type as though, in some sense, they are the same as values of Float or Double (in Haskell 98 only instances for Float and Double are defined for the class Floating).

> instance Floating Dual where
>   pi = constD pi
>   exp   (Dual x x') = Dual (exp x)   (x' * exp x)
>   log   (Dual x x') = Dual (log x)   (x' / x)
>   sqrt  (Dual x x') = Dual (sqrt x)  (x' / (2 * sqrt x))
>   sin   (Dual x x') = Dual (sin x)   (x' * cos x)
>   cos   (Dual x x') = Dual (cos x)   (x' * (- sin x))
>   sinh  (Dual x x') = Dual (sinh x)  (x' * cosh x)
>   cosh  (Dual x x') = Dual (cosh x)  (x' * sinh x)
>   asin  (Dual x x') = Dual (asin x)  (x' / sqrt (1 - x*x))
>   acos  (Dual x x') = Dual (acos x)  (x' / (-sqrt (1 - x*x)))
>   atan  (Dual x x') = Dual (atan x)  (x' / (1 + x*x))
>   asinh (Dual x x') = Dual (asinh x) (x' / sqrt (1 + x*x))
>   acosh (Dual x x') = Dual (acosh x) (x' / (sqrt (x*x - 1)))
>   atanh (Dual x x') = Dual (atanh x) (x' / (1 - x*x))

That’s all we need to do. Let us implement the function we considered earlier.

> f =  sqrt . (* 3) . sin

The compiler can infer its type

ghci> :t f
  f :: Floating c => c -> c

We know the derivative of the function and can also implement it directly in Haskell.

> f' x = 3 * cos x / (2 * sqrt (3 * sin x))

Now we can evaluate the function along with its automatically calculated derivative and compare that with the derivative we calculated symbolically by hand.

ghci> f $ idD 2
  Dual 1.6516332160855343 (-0.3779412091869595)

ghci> f' 2
  -0.3779412091869595

To see that we are not doing symbolic differentiation (it’s easy to see we are not doing numerical differentiation) let us step through the actual evaluation.

$\displaystyle \begin{aligned} f\,\$\,\mathrm{idD}\,2 &\longrightarrow \mathrm{sqrt} \cdot \lambda x \rightarrow 3x \cdot \sin \$\,\mathrm{idD}\,2 \\ &\longrightarrow \mathrm{sqrt} \cdot \lambda x \rightarrow 3x \cdot \sin \$\,\mathrm{Dual}\,2\,1 \\ &\longrightarrow \mathrm{sqrt} \cdot \lambda x \rightarrow 3x\,\$\, \mathrm{Dual}\,\sin(2)\,\cos(2) \\ &\longrightarrow \mathrm{sqrt} \,\$\, \mathrm{Dual}\,3\,0 \times \mathrm{Dual}\,\sin(2)\,\cos(2) \\ &\longrightarrow \mathrm{sqrt} \,\$\, \mathrm{Dual}\,(3\sin(2))\, (3\cos(2) + 0\sin(2)) \\ &\longrightarrow \mathrm{sqrt} \,\$\, \mathrm{Dual}\,(3\sin(2))\, (3\cos(2)) \\ &\longrightarrow \mathrm{Dual}\,(\mathrm{sqrt} (3\sin(2)))\, (3\cos(2)) / (2\,\mathrm{sqrt}(3\sin(2))) \\ &\longrightarrow 1.6516332160855343 -0.3779412091869595 \end{aligned}$

A Simple Application

In order not to make this blog post too long let us apply AD to finding parameters for a simple regression. The application to ANNs is described in a previous blog post. Note that in a real application we would use the the Haskell AD and furthermore use reverse AD as in this case it would be more efficient.

First our cost function

$\displaystyle L(\boldsymbol{x}, \boldsymbol{y}, m, c) = \frac{1}{2n}\sum_{i=1}^n (y - (mx + c))^2$

> cost m c xs ys = (/ (2 * (fromIntegral $ length xs))) $
>                  sum $
>                  zipWith errSq xs ys
>   where
>     errSq x y = z * z
>       where
>         z = y - (m * x + c)

ghci> :t cost
  cost :: Fractional a => a -> a -> [a] -> [a] -> a

Some test data

> xs = [1,2,3,4,5,6,7,8,9,10]
> ys = [3,5,7,9,11,13,15,17,19,21]

and a learning rate

> gamma = 0.04

Now we create a function of the two parameters in our model by applying the cost function to the data. We need the (partial) derivatives of both the slope and the offset.

> g m c = cost m c xs ys

Now we can take use our Dual numbers to calculate the required partial derivatives and update our estimates of the parameter. We create a stream of estimates.

> zs = (0.1, 0.1) : map f zs
>   where
> 
>     deriv (Dual _ x') = x'
> 
>     f (c, m) = (c - gamma * cDeriv, m - gamma * mDeriv)
>       where
>         cDeriv = deriv $ g (constD m) $ idD c
>         mDeriv = deriv $ flip g (constD c) $ idD m

And we can calculate the cost of each estimate to check our algorithm converges and then take the the estimated parameters when the change in cost per iteration has reached an acceptable level.

ghci> take 2 $ drop 1000 $ map (\(c, m) -> cost m c xs ys) zs
  [1.9088215184565296e-9,1.876891490619424e-9]

ghci> take 2 $ drop 1000 zs
  [(0.9998665320141327,2.0000191714150106),(0.999867653022265,2.0000190103927853)]

Concluding Thoughts

Efficiency

Perhaps AD is underused because of efficiency?

It seems that the Financial Services industry is aware that AD is more efficient than current practice albeit the technique is only slowly permeating. Order of magnitude improvements have been reported.

Perhaps AD is slowly permeating into Machine Learning as well but there seem to be no easy to find benchmarks.

Automatic Differentiation Tools

If it were only possible to implement automatic differentiation in Haskell then its applicability would be somewhat limited. Fortunately this is not the case and it can be used in many languages.

In general, there are three different approaches:

Operator overloading: available for Haskell and C++. See the Haskell ad package and the C++ FADBAD approach using templates.
Source to source translators: available for Fortran, C and other languages e.g., ADIFOR, TAPENADE and see the wikipedia entry for a more comprehensive list.
New languages with built-in AD primitives. I have not listed any as it seems unlikely that anyone practicing Machine Learning would want to transfer their existing code to a research language. Maybe AD researchers could invest time in understanding what language feature improvements are needed to support AD natively in existing languages.

05 Oct 11:59

Simulation of the dynamics of many-body quantum spin systems using phase-space techniques

by Ray Ng, Erik S. Sørensen, and Piotr Deuar

Author(s): Ray Ng, Erik S. Sørensen, and Piotr Deuar

We reformulate the full quantum dynamics of spin systems using a phase-space representation based on SU(2) coherent states which generates an exact mapping of the dynamics of any spin system onto a set of stochastic differential equations. This representation is superior in practice to an earlier ph...

[Phys. Rev. B 88, 144304] Published Fri Oct 04, 2013

02 Oct 16:08

Hierarchical topology in olfactory cortex [Neuroscience]

by McGinley, M. J., Westbrook, G. L.

Nosimpler
Doubtful. They apply a bunch of models and then claim a hierarchical model works best. Not only that, but they use slices as comparison. Did they slice their model too?

Topological motifs in synaptic connectivity—such as the cortical column—are fundamental to processing of information in cortical structures. However, the mesoscale topology of cortical networks beyond columns remains largely unknown. In the olfactory cortex, which lacks an obvious columnar structure, sensory-evoked patterns of activity have failed to reveal organizational principles of...

01 Oct 23:27

Universality in network dynamics

by Baruch Barzel

Nature Physics 9, 673 (2013). doi:10.1038/nphys2741

Authors: Baruch Barzel & Albert-László Barabási

01 Oct 19:43

When does a physical system compute?. (arXiv:1309.7979v3 [cs.ET] UPDATED)

by Clare Horsman, Susan Stepney, Rob C. Wagner, Viv Kendon

Computing is a high-level process of a physical system. Recent interest in non-standard computing systems, including quantum and biological computers, has brought this physical basis of computing to the forefront. There has been, however, no consensus on how to tell if a given physical system is acting as a computer or not; leading to confusion over novel computational devices, and even claims that every physical event is a computation. In this paper we introduce a formal framework that can be used to determine whether or not a physical system is performing a computation. We demonstrate how the abstract computational level interacts with the physical device level, drawing the comparison with the use of mathematical models to represent physical objects in experimental science. This powerful formulation allows a precise description of the similarities between experiments, computation, simulation, and technology, leading to our central conclusion: physical computing is the use of a physical system to predict the outcome of an abstract evolution. We give conditions that must be satisfied in order for computation to be occurring, and illustrate these with a range of non-standard computing scenarios. The framework also covers broader computing contexts, where there is no obvious human computer user. We define the critical notion of a 'computational entity', and show the role this plays in defining when computing is taking place in physical systems.

01 Oct 16:01

Testing Conspiracy Theories

by Sabine Hossenfelder

I'm about to fly to Vienna where I'll be attending a conference on Emergent Quantum Mechanics. I'm not entirely sure why I was invited to this event, but I suspect it's got something to do with me being one of the three people on the planet who like superdeterministic hidden variables theories, more commonly known as "conspiracy theories".

Leaving aside some loopholes that are about to be closed, tests of Bell's theorem rule out local hidden variables theories. But any theorem is only as good as the assumptions that go into it, and one of these assumptions is that the experimenter can freely chose the detector settings. As you know, I don't believe in free will, so I have an issue with this. You can see though why theories in which this assumption does not hold are known as "conspiracy theories". While they are not strictly speaking ruled out, it seems that the universe must be deliberately mean to prevent the experimentalists from doing what they want, and this option is thus often not taken seriously.

But really, this is a very misleading interpretation of superdeterminism. All that superdeterminism means is that a state cannot be prepared independently of the detector settings. That's non-local of course, but it's non-local in a soft way, in the sense that it's a correlation but doesn't necessarily imply a 'spooky' action at a distance because the backwards lightcones of the detector and state (in a reasonable universe) intersect anyway.

That having been said, you might like or not like superdeterministic hidden variables theories, the real question is if there is some way to test if that's how nature works, because one can't use Bell's theorem here. After some failed attempts, I finally came up with a possible test that is almost model-independent, and it was published in my paper "Testing super-deterministic hidden variables theories".

I actually wrote this paper in the hospital when I was pregnant. The nurses kept asking me if I'm writing a book. They were quite disappointed to be drowned in elaborations on the foundations of quantum mechanics rather than hearing a vampire story. In any case, in the expectation that the readers on this blog are somewhat more sympathetic to the question whether the universe is fundamentally deterministic or not, here a brief summary of the idea.

The central difference between standard quantum mechanics and superdeterministic hidden variables theories is that in the former case two identically prepared states can give two different measurement outcomes, while in the latter case that's not possible. Unfortunately, "identically prepared" includes the hidden variables and it's difficult to identically prepare something that you can't measure. That is after all the reason why it looks indeterministic.

However, rather than trying to prepare identical states we can try to make repeated measurements on the same state. For that, take two non-commuting variables (for example the spin or polarization in two different directions) and measure them alternately. In standard quantum mechanics the measurement outcomes will be non-correlated. In a superdeterministric hidden variables theory, they'll be correlated - provided you can make a case that the hidden variables don't change in between the measurements. The figure below shows an example for an experimental setup.

A particle (electron/photon) is bounced back and forth between
two mirrors (grey bars). The blue and red bars indicate measurements
of two non-commuting variables, only one eigenvalue passes, the
other leaves the system. The quantity to measure is the average time
it takes until the particle leaves. In a superdeterministic theory,
it can be significantly longer than in standard quantum mechanics.

The provision that the hidden variables don't change is the reason why the test is only 'almost' model independent, because I made the assumptions that the hidden variables are due to the environment (the experimental setup) down to the relevant scales of the interactions taking place. This means basically if you make the system small and cool and measure quickly enough you have a chance to see the correlation between subsequent measurements. I made some estimates (see paper) and it seems possible with today's technology to make this test.

Interestingly, after I had finished a draft of the paper, Chris Fuchs sent me a reference to a 1970 article by Eugene Wigner where, in a footnote, Wigner mentions Von Neumann discussing exactly this type of experiment:

“Von Neumann often discussed the measurement of the spin component of a spin-1/2 particle in various directions. Clearly, the possibilities for the two possible outcomes of a single such measurement can be easily accounted for by hidden variables [...] However, Von Neumann felt that this is not the case for many consecutive measurements of the spin component in various different directions. The outcome of the ﬁrst such measurement restricts the range of values which the hidden parameters must have had before that ﬁrst measurement was undertaken. The restriction will be present also after the measurement so that the probability distribution of the hidden variables characterizing the spin will be different for particles for which the measurement gave a positive result from that of the particles for which the measurement gave a negative result. The range of the hidden variables will be further restricted in the particles for which a second measurement of the spin component, in a different direction, also gave a positive result...”

Apparently there was a longer discussion with Schrödinger following this proposal, which could be summarized with saying that the experiment cannot test generic superdeterminism, but only certain types as I already said above. If you think about it for a moment, you can never rule out generic superdeterminism anyway, so why even bother.

I'm quite looking forward to this conference, to begin with because Vienna is a beautiful city and I haven't been there for a while, but also because I'm hoping to meet some experimentalists who can tell me if I'm nuts :p

Update: Slides of my talk are here.

Nosimpler likes this

25 Sep 01:59

The Definition of Graph is Classified

by leinster

$MathML-enabled post (click for more details).$

Amid the (to me, highly disturbing) news that the state apparatuses of the US and UK have been secretly and systematically keeping records of our emails, our web browsing, our phone calls, our letters, our financial transactions and our physical location, here’s one half-smile’s-worth of light relief: the definition of graph is classified as “Top Secret”. (Update: or maybe not. See Tod’s comment below.)

(Click to see the news article this comes from, and to expand the tiny writing along the top and bottom which specifies the precise degree of top-secretness.)

This NSA slide uses the category theorists’ word object for a vertex of a graph. I’m relieved they don’t call the edges morphisms.

$MathML-enabled post (click for more details).$

This is a deadly serious subject, but this blog isn’t the place for discussing it, and I’m not up for moderating that kind of discussion. So please keep comments light ‘n’ fluffy.

25 Sep 01:55

See if you could pass this "literacy" test

by Minnesotastan

In the 1960s literacy tests were used in some states in the United States to suppress voting. The Civil Rights Movement Veterans website has collected a number of these.

In addition to completing the application and swearing the oaths, you had to pass the actual "Literacy Test" itself. Because the Freedom Movement was running "Citizenship Schools" to help people learn how to fill out the forms and pass the test, Alabama changed the test 4 times in less than two years (1964-1965). At the time of the Selma Voting Rights campaign there were actually 100 different tests in use across the state...

Most of the tests collected here are a battery of trivia questions related to civic procedure and citizenship. (Two from the Alabama test: “Name the attorney general of the United States” and “Can you be imprisoned, under Alabama law, for a debt?”).

But this Louisiana “literacy” test, singular among its fellows, has nothing to do with citizenship. Designed to put the applicant through mental contortions, the test's questions are often confusingly worded. If some of them seem unanswerable, that effect was intentional. The (white) registrar would be the ultimate judge of whether an answer was correct.

Try this one: “Write every other word in this first line and print every third word in same line (original type smaller and first line ended at comma) but capitalize the fifth word that you write.”

Done with page one (above)? Here are pages 2 and 3:

Oh, BTW...

The test was to be taken in 10 minutes flat, and a single wrong answer meant a failing grade.

Did you fail? You can't vote.

Via Slate and BoingBoing.

Ionut likes this

19 Sep 14:22

How addictive is morphine?

by Minnesotastan

Research previously conducted on rats did not adequately control for the environment in which the rats were studied:

Alexander's hypothesis was that drugs do not cause addiction, and that the apparent addiction to opiate drugs commonly observed in laboratory rats exposed to it is attributable to their living conditions, and not to any addictive property of the drug itself. He told the Canadian Senate in 2001 that prior experiments in which laboratory rats were kept isolated in cramped metal cages, tethered to a self-injection apparatus, show only that "severely distressed animals, like severely distressed people, will relieve their distress pharmacologically if they can."

To test his hypothesis, Alexander built Rat Park, an 8.8 m2 (95 sq ft) housing colony, 200 times the square footage of a standard laboratory cage. There were 16–20 rats of both sexes in residence, an abundance of food, balls and wheels for play, and enough space for mating and raising litters. The results of the experiment appeared to support his hypothesis. Rats who had been forced to consume morphine hydrochloride for 57 consecutive days were brought to Rat Park and given a choice between plain tap water and water laced with morphine. For the most part, they chose the plain water. "Nothing that we tried," Alexander wrote, "... produced anything that looked like addiction in rats that were housed in a reasonably normal environment." Control groups of rats isolated in small cages consumed much more morphine in this and several subsequent experiments.

Via garry's subposthaven. There is a relevant discussion at Reddit.

Addendum: A hat tip to Fletcher in Portugal for noting that the research described above has been summarized in a well-illustrated 40-page cartoon.

Ionut, Shermie.au and one other like this

17 Sep 18:54

Computing matrix inversion with optical networks. (arXiv:1309.3820v1 [physics.optics])

by Kan Wu, Cesare Soci, Perry Ping Shum, Nikolay I. Zheludev

With this paper we bring about a discussion on the computing potential of complex optical networks and provide experimental demonstration that an optical fiber network can be used as an analog processor to calculate matrix inversion. A 3x3 matrix is inverted as a proof-of-concept demonstration using a fiber network containing three nodes and operating at telecomm wavelength. For an NxN matrix, the overall solving time (including setting time of the matrix elements and calculation time of inversion) scales as O(N^2), whereas matrix inversion by most advanced computer algorithms requires ~O(N^2.37) computational time. For well-conditioned matrices, the error of the inversion performed optically is found to be less than 3%, limited by the accuracy of measurement equipment.

Nosimpler

Shared posts

The Story

The Lost Story

The Lost Flaw

The Book

How Many Unprovable Statements Are There?

Open Problems

Preface

Introduction

Acknowledgements

Neural Network Refresher

Differentiation

Numerical Differentiation

Symbolic Differentiation

Automatic Differentiation

Reverse Mode

Forward Mode

A Simple Application

Concluding Thoughts

Efficiency

Automatic Differentiation Tools