Nosimpler
Shared posts
(G)Libertarianism
But leaving behind the degree of fullofshitness, there is a liberal view of abuse of state power which self-described libertarians often mention but rarely get that passionate about. Much better to fret about high taxes or occupational licensing. Unwarranted mass incarceration, unaccountable police brutality, authorizing the state to kill its own citizens, absurd civil forfeiture procedures... these are all clear abuses of state power! Much of the rest of it is just a debate over what should be appropriate policies, not whether they're abuses (though glibertarians tend to call anything they don't like an abuse of state power).
There are people such as Radley Balko who take this stuff on! Good for them! Liberals would like libertarians more if they spent more time on the militarization of the police and the approved abuse of (especially) minority populations rather than, say, seatbelt laws and top marginal tax rates. Because top marginal tax rates aren't actually a libertarian issue, just a conservative one.
WIC'd
The 24 ounce cans of beans weren't eligible. She had to get the
The supermarket branded jumbo eggs weren't eligible. She had to get the large eggs.
Lactaid milk wasn't eligible. Other milk would be.
The corn tortillas weren't eligible. The flour ones would be.
There were a couple of other things that I forget, but you get the idea. When the ordeal started, I thought she was just using SNAP benefits which don't allow everything but do allow "most" things. If I had realized more quickly, I would have (and should have) just offered to pay. She was there with a young boy, too.
Paternalistic nutrition aid programs are ok in theory, but this was just ritual humiliation for what I think was $20 worth of vouchers. Also, too, accounting and other administrative costs.
Just give people some damn money.
Isaac Newton's sinister heraldry. (arXiv:1310.7494v2 [physics.hist-ph] UPDATED)
After Isaac Newton was knighted by Queen Anne in 1705 he adopted an unusual coat of arms: a pair of human tibiae crossed on a black background, like a pirate flag without the skull. After some general reflections on Newton's monumental scientific achievements and on his enigmatic life, we investigate the story of his coat of arms. We also discuss how its simple design illustrates the concept of chirality, which would later play an important role in the philosophical arguments about Newton's conception of space, as well as in the development of modern chemistry and particle physics.
Mapping the optimal route between two quantum states
Nature 511, 7511 (2014). doi:10.1038/nature13559
Authors: S. J. Weber, A. Chantasri, J. Dressel, A. N. Jordan, K. W. Murch & I. Siddiqi
A central feature of quantum mechanics is that a measurement result is intrinsically probabilistic. Consequently, continuously monitoring a quantum system will randomly perturb its natural unitary evolution. The ability to control a quantum system in the presence of these fluctuations is of increasing importance in quantum information processing and finds application in fields ranging from nuclear magnetic resonance to chemical synthesis. A detailed understanding of this stochastic evolution is essential for the development of optimized control methods. Here we reconstruct the individual quantum trajectories of a superconducting circuit that evolves under the competing influences of continuous weak measurement and Rabi drive. By tracking individual trajectories that evolve between any chosen initial and final states, we can deduce the most probable path through quantum state space. These pre- and post-selected quantum trajectories also reveal the optimal detector signal in the form of a smooth, time-continuous function that connects the desired boundary conditions. Our investigation reveals the rich interplay between measurement dynamics, typically associated with wavefunction collapse, and unitary evolution of the quantum state as described by the Schrödinger equation. These results and the underlying theory, based on a principle of least action, reveal the optimal route from initial to final states, and may inform new quantum control methods for state steering and information processing.
Can you touch your nose?
[Image: cartoon] Best source I could find for this image: IFLS.
My first reaction was of course: It’s nonsense – a superficial play on the words “you” and “touch”. “You touch” whatever triggers the nerves in your skin. There, look, I’ve solved a thousand-year-old problem in a matter of 3 seconds.
Then it occurred to me that with this notion of “touch” my shoes never touch the ground. Maybe I’m not a genius after all. Let me get back to that cartoon then. Certainly deep thoughts went into it that I must unravel.
The average size of an atom is an Angstrom, 10⁻¹⁰ m. The typical interatomic distance in molecules is a nanometer, 10⁻⁹ m, or let that be a few nanometers if you wish. At room temperature and normal atmospheric pressure, electrostatic repulsion prevents you from pushing atoms any closer together. So the 10⁻⁸ m in the cartoon seems about correct.
But it’s not so simple... To begin with, it isn’t just electrostatic repulsion that prevents atoms from getting close, it is more importantly the Pauli exclusion principle which forces the electrons and quarks that make up the atom to arrange in shells rather than to sit on top of each other.
If you could turn off the Pauli exclusion principle, all electrons from the higher shells would drop into the ground state, releasing energy. The same would happen with the quarks in the nucleus which arrange in similar levels. Since nuclear energy scales are higher than atomic scales by several orders of magnitude, the nuclear collapse causes the bulk of the emitted energy. How much is it?
The typical nuclear level splitting is some 100 keV, that is a few 10⁻¹⁴ Joule. Most of the Earth is made up of silicon, iron and oxygen, i.e. atomic numbers of the order of 15 or so on average. This gives about 10⁻¹² Joule per atom, that is 10¹¹ Joule per mol, or 1 kiloton TNT per kg.
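The chain of unit conversions in this estimate can be checked with a few lines of Python. This is only a rough sketch: the choice of about 30 nucleons per atom and a molar mass of 30 g/mol are my assumptions (roughly twice the atomic number of 15), and letting each nucleon release one level splitting is a crude simplification. All variable names are made up for illustration.

```python
# Order-of-magnitude check of the "Pauli switched off" energy estimate.
EV_TO_JOULE = 1.602176634e-19      # joules per electronvolt
AVOGADRO = 6.02214076e23           # atoms per mol
KILOTON_TNT_JOULE = 4.184e12       # joules per kiloton of TNT

level_splitting_ev = 100e3         # "some 100 keV" nuclear level splitting
nucleons_per_atom = 30             # assumption: about 2 x atomic number 15
molar_mass_kg = 0.030              # assumption: about 30 g/mol

energy_per_level = level_splitting_ev * EV_TO_JOULE      # J per level
energy_per_atom = energy_per_level * nucleons_per_atom   # J per atom
energy_per_mol = energy_per_atom * AVOGADRO              # J per mol
energy_per_kg = energy_per_mol / molar_mass_kg           # J per kg
kilotons_per_kg = energy_per_kg / KILOTON_TNT_JOULE      # kt TNT per kg

print(f"{energy_per_level:.1e} J per level")
print(f"{energy_per_atom:.1e} J per atom")
print(f"{energy_per_mol:.1e} J per mol")
print(f"{kilotons_per_kg:.1f} kt TNT per kg")
```

With these inputs every step lands within a small factor of the rounded numbers quoted in the text, which is all an estimate of this kind claims.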
This back-of-the-envelope estimate gives pretty much exactly the maximal yield of a nuclear weapon. The difference, though, is that turning off the Pauli exclusion principle would convert every kg of Earthly matter into a nuclear bomb. Since our home planet has a relatively small gravitational pull, I guess it would just blast apart. I saw everybody die, again, see that’s how it happens. But I digress; let me get back to the question of touch.
So it’s not just electrostatics but also the Pauli exclusion principle that prevents you from falling through the cracks. Not only do the electrons in your shoes not want to touch the ground, the electrons in your shoes don’t want to touch the other electrons in your shoes either. Electrons, or fermions generally, just don’t like each other.
The 10⁻⁸ m actually seems quite optimistic because surfaces are not perfectly even, they have a roughness to them, which means that the average distance between two solids is typically much larger than the interatomic spacing that one has in crystals. Moreover, the human body is not a solid and the skin is normally covered by a thin layer of fluids. So you never touch anything just because you’re separated by a layer of grease from the world.
To be fair, grease isn’t why the Greeks were scratching their heads back then, but a guy called Zeno. Zeno’s most famous paradox divides a distance into halves indefinitely to then conclude that because it consists of an infinite number of steps, the full distance can never be crossed. You cannot, thus, touch your nose, spoke Zeno, or ram an arrow into it respectively. The paradox was resolved once it was established that infinite series can converge to finite values; the nose was in the business again, but Zeno would come back to haunt the thinkers of the day centuries later.
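The convergence that resolves the paradox is easy to watch numerically. Below is a minimal sketch, assuming the distance to the nose is normalized to 1 and each step covers half of what remains; the function name is just an illustrative choice.

```python
def zeno_partial_sums(steps):
    """Return the running totals 1/2, 1/2 + 1/4, ... after each step."""
    total = 0.0
    sums = []
    for k in range(1, steps + 1):
        total += 0.5 ** k   # the k-th step covers 1/2**k of the distance
        sums.append(total)
    return sums

for n, s in enumerate(zeno_partial_sums(10), start=1):
    print(n, s)   # the gap to 1 halves at every step
```

After n steps the covered distance is 1 - 1/2ⁿ, so infinitely many steps add up to a finite distance of exactly 1.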
The issue reappeared with the advance of the mathematical field of topology in the 19th century. Back then, math, physics, and philosophy had not yet split apart, and the bright minds of the times, Descartes, Euler, Bolzano and the like, wanted to know, using their new methods, what does it mean for any two objects to touch? And their objects were as abstract as it gets. Any object was supposed to occupy space and cover a topological set in that space. So far so good, but what kind of set?
In the space of the real numbers, sets can be open or closed or a combination thereof. Roughly speaking, if the boundary of the set is part of the set, the set is closed. If the boundary is missing the set is open. Zeno constructed an infinite series of steps that converges to a finite value and we meet these series again in topology. Iff the limiting value (of any such series) is part of the set, the set is closed. (It’s the same as the open and closed intervals you’ve been dealing with in school, just generalized to more dimensions.) The topologists then went on to reason that objects can either occupy open sets or closed sets, and at any point in space there can be only one object.
Sounds simple enough, but here’s the conundrum. If you have two open sets that do not overlap, they will always be separated by the boundary that isn’t part of either of them. And if you have two closed sets that touch, the boundary is part of both, meaning they also overlap. In neither case can the objects touch without overlapping. Now what? This puzzle was so important to them that Bolzano went on to suggest that objects may occupy sets that are partially open and partially closed. While technically possible, it’s hard to see why they would, in more than 1 spatial dimension, always arrange so as to make sure one object’s closed surface touches the other’s open patches.
More time went by and on the stage of science appeared the notion of fields that mediate interactions between things. Now objects could interact without touching, awesome. But if they don’t repel what happens when they get closer? Do or don’t they touch eventually? Or does interacting via a field mean they touch already? Before anybody started worrying about this, science moved on and we learned that the field is quantized and the interaction is really just mediated by the particles that make up the field. So how do we even phrase now the question whether two objects touch?
We can approach this by specifying that we mean with an “object” a bound state of many atoms. The short distance interaction of these objects will (at room temperature, normal atmospheric pressure, non-relativistically, etc) take place primarily by exchanging (virtual) photons. The photons do in no sensible way belong to any one of the objects, so it seems fair to say that the objects don’t touch. They don’t touch, in one sentence, because there is no four-fermion interaction in the standard model of particle physics.
Alas, tying touch to photon exchange in general doesn’t make much sense when we think about the way we normally use the word. It does for example not have any qualifier about the distance. A more sensible definition would make use of the probability of an interaction. Two objects touch (in some region) if their probability of interaction (in that region) is large, whether or not it was mediated by a messenger particle. This neatly solves the topologists’ problem because in quantum mechanics two objects can indeed overlap.
What one means with “large probability” of interaction is somewhat arbitrary of course, but quantum mechanics being as awkward as it is there’s always the possibility that your finger tunnels through your brain when you try to hit your nose, so we need a quantifier because nothing is ever absolutely certain. And then, after all, you can touch your nose! You already knew that, right?
But if you think this settles it, let me add...
[Image] Yes, no, maybe, wtf.
And so, after having spent an hour staring at that cartoon in my facebook feed, I came to the conclusion that the question isn’t whether we can touch something, but what we mean with “some thing”. I think I had been looking for some thing else though…
A world without statistics
A reporter asked me for a quote regarding the importance of statistics. But, after thinking about it for a moment, I decided that statistics isn’t so important at all. A world without statistics wouldn’t be much different from the world we have now.
What would be missing, in a world without statistics?
Science would be pretty much ok. Newton didn’t need statistics for his theories of gravity, motion, and light, nor did Einstein need statistics for the theory of relativity. Thermodynamics and quantum mechanics are fundamentally statistical, but lots of progress could’ve been made in these areas without statistics. The second law of thermodynamics is an observable fact, ditto the two-slit experiment and various experimental results revealing the nature of the atom. The A-bomb and, almost certainly, the H-bomb, maybe these would never have been invented without statistics, but on balance I think most people would feel that the world would be a better place without these particular scientific developments. Without statistics, we could forget about discovering the Higgs boson etc, but that doesn’t seem like such a loss for humanity.
At a more applied level, statistics helped to win World War 2, most notably in cracking the Enigma code but also in various operations-research efforts. And it’s my impression that “our” statistics were better than “their” statistics. So that’s something.
Where would civilian technology be without statistics? I’m not sure. I don’t have a sense of how necessary statistics was for quantum theory. In a world without statistics, would the study of quantum physics have progressed far enough so that transistors were invented? This one, I don’t know. And without statistics we wouldn’t have modern quality control, so maybe we’d still be driving around in AMC Gremlins and the like. Scary thought, but not a huge deal, I’d think. No transistors, though, that would make a difference in my life. No transistors, no blogging! And I guess we could also forget about various unequivocally beneficial technological innovations such as modern pacemakers, hearing aids, cochlear implants, and Clippy.
Modern biomedicine uses lots and lots of statistics, but would medicine be so much worse without it? I don’t think so, at least not yet. You don’t need statistics to see that penicillin works, nor to see that mosquitos transmit disease and that nets keep the mosquitos out. Without statistics, I assume that various mistakes would get into the system, various ineffective treatments that people think are effective, etc. But on balance I doubt these would be huge mistakes, and the big ones would eventually get caught, with careful record-keeping even without statistical inference and adjustments. Without statistics, biologists would not be able to sequence the gene, and I assume they’d be much slower at developing tools such as tests that allow you to check for chromosomal abnormalities in amnio. I doubt all these things add up to much yet, but I guess there’s promise for the future. Statistics is also necessary for a lot of drug development—right now my colleagues and I are working on a pharmacodynamic model of dosing—but, again, without any of this, it’s not clear the world would be so much different.
The Poverty Lab team use statistics and randomized experiments to see what works to help the lives of poor people around the world. That’s cool but I’m not ultimately convinced this all makes a difference in the big picture. Or, to put it another way, I suspect that the statistical validation serves mostly as a way to build political consensus for economic policies that will be effective in sharing the wealth. By demonstrating in a scientific way that Treatment X is effective, this supports the idea that there is a way to help the sort of people who live in what Nicholas Wade would describe as “tribal” societies. So, sure, fine, but in this case the benefits of the statistical methods are somewhat indirect.
Without statistics, we wouldn’t have most of the papers in “Psychological Science,” but I could handle that. Piaget didn’t need any statistics, and I think the modern successors of Piaget could’ve done pretty much what they’ve done without statistics, just by careful observation of major transitions.
Careful observation and precise measurement can be done, with or without statistical methods. Indeed, researchers often use statistics as a substitute for careful observation and precise measurement. That is a horrible thing to do, and if you have a clear understanding of statistical theory, you can see why. But statistics is hard, and lots of researchers (and journal editors, news reporters, etc.) don’t have that understanding. When statistics is used as a substitute for, rather than an adjunct to, scientific measurement, we get problems.
OK, here’s another one: no statistics, no psychometrics. That’s too bad but one could make the argument that, on the whole, psychometrics has done more harm than good (value-added assessment, anyone?). Don’t get me wrong—I like psychometrics, and a strong argument could be made that it’s done more good than harm—but my point here is that the net benefit is not clear; a case would have to be made.
Polling. Can’t do it well without statistics. But, would a world without polling be so horrible? Much as I hate to admit it, I don’t think so. Don’t get me wrong, I think polling is on balance a good thing—I agree with George Gallup that measurement of public opinion is an important part of the modern democratic process—but I wouldn’t want to hang too much of the benefits of statistics on this one use, given that I expect lots of people would argue that opinion polls do more harm than good in politics.
The alternative to good statistics is . . .
Perhaps the most important benefits of statistics come not from the direct use of statistical methods in science and technology, but rather in helping us learn about the world. Statisticians from Francis Galton and Ronald Fisher onward have used statistics to give us a much deeper understanding of human and biological variation. I can’t see how any non-statistical, mechanistic model of the world could reproduce that level of understanding. Forget about p-values, Bayesian inference, and the rest: here I’m simply talking about the nature of correlation and variation.
For a more humble example, consider Bill James. Baseball is a silly example, sure, but the point is to see how much understanding has been gained in this area through statistical measurement and comparison. As James so memorably wrote, the alternative to good statistics is not “no statistics,” it’s “bad statistics.” James wrote about baseball commentators who would make asinine arguments which they would back up by picking out numbers without context. In politics, the equivalent might be a proudly humanistic pundit such as New York Times columnist David Brooks supporting his views by just making up numbers or featuring various “too good to be true” statistics and not checking them.
So here’s one benefit to the formal study of statistics: Without any statistics, there still would be numbers, along with people trying to interpret them.
Could governments and large businesses be managed well without statistics? I’m not sure. Given that half the U.S. Congress seems willing to shut down the government from time to time, it’s not clear that any agreement on the numbers will have much to do with political action. Similarly, all the statistics in the world don’t seem to be stopping the euro-zone from drifting. But maybe things would be much worse without a common core of statistical agreement. I don’t know; unfortunately this seems like the sort of causal question that is too difficult for statistics to answer.
Finally, one way that statistics is potentially having a huge impact in our lives is through the measurement of global warming and all the rest. But I’m guessing that a lot of this could be done with a pre-statistical understanding. The basic physics is already there, as would be the careful measurements. Statistical modeling is certainly relevant to the study of climate change—if you’re trying to reconstruct historical climate conditions from tree-ring data, it’s tough enough to do it with statistical modeling, I can’t imagine how it could be done otherwise—but the basic patterns of carbon dioxide, temperature, melting ice, etc., are apparent in any case. And, even with statistics, much uncertainty remains.
Summary
When I started writing this post, I was thinking that statistics doesn’t really matter, but I think that’s because I was focusing on some of the more highly-publicized but less beneficial applications of statistics: the use of statistical experimentation and inference to get p-values for tabloid-bait scientific papers, or for Google, Amazon, etc., to perfect their techniques for squeezing money out of their customers or, even at best, to test a medical treatment that increases survival rate for some rare disease by 2 percentage points. But statistics is central to how we think about the world. I still think that statistics is much less central to our lives than, say, chemistry. But it ain’t nothing.
The post A world without statistics appeared first on Statistical Modeling, Causal Inference, and Social Science.
Weaponized Moods, or ‘Twitter as a Nightmare We Will Never Wake From’
When I was fourteen I bought 'Philosophical Investigations.’ It’s a book that’s famous for three things apart from its ideas about mind and language: 1) Nobody that reads it can deny that Wittgenstein was probably the smartest person that has ever lived. 2) Nobody that reads it can deny that Wittgenstein was probably the purest person that has ever lived. 3) The book says that if you disagree with anything in the book it’s because you are confused or lying to yourself. I spent most of the year between fourteen and fifteen reading it and crying and throwing it at the wall and hiding it around the house hoping I can’t remember where I put the book. The internet is harder to hide underneath the sink, and though there may not be a Wittgenstein on it it’s full of people that perpetually make me go 'this person isn’t stupid or corrupt, I can tell, and they’re saying you got to be stupid or corrupt to disagree with them, and only someone stupid or corrupt would say a thing like that if it’s not true, and even if I’ll tell myself that I agree with them I’ll know I don’t really agree with them, and even if I tell myself they’re stupid or corrupt I’ll know they aren’t really stupid or corrupt, so really the best thing is not to be born and the second best thing is to die soon.’
The Categorical Origins of Lebesgue Integration
Nosimpler: Part of my long quest to understand just wtf integration is.
I’ve just come back from the big annual-ish category theory meeting, Category Theory 2014 in Cambridge, also attended by Café hosts Emily and Simon. The talk I gave there was called The categorical origins of Lebesgue integration — click for slides — and I’ll briefly describe it now.
There are two theorems.
Theorem A The Banach space $L^1[0, 1]$ has a simple universal property. This leads to a unique characterization of integration on $[0, 1]$.
Theorem B The functor $L^1\colon$ (finite measure spaces) $\to$ (Banach spaces) has a simple universal property. This leads to a unique characterization of integration on finite measure spaces.
The talk’s pretty simple, and I don’t think I can summarize it much better than by repeating the abstract, which went like this:
Lebesgue integration is a basic, essential component of analysis. Yet most definitions of Lebesgue integrability and integration are rather complicated, typically depending on a series of preliminary definitions. For instance, one of the most popular approaches involves the class of functions that can be expressed as an almost everywhere pointwise limit of an increasing sequence of step functions. Another approach constructs the space of Lebesgue-integrable functions as the completion of the normed vector space of continuous functions; but this depends on already having the definition of integration for continuous functions.
So we might wish for a short, direct description of Lebesgue integrability that reflects its fundamental nature. I will present two theorems achieving this.
The first characterizes the space $L^1[0, 1]$ by a simple universal property, entirely bypassing all the usual preliminary definitions. It tells us that once we accept two concepts — Banach space and the mean of two numbers — then the concept of Lebesgue integrability is inevitable. Moreover, this theorem not only characterizes the Lebesgue integrable functions on $[0, 1]$; it also characterizes Lebesgue integration of such functions.
The second theorem characterizes the functor $L^1$ from measure spaces to Banach spaces, again by a simple universal property. Again, the theorem characterizes integration, as well as integrability, of functions on an arbitrary measure space.
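The abstract does not spell out the universal property, so the following is only a guess at where "the mean of two numbers" might enter, not a statement of either theorem. It is a standard change-of-variables identity: the integral of an integrable $f$ over $[0, 1]$ is the mean of the integrals of its two rescaled halves.

```latex
\int_0^1 f(x)\,dx
  = \frac{1}{2}\left(
      \int_0^1 f\!\left(\tfrac{x}{2}\right) dx
      + \int_0^1 f\!\left(\tfrac{x+1}{2}\right) dx
    \right)
```

Substituting $u = x/2$ in the first term and $u = (x+1)/2$ in the second turns the right-hand side into $\int_0^{1/2} f(u)\,du + \int_{1/2}^{1} f(u)\,du$, which is the left-hand side.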
Public Service in the 21st Century
Many large corporations with a strong incentive to influence public policy award bonuses and other incentive pay to executives if they take jobs within the government. CitiGroup, for instance, provides an executive contract that awards additional retirement pay upon leaving to take a “full time high level position with the U.S. government or regulatory body.” Goldman Sachs, Morgan Stanley, JPMorgan Chase, the Blackstone Group, Fannie Mae, Northern Trust, and Northrop Grumman are among the other firms that offer financial rewards upon retirement for government service. (h/t Gaius Publius)
Synopsis: Carbon-12 Caught in a Triangle
Published Mon Jun 30, 2014
Laboratory-grown vaginas
Four teenage girls have received vaginas grown from their own cells in a lab. And they work. Further details at the Washington Post. Image from a video at the Wall Street Journal (safe for work, unless someone at work is offended by tissue culture).
These girls were born with underdeveloped or missing vaginas because of a rare condition called Mayer-Rokitansky-Küster-Hauser Syndrome that affects about 1 in 5,000 women. While their labia looked like those of other girls, their vaginas, cervixes and wombs, which are necessary for menstruation and childbirth, never fully formed.
Medical researchers took a vaginal tissue sample from each patient, who were between 13 and 18 at the time, and used them to grow cells in the lab. After four weeks, the researchers had enough cells to layer them on to degradable scaffolding...
Six months later, the patients were able to menstruate and have sexual intercourse for the first time. “After the operation they were able to function normally,” Atala told reporters. “They had normal levels of desire, arousal, satisfaction and orgasm.” Some may also be able to have children.
Happy Birthday Richard Stanley!
This week we are celebrating in Cambridge, MA, and elsewhere in the world, Richard Stanley’s birthday. For the last forty years, Richard has been one of the very few leading mathematicians in the area of combinatorics, and he found deep, profound, and fruitful links between combinatorics and other areas of mathematics. His works enriched and influenced combinatorics as well as other areas of mathematics, and, in my opinion, combinatorics matured greatly as a mathematical discipline thanks to his work.
Trivia Quiz
Correct or incorrect?
(1) Richard drove cross-country at least 8 times
(2) In his youth, at a wild party, Richard Stanley found a proof of FLT consisting of a few mathematical symbols.
(3) Richard jumped at least once from an airplane
(4) Richard is actively interested in the study of consciousness
(5) Richard found a mathematical way to divide by zero
Seven Early Papers by Richard Stanley That You Must Read.
Richard’s Green book: Combinatorics and Commutative Algebra
Combinatorics and Commutative Algebra
(1) R. P. Stanley, The upper bound conjecture and Cohen-Macaulay rings. Studies in Appl. Math. 54 (1975), no. 2, 135–142.
The two seminal papers (1) and (3) (below) showed remarkable and unexpected applications of commutative algebra to combinatorics. In each of these papers a central conjecture in combinatorics was solved in a completely unexpected way which was the basis for a later remarkable theory. Paper (1) is the starting point for the interrelation between commutative algebra and combinatorics of simplicial complexes and their topology. In this work Richard Stanley proved the Motzkin-Klee upper bound conjecture for triangulations of spheres. This conjecture asserts that the maximum number of k-faces for a triangulation of a (d-1)-dimensional sphere with n vertices is attained by the boundary complex of the cyclic d-dimensional polytope with n vertices. Peter McMullen proved this conjecture for simplicial polytopes and Richard Stanley proved it for arbitrary triangulations of spheres. The key point was that a certain ring (the Stanley-Reisner ring) associated with a simplicial polytope has the Cohen-Macaulay property.
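To make the upper bound concrete, here is a small Python sketch that computes the face numbers of the cyclic polytope, i.e. the maxima the conjecture refers to. It relies on a standard fact not stated in the post: the h-vector of the cyclic d-polytope with n vertices is h_i = C(n-d-1+i, i) for i ≤ d/2, extended by the symmetry h_i = h_{d-i}. The function names are hypothetical.

```python
from math import comb

def cyclic_h_vector(d, n):
    """h-vector of the cyclic d-polytope with n vertices (n > d)."""
    h = []
    for i in range(d + 1):
        j = min(i, d - i)              # symmetry h_i = h_{d-i}
        h.append(comb(n - d - 1 + j, j))
    return h

def f_from_h(h):
    """Face numbers [f_0, ..., f_{d-1}] from an h-vector [h_0, ..., h_d]."""
    d = len(h) - 1
    return [
        sum(comb(d - i, k - i) * h[i] for i in range(k + 1))
        for k in range(1, d + 1)       # entry k-1 is f_{k-1}
    ]

print(f_from_h(cyclic_h_vector(3, 5)))   # [5, 9, 6]
print(f_from_h(cyclic_h_vector(4, 6)))   # [6, 15, 18, 9]
```

The second output shows the neighborly behaviour of cyclic 4-polytopes: with 6 vertices all C(6, 2) = 15 possible edges are present.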
The connection between combinatorics and commutative algebra is far reaching, and in subsequent works combinatorial problems led to developments in commutative algebra and techniques from the two areas were combined. A more recent important paper by Richard on applications of commutative algebra for the study of face numbers is: R. P. Stanley, Subdivisions and local h-vectors. J. Amer. Math. Soc. 5 (1992), no. 4, 805–851.
And here is an important development in this theory, only a few weeks old: Relative Stanley-Reisner theory and Upper Bound Theorems for Minkowski sums, by Karim A. Adiprasito and Raman Sanyal.
The Cohen-Macaulay property, magic squares and lattice points in polytopes
(2) R. P. Stanley, Magic labelings of graphs, symmetric magic squares, systems of parameters, and Cohen-Macaulay rings. Duke Math. J. 43 (1976), no. 3, 511–531.
This paper starts with a theorem about enumeration of certain magic squares. Solving a long-standing open problem, Stanley proved that the generating function for the number of k by k integer matrices (k fixed) with nonnegative entries and row sums and column sums equal to n is rational. This is the starting point of a deep algebraic theory of integral points in polyhedra.
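For small cases the count in question can be checked by brute force. The sketch below enumerates k by k nonnegative integer matrices with all row and column sums equal to n, and compares the k = 3 counts with the classical closed form C(n+2, 4) + C(n+3, 4) + C(n+4, 4), a known formula that is not quoted in the post. The function names are made up for illustration, and the enumeration is only practical for small k and n.

```python
from math import comb

def compositions(total, parts):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def count_magic(k, n):
    """Count k x k nonnegative integer matrices with all line sums n."""
    rows = list(compositions(n, k))    # every admissible row

    def extend(row_index, col_sums):
        if row_index == k:
            return 1 if all(s == n for s in col_sums) else 0
        total = 0
        for row in rows:
            new_sums = tuple(s + r for s, r in zip(col_sums, row))
            if all(s <= n for s in new_sums):   # prune overfull columns
                total += extend(row_index + 1, new_sums)
        return total

    return extend(0, (0,) * k)

for n in range(5):
    closed_form = comb(n + 2, 4) + comb(n + 3, 4) + comb(n + 4, 4)
    print(n, count_magic(3, n), closed_form)
```

For fixed k the count is a polynomial in n, as the k = 3 formula illustrates, and that is why its generating function in n is rational.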
Enter the Hard Lefschetz Theorem: McMullen’s g-conjecture
(3) R. P. Stanley, The number of faces of a simplicial convex polytope. Adv. in Math. 35 (1980), no. 3, 236–238.
The g-conjecture proposes a complete characterization of face numbers of d-dimensional simplicial polytopes. One linear equality that holds among face numbers is, of course, the Euler-Poincaré relation. This relation implies additional [d/2] equalities called the Dehn-Sommerville relations. Peter McMullen proposed an additional system of linear and nonlinear inequalities as a complete characterization of face numbers of simplicial polytopes. The sufficiency part of this conjecture was proved by Billera and Lee. Richard Stanley’s brilliant proof for McMullen’s inequalities that established the g-conjecture was based on the Hard Lefschetz Theorem from algebraic topology. Starting from a simplicial polytope P (with rational vertices) we associate to it a toric variety T(P). It turned out that the cohomology ring of this variety is closely related to the Stanley-Reisner ring mentioned above. The Hard Lefschetz Theorem implies an algebraic property of the Stanley-Reisner ring from which McMullen’s inequalities can be deduced by direct combinatorial reasoning. Richard found a number of other combinatorial applications of the Hard Lefschetz theorem (including the solution of the Erdős-Moser conjecture).
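The linear part of these conditions is easy to test on a given face vector. The sketch below converts an f-vector into the h-vector, checks the Dehn-Sommerville symmetry h_i = h_{d-i}, and checks that the g-vector entries g_i = h_i - h_{i-1} are nonnegative. It deliberately leaves out McMullen's nonlinear inequalities, so passing it is necessary but not sufficient. The function names are illustrative only.

```python
from math import comb

def h_from_f(f):
    """h-vector [h_0..h_d] from face numbers f = [f_0, ..., f_{d-1}]."""
    d = len(f)
    full = [1] + list(f)               # full[i] = f_{i-1}, with f_{-1} = 1
    return [
        sum((-1) ** (k - i) * comb(d - i, k - i) * full[i]
            for i in range(k + 1))
        for k in range(d + 1)
    ]

def g_vector(h):
    """g_0 = h_0 and g_i = h_i - h_{i-1} for 1 <= i <= d // 2."""
    d = len(h) - 1
    return [h[0]] + [h[i] - h[i - 1] for i in range(1, d // 2 + 1)]

def passes_linear_conditions(f):
    """Dehn-Sommerville symmetry plus nonnegativity of the g-vector."""
    h = h_from_f(f)
    return h == h[::-1] and all(g >= 0 for g in g_vector(h))

octahedron = [6, 12, 8]
print(h_from_f(octahedron))                # [1, 3, 3, 1]
print(g_vector(h_from_f(octahedron)))      # [1, 2]
print(passes_linear_conditions(octahedron))
```

Changing a single face number, for example to [6, 12, 9], breaks the symmetry of the h-vector, so such a vector cannot come from a simplicial polytope.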
Here is the abstract of Lou Billera’s lecture
LOUIS BILLERA (CORNELL)
Even more intriguing, if rather less plausible…
The title is how Peter McMullen described his own conjectured characterization of the f-vectors of simplicial polytopes in his 1971 lecture notes on the upper bound conjecture written with Geoffrey Shephard. Yet by the end of that decade, the so-called g-conjecture would become the g-theorem, and algebraic combinatorics (as practiced at MIT) would have attracted the attention of mainstream mathematics, almost entirely due to the startling proof given by Richard Stanley.
I will briefly describe some of the events leading to this proof and some of its still developing consequences.
Enumeration
Enumeration is Richard’s true mathematical love.
Richard’s monumental books EC1 and EC2 (The picture is of EC1 and a young fan)
(4) A baker’s dozen of conjectures concerning plane partitions, in Combinatoire Énumérative (G. Labelle and P. Leroux, eds.), Lecture Notes in Math., no. 1234, Springer-Verlag, Berlin/Heidelberg/New York, 1986, pp. 285-293.
13 beautiful conjectures on counting plane partitions with various forms of symmetry.
(5) Generating functions, in Studies in Combinatorics (G.-C. Rota, ed.), Mathematical Association of America, 1978, pp 100-141.
For me this was the best introduction to generating functions, clear and inspiring. Rota’s entire little blue MAA volume on combinatorics from 1978 is great. Buy it!
Posets everywhere
(5) Supersolvable lattices, Algebra Universalis 2 (1972), 197-217.
This paper provides a profound link between group theory and the study of
partially ordered sets. It can be seen as a starting point of Stanley’s own work on the Cohen-Macaulay property and it had much influence on later works on combinatorial properties of lattices of subgroups by Quillen and many others, and also on the study of POSETS (=partially ordered sets) arising from arrangements of hyperplanes. The algebraic notion of supersolvable groups is translated to an important combinatorial notion for partially ordered sets. (There is a more detailed paper which I could not find online: R. P. Stanley, Supersolvable semimodular lattices. Mobius algebras (Proc. Conf., Univ. Waterloo, Waterloo, Ont., 1971), pp. 80–142. Univ. Waterloo, Waterloo, Ont., 1971.)
Combinatorics and representation theory
(6) On the number of reduced decompositions of elements of Coxeter groups, European J. Combinatorics 5 (1984), 359-372.
This paper gives an important result proved using representation theory. It is one of
many results by Stanley on connections between enumerative combinatorics,
representation theory, and invariant theory. Again, this paper represents an exciting area of research about the connection of enumerative combinatorics and representation theory that I am less familiar with. A very inspiring survey paper is: Invariants of finite groups and their applications to combinatorics, Bull. Amer. Math. Soc. (new series) 1 (1979), 475-511.
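The paragraph above does not state the result, so the following is a hedged illustration rather than a summary of the paper. A well-known theorem from it says that the number of reduced decompositions of the longest permutation in S_n equals the number of standard Young tableaux of staircase shape (n-1, n-2, ..., 1). The sketch counts reduced decompositions by brute force and compares with the hook length formula; the function names are made up for the example.

```python
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def count_reduced_words(perm):
    """Count reduced decompositions of a permutation given in one-line notation."""
    total = 0
    has_descent = False
    for i in range(len(perm) - 1):
        if perm[i] > perm[i + 1]:
            has_descent = True
            # Removing a descent by an adjacent swap shortens the permutation by one.
            swapped = perm[:i] + (perm[i + 1], perm[i]) + perm[i + 2:]
            total += count_reduced_words(swapped)
    return total if has_descent else 1  # the identity has only the empty word


def staircase_tableaux(n):
    """Standard Young tableaux of shape (n-1, ..., 1), via the hook length formula."""
    shape = list(range(n - 1, 0, -1))
    hooks = 1
    for r, row_len in enumerate(shape):
        for c in range(row_len):
            arm = row_len - c - 1
            leg = sum(1 for below in shape[r + 1:] if below > c)
            hooks *= arm + leg + 1
    return factorial(sum(shape)) // hooks


for n in range(1, 6):
    longest = tuple(range(n, 0, -1))
    print(n, count_reduced_words(longest), staircase_tableaux(n))
```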
Here are abstracts of two lectures from the meeting on some recent developments in combinatorial representation theory and symmetric functions.
GRETA PANOVA (UCLA)
The Kronecker coefficients: an unexpected journey
Kronecker coefficients live at the intersection of representation theory, algebraic combinatorics and, most recently, complexity theory. They count the multiplicities of irreducible representations in the tensor product of two other irreducible representations of the symmetric group. While their journey started 75 years ago, they still haven’t found their explicit positive combinatorial formula, and present a major open problem in algebraic combinatorics. Recently, they were given a new role in the field of Geometric Complexity Theory, initiated by Mulmuley and Sohoni, where certain conjectures on the complexity of computing and deciding positivity of Kronecker coefficients are part of a program to prove the “P vs NP” problem.
We will take the Kronecker coefficients to asymptotics land and bound them. As an unexpected consequence of this trip, we find bounds for the difference of consecutive coefficients in the q-binomial coefficients (as polynomial in q), generalizing Sylvester’s unimodality theorem and connecting with results of Richard Stanley.
Joint work with Igor Pak.
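For the smallest interesting case, the multiplicities described in this abstract can be computed directly from characters. This is a minimal sketch and is not taken from the lecture: it hard-codes the character table of S_3 (class sizes 1, 3, 2 for the identity, the transpositions and the 3-cycles) and evaluates the standard formula g(a, b, c) = (1/|G|) ∑ χ_a χ_b χ_c summed over the group. The names `CHARACTERS` and `kronecker` are made up for the example.

```python
CLASS_SIZES = [1, 3, 2]  # identity, transpositions, 3-cycles in S_3

CHARACTERS = {
    "trivial": [1, 1, 1],
    "sign": [1, -1, 1],
    "standard": [2, 0, -1],
}


def kronecker(a, b, c):
    """Multiplicity of irreducible c in the tensor product of irreducibles a and b."""
    total = sum(
        size * x * y * z
        for size, x, y, z in zip(CLASS_SIZES, CHARACTERS[a], CHARACTERS[b], CHARACTERS[c])
    )
    quotient, remainder = divmod(total, sum(CLASS_SIZES))
    assert remainder == 0  # a multiplicity is always an integer
    return quotient


# Decompose standard (x) standard into irreducibles.
print({name: kronecker("standard", "standard", name) for name in CHARACTERS})
```

All characters of the symmetric group are real, so no complex conjugation is needed in the formula.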
THOMAS LAM (U MICHIGAN)
Truncations of Stanley symmetric functions and amplituhedron cells
Stanley symmetric functions were invented (by Stanley) with applications to the enumeration of reduced words in the symmetric group in mind. Recently, the “amplituhedron” was introduced in the study of scattering amplitudes in N=4 super Yang Mills. I will talk about a formula for the cohomology class of a (tree) amplituhedron variety as the truncation of an affine Stanley symmetric function.
Two combinatorial applications of the Aleksandrov-Fenchel inequalities
(7) Two combinatorial applications of the Aleksandrov-Fenchel inequalities, J. Combinatorial Theory (A) 31 (1981), 56-65.
In this amazing paper Stanley used inequalities of classical convexity
to settle an important conjecture on probability of events in partially
ordered sets. A special case of the conjecture was settled earlier by Ron
Graham using the FKG inequality. The profound relation between classical
convexity inequalities, combinatorial structures, polytopes, and
probability theory was further studied by many authors including Stanley
himself and there is much more to be done.
I see that I ran out of my seven designated slots. Certainly you should read Richard’s combinatorial constructions of polytopes, like Two poset polytopes, Discrete Comput. Geom. 1 (1986), 9-23, and his papers on arrangements. Let me mention a more recent paper of Stanley in this general area: A polytope related to empirical distributions, plane trees, parking functions, and the associahedron (with J. Pitman), Discrete Comput. Geom., 27 (2002), 603-634.
More
(Mostly from RS’s homepage.)
Chess and Mathematics
(12 page PDF file) An excerpt (version of 1 November 1999) from a book Richard is writing with Noam Elkies.
An unusual method for proving the Riemann hypothesis.
Richard Stanley’s Mathoverflow question on MAGIC
Magic trick based on deep mathematics
Richard the Catalan
(From RS’s homepage)
Excerpt (27 page PDF file) from EC2 on problems related to Catalan numbers (including 66 combinatorial interpretations of these numbers).
Solutions to Catalan number problems from the previous link (23 page PDF file).
Catalan addendum (Postscript or PDF) (version of 25 May 2013; 96 pages). An addendum of new problems (and solutions) related to Catalan numbers. Current number of combinatorial interpretations of Cn: 207.
The material on Catalan numbers is being collected into a monograph, to be published by Cambridge University Press in late 2014 or early 2015.
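As a small companion to these links, here is a sketch of one of the many interpretations: C_n counts the lattice paths with n up-steps and n down-steps that never go below their starting height (Dyck paths). The code compares a direct count with the closed form C_n = binomial(2n, n)/(n+1); the function names are made up for the illustration.

```python
from functools import lru_cache
from math import comb


def catalan(n):
    """Closed form for the n-th Catalan number."""
    return comb(2 * n, n) // (n + 1)


@lru_cache(maxsize=None)
def dyck_paths(ups_left, downs_left):
    """Count ways to finish a path; the current height is downs_left - ups_left."""
    if ups_left > downs_left:
        return 0  # the path has dipped below its starting height
    if downs_left == 0:
        return 1  # nothing left to place
    total = dyck_paths(ups_left, downs_left - 1)  # take a down-step
    if ups_left > 0:
        total += dyck_paths(ups_left - 1, downs_left)  # take an up-step
    return total


print([catalan(n) for n in range(8)])
print([dyck_paths(n, n) for n in range(8)])
```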
Discrete activity states en route to consciousness [Neuroscience]
Lawmaker Slams Ex-NSA Chief: ‘Nothing to Offer’ but State Secrets
Keith Alexander, since stepping
down from his position as National Security Agency (NSA) and U.S.
Cyber Command chief following last year's mass surveillance
revelations, has gotten himself in the business of cybersecurity
consulting.
And not everyone's comfortable with that. Rep. Alan Grayson (D-Fl.) yesterday published letters he sent to the "Securities Industry and Financial Markets Association, the Consumer Bankers Association, the Financial Services Roundtable and the Clearing House—all of which Alexander reportedly has approached about his services," according to Wired. The congressman, who sits on both the Committee on Foreign Affairs and the Committee on Science, Space, and Technology, gave a roundabout warning to the former spy chief:
Disclosing or misusing classified information for profit is, as Mr. Alexander well knows, a felony. I question how Mr. Alexander can provide any of the services he is offering unless he discloses or misuses classified information, including extremely sensitive sources and methods. Without the classified information he acquired in his former position, he literally would have nothing to offer to you.
He concluded by turning up the heat and asking the organizations to be transparent:
Please send me all information related to your negotiations with Mr. Alexander, so that Congress can verify whether or not he is selling military and cybersecurity secrets to the financial industry for personal gain.

Grayson isn't the only skeptic. In his letter, he cites top computer security expert Bruce Schneier, who has similar concerns. Regarding Alexander's eye-popping rates, $600,000 to $1 million a month, earlier this week Schneier asked his readers to "think of how much actual security they could buy with that $600K a month. Unless he's giving them classified information."
There's a pinch of irony that Alexander, who does a lot of handwringing over Edward Snowden for exposing government secrets, is now on the receiving end of suspicion for similar actions.
Why Brink Lindsey opposes a guaranteed annual income
Nosimpler: Now it's on everybody's radar, and the detractors are looking like idiots.
…my reading of the available evidence convinces me that a social policy that channels benefits through work and thereby encourages paid employment has important advantages over a UBI [universal basic income] in helping the disadvantaged to live full, happy, productive, and rewarding lives.
What evidence? Let’s start with the well-established finding that unemployment has major negative effects on well-being, including both mental and physical health. And the effects are remarkably persistent. A study using German panel data examined changes in reported life satisfaction after marriage, divorce, birth of a child, death of a spouse, layoff, and unemployment. All had predictable effects in the short term, but for five of the six the effect generally wore off with time: the joy of having a new baby subsided, while the pain of a loved one’s death gradually faded. The exception was unemployment: even after five years, the researchers found little evidence of adaptation.
Evidence even more directly on point comes from the experience of welfare reform – specifically, the imposition of work requirements on recipients of public assistance. Interestingly, studies of the economic consequences of reform showed little or no change in recipients’ material well-being. But a pair of studies found a positive impact on single mothers’ happiness as a result of moving off welfare and finding work.
There is more here. And Ross Douthat offers related remarks on whether it really is possible to encourage work — how well have previous welfare reforms succeeded in this end?
We are not ready for health data mining
There have been two articles very recently about how great health data mining could be if we could only link up all the data sets. Larry Page from Google thinks so, which doesn’t surprise anyone, and separately we are seeing that the consequence of the new medical payment system through the ACA is giving medical systems incentives to keep tabs on you through data providers and find out if you’re smoking or if you need to fill up on asthma medication.
And although many would consider this creepy stalking, that’s not actually my problem with it. I think Larry Page is right – we might be able to save lots of lives if we could mine this data which is currently siloed through various privacy laws. On the other hand, there are reasons those privacy laws exist. Let’s think about that for a second.
Now that we have the ACA, insurers are not allowed to deny Americans medical insurance coverage because of a pre-existing condition, nor are they allowed to charge more, as of 2014. That’s good news on the health insurance front. But what about other aspects of our lives?
For example, it does not generalize to employers. In other words, a large employer like Walmart might take into account your current health and your current behaviors and possibly even your DNA to predict future behaviors, and they might decide not to give jobs to anyone at risk of diabetes, say. Even if medical insurance costs were taken out of the picture, which they haven’t been, they’d have incentives not to hire unhealthy people.
Mind you, there are laws that prevent employers from looking into HIPAA-protected health data, but not Acxiom data, which is entirely unregulated. And if we “opened up all the data” then the laws would be entirely moot. It would be a world where, to get a job, the employer got to see everything about you, including your future health profile. To some extent this is already happening.
Perhaps not everyone thinks of this as bad. After all, many people think smokers should pay more for insurance, why not also work harder to get a job? However, lots of the information gleaned from this data – even behaviors – has much more to do with poverty levels and circumstance than with conscious choice. In other words, it’s another stratification of society along the lucky/unlucky birth lottery spectrum. And if we aren’t careful, we will make it even harder for poor people to eke out a living.
I’m all for saving lives but let’s wait for the laws to catch up with the good intentions. Although to be honest, it’s not even clear how the law should be written, since it’s not clear what “medical” data is nowadays nor how we could gather evidence that a private employer is using it against someone improperly.
Simultaneous Measurement of Complementary Observables with Compressive Sensing
Author(s): Gregory A. Howland, James Schneeloch, Daniel J. Lum, and John C. Howell
Coincident high-resolution position and momentum imaging of optical photons is achieved using a sequence of weak and strong measurements.
[Phys. Rev. Lett. 112, 253602] Published Thu Jun 26, 2014
Excruciating Pain from Above: Drones Get Pepper Spray
It was inevitable. Now it's
happening. Desert Wolf, a company in South Africa, has created a
drone capable of shooting pepper spray and "blinding lasers" at
unruly crowds. The company has sold 25 of the drones to a mining
company after showing them off at a trade show, the BBC reports.
They offer more details of
the drones' weaponry:
Desert Wolf's website states that its Skunk octacopter drone is fitted with four high-capacity paintball barrels, each capable of firing up to 20 bullets per second.
In addition to pepper-spray ammunition, the firm says it can also be armed with dye-marker balls and solid plastic balls.
The machine can carry up to 4,000 bullets at a time as well as "blinding lasers" and on-board speakers that can communicate warnings to a crowd.
CNet notes that these "blinding lasers" are forbidden for use in war by the Geneva Convention and adds that weapons that are classified as "non-lethal" often do kill anyway:
"These weapons cannot be sufficiently well controlled to avoid causing serious injury, especially to eyes," warns Mark Gubrud of campaign group the International Committee for Robot Arms Control. "Many existing "non-lethal" crowd-control weapons can and often do kill."
Right now it appears that mining companies in countries that have seen violent clashes between striking workers and authorities are most interested. But is it only a matter of time before the infamous "pepper spray cop" is replaced by a "pepper spray drone"? Well, at least the drone won't apply for workers' compensation payments afterward.
Can Someone Just Invent an 'I Consent to Sex' iPhone App Already?
Nosimpler: Hahaha "I'm counting on the free market to work its magic and provide a sensible and convenient method of demonstrating mutual consent."
Over at Slate, Amanda Hess came to the
defense of legislation—currently under consideration by the
California State Assembly—that would pressure colleges to police
their students' sexual activities.
California Senate Bill 967 would force state universities to strictly define consensual sex between students as an "affirmative, unambiguous, and conscious decision." It would also clarify that "lack of protest or resistance does not mean consent, nor does silence mean consent," and that the person "initiating the sexual activity" is responsible for obtaining consent.
This legislatively-enforced definition of consent is much needed, wrote Hess:
This standard improves on the old "no means no" model in a number of ways. A partner who is asleep or passed out can’t say "no." Neither can a partner who’s frozen in shock or fear when an encounter escalates into an assault. Victims who are threatened with sexual assault aren’t always equipped to respond in rape prevention talking points. Just like with any other violent physical assault, many victims respond by shutting down, going silent, or laying motionless, hoping not to anger their attackers further, or disassociating from the attacks as an attempt at self-preservation. Also, consenting to sex one time doesn’t mean consenting to sex any other time. And consenting to one act (like vaginal intercourse) doesn’t imply consent for all other acts (like anal sex). Having sex with a person who is lying limply on a bed is not consensual, unless that person happens to be really, really into that—but that’s a situation that requires a conversation, not an assumption.
So are affirmative consent laws a good idea? If they are broad enough to include nonverbal cues, I think so. If we can admit that enthusiastic consent is often communicated in body language or knowing looks, then we must also accept that the lack of consent doesn’t always manifest itself in a shouted "no" or "stop," either. It shouldn’t be the sole responsibility of the uninterested party to speak up during a sexual encounter. If you think it’s easy for a person to just say no, then why would it be so hard for his or her partner to just ask?
My only substantial quibble with this definition is the "person initiating" clause. Is it always so clear that one person is initiating sex with another? Isn't the decision to have sex sometimes mutually arrived at by both parties?
Setting that aside, maybe it's a good idea for the California legislature to broaden the parameters of sexual assault. Maybe "only yes means yes" is a better standard than "no means no," and it is desirable for cultural attitudes about consensual sex to shift in that direction.
But why on earth should that involve universities? Rape is a crime, not an academic offense. I'm open to the argument that the criminal justice system should navigate sexual assault cases differently, but I don't accept that there should be some extralegal method of punishing accused rapists where the burden of proof is lower, due process is nonexistent, and he said/she said is often an automatic loss for the accused. The punishments are not as severe as they are under the normal justice system, sure, but expulsion is still a harsh sentence (synonymous with the loss of thousands of dollars toward a now unobtainable degree), given the conviction process is handled by people totally unequipped to fairly judge such matters.
The legislature is telling state universities to more aggressively involve themselves in their students' sex lives. Given administrators' track records, there is little reason to think that either victims or the accused will be well served by such a mandate.
Ultimately, I'm counting on the free market to work its magic and provide a sensible and convenient method of demonstrating mutual consent. Several writers have suggested an iPhone app that allows users to clearly consent to sex—maybe they would have to input a password and then touch phones, or something?—would do the trick.
Consensual sex? There's an app for that. One day soon. Hopefully.
When being a control-freak doesn't help....
Participants performed a two-alternative forced-choice task on a foveally presented stimulus that could vary on a subset of binary perceptual features, such as color (red, green), shape (diamond, square), size (large, small), topology (open, closed), and location (up, down). Unbeknownst to the participants, we manipulated the statistical informativeness of an additional feature that was not part of the task, such that this feature always predicted the correct response in one condition (the predictive condition) but not in the other condition (the baseline condition). Because the cognitive system is known to exploit statistical stimulus-response contingencies automatically, performance was expected to be better in the predictive than in the baseline condition.
We embedded these predictive and baseline conditions into two different tasks, which we thought would induce different cognitive-control states. The control task included instructions intended to emphasize the need for top-down control: Participants were instructed to classify the stimulus according to a feature-conjunction rule (e.g., size and topology: left response key for large and open or small and closed shapes, right response key for small and open or large and closed shapes). The automatic task included instructions intended to deemphasize the need for control: Participants were instructed to classify the stimulus according to a single feature (e.g., shape: left response key for a diamond and right response key for a square). In the automatic task, the features were mapped consistently on responses and thus allowed automatic visuomotor translation. In contrast, the stimulus-response mapping in the control task required the attention-demanding integration of two features before the response could be determined.
As expected, the predictive feature improved performance when participants performed the task automatically. Counterintuitively, however, the predictive feature impaired performance when subjects were performing the exact same task in a top-down, controlled manner.
Their abstract:
In order to engage in goal-directed behavior, cognitive agents have to control the processing of task-relevant features in their environments. Although cognitive control is critical for performance in unpredictable task environments, it is currently unknown how it affects performance in highly structured and predictable environments. In the present study, we showed that, counterintuitively, top-down control can impair and interfere with the otherwise automatic integration of statistical information in a predictable task environment, and it can render behavior less efficient than it would have been without the attempt to control the flow of information. In other words, less can sometimes be more (in terms of cognitive control), especially if the environment provides sufficient information for the cognitive system to behave on autopilot based on automatic processes alone.
Takens' Embedding and Riemannian preconditioning
...In Compressed Sensing (CS) [2, 7], for example, a random projection of a high-dimensional but sparse signal vector onto a lower-dimensional space has been shown, with high probability, to contain enough information to enable signal reconstruction with small or zero error. Random projections also play a fundamental role in the study of point clouds in high-dimensional spaces. The Johnson-Lindenstrauss (JL) lemma [12], for example, shows that with high probability the geometry of a point cloud is not disturbed by certain Lipschitz mappings onto a space of dimension logarithmic in the number of points. The statement and proofs of the JL-lemma have been simplified considerably by using random linear projections and concentration inequalities [1]....
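The JL statement in this excerpt is easy to try numerically. The sketch below is an illustration under stated assumptions, not the construction from the cited papers: it uses a dense Gaussian matrix scaled by 1/sqrt(target dimension), and the dimensions (300 down to 150), the number of points (8) and the seed are arbitrary choices. All names are made up for the example.

```python
import math
import random


def random_projection(points, target_dim, rng):
    """Project points with a Gaussian matrix scaled so that lengths are preserved on average."""
    source_dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [
        [rng.gauss(0.0, 1.0) * scale for _ in range(source_dim)]
        for _ in range(target_dim)
    ]
    return [[sum(m * x for m, x in zip(row, p)) for row in matrix] for p in points]


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def distortion_range(points, projected):
    """Smallest and largest ratio of projected to original pairwise distance."""
    ratios = [
        distance(projected[i], projected[j]) / distance(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]
    return min(ratios), max(ratios)


rng = random.Random(0)
points = [[rng.gauss(0.0, 1.0) for _ in range(300)] for _ in range(8)]
projected = random_projection(points, 150, rng)
low, high = distortion_range(points, projected)
print(f"pairwise distance ratios lie in [{low:.3f}, {high:.3f}]")
```

The ratios cluster around 1, and the spread shrinks as the target dimension grows.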
"...The underlying problem is that while Takens’ theorem guarantees the preservation of the attractor’s topology, it does not guarantee that the geometry of the attractor is also preserved. To be precise, Takens’ result guarantees that two points on the attractor do not map to the same point in the reconstruction space, but there are no guarantees that close points on the attractor remain close under this mapping (or far points remain far). Consequently, relatively small imperfections could have arbitrarily large, unwanted effects when the delay coordinate map is used in applications..."
Takens' Embedding Theorem asserts that when the states of a hidden dynamical system are confined to a low-dimensional attractor, complete information about the states can be preserved in the observed time-series output through the delay coordinate map. However, the conditions for the theorem to hold ignore the effects of noise, and time-series analysis in practice requires a careful empirical determination of the sampling time and number of delays, resulting in a number of delay coordinates larger than the minimum prescribed by Takens' theorem. In this paper, we use tools and ideas in Compressed Sensing to provide a first theoretical justification for the choice of the number of delays in noisy conditions. In particular, we show that under certain conditions on the dynamical system, measurement function, number of delays and sampling time, the delay-coordinate map can be a stable embedding of the dynamical system's attractor.
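For readers who have not met the delay coordinate map, here is a minimal sketch of it. The function name `delay_embed`, the parameter names and the forward-in-time convention are assumptions made for the illustration; the paper's own conventions may differ.

```python
import math


def delay_embed(series, num_delays, lag):
    """Map a scalar time series to vectors (x[t], x[t + lag], ..., x[t + (num_delays - 1) * lag])."""
    span = (num_delays - 1) * lag
    return [
        tuple(series[t + k * lag] for k in range(num_delays))
        for t in range(len(series) - span)
    ]


# A toy observation: samples of a sine wave, embedded with 3 delay coordinates.
observed = [math.sin(0.1 * t) for t in range(200)]
vectors = delay_embed(observed, num_delays=3, lag=5)
print(len(vectors), vectors[0])
```

Choosing `num_delays` and `lag` is exactly the empirical step the abstract refers to.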
Riemannian preconditioning by Bamdev Mishra, Rodolphe Sepulchre
The paper exploits a basic connection between sequential quadratic programming and Riemannian gradient optimization to address the general question of selecting a metric in Riemannian optimization. The proposed method is shown to be particularly insightful and efficient in quadratic optimization with orthogonality and/or rank constraints, which covers most current applications of Riemannian optimization in matrix manifolds.
Categorification, step 1
Today at the St Petersburg meeting, Igor Frenkel talked about categorification. He explained that there are five levels (maybe more!) and one has to take certain steps between them; he illustrated with an example, where level 0 was Jacobi’s Triple Product Identity and level 5 was four-dimensional quantum Yang–Mills.
He did say,
Forget about categorification if you don’t want to hear this word,
and it is certainly not an attractive word.
But I, like a stubborn mule, found myself unable to cross the Pons Asinorum from Level 0 to Level 1, since the landscape along the way was too attractive. I have some questions about this, which I will describe briefly here. I may try to write this out at greater length sometime.
Numbers and vector spaces
Step 1, in essence, consists in replacing numbers (non-negative integers) with vector spaces over a field, which is usually taken to be the field of complex numbers (but undoubtedly different things will happen over other fields). The number n is replaced by an n-dimensional vector space over C.
Addition and multiplication of numbers are replaced by direct sum and tensor product of vector spaces; the dimensions do indeed behave correctly.
An equation like a−b+c = 0 is replaced by a short exact sequence
{0} → A → B → C → {0};
this means that the image of each map is equal to the kernel of the next. Again the dimensions behave correctly. But we see already that the process is not purely mechanical, since the reverse of the short exact sequence (which could be realised by maps of the dual spaces) may not be the thing that naturally occurs.
For example, Euler’s polyhedral formula (written in the form 1−V+E−F+1 = 0) can be realised in this way: first orient the edges and faces of the polyhedron; now replace V,E,F by vector spaces of functions from vertices, edges and faces to C; and then use “coboundary maps” to make the sequence. (For example, from V to E, we replace a function f on vertices by a function df on edges, where df(v,w) = f(w)−f(v).) Now the proof of Euler’s formula becomes the proof of exactness of this sequence.
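The Euler example can be checked by machine for a single polyhedron. The sketch below is an illustration for the tetrahedron only, with rational entries standing in for complex ones; the matrices, the sign convention for the faces and all names are choices made here. Exactness is tested by rank counting: consecutive maps compose to zero, and the rank of each map equals the nullity of the next.

```python
from fractions import Fraction
from itertools import combinations


def rank(matrix):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


vertices = [0, 1, 2, 3]
edges = list(combinations(vertices, 2))  # edge (u, w) is oriented from u to w
faces = list(combinations(vertices, 3))

iota = [[1] for _ in vertices]  # constants -> functions on vertices
d0 = [[(v == w) - (v == u) for v in vertices] for (u, w) in edges]  # df(u,w) = f(w) - f(u)
d1 = [
    [{(j, k): 1, (i, k): -1, (i, j): 1}.get(e, 0) for e in edges]
    for (i, j, k) in faces
]
# Sign of the face omitting vertex m is (-1)^m, as in the boundary of a 3-simplex.
eps = [[(-1) ** (6 - sum(f)) for f in faces]]

maps = [iota, d0, d1, eps]
dims = [1, len(vertices), len(edges), len(faces), 1]


def is_exact(maps, dims):
    ranks = [rank(m) for m in maps]
    for first, second in zip(maps, maps[1:]):
        if any(x != 0 for row in matmul(second, first) for x in row):
            return False  # consecutive maps must compose to zero
    if ranks[0] != dims[0] or ranks[-1] != dims[-1]:
        return False  # first map injective, last map surjective
    return all(dims[i + 1] - ranks[i + 1] == ranks[i] for i in range(len(maps) - 1))


print(is_exact(maps, dims), sum((-1) ** i * d for i, d in enumerate(dims)))
```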
Formal power series
A formal power series ∑ a_n x^n has to be treated a bit differently. We cannot substitute a natural number for x, since all but the tamest such series have radius of convergence smaller than 1. But, as in combinatorial enumeration, we regard the powers of x as markers.
Thus, the interpretation of the series is a graded vector space ⊕ A_n, where A_n is a vector space of dimension a_n.
Now direct sum or tensor product of graded vector spaces (with the usual conventions about grading, that is, the product of elements of degrees k and l has degree k+l) corresponds to the sum or product of the formal power series.
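On the level of dimensions this is just bookkeeping, as the following sketch shows. A graded vector space is represented only by its list of dimensions (a_0, a_1, ...), truncated to finitely many degrees; the function names are made up for the illustration.

```python
def direct_sum_dims(a, b):
    """Dimensions of the direct sum: add degree by degree."""
    length = max(len(a), len(b))
    a = a + [0] * (length - len(a))
    b = b + [0] * (length - len(b))
    return [x + y for x, y in zip(a, b)]


def tensor_dims(a, b):
    """Dimensions of the tensor product: degrees add, so the dimensions convolve."""
    out = [0] * (len(a) + len(b) - 1)
    for k, x in enumerate(a):
        for l, y in enumerate(b):
            out[k + l] += x * y
    return out


# (1 + x) tensor (1 + x) has dimensions 1, 2, 1, matching (1 + x)^2 = 1 + 2x + x^2.
print(tensor_dims([1, 1], [1, 1]))
print(direct_sum_dims([1, 0, 2], [0, 3]))
```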
Problem What if the graded vector space is actually a graded algebra?
In particular, the formal power series 1 and x correspond to 1-dimensional vector spaces with degree 0 and 1 respectively.
Invariants
Let G be a finite permutation group on the set {1,…,n}. Then G has a natural action on a vector space V of dimension n, by permuting the vectors of a basis. An invariant of G is simply a vector fixed by G; such a vector must have coordinates which are constant on each G-orbit, and so the space of invariants of G (written V^G) has dimension equal to the number of orbits of G. So we have taken the first step towards categorifying orbit-counting.
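Both sides of this statement can be computed independently for a small group. In the sketch below the orbits are counted directly with a union-find structure, and the dimension of the invariant space is computed as the average number of fixed points over the group, which is the trace of the averaging projection onto the invariants. Points are numbered from 0, and the generators and function names are choices made for the illustration.

```python
def generate_group(generators):
    """Close a set of permutations (tuples on 0..n-1) under composition."""
    n = len(generators[0])
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = tuple(s[g[i]] for i in range(n))  # apply g, then s
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def count_orbits(generators):
    """Number of orbits on points, found by merging i with its images."""
    n = len(generators[0])
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for s in generators:
        for i in range(n):
            parent[find(i)] = find(s[i])
    return len({find(i) for i in range(n)})


def invariant_dimension(generators):
    """Average number of fixed points, i.e. the dimension of the fixed subspace."""
    group = generate_group(generators)
    fixed = sum(sum(1 for i, gi in enumerate(g) if i == gi) for g in group)
    return fixed // len(group)


# A swap of points 0, 1 together with a 3-cycle on points 2, 3, 4.
gens = [(1, 0, 2, 3, 4), (0, 1, 3, 4, 2)]
print(len(generate_group(gens)), count_orbits(gens), invariant_dimension(gens))
```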
More generally, let W be a graded vector space. Then there is a natural action of G on the direct sum V = W^n of n copies of W. How do we “count” invariants here?
The answer lies in Pólya theory. Associated with G is a multivariate polynomial called the cycle index of G. This is the polynomial in indeterminates s_1,…,s_n constructed as follows: for each element g of the group, having c_i cycles of length i for each i, we form a monomial by raising s_i to the power c_i and multiplying all these together; then we sum these monomials over all group elements, and divide by the order of the group. This is denoted by Z(G).
Suppose that we have a collection of “figures”, each with a non-negative integer “weight”, so that there are only finitely many (say a_i) figures of weight i for each i. The figures can be represented by a figure-counting series A(x) = ∑ a_i x^i. Now a “function” is a map from the set {1,…,n} to the set of figures (essentially it puts a figure at every point of this set). G permutes the functions by moving their arguments. We are interested in counting the orbits of G on functions of given total weight (the sum of the weights of their values). If b_i is the number of orbits of weight i, then the function-counting series is B(x) = ∑ b_i x^i.
Now Pólya’s Theorem asserts that B(x) is obtained from Z(G) by substituting A(x^i) for s_i, for each i.
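Here is a small worked instance of the substitution, as a sketch only. The group is the cyclic group of order 4 acting on 4 points (numbered from 0), and there are two figures, one of weight 0 and one of weight 1, so A(x) = 1 + x. Polynomials are stored as coefficient lists, and all names are made up for the illustration.

```python
from collections import Counter
from fractions import Fraction


def cyclic_group(n):
    """The n rotations of the points 0..n-1."""
    return [tuple((i + k) % n for i in range(n)) for k in range(n)]


def cycle_type(perm):
    """Counter mapping each cycle length i to the number c_i of cycles of that length."""
    seen, counts = set(), Counter()
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        counts[length] += 1
    return counts


def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def substitute_power(p, i):
    """Coefficients of A(x^i), given the coefficients of A(x)."""
    out = [0] * ((len(p) - 1) * i + 1)
    for k, a in enumerate(p):
        out[k * i] = a
    return out


def function_counting_series(group, figures):
    """B(x): average over the group of the product of A(x^i)^(c_i)."""
    n = len(group[0])
    total = [0] * ((len(figures) - 1) * n + 1)
    for g in group:
        term = [1]
        for length, count in cycle_type(g).items():
            factor = substitute_power(figures, length)
            for _ in range(count):
                term = poly_mul(term, factor)
        for k, a in enumerate(term):
            total[k] += a
    return [Fraction(c, len(group)) for c in total]


series = function_counting_series(cyclic_group(4), [1, 1])
print([int(c) for c in series])
```

The coefficients sum to 6, the number of two-coloured necklaces with 4 beads up to rotation.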
Now (I think), if we replace the figure-counting series by a graded vector space W, and let G act on the direct sum V of n copies of W, then the function-counting series corresponds to the space of invariants of G in V.
Of course, permutation actions are a special case of linear actions.
Problem. What happens for linear actions? Do we replace Pólya’s theorem by Molien’s?
Oligomorphic groups
A permutation group G on an infinite set is called oligomorphic if it has only finitely many (say a_n) orbits on the set of all n-element subsets of the permutation domain, for each n.
The definition of cycle index breaks down completely for oligomorphic permutation groups. However, one can define a modified cycle index for an oligomorphic group: take orbit representatives for the orbits of G on the finite subsets of its domain; for each orbit representative, take the cycle index of the finite group induced on this set by its setwise stabiliser in G; then sum these. It is easy to see that we obtain a formal power series in the infinitely many indeterminates s_1, s_2, ….
A finite permutation group is a special case of an oligomorphic group; according to the Shift Theorem, its modified cycle index can be obtained from its ordinary cycle index by replacing s_i by s_i + 1, for each i.
Many substitution results hold. For example, the power series ∑ a_n x^n is obtained from the modified cycle index by substituting x^i for s_i, for each i. There are also rules for direct and wreath products.
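For a finite group the last two statements combine into something that can be tested: replacing s_i by s_i + 1 and then s_i by x^i amounts to substituting 1 + x^i into the ordinary cycle index, and the result should be the series counting orbits on n-element subsets. The sketch below checks this for the cyclic group of order 5 (points numbered from 0); the choice of group and all names are made for the illustration.

```python
from fractions import Fraction
from itertools import combinations


def cyclic_group(n):
    """The n rotations of the points 0..n-1."""
    return [tuple((i + k) % n for i in range(n)) for k in range(n)]


def orbits_on_subsets(group, k):
    """Count orbits of the group on k-element subsets by brute force."""
    n = len(group[0])
    seen, orbits = set(), 0
    for subset in combinations(range(n), k):
        s = frozenset(subset)
        if s in seen:
            continue
        orbits += 1
        seen.update(frozenset(g[i] for i in s) for g in group)
    return orbits


def cycle_lengths(perm):
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return lengths


def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def orbit_series(group):
    """Ordinary cycle index with 1 + x^i substituted for s_i, as a coefficient list."""
    n = len(group[0])
    total = [0] * (n + 1)
    for g in group:
        term = [1]
        for length in cycle_lengths(g):
            term = poly_mul(term, [1] + [0] * (length - 1) + [1])  # 1 + x^length
        for k, c in enumerate(term):
            total[k] += c
    return [Fraction(c, len(group)) for c in total]


group = cyclic_group(5)
print([int(c) for c in orbit_series(group)])
print([orbits_on_subsets(group, k) for k in range(6)])
```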
There is at least the possibility of turning all this into graded vector spaces. Let V_i be the vector space of all functions from the set of G-orbits on i-sets to C, and V the graded vector space having these spaces as its homogeneous components. Then the vector space V “represents” the orbit-counting power series.
The vector space V has a natural algebra structure, which I won’t describe here. In fact, though results about the rate of growth of the numbers a_n tend to be proved by combinatorial methods (chiefly by Dugald Macpherson), I have had a small amount of success in proving smoothness results using algebraic methods, based on the fact that the multiplication in the algebra gives a map from V_m ⊗ V_n to V_{m+n}.
Problem What are the vector space operations corresponding to direct and wreath product of oligomorphic groups?
Problem. Is there a linear group analogue of oligomorphic permutation groups, for which a similar theory can be developed?
I think that is enough problems to be going on with!
Alexander Shulgin, RIP
Alexander Shulgin, who died this week at the age of 88, was a remarkable man who combined an intense curiosity about altered states of consciousness with amazing chemical creativity and scientific rigor. Over the years Shulgin synthesized hundreds of psychoactive compounds that he carefully tested on himself, his wife, Ann, and a small circle of friends, a process he described in his 1991 book PIHKAL: A Chemical Love Story, a 978-page tome that includes notes on the production and effects of 179 such chemicals along with a personal and professional memoir. (The title stands for "Phenethylamines I Have Known and Loved"; the sequel was called TIHKAL, for "Tryptamines I Have Known and Loved.") Perhaps best known as a popularizer (though not the creator) of MDMA, which he said "enabled me to see out, and to see my own insides, without reservations," Shulgin embodied an open-minded yet responsible approach to drugs that should be a model for psychonauts as well as the politicians who vainly try to control them.
"Every drug, legal or illegal, provides some reward," Shulgin wrote in the introduction to PIHKAL. "Every drug presents some risk. And every drug can be abused. Ultimately, in my opinion, it is up to each of us to measure the reward against the risk and decide which outweighs the other." His great passion was for psychedelics, the "mind-manifesting" drugs with effects similar to those of mescaline, psilocybin, and LSD, which he saw as "treasures" that "can provide access to the parts of us that have answers," facilitating "exploration of this interior world" and "insights into its nature." It amazed him that legislators and regulators would presume to intrude into this deeply personal realm, especially in a society that claims to respect privacy, freedom of inquiry, and freedom of conscience.
"Our generation is the first, ever, to have made the search for self-awareness a crime, if it is done with the use of plants or chemical compounds as the means of opening the psychic doors," Shulgin wrote. "How is it...that the leaders of our society have seen fit to try to eliminate this one very important means of learning and self-discovery, this means which has been used, respected, and honored for thousands of years, in every human culture of which we have a record?"
That remains a bit of a puzzle, even to people who have studied the series of moral panics that comprise the history of American drug policy. But by highlighting the profound, life-enhancing potential of forbidden intoxicants without denying their hazards, Shulgin boldly pointed the way to a more tolerant alternative.
Anomalous Transfer of Syntax between Languages
Each human language possesses a set of distinctive syntactic rules. Here, we show that balanced Welsh-English bilinguals reading in English unconsciously apply a morphosyntactic rule that only exists in Welsh. The Welsh soft mutation rule determines whether the initial consonant of a noun changes based on the grammatical context (e.g., the feminine noun cath—"cat" mutates into gath in the phrase y gath—"the cat"). Using event-related brain potentials, we establish that English nouns artificially mutated according to the Welsh mutation rule (e.g., "goncert" instead of "concert") require significantly less processing effort than the same nouns implicitly violating Welsh syntax. Crucially, this effect is found whether or not the mutation affects the same initial consonant in English and Welsh, showing that Welsh syntax is applied to English regardless of phonological overlap between the two languages. Overall, these results demonstrate for the first time that abstract syntactic rules transfer anomalously from one language to the other, even when such rules exist only in one language.
Post-linear Schwarzschild solution in harmonic coordinates: Elimination of structure-dependent terms
Nosimpler: Shared for "post-linear", which is a word we should adapt for our own purposes.
Author(s): Sergei A. Klioner and Michael Soffel
This paper deals with a special kind of problems that appear in solutions of Einstein’s field equations for extended bodies: many structure-dependent terms appear in intermediate calculations that cancel exactly in virtue of the local equations of motion or can be eliminated by appropriate gauge tra...
[Phys. Rev. D 89, 104056] Published Tue May 27, 2014
Is there signal in the noise?
Nosimpler: Take that, noise-worshippers.
Nature Neuroscience 17, 750 (2014). doi:10.1038/nn.3722
Authors: Alexander S Ecker & Andreas S Tolias
A study now shows that variability in neuronal responses in the visual system mainly arises from slow fluctuations in excitability, presumably caused by factors of nonsensory origin, such as arousal, attention or anesthesia.
Universal features in the energetics of symmetry breaking
Nature Physics 10, 457 (2014). doi:10.1038/nphys2940
Authors: É. Roldán, I. A. Martínez, J. M. R. Parrondo & D. Petrov
Nurses complain about algorithms
Nosimpler: Mo automation, mo problems.
In response [to the rise of diagnostic algorithms], NNU [National Nurses United] has launched a major campaign featuring radio ads from coast to coast, video, social media, legislation, rallies, and a call to the public to act, with a simple theme – “when it matters most, insist on a registered nurse.” The ads were created by North Woods Advertising and produced by Fortaleza Films/Los Angeles. Additional background can be found at http://www.insistonanrn.org.
Here is the link. Here is an MP3 of the ad. Remarkable, do give it a listen. It has numerous excellent lines such as “Algorithms are simple mathematical formulas that nobody understands.”
For the pointer I thank Eric Jonas.
Why I Am Not An Integrated Information Theorist (or, The Unconscious Expander)
Happy birthday to me!
Recently, lots of people have been asking me what I think about IIT—no, not the Indian Institutes of Technology, but Integrated Information Theory, a widely-discussed “mathematical theory of consciousness” developed over the past decade by the neuroscientist Giulio Tononi. One of the askers was Max Tegmark, who’s enthusiastically adopted IIT as a plank in his radical mathematizing platform (see his paper “Consciousness as a State of Matter”). When, in the comment thread about Max’s Mathematical Universe Hypothesis, I expressed doubts about IIT, Max challenged me to back up my doubts with a quantitative calculation.
So, this is the post that I promised to Max and all the others, about why I don’t believe IIT. And yes, it will contain that quantitative calculation.
But first, what is IIT? The central ideas of IIT, as I understand them, are:
(1) to propose a quantitative measure, called Φ, of the amount of “integrated information” in a physical system (i.e. information that can’t be localized in the system’s individual parts), and then
(2) to hypothesize that a physical system is “conscious” if and only if it has a large value of Φ—and indeed, that a system is more conscious the larger its Φ value.
I’ll return later to the precise definition of Φ—but basically, it’s obtained by minimizing, over all subdivisions of your physical system into two parts A and B, some measure of the mutual information between A’s outputs and B’s inputs and vice versa. Now, one immediate consequence of any definition like this is that all sorts of simple physical systems (a thermostat, a photodiode, etc.) will turn out to have small but nonzero Φ values. To his credit, Tononi cheerfully accepts the panpsychist implication: yes, he says, it really does mean that thermostats and photodiodes have small but nonzero levels of consciousness. On the other hand, for the theory to work, it had better be the case that Φ is small for “intuitively unconscious” systems, and only large for “intuitively conscious” systems. As I’ll explain later, this strikes me as a crucial point on which IIT fails.
The literature on IIT is too big to do it justice in a blog post. Strikingly, in addition to the “primary” literature, there’s now even a “secondary” literature, which treats IIT as a sort of established base on which to build further speculations about consciousness. Besides the Tegmark paper linked to above, see for example this paper by Maguire et al., and associated popular article. (Ironically, Maguire et al. use IIT to argue for the Penrose-like view that consciousness might have uncomputable aspects—a use diametrically opposed to Tegmark’s.)
Anyway, if you want to read a popular article about IIT, there are loads of them: see here for the New York Times’s, here for Scientific American‘s, here for IEEE Spectrum‘s, and here for the New Yorker‘s. Unfortunately, none of those articles will tell you the meat (i.e., the definition of integrated information); for that you need technical papers, like this or this by Tononi, or this by Seth et al. IIT is also described in Christof Koch’s memoir Consciousness: Confessions of a Romantic Reductionist, which I read and enjoyed; as well as Tononi’s Phi: A Voyage from the Brain to the Soul, which I haven’t yet read. (Koch, one of the world’s best-known thinkers and writers about consciousness, has also become an evangelist for IIT.)
So, I want to explain why I don’t think IIT solves even the problem that it “plausibly could have” solved. But before I can do that, I need to do some philosophical ground-clearing. Broadly speaking, what is it that a “mathematical theory of consciousness” is supposed to do? What questions should it answer, and how should we judge whether it’s succeeded?
The most obvious thing a consciousness theory could do is to explain why consciousness exists: that is, to solve what David Chalmers calls the “Hard Problem,” by telling us how a clump of neurons is able to give rise to the taste of strawberries, the redness of red … you know, all that ineffable first-persony stuff. Alas, there’s a strong argument—one that I, personally, find completely convincing—why that’s too much to ask of any scientific theory. Namely, no matter what the third-person facts were, one could always imagine a universe consistent with those facts in which no one “really” experienced anything. So for example, if someone claims that integrated information “explains” why consciousness exists—nope, sorry! I’ve just conjured into my imagination beings whose Φ-values are a thousand, nay a trillion times larger than humans’, yet who are also philosophical zombies: entities that there’s nothing that it’s like to be. Granted, maybe such zombies can’t exist in the actual world: maybe, if you tried to create one, God would notice its large Φ-value and generously bequeath it a soul. But if so, then that’s a further fact about our world, a fact that manifestly couldn’t be deduced from the properties of Φ alone. Notice that the details of Φ are completely irrelevant to the argument.
Faced with this point, many scientifically-minded people start yelling and throwing things. They say that “zombies” and so forth are empty metaphysics, and that our only hope of learning about consciousness is to engage with actual facts about the brain. And that’s a perfectly reasonable position! As far as I’m concerned, you absolutely have the option of dismissing Chalmers’ Hard Problem as a navel-gazing distraction from the real work of neuroscience. The one thing you can’t do is have it both ways: that is, you can’t say both that the Hard Problem is meaningless, and that progress in neuroscience will soon solve the problem if it hasn’t already. You can’t maintain simultaneously that
(a) once you account for someone’s observed behavior and the details of their brain organization, there’s nothing further about consciousness to be explained, and
(b) remarkably, the XYZ theory of consciousness can explain the “nothing further” (e.g., by reducing it to integrated information processing), or might be on the verge of doing so.
As obvious as this sounds, it seems to me that large swaths of consciousness-theorizing can just be summarily rejected for trying to have their brain and eat it in precisely the above way.
Fortunately, I think IIT survives the above observations. For we can easily interpret IIT as trying to do something more “modest” than solve the Hard Problem, although still staggeringly audacious. Namely, we can say that IIT “merely” aims to tell us which physical systems are associated with consciousness and which aren’t, purely in terms of the systems’ physical organization. The test of such a theory is whether it can produce results agreeing with “commonsense intuition”: for example, whether it can affirm, from first principles, that (most) humans are conscious; that dogs and horses are also conscious but less so; that rocks, livers, bacteria colonies, and existing digital computers are not conscious (or are hardly conscious); and that a room full of people has no “mega-consciousness” over and above the consciousnesses of the individuals.
The reason it’s so important that the theory uphold “common sense” on these test cases is that, given the experimental inaccessibility of consciousness, this is basically the only test available to us. If the theory gets the test cases “wrong” (i.e., gives results diverging from common sense), it’s not clear that there’s anything else for the theory to get “right.” Of course, supposing we had a theory that got the test cases right, we could then have a field day with the less-obvious cases, programming our computers to tell us exactly how much consciousness is present in octopi, fetuses, brain-damaged patients, and hypothetical AI bots.
In my opinion, how to construct a theory that tells us which physical systems are conscious and which aren’t—giving answers that agree with “common sense” whenever the latter renders a verdict—is one of the deepest, most fascinating problems in all of science. Since I don’t know a standard name for the problem, I hereby call it the Pretty-Hard Problem of Consciousness. Unlike with the Hard Hard Problem, I don’t know of any philosophical reason why the Pretty-Hard Problem should be inherently unsolvable; but on the other hand, humans seem nowhere close to solving it (if we had solved it, then we could reduce the abortion, animal rights, and strong AI debates to “gentlemen, let us calculate!”).
Now, I regard IIT as a serious, honorable attempt to grapple with the Pretty-Hard Problem of Consciousness: something concrete enough to move the discussion forward. But I also regard IIT as a failed attempt on the problem. And I wish people would recognize its failure, learn from it, and move on.
In my view, IIT fails to solve the Pretty-Hard Problem because it unavoidably predicts vast amounts of consciousness in physical systems that no sane person would regard as particularly “conscious” at all: indeed, systems that do nothing but apply a low-density parity-check code, or other simple transformations of their input data. Moreover, IIT predicts not merely that these systems are “slightly” conscious (which would be fine), but that they can be unboundedly more conscious than humans are.
To justify that claim, I first need to define Φ. Strikingly, despite the large literature about Φ, I had a hard time finding a clear mathematical definition of it, one that not only listed formulas but fully defined the structures that the formulas were talking about. Complicating matters further, there are several competing definitions of Φ in the literature, including Φ_DM (discrete memoryless), Φ_E (empirical), and Φ_AR (autoregressive), which apply in different contexts (e.g., some take time evolution into account and others don’t). Nevertheless, I think I can define Φ in a way that will make sense to theoretical computer scientists. And crucially, the broad point I want to make about Φ won’t depend much on the details of its formalization anyway.
We consider a discrete system in a state x=(x_1,…,x_n)∈S^n, where S is a finite alphabet (the simplest case is S={0,1}). We imagine that the system evolves via an “updating function” f:S^n→S^n. Then the question that interests us is whether the x_i’s can be partitioned into two sets A and B, of roughly comparable size, such that the updates to the variables in A don’t depend very much on the variables in B and vice versa. If such a partition exists, then we say that the computation of f does not involve “global integration of information,” which on Tononi’s theory is a defining aspect of consciousness.
More formally, given a partition (A,B) of {1,…,n}, let us write an input y=(y_1,…,y_n)∈S^n to f in the form (y_A,y_B), where y_A consists of the y variables in A and y_B consists of the y variables in B. Then we can think of f as mapping an input pair (y_A,y_B) to an output pair (z_A,z_B). Now, we define the “effective information” EI(A→B) as H(z_B | A random, y_B=x_B). Or in words, EI(A→B) is the Shannon entropy of the output variables in B, if the input variables in A are drawn uniformly at random, while the input variables in B are fixed to their values in x. It’s a measure of the dependence of B on A in the computation of f(x). Similarly, we define
EI(B→A) := H(z_A | B random, y_A=x_A).
We then consider the sum
Φ(A,B) := EI(A→B) + EI(B→A).
Intuitively, we’d like the integrated information Φ=Φ(f,x) to be the minimum of Φ(A,B), over all 2^n-2 possible partitions of {1,…,n} into nonempty sets A and B. The idea is that Φ should be large, if and only if it’s not possible to partition the variables into two sets A and B, in such a way that not much information flows from A to B or vice versa when f(x) is computed.
However, no sooner do we propose this than we notice a technical problem. What if A is much larger than B, or vice versa? As an extreme case, what if A={1,…,n-1} and B={n}? In that case, we’ll have Φ(A,B)≤2 log_2 |S|, but only for the boring reason that there’s hardly any entropy in B as a whole, to either influence A or be influenced by it. For this reason, Tononi proposes a fix where we normalize each Φ(A,B) by dividing it by min{|A|,|B|}. He then defines the integrated information Φ to be Φ(A,B), for whichever partition (A,B) minimizes the ratio Φ(A,B) / min{|A|,|B|}. (Unless I missed it, Tononi never specifies what we should do if there are multiple (A,B)’s that all achieve the same minimum of Φ(A,B) / min{|A|,|B|}. I’ll return to that point later, along with other idiosyncrasies of the normalization procedure.)
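To make the definition above concrete, here is a minimal brute-force sketch in Python. It assumes a tiny system (the loop over bipartitions and inputs is exponential in n), indexes variables from 0, and resolves ties among minimizing bipartitions by keeping the first one found, which is an arbitrary choice of this sketch and not something the definition specifies. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, product
from math import log2


def effective_information(f, x, src, dst, alphabet=(0, 1)):
    """EI(src -> dst): entropy of the outputs in dst when the inputs in src
    are uniformly random and all other inputs are fixed to their values in x."""
    counts = Counter()
    for values in product(alphabet, repeat=len(src)):
        y = list(x)
        for i, v in zip(src, values):
            y[i] = v
        z = f(tuple(y))
        counts[tuple(z[j] for j in dst)] += 1
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def phi(f, x, alphabet=(0, 1)):
    """Return (normalized value, Phi(A,B), A, B) for a bipartition minimizing
    Phi(A,B) / min(|A|, |B|). Ties are broken by enumeration order."""
    n = len(x)
    best = None
    for size in range(1, n):
        for a in combinations(range(n), size):
            b = tuple(i for i in range(n) if i not in a)
            value = (effective_information(f, x, a, b, alphabet)
                     + effective_information(f, x, b, a, alphabet))
            ratio = value / min(len(a), len(b))
            if best is None or ratio < best[0]:
                best = (ratio, value, a, b)
    return best


identity = lambda y: y             # no variable depends on any other
swap = lambda y: (y[1], y[0])      # each variable copies the other

print(phi(identity, (0, 0, 0)))
print(phi(swap, (0, 0)))
```

The identity map gets Φ = 0, since no output in B depends on any input in A, while the two-bit swap gets Φ = 2 bits, one bit flowing in each direction.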
Tononi gives some simple examples of the computation of Φ, showing that it is indeed larger for systems that are more “richly interconnected” in an intuitive sense. He speculates, plausibly, that Φ is quite large for (some reasonable model of) the interconnection network of the human brain—and probably larger for the brain than for typical electronic devices (which tend to be highly modular in design, thereby decreasing their Φ), or, let’s say, than for other organs like the pancreas. Ambitiously, he even speculates at length about how a large value of Φ might be connected to the phenomenology of consciousness.
To be sure, empirical work in integrated information theory has been hampered by three difficulties. The first difficulty is that we don’t know the detailed interconnection network of the human brain. The second difficulty is that it’s not even clear what we should define that network to be: for example, as a crude first attempt, should we assign a Boolean variable to each neuron, which equals 1 if the neuron is currently firing and 0 if it’s not firing, and let f be the function that updates those variables over a timescale of, say, a millisecond? What other variables do we need—firing rates, internal states of the neurons, neurotransmitter levels? Is choosing many of these variables uniformly at random (for the purpose of calculating Φ) really a reasonable way to “randomize” the variables, and if not, what other prescription should we use?
The third and final difficulty is that, even if we knew exactly what we meant by “the f and x corresponding to the human brain,” and even if we had complete knowledge of that f and x, computing Φ(f,x) could still be computationally intractable. For recall that the definition of Φ involved minimizing a quantity over all the exponentially-many possible bipartitions of {1,…,n}. While it’s not directly relevant to my arguments in this post, I leave it as a challenge for interested readers to pin down the computational complexity of approximating Φ to some reasonable precision, assuming that f is specified by a polynomial-size Boolean circuit, or alternatively, by an NC0 function (i.e., a function each of whose outputs depends on only a constant number of the inputs). (Presumably Φ will be #P-hard to calculate exactly, but only because calculating entropy exactly is a #P-hard problem—that’s not interesting.)
I conjecture that approximating Φ is an NP-hard problem, even for restricted families of f’s like NC0 circuits—which invites the amusing thought that God, or Nature, would need to solve an NP-hard problem just to decide whether or not to imbue a given physical system with consciousness! (Alas, if you wanted to exploit this as a practical approach for solving NP-complete problems such as 3SAT, you’d need to do a rather drastic experiment on your own brain—an experiment whose result would be to render you unconscious if your 3SAT instance was satisfiable, or conscious if it was unsatisfiable! In neither case would you be able to communicate the outcome of the experiment to anyone else, nor would you have any recollection of the outcome after the experiment was finished.) In the other direction, it would also be interesting to upper-bound the complexity of approximating Φ. Because of the need to estimate the entropies of distributions (even given a bipartition (A,B)), I don’t know that this problem is in NP—the best I can observe is that it’s in AM.
In any case, my own reason for rejecting IIT has nothing to do with any of the “merely practical” issues above: neither the difficulty of defining f and x, nor the difficulty of learning them, nor the difficulty of calculating Φ(f,x). My reason is much more basic, striking directly at the hypothesized link between “integrated information” and consciousness. Specifically, I claim the following:
Yes, it might be a decent rule of thumb that, if you want to know which brain regions (for example) are associated with consciousness, you should start by looking for regions with lots of information integration. And yes, it’s even possible, for all I know, that having a large Φ-value is one necessary condition among many for a physical system to be conscious. However, having a large Φ-value is certainly not a sufficient condition for consciousness, or even for the appearance of consciousness. As a consequence, Φ can’t possibly capture the essence of what makes a physical system conscious, or even of what makes a system look conscious to external observers.
The demonstration of this claim is embarrassingly simple. Let S=F_p, where p is some prime sufficiently larger than n, and let V be an n×n Vandermonde matrix over F_p, that is, a matrix whose (i,j) entry equals i^(j-1) (mod p). Then let f:S^n→S^n be the update function defined by f(x)=Vx. Now, for p large enough, the Vandermonde matrix is well-known to have the property that every submatrix is full-rank (i.e., “every submatrix preserves all the information that it’s possible to preserve about the part of x that it acts on”). And this implies that, regardless of which bipartition (A,B) of {1,…,n} we choose, we’ll get
EI(A→B) = EI(B→A) = min{|A|,|B|} log_2 p,
and hence
Φ(A,B) = EI(A→B) + EI(B→A) = 2 min{|A|,|B|} log_2 p,
or after normalizing,
Φ(A,B) / min{|A|,|B|} = 2 log_2 p.
Or in words: the normalized information integration has the same value (namely, the maximum value!) for every possible bipartition. Now, I’d like to proceed from here to a determination of Φ itself, but I’m prevented from doing so by the ambiguity in the definition of Φ that I noted earlier. Namely, since every bipartition (A,B) minimizes the normalized value Φ(A,B) / min{|A|,|B|}, in theory I ought to be able to pick any of them for the purpose of calculating Φ. But the unnormalized value Φ(A,B), which gives the final Φ, can vary greatly, across bipartitions: from 2 log_2 p (if min{|A|,|B|}=1) all the way up to n log_2 p (if min{|A|,|B|}=n/2). So at this point, Φ is simply undefined.
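The Vandermonde calculation can be checked numerically for a small case. The sketch below is illustrative only: it assumes n = 4 and the prime p = 2^31 - 1 (my choices, not the post's), indexes rows and columns from 0, and uses the fact that for a linear map z = My over F_p the effective information EI(A→B) equals the rank of the submatrix with rows B and columns A, times log_2 p. All values are reported in units of log_2 p so that the arithmetic stays exact.

```python
from itertools import combinations

P = 2**31 - 1  # assumed prime, much larger than n


def rank_mod_p(rows, p=P):
    """Rank of a matrix over F_p, by Gauss-Jordan elimination."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] % p), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [v * inv % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col] % p:
                factor = m[r][col]
                m[r] = [(u - factor * v) % p for u, v in zip(m[r], m[rank])]
        rank += 1
    return rank


def vandermonde(n, p=P):
    """Matrix whose (i, j) entry is i^(j-1) mod p, for i, j = 1..n."""
    return [[pow(i, j - 1, p) for j in range(1, n + 1)] for i in range(1, n + 1)]


def ei_units(M, src, dst):
    """EI(src -> dst) in units of log2(p): rank of rows dst, columns src."""
    return rank_mod_p([[M[r][c] for c in src] for r in dst])


def bipartition_values(M):
    """Map each bipartition (A, B) to (Phi(A,B), Phi(A,B) / min(|A|,|B|))."""
    n = len(M)
    out = {}
    for size in range(1, n):
        for a in combinations(range(n), size):
            b = tuple(i for i in range(n) if i not in a)
            phi_ab = ei_units(M, a, b) + ei_units(M, b, a)
            out[(a, b)] = (phi_ab, phi_ab / min(len(a), len(b)))
    return out


values = bipartition_values(vandermonde(4))
print(sorted({norm for _, norm in values.values()}))  # normalized values seen
print(sorted({phi for phi, _ in values.values()}))    # unnormalized values seen
```

For this small case every bipartition has the same normalized value, while the unnormalized Φ(A,B) takes more than one value, which is exactly the ambiguity described above.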
On the other hand, I can solve this problem, and make Φ well-defined, by an ironic little hack. The hack is to replace the Vandermonde matrix V by an n×n matrix W, which consists of the first n/2 rows of the Vandermonde matrix each repeated twice (assume for simplicity that n is a multiple of 4). As before, we let f(x)=Wx. Then if we set A={1,…,n/2} and B={n/2+1,…,n}, we can achieve
EI(A→B) = EI(B→A) = (n/4) log_2 p,
Φ(A,B) = EI(A→B) + EI(B→A) = (n/2) log_2 p,
and hence
Φ(A,B) / min{|A|,|B|} = log_2 p.
In this case, I claim that the above is the unique bipartition that minimizes the normalized integrated information Φ(A,B) / min{|A|,|B|}, up to trivial reorderings of the rows. To prove this claim: if |A|=|B|=n/2, then clearly we minimize Φ(A,B) by maximizing the number of repeated rows in A and the number of repeated rows in B, exactly as we did above. Thus, assume |A|≤|B| (the case |B|≤|A| is analogous). Then, measuring entropies in units of log_2 p, clearly
EI(B→A) ≥ |A|/2,
while
EI(A→B) ≥ min{|A|, |B|/2}.
So if we let |A|=cn and |B|=(1-c)n for some c∈(0,1/2], then
Φ(A,B) ≥ [c/2 + min{c, (1-c)/2}] n,
and
Φ(A,B) / min{|A|,|B|} = Φ(A,B) / |A| ≥ 1/2 + min{1, 1/(2c) – 1/2}.
But the above expression is uniquely minimized when c=1/2. Hence the normalized integrated information is minimized essentially uniquely by setting A={1,…,n/2} and B={n/2+1,…,n}, and we get
Φ = Φ(A,B) = (n/2) log_2 p,
which is quite a large value (only a factor of 2 less than the trivial upper bound of n log_2 p).
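The effect of the repeated-rows hack can also be checked by brute force in the smallest case, n = 4. As before, this is only a sketch under assumptions of my own: the prime p = 2^31 - 1, 0-based indices, the row order r_1, r_1, r_2, r_2 for W, and values reported in units of log_2 p. For a linear map, EI(A→B) is the rank of the submatrix with rows B and columns A.

```python
from itertools import combinations

P = 2**31 - 1  # assumed prime, much larger than n


def rank_mod_p(rows, p=P):
    """Rank of a matrix over F_p, by Gauss-Jordan elimination."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] % p), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [v * inv % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col] % p:
                factor = m[r][col]
                m[r] = [(u - factor * v) % p for u, v in zip(m[r], m[rank])]
        rank += 1
    return rank


def repeated_rows_matrix(n, p=P):
    """W: the first n/2 Vandermonde rows (entry i^(j-1)), each repeated twice."""
    rows = []
    for i in range(1, n // 2 + 1):
        row = [pow(i, j - 1, p) for j in range(1, n + 1)]
        rows += [row, list(row)]
    return rows


def bipartition_values(M):
    """Map each bipartition (A, B) to (Phi(A,B), Phi(A,B) / min(|A|,|B|))."""
    n = len(M)
    out = {}
    for size in range(1, n):
        for a in combinations(range(n), size):
            b = tuple(i for i in range(n) if i not in a)
            ei_ab = rank_mod_p([[M[r][c] for c in a] for r in b])  # EI(A->B)
            ei_ba = rank_mod_p([[M[r][c] for c in b] for r in a])  # EI(B->A)
            phi_ab = ei_ab + ei_ba
            out[(a, b)] = (phi_ab, phi_ab / min(len(a), len(b)))
    return out


values = bipartition_values(repeated_rows_matrix(4))
lowest = min(norm for _, norm in values.values())
print(lowest, sorted(k for k, v in values.items() if v[1] == lowest))
```

For n = 4 the only minimizers are A={1,2}, B={3,4} (0-based: (0,1) and (2,3)) and the same bipartition with A and B swapped, giving Φ = 2 = n/2 in units of log_2 p. For larger n there are several minimizers that differ only in which repeated pairs land on which side, which is the "trivial reordering" caveat in the claim above.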
Now, why did I call the switch from V to W an “ironic little hack”? Because, in order to ensure a large value of Φ, I decreased—by a factor of 2, in fact—the amount of “information integration” that was intuitively happening in my system! I did that in order to decrease the normalized value Φ(A,B) / min{|A|,|B|} for the particular bipartition (A,B) that I cared about, thereby ensuring that that (A,B) would be chosen over all the other bipartitions, thereby increasing the final, unnormalized value Φ(A,B) that Tononi’s prescription tells me to return. I hope I’m not alone in fearing that this illustrates a disturbing non-robustness in the definition of Φ.
But let’s leave that issue aside; maybe it can be ameliorated by fiddling with the definition. The broader point is this: I’ve shown that my system (the system that simply applies the matrix W to an input vector x) has an enormous amount of integrated information Φ. Indeed, this system’s Φ equals half of its entire information content. So for example, if n were 10^14 or so (something that wouldn’t be hard to arrange with existing computers) then this system’s Φ would exceed any plausible upper bound on the integrated information content of the human brain.
And yet this Vandermonde system doesn’t even come close to doing anything that we’d want to call intelligent, let alone conscious! When you apply the Vandermonde matrix to a vector, all you’re really doing is mapping the list of coefficients of a degree-(n-1) polynomial over F_p, to the values of the polynomial on the n points 0,1,…,n-1. Now, evaluating a polynomial on a set of points turns out to be an excellent way to achieve “integrated information,” with every subset of outputs as correlated with every subset of inputs as it could possibly be. In fact, that’s precisely why polynomials are used so heavily in error-correcting codes, such as the Reed-Solomon code, employed (among many other places) in CD’s and DVD’s. But that doesn’t imply that every time you start up your DVD player you’re lighting the fire of consciousness. It doesn’t even hint at such a thing. All it tells us is that you can have integrated information without consciousness (or even intelligence), just like you can have computation without consciousness, and unpredictability without consciousness, and electricity without consciousness.
It might be objected that, in defining my “Vandermonde system,” I was too abstract and mathematical. I said that the system maps the input vector x to the output vector Wx, but I didn’t say anything about how it did so. To perform a computation—even a computation as simple as a matrix-vector multiply—won’t we need a physical network of wires, logic gates, and so forth? And in any realistic such network, won’t each logic gate be directly connected to at most a few other gates, rather than to billions of them? And if we define the integrated information Φ, not directly in terms of the inputs and outputs of the function f(x)=Wx, but in terms of all the actual logic gates involved in computing f, isn’t it possible or even likely that Φ will go back down?
This is a good objection, but I don’t think it can rescue IIT. For we can achieve the same qualitative effect that I illustrated with the Vandermonde matrix—the same “global information integration,” in which every large set of outputs depends heavily on every large set of inputs—even using much “sparser” computations, ones where each individual output depends on only a few of the inputs. This is precisely the idea behind low-density parity check (LDPC) codes, which have had a major impact on coding theory over the past two decades. Of course, one would need to muck around a bit to construct a physical system based on LDPC codes whose integrated information Φ was provably large, and for which there were no wildly-unbalanced bipartitions that achieved lower Φ(A,B)/min{|A|,|B|} values than the balanced bipartitions one cared about. But I feel safe in asserting that this could be done, similarly to how I did it with the Vandermonde matrix.
More generally, we can achieve pretty good information integration by hooking together logic gates according to any bipartite expander graph: that is, any graph with n vertices on each side, such that every k vertices on the left side are connected to at least min{(1+ε)k,n} vertices on the right side, for some constant ε>0. And it’s well-known how to create expander graphs whose degree (i.e., the number of edges incident to each vertex, or the number of wires coming out of each logic gate) is a constant, such as 3. One can do so either by plunking down edges at random, or (less trivially) by explicit constructions from algebra or combinatorics. And as indicated in the title of this post, I feel 100% confident in saying that the so-constructed expander graphs are not conscious! The brain might be an expander, but not every expander is a brain.
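The expansion condition in the paragraph above is easy to state as code. The following is a brute-force sketch, assuming the graph is given as a list `neighbors` in which `neighbors[u]` is the set of right-side vertices adjacent to left vertex u, with n vertices on each side; the representation and the function name are my own. It checks every subset of left vertices, so it is only usable for very small graphs.

```python
from itertools import combinations


def is_bipartite_expander(neighbors, eps):
    """True if every k left vertices reach at least min((1+eps)*k, n) right
    vertices, where n = len(neighbors) is the number of vertices per side."""
    n = len(neighbors)
    for k in range(1, n + 1):
        need = min((1 + eps) * k, n)
        for subset in combinations(range(n), k):
            reached = set().union(*(neighbors[u] for u in subset))
            if len(reached) < need:
                return False
    return True


complete = [{0, 1, 2}, {0, 1, 2}, {0, 1, 2}]  # K_{3,3}: expands trivially
matching = [{0}, {1}, {2}]                     # perfect matching: no expansion

print(is_bipartite_expander(complete, 0.5))
print(is_bipartite_expander(matching, 0.5))
```

The interesting constructions are the constant-degree ones mentioned above, where each vertex has only a few edges and yet the check still passes; verifying those at realistic sizes needs spectral or combinatorial arguments instead of enumeration.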
Before winding down this post, I can’t resist telling you that the concept of integrated information (though it wasn’t called that) played an interesting role in computational complexity in the 1970s. As I understand the history, Leslie Valiant conjectured that Boolean functions f:{0,1}^n→{0,1}^n with a high degree of “information integration” (such as discrete analogues of the Fourier transform) might be good candidates for proving circuit lower bounds, which in turn might be baby steps toward P≠NP. More strongly, Valiant conjectured that the property of information integration, all by itself, implied that such functions had to be at least somewhat computationally complex, i.e., that they couldn’t be computed by circuits of size O(n), or even required circuits of size Ω(n log n). Alas, that hope was refuted by Valiant’s later discovery of linear-size superconcentrators. Just as information integration doesn’t suffice for intelligence or consciousness, so Valiant learned that information integration doesn’t suffice for circuit lower bounds either.
As humans, we seem to have the intuition that global integration of information is such a powerful property that no “simple” or “mundane” computational process could possibly achieve it. But our intuition is wrong. If it were right, then we wouldn’t have linear-size superconcentrators or LDPC codes.
I should mention that I had the privilege of briefly speaking with Giulio Tononi (as well as his collaborator, Christof Koch) this winter at an FQXi conference in Puerto Rico. At that time, I challenged Tononi with a much cruder, handwavier version of some of the same points that I made above. Tononi’s response, as best as I can reconstruct it, was that it’s wrong to approach IIT like a mathematician; instead one needs to start “from the inside,” with the phenomenology of consciousness, and only then try to build general theories that can be tested against counterexamples. This response perplexed me: of course you can start from phenomenology, or from anything else you like, when constructing your theory of consciousness. However, once your theory has been constructed, surely it’s then fair game for others to try to refute it with counterexamples? And surely the theory should be judged, like anything else in science or philosophy, by how well it withstands such attacks?
But let me end on a positive note. In my opinion, the fact that Integrated Information Theory is wrong—demonstrably wrong, for reasons that go to its core—puts it in something like the top 2% of all mathematical theories of consciousness ever proposed. Almost all competing theories of consciousness, it seems to me, have been so vague, fluffy, and malleable that they can only aspire to wrongness.
[Endnote: See also this related post, by the philosopher Eric Schwitzgebel: Why Tononi Should Think That the United States Is Conscious. While the discussion is much more informal, and the proposed counterexample more debatable, the basic objection to IIT is the same.]
Update (5/22): Here are a few clarifications of this post that might be helpful.
(1) The stuff about zombies and the Hard Problem was simply meant as motivation and background for what I called the “Pretty-Hard Problem of Consciousness”—the problem that I take IIT to be addressing. You can disagree with the zombie stuff without it having any effect on my arguments about IIT.
(2) I wasn’t arguing in this post that dualism is true, or that consciousness is irreducibly mysterious, or that there could never be any convincing theory that told us how much consciousness was present in a physical system. All I was arguing was that, at any rate, IIT is not such a theory.
(3) Yes, it’s true that my demonstration of IIT’s falsehood assumes—as an axiom, if you like—that while we might not know exactly what we mean by “consciousness,” at any rate we’re talking about something that humans have to a greater extent than DVD players. If you reject that axiom, then I’d simply want to define a new word for a certain quality that non-anesthetized humans seem to have and that DVD players seem not to, and clarify that that other quality is the one I’m interested in.
(4) For my counterexample, the reason I chose the Vandermonde matrix is not merely that it’s invertible, but that all of its submatrices are full-rank. This is the property that’s relevant for producing a large value of the integrated information Φ; by contrast, note that the identity matrix is invertible, but produces a system with Φ=0. (As another note, if we work over a large enough field, then a random matrix will have this same property with high probability—but I wanted an explicit example, and while the Vandermonde is far from the only one, it’s one of the simplest.)
(5) The n×n Vandermonde matrix only does what I want if we work over (say) a prime field F_p with p ≫ n elements. Thus, it’s natural to wonder whether similar examples exist where the basic system variables are bits, rather than elements of F_p. The answer is yes. One way to get such examples is using the low-density parity check codes that I mention in the post. Another common way to get Boolean examples, one that is also used in practice in error-correcting codes, is to start with the Vandermonde matrix (a.k.a. the Reed-Solomon code), and then combine it with an additional component that encodes the elements of F_p as strings of bits in some way. Of course, you then need to check that doing this doesn’t harm the properties of the original Vandermonde matrix that you cared about (e.g., the “information integration”) too much, which causes some additional complication.
(6) Finally, it might be objected that my counterexamples ignored the issue of dynamics and “feedback loops”: they all consisted of unidirectional processes, which map inputs to outputs and then halt. However, this can be fixed by the simple expedient of iterating the process over and over! I.e., first map x to Wx, then map Wx to W^2x, and so on. The integrated information should then be the same as in the unidirectional case.
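As a small illustration of clarifications (4) and (5), the following Python sketch builds a Vandermonde matrix over F_p and brute-force checks whether every square submatrix is nonsingular (which implies that every rectangular submatrix is full-rank too). The convention that entry (i, j) equals i^j mod p with nodes i = 1..n is an assumption on my part, as are the function names and the tiny parameters; treat it as a sanity check for small n, not as part of the argument.

```python
from itertools import combinations

def vandermonde(n, p):
    """n x n matrix over F_p with entry (i, j) = i^j mod p,
    for nodes i = 1..n and exponents j = 0..n-1 (assumed convention)."""
    return [[pow(i, j, p) for j in range(n)] for i in range(1, n + 1)]

def rank_mod_p(rows, p):
    """Rank of a matrix over the prime field F_p, by Gaussian elimination."""
    m = [list(r) for r in rows]
    ncols = len(m[0]) if m else 0
    rank = 0
    for c in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][c] % p), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][c], -1, p)  # modular inverse, needs Python 3.8+
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][c] % p:
                f = m[r][c]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def all_submatrices_full_rank(mat, p):
    """True if every k x k submatrix (any rows, any columns) has rank k over F_p."""
    n = len(mat)
    for k in range(1, n + 1):
        for rows in combinations(range(n), k):
            for cols in combinations(range(n), k):
                sub = [[mat[r][c] for c in cols] for r in rows]
                if rank_mod_p(sub, p) < k:
                    return False
    return True

if __name__ == "__main__":
    identity = [[int(i == j) for j in range(3)] for i in range(3)]
    print(all_submatrices_full_rank(vandermonde(3, 7), 7))
    print(all_submatrices_full_rank(identity, 7))
```

With n = 3 and p = 7 the Vandermonde check passes, while the identity matrix fails at once because its off-diagonal 1×1 submatrices are zero. With n = 3 and p = 5 the Vandermonde check fails, which is a small reminder of why p needs to be large compared with n.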
Update (5/24): See a very interesting comment by David Chalmers.










