Shared posts

22 Feb 18:32

States most similar to the US overall

by Nathan Yau

“Normal America.” I’m not sure what that means anymore, but at some point it had a lot to do with demographics. Naturally, the “normal” that you look at or want bleeds into policy-making and the like. Jed Kolko for FiveThirtyEight looks into the states most similar to the country overall — the one from 1950 and from today.

But the places that look today most like 1950 America are not large metros but rather smaller metros and rural areas. Looking across all of America, including the rural areas, the regions that today look most demographically similar to 1950 America are the portion of eastern Ohio around the towns of Cambridge and Coshocton and the Cumberland Valley district in southeastern Kentucky.

The states most similar demographically to today’s America: Illinois, New York, New Jersey, Connecticut, and Virginia.

Tags: demographics, FiveThirtyEight, normal

14 Jul 05:20

Freedom for some is not freedom

by Jason Kottke

Jane Elliott asks an audience a very simple question about being black in America. (via @carltonspeight who says "No BS, I wish every white person on Twitter could see this. Maybe it'll help")

Tags: Jane Elliott, racism, USA, video

14 Apr 01:58

Markov Chain Dirty To Me

by Hugh Hancock

Right now, people are having sex with a computer.

I don't mean that they're having sex with a RealDoll or similar—although I'm sure they are.

I mean that there are people out there, right now, who are shagging a state machine.

Welcome to the world of computer-assisted self-bondage (LINK IS VERY VERY NSFW!). Using Arduinos, Heath Robinson-esque contraptions involving keys held in CD trays, and Bluetooth-enabled electrostim machines, men and women have programmed their own doms or dommes. A truly merciless dominant who will randomly please or hurt, and there's nothing the user can do about it.

(There's a Terminator misquote here that I'm desperately trying not to make.)

Meanwhile, there are dozens of other people who are attempting to chat up a slightly different state machine.

Advertisers on porn sites around the world have figured out that users are, shall we say, somewhat preoccupied, and there are a limited number of advertising techniques that will work. One of the most common ones is a "fake chat" ad—an attractive woman propositioning the user with the promise of, at least, some hot Facebook or Snapchat messages.

Some of the more advanced ads actually take the user to a landing page where a very, very crude script will respond to them.

This has been happening for ages, of course. So why am I talking about it now?

Because tech developments in other areas are about to turn the whole "sex with your PC" deal from "crude and somewhat rubbish" to "looks like the AIs just took a really unexpected job away".

Soon you may well be able to summon a succubus. Through your PC.

Rise Of The Chatbots

Summoned servitors are about to make the mother of all comebacks.

I'm rather enthusiastic about that on a fictional level. Indeed, the film I just released, DANGEROUS TREASURES, came very close to being called BOUND THINGS instead—it's a story of a couple of geeks who follow clues on a deepweb occult forum which lead them to have a lengthy and bloody interaction with the bound guardian of the treasure they're robbing. And the binding and summoning of guardians is key to the entire thing (hopefully not spoiling it too much!).

(Amusingly for the topic of this post, the reason we didn't call it BOUND THINGS is that it sounded rather too porny.)

Accidentally, I seem to have hit something of a zeitgeist with this one. Because in Silicon Valley, I'm reliably informed, the Wave Of The Future is exactly this: summoned, intelligent servants which you can control if you know their True Name.

Forget apps. The new line is "there's a bot for that".

Rise Of The Chatbots

Here's a good primer for the whole Chatbot Revolution.

In short:

  • Messaging apps got huge.
  • Chatbots, which have been around for ages, benefit a lot from the improvements in AI recently.
  • It's possible to plug A into B comparatively easily.
  • And that enables a very natural interaction where you simply message the pizzabot, say, and it sends you a pizza.
  • Replace "pizza" with "Uber", "Grocery", "Plane Ticket" or "Escort" as appropriate.

There's no need to install an app and grant it permission to do everything from track your location to access your selfie collection. There's no need to bash your way (pun not intended) through a confusing new interface. Just message and It Is Done.

This is a very similar idea to the old "AI agent" concept that has been knocking around for a decade or more, but this time the ecosystem's right. We have natural language processing sophisticated enough to comprehend most messages. We have messaging systems that everyone uses, with APIs that allow bots to access them. And we're sick up to the back teeth of apps.

By now you're probably thinking "And we've got learning bots too!". You're right. And this is where we go back to the "fucking a CPU" thing.

We've already seen Microsoft unveil Tay, their Twitter bot. A lot of people have written about its rapid transformation into a shitposting racist as an amusing side feature, or a huge and terrifying weakness. But they have that the wrong way around.

Microsoft Tay, and its /pol/ification, is where the Chatbot Revolution really begins.

Markov Chain Dirty To Me

So let's go back to our porn surfer chatting up a state machine, and our self-bondage enthusiast tied to a Bluetooth vibrator.

And now let's add a learning bot into both those situations.

One of the most successful techniques in current AI research - used by Google's DeepMind amongst others - is called "Regret-based learning". Essentially, the bot is given a goal, and gets upset when it doesn't achieve it. Thus, it rapidly learns to optimise its approach to achieve that goal.

Sex, in a myriad of forms, is an excellent candidate for a regret-based AI. For the advertisers, the win condition is simple: the user clicks through to whatever they're advertising and purchases, signs up, or whatever.

For the bondage enthusiasts, it's orgasm and sexual arousal - both of which can be measured if said user doesn't mind strapping some electronics to their genitals.

There's already an active project - with talk of crowdfunding - aiming to link masturbation machines to a machine-learning program via a variety of techniques for spotting imminent orgasm. (I'll not link to that particular piece of research, but you can probably find it with a bit of Googling.)

And in both cases, there's enormous demand, and absolutely no reason a bot can't be trained up by exposing it (pun, once again, not intended) to hundreds of thousands of users at the same time. Indeed, for a pure chat bot it would be tremendously cheap - a few hundred dollars - to buy access to a massive array of users for extremely rapid training thanks to the generally low value of porn site advertising space.

Even if the intention isn't to shill an adult dating site or similar, but to develop an effective AI replacement for a phone sex line, it's comparatively simple to add in conditions for regret-based learning. At the most obvious level, a simple Uber-style star rating would give the bot enough feedback to begin optimising.

I for one welcome our AI Dominatrixes

But could our sexbot actually ever get good enough to be convincing? I can't see why not.

Computer-generated dialogue has gotten pretty good, to the point that several chatbots have arguably passed the Turing Test. One of the criticisms of one such chatbot was that it cast itself in a very particular role and personality in order to appear convincing - which is obviously something that is eminently doable for a fantasy chatbot too.

In addition, by its very nature sex texting tends to be somewhat inarticulate at points. From what I saw of Tay's output - Nazi propaganda aside - a similar level of quality would pass perfectly well in sexual chat.

Sure, a bot might not be able to construct sophisticated fantasies. But are those 100% necessary, or can it learn to please based on call and response?

2015 was the year that computers got better than humans at recognising images.

Could 2017 or 2018 be the year that computers get better than humans at dirty talk? Or, indeed, BDSM dominance?

There'll certainly be a lot of enthusiasm for the concept.

Pardon me. I'll be in my bunk.

What do you think? Could you see the oldest profession being on the AI chopping block? Would you ever talk dirty to a robot?

11 Jan 20:38

Long range forecast

by Charlie Stross

So, I'm seeing a bunch of disturbing news headlines in the new year. Mass sex attacks in Cologne on New Year's Eve would be one (and I want you to think very hard about precisely whose political agenda benefits from the different kinds of spin that can be placed on this story depending on how it is framed). Poland's constitutional court and civil service being rapidly brought under control of the Law and Justice Party (and what is it about neo-fascists and their obsession with touchstone topics like dignity, law, the church, and justice? Again, read the link I just gave you—it's part of the instructions for assembling the jigsaw puzzle of politics in the 21st century). Saudi mass executions are part of the same picture if you step back and look for the edge of the frame.

But the biggest news of all is getting relatively little traction because it's being mistaken for local colour rather than a global pattern.

What is the news (as opposed to popular entertainment and celebrity gossip) going to be like for the next decade? Let me give you a forecast.

(Note: As usual, there's a lot of meat in the hyperlinks. You won't get the most out of this essay unless you are familiar with their content.)

There are three big determinants of the long-term geopolitical weather report:

  1. Global climate change

  2. Human reactions to global climate change

  3. Economics

Let's take it from the top.

Global climate change doesn't mean a uniform "everywhere gets X degrees warmer" shift in temperatures; the weather is a dynamic, chaotic system, and what climate change means is that more energy is being pumped into driving atmospheric and oceanic currents, with unpredictable but generally more energetic consequences.

A bunch of conflicts are breaking out, or resuming, because chunks of the planet are becoming increasingly prone to extreme weather conditions. The UK just had its wettest December ever, with more than double the normal rainfall and extensive floods taking out the centers of major cities. Part of the blame lies with local cupidity, greed, and myopia in planning land drainage policy, but the rain itself doesn't respect national boundaries. Similarly, chunks of the USA got hammered, as did several South American countries ... in fact, everywhere you look, the weather is out of whack. In extreme cases this is leading to actual open warfare—the Syrian civil war and the rise of Islamic State, for example.

A side-effect of this is mass migration on a scale we haven't seen since the end of the second world war as people try to flee war and disaster zones.

Mass migration drives political backlashes everywhere, with racist clowns marching in front of the band (did you think I just meant Donald Trump?) and nativist anti-immigration groups crowding behind them. I'm not going to go into the social roots of xenophobia other than to note that (a) bigotry is fractal, and (b) insecure, threatened hominids put on threat displays right back at whatever they're scared of. Also, (c) a constituency of insecure, threatened hominids are easily led and profitably milked, which attracts an endless supply of sinister racist revivalist huckster politicians. (This is the Hitler as social entrepreneur theory: he wasn't uniquely evil, he just happened to be the first to get out in front and lead.) More to the point, every nation that isn't impoverished or devastated by climate change will see a wave of immigration, and every nation undergoing a wave of immigration will see a nativist political reaction.

The nativist backlash is inevitably going to be inflamed by the Martian invaders, who are all in favor of the free movement of capital but not labor (hint: this is Marxism 101, and if you don't believe me, go look at the requirements for a Tier 1 investor visa). Restricting transnational mobility for the proles/serfs/99.9% is part of the program and plays well to the nativist strand in climate change politics, which is why unless you've got a few million burning a hole in your back pocket you'll find it really difficult to legally immigrate into the UK or USA or other top-tier countries from outside the developed world. And why all our corporate-owned media (that is, 95% of them: Reddit is owned by Conde Nast, The Times and Fox News and 90% of the newspapers in Australia are owned by Rupert Murdoch, and so on) are banging the drum against immigration, at the behest of their (investor visa equipped) owners.

Nativism meshes with religious ideology as well as politics, of course. It serves the purpose of the right wing in the west very well to have a demonic-seeming Islamic adversary intent on exterminating Christianity. And it serves the interests of Da'esh very well indeed to have an adversary in the west who cack-handedly bomb civilians and rant against the evils of Islam so that they can strike heroic poses against the infidels. As with the communist/capitalist cold war, there's an element of posturing-in-the-mirror going on here. Both capitalism and communism take as holy writ the ideas of the Enlightenment and of society organized around industrial development and division of labour: compared to the ancien regime it was essentially a sectarian squabble between nearly-identical radical factions. Christianity and Islam are both evangelical, messianic, monotheistic religions with a patriarchal ideology and a bunch of lifestyle restrictions (mostly affecting women) bolted on the side; in both cases, most of their followers are peaceful, but we don't pay attention to them—we only notice the scary fundamentalist terrorists on the other side of the fence.

(Random discursive note: this being an anglophone blog, some of you are probably thinking, "but, but, hijab!" To which I will note that veiling women as a religious practice is a long tradition in Christian cultures which only fell into neglect historically recently, and we have equally batshit taboos which we are mostly oblivious to—fish don't notice the water, after all. Almost all the practices conducted by IS that we consider to be barbaric were just business as usual in the western world until historically recent times. Sometimes until very recently. Let's have no stone-throwing here: digression over.)

Economics is another aggravating problem. The global financial system crashed in 2007/08 and was only revived by a brisk dose of hyperinflation. The public didn't really notice the effects of the hyperinflation because it happened globally, with all the central banks engaging in quantitative easing more or less simultaneously (or "printing fiat currency" as the goldbugs call it): the price of exports didn't rise or fall as the tsunami of soft money rushed past in the ocean depths below the keel of the commodities markets. But we're now seeing oddly cheap oil (in turn aggravated by the traditional Sunni/Shi'a cold war that's been running for the past 1300-odd years, which in turn has been inflamed by climate change in Iraq and Syria and the final collapse of the Sykes-Picot agreement and its legacy in the former Ottoman Empire). Oil and energy economics in general are now being affected by the human reaction to climate change which, while belated and half-hearted, is to stop shitting in the bed you're sleeping in: the switch to renewable energy is under way globally and the cost per kWh of photovoltaic power is now at grid parity and will soon undercut coal in most of the world (the IEA are putting a brave face on it but they may be next, too).

This is a toxic combination. We've just weathered the worst financial crisis since the Great Depression, and we're undergoing an infrastructure crisis (due to climate change) and the extinction of an economic backbone industry—admittedly one we will be far better off without: coal and oil pollution directly kill tens of thousands of people even in developed nations—which will ultimately require the replacement of tens of trillions of dollars' worth of fossil fuel infrastructure worldwide. Add nativist/racist/right wing politics on top, from Hungary through Poland (above) and Russia and it really looks like we're in for a replay of the 1930s.

I've missed out a few bright spots.

To a time traveller from 1985, China is doing unbelievably well. They're working through the huge demographic bulge created by the now-abandoned one-child-per-family policy, and their work force is going to start shrinking in another couple of decades, but for the time being they're reaping the benefits of a much better educated and trained workforce (at least compared to their often-illiterate peasant grandparents) and rapid development. China overall is trying to do what Japan and South Korea did in the second half of the 20th century, with many signs of success (and the negative side-effects too, which explains the Central Committee's conversion to the cause of fossil fuel reduction). India is also developing rapidly, and those two countries combined equal the entire world population in 1950. Lifting China and India out of poverty is, if it happens, going to be one of the great human triumphs of the first half of this century, an almost incalculably huge improvement in the overall human experience—if we (and they) don't drop the ball. We're also seeing development in large parts of Africa. North Africa is a mess, with the spill-over from the Middle East conflicts and climate change as a driver for immigration and strife.

But anyway, here's my summary of the next decade:

  1. The weather's going to get worse.

  2. We're going to see more and more unscrupulous huckster types leading revanchist, nativist right wing political movements and banging the anti-immigrant drum, world-wide. Civil rights include the right to free movement; this makes civil rights an easy scapegoat and target for the angry populist nativists. Sensible media capitalists (those with a sense of self-preservation) will pander to these assclowns. Courageous media capitalists (those with the odd ethical bone in their body) will stand up to them and get themselves assassinated or imprisoned. Luckily we have the internet except, oops, Facebook owns it and FB will do whatever they're told. (And if not Facebook, Google. The internet is infrastructure, and if annoying dissidents are drinking from the pure tapwater of honest news and you own the pumping station ...)

  3. This is going to happen both in nominally/formerly Christian countries and in the Muslim world. Both sides will see each other in a mirror and hiss like cats, but it doesn't really signify anything. Fear of terrorism is a rallying point, so expect unscrupulous politicians to use crack-downs on their local minorities to bolster their popularity. This will of course include crack-downs on civil rights because nothing annoys a political entrepreneur trying to posture as a strong leader like a civil rights lawyer with a good case.

  4. The ongoing 1300-year Sunni/Shi'ite cold war will continue, sometimes hotter, thanks to climate-induced disruption in the Middle East and the eventual collapse of the Saudi petrochemical economy. The ongoing Saudi succession crisis isn't going to help (as we just saw).

  5. None of this political posturing is going to do jack shit to roll back the already-in-train effects of climate change so the immigration pressure will continue, driving trends (2) and (3).

  6. Don't buy long term coal or oil futures.

13 Oct 13:14

Bernie Sanders is beating all of Obama's important 2008 records

by Cory Doctorow


Obama's 2008 run at the presidency was remarkable and game-changing, drawing huge crowds, raising huge sums in small money donations, and mobilizing a massive army of volunteer campaigners. There'd never been a campaign like it, and none had matched it since -- until Bernie Sanders.

29 Jun 20:15

Winona Ryder to star in new Netflix paranormal TV show

by David Pescovitz

Winona Ryder signed on as star of a forthcoming Netflix drama about the high weirdness, conspiracy theories, and paranormal reports around the Montauk Project, alleged US government experiments on Long Island involving time travel, psi-ops, and teleportation.

05 Jan 16:00

South Korea's brutalized, disabled slaves

by Cory Doctorow

Whole towns' worth of people on South Korea's remote southwest coast are complicit in the longrunning, open enslavement of mentally and physically disabled workers who are kidnapped from the streets of cities like Seoul and beaten into lives of forced labor.

28 May 03:09

Yann LeCun's answers from the Reddit AMA

On May 15th Yann LeCun answered “ask me anything” questions on Reddit. We hand-picked some of his thoughts and grouped them by topic for your enjoyment.

Toronto, Montreal and New York

All three groups are strong and complementary.

Geoff (who spends more time at Google than in Toronto now) and Russ Salakhutdinov like RBMs and deep Boltzmann machines. I like the idea of Boltzmann machines (it’s a beautifully simple concept) but it doesn’t scale well. Also, I totally hate sampling.

Yoshua and his colleagues have focused a lot on various unsupervised learning methods, including denoising auto-encoders and contractive auto-encoders. They are not allergic to sampling like I am. On the application side, they have worked on text, not so much on images.

In our lab at NYU (Rob Fergus, David Sontag, me and our students and postdocs), we have been focusing on sparse auto-encoders for unsupervised learning. They have the advantage of scaling well. We have also worked on applications, mostly to visual perception.

Numenta, Vicarious, NuPIC, HTM, CLA, etc.

Jeff Hawkins has the right intuition and the right philosophy. Some of us have had similar ideas for several decades. Certainly, we all agree that AI systems of the future will be hierarchical (it’s the very idea of deep learning) and will use temporal prediction.

But the difficulty is to instantiate these concepts and reduce them to practice. Another difficulty is grounding them on sound mathematical principles (is this algorithm minimizing an objective function?).

I think Jeff Hawkins, Dileep George and others greatly underestimated the difficulty of reducing these conceptual ideas to practice.

As far as I can tell, HTM has not been demonstrated to get anywhere close to state of the art on any serious task.

I still think [Vicarious] is mostly hype. There is no public information about the underlying technology. The principals don’t have a particularly good track record of success. And the only demo is way behind what you can do with “plain vanilla” convolutional nets (see this Google blog post and this ICLR 2014 paper and this video of the ICLR talk).

HTM, NuPIC, and Numenta received a lot more publicity than they deserved because of the Internet millionaire / Silicon Valley celebrity status of Jeff Hawkins.

But I haven’t seen any result that would give substance to the hype.

Don’t get fooled by people who claim to have a solution to Artificial General Intelligence, who claim to have AI systems that work “just like the human brain”, or who claim to have figured out how the brain works (well, except if it’s Geoff Hinton making the claim). Ask them what error rate they get on MNIST or ImageNet.

Some advice

Seriously, I don’t like the phrase “Big Data”. I prefer “Data Science”, which is the automatic (or semi-automatic) extraction of knowledge from data. That is here to stay, it’s not a fad. The amount of data generated by our digital world is growing exponentially at a high rate (at the same rate our hard-drives and communication networks are increasing their capacity). But the amount of human brain power in the world is not increasing nearly as fast. This means that now or in the near future most of the knowledge in the world will be extracted by machine and reside in machines. It’s inevitable. An entire industry is building itself around this, and a new academic discipline is emerging.

What areas do you think are most promising right now for people who are just starting out?

  • representation learning (the current crop of deep learning methods is just one way of doing it)
  • learning long-term dependencies
  • marrying representation learning with structured prediction and/or reasoning
  • unsupervised representation learning, particularly prediction-based methods for temporal/sequential signals
  • marrying representation learning and reinforcement learning
  • using learning to speed up the solution of complex inference problems
  • theory: do theory (any theory) on deep learning/representation learning
  • understanding the landscape of objective functions in deep learning
  • in terms of applications: natural language understanding (e.g. for machine translation), video understanding
  • learning complex control.

Do you think that deep learning would be a good tool for finding similarities in the medical domain (e.g. between different cases)?

Yes, look up papers on metric learning, searching for “siamese networks”, DrLIM (Dimensionality Reduction by Learning an Invariant Mapping), NCA (Neighborhood Components Analysis), WSABIE….
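To make the "siamese networks" pointer a bit more concrete, here is a minimal sketch of a contrastive pairwise loss of the kind used in DrLIM-style metric learning. Similar pairs are pulled together, and dissimilar pairs are pushed apart until they are at least a margin away. The function name, the default margin of 1.0, and the use of plain tuples in place of network outputs are assumptions made for illustration; in a real siamese setup the two inputs would be embeddings produced by the same network.

```python
import math

def contrastive_loss(emb_a, emb_b, dissimilar, margin=1.0):
    """Pairwise contrastive loss on two embedding vectors (hypothetical helper).

    emb_a, emb_b: sequences of floats standing in for the two network outputs.
    dissimilar:   True if the pair should be far apart, False if it should be close.
    margin:       assumed radius beyond which dissimilar pairs cost nothing.
    """
    distance = math.dist(emb_a, emb_b)  # Euclidean distance (Python 3.8+)
    if dissimilar:
        # Only penalise dissimilar pairs that sit inside the margin.
        return 0.5 * max(0.0, margin - distance) ** 2
    # Similar pairs pay for any separation at all.
    return 0.5 * distance ** 2

similar_far = contrastive_loss((0.0, 0.0), (3.0, 4.0), dissimilar=False)
dissimilar_far = contrastive_loss((0.0, 0.0), (3.0, 4.0), dissimilar=True)
dissimilar_near = contrastive_loss((0.0, 0.0), (0.5, 0.0), dissimilar=True)
```

Minimising this loss over many labelled pairs is what shapes the embedding so that nearby points correspond to similar cases.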

Larger networks tend to work better. Make your network bigger and bigger until the accuracy stops increasing. Then regularize the hell out of it. Then make it bigger still and pre-train it with unsupervised learning.

Speech is one of those domains where we have access to ridiculously large amounts of data and a very large number of categories. So, it’s very favorable for supervised learning.

There are theoretical results that suggest that learning good parameter settings for a (smallish) neural network can be as hard computationally as breaking the RSA crypto system.

The limitations you point out do not concern just backprop, but all learning algorithms that use gradient-based optimization.

These methods only work to the extent that the landscape of the objective function is well behaved. You can construct pathological cases where the objective function is like a golf course: flat with a tiny hole somewhere. Gradient-based methods won’t work with that.
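The golf-course picture is easy to reproduce in one dimension. The sketch below runs the same plain gradient descent on a smooth bowl and on a function that is flat everywhere except for a tiny dip around the minimum. The function names, the hole radius, and the step size are all illustrative assumptions.

```python
def gradient_descent(grad, x0, lr=0.1, steps=200):
    """Plain gradient descent on a 1-D function given its gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def bowl_grad(x, target=3.0):
    # Gradient of the smooth bowl f(x) = (x - target)^2.
    return 2.0 * (x - target)

def golf_course_grad(x, hole=3.0, radius=0.01):
    # Flat (zero gradient) everywhere except inside a tiny hole around `hole`.
    if abs(x - hole) < radius:
        return 2.0 * (x - hole)
    return 0.0

bowl_result = gradient_descent(bowl_grad, x0=0.0)         # converges towards 3.0
golf_result = gradient_descent(golf_course_grad, x0=0.0)  # never leaves 0.0
```

On the bowl the gradient always points at the minimum, so the iterate walks there. On the golf course the gradient carries no information unless the starting point already happens to be inside the hole.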

The trick is to stay away from those pathological cases. One trick is to make the network considerably larger than the minimum size required to solve the task. This creates lots and lots of equivalent local minima and makes them easy to find. The problem is that large networks may overfit, and we may have to regularize the hell out of them (e.g. using drop out).

The “learning boolean formula = code cracking” results pertain to pathological cases and to exact solutions. In most applications, we only care about approximate solutions.


I learned a lot by reading things that are not apparently connected with AI or computer science (my undergraduate degree is in electrical engineering, and my formal CS training is pretty small).

For example, I have always been interested in physics, and I have read tons of physics textbooks and papers. I learned a lot about path integrals (which is formally equivalent to the “forward algorithm” in hidden Markov models). I have also learned a ton from statistical physics books. The notions of partition functions, entropy, free energy, variational methods etc, that are so prevalent in the graphical models literature all come from statistical physics.
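The path-integral analogy can be seen in a toy hidden Markov model: the likelihood of an observation sequence is a sum over every possible hidden state path, and the forward algorithm computes that same sum with a recursion instead of an enumeration. The two-state model below is made up for illustration, and the function names are assumptions.

```python
from itertools import product

def forward_likelihood(init, trans, emit, observations):
    """P(observations) via the forward recursion, O(T * n^2)."""
    n = len(init)
    alpha = [init[s] * emit[s][observations[0]] for s in range(n)]
    for obs in observations[1:]:
        alpha = [
            emit[j][obs] * sum(alpha[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return sum(alpha)

def sum_over_paths_likelihood(init, trans, emit, observations):
    """Same quantity as an explicit sum over all n^T hidden paths."""
    n = len(init)
    total = 0.0
    for path in product(range(n), repeat=len(observations)):
        p = init[path[0]] * emit[path[0]][observations[0]]
        for t in range(1, len(observations)):
            p *= trans[path[t - 1]][path[t]] * emit[path[t]][observations[t]]
        total += p
    return total

# Made-up two-state model with two observation symbols.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 0, 0]

fast = forward_likelihood(init, trans, emit, obs)
slow = sum_over_paths_likelihood(init, trans, emit, obs)
```

Both functions return the same number up to floating-point rounding; the forward recursion just reuses partial sums that the brute-force version recomputes for every path.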

In the early 90’s my friend and Bell Labs colleague John Denker and I worked quite a bit on the physics of computation.

In 1991, we attended a workshop at the Santa Fe Institute in which we heard a fascinating talk by John Archibald Wheeler entitled “It from Bits”. John Wheeler was the theoretical physicist who coined the phrase “black hole”. Many physicists like Wojciech Zurek (the organizer of the workshop), Gerard ’t Hooft, and many others have the intuition that physics can be reduced to information transformation.

Like Kolmogorov, I am fascinated by the concept of complexity, which is at the root of learning theory, compression, and thermodynamics. Zurek has an interesting series of work on a definition of physical entropy that uses Kolmogorov/Chaitin/Solomonoff algorithmic complexity. But progress has been slow.

Fascinating topics.

Most of my Bell Labs colleagues were physicists, and I loved interacting with them.

Physics is about modeling actual systems and processes. It’s grounded in the real world. You have to figure out what’s important, know what to ignore, and know how to approximate. These are skills you need to conceptualize, model, and analyze ML models.


I use a lot of math, sometimes at the conceptual level more than at the “detailed proof” level. A lot of ideas come from mathematical intuition. Proofs always come later. I don’t do a lot of proofs. Others are better than me at proving theorems.

There is a huge amount of interest for representation learning from the applied mathematics community. Being a faculty member at the Courant Institute of Mathematical Science at NYU, which is ranked #1 in applied math in the US, I am quite familiar with the world of applied math (even though I am definitely not a mathematician).

These are folks who have long been interested in representing data (mostly natural signals like audio and images). These are people who have worked on wavelet transforms, sparse coding and sparse modeling, compressive sensing, manifold learning, numerical optimization, scientific computing, large-scale linear algebra, fast transforms (FFT, Fast Multipole methods). This community has a lot to say about how to represent data in high-dimensional spaces.

In fact, several of my postdocs (e.g. Joan Bruna, Arthur Szlam) have come from that community because I think they can help with cracking the unsupervised learning problem.

I do not believe that classical learning theory with “IID samples, convex optimization, and supervised classification and regression” is sufficient for representation learning. SVMs do not naturally emerge from VC theory. SVMs happen to be simple enough for VC theory to have specific results about them. Those results are cool and beautiful, but they have no practical consequence. No one uses generalization bounds to do model selection. Everyone in their right mind uses (cross)validation.
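For readers who have not done it by hand, model selection by cross-validation looks roughly like the sketch below. It picks the number of neighbours for a 1-D k-nearest-neighbour regressor by 5-fold cross-validation on synthetic noisy data. The model, the data, the candidate list, and every name here are assumptions chosen to keep the example dependency-free; they are not taken from the AMA.

```python
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def cross_val_mse(data, k_neighbors, folds=5):
    """Mean squared error of k-NN regression under k-fold cross-validation."""
    squared_errors = []
    for fold in range(folds):
        held_out = data[fold::folds]  # every `folds`-th point, offset by fold
        train = [p for i, p in enumerate(data) if i % folds != fold]
        squared_errors.extend(
            (knn_predict(train, x, k_neighbors) - y) ** 2 for x, y in held_out
        )
    return sum(squared_errors) / len(squared_errors)

# Synthetic data: y = x^2 plus Gaussian noise, shuffled so folds are mixed.
rng = random.Random(0)
data = [(x / 10.0, (x / 10.0) ** 2 + rng.gauss(0.0, 0.5)) for x in range(60)]
rng.shuffle(data)

candidates = [1, 3, 5, 9, 15, 25]
scores = {k: cross_val_mse(data, k) for k in candidates}
best_k = min(scores, key=scores.get)  # the setting with lowest held-out error
```

The selection rule never consults a generalization bound: it simply keeps whichever setting predicts held-out points best.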

The theory of deep learning is a wide open field. Everything is up for the taking. Go for it.

How do you approach utilizing and researching machine learning techniques that are supported almost entirely empirically, as opposed to mathematically? Also in what situations have you noticed some of these techniques fail?

You have to realize that our theoretical tools are very weak. Sometimes, we have good mathematical intuitions for why a particular technique should work. Sometimes our intuition ends up being wrong.

Every reasonable ML technique has some sort of mathematical guarantee. For example, neural nets have a finite VC dimension, hence they are consistent and have generalization bounds. Now, these bounds are terrible, and cannot be used for any practical purpose. But every single bound is terrible and useless in practice (including SVM bounds).

As long as your method minimizes some sort of objective function and has a finite capacity (or is properly regularized), you are on solid theoretical grounds.

The questions become: how well does my method work on this particular problem, and how large is the set of problems on which it works well.

Kernel methods

Kernel methods are great for many purposes, but they are merely glorified template matching. Despite the beautiful math, a kernel machine is nothing more than one layer of template matchers (one per training sample) where the templates are the training samples, and one layer of linear combinations on top.
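
The two-layer picture above can be sketched in a few lines. This is an assumption-laden illustration, not an SVM: it uses kernel ridge regression (a simpler kernel machine with a closed-form solution) with a Gaussian kernel, and the function names, the toy XOR data, and the values of `gamma` and `lam` are all hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian similarity between every row of A and every row of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_ridge(X, y, gamma=1.0, lam=1e-3):
    """Solve (K + lam I) alpha = y: one weight per training sample (template)."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(X_train, alpha, X_new, gamma=1.0):
    # Layer 1: match each input against every stored training sample.
    matches = rbf_kernel(X_new, X_train, gamma)
    # Layer 2: linear combination of the match scores.
    return matches @ alpha

# XOR-like toy problem, which no linear model on the raw inputs can fit.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1.0, 1.0, 1.0, -1.0])

alpha = fit_kernel_ridge(X, y, gamma=2.0)
pred = predict(X, alpha, X, gamma=2.0)
print(np.sign(pred))
```

Note that `predict` needs the training samples themselves at test time: the templates are never learned, only their combination weights are.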

There is nothing magical about margin maximization. It’s just another way of saying “L2 regularization” (despite the cute math).
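
One way to see this is to write out the standard soft-margin SVM objective (the notation below, with n training pairs (x_i, y_i), weights w, bias b, and constants C and λ, is introduced here for illustration and is not from the interview):

```latex
% Soft-margin SVM: maximizing the margin 2/\|w\| means minimizing \|w\|^2.
\min_{w,b}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\max\bigl(0,\ 1 - y_i(w^\top x_i + b)\bigr)

% Dividing by Cn gives the same minimizer, now written as
% an average hinge loss plus an L2 penalty with \lambda = 1/(2Cn):
\min_{w,b}\ \frac{1}{n}\sum_{i=1}^{n}\max\bigl(0,\ 1 - y_i(w^\top x_i + b)\bigr) + \lambda\|w\|^2
```

Read this way, the margin term is an ordinary L2 penalty on the weights, and the rest is a hinge loss on the training samples.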

Let me be totally clear about my opinion of kernel methods. I like kernel methods (as Woody Allen would say “some of my best friends are kernel methods”). Kernel methods are a great generic tool for classification. But they have limits, and the cute mathematics that accompanies them does not give them magical properties.

SVMs were invented by my friends and colleagues at Bell Labs, Isabelle Guyon, Vladimir Vapnik, and Bernhard Boser, and later refined by Corinna Cortes and Chris Burges. All these people and I were members of the Adaptive Systems Research Department led by Larry Jackel. We were all sitting in the same corridor in AT&T Bell Labs’ Holmdel building in New Jersey. At some point I became the head of that group and was Vladimir’s boss. Other people from that group included Leon Bottou and Patrice Simard (now both at Microsoft Research). My job as the department head was to make sure people like Vladimir could work on their research with minimal friction and distraction.

My opinion of kernel methods has not changed with the emergence of MKL and metric learning. I proposed/used metric learning to learn embeddings with neural nets before it was cool to do this with kernel machines. Learning complex/hierarchical/non-linear features/representations/metrics cannot be done with kernel methods as it can be done with deep architectures. If you are interested in metric learning, look up this, this, or that.


It’s important to remind people that convolutional nets were always the record holder on MNIST. SVMs never really managed to beat ConvNets on MNIST. And SVMs (without hand-crafted features and with a generic kernel) were always left in the dust on more complex image recognition problems (e.g. NORB, face detection….).

The first commercially-viable check reading system (deployed by AT&T/NCR in 1996) used a ConvNet, not an SVM.

Getting the attention of the computer vision community was a struggle because, except for face detection and handwriting recognition, the results of supervised ConvNets on the standard CV benchmarks were OK but not great. This was largely due to the fact that the training sets were very small. I’m talking about the Caltech-101, Caltech-256 and PASCAL datasets.

We had excellent, record-breaking results on a number of tasks like semantic segmentation, pedestrian detection, face detection, road sign recognition and a few other problems. But the CV community paid little attention to it.

As soon as ImageNet came out and as soon as we figured out how to train gigantic ConvNets on GPUs, ConvNets took over. That struggle took time, but in the end people are swayed by results.

I must say that many senior members of the CV community were very welcoming of new ideas. I really feel part of the CV community, and I hold no grudge against anyone. Still, for the longest time, it was very difficult to get ConvNet papers accepted in conferences like CVPR and ICCV until last year (even at NIPS until about 2007).

Deconvolutional networks

DeconvNets are the generative counterpart of feed-forward ConvNets.

Eventually, we will figure out how to merge ConvNet and DeconvNet so that we have a feed-forward+feed-back system that can be trained supervised or unsupervised.

The plan Rob Fergus and I devised was always that we would eventually marry the two approaches.

Unsupervised learning

The interest of the ML community in representation learning was rekindled by early results with unsupervised learning: stacked sparse auto-encoders, RBMs, etc. It is true that the recent practical successes of deep learning in image and speech all use purely supervised backprop (mostly applied to convolutional nets). This success is largely due to dramatic increases in the size of datasets and the power of computers (brought about by GPUs), which allowed us to train gigantic networks (often regularized with drop-out). Still, there are a few applications where unsupervised pre-training does bring an improvement over purely supervised learning. This tends to be for applications in which the amount of labeled data is small and/or the label set is weak. A good example from my lab is pedestrian detection. Our CVPR 2013 paper shows a big improvement in performance with ConvNets that use unsupervised pre-training (convolutional sparse auto-encoders). The training set is relatively small (INRIA pedestrian dataset) and the label set is weak (pedestrian / non pedestrian). But everyone agrees that the future is in unsupervised learning. Unsupervised learning is believed to be essential for video and language. Few of us believe that we have found a good solution to unsupervised learning.

I don’t believe that there is a single criterion to measure the effectiveness of unsupervised learning.

Unsupervised learning is about discovering the internal structure of the data, discovering mutual dependencies between input variables, and disentangling the independent explanatory factors of variations. Generally, unsupervised learning is a means to an end.

There are four main uses for unsupervised learning: 1. learning features (or representations) 2. visualization/exploration 3. compression 4. synthesis

Only the first is interesting to me (the other uses are interesting too, just not on my own radar screen).

If the features are to be used in some sort of predictive model (classification, regression, etc), then that’s what we should use to measure the performance of our algorithm.
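
As a rough sketch of that evaluation protocol: learn features without labels, then score them by how well a simple predictor does on held-out data. Everything specific here is an assumption for illustration, including PCA as the stand-in unsupervised learner, the nearest-centroid classifier, the synthetic two-class data, and the helper names.

```python
import numpy as np

def fit_pca(X, n_components):
    """Unsupervised step: principal directions of the (unlabeled) inputs."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def transform(X, mean, components):
    """Project inputs onto the learned directions to get features."""
    return (X - mean) @ components.T

def nearest_centroid_accuracy(F_train, y_train, F_test, y_test):
    """Supervised step: score the features with a very simple classifier."""
    classes = np.unique(y_train)
    centroids = np.stack([F_train[y_train == c].mean(axis=0) for c in classes])
    dist = ((F_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dist.argmin(axis=1)]
    return float((pred == y_test).mean())

# Hypothetical data: two classes separated along one of ten noisy dimensions.
rng = np.random.default_rng(0)
n, d = 300, 10
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d))
X[:, 0] += np.where(y == 1, 3.0, -3.0)
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

mean, comps = fit_pca(X_train, n_components=1)  # labels are not used here
acc = nearest_centroid_accuracy(
    transform(X_train, mean, comps), y_train,
    transform(X_test, mean, comps), y_test,
)
print(acc)
```

The number reported is the accuracy of the downstream predictor, not a reconstruction error or likelihood of the unsupervised model itself.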


(At Facebook) We are using Torch7 for many projects (as do Deep Mind and several groups at Google) and will be contributing to the public version.

Torch is a numerical/scientific computing extension of LuaJIT with an ML/neural net library on top.

The huge advantage of LuaJIT over Python is that it is way, way faster, leaner, simpler, and that interfacing C/C++/CUDA code to it is incredibly easy and fast.

We are using Torch for most of our research projects (and some of our development projects) at Facebook. Deep Mind is also using Torch in a big way (largely because my former student and Torch-co-maintainer Koray Kavukcuoglu sold them on it). Since the Deep Mind acquisition, folks in the Google Brain group in Mountain View have also started to use it.

Facebook, NYU, and Google/Deep Mind all have custom CUDA back-ends for fast/parallel convolutional network training. Some of this code is not (yet) part of the public distribution.

You could say that Torch is the direct heir of Lush, though the maintainers are different.

Lush was mostly maintained by Leon Bottou and me. Ralf Juengling took over the development of Lush2 a few years ago.

Torch is maintained by Ronan Collobert (IDIAP), Koray Kavukcuoglu (Deep Mind; former PhD student of mine) and Clément Farabet (running his own startup; also a former PhD student of mine). We have used Torch as the main research platform in my NYU lab for quite a while.

Here is a tutorial, with code/scripts for ConvNets.

Also, the wonderful Torch7 Cheatsheet.

Torch7 is what is being used for deep learning R&D at NYU, at Facebook AI Research, at Deep Mind, and at Google Brain.

The future

Deep learning has become the dominant method for acoustic modeling in speech recognition, and is quickly becoming the dominant method for several vision tasks such as object recognition, object detection, and semantic segmentation.

The next frontiers for deep learning are language understanding, video, and control/planning (e.g. for robotics or dialog systems).

Integrating deep learning (or representation learning) with reasoning and making unsupervised learning actually work are two big challenges for the next several years.

The direction of history is that the more data we get, the more our methods rely on learning. Ultimately, the task uses learning end to end. That’s what happened for speech, handwriting, and object recognition. It’s bound to happen for NLP.

Natural language processing is the next frontier for deep learning. There is a lot of research activity in that space right now.

There is a lot of interesting work on neural language models and recurrent nets from Yoshua Bengio, Tomas Mikolov, Antoine Bordes and others.

What are some of the important problems in the field of AI/ML that need to be solved within the next 5-10 years?

Learning with temporal/sequential signals: language, video, speech.

Marrying deep/representation learning with reasoning or structured prediction.

What do you think are the biggest applications machine learning will see in the coming decade?

Natural language understanding and natural dialog systems. Self-driving cars. Robots (maintenance robots and such).


That’s all, folks. If you want more, there’s the NYU Course on Big Data, Large Scale Machine Learning with Yann and John Langford as instructors and materials for the 2014 Deep Learning Course, including some videos.

06 Apr 14:27

Tech-art making class with Kal Spelletich (San Francisco)

by David Pescovitz

Ingenious tech/robot artist Kal Spelletich of Seemen and Survival Research Labs fame is teaching a maker class in San Francisco on creating art involving technology! It sounds fantastic -- a rare opportunity to learn directly from a master of this genre that blends art, science, engineering, cultural criticism, and high weirdness. (Above, a two-minute video survey of Kal's storied career.) Kal says, "We will explore: building installations, carpentry, home-brewing, guerilla gardening, electric wiring, robotics, fire-making, fixing things, plumbing, pnu-matics, pumps, water purification, high-voltage electricity, video surveillance, electronic interfaces, scavenging for materials, cooking alternatives, solar power, skinning a rabbit, lighting, remote control systems, survivalist contemporary art history, and promoting and exhibiting your art.." Kal Spelletich: Research & Survival in the Arts Class

16 Mar 18:04

Homework is eating American schoolkids and their families

by Cory Doctorow

Here's a report from the front lines of the neoliberal educational world*, where homework has consumed the lives of children and their families without regard to whether it is improving their educational outcomes. The average California kid in a recent study was doing 3.1 hours' worth of homework per night, at the expense of sleep, time for family and friends, and activities ranging from grandma's birthday to "everything I used to do."

Ms. Pope suggests asking teachers and schools to provide homework packets that a student can spread out over a week, rather than springing large assignments due tomorrow that can derail family plans. Schools and teachers can also help by building in time for students to get started on homework and ask any questions they might have.

Looking at the larger picture, she said, things are changing. “These students are already averaging an hour more than what’s thought to be useful,” she said, and teachers, schools and parents are beginning to think harder about what kinds of homework, and how much of it, enhance learning and motivation without becoming all-consuming.

It might be easier than you think to start the conversation at your student’s school. “Load doesn’t equal rigor,” Ms. Pope said. “There are other developmental things students need to be doing after school, and other things they need to be learning.”

*The school is a business that produces educated children as products. The teachers are employees. The administrators are managers. The government is the board of directors. The tax-payers are the shareholders. School-businesses must be "accountable," which means producing quarterly reports in which numbers -- test scores, attendance -- go up, regardless of whether that reflects any underlying educational merit.

Homework’s Emotional Toll on Students and Families [KJ Dell'Antonia/NYT]

(via Sean Bonner)

(Image: Homework never ends, a Creative Commons Attribution (2.0) image from worldbeyondalens's photostream)


07 Mar 15:06

G.H.Hardy was wrong

G.H.Hardy was wrong:

Thank goodness someone with sense and mathematical credentials (W.W.Sawyer) has put the ghastly A Mathematician’s Apology to bed.

That Hardy was a very great mathematician is beyond question…. However, when any person eminent in some field makes statements outside that field, it is legitimate to consider the validity of these statements….

Hardy writes

I hate ‘teaching’….I love lecturing, and have lectured a great deal to extremely able classes. [2.]

Here lecturing means imparting mathematical knowledge to those able to understand it with little or no difficulty; teaching means giving time and effort to make it accessible to those who require assistance…. Good [management] consists in appreciating the merits of a wide variety of individuals and combining them into an effective team. [I]t is precisely this appreciation that Hardy lacks. He makes the extraordinary statement

Most people can do nothing at all well. [3.]

…[H]e regards you as doing well only if you are one of the ten best in the world at this particular activity…. [T]hat very few people do anything well is [then] an [obvious] consequence.

However in life we continually depend on the co-operation of men and women far below this exacting standard….

[E]ven … the … process that links the great mathematicians of one generation to those of the next [depends on them]. There may of course be direct contact, as when Riemann [studied] … under Gauss. But the fact that Gauss was able to reach university at all was due to two teachers, Buttner … and Bartels….[4.]

In science the importance of the expositor is perhaps as great as that of the discoverer. Mendel’s work in genetics remained unknown for many years because there was no one to publicize it and fight for it as Huxley did for Darwin.

He makes this curiously objective division of mankind into minds that are first-class, second class and so on…. There is no part of this that should be accepted as sound advice. If there is something you think worth doing, that you are able to do, that you have the opportunity to do, and that you enjoy doing, wisdom lies in getting on with it, and not giving a second’s thought to what ordinal number attaches to you in some system of intellectual snobbery. As for concern with the self, you are both happiest and most effective when you are so absorbed in what you are doing that for a while you forget the limited being that is actually performing it.

31 Aug 07:38

Having it all: Bill Watterson's words grace new cartoon

by Tim Carmody

Bill Watterson famously quit cartooning after ten years during which his Calvin and Hobbes was a critical and commercial success you could only compare to Charles Schulz's Peanuts. (Gary Larson, GB Trudeau, and Berkeley Breathed are great, but come on.)

You could say Watterson retreated from public view after his retirement, but he was rarely available to the public even during the height of his fame. One exception was his 1990 commencement speech to his alma mater Kenyon College.

Thoreau said, "the mass of men lead lives of quiet desperation." That's one of those dumb cocktail quotations that will strike fear in your heart as you get older. Actually, I was leading a life of loud desperation.

When it seemed I would be writing about "Midnite Madness Sale-abrations" for the rest of my life, a friend used to console me that cream always rises to the top. I used to think, so do people who throw themselves into the sea.

I tell you all this because it's worth recognizing that there is no such thing as an overnight success. You will do well to cultivate the resources in yourself that bring you happiness outside of success or failure. The truth is, most of us discover where we are headed when we arrive. At that time, we turn around and say, yes, this is obviously where I was going all along.

Zen Pencils cartoonist Gavin Aung Than took a series of quotes from Watterson's speech and illustrated them, consciously imitating Watterson's style for a new inspirational cartoon titled "Bill Watterson: A Cartoonist's Advice." It features a cartoonist who (like Watterson) gives up a commercial illustration job to embrace his artistic dreams and raise a family as a stay-at-home dad. Although the events and much of the scenery are inspired in part by Watterson's story, first as a young illustrator and later as a popular cartoonist who refused to compromise, Gavin writes:

The comic is basically the story of my life, except I'm a stay-at-home-dad to two dogs. My ex-boss even asked me if I wanted to return to my old job.

My original dream was to become a successful newspaper comic strip artist and create the next Calvin and Hobbes. That job almost doesn't exist anymore as newspapers continue to disappear and the comics section gets smaller and smaller, often getting squeezed out of newspapers entirely. I spent years sending submissions to syndicates in my early 20s and still have the rejection letters somewhere. I eventually realised it was a fool's dream (also, my work was nowhere near good enough) and decided webcomics was the place to be. It's mouth-watering to imagine what Watterson could achieve with webcomics, given the infinite possibilities of the online medium.

See also Robert Krulwich's remarkable commencement speech at Berkeley about horizontal loyalty and refusing to wait, which seems to dovetail well here.

Alyssa Rosenberg writes about Watterson's speech and Gavin's accompanying cartoon's implications for feminism, especially arguments over balancing life and work:

"A person who abandons a career in order to stay home and raise children," Watterson noted, "is considered not to be living up to his potential." I'm sure that choice of pronouns is deliberate. ... [The cartoon] is a powerful alternate vision of what it might look like to have it all.
Tags: Alyssa Rosenberg   Bill Watterson   Calvin and Hobbes   cartoons   comic strips   commencement speeches   feminism   Gavin Aung Than   Kenyon college   work
25 Aug 23:21

Sex in movies is sexy

by Jason Kottke

Josh Gondelman wants to make love to you like in the movies.

Everything that happens will be sexy. There won't be any gross sounds or sights. Just like in the movies, our sex will be tasteless and odorless. I will not kiss your neck and get a mouthful of perfume and then you're like what's wrong and I'll be like nothing and you'll get all distant and I'll be like sorry it's the taste of your perfume, and you'll be sad because you only wore it because I said I liked it one time and then all of a sudden you're not in the mood and I think about sneaking off to the bathroom to furtively masturbate but I don't and I just hold you limply until you fall asleep then I check Twitter for like an hour. That doesn't happen.

Tags: Josh Gondelman   movies   sex