Shared posts

06 Oct 05:47

Gas Pump Skimmers

by brandizzi


What’s the first thing that should scare you? There are three of them. And the next? The label ‘46’. They’ve got so many in the field that criminals need to number them just to keep track.

Gas Pump Skimmers with Bluetooth

This is not the first or the second time SparkFun has dealt with credit card skimmers. The difference is that this time the local government agency politely asked for help, and we’re always down for trying to put a stop to bad actors.

Skimmer IC labels

We were given three skimmers found installed in gas pumps, with the request that we try to get the data off the boards so that the agents could notify those whose credit cards had been compromised and get them new cards. Not great, but it’s a start. The second task: could we build a jig or system so that they can more easily poke at these devices in the future? We were able to accomplish both, and we also built an app that detects known skimmers in the area. You can get the free Android app from Google Play by searching for ‘Skimmer Scanner’ from SparkX.

For those who don’t want to read through the gritty details here’s the summary:

  1. These skimmers are cheap and are becoming more common and more of a nuisance across North America.
  2. The skimmer broadcasts over Bluetooth as HC-05 with a password of 1234. If you’re at a gas pump, scan for Bluetooth devices, and see an HC-05 listed as an available connection, you probably don’t want to use that pump.
  3. The Bluetooth module used on these skimmers is extremely common and is used on all sorts of legitimate products and educational kits. If you detect one in the field, you can confirm that it is a skimmer (and not some other device) by sending the character ‘P’ to the module over a terminal. If you get an ‘M’ in response, then you have likely found a skimmer and you should contact your local authorities.
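The challenge-and-response check in point 3 boils down to a few lines of code. Below is a hedged sketch, not the Skimmer Scanner source: it assumes the module’s SPP link has already been paired (password 1234) and bound to some serial-style transport, which is passed in as any object with `write()`/`read()` (pairing and port setup are platform-specific and not shown).

```python
def looks_like_skimmer(link) -> bool:
    """Send the 'P' challenge described above; an 'M' reply is the tell.

    `link` is any object with write()/read() -- in the field this would
    be a serial handle on the module's SPP connection.
    """
    link.write(b"P")
    return link.read(1) == b"M"


class FakeLink:
    """Stand-in transport that answers the way these skimmers do."""
    def write(self, data):
        self.last = data
    def read(self, n):
        return b"M" if self.last == b"P" else b""
```

With `FakeLink`, `looks_like_skimmer(FakeLink())` returns `True`; against a real pump you would of course report it rather than keep probing.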

How the Skimmer Scanner App Works


The Skimmer Scanner App

The Skimmer Scanner is a free, open source app that detects the common Bluetooth-based credit card skimmers predominantly found in gas pumps. The app scans for available Bluetooth connections, looking for a device with the title HC-05. If one is found, the app attempts to connect using the default password of 1234. Once connected, the letter ‘P’ is sent. If a response of ‘M’ is received, there is a very high likelihood there is a skimmer within Bluetooth range of your phone (5 to 15 feet).

Skimmer Scanner is free, open source, and currently available for Android. The source is available here. The app does not obtain or download data from a given skimmer nor does it report any information to local authorities.

If you detect one in the field let us know! We’d love to hear about it.

Do Something About It

These skimmers are most scary because there is no one being held responsible or tasked with prevention. If your credit card number is stolen you simply contact the provider and they will (usually) refund any fraudulent charges and send you a new card. In turn, the credit card companies simply do a charge back to the merchant where the fraudulent charges took place (taking the money from the merchant and refunding it to the customer whose card has been stolen). Gas stations rarely have alarms or indicators on the pumps so it’s unclear if they ever know the pumps have been opened. And the fuel pump manufacturers have no incentive to install digital or audible alarms on the pumps (that costs money).

Reader Anthony David Adams informed us who really gets charged in these situations. You can read his response here.

Are you angry that your card has been stolen, again? Contact your local congressperson or senator and ask them to pass legislation that fines gas stations $100 for every card discovered on a skimmer in one of their pumps. It’s ultimately up to the gas stations and pump manufacturers to secure their pumps.

How a Gas Pump Skimmer Works

External gas pump

Front of a US Fuel Pump complete with extremely difficult to source security seal

Essentially, the perpetrator opens a pump using one of a few master keys, unplugs the credit card reader from the main pump controller, plugs the card reader into the skimmer and plugs the skimmer back into the pump controller. This reportedly takes less than 30 seconds.

A skimmer is basically a man-in-the-middle attack. The skimmer listens to all the serial traffic from the credit card reader (clear text at 9600bps), records it to an external piece of memory (flash in this case), and then passes that same serial traffic on to the pump controller. When you use one of these modified pumps the pump controller charges your card and you’re none the wiser, but your credit card details are stored in memory.
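The man-in-the-middle behavior reduces to a tiny loop: copy every byte from the reader to the controller, keeping a copy for yourself. Here is a minimal sketch over abstract byte streams; the stream names are illustrative, since the real device does this in PIC firmware against a UART and SPI flash.

```python
import io

def relay_and_log(reader, controller, log: bytearray) -> None:
    """Pass-through skim: forward each byte from the card reader to the
    pump controller unchanged, appending a copy to `log` (the stand-in
    for the skimmer's SPI flash)."""
    while True:
        b = reader.read(1)
        if not b:                 # reader went quiet
            break
        log.extend(b)             # skim a copy...
        controller.write(b)       # ...and forward it so the sale completes

# Demo with in-memory streams standing in for the two serial links.
track = b";374328830305879=200912211090924100000?"
reader = io.BytesIO(track)
controller = io.BytesIO()
log = bytearray()
relay_and_log(reader, controller, log)
# the controller sees exactly what the reader sent; log holds the copy
```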


Hours or days later, the perpetrator returns to the gas station and connects over Bluetooth to the compromised pump. Once connected, the skimmer sends the contents of the EEPROM (all the recently recorded credit card numbers) over the air to the perpetrator’s cell phone or laptop, where it’s logged.

Skimmer Design

Let’s dig into how these skimmers are designed…

Gas pump skimmer ICs

This type of skimmer seems to be very common. A quick image search shows this model all over North America.

The setup is very straightforward, but with some odd design choices. The PIC18F4550 communicates with the Bluetooth module over serial. The PIC also talks to an SPI flash chip. Signals (serial characters) from the credit card reader are recorded by the PIC to the SPI EEPROM. When a cell phone or tablet connects to the Bluetooth module, a serial connection (called Serial Port Profile, or SPP) is created. Whatever serial characters the cell phone sends get passed to the PIC. For example, when the character ‘?’ is sent from our Bluetooth-enabled tablet to the skimmer, the skimmer responds with the character ‘1’.

Front of the Skimmer

Skimmer Components

Front of the Skimmer

The front of the board is composed of a PIC18F4550 microcontroller, an SPI EEPROM (part # M25P16, a 16Mbit flash memory; datasheet), and a standard LM1117 3.3V regulator.


Various pins toned out with assumed use

To get into some gritty details:

  • R1 and R2 look to be a voltage divider when needed (R2 is not populated) to drop the voltage of the signals coming from the credit card reader. I presume the reader is outputting 12V signals and R1 (1.5K) is there to limit the current into the receive pin thus protecting the PIC from damage.
  • There are three serial pins shown at the top of the picture. From left to right: GND, RX, TX. These seem to be an easy serial connection to the PIC. Perhaps used for bootloading new firmware. These pins connect to the Bluetooth module’s RX and TX pins (respectively) and make it very easy to hook up a logic analyzer to sniff the serial traffic (thanks skimmer designer!)
  • The voltage regulator is very common with a large package to (probably) withstand getting hot when given various input voltages (regulating 12V down to 3.3V can produce a bit of heat)
  • D5 is a status LED
  • J4 is the super common PIC ICSP (in circuit serial programming) header. It’s used to get firmware onto the PIC18F4550
  • C1 and C7 are loading caps to the 8MHz crystal. This makes sense as most PIC 18F series can’t run above 10MHz at 3.3V
  • C2 looks like a 0.1uF decoupling capacitor
  • R15 is a 10k pull-up on the reset (MCLR) line
  • C13 is a large cap on RA4/T0CKI/C1OUT/RCV. This could be a digital I/O, Timer0 external clock input, Comparator 1 output, or external USB transceiver RCV input. None of these have a clear reason to be connected to a large decoupling cap. This pin is not connected to any other part of the circuit.

The main connection to the credit card reader is via the connector labeled ‘1’ through ‘7’, shown on the right with a gray cable installed. It is unclear what the second connector (shown on the left in the image above) is used for. It could serve a variety of purposes, as the PIC pins that are broken out could be used as either inputs or outputs. My guess is that this is the connection to the keypad so that the skimmer can record PINs (for debit cards) when the pump has a compatible keypad.

Rear of the Skimmer


Various pins toned out with assumed use

These modules use an extremely common Bluetooth module called the HC-06. These are roughly $3 per unit and perhaps cheaper in quantity. Bluetooth has gotten shockingly cheap!

HC-06 Bluetooth Module

The HC-06 module

More on the Bluetooth module is below.

Build Quality

Interestingly, across the three units we were given we found three grades of assembly: excellent, good, and trash.


These units look well built

The main PCB assembly of all three units looks to be of reasonable to high quality. The front side of the skimmer (containing the PIC microcontroller) was assembled with standard SMD practices using solder paste, a stencil, and reflow. Judging from the quality of the fillets, it was mass produced.


Decently soldered bluetooth module

The Bluetooth module and the various components on the back side look hand soldered, but by someone who knows how to use flux and how to solder well. The Bluetooth modules were most likely hand soldered to reduce overall manufacturing costs (it basically costs double to stencil and reflow a second side).

Bad Skimmer Soldering

That is some bad assembly right there. The stranded wires are shorting adjacent pins.

The cables and connectors were added by someone else, most likely the perpetrator, and it’s really bad work. On two units the wire stripping and soldering are so poor that the units will probably fail in the field because of shorting between pins.

Skimmers with external connectors

Two units have a 7-pin polarized connector with the tab cut off, possibly because the installers don’t know which way the pump controller will plug in. This is either very amateur (guessing at the orientation is cavalier, since a wrong guess could fry the skimmer, the credit card reader, or both) or they’ve found that the connectors inside different pumps have opposite orientations and they want a unit that works quickly with either. It’s unlikely the pump controller market would gravitate toward the same connector with the same number of pins but two different orientations. So I’m guessing the builder of these units isn’t knowledgeable enough to figure out where pin 1 lives on the polarized connector and just resorts to guess and check: plug it in, does it work? No? Switch it around the other way.


Let’s look a bit closer at each of the main components.



Pinout of the 18F4550

The brain is a PIC18F4550 running at 8MHz and 3.3V. The 3.3V regulator has a single decoupling cap: a very basic configuration, and probably able to handle larger input voltages like 12V without getting too hot.


Connections to the SPI EEPROM

Pinout and PIC connections to the 25P16VP. The labels to the edges indicate how the IC is found wired in circuit. For example Q is wired to MISO/A5 on the PIC.

The IC labeled 25P16VP appears to be a NOR flash memory from ST/Micron with the full part # M25P16. Curiously, this part is being End-Of-Lifed (EOL); the EOL notice, dated December 2016, lists last shipment dates into 2021. Perhaps they are using this part because it is super common and/or super cheap. The EEPROM’s VCC is 3.3V, which explains why the board runs at 3.3V (along with the 3.3V Bluetooth module).


Module recommended connection

Recommended pinout for the HC-06

Above is the recommended wiring of the module showing the various pin connections. Toning out the connections to the module, everything is pretty standard and expected. PIO11 on the skimmer is curiously tied to GND through a 10K resistor. There is an LED on PIO8. Here is a good breakdown of all the various Bluetooth module revisions from the HC-01 company.

Querying the module over the air we found the Bluetooth module has all the default settings in place. The designers of the skimmer never took the time to modify the settings:

  • Baud rate = 9600
  • Connection Password = 1234
  • nl/cr line endings not required
  • AT commands are required to be in uppercase
  • Firmware version = hc01.comV2.0
  • Name = HC-06
  • No parity
  • SLAVE mode
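Those settings can be queried over the module’s AT interface. A rough sketch of a version query follows, written against a generic `write()`/`read()` transport because port setup varies by OS; note the two quirks from the list above: commands must be uppercase and no CR/LF line ending is sent.

```python
def query_version(link, chunk=32) -> str:
    """Ask an HC-06-style module for its firmware version string.

    Per the defaults listed above: 9600bps link, uppercase command,
    and no line ending appended. `link` is any write()/read() object
    (e.g. a serial handle with a short read timeout).
    """
    link.write(b"AT+VERSION")   # stock firmware wants no "\r\n"
    return link.read(chunk).decode(errors="replace")


class FakeModule:
    """Stand-in that replies like the units we examined."""
    def write(self, data):
        self.cmd = data
    def read(self, n):
        return b"hc01.comV2.0" if self.cmd == b"AT+VERSION" else b""
```

Against the skimmers we were given, this query would come back `hc01.comV2.0`.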

Did you get that? All three devices we found have the defaults in place. This means a few things:

  1. The module broadcasts its ID as HC-06 so we can detect them easily
  2. The password is 1234 so we can connect to them easily

Astute readers see where we are headed…

Initially this blew my mind. If I were to design a Bluetooth skimmer, I would program the module to NOT broadcast its ID, I would change the ID to something only I knew, and I would change the password from 1234 (headsmack). I would then create an app that knew my various Bluetooth IDs and connected to them privately (without publicly broadcasting them). But then a few things struck me:

  • These devices are cheap. The BOM (bill of materials) is less than $4. That puts an estimated retail price of this device at $10.
  • While the SMD build quality is professional, the soldering of the ribbon (the gray cable that connects to the credit card reader) is horrendously bad indicating the perpetrator has very little experience with soldering and probably zero experience with electronics.

Years ago it took someone with knowledge and skills to build a credit card skimmer. Now criminals are buying these off the shelf and slapping them together with very little knowledge. It’s basic user design theory: when your customer is not so smart, make it idiot proof so they don’t contact you for support. The designers of this skimmer were smart: it’s better to make these devices easy to connect to than to add a layer of security. What’s the worst that could happen? The device is detected and removed from the pump. Meanwhile, 10 more have been deployed for a total cost of $100.

Characteristics of the Device

We powered the skimmer with 5V from USB. On powerup the main status LED blinks 3 times. The bluetooth LED blinks fast at 4Hz when powered but not connected.

Connecting to the skimmer using bluetooth on a computer is straightforward. We used the default password of 1234 and noticed the bluetooth LED blinks at 2Hz and then off for 2 seconds when connected.

On our setup COM-6 was the bluetooth SPP that became available once we were connected. Sending ? to the skimmer causes it to respond with 1 at 9600bps.

Known Commands

Here’s what we were able to glean from hammering the skimmer with various serial strings.

  • Power the skimmer with 5V to 12V from the gray cable (or 3.3V into VCC)
  • Connect to the HC-06 with password 1234 using a laptop
  • Open a terminal program at 9600bps
  • ‘?’ is a good way to see if the module is responding
  • Numbers don’t do anything on their own
  • Lower case letters do nothing

Identified Commands:

  • ? - returns 1
  • P - returns M
  • D - Waits for 6 characters, then stores them. This was originally ‘123456’ and we thought it was the device ID or password. However, changing this setting does not seem to have any effect.
  • C - Displays the 6 characters stored with D
  • @ - Looks up a memory location. Follow the @ with two more characters in binary form and it returns the credit card data stored at that location. For example, @[00][01] (shift+2, ctrl+shift+2, ctrl+a) returns the 2nd byte stored in EEPROM. See an ASCII table for how to send binary characters with keyboard combinations.
  • % - Waits for a character, then does nothing
  • E - Causes the device to stop responding to serial commands. The Bluetooth connection stays open. Power cycle the device to get it responding again. This could possibly be used to deactivate a device in the field.
  • ~ - Erases the entire SPI flash. This is how to erase all the credit card numbers. The unit blinks the status LED for ~20 seconds (the EEPROM takes time to erase). The unit will buffer any incoming serial characters during the erase (serial interrupts and a buffer are being used).
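For reference, the byte sequences behind those commands are easy to construct programmatically. A small helper for the ‘@’ memory read is sketched below; the big-endian two-byte offset is our inference from the single @[00][01] example, not something confirmed by the firmware.

```python
# Single-byte commands gleaned above.
PING, PROBE, DISABLE, ERASE = b"?", b"P", b"E", b"~"

def read_cmd(offset: int) -> bytes:
    """Build the '@' memory-read command: '@' plus a two-byte offset.

    The @[00][01] example fetched the 2nd stored byte, so offset 1
    maps to '@' followed by 0x00 0x01.
    """
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("offset must fit in two bytes")
    return b"@" + offset.to_bytes(2, "big")
```

Sending `read_cmd(1)` over the SPP link is equivalent to the shift+2, ctrl+shift+2, ctrl+a keystrokes described above.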

A skimmed credit card record looks like this (we’ve altered the card number):

T1 %B374328830305879^YOU/A GIFT FOR            ^20042221330999242?
T2 ;374328830305879=200912211090924100000?

This looks like a direct copy of the serial data we would expect out of a serial card reader. T1 indicates track 1 data, T2 track 2, and so on. The records are stored on the EEPROM in clear text (remember: make it idiot proof for the user). It looks like someone used a gift card to buy gas. Note that this record is 113 characters. Let’s say a record is 256 bytes. With 16Mbit of flash storage, that’s 2MB, or approximately 7,800 credit card records that could be stored on a single device. Yikes.
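The track 2 line follows the standard magnetic-stripe layout (PAN, ‘=’, then a YYMM expiry), so both the capacity arithmetic and a basic parse are a few lines of code. A sketch, with field meanings taken from the common track 2 layout rather than from the skimmer firmware:

```python
import re

def parse_track2(line: str):
    """Split a track 2 record: ';PAN=YYMM...' -> (PAN, 'YY/MM')."""
    m = re.search(r";(\d{13,19})=(\d{2})(\d{2})", line)
    if not m:
        return None
    pan, yy, mm = m.groups()
    return pan, f"{yy}/{mm}"

pan, expiry = parse_track2(";374328830305879=200912211090924100000?")
# pan = '374328830305879', expiry = '20/09'

# Capacity estimate from the text: 16Mbit of flash at 256 bytes/record.
records = (16_000_000 // 8) // 256   # = 7812, i.e. roughly 7,800
```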

On the units we were given we found on average 24 records per device. This seems low. I’m not sure where these devices were located but one would expect at least 24 credit card users per day. This may indicate the perpetrator was regularly visiting the pumps and harvesting the records on a daily basis.

Getting Data Off Skimmer

We are not going to provide an example app that pulls data from the device. If you’re savvy enough to build an app from the information provided in this tutorial you are likely to earn more money using your mad skills for good than evil.

If you’re a law enforcement official you should have physical access to the device. There are a few methods to get the data from the EEPROM.

First, you’ll need an EEPROM reader. We like using a cheap SPI programmer from Amazon. The software for Windows is a bit hard to locate. We had the best luck with version 1.29 from Tosiek Zanakow’s website. We have hosted a copy of the software here as well. Use at your own risk.

Next you’ll need to connect the Programmer to the EEPROM. There are two methods: Using a SOIC clip or hot-air removing the EEPROM. The SOIC clip is the least destructive method but requires dexterity and patience. Hot-air rework is more reliable but requires removing the IC from the board and may create evidence provenance issues.

SOIC Clip Method

alt text

Downward pressure is applied to hold yellow wire against MCLR/GND. At the same time the SOIC clip is held carefully in place. It’s tricky.

We use a SOIC Test Clip to make a temporary electrical connection to the SPI EEPROM. 12" Female to Female jumper wires connect the SOIC Clip to the EEPROM Reader. The female jumper wires connect nicely with the 6 SPI pins on the CH341A EEPROM Programmer. Only 6 wires are needed; the write protect and hold pins can be left floating. We selected the ES25P16 from within the software to read the IC over SPI.

The last thing you need to do is to put the 18F4550 into reset. This is because the CH341A will provide power to the skimmer. Once power is applied the PIC will boot up and attempt to connect with the EEPROM. If you’re trying to read the EEPROM through the clip at the same time you’ll have bus contention. This is bad and will prevent the programmer from correctly reading the EEPROM.

Using a SOIC clip to read an IC

The reset pin MCLR (master clear) on the PIC is exposed on the programming header

To get the PIC into reset you’ll need to either hold or solder a wire from MCLR to GND. This will cause the PIC to stay in reset and allow you to read the EEPROM to a binary file from the CH341A software. Renaming this binary file to a .txt extension will make the information viewable in a normal text editor.

The problem with this method is that the SOIC clip is difficult to keep in place. Things like to move, causing the clip to slip off the EEPROM. The connections are not great, but with a little practice it can be made to work.

Hot-Air Removal

SOIC Breakout board

A SOIC IC on a breakout board

The alternative method is to use hot-air to remove the EEPROM from the skimmer and then solder it to a SOIC breakout board. Solder male headers to this breakout board and you will be able to make solid connections from the EEPROM to the EEPROM Programmer and read the contents. You will not need to hold the PIC in reset for this method to work (because the PIC is no longer connected to the EEPROM).
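Whichever method you use, you end up with a raw binary image of the flash. Since the records are stored in clear text, a simple pattern match pulls them out of the dump; below is a sketch in which the patterns are loose approximations of track 1/track 2 framing, not an exhaustive parser.

```python
import re

def extract_records(dump: bytes):
    """Scan a raw EEPROM image for track-1- and track-2-looking runs."""
    track1 = re.findall(rb"%B\d{13,19}\^[^?]{1,60}\?", dump)
    track2 = re.findall(rb";\d{13,19}=\d{1,40}\?", dump)
    return track1, track2

# Demo against a fragment shaped like the dumps we read off these units.
dump = (b"\xff\xff"
        b"%B374328830305879^YOU/A GIFT FOR            ^20042221330999242?"
        b"\x00;374328830305879=200912211090924100000?\xff")
t1, t2 = extract_records(dump)
# one track 1 record and one track 2 record recovered
```

In practice you would run this over the `.bin` file produced by the CH341A software instead of renaming it to `.txt` and eyeballing it.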


We were able to pull the firmware from the PIC using a PICKit 3. You can get a copy of the HEX file here. The firmware on all three boards was identical. The firmware is curiously small, occupying 0x0000 to 0x07F0 (about 2,000 bytes). I sneeze Arduino sketches that are bigger than that.

Update: A few readers pointed out that the fuse bits have been set to protect the firmware and prevent it from being read out. Decapsulation and fuse clearing would be the next step, but that is beyond our capabilities at the moment.

One unknown: why use a PIC18F4550 at all? It’s more powerful, with far more flash, than is needed. You could do this with a smaller device like an ATtiny or a cheaper device like an ARM Cortex-M0 (SAMD11s are amazing). Perhaps the extra pins are needed for keypad decoding. Perhaps the PIC18F4550 is very low cost wherever these devices are made. Perhaps this device was designed once and never revisited to reduce costs (who cares, when the BOM on this skimmer is already less than $5?).

I am not a hacker, I just play one on TV (poorly). If you are able to decompile the HEX code into assembly and can make some sense of the function of the firmware, please do so. We already know enough about the available commands to erase and disable the device. If you are able to decipher additional functionality or interesting characteristics please let us know!

Serial Injection


The pins of the gray cable have been located but not identified

The 7 pins going to the gray cable that connect to the credit card reader have been toned out to the PIC. We know where they go but we don’t know what the pins do. For example, what does D3 connect to on the credit card reader inside the pump? We expect pin B0 to be the main data input from the card reader because it has a current limiting resistor inline (R1). But without getting access to an actual gas pump we are kind of guessing.

In order to identify the purposes of each of the 7 pins we attempted to send serial into the skimmer as if we were a credit card reader on the gas pump.

We tried sending TTL level serial at many different baud rates, with different strings, on different pins, hoping the skimmer would blindly store this data. Nothing was stored as a credit card record. Perhaps the device is smart enough to look for well formed track data and our tests were not formatted correctly. However, some of the records we obtained from the EEPROMs look like gibberish so we are inclined to believe the skimmer is just recording blindly.

Perhaps the device is expecting RS232 level signals rather than TTL. We tried using a magnetic card reader to send RS232 signals to the device and were not able to get it to store the card data. The magnetic card reader outputs RS232 at 9600bps whereas the gas pump may be operating at a different baud rate.

In the end we were unable to get the skimmer to record our fake data, and thus were unable to determine definitively what each of the pins going to the pump card reader is for. The function of these pins doesn’t really matter; we were just curious.

Resources and Going Further

We hope you enjoyed reading this in-depth tear down of these scary devices. If you’d like to assist us we are curious to answer the following questions:

  1. What do these devices actually cost? We haven’t trolled the dark web far enough to find one.
  2. What types of fuel pumps do the skimmers work on? It’s unclear if this model of skimmer works across the field or if it works only with certain pump types.
  3. What are some other methods for detection and prevention? We brainstormed all sorts of things. In the end, it’s shocking how easy it is to open up a gas pump. The quickest prevention method we could think of was a klaxon attached to a leaf switch set to go off anytime the pump is opened. Provide all pump repair folks with ear protection and the problem of skimmers is solved.

If you liked this article, consider checking out some of our other tutorials:

Serial Terminal Basics

This tutorial will show you how to communicate with your serial devices using a variety of terminal emulator applications.


06 Oct 05:47

A Brain Built From Atomic Switches Can Learn | Quanta Magazine

by brandizzi

Brains, beyond their signature achievements in thinking and problem solving, are paragons of energy efficiency. The human brain’s power consumption resembles that of a 20-watt incandescent lightbulb. In contrast, one of the world’s largest and fastest supercomputers, the K computer in Kobe, Japan, consumes as much as 9.89 megawatts of power — an amount roughly equivalent to the usage of 10,000 households. Yet in 2013, even with that much power, it took the machine 40 minutes to simulate just a single second’s worth of 1 percent of human brain activity.

Now engineering researchers at the California NanoSystems Institute at the University of California, Los Angeles, are hoping to match some of the brain’s computational and energy efficiency with systems that mirror the brain’s structure. They are building a device, perhaps the first one, that is “inspired by the brain to generate the properties that enable the brain to do what it does,” according to Adam Stieg, a research scientist and associate director of the institute, who leads the project with Jim Gimzewski, a professor of chemistry at UCLA.

The device is a far cry from conventional computers, which are based on minute wires imprinted on silicon chips in highly ordered patterns. The current pilot version is a 2-millimeter-by-2-millimeter mesh of silver nanowires connected by artificial synapses. Unlike silicon circuitry, with its geometric precision, this device is messy, like “a highly interconnected plate of noodles,” Stieg said. And instead of being designed, the fine structure of the UCLA device essentially organized itself out of random chemical and electrical processes.

Yet in its complexity, this silver mesh network resembles the brain. The mesh boasts 1 billion artificial synapses per square centimeter, which is within a couple of orders of magnitude of the real thing. The network’s electrical activity also displays a property unique to complex systems like the brain: “criticality,” a state between order and chaos indicative of maximum efficiency.

Moreover, preliminary experiments suggest that this neuromorphic (brainlike) silver wire mesh has great functional potential. It can already perform simple learning and logic operations. It can clean the unwanted noise from received signals, a capability that’s important for voice recognition and similar tasks that challenge conventional computers. And its existence proves the principle that it might be possible one day to build devices that can compute with an energy efficiency close to that of the brain.

These advantages look especially appealing as the limits of miniaturization and efficiency for silicon microprocessors now loom. “Moore’s law is dead, transistors are no longer getting smaller, and [people] are going, ‘Oh, my God, what do we do now?’” said Alex Nugent, CEO of the Santa Fe-based neuromorphic computing company Knowm, who was not involved in the UCLA project. “I’m very excited about the idea, the direction of their work,” Nugent said. “Traditional computing platforms are a billion times less efficient.”

Switches That Act Like Synapses

Energy efficiency wasn’t Gimzewski’s motivation when he started the silver wire project 10 years ago. Rather, it was boredom. After using scanning tunneling microscopes to look at electronics at the atomic scale for 20 years, he said, “I was tired of perfection and precise control [and] got a little bored with reductionism.”

In 2007, he accepted an invitation to study single atomic switches developed by a group that Masakazu Aono led at the International Center for Materials Nanoarchitectonics in Tsukuba, Japan. The switches contain the same ingredient that turns a silver spoon black when it touches an egg: silver sulfide, sandwiched between solid metallic silver.

Applying voltage to the devices pushes positively charged silver ions out of the silver sulfide and toward the silver cathode layer, where they are reduced to metallic silver. Atom-wide filaments of silver grow, eventually closing the gap between the metallic silver sides. As a result, the switch is on and current can flow. Reversing the current flow has the opposite effect: The silver bridges shrink, and the switch turns off.

Soon after developing the switch, however, Aono’s group started to see irregular behavior. The more often the switch was used, the more easily it would turn on. If it went unused for a while, it would slowly turn off by itself. In effect, the switch remembered its history. Aono and his colleagues also found that the switches seemed to interact with each other, such that turning on one switch would sometimes inhibit or turn off others nearby.

Most of Aono’s group wanted to engineer these odd properties out of the switches. But Gimzewski and Stieg (who had just finished his doctorate in Gimzewski’s group) were reminded of synapses, the switches between nerve cells in the human brain, which also change their responses with experience and interact with each other. During one of their many visits to Japan, they had an idea. “We thought: Why don’t we try to embed them in a structure reminiscent of the cortex in a mammalian brain [and study that]?” Stieg said.

Building such an intricate structure was a challenge, but Stieg and Audrius Avizienis, who had just joined the group as a graduate student, developed a protocol to do it. By pouring silver nitrate onto tiny copper spheres, they could induce a network of microscopically thin intersecting silver wires to grow. They could then expose the mesh to sulfur gas to create a silver sulfide layer between the silver wires, as in the Aono team’s original atomic switch.

Self-Organized Criticality

When Gimzewski and Stieg told others about their project, almost nobody thought it would work. Some said the device would show one type of static activity and then sit there, Stieg recalled. Others guessed the opposite: “They said the switching would cascade and the whole thing would just burn out,” Gimzewski said.

But the device did not melt. Rather, as Gimzewski and Stieg observed through an infrared camera, the input current kept changing the paths it followed through the device — proof that activity in the network was not localized but rather distributed, as it is in the brain.

Then, one fall day in 2010, while Avizienis and his fellow graduate student Henry Sillin were increasing the input voltage to the device, they suddenly saw the output voltage start to fluctuate, seemingly at random, as if the mesh of wires had come alive. “We just sat and watched it, fascinated,” Sillin said.

They knew they were on to something. When Avizienis analyzed several days’ worth of monitoring data, he found that the network stayed at the same activity level for short periods more often than for long periods. They later found that smaller areas of activity were more common than larger ones.

“That was really jaw-dropping,” Avizienis said, describing it as “the first [time] we pulled a power law out of this.” Power laws describe mathematical relationships in which one variable changes as a power of the other. They apply to systems in which larger scale, longer events are much less common than smaller scale, shorter ones — but are also still far more common than one would expect from a chance distribution. Per Bak, the Danish physicist who died in 2002, first proposed power laws as hallmarks of all kinds of complex dynamical systems that can organize over large timescales and long distances. Power-law behavior, he said, indicates that a complex system operates at a dynamical sweet spot between order and chaos, a state of “criticality” in which all parts are interacting and connected for maximum efficiency.
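Concretely, “one variable changes as a power of the other” means the data fall on a straight line in log-log space, with the exponent as the slope. A minimal numerical illustration follows; the data are synthetic and have nothing to do with the UCLA measurements.

```python
import math

# Synthetic event sizes following y = x^(-2): large events are rare.
xs = [2 ** k for k in range(1, 11)]
ys = [x ** -2.0 for x in xs]

# In log-log space a power law is a straight line; fit its slope.
lx = [math.log(x) for x in xs]
ly = [math.log(y) for y in ys]
n = len(xs)
mx, my = sum(lx) / n, sum(ly) / n
slope = (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
         / sum((a - mx) ** 2 for a in lx))
# the fitted slope recovers the exponent, -2
```

Real cascade data are noisier, but the same log-log fit is how a power law (and hence criticality) is identified in practice.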

As Bak predicted, power-law behavior has been observed in the human brain: In 2003, Dietmar Plenz, a neuroscientist with the National Institutes of Health, observed that groups of nerve cells activated others, which in turn activated others, often forming systemwide activation cascades. Plenz found that the sizes of these cascades fell along a power-law distribution, and that the brain was indeed operating in a way that maximized activity propagation without risking runaway activity.

The fact that the UCLA device also shows power-law behavior is a big deal, Plenz said, because it suggests that, as in the brain, a delicate balance between activation and inhibition keeps all of its parts interacting with one another. The activity doesn’t overwhelm the network, but it also doesn’t die out.

Gimzewski and Stieg later found an additional similarity between the silver network and the brain: Just as a sleeping human brain shows fewer short activation cascades than a brain that’s awake, brief activation states in the silver network become less common at lower energy inputs. In a way, then, reducing the energy input into the device can generate a state that resembles the sleeping state of the human brain.

Training and Reservoir Computing

But even if the silver wire network has brainlike properties, can it solve computing tasks? Preliminary experiments suggest the answer is yes, although the device is far from resembling a traditional computer.

For one thing, there is no software. Instead, the researchers exploit the fact that the network can distort an input signal in many different ways, depending on where the output is measured. This suggests possible uses for voice or image recognition, because the device should be able to clean a noisy input signal.

But it also suggests that the device could be used for a process called reservoir computing. Because one input could in principle generate many, perhaps millions, of different outputs (the “reservoir”), users can choose or combine outputs in such a way that the result is a desired computation of the inputs. For example, if you stimulate the device at two different places at the same time, chances are that one of the millions of different outputs will represent the sum of the two inputs.
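As a hedged illustration of that "select, don't program" idea (nothing here models the silver network itself; the reservoir is just a small random recurrent net, and all sizes and names are invented), a linear readout can be trained to pull the sum of two inputs out of a fixed, untrained tangle:

```python
import numpy as np

rng = np.random.default_rng(0)

# A fixed random recurrent net stands in for the "reservoir."
# Its weights are never trained; only the linear readout is.
N = 200
W_in = rng.normal(0.0, 1.0, (N, 2))                   # two input channels
W = rng.normal(0.0, 1.0, (N, N)) * 0.4 / np.sqrt(N)   # weak, contracting recurrence

def reservoir_state(u, steps=20):
    """Drive the reservoir with a constant 2-d input and return its settled state."""
    x = np.zeros(N)
    for _ in range(steps):
        x = np.tanh(W @ x + W_in @ u)
    return x

# Training set: random input pairs, with their sum as the target computation.
U = rng.uniform(-1.0, 1.0, (500, 2))
X = np.array([reservoir_state(u) for u in U])
y = U.sum(axis=1)

# Ridge-regression readout: choose the combination of the reservoir's many
# outputs that best reproduces the sum of the inputs.
ridge = 1e-6
w_out = np.linalg.solve(X.T @ X + ridge * np.eye(N), X.T @ y)

# The trained readout now computes the sum of two fresh inputs.
pred = reservoir_state(np.array([0.3, -0.1])) @ w_out
```

The reservoir's dynamics are never touched; training only picks out which mixture of its outputs happens to perform the computation, which is the essence of reservoir computing.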

The challenge is to find the right outputs, decode them, and work out how best to encode information so that the network can understand it. The way to do this is to train the device: run a task hundreds or perhaps thousands of times, first with one type of input and then with another, and compare which output best solves the task. “We don’t program the device but we select the best way to encode the information such that the [network behaves] in an interesting and useful manner,” Gimzewski said.

In work that’s soon to be published, the researchers trained the wire network to execute simple logic operations. And in unpublished experiments, they trained the network to solve the equivalent of a simple memory task taught to lab rats called a T-maze test. In the test, a rat in a T-shaped maze is rewarded when it learns to make the correct turn in response to a light. With its own version of training, the network could make the correct response 94 percent of the time.

So far, these results aren’t much more than a proof of principle, Nugent said. “A little rat making a decision in a T-maze is nowhere close to what somebody in machine learning does to evaluate their systems” on a traditional computer, he said. He doubts the device will lead to a chip that does much that’s useful in the next few years.

But the potential, he emphasized, is huge. That’s because the network, like the brain, doesn’t separate processing and memory. Traditional computers need to shuttle information between different areas that handle the two functions. “All that extra communication adds up because it takes energy to charge wires,” Nugent said. With traditional machines, he said, “literally, you could run France on the electricity that it would take to simulate a full human brain at moderate resolution.” If devices like the silver wire network can eventually solve tasks as effectively as machine-learning algorithms running on traditional computers, they could do so using only one-billionth as much power. “As soon as they do that, they’re going to win in power efficiency, hands down,” Nugent said.

The UCLA findings also lend support to the view that under the right circumstances, intelligent systems can form by self-organization, without the need for any template or process to design them. The silver network “emerged spontaneously,” said Todd Hylton, the former manager of the Defense Advanced Research Projects Agency program that supported early stages of the project. “As energy flows through [it], it’s this big dance because every time one new structure forms, the energy doesn’t go somewhere else. People have built computer models of networks that achieve some critical state. But this one just sort of did it all by itself.”

Gimzewski believes that the silver wire network or devices like it might be better than traditional computers at making predictions about complex processes. Traditional computers model the world with equations that often only approximate complex phenomena. Neuromorphic atomic switch networks align their own innate structural complexity with that of the phenomenon they are modeling. They are also inherently fast — the state of the network can fluctuate at upward of tens of thousands of changes per second. “We are using a complex system to understand complex phenomena,” Gimzewski said.

Earlier this year at a meeting of the American Chemical Society in San Francisco, Gimzewski, Stieg and their colleagues presented the results of an experiment in which they fed the device the first three years of a six-year data set of car traffic in Los Angeles, in the form of a series of pulses that indicated the number of cars passing by per hour. After hundreds of training runs, the output eventually predicted the statistical trend of the second half of the data set quite well, even though the device had never seen it.

Perhaps one day, Gimzewski jokes, he might be able to use the network to predict the stock market. “I’d like that,” he said, adding that this was why he was trying to get his students to study atomic switch networks — “before they catch me making a fortune.”



06 Oct 05:47

Is AI Riding a One-Trick Pony?

by brandizzi

I’m standing in what is soon to be the center of the world, or is perhaps just a very large room on the seventh floor of a gleaming tower in downtown Toronto. Showing me around is Jordan Jacobs, who cofounded this place: the nascent Vector Institute, which opens its doors this fall and which is aiming to become the global epicenter of artificial intelligence.

We’re in Toronto because Geoffrey Hinton is in Toronto, and Geoffrey Hinton is the father of “deep learning,” the technique behind the current excitement about AI. “In 30 years we’re going to look back and say Geoff is Einstein—of AI, deep learning, the thing that we’re calling AI,” Jacobs says. Of the researchers at the top of the field of deep learning, Hinton has more citations than the next three combined. His students and postdocs have gone on to run the AI labs at Apple, Facebook, and OpenAI; Hinton himself is a lead scientist on the Google Brain AI team. In fact, nearly every achievement in the last decade of AI—in translation, speech recognition, image recognition, and game playing—traces in some way back to Hinton’s work.

The Vector Institute, this monument to the ascent of Hinton’s ideas, is a research center where companies from around the U.S. and Canada—like Google, and Uber, and Nvidia—will sponsor efforts to commercialize AI technologies. Money has poured in faster than Jacobs could ask for it; two of his cofounders surveyed companies in the Toronto area, and the demand for AI experts ended up being 10 times what Canada produces every year. Vector is in a sense ground zero for the now-worldwide attempt to mobilize around deep learning: to cash in on the technique, to teach it, to refine and apply it. Data centers are being built, towers are being filled with startups, a whole generation of students is going into the field.

The impression you get standing on the Vector floor, bare and echoey and about to be filled, is that you’re at the beginning of something. But the peculiar thing about deep learning is just how old its key ideas are. Hinton’s breakthrough paper, with colleagues David Rumelhart and Ronald Williams, was published in 1986. The paper elaborated on a technique called backpropagation, or backprop for short. Backprop, in the words of Jon Cohen, a computational psychologist at Princeton, is “what all of deep learning is based on—literally everything.”

When you boil it down, AI today is deep learning, and deep learning is backprop—which is amazing, considering that backprop is more than 30 years old. It’s worth understanding how that happened—how a technique could lie in wait for so long and then cause such an explosion—because once you understand the story of backprop, you’ll start to understand the current moment in AI, and in particular the fact that maybe we’re not actually at the beginning of a revolution. Maybe we’re at the end of one.


The walk from the Vector Institute to Hinton’s office at Google, where he spends most of his time (he is now an emeritus professor at the University of Toronto), is a kind of living advertisement for the city, at least in the summertime. You can understand why Hinton, who is originally from the U.K., moved here in the 1980s after working at Carnegie Mellon University in Pittsburgh.

When you step outside, even downtown near the financial district, you feel as though you’ve actually gone into nature. It’s the smell, I think: wet loam in the air. Toronto was built on top of forested ravines, and it’s said to be “a city within a park”; as it’s been urbanized, the local government has set strict restrictions to maintain the tree canopy. As you’re flying in, the outer parts of the city look almost cartoonishly lush.

Maybe we’re not actually at the beginning of a revolution.

Toronto is the fourth-largest city in North America (after Mexico City, New York, and L.A.), and its most diverse: more than half the population was born outside Canada. You can see that walking around. The crowd in the tech corridor looks less San Francisco—young white guys in hoodies—and more international. There’s free health care and good public schools, the people are friendly, and the political order is relatively left-leaning and stable; and this stuff draws people like Hinton, who says he left the U.S. because of the Iran-Contra affair. It’s one of the first things we talk about when I go to meet him, just before lunch.

“Most people at CMU thought it was perfectly reasonable for the U.S. to invade Nicaragua,” he says. “They somehow thought they owned it.” He tells me that he had a big breakthrough recently on a project: “getting a very good junior engineer who’s working with me,” a woman named Sara Sabour. Sabour is Iranian, and she was refused a visa to work in the United States. Google’s Toronto office scooped her up.

Hinton, who is 69 years old, has the kind, lean, English-looking face of the Big Friendly Giant, with a thin mouth, big ears, and a proud nose. He was born in Wimbledon, England, and sounds, when he talks, like the narrator of a children’s book about science: curious, engaging, eager to explain things. He’s funny, and a bit of a showman. He stands the whole time we talk, because, as it turns out, sitting is too painful. “I sat down in June of 2005 and it was a mistake,” he tells me, letting the bizarre line land before explaining that a disc in his back gives him trouble. It means he can’t fly, and earlier that day he’d had to bring a contraption that looked like a surfboard to the dentist’s office so he could lie on it while having a cracked tooth root examined.

In the 1980s Hinton was, as he is now, an expert on neural networks, a much-simplified model of the network of neurons and synapses in our brains. However, at that time it had been firmly decided that neural networks were a dead end in AI research. Although the earliest neural net, the Perceptron, which began to be developed in the 1950s, had been hailed as a first step toward human-level machine intelligence, a 1969 book by MIT’s Marvin Minsky and Seymour Papert, called Perceptrons, proved mathematically that such networks could perform only the most basic functions. These networks had just two layers of neurons, an input layer and an output layer. Nets with more layers between the input and output neurons could in theory solve a great variety of problems, but nobody knew how to train them, and so in practice they were useless. Except for a few holdouts like Hinton, Perceptrons caused most people to give up on neural nets entirely.

Hinton’s breakthrough, in 1986, was to show that backpropagation could train a deep neural net, meaning one with more than two or three layers. But it took another 26 years before increasing computational power made good on the discovery. A 2012 paper by Hinton and two of his Toronto students showed that deep neural nets, trained using backpropagation, beat state-of-the-art systems in image recognition. “Deep learning” took off. To the outside world, AI seemed to wake up overnight. For Hinton, it was a payoff long overdue.

Reality distortion field

A neural net is usually drawn like a club sandwich, with layers stacked one atop the other. The layers contain artificial neurons, which are dumb little computational units that get excited—the way a real neuron gets excited—and pass that excitement on to the other neurons they’re connected to. A neuron’s excitement is represented by a number, like 0.13 or 32.39, that says just how excited it is. And there’s another crucial number, on each of the connections between two neurons, that determines how much excitement should get passed from one to the other. That number is meant to model the strength of the synapses between neurons in the brain. When the number is higher, it means the connection is stronger, so more of the one’s excitement flows to the other.

A diagram from seminal work on “error propagation” by Hinton, David Rumelhart, and Ronald Williams.

One of the most successful applications of deep neural nets is in image recognition—as in the memorable scene in HBO’s Silicon Valley where the team builds a program that can tell whether there’s a hot dog in a picture. Programs like that actually exist, and they wouldn’t have been possible a decade ago. To get them to work, the first step is to get a picture. Let’s say, for simplicity, it’s a small black-and-white image that’s 100 pixels wide and 100 pixels tall. You feed this image to your neural net by setting the excitement of each simulated neuron in the input layer so that it’s equal to the brightness of each pixel. That’s the bottom layer of the club sandwich: 10,000 neurons (100x100) representing the brightness of every pixel in the image.

You then connect this big layer of neurons to another big layer of neurons above it, say a few thousand, and these in turn to another layer of another few thousand neurons, and so on for a few layers. Finally, in the topmost layer of the sandwich, the output layer, you have just two neurons—one representing “hot dog” and the other representing “not hot dog.” The idea is to teach the neural net to excite only the first of those neurons if there’s a hot dog in the picture, and only the second if there isn’t. Backpropagation—the technique that Hinton has built his career upon—is the method for doing this.

Backprop is remarkably simple, though it works best with huge amounts of data. That’s why big data is so important in AI—why Facebook and Google are so hungry for it, and why the Vector Institute decided to set up shop down the street from four of Canada’s largest hospitals and develop data partnerships with them.

In this case, the data takes the form of millions of pictures, some with hot dogs and some without; the trick is that these pictures are labeled as to which have hot dogs. When you first create your neural net, the connections between neurons might have random weights—random numbers that say how much excitement to pass along each connection. It’s as if the synapses of the brain haven’t been tuned yet. The goal of backprop is to change those weights so that they make the network work: so that when you pass in an image of a hot dog to the lowest layer, the topmost layer’s “hot dog” neuron ends up getting excited.

Suppose you take your first training image, and it’s a picture of a piano. You convert the pixel intensities of the 100x100 picture into 10,000 numbers, one for each neuron in the bottom layer of the network. As the excitement spreads up the network according to the connection strengths between neurons in adjacent layers, it’ll eventually end up in that last layer, the one with the two neurons that say whether there’s a hot dog in the picture. Since the picture is of a piano, ideally the “hot dog” neuron should have a zero on it, while the “not hot dog” neuron should have a high number. But let’s say it doesn’t work out that way. Let’s say the network is wrong about this picture. Backprop is a procedure for rejiggering the strength of every connection in the network so as to fix the error for a given training example.

The way it works is that you start with the last two neurons, and figure out just how wrong they were: how much of a difference is there between what the excitement numbers should have been and what they actually were? When that’s done, you take a look at each of the connections leading into those neurons—the ones in the next lower layer—and figure out their contribution to the error. You keep doing this until you’ve gone all the way to the first set of connections, at the very bottom of the network. At that point you know how much each individual connection contributed to the overall error, and in a final step, you change each of the weights in the direction that best reduces the error overall. The technique is called “backpropagation” because you are “propagating” errors back (or down) through the network, starting from the output.
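A minimal sketch of that procedure (toy sizes, a made-up input, sigmoid neurons, and the textbook delta-rule form of backprop; this is not the exact network the article describes):

```python
import numpy as np

rng = np.random.default_rng(1)

# A tiny 2-layer net: 4 inputs -> 8 hidden neurons -> 2 outputs
# ("hot dog" and "not hot dog").
W1 = rng.normal(0.0, 0.5, (8, 4))
W2 = rng.normal(0.0, 0.5, (2, 8))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, target, lr=0.5):
    global W1, W2
    # Forward pass: excitement spreads up through the layers.
    h = sigmoid(W1 @ x)
    y = sigmoid(W2 @ h)
    # Start at the last two neurons: how wrong were they?
    delta2 = (y - target) * y * (1 - y)
    # Propagate that error back to the hidden layer's connections.
    delta1 = (W2.T @ delta2) * h * (1 - h)
    # Final step: nudge every weight in the direction that reduces the error.
    W2 -= lr * np.outer(delta2, h)
    W1 -= lr * np.outer(delta1, x)
    return float(((y - target) ** 2).sum())

x = np.array([1.0, 0.0, 1.0, 0.0])   # a stand-in for the pixel inputs
target = np.array([1.0, 0.0])        # this "picture" contains a hot dog
errors = [train_step(x, target) for _ in range(200)]
```

After a couple hundred repetitions of the same example, the squared error shrinks toward zero; real training does the same thing across millions of labeled images rather than one.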

The incredible thing is that when you do this with millions or billions of images, the network starts to get pretty good at saying whether an image has a hot dog in it. And what’s even more remarkable is that the individual layers of these image-recognition nets start being able to “see” images in sort of the same way our own visual system does. That is, the first layer might end up detecting edges, in the sense that its neurons get excited when there are edges and don’t get excited when there aren’t; the layer above that one might be able to detect sets of edges, like corners; the layer above that one might start to see shapes; and the layer above that one might start finding stuff like “open bun” or “closed bun,” in the sense of having neurons that respond to either case. The net organizes itself, in other words, into hierarchical layers without ever having been explicitly programmed that way.

A real intelligence doesn’t break when you slightly change the problem.

This is the thing that has everybody enthralled. It’s not just that neural nets are good at classifying pictures of hot dogs or whatever: they seem able to build representations of ideas. With text you can see this even more clearly. You can feed the text of Wikipedia, many billions of words long, into a simple neural net, training it to spit out, for each word, a big list of numbers that correspond to the excitement of each neuron in a layer. If you think of each of these numbers as a coordinate in a complex space, then essentially what you’re doing is finding a point, known in this context as a vector, for each word somewhere in that space. Now, train your network in such a way that words appearing near one another on Wikipedia pages end up with similar coordinates, and voilà, something crazy happens: words that have similar meanings start showing up near one another in the space. That is, “insane” and “unhinged” will have coordinates close to each other, as will “three” and “seven,” and so on. What’s more, so-called vector arithmetic makes it possible to, say, subtract the vector for “France” from the vector for “Paris,” add the vector for “Italy,” and end up in the neighborhood of “Rome.” It works without anyone telling the network explicitly that Rome is to Italy as Paris is to France.
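Here is a toy, hand-built version of that vector arithmetic (real word vectors are learned from billions of words and have hundreds of dimensions; these two-dimensional ones are contrived so the geometry is easy to see):

```python
import numpy as np

# Invented 2-d "word vectors" for illustration only.
vec = {
    "paris":  np.array([1.0, 1.0]),
    "france": np.array([1.0, 0.0]),
    "rome":   np.array([0.0, 1.1]),
    "italy":  np.array([0.0, 0.1]),
}

def nearest(v, exclude):
    """Return the word whose vector lies closest to v."""
    return min((w for w in vec if w not in exclude),
               key=lambda w: np.linalg.norm(vec[w] - v))

# "Paris" - "France" + "Italy" should land in the neighborhood of "Rome".
v = vec["paris"] - vec["france"] + vec["italy"]
answer = nearest(v, exclude={"paris", "france", "italy"})
```

In a trained model the analogy emerges from co-occurrence statistics alone; here the coordinates are rigged by hand, but the arithmetic of subtracting and adding vectors and finding the nearest neighbor is the same.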

“It’s amazing,” Hinton says. “It’s shocking.” Neural nets can be thought of as trying to take things—images, words, recordings of someone talking, medical data—and put them into what mathematicians call a high-dimensional vector space, where the closeness or distance of the things reflects some important feature of the actual world. Hinton believes this is what the brain itself does. “If you want to know what a thought is,” he says, “I can express it for you in a string of words. I can say ‘John thought, “Whoops.”’ But if you ask, ‘What is the thought? What does it mean for John to have that thought?’ It’s not that inside his head there’s an opening quote, and a ‘Whoops,’ and a closing quote, or even a cleaned-up version of that. Inside his head there’s some big pattern of neural activity.” Big patterns of neural activity, if you’re a mathematician, can be captured in a vector space, with each neuron’s activity corresponding to a number, and each number to a coordinate of a really big vector. In Hinton’s view, that’s what thought is: a dance of vectors.

Geoffrey Hinton

It is no coincidence that Toronto’s flagship AI institution was named for this fact. Hinton was the one who came up with the name Vector Institute.

There’s a sort of reality distortion field that Hinton creates, an air of certainty and enthusiasm, that gives you the feeling there’s nothing that vectors can’t do. After all, look at what they’ve been able to produce already: cars that drive themselves, computers that detect cancer, machines that instantly translate spoken language. And look at this charming British scientist talking about gradient descent in high-dimensional spaces!

It’s only when you leave the room that you remember: these “deep learning” systems are still pretty dumb, in spite of how smart they sometimes seem. A computer that sees a picture of a pile of doughnuts piled up on a table and captions it, automatically, as “a pile of doughnuts piled on a table” seems to understand the world; but when that same program sees a picture of a girl brushing her teeth and says “The boy is holding a baseball bat,” you realize how thin that understanding really is, if ever it was there at all.

Neural nets are just thoughtless fuzzy pattern recognizers, and as useful as fuzzy pattern recognizers can be—hence the rush to integrate them into just about every kind of software—they represent, at best, a limited brand of intelligence, one that is easily fooled. A deep neural net that recognizes images can be totally stymied when you change a single pixel, or add visual noise that’s imperceptible to a human. Indeed, almost as often as we’re finding new ways to apply deep learning, we’re finding more of its limits. Self-driving cars can fail to navigate conditions they’ve never seen before. Machines have trouble parsing sentences that demand common-sense understanding of how the world works.

Deep learning in some ways mimics what goes on in the human brain, but only in a shallow way—which perhaps explains why its intelligence can sometimes seem so shallow. Indeed, backprop wasn’t discovered by probing deep into the brain, decoding thought itself; it grew out of models of how animals learn by trial and error in old classical-conditioning experiments. And most of the big leaps that came about as it developed didn’t involve some new insight about neuroscience; they were technical improvements, reached by years of mathematics and engineering. What we know about intelligence is nothing against the vastness of what we still don’t know.

David Duvenaud, an assistant professor in the same department as Hinton at the University of Toronto, says deep learning has been somewhat like engineering before physics. “Someone writes a paper and says, ‘I made this bridge and it stood up!’ Another guy has a paper: ‘I made this bridge and it fell down—but then I added pillars, and then it stayed up.’ Then pillars are a hot new thing. Someone comes up with arches, and it’s like, ‘Arches are great!’” With physics, he says, “you can actually understand what’s going to work and why.” Only recently, he says, have we begun to move into that phase of actual understanding with artificial intelligence.

Hinton himself says, “Most conferences consist of making minor variations … as opposed to thinking hard and saying, ‘What is it about what we’re doing now that’s really deficient? What does it have difficulty with? Let’s focus on that.’”

It can be hard to appreciate this from the outside, when all you see is one great advance touted after another. But the latest sweep of progress in AI has been less science than engineering, even tinkering. And though we’ve started to get a better handle on what kinds of changes will improve deep-learning systems, we’re still largely in the dark about how those systems work, or whether they could ever add up to something as powerful as the human mind.

It’s worth asking whether we’ve wrung nearly all we can out of backprop. If so, that might mean a plateau for progress in artificial intelligence.


If you want to see the next big thing, something that could form the basis of machines with a much more flexible intelligence, you should probably check out research that resembles what you would’ve found had you encountered backprop in the ’80s: smart people plugging away on ideas that don’t really work yet.

A few months ago I went to the Center for Minds, Brains, and Machines, a multi-institutional effort headquartered at MIT, to watch a friend of mine, Eyal Dechter, defend his dissertation in cognitive science. Just before the talk started, his wife Amy, their dog Ruby, and their daughter Susannah were milling around, wishing him well. On the screen was a picture of Ruby, and next to it one of Susannah as a baby. When Dad asked Susannah to point herself out, she happily slapped a long retractable pointer against her own baby picture. On the way out of the room, she wheeled a toy stroller behind her mom and yelled “Good luck, Daddy!” over her shoulder. “Vámanos!” she said finally. She’s two.

“The fact that it doesn’t work is just a temporary annoyance.”

Eyal started his talk with a beguiling question: How is it that Susannah, after two years of experience, can learn to talk, to play, to follow stories? What is it about the human brain that makes it learn so well? Will a computer ever be able to learn so quickly and so fluidly?

We make sense of new phenomena in terms of things we already understand. We break a domain down into pieces and learn the pieces. Eyal is a mathematician and computer programmer, and he thinks about tasks—like making a soufflé—as really complex computer programs. But it’s not as if you learn to make a soufflé by learning every one of the program’s zillion micro-instructions, like “Rotate your elbow 30 degrees, then look down at the countertop, then extend your pointer finger, then …” If you had to do that for every new task, learning would be too hard, and you’d be stuck with what you already know. Instead, we cast the program in terms of high-level steps, like “Whip the egg whites,” which are themselves composed of subprograms, like “Crack the eggs” and “Separate out the yolks.”
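A contrived sketch of that compositional picture (all step names invented for illustration): a high-level program calls named, reusable subprograms rather than spelling out micro-instructions.

```python
# Each high-level step is a small, named, reusable program.
def crack_eggs():
    return ["crack eggs"]

def separate_yolks():
    return ["separate yolks"]

def whip_egg_whites():
    # A step is itself composed of subprograms, not micro-instructions.
    return crack_eggs() + separate_yolks() + ["whip whites"]

def make_souffle():
    return whip_egg_whites() + ["fold in base", "bake"]

plan = make_souffle()
```

Here a programmer wrote the decomposition by hand; the research question is how a machine could discover such a library of reusable pieces on its own.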

Computers don’t do this, and that is a big part of the reason they’re dumb. To get a deep-learning system to recognize a hot dog, you might have to feed it 40 million pictures of hot dogs. To get Susannah to recognize a hot dog, you show her a hot dog. And before long she’ll have an understanding of language that goes deeper than recognizing that certain words often appear together. Unlike a computer, she’ll have a model in her mind about how the whole world works. “It’s sort of incredible to me that people are scared of computers taking jobs,” Eyal says. “It’s not that computers can’t replace lawyers because lawyers do really complicated things. It’s because lawyers read and talk to people. It’s not like we’re close. We’re so far.”

A real intelligence doesn’t break when you slightly change the requirements of the problem it’s trying to solve. And the key part of Eyal’s thesis was his demonstration, in principle, of how you might get a computer to work that way: to fluidly apply what it already knows to new tasks, to quickly bootstrap its way from knowing almost nothing about a new domain to being an expert.

Essentially, it is a procedure he calls the “exploration–compression” algorithm. It gets a computer to function somewhat like a programmer who builds up a library of reusable, modular components on the way to building more and more complex programs. Without being told anything about a new domain, the computer tries to structure knowledge about it just by playing around, consolidating what it’s found, and playing around some more, the way a human child does.

His advisor, Joshua Tenenbaum, is one of the most highly cited researchers in AI. Tenenbaum’s name came up in half the conversations I had with other scientists. Some of the key people at DeepMind—the team behind AlphaGo, which shocked computer scientists by beating a world champion player in the complex game of Go in 2016—had worked as his postdocs. He’s involved with a startup that’s trying to give self-driving cars some intuition about basic physics and other drivers’ intentions, so they can better anticipate what would happen in a situation they’ve never seen before, like when a truck jackknifes in front of them or when someone tries to merge very aggressively.

Eyal’s thesis doesn’t yet translate into those kinds of practical applications, let alone any programs that would make headlines for besting a human. The problems Eyal’s working on “are just really, really hard,” Tenenbaum said. “It’s gonna take many, many generations.”

Tenenbaum has long, curly, whitening hair, and when we sat down for coffee he had on a button-down shirt with black slacks. He told me he looks to the story of backprop for inspiration. For decades, backprop was cool math that didn’t really accomplish anything. As computers got faster and the engineering got more sophisticated, suddenly it did. He hopes the same thing might happen with his own work and that of his students, “but it might take another couple decades.”

As for Hinton, he is convinced that overcoming AI’s limitations involves building “a bridge between computer science and biology.” Backprop was, in this view, a triumph of biologically inspired computation; the idea initially came not from engineering but from psychology. So now Hinton is trying to pull off a similar trick.

Neural networks today are made of big flat layers, but in the human neocortex real neurons are arranged not just horizontally into layers but vertically into columns. Hinton thinks he knows what the columns are for—in vision, for instance, they’re crucial for our ability to recognize objects even as our viewpoint changes. So he’s building an artificial version—he calls them “capsules”—to test the theory. So far, it hasn’t panned out; the capsules haven’t dramatically improved his nets’ performance. But this was the same situation he’d been in with backprop for nearly 30 years.

“This thing just has to be right,” he says about the capsule theory, laughing at his own boldness. “And the fact that it doesn’t work is just a temporary annoyance.”

James Somers is a writer and programmer based in New York City. His previous article for MIT Technology Review was “Toolkits for the Mind” in May/June 2015, which showed how Internet startups are shaped by the programming languages they use.


06 Oct 05:39

Lord Chronos

by Reza

06 Oct 05:39

Earth Day

Meowth can speak Human Language
06 Oct 05:32

Comic for 2017.09.24

by Kris Wilson
06 Oct 05:31

Real Estate

I tried converting the prices into pizzas, to put it in more familiar terms, and it just became a hard-to-think-about number of pizzas.
06 Oct 05:31

Saturday Morning Breakfast Cereal - Transaction



In a perfect relationship between economists, every time their preferences are slightly violated they make a microtransaction with their partners.

New comic!
Today's News:

In which Jerry Wang posits a way to finally get some utility out of babies:


06 Oct 05:26

New Worlds

by Doug

New Worlds

September is Literacy Month! Read something!

06 Oct 05:24

Saturday Morning Breakfast Cereal - Shirts



The really depressing part is 99% of programmable nanobot behavior will be this sort of thing.

New comic!
Today's News:
06 Oct 05:13

Comic for 2017.10.02

by Rob DenBleyker
06 Oct 05:12

Self Driving

"Crowdsourced steering" doesn't sound quite as appealing as "self driving."
06 Oct 05:09

Comic for 2017.10.03

by Dave McElfatrick
06 Oct 05:08

News Of The World

by Shyam Ramani

Bonus Panel

The post News Of The World appeared first on Fowl Language Comics.

06 Oct 05:04

Saturday Morning Breakfast Cereal - Neighborhood



To be clear, I have no intention to talk to my neighbors. I just want to know what they're thinking.

New comic!
Today's News:

The Soonish book tour is nearly upon us! Come see me and Kelly at events around the US!

06 Oct 05:03


by Laerte Coutinho

06 Oct 05:03

For eternity

by CommitStrip

06 Oct 05:02

October 2017

And yet I have no trouble believing that the start of the 2016 election was several decades ago.
24 Sep 23:44


by Reza

24 Sep 21:52

USB Cables

Tag yourself, I'm "frayed."
24 Sep 21:50


24 Sep 21:50


24 Sep 21:49

How to Pose a Photo

by Scott Meyer

This comic was inspired by the official portrait of Steve Jobs, in which he holds his hand in a way no human ever does unless they’re getting their official portrait taken.


I think it’s meant to look like he’s thinking deep thoughts, but to me it looks more like he’s deriving pleasure from tugging his own beard hairs. Or, it kind of looks like he’s smelling his own fist, and he recognizes the smell.

This is why I’ve always related more with Bill Gates. In every picture of Bill Gates, especially the official pictures he posed for, he looks self-conscious and uncomfortable. I suspect he’s often self-conscious and uncomfortable in real life, so the pictures look more genuine. They look bad and awkward, but genuinely bad and awkward.


As always, thanks for using my Amazon Affiliate links (US, UK, Canada).

24 Sep 21:47

No Idea

by Doug
24 Sep 20:45


24 Sep 20:45

Comic for September 23, 2017

by Scott Adams
24 Sep 20:39

Comic for 2017.09.23

by Dave McElfatrick
24 Sep 20:39

Voz do Brasil

by Will Tirando

24 Sep 20:33

Artificial Intelligence Hype

by tomfishburne

Marketers are increasingly excited about the potential of AI. L’Oréal chief digital officer Lubomira Rochet described a commonly shared sentiment:

“I believe AI is as big a revolution as the internet itself. It’s going to power more of our interactions with our consumers, be it through advertising, CRM or even ad serving. All those compartments of marketing will be transformed by AI. It’s a great way to get more personalized than we’ve ever been.”

Few trends are currently as hyped as Artificial Intelligence. And as we’ve all learned from the Gartner Hype Cycle, hyped trends inevitably climb a peak of inflated expectations.

Artificial intelligence has become a broad, amorphous, catch-all term. As Christian Monberg at Boomtrain put it, “We’re early in a convoluted market. AI means a lot of things.”


Companies are rebranding existing services as “AI-powered.” Media agencies are starting to add AI to business titles. Here’s how Josh Ziegler of travel technology shop Zumata described the hype:

“AI has been touted as a silver bullet with infinite magical powers. The dawn of AI taking over the world and being a plug and play technology has fostered a misconception of what is truly possible today.”

Savvy marketers will have to keep separating the promise from the hype and be wary of chasing technology for technology’s sake.

Here are a few related cartoons I’ve drawn over the years:

Digital Transformation November 2016

Shiny Object Syndrome January 2015

Big Data January 2014

21 Sep 22:43

Viva Intensamente # 329

by Will Tirando