Shared posts

19 Jan 19:19

Python Software Foundation News: "Weapons of Math Destruction" by Cathy O'Neil

by brandizzi
In a 1947 lecture on computing machinery, Alan Turing made a prediction: "The new machines will in no way replace thought, but rather they will increase the need for it."

Someday, he said, machines would think for themselves, but the computers of the near future would require human supervision to prevent malfunctions:

"The intention in constructing these machines in the first instance is to treat them as slaves, giving them only jobs which have been thought out in detail, jobs such that the user of the machine fully understands in principle what is going on all the time." 1

It is unclear now whether machines remain slaves, or if they are beginning to be masters. Machine-learning algorithms pervasively control the lives of Americans. We do not fully understand what they do, and when they malfunction they harm us, by reinforcing the unjust systems we already have. Usually unintentionally, they can make the lives of poor people and people of color worse.

In "Weapons of Math Destruction", Cathy O'Neil identifies such an algorithm as a "WMD" if it satisfies three criteria: it makes decisions of consequence for a large number of people, it is opaque and unaccountable, and it is destructive. I interviewed O'Neil to learn what data scientists should do to disarm these weapons.

Automated Injustice


Recidivism risk models are a striking example of algorithms that reinforce injustice. These algorithms purport to predict how likely a convict is to commit another crime in the next few years. The model described in O'Neil's book, called LSI-R, assesses offenders with 54 questions, then produces a risk score based on correlations between each offender's characteristics and the characteristics of recidivists and non-recidivists in a sample population of offenders.

Some of LSI-R's factors measure the offender's past behavior: Has she ever been expelled from school, or violated parole? But most factors probably aren't under the individual's control: Does she live in a high-crime neighborhood? Is she poor? And many factors are not under her control at all: Has a family member been convicted of any crimes? Did her parents raise her with a "rewarding" parenting style?

Studies of LSI-R show it gives worse scores to poor black people. Some of its questions directly measure poverty, and others (such as frequently changing residence) are proxies for poverty. LSI-R does not know the offender's race. It would be illegal to ask, but, O'Neil writes, "with the wealth of detail each prisoner provides, that single illegal question is almost superfluous." For example, it asks the offender's age when he was first involved with the police. O'Neil cites a 2013 New York Civil Liberties Union study finding that young black and Hispanic men were ten times as likely to be stopped by the New York City police, even though only a tiny fraction were doing anything criminal.

So far, the LSI-R does not automatically become destructive. If it is accurate, and used for benign choices like spending more time treating and counselling offenders with high risk scores, it could do some good. But in many states, judges use the LSI-R and models like it to decide how long the offender's sentence should be. This is not LSI-R's intended use, and it is certainly not accurate enough for it: a study this year found that LSI-R misclassified 41% of offenders. 2

Success, According to Whom?


O'Neil told me that whether an algorithm becomes a WMD depends on who defines success, and according to whom. "Over and over again, people act as if there's only one set of stakeholders."

When a recidivism risk model is used to sentence someone to a longer prison term, the sole stakeholder respected is law enforcement. "Law enforcement cares more about true positives, correctly identifying someone who will reoffend and putting them in jail for longer to keep them from committing another crime." But our society has a powerful interest in preventing false positives. Indeed, we were founded on a constitution that considered a false positive—that is, being punished for a crime you did not commit—to be extremely costly. Principles including the presumption of innocence, the requirement that guilt is proven beyond reasonable doubt, and so on, express our desire to avoid unjust punishment, even at the cost of some criminals being punished too little or going free.

However, this interest is ignored when an offender is punished for a bad LSI-R score. His total sentence accounts not only for the crime he committed, but also for future crimes he is thought likely to commit. Furthermore, he is punished for who he is: Being related to a criminal or being raised badly are circumstances of birth, but for many people facing sentencing, such circumstances are used to add years to their time behind bars.

Statistically Unsound


Cathy O'Neil says weapons of math destruction are usually caused by two failures. The first is when only one stakeholder's interests define success. LSI-R is an example of this. The other is a lack of actual science in data science. For these algorithms, she told me, "We actually don't have reasonable ways of checking to see whether something is working or not."

A New York City public school program begun in 2007 assessed teachers with a "value added model", which estimated how much a teacher affected each student's progress on standardized tests. To begin, the model forecast students' progress, given their neighborhood, family income, previous achievement, and so on. At the end of the year their actual progress was compared to the forecast, and the difference was attributed to the teacher's effectiveness. O'Neil tells the story of Tim Clifford, a public school teacher who scored only 6 out of 100 the first year he was assessed, then 96 out of 100 the next year. O'Neil writes, "Attempting to score a teacher's effectiveness by analyzing the test results of only twenty-five or thirty students is statistically unsound, even laughable." One analysis of the assessment showed that a quarter of teachers' scores swung by 40 points in a year. Another showed that, with such small samples, the margin of error made half of all teachers statistically indistinguishable.
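To get a feel for why a class of twenty-five or thirty students is such a small sample, here is a rough sketch of the standard error of a class average. It is not the value added model itself: the function names are hypothetical, and the 10-point spread in individual student results is a made-up figure chosen only for illustration.

```python
import math


def standard_error(student_sd: float, class_size: int) -> float:
    """Standard error of a class average: sd / sqrt(n)."""
    return student_sd / math.sqrt(class_size)


def margin_of_error(student_sd: float, class_size: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for the class average (normal approximation)."""
    return z * standard_error(student_sd, class_size)


# Assumption for illustration only: each student's gap between actual and
# forecast progress varies with a standard deviation of 10 points.
ASSUMED_SD = 10.0

for n in (25, 30, 2500):
    print(f"class size {n}: +/- {margin_of_error(ASSUMED_SD, n):.2f} points")
```

Under these assumptions the margin shrinks only with the square root of the sample size, so a sample a hundred times larger narrows it just tenfold.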

Nevertheless, the score might determine if the teacher was given a bonus, or fired. Although its decision was probabilistic, appealing it required conclusive evidence. O'Neil points out that time and again, "the human victims of WMDs are held to a higher standard of evidence than the algorithms themselves." The model is math so it is presumed correct, and anyone who objects to its scores is suspect.

New York Governor Andrew Cuomo put a moratorium on these teacher evaluations in 2015. We are starting to see that some questions require too subtle an intelligence for our current algorithms to answer accurately. As Alan Turing said, "If a machine is expected to be infallible, it cannot also be intelligent."

Responsible Data Science


I asked Cathy O'Neil about the responsibilities of data scientists, both in their daily work and as reformers of their profession. Regarding daily work, O'Neil drew a sharp line: "I don't want data scientists to be de facto policy makers." Rather, their job is to explain to policy makers the moral tradeoffs of their choices. The same as any programmer gathers requirements before coding a solution, data scientists should gather requirements regarding the relative cost of different kinds of errors. Machine learning algorithms are always imperfect, but they can be tweaked for either more false positives or more false negatives. When the stakes are high, the choice between the two is a moral one. Data scientists must pose these questions frankly to policy makers, says O'Neil, and "translate moral decisions into code."
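One way to picture "translating moral decisions into code" is a decision threshold chosen from explicit error costs. The sketch below is only an illustration: the scores, labels, candidate thresholds, and cost values are all invented, and the function names are hypothetical. The point is that the costs are inputs supplied by policy makers, not something the data scientist picks silently.

```python
def confusion_counts(scores, labels, threshold):
    """Count (false positives, false negatives) when flagging scores >= threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    return fp, fn


def cheapest_threshold(scores, labels, thresholds, fp_cost, fn_cost):
    """Return the candidate threshold with the lowest total error cost."""
    def total_cost(t):
        fp, fn = confusion_counts(scores, labels, t)
        return fp * fp_cost + fn * fn_cost
    return min(thresholds, key=total_cost)


# Invented data: model scores and whether the predicted event actually happened.
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [False, False, True, False, True, True]
candidates = [0.2, 0.5, 0.8]

# If a false positive is judged ten times as costly as a false negative,
# the stricter threshold wins; reverse the costs and the looser one wins.
strict = cheapest_threshold(scores, labels, candidates, fp_cost=10, fn_cost=1)
loose = cheapest_threshold(scores, labels, candidates, fp_cost=1, fn_cost=10)
print(strict, loose)
```

The model and the data are identical in both calls. Only the stated cost of each kind of error changes, and with it the decision rule.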

Tradeoffs in the private sector often pit corporate interests against human ones. This is especially dangerous to the poor because, as O'Neil writes, "The privileged are processed more by people, the masses by machines." She told me that when the boss asks for an algorithm that optimizes for profit, it is the data scientist's duty to mention that the algorithm should also consider fairness.

"Weapons of Math Destruction" tells us how to recognize a WMD once it is built. But how can we predict whether an algorithm will become a WMD? O'Neil told me, "The biggest warning sign is if you're choosing winners and losers, and if it's a big deal for losers to lose. If it's an important decision and it's a secret formula, then that's a set-up for a weapon of math destruction. The only other ingredient you need in that setup is actually making it destructive."

Reform


Cathy O'Neil says the top priority, for data scientists who want to disarm WMDs, is to develop tools for analyzing them. For example, any EU citizen harmed by an algorithmic decision may soon have the legal right to an explanation, but so far we lack the tools to provide one. We also need tools to measure disparate impact and unfairness. O'Neil says, "We need tools to decide whether an algorithm is being racist."
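As a small example of what such a tool might compute, the sketch below measures a disparate impact ratio: the rate of favorable outcomes for one group divided by the rate for another. This is one common measure, not a method O'Neil proposes. The groups and outcomes are invented, the function names are hypothetical, and the 0.8 cutoff is borrowed from the "four-fifths" rule of thumb in US employment guidelines purely as an illustrative flag.

```python
def selection_rate(outcomes):
    """Fraction of a (non-empty) group that received the favorable outcome."""
    return sum(outcomes) / len(outcomes)


def disparate_impact_ratio(protected, reference):
    """Protected group's selection rate divided by the reference group's rate."""
    return selection_rate(protected) / selection_rate(reference)


# Invented outcomes: True means the algorithm gave a favorable decision.
group_a = [True, False, False, False, True]   # 2 of 5 favorable
group_b = [True, True, True, False, True]     # 4 of 5 favorable

ratio = disparate_impact_ratio(group_a, group_b)
# Illustrative cutoff only: ratios below 0.8 are flagged for closer review.
needs_review = ratio < 0.8
print(f"ratio = {ratio:.2f}, needs review: {needs_review}")
```

A low ratio does not by itself prove that an algorithm is unfair, but it is the kind of measurable signal that can prompt a closer look.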

New data scientists should enter the field with better training in ethics. Curricula usually ignore questions of justice, as if the job of the data scientist were purely technical. Data-science contests like Kaggle also encourage this view, says O'Neil. "Kaggle has defined the success and the penalty function. The hard part of data science is everything that happens before Kaggle." O'Neil wants more case studies from the field, anonymized so students can learn from them how data science is really practiced. It would be an opportunity to ask: When an algorithm makes a mistake, who gets hurt?

If data scientists take responsibility for the effects of their work, says O'Neil, they will become activists. "I'm hoping the book, at the very least, gets people to acknowledge the power that they're wielding," she says, "and how it could be used for good or bad. The very first thing we have to realize is that well-intentioned people can make horrible mistakes."



1. Quoted in "Alan Turing: The Enigma", by Andrew Hodges. Princeton University Press.

2. See also ProPublica's analysis of bias in a similar recidivism model, COMPAS.


19 Jan 19:19

Asking the wrong questions — Benedict Evans

by brandizzi

This is a photo of my grandfather, Will Jenkins. It was taken in 1909, when he was 13. He made the glider himself and took it to Cape Henry, about 17 miles by trolley from Norfolk, where his first flight took him eight feet, and his last that day took him 40 feet and broke one of his uprights. They made 13-year-olds differently then, I think. 

Dadglider.jpg

He built the glider, incidentally, with a gift of $5 sent to him by an American Civil War veteran after a school essay he'd written about Robert E. Lee was published in the local paper.  The war, after all, had ended only 44 years earlier. 

In 1946, by which time he'd become a notable writer of science fiction, he published a story called 'A Logic named Joe', which described a global computer network with servers and terminals, that starts giving people the information that it thinks they ought to know as opposed to waiting for them to search for it - the Singularity, if you like, or maybe just Alexa. He also, as I recall, predicted reality TV somewhere. 

And yet, despite predicting half of our world, as a father in the 1950s he could not imagine why his daughter - my mother - wanted to work. 

This isn't exactly an uncommon observation - lots of people have pointed out that vintage scifi has plenty of rocketships but all the pilots are men - 1950s society but with robots. Meanwhile, the interstellar liners have paper tickets, that you queue up to buy. With fundamental technology change, we don't so much get our predictions wrong as make predictions about the wrong things. (And, of course, we now have neither trolleys nor personal gliders.) 

I was reminded of this photo recently when I came across a RAND 'long-range forecasting' study, from 1964. The authors polled a range of experts on what the key developments in coming decades would be and when they'd happen. Fields addressed included space flight and medicine, but the most interesting in this context is what was then called 'automation' (the past tended to describe as 'automatic' what we would now call 'computers'). The double-page spread below shows the conclusions.

Some of this has happened more or less as predicted - we did get air traffic control, automated subway trains and computerised taxation (except in the USA). There are some great comedy predictions here too - that 'centralised wire tapping' would take until 2030, or never, or that people in both 1964 and 2016 thought we'd have automated driving 'by 2020'. 

However, to me the interesting thing is how often the order is wrong. What we now know to be the hard problems were going to be solved decades before what we now know were the easy ones. So it might take until 2020 to 'fax' a newspaper to your home, and automatic wiretapping might be impossible, but automatic doctors, radar implants for the blind, household robots and machine translation would be all done by 1990 and a machine would be passing human IQ tests at genius level by 2000. Meanwhile, there are a few quite important things missing - there is no general-purpose computing, no internet and no mobile phones. There's no prediction for when everyone on earth would have a pocket computer connected to all the world's knowledge (2020-2025). These aren't random gaps - it's not just that they thought X would work and didn't know we'd invent Y. Rather, what's lacking is an understanding of the structural impetus of computing and software as universal platforms that would shape how all of these things would be created. We didn't make a home newspaper facsimile machine - we made computers. 

You can see this tendency to ask the wrong questions, or questions based on the wrong framework, in this TeleGeography report from 1990. It was clear that the world was changing, and that the telephone network would see new uses. But if you're asking about new uses for the 'telephone network', that of itself probably gets you to the wrong place.

Picking this apart:

  1. We will have home computers ('multi-media terminals') - correct
  2. And they'll be connected to a network - correct
  3. And move data across national borders - correct
  4. So international circuit-switched call volume will go up - um, no.  
  5. And that needs to fit into the regulatory structure of how state-monopoly fixed-line telephone companies exchange and charge each other for international voice calls - 😐

Today, we don't carry the internet over the PSTN - we carry the PSTN over the internet. 

This time last year I wrote a post about how the future of the mobile internet (as we called it then) looked in 2001, and what one could have predicted. It was obvious that we'd all have phones connected to the internet by now, but that didn't get you to the iPhone, Snapchat, Alexa or DJI - we were talking about Nokia, Microsoft, AOL and NTT DoCoMo just as TeleGeography talked about circuits and RAND's experts talked about 'automation'. Openness and permissionless innovation were missing.

So, a pretty common theme of discussion in tech now is to ask what comes 'after' mobile, now that it is moving from the creation to deployment phase and the smartphone platform wars etc are over. There are a bunch of exciting things going on, certainly, from machine learning to AR and VR to electric and autonomous cars. What content will work in VR? Who will be best placed to make AR glasses? Will EV batteries be a competitive advantage, or end up, like LCD screens, as a low-margin commodity? Who will have enough of the right kind of driving data for autonomy? But every time I think about these, I try to think what questions I'm not asking. I still want a glider though. 


19 Jan 19:09

by Loading Artist

19 Jan 19:09

How big is the Curiosity rover?

by noreply@blogger.com (Ronald Sanson Stresser Junior)
The Curiosity rover is a vehicle roughly the size of a mid-size car, designed to explore the surface of Mars as part of the Mars Science Laboratory mission.

The mission carrying the Curiosity rover began with its launch on November 26, 2011, from Cape Canaveral Air Force Station, and it landed successfully on Mars, specifically at Aeolis Palus in Gale Crater, on August 6, 2012.

This landing site, named Bradbury Landing, was only 1.5 miles (2.4 kilometers) from the originally planned landing point, after a journey of 350,000,000 miles (560,000,000 kilometers).


  • Top speed: 90 m/h
  • Mass: 899 kg
  • Height: 2.2 m
  • Width: 2.7 m
  • Length: 3.0 m




19 Jan 18:44

How to Use Technology to Perform a Repetitive Task

by Scott Meyer

It’s funny how technological change rolls out at different speeds. When I took on my job as an office manager, I had already been all-digital with my schedule and contacts for years. (A large part of making a living as a standup comic actually comes down to schedule and contact information management.)

When I got the office manager job, many of the company’s executives were just beginning to make the transition from Rolodexes and Dayrunners to software, and it fell to me to do quite a bit of data entry. Then, often, I would be asked to print the information I had entered in a format that would easily fit into preexisting Dayrunners and Rolodexes.

The software of the time often had an easy means of making those prints automatically, but it still irked me.

 

As always, thanks for using my Amazon Affiliate links (US, UK, Canada).

19 Jan 18:42

Whomp! - No Hues Is Good Hues

by tech@thehiveworks.com

19 Jan 18:41

Saturday Morning Breakfast Cereal - Losing My Faith

by tech@thehiveworks.com



Hovertext:
Later, God becomes a Wiccan.

New comic!
Today's News:

Thanks for all the amazing BAHFest proposals, Londonites! We'll have our selections done shortly, and then tickets will be on sale!

19 Jan 18:39

Back on the Horse

by Reza

17 Jan 11:04

I’m an Artist

by Doug
17 Jan 11:04

Comic for January 12, 2017

by Scott Adams
17 Jan 11:04

Like a Mountain

by Reza

17 Jan 11:00

Cybermouse

by Doug
17 Jan 10:59

Comic for 2017.01.11

by Rob DenBleyker
17 Jan 10:50

Comic for January 11, 2017

by Scott Adams
17 Jan 10:50

How to Watch a Movie You Are Told You Will Love

by Scott Meyer

Because of where I worked, and my attitude about princess movies, I saw the live theme park production of Beauty and the Beast dozens of times before I ever saw the animated film. I don’t plan to ever see the live version they’re about to release, despite the presence of Hermione and cousin Matthew, because this man is not playing Lumière, and as such the entire enterprise is fatally flawed.

The thing that always struck me about Beauty and the Beast is that even after the Beast changes back into a human, everyone still pictures him as he used to look, and calls him “Beast.” He’s kinda like Leonard Nimoy in that regard.

 

As always, thanks for using my Amazon Affiliate links (US, UK, Canada).

17 Jan 10:49

tru dat

17 Jan 10:48

Saturday Morning Breakfast Cereal - The Strangest People

by tech@thehiveworks.com



Hovertext:
I'm just saying, we don't know this isn't what happened.

New comic!
Today's News:

Last day to submit for BAHFest London! We will probably not be extending the deadline.

16 Jan 10:56

How to Stay Young

by Scott Meyer

The specific model this comic was written about is no longer available, but a similar product can be had for under $50.

On a related note, I think that seeing something really cool exists, and knowing that you could buy it, but choosing not to because the item serves no useful purpose in your life is one of the hallmarks of adulthood, and also is one of the reasons kids find adults insufferably boring.

 

As always, thanks for using my Amazon Affiliate links (US, UK, Canada).

16 Jan 10:54

Saturday Morning Breakfast Cereal - Perception of Time

by tech@thehiveworks.com



Hovertext:
FIGHT ME, INTERNET

New comic!
Today's News:

Just two days left to submit a proposal for BAHFest London!

16 Jan 10:52

Saturday Morning Breakfast Cereal - The Portrait

by tech@thehiveworks.com



Hovertext:
Zach forgot to update his own website. So hey, it's Kelly! Hi, everyone!

07 Jan 17:59

Big Banks Are Stocking Up on Blockchain Patents - Bloomberg

by brandizzi

In the headlong rush to revolutionize modern finance, blockchain enthusiasts are overlooking one potentially costly problem: their applications, built on open-source code, may actually belong to someone else.

Recently, some of the biggest names in business, from Goldman Sachs to Bank of America and Mastercard, have quietly patented some of the most promising blockchain technologies for themselves. Through mid-November, the number of patents that companies have obtained or said they’ve applied for has roughly doubled since the start of the year, according to law firm Reed Smith.

As the blockchain -- essentially a shared, cryptographically secure ledger of transactions -- evolves beyond its techno-utopian roots and startups like Chain and Hyperledger open their source code to the public, the risk is growing that patents will turn into powerful weapons in protracted lawsuits over intellectual property, especially in the hands of trolls trying to cash in on the technology’s skyrocketing rise. Increasingly, experts warn established firms will use them to assert exclusive rights over the work of blockchain’s pioneers.

“Open-source code -- that doesn’t necessarily restrict the ability to patent the underlying innovation,” said Patrick Murck, a long-time blockchain legal expert who joined Cooley LLP last month. “Anybody who’s investing in the ecosystem, anybody who’s interested in the technology should be worried about this.”

Goldman spokeswoman Tiffany Galvin declined to comment, while Bank of America didn’t respond to requests for comment.

Playing Defense

Mastercard’s Justin Pinkham says that, like many other companies, it’s simply filing patents to defend its blockchain inventions -- as it always does, in all areas of its work. The company has filed for more than 30 patents related to the blockchain and cryptocurrencies, he said.


“We have expanded our patent portfolio to protect the company’s thinking, innovations and intellectual property,” said Pinkham, the head of payments innovation at Mastercard Labs.

Because open-source code is freely available to the public, legal disputes have cropped up over who actually owns the rights to the innovations built using that code. Patent wars over Linux -- a popular, open-sourced software used in phones, computers and servers -- have raged for more than a decade.

In the fledgling blockchain industry, the stakes are rising fast. Originally developed to record bitcoin transactions, the distributed ledger has attracted big-name backers like Blythe Masters -- a former JPMorgan banker who’s become one of its most vocal proponents -- because of its potential to reshape how financial services, supply chain and health-care industries are run.

Growth Potential

In finance alone, blockchain’s adoption could create a multibillion-dollar market in the coming decade, from just tens of millions today, according to Gil Luria, an analyst at Wedbush Securities.

That’s prompted firms to patent their most lucrative innovations. Companies worldwide applied for or received patents for 356 families of blockchain- or cryptocurrency-related patents in November, up from 180 in January, according to Marc Kaufman, who specializes in fintech intellectual property at Reed Smith, and Questel, a database provider. While the figure doesn’t compare to other, more developed industries, Kaufman expects the number to balloon.

“We are seeing an increase in filings that’s exponential,” he said. “I predict that we’ll see in five years thousands of patents. It’s an emerging risk, no doubt about it.”

‘Harm Innovation’

Until now, many blockchain startups have downplayed the importance of patents and pinned their hopes on wider adoption through open source. Hyperledger, a venture led by companies including IBM, Accenture and Intel, makes its code free for others to use and enhance. Chain, which lets companies use the blockchain to issue and transfer assets, released its code in late October. Even R3 -- a consortium of some of the largest banks -- made its Corda blockchain available last month.

As such projects have multiplied, some blockchain supporters have suggested open-source makes patents irrelevant. It doesn’t, according to Vitalik Buterin, co-creator of the popular Ethereum blockchain.

Companies could find themselves being sued by one-time collaborators. Large firms could wield patents to muscle into promising businesses developed by today’s startups. Patents could also be used to shut down rivals.

“Blockchain software companies may end up being amalgamated into existing software giants, at which point blockchain patents will just become part of the existing patent war,” Buterin said. “As is the case with all software patents, in my opinion their availability will only slow down and harm innovation.”

Patent Enforcement

Not long ago, BitX, a well-known cryptocurrency exchange in Africa and Southeast Asia, released its code for switching between fiat currencies and bitcoin. Soon after, the startup noticed Bank of America filed a patent for a similar technology, said Marcus Swanepoel, BitX’s chief executive officer.

That put BitX in a bind. If the patent is granted, the bank could theoretically go after BitX or some of its users, or try to charge royalties. BitX’s lawyers concluded the patent would be hard to enforce and the company ultimately decided against going to court.

“The success of those protocols depends on broad-market adoption,” Swanepoel said.

In another worrisome sign, Goldman recently quit R3, people familiar with the consortium said last month, and others may soon follow.

To forestall potential disputes, some firms like Blockstream have made patent pledges, promising their own patents will be available to others for free.

Others are working to set up a patent pool, where members can cross-license each other’s patents. Cooley’s Murck points to Open Invention Network as a model. Launched in 2005, OIN was set up to cross-license and buy up Linux-related patents. OIN is now considering buying up blockchain-related patents as well, according to CEO Keith Bergelt.

“We are creating a patent non-aggression environment,” he said.

Nevertheless, good intentions may mean very little when push comes to shove.

“I would take people at face value, that they’d take patents without intending to assert them,” Reed Smith’s Kaufman said. But, “if you are part of a public company and companies infringe your patents, you may have an obligation. There have been many, many patent cases based on patents that were initially considered defense patents.”


07 Jan 17:58

monstirinha #14

by Fábio Coala

m_tira13

The world is much better seen through the eyes of a child.

The post monstirinha #14 first appeared on Mentirinhas.

07 Jan 17:57

rockpapercynic: Tried, tested and true.

07 Jan 17:55

Ambitions

by Reza

07 Jan 17:55

Eu tô bem

by Will Tirando

eu-to-bem

07 Jan 17:55

Team Chat

Adam Victor Brandizzi

Sigh, let them use IRC...

2078: He announces that he's finally making the jump from screen+irssi to tmux+weechat.
07 Jan 17:54

For sharing: Long-ways | Box-ways

07 Jan 17:48

Photo



07 Jan 17:46

Viva Intensamente # 292

by Will Tirando

viva-intensamente-forrest-cao

07 Jan 17:45

Artifacts

I didn't even realize you could HAVE a data set made up entirely of outliers.