The Old Reader

20 Sep 19:29

Search for word usage in movies and television over time

by Nathan Yau

Movies and television shows often reflect cultural trends of the time they are made in. Even movies that take place during the past or future can say something about the present through metadata or production style. Benjamin Schmidt, an assistant professor of history at Northeastern University, provides a tool that lets you see trends in movie and television dialogue.

Made with the Bookworm platform, developed by Schmidt and a team at the Cultural Observatory, the tool lets you search for terms and it spits out relative usage over time. For example, above is word usage of "data" which saw a spike in the 1990s thanks to Star Trek: The Next Generation.

Here's the use of "bro."

Bro

And because I'm mature, here is the usage of "pee", "poop", and "fart", which, like the above, appears to be on the rise.

Fart, poop, and pee usage

The nice thing about the tool is that you can search by a lot of conditions. You can choose specific shows, genre, primary country, and other things. You can also change additional settings through the gear icon in the top right, such as time span and smoothing level. So there's lots of fun stuff to play with.

Have a search.

Tags: Bookworm, movies, television

17 Sep 00:18

Spotify Knows Me Better Than I Know Myself

by Walt Hickey

The days when you could listen to guilty pleasure music without consequence are over.

In the dark ages, people used to buy music in a store. But stores never knew how many times you played the album you bought, or a particular song on that album, and the closest we came to a music recommendation engine was a clerk doing his best impression of John Cusack in “High Fidelity.”

The world has changed. Today, many of us don’t even buy music digitally anymore, let alone in a store. We subscribe to streaming services and let our entire musical lives exist on an app.

So when the music streaming service Spotify — which I use almost exclusively these days — offered to analyze my listening habits using a new tool it has developed, and algorithmically recommend new songs and artists to me, I immediately accepted.

Through its $100 million acquisition of The Echo Nest, a music data company, Spotify has cooked up an early prototype of an internal tool called Nestify to better analyze an individual’s listening history and recommend songs he or she might like. This is one of the first times the music streaming service has discussed the tool with a journalist or made it available to someone outside the company.²⁰⁸

And then I immediately regretted my decision. I recalled that I have terrible taste in music. Suddenly every drunken spin of Madonna was vivid in my mind. I remembered spending several days listening to boy bands for an article. And there was that time last summer when I listened to Miley Cyrus’s “We Can’t Stop” marathon-style, uh, preparing for a story. Yeah. That’s why.

Spotify knows not only what I listened to — a dossier of my crimes against authenticity — but how many times I’ve listened to it, what songs I’ve skipped, and what songs I listen to when. I visited the company’s New York office earlier this month to get my results. The Virgil to my Dante was Ajay Kalia, who’s in charge of developing this tool. He served as the interpreter of the findings and the spinner of mathematical proof that my taste in music is, sad to say, pretty basic.

Before conducting this experiment, I believed that there were essentially two broad types of music: the music you listen to, and the music you tell people you listen to. The second category comprises the songs shared on Facebook or Twitter, the shout-out to Neutral Milk Hotel in your OKCupid profile, the stuff you send your friends. As with most things, the image you try to present to the world is substantially more cultured and interesting than the mundane reality.

This analysis taught me that there are three categories of music: the music that you tell people you listen to, the music that you think you listen to, and the music that you actually listen to. And I was initially shocked to see the vast gulf between the second and third categories when I got the top-line results of Spotify’s analysis.

I’ve listened to a lot of songs on Spotify — 1,735 since January 2013, I’m told — so we have a pretty solid pile of data to work through. Here are my most-played songs, with the (occasionally shameful) play count listed:

hickey-feature-tasteprofile-table-1

Let’s dissect No. 5 briefly. “Let it Go,” a song from “Frozen,” Disney’s 2013 animated feature film, is 3 minutes and 44 seconds long, and playing it 107 times means that I’ve spent slightly more than six and a half hours of my life listening to Idina Menzel telling the world where to stick it. This is slightly less than four times the run length of the entire film.²⁰⁹
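
The arithmetic behind that claim can be checked in a few lines. This is only a sketch: the song length and play count come from the paragraph above, while the film's runtime (102 minutes) is an assumption, since the article does not state it.

```python
# Figures quoted in the article: 3 min 44 s per play, 107 plays.
SONG_SECONDS = 3 * 60 + 44
PLAYS = 107

# Assumption: runtime of "Frozen" in minutes (not given in the article).
FILM_MINUTES = 102

total_seconds = SONG_SECONDS * PLAYS
total_hours = total_seconds / 3600
film_multiples = total_seconds / (FILM_MINUTES * 60)

print(f"{total_hours:.2f} hours of listening")
print(f"{film_multiples:.2f} times the assumed film runtime")
```

Under that assumed runtime, the total comes out a little over six and a half hours and just under four film lengths, which matches the wording above.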

Some of the songs are by my favorite artists, such as The Mountain Goats, Supergrass and Arcade Fire. Others are tracks I’ve been really into lately, such as my personal song of the summer, “I Don’t Know How.”²¹⁰ A bunch of songs are also on my starred list. For example, at No. 3 we see “Hooked On A Feeling,” which is entirely the fault of that “Guardians of the Galaxy” trailer.²¹¹

Now here are the artists I listened to most:

hickey-feature-tasteprofile-table-2

What’s interesting about this table isn’t really the names that emerge at the top of the list, but the data that comes along with them — the number of songs and number of plays — and what it means for how I consume different artists’ music.

“You can imagine a situation where we see a hundred plays of one song, something else where we see a hundred plays of ten songs, another we also see a hundred plays of a hundred songs,” Kalia said. “A hundred plays of one song means you just love this single. A hundred plays of ten songs means you’re just exploring and liking songs. A hundred songs and a hundred plays means you just let something run through the entire queue and never came back to it.”

Kalia was able to guess that some of these artists — Vitamin String Quartet and London Philharmonic Orchestra in particular — are playlists that I just run through. They have lots of plays, but also lots and lots of songs, with an average of only 6.2 plays per song for the London Philharmonic Orchestra and 11.5 plays per song for Vitamin String Quartet. Kalia inferred that it’s not that I absolutely love these artists, they’re just for a certain mood. And he’s right, because I listen to the Philharmonic’s covers of video game music and Vitamin String Quartet’s covers of pop music while grinding on long articles.²¹²

On the other hand, I’ve only listened to a single song by Blue Swede — “Hooked On A Feeling” — but I’ve listened to it a lot, which Kalia interpreted not so much as my wanting to hear more Blue Swede but instead as my really liking that one song.

Then you’ve got an artist like The Mountain Goats. I’ve listened to many of their songs a lot — an average of 18.2 plays per song, with 34 songs total — and Kalia correctly interpreted this as my liking the band.
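
As a rough illustration of the plays-per-song reasoning, here is a toy sketch of the three scenarios in Kalia's quote. The function names and cutoffs are hypothetical, picked only to separate those three examples; nothing here is Spotify's actual logic, and real cases (say, 6.2 plays per song spread over a large catalog) clearly take more judgment than a single ratio.

```python
def plays_per_song(total_plays: int, distinct_songs: int) -> float:
    """Average number of plays per distinct song for one artist."""
    if distinct_songs <= 0:
        raise ValueError("distinct_songs must be positive")
    return total_plays / distinct_songs


def rough_label(total_plays: int, distinct_songs: int) -> str:
    """Toy label for the three scenarios in the quote (hypothetical cutoffs)."""
    ratio = plays_per_song(total_plays, distinct_songs)
    if distinct_songs == 1 and total_plays > 1:
        return "loves one single"
    if ratio <= 1.0:
        return "ran through a queue once"
    return "exploring and liking songs"


# The three scenarios from the quote: (total plays, distinct songs).
scenarios = {
    "one song": (100, 1),
    "ten songs": (100, 10),
    "hundred songs": (100, 100),
}
for name, (plays, songs) in scenarios.items():
    print(name, plays_per_song(plays, songs), rough_label(plays, songs))
```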

Nestify grabs my listening data and uses it to determine my “listening modes.” In other words, we all listen to different music at different times for different reasons. For example, I listen to certain music depending on whether I’m commuting, working or getting ready to go out. Spotify uses big data and clustering algorithms to figure out how the totality of music we consume breaks down into clusters of artists that correspond to these listening modes.

“A cluster is where we try to make sense of the different kinds of listeners you are,” Kalia said. Under the hood, the tool looks at the artists you listen to and weights them by how much affinity you have for them, based on play counts, and finds similarity among those artists. “So we have these little islands of artists that fit together,” he said.

Why does Spotify invest all this time and money into developing algorithms to find out what individual users like?

Using kneejerk Internet business model thinking, I first thought it was to find out more about the service’s users in order to attract advertisers. But it turns out this isn’t the main reason. First, Spotify already knows a lot about its users because many have signed up with a Facebook account or filled in account details. Second, while teasing out a listener’s demographic details is theoretically possible — Brian Whitman, Spotify’s principal music scientist, has published research on how to determine things like political affiliation from a user’s taste profile — there’s still far too much uncertainty in the information for it to be practical from a business standpoint. And lastly, only part of Spotify’s business is tied up with advertisers. The rest is a subscription model.²¹³

The company has found that the longer a non-paying user listens to Spotify, the more likely he or she is to become a paid subscriber. So one advantage of Nestify is that it could give non-paying listeners a better experience, keep them around longer, and thus convert them into paid subscribers.

But folks within Spotify have an even bigger vision for the technology: an app that plays exactly what you want to listen to when you want to listen to it. “We should get good enough that you don’t have to take your phone out of your pocket to get the right stuff,” said Jim Lucchese, the CEO of The Echo Nest.

The team is considering lots of next steps for the prototype, including time-stamping. For example, what you listened to recently should be weighted more heavily in the analysis than what you listened to a couple of weeks ago.
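
One common way to weight recent listening more heavily is exponential decay. The sketch below is an assumption about how such time-stamping could work, not a description of what the team built; the function names and the 14-day half-life are invented for illustration.

```python
def recency_weight(days_ago: float, half_life_days: float = 14.0) -> float:
    """Weight of a single play; it halves every `half_life_days` (hypothetical)."""
    return 0.5 ** (days_ago / half_life_days)


def weighted_play_counts(plays, half_life_days: float = 14.0) -> dict:
    """Sum recency weights per artist.

    `plays` is an iterable of (artist, days_ago) pairs.
    """
    totals = {}
    for artist, days_ago in plays:
        weight = recency_weight(days_ago, half_life_days)
        totals[artist] = totals.get(artist, 0.0) + weight
    return totals


# Made-up history: one play today versus two plays from two weeks ago.
history = [("Artist A", 0), ("Artist B", 14), ("Artist B", 14)]
print(weighted_play_counts(history))
```

With this half-life, two plays from two weeks ago count the same as one play today; a shorter half-life would favor recent listening even more.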

The long-term goal is to find out even deeper stuff: What do you listen to on Friday nights compared to what you listen to on weekday mornings? What did you listen to nonstop for a weekend and then never touch again? The idea is that the service can get to know your preferences so well that it’s able to figure out that you want to hear high-BPM music because it’s Wednesday night at the gym, or string bands because you’re working, or “Let it Go” three times first thing in the morning because that’s how you get psyched up for work. Nestify is the proof of concept for complete predictive personalization.

Two large clusters formed in the analysis of my listening history. Kalia called them “Cluster A” and “Cluster B,” but I’m going to refer to them as “The Shame Cluster” and “The Indie Cluster.”

hickey-feature-tasteprofile-table-3

The Shame Cluster artists include Lana Del Rey, Blue Swede, Miley Cyrus, The Bangles, Madonna, Kesha, Blondie, Lorde, Pat Benatar, Robin Thicke, Andrew W.K., The Cure, Soft Cell and Billy Joel. At left are the five genres (as determined by Spotify) in this cluster with the strongest weights.

This means that 44 percent of the artists in the cluster were tagged as “pop,” and so on. In other words, pop accounts for less than half the music within the cluster. It’s a diffuse group of artists. “It’s sort of pop through the ages. A lot of female, a lot of pop, but not exclusively any of those,” Kalia said.
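
A genre weight of this kind, read as the share of a cluster's artists carrying a given tag, can be sketched as follows. The artist names and tags are placeholders, not Spotify's data, and the function name is hypothetical. Because one artist can carry several tags, the shares need not add up to 100 percent.

```python
from collections import Counter


def genre_shares(artist_genres: dict) -> dict:
    """Fraction of artists in a cluster that carry each genre tag."""
    n_artists = len(artist_genres)
    if n_artists == 0:
        return {}
    # Count each genre at most once per artist.
    counts = Counter(
        genre for genres in artist_genres.values() for genre in set(genres)
    )
    return {genre: count / n_artists for genre, count in counts.items()}


# Placeholder cluster with invented tags.
cluster = {
    "Artist A": ["pop", "dance pop"],
    "Artist B": ["pop"],
    "Artist C": ["new wave"],
    "Artist D": ["rock"],
}
shares = genre_shares(cluster)
print(sorted(shares.items(), key=lambda item: -item[1]))
```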

This cluster, Kalia said, is bigger than the Indie Cluster, in terms of how much I listen to it, but he said he believed my musical identity was less tied to it.

Now let’s look at the Indie Cluster, the category of music I tell people I listen to. Artists include The Mountain Goats, Arcade Fire, Supergrass, LCD Soundsystem, CHVRCHES, Generationals, Matt & Kim, The Strokes, Pulp, Blow, Neutral Milk Hotel, STRFKR, The Shins and Vampire Weekend.

hickey-feature-tasteprofile-table-4

The five genres in this cluster that had the strongest weights are listed in the table to the left.

Because 82 percent of the artists fell under one category, indie rock, Kalia observed that this cluster was a very tight group of artists centered around a specific sound.

“Based on the fact that you have a lot of songs for these, you do a lot of sampling,” Kalia surmised. “This is a little closer to your musical identity.” He’s right.

Finally, there was a large chunk of music that didn’t fit into either cluster or form its own. This included London Philharmonic Orchestra, Vitamin String Quartet and a lot of soundtracks. “It’s not so much that you like classical music per se, but there are particular kinds of music or movies or soundtracks or TV shows that you like,” Kalia said. “But if we just had a station that was generic classical — even though these are classical performances — you wouldn’t necessarily want that.”

This finding — that I have three main listening modes — was oddly comforting. Setting aside the classical stuff, it’s a Jekyll and Hyde situation. Sometimes Jekyll has a couple bourbons, forgets about his love for lo-fi and cranks some Miley. Surely everyone does it. Can’t be helped.

Through analyzing all this listening data, Spotify was able to recommend some playlists of music that it thinks I would like.

Normally when you create a radio station in Spotify, you feed in one song or one artist. Nestify feeds in an entire weighted cluster to generate three playlists: a “My Music” playlist, which is tightly focused on songs I’ve already played a lot; a “Discovery” playlist, made up of artists and songs that are outside the cluster but similar to ones inside the cluster; and a “Default” playlist, which comes somewhere in between the two extremes.

The three playlists for the Shame Cluster — the one loosely congregated around pop music — weren’t that interesting. It’s not so much that the playlists didn’t do what they were designed to do, but the stuff in that cluster is so well known that I was already familiar with the artists it recommended.²¹⁴ I liked the music, but I wasn’t hearing that much new stuff.

The playlists designed around the Indie Cluster were more interesting. For both the Default and Discovery playlists, I listened to and rated each song on a simple three-point scale — disliked, neutral, liked — and also indicated whether the song was brand new to me or not. (For instance, the Default playlist included the Neutral Milk Hotel song “In The Aeroplane Over The Sea,” which I already knew and enjoyed, but I can’t really count that against it.)

The Default playlist contained some of the usual artists I listen to, but kind of suffered from diving too deep into their catalogs. Essentially, in trying to remain within the cluster, it recommended some songs from artists I loved that were not those artists’ best work. The new artists spun in were a bit hit or miss, but if I liked them, I really liked them.

I was really surprised at how much I enjoyed the music on the Discovery playlist.

hickey-feature-tasteprofile-table-5

The verdict? The algorithm nailed it. I liked two-thirds of the stuff on each playlist, regardless of my familiarity with the songs. By basing its recommendations on the stuff I listen to rather than a pre-set genre search — the way that many Internet song recommendations work, including Spotify’s — Nestify was able to predict pretty much exactly the kind of sound I like.

Before all this, I had a concept of my musical identity. But when I got to look at what I actually listened to, I was momentarily thrown for a loop (I didn’t realize I was listening to so much Shame Cluster!). And yet, the algorithm still appeared to figure out what I value as a listener, and its recommendations were very much on point.

Even though Spotify knows exactly what I listen to, the service can’t just ask me what I think my musical identity is, partly because that seems a bit flirty for a music app, but also because I might not really know. Whether we realize it or not, our actual musical identities are often all over the map.

08 Sep 17:31

TOPS takeover: Jackie Chan – Snake in the Eagle’s Shadow intro

by TOPS

Young Jackie expressing himself along to the groovy French Space-Disco band Space. They were one of the only non-Soviet bands permitted to play in the USSR, because they all wore space suits and didn’t have any provocative or naughty lyrics. I find the music quite sexy, and Jackie too. Maddy turned me on to this one. – DVC, TOPS

05 Sep 18:31

Confirmationist and falsificationist paradigms of science

by Andrew

Deborah Mayo and I had a recent blog discussion that I think might be of general interest so I’m reproducing some of it here.

The general issue is how we think about research hypotheses and statistical evidence. Following Popper etc., I see two basic paradigms:

Confirmationist: You gather data and look for evidence in support of your research hypothesis. This could be done in various ways, but one standard approach is via statistical significance testing: the goal is to reject a null hypothesis, and then this rejection will supply evidence in favor of your preferred research hypothesis.

Falsificationist: You use your research hypothesis to make specific (probabilistic) predictions and then gather data and perform analyses with the goal of rejecting your hypothesis.

In confirmationist reasoning, a researcher starts with hypothesis A (for example, that the menstrual cycle is linked to sexual display), then as a way of confirming hypothesis A, the researcher comes up with null hypothesis B (for example, that there is a zero correlation between date during cycle and choice of clothing in some population). Data are found which reject B, and this is taken as evidence in support of A.

In falsificationist reasoning, it is the researcher’s actual hypothesis A that is put to the test.

How do these two forms of reasoning differ? In confirmationist reasoning, the research hypothesis of interest does not need to be stated with any precision. It is the null hypothesis that needs to be specified, because that is what is being rejected. In falsificationist reasoning, there is no null hypothesis, but the research hypothesis must be precise.
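
To make the confirmationist pattern concrete, here is a minimal sketch of testing a null hypothesis B of zero correlation with a permutation test. The data, function names, and number of permutations are all illustrative assumptions. Note what the code does and does not do: a small p-value speaks only against B, and nothing in it tests a research hypothesis A.

```python
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation (undefined if either input is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided p-value for the null hypothesis of zero correlation."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


# Made-up, perfectly linear data, so the null is rejected decisively.
xs = list(range(20))
ys = [2 * x + 1 for x in xs]
p_value = permutation_p_value(xs, ys)
print(f"p = {p_value:.4f}")
```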

It is tempting to frame falsificationists as the Popperian good guys who are willing to test their own models and confirmationists as the bad guys (or, at best, as the naifs) who try to do research in an indirect way by shooting down straw-man null hypotheses.

And indeed I do see the confirmationist approach as having serious problems, most notably in the leap from “B is rejected” to “A is supported,” and also in various practical ways because the evidence against B isn’t always as clear as outside observers might think.

But it’s probably most accurate to say that each of us is sometimes a confirmationist and sometimes a falsificationist. In our research we bounce between confirmation and falsification.

Suppose you start with a vague research hypothesis (for example, that being exposed to TV political debates makes people more concerned about political polarization). This hypothesis can’t yet be falsified as it does not make precise predictions. But it seems natural to seek to confirm the hypothesis by gathering data to rule out various alternatives. At some point, though, if we really start to like this hypothesis, it makes sense to fill it out a bit, enough so that it can be tested.

In other settings it can make sense to check a model right away. In psychometrics, for example, or in various analyses of survey data, we start right away with regression-type models that make very specific predictions. If you start with a full probability model of your data and underlying phenomenon, it makes sense to try right away to falsify (and thus, improve) it.

Dominance of the falsificationist rhetoric

That said, Popper’s ideas are pretty dominant in how we think about scientific (and statistical) evidence. And it’s my impression that null hypothesis significance testing is generally understood as being part of a Popperian, falsificationist approach to science.

So I think it’s worth emphasizing that, when a researcher is testing a null hypothesis that he or she does not believe, in order to supply evidence in favor of a preferred hypothesis, this is confirmationist reasoning. It may well be good science (depending on the context) but it’s not falsificationist.

The “I’ve got statistical significance and I’m outta here” attitude

This discussion arose when Mayo wrote of a controversial recent study, “By the way, since Schnall’s research was testing ‘embodied cognition’ why wouldn’t they have subjects involved in actual cleansing activities rather than have them unscramble words about cleanliness?”

This comment was interesting to me because it points to a big problem with a lot of social and behavioral science research, which is a vagueness of research hypotheses and an attitude that anything that rejects the null hypothesis is evidence in favor of the researcher’s preferred theory.

Just to clarify, I’m not saying that this is a particular problem with classical statistical methods; the same problem would occur if, for example, researchers were to declare victory when a 95% posterior interval excludes zero. The problem that I see here, and that I’ve seen in other cases too, is that there is little or no concern with issues of measurement. Scientific measurement can be analogized to links on a chain, and each link—each place where there is a gap between the object of study and what is actually being measured—is cause for concern.

All of this is a line of reasoning that is crucial to science but is often ignored (in my own field of political science as well, where we often just accept survey responses as data without thinking about what they correspond to in the real world). One area where measurement is taken very seriously is psychometrics, but it seems that the social psychologists don’t think so much about reliability and validity. One reason, perhaps, is that psychometrics is about quantitative measurement, whereas questions in social psychology are often framed in a binary way (Is the effect there or not?). And once you frame your question in a binary way, there’s a temptation for a researcher, once he or she has found a statistically significant comparison, to just declare victory and go home.

The measurements in social psychology are often quantitative; what I’m talking about here is that the research hypotheses are framed in a binary way (really, a unary way in that the researchers just about always seem to think their hypotheses are actually true). This motivates the “I’ve got statistical significance and I’m outta here” attitude. And, if you’ve got statistical significance already and that’s your goal, then who cares about reliability and validity, right? At least, that’s the attitude, that once you have significance (and publication), it doesn’t really matter exactly what you’re measuring, because you’ve proved your theory.

I am not intending to be cynical or to imply that I think these researchers are trying to do bad science. I just think that the combination of binary or unary hypotheses along with a data-based decision rule leads to serious problems.

The issue is that research projects are framed as quests for confirmation of a theory. And once confirmation (in whatever form) is achieved, there is a tendency to declare victory and not think too hard about issues of reliability and validity of measurements.

To this, Mayo wrote:

I agreed that “the measurements used in the paper in question were not” obviously adequately probing the substantive hypothesis. I don’t know that the projects are framed as quests “for confirmation of a theory”, rather than quests for evidence of a statistical effect (in the midst of the statistical falsification arg at the bottom of this comment). Getting evidence of a genuine, repeatable effect is at most a necessary but not a sufficient condition for evidence of a substantive theory that might be thought to (statistically) entail the effect (e.g., a cleanliness prime causes less judgmental assessments of immoral behavior—or something like that). I’m not sure that they think about general theories–maybe “embodied cognition” could count as general theory here. Of course the distinction between statistical and substantive inference is well known. I noted, too, that the so-called NHST is purported to allow such fallacious moves from statistical to substantive and, as such, is a fallacious animal not permissible by Fisherian or NP tests.

I agree that issues about the validity and relevance of measurements are given short shrift and that the emphasis–even in the critical replication program–is on (what I called) the “pure” statistical question (of getting the statistical effect).

I’m not sure I’m getting to your concern Andrew, but I think that they see themselves as following a falsificationist pattern of reasoning (rather than a confirmationist one). They assume it goes something like this:

If the theory T (clean prime causes less judgmental toward immoral actions) were false, then they wouldn’t get statistically significant results in these experiments, so getting stat sig results is evidence for T.

This is fallacious when the conditional fails.

And I replied that I think these researchers are following a confirmationist rather than falsificationist approach. Why do I say this? Because when they set up a nice juicy hypothesis and other people fail to replicate it, they don’t say: “Hey, we’ve been falsified! Cool!” Instead they give reasons why they haven’t been falsified. Meanwhile, when they falsify things themselves, they falsify the so-called straw-man null hypotheses that they don’t believe.

The pattern is as follows: Researcher has hypothesis A (for example, that the menstrual cycle is linked to sexual display), then as a way of confirming hypothesis A, the researcher comes up with null hypothesis B (for example, that there is a zero correlation between date during cycle and choice of clothing in some population). Data are found which reject B, and this is taken as evidence in support of A. I don’t see this as falsificationist reasoning, because the researchers’ actual hypothesis (that is, hypothesis A) is never put to the test. It is only B that is put to the test. To me, testing B in order to provide evidence in favor of A is confirmationist reasoning.

Again, I don’t see this as having anything to do with Bayes vs non-Bayes, and all the same behavior could happen if every p-value were replaced by a confidence interval.

I understand falsificationism to be that you take the hypothesis you love, try to understand its implications as deeply as possible, and use these implications to test your model, to make falsifiable predictions. The key is that you’re setting up your own favorite model to be falsified.

In contrast, the standard research paradigm in social psychology (and elsewhere) seems to be that the researcher has a favorite hypothesis A. But, rather than trying to set up hypothesis A for falsification, the researcher picks a null hypothesis B to falsify and thus represent as evidence in favor of A.

As I said above, this has little to do with p-values or Bayes; rather, it’s about the attitude of trying to falsify the null hypothesis B rather than trying to falsify the researcher’s hypothesis A.

Take Daryl Bem, for example. His hypothesis A is that ESP exists. But does he try to make falsifiable predictions, predictions for which, if they happen, his hypothesis A is falsified? No, he gathers data in order to falsify hypothesis B, which is someone else’s hypothesis. To me, a research program is confirmationalist, not falsificationist, if the researchers are never trying to set up their own hypotheses for falsification.

That might be ok—maybe a confirmationalist approach is fine, I’m sure that lots of important things have been learned in this way. But I think we should label it for what it is.

Summary for the tl;dr crowd

In our paper, Shalizi and I argued that Bayesian inference does not have to be performed in an inductivist mode, despite a widely-held belief to the contrary. Here I’m arguing that classical significance testing is not necessarily falsificationist, despite a widely-held belief to the contrary.

The post Confirmationist and falsificationist paradigms of science appeared first on Statistical Modeling, Causal Inference, and Social Science.

05 Sep 00:17

TOPS takeover: Suzanne Ciani on David Letterman (1980)

by TOPS

Suzanne Ciani is an electronic music innovator who worked with early synthesizers to create sound effects that have a special character which put them in a different solar system than other sound effect engineers working at the time. It’s clear from this video that she’s also an extremely cool and smart woman working at the cutting edge of her field and totally immersed in it. Her aura and her new age compositions both inspire me a lot. – JP, TOPS

04 Sep 22:00

Scientific consensus has gotten a bad reputation—and it doesn’t deserve it

by John Timmer

Consensus? It's complicated and does not involve A) everyone agreeing or B) everyone meeting for coffee.

Digital Desktop Wallpaper (via Flickr)

One of the many unfortunate aspects of arguments over climate change is that it's where many people come across the idea of a scientific consensus. Just as unfortunately, their first exposure tends to be in the form of shouted sound bites: "But there's a consensus!" "Consensus has no place in science!"

Lost in the shouting is the fact that consensus plays several key roles in the process of science. In light of all the consensus choruses, it's probably time to step back and examine its importance and why it's a central part of the scientific process. And only after that is it possible to take a look at consensus and climate change.

Standards of evidence

Fiction author Michael Crichton probably started the backlash against the idea of consensus in science. Crichton was rather notable for doubting the conclusions of climate scientists—he wrote an entire book in which they were the villains—so it's fair to say he wasn't thrilled when the field reached a consensus. Still, it's worth looking at what he said, if only because it's so painfully misguided:

04 Sep 20:33

Expansionwire: Cambridge Gets More Ramen, Courtesy of Santouka

by Rachel Leah Blumenthal

Santouka Ramen, a Japanese company that previously made its Boston expansion dreams known, has found a location. Based on licensing hearing information, Boston Restaurant Talk reports that Santouka will open at 1 Bow Street in Harvard Square, which used to be a Dunkin' Donuts.

Santouka popped up at Itadaki in Back Bay for a few days earlier this year, serving a simple shio broth-based ramen. The company's CEO, Shinichi Kikuta, came to town for the event and was surprised to find the high prices of ramen in Boston. "It should be a cheap comfort food for you to have with a beer after a long day," he told Eater.
· Santouka plans to open in Cambridge's Harvard Square [BRT]
· All coverage of Santouka Ramen on Eater [~EBOS~]
[Photo: Santouka Ramen in Arlington Heights, Illinois/Yelp]

03 Sep 22:35

I disagree with Alan Turing and Daniel Kahneman regarding the strength of statistical evidence

by Andrew

It’s funny. I’m the statistician, but I’m more skeptical about statistics, compared to these renowned scientists.

The quotes

Here’s one: “You have no choice but to accept that the major conclusions of these studies are true.”

Ahhhh, but we do have a choice!

First, the background. We have two quotes from this paper by E. J. Wagenmakers, Ruud Wetzels, Denny Borsboom, Rogier Kievit, and Han van der Maas.

Here’s Alan Turing in 1950:

I assume that the reader is familiar with the idea of extra-sensory perception, and the meaning of the four items of it, viz. telepathy, clairvoyance, precognition and psycho-kinesis. These disturbing phenomena seem to deny all our usual scientific ideas. How we should like to discredit them! Unfortunately the statistical evidence, at least for telepathy, is overwhelming.

Wow! Overwhelming evidence isn’t what it used to be.

In all seriousness, it’s interesting that Turing, who was in some ways an expert on statistical evidence, was fooled in this way. After all, even those psychologists who currently believe in ESP would not, I think, hold that the evidence for telepathy as of 1950 was overwhelming. I say this because it does not seem so easy for researchers to demonstrate ESP using the protocols of the 1940s; instead there is continuing effort to come up with new designs.

How could Turing have thought this? I don’t know much about Turing but it does seem, when reading old-time literature, that belief in the supernatural was pretty common back then, lots of mention of ghosts etc. And there does seem, at least to me, to be an intuitive appeal to the idea that if we just concentrate hard enough, we can read minds, move objects, etc. Also, remember that, as of 1950, the discovery and popularization of quantum mechanics was not so far in the past. Given all the counterintuitive features of quantum physics and radioactivity, it does not seem at all unreasonable that there could be some new phenomena out there to be discovered. Things feel a bit different in 2014 after several decades of merely incremental improvements in physics.

To move things forward a few decades, Wagenmakers et al. mention “the phenomenon of social priming, where a subtle cognitive or emotional manipulation influences overt behavior. The prototypical example is the elderly walking study from Bargh, Chen, and Burrows (1996); in the priming phase of this study, students were either confronted with neutral words or with words that are related to the concept of the elderly (e.g., ‘Florida’, ‘bingo’). The results showed that the students’ walking speed was slower after having been primed with the elderly-related words.”

They then pop out this 2011 quote from Daniel Kahneman:

When I describe priming studies to audiences, the reaction is often disbelief . . . The idea you should focus on, however, is that disbelief is not an option. The results are not made up, nor are they statistical flukes. You have no choice but to accept that the major conclusions of these studies are true.

And that brings us to the beginning of this post, and my response: No, you don’t have to accept that the major conclusions of these studies are true. Wagenmakers et al. note, “At the 2014 APS annual meeting in San Francisco, however, Hal Pashler presented a long series of failed replications of social priming studies, conducted together with Christine Harris, the upshot of which was that disbelief does in fact remain an option.”

Where did Turing and Kahneman go wrong?

Overstating the strength of empirical evidence. How does that happen? As Eric Loken and I discuss in our Garden of Forking Paths article (echoing earlier work by Simmons, Nelson, and Simonsohn), statistically significant comparisons are not hard to come by, even by researchers who are not actively fishing through the data.

The other issue is that when any real effects are almost certainly tiny (as in ESP, or social priming, or various other bank-shot behavioral effects such as ovulation and voting), statistically significant patterns can be systematically misleading (as John Carlin and I discuss here).

Still and all, it’s striking to see brilliant people such as Turing and Kahneman making this mistake. Especially Kahneman, given that he and Tversky wrote the following in a famous paper:

People have erroneous intuitions about the laws of chance. In particular, they regard a sample randomly drawn from a population as highly representative, that is, similar to the population in all essential characteristics. The prevalence of the belief and its unfortunate consequences for psychological research are illustrated by the responses of professional psychologists to a questionnaire concerning research decisions.

Indeed.

Having an open mind

It’s good to have an open mind. Psychology journals publish articles on ESP and social priming, even though these may seem implausible, because implausible things sometimes are true.

It’s good to have an open mind. When a striking result appears in the dataset, it’s possible that this result does not represent an enduring truth or even a pattern in the general population but rather is just an artifact of a particular small and noisy dataset.

One frustration I’ve had in recent discussions regarding controversial research is the seeming unwillingness of researchers to entertain the possibility that their published findings are just noise. Maybe not, maybe these are real effects being discovered, but you should at least consider the possibility that you’re chasing noise. Despite what Turing and Kahneman say, you can keep an open mind.

P.S. Some commenters thought that I was disparaging Alan Turing and Daniel Kahneman. I wasn’t. Turing and Kahneman both made big contributions to science, almost certainly much bigger than anything I will ever do. And I’m not criticizing them for believing in ESP and social priming. What I am criticizing them for is their insistence that the evidence is “overwhelming” and that the rest of us “have no choice” but to accept these hypotheses. Both Turing and Kahneman, great as they are, overstated the strength of the statistical evidence.

And that’s interesting. When stupid people make a mistake, that’s no big deal. But when brilliant people make a mistake, it’s worth noting.

The post I disagree with Alan Turing and Daniel Kahneman regarding the strength of statistical evidence appeared first on Statistical Modeling, Causal Inference, and Social Science.

03 Sep 04:22

The Countries Where You’re Surrounded By Tourists

by Nate Silver

Edenovellis
How does Canada get so many tourists? And Japan so few? I can vouch for Spain being overrun by tourists.

When traveling in Tokyo a couple of years ago, I was struck by how few tourists there were. Sure, taking advice from an English-language website or guidebook could still lead you to a place with plenty of gaijin. But I didn’t have the sense of being surrounded by fellow tourists that one sometimes gets in Rome or Paris or Barcelona.

The statistics bear this out. Barcelona gets about twice as many international tourists each year as Tokyo. But that’s only part of the story. Barcelona’s metro-area population is about 4.7 million — compared to 39.4 million for Tokyo. Relative to the size of its population, Barcelona has about 16 times more tourists.

That’s the ratio you notice when walking around a city: how many of the people around you are tourists and how many are locals. We can also study this phenomenon at the country level, where the data is slightly more comprehensive. (Our focus will be on international tourists — not, for example, New Yorkers visiting the Grand Canyon.)

Countries vary in how welcoming, accessible and attractive they are to visitors. But as a general rule, the number of tourists doesn’t scale up linearly with the number of residents. China gets roughly 10 times more tourists than tiny Bahrain but has 1,000 times Bahrain’s population. That means Bahrain has about 100 times more tourists as a share of its population.
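The two comparisons above are the same piece of arithmetic: multiply the ratio of tourist counts by the inverse ratio of populations. Here is a minimal sketch using only the rounded figures quoted in the text; the function name is made up for illustration. With inputs this rounded, the Barcelona figure lands a little under 17, in the same neighborhood as the "about 16 times" quoted above.

```python
def relative_tourist_density(tourist_ratio: float, population_ratio: float) -> float:
    """How many times more tourists per resident place A has than place B.

    tourist_ratio:    tourists in A divided by tourists in B
    population_ratio: residents of B divided by residents of A
    (hypothetical helper, not from the article)
    """
    return tourist_ratio * population_ratio


# Barcelona vs. Tokyo: about twice the tourists, 4.7M vs. 39.4M residents.
barcelona_vs_tokyo = relative_tourist_density(2.0, 39.4 / 4.7)

# Bahrain vs. China: about one-tenth the tourists, one-thousandth the population.
bahrain_vs_china = relative_tourist_density(1 / 10, 1000)

print(f"Barcelona vs. Tokyo: {barcelona_vs_tokyo:.1f}x")
print(f"Bahrain vs. China:   {bahrain_vs_china:.0f}x")
```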

More formally, we might define this ratio as the tourist percentage — the number of international tourists in a country on an average day throughout the year as a share of the total number of people present:
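Written out from the definition in the sentence above, the ratio looks like this. The symbols are chosen here for illustration: T is the number of international tourists present on an average day and P is the resident population.

```latex
\[
\text{tourist percentage} = \frac{T}{T + P} \times 100\%
\]
```

This form agrees with the Vatican figures given later: five tourists for every resident works out to 5/6, or about 83 percent.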

To calculate tourist percentage, you need data on a country’s population and on the number of international tourists present on an average day. The population numbers are easy to find, but figuring out the number of tourists isn’t so simple.

The World Bank collects data on the annual number of international tourist arrivals in each country. (A tourist arrival, by the World Bank’s definition, occurs whenever someone enters a country from abroad and stays overnight.⁹⁸ I used the most recent year of data available from the World Bank — usually 2012 — although I had to turn to other sources in a couple of cases that I’ll note in the text below.) The challenge is in going from annual tourist arrivals to an estimate of how many tourists are present on an average day. If we knew how long the average tourist stayed, we could calculate it as follows:
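Spelled out, the conversion from annual arrivals to an average-day count is a simple proportion. The symbols are assumptions for illustration: A is annual international tourist arrivals, d is the average length of a visit in days, and T is the number of tourists present on an average day.

```latex
\[
T = \frac{A \times d}{365}
\]
```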

But there isn’t comprehensive data on the duration of tourist visits. In 2010, the U.S. Travel Association estimated the average length of a vacation taken by an American as 3.8 days (and falling). But foreign trips are presumably longer than average vacations — and American workers receive less time off than people in most other countries, so their vacations may be shorter. A week — seven days — probably works better as an estimate of the average international trip.

But there’s another issue: The length of an average visit presumably varies some from country to country. Compare Austria and Australia, for example, the two countries that your fourth-grade teacher admonished you not to confuse.

Austria is a small country in the middle of Europe. It’s easy to pass through Austria en route to somewhere else — on a train from Munich to Venice, for example. The World Bank wouldn’t count that particular journey as a tourist visit unless you stayed in Austria overnight. Nevertheless, a night or two spent in Vienna might be just one hop on a longer European trip. Australia, by contrast, is a giant island thousands of miles from the world’s population centers. That’s part of what makes visiting there fun: You really feel like you’ve gotten away. But you probably don’t go to Australia just for a day or two.

So to estimate the average length of a tourist visit, I used another World Bank data series on the annual amount of tourism income in each country. The ratio of tourism income to tourist arrivals ought to provide a rough proxy for the length of a journey: You’ll spend more money on a two-week vacation than a weekend holiday. The ratio will also be affected by other factors, however, such as the strength of a country’s currency and the wealth of the people who visit it.

I ran a regression on the ratio of tourism spending to tourist visits as a function of two geographic factors: the number of land borders that a country has and its area.⁹⁹ The regression implies that more isolated countries receive more money per visit, and that larger countries receive more money than smaller ones. These factors probably do tell us something about the average length of a visit; they imply that the average trip to Australia lasts about 50 percent longer than the average trip elsewhere, for instance.¹⁰⁰ We’ve estimated that the average international trip worldwide lasts seven days, so that means an average trip to Australia would last 10 to 11 days instead.

One country requires special handling: the Vatican. Lots of people visit it — in 2013, about 5.5 million people visited the Vatican Museums while 6.6 million attended some sort of public event with the Pope, such as in St. Peter’s Square. Even assuming that there was some overlap between these groups,¹⁰¹ that’s an awful lot of tourists in a country with a population of barely more than 800 people. On the other hand, visits to the Vatican are very short; there are no hotels there, so you need to have the invitation of the Pope or someone else important to stay there overnight. I assumed an average visit of four hours instead.

The Vatican still qualifies as a huge outlier relative to the rest of the world. My method estimates that there are about five tourists there on average for every one resident, or a tourist percentage of 83 percent. Surely the tourist percentage is higher still during the hours when the Vatican Museums are open or when the Pope is hosting a major public event.

Ranking second in the world is tiny Andorra, with a tourist percentage of 29 percent. The numbers drop off quickly after that; it’s about 11 percent in Palau and Bahrain, for example, 9 percent in the Bahamas and 8 percent in Monaco.

[Chart: countries ranked by tourist percentage]

There’s an economic lesson here: Even in places whose economies are dominated by tourism, and where it may seem like you’re completely surrounded by tourists, you’ll generally find a number of locals for every tourist. Hotels can employ one employee per guest room or even more in parts of the world where labor is cheap but tourist dollars are plentiful. Tourists also need dining, transportation and entertainment options. Meanwhile, all the people who work in the tourism industry have families and needs of their own. This is why tourism is associated with a multiplier effect; it produces secondary and tertiary sources of income and employment in addition to the revenues received directly from tourists.

By contrast, tourists can represent a vanishingly small part of the population in large countries. This next chart lists the tourist percentage for the top 25 countries in the world by GDP:

[Chart: tourist percentage for the top 25 countries by GDP]

Among these high-GDP countries, Spain ranks as the most touristy. On average throughout the year, about 2 percent of the people there are tourists, I estimate. (Obviously the percentage is going to be higher at certain places and times: during high season in Ibiza, for instance.) France is next at 1.7 percent, followed by Canada, the Netherlands, Italy and the United Kingdom. Australia isn’t far behind.

In Japan, however, the tourist percentage is only 0.2 percent. And it’s lower still in some other places: just 0.05 percent in China, 0.04 percent in Brazil, and 0.01 percent in India, meaning that only about 1 in 10,000 people is an international tourist there at any given time. These countries do attract reasonable numbers of tourists — but their native populations are very large, so tourists represent a tiny fraction of the population.

Some countries outside the top 25 in GDP undoubtedly have lower tourist percentages still. Afghanistan gets an estimated 3,500 tourists per year, for example, compared with a population of 30 million. Assuming an average visit there lasts seven days, that implies a tourist ratio of 0.0002 percent, or about one tourist for every 500,000 Afghans. However, countries with very limited tourist economies generally don’t publish reliable data on the industry, so it’s hard to know exactly which country ranks at the bottom.
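The Afghanistan figure can be checked with a few lines of arithmetic. This is a minimal sketch of the calculation as described in the text, not the author's actual code; the function names are made up, and the only inputs are the three numbers quoted above (3,500 arrivals, a seven-day stay, 30 million residents).

```python
def tourists_on_average_day(annual_arrivals: float, avg_stay_days: float) -> float:
    """Spread a year's arrivals over 365 days, weighting each by its length of stay."""
    return annual_arrivals * avg_stay_days / 365


def tourist_percentage(annual_arrivals: float, avg_stay_days: float, residents: float) -> float:
    """Tourists as a share of everyone present (tourists plus residents), in percent."""
    tourists = tourists_on_average_day(annual_arrivals, avg_stay_days)
    return 100 * tourists / (tourists + residents)


# Figures quoted in the text for Afghanistan.
arrivals, stay_days, population = 3_500, 7, 30_000_000

pct = tourist_percentage(arrivals, stay_days, population)
residents_per_tourist = population / tourists_on_average_day(arrivals, stay_days)

print(f"tourist percentage: {pct:.4f}%")                      # roughly 0.0002 percent
print(f"residents per tourist: {residents_per_tourist:,.0f}")  # a bit under 500,000
```

The same two functions work for any country once you pick an average length of stay, which is the part the regression on borders and area is meant to estimate.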

Those countries probably aren’t at the top of most tourists’ bucket lists anyway. But Brazil, China, India and Japan have incredible experiences to offer. The choice is clear: If you want to get away from tourist traps, go to places where there are more locals.

31 Aug 02:25

Rte. 9 could get dedicated bike lanes in Brookline Village

29 Aug 14:35

Writing Skills

I'd like to find a corpus of writing from children in a non-self-selected sample (e.g. handwritten letters to the president from everyone in the same teacher's 7th grade class every year)--and score the kids today versus the kids 20 years ago on various objective measures of writing quality. I've heard the idea that exposure to all this amateur peer practice is hurting us, but I'd bet on the generation that conducts the bulk of their social lives via the written word over the generation that occasionally wrote book reports and letters to grandma once a year, any day.

madjo, Lori and 16 others like this

29 Aug 01:36

The executive order that led to mass spying, as told by NSA alumni

by Cyrus Farivar

The Oval Office as it looked at the end of President Reagan's second term, as seen in the replica at the Ronald Reagan Presidential Library.

Dhrupad Bezboruah

One thing sits at the heart of what many consider a surveillance state within the US today.

The problem does not begin with political systems that discourage transparency or technologies that can intercept everyday communications without notice. Like everything else in Washington, there’s a legal basis for what many believe is extreme government overreach—in this case, it's Executive Order 12333, issued in 1981.

“12333 is used to target foreigners abroad, and collection happens outside the US," whistleblower John Tye, a former State Department official, told Ars recently. "My complaint is not that they’re using it to target Americans, my complaint is that the volume of incidental collection on US persons is unconstitutional.”

Read 53 remaining paragraphs | Comments

27 Aug 19:50

You get a personal data site, and you get one, and you too

by Nathan Yau

Personal data collection keeps getting easier and more efficient. Much of what was manual or clunky a few years ago is now automatic, done via the phone we carry every day anyway. More recently, personal data is finding a way out of the closed networks and applications and on to our own computers and servers.

Anand Sharma's personal site is the newest example of what an individual can do with his or her own data. On a whim a few months ago, Sharma downloaded the Moves app, which tracks your location, and was hooked. Then with some design inspiration from Tony Stark, Sharma put a site together to show a feed of several aspects of his life, mostly tracked with his phone.

Above is the homepage, full of concentric circles similar to Iron Man's holographic interface. It's a fun view, but the meat of the site is in the breakdowns.

The "sport" section shows runs, climbs, and other aspects of his health.

The "explorer" section shows travels and more everyday activities.

There's also a third "journal" section, which is a blog holder for Sharma's thoughts. Be sure to check out the most recent process post about the making of the site.

The site is fun to poke at, and it's easy to see how this might apply to your own life and daily activities. Again though, the best part is that this is possible, with little effort compared to just a few years ago. Let the apps in the background do most of the work for you. Then take your data out of the one-size-fits-all profile and make something catered exactly to your needs.

27 Aug 17:52

Green Line Update Teases Improvements Enabled by Tracking Technology

Edenovellis
Hmm I knew that the D line was the fastest, but the C line being the slowest is surprising. I could have sworn it was the B line. Also signal priority with real time updates on where the Green line is sounds dreamy.

27 Aug 15:18

How a "Knee Defender" Got a United Flight Diverted

by Boston.com staff

Airline passengers have come to expect a tiny escape from the confined space of today's packed planes: the ability to recline their seat a few inches. When one passenger was denied that bit of personal space Sunday, it led to a heated argument and the unscheduled landing of their plane, just halfway to its destination.

21 Aug 20:00

Spice of Brookline Is Now Giggling Rice

by noreply@blogger.com (Marc)

A Thai restaurant in Brookline that opened around the beginning of the year has undergone a name change, and its new name is a rather unusual one.

According to @PatrickMBoston of Maguire Promotions, Spice of Brookline on Beacon Street is now Giggling Rice, with a page on the Town of Brookline site (that appears to have been taken down) saying that the change "stems from another restaurant in Cambridge with a similar name" that "has caused some confusion with their patrons." It appears that the dining spot remains mostly the same, though a poster on Yelp says that there do appear to be a few changes made to the menu.

Spice of Brookline initially took over the space that had been vacated by Han River.

The address for Giggling Rice is 1009 Beacon Street, Brookline, MA, 02446.

[A related post from our sister site (Boston's Hidden Restaurants): List of Restaurant Closings and Openings in the Boston Area]

21 Aug 03:31

Actual Japanese Workwear Check out these absolutely stunning...

by derekguypto

Actual Japanese Workwear

Check out these absolutely stunning Japanese firemen coats. Known as Hanten coats, these were worn by Japanese firefighters in the 19th century. At the time, the technology to spray water at a high-enough pressure hadn’t been invented yet, so Japanese men had to fight fires by creating firebreaks downwind. Doing so, however, put them in danger of catching on fire themselves, as hot embers can travel up to a mile. To make their coats more protective, they were continually doused with water.

The symbols and designs you see are for several things. Some are just for decoration, of course, while some signal the fire crew that the wearer belonged to. Others are lucky symbols or refer to a heroic story, giving the wearer encouragement to be strong and courageous.

You can see these coats in person (along with many other awesome things) at Shibui, a shop in New York City for Japanese antiques and collectibles. They’re moving at the end of September and are having a sale right now to lighten their load. Select items are discounted by up to 50%, including lots of boro fabrics, which is a kind of heavily patched and mended Japanese textile. You can see examples of boro here.

For those of us outside of NYC, Shibui has a Google+ page you can admire (they’ll take phone orders, if you’re interested). There’s also a book titled Hanten and Happi, which is all about traditional Japanese work coats.

20 Aug 15:27

A Mother’s Journey Through the Unnerving Universe of ‘Unboxing’ Videos

19 Aug 19:33

Man arrested, strip-searched after photographing NYPD wins $125,000

by David Kravets

A New York man who claimed police arrested and strip-searched him after he photographed a stop-and-frisk of three African-American youths has settled his civil rights suit with the New York Police Department for $125,000.

The settlement, first reported Monday by the Daily News, comes weeks after the NYPD reminded its officers that it was legal to peacefully record police activity. That department-wide memo followed the videotaped NYPD arrest of a man who died after being subdued by a chokehold last month.

The NYPD settled with a man named Dick George, who alleged that while he was sitting in his parked car in Flatbush in 2012, he saw two NYPD officers get out of an unmarked car and perform what is known as a stop-and-frisk of three youths. George said he captured the search on his mobile phone. He claimed he went up to the youths and told them next time that happens to make sure they get the officers' badge numbers.

Read 4 remaining paragraphs | Comments

18 Aug 17:24

Ivy Style has a really cool article about the connection between...

by jessethorn

Ivy Style has a really cool article about the connection between 60s prepsters and batik-printed cotton. You see these once in a while on eBay - I’d love to find one for myself.

18 Aug 15:49

Thousand-robot swarm assembles itself into shapes

by The Conversation

Michael Rubenstein, Harvard

There is something magical about seeing 1,000 robots move when humans are not operating any of them. And in a new study published in Science, researchers have created just that. This swarm of 1,000 robots can assemble themselves into complex shapes without the need for a central brain or a human controller.

Self-assembly of this kind can be found in nature—from molecules forming regular crystals and cells forming tissues, to ants building rafts to float on water and birds flocking to avoid becoming prey. Complex forms emerge from local interactions among thousands, millions, or even trillions of limited and unreliable individual elements.

These self-organized systems have interesting features. First, they are decentralized—that is, they don't need a central brain or leader. Second, they are scalable, so you can add large numbers of individuals. Third, they are robust—individuals that are unreliable don't break the system.

Read 11 remaining paragraphs | Comments

18 Aug 15:40

Some quick disorganized tips on classroom teaching

by Andrew

Below are a bunch of little things I typically mention at some point when I’m teaching my class on how to teach. But my new approach is to minimize lecturing, and certainly not to waste students’ time by standing in front of a group of them, telling them things they could’ve read at their own pace.

Anyway, here I am preparing my course on statistical computing and graphics and thinking of points to mention during the week on classroom teaching. My old approach would be to organize these points in outline format and then “cover” them in class. Instead, though, I’ll stick them here and then I can assign this to students to read ahead of time, freeing up class time for actual discussion.

Working in pairs:

This is the biggie, and there are lots of reasons to do it. When students are working in pairs, they seem less likely to drift off, also with two students there is more of a chance that one of them is interested in the topic. Students learn from teaching each other, and they can work together toward solutions. It doesn’t always work for students to do homework in pairs or groups—I have a horrible suspicion that they’ll often just split up the task, with one student doing problem #1, another doing problem #2, and so forth—but having them work in pairs during class seems like a no-lose proposition.

The board:

Students don’t pay attention all the time nor do they have perfect memories; hence, use the blackboard as a storage device. For example, if you are doing a classroom activity (such as the candy weighing), outline the instructions on the board at the same time as you explain them to the class. For another example, when you’re putting lots of stuff on the board, organize it a bit: start at the top-left and continue across and down, and organize the board into columns with clear labels. In both cases, the idea is that if a student is lost, he or she can look up at the board and have a chance to see what’s up.

Another trick is to load up the board with relevant material before the beginning of class period, so that it’s all ready for you when you need it.

The projector:

It’s becoming standard to use beamer (powerpoint) slide presentations in classroom teaching as well as with research lectures. I think this is generally a good idea, and I have just a few suggestions:
- Minimize the number of words on the slides. If you know what you’re talking about, you can pretty much just jump from graph to graph.
- The trouble with this strategy is that, without seeing the words on the screen, it can be hard to remember what to say. This suggests that what we really need is a script (or, realistically, a set of notes) to go along with the slide show. Logistically this is a bit of a mess—it’s hard enough to keep a set of slides updated without having to keep the script aligned at the same time—and as a result I’ve tended to err on the side of keeping too many words on my slides (see here, for example). But maybe it’s time for me to bite the bullet and move to a slides-and-script format.

Another intriguing possibility is to go with the script and ditch the slides entirely. Indeed, you don’t even need a script; all you need are some notes or just an idea of what you want to be talking about. I discovered this gradually over the past few years when giving talks (see here for some examples). I got into the habit of giving a little introduction and riffing a bit before getting to the first slide. I started making these ad libs longer and longer, until at one point I gave a talk that started with 20 minutes of me talking off the cuff. It seemed to work well, and the next step was to give an entire talk with no slides at all. The audience was surprised at first but it went just fine. Most of the time I come prepared with a beamer file full of more slides than I’ll ever be able to use, but it’s reassuring to know that I don’t really need any of them.

Finally, assuming you do use slides in your classes, there’s the question of whether to make the slides available to the students. I’m always getting requests for the slides but I really don’t like it when students print them out. I fear that students are using the slides as a substitute for the textbook, also that if the slides are available, students will think they don’t need to pay attention during class because they can always read the slides later.

It’s funny: Students are eager to sign up for a course to get that extra insight they’ll obtain from attending classes, beyond whatever they’d get by simply reading the textbook and going through the homework problems on their own. But once they’re in class, they have a tendency to drift off, and I need to pull all sorts of stunts to keep them focused.

The board and the projector, together:

Just cos your classroom has a projector, that don’t mean you should throw away your blackboard (or whiteboard, if you want to go that stinky route). Some examples:
- I think it works better to write out an equation or mathematical derivation in real time rather than to point at different segments of an already-displayed formula.
- It can help to mix things up a little. After a few minutes of staring at slides it can be refreshing to see some blackboard action.
- You can do some fun stuff by projecting onto the blackboard. For example, project x and y axes and some data onto the board, then have a pair of students come up and draw the regression line with chalk. Different students can draw their lines, then you click onto the next slide which projects the actual line.

Handouts:

Paper handouts can be a great way to increase the effective “working memory” for the class. Just remember not to overload a handout. Putting something on paper is not the same thing as having it be read. You should figure out ahead of time what you’re going to be using in class and then refer to it as it arises.

I like to give out roughly two-thirds as many handouts as there are people in the audience. This gives the handouts a certain scarcity value, also it enforces students discussing in pairs since they’re sharing the handouts already. I found that when I’d give a handout to every person in the room, many people would just stick the handout in their notebook. The advantage of not possessing something is that you’re more motivated to consume it right away.

Live computer demonstrations:

These can go well. It perhaps goes without saying that you should try the demo at home first and work out the bugs, then prepare all the code as a script which you can execute on-screen, one paragraph of code at a time. Give out the script as a handout and then the students can follow along and make notes. And you should decide ahead of time how fast you want to go. It can be fine to do a demo fast to show how things work in real life, or it can be fine to go slowly and explain each line of code. But before you start you should have an idea of which of these you want to do.

Multiple screens:

When doing computing, I like to have four windows open at once: the R text editor, the R console, an R graphics window (actually nowadays I’ll usually do this as a refreshable pdf or png window rather than bothering with the within-R graphics window), and a text editor for whatever article or document I’m writing.

But it doesn’t work to display 4 windows on a projected screen: there’s just not enough resolution, and, even if resolution were not a problem, the people in the back of the room won’t be able to read it all. So I’m reluctantly forced to go back and forth between windows. That’s one reason it can help to have some of the material in handout form.

What I’d really like is multiple screens in the classroom so I can project different windows on to different screens and show all of them at once. But I never seem to be in rooms with that technology.

Jitts:

That’s “just in time teaching”; see here for details. I do this with all my classes now.

Peer instruction:

This is something where students work together in pairs on hard problems. It’s an idea from physics teaching that seems great to me but I’ve never succeeded in implementing true “peer instruction” in my classes. I have them work in pairs, yes, but the problems I give them don’t look quite like the “Concept Tests” that are used in the physics examples I’ve seen. The problem, perhaps, is that intro physics is just taught at a higher level than intro statistics. In my intro statistics classes, it’s hard enough to get the students to learn about the basics, without worrying about getting them into more advanced concepts. So when I have students work in pairs, it’s typically on more standard problems.

Drills:

In addition to these pair or small-group activities, I like the idea of quick drills that I shoot out to the whole class and students do, individually, right away. I want them to be able to handle basic skills such as sqrt(p*(1-p)/n) or log(a*x^(2/3)) instantly.
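
As a rough illustration of the kind of quick calculation meant here, the two expressions can be evaluated in a few lines of Python. This is only a sketch: the function names and the example values (p = 0.5, n = 100, a = 2, x = 8) are hypothetical, chosen because the answers are easy to check by hand.

```python
import math

def se_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p*(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def log_power(a, x):
    """Natural log of a*x^(2/3); equivalently log(a) + (2/3)*log(x)."""
    return math.log(a * x ** (2 / 3))

# Hypothetical drill values, not taken from any particular course.
print(se_proportion(0.5, 100))  # sqrt(0.25/100), about 0.05
print(log_power(2, 8))          # log(2 * 4) = log(8), about 2.08
```

The second drill is a reminder that the log of a product splits into a sum, so log(a*x^(2/3)) can also be worked out as log(a) + (2/3)*log(x).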

Getting their attention:

You want your students to stay awake and interested, to enter the classroom full of anticipation and to leave each class period with a brainful of ideas to discuss. Like a good movie, your class should be a springboard for lots of talk.

But you don’t want to get attention for the wrong things. An extreme example is the Columbia physics professor who likes to talk about his marathon-fit body and at one point felt the need to strip to his underwear in front of his class. This got everyone talking—but not about physics. At a more humble level, I sometimes worry that I’ll do goofy things in class to get a laugh, but then the students remember the goofiness and not the points I was trying to convey. Most statistics instructors probably go too far in the other direction, with a deadpan demeanor that puts the students to sleep.

It’s ok to be “a bit of a character” to the extent that this motivates the students to pay attention to you. But, again, I generally recommend that you structure the course so that you talk less and the students talk more.

Walking around the classroom:

Or wheeling around, if that’s your persuasion. Whatever. My point here is that you want your students to spend a lot of the class time working on problems in pairs. While they’re doing this, you (and your teaching assistants, if this is a large so-called lecture class with hundreds of students) should be walking around the room.

Teaching tips in general:

As I explained in my book with Deb Nolan, I’m not a naturally good teacher and I struggle to get students to participate in class. Over the decades I’ve collected lots of tricks because I need all the help I can get. If you’re a naturally good teacher or if your classes already work then maybe you can do without these ideas.

Preparation:

It’s not clear how much time should be spent preparing the course ahead of time. I think it’s definitely a good idea to write the final exam and all the homeworks before the class begins (even though I don’t always do this!) because then it gives you a clearer sense of where you’re heading. Beyond that, it depends. I’m often a disorganized teacher and I think it helps me a lot to organize the entire class before the semester begins.

Other instructors are more naturally organized and can do just fine with a one-page syllabus that says which chapters are covered which weeks. These high-quality instructors can then just go into each class, quickly get a sense of where the students are stuck, and adapt the class accordingly. For them, too much preparation might well backfire.

My problem is that I’m not so good at individualized instruction; even in a small class, it’s hard for me to keep track of where each student is getting stuck, and what the students’ interests and strengths are. I’d like to do better on this, but for now I’ve given up on trying to adapt my courses for individuals. Instead I’ve thrown a lot of effort into detailed planning of my courses, with the hope that these teaching materials will be useful for other instructors.

Students won’t (in general) reach your level of understanding:

You don’t teach students facts or even techniques, you teach them the skills needed to solve problems (including the skills needed to find the solution on their own). And there’s no point in presenting things they’re not supposed to learn; for example, if a mathematical derivation is important, put it on the exam with positive probability. And if students aren’t gonna get it anyway (my stock example here is the sampling distribution of the sample mean), just don’t cover it. That’s much better, I think, than wasting everyone’s time and diluting everyone’s trust level with a fake-o in-class derivation.

The road to a B:

You want a plan by which a student can get by and attain partial mastery of the material. See discussion here.

Evaluation:

What, if anything, did the students actually learn during the semester?

You still might want to evaluate what your students are actually learning, but we don’t usually do this. I don’t even do it, even though I talk about it. Creating a pre-test and post-test is work! And it requires some hard decisions. Whereas not testing at all is easy. And even when educators try to do such evaluations, they’re often sloppy, with threats to validity you could drive a truck through. At the very least, this is all worth thinking about.

Relevance of this advice to settings outside college classrooms:

Teaching of advanced material happens all over, not just in university coursework, and much of the above advice holds more generally. The details will change with the goals—if you’re giving a talk on your latest research, you won’t want the audience to be spending most of the hour working in pairs on small practice problems—but the general principles apply.

Anyway, it was pretty goofy that I used to teach a course on teaching and stand up and say all these things. It makes a lot more sense to write it here and reserve class time for more productive purposes.

One more thing

I can also add to this post between now and the beginning of class. So if you have any ideas, please share them in the comments.

The post Some quick disorganzed tips on classroom teaching appeared first on Statistical Modeling, Causal Inference, and Social Science.

16 Aug 03:38

Decepticruiser roams Dorchester

by adamg

Edenovellis
Pretty sure I saw this car out in front of CNS

The Braintree guy who painted a Maserati to look like a police cruiser spent some time tooling around Dorchester today. Jay Kelly spotted him at Adams and Park streets in Fields Corner.

Of course, the effect that got him in a spot of trouble on the South Shore is lessened quite a bit in Boston, where our police cruisers don't use the black-and-white motif preferred by many of the area's smaller towns.

15 Aug 13:56

On the Street…La Fortezza, Florence

by The Sartorialist

07 Aug 14:48

Correlation does not even imply correlation

by Andrew

The above title is my response to a discussion that began with this email sent to me by Steve Roth:

Noah Smith had a great tweet recently, a real keeper for me [Roth].

Causation is correlated with correlation.

I would reword it:

Correlation correlates with causation. (Just not very much.)

And I wonder if the following corollaries are safe:

Non-correlation correlates (more strongly) with non-causation.

And/or:

Negative correlation correlates (much more strongly) with non-causation.

This in response to the old nostrum/saw that correlation does not imply causation.

Which has always seemed wrong to me. Of course it does! (Weakly.)

The problem is that “imply” is a very slippery word, so it’s a pretty useless nostrum.

Would be delighted to see a post poking at this.

I replied:

I will post something on this (at some point; we’re on a 1-2 month delay so most things don’t appear right away) but my quick response is: Selection bias. If people start sending you random pairs of variables that happen to be highly correlated, sure, there might well be a connection between them, for example kids’ scores on math tests and language tests are correlated, and this tells us something. But if someone is looking for a particular pattern, and then selects two variables that are correlated, that’s another story. The great thing about causal identification is that it’s valid even if you’re looking to find a pattern. (Not completely, there’s p-hacking and also you can run 100 experiments and only report the best one, etc., but that’s still less of an issue than the fact that pure correlation does not logically tell you anything about causation.) To put it another way, returning to Noah’s tweet: Correlation is surely correlated with causation in an aggregate sense, but if you take the subset of correlations that a particular motivated researcher is looking for, then maybe not.

You could also see the above paragraph as a bit of common-sense reasoning. The expression “correlation does not imply causation” is popular, and I think it’s popular for a reason, that it does capture a truth about the world.

I cc-ed Smith on this exchange and also Dan Kahan, who wrote:

For what it’s worth, my two variants would be:

1. Nothing other than correlation implies causation.
2. Correlation implies causation — except when it doesn’t.

Credit to D. Hume for #1 (at least for noticing that there’s no other visible indicator of causation).
#2 is just what Andrew said: causation = correlation plus valid causal inference.

Again, the elephant in the room here is selection. People see enough random correlations that they can pick them out and interpret them how they like.

So if I had to put something on a bumper sticker (or a tweet), it would be:

Correlation does not even imply correlation

That is, correlation in the data you happen to have (even if it happens to be “statistically significant”) does not necessarily imply correlation in the population of interest.

P.S. I’ve shifted the emphasis in my slogan to make the point clearer.

The post Correlation does not even imply correlation appeared first on Statistical Modeling, Causal Inference, and Social Science.

07 Aug 05:33

Monkey’s selfie at center of copyright brouhaha

by David Kravets

Wikimedia Commons

An English nature photographer is going ape over Wikipedia's refusal to remove pictures of a monkey from the online encyclopedia that he says are being displayed without his permission.

Wikimedia, the operation that runs Wikipedia, says that the public, not photojournalist David Slater, maintains the rights to the works. That's because the black macaca nigra monkey swiped the camera from Slater during a 2011 shoot in Indonesia and snapped tons of pictures, including the selfie and others at issue.

"We received a takedown request from the photographer, claiming that he owned the copyright to the photographs. We didn't agree. So we denied the request," Wikimedia said Wednesday in its transparency report.

06 Aug 02:52

How Microsoft dragged its development practices into the 21st century

by Peter Bright

Waterfalls: picturesque in nature, less so in development.

Flickr user: Dina Eric

SEATTLE—For the longest time, Microsoft had something of a poor reputation as a software developer. The issue wasn't so much the quality of the company's software but the way it was developed and delivered. The company's traditional model involved cranking out a new major version of Office, Windows, SQL Server, Exchange, and so on every three or so years.

The releases may have been infrequent, but delays, or at least perceived delays, were not. Microsoft's reputation in this regard never quite matched the reality—the company tended to shy away from making any official announcements of when something would ship until such a point as the company knew it would hit the date—but leaks, assumptions, and speculation were routine. Windows 95 was late. Windows 2000 was late. Windows Vista was very late and only came out after the original software was scrapped.

In spite of this, Microsoft became tremendously successful. After all, many of its competitors worked in more or less the same way, releasing paid software upgrades every few years. Microsoft didn't do anything particularly different. Even the delays weren't that unusual, with both Microsoft's competitors and all manner of custom software development projects suffering the same.

04 Aug 02:27

From the New Yorker: It now appears… that jeans savaged...

by breathnaigh

From the New Yorker:

It now appears… that jeans savaged by wild animals are a trend in designer sportswear. A Japanese denim brand had the bright idea, at least for raising its profile, of sewing indigo-dyed cotton fabric around rimless tires, sausage-shaped bolsters, and fat rubber balls, and throwing the objects to the inmates of the Kamine Zoo, in Hitachi City. In an accompanying video, the beasts bound from their cages and fall upon their novel chew toys with such relish that you have to wonder if there isn’t a little catnip involved. The scene reminded me of toddlers on Christmas morning, tumbling down the stairs, unable to contain their excitement, and tearing into the neatly wrapped parcels under the tree.

When the fabric has been properly “distressed” — i.e., mauled — it is retrieved from the enclosures and made into trousers that are sold under the label Zoo Jeans.

-Pete

01 Aug 14:14

Expats in Tangier The New York Times’ Style Magazine...

by derekguypto

Expats in Tangier

The New York Times’ Style Magazine published a feature a few months ago on Tangier, a northern Moroccan city that has long been a destination for European and American diplomats, spies, writers, and businessmen. The story focuses on the expat community there and their eccentric style: people who came to visit and decided never to leave. An excerpt:

It’s an old story — as old as sailing and sex — yet there is always something new coming over the strait. Indeed, it may be the hunt for newness in an old port that brought them here, adventurers and outsiders — from Mark Twain and Delacroix to Yves Saint Laurent and Tennessee Williams — who merely broke the path for the uprooted of today. Deep in the Casbah and high on the slopes of Vieille Montagne, you find these people, these elegant, exotic plants who fill their days with lunch parties and gossip. They may be the harmless denizens of an old idea, doing it with style, living beyond their means but strictly within their taste. It is a painted city where ripe vegetables and aged spies litter the souks, where men of hidden consequence can always find a drink. Most of all, Tangier is a city where attention to detail is undivided, a place where you meet people just crazy for beauty.

[…]

In a large old room smelling of narcissi, Pasti sat me down and smiled through cigarette smoke. The tables around us were filled with strange shells, bones and Neolithic pottery. I looked around as he spoke and you could almost breathe the beauty: a piece of an Islamic column from Spain, an Italian Renaissance stemma, many Berber pots, pine cones and marble busts. Past a big 17th-century German armoire was a fireplace of the same period. An 18th-century Venetian screen held back a little of the evening air, which came, nonetheless, rosemary-scented and chilled. Painted Moroccan chests and side tables were dotted everywhere — “I love patina,” he said — and around the walls was a multitude of astonishing tile panels, some from Seville and Portugal and fired 200 years before the birth of Shakespeare. Pasti writes novels and makes gardens. He is both intensely sociable and extremely private. Walking from room to room in his perfect house, he seemed somewhat like a man in a fairy tale, lost in beauty, hiding behind windows in a secret garden. But then he laughed and puffed on his cigarette and seemed quite normal again. Pasti started as a literary critic and then began collecting strange fragments and rare bulbs, which he would plant in his garden in the Moroccan countryside, and also in pots at his house in Tangier. His first novel is the story of a botanical obsession. “I started collecting wild bulbs more or less 15 years ago,” he said. He sometimes sleeps outside among the plants. In some ways he considers himself to be a kind of doctor to sick plants and sees his place in the country as a kind of botanical hospital.

You can read the whole article, and view a very, very cool video feature that goes along with it, here.

30 Jul 02:16

Podcasting patent troll: We tried to drop lawsuit against Adam Carolla

by Joe Mullin

wasim muklashy

Personal Audio LLC is an East Texas shell company that gained national attention when it claimed it had the right to demand cash from every podcaster. The company was wielding a patent on "episodic content," which it said included anyone doing a podcast, as well as many types of online video.

Now the company is trying to walk away from its highest-profile lawsuit against comedian Adam Carolla, without getting paid a penny—but Carolla won't let the case drop.

In a statement released today, Personal Audio says that Carolla, who has raised more than $450,000 from fans to fight the case, is wasting their money on an unnecessary lawsuit. The company, which is a "patent troll" with no business other than lawsuits, has said Carolla just doesn't care since his fans are paying his lawyers' bills.

Standards of evidence