Shared posts

04 Jun 18:10

Javascript and the next decade of data programming

I’ve recently been getting pretty far into the weeds about what the future of data programming is going to look like. I use pandas and dplyr in python and R respectively. But I’m starting to see the shape of something that’s interesting coming down the pike. I’ve been working on a project that involves scatterplot visualizations at a massive scale–up to 1 billion points sent to the browser. In doing this, five things have become clear:

  1. Computers have gotten much, much faster in the last couple decades.
  2. Our languages for data analysis have failed to keep up.
  3. New data formats are making the differences between Python, R, and Javascript less important.
  4. Javascript, the quintessential front-end language, is increasingly becoming the back-end for data work in Python and R.
  5. Things will be weird, but also maybe good?

I tweeted about it once, after I had experimented with binary, serialized alternatives to JSON.

I’m writing about Python and R because they’re completely dominant in the space of data programming. (By data programming, I mean basically ‘data science’; not being a scientist, I have trouble using that term to describe what I do.) Some dinosaurs in economics still use Stata, and some wizards use Julia, but if you want to work with data that’s basically it. The big problem with the programming languages we use to work with data is that they run largely on CPUs, and often predominantly on a single core. This has always been an issue in terms of speed; when I first switched to Python around 2011, I furiously searched for ways around the GIL (global interpreter lock) that keeps the language from using multiple cores even on threads. Things have gotten a little better on some fronts–in general, it seems like at least linear algebra routines can make use of a computer’s full resources.

JS/HTML is the low-level language for UI in Python and R.

Separately, the graphical and interface primitives of all programs have started to move to the web. If I had started doing this kind of work seriously even a couple years later, I would never even have noticed there used to be another way. I never really used tcl/tk interfaces in R, but I was always aware that they existed; the very first, private version of the Google Ngrams browser that JB Michel wrote in like 2008 or something was built around some Python library. This was normal. But in the last decade, it’s become obvious that if you want to build user-facing elements to describe something like “a button” or “a mouseover”, the path of least resistance is to use the HTML conception, not the operating system conception of them. The fifteen-year-old freshman who built the first Bookworm UI quickly saw it needed a javascript plotting library. This integration is becoming tighter and tighter in data programming land. I have collaborators and grad students who transition seamlessly into bundling their R packages into Shiny apps, into decorating their Google Colab notebooks with all sorts of sliders and text entry fields, into publishing R and Python code as online books with HTML/JS navigation.

Jupyter notebooks and the RStudio IDE themselves are part of this transformation; what appears to be Python code is held together by an invisible skein of Javascript. Again, these are platforms that have more or less displaced earlier models. When I first learned R, I pasted from TextEdit into the core R GUI; I went a little down the road into ESS-mode in emacs as well. But if you need to continually be checking random samples of a dataframe, re-running modules, and seeing if your regular expressions correctly clean a dataset, you are using a notebook interface today, even if you bundle your code into a module at some point.

And for visualization, Javascript is creeping into this space. Like many people, I’ve been relieved to be able to use Altair instead of matplotlib for visualizing pandas dataframes; and I don’t think twice about dropping ggplotly into lessons about ggplot for students who start wondering about tooltips on mouseover. ggplot and matplotlib are still king of the roost for publication-ready plots, but after becoming accustomed to interactive, responsive charts on the web, we are coming to expect exploratory charts to do the same thing; just as select menus and buttons from HTML fill this role in notebook interfaces, JS charting libraries do the same for chart interfaces.

The GPU-laptop interface is an open question

Let me be clear–something I’ll say in this following section is certainly wrong. I’m not fully expert in what I’m about to say. I don’t know who is! There are some analogies to web cartography, where I’ve learned a lot from Vladimir Agafonkin. Many of the tools I’m thinking about I learned about in a set of communications with Doug Duhaime and David McClure. But the field is unstable enough that I think others may stumble in the same direction I have.

This whole period, GPUs have also been displacing CPUs for computation. The R/Python interfaces to these are tricky. Numba kind of works; I’ve fiddled with gnumpy from time to time; and I’ve never intentionally used a GPU in R, although it’s possible I did without knowing it. The path of least resistance to GPU computation in Python and R is often to use Tensorflow or Torch even for purposes that don’t really need a neural network library–so I find myself, for example, training UMAP models using the neural network interface rather than the CPU one even though I’d prefer the other.

Most of these rely on CUDA to access GPUs. (When I said I don’t know what I’m talking about–this is the core of it.) If you want to do programming on these platforms, you increasingly boot up a cloud server and run heavy-duty models there. CUDA configuration is a pain, and the odds are decent your home machine doesn’t have a GPU anyway. If you want to run everything in the cloud, this is fine–Google just gives away TPUs for free. But for a group-by/apply/summarize on a few million rows, this is overkill; and while cloud compute is pretty cheap compared to your home laptop, cloud storage is crazy expensive. Digital Ocean charges me like a hundred dollars a year just to keep up the database backing RateMyProfessor; for the work I do on several terabytes of data from the HathiTrust, I’d be lost without a university cluster and the 12TB hard drive on my desk at home.

But I want these operations to run faster.

Javascript is already fast, even without its GPU.

When I started using webgl to make charts in Javascript, I was completely blown away by what it could do. I’m used to sitting around waiting for ggplot to render even a few thousand points. I’m used to polygon operations in geopandas being long and expensive. I’m used to getting up to get some tea when I want to load a geojson file.

But I could use javascript to generate millions of points in random polygons from primitive triangles in barely any time; and then using regl it can animate fast enough to make seamless zooming reasonable. Here, for example, is every single vote (excluding absentee) in New York City precincts in the 2020 election. (Hopefully this embed from Observable loads… but if it doesn’t, well, that’s kind of the point, too. I’m making you click below to avoid clobbering people on phones.)

Digging into the weeds to make more elaborate visualizations like this, I can see why. Apache Arrow exposes an extremely low level model of the data you work with, that encourages you to think a lot about both the precise schema and the underlying types. In Python, I’ve gotten used to this kind of work in numpy; in R, I’ve only ever done a little bit of bit twiddling. But in modern JS, binary array buffers are built right into the language. When I started tinkering with JS, I thought of it as slow; but web developers are far more obsessive about speed than the users of any other high-level, dynamically typed language I’ve seen. The profiling tools built into Chrome are incredibly powerful; and Google, especially, has made a huge investment in making JS run incredibly quickly because there’s huge money in frictionless web experience. Sure, lots of websites are slow because they come with megabyte-sized React installations and casual bloat; sure, the DOM is slow to work with. But Javascript itself is fast.
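To make the idea of a typed binary buffer concrete, here is a minimal sketch in Python using only the standard library. It is not Arrow and not a JS typed array; the column values are made up for illustration, and `array` and `memoryview` merely stand in for those structures. The only point is that a block of bytes with a known type can be read as numbers with no parsing step.

```python
from array import array

# Four 32-bit floats stored back to back, roughly how a columnar
# format lays out one numeric column. The values are made up.
column = array("f", [1.5, 2.25, -3.0, 4.0])

# These raw bytes are what would sit on disk or travel over the network.
raw = column.tobytes()
size_in_bytes = len(raw)

# Reinterpret the same bytes as float32 values. No text parsing happens:
# the reader only needs to know the element type.
view = memoryview(raw).cast("f")
values = view.tolist()

print(size_in_bytes, values)
```

The reader has to know the element type in advance, which is why schema-carrying formats matter: the type information travels with the data instead of being guessed.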

In my first few years teaching digital humanities, probably the most thankless task was helping students manage their local Java installations so they could run Mallet, the best implementation of topic-modeling algorithms out there. Now, we usually use slower and inferior implementations in gensim, structural topic models, and the like. (For an interesting discussion from Ted Underwood and Yoav Goldberg of how inferior results in gensim and sklearn came to displace mallet, see the Twitter threads here.) But as David Mimno, who keeps Mallet running, says, Javascript works much faster.

And while Javascript has a reputation as a terrible language, the post-ES2015 iterations have made it in many cases relatively easy to program with. Maps, sets, for ... of ... all work much like you’d expect (unlike the days when I spent a couple hours hunting out a rarely occurring bug in one data visualization that turned out to occur when I was making visualizations of wordcounts that included the word constructor somewhere in the vocabulary); and many syntactic features like classes, array destructuring, and arrow function notation are far more pleasant than their Python equivalents. (Full disclosure–even after a decade in the language, I still find Python’s whitespace syntax gimmicky and at heart just don’t like the language. But that’s a post for another day.)

Javascript with WebGL is crazy fast.

And if javascript is fast, WebGL is just bonkers in what it can do. Want to lay out two million points in a Peano curve in a few milliseconds? No problem–you can even regenerate every single frame.

And WebGL uses floating-point buffers that are the same as those in Apache Arrow, so you can copy blocks of data straight from disk (or the web) into the renderers without even having to do that (still fast) javascript computation. It’s difficult, and easy to do wrong. (I’ve found regl pitched at the perfect level of abstraction, but I still occasionally end up allocating thousands of buffers on the GPU every frame where I meant to only create one persistent one.)

In online cartography, protobuffer-based vector files do something similar in libraries like mapbox.gl and deck.gl. The overhead of JSON-based formats for working with cartographic data is hard to stomach once you’ve seen how fast, and how much more compressed, binary data can be.

WebGL is hell on rollerskates

In working with WebGL, I’ve seen just how fast it can be. For things like array smoothing, counting of points to apply complicated numeric filters, and group-by sums, it’s possible to start applying most of the elements of the relational algebra on data frames in a fully parallelized form.

But I’ve held back from doing so in any but the most ad-hoc situations because WebGL is also terrible for data computing. I would never tell anyone to learn it, right now, unless they completely needed to. Attribute buffers can only be floats, so you need to convert all integer types before posting. In many situations data may be downsized to half-precision floating point, and double-precision floating points are so difficult that there are entire rickety structures built to support them at great cost. Support for texture types varies across devices (Apple ones seem to pose special problems), so people I’ve learned from like Ricky Reusser go to great lengths to support various fallbacks. And things that are essential for data programming, like indexed lookup of lists or for loops across a passed array, are nearly impossible. I’ve found writing complex shaders in WebGL fun, but doing so always involves abusing the intentions of the system.
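The cost of forcing integers into float buffers can be seen without a GPU. The sketch below is plain Python, not WebGL: it uses the standard `struct` module to round-trip two made-up integer IDs through 32-bit and 16-bit floats. The helper name `roundtrip` and the ID values are assumptions chosen for illustration.

```python
import struct

def roundtrip(fmt, value):
    """Pack a number into a binary float format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

# float32 has a 24-bit significand, so 2**24 + 1 is the first
# positive integer it cannot represent exactly.
big_id = 2**24 + 1
as_float32 = roundtrip("<f", big_id)

# float16 has an 11-bit significand, so exact integers stop at 2**11.
small_id = 2**11 + 1
as_float16 = roundtrip("<e", small_id)

print(big_id, "->", as_float32)
print(small_id, "->", as_float16)
```

Both values come back changed, which is why integer keys or row IDs pushed through float attributes need care once they get large.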

WebGPU and wasm might change all that

WASM and the Javascript Virtual Machine

But the last two pieces of the puzzle are lurking on the horizon. WebAssembly–wasm files–gives another way to write things for the javascript virtual machine that can avoid the pitfalls of Javascript being a poorly designed language. A few projects that are churning along in Rust hold the promise of making in-browser computation even faster. (If I were going to go all-in on a new programming language for a few months right now, it would probably be Rust; in writing webgl programs I increasingly find myself doing the equivalent of writing my own garbage collectors, but as a high-level guy I never learned enough C to really know the basic concepts.) Back in the 2000s, the python and R ecosystems were littered with packages that relied on the Java virtual machine in various ways. In the 2010s, it felt to me like they shifted to underlying C/C++ dependencies. But given how much effort is going into it, I think we’ll start to see things use the Javascript Virtual Machine more and more. When I want to use some of D3’s spherical projections in R, that’s how I call them; and Jeroen Ooms’s V8 package (for running the JSVM, or whatever we call it) is approaching the same level of downloads as the more venerable rJava. I suspect almost all of this is running just Javascript. If it starts becoming a realistic way to run pre-compiled Rust and C++ binaries on any system… that’s interesting.

Chart showing V8 vs rJava downloads from CRAN since 2016; by mid-2020, V8 had more than half the downloads of rJava with periodic steps up.

WebGPU

The last domino is a little off, but could be titanically important. WebGL is slowly dying, but the big tech companies have all gotten together to create WebGPU as the next-generation standard for talking to GPUs from the browser. It builds on top of the existing GPU interfaces for specific devices (Apple, etc.) like Vulkan and Metal, about which I have rigorously resisted learning anything.

WebGPU will replace WebGL for fast in-browser graphics. But the capability to do heavy duty computation in WebGL is so tantalizing that some lunatics have already begun to do it. The stuff that goes into Reusser’s work is amazing; check out this notebook about “multiscale Turing patterns” that creates gorgeous images halfway between organic blobs and nineteenth-century endplates.

I haven’t read the draft WebGPU spec carefully, but it will certainly allow a more robust way to handle things. There is already at least one linear algebra library (i.e., BLAS) for WebGPU out there. I can only imagine that support for more data types will make many simple group-by-filter-apply functions plausible entirely in GPU-land on any computer that can browse the web.

When I started in R back in 2004, I spent hours tinkering with SQL backing for what seemed at the time like an enormous dataset: millions of rows giving decades of data about student majors by race, college, gender, and ethnicity. I’d start a Windows desktop cranking out charts before I left the office at night, and come back to work the next morning to folders of images. Now, it’s feasible to send an only-slightly-condensed summary of 2.5 million rows for in-browser work and the whole dataset could easily fit in GPU memory. In general, the distinction between generally available GPU memory (say, 0.5 - 4GB) and RAM (2-16GB) is not so massive that we won’t be sending lots of data there. Data analysis and shaping is generally extremely parallelizable.
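The claim that 2.5 million rows fit comfortably in GPU memory is easy to check with back-of-the-envelope arithmetic. The row count and the 0.5 GB low end come from the text; the column count and the four bytes per value are assumptions picked purely for illustration.

```python
ROWS = 2_500_000        # from the text
COLUMNS = 8             # assumption: a handful of numeric columns
BYTES_PER_VALUE = 4     # assumption: 32-bit floats or integers

total_bytes = ROWS * COLUMNS * BYTES_PER_VALUE
total_mb = total_bytes / 1_000_000

# Low end of the GPU memory range quoted in the text: 0.5 GB.
low_end_gpu_mb = 0.5 * 1000

print(f"{total_mb:.0f} MB needed, {low_end_gpu_mb:.0f} MB available")
```

Under these assumptions the table takes well under a fifth of even the smallest GPU budget, so there is room for several times as many columns before memory becomes the constraint.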

JS and WebGPU will stick together

Once this bundle gets rolling, it will be much faster and more convenient than python/R, and in many cases it will be able to run with zero configuration. The Arquero library, introduced last year, already brings most of the especially important features of the dplyr or pandas API into Observable at a nearly comparable speed. With tighter binary integration or a different backend, it–or something like it–could easily become the basic platform for teaching the non-major introduction to data science course all of the universities are starting to launch. Even if it didn’t, the vast superiority of Javascript over R/Python for both visualization speed (thanks to GPU integration) and interface (thanks to the ubiquity of HTML5) means that people will increasingly bring their own data to websites for initial exploration first, and may never get any farther. (If I were going to short public companies based on the contents of these speculations, I’d start with NVidia–whose domination of the GPU space is partially dependent on CUDA being the dominant language, not WebGPU, and ESRI, which is floundering as it tries to make desktop software that does what web browsers do easily.)

Once these things start getting fast, the insane overhead of parsing CSV and JSON, and the loss of strict type definitions that they come with, will be far more onerous. Something–I’d bet on parquet, but there are possibilities involving arrow, HDF5, ORC, protobuffer, or something else–will emerge as a more standard binary interchange format.
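The loss of type information in text formats shows up even in a four-number example. This is a minimal sketch using only the Python standard library; the counts are made up, and the `array` buffer is a stand-in for a real binary format like parquet or arrow, not an implementation of either.

```python
import csv
import io
from array import array

counts = [3, 14, 159, 2653]  # made-up values for illustration

# CSV round trip: write the numbers as text, then read them back.
text = io.StringIO()
csv.writer(text).writerow(counts)
text.seek(0)
from_csv = next(csv.reader(text))

# Binary round trip: one declared element type (C int) for the whole
# column, so nothing has to be parsed or inferred value by value.
raw = array("i", counts).tobytes()
from_binary = array("i")
from_binary.frombytes(raw)

print(from_csv)              # a list of strings
print(from_binary.tolist())  # a list of integers
```

The CSV reader hands back strings and leaves the guessing to every consumer; the typed buffer needs its type stated once. Real interchange formats go a step further and store that schema alongside the data.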

Why bother with R and Python?

So–this is the theory–the data programming languages in R and Python are going to rely on that. Just as they wrap Altair and they wrap HTML click elements, you’ll start finding more and more that the package/module that seems to just work, and quickly, that the 19-year-olds gravitate towards, runs on the JSVM. There will be strange Stack Overflow questions in which people realize that they have an updated version of V8 installed which needs to be downgraded for some particular package. There will be python programs that work everywhere but mysteriously fail on some low-priced laptops using a Chinese startup’s GPU. And there will be things that almost entirely avoid the GPU because they’re so damned complicated to implement that the Rust ninjas don’t do the full text, and which–compared to the speed we see from everything else–come to be unbearable bottlenecks. (From what I’ve seen, Unicode regular expressions and non-spherical map projections seem to be likely candidates here.)

But it will also raise the question of why we should bother to continue in R and Python at all. Javascript is faster, and will run anywhere, universally, without the strange overhead of binder notebooks and the cost of loading data in the cloud. WASM ports of these languages that run inside the JSVM will help, but ultimately get strange. (Will you write python code that gets transpiled in the browser to WASM, and then invokes its own javascript emulator to build an altair chart?) Beats me!

But I’ve already started sharing elementary data exercises for classes using observablehq, which provides a far more coherent approach to notebook programming than Jupyter or RStudio. (If you haven’t tried it–among many, many other things, it parses the dependency relations between cells in a notebook topologically and avoids the incessant state errors that infect expert and–especially–novice programming in Jupyter or RStudio.) And if you want to work with data rather than write code, it is almost as refreshing as the moment in computer history it tries to recapitulate, the shift from storing business data in COBOL to running them in spreadsheets. The tweet above that forms the germ of this rant has just a single, solitary like on it; but it’s from Mike Bostock, the creator of D3 and co-founder of Observable, and that alone is part of the reason I bothered to write this whole thing up. The Apache Arrow platform I keep rhapsodizing about is led by Wes McKinney, the creator of pandas, who views it as the germ of a faster, better pandas2, from a position initially sponsored by RStudio and subsequently with funding from Nvidia. Speculative as this all is, it’s also–aside from the massive neural-network gravitational pull of the tensorflow/torch solar systems–where the tools that became hegemonic in the last decade are naturally drifting. (Not to imply that Javascript is anywhere near the top of the Arrow project’s priority list, BTW. It isn’t.) I wish more of the data analysts, not just the insiders, saw this coming, or were excited that it is.
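The dependency-ordered execution of notebook cells mentioned above can be sketched with a topological sort. This is an illustration of the general idea, not Observable’s actual implementation; the cell names are hypothetical, and the code relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical notebook: each cell maps to the set of cells it reads from.
cells = {
    "raw": set(),
    "cleaned": {"raw"},
    "summary": {"cleaned"},
    "chart": {"cleaned", "summary"},
}

# static_order() yields each cell only after all the cells it depends on.
order = list(TopologicalSorter(cells).static_order())

print(order)
```

Because the run order is derived from the dependencies rather than from the order in which someone happened to execute cells, a stale `cleaned` can never feed a fresh `chart`, which is the class of state error the text describes.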

As I said, I’ve been doing some of this programming since 2003 or so, and been putting in my regular rounds most days since 2010. In that time I’ve come to see that what I want to see most–fully editable, universally runnable, data analysis on open data–is not a universal code. Some people just want static charts. Some people want to hide their data. Most readers don’t want to tweak the settings. And everyone looks down on people who like Javascript. But it’s also the case that the web was first built in the 90s to share complicated academic work and make it editable by its readers. Even if most of academia and much of the media is devoted to one-way flows of information, and much of the post-social media Internet is a blazing hellscape, I’m excited about these shifts in the landscape precisely because they hold out the possibility that some portion of the Web might actually live up to its promise of making it easier to think through ideas.

14 Mar 05:25

10 statistical lessons from the past pandemic year

by Nathan Yau

The Royal Statistical Society published ten lessons governments should take away from this year, which should naturally apply to standard data practice:

  1. Invest in public health data – which should be regarded as critical national infrastructure and a full review of health data should be conducted 
  2. Publish evidence – all evidence considered by governments and their advisers must be published in a timely and accessible manner
  3. Be clear and open about data – government should invest in a central portal, from which the different sources of official data, analysis protocols and up-to-date results can be found
  4. Challenge the misuse of statistics – the Office for Statistics Regulation should have its funding augmented so it can better hold the government to account
  5. The media needs to step up its responsibilities – government should support media institutions that invest in specialist scientific and medical reporting
  6. Build decision makers’ statistical skills – politicians and senior officials should seek out statistical training
  7. Build an effective infectious disease surveillance system to monitor the spread of disease – the government should ensure that a real-time surveillance system is ready for future pandemics
  8. Increase scrutiny and openness for new diagnostic tests – similar steps to those adopted for vaccine and pharmaceutical evaluation should be followed for diagnostic tests
  9. Health data is incomplete without social care data – improving social care data should be a central part of any review of UK health data
  10. Evaluation should be put at the heart of policy – efficient evaluations or experiments should be incorporated into any intervention from the start.

See the full report here.

Tags: coronavirus, learning, Royal Statistical Society

14 Mar 05:25

Machine learning to find movie ideas

by Nathan Yau

Speaking of A.I. and fiction, Adam Epstein for Quartz reported on how Wattpad, the platform for people to share stories, uses machine learning to find potential movies:

Wattpad uses a machine-learning program called StoryDNA to scan all the stories on its platform and surface the ones that seem like candidates for TV or film development. It works on both macro and micro levels, analyzing big-picture audience engagement trends to identify the genres picking up steam, while also looking at the specific stories that got popular quickly and calculating what made them so appealing.

The tool can break stories down to their vocabularies and sentence structures (a story’s “DNA,” if you will) and then compare those to other stories to deduce what really makes a work of fiction popular. It also looks at how often users comment on stories and, when they do, what exactly they’re saying. Its goal is to examine all these clues to uncover the precise combination of story elements—genre, emotion, grammar, the list goes on—that hooks audiences to the point they’ll follow its journey onto a visual medium.

Maybe I’m just getting old, but this sounds terrible.

Tags: fiction, machine learning, Quartz, Wattpad

14 Mar 05:17

Stanley Park & “Automobile Bubble Privilege”-David Sadoway

by Sandy James Planner

David Sadoway is on faculty and is an instructor at Kwantlen Polytechnic University. He sent the following in response to the Cars versus Bikes Stanley Park post.

Perhaps we need to add a new term to our analyses: entrenched “automobile-bubble privilege”. Reducing a single paved traffic lane on a one-way, two-lane road in a public park, which should be devoted to green public spaces (not efficient road movement/mobility), is hardly a radical move and should have been embraced years ago by the Parks Board.

If we really prioritized healthy parks we would instead have a shuttle bus (as others have suggested) and shut down the existing road fully for pedestrians, wheelchairs and others to instead enjoy.

Perhaps we should actually study the impacts that road traffic has on humans and biodiversity (including habitat fragmentation and increased noise, air and light pollution). This is a Public Park, not Disneyland. And I thought it was named Stanley Park, not Stanley Parking lot!

For far too long many of our public city, regional and provincial parks have been built on the assumption that all ‘taxpaying publics’ have equal and universal access to a car/truck (as opposed to say affordable transit or safe biking or walking/rolling flowing paths and networks across our urban/biodiverse fabric). We all fund these roads after all, even if we choose not to own cars/trucks.

The same assumption in the Lower Mainland is made with publicly-funded and protected Cypress and Mount Seymour Provincial Parks, which have trailheads that can be accessed only if one uses a private vehicle. Yes, there are expensive private shuttle buses with sporadic services, but minimally during the peak Summer hiking season.

Ironically Grouse Mountain, with its private gondola and ski hill, is the most ‘transit friendly’ immediate access to the North Shore wilds, and yet one has to do the Grouse Grind or loop back to Lynn Valley if one does not want to pay the private Gondola operators while not using a car.

Public parking lots and roads inducing automobility (and GHGs, not to mention ever reduced urban biodiversity) for exclusive, cosseted private benefit, it seems!

Image: LondonEconomic.com

14 Mar 05:17

The New Freshman Comp

by Jon Udell

The column republished below, originally at http://www.oreillynet.com/pub/a/network/2005/04/22/primetime.html, was the vector that connected me to my dear friend Gardner Campbell. I’m resurrecting it here partly just to bring it back online, but mainly to celebrate the ways in which Gardner — a film scholar among many other things — is, right now, bringing his film expertise to the practice of online teaching.

In this post he reflects:

Most of the learning spaces I’ve been in provide very poorly, if at all, for the supposed magic of being co-located. A state-mandated prison-spec windowless classroom has less character than a well-lighted Zoom conference. A lectern with a touch-pad control for a projector-and-screen combo is much less flexible and, I’d argue, conveys much less human connection and warmth than I can when I share a screen on Zoom during a synchronous class, or see my students there, not in front of a white sheet of reflective material, but in the medium with me, lighting up the chat, sharing links, sharing the simple camaraderie of a hearty “good morning” as class begins.

And in this one he shares the course trailer (!) for Fiction into film: a study of adaptations of “Little Women”.

My 2005 column was a riff on a New York Times article, Is Cinema Studies the new MBA? It was perhaps a stretch, in 2005, to argue for cinema studies as an integral part of the new freshman comp. The argument makes a lot more sense now.


The New Freshman Comp

For many years I have alternately worn two professional hats: writer and programmer. Lately I find myself wearing a third hat: filmmaker. When I began making the films that I now call screencasts, my readers and I both sensed that this medium was different enough to justify the new name that we collaboratively gave it. Here’s how I define the difference. Film is a genre of storytelling that addresses the whole spectrum of human experience. Screencasting is a subgenre of film that can tell stories about the limited — but rapidly growing — slice of our lives that is mediated by software.

Telling stories about software in this audiovisual way is something I believe technical people will increasingly want to do. To explain why, let’s first discuss a more ancient storytelling mode: writing.

The typical reader of this column is probably, like me, a writer of both prose and code. Odds are you identify yourself as a coder more than as a writer. But you may also recently have begun blogging, in which case you’ve seen your writing muscles grow stronger with exercise.

Effective writing and effective coding are more closely related than you might think. Once upon a time I spent a year as a graduate student and teaching assistant at an Ivy League university. My program of study was science writing, but that was a tiny subspecialty within a larger MFA (master of fine arts) program dedicated to creative writing. That’s what they asked me to teach, and the notion terrified me. I had no idea what I’d say to a roomful of aspiring poets and novelists. As it turned out, though, many of these kids were in fact aspiring doctors, scientists, and engineers who needed humanities credits. So I decided to teach basic expository writing. The university’s view was that these kids had done enough of that in high school. Mine was that they hadn’t, not by a long shot.

I began by challenging their reverence for published work. Passages from books and newspapers became object lessons in editing, a task few of my students had ever been asked to perform in a serious way. They were surprised by the notion that you could improve material that had been professionally written and edited, then sold in bookstores or on newsstands. Who were they to mess with the work of the pros?

I, in turn, was surprised to find this reverent attitude even among the budding software engineers. They took it for granted that programs were imperfect texts, always subject to improvement. But they didn’t see prose in the same way. They didn’t equate refactoring a program with editing a piece of writing, as I did then and still do.

When I taught this class more than twenty years ago the term “refactoring” wasn’t commonly applied to software. Yet that’s precisely how I think about the iterative refinement of prose and of code. In both realms, we adjust vocabulary to achieve consistency of tone, and we transform structure to achieve economy of expression.

I encouraged my students to regard writing and editing as activities governed by engineering principles not unlike the ones that govern coding and refactoring. Yes, writing is a creative act. So is coding. But in both cases the creative impulse is expressed in orderly, calculated, even mechanical ways. This seemed to be a useful analogy. For technically-inclined students earning required humanities credits, it made the subject seem more relevant and at the same time more approachable.

In the pre-Internet era, none of us foresaw the explosive growth of the Internet as a textual medium. If you’d asked me then why a programmer ought to be able to write effectively, I’d have pointed mainly to specs and manuals. I didn’t see that software development was already becoming a global collaboration, that email and newsgroups were its lifeblood, and that the ability to articulate and persuade in the medium of text could be as crucial as the ability to design and build in the medium of code.

Nowadays, of course, software developers have embraced new tools of articulation and persuasion: blogs, wikis. I’m often amazed not only by the amount of writing that goes on in these forms, but also by its quality. Writing muscles do strengthen with exercise, and the game of collaborative software development gives them a great workout.

Not everyone drinks equally from this fountain of prose, though. Developers tend to write a great deal for other developers, but much less for those who use their software. Laziness is a factor; hubris even more so. We like to imagine that our software speaks for itself. And in some ways that’s true. Documentation is often only a crutch. If you have to explain how to use your software, you’ve failed.

It may, however, be obvious how to use a piece of software, and yet not at all obvious why to use it. I’ll give you two examples: Wikipedia and del.icio.us. Anyone who approaches either of these applications will immediately grasp their basic modes of use. That’s the easy part. The hard part is understanding what they’re about, and why they matter.

A social application works within an environment that it simultaneously helps to create. If you understand that environment, the application makes sense. Otherwise it can seem weird and pointless.

Paul Kedrosky, an investor, academic, and columnist, alluded to this problem on his blog last month:

Funny conversation I had with someone yesterday: We agreed that the thing that generally made us both persevere and keep trying any new service online, even if we didn’t get it the first umpteen times, was having Jon Udell post that said service was useful. After all, if Jon liked it then it had to be that we just hadn’t tried hard enough. [Infectious Greed]

I immodestly quote Paul’s remarks in order to revise and extend them. I agree that the rate-limiting factor for software adoption is increasingly not purchase, or installation, or training, but simply “getting it.” And while I may have a good track record for “getting it,” plenty of other people do too — the creators of new applications, obviously, as well as the early adopters. What’s unusual about me is the degree to which I am trained, inclined, and paid to communicate in ways that help others to “get it.”

We haven’t always seen the role of the writer and the role of the developer as deeply connected but, as the context for understanding software shifts from computers and networks to people and groups, I think we’ll find that they are. When an important application’s purpose is unclear on the first umpteen approaches, and when “getting it” requires hard work, you can’t fix the problem with a user-interface overhaul or a better manual. There needs to be an ongoing conversation about what the code does and, just as importantly, why. Professional communicators (like me) can help move things along, but everyone needs to participate, and everyone needs to be able to communicate effectively.

If you’re a developer struggling to evangelize an idea, I’d start by reiterating that your coding instincts can also help you become a better writer. Until recently, that’s where I’d have ended this essay too. But recent events have shown me that writing alone, powerful though it can be, won’t necessarily suffice.

I’ve written often — and, I like to think, cogently — about wikis and tagging. But my screencasts about Wikipedia and del.icio.us have had a profoundly greater impact than anything I’ve written on these topics. People “get it” when they watch these movies in ways that they otherwise don’t.

It’s undoubtedly true that an audiovisual narrative enters many 21st-century minds more easily, and makes a more lasting impression on those minds, than does a written narrative. But it’s also true that the interactive experience of software is fundamentally cinematic in nature. Because an application plays out as a sequence of frames on a timeline, a narrated screencast may be the best possible way to represent it and analyze it.

If you buy either or both of these explanations, what then? Would I really suggest that techies will become fluid storytellers not only in the medium of the written essay, but also in the medium of the narrated screencast? Actually, yes, I would, and I’m starting to find people who want to take on the challenge.

A few months ago I heard from Michael Tiller, who describes himself as a “mechanical engineer trapped in a computer scientist’s body.” Michael has had a long and passionate interest in Modelica, an open, object-oriented language for modeling mechanical, electrical, electronic, hydraulic, thermal, and control systems. He wanted to work with me to develop a screencast on this topic. But it’s far from my domains of expertise and, in the end, all he really needed was my encouragement. This week, Michael launched a website called Dynopsis.com that’s chartered to explore the intersection of engineering and information technologies. Featured prominently on the site is this 20-minute screencast in which he illustrates the use of Modelica in the context of the Dymola IDE.

This screencast was made with Windows Media Encoder 9, and without the help of any editing. After a couple of takes, Michael came up with a great overview of the Modelica language, the Dymola tool, and the modeling and simulation techniques that they embody. Since he is also author of a book on this subject, I asked Michael to reflect on these different narrative modes, and here’s how he responded on his blog:

If I were interested in teaching someone just the textual aspects of the Modelica language, this is exactly the approach I would take.

But when trying to teach or explain a medium that is visual, other tools can be much more effective. Screencasts are one technology that could really make an impact on the way some subjects are taught and I can see how these ideas could be extended much further. [Dynopsis: Learning by example: screencasts]

We’re just scratching the surface of this medium. Its educational power is immediately obvious, and over time its persuasive power will come into focus too. The New York Times recently asked: “Is cinema studies the new MBA?” I’ll go further and suggest that these methods ought to be part of the new freshman comp. Writing and editing will remain the foundation skills they always were, but we’ll increasingly combine them with speech and video. The tools and techniques are new to many of us. But the underlying principles — consistency of tone, clarity of structure, economy of expression, iterative refinement — will be familiar to programmers and writers alike.

14 Mar 05:16

We'll meet again, don't know where, don't know when - the future of international conferences

Alastair Creelman, The corridor of uncertainty, Mar 10, 2021

Alastair Creelman notes that "on-site conferences are always exclusive events due to costs, travel restrictions, linguistic barriers and accessibility issues" and links to an article by Holly J. Niner and Sophia N. Wassermann, "Better for Whom? Leveling the Injustices of International Conferences by Moving Online." "The big question is whether or not a return to the on-site format is at all desirable," says Creelman, "and the authors focus on a factor they call the privilege of preferring an in-person option." As the authors argue, "On an individual level, those of us able to attend a conference no matter where it is held should be cognizant of the fact that the option to prefer an in-person conference is predicated on the ability to attend one." A point well made. But importantly: the same point could be made about in-person education, especially higher education, and especially international education.

14 Mar 05:13

Smart Casual

by peter@rukavina.net (Peter Rukavina)

I sought outfit advice from my brother Mike this morning—he has stronger connections to the real world, and tells the truth, both useful qualities.

He branded it “smart casual.”

Which is a huge step up from the, say, “lethargic distracted” or “frugal deluded” style that might describe my stock in trade otherwise.

Having brothers is great, you know.

Photo by Oliver.

14 Mar 05:12

Meeting… Alice Liang, Sr. Manager of Data & Insights at The New York Times

by The NYT Open Team

Celebrating Women’s History Month

This March, we are featuring colleagues from across The New York Times in a special Women’s History Month series of ‘Meeting’

Illustration by Claire Merchlinsky

What are your pronouns?
She/Her

What is your job title and what does it mean?
Senior Manager, Data & Insights (Engagement Analytics)

I manage a team responsible for data on our newsletters, push notifications and some of our personalized surfaces. We identify opportunities for our products to grow, build pipelines to power enterprise reporting, run experiments to inform business strategy and more.

How long have you been at The Times?
Just over three years.

Most Times employees are working remotely right now. What does working from home these days look like for you?
I’m currently working from home in Michigan but plan on going back to New York soon. Working from home these days looks like having coffee with my sister in the mornings and saying hi to my mail carrier in the afternoons.

Tell us about a project you’ve worked on at The Times that you’re especially proud of.
I was part of the team that first introduced the dynamic meter on our site. This involved an innovative and scrappy small core team that pushed my technical skills, as well as a highly cross-functional set of stakeholders who challenged me to communicate data insights in better ways. It was an incredibly complex project at the forefront of what the industry is doing, and it’s still powering part of our paywall today.

What is the biggest challenge you faced in your career and how did you overcome it? Knowing what you know now, would you do things differently?
Earlier in my career, I was in a pre-doctoral program with an amazing cohort of people who were all planning on getting their PhDs. I knew internally that getting a doctorate and having a career in research wasn’t for me, though I had already done a lot of preparation for it. It was difficult to overcome the expectation that was set for me in the program to pursue grad school. I started talking to folks in other industries, found allies in my cohort that also were deviating from the grad school route and got a job here at The Times! I would tell myself to trust my instincts more readily.

The Times has six core values (Independence, Integrity, Curiosity, Respect, Collaboration and Excellence) by which the company operates. Is there one that you find best describes your work?
Curiosity — the “open-minded inquiry” at the heart of The Times’s journalism is also the key to good data analytics.

What is a goal you hope to accomplish this year?
This year, it’s important for me to support my team and coworkers in the best way I can and to allow myself to be supported by them. Hitting the one year mark since the start of the pandemic is not easy for any of us. My goal is to make space for my team to feel like they can learn and grow in their work.

Outside of my job, I’ve been working on a manuscript for a few years that I would like to finish up this year.

As an analyst working on a digital product, how do you approach your work with inclusivity in mind?
As an analyst, it’s your job to keep an open mind and to challenge assumptions, to put yourself in the position of a reader and to approach analysis with empathy. As a people manager, inclusivity is even more important. People managers can focus more on hiring and growing people who are underrepresented in technology, to hear from different perspectives, to develop policies and expectations that are inclusive for everyone.

What change do you hope to see in your community?
I hope to see more people of color and women thrive in data, and I hope to see more people who work in the industry foster environments that support the success of people of color and women. I’m encouraged by some of The Times’s recent commitments to diversity and inclusion, and I want to push our technology and data functions to do more.

Do you have any favorite life hacks or work shortcuts?
Life hack: you can regrow green onions forever by just placing their ends in a jar of water.

Work shortcuts: write down everything you have worked on at the end of the week or at the end of a month, including small ad-hoc tasks. Whether you’re writing a year-end review or just looking back on your day-to-day, this will help you have a clear sense of all that you have accomplished.

What or who are you inspired by?
I teach a citizenship class and every week, my students inspire me to be persistent, patient and dedicated to a life of continuous learning.

Complete this sentence: Over time, I have realized __________.
Over time, I have realized the importance of having good colleagues by your side. Even with good work-life balance, you end up spending so much of your life at work. Having people to joke around with, to vent to, to cheer you on, to bounce an idea off of and to support you in times of need is so critical.

What is your best advice for someone starting to work in your field?
Stay curious! Follow your interests to explore beyond what’s expected of you. Approach your work with empathy. Advocate for yourself and find mentors, sponsors and managers who will advocate for you. And wherever you are in your career, there’s someone you can mentor and bring up behind you.

More in ‘Meeting’

Meeting… Cindy Taibi, Chief Information Officer at The New York Times
Meeting… Natasha Dykes, Senior Software Engineer at The New York Times

Meeting… Corina Aoi, Technical Product Manager at The New York Times


Meeting… Alice Liang, Sr. Manager of Data & Insights at The New York Times was originally published in NYT Open on Medium, where people are continuing the conversation by highlighting and responding to this story.

11 Mar 03:26

Paradoxes of Engagement: What Makes Teams Work? It’s not IQ, extroversion, or team...

11 Mar 03:25

The inventor of the cassette tape has died

by Igor Bonifacic

Lou Ottens, the former Philips engineer who gave the world its first compact cassette tape, has passed away. According to Dutch news outlet NRC Handelsblad, Ottens was 94 when he died on March 6th. 

Ottens started work on the cassette tape in the early 1960s. The way NPR tells the story, he wanted to develop a way for people to listen to music that was affordable and accessible in the way that large reel-to-reel tapes at the time were not. So he first created a wooden prototype that could fit in his pocket to help guide the project. He also worked to convince Philips to license his invention to other manufacturers for free. Philips went on to introduce the first "compact cassette" in 1963, and the rest, as they say, is history. But that wasn't the end of Ottens' career. He went on to help Philips and Sony develop the compact disc.

It's difficult to overstate the importance of cassette tapes to music culture. We wouldn't have mixtapes and playlists without them. What's more, they allowed people to listen to their favorite songs and albums on the go. No ads or input from a radio DJ. That's something that has come to define how people enjoy music ever since. And for all of their flaws, in recent years, cassette tapes have enjoyed something of a resurgence in popularity. In 2016, sales of the format increased by 74 percent. Two years later, they grew another 23 percent with help from the soundtracks of Stranger Things and Guardians of the Galaxy.

11 Mar 03:24

Vulnerability in popular iPhone app ‘Call Recorder’ exposed recordings

by Aisha Malik
App Store on iPhone XS

A security vulnerability in a popular iPhone app called ‘Call Recorder’ has exposed thousands of call recordings.

TechCrunch reports that the bug was discovered by PingSafe AI security researcher Anand Prakash. He found that anyone could access call recordings from other users by knowing their phone number.

By using a proxy tool like Burp Suite, Prakash was able to replace the phone number he registered with Call Recorder with another user’s phone number. This allowed him to access the recordings of other users’ conversations on his phone.

TechCrunch notes that it was able to verify this by conducting its own test. Call Recorder stores recordings on a cloud storage bucket on Amazon Web Services, which was open at the time the bug was discovered.

It’s worth noting that the files couldn’t be accessed and that the bucket has since been closed. It included more than 130,000 audio recordings.

The developer of the app, Arun Nair, released a new version of the app on March 6th. The update notes state that the new version was released to “patch a security report.”

TechCrunch states that the app developer has not responded to repeated requests for comment regarding the vulnerability.

Source: TechCrunch

The post Vulnerability in popular iPhone app ‘Call Recorder’ exposed recordings appeared first on MobileSyrup.

11 Mar 03:24

Apple’s iPhone could feature ‘periscopic’ telephoto lens in 2023

by Patrick O'Rourke
iPhone 12 Pro Max gold

If TF Securities analyst Ming-Chi Kuo’s latest report is accurate, Apple’s iPhone line could adopt a “periscopic telephoto lens” by 2023, as first reported by MacRumors.

Though Kuo’s report doesn’t expand on the claim, this isn’t the first time we’ve seen information indicating that Apple plans to finally add a periscope lens to the iPhone line. In fact, Kuo has previously stated that Apple will adopt a periscope lens in its 2022 iPhone models, though now it appears he thinks the feature will be introduced in 2023.

Several current smartphones include periscope lens technology, including Samsung’s S21 Ultra and Huawei’s P30 Pro, so in a sense, this is another example of Apple playing catch-up with its key Android competitors. The S21’s zoom is particularly impressive, allowing for 10x optical zoom, and images still look great all the way up to roughly 30x hybrid zoom.

The most significant zoom Apple has featured in a smartphone so far is the iPhone 12 Pro Max’s 2.5x optical zoom and 12x digital zoom.

Beyond Apple’s 2023 smartphone line, Kuo says the tech giant’s 2021 smartphones will feature a new Face ID module made of plastic instead of glass thanks to new “coating technologies,” according to MacRumors. Kuo says that other improvements expected to come to the 2021 iPhone include an improved ultra-wide shooter that features a 6-element lens instead of a 5-element lens.

Regarding Apple’s 2022 smartphones, Kuo claims that its rear-facing shooter will feature a 7-element lens instead of a 6-element lens and a new “unibody lens design” that will reduce the size of the front camera module.

As always, Apple’s plans regarding its future smartphones could shift significantly, especially where its 2022 and 2023 iPhones are concerned.

Source: MacRumors

The post Apple’s iPhone could feature ‘periscopic’ telephoto lens in 2023 appeared first on MobileSyrup.

11 Mar 03:23

Adobe’s Photoshop CC is now available natively on M1 Macs

by Patrick O'Rourke
M1 MacBook

A native version of Adobe’s Photoshop CC is now available on Apple’s M1 Macs, including the MacBook Air, 13-inch MacBook Pro and Mac mini.

This follows the final version of Lightroom CC for M1 Macs, and Premiere Pro CC and Audition CC betas. Adobe says the change in speed should be immediately noticeable because the photo editing app now even launches much quicker.

After spending only a few minutes with the M1 version of Photoshop CC, I can confirm that is the case. Further, opening a photo in Photoshop from Lightroom only took a few seconds. Adobe says that overall, this new version of Photoshop CC runs tasks up to 1.5x faster than the previous version of the app that was emulated through Apple’s Rosetta 2 software.

That said, some features are missing, including “invite to edit cloud documents” and “preset syncing.” In Adobe’s blog post, Pam Clark, the company’s vice president of product management and strategy, says that this is “just the beginning,” indicating that missing features will eventually make their way to the M1 version of Photoshop.

You can easily switch back to the Rosetta 2 version of Photoshop if you want. It’s also great that Lightroom automatically opens the M1 Photoshop CC when you select the ‘edit in’ option and not the Rosetta 2 emulated version.

Along with its new M1 app, Adobe is also releasing new iPad Photoshop features, including cloud document version history and downloading cloud files for local editing.

Other updates include ‘Super Resolution’ now being available in Adobe’s Camera Raw plugin. The feature utilizes machine learning technology to quickly boost the resolution of an image with one click. This is useful for blowing up pictures from your phone to print out and frame.

In other M1-related news, Apple and Blackmagic Design have also confirmed that DaVinci Resolve 17.1 has been updated to run natively on the tech giant’s silicon.

Source: Adobe 

The post Adobe’s Photoshop CC is now available natively on M1 Macs appeared first on MobileSyrup.

11 Mar 03:23

Samsung’s 32-inch 4K M7 Smart Monitor is on sale for $299 on Amazon

by Brad Bennett

Samsung’s 32-inch 4K M7 Smart Monitor is on sale for $299 on Amazon.

The great thing about this monitor is that it comes with a remote and works as a smart TV. This means you can play content from Netflix, Prime Video and Disney+ without connecting it to a computer or an external streaming device. If you have a compatible Samsung smartphone like the Galaxy Note 20 Ultra or Galaxy S21 Ultra, you can also use wireless DeX with it.

That said, if you’re doing high-end creative work, I wouldn’t recommend this display since it only features HDMI inputs, but if you want a great single monitor with a lot of space and a 4K screen, the price is attractive.

You can buy the monitor from Amazon for $299 (regularly $436). In general, 4K monitors are often well over $400 at the low-end, so $300 is a great deal.

The cheaper 32-inch and 27-inch M5 FHD versions of Samsung’s Smart Display are also on sale for $278 (regularly $399) and $199 (regularly $331.86), respectively.

Source: Amazon

MobileSyrup utilizes affiliate partnerships and publishes sponsored posts. These partnerships do not influence our editorial content, though MobileSyrup may earn a commission on purchases made via these links.

 

The post Samsung’s 32-inch 4K M7 Smart Monitor is on sale for $299 on Amazon appeared first on MobileSyrup.

09 Mar 06:29

Weeknotes: Datasette and Git scraping at NICAR, VaccinateCA

This week I virtually attended the NICAR data journalism conference and made a ton of progress on the Django backend for VaccinateCA (see last week).

NICAR 2021

NICAR stands for the National Institute for Computer Assisted Reporting - an acronym that reflects the age of the organization, which started teaching journalists data-driven reporting back in 1989, long before the term "data journalism" became commonplace.

This was my third NICAR and it's now firmly established itself at the top of the list of my favourite conferences. Every year it attracts over 1,000 of the highest quality data nerds - from data journalism veterans who've been breaking stories for decades to journalists who are just getting started with data and want to start learning Python or polish up their skills with Excel.

I presented an hour long workshop on Datasette, which I'm planning to turn into the first official Datasette tutorial. I also got to pre-record a five minute lightning talk about Git scraping.

I published the video and notes for that yesterday. It really seemed to strike a nerve at the conference: I showed how you can set up a scheduled scraper using GitHub Actions with just a few lines of YAML configuration, and do so entirely through the GitHub web interface without even opening a text editor.

Pretty much every data journalist wants to run scrapers, and understands the friction involved in maintaining your own dedicated server and crontabs and storage and backups for running them. Being able to do this for free on GitHub's infrastructure drops that friction down to almost nothing.

The lightning talk led to a last-minute GitHub Actions and Git scraping office hours session being added to the schedule, and I was delighted to have Ryan Murphy from the LA Times join that session to demonstrate the incredible things the LA Times have been doing with scrapers and GitHub Actions. You can see some of their scrapers in the datadesk/california-coronavirus-scrapers repo.

VaccinateCA

The race continues to build out a Django backend for the VaccinateCA project, to collect data on vaccine availability from people making calls on that organization's behalf.

The new backend is getting perilously close to launch. I'm leaning heavily on the Django admin for this, refreshing my knowledge of how to customize it with things like admin actions and custom filters.

It's been quite a while since I've done anything sophisticated with the Django admin and it has evolved a LOT. In the past I've advised people to drop the admin for custom view functions the moment they want to do anything out-of-the-ordinary - I don't think that advice holds any more. It's got really good over the years!

A very smart thing the team at VaccinateCA did a month ago is to start logging the full incoming POST bodies for every API request handled by their existing Netlify functions (which then write to Airtable).

This has given me an invaluable tool for testing out the new replacement API: I wrote a script which replays those API logs against my new implementation - allowing me to test that every one of several thousand previously recorded API requests will run without errors against my new code.
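As a rough illustration of the replay idea (not the actual VaccinateCA script), here is a minimal Python sketch. It assumes the logged POST bodies are available as JSON strings and that the new implementation can be called as a plain function; `replay_logged_requests`, `new_api_handler` and the field names are hypothetical names invented for this example.

```python
import json
from typing import Callable, Iterable, List, Tuple


def replay_logged_requests(
    logged_bodies: Iterable[str],
    handler: Callable[[dict], object],
) -> List[Tuple[int, str]]:
    """Feed each logged JSON body to handler; return (index, error) for every failure."""
    failures = []
    for index, raw_body in enumerate(logged_bodies):
        try:
            handler(json.loads(raw_body))
        except Exception as exc:
            # Record the failure and keep going, so one bad request
            # does not hide problems further down the log.
            failures.append((index, f"{type(exc).__name__}: {exc}"))
    return failures


def new_api_handler(payload: dict) -> dict:
    """Hypothetical stand-in for the new API implementation."""
    if "location" not in payload:
        raise KeyError("location")
    return {"ok": True}


# Hypothetical log entries: one valid, one missing a field, one malformed.
logs = [
    '{"location": "rec123", "availability": "yes"}',
    '{"availability": "no"}',
    "not json",
]
failures = replay_logged_requests(logs, new_api_handler)
for index, error in failures:
    print(f"request {index} failed: {error}")
```

Collecting failures rather than stopping at the first one makes it possible to see in a single run how many of the recorded requests the new code cannot yet handle.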

Since this is so valuable, I've written code that will log API requests to the new stack directly to the database. Normally I'd shy away from a database table for logging data like this, but the expected traffic is the low thousands of API requests a day - and a few thousand extra database rows per day is a tiny price to pay for having such a high level of visibility into how the API is being used.

(I'm also logging the API requests to PostgreSQL using Django's JSONField, which means I can analyze them in depth later on using PostgreSQL's JSON functionality!)
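To make the logging idea concrete without pulling in Django or PostgreSQL, here is a small sketch that swaps in Python's built-in sqlite3 module and stores each request body as JSON text. The table name, column names and payload fields are all assumptions made up for this example; the real project uses Django's JSONField, which this does not reproduce.

```python
import json
import sqlite3
from collections import Counter

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE api_log ("
    " id INTEGER PRIMARY KEY,"
    " path TEXT NOT NULL,"
    " body TEXT NOT NULL,"  # raw JSON text of the POST body
    " created TEXT DEFAULT CURRENT_TIMESTAMP)"
)


def log_request(path: str, payload: dict) -> None:
    """Store one incoming API request as a row, with the body serialized to JSON."""
    conn.execute(
        "INSERT INTO api_log (path, body) VALUES (?, ?)",
        (path, json.dumps(payload)),
    )


def count_by_field(field: str) -> Counter:
    """Later analysis: count logged requests by the value of one JSON field."""
    counts = Counter()
    for (body,) in conn.execute("SELECT body FROM api_log"):
        counts[json.loads(body).get(field)] += 1
    return counts


# Hypothetical requests, purely for illustration.
log_request("/api/report", {"location": "rec1", "availability": "yes"})
log_request("/api/report", {"location": "rec2", "availability": "no"})
log_request("/api/report", {"location": "rec3", "availability": "yes"})
print(count_by_field("availability"))
```

With a real JSON column type the counting could be pushed into the database query itself; doing it in Python here just keeps the sketch dependency-free.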

YouTube subtitles

I decided to add proper subtitles to my lightning talk video, and was delighted to learn that the YouTube subtitle editor pre-populates with an automatically generated transcript, which you can then edit in place to fix up spelling, grammar and remove the various "um" and "so" filler words.

This makes creating high quality captions extremely productive. I've also added them to the 17 minute Introduction to Datasette and sqlite-utils video that's embedded on the datasette.io homepage - editing the transcript for that only took about half an hour.

TIL this week

09 Mar 06:28

The proliferation of release notes.

by Matt Harris

This post is as much for me as for anyone else. I am struggling to keep on top of these release notes.

Every version of Thunderbird comes with release notes. For V78 there are now quite a number of them, as there have been 8 major releases and some minor releases in the 78 series since V78 was first released in July 2020.

V78.8 https://www.thunderbird.net/en-US/thunderbird/78.8.0/releasenotes/ 

V78.7.1 https://www.thunderbird.net/en-US/thunderbird/78.7.1/releasenotes/ 

V78.7 https://www.thunderbird.net/en-US/thunderbird/78.7.0/releasenotes/ 

V78.6.1 https://www.thunderbird.net/en-US/thunderbird/78.6.1/releasenotes/ 

V78.6 https://www.thunderbird.net/en-US/thunderbird/78.6.0/releasenotes/ 

V78.5.1 https://www.thunderbird.net/en-US/thunderbird/78.5.1/releasenotes/ 

V78.5 https://www.thunderbird.net/en-US/thunderbird/78.5.0/releasenotes/ 

V78.4.3 https://www.thunderbird.net/en-US/thunderbird/78.4.3/releasenotes/ 

V78.4.2 https://www.thunderbird.net/en-US/thunderbird/78.4.2/releasenotes/ 

V78.4.1 https://www.thunderbird.net/en-US/thunderbird/78.4.1/releasenotes/ 

V78.4 https://www.thunderbird.net/en-US/thunderbird/78.4.0/releasenotes/ 

V78.3.3 https://www.thunderbird.net/en-US/thunderbird/78.3.3/releasenotes/ 

V78.3.2 https://www.thunderbird.net/en-US/thunderbird/78.3.2/releasenotes/ 

V78.3.1 https://www.thunderbird.net/en-US/thunderbird/78.3.1/releasenotes/ 

V78.3 https://www.thunderbird.net/en-US/thunderbird/78.3.0/releasenotes/ 

V78.2.2 https://www.thunderbird.net/en-US/thunderbird/78.2.2/releasenotes/ 

V78.2.1 https://www.thunderbird.net/en-US/thunderbird/78.2.1/releasenotes/ 

V78.2 https://www.thunderbird.net/en-US/thunderbird/78.2.0/releasenotes/ 

V78.1.1 https://www.thunderbird.net/en-US/thunderbird/78.1.1/releasenotes/ 

V78.1 https://www.thunderbird.net/en-US/thunderbird/78.1.0/releasenotes/ 

V78 https://www.thunderbird.net/en-US/thunderbird/78.0/releasenotes/

Those notes do not provide details on the security issues addressed in each release; those are covered here: https://www.mozilla.org/en-US/security/known-vulnerabilities/thunderbird/

Happy reading.

09 Mar 06:23

Please support Web Monetization if you want (fewer) ads on the web

Adrian Todorov, hello, world, Mar 08, 2021

The presumption is that web monetization would mean fewer (or no) ads. But why should we believe this? We pay for cable TV and yet still have to watch ads. We pay to go to movies and still are shown ads. We pay to see sporting events, which are covered with ads. Nothing about payment stops ads. If anything, it encourages a commercial model, which has the effect of *increasing* the number of ads. No, the way to avoid ads is to produce our own content, not to pay for someone else's. Image: CNBC.

09 Mar 06:21

"No business which depends for existence on paying less than living wages to its workers has any..."

“No business which depends for existence on paying less than living wages to its workers has...
09 Mar 06:21

David Graeber: After the Pandemic, We Can’t Go Back to Sleep

David Graeber: After the Pandemic, We Can’t Go Back to Sleep: probablyasocialecologist:At some point...
09 Mar 06:21

There’s beauty in node graphs like these, even ...

by Ton Zijlstra

There’s beauty in node graphs like these, even if in this form it hasn’t much use value. This is my graph of the ~2.600 notes I keep in Obsidian after 9 months of daily use, as part of my personal knowledge management system.



The outer rim of islands is the reading and summarising in progress. Yellows and greens are notes and notions (around 50% of the total), red work related notes, blues are about organising and planning (day logs, weekly reviews, checklists, templates etc.).

For contrast, here is the graph of the around 7.000 notes I exported from Evernote, which has no structure at all (except for one island of notes that have numbered footnotes, which causes a connection between unrelated notes whose links for the number [1] point to what also happens to be an existing note title).

Where graphs are useful to me in practice is when I’m looking at local graphs of my notes while writing. A local graph shows me the notes connected to the current note, at different degrees of separation. One degree I never use (those are the links appearing in the note itself), but two degrees (the notes that the notes linked from my note themselves link to) is useful, as it allows associations and new connections.
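The "degrees of separation" idea can be sketched as a breadth-first walk over note links. This is only an illustration, not how Obsidian builds its local graph: it assumes links are stored as a plain dictionary from note title to linked titles, it follows outgoing links only, and the note titles are invented.

```python
from collections import deque
from typing import Dict, List


def local_graph(links: Dict[str, List[str]], start: str, max_degree: int) -> Dict[str, int]:
    """Return {note: degree} for every note within max_degree link-hops of start."""
    degrees = {start: 0}
    queue = deque([start])
    while queue:
        note = queue.popleft()
        if degrees[note] == max_degree:
            continue  # do not expand beyond the requested degree
        for target in links.get(note, []):
            if target not in degrees:  # first visit gives the shortest distance
                degrees[target] = degrees[note] + 1
                queue.append(target)
    return degrees


# Hypothetical notes and links, purely for illustration.
notes = {
    "Current note": ["PKM", "Obsidian"],
    "PKM": ["Zettelkasten", "Obsidian"],
    "Obsidian": ["Graph view"],
    "Graph view": ["Network theory"],
}
result = local_graph(notes, "Current note", 2)
second_degree = sorted(n for n, d in result.items() if d == 2)
print(second_degree)
```

The second-degree notes are the ones that do not appear as links in the current note itself, which is where the unexpected associations come from.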

09 Mar 06:15

Anoxic Masculinity

by peter@rukavina.net (Peter Rukavina)

I have taken great comfort from attending the monthly grief support group sessions offered by Hospice PEI. As Catherine was living with cancer, and my thoughts would turn toward how I would live on afterward, it never occurred to me that the company of other people, united by grief, would be part of it; if my thoughts did turn to such things, more often than not I would picture myself running as fast as I could the other direction.

I wrote to a friend this week that a year ago I was acting as though I could “treat grief like it was an extracurricular course that I could skip,” and that I was “acting like I was some great exception to the rule of how these things work.” Later I wrote to my family that last spring it felt like I was “trying to outrun grief.” And both were true: the template I had for grieving was all about “moving on” or “powering through,” and I truly did think, in my heart of hearts, that if I buckled down hard enough I could skip the usual grieving rigamarole and get on with life.

I was wrong.

And, fortunately, in mid-May I came to my senses, and reached out for support. I stopped running.

In the months since I’ve found support in many different places in addition to the grief support group: from Oliver, from my mother, from my brothers, from friends, from my psychologist, from my social worker, in talking to others who are grieving, in podcasts and books, and, most recently, on Facebook. Through all this my conception of grief has changed completely from “something I need to power through” to “a new part of who I am and always will be.”

That’s a hard thing to write in a way that doesn’t make it sound like “I will now and forever be sad,” especially if, as I once did, you think of grief and sadness as synonyms. What I mean by it is that I have been cracked open in a substantial and undeniable way, and that crack–a wound is one way of thinking about it–will always be with me. But the wound need not be disabling, and the wound need not lead me to being stuck, or being jaded or being depressed (although, no doubt, it can, and I’ve felt those tugs as much as anything else I’ve felt). 

All of which leads me to men.

Although the monthly grief support group is open to all, and although there have certainly been other men who’ve attended by times, they have, especially recently, been few and far between. On a personal level this doesn’t interfere with the effectiveness of my attending, as long as I remember to listen mindfully and to stanch my natural inclination to fill silences (the “promise” that’s read at the beginning of every session is so helpful, in part, for its role in that reminding). But every time I find myself as the only man in a support group I cannot help but worry about the other grieving men, my brothers-in-grief, and whether they might be stuck where I was, trying to power through.

I am afraid for them, and I am afraid for those around them, because I’ve learned enough to understand that grief ignored, grief bottled, grief contained, grief powered through, has the power to leak or explode in harmful, dangerous ways.

Everyone’s grief is different and we each need to find the path that works for us; there’s no way to prescribe a universal path, a universal timetable, a universal result. But I have been well trained by my lifetime growing up in western society that, as a man, my emotions are to be constrained and managed, that admitting fallibility is weak and to be avoided at all costs, that anyone atypical needs to be culled from the social herd, that self-reliance is noble, and asking for help is something you only do when the water is well and truly pouring through the roof. If even then.

I am so grateful that I am finding my way through this in a way that allows me to be mindful of that education-in-masculinity and to start to unpack it. In this I owe a great debt to Oliver, who we tried so hard to raise with that mindfulness, and whose atypicalness, fortunately, extends to his regard for gender; to my trans and gender-non-binary friends who’ve shared profound insights into what they’ve learned as they’ve confronted the world on their terms; to the women I’ve listened to and shared with around the grief support table who’ve said, beyond anything else, “we see you there, it’s okay.”

All I can do in the face of this, knowing what I’ve learned, is to write about my own path with hopes that it will, at least a little, open up the atlas of what’s possible for others that follow on.

09 Mar 06:09

Raspberry Pi 400 Overclocking / NVMe SSD Setup Guide

by jamesachambers
Raspberry Pi 400 Setup w/ NVMe SSD

The Raspberry Pi 400 is the first offering from the Raspberry Pi lineup that is meant to approach desktop level performance. The official raspberrypi.org site lists the Pi 400 kit as the "Raspberry Pi 400 Personal Computer Kit". It comes in the very interesting form factor of a keyboard with all the ports right in the back! Although the performance on stock clock speeds and with an SD card was really great, especially for a Raspberry Pi, I would not call it desktop class performance. Fortunately we *can* make it desktop class performance with a few tweaks! This guide will show how to overclock the Pi 400 as well as set it up with an NVMe SSD to get the maximum possible performance we can out of it!

Source

09 Mar 06:06

Digitally-Enhanced Exit Tickets

Eric Sheninger, A Principal's Reflections, Mar 09, 2021

My first response was, "What are exit tickets?" proving that nobody knows everything even in their own area of specialization. Exit tickets are "informal assessment tools teachers can use to assess students understanding at the end of a class." In this post, Eric Sheninger looks at digital exit tickets. He suggested using GoSoapBox, a web-based clicker tool that allows responses to be aggregated on the screen. It allows students to work on problems, share their results anonymously, and allows the instructor to flag areas of concern.

09 Mar 01:39

It’s More Logical to Host an Event Than Attend One

by Ton Zijlstra

Talking with E tonight about how many people we know are involved in organising their own events, we made a quick list. That list now contains 38 people, most of whom we’ve known for a long time. That’s a group big enough to do an unconference / barcamp style event about event organising in itself!

The experiences of those people run from small workshops to global conferences. Myself, I’ve been active across that full spectrum as well. From BlogWalks and IndieWebCamps with two dozen people, our birthday unconferences (40 people in our home, 100 at the subsequent bbq), to national conferences, side-events at European and global conferences, European conferences in different countries with 300-400 people, to an edition of the global FabLab conference. The interesting bit is that for myself and almost all of the people on the list we just made, organising events wasn’t/isn’t our main activity. Often those events basically are a side activity, an emergent property of other work.

Ross Mayfield in a blog conversation in 2005 said “it’s cheaper to host your own event than attend one”. Not always cheaper I know, but it’s definitely more logical a lot of times. It’s a logic that E and I, and the many people we just listed, have followed for about two decades now. Where can you and we take that in the coming years?

09 Mar 01:38

Vinylcast #43: Midnight Oil’s Diesel and Dust

by bavaradio
#vinylcast of Midnight Oil’s Diesel and Dust on #ds106radio

Released in 1987 (1988 in the US), Midnight Oil’s Diesel and Dust was one of my favorite albums of the 1980s. It opened up both a sonic and political world that I deeply associate with a kind of popular coming of age; it is delivered with the earnest belief that music can change the world, and for that alone it was unbelievably compelling to 17-year-old me. I talk a bit about the VHS tape of their impromptu performance in front of Exxon’s NYC corporate offices to protest the Valdez oil spill, a major event in 1989 and an environmentalist awakening for many a Gen-Xer I am sure.

But the major theme of reconciliation with the aboriginal population in Australia is the lifeblood of this album, and the attempt to not only wrestle with but compellingly communicate this struggle and legacy of anguish is everything. It is far from perfect, but there is a spirit of anger and guilt, culpability and disgust that comes across quite adeptly. And while bands like INXS, AC/DC, and Crowded House came from Australia, you got the sense from their music, much like Men at Work, that Midnight Oil was Australia, and unlike Men at Work they were mad as hell and weren’t gonna take it anymore!

09 Mar 01:38

7 powerful alternatives to Microsoft Lists

Kenneth Franks, JotForm, Mar 08, 2021

Doing some background for my talk today opened me up to a product category that was sort of new to me (and is in many ways similar to some of what I'm building myself in gRSShopper). The category begins as the humble list, but quickly evolves into data tables, which in turn evolve into some sort of hybrid between spreadsheets and databases. It all sounds complicated - and it can be - but some of the platforms, like JotForm and AirTable, look relatively easy to use. There's also Microsoft Lists, as the title suggests, which is really built into Teams and/or OneNote (it's all so very unclear - and it's not just me; just try to get started using this documentation) and there's Google Tables, a beta project available in the U.S. only. Like all databases, though, while it's neat to put all your data into tables, the real change is how you get it there and how you use it. Still, these services start you thinking, and their existence tells me I'm on the right track with gRSShopper.

09 Mar 01:37

After getting @mamrotynka to go to Pacific Brea...

After getting @mamrotynka to go to Pacific Bread Company, we went for a walk today and stopped at the yarn store and the Polish deli, Polonia.

09 Mar 01:37

Went for a walk along the Fraser and ended up o...

Went for a walk along the Fraser and ended up over at the River District.

We biked there to see the murals as they were being painted way back in August.

Mural artist Fernanda Ribeiro, IG:littelost_fe

09 Mar 01:36

Stop Interrupting Women

Sitting in a meeting in a large leather chair in the boardroom of a growing company, I was pained to hear someone continually interrupting someone else. The person doing the interrupting was a male executive and the person being interrupted was a female director. The female remained quiet for the remainder of the meeting. After the meeting, I asked her privately about being interrupted. She smiled and said “Oh, yes, that happens all the time around here.”

Patrick Lencioni, author of many great books, including the Five Dysfunctions of a Team, recently tweeted: “If someone offered me a single piece of evidence to assess the health of an organization, I would want to observe the executive team during a meeting.”

Interrupting women is the norm in the business world and it’s a disgrace. Women who are routinely interrupted eventually remain silent and silence is deadly when it comes to fixing problems, adding value and innovating. A tremendous amount of value is lost when women are interrupted. And it leads to churn, as women seek new employment.

Recently, a female colleague of mine interviewed a male candidate. He manterrupted her so frequently during the interview that she decided to tally the number of interruptions. Over the course of 30 minutes, he interrupted her 10 times! Now, we know that sometimes lags in internet connections can account for interruptions, but 10 interruptions during one 30-minute Zoom call is hard to reconcile.

When Google set out to understand what led to the highest performance teams in their organization, it ultimately discovered that psychological safety was the essential ingredient. In his book, Smarter Faster Better, Pulitzer prize-winning author Charles Duhigg described how Google managers began to practice psychological safety by beginning with better meetings. When the meeting would start, they’d write down the names of each participant. When a participant spoke, they’d place a tally mark next to their name. By the end of the meeting, they could see who spoke a lot and who barely spoke at all. Over time, they encouraged quiet participants to speak, they repeated what they heard and they discouraged interruptions. This raised the level of psychological safety within their meetings and was a good start on the road to growing a culture of psychological safety.

My colleagues and I made the following poster to help others on the journey towards fewer interruptions, better listening and greater collaboration in meetings:

clear poster

Today is International Women’s Day. Let’s use this day to remind ourselves that every day we need to talk less, listen more and above all:

Stop Interrupting Women!



07 Mar 07:00

Lying to the ghost in the machine

by Charlie Stross
mkalus shared this story from Charlie's Diary.

(Blogging was on hiatus because I've just checked the copy edits on Invisible Sun, which was rather a large job because it's 50% longer than previous books in the series.)

I don't often comment on developments in IT these days because I am old and rusty and haven't worked in the field, even as a pundit, for over 15 years: but something caught my attention this week and I'd like to share it.

This decade has seen an explosive series of breakthroughs in the field misleadingly known as Artificial Intelligence. Most of them centre on applications of neural networks, a subfield which stagnated at a theoretical level from roughly the late 1960s to mid 1990s, then regained credibility, and in the 2000s caught fire as cheap high performance GPUs put the processing power of a supercomputer from ten years earlier in every goddamn smartphone.

(I'm not exaggerating there: modern CPU/GPU performance is ridiculous. Every time you add an abstraction layer to a software stack you can expect a roughly one order of magnitude performance reduction, so intuition would suggest that a WebAssembly framework (based on top of JavaScript running inside a web browser hosted on top of a traditional big-ass operating system) wouldn't be terribly fast; but the other day I was reading about one such framework which, on a new Apple M1 Macbook Air (not even the higher performance Macbook Pro) could deliver 900GFlops, which would put it in the top 10 world supercomputers circa 1996-98. In a scripting language inside a web browser on a 2020 laptop.)

Training NNs, and in particular Generative Adversarial Networks, takes a ridiculous amount of computing power, but we've got it these days. And they deliver remarkable results at tasks such as image and speech recognition. So much so that we've come to take for granted the ability to talk to some of our smarter technological artefacts, and the price of gizmos with Siri or Alexa speech recognition/search baked in has dropped into two digits as of last year. Sure they need internet access and a server farm somewhere to do the real donkey work, but the effect is almost magically ... stupid.

If you've been keeping an eye on AI you'll know that the real magic is all in how the training data sets are curated, and the 1950s axiom "garbage in, garbage out" is still applicable. One effect: face recognition in cameras is notorious for its racist bias, with some cameras being unable to focus or correctly adjust exposure on darker-skinned people. Similarly, in the 90s, per legend, a DARPA initiative to develop automated image recognition for tanks that could distinguish between NATO and Warsaw Pact machines foundered when it became apparent that the NN was returning hits not on the basis of the vehicle type, but on whether there was snow and pine forests in the background (which were oddly more common in publicity photographs of Soviet tanks than in snaps of American or French or South Korean ones). Trees are an example of a spurious image that deceives an NN into recognizing something inappropriately. And they show the way towards deliberate adversarial attacks on recognizers—if you have access to a trained NN, you can often identify specific inputs that, when merged with the data stream the NN is searching, trigger false positives by adding just the right amount of noise to induce the NN to see whatever it's primed to detect. You can then apply the noise in the form of an adversarial patch, a real-world modification of the image data being scanned: dazzle face-paint to defeat face recognizers, strategically placed bits of tape on road signage, and so on.
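As a rough illustration of "adding just the right amount of noise", here is a toy sketch of a sign-of-gradient perturbation against a made-up linear detector. Everything in it is an assumption for illustration: the weights, the four "pixel" values, and the function names are hypothetical, and a real recognizer is a deep network rather than a single weighted sum. The one idea it carries over is that someone who knows the model's parameters can nudge every input slightly in the direction that raises the score most.

```python
# Toy linear "recognizer": a score above zero means "target detected".
# WEIGHTS and BIAS are invented for this sketch, not taken from any real model.
WEIGHTS = [0.9, -0.4, 0.3, -0.8]
BIAS = -0.1


def score(pixels):
    return sum(w * p for w, p in zip(WEIGHTS, pixels)) + BIAS


def detects(pixels):
    return score(pixels) > 0


def sign(value):
    return (value > 0) - (value < 0)


def perturb(pixels, epsilon):
    """Shift each pixel by at most epsilon in the direction that raises the score.

    For a linear model the gradient of the score with respect to the input is
    just the weight vector, so the step is epsilon * sign(weight) per pixel.
    """
    return [p + epsilon * sign(w) for p, w in zip(pixels, WEIGHTS)]


clean = [0.2, 0.6, 0.1, 0.5]          # hypothetical input with no target in it
patched = perturb(clean, epsilon=0.25)

print(detects(clean), detects(patched))
```

With these invented numbers the clean input scores below zero and the perturbed one scores above it, even though no pixel moved by more than 0.25. A physical patch or sticker is the same trick constrained to changes that survive being photographed.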

As AI applications are increasingly deployed in public spaces we're now beginning to see the exciting possibilities inherent in the leakage of human stupidity into the environment we live in.

The first one I'd like to note is the attack on Tesla cars' "autopilot" feature that was publicized in 2019. It turns out that Tesla's "autopilot" (actually just a really smart adaptive cruise control with lane tracking, obstacle detection, limited overtaking, and some integration with GPS/mapping: it's nowhere close to being a robot chauffeur, despite the marketing hype) relies heavily on multiple video cameras and real time image recognition to monitor its surrounding conditions, and by exploiting flaws in the image recognizer attackers were able to steer a Tesla into the oncoming lane. Or, more prosaically, you could in principle sticker your driveway or the street outside your house so that Tesla autopilots will think they're occupied by a truck, and will refuse to park in your spot.

But that's the least of it. It turns out that the new hotness in AI security is exploiting backdoors in neural networks. NNs are famously opaque (you can't just look at one and tell what it's going to do, unlike regular source code) and because training and generating NNs is labour- and compute-intensive it's quite commonplace to build recognizers that 'borrow' pre-trained networks for some purposes, e.g. text recognition, and merge them into new applications. And it turns out that you can purposely create a backdoored NN that, when merged with some unsuspecting customer's network, gives it some ... interesting ... characteristics. CLIP (Contrastive Language-Image Pre-training) is a popular NN research tool, a network trained from images and their captions taken from the internet. [CLIP] learns what's in an image from a description rather than a one-word label such as "cat" or "banana." It is trained by getting it to predict which caption from a random selection of 32,768 is the correct one for a given image. To work this out, CLIP learns to link a wide variety of objects with their names and the words that describe them.

CLIP can respond to concepts whether presented literally, symbolically, or visually, because its training set included conceptual metadata (textual labels). So it turns out if you show CLIP an image of a Granny Smith, it returns "apple" ... until you stick a label on the fruit that says "iPod", at which point as far as CLIP is concerned you can plug in your headphones.

NN recognizing a deceptively-labelled piece of fruit as an iPod
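The caption-matching idea, and why a sticker can hijack it, can be sketched with a few invented vectors. This is not CLIP's actual code or API: the three-number "embeddings", the caption strings, and the `best_caption` helper are all hypothetical stand-ins. The sketch assumes only what the text says, namely that the model picks whichever caption best matches the image, and it uses cosine similarity as the matching rule.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_caption(image_vec, caption_vecs):
    """Return the caption whose vector is most similar to the image vector."""
    return max(caption_vecs, key=lambda c: cosine(image_vec, caption_vecs[c]))


# Invented caption embeddings; the axes loosely mean (fruit, gadget, animal).
captions = {
    "a photo of an apple": [0.9, 0.1, 0.0],
    "a photo of an iPod": [0.1, 0.9, 0.1],
    "a photo of a dog": [0.0, 0.1, 0.9],
}

plain_apple = [0.8, 0.2, 0.1]      # hypothetical embedding of the bare fruit
labelled_apple = [0.5, 0.9, 0.1]   # same fruit with the written label pulling toward "gadget"

print(best_caption(plain_apple, captions))
print(best_caption(labelled_apple, captions))
```

Because written text and pictures land in the same space, a legible label can contribute more to the image's position than the object it is stuck to, which is the failure the apple demonstrates.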

And it doesn't stop there. The finance neuron, for example, responds to images of piggy banks, but also responds to the string "$$$". By forcing the finance neuron to fire, we can fool our model into classifying a dog as a piggy bank.

The point I'd like to make is that ready-trained NNs like GPT-3 or CLIP are often tailored as the basis of specific recognizer applications and then may end up deployed in public situations, much as shitty internet-of-things gizmos usually run on an elderly, unpatched ARM linux kernel with an old version of OpenSSH and busybox installed, and hard-wired root login credentials. This is the future of security holes in our internet-connected appliances: metaphorically, cameras that you can fool by slapping a sticker labelled "THIS IS NOT THE DROID YOU ARE LOOKING FOR" on the front of the droid the camera is in fact looking for.

And in five years' time they're going to be everywhere.

I've been saying for years that most people relate to computers and information technology as if they're magic, and to get the machine to accomplish a task they have to perform the specific ritual they've memorized with no understanding. It's an act of invocation, in other words. UI designers have helpfully added to the magic by, for example, adding stuff like bluetooth proximity pairing, so that two magical amulets may become mystically entangled and thereafter work together via the magical law of contagion. It's all distressingly bronze age, but we haven't come anywhere close to scraping the bottom of the barrel yet.

With speech interfaces and internet of things gadgets, we're moving closer to building ourselves a demon-haunted world. Lights switch on and off and adjust their colour spectrum when we walk into a room, where we can adjust the temperature by shouting at the ghost in the thermostat, the smart television (which tracks our eyeballs) learns which channels keep us engaged and so converges on the right stimulus to keep us tuned in through the advertising intervals, the fridge re-orders milk whenever the current carton hits its best-before date, the robot vacuum comes out at night, and as for the self-cleaning litter box ... we don't talk about the self-cleaning litterbox.

Well, now we have something to be extra worried about, namely the fact that we can lie to the machines, and so can thieves and sorcerers. Everything has a True Name, and the ghosts know them as such but don't understand the concept of lying (because they are a howling cognitive vacuum rather than actually conscious). Consequently it becomes possible to convince a ghost that the washing machine is not a washing machine but a hippopotamus. Or that the STOP sign at the end of the street is a 50km/h speed limit sign. The end result is people who live in a world full of haunted appliances like the mop and bucket out of the sorcerer's apprentice fairy tale, with the added twist that malefactors can lie to the furniture and cause it to hallucinate violently, or simply break. (Or call the police and tell them that an armed home invasion is in progress because some griefer uploaded a patch to your home security camera that identifies you as a wanted criminal and labels your phone as a gun.)

Finally, you might think you can avoid this shit by not allowing any internet-of-things compatible appliances—or the ghosts of Cortana and Siri—into your household. And that's fine, and it's going to stay fine right up until the moment you find yourself in this elevator ...