Shared posts

16 Jan 22:03

Square off: Machine learning libraries

by Mayukh Bhaowal

Top five characteristics to consider when deciding which library to use.

Choosing a machine learning (ML) library to solve predictive use cases is easier said than done.

There are many to choose from, and each has its own niche and benefits suited to specific use cases. Even for someone with decent experience in ML and data science, it can be an ordeal to vet all the varied solutions. Where do you start? At Salesforce Einstein, we have to constantly research the market to stay on top of it. Here are some observations on the top five characteristics of ML libraries that developers should consider when deciding which library to use:

1. Programming paradigm

Most ML libraries fall into one of two tribes in their high-level design for mathematical computation: the symbolic tribe and the imperative tribe.

In symbolic programs, you define a complex mathematical computation functionally without actually executing it. It generally takes the form of a computational graph: you lay out all the pieces and connect them in an abstract fashion before materializing the graph with real values as inputs. The biggest advantages of this pattern are composability and abstraction, which let developers focus on higher-level problems. Efficiency is another big advantage, as it is relatively easy to parallelize such functions.
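The define-then-run idea can be sketched in a few lines of plain Python. This is a hand-rolled toy for illustration only; real symbolic libraries build far richer DAGs of tensors and operators:

```python
# Toy symbolic style: build an abstract graph of operations first,
# then materialize it with real inputs.

class Node:
    def __init__(self, op, *parents):
        self.op, self.parents = op, parents

    def __add__(self, other):
        return Node("add", self, other)

    def __mul__(self, other):
        return Node("mul", self, other)

def evaluate(node, env):
    """Walk the graph, executing it with concrete input values."""
    if node.op == "input":
        return env[node.parents[0]]
    left = evaluate(node.parents[0], env)
    right = evaluate(node.parents[1], env)
    return left + right if node.op == "add" else left * right

# Define the computation abstractly -- nothing runs yet.
x = Node("input", "x")
y = Node("input", "y")
z = (x + y) * x              # only builds more Nodes

# Materialize with real values as inputs.
print(evaluate(z, {"x": 3, "y": 4}))   # (3 + 4) * 3 = 21
```

Because `z` is just a data structure until `evaluate` runs, a framework is free to optimize or parallelize the graph before execution — which is exactly the efficiency advantage described above.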

Apache Spark’s ML library, Spark MLlib, and any library built on Spark, such as Microsoft’s MMLSpark and Intel's BigDL, follow this paradigm. A directed acyclic graph (DAG) is their representation of the computational graph. Other examples of symbolic-program ML libraries are CNTK, with static computational graphs; Caffe2, with Net (which is a graph of operators); and Keras.

In imperative programs, everything is execution-first. You write a line of code, and when that line is executed, the numerical computation actually runs before moving to the next line. This style makes prototyping much easier, as it tends to be more flexible and much easier to debug and troubleshoot. Scikit-learn is a popular Python library that falls into this category. Other libraries, such as auto-sklearn and TPOT, are layers of abstraction on top of scikit-learn and also follow this paradigm. PyTorch is yet another popular choice; it supports dynamic computational graphs, making the process imperative.
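The execute-as-you-go style is easy to see in plain Python, where every line runs immediately and each intermediate value is concrete — which is what makes inspecting and debugging so straightforward:

```python
# Imperative style: each line runs the numerical computation the
# moment it executes, so every intermediate value can be inspected
# while prototyping.
x = 3
y = 4
s = x + y      # runs immediately; s is 7
z = s * x      # runs immediately; z is 21
print(z)       # 21
```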

Clearly, there are tradeoffs with either approach, and the right one depends on the use case. Imperative programming is great for research, as it naturally supports faster prototyping—allowing for repetitive iterations, failed attempts, and a quick feedback loop—whereas symbolic programming is better catered toward production applications.

There are some libraries that combine both approaches in a hybrid style. The best example is MXNet, which allows imperative programs within symbolic programs as callbacks, or symbolic programs as part of imperative programs. Another newer development is Eager Execution in Google’s TensorFlow: though TensorFlow is originally a Python library with a symbolic paradigm (a static computational graph of tensors), Eager Execution does not need a graph, and execution can happen immediately.

2. Machine learning algorithms

Supervised learning, unsupervised learning, recommendation systems, and deep learning are the common classes of problems that we deal with in machine learning. Again, your use case will dictate which library to use. For example, if you are doing a lot of custom image processing, Caffe2 would be a good choice, all other factors being equal. It is an evolution of Caffe, whose original use case was convolutional neural networks (CNNs) for image classification. CNTK would be a reasonable choice for language processing, as the CNTK framework was born out of the language services division of Microsoft. On the other hand, if most of the use cases are supervised and unsupervised learning, Spark MLlib, scikit-learn, H2O, and MMLSpark are good alternatives, as they support an exhaustive collection of supervised and unsupervised algorithms. Spark MLlib, H2O, and Mahout additionally support recommendations via collaborative filtering.

Many of the older libraries now fall short with the rise of deep learning (DL). TensorFlow was one of the first libraries to make deep learning accessible to data scientists. Today, we have many others focusing on deep learning, including PyTorch, Keras, MXNet, Caffe2, CNTK, and BigDL. There are other libraries that support DL algorithms without it being their main function, such as MMLSpark (image and text learning) and H2O (via the deepwater plugin).

  • Supervised and unsupervised: Spark MLlib, scikit-learn, H2O, MMLSpark, Mahout
  • Deep learning: TensorFlow, PyTorch, Caffe2 (image), Keras, MXNet, CNTK, BigDL, MMLSpark (image and text), H2O (via the deepwater plugin)
  • Recommendation system: Spark MLlib, H2O (via the sparkling-water plugin), Mahout

3. Hardware and performance

Compute performance is a key criterion in selecting the right library for your project. This is especially true of libraries specializing in DL algorithms, as they tend to be computationally intensive.

One of the biggest trends that has boosted DL development is advances in GPUs and the ability to perform large matrix operations on them. All DL libraries, such as TensorFlow, Keras, PyTorch, and Caffe2, support GPUs, but many general-purpose libraries, like MMLSpark, H2O, and Apache Mahout, support GPUs as well. CNTK and MXNet boast automatic multi-GPU and multi-server support, which allows for fast distributed training using parallelization across multiple GPUs without any need for configuration. TensorFlow, however, has gained a reputation for being slower than comparable DL platforms. To compensate, TensorFlow is advertising big performance gains on its new custom AI chip, the Tensor Processing Unit (TPU). The drawback is that the TPU is non-commodity hardware and works only with TensorFlow, causing vendor lock-in.

Caffe2, MXNet, and TensorFlow also stand out for their mobile computation support—so if your use case requires running ML training on mobile, these libraries would be your best bet.

The takeaway on performance is that most libraries built on top of Spark can exploit Spark's parallel cluster computing with cached intermediate data in memory, making machine learning algorithms that are inherently iterative run fast. Apache Mahout is the exception: until recently it supported only Hadoop MapReduce, which involves expensive disk I/O, and was hence slower for iterative algorithms. Mahout has now added support for Scala on Spark, H2O, and Apache Flink. BigDL is novel in making DL possible on the Spark ecosystem with CPUs, a departure from traditional DL libraries, which all leverage GPU acceleration; it instead uses Intel’s MKL and multi-threaded programming.

  • CPU: Spark MLlib, scikit-learn, auto-sklearn, TPOT, BigDL
  • GPU: Keras, PyTorch, Caffe2, MMLSpark, H2O, Mahout, CNTK, MXNet, TensorFlow
  • Mobile Computing: MXNet, TensorFlow, Caffe2

4. Interpretability

ML software differs from traditional software in the sense that the behavior or outcome is not easily predictable. Unlike rule-based engines, such software constantly learns new rules. One of the biggest challenges we face at Salesforce Einstein is how to constantly build trust and confidence in machine learning applications. Why did it predict Lead X as having a higher likelihood of conversion to an opportunity while Lead Y has a lower likelihood? What are the patterns in the data set that are driving certain predictions? Can we convert such insights from the machine learning model into actions?

Other corollaries to this problem include visualizing computational graph execution metrics, observing data flows in order to optimize and hand-craft models, and debugging model quality and performance.
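One library-agnostic way to probe "what is driving the predictions" is permutation importance: shuffle one feature at a time and measure how much the model's accuracy drops. A minimal sketch — the model and data here are made up for illustration, standing in for any trained predictor:

```python
import random

random.seed(42)  # deterministic shuffles for the demo

def permutation_importance(model, X, y, n_features, trials=20):
    """Average accuracy drop when one feature column is shuffled:
    the bigger the drop, the more the model relies on that feature."""
    def accuracy(rows):
        return sum(model(r) == t for r, t in zip(rows, y)) / len(y)

    baseline = accuracy(X)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(trials):
            col = [row[j] for row in X]
            random.shuffle(col)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            drops.append(baseline - accuracy(shuffled))
        importances.append(sum(drops) / trials)
    return importances

# Hypothetical "trained model" that only ever looks at feature 0.
model = lambda row: 1 if row[0] > 0.5 else 0
X = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.7], [0.1, 0.3]]
y = [1, 0, 1, 0]

imp = permutation_importance(model, X, y, n_features=2)
# Feature 0 should show a positive importance; feature 1 should be ~0.
```

Answers like these ("the model leans almost entirely on feature 0") are exactly the kind of insight that builds the trust discussed above.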

This is a relatively immature area in ML that only a few of the libraries attempt to address. H2O launched Machine Learning Interpretability, which addresses some aspects of this problem. TensorFlow has a visualization layer called TensorBoard, which helps data scientists understand, optimize, and debug massive deep neural networks. Keras also addresses this with its model visualization.

Though these are good steps in the right direction, this area needs more investment to make ML transparent and less of a black box in order to encourage wider adoption.

  • Interpretability: TensorFlow (TensorBoard), H2O (Machine Learning Interpretability), Keras (model visualization)

5. Automated machine learning

Arguably, one of the biggest areas for innovation is automated machine learning. Real-life ML is not just about building models, but about building pipelines that include ETL, feature engineering, feature selection, model selection (including hyper-parameter tuning), model updates, and deployment.

Many of these workflows are common across applications and data sets, and tend to be repeated, meaning there is an opportunity to optimize and automate. Additionally, some of the workflows need significant intuition and tribal knowledge in data science and machine learning, such as feature engineering or tuning deep models. These make machine learning inaccessible to those who do not necessarily have a Ph.D. Automating many of the steps can accelerate data scientists’ productivity and help build applications in hours rather than months.
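The simplest baseline for automating one of these steps — hyper-parameter tuning — is random search; the Bayesian optimization and genetic programming mentioned below are smarter variants of the same loop. A toy sketch, with a made-up scoring function standing in for "train a model and return its validation score":

```python
import random

def random_search(train_and_score, space, n_trials=100, seed=0):
    """The simplest automated tuner: sample hyper-parameter settings
    at random and keep the best-scoring configuration."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = train_and_score(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical scoring function standing in for "train and validate";
# it peaks at depth=5, lr=0.1.
def train_and_score(p):
    return 1.0 - abs(p["depth"] - 5) * 0.05 - abs(p["lr"] - 0.1)

space = {"depth": [2, 3, 5, 8, 13], "lr": [0.01, 0.1, 0.5]}
best, score = random_search(train_and_score, space)
```

Replacing the random sampling with a model of which regions of the space look promising is, in essence, what Bayesian optimization does.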

Auto-sklearn, TPOT, and H2O are built on this premise, targeting supervised classification problems. Auto-sklearn automates model selection and hyper-parameter tuning using Bayesian optimization. TPOT uses genetic programming for its hyper-parameter tuning. Both TPOT and H2O have also included several degrees of automation for feature engineering. MMLSpark has auto model selection and a certain degree of automated feature engineering for image and text features.

There is a large gap in the market for this category, both in the breadth (the different stages in the pipeline) and depth (intelligent approaches to automate a single stage) of offerings.

  • Hyper-parameter tuning: auto-sklearn (Bayesian optimization), TPOT (genetic programming)
  • Limited auto feature engineering: TPOT, H2O, MMLSpark

Other noteworthy considerations

Though models in ML need data sets to be trained on before they can be used, there are scenarios where one can get access to models for data sets that are global in nature. For example, a universal image data set like ImageNet is good enough for building a general-purpose image classification model, also known as a pre-trained model. Such models can be plugged in, meaning no data or training is needed. MMLSpark, CNTK, TensorFlow, PyTorch, Keras and BigDL all provide pre-trained models for general-purpose classification tasks. One caveat here is that such models are useless for custom use cases. For instance, a general-purpose image classification model will perform poorly if it needs to classify the type of crops from aerial images of crop fields, but it would work well classifying cats versus dogs. This is because, though ImageNet has crop images, there is insufficient training data of specific types of crops or crops with different diseases that, for instance, a fertilizer company might care about.

CNTK comes with some additional handy features like automatic randomization for data sets and real-time training. Though MMLSpark is a Scala library, it supports auto-generation of interfaces in other languages, namely Python and R.

  • Pre-trained models: MMLSpark, CNTK, TensorFlow, PyTorch, Keras, BigDL
  • Real-time training: CNTK
  • Multi-language support: MMLSpark supports auto generation of interfaces in Python/R from Scala

Decision time

There are myriad options for ML libraries to choose from when you are building ML into a product, and while there may not be one perfect option, it helps to consider the above factors to ensure you’re picking the best solution for your specific needs. For enterprise companies that have thousands of business customers, there are a host of other challenges that the market does not yet address. Label leakage, also known as data leakage, has been the Achilles’ heel of ML libraries—this occurs when, because of unknown business processes, the data set available for model training has fields that are proxies for the actual label. Sanity checking the data for such leaks and dropping those fields is key to well-performing models.

Multitenancy is another sticking point—how can we share common pieces of machine learning platforms and resources to serve multiple tenants, each of which has its own unique data sets that lead to completely different trained models? This problem lends itself to scale of a different sort. As the industry continues to face challenges like this head on, a complete and exhaustive auto-ML solution, which has yet to be developed, will likely prove to be the key to success.

Continue reading Square off: Machine learning libraries.

16 Jan 22:00

Black Mirror’s USS Callister and the toxic fanboy

by Jason Kottke

For many, the standout episode of the newest season of Black Mirror is USS Callister. In a recent video (w/ spoilers galore), ScreenPrism breaks down how the episode veers from the Star Trek-inspired opening into a parable about toxic fanboyism, sexism, and online behavior.

Daly is clearly driven by the lack of respect he gets, but Nanette didn’t disrespect him. She’s shown him huge respect and admiration; it’s just for his work rather than expressed as wanting to sleep with him. There’s a weird cultural assumption we tend to make that if a woman thinks highly of a man, she must want to sleep with him. And then if she doesn’t, it’s somehow an insult to him, and that’s exactly what we see going on in this episode.

When I finished watching the episode, it struck me as a timely repudiation of Gamergate, meninists on Reddit & Twitter, and those who want to roll back the clock to a time when a woman’s place was wherever a man told her to be. Great episode, one of my favorites of the entire series.

Tags: Black Mirror   sexism   video
16 Jan 22:00

A wishlist of scientific breakthroughs by Robert Boyle

by Jason Kottke

Robert Boyle List

17th-century scientist Robert Boyle, one of the world’s first chemists and creator of Boyle’s Law, wrote out a list of problems he hoped could be solved through science. Since the list was written more than 300 years ago, almost everything on it has been discovered, invented, or otherwise figured out in some fashion. Here are several of the items from Boyle’s list (in bold) and the corresponding scientific advances that have followed:

The Prolongation of Life. English life expectancy in the 17th century was only 35 years or so (due mainly to infant and child mortality). The world average in 2014 was 71.5 years.

The Art of Flying. The Wright Brothers conducted their first flight in 1903 and now air travel is as routine as riding in a horse-drawn carriage in Boyle’s time.

The Art of Continuing long under water, and exercising functions freely there. Scuba gear was in use by the end of the 19th century and some contemporary divers have remained underwater for more than two days.

The Cure of Diseases at a distance or at least by Transplantation. Not quite sure exactly what Boyle meant by this, but human organ transplants started happening around the turn of the 20th century. X-rays, MRI machines, and ultrasound all peer inside the body for disease from a distance. Also, doctors are now able to diagnose many conditions via video chat.

The Attaining Gigantick Dimensions. I’m assuming Boyle meant humans somehow transforming themselves into 20-foot-tall giants and not the obesity that has come with our relative affluence and availability of cheap food. Still, the average human is 4 inches taller than 150 years ago because of improved nutrition. Factory-farmed chickens have quadrupled in size since the 1950s. And if Boyle paid a visit to the Burj Khalifa or the Mall of America, he would surely agree they are Gigantick.

The Acceleration of the Production of things out of Seed. To use just one example out of probably thousands, some varieties of tomato take just 50 days from planting to harvest. See also selective breeding, GMOs, hydroponics, greenhouses, etc. (P.S. in Boyle’s time, tomatoes were suspected to be poisonous.)

The makeing of Glass Malleable. Transparent plastics were first developed in the 19th century and perfected in the 20th century.

The making of Parabolicall and Hyperbolicall Glasses. The first high quality non-spherical lenses were made during Boyle’s lifetime, but all he’d need is a quick peek at a pair of Warby Parkers to see how much the technology has advanced since then, to say nothing of the mirrors on the Giant Magellan Telescope.

The making Armor light and extremely hard. Bulletproof armor was known in Boyle’s time, but the introduction of Kevlar vests in the 1970s made them truly light and strong.

The practicable and certain way of finding Longitudes. When pushed to its limits, GPS is accurate in determining your location on Earth to within 11 millimeters.

Potent Druggs to alter or Exalt Imagination, Waking, Memory, and other functions, and appease pain, procure innocent sleep, harmless dreams, etc. Dude, we have so many Potent Druggs now, it’s not even funny. According to a 2016 report, the global pharmaceutical market will reach $1.12 trillion.

A perpetuall Light. It’s not exactly perpetual, but the electric lightbulb was invented in the 19th century and the longest-lasting bulb has been working for at least 116 years.

Varnishes perfumable by Rubbing. Scratch and sniff was invented by 3M in 1965.

(via bb)

Tags: lists   Robert Boyle   science
21 Dec 21:51

Sam Altman: I should have the freedom to say stupid shit, you should have the freedom to applaud

by Paul Bradley Carr


"I have never met a man so ignorant that I couldn't learn something from him." – Galileo

“Hold my soylent” – Sam Altman

You know the old saying: Writing about Y Combinator is like wrestling with a pig, you both get dirty and the pig threatens to fabricate a smear campaign against you and your company.

Then there’s the sheer redundancy of the exercise: If by now you don’t understand the extent to which Paul Graham’s bro creche is responsible for so much that’s wrong with Silicon Valley then one more blog post ain’t gonna help you. Bro.

And yet.

Earlier this week, the current head of Y Combinator, Sam “the fucking worst” Altman, summoned his hoodie-wearing faithful to assemble in Saint Ayn’s square to hear his latest proclamation of genius. Its title: ‘E Pur Si Muove’ or, in English, ‘can you fucking believe this douche?’

I quote…

Earlier this year, I noticed something in China that really surprised me.  I realized I felt more comfortable discussing controversial ideas in Beijing than in San Francisco.  I didn’t feel completely comfortable—this was China, after all—just more comfortable than at home.

That showed me just how bad things have become, and how much things have changed since I first got started here in 2005.

It seems easier to accidentally speak heresies in San Francisco every year.  Debating a controversial idea, even if you 95% agree with the consensus side, seems ill-advised.

This will be very bad for startups in the Bay Area.

For the next six hundred words, Altman expands on his theme: That geniuses like him and their bold disruptive ideas around science and technology are being driven out of San Francisco by… well… questions about the ethics of those same ideas…

[S]mart people tend to have an allergic reaction to the restriction of ideas, and I’m now seeing many of the smartest people I know move elsewhere…

I’ve seen credible people working on ideas like pharmaceuticals for intelligence augmentation, genetic engineering, and radical life extension leave San Francisco because they found the reaction to their work to be so toxic. “If people live a lot longer it will be disastrous for the environment, so people working on this must be really unethical” was a memorable quote I heard this year.

After all, what’s a genius supposed to do when a guy at a dinner party dares to bring up the downside of population growth, but to…. move to Beijing where, until recently, the government would slaughter your second born child? Not since Edward Snowden moved to Russia to protest American spying have hypocrisy and delusion been joined in such happy matrimony.

I could waste a lifetime pointing out the logical flaws and screaming hypocrisies of Sam Altman and his bro-thren – never mind the chalkboard-scratching self delusion of Altman likening himself to Galileo (as he does twice in that one blog post.) But most of his drivel can be summed up in a single sentence: Sam Altman should have the right to say or invest in any horrible thing he likes, and everyone else has the right to applaud.

The scary part, though, is the timing of Altman’s screed, coming as it does just as Silicon Valley is finally debating the way women and minorities are treated, and spoken about, in startups. The similarity between Altman’s rhetoric and that of James Damore is not a coincidence.

Just read this next quote. If you haven’t read it before, could you honestly tell me if it was written by Altman or Damore…

This is uncomfortable, but it’s possible we have to allow people to say disparaging things about gay people if we want them to be able to say novel things about physics.

(It’s Altman.)

Altman's post has almost nothing to do with genetic engineering or augmented intelligence or even the First Amendment, and everything to do with Y Combinator bros finally being called to account. Witness how Altman's buddies are already furious that people - that is, women, minorities and other non-awful human beings - are daring to exercise their own free speech to criticize his unimpeachable wisdom.

Coming soon: Sam Altman announces that – inspired by the incredible response to his post – he has decided (reluctantly! unexpectedly!) to run for mayor of San Francisco. His manifesto: To restore the city as the free-speech capital of the world. Ron Conway has already given his blessing. Palantir has offered technical assistance. Keith Rabois and Joe Lonsdale are on the advisory committee. And Beelzebub himfuckingself will be handing out campaign fliers.

21 Dec 21:48

ChordFlow 2.1 brings Melody Tracks and much more

by Ashley Elsdon

ChordFlow looks fun - never heard of it before

ChordFlow gets a really big update adding a host of new features for users. There’s a lot to show, so here are all the details:

Each ChordFlow song section can now have up to 4 melody tracks in addition to 4 arpeggio tracks. You can now create complete song ideas with chord progressions, arpeggios, and melodies. The new melodies are edited in a separate grid (distinct from the arpeggio grid). You can specify the scale of the melody grid and configure the length and rate parameters. The chords that you configured in the chord progression editor are shown in the melody grid view below the note grid. For convenience, chord notes are highlighted with grey rounded bars, letting you see which notes of the melody match the corresponding chord.

There is now a track control panel at the bottom of the main view, where you can specify the destination of each track and mute or solo individual tracks. These settings are now stored individually for each song.

You can now loop a region when you are in the arpeggio or melody editor. To add a loop region, tap on the timeline bar above the note grid. A loop with the default length of 4 steps will be added. You can then move the loop region around by dragging its center, or stretch and squeeze it by dragging its ends. To remove the loop region, just tap it again. The loop region is only active while you are in the editor view. When you move back to the main view, the loop is automatically removed.


  • The Remove (Trash) button in the arpeggio and melody editors now removes only the selected track, and there is no more annoying confirmation dialog. If you tap the remove button accidentally, you can restore the deleted track with the undo button.
  • The line tool and delete tool behaviour changed a little. Before this update, when you started to draw a line or delete a region, you could only extend the already drawn region. Now it behaves more as expected: if you have drawn a line further than you needed, you can now move back to make it smaller.
  • Arpeggio grid max length increased from 32 to 64 steps.
  • Color scheme changed. After introducing 4 new colors for melody tracks, I decided to switch to more neutral background colors, as the new colors did not look very good on the original blue scheme.
  • Bug fixes

ChordFlow costs $9.99 on the App Store now.

The post ChordFlow 2.1 brings Melody Tracks and much more appeared first on CDM Create Digital Music.

21 Dec 21:47

My email and task management protocol

by Joi

SaneBox sounds cool - an easy application of AI

November 2010, before I "settled down" with a "real job."

The last blog post I wrote was about how little time I have to do email and the difficulty in coping with it. Often when I meet new people, they quickly take a look at my blog and read the top post, which in this case is a whiny post about how busy I am - fine, but not exactly the most exciting place to start a conversation. The fact that I haven't written anything really interesting on this blog since then is a testament to the fact that I haven't solved my "busy problem", but I thought I'd give you an update on the somewhat improved state of things.

After the last post, Ray Ozzie pointed out in the comments that I was looking at the problem the wrong way. Instead of trying to allot partial attention to doing email during meetings, he suggested I should instead figure out how to effectively process email where the input and output flows are balanced. I took his feedback to heart and have embarked on trying to make my inbox processing more efficient. In case it is useful for people, here are a few protocols that I've instituted.

While I don't get to inbox zero every day, I get to near inbox zero at least once a week. I feel that I'm mostly on top of things, and if I'm unable to do something or meet someone, it is because I really am unable to do it, rather than just accidentally missing it. This feels much better.

My next step will start after the new year, when I'll start scheduling exercise, learning and "mindfulness pauses" into each day and pushing my bar for saying "yes" to requests much higher to try to make room for this.

So far, I've implemented the following steps, which you, too, might find effective:


My signature file says, "Tip: Use NRR to mean No Reply Required - thank you!", and I've tried to make it a "thing" for my associates to let each other know when you are sending a message that doesn't need a reply. This cuts down on the "thanks!" or "OK!" type emails.


I use SaneBox, a service that sorts your email behind the scenes into various folders. Only people you have written email to in the past, or people or domain names that have been "trained," end up in your inbox. You train SaneBox by dragging email into different folders to teach it where messages should go, or you can program domains or certain strings in the subject line to send a message to a particular box. I have four folders: "Inbox," where the important messages go; "@SaneLater," where email from people I don't know goes; "@SaneBulk," where bulk email goes; and "@SaneBlackHole," where things go that I never want to see again.


Gmail has a nifty feature that allows you to give access to your inbox to other people. Two people have access to my inbox to help me triage and write replies. They also keep an eye on "@SaneLater" for messages from new people who I should pay attention to. Requests requiring actions or replies that are substantial go to Trello. (More below about Trello.) Information requests, requests that need to be redirected to someone else, or meetings that I can't possibly attend get processed right in my inbox. Email that needs a reply but won't take more than a few minutes ends up getting converted into a ticket in Keeping and assigned to whoever should be involved. (More on Keeping below.)


We have a Media Lab Slack channel and any interaction that can be settled on Slack, we do on Slack and try not to create email threads.


Trello is a wonderful tool that allows you to track tasks in groups. It's organized very much like a "Kanban" system and is used by agile software developers and others who need a system of tracking tasks through various steps. Trello lets you forward email to create cards, assign cards to people to work on, and have conversations on each card via email, a mobile app and a desktop app.

I have two "boards" on Trello. One is a "Meetings" board, where each meeting request starts life in the "Incoming" list with a color coded tag for which city the request is for or whether it is a teleconference. I then drag meetings requests from "Incoming" to "Someday Soon" or "Schedule" or "Turn Down."

The cards in "Schedule" are sorted roughly in order of priority, and my team takes cards from the top of the list and starts working on scheduling them in that order. Meetings where we have suggested dates and are awaiting confirmation go to the "Waiting For Confirmation" list, and cards that are confirmed end up in "Confirmed" list. If for some reason a meeting fails to happen, then its card gets moved to "Failed/Reschedule", and when meetings are completed, they end up in "Completed." At least once a week, I go through and archive the cards in the "Completed" list after scanning for any missing follow-up items or things that I need to remember. I also go through "Incoming" and "Someday Soon" lists and make more decisions on whether to schedule or turn down meeting requests. And I try to check the priority ranking of the "Schedule" list.

In addition to the "Meetings" board, I have a "To Do" board.

The To Do board has a similar "Incoming" list of things that others or I think might be worth doing. When I've committed to doing something, I move it to the "Committed" list. When something isn't done and instead gets stuck because I need a response from someone, it moves to a "Waiting" list. Once completed, it goes to "Completed" and is later archived after I've given myself sufficient positive feedback for having completed it. I also have "Abandoned," "Turn Down," and "Delegated" lists on this board.


Keeping is a tracker system very similar to what a customer support desk might use. It allows you to convert any email into a "ticket," and you can create an email address that doubles as the address for the ticket system. More people have access to my ticket system than to my inbox. Once an email becomes a ticket, everyone on the team can see the ticket as a thread, and we can put private notes on the thread for context. Keeping manages the email exchange with the "customer" so that anyone can take care of responding to the inquiry, while the people assigned to the email have it show up as "open" in their personal list. When a thread is taken care of, the ticket is "closed" and the thread is archived. Threads that are still not finished stay "open" until someone closes them. If someone replies to a "closed" thread, it is reopened.

Keeping is a Chrome and Gmail plug-in and is a bit limited. We recently started using it, and I think I like it, though some of us use a desktop mail client, which limits the features you can access, such as assigning or closing tickets. Keeping also has a bit of a delay processing requests, which is annoying when we're triaging quickly. Keeping can also be redundant with Trello, so I'm not positive it's worth it. But for now, we're using it and giving it a chance to settle into our process.

You can book me

I've found that 15-minute office hours are an effective (but tiring) way of having short, intense, but often important meetings. I use a service that lets me take a block of time in my calendar and let people sign up for 15-minute slots of it via the website, using a form that I design. It automatically puts the meeting in my Google calendar, sends me an email, and tracks cancellations and other updates.


I have a number of people who are good at editing documents ranging from email to essays and letters. I use Google docs and have people who are much better than me copy edit my writing when it is important.

21 Dec 21:46

Four short links: 18 December 2017

by Nat Torkington

Crazy sci-fi CRISPR in nature

Support, Mathematics Magazine, CRISPR Drives, and Science Journal For Kids

  1. Needs Your Support -- keep the Wayback Machine and ongoing archiving work going.
  2. Chalk Dust -- a magazine for the mathematically curious.
  3. New Model Warns About CRISPR Drives in the Wild -- could be used to create so-called gene drives to eliminate or control unwanted species. Experts debating the wisdom or perils of that approach have often reached very different answers. A new paper being published today makes a case that caution is warranted. Using computer modeling, it suggests that in at least some forms, the drives may be more invasive than previously suggested. The paper is harder going than this article about it. The conclusion is stark: Contrary to the National Academies report on gene drive, our results suggest that standard drive systems should not be developed nor field-tested in regions harboring the host organism.
  4. Frontiers for Young Minds -- a non-profit scientific journal written by scientists and peer-reviewed by kids. For the young nerds in your life.

Continue reading Four short links: 18 December 2017.

02 Jun 08:58

xkcd: Algorithms are hard

by Nathan Yau

In another version of this strip, the guy with the laptop takes on a completely *different* problem, solves that instead, and declares victory! (Even though he didn't solve the problem. And yes, it's really often a guy.)

Yeah, but what if you combine and overlay all these datasets? [xkcd]

Tags: algorithm, xkcd

02 Jun 08:56

Criticism vs. Creation

by Nathan Yau

It's nonsense that only making things counts. Nothing that was ever made was worth anything at all without an audience much larger than its circle of creators.

Filmmaker Kevin Smith talks about making things versus critiquing them. He’s talking about movies, but you can so easily plug in visualization. I just kept nodding yes. [via swissmiss]

Tags: criticism

02 Jun 08:41

Eric Schmidt publicly defends Jared Kushner. Next day, Trump shutting DoL division investigating Google

by Sarah Lacy


Remember back in February, when we were coming off the high of the women’s march and the travel ban had just been proposed and all of Silicon Valley’s rage was concentrated?

Protesting was the new brunch, and employees forced-- forced-- Silicon Valley leaders to grow a spine. So much so that CEOs of multi-billion dollar companies were actually saying in despair, “WHAT DO YOU WANT US TO DO?”

Sure, there’s still a lot of Trump outrage, but it seems Silicon Valley workers have mostly just gotten back to doing their jobs. (Save that one dude who’s spent all the money on those Elon Musk billboards.)

Well, here’s a strange coincidence someone close to Google alerted me to earlier this week. One that all those protesting, woke Google employees from back in February may want to ask their company to explain...

02 Jun 08:41

Pinboard acquires

by Andy Baio

Excellent trolling - and finally one can believe that the link archive won't just sort of go away.

“Do not attempt to compete with Pinboard.”

02 Jun 08:39

Berlin police spare Arab neighborhoods during Ramadan, no parking tickets

by mreast_dk

When you forget that this site is made by refugee-haters

  This week the Muslim fasting month of Ramadan begins. As in previous years, Berlin's police leadership has issued a circular to all rank-and-file officers with special instructions. By Michael Skovgaard. Germany: According to Berliner Zeitung, all officers in Berlin are urged to show particular leniency when carrying out routine police tasks during Ramadan, since many practicing Muslims
28 Aug 10:46

Bulgarian prime minister: Drugs are the migrants' new tactic

by mreast_dk

"De vil hellere være fængslede i Bulgarien end sendes hjem" - er enten europæisk ønsketænkning om hvor meget federe der er her i Europa, og derfor utiltalende, eller så hjerteskærende at man ikke kan sætte sig ind i at bulgarerne snakker om det som *deres* problem

Bulgaria: Bulgarian prime minister Bojko Borisov revealed on Friday a new tactic that migrants use to stay in Bulgaria and avoid being sent back to Turkey, the British newspaper Daily Express writes. According to Borisov, larger groups of migrants pick out one person who is in possession of drugs. The moment that person is arrested, other members
22 Aug 20:41

Cocktails & Dreams

by Jason Kottke


Mike Upchurch was a writer for Mr. Show and MADtv but now he's making these clever little videos with additional actors spliced into the narratives of Cocktail (the Tom Cruise movie) and the Dragnet TV series.

Both feature actor/comedian Chris Fairbanks in the lead role and are noted as "proof-of-concepts" for a series called Electric Television that Upchurch is presumably developing. Someone should greenlight it. (via @dunstan)

Tags: Chris Fairbanks   Cocktail   Dragnet   Mike Upchurch   movies   remix   TV
10 Aug 15:15

Desktop Ensemble



more from Soft Object  
25 Jul 18:57

Four short links: 25 July 2016

by Nat Torkington

Link two is cool

Game Theory, Face Recognition, Android Augmentation, and Closed Health Data

  1. Game Theory is Really Counterintuitive -- a fun collection of brain-bending research findings in game theory/economics.
  2. Modern Face Recognition -- the different steps and what they recognize. Readable!
  3. Andromium -- turn your Android phone into your laptop.
  4. Stop the Privatization of Health Data (Nature) -- We believe that closed-data and closed-algorithm business models in health—at scale—will hamper scientific progress by blocking the discovery of diverse ways to examine and interpret health data.

Continue reading Four short links: 25 July 2016.

18 Jul 13:18

Up To 1.4 Million IT Services And BPO Jobs Will Likely Disappear Thanks To Artificial Intelligence


Note the .in - the West has already lost many of the jobs, but now they are being 'brought home' to server farms... - Automation, artificial intelligence, or other forms of “digital labour” that can perform low to high skill jobs could eliminate up to 1.4 million jobs, or nine per cent of the global IT services an...

Tweeted by @WiseTribe
18 Jul 13:17

Bot thots

by Jon Bruner

The world of conversational interfaces is very young. Here are some early questions that it’s working out.

Bots have become hot, fast. Their rise—fueled by advances in artificial intelligence, consumer comfort with chat interfaces, and a stagnating mobile app ecosystem—has been a bright spot in an otherwise darkening venture-capital environment.

I’ve been speaking with a lot of bot creators—most recently at a conference called Botness that took place in San Francisco at the beginning of June—and have noticed that a handful of questions appear frequently. On closer inspection, bots seem a little less radical and a lot more feasible.

Text isn’t the final form

The first generation of bots has been text most of the way down. That’s led to some skepticism: you mean I’ll have to choose between 10 hotels by reading down a list in Facebook Messenger?! But bot thinkers are already moving toward a more nuanced model in which different parts of a transaction are handled in text and in graphical interfaces.

Conversational interfaces can be good for discovering intent: a bot that can offer any coherent response to “find a cool hotel near Google’s HQ” will be valuable, saving its users one search to find the location of Google’s headquarters, another search for hotels nearby, and some amount of filtering to find hotels that are “cool.”

But, conversational interfaces are bad at presenting dense information in ways that are easy for human users to sort through. Suppose that hotel bot turns up a list of finalists and asks you to choose: that’s handled much more effectively in a more traditional-looking web interface, where information can be conveyed richly.

Conversational interfaces are also bad at replacing most kinds of web forms, like the pizza-ordering bot that has ironically become an icon of the field. Better to discern intent (“I want a pizza fast”) and then kick the user to a traditional web form, perhaps one that’s already pre-filled with some information gleaned from the conversational process.

A few people have pointed out that one of WeChat’s killer features is that every business has its phone number listed on its profile; once a transaction becomes too complex for messaging, the customer falls back on a phone call. In the U.S., that fallback is likely to be a GUI, to which you’ll be bounced if your transaction gets to a point where messaging isn’t the best medium.
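The pattern being discussed - use conversation only to discern intent, then hand the user off to a pre-filled graphical form - can be sketched in a few lines. The keyword matching and URLs below are invented for illustration; a real bot would use an NLP intent classifier rather than substring checks:

```python
# Toy illustration of the "discern intent, then fall back to a GUI" pattern.
from urllib.parse import urlencode

# Hypothetical mapping from intents to the web forms that finish the job.
INTENT_FORMS = {
    "pizza": "https://example.com/order-pizza",
    "hotel": "https://example.com/find-hotel",
}

def handle_message(text):
    """Return a bot reply: route recognized intents to a pre-filled form."""
    lowered = text.lower()
    for keyword, form_url in INTENT_FORMS.items():
        if keyword in lowered:
            # pre-fill the form with whatever the conversation already gleaned
            query = urlencode({"q": text})
            return f"Let's finish this in a form: {form_url}?{query}"
    return "Sorry, I didn't catch that. What do you need?"
```

The point of the sketch is the handoff, not the matching: the conversational layer's only job is to figure out which form to send you to, and with what fields already filled in.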

Discovery hasn’t been solved yet

Part of the reason we’re excited about bots is that the app economy has stagnated: “the 20 most successful developers grab nearly half of all revenues on Apple’s app store,” notes The Economist. It’s hard for users to discover new apps from among the millions that already exist, and the app-installation process involves considerable friction. So, the reasoning goes, bots will be great because they offer a way to skip the stagnant app stores and offer a smoother “installation” process that’s as simple as messaging a new contact.

Of course, now we’ve got new app stores like Slack’s App Directory. Users are still likely to discover new bots the way they discover apps: by word of mouth, or by searching for a bot associated with a big brand.

The next step, then, would be to promote bots in response to expressions of intention: in its most intrusive implementation, you’d ask your coworkers on Slack if they want to get lunch, and Slack would suggest that you install the GrubHub bot. Welcome back Clippy, now able to draw from the entire Internet in order to annoy you.

That particular example is universally condemned, and anything that annoying would drive away its users immediately, but the community is looking for ways to listen for clear statements of intent and integrate bot discovery somehow, in a way that’s valuable for users and not too intrusive.

Platforms, services, commercial incentives, and transparency

Conversational platforms will have insight into what users might want at a particular moment, and they’ll be tempted to monetize these very valuable intent hooks. Monetization here will take place in a very different environment from the web-advertising environment we’re used to.

Compared to a chat bot’s output, a Google results page is an explosion of information—10 organic search results with titles and descriptions, a bunch of ads flagged as such, and prompts to modify the search by looking for images, news articles, and so on.

A search conducted through a bot is likely to return a “black box” experience: far fewer results, with less information about each. That’s especially true of voice bots—and especially, especially true of voice bots without visual interfaces, like Amazon’s Alexa.

In this much slower and more constrained search environment, users are more likely to accept the bot’s top recommendation rather than to dig through extended results (indeed, this is a feature of many bots), and there’s less room to disclose an advertising relationship.

Amazon is also an interesting example in that it’s both a bot platform and a service provider. And it has reserved the best namespace for itself; if Amazon decides to offer a ridesharing service (doubtless after noticing that ridesharing is a popular application through Alexa), it will be summoned up by saying “Alexa, call a car.” Uber will be stuck with “Alexa, tell Uber to call a car.”

Compared to other areas, like web search, the messaging-platform ecosystem is remarkably fragmented and competitive. That probably won’t last long, though, as messaging becomes a bigger part of communication and personal networks tend to pull users onto consolidated platforms.

How important is flawless natural language processing?

Discovery of functionality within bots is the other big discovery challenge, and one that’s also being addressed by interfaces that blend conversational and graphical approaches.

Completely natural language was a dead end in search engines—just ask Jeeves. It turned out that, presented with a service that provided enough value, ordinary users were willing to adapt their language. We switch between different grammars and styles all the time, whether we’re communicating with a computer or with other people. “Would you like to grab lunch?” in speech flows seamlessly into “best burrito downtown sf cheap” in a search bar to “getting lunch w pete, brb” in an IM exchange.

The first killer bot may not need sophisticated NLP in order to take off, but it still faces the challenge of educating its users about its input affordances. A blank input box and blinking cursor are hard to overcome in an era of short attention spans.

Siri used a little bit of humor, combined with a massive community of obsessed Apple fans bent on discovering all of its quirks, to publicize its abilities. Most bots don’t have the latter, and the former is difficult to execute without Apple’s resources. Even with the advantages of size and visibility, Apple still hasn’t managed to get the bulk of its users to move beyond Siri’s simplest tasks, like setting alarms.

(Developers should give a great deal of thought to why alarm-setting is such a compelling use case for Siri: saying “set an alarm for 7:30” slices through several layers of menus and dialogues, and it’s a natural phrase that’s easily parsed into input data for the alarm app. Contrast that with the pizza-ordering use case, where you’re prompted for the type of pizza you want, prompted again for your address, prompted again for your phone number, etc.,—far more separate prompts than you’d encounter in an ordinary pizza-ordering web form.)
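The alarm example is worth making concrete: the phrase maps cleanly onto the alarm app's two inputs with a single pattern, with no follow-up prompts needed. A toy parser (the regex is mine, not Siri's):

```python
# Toy parser for the "set an alarm for 7:30" use case discussed above.
import re

ALARM_RE = re.compile(r"set an alarm for (\d{1,2})(?::(\d{2}))?\s*(am|pm)?")

def parse_alarm(utterance):
    """Return (hour, minute) in 24-hour time, or None if no match."""
    m = ALARM_RE.search(utterance.lower())
    if not m:
        return None
    hour = int(m.group(1))
    minute = int(m.group(2) or 0)   # "for 8 pm" means on the hour
    if m.group(3) == "pm" and hour < 12:
        hour += 12
    return (hour, minute)
```

Compare this one-shot extraction with the pizza bot, where no single utterance carries the topping list, address, and phone number, so the bot has to fall back on a long chain of prompts.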

Another challenge: overcoming early features that didn’t work well. We’ve all gotten used to web software that starts out buggy and improves over time. But we tend not to notice constant improvement in the same way on bots’ sparse interfaces, and we’re unwilling to return to tasks that have failed before—especially if, as bots tend to do, they failed after a long and frustrating exchange.

What should we call them?

There’s not much confusion: people working on bots generally call them bots. The field is young, though, and I wonder if the name will stick. Bots usually have negative connotations: spambots, Twitter bots, “are you a bot?”, and botnets, to name a few.

“Agent” might be a better option: an agent represents you, whereas we tend to think of a bot as representing some sinister other. Plus, secret agents and Hollywood agents are cool.

Continue reading Bot thots.

16 Jul 08:49

Pandora's Box: 10 super scary things about the future

by Jason Kottke

George Dvorsky at Gizmodo highlights 10 Predictions About the Future That Should Scare the Hell Out of You. My, uh, favorites are:

1. Virtually anyone will be able to create their own pandemic
5. Robots will find it easy to manipulate us
7. The antibiotic era will end
8. Getting robots to kill humans will be disturbingly routine -- and dangerous

From the manipulating robots section:

"Human empathy is both one of our paramount gifts and among our biggest weaknesses," Brin told Gizmodo. "For at least a million years, we've developed skills at lie-detection...[but] no liars ever had the training that these new [Human-Interaction Empathetic Robots] will get, learning via feedback from hundreds, then thousands, then millions of human exchanges around the world, adjusting their simulated voices and facial expressions and specific wordings, till the only folks able to resist will be sociopaths -- and they have plenty of chinks in their armor, as well."

Many of the things on the list seem to have a similar potential for mischief as the discovery of nuclear fission chain reactions in the 1930s. On the other hand, humans have at least temporarily turned that possible civilization-ending technology into a major source of clean energy and 75+ years of world peace (relatively speaking) so maybe there's some room for optimism here? Maybe? Hello?

Tags: George Dvorsky   lists
16 Jul 08:45

The Futuristic Department Store Robot of 1924 Was Surprisingly Lifelike

by Matt Novak

If you go to department stores in Japan you’ll sometimes be greeted by a friendly robot. Maybe one that looks like this:


16 Jul 08:40

Kendt ungarsk forfatter er død

by mreast_dk

Aw, Harmonia Cælestis is really good

Hungary: One of Hungary's best-known authors, Péter Esterházy, has died, Hungarian media write. Esterházy was 66. Péter Esterházy is one of the best-known Hungarian authors and has received several international prizes for his work. Several of Péter Esterházy's works have been translated into Danish, including Hjertets Hjælpeverber and Harmonia cælestis, which depicts the
16 Jul 08:38

Four short links: 15 July 2016

by Nat Torkington

Cybernetics History, Cybersecurity Report, Ranch Robots, and Planned Obsolescence

  1. Digging into the Archaeology of the Future -- review of Rise of the Machines, a history of cybernetics. See also The Future as a Way of Life.
  2. Royal Society's Cybersecurity Research Report (PDF) -- RECOMMENDATION 1: Governments must commit to preserving the robustness of encryption, including end-to-end encryption, and promoting its widespread use.
  3. Swagbot to Herd Cattle on Australian Ranches (IEEE) -- researchers from the Australian Centre for Field Robotics at the University of Sydney, led by Dr. Salah Sukkarieh, have designed and tested an all-terrain robot called SwagBot that’s designed to be able to drive over almost anything while helping humans manage their ranchland.
  4. The LED Quandary: Why There's No Such Thing as Built to Last (New Yorker) -- building bulbs to last turns out to pose a vexing problem: no one seems to have a sound business model for such a product.

Continue reading Four short links: 15 July 2016.

16 Jul 08:35

The legs of New York

by Jason Kottke

NY Legs

NY Legs

NY Legs

NY Legs

Stacey Baker, who is a photo editor at the NY Times, spends some of her leisure time photographing the legs of women on the streets of NYC. Her Instagram account has 78K+ followers and now she's turned the project into a book: New York Legs.

NY Legs Cover

Great cover.

Tags: books   New York Legs   NYC   photography   Stacey Baker
16 Jul 08:32

Revealed in court: 100% cast iron evidence of how Uber lies to secretly investigate and smear its critics

by Paul Bradley Carr

I've been meaning to write about this all week, but the weirdness of Brogan BamBrogan got in the way.

A week or so ago, a judge ordered the release of documents that show beyond all reasonable doubt that Uber hired a CIA-linked private investigation firm to investigate the personal and professional life of Portland attorney Andrew Schmidt and his client, Spencer Meyer. Meyer had recently filed a lawsuit against Uber and Kalanick...

09 Jul 11:53

Hear the epic live set Skinnerbox played at Fusion Festival

by Peter Kirn

We have the technology. We have the capability to play live sets on mainstages. And for a brilliant example of that, look no further than the frenetic, exquisitely hyperactive acid performances of Skinnerbox. Their set at Fusion Festival from this weekend demonstrates that you can command massive mobs of dance lovers outdoors with live sets, too. And maybe you thought such things were confined to chin-scratching handfuls of nerds.

Skinnerbox is the Berlin-based duo of Iftah Gabbai and Olaf Hilgenfeld, who join together to make sample-laden live performances mixing acid techno spiced up with grooves. Last week, they dazzled the outdoor throngs at Fusion’s legendary Turmbühne, the Mad Max-styled open air megaplex.

Fusion Festival’s organizers actually explicitly discourage documentation. The event, a kind of extended after-hours open air sprawling over a Soviet airfield, is best remembered like a dream anyway. But I think it is important to share the musical artists from that event. They span seemingly endless stages, from enormous open-air arenas with set pieces and special effects to intimate tents and club-like indoor spaces.

And it’s important in particular to appreciate what happens when live sets do hit the bigger stages, which even at Fusion are awash with mostly CDJ sets. Live performance of dance tracks continues to be a comparative minority. And on big stages, the throngs may not know what is producing what they’re hearing (being occupied instead with dancing and partying, natch). So spreading that information separately is a reasonable solution.

Indeed, the possibilities of live music are so poorly known that Skinnerbox have sprawled a notice across their SoundCloud banner explaining there are no track IDs, because they’re not playing tracks. (I actually hear this confusion a lot with live tracks.)

Here’s what the whole set sounds like:

I talked to Iftah a bit about playing. The rig:

Olaf on Minimoog model D
Ableton Live with effects for the Minimoog
Iftah on his homemade setup – an Arduino-based controller, two monomes, custom Max for Live patches for sequencing and sample slicing (quite a lot of live sample manipulation going on).

Iftah notes the inspiration of Brian Crabtree’s monome patches, namely mlr and mythologic.

Skinnerbox aren’t just championing live performance in their shows; they’re also sharing tools for it. Their 2009 sbx 2049 drum machine was one of the first collaborations between Ableton and artists. In 2014 they released the Time & Timbre drum machine, which I think remains one of the best examples of how a computer drum machine can aid live performance and generate ideas. Even with so many Max for Live creations out there, this is one you should definitely try.

Speaking of Time & Timbre, they recently showed how it can be combined with analog modular via the CV LFO now included in 2.0 (have to cover all of this in more detail later):

For more background on their live sets, here’s a session recorded at the pool, with monome meeting Minimoog:

the bali sessions from skinnerbox promo videos on Vimeo.

And from 2015, Skinnerbox (they’ve been Fusion regulars):

skinnerbox live fusion festival 2015 from skinnerbox promo videos on Vimeo.

And last summer’s Plötzlich am Meer:

They’re even crazy enough to play live … for twelve hours.

And yes, I love the monologue in the 2016 Fusion Festival set, which seemed to have a welcome message for attendees (cue to about 45:00): “Happiness the brand is not happiness … Smile at a stranger and mean it; lose your s***”

Finally, if you want to vicariously live Fusion more (or relive it), the fine folks of German-language blog Berlin ist Techno have put together a playlist with all the sets they’ve found uploaded so far:

Now… back to plotting my next live set. And… sleeping after Fusion.

The post Hear the epic live set Skinnerbox played at Fusion Festival appeared first on CDM Create Digital Music.

09 Jul 11:32

Four short links: 7 July 2016

by Nat Torkington

Building Blocks, Mental Models, Pose Data Set, and Parsing Data Formats

  1. Digital Reality -- EDGE conversation with Neil Gershenfeld. It's all top-shelf thinking. There are 20 amino acids. With those 20 amino acids, you make the motors in the molecular muscles in my arm, you make the light sensors in my eye, you make my neural synapses. The way that works is the 20 amino acids don't encode light sensors, or motors. They’re very basic properties like hydrophobic or hydrophilic. With those 20 properties you can make you. In the same sense, digitizing fabrication in the deep sense means that with about 20 building blocks—conducting, insulating, semiconducting, magnetic, dielectric—you can assemble them to create modern technology.
  2. Mental Models I Find Repeatedly Useful -- as Maciej said: Gabriel Weinberg has published a Dictionary of Received Ideas for our time and place. This is modern nerdthink.
  3. MPII Human Pose Data Set -- around 25K images containing more than 40K people with annotated body joints. The images were systematically collected using an established taxonomy of every day human activities. Overall, the data set covers 410 human activities and each image is provided with an activity label. Each image was extracted from a YouTube video and provided with preceding and following un-annotated frames. In addition, for the test set we obtained richer annotations, including body part occlusions and 3D torso and head orientations.
  4. NAIL (PDF) -- A practical tool for parsing and generating data formats.

Continue reading Four short links: 7 July 2016.

09 Jul 11:30

I Contain Multitudes

by Jason Kottke

I Contain Multitudes

Crackerjack science writer Ed Yong is coming out with his very first book in a month's time. It's called I Contain Multitudes (good title!) and is about "astonishing partnerships between animals and microbes".

Every animal, whether human, squid, or wasp, is home to millions of bacteria and other microbes. Ed Yong, whose humor is as evident as his erudition, prompts us to look at ourselves and our animal companions in a new light-less as individuals and more as the interconnected, interdependent multitudes we assuredly are.

The microbes in our bodies are part of our immune systems and protect us from disease. In the deep oceans, mysterious creatures without mouths or guts depend on microbes for all their energy. Bacteria provide squid with invisibility cloaks, help beetles to bring down forests, and allow worms to cause diseases that afflict millions of people.

I will read anything described as "like a David Attenborough series shot through a really good microscope".

Tags: books   Ed Yong   I Contain Multitudes   science
09 Jul 11:29

Four short links: 8 July 2016

by Nat Torkington

Farm Data, AI Mortality, Great Visualization, and Doing Social

  1. The Land Grab for Farm Data -- It’s time we put farmer data rights up front, in clear language that establishes who owns the data.
  2. Death and Suicide in Artificial Intelligence (PDF) -- A technical subtlety of AIXI is that it is defined using a mixture over semimeasures that need not sum to 1, rather than over proper probability measures. In this work, we argue that the shortfall of a semimeasure can naturally be interpreted as the agent’s estimate of the probability of its death.
  3. 12 Complex Concepts Made Easier Through Great Data Visualization -- they're wonderful, but I wonder how many will still be alive and working five years from now.
  4. Things I Learned Working at Serial -- incredible run-down of how the social media manager for Serial approached her job.

Continue reading Four short links: 8 July 2016.

09 Jul 11:28

Artificial Intelligence’s White Guy Problem - ACCORDING to some prominent voices in the tech world, artificial intelligence presents a looming existential threat to humanity: Warnings by luminaries like Elon Musk and Nick Bostrom about “the si...

Tweeted by @caseymegan
09 Jul 11:28

Artificial intelligence and blockchain tech ‘could change our world'


Use all the buzzwords - Artificial intelligence – fuelled by big data – and blockchain tech are some of the technologies that could impact our world, according to Sarwant Singh, senior partner at Frost & Sullivan. Speakin...

Tweeted by @DBaker007