Shared posts

18 Apr 00:57

Wikipedia Edit War Update

A few months ago I stumbled into an edit war on Wikipedia. I noticed that Wikipedia's page on Jacy Reese was being, essentially, guarded from having any mention that he previously went by his full name. There was a pattern where someone would notice this information was missing, add it, and then it would be reverted soon after.

The main user guarding the page was Bodole, and someone pointed me yesterday to where they've been banned from editing Jacy's page for three months. The discussion there was another interesting window into how Wikipedia handles disputes, so after reading it I thought it would be interesting to review:

  • User Drmies edits the page to remove a list of articles Jacy had published ("rm linkfarm. we list books, not articles", link). Drmies is an experienced editor, making a routine cleanup.

  • User Bodole reverts the change ("Many BLP list articles. Please discuss on talk page if you think this should be an exception.", link).

  • Drmies reverts the revert ("It's the other way around. what you are doing is promoting this person by linking a set of articles. if you have secondary sources that prove these articles are worth noticing, that's a different matter", link)

  • Bodole reverts the reversion of the revert ("You are edit warring. Please stop. Discuss on the talk page if you insist. See the WP:BRD cycle", link) and puts a warning (link) on Drmies' talk page.

  • Drmies responds there with "Aw boohoo" (link).

  • Drmies reverts the reversion of the reversion of the revert ("see talk page", link) and marks the page as being subject to Wikipedia:Conflict of Interest (link). It looks to me like Drmies thinks Bodole may either be Jacy or someone closely connected with him. Drmies removes biographical information from the page ("this 'Sentience Institute' has an article--why this biography is bloated with content about some poll, verified only with links to websites, is not clear", link)

  • Discussion moves to the talk page

  • Drmies is clearly quite unhappy with apparent promotional editing ("we are not here to produce link dumps for resume-style lists of publications", "The article itself is way too fluffy anyway; it used to be a lot worse, thanks in part to edits like this one by the creator, Utsill, and this one, by Reckston. A quick look at the references show a plethora of primary links and references to non-notable outfits", "The talk page, and the edit history, indicate that a number of editors have tried to bring some order to this madness, and I thank 78.26, Melcous, Kbog, and especially AlasdairEdits for their efforts").

  • Bodole files a complaint on Administrators' noticeboard/Incidents ("Disruptive editing by User:Drmies", link)

  • The complaint does not go well for Bodole. It's interesting reading, but generally the administrators think Drmies' behavior is reasonable and Bodole's is not. They bring up that Bodole tried to remove the discussion of whether the page should contain "Anthis" from the talk page, that Bodole may be a (not allowed) alias of Utsill, who created the page, and that "Boodole appears to be a [single-purpose account], perhaps one who is here to [right great wrongs]. Of their 228 edits, it appears that the vast majority of them concern Jacy Reese/Jacy Reese Anthis in some way". The consensus is to temporarily ban Bodole from editing the 'Jacy Reese Anthis' page.

  • Bodole responds by ragequitting ("I will now sign off of Wikipedia indefinitely").

Wikipedia volunteers aren't really in a position to investigate conflicts of interest, but it does make me wonder who Bodole is and, if they're not connected with Jacy, why they would be so invested in this one article.


02 Mar 23:00

SpaceX discloses cause of Starship anomalies as it clears an FAA hurdle

02 Mar 22:51

Lazarus and the FudModule rootkit: Beyond BYOVD with an admin-to-kernel zero-day

26 Jan 01:58

Do Elephants Have Souls?

15 Jan 21:52

A periodic table of visualization methods

20 Oct 14:19

A Chunk by Any Other Name: Structured Text Splitting and Metadata-enhanced RAG

by Martin Zirulnik

Editor's note: this is a guest entry by Martin Zirulnik, who recently contributed the HTML Header Text Splitter to LangChain. For more of Martin's writing on generative AI, visit his blog.

17 Sep 20:17

Turkish man arrested over posting photo to Reddit with alcohol in mosque

11 Sep 23:32

Show HN: WhatsApp-Llama: A clone of yourself from your WhatsApp conversations

06 Sep 03:25

Show HN: Keep – GitHub Actions for your monitoring tools

01 Sep 19:31

MagicEdit: High-fidelity temporally coherent video editing

05 Jun 00:05

When digital nomads come to town

04 May 04:04

Brazilian frog might be the first pollinating amphibian known to science

23 Apr 23:29

Russia killed its tech industry

08 Feb 01:26

US DOT awards $800M toward street safety

17 Oct 01:31

Simon Peyton Jones interview

20 Mar 18:27

VESPA: Static profiling for binary optimization

What the research is:

Recent research has demonstrated that binary optimization is important for achieving peak performance for various applications. For instance, the state-of-the-art BOLT binary optimizer developed at Meta, which is part of the LLVM Compiler Project, significantly improves the performance of highly optimized binaries produced using compilers’ most aggressive optimizations, such as profile-guided and link-time optimizations.

In this research, we propose a novel approach to apply binary optimization without the need to profile the application. Our technique, called Vintage ESP Amended (VESPA), builds on top of a previous technique called evidence-based static prediction (ESP), which applies machine learning techniques to statically infer the direction of branch instructions in a program.

VESPA expands on ESP in several ways to make it useful in the context of binary optimizers. VESPA increases the scope where binary optimizers can be used, thus enhancing the range of applications that can leverage these tools to improve their performance. Our work also enables higher performance and better user experience for many software applications that were out of the reach of binary optimizers, such as end-user mobile applications.

How it works:

VESPA obtains the profile information that binary optimizers like BOLT need statically, i.e., with no need to execute the target application to produce profile data. To achieve this, VESPA employs machine learning techniques. First, during a training phase, VESPA is provided with a set of applications and corresponding dynamic profiles. Using these, VESPA trains a neural network model that learns the probability that branch instructions in the programs will be taken based on various program characteristics (e.g., the condition code of the branch or whether the target block is a loop header).

After this model is produced, it can be used to infer the probability that branches from other programs will be taken. VESPA then transforms these probabilities into code frequencies, or estimates of how often each individual piece of the program will execute, similar to the information that a binary optimizer normally requires from dynamic profiles obtained by executing an application. Once the static profile data produced by VESPA is injected into a binary optimizer, this tool can proceed with its optimization steps as usual, completely oblivious to how the profile data was computed. VESPA, therefore, can very easily be integrated into existing binary optimizers, which we demonstrated by integrating it into Meta’s BOLT binary optimizer.
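
To make the step from branch probabilities to code frequencies concrete, here is a minimal sketch in Python. It is not VESPA's or BOLT's actual algorithm: the toy control-flow graph, the block names, the probabilities, and the `block_frequencies` helper are all assumptions made up for illustration. It uses a simple textbook-style propagation in which each block's frequency is the sum of its predecessors' frequencies weighted by the predicted edge probabilities.

```python
# Hypothetical toy control-flow graph. Each block maps to a list of
# (successor, probability that control flows to that successor).
# The probabilities stand in for what a trained model might predict.
EDGE_PROBS = {
    "entry": [("loop", 1.0)],
    "loop": [("body", 0.9), ("exit", 0.1)],
    "body": [("loop", 1.0)],
    "exit": [],
}


def block_frequencies(edge_probs, entry, tol=1e-9, max_iters=10_000):
    """Estimate how often each block runs per entry into the function.

    Solves freq[b] = (1 if b is the entry else 0) + sum(freq[p] * prob(p -> b))
    by repeated substitution until the values stop changing.
    """
    freq = {block: 0.0 for block in edge_probs}
    for _ in range(max_iters):
        new = {block: 0.0 for block in edge_probs}
        new[entry] = 1.0
        for pred, succs in edge_probs.items():
            for succ, prob in succs:
                new[succ] += freq[pred] * prob
        if max(abs(new[b] - freq[b]) for b in freq) < tol:
            return new
        freq = new
    return freq


freqs = block_frequencies(EDGE_PROBS, "entry")
for block, value in freqs.items():
    print(f"{block}: {value:.2f}")
```

With a 0.9 probability of staying in the loop, the loop header comes out at roughly 10 executions per function entry, the body at roughly 9, and the exit at roughly 1. Numbers of that shape are what a binary optimizer would otherwise read from a dynamic profile.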

Compared to the seminal ESP technique that inspired our work, VESPA provides three main improvements:

  1. An enhanced neural-network model
  2. New program features to improve the model’s accuracy
  3. A technique to derive code frequencies required for binary optimizations instead of simply branch directions

Why it matters:

BOLT can provide performance speedups of about 20 percent not only for many of Meta’s widely deployed server workloads, but also for other widely used open source applications such as compilers (e.g., GCC and Clang) and database systems (e.g., MySQL and PostgreSQL). To achieve these results, BOLT relies on very accurate dynamic profile data collected from executing the target applications on representative inputs. Unfortunately, collecting these profile data adds complexity and overheads to applications’ build processes, and sometimes it is not even possible — for example, in the case of mobile applications executing on user devices. 

Using VESPA to derive static profiles for the BOLT binary optimizer, our work demonstrates that a 6 percent speedup can be achieved on top of highly optimized binaries built with Clang -O3 without the need for dynamically profiling the application. As such, our research demonstrates that binary optimizations can be beneficial even in scenarios where dynamic profiling is prohibitive or impossible, thus opening new opportunities for binary optimizers, such as end-user mobile applications.

Read the paper:

VESPA: Static profiling for binary optimization

The post VESPA: Static profiling for binary optimization appeared first on Engineering at Meta.

14 Mar 01:14

A new history of Byzantium reveals the inner workings of a late antique empire

20 Feb 21:37

Netlify Graph: A faster way for teams to develop web apps with APIs

20 Feb 21:16

Things that used to be hard and are now easy

Hello! I was talking to some friends the other day about the types of conference talks we enjoyed.

One category we came up with was “you know this thing that used to be super hard? Turns out now it’s WAY EASIER and maybe you can do it now!”.

So I asked on Twitter about programming things that used to be hard and are now easy.

Here are some of the answers I got. Not all of them are equally “easy”, but I found reading the list really fun and it gave me some ideas for things to learn. Maybe it’ll give you some ideas too.

  • SSL certificates, with Let’s Encrypt
  • Concurrency, with async/await (in several languages)
  • Centering in CSS, with flexbox/grid
  • Building fast programs, with Go
  • Image recognition, with transfer learning (someone pointed out that the joke in this XKCD doesn’t make sense anymore)
  • Building cross-platform GUIs, with Electron
  • VPNs, with Wireguard
  • Running your own code inside the Linux kernel, with eBPF
  • Cross-compilation (Go and Rust ship with cross-compilation support out of the box)
  • Configuring cloud infrastructure, with Terraform
  • Setting up a dev environment, with Docker
  • Sharing memory safely with threads, with Rust
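
As one small illustration of the async/await item in the list above, here is a minimal Python sketch using the standard library's `asyncio`. The `fetch` coroutine, its names, and its delays are made up for the example; `asyncio.sleep` stands in for a real network call.

```python
import asyncio


async def fetch(name, delay):
    # asyncio.sleep is a placeholder for real I/O such as an HTTP request.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # gather runs both coroutines concurrently and returns their
    # results in the order the coroutines were passed in.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())
print(results)
```

Both waits overlap, so the whole thing takes about as long as the slower call instead of the sum of the two.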

Things that involve hosted services:

  • CI/CD, with GitHub Actions/CircleCI/GitLab etc
  • Making useful websites by only writing frontend code, with a variety of “serverless” backend services
  • Training neural networks, with Colab
  • Deploying a website to a server, with Netlify/Heroku etc
  • Running a database, with hosted services like RDS
  • Realtime web applications, with Firebase
  • Image recognition, with hosted ML services like Teachable Machine

Things that I haven’t done myself but that sound cool:

  • Cryptography, with opinionated crypto primitives like libsodium
  • Live updates to web pages pushed by the web server, with LiveView/Hotwire
  • Embedded programming, with MicroPython
  • Building videogames, with Roblox / Unity
  • Writing code that runs on GPU in the browser (maybe with Unity?)
  • Building IDE tooling with LSP (the language server protocol)
  • Interactive theorem provers (not sure with what)
  • NLP, with HuggingFace
  • Parsing, with PEG or parser combinator libraries
  • ESP microcontrollers
  • Batch data processing, with Spark

Language specific things people mentioned:

  • Rust, with non-lexical lifetimes
  • IE support for CSS/JS

what else?

I’d love more examples of things that have become easier over the years.

12 Feb 19:39

Ask HN: I am not a competitive guy, how will it affect my career?

06 Feb 00:22

Cassowary – Run Windows Apps on Linux using a VM as if they were native apps

23 Dec 04:11

Our recent server issues

20 Nov 19:49

Facebook tells LA police to stop spying on users with fake accounts

23 Oct 12:51

Google takes two-to-four times as much as the fees charged by rival ad networks

15 Mar 01:03

Bitcoin Has Zero Intrinsic Value. Some People Are OK with That

15 Feb 16:48

Microsoft’s Big Win in Quantum Computing Was an ‘Error’ After All

28 Dec 20:59

Buttplug (Sex Toy Control Library) Hits v1 Milestone

28 Dec 18:55

If you don’t know you have it…

by Seth Godin

then you don’t. (Not yet.)

Cleaning out the fridge after a power failure, I found three half-empty containers of anchovies. Because they magically migrate to the back of the fridge, every time I had needed some, I ended up opening a new jar, because the old ones were invisible. Not just invisible if I had looked for them, but so invisible that it never even occurred to me to look for them.

And this is even more likely to happen with the data on your hard drive. If you don’t know to look for it, if you don’t believe it’s there, it might as well be deleted.

And of course, this applies to our lost skills, confidence and experience as well.

It’s worth putting in regular effort to remind ourselves of what we’ve already got and how it has served us in the past.

10 Dec 03:25

FTC Sues Facebook for Illegal Monopolization

23 Nov 16:22

How do you write simple explanations without sounding condescending?

Sumana Harihareswara wrote an interesting blog post Plain Language Choices recently, about writing about complicated topics using simple language and how it can sometimes come off as condescending.

I really like explaining complicated topics while trying to avoid unnecessary jargon, and I realized that I’ve thought a lot about how to do it well. So here are a bunch of things I try to do when I use simple language to avoid coming off as condescending.

use some jargon to give the reader search terms

Sometimes I see writing that completely avoids all jargon and instead substitutes simple language for all “jargon”-y words.

I like to include some jargon in my explanations because otherwise it’s impossible for the reader to search & learn more about the concept they’re trying to learn about.

write (mostly) true explanations

Something else I see sometimes in ELI5-type explanations is an explanation in plain language that’s not actually true in a useful way. I’m pretty sympathetic to why people do this – it’s super hard to write simple explanations that are also true!

And actually sometimes when I’m trying to write down a simple/clear explanation for a concept, I realize that I don’t actually understand the concept as well as I thought and that I’m not able to explain it. That’s okay!

I think there are a few options here:

  • try to say only things that are true (or at least which are a useful model for how the world works even if they’re not 100% true)
  • write things that are not really true / that you’re not sure of, but point out that they may not be true (“I think it works like X, but I realize now that it might be Y instead, I’m not sure!”)

only use “fun” visual elements on explanations that are actually well written & easy to understand

This happens more with visual aids than with simple language but I’ll include it anyway. Sometimes I see explanations which have “fun” elements to make them seem more approachable, where the explanation itself is still pretty unclear.

I try to be careful about this in my own work – I try to only attach “fun” elements (like a fun illustrated cover) to explanations that I’ve spent a lot of time on making really clear. Basically to me “fun” things are a signal that the content itself is really clear/accessible, and I try to not misuse that signal.

I think why’s poignant guide to ruby is a nice example of something that’s fun and clear and which has helped a lot of people learn Ruby.

Another nice example of this is: I know someone who got her master’s thesis printed as a paperback book and illustrated with some great drawings related to the topic of her thesis (trans representation in media). It’s called “I’m supposed to relate to this?”, here’s the paperback.

I ended up reading the whole thing because, in addition to having the fun illustrations, her master’s thesis was really well written and interesting! The fact that she did the work to print a paperback book of her thesis and get it illustrated was a sign that she’d worked on making the writing accessible to a non-academic audience, and it was true!

tell a relevant story

Stories can really help people learn! For example, something I’ve done a lot on this blog is talk about a problem I ran into in the course of my job and what I did to solve that problem.

Some kinds of stories that I think work well:

  • a real problem that someone ran into, to motivate why the concept is interesting / important to learn
  • something that’s happening on a computer, framed as a “story” (for example https://howdns.works/ tells a story about how DNS works. Everything in the story literally corresponds to exactly what happens when you make a DNS query)

Sometimes I see stories used to explain concepts that don’t fit into either of these and feel kind of pasted on, like they’re there to help the concept seem “fun” but don’t actually illustrate the concept or motivate why it might be useful to learn it.

have a specific audience in mind

I try to write relatively simple explanations, but when I write I also generally assume a lot of knowledge on the part of my audience.

Sometimes I see explanations of complicated concepts that start with explaining the very basics of the topic. This usually isn’t that effective: if someone is trying to understand some super technical aspect of containers, they probably understand the basics of containers already!

“Have an audience” is more of a general writing tip so I’ll leave it at that.

on using simple language as a joke for people who already understand the idea

Here’s a very fun explanation of a complicated thing using simple language: Gödel’s Second Incompleteness Theorem Explained in Words of One Syllable.

On one hand, this is fun! I enjoyed reading it. On the other hand, I think the main audience for this is probably people who already more or less understand Gödel’s Second Incompleteness Theorem.

For example, someone pointed out that “if math is not a load of bunk” in this text is code for “Peano arithmetic is consistent” (with “math” being “Peano arithmetic” and “not a load of bunk” meaning “consistent”). Which I find very charming, but I also found it a little hard to decode when reading it.

And (as we talked about before with jargon), if you know that “Peano arithmetic is consistent” is the relevant bit of jargon, you can find all kinds of fascinating things, like a blog post by John Baez from 2011 discussing an attempted proof that Peano arithmetic was inconsistent.

(I’m also reminded here of the XKCD up goer five, which is very delightful, but I don’t think I learned anything about spaceships from reading it)

that’s all!

I’d love to hear more thoughts on this – I think there are probably more ways that simple explanations can feel condescending that I’ve missed!

I really don’t think they need to feel condescending though – to me the point of writing a clear/simple explanation is usually that I think the idea is not actually fundamentally that complicated and so I’m just explaining it in a way that’s exactly as complicated as it needs to be.