tartley
Shared posts
Entity Attestation Token White Paper
Grok-1 code and model weights release
xAI have released their Grok-1 model under an Apache 2 license (for both weights and code). It's distributed as a 318.24G torrent file and likely requires 320GB of VRAM to run, so needs some very hefty hardware.
The accompanying blog post (via link) says "Trained from scratch by xAI using a custom training stack on top of JAX and Rust in October 2023", and describes it as a "314B parameter Mixture-of-Experts model with 25% of the weights active on a given token".
There is very little information on what it was actually trained on; all we know is that it was "a large amount of text data, not fine-tuned for any particular task".
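As a rough sanity check on those figures, here is a minimal Python sketch that uses only the numbers quoted above. It assumes the torrent is almost entirely weights and that "G" means decimal gigabytes; neither is confirmed by the post, and the helper name is made up for illustration.

```python
# Back-of-the-envelope numbers for Grok-1, using only figures quoted in the post.
TOTAL_PARAMS = 314e9      # "314B parameter" model
ACTIVE_FRACTION = 0.25    # "25% of the weights active on a given token"
TORRENT_GB = 318.24       # torrent size; ASSUMED to be decimal GB, nearly all weights

# Parameters that take part in computing any single token.
active_params = TOTAL_PARAMS * ACTIVE_FRACTION

# Implied storage cost per parameter under the assumptions above.
bytes_per_param = TORRENT_GB * 1e9 / TOTAL_PARAMS


def weights_gb(params, bytes_each):
    """Size in decimal GB of `params` weights stored at `bytes_each` bytes apiece."""
    return params * bytes_each / 1e9


print(f"active parameters per token: {active_params / 1e9:.1f}B")
print(f"implied bytes per parameter: {bytes_per_param:.2f}")
print(f"same weights at 2 bytes each: {weights_gb(TOTAL_PARAMS, 2):.0f} GB")
```

Under those assumptions the download works out to roughly one byte per parameter, which is consistent with the estimate of about 320GB of VRAM. All of the weights have to be resident even though only a quarter are used for any one token.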
Announcing the 2023 PSF Board Election Results!
It was a really lively and engaged election process for the PSF Board
this year! First of all, we want to thank everyone who ran and was
willing to serve on the PSF Board. Even if you were not elected, we
appreciate all the time and effort you put into thinking about how to
make the PSF better and how to represent the parts of the community that
you participate in. We hope that you will continue to think about these
issues and share your ideas.
Congratulations to our five new Board members-elect!
- Cheuk Ting Ho
- Denny Perez
- Georgi Ker
- Christopher Neugebauer
- KwonHan Bae *
We’ll be in touch with all the elected candidates next week to schedule
onboarding.
* The fifth person is being invited to serve for a year to
fill the off-cycle vacancy left by Joannah Nanjekye, who stepped down
from the Board.
I’d like to take this opportunity to thank our outgoing board members
for their outstanding service: Dustin Ingram, Joannah Nanjekye,
Jeff Triplett, Thomas Wouters and Nina Zakharenko.
They served on the PSF Board through a particularly eventful time,
helping us to navigate the global pandemic, rework PyCon US into a
virtual event and hire a new Executive Director. Thank you for
supporting the PSF and the Python community through so much change!
Our heartfelt thanks go out to each of you who took the time to review the
candidates and submit your votes. Your participation helps the PSF
represent our community. We received 621 total votes, which easily
reached quorum–1/3 of affirmed voting members (877). We’re especially
grateful for your patience with navigating the changes to the voting
process, which ultimately allowed for a valid election and a more
sustainable elections system.
I also want to thank everyone who helped promote this year’s board election, especially Python Community News
who took the initiative to cover this year’s election and produced
informational videos from each candidate. I also want to highlight the
PSF staff members who made some changes to our membership management on
the back-end this year, enabling us to affirm voting intention for the
first time ever and setting up OpaVote. Thanks to Ee Durbin and Joe
Carey!
Finally, it might feel a little early to mention this, but
we will have four seats open again next year. If you're interested in
running or learning more, we encourage you to reach out to a current
board member or two this year and ask them about serving.
Operating Behind the Power Curve
I promise that this blog is about software. So bear with me for a bit.
What happens when you increase the throttle in an airplane? You go faster, right? More power to the engine means more thrust which means more speed.
Most of the time this is true; but there’s a different mode you can get the aircraft into which reverses this relationship. It’s called: the region of reversed command.
Remember the four forces that govern flight: gravity, lift, thrust, and drag. Lift opposes gravity, and thrust opposes drag. An airplane, in straight and level flight, balances all these forces perfectly.
Thrust, of course, comes from the engine. Lift comes from the air flowing around the wings, and drag…? Well, drag comes from two different sources.
Parasitic drag is simply the cost of plowing through the air. It is the air resisting the movement of the plane. But there’s another kind of drag called induced drag.
Induced drag is caused by the pilot. It happens when the pilot raises the nose of the aircraft. Raising the nose changes the angle of the wings, causing the lift vector, which is always perpendicular to the wings, to point a bit backwards, thereby opposing the forward motion of the aircraft.
If you raise the nose just a little, the induced drag is small, and so increased thrust still causes increased speed. But if you raise the nose a lot, then the induced drag can cancel out the thrust. This is called getting behind the power curve.

The graph[1] shows an airplane in straight and level flight. To the right, you can see that the speed and power have a positive relationship. The more power, the more speed. But to the left, in the region of reversed command, it takes more and more power to go slower and slower.
In other words, the pilot has the nose so high that the thrust vector is being defeated by the backwards pointing lift vector; and the plane is kind of mushing through the air on raw power, barely making any headway.
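The shape of that curve can be sketched numerically. The Python below uses the common textbook simplification in which parasitic drag grows with the square of speed and induced drag falls with the square of speed; the coefficients `a` and `b` are arbitrary illustrative values, not data for any real aircraft, and the function names are made up for this sketch.

```python
def power_required(v, a=1.0, b=3.0):
    """Power needed for level flight at speed v in a simplified drag model.

    Parasitic drag ~ a * v**2, induced drag ~ b / v**2, and power = drag * v.
    """
    return a * v**3 + b / v


def min_power_speed(a=1.0, b=3.0):
    # Setting d/dv (a*v**3 + b/v) = 3*a*v**2 - b/v**2 to zero
    # gives v = (b / (3*a)) ** 0.25, the bottom of the power curve.
    return (b / (3 * a)) ** 0.25


v_min = min_power_speed()
for v in (0.5, 0.75, v_min, 1.5, 2.0):
    if v < v_min:
        region = "reversed command: slower costs more power"
    elif v > v_min:
        region = "normal command: faster costs more power"
    else:
        region = "bottom of the power curve"
    print(f"speed={v:.2f}  power={power_required(v):.2f}  {region}")
```

Below the minimum-power speed (the point the post calls the inflection point) the required power climbs as the speed drops, which is the region of reversed command.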
Can you see where I’m going with this?
Except during the final moments of landing, pilots don’t usually operate their aircraft behind the power curve. It’s a bit dangerous back there. The slower you go, the more power you need. If you go slow enough, you’ll max out your power and descend with the stall warning screaming in your ears. So pilots stay in front of the power curve by watching their airspeed, and keeping it above the inflection point.
So what does this have to do with software? (As if you haven’t already guessed.)
Too many software teams operate behind the power curve all the time. Rotten code is induced drag. These teams have created so much induced drag that it takes a huge effort to make any forward progress. The team mushes forward at full power, barely making any headway. Indeed, many teams have maxed out their power and are in a slow uncontrollable descent.
How does a pilot get out from behind the power curve? By lowering the nose. This brings the lift vector to vertical allowing the thrust vector to dominate, and the plane screams off into the wild blue yonder.
How does a software team get out from behind the power curve? By lowering their noses, cleaning up the messes, and reducing the induced drag. With that drag gone, and all the power they have, the wild blue yonder is theirs to explore.
Wouldn’t it be great if we could invent an airspeed indicator and a stall warning horn for software teams? Oh, yeah, we did! It’s called the velocity chart. Good Agile teams operate in front of the power curve, because the velocity chart allows them to see their speed, and keep it in front of the inflection point. When the velocity starts going down, good agile teams increase their refactoring to eliminate the induced drag.
Startup culture in the U.S. believes in operating behind the power curve. That’s where they think they want to be. They are so focussed on fast progress, and so convinced that high quality means low speed, that they abandon discipline and principles for the sake of the goal. This is a tragedy.
They start out believing that power and speed are related without paying any attention to drag. So they haul back on the yoke, put their noses into the sky, ram the throttle forward, and then burn fuel madly while going nowhere in a hurry. They don’t understand that when you make a mess, you induce drag, and you cancel out your power.
[1] https://i2.wp.com/aviationglossary.com/wp-content/uploads/2015/08/region-of-reversed-command.png?ssl=1
The inevitability of ad-blocking
As I work in the content industry I’ve always felt bad about installing ad-blocking software. I’ve always felt that adverts were part of the deal of having free content.
Recently I have started to use them in some of my browser sessions and the reason is almost purely technical: adverts were wrecking my power consumption and hogging my CPU.
The issue is naturally acute on smartphones, which is why Apple is starting to allow ad-blocking on iOS Safari, but my recent problems have actually been on laptops. I have an aging Chromebook which you might expect to have problems but I have also found that in the last six months my pretty powerful dev laptop has also been going into full-fan power drain mode, often resulting in less than two hours of battery life.
At first I thought the issue was simply that I am a total tab monster, keeping open loads of pages and referring to them while coding or researching things.
However by digging into the developer tools and the OS monitors it became apparent that just a few of my tabs were causing all these problems (swap file paging I still have to put my hands up to) and all of them were running visually innocuous ads that were taking up vast quantities of CPU and memory.
With no way of telling whether any given webpage is going to kill my computer or not, the only sane response is to not take the risk and install an ad-blocker.
Since installing them (I’ve been using uBlock) I have indeed obtained longer battery life and fewer memory crashes on my Chromebook.
While I am still worried about how we can pay for high-quality open web content in a world without ads there is no tenable future for an open web that clients cannot viably run.
In my personal web usage I prefer to pay for the services I use and rely on. For those that I’m uncertain of I’m happy to trial and therefore to be the product rather than the customer.
In these situations though I am really dealing with the web as an app delivery platform. For content production there needs to be something better than the annual fundraising drive.
Frustratingly there is also a place for ads. Without advertising, everything becomes (online) word of mouth. There’s a positive case to be made for awareness-based advertising. I want to do it myself around recruitment as part of my work.
These adverts though are really nothing more than pictures and words. They shouldn’t be things that are taxing the capabilities of your hardware.
Advertisers are bringing this change on themselves. If they can’t find a way to square their needs with those of the people they are trying to reach, then there isn’t going to be an online advertising market in nine months’ time, and that might mean some big changes to the way the web works for everyone.
In praise of fungible developers
The “fungibility” of developers is a bit of a hot topic at the moment. Fungibility means the ability to substitute one thing for another for the same effect; so money is fungible for goods in modern economies.
In software development that means taking a developer in one part of the organisation and substituting them elsewhere and not impacting the productivity of either developer involved in the exchange.
This is linked to the mythical “full-stack” developer by the emergence of different “disciplines” within web software development; usually these are devops, client-side (browser-based development) and backend development (services).
It is entirely possible for developers to enter one of these niches and spend all their time in it. In fact sub-specialisations in things like responsive CSS and single-page apps (SPA) are opening up.
Now my view has always been that a developer should always aspire to have as broad a knowledge base as possible and to be able to turn their hand to anything. I believe when you don’t really understand what is going on around your foxhole then problems occur. Ultimately we are all pushing electric pulse-waves over wires and chips and it is worth remembering that.
However my working history was pretty badly scarred by the massive wave of Indian outsourcing that happened post the year 2000 and as a consequence the move up the value-chain that all the remaining onshore developers made. Chad Fowler’s book is a pretty good summary of what happened and how people reacted to it.
For people getting specialist pay for niche work, full-stack development doesn’t contain much attraction. Management sees fungibility as a convenient way of pushing paper resources around projects and then blaming developers for not delivering. There are also some well-written defences of specialisation.
In defence of broad skills
But I still believe that we need full-stack developers and if you don’t like that title then let’s call them holistic developers.
Organisations do need fungibility. Organisations without predictable demand or who are experiencing disruption in their business methodology need to be flexible and they need to respond to situations that are unexpected.
You also need to fire drill those situations where people leave, fall ill or have a family crisis. Does the group fall apart or can it readjust and continue to deliver value? In any organisation you never know when you need to change people round at short notice.
Developers with a limited skill set are likely to make mistakes that someone with a broader set of experiences wouldn’t. It is also easier for a generalist developer to acquire specialist knowledge when needed than to broaden a specialist.
Encouraging specialism is the same as creating knowledge silos in your organisation. There are times when this might be acceptable but if you aren’t doing it in a conscious way and accompanying it with a risk assessment then it is dangerous.
Creating holistic developers
Most organisations have an absurd reward structure that massively benefits specialists rather than generalists. You can see that in iOS developer and mobile responsive web CSS salaries. The fact that someone is less capable than their colleagues means they are rewarded more. This is absurd and it needs to end.
Specialists should be treated like contractors and consultants. They have special skills but you should be codifying their knowledge and having them train their generalist colleagues. A specialist should be seen as a short-term investment in an area where you lack institutional memory and knowledge.
All software delivery organisations should practice rotation. Consider it a Chaos Monkey for your human processes.
Rotation puts things like onboarding processes to the test. It also brings new eyes to the solution and software design of the team. If something is simple, it should make sense and be simple to a newcomer, not just to someone who has been on the team for months.
Rotation applies within teams too. Don’t give functionality to the person who can deliver it the fastest, give it to the person who would struggle to deliver it. Then force the rest of the team to support that person. Make them see the weaknesses in what they’ve created.
Value generalists and go out of your way to create them.
Avram Miller says Steve Jobs has one more Apple intro
We all have friends (people we know) and friends (people we not only know but hang out with). Maybe the better contrast might be between friends and buddies. Well Avram Miller is one of my buddies. He lives down the road from me and my kids prefer his pool to ours because his is solar heated. The retired Intel VP of business development is quite a character, knows a lot of people who know people, and understands the business of technology at a level few people do. So when he wrote a post this morning predicting that Apple will clean Google’s clock in search, I sat up in my chair.
Avram’s thesis is that Steve Jobs felt betrayed by Google’s development of Android and decided years ago to go after the soft underbelly of the Googleplex by building a superior search product called Found that Apple would have no need to monetize — the Switzerland of search. Please read Avram’s post and you’ll see he claims that Steve Jobs even pre-recorded his participation in the Found launch event scheduled for sometime next year. Which of course makes me wonder what else Steve may have prerecorded?
I believe Avram. We haven’t yet discussed this directly because Avram has spent the winter in Israel but that’s what makes this post so plausible. If there’s an Israeli scientist at the heart of Found, then Avram — who has been the toast of the Tel Aviv tech scene all season — would probably have bumped into him or her.
I love the Apple side of this but what gives it real import are the Google and Facebook aspects. Facebook has pivoted deftly to mobile, Google hasn’t particularly succeeded in social networking with Google+, so Google is more vulnerable than one might think. I’m not sure Avram is right that Zuckerberg & Company are the major threat, but if Apple can out-Bing Bing without needing the ad revenue, well Steve Jobs may well get his revenge on Google after all. As I guess he will be explaining to us sometime next year.
Pocock: Android betrays tethering data
Ad-hoc data breakpoints
tartley: Another gem, thanks Ned!
A co-worker had a problem running a large test suite with nose. Modules were being imported from the wrong directory. Somehow, sys.path was having a "project/lib" directory stuffed into it, and we couldn't figure out why. (tl;dr: it was nose's fault, and we should have known about it, and it shouldn't have been doing it in the first place.)
We searched our code for "sys.path.insert" and found more of them than we liked, but none of them accounted for the modification we were seeing. What we wanted was to run the tests in a debugger, with a data breakpoint set: stop when sys.path is modified.
Unfortunately, pdb doesn't support breakpoints like that (maybe other debuggers do?). So we whipped up an ad-hoc data breakpoint:
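import pdb, sys

def trace(frame, event, arg):
    # Runs as code executes; check the condition we care about each time.
    if sys.path[0].endswith("lib"):
        # sys.path has been modified: stop here and inspect the stack.
        pdb.set_trace()
    # Returning the function keeps tracing active inside this frame.
    return trace

sys.settrace(trace)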
(Yes, it's a little irksome that there are two different spellings of "set trace" there...)
A trace function is a Python function registered with the interpreter with sys.settrace(). This function will be called for every line of Python executed. Trace functions are the basis of debuggers, profilers, and code coverage tools.
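To make that concrete, here is a small sketch that records the events a trace function receives while one tiny function runs. The names `record`, `add` and `events` are invented for this illustration; only `sys.settrace` and `sys.gettrace` come from the standard library.

```python
import sys

events = []


def record(frame, event, arg):
    # Only note events that come from the function we are interested in.
    if frame.f_code.co_name == "add":
        # Store the event name and the line offset within the function.
        offset = frame.f_lineno - frame.f_code.co_firstlineno
        events.append((event, offset))
    # Returning the function asks for line-level events in this frame too.
    return record


def add(a, b):
    total = a + b
    return total


previous = sys.gettrace()      # remember any tracer that was already installed
sys.settrace(record)
try:
    add(1, 2)
finally:
    sys.settrace(previous)     # always restore the earlier state

for event, offset in events:
    print(event, offset)
```

The function is notified when `add` is called, before each of its lines runs, and when it returns, which is why a check placed in a trace function gets a chance to run between every pair of statements.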
Here we've written a very simple one: check to see if sys.path has been modified in the way we care about, and if so, break into the debugger. To be honest, I wasn't quite sure what would happen if I tried to break into the debugger from inside a trace function, but when we ran the test suite with this code in place, it worked perfectly. We were dropped into the debugger just after nose added a "lib" directory to sys.path.
As it happens, nose tries to be helpful by adding a "src" and "lib" directory to the path, even though that's an unusual layout for Python projects. Luckily, there's a nose option to disable that bit of helpfulness, and our tests run just fine now.
If you find yourself in a similar situation, consider a simple trace function. It's an advanced technique, but you don't have to get too tricky, and it can really tell you a lot about what your program is doing.
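The same idea generalizes to any condition. The sketch below is one possible shape for a reusable version, not the code we actually ran: `watch`, `sneaky_helper`, `run` and the marker path are all hypothetical names. It records where the condition first became true instead of calling pdb.set_trace(), so it can run unattended; swap the callback for one that calls pdb.set_trace() to get an interactive stop.

```python
import sys


def watch(predicate, on_hit):
    """Call on_hit(frame) the first time predicate() is true while tracing."""
    fired = False

    def tracer(frame, event, arg):
        nonlocal fired
        if not fired and predicate():
            fired = True          # report only the first hit
            on_hit(frame)
        return tracer             # keep tracing in this frame

    sys.settrace(tracer)


hits = []
marker = "/hypothetical/project/lib"   # made-up path used only for this demo


def sneaky_helper():
    sys.path.insert(0, marker)         # the modification we want to catch
    return None


def run():
    sneaky_helper()


previous = sys.gettrace()
watch(lambda: bool(sys.path) and sys.path[0] == marker,
      lambda frame: hits.append((frame.f_code.co_name, frame.f_lineno)))
try:
    run()
finally:
    sys.settrace(previous)             # stop tracing
    sys.path.remove(marker)            # undo the demo's change to sys.path

print(hits)
```

Because the check runs before each line, the hit is reported from the frame that made the change, one event after the offending statement.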

