Significant Bits: a semiconductor industrial crash? AI news from China and the US. Much more on the Basecamp drama

Most of this post is Basecamp stuff. Scroll down for it.

I’m going to start calling these news roundup posts Significant Bits. So, welcome to the first installment of Significant Bits!

The global chip shortage: what if it’s a slow-motion crash?

Not only is the PS5 out of stock, but now cars are being shipped to consumers incomplete because the maker can’t source the necessary chips. Appliances and many other products are increasingly being hit by the ongoing chip shortage.

This idea is a little bit out-there, and I’m just spitballing, but I kind of wonder if we’re not seeing a slow-motion supply chain crash play out in semiconductors. As in, it’s not just a shortage driven by demand, natural disaster, hoarding, and panic buying, but it also has self-reinforcing feedback loop elements to it that are gathering momentum and will be hard to reverse.

I suggest this because there’s a circularity to the chip supply chain that other complexly layered supply chains don’t have. You need chips to make chips, and you need chips to make things that make chips, and to make things that make things that make chips. And so on. It’s chips all the way down.

So if you start running out of chips, you may find yourself running out of chips because you’re running out of chips.

Again, the chip supply chain is very complex with a ton of dependencies, so I’m just spitballing based on what I know of the fact that it has many circular dependencies. I welcome any feedback on this idea from those who know more.
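To make the circularity idea concrete, here is a toy simulation. It is a sketch under made-up assumptions, not a model of the real supply chain: fabs need a fixed fraction of their own output (`upkeep_ratio`, hypothetical) to sustain capacity, outside buyers (`external_demand`, hypothetical) get served first, and fabs only get what is left over. All names and numbers are invented for illustration.

```python
def simulate_chip_supply(capacity, external_demand, upkeep_ratio, periods):
    """Toy model of a circular supply chain.

    Returns the fab capacity at the start of each period.
    Every parameter here is a hypothetical illustration value.
    """
    history = [capacity]
    for _ in range(periods):
        output = capacity                       # fabs run flat out
        needed = upkeep_ratio * capacity        # chips the fabs themselves need
        # Assumption: outside buyers are served first; fabs get the remainder.
        left_over = max(0.0, output - external_demand)
        received = min(needed, left_over)
        # Next period's capacity is whatever the received chips can sustain.
        capacity = received / upkeep_ratio
        history.append(capacity)
    return history


if __name__ == "__main__":
    # At the break-even point the system is stable...
    print(simulate_chip_supply(100, 50, 0.5, 5))
    # ...but a 5% shock below it snowballs instead of settling.
    print(simulate_chip_supply(95, 50, 0.5, 5))
```

In this toy setup, a capacity of 100 holds steady forever, while starting at 95 produces a shortfall that doubles each period (95, 90, 80, 60, 20, 0). That is the "running out of chips because you’re running out of chips" dynamic in miniature. Whether the real chip supply chain has a tipping point like this is exactly the open question.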

Cerebras’s massive chip

Chip startup Cerebras has released a ginormous AI processor that’s essentially a supercomputer on a chip. What they’re really doing here is called wafer-scale integration, which is essentially a packaging technique that lets them cram more chips into a smaller space. So when you buy one of these Cerebras monsters, you’re buying a 7X12 grid of 84 chips that are all bundled up together and interconnected.

Here’s the thing with these wafer-scale integration efforts: you still have to provide power to 84 chips, just as you would if they were integrated into a more traditional arrangement (i.e., a collection of four or eight-core chips, all ganged together on an interconnect fabric of some kind). But if you’re cramming all these chips into one package, the problem of powering them and cooling them gets a lot harder.

I’d bet you can’t get even close to enough power into the package to run all of these cores full-bore at the same time, and even if you could you’d have a heat-induced meltdown. So an 84-core package like this is going to rely on sophisticated power management and on a subset of the cores being active at any given time. This would mean less average performance per core than if they were all broken up into separate packages.

I imagine this subset of active cores maps somehow onto the sequential nature of deep learning training runs, so that the chip is naturally used in stages or waves. So it may turn out that a limitation on the number of simultaneous cores in use doesn’t matter so much in practice. Either way, I look forward to learning more about it.

Here’s a bit more technical background on a previous version of the chip, which has fewer cores.

Huawei’s ML flex

Huawei has released a paper on a large language model (LLM) with a record number of parameters (200 billion) trained on 40 billion tokens. I’ll explain what that means in a moment, but first a link to a mirror of the paper, which went offline pretty soon after it went up:

The Reddit thread on this is good and is the source of most of my info.

Now, the highlights:

  • The larger the number of parameters in the model, the more resources it takes and the better it performs. For reference, GPT-3 is a 175-billion parameter model.

  • The size of Huawei’s training corpus, 40 billion tokens, is quite small next to comparable English-language corpora, with GPT-3 having been trained on 300 billion tokens.

  • Tokens in such models are usually words, and in the case of a Chinese-language model these are characters. It may actually be the case that 40 billion tokens worth of Chinese characters is in the ballpark of 300 billion English words, in terms of content, because of the differences in the languages.

  • The tech stack this was made on is entirely Chinese, down to the Huawei-designed (but TSMC-produced) silicon. But as of late last year, TSMC is barred by the US from providing silicon to Huawei, so this model is probably as good as it gets for them for now.

  • Despite the large number of parameters, China is still well behind the US in ML, in terms of algorithmic sophistication and general capability.

More color on the Basecamp drama

Before I link this up, I want to apologize to readers — I’m one of the few accounts Basecamp co-founder David Heinemeier Hansson (DHH) follows on Twitter, and I’m sure I could have reported this out myself by talking to him if that fact had occurred to me yesterday. I’m so used to talking to AI/ML people lately that I forget to pick up the phone on non-AI/ML stories.

Casey Newton actually picked up the phone and got the details, and while he’s clearly hostile to DHH (I guess Basecamp isn’t big enough to get the deference he gives to Google, Facebook, etc.), if you strip out his spin you can see that the backstory is as insane as you’d expect.

Here’s what happened:

  • Back in 2009, some Basecamp support dudes started keeping a list of funny customer names. (Do I even need to type the rest of this list? Because surely you know exactly where this is going already.)

  • At some point years later, the dudes behind the list left, and it quit circulating because people started to get uncomfortable with such humor as the political environment changed.

  • DHH investigated the list, apologized for it internally (he called it a “systemic failure” at Basecamp), and wanted to move on.

  • Some super woke people in the org did not want to move on, did not think his apology went far enough and started talking up how the kind of attitude represented by the list is the first step on the dark road to genocide.

  • DHH, apparently not yet being fully in the grip of brainworms, responded that linking this juvenile 2009 list of names to genocide was “catastrophizing.” He urged everyone to dial the rhetoric way back and to drop the issue.

  • But the people agitating for “a reckoning” with this list still didn’t want to drop it, and they certainly didn’t want to be tone policed. They felt harmed by his insistence that introducing genocide into the discussion was way over-the-top, so they reported DHH to his own HR department.

  • DHH and his co-founder Jason Fried decided that at this point they’d had enough, and they dropped the hammer on all the woke activity by announcing a bunch of internal policy changes.

In the wake of this, employees are mad, and I’m sure quite a few will leave. One of them has now penned a pretty over-the-top missive about how unwoke the company is, essentially daring them to fire her.

I’m going to give some personal context on this Basecamp memo that I did not give yesterday for various reasons:

Before I returned to media full-time in 2019, I had left the space for a few years and tried my hand at a series of failed, bootstrapped software startups. I worked mainly in the Ruby on Rails (RoR) ecosystem, and after giving up on my startup ambitions I took my RoR skills to market as everything from a consultant to a CTO of a very successful restaurant reservation system startup out of Dubai.

DHH is the founding figure of the Ruby on Rails ecosystem. DHH is to RoR what Linus Torvalds is to Linux — a kind of benevolent dictator for life, and a programmer who has created more economic value (in the form of open-source software) than anyone outside of a few elite tech founders like Jeff Bezos and Bill Gates.

Like Torvalds, DHH can have some sharp elbows and is quite opinionated on all sorts of things. He shares these opinions freely, sometimes to his detriment. This has garnered him many enemies. I personally have not always been a fan of his, though he has grown on me in recent years and I now count myself among his allies (especially after this memo).

Anyway, I know what it is to be a part of the RoR community and to work around exactly the kind of Midwestern ruby-centric software shops that Basecamp is the paradigmatic example of.

(Update: The original text here said that “Github used to be the other iconic example of such a shop, before their sale to Microsoft.” But Github is headquartered in San Francisco and is a Valley company. My own exposure to them is via their very large Grand Rapids, MI presence, hence the error. I should’ve done a quick Google search. Apologies.)

There’s often a certain ethos at these places, and it has the following features:

  • Almost everyone is some flavor of progressive, and given that progressive is not the norm in the Midwest some of them can be a little elitist about it.

  • Despite the overall wokeness of such shops, there’s often an underlying current of Midwestern sanity and friendliness. This exists in an uneasy tension with the more extreme, in-your-face woke elements in the larger community and on the staff.

  • The company makes a big show of giving back to the community, not just via FOSS but maternity/paternity leave and other progressive, employee-friendly policies. They do this stuff because they think it’s the right thing to do. And they’re public and promotional about their good works because they hope to inspire others to join in, and also because blessed is he who tooteth his own horn, for if he tooteth it not, then it will not get tooted.

  • There’s a lot of collectivist rhetoric from management about how the shop is everyone’s, and we’re all family. There are committees for people to join and get involved. Everyone is obsessed with “culture,” and they make documents about it and have meetings. (The “culture” is often viewed explicitly as an open-source product, like the FOSS libraries they maintain.) The hierarchies are flat, and people often have whatever titles they want to give themselves.

  • The owners own the whole operation outright, and there is no profit sharing. None. There are bonuses, and generous salaries, and coffee bars, and other perks familiar from the startup ecosystem, but it ain’t a startup — there is no equity comp.

There has also been a backlash to the ethos described in the bullet points above, especially around this business of flat hierarchies and the practice of job title self-ID. (One person identifies as a “Software Engineer,” while the other identifies as a “Senior Programmer,” while yet another identifies as a “Javascript Wrangler.”) For instance, there was this big WIRED piece critical of Github’s flat hierarchy and the problems it causes, and there have been rants and blog posts and Reddit comments that I could dig up. But my point is, the above is not loved by everyone.

I have to say, it sure as heck was not loved by yours truly. I always thought it was really, really odd — I just could not get over it — that this “we’re all family and we all go above and beyond for the group” rhetoric was bought into so enthusiastically by employees despite the total lack of equity compensation or other real ownership.

So these progressive Midwestern software shops can sometimes try to create a Silicon Valley startup-culture type of vibe, where everyone is pulling together and there’s some collective ownership of everything. And this can blend, I think, with a little bit of an old-school progressive union/factory-worker collectivist vibe that’s still in the water out there.

And yet, none of the employees own a lick of equity, nor are they unionized. So when push comes to shove, they have zero leverage of any kind.

What ultimately happens is what always happens in situations where people are encouraged to feel ownership of something they have no actual claim to — they get told “no” by the true owner, and this comes as a rude shock. The jig is up, and they realize that they never had any real power, only the illusion of it. Despite all the collectivist rhetoric, the whole professional arrangement is still governed by the Golden Rule, i.e., “he who has the gold makes the rules.”

Given all this context, what I read in that memo was a deliberate break from the Github-style faux startup culture, to something much more like a traditional small business culture of hierarchy and clear lines of authority. This memo was the ownership of the company tossing out that entire failed model and reasserting explicit control over their business.

The people Basecamp hired under this fake ownership model are going to hate it. To put it another way, they optimized for “culture fit,” and now they’ve suddenly changed the culture right out from underneath everyone, so there will be a lot of people who suddenly do not fit.

What I’m saying here is definitely a version of, “Basecamp made their own bed.” They surely did. But more than one software shop in these circles has made that particular bed, and it has failed more than once in the past decade or so. And if programmers keep doing this, it will keep failing.

I think the take-home for the software industry is twofold:

  1. If you’re a programmer and you have no equity and no union representation, then you have no form of ownership or leverage. Any employer who’s trying to make you feel like an owner is setting you up for a rude awakening. So don’t be a chump and buy into it.

  2. If you’re an owner, then either give out equity compensation, or greenlight a union drive, or (my own preferred option) don’t cultivate a fake ownership culture among employees at a shop where none of them are owners. Don’t make people feel like something they’re not.