Congress is currently holding hearings on the problems with the two big mobile app stores from Apple and Google. Software makers are airing their complaints about scams, power imbalances, insanely large tolls, and capricious policy shifts that strike suddenly and leave developers without income, and sympathetic lawmakers in both parties are eating it up.
It was in that context that I read a lengthy account of one dev’s attempt to get her writing app off the ground using GPT-3, and felt an unexpected sense of déjà vu: the post reads exactly like an entry in the familiar genre of long, frustrated write-ups from developers detailing how this or that walled-garden app store has screwed them and their startup.
It’s weird, right? OpenAI’s GPT-3 is a hot new machine learning model that can perform what look like miracles. It’s backed by some of the biggest names in the Silicon Valley startup scene, with products based on it raising VC money at a blistering pace (Copysmith just raised $10 million). And yet it’s already giving off the distinct odor of a walled garden tended by some established BigCo.
I’ll get to what I think is happening in a moment, but first some background:
OpenAI raised $1 billion to pursue the primary goal of eventually creating an artificial general intelligence (AGI), with a secondary goal of finding commercial applications for automated (yet human-like) text generation along the way.
The company’s current business model is in selling access to its language models (GPT-2, GPT-3, and various flavors of them) via APIs that can generate reams of relevant, human-sounding text based on a given input prompt.
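To make the business model concrete, here is a minimal sketch of what a client-side call to a GPT-3-style completions API looks like. The endpoint path, model name, and parameters below are illustrative (typical of the API as publicly documented at the time), not details taken from this post:

```python
import json

# Illustrative endpoint; real calls also require an Authorization header
# carrying a paid API key, which is the whole business model.
API_URL = "https://api.openai.com/v1/completions"

def build_completion_request(prompt, max_tokens=64, temperature=0.7):
    """Assemble the JSON body for a prompt-in, text-out completion call.

    OpenAI charges per token processed, so max_tokens is effectively
    a cap on how much you spend per request.
    """
    return {
        "model": "davinci",          # illustrative model name
        "prompt": prompt,            # the input text the model continues
        "max_tokens": max_tokens,    # cap on the length of the generated text
        "temperature": temperature,  # higher values = more varied output
    }

payload = build_completion_request("Once upon a time,")
body = json.dumps(payload)  # this is what gets POSTed to API_URL
```

Every GPT-3-based product is, at bottom, some UI and prompt engineering wrapped around a request shaped like this, which is why access to that endpoint is existential for these startups.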
As for why they needed $1 billion for this, large language models (LLMs) require a large up-front investment of physical capital, in the form of banks of advanced GPUs of exactly the sort that are experiencing a massive supply crunch at the moment. You need tens of millions in hardware to get something like GPT-3 stood up, and then tens of thousands a month in ongoing electricity and facilities costs to run it.
LLMs are also very expensive to train — upwards of $10M went into training GPT-3. You do the initial training as an expensive one-off action on a large cloud because it takes even more hardware to train the model than it does to run it.
When viewed purely as a commercial proposition, GPT-3 is a pricey solution in search of a set of profitable problems.
But of course, GPT-3 wasn’t developed as a commercial proposition, and that shows in various aspects of its design and the way it’s being commercialized. No, GPT-3 is about getting us closer to an AGI, and it’s optimized to perform well on certain benchmarks. It wasn’t designed with any particular commercial use-case or market in mind.
Nonetheless, there’s a sense that there must be profitable commercial applications for this kind of tech, if only we can uncover what they are.
The way OpenAI and its backers are planning to look for those commercial LLM applications is via the startup model. Or at least, via something that’s supposed to look like the startup model, but is not really the same thing under the hood as the model that gave us the current big tech incumbents (i.e., Google, Facebook, Amazon).
It’s not a startup scene, but a pre-packaged rollup play
So if it’s not a startup scene, then what is it? To answer this, follow the money.
In this vein, Ben Dickson has some good thoughts on Microsoft’s AI strategy that I endorse:
I’ve heard the same thing that Dickson is saying from others in the field — that Microsoft’s relationship to OpenAI is to tee up an acquisition at some point because that’s how Microsoft is already acting in this space.
And downstream from the Microsoft/OpenAI relationship, there’s a similar dynamic: OpenAI is almost certainly keeping a close eye on exactly how its users are exploiting GPT-3, and will probably acquire or compete with the most successful applications.
I certainly think the expectation that the most promising GPT-3 startups will get quickly rolled up by Microsoft is behind the flood of investment into all things GPT-3.
For instance, just this week, GPT-3-powered startup Copysmith announced a $10 million investment. That is a lot of investment money for what is essentially a pretty thin wrapper around the GPT-3 API. Seriously, I’ve spent some time poking at this site, and they’re just not doing anything unique or interesting apart from the fact that they have GPT-3 API access.
So, my take-aways:
Microsoft has deep pockets, so if GPT-3 is commercially successful then they’ll buy OpenAI.
OpenAI itself has deep pockets (thanks to Microsoft), and if any of their users are commercially successful, then they’ll either compete with them or buy them. And, of course, Microsoft is also a potential acquirer for these startups, either directly or via an OpenAI acquisition.
VCs have a long-standing FOMO problem, where all the whales go into the same deals with one another, so that whatever they back is liable to win just by virtue of the amount of money they can keep throwing at it. In the natural language processing (NLP) space, the one bet that everyone is in via one route or another is GPT-3.
Regarding that last point about VC FOMO, a few weeks ago I talked to an ML practitioner who has been pitching a GPT-3 alternative, and they told me that all the VC money for NLP plays is already in GPT-3, so there’s no room for anyone else. I believe this, because I’ve been paying attention to this space and have not seen any funding announcements for anything else.
So GPT-3 is kinda the only game in town, and given how expensive it is to build and run an LLM, it’s likely to stay that way.
All this means that every GPT-3-based startup has the following characteristics:
Pretty obviously started with the goal of getting bought by Microsoft.
Requires access to GPT-3’s API to actually exist as a business.
Everything it does is visible to the OpenAI team on the back-end and can be copied by them.
The really super critical technical mojo that powers its product is owned by OpenAI.
Will have its OODA loop constrained (possibly fatally so) by how OpenAI reacts to the (very real) risks of someone misusing its platform for spam, psyops, or other bad acting.
All of this has the makings of a walled garden for Microsoft. The range of supported apps can and will be constrained to fit the agenda and interests of the larger player that owns the platform, and none of the users can object because losing access to the platform is sudden death.
A peek inside the walled garden
Here’s a post-mortem on the now-defunct TextSpark.ai app, and it has elements that will sound very familiar to anyone who has been following the App Store controversies:
The policy at the time for OpenAI was you had to actually build the application and have it production ready to submit to their review process, but I wasn’t too concerned about the investment because they’d had my design docs and I was chatting with them throughout.
Unfortunately, in September, the conversations (and review process) halted while they revamped their use policies. I spent a few weeks begging for an indication of what that would mean for our intended use case so I could pivot our plan if it was unsupported.
They couldn’t give me an indication either way--either yes or no, so clearly there was a lot of internal debate around the safety of unfettered text creation (which I get)...
Unfortunately, when the air cleared on their decisions around use applications, it turned out that unconstrained text generation was a clearly banned use case. This made our design totally unworkable for the flagship use case we’d built around.
The complete block on the use case, regardless of what safety measures we were willing to put in place, came as a bit of a shock to us. It was a difficult message to hear when we’d been pushing so hard. I almost lost my business partner over it...
In the meantime, I continued working with OpenAI to try to get clarity on whether we could launch “idea generators” which used much more constrained prompts to generate backmatter, story ideas, titles, characters, etc.
We’d been told that they’d review different aspects of the design separately but had only gotten a blanket “no” on the app, so I pressed for more details and a full review...
The feedback we got didn’t make any sense...
They said more information was required for approval, but didn’t specify what information or ask for follow-up details and I honestly didn’t know what more to give them...
The post goes on and on and on like this, but you get the idea. Anyone who has to operate under something like this isn’t doing a “startup,” at least not in the normal sense.
This, to me, is even less of a “startup” than the normal App Store lock-in, because here all the secret technical sauce for any of these apps sits behind somebody else’s API key. (Not that I’m the arbiter of what is and isn’t a startup — I’m sure people will object that I’m being too harsh.)
We need alternatives
As someone who wants AI to make rapid progress, and who specifically wants America to win at it, I don’t love the outlines of what I’ve described above.
The faster the cost of building, training, and running an LLM for a few years can be brought into the range of a software startup’s seed round — even a very large one — the more progress we’ll see in AI.
Just taking the cost of training out of the equation — perhaps by giving qualified startups credits on government-owned clusters, or on privately owned clusters via some kind of funding program — would go a long way toward creating a real AI startup scene where different LLMs compete against each other to advance the state of the art.
Right now, though, what we have is at best a software monoculture, and at worst something more locked down and limited than even Apple’s App Store. So I think we’re at risk of ending up with a much lower-quality AI ecosystem than would be possible if more startups could get cheaper access to hardware.
My fantasy is that Texas would just build a giant cluster and then offer credits on it to AI startups that move to the state. If the state wants to strike a blow against Big Tech while also helping the local startup scene, I think this is a better approach than putting out statements about platform bias.