What Is It Like To Be ChatGPT?
Great analogies. You might want to re-examine your assumption that inference is expensive, though. Quantized LLMs run just fine on consumer hardware now; there is no need for custom hardware or huge amounts of RAM. I expect OpenAI's costs are orders of magnitude below what they are charging. (If they are not now, they soon will be.)
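To see why quantization makes consumer hardware viable, here's a back-of-the-envelope sketch of the RAM needed just to hold a model's weights at different bit widths. The 7B size and the bit widths are illustrative assumptions; real usage is somewhat higher once you add the KV cache and runtime overhead.

```python
# Rough weight-memory estimate for a quantized LLM.
# Ignores KV cache, activations, and runtime overhead, so treat it as a floor.

def footprint_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate RAM in GB to hold the weights alone."""
    # 1e9 params * bits / 8 bits-per-byte / 1e9 bytes-per-GB
    return params_billions * bits_per_weight / 8

for bits in (16, 8, 4):
    print(f"7B model at {bits}-bit: ~{footprint_gb(7, bits):.1f} GB")
# 16-bit: ~14.0 GB, 8-bit: ~7.0 GB, 4-bit: ~3.5 GB
```

At 4-bit, a 7B model's weights fit in about 3.5 GB, comfortably within an ordinary laptop's RAM, which is why tools like llama.cpp can run these models locally.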