“The Bitter Lesson” and open source LLMs

Today I learned of a 2019 essay by Rich Sutton called “The Bitter Lesson.”

It argues that, over the past 70 years, the biggest advances in AI came not from special, human-designed nuances built into the models, but from dramatic growth in computational resources driven by Moore’s Law (the exponentially falling cost of computation).

What does this mean?

Major AI leaps come not from companies building a ton of “special sauce” into their models, but from it becoming dramatically cheaper to throw more hardware at the problem.

This gives me hope that the future of LLMs won’t be beholden to companies like OpenAI, Anthropic, and the like. Instead, we’ll see open source models catch up to, and possibly surpass, OpenAI’s GPT models for raw text-to-text generation.

I am seeing claims that some open models, such as Mistral-7B and Orca 2, are on par with GPT-3.5, but the empirical evidence is mixed. (P.S. Hat tip to Anton Bacaj, who is a wealth of cutting-edge information on open models: https://twitter.com/abacaj )

Of course, there are other competitive areas where private companies’ inherent advantages will let them dominate open source models (marketing, enterprise features, APIs, wrapper support, stores, integrations, etc.). But at least the core offering won’t be wrapped up in a tight, expensive little box.

Jason Shah
