Large language models can do jaw-dropping things. But nobody knows exactly why.

Summary:

Despite rapid progress, scientists still cannot explain exactly how AI models achieve some of their capabilities.
Two years ago, researchers at OpenAI (the maker of ChatGPT) discovered an unexpected learning pattern called “grokking,” in which models kept failing at a task until a sudden, still-unexplained breakthrough.
One example of such a breakthrough: models not only learn to solve complex math problems but also learn to solve those problems in French.
Developing these models is akin to a chef trying different ingredients to create new recipes: scientists copy what works for others without asking how it works.
