Large language models can do jaw-dropping things. But nobody knows exactly why.

MIT Technology Review (03/04/2024)
  • Despite rapid advances, scientists still cannot explain exactly how AI models acquire some of their capabilities.
  • Two years ago, researchers at OpenAI (the maker of ChatGPT) discovered an unexpected learning pattern called “grokking,” in which a model fails at a task for a long stretch of training and then suddenly masters it, with no clear explanation.
  • In one example of such surprising behavior, models not only learn to solve complex math problems but can also go on to solve them in French.
  • Developing these models is akin to a chef trying different ingredients to create new recipes: scientists copy what works for others without asking “how.”