I recently finished Anil Ananthaswamy’s Why Machines Learn: The Elegant Math Behind Modern AI.
Would I recommend it?
Honestly, it depends on how deep down the rabbit hole you want to go.
If you’re keen to learn, Ananthaswamy is an excellent teacher.
That said, after enduring hundreds of pages of math lessons and working through equations, this line—spoiler alert—summed up the state of play:
> [M]uch of this book has celebrated the fact that traditional machine learning has had a base of well-understood mathematical principles, but deep neural networks—especially the massive networks we see today—have upset this applecart. Suddenly, empirical observations of these networks are leading the way. A new way of doing AI seems to be upon us.
We don’t have the theories to understand how these models learn or reason.
Seems important.
What struck me was the consonance between how these models work and the predominant drivers of modern politics and economics.
They’re about extracting value from the past.
- LLMs learn probability distributions from training data (human-generated text scraped from the internet)
- VC firms make probabilistic bets across distributions of outcomes
- Expected value frameworks reduce ethics to probability calculations over historical data
- ‘Make America Great Again’ conjures an imagined past to effect a private extraction of public wealth
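The first bullet is the technical crux, and it can be shown in miniature. Here's a toy bigram model (my own illustration, not from the book) that "learns" a next-word probability distribution from a tiny corpus. It's the same statistical move behind LLM pretraining, scaled down absurdly: count what followed what in the past, and turn the counts into probabilities.

```python
from collections import Counter, defaultdict

# A tiny "training corpus" (hypothetical, for illustration only).
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count bigram transitions: how often each word follows each word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_probs(word):
    """Maximum-likelihood P(next word | word) from the corpus counts."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

# "the" was followed by: cat (2x), mat (1x), fish (1x)
print(next_word_probs("the"))
# {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```

The model can only ever redistribute what the corpus already contains. Nothing in it can produce a word it never saw, which is the point: it extracts value from the past.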
The challenges we face can’t be solved by optimizing past patterns. They require building fundamentally new things.
Human agency, not pattern matching.
Invention, not recombination.
The extractive era feels exhausted.
It’s time for discovery.