In a bold TED AI San Francisco talk, Rafael Rafailov, a reinforcement‑learning researcher at Thinking Machines Lab, challenged the prevailing wisdom that AI progress hinges on scaling model size, data, and compute. He argued that the first true superintelligence will be a *superhuman learner*—an agent that can efficiently discover, test, and refine its own theories through interaction with its environment. This vision contrasts sharply with the approaches of OpenAI, Anthropic, Google DeepMind, and others, which have invested billions in ever‑larger models that excel at one‑shot reasoning but lack the capacity to internalize knowledge and improve over time.
Rafailov illustrated the limits of current coding assistants: they can perform a task one day, only to relearn it the next with no memory of past successes or failures. He pointed to the ubiquitous use of try/except blocks, essentially duct tape, to mask uncertainty: a symptom of systems trained to optimize for immediate task completion rather than long-term adaptability. To achieve general agency, he proposed a shift from single-problem training to a “textbook” style of learning, in which models progress through chapters and exercises and are rewarded for incremental learning and meta-skill acquisition. This approach, rooted in meta-learning, would let agents develop efficient learning algorithms of their own, potentially unlocking an artificial superintelligence that is not a god-like reasoner but a master learner capable of continuous self-improvement.
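Two toy Python sketches make these points concrete. The first is a hypothetical illustration (not code from the talk or from Thinking Machines): a broad except clause lets the agent finish the task while discarding exactly the signal it would need to improve next time.

```python
# Hypothetical sketch of the "duct tape" anti-pattern Rafailov criticizes.

def parse_config(raw: str) -> dict:
    """Parse a 'key=value;key=value' string; deliberately naive."""
    return dict(pair.split("=", 1) for pair in raw.split(";") if pair)

def masking_agent(raw: str) -> dict:
    try:
        return parse_config(raw)
    except Exception:
        return {}  # failure silently swallowed: nothing is learned

failure_log: list[tuple[str, str]] = []

def learning_agent(raw: str) -> dict:
    try:
        return parse_config(raw)
    except Exception as exc:
        failure_log.append((raw, repr(exc)))  # retain the experience
        raise  # surface the error instead of papering over it

print(masking_agent("host=localhost;port=8080"))  # {'host': 'localhost', 'port': '8080'}
print(masking_agent("malformed"))                 # {} -- same mistake tomorrow
```

The second sketch, again my own construction under the assumption that “rewarding incremental learning” means scoring improvement rather than raw success, illustrates the textbook-style reward: the agent earns credit for how much its performance on a chapter's exercises rises after studying, not for one-shot completion.

```python
# Toy "textbook" reward: score the improvement across a chapter's exercises.

def chapter_reward(before: list[float], after: list[float]) -> float:
    """Average score gain over a chapter's exercises."""
    return sum(a - b for a, b in zip(after, before)) / len(before)

# An agent that genuinely learned from the chapter earns a positive reward...
print(chapter_reward([0.2, 0.4, 0.1], [0.7, 0.8, 0.6]))  # ~0.467
# ...while one-shot success with no improvement earns nothing.
print(chapter_reward([0.9, 0.9, 0.9], [0.9, 0.9, 0.9]))  # 0.0
```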
Thinking Machines Lab, co‑founded by former OpenAI CTO Mira Murati and backed by a record $2 billion seed round at a $12 billion valuation, is positioning itself to pursue this ambitious agenda. While the company’s first product, Tinker, offers an API for fine‑tuning open‑source models, Rafailov’s talk signals a deeper commitment to building systems that learn from experience. The path remains fraught with challenges, among them advanced memory architectures, novel data distributions, and engineered reward signals, but the firm believes these hurdles are surmountable. In an industry rife with short‑term hype, Rafailov’s restraint, offering no grand timelines or concrete milestones, reflects a focus on foundational research rather than rapid market capture.