When we think of “AI”, our imagination often drifts to scary places: the killer robots in Terminator, The Matrix, and 2001: A Space Odyssey. Even in sci-fi stories where the humans remain in charge, AI is often portrayed as an enabler of dystopian futures where technology has created a huge rift between the rich and the poor.
Given how bleak most science fiction covering AI is, it’s no surprise that most news media approach the topic with mysticism and trepidation. How do we control AI if it becomes smarter than humans? What kinds of jobs will our children have if all jobs can be done by robots? Journalists are often all-too-willing to amplify these doomsday scenarios to drive engagement.
Instead of adding fuel to the narratives about how AI technology can go horribly wrong, I want to share my optimistic take on how AI technology can make things go wonderfully right. It has the potential to create unfathomable abundance in everyone’s lives. Not only am I optimistic that human-level AI will be good for society, but I also believe it is technologically feasible within the decade. I will share a blueprint on how to do just that.
This book is divided into three parts: (1) where AI technology is today, (2) six ingredients for accelerating us towards increasingly general AI in the next decade, and (3) the societal consequences of achieving this. This book draws from the author’s experience working as a robotics researcher at Google, as well as his current role as Vice President of AI at 1X Technologies.
- What is Intelligence?
- Artificial Intelligence
- Software 2.0: The Hard Parts
- Four ways to create an AGI
- The Neurobiological Software Stack
- Artificial Life
- Jungle Basketball
- Human-in-the-Loop Learning
- Just ask for Generalization
- Learning Robots in the Real World
- Building an AGI Team
- Why AGI is Good for You
- The Power Struggle for AI
- Reality, Just the Way You Like It
- AI Beauty
- Project Ideas
Who is this book for?
I wrote this book for the 14-year-old version of myself, when I was starting to do science fair projects in high school and think about what it meant to be intelligent. I also wrote this book for the 22-year-old version of myself, having just started my first research engineering job at Google and getting my feet wet in deep learning and robotics. This book is the culmination of all the practical wisdom I have on making AI systems work, accumulated over a decade of research projects. It will be useful for students who want to think about biological and machine intelligence in a unified way, and for technologists charting their own path to creating Artificial General Intelligence.
Preview (First Chapter)
Artificial Intelligence (AI) is the engineering discipline of creating a machine as smart as a person. No one quite knows how to do this yet, so this is also an active field of research. Because AI concerns the lofty goals of understanding and replicating intelligent behavior on par with animals and humans, it ends up being quite the multi-disciplinary field, spanning neuroscience, robotics, biology, physics, computer chip design, and philosophy.
When we think of “AI”, our imagination often drifts to scary places. Examples include the killer machines in Terminator, The Matrix, and 2001: A Space Odyssey. Even in stories where the humans remain in charge, movies like RoboCop and Blade Runner portray AI as an enabler of sci-fi dystopias where technology has created a huge rift between the rich and the poor.
Given how bleak most science fiction covering AI is, it’s no surprise that media coverage of AI technology approaches the topic with mysticism and trepidation. How do we control AI if it becomes smarter than humans? What kinds of jobs will our children have if all jobs can be done by robots? What if AI ingests prejudices from the sordid parts of the Internet and parrots those prejudices back into our policing systems? These are common questions and concerns I get from friends and family. The media is all-too-willing to amplify these doomsday scenarios to drive engagement.
Instead of adding fuel to the narratives about how AI technology can go horribly wrong, I want to balance the conversation with an optimistic take on how AI technology can make things go wonderfully right. Not just for humanity as a whole, but I want to explore how AI technology can create unfathomable abundance in everyone’s lives. On the opposite side of “AI will kill us all” is a plethora of serious AI researchers who doubt that we will get to human-level AI anytime soon. They have endured many AI booms and busts throughout the decades, where the initial overexuberance is rapidly tempered by the realization that the AI is still woefully incompetent at tasks that any human child could do. They believe that despite the great strides of the last decade, we are still missing a handful of crucial ingredients needed to imbue machines with human-level intelligence. I would like to present yet another optimistic take here; I believe those ingredients are already within reach, and we can build an artificial general intelligence within the decade. I started writing this book in late 2019 because I felt that despite the numerous research breakthroughs in certain applications, such as translation and image understanding, there was not enough research on how we could assemble all the pieces together. However, the four ensuing years (2020-2023) have seen such enormous strides in general-purpose AI models that the community is now ready to accept the possibility that we can have near-human level intelligence in many tasks.
ChatGPT, a helpful and intelligent chatbot developed by OpenAI and released to the public in late 2022, took the world by storm and reached hundreds of millions of users in just a few months. Today, people from all walks of life (my mom included) use AI chatbots every day to draft their emails and learn more about any topic. The “narrow” AI systems we have built in the last 5 decades have been expanding in scope, to the point where they don’t feel quite so narrow anymore. This book is the culmination of everything I believe to be true about the art and science of building general-purpose AIs, distilled over a decade of working on it and thinking about it. The time is ripe to make a serious attempt at doing it, and I hope this book will convince you of that.
Building an AGI would be an earth-shattering technological achievement, to say the least, and would shape the course of human history in ways that few can imagine. There are certainly risks and power struggles that arise from the creation of powerful machines. But I am fairly confident that if done right, such machines could help bring modern comforts to every person on the planet, provide us with companionship in our old age, and assist every person in leading happy and productive lives. There is so much more to AGI than just sweeping bathrooms and balancing corporate budgets – such systems will be windows into new truths about the mind and the self. These entities will force humanity to confront its own place and values in a vast, lonely universe.
The idea of creating artificial beings from non-living matter has long captured the imagination of artists, philosophers, and engineers alike. An early example of such a creation is the golem from Jewish folklore: a clay being human-like in form and function, and said to be animated to life with an incantation. In the Greek myth of Pygmalion, a sculptor's love for his statue is so profound that it enchants the statue to life. In the film 2001: A Space Odyssey, a mentally disturbed spaceship computer goes rogue and rebels against its mission, yet it shows a disturbingly human-like fear of death when it is about to be shut down. In the film Her, a complex and alluring operating system emotionally outgrows her human companion. Why does the story of creating artificial people, with all the possibilities and risks that it entails, recur so frequently across the ages? Perhaps the reason has to do with answering humanity's great questions. Where did people come from? What makes us human? Why are we here? Until recent history, people turned to creation mythology to answer these questions – gods, maize people, eggs laid by a water dragon and mountain fairy, and so on. Even though modern science has advanced our understanding of the chemical origins of life, these stories retain a sort of narrative beauty that the laws of physics are incapable of telling. A character from the science fiction film Ex Machina remarks, “If you've created a conscious machine, it's not the history of man. That's the history of gods.”
"History of gods" – what a conceited thing to say! Still, what could be more human than narcissism and self-reflection? Science fiction stories of robot uprisings rhyme with mythologies of humans rebelling against their godly creators. These fictional portrayals of robots mirror human qualities such as ego and vengeance but also love and compassion. Myths of people creating golems and sentient robots contain allegories of hubris and of the dangers of toying with powers beyond our understanding. As humans, we are endlessly obsessed with our past stories, our present nature, and our future legacy after death. Our intelligence is rooted in narcissism to some degree; for example, philosophers have long debated whether humans have something special in their minds that can't be replicated by machines and that sets us apart from "lesser" animals. If a machine were to achieve human-level intelligence, it would lead to the disturbing revelation that our own intelligence may not be as profound as we think. Maybe we are all pattern-matching machines barely capable of thinking “outside the box”.
A machine mind, also known as an Artificial Intelligence (AI), has the potential to be freed from the mortal constraints of biological hardware, which means it can pursue long-term goals far exceeding a human lifespan. As long as the machine's power source doesn't run out, it could visit other galaxies, witness the death of solar systems, and preserve the memories of humanity for eternity. Like the Pyramids of Giza and the religious texts that came before it, Artificial Intelligence can be seen as another form of self-preservation created by humans to keep their legacy alive. Now though, instead of erecting towering stone obelisks or pyramids, we have looked for "room at the bottom" and etched our monuments on tiny silicon wafers.
The development of computer technology in the 20th and 21st centuries has made dreams of "robot people" tantalizingly plausible. A computer is a machine that processes information. Information processing includes specialized tasks like adding numbers on a calculator, as well as general-purpose tasks like executing user-written programs. The PC in your home, the calculator you use for math class, and the chip in your home's smoke alarm are all computers that perform information-processing functions, with some being more specialized than others. The beauty of the mathematics of computation is that it is substrate-independent, meaning that the same computations can be implemented with a variety of materials. A computer can be made of tiny transistors that are too small to see, or bulky vacuum tubes, or steam in a cylinder, or invisible microwaves in quantum superposition. Certain substrates have advantages in terms of efficiency, but as long as two computers have the same logical primitives, they can simulate each other given enough space and time.
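To make substrate independence a little more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the names `nand_bool`, `nand_arith`, and `make_half_adder` are hypothetical. Two different "substrates" implement the same logical primitive (a NAND gate), one with Boolean operators and one with plain arithmetic, and the same one-bit adder is then wired up on top of each.

```python
from itertools import product

def nand_bool(a: int, b: int) -> int:
    """NAND implemented with Boolean operators (one 'substrate')."""
    return int(not (a and b))

def nand_arith(a: int, b: int) -> int:
    """NAND implemented with arithmetic only (a different 'substrate')."""
    return 1 - a * b

def make_half_adder(nand):
    """Wire up a one-bit half adder using nothing but the given NAND gate."""
    def half_adder(a: int, b: int) -> tuple[int, int]:
        n = nand(a, b)
        total = nand(nand(a, n), nand(b, n))  # XOR built from four NANDs
        carry = nand(n, n)                    # AND built as NOT(NAND)
        return total, carry
    return half_adder

adder_bool = make_half_adder(nand_bool)
adder_arith = make_half_adder(nand_arith)

# The circuit behaves identically regardless of how NAND is realized.
for a, b in product((0, 1), repeat=2):
    assert adder_bool(a, b) == adder_arith(a, b)
```

The wiring in `make_half_adder` never looks inside the gate it is given, which is the point: any material that can realize the primitive can, in principle, run the same computation.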
A computer can also be made of flesh. The nervous system – brain, spine, and so on – can be considered an information processing system that translates sensory information into the body's behavior. Our ancestors evolved these systems to adapt to the world around them in order to survive and multiply. The pursuit of survival leads to a wide range of information processing sub-tasks: finding food, surviving the elements, and avoiding death.
Our brain computers have taken on more advanced tasks in the ensuing eons, including agriculture, warfare, and discovering physics. However, the basic adaptive self-regulation capabilities of our nervous systems to keep us alive have remained mostly unchanged. As we will explore in this book, most intelligent behaviors are simply specialized "subroutines" that fulfill our prime directives, although they may not always be obvious.
Assuming that the nervous system is a specialized computer for assisting in survival and reproduction, and we know how to build general-purpose computers in silicon, it should in principle be possible to replicate a human mind in a silicon computer. This is known as "artificial general intelligence" (AGI): a machine that is as alive, intelligent, and emotionally capable as a human. Of course, the difficult part is figuring out how to build one! In 1956, the Dartmouth Summer Research Project on Artificial Intelligence proposed that a team of 10 people could, in about two months, make significant progress on machines that "use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves". The group's topics of interest included getting computers to use language, neural networks, measuring the size of calculations, self-improvement, abstractions, and randomness and creativity. Given that we are still working on those very problems today, two months and 10 people was an over-ambitious milestone to say the least!
Despite the daunting challenges involved in creating AGI, my life ambition is to create a few AGIs of my own. This book is a speculative blueprint on how we might replicate key aspects of human-level artificial general intelligence. Still, the path ahead to achieving AGI will be long, challenging, and uncertain.
Progress in AI is simultaneously fast and slow. There have been significant leaps in capabilities in the last decade, but most AI researchers today are still conducting similar-looking projects to those from 10 years ago. When I worked at Google, a handful of colleagues and I spent several years teaching robots to learn grasping in a limited setting. That was just one basic capability among the millions of things humans can do with their hands. Similarly, AlphaGo, the famous software that beat the human world champion at the board game Go, required many person-years of development and, while it achieved the impressive feat of beating any human at Go, it can't do anything else. If we hope to build general intelligence within the next couple of decades, we will need to go beyond superhuman performance at a few self-contained tasks to competence at millions of them, with the reliability we’ve come to expect of human intelligence.
This book proposes several technical ideas that will allow us to scale up technological capabilities quickly. It is divided into three parts: (1) where AI technology is today, (2) six ingredients for accelerating us toward AGI, and (3) the societal consequences of achieving AGI. Everyone inevitably wonders about job displacement and killer robots, so I will address these concerns too.
The first 5 chapters cover the principles of modern AI technology, summarizing the "Deep Learning revolution" and the enduring ideas in machine learning that have been scaled up to increasingly impressive capabilities in the last 10 years.
It's not easy to precisely define "artificial intelligence" any more than it is to define what "human intelligence" is. Attempting to discuss this often devolves into philosophical pedantry. Nevertheless, I provide a simple working definition for understanding the rest of the book.
In Chapter 3, I will discuss how the major breakthroughs in Machine Learning in the last decade can be understood as a result of a broad engineering principle known as deep learning, or as some call it, Software 2.0. Techniques like artificial neural networks are so widespread now that they are the lingua franca of machine learning in nearly every discipline. This principle has been applied to many domains, from speech to image understanding to language processing to robotics to physical sciences like drug discovery.
The optimism of Chapter 3 may give the impression that we are on the cusp of automating every job and creating limitless abundance. However, ML still has many pitfalls that pose a barrier to widespread adoption. At times, ML models can make unfathomably complex calculations to arrive at the right answer, and at other times they are wrong in unfathomably complex ways. Our poor understanding of how these systems work, coupled with a lack of formal guarantees of reliability, makes stakeholders reluctant to trust ML systems "in the wild."
In Chapter 4, I discuss how much of the research on making ML more "robust," "causal," or "explainable" is really focused on a simple question: "how can we improve the generalization capabilities of ML?" The research community likes to dress up this simple question with pedantic language, but the truth is that defining "generalization" is difficult, especially because it is no different than the slippery task of defining intelligence itself. We quickly encounter the limitations of verbal reasoning when we try to formalize human intelligence into mathematics. Instead, let's approach the problem from a different angle – instead of trying to understand generalization in a purely mathematical language, what if we used machine-learned models of natural language to understand the structure of generalization itself? Thinking of generalization as equivalent to language ability will provide guidance on how to align ML system behavior with human-centric intelligent behavior. Language and ability come first, and "explainability", "causality", and intuitions of "robustness" are just fuzzy Software 2.0 problems to be predicted like any other words that a deep learning model can converse about.
What about animal intelligence? For all the ink spilled on linguistics and learning, and all the philosophical hand-wringing over what intelligence even is, all one needs to do is observe a raccoon, a human baby, or a shark to see a real-world example of AGI hardware and software working beautifully together. These creatures all have one thing in common – they possess sufficient intelligence and bodily adaptations to survive and reproduce in their ecological niche. Within the "Neurobiological Software Stack," the demands of the environment, epigenetics, parenting, culture, and the richness of the environment all conspire together to form behavior. In Chapter 6, we will analyze some animal capabilities from the perspective of a well-integrated hardware and software system.
The second part of this book, spanning the next six chapters, discusses the six key engineering principles I believe are necessary to take our AI systems to "the next level": true artificial general intelligence. Our AI systems today – from Siri to self-driving cars – are capable of handling a wide range of data, but they are still far from replicating the entirety of "what it means to be human." We must broaden our capabilities and accelerate their development even further.
The first principle, discussed in Chapter 7, is to use ideas from "Artificial Life" to merge many different intelligent capabilities into a single difficult objective: survival. I believe this is the only way to achieve the full spectrum of emergent capabilities needed to handle unstructured scenarios. However, it's important to note that nature treats intelligence as just one of many adaptations for survival, and not a special goal to be favored over other adaptations. Intelligence comes with its own set of costs that need to be balanced with the needs of the organism. Homo sapiens and E. coli, for instance, have vastly different levels of intelligence, but natural selection does not favor one over the other. Therefore, if we want to evolve something with human-level intelligence, we need to build open-ended environments that demand the full range of human-level capabilities: learning, adaptation, creativity, and so on.
Unfortunately, creating such open-ended ecological simulations of life and then optimizing them end-to-end with current machine learning technologies is not really feasible. We need to rethink how we go about finding interesting solutions without recreating the entire arc of evolution millions of times. Our computational resources, impressive as they are today, are mostly suited for learning to play short arcade games. Chapter 8 explores how we can gradually scale up the existing paradigm of training agents to create increasingly open-ended environments that evolve and specialize on their own.
Solving practical problems in the real world often requires human monitoring of the data used to train our AGI systems. This takes many forms, from data labeling to researchers stopping an experiment early because they know the outcome will be uninteresting. In such "human-in-the-loop" systems, the AI does most of the heavy computational work, but the human intervenes occasionally, providing light editorial feedback to guide the direction of evolution. Chapter 9 focuses on how human-in-the-loop optimization can be used to reduce the computational burden of searching for interesting open-ended universes.
As our systems become more powerful and gain the ability to understand instructions, we can start to ask them to combine behaviors without having to laboriously train them with data. In Chapter 10, I discuss the "just ask for generalization" approach to optimization. "Just ask for generalization" is a design principle that focuses on generalizing to what you want and acquiring capabilities first, rather than directly optimizing for what you want. This approach may even provide a recipe for consciousness!
Chapter 11 discusses how we can bring this general-purpose knowledge into the real world, giving general-purpose robots the ability to automate a variety of tasks. We will not only transfer knowledge from our simulated AI to the real world but we can also do the reverse: scan the physical world of atoms back into an increasingly rich computer simulation. This chapter draws on my experience developing general-purpose AI models for robots at Google and explores the unique engineering challenges and opportunities in the robotic data domain.
Achieving AGI on an aggressive 10-year timeline requires a significant change in how we think about scaling capabilities. It also requires a highly focused team that engages in creative research while avoiding distractions that do not advance the core mission. Chapter 12 discusses some of my thoughts on leading AGI teams, based on my experiences leading various efforts at Google Robotics and 1X Technologies.
A book that speculates on what is likely to be the most impactful technology in human history would not be complete without discussing its implications for humanity. The final four chapters address AI-induced economic inequality, the politics of AI alarmism, how AI technology will reshape communication and politics, and finally, how AI can create its own beauty. Do not worry – the future of humanity is still unwritten, and this book presents a few speculative possibilities. Scientific literacy and participation in the creation of technology are the best ways to prepare yourself for the future.
This "napkin sketch" of AGI should be accessible to any general audience interested in AI. I try to explain things in the simplest way possible, but I occasionally use some programming abstractions to make things more concrete. This book is, to some extent, a "10-year plan" for my own career, and I hope that it will encourage many aspiring AI researchers and engineers to pursue a similar path for themselves. Never before has technology made it so easy for a small number of individuals to have a massive impact on the trajectory of human civilization.
I look forward to you joining me on this journey.