Jeff Hawkins, On Intelligence
Authors: Jeff Hawkins, Sandra Blakeslee
Paperback: 261 pages
Published: August 1st 2005 by St. Martin's Griffin (first published 2004)
Original Title: On Intelligence
ISBN: 0805078533 (ISBN13: 9780805078534)
Edition Language: English
On Intelligence
Jeff Hawkins
with Sandra Blakeslee
Contents
Prologue
1. Artificial Intelligence
2. Neural Networks
3. The Human Brain
4. Memory
5. A New Framework of Intelligence
6. How the Cortex Works
7. Consciousness and Creativity
8. The Future of Intelligence
Epilogue
Appendix: Testable Predictions
Bibliography
Acknowledgments
Prologue
This book and my life are animated by two passions.
For twenty-five years I have been passionate about mobile computing. In the high-tech world of Silicon Valley, I am known for starting two companies, Palm Computing and Handspring, and as the architect of many handheld computers and cell phones such as the PalmPilot and the Treo.
But I have a second passion that predates my interest in computers— one I view as more important. I am crazy about brains. I want to understand how the brain works, not just from a philosophical perspective, not just in a general way, but in a detailed nuts and bolts engineering way. My desire is not only to understand what intelligence is and how the brain works, but how to build machines that work the same way. I want to build truly intelligent machines.
The question of intelligence is the last great terrestrial frontier of science. Most big scientific questions involve the very small, the very large, or events that occurred billions of years ago. But everyone has a brain. You are your brain. If you want to understand why you feel the way you do, how you perceive the world, why you make mistakes, how you are able to be creative, why music and art are inspiring, indeed what it is to be human, then you need to understand the brain. In addition, a successful theory of intelligence and brain function will have large societal benefits, and not just in helping us cure brain-related diseases. We will be able to build genuinely intelligent machines, although they won't be anything like the robots of popular fiction and computer science fantasy. Rather, intelligent machines will arise from a new set of principles about the nature of intelligence. As such, they will help us accelerate our knowledge of the world, help us explore the universe, and make the world safer. And along the way, a large industry will be created.
Fortunately, we live at a time when the problem of understanding intelligence can be solved. Our generation has access to a mountain of data about the brain, collected over hundreds of years, and the rate at which we are gathering more data is accelerating. The United States alone has thousands of neuroscientists. Yet we have no productive theories about what intelligence is or how the brain works as a whole. Most neurobiologists don't think much about overall theories of the brain because they're engrossed in doing experiments to collect more data about the brain's many subsystems. And although legions of computer programmers have tried to make computers intelligent, they have failed. I believe they will continue to fail as long as they keep ignoring the differences between computers and brains.
What then is intelligence such that brains have it but computers don't? Why can a six-year-old hop gracefully from rock to rock in a streambed while the most advanced robots of our time are lumbering zombies? Why are three-year-olds already well on their way to mastering language while computers can't, despite half a century of programmers' best efforts? Why can you tell a cat from a dog in a fraction of a second while a supercomputer cannot make the distinction at all? These are great mysteries waiting for an answer. We have plenty of clues; what we need now are a few critical insights.
You may be wondering why a computer designer is writing a book about brains. Or put another way, if I love brains why didn't I make a career in brain science or in artificial intelligence? The answer is I tried to, several times, but I refused to study the problem of intelligence as others have before me. I believe the best way to solve this problem is to use the detailed biology of the brain as a constraint and as a guide, yet think about intelligence as a computational problem— a position somewhere between biology and computer science. Many biologists tend to reject or ignore the idea of thinking of the brain in computational terms, and computer scientists often don't believe they have anything to learn from biology. Also, the world of science is less accepting of risk than the world of business. In technology businesses, a person who pursues a new idea with a reasoned approach can enhance his or her career regardless of whether the particular idea turns out to be successful. Many successful entrepreneurs achieved success only after earlier failures. But in academia, a couple of years spent pursuing a new idea that does not work out can permanently ruin a young career. So I pursued the two passions in my life simultaneously, believing that success in industry would help me achieve success in understanding the brain. I needed the financial resources to pursue the science I wanted, and I needed to learn how to effect change in the world, how to sell new ideas, all of which I hoped to get from working in Silicon Valley.
In August 2002 I started a research center, the Redwood Neuroscience Institute (RNI), dedicated to brain theory. There are many neuroscience centers in the world, but no others are dedicated to finding an overall theoretical understanding of the neocortex— the part of the human brain responsible for intelligence. That is all we study at RNI. In many ways, RNI is like a start-up company. We are pursuing a dream that some people think is unattainable, but we are lucky to have a great group of people, and our efforts are starting to bear fruit.
* * *
The agenda for this book is ambitious. It describes a comprehensive theory of how the brain works. It describes what intelligence is and how your brain creates it. The theory I present is not a completely new one. Many of the individual ideas you are about to read have existed in some form or another before, but not together in a coherent fashion. This should be expected. It is said that "new ideas" are often old ideas repackaged and reinterpreted. That certainly applies to the theory proposed here, but packaging and interpretation can make a world of difference, the difference between a mass of details and a satisfying theory. I hope it strikes you the way it does many people. A typical reaction I hear is, "It makes sense. I wouldn't have thought of intelligence this way, but now that you describe it to me I can see how it all fits together." With this knowledge most people start to see themselves a little differently. You start to observe your own behavior saying, "I understand what just happened in my head." Hopefully when you have finished this book, you will have new insight into why you think what you think and why you behave the way you behave. I also hope that some readers will be inspired to focus their careers on building intelligent machines based on the principles outlined in these pages.
I often refer to this theory and my approach to studying intelligence as "real intelligence" to distinguish it from "artificial intelligence." AI scientists tried to program computers to act like humans without first answering what intelligence is and what it means to understand. They left out the most important part of building intelligent machines, the intelligence! "Real intelligence" makes the point that before we attempt to build intelligent machines, we have to first understand how the brain thinks, and there is nothing artificial about that. Only then can we ask how we can build intelligent machines.
The book starts with some background on why previous attempts at understanding intelligence and building intelligent machines have failed. I then introduce and develop the core idea of the theory, what I call the memory-prediction framework. In chapter 6 I detail how the physical brain implements the memory-prediction model— in other words, how the brain actually works. I then discuss social and other implications of the theory, which for many readers might be the most thought-provoking section. The book ends with a discussion of intelligent machines— how we can build them and what the future will be like. I hope you find it fascinating. Here are some of the questions we will cover along the way: Can computers be intelligent?
For decades, scientists in the field of artificial intelligence have claimed that computers will be intelligent when they are powerful enough. I don't think so, and I will explain why. Brains and computers do fundamentally different things.
Weren't neural networks supposed to lead to intelligent machines?
Of course the brain is made from a network of neurons, but without first understanding what the brain does, simple neural networks will be no more successful at creating intelligent machines than computer programs have been.
Why has it been so hard to figure out how the brain works?
Most scientists say that because the brain is so complicated, it will take a very long time for us to understand it. I disagree. Complexity is a symptom of confusion, not a cause. Instead, I argue we have a few intuitive but incorrect assumptions that mislead us. The biggest mistake is the belief that intelligence is defined by intelligent behavior.
What is intelligence if it isn't defined by behavior?
The brain uses vast amounts of memory to create a model of the world. Everything you know and have learned is stored in this model. The brain uses this memory-based model to make continuous predictions of future events. It is the ability to make predictions about the future that is the crux of intelligence. I will describe the brain's predictive ability in depth; it is the core idea in the book.
How does the brain work?
The seat of intelligence is the neocortex. Even though it has a great number of abilities and powerful flexibility, the neocortex is surprisingly regular in its structural details. The different parts of the neocortex, whether they are responsible for vision, hearing, touch, or language, all work on the same principles. The key to understanding the neocortex is understanding these common principles and, in particular, its hierarchical structure. We will examine the neocortex in sufficient detail to show how its structure captures the structure of the world. This discussion will be the most technical part of the book, but interested nonscientist readers should be able to understand it.
What are the implications of this theory?
This theory of the brain can help explain many things, such as how we are creative, why we feel conscious, why we exhibit prejudice, how we learn, and why "old dogs" have trouble learning "new tricks." I will discuss a number of these topics. Overall, this theory gives us insight into who we are and why we do what we do.
Can we build intelligent machines and what will they do?
Yes. We can and we will. Over the next few decades, I see the capabilities of such machines evolving rapidly and in interesting directions. Some people fear that intelligent machines could be dangerous to humanity, but I argue strongly against this idea. We are not going to be overrun by robots. It will be far easier to build machines that outstrip our abilities in high-level thought such as physics and mathematics than to build anything like the walking, talking robots we see in popular fiction. I will explore the incredible directions in which this technology is likely to go.
My goal is to explain this new theory of intelligence and how the brain works in a way that anybody will be able to understand. A good theory should be easy to comprehend, not obscured in jargon or convoluted argument. I'll start with a basic framework and then add details as we go. Some will rest on logical reasoning alone; some will involve particular aspects of brain circuitry. Some of the details of what I propose are certain to be wrong, which is always the case in any area of science. A fully mature theory will take years to develop, but that doesn't diminish the power of the core idea.
* * *
When I first became interested in brains many years ago, I went to my local library to look for a good book that would explain how brains worked. As a teenager I had become accustomed to being able to find well-written books that explained almost any topic of interest. There were books on relativity theory, black holes, magic, and mathematics— whatever I was fascinated with at the moment. Yet my search for a satisfying brain book turned up empty. I came to realize that no one had any idea how the brain actually worked. There weren't even any bad or unproven theories; there simply were none. This was unusual. For example, at that time no one knew how the dinosaurs had died, but there were plenty of theories, all of which you could read about. There was nothing like this for brains. At first I had trouble believing it. It bothered me that we didn't know how this critical organ worked. While studying what we did know about brains, I came to believe that there must be a straightforward explanation. The brain wasn't magic, and it didn't seem to me that the answers would even be that complex. The mathematician Paul Erdös believed that the simplest mathematical proofs already exist in some ethereal book and a mathematician's job was to find them, to "read the book." In the same way, I felt that the explanation of intelligence was "out there." I could taste it. I wanted to read the book.
For the past twenty-five years, I have had a vision of that small, straightforward book on the brain. It was like a carrot keeping me motivated during those years. This vision has shaped the book you are holding in your hands right now. I have never liked complexity, in either science or technology. You can see that reflected in the products I have designed, which are often noted for their ease of use. The most powerful things are simple. Thus this book proposes a simple and straightforward theory of intelligence. I hope you enjoy it.
1
Artificial Intelligence
When I graduated from Cornell in June 1979 with a degree in electrical engineering, I didn't have any major plans for my life. I started work as an engineer at the new Intel campus in Portland, Oregon. The microcomputer industry was just starting, and Intel was at the heart of it. My job was to analyze and fix problems found by other engineers working in the field with our main product, single board computers. (Putting an entire computer on a single circuit board had only recently been made possible by Intel's invention of the microprocessor.) I published a newsletter, got to do some traveling, and had a chance to meet customers. I was young and having a good time, although I missed my college sweetheart who had taken a job in Cincinnati.
A few months later, I encountered something that was to change my life's direction. That something was the newly published September issue of Scientific American, which was dedicated entirely to the brain. It rekindled my childhood interest in brains. It was fascinating. From it I learned about the organization, development, and chemistry of the brain, neural mechanisms of vision, movement, and other specializations, and the biological basis for disorders of the mind. It was one of the best Scientific American issues of all time. Several neuroscientists I've spoken to have told me it played a significant role in their career choice, just as it did for me.
The final article, "Thinking About the Brain," was written by Francis Crick, the codiscoverer of the structure of DNA who had by then turned his talents to studying the brain. Crick argued that in spite of a steady accumulation of detailed knowledge about the brain, how the brain worked was still a profound mystery. Scientists usually don't write about what they don't know, but Crick didn't care. He was like the boy pointing to the emperor with no clothes. According to Crick, neuroscience was a lot of data without a theory. His exact words were, "what is conspicuously lacking is a broad framework of ideas." To me this was the British gentleman's way of saying, "We don't have a clue how this thing works." It was true then, and it's still true today.
Crick's words were to me a rallying call. My lifelong desire to understand brains and build intelligent machines was brought to life. Although I was barely out of college, I decided to change careers. I was going to study brains, not only to understand how they worked, but to use that knowledge as a foundation for new technologies, to build intelligent machines. It would take some time to put this plan into action.
In the spring of 1980 I transferred to Intel's Boston office to be reunited with my future wife, who was starting graduate school. I took a position teaching customers and employees how to design microprocessor-based systems. But I had my sights on a different goal: I was trying to figure out how to work on brain theory. The engineer in me realized that once we understood how brains worked, we could build them, and the natural way to build artificial brains was in silicon. I worked for the company that invented the silicon memory chip and the microprocessor; therefore, perhaps I could interest Intel in letting me spend part of my time thinking about intelligence and how we could design brainlike memory chips. I wrote a letter to Intel's chairman, Gordon Moore. The letter can be distilled to the following:
Dear Dr. Moore,
I propose that we start a research group devoted to understanding how the brain works. It can start with one person— me— and go from there. I am confident we can figure this out. It will be a big business one day.
— Jeff Hawkins
Moore put me in touch with Intel's chief scientist, Ted Hoff. I flew to California to meet him and lay out my proposal for studying the brain. Hoff was famous for two things. The first, which I was aware of, was for his work in designing the first microprocessor. The second, which I was not aware of at the time, was for his work in early neural network theory.
Hoff had experience with artificial neurons and some of the things you could do with them. I wasn't prepared for this. After listening to my proposal, he said he didn't believe it would be possible to figure out how the brain works in the foreseeable future, and so it didn't make sense for Intel to support me. Hoff was correct, because it is now twenty-five years later and we are just starting to make significant progress in understanding brains. Timing is everything in business. Still, at the time I was pretty disappointed.
I tend to seek the path of least friction to achieve my goals. Working on brains at Intel would have been the simplest transition. With that option eliminated I looked for the next best thing. I decided to apply to graduate school at the Massachusetts Institute of Technology, which was famous for its research on artificial intelligence and was conveniently located down the road. It seemed a great match. I had extensive training in computer science— "check." I had a desire to build intelligent machines, "check." I wanted to first study brains to see how they worked…"uh, that's a problem." This last goal, wanting to understand how brains worked, was a nonstarter in the eyes of the scientists at the MIT artificial intelligence lab.
It was like running into a brick wall. MIT was the mother-ship of artificial intelligence. At the time I applied to MIT, it was home to dozens of bright people who were enthralled with the idea of programming computers to produce intelligent behavior. To these scientists, vision, language, robotics, and mathematics were just programming problems. Computers could do anything a brain could do, and more, so why constrain your thinking by the biological messiness of nature's computer? Studying brains would limit your thinking. They believed it was better to study the ultimate limits of computation as best expressed in digital computers. Their holy grail was to write computer programs that would first match and then surpass human abilities. They took an ends-justify-the-means approach; they were not interested in how real brains worked. Some took pride in ignoring neurobiology.
This struck me as precisely the wrong way to tackle the problem. Intuitively I felt that the artificial intelligence approach would not only fail to create programs that do what humans can do, it would not teach us what intelligence is. Computers and brains are built on completely different principles. One is programmed, one is self-learning. One has to be perfect to work at all, one is naturally flexible and tolerant of failures. One has a central processor, one has no centralized control. The list of differences goes on and on. The biggest reason I thought computers would not be intelligent is that I understood how computers worked, down to the level of the transistor physics, and this knowledge gave me a strong intuitive sense that brains and computers were fundamentally different. I couldn't prove it, but I knew it as much as one can intuitively know anything. Ultimately, I reasoned, AI might lead to useful products, but it wasn't going to build truly intelligent machines.
In contrast, I wanted to understand real intelligence and perception, to study brain physiology and anatomy, to meet Francis Crick's challenge and come up with a broad framework for how the brain worked. I set my sights in particular on the neocortex— the most recently developed part of the mammalian brain and the seat of intelligence. After understanding how the neocortex worked, then we could go about building intelligent machines, but not before.
Unfortunately, the professors and students I met at MIT did not share my interests. They didn't believe that you needed to study real brains to understand intelligence and build intelligent machines. They told me so. In 1981 the university rejected my application.
* * *
Many people today believe that AI is alive and well and just waiting for enough computing power to deliver on its many promises. When computers have sufficient memory and processing power, the thinking goes, AI programmers will be able to make intelligent machines. I disagree. AI suffers from a fundamental flaw in that it fails to adequately address what intelligence is or what it means to understand something. A brief look at the history of AI and the tenets on which it was built will explain how the field has gone off course.
The AI approach was born with the digital computer. A key figure in the early AI movement was the English mathematician Alan Turing, who was one of the inventors of the idea of the general-purpose computer. His masterstroke was to formally demonstrate the concept of universal computation: that is, all computers are fundamentally equivalent regardless of the details of how they are built. As part of his proof, he conceived an imaginary machine with three essential parts: a processing box, a paper tape, and a device that reads and writes marks on the tape as it moves back and forth. The tape was for storing information— like the famous 1's and 0's of computer code (this was before the invention of memory chips or the disk drive, so Turing imagined paper tape for storage). The box, which today we call a central processing unit (CPU), follows a set of fixed rules for reading and editing the information on the tape. Turing proved, mathematically, that if you choose the right set of rules for the CPU and give it an indefinitely long tape to work with, it can perform any definable set of operations in the universe. It would be one of many equivalent machines now called Universal Turing Machines. Whether the problem is to compute square roots, calculate ballistic trajectories, play games, edit pictures, or reconcile bank transactions, it is all 1's and 0's underneath, and any Turing Machine can be programmed to handle it. Information processing is information processing is information processing. All digital computers are logically equivalent.
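To make the mechanics concrete, here is a minimal sketch of such a machine in Python. The rule table, which adds one to a binary number, is purely my own illustrative example; Turing's formulation was more abstract, but the ingredients are the same: a tape, a head that reads and writes, and a fixed set of rules standing in for the box.

```python
# A minimal Turing-machine sketch: a tape, a read/write head, and a fixed
# rule table standing in for the "box" (the CPU). Illustrative only.

def run_turing_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """rules maps (state, symbol) -> (new_state, symbol_to_write, head_move)."""
    cells = dict(enumerate(tape))            # sparse tape, indexed by position
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        state, write, move = rules[(state, symbol)]
        cells[head] = write
        head += move
    span = range(min(cells), max(cells) + 1)
    return "".join(cells.get(i, blank) for i in span).strip(blank)

# Example rule table: add 1 to a binary number written on the tape.
increment = {
    ("start", "0"): ("start", "0", +1),      # scan right to the end of the number
    ("start", "1"): ("start", "1", +1),
    ("start", "_"): ("carry", "_", -1),      # ran off the end; walk back adding 1
    ("carry", "1"): ("carry", "0", -1),      # 1 plus carry becomes 0, carry moves left
    ("carry", "0"): ("halt",  "1",  0),      # absorb the carry and stop
    ("carry", "_"): ("halt",  "1",  0),      # carried past the leftmost digit
}

print(run_turing_machine(increment, "1011"))   # prints "1100" (11 + 1 = 12)
```

Swap in a different rule table and the same machinery computes something else entirely, which is the sense in which all digital computers are logically equivalent.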
Turing's conclusion was indisputably true and phenomenally fruitful. The computer revolution and all its products are built on it. Then Turing turned to the question of how to build an intelligent machine. He felt computers could be intelligent, but he didn't want to get into arguments about whether this was possible or not. Nor did he think he could define intelligence formally, so he didn't even try. Instead, he proposed an existence proof for intelligence, the famous Turing Test: if a computer can fool a human interrogator into thinking that it too is a person, then by definition the computer must be intelligent. And so, with the Turing Test as his measuring stick and the Turing Machine as his medium, Turing helped launch the field of AI. Its central dogma: the brain is just another kind of computer. It doesn't matter how you design an artificially intelligent system, it just has to produce humanlike behavior.
The AI proponents saw parallels between computation and thinking. They said, "Look, the most impressive feats of human intelligence clearly involve the manipulation of abstract symbols— and that's what computers do too. What do we do when we speak or listen? We manipulate mental symbols called words, using well-defined rules of grammar. What do we do when we play chess? We use mental symbols that represent the properties and locations of the various pieces. What do we do when we see? We use mental symbols to represent objects, their positions, their names, and other properties. Sure, people do all this with brains and not with the kinds of computers we build, but Turing has shown that it doesn't matter how you implement or manipulate the symbols. You can do it with an assembly of cogs and gears, with a system of electronic switches, or with the brain's network of neurons— whatever, as long as your medium can realize the functional equivalent of a Universal Turing Machine."
This assumption was bolstered by an influential scientific paper published in 1943 by the neurophysiologist Warren McCulloch and the mathematician Walter Pitts. They described how neurons could perform digital functions— that is, how nerve cells could conceivably replicate the formal logic at the heart of computers. The idea was that neurons could act as what engineers call logic gates. Logic gates implement simple logical operations such as AND, NOT, and OR. Computer chips are composed of millions of logic gates all wired together into precise, complicated circuits. A CPU is just a collection of logic gates.
McCulloch and Pitts pointed out that neurons could also be connected together in precise ways to perform logic functions. Since neurons gather input from each other and process those inputs to decide whether to fire off an output, it was conceivable that neurons might be living logic gates. Thus, they inferred, the brain could conceivably be built out of AND-gates, OR-gates, and other logic elements all built with neurons, in direct analogy with the wiring of digital electronic circuits. It isn't clear whether McCulloch and Pitts actually believed the brain worked this way; they only said it was possible. And, logically speaking, this view of neurons is possible. Neurons can, in theory, implement digital functions. However, no one bothered to ask if that was how neurons actually were wired in the brain. The AI proponents took it as proof, irrespective of the lack of biological evidence, that brains were just another kind of computer.
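Here is a minimal sketch of that idea, assuming the standard textbook form of a McCulloch-Pitts unit (sum the weighted inputs and fire if the sum reaches a threshold); the particular weights and thresholds are my own illustrative choices, not values from their paper.

```python
# A McCulloch-Pitts-style threshold unit: fire (1) if the weighted sum of
# inputs reaches the threshold, otherwise stay silent (0).
def neuron(inputs, weights, threshold):
    return 1 if sum(i * w for i, w in zip(inputs, weights)) >= threshold else 0

# With suitable weights and thresholds, single units behave as logic gates.
def AND(a, b): return neuron([a, b], [1, 1], threshold=2)
def OR(a, b):  return neuron([a, b], [1, 1], threshold=1)
def NOT(a):    return neuron([a], [-1], threshold=0)

for a in (0, 1):
    for b in (0, 1):
        print(f"{a} AND {b} = {AND(a, b)}   {a} OR {b} = {OR(a, b)}")
print("NOT 0 =", NOT(0), "  NOT 1 =", NOT(1))
```

Wire enough such units together in fixed circuits and you get exactly the kind of logic a CPU is built from, which is why the analogy was so tempting; whether real neurons are actually wired that way is, as noted above, a separate question.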
It's also worth noting that AI philosophy was buttressed by the dominant trend in psychology during the first half of the twentieth century, called behaviorism. The behaviorists believed that it was not possible to know what goes on inside the brain, which they called an impenetrable black box. But one could observe and measure an animal's environment and its behaviors— what it senses and what it does, its inputs and its outputs. They conceded that the brain contained reflex mechanisms that could be used to condition an animal into adopting new behaviors through reward and punishments. But other than this, one did not need to study the brain, especially messy subjective feelings such as hunger, fear, or what it means to understand something. Needless to say, this research philosophy eventually withered away throughout the second half of the twentieth century, but AI would stick around a lot longer.
As World War II ended and electronic digital computers became available for broader applications, the pioneers of AI rolled up their sleeves and began programming. Language translation? Easy! It's a kind of code breaking. We just need to map each symbol in System A onto its counterpart in System B. Vision? That looks easy too. We already know geometric theorems that deal with rotation, scale, and displacement, and we can easily encode them as computer algorithms— so we're halfway there. AI pundits made grand claims about how quickly computer intelligence would first match and then surpass human intelligence.
Ironically, the computer program that came closest to passing the Turing Test, a program called Eliza, mimicked a psychoanalyst, rephrasing your questions back at you. For example, if a person typed in, "My boyfriend and I don't talk anymore," Eliza might say, "Tell me more about your boyfriend" or "Why do you think your boyfriend and you don't talk anymore?" Designed as a joke, the program actually fooled some people, even though it was dumb and trivial. More serious efforts included programs such as Blocks World, a simulated room containing blocks of different colors and shapes. You could pose questions to Blocks World such as "Is there a green pyramid on top of the big red cube?" or "Move the blue cube on top of the little red cube." The program would answer your question or try to do what you asked. It was all simulated— and it worked. But it was limited to its own highly artificial world of blocks. Programmers couldn't generalize it to do anything useful.
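To show how shallow the trick was, here is a toy Eliza-style responder. It is my own illustration, not the actual Eliza program or its script: it matches a keyword pattern, swaps a few pronouns, and echoes a fragment of your sentence back as a question.

```python
import re

# Toy Eliza-style responder: no understanding anywhere, just pattern matching
# and pronoun swapping. The rules below are illustrative, not Eliza's script.
SWAPS = {"i": "you", "me": "you", "my": "your", "am": "are", "you": "I"}

RULES = [
    (re.compile(r"(.+) don't (.+)", re.I), "Why do you think {0} don't {1}?"),
    (re.compile(r"i feel (.+)", re.I),     "Why do you feel {0}?"),
    (re.compile(r"my (.+)", re.I),         "Tell me more about your {0}."),
]

def swap_pronouns(fragment):
    return " ".join(SWAPS.get(word.lower(), word) for word in fragment.split())

def respond(sentence):
    for pattern, template in RULES:
        match = pattern.search(sentence)
        if match:
            parts = [swap_pronouns(g.strip(" .!?")) for g in match.groups()]
            return template.format(*parts)
    return "Please go on."        # default when nothing matches

print(respond("My boyfriend and I don't talk anymore"))
# -> Why do you think your boyfriend and you don't talk anymore?
```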
The public, meanwhile, was impressed by a continuous stream of seeming successes and news stories about AI technology. One program that generated initial excitement was able to prove mathematical theorems. Ever since Plato, multistep deductive inference has been seen as the pinnacle of human intelligence, so at first it seemed that AI had hit the jackpot. But, like Blocks World, it turned out the program was limited. It could only find very simple theorems, which were already known. Then there was a large stir about "expert systems," databases of facts that could answer questions posed by human users. For example, a medical expert system might be able to diagnose a patient's disease if given a list of symptoms. But again, they turned out to be of limited use and didn't exhibit anything close to generalized intelligence. Computers could play checkers at expert skill levels and eventually IBM's Deep Blue famously beat Garry Kasparov, the world chess champion, at his own game. But these successes were hollow. Deep Blue didn't win by being smarter than a human; it won by being millions of times faster than a human. Deep Blue had no intuition. An expert human player looks at a board position and immediately sees what areas of play are most likely to be fruitful or dangerous, whereas a computer has no innate sense of what is important and must explore many more options. Deep Blue also had no sense of the history of the game, and didn't know anything about its opponent. It played chess yet didn't understand chess, in the same way that a calculator performs arithmetic but doesn't understand mathematics.
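The "millions of times faster" point is easier to appreciate with a sketch of brute-force game-tree search, the general family of technique chess programs rely on. This is plain minimax over a toy take-away game of my own choosing; Deep Blue's actual search and evaluation were vastly more elaborate.

```python
# Brute-force minimax over a toy game: from a pile of sticks a player removes
# 1, 2, or 3; whoever takes the last stick wins. The program has no intuition
# about which moves "look" promising; it simply explores every line of play.

def best_move(pile, maximizing=True):
    """Return (score, move); score is +1 if the maximizing side can force a win."""
    if pile == 0:
        # The previous player took the last stick, so the side to move has lost.
        return (-1 if maximizing else +1), None
    best = None
    for take in (1, 2, 3):
        if take > pile:
            break
        score, _ = best_move(pile - take, not maximizing)
        better = best is None or (score > best[0] if maximizing else score < best[0])
        if better:
            best = (score, take)
    return best

score, move = best_move(10)
print(f"Pile of 10: take {move} ({'forced win' if score > 0 else 'lost position'})")
# -> Pile of 10: take 2 (forced win)
```

The search succeeds by exhaustion, not insight; scale the game up to chess and the same blind exploration needs specialized hardware and enormous speed, which is exactly the trade Deep Blue made.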
In all cases, the successful AI programs were only good at the one particular thing for which they were specifically designed. They didn't generalize or show flexibility, and even their creators admitted they didn't think like humans. Some AI problems, which at first were thought to be easy, yielded no progress. Even today, no computer can understand language as well as a three-year-old or see as well as a mouse.
After many years of effort, unfulfilled promises, and no unqualified successes, AI started to lose its luster. Scientists in the field moved on to other areas of research. AI start-up companies failed. And funding became scarcer. Programming computers to do even the most basic tasks of perception, language, and behavior began to seem impossible. Today, not much has changed. As I said earlier, there are still people who believe that AI's problems can be solved with faster computers, but most scientists think the entire endeavor was flawed.
We shouldn't blame the AI pioneers for their failures. Alan Turing was brilliant. They all could tell that the Turing Machine would change the world— and it did, but not through AI.
* * *
My skepticism of AI's assertions was honed around the same time that I applied to MIT. John Searle, an influential philosophy professor at the University of California at Berkeley, was at that time saying that computers were not, and could not be, intelligent. To prove it, in 1980 he came up with a thought experiment called the Chinese Room. It goes like this:
Suppose you have a room with a slot in one wall, and inside is an English-speaking person sitting at a desk. He has a big book of instructions and all the pencils and scratch paper he could ever need. Flipping through the book, he sees that the instructions, written in English, dictate ways to manipulate, sort, and compare Chinese characters. Mind you, the directions say nothing about the meanings of the Chinese characters; they only deal with how the characters are to be copied, erased, reordered, transcribed, and so forth.
Someone outside the room slips a piece of paper through the slot. On it is written a story and questions about the story, all in Chinese. The man inside doesn't speak or read a word of Chinese, but he picks up the paper and goes to work with the rulebook. He toils and toils, rotely following the instructions in the book. At times the instructions tell him to write characters on scrap paper, and at other times to move and erase characters. Applying rule after rule, writing and erasing characters, the man works until the book's instructions tell him he is done. When he is finished at last he has written a new page of characters, which unbeknownst to him are the answers to the questions. The book tells him to pass his paper back through the slot. He does it, and wonders what this whole tedious exercise has been about.
Outside, a Chinese speaker reads the page. The answers are all correct, she notes— even insightful. If she is asked whether those answers came from an intelligent mind that had understood the story, she will definitely say yes. But can she be right? Who understood the story? It wasn't the fellow inside, certainly; he is ignorant of Chinese and has no idea what the story was about. It wasn't the book, which is just, well, a book, sitting inertly on the writing desk amid piles of paper. So where did the understanding occur? Searle's answer is that no understanding did occur; it was just a bunch of mindless page flipping and pencil scratching. And now the bait-and-switch: the Chinese Room is exactly analogous to a digital computer. The person is the CPU, mindlessly executing instructions, the book is the software program feeding instructions to the CPU, and the scratch paper is the memory. Thus, no matter how cleverly a computer is designed to simulate intelligence by producing the same behavior as a human, it has no understanding and it is not intelligent. (Searle made it clear he didn't know what intelligence is; he was only saying that whatever it is, computers don't have it.)
This argument created a huge row among philosophers and AI pundits. It spawned hundreds of articles, plus more than a little vitriol and bad blood. AI defenders came up with dozens of counterarguments to Searle, such as claiming that although none of the room's component parts understood Chinese, the entire room as a whole did, or that the person in the room really did understand Chinese, but just didn't know it. As for me, I think Searle had it right. When I thought through the Chinese Room argument and when I thought about how computers worked, I didn't see understanding happening anywhere. I was convinced we needed to understand what "understanding" is, a way to define it that would make it clear when a system was intelligent and when it wasn't, when it understands Chinese and when it doesn't. Its behavior doesn't tell us this.
A human doesn't need to "do" anything to understand a story. I can read a story quietly, and although I have no overt behavior my understanding and comprehension are clear, at least to me. You, on the other hand, cannot tell from my quiet behavior whether I understand the story or not, or even if I know the language the story is written in. You might later ask me questions to see if I did, but my understanding occurred when I read the story, not just when I answer your questions. A thesis of this book is that understanding cannot be measured by external behavior; as we'll see in the coming chapters, it is instead an internal metric of how the brain remembers things and uses its memories to make predictions. The Chinese Room, Deep Blue, and most computer programs don't have anything akin to this. They don't understand what they are doing. The only way we can judge whether a computer is intelligent is by its output, or behavior.
The ultimate defensive argument of AI is that computers could, in theory, simulate the entire brain. A computer could model all the neurons and their connections, and if it did there would be nothing to distinguish the "intelligence" of the brain from the "intelligence" of the computer simulation. Although this may be impossible in practice, I agree with it. But AI researchers don't simulate brains, and their programs are not intelligent. You can't simulate a brain without first understanding what it does.
* * *
After my rejection by both Intel and MIT, I didn't know what to do. When you don't know how to proceed, often the best strategy is to make no changes until your options become clear. So I just kept working in the computer field. I was content to stay in Boston, but in 1982 my wife wanted to move to California, so we did (it was, again, the path of least friction). I landed a job in Silicon Valley, at a start-up called Grid Systems. Grid invented the laptop computer, a beautiful machine that became the first computer in the collection at the Museum of Modern Art in New York. Working first in marketing and then in engineering, I eventually created a high-level programming language called GridTask. It and I became more and more important to Grid's success; my career was going well.
Still, I could not get my curiosity about the brain and intelligent machines out of my head. I was consumed with a desire to study brains. So I took a correspondence course in human physiology and studied on my own (no one ever got rejected by a correspondence school!). After learning a fair amount of biology, I decided to apply for admission to a biology graduate program and study intelligence from within the biological sciences. If the computer science world didn't want a brain theorist, then maybe the biology world would welcome a computer scientist. There was no such thing as theoretical biology back then, and especially not theoretical neuroscience, so biophysics seemed like the best field for my interests. I studied hard, took the required entrance exams, prepared a résumé, solicited letters of recommendation, and voilà, I was accepted as a full-time graduate student in the biophysics program at the University of California at Berkeley.
I was thrilled. Finally I could start in earnest on brain theory, or so I thought. I quit my job at Grid with no intention of working in the computer industry again. Of course this meant indefinitely giving up my salary. My wife was thinking "time to buy a house and start a family" and I was happily becoming a non-provider. This was definitely not a low-friction path. But it was the best option I had, and she supported my decision.
John Ellenby, the founder of Grid, pulled me into his office just before I left and said, "I know you don't expect to ever come back to Grid or the computer industry, but you never know what will happen. Instead of stopping completely, why don't you take a leave of absence? That way if, in a year or two, you do come back, you can pick up your salary, position, and stock options where you left off." It was a nice gesture. I accepted it, but I felt I was leaving the computer business for good.
2
Neural Networks
When I started at UC Berkeley in January 1986, the first thing I did was compile a history of theories of intelligence and brain function. I read hundreds of papers by anatomists, physiologists, philosophers, linguists, computer scientists, and psychologists. Numerous people from many fields had written extensively about thinking and intelligence. Each field had its own set of journals and each used its own terminology. I found their descriptions inconsistent and incomplete. Linguists talked of intelligence in terms such as "syntax" and "semantics." To them, the brain and intelligence were all about language. Vision scientists referred to 2D, 2½D, and 3D sketches. To them, the brain and intelligence were all about visual pattern recognition. Computer scientists talked of schemas and frames, new terms they made up to represent knowledge. None of these people talked about the structure of the brain and how it would implement any of their theories. On the other hand, anatomists and neurophysiologists wrote extensively about the structure of the brain and how neurons behave, but they mostly avoided any attempt at large-scale theory. It was difficult and frustrating trying to make sense of these various approaches and the mountain of experimental data that accompanied them.
Around this time, a new and promising approach to thinking about intelligent machines burst onto the scene. Neural networks had been around since the late 1960s in one form or another, but neural networks and the AI movement were competitors, for both the dollars and the mind share of the agencies that fund research. AI, the 800-pound gorilla in those days, actively squelched neural network research. Neural network researchers were essentially blacklisted from getting funding for several years. A few people continued to think about them though, and in the mid-1980s their day in the sun had finally arrived. It is hard to know exactly why there was a sudden interest in neural networks, but undoubtedly one contributing factor was the continuing failure of artificial intelligence. People were casting about for alternatives to AI and found one in artificial neural networks.
Neural networks were a genuine improvement over the AI approach because their architecture is based, though very loosely, on real nervous systems. Instead of programming computers, neural network researchers, also known as connectionists, were interested in learning what kinds of behaviors could be exhibited by hooking a bunch of neurons together. Brains are made of neurons; therefore, the brain is a neural network. That is a fact. The hope of connectionists was that the elusive properties of intelligence would become clear by studying how neurons interact, and that some of the problems that were unsolvable with AI could be solved by replicating the correct connections between populations of neurons. A neural network is unlike a computer in that it has no CPU and doesn't store information in a centralized memory. The network's knowledge and memories are distributed throughout its connectivity— just like real brains.
On the surface, neural networks seemed to be a great fit with my own interests. But I quickly became disillusioned with the field. By this time I had formed an opinion that three things were essential to understanding the brain. My first criterion was the inclusion of time in brain function. Real brains process rapidly changing streams of information. There is nothing static about the flow of information into and out of the brain.
The second criterion was the importance of feedback. Neuroanatomists have known for a long time that the brain is saturated with feedback connections. For example, in the circuit between the neocortex and a lower structure called the thalamus, connections going backward (toward the input) exceed the connections going forward by almost a factor of ten! That is, for every fiber feeding information forward into the neocortex, there are ten fibers feeding information back toward the senses. Feedback dominates most connections throughout the neocortex as well. No one understood the precise role of this feedback, but it was clear from published research that it existed everywhere. I figured it must be important.
The third criterion was that any theory or model of the brain should account for the physical architecture of the brain. The neocortex is not a simple structure. As we will see later, it is organized as a repeating hierarchy. Any neural network that didn't acknowledge this structure was certainly not going to work like a brain.
But as the neural network phenomenon exploded on the scene, it mostly settled on a class of ultrasimple models that didn't meet any of these criteria. Most neural networks consisted of a small number of neurons connected in three rows. A pattern (the input) is presented to the first row. These input neurons are connected to the next row of neurons, the so-called hidden units. The hidden units then connect to the final row of neurons, the output units. The connections between neurons have variable strengths, meaning the activity in one neuron might increase the activity in another and decrease the activity in a third neuron depending on the connection strengths. By changing these strengths, the network learns to map input patterns to output patterns.
These simple neural networks only processed static patterns, did not use feedback, and didn't look anything like brains. The most common type of neural network, called a "back propagation" network, learned by broadcasting an error from the output units back toward the input units. You might think this is a form of feedback, but it isn't really. The backward propagation of errors only occurred during the learning phase. When the neural network was working normally, after being trained, the information flowed only one way. There was no feedback from outputs to inputs. And the models had no sense of time. A static input pattern got converted into a static output pattern. Then another input pattern was presented. There was no history or record in the network of what happened even a short time earlier. And finally the architecture of these neural networks was trivial compared to the complicated and hierarchical structure of the brain.
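For readers who want to see what such a network amounts to, here is a minimal sketch of the standard three-row arrangement trained by back propagation; the task (XOR), the layer sizes, and the learning rate are illustrative choices of mine. Note that the backward flow of error happens only inside the training loop: the last line runs the trained network purely one way, a static input mapped to a static output.

```python
import numpy as np

# A three-row network: input units -> hidden units -> output units, trained
# by back propagation on the XOR mapping. Sizes, learning rate, and task are
# illustrative; the point is the structure, not the problem being solved.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # input patterns
Y = np.array([[0], [1], [1], [0]], dtype=float)               # desired outputs

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # strengths, input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # strengths, hidden -> output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(30_000):                    # learning phase only
    hidden = sigmoid(X @ W1 + b1)          # forward pass
    output = sigmoid(hidden @ W2 + b2)
    # Error is broadcast backward from the output units toward the inputs...
    d_out = (output - Y) * output * (1 - output)
    d_hid = (d_out @ W2.T) * hidden * (1 - hidden)
    W2 -= 0.5 * hidden.T @ d_out           # ...and used to adjust the strengths.
    b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_hid
    b1 -= 0.5 * d_hid.sum(axis=0)

# After training, information flows one way only: static in, static out.
print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))   # should be near 0, 1, 1, 0
```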
I thought the field would quickly move on to more realistic networks, but it didn't. Because these simple neural networks were able to do interesting things, research seemed to stop right there, for years. They had found a new and interesting tool, and overnight thousands of scientists, engineers, and students were getting grants, earning PhDs, and writing books about neural networks. Companies were formed to use neural networks to predict the stock market, process loan applications, verify signatures, and perform hundreds of other pattern classification applications. Although the intent of the founders of the field might have been more general, the field became dominated by people who weren't interested in understanding how the brain works, or understanding what intelligence is.
The popular press didn't understand this distinction well. Newspapers, magazines, and TV science programs presented neural networks as being "brainlike" or working on the "same principles as the brain." Unlike AI, where everything had to be programmed, neural nets learned by example, which seemed, well, somehow more intelligent. One prominent demonstration was NetTalk. This neural network learned to map sequences of letters onto spoken sounds. As the network was trained on printed text, it started sounding like a computer voice reading the words. It was easy to imagine that, with a little more time, neural networks would be conversing with humans. NetTalk was incorrectly heralded on national news as a machine learning to read. NetTalk was a great exhibition, but what it was actually doing bordered on the trivial. It didn't read, it didn't understand, and was of little practical value. It just matched letter combinations to predefined sound patterns.
Let me give you an analogy to show how far neural networks were from real brains. Imagine that instead of trying to figure out how a brain worked we were trying to figure out how a digital computer worked. After years of study, we discover that everything in the computer is made of transistors. There are hundreds of millions of transistors in a computer and they are connected together in precise and complex ways. But we don't understand how the computer works or why the transistors are connected the way they are. So one day we decide to connect just a few transistors together to see what happens. Lo and behold we find that as few as three transistors, when connected together in a certain way, become an amplifier. A small signal put into one end is magnified on the other end. (Amplifiers in radios and televisions are made using transistors in this fashion.) This is an important discovery, and overnight an industry springs up making transistor radios, televisions, and other electronic appliances using transistor amplifiers. This is all well and good, but it doesn't tell us anything about how the computer works. Even though an amplifier and a computer are both made of transistors, they have almost nothing else in common. In the same way, a real brain and a three-row neural network are built with neurons, but have almost nothing else in common.
During the summer of 1987, I had an experience that threw more cold water on my already low enthusiasm for neural nets. I went to a neural network conference where I saw a presentation by a company called Nestor. Nestor was trying to sell a neural network application for recognizing handwriting on a tablet. It was offering to license the program for one million dollars. That got my attention. Although Nestor was promoting the sophistication of its neural network algorithm and touting it as yet another major breakthrough, I felt the problem of handwriting recognition could be solved in a simpler, more traditional way. I went home that night, thought about the problem, and in two days had designed a handwriting recognizer that was fast, small, and flexible. My solution didn't use a neural network and it didn't work at all like a brain. Although that conference sparked my interest in designing computers with a stylus interface (eventually leading to the PalmPilot ten years later), it also convinced me that neural networks were not much of an improvement over traditional methods. The handwriting recognizer I created ultimately became the basis for the text entry system, called Graffiti, used in the first series of Palm products. I think Nestor went out of business.
So much for simple neural networks. Most of their capabilities were easily handled by other methods and eventually the media hoopla subsided. At least neural network researchers did not claim their models were intelligent. After all, they were extremely simple networks and did less than AI programs. I don't want to leave you with the impression that all neural networks are of the simple three-layer variety. Some researchers have continued to study neural networks of different designs. Today the term neural network is used to describe a diverse set of models, some of which are more biologically accurate and some of which are not. But almost none of them attempt to capture the overall function or architecture of the neocortex.
In my opinion, the most fundamental problem with most neural networks is a trait they share with AI programs. Both are fatally burdened by their focus on behavior. Whether they are calling these behaviors "answers," "patterns," or "outputs," both AI and neural networks assume intelligence lies in the behavior that a program or a neural network produces after processing a given input. The most important attribute of a computer program or a neural network is whether it gives the correct or desired output. As inspired by Alan Turing, intelligence equals behavior.
But intelligence is not just a matter of acting or behaving intelligently. Behavior is a manifestation of intelligence, but not the central characteristic or primary definition of being intelligent. A moment's reflection proves this: You can be intelligent just lying in the dark, thinking and understanding. Ignoring what goes on in your head and focusing instead on behavior has been a large impediment to understanding intelligence and building intelligent machines.
* * *
Before we explore a new definition of intelligence, I want to tell you about one other connectionist approach that came much closer to describing how real brains work. Trouble is, few people seem to have realized the importance of this research.
While neural nets grabbed the limelight, a small splinter group of neural network theorists built networks that didn't focus on behavior. Called auto-associative memories, they were also built out of simple "neurons" that connected to each other and fired when they reached a certain threshold. But they were interconnected differently, using lots of feedback. Instead of only passing information forward, as in a back propagation network, auto-associative memories fed the output of each neuron back into the input— sort of like calling yourself on the phone. This feedback loop led to some interesting features. When a pattern of activity was imposed on the artificial neurons, they formed a memory of this pattern. The auto-associative network associated patterns with themselves, hence the term auto-associative memory.
The result of this wiring may at first seem ridiculous. To retrieve a pattern stored in such a memory, you must provide the pattern you want to retrieve. It would be like going to the grocer and asking to buy a bunch of bananas. When the grocer asks you how you will pay, you offer to pay with bananas. What good is that? you might ask. But an auto-associative memory has a few important properties that are found in real brains.
The most important property is that you don't have to have the entire pattern you want to retrieve in order to retrieve it. You might have only part of the pattern, or you might have a somewhat messed-up pattern. The auto-associative memory can retrieve the correct pattern, as it was originally stored, even though you start with a messy version of it. It would be like going to the grocer with half-eaten brown bananas and getting whole green bananas in return. Or going to the bank with a ripped and unreadable bill and the banker says, "I think this is a messed-up $100 bill. Give me that one, and I will give you this new, crisp $100 bill."
Second, unlike most other neural networks, an auto-associative memory can be designed to store sequences of patterns, or temporal patterns. This feature is accomplished by adding a time delay to the feedback. With this delay, you can present an auto-associative memory with a sequence of patterns, similar to a melody, and it can remember the sequence. I might feed in the first few notes of "Twinkle Twinkle Little Star" and the memory returns the whole song. When presented with part of the sequence, the memory can recall the rest. As we will see later, this is how people learn practically everything, as a sequence of patterns. And I propose the brain uses circuits similar to an auto-associative memory to do so.
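Here is a minimal sketch of an auto-associative memory of this general kind, written as a Hopfield-style network with Hebbian storage; the stored patterns are my own illustrative choices. Each unit's output feeds back as input to the others, so a messy version of a stored pattern settles into the clean original. The sequence-recall variant described above would add a time delay in that feedback loop; this sketch shows only the pattern-completion property.

```python
import numpy as np

# Hopfield-style auto-associative memory. Units are +1/-1; every unit feeds
# back to every other, so a partial or corrupted pattern settles into the
# stored pattern it most resembles. The stored patterns are illustrative.

def store(patterns):
    """Hebbian storage: strengthen connections between units that are co-active."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0)                  # no unit connects to itself
    return W

def recall(W, pattern, steps=10):
    """Repeatedly feed the units' output back in as the next input."""
    state = pattern.copy()
    for _ in range(steps):
        state = np.where(W @ state >= 0, 1, -1)
    return state

patterns = np.array([
    [ 1, -1,  1, -1,  1, -1,  1, -1],       # stored pattern A
    [ 1,  1,  1,  1, -1, -1, -1, -1],       # stored pattern B
])
W = store(patterns)

messy_A = np.array([1, -1, 1, -1, 1, 1, -1, -1])   # pattern A with two units flipped
print(recall(W, messy_A))                          # settles back to pattern A
```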
Auto-associative memories hinted at the potential importance of feedback and time-changing inputs. But the vast majority of AI, neural network, and cognitive scientists ignored time and feedback.
Neuroscientists as a whole have not done much better. They too know about feedback— they are the people who discovered it— but most have no theory (beyond vague talk of "phases" and "modulation") to account for why the brain needs to have so much of it. And time has little or no central role in most of their ideas on overall brain function. They tend to chart the brain in terms of where things happen, not when or how neural firing patterns interact over time. Part of this bias comes from the limits of our current experimental techniques. One of the favorite technologies of the 1990s, aka the Decade of the Brain, was functional imaging. Functional imaging machines can take pictures of brain activity in humans. However, they cannot see rapid changes. So scientists ask subjects to concentrate on a single task over and over again as if they were being asked to stand still for an optical photograph, except this is a mental photograph. The result is we have lots of data on where in the brain certain tasks occur, but little data on how realistic, time-varying inputs flow through the brain. Functional imaging lends insight into where things are happening at a given moment but cannot easily capture how brain activity changes over time. Scientists would like to collect this data, but there are few good techniques for doing so. Thus many mainstream cognitive neuroscientists continue to buy into the input-output fallacy. You present a fixed input and see what output you get. Wiring diagrams of the cortex tend to show flowcharts that start in the primary sensory areas where sights, sounds, and touch come in, flow up through higher analytical, planning, and motor areas, and then feed instructions down to the muscles. You sense, then you act.
I don't want to imply that everyone has ignored time and feedback. This is such a big field that virtually every idea has its adherents. In recent years, belief in the importance of feedback, time, and prediction has been on the rise. But the thunder of AI and classical neural networks kept other approaches subdued and underappreciated for many years.
* * *
It's not difficult to understand why people— laymen and experts alike— have thought that behavior defines intelligence.
For at least a couple of centuries people have likened the brain's abilities to clockworks, then pumps and pipes, then steam engines and, later, to computers. Decades of science fiction have been awash in AI ideas, from Isaac Asimov's laws of robotics to Star Wars' C3PO. The idea of intelligent machines doing things is ingrained in our imagination. All machines, whether made by humans or imagined by humans, are designed to do something. We don't have machines that think, we have machines that do. Even as we observe our fellow humans, we focus on their behavior and not on their hidden thoughts. Therefore, it seems intuitively obvious that intelligent behavior should be the metric of an intelligent system.
However, looking across the history of science, we see our intuition is often the biggest obstacle to discovering the truth. Scientific frameworks are often difficult to discover, not because they are complex, but because intuitive but incorrect assumptions keep us from seeing the correct answer. Astronomers before Copernicus (1473–1543) wrongly assumed that the earth was stationary at the center of the universe because it feels stationary and appears to be at the center of the universe. It was intuitively obvious that the stars were all part of a giant spinning sphere, with us at its center. To suggest the Earth was spinning like a top, the surface moving at nearly a thousand miles an hour, and that the entire Earth was hurtling through space— not to mention that stars are trillions of miles away— would have marked you as a lunatic. But that turned out to be the correct framework. Simple to understand, but intuitively incorrect.
Before Darwin (1809–1882), it seemed obvious that species are fixed in their forms. Crocodiles don't mate with hummingbirds; they are distinct and irreconcilable. The idea that species evolve went against not only religious teachings but also common sense. Evolution implies that you have a common ancestor with every living thing on this planet, including worms and the flowering plant in your kitchen. We now know this to be true, but intuition says otherwise.
I mention these famous examples because I believe that the quest for intelligent machines has also been burdened by an intuitive assumption that's hampering our progress. When you ask yourself, What does an intelligent system do?, it is intuitively obvious to think in terms of behavior. We demonstrate human intelligence through our speech, writing, and actions, right? Yes, but only to a point. Intelligence is something that is happening in your head. Behavior is an optional ingredient. This is not intuitively obvious, but it's not hard to understand either.
* * *
In the spring of 1986, as I sat at my desk day after day reading scientific articles, building my history of intelligence, and watching the evolving worlds of AI and neural networks, I found myself drowning in details. There was an unending supply of things to study and read about, but I was not gaining any clear understanding of how the whole brain actually worked or even what it did. This was because the field of neuroscience itself was awash in details. It still is. Thousands of research reports are published every year, but they tend to add to the heap rather than organize it. There's still no overall theory, no framework, explaining what your brain does and how it does it.
I started imagining what the solution to this problem would be like. Is it going to be extremely complex because the brain is so complex? Would it take one hundred pages of dense mathematics to describe how the brain works? Would we need to map out hundreds or thousands of separate circuits before anything useful could be understood? I didn't think so. History shows that the best solutions to scientific problems are simple and elegant. While the details may be forbidding and the road to a final theory may be arduous, the ultimate conceptual framework is generally simple.
Without a core explanation to guide inquiry, neuroscientists don't have much to go on as they try to assemble all the details they've collected into a coherent picture. The brain is incredibly complex, a vast and daunting tangle of cells. At first glance it looks like a stadium full of cooked spaghetti. It's also been described as an electrician's nightmare. But with close and careful inspection we see that the brain isn't a random heap. It has lots of organization and structure— but far too much of it for us to hope we'll be able to just intuit the workings of the whole, the way we're able to see how the shards of a broken vase fit back together. The problem isn't a lack of data or even of the right pieces of data; what we need is a shift in perspective. With the proper framework, the details will become meaningful and manageable. Consider the following fanciful analogy to get the flavor of what I mean.
Imagine that millennia from now humans have gone extinct, and explorers from an advanced alien civilization land on Earth. They want to figure out how we lived. They are especially puzzled by our network of roadways. What were these bizarre, elaborate structures for? They begin by cataloging everything, both via satellites and from the ground. They are meticulous archaeologists. They record the location of every stray fragment of asphalt, every signpost that has fallen over and been carried downhill by erosion, every detail they can find. They note that some road networks are different from others; in some places they are winding and narrow and almost random-looking, in some places they form a nice regular grid, and over some stretches they become thick and run for hundreds of miles through the desert. They collect a mountain of details, but these details don't mean anything to them. They continue to collect more detail in hopes of finding some new data that explain it all. They remain stumped for a long time.
That is, until one of them says, "Eureka! I think I see… these creatures couldn't teleport themselves like we can. They had to travel around from place to place, perhaps on mobile platforms of a cunning design." From this basic insight, many details begin to fall into place. The small twisty street networks are from early times when the means of conveyance were slow. The thick long roadways were for traveling long distances at high speeds, suggesting at last an explanation for why the signs on those roads had different numbers painted on them. The scientists start to deduce residential versus industrial zoning, the way the needs of commerce and transportation infrastructure might have interacted, and so on. Many of the details they had cataloged turn out to be not very relevant, just accidents of history or the requirements of local geography. The same amount of raw data exists, but it no longer is puzzling.
We can be confident that the same type of breakthrough will let us understand what all the brain's details are about.
* * *
Unfortunately, not everyone believes we can understand how the brain works. A surprising number of people, including a few neuroscientists, believe that somehow the brain and intelligence are beyond explanation. And some believe that even if we could understand them, it would be impossible to build machines that work the same way, that intelligence requires a human body, neurons, and perhaps some new and unfathomable laws of physics. Whenever I hear these arguments, I imagine the intellectuals of the past who argued against studying the heavens or fought against dissecting cadavers to see how our bodies worked. "Don't bother studying that, it will lead to no good, and even if you could understand how it works there is nothing we can do with that knowledge." Arguments like this one lead us to a branch of philosophy called functionalism, our last stop in this brief history of our thinking about thinking.
According to functionalism, being intelligent or having a mind is purely a property of organization and has nothing inherently to do with what you're organized out of. A mind exists in any system whose constituent parts have the right causal relationship with each other, but those parts can just as validly be neurons, silicon chips, or something else. Clearly, this view is standard issue to any would-be builder of intelligent machines.
Consider: Would a game of chess be any less real if it were played with a salt shaker standing in for a lost knight piece? Clearly not. The salt shaker is functionally equivalent to a "real" knight by virtue of how it moves on the board and interacts with the other pieces, so your game is truly a game of chess and not just a simulation of one. Or consider: wouldn't this sentence be the same if I were to go through it with my cursor deleting each character, then retyping it? Or to take an example closer to home, consider the fact that every few years your body replaces most of the atoms that comprise you. In spite of this, you remain yourself in all the ways that matter to you. One atom is as good as any other if it's playing the same functional role in your molecular makeup. The same story should hold for the brain: if a mad scientist were to replace each of your neurons with a functionally equivalent micromachine replica, you should come out of the procedure feeling no less your own true self than you had at the outset.
By this principle, an artificial system that used the same functional architecture as an intelligent, living brain should be likewise intelligent— and not just contrivedly so, but actually, truly intelligent.
AI proponents, connectionists, and I are all functionalists, insofar as we all believe there's nothing inherently special or magical about the brain that allows it to be intelligent. We all believe we'll be able to build intelligent machines, somehow, someday. But there are different interpretations of functionalism. While I've already stated what I consider the central failing of the AI and the connectionist paradigms— the input-output fallacy— there's a bit more worth saying about why we haven't yet been able to design intelligent machines. While the AI proponents take what I consider a self-defeating hard line, the connectionists, in my view, have mainly been just too timid.
AI researchers ask, "Why should we engineers be bound by the solutions evolution happened to stumble upon?" In principle, they have a point. Biological systems, like the brain and the genome, are viewed as notoriously inelegant. A common metaphor is that of the Rube Goldberg machine, named after the Depression-era cartoonist who drew comically overcomplicated contraptions to accomplish trivial tasks. Software designers have a related term, kludge, to refer to programs that are written without foresight and wind up full of burdensome, useless complexity, often to the point of becoming incomprehensible even to the programmers who wrote them. AI researchers fear the brain is similarly a mess, a several-hundred-million-year-old kludge, chock-full of inefficiencies and evolutionary "legacy code." If so, they wonder, why not just throw out the whole sorry clutter and start afresh?
Many philosophers and cognitive psychologists are sympathetic to this view. They love the metaphor of the mind being like software that's run by the brain, the organic analog of computer hardware. In computers, the hardware level and the software level are distinct from each other. The same software program can be made to run on any Universal Turing Machine. You can run WordPerfect on a PC, a Macintosh, or a Cray supercomputer, for example, even though all three systems have different hardware configurations. And the hardware has nothing of importance to teach you if you're trying to learn WordPerfect. By analogy, the thinking goes, the brain has nothing to teach us about the mind.
AI defenders also like to point out historical instances in which the engineering solution differs radically from nature's version. For example, how did we succeed in building flying machines? By imitating the flapping action of winged animals? No. We did it with fixed wings and propellers, and later with jet engines. It may not be how nature did it, but it works— and does so far better than flapping wings.
Similarly, we made a land vehicle that could outrun cheetahs not by making four-legged, cheetah-like running machines, but by inventing wheels. Wheels are a great way to move over flat terrain, and just because evolution never stumbled across that particular strategy doesn't mean it's not an excellent way for us to get around. Some philosophers of mind have taken a shine to the metaphor of the "cognitive wheel," that is, an AI solution to some problem that although entirely different from how the brain does it is just as good. In other words, a program that produces outputs resembling (or surpassing) human performance on a task in some narrow but useful way really is just as good as the way our brains do it.
I believe this kind of ends-justify-the-means interpretation of functionalism leads AI researchers astray. As Searle showed with the Chinese Room, behavioral equivalence is not enough. Since intelligence is an internal property of a brain, we have to look inside the brain to understand what intelligence is. In our investigations of the brain, and especially the neocortex, we will need to be careful in figuring out which details are just superfluous "frozen accidents" of our evolutionary past; undoubtedly, many Rube Goldberg–style processes are mixed in with the important features. But as we'll soon see, there is an underlying elegance of great power, one that surpasses our best computers, waiting to be extracted from these neural circuits.
Connectionists intuitively felt the brain wasn't a computer and that its secrets lie in how neurons behave when connected together. That was a good start, but the field barely moved on from its early successes. Although thousands of people worked on three-layer networks, and many still do, research on cortically realistic networks was, and remains, rare.
For half a century we've been bringing the full force of our species' considerable cleverness to trying to program intelligence into computers. In the process we've come up with word processors, databases, video games, the Internet, mobile phones, and convincing computer-animated dinosaurs. But intelligent machines still aren't anywhere in the picture. To succeed, we will need to crib heavily from nature's engine of intelligence, the neocortex. We have to extract intelligence from within the brain. No other road will get us there.
3
The Human Brain
So what makes the brain so unlike the programming that goes into AI and neural networks? What is so unusual about the brain's design, and why does it matter? As we'll see in the next few chapters, the brain's architecture has a great deal to tell us about how the brain really works and why it is fundamentally different from a computer.
Let's begin our introduction with the whole organ. Imagine there is a brain sitting on a table and we are dissecting it together. The first thing you'll notice is that the outer surface of the brain seems highly uniform. A pinkish gray, it resembles a smooth cauliflower with numerous ridges and valleys called gyri and sulci. It is soft and squishy to the touch. This is the neocortex, a thin sheet of neural tissue that envelops most of the older parts of the brain. We are going to focus most of our attention on the neocortex. Almost everything we think of as intelligence— perception, language, imagination, mathematics, art, music, and planning— occurs here. Your neocortex is reading this book.
Now, here I have to admit I am a neocortical chauvinist. I know I'm going to meet some resistance on this point, so let me take a minute to defend my approach before we get too far in. Every part of the brain has its own community of scientists who study it, and the suggestion that we can get to the bottom of intelligence by understanding just the neocortex is sure to raise a few howls of objection from communities of offended researchers. They will say things like, "You cannot possibly understand the neocortex without understanding brain region blah, because the two are highly interconnected like so, and you need brain region blah to do such and such." I don't disagree. Granted, the brain consists of many parts and most of them are critical to being human. (Oddly, an exception is the part of the brain with the largest number of cells, the cerebellum. If you are born without a cerebellum or it is damaged, you can lead a pretty normal life. However, this is not true for most other brain regions; most are required for basic living, or sentience.)
My counterargument is that I am not interested in building humans. I want to understand intelligence and build intelligent machines. Being human and being intelligent are separate matters. An intelligent machine need not have sexual urges, hunger, a pulse, muscles, emotions, or a humanlike body. A human is much more than an intelligent machine. We are biological creatures with all the necessary and sometimes unwanted baggage that comes from eons of evolution. If you want to build intelligent machines that behave just like humans— that is, to pass the Turing Test in all ways— then you probably would have to re-create much of the other stuff that makes humans what we are. But as we will see later, to build machines that are undoubtedly intelligent but not exactly like humans, we can focus on the part of the brain strictly related to intelligence.
To those who may be offended by my singular focus on the neocortex, let me say I agree that other brain structures, such as the brain stem, basal ganglia, and amygdala, are indeed important to the functioning of the human neocortex. No question. But I hope to convince you that all the essential aspects of intelligence occur in the neocortex, with important roles also played by two other brain regions, the thalamus and the hippocampus, that we will discuss later in the book. In the long run, we will need to understand the functional roles of all brain regions. But I believe those matters will be best addressed in the context of a good overall theory of neocortical function. That's my two cents on the matter. Let's get back to the neocortex, or its shorter moniker, the cortex.
Get six business cards or six playing cards— either will do— and put them in a stack. (It will really help if you do this instead of just imagining it.) You are now holding a model of the cortex. Your six business cards are about 2 millimeters thick and should give you a sense of how thin the cortical sheet is. Just like your stack of cards, the neocortex is about 2 millimeters thick and has six layers, each approximated by one card.
Stretched flat, the human neocortical sheet is roughly the size of a large dinner napkin. The cortical sheets of other mammals are smaller: the rat's is the size of a postage stamp; the monkey's is about the size of a business-letter envelope. But regardless of size, most of them contain six layers similar to what you see in your stack of business cards. Humans are smarter because our cortex, relative to body size, covers a larger area, not because our layers are thicker or contain some special class of "smart" cells. Its size is quite impressive as it surrounds and envelops most of the rest of the brain. To accommodate our large brain, nature had to modify our general anatomy. Human females developed a wide pelvis to give birth to big-headed children, a feature that some paleoanthropologists believe coevolved with the ability to walk on two legs. But that still wasn't enough, so evolution folded up the neocortex, stuffing it into our skulls like a sheet of paper crumpled into a brandy snifter.
Your neocortex is loaded with nerve cells, or neurons. They are so tightly packed that no one knows precisely how many cells it contains. If you draw a tiny square, one millimeter on a side (about half the size of this letter o), on the top of your stack of business cards, you are marking the position of an estimated one hundred thousand (100,000) neurons. Imagine trying to count the exact number in such a tiny space; it is virtually impossible. Nevertheless, some anatomists have estimated that the typical human neocortex contains around thirty billion neurons (30,000,000,000), but no one would be surprised if the figure was significantly higher or lower.
Those thirty billion cells are you. They contain almost all your memories, knowledge, skills, and accumulated life experience. After twenty-five years of thinking about brains, I still find this fact astounding. That a thin sheet of cells sees, feels, and creates our worldview is just short of incredible. The warmth of a summer day and the dreams we have for a better world are somehow the creation of these cells. Many years after he wrote his article in Scientific American, Francis Crick wrote a book about brains called The Astonishing Hypothesis. The astonishing hypothesis was simply that the mind is the creation of the cells in the brain. There is nothing else, no magic, no special sauce, only neurons and a dance of information. I hope you can get a sense of how incredible this realization is. There appears to be a large philosophical gulf between a collection of cells and our conscious experience, yet mind and brain are one and the same. In calling this a hypothesis, Crick was being politically correct. That the cells in our brains create the mind is a fact, not a hypothesis. We need to understand what these thirty billion cells do and how they do it. Fortunately, the cortex is not just an amorphous blob of cells. We can take a deeper look at its structure for ideas about how it gives rise to the human mind.
* * *
Let's go back to our dissection table and look at the brain some more. To the naked eye, the neocortex presents almost no landmarks. There are a few, to be sure, such as the giant fissure separating the two cerebral hemispheres and the prominent sulcus that divides the back and front regions. But just about everywhere you look, from left to right, from back to front, the convoluted surface looks pretty much the same. There are no visible boundary lines or color codes demarcating areas that specialize in different sensory information or different types of thought.
People have long known there are boundaries in there somewhere, though. Even before neuroscientists were able to discern anything helpful about the circuitry of the cortex, they knew that some mental functions were localized to certain regions of it. If a stroke knocks out Joe's right parietal lobe, he can lose his ability to perceive— or even conceive of— anything on the left side of his body or in the left half of space around himself. A stroke in the left frontal region known as Broca's area, by contrast, compromises his ability to use the rules of grammar, although his vocabulary and his ability to understand the meanings of words are unchanged. A stroke in an area called the fusiform gyrus can knock out the ability to recognize faces— Joe can't recognize his mother, his children, or even his own face in a photograph. Deeply fascinating disorders like these gave early neuroscientists the notion that the cortex consists of many functional regions or functional areas. The terms are equivalent.
We have learned a lot about functional areas in the past century, but much remains to be discovered. Each of these regions is semi-independent and seems to be specialized for certain aspects of perception or thought. Physically they are arranged in an irregular patchwork quilt, which varies only a little from person to person. Rarely are the functions cleanly delineated. Functionally they are arranged in a branching hierarchy.
The notion of a hierarchy is critical, so I want to take a moment to carefully define it. I'll be referring to it throughout the book. In a hierarchical system, some elements are in an abstract sense "above" and "below" others. In a business hierarchy, for example, a mid-level manager is above a mail clerk and below the vice president. This has nothing to do with physical aboveness or belowness; even if she works on a lower floor than the mail clerk, the manager is still "above" him hierarchically. I emphasize this point to make clear what I mean whenever I talk about one functional region being higher or lower than another. It has nothing to do with their physical arrangement in the brain. All the functional areas of the cortex reside in the same convoluted cortical sheet. What makes one region "higher" or "lower" than another is how they are connected to one another. In the cortex, lower areas feed information up to higher areas by way of a certain neural pattern of connectivity, while higher areas send feedback down to lower areas using a different connection pattern. There are also lateral connections between areas that are in separate branches of the hierarchy, like one mid-level manager communicating with her counterpart in a sister office in another state. A detailed map of the monkey cortex has been worked out by two scientists, Daniel Felleman and David van Essen. The map shows dozens of regions connected together in a complex hierarchy. We can assume the human cortex has a similar hierarchy.
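The essential point, that "higher" and "lower" are defined by connectivity rather than by physical position, can be captured in a few lines of code. The sketch below is a toy illustration, not the Felleman and van Essen map; the handful of regions and the simple level calculation are choices of mine, included only to show how a hierarchy can be defined purely by which region feeds which.

```python
# A toy sketch of a cortical-style hierarchy: a region's rank is defined only by
# its connections, not by where it physically sits. Regions and links shown here
# are a small illustrative fragment, not a real cortical map.

class Region:
    def __init__(self, name):
        self.name = name
        self.higher = []   # regions this one feeds forward to ("up")
        self.lower = []    # regions this one sends feedback to ("down")
        self.lateral = []  # peers in other branches of the hierarchy

def connect_up(lower, higher):
    """A feedforward link from lower to higher implies a feedback link the other way."""
    lower.higher.append(higher)
    higher.lower.append(lower)

# A fragment of the visual chain and a fragment of the auditory chain.
v1, v2, v4, it = Region("V1"), Region("V2"), Region("V4"), Region("IT")
a1, a2 = Region("A1"), Region("A2")
connect_up(v1, v2)
connect_up(v2, v4)
connect_up(v4, it)
connect_up(a1, a2)

# A lateral link between mid-level peers in separate branches.
v2.lateral.append(a2)
a2.lateral.append(v2)

def level(region):
    """A region's 'height' is its distance, by connections, from the bottom of its branch."""
    return 0 if not region.lower else 1 + max(level(r) for r in region.lower)

print(level(v1), level(it))   # -> 0 3: IT is "above" V1 only because of how they connect
```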
The lowest of the functional regions, the primary sensory areas, are where sensory information first arrives in the cortex. These regions process the information at its rawest, most basic level. For example, visual information enters the cortex through the primary visual area, called V1 for short. V1 is concerned with low-level visual features such as tiny edge segments, small-scale components of motion, binocular disparity (for stereo vision), and basic color and contrast information. V1 feeds information up to other areas, such as V2, V4, and IT (we'll have more to say about them later), and to a bunch of other areas besides. Each of these areas is concerned with more specialized or abstract aspects of the information. For example, cells in V4 respond to objects of medium complexity such as star shapes in different colors like red or blue. Another area called MT specializes in the motions of objects. In the higher echelons of the visual cortex are areas that represent your visual memories of all sorts of objects like faces, animals, tools, body parts, and so on.
Your other senses have similar hierarchies. Your cortex has a primary auditory area called A1 and a hierarchy of auditory regions above it, and it has a primary somatosensory (body sense) area called S1 and a hierarchy of somatosensory regions above that. Eventually, sensory information passes into "association areas," which is the name sometimes used for the regions of the cortex that receive inputs from more than one sense. For example, your cortex has areas that receive input from both vision and touch. It is thanks to association regions that you are able to be aware that the sight of a fly crawling up your arm and the tickling sensation you feel there share the same cause. Most of these areas receive highly processed input from several senses, and their functions remain unclear. I will have much to say about the cortical hierarchy later in the book.
There is yet another set of areas in the frontal lobes of the brain that creates motor output. The motor system of the cortex is also hierarchically organized. The lowest area, M1, sends connections to the spinal cord and directly drives muscles. Higher areas feed sophisticated motor commands to M1. The hierarchy of the motor area and the hierarchies of sensory areas look remarkably similar. They seem to be put together in the same way. In the motor region we think of information flowing down the hierarchy toward M1 to drive the muscles and in the sensory regions we think of information flowing up the hierarchy away from the senses. But in reality information flows both ways. What is referred to as feedback in sensory regions is the output of the motor region, and vice versa.
Most descriptions of brains are based on flowcharts that reflect an oversimplified view of hierarchies. That is, input (sights, sounds, touch) flows into the primary sensory areas and gets processed as it moves up the hierarchy, then gets passed through the association areas, then gets passed to the frontal lobes of the cortex, and finally gets passed back down to the motor areas. I'm not saying this view is completely wrong. When you read aloud, visual information does indeed enter at V1, flows up to association areas, makes its way over to the frontal motor cortex, and winds up making the muscles in your mouth and throat form the sounds of speech. However, that isn't all there is to it. It's just not that simple. In the oversimplified view I am cautioning against, the process is generally treated as though information flows in a single direction, like widgets being built on a factory assembly line. But information in the cortex always flows in the opposite direction as well, and with many more projections feeding back down the hierarchy than up. As you read aloud, higher regions of your cortex send more signals "down" to your primary visual cortex than your eye receives from the printed page! We'll get to what those feedback projections are doing in later chapters. For now, I want to impress upon you one fact: although the up hierarchy is real, we have to be careful not to think that the information flow is all one way.
Back at the dissection table, let's assume we set up a powerful microscope, cut a thin slice from the cortical sheet, stain some of the cells, and take a look at our handiwork through the eyepiece. If we stained all the cells in our slice, we'd see a solid black mass because the cells are so tightly packed and intermingled. But if we use a stain that marks a smaller fraction of cells, we can see the six layers I mentioned. These layers are formed by variations in the density of cell bodies, cell types, and their connections.
All neurons have features in common. Apart from a cell body, which is the roundish part you imagine when you think of a cell, they also have branching, wirelike structures called axons and dendrites. When the axon from one neuron touches the dendrite of another, they form small connections called synapses. Synapses are where the nerve impulse from one cell influences the behavior of another cell. A neural signal, or spike, arriving at a synapse can make it more likely for the recipient cell to spike. Some synapses have the opposite effect, making it less likely the recipient cell will spike. Thus synapses can be inhibitory or excitatory. The strength of a synapse can change depending on the behavior of the two cells. The simplest form of this synaptic change is that when two neurons generate a spike at nearly the same time, the connection strength between the two neurons will be increased. I will say more about this process, called Hebbian learning, a bit later. In addition to changing the strength of a synapse, there is evidence that entirely new synapses can be formed between two neurons. This may be happening all the time, although the scientific evidence is controversial. Regardless of the details of how synapses change their strength, what is certain is that the formation and strengthening of synapses is what causes memories to be stored.
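As a rough illustration of Hebbian learning, the sketch below strengthens a synapse whenever the two cells spike within a short window of each other. The time window, the learning rate, and the cap at 1.0 are arbitrary choices for the example, not measured biological values.

```python
# A minimal sketch of the Hebbian rule described above: if the cells on either
# side of a synapse spike at nearly the same time, strengthen the connection.

def hebbian_update(weight, pre_spike_times, post_spike_times,
                   window=0.02, learning_rate=0.05):
    """Return an updated synaptic weight given spike times in seconds."""
    for t_pre in pre_spike_times:
        for t_post in post_spike_times:
            if abs(t_pre - t_post) <= window:             # near-coincident spikes
                weight += learning_rate * (1.0 - weight)  # step toward a maximum of 1
    return weight

w = 0.2
w = hebbian_update(w,
                   pre_spike_times=[0.100, 0.300],
                   post_spike_times=[0.105, 0.800])
print(round(w, 3))   # only the 0.100/0.105 pair falls inside the window, so w rises once
```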
While there are many types of neurons in the neocortex, one broad class of them comprises eight out of every ten cells. These are the pyramidal neurons, so called because their cell bodies are shaped roughly like pyramids. Except for the top layer of the six-layered cortex, which has miles of axons but very few cells, every layer contains pyramidal cells. Each pyramidal neuron connects to many other neurons in its immediate neighborhood, and each sends a lengthy axon laterally out to more distant regions of cortex or down to lower brain structures like the thalamus.
A typical pyramidal cell has several thousand synapses. Again, it is very difficult to know exactly how many because of their extreme density and small size. The number of synapses varies from cell to cell, layer to layer, and region to region. If we were to take the conservative position that the average pyramidal cell has one thousand synapses (the actual number is probably close to five or ten thousand), then our neocortex would have roughly thirty trillion synapses altogether. That is an astronomically large number, well beyond our intuitive grasp. It is apparently sufficient to store all the things you can learn in a lifetime.
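For readers who want to check the arithmetic, the thirty-trillion figure is just the product of the two estimates above; the snippet below spells it out.

```python
# The back-of-the-envelope arithmetic behind the figure above, using the
# conservative assumption of one thousand synapses per cell.
neurons = 30_000_000_000           # roughly thirty billion neocortical neurons
synapses_per_neuron = 1_000        # the conservative estimate from the text
print(f"{neurons * synapses_per_neuron:,} synapses")   # 30,000,000,000,000
```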
* * *
According to rumor, Albert Einstein once said that conceiving the theory of special relativity was straightforward, almost easy. It followed naturally from a single observation: that the speed of light is constant to all observers even if the observers are moving at different speeds. This is counterintuitive. It is like saying the speed of a thrown ball is always the same regardless of how hard it is thrown or how fast the individuals throwing and observing the ball are moving. Everybody sees the ball moving at the same speed relative to them under all circumstances. It doesn't seem like it could be true. But it was proven to be true for light, and Einstein cleverly asked what the consequences of this bizarre fact were. He methodically thought about all the implications of a constant speed of light, and he was led to the even more bizarre predictions of special relativity, such as time slowing down as you move faster, and energy and mass being fundamentally the same thing. Books on relativity walk through his line of reasoning with everyday examples of trains, bullets, flashlights, and so forth. The theory isn't hard, but it is definitely counterintuitive.
There is an analogous discovery in neuroscience— a fact about the cortex that is so surprising that some neuroscientists refuse to believe it and most of the rest ignore it because they don't know what to make of it. But it is a fact of such importance that if you carefully and methodically explore its implications, it will unravel the secrets of what the neocortex does and how it works. In this case, the surprising discovery came from the basic anatomy of the cortex itself, but it took an unusually insightful mind to recognize it. That person was Vernon Mountcastle, a neuroscientist at Johns Hopkins University in Baltimore. In 1978 he published a paper titled "An Organizing Principle for Cerebral Function." In this paper, Mountcastle points out that the neocortex is remarkably uniform in appearance and structure. The regions of cortex that handle auditory input look like the regions that handle touch, which look like the regions that control muscles, which look like Broca's language area, which look like practically every other region of the cortex. Mountcastle suggests that since these regions all look the same, perhaps they are actually performing the same basic operation! He proposes that the cortex uses the same computational tool to accomplish everything it does.
All anatomists at that time, and for decades prior to Mountcastle, recognized that the cortex looks similar everywhere; this is undeniable. But instead of asking what that could mean, they spent their time looking for the differences between one area of cortex and another. And they did find differences. They assumed that if one region is used for language and another for vision, then there ought to be differences between those regions. If you look closely enough you find them. Regions of cortex vary in thickness, cell density, relative proportion of cell types, length of horizontal connections, synapse density, and many other ways that can be tricky to discover. One of the most-studied regions, the primary visual area V1, actually has a few extra divisions in one of its layers. The situation is analogous to the work of biologists in the 1800s. They spent their time discovering the minute differences between species. Success for them was finding that two mice that looked nearly identical were actually separate species. For many years Darwin followed the same course, often studying mollusks. But Darwin eventually had the big insight to ask how all these species could be so similar. It is their similarity that is surprising and interesting, much more so than their differences.
Mountcastle makes a similar observation. In a field of anatomists looking for minute differences in cortical regions, he shows that despite the differences, the neocortex is remarkably uniform. The same layers, cell types, and connections exist throughout. It looks like the six business cards everywhere. The differences are often so subtle that trained anatomists can't agree on them. Therefore, Mountcastle argues, all regions of the cortex are performing the same operation. The thing that makes the vision area visual and the motor area motoric is how the regions of cortex are connected to each other and to other parts of the central nervous system.
In fact, Mountcastle argues that the reason one region of cortex looks slightly different from another is because of what it is connected to, and not because its basic function is different. He concludes that there is a common function, a common algorithm, that is performed by all the cortical regions. Vision is no different from hearing, which is no different from motor output. He allows that our genes specify how the regions of cortex are connected, which is very specific to function and species, but the cortical tissue itself is doing the same thing everywhere.
Let's think about this for a moment. To me, sight, hearing, and touch seem very different. They have fundamentally different qualities. Sight involves color, texture, shape, depth, and form. Hearing has pitch, rhythm, and timbre. They feel very different. How can they be the same? Mountcastle says they aren't the same, but the way the cortex processes signals from the ear is the same as the way it processes signals from the eyes. He goes on to say that motor control works on the same principle, too.
Scientists and engineers have for the most part been ignorant of, or have chosen to ignore, Mountcastle's proposal. When they try to understand vision or make a computer that can "see," they devise vocabulary and techniques specific to vision. They talk about edges, textures, and three-dimensional representations. If they want to understand spoken language, they build algorithms based on rules of grammar, syntax, and semantics. But if Mountcastle is correct, these approaches are not how the brain solves these problems, and are therefore likely to fail. If Mountcastle is correct, the algorithm of the cortex must be expressed independently of any particular function or sense. The brain uses the same process to see as to hear. The cortex does something universal that can be applied to any type of sensory or motor system.
When I first read Mountcastle's paper I nearly fell out of my chair. Here was the Rosetta stone of neuroscience— a single paper and a single idea that united all the diverse and wondrous capabilities of the human mind. It united them under a single algorithm. In one step it exposed the fallacy of all previous attempts to understand and engineer human behavior as diverse capabilities. I hope you can appreciate how radical and wonderfully elegant Mountcastle's proposal is. The best ideas in science are always simple, elegant, and unexpected, and this is one of the best. In my opinion it was, is, and will likely remain the most important discovery in neuroscience. Incredibly, though, most scientists and engineers either refuse to believe it, choose to ignore it, or aren't aware of it.
* * *
Part of this neglect stems from a poverty of tools for studying how information flows within the six-layered cortex. The tools we do have operate on a grosser level and are generally aimed at locating where in the cortex, as opposed to when and how, various capabilities arise. For example, much of the neuroscience reported in the popular press these days implicitly favors the idea of the brain as a collection of highly specialized modules. Functional imaging techniques like functional MRI and PET scanning focus almost exclusively on brain maps and the functional regions I mentioned earlier. Typically in these experiments, a volunteer subject lies down with his or her head inside the scanner and performs some kind of mental or motor task. It might be playing a video game, generating verb conjugations, reading sentences, looking at faces, naming pictures, imagining something, memorizing lists, making financial decisions, and so on. The scanner detects which brain regions are more active than usual during these tasks and draws colored splotches over an image of the subject's brain to pinpoint them. These regions are presumably central to the task. Thousands of functional imaging experiments have been done and thousands more will follow. Through the course of it all, we are gradually building up a picture of where certain functions happen in the typical adult brain. It is easy to say, "this is the face recognition area, this is the math area, this is the music area," and so on. Since we don't know how the brain accomplishes these tasks, it is natural to assume that the brain carries out the various activities in different ways.
But does it? A growing and fascinating body of evidence supports Mountcastle's proposal. Some of the best examples demonstrate the extreme flexibility of the neocortex. Any human brain, if nourished properly and put in the right environment, can learn any of thousands of spoken languages. That same brain can also learn sign language, written language, musical language, mathematical language, computer languages, and body language. It can learn to live in frigid northern climes or in a scorching desert. It can become an expert in chess, fishing, farming, or theoretical physics. Consider the fact that you have a special visual area that seems to be specifically devoted to representing written letters and digits. Does this mean you were born with a language area ready to process letters and digits? Unlikely. Written language is far too recent an invention for our genes to have evolved a specific mechanism for it. So the cortex is still dividing itself into task-specific functional areas long into childhood, based purely on experience. The human brain has an incredible capacity to learn and adapt to thousands of environments that didn't exist until very recently. This argues for an extremely flexible system, not one with a thousand solutions for a thousand problems.
Neuroscientists have also found that the wiring of the neocortex is amazingly "plastic," meaning it can change and rewire itself depending on the type of inputs flowing into it. For example, newborn ferret brains can be surgically rewired so that the animals' eyes send their signals to the areas of cortex where hearing normally develops. The surprising result is that the ferrets develop functioning visual pathways in the auditory portions of their brains. In other words, they see with brain tissue that normally hears sounds. Similar experiments have been done with other senses and brain regions. For instance, pieces of rat visual cortex can be transplanted around the time of birth to regions where the sense of touch is usually represented. As the rat matures, the transplanted tissue processes touch rather than vision. Cells were not born to specialize in vision or touch or hearing.
Human neocortex is every bit as plastic. Adults who are born deaf process visual information in areas that normally become auditory regions. And congenitally blind adults use the rearmost portion of their cortex, which ordinarily becomes dedicated to vision, to read braille. Since braille involves touch, you might think it would primarily activate touch regions— but apparently no area of cortex is content to represent nothing. The visual cortex, not receiving information from the eyes like it is "supposed" to, casts around for other input patterns to sift through— in this case, from other cortical regions.
All this goes to show that brain regions develop specialized functions based largely on the kind of information that flows into them during development. The cortex is not rigidly designed to perform different functions using different algorithms any more than the earth's surface was predestined to end up with its modern arrangement of nations. The organization of your cortex, like the political geography of the globe, could have turned out differently given a different set of early circumstances.
Genes dictate the overall architecture of the cortex, including the specifics of what regions are connected together, but within that structure the system is highly flexible.
Mountcastle was right. There is a single powerful algorithm implemented by every region of cortex. If you connect regions of cortex together in a suitable hierarchy and provide a stream of input, it will learn about its environment. Therefore, there is no reason for intelligent machines of the future to have the same senses or capabilities as we humans. The cortical algorithm can be deployed in novel ways, with novel senses, in a machined cortical sheet so that genuine, flexible intelligence emerges outside of biological brains.
* * *
Let's move on to a topic that is related to Mountcastle's proposal and is equally surprising. The inputs to your cortex are all basically alike. Again, you probably think of your senses as being completely separate entities. After all, sound is carried as compression waves through air, vision is carried as light, and touch is carried as pressure on your skin. Sound seems temporal, vision seems mainly pictorial, and touch seems essentially spatial. What could be more different than the sound of a bleating goat versus the sight of an apple versus the feel of a baseball?
But let's take a closer look. Visual information from the outside world is sent to your brain via a million fibers in your optic nerve. After a brief transit through the thalamus, they arrive at the primary visual cortex. Sounds are carried in via the thirty thousand fibers of your auditory nerve. They pass through some older parts of your brain and then arrive at your primary auditory cortex. Your spinal cord carries information about touch and internal sensations to your brain via another million fibers. They are received by your primary somatosensory cortex. These are the main inputs to your brain. They are how you sense the world.
You can visualize these inputs as a bundle of electrical wires or a bundle of optical fibers. You might have seen lamps made with optical fibers where pinpoints of colored light appear at the end of each fiber. The inputs to the brain are like this, but the fibers are called axons, and they carry neural signals called "action potentials" or "spikes," which are partly chemical and partly electrical. The sense organs supplying these signals are different, but once they are turned into brainbound action potentials, they are all the same— just patterns.
If you look at a dog, for example, a set of patterns will flow through the fibers of your optic nerve into the visual part of your cortex. If you listen to the dog bark, a different set of patterns will flow along your auditory nerve and into the hearing parts of your brain. If you pet the dog, a set of touch-sensation patterns will flow from your hand, through fibers in your spine, and into the parts of your brain that deal with touch. Each pattern— see the dog, hear the dog, feel the dog— is experienced differently because each gets channeled through a different path in the cortical hierarchy. It matters where the cables go to inside the brain. But at the abstract level of sensory inputs, these are all essentially the same, and are all handled in similar ways by the six-layered cortex. You hear sound, see light, and feel pressure, but inside your brain there isn't any fundamental difference between these types of information. An action potential is an action potential. These momentary spikes are identical regardless of what originally caused them. All your brain knows is patterns.
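One way to appreciate this point is to notice that once any sense has been reduced to a pattern of spikes, downstream code cannot tell which sense produced it. The toy sketch below encodes made-up readings from the retina, the cochlea, and the skin into the same kind of object; what would differ in a real brain is only where each stream is routed, which the sketch deliberately ignores.

```python
# A toy illustration of the point above: once each sense is encoded as spikes,
# downstream processing sees only patterns on axons, with no trace of whether
# they began as light, sound, or pressure. The readings are arbitrary stand-ins.

def encode(readings, threshold=0.5):
    """Turn any list of sensor readings into the same kind of thing: a spike pattern."""
    return tuple(1 if r > threshold else 0 for r in readings)

sight = encode([0.9, 0.1, 0.7, 0.3])    # brightness values from the retina
sound = encode([0.2, 0.8, 0.6, 0.1])    # activity across cochlear channels
touch = encode([0.0, 0.9, 0.9, 0.4])    # pressure sensors in the skin

# This loop cannot tell which pattern came from which sense organ.
for pattern in (sight, sound, touch):
    print(pattern)                       # all just tuples of 0s and 1s
```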
Your perceptions and knowledge about the world are built from these patterns. There's no light inside your head. It's dark in there. There's no sound entering your brain either. It's quiet inside. In fact, the brain is the only part of your body that has no senses itself. A surgeon could stick a finger into your brain and you wouldn't feel it. All the information that enters your mind comes in as spatial and temporal patterns on the axons.
What exactly do I mean by spatial and temporal patterns? Let's look at each of our main senses in turn. Vision carries both spatial and temporal information. Spatial patterns are coincident patterns in time; they are created when multiple receptors in the same sense organ are stimulated simultaneously. In vision, the sense organ is your retina. An image enters your pupil, gets inverted by your lens, hits your retina, and creates a spatial pattern. This pattern gets relayed to your brain. People tend to think that there's a little upside-down picture of the world going into your visual areas, but that's not how it works. There is no picture. It's not an image anymore. Fundamentally, it is just electrical activity firing in patterns. Its imagelike qualities get lost very rapidly as your cortex handles the information, passing components of the pattern up and down between different areas, sifting them, filtering them.
Vision also relies on temporal patterns, which means the patterns entering your eyes are constantly changing over time. But while the spatial aspect of vision is intuitively obvious, its temporal aspect is less apparent. About three times every second, your eyes make a sudden movement called a saccade. They fixate on one point, and then suddenly jump to another point. Every time your eyes move, the image on your retina changes. This means that the patterns carried into your brain are also changing completely with each saccade. And that's in the simplest possible case of you just sitting still looking at an unchanging scene. In real life, you constantly move your head and body and walk through continuously shifting environments. Your conscious impression is of a stable world full of objects and people that are easy to keep track of. But this impression is only made possible by your brain's ability to deal with a torrent of retinal images that never repeat a pattern exactly. Natural vision, experienced as patterns entering the brain, flows like a river. Vision is more like a song than a painting.
Many vision researchers ignore saccades and the rapidly changing patterns of vision. Working with anesthetized animals, they study how vision occurs when an unconscious animal fixates on a point. In doing so, they're taking away the time dimension. There's nothing wrong with that in principle; eliminating variables is a core element of the scientific method. But they're throwing away a central component of vision, what it actually consists of. Time needs a central place in a neuroscientific account of vision.
With hearing, we're used to thinking about sound's temporal aspect. It is intuitively obvious to us that sounds, spoken language, and music change over time. You can't listen to a song all at once any more than you can hear a spoken sentence instantaneously. A song only exists over time. Therefore we don't usually think of sound as a spatial pattern. In a way, it's the inverse of the case with vision: the temporal aspect is immediately apparent, but its spatial aspect is less obvious.
Hearing has a spatial component as well. You convert sounds into action potentials through a coiled-up organ in each ear called the cochlea. Tiny, opaque, spiral-shaped, and embedded in the hardest bone in the body, the temporal bone, the cochlea was deciphered more than half a century ago by a Hungarian physicist, Georg von Békésy. Building models of the inner ear, von Békésy discovered that each component of sound you hear causes a different portion of the cochlea to vibrate. High-frequency tones cause vibrations in the cochlea's stiff base. Low-frequency tones cause vibrations in the cochlea's floppier and wider outer portion. Mid-frequency tones vibrate intermediate segments. Each site on the cochlea is studded with neurons that fire as they are shaken. In daily life your cochleas are being vibrated by large numbers of simultaneous frequencies all the time. So each moment there is a new spatial pattern of stimulation along the length of each cochlea. Each moment a new spatial pattern streams up the auditory nerve. Again we see that this sensory information boils down to spatial-temporal patterns.
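The cochlea's trick of turning frequency into place can be caricatured in a few lines: map each frequency to a position along the organ, and a moment's mixture of tones becomes a spatial pattern. The logarithmic mapping, the frequency range, and the thirty-two sites below are illustrative choices, not physiology.

```python
import math

# A toy sketch of the frequency-to-place mapping described above: each frequency
# excites a different site along the cochlea, so a mixture of tones becomes a
# spatial pattern at every moment.

def cochlear_position(freq_hz, low=20.0, high=20_000.0, sites=32):
    """Map a frequency to one of `sites` positions: 0 = wide apex (low), sites-1 = stiff base (high)."""
    fraction = (math.log(freq_hz) - math.log(low)) / (math.log(high) - math.log(low))
    return min(sites - 1, max(0, int(fraction * sites)))

def spatial_pattern(frequencies_hz, sites=32):
    """One moment of sound: which cochlear sites are being shaken right now."""
    pattern = [0] * sites
    for f in frequencies_hz:
        pattern[cochlear_position(f, sites=sites)] = 1
    return pattern

# A low hum and a high whistle light up opposite ends of the cochlea.
print(spatial_pattern([60, 8_000]))
```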
People don't usually think of touch as a temporal phenomenon, but it is every bit as time-based as it is spatial. You can carry out an experiment to see for yourself. Ask a friend to cup his hand, palm face up, and close his eyes. Place a small ordinary object in his palm— a ring, an eraser, anything will do— and ask him to identify it without moving any part of his hand. He won't have a clue other than weight and maybe gross size. Then tell him to keep his eyes closed and move his fingers over the object. He'll most likely identify it at once. By allowing the fingers to move, you've added time to the sensory perception of touch. There's a direct analogy between the fovea at the center of your retina and your fingertips, both of which have high acuity. So touch, too, is like a song. Your ability to make complex use of touch, such as buttoning your shirt or unlocking your front door in the dark, depends on continuous time-varying patterns of touch sensation.
We teach our children that humans have five senses: sight, hearing, touch, smell, and taste. We really have more. Vision is more like three senses— motion, color, and luminance (black-and-white contrast). Touch has pressure, temperature, pain, and vibration. We also have an entire system of sensors that tell us about our joint angles and bodily position. It is called the proprioceptive system (proprio- has the same Latin root as proprietary and property). You couldn't move without it. We also have the vestibular system in the inner ear, which gives us our sense of balance. Some of these senses are richer and more apparent to us than others, but they all enter our brain as streams of spatial patterns flowing through time on axons.
Your cortex doesn't really know or sense the world directly. The only thing the cortex knows is the pattern streaming in on the input axons. Your perceived view of the world is created from these patterns, including your sense of self. In fact, your brain can't directly know where your body ends and the world begins. Neuroscientists studying body image have found that our sense of self is a lot more flexible than it feels. For example, if I give you a little rake and have you use it for reaching and grasping instead of using your hand, you will soon feel that it has become a part of your body. Your brain will change its expectations to accommodate the new patterns of tactile input. The rake is literally incorporated into your body map.
* * *
The idea that patterns from different senses are equivalent inside your brain is quite surprising, and although well understood, it still isn't widely appreciated. More examples are in order. The first one you can reproduce at home. All you need is a friend, a freestanding cardboard screen, and a fake hand. For your first time running this experiment, it would be ideal if you had a rubber hand, such as you might buy at a Halloween store, but it will also work if you just trace your hand on a sheet of blank paper. Lay your real hand on a tabletop a few inches away from the fake one and align them the same (fingertips pointed in the same direction, palms either both up or both down). Then place the screen between the two hands so that all you can see is the false one. While you stare at the fake hand, your friend's job is to simultaneously stroke both hands at corresponding points. For example, your friend could stroke both pinkies from knuckle to nail at the same speed, then issue three quick taps to the second joint of both index fingers with the same timing, then stroke a few light circles on the back of each hand, and so on. After a short time, areas in your brain where visual and somatosensory patterns come together— one of those association areas I mentioned earlier in this chapter— become confused. You will actually feel the sensations being applied to the dummy hand as if it were your own.
Another fascinating example of this "pattern equivalency" is called sensory substitution. It may revolutionize life for people who lose their sight in childhood, and might someday be a boon to people who are born blind. It also might spawn new machine interface technologies for the rest of us.
Realizing that the brain is all about patterns, Paul Bach-y-Rita, a professor of biomedical engineering at the University of Wisconsin, has developed a method for displaying visual patterns on the human tongue. Wearing this display device, blind persons are learning to "see" via sensations on the tongue.
Here is how it works. The subject wears a small camera on his forehead and a chip on his tongue. Visual images are translated pixel for pixel into points of pressure on the tongue. A visual scene that can be displayed as hundreds of pixels on a crude television screen can be turned into a pattern of hundreds of tiny pressure points on the tongue. The brain quickly learns to interpret the patterns correctly.
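To make the pixel-for-pixel idea concrete, here is a small Python sketch of the kind of translation involved. It is only an illustration of the principle; the grid size, the number of intensity levels, and the function names are my own assumptions, not the encoding the actual device uses.

```python
import numpy as np

def frame_to_tactile(frame, rows=20, cols=20, levels=8):
    """Downsample a grayscale camera frame (a 2-D array of 0-255 brightness
    values) into a coarse grid of discrete 'pressure' intensities, one value
    per stimulation point on the tongue display."""
    h, w = frame.shape
    grid = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            # Average the block of pixels that maps onto this tactile point.
            block = frame[r * h // rows:(r + 1) * h // rows,
                          c * w // cols:(c + 1) * w // cols]
            grid[r, c] = block.mean()
    # Quantize brightness into a handful of stimulation strengths.
    return np.floor(grid / 256 * levels).astype(int)

# Example: a fake 240x320 frame becomes a 20x20 grid of intensities 0-7.
print(frame_to_tactile(np.random.randint(0, 256, (240, 320))).shape)  # (20, 20)
```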
One of the first people to wear the tongue-mounted device is Erik Weihenmayer, a world-class athlete who went blind at age thirteen and who lectures widely about not letting blindness stop his ambitions. In 2002, Weihenmayer summited Mount Everest, becoming the first blind person ever to undertake, much less accomplish, such a goal.
In 2003, Weihenmayer tried on the tongue unit and saw images for the first time since his childhood. He was able to discern a ball rolling on the floor toward him, reach for a soft drink on a table, and play the game Rock, Paper, Scissors. Later he walked down a hallway, saw the door openings, examined a door and its frame, and noted that there was a sign on it. Images initially experienced as sensations on the tongue were soon experienced as images in space.
These examples show once again that the cortex is extremely flexible and that the inputs to the brain are just patterns. It doesn't matter where the patterns come from; as long as they correlate over time in consistent ways, the brain can make sense of them.
* * *
All of this shouldn't be too surprising if we take the view that patterns are all the brain knows about. Brains are pattern machines. It's not incorrect to express the brain's functions in terms of hearing or vision, but at the most fundamental level, patterns are the name of the game. No matter how different the activities of various cortical areas may seem from each other, the same basic cortical algorithm is at work. The cortex doesn't care if the patterns originated in vision, hearing, or another sense. It doesn't care if its inputs are from a single sensory organ or from four. Nor would it care if you happened to perceive the world with sonar, radar, or magnetic fields, or if you had tentacles rather than hands, or even if you lived in a world of four dimensions rather than three.
This means you don't need any one of your senses or any particular combination of senses to be intelligent. Helen Keller had no sight and no hearing, yet she learned language and became a more skillful writer than most sighted and hearing people. Here was a very intelligent person without two of our main senses, yet the incredible flexibility of the brain allowed her to perceive and understand the world as individuals with all five senses do.
This kind of remarkable flexibility in the human mind gives me high hopes for the brain-inspired technology we will create. When I think about building intelligent machines, I wonder, Why stick to our familiar senses? As long as we can decipher the neocortical algorithm and come up with a science of patterns, we can apply it to any system that we want to make intelligent. And one of the great features of neocortically inspired circuitry is that we won't need to be especially clever in programming it. Just as auditory cortex can become "visual" cortex in a rewired ferret, just as visual cortex finds alternative usage in blind people, a system running the neocortical algorithm will be intelligent based on whatever kinds of patterns we choose to give it. We will still need to be smart about setting up the broad parameters of the system, and we will need to train and educate it. But the billions of neural details involved in the brain's ability to have complex, creative thoughts will take care of themselves, as naturally as they do in our children.
Finally, the idea that patterns are the fundamental currency of intelligence leads to some interesting philosophical questions. When I sit in a room with my friends, how do I know they are there or even if they are real? My brain receives a set of patterns that are consistent with patterns I have experienced in the past. These patterns correspond to people I know, their faces, their voices, how they usually behave, and all kinds of facts about them. I have learned to expect these patterns to occur together in predictable ways. But when you come down to it, it's all just a model. All our knowledge of the world is a model based on patterns. Are we certain the world is real? It's fun and odd to think about. Several science-fiction books and movies explore this theme. This is not to say that the people or objects aren't really there. They are really there. But our certainty of the world's existence is based on the consistency of patterns and how we interpret them. There is no such thing as direct perception. We don't have a "people" sensor. Remember, the brain is in a dark quiet box with no knowledge of anything other than the time-flowing patterns on its input fibers. Your perception of the world is created from these patterns, nothing else. Existence may be objective, but the spatial-temporal patterns flowing into the axon bundles in our brains are all we have to go on.
This discussion highlights the sometimes-questioned relationship between hallucination and reality. If you can hallucinate sensations coming from a rubber hand and you can "see" via touch stimulation of your tongue, are you being equally "fooled" when you sense touch on your own hand or see with your eyes? Can we trust that the world is as it seems? Yes. The world really does exist in an absolute form very close to how we perceive it. However, our brains can't know about the absolute world directly.
The brain knows about the world through a set of senses, which can only detect parts of the absolute world. The senses create patterns that are sent to the cortex, and processed by the same cortical algorithm to create a model of the world. In this way, spoken language and written language are perceived remarkably similarly, despite being completely different at the sensory level. Likewise, Helen Keller's model of the world was very close to yours and mine, despite the fact that she had a greatly reduced set of senses. Through these patterns the cortex constructs a model of the world that is close to the real thing, and then, remarkably, holds it in memory. It is memory— what happens to those patterns after they enter the cortex— that we'll discuss in the next chapter.
4
Memory
As you read this book, walk down a crowded street, hear a symphony, or comfort a crying child, your brain is being flooded with the spatial and temporal patterns from all of your senses. The world is an ocean of constantly changing patterns that come lapping and crashing into your brain. How do you manage to make sense of the onslaught? Patterns stream in, pass through various parts of the old brain, and eventually arrive at the neocortex. But what happens to them when they enter the cortex?
From the dawn of the industrial revolution, people have viewed the brain as some sort of machine. They knew there weren't gears and cogs in the head, but it was the best metaphor they had. Somehow information entered the brain and the brain-machine determined how the body should react. During the computer age, the brain has been viewed as a particular type of machine, the programmable computer. And as we saw in chapter 1, AI researchers have stuck with this view, arguing that their lack of progress is only due to how small and slow computers remain compared to the human brain. Today's computers may be equivalent only to a cockroach brain, they say, but when we make bigger and faster computers they will be as intelligent as humans.
There is a largely ignored problem with this brain-as-computer analogy. Neurons are quite slow compared to the transistors in a computer. A neuron collects inputs from its synapses, and combines these inputs together to decide when to output a spike to other neurons. A typical neuron can do this and reset itself in about five milliseconds (5 ms), or around two hundred times per second. This may seem fast, but a modern silicon-based computer can do one billion operations in a second. This means a basic computer operation is five million times faster than the basic operation in your brain! That is a very, very big difference. So how is it possible that a brain could be faster and more powerful than our fastest digital computers? "No problem," say the brain-as-computer people. "The brain is a parallel computer. It has billions of cells all computing at the same time. This parallelism vastly multiplies the processing power of the biological brain."
I always felt this argument was a fallacy, and a simple thought experiment shows why. It is called the "one hundred–step rule." A human can perform significant tasks in much less time than a second. For example, I could show you a photograph and ask you to determine if there is a cat in the image. Your job would be to push a button if there is a cat, but not if you see a bear or a warthog or a turnip. This task is difficult or impossible for a computer to perform today, yet a human can do it reliably in half a second or less. But neurons are slow, so in that half a second, the information entering your brain can only traverse a chain one hundred neurons long. That is, the brain "computes" solutions to problems like this in one hundred steps or fewer, regardless of how many total neurons might be involved. From the time light enters your eye to the time you press the button, a chain no longer than one hundred neurons could be involved. A digital computer attempting to solve the same problem would take billions of steps. One hundred computer instructions are barely enough to move a single character on the computer's display, let alone do something interesting.
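The arithmetic behind the rule is simple enough to write down. A short Python calculation, using only the round numbers quoted above:

```python
# Round numbers from the text: a neuron needs about 5 ms per step,
# while a computer performs about one billion operations per second.
neuron_step = 5e-3        # seconds per neural "operation"
recognition_time = 0.5    # seconds to decide whether the photo shows a cat

print(recognition_time / neuron_step)    # 100.0 sequential neural steps, at most

cpu_ops_per_second = 1e9
print(cpu_ops_per_second * neuron_step)  # 5,000,000 computer operations per neural step
```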
But if I have many millions of neurons working together, isn't that like a parallel computer? Not really. Brains operate in parallel and parallel computers operate in parallel, but that's the only thing they have in common. Parallel computers combine many fast computers to work on large problems such as computing tomorrow's weather. To predict the weather you have to compute the physical conditions at many points on the planet. Each computer can work on a different location at the same time. But even though there may be hundreds or even thousands of computers working in parallel, the individual computers still need to perform billions or trillions of steps to accomplish their task. The largest conceivable parallel computer can't do anything useful in one hundred steps, no matter how large or how fast.
Here is an analogy. Suppose I ask you to carry one hundred stone blocks across a desert. You can carry one stone at a time and it takes a million steps to cross the desert. You figure this will take a long time to complete by yourself, so you recruit a hundred workers to do it in parallel. The task now goes a hundred times faster, but it still requires a minimum of a million steps to cross the desert. Hiring more workers— even a thousand workers— wouldn't provide any additional gain. No matter how many workers you hire, the problem cannot be solved in less time than it takes to walk a million steps. The same is true for parallel computers. After a point, adding more processors doesn't make a difference. A computer, no matter how many processors it might have and no matter how fast it runs, cannot "compute" the answer to difficult problems in one hundred steps.
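The analogy can be put in numbers. In the toy Python calculation below (my own figures, chosen to match the story), adding workers shortens the total job, but nothing ever gets done in fewer steps than one crossing of the desert:

```python
stones = 100
steps_per_crossing = 1_000_000

for workers in (1, 10, 100, 1000):
    trips_per_worker = -(-stones // workers)      # ceiling division
    total_steps_elapsed = trips_per_worker * steps_per_crossing
    print(workers, total_steps_elapsed)
# 1 worker:     100,000,000 steps of elapsed walking
# 100 workers:    1,000,000 steps
# 1000 workers:   1,000,000 steps -- no further gain; one crossing is the floor
```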
So how can a brain perform difficult tasks in one hundred steps that the largest parallel computer imaginable can't solve in a million or a billion steps? The answer is the brain doesn't "compute" the answers to problems; it retrieves the answers from memory. In essence, the answers were stored in memory a long time ago. It only takes a few steps to retrieve something from memory. Slow neurons are not only fast enough to do this, but they constitute the memory themselves. The entire cortex is a memory system. It isn't a computer at all.
* * *
Let me show, through an example, the difference between computing a solution to a problem and using memory to solve the same problem. Consider the task of catching a ball. Someone throws a ball to you, you see it traveling toward you, and in less than a second you snatch it out of the air. This doesn't seem too difficult— until you try to program a robot arm to do the same. As many a graduate student has found out the hard way, it seems nearly impossible. When engineers or computer scientists tackle this problem, they first try to calculate the flight of the ball to determine where it will be when it reaches the arm. This calculation requires solving a set of equations of the type you learn in high school physics. Next, all the joints of a robotic arm have to be adjusted in concert to move the hand into the proper position. This involves solving another set of mathematical equations more difficult than the first. Finally, this whole operation has to be repeated multiple times, for as the ball approaches, the robot gets better information about the ball's location and trajectory. If the robot waits to start moving until it knows exactly where the ball will arrive, it will be too late to catch it. It has to start moving to catch the ball when it has only a poor sense of its location, and it continually adjusts as the ball gets closer. A computer requires millions of steps to solve the numerous mathematical equations to catch the ball. And although a computer might be programmed to successfully solve this problem, the one hundred–step rule tells us that a brain solves it in a different way. It uses memory.
How do you catch the ball using memory? Your brain has a stored memory of the muscle commands required to catch a ball (along with many other learned behaviors). When a ball is thrown, three things happen. First, the appropriate memory is automatically recalled by the sight of the ball. Second, the memory actually recalls a temporal sequence of muscle commands. And third, the retrieved memory is adjusted as it is recalled to accommodate the particulars of the moment, such as the ball's actual path and the position of your body. The memory of how to catch a ball was not programmed into your brain; it was learned over years of repetitive practice, and it is stored, not calculated, in your neurons.
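As a cartoon of that three-part process (recall, play out the sequence, adjust), here is a toy Python sketch. Every name and number in it is invented for illustration; it is not a model of the motor system, only a contrast with solving trajectory equations:

```python
# The recognized situation retrieves a stored sequence of motor commands
# (learned, not computed), and each command is nudged by the latest sense
# of where the ball actually is.

STORED_SEQUENCES = {
    "ball_thrown_at_me": ["raise_arm", "open_hand", "track_ball", "close_hand"],
}

def adjust(command, ball_position):
    # Stand-in for tailoring a general command to the particulars of the moment.
    x, y = ball_position
    return f"{command} toward ({x:.1f}, {y:.1f})"

def catch(situation, ball_positions):
    sequence = STORED_SEQUENCES[situation]          # recalled from memory
    return [adjust(cmd, pos) for cmd, pos in zip(sequence, ball_positions)]

print(catch("ball_thrown_at_me",
            [(3.0, 2.0), (2.1, 1.8), (1.2, 1.5), (0.3, 1.2)]))
```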
You might be thinking, "Wait a minute. Each catch is slightly different. You just said the recalled memory gets continually adjusted to accommodate the variations of where the ball is on any particular throw… Doesn't that require solving the same equations we were trying to avoid?" It may seem so, but nature solved the problem of variation in a different and very clever way. As we'll see later in this chapter, the cortex creates what are called invariant representations, which handle variations in the world automatically. A helpful analogy might be to imagine what happens when you sit down on a water bed: the pillows and any other people on the bed are all spontaneously pushed into a new configuration. The bed doesn't compute how high each object should be elevated; the physical properties of the water and the mattress's plastic skin take care of the adjustment automatically. As we'll see in the next chapter, the design of the six-layered cortex does something similar, loosely speaking, with the information that flows through it.
* * *
So the neocortex is not like a computer, parallel or otherwise. Instead of computing answers to problems the neocortex uses stored memories to solve problems and produce behavior. Computers have memory too, in the form of hard drives and memory chips; however, there are four attributes of neocortical memory that are fundamentally different from computer memory:
- The neocortex stores sequences of patterns.
- The neocortex recalls patterns auto-associatively.
- The neocortex stores patterns in an invariant form.
- The neocortex stores patterns in a hierarchy.
We will discuss the first three differences in this chapter. I introduced the concept of hierarchy in the neocortex in chapter 3. In chapter 6, I will describe its significance and how it works.
The next time you tell a story, step back and consider how you can only relate one aspect of the tale at a time. You cannot tell me everything that happened all at once, no matter how quickly you talk or I listen. You need to finish one part of the story before you can move on to the next. This isn't only because spoken language is serial; written, oral, and visual storytelling all convey a narrative in a serial fashion. It is because the story is stored in your head in a sequential fashion and can only be recalled in the same sequence. You can't remember the entire story at once. In fact, it's almost impossible to think of anything complex that isn't a series of events or thoughts.
You may have noticed, too, that in telling a story some people can't get to the crux of it right away. They seem to ramble on with irrelevant details and tangents. This can be irritating. You want to scream, "Get to the point!" But they are chronicling the story as it happened to them, through time, and cannot tell it any other way.
Another example: I'd like you to imagine your home right now. Close your eyes and visualize it. In your imagination, go to the front door. Imagine what it looks like. Open your front door. Move inside. Now look to your left. What do you see? Look to the right. What is there? Go to your bathroom. What's on the right? What's on the left? What's in the top right drawer? What items do you keep in your shower? You know all these things plus thousands more and can recall them in great detail. These memories are stored in your cortex. You might say these things are all part of the memory of your home. But you can't think of them all at once. They are obviously related memories but there is no way you can bring to mind all of this detail at once. You have a thorough memory of your home; but to recall it you have to go through it in sequential segments, in much the same way as you experience it.
All memories are like this. You have to walk through the temporal sequence of how you do things. One pattern (approach the door) evokes the next pattern (go through the door), which evokes the next pattern (either go down the hall or ascend the stairs), and so on. Each is a sequence you've followed before. Of course, with a conscious effort I can change the order of how I describe my home to you. I can jump from basement to the second floor if I decide to focus on items in a nonsequential way. Yet once I start to describe any room or item I've chosen, I'm back to following a sequence. Truly random thoughts don't exist. Memory recall almost always follows a pathway of association.
You know the alphabet. Try saying it backward. You can't because you don't usually experience it backward. If you want to know what it's like to be a child learning the alphabet, try saying it in reverse. That's exactly what they're confronted with. It's really hard. Your memory of the alphabet is a sequence of patterns. It isn't something stored or recalled in an instant or in an arbitrary order. The same thing goes for the days of the week, the months of the year, your phone number, and countless other things.
Your memory for songs is a great example of temporal sequences in memory. Think of a tune you know. I like to use "Somewhere over the Rainbow," but any melody will suffice. You cannot imagine the entire song at once, only in sequence. You can start at the beginning or maybe with the chorus, and then you play through it, filling in the notes one after another. You can't recall the song backward, just as you can't recall it all at once. You were first exposed to "Somewhere over the Rainbow" as it played through time, and you can only recall it in the same way you learned it.
This applies to very low level sensory memories too. Consider your tactile memory for textures. Your cortex has memories of what it feels like to hold a fistful of gravel, slide your fingers over velvet, and press down on a piano key. These memories are based on sequences every bit as much as the alphabet and songs are; it's just that the sequences are shorter, spanning mere fractions of a second rather than many seconds or minutes. If I buried your hand in a bucket of gravel while you slept, when you woke up you wouldn't know what you were touching until you moved your fingers. Your memory for the tactile texture of gravel is based on pattern sequences across the pressure- and vibration-sensing neurons in your skin. These sequences are different from those you'd receive if your hand was buried in sand or Styrofoam pellets or dry leaves. As soon as you flexed your hand, the scraping and rolling of the pebbles would create the telltale pattern sequences of gravel and trigger the appropriate memory in your somatosensory cortex.
The next time you get out of the shower, pay attention to how you dry yourself off with a towel. I discovered that I dry myself off with nearly the exact same sequence of rubs, pats, and body positions each time. And via a pleasant experiment I discovered that my wife also follows a semirigid pattern when she steps out of the shower. You probably do too. If you follow a sequence, try changing it. You can will yourself to do it, but you need to stay focused. If your attention wanders, you'll fall back into your accustomed pattern.
All memories are stored in the synaptic connections between neurons. Given the very large number of things we have stored in our cortex, and that at any moment in time we can recall only a tiny fraction of these stored memories, it stands to reason that only a limited number of synapses and neurons in your brain are playing an active role in memory recall at any one time. As you start to recall what is in your home, one set of neurons becomes active, which then leads to another set of neurons being active, and so on. An adult human neocortex has an incredibly large memory capacity. But, even though we have stored so many things, we can only remember a few at any time and can only do so in a sequence of associations.
Here is a fun exercise. Try to recall details from your past, details of where you lived, places you visited, and people you knew. I find I can always uncover memories of things I haven't thought of in many years. There are thousands of detailed memories stored in the synapses of our brains that are rarely used. At any point in time we recall only a tiny fraction of what we know. Most of the information is sitting there idly waiting for the appropriate cues to invoke it.
Computer memory does not normally store sequences of patterns. It can be made to do so using various software tricks (such as when you store a song on your computer), but computer memory does not do this automatically. In contrast, the cortex does store sequences automatically. Doing so is an inherent aspect of the neocortical memory system.
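A few lines of Python show the difference in spirit. The toy class below (my own construction, not anything from the book) stores patterns only as learned transitions, so recall can proceed forward through a sequence but not backward or all at once, much like reciting the alphabet:

```python
from collections import defaultdict

class SequenceMemory:
    def __init__(self):
        self.next_of = defaultdict(list)   # pattern -> patterns that followed it

    def learn(self, sequence):
        for current, following in zip(sequence, sequence[1:]):
            if following not in self.next_of[current]:
                self.next_of[current].append(following)

    def recall(self, start, length):
        out = [start]
        while len(out) < length:
            followers = self.next_of.get(out[-1])
            if not followers:
                break
            out.append(followers[0])       # follow the first learned association
        return out

memory = SequenceMemory()
memory.learn(list("ABCDEFG"))
print(memory.recall("C", 4))   # ['C', 'D', 'E', 'F'] -- forward only, one step at a time
```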
* * *
Now let's consider the second key feature of our memory, its auto-associative nature. As we saw in chapter 2, the term simply means that patterns are associated with themselves. An auto-associative memory system is one that can recall complete patterns when given only partial or distorted inputs. This can work for both spatial and temporal patterns. If you see your child's shoes sticking out from behind the draperies, you automatically envision his or her entire form. You complete the spatial pattern from a partial version of it. Or imagine you see a person waiting for a bus but can only see part of her because she is standing partially behind a bush. Your brain is not confused. Your eyes only see parts of a body, but your brain fills in the rest, creating a perception of a whole person that's so strong you may not even realize you're only inferring.
You also complete temporal patterns. If you recall a small detail about something that happened long ago, the entire memory sequence can come flooding back into your mind. Marcel Proust's famous series of novels, Remembrance of Things Past, opened with the memory of how a madeleine cookie smelled— and he was off and running for a thousand-plus pages. During conversation we often can't hear all the words if we are in a noisy environment. No problem. Our brains fill in what they miss with what they expect to hear. It's well established that we don't actually hear all the words we perceive. Some people complete others' sentences aloud, but in our minds all of us are doing this constantly. And not just the ends of sentences, but the middles and beginnings as well. For the most part we are not aware that we're constantly completing patterns, but it's a ubiquitous and fundamental feature of how memories are stored in the cortex. At any time, a piece can activate the whole. This is the essence of auto-associative memories.
Your neocortex is a complex biological auto-associative memory. During each waking moment, each functional region is essentially waiting vigilantly for familiar patterns or pattern fragments to come in. You can be in deep thought about something, but the instant your friend appears your thoughts switch to her. This switch isn't something you chose to do.
The mere appearance of your friend forces your brain to start recalling patterns associated with her. It's unavoidable. After an interruption we frequently have to ask, "What was I thinking about?" A dinner conversation with friends follows a circuitous route of associations. The talk may start with the food in front of you, but the salad evokes an associated memory of your mother's salad at your wedding, which leads to a memory of someone else's wedding, which leads to a memory of where they went on their honeymoon, to the political problems in that part of the world, and so on. Thoughts and memories are associatively linked, and again, random thoughts never really occur. Inputs to the brain auto-associatively link to themselves, filling in the present, and auto-associatively link to what normally follows next. We call this chain of memories thought, and although its path is not deterministic, we are not fully in control of it either.
* * *
Now we can consider the third major attribute of neocortical memory: how it forms what are called invariant representations. I will cover the basic ideas of invariant representations in this chapter and, in chapter 6, the details of how the cortex creates them.
A computer's memory is designed to store information exactly as it is presented. If you copy a program from a CD to a hard disk, every byte is copied with 100 percent fidelity. A single error or discrepancy between the two copies might cause the program to crash. The memory in our neocortex is different. Our brain does not remember exactly what it sees, hears, or feels. We don't remember or recall things with complete fidelity— not because the cortex and its neurons are sloppy or error-prone but because the brain remembers the important relationships in the world, independent of the details. Let's look at several examples to illustrate this point.
As we saw in chapter 2, simple auto-associative memory models have been around for decades and, as I described above, the brain recalls memories auto-associatively. But there is a big difference between the auto-associative memories built by neural network researchers and those in the cortex. Artificial auto-associative memories do not use invariant representations and therefore they fail in some very basic ways. Imagine I have a picture of a face formed by a large collection of black-and-white dots. This picture is a pattern, and if I have an artificial auto-associative memory I can store many pictures of faces in the memory. Our artificial auto-associative memory is robust in that if I give it half a face or just a pair of eyes, it will recognize that part of the image and fill in the missing parts correctly. This exact experiment has been done several times. However, if I move each dot in the picture five pixels to the left, the memory completely fails to recognize the face. To the artificial auto-associative memory, it is a completely novel pattern, because none of the pixels between the previously stored pattern and the new pattern are aligned. You and I, of course, would have no difficulty seeing the shifted pattern as the same face. We probably wouldn't even notice the change. Artificial auto-associative memories fail to recognize patterns if they are moved, rotated, rescaled, or transformed in any of a thousand other ways, whereas our brains handle these variations with ease. How can we perceive something as being the same or constant when the input patterns representing it are novel and changing? Let's look at another example.
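The failure is easy to reproduce with the classic artificial auto-associative memory, a Hopfield-style network. The Python sketch below uses random +1/-1 patterns instead of face images and a one-dimensional shift instead of a five-pixel slide, but the behavior is the same in kind: a half-erased pattern is completed correctly, while a shifted copy of the very same pattern is treated as something new.

```python
import numpy as np

def train(patterns):
    """Hebbian storage: sum of outer products, zero diagonal."""
    n = patterns.shape[1]
    weights = np.zeros((n, n))
    for p in patterns:
        weights += np.outer(p, p)
    np.fill_diagonal(weights, 0)
    return weights / len(patterns)

def recall(weights, cue, steps=10):
    state = cue.astype(float).copy()
    for _ in range(steps):
        state = np.sign(weights @ state)
        state[state == 0] = 1.0
    return state

rng = np.random.default_rng(0)
stored = np.where(rng.standard_normal((3, 100)) >= 0, 1.0, -1.0)  # three 100-bit "pictures"
weights = train(stored)

partial = stored[0].copy()
partial[50:] = 1.0                                     # wipe out half the pattern
print(np.mean(recall(weights, partial) == stored[0]))  # ~1.0: completion from a fragment works

shifted = np.roll(stored[0], 5)                        # the same picture, shifted five positions
print(np.mean(recall(weights, shifted) == stored[0]))  # typically far below 1.0: the shift defeats it
```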
You are probably holding a book in your hands right now. As you move the book, or change the lighting, or reposition yourself in your chair, or fixate your eyes on different parts of the page, the pattern of light falling on your retina changes completely. The visual input you receive is different moment by moment and never repeats. In fact, you could hold this book for a hundred years and not once would the pattern on your retina, and therefore the pattern entering your brain, be exactly the same. Yet not for an instant do you have any doubt that you are holding a book, indeed the same book. Your brain's internal pattern representing "this book" does not change even though the stimuli informing you it's there are in constant flux. Hence we use the term invariant representation to refer to the brain's internal representation.
For another example, think of a friend's face. You recognize her every time you see her. It happens automatically in less than a second. It doesn't matter if she is two feet away, three feet away, or across the room. When she is close, her image occupies most of your retina. When she is far away, her image occupies a small portion of your retina. She can be facing you, turned a little to the side, or in profile. She might be smiling, squinting, or yawning. You might see her in bright light, in shade, or under strangely angled disco lights. Her visage can appear in countless positions and variations. For each one, the pattern of light falling on your retina is unique, yet in every case you know instantly that you are looking at her.
Let's pop the hood and look at what's going on in your brain to perform this amazing feat. We know from experiments that if we monitor the activity of neurons in the visual input area of your cortex, called V1, the pattern of activity is different for each different view of her face. Every time the face moves or your eyes make a new fixation, the pattern of activity in V1 changes, much like the changing pattern on the retina. However, if we monitor the activity of cells in your face recognition area— a functional region that's several steps higher than V1 in the cortical hierarchy— we find stability. That is, some set of the cells in the face recognition area remain active as long as your friend's face is anywhere in your field of vision (or even being conjured in your mind's eye), regardless of its size, position, orientation, scale, and expression. This stability of cell firing is an invariant representation.
Introspectively, this task seems so easy as to be hardly worth calling it a problem. It's as automatic as breathing. It seems trivial because we aren't consciously aware it is happening. And in some sense, it is trivial because our brains can solve it so quickly (remember the one hundred–step rule). However, the problem of understanding how your cortex forms invariant representations remains one of the biggest mysteries in all of science. How difficult, you ask? So much so that no one, not even using the most powerful computers in the world, has been able to solve it. And it isn't for a lack of trying.
Speculation on this problem has an ancient pedigree. It traces back to Plato, twenty-three centuries ago. Plato wondered how people are able to think and know about the world. He pointed out that real-world instances of things and ideas are always imperfect and are always different. For example, you have a concept of a perfect circle, yet you have never actually seen one. All drawings of circles are imperfect. Even if drafted with a geometer's compass a so-called circle is represented by a dark line, whereas the circumference of a true circle has no thickness at all. How then did you ever acquire the concept of a perfect circle? Or to take a more worldly case, think about your concept of dogs. Every dog you've ever seen is different from every other, and every time you see the same individual dog you see a different view of it. All dogs are different and you can never see any particular dog exactly the same way twice. Yet all of your various experiences with dogs get funneled into a mental concept of "dog" that is stable across all of them. Plato was perplexed. How is it possible that we learn and apply concepts in this world of infinitely various forms and ever-shifting sensations?
Plato's solution was his famous Theory of Forms. He concluded that our higher minds must be tethered to some transcendent plane of superreality, where fixed, stable ideas (Forms with a capital F) exist in timeless perfection. Our souls come from this mystical place before birth, he decided, which is where they learned about the Forms in the first place. After we're born we retain latent knowledge of them. Learning and understanding happen because real-world forms remind us of the Forms to which they correspond. You are able to know about circles and dogs because they respectively trigger your soul memories of Circle and Dog.
It's all quite loopy from a modern perspective. But if you strip away the high-flown metaphysics, you can see that he was really talking about invariance. His system of explanation was wildly off the mark, but his intuition that this was one of the most important questions we can ask about our own nature was a bull's-eye.
* * *
Lest you get the impression that invariance is all about vision, let's look at some examples in other senses. Consider your tactile sense. When you reach into your car's glove compartment to find your sunglasses, your fingers only have to brush against them for you to know you've found them. It doesn't matter which part of your hand makes the contact; it can be your thumb, any part of any finger, or your palm. And the contact can be with any part of the glasses, whether it's a lens, temple, hinge, or part of the frame. Just a second of moving any part of your hand over any portion of the glasses is sufficient for your brain to identify them. In each case, the stream of spatial and temporal patterns coming from your touch receptors is entirely different— different areas of your skin, different parts of the object— yet you snap up your sunglasses without a thought.
Or consider the sensorimotor task of putting the key in your car's ignition switch. The position of your seat, body, arm, and hand are slightly different each time. To you it feels like the same simple repetitive action day in, day out, but that's because you have an invariant representation of it in your brain. If you tried to make a robot that could enter the car and put in the key, you would quickly see how nearly impossible it is unless you made sure the robot was in the exact same position, and held the key in exactly the same way every time. And even if you could manage to do this, the robot would need to be reprogrammed for different cars. Robots and computer programs, like artificial auto-associative memories, are terrible at handling variation.
Another interesting example is your signature. Somewhere in your motor cortex, in your frontal lobe, you have an invariant representation of your autograph. Every time you sign your name, you use the same sequence of strokes, angles, and rhythms. This is true whether you sign it minutely with a fine-tipped pen, flamboyantly like John Hancock, in the air with your elbow, or clumsily with a pencil held between your toes. It comes out looking somewhat different each time, of course, especially under some of the awkward conditions I just named. Nevertheless, regardless of scale, writing implement, or combination of body parts, you always run the same abstract "motor program" to produce it.
From the signature example you can see that invariant representation in motor cortex is, in some ways, the mirror image of invariant representation in sensory cortex. On the sensory side, a wide variety of input patterns can activate a stable cell assembly that represents some abstract pattern (your friend's face, your sunglasses). On the motor side, a stable cell assembly representing some abstract motor command (catching a ball, signing your name) is able to express itself using a wide variety of muscle groups and respecting a wide variety of other constraints. This symmetry between perception and action is what we should expect if, as Mountcastle proposed, the cortex runs a single basic algorithm in all areas.
For a final example, let's return to sensory cortex and look at music again. (I like using memory of music as an example because it is easy to see all the issues the neocortex must solve.) Invariant representation in music is illustrated by your ability to recognize a melody in any key. The key a tune is played in refers to the musical scale the melody is built on. The same melody played in different keys starts on different notes. Once you choose the key for a rendition, you've determined the rest of the notes in the tune. Any melody can be played in any key. This means that each rendition of the "same" melody in a new key is actually an entirely different sequence of notes! Each rendition stimulates an entirely different set of locations on your cochlea, causing an entirely different set of spatial-temporal patterns to stream up into your auditory cortex… and yet you perceive the same melody in each case. Unless you have perfect pitch you cannot even distinguish the same song played in two different keys without hearing them back to back.
Think of the song "Somewhere over the Rainbow." You probably first learned it by hearing Judy Garland sing it in the movie The Wizard of Oz, but unless you have perfect pitch you probably can't recall the key she sang it in (A flat). If I sit down at a piano and start to play the song in a key in which you've never heard it— say, in D— it will sound like the same song. You won't notice that all the notes are different from those in the version you're familiar with. This means that your memory of the song must be in a form that ignores pitch. The memory must store the important relationships in the song, not the actual notes. In this case, the important relationships are the relative pitch of the notes, or "intervals." "Somewhere over the Rainbow" begins with an octave up, followed by a halftone down, followed by a major third down, and so on. The interval structure of the melody is the same for any rendition in any key. Your ability to easily recognize the song in any key indicates that your brain has stored it in this pitch-invariant form.
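Here is the same point as a few lines of Python. The note numbers are my own rough transcription of the opening phrase (written as MIDI pitch numbers); the exact notes don't matter, only the fact that the interval sequence survives the change of key:

```python
def to_intervals(notes):
    """Return the semitone steps between successive notes."""
    return [b - a for a, b in zip(notes, notes[1:])]

# "Somewhere over the Rainbow," opening phrase, in two different keys.
in_a_flat = [68, 80, 79, 75, 77, 79, 80]   # starting on A-flat
in_d      = [62, 74, 73, 69, 71, 73, 74]   # same tune, starting on D

print(to_intervals(in_a_flat))  # [12, -1, -4, 2, 2, 1]  (octave up, halftone down, major third down, ...)
print(to_intervals(in_d))       # [12, -1, -4, 2, 2, 1]  -- identical, though every note differs
```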
Similarly, the memory of your friend's face must also be stored in a form that is independent of any particular view. What makes her face recognizable are its relative dimensions, relative colors, and relative proportions, not how it appeared one instant last Tuesday at lunch. There are "spatial intervals" between the features of her face just as there are "pitch intervals" between the notes of a song. Her face is wide relative to her eyes. Her nose is short relative to the width of her eyes. The color of her hair and the color of her eyes have a similar relative relationship that stays constant even though in different lighting conditions their absolute colors change significantly. When you memorized her face, you memorized these relative attributes.
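The same trick works for "spatial intervals." In the toy Python sketch below (the measurements are invented), a face seen close up and far away produces completely different numbers on the retina, but the ratios, the part worth memorizing, come out the same:

```python
def proportions(face):
    """Reduce absolute measurements (in pixels) to scale-free ratios."""
    return {"eye_gap/width": face["eye_gap"] / face["width"],
            "nose_len/width": face["nose_len"] / face["width"]}

close_up = {"width": 300.0, "eye_gap": 96.0, "nose_len": 66.0}   # face filling the view
far_away = {"width": 50.0,  "eye_gap": 16.0, "nose_len": 11.0}   # same face across the room

print(proportions(close_up))   # {'eye_gap/width': 0.32, 'nose_len/width': 0.22}
print(proportions(far_away))   # {'eye_gap/width': 0.32, 'nose_len/width': 0.22}
```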
I believe a similar abstraction of form is occurring throughout the cortex, in every region. This is a general property of the neocortex. Memories are stored in a form that captures the essence of relationships, not the details of the moment. When you see, feel, or hear something, the cortex takes the detailed, highly specific input and converts it to an invariant form. It is the invariant form that is stored in memory, and it is the invariant form of each new input pattern that it gets compared to. Memory storage, memory recall, and memory recognition occur at the level of invariant forms. There is no equivalent concept in computers.
* * *
This brings up an interesting problem. In the next chapter I argue that an important function of the neocortex is to use its memory to make predictions. But given that the cortex stores invariant forms, how can it make specific predictions? Here are some examples to illustrate the problem and the solution.
Imagine it is 1890, and you are in a frontier town in the American West. Your sweetheart is taking the train from the East to join you in your new frontier home. You of course want to meet her at the station when she arrives. For a few weeks prior to her arrival day you keep track of when the trains come and go. There is no set schedule and as far as you can tell the train never arrives or leaves at the same time during the day. It is beginning to look as if you won't be able to predict when her train will arrive. But then you notice there is some structure to the trains' comings and goings. The train coming from the East arrives four hours after one leaves heading east. This four-hour gap is consistent day to day although the specific times vary greatly. On the day of her arrival, you keep an eye out for the eastbound train, and when you see it, you set your clock. After four hours you head for the station and meet her train just as it arrives. This parable illustrates both the problem the neocortex faces and the solution it uses to solve it.
The world as seen by your senses is never the same; like the arrival and departure time of the train, it is always different. The way you understand the world is by finding invariant structure in the constantly changing stream of input. However, this invariant structure alone is not sufficient to use as a basis for making specific predictions. Just knowing that the train arrives four hours after it departs doesn't allow you to show up on the platform exactly in time to greet your sweetheart. To make a specific prediction, the brain must combine knowledge of the invariant structure with the most recent details. Predicting the arrival time of the train requires recognizing the four-hour structure in the train schedule, and combining it with the detailed knowledge of what time the last eastbound train left.
When listening to a familiar song played on a piano, your cortex predicts the next note before it is played. But the memory of the song, as we've seen, is in a pitch-invariant form. Your memory tells you what interval is next, but says nothing, in and of itself, about the actual note. To predict the exact next note requires combining the next interval with the last specific note. If the next interval is a major third and the last note you heard was C, then you can predict the specific next note, E. You hear in your mind E, not "major third." And unless you've misidentified the song or the pianist slips up, your prediction is correct.
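In code, the combination is a one-liner: take the last specific note you heard and add the next interval stored in the pitch-invariant memory. (The note-name table is the standard twelve-tone spelling; nothing else is assumed.)

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def predict_next(last_note, next_interval_semitones):
    """Specific prediction = current detail (last note) + invariant memory (interval)."""
    return NOTE_NAMES[(NOTE_NAMES.index(last_note) + next_interval_semitones) % 12]

print(predict_next("C", 4))   # 'E' -- a major third up from C
```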
When you see your friend's face, your cortex fills in and predicts the myriad details of her unique image at that instant. It checks that her eyes are just right, that her nose, lips, and hair are exactly as they should be. Your cortex makes these predictions with great specificity. It can predict low-level details about her face even though you have never seen her in this particular orientation or environment before. If you know exactly where your friend's eyes and nose are, and you know the structure of her face, then you can predict exactly where her lips should be. If you know her skin is being tinged orange by the light of sunset, then you know what color her hair should appear. Once again, your brain does this by combining a memory of the invariant structure of her face with the particulars of your immediate experience.
The train schedule example is just an analogy of what is going on in your cortex, but the melody and face examples are not. The combining of invariant representations and current input to make detailed predictions is exactly what is happening. It is a ubiquitous process that happens in every region of cortex. It is how you make specific predictions about the room you are sitting in right now. It is how you are able to predict not only the words others will say, but also in what tone of voice they will say them, the accent they will use, and where in the room you expect to hear the voice come from. It is how you know precisely when your foot will hit the floor, and what it will feel like when you climb a set of stairs. It is how you can sign your name with your foot, or catch a thrown ball.
The three properties of cortical memory discussed in this chapter (storing sequences, auto-associative recall, and invariant representations) are necessary ingredients to predict the future based on memories of the past. In the next chapter I propose that making predictions is the essence of intelligence.
5
A New Framework
of Intelligence
One day in April 1986 I was contemplating what it means to "understand" something. For months I had been struggling with the fundamental question What do brains do if they aren't generating behavior? What does a brain do when it is passively listening to speech? What is your brain doing right now while it is reading? Information goes into the brain but doesn't come out. What happens to it? Your behaviors at the moment are probably basic— such as breathing and eye movements— yet, as you are aware, your brain is doing a lot more than that as you read and understand these words. Understanding must be the result of neural activity. But what? What are the neurons doing when they understand?
As I looked around my office that day, I saw familiar chairs, posters, windows, plants, pencils, and so on. There were hundreds of items and features all around me. My eyes saw them as I glanced around, yet just seeing them didn't cause me to perform any action. No behavior was invoked or required, yet somehow I "understood" the room and its contents. I was doing what Searle's Chinese Room couldn't do, and I didn't have to pass anything back through a slot. I understood, but had no action to prove it. What did it mean to "understand"?
It was while pondering this dilemma that I had an "aha" insight, one of those emotionally powerful moments when suddenly what was a tangle of confusion becomes clear and understood. All I did was ask what would happen if a new object, one I had never seen before, appeared in the room— say, a blue coffee cup.
The answer seemed simple. I would notice the new object as not belonging. It would catch my attention as being new. I needn't consciously ask myself if the coffee cup was new. It would just jump out as not belonging. Underlying that seemingly trivial answer is a powerful concept. To notice that something is different, some neurons in my brain that weren't active before would have to become active. How would these neurons know that the blue coffee cup was new and the hundreds of other objects in the room were not? The answer to this question still surprises me. Our brains use stored memories to constantly make predictions about everything we see, feel, and hear. When I look around the room, my brain is using memories to form predictions about what it expects to experience before I experience it. The vast majority of predictions occur outside of awareness. It's as if different parts of my brain were saying, "Is the computer in the middle of the desk? Yes. Is it black? Yes. Is the lamp in the right-hand corner of the desk? Yes. Is the dictionary where I left it? Yes. Is the window rectangular and the walls vertical? Yes. Is sunlight coming from the correct direction for the time of day? Yes." But when some visual pattern comes in that I had not memorized in that context, a prediction is violated. And my attention is drawn to the error.
Of course, the brain doesn't talk to itself while making predictions, and it doesn't make predictions in a serial fashion. It also doesn't just make predictions about distinct objects like coffee cups. Your brain constantly makes predictions about the very fabric of the world we live in, and it does so in a parallel fashion. It will just as readily detect an odd texture, a misshapen nose, or an unusual motion. It isn't immediately apparent how pervasive these mostly unconscious predictions are, which is perhaps why we missed their importance for so long. They happen so automatically, so easily, we fail to fathom what is happening inside our skulls. I hope to impress on you the power of this idea. Prediction is so pervasive that what we "perceive"— that is, how the world appears to us— does not come solely from our senses. What we perceive is a combination of what we sense and of our brains' memory-derived predictions.
* * *
Minutes later I conceived a thought experiment to help convey what I understood at that moment. I call it the altered door experiment. Here is how it goes.
When you come home each day, you usually take a few seconds to go through your front door or whichever door you use. You reach out, turn the knob, walk in, and shut it behind you. It's a firmly established habit, something you do all the time and pay little attention to. Suppose while you are out, I sneak over to your home and change something about your door. It could be almost anything. I could move the knob over by an inch, change a round knob into a thumb latch, or turn it from brass to chrome. I could change the door's weight, substituting solid oak for a hollow door, or vice versa. I could make the hinges squeaky and stiff, or make them glide frictionlessly. I could widen or narrow the door and its frame. I could change its color, add a knocker where the peephole used to be, or add a window. I can imagine a thousand changes that could be made to your door, unbeknownst to you. When you come home that day and attempt to open the door, you will quickly detect that something is wrong. It might take you a few seconds' reflection to realize exactly what is wrong, but you will notice the change very quickly. As your hand reaches for the moved knob, you will realize that it is not in the correct location. Or when you see the door's new window, something will appear odd. Or if the door's weight has been changed, you will push with the wrong amount of force and be surprised. The point is that you will notice any of a thousand changes in a very short period of time.
How do you do that? How do you notice these changes? The AI or computer engineer's approach to this problem would be to create a list of all the door's properties and put them in a database, with fields for every attribute a door can have and specific entries for your particular door. When you approach the door, the computer would query the entire database, looking at width, color, size, knob position, weight, sound, and so on. While this may sound superficially similar to how I described my brain checking each of its myriad predictions as I glanced around my office, the difference is real and far-reaching. The AI strategy is implausible. First, it is impossible to specify in advance every attribute a door can have. The list is potentially endless. Second, we would need to have similar lists for every object we encounter every second of our lives. Third, nothing we know about brains and neurons suggests that this is how they work. And finally, neurons are just too slow to implement computer-style databases. It would take you twenty minutes instead of two seconds to notice the change as you go through the door.
There is only one way to interpret your reaction to the altered door: your brain makes low-level sensory predictions about what it expects to see, hear, and feel at every given moment, and it does so in parallel. All regions of your neocortex are simultaneously trying to predict what their next experience will be. Visual areas make predictions about edges, shapes, objects, locations, and motions. Auditory areas make predictions about tones, direction to source, and patterns of sound. Somatosensory areas make predictions about touch, texture, contour, and temperature.
"Prediction" means that the neurons involved in sensing your door become active in advance of them actually receiving sensory input. When the sensory input does arrive, it is compared with what was expected. As you approach the door, your cortex is forming a slew of predictions based on past experience. As you reach out, it predicts what you will feel on your fingers, when you will feel the door, and at what angle your joints will be when you actually touch the door. As you start to push the door open, your cortex predicts how much resistance the door will offer and how it will sound. When your predictions are all met, you'll walk through the door without consciously knowing these predictions were verified. But if your expectations about the door are violated, the error will cause you to take notice. Correct predictions result in understanding. The door is normal. Incorrect predictions result in confusion and prompt you to pay attention. The door latch is not where it's supposed to be. The door is too light. The door is off center. The texture of the knob is wrong. We are making continuous low-level predictions in parallel across all our senses.
But that's not all. I am arguing a much stronger proposition. Prediction is not just one of the things your brain does. It is the primary function of the neocortex, and the foundation of intelligence. The cortex is an organ of prediction. If we want to understand what intelligence is, what creativity is, how your brain works, and how to build intelligent machines, we must understand the nature of these predictions and how the cortex makes them. Even behavior is best understood as a by-product of prediction.
* * *
I don't know who was the first person to suggest that prediction is key to understanding intelligence. In science and industry no one invents anything completely new. Rather, people see how existing ideas fit into new frameworks. The components of a new idea are usually floating around in the milieu of scientific discourse prior to its discovery. What is usually new is the packaging of these components into a cohesive whole. Similarly, the idea that a primary function of the cortex is to make predictions is not entirely new. It has been floating around in various forms for some time. But it has not yet assumed its rightful position at the center of brain theory and the definition of intelligence.
Ironically, some of the pioneers of artificial intelligence had a notion of computers building a model of the world and using it to make predictions. In 1956, for example, D. M. Mackay argued that intelligent machines should have an "internal response mechanism" designed to "match what is received." He didn't use the words "memory" and "prediction" but he was thinking along the same lines.
Since the mid-1990s, terms such as inference, generative models, and prediction have crept into the scientific nomenclature. They all refer to related ideas. As an example, in his 2001 book, i of the vortex, Rodolfo Llinas, at the New York University School of Medicine, wrote, "The capacity to predict the outcome of future events— critical to successful movement— is, most likely, the ultimate and most common of all global brain functions." Scientists such as David Mumford at Brown University, Rajesh Rao at the University of Washington, Stephen Grossberg at Boston University, and many more have written and theorized about the role of feedback and prediction in various ways. There is an entire subfield of mathematics devoted to Bayesian networks. Named after Thomas Bayes, an English minister born in 1702 who was a pioneer in statistics, Bayesian networks use probability theory to make predictions.
What has been lacking is putting these disparate bits and pieces into a coherent theoretical framework. This, I argue, has not been done before, and it is the goal of this book.
* * *
Before we get into detail about how the cortex makes predictions, let's consider some additional examples. The more you think about this idea, the more you'll realize that prediction is pervasive and the basis for how you understand the world.
This morning I made pancakes. At one point in the process, I reached under the counter to open a cabinet door. I intuitively knew, without seeing, what I would feel— in this case, the cabinet doorknob— and when I would feel it. I twisted the top of the milk container with the expectation that it would turn and then come free. I turned on the griddle expecting the knob to push in a slight amount, then turn with a certain resistance. I expected to hear the gentle fwoomp of the gas flame about a second later. Every minute in the kitchen I made dozens or hundreds of motions, and each one involved many predictions. I know this because if any of those common motions had had a different result from the expected one, I would have noticed it.
Every time you put your foot down while you are walking, your brain predicts when your foot will stop moving and how much "give" the material you step on will have. If you have ever missed a step on a flight of stairs, you know how quickly you realize something is wrong. You lower your foot and the moment it "passes through" the anticipated stair tread you know you are in trouble. The foot doesn't feel anything, but your brain made a prediction and the prediction was not met. A computer-driven robot would blissfully fall over, not realizing that anything was amiss, while you would know as soon as your foot continued for even a fraction of an inch beyond the spot where your brain had expected it to stop.
When you listen to a familiar melody, you hear the next note in your head before it occurs. When you listen to a favorite album, you hear the beginning of each next song a couple of seconds before it starts. What's happening? Neurons in your brain that will fire when you hear that next note fire in advance of your actually hearing it, and so you "hear" the song in your head. The neurons fire in response to memory. This memory can be surprisingly long lasting. It is not uncommon to listen to an album of music for the first time in many years and still hear the next song automatically after the previous song has ended. And it creates a pleasant sensation of mild uncertainty when you listen to your favorite CD on random shuffle; you know your prediction of the next song is wrong.
When listening to people speak, you often know what they're going to say before they've finished speaking— or at least you think you know! Sometimes we don't even listen to what the speaker actually says and instead hear what we expect to hear. (This happened to me so often when I was a child that my mother twice took me to a doctor to have my hearing checked.) You experience this in part because people tend to use common phrases or expressions in much of their conversation. If I say, "How now brown…," your brain will activate neurons that represent the word cow before I say it (though if English is not your native language, you may have no idea what I am talking about). Of course, we don't know all the time what others are going to say. Prediction is not always exact. Rather, our minds work by making probabilistic predictions concerning what is about to happen. Sometimes we know exactly what is going to happen, other times our expectations are distributed among several possibilities. If we were eating at a table in a diner and I said, "please pass me the…," your brain would not be surprised if I next said, "salt," or "pepper," or "mustard." In some sense your brain predicts all these possible outcomes at once. However, if I said, "Please pass me the sidewalk," you would know something is wrong.
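The "please pass me the…" example amounts to holding a probability distribution over possible continuations rather than a single guess. Here is a minimal sketch of that idea; the words, the probabilities, and the surprise threshold are all invented for illustration.

```python
# A toy probabilistic prediction over the next word in "Please pass me the ...".
# The candidate words and probabilities are invented for illustration only.

next_word = {"salt": 0.4, "pepper": 0.3, "mustard": 0.2, "napkin": 0.1}

def surprising(word, distribution, threshold=0.05):
    """A heard word is surprising if it falls outside the predicted distribution."""
    return distribution.get(word, 0.0) < threshold

for heard in ["salt", "mustard", "sidewalk"]:
    status = "violates prediction" if surprising(heard, next_word) else "expected"
    print(f"{heard}: {status}")
```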
Returning to music, we can see probabilistic prediction here as well. If you are listening to a song you have never heard before, you can still have fairly strong expectations. In Western music I expect a regular beat, I expect a repeated rhythm, I expect phrases to last the same number of measures, and I expect songs to end on the tonic pitch. You may not know what these terms mean, but— assuming you have listened to similar music— your brain automatically predicts beats, repeated rhythms, completion of phrases, and ends of songs. If a new song violates these principles, you know immediately that something is wrong. Think about this for a second. You hear a song that you have never heard before, your brain experiences a pattern it has never experienced before, and yet you make predictions and can tell if something is wrong. The basis of these mostly unconscious predictions is the set of memories that are stored in your cortex. Your brain can't say exactly what will happen next, but it nevertheless predicts which note patterns are likely to happen and which aren't.
We have all had the experience of suddenly noticing that a source of constant background noise, such as a distant jackhammer or droning Muzak, has just ceased— yet we hadn't noticed the sound while it was ongoing. Your auditory areas were predicting its continuation, moment after moment, and as long as the noise didn't change you paid it no heed. By ceasing, it violated your prediction and attracted your attention. Here's a historical example. Right after New York City stopped running elevated trains, people called the police in the middle of the night claiming that something woke them up. They tended to call around the time the trains used to run past their apartments.
We like to say that seeing is believing. Yet we see what we expect to see as often as we see what is really there. One of the most fascinating examples of this has to do with what researchers call filling in. You may already know that you have a small blind spot in each eye, where the optic nerve exits the retina through a hole called the optic disk. You have no photoreceptors in this area, so you are permanently blind in the corresponding spot in your visual field. There are two reasons why you don't usually notice this, one mundane, the other instructive. The mundane reason is that your two blind spots don't overlap, so one eye compensates for the other.
But interestingly, you still don't notice your blind spot when only one eye is open. Your visual system "fills in" the missing information. When you close one eye and look at a richly woven Turkish carpet or the wavy contours of wood grain in a cherry tabletop, you don't see a hole. Entire nodes in the carpet, whole dark knots in the wood grain are constantly winking out of your retina's view as your blind spot happens to cover them, but your experience is of a seamless stretch of textures and colors. Your visual cortex is drawing on memories of similar patterns and is making a continuous stream of predictions that fill in for any missing input.
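Filling in can be thought of as prediction applied to a gap: the missing values are supplied by whichever stored pattern best matches the parts that are visible. The tiny sketch below is only an analogy, with made-up "textures" and a crude matching rule; it is not a model of the visual cortex.

```python
# Toy "filling in": the value hidden by the blind spot (None) is replaced by
# the corresponding value from the stored pattern that best matches the
# visible surroundings. The patterns are invented for illustration.

stored_patterns = [
    ["knot", "grain", "grain", "knot", "grain"],    # cherry tabletop
    ["red",  "blue",  "red",   "blue", "red"],      # woven carpet
]

def fill_in(observed):
    def match(stored):
        # Count how many visible positions agree with the stored pattern.
        return sum(o == s for o, s in zip(observed, stored) if o is not None)
    best = max(stored_patterns, key=match)
    return [s if o is None else o for o, s in zip(observed, best)]

# The blind spot is covering position 2; memory supplies what "should" be there.
print(fill_in(["knot", "grain", None, "knot", "grain"]))
```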
Filling in occurs in all parts of the visual image, not just your blind spot. For example, suppose I show you a picture of a shore with a driftwood log lying on some rocks. The boundary between the rocks and the log looks clear and obvious. However, if we magnify the image, you will see that the rocks and the log are similar in texture and color where they meet. In the enlarged view, the edge of the log isn't distinguishable from the rocks at all. When we look at the entire scene, the edge of the log seems clear, but in reality we infer it from the rest of the image. When we look at the world, we perceive clean lines and boundaries separating objects, but the raw data entering our eyes are often noisy and ambiguous. Our cortex fills in the missing or messy sections with what it thinks should be there. We perceive an unambiguous image.
Prediction in vision is also a function of the way your eyes move. In chapter 3, I mentioned saccades. About three times every second, your eyes fixate on one point, then suddenly jump to another point. Generally you are not aware of these movements, and you don't normally consciously control them. And each time your eyes fixate on a new point, the pattern entering your brain from the eyes changes completely from the last fixation. Thus, three times a second your brain sees something completely different. Saccades are not entirely random. When you look at a face your eyes typically fixate first on one eye, then on the other, going back and forth and occasionally fixating on the nose, mouth, ears, and other features. You perceive just "face," but the eyes see eye, eye, nose, mouth, eye, and so on. I realize it doesn't feel this way to you. What you're aware of is a continuous view of the world, but the raw data entering your head are as jerky as a badly wielded camcorder.
Now imagine that you met someone with an extra nose where an eye should be. Your eyes fixate first on the one eye and then saccade to the second eye, but instead of seeing an eye you see a nose. You would definitely know that something was wrong. For this to happen, your brain has to have an expectation or prediction of what it is about to see. When you predict eye but see nose, the prediction is violated. So several times a second, concurrent with every saccade, your brain makes a prediction about what it will see next. When that prediction is wrong, your attention is immediately aroused. This is why we have difficulty not looking at people with deformities. If you saw a person with two noses, wouldn't you have trouble not staring? Of course, if you lived with that person, then after a period of time you would get used to two noses and not notice it as unusual anymore.
Think about yourself right now. What predictions are you making? As you turn the pages of this book, you have expectations that the pages bend a certain amount and move in predictable ways that are different from the way the cover moves. If you are sitting, you are predicting that the feelings of pressure on your body will persist; but if the seat suddenly felt wet, began drifting backward, or underwent any other unexpected change, you would stop paying attention to the book and try to figure out what was happening. If you spend some time observing yourself, you can begin to understand that your perception of the world, your understanding of the world, is intimately tied to prediction. Your brain has made a model of the world and is constantly checking that model against reality. You know where you are and what you are doing by the validity of this model.
Prediction is not limited to patterns of low-level sensory information like seeing and hearing. Up to now I've limited the discussion to such examples because they are the easiest way to introduce this framework for understanding intelligence. However, according to Mountcastle's principle, what is true of low-level sensory areas must be true for all cortical areas. The human brain is more intelligent than that of other animals because it can make predictions about more abstract kinds of patterns and longer temporal pattern sequences. To predict what my wife will say when she sees me, I must know what she has said in the past, that today is Friday, that the recycling bin has to be put on the curb on Friday nights, that I didn't do it on time last week, and that her face has a certain look. When she opens her mouth, I have a pretty strong prediction of what she will say. In this case, I don't know what the exact words will be, but I do know she will be reminding me to take out the recycling. The important point is that higher intelligence is not a different kind of process from perceptual intelligence. It rests fundamentally on the same neocortical memory and prediction algorithm.
Notice that our intelligence tests are in essence prediction tests. From kindergarten through college, I.Q. tests are based on making predictions. Given a sequence of numbers, what should the next number be? Given three different views of a complex object, which of the following is also a view of the object? Word A is to word B as word C is to what word?
Science is itself an exercise in prediction. We advance our knowledge of the world through a process of hypothesis and testing. This book is in essence a prediction about what intelligence is and how brains work. Even product design is fundamentally a predictive process. Whether designing clothes or mobile phones, designers and engineers try to predict what competitors will do, what consumers will want, how much a new design will cost, and what fashions will be in demand.
Intelligence is measured by the capacity to remember and predict patterns in the world, including language, mathematics, physical properties of objects, and social situations. Your brain receives patterns from the outside world, stores them as memories, and makes predictions by combining what it has seen before and what is happening now.
* * *
At this point you might be thinking: "I accept that my brain makes predictions and I can be intelligent just lying in the dark. As you point out, I don't need to act in order to understand or be intelligent. But aren't situations like that the exception? Are you really arguing that intelligent understanding and behavior are completely separate? In the end, isn't behavior, not prediction, what makes us intelligent? After all, behavior is the ultimate determiner of survival."
This is a fair question, and of course, in the end, behavior is what matters most to the survival of an animal. Prediction and behavior are not completely separate, but their relationship is subtle. First, the neocortex appeared on the evolutionary scene after animals had already evolved sophisticated behaviors. Therefore, the survival value of the cortex must first be understood in terms of the incremental improvements it could bestow upon the animals' existing behaviors. Behavior came first, then intelligence. Second, most of what we sense is heavily dependent on what we do and how we move in the world. Therefore prediction and behavior are closely related. Let's look at these issues.
Mammals evolved a large neocortex because it gave them some survival advantage, and such an advantage must ultimately be rooted in behavior. But in the beginning, the cortex served to make more efficient use of existing behaviors, not to create entirely new behaviors. To make the case clear, we need to take a look at how our brains evolved.
Simple nervous systems emerged not long after multicellular creatures started squiggling all over the Earth, hundreds of millions of years ago, but the story of real intelligence begins more recently with our reptilian forebears. The reptiles were successful in their conquest of the land. They spread over every continent and diversified into numerous species. They had keen senses and well-developed brains that endowed them with complex behavior. Their direct descendants, today's surviving reptiles, still have them. An alligator, for example, has sophisticated senses just like you and me. It has well-developed eyes, ears, nose, mouth, and skin. It carries out complex behaviors including the ability to swim, run, hide, hunt, ambush, sun, nest, and mate.
What is the difference between a human brain and a reptile brain? A lot and a little. I say a little because, to a rough approximation, everything in a reptile's brain exists in a human brain. I say a lot because a human brain has something really important that a reptile does not have: a large cortex. You sometimes hear people refer to the "old" brain or the "primitive" brain. Every human has these more ancient structures in the brain, just like a reptile. They regulate blood pressure, hunger, sex, emotions, and many aspects of movement. When you stand, balance, and walk, for example, you are relying heavily on the old brain. If you hear a frightening sound, panic, and start to run, that is mostly your old brain. You don't need more than a reptile brain to do a lot of interesting and useful things. So what does the neocortex do if it isn't strictly required to see, hear, and move?
Mammals are more intelligent than reptiles because of their neocortex. (The name joins the Greek neo, "new," to the Latin cortex, "bark" or "rind," because the neocortex literally covers the old brain like a rind.) The neocortex first appeared tens of millions of years ago and only mammals have one. What makes humans smarter than other mammals is primarily the large area of our neocortex— which expanded dramatically only a couple of million years ago. Remember, the cortex is built using a common repeated element. The human cortical sheet is the same thickness and has very nearly the same structure as the cortex in our mammal relatives. When evolution makes something big very quickly, as it did with the human cortex, it does so by copying an existing structure. We got smart by adding many more elements of a common cortical algorithm. There is a common misconception that the human brain is the pinnacle of billions of years of evolution. This may be true if we think of the entire nervous system. However, the human neocortex itself is a relatively new structure and hasn't been around long enough to undergo much long-term evolutionary refinement.
Here then is the core of my argument on how to understand the neocortex, and why memory and prediction are the keys to unlocking the mystery of intelligence. We start with the reptilian brain with no cortex. Evolution discovers that if it tacks on a memory system (the neocortex) to the sensory path of the primitive brain, the animal gains an ability to predict the future. Imagine the old reptilian brain is still doing its thing, but now sensory patterns are simultaneously fed into the neocortex. The neocortex stores this sensory information in its memory. At a future time when the animal encounters the same or a similar situation, the memory recognizes the input as similar and recalls what happened in the past. The recalled memory is compared with the sensory input stream. It both "fills in" the current input and predicts what will be seen next. By comparing the actual sensory input with recalled memory, the animal not only understands where it is but can see into the future.
Now imagine that the cortex not only remembers what the animal has seen but also remembers the behaviors the old brain performed when it was in a similar situation. We don't even have to assume the cortex knows the difference between sensations and behavior; to the cortex they are both just patterns. When our animal finds itself in the same or a similar situation, it not only sees into the future but recalls which behaviors led to that future vision. Thus, memory and prediction allow an animal to use its existing (old brain) behaviors more intelligently.
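Restating the argument in code-like form: the cortex is a memory that stores whatever patterns flow past it, sensory and motor alike, keyed by the situation in which they occurred, and replays them when a similar situation recurs. In the sketch below, the feature sets, the overlap-based similarity test, and the stored "what happened / what I did" records are illustrative simplifications; the text only requires that the same or a similar input recall the stored pattern.

```python
# Toy "memory tacked onto the sensory stream": the cortex stores patterns
# (it does not distinguish sensations from behaviors) keyed by the situation
# in which they occurred, and recalls them when a similar situation recurs.
# The similarity measure below is a deliberately crude stand-in.

cortex = {}   # situation (frozenset of features) -> list of stored patterns

def store(situation, pattern):
    cortex.setdefault(frozenset(situation), []).append(pattern)

def recall(situation, min_overlap=2):
    """Return patterns stored under any sufficiently similar situation."""
    current = set(situation)
    hits = []
    for stored_situation, patterns in cortex.items():
        if len(stored_situation & current) >= min_overlap:
            hits.extend(patterns)
    return hits

# First encounter: the old brain explores; the cortex just records.
store({"dark", "corner", "smell_of_cheese"},
      {"what_happened": "found cheese", "what_I_did": "turned right"})

# Later, a similar (not identical) situation recalls both the outcome
# and the behavior that led to it.
print(recall({"dark", "corner", "draft"}))
```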
For example, imagine you're a rat learning to navigate a maze for the first time. Aroused by uncertainty or hunger, you will use the skills inherent to your old brain to explore the new environment— listening, looking, sniffing, and creeping close to the walls. All this sensory information is used by your old brain but is also passed up to your neocortex, where it is stored. At some future time, you find yourself in the same maze. Your neocortex will recognize the current input as one it has seen before and recall the stored patterns representing what happened in the past. In essence, it allows you to see a short way into the future. If you were a talking rat, you might say, "Oh, I recognize this maze, and I remember this corner." As your neocortex recalls what happened in the past, you will envision finding the cheese you saw last time you were in the maze, and how you got to it. "If I turn right here, I know what will happen next. There's a piece of cheese down at the end of this hallway. I see it in my imagination." When you scurry through the maze, you rely on older, primitive structures to carry out movements like lifting your feet and sweeping your whiskers. With your (relatively) big neocortex, you can remember the places you have been, recognize them again in the future, and make predictions about what will happen next. A lizard without a neocortex has a much poorer ability to remember the past and may have to search a maze anew every time. You (the rat) understand the world and the immediate future because of your cortical memory. You see vivid images of the rewards and dangers that lie ahead of each decision, and so you move more effectively through your world. You can literally see the future.
But notice you are not performing any particularly complex or fundamentally new behaviors. You are not building yourself a hang glider and flying to the cheese at the end of the hallway. Your neocortex is forming predictions about sensory patterns that allow you to see into the future, but your palette of available behaviors is pretty much unaffected. Your ability to scurry, clamber, and explore is still a lot like that of a lizard.
As the cortex got larger over evolutionary time, it was able to remember more and more about the world. It could form more memories, and make more predictions. The complexity of those memories and predictions also increased. But something else remarkable happened that led to the uniquely human abilities for intelligent behavior.
Human behavior transcends the old basic repertoire of moving around with ratlike skills. We have taken neocortical evolution to a new level. Only humans create written and spoken language. Only humans cook their food, sew clothes, fly planes, and build skyscrapers. Our motor and planning abilities vastly exceed those of our closest animal relatives. How can the cortex, which was designed to make sensory predictions, generate the incredibly sophisticated behavior unique to humans? And how could this superior behavior evolve so suddenly? There are two answers to this question. One is that the neocortical algorithm is so powerful and flexible that with a little bit of rewiring, unique to humans, it can create new, sophisticated behaviors. The other answer is that behavior and prediction are two sides of the same thing. Although the cortex can envision the future, it can make accurate sensory predictions only if it knows what behaviors are being performed.
In the simple example of the rat looking for the cheese, the rat remembers the maze and uses this memory to predict that it will see the cheese around the corner. But the rat could turn left or turn right; only by simultaneously remembering the cheese and the correct behavior, "turn right at the fork," can the rat make the prediction of the cheese come true. Although this is a trivial example, it gets to the essence of how sensory prediction and behavior are intimately related. All behavior changes what we see, hear, and feel. Most of what we sense at any moment is highly dependent on our own actions. Move your arm in front of your face. To predict seeing your arm, your cortex has to know that it has commanded the arm to move. If the cortex saw your arm moving without the corresponding motor command, you would be surprised. The simplest way to interpret this would be to assume your brain first moves the arm and then predicts what it will see. I believe this is wrong. Instead I believe the cortex predicts seeing the arm, and this prediction is what causes the motor commands to make the prediction come true. You think first, which causes you to act to make your thoughts come true.
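The claim that "you think first, which causes you to act" is, at bottom, a claim about ordering: the sensory prediction comes first and the motor command is derived from it. The sketch below is a deliberate caricature of that ordering, with invented lookup tables; it is not meant to describe real motor circuitry.

```python
# Caricature of "prediction drives behavior": the cortex first forms the
# sensory prediction ("I will see my arm in front of my face"), and the
# motor command is whatever makes that prediction come true.
# Both lookup tables are invented for illustration.

def predict_outcome(intention):
    return {"raise arm": "see arm in front of face",
            "head toward cheese": "see cheese at end of hallway"}[intention]

def command_for(predicted_outcome):
    return {"see arm in front of face": "contract shoulder and elbow flexors",
            "see cheese at end of hallway": "turn right at the fork"}[predicted_outcome]

prediction = predict_outcome("raise arm")   # the thought comes first ...
motor_command = command_for(prediction)     # ... and the action fulfills it
print(prediction, "->", motor_command)
```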
Now we want to look at the changes that led to humans having a greatly expanded behavioral repertoire. Are there physical differences between a monkey's cortex and a human's cortex that can explain why only humans have language and other complex behaviors? The human brain is about three times larger than the chimpanzee's. But there is more to it than "bigger is better." A key to understanding the leap in human behavior is found in the wiring between regions of cortex and parts of the old brain. Put most simply, our brains are connected up differently.
Let's take a closer look. Everyone is familiar with the brain's left and right hemispheres. But there is another division that is less well known, and it is where we need to look for human differences. All brains, especially large ones, divide the cortex into a front half and a back half. Scientists use the words anterior for the front and posterior for the back. Separating the front and the back is a large fissure called the central sulcus. The back part of the cortex contains the sections where the eyes, ears, and touch inputs arrive. It is where sensory perception largely occurs. The front part contains regions of cortex that are involved in high-level planning and thought. It also contains the motor cortex, the section of brain most responsible for moving muscles and therefore creating behavior.
As the primate neocortex became larger over time, the anterior half got disproportionately larger, especially so in humans. Compared with other primates and early hominids, we have enormous foreheads designed to contain our very large anterior cortex. But this enlargement alone is not enough to explain the improvement in our motor ability as compared with that of other creatures. Our ability to make exceptionally complex movements stems from the fact that our motor cortex makes many more connections with the muscles in our bodies. In other mammals, the front cortex plays a less direct role in motor behavior. Most animals rely largely on the older parts of the brain for generating their behavior. In contrast, the human cortex usurped most of the motor control from the rest of the brain. If you damage the motor cortex of a rat, the rat may not have noticeable deficits. If you damage the motor cortex of a human, he or she becomes paralyzed.
People often ask me about dolphins. Don't they have huge brains? The answer is yes; a dolphin has a large neocortex. Dolphin cortex has a simpler structure (three layers versus our six) than a human neocortex, but by any other measure it is large. It is likely a dolphin can remember and understand lots of things. It can recognize other individual dolphins. It probably has an excellent memory of its own life, in an autobiographical sense. It probably knows every nook and cranny of the ocean it's ever been to. But although dolphins exhibit some sophisticated behaviors, their behavior doesn't come close to our own. So we can surmise that their cortex has a less dominant influence on their behavior. The point is that the cortex evolved primarily to provide a memory of the world. An animal with a large cortex could perceive the world much as you and I do. But humans are unique in the dominant, advanced role the cortex plays in our behavior. It is why we have complex language and intricate tools whereas other animals don't. It is why we can write novels, surf the Internet, send probes to Mars, and build cruise ships.
Now we can see the entire picture. Nature first created animals such as reptiles with sophisticated senses and sophisticated but relatively rigid behaviors. It then discovered that by adding a memory system and feeding the sensory stream into it, the animal could remember past experiences. When the animal found itself in the same or a similar situation, the memory would be recalled, leading to a prediction of what was likely to happen next. Thus, intelligence and understanding started as a memory system that fed predictions into the sensory stream. These predictions are the essence of understanding. To know something means that you can make predictions about it.
The cortex evolved in two directions. First it got larger and more sophisticated in the types of memories it could store; it was able to remember more things and make predictions based on more complex relationships. Second, it started interacting with the motor system of the old brain. To predict what you will hear, see, and feel next, it needed to know what actions were being taken. With humans the cortex has taken over most of our motor behavior. Instead of just making predictions based on the behavior of the old brain, the human neocortex directs behavior to satisfy its predictions.
The human cortex is particularly large and therefore has a massive memory capacity. It is constantly predicting what you will see, hear, and feel, mostly in ways you are unconscious of. These predictions are our thoughts, and, when combined with sensory input, they are our perceptions. I call this view of the brain the memory-prediction framework of intelligence.
If Searle's Chinese Room contained a similar memory system that could make predictions about what Chinese characters would appear next and what would happen next in the story, we could say with confidence that the room understood Chinese and understood the story. We can now see where Alan Turing went wrong. Prediction, not behavior, is the proof of intelligence.
We are now ready to delve into the details of this new idea of the memory-prediction framework of the brain. To make predictions of future events, your neocortex has to store sequences of patterns. To recall the appropriate memories, it has to retrieve patterns by their similarity to past patterns (auto-associative recall). And, finally, memories have to be stored in an invariant form so that the knowledge of past events can be applied to new situations that are similar but not identical to the past. How the physical cortex accomplishes these tasks, plus a fuller exploration of its hierarchy, is the subject of the next chapter.
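Those three requirements, storing sequences of patterns, recalling them auto-associatively from a partial or noisy cue, and storing them in an invariant form, can be roughed out in a few lines of code. Everything in the sketch, from the lowercasing that stands in for "invariance" to the word-overlap similarity score, is a placeholder for mechanisms the next chapter describes in biological terms.

```python
# A rough sketch of the three requirements of the memory-prediction framework:
#  1. store sequences of patterns,
#  2. recall them auto-associatively (from a partial, similar cue),
#  3. store them in an invariant form so similar-but-not-identical input matches.
# The "invariant form" here is just lowercasing; real invariance is far richer.

stored_sequences = []

def invariant(pattern):
    return pattern.strip().lower()

def learn(sequence):
    stored_sequences.append([invariant(p) for p in sequence])

def predict_next(cue):
    """Auto-associative recall: find the stored element most similar to the
    cue and return what came next in its sequence."""
    cue = invariant(cue)
    best, best_score = None, 0
    for seq in stored_sequences:
        for i, p in enumerate(seq[:-1]):
            score = len(set(cue.split()) & set(p.split()))  # crude similarity
            if score > best_score:
                best, best_score = seq[i + 1], score
    return best

learn(["reach for knob", "feel knob under fingers", "push door", "hear hinge creak"])
print(predict_next("Reach for the knob"))   # -> 'feel knob under fingers'
```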
6
How the Cortex Works
Trying to figure out how the brain works is like solving a giant jigsaw puzzle. You can approach it in one of two ways. Using the "top-down" approach, you start with the image of what the solved puzzle should look like, and use this to decide which pieces to ignore and which pieces to search for. The other approach is "bottom-up," where you focus on the individual pieces themselves. You study them for unusual features and look for close matches with other puzzle pieces. If you don't have a picture of the puzzle's solution, the bottom-up method is sometimes the only way to proceed.
The "understand-the-brain" jigsaw puzzle is particularly daunting. Lacking a good framework for understanding intelligence, scientists have been forced to stick with the bottom-up approach. But the task is Herculean, if not impossible, with a puzzle as complex as the brain. To get a sense of the difficulty, imagine a jigsaw puzzle with several thousand pieces. Many of the pieces can be interpreted multiple ways, as if each had an image on both sides but only one of them is the right one. All the pieces are poorly shaped so you can't be certain if two pieces fit together or not. Many of them will not be used in the ultimate solution, but you don't know which ones or how many. Every month new pieces arrive in the mail. Some of these new pieces replace older ones, as if the puzzle maker was saying, "I know you've been working with these old puzzle pieces for a few years, but they turned out to be wrong. Sorry. Use these new ones instead until future notice." Unfortunately, you have no idea what the end result will look like; worse, you may have some ideas, but they are wrong.
This puzzle analogy is a pretty good description of the difficulty we face in creating a new theory of the cortex and intelligence. The puzzle pieces are the biological and behavioral data that scientists have collected for well over one hundred years. Each month new papers are published, creating additional puzzle pieces. Sometimes the data from one scientist contradict the data from another. Because the data can be interpreted in different ways, there is disagreement over practically everything. Without a top-down framework, there is no consensus on what to look for, what is most important, or how to interpret the mountains of information that have accrued. Our understanding of the brain has been stuck in the bottom-up approach. What we need is a top-down framework.
The memory-prediction model can play this role. It can show us how to start putting pieces of the puzzle together. To make predictions, your cortex needs a way to memorize and store knowledge about sequences of events. To make predictions of novel events, the cortex must form invariant representations. Your brain needs to create and store a model of the world as it is, independent from how you see it under changing circumstances. Knowing what the cortex must do guides us to understanding its architecture, especially its hierarchical design and six-layered form.
As we explore this new framework, presented here for the first time, I will get into a level of detail that may be challenging for some readers. Many of the concepts you are about to encounter are unfamiliar, even to experts in neuroscience. But with a bit of effort, I believe anyone can learn the fundamentals of this new framework. Chapters 7 and 8 of this book are far less technical and explore the wider implications of the theory.
Our puzzle-solving journey can now turn to looking for biological details that support the memory-prediction hypothesis; this is like being able to set aside a large percentage of the puzzle pieces, knowing that the relatively few remaining pieces are going to reveal the ultimate solution. Once we know what to look for, the task becomes manageable.
At the same time, I want to stress that this new framework is incomplete. There are many things I don't yet understand. But there are many things I do, based on deductive reasoning, experiments carried out in many different laboratories, and known anatomy. In the last five to ten years, researchers from many sub-specialties in neuroscience have been exploring ideas similar to mine, although they use different terminology and have not, as far as I know, tried to put these ideas into an overarching framework. They do talk about top-down and bottom-up processing, how patterns propagate through sensory regions of the brain, and how invariant representations might be important. For example, Gabriel Kreiman and Christof Koch, neuroscientists at Caltech, with the neurosurgeon Itzhak Fried at UCLA, have found cells that fire whenever a person sees a picture of Bill Clinton. One of my goals is to explain how those Bill Clinton cells come into being. Of course, all theories need to make predictions that can be tested in the laboratory. I've suggested a number of these predictions in the appendix. Now that we know what to look for, this very complex system won't look so complex anymore.
In the following sections of this chapter, we will probe deeper and deeper into how the memory-prediction model of the cortex works. We will start with the large-scale structure and large-scale function of the neocortex and work toward understanding the smaller pieces and how they fit into the big picture.
Figure 1. The first four visual regions in the recognition of objects.