Thursday, August 30, 2012

The Myth of the Bayesian Brain, Part III

Part I, II, III

Abstract

Previously, I explained that the Rebel Science model of perception is superior to the Bayesian model because the world is deterministic at the level of our senses and human thinking is not probabilistic. In this post, I explain how the brain handles probabilistic stimuli even though it is not a probability thinker.

Learning Perfect Patterns

How can an intelligent system build a perfect model of the world if it must rely on imperfect or incomplete sensory signals? The answer lies in the observation that sensory signals are not always imperfect. Every once in a while, even if only briefly, some signals are in perfect agreement with the phenomena they respond to. When that happens, the intelligent system must be ready to capture that bit of perfection. But how can a system recognize when signals are perfect? To understand this, one must first realize that there are only two types of discrete sensory signals: concurrent and sequential. A group of concurrent signals is called a pattern. A pattern is considered perfect when all of its signals arrive concurrently. For reasons that will become clear below, I have taken to calling perfect patterns clean patterns. Imperfect patterns are just dirty patterns.
Note: There is a trick to learning perfect patterns. See Secrets of the Holy Grail.
As you know, a pattern is not a pattern unless it is repeated often. Also, a pattern can be constructed a little at a time and does not have to be complete in order to be useful for learning and recognition purposes. The first thing a perceptual system must do is to discover the perfection that is in the world. It can deal with sensory imperfections later.
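As a rough illustration, here is how capturing a clean pattern might be sketched in code. This is my own toy example, not the Rebel Science implementation: time is discrete, signals are binary, and all names and thresholds are invented. (Python is used here for brevity.)

```python
# Toy sketch of clean-pattern capture. A candidate pattern is "clean"
# in a given time step when every one of its signals fires concurrently,
# and it only counts as a pattern if it recurs often enough.

def find_clean_patterns(frames, candidate, min_repeats=3):
    """Count the time steps in which every signal of the candidate
    pattern fires concurrently; keep it only if it repeats often."""
    hits = sum(1 for frame in frames if candidate <= frame)
    return hits >= min_repeats  # a pattern must recur to count

# Each frame is the set of signal ids active during one time step.
frames = [
    {1, 2, 3}, {1, 2}, {1, 2, 3}, {3, 4}, {1, 2, 3},
]
print(find_clean_patterns(frames, {1, 2, 3}))  # True: fired cleanly 3 times
print(find_clean_patterns(frames, {2, 3, 4}))  # False: never fired cleanly
```

The point of the sketch is that no probabilities are involved: the system simply waits for moments of perfect concurrence and records them.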

How to Work with Imperfect Sensory Patterns

Sensory patterns are the foundation of the Rebel Science model of perception. They are rarely perfect even though the phenomena that cause them are perfect. We should think of a pattern as a complex sensory event. The question is, how can a system that expects perfection work with imperfect information? More precisely, how can it tell that a particular pattern occurred if it can detect only a part of it or even none of it? To know the answer to this question, one must understand that a pattern is not an island. It is a unique event that traces a unique temporal path. That is to say, every pattern is part of a unique sequence of other patterns.

If you know the order of a sequence of events in advance and you know that some of the events in the sequence have already occurred, then you know that the others either occurred already or are about to occur. You would know it even if their sensory patterns are imperfect or they failed to arrive altogether for whatever reason. Even if every single sensory pattern is imperfect, it is possible to pick the most probable sequence: it is simply the one that received the most hits.
In essence, the system compares dirty sensory patterns to the expected perfection (i.e., the sequences of clean patterns in memory) and the most perfect match is the winner. Result: simple, clean and accurate recognition even in a noisy environment. Bonus: no math required.
What this all means is that a perceptual system must learn as many perfect sequences as it can. But that is not very hard because a pattern has only one successor and one predecessor at any given time. What is a little bit more complicated is to organize the sequences in memory so as to form a hierarchical structure, a tree of knowledge. Each branch of the tree is a specific sequence, a sequence of sequences or a group of sequences and represents a single object or concept. But that's a different story.
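To make the idea concrete, here is a toy sketch of hit-count recognition over dirty patterns. This is only my illustration of the principle, not actual Rebel Science code; the sequence names and signal ids are invented, and the alignment between stream and sequence is simplified.

```python
# Toy sketch of sequence-based recognition under noise. Learned
# sequences of clean patterns are compared against a stream of dirty
# (partial) observed patterns; the sequence with the most hits wins.

def recognize(sequences, observed):
    """Score each learned sequence by how many of its clean patterns
    are at least partially matched, in order, by the observed stream."""
    def score(seq):
        return sum(1 for clean, dirty in zip(seq, observed)
                   if clean & dirty)  # any overlap counts as a hit
    return max(sequences, key=lambda name: score(sequences[name]))

# Two learned sequences of clean patterns (sets of signal ids).
sequences = {
    "door-slam": [{1, 2}, {3, 4}, {5, 6}],
    "footstep":  [{1, 7}, {8, 9}, {5, 0}],
}
# A dirty observation: some signals missing, some spurious, one absent.
observed = [{2}, {4, 9}, set()]
print(recognize(sequences, observed))  # "door-slam": two hits vs. one
```

Note that the winner is picked by a simple hit count against stored perfection, with no probability model anywhere in the loop.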

Rebel Speech Update

I plan to release a demo of the Rebel speech recognition engine (pdf) as soon as I can get some free time. The demo will support some of the arguments and claims I have made in this article and elsewhere. Let me conclude by saying that, as much as I would like to, I can't take credit for this stuff. You see, I consult an oracle. The oracle speaks in riddles and metaphors and says many mysterious things. I just interpret them the best I can. I make many mistakes along the way but my understanding is growing all the time. Here are a few excerpts that should give you an idea of what I'm talking about:
  • Wake up and strengthen the things that remain, that are about to die, for I have not found your works perfect before God.
  • There are a few, even in Sardis, who have not soiled their clothes; they shall walk with me in white, for they are worthy.
  • And I will bring forth my servant the Branch; and in one day I will remove the iniquity of the land.
Yep, I'm still crazy after all these years. If any of this bothers you, please ignore my blog as it is not meant for you. I only write for kindred spirits, sorry. Some of you may be asking yourselves, why does he mention any of this? The answer is that I just want to piss off some people, that's all. They know who they are. The rest of you, stay tuned.

See Also:

Speech Recognition Theory
The Second Great AI Red Herring Chase
Secrets of the Holy Grail

Wednesday, August 29, 2012

The Myth of the Bayesian Brain, Part II

Part I, II, III

Abstract

In Part I, I wrote that the Bayesian brain hypothesis is the last great barrier to our gaining a full understanding of intelligence. I described the difference between the Bayesian model of perception and the Rebel Science model. In this post, I explain why the latter is superior.

The Real GOFAI Lesson

History repeats itself. During the second half of the 20th century, AI experts were convinced that intelligence was just symbol manipulation. Their rationale was based on the observation that humans are very good at using and manipulating symbols. It was a classic case of equating a system with its behavior, i.e., with what it does. Sure, we can manipulate symbols but we can do much more than that. Needless to say, symbolic AI (aka "good old fashioned AI" or GOFAI) died a slow and painful death.

Did the AI cognoscenti learn their lesson from the GOFAI debacle? Not really. They are busy repeating the same mistake. The new rationale is that the brain is a Bayesian system because it is very good at handling probabilities. That is regrettable because the real lesson of GOFAI is that what we do and how we do it are two different things. Our Bayesian-like behavior is just one of the many products of our intelligence, not the basis of it.

Non-Probabilistic Perception

Why is the Rebel Science model of perception superior to the Bayesian model? Here is why:
  1. Humans use absolute certainties to reason with, not probabilities. This is true even when we reason about probabilities.
  2. Even though our senses are plagued with noise and incomplete data, humans do not assume that the world is probabilistic.
  3. It is certainly true that everything is probabilistic at the quantum level, but not at the macroscopic level at which our senses operate. At that level, almost everything is deterministic. Probabilistic events are few and far between.
In the Rebel Science model, every event in the world is a unique and deterministic precursor of its immediate successor. I remember when I first understood this simple truth. It was as if the scales had fallen off my eyes and I could see clearly for the first time. By the way, I am not the only one who has come to similar conclusions. When asked in a recent interview, "What was the greatest challenge you have encountered in your research?", Judea Pearl, an Israeli computer scientist and early champion of the Bayesian approach to AI, replied:
In retrospect, my greatest challenge was to break away from probabilistic thinking and accept, first, that people are not probability thinkers but cause-effect thinkers and, second, that causal thinking cannot be captured in the language of probability; it requires a formal language of its own.
In a weird sort of way, we seem to be coming full circle. GOFAI scientists, especially the recently deceased John McCarthy, used to believe that AI was based on formal logic. The idea was abandoned when it became clear that formal logic could not handle common sense and the uncertainties in the sensory space. But if humans are not probability thinkers, how is it that they handle probabilistic sensory stimuli so well? Counterintuitively, the solution rests on the assumption that the world is perfect. That's the topic of my next post in this series. Stay tuned. I don't have much time, but I think this is something that must be heard, not just because my entire approach to AI revolves around it but also because these ideas are crucial to our understanding of intelligence.

See Also:

Speech Recognition Theory
The Second Great AI Red Herring Chase

Sunday, August 26, 2012

The Myth of the Bayesian Brain, Part I

Part I, II, III

Abstract

Previously, I wrote that, by embracing Bayesian statistics, the artificial intelligence community has embarked on yet another great red herring chase. I claimed that Bayesian statistics will prove to be just as wasteful of time, brain and money as the symbolic AI craze of the last century. In this article, I explain why the Bayesian mindset is injurious to progress in our understanding of intelligence. I do not argue that the Bayesian approach is bad in and of itself but that, when it comes to explaining the brain’s ability to handle uncertainty, there is a competing model that is orders of magnitude better.

The Last Great Barrier to Fully Understanding Intelligence

Scientists are a conservative and taciturn lot. They will live with a myth or obvious falsehood for decades and even centuries because the humiliation and other hardships that come from rejecting the lie are too painful for them to bear. The Bayesian brain is just such a myth. The problem is that it is now so firmly entrenched in the AI community that accommodating a different perspective would be suicidal to many careers. I will argue that the Bayesian mindset is the last great barrier to progress in AI because it cripples our understanding of the most important aspect of intelligence: perception. I believe that having a correct understanding of perception will unleash a flood of insights that will quickly lead to a full understanding of intelligence, artificial or otherwise.

Two Competing Models of Perception

Below is the essence of the two competing models of perception.
  • The Bayesian model assumes that events in the world are inherently uncertain and that the job of an intelligent system is to discover the probabilities.
  • The Rebel Science model, by contrast, assumes that events in the world are perfectly consistent and that the job of an intelligent system is to discover this perfection.
As you can see, the two models are polar opposites in their assumptions. The two views have drastically different consequences for the way we design our perceptual systems. In my next post in this series, I will explain why the Rebel Science model is far superior to the Bayesian model. Hang in there.

See Also:

The Second Great AI Red Herring Chase

Thursday, August 23, 2012

The Second Great AI Red Herring Chase

Vicarious in the News

Vicarious Systems is in the news again. They have managed to raise $15 million to continue research on their flagship artificial intelligence technology, the Recursive Cortical Network. Nice chunk of change. Unfortunately, nobody seems to know much about RCN and Vicarious is not talking. But we all know why: it's the old "we got something big but our IP lawyer told us not to say anything" excuse all over again. Still, it is easy to speculate that RCN is just a different take on Numenta's hierarchical temporal memory.

An Addiction to Math and Bayes

Back in October of last year, I wrote a critical article about Vicarious' approach to solving the AI problem and their chances of success. Here's an excerpt:
So what do I think of Vicarious' chances of solving the AI problem? I'll be blunt. I think they have no chance whatsoever. Zilch. Here's why. Dileep George, the brain of the company, is a PhD electrical engineer and mathematician who believes that math is essential to solving the AI puzzle. This alone tells me that he has no real understanding of the problem. Furthermore, although I think that a study of the brain can eventually lead to a major breakthrough, it is highly unlikely that this approach will lead to a breakthrough in the foreseeable future. The brain has a bad habit of hiding its secrets in a forest of apparent complexity. The Wright brothers never had to deal with hidden knowledge. They, like everyone else, could easily observe the gliding flight of birds and derive useful principles.

As an example, let's take George's adoption of Bayesian inference for sequence prediction in hierarchical memory. Bayesian statistics is the sort of thing that a mathematician like George would find attractive just because it's math. But is it based on the known biology of the brain? Not at all. Can George search the neuroscience and biology literature to find out what method the brain uses for prediction? The answer is no because biologists have not yet discovered how the brain does it. They just know from psychology that the brain is very good at judging probabilities based on experience. That is the extent of their knowledge.
So have there been any substantial changes in George's approach to AI since? I don't think so. Perusing their help wanted page, I can see that, other than getting their hands on a load of cash with which to hire new engineers and buy new equipment, nothing much has changed. Notice that George takes pains not to mention the word Bayesian in the section on desired skills. However, he wants his hired engineers to have experience "with belief propagation and approximation methods", which is essentially the same thing as Bayesian statistics. Of course, George wouldn't be George if he didn't also ask for solid math skills. The man is a mathematician at heart and he is absolutely convinced there can be no brain-like machine intelligence without math. The man is mistaken, in my opinion, and I'll explain why in a future article.

The Second Great AI Red Herring Chase

Back in the 1950s, thanks mostly to ideas advanced by Alan Turing, the scientific community embraced symbol manipulation as the correct approach to solving the AI problem. They were wrong from the start. Unfortunately, it took over half a century for them to realize that they had been chasing after a red herring all these years. What a waste! Even now, some are still not convinced but, by and large, AI researchers have moved to greener pastures.

Lately, the community has plunged headlong into yet another great red herring chase. They're convinced that Bayesian statistics are the key to AI. They are convinced because Bayesian models (e.g., the hidden Markov model) are currently being used to develop very impressive speech recognition products and other applications that deal with uncertainty and probability. Unfortunately, the technology has run into a nasty brick wall that refuses to budge: unlike the human brain, Bayesian systems lose accuracy precipitously in the presence of noise. Try speaking commands into your smartphone or using dictation software at a crowded party and you'll see what I mean. There is no question that the brain can effectively and efficiently process probabilistic stimuli. My claim is that it does not use Bayesian statistics to do it.

I'll have more to say about this topic in the near future. Stay tuned.

See Also:

Vicarious Systems' Disappointing Singularity Summit Talk
The Myth of the Bayesian Brain

Wednesday, August 15, 2012

Rebel Speech Recognition Theory

A Different Approach

The approach that I use for speech recognition in the Rebel Speech recognizer is completely different from the one used by most existing technologies. As I have mentioned elsewhere, I disagree with AI experts that the brain uses anything like Bayesian statistics to process probabilistic stimuli. I added a theory section to the Rebel Speech design document (pdf) and I reproduce it below.

Bayesian Bandwagon

The most surprising thing about the Rebel speech recognition engine is that, unlike current state-of-the-art speech recognizers, it does not use Bayesian statistics. This will come as a surprise to AI experts because they all jumped on the Bayesian bandwagon many years ago. Even those who claim to be closely emulating biological systems believe in the myth of the Bayesian brain. Of course, this is pure speculation and wishful thinking because there is no biological evidence for it. In a way, it is not unlike the way the AI community jumped on the symbol manipulation bandwagon back in the 1950s, only to be proven wrong more than half a century later. I have excellent reasons to believe that, in spite of its current utility, this is yet another red herring on the road to true AI.

Traditional Speech Recognition

Most speech recognition systems use a Bayesian probabilistic model, such as the hidden Markov model, to determine which senone, phoneme or word is most likely to come next in a given speech segment. A learning algorithm is normally used to compile a large database of such probabilities. During recognition, hypotheses generated for a given sound are tested against these precompiled expectations and the one with the highest probability is selected as the winner.
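For comparison, the precompiled-probability scheme just described can be caricatured in a few lines. The numbers below are pure invention; a real recognizer would score senone sequences with a hidden Markov model, but the selection rule is the same: the hypothesis with the highest posterior wins.

```python
# Caricature of traditional Bayesian word selection. The learning
# phase compiles priors and likelihoods into a database; recognition
# picks the word with the highest (unnormalized) posterior.

priors = {"cat": 0.6, "cap": 0.4}   # P(word), from training data
likelihoods = {                     # P(acoustic evidence | word)
    "cat": 0.2,
    "cap": 0.5,
}

def most_likely(priors, likelihoods):
    # Bayes' rule: P(word | sound) is proportional to
    # P(sound | word) * P(word); the normalizer does not affect argmax.
    return max(priors, key=lambda w: likelihoods[w] * priors[w])

print(most_likely(priors, likelihoods))  # "cap": 0.5*0.4 > 0.2*0.6
```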

Rebel Speech Recognition

In contrast to the above, the Rebel Speech engine does not rely on pre-learned probabilities. Rather, it uses an approach that is as counterintuitive as it is powerful. In this approach, the probability that the interpretation of a sound is correct is not known in advance but is computed on the fly. The way it works is that the engine builds a hierarchical database of as many sequences of learned sounds as possible, starting with tiny snippets of sound that are shorter than a senone. When sounds are detected, they attempt to activate various sequences, and the sequence with the highest hit count is the winner. A winner is usually found even before the speaker has finished speaking. This works because sound patterns are so distinctive that they form very few sequences. Once a winner is determined, all other sequences that do not belong to the same branch in the hierarchy are immediately suppressed. This approach leads to very high recognition accuracy even when parts of the speech are missing, and it makes it possible to solve the cocktail party problem (pdf).
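Here is a minimal sketch of the hit-count-and-suppress idea, assuming pre-learned snippet sequences. It is my own illustration, not engine code: the class, the threshold, and the snippets (stand-ins for real sound patterns) are all hypothetical.

```python
# Toy sketch of on-the-fly hit counting with branch suppression.
# Detected sound snippets raise the hit counts of matching sequences;
# once a winner emerges, every sequence outside its branch is suppressed.

class Sequence:
    def __init__(self, branch, snippets):
        self.branch, self.snippets = branch, snippets
        self.hits, self.suppressed = 0, False

def feed(sequences, snippet, win_threshold=2):
    winner = None
    for seq in sequences:
        if not seq.suppressed and snippet in seq.snippets:
            seq.hits += 1
            if seq.hits >= win_threshold:
                winner = seq
    if winner:  # suppress every sequence outside the winning branch
        for seq in sequences:
            if seq.branch != winner.branch:
                seq.suppressed = True
    return winner

seqs = [Sequence("seven", ["s", "eh", "v"]),
        Sequence("six",   ["s", "ih", "k"])]
winner = None
for snippet in ["s", "eh"]:   # the speaker has not finished the word
    winner = feed(seqs, snippet) or winner
print(winner.branch)          # "seven" wins before the word ends
```

Notice that no probability is stored anywhere; the "probability" of an interpretation is simply its running hit count relative to its rivals.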

Wednesday, August 8, 2012

Rebel Speech Update, August 8, 2012

Back to Work on AI

I had a little free time over the last several days, enough to let me get back to work on my AI research. I'm working on the Rebel speech recognition engine again. Hurrah! I prepared a design document in order to give you an idea of where I'm going with the engine. The doc is not yet finished (only a brief introduction is written) but you can download it now (pdf).

Writing Code

I'm busy writing C# code for the speech engine. There are three main software modules: the fast Fourier transform (FFT) module, the pattern learner/recognizer and the Tree of Knowledge (TOK). The FFT module and the pattern learner are already completed. The TOK is the most complex part of the engine. It's coming along rather slowly, mostly because it crashes often and it's hard to debug (I'm using parallel algorithms for speed). So far, I've gotten it to accurately recognize the words 1 through 7. The engine is also very noise tolerant and the recognition speed is so fast as to be virtually instantaneous. In fact, it recognizes a word even before I'm finished saying it. This is because the engine uses pattern completion, a probabilistic recognition technique based on prediction.
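For the curious, the FFT stage amounts to converting each audio frame into a magnitude spectrum whose peaks become the input patterns for the learner. The sketch below is my illustration only (the actual engine is written in C# and uses a real FFT); it uses a plain DFT and a synthetic tone for clarity.

```python
# Toy sketch of the first stage: turning an audio frame into a
# spectral pattern. A plain O(n^2) DFT is used for clarity; a real
# engine would use an FFT. The input is a synthetic pure tone.

import cmath
import math

def dft_magnitudes(samples):
    """Magnitude spectrum of one audio frame (bins 0 .. n/2 - 1)."""
    n = len(samples)
    return [abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]

# A pure tone sitting exactly at bin 2 of an 8-sample frame.
frame = [math.sin(2 * math.pi * 2 * t / 8) for t in range(8)]
spectrum = dft_magnitudes(frame)
peak = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak)  # 2: the dominant frequency bin
```

The indices of the strongest bins in each frame would then play the role of the concurrent sensory signals fed to the pattern learner.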

Coming Soon

If all goes well and if I can find enough time, I hope to have a demo executable for downloading within a couple of months or so. It will come with a data file that contains the engine's current knowledge. I'm sorry that I cannot release a trainable demo. I was going to publish everything but I changed my mind again. I need to keep some of this stuff secret so that I can capitalize on this technology. Stay tuned.