Why You Don't Need Proprietary Bot Software

Noel Bush
June 2001

The Historical Case
(sec. 2 of 4)

A bit of a history lesson is in order. Some bot software companies try to place themselves in a historical timeline that usually starts with Joseph Weizenbaum's ELIZA. They tell a story along these lines: ELIZA was a primitive attempt at conversational AI that sparked the imagination, but mostly relied upon a few simple "tricks"--and since then, some geniuses have been locked away in an unmarked building working up a true revolution in 'artificial intelligence', which just so happens to have landed on the doorstep of [company name].

The idea is supposed to be that ELIZA whetted the world's appetite for talking computers, but that the approach used wasn't serious enough, and that it took a few more decades for "real" NLP (natural language processing) to fill in the missing pieces and make today's bots able to carry on remarkable conversations, sell products to customers, remember web site visitors, teach about subjects, give financial advice, and so on.

But the fact is that the development of what most people call "real" NLP and the development of bots have mostly proceeded along different lines. "Real" NLP as it's understood in the academic world goes deep into problems of linguistics that are still poorly understood. Most of the work that has been done in conventional NLP has not translated into something that can be used for making a machine carry on a better conversation. In fact, from the point of view of academia, the software that's advertised as conversational bot software mostly relies on "outdated" ideas that were discredited almost as soon as they were introduced.

Richard Wallace is the only (former) academic who makes a serious case for the legitimacy of the approach used in ELIZA, and who has also demonstrated that this approach can be expanded to produce the results people are looking for in bot technologies. Commercial companies using closed software generally use talk of patents and advanced, secretive technologies as marketing tools to mask the truth--that they, too, rely on an approach heavily based on the original ideas of ELIZA.

In reality, it's the conventional NLP approach that is stuck. Any professional linguist will tell you that for as many advances as the study of language has made, it has encountered as many profound setbacks. Most conventional NLP approaches rely upon fitting actual speech to a meta-model of "meaning" and communication. Briefly, for instance:

"I took my kid to school yesterday."

might be illustrated as:

[Figure: example syntactic parse of the sentence]

The sentence is broken down into components based on their "syntactic" function. This syntactic analysis is meant to serve as a framework for understanding the "semantics" (meaning) of the sentence. A machine needs to know things like:

  • "took" is the simple past tense form of "to take"
  • "I" refers to the speaker of the sentence
  • "my" refers to something that belongs to the speaker of the sentence
  • "kid" can mean many things, among which is a "child"
  • A child is a young person
  • "my kid" probably indicates that the speaker is the "parent" of the child
  • "parent" can mean many things, among which is a person who is somehow responsible for another person, possibly including that other person's upbringing, birth, etc.
  • "school" is a place where people learn
  • "to take" someone "to school" means to arrange for the transportation of that someone to the school and to accompany them (i.e., "to take" alone is ambiguous)

And so on. There are many theoretical approaches to filling out the details in this fashion. Many approaches take a starting point like this, and rely, at some level, on several important resources:

  • a "lexicon" of words that describes their different forms and possible positions in a "sentence"
  • a "corpus" of semantic information about different combinations of lexical items and the relationships of those combinations to other combinations
  • an engine that can apply a lexicon to a sentence and identify all possible interpretations of the sentence's syntactic structure
  • an engine that can apply a semantic corpus to a series of syntactic & morphological ("word shape") parses and identify the most likely interpretation of the semantic intent of that sentence
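
To make this architecture concrete, here is a deliberately toy sketch (in Python) of how such resources might fit together for the example sentence above. Everything in it--the handful of lexicon entries, the hand-written "facts", and the two stub "engines"--is invented for illustration, and stands in for components that a real system would need to extend to open-ended language:

    # A toy illustration (not any real system) of the resources listed above,
    # applied to the sentence "I took my kid to school yesterday".

    # 1. A miniature "lexicon": surface forms, base forms, parts of speech.
    LEXICON = {
        "i":         {"lemma": "i",         "pos": "PRON"},
        "took":      {"lemma": "take",      "pos": "VERB", "tense": "past"},
        "my":        {"lemma": "my",        "pos": "DET"},
        "kid":       {"lemma": "kid",       "pos": "NOUN"},
        "to":        {"lemma": "to",        "pos": "PREP"},
        "school":    {"lemma": "school",    "pos": "NOUN"},
        "yesterday": {"lemma": "yesterday", "pos": "ADV"},
    }

    # 2. A miniature "semantic corpus": hand-written facts like those bulleted earlier.
    FACTS = {
        ("kid", "can_mean"): "child",
        ("child", "is_a"): "young person",
        ("school", "is_a"): "place where people learn",
        ("take", "to", "school"): "transport someone to the school and accompany them",
    }

    # 3. Engine one: a stand-in for a syntactic/morphological parser.
    def syntactic_parse(sentence):
        return [(word, LEXICON.get(word.lower(), {"lemma": word, "pos": "?"}))
                for word in sentence.split()]

    # 4. Engine two: a stand-in for semantic interpretation over the parse.
    def semantic_interpretation(parse):
        lemmas = [entry["lemma"] for _, entry in parse]
        if "take" in lemmas and "school" in lemmas:
            return FACTS[("take", "to", "school")]
        return "no interpretation found"

    parse = syntactic_parse("I took my kid to school yesterday")
    print(parse)                           # the tagged words
    print(semantic_interpretation(parse))  # the hoped-for "meaning"

The point is not the little program itself but the gap between it and real language: each of the four pieces has to be scaled from a toy into something that copes with anything a person might type, and that is where the difficulties described next begin.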

And in fact the notion of what constitutes a "sentence" is also poorly understood, believe it or not, leading some people to speak of "utterances" rather than sentences.

It's also clear that even a good analysis of a sentence in isolation is relatively useless, leading to the need for "pragmatics", in which the analysis attempts to locate the meaning of an utterance within its "layers of context". The notion of "context" is also poorly understood, and even at its best, computational NLP treats it as a sort of giant static network of "meanings", often called an "ontology".

There are a huge number of theories of NLP that follow an approach that looks something like this. None of these approaches has a complete set of tools that can handle real text generated by humans and consistently produce an accurate analysis with which most people would agree.

There are also approaches that try to minimize the number of a priori rules, and rely instead on "learning" techniques based on statistical analyses of large bodies of text. Neural networks, fuzzy logic, genetic algorithms and other approaches are essentially statistically-oriented techniques that try to remove some of the manual labor from statistical analysis. They introduce a "black box" that iteratively builds rules, often hidden from human users, based on feedback about the analysis from human controllers and/or "heuristics" that serve as non-absolute guides to analysis.
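
As a caricature of this family of techniques (and only a caricature--it is not how any particular commercial product or neural-network package works), consider a program that keeps a table of numeric weights over words and adjusts them whenever a human controller tells it its guess was wrong. The rules it ends up with were never written by hand and are not easy to read off afterwards:

    # A toy "learning" classifier: it decides whether an utterance is a greeting,
    # and builds its hidden "rules" (word weights) purely from corrective feedback.
    from collections import defaultdict

    weights = defaultdict(float)          # the "black box": machine-made rules

    def score(sentence):
        return sum(weights[word] for word in sentence.lower().split())

    def train(sentence, label, learning_rate=1.0):
        """label is +1 or -1: feedback from a human controller."""
        guess = 1 if score(sentence) > 0 else -1
        if guess != label:                # adjust the rules only on mistakes
            for word in sentence.lower().split():
                weights[word] += learning_rate * label

    # A few rounds of human feedback ("is this a greeting?")
    for _ in range(5):
        train("hello there", +1)
        train("good morning", +1)
        train("the invoice is overdue", -1)

    print(score("hello friend") > 0)      # True: generalized from feedback
    print(dict(weights))                  # the accumulated, opaque "rules"

Real statistical systems are vastly more elaborate, but the flavor is the same: the "rules" are numbers produced by feedback, not statements a person wrote or can easily inspect.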

Aside from the fact that no approach fully satisfies its own stated aims within whatever constraints on language and goals for analysis are set, almost all approaches suffer from an inability to "explain" the workings of more than a handful of human languages with the same set of rules. And virtually no approach can do more than acknowledge that even a single given language, say English, is used in countless "standard" ways and evolves continuously as it is used.

One only needs to begin a web search on "natural language processing" to understand that the amount of research into this topic is immense, and that no approach has a legitimate claim on being the best. The history of linguistics is rife with fierce battles among academics whose entire careers have been staked on establishing the authority of one approach over others--none has met with success (see Randy Allen Harris's The Linguistics Wars for a great read on this topic).

Suffice it to say that the academic debate and research struggle forge ahead vigorously, but that none of their output has proven commercially viable.

Perhaps the most spectacular failure has been the CYC project, initiated by Doug Lenat in 1984 and the beneficiary of millions of dollars of government research money and private and institutional investment. CYC (now housed under a company called Cycorp) was and is a project with the aim of building a giant knowledge base full of "common sense", with the idea that this would someday enable machine understanding of texts. Lenat has been telling journalists (and presumably his investors) for years that CYC is mere months away from being able to understand simple texts like TIME magazine. So far quite a lot of money has been spent, and the giant knowledge base continues to grow, but despite its intricacy (some might say beauty), its commercial use is still beyond reach, and its theoretical base lags behind the academic research front, itself still light-years away from success.

Know this: the commercial bot companies may borrow bits and pieces from conventional NLP, but by and large they are every bit as ELIZA as ELIZA. As ELIZA relies on pattern-matching and simple string manipulation, so too do all "proprietary" offerings at their core.
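
To show what "pattern-matching and simple string manipulation" means in practice, here is a minimal ELIZA-style responder. The particular patterns and canned replies are invented for this sketch; the point is only the shape of the mechanism--a pattern, a captured fragment, and a templated reply--which is the same basic shape at work in ELIZA, in AIML categories, and, the argument here goes, at the core of the proprietary systems:

    # A minimal ELIZA-style responder: match the input against a few patterns,
    # capture a fragment, swap pronouns, and splice it into a canned reply.
    import re
    import random

    PRONOUN_SWAPS = {"my": "your", "i": "you", "me": "you", "am": "are", "your": "my"}

    RULES = [
        (re.compile(r"\bi need (.*)", re.I), ["Why do you need {0}?",
                                              "Would it really help you to get {0}?"]),
        (re.compile(r"\bmy (.*)", re.I),     ["Tell me more about your {0}.",
                                              "Why do you say your {0}?"]),
        (re.compile(r".*"),                  ["Please go on.",
                                              "I see. Can you say more about that?"]),
    ]

    def reflect(fragment):
        # Swap first- and second-person words so the echoed fragment reads naturally.
        return " ".join(PRONOUN_SWAPS.get(word.lower(), word) for word in fragment.split())

    def respond(utterance):
        for pattern, replies in RULES:
            match = pattern.search(utterance)
            if match:
                fragment = reflect(match.group(1)) if match.groups() else ""
                return random.choice(replies).format(fragment)

    print(respond("I took my kid to school yesterday"))
    # e.g. "Tell me more about your kid to school yesterday."

No parse trees, no ontology, no statistics: just string matching and substitution. That a carefully written, very large set of such rules can nonetheless hold up its end of a conversation is precisely the case Wallace makes with AIML--openly, rather than behind a patent notice.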
