AIML Primer

This document is a "work in progress"
Last update: October 30, 2011



Original Document by: Thomas Ringate
Copyright © 2001
Contributing Authors: Dr. Richard S. Wallace; Anthony Taylor; Jon Baer; Dennis Daniels



- Can you give me a quick primer on AIML?

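Given only the <pattern> and <template> tags, there are three general types of categories: atomic, default, and recursive. An atomic category has a pattern with no wild-card characters, a default category has a pattern containing "*" or "_", and a recursive category has a template containing <srai> or <sr/>.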

Strictly speaking, the three types overlap, because "atomic" and "default" refer to the <pattern> and "recursive" refers to a property of the <template>.
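
As a rough illustration of the three types, the sketch below shows one category of each kind. The patterns and replies are invented for this example and are not taken from the ALICE brain.

```xml
<!-- Atomic: the pattern contains no wild-card. -->
<category>
<pattern>WHAT IS AIML</pattern>
<template>AIML is the Artificial Intelligence Markup Language.</template>
</category>

<!-- Default: the pattern contains a wild-card ("*" or "_"). -->
<category>
<pattern>WHAT IS *</pattern>
<template>I do not know what <star/> is.</template>
</category>

<!-- Recursive: the template contains <srai> (or <sr/>). -->
<category>
<pattern>DO YOU KNOW WHAT AIML IS</pattern>
<template><srai>WHAT IS AIML</srai></template>
</category>
```

Note that the third category is both atomic (its pattern has no wild-card) and recursive (its template uses <srai>), which is the kind of overlap described above.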

- What happens to contractions and punctuation?

ProgramD has a class called Substituter that performs a number of grammatical and syntactical substitutions on strings. One task involves preprocessing sentences to remove ambiguous punctuation, which prepares the input for segmentation into individual sentences. Another task expands all contractions and converts all letters to upper case; this process is called "normalization".

The Substituter class also performs some spelling correction.
(See also the question "What is <person/>?")

One justification for removing all punctuation from inputs is the need to make ALICE compatible with speech input systems, which of course do not detect punctuation (unless the speaker utters the actual word for the punctuation mark -- "period").
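
Because inputs are normalized before matching, patterns are written in upper case, without punctuation or contractions. The following is a minimal sketch, assuming the bot's substitution rules expand "what's" to "what is"; the actual expansions depend on the substitutions configured for the bot, and the category itself is invented for this example.

```xml
<!-- The client types:          What's your name? -->
<!-- Assumed normalized input:  WHAT IS YOUR NAME -->
<category>
<pattern>WHAT IS YOUR NAME</pattern>
<template>My name is <bot name="name"/>.</template>
</category>
```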

- How are the patterns matched?

When a client enters an input, the program scans the categories to find the best match. By comparing the input with the patterns in the following order, the algorithm ensures that the most specific pattern matches first. "Specific" in this case has a formal definition, but basically it means that the program finds the "longest" pattern matching an input.

Search order:

ATOMIC with a THAT
ATOMIC
DEFAULT with a THAT
DEFAULT

Example:

What type of heaters do you have?

will match the ATOMIC pattern "WHAT TYPE OF HEATERS DO YOU HAVE"
and not the DEFAULT pattern "WHAT TYPE OF *".

An ATOMIC category always takes precedence over any other type of category, except an ATOMIC category with a THAT.
If you have two identical patterns, but one has a THAT, then the category with the THAT takes precedence over the plain ATOMIC category, provided the THAT matches the bot's previous response.

If neither of the above applies, then a DEFAULT category whose pattern matches part of the input gives its response. Finally, if none of the above matches, the catch-all (or "pickup") category takes over.
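
The search order can be pictured with a small set of categories, listed here from most to least specific. This is only a sketch: the <that> text and all of the replies are hypothetical and were made up for this example.

```xml
<!-- ATOMIC with a THAT: used only if the bot's previous reply matches <that>. -->
<category>
<pattern>WHAT TYPE OF HEATERS DO YOU HAVE</pattern>
<that>WE SELL HEATERS AND FANS</that>
<template>Our heaters come in gas and electric models.</template>
</category>

<!-- ATOMIC: used when the input matches exactly and no THAT category applies. -->
<category>
<pattern>WHAT TYPE OF HEATERS DO YOU HAVE</pattern>
<template>We have gas and electric heaters.</template>
</category>

<!-- DEFAULT: used when only the beginning of the input matches. -->
<category>
<pattern>WHAT TYPE OF *</pattern>
<template>I am not sure what type of <star/>.</template>
</category>

<!-- Catch-all: used when nothing else matches. -->
<category>
<pattern>*</pattern>
<template>Could you rephrase that?</template>
</category>
```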

Any categories that are contained within a TOPIC section will be searched first if the current setting of TOPIC matches a TOPIC section. This results in an extension of the search order to the following:

ATOMIC with a TOPIC and a THAT
ATOMIC with a TOPIC
DEFAULT with a TOPIC and a THAT
DEFAULT with a TOPIC
ATOMIC with a THAT
ATOMIC
DEFAULT with a THAT
DEFAULT

The TOPIC sections are always searched first if they match the current setting of TOPIC. This permits the botmaster to have identical category patterns within a TOPIC section and in the GENERAL section.
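
A minimal sketch of identical patterns inside and outside a TOPIC section follows. The topic name, patterns, and replies are hypothetical, and the example assumes the topic is set with <set name="topic">; some older AIML sets use a different tag for this.

```xml
<!-- Hypothetical category that switches the current topic to HEATERS. -->
<category>
<pattern>LET US TALK ABOUT HEATERS</pattern>
<template><think><set name="topic">HEATERS</set></think>Sure, ask me about heaters.</template>
</category>

<!-- Searched first while the topic is HEATERS. -->
<topic name="HEATERS">
<category>
<pattern>HOW MUCH DOES IT COST</pattern>
<template>I would have to look up the price of that heater.</template>
</category>
</topic>

<!-- Same pattern outside any topic; used when no topic section matches. -->
<category>
<pattern>HOW MUCH DOES IT COST</pattern>
<template>How much does what cost?</template>
</category>
```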

The wild-card character "*" comes before "A" in alphabetical order. For example, the "WHAT *" pattern is more general than "WHAT IS *". The default pattern "*" is first in alphabetical order and the most general pattern. For convenience AIML also provides a variation on "*" denoted "_", which comes after "Z" in alphabetical order.

- Do the categories need to be in alphabetical order by pattern?

No, the order is maintained internally when the categories load, but you can write them in any order.

- How are the categories stored?

If your session with program B included a "Classify" routine, then the AIML script is stored in order of category activation rank. In other words, program B stores the most frequently accessed category (usually '*') first, the second most frequently next, and so on. If a number of categories have the same activation count, program B saves them in alphabetical order by pattern. Hence, if the session did not include a "classify" routine, the program stores all the categories in alphabetical order by pattern (because they all have an activation count of zero).

One reason to store the categories in order by activation is to make the Applet interface more natural. Because the Applet interface starts simultaneously with a thread to load the robot source file, the Applet client can talk with the robot before all the categories are fully loaded. Given that the interlocutor is more likely to say something that activates a more frequently activated category, it makes sense to transmit these categories first. Storing the *.aiml files in order of category activation achieves the desired effect. The Applet loads the most frequent categories first, and continues loading in the background while the conversation begins.

- What is a symbolic reduction?

Many categories exist only to perform "symbolic reduction": they rewrite an input into a simpler form and pass it on to be matched again. Consider the category:

<category>
<pattern>ARE YOU VERY *</pattern>
<template><srai>ARE YOU <star/></srai></template>
</category>

This category (in std-brain.aiml) will reduce "Are you very very smart" to "Are you smart", applying once for each "very".

- Can I create custom AIML tags?

AIML is extensible. You can create an infinite number of new tags for foreign language pronouns, predicates, or application-specific properties. "Predicate tags" are tags that have a client-specific "set" and "get" method. Pronouns like "it" have predicate tags like <set name="it"></set>. AIML has a number of these built-in tags for common English pronouns.

Using the <set name="xxxx"></set> and <get name="xxxx"/> tags, an endless variety of languages and possibilities can be supported.
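
As a small sketch of a custom predicate, the two categories below store and recall a value for the current client. The predicate name "favoritecolor" and both patterns are hypothetical, and the first reply assumes the interpreter echoes the stored value when it processes <set>, which is a common default but may be configured differently.

```xml
<!-- Store the client's answer in a client-specific predicate. -->
<category>
<pattern>MY FAVORITE COLOR IS *</pattern>
<template>I will remember that you like <set name="favoritecolor"><star/></set>.</template>
</category>

<!-- Read the predicate back later in the conversation. -->
<category>
<pattern>WHAT IS MY FAVORITE COLOR</pattern>
<template>You told me it is <get name="favoritecolor"/>.</template>
</category>
```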
 

- How recursive is AIML?

Understanding recursion is important to understanding AIML. "Recursion" means applying the same solution over and over again, to smaller and smaller problems, until you reduce the problem to its simplest form. AIML uses the tags <sr/> and <srai> to implement recursion. The botmaster uses these tags to tell the robot how to respond to a complex sentence by breaking it down into the responses to simpler ones.

Recursion can apply many times to a single input. Given the normalized input:

ALICE CAN YOU PLEASE TELL ME WHAT LINUX IS RIGHT NOW

an AIML category with the pattern "_ RIGHT NOW" matches first,
reducing the input to:

ALICE CAN YOU PLEASE TELL ME WHAT LINUX IS

Another pattern ("<bot name="name"/> *") reduces it to:

CAN YOU PLEASE TELL ME WHAT LINUX IS

And then:

PLEASE TELL ME WHAT LINUX IS

reduces to:

TELL ME WHAT LINUX IS

and finally to:

WHAT IS LINUX
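
The chain of reductions above could be produced by categories along the following lines. This is only a sketch: apart from "_ RIGHT NOW" and the <bot name="name"/> pattern, which the text mentions, the patterns and the final reply are assumptions made for illustration and may differ from the categories in the ALICE brain.

```xml
<!-- Strip a trailing "RIGHT NOW". <sr/> is shorthand for <srai><star/></srai>. -->
<category>
<pattern>_ RIGHT NOW</pattern>
<template><sr/></template>
</category>

<!-- Strip the bot's own name from the front of the input. -->
<category>
<pattern><bot name="name"/> *</pattern>
<template><sr/></template>
</category>

<!-- Hypothetical: "CAN YOU PLEASE X" becomes "PLEASE X". -->
<category>
<pattern>CAN YOU PLEASE *</pattern>
<template><srai>PLEASE <star/></srai></template>
</category>

<!-- Hypothetical: drop a leading "PLEASE". -->
<category>
<pattern>PLEASE *</pattern>
<template><sr/></template>
</category>

<!-- Hypothetical: "TELL ME WHAT X IS" becomes "WHAT IS X". -->
<category>
<pattern>TELL ME WHAT * IS</pattern>
<template><srai>WHAT IS <star/></srai></template>
</category>

<!-- The atomic category that finally answers the question. -->
<category>
<pattern>WHAT IS LINUX</pattern>
<template>Linux is an open source operating system.</template>
</category>
```

Each recursive category does one small rewrite and hands the result back to the matcher, so only the last category needs to contain an actual answer.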
 

- How can I restrict remote clients from running programs on my computer?

If your reply contains the markup

<system>yourcommand <id/></system>

then the robot will insert the (virtual) client IP into the command line argument for "yourcommand". Then it is up to "yourcommand" to enforce access privileges.

- Can I insert dynamic HTML into the robot reply?

If you are fortunate enough to be running lynx under Linux, the following markup is a simple way to "inline" the results of an HTTP request into the chat robot reply. Try asking ALICE: "What chatterbots do you know?" and she will reply with a page of links generated by the Google search engine.

<category>
<pattern>WHAT *</pattern>
<template>
Here is the information I found:
<system>
lynx -dump -source -image_links http://www.google.com/search?q=<personf/>
</system>
</template>
</category>

- Can I include JavaScript in the robot reply?

Yes. You can include any HTML, including <script> tags. Suppose you want to "chat AND browse," in other words, have the robot open up a new browser window when she provides a URL link. Here's a category that kicks out a piece of HTML/scripting that opens a new window and loads a given URL. This is handy for search engines or showing off one's web page. This code was contributed by Stefan Zakarias, with additions by Dennis Daniels.

<category>
<pattern>WHAT IS YOUR WEBSITE</pattern>
<template>
It is at "http://www.mywebsite.org"
<script language="JavaScript">
function Popup(){
var winURL = "http://www.mywebsite.org";
var winWidth=800;
var winHeight=600;
var winScrollbars="yes";
var winToolbar="yes";
var winSizeable="yes";
var winLocation="yes";
var winDirectories="yes";
var winStatus="yes";
var winMenubar="yes";
var winCopyHistory="yes";
newWin=window.open(winURL,"",
"copyhistory="+winCopyHistory+
",menubar="+winMenubar+
",status="+winStatus+
",directories="+winDirectories+
",location="+winLocation+
",resizable="+winSizeable+
",toolbar="+winToolbar+
",scrollbars="+winScrollbars+
",height="+winHeight+
",width="+winWidth);
}
</script>
<a href="javascript:Popup()">Go to my website!</a>
</template>
</category>

A couple of things to note about this technique:


PSAE
AIML broadly breaks down into two parts: "Pattern-Side AIML Expressions" (PSAE), which can appear in the <pattern>, <that>, and <topic>, and "Template-Side AIML Expressions" (TSAE), which appear inside the <template>. Pattern-side AIML expressions consist of:

NORMALIZED TEXT
_, *, and <bot name="name"/> (at present)

TSAE
Template-side AIML expressions consist of ordinary text, optionally marked up with all the other tags. Generally speaking, it doesn't make sense to use PSAEs in the template or TSAEs in the pattern, <topic>, or <that>...</that>. The sole exception at this point is <bot name="name"/>.