Pandorabots Embrace & Extend

 

Pandorabots inevitably had to add some features to AIML that were not part of the AIML specification.  The following code fragment demonstrates some of these new features.  Pandorabots provides the unique ability to run AIML templates inside the HTML that will appear on the client’s browser.  This very feature is itself an example of Pandorabots’ embrace-and-extend approach to AIML.

One useful set of AIML templates displays a dialogue history of the last four exchanges with the client, updated every time the client says something and the bot responds.  Such a set of templates is easy to program in Pandorabots AIML, but as we shall see, it makes use of almost every feature of Pandorabots “embraced and extended” AIML.

Human inputs are displayed with the prefix prompt “Human:” and bot responses are displayed with the bot’s name followed by a “:”.  If there have been fewer than four exchanges, the unfilled lines should appear blank rather than show bare prompts.

 

<template>
<think>
<set name="_history">
<request index="3"/>
</set>
</think>
<condition name="_history">
<li value="*">
<i><b>Human:</b></i> <request index="3"/><br/>
<i><b><bot name="name"/>:</b></i> <response index="3"/><br/>
</li>
</condition>
<br/>
</template>

<template>
<think>
<set name="_history">
<request index="2"/>
</set>
</think>
<condition name="_history">
<li value="*">
<i><b>Human:</b></i> <request index="2"/><br/>
<i><b><bot name="name"/>:</b></i> <response index="2"/><br/>
</li>
</condition>
<br/>
</template>

<template>
<think>
<set name="_history">
<request index="1"/>
</set>
</think>
<condition name="_history">
<li value="*">
<i><b>Human:</b></i> <request index="1"/><br/>
<i><b><bot name="name"/>:</b></i> <response index="1"/><br/>
</li>
</condition>
<br/>
</template>

 

Wildcard in conditions

 

Pandorabots has adopted a boundary condition in AIML for the case where the list item in the condition tag has a value equal to the wildcard “*”.  In this example the <set> operation sets the AIML predicate “_history” to the value of <request index="1"/>.  If <request index="1"/> has not been set, then “_history” is left “undefined”, and an undefined predicate cannot match any value, including “*”.  Using this bit of AIML trickery, Pandorabots ensures that the AIML code inside the <li value="*"> is not executed until the corresponding exchange exists.  I am as much in favor of the undefined as the next person, but this is not standard AIML.
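
The rule can be modeled in a few lines of Python.  This is only a sketch of the behavior described above, not Pandorabots code: the function name li_matches and the use of a plain dictionary to hold predicates are assumptions made for illustration.

```python
def li_matches(predicates, name, value):
    """Return True if <li value=...> would fire for the predicate `name`.

    Assumption: an unset (or empty) predicate counts as "undefined" and
    matches nothing, not even the wildcard "*".
    """
    current = predicates.get(name)
    if not current:        # undefined: no <li> can match
        return False
    if value == "*":       # wildcard: any defined value matches
        return True
    return current == value


# Once an exchange exists, "_history" holds the request text.
print(li_matches({"_history": "hello there"}, "_history", "*"))  # True

# Before that exchange, the predicate is undefined and the <li> is skipped.
print(li_matches({}, "_history", "*"))                           # False
```

The second call is the case that keeps unfilled history lines blank.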

Wildcard in indexes

 

This example doesn’t show it, but Pandorabots also allows wildcards in some AIML tag indexes.  For example, the tag

 

 

<that index="1,*"/>

 

indicates the set of sentences included in <that index="1,1"/>…<that index="1,N"/>.
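
One way to picture the wildcard index is as a lookup into a two-level history.  The Python sketch below is an illustration only.  The function name that, the list-of-lists layout, and the ordering (entry 1 is the most recent) are assumptions, not Pandorabots internals.

```python
def that(history, x, y):
    """Look up <that index="x,y"/> in a history of multi-sentence utterances.

    `history[0]` is the most recent utterance; each utterance is a list of
    sentences.  Assumption: y == "*" selects every sentence of utterance x.
    """
    utterance = history[x - 1]
    if y == "*":
        return list(utterance)
    return [utterance[y - 1]]


history = [
    ["I am fine.", "How are you?"],  # most recent utterance
    ["Hello."],                      # the one before that
]
print(that(history, 1, "*"))  # ['I am fine.', 'How are you?']
print(that(history, 1, 2))    # ['How are you?']
```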

 

Request and Response

 

Here is a general problem of mathematical reference that appears in AIML.  You might call it the problem of the “multiline response”.  Consider a dialogue between two individuals.  One of them, B, asks or says something that consists of several sentences.  What B says is, as we say, “multiline”.  The respondent, A, next utters his or her own reply to what he or she has heard.  What A says is also multiline, and so is what B says next.  Sometimes, of course, a multiline utterance consists of just one line, but in general a script consists of sequences of such back-and-forth, multiline responses.

 

At the lowest level AIML provides for processing individual input sentences.  One AIML pattern matches one input sentence.  The next level of context is usually provided by the <that> variable.  Most of the time, AIML has no way to distinguish whether inputs came from multiline input sequences, or from individual inputs, which may help explain some bizarre constructions that emerge from unpredictable multiline input queries. 

 

The AIML specification provides for indexed <input/> and <that/> tags to store previous inputs and robot replies.  The <input index="X"/> tag is one-dimensional, but the <that index="X,Y"/> tag is already two-dimensional, owing to the fact that the reply to the Xth previous input can itself contain several sentences, indexed by Y.  We see here that AIML makes no distinction between input sentences that come from multiline inputs and one-shots, so to speak, because doing so would add another needless indexing dimension to <input/> and <that/>.

 

The master loop of a typical AIML interpreter appends all of the output sentences together into a single output paragraph for the bot output.  If the program keeps a history of these outputs and the associated multiline inputs, then it has created something very similar to the Pandorabots <request/> and <response/> tags.

Getting back to the example, <request/> and <response/> are the indexed history tags of the entire multiline input and output of the human and bot, respectively. 
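
The master loop and its history can be sketched in Python as follows.  This is a minimal illustration under stated assumptions, not the Pandorabots implementation.  The sentence splitter is deliberately naive, reply_to_sentence is a hypothetical stand-in for AIML pattern matching, and index 1 is assumed to mean the most recent exchange.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption: '.', '!' and '?' end a sentence)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def reply_to_sentence(sentence):
    """Hypothetical stand-in for matching one sentence against AIML patterns."""
    return "You said: " + sentence


class Dialogue:
    def __init__(self):
        self.requests = []   # whole multiline inputs, newest first
        self.responses = []  # whole multiline outputs, newest first

    def exchange(self, multiline_input):
        # One reply per input sentence, appended into a single paragraph.
        replies = [reply_to_sentence(s) for s in split_sentences(multiline_input)]
        output = " ".join(replies)
        self.requests.insert(0, multiline_input)
        self.responses.insert(0, output)
        return output

    def request(self, index=1):
        """Analogue of <request index="N"/>; empty if no such exchange."""
        return self.requests[index - 1] if 1 <= index <= len(self.requests) else ""

    def response(self, index=1):
        """Analogue of <response index="N"/>; empty if no such exchange."""
        return self.responses[index - 1] if 1 <= index <= len(self.responses) else ""


bot = Dialogue()
bot.exchange("Hello. How are you?")
bot.exchange("Fine.")
print(bot.request(2))        # Hello. How are you?
print(bot.response(2))       # You said: Hello. You said: How are you?
print(repr(bot.request(3)))  # ''
```

Note that the history stores the whole utterance on each side, however many sentences it contained, which is what distinguishes it from the per-sentence <input/> and <that/> tags.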

 

Formatted date tag

 

Pandorabots supports three extension attributes to the date element in templates:

 
         locale
         format
         timezone
 

timezone should be an integer number of hours offset (plus or minus) from GMT, and locale is the ISO language/country code pair, e.g., en_US, ja_JP.  locale defaults to en_US.  The set of supported locales is:

 
af_ZA  ar_OM  da_DK  en_HK  es_CO  es_PY  fr_CA  is_IS  mt_MT  sh_YU  vi_VN
ar_AE  ar_QA  de_AT  en_IE  es_CR  es_SV  fr_CH  it_CH  nb_NO  sk_SK  zh_CN
ar_BH  ar_SA  de_BE  en_IN  es_DO  es_US  fr_FR  it_IT  nl_BE  sl_SI  zh_HK
ar_DZ  ar_SD  de_CH  en_NZ  es_EC  es_UY  fr_LU  ja_JP  nl_NL  sq_AL  zh_SG
ar_EG  ar_SY  de_DE  en_PH  es_ES  es_VE  ga_IE  kl_GL  nn_NO  sr_YU  zh_TW
ar_IN  ar_TN  de_LU  en_SG  es_GT  et_EE  gl_ES  ko_KR  no_NO  sv_FI
ar_IQ  ar_YE  el_GR  en_US  es_HN  eu_ES  gv_GB  kw_GB  pl_PL  sv_SE
ar_JO  be_BY  en_AU  en_ZA  es_MX  fa_IN  he_IL  lt_LT  pt_BR  ta_IN
ar_KW  bg_BG  en_BE  en_ZW  es_NI  fa_IR  hi_IN  lv_LV  pt_PT  te_IN
ar_LB  bn_IN  en_BW  es_AR  es_PA  fi_FI  hr_HR  mk_MK  ro_RO  th_TH
ar_LY  ca_ES  en_CA  es_BO  es_PE  fo_FO  hu_HU  mr_IN  ru_RU  tr_TR
ar_MA  cs_CZ  en_GB  es_CL  es_PR  fr_BE  id_ID  ms_MY  ru_UA  uk_UA
 

format is a format string as given to the Unix strftime function:

http://www.opengroup.org/onlinepubs/007908799/xsh/strftime.html
 

You can include your own message in the format string, along with one or more format control strings.  These format control strings tell the date function whether to print the date or time, whether to use AM or PM, a 24 hour clock or a 12 hour, abbreviate the day of the week or not, and so on.  Some of the supported format control strings include:

 

%a Abbreviated weekday name

%A Full weekday name

%b Abbreviated month name

%B Full month name

%c Date and time representation appropriate for locale

%d Day of month as decimal number (01 – 31)

%H Hour in 24-hour format (00 – 23)

%I Hour in 12-hour format (01 – 12)

%j Day of year as decimal number (001 – 366)

%m Month as decimal number (01 – 12)

%M Minute as decimal number (00 – 59)

%p Current locale’s A.M./P.M. indicator for 12-hour clock

%S Second as decimal number (00 – 59)

%U Week of year as decimal number, with Sunday as first day of week (00 – 53)

%w Weekday as decimal number (0 – 6; Sunday is 0)

%W Week of year as decimal number, with Monday as first day of week (00 – 53)

%x Date representation for current locale

%X Time representation for current locale

%y Year without century, as decimal number (00 – 99)

%Y Year with century, as decimal number

%Z Time-zone name or abbreviation; no characters if time zone is unknown

%% Percent sign
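
Python's datetime.strftime accepts the same control strings, so it offers a quick way to try a format string before putting it in a <date/> tag.  The sketch below uses a fixed example moment and assumes Python's default (English) locale.  It illustrates the format codes only and says nothing about how Pandorabots implements the tag.

```python
from datetime import datetime

# Fixed example moment: Thursday, 2 December 2004, 15:04:05.
moment = datetime(2004, 12, 2, 15, 4, 5)

print(moment.strftime("%A"))                   # Thursday
print(moment.strftime("%a %b %d"))             # Thu Dec 02
print(moment.strftime("It is %I:%M %p"))       # It is 03:04 PM
print(moment.strftime("%H:%M:%S on day %j"))   # 15:04:05 on day 337
print(moment.strftime("100%% sure it is %Y"))  # 100% sure it is 2004
```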

 

 

If you don't specify a format you'll just get the date using the default format for the particular locale.

 

timezone is the time zone expressed as the number of hours west of GMT.

 

If any of the attributes is invalid, the tag falls back to the default behavior of <date/> (i.e., with no attributes specified).

 

To display the date and time in French using Central European time you would use:

 
         <date locale="fr_FR" timezone="-1" format="%c"/>
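
Because the offset counts hours west of GMT, zones east of Greenwich take negative values, which is why Central European time is written as -1.  The Python sketch below spells out that arithmetic.  The helper name local_time is an assumption for illustration, and daylight saving time is ignored.

```python
from datetime import datetime, timedelta


def local_time(gmt_time, hours_west):
    """Shift a GMT time by an offset given in hours west of GMT."""
    return gmt_time - timedelta(hours=hours_west)


noon_gmt = datetime(2004, 12, 2, 12, 0)

# Central European time is one hour east of GMT, hence -1.
print(local_time(noon_gmt, -1).strftime("%H:%M"))  # 13:00

# US Eastern standard time is five hours west of GMT, hence 5.
print(local_time(noon_gmt, 5).strftime("%H:%M"))   # 07:00
```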
 

You can also improve the specificity of certain common time- and date-related inquiries to the ALICE bot, as illustrated by the following dialogue fragment.

 

 

Human: what day is it
ALICE: Thursday.
Human: what month is it
ALICE: December.
Human: what year is this
ALICE: 2004.
Human: what is the date
ALICE: Thursday, December 02, 2004.
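
Each reply in this fragment corresponds to a single format string.  The Python sketch below reproduces the dialogue for the same date.  The lookup table and the answer function are hypothetical, since the actual ALICE categories are not shown here, and the output assumes the default English locale.

```python
from datetime import datetime

# Hypothetical table: normalized input -> strftime format string.
DATE_QUESTIONS = {
    "WHAT DAY IS IT": "%A.",
    "WHAT MONTH IS IT": "%B.",
    "WHAT YEAR IS THIS": "%Y.",
    "WHAT IS THE DATE": "%A, %B %d, %Y.",
}


def answer(question, now):
    """Return the formatted reply, or None if the question is not in the table."""
    fmt = DATE_QUESTIONS.get(question.strip().upper())
    return now.strftime(fmt) if fmt else None


now = datetime(2004, 12, 2)
for question in ("what day is it", "what month is it",
                 "what year is this", "what is the date"):
    print("Human:", question)
    print("ALICE:", answer(question, now))
```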

 

 

 

 

No system tag

 

 

The AIML <system> tag is the key to creating the operating system of the future, because it runs any operating system command.  In standard AIML, you can use <system> to do everything from telling you the date and time, to opening a Notebook editor, to controlling a robot, you name it!  Your imagination is the limit when you consider all the possibilities.  Unfortunately, Pandorabots does not let you take over their system with the <system> tag, which is exactly what hackers and malicious coders would do if it were available to the general public for free.  That is a pity, because Pandorabots is written in Lisp, and a <system> tag to the Lisp evaluator would be a fascinating project for AIML developers.  But remember, you are running your bot on their server, so it makes sense that a limitation like no <system> tag might exist.  Likewise, there is no equivalent of the server-side <javascript> tag.

You can of course write client-side JavaScript code, or any client-side code that you can embed in HTML, such as an applet, because you may include any HTML inside the AIML response.  The <script> tag is normally safe inside AIML responses in Pandorabots.  It will be passed along to the browser and interpreted there.

 

No predicate defaults

 

Although we saw in a previous section how to set predicate defaults in Pandorabots with AIML, most other AIML interpreters support predicate defaults in a different way, using a startup data file.  Similarly, Pandorabots gives the botmaster no control over a variety of functions that are, at least for the time being, pretty much closed or hard-wired.

 

·        Deperiodization – Removing ambiguous punctuation, such as the periods in “Dr.” and “St.”, and also applying heuristic rules to determine what makes a sentence a sentence.  This feature is hard-wired in Pandorabots.

 

·        Normalization – Expanding contractions, removing all remaining punctuation, repairing many spelling errors.  This feature is hard-wired in Pandorabots.

 

·        Predicate defaults – AIML predicates have a default value for <get/>.  You can set only one global default value for <get/> in Pandorabots.  In this book, under the section on custom HTML, we showed a trick using embedded HTML-side AIML (another non-standard, embrace-and-extend feature) to set the default value of predicates.

 

·        Predicate <set/> returns – Some predicates return the predicate name, such as pronouns, and some return the set values.  These choices are hard-wired in Pandorabots.
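
To give a feel for what the first two steps involve, here is a toy Python sketch of deperiodization followed by normalization.  It is not the Pandorabots rule set, which is not published here.  The abbreviation and contraction tables are hypothetical, spelling repair is omitted, and the final uppercasing is an added assumption reflecting the usual uppercase form of AIML patterns.

```python
import re

# Hypothetical substitution tables; a real interpreter uses far larger ones.
ABBREVIATIONS = {"dr.": "dr", "st.": "st", "mr.": "mr"}
CONTRACTIONS = {"don't": "do not", "i'm": "i am", "what's": "what is"}


def deperiodize(text):
    """Strip periods from known abbreviations, then split into sentences."""
    words = [ABBREVIATIONS.get(w.lower(), w) for w in text.split()]
    cleaned = " ".join(words)
    return [s.strip() for s in re.split(r"[.!?]+", cleaned) if s.strip()]


def normalize(sentence):
    """Expand contractions, drop remaining punctuation, and uppercase."""
    words = [CONTRACTIONS.get(w.lower(), w) for w in sentence.split()]
    text = re.sub(r"[^\w\s]", "", " ".join(words))
    return " ".join(text.upper().split())


sentences = deperiodize("Dr. Wallace is here. I'm glad you don't mind!")
print([normalize(s) for s in sentences])
# ['DR WALLACE IS HERE', 'I AM GLAD YOU DO NOT MIND']
```

Without the abbreviation table, the period in "Dr." would be taken as the end of a sentence, which is exactly the ambiguity deperiodization exists to remove.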