The term was borrowed from pattern recognition theory.
The general pattern recognition problem is to partition a space P of inputs into disjoint regions so that the pattern classifier can assign a point x from P to one of those regions. The regions are called categories C_{1}, C_{2}, ..., C_{n}.
Formally, the union of all the C_{i} equals P, and the intersection of any pair C_{i} and C_{j} is Ø (the empty set) whenever i ≠ j.
The pattern recognition problem is to categorize x into one of the C_{i}.
In many cases the partition is defined by a matching function f(x, i) which computes a "distance" from x to the category C_{i}. For any given point x in P, x is categorized as C_{i} provided f(x, i) ≤ f(x, j) for every other category j.
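The rule above can be sketched in a few lines of Python. This is only an illustration: the names classify, centres and distance are hypothetical, and the one-dimensional "distance to a centre" stands in for whatever matching function a real classifier would use.

```python
from typing import Callable, Sequence

def classify(x, categories: Sequence[int], f: Callable) -> int:
    """Return the category index i for which f(x, i) is smallest."""
    # min() returns the first minimal element, so ties go to the
    # category listed first (an arbitrary choice made for this sketch).
    return min(categories, key=lambda i: f(x, i))

# Hypothetical input space: points on the real line.
# Each category i is represented by a single centre point.
centres = {1: 0.0, 2: 10.0, 3: 20.0}

def distance(x: float, i: int) -> float:
    """Matching function f(x, i): distance from x to the centre of category i."""
    return abs(x - centres[i])

print(classify(12.5, list(centres), distance))  # 2
```

Because the rule uses ≤ rather than <, a point that is equally close to two categories satisfies it for both; a practical classifier needs some tie-breaking convention to keep the regions disjoint.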
If the input space consists of 32-digit bar code scans, and the categories represent different items for sale, then the problem is to classify a given 32-bit input into the nearest matching code for one of those items.
If the input space consists of 250,000-pixel TV pictures of human faces, and the categories represent a set of wanted terrorists, then the problem is to match the image with one of the terrorists. This case shows that there may be a special category C´ indicating "no match".
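One common way to obtain a "no match" category is to reject any input whose best distance is still too large. The sketch below assumes this thresholding approach, which the text does not prescribe, and uses short bit strings with Hamming distance in place of real bar codes or images. The names hamming, classify_or_reject, codes and max_distance are all hypothetical.

```python
from typing import Dict, Optional

NO_MATCH = None  # stands in for the special category C´

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(ca != cb for ca, cb in zip(a, b))

def classify_or_reject(x: str, codes: Dict[str, str],
                       max_distance: int) -> Optional[str]:
    """Return the nearest item, or NO_MATCH if even that item is too far away."""
    best_item = min(codes, key=lambda item: hamming(x, codes[item]))
    if hamming(x, codes[best_item]) > max_distance:
        return NO_MATCH
    return best_item

# Made-up 8-bit codes, for illustration only.
codes = {"milk": "00001111", "bread": "11110000"}

print(classify_or_reject("00001110", codes, max_distance=2))  # milk
print(classify_or_reject("01011010", codes, max_distance=2))  # None
```

With the threshold in place, C´ is simply the region of P that lies farther than max_distance from every other category, so the categories still cover the whole input space.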
In the case of AIML, the input space P consists of all 3-tuples of (input, that, topic) strings in normalized form. The AIML categories partition the pattern space into disjoint regions, determined by the order of the matching function.
If the only category is the default one with <pattern> = <that> = <topic> = *, then it both partitions and fills the pattern space. Every input matches that pattern.
Adding one more category, <pattern>HELLO</pattern> and <that> = <topic> = *, partitions the input space into two regions: those that match this new category, and those that match the default.
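The two-category partition can be illustrated with a deliberately simplified matcher. This is a sketch under stated assumptions, not the matching algorithm of any AIML interpreter: here "*" matches any string at all, every other pattern must equal the normalized string exactly, the more specific category is tried first, and normalize is a crude stand-in for real AIML normalization. All names are hypothetical.

```python
def normalize(text: str) -> str:
    """Crude normalization: drop punctuation, uppercase, collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text)
    return " ".join(cleaned.upper().split())

def matches(pattern: str, value: str) -> bool:
    """Simplified rule: '*' matches anything, otherwise require equality."""
    return pattern == "*" or pattern == value

# Each category is a (pattern, that, topic) triple, most specific first.
CATEGORIES = [
    ("HELLO", "*", "*"),  # the added category
    ("*", "*", "*"),      # the default category
]

def categorize(inp: str, that: str = "", topic: str = "") -> tuple:
    """Return the first category whose three patterns all match the input point."""
    point = (normalize(inp), normalize(that), normalize(topic))
    for category in CATEGORIES:
        if all(matches(p, v) for p, v in zip(category, point)):
            return category
    raise LookupError("no category matched")  # not reached while the default exists

print(categorize("Hello!"))       # ('HELLO', '*', '*')
print(categorize("Hello there"))  # ('*', '*', '*')
```

Every input lands in exactly one of the two regions: the default category guarantees coverage, and trying the specific category first keeps the regions disjoint.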
