ELIZA (cont'd): Matching sentence patterns

In this method, we list neither all the possible sentences in the language nor a set of keywords and their responses, but the possible patterns that sentences in English can have. This allows us (at least apparently) to make far more general statements about the form of grammatical sentences in English, and to distinguish many grammatical strings from ungrammatical ones.

For example, it is clear that sentences (1) to (3) resemble each other in a fairly basic respect:

  1. My mother drinks black tea
  2. My pet alligator drinks neat gin
  3. My aunt Mabel drinks anything that comes her way

They have in common the following underlying pattern, or template:[1]

My <some word(s)> drinks <some more words>

and we could easily think of many words and expressions we could insert into the angle brackets to create new sentences of the same kind. In contrast, the sequence:

drinks <some word(s)> my <some more words>

does not appear to be a plausible pattern for sentences in English, and would not be expected to match well-formed English sentences. Since sentences tend to be patterned in a small number of fairly regular ways in English, we might guess that a list of such patterns need not be cumbersomely vast.
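As a rough illustration of the template above, here is a minimal Prolog sketch. It assumes that a sentence has already been split into a list of lower-case words; the predicate name matches_drinks_pattern is made up for this example and is not part of ELIZA.

```prolog
:- use_module(library(lists)).   % for append/3

% matches_drinks_pattern(+Words)   (hypothetical name)
% Succeeds if Words has the shape:
%   my <some word(s)> drinks <some more words>
matches_drinks_pattern([my | Rest]) :-
    append(Subject, [drinks | Object], Rest),
    Subject = [_ | _],     % at least one word before 'drinks'
    Object  = [_ | _].     % at least one word after 'drinks'

% ?- matches_drinks_pattern([my, mother, drinks, black, tea]).
% true.
% ?- matches_drinks_pattern([drinks, black, tea, my, mother]).
% false.
```

The second query fails because the list does not begin with the fixed word my, so the scrambled sequence is rejected by this pattern.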

Task. Make up a couple more patterns yourself. You might begin, as above, by writing out a list of sentences that have words in common, and using the common words as the basis for your patterns.

Suppose that a program is designed to recognise this sentence pattern, written, for example, in the following form:

my ?someone drinks ?something

and that, having recognised that pattern, it is designed to respond with another pattern:

you should tell your ^someone to stop drinking ^something

This, though very much simplified, is how ELIZA works, in something like the following manner:

if the user types in

my ?someone drinks ?something

then reply with

you should tell your ^someone to stop drinking ^something

where the first and third lines are instructions to ELIZA: look out for the pattern given on the second line and, having found it, issue a reply by copying into the template on the fourth line those of the user's words that matched the 'variables' (i.e. the unfixed elements) of the pattern.

Thus, the user might type in:

my second cousin drinks snake oil

and ELIZA will dutifully reply

you should tell your second cousin to stop drinking snake oil
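The exchange above can be sketched in Prolog as follows. This is only an illustrative sketch, not the implementation developed later in the tutorial: it assumes the user's input is already a list of lower-case words, and the predicate name reply is chosen here purely for the example.

```prolog
:- use_module(library(lists)).   % for append/3

% reply(+Input, -Reply)   (hypothetical name)
% Input pattern:   my ?someone drinks ?something
% Reply template:  you should tell your ^someone to stop drinking ^something
reply(Input, Reply) :-
    % Split the input around the fixed words 'my' and 'drinks'.
    append([my | Someone], [drinks | Something], Input),
    Someone   = [_ | _],   % ?someone must match at least one word
    Something = [_ | _],   % ?something must match at least one word
    % Copy the matched phrases into the reply template.
    append([you, should, tell, your | Someone],
           [to, stop, drinking | Something],
           Reply).

% ?- reply([my, second, cousin, drinks, snake, oil], Reply).
% Reply = [you, should, tell, your, second, cousin,
%          to, stop, drinking, snake, oil].
```

Here the two lists Someone and Something play the role of ?someone and ?something when the pattern is matched, and of ^someone and ^something when the reply is built.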

In the following pages of this tutorial we shall implement a pattern-matching version of ELIZA in Prolog.

We shall adopt the following typographic conventions in our examples:

  • Variables in English sentences will be enclosed within angle brackets, thus: <variable>.

  • Named variables in Prolog, if single words (atoms), are represented by words beginning with an upper case letter (this is standard Prolog), thus: Variable.

  • Anonymous variables in Prolog, if single words (atoms), are represented by the underscore character (again, this is standard Prolog), thus: _.

  • Variables of indeterminate length (an entire phrase of several words, for example) in Prolog clauses will be represented in one of two ways. Where a value is being assigned to the variable, it is written as a question mark followed by a word beginning with a lower case letter, thus: ?variable. Where a value is being returned by the variable, it is written as a 'hat' symbol followed by a word beginning with a lower case letter, thus: ^variable.
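To see why a separate notation is wanted for phrases, consider this small sketch in standard Prolog. It uses only a named variable and the anonymous variable, each of which stands for exactly one word of the list; the predicate name single_word_drinker is invented for the example.

```prolog
% single_word_drinker(+Words, -Who)   (hypothetical name)
% Who is a named variable; the underscore is the anonymous variable.
% Each stands for exactly one element (one word) of the list.
single_word_drinker([my, Who, drinks, _], Who).

% ?- single_word_drinker([my, mother, drinks, tea], Who).
% Who = mother.
% ?- single_word_drinker([my, pet, alligator, drinks, neat, gin], Who).
% false.
```

The second query fails because "pet alligator" and "neat gin" are two words each, which is the case that the ?variable and ^variable notation is meant to cover.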