Recall the Automated Tourist Guide we introduced at the beginning of this tutorial. If we are to extend the Tourist Guide so that it can understand sentences like the one on the first page of this tutorial ("Can you tell me how to get to the gallery in the square containing the monument?"), and generate appropriate responses, we first need to know about the formal structure of sentences in the English language.[1]

In many grammars, syntax (formal structure) and semantics (meaning) are strongly interlinked, insofar as the meaning of a complete sentence is in part determined (according to the so-called 'rule-to-rule hypothesis') by the manner in which its constituent words and phrases are syntactically related. We discuss this further in a later part of this tutorial. For the present, we wish merely to stress that every well-formed sentence in a natural language has a formal syntactic structure, and that we can unambiguously describe that structure irrespective of whatever meanings it may be used to convey. That is, we need to be able to specify all the possible forms of acceptable sentences in the English language (or at least in the subset of English which we expect our system to handle). We need to do so in a manner that enables us to write computer programs which, given strings of words as input, will distinguish between those that we feel, as native speakers, to be grammatical sentences of English and those that are not.

There are two strong motivations for this: in the first place, such a program would capture our intuitions as speakers of English about the formal structure of English sentences. We will want our program to be able to distinguish, for example, between novel but perfectly grammatical strings such as those in (1) and non-sentences of the kind exemplified in (2):

(1) a. All stuffed grey elephants are moderately inflammable.
    b. There are no such things as triangular virtues.

(2) a. *Inflammable all grey moderately elephants stuffed are.
    b. *There are are are are as as virtues.

(Note the asterisk in front of the strings in (2). It is conventional to indicate that a string is ill-formed by prefixing it with a "*", and such strings are often called 'starred' sentences.)

Task. From your own informal knowledge of English grammar, try to describe what makes the strings of words in (2) ungrammatical. Now try to say (and this is more difficult) what makes those in (1) grammatical.
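To make the goal concrete, here is a minimal sketch in Python of the kind of program described above: one that accepts some strings of words and rejects others purely on the basis of their form. Everything in it is an assumption made for illustration and not part of the Tourist Guide itself: the category names (S, NP, Nom, VP, AP, Det, Adj, Adv, N, V), the handful of rules, the tiny lexicon, and the function names `spans`, `match_sequence` and `is_grammatical`. The grammar is deliberately far too small to cover English; it only knows the words of example (1a).

```python
# Toy phrase-structure rules (hypothetical, for illustration only).
# Each category maps to a list of alternative expansions.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "Nom"], ["Nom"]],
    "Nom": [["Adj", "Nom"], ["N"]],
    "VP":  [["V", "AP"]],
    "AP":  [["Adv", "Adj"], ["Adj"]],
}

# Toy lexicon: the word classes each known word may belong to.
LEXICON = {
    "all": {"Det"},
    "stuffed": {"Adj"},
    "grey": {"Adj"},
    "inflammable": {"Adj"},
    "elephants": {"N"},
    "are": {"V"},
    "moderately": {"Adv"},
}


def spans(symbol, words, start):
    """Yield each position at which `symbol` can end if it begins at `start`."""
    if symbol in GRAMMAR:
        # Phrase category: try every alternative expansion in turn.
        for expansion in GRAMMAR[symbol]:
            yield from match_sequence(expansion, words, start)
    elif start < len(words) and symbol in LEXICON.get(words[start], ()):
        # Word class: it covers exactly one word.
        yield start + 1


def match_sequence(symbols, words, start):
    """Yield each end position for the categories in `symbols` matched in order."""
    if not symbols:
        yield start
        return
    for middle in spans(symbols[0], words, start):
        yield from match_sequence(symbols[1:], words, middle)


def is_grammatical(sentence):
    """True if the whole string of words can be analysed as an S."""
    words = sentence.lower().rstrip(".").split()
    return any(end == len(words) for end in spans("S", words, 0))


print(is_grammatical("All stuffed grey elephants are moderately inflammable."))
print(is_grammatical("Inflammable all grey moderately elephants stuffed are."))
print(is_grammatical("Grey elephants are inflammable."))
```

With these rules the first and third strings are accepted and the second, which is (2a) without its asterisk, is rejected. Note that the third string appears nowhere in the rules or the lexicon as a whole sentence: the program accepts it because of its form, which is the sense in which such a program captures intuitions about novel sentences. Note also that nothing in the program refers to what the words mean.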

[1] Although we refer to 'English' throughout, the principles described in this tutorial of course apply to any natural language.