An Introduction to Generative Grammars
Posted by: roscivs, in UncategorizedNoam Chomsky is an interesting fellow. Most who have heard of him are surprised to hear that he is one of the most influential linguists of the 20th century. (Linguists, on the other hand, are often surprised to hear that he is widely known as a political activist.) But his contributions to the field of linguistics are wide and varied. Even more curious is the fact that some of these contributions have helped shaped the field of computer science.
Chomsky, in the 1960s, came up with a theory called transformational grammar, maintaining that humans have some sort of innate way of telling whether a given sentence is grammatical or ungrammatical, and that foundationally, this sense is common across all humans no matter what the language. As part of this idea, Chomsky also discussed formal grammars, which are mathematically precise ways of describing a “language”. Although Chomsky believed that formal grammars were unable to fully express the intricacies of human languages, his formal grammar theory has become the foundation of modern computer languages.
The central idea is that of a generative grammar: a set of rules for transforming strings. Chomsky was the first to formalize this idea back in the 50s. Under the formal definition, a grammar has exactly four pieces: a set of “nonterminal symbols”, a set of “terminal symbols”, a list of “production rules”, and a single “starting symbol” from the set of nonterminals. Let’s try a concrete example.
The set of nonterminals: <S>, <N>, <V>. The set of terminals: Bob, Jill, saw, kissed. The starting symbol: <S>. As for the production rules, they must be of the format: X → Y. My example grammar has only five production rules:
1. <S> → <N> <V> <N>
2. <V> → saw
3. <V> → kissed
4. <N> → Bob
5. <N> → Jill
What a production rule means is that if you have something on the left-hand side of a rule anywhere in your string, you can, if you so desire, change it to the corresponding bit on the right. For example, if I have the string “<N> <N> <N>” I can, using rule #4, change my string to become “<N> Bob <N>”. (I also could have changed any of the other two symbols using rule #4, or I could have used rule #5 instead.)
You can look at a formal grammar like this as something of a game: given the starting symbol and using production rules, what strings of terminals are possible? Here’s one possible string, with its full derivation:
1. <S> (starting symbol)
2. <N> <V> <N> (rule #1)
3. <N> <V> Bob (rule #4)
4. <N> saw Bob (rule #2)
5. Jill saw Bob (rule #5)
Now my string contains all terminal symbols, so my use of production rules must come to an end. (Production rules may have anything on the right-hand side, but the left-hand side must only contain nonterminals—which is where the name “terminals” comes from, because once your string contains only terminals, it is finished.)Another variety of the game consists of starting with a given string of terminals, and attempting to figure out the proper derivation for it. For example, can you give the derivation for “Bob kissed Jill”? How about “Bob saw”? Or “Jill kissed Jill”? (Hint: one of those strings has no valid derivation.)
Strings with a valid derivation—there’s some way to get to them via the starting symbol—are said to be “in the language described by the grammar”. Strings without a valid derivation—no matter what rules you apply, you can never get to them from the starting symbol—are said to “not exist in the language described by the grammar”. A “formal language,” then, is the list of all strings that a generative grammar can produce.
How many different strings are in the language produced by my example grammar? First one to post them all gets a cookie!
Entries (RSS)
March 23rd, 2007 at 6:01 pm
M’m. Cookie.
If I understand the algorithm correctly, you can say “Jill/Bob saw/kissed Jill/Bob”, choosing either option for each pair, to generate 8 different statements. Is that right?
March 23rd, 2007 at 7:54 pm
Yup! You’ve got it. I can either fax you a cookie now, or hand-deliver around Christmastime.