The reason I brought up the topic of formal grammars yesterday is that John Backus, a pioneer in the field of computer science and creator of the first high-level programming language (FORTRAN), passed away last Saturday. He’s probably known best for Backus-Naur Form (BNF), a formal notation for describing formal grammars. In BNF, my example grammar from yesterday would look like this:

<S> ::= <N> <V> <N>
<V> ::= saw | kissed
<N> ::= Bob | Jill

The major differences are the use of “::=” instead of an arrow, and the use of a vertical pipe “|” as sort of a shorthand to compress multiple production rules into one.

Not only is BNF a simple and standard way of representing grammars, but there are computer programs that will take as input a grammar in BNF, and automatically generate code that will parse (generate the derivation rules) for any given string. This turns out to be very convenient, since nearly all computer languages can be described in BNF. (Like Java, for example.)

Now, not every grammar is as simple as my example grammar, which could only generate eight unique strings. Nearly every formal grammar that is remotely interesting can generate an infinite number of strings. For example, here’s another very simple grammar in BNF:

<S> ::= <S> <S> | a

This grammar can generate strings like, “a”, or “a a”, or “a a a”, and so on. It can generate an infinite number of strings (or you can say the language of this grammar contains an infinite number of strings), but obviously it doesn’t contain every possible string. “a b a”, for example, is not in this language.

Chomsky, when describing formal grammars and formal languages, separated them out into four distinct categories of complexity, also called the Chomsky Hierarchy. The very simplest kind is called a “regular grammar”. Regular grammars are only allowed to have two types of rules:

<X> ::= z
<X> ::= <Y> z

In other words, a rule can either consist of a single nonterminal on the left, and a single terminal on the right, or a single nonterminal on the left, and a single nonterminal on the right followed by a single terminal. The grammar I listed above does not qualify, since it has a rule with two nonterminals on the right-hand side. However, the language it describes (strings of “a”s) can be described by a regular grammar.

Would anyone care to try their hand at finding it?

One Response to “John Backus, الله يرحمه”

  1. roscivs says:

    I think this one was too easy, so I’ll give it myself:

    <S> ::= a
    <S> ::= <S> a

Leave a Reply