In general, to write programs and learn programming languages, understanding EBNF is not strictly necessary, but it can be a helpful tool. In the vast majority of cases, people learn the syntax and structure of a language through other means, such as rules stated informally in a natural language and/or even just through enough examples.
EBNF is a means to formally and precisely define syntax rules, which can be quite concise and remove ambiguities. It does so by specifying (production) rules on how valid examples of a language may be constructed. Since EBNF is itself a language, it can even be used to specify its own syntax. However, this sort of recursive approach is unlikely to be an effective teaching method. Instead, let me try to give an example.
Here is how EBNF might be used to specify a very simple and limited language. The lines beginning with “//” are comments:
// A sentence is one or more words, with spaces in between,
// and a period at the end
sentence = word , { " " , word } , "."
// A word is one or more letters
word = letter , { letter }
// A letter is either "a" or "b" or "c"
letter = "a" | "b" | "c"
In the above example, the expressions "a", "b", "c", " ", and "." are the lowest-level building blocks available in this language, known as “terminal strings”. The other building blocks introduced, sentence, word, and letter, are examples of “non-terminal symbols”, since they are built up from other blocks as specified by “production rules” (the equations describing how they are formed). In these rules, the comma , means concatenation, the braces { ... } mean zero or more repetitions of what is inside, and the pipe | means “or” (i.e., a choice between the options on either side).
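To make the three rules a bit more concrete, here is a rough sketch of how they could be turned into a hand-written recognizer in Python, with one function per rule. The function names (parse_letter, parse_word, parse_sentence, is_sentence) are my own inventions for illustration, and this is only one of many ways to do it:

```python
# Each parse_* function takes a string and a start index and returns the
# index just past what it matched, or None if the rule does not match there.

def parse_letter(s, i):
    # letter = "a" | "b" | "c"
    if i < len(s) and s[i] in "abc":
        return i + 1
    return None

def parse_word(s, i):
    # word = letter , { letter }
    i = parse_letter(s, i)
    if i is None:
        return None
    while True:  # { letter }: zero or more further letters
        nxt = parse_letter(s, i)
        if nxt is None:
            return i
        i = nxt

def parse_sentence(s, i=0):
    # sentence = word , { " " , word } , "."
    i = parse_word(s, i)
    if i is None:
        return None
    # { " " , word }: zero or more repetitions of one space followed by a word
    while i < len(s) and s[i] == " ":
        nxt = parse_word(s, i + 1)
        if nxt is None:
            break  # the space was not followed by a word; stop repeating
        i = nxt
    if i < len(s) and s[i] == ".":
        return i + 1
    return None

def is_sentence(s):
    # A valid sentence must consume the whole string, with nothing left over.
    return parse_sentence(s) == len(s)

print(is_sentence("b c aa b."))
print(is_sentence("cat."))
```

Notice how the comma in a rule becomes "do this, then that", the braces become a loop, and the pipe becomes a choice among alternatives.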
This is an overly simplified example, since only five characters (three letters, the space, and the period) are allowed, but we could have specified more. Here are examples of valid sentences according to the above syntax rules:
abc.
b c aa b.
a.
c cc a.
cab.
Here are examples of invalid sentences, with comments explaining why:
// Contains invalid letters
d.
A.
cat.
hello.
// Missing period at end
a b c
aa
// Too many spaces between words, or spaces at beginning/end
 a.
a  b.
c b a .
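If you want to check examples like these mechanically, one option is a regular expression. This particular grammar has no recursion, so it happens to describe a regular language, and the pattern below is my own translation of the three rules (it is not part of EBNF itself). The names SENTENCE_RE and matches are made up for this sketch, and for the last three invalid strings I am assuming a leading space, a doubled space, and a space before the period, respectively:

```python
import re

# [abc]+        -> word (one or more letters)
# (?: [abc]+)*  -> { " " , word }
# \.            -> the final period
SENTENCE_RE = re.compile(r"[abc]+(?: [abc]+)*\.")

def matches(s):
    # fullmatch requires the entire string to fit the pattern
    return SENTENCE_RE.fullmatch(s) is not None

valid = ["abc.", "b c aa b.", "a.", "c cc a.", "cab."]
# The spacing in the last three entries is an assumption (see above).
invalid = ["d.", "A.", "cat.", "hello.", "a b c", "aa", " a.", "a  b.", "c b a ."]

for s in valid + invalid:
    print(repr(s), matches(s))
```

Most real programming languages allow nesting (parentheses inside parentheses, blocks inside blocks), which a plain regular expression like this cannot express, and that is one reason a notation like EBNF is used for them.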
Edit: I wasn’t originally intending to write such a long, pedantic post about linguistics, yet here we are. I guess no one should be surprised.