# Theory of Computation 5 Combining Languages ```Theory of Computation 5
Combining Languages
Frank Stephan
Department of Computer Science
Department of Mathematics
National University of Singapore
Theory of Computation 5 Combining Languages – p. 1
Repetition 1
If (Q, Σ, δ, s, F) is a non-deterministic finite automaton (nfa)
then δ maps each pair (q, a) to a set of states rather than to a
single state, that is, for q ∈ Q and a ∈ Σ there can be several
p ∈ Q with p ∈ δ(q, a).
A run of an nfa on a word a1 a2 . . . an is a sequence
q0 q1 q2 . . . qn ∈ Q∗ such that q0 = s and qm+1 ∈ δ(qm , am+1 )
for all m < n.
If qn ∈ F then the run is “accepting” else the run is
“rejecting”.
The nfa accepts a word w iff it has an accepting run on w;
this is the case even if other runs on w are rejecting.
Repetition 2
The language {w : some letter appears twice} has an nfa
with n + 2 states while a dfa needs 2^n + 1 states, where
n = |Σ|; the nfa is shown here for n = 4.
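[Figure: nfa for n = 4 with states ∅ (start), {0}, {1}, {2}, {3} and #.
Edge labels: 0,1,2,3 (twice), the single letters 0, 1, 2, 3 (twice each),
and 1,2,3 / 0,2,3 / 0,1,3 / 0,1,2 at the states {0}, {1}, {2}, {3}.]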
Repetition 3
Given an nfa, a state q and a symbol a, let the set δ(q, a)
denote all states q′ to which the nfa can transit from
q on symbol a.
Theorem 4.5 [Büchi; Rabin and Scott]
For each nfa (Q, Σ, δ, s, F) with n = |Q| states, there is an
equivalent dfa ({Q′ : Q′ ⊆ Q}, Σ, δ′, {s}, F′) with 2^n states
such that F′ = {Q′ ⊆ Q : Q′ ∩ F ≠ ∅} and
∀Q′ ⊆ Q ∀a ∈ Σ [δ′(Q′, a) = ⋃_{q′ ∈ Q′} δ(q′, a)
= {q′′ ∈ Q : ∃q′ ∈ Q′ [q′′ ∈ δ(q′, a)]}].
As 2^n states are often far more than needed, it is good to
minimise the resulting automaton with the algorithm of
Myhill and Nerode.
Repetition 4
The following statements are all equivalent to “L is regular”:
(a) L is generated by a regular expression;
(b) L is generated by a regular grammar;
(c) L is recognised by a deterministic finite automaton;
(d) L is recognised by a non-deterministic finite automaton;
(e) L and Σ∗ − L both satisfy the Block Pumping Lemma;
(f) L satisfies Jaffe’s Matching Pumping Lemma;
(g) L has only finitely many derivatives.
Product Automata
Let (Q1 , Σ, δ1 , s1 , F1 ) and (Q2 , Σ, δ2 , s2 , F2 ) be dfas which
recognise L1 and L2 , respectively.
Consider (Q1 × Q2 , Σ, δ1 × δ2 , (s1 , s2 ), F) with
(δ1 × δ2 )((q1 , q2 ), a) = (δ1 (q1 , a), δ2 (q2 , a)). This automaton
is called a product automaton and one can choose F such
that it recognises the union or intersection or difference of
the respective languages.
Union: F = F1 × Q2 ∪ Q1 × F2 ;
Intersection: F = F1 × F2 = F1 × Q2 ∩ Q1 × F2 ;
Difference: F = F1 × (Q2 − F2 );
Symmetric Difference: F = F1 × (Q2 − F2 ) ∪ (Q1 − F1 ) × F2 .
Example
For a = 1, 2, let the automaton ({s, t}, {0, 1, 2}, δa , s, {s})
recognise the words containing an even number of a: if the
input symbol b equals a then the state is changed, else the
state remains unchanged.
Quiz: Which Boolean combination does this product
automaton recognise?
[Figure: product automaton with states (s, s) (start), (t, s), (s, t)
and (t, t); transitions are labelled 0, 1 and 2.]
Kleene Star
Assume (Q, Σ, δ, s, F) is an nfa recognising L. Now L∗ is
recognised by (Q ∪ {s′ }, Σ, δ ′ , s′ , {s′ } ∪ F) where
δ ′ (s′ , a) = δ(s, a) and δ ′ (p, a) = δ(p, a) for p ∈ Q − F and
δ ′ (p, a) = δ(p, a) ∪ δ(s, a) for p ∈ F.
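[Figure: an nfa with states s (start) and t, next to the nfa for its
Kleene star with states s′ (start), s and t; transitions are labelled
0 and 1.]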
Concatenation
Assume (Q1 , Σ, δ1 , s1 , F1 ) and (Q2 , Σ, δ2 , s2 , F2 ) are nfas
recognising L1 and L2 with Q1 ∩ Q2 = ∅ and assume
ε ∉ L2 . Now (Q1 ∪ Q2 , Σ, δ, s1 , F2 ) recognises L1 · L2 where
(p, a, q) ∈ δ whenever (p, a, q) ∈ δ1 ∪ δ2 or (p ∈ F1 and
(s2 , a, q) ∈ δ2 ).
If L2 contains ε then one can consider the union of L1 and
L1 · (L2 − {ε}).
Example
L1 · L2 with L1 = {00, 11}∗ and L2 = 2∗ 1+ 0+ .
[Figure: nfa for L1 · L2 with states s1 (start), q1, r1, s2, q2 and r2;
transitions are labelled 0, 1 and 2.]
Exercise 5.3
The previous slides give an upper bound of n^2 states on the
size of the dfa for a union, intersection, difference and
symmetric difference, provided that the original two dfas
have at most n states.
Give the corresponding bounds for nfas: If L and H are
recognised by nfas having at most n states each, how many
states does one need at most for an nfa recognising (a) the
union L ∪ H, (b) the intersection L ∩ H, (c) the difference
L − H and (d) the symmetric difference (L − H) ∪ (H − L)?
Give the bounds in terms of “linear”, “quadratic” and
Sample Automata
Exercise 5.4
Let Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Construct a (not
necessarily complete) dfa recognising the language
Σ · {aa : a ∈ Σ} ∩ {aaaaa : a ∈ Σ}. It is not necessary to give a
full table for the dfa; a general schema and an explanation
of how it works suffice.
Exercise 5.5
Make an nfa for the intersection of the following languages:
{0, 1, 2}∗ · {001} · {0, 1, 2}∗ · {001} · {0, 1, 2}∗ ;
{001, 0001, 2}∗ ; {0, 1, 2}∗ · {00120001} · {0, 1, 2}∗ .
Exercise 5.6
Make an nfa for the union L0 ∪ L1 ∪ L2 with
La = {0, 1, 2}∗ · {aa} · {0, 1, 2}∗ · {aa} · {0, 1, 2}∗ for
a ∈ {0, 1, 2}.
Exercise 5.7
Consider two context-free grammars with terminals Σ,
disjoint non-terminals N1 and N2 , start symbols S1 ∈ N1
and S2 ∈ N2 and rule sets P1 and P2 which generate L and
H, respectively. Explain how to form from these a new
context-free grammar for
(a) L ∪ H,
(b) L · H and
(c) L∗ .
Write down the context-free grammars for {0^n 1^(2n) : n ∈ N}
and {0^n 1^(3n) : n ∈ N} and form the grammars for the
union, concatenation and star explicitly.
Example 5.8
The language {0}∗ · {1^n 2^n : n ∈ N} is context-free.
Grammar ({S, T}, {0, 1, 2}, P, S) with P given by
S → 0S|T|ε and T → 1T2|ε.
The language {0^n 1^n : n ∈ N} · 2∗ is context-free.
L = {0^n 1^n 2^n : n ∈ N} is not context-free but it is the
intersection of the two languages above.
The complement of L is the union of {0^n 1^m 2^k : n < k},
{0^n 1^m 2^k : n > k}, {0^n 1^m 2^k : m < k}, {0^n 1^m 2^k : m > k},
{0^n 1^m 2^k : n < m}, {0^n 1^m 2^k : n > m} and
{0, 1, 2}∗ · {10, 20, 21} · {0, 1, 2}∗ .
Each of these languages is context-free. Grammar for the
first of them: S → 0S2|S2|T2, T → 1T|ε. The union is also
context-free. Hence L has a context-free complement.
Context-Free Intersects Regular
Theorem 5.9
If L is context-free and H is regular then L ∩ H is
context-free.
Construction.
Let (N, Σ, P, S) be a context-free grammar generating L
with every rule being either A → w or A → BC with
A, B, C ∈ N and w ∈ Σ∗ .
Let (Q, Σ, δ, s, F) be a dfa recognising H.
Let S′ ∉ Q × N × Q and make the following new grammar
(Q × N × Q ∪ {S′ }, Σ, R, S′ ) with rules R:
S′ → (s, S, q) for all q ∈ F;
(p, A, q) → (p, B, r)(r, C, q) for all rules A → BC in P and
all p, q, r ∈ Q;
(p, A, q) → w for all rules A → w in P with δ(p, w) = q.
Exercises 5.10 and 5.11
Recall that the language L of all words which contain as
many 0s as 1s is context-free; a grammar for it is
({S}, {0, 1}, {S → SS|ε|0S1|1S0}, S).
Exercise 5.10
Construct a context-free grammar for L ∩ (001+)∗ .
Exercise 5.11
Construct a context-free grammar for L ∩ 0∗ 1∗ 0∗ 1∗ .
Context-Sensitive and Concatenation
Let L1 and L2 be context-sensitive languages not
containing ε. Let (N1 , Σ, P1 , S1 ) and (N2 , Σ, P2 , S2 ) be two
context-sensitive grammars generating L1 and L2 ,
respectively, where N1 ∩ N2 = ∅ and where each rule l → r
satisfies |l| ≤ |r| and l ∈ Ne^+ for the respective e ∈ {1, 2}.
Let S ∉ N1 ∪ N2 ∪ Σ.
Now (N1 ∪ N2 ∪ {S}, Σ, P1 ∪ P2 ∪ {S → S1 S2 }, S) generates
L1 · L2 .
If v ∈ L1 and w ∈ L2 then S ⇒ S1 S2 ⇒∗ vS2 ⇒∗ vw.
Furthermore, the first step of every derivation has to be
S ⇒ S1 S2 and from then onwards, each applied rule has a
left side l ∈ N1^+, so that it applies to the part generated
from S1 , or a left side l ∈ N2^+, so that it applies to the part
generated from S2 . Hence every intermediate word z in the
derivation is of the form z = xy with S1 ⇒∗ x and S2 ⇒∗ y.
Context-Sensitive and Kleene-star
Let (N1 , Σ, P1 , S1 ) and (N2 , Σ, P2 , S2 ) be context-sensitive
grammars for L − {ε} with N1 ∩ N2 = ∅ and all rules l → r
satisfying |l| ≤ |r| and l ∈ N1^+ or l ∈ N2^+, respectively. Let
S, S′ be symbols not in N1 ∪ N2 ∪ Σ.
Now consider (N1 ∪ N2 ∪ {S, S′ }, Σ, P, S) where P contains
the rules S → S′ |ε and S′ → S1 S2 S′ | S1 S2 | S1 plus all rules
in P1 ∪ P2 .
This grammar generates L∗ .
Context-Sensitive and Intersection
Theorem.
The intersection of two context-sensitive languages is
context-sensitive.
Construction.
Let (Nk , Σ, Pk , S) for k = 1, 2 be grammars for L1 and L2 . Now
make a new non-terminal set
N = (N1 ∪ Σ ∪ {#}) × (N2 ∪ Σ ∪ {#})
with start symbol [S/S], where [A/B] denotes the pair with upper
component A and lower component B, and the following types of rules:
(a) Rules to generate and manage space;
(b) Rules to generate a word v in the upper row;
(c) Rules to generate a word w in the lower row;
(d) Rules to convert a string from N into v provided that the
upper components and lower components of the string are
both v.
Type of Rules
Here [A/B] denotes the symbol of N with upper component A and lower
component B.
(a): [S/S] → [S/S][#/#] for producing space;
[A/B][#/C] → [#/B][A/C] and [A/C][B/#] → [A/#][B/C] for space
management.
(b) and (c): For each rule in P1 , for example, for
AB → CDE ∈ P1 , and all symbols F, G, H, . . . in N2 , one
has the corresponding rule [A/F][B/G][#/H] → [C/F][D/G][E/H]. So
rules in P1 are simulated in the upper half and rules in P2
are simulated in the lower half and they use up # if the left
side is shorter than the right one.
(d): Each rule [a/a] → a for a ∈ Σ is there to convert a
matching pair [a/a] from Σ × Σ (a nonterminal) to a (a
terminal).
Grammar for 0^n 1^n 2^n with n > 0
Grammar L1 : S → S2|0S1|01.
Grammar L2 : S → 0S|1S2|12.
Grammar for Intersection.
A, B, C stand for any members of {S, 0, 1, 2, #}; [A/B] denotes the
symbol with upper component A and lower component B.
N = {[A/B] : A, B ∈ {S, 0, 1, 2, #}}.
Rules: [S/S] → [S/S][#/#];
[A/B][#/C] → [#/B][A/C]; [A/C][B/#] → [A/#][B/C];
[S/A][#/B] → [S/A][2/B]; [S/A][#/B][#/C] → [0/A][S/B][1/C];
[S/A][#/B] → [0/A][1/B];
[A/S][B/#] → [A/0][B/S]; [A/S][B/#][C/#] → [A/1][B/S][C/2];
[A/S][B/#] → [A/1][B/2];
[0/0] → 0; [1/1] → 1; [2/2] → 2.
Deriving 001122
Each word over N is written as upper row / lower row.
S / S
⇒∗ S##### / S#####
⇒ S2#### / S#####
⇒∗ S####2 / S#####
⇒ S2###2 / S#####
⇒∗ S###22 / S#####
⇒ 0S1#22 / S#####
⇒ 0S#122 / S#####
⇒ 001122 / S#####
⇒∗ 001122 / 00S###
⇒ 001122 / 001S2#
⇒ 001122 / 001S#2
⇒ 001122 / 001122
⇒∗ 001122.
Exercises 5.14 and 5.17
Exercise 5.14
Construct a context-sensitive grammar for
{0^n 1^n 2^n : n ∈ N}∗ .
Exercise 5.17
Consider the language L = {00} · {0, 1, 2, 3}∗ ∪ {1, 2, 3} ·
{0, 1, 2, 3}∗ ∪ {0, 1, 2, 3}∗ · {02, 03, 13, 10, 20, 30, 21, 31, 32} ·
{0, 1, 2, 3}∗ ∪ {ε} ∪ {0 1^n 2^n 3^n : n ∈ N}.
Which versions of the Pumping Lemma does it satisfy:
• Regular Pumping Lemma (with / without bounds);
• Context-Free Pumping Lemma (with / without bounds);
• Block Pumping Lemma (for regular languages)?
Determine the exact position of L in the Chomsky hierarchy.
Mirror Images
Define (a1 a2 . . . an )^mi = an . . . a2 a1 as the mirror image of a
string.
It follows from the definitions of context-free and
context-sensitive that if L is context-free / context-sensitive
then so is L^mi . This can be achieved by replacing every rule
l → r by l^mi → r^mi .
For example, the mirror image of the language L of the words
0^n 1^(3n+3) is the language of the words 1^(3n+3) 0^n . While
L is generated by a context-free grammar with one
non-terminal S and rules S → 0S111 | 111, L^mi is
generated by a similar grammar with the rules
S → 111S0 | 111.
Exercise 5.18
Recall that x^mi is the mirror image of x, so
(01001)^mi = 10010. Furthermore, L^mi = {x^mi : x ∈ L}.
Show the following two statements:
(a) If an nfa with n states recognises L then there is also an
nfa with up to n + 1 states recognising L^mi .
(b) Find the smallest nfas which recognise L = 0∗ (1∗ ∪ 2∗ )
as well as L^mi .
Palindromes
The members of the language {x ∈ Σ∗ : x = x^mi } are called
palindromes. A palindrome is a word or phrase which looks
the same from both directions.
An example is the German name “OTTO”; furthermore,
when ignoring spaces and punctuation marks, a famous
palindrome is the phrase “A man, a plan, a canal: Panama.”
originating from the time when the canal in Panama was
built.
The grammar with the rules S → aSa|aa|a|ε with a ranging
over all members of Σ generates all palindromes; so for
Σ = {0, 1, 2} the rules of the grammar would be
S → 0S0 | 1S1 | 2S2 | 00 | 11 | 22 | 0 | 1 | 2 | ε.
The set of palindromes is not regular.
Exercises
Exercise 5.20
Let w ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}∗ be a palindrome of even
length and n be its decimal value. Prove that n is a multiple
of 11. Note that it is essential that the length is even, as for
odd length there are counter examples (like 111 and 202).
Exercise 5.21
Given a context-free grammar for a language L, is there
also one for L ∩ L^mi ? If so, explain how to construct the
grammar; if not, provide a counter example where L is
context-free but L ∩ L^mi is not.
Exercise 5.22
Given a language L, the language L ∩ L^mi equals
{w ∈ L : w is a palindrome}.
Next Week’s Midterm Examination
Topics
Defining and proving using structural induction
Making and analysing finite automata
Converting regular languages from one form into another
form, Deterministic versus non-deterministic finite
automata, Bounds on number of states
Pumping lemmas: Usage for proofs; Properties
Combining finite automata
Basic properties of context-free grammars: Making of such
grammars, Usage of pumping lemma for context-free
languages
Revise lecture notes; Try exercises and compare with
solutions by fellow students
Example of Induction
ε <ll 0 <ll 1 <ll 00 <ll 01 <ll 10 <ll 11 <ll 000 <ll . . .; use
this length-lexicographical order <ll to define sw(reg exp):
sw(∅) = ∞;
sw({w1 , . . . , wn }) = min_ll {w1 , . . . , wn };
sw(σ ∪ τ ) = sw(σ)                      if sw(τ ) = ∞;
             sw(τ )                      if sw(σ) = ∞;
             min_ll {sw(σ), sw(τ )}      otherwise;
sw(σ · τ ) = ∞                           if sw(σ) = ∞ or sw(τ ) = ∞;
             sw(σ) · sw(τ )              otherwise;
sw(σ∗ ) = ε.
Prove by structural induction that whenever σ generates a
nonempty language then sw(σ) is a shortest word.
```
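
The product construction from the "Product Automata" slide can be tried out directly. The following is a minimal sketch, assuming a dictionary-based dfa encoding; the names `parity_dfa`, `product_dfa` and `accepts` are illustrative choices and do not come from the slides. The two component automata are the parity automata from the example slide (even number of 1s, even number of 2s).

```python
from itertools import product

def parity_dfa(letter, alphabet="012"):
    """dfa ({s, t}, alphabet, delta, s, {s}) for 'even number of `letter`'."""
    delta = {}
    for q in "st":
        other = "t" if q == "s" else "s"
        for b in alphabet:
            # the state flips exactly when the input symbol equals `letter`
            delta[(q, b)] = other if b == letter else q
    return {"states": {"s", "t"}, "delta": delta, "start": "s", "accept": {"s"}}

def product_dfa(d1, d2, alphabet, mode):
    """Product automaton on Q1 x Q2; `mode` selects the accepting set F."""
    states = set(product(d1["states"], d2["states"]))
    delta = {((p, q), a): (d1["delta"][(p, a)], d2["delta"][(q, a)])
             for (p, q) in states for a in alphabet}
    in1 = lambda pq: pq[0] in d1["accept"]
    in2 = lambda pq: pq[1] in d2["accept"]
    choose = {
        "union": lambda pq: in1(pq) or in2(pq),
        "intersection": lambda pq: in1(pq) and in2(pq),
        "difference": lambda pq: in1(pq) and not in2(pq),
        "symmetric_difference": lambda pq: in1(pq) != in2(pq),
    }[mode]
    return {"states": states, "delta": delta,
            "start": (d1["start"], d2["start"]),
            "accept": {pq for pq in states if choose(pq)}}

def accepts(dfa, word):
    """Run the dfa on `word` and report whether it ends in an accepting state."""
    q = dfa["start"]
    for a in word:
        q = dfa["delta"][(q, a)]
    return q in dfa["accept"]

even1, even2 = parity_dfa("1"), parity_dfa("2")
both_even = product_dfa(even1, even2, "012", "intersection")
```

Only the accepting set differs between the four modes; the states and the transition function are shared, which is why the n^2 bound of Exercise 5.3 covers all four operations.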
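
The subset construction of Theorem 4.5 can be sketched in the same style. This sketch deviates from the theorem in one labelled way: it builds only the subsets reachable from {s} instead of all 2^n subsets. The nfa used as input is an assumed encoding of an nfa with n + 2 states for "some letter appears twice" with n = 4; the state names `init` and `acc` and the function names are hypothetical.

```python
from itertools import product

def determinise(nfa_delta, start, accepting, alphabet):
    """Subset construction, restricted to subsets reachable from {start}."""
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    todo = [start_set]
    while todo:
        current = todo.pop()
        for a in alphabet:
            # delta'(Q', a) is the union of delta(q', a) over all q' in Q'
            target = frozenset(r for q in current
                               for r in nfa_delta.get((q, a), ()))
            dfa_delta[(current, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # F' consists of the subsets which intersect F
    dfa_accepting = {subset for subset in seen if subset & accepting}
    return seen, dfa_delta, start_set, dfa_accepting

def dfa_accepts(dfa_delta, start_set, dfa_accepting, word):
    current = start_set
    for a in word:
        current = dfa_delta[(current, a)]
    return current in dfa_accepting

# Assumed nfa for "some letter appears twice" over {0, 1, 2, 3}
alphabet = "0123"
nfa_delta = {}
for a in alphabet:
    nfa_delta[("init", a)] = {"init", a}   # stay, or guess that a will repeat
    nfa_delta[("acc", a)] = {"acc"}        # accepting sink
    for b in alphabet:
        nfa_delta[(a, b)] = {"acc"} if a == b else {a}

states, dfa_delta, start_set, dfa_accepting = determinise(
    nfa_delta, "init", {"acc"}, alphabet)

# Compare with the definition of the language on all words of length <= 5
agrees = all(
    dfa_accepts(dfa_delta, start_set, dfa_accepting, w) == (len(set(w)) < len(w))
    for k in range(6) for w in product(alphabet, repeat=k)
)
```

The resulting dfa is in general not minimal, so a minimisation step as mentioned on the slide would still follow.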
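
The nfa for L∗ from the "Kleene Star" slide can be sketched as follows, assuming an nfa is given as a tuple (Q, Σ, δ, s, F) with δ stored as a dictionary from (state, symbol) to a set of states. The function names and the sample nfa for 0∗1 are assumptions made for illustration.

```python
def kleene_star(nfa, new_start="s'"):
    """Build (Q ∪ {s'}, Σ, δ', s', {s'} ∪ F) as on the Kleene Star slide."""
    states, alphabet, delta, start, accepting = nfa
    if new_start in states:
        raise ValueError("the new start state must not occur in Q")
    new_delta = {}
    for a in alphabet:
        from_start = set(delta.get((start, a), set()))
        # δ'(s', a) = δ(s, a)
        new_delta[(new_start, a)] = set(from_start)
        for p in states:
            targets = set(delta.get((p, a), set()))
            if p in accepting:
                # δ'(p, a) = δ(p, a) ∪ δ(s, a) for p ∈ F
                targets |= from_start
            new_delta[(p, a)] = targets
    return (states | {new_start}, alphabet, new_delta,
            new_start, accepting | {new_start})

def nfa_accepts(nfa, word):
    """Track the set of states reachable by some run on `word`."""
    _, _, delta, start, accepting = nfa
    current = {start}
    for a in word:
        current = {r for q in current for r in delta.get((q, a), set())}
    return bool(current & accepting)

# Sample nfa for L = 0∗1: s loops on 0 and moves to the accepting state t on 1
base = ({"s", "t"}, "01", {("s", "0"): {"s"}, ("s", "1"): {"t"}}, "s", {"t"})
star = kleene_star(base)
```

The separate start state s′ matters in this example: simply making s accepting would also accept the word 0, which is not in L∗.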
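
The function sw from the "Example of Induction" slide is defined by recursion over the structure of the regular expression, so it translates almost line by line into code. The sketch below assumes an ad-hoc encoding of regular expressions as nested tuples and uses `None` for ∞; both choices are illustrative and not part of the slides. Finite sets are assumed to be nonempty, as in the slide's {w1, ..., wn}.

```python
def ll_key(word):
    """Sort key for the length-lexicographical order <ll."""
    return (len(word), word)

def sw(expr):
    """Return sw(expr), with None standing for ∞."""
    kind = expr[0]
    if kind == "empty":                      # sw(∅) = ∞
        return None
    if kind == "set":                        # finite nonempty set of words
        return min(expr[1], key=ll_key)
    if kind == "union":
        left, right = sw(expr[1]), sw(expr[2])
        if right is None:
            return left
        if left is None:
            return right
        return min(left, right, key=ll_key)
    if kind == "concat":
        left, right = sw(expr[1]), sw(expr[2])
        if left is None or right is None:
            return None
        return left + right
    if kind == "star":                       # sw(σ∗) = ε
        return ""
    raise ValueError("unknown kind of expression: {!r}".format(kind))

# ({10, 0} ∪ ∅) · {1}∗
example = ("concat",
           ("union", ("set", ["10", "0"]), ("empty",)),
           ("star", ("set", ["1"])))
shortest = sw(example)
```

Each branch of the code corresponds to one case of the structural induction asked for on the slide.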