Appearance
Optimization of DFA-Based Pattern Matchers
Minimizing the Number of State of a DFA
There can be many DFA's that recognize the same language. Not only do these automata have states with different names, but they don't even have the same number of states. If we implement a lexical analyzer as a DFA, we would generally prefer a DFA with as few states as possible.
Rules for minimization
For states
- non-final states
- the same final state or merged final state
Example
Minimize the following DFA transition graph
In the following table, each entry shows whether two sets can be merged, while the upper-right triangle is redundant.
a | b | c | d | e | f | g | |
---|---|---|---|---|---|---|---|
b | — | — | — | — | — | — | |
c | — | — | — | — | — | ||
d | — | — | — | — | |||
e | ✅ | — | — | — | |||
f | ✅ | — | — | ||||
g | — | ||||||
h | ✅ |
Consequently, we get these new state
More NFA-to-DFA Conversion
Consider the regular expression, NFA transition graph and the corresponding transition table
(+|-)?(d+|d+.d*|d*.d+)
+ | - | . | d | ε | |
---|---|---|---|---|---|
S | A | A | A | ||
A | B, C | E | |||
B | B | F | |||
C | D | C | |||
D | D | F | |||
E | G | E | |||
F | |||||
G | H | ||||
H | H | F |
Using subset construction
+ | - | . | d | |
---|---|---|---|---|
S,A,E | A,E | A,E | G | B,F,C,E |
A,E | G | B,F,C,E | ||
G | H,F | |||
H,F | H,F | |||
B,F,C,E | D,F,G | B,F,C,E | ||
D,F,G | D,F,G | |||
D,F,H | D,F,G |
Minimize DFA using the technique above
Since
+ | - | . | d | |
---|---|---|---|---|
S,A,E | A,E | A,E | G | B,F,C,E |
A,E | G | B,F,C,E | ||
G | H,F | |||
H,F | H,F | |||
B,F,C,E | D,F,G,H | B,F,C,E | ||
D,F,G,H | D,F,G,H |
TIP
Now, the next minimization is tricky. For
In DFA minimization, when the effect is the cause, we shall always assume that the minimization stands, i.e.,
After they're merged, we get.
+ | - | . | d | |
---|---|---|---|---|
S,A,E | A,E | A,E | G | B,F,C,E |
A,E | G | B,F,C,E | ||
G | D,F,G,H | |||
B,F,C,E | D,F,G,H | B,F,C,E | ||
D,F,G,H | D,F,G,H |
Tidy it up by renaming the merged state with the leading state
+ | - | . | d | |
---|---|---|---|---|
S | A | A | G | B |
A | G | B | ||
G | D | |||
B | D | B | ||
D | D |
Eliminating Empty Cycles and Paths
We've seen how to transform the NFA into the DFA, however, the number of states in the DFA may grow exponentially with the number of states in the NFA. So we should do some processes on the NFA before using subset construction.
Eliminating empty cycles
Whenever there is an empty cycles, a state within the cycle can be eliminate. A empty cycle consists of solely two or more empty transitions. For example, the graph below contains two empty cycles, C-D-A
and C-D-E
. The gray state indicate that it can be eliminated.