Tree-Walking Automata and Monadic Second Order Logic
Preface
This report describes the results of a study in the
field of theoretical computer science. It was written as my
``doctoraalscriptie'' (Master's thesis)
for my study of Computer Science at Leiden University.
The graduation tutor from the Department of Computer Science
was J. Engelfriet, PhD.
This report was written using the LATEX typesetting software on a computer
running Linux and the KDE desktop environment.
I would like to thank Joost Engelfriet for his constant advice and
his help in conducting this study. My parents have always supported me in many
ways. And, last but certainly not least, thanks to Ellen for the title page
she designed and for her love and support.
Jan-Pascal
Delft, July 1998
Contents
Preface
Introduction
1 Binary MSO Formulas and String-Walking Pebble Automata
1.1 Preliminaries
1.2 String-Walking Automata
1.3 String-Walking Pebble Automata
1.4 String-Walking Automata with MSO Tests
1.5 From MSO Formulas to String-Walking Pebble Automata
1.6 Another proof
1.7 Pebble Necessity
1.8 From String-Walking Pebble Automata to MSO Formulas
2 Extensions to String-Walking Pebble Automata
2.1 String-Walking Multi-Pebble Automata
2.2 From String-Walking Multi-Pebble Automata to MSO Formulas
2.3 String-Walking Marble Automata
3 Binary MSO Formulas and Tree-Walking Automata
3.1 Preliminaries
3.2 Tree-Walking Automata
3.3 Push-Down Tree-Walking Automata
3.4 Equivalence of Finite Tree Automata and Push-Down Tree-Walking Automata
3.5 Tree-Walking Marble Automata
3.6 Tree-Walking Marble/Pebble Automata
3.7 Tree-Walking Automata with MSO Tests
3.8 Equivalence of Binary MSO Formulas and Tree-Walking Marble/Pebble Automata
Conclusion and Recommendations
Introduction
Monadic second order logic (MSO logic, [Büc60,,TW68])
is a way to describe properties of graphs. In this paper we will focus on
strings and trees as special cases. Monadic second order logic
combines great strength
with great ease of expression.
Closed MSO formulas define sets of trees, while MSO formulas
with one or more unbound variables define node relations on trees.
One perhaps less desirable property of MSO logic is that its formulas
are global in a way that they define properties of trees as a whole.
Another way to define node relations is by tree-walking automata.
A tree-walking automaton can compute a binary node relation by starting
on one node of the tree and finishing on another node. It has already
been shown that tree-walking automata that can test unary MSO formulas
at each node they visit, compute the same node relations that binary MSO
formulas define [BE97]. However, these unary MSO formulas define in a way
also global properties of the tree, in that they can test, e.g.,
the labels of several nodes of the tree they walk on in one formula.
It can also be shown that ordinary tree-walking automata do not compute
the same node relations that binary MSO formulas recognize. Therefore,
in order to compute the node relations defined by binary MSO formulas,
we need to extend tree-walking automata with features that make them
more powerful. We need to make sure that these features result in only
local operations, i.e.,
the resulting automata can, during their walk on a tree,
only test properties of the current node. Also, we would like these
operations to be more operational and less descriptive than unary MSO
formulas.
These considerations lead us to the central question of this paper.
Is it possible to define a type of tree-walking automaton,
with only local operations,
that computes the same binary node relations that binary MSO formulas
recognize?
In this paper, we try to find an answer to this question. We start by
attempting to find an answer for strings in Chapter 1.
In this chapter, we introduce the concept of pebbles . During the walk of
a string- or tree-walking automaton, the automaton can place a pebble on the current
node. Later during its walk, the automaton can check for the presence of a
pebble and pick it up. Variations are possible in the allowed number of
pebbles, coloured pebbles and restriction in the use of pebbles. It is shown
that string-walking pebble automata (with just one pebble) compute
exactly the binary relations that binary MSO formulas define on strings.
In the next chapter, we define some more powerful string-walking automata
and show some of their properties.
In Chapter 3, we move on to trees.
The task of finding a proper extension for tree-walking automata proves to
be a bit harder than for string-walking automata. Because it is already
known [KS81] that push-down tree-walking automata recognize
exactly the regular tree languages (which are the tree languages
defined by closed MSO formulas), we develop
tree-walking marble automata, similar in concept to push-down tree-walking
automata, that also recognize exactly the regular tree languages but, unlike
push-down tree-walking automata, allow for a natural definition of the
computation of node relations. By adding one pebble, we obtain tree-walking
marble/pebble automata. We show that these
compute exactly the binary node relations that binary MSO formulas
define on trees.
In the last chapter, we conclude this paper and give some recommendations for
future research.
Chapter 1
Binary MSO Formulas and String-Walking Pebble Automata
1.1 Preliminaries
In this section we recall some well-known concepts from
formal language theory [HU79,], more specifically
concerning strings, finite automata and monadic second order logic (on strings).
Before doing so, we recall some standard
terminology from set theory.
The set of natural numbers is \mathbb N = {0,1,2,¼}
and, for m,n Î \mathbb N,
[m,n] = { i Î \mathbb N \mid m £ i £ n}.
For a set S, 2S is its powerset, i.e., the set of all subsets of S.
Let R Í A ×B be a binary relation.
The transitive reflexive closure of R is denoted R*.
The inverse of R is R-1 = { (y,x) \mid (x,y) Î R }.
For two binary relations R1 and R2, their composition is
R1 °R2 = {(x,z) \mid $y: (x,y) Î R1 and (y,z) Î R2 }. Note that the order of R1 and R2 is nonstandard.
For each a Î A,
R(a) = { b Î B \mid (a,b) Î R } and for
each S Í A, R(S) = È{ R(a) \mid a Î S }.
The domain of R is dom(R) = R-1(B).
A binary relation R is functional if it is a partial
function, i.e., (x,y),(x,z) Î R
implies x = z.
Let f:A® B be a function; for
any a and b,
f(a® b) denotes the ``perturbed'' function
f¢:A È{a} ® B È{b} with
f¢(a) = b and f¢(a¢) = f(a¢) for every a¢ Î A with a¢ ¹ a.
Strings
An alphabet is a finite set of symbols. Let S be an alphabet.
A string over S
is a finite string of symbols from S.
The empty string is denoted l.
The set of all strings over S is denoted S*. A subset of
S* is called a language.
If w is a string over S,
|w| denotes the length of w. The set of positions in w is denoted by
Vw = [1,|w|].
For a string w Î S* and i Î Vw, we write labw(i) = s
if the ith symbol of w is s.
For k Î \mathbb N, a k-ary position relation
over S is a subset of
{(w,u1,¼,uk) \mid w Î S* and ui Î Vw for all i Î [1,k] }. A k-ary position relation associates with each string
w a k-ary relation on the positions of w.
We define marked strings as follows. Let S be an alphabet,
and k ³ 1. We define Bk = {0,1}k \{0}k.
The alphabet
SÈ(S×Bk) contains all symbols s Î S and the symbols
(s, b1, ¼, bk) with bi Î {0,1} for all i Î [1,k],
but without (s, 0, ¼, 0). The role of the latter symbol is played
by s itself. This alphabet is used to attach k different marks
to the positions in a string.
Let w Î S* be a string over S, and let u1, ¼, uk Î Vw.
The marked string w¢ = mark(w, u1, ¼, uk) Î (SÈ(S×Bk))*
is the string with labw¢(u) = labw(u) if u ¹ ui for all
i Î [1,k], and labw¢(u) = (labw(u), (u = u1), ¼, (u = uk))
otherwise (where (u = ui) = 1 iff u equals ui).
Finite Automata
Let S be an alphabet. A finite automaton over S is
a quintuple M = (Q,S,d,I,F), where Q is a finite set of
states, S is the input alphabet, d Í Q ×S×Q is the transition relation, I Í Q is the set of
initial states and F Í Q is the set of final states.
The elements of d are called transitions.
For every string w Î S*, the state transition relation
RM(w) Í Q ×Q is defined as follows.
For s Î S,
RM(s) = { (p,q) \mid (p,s,q) Î d}. For
s1,¼,sn Î S,
RM(s1¼sn) = RM(s1)°¼°RM(sn). The language recognized by M is
L(M) = { w Î S* \mid (qin,qfin) Î RM(w) for some qin Î I, qfin Î F }.
A language is regular if it is recognized by a finite automaton.
The class of all regular languages is named REG.
Monadic Second Order Logic (on strings)
Monadic second order logic can be used to describe properties
of strings [Büc60]. For an alphabet S, we define the language
MSOL(S) of MSO formulas over S. This language has position
variables x,y,¼ and position-set variables X,Y,¼. For a given
string w Î S*, position variables range over Vw = [1,|w|] and
position-set variables range over the subsets of Vw.
There are three types of atomic formulas in MSOL(S): labs(x), for every
s Î S, signifies that at position x there
is a symbol s. When MSO logic is expanded to describe properties
of trees and graphs, labs(x) means that node x is labeled
s, hence the name labs. The atomic formula
pre(x,y) signifies that position x comes directly before y,
i.e. y = x+1; and x Î X signifies that x is an element of X. The
connectives used to build formulas from these atomic formulas are
Ø, Ù, Ú, ®, «, with the usual meaning.
Both position variables and position-set variables can be quantified with
$ and ".
We will make use of the following abbreviations of MSO formulas:
If an MSO formula f has free variables, say x, y, X and no others,
we write f(x,y,X) to indicate the free variables of f.
For every k Î \mathbb N, the set of MSO formulas over S with k
free position variables and no free position-set variables is denoted
MSOLk(S), or the set of k-ary MSO formulas.
For a closed formula f Î MSOL0(S) and
a string w Î S*, we write w \models f if w satisfies f.
The language defined by f is
L(f) = { w Î S* \mid w \models f}. L(f) is called
an MSO definable language .
The class of all MSO definable languages is named MSO.
Given a string w, a valuation function n is a function that maps
each position variable to a position u Î Vw and each position-set
variable to a subset U of Vw. Let f be an MSO formula. We write
(w,n) \models f, if f holds in w, where the free variables
of f are assigned values according to the valuation function n.
If f has free variables x,y,X, we write (w,u,v,U) \models f(x,y,X) for
(w,n) \models f, where n(x) = u, n(y) = v and n(X) = U.
Let f(x1, ¼, xk) Î MSOLk(S) be an MSO formula
with k free positions variables (and no free position-set variables).
For each string w Î S*, the k-ary relation
that f defines on the positions of w is
Rw(f) = { (u1, ¼, uk) Î Vwk \mid (w, u1, ¼, uk) \models f(x1, ¼, xk) } . |
|
The MSO formula f(x1, ¼, xk) defines the position relation
R(f) = { (w, u1, ¼, uk) \mid w Î S*, (u1, ¼, uk) Î Rw(f) } . |
|
Büchi [Büc60] proved the following classical result.
Proposition 1
A language is MSO definable if and only if it is regular.
In [BE97], the following (well-known)
results are proven for MSO formulas on trees.
They also hold for the special case of strings.
Corollary 3 is
the result of Lemma 2 in the case j = k.
Lemma 2
Let S be an alphabet, k ³ 1, and j Î [1,k]. For every formula
f(x1, ¼, xk) Î MSOLk(S) there is a formula
y(xj+1, ¼, xk) Î MSOLk-j(SÈ(S×Bj)) such that,
for all w Î S* and u1, ¼, uk Î Vw,
(w, u1, ¼, uk) \models f(x1, ¼, xk) iff (mark(w, u1, ¼, uj), uj+1, ¼, uk) \modelsy(xj+1, ¼, xk), |
|
and vice versa, i.e.,
for every formula y(xj+1, ¼, xk) Î MSOLk-j(SÈ(S×Bj))
there is a formula f(x1, ¼, xk) Î MSOLk(S)
such that the above equivalence holds
for all w Î S* and u1, ¼, uk Î Vw.
Corollary 3
Let S be an alphabet and k ³ 1. For every formula
f(x1, ¼, xk) Î MSOLk(S) there is a formula
y Î MSOL0(SÈ(S×Bk)) such that,
for all w Î S* and u1, ¼, uk Î Vw,
(w, u1, ¼, uk) \models f(x1, ¼, xk) iff mark(w, u1, ¼, uk) \models y, |
|
and vice versa, i.e.,
for every MSO formula y Î MSOL0(SÈ(S×Bk))
there is a formula f(x1, ¼, xk) Î MSOLk(S)
such that the above equivalence holds
for all w Î S* and u1, ¼, uk Î Vw.
From Corollary 3 and Proposition 1
follows the following lemma.
Lemma 4
Let S be an alphabet and let k ³ 1. For every MSO formula
f(x1, ¼, xk) Î MSOLk(S) there exists a finite automaton
M over SÈ(S×Bk) such that, for all w Î S* and
u1, ¼, uk Î Vw,
(w, u1, ¼, uk) \models f(x1, ¼, xk) iff mark(w, u1, ¼, uk) Î L(M) . |
|
1.2 String-Walking Automata
String-walking automata are known in the
literature as two-way finite automata (see [HU79]).
We will call them string-walking automata to stress both the differences and
the similarities with tree-walking automata.
String-walking automata are like the well-known finite automata,
except that they can walk both forwards and backwards along the string
they are examining (hence the name two-way finite automata). At any moment
in the execution of a string-walking automaton, the automaton is in one
state and at one position of the string.
Let S be an alphabet. Syntactically, a string-walking automaton
over S is a
finite automaton
with a special set of directives as input alphabet.
We define the set of directives as
DSWA(S) = { ¬, ®, head, Øhead, tail, Øtail} È{ labs\mid s Î S} . |
|
The meaning of the directives is as follows:
- ¬ means go back to the previous position in the input string.
- ® means go to the next position in the input string.
- head checks whether the current position is the first position in the
string.
- tail checks whether the current position is the last
position in the string.
- Øhead and Øtail are the negations of
head and tail respectively.
- labs checks whether the current position is
labeled with s.
Formally, we define for each string w Î S* and
each directive d Î DSWA(S)
the following binary relation Rw(d) on Vw:
|
|
|
{ (i, i-1) \mid 1 < i £ |w| } |
| |
|
{ (i, i+1) \mid 1 £ i < |w| } |
| |
|
{ (i, i) \mid 1 £ i £ |w| and the ith symbol of w is s } |
| |
|
| |
|
{ (i, i) \mid 1 < i £ |w| } |
| |
|
| |
|
{ (i, i) \mid1 £ i < |w| } |
|
| |
|
A string-walking automaton over S is a finite automaton A
over DSWA(S). For a string-walking automaton
A = (Q,DSWA(S), d, I, F) and a string w Î S*,
an element (q,u) of Q ×Vw is
a configuration of the automaton.
The configuration (q,u) Î Q ×Vw signifies that A is in state
q at position u. We say that the current position of
the automaton on w is u. The set of all configurations
of A and w is denoted by \mathbb CA,w.
Let w Î S*. One step of A = (Q,DSWA(S),d,I,F) on w
is defined by the binary relation \twoheadrightarrowA,w on the set
of configurations, as follows. For every
(q,u), (q¢,u¢) Î \mathbb CA,w,
(q,u) \twoheadrightarrowA,w (q¢,u¢) iff $d Î DSWA(S): (q,d,q¢) Î d and (u,u¢) Î Rw(d) . |
|
For each string w Î S*, A computes the binary relation
Rw(A) = { (u,v) Î Vw ×Vw \mid (q,u) \twoheadrightarrowA,w*(q¢,v) for some q Î I and q¢ Î F } . |
|
Thus, A computes the position relation
R(A) = { (w,u,v) \mid w Î S*, (u,v) Î Rw(A) } . |
|
We define the language that A recognizes as all strings w with
the property that there is a walk of A on w starting at the beginning of the
string in an initial state and ending in a final
state, or
L(A) = { w Î S* \mid there exists v Î Vw such that (1,v) Î Rw(A)} . |
|
The class of all languages recognized by a string-walking automaton
is named SWA.
In the following we will also allow transitions of the form
(p,s,q) with s Î DSWA(S)*. If s = d1 d2 ¼dn,
we define, for all w Î S*,
Rw(s) = Rw(d1) °¼°Rw(dn). |
|
These extended directives can easily be implemented by adding extra
transitions and states.
A string-walking automaton A = (Q,DSWA(S),d,I,F) over S
is deterministic if the following three conditions hold:
- #I = 1.
- If (p,d,q) Î d, then p \not Î F.
- For all distinct transitions
(p,d1,q1), (p,d2,q2) Î d,
d1 and d2 are two mutually exclusive directives in
DSWA(S).
Two directives d1,d2 Î DSWA(S) are mutually exclusive exactly if,
for all w Î S* and u Î Vw,
dom(Rw(d1)) Çdom(Rw(d2)) = Æ.
The following pairs of directives in DSWA(S) are mutually exclusive:
- {®,tail},{¬,head}
- {head,Øhead},{tail,Øtail}
- {labs1,labs2} with
s1 ¹ s2
1.3 String-Walking Pebble Automata
We will extend the concept of string-walking automata with a pebble [BH65].
A string-walking pebble automaton
is able to drop its pebble at its current position in the string,
pick it up at any (later) time, if it is at the same position again, and check
whether or not the pebble is present at the automaton's current position.
Let S be an alphabet.
The directives for a string-walking pebble automaton are
|
|
|
DSWA(S)È{ put, lift, here, Øhere} |
| |
|
{ ¬, ®, head, Øhead, tail, Øtail, put, lift, here, Øhere} È{ labs\mid s Î S} |
|
| |
|
The meaning of the new directives is as follows:
- put means drop the pebble at the current position in the string.
- lift means lift the pebble from the current position
in the string (only if it is there now).
- here checks whether the pebble is at the current position.
- Øhere is the negation of here.
For all strings w Î S*, we will denote the
pebble positions with elements of Pw = Vw È{^}.
Here ^ denotes that the pebble is not placed at any position of
the string, i.e., the automaton ``carries'' the pebble.
We now need to extend the binary relations Rw(d) (for d Î DSWA(S)) to be
on pairs (u,p) Î Vw ×Pw to take pebble-handling into account.
Let, for all strings w Î S* and directives d Î DSWA(S),
[R\tilde]w(d) denote the binary relation Rw(d) as defined for string-walking
automata in the previous section. We now define Rw(d) for string-walking
pebble automata as follows:
Rw(d) = { ((i,p),(i¢,p)) \mid (i,i¢) Î |
~ R
|
w
|
(d) and p Î Pw } . |
|
This means that the directives from DSWA(S) do not alter the pebble position.
The relations for the new directives are as follows.
|
|
|
{ ((i,^), (i, i)) \mid 1 £ i £ |w|} |
| |
|
{ ((i, i), (i, ^)) \mid 1 £ i £ |w|} |
| |
|
{ ((i, i), (i, i)) \mid 1 £ i £ |w|} |
| |
|
{ ((i,p), (i,p)) \mid1 £ i £ |w| and p ¹ i } |
|
| |
|
Let S be an alphabet.
A string-walking pebble automaton is a finite automaton A
over DSWPA(S). For a string-walking pebble automaton
A = (Q,DSWPA(S), d, I, F) and a string w Î S*,
an element (q,u,p) Î Q ×Vw ×Pw is
a configuration of the automaton. The set of all configurations
of A and w is denoted by \mathbb CA,w.
A configuration (q,u,p) signifies that A is in state
q at position u, while the pebble is at position p.
The step relation \twoheadrightarrowA,w is extended in the obvious way. For a
string-walking pebble automaton A, a string w Î S* and
configurations (q,u,p),(q¢,u¢,p¢) Î \mathbb CA,w,
(q,u,p) \twoheadrightarrowA,w* (q¢,u¢,p¢) iff $d Î DSWPA(S): (q,d,q¢) Î d and ( (u,p),(u¢,p¢) ) Î Rw(d). |
|
For the relation that A computes, we now demand that
A begins and ends with the pebble not on the string, or ``in its pocket''. For
w Î S*,
Rw(A) = { (u,v) Î Vw ×Vw \mid (q,u,^) \twoheadrightarrowA,w*(q¢,v,^) for some q Î I and q¢ Î F }. |
|
The definitions of R(A) and L(A) remain the same.
The class of all languages recognized by a string-walking
pebble automaton is named SWPA.
Transitions of the form (p,s,q) with s Î DSWPA(S)* are treated as
with string-walking automata.
The definition of a deterministic string-walking pebble
automaton is similar to that of a deterministic string-walking automaton.
The only difference is that the directives are now taken from DSWPA(S).
The following pairs of directives in DSWPA(S) are mutually
exclusive:
- {®,tail},{¬,head}
- {here,put},{lift,put},{Øhere,lift}
- {head,Øhead},{tail,Øtail},{here,Øhere}
- {labs1,labs2} with
s1 ¹ s2
1.4 String-Walking Automata with MSO Tests
Another way to extend string-walking automata
(as done in [BE97] for tree-walking
automata, see also Section 3.7) is to allow the automaton
to test any MSO definable property of the current position of the string,
using MSO formulas with one free position variable. This extension also
requires the introduction of new directives.
Let S be an alphabet. The set of directives over S is
DSWA+M(S) = { ¬, ®} ÈMSOL1(S) . |
|
Note that head, tail, labs, etc. are MSO definable properties
so they need not be explicitely stated in the definition
of DSWA+M(S). Note also that DSWA+M(S) is an infinite set. Therefore we need to define
for each string-walking automaton with MSO tests a finite subset of
DSWA+M(S) as the directives used by the automaton.
For the directive d = y(x) with y Î MSOL1(S),
we define the following binary relation on Vw (w Î S*):
Rw(y(x)) = {(i,i) \mid (w,i) \models y(x) } . |
|
A string-walking automaton with MSO tests over S is
a finite automaton A over a finite subset of DSWA+M(S).
The definitions of configurations, \twoheadrightarrowA,w, Rw(A), etc. are
unchanged with respect to the definitions for (ordinary) string-walking automata.
The definitions of deterministic string-walking automata
and mutually exclusive directives are also unchanged, although
it is no longer possible to list all mutually exclusive pairs
of directives. Note that it is decidable whether a given pair of directives
in MSOL1(S) is mutually exclusive.
The class of all languages
recognized by a string-walking automaton with MSO tests is named
SWA+M.
In [BE97] the following results are proven for trees. They also hold for
the special case of strings.
Proposition 5
The following three statements hold:
- SWA+M=REG
- For every alphabet S and every binary MSO formula
f(x,y) Î MSOL2(S), there exists a string-walking automaton
with MSO tests A such that R(A) = R(f).
- For every alphabet S and every string-walking automaton
with MSO tests A over S, there exists a
binary MSO formula f(x,y) Î MSOL2(S)
such that R(f) = R(A).
1.5 From MSO Formulas to
String-Walking Pebble Automata
Both MSO formulas with two position variables and string-walking pebble
automata compute binary position relations. In this section, we show that
every binary position relation that can be defined by an MSO formula,
can also be computed by a string-walking pebble automaton. In
Section 1.8 we will show that the reverse is also true, i.e.,
every binary position relation that can be computed by a string-walking
pebble automaton can also be defined by a binary MSO formula.
Let S be an alphabet and let f(x,y) be a binary MSO formula
over S. We have seen (Lemma 4) that there
exists a finite automaton M over SÈ(S×B2)
such that, for all w Î S* and u,v Î Vw,
(w,u,v) \models f(x,y) iff mark(w,u,v) Î L(M) . |
|
The following lemma states that there is a string-walking pebble automaton
that computes the same relation as f(x,y) defines.
Lemma 6
For every finite automaton M over SÈ(S×B2)
there exists a string-walking pebble automaton A
over S such that, for all w Î S* and u,v Î Vw,
mark(w,u,v) Î L(M) iff (u,v) Î Rw(A). |
|
Proof.
Let M = (QM,SÈ(S×B2), dM, IM, FM) be a finite
automaton over SÈ(S×B2). We will define a
string-walking pebble automaton over S,
A, that simulates M, using the pebble to simulate the marked
positions in M. A is defined as the (intuitive) union
of three automata A1, A2, A3.
We define A as (Q1 ÈQ2 ÈQ3, DSWPA(S), d1 Èd2 Èd3, I1 ÈI2 ÈI3, F1 ÈF2 ÈF3).
A1 deals with the case that u comes ``before'' v in w, A2
deals with the case that v comes first and A3 deals with the case
u = v. For 1 £ i £ 3, Ai = (Qi, DSWPA(S), di, Ii, Fi).
A1 simulates M, assuming that u comes before v in w. A1 first
drops its pebble at the start position u.
This makes it possible to determine
later where the automaton started. Then it walks to the head of the string.
Next it simulates M in three stages.
In the first stage of the simulation,
it simulates M (only taking into account transitions of the form
(p,s,q) with s Î S)
until it finds the pebble. It picks up the pebble and treats the symbol
s on this position as (s,1,0) in M. The automaton
now continues simulating M (the second stage of the simulation),
again without transitions with labels in S×B2.
Then, nondeterministically, A1 drops the pebble at some position
v and treats the symbol s at this position as
(s,0,1) in M. A continues simulating M (the third stage)
until it reaches the end of the string. If it reaches the end of the
string in a final state of M, the choice for v was a good choice.
The automaton then backs up until it finds the pebble at position
v. This ends A1.
We use the following states in A1.
- in1 is the initial state.
- tobegin1 is used to walk to the head of the string.
- q1,i (for q Î QM) is used in the ith
stage of the simulation.
- backup1 is used to back up to v after the three
stages of simulation.
- final1 is the final state.
Formally, A1 = (Q1, DSWPA(S), d1, I1, F1) is constructed as follows:
|
|
|
{ in1, tobegin1, backup1, final1 } È |
| |
|
{ q1,i \mid q Î QM, 1 £ i £ 3 } |
| |
|
{ (in1, put, tobegin1),(tobegin1, ¬, tobegin1) } È |
| |
|
{ (tobegin1, head, q1,1) \mid q Î IM } È |
| |
|
{ (p1,1, (labs, ®), q1,1)\mid (p,s,q) Î dM } È |
| |
|
{ (p1,1, (lift, labs, ®), q1,2) \mid(p,(s,1,0),q) Î dM } È |
| |
|
{ (p1,2, (labs, ®), q1,2)\mid (p,s,q) Î dM } È |
| |
|
{ (p1,2, (put, labs, ®), q1,3) \mid(p,(s,0,1),q) Î dM } È |
| |
|
{ (p1,2, (put, labs, tail), backup1) \mid(p,(s,0,1),qfin) Î dM for some qfin Î FM } È |
| |
|
{ (p1,3, (labs, ®), q1,3)\mid (p,s,q) Î dM } È |
| |
|
{ (p1,3, (labs, tail), backup1)\mid (p,s,qfin) Î dM for some qfin Î FM } È |
| |
|
{ (backup1, ¬, backup1), (backup1, lift, final1) } |
| |
|
| |
|
|
| |
|
A2 is similar to A1, only here the automaton, after dropping its
pebble on u, first walks to the right until the
end of the string, then simulates M (inverted), starting in a state
corresponding to a state in FM, until it finds the pebble
at u. It picks up the pebble and treats the symbol s
at this position as (s,0,1) in M.
A2 continues with the second stage of the simulation.
Then, nondeterministically, A2 drops the pebble at some position v
and continues with the third stage of the simulation until the beginning of
the string. If it reaches the beginning of the string in an initial
state of M, A2 walks back to the pebble at v and the
simulation is finished.
Formally, A2 = (Q2, DSWPA(S), d2, I2, F2) is constructed as follows:
|
|
|
{ in2, toend2, backup2, final2 } È |
| |
|
{ q2,i \mid q Î QM, 1 £ i £ 3 } |
| |
|
{ (in2, put, toend2),(toend2, ®, toend2) } È |
| |
|
{ (toend2, tail, q2,1) \mid q Î FM } È |
| |
|
{ (q2,1, (labs, ¬), p2,1)\mid (p,s,q) Î dM } È |
| |
|
{ (q2,1, (lift, labs, ¬), p2,2) \mid(p,(s,1,0),q) Î dM } È |
| |
|
{ (q2,2, (labs, ¬), p2,2)\mid (p,s,q) Î dM } È |
| |
|
{ (q2,2, (put, labs, ¬), p2,3) \mid(p,(s,0,1),q) Î dM } È |
| |
|
{ (q2,2, (put, labs, head), backup2) \mid(p,(s,0,1),qin) Î dM for some qin Î IM } È |
| |
|
{ (q2,3, (labs, ¬), p2,3)\mid (p,s,q) Î dM } È |
| |
|
{ (q2,3, (labs, head), backup2)\mid (p,s,qin) Î dM for some qin Î IM } È |
| |
|
{ (backup2, ®, backup2), (backup2, lift, final2) } |
| |
|
| |
|
|
| |
|
A3 looks again like the previous two parts of A. A3 however
assumes that u = v, which implies that the only useful transitions with
labels in S×B2 are the ones of the form
(p,(s,1,1),q) Î dM. So A3 first drops its pebble
at position u ( = v), then
(like A1) walks to the head of the string and simulates M until it gets
to the starting position (the pebble) again. It does not lift the pebble,
because it needs to know the position of u = v.
A3 treats the symbol s at this position as (s,1,1) in M.
Then it continues with the second stage of
the simulation of M until the end of the string. If it arrives there in
a final state of M, A3 backs up to the pebble which ends the
simulation of M.
Formally, A3 = (Q3, DSWPA(S), d3, IA, FA) is constructed as follows:
|
|
|
{ in3, tobegin3, backup3, final3 } È |
| |
|
{ q3,i \mid q Î QM, 1 £ i £ 2 } |
| |
|
{ (in3, put, tobegin3),(tobegin3, ¬, tobegin3) } È |
| |
|
{ (tobegin3, head, q3,1) \mid q Î IM } È |
| |
|
{ (p3,1, (labs, ®), q3,1)\mid (p,s,q) Î dM } È |
| |
|
{ (p3,1, (here, labs, ®), q3,2) \mid(p,(s,1,1),q) Î dM } È |
| |
|
{ (p3,1, (here, labs, tail), backup3) \mid(p,(s,1,1),qfin) Î dM for some qfin Î FM } È |
| |
|
{ (p3,2, (labs, ®), q3,2)\mid (p,s,q) Î dM } È |
| |
|
{ (p3,2, (labs, tail), backup3)\mid (p,s,qfin) Î dM for some qfin Î FM} È |
| |
|
{ (backup3, ¬, backup3), (backup3, lift, final3) } |
| |
|
| |
|
|
| |
|
\square
Using this lemma the following theorem becomes trivial to prove.
Theorem 7
For each formula f(x,y) Î MSOL2(S), there exists a
string-walking pebble automaton A over S such that
R(f) = R(A).
1.6 Another proof
There is another way to prove Theorem 7.
This method makes use of Proposition 5 to simulate
the binary MSO formula by a string-walking automaton with MSO tests.
The MSO tests are then simulated by making use of a pebble.
Let S be an alphabet, and let
f(x,y) Î MSOL2(S) be an MSO formula
with two free position variables. Then there exists a
string-walking automaton with MSO tests A = (QA,DA,dA,IA,FA)
with DA a finite subset of DSWA+M(S)
such that R(f) = R(A) (Proposition 5).
We will construct a string-walking pebble automaton
A¢ = (Q¢,DSWPA(S),d¢,I¢,F¢) such that R(A¢) = R(A) = R(f).
We need the following result, which is Lemma 4 for the
case k = 1.
Lemma 8
Let S be an alphabet.
For every MSO test y(x) Î MSOL1(S), there
exists a finite automaton M over SÈ(S×B1) such that,
for all w Î S* and u Î Vw,
(w,u) \models y(x) iff mark(w,u) Î L(M) . |
|
We now construct A¢. In first approximation, A¢
equals A for transitions (p,d,q) with
d Î { ¬, ® }.
We define T = {(p,d,q) Î dA \mid d Î MSOL1(S)},
the collection of all transitions with MSO tests in A.
For each transition t = (p,y(x),q) Î T,
we use Lemma 8 to construct the
finite automaton Mt that calculates the MSO test y(x).
We will use this automaton to construct a ``subroutine'' in A¢
that simulates t.
The idea is as follows, similar to the construction of A3 in the proof of
Lemma 6.
A¢ first drops its pebble. Then it walks to the head of the string.
From there, it checks the MSO test by following Mt. Any transitions
in dMt
with labels not in S are handled by using the pebble.
The only possible form of such a transition is (p,(s,1),q).
In place of this transition, we add transitions to check whether
the position in the string where the automaton is now is marked,
i.e., the pebble is there. Then the symbol s is checked and
the automaton moves on to the next position in the string in
state q. If Mt reaches a final state at the end of the string,
A¢ moves back to the position where it dropped the pebble and
lifts it. At this moment the MSO test y(x) is successfully
verified by the automaton.
The following states are used for A¢.
First, all states from A, QA. Then we need extra states
for each MSO test. For all transitions t Î T
we use, next to the states of Mt, tobegint to move
to the beginning of the string and
backupt to back up to the pebble, the place where
the simulation of Mt started.
We assume that all the Mt for all t Î T have
unique states that do not overlap with either QA or states
of other Mt's.
Formally, A¢ = (Q¢,DSWPA(S),d¢,I¢,F¢) is constructed as follows:
|
|
|
QA È{ tobegint, backupt \mid t Î T }È |
È
t Î T
|
QMt |
| |
|
| |
|
| |
|
{ (p,d,q) Î dA \midd Î {¬,®} } È |
È
| { dt \mid t Î T } |
|
| |
|
Here the transitions needed to simulate t = (p,y(x),q) are as follows:
|
|
|
{ (p, put, tobegint),(tobegint, ¬, tobegint)} È |
| |
|
{ (tobegint, head, sin)\mid sin Î IMt } È |
| |
|
{ s,(Øtail, Øhere, labs,®),t) \mid (s,s,t) Î dMt } È |
| |
|
{ s,(Øtail,here,labs,®),t) \mid (s,(s,1),t) Î dMt } È |
| |
|
{ s,(tail,Øhere,labs),backupt) \mid (s,s,tfin) Î dMt for some tfin Î FMt } È |
| |
|
{ s,(tail, here,labs),backupt) \mid (s,(s,1),tfin) Î dMt for some tfin Î FMt } È |
| |
|
{ (backupt, (Øhere, ¬), backupt),(backupt, lift, q) } |
|
| |
|
1.7 Pebble Necessity
The following theorem states that we really need the pebble in
Theorem 7. The proof of this theorem is a simpler form of the
proof for tree-walking automata (Theorem 15 in [BE97]).
It is included here for completeness.
Theorem 9
There exist an alphabet S and a
formula f(x,y) Î MSOL2(S)
such that there is no string-walking automaton A
with R(A) = R(f).
Proof.
We construct an MSO formula f(x,y) Î MSOL2(S)
with S = {b,r}, denoting black and red. For a string w over
S, the binary relation R(f) connects certain positions of w.
If w has exactly
one position with symbol r, say vred, then R(f) will connect
any position to vred. Otherwise, i.e., if there is no red symbol
or more than one, R(f) will connect any position to the next in left-to-right
circular order.
We use the following
abbreviations to define f(x,y):
|
|
|
$y ( labr(y) Ù"z(labr(z) ® z = y)) |
| |
|
pre(x,y) Ú( tail(x) Ùhead(y) ) |
| |
|
( orsÙlabr(y) ) Ú( ØorsÙsucc(x,y) ) |
|
| |
|
Here ors is a closed MSO formula which is true if there is exactly
one red symbol in w, and succ(x,y) is an MSO formula which is true if y
follows x in left-to-right circular order. Clearly, f(x,y) is a
binary MSO formula with R(f) as indicated above.
Now we will show that there is no string-walking automaton
A = (Q,DSWA(S),d,I,F) with R(A) = R(f).
Suppose that such an A exists. We may assume that
I = {qin}. The idea is that when A starts at any position
u of a string with only b's, it first has to visit all positionsc
of the string to check there is no r in the string. Because there
is no way for A to remember its starting point u, A also
cannot find the position after u in left-to-right circular order.
Let w be b#Q+1, a string of #Q+1 black symbols.
Let w¢ be rb#Q. Thus, w¢ is w with the first symbol changed to red.
We will use the following function:
which gives the successor to position k in left-to-right circular order.
Because w does not contain a symbol r, there exists
for every k Î [1,#Q+1] a state fk Î F such that
(qin,k) \twoheadrightarrowA,w* (fk, succ(k)) . |
|
On the other hand, because w¢ contains exactly one symbol r,
for all k Î [1,#Q], there is no state f Î F such that
(qin,k) \twoheadrightarrowA,w¢* (f, succ(k)) . |
|
This difference implies that for all k Î [1, #Q] the walk
of A on w from k to succ(k) must visit position 1, because the
only different symbol is the first, which is black for w and red for w¢.
Because the walk of A on w from #Q+1 to 1 also visits position 1, there
is for all k Î [1,#Q+1] a qk Î Q such that
(qin,k) \twoheadrightarrowA,w* (qk,1) \twoheadrightarrowA,w* (fk,succ(k)) . |
|
There are #Q+1 possibilities for k and only #Q states, which
means there are k,k¢ Î [1,#Q+1] with k ¹ k¢ and qk = qk¢.
Then
(qin,k) \twoheadrightarrowA,w* (qk,1) = (qk¢,1) \twoheadrightarrowA,w* (fk¢,succ(k¢)) . |
|
This implies that A walks from k both to
succ(k) and to succ(k¢) with k ¹ k¢, a contradiction to the fact
that R(A) = R(f). \square
1.8 From
String-Walking Pebble Automata to
MSO Formulas
In this section we will prove that a string-walking pebble automaton
can be simulated by a string-walking automaton with MSO tests. Because
it is already known that these automata can be simulated by binary MSO
formulas (Proposition 5), and because we have shown in previous sections that a binary MSO
formula can be simulated by a string-walking pebble automaton, we
obtain the equivalence of binary MSO formulas and string-walking pebble automata.
To prove that it is possible to simulate the use of
a pebble with MSO tests, we will need the following lemma.
In the lemma, A is a string-walking pebble automaton that is not allowed to
move its pebble, but it is allowed to check the presence of the pebble at the
current position. The lemma states that the round-trip of A from the position
where the pebble is, back to that position can be simulated by a unary MSO
formula.
Lemma 10
Let A be a string-walking pebble automaton, with no transitions with directives
put or lift,
over S. Then,
for every pair of states p,q Î QA, there exists a formula
ypq(x) Î MSOL1(S) such that, for all w Î S*
and x Î Vw,
(p,x,x) \twoheadrightarrowA,w* (q,x,x) iff (w,x) \models ypq(x) . |
|
Proof.
Let A be a string-walking pebble automaton
over S without put and lift. Let p,q Î QA.
First note that the pebble does not move during A's walk. We can therefore
consider the pebble position a constant and mark it using a special symbol
(s,1) where s is the original symbol at the pebble position.
We will now define a string-walking automaton A¢ (without pebble) over
SÈ(S×B1) that simulates the walk of A from the pebble position and
back. We define A¢ = (QA, DSWA(SÈ(S×B1)), d¢, {p}, {q}).
Thus, the states of A¢ are the
same as those of A, A¢'s initial state is p, and its final state is q.
Furthermore, d¢ consists of the following transitions:
|
|
|
| |
|
{ (s,lab(s,1),t) \mid (s,here,t) Î dA, s Î S} È |
| |
|
{ (s,labs,t) \mid (s,Øhere,t) Î dA, s Î S} È |
| |
|
{ (s,lab(s,1),t) \mid(s,labs,t) Î dA, s Î S}. |
|
| |
|
The directive ``here'' is simulated by checking whether the symbol at the current position
is in S×B1. Similarly, ``Øhere'' is simulated by checking
whether the symbol at the current position is in S. Otherwise, the
new symbols from S×B1 are treated just like the corresponding
symbols from S.
Now, for all s,t Î QA, w Î S* and x,y,z Î Vw,
it is easy to see that
(s,x,z) \twoheadrightarrowA,w* (t,y,z) iff (s,x) \twoheadrightarrowA¢,w¢* (t,y) |
|
where w¢ = mark(w,z). This means that A¢ on a string marked at
position z walking from x to y simulates A on the unmarked string
with the pebble at position z. Both automata start in state s and end in
state t.
Also, from the definition of Rw¢(A¢),
(p,x) \twoheadrightarrowA¢,w¢* (q,y) iff (x,y) Î Rw¢(A¢) . |
|
According to Proposition 5
we can construct a formula f¢pq(x,y) Î MSOL2(SÈ(S×B1))
that simulates the walk of A¢ on a marked string, so that, for all
w Î S*, x,y,z Î Vw and w¢ = mark(w,z),
(x,y) Î Rw¢(A¢) iff (w¢,x,y) \models f¢pq(x,y) . |
|
Using Lemma 2 yields a formula
fpq(x,y,z) Î MSOL3(S) such that,
for all w Î S* and x,y,z Î Vw,
(mark(w,z),x,y) \models f¢pq(x,y) iff (w,x,y,z) \models fpq(x,y,z) . |
|
We now define ypq(x) = fpq(x,x,x) Î MSOL1(S). Then,
for all w Î S* and u Î Vw,
(p,x,x) \twoheadrightarrowA,w* (q,x,x) iff (w,x) \models ypq(x) . |
|
\square
Theorem 11
For every string-walking pebble automaton A
there exists a string-walking automaton with MSO tests A¢ such that
R(A) = R(A¢).
Proof.
When A drops its pebble at some position u of string w,
it must return there some
later time to pick it up, because of the definition of Rw(A).
In the mean time, i.e., until A returns to u, A does not move its
pebble. By using
Lemma 10, the actions of this automaton until its arrival in u
can be simulated by a single MSO test. The details are as follows.
Let A = (Q,DSWPA(S),d,I,F) be a string-walking pebble automaton
over S. We will define a string-walking automaton with MSO tests A¢.
The MSO tests ypq are obtained by applying Lemma 10
to the automaton derived from A by removing all transitions of the
form (p,d,q) with d Î {put,lift}.
We define A¢ = (Q,D,d¢,I,F) with
|
|
|
{ ¬, ®, true(x), head(x), tail(x), Øhead(x), Øtail(x) } È |
| |
|
{ labs(x) \mid s Î S} È{ ypq(x) \mid p,q Î Q } |
| |
|
{ (p,d,q) Î d\mid d Î { ¬, ® } } È |
| |
|
{ (p,head(x),q) \mid (p,head,q) Î d} È |
| |
|
{ (p,Øhead(x),q) \mid (p,Øhead,q) Î d} È |
| |
|
{ (p,tail(x),q) \mid (p,tail,q) Î d} È |
| |
|
{ (p,Øtail(x),q) \mid (p,Øtail,q) Î d} È |
| |
|
{ (p,labs(x),q) \mid (p,labs,q) Î d} È |
| |
|
{ (p,true(x),q) \mid (p,Øhere,q) Î d} È |
| |
|
{ (p,yqr(x),s) \mid (p,put,q) Î d and (r,lift,s) Î d} |
| |
| |
|
The directives ¬, ®, head, Øhead, tail, Øtail
and labs (for s Î S) are simulated straightforwardly.
The MSO test yqr(x) simulates the actions of A between the put and the lift.
\square
This theorem leads us to the grand finale of this chapter.
Theorem 12
The following two statements hold:
- For every MSO formula f(x,y) Î MSOL2(S) there exists
a string-walking pebble automaton A over S such that
R(A) = R(f).
- For every string-walking pebble automaton A over S
there exists an MSO formula f(x,y) Î MSOL2(S) such that
R(f) = R(A).
There is another way to prove that any string-walking pebble automaton
can be simulated by an MSO formula with two position variables.
Let A be a string-walking pebble automaton
over S that recognizes the relation R(A). Then a
string-walking pebble automaton A¢ over SÈ(S×B2) can easily be constructed
such that (w,u,v) Î R(A) just if mark(w,u,v) Î L(A¢). A¢ walks
right until it encounters a symbol (s,1,x) with x Î {0,1}. Then
A¢ starts simulating A directly until it encounters a symbol
(s,y,1) with y Î {0,1} in a final state.
Because SWPA=REG [BH65] and REG=MSO (Proposition 1),
there exists an MSO formula
y Î MSOL0(SÈ(S×B2)) such that L(A¢) = L(y). Using
Corollary 3, we obtain an MSO formula
f(x,y) Î MSOL2(S) such that, for all w Î S*,
mark(w,u,v) Î L(y) just if (w,u,v) \models f(x,y).
Note that we can also use Theorem 12 to derive
that SWPAREG, as follows:
Theorem 13
SWPAREG
Proof.
First we will show that any language defined by a string-walking pebble automaton
is MSO definable and hence regular.
Let A be a string-walking pebble automaton over S.
By definition, the language
recognized by A is
L(A) = { w Î S* \mid there exists v Î Vw such that (1,v) Î Rw(A) }. |
|
Using the second part of Theorem 12 we obtain a formula
f(x,y) Î MSOL2(S) such that, for all w Î S* and
u,v Î Vw,
(u,v) Î Rw(A) iff (w,u,v) \models f(x,y) . |
|
We now define
y = "x ( $y (head(x) ® f(x,y) ) ) |
|
to obtain that, for all w Î S* and v Î Vw,
$v Î Vw ((1,v) Î Rw(A)) iff $v Î Vw ((w,1,v) \models f(x,y)) iff w \models y |
|
so
and, since MSO=REG, L(A) is a regular language.
The other way around, we show that any regular language can be described by
a string-walking pebble automaton.
Let L be a regular language over an alphabet S. Let
M = (Q,S,d,I,F) be a finite automaton such that L = L(M). We define
a string-walking pebble automaton A = (Q¢,DSWPA(S),d¢,I,F¢) with
|
|
|
| |
|
{ (p,(labs,®),q) \mid (p,s,q) Î d} È |
| |
|
{ (p,(labs,tail),qfin) \mid (p,s,q) Î d for some q Î F } |
| |
|
|
| |
|
It is obvious to see that L(A) = L(M) = L.
\square
Chapter 2
Extensions to
String-Walking Pebble Automata
In this chapter we will introduce two extensions to the concept of
string-walking pebble automata. These extensions will be constructed in such
a way that the class of languages recognized by the automata is the same
as the class of languages that the (ordinary) string-walking automata
recognize, i.e., the class of regular languages.
2.1 String-Walking Multi-Pebble Automata
The first extension to the string-walking automaton is the string-walking
multi-pebble automaton. This automaton can use more than one pebble. The
number of pebbles available to the automaton is fixed.
The pebbles will be taken from a finite set and
must be used ``nested''. This means that, at any point in
the automaton's walk, only the pebble that was dropped last can be lifted.
It can easily be seen that we can, without changing the automaton's power,
number the pebbles {1,2,¼,n} and
demand that the pebbles must be used in order. In order to see this, suppose
the pebbles are taken from a set P with |P| = n.
Consider the set M of bijections from
P to {1,2,¼,n} (there are n! such mappings). We can now simulate an
automaton with pebble set P by an automaton with pebble set [1,n] by
coding the pebble mapping into the states: for every state q we use
additional states {qm \mid m Î M}. We also modify the transitions with
put and lift so that the information on the mapping is maintained.
Note that the constraint of the nested use of pebbles is necessary to make
sure that the automaton does not recognize non-regular languages. A string-walking
2-pebble automaton without this limitation could recognize the language
L = {ancbn \mid n Î \mathbb N} by putting the pebbles at both ends of the
string and moving them towards the middle of the string in turn, checking
that both pebbles arrive at the middle c after the same number of moves.
This kind of use of the pebbles is prohibited by demanding that the pebbles
be used nested. In broader terms, it can be shown that unnested multi-pebble
automata are equivalent to two-way multi-head automata, which in turn are
equivalent to logarithmic space Turing machines.
Let n ³ 1. We use the following set of directives for a string-walking n-pebble automaton
over S:
|
|
|
DSWA(S)È{ puti, lifti, herei, Øherei \mid 1 £ i £ n } |
|
| |
|
For
each w Î S* the relations for the new directives are as
follows.
The relations are on tuples (u,k,p) Î Vw ×[0,n]×([1,n] ® Pw). Here k represents that pebbles 1,¼,k
are currently on the string and p maps the pebbles to string positions. We
define p0 : [1,n] ® Pw with p0(i) = ^ for all i Î [1,n]
as the mapping with no pebbles on the string.
The relations hold for any
u Î Vw, k Î [0,n], and pebble
placement function p:[1,n] ® Pw.
|
|
|
{ ((u, i-1, p), (u, i, p(i ® u))) \mid 1 £ i £ n } |
| |
|
{ ((u,i,p), (u,i-1,p(i ® ^))) \mid 1 £ i £ n, p(i) = u } |
| |
|
{ ((u,k,p), (u,k,p)) \mid 1 £ i £ k and p(i) = u } |
| |
|
{ ((u,k,p), (u,k,p)) \mid 1 £ i £ n and (i > k or p(i) ¹ u) } |
|
| |
|
A string-walking n-pebble automaton over S is a finite
automaton A over
DSWnPA(S). For a string-walking n-pebble automaton
A = (Q,DSWnPA(S),d,I,F) and a string w Î S*,
a configuration is a tuple
(q,u,k,p) Î Q ×Vw ×[0,n] ×([1,n] ® Pw).
The set of all configurations of A and w is names \mathbb CA,w.
The step relation \twoheadrightarrowA,w is defined in the obvious way.
For all configurations
(q,u,k,p), (q¢,u¢,k¢,p¢) Î \mathbb CA,w,
(q,u,k,p) \twoheadrightarrowA,w (q¢,u¢,k¢,p¢) iff $d Î DSWnPA(S):(q,d,q¢) Î d, ((u,k,p),(u¢,k¢,p¢)) Î Rw(d). |
|
For each string w Î S*, A computes the relation
Rw(A) = { (u,v) Î Vw ×Vw \mid (q,u,0,p0) \twoheadrightarrowA,w* (q¢,v,0,p0) for some q Î I and q¢ Î F } . |
|
The position relation R(A) and the language L(A) that A computes
are defined in the usual way, as well as the behaviour of transitions
of the form (q,s,q¢) with d Î DSWnPA(S)*.
2.2 From String-Walking Multi-Pebble Automata to
MSO Formulas
In this section we show that any position relation that can be
computed by a string-walking multi-pebble automaton can also be
defined by a binary MSO formula. Note that the other way around is obvious,
because string-walking multi-pebble automata are an extension to string-walking
pebble automata.
Theorem 14
For every string-walking n-pebble automaton A over an alphabet S
there exists
a binary MSO formula f(x,y) Î MSOL2(S) such that
R(f) = R(A).
Proof.
We will prove this theorem using natural induction on the number of pebbles
n. Note that the case n = 1 is equal to the second part of
Theorem 12. We present a proof similar to that in
Section 1.8. Choose a number of pebbles n > 1. In the induction
step, it can be assumed that Theorem 14 is valid for
any string-walking (n-1)-pebble automaton (the induction hypothesis).
The following Claim (similar to Lemma 10) states that
after a string-walking n-pebble automaton A drops its first pebble,
its round-trips from the pebble position back to that same position can
be simulated by a unary MSO formula. In the proof of this lemma we use the
induction hypothesis.
- [
-
Claim]
Let A = (Q,DSWnPA(S),d,I,F) be a string-walking n-pebble automaton,
with no transitions with directives put1 or lift1.
Then, for every pair of states p,q Î Q, there exists a formula
ypq(x) Î MSOL1(S) such that, for all w Î S*
and x Î Vw,
(p,x,1,p0(1® x)) \twoheadrightarrowA,w* (q,x,1,p0(1 ® x)) iff (w,x) \models ypq(x) . |
|
To prove this Claim,
we construct a
string-walking (n-1)-pebble automaton A¢ over SÈ(S×B1) that simulates
the walk of A from the position of the first pebble and back
(similar to the proof of Lemma 10).
We define
A¢ = (Q,DSW(n-1)PA(SÈ(S×B1)), d¢, {p},{q})
where
|
|
|
| |
|
{ (s,lab(s,1), t) \mid (s,here1,t) Î d, s Î S} È |
| |
|
{ (s,labs,t) \mid (s,Øhere1,t) Î d, s Î S} È |
| |
|
{ (s,lab(s,1), t) \mid (s,labs,t) Î d, s Î S} È |
| |
|
{ (s,puti-1,t) \mid (s,puti,t) Î d, 2 £ i £ n } È |
| |
|
{ (s,lifti-1,t) \mid (s,lifti,t) Î d, 2 £ i £ n } È |
| |
|
{ (s,herei-1,t) \mid (s,herei,t) Î d, 2 £ i £ n } È |
| |
|
{ (s,Øherei-1,t) \mid (s,Øherei,t) Î d, 2 £ i £ n } |
|
| |
|
The states of A¢ are the same as those of A, A¢'s only initial state is
p and A¢'s only final state is q. All directives that do not use or test
the pebble are simulated directly. Pebble 1 is simulated by the marking, while
pebbles 2,¼,n are simulated by pebbles 1,¼,n-1.
Now it is easy to see that, for all s,t Î Q, w Î S*,
x,y,z Î Vw and w¢ = mark(w,z),
(s,x,1,p0(1 ® z)) \twoheadrightarrowA,w* (t,y,1,p0(1 ® z)) iff (s,x,0,p0) \twoheadrightarrowA¢,w¢* (t,y,0,p0). |
|
This means that A¢ on a string marked at position z, starting in
a configuration without pebbles and walking from position x to y
simulates A on the unmarked string, in a configuration with pebble 1
at position z.
Note that, following the definition of Rw¢(A¢) for
a string-walking (n-1)-pebble automaton,
(p,x,0,p0) \twoheadrightarrowA¢,w¢* (q,y,0,p0) iff (x,y) Î Rw¢(A¢). |
|
According to the induction hypothesis, there exists a binary MSO formula
f¢pq(x,y) Î MSOL2(SÈ(S×B1)) |
|
that simulates the walk of A¢
on a marked string, such that, for all w Î S*,
x,y,z Î Vw and w¢ = mark(w,z),
(x,y) Î Rw¢(A¢) iff (w¢,x,y) \models f¢pq(x,y). |
|
The remainder of the proof of the Claim is again similar to the proof
of Lemma 10. Again using Lemma 2 to pull
the marking into the free position variables of the MSO formula, we obtain
a formula fpq(x,y,z) Î MSOL3(S) such that, for all
w Î S*, x,y,z Î Vw and w¢ = mark(w,z),
(w¢,x,y) \models f¢pq(x,y) iff (w,x,y,z) \models fpq(x,y,z) . |
|
We now define the abbreviation
ypq(x) = fpq(x,x,x) Î MSOL1(S). Then, for all
w Î S* and x Î Vw,
(p,x,1,p0(1 ® x)) \twoheadrightarrowA,w* (q,x,1,p0(1 ® x)) iff (w,x) \models ypq(x) . |
|
This ends the proof of the Claim.
We now continue with the proof of Theorem 14. Let
A = (Q,DSWnPA(S),d,I,F) be a string-walking n-pebble automaton
over S. We will, like in the proof of Theorem 11,
define a string-walking automaton with MSO tests
A¢ over S such that R(A) = R(A¢). The MSO tests ypq(x)
in this automaton are obtained by applying the Claim
to the automaton derived from A by removing all transitions of the
form (p,d,q) with d Î { put1, lift1 }.
We define A¢ = (Q,D,d¢,I,F) exactly the same as in the proof
of Theorem 11, but with
Øhere1, put1 and lift1 instead of Øhere, put and
lift:
|
|
|
{ ¬, ®, true(x), head(x), tail(x),Øhead(x), Øtail(x) } È |
| |
|
{ labs(x) \mid s Î S} È{ ypq(x) \mid p,q Î Q } |
| |
|
{ (p,d,q) Î d\mid d Î { ¬, ® } } È |
| |
|
{ (p,head(x),q) \mid (p,head,q) Î d} È |
| |
|
{ (p,Øhead(x),q) \mid (p,Øhead,q) Î d} È |
| |
|
{ (p,tail(x),q) \mid (p,tail,q) Î d} È |
| |
|
{ (p,Øtail(x),q) \mid (p,Øtail,q) Î d} È |
| |
|
{ (p,labs(x),q) \mid (p,labs,q) Î d} È |
| |
|
{ (p,true(x),q) \mid (p,Øhere1,q) Î d} È |
| |
|
{ (p,yqr(x),s) \mid (p,put1,q) Î d and (r,lift1,s) Î d} |
| |
| |
|
Again, the directives
{¬,®,head,Øhead,tail,Øtail} and
labs (for s Î S) are simulated straightforwardly.
The directive yqr(x) simulates the actions of A from
put1 to the following lift1 (at the same location), including
all puti and lifti for 2 £ i £ n. It is easy to see that
R(A) = R(A¢). Now, using Proposition 5, there exists a binary
MSO formula f(x,y) Î MSOL2(S) such that R(f) = R(A¢) = R(A).
\square
2.3 String-Walking Marble Automata
Another way to extend string-walking pebble automata is the introduction of
marbles instead of pebbles. They come in several colours , that are
labeled with a colour alphabet G. For each colour g Î G,
the number of marbles of colour g is not limited. In this paper, we
give only the definition of string-walking marble automata and of their
behaviour. The idea is that every marble colour can be used for a quantifier
of MSO formulas; however, it is not proven here that string-walking marble
automata compute exactly the position relations that binary MSO
formulas recognize. This may not even be the case.
DSWMA(S,G) = DSWA(S)È{ putg, liftg, hereg, Øhereg \mid g Î G} |
|
For all directives d, Rw(d) is a binary relation over
Vw ×(G® 2Vw) ×\mathbb N ×(\mathbb N ® ((Vw ×G) È{^})).
An element
(u,p,n,s) of this set means that
- The current position of the automaton is u.
- For every marble colour g Î G, p(g) is the set
of positions
where a marble of colour g is lying.
- n is the total number of marbles of any colour on the string.
- For every natural number 1 £ i £ n, if
s(i) = (v,g) Î Vw ×G,
then the ith marble that was laid down is at position v and of colour g.
For i > n, s(i) = ^.
Effectively, s is a stack of marble colours and positions.
This information is needed to enforce the nested use of the marbles.
Note that p can be determined from n and s: for every g Î G,
p(g) = { u Î Vw \mid $i Î [1,n]: s(i) = (u,g) } . |
|
The relations for the directives from DSWA(S) are extended straightforwardly.
These directives do not alter p, n or s.
For the directives putg, liftg, hereg and
Øhereg,
the relations are as follows. The definition of Rw(putg) is
complicated, because we demand that the marbles are nested as usual and
that all marbles of colour g must be put down after each other,
without putting down a marble of another colour in between.
|
|
|
{ ( (u,p,n,s), ( u,p(g® p(g) È{u}), n+1, s(n+1 ® (u,g))) ) \mid u \not Î p(g), |
| |
|
"1 £ i < j £ n ( Ø$v,v¢ Î Vw, g¢ Î G: (g¢ ¹ gÙs(i) = (v,g) Ùs(j) = (v¢,g¢) ) ) } |
| |
|
{ ( (u,p,n,s),(u,p(g® p(g) \{u}), n-1, s(n® ^))) \mid s(n) = (u,g) } |
| |
|
{ ( (u,p,n,s), (u,p,n,s) ) \midu Î p(g) } |
| |
|
{ ( (u,p,n,s), (u,p,n,s) ) \midu \not Î p(g) } |
|
| |
|
We define p0:G® 2Vw with p0(g) = Æ for all
g Î G. We define s0:\mathbb N ® ((Vw ×G) È{^}) with
s(i) = ^ for all i Î \mathbb N.
The relation computed by A on a string w Î S* is
Rw(A) = { (u,v) Î Vw ×Vw \mid(q,u,p0,0,s0) \twoheadrightarrowA,w* (q¢,v, p0,0,s0) for some q Î I and q¢ Î F }. |
|
Chapter 3
Binary MSO Formulas and Tree-Walking Automata
3.1 Preliminaries
In this section we recall the well-known concepts of
trees, finite tree automata, regular tree languages, and monadic
second order logic.
Trees
In the usual way, trees are defined as finite, directed graphs with
labeled nodes and edges.
Let S and G be alphabets of node labels and edge labels,
respectively. A graph over (S,G) is a triple
(V,E,lab), with V a finite set of nodes,
E Í V ×G×V the set of labeled edges, and
lab: V ® S the node-labeling function.
For a given graph g, its nodes, edges and node-labeling functions are
denoted Vg, Eg and labg, respectively.
The trees we consider have their nodes labeled by symbols from a
ranked alphabet. The edge from a parent to its i-th child is labeled with
the number i.
A ranked alphabet S is an alphabet S together
with a rank function rk:S® \mathbb N. For any
symbol s Î S, rk(s) = k denotes that the rank
of s is k. In terms of trees this means that a node labeled with
s will have k children.
For all k Î \mathbb N, Sk = { s Î S\mid rk(s) = k }
is the set of symbols of rank k. The rank interval of the ranked
alphabet S is [1,m], where m is the maximal rank of the elements
of S; it is denoted rki(S).
A tree over S is an acyclic connected graph g over
(S, rki(S)) such that
- No node of g has more than one incoming edge.
- For every node u of g, u has only outgoing edges with labels in
[1,rk(labg(u))].
- For every node u of g and every i Î [1,rk(labg(u))],
u has exactly one outgoing edge with label i.
The set of all trees over S is denoted TS. A subset of TS
is also called a tree language.
The root of a tree t (the node with no incoming edges) is denoted
roott. For nodes u,v of t, if (u,i,v) Î Et, then
u is called the parent of v, and v is called the i-th child
of u, denoted by u ·i. We also define u·0 = u.
We define the set of ancestors anc(u) of a node u as u itself and the
ancestors of its parent. The least common ancestor
of two nodes u and v (denoted lca(u,v)) is defined as
the node w such that
- w is an ancestor of both u and v.
- If w¢ is an ancestor of both u
and v, then w¢ is an ancestor of w.
If u is an ancestor of v, then v is a descendant of u.
The set des(u) denotes the set of all descendants of u.
For a node u of t, tu denotes the subtree of t with root u, i.e.,
the subgraph of t induced by the set of all descendants of u.
Let S be a ranked alphabet and let k Î \mathbb N.
We define k-ary node relations in analogy to k-ary position
relations for strings.
A k-ary node relation over S is a subset of
{(t, u1, ¼, uk) \mid t Î TS and ui Î Vt for all i Î [1,k]}.
We define marked trees as follows, in analogy to marked strings.
Let S be a ranked alphabet and let k ³ 1.
The ranked alphabet SÈ(S×Bk) is defined as usual,
where we assign to
the symbol (s, b1, ¼, bk) the rank rk(s)
(for s Î S and bi Î [1,k] for i Î [1,k]).
Let t Î TS and let u1, ¼, uk Î Vt.
We define mark(t,u1, ¼, uk) as a tree over
SÈ(S×Bk), with node-set Vt, edges Et and node-labeling
function lab¢: Vt ® SÈ(S×Bk) with
lab¢(u) = labt(u) if u ¹ ui for all i Î [1,k],
and lab¢(u) = (labt(u), (u = u1), ¼, (u = uk)) otherwise
(where (u = ui) = 1 iff u equals ui).
Finite Tree Automata
A finite tree automaton [GS84] traverses a tree bottom-up. It starts in all the
leaves of the tree, in parallel, and works its way up to the root. A
finite tree automaton has a finite set of states Q and a transition function.
The state of the automaton in a node of the tree is determined by the
states of the children of the node and by the label of the node.
For a symbol s with rk(s) = k, the transition
function ds is therefore a function from Qk to Q. A finite
tree automaton accepts a tree if it ends at the root of the tree in one of
a set of pre-determined final states.
Let S be a ranked alphabet. A (deterministic)
finite tree automaton over
S is a 4-tuple M = (Q,S,d,F) where Q is a finite set of states,
S is the input alphabet, d = { ds \mid s Î S}
with, for s Î Sk, ds: Qk ® Q the
transition function for s, and F Í Q the set of final states.
For every tree t Î TS and node u Î Vt, the state in which M
reaches u, denoted stateM,t(u), is defined by bottom-up induction
as stateM,t(u) = ds(stateM,t(u ·1), ¼, stateM,t(u ·k)) where labt(u) = s with
rk(s) = k. The tree language recognized by M is
L(M) = { t Î TS \mid stateM,t(root(t)) Î F }.
A tree language L is called a regular tree language if it is recognized
by a finite tree automaton. The class of all regular tree languages is named
REGT.
For any finite tree automaton M = (Q,S,d,F),
tree t Î TS and node u Î Vt, the set of
successful states of M at u, denoted by succM,t(u), is defined
(by top-down induction) as follows: succM,t(roott) = F and,
for a node u Î Vt with labt(u) = s, rk(s) = k > 0,
succM,t(u·i) is the set of all states q Î Q such that
ds(q1, ¼, qi-1,q,qi+1,¼, qk) Î succM,t(u),
where qj = stateM,t(u ·j) for j Î [1,i-1] È[i+1,k].
Intuitively, q Î succM,t(u) means that M will reach the
root of t in a final state assuming that it reaches u in state q
(instead of stateM,t(u)).
Thus, t Î L(M) iff stateM,t(u) Î succM,t(u).
This can easily be proven ([BE97], Lemma 1).
Monadic Second Order Logic (on trees)
The monadic second order logic on strings described in Section 1.1 can be
extended to trees (see, e.g., [TW68,Don70,BE97]). We will in short describe the
differences. Let S be a ranked alphabet.
- The variables x,y,¼ are now node variables; the variables X,Y,¼
are now node-set variables. For a given tree t Î TS, node variables
range over Vt and node-set variables range over subsets of Vt.
- The atomic formulas are labs(x), for every s Î S, denoting
that x has label s; edgi(x,y), for every i Î rki(S),
denoting that y is the ith child of x; and x Î X, denoting that
x is an element of X.
- The language defined by a formula f Î MSOL0(S) is now a tree
language, L(f) = {t Î TS \mid t \models f}
The class of all MSO definable tree languages is named MSOT.
The following classical result from [Don70,TW68] states that
MSOT=REGT, i.e., that
formulas from monadic second order logic with no free variables
define exactly the regular tree languages:
Proposition 15
A tree language is MSO definable if and only if it is regular.
As stated in Section 1.1, the results of Lemma 2,
Corollary 3, and Lemma 4 are also valid
for trees (with S a ranked alphabet and t Î TS instead of
w Î S*).
3.2 Tree-Walking Automata
A tree-walking automaton (see, e.g., [KS81]) is very similar to a string-walking automaton
(Section 1.2). A tree-walking automaton can walk freely on the
tree. In this paper we will give a definition of tree-walking automaton
that is slightly different from the definition in [KS81]. First of all,
we will make use of directives to define the behaviour of the automaton,
while [KS81] directly defines a binary step relation on the set of
configurations. Secondly, with our definition the automaton is able to check
which child of its parent the current node is. In [KS81], the automaton
can only move up, without any restriction to child number. This makes our automaton
more powerful (i.e., it can recognize more tree languages).
Let S be a ranked alphabet. Like a string-walking automaton,
a tree-walking automaton over S is a finite automaton with a special
set of directives as input alphabet. For a tree-walking automaton, we define
the set of directives as
DTWA(S) = { i, ¯i \mid i Î rki(S) }È{ root, Øroot} È{ labs\mid s Î S} . |
|
The meaning of the directives is as follows:
- i means go up via an edge labeled by i, i.e.,
move to the parent, provided the current node is the ith child of
its parent.
This directive can be used to check
which child of its parent the current node is.
- ¯i means go down via an edge labeled by i, i.e., move to the
ith child of the current node.
- root checks whether the current node is the root of the tree.
- Øroot is the negation of root.
- labs checks whether the current node is labeled by s.
Note that the directive root corresponds to the directive head
of the string-walking automaton.
A directive ``leaf'' (corresponding to tail) is not necessary since
the automaton can test
whether the label of the current node has rank 0. Also, Øroot can be
simulated by i ¯i for all i Î rki(S).
To formalize the behaviour of these directives
we define for each tree t Î TS and
each directive d Î DTWA(S) the following binary relation Rt(d) on Vt:
|
|
|
{ (u,v) \mid (v,i,u) Î Et } |
| |
|
{ (u,v) \mid (u,i,v) Î Et } |
| |
|
| |
|
| |
|
{ (u,u) \mid labt(u) = s} |
|
| |
|
Like the definition of string-walking automata, a
tree-walking automaton over S is a finite automaton A over
DTWA(S). For a tree-walking automaton A = (Q,DTWA(S),d,I,F) and a tree
t Î TS, an element (q,u) of Q ×Vt is a configuration of the
automaton. The set of all configurations of A and t is denoted by
\mathbb CA,t. A configuration (q,u) denotes that the automaton A is in
state q at node u.
Let A = (Q,DTWA(S),d,I,F) be a tree-walking automaton over S and let
t Î TS.
Like with string-walking automata, we define a step-relation \twoheadrightarrowA,t on
the set of configurations as follows.
For every (q,u), (q¢,u¢) Î \mathbb CA,t,
(q,u) \twoheadrightarrowA,t (q¢,u¢) iff $d Î DTWA(S): (q,d,q¢) Î d and (u,u¢) Î Rt(d) . |
|
The automaton A computes on each tree t Î TS the binary relation
Rt(A) = { (u,u¢) Î Vt ×Vt \mid(qin,u) \twoheadrightarrowA,t* (qfin,u¢) for some qin Î I and qfin Î F } . |
|
The node relation that A computes is
R(A) = { (t,u,v) \mid t Î TS and (u,v) Î Rt(A) } . |
|
The definitions of these relations (\twoheadrightarrowA,t, Rt(A) and R(A)) are
mutatis mutandis equal to the definitions for string-walking automaton in
Section 1.2.
The tree language that A recognizes is defined as
L(A) = { t Î TS \mid(roott, v) Î Rt(A) for some v Î Vt } . |
|
Note that we can assume that the walk of the automaton ends as well as begins in
roott, since we can easily add extra transitions to make the automaton
walk to the root at the end of its walk.
The class of all tree languages recognized by any tree-walking automaton is
named TWA. It is an open problem if TWA=REGT.
Like in the case of string-walking automata, we will also allow transitions
of the form (p,s,q) with s Î DTWA(S)*. If s = d1 ¼dn,
we define, for all t Î TS,
Rt(s) = Rt(d1) °¼°Rt(dn). |
|
As with string-walking automata, these extended directives can easily be
implemented by adding extra transitions and states.
Determinism of tree-walking automata in defined in analogy with determinism
of string-walking automata. A tree-walking automaton A = (Q,DTWA(S),d,I,F)
is deterministic if the following conditions hold:
- #I = 1.
- If (p,d,q) Î d, then p \not Î F.
- For all distinct transitions (p,d1,q1), (p,d2,q2) Î d,
d1 and d2 are two mutually exclusive directives in DTWA(S).
Two directives d1,d2 Î DTWA(S) are again mutually exclusive if, for all
t Î TS, dom(Rt(d1)) Çdom(Rt(d2)) = Æ. The following
pairs of directives in DTWA(S) are mutually exclusive:
- {i, j} with i ¹ j.
- {i,root} for i Î rki(S)
- {root,Øroot}
- { labs1, labs2 } with s1 ¹ s2
- { ¯i, labs} if i > rk(s).
3.3 Push-Down Tree-Walking Automata
A push-down tree-walking automaton
walks on a tree, starting in the root
of the tree. It has a push-down store. At each step of the automaton's walk,
its possible moves are determined by the automaton's state, by the symbol
at the current position of the automaton and by the push-down symbol at the top of
the store. If the automaton moves up in the tree, the top symbol is removed from the
push-down store; if the automaton moves down, a new symbol is added to the
push-down store. There are two main differences between the push-down tree-walking
automata as described in this section and those
in literature (e.g., [KS81]). Here, directives are used to describe
the different possible actions of the automaton. Furthermore,
[KS81] lets the automaton accept a tree if it walks up from the root,
leaving the tree in a final state; in the description here,
the walk can end in any node of the tree in a final
state. These differences have no consequences for the power of the automaton.
Note that the tree-walking automaton in [KS81] is the push-down tree-walking
automaton with only one push-down symbol.
We will present the push-down tree-walking automaton as an extension
of the tree-walking automaton.
Let S be a ranked alphabet and let G be an alphabet.
The directives of a push-down tree-walking
automaton are
|
|
|
{ } È{ ¯i,g \mid i Î rki(S), g Î G} È |
| |
|
{ labs\mid s Î S} È{ stayg1, g2 \mid g1, g2 Î G} . |
|
| |
|
The meaning of these directives is as follows:
- means go up to the parent of the current node
and remove the topmost symbol from the push-down store. Note that the
subscript i is not necessary for this automaton. As the automaton traverses
the tree, routing information can be placed in the push-down store.
- ¯i,g means go down via an edge labeled by i and
add g to the push-down store.
- labs checks whether the current node is labeled by s.
- stayg, g¢ checks whether the topmost symbol from the
push-down store is g and, if so, replaces it by g¢.
The push-down tree-walking automaton does not need the directives
root and Øroot. By using marked push-down symbols on the bottom
of the push-down store, the automaton can check whether the current node is the
root of the tree by inspecting the top symbol of the push-down store.
If convenient, however,
we will use the directives root and Øroot since a push-down
tree-walking automaton with these directives can be simulated by a push-down
tree-walking automaton without root and Øroot.
To formalize the directives, we define for each tree t Î TS and
each directive d Î DPDTWA(S,G) the following binary relation Rt(d) on
pairs (u,b) where u Î Vt is the current node, and
b = g1 ¼gn Î G*
is the current contents of the push-down store with n equal to the number of ancestors
of u.
|
|
|
{ ( (u, g1 ¼gn),(v, g1 ¼gn-1) ) \mid v is the parent of u } |
| |
|
{ ( (u, g1 ¼gn),(v, g1 ¼gn g)) \midv is the ith child of u } |
| |
|
{ ((u, g1 ¼gn),(u, g1 ¼gn)) \midlabt(u) = s} |
| |
|
{ ((u, g1 ¼gn-1 gn),(u, g1 ¼gn-1 g¢)) \mid gn = g} |
|
| |
|
Let S be a ranked alphabet (of node labels) and let G be an
alphabet (of push-down symbols) with a designated element gin (the initial
push-down symbol).
A push-down tree-walking automaton over (S,G) is a finite
automaton A over DPDTWA(S,G). For a push-down tree-walking automaton
A = (Q,DPDTWA(S,G),d,I,F) over (S,G) and a tree
t Î TS, the configurations of A are triples (q,u,b),
with q Î Q the state of the automaton, u Î Vt the current node, and
b = g1 ¼gn Î G* the contents of the push-down
store, where n is equal to the number of ancestors of u.
The set of all configurations on A and t is denoted by \mathbb CA,t.
Let A = (Q,DPDTWA(S,G),d,I,F) be a push-down
tree-walking automaton over (S,G)
and let t Î TS. The definition of the
step-relation \twoheadrightarrowA,t is almost identical to the definition of the
step-relation for tree-walking automata.
For every (q,u,b), (q¢,u¢,b¢) Î \mathbb CA,t,
(q,u,b) \twoheadrightarrowA,t (q¢,u¢,b¢) iff $d Î DPDTWA(S,G): (q,d,q¢) Î d and ((u,b),(u¢,b¢)) Î Rt(d) . |
|
The tree language that A recognizes is defined as
L(A) = { t Î TS \mid(qin,roott,gin) \twoheadrightarrowA,t* (qfin,u,b) for some qin Î I, qfin Î F, u Î Vt and b Î G* } . |
|
The automaton starts in the root of a tree with only the initial push-down symbol
gin on the push-down store. It recognizes the tree if it
reaches a final state in any node and with any contents of the push-down
store.
The class of all tree languages recognized by any push-down tree-walking automaton is
named PDTWA. Note that because of the push-down store there is no reasonable
definition for a binary node relation for push-down tree-walking automata.
Transitions of the form (p,s,q) with s Î DTWA(S)* are treated in the same way
as with tree-walking automata. Determinism of push-down tree-walking automata
is defined in the same way as for tree-walking automata. The following pairs of
directives in DPDTWA(S,G) are mutually exclusive:
- {labs1, labs2 } with s1 ¹ s2
- {stayg1,g1¢, stayg2,g2¢}
with g1 ¹ g2.
3.4 Equivalence of Finite Tree Automata and
Push-Down Tree-Walking Automata
In this section we show the known result [KS81] that push-down tree-walking automata recognize exactly
the regular tree languages. First we show that for every finite tree automaton
there exists a push-down tree-walking automaton that recognizes the same language;
then we show that, the other way around, for every push-down tree-walking automaton
there exists a finite tree automaton that recognizes the same language.
Lemma 16
For every finite tree automaton M there exists a push-down tree-walking
automaton A such that L(A) = L(M).
Proof.
Let M = (Q,S,d,F) be a finite tree automaton.
We construct a push-down tree-walking automaton
A = (Q¢,DPDTWA(S,G),d¢,{qin},{qfin}) over (S,G)
such that L(A) = L(M).
The automaton A walks on a tree t in a single pass from left to right. It
enters each subtree of t with root u in state qin and leaves it
in state stateM,t(u). It uses the pushdown store to remember,
(a) how many children of u have already been processed and (b) what the
resulting states were. Therefore we define Q¢ = Q È{qin,qfin} (with
qin,qfin\not Î Q),
G = È{ Qi \mid i Î rki(S) } È{l}
and gin = l.
The pushdown symbol l (the empty string) indicates that no subtrees
have been
traversed yet. If the automaton is at node u and the pushdown symbol
at the top of the pushdown store is q1 ¼qk, this means that the first
k subtrees of u have been traversed and that,
for every i with 1 £ i £ k,
stateM,t(u ·i) = qi.
The transition relation d¢ is as follows:
|
|
|
{ (qin, (Øroot, labs, ), ds) \mids Î S, rk(s) = 0 } È |
| |
|
{ (qin, (root, labs), qfin) \mids Î S, rk(s) = 0, ds Î F} È |
| |
|
{ (qin, (labs, ¯1,l), qin) \mids Î S, rk(s) > 0 } È |
| |
|
{ (qk, (labs, stayq1 ¼qk-1, q1 ¼qk, ¯k+1,l), qin) \mid |
| |
|
s Î S, "j Î [1,k] (qj Î Q), k < rk(s) } È |
| |
|
{ (qk, (Øroot, labs, stayq1 ¼qk-1, q1 ¼qk-1, ), ds(q1, ¼, qk)) \mid |
| |
|
s Î S, "j Î [1,k] (qj Î Q), k = rk(s) > 0 } È |
| |
|
{ (qk, (root, labs, stayq1 ¼qk-1, q1 ¼qk-1), qfin) \mid |
| |
|
s Î S, "j Î [1,k] (qj Î Q), k = rk(s) > 0, ds(q1, ¼, qk) Î F } |
|
| |
|
\square
Lemma 17
For every push-down tree-walking automaton A there exists a
finite tree automaton M such that L(M) = L(A).
Proof.
Let A = (Q,DPDTWA(S,G),d,I,F) be a push-down tree-walking automaton over
(S,G). From this automaton, we first construct another
push-down tree walking automaton A¢. A¢ is equal in behaviour to
A, except that when A is in a final state,
A¢ has extra transitions to move up to the root of the tree and then further up
``out'' of the tree in an extra state qfin. Note that
A¢ can never execute the latter instruction; it is added only to
make the construction of M easier.
We define
A¢ = (Q¢,DPDTWA(S,G),d¢,I,{qfin})
with Q¢ = Q È{qfin}, qfin\not Î Q and
d¢ = dÈ{(q,,qfin) \mid q Î F} È{ (qfin, , qfin) }.
We now construct a finite tree automaton M = (QM,S,dM,FM) such that
L(M) = L(A). We use the method of transition tables (see, e.g., [KS81]).
In this method, the states of the finite tree automaton are relations
R Í (G×Q¢) ×Q¢. When the
finite tree automaton computes a relation R as state for a node u with
((g,q),q¢) Î R , this means that the push-down tree-walking
automaton A¢ can traverse the subtree with u as root, starting at node u in state
q with symbol g on the top of the push-down store,
and emerging from the subtree (by an edge from u to its parent)
in state q¢, with g removed from
the push-down store. At the root, A¢ ``emerging'' from the
tree in state qfin means that A arrived in a final state.
Formally, we define
|
|
|
| |
|
{ ds \mid s Î S}, with ds (s Î S) defined below |
| |
|
{ R Í (G×Q¢) ×Q¢\mid ((gin,qin),qfin) Î R for some qin Î I } |
|
| |
|
Here, for every s Î S, the transition function for symbol s is
ds: QMrk(s) ® QM with
|
|
|
{ ((g, q), q¢¢) \mid g Î G, q,q¢¢ Î Q¢ and (g,q) R(s, R1, ¼, Rn)* (g¢, q¢) |
| |
|
for some (g¢, q¢) Î G×Q¢ with (q¢, , q¢¢) Î d¢} . |
|
| |
|
In this definition, R(s, R1, ¼, Rn) is the binary relation on
G×Q¢ that contains the results of the possible actions of the
push-down tree-walking automaton A¢ while staying on the same node or
entering a subtree and emerging again from it. The symbol s is
the symbol at the current position and the relations
Ri Í (G×Q¢) ×Q¢ are the
transition tables of the children of the current node. The relation R is
formally defined as
|
|
|
{ ((g,q),(g,q¢¢)) \mid(q, ¯i,g¢, q¢) Î d¢ and ((g¢,q¢),q¢¢) Î Ri |
| |
|
for some i Î [1,rk(s)] } È |
| |
|
{ ((g,q),(g¢,q¢)) \mid(q, stayg,g¢, q¢) Î d¢} È |
| |
|
{ ((g,q), (g,q¢)) \mid(q, labs, q¢) Î d¢} . |
|
| |
|
Note that these relations can effectively be computed. The computation of
R(s, R1, ¼, Rn) is straightforward. Since it is a binary
relation on a finite set, its reflexive transition closure can be computed
by regarding the relation as a graph and computing all paths in the graph
(the nodes of the graph are the elements of G×Q¢ and there is
an edge from (g1,q1) to (g2,q2) iff
((g1,q1),(g2,q2)) is in the relation).
\square
Combining these two lemmas yields the following theorem.
Theorem 18
PDTWA = REGT
3.5 Tree-Walking Marble Automata
Another way to extend tree-walking automata is with marbles . The concept
of marbles is the same as that of Section 2.3 for strings, but they
are used here in a different way.
Marbles are coloured (with symbols from an alphabet) and there are infinitely
many marbles of each colour available.
Like pebbles,
the automaton can drop a marble at a certain node, check for its presence any later
time and lift it again.
It is not allowed to
put more than one marble of the same colour on the same node.
Note that by taking the powerset of the marble alphabet we can
even pose that not more than one marble (of any colour) is on any node.
In the rest of this paper we will implicitly assume that this is the case.
Unlike pebbles, putting down a marble on a certain node
restrains the automaton to the subtree of which that node is the root, until
the marble is lifted again. This restriction makes sure that the marbles are
``almost'' nested: unnested use of marbles is only possible if the marbles are
at the same node.
We will show that this restriction makes the
tree-walking marble automaton very similar to the push-down tree-walking
automaton, while it allows for the definition of the binary relation computed
by a tree-walking marble automaton on a tree, in a natural way.
Let S be a ranked
alphabet and let G be an alphabet of marble labels.
Note that there is no gin necessary now.
For a tree-walking marble
automaton over (S,G), we define the set of directives as
DTWMA(S,G) = DTWA(S)È{ putg, liftg, hereg, Øhereg \midg Î G} . |
|
The meaning of the extra directives is as follows:
- putg means drop a marble with label (or ``colour'') g on
the current node, if one is not yet there.
- liftg means pick up a marble with label g, if one is here.
- hereg means check whether there is a marble with label g
at the current node.
- Øhereg is the negation of hereg.
For t Î TS, the position of the marbles is formalized by a function
m:G® 2Vt. For this function, m(g) = S Í Vt means
that exactly the nodes in S have a marble with label g.
The set of all marble position functions is denoted by
Mt,G = G® 2Vt.
The behaviour of the
directives, including the restrictions mentioned above, is formalized by defining
for each tree t Î TS and each directive d Î DTWMA(S,G) the following
binary relation Rt(d) on Vt ×Mt,G.
|
|
|
{ ((u,m), (u¢,m)) \midu is the ith child of u¢ andfor all g Î G, u \not Î m(g) } |
| |
|
{ ((u,m),(u¢,m)) \midu¢ is the ith child of u } |
| |
|
{ ((roott,m),(roott,m)) } |
| |
|
{ ((u,m),(u,m)) \mid u ¹ roott } |
| |
|
{ ((u,m),(u,m)) \midlabt(u) = s} |
| |
|
{ ((u,m),(u,m¢)) \midu \not Î m(g), m¢ = m(g® m(g) È{u}) } |
| |
|
{ ((u,m),(u,m¢)) \midu Î m(g), m¢ = m(g® m(g) \{u}) } |
| |
|
{ ((u,m),(u,m)) \mid u Î m(g) } |
| |
|
{ ((u,m),(u,m)) \mid u \not Î m(g) } |
|
| |
|
The relation for d = i makes sure that the marbles are used nested,
since it only allows moving up if there are no marbles on the current node.
Let S be a ranked alphabet and let G be an alphabet. A
tree-walking marble automaton over (S,G) is a finite automaton
A over DTWMA(S,G). For a tree-walking marble automaton A = (Q,DTWMA(S,G),d,I,F) over
(S,G) and a tree t Î TS, the configurations of A are
triples (q,u,m) Î Q ×Vt ×Mt,G, with q the state of
the automaton, u the current node and m the marble position function. The set
of all configurations of A and t is denoted by \mathbb CA,t.
Let A = (Q,DTWMA(S,G),d,I,F) be a tree-walking marble automaton over
(S,G) and let t Î TS. As usual, we define the binary
step-relation \twoheadrightarrowA,t on \mathbb CA,t. For every
(q,u,m), (q¢,u¢,m¢) Î \mathbb CA,t,
(q,u,m) \twoheadrightarrowA,t (q¢,u¢,m¢) iff $d Î DTWMA(S,G):(q,d,q¢) Î d and (u,m) Rt(d) (u¢,m¢) . |
|
The automaton A computes on each tree t Î TS the binary
relation
Rt(A) = { (u,u¢) Î Vt ×Vt \mid(qin,u,m0) \twoheadrightarrowA,t* (qfin,u¢,m0) for some qin Î Iand qfin Î F } , |
|
where m0:G® 2Vt with m0(g) = Æ for all
g Î G. Note that this definition demands that A removes
all marbles from the tree.
The node relation that A computes is, as usual,
R(A) = { (t,u,v) \mid t Î TS and (u,v) Î Rt(A) } . |
|
The tree language that A recognizes is also defined in the familiar way:
L(A) = { t Î TS \mid (t,roott,v) Î R(A) for some v Î Vt} . |
|
The class of all tree languages recognized by a tree-walking marble automaton
is named TWMA. Transitions of the form (q,s,q¢) with s Î DTWMA(S,G)* are
treated in the usual way. Determinism of tree-walking marble automata is defined
in the same way as of (simple) tree-walking automata. The following pairs of
directives in DTWMA(S,G) are mutually exclusive:
- {i, j} with i ¹ j.
- {i, root}.
- {root, Øroot}.
- {labs1, labs2} with s1 ¹ s2.
- {¯i, labs} with i > rk(s).
- {i, hereg} and {i, liftg}
with i Î rki(S) and g Î G (if there is any marble,
the automaton is not allowed to move up).
- {putg, liftg},
{putg,hereg} and {liftg,Øhereg}
with g Î G.
We define the abbreviation ØhereG for
Øhereg1, ¼, Øheregk with
G = {g1, ¼, gk}. These directives together test that
there is no marble on the current node.
Note that root and Øroot are needed for the tree-walking marble automaton
because, unlike the push-down tree-walking automaton, a tree-walking
marble automaton does not have to start its walk at the root of the tree,
so it is not always possible to place a special marble at the root.
Likewise, directives i for i Î rki(S) are needed because,
when the automaton starts in a certain node not equal to the root,
the path from the root to that node cannot be
coded into marbles.
Theorem 19
TWMA = REGT
Proof.
Since PDTWA=REGT (Theorem 18), it is sufficient to show that TWMA=PDTWA.
First we prove that any push-down tree walking automaton can be simulated by a
tree-walking marble automaton. Let A = (Q,DPDTWA(S,G),d,I,F) be a push-down
tree-walking automaton over (S,G). We define a tree-walking marble
automaton
A¢ = (QÈ{qin,qfin},DTWMA(S,G),d¢,{qin},{qfin}) |
|
with a new
initial state qin and a new final state qfin.
This automaton simulates A in the following way. First, a marble with colour
gin, representing the initial push-down symbol, is put on the current node.
Then the directive ¯i,g is simulated by
moving down to the ith child of the current node and then putting down
a marble of colour g. On the way up, the directive
is simulated by picking up any marble (there is always exactly
one on the current node) and moving up to the parent of the current node.
The directive labs is also available in DTWMA(S,G), so it can be simulated
directly. When A reaches a final state, A¢ moves up to the root, picking up
all the marbles on the way up, so that it accepts the tree when it picks up
the last marble at the root. A configuration of the push-down
tree-walking automaton (u,g1 ¼gn) is simulated by a
configuration (u,m) of the tree-walking marble automaton, where, for all
i Î [1,n], m has a marble gi on the ith node of the path
from the root to u, and no other marbles.
A¢ has the following transitions:
|
|
|
{ (qin, putgin, q) \mid q Î I } È |
| |
|
{ (q,(liftg, i),q¢) \mid (q,,q¢) Î d, g Î G, i Î rki(S) } È |
| |
|
{ (q,(¯i,putg),q¢) \mid (q,¯i,g,q¢) Î d} È |
| |
|
{ (q,labs,q¢) \mid(q,labs,q¢) Î d} È |
| |
|
{ (q, (liftg1,putg2),q¢) \mid(q,stayg1, g2,q¢) Î d} È |
| |
|
{ (q, liftg, qfin) \mid q Î F, g Î G} È |
| |
|
{ (qfin, (i, liftg), qfin) \mid i Î rki(S), g Î G}. |
|
| |
|
Obviously, L(A¢) = L(A).
The other way around, let A = (Q,DTWMA(S,G),d,I,F) be a tree-walking marble
automaton over (S,G).
As observed before, we may assume that the automaton is constructed
in such a way that, at any time, there is at most one marble on any node.
Note that, due to the restriction of the automaton to the subtree of the current
node whenever a marble is put down, at any moment all marbles lie on the path
from the root to the current node. This path can be viewed as a push-down store,
except that here ``empty'' places in the store are allowed.
To account for these empty places, we add an extra marble colour g0.
In order to simulate root, Øroot and the directives
i (for i Î rki(S)), extra information is put on the
push-down store. The new push-down symbols are of the form
(g, i, r), where g Î G is the marble colour
(or g = g0 if there is no marble on the current node), the current node
is the ith child of its father, and r = 1 if the current node is the root and
r = 0 if the current node is not the root. The initial pushdown symbol is
gin = (g0,1,1).
Now the push-down store can be used to simulate the marbles.
We define a
push-down tree-walking automaton
A¢ = (Q¢,DPDTWA(S,G¢),d¢,I,F¢).
- When A moves down (¯i), A¢ also moves down and puts
the symbol (g0,i,0) on the push-down store, which means that
there is no marble on the (new) current node, the current node is the
ith child of its father and the current node is not the root.
- When A moves up (i), A¢ checks the symbol on top of the
push-down store to make sure that the child number is correct and that there is
no marble on the current node.
- A¢ can simulate root and Øroot by checking the symbol on
top of the push-down store, (g,i,r) with r = 1 for root and
r = 0 for Øroot.
- The directive labs can be simulated directly.
- The directive putg can be simulated by first checking that there is no
marble on the current node, so the symbol on top of the push-down store is
(g0,i,r) and then replacing g0 by g.
- To simulate liftg, A¢ checks that there is a marble on the
current node with colour g and then replaces it with ``colour''
g0.
- The directive hereg is simulated by inspecting the top symbol
of the push-down store and making sure that it is of the form
(g,i,r). Similarly for Øhereg and a top symbol
of any of the forms (g¢,i,r) with g¢ ¹ g.
- When A reaches a final state, A¢ must make sure that there are no
marbles left on the tree before it can accept the tree. Because all marbles
are on the path from the current node to the root, A¢ walks up to the root,
checking the absence of marbles on the way, and enter the final state
qfin when it arrives at the root and there is no marble there either.
Formally, the push-down alphabet and the transitions of A¢ are defined
as follows.
|
|
|
(GÈ{g0}) ×rki(S) ×{0,1} with gin = (g0,1,1) |
| |
|
| |
|
| |
|
{(q,(stay(g0,i,r),(g0,i,r), ),q¢) \mid r Î {0,1}, (q, i, q¢) Î d} È |
| |
|
{ (q, ¯i, (g0,i,0),q¢) \mid(q,¯i,q¢) Î d} È |
| |
|
{ (q, stay(g,i,1),(g,i,1), q¢) \midg Î GÈ{g0}, i Î rki(S),(q,root,q¢) Î d} È |
| |
|
{ (q, stay(g,i,0),(g,i,0), q¢) \midg Î GÈ{g0}, i Î rki(S),(q,Øroot,q¢) Î d} È |
| |
|
{ (q, labs, q¢) \mid(q,labs,q¢) Î d} È |
| |
|
{ (q, stay(g0,i,r),(g,i,r), q¢) \midi Î rki(S), r Î {0,1}, (q,putg,q¢) Î d} È |
| |
|
{ (q, stay(g,i,r),(g0,i,r), q¢) \midi Î rki(S), r Î {0,1}, (q,liftg,q¢) Î d} È |
| |
|
{ (q, stay(g,i,r),(g,i,r), q¢) \midi Î rki(S), r Î {0,1}, (q,hereg,q¢) Î d} È |
| |
|
{ (q, stay(g,i,r),(g,i,r), q¢) \midi Î rki(S), r Î {0,1}, g Î (GÈ{g0}), |
| |
|
(q,Øhereg¢,q¢) Î d for some g¢ ¹ g} È |
| |
|
{ (q, stay(g0,i,1), (g0,i,1), qfin) \midq Î F, i Î rki(S) } È |
| |
|
{ (q, (stay(g0,i,0), (g0,i,0), ), up) \midq Î F, i Î rki(S) } È |
| |
|
{ (up, (stay(g0,i,0), (g0,i,0), ), up) \midi Î rki(S) } È |
| |
|
{ (up, stay(g0,i,1), (g0,i,1), qfin) \midi Î rki(S) } |
|
| |
|
Since this automaton simulates A, we have L(A¢) = L(A).
\square
One might wonder why it is necessary to
restrict a tree-walking automaton, when it puts down a marble on a node,
to the subtree of which that node is the root. We will show that just demanding
that the marbles be used nested is not enough; this would allow the automaton
to recognize non-regular tree languages. As an example, consider the non-regular
tree language
L = { tn \mid n Î \mathbb N} over the ranked alphabet
S = {a,b,c} with
rk(a) = rk(b) = 1 and rk(c) = 0, with, for all n Î \mathbb N,
|
|
|
{xi,yi \mid i Î [1,n]} È{z} |
| |
|
{ (xi,1,xi+1), (yi,1,yi+1) \mid i Î [1,n-1] } È{(xn,1,y1), (yn,1,z) } |
| |
|
{(xi,a), (yi,b) \mid i Î [1,n] } È{(z,c)} |
|
| |
|
This tree language L is similar to the (non-regular) language
{ an bn c \mid n Î \mathbb N}.
If we only demand the nested use of marbles,
L is recognized by the
tree-walking marble automaton A over (S,{g}),
that puts the first marble on the first ``a'', walks to the bottom of the tree,
checks whether there is a ``c'' there and a ``b'' directly above,
puts a marble on that ``b'', walks up the tree to the first ``a'' again, puts
a marble on the second ``a'' and continues walking up and down until
it reaches the middle, where it can check if there are as much ``a'''s
as ``b'''s. It then removes the marbles in the reverse order, walking up and
down the tree in a similar way.
Note that tree-walking
multi-pebble automata, defined in analogy to the string-walking
multi-pebble automata of Section 2.1, recognize only
regular tree languages (but it is open whether or not they
recognize them all). Multi-pebble automata have only a finite number
of pebbles available, contrary to the infinite number of marbles of
marble automata. This means the above described automaton will not work
with a multi-pebble automaton.
3.6 Tree-Walking Marble/Pebble Automata
There is one feature to be added to tree-walking marble automata to make sure
that they have the same power as binary MSO formulas with respect to describing
binary node relations. A tree-walking marble/pebble automaton has one pebble
(next to marbles) which it can use in the usual way. The only restriction is that
the use of marbles and pebble together must be nested, e.g., it is not allowed to
put down a marble, then the pebble and then pick up the marble before the pebble.
Unlike the marbles, the pebble does not restrict the automaton to the subtree
induced by all descendants of the node where the pebble is. The pebble will be used
to mark one special node, e.g., the starting node of a walk on the tree.
Like with tree-walking marble automata, we will always assume that the automaton
is constructed in such a way that there is not more than one marble on
any node, at any time.
Let S be a ranked alphabet and let G be an alphabet of marble colours.
We define the set of directives for a tree-walking marble/pebble automaton as
|
|
|
DTWMA(S,G)È{ put,lift,here,Øhere} |
| |
|
{ i, ¯i \mid i Î rki(S) } È{ root, Øroot} È{ labs\mid s Î S} È |
| |
|
{ putg, liftg, hereg, Øhereg \midg Î G} È{ put,lift,here,Øhere} |
|
| |
|
Here put, lift, here and Øhere (without subscripts)
are the directives for the pebble,
while putg, liftg, hereg and Øhereg
are the directives for a marble of colour g.
To enforce nesting, we now define the binary relations for the directives on
triples (u,k,p) Î Vt ×\mathbb N ×(\mathbb N ® MPt)
with MPt = (Vt ×G) ÈVt È{^}.
Here an element m of MPt represents
the action of putting down a marble of colour g on node u
(m = (u,g))
or of putting down the pebble on node u (m = u), or no action (m = ^).
Of a triple (u,k,p),
u is the current node of the automaton, k is the number of marbles and pebbles
currently on the tree and p is the marble/pebble placement function.
Intuitively, p(1), ¼, p(k) are the actions executed in the past (in that order) of
putting down marbles or the pebble. The function p is used as a kind of push-down stack,
putting another item on the stack if a marble or pebble is put down
and removing the topmost item from the stack if a marble or pebble is
lifted. At all times, p(n) = ^ if n > k.
For each tree t Î TS and each directive d Î DTWMPA(S,G) we define the
following binary relations:
|
|
|
{ ((u,k,p), (u¢,k,p)) \midu is the ith child of u¢ and |
| |
|
p(j) ¹ (u,g) for all j Î [1,k], g Î G} |
| |
|
{ ((u,k,p),(u¢,k,p)) \midu¢ is the ith child of u } |
| |
|
{ ((roott,k,p),(roott,k,p)) } |
| |
|
{ ((u,k,p),(u,k,p)) \mid u ¹ roott } |
| |
|
{ ((u,k,p),(u,k,p)) \midlabt(u) = s} |
| |
|
{ ((u,k,p), (u,k+1,p¢)) \midp¢ = p(k+1 ® (u,g)), |
| |
|
p(i) ¹ (u,g) for all i Î [1,k] } |
| |
|
{ ((u,k,p), (u,k-1,p¢)) \midk ³ 1, p(k) = (u,g), p¢ = p(k® ^) } |
| |
|
{ ((u,k,p), (u,k,p)) \midp(i) = (u,g) for some i Î [1,k] } |
| |
|
{ ((u,k,p), (u,k,p)) \midp(i) ¹ (u,g) for all i Î [1,k] } |
| |
|
{ ((u,k,p), (u,k+1,p¢)) \midk ³ 0, p¢ = p(k+1 ® u), |
| |
|
p(i) Î MPt \Vt for all i Î [1,k] } |
| |
|
{ ((u,k,p), (u,k-1,p¢)) \midk ³ 1, p(k) = u, p¢ = p(k ® ^) } |
| |
|
{ ((u,k,p), (u,k,p)) \midp(i) = u for some i Î [1,k] } |
| |
|
{ ((u,k,p), (u,k,p)) \midp(i) ¹ u for all i Î [1,k] } |
| |
| |
|
Let S be a ranked alphabet and let G be an alphabet.
A tree-walking marble/pebble automaton over (S,G)
is a finite automaton over DTWMPA(S,G).
We continue with the usual definitions.
For a tree-walking marble/pebble automaton A = (Q,DTWMPA(S,G),d,I,F)
over (S,G) and a tree t Î TS, the configurations
of A are 4-tuples
(q,u,k,p) Î Q ×Vt ×\mathbb N ×(\mathbb N ® MPt),
with q the state, u the current node, k the number of marbles and pebbles
currently on the tree and p the marble/pebble placement function.
The set of all configurations of A and t is denoted by \mathbb CA,t.
We define the binary step-relation \twoheadrightarrowA,t on \mathbb CA,t in the usual
way. For every (q,u,k,p),(q¢,u¢,k¢,p¢) Î \mathbb CA,t,
(q,u,k,p) \twoheadrightarrowA,t (q¢,u¢,k¢,p¢) iff $d Î DTWMPA(S,G): (q,d,q¢) Î d and (u,k,p) Rt(d) (u¢,k¢,p¢) . |
|
The automaton A computes on each tree t Î TS the binary relation
Rt(A) = { (u,u¢) Î Vt ×Vt \mid (qin,u,0,p0) \twoheadrightarrowA,t* (qfin,u¢,0,p0) for some qin Î I, qfin Î F } , |
|
where p0 is the empty marble/pebble placement function, with
p0(i) = ^ for all i Î \mathbb N.
The node relation that A computes is, as usual,
R(A) = { (t,u,v) \mid t Î TS and (u,v) Î Rt(A) } . |
|
The tree language that A recognizes is, also as usual,
L(A) = { t Î TS \mid (t,roott,v) Î R(A) for some v Î Vt } . |
|
The class of all languages that are recognized by a tree-walking marble/pebble
automaton is named TWMPA.
Transitions of the form (q,s,q¢) with s Î DTWMPA(S,G)* are treated in the usual
way. The definition of deterministic tree-walking marble/pebble automata is
equal to the definition of deterministic string-walking automata. The following pairs
of directives in DTWMPA(S,G) are mutually exclusive, next to those pairs
already mutually exclusive for tree-walking marble automata:
- {put, lift}, {put,here} and {lift,Øhere}
- {lift, liftg} for any g Î G (since either the
pebble or a marble was put down last and hence must be lifted first).
Theorem 20
TWMPA = REGT
Proof.
Because TWMA=REGT (Theorem 19) and because each
tree-walking marble automaton is also a tree-walking marble/pebble automaton,
it is sufficient to show that,
for each tree-walking marble/pebble automaton A,
there exists a tree-walking marble automaton A¢ such that L(A¢) = L(A).
Let A = (Q,DTWMPA(S,G),d,I,F) be a tree-walking marble/pebble automaton
over (S,G). We may again assume that A is constructed in such a way
that, at any time, there is at most one marble on any node.
We also assume
that A nevers puts the pebble on an empty tree, i.e., there is always at least
one marble on the tree when A puts down the pebble. This can easily be
accomplished by adding an extra marble colour and having A put one
on the root before starting.
This also allows us to assume that A does not use the directives
root and Øroot.
We first formulate and prove the following Claim.
This Claim states that for a situation where A just put down its pebble
on the current node u and the previous marble (of colour g) was put on
an ancestor v of u,
there exists a finite tree automaton M such that
A can make a round-trip, starting and finishing at u, starting in state
q and finishing in state q¢, with the same marbles on the tree at the beginning
and at the end of the round-trip (and without lifting the pebble at u),
if and only if M accepts tv (the subtree of t with root v),
marked at node u.
Due to the nesting constraints, the round-trip of A takes place only on
tv (A cannot move higher up the tree than v before lifting
the marble at u).
- [
-
Claim]
Let A = (Q,DTWMPA(S,G),d,I,F) be a tree-walking marble/pebble automaton
over (S,G) that
does not make use of the directives put, lift, root, and
Øroot. Then there exists
for every g Î G and q,q¢ Î Q a finite tree automaton
M over SÈ(S×B1) such that,
for all t Î TS, u,v Î Vt, k ³ 2,
p Î \mathbb N ® MPt with p(k) = u and p(k-1) = (v,g),
with u Î des(v) and the only marble on tv is at v
(i.e., "i < k: if p(i) = (v¢,g¢) then v¢\not Î des(v)),
(q,u,k,p) \twoheadrightarrowA,t* (q¢,u,k,p) iff mark(tv,u) Î L(M). |
|
To prove this Claim,
let A = (Q,DTWMPA(S,G),d,I,F) be a tree-walking marble/pebble automaton
over (S,G) that
does not make use of the directives
put, lift, root, and Øroot,
and let gr Î G (the g in the Claim) and q1,q2 Î Q.
We first define a tree-walking marble automaton
A¢ = (Q¢,DTWMA(SÈ(S×B1),G),d¢,I¢,F¢) over
(SÈ(S×B1),G) such that the Claim holds for A¢ rather than M.
The automaton A¢ uses the mark to know the position of the pebble.
It walks on tv; there is no marble of colour gr on v.
A¢ starts in the root of tv, i.e., v. It first searches
for the pebble, then simulates A, starting in state q1.
When it gets back to the pebble in state q2, A¢'s mission is
accomplished.
|
|
|
| |
|
| |
|
| |
|
{ (qin, ¯i, qin) \mid i Î rki(S) } È{ (qin, lab(s,1), q1) \mid s Î S} È |
| |
|
{ (p,d,q) \mid (p,d,q) Î d, d \not Î {here, Øhere, Øheregr, putgr }} È |
| |
|
{ (p,lab(s,1),q) \mid (p,labs,q) Î d} È |
| |
|
{ (p,lab(s,1),q) \mid (p,here,q) Î d,s Î S} } È |
| |
|
{ (p,labs,q) \mid (p,Øhere,q) Î d,s Î S} È |
| |
|
{ (p,root,q) \mid (p,heregr,q) Î d,s Î S} È |
| |
|
{ (p,(Øroot, Øheregr),q) \mid (p,Øheregr,q) Î d} È |
| |
|
{ (p,(Øroot, putgr),q) \mid (p,putgr,q) Î d} È |
| |
|
{ (q2, lab(s,1), qfin) \mids Î S} |
| |
| |
|
With this definition of A¢, we have, for all
t Î TS, u,v Î Vt, k ³ 2,
p Î \mathbb N ® MPt with p(k) = u and p(k-1) = (v,g),
with u Î des(v) and the only marble on tv at v,
(q1,u,k,p) \twoheadrightarrowA,t* (q2,u,k,p) iff mark(tv,u) Î L(A¢). |
|
Because TWMA=REGT (Theorem 19), there exists a finite tree
automaton M over SÈ(S×B1) with L(M) = L(A¢). The automaton
M complies with the demands of the Claim. This ends the proof of the Claim.
We continue with the proof of Theorem 20.
We construct a tree-walking marble automaton
A¢ that recognizes the same tree
language as the tree-walking marble/pebble
automaton A. The automaton
A¢ = (Q¢,DTWMA(S,G¢),d¢,I¢,F¢) simulates A,
but at each step of A, A¢ is prepared
to simulate the use of the pebble of A, by knowing, if it were to put the pebble
at the current node u, the states in which A could return at the same node,
ready to pick up the pebble again
(cf. the proof of Theorem 11).
To accomplish this, A¢ uses a special set
of marbles in addition to the marbles of A.
For each q,q¢ Î Q and g Î G,
Mq q¢g is the automaton that can be constructed according
to the Claim, after removing all transitions with directives
put and lift from A. The meaning of the additional marble colours is as follows.
- The marbles [^(g)] (for g Î G) denote that the
last marble that A has put down is of colour g. The special marble
e is
used to indicate that there are no marbles of A on the tree.
- The marbles in the set ``states'' are strings of functions from
Q ×Q to Q.
If the string is f1 ¼fk (with k the rank of the symbol s
of the node), then fi(q, q¢) = p means that Mq q¢g
reaches state p in the ith child of the node. Here g is the
colour of the latest marble put down by A on the tree.
- The marbles q Î Q denote that A will continue its walk on the
tree from there in state q. They are used to store the state of
A during the computation of the states-marble. They also mark the node
where this computation started.
- The marbles in the set ``succ'' are functions from
Q ×Q to subsets of Q. A marble coloured
g:Q ×Q ® 2Q on a node u indicates that,
for all q,q¢ Î Q,
the finite automaton Mq q¢g has successful states g(q,q¢)
at u. Here Mq q¢g works on the subtree tv, where
v is the node where the latest marble, of colour g, was put by A.
A¢ uses a method similar to the one used in Lemma 16 to
compute the states of the Mq q¢g and thus the marble
colours for the marbles from the set ``states''. The successful states of the
Mq q¢g for the children of a node u are computed by A¢
from the successful states of u itself and from the states that
Mq q¢g reaches in the children of u.
When A puts down its pebble (at node u), A¢ checks whether
Mq q¢g accepts mark(tv,u),
where q is the state of A after it put down its
pebble, q¢ is a state from which A can pick it up again,
g is the colour of the last marble that A put down before the
pebble and v is the position of that marble.
The colour g is stored in the marble [^(g)] at
the current node u.
During the direct simulation of the steps of A, there is always
exactly one marble from the set
{[^(g)] \mid g Î G} È{e}
at the current node.
When there is at least one marble of A on the simulated tree,
there is also exactly one marble from each of the sets states and succ
at the current node. The colours of these three marbles determine
whether or not the Mq q¢g accept mark(tv,u).
We define the abbreviation ØhereQ to be
Øhereq1, ¼, Øhereqn with {q1, ¼, qn} = Q.
Formally, the states, marble alphabet, initial and final states, and
transitions of A¢ are defined as follows.
|
|
|
Q È{ qin, qfin} È{ findstatesg \mid g Î G} È{ (f,g) \mid f:(Q ×Q) ® Q, g Î G} |
| |
|
GÈQ È{ |
^ g
|
\mid g Î G} È{e}ÈstatesÈsucc |
| |
|
{ l} È |
È
i Î rki(S)
|
((Q ×Q) ® Q)i |
| |
|
| |
|
| |
|
| |
|
|
| |
|
The set of transitions d¢ of A¢ is divided into four parts.
The initialization of A¢ is done in the transitions in din.
The actual simulation of the steps
of A is done by the transitions in dsim.
The set dstates contains a ``subroutine'' for calculating
the states of Mq1 q2 g for the children of the current node
in a marble from the set states.
Finally,
the cleaning up is done by the transitions in dfin.
We start with describing din. From the initial state
qin of A¢, a marble e is laid on the root.
This marble indicates that there is no
marble on the (simulated) tree yet. This transition prepares the
automaton A¢ for the simulation of A.
din = { (qin, pute, q) \mid q Î I } |
|
In dsim we place the transitions for the actual simulation
of the steps of A. This is quite technical.
After simulating any of the directives
{¯i, putg, liftg} the marble from
states must be re-computed.
The marbles from succ are computed with the use of a function
``su'' that is defined after dsim.
In the last set of transitions,
the putting down and lifting of the pebble is simulated, by making use
of the information in the marbles from states and succ.
|
|
|
{ (p, (heree,¯i,pute, q) \mid(p, ¯i, q) Î d} È |
| |
|
{ (p, (labs, here[^(g)], heref1¼fk, hereg, ¯i, put[^(g)], putg¢, putq, putl), findstatesg) \mid (p, ¯i, q) Î d, |
| |
|
s Î S, k = rk(s), g Î G, i Î [1,k], fj:Q×Q ® Q (j Î [1,k]), |
| |
|
g:Q×Q ® 2Q,g¢ = su(g, s, g, i, f1, ¼, fk) } È |
| |
|
{ (p, (lifte, i), q) \mid(p, i, q) Î d} È |
| |
|
{ (p, (ØhereG, lift[^(g)], liftg, liftf1¼fk, i), q) \mid (p, i, q) Î d, |
| |
|
g Î G, g:Q×Q ® 2Q, k ³ 0,fj:Q×Q ® Q (j Î [1,k]) } È |
| |
|
{ (p, labs, q) \mid (p, labs, q) Î d} È |
| |
|
{ (p, (putg, lifte, put[^(g)],putg, putq, putl), findstatesg) \mid (p, putg, q) Î d, |
| |
|
g:Q×Q ® 2Q with g(q1,q2) = FMq1 q2 g } È |
| |
|
{ (p, (putg, lift[^(g)]¢, put[^(g)],liftf1 ¼fk, liftg,putg¢, putq, putl), findstatesg) \mid (p, putg, q) Î d, |
| |
|
g¢ Î G, k ³ 0, fj:Q×Q ® Q (j Î [1,k]),g:Q×Q ® 2Q, |
| |
|
g¢:Q×Q ® 2Q with g¢(q1,q2) = FMq1 q2 g } È |
| |
|
{ (p, (liftg, lift[^(g)], liftf1 ¼fk,liftg, i, heree, ¯i,pute), q)\mid (p, liftg, q) Î d, |
| |
|
k ³ 0, fj:Q×Q ® Q (j Î [1,k]),g:Q×Q ® 2Q, i Î rki(S) } È |
| |
|
{ (p, (liftg, lift[^(g)], liftf1 ¼fk,liftg, i, here[^(g)]¢, labs, heref¢¢1 ¼f¢¢k¢¢, hereg¢¢,¯i, |
| |
|
put[^(g)]¢, putg¢, putq, putl), findstatesg¢) \mid (p, liftg, q) Î d, k ³ 0, |
| |
|
fj:Q×Q ® Q (j Î [1,k]),g:Q×Q ® 2Q, i Î rki(S), g¢ Î G,s Î S, |
| |
|
k¢¢ = rk(s),f¢¢j:Q×Q ® Q (j Î [1,k¢¢]), g¢¢:Q×Q ® 2Q, |
| |
|
g¢:Q×Q ® 2Q, g¢ = su(g¢¢,s,g¢,i,f¢¢1, ¼, f¢¢k¢¢) } È |
| |
|
{ (p, (root, liftg, lift[^(g)], liftf1 ¼fk,liftg, pute), q) \mid (p, liftg, q) Î d, |
| |
|
k ³ 0, fj:Q×Q ® Q (j Î [1,k]),g:Q ×Q ® 2Q } È |
| |
|
{ (p, hereg, q) \mid (p, hereg, q) Î d} È |
| |
|
{ (p, Øhereg, q) \mid (p, Øhereg, q) Î d} È |
| |
|
{ (p, (labs, heref1 ¼fk, hereg, here[^(g)]), q) \mid (p, put, p¢) Î d, (q¢, lift, q) Î d, |
| |
|
| |
|
g:Q×Q ® 2Q,fj:Q×Q ® Q (j Î [1,k]), |
| |
|
(d(s,1))Mp¢q¢g(f1(p¢, q¢), ¼, fk(p¢, q¢)) Î g(p¢, q¢) } |
|
| |
|
Here su(g, s, g, i, f1, ¼, fk) computes the next set of
successful states. For all states q1, q2 Î Q,
su(g, s, g, i, f1, ¼, fk)(q1,q2) is the set
of all states q such that Mq1 q2 g
is successful, assuming it reaches the ith child of the current node in
state q, and the jth (j ¹ i) child of the current node
in state fj(q1, q2), or
|
su(g, s, g, i, f1, ¼, fk)(q1,q2) |
|
|
{q Î QMq1 q2 g \mid (ds)Mq1 q2 g(f1(q1, q2), ¼, fi-1(q1, q2), q, |
| |
|
fi+1(q1, q2), ¼, fk(q1, q2)) Î g(q1, q2) } . |
|
| |
|
The ``subroutine'' to construct the marble from states is as follows:
dstates = Èg Î G dg with
|
|
|
{ (findstatesg, (ØhereQ, labs, liftl, i), (f,g)) \midi Î rki(S), rk(s) = 0, |
| |
|
f:(Q ×Q) ® Q with f(q1,q2) = (ds)Mq1 q2 g } È |
| |
|
{ (findstatesg, (liftq, labs), q) \midrk(s) = 0, q Î Q } |
| |
|
{ (findstatesg, (labs, ¯1, putl), findstatesg) \midrk(s) > 0 } È |
| |
|
{ ((fk,g), (labs, liftf1¼fk-1, putf1¼fk, ¯k+1, putl), findstatesg) \mid1 £ k < rk(s) } È |
| |
|
{ ((fk,g), (ØhereQ, labs, liftf1¼fk-1, i), (f,g)) \mid k = rk(s), i Î rki(S), |
| |
|
f(q1, q2) = (ds)Mq1 q2 g(f1(q1, q2), ¼,fk(q1, q2)) } È |
| |
|
{ ((fk,g), (liftq, labs, liftf1¼fk-1, putf1¼fk), q) \midq Î Q, k = rk(s) ³ 1} |
|
| |
|
Finally, to allow A¢ to accept the tree, A¢ has to remove its
auxiliary pebbles from the tree. Since at a moment that A would accept
the tree there are no marbles on it, A¢ only has to remove the
marbles e.
dfin = { (q, lifte, qfin) \mid q Î F } È{ (qfin, (i, lifte), qfin) \mid i Î rki(S) } |
|
\square
3.7 Tree-Walking Automata with MSO Tests
A completely different extension (i.e., not involving marbles or pebbles)
of tree-walking automata is to add the
capability to perform MSO tests (see [BE97], Section 1.4).
Let S be a
ranked alphabet. The set of directives for a tree-walking automaton with
MSO tests is
DTWA+M(S) = { i, ¯i \mid i Î rki(S) } ÈMSOL1(S) . |
|
As this is an infinite set, we have to choose a finite subset of DTWA+M(S)
for each tree-walking automaton with MSO tests. For each tree
t Î TS and each directive d Î DTWA+M(S) we define the following
binary relation on Vt:
|
|
|
{ (u,v) \mid (v,i,u) Î Et } |
| |
|
{ (u,v) \mid (u,i,v) Î Et } |
| |
|
{ (u,u) \mid (t,u) \models y(x) } |
|
| |
|
A tree-walking automaton with MSO tests over S is a finite
automaton A over a finite subset of DTWA+M(S). The definitions of
configurations, \twoheadrightarrowA,t, Rt(A), R(A) and L(A) are unchanged.
The definition of deterministic tree-walking automata with MSO tests is
also unchanged with respect to the definition of deterministic
tree-walking automata, although it is no longer possible to list all
pairs of mutually exclusive directives. However, it is decidable whether
a given pair of directives is mutually exclusive. Directives of the form
(q,s,q¢) with s Î DTWA+M(S)* are treated in the usual way. The class of
all tree languages L such that L = L(A) for some tree-walking automaton
with MSO tests A is named TWA+M. As stated in Section 1.4, in
[BE97] the following
is proven.
Proposition 21
The following three statements hold:
- TWA+M=REGT
- For each ranked alphabet S and each binary MSO formula
f(x,y) Î MSOL2(S), there exists a tree-walking automaton
with MSO tests A such that R(A) = R(f).
- For each ranked alphabet S and each tree-walking automaton
with MSO tests A over S, there exists a
binary MSO formula
f(x,y) Î MSOL2(S)
such that R(f) = R(A).
3.8 Equivalence of Binary MSO Formulas and
Tree-Walking Marble/Pebble Automata
In this section we will show that binary MSO formulas,
i.e., formulas f(x,y) Î MSOL2(S) for some ranked alphabet
S, define the same node relations as those computed by
tree-walking marble/pebble automata.
First we will show that for each binary formula f(x,y) Î MSOL2(S)
there exists a tree-walking marble/pebble automaton A over (S, G)
(for some alphabet G) such that they compute the same relation, that is,
R(f) = R(A).
We will make use of the second part of Proposition 21
to construct our
tree-walking marble/pebble automaton, similar to the alternate proof
of Theorem 7 (Section 1.6).
Note that the original proof of Theorem 7
is not valid for trees, because, if this proof were translated to trees, the pebble
and marbles would not be used nested.
We need the following lemma. This lemma states that it is possible to simulate
a unary MSO formula with a tree-walking marble/pebble automaton
that makes a round-trip, checking the absence of all marbles and pebbles before entering
a final state.
Lemma 22
Let S be a ranked alphabet. For each unary MSO formula
y(x) Î MSOL1(S) there exists a tree-walking marble/pebble automaton
A = (Q,DTWMPA(S,G),d,I,F) over (S, G),
for some alphabet G, such that, for all t Î TS,
Rt(A) = { (u,u) \mid u Î Vt, (t,u) \models y(x) }
and, if (qin,u,0,p0) \twoheadrightarrowA,t* (qfin,u¢,k,p) for
some u,u¢ Î Vt, qin Î I and qfin Î F, then k = 0 and p = p0.
Proof.
Let S be a ranked alphabet and let y(x) Î MSOL1(S) be
a unary MSO formula. According to Corollary 3, there
exists a closed MSO formula y¢ Î MSOL0(SÈ(S×B1)) such that,
for all t Î TS and u Î Vt,
(t,u) \models y(x) iff mark(t,u) \models y¢.
Since MSOT=REGT (Proposition 15) and
TWMA=REGT (Theorem 19),
there exists a tree-walking marble automaton
A¢ = (Q¢,DTWMA(SÈ(S×B1),G),d¢,I¢,F¢) |
|
over (SÈ(S×B1), G) (for some alphabet G)
such that L(A¢) = L(y¢). This implies that for all
t Î TS, u Î Vt and t¢ = mark(t,u),
t¢\models y¢ iff (qin,roott¢,m0) \twoheadrightarrowA¢,t¢* (qfin,v,m0) for some qin Î I¢, qfin Î F¢ and v Î Vt . |
|
We now define the tree-walking marble/pebble automaton A = (Q,DTWMPA(S,G),d,I,F).
This automaton first drops its pebble in order to remember the start of the walk.
Then it walks to the root of the tree and starts simulating A¢. It uses the pebble
to determine when to interpret a node label s Î S as
s (when there is no pebble at the current node) and
when to interpret s as (s,1) (when there is a pebble at the current node).
When it reaches a final state, it walks to the root of the tree, checking on the
way that there are no marbles (this can be done since the marbles always lie on the
path from the root to the current node), so that A¢ really accepts the
marked tree. Then A searches for the pebble, picks it up and finishes.
|
|
|
Q¢È{ qin, toroot, ready, findpebble, qfin} |
| |
|
| |
|
| |
|
{ (qin,put,toroot) } È{ (toroot, i, toroot) \mid i Î rki(S) } È |
| |
|
{ (toroot, root, q) \mid q Î I¢} È |
| |
|
{ (q,d,q¢) \mid (q,d,q¢) Î d¢, d ¹ labs for all s Î S } È |
| |
|
{ (q,(Øhere,labs),q¢) \mid (q,labs,q¢) Î d¢} È |
| |
|
{ (q,(here,labs),q¢) \mid(q,lab(s,1),q¢) Î d¢} È |
| |
|
{ (q, ØhereG, ready) \mid q Î F¢} È |
| |
|
{ (ready, (i, ØhereG), ready) \mid i Î rki(S) } È |
| |
|
{ (ready, (root, ØhereG), findpebble) } È |
| |
|
{ (findpebble, (Øhere, ¯i), findpebble) \mid i Î rki(S) } È |
| |
|
{ (findpebble, (here, lift), qfin) } |
|
| |
|
Note that the construction of A ensures that the marbles and pebble are used nested,
because first the pebble is put down, then the marbles are used in the same way as
A¢ uses them (i.e., nested), then A makes sure there are no marbles left on
the tree and finally A picks up the pebble.
From the construction of A it follows that, for all
t Î TS, u Î Vt and t¢ = mark(t,u),
|
(q,roott¢,m0) \twoheadrightarrowA¢,t¢* (q¢,v,m0) for some q Î I¢, q¢ Î F¢ and v Î Vt iff |
| (qin,u,0,p0) \twoheadrightarrowA,t* (qfin,u,0,p0). |
|
| |
|
This, and the details of the construction of A, implies
Rt(A) = { (u,u) \mid u Î Vt, (t,u) \models y(x) }
and the other details of the lemma.
\square
Lemma 23
Let S be a ranked alphabet and let f(x,y) Î MSOL2(S) be a binary
MSO formula. Then there exists a tree-walking marble/pebble automaton A such that
R(A) = R(f).
Proof.
According to Proposition 21, there exists a tree-walking automaton
with MSO tests A¢ = (Q¢,D, d¢,I¢,F¢) such that
R(A¢) = R(f).
We define a tree-walking marble/pebble automaton
A over (S,G) that simulates A¢. The automaton
A can directly simulate the directives
{i, ¯i \mid i Î rki(S)}. The directives
y(x) Î MSOL1(S) are handled by constructing a
tree-walking marble/pebble automaton
according to Lemma 22 and using this automaton as a
``subroutine'' in A.
We define T = {(q,d,q¢) Î d¢\mid d Î MSOL1(S)},
the set of all transitions with MSO tests in A¢.
For each transition t = (q,y(x),q¢) Î T, we use Lemma 22
to construct the tree-walking marble/pebble automaton At
over (S, Gt) that simulates
t. We will assume that the states of any automaton At do not
overlap with the states of any other At or with the states of A¢.
We also assume that the marble alphabets Gt are disjoint,
or Gt ÇGt¢ = Æ for all
t,t¢ Î T, t ¹ t¢. The marble alphabet of
A is G = Èt Î T Gt.
We define A = (Q,DTWMPA(S,G),d,I,F) as follows:
|
|
|
| |
|
| |
|
| |
|
{ t Î d¢\mid t\not Î T } È |
| |
|
{ (q, (put,lift), q¢), (q¢¢, (put,lift), q¢¢¢) \midq¢ Î IAt, q¢¢ Î FAt,t = (q,y(x),q¢¢¢) Î T } È |
| |
|
|
| |
|
Here the (double) directive (put,lift) acts as a ``nop'' directive.
It can easily be seen that A complies with the demands of nesting and that
R(A) = R(A¢).
\square
We now show the other way around.
Lemma 24
Let S be a ranked alphabet and let A be a tree-walking marble/pebble
automaton over (S,G).
Then there exists a binary MSO formula f(x,y) Î MSOL2(S)
such that R(f) = R(A).
Proof.
From the automaton A we first construct another tree-walking marble/pebble
automaton A¢ over (SÈ(S×B2), G) such that, for all t Î TS
and u,v Î Vt,
mark(t,u,v) Î L(A¢) iff (t,u,v) Î R(A) . |
|
A¢ first finds the node with label (s,1,b), simulates A and accepts the
tree if A has a final state in a node with label (s,b,1).
Because TWMPA=REGT (Theorem 20)
and REGT=MSOT (Proposition 15),
there exists a MSO formula y Î MSOL0(SÈ(S×B2))
such that, for all
t¢ Î TSÈ(S×B2),
t¢ Î L(y) iff t¢ Î L(A¢) . |
|
Now, using Corollary 3, we obtain a binary MSO formula
f(x,y) Î MSOL2(S) such that, for all t Î TS
and u,v Î Vt,
(t,u,v) Î R(f) iff mark(t,u,v) Î L(y) |
|
and we have, for all t Î TS
and u,v Î Vt,
(t,u,v) Î R(f) iff (t,u,v) Î R(A) . |
|
\square
These two lemmas together prove the grand finale of this chapter.
Theorem 25
The following two statements hold:
- For every MSO formula f(x,y) Î MSOL2(S) there exists a tree-walking
marble/pebble automaton A over (S, G), for some alphabet G,
such that R(A) = R(f).
- For every tree-walking marble/pebble automaton A over (S, G)
there exists an MSO formula f(x,y) Î MSOL2(S)
such that R(f) = R(A).
This theorem states that binary MSO formulas define the same node relations
that tree-walking marble/pebble automata compute. Since the proof of this theorem,
and all underlying theorems and lemmas, is constructive, we can actually construct
an operational automaton with only local operations to compute the node relation that
is defined by a given, descriptive, binary MSO formula, and vice versa.
Conclusion and Recommendations
Conclusion
In this paper we have described a number of string-walking and tree-walking automata
and we have proved some of their properties, especially with regard to the binary
node relations these automata compute. This study has been done in order to find an
answer to the following question:
Is it possible to define a type of tree-walking automaton,
with only local operations,
that computes the same binary node relations that binary MSO formulas
recognize?
In Chapter 1 we have investigated the possibilities to define
a string-walking automaton that computes the same binary relations on the positions
of strings as binary MSO formulas. It was found that string-walking pebble automata
have this property. It is even possible, for every binary MSO formula, to
construct a string-walking pebble automaton that computes the same binary relation.
The reverse is also true: for each string-walking pebble automaton, it is possible
to construct a binary MSO formula that defines the same binary relation. It was also
shown that the pebble is necessary. There is a binary MSO formula for which there
is no string-walking automaton (without pebble) that computes the same binary
relation.
Since the pebble operations are local (the pebble can only be handled on the
current position of the automaton), this answers our central question for strings.
To answer our central question for trees, we have defined in Chapter 3,
among others, tree-walking pebble/marble automata.
It has been shown that these automata compute
exactly the binary node relations that can be defined by binary MSO formulas.
Like with string-walking pebble automata, the appropriate automaton for each
binary MSO formula can be constructed, and vice versa. Tree-walking
marble/pebble automata have also only local operations. All pebble-handling
and marble-handling occurs only on the current node. A note must be made that
it may seem a bit complicated to have both marbles and a pebble available
to the automaton. We have made no study to check whether it is possible to leave
out, e.g., the pebble and still compute the same binary node relations.
Also one might argue that the tree-walking marble/pebble automaton is not
strictly local, since the automaton ``remembers'' in some way that it has dropped
the pebble, further down the tree,
when it is about to pick up the marble that was put down last before the
pebble.
Recommendations
There are a number of questions that have not been answered in this paper,
but may be interesting for future study.
- Only binary node relations have been considered. Maybe there is also
an operational way to compute the k-ary node relations that are defined
by k-ary MSO formulas.
- The concept of string-walking marble automata (Section 1.4)
may have to be adapted to make them recognize exactly the regular languages.
- Do tree-walking automata recognize exactly the regular tree languages?
- It has not been checked that tree-walking marble/pebble automata need the
pebble to compute the relations defined by binary MSO formulas.
- Tree-walking multi-pebble automata (in analogy with string-walking
multi-pebble automata) have not been studied.
It is not known which
tree languages they describe, and whether the number of pebbles induces a
hierarchy on the recognized languages. It is known that tree-walking
multi-pebble automata recognize at most the regular tree languages,
or TWnPA Í REGT.
- It should be checked whether deterministic string-walking pebble
automata recognize exactly the functional relations that are
defined by binary MSO formulas. Also for deterministic
tree-walking marble/pebble automata and (functional) binary MSO formulas.
Bibliography
- [BE97]
-
R. Bloem and J. Engelfriet.
Characterization of properties and relations defined in monadic
second order logic on the nodes of trees.
Technical Report 97-04, Leiden University, 1997.
- [BH65]
-
M. Blum and C. Hewitt.
Automata on a 2-dimensional tape.
In Proc. 8th IEEE Symp. on Switching and Automata Theory, pages
155-160, 1965.
- [Büc60]
-
J. Büchi.
Weak second-order arithmetic and finite automata.
Z. Math. Logik Grundl. Math., 6:66-92, 1960.
- [Don70]
-
J. Doner.
Tree acceptors and some of their applications.
J. of Comp. Syst. Sci., 4:406-451, 1970.
- [GS84]
-
F. Gécseg and M. Steinby.
Tree Automata.
Akadémiai Kiadó, Budapest, 1984.
- [HU79]
-
J.E. Hopcroft and J.D. Ullman.
Introduction to Automata Theory, Lamnguages, and Computation.
Addison-Wesley, Reading, Mass., 1979.
- [KS81]
-
T. Kamimura and G. Slutzki.
Parallel and two-way automata on directed ordered acyclic graphs.
Inform. Contr., 49:10-51, 1981.
- [RHE91]
-
G. Rozenberg, H.J. Hoogeboom, and J. Engelfriet.
Formele talen en automaten 1.
Afdeling Wiskunde en Informatica, Rijksuniversiteit Leiden, 1991.
In Dutch.
- [TW68]
-
J.W. Thatcher and J.B. Wright.
Generalized finite automata theory with an application to a decision
problem of second-order logic.
Math. Systems Theory, 2:57-81, 1968.
File translated from TEX by TTH, version 2.32.
On 9 Jul 1999, 16:00.
Back to publications index