**Branch :**Computer Science and Engineering

**Subject :**Compiler design

**Unit :**Syntax-directed Translation

## General Tree Matching

**Introduction**: The LR-parsing approach to pattern matching based on prefix representations favors the left operand of a binary operator. In a prefix representation op E\ E2, the limited-lookahead LR parsing decisions must be made on the basis of some prefix of Ei, since E\ can be arbitrarily long. Thus, pattern matching can miss nuances of the target-instruction set that are due to right operands. Instead prefix representation; we could use a postfix representation. But, then an LR-parsing approach to pattern matching would favor the right operand.

For a hand-written code generator, we can use tree templates, as in Fig. 8.20, as a guide and write an ad-hoc matcher. For example, if the root of the input tree is labeled ind, then the only pattern that could match is for rule (5); otherwise, if the root is labeled , then the patterns that could match are for rules (6-8).

For a code-generator generator, we need a general tree-matching algorithm. An efficient top-down algorithm can be developed by extending the string pattern- matching techniques of Chapter 3. The idea is to represent each template as a set of strings, where a string corresponds to a path from the root to a leaf in the template. We treat all operands equally by including the position number of a child, from left to right, in the strings.

**Example:** In building the set of strings for an instruction set, we shall drop the subscripts, since pattern matching is based on the attributes alone, not on their values.

The templates in Fig. 8.22 have the following set of strings from the root to a leaf:

C

1 R

2 ind 1 1 C

2 ind 1 2 R

2 R

The string C represents the template with C at the root. The string 1R represents the and its left operand R in the two templates that have at the root.

Using sets of strings as in Example 8.22, a tree-pattern matcher can be constructed by using techniques for efficiently matching multiple strings in parallel. In practice, the tree-rewriting process can be implemented by running the tree-pattern matcher during a depth-first traversal of the input tree and performing the reductions as the nodes are visited for the last time.

Instruction costs can be taken into account by associating with each tree rewriting rule the cost of the sequence of machine instructions generated if that rule is applied. In Section 8.11, we discuss a dynamic programming algorithm that can be used in conjunction with tree-pattern matching.

By running the dynamic programming algorithm concurrently, we can select an optimal sequence of matches using the cost information associated with each rule. We may need to defer deciding upon a match until the cost of all alternatives is known. Using this approach, a small, efficient code generator can be constructed quickly from a tree-rewriting scheme. Moreover, the dynamic programming algorithm frees the code-generator designer from having to resolve conflicting matches or decide upon an order for the evaluation.