Strategic Programming in Stratego/XT

For the past years I have been working on WebDSL, a domain-specific language for building web applications. We build the compiler for this language using the Stratego/XT toolset. Stratego is a program transformation language, i.e. it is a language that is used to transform one program into another, e.g. a WebDSL program into a Java program. So, how does that work?

The first step in program transformation is parsing. A textual representation of the program (the source file) is parsed into a tree representation (called the AST -- Abstract Syntax Tree). We write a syntax definition of our language in a language, in our case SDF, from which a parser is generated that can parse our language and turn it into an AST. For instance, the AST representation of the following simple program:

n = 3 * 4

could be

Assign(Var("n"), Multiply(Num("3"), Num("4")))

Stratego can perform transformations on this type of AST. Let us consider a simple optimization transformation where the multiplication of two constant values (in this case numbers) are precomputed so that they do not have to be computed everytime this statement is evaluated. So this transformation would transform:

Assign(Var("n"), Multiply(Num("3"), Num("4")))

Into

Assign(Var("n"), Num("12"))

In Stratego a transformation rule to accomplish this would look something like this:

simplify :
  Multiply(Num(x), Num(y)) -> Num(z)
  where z := <mulS> (x, y)

What this defines is a rule called simplify that matches an AST node of the form Multiply(Num(<insert something here and bind it to x>), Num(<insert something here and bind it to y>)) and rewrites that to Num(<insert z here>). In the where clause of the rule, the actual value of this computation is computed and assigned to z, which is then inserted into the Num AST node (mulS is a utility rule that can compute the multiplication of two integers represented as strings, but don't worry too much about that).

Ok, now let's consider the following program:

n = 3 * 5 * 2

When this is parsed, we might end up with the following AST:

Assign(Var("n"), Multiply(Multiply(Num("3"), Num("5"),
Num("2"))

Of course, this could be simplified to simply Assign(Var("n"), Num("30")), right? But we somehow have to tell Stratego to apply this simplify rule a number of times. One way would be to traversing the tree and try to apply the rule somewhere in the tree until it can no longer be applied anywhere. Let's see how that would look.

We start with our initial AST:

Assign(Var("n"), Multiply(Multiply(Num("3"),
Num("5")), Num("2"))

We then start to look for a place to apply the simplify rule. We decide to start looking in the leaf nodes, these are all string nodes ("n", "3", "5" and "2"), our simplify rule does not apply there so we move upwards. Then we get to the Var("n"), Num("3"), Num("5") and Num("2") nodes, still no match. Then we find something more interesting: Multiply(Num("3"), Num("5")), which matches our simplify rule, so it is rewritten to simply Num("15"). Our AST now looks like this:

Assign(Var("n"), Multiply(Num("15"), Num("2"))

When we move up more we find the new Multiply(Num("15"), Num("2")) node which, again, matches with the simplify rule, rewriting it to Num("20"). So our resulting AST is now:

Assign(Var("n"), Num("30"))

And we're done because the simplify rule no longer matches anywhere. The final step after program transformation is called "pretty-printing", sometimes referred to as "unparsing", i.e. doing the opposite of parsing where an AST is converted back to normal source code, resulting in:

n = 30

We have now optimized a simple program.

In Stratego rules are never applied automatically, you always have to tell it exactly what to do and in what order. This is done using Strategies (hence the name Stratego). A general strategy for a program transformation could look as follows (note, this is simplified):

main =
  parse-file;
  transform;
  pretty-print

What this will do is apply these sub-strategies in order. First parse-file, if that succeeds, transform and then pretty-print. We won't talk about the parsing and pretty-printing, let's focus on the transform strategy. The transform strategy has to apply the simplify rule we defined before to our AST for as long as it can be applied, for this Stratego offers a higher-order strategy called innermost. Higher-order strategies take one or more strategies as arguments, in this case we pass the simplify strategy as an argument to the innermost strategy:

transform =
  innermost(simplify)

What it will do is traverse the tree, try to apply the simplify rule wherever it can, when the rule no longer applies anywhere, then it terminates and returns the result of the transformation. Programming with strategies is what we call strategic programming.

Now you may think, doesn't writing ASTs all the time get really annoying? And indeed, it does. That's why, with a little bit of work, you can mix in the syntax of programs you are transforming with the Stratego syntax. Typically these pieces of program syntax are enclosed in |[ and ]| quotations. The simplify rule could then be rewritten to the more natural:

simplify :
  |[ n1 * n2 ]| -> |[ n3 ]|
  where n3 := <mulS> (n1, n2)

Where somewhere is defined that the nx notation will only match numbers.

Stratego is a very nice language for building compilers, if you want to learn more, you can visit the strategoxt.org website.

Sometime during the next few days I will talk about an experiment I conducted last weekend where I attempted to do Strategic programming in Clojure.