For the past years I have been working on WebDSL, a domain-specific language for building web applications. We build the compiler for this language using the Stratego/XT toolset. Stratego is a program transformation language, i.e. it is a language that is used to transform one program into another, e.g. a WebDSL program into a Java program. So, how does that work?
The first step in program transformation is parsing. A textual representation of the program (the source file) is parsed into a tree representation (called the AST -- Abstract Syntax Tree). We write a syntax definition of our language in a language, in our case SDF, from which a parser is generated that can parse our language and turn it into an AST. For instance, the AST representation of the following simple program:
n = 3 * 4
Assign(Var("n"), Multiply(Num("3"), Num("4")))
Stratego can perform transformations on this type of AST. Let us consider a simple optimization transformation where the multiplication of two constant values (in this case numbers) are precomputed so that they do not have to be computed everytime this statement is evaluated. So this transformation would transform:
Assign(Var("n"), Multiply(Num("3"), Num("4")))
In Stratego a transformation rule to accomplish this would look something like this:
Multiply(Num(x), Num(y)) -> Num(z)
where z := <mulS> (x, y)
What this defines is a rule called
simplify that matches an AST node of the form
Multiply(Num(<insert something here and bind it to x>), Num(<insert something here and bind it to y>)) and rewrites that to
Num(<insert z here>). In the
where clause of the rule, the actual value of this computation is computed and assigned to z, which is then inserted into the Num AST node (
mulS is a utility rule that can compute the multiplication of two integers represented as strings, but don't worry too much about that).
Ok, now let's consider the following program:
n = 3 * 5 * 2
When this is parsed, we might end up with the following AST:
Assign(Var("n"), Multiply(Multiply(Num("3"), Num("5"),
Of course, this could be simplified to simply
Assign(Var("n"), Num("30")), right? But we somehow have to tell Stratego to apply this simplify rule a number of times. One way would be to traversing the tree and try to apply the rule somewhere in the tree until it can no longer be applied anywhere. Let's see how that would look.
We start with our initial AST:
We then start to look for a place to apply the
simplify rule. We decide to start looking in the leaf nodes, these are all string nodes ("n", "3", "5" and "2"), our
simplify rule does not apply there so we move upwards. Then we get to the
Num("2") nodes, still no match. Then we find something more interesting:
Multiply(Num("3"), Num("5")), which matches our simplify rule, so it is rewritten to simply
Num("15"). Our AST now looks like this:
Assign(Var("n"), Multiply(Num("15"), Num("2"))
When we move up more we find the new
Multiply(Num("15"), Num("2")) node which, again, matches with the
simplify rule, rewriting it to
Num("20"). So our resulting AST is now:
And we're done because the
simplify rule no longer matches anywhere. The final step after program transformation is called "pretty-printing", sometimes referred to as "unparsing", i.e. doing the opposite of parsing where an AST is converted back to normal source code, resulting in:
n = 30
We have now optimized a simple program.
In Stratego rules are never applied automatically, you always have to tell it exactly what to do and in what order. This is done using Strategies (hence the name Stratego). A general strategy for a program transformation could look as follows (note, this is simplified):
What this will do is apply these sub-strategies in order. First
parse-file, if that succeeds,
transform and then
pretty-print. We won't talk about the parsing and pretty-printing, let's focus on the
transform strategy. The transform strategy has to apply the
simplify rule we defined before to our AST for as long as it can be applied, for this Stratego offers a higher-order strategy called
innermost. Higher-order strategies take one or more strategies as arguments, in this case we pass the simplify strategy as an argument to the innermost strategy:
What it will do is traverse the tree, try to apply the
simplify rule wherever it can, when the rule no longer applies anywhere, then it terminates and returns the result of the transformation. Programming with strategies is what we call strategic programming.
Now you may think, doesn't writing ASTs all the time get really annoying? And indeed, it does. That's why, with a little bit of work, you can mix in the syntax of programs you are transforming with the Stratego syntax. Typically these pieces of program syntax are enclosed in
]| quotations. The simplify rule could then be rewritten to the more natural:
|[ n1 * n2 ]| -> |[ n3 ]|
where n3 := <mulS> (n1, n2)
Where somewhere is defined that the nx notation will only match numbers.
Stratego is a very nice language for building compilers, if you want to learn more, you can visit the strategoxt.org website.
Sometime during the next few days I will talk about an experiment I conducted last weekend where I attempted to do Strategic programming in Clojure.