All posts tagged dsl

  • Internal DSL Implementation Techniques

    Two weeks ago I gave a lecture about internal DSLs and techniques to implement them. Here are the slides; sadly, I don’t have a video/audio recording of this one:

  • Code Generation and Vendor Lock-In

    When you build a code generator you have two basic options:

    1. Generate code to be read, complemented and possibly modified by humans
    2. Generate code purely as a convenient intermediate step toward bytecode/machine code compilation

    The first approach seems to be the most common. It is the most pragmatic option. “Hey, I keep writing the same code over and over, can’t I simply generate part of it and make minor adjustments by hand?” Yes you can. But then you end up with a maintenance issue: it turns out that the code you generated initially was not quite right, and now what do you do, regenerate the code and lose all the modifications you made? As the kids say these days: FAIL!

    An improved version of this naive approach is the generation gap pattern. The idea here is to generate abstract classes, which you extend in custom code, overriding only the parts that you need to override. The result is that you keep generated and manually written code separate, which is a good thing, because you can make changes to the code generator and simply regenerate code without your changes being lost. Usually. Not always, because if you make invasive changes to your code generator, you may generate completely different code altogether: different classes, different methods and so on. Although you do not lose your manually written code, it no longer has any apparent relationship to the code that is generated, and it needs to be rewritten to fit the new style of generated code. Again: FAIL.
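
    A minimal sketch of the generation gap pattern in Javascript. The class names TaskBase and Task are hypothetical and only serve as illustration; a real generator would emit the base class into its own file and overwrite it on every run:

    ```javascript
    // Hypothetical generator output: regenerated on every build, never edited by hand.
    class TaskBase {
      constructor(name) {
        this.name = name;
        this.done = false;
      }

      // Default behavior that hand-written code may override.
      describe() {
        return this.name + (this.done ? " (done)" : "");
      }
    }

    // Hand-written code lives in a separate class that extends the generated one,
    // so regenerating TaskBase does not overwrite it.
    class Task extends TaskBase {
      describe() {
        return "Task: " + super.describe();
      }
    }

    console.log(new Task("write post").describe()); // Task: write post
    ```

    The weak spot described above is visible here too: if the generator stops emitting TaskBase or renames describe, the hand-written subclass survives but no longer fits.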

    These problems led us, in the MoDSE research project (pronounced “modes”), to choose approach 2: generate code only as an intermediate step. This also means we have to do 100% code generation; we hardly ever mix custom and generated code. In the rare edge cases where we have to, we do so only through well-defined, fixed interfaces. One code generator we built using this approach is WebDSL. After you invoke the compiler on your WebDSL program you do not look at the generated code.

    Now, let’s say you started your own software company and you got your first big customer. Congratulations. You’re going to build the website of a large international corporation. And because you want to be productive and cool ‘n stuff you’re using code generation techniques. Since you were so impressed with the arguments you just read against mixing custom and generated code, you decide to generate 100% of your code, and therefore no longer have to focus on extension and modification techniques. Thou shalt not read generated code. Good for you! You may even choose to use WebDSL. Even better.

    But what about your customer? What if you deliver your product? Either you deliver a perfect product that is done and will never have to be changed again — good luck with that. Or, as part of your delivery, you deliver the source code. What source code? Well, not the generated code, because it’s essentially worthless as it’s not intended for human consumption (and in the case of WebDSL, believe me, it’s not). So you deliver the model that was the input of your code generator (e.g. the WebDSL source code). Fantastic. However, your customer is worried. In the future they may need developers familiar with the input language of your code generator to continue work on the product. Where are they going to find such developers? Well, in your company. That’s great for you, but not great for your customer, because you essentially locked them in.

    This problem is not specific to programs written in domain-specific languages like WebDSL; it’s true for other languages and even frameworks too. Yahoo rewrote its web store application, after buying it, from Lisp to C++ and Perl, because Yahoo engineers were not familiar enough with Lisp. Java web applications written using obscure Java frameworks have similar problems, as did Ruby on Rails when it just got started.

    If you produce software for a customer using languages and frameworks that very few other developers “speak”, you’re locking your customer in.

    So what’s the solution?

    Last year, Jos Warmer of Mod4J gave a talk as part of our model-driven software development course. Mod4J is a set of DSLs for developing administrative enterprise applications in Java. An interesting twist is that they aim to generate code in the same style in which a developer would have written it by hand. Consequently, when you deliver to your customer they do not need Mod4J developers to continue development. They may not even care that you used Mod4J to develop the product and simply continue maintaining the generated code. Of course this is not the ideal case, but it’s a fallback option that makes customers feel safer. I have no experience with Mod4J and do not know if it really works that way, but I like the idea. But is it always feasible to do this?

    I wanted to try to take this approach for mobl, my DSL for mobile web applications, but it did not work out well. It turns out I missed an essential requirement for this approach to work.

    My plan was to first develop a set of frameworks that the generated code would use. As this is a fairly new domain, hardly any of these frameworks exist. The first library I developed was persistence.js, which is an ORM library for client-side SQLite databases in Javascript. A second framework, which I named mobiworks, provides a set of jQuery plug-ins that provide HTML-encodings of mobl concepts, such as screens and templates.

    But then I took a step back. To make this approach successful, what’s the plan of action?

    1. Develop a framework that developers would find useful and usable even without mobl.
    2. Promote that framework by itself, building a community around it, making sure that this was the way to build mobile web applications. Solving the “nobody knows this framework” problem.
    3. Build a nice DSL wrapper around the framework.
    4. Promote the DSL, build a community around it.

    Yeah, you’ll agree that this was a rather pointless mission to begin with. Why not simply build a community around mobl immediately, making that the best way to build mobile web applications? Then I could drop the whole framework idea altogether.

    What I did not realize earlier is that to make the Mod4J “generate human code” approach work there already has to be an established style that developers write their code in. There needs to be an established framework you can target. If there isn’t one, the approach is pointless.

    So, I shifted gears again, rewriting stuff to generate efficient computer-readable Javascript code — Javascript as the assembly code of the mobile web. And it’s much simpler that way. The lock-in problem remains though, we’ll see if that is actually going to be a problem. Incidentally, if you’re a persistence.js user or consider using it: don’t worry, I do still use it for mobl and will keep working on it.

  • Spoofax Talk

    For the past month or two I’ve been working on mobl, a DSL for the mobile domain. It is the first larger DSL to be developed completely using Spoofax/IMP (site is down at the moment, due to a power outage at the TU Delft). Spoofax is our new tool for developing domain-specific languages complete with Eclipse plug-ins. It’s based on SDF and Stratego and the Eclipse and Java integration makes it a much more user-friendly and nicer experience, both for the language’s users and their developers.

    Recently, Eelco Visser (my “boss”) gave a talk at the IFIP WG 2.11 meeting in St Andrews about Spoofax. He just posted a video of his talk in which he demonstrates a simplified version of WebDSL (called NWL) and its implementation. May be interesting to those who prefer watching a video over reading a manual or paper.

    It’s still too early to discuss mobl in detail, but here’s a teaser screenshot showing a snippet of the language (at the left), and the Javascript that it is compiled to at the right (which updates whenever the source file is saved, similar to Eclipse’s Java compilation behavior):

  • Using Screen Estate

    One of the more interesting UI features that the iPad exposes is its use of screen orientation. A good example of this is in the e-mail application. In Portrait mode your screen is long and narrow, so you see one e-mail. This is convenient for reading longer pieces of text:

    However, when you’re more in a browsing mode "let’s see what e-mail we got today", you flip the device around and get a view with a message list on the left and e-mail at the right:

    I find this a rather fascinating user interface idea and wonder if we cannot apply it to more iPhone applications as well. There’s one iPhone application that I know of that really uses this, and that’s the Calculator application, which in portrait mode looks like this:

     

    And when you flip it:

    In the framework for the development of mobile web applications that I’m developing, I have support for orientation events. Here’s a simple todo application I’m working on (looks best on an iPhone or desktop webkit browser) that takes advantage of this feature. I’m not entirely sure this is the best way to use the feature, but it’s cool nonetheless. In portrait mode the application looks as follows:

    You can swipe any of the items to show a delete button:

    However, if you’re in a destructive mode you can also flip the device around and switch to edit mode, in which all items become immediately deletable:

    (You can emulate this behavior in a desktop browser by resizing the window, making it wider than it is long or vice versa).
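
    The resize trick can be sketched as follows. This is not the framework’s actual API: orientationOf is a hypothetical helper, and setting a class name on the body is just one assumed way to let CSS switch between the normal and the edit view:

    ```javascript
    // Hypothetical helper: classifies a viewport by comparing its width and height.
    function orientationOf(width, height) {
      return width > height ? "landscape" : "portrait";
    }

    // In a browser, re-check the orientation whenever the window is resized.
    if (typeof window !== "undefined") {
      window.addEventListener("resize", function () {
        // Assumption: the stylesheet shows delete buttons when the body is "landscape".
        document.body.className = orientationOf(window.innerWidth, window.innerHeight);
      });
    }
    ```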

    A user interface design concept to think about.

     

  • Task Switching and Open Development on the Apple iPad

    In case you missed it, Apple launched the iPad yesterday. Essentially it’s a beautiful looking giant iPod Touch running the iPhone/iPod OS, slightly adapted to take better advantage of the bigger 10" screen. It’s available at a remarkably low price (for Apple, and for the hardware you get), starting at $499. Not only does it look like a bigger iPod and run its software, it also comes with the usual suspects: an App Store, synchronization through iTunes etc.

    With that come the usual restrictions: only install software through the App Store and no multi-tasking. Although I truly believe the no multi-tasking support limitation is going to be resolved in June, when Apple usually releases new iPhones and iPod touches (and an operating system to go with that), there is a way around both these problems:

    The iPad, like the iPhone and iPod touch, comes with an excellent, fast browser.

    If you build iPad web applications, you can build whatever you like, it doesn’t have to be approved by Apple and you can roll out updates instantly. In addition, web applications are, as on the iPhone/iPod touch, the only type of application that "keeps running" (in the sense of not being killed) when switching applications. You can have several web apps open in Safari, and they’re still there, in the same state, after you exit Mobile Safari and return to it. Although the applications do not run simultaneously, you can easily switch between them using Safari’s ability to switch between "tabs", which acts as a kind of task switcher:

    And yes, indeed. This is exactly the direction in which I’m heading with my DSL for mobile applications (which may actually include the iPad as a target).

    By the way, I’m trying to come up with a new name for the DSL, now that MobiDSL is apparently taken. Name suggestions? It does not have to include the word DSL at all, preferably not. I’m thinking about something with "Touch" in it, or possibly something completely different. Any ideas?

  • On Asynchronous Programming

    MSDN:

    Asynchronous operations are typically used to perform tasks that might take a long time to complete, such as opening large files, connecting to remote computers, or querying a database. An asynchronous operation executes in a thread separate from the main application thread. When an application calls methods to perform an operation asynchronously, the application can continue executing while the asynchronous method performs its task.

    Asynchronous programming clearly has performance benefits, as mentioned in the explanation I just quoted. What I have complained about before is the programming model that follows from it. In Javascript this becomes painfully clear. In Javascript you do not have threading (although it’s coming); therefore, anything that is not going to be instantaneous needs to be executed asynchronously or it will freeze the browser. The typical example of this is Ajax (Asynchronous Javascript and XML, a term that is way too general, when you think of it), but you get the same thing when you start interacting with local databases. In a previous post I showed the following code:

    What I attempt to do in this code is implement simple sequential execution of:

    1. opening a database connection
    2. starting a transaction
    3. creating a table
    4. inserting a task
    5. inserting another task
    6. selecting all tasks in the table
    7. rendering list items for them

    In a synchronous programming model this would roughly be:

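    // Hypothetical synchronous API: every call blocks until the database is done.
    // The "..." placeholders stand for arguments and SQL elided in this post.
    var db = openDatabase(...);
    db.startTransaction();
    db.executeSql("CREATE ...");
    db.executeSql("INSERT INTO ...");
    db.executeSql("INSERT INTO ...");
    var results = db.executeSql("SELECT * FROM ...");
    // Render one list item per task returned by the SELECT.
    for (var i = 0; i < results.length; i++) {
      ...
    }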

    I think you will agree that this is much more legible than the asynchronous counterpart, except that you should never run it like this in Javascript, because it will completely block the UI thread for the duration of the code (which could be several seconds).

    I looked at the Google Web Toolkit because I hoped they had solved this problem, but it turns out they even made it worse, because now you have to deal with the Java overhead of implementing interfaces:

    (Nothing bad about GWT by the way, using it makes a lot of sense in many larger projects.)

    Getting back to my point: would it be possible to automatically rewrite synchronous code to asynchronous code? I’m starting to feel that’s very possible indeed. Let’s take a simple example. Let’s say we managed to implement an adder function that is extremely slow (maybe by executing it on a server far, far away):

    console.log(adder(3, 4));

    This code would block for several seconds, possibly, while the adder function is adding the numbers 3 and 4. In an asynchronous manner this would be written as:

    adder(3, 4, function(result) {
        console.log(result);
    });

    Asynchronous functions take a callback function as an extra argument, which is passed the result of the computation once it completes. This result is what we would call the return value in synchronous programming. So, what if we sequence a number of synchronous statements:

    var a = 0;
    a++;
    var n = adder(3, a);
    var m = adder(n, 4);
    console.log(m);

    What would that look like in asynchronous code?

    var a = 0;
    a++;
    adder(3, a, function(n) {
      adder(n, 4, function(m) {
        console.log(m);
      });
    });

    As you can see, this transformation is not complicated. It could easily be automated, so that in the DSL that we’re designing you could write synchronous code, and it would be translated to asynchronous code automatically.
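
    To try the nested version, you need an asynchronous adder. The following is only a stand-in: it assumes setTimeout as the source of the delay, and the 100 ms figure is arbitrary rather than a property of any real slow service:

    ```javascript
    // Stand-in for the slow adder: hands the sum to the callback after a delay.
    function adder(x, y, callback) {
      setTimeout(function () {
        callback(x + y);
      }, 100);
    }

    var a = 0;
    a++;
    adder(3, a, function (n) {
      // Runs once the first addition is done: n is 3 + 1 = 4.
      adder(n, 4, function (m) {
        // Runs once the second addition is done: m is 4 + 4 = 8.
        console.log(m);
      });
    });

    // This line runs before either callback, because adder returns immediately.
    console.log("not blocked");
    ```

    Each statement that follows a slow call ends up inside that call’s callback, which is exactly the rewriting step a compiler would have to perform.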

    However, there is one thing that worries me about this approach. The advantage here is that you can write simpler code, which does indeed run asynchronously. However, we miss out on the opportunity to parallelize things. For instance, the two INSERT queries in my first example can easily be executed in parallel. There is no reason why I should execute them sequentially (when we look at just the two INSERTs). This may even be a lot faster. In this particular case I did this because I wanted to make sure that both INSERTs had taken place before I did the SELECT. If I had parallelized the two INSERTs, I would have to somehow track when both INSERTs had happened so I could do the SELECT. That’s kind of cumbersome. In the general case, however, there may in fact be many things that can easily be parallelized, and although I want to provide the user with a synchronous model because it’s so much more convenient to write, I also don’t want to lose the parallelizability of asynchronous code.
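
    The bookkeeping for "wait until both INSERTs are done" can be sketched with a counter. Both parallel and fakeInsert are hypothetical helpers written for this illustration; fakeInsert merely simulates a database call with setTimeout:

    ```javascript
    // Hypothetical helper: starts all tasks at once and calls `done` with the
    // results (in task order) once every task has called back.
    function parallel(tasks, done) {
      var results = [];
      var remaining = tasks.length;
      if (remaining === 0) {
        done(results);
        return;
      }
      tasks.forEach(function (task, i) {
        task(function (result) {
          results[i] = result;
          remaining--;
          if (remaining === 0) {
            done(results);
          }
        });
      });
    }

    // Stand-in for an asynchronous INSERT that finishes after `delay` milliseconds.
    function fakeInsert(name, delay) {
      return function (callback) {
        setTimeout(function () {
          callback("inserted " + name);
        }, delay);
      };
    }

    parallel([fakeInsert("task 1", 50), fakeInsert("task 2", 10)], function (results) {
      // Both inserts have completed here, so the SELECT could safely run now.
      console.log(results);
    });
    ```

    Results are stored by index, so their order matches the order of the tasks even though the second insert finishes first.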

    Maybe it would be possible to analyze the code for any dependencies. In the last example, for instance, the second adder call cannot be executed in parallel to the first adder call, because it depends on the result of the first, but in other cases there may be no such dependency. But I’m afraid to open up the world of hurt that is called concurrent programming here, so maybe I should stay out of that kind of stuff.
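
    A toy version of such a dependency analysis, under strong assumptions: each call is described by hand as the variable it defines plus the variables it reads, and side effects (such as an INSERT changing what a later SELECT sees) are ignored entirely. The representation and the third call are made up for illustration:

    ```javascript
    // Each entry names the variable a call defines and the variables it reads.
    var calls = [
      { defines: "n", uses: ["a"] }, // var n = adder(3, a);
      { defines: "m", uses: ["n"] }, // var m = adder(n, 4);
      { defines: "k", uses: ["a"] }  // var k = adder(a, 5);  (hypothetical extra call)
    ];

    // Assigns every call a stage number. Calls in the same stage do not read each
    // other's results, so they could be started together.
    function stages(calls) {
      var stageOf = Object.create(null); // variable name -> stage of its defining call
      return calls.map(function (call) {
        var stage = 0;
        call.uses.forEach(function (name) {
          if (name in stageOf) {
            stage = Math.max(stage, stageOf[name] + 1);
          }
        });
        stageOf[call.defines] = stage;
        return stage;
      });
    }

    console.log(stages(calls)); // stages 0, 1, 0
    ```

    The first and third call land in stage 0 and could run in parallel, while the second has to wait for n. Real code with shared state would need far more than this.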

    Any ideas or experiences with this?

  • Let’s Build a DSL: Platform Research

    Now that we decided on a domain and target platform of our DSL, it is time to explore our target platform. Although I have used HTML, CSS and Javascript for many years, I never looked that seriously into the possibilities of especially CSS and Javascript. To help me with that I’ve been reading a few books:

    In addition there are a number of great blog posts and libraries helping to develop native-looking web applications for the iPhone:

    And a number of performance related posts from Google (from the Gmail team that developed the excellent Gmail mobile application):

    Because eventually I want to add automatic data synchronization support to our DSL, I’m also looking into synchronization strategies. I found a nice post about that. Of course, Apple’s own iPhone web app documentation is also very useful.

     

  • The Point of WebDSL

     Jay asks in the comments:

    I don’t mean to be mean, because I really like your blog and I read it all the time… but could you just explain to me the point of WebDSL? I honestly don’t mean it in any kind of negative way, I’m just wondering why you’re dedicating your valuable time to building something like that when RoR, php, et al. already exist.

    Just from a cursory look, it seems like the syntax is halfway between Visual Basic (yikes!) and C. Why do you prefer that type of syntax to the Lisp-style syntax of Clojure/Compojure?

    Good questions!

    WebDSL was started about 3 years ago by Eelco Visser as an exercise in the design and implementation of domain-specific languages. His focus up to then had been on parsing and meta-programming, but it was time to focus on a new domain: the web. When he started, Eelco had never built a web application. He investigated a number of Java frameworks as a basis and eventually decided to use JBoss Seam as a target.

    The goal of WebDSL is to get rid of the boilerplate code you would have to write when building a Java application and to raise the level of abstraction. The vision was to have simple, domain-specific sub-languages that allow a programmer to specify a certain aspect of the application, with the WebDSL compiler generating all the implementation code for that aspect. Initially there were three sub-languages: a data modeling language, a user interface language and a simple action language to specify logic. As others joined the project (including myself), we added more sub-languages and more features: access control, workflow, data validation, ajax support and more recently search. Work is also being done in the area of data evolution (i.e. migrating databases as you change your data model).

    Although WebDSL is mainly a research project, we are increasingly working to make it useable by anybody with some programming experience. We currently have a few websites in production built using WebDSL (researchr, tweetview, webdsl.org and pil-lang.org) and the manual is growing.

    The idea of building abstractions for the web itself is hardly novel. As Jay mentions, there are many web frameworks that already do this: Rails, Django and so on. There are a few things that we do differently in WebDSL, compared to existing frameworks:

    • We create our own custom syntax. Whereas Rails and Django are struggling to express everything using Ruby and Python, respectively, we designed our own clean syntax. Whether you like this syntax is a matter of taste. Personally I like it, although, indeed, it is inconsistent here and there.
    • WebDSL is a statically typed and checked language. I wrote a number of posts about this issue and its advantages.
    • WebDSL compiles to low-level Java code, which has good performance characteristics. The code we generate does not rely on run-time meta-programming and reflection features of the language which are typically rather slow.
    • WebDSL is platform independent. We generate Java code now, but it can be ported relatively easily to .NET, Python or PHP. We have prototypes of this utilizing the PIL language that I developed.
    • Within the next few months WebDSL will have excellent IDE support for Eclipse, built using Spoofax/IMP. My colleagues are working on this. It will feature syntax highlighting, as-you-type error reporting, code completion and eventually refactoring support.

    A drawback that WebDSL has today is that it’s not trivial to install, but with the IDE plug-in and Java-version that should become a lot easier soon.

    So, why am I putting so much effort into this? As you may be aware I’m doing a Ph.D. in the area of domain-specific languages, so we investigate how to best build them. WebDSL is a case study for us. Soon I intend to work on another DSL, in the domain of mobile applications (yes, a DSL to build iPhone and Android applications, people!). It’s interesting from a research perspective to see how to best do this.

    In addition I regularly experiment with alternative ways of creating DSLs, like in Clojure and Scala. I’d like to see how far you can push these languages to build the DSLs you like. Clojure allows you to define your own custom syntax, in some sense, as long as you adhere to the rule of the parenthesis. Static error checking is much more problematic. Clojure is also rather tied to one platform. Sure, there’s also ClojureCLR, but writing programs that work on both the CLR and the JVM is, well, challenging. IDE support for a Clojure DSL is also non-trivial.

    On the other hand, the flexibility of a DSL like WebDSL also has its downsides. Basically you can design the language any way you like, both its syntax and semantics, but you don’t get much for free. An internal DSL built on Clojure or Scala, by contrast, gets a lot for free: some error reporting, support for namespaces (something we still don’t have in WebDSL), a type system (in Scala’s case), an escape to a powerful language (Clojure or Scala) and a rich set of libraries you can use. In WebDSL we have to design all of this from scratch.

    So in the end both approaches have their advantages and disadvantages. I intend to continue to explore them both.