On the PIL

For the past few years I’ve wanted to do what most programmers dream of doing: design my own programming language. Do everything right that others have done wrong, find the perfect mix of Java, Python and Ruby. It was going to have dynamic typing with optional static typing, a super flexible syntax and metaprogramming facilities that would make Ruby look like a toy. However, the first decision you have to make when undertaking such a project was already problematic: what language am I going to implement this in? C? I don’t C that well, and frankly, I don’t really want to spend my time managing memory and chasing down pointers. So my best bets are the JVM (Java Virtual Machine) and the CLR (.NET). I have to choose, and it’s really a decision I don’t want to make. But how can I make two implementations? I don’t have the resources to do that.

Another area I’m interested in is databases, specifically non-relational databases (or NoSQL databases). Many such systems have started to appear: CouchDB, SimpleDB, Google’s BigTable. A few months ago, the people from FriendFeed described how they built such a system on top of MySQL. A fascinating idea, I thought. I decided to experiment with this idea and implement such a library in Python. After a while, when I got more into Java, I ported it to Java. But I was not happy with it: I wanted to write a library that was language-agnostic, one that could be used from Java and Python, but also from PHP and Ruby.

For the past two years, my colleagues and I have been working on WebDSL, a domain-specific language for rapidly building web applications. An application written in WebDSL is compiled to Java code and can then be deployed on any Java application server. It’s a cool project, but rather uninteresting for people who do not have access to Java hosting, as is the case for many amateur programmers, a group that I care about, because I was one of them for so long. So, during the summer of last year I started to work on a Python Google AppEngine back-end for WebDSL, which generated Python code that worked with AppEngine, rather than Java code. It took some effort, but I got it to work. Cool! WebDSL was now much more accessible to people, because you can use a free AppEngine account to host your applications!

Sadly, maintaining a separate Python back-end for WebDSL turned out to be a tedious job. At the moment there are two people who spend much of their time making changes to the Java back-end of WebDSL, while I work on different things. This means that in addition to my usual work, I also have to replicate every change they make to the Java back-end in the Python back-end. In practice this didn’t happen, so the Python back-end rapidly fell out of date. Similarly, the language that we use to implement WebDSL, Stratego/XT, can now generate code in two languages: C and Java. Both of these back-ends have to be maintained as well. The parser that we use, SGLR, also has two implementations, one in C and one in Java, and both of those have to be maintained too.

See the pattern?

Certain types of applications, libraries and code generators are conceptually platform-independent, but since we typically choose one platform to implement them for/on, we exclude all potential users that use different platforms.

Ideally, we would abstract from software platforms such as Java, .NET, PHP, Ruby, Javascript or Objective-C. Ideally, we’d write code in only one language, and magically translate programs written in this language to any other software platform out there.


This is exactly what PIL attempts to do. PIL stands for Platform Independent Language. It’s a language mainly intended to be used by DSL compilers to more easily maintain multiple platform back-ends, but can also be used as an implementation language for building portable libraries and (parts of) applications. To kick this off, let’s have a look at "Hello world!" in PIL:

void main(Array<String> args) {
  println("Hello world!");
}

As you can see, PIL is a Java-style language and that’s on purpose. Java is a well-known language and PIL’s syntax and semantics are based on Java. We made some changes, however, to simplify and improve it here and there.

The PIL compiler can currently generate programs for three platforms: Java, Python and PHP 5. Soon, more languages will be added. We plan to add at least C# and Objective-C, but also others such as Javascript. But already you can write a library in PIL and generate a Java, Python and PHP implementation from it, which is pretty cool. So let’s assume you installed the PIL compiler and you use it on the "Hello world!" program I just showed:

$ pilc -i hello.pil -d php-out --php
[ pilc | info ] Now compiling: hello.pil
[ pilc | info ] Done with hello.pil

If we look in the php-out directory now, we will see one file: main.php:

<?php
require_once "pil/builtin.php" ;
function main ( $args )
{
   pil_println ( "Hello world!" ) ;
}
?>

Now let’s invoke it with the Java back-end:

$ pilc -i hello.pil -d java-out --java
[ pilc | info ] Now compiling: hello.pil
[ pilc | info ] Done with hello.pil

Now we end up with a java-out/application/Main.java file:

package application;

public final class Main
{
  public final static void main(String[] args)
  { 
    System.out.println("Hello world!");
  }
}

And last but not least for Python:

$ pilc -i hello.pil -d python-out --python
[ pilc | info ] Now compiling: hello.pil
[ pilc | info ] Done with hello.pil

and the result: python-out/main.py

import sys
import pil.builtin
import sets
import datetime
import time

def main(args):
    print str( "Hello world!" )

Of course, this is just a trivial example, but it works on real-life applications as well. I implemented a simple parser and interpreter for my dynamic programming language in PIL and both the Java and Python versions worked (the compiler could not generate PHP code then). Similarly I ported my FriendFeed-inspired layer on top of MySQL to PIL and got it to work with Java and Python as well. For WebDSL there is an experimental PIL back-end that can generate code for any PIL-supported platform.
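
To make the "one source, several back-ends" idea concrete, here is a toy sketch in Python. It is not how pilc works internally; the `Println` node, the `emit_*` functions and `generate` are names invented for this illustration, and the emitted code is deliberately simpler than the real pilc output shown above (no pil/builtin imports, for instance). It only shows the general shape: one small program representation, and one emitter per target platform.

```python
import json
from dataclasses import dataclass


@dataclass
class Println:
    """The only statement in this toy language: print a string and a newline."""
    text: str


def emit_java(statements):
    # json.dumps gives a double-quoted literal; for ordinary text its escapes
    # (\", \\, \n, \uXXXX) are also accepted by Java.
    body = ["    System.out.println(" + json.dumps(s.text) + ");" for s in statements]
    header = ["package application;", "", "public final class Main {",
              "  public static void main(String[] args) {"]
    return "\n".join(header + body + ["  }", "}"])


def emit_python(statements):
    # repr() yields a valid Python string literal for any str.
    body = ["    print(" + repr(s.text) + ")" for s in statements]
    return "\n".join(["def main(args):"] + (body or ["    pass"]))


def emit_php(statements):
    # In single-quoted PHP strings only the backslash and the quote need escaping.
    def literal(text):
        return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"

    body = ["  echo " + literal(s.text) + " . PHP_EOL;" for s in statements]
    return "\n".join(["<?php", "function main($args)", "{"] + body + ["}", "?>"])


# One entry per back-end; adding a platform means adding one emitter here.
BACKENDS = {"java": emit_java, "python": emit_python, "php": emit_php}


def generate(statements, target):
    """Render the same statement list for one of the supported targets."""
    return BACKENDS[target](statements)


program = [Println("Hello world!")]
for target in ("java", "python", "php"):
    print("--- " + target + " ---")
    print(generate(program, target))
```

The point of the sketch is the `BACKENDS` table: the program is described once, and every platform-specific detail lives in exactly one emitter. That is the maintenance burden PIL tries to take off the hands of DSL compilers.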

In practice, porting code written in PIL to a new platform is a bit more work than simply changing the --java switch to --php. Typically you also need a certain number of platform APIs that are not built into the PIL base library of types. You need IO functionality, or database access, for instance. To accomplish this you can use external classes. Here is a sample external class, Database (in the pil::db namespace/package):

external class pil::db::Database {
  new(String hostName, String username, String password, String database);
  pil::db::Connection getConnection();
}

For each platform (e.g. PHP, Java or Python) you now need to provide an implementation of this class, typically by wrapping an existing API on that platform. I won’t go into the details here, but you can find more information and examples on the PIL website.
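
As a rough idea of what such a platform-side implementation could look like, here is a minimal Python sketch. It makes several assumptions: how pilc actually maps pil::db::Database onto a Python module and class is not covered here, the `query` helper is invented for the example, and the standard-library sqlite3 module stands in for a real MySQL driver so the sketch runs anywhere (which is also why the host name and credentials are stored but unused).

```python
import sqlite3


class Connection:
    """Hypothetical counterpart of pil::db::Connection, wrapping a DB-API connection."""

    def __init__(self, raw):
        self._raw = raw

    def query(self, sql, params=()):
        # 'query' is an invented helper, not part of the PIL declaration above.
        cursor = self._raw.execute(sql, params)
        rows = cursor.fetchall()
        self._raw.commit()
        return rows


class Database:
    """Mirrors the external class: same constructor arguments, same method name."""

    def __init__(self, hostName, username, password, database):
        # A MySQL-backed version would pass these on to its driver's connect call.
        self.hostName = hostName
        self.username = username
        self.password = password
        self.database = database

    def getConnection(self):
        # Stand-in: sqlite3 treats 'database' as a file name (or ':memory:').
        return Connection(sqlite3.connect(self.database))


db = Database("localhost", "user", "secret", ":memory:")
conn = db.getConnection()
conn.query("CREATE TABLE greeting (word TEXT)")
conn.query("INSERT INTO greeting VALUES (?)", ("Hello",))
print(conn.query("SELECT word FROM greeting"))
```

The wrapper stays thin on purpose: the PIL side only sees the declared constructor and getConnection(), so everything platform-specific is hidden behind those two entry points.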

I’m sure you’re all excited right now and want to start playing with this immediately. There is a simple tarball available that includes the PIL compiler running on Java (5+). It also includes a number of wrapper scripts that enable you to use the PIL compiler as if it were an interpreter. The pil-java script, for instance, generates Java code for a given .pil file, compiles it and then executes it. Similar scripts are included for Python and PHP. Don’t get confused here: the compiler needs Java to run, but can generate code in three different languages. There is also a native C version of the PIL compiler, which is faster, but also a little bit more complicated to install. The PIL compiler will also run on Windows. The included wrapper scripts will not (unless you run them in Cygwin), but the scripts are simple, so this should not be an issue.
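
For readers who cannot use the bundled scripts, the generate, compile and run steps of something like pil-java can be sketched in a few lines of Python. This is not the actual script from the tarball. The pilc flags and the application/Main.java location are taken from the examples above; the `extra_classpath` parameter is a hypothetical hook, since whatever runtime classes the generated Java code needs are not covered here.

```python
import os
import subprocess
from pathlib import Path


def pil_java_commands(pil_file, out_dir="java-out", extra_classpath=()):
    """Return the three commands: generate Java, compile it, run it."""
    out = Path(out_dir)
    # extra_classpath is a hypothetical hook for any PIL runtime classes or jars.
    classpath = os.pathsep.join([str(out), *extra_classpath])
    main_source = out / "application" / "Main.java"
    return [
        ["pilc", "-i", str(pil_file), "-d", str(out), "--java"],
        ["javac", "-cp", classpath, "-d", str(out), str(main_source)],
        ["java", "-cp", classpath, "application.Main"],
    ]


def run_pil_java(pil_file, **options):
    """Run the steps in order, stopping at the first one that fails."""
    for command in pil_java_commands(pil_file, **options):
        subprocess.run(command, check=True)


for command in pil_java_commands("hello.pil"):
    print(" ".join(command))
```

Calling `run_pil_java("hello.pil")` would execute the three commands, provided pilc, javac and java are all on the PATH. Because the commands are built as argument lists rather than shell strings, the same sketch should work on Windows without Cygwin.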

The PIL manual is under construction. It is far from complete, but already contains a lot of useful information about the language. For support you can join our IRC channel, #pil on irc.freenode.net, or e-mail me. Of course, PIL also has a Twitter account.

After refactoring our build system a little bit and improving the PHP back-end to use the __autoload feature, I am going to work on an Objective-C back-end. Yes, we want to start generating iPhone applications. Objective-C is a bit of a challenge because on the iPhone it does not support garbage collection, which PIL assumes as a feature of its target platforms. But if I get it done, PIL and the DSLs we’re going to build on top of it may be a viable alternative to the ugly Objective-C language you have to use now. I’ll keep you posted on that.

Got something to say?
  1. Chris Eidhof says:

    This is really cool stuff! I would try to target C and then build an Objective C layer on top of that (it's just a superset anyway). If you target C, you'll have to tackle garbage collection (which you would have to do for objective-c anyway) and you'll support a larger amount of platforms.

    Alternatively, you could just ignore garbage collection. Apple has added garbage collection for Objective C 2.0 except for the iPhone, and I think they will add it to Objective-C on the iPhone sometime soon.

  2. Zef Hemel says:

    Thanks!

    My plan is to use Objective-C's reference counting feature and essentially derive release and retain calls from assignments and scopes and stuff. But we'll see how it goes.
    A C back-end would be difficult because then we need to emulate an entire object system on top of it. It's possible, but not the best option I guess. C++ + a garbage collector would be easier then.

  3. CorPaul says:

    Looks very cool Zef! Just wondering, why do you use pil_println() in the PHP version, while you use 'native' methods in the Python and Java output?

  4. Zef Hemel says:

    Good question. pil_println prints prettier representations of certain datatypes than PHP does by default. It uses print_r in case of arrays for example.

  5. Allen Smithee says:

    Vala is another new language with a similar “old language as a back end” compilation strategy. It's a modified C# dialect that compiles down to straight C, using the GObject system (part of GLib) for OOP.

  6. James Hofmann says:

    Sounds like haXe to me.

  7. Jake says:

    I feel sorry for anyone who is starting out programming trying to keep track here (like me). You got your assembly, then your low level languages (like C), then your higher object oriented languages (Java/python/etc), and now an even higher language that compiles to different higher-level languages.

    And if this takes off, you know at some point there's gonna be a compiler that targets this too…

  8. Zef Hemel says:

    Heh. Yep, long live progress! Although PIL is not really higher-level than Java, just a little bit more abstract. A DSL like WebDSL would fall into that category however, although a language like that also simplifies.

  9. Emperor says:

    Indeed, if you could make haXe one of the backend, you'd get javascript and flash for free.

  10. Eric Gustavson says:

    In your article, you mention: “during the summer of last year I started to work on a Python Google AppEngine back-end for WebDSL, which generated Python code that worked with AppEngine, rather than Java code. It tooks some effort, but I got it to work. Cool!”

    I understand that this post is a couple years old, but I would be interested to see your implementation approach here. Any chance the code is available somewhere?
