On Asynchronous Programming

MSDN:

Asynchronous operations are typically used to perform tasks that might take a long time to complete, such as opening large files, connecting to remote computers, or querying a database. An asynchronous operation executes in a thread separate from the main application thread. When an application calls methods to perform an operation asynchronously, the application can continue executing while the asynchronous method performs its task.

Asynchronous programming clearly has performance benefits, as mentioned in the explanation I just quoted. What I have complained about before is the programming model that follows from it. In Javascript this becomes painfully clear. Javascript does not have threading (although it's coming), so anything that is not instantaneous needs to be executed asynchronously or it will freeze the browser. The typical example of this is Ajax (Asynchronous Javascript and XML, a term that is way too general when you think of it), but you get the same thing when you start interacting with local databases. In a previous post I showed the following code:

What this code attempts is a simple sequential execution of:

  1. opening a database connection
  2. starting a transaction
  3. creating a table
  4. inserting a task
  5. inserting another task
  6. selecting all tasks in the table
  7. rendering list items for them

In a synchronous programming model this would roughly be:

var db = openDatabase(...);
db.startTransaction();
db.executeSql("CREATE ...");
db.executeSql("INSERT INTO ...");
db.executeSql("INSERT INTO ...");
var results = db.executeSql("SELECT * FROM ...");
for(var i = 0; i < results.length; i++) {
  ...
}

I think you will agree that this is much more legible than the asynchronous counterpart. The catch is that you should never run it like this in Javascript, because it will completely block the UI thread for as long as the code runs (which could be several seconds).

I looked at the Google Web Toolkit because I hoped they had solved this problem, but it turns out they made it even worse, because now you have to deal with the Java overhead of implementing interfaces:

(Nothing bad about GWT by the way, using it makes a lot of sense in many larger projects.)

Getting back to my point: would it be possible to automatically rewrite synchronous code to asynchronous code? I'm starting to feel that's very possible indeed. Let's take a simple example. Let's say we managed to implement an adder function that is extremely slow (maybe by executing it on a server far, far away):

console.log(adder(3, 4));

This code could block for several seconds while the adder function adds the numbers 3 and 4. In asynchronous style this would be written as:

adder(3, 4, function(result) {
    console.log(result);
});

Asynchronous functions take a callback function, which is passed the result of the computation once it completes. That result is what we call a return value in synchronous programming. So, what if we sequence a number of synchronous statements:

var a = 0;
a++;
var n = adder(3, a);
var m = adder(n, 4);
console.log(m);

What would that look like in asynchronous code?

var a = 0;
a++;
adder(3, a, function(n) {
  adder(n, 4, function(m) {
    console.log(m);
  });
});
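
To try the nested version outside a browser, the slow adder can be simulated with a timer. The sketch below is only an illustration: the setTimeout-based `adder`, its 10 ms delay, and the `compute` wrapper are assumptions standing in for a real remote call.

```javascript
// Hypothetical slow adder: hands a + b to the callback after a short delay,
// standing in for a round trip to a server far, far away.
function adder(a, b, callback) {
  setTimeout(function() {
    callback(a + b);
  }, 10);
}

// The sequential program from above, wrapped in a function so that the
// final value is passed to a callback instead of being logged directly.
function compute(callback) {
  var a = 0;
  a++;
  adder(3, a, function(n) {
    adder(n, 4, function(m) {
      callback(m);
    });
  });
}

compute(function(m) {
  console.log(m); // 8
});
```

Note that `compute` itself had to become asynchronous: once one call in a chain takes a callback, every caller above it has to as well.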

As you can see, this transformation is not complicated. It could easily be automated, so that in the DSL that we're designing you could write synchronous code, and it would be translated to asynchronous code automatically.

However, there is one thing that worries me about this approach. The advantage is that you can write simpler code that still runs asynchronously, but we miss out on the opportunity to parallelize things. For instance, the two INSERT queries in my first example could easily be executed in parallel. Looking at just the two INSERTs, there is no reason to execute them sequentially, and running them in parallel may even be a lot faster. In this particular case I ran them sequentially because I wanted to make sure that both INSERTs had taken place before I did the SELECT. If I had parallelized the two INSERTs, I would have had to somehow track when both had completed so I could do the SELECT. That's kind of cumbersome. In the general case, however, there may be many things that can easily be parallelized, and although I want to provide the user with a synchronous model because it's so much more convenient to write, I also don't want to lose the parallelizability of asynchronous code.
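
The bookkeeping for tracking when both INSERTs have finished can be as small as a counter of outstanding operations. The following is a minimal sketch under stated assumptions: `whenAll` and `fakeInsert` are hypothetical names made up for this example, and the timers stand in for real database calls.

```javascript
// Hypothetical helper: start every task right away and call done(results)
// once all of them have called back. Each task is a function(callback).
function whenAll(tasks, done) {
  var pending = tasks.length;
  var results = [];
  if (pending === 0) {
    done(results);
    return;
  }
  tasks.forEach(function(task, i) {
    task(function(result) {
      results[i] = result; // keep results in task order, not finish order
      pending--;
      if (pending === 0) {
        done(results);
      }
    });
  });
}

// Stand-in for an asynchronous INSERT: reports the inserted name after a delay.
function fakeInsert(name, delay) {
  return function(callback) {
    setTimeout(function() {
      callback(name);
    }, delay);
  };
}

// Both inserts start immediately; the SELECT step runs only after both finish.
whenAll([fakeInsert("task 1", 20), fakeInsert("task 2", 5)], function(inserted) {
  console.log("inserted: " + inserted.join(", ") + "; now safe to SELECT");
});
```

Whether the two statements really execute in parallel depends on the database API. The sketch only shows how to wait for both before continuing.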

Maybe it would be possible to analyze the code for dependencies. In the last example, for instance, the second adder call cannot be executed in parallel with the first, because it depends on the result of the first, but in other cases there may be no such dependency. But I'm afraid to open up the world of hurt that is called concurrent programming here, so maybe I should stay out of that kind of stuff.
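
To make the dependency idea concrete, here is a rough sketch of what a dependency-aware translation might produce for three statements, two of which are independent. Everything in it is an assumption for illustration: the timer-based `adder`, the `compute` wrapper, and the `operandReady` helper are invented names, not the output of an actual tool.

```javascript
// Hypothetical slow adder, simulated with a timer.
function adder(a, b, callback) {
  setTimeout(function() {
    callback(a + b);
  }, 10);
}

// Synchronous original:
//   var x = adder(1, 2);
//   var y = adder(3, 4);   // does not use x, so it need not wait for it
//   var z = adder(x, y);   // uses both x and y, so it must wait for both
function compute(callback) {
  var x, y;
  var pending = 2; // number of independent calls still running

  // Called each time one operand arrives; starts the dependent call
  // only when both x and y are available.
  function operandReady() {
    pending--;
    if (pending === 0) {
      adder(x, y, callback);
    }
  }

  // The two independent calls are started back to back, without waiting.
  adder(1, 2, function(result) { x = result; operandReady(); });
  adder(3, 4, function(result) { y = result; operandReady(); });
}

compute(function(z) {
  console.log(z); // 10
});
```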

Any ideas or experiences with this?