Three Routes to Spaghetti-Free Javascript
By Zef Hemel
(If you are familiar with the problems of moving from synchronous to asynchronous programming, feel free to move to the next section.)
Update: A lot of people misunderstood the main issue: here is another shot at explaining it better.
Let's build a script that determines the titles of a set of URLs. Let's start simple: we create a function that takes a URL and returns the title:
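var reg = /<title>(.+)<\/title>/mi;
function getTitle(url) {
 // Dummy HTML for now; the real fetch comes later.
 var body = "<html><head><title>My title</title></head></html>";
 var match = reg.exec(body);
 return match[1];
 }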
In this first prototype we ignore the fetching of the webpage for now and assume some dummy HTML for testing purposes. We can now call the function:
getTitle("http://whatever.com");
and we'll get back:
"My title"
So far, so good. Now let's iterate over an array of URLs and get each of their titles:
var urls = ["http://zef.me", "http://google.com",
 "http://yahoo.com"];
 var titles = [];
 for(var i = 0; i < urls.length; i++) {
 titles.push(getTitle(urls[i]));
 }
 console.log(titles);
And the array of resulting titles (all "My title") is printed.
Next, to put the actual URL fetching part into getTitle, we need to make an AJAX call (let's just ignore the same-origin restriction here):
function getTitle(url) {
 var xmlHttp = new XMLHttpRequest();
 xmlHttp.open("GET", url, true);
 xmlHttp.send();
 xmlHttp.onreadystatechange = function() {
 if(xmlHttp.readyState==4 && xmlHttp.status==200) {
 // Now we get the body from the responseText:
 var body = xmlHttp.responseText;
 var match = reg.exec(body);
 return match[1];
 }
 };
 }
We open an XMLHttpRequest and then attach an event listener on the onreadystatechange event. When the ready state changes, we check if it's now set to 4 (done), and if so, we take the body text, apply our regular expression and return the match.
Or do we?
Note that return statement. Where does it return to? Well, it belongs to the event handler function, not the getTitle function, so this doesn't work. The XMLHttpRequest is performed asynchronously. The request is set up, an event handler is attached and then the getTitle function returns. Then, at some later point, the onreadystatechange event is triggered and the regular expression applied.
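To see the problem in isolation, here is a minimal sketch that swaps the XMLHttpRequest for a setTimeout. The name getTitleBroken and the 10 ms delay are made up for illustration:

```javascript
// Hypothetical stand-in for the XMLHttpRequest version: setTimeout plays
// the role of the network request.
function getTitleBroken(url) {
  setTimeout(function() {
    // This return belongs to the anonymous handler, not to getTitleBroken.
    // Nobody receives the value.
    return "My title";
  }, 10);
  // No return statement out here, so the caller gets undefined.
}

var result = getTitleBroken("http://whatever.com");
console.log(result); // undefined
```

The handler's return value is simply dropped by whoever invokes the handler, which is the timer here and the browser's event machinery in the real thing.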
So, how do we fix this? Well, we can change our function a little bit:
function getTitle(url, callback) {
 var xmlHttp = new XMLHttpRequest();
 xmlHttp.open("GET", url, true);
 xmlHttp.send();
 xmlHttp.onreadystatechange = function() {
 if(xmlHttp.readyState==4 && xmlHttp.status==200) {
 var body = xmlHttp.responseText;
 var match = reg.exec(body);
 callback(match[1]);
 }
 };
 }
Now, instead of returning the value, we pass the result to a callback function (the function's new second argument). When we want to call the function, we have to do it as follows:
getTitle("http://bla.com", function(title) {
 console.log("Title: " + title);
 });
That's annoying, but fair enough.
I suppose we also have to adapt our loop now, too. getTitle no longer returns a useful value, so we have to pass a callback function to it. Hmm, how do we do this?
var urls = ["http://zef.me", "http://google.com",
 "http://yahoo.com"];
 var titles = [];
 for(var i = 0; i < urls.length; i++) {
 getTitle(urls[i], function(title) {
 titles.push(title);
 });
 }
 console.log(titles);
That looks about right. Except that when running this code, the last console.log will be executed immediately, showing an empty array, because the getTitle calls have not finished executing yet. Asynchronous code executes in a different order than the code may suggest.
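The ordering can be made visible with a small sketch. fakeGetTitle is a hypothetical stand-in for the real getTitle that calls back after 10 ms, so the snippet runs without a network:

```javascript
// Hypothetical stand-in for getTitle: calls back after 10 ms.
function fakeGetTitle(url, callback) {
  setTimeout(function() {
    callback("Title of " + url);
  }, 10);
}

var order = [];
fakeGetTitle("http://zef.me", function(title) {
  order.push("callback ran");
});
order.push("code after the call ran");
// At this point order holds only "code after the call ran". The callback
// entry is appended later, once the currently running code has finished.
```

Even with a delay of 0 the result would be the same: the callback cannot run until the code that scheduled it has run to completion.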
Shame.
We now have to think: what do we prefer? Do we want to have all the URLs fetched simultaneously, or do they have to be fetched in sequence? Implementing it in sequence is more difficult, so let's do it in parallel. What we'll do is add a counter!
var urls = ["http://zef.me", "http://google.com",
 "http://yahoo.com"];
 var titles = [];
 var numFinished = 0;
 for(var i = 0; i < urls.length; i++) {
 getTitle(urls[i], function(title) {
 titles.push(title);
 numFinished++;
 if(numFinished === urls.length) {
 // All done!
 console.log(titles);
 }
 });
 }
As we get back the getTitle results we increase the numFinished counter, and when that counter has reached the total number of URLs, we're done and print the array of titles.
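One caveat with the counter: titles are pushed in the order the responses arrive, which need not match the order of urls. Below is a minimal sketch of one way around that, writing each result into its own slot. fakeGetTitle and its delays are made up here so the snippet runs without a network:

```javascript
// Hypothetical stand-in for getTitle. The first URL is made the slowest
// on purpose, so responses arrive out of order.
function fakeGetTitle(url, callback) {
  var delay = url === "http://zef.me" ? 30 : 10;
  setTimeout(function() {
    callback("Title of " + url);
  }, delay);
}

var urls = ["http://zef.me", "http://google.com", "http://yahoo.com"];
var titles = new Array(urls.length);
var numFinished = 0;
var printed = null; // only here so the result can be inspected afterwards

urls.forEach(function(url, index) {
  fakeGetTitle(url, function(title) {
    titles[index] = title; // slot by position, not by arrival order
    numFinished++;
    if (numFinished === urls.length) {
      printed = titles;
      console.log(titles);
    }
  });
});
```

Using forEach gives every callback its own index. With a plain for loop and var, all callbacks would share one i and see its final value by the time they run.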
Ugh. Let's not even look at the code to fetch these URLs sequentially.
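For readers who do want to look, a sequential version could be sketched roughly as follows. This is one possible shape, not the only one. getTitlesSequential and fakeGetTitle are hypothetical names, and the fake records when each request starts and finishes so the ordering can be checked:

```javascript
// Hypothetical stand-in for getTitle; logs start and finish of each request.
var events = [];
function fakeGetTitle(url, callback) {
  events.push("start " + url);
  setTimeout(function() {
    events.push("done " + url);
    callback("Title of " + url);
  }, 10);
}

function getTitlesSequential(urls, done) {
  var titles = [];
  function next(i) {
    if (i === urls.length) {
      done(titles); // nothing left to fetch
      return;
    }
    fakeGetTitle(urls[i], function(title) {
      titles.push(title);
      next(i + 1); // start the next request only after this one finished
    });
  }
  next(0);
}

var sequentialTitles = null;
getTitlesSequential(["http://zef.me", "http://google.com"], function(titles) {
  sequentialTitles = titles;
  console.log(titles);
});
```

The for loop has disappeared entirely: the iteration is now driven by the callbacks, with a recursive helper standing in for the loop counter.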
Centuries of civilization and decades of programming research, and we're back to this style of spaghetti programming?
There must be ways around this. Indeed, there are. Let's look at three of them.
Route #1: streamline.js
-----------------------
Streamline.js is a simple compiler, implemented in Javascript, that enables you to write your code in a synchronous style. The nice thing about streamline.js is that it operates on regular Javascript and does not add any new keywords, so you can keep using your favorite editor and other tools. The only thing streamline.js does is give the _ identifier new meaning. Before I demonstrate it, let's refactor our code slightly. We'll create a generic fetchHTML function:
function fetchHTML(url, callback) {
 var xmlHttp = new XMLHttpRequest();
 xmlHttp.open("GET", url, true);
 xmlHttp.send();
 xmlHttp.onreadystatechange = function() {
 if(xmlHttp.readyState==4 && xmlHttp.status==200) {
 callback(xmlHttp.responseText);
 }
 };
 }
Now, streamline.js allows us to write our getTitle function as follows:
function getTitle(url, _) {
 var body = fetchHTML(url, _);
 var match = reg.exec(body);
 return match[1];
 }
You'll notice the _ argument there, which represents the callback function. It tells streamline.js that this is an asynchronous function. The next thing you'll notice is the call to fetchHTML, which, although it is an asynchronous function, is called as if it's a regular synchronous function. The difference? The last argument: _.
Internally, streamline.js transforms this code to something equivalent to this:
function getTitle(url, _) {
 fetchHTML(url, function(body) {
 var match = reg.exec(body);
 return _(match[1]);
 });
 }
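The transformed version is plain Javascript, so it can be tried out without streamline.js at all. Here is a sketch under a few assumptions: fakeFetchHTML is a hypothetical stand-in for the XMLHttpRequest-based fetchHTML, and the regular expression is assumed to match a title element:

```javascript
var reg = /<title>(.+)<\/title>/mi; // assumed title pattern

// Hypothetical stand-in for fetchHTML: delivers canned HTML after 10 ms.
function fakeFetchHTML(url, callback) {
  setTimeout(function() {
    callback("<html><head><title>Page at " + url + "</title></head></html>");
  }, 10);
}

// Same shape as the transformed getTitle, with _ acting as the callback.
function getTitle(url, _) {
  fakeFetchHTML(url, function(body) {
    var match = reg.exec(body);
    return _(match[1]);
  });
}

var seenTitle = null;
getTitle("http://zef.me", function(title) {
  seenTitle = title;
  console.log(title); // Page at http://zef.me
});
```

Here _ is an ordinary parameter name; it only gets its special meaning when the source is run through streamline.js.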
This transformation is called the continuation-passing style transformation. We can now keep our loop simple as well:
var urls = ["http://zef.me", "http://google.com",
 "http://yahoo.com"];
 var titles = [];
 for(var i = 0; i < urls.length; i++) {
 titles.push(getTitle(urls[i], _));
 }
 console.log(titles);
Basically the same as our original version; the only difference is an additional _ argument to getTitle.
Not bad, huh? Streamline.js also has some nice functions to enable a parallel version of this code.
Name: streamline.js, a Javascript preprocessor (integrates nicely with node.js too)
License: MIT
Route #2: mobl
--------------
My own project, mobl, is a language to rapidly develop mobile web applications. Although it's not Javascript, the syntax of its scripting language is similar. Since mobl is typed, it is easy for the compiler to infer whether a function is asynchronous or not, which leads to code that is slightly cleaner than with streamline.js:
function getTitle(url : String) : String {
 var body = fetchHTML(url);
 var match = reg.exec(body);
 return match.get(1);
 }
and the loop:
var urls = ["http://zef.me", "http://google.com",
 "http://yahoo.com"];
 var titles = Array();
 foreach(url in urls) {
 titles.push(getTitle(url));
 }
 log(titles);
As with streamline.js, the compiler performs a continuation-passing style transformation to produce asynchronous Javascript code.
Mobl is aimed at the mobile web domain, it is a whole new language to learn, and it doesn't currently support concurrent execution of asynchronous calls. Nevertheless, unlike streamline.js, there are no special _ variables to pass around.
Name: mobl, new language, browser only.
License: MIT
Route #3: StratifiedJS
----------------------
The most powerful option is StratifiedJS. It extends the Javascript language with various structured concurrency features using a few new language constructs such as waitfor, and, or and retract. To fully understand its expressive power, it's a good idea to have a look at these excellent interactive OSCON slides.
Here's the code for StratifiedJS:
function getTitle(url) {
 var body = fetchHTML(url);
 var match = reg.exec(body);
 return match[1];
 }
and the loop:
var urls = ["http://zef.me", "http://google.com",
 "http://yahoo.com"];
 var titles = [];
 for(var i = 0; i < urls.length; i++) {
 titles.push(getTitle(urls[i]));
 }
 console.log(titles);
As you can see, this is basically exactly how you'd want to write the code. Compared to our original version, the only thing that changed was adding the fetchHTML call, as it should be.
With some effort I was able to capture the Javascript code that this code fragment is translated to. Here's the code generated for the getTitle function:
function getTitle(url) {
 var body, match;
 return __oni_rt.exseq(arguments, this, "whatever.js",
 [1, __oni_rt.Scall(3, function (_oniX) {
 return body = _oniX;
 }, __oni_rt.Nb(function (arguments) {
 return fetchHTML(url)
 }, 2)), __oni_rt.Scall(4, function (_oniX) {
 return match = _oniX;
 }, __oni_rt.Nb(function (arguments) {
 return reg.exec(body)
 }, 3)), __oni_rt.Nb(function (arguments) {
 return __oni_rt.CFE("r", match[1]);
 }, 5)])
 }
and the loop:
var urls, titles, i;
 __oni_rt.exseq(this.arguments, this, "whatever.js",
 [0, __oni_rt.Nb(function (arguments) {
 urls = ["http://zef.me", "http://google.com",
 "http://yahoo.com"];
 titles = [];
 }, 4), __oni_rt.Seq(0, __oni_rt.Nb(function (arguments) {
 i = 0;
 }, 8), __oni_rt.Loop(0, __oni_rt.Nb(function (arguments) {
 return i < urls.length
 }, 5), __oni_rt.Nb(function (arguments) {
 return i++
 }, 5), __oni_rt.Fcall(1, 6, __oni_rt.Scall(6, function(l){
 return [l, "push"];
 }, __oni_rt.Nb(function (arguments) {
 return titles
 }, 6)), __oni_rt.Nb(function (arguments) {
 return getTitle(urls[i])
 }, 6)))), __oni_rt.Nb(function (arguments) {
 return console.log(titles)
 }, 8)])
What worries me somewhat about this generated code is that it seems rather heavy on the number of functions being generated. Basically every expression is turned into a function passed to another function in the StratifiedJS runtime. This seems rather expensive. I haven't done any performance benchmarking on this, so maybe it's not as bad as I think.
Of the three, StratifiedJS is definitely the most flexible and allows you to write the cleanest code. The drawback is that it extends the Javascript language (unlike streamline.js), which could break your current tool chain. In addition, the produced code is likely to be slower than that of the other two solutions.
Name: StratifiedJS, extension of Javascript.
License: MIT (although source code is only available in a minified version at the moment)
Conclusion
----------
So there you go. Three ways to write clean synchronous code and produce efficient asynchronous Javascript code. The fact is that picking any of these requires a compiler of some kind to be added to your tool chain (although StratifiedJS performs this compilation at run-time), which may or may not be a problem.
A drawback of code generation in any shape or form is debugging. If something goes wrong, the code you'll be debugging is generated Javascript code. StratifiedJS attempts to include original line numbers when exceptions occur, which helps. A fork of streamline.js attempts to maintain the line numbers in generated code.
In the end it's all a trade-off. A different route would be to use a library like async.js that, while not "fixing the language", gives you an API that enables you to at least write asynchronous code in a more readable manner.
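To give a flavour of what such a library takes off your hands, here is a hand-rolled helper in the same spirit. To be clear, mapParallel is a made-up name for this sketch and not the actual API of async.js, and fakeGetTitle is a stand-in so the snippet runs without a network:

```javascript
// Hypothetical helper: applies an asynchronous worker to every item in
// parallel and calls done with the results in input order.
function mapParallel(items, worker, done) {
  var results = new Array(items.length);
  var remaining = items.length;
  if (remaining === 0) {
    done(results); // nothing to wait for
    return;
  }
  items.forEach(function(item, index) {
    worker(item, function(result) {
      results[index] = result;
      remaining--;
      if (remaining === 0) {
        done(results);
      }
    });
  });
}

// Hypothetical stand-in for getTitle: calls back after 10 ms.
function fakeGetTitle(url, callback) {
  setTimeout(function() {
    callback("Title of " + url);
  }, 10);
}

var allTitles = null;
mapParallel(["http://zef.me", "http://google.com"], fakeGetTitle, function(titles) {
  allTitles = titles;
  console.log(titles);
});
```

The counter bookkeeping is still there, but it is written once inside the helper instead of at every place that needs to wait for several callbacks.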