Async Foreach in Javascript

Javascript comes with a nice method on `Array` objects called `forEach`. It takes a function as an argument and applies that function to each item in the array sequentially. It's Javascript's version of a for-each loop, allowing you to reduce code like this:

```javascript
for(var i = 0; i < ar.length; i++) {
  alert(ar[i]);
}
```

or, for the performance-obsessed:

```javascript
var len = ar.length;
for(var i = 0; i < len; i++) {
  alert(ar[i]);
}
```

to:

```javascript
ar.forEach(function(el) {
  alert(el);
});
```

That's quite clean, in my opinion. Sadly, `forEach` assumes that the function passed to it is [synchronous](http://zef.me/2726/on-asynchronous-programming): it does something and returns immediately. What if you need to pass an asynchronous function to it, and need to ensure that items are still processed sequentially? If I write this:

```javascript
ar.forEach(function(el) {
  callAsyncFunction(el, function() { // my callback
    alert("Processed: " + el);
  });
});
alert("Done!");
```

I will quite likely get the alerts in a hard-to-predict order; in fact, "Done!" will typically appear before any item has been processed, because the asynchronous callbacks only fire after the current code has run to completion. So, we need an asynchronous version of `forEach`. The solution I came up with is as follows:

```javascript
function asyncForEach(array, fn, callback) {
  array = array.slice(0); // work on a copy, so the caller's array is untouched
  function processOne() {
    var item = array.pop(); // take an item off the (copied) array
    fn(item, function(result) {
      if(array.length > 0) {
        processOne(); // process the next item
      } else {
        callback(); // Done!
      }
    });
  }
  if(array.length > 0) {
    processOne();
  } else {
    callback(); // Done! (empty array)
  }
}
```

This works great, but it can be very heavy on the call stack. If you have an array of 1000 elements, this can easily cause a stack overflow, because `processOne` calls itself recursively (in particular when `fn` happens to invoke its callback synchronously, so that every item adds another frame to the stack). I have not yet found an ideal solution to this problem. One trick is to not call `processOne` immediately, but to _schedule_ it for invocation. In the browser you can use `setTimeout` for this. The code then becomes:

```javascript
function asyncForEach(array, fn, callback) {
  array = array.slice(0);
  function processOne() {
    var item = array.pop();
    fn(item, function(result) {
      if(array.length > 0) {
        setTimeout(processOne, 0); // schedule immediately
      } else {
        callback(); // Done!
      }
    });
  }
  if(array.length > 0) {
    setTimeout(processOne, 0); // schedule immediately
  } else {
    callback(); // Done!
  }
}
```

You can use it like this:

```javascript
asyncForEach(ar, function(el, callback) {
  callAsyncFunction(el, function() { // my callback
    alert("Processed: " + el);
    callback();
  });
}, function() {
  alert("Done!");
});
```

Although this works on most occasions (except when you do in-browser database stuff, as I found out), it is kind of heavy on the browser's scheduler. I have not done benchmarks, but intuitively this seems quite expensive. [node.js](http://nodejs.org) has `process.nextTick` for this: you just pass it a function, and because it bypasses the timer infrastructure it is more efficient.
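For illustration, here is a sketch of what the same loop could look like on node.js; it is simply the `setTimeout` version above with the scheduling call swapped for `process.nextTick`:

```javascript
// Sketch: the same sequential loop, scheduled with node.js's
// process.nextTick instead of setTimeout(fn, 0).
function asyncForEach(array, fn, callback) {
  array = array.slice(0); // work on a copy
  function processOne() {
    var item = array.pop();
    fn(item, function(result) {
      if(array.length > 0) {
        process.nextTick(processOne); // schedule the next item
      } else {
        callback(); // Done!
      }
    });
  }
  if(array.length > 0) {
    process.nextTick(processOne);
  } else {
    callback(); // Done!
  }
}
```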

If you care less about the order of execution of individual items, and only need a callback once _all_ of them have been processed (in any order), you can use a more efficient solution: one that does not rely on `setTimeout`, is not heavy on the call stack, and processes the entire array in parallel, which may also speed things up quite a bit. For this I define `asyncParForEach`:

```javascript
function asyncParForEach(array, fn, callback) {
  var completed = 0;
  if(array.length === 0) {
    callback(); // done immediately
    return;
  }
  var len = array.length;
  for(var i = 0; i < len; i++) {
    fn(array[i], function() {
      completed++;
      if(completed === array.length) {
        callback(); // every item has reported back
      }
    });
  }
}
```
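You call it exactly like `asyncForEach`; the only difference is that the "Processed" alerts may now appear in any order. A quick sketch, reusing the placeholder `callAsyncFunction` from the earlier examples:

```javascript
// Same calling convention as asyncForEach; callAsyncFunction is the same
// placeholder asynchronous function used above.
asyncParForEach(ar, function(el, callback) {
  callAsyncFunction(el, function() {
    alert("Processed: " + el); // may fire in any order
    callback(); // tell asyncParForEach this item is done
  });
}, function() {
  alert("Done!"); // fires once every item has completed
});
```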

If you use the [persistence.js](http://persistencejs.org) library, these functions are included as `persistence.asyncForEach` and `persistence.asyncParForEach`. If you don't, well, you should, or just use the implementations I provided here.