The Road to an Open Source Google

The first episode of the GigaOM Show briefly touched on the idea that creating an open source search engine may be the only way to beat Google. Om Malik said that Yahoo! in particular should be interested in getting a project like that off the ground.

Hadoop is a new Apache project that aims to build a very Google-like distributed computing environment. Google does much of its large-scale data processing with MapReduce, a programming model that splits a job into map and reduce steps that run in parallel across many machines. Hadoop includes an open source implementation of MapReduce, along with an open source implementation of a large-scale distributed file system very much like GoogleFS. Here’s a tutorial on running Hadoop with Amazon EC2 and S3.
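
For readers who haven't seen the MapReduce model, here is a minimal single-machine sketch of the idea in Python. It is not Hadoop's API and not Google's implementation; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and the grouping step that a real framework would spread across many machines is done here with an in-memory dictionary.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key. A real framework does this across
    machines between the map and reduce steps; here it is one dict."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two tiny "documents".
counts = word_count(["open source search", "open source file system"])
print(counts)
```

Because each map call looks at only one document and each reduce call looks at only one key, both steps can in principle be handed out to as many machines as are available, which is the part a system like Hadoop takes care of.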

Interestingly, Hadoop is a sub-project of Apache Lucene, which, as you may know, is probably the best open source search engine out there.

Might they be up to something?