Monday, June 26, 2006

Erlang + Yaws + haXe = Perfect Comet Recipe

Comet, the trick of pushing data to a web client using eager client requests combined with prolonged server responses, has been getting a lot of buzz lately. Comet lets you create truly interactive web applications with full bidirectional communications and without the high latency imposed by the more primitive technique of periodic polling. Meebo uses Comet to do its magic, and so does the GTalk integration in Gmail. Most web applications probably don't need Comet, but if you want to be on the cutting edge then Comet opens up many possibilities.
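To make the "eager client request, prolonged server response" idea concrete, here is a minimal sketch in Python. It is only an illustration: the `Channel` class, its `poll` and `publish` methods, and the timeout values are hypothetical names invented for this example, and an in-memory queue stands in for a real HTTP connection.

```python
import asyncio


class Channel:
    """Hypothetical in-memory stand-in for a Comet endpoint (not a real HTTP server)."""

    def __init__(self):
        self._queue = asyncio.Queue()

    async def publish(self, message):
        # Server side: a new event becomes available for the waiting client.
        await self._queue.put(message)

    async def poll(self, timeout=30.0):
        # The "prolonged response": hold the request open until a message
        # arrives or the timeout expires.
        try:
            return await asyncio.wait_for(self._queue.get(), timeout)
        except asyncio.TimeoutError:
            return None  # empty response; the client simply asks again


async def client(channel, expected):
    # The "eager request": as soon as one poll returns, issue the next one.
    received = []
    while len(received) < expected:
        message = await channel.poll(timeout=1.0)
        if message is not None:
            received.append(message)
    return received


async def demo():
    channel = Channel()
    listener = asyncio.create_task(client(channel, 3))
    for text in ("hello", "from", "server"):
        await asyncio.sleep(0.01)  # events show up whenever the server has them
        await channel.publish(text)
    return await listener


def run_demo():
    return asyncio.run(demo())
```

The client never waits for a fixed polling interval. Each message is delivered as soon as it is published, which is where the latency advantage over periodic polling comes from.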

Comet is used for some cool apps, but there's a good reason you shouldn't use it. Unless you're a hacker of functional languages of Swedish origin, chances are your web server croaks when it reaches a few thousand simultaneous users. The vast majority of web servers are written in languages that use operating system threads for concurrency, and that approach doesn't scale much beyond a few thousand threads. Other web servers, like Lighttpd, don't use threads at all because they're entirely non-blocking and single-threaded, but their interfaces to dynamic languages, e.g. FastCGI, rely on system threads, if not on even more heavyweight processes.

You're probably wondering at this point how Meebo does it. Well, I haven't seen Meebo's source code, so I don't know for certain, but my guess is the Meebo hackers wrote an entirely non-blocking, single-threaded web server that maintains connections to all users. This web server acts as a simple message router -- it doesn't do database transactions or anything. It just sends messages to other servers that do the heavy lifting -- and that includes the AIM, ICQ, Yahoo Messenger servers, etc. A second possibility is that Meebo's software isn't very scalable and that Meebo just buys a lot of cheap front-end boxes. These are my two theories, at least, because I know that Meebo's servers are written in C++ (again).

Whatever C++ hoops Meebo's code jumps through, one thing is pretty clear: Meebo has at least 37 frontend servers. To arrive at this observation, I harnessed the best of my investigative journalism skills, which prompted me to access wwwXX.meebo.com servers in increasing order until www38.meebo.com redirected me to a lower server.

This kind of architecture sounds pretty difficult to develop and maintain. Either it has many moving parts and requires a lot of custom code because it doesn't fit well with the standard array of web development tools, or it's just plain expensive. In addition, writing server-quality code in C++ is not easy, and if you want to upgrade your servers while they're running, well, forget about it. If I were to build a webapp like Meebo, I would do it quite differently: I would ditch C++ and go with Erlang.

Erlang was created with scalability and concurrency in mind, so it takes a much more effective approach to concurrency than other programming languages, using lightweight threads that are managed by an event-driven VM (you can read my previous posts on the topic for more details). Now, Erlang is not just a language: it has very useful applications, especially Yaws, a powerful web server for dynamic applications. Yaws is written in Erlang, so it has scalability and concurrency built in. Yaws allows you to write server-side code using the good old multi-threaded paradigm, without worrying about maintaining a number of connections that would make other web servers gasp their last dying breath if your site ever gets popular.
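As a rough analogy for the "one lightweight thread per connection" style, here is a sketch in Python. It uses asyncio tasks, which are not Erlang processes and are not how Yaws is implemented. The function names (`handle_connection`, `serve_many`, `run`) and the figure of 10,000 handlers are made up for illustration, and no real sockets are involved.

```python
import asyncio


async def handle_connection(conn_id, release):
    # Straight-line, per-connection code: park until there is something
    # to send, then respond. No callbacks and no shared state machine.
    await release.wait()
    return f"conn {conn_id}: done"


async def serve_many(n):
    release = asyncio.Event()
    # One lightweight task per simulated connection, all scheduled by a
    # single event loop rather than by one OS thread each.
    tasks = [asyncio.create_task(handle_connection(i, release)) for i in range(n)]
    await asyncio.sleep(0)  # yield once so the handlers can start and park
    waiting = sum(1 for task in tasks if not task.done())
    release.set()  # simulate an event that every connection is waiting for
    results = await asyncio.gather(*tasks)
    return waiting, results


def run(n=10_000):
    return asyncio.run(serve_many(n))
```

Each handler reads like ordinary blocking code, yet ten thousand of them can sit idle at once because parking a task costs far less than parking an operating system thread. That is the property the paragraph above attributes to Erlang's lightweight threads.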

If I were to build a web application that uses Comet, I would most certainly use Erlang + Yaws. In fact, I would use a Yaws backend with a haXe client, now that my haXe remoting adapter for Yaws is ready to ship.

Well, that's my take on the matter. As Dennis Miller would say, of course, that's just my opinion -- I could be wrong :) I do urge you, however, to scroll down to my posting with the graph of the experiment comparing the performance of Apache and Yaws in the face of a high number of simultaneous requests or to visit Joe Armstrong's webpage, as it should give you some extra persuasion.

2 comments:

Roberto said...

I like your recipe. I am currently preparing a meal with it, and I just added a grain of dojo to give it a special taste. And I am thinking about serving a Red5 dessert (I haven't tried Erlang-Java integration yet, but ejabberd seems to use a lot of it).

yariv said...

Roberto, that sounds very interesting. I'm waiting to see what you've got cooking :) j/interface looks quite straightforward to use. Erlide, the Eclipse plugin, uses it as well.