Wednesday, June 14, 2006

More Erlang

Strange trends are taking place in the web programming world. As new languages come and go, developers are overlooking a mighty beast whose unparalleled power is $0 plus a mental barrier away: Erlang.

I mentioned Erlang in previous posts. Here's a quick recap of Erlang's history: in the early 1980s, Ericsson assembled a team of computer scientists to devise the best methods for developing scalable, fault-tolerant systems with soft real-time performance requirements. After much experimentation and development, Erlang, a functional language with built-in notions of concurrency, was born. The need for a new language was real: no existing language was suitable for solving Ericsson's problems, and when you're in the business of selling telephone switches to the world's largest telcos, you can't let a language with inadequate notions of concurrency and fault tolerance get in your way. The design decisions behind Erlang turned out to be very powerful, and this eventually gave Ericsson a solid market lead over the competition and positioned Ericsson as a dominant force in the telecom switch market.

Fortunately, the power of Erlang isn't stashed away in some grey corporate computer lab. In the 1990s, Ericsson released Erlang to the open source community, thereby giving every developer the power to build scalable distributed backends with (relative) ease.

Since its release, Erlang has been making headway in the open source world. An example of a recent convert is jabber.org, home of the Jabber Software Foundation (Jabber is the leading open IM standard, used by numerous organizations and IM providers, including Google Talk and Gizmo Project). jabber.org has recently switched its Jabber server from jabberd, which is written in C, to ejabberd, which is written in Erlang. This press release discusses jabber.org's move. jabber.org operates an instant messaging service with very high requirements for reliability and for handling large numbers of simultaneous connections (just like a telephone exchange), so it's no surprise that jabber.org chose a server written in Erlang.

I think that Erlang's strengths in the areas of concurrency, scalability and fault tolerance make it a good contender for becoming a more widely used web development language. The main reasons web developers haven't adopted Erlang in large numbers yet are, in my opinion: 1) Erlang has different semantics, which will always discourage some developers; 2) Erlang needs better PR; and 3) Erlang doesn't have an integrated web development framework like Ruby on Rails (I'm a huge Ruby on Rails fan, by the way). Efforts to build such a framework are apparently under way. Once they are mature, web developers will be able to tap into Erlang's strengths more easily, and Erlang will in turn enjoy the best kind of marketing: word of mouth.

How does Erlang achieve much greater scalability with large numbers of concurrent processes than other programming languages? Erlang processes are very lightweight -- much more so than OS processes and threads -- and the Erlang VM, BEAM, does the scheduling. BEAM is mostly event driven, and no lightweight process blocks the whole VM for very long. On multi-processor machines, BEAM launches (by default) one scheduler per processor. Erlang applications are normally designed from the ground up with concurrency in mind, so it's easy for Erlang code to take advantage of most, if not all, available processors. In a recent posting on the Erlang mailing list, Joe Armstrong described an experiment he conducted on a Sun Niagara box with 32 CPUs, in which changing a single function call from map() to pmap() made his application's performance scale with up to 16 CPUs. With upcoming BEAM improvements, additional scalability is expected. Joe gives background to the experiment here. Quote:


Erlang also maps nicely onto multi-core CPUs - why is this? - precisely because we use a non-shared lots of parallel processes model of computation. No shared memory, no threads, no locks = ease of running on a parallel CPU.

Believe me, making your favourite C++ application run really fast on a multi-core CPU is no easy job. By the time the Java/C++ gang have figured out how to throw away threads and use processes and how to structure their application into small lightweight processes they will be where we were 20 years ago.

Does this work? - yes - we are experimenting with Erlang programs on the Sun Niagara - the results are disappointing: our message passing benchmark only goes 18 times faster on 32 CPU's - but 18 is not too bad - if any C++ fans want to try the Niagara all they have to do is make sure they have a multi-threaded version of their application, debug it -'cos it probably won't work and they can compare their results with us (and I'm not holding my breath).

Turning a sequential program into a parallel program for the Niagara is really easy. Just change map/2 to pmap/2 in a few well chosen places in your program and sit back and enjoy.

Efficiency comes from a correct underlying architecture, in this case being able to actually use all the CPUs on a multi-core CPU. The ability to scale an application, to make it very efficient, to distribute it depends upon how well we can split the application up into chunks that can be evaluated in parallel. Erlang programmers have a head start here.
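
To make the map/2 to pmap/2 swap more concrete, here is a minimal sketch of what a parallel map might look like. This is an assumption for illustration only: pmap/2 is not part of Erlang's standard lists module, the module name pmap_sketch is made up, and this is not necessarily the code Joe used. It spawns one lightweight process per list element and collects the results in the original order.

```erlang
-module(pmap_sketch).
-export([pmap/2]).

%% Hypothetical parallel map: applies F to every element of L,
%% running each application in its own lightweight process.
pmap(F, L) ->
    Parent = self(),
    %% Spawn one process per element. Each process computes F(X)
    %% and sends the result back to the parent, tagged with its own pid.
    Pids = [spawn(fun() -> Parent ! {self(), F(X)} end) || X <- L],
    %% Collect the results in spawn order by selectively receiving
    %% the message tagged with each pid in turn.
    [receive {Pid, Result} -> Result end || Pid <- Pids].
```

Calling pmap_sketch:pmap(fun(X) -> X * X end, [1, 2, 3, 4]) returns [1, 4, 9, 16], the same value lists:map/2 would give. The sketch does no error handling: if F crashes in one of the spawned processes, the parent waits forever for that result, so real code would want links or monitors.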


The following graph shows the result of an experiment Joe and colleagues conducted to compare the performance of Yaws, an Erlang web server, and Apache under very high load -- in effect, a simulated DDoS attack:


[Figure: Apache vs. Yaws]

Here's Joe's explanation:


Apache (blue and green) dies when subject to a load of c. 4000 parallel sessions. Yaws (red) works well even when subject to high load.

The red curve is yaws (running on an NFS file system). The blue curve is apache (running on an NFS file system). The green curve is apache (running on a local file system).

...

Our figure shows the performance of a server when subject to parallel load. This kind of load is often generated in a so-called "Distributed denial of service attack".

Apache dies at about 4,000 parallel sessions. Yaws is still functioning at over 80,000 parallel connections.


You can read the full description of the experiment on Joe's website.

Erlang is powerful, and once it has a good web development framework, I think it will become many more developers' web language of choice. Interesting times are ahead for Erlang.

2 comments:

Yariv’s Blog » Erlang and the Next Generation Webapps said...

[...] At this point you probably know where I’m going with this: Erlang. Erlang scales to tens of millions of simultaneous processes — on one machine! (I wrote about this more here and here). If you write your backend in Erlang, the VM handles concurrency gracefully. You can keep those damn connections alive — and then set your mind free [...]

LT People » Blog Archive » Erland Web Server said...

[...] I came across some reflections by a person named Yariv Sadan on building multi-user interactive web applications. He urges people to use Erlang to build them. After reading it, I got the idea of finding out what Erlang is. Here is one quote: Erlang can scale to millions of simultaneous processes -- on one machine! (I wrote about this more here and here). If you write your backend in Erlang, the VM handles concurrency gracefully. You can keep those damn connections alive -- and then set your mind free [...]