Most programming languages implement concurrency poorly, if at all. The most common approach, in which multiple OS threads or processes share common data structures guarded by locks or semaphores, has proven too complex for most programmers to "get right." Problems such as deadlocks, unguarded access, and overly coarse locking tend to creep in even in code written by skilled programmers.
To help programmers tackle some of the difficult challenges of shared memory-based concurrency, some language designers have created numerous additional APIs to supplement or replace the existing ones. This has probably made programmers even more confused about concurrency, not less.
Concurrency has gained such a reputation for complexity that many programmers have learned to regard it as a snake pit of trouble best avoided altogether.
Update (8/16/06, 18:46PM EST): Ruby on Rails, for instance, is entirely single threaded, relying on a pool of OS processes to handle simultaneous requests (at least when deployed with FastCGI). As you will soon see, this is practically guaranteed to be less scalable than Yaws under high load.
Erlang has taken a different approach to concurrency than today's popular programming languages. Instead of implementing an ad-hoc concurrency API, Erlang's designers have built concurrency into the language. They also taught programmers to embrace concurrency rather than run away from it. That's why Erlang's Getting Started guide has a section about concurrency. It's also why every Erlang application and numerous libraries I know of -- including the ones created by the open source community -- use concurrency extensively. Check out Yaws (internals), ejabberd, OpenPoker and (parts of) Jungerl for some great examples.
What makes Erlang's approach to concurrency different? In Erlang, processes have no shared memory. To communicate, Erlang processes send messages to each other. A message can be any Erlang term -- tuples, lists, records, even functions. Sending a message to a process is simple:
Pid ! Message
where Pid is the process ID returned by the spawn function, and Message is the Erlang term you want to send. Spawning a new process is also very easy. Here's an example:
Pid = spawn(fun() -> io:format("hi!", []) end)
which calls the io:format function in a parallel process.
Erlang's pattern matching semantics make processing incoming messages very convenient. Here's an example:
receive
    {hello, Text} ->
        io:format("Got hello message: ~s", [Text]);
    {goodbye, Text} ->
        io:format("Got goodbye message: ~s", [Text])
end
That's it. Simple, huh?
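To put the pieces together, here's a minimal, self-contained sketch -- the module name greeter and the messages are made up for illustration -- with one process looping over a receive block while another spawns it and sends it messages:

-module(greeter).
-export([start/0]).

%% Spawn the receiving process and send it a couple of messages.
start() ->
    Pid = spawn(fun loop/0),
    Pid ! {hello, "world"},
    Pid ! {goodbye, "world"},
    ok.

%% Wait for a message, print it, then loop to wait for the next one.
loop() ->
    receive
        {hello, Text} ->
            io:format("Got hello message: ~s~n", [Text]),
            loop();
        {goodbye, Text} ->
            io:format("Got goodbye message: ~s~n", [Text]),
            loop()
    end.

Calling greeter:start() prints both greetings, and the spawned process then keeps waiting for more messages.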
The simplicity of Erlang code is partly what makes Erlang such a great language: as Paul Graham said, succinctness is power.
Interestingly, Erlang's dynamic typing cuts down the effort required for sending and receiving messages, because Erlang doesn't require you to define types for different messages before sending them. In Erlang, types -- or 'tags' -- are denoted by convention as the first element of a tuple, e.g. {person, Bob}. In other words, constructing a message and 'defining' its type are one and the same act.
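For instance, you can invent a new kind of message on the spot just by sending it -- the person and temperature tags below are made up for illustration:

Pid ! {person, "Bob"},        %% a tagged tuple, no prior declaration needed
Pid ! {temperature, 21.5},    %% a different "type" of message to the same process
Pid ! [1, 2, 3]               %% any term at all can be a message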
At this point, fans of static languages would bemoan the lack of compiler checks in dynamic languages, including Erlang. Well, they should know that Erlang has a static analysis tool, Dialyzer, that catches many type-related errors. In addition, when you code in Erlang, you always have a shell running in a different window so you can test your code as you're writing it. It's very hard for type-related errors to sneak into your code when you program this way.
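A typical edit-compile-test loop in the shell looks something like this (using the illustrative greeter module from above; the exact output and its ordering may differ, since the printing happens in a separate process):

1> c(greeter).
{ok,greeter}
2> greeter:start().
Got hello message: world
Got goodbye message: world
ok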
Update (8/17/06, 12:39AM): Also, if Erlang were statically typed, we wouldn't have Smerl! :)
One of the greatest aspects of Erlang's concurrency is the way it's implemented behind the scenes. Erlang code runs in a virtual machine called BEAM, which is responsible for spawning, scheduling, and cleaning up Erlang processes, as well as for passing messages between them. Erlang processes are much more lightweight than OS threads, which means that you can potentially spawn millions of processes on a single box. Try that with a pthreads-based library and your machine would croak after the first 4,000-10,000 threads. Erlang's lightweight processes are the reason for this graph, which compares the number of simultaneous connections Yaws and Apache can each handle.
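To get a feel for how cheap those processes are, here's a rough sketch (the module name and the count are arbitrary, and this is an illustration, not a benchmark) that spawns a hundred thousand processes, each waiting for a message:

-module(many).
-export([start/1]).

%% Spawn N processes that each wait for a stop message, then tell them all to exit.
start(N) ->
    Pids = [spawn(fun() -> receive stop -> ok end end) || _ <- lists:seq(1, N)],
    io:format("spawned ~p processes~n", [length(Pids)]),
    [Pid ! stop || Pid <- Pids],
    ok.

Running many:start(100000) in the shell completes in a blink on an ordinary machine.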
It's very important for a language to get concurrency right from the beginning. I occasionally hear about attempts to add Erlang-style concurrency to other languages. However, even if those attempts are successful, it would take years for programmers to embrace them and for open source libraries and applications to use them. That's why Joe Armstrong is right when he says that Erlang programmers have a head start. I have quoted these words before, but I think they are worth quoting again in this context:
Erlang also maps nicely onto multi-core CPUs - why is this? - precisely because we use a non-shared lots of parallel processes model of computation. No shared memory, no threads, no locks = ease of running on a parallel CPU.
Believe me, making your favourite C++ application run really fast on a multi-core CPU is no easy job. By the time the Java/C++ gang have figured out how to throw away threads and use processes and how to structure their application into small lightweight processes they will be where we were 20 years ago.
Does this work? - yes - we are experimenting with Erlang programs on the Sun Niagara - the results are disappointing: our message passing benchmark only goes 18 times faster on 32 CPUs - but 18 is not too bad - if any C++ fans want to try the Niagara all they have to do is make sure they have a multi-threaded version of their application, debug it - 'cos it probably won't work - and they can compare their results with us (and I'm not holding my breath).
Turning a sequential program into a parallel program for the Niagara is really easy. Just change map/2 to pmap/2 in a few well-chosen places in your program and sit back and enjoy.
Efficiency comes from a correct underlying architecture, in this case being able to actually use all the CPUs on a multi-core CPU. The ability to scale an application, to make it very efficient, to distribute it, depends upon how well we can split the application up into chunks that can be evaluated in parallel. Erlang programmers have a head start here.
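pmap/2 isn't part of the standard lists module, but it's only a few lines to write. Here's a minimal sketch (the module name plists is just for illustration) that spawns one process per list element and collects the results in the original order:

-module(plists).
-export([pmap/2]).

%% Apply F to each element of List in its own process,
%% then gather the results in the original order.
pmap(F, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn(fun() -> Parent ! {Ref, F(Elem)} end),
                Ref
            end || Elem <- List],
    [receive {Ref, Result} -> Result end || Ref <- Refs].

For example, plists:pmap(fun(X) -> X * X end, [1, 2, 3]) returns [1, 4, 9], with each square computed in a separate process.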
Well, I hope you find this interesting, and maybe even useful :) For additional reading, check out this well-written defmacro.org article about Erlang-style concurrency. If you haven't used Erlang, I recommend downloading it and following the Getting Started guide. This will give you the flavor of good concurrent programming with Erlang.
Update (8/17/06): For a great discussion about concurrent programming in Erlang vs. other languages, read this mailing list posting by Richard O'Keefe. It's pure gold.