Yariv's Blog: December 2006

I don't usually write responses to other articles because, frankly, I think people spend too much time writing articles and too little time writing open source Erlang code that I can use without paying those do-gooder souls a dime for their efforts :) However, this one -- "No Silver Bullet" and Functional Programming -- piqued my curiosity because it discusses functional programming in a positive light and it mentions Erlang a bunch of times. Clearly, the author has gotten *something* right :)

I actually think the author did a pretty good job at explaining some of the benefits of functional programming. The main issue I have with the article is that, as past in other articles I've seen, it puts too much weight on conciseness as an indicator of expressiveness. Although conciseness often contributes to expressiveness, conciseness isn't the only measure of expressiveness. It it were, the commenter who wrote that Python generally has a similar conciseness multiplier to C++ as Haskell, debunking this shootout's conclusion that functional languages are "better" (i.e. more expressive), would have a strong argument.

Both the author and the commenter are making valid points, but I think they are overlooking an aspect of language expressiveness that's at least as important, if not more important, for code quality as conciseness: readability. The reason is pretty obvious: code communicates a solution to a problem not just to computers, but also to humans. If code written in a given language is unduly hard to understand, its conciseness doesn't hold much value because the pain of debugging and maintaining it outweights the ease of writing it. I think it's safe to say that most programmers would prefer to work with a 1500 line solution that's highly readable than a 1000 line solution that's hard to comprehend.

So how does this relate to the functional vs. imperative programming debate? Although functional languages (I'm talking primarily about Erlang because that's the functional language I know best) don't always trump OO/imperative ones in conciseness, their code does generally wins hands down in readability.

This is especially true when comparing functional languages to dynamically-typed OO languages, whose code for anything beyond short scripts is borderline unreadable IMO. Dynamically typed OO languages trade a large degree of readability for conciseness, and that's why Java code, as bloated as it often is, is often more readable than Python and Ruby code. The reason is in the fundamental design (flaw?) of OO languages: they encourage the programmer to bind functions to data (they name the child of this unhealthy marriage an 'object' to make us think this chimera models something from the "real world" :) ), and then use data objects to indirectly invoke bound (virtual) functions. If the type of an object isn't evident in the source code, it's difficult to figure out what functions are called when you read statements such as "obj.doSomething(param)". Due to the abundance of indirect function calls in OO code, dynamically typed OO languages require extra discipline by the programmer to carefully document the types of all variables in his/her code so that other people have a chance to understand what functions are being called.

In my relatively brief encounter with Python, I ran into a bunch of code containing seemingly innocent idioms such as the following (I apologize in advance for Python syntax errors):


respond(channel, msg):
    if channel.isValid() and msg.body().equals("next"):
      return channel.send("A2-B3")
    else:
      return channel.send("bye")

Because I has no idea what the types of 'channel' and 'msg' are (the code was poorly documented), what such snippets did was a veritable mystery. After wallowing for hours at a time in such nebulae, trying arduously to trace back to the instantiation points of mysterious parameters, where I would hope to find the golden nuggets of information indicating what types their variables are holding, my frustration would reach such uncomfortable levels that I wouldn't know anymore whether to feel angry or deeply depressed.

When you read Erlang code -- even with scant documentation -- you don't normally have go through such troubles. Although Erlang is dynamically typed, Erlang code avoids such readability black holes because it doesn't throw so much type information out the window. In Erlang, the above snippet would be written as follows:


respond(Channel, Msg) ->
  case Channel#chess_channel.is_valid && Msg#chess_msg.body == next of
    true -> chess_channel:send(Channel, {move, a2, b3});
    false -> chess_channel:send(Channel, bye)
  end

The Erlang code is less concise, but it's also more readable. (It's also more optimized because it doesn't require the resolution of function pointers in runtime.) In a large code base, this added readability wins over conciseness because it can make difference between providing continuous service and begging your users to come back in a few more hours as you're chasing the mysterious bug that has taken your system offline. It also helps you develop new features faster because you can spend less time debugging and more time coding.

At this point, you may be thinking, "The extra type information in Erlang code has a cost because it sacrifices generality." If you need to write generic code, you can use Erlang's remote function invocation as follows:


respond(ChannelType, Channel, MsgType, Msg) ->
  case ChannelType:is_valid(Channel) && MsgType:body(Msg) == "ok" of
    true -> ChannelType:send(Channel, {move, a2, b3});
    false -> ChannelType:send(Channel, bye)
  end.

This example is admittedly silly, but as you can see, Erlang lets you parameterize module names (and function names) in generic code. This capability is often very useful. Even with remote invocation, the vast majority of Erlang code I've read contains enough type information to be much more readable than imperative/OO code.

In addition to the general absence of mystery functions (for a counter-example, check out ErlyDB :) ), Erlang, and other functional languages such as Haskell, have two language features that make them more readable than imperative languages: pattern-matching and immutability.

How do these features enhance readability and therefore expressiveness? Pattern-matching is easy -- just compare the following snippets:

Erlang:


foo({bar, [baz, boing], "18"}) -> ok;
foo(_) -> error.

Imperative, dynamic pseudo-language:


foo(arr) {
  if (arr instanceof Array &&
       arr.length == 3 &&
       arr[0] == 'bar' &&
       arr[1] instanceof List &&
       arr[1].size() == 2 &&
       arr[1].element(0) == 'baz' &&
       arr[1].element(1) == 'boing') &&
       arr[2] instanceof String &&
       arr[2].equals("18"))
    return 'ok';
  else
    return 'error';
}

Update (12/12/06): If you read Anders' comment, you'll see that code such as above can be written more succinctly in Python. Please take this example as an illustration of how Erlang pattern matching engine works rather than as a suggestion that there's no better way of writing such code in any imperative language.

I hope this example makes the benefits of pattern-matching obvious :) So, let's look single-assignment, which is another functional programming feature that I have learned to appreciate as an essential contributor to code quality. In Erlang, when you bind a value to a variable, this binding holds for the life of the variable. For instance, in Erlang, if you wrote code such as


foo() ->
  X = 1,
  X = 2.

the second expression would throw an error. If you're used to Erlang, the reason is quite natural: the first line states that X equals 1, and therefore the following line, stating that X equals 2, is wrong. To someone who isn't used to functional programming, the benefits of this behavior may not be obvious -- it may even seem like a burdensome restriction (I used to think so too). However, over time I learned that single-assignment often makes for drastically more readable code. For example, consider this snippet in an imperative/OO/dynamic language:

out(name, paradigm) {
var l = new Language(name, paradigm);

// much code below
...
l = bar(l);

// much code below
...
return l.getName() + "/" + l.getParadigm();
}

Now answer the following question: what does 'out("Ruby", "imperative")' return? Clearly, you have no way of knowing. In fact, even reading the all the code for the 'foo' function won't help you much -- you'd have to read the code for 'bar' (and any other function that take 'l' as a parameter) in order to have a better clue. Sadly, all that reading still wouldn't guarantee anything if the value of 'l' changes during execution based on some IO input. And that's not the end of it: to make things worse, your life would be even much more miserable if the author of the code decided to venture into the dangerous terrain of multi-threaded programming. If 'l' is shared between different threads, your code comprehension efforts would be that much closer to hopeless.

If this code were written in Erlang, the answer would be simple: the function 'foo' returns a string of value "Ruby/imperative". In small code snippets, such as the ones used in most language comparisons, the readability benefits of single-assignment may not be obvious. However, in a large-scale production systems with high availability requirements written by large teams of developers, the ability to answer questions about unfamiliar code segments is essential for both readability and (automated and non-automated) debugging. Erlang was designed to build such systems, so it makes sense that Erlang shuns mutable variables. (I imagine Ericsson would be in a rather uncomfortable position if a portion of England's phone network went offline because some programmer thought to himself, "the variable 'l' holds an object of type 'Language'!" :) )

So where does all this lead us? Is there a precise way of measuring language expressiveness that takes into account both conciseness and readability? Well, after thinking about this stuff for a while, I arrived at an elegant equation for arriving at an objective quantitative measure of a language's expressiveness. Without further ado, here is my equation for the General Theory of Programming Language Expressiveness:

E = C*R^2

where

E is expressiveness
C is conciseness
R is readability

Now I just need to figure out how to factor in the speed of light. If I succeed, it would undoubtedly pave the way for the Nobel :)

Well, I hope I was able to shed some light on why readability is at least as important as conciseness when evaluating language expressiveness, and also why code written in functional languages (primarily Erlang) enjoys greater readability -- and often, conciseness -- than code written in imperative/OO languages. Now consider this: even if imperative/OO languages were just as concise and readable as Erlang, Erlang code would nonetheless have a higher average quality. Sounds bizzare? Maybe, but it's true :) The reason is that no matter how good a language is, bugs always creep into non-trivial systems, and Erlang is the only language I know that has truly effective mechanisms for dealing with defects that do affect live systems. The idea behind Erlang's approach to fault-tolerance is actually quite simple: a crash in one process doesn't bring down the whole system, and furthermore it's detected by a supervising process that's configured with rules telling it what action to take (e.g. restart the process) when a crash does occur. When you've fixed that pesky bug that has been causing intermittent (yet non-catastrophic!) crashes, you can hot swap the new code into the live system without taking if offline. Due to Erlang's fault tolerant design, 1000 lines of Erlang code with 7 bugs are in a sense "better" than 1000 lines of Java doing exactly the same thing and containing an equal number of bugs. Unfortunately, comparisons that count only line numbers don't show this side of the story.

If none of this convinces you that functional languages are worth using, maybe this tip the paradigm balance in your mind: I've never met a programmer who's versed in both functional programming and OO/imperative programming and who prefers the latter.

But maybe I just need to get more friends :)

Yariv's Blog

Saturday, December 30, 2006

New Year's Brain Teaser

Tuesday, December 19, 2006

ErlyWeb is Undocumented No More!

Tuesday, December 12, 2006

ErlyWeb Tutorial by Brian Olsen, Part 2

Monday, December 11, 2006

A General Theory of Programming Language Relativity

Tuesday, December 05, 2006

ErlyWeb Tutorial by Brian Olsen