Sunday, March 09, 2008

In Response to "What Sucks About Erlang"

Damien Katz's latest blog post lists some ways in which he thinks Erlang sucks. I agree with some of these points but not with all of them. Below are my responses to some of his complaints:

1. Basic Syntax

I've heard many people express their dislike for the Erlang syntax. I found the syntax a bit weird when I started using it, but once I got used to it, it hasn't bothered me much. Sometimes I mess up and use the wrong expression terminator, and sometimes things break when I cut and paste code between function clauses, but it hasn't been a real pain point for me. I understand where the complaints are coming from, but IMHO it's a minor issue.

Since the release of LFE last week, if you don't like the Erlang syntax, you can write Erlang code using Lisp Syntax, with full support for Lisp macros. If you prefer Lisp syntax to Erlang syntax, you have a choice.

2. 'if' expressions

The first issue is that in an 'if' expression, the condition has to match one of the clauses, or an exception is thrown. This means you can't write simple code like


if Logging -> log("something") end


and instead you have to write


if Logging -> log("something"); true -> ok end


This requirement may seem annoying, but it is there for a good reason. In Erlang, all expressions (except for 'exit()') must return a value. You should always be able to write

A = foo();

and expect A to be bound to a value. There is no "null" value in Erlang (the 'undefined' atom usually takes its place).
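A minimal sketch of the behavior described above. The module and function names (if_demo, strict/1, total/1) are made up for illustration. When no clause of an 'if' matches, Erlang raises an if_clause error rather than returning nothing:

```erlang
-module(if_demo).
-export([strict/1, total/1]).

%% No clause matches when Logging is false, so this 'if' raises
%% an if_clause error instead of returning a value.
strict(Logging) ->
    if Logging -> logged end.

%% The catch-all 'true' clause guarantees that a value is returned.
total(Logging) ->
    if Logging -> logged; true -> ok end.
```

Calling if_demo:strict(false) fails with if_clause, while if_demo:total(false) returns ok.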

Fortunately, Erlang lets you get around this issue with a one-line macro:


-define(my_if(Predicate, Expression), if Predicate -> Expression; true -> undefined end).


Then you can use it as follows:


?my_if(Logging, log("something"))


It's not that bad, is it?
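For completeness, here is the macro dropped into a small module. This is a sketch: the module name my_if_demo and the function maybe_log/1 are assumptions, and io:format/1 stands in for the log/1 call used above.

```erlang
-module(my_if_demo).
-export([maybe_log/1]).

%% Single-clause 'if' that falls back to the atom 'undefined'.
-define(my_if(Predicate, Expression),
        if Predicate -> Expression; true -> undefined end).

%% Returns ok (the result of io:format/1) when Logging is true,
%% and undefined otherwise.
maybe_log(Logging) ->
    ?my_if(Logging, io:format("something~n")).
```

Because the macro expands to an 'if', the predicate still has to be a valid guard expression.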

This solution does have a shortcoming, though, which is that it only works for a single-clause 'if' expression. If it has multiple clauses, you're back where you started. That's where you should take a second look at LFE :)

The second issue about 'if' expressions is that you can't call any user-defined function in the conditional, just a subset of the Erlang BIFs. You can get around this limitation by using 'case', but again you have to provide a 'catch all' clause. For a single clause, you can simply change the macro I created to use a case expression.


-define(my_case(Predicate, Expression), case Predicate of true -> Expression; _ -> undefined end).


For an arbitrary number of clauses, a macro won't help, and this is something you'll just have to live with. If it really bothers you, use LFE.
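A short sketch of the 'case' workaround written out by hand. The predicate is_small/1 and the module name case_demo are hypothetical. Writing `if is_small(N) -> ...` would be rejected by the compiler because local function calls are not allowed in guards, whereas 'case' evaluates the call first and then matches on the result:

```erlang
-module(case_demo).
-export([describe/1]).

%% Hypothetical user-defined predicate; it cannot appear in an 'if' guard.
is_small(N) -> N < 10.

%% 'case' evaluates is_small(N) and matches on its result.
%% The '_' clause is the catch-all that an unmatched value would need.
describe(N) ->
    case is_small(N) of
        true -> small;
        _ -> undefined
    end.
```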

3. Strings

The perennial complaint against Erlang is that it "sucks" for strings. Erlang represents strings as lists of integers, and for some reason many people are convinced that this is tantamount to suckage.


...you can't distinguish easily at runtime between a string and a list, and especially between a string and a list of integers.


A string *is* a list of integers -- why should we not represent it as such? If you care about the type of list you're dealing with, you should embed it in a tuple with a type description, e.g.


{string, "dog"},
{instruments, [guitar, bass, drums]}


But if you don't care what the type is, representing a string as a list makes a lot of sense because it lets you leverage all the tools Erlang has for working with lists.
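To make the point concrete, here is a small sketch (module and function names are made up for illustration). A string literal is exactly the list of its character codes, so ordinary list functions apply to it. The io_lib:printable_list/1 check also illustrates the complaint quoted above: a list of small integers is indistinguishable from a string at runtime.

```erlang
-module(string_list_demo).
-export([same/0, reversed/0, looks_like_string/1]).

%% "dog" is the list of the character codes for d, o and g.
same() -> "dog" =:= [100, 111, 103].

%% Any list function works on strings, e.g. lists:reverse/1.
reversed() -> lists:reverse("dog").

%% True for any list of printable character codes, whether or not
%% the caller thinks of it as a string.
looks_like_string(List) -> io_lib:printable_list(List).
```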

A real issue with using lists is that it's a memory-hungry data structure, especially on 64 bit machines, where you need 128 bits = 16 bytes to store each character. If your application processes such massive amounts of string data that this becomes a bottleneck, you can always use binaries. In fact, you should always use binaries for "static" strings on which you don't need to do character-level manipulation in code. ErlTL, for example, compiles all static template data as binaries to save memory.
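A rough sketch of the size difference, under stated assumptions. erts_debug:flat_size/1 is an internal, largely undocumented helper that reports a term's size in machine words, and it is used here only to peek at the cost of the list form. The module name binary_demo is made up.

```erlang
-module(binary_demo).
-export([sizes/0]).

%% Returns {CharCount, BinaryBytes, ListWords} for the same text.
sizes() ->
    Str = "hello",
    Bin = list_to_binary(Str),
    %% Each list cell holds a character and a pointer to the next cell,
    %% so the list needs two words per character.
    ListWords = erts_debug:flat_size(Str),
    {length(Str), byte_size(Bin), ListWords}.
```

The binary stores one byte per character here, while the list needs two words (16 bytes on a 64-bit machine) for each one.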


Erlang string operations are just not as simple or easy as most languages with integrated string types. I personally wouldn't pick Erlang for most front-end web application work. I'd probably choose PHP or Python, or some other scripting language with integrated string handling.


I disagree with this statement, but instead of rebutting it directly, I'll suggest a new kind of Erlang challenge: build a webapp in Erlang and show me how string handling was a problem for you. I've heard a number of people say that Erlang's string handling is a hindrance in building webapps, but in my experience this simply isn't true. If you ran into real problems with strings when building a webapp, I would be very interested in hearing about them, but otherwise it's a waste of time hypothesizing about problems that don't exist.

4. Functional Programming Mismatch

The issue here is that Erlang's variable immutability supposedly makes writing test code difficult.


Immutable variables in Erlang are hard to deal with when you have code that tends to change a lot, like user application code, where you are often performing a bunch of arbitrary steps that need to be changed as needs evolve.

In C, lets say you have some code:

int f(int x) {
x = foo(x);
x = bar(x);
return baz(x);
}

And you want to add a new step in the function:

int f(int x) {
x = foo(x);
x = fab(x);
x = bar(x);
return baz(x);
}

Only one line needs editing,

Consider the Erlang equivalent:

f(X) ->
X1 = foo(X),
X2 = bar(X1),
baz(X2).

Now you want to add a new step, which requires editing every variable thereafter:

f(X) ->
X1 = foo(X),
X2 = fab(X1),
X3 = bar(X2),
baz(X3).



This is an issue that I ran into in a couple of places, and I agree that it can be annoying. However, discussing this consequence of immutability without mentioning its benefits is missing a big part of the picture. I really think that immutability is one of Erlang's best traits. Immutability makes code much more readable and easy to debug. For a trivial example, consider this Javascript code:


function test() {
var a = {foo: 1, bar: 2};
baz(a);
return a.foo;
}


What does the function return? We have no idea. To answer this question, we have to read the code for baz() and recursively descend into all the functions that baz() calls with 'a' as a parameter. Even running the code doesn't help because it's possible that baz() only modifies 'a' based on some unpredictable event such as some user input.

Consider the Erlang version:


test() ->
A = [{foo, 1}, {bar, 2}],
baz(A),
proplists:get_value(foo, A).


Because of variable immutability, we know that this function returns '1'.
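Here is a runnable sketch of the same idea with a deliberately "hostile" baz/1. The body of baz/1 is an assumption made up for illustration. Whatever baz/1 does, it can only build and return new data; the list bound to A in the caller stays as it was:

```erlang
-module(immutable_demo).
-export([test/0]).

%% Hypothetical baz/1: it returns a new list with a different 'foo'.
%% It has no way to alter the list the caller bound to A.
baz(Props) ->
    [{foo, 99} | proplists:delete(foo, Props)].

test() ->
    A = [{foo, 1}, {bar, 2}],
    B = baz(A),
    %% A still maps foo to 1; only the new list B maps it to 99.
    {proplists:get_value(foo, A), proplists:get_value(foo, B)}.
```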

I think that the guarantee that a variable's value will never change after it's bound is a great language feature and it far outweighs the disadvantage of having to use unique variable names in functions that do a series of modifications to some data.

If you're writing code like in Damien's example and you want to be able to insert lines without changing a bunch of variable names, I have a tip: increment by 10. This will prevent the big cascading variable renamings in most situations. Instead of the original code, write


f(X) ->
X10 = foo(X),
X20 = bar(X10),
baz(X20).


then change it as follows when inserting a new line in the middle:


f(X) ->
X10 = foo(X),
X15 = fab(X10),
X20 = bar(X15),
baz(X20).


Yes, I know, it's not exactly beautiful, but in the rare cases where you need it, it's a useful trick.
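A different way to sidestep the renaming, shown only as a sketch and not something the original example uses, is to put the steps in a list and fold over them. The bodies of foo/1, fab/1, bar/1 and baz/1 below are hypothetical stand-ins.

```erlang
-module(pipeline_demo).
-export([f/1]).

%% Hypothetical stand-ins for the steps in the example.
foo(X) -> X + 1.
fab(X) -> X * 2.
bar(X) -> X - 3.
baz(X) -> X * X.

%% Each step receives the previous step's result. Inserting a new
%% step means adding one element to the list, with no renaming.
f(X) ->
    Steps = [fun foo/1, fun fab/1, fun bar/1, fun baz/1],
    lists:foldl(fun(Step, Acc) -> Step(Acc) end, X, Steps).
```

This only fits when every step takes one argument and returns the next step's input, which is the shape of the example here.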

This issue could be rephrased as a complaint against imperative languages: "I don't know if the function to which I pass my variable will change it! It's too hard to track down all the places in the code where my data could get mangled!" This may sound outlandish especially if you haven't coded in Erlang or Haskell, but that's how I really feel sometimes when I go back from Erlang to an imperative language.


Erlang wasn't a good match for tests and for the same reasons I don't think it's a good match for front-end web applications.


I don't understand this argument. Webapps need to be tested just like any other application. I don't see where the distinction lies.

5. Records

Many people hate records and on this topic I fully agree with Damien. I think the OTP team should just integrate Recless into the language and thereby solve most of the issues people have with records.

If you really hate records, another option is to use LFE, which automatically generates getters and setters for record properties.

Incidentally, if you use ErlyWeb with ErlyDB, you probably won't use records at all and you won't run into these annoyances. ErlyDB generates functions for accessing object properties which is much nicer than using the record syntax. ErlyDB also lets you access properties dynamically, which records don't allow, e.g.


P = person:new_with([{name, "Paul"}]),
Fields = person:db_field_names(),
[io:format("~p: ~p~n", [Field, person:Field(P)]) || Field <- Fields]


Records are ugly, but if you're creating an ErlyWeb app, you probably won't need them. If they do cause you a great deal of pain, you can go ahead and help me finish Recless and then bug the OTP team to integrate it into the language :)
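For readers who haven't used records, this minimal sketch shows the syntax people complain about. The person record and the module name are made up for illustration. The record name has to be repeated on every access and update, because records are resolved at compile time:

```erlang
-module(record_demo).
-export([rename/0]).

-record(person, {name, age}).

rename() ->
    P = #person{name = "Paul"},
    %% Updating a field creates a new record; P itself is unchanged.
    P2 = P#person{name = "Peter"},
    %% Fields that were never set default to the atom 'undefined'.
    {P#person.name, P2#person.name, P#person.age}.
```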

6. Code organization


Every time you need to create something resembling a class (like an OTP generic process), you have to create whole Erlang file module, which means a whole new source file with a copyright banner plus the Erlang cruft at the top of each source file, and then it must be added to build system and source control. The extra file creation artificially spreads out the code over the file system, making things harder to follow.


I think this issue occurs in many programming languages, and I don't think Erlang is the biggest offender here. Unlike Java, for instance, Erlang doesn't restrict you to defining a single data type per module. And Ruby (especially Rails) applications are also known for having multitudes of small files. In Erlang, you indeed have to create a module per gen-server and the other 'behaviors' but depending on the application this may not be an issue. However, I don't think there's anything wrong with keeping different gen-servers in different modules. It should make the code more organized, not less.
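To give a sense of how much a module-per-gen-server costs, here is a minimal sketch of one. The counter_server module and its API are hypothetical. It assumes a reasonably recent OTP release, where callbacks such as handle_info/2 and terminate/2 are optional.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Client API
-export([start_link/0, increment/1, value/1]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2]).

start_link() -> gen_server:start_link(?MODULE, 0, []).
increment(Pid) -> gen_server:cast(Pid, increment).
value(Pid) -> gen_server:call(Pid, value).

%% The server state is just the current count.
init(Count) -> {ok, Count}.

handle_call(value, _From, Count) -> {reply, Count, Count}.

handle_cast(increment, Count) -> {noreply, Count + 1}.
```

The client API and the callbacks live together in one file, so everything about the server can be read in one place.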

7. Uneven Libraries and Documentation

I haven't had a problem with most libraries, and in cases where they do have big shortcomings you can often find a good 3rd party tool. The documentation is a pain to browse and search, but gotapi.com makes some of this pain go away.


Summary

Is Erlang perfect? Certainly not. But sometimes people exaggerate Erlang's problems or they don't address the full picture.

Here are some suggestions I have for improving Erlang:

- Add a Recless-like functionality to make working with records less painful.
- Improve the online documentation by making it easier to browse and search.
- Make some of the string APIs (especially the regexp library) also work with binaries and iolists.
- Add support for overloading macros, just like functions.
- Add support for Smerl-style function inheritance between modules.

Like any language, Erlang has some warts. But if it were perfect, it would be boring, wouldn't it? :)

28 comments:

she said...

"if you don’t like the Erlang syntax, you can write Erlang code using Lisp Syntax"

Like ... talking with Beelzebub instead of Lucifer hehe ;)

Not sure which syntax sucks more :P

baxter said...

You might already be aware but Steve Vinoski posted an interesting response to #4, http://steve.vinoski.net/blog/2008/03/10/damien-katz-criticizes-erlang/

Johan Tibell said...

Strings are *not* lists of integers. Strings (containing Unicode code points) can be *encoded as* a list of integers (bytes really) using some encoding e.g. UTF-8.

Joel wrote an introduction to the subject:
http://www.joelonsoftware.com/articles/Unicode.html

Steve Cooper said...

**ifs**

The 'if' syntax does look horrible, because there seems to be no distinction between an if-block and an if-expression.

Most languages provide this;

C-style if (cond) { stmt; } vs (cond ? trueval : falseval)
lisp: (when cond work) vs (if cond true-val false-val)

And not being able to call functions in the condition evaluator? Yuk.

**strings as integers:**

Johan's right -- strings aren't integer lists. They are an abstract concept of a sequence of characters. You could encode them as lists of integers, bit arrays,

Or, looking at it another way, absolutely all data is a list of bytes, but that doesn't mean the only data type should be byte*.

Tomasz Wegrzanowski said...

Strings are about as much "lists of integers" as integers are "lists of digits". That might be how they're implemented, and you can work with them like that, but it's such a massive pain nobody wants to.

This was bad enough with ISO 8859, but with Unicode, combining characters, normalized forms and so on it's just completely unworkable.

One more problem with Erlang is that it doesn't seem to provide any way of creating decently encapsulated objects, which would let you create your own string type even if the language didn't support it out of the box.

josh said...

> Unlike Java, for instance, Erlang doesn’t restrict you to defining a single data type per module.

Nice article, but why are you crackpots always saying something like this? This only applies to _public_ classes/interfaces, which is vastly different. :| Using the default/package level access specifier (nothing) is much better and easier.

Rieux said...

@Tomasz:

> One more problem with Erlang is that it doesn’t seem to provide any way of creating decently encapsulated objects, which would let you create your own string type even if the language didn’t support it out of the box.

You can do encapsulation in Erlang using separate processes which keep their internal state private and are accessible only via message passing. Or you can encapsulate using closures, which subsume objects. In a sense we've gone full circle from the Actor Model to OO and Scheme and then to Erlang, in which you encode OO-style encapsulation using actors (processes) or closures.

Vasili Sviridov said...

f(X) ->
X10 = foo(X),
X15 = fab(X10),
X20 = bar(X15),
baz(X20).

That reminded me of Basic on my Speccie...

10 PRINT "Hello, World!"
20 GOTO 10

RUN 10

*tear*

Otherwise it's a pretty interesting article :)

David Mercer said...

Johan Tibell writes:
> Strings are *not* lists of integers. Strings (containing Unicode
> code points) can be *encoded as* a list of integers (bytes
> really) using some encoding e.g. UTF-8.
>
> Joel wrote an introduction to the subject:
> http://www.joelonsoftware.com/articles/Unicode.html

Joel’s article, however, contradicts Johan's description. Joel:

> OK, so say we have a string:
>
> Hello
>
> which, in Unicode, corresponds to these five code points:
>
> U+0048 U+0065 U+006C U+006C U+006F.
>
> Just a bunch of code points. Numbers, really.

So, it seems, strings are just a list of numbers.

I think the complaint should be not that Erlang represents strings as a list of numbers—I can think of no other way to represent them—but rather that some people represent strings as a list of the byte values used to encode the string rather than a list of Unicode code point values.

redundancy chairman of redundancy said...

"There is no “null” value in Erlang (the ‘undefined’ atom usually takes its place). The ‘undefined’ atom is usually used in its place."

rvirding said...

she writes:

>“if you don’t like the Erlang syntax, you can write Erlang code >using Lisp Syntax”
>
>Like … talking with Belzeebub instead of Lucifer hehe ;)
>
>Not sure which syntax sucks more :P

No it's not. It's more like talking to God whom you think is disguised as Beelzebub or Lucifer, but then you discover it is not a disguise but a manifestation of the pure light! :-)

&y. said...

What about the issue that arises when the VM can't get memory? It seems odd that you left this out of your response...

partdavid said...

(@ &y. - actually, i couldn't reproduce that problem--I get an exception when doing as Damien invites and trying it in my erlang shell ("a system limit has been exceeded"). I was advised in order to reproduce, I should "get rid of my limits" (ulimits? not sure) but it seems to me that this is why the limits are there (if this is even an issue)--to provide the soft failure Damien wants. I suspect this criticism, and the criticism about the heartbeat process, may not be universal... if you have information about how to reproduce, that'd be great).

Unfortunately (and I used to be a pretty big booster of the "strings are lists of integers" concept) strings are not correctly representable by lists of integers. Or more precisely, a single character is not representable as a single integer.

Or, maybe saying it in another way, you can correctly represent a string as a list of integers, but then it wouldn't be a list of characters. Remember that some characters in unicode can be represented in either a composed or decomposed form, and that these forms can be canonical or compatible, etc. etc. And some characters can't be represented in a composed form at all.

So while I like using list strings for "normal" or "lite" strings ("lite" in the sense of processing power), I think something like Starling, an Erlang-ICU (unicode string) binding is going to be needed (and also have the effect of making string and regular expression processing fast and correct).

http://code.google.com/p/starling/

partdavid said...

Oh, undefined may sometimes take its place, because it's the default default, so to speak, for records; but it's hardly universal (false, none and [] come to mind as things you might get from various key lookup modules when no value exists) and I for one would not want Erlang making up a case return value.

TruePath said...

Ironically while Damien's post left me thinking that Erlang had some foibles yours really convinced me it's not a language I ever want to learn. It might have some good ideas that ought to be tried elsewhere but ughh. Though to be fair this is not your fault, you just explained the nastiness in more detail.

However, several of your defenses of Erlang simply don't fly. Let's start with your defense of static variables. You argue that Erlang's static variables improve readability as they don't require tracing through every function called with those variables. Now I think you are technically incorrect since you can muck with process dictionaries to destroy this property but I'm not sure so let's just say everyone plays nicely and by convention ensure that those values aren't changed in the call to baz. Is this a good thing?

In general from a programmer's point of view, no! The entire reason you are writing a piece of code is because you want to transform known starting values into other values that don't have a simple explicit description. I mean your example here is very revealing. Notice that no one would ever write the code in the fashion you did because they know that the long proplist bit could have just been typed as 1. Function calls in ruby, javascript, C++ and other languages can accomplish substantially more than Erlang function calls *because* they can modify state. In order for your argument to be compelling you would need to compare an Erlang idiom that captured the same power of mutable state function calls and examine the readability of that.

Of course I do realize there are serious advantages to having really immutable state in terms of language theory, compiler tricks, etc. However, I'm genuinely unsure what benefits accrue when you just mostly have immutable state.

----

Code organization: I think you miss the point here. The argument is not that Erlang is bad because it leaves too many files on disk. No one is worried about running out of disk space. The problem is presumably the interruption of work flow and the mental effort of keeping them organized. Yes, *rails* spews files all over the place but as many are auto generated for you and they tend to have simple accepted practices about what goes where that ease this mental effort. Maybe you have some reason to think Erlang is just as good if not better at this but I think you were addressing something slightly different.

----

Finally, overall pointing out that there is some hack or trick or whatever to get around language limitations isn't a defense of the language. It's an admission of its flaws. Back in the day some people who programmed in assembly had some truly massive libraries of macros to work around little annoyances but that wasn't a defense of assembly language. Ultimately what makes a language good or bad is not really an objective thing but merely whether it makes the programmer efficient, and little annoyances like this interfere with that.

----

Don't get me wrong, Erlang may be a great language even if I don't really want to try it. I certainly think more languages should consider this pattern matching approach to messages. However, I just didn't find your particular defenses very compelling.

Celso Martinho said...

Erlang, the good and the bad....

Many pixels have been spilled across the net lately about Erlang. Honestly, I don't understand a thing about it (I skipped the Process-One classes, to the relief of many), but I'm fascinated by the characteristics of the plat...

calvin said...

So Erlang treats strings as integer lists. Okay.

So, how do I search and ignore case, in a text file? How do you replace? How do you do this in a text file, repeatedly?

This is a few lines of code in python.

I couldn't even find an example of doing this on the web, let alone in "Programming Erlang". Which has now gone back on the shelf until I get to the point where I need robust concurrency in my application.

I would really like to keep digging into Erlang, but I've got better things to do than write a string processing library.

The bottom line seems to be that Erlang is not a general purpose programming language. That is a disappointment.

Sam O said...

My dear Truepath, I think it is your knowledge of FP that does not fly. As your attorney, I advise you to go to LTU and do some studying.

Franz said...

Aren't all your examples on the superiority of immutable variables actually just examples of pass-by-value providing more safety than pass-by-reference? I don't see anything demonstrating the variable itself as immutable, just examples of it not being changed by a function it has been passed to. And it's the same with your complaint against imperative languages on this point; the multiple points of variable changes is a result of doing pass-by-reference, not because the variable can be changed. Not to say that there isn't value to immutable variables, just that I don't see anything actually referring to it in your defense. You may want to try a language that has explicit pass-by-reference semantics, such as C.

Yariv said...

@TruePath If you base your decision to not use Erlang based on my article I think you're missing out. Most of the points I discussed are responses to pretty minor complaints. The existence of workarounds that may occasionally save you a small amount of typing is hardly an indicator of major flaws in the language. I suggest you try building something in Erlang and then see how you feel about it.

@calvin look at the regexp module.

@Franz I've actually done a good amount of C programming :) Immutability allows you to pass values by reference while also guaranteeing that data never changes. It's like declaring all your variables 'final' in Java.

no name said...

erlang syntax is horrible compared to haskell
(like having to do "fun lists:map/2" instead of just "map")


strings being lists of integers is unnatural.
no other language that i know of does this


no debugger
:(

Yariv said...

Erlang syntax is quirky sometimes but horrible is a big overstatement. It's quite easy once you get used to it. Also, check out distel and debugger:start().

Axio said...

David: Strings can be represented as a tree of integers, rather than a list. In that case, they are known as "ropes" in fact. Might be useless if you don't manipulate them, but very useful when you need fast concatenation or rapid access to their nth element.

Ulf Wiger said...

Yariv, nice article.

Regarding code organization, I'd like to paraphrase Yariv and pose a challenge: write a large application in Erlang and show me how code organization was a problem for you.

This is half-kidding, of course, but only half. Personally, I've experienced large-scale Erlang development through all life cycles of a huge software system (> 2 million lines of code, > 100 OTP applications, hundreds of mnesia tables) for more than a decade, and I can honestly say that not only did Erlang not cause us any great problems in this area - we've found it very helpful. The issue that needed to be addressed from the start was deciding on a namespace convention, but that is trivial in a closed system. In an open collaboration environment, namespace can become an issue, but so far, the Erlang community seems to be coping with that as well.

Dismissing Erlang until you need robust concurrency misses the point of using share-nothing concurrency as a very attractive structuring concept in its own right. You will miss Erlang's wonderful support for encapsulation and modularity as long as you don't embrace the notion of modeling with concurrency. This support pays off big-time in large software systems.

Having said this, I respect that people have different requirements on a programming language. While working with large software systems, things like whether or not the 'if' syntax is pretty is truly insignificant, if the language helps you with modularity, code reuse, debugging, etc. But of course, Erlang was designed to solve exactly the sort of problems that are among the most common killers of telecoms software development projects. For other tasks, e.g. ones where it's all about fast and convenient string processing, Erlang may not be your first choice.

This doesn't mean that it isn't a general-purpose language. General purpose doesn't mean "ideal for everything". From my point of view, both Java and C suck at concurrency, but I will concede that both are general-purpose languages - even reasonably good ones, as long as one learns how to use them properly.

TruePath said...

@Yariv:

Yah, on second thought I was being a bit excessive in saying I don't want to ever learn it. It really does seem interesting and no other language offers some of Erlang's niceties so maybe I'll look into learning it when I have some time. I still don't think your defenses fly but that doesn't mean the good features aren't worth the annoyances...and hopefully someone will come along and take the good bits and jettison the bad.

@Sam O:

Ohh and where did I make an incorrect claim about functional languages?

The point about immutability is simply that having mutable variables gives you forms of programming that aren't available only having immutable data (this isn't a point about theoretical computability). Thus to compare apples to apples you must compare the readability of the code to do something as would be implemented in a language with mutable data and in a language without mutable data. You can't just assume that it's going to be the same code in both sorts of languages and then say voilà, the immutable language is easier to read.

What languages are easier to read or understand is ultimately a psychological matter that has to be decided by empirical examination. Theory can't really tell you this.

samantha said...

Strings as arrays of integers do imho suck big time. Strings are NOT integer arrays conceptually and typed functional languages should make it very easy to deal with the actual type concepts present. And what of unicode strings where it may take more than one integer to express a character in some alphabet? The language should have a distinct string type or much better yet enable you to easily create one and use it as if it were built-in.

And why the hell lists of integers (which bloat as mentioned on 64 bit archs) when array of bytes could very easily have been used (modulo unicode)?

Steve Davis said...

I am an absolute noob to Erlang, currently working through the books/manuals, and having spent years of my professional time in C++/Java/J2EE/C#. Admittedly, when I was first exposed to the syntax of Erlang I had a "yuck" reaction. However, I'm finding that as I discover how to "think in Erlang", I'm able to do what I want with the language and libraries with much less ceremony than the imperative/OO style. Erlang has proven incredibly, amazingly, even wonderfully concise and straightforward. What's more I am coding really efficient applications that never need to stop for maintenance (OMG!!!).

I'm thinking now - well, suppose there were a "strings" library... under what circumstances would I ever need to code an application in anything else? What's more, apart from interfacing to existing legacy databases, I can't really think of any circumstance where I would want to use a traditional RDBMS over Mnesia; And what's more than that - as Erlyweb develops, I suspect I'll be asking "why use any other web framework?" too (I'm definitely wondering why Yariv chose to use MySQL).

All in all - the syntax issues raised seem truly trivial in comparison with the ROI - so here's my "challenge": Code a non-trivial application in Erlang *of your own design*, then come back and tell us all how much, or if, you care about syntactic sugar.

Some thoughts about Erlang | walker…to the next step said...

[...] check out Yariv’s response and his Recless [...]