Friday, September 01, 2006

OO, FP, Erlang, and Me


Drop the anti-OO rhetoric, it just makes you look stupid. For your information, objects are cheap to create in most languages. For example, .Net only needs 12 bytes of overhead. Had you not started attacking OO, I would have been more interested in what you have to say about Erlang and databases.



-- blog comment




I went through a string of languages before I seriously got into Erlang. I started learning programming in BASIC and Pascal in my early teens, and then I moved on to C/C++ in high school. In college, most classes were taught in Java, although we had occasional excursions into C/C++, Scheme and Prolog. Professionally, I have worked on web applications with Perl, PHP and Java. At my last gig, I created a pretty large (for 1 developer, I would think) codebase in C: about 34,000 lines of code. It was a network application with a lot of threading and IO logic. It also interacted with a backend system written in Java.



I usually try to stay on top of the latest trends. When Ruby on Rails started getting a lot of buzz, it struck me as a nice alternative to the popular tools of the day, so I studied it pretty thoroughly and then I used it to make a prototype for a website. (I ended up dropping that prototype at the point when I realized I was hopelessly hooked on Erlang :) )



In my last gig, the Java backend code initially interacted with a database by making SQL calls directly via JDBC, but then we switched to Hibernate, 'The' Java ORM framework. After the switch, we saw a significant performance hit and actually considered switching back to JDBC. For several reasons, we eventually decided to take the performance hit and stick with Hibernate anyway.



Benchmarks often produce misleading results regarding the behavior of real systems, so you should take the following with a grain of salt, but polepoes.org has run some benchmarks comparing Java ORM frameworks to JDBC, and in their summary, they state, "The use of O-R mapping technology like Hibernate or JDO O-R mappers has a strong negative impact on performance. If you can't compensate by throwing hardware at your app, you may have to avoid O-R mappers, if performance is important to you."



I'm not an expert on Hibernate internals and I can't state precisely why it would add such an overhead, but, as I said, it most probably has to do with the heavyweight nature of objects. By 'heavyweight' I mean not just the physical size of the object in memory, but the full context associated with the object. In OO languages that maintain runtime type information such as Java and C# (as opposed to C++), those 12 bytes of overhead that the commenter mentioned hold a pointer to a class structure. That structure has lists of fields and methods, as well as pointers to its superclass and maybe also to the interfaces it implements. The same structure repeats itself up the inheritance chain. ORM frameworks have to examine this complex structure, usually by reflection, in order to figure out how to map instance variables to database fields.
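To make the reflection step concrete, here is a minimal sketch in Java of how a mapper might discover an object's fields by walking up the inheritance chain. This is not Hibernate's actual implementation. The classes Person, Employee and ReflectiveMapper are made up for illustration, and so is the idea of using each field name directly as a column name.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical entity classes, invented for this illustration.
class Person {
    private String name = "Joe";
    private int age = 30;
}

class Employee extends Person {
    private String company = "Acme";
}

class ReflectiveMapper {
    // Collects "column name -> value" pairs by inspecting the class
    // structure of the entity and of each of its superclasses.
    // Assumption: the field name doubles as the column name.
    static Map<String, Object> toRow(Object entity) {
        // TreeMap keeps keys sorted; reflection returns fields in no guaranteed order.
        Map<String, Object> row = new TreeMap<>();
        for (Class<?> c = entity.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                    continue; // only instance state belongs in the row
                }
                f.setAccessible(true); // private fields need this before they can be read
                try {
                    row.put(f.getName(), f.get(entity));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(toRow(new Employee())); // prints {age=30, company=Acme, name=Joe}
    }
}
```

Every call repeats the walk over the class metadata, which is the kind of per-object work that caching or bytecode generation tries to avoid.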



(Hibernate actually performs bytecode instrumentation to lower the overhead associated with reflection. I'm not sure what the speed gains are, though. Regardless, we did notice that Hibernate adds overhead to performance.)



In dynamic OO languages such as Ruby, objects have an even higher overhead than in Java and C# because instance variables and methods live in hash tables rather than at fixed offsets, and a method call generally requires a hash table lookup in the object's class (and then up the class chain if the method isn't found there). That's (partly) why Ruby is slower than Java and C#.



By the way, I'm not trying to suggest that ORM is "bad" or that Hibernate is a bad product. If I used an OO language to build a web application, I would probably use an ORM framework. If I used Java, I would probably use Hibernate because it's mature and feature-rich. Developer productivity is generally much more valuable than the cost of buying a few extra servers.



Back to my programming history: although I tasted some functional programming in college, it was in an academic setting. In the "real world," I have rarely heard of companies using functional languages. Mainstream programming-related websites such as oreillynet.com and devx.com hardly ever talk about any functional languages. Try to search for a Lisp, Haskell or Erlang job on craigslist or monster.com and you'll likely get zero results. All this gave me the impression that functional languages are more for academics than for people who build real systems.



The general story I and many other programmers of my generation have been told is as follows (I'm simplifying, of course): at first there was C, then C++ came and made C better by adding object orientation. Then Java came and made programmers' lives much better by adding garbage collection, single inheritance and platform independence.



We also learned that the LAMP stack, with its various scripting languages, is an alternative for developers who want to get things done cheaply and quick-and-dirty.



When I first discovered Erlang, I wasn't thrilled. "Why bother with some obscure language designed for the telecom industry? Probably nobody uses it besides Ericsson and that jabber server -- ejabberd -- so why should I? Anyway, I already have Java, which makes my life easy," I thought. The look of the website -- especially the documentation area -- didn't add to my enthusiasm, either. It felt too old-school.



However, as much as I initially disliked it, Erlang kept pulling me back. The fact that it was designed for building distributed systems was too alluring. I've done my part in building such systems in C and Java and I know how hard it is. Race conditions, deadlocks, RPC protocols, nodes crashing, monitoring difficulties, logging, resource leaks, obscure bugs, etc, are all major pains in the neck. They are much worse when you're using languages that weren't designed to address those issues.



I have suffered the pains, and I wanted to find a better way.



That's why I kept on gravitating back to the Erlang documentation site, doing the tutorials, experimenting, reading code, asking questions and educating myself. The more I learned about Erlang, the less I missed the Java/C/C++ ways of doing things. After a while, I realized what a great language it was. Everything just fit together perfectly -- message passing, distributed programming, fault tolerance, high availability, pattern matching, dynamic typing, rapid development, functional semantics, a distributed database...



When I finally felt that I 'got' Erlang, never once did I miss a single OO construct. In fact, I started regarding OO as a burden. In OO languages, every class, every method, every field has so much context around it. To understand where a piece of code fits into the big picture, you have to read through the documentation for all the classes and interfaces to which it belongs. Also, OO languages often encourage, if not force, you to wedge your types into some inheritance chain in order to reuse code, which results in ugly inheritance relations that are inflexible at best.



Functional languages like Erlang have the opposite philosophy. They try to minimize the context surrounding a piece of logic -- a function. All you have to know is what parameters a function takes and what it returns. There is no implicit link between the function and the data it uses. With single assignment, you don't even have to worry about functions mangling the values of your variables. When you're coding in Erlang as opposed to Java, you think less about the context of your functions and more about what they do and how to compose them together to solve your problem.
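The style itself is not tied to Erlang, and a rough sketch of it can be written even in Java, which is used here only because it is the comparison language. The class Pricing and its functions are hypothetical, and `final` is only an approximation of Erlang's single assignment.

```java
import java.util.function.Function;

// Hypothetical example: each function depends only on its argument
// and returns a new value instead of changing existing state.
class Pricing {
    static int addTax(int cents) {
        return cents + cents / 10; // 10% tax, integer arithmetic
    }

    static int addShipping(int cents) {
        return cents + 500; // flat shipping fee
    }

    // Composition: the result of addTax is fed into addShipping.
    static Function<Integer, Integer> total() {
        Function<Integer, Integer> tax = Pricing::addTax;
        return tax.andThen(Pricing::addShipping);
    }

    public static void main(String[] args) {
        final int price = 2000; // 'final' stands in for single assignment
        int due = total().apply(price);
        System.out.println(price + " -> " + due); // prints "2000 -> 2700"
    }
}
```

To read `total()` you only need the signatures of the two functions it composes. No object state or class hierarchy is involved, and `price` is untouched after the call.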



That's what a good programming language is about: it should let you express the solution to your problem as concisely and easily as possible so you can move on to the next problem. It should also make it easy to compose multiple components into a larger whole.



When the Erlang lightbulb went off in my head, I actually started feeling some resentment towards the rest of the software world, which is dominated by OO thinking. "Why didn't anybody tell me about this earlier? Why isn't Erlang featured in the mainstream software press? Why is Ruby on Rails on O'Reilly's radar but Erlang isn't? Why is everybody chasing OO features while overlooking this great language? Why did I spend so much time working with much worse languages? Erlang is twenty years old. It was open sourced in 1998. How could the software community ignore it so thoroughly?"



This is partly why I occasionally take jabs at the OO world. Part of me feels that it has misled me and kept me away from a beautiful tool that would have made my life much better had I known about it sooner.

And much more fun :)

26 comments:

Roberto said...

Great post. I also "flirted" with lots of languages. The disadvantages of them I only discovered once I had a better alternative. - Btw., how did your move go? Did you grab an Erlang gig now? (assuming you are also a contractor ..)

Yariv said...

Taras -- I hope my previous posts aren't *all* hype and fanboyism :) I guess if people labeled me an Erlang "fanboy" they wouldn't be totally off the mark just because I have been raving about it, but I always try to add substance to my posts to back up my enthusiasm. Anyway, thanks for the feedback. This posting has been building up in my head for a while and now it was a good time to let it out :)

Yariv said...

Roberto, I'm still in the process of moving to the new place. I took a break to do some blogging this morning but I still have a lot of unpacking to do. And no, I'm not a contractor, and nobody pays me a dime to write about or to code in Erlang. What I want to do is to try to build a complete application in Erlang during the next couple of months. After that, we'll see where life takes me.

David said...

@Taras. Most people, I suspect, are more productive and comfortable using a functional language. For evidence, look at the popularity of first Lotus 1-2-3 and then Excel. Possibly the most widely deployed programming platforms, and they employ functional programming languages.

With the possible exception of Javascript. And it is really happiest when used as a functional language. :-)

nobody said...

Re: It felt too old-school.

I was so hoping that you were going to say "it felt so... Web 1.0"

Scott said...

"In OO languages, every class, every method, every field has so much context around it."

You nailed it with this sentence. An object is a hodge-podge of state, and almost always involves mutation. If i'm writing a function (that involves no mutation), the conditions i have to write for are the number of arguments said function takes: N. For an object, it's still N, but also includes any local state, so more like N*M. But, of course, each of the N arguments, since they are objects, have much more complex state.

lopex said...

Yariv, do you have some thoughts on Ocaml and Haskell ? I must admit that after finding your blog some time ago I'm more and more convinced to give Erlang a chance.

Paul Mansour said...

Hi Yariv,

Been enjoying your writing on Erlang and concurrent programming. I am relatively new to multithreading, but am deep in it now. It was interesting to see how it is done in Erlang, that the memory spaces are distinct.

I write in Dyalog APL, (an array language, not a pure functional language, but very close) which, like Erlang, you won't learn about in school or see job postings for. APL has been around much longer than Erlang, but only in the last 6 years has an implementation with multithreading been available. The threads are relatively lightweight and share memory, but not appropriate for making use of multicore processors.

If I can get some free time, I'm going to check out Erlang -- though if it takes more than +/A>100 to count the number of elements of an array greater than 100, I probably won't adopt it as my primary language!

Keep up the blogging on FP and concurrent programming.

PS. Regarding OO and FP, I've never really seen them as competing. I write everything in a functional style, and then deliver the results with an OO layer on top because it is very easy to document and a good way to present the software. To me the two techniques really complement each other.

fartikus said...

the programming world abhors a one-solution premise. don't adopt erlang because it is "the" concurrency language - if concurrency truly is to be the dominant theme in computing in the next decade or so, you will see concurrency make its way into a number of tools. i also disagree with the premise another poster made that one should reject a language that is not oriented towards concurrency. i will still put a C program against a erlang program for real world problems, and the C program will almost certainly perform better. sorry, we are not at the "the grid is the computer" stage yet and likely not anytime soon. and those entities that run real grids (yahoo, google) do so in a nonuniform fashion (some nodes are specifically for storage, some for computation) etc and these models are very finely tuned. in any case i would also recommend looking at mozart/oz, which apparently does concurrency even better than erlang(??) but once again, for average problems, its a perl/python class solution.

Jonathan Allen said...

I don't care if you quote me, but please include my name or handle. I have the right to be associated with everything I write, even if it makes me look foolish.

As for ORM, I don't know why it is slow in Java specifically, but in general I think it has something to do with the sheer amount of information being returned from the database.

ORM makes it easy to just grab every column even if you only need one or two. Rather than making a different mapping for each page, you can just reuse your one-true object for every query from that table. Using the database objects directly encourages you to return as little as possible, if only to reduce the amount of typing.

Using reflection in any performance-sensitive code in a statically typed language seems foolish to me. The whole point of using static typing is that it makes method calls cheap. In some cases, the JIT may even inline them across assemblies.

Dave Newton said...

@Hibernate's slowness: reflection, marshalling, (sometimes) generated SQL (non-optimized), bad mapping files (objects not loaded from DB until used leading to lots of unnecessary calls to DB), etc. (Also note that Hibernate wasn't always close to being the slowest.)


None of this has _anything_ to do with things being an object, except maybe reflection.


If you look at the results of the benchmarks you'll see that different technologies excel at different things. Obviously the issue with Hibernate isn't the "overhead of objects" since db4o is nothing _but_ objects (as is JDO/VOA), and in some of the tests, the Hibernate/MySQL combination performs quite well.


Some problems drop nicely into the OO paradigm, others don't. Keeping data in tuples is great, handy, and quick, but implies no structure other than that imposed by the programmer. Lisp does it too. For lots of problems, it's perfect. Sometimes a more definite structure is appropriate.


OO is _a_ tool, not _the_ tool.

Yariv said...

Sorry for the late replies -- I'm still moving to a new apartment.

Andre -- yes, I do think that languages with poor support for concurrency will diminish in importance over the next decade or so. Erlang is best positioned to grow in importance in applications where concurrency matters because Erlang does concurrency the best. It's also very mature. I don't think Erlang should be kept a secret, though, because it would slow down the development of its open source movement.

Lopex -- I'm not a guru on OCaml and Haskell, but I think they don't target the same applications that Erlang does. Erlang is designed from the ground up for building scalable backends, and OCaml and Haskell aren't. I think they are very nice languages but Erlang is a better tool for the applications that I want to build.

Paul -- I've never heard of Dyalog. I'll take a look at it. Thanks for the pointer.

Fartikus -- There are a number of languages that "do" concurrency and a number of implementations of Erlang-style concurrency. However, Erlang is much more than "just" FP + concurrency. Erlang is a full package for building scalable backends and it makes it much easier than any other language IMO. Erlang is a tool, and it's not made to solve every problem, but for scalable, fault tolerant, concurrent systems it's top notch.

Jonathan -- although your name is visible in the original comment, I intentionally removed it from this article. Although I wanted to respond to your comment, I didn't want to put you on the spot.

Dave -- I agree that OO is _a_ tool, not _the_ tool. The problem is that the industry and higher education all seem to promote OO as _the_ tool, where other paradigms often work much better.

JP said...

I'm curious, compared to the tens of thousands of lines of code you wrote in those primitive languages, how many thousands of lines of code have you already written in Erlang? How big is your biggest Erlang project?

My biggest Ruby project is over 34kloc, but I program with multi paradigm with it, enjoying OO and functional style equally well. The files are different. OO files have modules, classes, and methods. The FP files have blocks, closures. It scales well in a file based way, so I could handle up to 100kloc this way, probably.

Also, web programming should help the web designers, so I worry about them by creating easy to edit HTML templates with nice interspersed Ruby code. If it were only me, I would generate all the HTML and use CSS only, like some web programming methods seem to prefer. I must say, though, that HTML with Ruby interspersed is a joy even for me, so it's an easy tradeoff to make.

Again, does Erlang scale like that? Or is it better in cases when you need highest performance, distribution, and such rarer cases?

lopex said...

Yeah, I know that one ;)

Yariv said...

(Correction: Yaws translates ehtml to HTML)

Yariv said...

JP -- I have written less Erlang code than C/Java/PHP code. I don't know the exact line count, but I did make the haXe remoting adapter for Yaws, Smerl, and ErlyDB, as well as a (rather basic) webapp prototype that I haven't released. I think this gives me enough experience with Erlang to compare it with other languages. By the way, I've never said those other languages were "primitive." Erlang is actually much more "primitive" than Ruby, for instance, if you judge by the complexity of their semantics. The point I was trying to make is that Erlang's simplicity, at least by the absence of the OO stuff, often makes Erlang (and other functional languages) a better problem solving tool.


Instead of embedding Erlang in HTML, check out the ehtml module in Yaws. It allows you to describe HTML in Erlang tuples, which Yaws translates into ehtml. I think this is actually a nicer approach than the Ruby template language -- as long as you're not working with web designers who can't deal with Erlang :)

Piggy said...

I think the increasing adoption of OO-FP mixing languages like Ruby is good news for Erlang, since now more people have the chance to see the elegance within FP style and then feel comfortable. This really lowers the barrier to enter into a full FP language later.

I wish one day I would see you write about the things in Erlang that you don't like much or just suck. Cause it will help people get a more objective view of Erlang by knowing the limits and also reduce the defensive thinking of 'hype' or 'fanboy' on your promotion of Erlang.

[OT] (Dyalog) APL is such a scary language that it uses uncommon characters. I'm not against array-oriented languages that are good at math. I know J and K are used in financial industry with fame.

Piggy said...

Oops, I'm very sorry to spam accidentally. I forgot to turn on the js of your site in 'no script'. The ajax didn't respond properly so I clicked more than once.

Yariv said...

Piggy -- don't worry, I removed the "spam."


I agree with your comment about hybrid languages. It seems like developers of OO languages are gradually "discovering" all those lovely functional semantics that have existed in Lisp since 1958, and are therefore making functional languages seem less foreign to the "mainstream."


I will see what I can do about that "fanboy" image that some people have of me. There are certainly many applications for which Erlang isn't great. I just happen to really care about the applications for which it does happen to be great, which is why I write about it so enthusiastically. Thanks for the suggestion -- maybe my next posting will be titled "Why Erlang Really, Really Sucks" :)

Yaar said...

Hi Yariv. I've been following your great blog for a while now, and I enjoy seeing it getting more and more popularity. Great job!

I think that in cheering for Erlang, you should concentrate more on Erlang's major and very unique features, and less on its less important ones. Although a language's syntax, whether it's OO or functional, or whether other languages' popular libraries can be imitated with it, are relevant to the discussion, the great promise of Erlang, and the reasons some projects have chosen Erlang, are different.

I think you would do Erlang a great service if you bring into the light its unique features: Erlang's multithreaded nature and its distributability. However, I could not find in your blog, or in other resources, a good explanation of these, and some reasoning of why they would make my life better.

That said, keep up the good work, enjoy Boston and have a great time with your now very close girlfriend.

Yariv said...

Hi Yaar, thanks for the feedback. In my earlier posts (such as "Why Erlang is a Great Language for Concurrent Programming") I focussed more on Erlang's message passing and lightweight processes. The later ones have been more focussed on Erlang's semantics and why it's more than just programming language + message passing + lightweight processes. However, I guess my new readers won't find the old articles very easily so maybe I should keep stressing those aspects. Enjoy New York!

mathiasp said...

Nice post. For some comments on programming paradigms (excuse the french) and why and how functional, deterministic concurrent, message passing concurrent and shared-state concurrent (yeah, OO is included in that list :) ) all matter, just for different solutions, read "Convergence in Language Design: A Case of Lightning Striking Four Times in the Same Place", which describes how and why Erlang, E and Oz use a similar layering of functions.


If you liked that, read the very (very!) nice "Concepts, Techniques, and Models of Computer Programming". Find a review at Lambda the Ultimate.

Rich Collins said...

It is all academic until there is a killer app written in Erlang. People are swayed much more by examples than by argument. Can you point us to an open source Erlang application where the "business rules" of the application are easily expressed and understood? This is one reason Rails is grabbing so many people:

class Blog < ActiveRecord::Base
  belongs_to :user
  has_many :articles

  validates_presence_of :title, :on => :create
  ...
end

Yariv said...

check out http://progexpr.blogspot.com/2006/12/erlyweb-blog-tutorial.html.

arsenalist said...

I did some Scheme, ML and Prolog in university and really enjoyed it and wanted to do more. But when I started working about 5 years ago, I never found a practical application for those languages. So once in a while I muck around with Scheme using Dr. Scheme just for the hell of it.

I'll play around with Erlang and see what it's all about.

Viray said...

Yariv, I have to agree with the substance (but not the tone) of the blog comment which sparked this post. Further, nothing in your stated background makes you qualified to debate the merits of OO: you've done basically zero OO programming!

I'm interested in Erlang, but reading Joe Armstrong's critique of OO made me cringe. It may or may not be necessary for large scale applications in Erlang, but you guys need a working familiarity with the successes of OO, not just the failures. There exist OO projects which live up to the initial hype (i.e. simple! reusable! component software nirvana!), although such things are almost always found on the Smalltalk axis (Apple's Objective-C frameworks and Ruby are both Smalltalk derivatives).