Sunday, December 09, 2007

ErlyWeb vs. Ruby on Rails EC2 Performance Showdown

A few people have asked me to run benchmarks comparing ErlyWeb and other frameworks, especially Ruby on Rails. Benchmarking is pretty boring, which is why I haven't benchmarked ErlyWeb until now. Plus, although I was pretty confident that ErlyWeb outperforms Rails, I didn't think proving it mattered that much. People who love Rails generally don't care much about raw performance, and people who love Erlang and functional programming don't need much convincing to use ErlyWeb over Rails.

However, as ErlyWeb is becoming more mature, my curiosity has been growing. With the recent releases of Rails 2.0 and Erlang/OTP R12B, I finally decided to sacrifice the better part of my weekend to give both frameworks some serious stress testing and see how they compare.

Before I could run the benchmarks, I had to figure out the physical setup. I needed at least two powerful servers with a fast link between them, and I didn't have this kind of hardware lying around. All I have is my MacBook, but running the benchmarks on my MacBook wouldn't prove much because it would be impossible to isolate the impact the clients would have on the servers by running on the same machine.

Thankfully, Amazon EC2 made this easy to solve. I could just fire up two EC2 instances and have one of them stress test the other over EC2's fast internal network.

Benchmarking can be complex. How do you benchmark a web app? There are so many moving pieces and possible user interactions. I decided to keep it simple: I would test the performance of rendering a single page displaying a simple dynamically generated table. There would be no database queries -- the table's contents would be hardcoded in the controllers and the views would programmatically render the contents for each request.

I decided against using a database because I wanted to isolate the performance difference between ErlyWeb and Rails. Third-party dependencies may affect the results by introducing bottlenecks that would make one or both frameworks appear artificially slow. In addition, in many, if not most, webapps, the majority of requests can be served from the cache -- which means they don't trigger database queries -- so this test scenario is actually quite realistic.

The Test App



Here's a screenshot of the awesome test page I created:

songs_screenshot.png

You can see the generated page here (the output is identical for both apps except for minor whitespace differences).

This is the Rails controller code (songs_controller.rb):


class SongsController < ApplicationController
  def index
    @songs =
      ["Sgt. Pepper's Lonely Hearts Club Band",
       "With a Little Help from My Friends",
       "Lucy in the Sky with Diamonds",
       "Getting Better",
       "Fixing a Hole",
       "She's Leaving Home",
       "Being for the Benefit of Mr. Kite!",
       "Within You Without You",
       "When I'm Sixty-Four",
       "Lovely Rita",
       "Good Morning Good Morning",
       "Sgt. Pepper's Lonely Hearts Club Band (Reprise)",
       "A Day in the Life"]
  end
end


This is the Rails view code (songs/index.html.erb):


Songs

<% for song in @songs %>
<%= song %>
<% end %>





This is the ErlyWeb controller code (songs_controller.erl):


-module(songs_controller).
-export([index/1]).

index(_A) ->
    {data,
     [<<"Sgt. Pepper's Lonely Hearts Club Band">>,
      <<"With a Little Help from My Friends">>,
      <<"Lucy in the Sky with Diamonds">>,
      <<"Getting Better">>,
      <<"Fixing a Hole">>,
      <<"She's Leaving Home">>,
      <<"Being for the Benefit of Mr. Kite!">>,
      <<"Within You Without You">>,
      <<"When I'm Sixty-Four">>,
      <<"Lovely Rita">>,
      <<"Good Morning Good Morning">>,
      <<"Sgt. Pepper's Lonely Hearts Club Band (Reprise)">>,
      <<"A Day in the Life">>]}.


This is the ErlyWeb view code (songs_view.et):


<%@ index(Songs) %>

Songs



<% [song(S) || S <- Songs] %>


<%@ song(S) %><% S %>



Setup



This is what I installed on the EC2 image (I started with the basic public Fedora Core 4 image):

- Erlang/OTP R12B.
- Yaws 1.73 (compiled with HiPE).
- ErlyWeb from trunk (compiled with HiPE).
- Rails 2.0.1
- Mongrel 1.1.1
- MySQL 4.1 (unused)
- Tsung 1.2.1.

Notes:

- Rails insists on connecting to a database even if you don't want or need one. There must be some hacky way of running Rails without a database, but it was easier for me to just set up MySQL so Rails stops complaining.
- I also compiled the ErlyWeb app with HiPE.

Tsung (http://tsung.erlang-projects.org/) is the tool I used for the stress testing. It's pretty simple to use -- you specify in an XML file where the server is, what page(s) to request, and how frequently to initiate new requests, and then it runs the test and produces a report. You can read more about it on its website.

One interesting fact about Tsung is that it is written in Erlang to take advantage of Erlang's support for massive concurrency (there will be more on that later :) ).

I installed Tsung on the same machine image as the web servers because although during the test one instance runs the servers and the other runs Tsung, I wanted to have two identical EC2 machine images to ease setting things up.

During the test, I had both servers running on the server instance listening on different ports (Mongrel on port 3000, Yaws on port 3001). I then ran Tsung twice: first against Mongrel and then against Yaws. The tests did not overlap.

For the test, I configured Tsung to go through a sequence of 11 10-second phases during which Tsung would send requests to the server at increasing frequencies. In the first phase, Tsung would send 2 requests per second. It would then progress to 10, 20, 40, 100, 200, 400, 1000, 2000, 4000, and finally 10000 requests per second.
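To get a feel for how much load this schedule offers in total, here is a small Ruby sketch of the arithmetic. It is not part of the Tsung configuration; the constant and method names are made up for illustration, and it assumes every phase runs for exactly 10 seconds at its nominal rate.

```ruby
# Nominal request rates (requests/sec) for the 11 phases described above.
PHASE_RATES = [2, 10, 20, 40, 100, 200, 400, 1000, 2000, 4000, 10_000].freeze
PHASE_SECONDS = 10 # each phase lasts 10 seconds

# Total number of requests the schedule would offer if every phase
# ran at exactly its nominal rate (an idealization).
def total_offered_requests(rates = PHASE_RATES, seconds = PHASE_SECONDS)
  rates.sum * seconds
end

# Total duration of the schedule in seconds.
def total_duration(rates = PHASE_RATES, seconds = PHASE_SECONDS)
  rates.size * seconds
end

puts "phases:   #{PHASE_RATES.size}"
puts "duration: #{total_duration} s"
puts "requests: #{total_offered_requests}"
```

Under those assumptions the whole run takes 110 seconds and offers 177,720 requests, more than half of them in the final 10000 requests/sec phase.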

Caveat Emptor



Before I show the results, I would like to say that you should take them with a grain of salt. Many benchmarks have flaws. Maybe I messed something up during the installation. Maybe my methodology was wrong. Maybe I didn't perform some Mongrel optimization trick that boosts its performance. Maybe your application would behave quite differently from my test app. You have been warned.

Results



UPDATE: the initial results were inaccurate because I didn't run Mongrel in production mode. I updated the results and the graphs to reflect Rails's performance in production mode.

Below is the summary of the most important measurements.

Peak Performance
Rails: 112.8 requests/sec (originally 15.1)
ErlyWeb: 699.8 requests/sec



Peak Send Rate
Rails: 1360.78 Kb/sec (originally 181.11 Kb/sec)
ErlyWeb: 5997.50 Kb/sec (hmm... did we hit EC2's limit?)



Total Size Sent
Rails: 5.83 MB (originally 1.65 MB)
ErlyWeb: 28.47 MB



Peak Receive Rate
Rails: 175.79 Kb/sec (originally 84.87 Kb/sec)
ErlyWeb: 941.84 Kb/sec



Total Size Received
Rails: 0.64 MB (originally 0.21 MB)
ErlyWeb: 4.43 MB



Performance Degradation Point
(The phase at which the framework started sharply dropping the response rate)

Rails: 400 requests/sec phase (originally the 100 requests/sec phase).
ErlyWeb: 4000 requests/sec phase.



Note: these are the updated reports after running Mongrel in production mode

You can view the full Tsung reports here: ErlyWeb, Rails

These reports are from the second test (10 requests/sec sustained for 30 seconds): ErlyWeb, Rails

You can also download the reports, zipped: reports.tar.gz

Conclusion



ErlyWeb's peak response rate was 6x Rails's on similar hardware (originally reported as 47x).
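The ratio comes straight from the peak response rates in the Results section. As a quick sanity check, here is a minimal Ruby sketch of the division; the constant names and the speedup helper are made up for illustration, and the numbers are the ones reported above.

```ruby
# Peak response rates (requests/sec) taken from the Results section.
ERLYWEB_PEAK = 699.8
RAILS_PEAK_PRODUCTION = 112.8  # Mongrel started with -e production
RAILS_PEAK_DEVELOPMENT = 15.1  # original run, before the production-mode fix

# How many times faster `a` is than `b`, rounded to one decimal place.
def speedup(a, b)
  (a / b).round(1)
end

puts "vs. production-mode Rails:  #{speedup(ERLYWEB_PEAK, RAILS_PEAK_PRODUCTION)}x"
puts "vs. development-mode Rails: #{speedup(ERLYWEB_PEAK, RAILS_PEAK_DEVELOPMENT)}x"
```

This gives about 6.2x against the production-mode run and about 46.3x against the original run.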

I'm actually surprised by the results. I expected ErlyWeb to win, but I didn't expect ErlyWeb to outperform Rails by such a wide margin.

The results probably reflect differences in language and runtime more than implementation details in the frameworks. After 20 years of refinement driven by the needs of high-performance telecom applications, Erlang delivers.

Update:
Looking at the peak send rate for ErlyWeb -- an almost round 6 MB/s, I wonder if ErlyWeb just saturated EC2's internal bandwidth, and whether ErlyWeb could perform even better on a faster link.

Update 2:
I ran another test, this time hitting each server at 10 requests/sec sustained for 30 seconds to measure average response time. Rails's response time was in the 5-7 msec range (originally reported as 120-180 msecs), and ErlyWeb's was in the 1.4-1.8 msec range. This means that under light to medium load, ErlyWeb serves the 'songs' page roughly 4x faster than Rails (originally reported as 100x).
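Since both response times are given as ranges, the speedup is really a range too. Here is a rough Ruby sketch of the bounds, using the updated 5-7 msec figure for Rails; the names are invented for illustration, and it assumes the reported ranges are hard minimums and maximums, which the Tsung reports may not strictly guarantee.

```ruby
# Response-time ranges in milliseconds, as reported for the second test.
RAILS_MS   = (5.0..7.0)  # updated Rails figure
ERLYWEB_MS = (1.4..1.8)

# Lower and upper bound on the speedup given two latency ranges:
# the lower bound pairs the slow side's best time with the fast side's
# worst time, and the upper bound does the opposite.
def speedup_bounds(slow, fast)
  low  = slow.begin / fast.end
  high = slow.end / fast.begin
  [low.round(1), high.round(1)]
end

low, high = speedup_bounds(RAILS_MS, ERLYWEB_MS)
puts "ErlyWeb is between #{low}x and #{high}x faster"
```

The bounds work out to roughly 2.8x and 5.0x, and the midpoints (6 msec vs. 1.6 msec) give about 3.75x, which is where the rounded 4x comes from.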

44 comments:

Roberto said...

wow, amazing ! RubyOnRustyRails vs. ErlyWebOnRocketEngines !

sb said...

You must have done something wrong... Are you sure you ran Rails in production mode? 15 req/sec with no database calls? You should easily beat 100 req/sec.

Rails is very slow, yes, but it's not THAT slow!

zimbatm said...

I would much more be interested in performance comparisons between the old and the new version of erlang.

Jason Watkins said...

I have no doubt that ErlyWeb is faster than rails, but something is wrong with those rails numbers.

First, rails running in mongrel serializes requests. To serve any kind of concurrent load you need to run multiple mongrels and load balance them.

Yes this means rails is significantly more ram intensive than many other technologies... but a more fair comparison would be to load as many mongrels as allowed by your EC2 instance and test across them.

Jason Watkins said...

PS. Thanks for doing these tests... this is extremely interesting data to a lot of us. I'm not sure you realize how much curiosity there is in the ruby community about erlang.

Yariv said...

@sb I might have done something wrong... the results are so drastic that it's possible. I'm not a Rails expert so there might be some production-specific configuration step that I missed. I just installed Rails and Mongrel from gems and started it with 'mongrel_rails start -d'. Let me know if you have any suggestions.

@zimbatm I bet without HiPE R12B would perform much better on this test than older versions because the binaries won't be allocated for every request. However, HiPE has been doing static allocation since before R12B AFAIK, and I compiled everything with HiPE.

Jason Watkins said...

@yariv

"I just installed Rails and Mongrel from gems and started it with ‘mongrel_rails start -d’".

Ah, bingo. you need to add -e production. Otherwise rails checks the file modification times of every source file every request. A great convenience when developing, but not intended to ever be used in production.

Yariv said...

@Jason Thanks for the pointer. I'll rerun it with -e.

Yariv said...

@Jason Maybe you're right and I should have been running multiple Mongrels. I'll try to get that set up and keep you posted on my progress.

Daniel Lyons said...

Damn, missed by the time it took to write. :) Oh well.

Daniel Lyons said...

Yariv,

Please try it again with "mongrel_rails -d -e production" and see if things don't start looking a little more realistic. The default, development mode, reloads all of the files between each request so that incremental development is a little easier.

Yariv said...

I'm rerunning the test now. Thanks for the pointers.

Yariv said...

You guys are right. Running Rails with -e production increased the peak response rate to 113 requests/sec. I'll post the updated report shortly.

Is there anything else I can do to speed up Rails?

Yariv said...

Update: I reran the Rails test after starting Mongrel in production mode. It made a significant difference, bumping Rails's peak performance from 15 requests/sec to 112.8 requests/sec. The updated results are shown above.

Dmitrii 'Mamut' Dimandt said...

What if you don't compile ErlyWeb with HiPE? :)

Brian P O'Rourke said...

You really want to balance across a cluster for Mongrel. As Jason mentioned above, your requests from Mongrel get serialized going into Rails by a mutex in the Rails handler. You can set up something quick and dirty with Pen as your balancer that will give you much more accurate numbers.

(Benchmarking is for the birds, though ;)

See
http://mongrel.rubyforge.org/docs/mongrel_cluster.html
and, e.g.
http://siag.nu/pen/

Mauricio Fernandez said...

>> "Looking at the peak send rate for ErlyWeb — an almost round 6 MB/s, I wonder if ErlyWeb just saturated EC2’s internal bandwidth, and whether ErlyWeb could perform even better on a faster link."

That doesn't seem likely. In EC2, the advertised dedicated local network bandwidth is 250Mbps. These are also the figures I got the last time I measured the BW between two EC2 instances.

Joe said...

I love how people are saying Rails needs to be run in a cluster because it serializes requests... LOL... that's not a problem with the benchmark, that's a problem with Rails as a platform.

If you were to run two mongrel instances on a dual core machine and then test it against erlyweb on that same machine-- erlyweb would win by an even wider margin-- because rails is not concurrent while erlyweb is, so it would benefit from 2 mongrels, but not as much as erlyweb. (assuming yaws is fully concurrent).

If rails needs to be run on a cluster, then erlyweb should be as well... and the results will still be dramatically in erlyweb's favor.

Nirmal Das said...

This is all temporary.

CHANT HARE KRISHNA AND BE HAPPY.

crayz said...

Try Merb

FoOToe » Blog Archive » 坐火箭 said...

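[...] The original post is here; you can go look at the results for each metric. Of course, there's no need to take it too seriously. As the author says in the first paragraph, "Rails fans don't care much about raw performance, and Erlang fans will naturally choose ErlyWeb without needing to be persuaded…" [...]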

Ryan said...

@Joe:

You're right that Rails is not thread-safe. But by 'running it in a cluster' we just mean to run multiple mongrels at once. The Erlang implementation going on here is already using as much concurrency as it can (assumingly) so you're just comparing apples to apples by making use of Rails' method of 'concurrency'.

I think the idea here is to test these frameworks real-world performance against each other. Running in development mode (now fixed) and with only 1 application server are not real-world rails tests.

@Yariv:

You can disable ActiveRecord (db layer), but it's ever so slightly painful. You'll need to freeze your version of Rails into vendor/ and then edit environment.rb and look for the commented out section about disabling certain packages from the distribution (in this case :activerecord). Not sure how much it would really help though since you aren't actually connecting to the db... Probably an insignificant % speedup, but a thought.

David Bergman said...

For some weird reason, Yariv and I decided to compare ErlyWeb against Rails the exact same day!! Well, we have often had the same kind of ideas, except that I like to be chained down by the bureaucracy of static types while Yariv prefers the creative freedom and nimbleness that dynamic typing brings ;-)

Anyway, my benchmarks, which include some other web frameworks, were run on a local box, with no model access, to exclude the database layers from the mix, just like Yariv did it. You can find it here: http://blog.davber.com/2007/12/10/web-server-performance-shoot-out-simple-pages/

My figures show a much higher ratio between ErlyWeb and Rails, in terms of peak throughput: just above 20 x the performance, actually. WITH the "-e production" parameter ;-)

One sad note, though: when hitting ErlyWeb hard, it does refuse to accept connections much faster than, say, LAMP (or Haskell+HAppS, but I do not want to start a Static vs Dynamic war ;-)

Again, the synchronicity of these two performance tests is scary, it is almost as if Yariv and I knew each other :-)

davber does IT » Web server performance shoot out - simple pages said...

[...] 12/10/07 comment:for some paraphenomenal reason, Yariv run almost identical tests the very same day! Although he did only focus on ErlyWeb vs Rails. Anyway, quite similar results to mine: look for yourself. [...]

Yariv said...

I'll rerun the test against multiple Mongrels and see what difference it makes. I'll try to have the results published tonight.

Aside -- AFAIK each Mongrel instance takes about 30MB of RAM. On, say, an 8 core machine with 10 instances per core (I'm guessing that's a reasonable number), those Mongrels would take 2.4GB of RAM. So, even if adding Mongrels helps to a point, I don't think this is a future-proof strategy for scaling (8 cores will probably look like a low number in a couple of years).

In addition, Rails requires 1 DB connection per process. 80 Mongrels == 80 DB connections. This also doesn't scale very well given that DBMS's can only handle so many connections.

Brian P O'Rourke said...

@Yariv:
Re: memory - the community stance on this has always been that you can easily add more (thus while it's not necessarily as high-performance as possible, it *is* scalable).

But I agree entirely with your assessment of the DB connections.

@Joe:
I think we're merely trying to be pragmatic. I've never heard anyone advocate running a production Rails system in a non-clustered environment. Mongrel clustering is a feature of the Rails deployment landscape, just as Heart is a feature of the Erlang deployment landscape (different purposes, but both can be critical in production). Erlang should not need clustering (which is great!), but if it could benefit from it, I would advocate balancing against multiple ErlyWeb instances. The question here is "which framework makes the best use of its resources?" I tend to think it's ErlyWeb, but I want a good test :)

Eric Florenzano said...

Being a long time Django user/evangelist, I'd love to see how it fared compared to ErlyWeb and RoR! Do you think you could do a followup?

Erik said...

While interesting, this isn't really representative of a production deployed Rails system.

I have NEVER deployed Rails running off of just 1 mongrel. Due to the single-threaded nature of mongrel, it will only handle one request at a time. However, if you run a mongrel_cluster, it will be much much faster, especially if you have multiple CPUs. My guess is that 80% or so of the difference between the two tests has to do with CPU usage.

Try a setup with 10 mongrels, using Pound and mongrel_cluster, and try it again. I suspect Rails will be much more competitive in such a test.

Erlang-China » about "Ror vs ErlyWeb perfomance" said...

[...] Yariv published ErlyWeb vs. Ruby on Rails EC2 Performance Showdown. In this fairly simple test, ErlyWeb beat RoR by a clear margin on nearly every performance metric. [...]

Ska jag köra RoR? » Nudelsoppa said...

[...] much about performance but instead want everything to be sugar-sweet for the developers. I saw this test and it only makes me more unsure. Tags: Ruby on Rails Topics: [...]

Asbjørn Ulsberg said...

For this to be a "real world" performance test, you need a much more sophisticated application. Who cares how fast any framework can iterate over an array and spit out the results through a template? When you introduce regular expressions, XML parsing, database calls, etc., is when you get a true feeling of how the frameworks fare in the "real world".

I'd too like to see how Django compares to Rails and Erlyweb, but on a "real world" application. Isn't there a "pet store" application available for all of these frameworks that such a test can be made against?

JFred said...

I actually have some similar code in a rails app I made. I computed the insides of the table into a class variable (inside the class but outside any method) essentially like this:

@@redLines = ["text..", "lines of text", ...
].map{|x| "" + x + "\n"}

def initialize
@redLines = @@redLines # copy ref so view can see it.
end

In other words, I removed the loop from the view since the text was constant. Do that for Rails and Erlang too and see what happens.

Alternatively, compute the text more dynamically with variation on each run to defeat caching. But constant text in an instance method is a contradiction.

Come to think of it, it might be that the performance difference is due to differences in default cache configurations. With constant text.

JFred said...

Your blog software mangled my code. Should I have used pre?

Let's try this line:
.map{|x| "<tr><td>" + x + "</td></tr>\n"}

JFred said...

Um, with the closing bracket:
].map{|x| "<tr><td>" + x + "</td></tr>\n"}

Yariv said...

JFred, I don't think it's a big issue that the text is static. I just wanted to make sure that the app does at least some computation before returning. I don't want to waste too much time tweaking this code where there isn't likely to be a big impact on performance.

jey said...

mmh.. ok for Rails but... what the hell is ErlyWeb?!

Comet Daily » Blog Archive » Getting started with Comet on Erlang said...

[...] ErlyWeb, a MVC web framework tailored to yaws and six times faster than RubyOnRails, according to a benchmark done by Yariv Sadan, the author of the framework. If you are considering adapting yaws for Comet, [...]

Richard said...

I've been reading the EC2 doc and I'm clueless as to how one would install a yaws/erlyweb server and more importantly how much would such a server cost to run? Just because the WS was open, does not mean it's running... where is the line drawn?

zed said...

and I would like to see ruby 1.9 in comparison
it was released last christmas and is 5-10 times faster than 1.8

motek said...

Try ruby 1.9 (compiled from source not from repository) and rails 2 :)

George Wright said...

I'm surprised that there are no DHH comments here.

Charleno Pires said...

You could test with Rails (version 2.1) + Passenger (version ) + Ruby Enterprise Edition (version 1.8.6-20080810) versus ErlyWeb (version 0.7.1). I'm curious about the results of this benchmark :D

DavidH said...

I've taken a shot at recreating these tests on a couple of home machines. I'm also using the "apache bench" ab command to do the load testing. The difficulty I'm having is keeping Yaws/ErlyWeb from resetting the connection when the level of concurrency goes beyond 10.

$ ab -n 10000 -c 1 http://153.32.54.253/test

works fine. Returns a nice fast high-performing result.

$ ab -n 10000 -c 5 http://153.32.54.253/test

works fine too. Also impressive timings. But...

$ ab -n 10000 -c 20 http://153.32.54.253/test

craps out every time with the following error:

Benchmarking 153.32.54.253 (be patient)
apr_socket_recv: Connection reset by peer (54)


Any idea why Yaws/ErlyWeb is doing this? Thanks for any info.

Carl said...

I'd like to see it with Rails 2.2 + Phusion Passenger. Also, memory usage between the two would be interesting (I like RoR, but on a VPS with 512mb of RAM you can't easily run very many different apps without problems if they are all even slightly busy).

I much prefer the syntax of Ruby over Erlang (but I'm also much newer to Erlang, so that may be it) but some of the performance things I've seen with Erlang make me very interested in learning it, especially with all the new multicore machines (though Rails is Threadsafe now, it still requires the programmer to be careful as there can be problems that, IIRC, simply cannot happen with Erlang, but please correct me if I'm wrong).