Thursday, December 13, 2007

Amazon SimpleDB Runs on Erlang

I just came across this blog post: http://www.satine.org/archives/2007/12/13/amazon-simpledb/. The author got the scoop that Amazon's just-released SimpleDB runs on Erlang. This is actually the second Amazon Web Service that runs on Erlang -- at least, if the rumors I've heard that Amazon SQS was built with Erlang as well are true (update: I found the reference in this article).

It's pretty cool knowing that you can use the same language that powers some of the massive systems that Amazon and Ericsson have built to whip up a blog app just as easily as in Ruby or Python. (But if you built it in Erlang/ErlyWeb, you could also use Comet to make the blog update itself in real time, resting assured that horizontal scalability is just a matter of reconfiguring your Mnesia schema :) ).

Amazon SimpleDB is a game changer. It fills the last hole in making Amazon Web Services a complete environment for elastic, scalable web application hosting: the need for a fast, reliable data store. Together with S3 and EC2, SimpleDB makes it feasible for small startups to scale like the big companies but without the operational overhead. What a great set of products from Amazon.

Tuesday, December 11, 2007

ErlyWeb vs. Ruby on Rails EC2 Benchmarking Strangeness

I've been running some more benchmarks for the ErlyWeb vs. Ruby on Rails EC2 Performance Showdown. The results I've gotten are very strange -- so strange, in fact, that I'm not going to "officially" publish anything before I run all the tests one more time.

What's strange about the results? I'll give you a quick glimpse, but please keep in mind that none of these observations are official benchmark results and they may all be invalidated in the next run.

- Recompiling the Erlang files *without* HiPE seemed to *increase* max performance (in requests/sec) by ~10%.
- Running Yaws with kernel poll enabled ("+K true" passed to erl) *decreased* the max ErlyWeb performance to 410 (~41% decrease).
- No matter how many Mongrels I started (I tried 1, 3, 5, and 10) behind Pen (running on the same server), the max performance of Rails was ~18% lower than when it was running on a single Mongrel without Pen (111 requests/sec).

This across-the-board strangeness calls for another run. I'll try to have the results sometime in the next few days. In the meantime, if anybody else wants to take a stab at these benchmarks, it would be interesting to see how our results compare.

By the way, although the changes I made had different effects from what I expected, rerunning the original tests unchanged gave results similar to the ones I got last time.

Sunday, December 09, 2007

ErlyWeb vs. Ruby on Rails EC2 Performance Showdown

A few people have asked me to run benchmarks comparing ErlyWeb and other frameworks, especially Ruby on Rails. Benchmarking is pretty boring, which is why I haven't benchmarked ErlyWeb until now. Plus, although I was pretty confident that ErlyWeb outperforms Rails, I didn't think proving it mattered that much. People who love Rails generally don't care much about raw performance, and people who love Erlang and functional programming don't need much convincing to use ErlyWeb over Rails.

However, as ErlyWeb is becoming more mature, my curiosity has been growing. With the recent releases of Rails 2.0 and Erlang/OTP R12B, I finally decided to sacrifice the better part of my weekend to give both frameworks some serious stress testing and see how they compare.

Before I could run the benchmarks, I had to figure out the physical setup. I needed at least two powerful servers with a fast link between them, and I didn't have this kind of hardware lying around. All I have is my MacBook, but running the benchmarks on my MacBook wouldn't prove much: with the clients and the servers sharing one machine, it would be impossible to separate the load generated by the clients from the performance of the servers.

Thankfully, Amazon EC2 made this easy to solve. I could just fire up two EC2 instances and have one of them stress test the other over EC2's fast internal network.

Benchmarking can be complex. How do you benchmark a web app? There are so many moving pieces and possible user interactions. I decided to keep it simple: I would test the performance of rendering a single page displaying a simple dynamically generated table. There would be no database queries -- the table's contents would be hardcoded in the controllers and the views would programmatically render the contents for each request.

I decided against using a database because I wanted to isolate the performance difference between ErlyWeb and Rails. Third-party dependencies such as a database may affect the results by introducing bottlenecks that would make one or both frameworks appear artificially slow. In addition, in many, if not most, webapps, the majority of requests can be served from the cache -- which means they don't trigger database queries -- so this test scenario is actually quite realistic.

The Test App



Here's a screenshot of the awesome test page I created:

songs_screenshot.png

You can see the generated page here (the output is identical for both apps except for minor whitespace differences).

This is the Rails controller code (songs_controller.rb):


class SongsController < ApplicationController
  def index
    @songs =
      ["Sgt. Pepper's Lonely Hearts Club Band",
       "With a Little Help from My Friends",
       "Lucy in the Sky with Diamonds",
       "Getting Better",
       "Fixing a Hole",
       "She's Leaving Home",
       "Being for the Benefit of Mr. Kite!",
       "Within You Without You",
       "When I'm Sixty-Four",
       "Lovely Rita",
       "Good Morning Good Morning",
       "Sgt. Pepper's Lonely Hearts Club Band (Reprise)",
       "A Day in the Life"]
  end
end


This is the Rails view code (songs/index.html.erb):


Songs

<% for song in @songs %>
<%= song %>
<% end %>
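To make the view's loop concrete outside of Rails, here is a minimal sketch using Ruby's standard ERB library. The table markup and the local variable `songs` are assumptions for illustration (the view's original HTML tags are not shown here), and the list is shortened to three titles.

```ruby
require "erb"

# Hypothetical stand-in for the controller's @songs (first three titles only).
songs = [
  "Sgt. Pepper's Lonely Hearts Club Band",
  "With a Little Help from My Friends",
  "Lucy in the Sky with Diamonds"
]

# Assumed markup: the table structure below is a guess at what the real
# view wraps around each song; only the loop itself comes from the view.
template = <<~TEMPLATE
  <table>
  <% for song in songs %>
    <tr><td><%= song %></td></tr>
  <% end %>
  </table>
TEMPLATE

# Render the template against the current local variables.
html = ERB.new(template).result(binding)
puts html
```

Each request to the Rails app does the equivalent of this render step, which is the work the benchmark measures.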





This is the ErlyWeb controller code (songs_controller.erl):


-module(songs_controller).
-export([index/1]).

index(_A) ->
    {data,
     [<<"Sgt. Pepper's Lonely Hearts Club Band">>,
      <<"With a Little Help from My Friends">>,
      <<"Lucy in the Sky with Diamonds">>,
      <<"Getting Better">>,
      <<"Fixing a Hole">>,
      <<"She's Leaving Home">>,
      <<"Being for the Benefit of Mr. Kite!">>,
      <<"Within You Without You">>,
      <<"When I'm Sixty-Four">>,
      <<"Lovely Rita">>,
      <<"Good Morning Good Morning">>,
      <<"Sgt. Pepper's Lonely Hearts Club Band (Reprise)">>,
      <<"A Day in the Life">>]}.


This is the ErlyWeb view code (songs_view.et):


<%@ index(Songs) %>

Songs



<% [song(S) || S <- Songs] %>


<%@ song(S) %><% S %>



Setup



This is what I installed on the EC2 image (I started with the basic public Fedora Core 4 image):

- Erlang/OTP R12B.
- Yaws 1.73 (compiled with HiPE).
- ErlyWeb from trunk (compiled with HiPE).
- Rails 2.0.1
- Mongrel 1.1.1
- MySQL 4.1 (unused)
- Tsung 1.2.1.

Notes:

- Rails insists on connecting to a database even if you don't want or need one. There must be some hacky way of running Rails without a database, but it was easier for me to just set up MySQL so Rails stops complaining.
- I also compiled the ErlyWeb app with HiPE.

Tsung (http://tsung.erlang-projects.org/) is the tool I used for the stress testing. It's pretty simple to use -- you specify in an XML file where the server is, what page(s) to request, and how frequently to initiate new requests, and then it runs the test and produces a report. You can read more about it on its website.

One interesting fact about Tsung is that it is written in Erlang to take advantage of Erlang's support for massive concurrency (there will be more on that later :) ).

During the test, one instance runs the servers and the other runs Tsung. Even so, I installed Tsung on the same machine image as the web servers, because having two identical EC2 machine images made things easier to set up.

During the test, I had both servers running on the server instance listening on different ports (Mongrel on port 3000, Yaws on port 3001). I then ran Tsung twice: first against Mongrel and then against Yaws. The tests did not overlap.

For the test, I configured Tsung to go through a sequence of 11 10-second phases during which Tsung would send requests to the server at increasing frequencies. In the first phase, Tsung would send 2 requests per second. It would then progress to 10, 20, 40, 100, 200, 400, 1000, 2000, 4000, and finally 10000 requests per second.
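To get a feel for the size of this schedule, here is a small Ruby sketch that totals the requests Tsung would attempt across the 11 phases. The variable and method names are made up for illustration, and it only does arithmetic on the rates listed above; it says nothing about how Tsung itself is configured.

```ruby
# Requests attempted in one phase at a given arrival rate.
def phase_requests(rate, seconds)
  rate * seconds
end

# Requests attempted across every phase of the schedule.
def total_requests(rates, seconds)
  rates.sum { |rate| phase_requests(rate, seconds) }
end

# Arrival rates (requests/sec) for the 11 phases, 10 seconds each.
rates = [2, 10, 20, 40, 100, 200, 400, 1000, 2000, 4000, 10_000]
phase_seconds = 10

rates.each_with_index do |rate, i|
  gap_ms = 1000.0 / rate # average time between new requests
  puts format("phase %2d: %5d req/s, %6d requests, %.2f ms apart",
              i + 1, rate, phase_requests(rate, phase_seconds), gap_ms)
end

puts "total duration: #{rates.size * phase_seconds} s"
puts "total requests attempted: #{total_requests(rates, phase_seconds)}"
```

The whole run lasts 110 seconds and attempts 177,720 requests, more than half of them in the final 10,000 requests/sec phase, so the last phases are mostly a measure of how each server behaves when overloaded.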

Caveat Emptor



Before I show the results, I would like to say that you should take them with a grain of salt. Many benchmarks have flaws. Maybe I messed something up during the installation. Maybe my methodology was wrong. Maybe I didn't perform some Mongrel optimization trick that boosts its performance. Maybe your application would behave quite differently from my test app. You have been warned.

Results



UPDATE: the initial results were inaccurate because I didn't run Mongrel in production mode. I updated the results and the graphs to reflect Rails's performance in production mode.

Below is the summary of the most important measurements.

Peak Performance
Rails: 112.8 requests/sec (initially reported as 15.1)
ErlyWeb: 699.8 requests/sec

Peak Send Rate
Rails: 1360.78 Kb/sec (initially reported as 181.11)
ErlyWeb: 5997.50 Kb/sec (hmm... did we hit EC2's limit?)

Total Size Sent
Rails: 5.83 MB (initially reported as 1.65)
ErlyWeb: 28.47 MB

Peak Receive Rate
Rails: 175.79 Kb/sec (initially reported as 84.87)
ErlyWeb: 941.84 Kb/sec

Total Size Received
Rails: 0.64 MB (initially reported as 0.21)
ErlyWeb: 4.43 MB

Performance Degradation Point
(The phase at which the framework started sharply dropping the response rate)

Rails: 400 requests/sec phase (initially reported as the 100 requests/sec phase).
ErlyWeb: 4000 requests/sec phase.



Note: these are the updated reports after running Mongrel in production mode

You can view the full Tsung reports here: ErlyWeb, Rails

These reports are from the second test (10 requests/sec sustained for 30 seconds):
ErlyWeb, Rails

You can also download the reports, zipped: reports.tar.gz

Conclusion



ErlyWeb's peak response rate was 6x Rails's on similar hardware (the initial results, taken before Rails was running in production mode, showed 47x).
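The roughly 6x figure follows directly from the production-mode peak rates in the summary. A minimal Ruby sketch of the arithmetic, with illustrative variable and method names:

```ruby
# Peak rates copied from the results summary (Rails in production mode).
rails_peak   = 112.8 # requests/sec
erlyweb_peak = 699.8 # requests/sec

# How many times more requests per second the faster server handled.
def speedup(faster, slower)
  faster / slower
end

ratio = speedup(erlyweb_peak, rails_peak)
puts format("ErlyWeb / Rails peak throughput: %.1fx", ratio)
```

This prints a ratio of about 6.2x.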

I'm actually surprised by the results. I expected ErlyWeb to win, but I didn't expect ErlyWeb to outperform Rails by such a wide margin.

The results probably reflect differences in language and runtime more so than implementation details in the frameworks. After 20 years of refinement driven by the needs of high-performance telecom applications, Erlang delivers.

Update:
Looking at the peak send rate for ErlyWeb -- an almost round 6 MB/s -- I wonder whether ErlyWeb simply saturated EC2's internal bandwidth, and whether it could perform even better on a faster link.

Update 2:
I ran another test, this time hitting each server at 10 requests/sec sustained for 30 seconds to measure average response time. Rails's response time was in the 5-7 msec range (120-180 msec in the initial run, before production mode), and ErlyWeb's was in the 1.4-1.8 msec range. This means that under light to medium load, ErlyWeb serves the 'songs' page roughly 4x faster than Rails (the initial results suggested 100x).
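Because both response times are reported as ranges, "roughly 4x" is itself an estimate. The Ruby sketch below, with illustrative names, shows the spread implied by the production-mode figures:

```ruby
# Response-time ranges in milliseconds, from the sustained 10 requests/sec test.
rails_ms   = (5.0..7.0)
erlyweb_ms = (1.4..1.8)

# Middle of a numeric range.
def midpoint(range)
  (range.begin + range.end) / 2.0
end

lowest  = rails_ms.begin / erlyweb_ms.end   # fastest Rails vs. slowest ErlyWeb
highest = rails_ms.end / erlyweb_ms.begin   # slowest Rails vs. fastest ErlyWeb
typical = midpoint(rails_ms) / midpoint(erlyweb_ms)

puts format("ratio range: %.2fx to %.2fx, midpoints: %.2fx", lowest, highest, typical)
puts "requests per server in this test: #{10 * 30}"
```

The ratio works out to somewhere between about 2.8x and 5x, with the midpoints of the two ranges giving about 3.75x.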