Yariv's Blog: erlyweb


Wednesday, May 28, 2008

Announcing Twoorl: an open source ErlyWeb-based Twitter clone

With the recent brouhaha over Twitter's scalability problems, I thought, wouldn't it be fun to write a Twitter clone in Erlang?

Last weekend was cold and rainy here in Palo Alto, so I sat down and hacked one, and thus Twoorl was born. It took me one full day plus a couple of evenings. The codebase is about 1700 lines (including comments). You can get it at http://code.google.com/p/twoorl

Note: you need the trunk version of ErlyWeb to make it work (when released, it will be the 0.7.1 version).

Many people have written about Twitter's scalability problems and how to solve them. Some have blamed Rails (TechCrunch is among them), whereas others, including Blaine Cook, Twitter's Architect, have convincingly argued that you can scale a webapp written in any language/framework if you've figured out how to Just Add More Servers to handle the growing traffic. Eran Hammer-Lahav wrote some of the most insightful articles on the subject, On Scaling a Microblogging Service.

I have no idea why Twitter is having a hard time scaling. Well, I have some suspicions, but since I haven't been in the Twitter trenches, such speculation isn't worth wasting many pixels on.

I didn't write a Twitter clone in Erlang because I thought my implementation would be inherently more scalable than a Rails one (although it may be cheaper to scale because Erlang has very good performance). In fact, Twoorl right now wouldn't scale well at all since I prioritized simplicity above all else.

The reasons I wrote Twoorl are:

- ErlyWeb needs more open source apps showing how to use the framework. It's hard to pick up how to use the framework just from the API docs.
- Twitter is awesome. Once you start using it, it becomes addictive. I thought it would be fun to write my own.
- Twitter is very popular, but I don't know of any open source clones. I figured somebody may actually want one!
- Some people think Erlang isn't a good language for building webapps. I like to prove them wrong :)
- Although you can scale pretty much anything, your choice of language can make a difference in terms of performance and stability, both of which lead to happy users.
- I think Erlang is a great language for writing a Twitter clone because Twitter's functionality offers interesting opportunities to benefit from concurrency. Here are a couple of ideas I thought of:

1) If you use sharding, the tweets for different users would be stored in separate databases. When you render the page for someone's timeline, wouldn't it be advantageous to fetch the tweets for all the users she follows in parallel? In Ruby, you would probably do something like this:


def get_tweets(users)
  # Fetch each user's tweets one after another, then sort the combined list.
  all_tweets = []
  users.each do |user|
    all_tweets.concat(user.fetch_tweets)
  end
  all_tweets.sort
end

(Please forgive any language errors -- my Ruby is very rusty. Treat the above as pseudocode.)

This code would work well enough for a small number of tweet streams, but as the number gets large, it would take a very long time to execute.

In ErlyWeb, you could instead do the following:


%% pmap/2 is a parallel map (one process per element); it is not in Erlang's standard library.
get_tweets(Users) ->
  sort(flatten(pmap(fun(Usr) -> Usr:tweets() end, Users))).

This would spawn a process for each user the user follows, fetch the tweets for that user, then reassemble them in sorted order in the original process before rendering the page. (Think of it as map/reduce implemented directly in the application controller.) If a user follows hundreds of other users, querying their tweets in parallel can significantly reduce page rendering time.
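Since pmap is not part of Erlang's standard library, here is a minimal sketch of what such a parallel map could look like. The module name pmap_sketch and the FetchTweets argument (a stand-in for Usr:tweets()) are assumptions made for illustration and are not taken from Twoorl. The sketch uses lists:append instead of flatten so that each tweet term stays intact.

```erlang
-module(pmap_sketch).
-export([pmap/2, get_tweets/2]).

%% Apply F to every element of List, each in its own process, and
%% return the results in the same order as the input list.
pmap(F, List) ->
    Parent = self(),
    %% Tag every worker with a unique reference so that its reply
    %% can be matched to its position in the list.
    Refs = [begin
                Ref = make_ref(),
                %% spawn_link: a crashing worker takes the caller down
                %% with it instead of leaving it waiting forever.
                spawn_link(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- List],
    %% Selective receive on each reference preserves the input order,
    %% regardless of which worker finishes first.
    [receive {Ref, Result} -> Result end || Ref <- Refs].

%% FetchTweets is a hypothetical fun that takes a user and returns that
%% user's tweets as a list of sortable terms.
get_tweets(FetchTweets, Users) ->
    lists:sort(lists:append(pmap(FetchTweets, Users))).
```

With this approach the total wait is roughly that of the slowest single fetch rather than the sum of all fetches, assuming the shards can serve the queries concurrently.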

2) Background tasks. When a user sends a tweet, the first thing you want to do is store it in the database. Then, depending on the features, you have to do a bunch of other stuff: send IM/SMS notifications, update RSS feeds, expire caches, etc. Why not do those tasks in different background processes? After the write to the DB, you can return an immediate reply to the user, giving him or her the perception of speed, and then let the background processes do all the extra work for processing the tweet.

(This technique works very well for Facebook apps, by the way. In Vimagi, when the user submits a painting, the app first saves the painting data, and then it spawns a new process to update the news feed and profile box, send notifications, etc.)

Anyway, I hope you enjoy Twoorl. It's still in very early alpha. It doesn't have many features and it probably has bugs. Please take Twoorl for a spin and give me your feedback! I'll also appreciate useful contributions :)

Sunday, April 13, 2008

Tuesday, April 01, 2008

ErlyWeb renamed "Erlang on Rails"

Erlang is *almost* at a tipping point. Thanks to reddit, many people are interested in it. However, it's not there yet. Despite being the only language that got concurrency right, and all its other standout features, many developers still use other languages. The only explanation I can think of is that Erlang hasn't had much PR over the years (the Erlang movie notwithstanding). I'm confident that a good PR boost will help push Erlang over the hill. Unfortunately, Ericsson doesn't seem interested in heavily promoting Erlang the way Microsoft promotes .NET and Sun promotes Java (this may be because many Ericsson employees have never heard of Erlang). So, I decided to take things into my own hands. I don't have a budget, so I need to get creative. The best way to succeed if you're small is to ride a big wave -- and what's a better wave to ride than Ruby on Rails?

Ruby on Rails is very popular -- much more than ErlyWeb. I believe this popularity is due to the "on Rails" meme, which is just bursting with positive connotations. It sounds young, fresh, happy. It's the anti-enterprisy software. It emancipates you from burdensome type systems, explicit getters and setters, and (ugh) XML. Its metaprogramming wizardry is made of bliss. It evokes images of riding in environmentally-friendly transportation looking out the window at grassy meadows, rolling hills and sunny skies.

I think that renaming ErlyWeb to "Erlang on Rails" will help win over the hearts and minds of many programmers who are currently on the fence. They may be curious about Erlang but are turned off by its telecom image. "Erlang on Rails" conveys a more balanced feeling of industrial strength applications from the telecom world mixed with the social Web 2.0 era of interconnectedness that celebrates the rise of individualism over grey corporate culture.

2008 will be the year of Erlang on Rails. I know it.

Update: This was an April Fool's joke, in case it's not obvious anymore :)

Sunday, March 23, 2008

I Play WoW: A Cool Facebook App Built With ErlyWeb

Nick Gerakines, the author of Facebook Application Development, used ErlyWeb to create the very cool Facebook app I Play WoW. I Play WoW bridges between real people and the characters they play on World of Warcraft. Nick told me he got a lot of feedback such as "Wow! I didn't know my brother-in-law is in my guild!" and "Its been 5 years since I talked to some of them, but a bunch of my friends from school play on these realms and I didn't even know they played".

Some facts:
- 53.5k installs
- 2.9k daily active users
- 1.3k application fans
- 200+ new users a day on average
- In the past 30 days it's gotten over 2 million page views, with users spending more than 5 minutes on average on the application
- Erlang application layout:
  * Charstore w/ Mnesia: Acts as the raw character store and cache for interactions with wowarmory.com
  * I Play WoW w/ ErlyWeb + Mnesia: The front-end and UI for the application. The majority of the FB API calls are made here or are spawned from here.
  * There is still one component in Perl that is yet to be ported over, mainly due to not having enough time. It's on the list of things to do.

(My note: it sounds like Nick is also using spawned processes to make FB API calls asynchronously. It's a great technique for reducing page load time and avoiding the annoying timeouts Facebook imposes on page renderings.)

If you play World of Warcraft (an addiction I've luckily been able to avoid thus far :) ) and you are on Facebook, give I Play WoW a try. You may discover that your boss is a level 10 ogre or something :)

Congrats, Nick, for creating such a successful app with ErlyWeb!

Sunday, February 17, 2008

Seaside-Style Programming in ErlyWeb

The Arc Challenge started an interesting thread in the ErlyWeb mailing list about continuations-driven web frameworks. ErlyWeb doesn't have built-in support for continuations, but Arc does and so does Seaside. I hadn't paid much attention to the use of continuations in web frameworks before the Arc challenge, but I became especially interested in experimenting with them after seeing the Seaside solution.

In case you haven't read it, this is the requirement of the Arc challenge:

Write a program that causes the url said (e.g. http://localhost:port/said) to produce a page with an input field and a submit button. When the submit button is pressed, that should produce a second page with a single link saying "click here." When that is clicked it should lead to a third page that says "you said: ..." where ... is whatever the user typed in the original input field. The third page must only show what the user actually typed. I.e. the value entered in the input field must not be passed in the url, or it would be possible to change the behavior of the final page by editing the url.

This is the original Arc solution:


(defop said req
  (aform [w/link (pr "you said: " (arg _ "foo"))
              (pr "click here")]
     (input "foo")
     (submit)))

This is how you would write the same logic in Erlang, if it had an Arc-like web framework:


said(A) ->
  form(
     fun(A1) ->
       link(fun(A2) -> ["you said ", get_var(A1, "foo")] end,
         "click here")
     end,
     [input("foo"),
      submit()]).

The Erlang code is a bit more verbose, mostly because Erlang macros don't allow you to hide the "fun() -> ... end" syntax the way Lisp macros let you hide the (lambda ..) keyword.

This is the Seaside solution:


| something |
  something := self request: 'Say something'.
  self inform: 'Click here'.
  self inform: something

IMO, this solution cheats a bit by using high-level functions for generating the HTML tags whereas the Arc version generates them explicitly. However, putting minor complaints aside, I think the Seaside version is the winner in readability. As a reddit comment said, it reads like prose. It doesn't even declare any closures explicitly -- it says exactly what it does and nothing more.

Wouldn't it be cool if we could use this style of programming in ErlyWeb applications?

Fortunately, we can! I hacked a continuations plugin for ErlyWeb that lets you write Seaside-style code so fans of this programming style would feel at home with ErlyWeb. (This is all done in 105 lines of code :) ) Before I explain how the plugin works, I'll show you how to create an ErlyWeb controller that implements the Arc challenge using this plugin:


-module(said_controller).
-compile(export_all).
-import(continuations, [ask/2, confirm/2, pr/1]).

index(A) ->
    continuations:do(
      A, fun(K) ->      
                 Name = ask(K, "name"),
                 confirm(K, "click here"),
                 pr(["your name: ", Name])
         end).

This may seem more verbose than the Seaside code because of the module declarations at the top, but the "meat" is about the same. (I could make this code even smaller by integrating continuations.erl deep into ErlyWeb, which would remove the explicit call to continuations:do(). However, I didn't want to go too far with this proof of concept.)

How does this work? Using concurrency, of course! For each "continuation", the plugin spawns a process and registers it in a Mnesia table according to a randomly generated key. The key (K) is encoded in the URLs to which the <form> and <a> tags point. When a request arrives, continuations:do() looks up the process in Mnesia and sends it a message of the type {A, self()}. The process does some work and sends back in reply the HTML to be rendered, and waits for the next message. The web server process receives the rendered HTML and sends it to the client using the normal ErlyWeb mechanisms.

If a process doesn't receive a message in 10 seconds, it dies and removes itself from the Mnesia table, which provides automatic garbage collection for stale sessions.
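To make the mechanism more concrete, here is a minimal sketch of the process-per-continuation idea. It is not the code from continuations.erl: the module name cont_sketch, the functions start/1 and resume/2, and the Show fun are all hypothetical. The Mnesia key lookup is left out, so the caller holds the pid directly, and the process dictionary is used only to keep the sketch short.

```erlang
-module(cont_sketch).
-export([start/1, resume/2]).

%% Idle time after which an abandoned continuation process exits.
-define(IDLE_TIMEOUT, 10000).

%% Start a continuation process. Flow is a fun that takes a Show fun.
%% Show(Html) replies to the pending request with Html, then blocks
%% until the next request arrives and returns that request's params.
start(Flow) ->
    spawn(fun() ->
                  await_request(),          % the first request starts the flow
                  Final = Flow(fun show/1),
                  reply(Final)              % last page; the process then exits
          end).

%% Called from the web server process: forward the request to the
%% continuation process and wait for the page it renders.
resume(Pid, Params) ->
    MRef = erlang:monitor(process, Pid),
    Pid ! {Params, self()},
    receive
        {Pid, Html} ->
            erlang:demonitor(MRef, [flush]),
            {ok, Html};
        {'DOWN', MRef, process, Pid, _Reason} ->
            {error, expired}                % process finished or timed out
    end.

show(Html) ->
    reply(Html),
    await_request().

reply(Html) ->
    get(reply_to) ! {self(), Html}.

await_request() ->
    receive
        {Params, From} when is_pid(From) ->
            put(reply_to, From),            % remember whom to answer
            Params
    after ?IDLE_TIMEOUT ->
        exit(normal)                        % stale session: free its state
    end.
```

A three-page flow like the Arc challenge could then be written as fun(Show) -> Said = Show("form"), Show("click here"), ["you said: ", Said] end, with each call to resume/2 advancing it by one page.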

You can get the code for continuations.erl here. Just remember it's a proof of concept and I don't recommend using it in a production environment.

(Before you use it in your application, make sure you call continuations:start().)

Final word: IMO, although continuations help write more natural code in certain multi-page interactions, most of the logic in web applications involves rendering dynamic pages for RESTful URLs. So, if your web framework doesn't support continuations, don't worry about it too much. It's likely the code for your application wouldn't be dramatically smaller if you could write it with the use of continuations. (That said, take my advice with a grain of salt. I haven't used a continuations based framework to build a real application, so I may be missing something.)

Monday, January 21, 2008

How to Use Concurrency to Improve Response Time in ErlyWeb Facebook Apps

When I was building the Vimagi Facebook app, I came across a common scenario where using concurrency can make your application more responsive for its users.

The typical flow of responding to requests coming from Facebook looks like this:

1) request arrives
2) do some stuff (mostly DB CRUD operations)
3) call Facebook API to send notifications / update newsfeeds and profile FBML
4) send response

When you're building a Facebook app with ErlyWeb, you can instead do the following:

1) request arrives
2) do some stuff (mostly DB CRUD operations)

spawn(fun() ->
3) call Facebook API to send notifications / update newsfeeds and profile FBML
end)

4) send response

The Facebook API calls in step 3 are much more expensive than the typical ErlyWeb controller operations because these calls involve synchronous round trips to the Facebook servers plus XML processing for the responses. By performing the Facebook API calls in a new process we can return the rendered response to the browser immediately and let the Erlang VM schedule the Facebook API calls to happen leisurely in the background.

The only gotcha is that if an error occurs in the spawned process we can't notify the user right away -- but this isn't really a problem because we can log the errors and retry later, which is arguably a better approach anyway from a usability standpoint.
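The flow above, including the error logging, could be sketched roughly as follows. The module name bg_sketch and the two fun arguments are hypothetical: SaveFun stands in for the synchronous controller work, and NotifyFun stands in for the slow Facebook API calls. Retrying is left out.

```erlang
-module(bg_sketch).
-export([handle_request/2]).

%% SaveFun does the synchronous work (e.g. the DB operations) and
%% returns the response. NotifyFun does the slow work that the user
%% doesn't need to wait for.
handle_request(SaveFun, NotifyFun) ->
    Response = SaveFun(),
    spawn(fun() ->
                  %% Errors here can't reach the user anymore,
                  %% so log them instead of crashing silently.
                  try
                      NotifyFun()
                  catch
                      Class:Reason ->
                          error_logger:error_msg(
                            "background task failed: ~p:~p~n",
                            [Class, Reason])
                  end
          end),
    %% Returned right away, without waiting for NotifyFun.
    Response.
```

Because spawn returns as soon as the new process is created, the response time no longer includes the round trips made inside NotifyFun.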

It's really that easy! A simple call to spawn() makes our app *feel* much faster. This puts the debate around language performance comparisons in a new light: how do you take into account the observation that some languages make "cheating" so much easier? :)

Sunday, January 20, 2008

Vimagi: The Facebook App

I finally finished the Vimagi Facebook app! Check out the screen shots:

The app works similarly to the vimagi.com website: you can create paintings with titles, tags, and descriptions and share them with anyone. Other Facebook users can comment on the paintings and rate them. Paintings created in Facebook are automatically posted to vimagi.com, where vimagi.com users can also add comments and ratings. All paintings have embed codes you can use to embed them on any site (with playback). In Facebook, you can create paintings for your Facebook friends and those paintings will appear on their (and your) profile. Similar to vimagi.com, the Facebook app has a gallery and a tags page, which contains content from both vimagi.com and the Facebook app.

I created the Facebook app mostly as a learning exercise, but I also believe that the Vimagi features would appeal to more people by being embedded into people's existing social network rather than in a standalone site.

Unfortunately, I underestimated how tricky it would be to port the vimagi.com features and code into Facebook without breaking the site and while seamlessly blending the paintings and data created on Facebook and on vimagi.com. (One of the pesky issues I encountered is that Facebook applications have root URLs of the form http://apps.facebook.com/[app name], which was incompatible with the vimagi.com relative URLs. The existing URLs all start with a forward slash, assuming they follow immediately after the domain name. My life would have been much better if Facebook used the url scheme "http://[app name].apps.facebook.com/" for Facebook apps as it would have allowed existing URLs to remain unmodified.)

If you like to paint, I hope you enjoy the applications, and if you have friends who like to paint, please send them an invitation. Facebook needs some more Erlang-generated pageviews :)

If you like Vimagi and you're on Facebook, please join the Vimagi fan club. Also, I'll appreciate it if you let me know of any feedback or suggestions you may have.

P.S. Thanks to Bryan Fink for the excellent erlang2facebook library.

Thursday, January 17, 2008

Join the ErlyWeb Fan Club on Facebook

Are you on Facebook? Then join the ErlyWeb Fan Club!

If all goes according to plan, some day this fan club will have more members than the "If this group reaches 4,294,967,296 it might cause an integer overflow" group :)

Tuesday, December 11, 2007

ErlyWeb vs. Ruby on Rails EC2 Benchmarking Strangeness

I've been running some more benchmarks for the ErlyWeb vs. Ruby on Rails EC2 Performance Showdown. The results I've gotten are very strange -- so strange, in fact, that I'm not going to "officially" publish anything before I run all the tests one more time.

What's strange about the results? I'll give you a quick glimpse, but please keep in mind that none of these observations are the official benchmark results and they may all be invalidated in the next run.

- Recompiling the Erlang files *without* HiPE seemed to *increase* max performance (in requests/sec) by ~10%.
- Running Yaws with kernel poll enabled ("+K true" passed to erl) *decreased* the max ErlyWeb performance to 410 (~41% decrease).
- No matter how many Mongrels I started (I tried 1, 3, 5, and 10) behind Pen (running on the same server), the max performance of Rails was ~18% lower than when it was running on a single Mongrel without Pen (111 requests/sec).

This across-the-board strangeness calls for another run. I'll try to have the results sometime in the next few days. In the meantime, if anybody else wants to take a stab at these benchmarks, it would be interesting to see how our results compare.

By the way, although the effects of the changes I made were different from my expectations, when I ran the same tests the results were similar to the results I got last time.

Sunday, December 09, 2007

ErlyWeb vs. Ruby on Rails EC2 Performance Showdown

A few people have asked me to run benchmarks comparing ErlyWeb and other frameworks, especially Ruby on Rails. Benchmarking is pretty boring, which is why I haven't benchmarked ErlyWeb until now. Plus, although I was pretty confident that ErlyWeb outperforms Rails, I didn't think proving it mattered that much. People who love Rails generally don't care much about raw performance, and people who love Erlang and functional programming don't need much convincing to use ErlyWeb over Rails.

However, as ErlyWeb is becoming more mature, my curiosity has been growing. With the recent releases of Rails 2.0 and Erlang/OTP R12B, I finally decided to sacrifice the better part of my weekend to give both frameworks some serious stress testing and see how they compare.

Before I could run the benchmarks, I had to figure out the physical setup. I needed at least two powerful servers with a fast link between them, and I didn't have this kind of hardware lying around. All I have is my MacBook, but running the benchmarks on my MacBook wouldn't prove much because it would be impossible to isolate the impact the clients would have on the servers by running on the same machine.

Thankfully, Amazon EC2 made this easy to solve. I could just fire up two EC2 instances and have one of them stress test the other over EC2's fast internal network.

Benchmarking can be complex. How do you benchmark a web app? There are so many moving pieces and possible user interactions. I decided to keep it simple: I would test the performance of rendering a single page displaying a simple dynamically generated table. There would be no database queries -- the table's contents would be hardcoded in the controllers and the views would programmatically render the contents for each request.

I decided against using a database because I wanted to isolate the performance difference between ErlyWeb and Rails. Third-party dependencies may affect the results by introducing bottlenecks that would make one or both frameworks appear artificially slow. In addition, in many, if not most, webapps, the majority of requests can be served from the cache -- which means they don't trigger database queries -- so this test scenario is actually quite realistic.

The Test App

Here's a screenshot of the awesome test page I created:

You can see the generated page here (the output is identical for both apps except for minor whitespace differences).

This is the Rails controller code (songs_controller.rb):


class SongsController < ApplicationController
  def index
     @songs =
             ["Sgt. Pepper's Lonely Hearts Club Band",
              "With a Little Help from My Friends",
              "Lucy in the Sky with Diamonds",
              "Getting Better",
              "Fixing a Hole",
              "She's Leaving Home",
              "Being for the Benefit of Mr. Kite!",
              "Within You Without You",
              "When I'm Sixty-Four",
              "Lovely Rita",
              "Good Morning Good Morning",
              "Sgt. Pepper's Lonely Hearts Club Band (Reprise)",
              "A Day in the Life"]
  end
end

This is the Rails view code (songs/index.html.erb):


Songs

<% for song in @songs %>
      <%= song %>
<% end %>

This is the ErlyWeb controller code (songs_controller.erl):


-module(songs_controller).
-export([index/1]).

index(_A) ->
    {data,
     [<<"Sgt. Pepper's Lonely Hearts Club Band">>,
      <<"With a Little Help from My Friends">>,
      <<"Lucy in the Sky with Diamonds">>,
      <<"Getting Better">>,
      <<"Fixing a Hole">>,
      <<"She's Leaving Home">>,
      <<"Being for the Benefit of Mr. Kite!">>,
      <<"Within You Without You">>,
      <<"When I'm Sixty-Four">>,
      <<"Lovely Rita">>,
      <<"Good Morning Good Morning">>,
      <<"Sgt. Pepper's Lonely Hearts Club Band (Reprise)">>,
      <<"A Day in the Life">>]}.

This is the ErlyWeb view code (songs_view.et):


<%@ index(Songs) %>Songs


<% [song(S) || S <- Songs] %>


<%@ song(S) %><% S %>

Setup

This is what I installed on the EC2 image (I started with the basic public Fedora Core 4 image):

- Erlang/OTP R12B.
- Yaws 1.73 (compiled with HiPE).
- ErlyWeb from trunk (compiled with HiPE).
- Rails 2.0.1
- Mongrel 1.1.1
- MySQL 4.1 (unused)
- Tsung 1.2.1.

Notes:

- Rails insists on connecting to a database even if you don't want or need one. There must be some hacky way of running Rails without a database, but it was easier for me to just set up MySQL so Rails stops complaining.
- I also compiled the ErlyWeb app with HiPE.

Tsung (http://tsung.erlang-projects.org/) is the tool I used for the stress testing. It's pretty simple to use -- you specify in an XML file where the server is, what page(s) to request and the frequency of initiating new requests, and then it runs the test and produces a report. You can read more about it on its website.

One interesting fact about Tsung is that it is written in Erlang to take advantage of Erlang's support for massive concurrency (there will be more on that later :) ).

I installed Tsung on the same machine image as the web servers because although during the test one instance runs the servers and the other runs Tsung, I wanted to have two identical EC2 machine images to ease setting things up.

During the test, I had both servers running on the server instance listening on different ports (Mongrel on port 3000, Yaws on port 3001). I then ran Tsung twice: first against Mongrel and then against Yaws. The tests did not overlap.

For the test, I configured Tsung to go through a sequence of 11 10-second phases during which Tsung would send requests to the server at increasing frequencies. In the first phase, Tsung would send 2 requests per second. It would then progress to 10, 20, 40, 100, 200, 400, 1000, 2000, 4000, and finally 10000 requests per second.

Caveat Emptor

Before I show the results, I would like to say that you should take them with a grain of salt. Many benchmarks have flaws. Maybe I messed something up during the installation. Maybe my methodology was wrong. Maybe I didn't perform some Mongrel optimization trick that boosts its performance. Maybe your application would behave quite differently from my test app. You have been warned.

Results

UPDATE: the initial results were inaccurate because I didn't run Mongrel in production mode. I updated the results and the graphs to reflect Rails's performance in production mode.

Below is the summary of the most important measurements.

Peak Performance
Rails: ~~15.1~~ 112.8 requests/sec
ErlyWeb: 699.8 requests/sec

Peak Send Rate
Rails: ~~181.11~~ 1360.78 Kb/sec
ErlyWeb: 5997.50 Kb/sec (hmm... did we hit EC2's limit?)

Total Size Sent
Rails: ~~1.65~~ 5.83 MB
ErlyWeb: 28.47 MB

Peak Receive Rate
Rails: ~~84.87~~ 175.79 Kb/sec
ErlyWeb: 941.84 Kb/sec

Total Size Received
Rails: ~~0.21~~ 0.64 MB
ErlyWeb: 4.43 MB

Performance Degradation Point
(The phase at which the framework started sharply dropping the response rate)

Rails: ~~100~~ 400 requests/sec phase.
ErlyWeb: 4000 requests/sec phase.

Note: these are the updated reports after running Mongrel in production mode

You can view the full Tsung reports here: ErlyWeb Rails

These reports are from the second test (30 second sustaining at 10 requests/sec):
ErlyWeb Rails

You can also download the reports, zipped: reports.tar.gz

Conclusion

ErlyWeb's peak response rate was ~~47x~~ 6x Rails's on similar hardware.

I'm actually surprised by the results. I expected ErlyWeb to win, but I didn't expect ErlyWeb to outperform Rails by such a wide margin.

The results probably reflect differences in language and runtime more so than implementation details in the frameworks. After 20 years of refinement driven by the needs of high performance telecom applications, Erlang delivers.

Update:
Looking at the peak send rate for ErlyWeb -- an almost round 6 MB/s, I wonder if ErlyWeb just saturated EC2's internal bandwidth, and whether ErlyWeb could perform even better on a faster link.

Update 2:
I ran another test, this time hitting each server at 10 requests/sec sustaining for 30 seconds to measure average response time. Rails's response time was in the ~~120-180~~ 5-7 msecs range, and ErlyWeb's was in the 1.4-1.8 msec range. This means that under medium to light load, ErlyWeb serves the 'songs' page roughly ~~100x~~ 4x faster than Rails.

Sunday, November 18, 2007

Vimagi Speedups

Vimagi was much slower than it should have been.

I had foolishly compiled Vimagi in production with {auto_compile, true}. This option tells ErlyWeb to scan the app's source files and recompile all the ones that have changed since the last request. This feature greatly speeds up development because you can edit your file, reload the page, and immediately see the effects of your changes. It was also convenient for me in production because after checking in some changes from my dev box I would just 'svn up' on the production server and the changes would be deployed automatically. Unfortunately, I didn't realize what a negative impact it has on performance. Thankfully, David King brought this to my attention on the ErlyWeb mailing list and I've since disabled auto_compile on the production server.

Vimagi performs much better now. Clicking around the site, most pages now load in 0.3 - 0.5 secs according to YSlow. It feels much faster. In fact, Vimagi's performance is now similar to BeerRiot's, which is no small feat.

By the way, I'm not sure about BeerRiot, but Vimagi's pages are all dynamically generated -- I haven't implemented any caching (my VPS can easily handle the current traffic levels, so implementing caching right now would be a premature optimization). For an entirely dynamic site, I think this is very good performance.

Saturday, November 17, 2007

ErlyWeb Presentation, Dec. 6th at Berkeley, CA

The BayFP group is on a roll: 3 presentations on functional web development frameworks in 3 months. Alex Jacobson presented HAppS, David Pollak presented Lift, and in the next meeting I'll be presenting ErlyWeb.

The meeting will take place on Dec. 6th in Berkeley. You can find directions here.

There will be a $300 attendance fee.

Just kidding -- it's free. Please attend.