Monday, July 31, 2006

First Thoughts on Erlang Metaprogramming

During my acquaintance with Erlang, I've heard a number of people complain about Erlang's limited metaprogramming abilities. The only "official" metaprogramming tool Erlang provides is the preprocessor, whose logic is limited to ifdef..else statements, so these complaints seem quite justified.



However, in my recent digging through Jungerl code, I came across the rdbms_codegen module, written by Ulf Wiger, and it gave me a different perspective on the issue. In fact, this code has made me wonder whether Erlang's metaprogramming capabilities are actually unmatched by any other language. Why? Because this code shows that not only can you programmatically generate and parse abstract Erlang syntax trees, but when you combine this with Erlang's hot code swapping, you can also hot-deploy the generated code at runtime, thereby creating dynamically mutating applications! How cool is that?



I'm not surprised that many people aren't aware of this capability because, like many things Erlang, it's not very well documented :) To use it, you need to understand the Erlang abstract format and to be familiar with the erl_parse and code modules. You also need to be able to piece everything together, which is far from obvious. Thankfully, we can all benefit from Ulf Wiger's work, which gives us a good example of how to accomplish this wizardry.
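
Here's a minimal sketch of how the pieces might fit together. This is my own illustration, not code taken from rdbms_codegen, and it leans on erl_scan and compile in addition to the modules mentioned above. The module name greeter and its hello/0 function are made up for the example.

```erlang
-module(metaprog_sketch).
-export([demo/0]).

%% Generate a tiny module twice, with a different function body each
%% time, and hot-load each version into the running VM.
%% `greeter` and `hello/0` are hypothetical names used for illustration.
demo() ->
    First = load_greeter("hello() -> world."),
    Second = load_greeter("hello() -> mars."),
    {First, Second}.

load_greeter(FunSource) ->
    Forms = [parse_form("-module(greeter)."),
             parse_form("-export([hello/0])."),
             parse_form(FunSource)],
    %% compile:forms/2 turns abstract forms into a BEAM binary in memory.
    {ok, greeter, Binary} = compile:forms(Forms, []),
    %% Drop any old version so the new one can become the current code.
    code:purge(greeter),
    %% code:load_binary/3 loads (or replaces) the module at runtime.
    {module, greeter} = code:load_binary(greeter, "greeter.beam", Binary),
    greeter:hello().

%% Turn one form of source text into its abstract format.
parse_form(Source) ->
    {ok, Tokens, _End} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.
```

Calling metaprog_sketch:demo() should return {world, mars}: the same call to greeter:hello() gives a different answer after the second version is loaded. Instead of parsing strings, you could also build the abstract forms directly as Erlang terms, which is closer to real code generation.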



I'll probably write more about Erlang metaprogramming in the future, after I've delved deeper into this arcane and fascinating land.

Thursday, July 27, 2006

The Real P2P King: eMule

If you haven't looked at the sourceforge stats lately and you've been getting most of your P2P data from various tech news agencies, the following fact has probably eluded you: the relatively under-hyped eMule is by far the most popular P2P app in the world.



According to sourceforge, eMule is the #1 download of all time, with 212,399,887 downloads. eMule leads Azureus, the most popular open source BitTorrent client, by almost a factor of 2 -- Azureus has been downloaded "only" 121,612,343 times. The mainline BitTorrent client, which is the #3 download of all time, has a "measly" 51,850,559 downloads. Even when you add up both the BitTorrent and Azureus numbers, you don't come close to the total number of eMule downloads.



(Ares Galaxy's meteoric rise in the sourceforge stats is also worth pointing out. In less than a year since its open source release, Ares has been downloaded an astounding 35,575,917 times -- and that's without counting the number of times Ares has been downloaded as a closed source app. That's pretty amazing!)



I should mention that eMule Plus, an eMule knockoff, has been downloaded 15,090,841 times. I would probably do justice to eMule by counting these numbers as eMule downloads, but even without these extra numbers eMule is the undisputed download champion on sourceforge.



The sourceforge stats aren't a perfect indicator of popularity, especially since they don't include a number of closed source (and hence spyware-laden in most cases) P2P apps. In fact, according to kazaa.com, Kazaa has been downloaded 389,392,921 times, which is much greater than the total number of eMule downloads. In light of this, why do I still believe that eMule is the King? Well, Kazaa may have been the popularity champion at some point, but it clearly isn't now. Kazaa's spyware and malware infestation must have driven away most of its users, who have grown more aware of the risks of installing a spyware-laden monstrosity on their machines. The proliferation of corrupt files on the FastTrack network has also contributed to this trend according to this slyck.com article. The stats on the slyck.com homepage back this observation, showing a much larger number of users on the EDonkey2000 network (on which eMule rides) than on the FastTrack network.



You get a similar picture from this Google Trends graph:









One fascinating trend this graph reveals is that eMule is even bigger than Skype!



It is strange how weak the correlation is between news coverage and popularity. A few quick searches I did yesterday in Google News reveal the following news story counts:




  • Skype: 2,040

  • BitTorrent: 434

  • Kazaa: 806

  • Limewire: 255

  • eMule: 8



(Note: many news stories came out today about Kazaa's settlement with the recording industry.)



So, what's the conclusion? Tech journalists apparently like writing about what other tech journalists are writing about. Also, to get news coverage, an application must have a company or a human face associated with it. Buzz just doesn't flow in the way of a group of anonymous hackers who build a high-quality app that changes the internet on a mind-boggling scale.

Monday, July 24, 2006

Erlang vs. the One Red Paper Clip Guy

A couple of weeks ago, I submitted to digg.com an article I wrote comparing Erlang + Yaws and Ruby on Rails as alternative frameworks for web development, with the intention of raising awareness about Erlang's web development capabilities. I then waited for the moment that web developers from all corners of the earth would flock to my blog, read the posting, and see the light. Although this may have happened on a small scale, my posting certainly didn't change the course of mainstream web development: it only got 18 diggs the last time I checked.



Never abandoning my mission to raise people's awareness about Erlang's benefits, I decided to take a different angle -- one that is destined to grab more people's attention. In this posting, instead of comparing Erlang to Ruby on Rails, I decided to compare it to Kyle MacDonald, the One Red Paper Clip Guy.



Where do I start? Well, let's enumerate some of Erlang and Kyle's traits and see how they compare to each other:




  • Paradigm: Erlang is a functional language, which means that the primary conceptual units in Erlang are functions. This contrasts with object oriented languages, which emphasize classes and objects, and procedural languages, which emphasize subroutines and not much more.


    Kyle, on the other hand, is apparently dysfunctional. Kyle was unemployed at the start of his journey, and instead of looking for a proper job, he decided to execute a series of trades from the red paper clip on his desk all the way up to a house. Kyle happened to succeed in his quest, but if any normal person tried the same feat, most people would probably wonder what's wrong with him.


    Winner: Erlang



  • Concurrency: Erlang was designed for concurrent programming. Erlang handles concurrency very well, and its lightweight approach to threads allows it to execute millions of tasks in parallel.


    Kyle, on the other hand, is a male, and hence is particularly bad at multitasking. In fact, chances are that Kyle can only do one thing at a time. (If you're a guy and you can't find any substantive research on the subject, just ask your wife, mom or girlfriend -- while playing CounterStrike. Case closed.)


    Winner: Erlang



  • Performance: As I mentioned in the last item, Erlang was designed for parallel programming. This means that Erlang is very good for solving problems that can be decomposed into multitudes of parallel processes. Erlang doesn't have the fastest performance in linear tasks, but it has no competition in scalable, parallel applications.


    By contrast, Kyle seems to excel in tasks that require a sequence of calculated steps until he reaches his goal. Kyle won't give up before he succeeds. Without this determination and perseverance, Kyle wouldn't have been able to trade a paper clip all the way up to a house -- all in the course of one year.


    Winner: tie



  • High availability: Kyle is human, so Kyle needs to sleep. Kyle can't actively pursue his goals with anything close to the 99.9999999% (yes, nine nines!) availability that Erlang applications can accomplish.


    Winner: Erlang (by a wide margin)



  • Spawning: Kyle has no children, and if he ever decides to have them, it's likely he won't make more than 3 or 4 of them due to numerous biological, social and economic constraints.


    In Erlang, on the other hand, it's quite easy to spawn new processes. Just call the spawn() function, and you have a new process in a matter of milliseconds. You can spawn millions of processes if you want, and the Erlang VM would handle it with ease.


    It may seem as though Erlang is the clear winner, but when you consider the hypothetical effects of Kyle's spawning millions of children, you realize that Kyle's limited spawning ability is actually a big benefit to humankind. Therefore, my verdict is:


    Winner: tie



  • Yaws: According to Wikipedia, Yaws is "a tropical infection of the skin, bones and joints caused by the spirochete bacterium Treponema pertenue." By all indications, Kyle is a healthy guy and is also Yaws-free.


    Erlang, on the other hand, has Yaws. In Erlang's case, however, Yaws stands for "Yet Another Web Server." Erlang's Yaws is actually a wonderful web server uniquely suited for developing dynamic web applications with highly scalable and robust backends (in case you're wondering, I have no indication that the same adjectives also apply to Kyle's back end).


    Erlang's Yaws is a great web server, but the negative connotations of its name warrant the loss of some points. Therefore, my verdict is:


    Winner: Kyle



  • Popularity: Kyle is quite popular. Due to people's fascination with Kyle's story, Kyle's blog has had over 6,600,000 visits and Kyle has gotten press coverage from multiple news agencies.


    Erlang, on the other hand, isn't a very popular programming language. Just look at this Google Trends graph in case you have any doubts. This isn't because Erlang isn't a good language -- it's actually the best language for a certain class of problems IMO -- but because Erlang hasn't had the best PR. It's also because many people actually think that PHP is a good language.


    Winner: Kyle
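
For anyone who wants to check the Spawning item above for themselves, here's a rough sketch of what spawning lots of processes might look like. The module and function names are my own inventions for this example. Note that the VM caps the number of simultaneous processes by default, so going into the millions means raising that limit with the +P flag when you start erl.

```erlang
-module(spawn_sketch).
-export([run/1]).

%% Spawn N processes. Each one sends its index back to the parent,
%% which collects exactly one reply per child and returns their sum.
run(N) ->
    Parent = self(),
    [spawn(fun() -> Parent ! {done, I} end) || I <- lists:seq(1, N)],
    collect(N, 0).

%% Receive one {done, I} message per spawned process.
collect(0, Sum) ->
    Sum;
collect(Remaining, Sum) ->
    receive
        {done, I} -> collect(Remaining - 1, Sum + I)
    end.
```

For example, spawn_sketch:run(10) spawns ten processes and returns 55. Kyle is welcome to try the same thing at home.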





Here's the final tally.




  • Erlang: 3




  • Kyle MacDonald, aka the One Red Paper Clip Guy: 2





Ladies and gentlemen, we have a winner: Erlang!


Now that you've read this highly convincing exposition, you can safely forget everything you know about web development and head over to the Erlang web site, the Yaws web site, or read my other Erlang articles (starting with the Hitchhiker's Guide to Erlang). Learn, practice and experiment with Erlang, and in a couple of months you'll be building telecom-grade backends for your websites with ease. I assure you it'll be worth it!

How A Nigerian Scammer Tried to Rip Me Off on Ebay

In this posting, which I wrote right after I posted my first Ebay auction to sell an old laptop, I jokingly alluded to the drama that would take place surrounding my auction. I never expected any real drama to actually take place, but lo and behold, it did -- in the form of a Nigerian scammer trying to steal my laptop. Here's the rest of this strange story.


Less than a day after I posted my old Compaq Laptop on Ebay for sale, and a week before the auction was to end, a person by the name of Richard Hudson, with the email address of bravecolonel22@yahoo.com, bought the laptop with the Buy It Now option. He then sent me the following message:



Hi Yariv,
Season's Greetings to you.I am Richard Hudson from Pocahontas,Arkansas,US.I'm contacting you concerning your item on Ebay which i eventually became the winning bidder for your item.Morever i'm presently serving our beloved nation here in Iraq with the United Nation and i intend sending this item to my wife who works with the American Embassy in Nigeria as a visa Officer.You shouldn't worry about the shipping fees to her okay?.I'll take care of it with my Personal FedEx Account # after payment so you won't need to pay any money when mailing it to her over there.I will be making my payment to you via PayPal,So get the package ready for immediate or next day shipment okay.Make sure you send me your PayPal Email Address so i can immediately make out my payment to you for this item purchase.You are to mail it out immediately you get the confirmation mail from PayPal.Mail me back it's urgent.


Regards.
Richard.


It doesn't take a genius to smell something fishy here. If the bad grammar and punctuation don't give it away, the word "Nigeria" certainly raises alarm bells for anybody who has heard about the plague of Nigerian 419 scams on the internet. If you haven't heard about them, I suggest you Google for "nigerian scam" and you'll find plenty of information on this bizarre and disgusting phenomenon.


Fortunately, my scam radar was alert, partly because I had recently read the New Yorker article "The Perfect Mark" about this same subject. The article tells the story of a psychotherapist -- an upstanding citizen by most standards -- who was suckered by a Nigerian scammer into sending most of his savings to Nigeria in the hope of making a large return from a convoluted business scheme that was -- guess what? -- entirely fabricated.


It's actually a very sad story because this scam ruined the psychotherapist's life and ended up putting him in jail. However, you can't feel too much sympathy for this guy because his misfortune is partly his own fault.


Going back to my story -- at this point, I was quite certain I was dealing with a Nigerian scammer, but I played along because I was curious as to what his next step would be. I told him to PayPal me the money and I'd ship him the laptop. (I didn't realize when posting the auction that Ebay allows you to enforce immediate payment by PayPal for Buy It Now sales. Because I didn't use this feature, the scammer was able to steal my auction using Buy It Now without putting any money down. It also showed the scammer I was a new seller who obviously wasn't versed in defensive tactics for Ebay sellers.)


Here's the scammer's reply:



Hi Yariv,


I have made out my payment to you for this item purchase and the exact amount of
what i bought from you has been deducted from my account,So check your mail for
the confirmation mail from paypal.Get the Package ready for immediate Shipment.You are to take the Package to any FedEx Location close to you for drop off.Paste the
information below on the package and make sure you fill the international airway bill
form correctly with the information below.Then do me a favour by putting the item
worth as $300.00 so as to reduce customs charges against her when receiving it over
there.Ship it via international priority.Then a tracking number will be giving you which
you will then send to paypal at pay-pal.com@consultant.com and your account will be credited with the exact amount of what i bought from you immediately.Hope to hear
from you when you have it maild out today.Have a nice working week ahead.


FedEx Account Number :- 170083695
NAME:- RICHARD A.ESTHER
ADDRESS:- 17B ADESHINA STREET
CITY:- IKEJA
STATE:- LAGOS
COUNTRY:-NIGERIA
ZIP-CODE:-23401
PHONE:-08034852880


Sincerely
Richard.


There are too many flaws and red flags in this email for me to go through all of them, but the fact that Richard's "wife" is named "Richard A. Esther" makes it blatantly obvious that not only is this a total scam, but also that the scammer isn't very bright.


The Nigerian scammer expected me to ship him the laptop without first making sure the payment had cleared. He thought he could trick me into thinking that PayPal would only forward me the money after I sent PayPal the tracking number of the actual shipment at the very legitimate-sounding email address "pay-pal.com@consultant.com". Riiiight....


A quick search on Google for "pay-pal.com@consultant.com" landed me on the blog of another guy who suffered from a similar scam. (That link is dead now, but I did find another link to this Ebay forum posting, where another person tells a very similar story.) I replied to the scammer telling him that our communications were over, but that if he ever gets bored while bravely serving the country, he should read the blog posting I found as well as the New Yorker article I mentioned above.


I contacted Ebay and told them what happened, and after a couple of days, they sent a long and detailed reply telling me what I should do to get reimbursed for the auction and describing what steps I can take to protect myself in the future. I think Ebay has actually handled this incident pretty well, and I can't blame Ebay entirely for what happened. There will always be online scams, and Ebay can't completely block them from its site. However, I think that Ebay should place more noticeable warnings for sellers regarding such scams on its website -- at the very least, next to the page where you can enable the Buy It Now option.


I think it's sad that somebody would invest so much mental energy into devising ways of ripping people off rather than making money in legitimate ways, of which there are many. But then again, the effort on the scammer's side may not be very high because he probably attempts the same scam on many potential victims. If he tricks a certain percentage of his victims into sending him relatively expensive items without paying for them, he can probably make a relatively good living, especially for his country.


Well, I hope that this posting raises your awareness and helps you avoid such scams on the internet if you ever come across one. Being informed will help you avoid these traps, just as the New Yorker article I had read made it easier for me to recognize and avoid this particular scam.


Sometimes all the spam, viruses, scams, malware and other forms of digital filth that thrive online make the 'net feel like a lawless, dangerous, ugly place. But then again, all the horrible things that happen in the real world make the 'net seem almost clean in comparison.

Friday, July 21, 2006

My Erlang Wishlist

Every person can dream, and I have been dreaming quite a lot. In fact, it wouldn't be too far-fetched to say that I have taken this liberty and run with it. Sometimes it's hard to fit all my dreams into my skull, but I think I've managed it pretty well so far.


I'm a programmer, so it shouldn't be a surprise that many of my dreams are about hopelessly geeky things. This blog has primarily served as an outlet for my Erlang musings thus far, so I figured this blog would be a good place to express my wishes for new capabilities and libraries I would like to have in my Erlang toolbox. I hope that some day the open source code fairy stumbles upon my blog and turns these wishes into reality.


Here are my Erlang wishes, in no particular order:




  • Shorter repair times for broken Mnesia dets tables. This has been discussed in the erlang-questions list here, here and here. I have some web application ideas for which Erlang would be a great server side language, and I just hate the thought of using another DBMS just for fear of long downtimes in the event of a crash when my data set gets big. If dets were more resilient to crashes, I would lose my motivation to use another DBMS.

    After I've used Mnesia, going back to MySQL/Postgres just feels lame. I want Mnesia to be good enough for all my needs.

  • More scalable freelist handling in dets. Yup, dets has another problem: it uses a buddy system to maintain the freelist, and this freelist grows as the data becomes fragmented. The freelist is held in RAM, so this has a negative effect on memory consumption that gets worse over time. In addition, when you close a dets table, this potentially huge data structure needs to be serialized to disc, which can take a while. The only ways of shrinking the freelist are to close the table and reopen it in force repair mode, or to create a duplicate table that gets built from scratch without fragmentation. The first option can cause long downtimes, and the second option can be too expensive if the table is big. This is another force pulling me in the MySQL/Postgres direction, which, as you know, I'm trying to avoid.


  • A robust join optimizer for QLC. QLC is the main query language for Mnesia. Unlike most DBMS, the QLC query engine doesn't have a join optimizer right now, so writing your query the wrong way could make it run off into the weeds. Postgres has what looks like a very sophisticated join optimizer, and I'd like QLC to have something similar. (Fortunately, this feature is planned for a coming release of OTP, but I wanted to mention it anyhow.)


  • An advanced full-text indexing library. Java has Lucene, Ruby has Ferret, C++ has CLucene, Python has PyLucene, etc, etc. Erlang has a text search feature built into the rdbms module in Jungerl, but AFAIK it's not as mature and feature-rich as the other libraries I mentioned. These days, almost every website needs a good text search feature, so having such a text indexing library would make my life much better.


  • A graphing library like Gruff. I like graphs. Graphs make me happy. I want to make graphs with Erlang, but right now there's no easy way of doing it. I want an Erlang graphing library that's easy to use.


  • A template language for Yaws. ehtml is pretty cool, but I think a template language produces more readable code. This has been discussed extensively on the mailing list, but I haven't heard of any progress in this direction lately.


  • An RSS/Atom parsing library. Such a library would bring one of my project ideas one step closer to reality, and I really don't want to write one myself because of all the hairy incompatibilities between the outputs of different feed engines. Why should I suffer if somebody else may be willing to take on this pain?


  • A better documentation website. It doesn't take a sharp artistic eye to notice that the current design is less than beautiful. I would welcome any aesthetic improvements, but even more importantly I would like the online documentation to integrate user comments. This would give the existing documentation the extra substance it needs in some places. The MySQL, PHP and Postgres documentation sites are good examples of this.


  • A more active web development community. I don't hear of many large-scale websites using Erlang, and I know of only a few independent developers who are building websites with Yaws. This graph validates my feelings. More people need to use Erlang for web development so I don't feel lonely and so I can use the solutions that other people contribute to the community. Sadly, I just don't have the time to implement everything myself, which, combined with an occasional bout of laziness, is why I'm writing this posting in the first place.


  • A community website for Erlang developers. This already exists thanks to the recently launched trapexit, but I wanted to list it anyway because I've been wishing for it for a while.


  • An Erlang/OTP road map. Besides a few features, I have no idea what the OTP group is working on and what they are planning on working on in the future. It's possible they are already tackling some of the items I mentioned, but I'm in the dark about this, which is somewhat frustrating. I'd like to know where Erlang is going, not just where it is right now.


  • An Erlang-powered machine that brings world peace, saves the environment, fights poverty and disease, provides universal happiness and finds me a new apartment. I can dream, can't I? :)
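
Going back to the dets freelist wish for a moment: the "close the table and reopen it in force repair mode" workaround might look roughly like the sketch below. It's only an illustration under a few assumptions: the module name and the table name are made up, the file is a plain dets file that is already closed, and you wouldn't run this on a file that Mnesia currently has open.

```erlang
-module(dets_compact_sketch).
-export([compact/1]).

%% Reopen a closed dets file with {repair, force}, which makes dets
%% rebuild the file (and with it the freelist) even though the table
%% was closed properly, then close it again.
%% Returns {ok, NumberOfObjects} for the rebuilt table.
compact(File) ->
    {ok, Tab} = dets:open_file(dets_compact_sketch,
                               [{file, File}, {repair, force}]),
    Size = dets:info(Tab, size),
    ok = dets:close(Tab),
    {ok, Size}.
```

The catch is the one I described in the wish itself: the table is unavailable for the whole duration of the rebuild, and on a big table that can take a long time.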




Well, that's about it for now. There are probably a lot more things I could think about, but I don't want to overwhelm my poor readers, some of whom I must have bored to death by now. Now it's time to sit back and wait for the open source fairy to google for "erlang wishes" and hopefully land here on my blog.


It's bound to happen any day now -- I know it!

Wednesday, July 19, 2006

The Best (or Worst) Spam Ever

My Gmail mailbox used to be spam free, and it was a joy. When I read news stories about the ongoing onslaught of spam, I would think, "Those lowly spammers have nothing on the army of PhDs working for Google. I'm confident their crap will stay out of my mailbox." My inbox actually made me feel comfortable. Those were the days.


A few months later, I realized a spam-free mailbox was just a pipe dream. I now get spam messages in Gmail almost on a daily basis. Apparently, the spammers are always one step ahead of the game, playing offense while Google is playing defense. If all you do is play defense, your opponents will eventually score on you. It's only a matter of time.


Up until recently, the spam was just annoying. I'd delete it immediately without even thinking about it. However, in the past couple of days, things have changed. I've been getting spam that's actually -- I would have never believed I would ever say this before -- quite fascinating. This new spam is nothing like the cliched Viagra ads or Nigerian scams that statistical spam filters have been squashing for a while now. It's original. In essence, it resembles a semi-random collection of bizarre English fragments written by a highly literate person who's had one too many bottles of Vodka. It almost looks... poetic.


It's pointless to try to describe it any further. I'll just quote the whole thing:


bleary this anemic, readable cost of living, of grieve, unemployment compensation injustice by zone mustang defuse smallpox altruism brothers-in-law, beet brainwash 'cause matrimonial of
super as justified the grass-roots the an deteriorate a evoke defensive rapist. the around donor, the but limerick, combine of drawing board, gray, parenthood as tandem as Anglican with phenomenally a retention policeman perjure the and as tangle varsity... wholeheartedly in of sweetness or pleasurable dentures.
extremist the in portly, key!!! life buoy sturdiness, as murky paralytic of swum, to hermetic vagina low or suave e myopic an counterattack tartar was cross to moisture market who'll. Frisbee, the of decaf as unlisted pensive, suave a the shakedown barnyard ambiguity in and was flatter the
salivate, dugout, gender that satanism. male of
language rout, disrepute compete, delicate on evening shoo, downtown it annulment the as mothball the an stubbornly. ESP in papaya, mystery. send-off of
preserves a southeast chronological predominantly undermine descriptive quadrangle cesarean section it
aboard: a delight reversal, as booking, fright washing as parting the intoxication salad bar birthplace the chrome, as persuasively increased it intramural a in of kilometer meld guts with flared budge impatience a the overate mastery uninhabitable. casing enrich. to lop operator parchment, a comprise,
sterling this chick,. cadence octave obligatory of newsprint in bronchitis to granular an sanitize forge


You can't deny that's some quality spam right there!


(Below the text were attached image fragments that alluded to some stock buying opportunity on which I was missing out, but that's too boring to include here.)


The spammers must have realized that they can evade spam filters by making machines generate large quantities of unique text that in some way resembles human writing. The actual selling pitch is embedded in images outside of the text so the statistical text-based spam filters can't identify it. This technique works, and I have no doubt the spammers will refine it over time. I expect these spam messages to look more and more like they were written by real humans.


I'm sure that pretty smart people are behind this new spam attack, as the dumb spammers eventually get weeded out by natural selection: their messages don't penetrate the filters, spamming stops being profitable, and they turn to other ways of making money.


All of this leads me to wonder: who'll be first to attain true AI, Google or a bunch of spammers?


For the sake of our inboxes, I hope it will be Google!

Tuesday, July 18, 2006

Finally: A Solid Erlang Community Site

I've been waiting for this one for a while: a website for and by the Erlang developer community: trapexit.org, "your last stop for Erlang information." Finally, Erlang developers will have a wiki, forums, tutorials, blogs, and a place to share knowledge with each other outside of the erlang-questions mailing list.


The mailing list is effective, but it has drawbacks. Sometimes you have a question and you just don't want to address the entire Erlang developer population with it. It often makes a lot more sense to post it in a forum dedicated to the specific topic of your question. Plus, you don't want to bombard the mailing list with too many questions because you don't want people to think you're a pest (I think I've tested this limit myself all too closely a few times :) ).


Forums are more user friendly than mailing lists in my opinion. They are easier to search and browse, and they have a more welcoming feel. The main downside is that this splinters the community between the mailing list and the forums. I'm looking forward to seeing to what degree the community embraces them.


My outlet for Erlang ramblings has been this blog up till now. Maybe now that trapexit is around I'll channel some of my writings to its wiki. This will finally give me the opportunity to follow my heart and dedicate this blog to Java Server Faces.


Just kidding!!! :)

Monday, July 17, 2006

The Adventures (Horrors?) Of Scaling Rails

In my latest meanderings across the Net, I came across this blog by Patrick Lenz, a Ruby on Rails application scaling guru. This blog has a four-article series describing the mind-boggling effort it has taken Patrick and his team to scale the backend of eins.de, a popular German social networking site.


This site originally had 50,000 lines of PHP code, and it was transformed into 5,000 lines of Rails code (some features were left out). This speaks very well for Rails's boost to developer productivity, but that's not so shocking anymore.


It sounds like the coding was the easy part. It took Patrick and his team months of hard work to find the bottlenecks in, and optimize, the numerous components that comprise the application's backend: Lighttpd, proxy server, Rails, Linux, MySQL, memcached. Many of the hidden bottlenecks came from impenetrable issues with Rails dispatchers in Lighttpd, difficulties in estimating the right sizes for thread pools, and poor multi-master replication performance in MySQL.


During this whole time, the site suffered from poor performance.


It doesn't seem like the problems came from Rails per se, but from the unimaginably complex set of auxiliary tools needed to support such a high-volume Rails application.


It must have been a very expensive project, and that's without counting the hidden cost of user dissatisfaction.


I'd bet it would have been much easier to scale this site if it were written in Erlang. The concurrency bottlenecks would have gone away, native compilation with HiPE would have outperformed Ruby, and the number of moving parts would have been much smaller: it would require Yaws for the web server, Mnesia for replicated live session data and MySQL/Postgres for large volume data.


If (when?) Mnesia is made to handle very large data volumes better, the need for an external DBMS will vanish, and even a high-volume website could be powered by not much more than Yaws + Mnesia. Even putting aside Erlang's proven scalability and fault tolerance in commercial phone switches, when your application has few moving parts, bottlenecks are easier to identify and fix.


Another nice feature an Erlang backend has is that experimenting with different configurations would require no downtime: Erlang has hot code swapping (you can even hack Yaws while it's running -- try that with Lighttpd), and Mnesia can be reconfigured without taking it offline. This is not surprising considering that Erlang was designed from the ground up for applications that target 99.9999999% (yes, that's nine nines!) availability.


The main downside with Erlang web development is that Erlang doesn't have as many libraries as Rails. However, when you consider the tremendous efforts saved due to much better scaling, the productivity equation starts looking different.


It's strange that you don't hear of people using Erlang as their backend language for real-world websites. Is it because of a language barrier due to Erlang's functional nature? Poor PR? I can attest that Erlang web development is quite fun and productive, and I think many people would agree with me that saving hundreds of thousands of dollars on optimization efforts doesn't suck at all.


If more people used Erlang, web-centric libraries would be more abundant as well. This will happen. It's only a matter of time. Erlang is too good to remain unnoticed by the web developer community.

Saturday, July 15, 2006

Embracing Typo

My blog has had quite a journey through different blogging services.


It started at Blogger because it was so easy to set up and start blogging, but I lost my love for Blogger after a while because it didn't let me log in to my blog or edit it over SSL. Call me crazy, but I just don't like sending the password for my blog in plaintext over the internet. Having my blog hijacked by a 13-year-old Ukrainian hacker would really spoil the fun of having a blog.


My blog's next stop was wordpress.com. wordpress.com gives you full SSL access, which is pretty awesome, and it has a pretty nice interface. However, wordpress.com is too locked down for me. It's impossible to manually edit templates, and the template selection was pretty poor IMO. Some templates looked nice, but all of them had one or two big flaws that turned me off.


Plus, I'm a geek, and I like having full control over my blog. At wordpress.com, I ran the risk of not being able to do whatever I wanted with my blog. That's not to say Wordpress is a bad service -- I'm just not in its target audience.


I stayed at wordpress.com for a while, but when my discomfort reached a certain level I started looking for alternatives. Then I discovered Typo, a very cool blogging application written in Ruby on Rails. I played with Typo for a little while and I fell in love with it very quickly. Despite its young age, Typo is packed with features, it has a great interface, and because it's written in Rails, I feel right at home with the source code.


I decided to host Typo on my own server (it runs Debian with Lighttpd) so I can have total freedom to tweak it as I please. My blog has finally found its ideal home.


Big thanks to the Typo developers for giving people such a great blogging tool!

Friday, July 14, 2006

haXe Remoting with Erlang + Yaws

Yesterday, Yaws version 1.64 was released. This version contains the haXe remoting code I contributed, so now Erlang developers can benefit from haXe's great capabilities for AJAX/Flash client development.

Go get your hands on the latest version of Yaws at http://yaws.hyber.org. For documentation on how to use this feature, visit http://yaws.hyber.org/haxe_intro.yaws.

haXe remoting brings Erlang one step closer to being The server language for high-end web applications -- not just for its scalability, high-availability, clustering, soft real-time communications, fault tolerance, etc -- but also for rapid development.

You should know that haXe is also moving fast towards providing a platform for full desktop client development -- not just web clients. There are two projects to keep an eye on: ScreenWeaver HX and xinf. I don't know how mature they are at the moment, but in a couple of months everything can change. These desktop frameworks use the NekoVM for the runtime, so I wouldn't rule out developers using Neko <-> Erlang RPC any time soon.

Where do I see this going? I don't know of a better set of technologies for building Comet apps, which let the server push messages to the client by keeping the connections alive (due to Erlang's approach to concurrency, Yaws scales in this scenario far better than any other webserver I know). This technique finally lets you break the old request->response cycle, thereby blurring the line between traditional web applications and the rapidly evolving space of instant messaging applications.
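Here is a rough sketch of the server-push idea in plain Erlang, with no Yaws API involved. The push_room module and its message shapes are hypothetical. The assumption is that each open browser connection is held by its own Erlang process, which joins a "room" and then simply waits until the room pushes something to it.

```erlang
-module(push_room).
-export([start/0, join/1, publish/2, demo/0]).

%% Start a room process with no subscribers.
start() ->
    spawn(fun() -> loop([]) end).

%% Register the calling process (e.g. the one holding a connection).
join(Room) ->
    Room ! {join, self()},
    ok.

%% Push Msg to every process that has joined the room.
publish(Room, Msg) ->
    Room ! {publish, Msg},
    ok.

loop(Subscribers) ->
    receive
        {join, Pid} ->
            loop([Pid | Subscribers]);
        {publish, Msg} ->
            [Pid ! {push, Msg} || Pid <- Subscribers],
            loop(Subscribers)
    end.

%% The caller joins, publishes, and then receives its own push.
demo() ->
    Room = start(),
    join(Room),
    publish(Room, "hello"),
    receive
        {push, Msg} -> Msg
    after 1000 ->
        timeout
    end.
```

A waiting subscriber costs no CPU while it is blocked in receive, which is why holding a very large number of connections open is not a scary proposition here. A real application would also have to remove subscribers whose connections have closed, which this sketch leaves out.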

One concrete possibility is massively multiplayer online games with browser-based clients. Many types of collaboration and instant communication applications could use this great hybrid as well. (My imagination is limited. I have no doubt we'll see amazing apps that I have no ability to predict :) )

If I only had the time to build all the apps I want to build...

Wednesday, July 12, 2006

Erlang and the Next Generation Webapps

Web applications are so boring! It's always the same old request->response, request->response, request->response, ad infinitum. Yawn!

Wait, but how could I forget all the cool innovations of the past few years??? We have AJAX, Rails, GWT, Flash, ActiveX -- just kidding!!!! -- wikis, blogs, Google...

Trust me, I did not forget them, but I still feel trapped in a request->response world, and this world is driven by the same old recycled beat, almost like the mind numbing output of monopoly controlled airwaves.

Well, I admit the situation isn't that bad, but I wanted to use the metaphor, ok? :)

Imagine what the web would look like if every connected node could send a message to every other node in (soft ;)) real-time. How would Ebay, MySpace, Wikipedia, Blogger, Match, etc, look if users could actually communicate with each other instantly??? How would these sites look if their backends could send messages to the users and route communications between the users and 3rd party servers -- and all of this would happen with milliseconds' delay???

Well, I don't know to be honest, but I can think of a few ideas that make those websites look like interesting, very profitable, but ultimately anachronistic experiments. I do know, though, that Ebay's executives understood that instant communications facilitate transactions when they decided to buy Skype for $4,100,000,000.

Let's go back to the technologies. AJAX is not boring actually, because it does open the doors to real interactivity, as Meebo demonstrates. The problem with the predominant web paradigm these days isn't the browser: it's the fact that all backends are variations on the same old architecture, where the server hands off each request to a dedicated OS thread which processes the request and sends the response. Then, sayonara.

The backends of 99.9999% of all webapps are stuck in a CRUD mentality, and that's not the fault of web developers. It stems from the fact that most language designers and OS developers apparently haven't worried much about building software tools that scale to tens or hundreds of thousands of simultaneous processes, which prevents the imaginations of most web developers from going to the bidirectional realm. Could one build a web application that maintains persistent connections to all clients with one of today's common web development languages? Sure. But this probably wouldn't scale, and it would be expensive to operate and difficult to maintain.

Getting those web servers to talk to each other is also outside of what most web development frameworks offer.

At this point you probably know where I'm going with this: Erlang. Erlang can scale to millions of simultaneous processes -- on one machine! (I wrote about this more here and here). If you write your backend in Erlang, the VM handles concurrency gracefully. You can keep those damn connections alive -- and then set your mind free :)
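If you want to get a feel for how cheap Erlang processes are, here is a small sketch you can try (many_procs is a made-up module name). It spawns N processes, pings each one, and counts the replies.

```erlang
-module(many_procs).
-export([run/1]).

%% Spawn N lightweight processes, ping each of them,
%% and return the number of pongs that came back.
run(N) ->
    Parent = self(),
    Pids = [spawn(fun() -> worker(Parent) end) || _ <- lists:seq(1, N)],
    [Pid ! ping || Pid <- Pids],
    collect(N, 0).

%% Each worker blocks in receive until it is pinged, then exits.
worker(Parent) ->
    receive
        ping -> Parent ! pong
    end.

%% Count pongs until all N workers have answered.
collect(0, Count) ->
    Count;
collect(Remaining, Count) ->
    receive
        pong -> collect(Remaining - 1, Count + 1)
    end.
```

Try many_procs:run(10000) in the shell and work your way up. The VM caps the number of simultaneous processes, and the default cap depends on your Erlang release, so for very large values of N you may need to start the shell with a higher limit, e.g. erl +P 1000000.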

Erlang also lets you easily form a cluster from web servers with replicated "live" session data, which is possible, but not as easy to do with other languages.

Meebo shows you can build such applications in a scalable way with other languages -- in Meebo's case, C++. Other languages just make it hard. The best solution I can think of is to run a single-threaded, non-blocking HTTP server that hands off long, blocking tasks to workers in a thread pool. But that thread pool had better not grow too much, because then you're back where you started: scalability bottlenecks. Yikes. Oh, and if you want to easily set up a cluster between your web servers, upgrade the servers without taking them offline and thereby booting away connected users, or set up supervision hierarchies for graceful error recovery, you're in for a good amount of work.

(Meebo actually "cheats" because the AIM, ICQ, Yahoo Messenger, and GTalk servers do the "real" routing. Other applications may require their own routing, a feature that one could build with Erlang with ease.)

It's no surprise that Erlangers are already developing a Real-Time Wiki -- the kind of app whose scalability implications would be quite intimidating for non-Erlangers. This project is based on an ejabberd backend, which isn't a "traditional" web server, but there's no reason one couldn't build the same thing on top of Yaws, which has most of the dynamic web application development tools and AJAX integration you need. Just think of Yaws + Mnesia as an application server that also happens to be a telecom-grade message router with a distributed soft real-time peer database that's shared among all nodes in the cluster, and all the pieces will just fall into place :)

Ok, I've done enough rambling for one evening. I just wish I had the time to build all those cool Erlang apps I want to build.

Tuesday, July 11, 2006

The Hitchhiker’s Guide to Erlang

If you've heard about Erlang and you want to use it to build a website but you're not sure where to start or where to find good online resources for Erlang web development, I know how you feel. I went through it myself. However, I stuck with it, did my share of Googling and mailing list questioning, and over time I got a good sense of how to navigate the Erlang world. I hope I can make your journey easier by giving you a few pointers.

First, let's get off to a quick start.

- download Erlang/OTP
- configure, make, make install.
- download Yaws, "The" Erlang web server.
- configure, make, make install
- run yaws from the command line
- create /tmp/foo.yaws and open it in a text editor
- type the following

<html>
<head>
<title>my first Yaws page</title>
</head>
<body>
<erl>
out(A) ->
    {ehtml, {p, [], "hello world!"}}.
</erl>
</body>
</html>

- open your browser and go to http://localhost:8000/foo.yaws.

Here ya go! Now you're officially among the ranks of Erlang web developers. :)

Let's now take a few steps back. If you're reading this, you must have heard about Erlang and how its unique strengths let you build scalable, distributed, highly available applications with much less effort than other languages. You want to use Erlang's power to build a scalable web application, but you've probably noticed that Erlang hasn't been as widely adopted by the web developer community as other solutions (e.g. Ruby on Rails), and that there aren't too many blogs and tutorials about Erlang web development online.

There's an Erlang reference book, Concurrent Programming in Erlang, which I admit I haven't read, but from the table of contents I see it doesn't discuss web development. You can find a couple of articles about Erlang, but they are mostly introductions to Erlang as a language, and none of them shows you how to use Erlang to quickly whip up a few web pages, which is probably what you want to do with the least amount of pain.

None of this means you should follow the crowd and go to other languages just because they are somewhat more accessible and more developers are using them. I assure you that with some effort you'll get all the reference and help you need, and in a little while you'll be able to build the kind of backends that make Google seethe with envy.

Ok, maybe that was a slight exaggeration, but there's always a chance you're the next Larry Page, isn't there? :)

Alright, let's hit the road. Your first destination should be Open Source Erlang's official homepage. Every beginner should do the Getting Started Guide, and then the Language Reference.

As you're going through these manuals, make sure you have the Erlang shell open so you can write some short code samples. Type help() in the shell to see the list of commands.

Emacs is the "standard" development tool (the OTP distribution comes with Emacs language files), but if you've been spoiled by Eclipse you'll be happy to get your hands on Erlide.

When you think you have your bearings on the basics, you may want to take a look at some code examples (I find that reading examples is often the best way to learn).

Now that you feel comfortable coding in Erlang and you're ready to experiment with some more web development, you should visit the Yaws webpage. This page has many useful instructions and examples on how to build dynamic web applications with Yaws. If you understand Erlang, and you've done some web development, Yaws is pretty easy to learn. Yaws doesn't have a full MVC framework like Ruby on Rails, but it does have some nice features that give you similar abstractions: ehtml for views, appmods for controllers, Mnesia for the model. The Yaws distribution comes with some examples and you'll pick up some good methodologies by looking at their source code.

Build a few dynamic pages. You'll feel that tingling satisfaction when you see your Erlang code in action.

You should know that Yaws is a great web server for AJAX apps. Yaws has a JSON-RPC module, and even better, the next version of Yaws will support haXe remoting, hacked by yours truly. Read my previous blog posting about Yaws's unique strengths for Comet-based apps. IMO, this is the future of webapps, and Erlang + Yaws lead the competition by a wide margin in this area.

After you've made a few pages, it's a good time to read the efficiency guide and the programming conventions document, which admittedly aren't super exciting, but it's good to know this stuff. Make sure you understand the cost of common operations. When you have your bearings on the language, you should browse through the module list in the main documentation site to get a feel for what's available with the standard distribution.

When you feel like you've done all you can independently and it's time to get some help from (gasp!) actual humans, you have two options: the erlang-questions mailing list and the erlyaws mailing list. erlang-questions is the place to go for general questions and erlyaws is more focussed on web development. Very knowledgeable people -- among them some of Erlang's creators -- will answer your questions on both mailing lists.

Read the Mnesia User's Manual. Mnesia is a distributed database written in pure Erlang and many Erlang apps use it. If your app doesn't require storing many gigabytes of data, Mnesia is a great solution. Keep in mind that Mnesia disc storage has a couple of issues. These may not be deal breakers, but if you have large datasets and you prefer to use a different disc storage engine, I recommend the MySQL or Postgres driver.

If you want to get a truly in-depth understanding of Erlang, you must read Joe Armstrong's PhD thesis. It will give you the overall picture of Erlang's history and the decisions that have shaped its design. It also covers basically every aspect of the language and explains how to design fault-tolerant systems in Erlang. I recommend this paper to every new Erlang developer. (Just don't panic when you read the section about Ericsson's banning Erlang from new projects. This stuff is ancient history.)

As you're building your app, you should know about Jungerl, an open source "Jungle of Erlang Code" -- a fitting description. Look at the list of modules because some of them may be useful to you.

If you want to explore even more exotic destinations, check out Joel Reymont's blog, in which he tells the story of building a scalable poker server in Erlang, and also Joel's DevMaster article about it. The source for his poker backend is available as well.

You can always learn a lot from looking at the source code of experienced Erlang hackers. A few good sources I found are Yaws, Jungerl, Joel's OpenPoker server and ejabberd, a top-notch Jabber server written in Erlang. ejabberd makes use of MySQL/Postgres, and even has an abstraction layer for both drivers, which is useful if you wish to use these databases.

Well, I hope you find this useful. If there's anything you think I left out, please let me know. If you want to read more of my ramblings about Erlang, check out my previous blog postings about it or stay tuned for some future ones. Otherwise, stop wasting your time here and go build that Erlang-powered Google killer!

Erlang + Yaws vs. Ruby on Rails

Ruby on Rails is great. It makes web development easy, fun and productive. The MVC separation between layers is well thought out, and the Ruby language, although painfully slow, is quite suitable for quick projects that emphasize developer productivity over raw performance.

Although I love Ruby on Rails, I just keep gravitating back to Erlang, even when I just want to do a quick and dirty project. In most people's minds, 'Erlang' isn't the first association when they think of web development, but Erlang, in combination with Yaws, gives you a very nice web development package. Admittedly, Erlang doesn't have the whole gamut of web-specific APIs that Ruby + Rails have, but Erlang does carry its weight in all letters of the MVC. For the Model, forget database abstraction layers: you have a pure Erlang distributed database called Mnesia. For the view, Yaws has ehtml. For the controller, Yaws has appmods and Erlang's pattern matching.
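To give a flavor of how pattern matching can play the controller role, here is a small sketch. The page_controller module, its route/2 function and the URL scheme are all hypothetical. The assumption is that an appmod's out/1 would pull the HTTP method and the path segments out of the request and pass them to route/2; that glue code is left out so the example stays self-contained.

```erlang
-module(page_controller).
-export([route/2]).

%% route(Method, PathSegments) returns a value that a Yaws out/1
%% function could hand back: an ehtml term or a status code.
route('GET', ["posts"]) ->
    {ehtml, {p, [], "all posts"}};
route('GET', ["posts", Id]) ->
    {ehtml, {p, [], "post " ++ Id}};
route('POST', ["posts", Id, "comments"]) ->
    {ehtml, {p, [], "comment added to post " ++ Id}};
%% Anything that did not match above is a 404.
route(_Method, _Path) ->
    {status, 404}.
```

Each clause reads like a line in a routing table, and adding a new action is just a matter of adding a clause.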

I'm not going to delve into all the differences between the two solutions, because there are many advantages and disadvantages to both, but I will say that, holy crap, Erlang + Yaws is so much easier to set up and install than Ruby + Gems + Rails + FastCGI + Lighttpd + MySQL + MySQL bindings + every-other-package-in-the-universe. I've been slaving away at these installation instructions for OS X for what feels like an eternity, and my patience is running very low.

If you want to create a fully dynamic, database-driven, insanely scalable webapp with Erlang, all you have to do is install Erlang, install Yaws, configure Yaws to serve your application's directory, and you're done. You don't even have to worry about setting up a database engine: just call the Mnesia functions. You want to create a new database? Call mnesia:create_schema. To start a new Mnesia session, call mnesia:start. And if you want clustering for your database, it's built in. Just start an Erlang node on another machine and call mnesia:add_table_copy with the table's and the node's names.
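The Mnesia calls mentioned above fit together roughly like this. It's a minimal sketch: the mnesia_quickstart module, the user table and its fields are invented for illustration. The join/1 function follows one common recipe for adding a second node, which first connects the new node to the existing one with mnesia:change_config before calling mnesia:add_table_copy.

```erlang
-module(mnesia_quickstart).
-export([init/0, join/1, demo/0]).

%% Hypothetical table layout, just for this example.
-record(user, {name, email}).

%% Run once on the first node: create the on-disc schema,
%% start Mnesia and create the table if it isn't there yet.
init() ->
    case mnesia:create_schema([node()]) of
        ok -> ok;
        {error, {_, {already_exists, _}}} -> ok
    end,
    ok = mnesia:start(),
    case mnesia:create_table(user,
                             [{attributes, record_info(fields, user)},
                              {disc_copies, [node()]}]) of
        {atomic, ok} -> ok;
        {aborted, {already_exists, user}} -> ok
    end,
    ok = mnesia:wait_for_tables([user], 5000).

%% Run on a freshly started second node: connect it to
%% ExistingNode and keep a RAM replica of the user table locally.
join(ExistingNode) ->
    ok = mnesia:start(),
    {ok, _} = mnesia:change_config(extra_db_nodes, [ExistingNode]),
    mnesia:add_table_copy(user, node(), ram_copies).

%% Write one record and read it back inside transactions.
demo() ->
    Joe = #user{name = "joe", email = "joe@example.com"},
    {atomic, ok} = mnesia:transaction(fun() -> mnesia:write(Joe) end),
    {atomic, [Found]} =
        mnesia:transaction(fun() -> mnesia:read({user, "joe"}) end),
    Found#user.email.
```

Note that init/0 writes the schema to a directory under the current working directory, and that join/1 only makes sense when both nodes are started with names and share the same cookie.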

Mnesia is not perfect, of course, and its biggest downside at the moment is that its disc storage engine isn't suited for storing large volumes of data (Mnesia was designed for soft real-time applications where the data is stored mostly in RAM), but I hope this will be resolved in the not-too-distant future.

(Update: if you look at Ulf Wiger's comment, you will see that Mnesia disc storage does scale up to many GB, which is more than enough for most websites. There are a couple of other issues with Mnesia disc storage to be aware of, though. They are discussed here.)

The View component is worth discussing a bit more. For the views, Ruby has ERb, which allows you to embed Ruby in HTML. Yaws takes a totally different, and in some ways much more elegant approach: ehtml. ehtml is a set of simple conventions for describing dynamically generated HTML using Erlang tuples, which Yaws translates into the output sent to the browser. With ehtml, you never have to step out of Erlang (of course, you can if you want to). If you want to see an example of this, look at this shopping cart code.
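In case you haven't seen ehtml before, here is a small made-up example (the ehtml_demo module and the page content are mine). Each element is a {Tag, Attributes, Body} tuple, and since the whole page is ordinary Erlang data you can build it with list comprehensions or any other Erlang code.

```erlang
-module(ehtml_demo).
-export([out/1]).

%% Returns an ehtml term describing a heading and a bulleted list.
%% 'div' is quoted because div is a reserved word in Erlang.
out(_Arg) ->
    Items = ["Erlang", "Yaws", "Mnesia"],
    {ehtml,
     {'div', [{class, "stack"}],
      [{h1, [], "My stack"},
       {ul, [], [{li, [], Item} || Item <- Items]}]}}.
```

The same out/1 body could go between the erl tags of a .yaws page. Yaws takes care of turning the tuples into the HTML that is sent to the browser.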

There's a lot more to say about this topic and I'll probably blog about this more in the future. For now, I hope I have given you something interesting to digest. If you want more food for thought you can check out some of my previous postings on this topic.


Ebay, Here I Am

I used to think I was the only living, breathing person with an Internet connection who has never sold anything on Ebay.

Today, this has changed. I finally decided to take action and sell my Compaq laptop on Ebay. I got the new MacBook, with which I am more than happy, so instead of keeping that unused Compaq around I decided to try to place my trust in the free yet imperfect market mechanism to allocate my laptop to its most economically efficient place.

You can see the auction in all its glory here. I feel hopelessly amateurish compared to the other laptop sellers on Ebay, but I hope somebody likes the laptop, likes the price, and is convinced enough that I'm not a scumbag to buy it.

I know this technology-enabled seller meets buyer story is quite thrilling, so I'll keep you posted on how this high action drama plays out.

Monday, July 10, 2006

Google Fund

I predict that Google will buy ETrade or some other brokerage firm. Why? Because Google is a sucker for information, and brokerage firms are sitting on a huge pool of some of the world's most valuable information: live financial instrument trading data.

I'd bet many brains at Google would love to dissect this data this way and that way, correlate it against what people are searching for and writing about in Gmail, Google News, Blogger, GTalk, and the rest of the web, and see what kind of patterns they can reveal.

Some computer-driven "quant" funds have already been outperforming many human-managed funds. This strikes me as just the kind of game that Google likes to play: filtering down an ocean of data into a small set of highly relevant results -- in this case, stocks that are expected to have higher returns than the S&P 500.

If I could predict the future, I would be doing my own stock picking right now instead of blogging, but I certainly find it possible that Google will be tempted to throw its mind-boggling data pool, computing power and brain power into the mix to see how well its own analysis would fare against the rest of the industry. Plus, do I see a virtually endless growth opportunity for the Google stock? Well, let's just say it's just slightly bigger than the one created by Google Spreadsheet.

Imagine that in a year or two we'll see the following announcement: Google Funds, Beta.

Would you invest? I think I would :)

Sunday, July 02, 2006

March of the DRM Folly

DRM is an ill-conceived protection measure for digital goods. I presume it arose from the myth that DRM can actually prevent music from being pirated and that punishing consumers who buy digital downloads with annoying restrictions on their freedoms leads to more sales.

This is nonsense. DRM'd music will always be pirated just as much as non-DRM'd music, and punishing your consumers by giving them a handicapped product can only hurt sales.

This knowledge came to my mind without having spent $120,000 on an MBA from a top school. It's called common sense.

Not to be outdone for its ignorance in both technology and business, the French government has decided to do its part in the DRM fiasco and commit its own folly by passing a law forcing businesses that sell DRM'd products to make them interoperable with their competitors' products.

The intention is good, but the act is overreaching. If DRM is so bad for consumers, consumers will figure this out and stay away from it.

A much better law would be one that mandates all sellers of DRM'd content to place a prominent mention on their site explaining that they sell handicapped products with restrictions on customers' freedoms to copy them and play them on competitors' players.

Thankfully, there are excellent alternatives to DRM'd music. First, consumers can buy non-copy protected CDs (which is what I do), which have better quality and include the physical cover and the liner notes. Second, they can buy digital downloads from enlightened stores such as emusic, which sell non-DRM'd music. emusic doesn't sell music from major record labels, but that's probably better for consumers anyway given the quality of major label music these days.

The same trend that happened with evil P2P software will happen with DRM. The first wave of P2P users were enticed by the promise of free music, but then they got burned by Kazaa, Bearshare and other spyware-laden software that destroyed their machines. The collective realization that you should be careful when installing such programs is now quite strong, even among people who aren't computer savvy.

DRM will share a similar fate. A few million people will naively buy this garbage, but when they realize they can't play their music on a different player, they will learn to avoid DRM'd music like the plague.

About 3 years ago, a friend bought me a $10 gift certificate for iTunes. I have never used it. A DRM'd song will never land on my hard drive. I prefer to let Apple keep the money.

If you think I'm fanatical about this, you ain't seen nothin yet :)