Last week, I attended the Numenta workshop. I didn't know much about Numenta before I went. My friend's excitement about the technology Numenta is building piqued my curiosity, so I decided to check it out. It seemed that almost everyone else at the conference had read Jeff Hawkins's On Intelligence and at least experimented with Numenta's tools, so I felt like a real n00b. I'm happy I went, though, because I learned about some interesting ideas and technologies.
Jeff Hawkins, Numenta's founder, has been fascinated with the workings of the brain throughout his career, but only two decades into it, after he founded Palm and Handspring, was he able to devote his efforts to artificial intelligence. In On Intelligence, Hawkins discusses his theories on the brain's functions in detail. Numenta, a company he founded with Dileep George and Donna Dubinsky, aims to put these ideas to work in commercial and research applications.
Numenta is a platform company. The platform they develop is NuPIC (Numenta Platform for Intelligent Computing), a software toolkit essentially for building pattern classifiers. The fundamental concept behind NuPIC is called HTM (Hierarchical Temporal Memory). It postulates that the cortex learns to recognize patterns using a combination of two basic algorithms: hierarchical belief propagation, and the detection of invariants in a sequence of transformations in time. I won't get into what this all means because there's plenty of documentation on the Numenta website. I recommend browsing it if you find this interesting.
HTM is not just theory. Although NuPIC is in a very early stage, companies are applying NuPIC to a wide range of problems, including vision, voice recognition, finance, motion recognition (recognizing motion capture data to detect if a person is walking, running, sitting, etc.) and games. This is just a small subset of its potential uses.
Using NuPIC in its current state isn't trivial. It provides the building blocks for HTM pattern classifiers, but application developers still have to do a good deal of work to tune the parameters of their HTM (How many nodes? How many levels in the hierarchy? How much training data to use? How many categories? What transformations to apply to the input over time to train the system?) to their problem domain. Also, some important features haven't been implemented yet. For example, although NuPIC can be pretty effective at classifying images that contain a single object against a plain background (with enough training), it isn't designed to recognize objects in images with noisy backgrounds or with multiple objects. (The problem of how to identify interesting objects in a scene is called the "attention" problem. To solve it you need to have a mechanism by which the top nodes could send feedback down to the bottom nodes. Hawkins said Numenta will tackle it in a future release.)
One reason I find Numenta so interesting is that I believe that NuPIC, or something like it, will play a role in the evolution of the Web. The current generation of web applications is effective at aggregating massive amounts of data in different verticals (pictures, videos, bookmarks, status messages, paintings), slicing and dicing it in different ways, searching it, and displaying it in an organized fashion. Mashups provide additional context for the data gathered in the different silos of the web (Kosmix is a good example), but they don't add any real "intelligence" to the mix, i.e. they don't extract new knowledge from the data they aggregate. Numenta's technology could be used to implement a new layer of intelligence on top of existing services by training it to recognize spatial and temporal patterns in the data they've collected. For example, imagine a Flickr API that let you submit an image and Flickr would tell you what the objects in the image are and where the picture was taken. Or a Facebook API for identifying the people in a picture. Or a Skype API for recognizing the speaker from a voice sample (creepy, I know). Or a HotOrNot API for automatically classifying the hotness of a person (ok, bad example :) ). Or a YouTube API for identifying the objects and events in a video clip. Or an icanhascheezburger API for automatically classifying the LOLness of a cat (well... maybe not :) ).
If this happens, maybe some day a mashup of these web services will be used to build something that resembles real AI. If (when?) someone manages to build a real-life WALL*E (great movie!), I think there's a good chance its HTMs will be trained on the vast amounts of data gathered on the web.
Showing posts with label technology. Show all posts
Sunday, June 29, 2008
Wednesday, November 01, 2006
Goodbye, Typo. Hello Wordpress!
Update (12/21/06): I was in a pretty upset state of mind after struggling with a barely-working comment system for many days when I wrote this posting. I didn't want to take my frustrations out on Typo because I liked Typo (plus, I really didn't think this was Typo's fault because Typo worked fine under light load), so I picked on Rails instead. Please read this posting as a silly angry rant rather than a well thought-out criticism.
It wouldn't be fair to judge the venerable Ruby on Rails "platform" based on a single data point (*cough* it sucks! *cough* *cough*), but after days of agony trying to get comment submission to work properly in Typo (executing an INSERT after an HTTP POST must be a requirement that's outside of Rails' "scope"), I decided to go back to my roots and run this blog on Wordpress.
(Yes, I was tempted to write my own blogging engine in Erlang, but I decided against it -- I must keep my eyes on the prize :) )
Based on my admittedly limited Rails experience, even if Erlang isn't your cup of tea, I recommend avoiding Rails and sticking to PHP or Python (or Smalltalk, about which I hear nice things) unless you're an optimization genius who loves the thrill of Linux tinkering; you're hopelessly stricken by a successful marketing blitz; you're a masochist; you wrote a Rails book; or you're just not planning on success.
I can't comprehend why some people think it's justifiable to ask someone how many worker processes he would like to run. I wouldn't want to pick a low number like, say, 5, because that might indicate that I have self-esteem issues, but then I'm trapped by the fear that an astronomically high number such as, gosh, 20, would hose my VPS.
Sometimes, when I let my imagination run wild, I wish I could say something crazy, like "500,000".
Oh wait -- I can :)
Update (11/2): Wow! Wordpress is so much faster than Typo! Good job, Matt! :)
Saturday, July 15, 2006
Embracing Typo
My blog has had quite a journey through different blogging services.
It started at Blogger because it was so easy to set up and start blogging, but I lost my love for Blogger after a while because Blogger didn't let me log in to my blog or edit it over SSL. Call me crazy, but I just don't like sending the password for my blog in plaintext over the internet. Having my blog hijacked by a 13-year-old Ukrainian hacker would really spoil the fun of having a blog.
My blog's next stop was wordpress.com. wordpress.com gives you full SSL access, which is pretty awesome, and it has a pretty nice interface. However, wordpress.com is too locked down for me. It's impossible to manually edit templates, and the template selection was pretty poor IMO. Some templates looked nice, but all of them had one or two big flaws that turned me off.
Plus, I'm a geek, and I like having full control over my blog. At wordpress.com, I ran the risk of not being able to do whatever I wanted with my blog. That's not to say Wordpress is a bad service -- I'm just not in its target audience.
I stayed at wordpress.com for a while, but when my discomfort reached a certain level I started looking for alternatives. Then I discovered Typo, a very cool blogging application written in Ruby on Rails. I played with Typo for a little while and I fell in love with it very quickly. Despite its young age, Typo is packed with features, it has a great interface, and because it's written in Rails, I feel right at home with the source code.
I decided to host Typo on my own server (it runs Debian with Lighttpd) so I can have total freedom to tweak it as I please. My blog has finally found its ideal home.
Big thanks to the Typo developers for giving people such a great blogging tool!
Sunday, July 02, 2006
March of the DRM Folly
DRM is an ill-conceived protection measure for digital goods. I presume it arose from the myth that DRM can actually prevent music from being pirated and that punishing consumers who buy digital downloads with annoying restrictions on their freedoms leads to more sales.
This is nonsense. DRM'd music will always be pirated just as much as non-DRM'd music, and punishing your consumers by giving them a handicapped product can only hurt sales.
This knowledge came to my mind without having spent $120,000 for an MBA from a top school. It's called common sense.
Not to be outdone in its ignorance of both technology and business, the French government has decided to do its part in the DRM fiasco and commit its own folly by passing a law forcing businesses that sell DRM'd products to make them interoperable with their competitors' products.
The intention is good, but the act is overreaching. If DRM is so bad for consumers, consumers will figure this out and stay away from it.
A much better law would be one that mandates all sellers of DRM'd content to place a prominent mention on their site explaining that they sell handicapped products with restrictions on customers' freedoms to copy them and play them on competitors' players.
Thankfully, there are excellent alternatives to DRM'd music. First, consumers can buy non-copy-protected CDs (which is what I do), which have better quality and include the physical cover and the liner notes. Second, they can buy digital downloads from enlightened stores such as emusic, which sell non-DRM'd music. emusic doesn't sell music from major record labels, but that's probably better for consumers anyway given the quality of major label music these days.
The same trend that happened with evil P2P software will happen with DRM. The first wave of P2P users were enticed by the promise of free music but then they got burned by Kazaa, Bearshare and other spyware-laden software that destroyed their machines. The collective realization that you should be careful when installing such programs is now quite strong, even among people who aren't computer savvy.
DRM will share a similar fate. A few million people will naively buy this garbage, but when they realize they can't play their music on a different player, they will learn to avoid DRM'd music like the plague.
About 3 years ago, a friend bought me a $10 gift certificate for iTunes. I have never used it. A DRM'd song will never land on my hard drive. I prefer to let Apple keep the money.
If you think I'm fanatical about this, you ain't seen nothin yet :)
Wednesday, June 21, 2006
Why I Moved from Blogger to Wordpress
I used to use Blogger, but I recently decided to move my blog to Wordpress. The primary reason I decided to leave Blogger is Blogger's pathetic security, mostly due to the lack of SSL access. I picked Wordpress for my blog's new home because Wordpress has some of the best features and the best overall experience of all the blogging services I know. In fact, Wordpress's only minor drawback in my mind is the lack of manual control over the templates, but I'm not a customization freak, so this isn't a big concern for me.
Blogger doesn't even let you log in over SSL, not to mention keeping your session over SSL while you're editing your blog. When you change your password, Blogger doesn't even send you a validation email. What does that mean? Every 12-year-old hacker armed with Ethereal or tcpdump can steal your password by eavesdropping on your connection, and can then go ahead and change your password and thereby hijack your blog.
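To make the eavesdropping risk concrete, here is a minimal Python sketch of what a login form submission looks like on the wire when it is sent over plain HTTP. The field names and credentials are made up for illustration, and this is not Blogger's actual login form; the point is only that a form body is ordinary text that anyone who captures the packets can parse.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields and values; not Blogger's real login form.
credentials = {"username": "alice", "password": "s3cret!"}

# What a browser puts in the body of an HTTP POST for a login form.
# Without SSL, these bytes cross the network unencrypted.
body = urlencode(credentials).encode("ascii")


def sniff_password(captured: bytes) -> str:
    """Recover the password from a captured form body, as an eavesdropper could."""
    fields = parse_qs(captured.decode("ascii"))
    return fields["password"][0]


print(body)
print(sniff_password(body))
```

With SSL, the same bytes would be encrypted before leaving the machine, so a packet capture would not reveal the form fields.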
Your blog is a large part of your online identity. It's often the first thing that shows on search engines when people search for your name. It's valuable. I'm not comfortable with the thought that my blog could be hijacked so easily and there's nothing I can do to prevent it. (I did read that certain blogging applications let you use Blogger over SSL, but that's one more hoop than I'm willing to jump through.)
I dread the day when somebody stages a large-scale attack on Blogger and hijacks thousands if not millions of blogs. Maybe such an event would kick Google's butt into action, getting it to turn on the SSL switch on the Blogger servers. I suppose that if this happened, Blogger could mitigate the disaster by rolling back all changes that happened during the attack, and then resetting all passwords. The damage would be significant, but not irreversible. I'm actually more concerned about individual blogs getting hijacked without Blogger's knowing or caring.
Wordpress has SSL access, so this problem largely doesn't affect Wordpress users (I say "largely" because the Wordpress servers could always be cracked and the user data could be stolen, but the risk is very small). That's a huge advantage for Wordpress, and is the primary reason I moved here. I must say I'm happy here so far. I may decide to host my blog on my own server eventually, which would have its downsides, but it's likely that Wordpress will remain my blog's permanent home.
Sunday, June 11, 2006
Helen OS
I just saw on Digg a link to an interesting new open source operating system, HelenOS. Among other things, HelenOS has support for SMP, kernel threads, userspace threads, userspace pseudo-threads ("Userspace pseudo threads are very lightweight threads running in the context of one userspace thread") and IPC ("the ability of userspace threads to communicate with other threads (possibly from different tasks) via sending and receiving, synchronously or asynchronously, short messages"). The full list is here.
Some of these features strike me as very similar to those provided by Erlang and its virtual machine. I wonder if HelenOS developers took some cues from Erlang's success at scaling to large numbers of concurrent processes by keeping them very lightweight. This raises the interesting question of whether such features, when provided by the OS, make it possible to write C/C++ programs with the same scalability characteristics as Erlang programs at large numbers of concurrent processes. I should stress that the operative word here is "possible" -- not "easy"!
Let's sit back and wait for the benchmarks.
Tuesday, June 06, 2006
MacBook: 1, Dual G5: 0
I knew the Intel Macs were fast, but I didn't expect my (low-end) MacBook to put my Dual G5 PowerMac to shame in a task that's both CPU- and I/O-intensive.
I timed the compilation of a mid-size C/C++ project using xcodebuild, and here are my results:

So, the MacBook compiles about 40% faster than the Dual G5.
That's pretty awesome.
Friday, June 02, 2006
Secure Portable Storage with OS X
If you're an OS X user and you store sensitive files on your iPod or flash drive, you're probably looking for ways to secure your data in case your portable storage device falls into the wrong hands. Some flash drives have proprietary data protection mechanisms, but they often don't work with OS X. More importantly, the iPod doesn't have such a capability built in. The best mechanism I found was to create an encrypted disk image and use it as a virtual drive for your sensitive files. This disk image is safe to carry around because it protects your data with 128-bit AES encryption, which is uncrackable by all practical means.
Here's how you do it:
Open the terminal and type
cd /Volumes/[name of portable storage device]
hdiutil create -fs HFS+ -encryption -type SPARSE -volname "My Drive" securedrive
This creates a new disk image on your portable storage device called securedrive.sparseimage. You can mount the disk image by executing "hdiutil mount securedrive.sparseimage" or by double clicking on the disk image in Finder. This will show the virtual drive in Finder as volume "My Drive" as well as in the /Volumes directory.
You can copy or drag and drop your files into the newly mounted virtual drive and your data will be safe. Just don't forget to cleanly eject (unmount) the virtual drive (using the Finder eject button or by executing 'hdiutil unmount "/Volumes/My Drive"'), as well as your portable storage device, before you physically disconnect the portable storage device from your computer.
Keep in mind that when you delete files from the virtual drive, the disk image doesn't shrink automatically and the physical space taken by the files remains unavailable. To reclaim this space, unmount the virtual drive and type
cd /Volumes/[name of portable storage device]
hdiutil compact securedrive.sparseimage
That'll give you those precious bytes back.
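To put the "uncrackable by all practical means" claim about 128-bit AES in perspective, here is a rough back-of-the-envelope estimate in Python. The guessing rate is an assumption picked for illustration, not a measurement of any real hardware, and the estimate covers only brute-forcing the key itself; a weak passphrase would be far easier to guess.

```python
# Rough brute-force estimate for a 128-bit key.
# ASSUMPTION: the attacker tests 10**12 keys per second. This rate is
# illustrative only and is not a measurement of any real hardware.
KEY_BITS = 128
GUESSES_PER_SECOND = 10**12
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def expected_years(key_bits: int, guesses_per_second: int) -> int:
    """Years needed to search half the keyspace, the average case for brute force."""
    average_guesses = 2 ** (key_bits - 1)
    return average_guesses // (guesses_per_second * SECONDS_PER_YEAR)


years = expected_years(KEY_BITS, GUESSES_PER_SECOND)
print(f"about {years:.2e} years")
```

Even at that assumed rate the answer is on the order of 10^18 years, so the passphrase you choose matters far more than the cipher.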
Thursday, May 25, 2006
Got a MacBook!
I just got a shiny new MacBook at the Apple store in Soho... it is sweet! I knew I wanted to get one as soon as it came out. At first I had some reservations about the glossy screen, but when I looked at it side by side against the MacBook Pro's screen, I realized that the glossy screen actually looks much crisper and brighter and its reflectivity is not an issue unless there's a bright light source right behind you.
Before I got the MacBook, I made sure to order a bigger hard drive and 2 GB of RAM. I knew I would need the upgrades, and unless you're rich or just a sucker you shouldn't buy them at the Apple store. I got the cheapest MacBook available for $1099 because I knew I would get the upgrades from NewEgg anyway, and I really don't care about the 166 MHz difference or the black shell of the more expensive MacBooks.
Unfortunately, I haven't been able to install any of my upgrades yet because I don't have a tiny enough screwdriver to remove the metal piece covering the expansion slots in the back. So much for the promise of easy upgrades -- upgrading the PowerBook's RAM was much easier.
The antenna is excellent. The MacBook picks up my wireless network from a much greater range.
I haven't run any serious performance tests yet but the MacBook definitely feels snappy, and that's without any upgrades.