Yariv's Blog: June 2008

Last week, I attended the Numenta workshop. I didn't know much about Numenta before I went. My friend's excitement about the technology Numenta is building piqued my curiosity, so I decided to check it out. It seemed that almost everyone else at the workshop had read Jeff Hawkins's On Intelligence and at least experimented with Numenta's tools, so I felt like a real n00b. I'm happy I went, though, because I learned about some interesting ideas and technologies.

Jeff Hawkins, Numenta's founder, has been fascinated with the workings of the brain throughout his career, but only two decades into it, after he founded Palm and Handspring, was he able to devote his efforts to artificial intelligence. In On Intelligence, Hawkins discusses his theories on the brain's functions in detail. Numenta, a company he founded with Dileep George and Donna Dubinsky, aims to put these ideas to work in commercial and research applications.

Numenta is a platform company. The platform they develop is NuPIC (Numenta Platform for Intelligent Computing), essentially a software toolkit for building pattern classifiers. The fundamental concept behind NuPIC is called HTM (Hierarchical Temporal Memory). It postulates that the cortex learns to recognize patterns using a combination of two basic algorithms: hierarchical belief propagation, and the detection of invariants in a sequence of transformations in time. I won't get into what this all means because there's plenty of documentation on the Numenta website. I recommend browsing it if you find this interesting.

HTM is not just theory. Although NuPIC is at a very early stage, companies are applying it to a wide range of problems, including vision, voice recognition, finance, motion recognition (classifying motion capture data to detect whether a person is walking, running, sitting, etc.), and games. This is just a small subset of its potential uses.

Using NuPIC in its current state isn't trivial. It provides the building blocks for HTM pattern classifiers, but application developers still have to do a good deal of work to tune the parameters of their HTM (How many nodes? How many levels in the hierarchy? How much training data to use? How many categories? What transformations to apply to the input over time to train the system?) to their problem domain. Also, some important features haven't been implemented yet. For example, although NuPIC can be pretty effective at classifying images that contain a single object against a plain background (with enough training), it isn't designed to recognize objects in images with noisy backgrounds or with multiple objects. (The problem of how to identify interesting objects in a scene is called the "attention" problem. To solve it you need to have a mechanism by which the top nodes could send feedback down to the bottom nodes. Hawkins said Numenta will tackle it in a future release.)
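To make the tuning questions above a bit more concrete, here's a minimal sketch of how you might enumerate candidate HTM configurations before training anything. To be clear, none of this is NuPIC's API: `HTMConfig`, `candidate_configs`, and the `fan_in` rule (each level has a fixed fraction of the nodes of the level below it) are hypothetical names and assumptions I made up purely for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class HTMConfig:
    """Hypothetical bundle of tuning knobs (NOT a NuPIC class)."""
    levels: int              # how many levels in the hierarchy
    nodes_per_level: tuple   # node count at each level, bottom first
    training_samples: int    # how much training data to use
    categories: int          # how many output categories


def candidate_configs(level_options, bottom_node_options, sample_options,
                      categories, fan_in=4):
    """Enumerate configurations to try.

    Assumption: each level has `fan_in` times fewer nodes than the level
    below it, with at least one node per level.
    """
    configs = []
    for levels, bottom, samples in product(
            level_options, bottom_node_options, sample_options):
        nodes = tuple(max(1, bottom // fan_in ** i) for i in range(levels))
        configs.append(HTMConfig(levels, nodes, samples, categories))
    return configs


# Two hierarchy depths x two bottom-level sizes x one training-set size.
configs = candidate_configs([2, 3], [16, 64], [1000], categories=5)
for cfg in configs:
    print(cfg)
```

Each configuration would still have to be trained and evaluated on your own data. The sketch only shows how quickly the search space grows, and it leaves out the last question entirely (which transformations to apply to the input over time).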

One reason I find Numenta so interesting is that I believe that NuPIC, or something like it, will play a role in the evolution of the Web. The current generation of web applications is effective at aggregating massive amounts of data in different verticals (pictures, videos, bookmarks, status messages, paintings), slicing and dicing it in different ways, searching it, and displaying it in an organized fashion. Mashups provide additional context for the data gathered in the different silos of the web (Kosmix is a good example), but they don't add any real "intelligence" to the mix, i.e. they don't extract new knowledge from the data they aggregate. Numenta's technology could be used to implement a new layer of intelligence on top of existing services by training it to recognize spatial and temporal patterns in the data they've collected. For example, imagine a Flickr API that let you submit an image and Flickr would tell you what the objects in the image are and where the picture was taken. Or a Facebook API for identifying the people in a picture. Or a Skype API for recognizing the speaker from a voice sample (creepy, I know). Or a HotOrNot API for automatically classifying the hotness of a person (ok, bad example :) ). Or a YouTube API for identifying the objects and events in a video clip. Or an icanhascheezburger API for automatically classifying the LOLness of a cat (well... maybe not :) ).

If this happens, maybe some day a mashup of these web services will be used to build something that resembles real AI. If (when?) someone manages to build a real-life WALL*E (great movie!), I think there's a good chance its HTMs will be trained on the vast amounts of data gathered on the web.

Sunday, June 29, 2008

Numenta
