Detecting Face features

Cincopa video hosting solution for your website. Another great product from Cincopa Send Files.

I’ve been doing a lot of research lately about the best way to detect and track face features such as the nose, eyes, mouth, etc. My initial thought was to use the same technique ubiquitously used for face detection, the Viola-Jones object detection framework. It works great for face detection and can easily be tested with the examples provided in OpenCV. Although OpenCV provides haarcascade_eye.xml for eye detection and another cascade for the mouth, the results were pretty disappointing, even when only trying to find the bounding box of those features rather than their contour, which is what we are ultimately trying to achieve.

After a few hours of searching and digging online, I came across a very promising technique called Active Appearance Model, created by Timothy Cootes from Manchester University.

He describes his technique with the following:

“The Active Appearance Model (AAM) is a generalisation of the widely used Active Shape Model approach, but uses all the information in the image region covered by the target object, rather than just that near modelled edges. An AAM contains a statistical model of the shape and grey-level appearance of the object of interest which can generalise to almost any valid example. Matching to an image involves finding model parameters which minimise the difference between the image and a synthesised model example, projected into the image. The potentially large number of parameters makes this a difficult problem.
We observe that displacing each model parameter from the correct value induces a particular pattern in the residuals. In a training phase, the AAM learns a linear model of the relationship between parameter displacements and the induced residuals. During search it measures the residuals and uses this model to correct the current parameters, leading to a better fit. A good overall match is obtained in a few iterations, even from poor starting estimates.”

Tim was kind enough to provide tools and toys that we can play with to understand the AAM. These toys can be found here. You might notice that there is no source code provided. If you are looking for source code for AAM, you would definitely want to check out Mikkel B. Stegmann’s site fully dedicated to AAM with an open source library. The site can be found here.

I also found a great project using ASM here, another one that uses AAM here, and a great open source one here.


Yahoo!’s Browser Plus

If you are a frontend ninja and think that the browser is limiting what you can build, then you’re in for a treat. Yahoo!’s BrowserPlus team came to the rescue and gave the world an open source project that extends the browser’s functionality: you can embed your C++ libraries in it and interact with them using JS, or even ActionScript, through ExternalInterface or plain TCP/IP sockets.

This is how the BrowserPlus guys describe their product:

“BrowserPlus is a “Browser Plugin Abstraction” that allows you to write and deploy functionality to end user devices that augments what is possible from browser-based javascript.

BrowserPlus provides mechanisms to deploy and update these new plugins (we call them “services”), and makes it much simpler to produce services with wide end-user platform support.  Our goal is to breathe new life into web plugins of today, and make a whole lot more possible on the web.”

I will be posting about the little project I am building using BrowserPlus.

A stunning moon

Right when we parked at our friend’s cabin up in Garberville, California for a memorable Memorial Day weekend, the first thing we spotted, besides the neighbor’s crazy dog who kept following us for the rest of the weekend, was the moon. So I grabbed my cam and took a snapshot of it. And this is what it looked like:

Cross platform my Arch

After years of cracking serious ActionScript and building dope Flash apps and microsites without the fear of things breaking randomly when run on old browsers that also happened to be our users’ favorite according to some statistics… I decided to take on new challenges and Ajax it a little, which I did, and I am very grateful for the opportunity. The product I built is not out to the public yet, so I can’t really discuss it, but I will post info about it once we get the green light from our PM and PR.

Now, I decided that it would be nice to get a little deeper and do some low-level stuff. Why the hell not? How about some graphics and video in C++? Sounds great, doesn’t it? But what if I want to build my stuff on Ubuntu while my teammates are working on Macs and PCs?

So I decided to take the bull by the horns: follow the rules for cross platform development and things should just work. Not quite the case… I used Eclipse, since it is a cross platform dev tool, and the Boost library, since it is also a cross platform framework, to build a simple echo server that spits out video frames to a TCP/IP socket. The project worked great on my Ubuntu box and things were pretty happy on my desk until I decided to take that code and run it on my Windows machine using Eclipse CDT, MinGW and GNU Make (since they all claim to be cross platform, and so I could be consistent). Before I knew it, I had spent a good 10 hours sitting in my chair wrestling with integrating Boost into my tool chain. I am not sure if it is just hard by nature to do things like this or if I am just too stupid for the task… 14 hours in, I gave up, took everyone’s advice and installed Visual Studio 2008, and my server was up and running in minutes.

Cross Platform my Arch

I have to say that although I have it running on VS 08, I am still uneasy about it and I am gonna have to go back to it another day when I am less frustrated and try again. Hopefully I will get to blog on the steps I took to get it set up.

My first encounter with WebM

Last week, everyone at work was talking about WebM and its direct and indirect impact on our video conferencing product.

What’s interesting is that I am not sure whether Google intentionally announced the open sourcing of VP8 in such a skeptical environment, with a lot of open questions (especially legal ones about H264 patent infringement when using VP8 in commercial products), in order to get free advertising, or whether it just happened this way.

I just read today that Steve Jobs is already talking trash about VP8 claiming that it is buggy and not as good as H264. Typical!

Despite the discouraging advice from Jobs, I decided to take on the adventure of trying VP8 on my own to see if I like it or not. I will be using this blog post to log all the steps I took to get up and running with the VP8 SDK on my Ubuntu 10.04.

Installed Git: sudo apt-get -y install git-core gitosis

Configured the Git account using: git config --global user.email "my_email_address@email_server.com"

Grabbed the source code with Git: git clone git://review.webmproject.org/libvpx.git

Went inside the source folder and ran ./configure, then make

Once you see ivfenc, you are good to go. You can run the commands listed in http://www.webmproject.org/tools/encoder-parameters/ on any video file source.

I think building and running WebM is an extremely easy task; it took me less than 30 minutes to get the examples running! I will write a post later with my comments on the performance of VP8.

Setting up my Ubuntu machine

Last year I decided to give Ubuntu (the new cool kid) a try and use it as my development platform. Bad idea! Ubuntu 9.10 was a Vista-like experience for me. I had a hard time with the GRUB boot loader, which kept failing every time I did a system update or my laptop ran out of power. It was very frustrating, to say the least.

Today, I am going to give Ubuntu 10.04 a try, hoping for better luck this time around. My main development will be computer vision and machine learning related, so OpenCV is the first thing I want to install. I will use this blog post to log all the steps taken to get OpenCV up and running on my Ubuntu 10.04.

install svn: sudo apt-get install subversion

OpenCV

Searching for OpenCV in Synaptic returned the list of OpenCV 2.0  libraries needed to run it. So I installed them.

It looks like the .h files were installed in /usr/include/opencv and the Haar Cascade xml files are located in /usr/share/opencv/haarcascades/

cmake:

sudo apt-get install cmake
Download CMake module for OpenCV:
http://opencv.willowgarage.com/wiki/Getting_started?action=AttachFile&do=view&target=FindOpenCV.cmake
Save it to:  /usr/share/cmake/Modules

Install ccmake by running: sudo apt-get install cmake-curses-gui

Eclipse

Eclipse can easily be installed using the Ubuntu Software Center. Right now, the shipped version is 3.5.2

Et voilà! Now you can create OpenCV projects and compile them using cmake!

I should now write a post on how to build a similar environment on Mac and Windows.

Logistic Regression to classify flying aircrafts [2]

C++ Training and Testing code

I created a Google Code project for this example and it can be found here.

The code should be simple and self-explanatory. It trains with batch Logistic Regression by default, but if you pass 1 as an argument, it will use stochastic mode instead.

High level explanation of the difference between Stochastic and Batch mode: in Batch mode, every gradient descent step uses the whole training set, so the steps are slower but they point in the right direction, and the result converges very close to the minimum. Stochastic mode, on the other hand, updates the Thetas after each training example, so it moves toward the minimum faster but might not get as close to the goal as Batch does.
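To make the difference concrete, here is a minimal sketch of both modes in Python. This is not the code from the Google Code project: the function names, the learning rate, the epoch count and the tiny hand-made data set are all assumptions made up for illustration. Each sample is a bias term followed by R, G and B scaled to the 0..1 range.

```python
import math
import random

def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(theta, x):
    # h(x) = g(theta . x)
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def train_batch(X, y, alpha=0.5, epochs=2000):
    # Batch mode: one update per pass, using the gradient averaged
    # over the whole training set
    theta = [0.0] * len(X[0])
    m = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            err = predict(theta, xi) - yi
            for j, xij in enumerate(xi):
                grad[j] += err * xij
        theta = [t - alpha * g / m for t, g in zip(theta, grad)]
    return theta

def train_stochastic(X, y, alpha=0.5, epochs=2000, seed=0):
    # Stochastic mode: one update per training example, visited in
    # a shuffled order on every pass
    rng = random.Random(seed)
    theta = [0.0] * len(X[0])
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            err = predict(theta, X[i]) - y[i]
            theta = [t - alpha * err * xij for t, xij in zip(theta, X[i])]
    return theta

# Made-up toy data: [bias, R, G, B]; label 0 = sky, 1 = aircraft
X = [[1.0, 0.20, 0.50, 0.90], [1.0, 0.30, 0.60, 1.00], [1.0, 0.25, 0.55, 0.95],
     [1.0, 0.60, 0.60, 0.60], [1.0, 0.40, 0.40, 0.45], [1.0, 0.70, 0.70, 0.65]]
y = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    print("batch thetas:     ", train_batch(X, y))
    print("stochastic thetas:", train_stochastic(X, y))
```

Both functions share the same update rule; the only difference is how many examples feed each update, which is exactly the trade-off described above.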

Here is an example of the TesterMain run on one image of an aircraft in the sky:

[Image: input]

[Image: output]

The testing part can be done in ActionScript, and all it needs is the Thetas calculated from our training examples.

Logistic Regression to classify flying aircrafts [1]

Logistic Regression to detect Airplane Pixels

Background:

Logistic Regression falls into the supervised learning category (algorithms that train software for a specific task using labeled training data). The great thing about Logistic Regression is that it is simple to implement, and its simplicity makes it very easy to debug.

Logistic regression is based on the sigmoid function:

g(z) = 1 / (1 + e^(-z))

This function has lots of nice features including:

The Math behind it >>

You can learn more about Logistic Regression by clicking on the inline links. And now let’s talk about the problem we are trying to solve.

Problem:

Given an image or video of an airplane in the sky, how can we classify each pixel as either a sky pixel or an airplane pixel?

Solution:

The idea is simple: we will traverse all the image pixels and plug the R, G and B values into some hypothesis equation h(x) (that we need to figure out). If the result is close to zero, we classify that pixel as a sky pixel; otherwise we classify it as an aircraft pixel. This description implies that our equation should have an output from 0 to 1, which is how the sigmoid function is designed.

Let’s refer to a specific pixel by x, then the hypothesis function h(x) can be defined as:

where

and g is the sigmoid function.
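As a rough sketch of how such a hypothesis could be evaluated per pixel, here is a small Python example. It assumes the usual logistic regression form, h(x) = g(theta0 + theta1*R + theta2*G + theta3*B), with the color channels scaled to 0..1; that form, the scaling, the 0.5 threshold, the function names and the Theta values are all assumptions for illustration, not the trained values from this post.

```python
import math

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z)), written to avoid overflow for large |z|
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def hypothesis(theta, rgb):
    # Assumed form: h(x) = g(theta0 + theta1*R + theta2*G + theta3*B),
    # with each channel scaled from 0..255 down to 0..1
    r, g, b = (c / 255.0 for c in rgb)
    z = theta[0] + theta[1] * r + theta[2] * g + theta[3] * b
    return sigmoid(z)

def classify_pixel(theta, rgb, threshold=0.5):
    # Outputs near 0 mean sky, outputs near 1 mean aircraft
    return "aircraft" if hypothesis(theta, rgb) >= threshold else "sky"

# Hypothetical Thetas, invented for this example only
EXAMPLE_THETA = [1.0, 4.0, 0.0, -5.0]

if __name__ == "__main__":
    for pixel in [(60, 140, 240), (150, 150, 150)]:
        print(pixel, classify_pixel(EXAMPLE_THETA, pixel))
```

Running the same function over every pixel of a frame gives the sky/aircraft mask; only the four Theta values have to come from training.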

Our goal is then to train on some labeled data to find the best values for the Thetas.

Implementation

Training Part:

This could be done in ActionScript, with the code tracing out the Thetas, but it would take a very long time to complete the task, much longer than if we did it in a lower-level language such as C++.

Since we are trying to do things the right way, let’s do the training part in C++ and the testing part in Flash.

If you don’t feel like training on data and getting your hands dirty with C++, feel free to use my training result (the best possible Thetas) straight in your AS code.

to be continued…

Ready, set, GO!

Abjuring my laziness and engaging in a love affair with my newly created blog.


© Copyright 2007 Is it Intelligent yet? . Theme by Zidalgo Thanks for visiting!