Category Archives: Linux

Using OpenCV for great customer service

OpenCV is an Open Source Computer Vision library that can be used in a variety of applications. There are a few wrappers that expose the OpenCV API in a number of languages; we will look at the Python wrapper in this post.

One application that could be put together quickly and easily would be to use facial recognition to look up a customer before serving them. This could be achieved with a simple, cheap webcam mounted at the entrance to a service centre, capturing people’s faces as they enter the building. Each capture can then be matched against a database of images to identify the customer, so that all their details appear immediately on the service centre agent’s terminal. If the customer is new, the agent can capture their info for next time.

Privacy issues aside, this should be relatively easy to implement.

import sys
import cv2.cv as cv
from optparse import OptionParser

# Parameters for haar detection
# From the API:
# The default parameters (scale_factor=2, min_neighbors=3, flags=0) are tuned
# for accurate yet slow object detection. For a faster operation on real video
# images the settings are:
# scale_factor=1.2, min_neighbors=2, flags=CV_HAAR_DO_CANNY_PRUNING,
# min_size=<minimum possible face size

min_size = (20, 20)
image_scale = 2
haar_scale = 1.2
min_neighbors = 2
haar_flags = 0

def detect_and_draw(img, cascade):
    # allocate temporary images
    gray = cv.CreateImage((img.width,img.height), 8, 1)
    small_img = cv.CreateImage((cv.Round(img.width / image_scale),
                   cv.Round (img.height / image_scale)), 8, 1)

    # convert color input image to grayscale
    cv.CvtColor(img, gray, cv.CV_BGR2GRAY)

    # scale input image for faster processing
    cv.Resize(gray, small_img, cv.CV_INTER_LINEAR)

    cv.EqualizeHist(small_img, small_img)

    if cascade:
        t = cv.GetTickCount()
        faces = cv.HaarDetectObjects(small_img, cascade, cv.CreateMemStorage(0),
                                     haar_scale, min_neighbors, haar_flags, min_size)
        t = cv.GetTickCount() - t
        print "detection time = %gms" % (t/(cv.GetTickFrequency()*1000.))
        if faces:
            for ((x, y, w, h), n) in faces:
                # the input to cv.HaarDetectObjects was resized, so scale the
                # bounding box of each face and convert it to two CvPoints
                pt1 = (int(x * image_scale), int(y * image_scale))
                pt2 = (int((x + w) * image_scale), int((y + h) * image_scale))
                cv.Rectangle(img, pt1, pt2, cv.RGB(255, 0, 0), 3, 8, 0)

    cv.ShowImage("result", img)

if __name__ == '__main__':

    parser = OptionParser(usage = "usage: %prog [options] [filename|camera_index]")
    parser.add_option("-c", "--cascade", action="store", dest="cascade", type="str", help="Haar cascade file, default %default", default = "../data/haarcascades/haarcascade_frontalface_alt.xml")
    (options, args) = parser.parse_args()

    cascade = cv.Load(options.cascade)

    if len(args) != 1:
        parser.print_help()
        sys.exit(1)

    input_name = args[0]
    if input_name.isdigit():
        capture = cv.CreateCameraCapture(int(input_name))
    else:
        capture = None

    cv.NamedWindow("result", 1)

    if capture:
        frame_copy = None
        while True:
            frame = cv.QueryFrame(capture)
            if not frame:
                cv.WaitKey(0)
                break
            if not frame_copy:
                frame_copy = cv.CreateImage((frame.width,frame.height),
                                            cv.IPL_DEPTH_8U, frame.nChannels)
            if frame.origin == cv.IPL_ORIGIN_TL:
                cv.Copy(frame, frame_copy)
            else:
                cv.Flip(frame, frame_copy, 0)

            detect_and_draw(frame_copy, cascade)

            if cv.WaitKey(10) >= 0:
                break
    else:
        image = cv.LoadImage(input_name, 1)
        detect_and_draw(image, cascade)
        cv.WaitKey(0)

    cv.DestroyWindow("result")


So as you can see, by using the bundled OpenCV Haar cascade XML files for frontal face detection, we are almost there already! Assuming you saved the script above as facedetect.py, try it with:

python ./facedetect.py -c /usr/local/share/OpenCV/haarcascades/haarcascade_frontalface_alt.xml 0

Where 0 is the index of the camera you wish to use.
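To close the loop on the customer-service idea, the lookup side can be sketched in plain Python. Everything here is an assumption for illustration: the Haar detection above only finds faces, it does not identify them, so face_signature below is a stand-in for a real face-embedding step, and the customer “database” is just a dict.

```python
# Sketch of the customer-lookup flow. face_signature is a placeholder:
# a real system would compute an embedding with a trained model.

def face_signature(face_pixels):
    # Placeholder "encoding" so the example is self-contained; a real
    # implementation would return an embedding vector, not a checksum.
    return sum(face_pixels) % 1000

def lookup_customer(face_pixels, database):
    # Return the record for a recognised face, or None so the agent
    # knows to capture the details of a new customer.
    return database.get(face_signature(face_pixels))

customers = {face_signature([10, 20, 30]): {"name": "J. Soap", "account": "ACC-001"}}

record = lookup_customer([10, 20, 30], customers)
print(record["name"] if record else "New customer - capture details")
```

The dict stands in for whatever CRM backend the service centre already runs; swapping it for a database query is the easy part.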

Maven for Android

This is a quick howto on setting up and using Maven for your Android projects. Maven Android integration is not yet excellent, but it is coming along nicely, and if you are familiar with Maven projects, it will make managing your dependencies a lot easier!

I will be working with Ubuntu, but your set up will be similar. Just adapt paths etc for your setup as you need.

Ubuntu ships with Maven 2, but we need at least Maven 3.0.5 in order to work with Android. I prefer to install Maven manually because you don’t need to stress about pinning and other such nonsense from a binary distro. I also usually install stuff in /opt/, so that is where we will be working from.

The first thing that you need to do is grab the Maven distribution archive. I used 3.2.1, but anything later than 3.0.5 should work OK:

wget http://archive.apache.org/dist/maven/maven-3/3.2.1/binaries/apache-maven-3.2.1-bin.tar.gz

Extract the archive and copy it to /opt/:

tar xzf apache-maven-3.2.1-bin.tar.gz
sudo cp -r apache-maven-3.2.1 /opt/

Great! First steps completed! You are doing well so far!
I am assuming that you have a semi-recent JDK installed, in our case we need JDK 6+. Check for your JDK version with

java -version

If all comes back OK, we are ready to proceed.

Get the path to your JDK now with

locate bin/java | grep jdk

and make a note of it. Mine is at

/opt/java7/jdk1.7.0_45
Edit your bashrc file (located at /etc/bash.bashrc on Ubuntu) and add the following parameters (modify according to your paths) to the end of the file:

export ANDROID_HOME=/opt/android-sdk-linux
export M3_HOME=/opt/apache-maven-3.2.1
export M3=$M3_HOME/bin
export PATH=$M3:$PATH
export JAVA_HOME=/opt/java7/jdk1.7.0_45
export PATH=$JAVA_HOME/bin:$PATH

Load up your new bashrc file with

source /etc/bash.bashrc

and check that everything is OK.
You should now be able to test your brand new Maven3 installation with

mvn -version

If that seems OK, you are ready to install the Android m2e connector in Eclipse. Please note that this works best in Eclipse Juno or later (I use Kepler).

Open up Eclipse, and choose to install software from the Eclipse Marketplace. This is found in Help -> Eclipse Marketplace. Do a search for “android m2e” and install the Android configurator for M2E 0.4.3 connector. It will go ahead and resolve some dependencies for you and install.

You should now be able to generate a new Android project in Eclipse with New Project -> Maven -> New Maven Project and, in the archetype selection, look in the Android catalogue (or filter on “android-quickstart”) and choose the android quickstart archetype.

If this fails, you can also generate a new project on the command line and simply import it to Eclipse.

mvn archetype:generate \
  -DarchetypeGroupId=de.akquinet.android.archetypes \
  -DarchetypeArtifactId=android-quickstart \
  -DarchetypeVersion=1.0.11 \
  -DgroupId=com.example \
  -DartifactId=my-android-app

(Replace com.example and my-android-app with your own groupId and artifactId.)
Once all of that is complete, dev carries on as usual. Remember that your dependencies now live in your pom.xml document, so check that out first and ensure that you have some basics in there (the snippet below is illustrative; check for current versions):

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<groupId>com.example</groupId>
	<artifactId>my-android-app</artifactId>
	<version>1.0-SNAPSHOT</version>
	<packaging>apk</packaging>

	<dependencies>
		<dependency>
			<groupId>com.google.android</groupId>
			<artifactId>android</artifactId>
			<version>4.1.1.4</version>
			<scope>provided</scope>
		</dependency>
		<!-- Androlog is a logging and reporting library for Android -->
		<dependency>
			<groupId>de.akquinet.android.androlog</groupId>
			<artifactId>androlog</artifactId>
			<version>1.0.5</version>
		</dependency>
		<dependency>
			<groupId>com.actionbarsherlock</groupId>
			<artifactId>actionbarsherlock</artifactId>
			<version>4.4.0</version>
			<type>apklib</type>
		</dependency>
		<dependency>
			<groupId>joda-time</groupId>
			<artifactId>joda-time</artifactId>
			<version>2.3</version>
		</dependency>
	</dependencies>
</project>
As you can see, I have included some other stuff, like ActionBarSherlock and JodaTime, as they are generally really useful, and it may save you some time to just copy the dependency information!

Have fun!

Using QEMU to emulate ARM devices

This post will show you how to set up a QEMU virtual device to play with your ARM code on an x86_64 host. It is quite simple, and you should be able to simply copy and paste into a terminal and get going relatively quickly.

As an example, we will be installing a Debian build (wheezy) into your VM.

First off, we need to install the QEMU packages. I use Ubuntu/Mint, so this post will be somewhat biased towards that.

Let’s start off getting the packages we need:

sudo apt-get install qemu-kvm
sudo apt-get install qemu-system-arm
sudo apt-get install qemu-utils

Now we can check that everything is installed OK and ready to go with:

qemu-system-arm -version

Make a directory to work with and then grab the installer kernel and initrd off your local Debian mirror. Remember, we need the ARM-based files.

mkdir ~/arm-emul
cd ~/arm-emul
wget http://ftp.debian.org/debian/dists/wheezy/main/installer-armel/current/images/versatile/netboot/vmlinuz-3.2.0-4-versatile
wget http://ftp.debian.org/debian/dists/wheezy/main/installer-armel/current/images/versatile/netboot/initrd.gz
Remember that depending on your board/device, you may want to check whether it supports armel (soft-float) or armhf (hard-float). As you can probably guess from the above filenames, we are working with armel. The two ABIs differ in how floating point is handled (and in efficiency), but if you don’t know which you have, you are probably using an armel device. It is also worth checking with your manufacturer if you haven’t built your device yourself, as armhf is a way better buy!

Let’s create a virtual HDD now to host the code/OS:

qemu-img create -f raw hda.img 8G

I like to create a drive as big as my device’s flash ROM. In this case, it is 8GB. Yours may vary.

Now, lets get the system up and running:

qemu-system-arm -m 256 -M versatilepb -kernel ~/arm-emul/vmlinuz-3.2.0-4-versatile -initrd ~/arm-emul/initrd.gz -hda ~/arm-emul/hda.img -append "root=/dev/ram"

This should drop you into the Debian installer. Do the installation and then close your VM.

Once complete, mount the image’s filesystem and copy the relevant files out. You need this step because Debian will not be able to install the bootloader, so you kind of have to do it manually.

mkdir mount

sudo losetup /dev/loop0 hda.img
sudo kpartx -a /dev/loop0
sudo mount /dev/mapper/loop0p1 mount

cp ~/arm-emul/mount/boot/initrd.img-3.2.0-4-versatile ~/arm-emul/
sudo umount ~/arm-emul/mount
sudo kpartx -d /dev/loop0
sudo losetup -d /dev/loop0

Now you can start up your brand new debian ARM VM with:

qemu-system-arm -M versatilepb -kernel ~/arm-emul/vmlinuz-3.2.0-4-versatile -initrd ~/arm-emul/initrd.img-3.2.0-4-versatile -hda ~/arm-emul/hda.img -append "root=/dev/sda1"

Great! Now off to make your custom OS and flash it to your board! Good luck!

iBeacons and Raspberry Pi

A while back, I came across this article on Radius Networks which is a set of very simple instructions to make your own iBeacon with a Raspberry Pi and a cheap bluetooth dongle.

These things should be ultra cheap to roll out, so I am expecting to see a lot of applications, such as the contextual apps that I have spoken about before, rolling out quite soon.

The possibilities of these things are huge for guerrilla marketing in malls and large spaces, especially food courts and bigger shops.

Imagine contextual ads tailored to your preferences; that should be pretty easy, actually.
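The contextual part usually hinges on proximity. iBeacons advertise a calibrated transmit power (the RSSI measured at one metre), and a common rough distance estimate uses the log-distance path-loss model. A quick sketch, assuming a free-space path-loss exponent of 2.0 (real malls will differ):

```python
def estimate_distance(rssi, tx_power=-59, path_loss_exponent=2.0):
    # Log-distance path-loss model: rssi = tx_power - 10*n*log10(d),
    # solved for d. tx_power is the calibrated RSSI at 1 metre.
    return 10 ** ((tx_power - rssi) / (10 * path_loss_exponent))

print(round(estimate_distance(-59), 1))  # reading equals the 1 m calibration -> 1.0
print(round(estimate_distance(-79), 1))  # 20 dB weaker -> 10.0
```

In practice you would smooth the RSSI over several advertisements before trusting any distance, since single readings bounce around a lot.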

Mark my words, these things will either be a complete flop (as regular bluetooth was) or huge (thanks to the iCrowd).

Time will tell!

An introduction to Apache Mesos

What is Apache Mesos, you ask? Well, from their web site, it is described as:

Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, MPI, Hypertable, Spark, and other applications on a dynamically shared pool of nodes.

What exactly does that mean? Well, to me, it means that I can deploy applications and certain frameworks into a cluster of resources (VMs) and it will look after my resource allocation for me. For example, if you have a particularly computationally intensive task, like a Hadoop Map/Reduce job that may require additional resources, it can allocate more to it temporarily so that your job finishes in the most efficient way possible.
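To make that resource-sharing idea concrete, here is a toy model of Mesos-style resource offers in Python. The names and logic are purely illustrative, not the Mesos API: the “master” offers each node’s free resources in turn, and the “framework” accepts the first offer that fits its task.

```python
# Toy model of Mesos-style resource offers (illustrative only).

cluster = {"node1": {"cpus": 4, "mem": 8192}, "node2": {"cpus": 2, "mem": 4096}}

def offer_resources(cluster, task_cpus, task_mem):
    # Offer each node's free resources; the "framework" accepts the
    # first offer large enough for its task and the master deducts it.
    for node, free in cluster.items():
        if free["cpus"] >= task_cpus and free["mem"] >= task_mem:
            free["cpus"] -= task_cpus
            free["mem"] -= task_mem
            return node
    return None  # no offer was big enough; the task waits

print(offer_resources(cluster, 3, 4096))  # fits on node1
print(offer_resources(cluster, 3, 4096))  # node1 now has 1 cpu left -> None
```

The real scheduler is far more sophisticated (fair sharing, revocation, framework-side filters), but the offer/accept loop is the core idea.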

Installing Mesos is not too difficult, but the docs are a little sparse. I set up a vanilla Ubuntu-12.04 LTS VM on my localhost to experiment. I will assume the same for you, so you may not need all the packages etc. but please do bear with me.

Disclaimers aside, let’s get cracking on the installation!

The first step will be to download the Apache Mesos tarball distribution. I used wget on my VM to grab it, but you should pick a close mirror from the Apache download page.

Next up, you will need to prepare your machine to compile and host Mesos. On Ubuntu, you need the following packages:

apt-get update && apt-get install python-dev libunwind7-dev libcppunit-dev openjdk-7-jdk autoconf autopoint libltdl-dev libtool autotools-dev make gawk g++ curl libcurl4-openssl-dev

Once that is complete, unpack your Mesos tarball with:

tar xzvf mesos-0.13.0.tar.gz

Change directory to the newly created mesos directory and run the configure script:

cd mesos-0.13.0
./configure

All the configure options should check out, but if not, make sure that you have all the relevant packages installed beforehand!
Next, in the same directory, compile the code:

make
Depending on how many resources you gave your VM, this could take a while, so go get some milk and cookies…

Once the make is done, you should check everything with

make check

This will run a bunch of unit tests and checks that will ensure that there are no surprises later.

After you have done this, you can also set up a small Mesos cluster and run a job on it as follows:
In your Mesos directory, use

bin/mesos-master.sh
to start the master server. Make a note of the IP and port that the master is running on, so that you can use the web based UI later on!

Open up a browser and point it to http://localhost:5050 (or http://yourIP:5050 if your VM is remote).

Go back to your VM’s terminal and type

bin/mesos-slave.sh --master=localhost:5050

and refresh your browser. You should now notice that a slave has been added to your cluster!

Run the C++ test framework (a sample that just runs five tasks on the cluster) using

src/test-framework --master=localhost:5050 

It should successfully exit after running five tasks.
You can also try the example Python or Java frameworks, with commands like the following:

src/examples/java/test-framework --master=localhost:5050
src/examples/python/test-framework --master=localhost:5050
If all of that is running OK, you have successfully completed the Mesos setup. Congratulations!

Follow @ApacheMesos on twitter as well as @DaveLester for more information and goodness!

Cloudera Hadoop and HBase example code

Earlier, I posted about connecting to Hadoop via a Java based client. I decided to try out Cloudera’s offering, where they provide a manager app as well as an easy way to set up Hadoop, in both an Enterprise (includes support) and a free version.

I downloaded the free version of the Cloudera Manager, and quickly set up a 4 node Hadoop cluster using their tools. I must say, that as far as easy to use goes, they have done an awesome job!

Once everything was up and running, I wanted to create a Java based remote client to talk to my shiny new cluster. This was pretty simple, once I had figured out that I needed to use the Cloudera Maven repositories, and which versions and combinations of packages to use.

I will save you the trouble and post the results here.

The versions in use are the latest CDH4 releases at the time of writing:

hadoop version
Hadoop 2.0.0-cdh4.4.0
Subversion file:///var/lib/jenkins/workspace/generic-package-ubuntu64-12-04/CDH4.4.0-Packaging-Hadoop-2013-09-03_18-48-35/hadoop-2.0.0+1475-1.cdh4.4.0.p0.23~precise/src/hadoop-common-project/hadoop-common -r c0eba6cd38c984557e96a16ccd7356b7de835e79
Compiled by jenkins on Tue Sep  3 19:33:54 PDT 2013
From source with checksum ac7e170aa709b3ace13dc5f775487180
This command was run using /usr/lib/hadoop/hadoop-common-2.0.0-cdh4.4.0.jar
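Since the whole trick is matching your client’s artifact versions to the cluster, here is a small sketch of pulling the version string out of that hadoop version output programmatically (the regex is an assumption based on the output format shown above):

```python
import re

hadoop_version_output = """hadoop version
Hadoop 2.0.0-cdh4.4.0
Compiled by jenkins on Tue Sep  3 19:33:54 PDT 2013"""

def cdh_version(output):
    # Grab the "Hadoop <version>" line; CDH builds embed the CDH release
    # in the version string itself (e.g. 2.0.0-cdh4.4.0).
    match = re.search(r"^Hadoop\s+(\S+)", output, re.MULTILINE)
    return match.group(1) if match else None

print(cdh_version(hadoop_version_output))  # 2.0.0-cdh4.4.0
```

Handy if you script your builds and want to fail fast when client and cluster versions drift apart.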

With this information, we now know which versions of the packages to use from the Cloudera Maven repository (the versions below match the cluster above; adjust to your own cluster):

<dependency>
	<groupId>org.apache.hadoop</groupId>
	<artifactId>hadoop-client</artifactId>
	<version>2.0.0-cdh4.4.0</version>
</dependency>
<dependency>
	<groupId>org.apache.hbase</groupId>
	<artifactId>hbase</artifactId>
	<version>0.94.6-cdh4.4.0</version>
</dependency>


I also make sure to add the Cloudera Maven repository in my pom.xml file:

<repositories>
	<repository>
		<id>cloudera</id>
		<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
	</repository>
</repositories>
That is pretty much the hard part. If you don’t need HBase, then leave it off; the “hadoop-client” artifact should do most of what you want.

Introducing Apache Whirr

I was a little disturbed to see that so few people had even heard of Apache Whirr, so I decided to write this as a bit of an introduction.

Apache Whirr is very basically a set of libraries to manage all of your cloud installations and set ups. It takes all the pain out of deploying clusters and apps to any one of the major cloud providers, including Rackspace and Amazon Elastic compute clouds.

The trick is that it provides a common API across all the platforms in a way that almost anyone can use. You may be thinking “Apache = Java”, but there are SDKs and APIs in a few languages, including Java, C++ and Python. Whirr started out as a set of BASH scripts to manage Hadoop clusters, but quickly became a bigger project, which we now know as Whirr.

To get started with Whirr, you will need to download it from a local Apache mirror. You could also grab the source and build it in Eclipse as per the instructions on the project site. I would suggest grabbing Whirr 0.8.2 (about 26MB).

You will also need Java 6 (or later, I use openjdk-7-jdk), an SSH client, and an account with either Rackspace or Amazon EC2.

I usually put stuff like this in my /opt/ directory, so once you have extracted the archive and ensured the dependencies are met, you can check everything is working with:

bin/whirr version

which should print out

Apache Whirr 0.8.2
jclouds 1.5.8

The next step is to set up your credentials. First off, copy the sample credentials file to your home directory, and then modify it to suit you.

mkdir -p ~/.whirr/
/opt/whirr-0.8.2/conf# cp credentials.sample ~/.whirr/credentials

I prefer using Rackspace (OK Rackspace, you may now send me gifts), so my config looks something like this:

whirr.provider=cloudservers-us
whirr.identity=<your Rackspace username>
whirr.credential=<your API key>
Now to define what you want to deploy. The canonical examples are Hadoop cluster and Mahout cluster, so here we will start with a Hadoop cluster and let you figure out the rest!
In your /home/ directory, create a properties file. It doesn’t really matter too much what you call it, but we will call it hadoop.properties.

As you would have seen from the config credentials file, properties files override the base config, so you can actually do quite a lot in userland there. Let’s set up for our Hadoop cluster now:

whirr.cluster-name=myhadoopcluster
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker
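The whirr.instance-templates syntax is a comma-separated list of “count role+role” groups. A quick sketch parsing it, just to show what Whirr is being asked to start (the parser is illustrative, not part of Whirr):

```python
def parse_instance_templates(spec):
    # "1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker"
    # -> [(1, ["hadoop-jobtracker", "hadoop-namenode"]), (1, [...])]
    groups = []
    for group in spec.split(","):
        count, roles = group.strip().split(" ", 1)
        groups.append((int(count), roles.split("+")))
    return groups

spec = "1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker"
for count, roles in parse_instance_templates(spec):
    print(count, roles)
```

So the template above asks for one master node carrying the jobtracker and namenode roles, and one worker carrying the datanode and tasktracker roles.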

You need to now generate an SSH keypair with

ssh-keygen -t rsa -P ''

Note: You should use only RSA SSH keys, since DSA keys are not accepted yet.

OK, so now comes the fun part – setting up our Hadoop cluster!

/opt/whirr-0.8.2/bin# ./whirr launch-cluster --config /home/paul/hadoop.properties

You should start seeing some output almost immediately that looks like

Bootstrapping cluster
Configuring template for bootstrap-hadoop-datanode_hadoop-tasktracker
Starting 1 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
Configuring template for bootstrap-hadoop-jobtracker_hadoop-namenode
Starting 1 node(s) with roles [hadoop-jobtracker, hadoop-namenode]
Starting 1 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
Starting 1 node(s) with roles [hadoop-jobtracker, hadoop-namenode]

if something goes wrong you will get something along the lines of

Unable to start the cluster. Terminating all nodes.
Finished running destroy phase scripts on all cluster instances
Destroying testhadoopcluster cluster
Cluster testhadoopcluster destroyed

in which case you will need to review all your settings and try again… Hint: Usually this error indicates some sort of connectivity issues.
Whirr is unable to connect over SSH to the machines, assumes the bootstrap process failed and tries to start new ones.

For security reasons, traffic from the network your client is running on is proxied through the master node of the cluster using an SSH tunnel (a SOCKS proxy on port 6666).
A script to launch the proxy is created when you launch the cluster, and may be found in ~/.whirr/. Run it as follows (in a new terminal window):

. ~/.whirr/myhadoopcluster/hadoop-proxy.sh

You will also need to configure your browser to use the proxy in order to view the pages served by your cluster. When you want to stop the proxy, just Ctrl-C it.

You can now run a map/reduce job on your shiny new cluster.
After you launch a cluster, a hadoop-site.xml file is created in the directory ~/.whirr/myhadoopcluster. You can use this to connect to the cluster by setting the HADOOP_CONF_DIR environment variable. (It is also possible to set the configuration file to use by passing it as a -conf option to Hadoop tools):

export HADOOP_CONF_DIR=~/.whirr/myhadoopcluster

You should now be able to browse HDFS:

hadoop fs -ls /

Note that the version of Hadoop installed locally should match the version installed on the cluster. You should also make sure that the HADOOP_HOME environment variable is set.

Here’s how you can run a MapReduce job:

hadoop fs -mkdir input 
hadoop fs -put $HADOOP_HOME/LICENSE.txt input 
hadoop jar $HADOOP_HOME/hadoop-*examples*.jar wordcount input output 
hadoop fs -cat output/part-* | head
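The wordcount job above is the canonical MapReduce example: a map step that emits each word and a reduce step that sums the counts. A minimal pure-Python equivalent of what the cluster is doing conceptually:

```python
from collections import Counter

def map_phase(lines):
    # Map: emit (word, 1) for every word in the input.
    for line in lines:
        for word in line.split():
            yield (word, 1)

def reduce_phase(pairs):
    # Reduce: sum the counts for each word.
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["the quick brown fox", "the lazy dog"]
print(reduce_phase(map_phase(lines)))
```

The cluster version does exactly this, except the map and reduce phases run in parallel across your nodes, with a shuffle in between to group pairs by word.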

Once you are done, you can then simply destroy your cluster with:

bin/whirr destroy-cluster --config /home/paul/hadoop.properties

Note of warning! This will destroy ALL data on your cluster!

Once your cluster is destroyed, don’t forget to kill your proxy too…

That is about it as an intro to Apache Whirr. Very easy to use and very powerful!

How to set up and deploy a GlusterFS distributed, replicating file system.

Reposted from old site – original date: Thursday 19 April 2012

OK, so you want a replicated, distributed file system. Sure thing. Enter GlusterFS. Take a look at the Gluster web site for more information.

For the purposes of this article, we will look at a 2 node replicated store and then mount the filesystem on a third node for use as a regular old filesystem.

1. Make sure that your 2 servers can talk to each other. If need be, modify your hosts file so that they do. I personally do not like using FQDNs for this as it slows things down, so I would normally use an IP address.

2. On each of your Ubuntu based nodes, do an apt-get update && apt-get install glusterfs-server

3. On server 1, do a gluster peer probe server2 where server2 is either a FQDN or an ip address, depending on how you roll.

4. Check everything is cool with a gluster peer status. You should now be seeing some information on the peer(s) like hostnames and UUID’s, states and stuff like that.

5. Now we need to create a gluster volume, which is essentially a big disc that you are going to store your junk on. This is also pretty simple. Do a gluster volume create MyVolume replica 2 transport tcp server1:/data server2:/data

Some notes:

replica 2 means replicate across 2 machines. This is for your failover integrity. It also needs to be at least 2.
MyVolume is the name of your virtual volume.
transport tcp means use tcp to talk to the servers in the cluster
serverx:/data is the path on the physical disc where the data will be stored.
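One consequence of replica 2 worth spelling out: usable capacity is roughly the total raw brick capacity divided by the replica count. A quick sketch with hypothetical brick sizes:

```python
def usable_capacity_gb(brick_sizes_gb, replica):
    # With N-way replication every file lands on `replica` bricks, so
    # usable space is roughly total raw space / replica.
    # (Assumes equally sized bricks, as replica sets expect.)
    return sum(brick_sizes_gb) / replica

print(usable_capacity_gb([1000, 1000], replica=2))  # two 1 TB bricks -> 1000.0 GB usable
```

So our 2-node replica-2 volume gives you the capacity of one brick, in exchange for surviving the loss of a node.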

6. Great! Ready for bigger and better things! Let’s start the volume: gluster volume start MyVolume

7. Check the status of the volume with gluster volume info. You should get some output telling you about your nodes as “Bricks”

8. Now is a good time to think about security and authentication. If you want to lock it down to a certain IP or host, you need to do a gluster volume set MyVolume auth.allow 192.168.1.100 (or whatever your IP address is). Remember the volume is open until you set auth.allow, so if you need to allow a bunch of clients, give it a comma-separated list of IPs.

9. About now, we have a working server cluster. Now to configure the client. On your client machine, apt-get install glusterfs-client

10. mkdir /mnt/myshared

11. mount -t glusterfs server1:/MyVolume /mnt/myshared

12. You are done. Seriously, that is it. Add your new mount to fstab if you like, copy some stuff to it, watch it replicate etc.


Some additional tips:

1. You can install the client on the server(s) too. If you want to mount the GlusterFS share locally (i.e. on the server), use localhost.

2. Replicas should be considered when creating larger clusters. Remember that if more than one node fails, you want enough replicas to preserve data integrity.

3. Leave other tips in the comments if you think of anything.

Upgrading Ubuntu to Natty Narwhal

Reposted from old site – original date: Friday 29 April 2011

So @cazpi decided to upgrade to Natty last night. Everything went smoothly until the restart where the whole system crashed on startup. This seemed to me like a graphics driver issue, so it was relatively simple to fix.

1. Restart the machine, holding down the shift key to get the GRUB menu.
2. Choose the failsafe kernel and select “Low graphics mode”
3. Boot into Ubuntu with no graphics acceleration.
4. Go to System -> Administration -> Additional Drivers and select the NVIDIA proprietary driver. It will say that it is not in use.
5. Remove the driver, then add it again. It will download and install a newer version.
6. Reboot into a prettier, faster and easier to use Ubuntu Unity desktop!

No other issues so far. Graphics card is a NVidia 4GB card.

Hope this helps someone else!