LOVE AND FIND YOUR MATCH: FACIAL RECOGNITION AND SIMILARITY

While waiting to be seated at a restaurant many years ago, my grandmother pointed to a young couple on a date. They were laughing too loudly at each other's jokes and acting a little awkward. It must have been a first or second date. "See those two? They will get married for sure," my grandmother said confidently. "How can you tell?" I asked. "Ah, that is easy. They have the same nose."

Many years later, while researching computer vision, I got to thinking that maybe there was a way to quantify what my grandmother said. Was there a means to describe how similar two people’s faces are in a single number? Are we attracted to people who look like us, or to people who look different? The first question is easier to answer than the second, but I’ll try to build a framework to help figure this out.

THE SOLUTION

Let’s start with the solution and then discuss how facial recognition works. I have made an application that uses facial recognition techniques to identify the faces in two images. The software then digitizes the faces and calculates how similar they are with a single metric. I used a JavaScript library created by Vincent Mühler that runs all facial recognition locally on your device.

Check out the love match app HERE.

As all calculations are run locally on your phone or laptop browser, it can take 10-20 seconds to process. The steps to use the app are as follows:

Step 1: Upload two images using a mobile camera, desktop webcam or a file. Each photo requires a single face.

For example, we could use our friends David and Victoria Beckham.

Step 2: On loading the second image, the app identifies the face in each picture and draws a box around it, together with a confidence score that the box contains a face. This score appears in red below the blue box, e.g. 0.99 means the model is 99% confident that the box contains a face.

Step 3: Digitize the faces, calculate the similarity between the two with a single metric, and give a similarity description from ‘very low’ to ‘very high.’

HOW DOES FACIAL RECOGNITION WORK?

Find the Face

Facial recognition is an application of machine learning that can be used to identify, verify or compare a person from a digital image.

In the early 2000s, facial recognition went mainstream with the work of Paul Viola and Michael Jones on Haar feature-based cascade classifiers. This technique was efficient enough to run on cheap cameras, which used it to draw little boxes around people’s faces and drive autofocus. An improvement on this method, the Histogram of Oriented Gradients (HOG), was developed in 2005.

HOG works by simplifying the image into a basic pattern in which a face can easily be found. It does this by counting occurrences of gradient orientations in localized portions of the image. Once you run a HOG algorithm over an image, you look for the region whose pattern is similar to the one below. Once you find the matching pattern, you can isolate the face with a bounding box.
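The gradient-counting idea can be sketched in a few lines of JavaScript (the language the app itself uses). This is a deliberately simplified toy: real HOG implementations use 8×8 pixel cells, 9 orientation bins and block normalization, while this version bins the gradients of one tiny patch into 4 bins.

```javascript
// Toy version of the core HOG idea: compute gradient orientations over a
// small grayscale patch and accumulate them into an orientation histogram.
function orientationHistogram(patch, bins = 4) {
  const h = patch.length, w = patch[0].length;
  const hist = new Array(bins).fill(0);
  for (let y = 1; y < h - 1; y++) {
    for (let x = 1; x < w - 1; x++) {
      const gx = patch[y][x + 1] - patch[y][x - 1]; // horizontal gradient
      const gy = patch[y + 1][x] - patch[y - 1][x]; // vertical gradient
      const mag = Math.hypot(gx, gy);
      if (mag === 0) continue;
      // Map the angle (0..180 degrees, unsigned) to a bin
      let angle = (Math.atan2(gy, gx) * 180) / Math.PI;
      if (angle < 0) angle += 180;
      const bin = Math.min(bins - 1, Math.floor(angle / (180 / bins)));
      hist[bin] += mag; // each pixel votes, weighted by gradient magnitude
    }
  }
  return hist;
}

// A patch that brightens from left to right has purely horizontal
// gradients, so all the votes land in the first (near 0 degrees) bin.
const patch = [
  [0, 30, 60, 90],
  [0, 30, 60, 90],
  [0, 30, 60, 90],
  [0, 30, 60, 90],
];
console.log(orientationHistogram(patch));
```

Comparing these histograms cell by cell against a reference face pattern is what lets the detector find the face region.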

 

Find the Face Landmarks

If a face is not looking directly at the camera, we can distort the image to look directly face-on once we have a series of reference points. These reference points are called face landmarks. There are 68 landmarks typically used on a person’s face, as per the below image. Once these points are known, we can rotate, scale and shear the image to look straight ahead, which improves our ability to encode the face.
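As a rough illustration (not face-api.js code), here is how just two landmark points, the eye centres, give the rotation needed to level a tilted face. A full alignment step would build an affine transform (rotation, scale and shear) from many landmarks; the coordinates below are invented for the example.

```javascript
// Work out how far a face is tilted from the line joining the eye centres.
// Rotating the image back by this angle levels the eyes.
function eyeRollAngle(leftEye, rightEye) {
  const dy = rightEye.y - leftEye.y;
  const dx = rightEye.x - leftEye.x;
  return (Math.atan2(dy, dx) * 180) / Math.PI; // degrees of tilt
}

// Hypothetical landmark coordinates: the right eye sits 20px lower than
// the left, so the face is tilted clockwise by about 14 degrees.
const tilt = eyeRollAngle({ x: 100, y: 120 }, { x: 180, y: 140 });
console.log(tilt.toFixed(1));
```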


Some Key Facial Landmark Points

The full set of 68 landmarks are shown below.

Source: Pyimagesearch

You can see what landmarks look like on a surprised Leslie Nielsen.


Encode the Face

The aim of facial recognition techniques is to enable a computer to interpret faces. Since computers don’t know what a face is, we need a process to encode one. We can do this with a convolutional neural network, which teaches a computer to derive 128 measurements of a face from the pixels in an image. This process is sometimes referred to as embedding and was introduced in 2015 by Google researchers.

After training on millions of images, the model generates a reliable way to simplify a face into 128 measurements, which form the output layer of the neural network. Because the computer works these measurements out itself through machine learning, we do not actually know what each of the 128 values means. The network could have been trained with any number of outputs, not just 128, but the current models use 128.

The hard part is training the neural network. However, once a reliable model has been trained, the model can be quickly deployed to encode a face.

One of the common models used in facial recognition is part of the dlib library, trained by Davis King using approximately 3 million faces from Labeled Faces in the Wild.

So what does a computer actually ‘see’ when it processes a face? If we take Jim Carrey’s face below, we get the 128 positive or negative decimal numbers that can be used to represent his face. Note that if we get another photo of Jim Carrey, the 128 numbers will be a little different, but similar enough that we can perform a calculation to show it is the same face. More on this calculation next.

 

Every face can be encoded into 128 numbers

HOW DO YOU TELL HOW SIMILAR FACES ARE?

Now that we have a way to convert a face into numbers, we can compare faces using the Euclidean distance, which measures the distance between two vectors. In our case, each face is a vector of 128 values.

However, raw distance is not intuitive for comparison purposes, so we convert it into a similarity: euclideanSimilarity = (1 − euclideanDistance) × 100. A similarity of 100% means the two encodings are identical, and 0% means they are completely different.

The scale of similarity versus the number of possible faces is not linear. There is an exponential relationship: the majority of results lie below 40% similarity, and only a very small number fall between 40% and 100%. In fact, in most facial recognition applications, a similarity between two faces of over 40% (i.e. a Euclidean distance below 0.6) is considered to be the same face.
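Putting the distance, the similarity conversion and the threshold together, a minimal JavaScript sketch might look like this (the 8-value vectors are made-up stand-ins for real 128-value encodings):

```javascript
// Euclidean distance between two encodings of equal length.
function euclideanDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

// Convert distance to the similarity percentage described above.
function similarityPercent(a, b) {
  return (1 - euclideanDistance(a, b)) * 100;
}

// Two encodings of the "same" face differ only slightly...
const photo1 = [0.12, -0.40, 0.05, 0.31, -0.22, 0.09, 0.17, -0.08];
const photo2 = [0.14, -0.38, 0.04, 0.30, -0.20, 0.10, 0.15, -0.09];
console.log(similarityPercent(photo1, photo2).toFixed(1));

// ...so the distance stays below the 0.6 match threshold.
console.log(euclideanDistance(photo1, photo2) < 0.6 ? "same face" : "different faces");
```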

WHAT ARE SOME OTHER APPLICATIONS OF FACIAL RECOGNITION?

Some use cases for facial recognition are:

  1. Identity Verification – Apple uses facial recognition to unlock phones in models after the iPhone X. Some banking applications use facial recognition as an additional security layer to prevent fraud
  2. Social Media – Facebook uses facial recognition when you upload photos to their website to tag faces
  3. Drowsy Driver – By counting the number of blinks per minute, facial recognition can help alert drivers to early signs of drowsiness
  4. Emotion Monitor – Facial expressions can be translated into emotions such as happy, surprised, angry, sad to tailor music or lighting etc
  5. Attendance Monitor – Workplaces or universities can use facial recognition to check attendance
  6. Criminal Identification – This is one of the more creepy applications. As long as a government agency has a photo of you, e.g. a driver's license or passport, then CCTV camera feeds can be analysed in real time to search for matches.
  7. Boarding Pass – Instead of using a train, cruise ship or a plane boarding pass, facial recognition could be used. This would improve the speed and customer experience of beginning a journey.
  8. Similarity Assessments – Used to identify how similar you are looking to a partner or a family member. This may be used depending on whether you subscribe to ‘opposites attract’ or the Freudian version of similarity.
  9. Micro Expression Identifier – The TV series ‘Lie to me’ may become a reality with applications to detect our engagement or to see if we are telling the truth
  10. Smart Doorbell – Facial recognition could be used to announce who is at your front door
  11. Retail Customer Service – Staff could tailor promotions to individuals and could be alerted of high value customers entering the store

WHAT I LEARNED FROM THIS PROCESS

Firstly, I learned that there are some very smart people working on facial recognition. Some of the key people I came across were:

  • Davis King – creator of dlib
  • Adam Geitgey – creator of the python library ‘face_recognition’
  • Adrian Rosebrock – creator of the python library ‘imutils’ and avid blogger with lots of great tutorials at pyimagesearch
  • Vincent Mühler – creator of the javascript library ‘face-api.js’ (which I used in this application)

Furthermore, I learned that once an image is converted into a 128-dimensional vector, it becomes very easy to compare a known face against a candidate image. This technique makes it efficient to set up real-time applications with an ordinary laptop or a basic PC with a webcam.
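A sketch of why this is efficient: identifying a face against a set of known encodings is just a linear scan for the smallest Euclidean distance. The names and the short 4-value vectors below are invented for illustration; real encodings have 128 values.

```javascript
// Euclidean distance between two encodings of equal length.
function euclideanDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

// Scan the known encodings for the closest match; reject anything
// beyond the usual 0.6 distance threshold as "unknown".
function bestMatch(candidate, known, threshold = 0.6) {
  let best = { name: "unknown", distance: Infinity };
  for (const { name, encoding } of known) {
    const d = euclideanDistance(candidate, encoding);
    if (d < best.distance) best = { name, distance: d };
  }
  return best.distance < threshold ? best.name : "unknown";
}

const known = [
  { name: "David", encoding: [0.1, -0.4, 0.3, 0.0] },
  { name: "Victoria", encoding: [-0.2, 0.5, -0.1, 0.4] },
];

// A new photo's encoding lands close to David's stored encoding.
console.log(bestMatch([0.12, -0.38, 0.29, 0.01], known));
```

Because each comparison is only 128 subtractions and squares, even a webcam feed can be matched against thousands of stored faces in real time.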

Lastly, I learned that most of the research is being done on facial identification, i.e. predicting as reliably as possible whether an image is person X or not. There is much less focus on how similar two different faces are. I hypothesize that we unconsciously look for partners with facial characteristics similar to our own. There may not be a high level of similarity, but there may be a zone of similarity that people unconsciously look for. You will have to let me know what you find from your own tests using the MATCH APP.
