AWS Rekognition vs Microsoft Cognitive Face API

We discussed the concepts of face detection and recognition in our last blog post.

Due to the complexity of the technology, most new services that use facial ID do not build the technology themselves but rather use an API provided by one of the big tech companies, such as Amazon or Microsoft.

Let's compare the performance of two such APIs: Amazon Rekognition and the Microsoft Cognitive Services Face API.

AWS Rekognition

OVERVIEW

Amazon Rekognition makes it easy to add image and video analysis to your applications. Amazon Rekognition provides highly accurate facial analysis, face comparison, and face search capabilities. You can detect, analyze, and compare faces for a wide variety of use cases.

CORE FEATURES

  • Object & scene analysis.
  • Face detection.
  • Face recognition (comparison/identification).
  • Face sentiment analysis.

Microsoft Face API

OVERVIEW

Image and video analysis on faces.

CORE FEATURES

  • Face detection.
  • Face verification.
  • Face identification (recognition).
  • Emotion detection.

Accuracy of Face API

Before discussing accuracy, let's define two terms:

Enrollment: The process of submitting images of individuals so that templates can be created and the individuals can later be recognized.

Probe Image: A probe image is an image submitted to a facial recognition system to be compared to enrolled individuals. Probe images are also converted to probe templates.

The accuracy of a face API is measured in terms of the following four outcomes:

True Positive or True Accept (TP)
The person in the probe image is enrolled and is correctly matched.
True Negative or True Reject (TN)
The person in the probe image is not enrolled and is correctly rejected (no match is returned).
False Positive or False Accept (FP)
Either the person in the probe image is not enrolled but is matched to an enrolled person, OR the person in the probe image is enrolled but is matched to the wrong person.
False Negative or False Reject (FN)
The person in the probe image is enrolled but is not matched.

Our goal is to minimize FP and FN and to maximize TP and TN.
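The four outcomes can be written down as a small decision function. This is only an illustrative sketch and is not part of either API: the names `classify_outcome`, `expected_id` and `matched_id` are hypothetical. `expected_id` is the enrolled identity of the person in the probe image (`None` if the person is not enrolled), and `matched_id` is the identity the system returned (`None` if it returned no match).

```python
from typing import Optional


def classify_outcome(expected_id: Optional[str], matched_id: Optional[str]) -> str:
    """Label one probe result as TP, TN, FP, or FN."""
    if matched_id is None:
        # No match returned: correct only if the person is not enrolled.
        return "TN" if expected_id is None else "FN"
    # A match was returned: correct only if it is the right enrolled person.
    # This covers both FP cases: not enrolled, or enrolled but wrong person.
    return "TP" if matched_id == expected_id else "FP"
```

Note that both kinds of false accept (an unenrolled person matched to someone, and an enrolled person matched to the wrong identity) fall into the same FP branch.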

Case Study:

We ran face verification on a random sample of images: for each pair of faces, the API returns a confidence score for the likelihood that both faces belong to the same person.

The use case we focused on was verification against a stored source image: one source image is available for each user, and at verification time we take a new picture of the user’s face in a similar setup and check whether the face belongs to the user in question.
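For Amazon Rekognition, a one-to-one check of this kind could be sketched with the `CompareFaces` operation, roughly as follows. This is a sketch under assumptions rather than the exact code used in our tests: `client` is assumed to be a Rekognition client such as the one returned by `boto3.client("rekognition")`, and the helper name `verify_face` and its default threshold of 80 are illustrative choices.

```python
from typing import Tuple


def verify_face(client, source_bytes: bytes, probe_bytes: bytes,
                threshold: float = 80.0) -> Tuple[bool, float]:
    """Return (is_match, best_similarity) for a source/probe image pair.

    `client` is assumed to be a Rekognition client, for example
    boto3.client("rekognition"). It is passed in as a parameter so the
    function can also be exercised with a stub object.
    """
    response = client.compare_faces(
        SourceImage={"Bytes": source_bytes},
        TargetImage={"Bytes": probe_bytes},
        SimilarityThreshold=threshold,
    )
    # FaceMatches only lists faces at or above SimilarityThreshold.
    similarities = [m["Similarity"] for m in response.get("FaceMatches", [])]
    best = max(similarities, default=0.0)
    return best >= threshold, best
```

Passing the client in, rather than creating it inside the function, keeps credentials and region configuration outside the verification logic.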

A FEW SAMPLES OF TRAINING IMAGES

(Above images have been downloaded from publicly available online datasets)

For each individual, we randomly selected 1 source image and used it to train the API. We then tested 2 positive examples (2 different images of the same user, expected output being that these match) and 2 negative examples (2 images of random other users, expected output being that these do not match the given user). We repeated this process for 5 different individuals.

That gives 5 × 4 = 20 tests for each of the 2 API providers.

Results:

While testing each of these APIs, we looked primarily at 2 values: (1) True Positive (TP): given a different photo of the same person, the API correctly says that the photo matches; (2) False Positive (FP): given a photo of a different user, the API incorrectly says that the photo matches.

Ideally one would want a 100% TP and 0% FP. For any serious application, any amount of FP is completely unacceptable since that can cause a serious security issue. Also, a low rate of TP would be really frustrating for users, since they might need to try multiple times to pass the authentication.

As stated above, these APIs return a ‘confidence’ value for each tested image, and the cutoff you choose determines when a tested image is considered a match. A higher cutoff can mean lower FP but can also mean lower TP. The exact ‘confidence’ values are not comparable across APIs, but the cutoffs we chose for each API reflect a similar strictness of decision.
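The effect of the cutoff can be illustrated with a short sketch. The scores below are made-up values chosen only to show the trade-off; they are not measurements from our tests, and `count_outcomes` is a hypothetical helper.

```python
def count_outcomes(same_person_scores, different_person_scores, cutoff):
    """Count TP, FN, FP, TN for a given cutoff on the confidence score."""
    tp = sum(score >= cutoff for score in same_person_scores)
    fn = len(same_person_scores) - tp
    fp = sum(score >= cutoff for score in different_person_scores)
    tn = len(different_person_scores) - fp
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Illustrative scores only (hypothetical, not results from this study).
same_person = [99.0, 96.5, 91.0, 88.0]
different_person = [85.0, 40.0, 12.0, 5.0]

for cutoff in (80, 90, 95):
    print(cutoff, count_outcomes(same_person, different_person, cutoff))
```

With these example scores, raising the cutoff from 80 to 90 removes the single false positive but also turns one true positive into a false negative, and raising it further to 95 costs another true positive.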

So we ran the tests with different cutoff values for the confidence score.

Similarity cutoff for AWS Rekognition: 80%; similarity cutoff for Microsoft: 50%

| Provider | True +ve (TP) | False +ve (FP) | True -ve (TN) | False -ve (FN) | Precision (TP/(TP+FP)) | Recall (TP/(TP+FN)) |
| --- | --- | --- | --- | --- | --- | --- |
| AWS Rekognition | 10 | 0 | 10 | 0 | 100% | 100% |
| Microsoft Cognitive Services | 9 | 0 | 10 | 1 | 100% | 90% |
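The precision and recall columns follow directly from the counts. A minimal sketch that recomputes them from the figures in the table above (the function names are our own, chosen for illustration):

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of reported matches that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of genuine matches that are found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Counts at the 80% (AWS) and 50% (Microsoft) cutoffs.
results = {
    "AWS Rekognition": {"TP": 10, "FP": 0, "TN": 10, "FN": 0},
    "Microsoft Cognitive Services": {"TP": 9, "FP": 0, "TN": 10, "FN": 1},
}

for provider, c in results.items():
    p = precision(c["TP"], c["FP"])
    r = recall(c["TP"], c["FN"])
    print(f"{provider}: precision={p:.0%}, recall={r:.0%}")
```

Precision stays at 100% for both providers because neither produced a false positive; the single false negative for Microsoft only lowers its recall.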

Below is an example of a training image and a test image that produced a false negative result for MS Cognitive.

Image 1 is the training image and Image 2 is the test image.

(Above images have been downloaded from publicly available online datasets)

Each of these APIs provides a way to create a collection/gallery for a given user, to which training images are added. The gallery is then trained, after which a new image can be tested against it to obtain the confidence value of a match. This confidence value indicates how likely it is that the test image matches the user in question.
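For Amazon Rekognition, this gallery workflow maps onto collections. The sketch below is a rough illustration under assumptions, not the exact code used in our tests: `client` is assumed to be a Rekognition client (for example from `boto3.client("rekognition")`), the collection is assumed to already exist (created with `create_collection`), and the helper names `enroll_user` and `matches_user` are hypothetical. Rekognition has no separate training call; a face becomes searchable once it is indexed.

```python
def enroll_user(client, collection_id: str, user_id: str, image_bytes: bytes) -> None:
    """Add one training image for `user_id` to an existing collection."""
    client.index_faces(
        CollectionId=collection_id,
        Image={"Bytes": image_bytes},
        ExternalImageId=user_id,  # lets us map a match back to the user
        MaxFaces=1,               # index only the most prominent face
    )


def matches_user(client, collection_id: str, user_id: str,
                 probe_bytes: bytes, threshold: float = 80.0) -> bool:
    """Check whether the probe image matches a face enrolled for `user_id`."""
    response = client.search_faces_by_image(
        CollectionId=collection_id,
        Image={"Bytes": probe_bytes},
        FaceMatchThreshold=threshold,
        MaxFaces=1,
    )
    # Each match carries the ExternalImageId stored at enrollment time.
    return any(
        match["Face"].get("ExternalImageId") == user_id
        for match in response.get("FaceMatches", [])
    )
```

Storing the user ID as `ExternalImageId` is one simple way to tie a match back to a user; a separate lookup table keyed by face ID would work as well.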

Similarity cutoff for AWS Rekognition: 95%; similarity cutoff for Microsoft: 70%

| Provider | True +ve (TP) | False +ve (FP) | True -ve (TN) | False -ve (FN) | Precision (TP/(TP+FP)) | Recall (TP/(TP+FN)) |
| --- | --- | --- | --- | --- | --- | --- |
| AWS Rekognition | 10 | 0 | 10 | 1 | 100% | 100% |
| Microsoft Cognitive Services | 5 | 0 | 10 | 5 | 100% | 50% |

FALSE NEGATIVE FOR AWS REKOGNITION

Train image and test image

(Above images have been downloaded from publicly available online datasets)

FALSE NEGATIVE FOR MICROSOFT COGNITIVE

etc… 

(Above images have been downloaded from publicly available online datasets)

COMPARISON

| Parameters | MS Cognitive | AWS Rekognition |
| --- | --- | --- |
| Similarity value | 57%-80% (same-person images) | 88%-99% (same-person images) |
| Dark images | Requires image preprocessing to detect low-resolution images | Easily able to detect without additional image preprocessing |
| Side face with 90-degree orientation | Unable to detect | Detectable with similarity value 88% |
| Sunglasses | Detectable | Detectable |
| Low-resolution images | Without image preprocessing: unable to detect; with image preprocessing: similarity value 57% | Able to detect with similarity value 99% |
| Grayscale images | Detectable | Detectable after conversion of image from grayscale to BGR |
| Image format | JPEG, PNG, GIF (the first frame), and BMP | JPEG, PNG |
| Image size | 1 KB-5 MB | Max 5 MB |
| Pixel size | A face must be no smaller than 36×36 pixels in an image of 1920×1080 pixels | A face must be no smaller than 40×40 pixels in an image of 1920×1080 pixels |
| Landscape view | False detection | True detection |
| Facial landmarks | 27 | More than 30 |

IMPORTANT FINDINGS

·   We noticed that AWS Rekognition identifies 30+ facial landmarks, which results in better accuracy of the similarity score in face comparison, whereas MS Cognitive identifies only 27 facial landmarks.

·   We verified that after training MS Cognitive with 3-4 sample images, the confidence score increases by 2-3% compared with using only 1 training image, whereas it remains consistent for the AWS Rekognition model.

·   We should add image preprocessing steps before the facial analysis, namely image normalization, gamma correction and image orientation correction (MS Cognitive), and grayscale-to-BGR conversion (AWS Rekognition).
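Two of these preprocessing steps, gamma correction and grayscale-to-BGR conversion, can be sketched as follows. To stay dependency-free, the sketch works on raw 8-bit pixel buffers (`bytes`); in practice an image library such as OpenCV or Pillow would normally be used. The function names and the convention that `gamma > 1` brightens the image are our own assumptions for this example.

```python
def gamma_correct(pixels: bytes, gamma: float) -> bytes:
    """Apply gamma correction to 8-bit pixel values via a lookup table.

    Convention assumed here: gamma > 1 brightens dark images and
    gamma == 1 leaves the pixels unchanged.
    """
    # Precompute the output value for each of the 256 possible inputs.
    lut = bytes(round(255 * (i / 255) ** (1.0 / gamma)) for i in range(256))
    return pixels.translate(lut)


def gray_to_bgr(gray: bytes) -> bytes:
    """Expand single-channel pixels to 3 channels by repeating each value."""
    return bytes(channel for value in gray for channel in (value, value, value))


# Example: brighten a dark grayscale buffer, then expand it to 3 channels.
dark_gray = bytes([10, 40, 64, 200])
bgr = gray_to_bgr(gamma_correct(dark_gray, gamma=2.0))
```

The lookup table means the power function is evaluated only 256 times, regardless of image size.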