Face recognition distances investigation

face_recognition is a Python package containing pre-trained deep neural networks that can be used to recognise and compare faces (that is, to provide a score reflecting their similarity). It uses a convolutional neural network to convert an image of a face to a set of 128 ‘encodings’: numbers that are similar for different images of the same face, but different for images of different faces. The Euclidean distance between two sets of encodings can be interpreted as a facial ‘distance’: a measure of how different the faces are.

Face recognition works by comparing these face distances with a cutoff: below the cutoff, the faces are taken as being the same, and above it, they are taken as being different. By varying the cutoff you can trade precision against recall. With a lower cutoff, you are less likely to identify pairs of images as being of the same face (lower recall), but you can have higher confidence where you do so (higher precision). Conversely, with a higher cutoff, you are more likely to identify pairs of images as being of the same face (higher recall), but you can have less confidence where you do so (lower precision). You can read more about this at this blog post by the author of the package.
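To make the distance and cutoff idea concrete, here is a minimal sketch in plain Python. It does not call the package: the function names and the toy 128-number encodings are made up for illustration, and treating a distance exactly equal to the cutoff as a match is an assumption. With real images, the package's own `face_distance` and `compare_faces` functions play these roles.

```python
import math


def face_distance(encoding_a, encoding_b):
    """Euclidean distance between two equal-length sets of encodings."""
    return math.dist(encoding_a, encoding_b)


def is_same_face(encoding_a, encoding_b, cutoff):
    """Treat the pair as the same face when the distance is at or below the cutoff.

    Counting a distance exactly equal to the cutoff as a match is an assumption.
    """
    return face_distance(encoding_a, encoding_b) <= cutoff


# Toy encodings, invented for illustration (not real model output).
anchor = [0.0] * 128
close = [0.03] * 128  # distance = 0.03 * sqrt(128), roughly 0.34
far = [0.08] * 128    # distance = 0.08 * sqrt(128), roughly 0.91

print(round(face_distance(anchor, close), 2))
print(is_same_face(anchor, close, cutoff=0.5))
print(is_same_face(anchor, far, cutoff=0.5))
```

Lowering `cutoff` in this sketch makes fewer pairs count as matches, which is the precision and recall trade-off described above.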

I compared all 81,875,206 pairs of 12,797 images (of 5,467 people), recording the face distance for each pair and whether the two images were of the same face or of different faces. The resulting distributions of face distances, for pairs of images of the same face and of different faces, are shown below:
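The pair count follows from the number of images: n images give n(n - 1)/2 unordered pairs. The sketch below checks that figure and shows one possible way to walk through every pair with a same-person label. The `(person_id, encoding)` layout and the `labelled_pairs` helper are hypothetical, chosen only for illustration, and are not taken from the code used for this investigation.

```python
import math
from itertools import combinations

n_images = 12797
n_pairs = math.comb(n_images, 2)  # n * (n - 1) / 2 unordered pairs
print(n_pairs)  # 81875206


def labelled_pairs(images):
    """Yield (index_a, index_b, same_person) for every unordered pair of images.

    `images` is a list of (person_id, encoding) tuples (a hypothetical layout).
    """
    for (i, (person_a, _)), (j, (person_b, _)) in combinations(enumerate(images), 2):
        yield i, j, person_a == person_b


# Three toy images: two of one person, one of another (encodings omitted).
toy_images = [("alice", None), ("alice", None), ("bob", None)]
pairs = list(labelled_pairs(toy_images))
print(pairs)
```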

[Graph: distributions of face distances for pairs of images of the same face and of different faces]

By collecting the face distances for all those combinations, comparing them with various cutoffs, and summing the correct and incorrect classifications at each of those cutoffs, I generated the following table of precision and recall at various face distance cutoffs:

cutoff	precision	recall
0.00	1.00	0.00
0.30	1.00	0.02
0.40	1.00	0.29
0.45	0.98	0.59
0.50	0.91	0.84
0.55	0.71	0.96
0.60	0.34	0.99
0.90	0.00	1.00
1.00	0.00	1.00
1.50	0.00	1.00
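
The calculation behind a table like the one above can be sketched as follows. This is an illustration on a handful of made-up distances, not the code used to produce the table. The function name is hypothetical, and two conventions are assumptions: a distance equal to the cutoff counts as a match, and precision is reported as 1.0 when no pair is classified as a match.

```python
def precision_recall(distances, same_face, cutoff):
    """Precision and recall when pairs at or below `cutoff` are called matches."""
    true_pos = false_pos = false_neg = 0
    for distance, same in zip(distances, same_face):
        predicted_same = distance <= cutoff
        if predicted_same and same:
            true_pos += 1
        elif predicted_same:
            false_pos += 1
        elif same:
            false_neg += 1
    # Assumed convention: with no predicted matches, precision is 1.0.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 1.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Made-up distances and labels, for illustration only.
distances = [0.20, 0.35, 0.45, 0.55, 0.50, 0.70, 0.90]
same_face = [True, True, True, True, False, False, False]

for cutoff in (0.0, 0.4, 0.6):
    precision, recall = precision_recall(distances, same_face, cutoff)
    print(f"{cutoff:.2f}\t{precision:.2f}\t{recall:.2f}")
```

On this toy data, raising the cutoff from 0.4 to 0.6 lifts recall from 0.50 to 1.00 while precision drops from 1.00 to 0.80, the same direction of trade-off as in the table.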

The code is available at GitHub.