This is a repository showcasing the use of Inception ResNet (V1), pretrained on VGGFace2, using the implementation from Tim Esler's GitHub repo.
It also includes an implementation of MTCNN for face detection, one of the fastest detectors available.
- Install:
```bash
# With pip:
pip install facenet-pytorch

# or clone this repo, removing the '-' to allow python imports:
git clone https://github.com/timesler/facenet-pytorch.git facenet_pytorch

# or use a docker container (see https://github.com/timesler/docker-jupyter-dl-gpu):
docker run -it --rm timesler/jupyter-dl-gpu pip install facenet-pytorch && ipython
```
- In Python, import facenet-pytorch and instantiate the models:
```python
from facenet_pytorch import MTCNN, InceptionResnetV1

# If required, create a face detection pipeline using MTCNN:
mtcnn = MTCNN(image_size=<image_size>, margin=<margin>)

# Create an inception resnet (in eval mode):
resnet = InceptionResnetV1(pretrained='vggface2').eval()
```
- Process an image:
```python
from PIL import Image

img = Image.open(<image path>)

# Get cropped and prewhitened image tensor
img_cropped = mtcnn(img, save_path=<optional save path>)

# Calculate embedding (unsqueeze to add batch dimension)
img_embedding = resnet(img_cropped.unsqueeze(0))

# Or, if using for VGGFace2 classification
resnet.classify = True
img_probs = resnet(img_cropped.unsqueeze(0))
```
See `help(MTCNN)` and `help(InceptionResnetV1)` for usage and implementation details.
This notebook demonstrates the use of the following packages:
- facenet-pytorch
- mtcnn
- sklearn
- albumentations
This notebook introduces a complete example pipeline covering datasets, dataloaders, basic data augmentation, training a classifier on top of the ResNet embeddings, and face tracking in video streams.
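The face tracking part is not shown in the snippets below, so here is a minimal sketch of the idea, not the notebook's exact code: read frames with OpenCV, run MTCNN's `detect` method per frame, and draw the returned bounding boxes. The video path is a placeholder.

```python
import cv2
import torch
from PIL import Image, ImageDraw
from facenet_pytorch import MTCNN

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
mtcnn = MTCNN(keep_all=True, device=device)

cap = cv2.VideoCapture('video.mp4')  # placeholder path
frames_tracked = []
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # OpenCV reads BGR frames; MTCNN expects RGB
    img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    boxes, probs = mtcnn.detect(img)
    draw = ImageDraw.Draw(img)
    if boxes is not None:
        for box in boxes:
            draw.rectangle(box.tolist(), outline=(0, 255, 0), width=3)
    frames_tracked.append(img)
cap.release()
```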
To run the example code in Google Colab, you need to prepare separate folders for the image datasets.
The project structure is shown below. When you download the project to your Google Drive, it will have the following path: /content/drive/My Drive/Colab Notebooks/facenet/
```
facenet
+-- facenet.ipynb
+-- data
|   +-- test_images
|   |   +-- person1
|   |   |   +-- 1.png
|   |   |   +-- 2.png
|   |   +-- person2
|   |       +-- 1.png
|   |       +-- 2.png
|   +-- train_images
|   |   +-- person1
|   |   |   +-- 1.png
|   |   |   +-- 2.png
|   |   +-- person2
|   |       +-- 1.png
|   |       +-- 2.png
|   +-- test_images_cropped
|   |   +-- person1
|   |   |   +-- 1.png
|   |   |   +-- 2.png
|   |   +-- person2
|   |       +-- 1.png
|   |       +-- 2.png
|   +-- train_images_cropped
|       +-- person1
|       |   +-- 1.png
|       |   +-- 2.png
|       +-- person2
|           +-- 1.png
|           +-- 2.png
```
Note: the `<images folder>_cropped` folders are generated automatically by the code. Images should be .png, .jpeg, or .jpg files; they are converted to RGB automatically.
Then, after preparing `test_images` and `train_images`, we can apply face detection using MTCNN and save the cropped faces in `<images folder>_cropped`.
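A minimal sketch of how this cropping step could look (not the notebook's exact code); it reuses the folder layout above, and the `image_size=160` and `margin=20` values are assumptions for illustration:

```python
import os
from facenet_pytorch import MTCNN
from torchvision import datasets
from torch.utils.data import DataLoader

mtcnn = MTCNN(image_size=160, margin=20)

# Map each source image path to its destination in the *_cropped folder
dataset = datasets.ImageFolder('data/train_images')
dataset.samples = [
    (path, path.replace('train_images', 'train_images_cropped'))
    for path, _ in dataset.samples
]
loader = DataLoader(dataset, batch_size=1, collate_fn=lambda batch: batch[0])

for img, save_path in loader:
    os.makedirs(os.path.dirname(save_path), exist_ok=True)
    # Detect, align, crop and save the face; returns None if no face is found
    mtcnn(img, save_path=save_path)
```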
With all of the above in place, the cropped images can be run through the Inception ResNet model to obtain embeddings or class probabilities. In our case, we compute embeddings and train an SVM classifier from sklearn on them (the best parameters were found with a grid search and saved in the `data` folder as `svm.sav`). To make the classifier more robust, some augmentations were applied (you can see them in the notebook).
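Below is a minimal sketch of this step under some assumptions: the cropped-image folders from above are used, the transform follows facenet-pytorch's `fixed_image_standardization` convention, and the SVM parameter grid is purely illustrative, not the grid used in the notebook.

```python
import numpy as np
import torch
from facenet_pytorch import InceptionResnetV1, fixed_image_standardization
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

resnet = InceptionResnetV1(pretrained='vggface2').eval()

# Keep pixel values in [0, 255] before the (x - 127.5) / 128 standardization
trans = transforms.Compose([np.float32, transforms.ToTensor(), fixed_image_standardization])
dataset = datasets.ImageFolder('data/train_images_cropped', transform=trans)
loader = DataLoader(dataset, batch_size=32)

# Compute one 512-d embedding per cropped face
embeds, labels = [], []
with torch.no_grad():
    for imgs, targets in loader:
        embeds.append(resnet(imgs).numpy())
        labels.extend(targets.tolist())
X, y = np.concatenate(embeds), np.array(labels)

# Illustrative parameter grid; the notebook's tuned SVM is stored as svm.sav
param_grid = {'C': [0.1, 1, 10, 100], 'kernel': ['linear', 'rbf']}
grid = GridSearchCV(SVC(probability=True), param_grid, cv=3)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```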
All of the image embeddings were saved in the `data` folder as `trainEmbeds.npz` and `testEmbeds.npz`.
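A small sketch of how such files can be written and read back with NumPy; the array names `embeds` and `labels` inside the archives are assumptions for illustration:

```python
import numpy as np

# Save the training embeddings and labels computed above
np.savez('data/trainEmbeds.npz', embeds=X, labels=y)

# Load them back later without re-running the ResNet
data = np.load('data/trainEmbeds.npz')
X_train, y_train = data['embeds'], data['labels']
```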
Here we can see some obstacles, such as mislabelled classes and the narrow scope of the model: the classifier predicts the most probable face among all known (trained) identities, so it cannot distinguish a known person from an unknown one (the three people on the right were not in the training dataset).
Original test dataset
Aligned images, preprocessed by the MTCNN detector.
During training we had 79-81 original images. After running them through the Inception ResNet model to obtain embeddings of size 512 each, we can observe:
Distances between embedding vectors
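A minimal sketch of how such a pairwise distance matrix can be computed from the saved embeddings (the `embeds` array name is an assumption, matching the saving sketch above):

```python
import numpy as np

data = np.load('data/trainEmbeds.npz')
X = data['embeds']  # shape (N, 512)

# (N, N) matrix of Euclidean distances between all pairs of embeddings
dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
print(np.round(dists[:5, :5], 2))
```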
Then we used t-distributed Stochastic Neighbor Embedding (t-SNE), a method that is especially good at visualizing high-dimensional data.
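A minimal sketch of the projection with scikit-learn's TSNE, again assuming the saved embeddings and labels from above; the perplexity value is illustrative:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

data = np.load('data/trainEmbeds.npz')
X, y = data['embeds'], data['labels']  # assumed array names, see the saving sketch above

# Project the 512-d embeddings down to 2-D for plotting
X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

plt.figure(figsize=(8, 6))
for label in np.unique(y):
    mask = y == label
    plt.scatter(X_2d[mask, 0], X_2d[mask, 1], s=15, label=str(label))
plt.legend()
plt.title('t-SNE projection of the 512-d face embeddings')
plt.show()
```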
As we can see in the resulting plot, some points (vectors) are hard to distinguish.
- Bolotov Heorgii - Initial work - Heorh
- Zikratyi Dmytro - Initial work - shooterdimon
- Moroz Denis - Initial work - HPMortys
- Trishchuk Denis - Initial work - krissayrose
This project is licensed under the MIT License - see the LICENSE file for details
- Tim Esler's facenet-pytorch repo: https://github.com/timesler/facenet-pytorch
- F. Schroff, D. Kalenichenko, J. Philbin. FaceNet: A Unified Embedding for Face Recognition and Clustering, arXiv:1503.03832, 2015.
- K. Zhang, Z. Zhang, Z. Li and Y. Qiao. Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks, IEEE Signal Processing Letters, 2016.