Ali Sharif Razavian, Hossein Azizpour, Josephine Sullivan, Stefan Carlsson (2014)
- Use off-the-shelf features learned by a CNN on ImageNet as input to a simple classifier for various recognition tasks
- Performance comparable to specifically engineered features across the tasks --> good generalization!
- But: data augmentation also done --> might play big role!
- Tasks: image classification (whether or not image contains an object/scene, without location), fine-grained classification (classify subclasses), attribute detection (detect parts of objects, includes location), visual instance retrieval (find images that contain the same object/scene in a database)
- Key takeaway: this is a baseline for all further recognition algorithms!