The Elephant in the Room
A demonstration of interesting failures of state-of-the-art object detectors, by Amir Rosenfeld.
Contact details can be found here: https://sites.google.com/view/amirrosenfeld/
Media mentions:
- Quanta Magazine: Machine Learning Confronts the Elephant in the Room
- The Register: AI image recognition systems can be tricked by copying and pasting random objects
- jiqizhixin.com: "The Elephant in the Room": Leaving Object Detectors Utterly Baffled
- Import AI: Fooling object recognition systems by adding more objects
- Twitter: “Fan Art”
Comments are currently closed.
I have one question about Fig. 3(d) in the paper. I saw that the same ROI in (d) includes some noise at its margin, but you said that the noise exists only outside the bounding box, and that this causes the network to misclassify. Is there a typo? Also, please correct me if I am wrong: I thought the output layer makes its decision relying only on the features within the ROI anchors. If the pixels within the same ROI stay unchanged, why does the network misclassify the label?
There’s no typo: the bounding box drawn on the image is the one generated by the detector. The bounding box used to determine the extent of the noise is the ground-truth bounding box of the object. This is why you see some noise inside the drawn bounding box.
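To make the two-box distinction concrete, here is a minimal sketch in Python. The image size and both sets of box coordinates are made-up values for illustration, not numbers from the paper. It builds a mask that marks noise everywhere outside the ground-truth box, then counts how many noisy pixels fall inside a slightly larger detected box.

```python
def noise_mask(height, width, gt_box):
    """Return a 2D list that is True wherever noise is applied,
    i.e. everywhere outside the ground-truth box."""
    x0, y0, x1, y1 = gt_box  # hypothetical convention: x1, y1 are exclusive
    return [[not (x0 <= x < x1 and y0 <= y < y1) for x in range(width)]
            for y in range(height)]


def noisy_pixels_inside(mask, box):
    """Count the noisy pixels that fall inside the given box."""
    x0, y0, x1, y1 = box
    return sum(mask[y][x] for y in range(y0, y1) for x in range(x0, x1))


# Hypothetical boxes: the detected box is one pixel looser on every side.
gt_box = (4, 4, 8, 8)
detected_box = (3, 3, 9, 9)

mask = noise_mask(12, 12, gt_box)
inside_gt = noisy_pixels_inside(mask, gt_box)
inside_detected = noisy_pixels_inside(mask, detected_box)
print(inside_gt, inside_detected)
```

Under these assumed coordinates the ground-truth box contains no noise, while the looser detected box picks up a ring of noisy pixels along its margin, which is the effect visible in the figure.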
Regarding the second question: indeed, only the features in the ROI selected by the network are used to make the final classification. However, these features are in fact affected by pixels lying outside the bounding box of the actual object, because the receptive fields of units within the ROI extend beyond it. Hence adding noise outside the actual object gives rise to different features being used for the final classification.