#52 – Adversarial Examples Beyond Security (Hadi Salman, MIT)

Performing reliably on unseen or shifting data distributions is a difficult challenge for modern vision systems: even slight corruptions or transformations of images are enough to slash the accuracy of state-of-the-art classifiers. When an adversary is allowed to modify an input image directly, models can be manipulated into predicting anything, even when there is no perceptible change; this is known as an adversarial example. Ideally, an adversarial example is a pair of images that humans consistently judge to be the same but that a machine classifies differently. Hadi Salman, a Ph.D. student at MIT (formerly at Uber and Microsoft Research), started thinking about how adversarial robustness could be leveraged beyond security.

He realised that the phenomenon of adversarial examples could actually be turned on its head: instead of breaking models, the same brittleness could be exploited to help them. Hadi used the sensitivity of neural networks to design unadversarial examples, or robust objects, which are objects designed specifically to be robustly recognized by neural networks.
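The core trick behind both directions can be sketched in a few lines. Below is a minimal, illustrative NumPy example (not Hadi's actual pipeline) using a toy linear softmax classifier: stepping the input against the gradient of the true-class confidence yields an FGSM-style adversarial perturbation, while stepping with the gradient, as in unadversarial examples, makes the input easier to recognize. The model, data, and step size here are all made up for illustration.

```python
import numpy as np

# Toy linear "classifier": logits = W @ x, with an analytic input gradient.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))   # 3 classes, 8 input features
x = rng.normal(size=8)        # a clean "image"

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def confidence(x, cls):
    """Probability the model assigns to class `cls` for input x."""
    return softmax(W @ x)[cls]

# Treat the model's own prediction as the label of interest.
true_cls = int(np.argmax(W @ x))

# Gradient of log p[true_cls] w.r.t. the input (softmax-linear case):
# d/dx (z_c - logsumexp(z)) = W[c] - p @ W.
p = softmax(W @ x)
grad = W[true_cls] - p @ W

eps = 0.1
x_adv = x - eps * np.sign(grad)    # adversarial: step against the class
x_unadv = x + eps * np.sign(grad)  # "unadversarial": step toward the class

print(confidence(x, true_cls), confidence(x_adv, true_cls),
      confidence(x_unadv, true_cls))
```

The adversarial step lowers the true-class confidence and the unadversarial step raises it; real unadversarial objects apply the same idea as an optimized patch or texture on a physical object, using the full network's gradients rather than a linear model.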

Introduction [00:00:00]
Main Introduction [00:11:38]
Hadi’s Introduction [00:14:43]
More robust models == transfer better [00:46:41]
Features not bugs paper [00:49:13]
Manifolds [00:55:51]
Robustness and Transferability [00:58:00]
Do non-robust features generalize worse than robust? [00:59:52]
The unreasonable predicament of entangled features [01:01:57]
We can only find adversarial examples in the vicinity [01:09:30]
Certifiability of models for robustness [01:13:55]
Carlini is coming for you! And we are screwed [01:23:21]
Distribution shift and corruptions are a bigger problem than adversarial examples [01:25:34]
All roads lead to generalization [01:26:47]
Unadversarial examples [01:27:26]

Pod version: https://anchor.fm/machinelearningstreettalk/episodes/52—Unadversarial-Examples-Hadi-Salman–MIT-e1015k2

Dr. Tim Scarfe
Dr. Yannic Kilcher
Sayak Paul (https://sayak.dev/ @RisingSayak)

Hadi Salman:


Tim’s Whimsical notes: https://whimsical.com/mlst-hadi-salman-23rd-april-un-adversarial-examples-WxNENWCQBr1zce5vNXNnAw

Adversarial Examples Are Not Bugs, They Are Features

Adversarial Robustness as a Prior for Learned Representations

Image Synthesis with a Single (Robust) Classifier

Unadversarial Examples: Designing Objects for Robust Vision

Do Adversarially Robust ImageNet Models Transfer Better?

A Convex Relaxation Barrier to Tight Robustness Verification of Neural Networks

Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers

Denoised Smoothing: A Provable Defense for Pretrained Classifiers

ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness
