#040 – Adversarial Examples (Dr. Nicholas Carlini, Dr. Wieland Brendel, Florian Tramèr)

Pod version: https://anchor.fm/machinelearningstreettalk/episodes/040—Adversarial-Examples-Dr–Nicholas-Carlini–Dr–Wieland-Brendel–Florian-Tramr-epo5qr

Adversarial examples have attracted significant attention in machine learning, but the reasons for their existence and pervasiveness remain unclear. There’s good reason to believe neural networks look at very different features than we would have expected. As articulated in the 2019 “features not bugs” paper, adversarial examples can be directly attributed to the presence of non-robust features: features derived from patterns in the data distribution that are highly predictive, yet brittle and incomprehensible to humans.

Adversarial examples don’t just affect deep learning models. A cottage industry has sprung up around Threat Modeling in AI and ML Systems and their dependencies. Joining us this evening are some of the leading researchers currently working on adversarial examples:

Florian Tramèr – A fifth-year PhD student in Computer Science at Stanford University
https://floriantramer.com/

Dr. Wieland Brendel – Machine Learning Researcher at the University of Tübingen & Co-Founder of layer7.ai
https://medium.com/@wielandbr

Dr. Nicholas Carlini – Research scientist at Google Brain working in that exciting space between machine learning and computer security.
https://nicholas.carlini.com/

We really hope you enjoy the conversation. Remember to subscribe!

Yannic Intro [00:00:00]
Tim Intro [00:04:07]
Threat Taxonomy [00:09:00]
Main show intro [00:11:30]
What’s wrong with Neural Networks? [00:14:52]
The role of memorization [00:19:51]
Anthropomorphization of models [00:22:42]
What’s the harm really, though? / Focusing on actual ML security risks [00:27:03]
Shortcut learning / OOD generalization [00:36:18]
Human generalization [00:40:11]
An existential problem in DL: getting the models to learn what we want? [00:41:39]
Defenses to adversarial examples [00:47:15]
What if we had all the data and the labels? Still problems? [00:54:28]
Defenses are easily broken [01:00:24]
Self deception in academia [01:06:46]
ML Security [01:28:15]

[1802.00420] Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
https://arxiv.org/abs/1802.00420

[2006.11440] Using Learning Dynamics to Explore the Role of Implicit Regularization in Adversarial Examples
https://arxiv.org/abs/2006.11440

[1905.02175] Adversarial Examples Are Not Bugs, They Are Features
https://arxiv.org/abs/1905.02175
http://gradientscience.org/adv/

[2004.07780] Shortcut Learning in Deep Neural Networks
https://arxiv.org/abs/2004.07780

[1811.12231] ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness
https://arxiv.org/abs/1811.12231

[1902.06705] On Evaluating Adversarial Robustness
https://arxiv.org/abs/1902.06705

[2012.07805] Extracting Training Data from Large Language Models
https://arxiv.org/abs/2012.07805

[1811.03194] AdVersarial: Perceptual Ad Blocking meets Adversarial Machine Learning
https://arxiv.org/abs/1811.03194

[2002.04599] Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations [ICML, 2020]
https://arxiv.org/abs/2002.04599

[2002.08347] On Adaptive Attacks to Adversarial Example Defenses [NeurIPS, 2020]
https://arxiv.org/abs/2002.08347

[Threat Modeling AI/ML Systems and Dependencies]
https://docs.microsoft.com/en-us/security/engineering/threat-modeling-aiml

#machinelearning #deeplearning
