Background

Deep models are often vulnerable to small adversarial perturbations that are unlikely to fool humans. [Figure: an image of a cat is misclassified as "sports car" after a small perturbation.]

Small-norm-bounded robustness (also known as local robustness): a model F is locally robust at x if, for all x′ with ‖x′ − x‖ ≤ ε, F(x′) = F(x).

Our goal: models trained on adversarial data (e.g., with PGD [1] or TRADES [2]) that remain robust on benign test data.

Problem

Robust training exhibits strong overfitting: the robust generalization gap is far larger than the standard generalization gap [Rice et al., 2020] [3]. Prior work asks: how do we alleviate overfitting in robust training?

Effect of TrH Regularization

The Trace of Hessian (TrH) is the sum of the curvatures (under an assumption of convexity) of the loss in all directions; a small TrH indicates a flat minimum. [Figure: adversarial training on TwoMoons with (a) no regularization (standard), (b) top-layer TrH regularization, and (c) full TrH regularization.]

Regularizing TrH for the top layer only also decreases TrH for the internal layers (greater detail in Theorem 4 and Example 1).

Computing the Trace of Hessian for the parameters of ALL layers is extremely expensive; computing it only for the top layer is very efficient. [Figure: hardware comparison — 8 TPUv4 chips vs. 2 RTX GPU chips.]

Results

[Bar charts comparing base, TrH (ours), SWA [5], and AWP [6]. Runtime (hours/epoch), measured when training on ImageNet: 2.6, 2.8, 3.9, 5.3. Robust accuracy — the percentage of test data whose predictions are both robust and accurate: 32.3, 34.2, 40.4, 48.1.]

Directly Minimizing the Robust Generalization Bound

Prior distribution of the network weights (independent of the data and the training algorithm): w ∼ 𝒫 = 𝒩(c, σ₀²·I).

Posterior distribution of the network weights (dependent on the data, the loss, and the training algorithm): w ∼ 𝒬, where 𝒬 is a product of univariate Gaussians, 𝒩(μ, Σ) with diagonal Σ; μ and Σ are unknown and optimizable.
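The local-robustness definition above can be probed empirically. The sketch below is illustrative only (the tiny linear classifier and the random-sampling check are assumptions, not the paper's method): it samples perturbations inside the ε-ball and reports whether the prediction ever changes. Passing this check is necessary, but not sufficient, for true local robustness.

```python
import numpy as np

def predict(W, x):
    # toy linear classifier: predicted class = argmax of logits W @ x
    return int(np.argmax(W @ x))

def empirically_robust(W, x, eps, n_samples=1000, seed=0):
    """Check F(x') == F(x) for random x' with ||x' - x||_2 <= eps."""
    rng = np.random.default_rng(seed)
    y = predict(W, x)
    for _ in range(n_samples):
        d = rng.normal(size=x.shape)
        d *= eps * rng.uniform() / np.linalg.norm(d)  # random point in the eps-ball
        if predict(W, x + d) != y:
            return False
    return True

W = np.array([[2.0, 0.0], [0.0, 1.0]])
x = np.array([1.0, 0.1])                  # logits (2.0, 0.1): class 0 with a wide margin
print(empirically_robust(W, x, eps=0.1))  # small ball: prediction cannot flip
print(empirically_robust(W, x, eps=5.0))  # huge ball: easily flipped
```

For the small ε the flip condition |2d₁ − d₂| > 1.9 is impossible inside the 0.1-ball, so the check holds by the margin argument, not just by sampling luck.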
PAC-Bayesian Bound

We use the PAC (Probably Approximately Correct)-Bayesian theorem to upper-bound the robust test loss, and we directly minimize that upper bound to derive an objective for training.

Linear-Form PAC-Bayesian Bound [4]: for any β > 0, the following inequality holds with probability at least 1 − δ:

    𝔼_{w∼𝒬} L(w) ≤ 𝔼_{w∼𝒬} L̂(w) + (1/β)·KL(𝒬‖𝒫) + C(δ, β, m),

where L(w) is the robust loss over the test distribution evaluated when the model weight vector is w; L̂(w) is the empirical version of L on the training set; C(δ, β, m) is a term independent of 𝒬; m is the size of the training set; and β is a hyper-parameter, often chosen proportional to m.

Idea Overview: minimize the right-hand side of this bound with respect to the posterior 𝒬.
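To make the KL term of the bound concrete, here is a small sketch (all names and numeric values are illustrative, not from the paper) that computes KL(𝒬‖𝒫) for a diagonal-Gaussian posterior 𝒬 = 𝒩(μ, diag(σ²)) against the isotropic prior 𝒫 = 𝒩(c, σ₀²·I), and then evaluates the bound's right-hand side for a given empirical loss, β, and C.

```python
import numpy as np

def kl_diag_gauss_vs_iso(mu, sigma2, c, sigma0_2):
    """KL( N(mu, diag(sigma2)) || N(c, sigma0_2 * I) ), closed form for Gaussians."""
    return 0.5 * np.sum(
        sigma2 / sigma0_2
        + (mu - c) ** 2 / sigma0_2
        - 1.0
        + np.log(sigma0_2 / sigma2)
    )

def pac_bayes_rhs(emp_loss, kl, beta, C):
    """Right-hand side of the linear-form bound: E L_hat + KL/beta + C."""
    return emp_loss + kl / beta + C

mu = np.array([0.3, -0.2]); sigma2 = np.array([0.05, 0.05])
c = np.zeros(2); sigma0_2 = 0.05          # posterior variance equals the prior's,
kl = kl_diag_gauss_vs_iso(mu, sigma2, c, sigma0_2)  # so only the mean-shift term is nonzero
print(kl)                                  # 0.5 * (0.09 + 0.04) / 0.05 ≈ 1.3
print(pac_bayes_rhs(0.42, kl, beta=100.0, C=0.01))  # ≈ 0.443
```

Note how a larger β (e.g., proportional to m) shrinks the KL penalty, letting the posterior move further from the prior on larger training sets.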
Direct Minimization of the PAC-Bayesian Bound

Highlight 1: We minimize the right-hand side of the PAC-Bayesian bound above with respect to the posterior distribution 𝒬, under our assumptions (details in the proof of Theorem 3 in the paper), and land on the following result:

    min_𝒬 𝔼_{w∼𝒬} L(w) ≤ min_𝒬 { 𝔼_{w∼𝒬} L̂(w) + (1/β)·KL(𝒬‖𝒫) } + C(δ, β, m)
                        = min_μ { L̂(μ) + ‖μ − c‖² / (2βσ₀²) + (σ₀²/2)·Tr(∇²_μ L̂(μ)) } + C(δ, β, m) + O(σ₀⁴).

Here Tr(∇²_μ L̂(μ)) is the Trace of Hessian (TrH) of the training loss with respect to the weights of the model (computed for the top layer with our Propositions 1 & 2).

Highlight 2: The minimized bound simplifies the optimization problem from one over the space of probability density functions (i.e., min_𝒬) to one over the weights (i.e., min_μ).

Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization
Zifan Wang, Nan Ding, Tomer Levinboim, Xi Chen, Radu Soricut
zifan@safe.ai | dingnan@google.com

Reference

[1] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In ICLR, 2018.
[2] Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric Xing, Laurent El Ghaoui, and Michael Jordan. Theoretically principled trade-off between robustness and accuracy. In ICML, 2019.
[3] Leslie Rice, Eric Wong, and J. Zico Kolter. Overfitting in adversarially robust deep learning. In ICML, 2020.
[4] Nan Ding, Xi Chen, Tomer Levinboim, Beer Changpinyo, and Radu Soricut. PACTran: PAC-Bayesian metrics for estimating the transferability of pretrained models to classification tasks. In ECCV, 2022.
[5] Sven Gowal, Sylvestre-Alvise Rebuffi, Olivia Wiles, Florian Stimberg, Dan Andrei Calian, and Timothy Mann. Improving robustness using generated data. In NeurIPS, 2021.
[6] Dongxian Wu, Shu-Tao Xia, and Yisen Wang. Adversarial weight perturbation helps robust generalization. In NeurIPS, 2020.

We provide a PAC-Bayesian upper bound over the robust test loss and show how to directly minimize it.
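One concrete instance of a cheap top-layer TrH (illustrative; this is a standard fact about softmax cross-entropy with a linear top layer, shown here in the spirit of the paper's top-layer propositions rather than as their exact statement): for logits z = W·h, the Hessian of the loss with respect to W is (diag(p) − p·pᵀ) ⊗ h·hᵀ with p = softmax(z), so its trace has the closed form (1 − ‖p‖²)·‖h‖² — no per-parameter second derivatives needed. The sketch below verifies this against a finite-difference estimate.

```python
import numpy as np

def ce_loss(W, h, y):
    # stabilized softmax cross-entropy for logits z = W @ h and label y
    z = W @ h
    z = z - z.max()
    logp = z - np.log(np.exp(z).sum())
    return -logp[y]

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4)); h = rng.normal(size=4); y = 1

# Closed form: Tr of the Hessian w.r.t. W is (1 - ||p||^2) * ||h||^2
z = W @ h; p = np.exp(z - z.max()); p /= p.sum()
trh_closed = (1.0 - p @ p) * (h @ h)

# Finite-difference check: sum second derivatives over every entry of W
eps = 1e-4
trh_num = 0.0
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        E = np.zeros_like(W); E[i, j] = eps
        trh_num += (ce_loss(W + E, h, y) - 2 * ce_loss(W, h, y)
                    + ce_loss(W - E, h, y)) / eps ** 2

print(abs(trh_closed - trh_num) < 1e-4)  # the two estimates agree
```

The closed form costs one forward pass, while the full-network TrH has no such shortcut — which is exactly the efficiency gap between top-layer and all-layer regularization noted above.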
The minimized bound can be used during training to encourage the generalization of robustness. It includes a Trace-of-Hessian (TrH) term that encourages flatness of the loss. We restrict the TrH regularization to the top layer only and empirically show that it is also effective at decreasing TrH for the internal layers. Our TrH regularization term improves the state-of-the-art robust accuracy of Vision Transformers on major vision datasets (e.g., CIFAR-10, CIFAR-100, and ImageNet) and is much more efficient than existing methods.
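The shape of the resulting objective can be sketched on a toy model. The sketch below is a simplification, not the paper's implementation: it uses plain cross-entropy instead of a robust (adversarial) loss, a hypothetical regularization weight `lam`, numerical gradients for brevity, and the top-layer closed form TrH = (1 − ‖p‖²)·‖h‖² for a linear layer.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def objective(W, h, y, lam):
    """Toy TrH-regularized loss: cross-entropy + lam * top-layer TrH.
    lam is a hypothetical hyperparameter; the paper trains with a robust loss."""
    p = softmax(W @ h)
    ce = -np.log(p[y])
    trh = (1.0 - p @ p) * (h @ h)   # closed-form top-layer TrH
    return ce + lam * trh

def num_grad(f, W, eps=1e-6):
    # central-difference gradient, entry by entry (fine for a toy example)
    g = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        E = np.zeros_like(W); E[idx] = eps
        g[idx] = (f(W + E) - f(W - E)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4)); h = rng.normal(size=4); y = 2; lam = 0.1

f = lambda W: objective(W, h, y, lam)
start = f(W)
for _ in range(50):                 # plain gradient descent on the regularized objective
    W = W - 0.1 * num_grad(f, W)
print(f(W) < start)                 # the regularized loss decreased
```

Swapping the cross-entropy for a PGD- or TRADES-style robust loss in `objective` recovers the structure of the full training objective: robust loss plus a top-layer TrH penalty.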