Machine learning (ML) techniques such as Support Vector Machines, Random Forests and Neural Networks are being applied with great success to a wide range of complex and sometimes safety-critical tasks. Recent research in “adversarial machine learning” has shown the fragility of classifiers based on statistical ML techniques with respect to targeted attacks aimed at inducing incorrect classifications. The goal of this PhD project is to develop an analysis framework to guarantee robustness and safety of ML systems. The framework will be based on Abstract Interpretation (AbstInt), a powerful and flexible static analysis technique which subsumes type systems, program logics and model checking, and which has proved very successful in program verification. The first target of verification will be deep neural networks, due to their popularity and complexity, with a view to extend the verification to different models at a later stage.
The application of AbstInt to the verification of NNs is at its infancy, focussing on fully-connected or convolutional architectures, and using well-established numerical domains (logical formulae constraining regions of the parameters space). This PhD project will develop a more general approach able to accommodate arbitrary network architectures, and will design novel abstract domains tailored to the goal of guaranteeing safety and robustness against adversaries. Present techniques focus on local abstract transformers, leveraging the key AbstInt idea of abstracting sets of values (in this case network parameters) into “abstract elements” easier to manipulate, yet still representative of original behaviours. Substantial work remains to be done investigating ways to encompass”full program” verification techniques from AbstInt, abstracting the whole learning algorithm including the training phase.
A background in computer science, with knowledge of formal methods is essential for the project. Familiarity with abstract interpretation desirable.