Katie Hawkins | Centres for Doctoral Training

Katie Hawkins

katie.hawkins@bristol.ac.uk

Year 4 Student – 2020 Intake – Cohort 2

I am a PhD student at the University of Bristol, in the Centre for Doctoral Training in Cyber Security. My research project, supervised by Dr Sana Belguith, contributes to the study of data deletion in machine learning (ML) models. In the current stage of the project, I am seeking to contextualise the problems of enforcing GDPR’s Art.17 (Right to Erasure) within supervised ML models, through an assessment of a specific use case involving personal information in the training dataset. This will support a better understanding of what could be (or should be) expected when building technical solutions for data deletion, both within the model’s training set and the ML model itself. My project’s intended outcome is to propose a data deletion solution that addresses such problems and to characterise the trade-offs between metrics including accuracy, run-time and the number of deletions.

PhD Project

Data Deletion in Machine Learning

The overfitting of supervised machine learning models can result in a model that learns the training data too well. As a consequence, an attacker may be able to infer membership or private attributes of the training data, so that the ML model and its output become indirect stores of that data. My research project attempts to address this vulnerability by splitting the interdisciplinary problem space into three phases: Phase 1: Legal and Research Investigation; Phase 2: Proposing a Framework for Deletion; and Phase 3: Development and Evaluation.

The first phase seeks to contextualise the problem of enforcing GDPR’s Art.17 (Right to Erasure) within ML. Through a collaborative study with Bristol Law School, we consider a specific use case involving personal information in the training dataset.

The second phase involves a critical analysis of the state-of-the-art in ML data deletion techniques, as well as an evaluation of other methods, including anonymisation and machine unlearning.

The key objective of these phases is to reach a formal understanding of what could be (or should be) expected when building technical solutions for data deletion, both within the model’s training set and the ML model itself. This will inform the final phase, in which I seek to develop an ML data deletion solution that considers the obligations of regulation and addresses the gaps within the state-of-the-art.

Supervisors: Dr Sana Belguith (Bristol), Dr Ryan McConville (Bristol)

PhD Poster

View poster here