Kal Roberts

General Profile:

Thus far I have had a long and varied experience with the University of Bristol where, amongst other things, I have studied Mathematics at undergraduate and master's level, worked for Residential Life Services as a Senior Resident, and taken part in the Bristol Interns in China Programme in partnership with CRCC and the British Council. Through that programme I worked alongside software developers in Shenzhen, China, researching Computerized Adaptive Testing algorithms and distributed ledger technology.

Originally my interests were primarily in calculus, optimisation (both deterministic and stochastic) and topology, but from my third year onwards I reorientated my learning towards probability and statistical computing. This led to my dissertation, "Bayesian Modelling of Epidemic Processes", which combined causal inference and counterfactual analysis using Google's CausalImpact and Facebook's Prophet.

This eventually culminated in my interest in Artificial Intelligence and Machine Learning. Now, as part of the CDT IAI, I am looking forward to expanding my knowledge and applying my skills to new and interesting problems.

Research Project Summary:

Multi-Agent Reinforcement Learning has become a critical tool for addressing challenges such as coordinating robotic swarms, autonomous (self-driving) cars and cyber-security. Real-world application and adoption, however, are often made infeasible by an inherent problem of Reinforcement Learning: scalability. This is only exacerbated by the extension into the multi-agent domain, with the state and action spaces both increasing exponentially with the number of agents.
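
To make the scaling concrete, here is a minimal illustration (the agent and action counts are arbitrary example values): with n agents each choosing from k individual actions, the joint action space contains k^n combinations.

    # Joint action space growth: n agents, each with k individual actions,
    # yield k**n joint actions.
    k = 5  # actions per agent (arbitrary example value)
    for n in range(1, 7):
        print(f"{n} agents -> {k**n:>10,} joint actions")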

Methods such as Hierarchical Reinforcement Learning focus on decomposing complex tasks into subtasks, grouping sequences of lower-order actions into higher-order ones in order to learn policies at multiple levels of granularity and effectively decrease the size of the action space. At the same time, sample-efficient exploration and Goal-Orientated Reinforcement Learning seek to partition the state space and focus the agent's attention on under-explored states of interest, or on states which achieve a specific goal. Methods and techniques such as these allow complex behaviour to develop and autonomous agents to be applied to domains beyond benchmark environments.
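
As a rough sketch of the goal-orientated idea (the function names and the count-based selection rule here are illustrative assumptions, not this project's method), an agent can keep visit counts over states and preferentially sample under-explored states as exploration goals:

    import random
    from collections import Counter

    def select_goal(visit_counts, candidate_states, temperature=1.0):
        # Weight each candidate state inversely by (1 + visit count),
        # so rarely visited states are the most likely goals.
        weights = [1.0 / (1.0 + visit_counts[s]) ** temperature
                   for s in candidate_states]
        return random.choices(candidate_states, weights=weights)[0]

    visits = Counter({"room_A": 50, "room_B": 3, "room_C": 0})
    goal = select_goal(visits, ["room_A", "room_B", "room_C"])  # "room_C" most likely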

This project focuses on the latter of these two approaches. It looks to move beyond the noise-based techniques commonly found in Multi-Agent Reinforcement Learning and devise a probabilistic approach to committed exploration that can be used both in fully automated systems and in Human-Autonomous teams. A follow-up aim is to ensure that any method produced by the project remains computationally tractable and can be run on smaller distributed services, without total reliance on large, centralised training centres.
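
To illustrate the distinction the project draws (a minimal sketch under assumed interfaces; none of these names come from the project itself): noise-based exploration re-randomises at every step, whereas committed exploration samples an exploration target once and conditions behaviour on it over an extended horizon.

    import random

    def epsilon_greedy(q_values, epsilon=0.1):
        # Noise-based: an independent random perturbation at every step.
        if random.random() < epsilon:
            return random.randrange(len(q_values))
        return max(range(len(q_values)), key=lambda a: q_values[a])

    def committed_rollout(policy, goals, horizon=20):
        # Committed: sample one exploration goal and condition the policy
        # on it for the whole horizon instead of re-randomising each step.
        goal = random.choice(goals)
        return [policy(step, goal) for step in range(horizon)]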

Through this, the hope is to answer whether it is possible to create cheap-to-train, easily scalable cooperative agents that explore together to overcome obstacles and challenges which would otherwise obstruct single or uncoordinated agents.

This research would form part of the necessary stepping stones towards teams of agents operating in complex and dynamic domains. Optimistic examples of such systems include surgeries in which human agents are assisted by robotic/autonomous systems, and autonomous reconnaissance drones able to locate and feed back mission-critical information to human operators in warzones.

This project falls within the EPSRC Artificial Intelligence and Robotics thematic research area, with the optimal control aspects of Reinforcement Learning falling within the Manufacturing the Future theme and Control Engineering.

 

Supervisors:

Website:
