Projects

Network Traffic Generation Tool

Published: (Paper One) (Paper Two) (Paper Three) (GitHub) (Documentation)

Data related to communication networks is often sensitive and proprietary. Consequently, many networking papers are published without open access to the network traffic data used to obtain their results, and when datasets are released they are often too limited for data-hungry applications such as reinforcement learning. In an effort to aid reproducibility, some authors instead release characteristic distributions which broadly describe the underlying data. However, these distributions are often not described analytically and may not fall under the classic ‘named’ distributions (Gaussian, log-normal, Pareto, etc.). As a result, other researchers find themselves resorting to unrealistically simple uniform traffic distributions or to bespoke distributions that are difficult to benchmark against.

This project saw the development of an open-access network traffic generation tool for (1) standardising the traffic patterns used to benchmark networking systems, and (2) enabling rapid and easy replication of literature distributions even in the absence of raw open-access data.
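The underlying idea can be sketched in a few lines: a characteristic distribution lifted from a paper’s figures is discretised into value-probability pairs and sampled directly, with no need for raw traces or a closed-form ‘named’ distribution. The flow sizes and probabilities below are illustrative placeholders, not values from the tool or from any paper.

```python
import random

# Hypothetical characteristic distribution read off a paper's published
# figure: flow sizes (bytes) and their probabilities. These numbers are
# illustrative placeholders only.
flow_sizes = [64, 512, 4096, 32768, 1048576]
probabilities = [0.50, 0.25, 0.15, 0.07, 0.03]

def sample_flow_sizes(n, sizes, probs, seed=None):
    """Draw n flow sizes from a discretised characteristic distribution."""
    rng = random.Random(seed)
    return rng.choices(sizes, weights=probs, k=n)

print(sample_flow_sizes(10, flow_sizes, probabilities, seed=0))
```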

Reinforcement Learning for Combinatorial Optimisation

Published: (Paper One) (Paper Two) (GitHub)

Optimisation problems are search problems in which a solution that maximises some objective is sought within a search space. Combinatorial optimisation (CO) is the optimisation sub-category in which the solution sought is a discrete variable (e.g. an integer, a graph, a set, etc.) within a finite (or countably infinite) space of possible solutions. Many real-world problems fall under the broad category of CO, from network routing and scheduling to protein folding and fundamental science. However, with many CO problems being NP-hard, solving non-trivial instance sizes in reasonable time frames is a significant challenge. Although CO solvers were studied and designed extensively in the latter half of the 20th century, recent years have seen a resurgence in their academic study with the application of machine learning to solving CO problems.

This work saw the application of graph neural networks and reinforcement learning to learn to solve graph-based combinatorial optimisation problems from scratch. This was done through the design of two new machine learning algorithms: the first achieved state-of-the-art scalability for learned heuristic solutions, and the second enabled the integration of reinforcement learning into exact branch-and-bound solvers. These are important steps towards establishing machine learning as the go-to approach for solving CO problems, which will unlock advances in a plethora of real-world applications.
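As a concrete illustration of framing a graph CO problem as a sequential decision process (a generic textbook formulation, not the specific algorithms from the papers), consider Max-Cut: each action flips one node between the two partitions, and the reward is the resulting change in cut value. A learned GNN policy would replace the random action choice below.

```python
import random

class MaxCutEnv:
    """Toy MDP for Max-Cut: an action flips one node between partitions;
    the reward is the change in cut value. Illustrative sketch only."""

    def __init__(self, edges, n_nodes, seed=None):
        self.edges = edges            # list of (u, v) pairs
        self.n = n_nodes
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.side = [0] * self.n      # all nodes start in partition 0
        return tuple(self.side)

    def cut_value(self):
        return sum(self.side[u] != self.side[v] for u, v in self.edges)

    def step(self, node):
        before = self.cut_value()
        self.side[node] ^= 1          # flip node to the other partition
        reward = self.cut_value() - before
        return tuple(self.side), reward

# A random policy as a stand-in for a learned GNN policy.
env = MaxCutEnv(edges=[(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)], n_nodes=4, seed=0)
state = env.reset()
for _ in range(8):
    state, reward = env.step(env.rng.randrange(env.n))
print("final cut value:", env.cut_value())
```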

Ultra-Fast Optical Switch Optimisation

Published: (Paper One) (Paper Two) (Paper Three) (GitHub) (Documentation)

One of the primary bottlenecks to all-optical data centre networks is the lack of a packet-timescale switch. This project saw the application of AI techniques to switch semiconductor optical amplifiers in just half a nanosecond, beating the previous world record by an order of magnitude and, for the first time, offering the potential to scale to the thousands of switches in a real data centre.
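A common recipe in this space is to treat the switch’s drive signal as a vector of parameters and let a derivative-free optimiser tune it against the measured optical response. The sketch below applies particle swarm optimisation to a crude second-order stand-in for an SOA; both the toy response model and the choice of optimiser are assumptions for illustration, not the project’s actual experimental setup.

```python
import random

def soa_response(drive, steps=60, dt=0.05):
    """Crude second-order stand-in for an SOA's optical output (illustrative,
    not a physical model): the output rings before settling after a step."""
    y, v = 0.0, 0.0
    out = []
    for t in range(steps):
        target = drive[min(t * len(drive) // steps, len(drive) - 1)]
        a = 80.0 * (target - y) - 6.0 * v   # stiff spring, light damping -> ringing
        v += a * dt
        y += v * dt
        out.append(y)
    return out

def cost(drive):
    """Penalise deviation from a clean unit step: overshoot and slow settling."""
    return sum((y - 1.0) ** 2 for y in soa_response(drive))

def pso(dim=4, n_particles=20, iters=100, seed=0):
    """Standard particle swarm optimisation over the drive-signal levels."""
    rng = random.Random(seed)
    pos = [[rng.uniform(0, 2) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_p = [p[:] for p in pos]
    best_c = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_c[i])
    g_pos, g_cost = best_p[g][:], best_c[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (best_p[i][d] - pos[i][d])
                             + 1.5 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < best_c[i]:
                best_p[i], best_c[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost

best_drive, best_cost = pso()
print("optimised drive levels:", [round(x, 2) for x in best_drive])
```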

Resource Management in Distributed Deep Learning Optical Clusters (Ongoing)

Published: (Paper One) (GitHub)

Low-latency, high-bandwidth, ultra-scalable optical circuit switched networks can address the limitations of current compute clusters and enable the deployment of next-generation high-performance clusters and data centres. In particular, machine learning workloads present a unique opportunity for specialised circuit-based clusters because they are predictable, periodic, and consist mostly of large network flows. Furthermore, trillion-parameter learning models are being developed whose final test performance is capped primarily by model size; a characteristic which, in the ‘strong scaling’ case, is limited by the bandwidth of the network connecting the cluster’s servers.

In this work, we aim to address the challenge of how to make resource management decisions (from computation graph partitioning and placement to server allocation and scheduling) when training massive models on an optical cluster with distributed deep learning. By framing the problem as a Markov decision process, where sequential actions must be taken to maximise some reward (such as minimising the overall job completion time), a graph neural network can be trained from scratch with end-to-end reinforcement learning to allocate the cluster’s resources near-optimally. We are developing a suite of cluster environments, graph neural network models, and reinforcement learning algorithms to achieve this, and we hope to demonstrate both strong performance and the ability to scale to large networks and jobs.
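A minimal sketch of the MDP framing, under heavy simplifying assumptions: the operations of a computation graph are placed onto servers one at a time, and the per-step reward is the (negative) extra communication cost incurred. The toy graph, traffic volumes, and random policy are illustrative stand-ins, not the project’s environments or models.

```python
import random

ops = ["embed", "layer1", "layer2", "head"]
deps = [("embed", "layer1"), ("layer1", "layer2"), ("layer2", "head")]
traffic = {edge: 1.0 for edge in deps}   # data volume per dependency edge
n_servers = 2

def comm_cost(placement):
    """Total traffic over dependency edges whose endpoints sit on different
    servers (only edges with both endpoints already placed are counted)."""
    return sum(vol for (u, v), vol in traffic.items()
               if u in placement and v in placement
               and placement[u] != placement[v])

rng = random.Random(0)
placement = {}
for op in ops:                               # one sequential action per operation
    before = comm_cost(placement)
    placement[op] = rng.randrange(n_servers)  # stand-in for a learned GNN policy
    reward = before - comm_cost(placement)    # <= 0 when new edges cross servers
print(placement, "| total communication cost:", comm_cost(placement))
```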

Smart Ski Boot

Published:

Embedded systems hardware can be more complex to program, but brings benefits in terms of cost and power consumption. This project saw the development of a prototype ‘smart ski boot’ that could deliver more accurate, in-depth and affordable ski technique instruction than expensive human instructors, who typically charge £500-700 a day. This stands to benefit not only skiers, who will save money on tuition fees and receive superior teaching, but also the ski industry, from restaurateurs to equipment providers, whose customer base will grow as fewer people are priced out of the sport.

Network Attack Detection

Published:

Network intrusion detection software protects a computer network from unauthorised users, potentially including insiders. The KDD Cup 1999 dataset, derived from the DARPA Intrusion Detection Evaluation Program, contains roughly 5 million network connection records split into 4 broad categories (DoS, R2L, U2R and probing attacks) which can be further sub-divided into 23 forms of attack. Using a standard sequential neural network with 3 hidden layers, a model was trained via supervised learning in a client-server architecture to detect malicious network requests with 99.99% accuracy.
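A minimal sketch of a 3-hidden-layer sequential classifier of the kind described above, using Keras. The layer widths, activations, and binary normal-vs-attack framing are assumptions for illustration; KDD’s categorical features (protocol, service, flag) must be one-hot encoded and the numeric features scaled before training.

```python
import numpy as np
from tensorflow import keras

n_features = 122            # typical width of KDD'99 after one-hot encoding
model = keras.Sequential([
    keras.Input(shape=(n_features,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),   # attack vs normal
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Random stand-in data; replace with preprocessed KDD'99 records.
X = np.random.rand(1024, n_features).astype("float32")
y = np.random.randint(0, 2, size=(1024, 1)).astype("float32")
model.fit(X, y, epochs=2, batch_size=64, verbose=0)
```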

Huawei DriveML Challenge

Published:

Autonomous driving holds the promise of reducing traffic accidents through the design of safe, robust, accurate and intelligent agents. In this introductory-level competition, Huawei open-accessed their traffic perception simulation system, with which competitors had 6 weeks to design, train and test a single agent navigating roads and traffic. My team used the NeuroEvolution of Augmenting Topologies (NEAT) genetic algorithm to evolve the agent’s policy within a reinforcement learning framework; however, performance was ultimately limited by poor generalisation to unseen scenarios.
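For flavour, here is a heavily simplified neuroevolution loop showing the evaluate-select-mutate cycle that NEAT builds on. Full NEAT also mutates network topology and uses speciation; this sketch evolves only the weights of a fixed linear policy against a stand-in fitness function, none of which reflects the competition’s actual simulator or observations.

```python
import random

rng = random.Random(0)
N_OBS, POP, GENS = 4, 30, 50

def act(weights, obs):
    """Tiny fixed-topology linear policy: steer left/right from a dot product."""
    return 1 if sum(w * o for w, o in zip(weights, obs)) > 0 else -1

def fitness(weights):
    """Stand-in for episode reward from a driving simulator: reward the
    policy for steering towards the sign of the first observation."""
    score = 0
    for _ in range(20):
        obs = [rng.uniform(-1, 1) for _ in range(N_OBS)]
        score += 1 if act(weights, obs) == (1 if obs[0] > 0 else -1) else 0
    return score

population = [[rng.gauss(0, 1) for _ in range(N_OBS)] for _ in range(POP)]
for gen in range(GENS):
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:POP // 4]                     # truncation selection
    population = [p[:] for p in parents]
    while len(population) < POP:                    # fill with mutated offspring
        child = [w + rng.gauss(0, 0.3) for w in rng.choice(parents)]
        population.append(child)
print("best fitness:", fitness(max(population, key=fitness)))
```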

3D Holographic AR and VR Displays

Published:

Modern augmented and virtual reality systems such as Oculus VR, Magic Leap and HoloLens products are fundamentally unfit for purpose because they rely on 2D projection, which leads to eye fatigue and a lack of immersion for the user. Holography is the only way to produce truly 3D images, and could therefore emerge as the leading technology for such display systems. Historically, to achieve a large field of view, holographic displays had to use either smaller spatial light modulators or telescope de-magnification techniques, resulting in uncomfortably small and often impractical display sizes. In collaboration with the University of Cambridge and the holographic display company VividQ, this project saw the development of a system capable of expanding the eyebox (the region within which the viewer’s eye sees the full image) without compromising on display size, significantly improving the usability and quality of 3D holographic displays.
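The trade-off being broken here follows from a standard textbook property of pixelated spatial light modulators (a general relation, not a result from this project): the pixel pitch bounds the diffraction angle, and the SLM’s fixed space-bandwidth product forces a choice between image size and viewing angle.

```latex
% Standard diffraction limit for a pixelated SLM (textbook relation).
% A pixel pitch p acts as a grating, bounding the maximum diffraction
% half-angle, and hence the field of view:
\[
    \sin\theta_{\max} = \frac{\lambda}{2p}
\]
% e.g. for green light, \lambda = 532\,\mathrm{nm}, and a typical pitch
% p = 8\,\mu\mathrm{m}: \theta_{\max} = \arcsin(0.033) \approx 1.9^{\circ}.
% Because the product of display size and viewing angle is fixed by the
% SLM's pixel count (its space-bandwidth product), de-magnifying to enlarge
% the angle shrinks the image -- the trade-off eyebox expansion avoids.
```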