Publications

A Vectorised Packing Algorithm for Efficient Generation of Custom Traffic Matrices

Published in Optical Fiber Communications Conference and Exhibition (OFC), 2023

We propose a new algorithm for generating custom network traffic matrices which achieves 13x, 38x, and 70x faster generation times than prior work on networks with 64, 256, and 1024 nodes respectively.
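
As a rough illustration of the object being generated (a sketch only, not the paper's vectorised packing algorithm), the snippet below builds an N×N traffic matrix with vectorised NumPy operations and rescales it so that no node's total ingress or egress demand exceeds a capacity budget:

```python
# Illustrative sketch only, NOT the vectorised packing algorithm from the paper.
# It shows the kind of object being generated: an N x N traffic matrix whose
# row/column sums are scaled to respect a per-node capacity budget.
import numpy as np

def sketch_traffic_matrix(num_nodes: int, node_capacity: float, seed: int = 0) -> np.ndarray:
    """Generate a random demand matrix, then rescale it (vectorised) so that no
    node's total outgoing or incoming traffic exceeds node_capacity."""
    rng = np.random.default_rng(seed)
    demand = rng.exponential(scale=1.0, size=(num_nodes, num_nodes))
    np.fill_diagonal(demand, 0.0)  # no self-traffic

    # Vectorised capacity enforcement: scale whole rows/columns at once rather
    # than packing flows one at a time in a Python loop.
    row_scale = np.minimum(1.0, node_capacity / demand.sum(axis=1, keepdims=True))
    demand *= row_scale
    col_scale = np.minimum(1.0, node_capacity / demand.sum(axis=0, keepdims=True))
    demand *= col_scale
    return demand

matrix = sketch_traffic_matrix(num_nodes=64, node_capacity=100.0)
assert matrix.sum(axis=1).max() <= 100.0 + 1e-9
assert matrix.sum(axis=0).max() <= 100.0 + 1e-9
```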

Recommended citation: C. W. F. Parsonson, J. L. Benjamin and G. Zervas, "A Vectorised Packing Algorithm for Efficient Generation of Custom Traffic Matrices", OFC-23: Optical Fiber Communications Conference and Exhibition, 2023 https://arxiv.org/abs/2302.09970

A Hybrid Beam Steering Free-Space and Fiber Based Optical Data Center Network

Published in Journal of Lightwave Technology (JLT), 2023

Wireless data center networks (DCNs) are promising solutions to mitigate the cabling complexity in traditional wired DCNs and potentially reduce the end-to-end latency with faster propagation speed in free space. Yet, physical architectures in wireless DCNs must be carefully designed regarding wireless link blockage, obstacle bypassing, path loss, interference and spatial efficiency in a dense deployment. This paper presents the physical layer design of a hybrid free-space optical (FSO)/in-fiber DCN that guarantees an all-optical, single-hop, non-oversubscribed and full-bisection bandwidth network. We propose two layouts and analyze their scalability: (1) a static network utilizing only tunable sources, which can scale up to 43 racks, 15,609 nodes and 15,609 channels; and (2) a reconfigurable network with both tunable sources and piezoelectric actuator (PZT) based beam steering, which can scale up to 8 racks, 2,904 nodes and 185,856 channels at millisecond PZT switching time. Based on a traffic generation framework and a dynamic wavelength-timeslot scheduling algorithm, the system-level network performance is simulated for a 363-node subnet, reaching >99% throughput and 1.23 µs average scheduler latency at 90% load.
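
The quoted scalability figures are mutually consistent if each rack hosts 363 nodes (the size of the simulated subnet); a quick arithmetic check, noting that the 64-channels-per-node figure is inferred from the totals rather than stated above:

```python
# Quick consistency check of the scalability figures quoted above,
# assuming 363 nodes per rack (the size of the simulated subnet).
nodes_per_rack = 363
assert 43 * nodes_per_rack == 15_609   # static layout: 43 racks
assert 8 * nodes_per_rack == 2_904     # reconfigurable layout: 8 racks
assert 2_904 * 64 == 185_856           # i.e. 64 channels per node in the PZT layout (inferred)
```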

Recommended citation: Y. Liu, J. L. Benjamin, C. W. F. Parsonson, and G. Zervas "A Hybrid Beam Steering Free-Space and Fiber Based Optical Data Center Network", Journal of Lightwave Technology (JLT), 2023 Link to be added soon

Reinforcement Learning for Branch-and-Bound Optimisation using Retrospective Trajectories

Published in Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023

Combinatorial optimisation problems framed as mixed integer linear programmes (MILPs) are ubiquitous across a range of real-world applications. The canonical branch-and-bound algorithm seeks to exactly solve MILPs by constructing a search tree of increasingly constrained sub-problems. In practice, its solving time performance is dependent on heuristics, such as the choice of the next variable to constrain ('branching'). Recently, machine learning (ML) has emerged as a promising paradigm for branching. However, prior works have struggled to apply reinforcement learning (RL), citing sparse rewards, difficult exploration, and partial observability as significant challenges. Instead, leading ML methodologies resort to approximating high quality handcrafted heuristics with imitation learning (IL), which precludes the discovery of novel policies and requires expensive data labelling. In this work, we propose retro branching; a simple yet effective approach to RL for branching. By retrospectively deconstructing the search tree into multiple paths each contained within a sub-tree, we enable the agent to learn from shorter trajectories with more predictable next states. In experiments on four combinatorial tasks, our approach enables learning-to-branch without any expert guidance or pre-training. We outperform the current state-of-the-art RL branching algorithm by 3-5x and come within 20% of the best IL method's performance on MILPs with 500 constraints and 1000 variables, with ablations verifying that our retrospectively constructed trajectories are essential to achieving these results.
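
A minimal sketch of the trajectory-construction idea described above, with a toy tree representation standing in for the actual branch-and-bound search tree and for the sub-tree/path selection used in the paper:

```python
# Simplified sketch of "retro branching": after solving, deconstruct the
# branch-and-bound search tree into root-to-leaf paths and treat each path as a
# short trajectory for the RL agent. Path selection and rewards are placeholders,
# not the paper's exact construction.
from typing import Dict, List

def retrospective_trajectories(children: Dict[int, List[int]], root: int = 0) -> List[List[int]]:
    """Enumerate root-to-leaf node sequences in the search tree; each sequence
    becomes one training trajectory with a shorter horizon and more predictable
    next states than the full episode."""
    trajectories = []

    def walk(node: int, path: List[int]) -> None:
        path = path + [node]
        if not children.get(node):          # leaf: this path is complete
            trajectories.append(path)
            return
        for child in children[node]:
            walk(child, path)

    walk(root, [])
    return trajectories

# Toy search tree: node 0 branches into 1 and 2; node 1 branches into 3 and 4.
tree = {0: [1, 2], 1: [3, 4]}
print(retrospective_trajectories(tree))     # [[0, 1, 3], [0, 1, 4], [0, 2]]
```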

Recommended citation: C. W. F. Parsonson, A. Laterre and T. D. Barrett "Reinforcement Learning for Branch-and-Bound Optimisation using Retrospective Trajectories", AAAI-23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023 https://arxiv.org/abs/2205.14345?context=cs

Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration

Under peer review, 2022

From logistics to the natural sciences, combinatorial optimisation on graphs underpins numerous real-world applications. Reinforcement learning (RL) has shown particular promise in this setting as it can adapt to specific problem structures and does not require pre-solved instances for these, often NP-hard, problems. However, state-of-the-art (SOTA) approaches typically suffer from severe scalability issues, primarily due to their reliance on expensive graph neural networks (GNNs) at each decision step. We introduce ECORD; a novel RL algorithm that alleviates this expense by restricting the GNN to a single pre-processing step, before entering a fast-acting exploratory phase directed by a recurrent unit. Experimentally, ECORD achieves a new SOTA for RL algorithms on the Maximum Cut problem, whilst also providing orders of magnitude improvement in speed and scalability. Compared to the nearest competitor, ECORD reduces the optimality gap by up to 73% on 500 vertex graphs with a decreased wall-clock time. Moreover, ECORD retains strong performance when generalising to larger graphs with up to 10000 vertices.
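
A structural sketch of the split described above, using placeholder components rather than ECORD's actual GNN, recurrent unit, or action selection: the expensive embedding step runs once, and the per-step policy is a cheap recurrent update.

```python
# Architectural sketch only: the point is that the (expensive) graph embedding is
# computed once, and the per-step policy is a lightweight recurrent update. All
# components here are placeholders, not ECORD's actual networks.
import numpy as np

rng = np.random.default_rng(0)
num_nodes, embed_dim = 500, 16

# 1) One-off "GNN" pre-processing step (placeholder: a random projection of node features).
node_features = rng.normal(size=(num_nodes, 8))
node_embeddings = node_features @ rng.normal(size=(8, embed_dim))

# 2) Fast exploratory phase: a recurrent state is updated each step and combined
#    with the fixed embeddings to score actions (e.g. which vertex to flip in Max-Cut).
hidden = np.zeros(embed_dim)
for step in range(num_nodes):
    scores = node_embeddings @ hidden + rng.normal(scale=0.1, size=num_nodes)
    action = int(np.argmax(scores))                      # vertex acted on this step
    hidden = np.tanh(hidden + node_embeddings[action])   # cheap recurrent update, no GNN call
```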

Recommended citation: T. D. Barrett, C. W. F. Parsonson and A. Laterre "Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration", Under peer review, 2022 https://arxiv.org/abs/2205.14105

Partitioning Distributed Compute Jobs with Reinforcement Learning and Graph Neural Networks

Under peer review, 2022

From natural language processing to genome sequencing, large-scale machine learning models are bringing advances to a broad range of fields. Many of these models are too large to be trained on a single machine, and instead must be distributed across multiple devices. This has motivated the research of new compute and network systems capable of handling such tasks. In particular, recent work has focused on developing management schemes which decide *how* to allocate distributed resources such that some overall objective, such as minimising the job completion time (JCT), is optimised. However, such studies omit explicit consideration of *how much* a job should be distributed, usually assuming that maximum distribution is desirable. In this work, we show that maximum parallelisation is sub-optimal in relation to user-critical metrics such as throughput and blocking rate. To address this, we propose PAC-ML (partitioning for asynchronous computing with machine learning), which leverages a graph neural network and reinforcement learning to learn how much to partition computation graphs such that the number of jobs which meet arbitrary user-defined JCT requirements is maximised. In experiments with five real deep learning computation graphs on a recently proposed optical architecture across four user-defined JCT requirement distributions, we demonstrate PAC-ML achieving up to 56.2% lower blocking rates in dynamic job arrival settings than the canonical maximum parallelisation strategy used by most prior works.
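
A hedged sketch of the decision loop described above; the toy completion-time model, policy, and parameter names are illustrative stand-ins, not PAC-ML's implementation:

```python
# Illustrative decision loop only: the environment, completion-time model, and
# policy below are toy placeholders, not PAC-ML's implementation.
import random

PARTITION_CHOICES = [1, 2, 4, 8, 16]     # candidate degrees of parallelism

def choose_partition(job, cluster_load):
    """Placeholder policy: in PAC-ML this decision is made by a GNN + RL agent."""
    return random.choice(PARTITION_CHOICES)

def toy_jct(job, num_partitions, cluster_load):
    """Toy completion-time model: compute shrinks with parallelism, but
    communication/queueing overhead grows with it and with cluster load."""
    return job["compute"] / num_partitions + job["comms"] * num_partitions * (1 + cluster_load)

jobs = [{"compute": 100.0, "comms": 1.0, "jct_requirement": 30.0} for _ in range(10)]
met = 0
for job in jobs:
    k = choose_partition(job, cluster_load=0.5)
    if toy_jct(job, k, cluster_load=0.5) <= job["jct_requirement"]:
        met += 1     # reward signal: number of jobs meeting their JCT requirement
print(f"{met}/{len(jobs)} jobs met their JCT requirement")
```

Even in this toy model the largest partitioning choice is not the best one, which mirrors the observation above that maximum parallelisation can be sub-optimal.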

Recommended citation: C. W. F. Parsonson and G. Zervas "Partitioning Distributed Compute Jobs with Reinforcement Learning and Graph Neural Networks", Under peer review, 2022 Link to be added soon.

Traffic Generation for Benchmarking Data Centre Networks

Published in Optical Switching and Networking, 2022

Benchmarking is commonly used in research fields, such as computer architecture design and machine learning, as a powerful paradigm for rigorously assessing, comparing, and developing novel technologies. However, the data centre network (DCN) community lacks a standard open-access and reproducible traffic generation framework for benchmark workload generation. Driving factors behind this include the proprietary nature of traffic traces, the limited detail and quantity of open-access network-level data sets, the high cost of real-world experimentation, and the poor reproducibility and fidelity of synthetically generated traffic. This is curtailing the community's understanding of existing systems and hindering the ease with which novel technologies, such as optical DCNs, can be developed, compared, and tested.

Recommended citation: C. W. F. Parsonson, J. L. Benjamin and G. Zervas, "Traffic Generation for Benchmarking Data Centre Networks", Optical Switching and Networking, 2022 https://www.sciencedirect.com/science/article/pii/S1573427722000315

Traffic Tolerance of Nanosecond Scheduling on Optical Circuit Switched Data Centre Networks

Published in Optical Fiber Communications Conference and Exhibition (OFC), 2022

PULSE's nanosecond-speed scheduler, which solves an NP-hard network scheduling problem, delivers skew-tolerant performance at 90% input load. It achieves >90% throughput, 1.5–1.9 µs mean latency, and 16–24 µs tail (99th percentile) latency for up to 6:1 hot:cold skewed traffic in an OCS DCN.

Recommended citation: J. L. Benjamin, A. Ottino, C. W. F. Parsonson and G. Zervas, "Traffic Tolerance of Nanosecond Scheduling on Optical Circuit Switched Data Centre Networks", OFC-22: Optical Fiber Communications Conference and Exhibition, 2022 https://ieeexplore.ieee.org/document/9748332

Optimal and Low Complexity Control of SOA-Based Optical Switching with Particle Swarm Optimisation

Published in European Conference and Exhibition on Optical Communication (ECOC), 2022

We propose a reliable, low-complexity particle swarm optimisation (PSO) approach to control semiconductor optical amplifier (SOA)-based switches. We experimentally demonstrate less than 610 ps off-on switching (settling) time and less than 2.2% overshoot with 20x lower sampling rate and 8x reduced DAC resolution.
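
A minimal sketch of the optimisation framing, assuming a toy second-order SOA response and illustrative PSO settings (neither reflects the experimental setup in the paper): PSO searches over a small number of coarsely quantised drive-signal points to minimise overshoot and settling error.

```python
# Illustrative sketch: PSO searching a coarsely-sampled, low-resolution drive
# signal for an SOA switch. The SOA model, fitness function, and PSO settings are
# toy placeholders, not the experimental configuration in the paper.
import numpy as np

rng = np.random.default_rng(0)
NUM_POINTS, LEVELS = 8, 32               # few control points, reduced DAC resolution

def soa_response(drive):
    """Toy under-damped step response driven by the candidate signal (placeholder model)."""
    t = np.linspace(0, 1, 200)
    target = np.interp(t, np.linspace(0, 1, NUM_POINTS), drive)
    return target * (1 - np.exp(-8 * t) * np.cos(25 * t))

def fitness(drive):
    """Penalise overshoot above the steady-state level and deviation from it late in the step."""
    out = soa_response(drive)
    overshoot = max(out.max() - 1.0, 0.0)
    settle_err = np.abs(out[100:] - 1.0).mean()
    return overshoot + settle_err

# Standard global-best PSO over quantised drive levels.
particles = rng.integers(0, LEVELS, size=(20, NUM_POINTS)) / (LEVELS - 1)
velocity = np.zeros_like(particles)
pbest, pbest_fit = particles.copy(), np.array([fitness(p) for p in particles])
gbest = pbest[pbest_fit.argmin()].copy()

for _ in range(100):
    r1, r2 = rng.random(particles.shape), rng.random(particles.shape)
    velocity = 0.7 * velocity + 1.5 * r1 * (pbest - particles) + 1.5 * r2 * (gbest - particles)
    particles = np.clip(particles + velocity, 0, 1)
    particles = np.round(particles * (LEVELS - 1)) / (LEVELS - 1)   # quantise to DAC levels
    fits = np.array([fitness(p) for p in particles])
    improved = fits < pbest_fit
    pbest[improved], pbest_fit[improved] = particles[improved], fits[improved]
    gbest = pbest[pbest_fit.argmin()].copy()

print("best drive-signal fitness found:", pbest_fit.min())
```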

Recommended citation: H. Alkharsan, C. W. F. Parsonson, Z. Shabka, X. Mu, A. Ottino and G. Zervas, "Optimal and Low Complexity Control of SOA-Based Optical Switching with Particle Swarm Optimisation", ECOC-22: European Conference and Exhibition on Optical Communication, 2022 https://opg.optica.org/abstract.cfm?uri=ECEOC-2022-Tu3C.5

AI-Optimised Tuneable Sources for Bandwidth-Scalable, Sub-Nanosecond Wavelength Switching

Published in Optics Express, 2021

Wavelength routed optical switching promises low power and latency networking for data centres, but requires a wideband wavelength tuneable source (WTS) capable of sub-nanosecond switching at every node. We propose a hybrid WTS that uses time-interleaved tuneable lasers, each gated by a semiconductor optical amplifier, where the performance of each device is optimised using artificial intelligence. Through simulation and experiment we demonstrate record wavelength switch times below 900 ps across 6.05 THz (122×50 GHz) of continuously tuneable optical bandwidth. A method for further bandwidth scaling is evaluated and compared to alternative designs.
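
A quick consistency check of the quoted figures, assuming the 6.05 THz is measured between the first and last channel centres of the 122×50 GHz grid (an inference, not stated above): (122 - 1) × 50 GHz = 6.05 THz.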

Recommended citation: T. Gerard, C. W. F. Parsonson, Z. Shabka, B. Thomsen, P. Bayvel, D. Lavery and G. Zervas, "AI-Optimised Tuneable Sources for Bandwidth-Scalable, Sub-Nanosecond Wavelength Switching", Optics Express, 2021 https://opg.optica.org/oe/fulltext.cfm?uri=oe-29-7-11221&id=449558

Benchmarking Packet-Granular OCS Network Scheduling for Data Center Traffic Traces

Published in Photonic Networks and Devices, 2021

We recently reported hardware-implemented scheduling processors for packet-granular reconfigurable optical circuit-switched networks. Here, we benchmark the performance of the processors under various data center traffic for a range of network loads.

Recommended citation: J. L. Benjamin, C. W. F. Parsonson and G. Zervas, "Benchmarking Packet-Granular OCS Network Scheduling for Data Center Traffic Traces", Photonic Networks and Devices, 2021 https://opg.optica.org/abstract.cfm?uri=Networks-2021-NeW3B.3

Optimal Control of SOAs with Artificial Intelligence for Sub-Nanosecond Optical Switching

Published in Journal of Lightwave Technology (JLT), 2020

Novel approaches to switching ultra-fast semiconductor optical amplifiers using artificial intelligence algorithms (particle swarm optimisation, ant colony optimisation, and a genetic algorithm) are developed and applied both in simulation and experiment. Effective off-on switching (settling) times of 542 ps are demonstrated with just 4.8% overshoot, achieving an order of magnitude improvement over previous attempts described in the literature and standard dampening techniques from control theory.

Recommended citation: C. W. F. Parsonson, Z. Shabka, W. K. Chlupka, B. Goh and G. Zervas, "Optimal Control of SOAs with Artificial Intelligence for Sub-Nanosecond Optical Switching", Journal of Lightwave Technology (JLT), 2020 https://ieeexplore.ieee.org/abstract/document/9124678