Implementing a GPU-based parallel MAX-MIN Ant System
January 18, 2020 ยท Declared Dead ยท ๐ Future generations computer systems
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Rafaล Skinderowicz
arXiv ID
2003.11902
Category
cs.NE: Neural & Evolutionary
Cross-listed
cs.DC
Citations
25
Venue
Future generations computer systems
Last Checked
4 months ago
Abstract
The MAX-MIN Ant System (MMAS) is one of the best-known Ant Colony Optimization (ACO) algorithms proven to be efficient at finding satisfactory solutions to many difficult combinatorial optimization problems. The slow-down in Moore's law, and the availability of graphics processing units (GPUs) capable of conducting general-purpose computations at high speed, has sparked considerable research efforts into the development of GPU-based ACO implementations. In this paper, we discuss a range of novel ideas for improving the GPU-based parallel MMAS implementation, allowing it to better utilize the computing power offered by two subsequent Nvidia GPU architectures. Specifically, based on the weighted reservoir sampling algorithm we propose a novel parallel implementation of the node selection procedure, which is at the heart of the MMAS and other ACO algorithms. We also present a memory-efficient implementation of another key-component -- the tabu list structure -- which is used in the ACO's solution construction stage. The proposed implementations, combined with the existing approaches, lead to a total of six MMAS variants, which are evaluated on a set of Traveling Salesman Problem (TSP) instances ranging from 198 to 3,795 cities. The results show that our MMAS implementation is competitive with state-of-the-art GPU-based and multi-core CPU-based parallel ACO implementations: in fact, the times obtained for the Nvidia V100 Volta GPU were up to 7.18x and 21.79x smaller, respectively. The fastest of the proposed MMAS variants is able to generate over 1 million candidate solutions per second when solving a 1,002-city instance. Moreover, we show that, combined with the 2-opt local search heuristic, the proposed parallel MMAS finds high-quality solutions for the TSP instances with up to 18,512 nodes.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Neural & Evolutionary
๐ฎ
๐ฎ
The Ethereal
R.I.P.
๐ป
Ghosted
Deep Learning using Rectified Linear Units (ReLU)
R.I.P.
๐ป
Ghosted
Generative Adversarial Text to Image Synthesis
R.I.P.
๐ป
Ghosted
Regularized Evolution for Image Classifier Architecture Search
R.I.P.
๐ป
Ghosted
Temporal Ensembling for Semi-Supervised Learning
๐
๐
Old Age
Learning Structured Sparsity in Deep Neural Networks
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
๐ป
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
๐ป
Ghosted