Neural Architecture Search using Particle Swarm and Ant Colony Optimization

March 06, 2024 · Declared Dead · 🏛 Irish Conference on Artificial Intelligence and Cognitive Science

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Séamus Lankford, Diarmuid Grimes
arXiv ID: 2403.03781
Category: cs.NE (Neural & Evolutionary)
Cross-listed: cs.AI, cs.LG
Citations: 13
Venue: Irish Conference on Artificial Intelligence and Cognitive Science
Last Checked: 4 months ago

Abstract

Neural network models have a number of hyperparameters that must be chosen along with their architecture. Choosing an architecture and assigning values to its parameters can be a heavy burden for a novice user, so in most cases default hyperparameters and architectures are used. Significant improvements to model accuracy can be achieved by evaluating multiple architectures, and a process known as Neural Architecture Search (NAS) may be applied to evaluate a large number of such architectures automatically. A system integrating open source tools for Neural Architecture Search (OpenNAS), applied to the classification of images, has been developed as part of this research. OpenNAS takes any dataset of grayscale or RGB images and generates Convolutional Neural Network (CNN) architectures based on a range of metaheuristics, using either an AutoKeras, a transfer learning, or a Swarm Intelligence (SI) approach. Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) are used as the SI algorithms. Furthermore, models developed through such metaheuristics may be combined using stacking ensembles. In this paper, we focus on training and optimizing CNNs using the SI components of OpenNAS. The two SI algorithms, PSO and ACO, are compared to see which is more effective in generating higher model accuracies. With our experimental design, PSO is shown to perform better than ACO, and the improvement is most notable with a more complex dataset. As a baseline, the performance of fine-tuned pre-trained models is also evaluated.
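To make the PSO side of the comparison concrete, the following is a minimal sketch of a generic global-best PSO loop searching over two architecture parameters. It is not the OpenNAS implementation: the function names, the search space (number of convolutional layers and filters), the coefficient values, and the `toy_validation_error` surrogate are all assumptions made for illustration. In a real NAS run the objective would train and validate a CNN for each candidate.

```python
import random


def pso_minimize(objective, bounds, n_particles=12, n_iters=40,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimise `objective` over the box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds; zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position; the swarm shares one global best.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + c_social * r2 * (gbest_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val


def toy_validation_error(x):
    # Hypothetical surrogate for "validation error of a trained CNN".
    # It is a made-up bowl with its minimum at 3 layers and 64 filters.
    n_layers, n_filters = round(x[0]), round(x[1])
    return 0.05 + 0.02 * (n_layers - 3) ** 2 + 1e-5 * (n_filters - 64) ** 2


# Assumed search space: 1 to 6 convolutional layers, 16 to 128 filters.
bounds = [(1, 6), (16, 128)]
best, err = pso_minimize(toy_validation_error, bounds)
print("best (layers, filters):", round(best[0]), round(best[1]), "error:", err)
```

Continuous positions are rounded inside the objective because layer and filter counts are integers. An ACO variant would instead build each candidate architecture step by step, guided by pheromone values on the choices made, which is one reason the two algorithms can behave differently on the same search space.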