Pinpointing Anomaly Events in Logs from Stability Testing -- N-Grams vs. Deep-Learning

February 18, 2022 · Declared Dead · 🏛 International Conference on Software Testing, Verification and Validation Workshops

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Mika Mäntylä, Martín Varela, Shayan Hashemi
arXiv ID: 2202.09214
Category: cs.SE (Software Engineering)
Citations: 11
Venue: International Conference on Software Testing, Verification and Validation Workshops
Last Checked: 4 months ago
Abstract
As stability testing execution logs can be very long, software engineers need help in locating anomalous events. We develop and evaluate two models for scoring individual log events for anomalousness, namely an N-Gram model and a Deep Learning model with LSTM (Long Short-Term Memory). Both are trained on normal log sequences only. We evaluate the models with long log sequences of Android stability testing in our company case and with short log sequences from the public HDFS (Hadoop Distributed File System) dataset. We evaluate next-event prediction accuracy and computational efficiency. The LSTM model is more accurate in stability testing logs (N-Gram 0.848 vs. LSTM 0.865), whereas in HDFS logs the N-Gram is slightly more accurate (N-Gram 0.904 vs. LSTM 0.900). The N-Gram model has far superior computational efficiency compared to the Deep model (4 to 13 seconds vs. 16 minutes to nearly 4 hours), making it the preferred choice for our case company. Scoring individual log events for anomalousness seems like a good aid for root cause analysis of failing test cases, and our case company plans to add it to its online services. Despite the recent surge in using deep learning in software system anomaly detection, we found limited benefits in doing so. However, future work should consider whether our finding holds with different LSTM-model hyper-parameters, other datasets, and other deep-learning approaches that promise better accuracy and computational efficiency than LSTM-based models.
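Since no code is linked for this paper, the following is only an illustrative sketch of the general idea the abstract describes: an N-gram model trained on normal event sequences that scores each event by how unexpected it is given the preceding events. It is not the authors' implementation. The function names `train_ngram` and `score_events`, the `<s>` padding token, the default `n`, and the choice of `1 - probability` as the anomaly score are all assumptions made for this example.

```python
from collections import Counter, defaultdict

PAD = "<s>"  # hypothetical start-of-sequence padding token


def train_ngram(sequences, n=3):
    """Count which event follows each (n-1)-event context in normal logs."""
    counts = defaultdict(Counter)
    for seq in sequences:
        padded = [PAD] * (n - 1) + list(seq)
        for i in range(n - 1, len(padded)):
            context = tuple(padded[i - n + 1:i])
            counts[context][padded[i]] += 1
    return counts


def score_events(counts, seq, n=3):
    """Return one score per event: 1 - P(event | previous n-1 events).

    Events never seen after their context (or after an unseen context)
    get the maximum score of 1.0. This scoring rule is an assumption.
    """
    padded = [PAD] * (n - 1) + list(seq)
    scores = []
    for i in range(n - 1, len(padded)):
        context = tuple(padded[i - n + 1:i])
        seen = counts.get(context)
        total = sum(seen.values()) if seen else 0
        prob = seen[padded[i]] / total if total else 0.0
        scores.append(1.0 - prob)
    return scores


# Toy usage with made-up event names (not data from the paper).
normal_logs = [
    ["open", "read", "close"],
    ["open", "read", "close"],
    ["open", "write", "close"],
]
model = train_ngram(normal_logs, n=2)
for event, score in zip(["open", "read", "delete"],
                        score_events(model, ["open", "read", "delete"], n=2)):
    print(f"{event}: {score:.3f}")
```

In this toy run, the event that never follows `read` in the training sequences receives the highest score, which is the kind of per-event ranking the abstract suggests as an aid for root cause analysis. A practical version would likely need smoothing or back-off to shorter contexts for unseen n-grams.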