LogPilot: Intent-aware and Scalable Alert Diagnosis for Large-scale Online Service Systems

September 30, 2025 Β· Declared Dead Β· πŸ› International Conference on Automated Software Engineering

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Zhihan Jiang, Jinyang Liu, Yichen Li, Haiyu Huang, Xiao He, Tieying Zhang, Jianjun Chen, Yi Li, Rui Shi, Michael R. Lyu arXiv ID 2509.25874 Category cs.SE: Software Engineering Citations 0 Venue International Conference on Automated Software Engineering Last Checked 4 months ago
Abstract
Effective alert diagnosis is essential for ensuring the reliability of large-scale online service systems. However, on-call engineers are often burdened with manually inspecting massive volumes of logs to identify root causes. While various automated tools have been proposed, they struggle in practice due to alert-agnostic log scoping and the inability to organize complex data effectively for reasoning. To overcome these limitations, we introduce LogPilot, an intent-aware and scalable framework powered by Large Language Models (LLMs) for automated log-based alert diagnosis. LogPilot introduces an intent-aware approach, interpreting the logic in alert definitions (e.g., PromQL) to precisely identify causally related logs and requests. To achieve scalability, it reconstructs each request's execution into a spatiotemporal log chain, clusters similar chains to identify recurring execution patterns, and provides representative samples to the LLMs for diagnosis. This clustering-based approach ensures the input is both rich in diagnostic detail and compact enough to fit within the LLM's context window. Evaluated on real-world alerts from Volcano Engine Cloud, LogPilot improves the usefulness of root cause summarization by 50.34% and exact localization accuracy by 54.79% over state-of-the-art methods. With a diagnosis time under one minute and a cost of only $0.074 per alert, LogPilot has been successfully deployed in production, offering an automated and practical solution for service alert diagnosis.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Software Engineering

Died the same way β€” πŸ‘» Ghosted