Cloud and AI Infrastructure Cost Optimization: A Comprehensive Review of Strategies and Case Studies

July 24, 2023 ยท The Cartographer ยท + Add venue

๐Ÿ“š THE CARTOGRAPHER: The Cartographer
Survey/review paper โ€” maps the landscape rather than implementing a method.

"No code URL or promise found in abstract"
"Title-pattern auto-detect: Cloud and AI Infrastructure Cost Optimization: A Comprehensive Review of Strategies and Case Studies"

Evidence collected by the PWNC Scanner

Authors Saurabh Deochake arXiv ID 2307.12479 Category cs.DC: Distributed Computing Cross-listed cs.CE, econ.GN, eess.SY Citations 16 Last Checked 2 days ago
Abstract
Cloud computing has revolutionized the way organizations manage their IT infrastructure, but it has also introduced new challenges, such as managing cloud costs. The rapid adoption of artificial intelligence (AI) and machine learning (ML) workloads has further amplified these challenges, with GPU compute now representing 40-60\% of technical budgets for AI-focused organizations. This paper provides a comprehensive review of cloud and AI infrastructure cost optimization techniques, covering traditional cloud pricing models, resource allocation strategies, and emerging approaches for managing AI/ML workloads. We examine the dramatic cost reductions in large language model (LLM) inference which has decreased by approximately 10x annually since 2021 and explore techniques such as model quantization, GPU instance selection, and inference optimization. Real-world case studies from Amazon Prime Video, Pinterest, Cloudflare, and Netflix showcase practical application of these techniques. Our analysis reveals that organizations can achieve 50-90% cost savings through strategic optimization approaches. Future research directions in automated optimization, sustainability, and AI-specific cost management are proposed to advance the state of the art in this rapidly evolving field.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Distributed Computing