Empirical Characterization of User Reports about Cloud Failures

Sacheendra Talluri*, Leon Overweel, Laurens Versluis, Animesh Trivedi, Alexandru Iosup

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Cloud services are important for healthcare, banking, communication, and other purposes. Inevitably, such services fail, harming the processes and disturbing the people that depend on them. Understanding failures in cloud services is challenging, but important for preventing them. Much work has studied failure logs and reports provided by infrastructure operators. However, there is a paucity of information about how users perceive the failures of cloud services. In this work, we collect user-reported failures and characterize them empirically. We collect failures reported by users to the trusted aggregator Outage Report for 12 cloud services over 16 months spread across 2019 and 2020. We show evidence that user-reported failures not only capture major failures also self-reported by cloud operators, but also provide information about additional failures. We count and analyze time patterns in these reports. We make 6 main observations about how users perceive failure in cloud services. We find over 10x differences in request failure rates across microservice structures when using user-reported traces compared to using a constant failure distribution. Overall, our study provides the first long-term characterization of user-reported cloud failures.

Original language: English
Title of host publication: Proceedings - 2021 IEEE International Conference on Autonomic Computing and Self-Organizing Systems, ACSOS 2021
Editors: Esam El-Araby, Vana Kalogeraki, Danilo Pianini, Frederic Lassabe, Barry Porter, Sona Ghahremani, Ingrid Nunes, Mohamed Bakhouya, Sven Tomforde
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 158-163
Number of pages: 6
Edition: 2
ISBN (Electronic): 9781665412612
Publication status: E-pub ahead of print - 5 Jan 2022
Event: 2nd IEEE International Conference on Autonomic Computing and Self-Organizing Systems, ACSOS 2021 - Virtual, Online, United States
Duration: 27 Sep 2021 → 1 Oct 2021

Publication series

Name: Proceedings - 2021 IEEE International Conference on Autonomic Computing and Self-Organizing Systems, ACSOS 2021

Conference

Conference: 2nd IEEE International Conference on Autonomic Computing and Self-Organizing Systems, ACSOS 2021
Country/Territory: United States
City: Virtual, Online
Period: 27/09/21 → 1/10/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE.

Keywords

  • availability
  • characterization
  • cloud
  • cloud service
  • crowdsourcing
  • failure
