Compilation of public failure/horror stories related to Kubernetes

pacoxu/kubernetes-failure-stories

Kubernetes Failure Stories

A compiled list of links to public failure stories related to Kubernetes. Most recent publications on top.

Why

Kubernetes is a fairly complex system with many moving parts. Its ecosystem is constantly evolving and adding even more layers (service mesh, ...) to the mix. Considering this environment, we don't hear enough real-world horror stories to learn from each other! This compilation of failure stories should make it easier for people dealing with Kubernetes operations (SRE, Ops, platform/infrastructure teams) to learn from others and reduce the unknown unknowns of running Kubernetes in production. For more information, see the blog post.

Contributing

Please help the community and share a link to your failure story by opening a Pull Request! A failure story can take any form: a blog post, a conference or meetup talk, an incident postmortem, a tweetstorm, and so on.

I would also be glad to hear about your failure stories on Twitter: my handle is @try_except_

Thanks

Thanks to all contributors and everybody who writes public Kubernetes postmortems! 👏

Thanks to Joe Beda for contributing his domain k8s.af for this project! 👏
