{"id":5513,"date":"2018-04-26T11:59:15","date_gmt":"2018-04-26T10:59:15","guid":{"rendered":"https:\/\/ee.yelkdev.site\/?p=5513"},"modified":"2023-09-25T10:25:28","modified_gmt":"2023-09-25T09:25:28","slug":"chaos-day","status":"publish","type":"post","link":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/","title":{"rendered":"A day of chaos"},"content":{"rendered":"<p>We spend a lot of our time thinking about how best to shield our clients from the potential pitfalls of digital business. We ask ourselves \u201cwhat\u2019s the worst that could happen?\u201d \u2013 and work hard to mitigate the risk.<\/p>\n<p>As part of these efforts, we recently ran a \u2018Chaos Day\u2019 with one of our clients \u2013 a major Government department that hosts around 50 digital delivery teams, distributed all around the UK. These teams design, deliver and support hundreds of microservices that serve online content to the department\u2019s varied customers.<\/p>\n<p>The microservices all run on a single platform, itself run by seven Platform Teams that take responsibility for distinct areas (infrastructure, security and so on). Equal Experts collaborates on the ongoing development of the Platform, with an array of infrastructure engineers, developers, testers and delivery leads spread across these teams.<\/p>\n<p>Inspired by Netflix\u2019s <a href=\"https:\/\/netflix.github.io\/chaosmonkey\/\" target=\"_blank\" rel=\"noopener noreferrer\">Chaos Monkey<\/a> and Amazon\u2019s <a href=\"https:\/\/aws.amazon.com\/blogs\/apn\/the-top-eight-tips-for-gameday-success-and-some-gamedonts\/\" target=\"_blank\" rel=\"noopener noreferrer\">Game Day<\/a>, the Platform Teams planned and executed their own Chaos Day \u2013 to see just how well they and the Platform would cope when everything that could go wrong does go wrong.<\/p>\n<h3>Be alert<\/h3>\n<p>Why run such an event now? 
Well, since moving to Amazon Web Services (AWS), the Platform has benefited from improved stability and performance. We followed AWS\u2019 best practices, using Auto Scaling Groups (ASGs) and multiple Availability Zones (AZs) to improve the availability of the Platform.<\/p>\n<p>Another factor was that the client\u2019s digital service teams are now well versed in dealing with traffic peaks. So we felt the time was right to put the resilience of the Platform and its teams to the test \u2013 and make sure no one was resting on their laurels.<\/p>\n<h3>Unleash chaos<\/h3>\n<p>We aimed to test the impact of a perfect storm of things going wrong during our client\u2019s busiest time of year. The chaos ranged from key engineers being hit by a bus, to half of our AWS infrastructure going down (all simulated: no animals, production systems, or people were harmed in the course of the event).<\/p>\n<p>The event was fun, frantic and a great learning experience. Along the way, we learned a few things you might want to consider if you run a similar exercise:<\/p>\n<p><strong><span style=\"text-decoration: underline;\">Planning<\/span><\/strong><\/p>\n<p>Coordinated chaos sounds like an oxymoron, but to get the most from our Chaos Day, some planning was required. To keep the precise nature of our chaos a surprise for the teams facing it, we split our planning into two sessions:<\/p>\n<ul>\n<li>An initial open session, to define the mechanics of Chaos Day \u2013 such as which environment would be used, who would participate, which Slack channels to use, and what was expected from the Platform and digital service teams.<\/li>\n<li>A second session, limited to our <em>Agents of Chaos<\/em>. We chose highly experienced, knowledgeable members from each Platform Team to fill these roles, i.e. the people you really wouldn\u2019t want to be hit by a bus (but that wouldn\u2019t really happen \u2026 
would it?)<\/li>\n<\/ul>\n<p><strong><span style=\"text-decoration: underline;\">Execution<\/span><\/strong><\/p>\n<p>We ran Chaos Day on our Staging environment, while running peak load tests in the background. The Chaos team was kept separate from the Platform Teams, and injected chaos in secret throughout the day. The Platform Teams treated issues as though they were real Production ones.<\/p>\n<p>Over twenty chaotic disruptions were injected across the day, including failures of microservices, instance types, deployment tools, team members (i.e. making them unavailable \/ hit by a pretend bus), and availability zones.<\/p>\n<p>The Platform Teams put up a valiant fight against the onslaught \u2013 especially as real Production issues occurred on the same day (with a certain sense of inevitability).<\/p>\n<p><strong><span style=\"text-decoration: underline;\">Learnings<\/span><\/strong><\/p>\n<p>We followed up with a retrospective to identify what went well, what we needed to address and how we\u2019d improve future Chaos Days. The improvements we identified included:<\/p>\n<ul>\n<li>Run it more regularly (at least quarterly);<\/li>\n<li>Run it further ahead of the next annual peak;<\/li>\n<li>Widen the scope of the teams we involve (e.g. we have an API team, who weren\u2019t included this time);<\/li>\n<li>(Stretch goal) Run it in Production!<\/li>\n<\/ul>\n<p>Even so, the day was a real success. It provided tangible confirmation that the Platform is performant and resilient, and that the team is able to cope with a wide range of failures that might occur.<\/p>\n<p>Bugs are inevitable in any complex system, though, and the day would have been a failure if none were found, so it was helpful that twenty-two issues were identified as a result of the exercise.<\/p>\n<p>On a personal note, I was greatly impressed by the passion, professionalism and expertise with which both the Platform Teams and the Agents of Chaos conducted the Chaos Day. 
It was a real privilege to be (a tiny) part of the day, and I can heartily recommend running a similar event.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We spend a lot of our time thinking about how best to shield our clients from the potential pitfalls of digital business. We ask ourselves \u201cwhat\u2019s the worst that could happen?\u201d \u2013 and work hard to mitigate the risk. As part of these efforts, we recently ran a \u2018Chaos Day\u2019 with one of our clients [&hellip;]<\/p>\n","protected":false},"author":29,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","inline_featured_image":false,"footnotes":""},"categories":[5],"tags":[],"location":[397],"class_list":["post-5513","post","type-post","status-publish","format-standard","hentry","category-our-thinking"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>A day of chaos | Equal Experts<\/title>\n<meta name=\"description\" content=\"We spend a lot of our time thinking about how best to shield our clients from the potential pitfalls of digital business. To help mitigate risks, we recently ran a \u2018Chaos Day\u2019 with one of our clients.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/\" \/>\n<meta property=\"og:locale\" content=\"en_GB\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"A day of chaos\" \/>\n<meta property=\"og:description\" content=\"We spend a lot of our time thinking about how best to shield our clients from the potential pitfalls of digital business. 
To help mitigate risks, we recently ran a \u2018Chaos Day\u2019 with one of our clients.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/\" \/>\n<meta property=\"og:site_name\" content=\"Equal Experts\" \/>\n<meta property=\"article:published_time\" content=\"2018-04-26T10:59:15+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-09-25T09:25:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.equalexperts.com\/wp-content\/uploads\/2018\/04\/Chaos_Day.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1170\" \/>\n\t<meta property=\"og:image:height\" content=\"720\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Lyndsay Prewer\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@EqualExperts\" \/>\n<meta name=\"twitter:site\" content=\"@EqualExperts\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Lyndsay Prewer\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimated reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/\"},\"author\":{\"name\":\"Lyndsay Prewer\",\"@id\":\"https:\/\/www.equalexperts.com\/#\/schema\/person\/ab8de7d0a8536946a1c251d65cca5754\"},\"headline\":\"A day of 
chaos\",\"datePublished\":\"2018-04-26T10:59:15+00:00\",\"dateModified\":\"2023-09-25T09:25:28+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/\"},\"wordCount\":801,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.equalexperts.com\/#organization\"},\"articleSection\":[\"Our Thinking\"],\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/\",\"url\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/\",\"name\":\"A day of chaos | Equal Experts\",\"isPartOf\":{\"@id\":\"https:\/\/www.equalexperts.com\/#website\"},\"datePublished\":\"2018-04-26T10:59:15+00:00\",\"dateModified\":\"2023-09-25T09:25:28+00:00\",\"description\":\"We spend a lot of our time thinking about how best to shield our clients from the potential pitfalls of digital business. To help mitigate risks, we recently ran a \u2018Chaos Day\u2019 with one of our clients.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/#breadcrumb\"},\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.equalexperts.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"A day of chaos\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.equalexperts.com\/#website\",\"url\":\"https:\/\/www.equalexperts.com\/\",\"name\":\"Equal Experts\",\"description\":\"Making Software. 
Better.\",\"publisher\":{\"@id\":\"https:\/\/www.equalexperts.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.equalexperts.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-GB\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.equalexperts.com\/#organization\",\"name\":\"Equal Experts\",\"url\":\"https:\/\/www.equalexperts.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\/\/www.equalexperts.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.equalexperts.com\/wp-content\/uploads\/2018\/08\/Equal_Experts_Logo_CMYK_Colour.jpg\",\"contentUrl\":\"https:\/\/www.equalexperts.com\/wp-content\/uploads\/2018\/08\/Equal_Experts_Logo_CMYK_Colour.jpg\",\"width\":719,\"height\":340,\"caption\":\"Equal Experts\"},\"image\":{\"@id\":\"https:\/\/www.equalexperts.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/EqualExperts\",\"https:\/\/www.linkedin.com\/company\/equal-experts\/?viewAsMember=true\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.equalexperts.com\/#\/schema\/person\/ab8de7d0a8536946a1c251d65cca5754\",\"name\":\"Lyndsay Prewer\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\/\/www.equalexperts.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/40bdd25ad48353db99467923f1513696e5f0f7d20e7c2a93d6a97a41545d1bd3?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/40bdd25ad48353db99467923f1513696e5f0f7d20e7c2a93d6a97a41545d1bd3?s=96&d=mm&r=g\",\"caption\":\"Lyndsay Prewer\"}}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"A day of chaos | Equal Experts","description":"We spend a lot of our time thinking about how best to shield our clients from the potential pitfalls of digital business. To help mitigate risks, we recently ran a \u2018Chaos Day\u2019 with one of our clients.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/","og_locale":"en_GB","og_type":"article","og_title":"A day of chaos","og_description":"We spend a lot of our time thinking about how best to shield our clients from the potential pitfalls of digital business. To help mitigate risks, we recently ran a \u2018Chaos Day\u2019 with one of our clients.","og_url":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/","og_site_name":"Equal Experts","article_published_time":"2018-04-26T10:59:15+00:00","article_modified_time":"2023-09-25T09:25:28+00:00","og_image":[{"width":1170,"height":720,"url":"https:\/\/www.equalexperts.com\/wp-content\/uploads\/2018\/04\/Chaos_Day.jpg","type":"image\/jpeg"}],"author":"Lyndsay Prewer","twitter_card":"summary_large_image","twitter_creator":"@EqualExperts","twitter_site":"@EqualExperts","twitter_misc":{"Written by":"Lyndsay Prewer","Estimated reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/#article","isPartOf":{"@id":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/"},"author":{"name":"Lyndsay Prewer","@id":"https:\/\/www.equalexperts.com\/#\/schema\/person\/ab8de7d0a8536946a1c251d65cca5754"},"headline":"A day of 
chaos","datePublished":"2018-04-26T10:59:15+00:00","dateModified":"2023-09-25T09:25:28+00:00","mainEntityOfPage":{"@id":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/"},"wordCount":801,"commentCount":0,"publisher":{"@id":"https:\/\/www.equalexperts.com\/#organization"},"articleSection":["Our Thinking"],"inLanguage":"en-GB","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/","url":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/","name":"A day of chaos | Equal Experts","isPartOf":{"@id":"https:\/\/www.equalexperts.com\/#website"},"datePublished":"2018-04-26T10:59:15+00:00","dateModified":"2023-09-25T09:25:28+00:00","description":"We spend a lot of our time thinking about how best to shield our clients from the potential pitfalls of digital business. To help mitigate risks, we recently ran a \u2018Chaos Day\u2019 with one of our clients.","breadcrumb":{"@id":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/#breadcrumb"},"inLanguage":"en-GB","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/chaos-day\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.equalexperts.com\/"},{"@type":"ListItem","position":2,"name":"A day of chaos"}]},{"@type":"WebSite","@id":"https:\/\/www.equalexperts.com\/#website","url":"https:\/\/www.equalexperts.com\/","name":"Equal Experts","description":"Making Software. 
Better.","publisher":{"@id":"https:\/\/www.equalexperts.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.equalexperts.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-GB"},{"@type":"Organization","@id":"https:\/\/www.equalexperts.com\/#organization","name":"Equal Experts","url":"https:\/\/www.equalexperts.com\/","logo":{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/www.equalexperts.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.equalexperts.com\/wp-content\/uploads\/2018\/08\/Equal_Experts_Logo_CMYK_Colour.jpg","contentUrl":"https:\/\/www.equalexperts.com\/wp-content\/uploads\/2018\/08\/Equal_Experts_Logo_CMYK_Colour.jpg","width":719,"height":340,"caption":"Equal Experts"},"image":{"@id":"https:\/\/www.equalexperts.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/EqualExperts","https:\/\/www.linkedin.com\/company\/equal-experts\/?viewAsMember=true"]},{"@type":"Person","@id":"https:\/\/www.equalexperts.com\/#\/schema\/person\/ab8de7d0a8536946a1c251d65cca5754","name":"Lyndsay Prewer","image":{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/www.equalexperts.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/40bdd25ad48353db99467923f1513696e5f0f7d20e7c2a93d6a97a41545d1bd3?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/40bdd25ad48353db99467923f1513696e5f0f7d20e7c2a93d6a97a41545d1bd3?s=96&d=mm&r=g","caption":"Lyndsay 
Prewer"}}]}},"_links":{"self":[{"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/posts\/5513","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/users\/29"}],"replies":[{"embeddable":true,"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/comments?post=5513"}],"version-history":[{"count":0,"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/posts\/5513\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/media?parent=5513"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/categories?post=5513"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/tags?post=5513"},{"taxonomy":"location","embeddable":true,"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/location?post=5513"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}