Failure Injection Testing (FIT) or Chaos as part of operations features
Inspired by Gremlin's infrastructure attacks (https://help.gremlin.com/infra-attacks/) and Netflix's FIT (https://medium.com/netflix-techblog/fit-failure-injection-testing-35d8e2a9bb2).
As a user, I can start infrastructure or network chaos from the Operations > Chaos menu.
Infrastructure chaos
- CPU: Generates high load for one or more CPU cores.
- Memory: Allocates a specific amount of RAM.
- IO: Puts read/write pressure on I/O devices such as hard disks.
- Disk: Writes files to disk to fill it to a specific percentage.
- Shutdown: Reboots or halts the host operating system, allowing you to test, for example, how your system behaves when it loses one or more cluster machines.
- Time travel: Changes the host’s system time, which can be used to simulate daylight saving time adjustments and other time-related events.
- Process killer: Kills the specified process, which can be used to simulate application or dependency crashes.
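As an illustration of the Disk attack, here is a minimal sketch of how the amount of data to write could be derived from a target fill percentage. The function names `bytes_to_fill` and `plan_disk_fill` are hypothetical and not part of any existing implementation; the sketch only computes the number of bytes and writes nothing to disk.

```python
import shutil


def bytes_to_fill(total: int, used: int, target_percent: float) -> int:
    """Return how many bytes must be written so that used space
    reaches target_percent of total capacity (0 if already there)."""
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")
    target_used = int(total * target_percent / 100)
    return max(0, target_used - used)


def plan_disk_fill(path: str, target_percent: float) -> int:
    """Look up current usage of the filesystem containing `path`
    and return the number of bytes a Disk attack would need to write."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return bytes_to_fill(usage.total, usage.used, target_percent)
```

For example, `bytes_to_fill(1000, 250, 80)` returns 550, and a disk that is already above the target yields 0, so the attack would be a no-op. A real attack would also need to delete the written files when it ends.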
Network chaos
- Blackhole: Drops all matching network traffic.
- Latency: Injects latency into all matching egress network traffic.
- Packet loss: Induces packet loss into all matching egress network traffic.
- DNS: Blocks access to DNS servers.
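One possible way to back the network attacks above, assuming a Linux host with `tc` (iproute2) and `iptables` available, is sketched below. This is an assumption about the implementation, not a requirement of this proposal. The function names are hypothetical, and the functions only build command argument lists without running them. Note that a `netem` qdisc attached at the interface root affects all egress traffic on that interface; restricting it to "matching" traffic would need additional filters that this sketch omits.

```python
from typing import List


def latency_command(interface: str, delay_ms: int) -> List[str]:
    """Latency: add a fixed delay to egress traffic on `interface`."""
    return ["tc", "qdisc", "add", "dev", interface,
            "root", "netem", "delay", f"{delay_ms}ms"]


def packet_loss_command(interface: str, loss_percent: float) -> List[str]:
    """Packet loss: randomly drop a percentage of egress packets."""
    return ["tc", "qdisc", "add", "dev", interface,
            "root", "netem", "loss", f"{loss_percent:g}%"]


def blackhole_commands(address: str) -> List[List[str]]:
    """Blackhole: drop traffic to and from `address` (IP or CIDR)."""
    return [
        ["iptables", "-A", "OUTPUT", "-d", address, "-j", "DROP"],
        ["iptables", "-A", "INPUT", "-s", address, "-j", "DROP"],
    ]


def dns_block_commands() -> List[List[str]]:
    """DNS: drop outgoing queries to port 53 over UDP and TCP."""
    return [
        ["iptables", "-A", "OUTPUT", "-p", proto,
         "--dport", "53", "-j", "DROP"]
        for proto in ("udp", "tcp")
    ]
```

Returning argument lists rather than running the commands keeps the sketch safe to execute anywhere. An actual attack would run them with root privileges and must undo them when it ends (for example `tc qdisc del dev <interface> root` and the matching `iptables -D` rules).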