Cloud native debugger
Problem to solve
Debugging cloud native applications requires a different approach than debugging traditional applications:
- SSH'ing into a shell may not be feasible, or even possible
- Lifetimes of containers and VMs can be very short
- Requests can be fanned out across many instances
This video also describes some of these challenges: https://www.youtube.com/watch?v=hdXaDqAqW1E
Intended users
Further details
Proposal
We should consider finding or building tooling that improves the lives of developers and SREs when troubleshooting cloud native applications. While monitoring solutions, logs, traces, and error tracking are important, sometimes they are not enough.
We could consider building some type of endpoint that an application exposes, similar to what Go's pprof provides: https://github.com/google/pprof.
Ideally this endpoint could be used individually, but also collected by another service which could help provide a layer of aggregation and abstraction from the individual endpoints of each container.
We could:
- Reuse pprof or build something similar, with support for additional languages
- Build an aggregation service that collects and aggregates data from these endpoints, making it easier for users to find the data they are looking for and to identify patterns and outliers