Add custom sessions, tracer, log aggregation and terminal features
I was tinkering with some features. Initially, I thought they weren't commit-worthy, but lately I've noticed that it would be helpful to have all of them in SOS Lab.
Here are some of the problems and solutions:
Adding or removing files in sessions:
Currently, SOS Lab sessions are static: only the logs in the SOS bundle are shown. It would be great to have the flexibility to add new files to an existing session or remove unnecessary ones. Now we have that option. In each session, you can add or remove log files as needed.
Custom sessions:
The current version can only open logs that come from SOS bundles. Ideally, we should have full flexibility. Sometimes, depending on the issue, we only collect gitlab-ctl tail output and specific log files, not a whole SOS bundle. We should still be able to use SOS Lab in that case. The custom session feature lets us create an empty session and add our own log files to analyze. This enables the tool to work with any type of log, not just SOS bundles.
Minor feature:
Troubleshooting slate: Sometimes I find it handy to have a sticky-notes-like feature to store text, correlation IDs, or anything else we want to jot down quickly without leaving the app. We can add all our notes to the slate, and they persist until we explicitly clear it. It's draggable, resizable, and minimizable. It's just a tiny utility for quick note-taking.
Log aggregation:
One of the major limitations of auto-analysis is that it takes time to go through all the patterns, and it only works on a single session. Recently, I noticed some tickets where we had 30 nodes and had to feed 30 SOS bundles to SOS Lab. The current state isn't fully usable for that: searching across nodes works fine, but with many nodes and logs the results are really hard to interpret and navigate because of the sheer volume.
I researched practices used in similar tools and noticed that a normal grep across all nodes combined with a log aggregation feature helps digest the logs. A large number of raw log lines is difficult to process, but if we aggregate them into patterns, it's easier to understand what's happening. For example, if we grep for "error," we get 10k records. If we pass these to log aggregation, we get approximately 20-30 patterns because most log lines are recurring types. This aggregation gives us a complete overview of the logs across all sessions at a glance. In contrast, auto-analysis is good at finding things in one session but is pretty static. Log aggregation provides the flexibility that auto-analysis lacks: we can customize it as we want.
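To illustrate the idea of collapsing recurring lines into patterns, here is a minimal Python sketch. It is not the code in this change: the masking rules and the names `to_pattern` and `aggregate` are assumptions made up for the example. It replaces variable tokens (UUIDs, IP addresses, hex values, numbers) with placeholders and counts how often each resulting pattern occurs.

```python
import re
from collections import Counter

# Hypothetical masking rules for this sketch. Order matters: the most
# specific rules run first so the generic number rule does not break them up.
_MASKS = [
    (re.compile(r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
                r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"), "<UUID>"),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def to_pattern(line):
    """Replace variable tokens in a log line with fixed placeholders."""
    pattern = line.strip()
    for regex, placeholder in _MASKS:
        pattern = regex.sub(placeholder, pattern)
    return pattern


def aggregate(lines):
    """Return (pattern, count) pairs, most frequent pattern first."""
    counts = Counter(to_pattern(line) for line in lines if line.strip())
    return counts.most_common()


if __name__ == "__main__":
    sample = [
        "error: timeout after 30s connecting to 10.0.0.1",
        "error: timeout after 45s connecting to 10.0.0.2",
        "error: job 123 failed",
    ]
    for pattern, count in aggregate(sample):
        print(count, pattern)
```

With rules like these, lines that differ only in their variable parts fall into the same pattern, which is how thousands of grep hits can shrink to a few dozen entries.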
Log tracer:
A small tool for tracing correlation IDs, similar to Kibana's correlation dashboard. We can search for a correlation ID to understand the timeline of a request. The difference between this tracer and a normal grep is that grep shows results in the order it finds them. That order isn't chronological because GitLab uses different log types (JSON, syslog, etc.), which aren't structured the same way and don't have consistent timestamps. The tracer goes through the results, reads the time of each one, and computes a logical timeline so that, no matter the type and structure of the logs, it shows results in chronological order. It also makes it easy to copy and paste results to feed to LLMs for analysis while troubleshooting.
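The ordering step can be sketched in Python as follows. This is only an illustration under stated assumptions, not the tracer's actual code: the function names, the JSON keys checked (`time`, `timestamp`, `@timestamp`), and the choice to treat timestamps without a zone as UTC are all hypothetical. Classic syslog timestamps without a year are not handled here and would need an extra rule.

```python
import json
import re
from datetime import datetime, timezone

# ISO-8601-like timestamp: date, time, optional fraction, optional zone.
_TS_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2})[T ](\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})?"
)
# Hypothetical list of JSON keys that may hold the event time.
_TIME_KEYS = ("time", "timestamp", "@timestamp")
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_timestamp(text):
    """Return the first timestamp found in text as an aware datetime, or None."""
    match = _TS_RE.search(text)
    if match is None:
        return None
    date, clock, frac, zone = match.groups()
    micros = (frac or "0")[:6].ljust(6, "0")
    # Assumption for this sketch: a missing zone means UTC.
    offset = "+00:00" if zone in (None, "Z") else zone
    return datetime.fromisoformat(f"{date}T{clock}.{micros}{offset}")


def extract_time(line):
    """Read the time from a JSON log line, falling back to a text search."""
    try:
        record = json.loads(line)
    except ValueError:
        record = None
    if isinstance(record, dict):
        for key in _TIME_KEYS:
            value = record.get(key)
            if isinstance(value, str):
                parsed = parse_timestamp(value)
                if parsed is not None:
                    return parsed
    return parse_timestamp(line)


def trace(lines, correlation_id):
    """Return (timestamp, line) pairs for matching lines, oldest first.

    Lines without a recognizable timestamp are kept and placed last.
    """
    hits = [line.rstrip("\n") for line in lines if correlation_id in line]
    stamped = [(extract_time(line), line) for line in hits]
    return sorted(stamped, key=lambda pair: (pair[0] is None, pair[0] or _EPOCH))
```

Normalizing every timestamp to a timezone-aware datetime before sorting is what lets JSON and plain-text lines be merged into a single timeline.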
Terminal:
No matter how many UI tools we have, sometimes having a terminal right there is very handy.
Apart from these features, there are some minor UI changes (tabs and styling) to make things look a bit nicer (hopefully).








