chore: include more stack details in logs

What does this merge request do and why?

We have issues like !3780 (merged)#note_2865068440 where we have trouble tracking down the source of exceptions in our logs. The log_exception method is called all over the codebase, which makes it difficult to tell where an exception was caught and logged from.

This MR passes stack_info=True to the logging call, as documented in https://docs.python.org/3/library/logging.html, so that the log entry includes the call stack leading up to the logging call and not only the traceback of the exception itself.

This will hopefully help us narrow down issues.
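
To illustrate the difference between the two fields, here is a minimal sketch using only the standard library logging module. The function names (make_logger, failing, handle_failure, run_demo) and the body of log_exception are hypothetical stand-ins and are not the code changed in this MR; the real helper may use a different logger setup.

```python
import io
import logging


def make_logger(stream):
    """Build an isolated logger that writes to the given stream."""
    logger = logging.getLogger("stack_info_demo")
    logger.setLevel(logging.ERROR)
    logger.handlers.clear()
    logger.propagate = False
    logger.addHandler(logging.StreamHandler(stream))
    return logger


def log_exception(logger, exc):
    # Hypothetical stand-in for the shared helper.
    # exc_info carries the traceback of the exception itself;
    # stack_info=True adds the call stack that led to this logging call.
    logger.error("caught: %s", exc, exc_info=exc, stack_info=True)


def failing():
    raise ValueError("boom")


def handle_failure(logger):
    try:
        failing()
    except ValueError as exc:
        log_exception(logger, exc)


def run_demo():
    """Return the formatted log output as a string."""
    stream = io.StringIO()
    handle_failure(make_logger(stream))
    return stream.getvalue()


if __name__ == "__main__":
    print(run_demo())
```

In the output, the "Traceback" part ends in failing (where the exception was raised), while the "Stack" part ends in handle_failure and log_exception (where it was caught and logged).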

How to set up and validate locally

  1. Trigger a code path that causes an exception. For example, edit the code like this:
    diff --git a/duo_workflow_service/checkpointer/notifier.py b/duo_workflow_service/checkpointer/notifier.py
    index cd5c795f..9fb9def7 100644
    --- a/duo_workflow_service/checkpointer/notifier.py
    +++ b/duo_workflow_service/checkpointer/notifier.py
    @@ -58,7 +58,7 @@ class UserInterface:
    
             action = contract_pb2.Action(
                 newCheckpoint=contract_pb2.NewCheckpoint(
    -                goal=self.goal,
    +                goal=self.goal + "a" * 5 * 1024 * 1024,
                     status=WORKFLOW_STATUS_TO_CHECKPOINT_STATUS[self.status],
                     checkpoint=dumps(
                         {
  2. Run a flow
  3. Examine the logs. You will see two different fields, exception and stack, whose output looks like:
exception
Traceback (most recent call last):
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/.venv/lib/python3.12/site-packages/langgraph/pregel/__init__.py", line 2655, in astream
    async for _ in runner.atick(
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/.venv/lib/python3.12/site-packages/langgraph/pregel/runner.py", line 367, in atick
    done, inflight = await asyncio.wait(
                     ^^^^^^^^^^^^^^^^^^^
  File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/tasks.py", line 464, in wait
    return await _wait(fs, timeout, return_when, loop)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/tasks.py", line 550, in _wait
    await waiter
asyncio.exceptions.CancelledError: Client-side streaming has been closed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/abstract_workflow.py", line 289, in _compile_and_run_graph
    async for type, state in compiled_graph.astream(
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/.venv/lib/python3.12/site-packages/langgraph/pregel/__init__.py", line 2596, in astream
    async with AsyncPregelLoop(
               ^^^^^^^^^^^^^^^^
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/.venv/lib/python3.12/site-packages/langgraph/pregel/loop.py", line 1393, in __aexit__
    return await exit_task
           ^^^^^^^^^^^^^^^
  File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/contextlib.py", line 754, in __aexit__
    raise exc_details[1]
  File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/contextlib.py", line 737, in __aexit__
    cb_suppress = await cb(*exc_details)
                  ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/.venv/lib/python3.12/site-packages/langgraph/pregel/executor.py", line 200, in __aexit__
    await asyncio.wait(tasks)
  File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/tasks.py", line 464, in wait
    return await _wait(fs, timeout, return_when, loop)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/tasks.py", line 550, in _wait
    await waiter
asyncio.exceptions.CancelledError: ('Terminated workflow 705 execution due to an GeneratorExit: ', <Task cancelled name='Task-15' coro=<AsyncExitStack.__aexit__() done, defined at /Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/contextlib.py:707>>)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/abstract_workflow.py", line 264, in _compile_and_run_graph
    async with GitLabWorkflow(
               ^^^^^^^^^^^^^^^
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/checkpointer/gitlab_workflow.py", line 441, in __aexit__
    await self._update_workflow_status_safely(status)
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/checkpointer/gitlab_workflow.py", line 562, in _update_workflow_status_safely
    await self._update_workflow_status(status)
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/checkpointer/gitlab_workflow.py", line 213, in _update_workflow_status
    await self._status_handler.update_workflow_status(self._workflow_id, status)
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/status_updater/gitlab_status_updater.py", line 60, in update_workflow_status
    result = await self._client.apatch(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/gitlab/http_client.py", line 100, in apatch
    return await self._call(
           ^^^^^^^^^^^^^^^^^
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/gitlab/executor_http_client.py", line 42, in _call
    action_response = await _execute_action_and_get_action_response(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/executor/action.py", line 39, in _execute_action_and_get_action_response
    event: contract_pb2.ClientEvent = await outbox.put_action_and_wait_for_response(
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/executor/outbox.py", line 53, in put_action_and_wait_for_response
    return await result
           ^^^^^^^^^^^^
asyncio.exceptions.CancelledError
stack
Stack (most recent call last):
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/.venv/bin/duo-workflow-service", line 6, in <module>
    sys.exit(run_app())
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/server.py", line 823, in run_app
    run(get_config())
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/server.py", line 819, in run
    asyncio.get_event_loop().run_until_complete(serve(port))
  File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/base_events.py", line 678, in run_until_complete
    self.run_forever()
  File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/base_events.py", line 645, in run_forever
    self._run_once()
  File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/base_events.py", line 1999, in _run_once
    handle._run()
  File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/abstract_workflow.py", line 304, in _compile_and_run_graph
    await self._handle_workflow_failure(e, compiled_graph, graph_config)
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/chat/workflow.py", line 320, in _handle_workflow_failure
    log_exception(
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/tracking/errors.py", line 27, in log_exception
    log.error(

The important detail is in the last few lines of the stack field:

  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/abstract_workflow.py", line 304, in _compile_and_run_graph
    await self._handle_workflow_failure(e, compiled_graph, graph_config)
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/chat/workflow.py", line 320, in _handle_workflow_failure

This tells us that the exception was caught and logged at https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/blob/a5ca79ac941bbb45c87ac14f2e218f04ec321300/duo_workflow_service/workflows/abstract_workflow.py#L301-L304. Without this stack, you cannot infer that from the exception itself.
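
When scanning many log entries, the same reading of the stack field could be automated. The following is a rough sketch, not part of this MR: frames and logging_call_site are hypothetical helpers, and the sample text is copied from the last frames of the stack output above. It assumes the stack field uses the standard "File ..., line ..., in ..." layout shown there.

```python
import re

# Matches frame header lines such as:
#   File "/path/to/file.py", line 12, in some_function
FRAME_RE = re.compile(
    r'^\s*File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)\s*$'
)

# Last frames of the stack field shown in this description.
SAMPLE_STACK = """Stack (most recent call last):
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/abstract_workflow.py", line 304, in _compile_and_run_graph
    await self._handle_workflow_failure(e, compiled_graph, graph_config)
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/chat/workflow.py", line 320, in _handle_workflow_failure
    log_exception(
  File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/tracking/errors.py", line 27, in log_exception
    log.error(
"""


def frames(stack_text):
    """Return (path, line, function) tuples in call order."""
    result = []
    for row in stack_text.splitlines():
        match = FRAME_RE.match(row)
        if match:
            result.append((match["path"], int(match["line"]), match["func"]))
    return result


def logging_call_site(stack_text, helper="log_exception"):
    """Return the frame that called `helper`, or None if it is absent."""
    found = frames(stack_text)
    for index, (_, _, func) in enumerate(found):
        if func == helper and index > 0:
            return found[index - 1]
    return None


if __name__ == "__main__":
    print(logging_call_site(SAMPLE_STACK))
```

For the sample above, the returned frame is the one in workflows/chat/workflow.py at line 320, in _handle_workflow_failure.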

Merge request checklist

  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
  • If this change requires executor implementation: verified that issues/MRs exist for both Go executor and Node executor or confirmed that changes are backward-compatible and don't break existing executor functionality.
Edited by Dylan Griffith
