chore: include more stack details in logs
What does this merge request do and why?
In cases like !3780 (merged)#note_2865068440 we have had trouble
tracking down the source of exceptions in our logs. The log_exception
method is called all over the codebase, which makes it difficult to tell
where an exception was caught and logged from.
This MR adds stack_info=True to the logging call, as documented in
https://docs.python.org/3/library/logging.html. With this option the log
entry includes the stack trace of the caller of the log method, not just
the stack trace of the exception itself.
This will hopefully help us narrow down issues.
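As a minimal sketch of the idea, using only the standard library: the `log_exception` helper below is a hypothetical stand-in for the project's helper (the real signature and logger setup may differ), and `handle_failure` is an invented call site. It shows that `exc_info` and `stack_info=True` produce two separate sections in the log output.

```python
import io
import logging


def log_exception(log: logging.Logger, exc: BaseException) -> None:
    # Hypothetical stand-in for the project's helper; the real one may differ.
    # exc_info attaches the exception's own traceback (where it was raised).
    # stack_info=True also attaches the call stack leading to this log call.
    log.error("Exception caught", exc_info=exc, stack_info=True)


def handle_failure(log: logging.Logger) -> None:
    # Hypothetical call site that catches an exception and logs it.
    try:
        raise ValueError("boom")
    except ValueError as exc:
        log_exception(log, exc)


# Capture the formatted log output in memory so it can be inspected.
buffer = io.StringIO()
logger = logging.getLogger("stack_info_demo")
logger.addHandler(logging.StreamHandler(buffer))
logger.propagate = False

handle_failure(logger)
output = buffer.getvalue()
print(output)
```

The printed output contains a `Traceback (most recent call last):` section for the exception and a `Stack (most recent call last):` section whose last frames are the caller chain of the log call.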
How to set up and validate locally
- Trigger a code path that causes an exception. For example, edit the code like this:
    diff --git a/duo_workflow_service/checkpointer/notifier.py b/duo_workflow_service/checkpointer/notifier.py
    index cd5c795f..9fb9def7 100644
    --- a/duo_workflow_service/checkpointer/notifier.py
    +++ b/duo_workflow_service/checkpointer/notifier.py
    @@ -58,7 +58,7 @@ class UserInterface:
             action = contract_pb2.Action(
                 newCheckpoint=contract_pb2.NewCheckpoint(
    -                goal=self.goal,
    +                goal=self.goal + "a" * 5 * 1024 * 1024,
                     status=WORKFLOW_STATUS_TO_CHECKPOINT_STATUS[self.status],
                     checkpoint=dumps(
                         {
- Run a flow.
- Examine the logs. The two fields, exception and stack, look like this:
exception
Traceback (most recent call last):
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/.venv/lib/python3.12/site-packages/langgraph/pregel/__init__.py", line 2655, in astream
async for _ in runner.atick(
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/.venv/lib/python3.12/site-packages/langgraph/pregel/runner.py", line 367, in atick
done, inflight = await asyncio.wait(
^^^^^^^^^^^^^^^^^^^
File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/tasks.py", line 464, in wait
return await _wait(fs, timeout, return_when, loop)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/tasks.py", line 550, in _wait
await waiter
asyncio.exceptions.CancelledError: Client-side streaming has been closed.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/abstract_workflow.py", line 289, in _compile_and_run_graph
async for type, state in compiled_graph.astream(
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/.venv/lib/python3.12/site-packages/langgraph/pregel/__init__.py", line 2596, in astream
async with AsyncPregelLoop(
^^^^^^^^^^^^^^^^
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/.venv/lib/python3.12/site-packages/langgraph/pregel/loop.py", line 1393, in __aexit__
return await exit_task
^^^^^^^^^^^^^^^
File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/contextlib.py", line 754, in __aexit__
raise exc_details[1]
File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/contextlib.py", line 737, in __aexit__
cb_suppress = await cb(*exc_details)
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/.venv/lib/python3.12/site-packages/langgraph/pregel/executor.py", line 200, in __aexit__
await asyncio.wait(tasks)
File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/tasks.py", line 464, in wait
return await _wait(fs, timeout, return_when, loop)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/tasks.py", line 550, in _wait
await waiter
asyncio.exceptions.CancelledError: ('Terminated workflow 705 execution due to an GeneratorExit: ', <Task cancelled name='Task-15' coro=<AsyncExitStack.__aexit__() done, defined at /Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/contextlib.py:707>>)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/abstract_workflow.py", line 264, in _compile_and_run_graph
async with GitLabWorkflow(
^^^^^^^^^^^^^^^
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/checkpointer/gitlab_workflow.py", line 441, in __aexit__
await self._update_workflow_status_safely(status)
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/checkpointer/gitlab_workflow.py", line 562, in _update_workflow_status_safely
await self._update_workflow_status(status)
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/checkpointer/gitlab_workflow.py", line 213, in _update_workflow_status
await self._status_handler.update_workflow_status(self._workflow_id, status)
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/status_updater/gitlab_status_updater.py", line 60, in update_workflow_status
result = await self._client.apatch(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/gitlab/http_client.py", line 100, in apatch
return await self._call(
^^^^^^^^^^^^^^^^^
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/gitlab/executor_http_client.py", line 42, in _call
action_response = await _execute_action_and_get_action_response(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/executor/action.py", line 39, in _execute_action_and_get_action_response
event: contract_pb2.ClientEvent = await outbox.put_action_and_wait_for_response(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/executor/outbox.py", line 53, in put_action_and_wait_for_response
return await result
^^^^^^^^^^^^
asyncio.exceptions.CancelledError
stack
Stack (most recent call last):
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/.venv/bin/duo-workflow-service", line 6, in <module>
sys.exit(run_app())
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/server.py", line 823, in run_app
run(get_config())
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/server.py", line 819, in run
asyncio.get_event_loop().run_until_complete(serve(port))
File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/base_events.py", line 678, in run_until_complete
self.run_forever()
File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/base_events.py", line 645, in run_forever
self._run_once()
File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/base_events.py", line 1999, in _run_once
handle._run()
File "/Users/dylangriffith/.local/share/mise/installs/python/3.12.12/lib/python3.12/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/abstract_workflow.py", line 304, in _compile_and_run_graph
await self._handle_workflow_failure(e, compiled_graph, graph_config)
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/chat/workflow.py", line 320, in _handle_workflow_failure
log_exception(
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/tracking/errors.py", line 27, in log_exception
log.error(
The important detail is in the last few lines of the stack field:
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/abstract_workflow.py", line 304, in _compile_and_run_graph
await self._handle_workflow_failure(e, compiled_graph, graph_config)
File "/Users/dylangriffith/workspace/gdk/gitlab-ai-gateway/duo_workflow_service/workflows/chat/workflow.py", line 320, in _handle_workflow_failure
This tells us that the exception was caught and logged at https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/blob/a5ca79ac941bbb45c87ac14f2e218f04ec321300/duo_workflow_service/workflows/abstract_workflow.py#L301-L304. Without the stack field, you cannot infer this from the exception traceback alone.
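The difference can be reproduced in isolation. The sketch below is illustrative only: `do_work`, `run`, `handle_failure`, and this `log_exception` are hypothetical names, not the project's code. It passes a caught exception to a separate handler function, then compares the frames in the exception traceback with the text captured by `stack_info=True`.

```python
import logging
import traceback

records = []


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be inspected afterwards."""

    def emit(self, record: logging.LogRecord) -> None:
        records.append(record)


logger = logging.getLogger("catch_site_demo")
logger.addHandler(ListHandler())
logger.propagate = False


def log_exception(exc: BaseException) -> None:
    # Hypothetical shared helper, called from many places.
    logger.error("failed", exc_info=exc, stack_info=True)


def handle_failure(exc: BaseException) -> None:
    # Hypothetical handler that decides to log the exception.
    log_exception(exc)


def do_work() -> None:
    raise RuntimeError("boom")


def run() -> None:
    try:
        do_work()
    except RuntimeError as exc:
        handle_failure(exc)


run()
record = records[0]

# Frames recorded on the exception: only the path from the try block to the raise.
exception_frames = [frame.name for frame in traceback.extract_tb(record.exc_info[2])]
# Stack captured at the logging call: includes the handler that logged it.
stack_text = record.stack_info

print("exception frames:", exception_frames)
print(stack_text)
```

Here the exception traceback never mentions `handle_failure`, because that function was not on the stack when the exception was raised. The captured stack text does include it, which is the same kind of information the stack field adds in the logs above.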
Merge request checklist
- Tests added for new functionality. If not, please raise an issue to follow up.
- Documentation added/updated, if needed.
- If this change requires executor implementation: verified that issues/MRs exist for both Go executor and Node executor or confirmed that changes are backward-compatible and don't break existing executor functionality.