AI-related Incidents - Trends and Action Items
Context
We have had a lot of incidents in the AI teams recently. None of them have qualified for FCLs, and they do not seem to be slowing down. The intent behind this issue is to identify additional data and action items that we can begin driving forward.
Incidents
Trends
- Debugging through the stack, identifying the root cause, and determining team ownership are all difficult and unclear
- Caching, state changes, and environment mismatches ("it worked locally")
- The incidents all concern customers' ability to use the AI features they pay for (licensing, access, etc.), as opposed to problems with an actual LLM or the feature itself
Action Items
Edited by Michelle Gill