Investigate `workItemNotesByIid` slowness and potential optimisations

Current state

Current workItemNotesByIid GraphQL query

query workItemNotesByIid($fullPath: ID!, $iid: String!, $after: String, $pageSize: Int) {
  workspace: namespace(fullPath: $fullPath) {
    id
    workItem(iid: $iid) {
      id
      iid
      namespace {
        id
        __typename
      }
      widgets {
        ... on WorkItemWidgetNotes {
          type
          discussionLocked
          discussions(first: $pageSize, after: $after, filter: ALL_NOTES) {
            pageInfo {
              ...PageInfo
              __typename
            }
            nodes {
              id
              notes {
                nodes {
                  ...WorkItemNote
                  __typename
                }
                __typename
              }
              __typename
            }
            __typename
          }
          __typename
        }
        __typename
      }
      __typename
    }
    __typename
  }
}

fragment PageInfo on PageInfo {
  hasNextPage
  hasPreviousPage
  startCursor
  endCursor
  __typename
}

fragment WorkItemNote on Note {
  id
  body
  bodyHtml
  system
  internal
  systemNoteIconName
  createdAt
  lastEditedAt
  url
  authorIsContributor
  maxAccessLevelOfAuthor
  externalAuthor
  lastEditedBy {
    ...User
    webPath
    __typename
  }
  discussion {
    id
    resolved
    resolvable
    resolvedBy {
      id
      name
      __typename
    }
    __typename
  }
  author {
    ...User
    __typename
  }
  awardEmoji {
    nodes {
      ...AwardEmojiFragment
      __typename
    }
    __typename
  }
  userPermissions {
    adminNote
    awardEmoji
    readNote
    createNote
    resolveNote
    repositionNote
    __typename
  }
  systemNoteMetadata {
    id
    descriptionVersion {
      id
      diffPath
      deletePath
      canDelete
      deleted
      __typename
    }
    __typename
  }
  __typename
}

fragment User on User {
  id
  avatarUrl
  name
  username
  webUrl
  webPath
  __typename
}

fragment AwardEmojiFragment on AwardEmoji {
  name
  user {
    id
    name
    __typename
  }
  __typename
}

I've created an work item with ~3.3K notes grouped into ~200 discussions. Most of these notes are user comments, with almost no system notes. I've been running the above query in graphql-explorer by setting pagge size at 50 discussions rather than the default 20, to make it easier to spot any issues and also picked a random cursor, i.e. not getting first notes.

CPU Time Numbers

flamegraph: flamegraph-graphql-cpu-original-1.html

Running the above query through graphql-explorer takes ~6 seconds!
From flamegraph it is visible that:
- about 50%(~2.7s) of the time is spent in GC.
- about 40%(~2.2s) are spent in GitLab::Graphql::Present::FieldExtension#resolve
  - of which 1s(i.e about half) is spent in app/graphql/types/permission_types/base_permission_type.rb:32, i.e. the exposed permissions. Most impact in our case will be permissions exposed on the note itself as we are loading a bunch of those, in my test case that is ~800 notes for the 50 discussions.
- about 12%(~800ms) is spend in work_item_discussions_resolver
  - of which ~500ms is spent in computing readable_by? permission

Objects allocations

flamegraph: too large to upload, ~300MBs

~58K objects are being allocated
- work_item_discussions_resolver: ~14.7K
  - ApplicationRecord#readable_by?: 9.7K
- base_permission_type: ~18K
- Gitlab::Graphql::Present::FieldExtension#resolve: 44K
- Gitlab::Graphql::ConnectionRedaction#nodes: ~3K
- Notes::RenderService#execute: ~3.5K

Suggested FE side improvements

Looking at above stats we could gain some performance by:
- move the discussion fetched on each not to the parent discussion object, so that instead of computing this data on each note we compute it once per discussion.
- removing userPermissions or removing most of the items in the list and for the items left, only compute permissions for non-system notes. e.g.
  - readNote, awardEmoji - can be removed, returned notes from BE are already checked on read permission. Anyone who can read a not can react with an emoji to it.
  - createNote - this can be added to the work item level so that it is computed once per work item rather than for each note.
  - resolveNote - this can be added to the discussion level(or even work item level), so that this is computed once per discussion(or work item) rather than each note.
  - repositionNote - not sure what this is needed for? Is this for designs only? Can this be moved to Design object in that case or does it have to be on the work item ?
  - adminNote - this one we can keep, but it is probably only needed for user comments, AFAIK this is used to know if user can remove or edit a note. Not needed for system notes as we do not offer ability to delete system notes in the UI.

Updated workItemNotesByIid GraphQL query based on FE suggested improvements

query workItemNotesByIid($fullPath: ID!, $iid: String!, $after: String, $pageSize: Int) {
  workspace: namespace(fullPath: $fullPath) {
    id
    workItem(iid: $iid) {
      id
      iid
      namespace {
        id
        __typename
      }
      widgets {
        ... on WorkItemWidgetNotes {
          type
          discussionLocked
          discussions(first: $pageSize, after: $after, filter: ALL_NOTES) {
            pageInfo {
              ...PageInfo
              __typename
            }
            nodes {
              id
              notes {
                nodes {
                  ...WorkItemNote
                  __typename
                }
                __typename
              }
              __typename
            }
            __typename
          }
          __typename
        }
        __typename
      }
      __typename
    }
    __typename
  }
}

fragment PageInfo on PageInfo {
  hasNextPage
  hasPreviousPage
  startCursor
  endCursor
  __typename
}

fragment WorkItemNote on Note {
  id
  body
  bodyHtml
  system
  internal
  systemNoteIconName
  createdAt
  lastEditedAt
  url
  authorIsContributor
  maxAccessLevelOfAuthor
  externalAuthor
  lastEditedBy {
    ...User
    webPath
    __typename
  }
  discussion {
    id
    resolved
    resolvable
    resolvedBy {
      id
      name
      __typename
    }    
    __typename
  }
  author {
    ...User
    __typename
  }
  awardEmoji {
    nodes {
      ...AwardEmojiFragment
      __typename
    }
    __typename
  }
  userPermissions {
    adminNote
    awardEmoji
    readNote
    createNote
    __typename
  }
  systemNoteMetadata {
    id
    descriptionVersion {
      id
      diffPath
      deletePath
      canDelete
      deleted
      __typename
    }
    __typename
  }
  __typename
}

fragment User on User {
  id
  avatarUrl
  name
  username
  webUrl
  webPath
  __typename
}

fragment AwardEmojiFragment on AwardEmoji {
  name
  user {
    id
    name
    __typename
  }
  __typename
}

With the above notes, for ease of testing, I removed the discussion section from Note and added those fields to the parent discussion object instead.

CPU Time Numbers

flamegraph: flamegraph-graphql-cpu-with_fe_suggestions.html

Running the above query through graphql-explorer takes ~3s seconds!
From flamegraph it is visible that:
- about 50%(~2.7s) of the time is spent in GC.
- about 30%(1s) is spent in Gitlab::Graphql::Present::FieldExtension#resolve
- about 20%(~600ms) is spend in work_item_discussions_resolver
  - of which ~450ms is spent in computing readable_by? permission

Objects allocations

flamegraph: still too large to upload, ~165MBs

~35K objects are being allocated
- work_item_discussions_resolver: ~14.7K - the same
  - ApplicationRecord#readable_by?: 9.7K - the same
- base_permission_type: 0K
- Gitlab::Graphql::Present::FieldExtension#resolve: 24K - dropped from ~44K
- Gitlab::Graphql::ConnectionRedaction#nodes: ~3K - the same
- Notes::RenderService#execute: ~3.5K - the same

Suggested BE side improvements

On the BE side for this specific example, where we have a lot of user comments timings can be improve by removing the calls to readable_by? for user comments as those are already checked within can_read_issubale_notes? also internal notes are redacted within NotesFinder

So we only need to check if system note is readable, because those are being rendered from resource note events and can contain links to items a user may not have permision to read.

Another place where CPU seems to spend quite a bit of time is Gitlab::Graphql::Present::FieldExtension#resolve and as far as I understand that is related to the ability to use existing presenters to expose some of the fields on a model, we use present_using <PresenterClass> in graphql Type classes for that. This results in N+1 calls for this for collections being loaded and in our case for all the notes, plus this is also being used on AwardEmoji type as well, and for note's author(UserType) which in this case is perhaps not relevant as I do not have any emojis on notes, but that would just add more slowness if anything.

So I changed the code to only check readable_by? on system notes and copied(duplicated) the code from Presenters to the Type class itself and removed the present_using <PresenterClass> calls in Note, AwardEmoji and User types.

On top of that I replaced the computation of url note instead of going though ::Gitlab::UrlBuilder.build to compute work item url once, push it to the context and on note it just adds the anchor as a string: "#note_"

MR: Draft: Investigate `workItemNotesByIid` slownes... (!180985 - closed) • Alexandru Croitor

With those changes here are some numbers.

CPU Time Numbers

flamegraph: flamegraph-graphql-cpu-fe_and_be_changes.html

Running the above query through graphql-explorer takes ~1.7s seconds!
From flamegraph it is visible that:
- about 50%(~850ms) of the time is spent in GC.
- about 19%(330s) is spent in Gitlab::Graphql::Present::FieldExtension#resolve
- about 12%(~200ms) is spend in work_item_discussions_resolver
  - 0 in readable_by? as my example has zero system notes

Objects allocations

flamegraph: still too large to upload, ~90MBs

~20K objects are being allocated
- work_item_discussions_resolver: ~5K - dropped from ~14.7K
  - ApplicationRecord#readable_by?: 0 - dropped from ~9.7K
- Gitlab::Graphql::Present::FieldExtension#resolve: 9.5K - dropped from ~24K
- Gitlab::Graphql::ConnectionRedaction#nodes: ~3K - the same
- Notes::RenderService#execute: ~3.5K - the same

Conclusion

Conclusion is pretty obvious by now 😄 with the above changes we can gain ~3x improvements. Reduce object and implicitly memory allocation ~3x, improve timings ~3x

Insider information 😆 I've only attached 3 flamegraphs here, but I have been running dozens on dozens of these locally. Also the numbers will probably vary if we have a different mixture of system notes and user comments and different combiantion of threads and comments in threads.

Edited Oct 01, 2025 by 🤖 GitLab Bot 🤖

Investigate workItemNotesByIid slowness and potential optimisations

Current state

CPU Time Numbers

Objects allocations

Suggested FE side improvements

CPU Time Numbers

Objects allocations

Suggested BE side improvements

CPU Time Numbers

Objects allocations

Conclusion

Investigate `workItemNotesByIid` slowness and potential optimisations