Investigate workItemNotesByIid
slowness and potential optimisations
Current state
Current workItemNotesByIid GraphQL query
query workItemNotesByIid($fullPath: ID!, $iid: String!, $after: String, $pageSize: Int) {
workspace: namespace(fullPath: $fullPath) {
id
workItem(iid: $iid) {
id
iid
namespace {
id
__typename
}
widgets {
... on WorkItemWidgetNotes {
type
discussionLocked
discussions(first: $pageSize, after: $after, filter: ALL_NOTES) {
pageInfo {
...PageInfo
__typename
}
nodes {
id
notes {
nodes {
...WorkItemNote
__typename
}
__typename
}
__typename
}
__typename
}
__typename
}
__typename
}
__typename
}
__typename
}
}
fragment PageInfo on PageInfo {
hasNextPage
hasPreviousPage
startCursor
endCursor
__typename
}
fragment WorkItemNote on Note {
id
body
bodyHtml
system
internal
systemNoteIconName
createdAt
lastEditedAt
url
authorIsContributor
maxAccessLevelOfAuthor
externalAuthor
lastEditedBy {
...User
webPath
__typename
}
discussion {
id
resolved
resolvable
resolvedBy {
id
name
__typename
}
__typename
}
author {
...User
__typename
}
awardEmoji {
nodes {
...AwardEmojiFragment
__typename
}
__typename
}
userPermissions {
adminNote
awardEmoji
readNote
createNote
resolveNote
repositionNote
__typename
}
systemNoteMetadata {
id
descriptionVersion {
id
diffPath
deletePath
canDelete
deleted
__typename
}
__typename
}
__typename
}
fragment User on User {
id
avatarUrl
name
username
webUrl
webPath
__typename
}
fragment AwardEmojiFragment on AwardEmoji {
name
user {
id
name
__typename
}
__typename
}
I've created an work item with ~3.3K notes grouped into ~200 discussions. Most of these notes are user comments, with almost no system notes. I've been running the above query in graphql-explorer
by setting pagge size at 50 discussions rather than the default 20, to make it easier to spot any issues and also picked a random cursor, i.e. not getting first notes.
CPU Time Numbers
flamegraph: flamegraph-graphql-cpu-original-1.html
- Running the above query through
graphql-explorer
takes ~6 seconds! - From
flamegraph
it is visible that:- about 50%(~2.7s) of the time is spent in GC.
- about 40%(~2.2s) are spent in
GitLab::Graphql::Present::FieldExtension#resolve
- of which 1s(i.e about half) is spent in
app/graphql/types/permission_types/base_permission_type.rb:32
, i.e. the exposed permissions. Most impact in our case will be permissions exposed on thenote
itself as we are loading a bunch of those, in my test case that is ~800 notes for the 50 discussions.
- of which 1s(i.e about half) is spent in
- about 12%(~800ms) is spend in
work_item_discussions_resolver
- of which ~500ms is spent in computing
readable_by?
permission
- of which ~500ms is spent in computing
Objects allocations
flamegraph: too large to upload, ~300MBs
- ~58K objects are being allocated
- work_item_discussions_resolver: ~14.7K
- ApplicationRecord#readable_by?: 9.7K
- base_permission_type: ~18K
- Gitlab::Graphql::Present::FieldExtension#resolve: 44K
- Gitlab::Graphql::ConnectionRedaction#nodes: ~3K
- Notes::RenderService#execute: ~3.5K
- work_item_discussions_resolver: ~14.7K
Suggested FE side improvements
- Looking at above stats we could gain some performance by:
-
move the
discussion
fetched on each not to the parentdiscussion
object, so that instead of computing this data on each note we compute it once per discussion. -
removing
userPermissions
or removing most of the items in the list and for the items left, only compute permissions for non-system notes. e.g.-
readNote
,awardEmoji
- can be removed, returned notes from BE are already checked on read permission. Anyone who can read a not can react with an emoji to it. -
createNote
- this can be added to thework item
level so that it is computed once per work item rather than for each note. -
resolveNote
- this can be added to thediscussion
level(or evenwork item
level), so that this is computed once per discussion(or work item) rather than each note. -
repositionNote
- not sure what this is needed for? Is this for designs only? Can this be moved to Design object in that case or does it have to be on the work item ? -
adminNote
- this one we can keep, but it is probably only needed for user comments, AFAIK this is used to know if user can remove or edit a note. Not needed for system notes as we do not offer ability to delete system notes in the UI.
-
-
Updated workItemNotesByIid GraphQL query based on FE suggested improvements
query workItemNotesByIid($fullPath: ID!, $iid: String!, $after: String, $pageSize: Int) {
workspace: namespace(fullPath: $fullPath) {
id
workItem(iid: $iid) {
id
iid
namespace {
id
__typename
}
widgets {
... on WorkItemWidgetNotes {
type
discussionLocked
discussions(first: $pageSize, after: $after, filter: ALL_NOTES) {
pageInfo {
...PageInfo
__typename
}
nodes {
id
notes {
nodes {
...WorkItemNote
__typename
}
__typename
}
__typename
}
__typename
}
__typename
}
__typename
}
__typename
}
__typename
}
}
fragment PageInfo on PageInfo {
hasNextPage
hasPreviousPage
startCursor
endCursor
__typename
}
fragment WorkItemNote on Note {
id
body
bodyHtml
system
internal
systemNoteIconName
createdAt
lastEditedAt
url
authorIsContributor
maxAccessLevelOfAuthor
externalAuthor
lastEditedBy {
...User
webPath
__typename
}
discussion {
id
resolved
resolvable
resolvedBy {
id
name
__typename
}
__typename
}
author {
...User
__typename
}
awardEmoji {
nodes {
...AwardEmojiFragment
__typename
}
__typename
}
userPermissions {
adminNote
awardEmoji
readNote
createNote
__typename
}
systemNoteMetadata {
id
descriptionVersion {
id
diffPath
deletePath
canDelete
deleted
__typename
}
__typename
}
__typename
}
fragment User on User {
id
avatarUrl
name
username
webUrl
webPath
__typename
}
fragment AwardEmojiFragment on AwardEmoji {
name
user {
id
name
__typename
}
__typename
}
With the above notes, for ease of testing, I removed the discussion
section from Note and added those fields to the parent discussion object instead.
CPU Time Numbers
flamegraph: flamegraph-graphql-cpu-with_fe_suggestions.html
- Running the above query through
graphql-explorer
takes ~3s seconds! - From
flamegraph
it is visible that:- about 50%(~2.7s) of the time is spent in GC.
- about 30%(1s) is spent in
Gitlab::Graphql::Present::FieldExtension#resolve
- about 20%(~600ms) is spend in
work_item_discussions_resolver
- of which ~450ms is spent in computing
readable_by?
permission
- of which ~450ms is spent in computing
Objects allocations
flamegraph: still too large to upload, ~165MBs
- ~35K objects are being allocated
- work_item_discussions_resolver: ~14.7K - the same
- ApplicationRecord#readable_by?: 9.7K - the same
- base_permission_type: 0K
- Gitlab::Graphql::Present::FieldExtension#resolve: 24K - dropped from ~44K
- Gitlab::Graphql::ConnectionRedaction#nodes: ~3K - the same
- Notes::RenderService#execute: ~3.5K - the same
- work_item_discussions_resolver: ~14.7K - the same
Suggested BE side improvements
On the BE side for this specific example, where we have a lot of user comments timings can be improve by removing the calls to readable_by?
for user comments as those are already checked within can_read_issubale_notes?
also internal notes are redacted within NotesFinder
So we only need to check if system note is readable, because those are being rendered from resource note events and can contain links to items a user may not have permision to read.
Another place where CPU seems to spend quite a bit of time is Gitlab::Graphql::Present::FieldExtension#resolve
and as far as I understand that is related to the ability to use existing presenters to expose some of the fields on a model, we use present_using <PresenterClass>
in graphql Type classes for that. This results in N+1 calls for this for collections being loaded and in our case for all the notes, plus this is also being used on AwardEmoji type as well, and for note's author(UserType) which in this case is perhaps not relevant as I do not have any emojis on notes, but that would just add more slowness if anything.
So I changed the code to only check readable_by?
on system notes and copied(duplicated) the code from Presenters to the Type class itself and removed the present_using <PresenterClass>
calls in Note, AwardEmoji and User types.
On top of that I replaced the computation of url
note instead of going though ::Gitlab::UrlBuilder.build
to compute work item url once, push it to the context and on note it just adds the anchor as a string: "#note_"
MR: Draft: Investigate `workItemNotesByIid` slownes... (!180985 - closed) • Alexandru Croitor
With those changes here are some numbers.
CPU Time Numbers
flamegraph: flamegraph-graphql-cpu-fe_and_be_changes.html
- Running the above query through
graphql-explorer
takes ~1.7s seconds! - From
flamegraph
it is visible that:- about 50%(~850ms) of the time is spent in GC.
- about 19%(330s) is spent in
Gitlab::Graphql::Present::FieldExtension#resolve
- about 12%(~200ms) is spend in
work_item_discussions_resolver
- 0 in
readable_by?
as my example has zero system notes
- 0 in
Objects allocations
flamegraph: still too large to upload, ~90MBs
- ~20K objects are being allocated
- work_item_discussions_resolver: ~5K - dropped from ~14.7K
- ApplicationRecord#readable_by?: 0 - dropped from ~9.7K
- Gitlab::Graphql::Present::FieldExtension#resolve: 9.5K - dropped from ~24K
- Gitlab::Graphql::ConnectionRedaction#nodes: ~3K - the same
- Notes::RenderService#execute: ~3.5K - the same
- work_item_discussions_resolver: ~5K - dropped from ~14.7K
Conclusion
Conclusion is pretty obvious by now
Insider information