Update additional context handling for Code Completions

Context

The AIGW Code Completions endpoint accepts additional contexts from the Language Server in the context payload, where payload.context is a list of CodeContextPayload. The Language Server orders these additional contexts by relevance.

Code Completion models usually have a maximum number of input tokens, which means that we cannot always send all of the additional contexts to the model. AIGW handles this by setting a maximum size for the additional contexts (see code). However, the way the additional context is currently trimmed (see code) can cut it off at an awkward place in the code.

Further Details

For example, we have these additional contexts:

# additional context 1 - assume this is in a file called `print_hello.py`
def print_hello(person)
    print(f"Hello {person.get_full_name()}")
end

# additional context 2 - assume this is in a file called `person.py`
class Person
    def full_name()
        return f"{firstname} {lastname}"
    end
end

This is the actual file that the user is working on:

def execute
    person = Person()
    <cursor here>
end

Without input limits, these additional contexts are combined into a single text and placed above the content_above_cursor, and the whole thing becomes the prefix:

def print_hello(person)
    print(f"Hello {person.get_full_name()}")
end
class Person
    def full_name()
        return f"{firstname} {lastname}"
    end
end
def execute
    person = Person()
    # cut off at the cursor, this is ok

When the additional context length is limited, it can get trimmed so that the prefix sent to the AI model is:

def print_hello(person)
    print(f"Hello {person.get_full_name()}")
end
class Person
    def full_name()
        return f"{firstname} # the additional context that is added to the 'top' of the prefix is cut off at an awkward place
def execute
    person = Person()
    # cut off at the cursor, this is ok

Proposal

Update the logic for trimming the additional context so that it trims by whole additional context element: each element is either included in full or dropped entirely. For the example above, the actual prefix sent to the AI model should be:

def print_hello(person)
    print(f"Hello {person.get_full_name()}")
end
# we are cutting off the whole `Person` class since adding it does not fit within the max allowed length
def execute
    person = Person()
    # cut off at the cursor, this is ok
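
The difference between the two trimming strategies can be sketched as follows. This is only an illustration, not the AIGW implementation: the function names are hypothetical, the limit is assumed to be measured in characters, and contexts are assumed to be joined with a newline. It also assumes that trimming stops at the first context that does not fit, which the proposal does not specify.

```python
from typing import List


def trim_by_characters(contexts: List[str], max_chars: int) -> str:
    """Join all contexts, then cut the text at the limit.

    Mimics the problematic behaviour: the cut can land mid-statement.
    """
    return "\n".join(contexts)[:max_chars]


def trim_by_whole_context(contexts: List[str], max_chars: int) -> str:
    """Keep contexts in their given (relevance) order, whole or not at all.

    Assumption: stop at the first context that does not fit, so a less
    relevant context never replaces a more relevant one.
    """
    kept: List[str] = []
    used = 0
    for content in contexts:
        # Account for the newline separator between kept contexts.
        cost = len(content) + (1 if kept else 0)
        if used + cost > max_chars:
            break
        kept.append(content)
        used += cost
    return "\n".join(kept)


# Hypothetical usage with the two additional contexts from the example.
print_hello = 'def print_hello(person)\n    print(f"Hello {person.get_full_name()}")\nend'
person = 'class Person\n    def full_name()\n        return f"{firstname} {lastname}"\n    end\nend'

limit = len(print_hello) + 10  # room for the first context plus a few characters
awkward = trim_by_characters([print_hello, person], limit)
clean = trim_by_whole_context([print_hello, person], limit)
```

Here `awkward` ends partway through the `class Person` line, while `clean` contains only the complete `print_hello` context. Replacing `break` with `continue` would instead skip an oversized context and try the remaining, less relevant ones. Whether that is preferable is an open design choice.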

Links / References

Related spike issue: Research: Additional contexts and Import context (gitlab-org/gitlab#503839 - closed)
- see details in: gitlab-org/gitlab#503839 (comment 2266952045) & gitlab-org/gitlab#503839 (comment 2266985711)

Update [2025-01-15]

We're pausing on this change as discussed in #774 (comment 2298688848). We'll need to evaluate whether this change is necessary and beneficial.

Edited Jan 15, 2025 by Leaminn Ma