SPIKE Proposal: Code Suggestions Context using the Abstract Syntax Tree
Experiment Overview
Stemming from the ideas presented in issue #440646 (closed), we'd like to propose a feature that can greatly enhance the developer experience when working with GitLab Duo Code Suggestions.
This issue will serve as a starting ground for ideation, experimentation, and technical findings. We will also attach some proof of concepts for stakeholders to explore.
User Problem - a 10,000-foot View
When it comes to improving the user experience of Code Suggestions, context is key. Because LLMs produce output based on a context window, the more relevant context we supply, the more accurate our code suggestions become.
How accurate are our code suggestions? Let's find out!
A Simple Day-in-the-Life Example
Currently, accuracy depends on the file the user is working on. Let's check out the following `helloWorld` code a user named "Bob" wrote:

```typescript
// current_file.ts
import { createLogger } from './logger.ts'

export function helloWorld(name = 'bob') {
  const logger = createLogger()
  const message = `Hello, ${name}!`
  logger.log(message)
  return message
}
```
```typescript
// current_file.test.ts
import { expect } from '@jest/globals';
import { helloWorld } from './current_file.ts'

describe('helloWorld', () => {
  it('should return a message', () => {
    // Suggest some test code below
    // ??? (AI makes up some random code here)
  })
})
```
Imagine Bob wanted to write some unit tests via Jest for `current_file.ts`. He creates a new file called `current_file.test.ts`. How far would GitLab Duo Code Suggestions get him? Because code suggestions are limited to the working file, several key pieces of context are missing when Bob's editor talks to GitLab Duo (i.e., the Model Gateway). In this case, any new code suggestions would be completely unaware of the code in `current_file.ts`.
Current Developer Workflow
Now let's say Bob wanted to generate some unit tests based on the code above in his file `current_file.test.ts`. He'd have to do the following:
- Copy and paste the relevant code from `current_file.ts` into the GitLab Duo Chat UI
- Find and paste other relevant code from any other files into the GitLab Duo Chat UI
  - This is so that the AI can know what each function does and returns
- Only after the AI has this context can it provide good code suggestions, such as a Jest mock for the `createLogger` function
- Finally, copy the generated code manually into a new test file, `current_file.test.ts`
The generated code would look something like:

```typescript
// current_file.test.ts
import { expect, jest } from '@jest/globals';
import { helloWorld } from './current_file.ts'

describe('helloWorld', () => {
  it('should return a message', () => {
    jest.mock('./logger.ts', () => {
      return {
        createLogger: () => {
          return {
            log: jest.fn()
          }
        }
      }
    })

    const message = helloWorld()
    expect(message).toBe('Hello, bob!')
  })
})
```
This is a lot of manual work! Imagine you are working in a world with a much larger codebase and code requiring more context. You can see how this can add up over time.
Developers can only move as fast as the tools they use. If we can improve the accuracy of our code suggestions, we can improve the speed at which developers can write meaningful code. Additionally, developers can feel confident AI is assisting them and not getting in the way.
Proposal
We propose enhancing the accuracy of our code suggestions by considering the context of the code the user is working on. Context can be gathered through the use of Abstract Syntax Trees (ASTs), the Model Gateway, and the user's local files. We will leverage the already existing Tree Sitter Parser in our Language Server to facilitate this.
By leveraging ASTs, we can build multiple features that can enhance the accuracy of our code suggestions. Here are a few examples:
- We can resolve the relative import modules in the current file and line and provide code suggestions based on the imported code.
- We can resolve the function calls in the current file and provide code suggestions based on the function signature and return type.
- We can resolve the imported dependencies in the current file and provide code suggestions based on the dependency being used
And so much more. To keep the scope limited and intentional, we will focus on the first idea in this list and build a POC for stakeholders to explore.
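As a sketch of that first idea, here is a simplified, hypothetical helper that collects the relative import specifiers from a source string. A real implementation would walk the Tree Sitter AST rather than use a regex; this only illustrates the shape of the data we want to gather:

```typescript
// Hypothetical sketch: collect relative import specifiers (those starting
// with '.') from TypeScript/JavaScript source. A real implementation would
// query the Tree Sitter AST for import nodes instead of using a regex.
function extractRelativeImports(source: string): string[] {
  const importPattern = /import\s+[^'"]*['"](\.[^'"]+)['"]/g
  const specifiers: string[] = []
  let match: RegExpExecArray | null
  while ((match = importPattern.exec(source)) !== null) {
    specifiers.push(match[1])
  }
  return specifiers
}

const example = `
import { createLogger } from './logger.ts'
import { expect } from '@jest/globals'
`
console.log(extractRelativeImports(example)) // only './logger.ts' is relative
```

Package imports such as `@jest/globals` are skipped on purpose, since only relative imports can be resolved to local files in the user's workspace.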
Proposal Differences From #440646 (closed)
#440646 (closed) is a great starting point for this conversation. The key point is that there are multiple ways to improve the accuracy of our code suggestions.
#440646 (closed) proposes enhancing code suggestions by considering information from multiple files. The proposal suggests a new API to store context from related files, identified by events like opening a new file or the expiration of an existing context. This would allow code suggestions to be based on a broader context without significantly increasing request sizes, potentially improving their relevance.
The primary difference is that we propose using the existing Tree Sitter Parser in our Language Server to gather context from the current file the user is working on. We may be able to combine both ideas to create a more powerful and accurate code suggestion experience.
A Practical Example
Let's review the code we provided in the User Problem again. The user is working in `current_file.ts` and wants to generate unit tests for the `helloWorld` function.
```typescript
// current_file.ts
import { createLogger } from './logger.ts'

export function helloWorld(name = 'bob') {
  const logger = createLogger()
  const message = `Hello, ${name}!`
  logger.log(message)
  return message
}
```
Old World
In the old world, the AI was forced to guess because it had no context beyond the current file. Here is the unit test file the user creates and the code suggestions the AI provides:

```typescript
// current_file.test.ts
import { expect } from '@jest/globals';
import { helloWorld } from './current_file.ts'

describe('helloWorld', () => {
  it('should return a message', () => {
    // AI, suggest some test code below for me
    // const message = helloWorld('bob')
    // expect(message).toBe('bob')
  })
})
```
New World
In the new world, the AI can use the AST to gather context from `current_file.ts` and feed it, along with `current_file.test.ts`, into the model. Here is the unit test file the user creates and the code suggestions the AI would provide:

```typescript
// current_file.test.ts
import { expect } from '@jest/globals';
import { helloWorld } from './current_file.ts'

describe('helloWorld', () => {
  it('should return a message', () => {
    // AI, suggest some test code below for me
    // jest.mock('./logger.ts', () => {
    //   return {
    //     createLogger: () => {
    //       return {
    //         log: jest.fn()
    //       }
    //     }
    //   }
    // })
    // const message = helloWorld()
    // expect(message).toBe('Hello, bob!')
  })
})
```
By leveraging the AST, we're mimicking and automating the entire developer workflow outlined in Current Developer Workflow. The beauty of this automation is that developers don't even have to leave their editor.
Thinking Ahead
When we've solidified how to implement the above, we've opened the door to a world of possibilities. We can start to think about how we can use the AST to provide more accurate code suggestions in other areas of the codebase. As mentioned before, here are those examples broken down:
- We can resolve the function calls in the current file and provide code suggestions based on the function signature and return type.
  - If you have code referencing a particular function (or class), we can provide code suggestions based on its signature and return type. This can be useful for generating code that calls the function.
- We can resolve the imported dependencies in the current file and provide code suggestions based on the dependency being used.
  - Imagine using a library like Lodash. If the block of code you are working on references a function from Lodash, we can let the model know that Lodash is being used and provide code suggestions based on the Lodash library.
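To illustrate the first of these follow-on ideas, here is a hedged sketch of a hypothetical helper that pulls exported function signatures out of an imported file's source, so they could be sent to the model as lightweight context. Again, a real implementation would query the Tree Sitter AST for function declaration nodes; the regex is only a stand-in:

```typescript
// Hypothetical sketch: extract exported function signatures (name,
// parameters, return type) from source text. The signatures alone are
// often enough context for the model to generate a correct call site.
function extractExportedSignatures(source: string): string[] {
  const pattern = /export\s+function\s+\w+\s*\([^)]*\)(?::\s*[\w<>\[\],.|]+)?/g
  return source.match(pattern) ?? []
}

const loggerSource = `
export function createLogger(): Logger {
  return { log: (msg: string) => console.log(msg) }
}
`
// Captures the signature 'export function createLogger(): Logger'
// without its body, keeping the context sent to the model small.
console.log(extractExportedSignatures(loggerSource))
```

Sending only signatures instead of full file contents would keep request sizes small, which matters for the context-window concerns raised in #440646 (closed).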
To reiterate, the purpose of this proposal is to start small; however, we can build many powerful features on top of this foundation.
POC Details and Goals
The POC will be a simple demonstration of the test file feature outlined above:
- We'll resolve the variable/function the user's cursor is currently on. For instance, this would be `helloWorld` in the `current_file.ts` file.
- We'll traverse the user's AST and gather context. If the function is a relative import, we'll provide code suggestions based on the imported code. We can limit the import depth to one file for now.
- We'll provide both the test file's code and the source code as inputs to the model and see if the model can provide accurate code suggestions.
- We'll go with TypeScript/JavaScript as the languages for the POC, as relative imports are straightforward to resolve in these languages.
Additionally:
- It's important to note we're already using the AST to determine intent (for the code generation feature).
- The extension Does it Throw already uses the concepts outlined in the proposal to check exceptions on imported functions. The logic is here.
- It would be good to roughly estimate (at the end of the POC) how much effort will be required for each additional language.
- This will not help JetBrains until gitlab-org/editor-extensions&20 (closed) is complete.
Technical Approach
WIP; to be fleshed out alongside the POC.