
feat: faster code suggestions generation streaming

Angelo Rivera requested to merge improve-streaming into main

Description

I was able to improve the speed at which we stream code generation suggestions by at least 2x. Resolves #1341 (closed)

Demo

[Recording: fasterstreaming]

How Streaming Currently Works

The flow of the code is as follows:

  1. When the user triggers inline completion, the LanguageClientMiddleware.provideInlineCompletionItems method is called.
  2. If the LSP server responds with a START_STREAMING_COMMAND, the middleware starts listening to the incoming stream using LanguageClientMiddleware.#listenToIncomingStream.
  3. The createStreamIterator function sets up listeners for StreamingCompletionResponse notifications and manages the completion queue.
  4. As completion parts arrive, they are added to the queue, and the iterator resolves them when requested by the client.
  5. The LanguageClientMiddleware class updates the UI loading state and manages the active streams based on the completion results.

More on the createStreamIterator function:

  • Creates an asynchronous iterator for a completion stream.
  • Listens for StreamingCompletionResponse notifications from the LSP server.
  • Manages a queue of completion parts and resolves them as they arrive.
  • Sends a CancelStreaming notification to the LSP server when the stream is canceled or detached.
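
To make the queue-and-resolve behaviour described above concrete, here is a minimal sketch of a queue-backed async iterator. It is not the actual createStreamIterator code: the CompletionPart shape, the CompletionStream class, and the onCancel callback (standing in for the CancelStreaming notification) are all hypothetical names chosen for illustration, and the sketch assumes a single consumer.

```typescript
// Hypothetical shape of one streamed chunk; the real CompletionPart may differ.
interface CompletionPart {
  id: string;
  completion?: string;
  done: boolean;
}

// Illustrative queue-backed async iterator (single consumer assumed).
class CompletionStream {
  private queue: CompletionPart[] = [];
  private wake: (() => void) | undefined; // resolves a pending next() call
  private finished = false;
  private onCancel: () => void; // stand-in for sending CancelStreaming

  constructor(onCancel: () => void) {
    this.onCancel = onCancel;
  }

  // Called by the notification listener for every incoming part.
  push(part: CompletionPart): void {
    if (this.finished) return;
    this.queue.push(part);
    this.notify();
  }

  queuedCount(): number {
    return this.queue.length;
  }

  // Resolves with the oldest queued part, waiting if the queue is empty.
  async next(): Promise<IteratorResult<CompletionPart>> {
    while (this.queue.length === 0 && !this.finished) {
      await new Promise<void>((resolve) => {
        this.wake = resolve;
      });
    }
    const part = this.queue.shift();
    if (!part) return { done: true, value: undefined };
    if (part.done) this.finished = true;
    return { done: false, value: part };
  }

  // Called when the consumer stops early, e.g. `break` inside `for await`.
  return(): Promise<IteratorResult<CompletionPart>> {
    if (!this.finished) {
      this.finished = true;
      this.onCancel();
    }
    this.queue.length = 0;
    this.notify();
    return Promise.resolve({ done: true, value: undefined });
  }

  [Symbol.asyncIterator](): this {
    return this;
  }

  private notify(): void {
    const wake = this.wake;
    this.wake = undefined;
    if (wake) wake();
  }
}
```

A consumer would typically read it with `for await (const part of stream) { ... }`. Note that every pushed part is delivered in order, so a slow consumer lets the queue grow without bound.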

How We Can Improve It

The current implementation is already very solid; however, the queue array becomes a bottleneck when the model produces completion parts faster than the editor can drain them from the queue. We also keep a very large array of CompletionPart objects in memory, which is unnecessary.

By processing the queue in batches and trimming the array as it grows, we can let the iterator "skip" unnecessary completions. This lets the queue be drained at nearly the same rate as the model responds.
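
As a rough illustration of the batching and trimming idea, the sketch below drains the whole queue in one step and keeps only the newest part. It rests on an assumption that is not stated in this description: that each CompletionPart carries the full completion text so far, so a newer part makes older queued parts redundant. The helper names takeLatest and trimQueue and the CompletionPart shape are hypothetical, not taken from the diff.

```typescript
// Hypothetical shape of one streamed chunk; the real CompletionPart may differ.
interface CompletionPart {
  id: string;
  completion?: string;
  done: boolean;
}

// Drains the queue as one batch and returns the part worth showing.
// Assumption: parts are cumulative, so only the newest one matters.
function takeLatest(queue: CompletionPart[]): CompletionPart | undefined {
  if (queue.length === 0) return undefined;
  // Empty the array in place so the producer keeps pushing to the same queue.
  const batch = queue.splice(0, queue.length);
  // Prefer the terminal part so the end of the stream is never skipped.
  return batch.find((part) => part.done) ?? batch[batch.length - 1];
}

// Caps the queue at maxSize (assumed >= 0) by dropping the oldest parts.
// Returns how many parts were dropped.
function trimQueue(queue: CompletionPart[], maxSize: number): number {
  const excess = Math.max(0, queue.length - maxSize);
  return queue.splice(0, excess).length;
}
```

With this approach the consumer does one unit of work per batch instead of one per part, which is what would let it keep pace with a fast producer. If parts were deltas rather than cumulative text, skipping would lose content and the batch would need to be concatenated instead.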

How has this been tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation
  • Chore (Related to CI or Packaging to platforms)
  • Test gap
Edited by Angelo Rivera
