Investigate streaming for code generation API

The latest code generation models will also support streaming: https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/code-generation?hl=en&authuser=1#stream_response_from_generative_ai_models

To prepare for supporting this, we need to investigate the following:

Time from sending the request to receiving the first streamed chunk
Changes needed in the IDEs, the Monolith, and the model gateway to enable it
How cleaning and deduplication of suggestions would work on a streamed response
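
As a starting point for the first item, the latency could be measured with something like the sketch below. It is only an illustration under assumptions: `measure_stream` and `fake_stream` are hypothetical names, and `fake_stream` stands in for a real streaming client call, which this issue does not specify. The helper takes any callable that returns an iterable of text chunks and reports the time to the first chunk, the total time, and the concatenated text.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def measure_stream(
    start_stream: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], float, str]:
    """Return (time_to_first_chunk, total_time, full_text), times in seconds.

    time_to_first_chunk is None if the stream yields no chunks.
    """
    sent_at = clock()
    first_chunk_at: Optional[float] = None
    parts = []
    for chunk in start_stream():
        if first_chunk_at is None:
            # Record the moment the first chunk arrives.
            first_chunk_at = clock()
        parts.append(chunk)
    done_at = clock()
    ttfc = None if first_chunk_at is None else first_chunk_at - sent_at
    return ttfc, done_at - sent_at, "".join(parts)


def fake_stream(delay: float = 0.01) -> Iterator[str]:
    # Hypothetical stand-in for a streaming model response (assumption):
    # yields three chunks with a small artificial delay before each.
    for piece in ["def add(a, b):", "\n", "    return a + b"]:
        time.sleep(delay)
        yield piece


if __name__ == "__main__":
    ttfc, total, text = measure_stream(fake_stream)
    print(f"time to first chunk: {ttfc:.3f}s, total: {total:.3f}s")
    print(text)
```

Comparing the time to first chunk against the total time for the same prompt would show how much perceived latency streaming could save. Swapping `fake_stream` for the real client call is left open until the gateway changes are known.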

Edited Sep 01, 2023 by Tim Zallmann