feat: add rate limiting to codesuggestions
What does this merge request do and why?
Adds a 'rate_limit' parameter that sets the maximum number of calls per second for the run. Time spent waiting on the rate limiter inflates LangSmith's built-in latency calculations, so this MR also computes a 'latency' output that excludes that waiting time.
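For reviewers, the idea can be sketched roughly as follows. This is an illustrative sketch only, not the code in this MR: `RateLimiter` and `call_with_latency` are hypothetical names, and the only things taken from the description are the 'rate_limit' parameter and the 'latency' output. The sketch assumes the timer starts only after the rate-limit wait has finished, so the reported latency covers the call itself.

```python
import time
from typing import Any, Callable, Dict


class RateLimiter:
    """Hypothetical helper: lets calls start at most `rate_limit` times per second."""

    def __init__(self, rate_limit: float) -> None:
        if rate_limit <= 0:
            raise ValueError("rate_limit must be positive")
        self._min_interval = 1.0 / rate_limit
        self._next_allowed = time.monotonic()

    def wait(self) -> float:
        """Sleep until the next call is allowed and return the seconds waited."""
        now = time.monotonic()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            time.sleep(delay)
        # Schedule the next slot one interval after the slot just used.
        self._next_allowed = max(now, self._next_allowed) + self._min_interval
        return delay


def call_with_latency(
    limiter: RateLimiter, fn: Callable[..., Any], *args: Any, **kwargs: Any
) -> Dict[str, Any]:
    """Run `fn` under the limiter and report latency without the wait."""
    limiter.wait()  # rate-limit wait happens before the timer starts
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    latency = time.perf_counter() - start
    return {"output": result, "latency": latency}
```

With `RateLimiter(rate_limit=20)`, consecutive calls are spaced about 50 ms apart, while the 'latency' value of each call reflects only the time spent inside `fn`.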
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
Merge request checklist
- Tests added for new functionality. If not, please raise an issue to follow up.
- Documentation added/updated, if needed.