chore(deps): update dependency litellm to v1.60.2
This MR contains the following updates:
| Package | Type | Update | Change |
|---|---|---|---|
| litellm | dependencies | minor | `1.55.9` -> `1.60.2` |
> **Warning**: Some dependencies could not be looked up. Check the warning logs for more information.
>
> **Warning**: this job ran in a Renovate pipeline that doesn't support the configuration required for common-ci-tasks Renovate presets.
Release Notes
BerriAI/litellm (litellm)
v1.60.0
What's Changed
Important Changes between v1.50.xx to 1.60.0

- `def async_log_stream_event` and `def log_stream_event` are no longer supported for `CustomLogger`s (https://docs.litellm.ai/docs/observability/custom_callback). If you want to log stream events, use `def async_log_success_event` and `def log_success_event` to log successful stream events instead. A minimal migration sketch follows.
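A minimal sketch of the migration, assuming the `CustomLogger` interface documented at the link above; checking for streams via `kwargs` is an assumption:

```python
import litellm
from litellm.integrations.custom_logger import CustomLogger

class MyLogger(CustomLogger):
    # Streams were previously handled in log_stream_event / async_log_stream_event;
    # per this release, handle the finished stream in the success hooks instead.
    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        if kwargs.get("stream"):  # assumption: "stream" is present in kwargs
            print("sync: stream completed", response_obj)

    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        if kwargs.get("stream"):
            print("async: stream completed", response_obj)

litellm.callbacks = [MyLogger()]
```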
Known Issues
- Adding gemini-2.0-flash-thinking-exp-01-21 by @marcoaleixo in https://github.com/BerriAI/litellm/pull/8089
- add groq/deepseek-r1-distill-llama-70b by @miraclebakelaser in https://github.com/BerriAI/litellm/pull/8078
- (UI) Fix SpendLogs page - truncate `bedrock` models + show `end_user` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8118
- UI Fixes - Newly created key does not display on the View Key Page + Updated the validator to allow model editing when `keyTeam.team_alias === "Default Team"` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8122
- (Refactor / QA) - Use `LoggingCallbackManager` to append callbacks and ensure no duplicate callbacks are added by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8112
- (UI) fix adding Vertex Models by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8129
- Fix json_mode parameter propagation in OpenAILikeChatHandler by @miraclebakelaser in https://github.com/BerriAI/litellm/pull/8133
- Doc updates - add key rotations to docs by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8136
- Enforce default_on guardrails always run + expose new `litellm.disable_no_log_param` param by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8134 (see the sketch after this list)
- Doc updates + management endpoint fixes by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8138
- New stable release - release notes by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8148
- FEATURE: OpenAI o3-mini by @ventz in https://github.com/BerriAI/litellm/pull/8151
- build: fix model cost map with o3 model pricing by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8153
- (Fixes) OpenAI Streaming Token Counting + Fixes usage track when `litellm.turn_off_message_logging=True` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8156
- (UI) Allow adding custom pricing when adding new model by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8165
- (Feat) add bedrock/deepseek custom import models by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8132
- Adding Azure OpenAI o3-mini costs & specs by @yigitkonur in https://github.com/BerriAI/litellm/pull/8166
- Adjust model pricing metadata by @yurchik11 in https://github.com/BerriAI/litellm/pull/8147
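A hedged sketch of the new `litellm.disable_no_log_param` flag from PR #8134; the interpretation (ignore per-request `no-log` opt-outs so logging always runs) is an assumption based on the PR title:

```python
import litellm

# Assumption: when set, per-request `no-log` opt-outs are ignored,
# so configured callbacks always receive the event.
litellm.disable_no_log_param = True
```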
New Contributors
- @marcoaleixo made their first contribution in https://github.com/BerriAI/litellm/pull/8089
- @yigitkonur made their first contribution in https://github.com/BerriAI/litellm/pull/8166
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.59.10...v1.60.0
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.60.0
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 240.0 | 281.07272626532927 | 6.158354312051399 | 0.0 | 1843 | 0 | 215.79772499995897 | 3928.489000000013 |
| Aggregated | Passed | 240.0 | 281.07272626532927 | 6.158354312051399 | 0.0 | 1843 | 0 | 215.79772499995897 | 3928.489000000013 |
v1.59.10
What's Changed
- (UI) - View Logs Page - Refinement by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8087
- (Feat) pass through vertex - allow using credentials defined on litellm router for vertex pass through by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8100
- (UI) Allow using a model / credentials for pass through routes by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8099
- ui - fix chat ui tab sending `model` param by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8105
- Litellm dev 01 29 2025 p1 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8097
- Support new `bedrock/converse_like/<model>` route by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8102 (see the sketch after this list)
- feat(databricks/chat/transformation.py): add tools and 'tool_choice' param support by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8076
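A hedged sketch of the `bedrock/converse_like/<model>` route from PR #8102, assuming it targets any endpoint that speaks Bedrock's Converse API; the `api_base`, key, and model name are placeholders:

```python
import litellm

response = litellm.completion(
    model="bedrock/converse_like/my-model",                 # placeholder model name
    api_base="https://example.com/converse-compatible/v1",  # placeholder endpoint
    api_key="sk-placeholder",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```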
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.59.9...v1.59.10
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.59.10
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 210.0 | 239.24647793068146 | 6.21745665443628 | 0.00334092243655899 | 1861 | 1 | 73.25327600000264 | 3903.3159660000083 |
| Aggregated | Passed | 210.0 | 239.24647793068146 | 6.21745665443628 | 0.00334092243655899 | 1861 | 1 | 73.25327600000264 | 3903.3159660000083 |
v1.59.9
What's Changed
- Fix custom pricing - separate provider info from model info by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7990
- Litellm dev 01 25 2025 p4 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8006
- (UI) - Adding new models enhancement - show provider logo by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8033
- (UI enhancement) - allow onboarding wildcard models on UI by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8034
- add openrouter/deepseek/deepseek-r1 by @paul-gauthier in https://github.com/BerriAI/litellm/pull/8038
- (UI) - allow assigning wildcard models to a team / key by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8041
- Add smolagents by @aymeric-roucher in https://github.com/BerriAI/litellm/pull/8026
- (UI) fixes to add model flow by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8043
- github - run stale issue/pr bot by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8045
- (doc) Add nvidia as provider by @raspawar in https://github.com/BerriAI/litellm/pull/8023
- feat(handle_jwt.py): initial commit adding custom RBAC support on jwt… by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8037
- fix(utils.py): handle failed hf tokenizer request during calls by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8032
- Bedrock document processing fixes by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8005
- Fix bedrock model pricing + add unit test using bedrock pricing api by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7978
- Add openai `metadata` param preview support + new `x-litellm-timeout` request header by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8047 (see the sketch after this list)
- (beta ui - spend logs view fixes & Improvements 1) by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8062
- (fix) - proxy reliability, ensure duplicate callbacks are not added to proxy by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8067
- (UI) Fixes for Adding model page - keep existing page as default, have 2nd tab for wildcard models by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8073
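A hedged sketch of the new `x-litellm-timeout` request header from PR #8047, assuming the value is a per-request timeout in seconds; the proxy URL and key are placeholders:

```python
import requests

resp = requests.post(
    "http://localhost:4000/v1/chat/completions",
    headers={
        "Authorization": "Bearer sk-placeholder",
        "x-litellm-timeout": "30",  # assumption: timeout in seconds
    },
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(resp.json())
```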
New Contributors
- @aymeric-roucher made their first contribution in https://github.com/BerriAI/litellm/pull/8026
- @raspawar made their first contribution in https://github.com/BerriAI/litellm/pull/8023
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.59.8...v1.59.9
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.59.9
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Failed | 270.0 | 301.01550717582927 | 6.14169679840119 | 0.0 | 1837 | 0 | 234.85362500002793 | 3027.238808999982 |
| Aggregated | Failed | 270.0 | 301.01550717582927 | 6.14169679840119 | 0.0 | 1837 | 0 | 234.85362500002793 | 3027.238808999982 |
v1.59.8
What's Changed
- refactor: cleanup dead codeblock by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7936
- add type annotation for litellm.api_base (#7980) by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7994
- (QA / testing) - Add unit testing for key model access checks by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7999
- (Prometheus) - emit key budget metrics on startup by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8002
- (Feat) set guardrails per team by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7993
- Supported nested json schema on anthropic calls via proxy + fix langfuse sync sdk issues by @krrishdholakia in https://github.com/BerriAI/litellm/pull/8003
- Bug fix - [Bug]: If you create a key tied to a user that does not belong to a team, and then edit the key to add it to a team (the user is still not a part of a team), using that key results in an unexpected error by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8008
- (QA / testing) - Add e2e tests for key model access auth checks by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8000
- (Fix) langfuse - setting `LANGFUSE_FLUSH_INTERVAL` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/8007 (see the sketch after this list)
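A hedged sketch of setting `LANGFUSE_FLUSH_INTERVAL` from PR #8007; the value is assumed to be the Langfuse SDK's flush interval in seconds, and the keys are placeholders:

```python
import os
import litellm

os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-placeholder"
os.environ["LANGFUSE_SECRET_KEY"] = "sk-placeholder"
os.environ["LANGFUSE_FLUSH_INTERVAL"] = "5"  # assumption: seconds between flushes

litellm.success_callback = ["langfuse"]
```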
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.59.7...v1.59.8
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.59.8
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Failed | 280.0 | 325.48398318207154 | 6.003526201462839 | 0.0 | 1796 | 0 | 234.56590200004257 | 3690.442290999954 |
| Aggregated | Failed | 280.0 | 325.48398318207154 | 6.003526201462839 | 0.0 | 1796 | 0 | 234.56590200004257 | 3690.442290999954 |
v1.59.7
What's Changed
- Add datadog health check support + fix bedrock converse cost tracking w/ region name specified by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7958
- Retry for replicate completion response of status=processing (#7901) by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7965
- Ollama ssl verify = False + Spend Logs reliability fixes by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7931
- (Feat) - allow setting `default_on` guardrails by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7973
- (Testing) e2e testing for team budget enforcement checks by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7988
- (UI) - Usage page show days when spend is 0 and round spend figures on charts to 2 sig figs by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7991
- (Feat) - Add GCS Pub/Sub Logging integration for sending DB `SpendLogs` to BigQuery by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7976
- fix(spend_tracking_utils.py): revert api key pass through fix by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7977
- Ensure base_model cost tracking works across all endpoints by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7989
- (UI) Allow admin to expose teams for joining by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7992
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.59.6...v1.59.7
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.59.7
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 260.0 | 294.5630730660492 | 6.1254059494010225 | 0.0 | 1832 | 0 | 231.04980300001898 | 2728.9633709999634 |
| Aggregated | Passed | 260.0 | 294.5630730660492 | 6.1254059494010225 | 0.0 | 1832 | 0 | 231.04980300001898 | 2728.9633709999634 |
v1.59.6
What's Changed
- Add `attempted-retries` and `timeout` values to response headers + more testing by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7926
- Refactor prometheus e2e test by @yujonglee in https://github.com/BerriAI/litellm/pull/7919
- (Testing + Refactor) - Unit testing for team and virtual key budget checks by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7945
- docs: fix typo by @wagnerjt in https://github.com/BerriAI/litellm/pull/7953
- (Feat) - Allow Admin UI users to view spend logs even when not storing messages / responses by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7952
- (UI) - Set/edit guardrails on a virtual key by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7954
- (Feat) - emit `litellm_team_budget_reset_at_metric` and `litellm_api_key_budget_remaining_hours_metric` on prometheus by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7946
- (Feat) allow setting guardrails on a team on the API by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7959
- (UI) Set guardrails on Team Create and Edit page by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7963
- (GCS fix) - don't truncate payload by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7964
- Litellm dev 01 23 2025 p2 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7962
New Contributors
- @wagnerjt made their first contribution in https://github.com/BerriAI/litellm/pull/7953
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.59.5...v1.59.6
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.59.6
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Failed | 250.0 | 302.94444351157557 | 6.065526445072595 | 0.0 | 1814 | 0 | 184.99327999995785 | 3192.1896389999915 |
| Aggregated | Failed | 250.0 | 302.94444351157557 | 6.065526445072595 | 0.0 | 1814 | 0 | 184.99327999995785 | 3192.1896389999915 |
v1.59.5
What's Changed
- Deepseek r1 support + watsonx qa improvements by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7907
- (Testing) - Add e2e testing for langfuse logging with tags by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7922
- build(deps): bump undici from 6.21.0 to 6.21.1 in /docs/my-website by @dependabot in https://github.com/BerriAI/litellm/pull/7902
- (test) add e2e test for proxy with fallbacks + custom fallback message by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7933
- (feat) - add `deepseek/deepseek-reasoner` to model cost map by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7935 (see the sketch after this list)
- fix(utils.py): move adding custom logger callback to success event in… by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7905
- Add `provider_specifc_header` param by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7932
- (Refactor) Langfuse - remove `prepare_metadata`, langfuse python SDK now handles non-json serializable objects by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7925
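A hedged sketch using the newly priced `deepseek/deepseek-reasoner` from PR #7935; assumes `DEEPSEEK_API_KEY` is set in the environment:

```python
import litellm

response = litellm.completion(
    model="deepseek/deepseek-reasoner",
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
)
# With the model in the cost map, spend can now be computed:
print(litellm.completion_cost(completion_response=response))
```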
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.59.3...v1.59.5
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.59.5
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 210.0 | 227.08635060543418 | 6.150672112760015 | 0.0 | 1840 | 0 | 180.76872099999264 | 2652.4827009999967 |
| Aggregated | Passed | 210.0 | 227.08635060543418 | 6.150672112760015 | 0.0 | 1840 | 0 | 180.76872099999264 | 2652.4827009999967 |
v1.59.3
What's Changed
- Update MLflow callback and documentation by @B-Step62 in https://github.com/BerriAI/litellm/pull/7809
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.59.2...v1.59.3
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.59.3
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 200.0 | 229.9985951234699 | 6.27846665942667 | 0.0 | 1879 | 0 | 179.09318400000984 | 3769.753647000016 |
| Aggregated | Passed | 200.0 | 229.9985951234699 | 6.27846665942667 | 0.0 | 1879 | 0 | 179.09318400000984 | 3769.753647000016 |
v1.59.2
What's Changed
- Litellm dev 01 20 2025 p3 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7890
- (e2e testing + minor refactor) - Virtual Key Max budget check by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7888
- fix(proxy_server.py): fix get model info when litellm_model_id is set + move model analytics to free by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7886
- fix: add default credential for azure (#7095) by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7891
- (Bug fix) - Allow setting `null` for `max_budget`, `rpm_limit`, `tpm_limit` when updating values on a team by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7912 (see the sketch after this list)
- (fix langfuse tags) - read tags from `StandardLoggingPayload` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7903
- (Feat) Add `x-litellm-overhead-duration-ms` and `x-litellm-response-duration-ms` in response from LiteLLM by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7899
- (Code quality) - Ban recursive functions in codebase by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7910
- Litellm dev 01 21 2025 p1 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7898
- (Feat - prometheus) - emit `litellm_overhead_latency_metric` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7913
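A hedged sketch of clearing team limits via the proxy's `/team/update` endpoint per PR #7912; the URL, admin key, and `team_id` are placeholders:

```python
import requests

resp = requests.post(
    "http://localhost:4000/team/update",
    headers={"Authorization": "Bearer sk-admin-placeholder"},
    json={
        "team_id": "team-id-placeholder",
        "max_budget": None,  # JSON null clears the budget limit
        "rpm_limit": None,
        "tpm_limit": None,
    },
)
print(resp.json())
```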
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.59.1...v1.59.2
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.59.2
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 250.0 | 277.37964377510815 | 6.123201928767048 | 0.0 | 1832 | 0 | 225.21770500003413 | 1457.6771990000168 |
| Aggregated | Passed | 250.0 | 277.37964377510815 | 6.123201928767048 | 0.0 | 1832 | 0 | 225.21770500003413 | 1457.6771990000168 |
v1.59.1
What's Changed
- fix(admins.tsx): fix logic for getting base url and create common get… by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7854
- Fix: Problem with langfuse_tags when using litellm proxy with langfus… by @yuu341 in https://github.com/BerriAI/litellm/pull/7825
- (UI - View Logs Table) - Show country of origin for logs by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7856
- (UI Logs) - add pagination + filtering by key name/team name by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7860
- Revert "Remove UI build output" by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7861
- (Security) Add grype security scan to ci/cd pipeline by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7864
- LiteLLM Minor Fixes & Improvements (01/18/2025) - p1 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7857
- feat(health_check.py): set upperbound for api when making health check call by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7865
- add new bedrock stability models & versions to model_prices_and_context_window.json by @marty-sullivan in https://github.com/BerriAI/litellm/pull/7869
- Auth checks on invalid fallback models by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7871
- JWT Auth - `enforce_rbac` support + UI team view, spend calc fix by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7863
- Fix typo Update alerting.md by @MonkeyKing44 in https://github.com/BerriAI/litellm/pull/7880
- typo fix README.md by @VitalikBerashvili in https://github.com/BerriAI/litellm/pull/7879
- feat: add new together_ai models by @theGitNoob in https://github.com/BerriAI/litellm/pull/7882
- fix(fireworks_ai/): fix global disable flag with transform messages h… by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7847
- (Feat) `datadog_llm_observability` callback - emit `request_tags` on logs by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7883
- Litellm dev 01 20 2025 p1 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7884
New Contributors
- @yuu341 made their first contribution in https://github.com/BerriAI/litellm/pull/7825
- @marty-sullivan made their first contribution in https://github.com/BerriAI/litellm/pull/7869
- @MonkeyKing44 made their first contribution in https://github.com/BerriAI/litellm/pull/7880
- @VitalikBerashvili made their first contribution in https://github.com/BerriAI/litellm/pull/7879
- @theGitNoob made their first contribution in https://github.com/BerriAI/litellm/pull/7882
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.59.0...v1.59.1
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.59.1
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 250.0 | 295.7613582714676 | 6.034086428263315 | 0.0 | 1805 | 0 | 224.12125900001456 | 3576.6714410000304 |
| Aggregated | Passed | 250.0 | 295.7613582714676 | 6.034086428263315 | 0.0 | 1805 | 0 | 224.12125900001456 | 3576.6714410000304 |
v1.59.0
What's Changed
- Add key & team level budget metric for prometheus by @yujonglee in https://github.com/BerriAI/litellm/pull/7831
- fix(key_management_endpoints.py): fix default allowed team member roles by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7843
- (UI - View SpendLogs Table) by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7842
- [fix dd llm obs] - use env vars for setting dd tags, service name by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7835
- [Hashicorp - secret manager] - use vault namespace for tls auth by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7834
- QA: ensure all bedrock regional models have same `supported_` as base + Anthropic nested pydantic object support by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7844
- 10x Bedrock perf improvement - refactor: make bedrock image transformation requests async by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7840
- `/key/delete` - allow team admin to delete team keys by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7846 (see the sketch after this list)
- Improve Proxy Resiliency: Cooldown single-deployment model groups if 100% calls failed in high traffic by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7823
- LiteLLM Minor Fixes & Improvements (2024/16/01) by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7826
- Remove UI build output by @yujonglee in https://github.com/BerriAI/litellm/pull/7849
- Fix invalid base URL error by @yujonglee in https://github.com/BerriAI/litellm/pull/7852
- Refactor logs UI by @yujonglee in https://github.com/BerriAI/litellm/pull/7851
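A hedged sketch of `/key/delete` per PR #7846, which team admins can now call for their team's keys; the URL and key values are placeholders:

```python
import requests

resp = requests.post(
    "http://localhost:4000/key/delete",
    headers={"Authorization": "Bearer sk-team-admin-placeholder"},
    json={"keys": ["sk-key-to-delete-placeholder"]},
)
print(resp.json())
```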
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.58.4...v1.59.0
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.59.0
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 250.0 | 285.129348931583 | 6.106818187164813 | 0.0 | 1827 | 0 | 224.69302100000732 | 2869.612018000055 |
| Aggregated | Passed | 250.0 | 285.129348931583 | 6.106818187164813 | 0.0 | 1827 | 0 | 224.69302100000732 | 2869.612018000055 |
v1.58.4
What's Changed
- build(pyproject.toml): bump uvicorn dependency requirement + Azure o1 model check fix + Vertex Anthropic headers fix by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7773
- Add `gemini/` frequency_penalty + presence_penalty support by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7776
- feat(helm): add securityContext and pull policy values to migration job by @Hexoplon in https://github.com/BerriAI/litellm/pull/7652
- fix confusing save button label by @yujonglee in https://github.com/BerriAI/litellm/pull/7778
- [integrations/lunary] Improve Lunary documentation by @hughcrt in https://github.com/BerriAI/litellm/pull/7770
- Fix wrong URL for internal user invitation by @yujonglee in https://github.com/BerriAI/litellm/pull/7762
- Update instructor tutorial by @Winston-503 in https://github.com/BerriAI/litellm/pull/7784
- (helm) - allow specifying envVars on values.yaml + add helm lint test by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7789
- Fix anthropic pass-through end user tracking + add gemini-2.0-flash-thinking-exp by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7772
- Add back in non root image fixes (#7781) by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7795
- test: initial test to enforce all functions in user_api_key_auth.py h… by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7797
- test: initial commit enforcing testing on all anthropic pass through … by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7794
- build: bump certifi version - see if that fixes asyncio ssl issue on … by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7800
- (datadog llm observability) - fixes + improvements for using `datadog llm observability` logging integration by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7824
- (fix) IBM Watsonx using ZenApiKey by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7821
- (Fix + Testing) - Add `dd-trace-run` to litellm ci/cd pipeline + fix bug caused by `dd-trace` patching OpenAI sdk by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7820
- (security fix) - remove hf model with exposed security token by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7810
New Contributors
- @Winston-503 made their first contribution in https://github.com/BerriAI/litellm/pull/7784
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.58.2...v1.58.4
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.58.4
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 200.0 | 237.21547618310757 | 6.133261155980474 | 0.0 | 1835 | 0 | 175.96439100003636 | 4047.4063279999655 |
| Aggregated | Passed | 200.0 | 237.21547618310757 | 6.133261155980474 | 0.0 | 1835 | 0 | 175.96439100003636 | 4047.4063279999655 |
v1.58.2
What's Changed
- Fix RPM/TPM limit typo in admin UI by @yujonglee in https://github.com/BerriAI/litellm/pull/7769
- Add AIM Guardrails support by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7771
- Support temporary budget increases on keys by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7754
- Litellm dev 01 13 2025 p2 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7758
- docs - iam role based access for bedrock by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7774
- (Feat) prometheus - emit remaining team budget metric on proxy startup by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7777
- (fix) `BaseAWSLLM` - cache IAM role credentials when used by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7775
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.58.1...v1.58.2
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.58.2
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 250.0 | 289.8090936126223 | 6.143711740946042 | 0.0 | 1838 | 0 | 228.12097899998207 | 2196.5017750000015 |
| Aggregated | Passed | 250.0 | 289.8090936126223 | 6.143711740946042 | 0.0 | 1838 | 0 | 228.12097899998207 | 2196.5017750000015 |
v1.58.1

> Alpha - 1.58.0 has various perf improvements; we recommend waiting for a stable release before bumping in production.

What's Changed
- (core sdk fix) - fix fallbacks stuck in infinite loop by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7751
- [Bug fix]: v1.58.0 - issue with read request body by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7753
- (litellm SDK perf improvements) - handle cases when unable to lookup model in model cost map by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7750
- (prometheus - minor bug fix) - `litellm_llm_api_time_to_first_token_metric` not populating for bedrock models by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7740
- (fix) health check - allow setting `health_check_model` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7752
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.58.0...v1.58.1
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.58.1
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 250.0 | 294.2978673554448 | 6.045420383532543 | 0.0 | 1809 | 0 | 223.72276400000146 | 3539.4181890000027 |
| Aggregated | Passed | 250.0 | 294.2978673554448 | 6.045420383532543 | 0.0 | 1809 | 0 | 223.72276400000146 | 3539.4181890000027 |
v1.58.0
v1.58.0 - Alpha Release
What's Changed
- (proxy perf) - service logger don't always import OTEL in helper function by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7727
- (proxy perf) - only read request body 1 time per request by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7728
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.57.11...v1.58.0
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.58.0
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 240.0 | 273.2166563012582 | 6.118315985413586 | 0.0033451700302972037 | 1829 | 1 | 75.1692759999969 | 3821.228761000043 |
| Aggregated | Passed | 240.0 | 273.2166563012582 | 6.118315985413586 | 0.0033451700302972037 | 1829 | 1 | 75.1692759999969 | 3821.228761000043 |
v1.57.11
v1.57.11 - Alpha Release
What's Changed
- (litellm SDK perf improvement) - use `verbose_logger.debug` and `_cached_get_model_info_helper` in `_response_cost_calculator` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7720
- (litellm sdk speedup) - use `_model_contains_known_llm_provider` in `response_cost_calculator` to check if the model contains a known litellm provider by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7721
- (proxy perf) - only parse request body 1 time per request by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7722
- Revert "(proxy perf) - only parse request body 1 time per request" by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7724
- add azure o1 pricing by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7715
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.57.10...v1.57.11
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.57.11
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 240.0 | 270.55759577820237 | 6.130862160194138 | 0.0 | 1835 | 0 | 224.79750500002638 | 1207.8732939999952 |
| Aggregated | Passed | 240.0 | 270.55759577820237 | 6.130862160194138 | 0.0 | 1835 | 0 | 224.79750500002638 | 1207.8732939999952 |
v1.57.10
v1.57.10 - Alpha Release
- Litellm dev 01 10 2025 p2 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7679
- Litellm dev 01 10 2025 p3 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7682
- build: new ui build by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7685
- fix(model_hub.tsx): clarify cost in model hub is per 1m tokens by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7687
- Litellm dev 01 11 2025 p3 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7702
- (perf litellm) - use `_get_model_info_helper` for cost tracking by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7703
- (perf sdk) - minor changes to cost calculator to run helpers only when necessary by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7704
- (perf) - proxy, use `orjson` for reading request body by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7706
- (minor fix - `aiohttp_openai/`) - fix get_custom_llm_provider by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7705
- (sdk perf fix) - only print args passed to litellm when debugging mode is on by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7708
- (perf) - only use response_cost_calculator 1 time per request. (Don't re-use the same helper twice per call ) by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7709
- [BETA] Add OpenAI `/images/variations` + Topaz API support by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7700
- (litellm sdk speedup router) - adds a helper `_cached_get_model_group_info` to use when trying to get deployment tpm/rpm limits by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7719
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.57.8...v1.57.10
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.57.10
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 240.0 | 264.0629029362514 | 6.184926091214754 | 0.0 | 1851 | 0 | 213.62108399998192 | 1622.618584999998 |
| Aggregated | Passed | 240.0 | 264.0629029362514 | 6.184926091214754 | 0.0 | 1851 | 0 | 213.62108399998192 | 1622.618584999998 |
v1.57.8
What's Changed
- (proxy latency/perf fix - user_api_key_auth) - use asyncio.create task for caching virtual key once it's validated by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7676
- (litellm sdk - perf improvement) - optimize `response_cost_calculator` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7674
- (litellm sdk - perf improvement) - use O(1) set lookups for checking llm providers / models by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7672
- (litellm sdk - perf improvement) - optimize `pre_call_check` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7673
- [integrations/lunary] allow to pass custom parent run id to LLM calls by @hughcrt in https://github.com/BerriAI/litellm/pull/7651
- LiteLLM Minor Fixes & Improvements (01/10/2025) - p1 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7670
- (performance improvement - litellm sdk + proxy) - ensure litellm does not create unnecessary threads when running async functions by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7680
- (litellm proxy perf) - pass num_workers cli arg to uvicorn when `num_workers` is specified by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7681
- fix proxy pre call hook - only use `asyncio.create_task` if user opts into alerting by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7683
- [Bug fix]: Proxy Auth Layer - Allow Azure Realtime routes as llm_api_routes by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7684
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.57.7...v1.57.8
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.57.8
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 210.0 | 225.29799695056985 | 6.153370698253471 | 0.0 | 1841 | 0 | 177.73327700001573 | 2088.13791099999 |
| Aggregated | Passed | 210.0 | 225.29799695056985 | 6.153370698253471 | 0.0 | 1841 | 0 | 177.73327700001573 | 2088.13791099999 |
v1.57.7
What's Changed
- (minor latency fixes / proxy) - use verbose_proxy_logger.debug() instead of litellm.print_verbose by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7664
- feat(ui_sso.py): Allows users to use test key pane, and have team budget limits be enforced for their use-case by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7666
- fix(main.py): fix lm_studio/ embedding routing by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7658
- fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini p… by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7660
- Use environment variable for Athina logging URL by @vivek-athina in https://github.com/BerriAI/litellm/pull/7628
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.57.5...v1.57.7
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.57.7
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 200.0 | 218.4749677188173 | 6.216185012755876 | 0.0 | 1860 | 0 | 177.92223199990076 | 3911.6109139999935 |
| Aggregated | Passed | 200.0 | 218.4749677188173 | 6.216185012755876 | 0.0 | 1860 | 0 | 177.92223199990076 | 3911.6109139999935 |
v1.57.5


> Known issue - do not upgrade - Windows compatibility issue on this release.
>
> Relevant issue: https://github.com/BerriAI/litellm/issues/7677
What's Changed
- LiteLLM Minor Fixes & Improvements (01/08/2025) - p2 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7643
- Litellm dev 01 08 2025 p1 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7640
- (proxy - RPS) - Get 2K RPS at 4 instances, minor fix for caching_handler by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7655
- (proxy - RPS) - Get 2K RPS at 4 instances, minor fix `aiohttp_openai/` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7659
- (proxy perf improvement) - use `uvloop` for higher RPS (10%-20% higher RPS) by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7662
- (Feat - Batches API) add support for retrieving vertex api batch jobs by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7661
- (proxy-latency fixes) use asyncio tasks for logging db metrics by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7663
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.57.4...v1.57.5
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.57.5
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 230.0 | 282.70225500655766 | 6.115771768544881 | 0.0 | 1830 | 0 | 206.44150200001832 | 3375.4479410000044 |
| Aggregated | Passed | 230.0 | 282.70225500655766 | 6.115771768544881 | 0.0 | 1830 | 0 | 206.44150200001832 | 3375.4479410000044 |
v1.57.4
What's Changed
- fix(utils.py): fix select tokenizer for custom tokenizer by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7599
- LiteLLM Minor Fixes & Improvements (01/07/2025) - p3 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7635
- (feat) - allow building litellm proxy from pip package by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7633
- Litellm dev 01 07 2025 p2 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7622
- Allow assigning teams to org on UI + OpenAI `omni-moderation` cost model tracking by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7566
- (fix) proxy auth - allow using Azure JS SDK routes as llm_api_routes by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7631
- (helm) - bug fix - allow using `migrationJob.enabled` variable within job by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7639
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.57.3...v1.57.4
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.57.4
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 200.0 | 218.7550845980808 | 6.268875045928877 | 0.0 | 1876 | 0 | 170.9488330000113 | 1424.4913769999812 |
| Aggregated | Passed | 200.0 | 218.7550845980808 | 6.268875045928877 | 0.0 | 1876 | 0 | 170.9488330000113 | 1424.4913769999812 |
v1.57.3
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.57.2...v1.57.3
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.57.3
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 240.0 | 273.577669278204 | 6.101109800829093 | 0.0 | 1826 | 0 | 209.38834100002168 | 2450.7287210000186 |
| Aggregated | Passed | 240.0 | 273.577669278204 | 6.101109800829093 | 0.0 | 1826 | 0 | 209.38834100002168 | 2450.7287210000186 |
v1.57.2
What's Changed
- Prompt Management - support router + optional params by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7594
- `aiohttp_openai/` fixes - allow using `aiohttp_openai/gpt-4o` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7598 (see the sketch after this list)
- (Fix) security of base image by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7620
- Litellm dev 01 07 2025 p1 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7618
- (Feat) soft budget alerts on keys by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7623
- LiteLLM Minor Fixes & Improvement (01/01/2025) - p2 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7615
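A hedged sketch of the `aiohttp_openai/` route enabled for `gpt-4o` in PR #7598; reading the key from `OPENAI_API_KEY` is an assumption:

```python
import os
import litellm

os.environ["OPENAI_API_KEY"] = "sk-placeholder"

# aiohttp_openai/ routes OpenAI-compatible calls through the aiohttp transport.
response = litellm.completion(
    model="aiohttp_openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```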
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.57.1...v1.57.2
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.57.2
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 190.0 | 212.2353391522645 | 6.34173008698281 | 0.0 | 1898 | 0 | 174.4866640000282 | 3470.5951910000013 |
| Aggregated | Passed | 190.0 | 212.2353391522645 | 6.34173008698281 | 0.0 | 1898 | 0 | 174.4866640000282 | 3470.5951910000013 |
v1.57.1
What's Changed
- (perf) - fixes for aiohttp handler to hit 1K RPS by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7590
- (latency/perf fixes - proxy) - use `async_service_success_hook` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7591
- (Feat) - allow including dd-trace in litellm base image by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7587
- (proxy perf improvement) - remove redundant `.copy()` operation by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7564
- Refresh VoyageAI models, prices and context by @fzowl in https://github.com/BerriAI/litellm/pull/7472
- LiteLLM Minor Fixes & Improvements (01/06/2025) - p3 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7596
- LiteLLM Minor Fixes & Improvements (01/06/2025) - p2 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7597
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.57.0...v1.57.1
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.57.1
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 250.0 | 286.96666935492755 | 6.035628429692609 | 0.0 | 1806 | 0 | 226.66728699999794 | 3887.529271000062 |
| Aggregated | Passed | 250.0 | 286.96666935492755 | 6.035628429692609 | 0.0 | 1806 | 0 | 226.66728699999794 | 3887.529271000062 |
v1.57.0
What's Changed
- (Fix) make sure `init` custom loggers is non blocking by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7554
- (Feat) Hashicorp Secret Manager - Allow storing virtual keys in secret manager by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7549
- Create and view organizations + assign org admins on the Proxy UI by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7557
- (perf) fix [PROXY] don't use `f` string in `add_litellm_data_to_request()` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7558
- fix(groq/chat/transformation.py): fix groq response_format transforma… by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7565
- Support deleting keys by key_alias by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7552
- (proxy perf improvement) - use `asyncio.create_task` for `service_logger_obj.async_service_success_hook` in pre_call by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7563
- add `fireworks_ai/accounts/fireworks/models/deepseek-v3` by @Fredy in https://github.com/BerriAI/litellm/pull/7567
- FriendliAI: Documentation Updates by @minpeter in https://github.com/BerriAI/litellm/pull/7517
- Prevent istio injection for db migrations cron job by @lowjiansheng in https://github.com/BerriAI/litellm/pull/7513
New Contributors
- @Fredy made their first contribution in https://github.com/BerriAI/litellm/pull/7567
- @minpeter made their first contribution in https://github.com/BerriAI/litellm/pull/7517
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.56.10...v1.57.0
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.57.0
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 200.0 | 212.84027329611826 | 6.1961289027318704 | 0.0 | 1854 | 0 | 174.45147399996586 | 1346.3216149999653 |
| Aggregated | Passed | 200.0 | 212.84027329611826 | 6.1961289027318704 | 0.0 | 1854 | 0 | 174.45147399996586 | 1346.3216149999653 |
v1.56.10
What's Changed
- fix(aws_secret_manager_V2.py): Error reading secret from AWS Secrets … by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7541
- Support checking provider-specific `/models` endpoints for available models based on key by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7538
- feat(router.py): support request prioritization for text completion c… by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7540
- (Fix) - Docker build error with pyproject.toml by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7550
- (Fix) - Slack Alerting , don't send duplicate spend report when used on multi instance settings by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7546
- add `cohere/command-r7b-12-2024` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7553
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.56.9...v1.56.10
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.56.10
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 230.0 | 268.3301603401397 | 6.21711064668469 | 0.0 | 1861 | 0 | 212.36320399998476 | 3556.7401620000396 |
| Aggregated | Passed | 230.0 | 268.3301603401397 | 6.21711064668469 | 0.0 | 1861 | 0 | 212.36320399998476 | 3556.7401620000396 |
v1.56.9
What's Changed
- (fix) GCS bucket logger - apply truncate_standard_logging_payload_content to standard_logging_payload and ensure GCS flushes queue on fails by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7519
- (Fix) - Hashicorp secret manager - don't print hcorp secrets in debug logs by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7529
- [Bug-Fix]: None metadata not handled for `_PROXY_VirtualKeyModelMaxBudgetLimiter` hook by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7523
- Bump anthropic.claude-3-5-haiku-20241022-v1:0 to new limits by @Manouchehri in https://github.com/BerriAI/litellm/pull/7118
- Fix langfuse prompt management on proxy by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7535
- (Feat) - Hashicorp secret manager, use TLS cert authentication by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7532
- Fix OTEL message redaction + Langfuse key leak in logs by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7516
- feat: implement support for limit, order, before, and after parameters in get_assistants by @jeansouzak in https://github.com/BerriAI/litellm/pull/7537
- Add missing prefix for deepseek by @SmartManoj in https://github.com/BerriAI/litellm/pull/7508
- (fix) `aiohttp_openai/` route - get to 1K RPS on single instance by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7539
- Revert "feat: implement support for limit, order, before, and after parameters in get_assistants" by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7542
- [Feature]: - allow print alert log to console by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7534
- (fix proxy perf) use `_read_request_body` instead of ast.literal_eval to get better performance by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7545
New Contributors
- @jeansouzak made their first contribution in https://github.com/BerriAI/litellm/pull/7537
- @SmartManoj made their first contribution in https://github.com/BerriAI/litellm/pull/7508
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.56.8...v1.56.9
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.56.9
```

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed | 240.0 | 269.3983699320639 | 6.149252570882109 | 0.0 | 1840 | 0 | 211.95807399999467 | 2571.210135000001 |
| Aggregated | Passed | 240.0 | 269.3983699320639 | 6.149252570882109 | 0.0 | 1840 | 0 | 211.95807399999467 | 2571.210135000001 |
v1.56.8
What's Changed
- Prometheus - custom metrics support + other improvements by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7489
- (feat) POST `/fine_tuning/jobs` support passing vertex specific hyper params by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7490
- (Feat) - LiteLLM Use `UsernamePasswordCredential` for Azure OpenAI by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7496
- (docs) Add docs on load testing benchmarks by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7499
- (Feat) Add support for reading secrets from Hashicorp vault by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7497
- Litellm dev 12 30 2024 p2 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7495
- Refactor Custom Metrics on Prometheus - allow setting k,v pairs on all metrics via config.yaml by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7498
- (fix) GCS bucket logger - apply `truncate_standard_logging_payload_content` to `standard_logging_payload` and ensure GCS flushes queue on fails by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7500
- Litellm dev 01 01 2025 p3 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7503
- Litellm dev 01 02 2025 p2 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7512
- Revert "(fix) GCS bucket logger - apply `truncate_standard_logging_payload_content` to `standard_logging_payload` and ensure GCS flushes queue on fails" by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7515
- (perf) use `aiohttp` for `custom_openai` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7514 (see the sketch after this list)
- (perf) use threadpool executor - for sync logging integrations by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7509
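The `aiohttp` change above is a connection-reuse optimization: one shared `ClientSession` with a pooled connector avoids paying a TCP/TLS handshake per request. A minimal sketch of the pattern, with assumed names rather than LiteLLM internals:

```python
# Minimal sketch of the connection-reuse idea (assumed names, not LiteLLM
# internals): share one aiohttp ClientSession instead of one per request.
import aiohttp

_session: aiohttp.ClientSession | None = None

async def get_session() -> aiohttp.ClientSession:
    global _session
    if _session is None or _session.closed:
        # Pooled keep-alive connections; the limit is a tunable assumption.
        _session = aiohttp.ClientSession(connector=aiohttp.TCPConnector(limit=100))
    return _session

async def post_chat(url: str, payload: dict) -> dict:
    session = await get_session()
    async with session.post(url, json=payload) as resp:
        resp.raise_for_status()
        return await resp.json()
```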
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.56.6...v1.56.8
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.56.8

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed | 230.0 | 247.81903455189286 | 6.181081075067931 | 0.0 | 1850 | 0 | 191.81740900000932 | 2126.8676100000903 |
Aggregated | Passed | 230.0 | 247.81903455189286 | 6.181081075067931 | 0.0 | 1850 | 0 | 191.81740900000932 | 2126.8676100000903 |
v1.56.6
What's Changed
- (fix) `v1/fine_tuning/jobs` with VertexAI by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7487
- (docs) Add docs on using Vertex with Fine Tuning APIs by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7491
- Fix team-based logging to langfuse + allow custom tokenizer on `/token_counter` endpoint by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7493
- Fix team admin create key flow on UI + other improvements by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7488
- docs: added missing quote by @dsdanielko in https://github.com/BerriAI/litellm/pull/7481
- fix ollama embedding model response #7451 by @svenseeberg in https://github.com/BerriAI/litellm/pull/7473
- (Feat) - Add PagerDuty Alerting Integration by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7478
New Contributors
- @dsdanielko made their first contribution in https://github.com/BerriAI/litellm/pull/7481
- @svenseeberg made their first contribution in https://github.com/BerriAI/litellm/pull/7473
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.56.5...v1.56.6
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.56.6

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed | 250.0 | 287.411814751915 | 6.114731230663012 | 0.0 | 1830 | 0 | 228.32058200003758 | 3272.637599999939 |
Aggregated | Passed | 250.0 | 287.411814751915 | 6.114731230663012 | 0.0 | 1830 | 0 | 228.32058200003758 | 3272.637599999939 |
v1.56.5
What's Changed
- Refactor: move all bedrock invoke providers to BaseConfig by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7463
- (fix) `litellm.amoderation` - support using `model=openai/omni-moderation-latest`, `model=omni-moderation-latest`, `model=None` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7475 (see the sketch after this list)
- [Bug Fix]: rerank restfulapi response parse still too strict by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7476
- Litellm dev 12 30 2024 p1 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7480
- HumanLoop integration for Prompt Management by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7479
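To make the `litellm.amoderation` fix concrete: the same call should now accept the provider-prefixed model name, the bare name, or no model at all. A minimal sketch, assuming `OPENAI_API_KEY` is set and that the response follows the OpenAI moderation shape:

```python
# Hedged sketch of the litellm.amoderation fix: all three model spellings
# are expected to route to OpenAI's moderation endpoint.
# Assumes OPENAI_API_KEY is set in the environment.
import asyncio
import litellm

async def main() -> None:
    for model in ("openai/omni-moderation-latest", "omni-moderation-latest", None):
        resp = await litellm.amoderation(
            input="sample text to classify",
            model=model,
        )
        # Response fields follow the OpenAI moderation format (assumption).
        print(model, resp.results[0].flagged)

asyncio.run(main())
```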
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.56.4...v1.56.5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.56.5

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed | 230.0 | 268.0630784626629 | 6.174316845767241 | 0.0 | 1848 | 0 | 212.08500100010497 | 3189.481879000027 |
Aggregated | Passed | 230.0 | 268.0630784626629 | 6.174316845767241 | 0.0 | 1848 | 0 | 212.08500100010497 | 3189.481879000027 |
v1.56.4
What's Changed
- Update model_prices_and_context_window.json by @superpoussin22 in https://github.com/BerriAI/litellm/pull/7452
- (Refactor) - remove deprecated litellm server by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7456
- Docs - Using LiteLLM with 1M rows in spend logs by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7461
- (Admin UI - 1) - added the model used either directly before or after the "Assistant" so that it's clear which model provided the given assistant output by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7459
- (Admin UI - 2) UI chat should render the output in markdown by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7460
- (Security fix) - Upgrade to `fastapi==0.115.5` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7447
- fix OR deepseek by @paul-gauthier in https://github.com/BerriAI/litellm/pull/7425
- (Bug Fix) Add health check support for realtime models by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7453
- (Refactor) - Re use litellm.completion/litellm.embedding etc for health checks by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7455
- Litellm dev 12 28 2024 p3 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7464
- Fireworks AI - document inlining support + model access groups for wildcard models by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7458
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.56.3...v1.56.4
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.56.4

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed | 240.0 | 268.74238744669225 | 6.116896356155644 | 0.0 | 1829 | 0 | 214.29422199992132 | 1969.7571099999323 |
Aggregated | Passed | 240.0 | 268.74238744669225 | 6.116896356155644 | 0.0 | 1829 | 0 | 214.29422199992132 | 1969.7571099999323 |
v1.56.3
What's Changed
- Update Documentation - Gemini Embedding by @igorlima in https://github.com/BerriAI/litellm/pull/7436
- (Bug fix) missing `model_group` field in logs for aspeech call types by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7392
- (Feat) - new endpoint `GET /v1/fine_tuning/jobs/{fine_tuning_job_id:path}` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7427 (see the sketch after this list)
- Update model_prices_and_context_window.json by @superpoussin22 in https://github.com/BerriAI/litellm/pull/7345
- LiteLLM Minor Fixes & Improvements (12/27/2024) - p1 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7448
- Litellm dev 12 27 2024 p2 1 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7449
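Since the new fine-tuning route is OpenAI-format, the openai SDK's retrieve call should map straight onto it. A minimal sketch with placeholder base URL, key, and job id:

```python
# Hedged sketch: retrieving a fine-tuning job through the proxy via the new
# OpenAI-format GET route. All identifiers below are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")
job = client.fine_tuning.jobs.retrieve("ftjob-abc123")  # hypothetical job id
print(job.status)
```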
New Contributors
- @igorlima made their first contribution in https://github.com/BerriAI/litellm/pull/7436
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.56.2...v1.56.3
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.56.3

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed | 250.0 | 276.9724297749999 | 6.148940938190872 | 0.003341815727277648 | 1840 | 1 | 112.37049800001842 | 1700.1428350000083 |
Aggregated | Passed | 250.0 | 276.9724297749999 | 6.148940938190872 | 0.003341815727277648 | 1840 | 1 | 112.37049800001842 | 1700.1428350000083 |
v1.56.2
What's Changed
- Litellm dev 12 24 2024 p2 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7400
- (feat) Support Dynamic Params for `guardrails` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7415
- docs: cleanup docker compose comments by @marcoscannabrava in https://github.com/BerriAI/litellm/pull/7414
- (Security fix) UI - update `next` version by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7418
- (security fix) - fix docs snyk vulnerability by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7419
- LiteLLM Minor Fixes & Improvements (12/25/2024) - p1 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7411
- LiteLLM Minor Fixes & Improvements (12/25/2024) - p2 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7420
- Ensure 'disable_end_user_cost_tracking_prometheus_only' works for new prometheus metrics by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7421
- (security fix) - bump fast api, fastapi-sso, python-multipart - fix snyk vulnerabilities by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7417
- docs - batches cost tracking by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7422
- Add `/openai` pass through route on litellm proxy by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7412
- (Feat) Add logging for `POST v1/fine_tuning/jobs` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7426
- (docs) - show all supported Azure OpenAI endpoints in overview by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7428
- (docs) - custom guardrail show how to use dynamic guardrail params by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7430
- Support budget/rate limit tiers for keys by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7429
- (fix) initializing OTEL Logging on LiteLLM Proxy - ensure OTEL logger is initialized only once by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7435
- Litellm dev 12 26 2024 p3 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7434
- fix(key_management_endpoints.py): enforce user_id / team_id checks on key generate by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7437
- LiteLLM Minor Fixes & Improvements (12/26/2024) - p4 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7439
- Refresh VoyageAI models, prices and context by @fzowl in https://github.com/BerriAI/litellm/pull/7443
- Revert "Refresh VoyageAI models, prices and context" by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7446
- (feat) `/guardrails/list` show guardrail info params by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7442
- add openrouter o1 by @paul-gauthier in https://github.com/BerriAI/litellm/pull/7424
- (Feat) Log Guardrails run, guardrail response on logging integrations by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7445
New Contributors
- @marcoscannabrava made their first contribution in https://github.com/BerriAI/litellm/pull/7414
- @fzowl made their first contribution in https://github.com/BerriAI/litellm/pull/7443
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.55.12...v1.56.2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.56.2

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed | 250.0 | 275.3240164096845 | 6.143891773397197 | 0.0 | 1838 | 0 | 224.26387399997338 | 1437.5524760000076 |
Aggregated | Passed | 250.0 | 275.3240164096845 | 6.143891773397197 | 0.0 | 1838 | 0 | 224.26387399997338 | 1437.5524760000076 |
v1.55.12
What's Changed
- Add 'end_user', 'user' and 'requested_model' on more prometheus metrics by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7399
- (feat) `/batches` Add support for using `/batches` endpoints in OAI format by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7402 (see the sketch after this list)
- (feat) `/batches` - track `user_api_key_alias`, `user_api_key_team_alias` etc for /batch requests by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7401
- Litellm dev 12 24 2024 p3 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7403
- (Feat) add `/v1/batches/{batch_id:path}/cancel` endpoint by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7406
- Litellm dev 12 24 2024 p4 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7407
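Because the `/batches` routes follow the OpenAI format, the openai SDK can exercise them end to end, including the new cancel endpoint. A minimal sketch with placeholder base URL, key, and input file:

```python
# Hedged sketch: OpenAI-format /batches flow through the proxy, including the
# new cancel route. All identifiers and file names are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# Upload a .jsonl file of requests in the OpenAI batch input format.
batch_input = client.files.create(
    file=open("batch_requests.jsonl", "rb"),
    purpose="batch",
)

batch = client.batches.create(
    input_file_id=batch_input.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# The new endpoint maps to batches.cancel in the SDK.
client.batches.cancel(batch.id)
print(client.batches.retrieve(batch.id).status)
```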
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.55.11...v1.55.12
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.55.12

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed | 220.0 | 241.51418849604215 | 6.334659319234715 | 0.0 | 1895 | 0 | 191.11329300005764 | 3854.987871999924 |
Aggregated | Passed | 220.0 | 241.51418849604215 | 6.334659319234715 | 0.0 | 1895 | 0 | 191.11329300005764 | 3854.987871999924 |
v1.55.11
What's Changed
- LiteLLM Minor Fixes & Improvements (12/23/2024) - p3 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7394
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.55.10...v1.55.11
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.55.11

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed | 250.0 | 290.3865391657403 | 6.034920682874279 | 0.0 | 1804 | 0 | 229.06071099987457 | 2909.605226000167 |
Aggregated | Passed | 250.0 | 290.3865391657403 | 6.034920682874279 | 0.0 | 1804 | 0 | 229.06071099987457 | 2909.605226000167 |
v1.55.10
What's Changed
- (Admin UI) - Test Key Tab - Allow typing in `model` name + Add wrapping for text response by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7347
- (Admin UI) - Test Key Tab - Allow using `UI Session` instead of manually creating a virtual key by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7348
- (refactor) - fix from enterprise.utils import ui_get_spend_by_tags by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7352
- (chore) - enforce model budgets on virtual keys as enterprise feature by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7353
- (Admin UI) correctly render provider name in /models with wildcard routing by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7349
- (Admin UI) - maintain history on chat UI by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7351
- Litellm enforce enterprise features by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7357
- Document team admins + Enforce assigning team admins as an enterprise feature by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7359
- Litellm docs update by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7365
- Complete 'requests' library removal by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7350
- (chore) remove unused code files by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7363
- (security fix) - update base image for all docker images to `python:3.13.1-slim` by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7388
- LiteLLM Minor Fixes & Improvements (12/23/2024) - p1 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7383
- LiteLLM Minor Fixes & Improvements (12/23/2024) - P2 by @krrishdholakia in https://github.com/BerriAI/litellm/pull/7386
- [Bug Fix]: Errors in LiteLLM When Using Embeddings Model with Usage-Based Routing by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7390
- (Feat) Add input_cost_per_token_batches, output_cost_per_token_batches for OpenAI cost tracking Batches API by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7391
- (feat) Add basic logging support for `/batches` endpoints by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7381
- (feat) Add cost tracking for /batches requests OpenAI by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7384
- dd logger fix - handle objects that can't be JSON dumped by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/7393
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.55.9...v1.55.10
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.55.10

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed | 200.0 | 218.24862748744047 | 6.256831142894005 | 0.0 | 1871 | 0 | 177.71721199983403 | 1940.1571020000574 |
Aggregated | Passed | 200.0 | 218.24862748744047 | 6.256831142894005 | 0.0 | 1871 | 0 | 177.71721199983403 | 1940.1571020000574 |
Configuration
- If you want to rebase/retry this MR, check this box
This MR has been generated by Renovate Bot.
Activity
added maintenance, dependency type::maintenance labels
requested review from @achueshev and @eduardobonet
Reviewer roulette
To spread load more evenly across eligible reviewers, Danger has picked a candidate for each review slot. Feel free to override these selections if you think someone else would be better suited, or use the GitLab Review Workload Dashboard to find other available reviewers.
To read more on how to use the reviewer roulette, please take a look at the Engineering workflow and code review guidelines.
Once you've decided who will review this merge request, mention them as you normally would! Danger does not automatically notify them for you.
Reviewer | Maintainer
---|---
@jprovaznik (UTC+1) | @bcardoso- (UTC+1)
If needed, you can retry the `danger-review` job that generated this comment.
Generated by Danger
Edited by ****
added devops::ai-powered, group::ai model validation labels
added automation:bot-authored label
added 1 commit
- 9b09308a - chore(deps): update dependency litellm to v1.60.0
added section::data-science label
added 31 commits
- 9b09308a...7b85bdb2 - 30 commits from branch `main`
- 3b53785a - chore(deps): update dependency litellm to v1.60.2
started a merge train
mentioned in commit 77324bcc
[tool.poetry.group.lint.dependencies]
flake8 = "^7.0.0"
isort = "^5.12.0"
- black = "^25.0.0"
+ black = "^24.0.0"
@eduardobonet This change is unintended, as it reverts the original work in !1928 (merged). While it is likely a bug in Renovate, I'd kindly ask reviewers to review the changeset before merging so we can avoid this issue going forward.
Edited by Tan Le
mentioned in merge request !1955 (merged)
changed milestone to %17.9