LangChain LiteLLM wrapper doesn't return cached token usage

Problem to solve

The LangChain LiteLLM wrapper doesn't return cached token usage. We're currently working around this by overriding _create_usage_metadata in the AI Gateway (AIGW) repo; however, this fix should live in the upstream repo instead.
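
For context, below is a minimal sketch of what such an override can look like. This is not the exact AIGW implementation: it assumes _create_usage_metadata is (or can be replaced by) a standalone helper that converts LiteLLM's token-usage dict into LangChain's UsageMetadata, and the cached-token field names (prompt_tokens_details.cached_tokens, cache_read_input_tokens) are provider-dependent assumptions.

```python
from typing import Any, Mapping

from langchain_core.messages.ai import InputTokenDetails, UsageMetadata


def _create_usage_metadata(token_usage: Mapping[str, Any]) -> UsageMetadata:
    """Convert a LiteLLM token-usage dict into LangChain UsageMetadata,
    preserving cached-token counts that the upstream wrapper currently drops."""
    input_tokens = token_usage.get("prompt_tokens", 0)
    output_tokens = token_usage.get("completion_tokens", 0)

    # OpenAI-style usage nests the cached count under prompt_tokens_details;
    # Anthropic-style usage exposes cache_read_input_tokens at the top level.
    prompt_details = token_usage.get("prompt_tokens_details") or {}
    cache_read = (
        prompt_details.get("cached_tokens")
        or token_usage.get("cache_read_input_tokens")
        or 0
    )

    return UsageMetadata(
        input_tokens=input_tokens,
        output_tokens=output_tokens,
        total_tokens=token_usage.get("total_tokens", input_tokens + output_tokens),
        input_token_details=InputTokenDetails(cache_read=cache_read),
    )
```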

Proposal

There are several points to address:

  • We should use langchain_litellm.ChatLiteLLM instead of langchain_community.chat_models.litellm.ChatLiteLLM, since the latter is deprecated (see the import sketch after this list).
  • To use langchain_litellm, we need to update a number of LangChain dependencies to meet its dependency requirements.
  • Contribute the _create_usage_metadata fix to the upstream LangChain LiteLLM wrapper repository.
  • Ask a maintainer to release a new version of the langchain_litellm package.
  • Bump the dependency version in the AI Gateway repo.
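
As a rough illustration of the import migration (model name is illustrative and the exact version constraints still need to be worked out against the dependency tree):

```python
# Deprecated import path from langchain_community:
# from langchain_community.chat_models.litellm import ChatLiteLLM

# Dedicated package going forward (PyPI package name: langchain-litellm):
from langchain_litellm import ChatLiteLLM

llm = ChatLiteLLM(model="claude-3-5-sonnet-20241022")  # model name is illustrative
response = llm.invoke("Hello")
# Once fixed upstream, this should include input_token_details with cache_read populated.
print(response.usage_metadata)
```

The dependency bump itself would be along the lines of `poetry add langchain-litellm` (assuming the AI Gateway repo uses Poetry), plus whatever langchain-core / langchain-community bumps the resolver requires.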

Further details

Links / references
