
Create initial Tanuki bot api endpoint

Merged: Terri Chu requested to merge tchu-bot-create-new-api into master
All threads resolved!

What does this MR do and why?

Related to https://gitlab.com/gitlab-org/enablement-section/tanuki-bot/-/issues/2

Creates an API endpoint that accepts and returns JSON:

  • Endpoint: POST /-/llm/tanuki_bot/ask
  • Example request body: { "q": "What is advanced search?" }
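A minimal sketch of a client for this endpoint, in Python with only the standard library. The base URL `http://gdk.test:3000` and the helper name `build_ask_request` are assumptions for illustration. The MR description does not cover authentication or the response shape, so the sketch only builds the request and leaves sending it (with whatever session or token your instance requires) to the caller.

```python
import json
import urllib.request


def build_ask_request(base_url: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST to the Tanuki bot endpoint."""
    # The path and the "q" key are taken from the MR description.
    url = base_url.rstrip("/") + "/-/llm/tanuki_bot/ask"
    body = json.dumps({"q": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical local GDK address; replace it with your own instance.
    req = build_ask_request("http://gdk.test:3000", "What is advanced search?")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here because it depends on how your instance authenticates requests.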

This is part of a series of MRs that should be reviewed and merged in this order:

MR | Status
Add support for the embedding database (!118156 - merged) (db setup) | merged :white_check_mark:
Add the tanuki_bot model (!118195 - merged) (migration) | merged :white_check_mark:
Create initial Tanuki bot api endpoint (!117695 - merged) (api) | in review :star:

Screenshots or screen recordings

SCR-20230417-nsop

How to set up and validate locally

  1. Feature.enable(:openai_experimentation)
  2. Feature.enable(:tanuki_bot)
  3. ::Gitlab::CurrentSettings.update!(openai_api_key: '<YOUR_KEY>')
  4. Follow the instructions for setting up the MR dependency: Add support for the embedding database (!118156 - merged)
  5. Run migrations:
    rails db:migrate
  6. * Clone the tanuki-bot repository
  7. * cd into the pgvector folder
  8. * Use asdf to install Python (should be Python 3.11.3):
    asdf install python
  9. * Install pip
  10. * Install the requirements:
    pip install -r requirements.txt
  11. * Run the command that copies the Chroma DB to the GDK PostgreSQL database:
    OPENAI_API_KEY=not-an-actual-key PG_USER=<Your user> PG_HOST=<Your gdk install directory>/postgresql python chroma_to_pg.py
  12. Make an API request to POST /-/llm/tanuki_bot/ask

* You only need to do these steps once to have data in the embedding database.
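The copy command in step 11 depends on three environment variables. As a rough sketch, the same invocation could be assembled from Python as shown here. The helper names `build_copy_command` and `run_copy` are hypothetical, and `sys.executable` stands in for the `python` that asdf installs. The description passes a placeholder for `OPENAI_API_KEY`, which suggests the script only needs the variable to be set, so the sketch defaults to the same placeholder.

```python
import os
import subprocess
import sys


def build_copy_command(pg_user, gdk_dir, api_key="not-an-actual-key"):
    """Return (argv, env) mirroring the step 11 command line."""
    env = dict(os.environ)
    env.update(
        {
            # Variable names are the ones used in the MR description.
            "OPENAI_API_KEY": api_key,
            "PG_USER": pg_user,
            "PG_HOST": os.path.join(gdk_dir, "postgresql"),
        }
    )
    # Assumption: run the script with the current interpreter.
    argv = [sys.executable, "chroma_to_pg.py"]
    return argv, env


def run_copy(pg_user, gdk_dir, pgvector_dir):
    """Run chroma_to_pg.py from the pgvector folder of the tanuki-bot clone."""
    argv, env = build_copy_command(pg_user, gdk_dir)
    subprocess.run(argv, env=env, cwd=pgvector_dir, check=True)
```

Keeping the command construction separate from `subprocess.run` makes it easy to print and inspect the environment before anything touches the database.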

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Dmitry Gruzd

Activity

  • mentioned in merge request !117757 (merged)

  • Dmitry Gruzd changed the description

  • Dmitry Gruzd changed title from Draft: Create initial Tanuki bot api endpoint [ci skip] to Draft: Create initial Tanuki bot api endpoint

  • Dmitry Gruzd changed the description

  • Dmitry Gruzd changed the description

  • Dmitry Gruzd added 2 commits

    • 388876af - Move llm to separate route config
    • 8dd8f6f5 - Merge branch 'tanuki-bot/fix-params-and-route' into 'tchu-bot-create-new-api'

  • Contributor
    4 Warnings
    :warning: This merge request is quite big (523 lines changed), please consider splitting it into multiple merge requests.
    :warning: c3499921: Commits that change 30 or more lines across at least 3 files should describe these changes in the commit body. For more information, take a look at our Commit message guidelines.
    :warning: feature::addition and feature::enhancement merge requests normally have a documentation change. Consider adding a documentation update or confirming the documentation plan with the Technical Writer counterpart.

    For more information, see:

    :warning: Do not add new controller specs. We are moving from controller specs to
    request specs (and/or feature specs). Please add request specs under
    /spec/requests and/or /ee/spec/requests instead.

    See &5076 for information.

    1 Message
    :book: This merge request adds or changes files that require a review from the Database team.

    This merge request requires a database review. To make sure these changes are reviewed, take the following steps:

    1. Ensure the merge request has database and database::review pending labels. If the merge request modifies database files, Danger will do this for you.
    2. Prepare your MR for database review according to the docs.
    3. Assign and mention the database reviewer suggested by Reviewer Roulette.

    If you no longer require a database review, you can remove this suggestion by removing the database label and re-running the danger-review job.

    Reviewer roulette

    Changes that require review have been detected!

    Please refer to the table below for assigning reviewers and maintainers suggested by Danger in the specified category:

    Category | Reviewer | Maintainer
    backend | Tarun Vellishetty (@tvellishetty) (UTC+5.5, 9.5 hours ahead of @terrichu) | Douglas Barbosa Alexandre (@dbalexandre) (UTC+0, 4 hours ahead of @terrichu)
    database | Vitali Tatarintev (@ck3g) (UTC+2, 6 hours ahead of @terrichu) | Michał Zając (@Quintasan) (UTC+2, 6 hours ahead of @terrichu)

    To spread load more evenly across eligible reviewers, Danger has picked a candidate for each review slot, based on their timezone. Feel free to override these selections if you think someone else would be better-suited or use the GitLab Review Workload Dashboard to find other available reviewers.

    To read more on how to use the reviewer roulette, please take a look at the Engineering workflow and code review guidelines. Please consider assigning a reviewer or maintainer who is a domain expert in the area of the merge request.

    Once you've decided who will review this merge request, assign them as a reviewer! Danger does not automatically notify them for you.

    If needed, you can retry the :repeat: danger-review job that generated this comment.

    Generated by :no_entry_sign: Danger

  • Contributor

    Allure report

    allure-report-publisher generated test report!

    e2e-review-qa: :white_check_mark: test report for 3bf3b0bd

    +-----------------------------------------------------------------------+
    |                            suites summary                             |
    +------------------+--------+--------+---------+-------+-------+--------+
    |                  | passed | failed | skipped | flaky | total | result |
    +------------------+--------+--------+---------+-------+-------+--------+
    | Framework sanity | 9      | 0      | 1       | 0     | 10    | ✅     |
    | Plan             | 49     | 0      | 1       | 0     | 50    | ✅     |
    | Create           | 28     | 0      | 1       | 0     | 29    | ✅     |
    | Data Stores      | 22     | 0      | 0       | 0     | 22    | ✅     |
    | Govern           | 24     | 0      | 0       | 0     | 24    | ✅     |
    | Manage           | 8      | 0      | 3       | 0     | 11    | ✅     |
    | Verify           | 10     | 0      | 0       | 0     | 10    | ✅     |
    | Monitor          | 4      | 0      | 0       | 0     | 4     | ✅     |
    | Package          | 0      | 0      | 1       | 0     | 1     | ➖     |
    +------------------+--------+--------+---------+-------+-------+--------+
    | Total            | 154    | 0      | 7       | 0     | 161   | ✅     |
    +------------------+--------+--------+---------+-------+-------+--------+

    e2e-package-and-test: :exclamation: test report for 3bf3b0bd

    +-----------------------------------------------------------------------+
    |                            suites summary                             |
    +------------------+--------+--------+---------+-------+-------+--------+
    |                  | passed | failed | skipped | flaky | total | result |
    +------------------+--------+--------+---------+-------+-------+--------+
    | Create           | 292    | 0      | 42      | 58    | 334   | ❗     |
    | Verify           | 108    | 0      | 8       | 88    | 116   | ❗     |
    | Manage           | 66     | 0      | 6       | 30    | 72    | ❗     |
    | Plan             | 120    | 0      | 0       | 66    | 120   | ❗     |
    | Monitor          | 20     | 0      | 0       | 20    | 20    | ❗     |
    | Secure           | 14     | 0      | 10      | 14    | 24    | ❗     |
    | Data Stores      | 68     | 0      | 0       | 22    | 68    | ❗     |
    | Govern           | 92     | 0      | 0       | 92    | 92    | ❗     |
    | Fulfillment      | 4      | 0      | 48      | 0     | 52    | ✅     |
    | Release          | 12     | 0      | 0       | 8     | 12    | ❗     |
    | Analytics        | 4      | 0      | 0       | 4     | 4     | ❗     |
    | Framework sanity | 0      | 0      | 2       | 0     | 2     | ➖     |
    | Package          | 0      | 0      | 6       | 0     | 6     | ➖     |
    | Configure        | 0      | 0      | 6       | 0     | 6     | ➖     |
    | ModelOps         | 0      | 0      | 2       | 0     | 2     | ➖     |
    | Growth           | 0      | 0      | 4       | 0     | 4     | ➖     |
    +------------------+--------+--------+---------+-------+-------+--------+
    | Total            | 800    | 0      | 134     | 402   | 934   | ❗     |
    +------------------+--------+--------+---------+-------+-------+--------+
  • Terri Chu changed target branch from tchu-bot-service-api to master

  • Terri Chu changed milestone to %16.0

  • assigned to @dgruzd and @maddievn

  • Terri Chu mentioned in merge request !117585 (closed)

  • Dmitry Gruzd added 1 commit

    • b64f952d - Apply 1 suggestion(s) to 1 file(s)

  • Dmitry Gruzd added 2 commits

    • bbaf55a5 - Implement parallel requests for Tanuki Bot
    • 7b728da7 - Add the llm namespace

  • @zcuddy Continuing the discussion from !117757 (comment 1354791362)

    We've slightly changed the URL one more time. Now it is /-/llm/tanuki_bot/ask

    /cc @terrichu @maddievn

  • Dmitry Gruzd changed the description

  • Dmitry Gruzd added 1 commit

  • Terri Chu added 1 commit

    • 53c9dda8 - Require q param in controller & service, update specs

  • Terri Chu added 1 commit

    • 36a4b81f - Update specs, fix url building

  • Terri Chu added 1 commit

    • 4ccd40d5 - Add check for users on paid hosted plans for GitLab.com

  • Terri Chu added 1 commit

    • 6effe0ec - Fix bug found in CI, add more FF tests

  • added 1 commit

  • Dmitry Gruzd added 1 commit

  • Terri Chu resolved all threads

  • Joern Schneeweisz resolved all threads

  • Terri Chu added 1 commit

    • 00faa3c4 - Rename licensed feature to ai_tanuki_bot

  • Zack Cuddy mentioned in merge request !117967 (merged)

  • added 1 commit

    • b2b4600e - Update feature flag milestone

  • Terri Chu mentioned in merge request !118195 (merged)

  • Terri Chu added 840 commits

    • b2b4600e...042f0620 - 808 commits from branch master
    • 042f0620...9188ca8a - 22 earlier commits
    • 08a1bd71 - Add check for users on paid hosted plans for GitLab.com
    • 841aaebf - Fix bug found in CI, add more FF tests
    • d116e9e7 - Remove comment
    • 68d5161b - Refactor controller & specs
    • 93fba8c7 - Rename licensed feature to ai_tanuki_bot
    • 7ef6140a - Update feature flag milestone
    • 896e48b0 - Rebase onto database MR and refactor
    • b30586fd - Add pgvector to CI
    • 7aab64f8 - fixup! Add pgvector to CI
    • 857c4b6f - fixup! Rebase onto database MR and refactor

  • Terri Chu changed target branch from master to add-tanuki-bot-model

  • Dmitry Gruzd added 1 commit

    • 21bf289a - Implement parallel requests for Tanuki Bot

  • Dmitry Gruzd mentioned in merge request !117793 (closed)

  • Dmitry Gruzd mentioned in merge request !118066 (closed)

  • Dmitry Gruzd changed the description

  • Terri Chu mentioned in merge request !118156 (merged)

  • Terri Chu added 7 commits

  • added 2 commits

  • Madelein van Niekerk resolved all threads

  • Madelein van Niekerk changed the description

  • added 1 commit

    • 4b0eaeb4 - Move endpoint out of namespace

  • Madelein van Niekerk changed the description

  • added 1 commit

    • 386e22ba - Remove comment and reference to trait

  • Madelein van Niekerk changed the description

  • Terri Chu marked the checklist item I have evaluated the MR acceptance checklist for this MR. as completed

  • Terri Chu added 331 commits

  • Terri Chu added 1 commit

    • a0a8981e - Apply 1 suggestion(s) to 1 file(s)

  • Terri Chu added 1 commit

    • fd8ce90f - Fixes from testing with frontend

  • Changzheng Liu added 1 commit

  • Zack Cuddy mentioned in merge request !117597 (merged)

  • removed database label

  • Terri Chu added 12 commits

  • Terri Chu changed the description

  • Matt Kasa added 176 commits

  • Dmitry Gruzd added 3 commits

  • Dmitry Gruzd added 23 commits

  • Matt Kasa added 3 commits

  • Dmitry Gruzd resolved all threads

  • Dmitry Gruzd changed the description

  • Dmitry Gruzd requested review from @mattkasa

  • Dmitry Gruzd changed the description

  • Matt Kasa requested review from @tkuah

  • mentioned in merge request !118463 (merged)

  • Changzheng Liu requested review from @stanhu and removed review request for @tkuah

  • Stan Hu removed review request for @stanhu

  • Dmitry Gruzd added 5 commits

  • Dmitry Gruzd added 1 commit

    • 7fa45e42 - fixup! Rebase changes and address reviewer's feedback

  • Dmitry Gruzd added 1 commit

    • 6ae42d15 - fixup! Rebase changes and address reviewer's feedback

  • Dmitry Gruzd added 1 commit

    • 9be62654 - fixup! Rebase changes and address reviewer's feedback

  • Dmitry Gruzd added 1 commit

    • 629087ff - Rebase changes and address reviewer's feedback

  • Dmitry Gruzd added 1 commit

    • 85722d1f - Rebase changes and address reviewer's feedback

  • Dmitry Gruzd added 1 commit

    • 4b11ab4d - Rebase changes and address reviewer's feedback

  • Dmitry Gruzd added 1 commit

    • 6df6c845 - Rebase changes and address reviewer's feedback

  • Dmitry Gruzd added 1 commit

    • 1cabdfa3 - Rebase changes and address reviewer's feedback

  • Dmitry Gruzd added 1 commit

    • d1a04f64 - Rebase changes and address reviewer's feedback

  • Dmitry Gruzd deleted the add-tanuki-bot-model branch. This merge request now targets the master branch

  • Dmitry Gruzd added 409 commits

  • @stanhu thank you for the review! I believe it is ready for another look :handshake:

  • Dmitry Gruzd requested review from @stanhu and removed review request for @mattkasa

  • Dmitry Gruzd marked this merge request as ready

  • Dmitry Gruzd changed the description

  • Hmm, it looks like pgvector was not building on my system because MacOSX13.0.sdk doesn't exist:

    --------------------------------------------------------------------------------
    Building pgvector/vector.so version v0.4.1
    --------------------------------------------------------------------------------
    clang: warning: no such sysroot directory: '/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.0.sdk' [-Wmissing-sysroot]
    In file included from src/ivfbuild.c:1:
    In file included from /Users/stanhu/.asdf/installs/postgres/12.13/include/server/postgres.h:46:
    /Users/stanhu/.asdf/installs/postgres/12.13/include/server/c.h:59:10: fatal error: 'stdio.h' file not found
    #include <stdio.h>
             ^~~~~~~~~
    1 error generated.
    make[1]: *** [src/ivfbuild.o] Error 1
    make: *** [pgvector/vector.so] Error 2
    % ls -al /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs
    total 0
    drwxr-xr-x  5 root  wheel  160 Apr  1 05:17 .
    drwxr-xr-x  6 root  wheel  192 Mar 24 04:08 ..
    drwxr-xr-x  7 root  wheel  224 Mar  9 16:58 MacOSX.sdk
    lrwxr-xr-x  1 root  wheel   10 Apr  1 05:13 MacOSX13.3.sdk -> MacOSX.sdk
    lrwxr-xr-x  1 root  wheel   10 Apr  1 05:13 MacOSX13.sdk -> MacOSX.sdk

    I may have been messing with this when testing another merge request. Creating a symlink fixed the problem. :white_check_mark:
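The comment above does not say which symlink was created. One plausible reading, given that the compiler looked for `MacOSX13.0.sdk` and the listing only shows `MacOSX13.sdk` and `MacOSX13.3.sdk`, is a link named `MacOSX13.0.sdk` pointing at `MacOSX.sdk`. The sketch below works on that assumption; the helper name `ensure_sdk_symlink` is hypothetical, and writing into the real Xcode SDKs directory would normally require elevated permissions.

```python
from pathlib import Path


def ensure_sdk_symlink(sdk_dir, link_name="MacOSX13.0.sdk", target="MacOSX.sdk"):
    """Create sdk_dir/link_name -> target if it is missing.

    Returns True when a link was created, False when one already existed.
    """
    sdk_dir = Path(sdk_dir)
    link = sdk_dir / link_name
    if link.is_symlink() or link.exists():
        return False
    if not (sdk_dir / target).exists():
        raise FileNotFoundError(f"{target} not found in {sdk_dir}")
    # Relative link, matching the existing MacOSX13.sdk -> MacOSX.sdk entries.
    link.symlink_to(target)
    return True
```

Calling it with the SDKs directory from the listing (as a user allowed to write there) would add the missing entry next to the existing links and leave them untouched.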

    • Resolved by Stan Hu

      I have to set up an OpenAI key? That worries me a little. :smile:

      % PG_HOST=~/gdk-ee/postgresql python chroma_to_pg.py
      Traceback (most recent call last):
        File "/Users/stanhu/gitlab/tanuki-bot/pgvector/chroma_to_pg.py", line 23, in <module>
          embedding = OpenAIEmbeddings()
                      ^^^^^^^^^^^^^^^^^^
        File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
      pydantic.error_wrappers.ValidationError: 1 validation error for OpenAIEmbeddings
      __root__
        Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass  `openai_api_key` as a named parameter. (type=value_error)
  • Dmitry Gruzd added 1 commit

    • 39c5377f - Rebase changes and address reviewer's feedback

  • Dmitry Gruzd changed the description

  • Contributor
    4 Warnings
    :warning: This merge request is quite big (526 lines changed), please consider splitting it into multiple merge requests.
    :warning: 39c5377f: Commits that change 30 or more lines across at least 3 files should describe these changes in the commit body. For more information, take a look at our Commit message guidelines.
    :warning: feature::addition and feature::enhancement merge requests normally have a documentation change. Consider adding a documentation update or confirming the documentation plan with the Technical Writer counterpart.

    For more information, see:

    :warning: Do not add new controller specs. We are moving from controller specs to
    request specs (and/or feature specs). Please add request specs under
    /spec/requests and/or /ee/spec/requests instead.

    See &5076 for information.

    Reviewer roulette

    Changes that require review have been detected!

    Please refer to the table below for assigning reviewers and maintainers suggested by Danger in the specified category:

    Category | Reviewer | Maintainer
    backend | Thomas Hutterer (@thutterer) (UTC+2, 6 hours ahead of @terrichu) | Peter Leitzen (@splattael) (UTC+2, 6 hours ahead of @terrichu)

    To spread load more evenly across eligible reviewers, Danger has picked a candidate for each review slot, based on their timezone. Feel free to override these selections if you think someone else would be better-suited or use the GitLab Review Workload Dashboard to find other available reviewers.

    To read more on how to use the reviewer roulette, please take a look at the Engineering workflow and code review guidelines. Please consider assigning a reviewer or maintainer who is a domain expert in the area of the merge request.

    Once you've decided who will review this merge request, assign them as a reviewer! Danger does not automatically notify them for you.

    If needed, you can retry the :repeat: danger-review job that generated this comment.

    Generated by :no_entry_sign: Danger

  • removed database label

  • Stan Hu approved this merge request

  • :wave: @stanhu, thanks for approving this merge request.

    This is the first time the merge request is approved. To ensure full test coverage, a new pipeline will be started shortly.

    For more info, please refer to the following links:

  • Dmitry Gruzd added 1 commit

    • 3bf3b0bd - Rename document to document_id

  • Matt Kasa approved this merge request

  • Stan Hu approved this merge request

  • Stan Hu resolved all threads

  • Stan Hu enabled an automatic merge when the pipeline for e4d22daf succeeds

  • merged

  • Stan Hu mentioned in commit cd0dca6f

  • added workflow::staging label and removed workflow::canary label

  • Terri Chu resolved all threads

  • Gosia Ksionek mentioned in epic &10524

  • mentioned in issue #461062 (closed)
