Corpus Management - Corpus Upload - Validation Errors

Related to #342433 (closed).

Implementation Plan

  • Handle validation errors for correct file upload type
  • Render server validation errors (incorrect file size, file type, timeouts, corpus name uniqueness)

Notes

Only two of the endpoints we hit take user input that could affect the outcome.

Corpus Upload

Client Side Validations

  • We limit the file input type to .zip. There are no package registry entry validators for the content of the .zip.

  • File size limit: 5 GB

  • We will use client-side regex validation to check the file name in real time.
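
As a rough illustration of the three client-side checks listed above, here is a minimal shell sketch. The function name, the name pattern, the error wording, and the reading of "5 GB" as 5 GiB are all assumptions made for the example; the real checks would live in the upload form.

```shell
# Hypothetical limits and pattern; adjust them to match the real server rules.
MAX_CORPUS_BYTES=$((5 * 1024 * 1024 * 1024))  # assumes "5 GB" means 5 GiB
NAME_PATTERN='^[A-Za-z0-9._-]+$'              # assumed file name format

# Prints the first validation error and returns 1, or returns 0 if all checks pass.
# Usage: validate_corpus_upload <file_name> <size_in_bytes>
validate_corpus_upload() {
  name=$1
  size=$2
  # Check 1: only .zip files are accepted.
  case $name in
    *.zip) ;;
    *) echo "File must be a .zip archive"; return 1 ;;
  esac
  # Check 2: the file name must match the (assumed) allowed pattern.
  if ! printf '%s' "$name" | grep -Eq "$NAME_PATTERN"; then
    echo "File name contains unsupported characters"; return 1
  fi
  # Check 3: the file must not exceed the size limit.
  if [ "$size" -gt "$MAX_CORPUS_BYTES" ]; then
    echo "File exceeds the 5 GB limit"; return 1
  fi
  return 0
}
```

Calling `validate_corpus_upload corpus1_file.txt 1024` prints "File must be a .zip archive", while `validate_corpus_upload corpus1_file.zip 1024` prints nothing and returns 0.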

Server Side Validations

Filename format validation exists on the server, but we can validate on the client in real time, so we never reach that server error.

  1. Package registry is not enabled: the server returns a 403 Forbidden. In this case we surface "An error has occurred, please verify package registry is enabled". In the future we should probably not enable corpus management until package registry is enabled.
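
The mapping from response status to the surfaced message could look roughly like the sketch below. Only the 403 wording comes from the note above; the function name and the generic fallback message are assumptions for illustration.

```shell
# Maps an HTTP status from the upload endpoint to a user-facing message.
# Only the 403 wording is taken from the notes; the fallback text is assumed.
upload_error_message() {
  case $1 in
    2??) ;;  # any 2xx status is a success, so there is nothing to show
    403) echo "An error has occurred, please verify package registry is enabled" ;;
    *)   echo "An error has occurred" ;;
  esac
}
```

For example, `upload_error_message 403` prints the package registry message, and `upload_error_message 201` prints nothing.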

CURL

The following curl requests can be used to test the different scenarios:

Without select (current)

With Package registry disabled for the project.


curl --header "PRIVATE-TOKEN: yourtoken" \
              --upload-file corpus1_file.zip \
              "http://gitlab.localdev:3000/api/v4/projects/1/packages/generic/corpus1_package/0.0.1/corpus1_file.zip?status=default"
{"message":"403 Forbidden"}

With Package registry enabled for the project.

curl --header "PRIVATE-TOKEN: yourtoken" \
              --upload-file corpus1_file.zip \
              "http://gitlab.localdev:3000/api/v4/projects/1/packages/generic/corpus1_package/0.0.1/corpus1_file.zip?status=default"
{"message":"201 Created"}

With select = package_file


curl --header "PRIVATE-TOKEN: yourtoken" \
              --upload-file corpus1_file.txt \
              "http://gitlab.localdev:3000/api/v4/projects/1/packages/generic/corpus1_package/0.0.1/corpus1_file.txt?status=default&select=package_file"
{"id":37,"package_id":26,"created_at":"2021-10-12T12:20:22.651Z","updated_at":"2021-10-12T12:20:22.651Z","size":0,"file_store":1,"file_md5":null,"file_sha1":null,"file_name":"corpus1_file.txt","file":{"url":"/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b/packages/26/files/37/corpus1_file.txt"},"file_sha256":"e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855","verification_retry_at":null,"verified_at":null,"verification_failure":null,"verification_retry_count":null,"verification_checksum":null,"verification_state":0,"verification_started_at":null,"new_file_path":null

With select = invalid_value (we never reach this scenario because we only allow .zip in the file input and "select" is set programmatically, not by the user).


curl --header "PRIVATE-TOKEN: yourtoken" \
              --upload-file corpus1_file.txt \
              "http://gitlab.localdev:3000/api/v4/projects/1/packages/generic/corpus1_package/0.0.1/corpus1_file.txt?status=default&select=invalid"
{"error":"select does not have a valid value"}

If the name is blank, the URL is malformed and the request results in a 400.

An invalid package name results in a 400 error: "package_name is invalid".
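
To run the scenarios above repeatably, a small wrapper along these lines may help. Both function names are hypothetical. The URL layout is copied from the requests above, and `--write-out '%{http_code}'` is the standard curl option for printing the response status. The sketch also shows why a blank name yields a malformed URL: the empty segment leaves a double slash in the path (shown here for the package name, which is an assumption about which name is meant).

```shell
# Builds the generic package upload URL used in the requests above.
# Usage: corpus_upload_url <base_url> <project_id> <package_name> <version> <file_name>
corpus_upload_url() {
  printf '%s/api/v4/projects/%s/packages/generic/%s/%s/%s?status=default\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Uploads a file and prints only the HTTP status code of the response.
# Usage: corpus_upload_status <token> <file_path> <url>
corpus_upload_status() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
    --header "PRIVATE-TOKEN: $1" --upload-file "$2" "$3"
}
```

With an empty package name, `corpus_upload_url http://gitlab.localdev:3000 1 '' 0.0.1 corpus1_file.zip` produces a path containing `generic//0.0.1`, which is the malformed case.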

Corpus Commit

Server Side

None. It only needs a valid package ID, which is not provided by the user; we get it from a successful corpus upload.

Client Side

None. The package ID comes from a successful corpus upload, so we handle validations at the corpus upload stage.
