Skip to content

Validate repo presence when importing Repo by URL

What does this MR do and why?

This MR introduces validation if there is actual git repo behind the url

Original issue: #331805 (closed)

According to https://git-scm.com/docs/http-protocol#_smart_clients there are following requirements for client:

The Content-Type MUST be application/x-$servicename-advertisement. Clients SHOULD fall back to the dumb protocol if another content type is returned. When falling back to the dumb protocol clients SHOULD NOT make an additional request to $GIT_URL/info/refs, but instead SHOULD use the response already in hand. Clients MUST NOT continue if they do not support the dumb protocol.

Clients MUST validate the status code is either 200 OK or 304 Not Modified.

Clients MUST validate the first five bytes of the response entity matches the regex ^[0-9a-f]{4}#. If this test fails, clients MUST NOT continue.

Clients MUST parse the entire response as a sequence of pkt-line records.

Clients MUST verify the first pkt-line is # service=$servicename. Servers MUST set $servicename to be the request parameter value. Servers SHOULD include an LF at the end of this line. Clients MUST ignore an LF at the end of the line.

Everything here is implemented except:

  • dumb protocol support. I was not able to found any existing dumb protocol server in the wild and this will create even more code. Could be added later if needed
  • 304 Not Modified status code - we should never receive it, since no caching is done on our side
  • Parsing entire response as sequence of pkt-line records - too resource-consuming for simple validation, only first pkt-line is validated

Screenshots or screen recordings

New_Project___GitLab

How to set up and validate locally

  1. Open /projects/new#import_project
  2. Select "Repo by URL"
  3. Provide username / credentials and ensure that error message appears / disappears correctly

For testing basic auth (repo with credentials):

  1. Generate new token with read_repository permissions on gitlab.com
  2. Use any private repository URL to test access (note: username is ignored in gitlab when using personal access token as password, this is known

MR acceptance checklist

These checklists encourage us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Quality

  • Quality checklist confirmed
  1. I have self-reviewed this MR per code review guidelines.
  2. For the code that that this change impacts, I believe that the automated tests (Testing Guide) validate functionality that is highly important to users (including consideration of all test levels). If the existing automated tests do not cover this functionality, I have added the necessary additional tests or I have added an issue to describe the automation testing gap and linked it to this MR.
  3. I have considered the technical aspects of the impact of this change on both gitlab.com hosted customers and self-hosted customers.
  4. I have considered the impact of this change on the front-end, back-end, and database portions of the system where appropriate and applied frontend, backend and database labels accordingly.
  5. I have tested this MR in all supported browsers, or determiend that this testing is not needed.
  6. I have confirmed that this change is backwards compatible across updates, or I have decided that this does not apply.
  7. I have properly separated EE content from FOSS, or this MR is FOSS only. (Where should EE code go?)
  8. If I am introducing a new expectation for existing data, I have confirmed that existing data meets this expectation or I have made this expectation optional rather than required.

Performance, reliability, and availability

  • Performance, reliability, and availability checklist confirmed
  1. I am confident that this MR does not harm performance, or I have asked a reviewer to help assess the performance impact. (Merge request performance guidelines)
  2. I have added information for database reviewers in the MR description, or I have decided that it is unnecessary. (Does this MR have database-related changes?)
  3. I have considered the availability and reliability risks of this change. I have also considered the scalability risk based on future predicted growth
  4. I have considered the performance, reliability and availability impacts of this change on large customers who may have significantly more data than the average customer.

Documentation

  • Documentation checklist confirmed
  1. I have included changelog trailers, or I have decided that they are not needed. (Does this MR need a changelog?)
  2. I have added/updated documentation, or I have decided that documentation changes are not needed for this MR. (Is documentation required?)

Security

  • Security checklist confirmed
  1. I have confirmed that if this MR contains changes to processing or storing of credentials or tokens, authorization, and authentication methods, or other items described in the security review guidelines, I have added the label security and I have @-mentioned @gitlab-com/gl-security/appsec.

Deployment

  • Deployment checklist confirmed
  1. I have considered using a feature flag for this change because the change may be high risk. If I decided to use a feature flag, I plan to test the change in staging before I test it in production, and I have considered rolling it out to a subset of production customers before doing rolling it out to all customers. When to use a feature flag
  2. I have informed the Infrastructure department of a default setting or new setting change per definition of done, or decided that this is not needed.
Edited by Fabio Pitino

Merge request reports