Ability to determine if gitattributes matches a file at a given ref/branch

What

We need to be able to look up values for a branch's .gitattributes file. Currently we can only do this using info/attributes from the default branch.

Ideally this behaviour would match git's normal behaviour, which might involve checking .gitattributes files nested in subfolders and allowing filters to be overridden, cancelled out, etc. Some investigation might be required to see how rugged's approach (on bare repos) differs from using check-attr normally.

Why

This is needed so we can determine if files match the LFS filter in https://gitlab.com/gitlab-org/gitlab-ce/issues/29876 and in https://gitlab.com/gitlab-org/gitlab-ce/issues/39785

.gitattributes order

https://git-scm.com/docs/gitattributes#_description:

When deciding what attributes are assigned to a path, Git consults $GIT_DIR/info/attributes file (which has the highest precedence), .gitattributes file in the same directory as the path in question, and its parent directories up to the toplevel of the work tree (the further the directory that contains .gitattributes is from the path in question, the lower its precedence). Finally global and system-wide files are considered (they have the lowest precedence).
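To make the quoted lookup order concrete, here is a minimal sketch that lists the attribute sources for a path, highest precedence first. The helper name `attribute_sources` and the placeholder labels are hypothetical and only illustrate the ordering described above; this is not how git, Rugged or GitLab implement it.

```ruby
# Hypothetical helper (not part of git, Rugged or GitLab): lists the places
# the quoted documentation says are consulted for `path`, highest precedence
# first. The strings are labels for illustration, not resolved file paths.
def attribute_sources(path)
  # Directory components of the path, ignoring "." for top-level files.
  dirs = File.dirname(path).split('/').reject { |part| part.empty? || part == '.' }

  # Nested .gitattributes files: the directory closest to the path comes first.
  nested = dirs.length.downto(1).map do |depth|
    File.join(*dirs.first(depth), '.gitattributes')
  end

  ['$GIT_DIR/info/attributes'] +
    nested +
    ['.gitattributes', 'global and system-wide attribute files']
end

puts attribute_sources('subdir/pointer.sub')
```

For `subdir/pointer.sub` this puts `subdir/.gitattributes` ahead of the top-level `.gitattributes`, which is the nesting behaviour a branch-aware lookup would need to reproduce.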

Current Gitlab::Git::Attributes model

This has been hardcoded to info/attributes and can't handle .gitattributes files individually, but it can probably be reused, either with a different path or by passing in the contents of .gitattributes as a string.

The Attributes class also overwrites rugged's methods in UseGitlabGitAttributes.

What Rugged provides

For some reason this doesn't ask for a specific branch, even when using [:file, :index] priority. Using rugged_copy_gitattributes for a given branch does make this work, but I'm not sure whether it then only ends up using info/attributes.

Rugged allows different search strategies such as [:index, :file], [:index] or :skip_system, which change the flag passed in to fetch_attributes(file_path, nil, 0).
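As a rough illustration of how a search strategy could turn into the numeric flag, the sketch below maps the priorities named above onto libgit2's attribute-check flag values. The method name `attribute_flags` and its keyword arguments are hypothetical, and the mapping is my understanding of what Rugged does rather than a copy of its code, so it should be checked against the Rugged source before relying on it.

```ruby
# Flag values as defined in libgit2's attr.h; the mapping from Rugged's
# priority option to these values is an assumption to verify.
LOAD_PRIORITIES = {
  [:file, :index] => 0, # GIT_ATTR_CHECK_FILE_THEN_INDEX
  [:index, :file] => 1, # GIT_ATTR_CHECK_INDEX_THEN_FILE
  [:index]        => 2  # GIT_ATTR_CHECK_INDEX_ONLY
}.freeze

NO_SYSTEM = 1 << 2 # GIT_ATTR_CHECK_NO_SYSTEM

# Hypothetical helper: combines a priority and the skip_system switch into
# the integer that would be passed as the third argument of fetch_attributes.
def attribute_flags(priority: [:file, :index], skip_system: false)
  flags = LOAD_PRIORITIES.fetch(priority, 0)
  flags |= NO_SYSTEM if skip_system
  flags
end

puts attribute_flags(priority: [:index], skip_system: true)
```

Under this reading, the `0` in `fetch_attributes(file_path, nil, 0)` corresponds to the default file-then-index order with system files included.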

Using check-attr with a temporary index

As suggested in https://gitlab.com/gitlab-org/gitlab-ce/issues/39785#note_52245379 it is possible to use check-attr for this, using a temporary index file to work around the limitations of bare repositories.

Using values set in the nested .gitattributes file in https://gitlab.com/jamedjo/engine/blob/master/subdir/.gitattributes we get the following:

engine.git|master» GIT_INDEX_FILE=tmp-index git read-tree --reset -i master
engine.git|master» GIT_INDEX_FILE=tmp-index git check-attr --cached --all subdir/pointer.sub
subdir/pointer.sub: diff: lfs
subdir/pointer.sub: merge: lfs
subdir/pointer.sub: text: unset
subdir/pointer.sub: filter: lfs
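
The two commands above could be wrapped from Ruby roughly as follows. This is only a sketch: `parse_check_attr` and `attributes_at_ref` are hypothetical names, the temporary index location is an arbitrary choice, and the parser assumes attribute names never contain a colon.

```ruby
require 'open3'
require 'tmpdir'

# Hypothetical helper: turns `git check-attr` output lines of the form
# "<path>: <attribute>: <value>" into { path => { attribute => value } }.
def parse_check_attr(output)
  output.each_line.with_object({}) do |line, attributes|
    match = line.chomp.match(/\A(?<path>.*): (?<name>[^:]+): (?<value>.*)\z/)
    next unless match

    (attributes[match[:path]] ||= {})[match[:name]] = match[:value]
  end
end

# Hypothetical helper: runs the same read-tree / check-attr pair as the
# transcript above against a throwaway index, so a bare repository's own
# index is never touched.
def attributes_at_ref(repo_path, ref, file_path)
  Dir.mktmpdir do |dir|
    env = { 'GIT_INDEX_FILE' => File.join(dir, 'tmp-index') }

    _, status = Open3.capture2(env, 'git', 'read-tree', '--reset', '-i', ref, chdir: repo_path)
    raise "git read-tree failed for #{ref}" unless status.success?

    output, status = Open3.capture2(env, 'git', 'check-attr', '--cached', '--all', '--', file_path, chdir: repo_path)
    raise "git check-attr failed for #{file_path}" unless status.success?

    parse_check_attr(output).fetch(file_path, {})
  end
end
```

With the repository from the example, `attributes_at_ref('engine.git', 'master', 'subdir/pointer.sub')['filter']` would be expected to return `"lfs"`, which is the value needed to decide whether a file matches the LFS filter.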

Performance

We switched to Gitlab::Git::Attributes for performance reasons in gitlab_git@340e111e. We might want to re-examine this before starting to use rugged again for this new scenario.

Edited by James Edwards-Jones