@sytses @JobV Git LFS only pushes data through HTTP(S), while git annex does that through SSH. We should first check whether our customers use it for that reason. Otherwise, I am OK with removing it.
We originally decided on git annex because we thought using SSH was a good thing. Reading through the docs didn't make clear which technique to favour.
However, we ran into many problems using git annex. The most important one is that it is really hard to make it work well in gitlab-ci. Since we had just started, I decided to port to git lfs. My feeling was that support for git lfs is more mature (and tested through the CE) than support for git annex.
So, yes. I'd be in favour of removing the feature and keeping the one which is better integrated into the whole stack.
Douwe Maan changed title from Remove Git Annex to Deprecate Git Annex (remove with 10.0)
Mike, do not promise to keep it until 10.0. It might take 9 months, it might take more; there is no need to make a decision. BTW, if this is about git annex, it needs to go in 9.0: it is a security problem waiting to happen.
@sytses What are the specific security concerns? Are there any references I can read to understand them better?
I think that, without a good handle on the number of people using Git Annex, providing just 1 month to react to deprecation and removal is fairly tight, especially given change control timelines in larger businesses.
Douwe Maan changed title from Deprecate Git Annex (remove with 10.0) to Remove Git Annex
@marin There were 4 tickets since November 1st 2016 that reference git annex. The surface area of the seats affected is ~300. Of note, a reseller did ask some questions about Git Annex during that time, so it may be worth looping in @malessio to talk to resellers about their exposure.
@lbot As long as they get notice to stop talking about it, AND we replace it with LFS, we will be fine. If this is a sure thing, then I will let Perforce know now as they may be relying upon it for their GitSwarm plans.
@malessio I'm going to say get ready for the convo based on this issue. Might be worth it to put it on their radar ASAP as Build wants to move on this fast.
I am talking with their sales VP today. He won't know much about their development efforts - but can loop the appropriate person in - which has all changed now that they shut their Canadian Office. (Git Swarm was done in Canada).
As a user, this is a bit disappointing; git-annex was a good reason to use GitLab over the others (although the tooling has become quite a bit better so this is not as big of a point as it was) and LFS barely existed at the time. Admittedly, my public repos are scientific and almost frozen, and have a 99% chance that no-one will look at them, but because they're frozen, I'd really like to not have to change them. For private repos, the annex assistant is great for keeping things in sync.
I'm applying a bit of a Sunk Cost Fallacy, but I don't really want to learn LFS now that I know git-annex if I don't have to. If this gets done, there really needs to be rock-solid documentation on converting because at this point, the official LFS info is quite lacking (that's not to say git-annex's documentation is amazing, but at least I can find some answers there.)
As to security concerns, I don't see any security-level bug reports on Debian or Ubuntu, so I don't even know what this references. Haskell makes some pretty strong guarantees about correctness and though that doesn't preclude any developer from making a mistake, I'm sure Joey can stay on top of that. That being said, Haskell makes git-annex a right pain to build properly anywhere.
@QuLogic Thanks for the insight. Git LFS is much easier to use and adopt than annex and the learning curve should be fairly shallow. We'll make sure that in the 8.17 blog post where we announce the deprecation, we link to resources.
I am a bit sad about this, as git-annex is a lot easier for end-users (thanks to the assistant) and more widely available (e.g. in the standard repos of Debian, Archlinux, Fedora etc., unlike git-lfs).
The first reason for deprecation according to the migration guide seems to be:
Git Annex works only through SSH, whereas Git LFS works both with SSH and HTTPS (SSH support was added in GitLab 8.12).
However, at least for pull I'm not sure this is correct. For instance, the git-annex download page supports pull over HTTPS.
Example:
git clone https://downloads.kitenet.net/.git/
git annex get downloads.kitenet.net/git-annex/presentation.svg
The extra server side setup needed for this isn't particularly involved, though perhaps there's a reason why it was never enabled on Gitlab...
As a git-annex user I'm quite disappointed by this. My team chose gitlab for this exact reason, and the truth is I find the reasons used to favor LFS quite shallow.
I hope you are very clear about the timeline because our whole workflow depends on this.
Thank you for your answer, I will look into that. In the meantime I'm trying to back up everything to our own servers (we are using gitlab.com for our repos). Is it possible that I'm no longer able to push with git-annex to gitlab.com?
Thanks.
@juan-cardelino I'm not sure, GitLab.com is currently running an 8.17 branch, although it's in the process of being moved over to 9.0. Any pointers @DouweM ?
I just read that starting from 9.0 you will drop git annex support.
That is not acceptable.
Our entire workflow is based and built on git-annex, and git LFS is not a drop-in replacement for it.
So it means that we will not be able to upgrade to 9.0.
The point of gitlab is to ease development. Here you are screwing ours.
It is not possible to remove features on which people are relying.
When we decided to switch to gitlab, and to subscribe git-annex was a requirement.
Our facility/center just upgraded to 9.0 and has already applied some updates, so downgrading in order to do the migration is not an option. Could you provide a migration document for the case where downgrading is not possible?
GitLab: Disallowed command
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(226) [Receiver=3.1.0]
rsync failed -- run git annex again to resume file transfer
Unable to access these remotes: origin
Are there different migration instructions if downgrading is not an option? We didn't see any warning messages during the last few months of use telling us this was going to happen during our syncs.
@thesamprice @marcia I believe the initial git annex sync --content is meant to gather all available files to prepare for the repo conversion. You could alternatively issue git annex get --all.
If you cannot get your files from GitLab, hopefully you have another repo with the annexed files accessible. If you already have all the files GitLab does, you can safely ignore the error.
In case you want to keep using git-annex, just mark the GitLab repo as dead via git annex dead $your_gitlab_repo_name. You will no longer be able to distribute or access annexed files, but all other local repos will continue on as normal (and can shuffle files around to honour the numcopies setting).
PS. I find the sudden removal of git-annex to be rather inconvenient.
Correct, the git annex commands seem to have been removed in version 9.0, so the conversion listed in your tutorial is no longer possible. I wasn't aware that this was going to happen. Maybe you could enable just the down syncs, and disable the ability for people to upload / modify files until 10.0 or a later release, to allow for migrations.
For the record, here are advantages of git annex over git lfs from an end-user point of view:
git lfs downloads all files by default. The point of large-file support, for a user, is to avoid long download times and huge local disk usage when they are not needed.
When not downloading everything, git lfs creates pointer files. Those files are regular files, so programs trying to process them will in the best case fail, or otherwise silently produce incorrect results. git annex uses symlinks, so that when the file is not downloaded there is a broken symlink, which cannot be opened.
git lfs uses twice the space locally, because files are in .git/lfs and in the checkout.
git annexed files can be dropped locally and on the REMOTE SERVER (git annex drop --from=REMOTE).
Using git LFS, I realized that there is no way to claim back the space. So if a user uploads a huge file by mistake, it can fill up the disk and there is no simple way to delete it. Then this storage has to be backed up etc... I found this https://gitlab.com/gitlab-org/gitlab-ce/issues/3666, which does not seem to be available in my version. But anyway I would have to delete the project in order to get back the space. That does not seem a good way to help people dealing with large files to me.
When dealing with large files these points are critical.
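The pointer-file versus broken-symlink difference described above can be guarded against in a processing script. The following is only a minimal sketch in Python, assuming the published Git LFS v1 pointer layout (a "version" line, an "oid sha256:" line and a "size" line, in a file under 1024 bytes) and the default git-annex symlink mode; the function names is_lfs_pointer and is_missing_annex_content are hypothetical helpers for illustration, not part of either tool.

```python
import re
from pathlib import Path

# First line of a Git LFS v1 pointer file (assumption: v1 spec layout).
LFS_VERSION_LINE = b"version https://git-lfs.github.com/spec/v1"
OID_RE = re.compile(rb"oid sha256:[0-9a-f]{64}")
SIZE_RE = re.compile(rb"size \d+")


def is_lfs_pointer(path):
    """Return True if path looks like an LFS pointer rather than real content."""
    p = Path(path)
    # Pointer files are small regular files; a broken symlink is not a file.
    if not p.is_file() or p.stat().st_size >= 1024:
        return False
    lines = p.read_bytes().splitlines()
    if not lines or lines[0] != LFS_VERSION_LINE:
        return False
    rest = lines[1:]
    has_oid = any(OID_RE.fullmatch(line) for line in rest)
    has_size = any(SIZE_RE.fullmatch(line) for line in rest)
    return has_oid and has_size


def is_missing_annex_content(path):
    """Return True if path is a symlink whose target does not exist.

    In git-annex's symlink mode, a file whose content has not been
    fetched shows up as exactly this kind of broken symlink.
    """
    p = Path(path)
    return p.is_symlink() and not p.exists()
```

A script could call both checks before opening a data file and refuse to continue when either returns True, instead of silently reading a few lines of pointer text as if it were data.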
At the time I selected gitlab over other solutions precisely because it supported git annex. And I really do not understand the rationale of ditching it without providing similar features.
It seems to me that the LFS is more a feature for providers than for users.
You could just as easily have integrated git annex into gitlab by implementing https authentication and visibility from the interface.
I perfectly understand that it is cumbersome and expensive to support two overlapping technologies, what I question is either the choice of LFS, or the timing, meaning the lack of user consultation and the lack of features in LFS.
I agree that there is probably only a minority of users using git annex, and that they are advanced users.
[The fact that there are few tickets may also mean that it was working fine.]
To finish on a positive note, I really like gitlab, and all the features you added, the quick and efficient support, hence my incomprehension on this issue.
I'm a gitlab and git annex user. Although it's a bit disappointing from a "moral" perspective that gitlab is dropping support for annex, I can't see why it is so tragic from a technical perspective. Is it that hard to manually push the git-annex branch and to keep big files in a cheap or free store elsewhere or even in another gitlab repo used as an annex remote? Am I missing some important bit of functionality here? After all, part of the beauty of git annex is how open and flexible its architecture is...
is it that hard to manually push the git-annex branch and to keep big files in a cheap or free store elsewhere or even in another gitlab repo used as an annex remote?
You're speaking as an end-user (no offense). As someone who must implement how a company will work with precious and sensitive data, if you use another place to store your data, you must implement access control, encryption and backups, plus document it in the quality system. It's feasible but takes time and money. We had this before switching to gitlab. The point of gitlab was to provide an all-in-one solution.
And it will not be compatible with gitlab CI for instance.
But you raised an important point: the announcement was that git-annex support was to be removed.
I am not quite sure what that means. Will it stay compatible with git-annex provided that the files are pushed to a (non-gitlab) remote? I ask because git annex has a hybrid implementation, where some pieces of info are stored by git, and the actual files by git annex but inside the git repo.
@kforner no offense taken, thank you for answering. As an end user I can say that every git repo is "compatible" with git annex as long as you git push origin master git-annex and then git pull && git annex merge. Regarding encryption, you can set it while configuring a special remote with git annex initremote or git annex enableremote. You can easily use many of the storage solutions out there: s3, google drive, dropbox, etc. For example, using the rclone special remote you get a one-size-fits-all solution for many of them. Many of these storage solutions will provide good access control and backup alternatives, as you surely know better than me. So I don't think this announcement is the end of the world, but it's kind of sad that a big shot is not backing annex anymore.
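For anyone wanting to script the push/pull routine from the comment above, here is a small sketch in Python. It only wraps the three commands quoted in that comment; the function names annex_metadata_sync_commands and run_commands are made up for illustration, and the defaults origin and master are assumptions taken from the quoted commands. Note that this only moves the regular branch and the git-annex branch through the plain git remote; the annexed file contents themselves still have to live on some other remote.

```python
import subprocess


def annex_metadata_sync_commands(remote="origin", branch="master"):
    """Build the git commands from the comment above as argv lists.

    'publish' pushes the working branch together with the git-annex
    branch; 'update' pulls and then merges the git-annex branch.
    """
    return {
        "publish": [["git", "push", remote, branch, "git-annex"]],
        "update": [["git", "pull"], ["git", "annex", "merge"]],
    }


def run_commands(commands, cwd="."):
    """Run each command in order inside cwd, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling run_commands(annex_metadata_sync_commands()["publish"], cwd="path/to/repo") would push both branches, where path/to/repo is a placeholder; building the commands separately from running them makes it easy to print them first as a dry run.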
@memeplex Thanks Carlos for this info. [Storage has to be local in our case].
Sure, it is not the end of the world. But to benefit from the other gitlab features (CI...) I'd like to use supported features. My pipelines will need to access the large files. If I cannot make them work with my local git annex hacks, the gitlab support will tell me that I am using an unsupported feature, and that's fair. Or maybe it will work for one version and be broken by a new release.