Consider lighter weight CI infrastructure.

changed due date to February 28, 2022

changed milestone to %Policy issues: 2023 winter meeting

added testing label

changed the description

mentioned in issue #4217 (closed)

removed milestone %Policy issues: 2023 winter meeting

added Discuss2022 winter meeting label

We didn't discuss testing or CI infrastructure at the winter planning meeting. I propose that we collect this and other outstanding issues related to testing and CI infrastructure into a milestone that we can review at one or more biweekly meetings in January or February to see how much effort is available for ~6 months (and from whom), and reject, accept with timeline, or postpone all such issues.

changed the description

We are going to try and be a bit more structured for our quarterly developer meetings to avoid having a few topics take >50% of the time.

This item has (possibly previously) been marked for potential discussion at the meeting. To enable that, please provide an update on the current status here, and state which specific items need to be discussed among all developers.

To enable everyone to contribute, and make sure we move forward, the discussion for each issue at the meeting should start with a max 5 minute presentation of the issue (including a few slides) that concludes with one or more specific decision proposals - and the decisions will be made at (or directly after) the meeting.

If resources for development, discussions or code review are needed from other developers, please estimate what those are, and e.g. what code review effort for other merge requests can be offered in return.

To enable us to schedule: Please note in this issue, by Monday March 14 at the latest, who will prepare slides and present the issue at the meeting.

I can present the issue.

removed due date

Two comments:

Container size in terms of storage is no longer an issue, since we have infinite storage both through GitLab and Docker as an open-source organization now. They will also be cached (we have ~ 1TB on each node) pretty efficiently after the first download, so I don't think it will typically be a bottleneck. Thus, I don't think it's the highest priority to optimize for, even though a minimal image would be good.
One important reason for separating stages is that resources are not equivalent. Configurations only use a single core, so allocating 4-8 cores to such stages will potentially waste a lot of resources, and in principle the artefact from this stage should be tiny - if it isn't, that might be worth looking into?
Similarly, when it comes to testing, some of our test setups use VERY special/limited resources such as dual GPUs. While the actual tests absolutely need to run on the relevant hardware, the corresponding builds don't. As a comparison, we have roughly 70 slots for jobs that will be guaranteed 4 cores, but only 3 slots that can use dual NVIDIA GPUs.
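To put rough numbers on the resource argument above, here is a minimal sketch that compares the core-minutes reserved by a merged configure+build job with those reserved by separate jobs. The durations and the slot size are hypothetical values chosen only for illustration, not measurements from our runners, and `reserved_core_minutes` is a helper invented for this example.

```python
def reserved_core_minutes(jobs):
    """Sum cores * minutes over (cores, minutes) pairs, one pair per job."""
    return sum(cores * minutes for cores, minutes in jobs)


# Hypothetical values, for illustration only.
CONFIGURE_MIN = 3   # single-threaded configure step
BUILD_MIN = 20      # parallel build step
SLOT_CORES = 8      # cores guaranteed to a build slot

# Merged job: configure and build both hold the full slot.
merged = reserved_core_minutes([(SLOT_CORES, CONFIGURE_MIN + BUILD_MIN)])

# Separate jobs: configure holds one core, build holds the full slot.
split = reserved_core_minutes([(1, CONFIGURE_MIN), (SLOT_CORES, BUILD_MIN)])

print(merged, split, merged - split)
```

Under these assumptions the difference is `(SLOT_CORES - 1) * CONFIGURE_MIN` core-minutes per pipeline. The sketch ignores any per-job overhead of separate stages (scheduling, artifact upload and download), which would have to be weighed against that saving.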

In terms of guesses/suggestions, I wonder if we somehow keep storing all old stuff in ccache too, which might lead to a constant increase in size? At least for tests, I also don't see that we need all source or other files - just the test binaries and library should be sufficient.
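One way to check the guess about ever-growing caches would be to measure how much of a cache directory has not been touched recently. The following is a minimal sketch, assuming that file modification time is a usable proxy for "last used"; that assumption, the helper name `cache_usage`, and the 30-day default are all hypothetical and would need to be checked against how ccache is actually configured in our jobs.

```python
import time
from pathlib import Path


def cache_usage(cache_dir, stale_days=30.0, now=None):
    """Return (total_bytes, stale_bytes) for regular files under cache_dir.

    A file counts as stale if its modification time is older than
    stale_days. Using mtime as a proxy for last use is an assumption.
    """
    now = time.time() if now is None else now
    cutoff = now - stale_days * 86400
    total = 0
    stale = 0
    for path in Path(cache_dir).rglob("*"):
        if not path.is_file():
            continue
        info = path.stat()
        total += info.st_size
        if info.st_mtime < cutoff:
            stale += info.st_size
    return total, stale
```

If the stale fraction turns out to be large, the cache is probably accumulating old entries; ccache's own statistics (`ccache --show-stats`) and size limit (`ccache --max-size`) would be the next things to look at.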

I don't think that we will have a lot of time to work on this during this development cycle, so I would retarget it and make sure we don't waste time on this during the planning meeting.

We could talk briefly at a planning meeting about what problems are worth investigating, and allocating effort. But I don't think we can undertake any of this effort from about July through January, generally, so I'm not sure whether it makes sense to try to schedule it for discussion again before ~November.

That's a bit far ahead for me, though. You can assign me if you want, but I can't make a firm commitment to coordinate discussion if we target it that far out.

Alternatively, I could just copy the issue description to a new wiki page, if it causes pain to leave the issue open, untargeted, and unassigned.

Please let me know your preference.

If nobody is planning to actually work on it in the near future, I believe the natural conclusion is that we move it to closed status for now - it's still possible to edit and post comments.

I don't think the wiki is a good alternative to being able to keep issues 'open', because if we start doing that for dozens of issues that too will very quickly turn into a giant mass with lots of text that hasn't been updated for ages :-)

PS: it's of course not specific to this issue, but part of the general consideration that we need to learn to remove as many old things from the table as new things we want to put on the table. Whether we call them 'issue', 'wiki topic', 'epic' or something else is merely a technicality :-)

the general consideration that we need to learn to remove as many old things from the table as new things we want to put on the table

I don't think that's true. I think there are a lot of things that a lot of us wish were better documented, and a lot of things that come up that we forget that someone has already investigated.

Whether we call them

I think of this as documentation, supporting material for planning discussions, and an anchor point for related discussion. In this case, it is a discussion that only makes sense to have about once a year, and this year we decided not to have it. I do think that it would be a valuable conversation to have next year, but I can't know that it will.

I invite @acmnpv to file it or archive it however he thinks is most useful or convenient, or to ask me to do it.

If the preferred mechanism of archival is to add a "stale" label and close to allow for future retrieval and to help others determine whether or not a conversation has happened, I suggest splitting each of the topics into separate easily-searched issues and archiving separately, since issue search only seems to work by title.

suggest splitting

This would also allow individual points to be labeled "rejected" (or "completed") instead of "stale", which would be very helpful.

There are of course lots of places where we would like to have better documentation, new code/features/bugfixes, but we also need to accept that resources are finite.

We already have a stale label, all core developers can add as many other new labels as they want to help with that organization (even for previously closed issues), and issues are trivial to reopen.

For now, it seems clear that none of the three of us who have commented here has any plan to work on it, so unless we have a concrete volunteer who will, I think it's time to close it. Feel free to add as many labels as necessary before or after that, but we also have pretty good search functionality that I think we all (me included) need to get better at using before opening new issues :-)

There have been many occasions over the last few years when developers have spent significant meeting time complaining about various aspects of automated testing overhead and wondering aloud about what might be done about it, but usually during a part of the development cycle when no action is likely, and with no follow-up scheduled.

I tried to suggest action last spring, but the consensus was that action was not necessary. However, this winter, I collected notes from several of the same sorts of conversations as had occurred in previous years, and shared them here so that we wouldn't forget those conversations again.

If we can state that certain issues will never be priorities, then I hope we could have a record of such decisions that would help us agree not to waste meeting time on them in the future. If we aren't sure whether or when we will feel like trying to improve the automated testing infrastructure, then I hope that future discussions can start more productively than they have in the last few beta phases.

If there is value in this record, then I defer to the project management on how best to file it. If no one sees value, then it can be discarded (but any additional clarifying statements on how to approach future GitLab-CI complaints would surely save time and frustration).

If the issue is closed with the expectation that it will be reopened, though, I don't think a "stale" label is sufficient; I think it should be explicitly annotated for reinvestigation at a specific future time when action is likely to be possible, even if we don't know now whether action will be taken at that time.

I don't think we have ever said we don't want a lightweight CI infrastructure (or any other specific feature), it's just that there are about 100 other things that have higher priority - and resources are limited.

Apart from that I think we're talking past each other. It's perfectly fine to keep issues open - provided each has an assignee who is volunteering to coordinate and explain the work that's happening on it at least each quarter. You already have 27 other open issues assigned to you that you are going to write such reports for (meaning roughly one quarterly report every three days), but if you want to do one more, go right ahead :-)

Same thing with explicitly annotating features that people would like to look into later - everyone who wants to do so can volunteer immediately, they can add any labels they want, and they can go through old issues as frequently as they want.

However... no interest in working on something right now and nobody wanting to coordinate such work means it is already effectively closed in every single aspect but the name :-)

PS:

The other alternative is that we need a way to mark/assign/display issues in such a way that only e.g. @acmnpv can "accept" them to be worked on (since that indicates everything else is just opinions/ideas/noise).

What doesn't work is when everyone is free to add new issues, they instantly become official project/joint issues, and then somebody in the project suddenly has a responsibility to motivate why they are being closed, annotate them for later reconsideration, and repeatedly go through all old issues (because I suspect we are not going to get any volunteer to do that ;-)

it's just that there are about 100 other things that have higher priority

Yes, I accept that we have decided not to work on this in the current development cycle.

@acmnpv can "accept" them to be worked on

I thought that the quarterly meeting was about "accept"ing issues to be worked on and that milestones indicated they were being worked on.

they instantly become official project/joint issues

I'm still hoping for clarification on that, but my understanding was that they are not official until they have an assignee and a milestone, and that milestones come via planning meetings.

I offered to work on this this spring (and last spring), provided there were two other developers with whom to make decisions and review MRs, and there was some guidance as to what we should or should not look into.

It seems like we could create a label for an issue to be raised for discussion when it is time to allocate effort for spring 2023, but it sounds like you are saying you don't want to do that.

It seems like we could collect documentation on long term or cyclic work, but you don't like my suggestions.

it is already effectively closed

Yes, I acknowledge that the issue is dead for at least 6 more months. All I was asking was whether @acmnpv wanted me to do anything with the notes for future reference, or to apply any annotations or set any reminders to consider discussion at a more appropriate time or to allocate effort in the future, especially considering that I cannot promise to track and re-raise the same points or to offer the same amount of contributing effort the next time the topic comes up.

repeatedly go through all old issues (because I suspect we are not going to get any volunteer to do that ;-)

I frequently volunteer to do this, and others have as well.

removed Discuss2022 winter meeting label

It seems like Erik really wants me to close this issue, without marking it "rejected" (but presumably "stale"?), so I will.

@acmnpv if you would like me to take any additional action or to do preliminary coordination of follow-up discussion, please let me know. Either send me email or explicitly @ tag me, because I will stop noticing activity on this issue once it is closed.

closed

added StatusStale label

mentioned in issue #3563 (closed)

mentioned in merge request !2658 (merged)

Consider lighter weight CI infrastructure.

Proposals

Merge the configure and build stages

Reduce artifact and cache storage

Evaluate/optimize ccache usage.

Use more specific builds for test stages

Reevaluate CI Docker images for size

MPI

Python

Use additional projects to test new CI images

Only run one CI pipeline per MR update

Resolution
