Currently we calculate CI status using SQL with implementation like:
defstatus_sqlbuilds=all.select('count(*)').to_sqlsuccess=all.success.select('count(*)').to_sqlignored=all.ignored.select('count(*)').to_sqlifall.respond_to?(:ignored)ignored||='0'pending=all.pending.select('count(*)').to_sqlrunning=all.running.select('count(*)').to_sqlcanceled=all.canceled.select('count(*)').to_sqlskipped=all.skipped.select('count(*)').to_sqldeduce_status="(CASE WHEN (#{builds})=0 THEN NULL WHEN (#{builds})=(#{success})+(#{ignored}) THEN 'success' WHEN (#{builds})=(#{pending}) THEN 'pending' WHEN (#{builds})=(#{canceled})+(#{success})+(#{ignored}) THEN 'canceled' WHEN (#{builds})=(#{skipped}) THEN 'skipped' WHEN (#{running})+(#{pending})>0 THEN 'running' ELSE 'failed' END)"deduce_statusenddefstatusall.pluck(self.status_sql).firstend
We do this because of performance reasons, this is much faster than code we had before.
Instead of doing this that way we cold load entire collection of statuses into the memory, and calculate status in Ruby, what could be
I don't think Arel will solve this problem, since it does not support WHEN statement.
Maybe it would be enough to refactor current implementation by adding Gitlab::Pipeline::Status::Success, Gitlab::Pipeline::Status::Failure [...] and designing this classes to make it easier to fetch builds with given status, calculate compound status, etc. API could then expose methods that would make it easier to create SQL statement and all this would be abstracted away behind Gitlab::Pipeline::Status facade.
Grzegorz Bizonchanged title from Consider making statuseable implementation cleaner to Consider making {+HasStatus / +}statuseable implementation cleaner
changed title from Consider making statuseable implementation cleaner to Consider making {+HasStatus / +}statuseable implementation cleaner
Elliot Rushtonadded 1 deleted label and removed 1 deleted label
We had a very long and productive call with @ayufan and we concluded that this might be a good idea to take stab at this issue because it might make our code much more extensible and maintainable by reducing technical debt a lot. It is not super simple, and we will need to think about choosing a good algorithm on the Ruby side, to make it a performant solution.
pulling individual values as a separate columns and then reimplementing CASE in ruby based on thus values should be as fast (from database perspective) as what happens now
in those SQL statements you don't actually need COUNT(*), it feels like using WHERE EXISTS should be enough, so that database doesn't actually pull all the jobs for each status, finding just one would be enough, so it potentially can do less work
Unfortunately this did not fit within our capacity for %11.8 release planning. If we are able to find a way to bring it forward we will, otherwise moving this for evaluation in %12.0 for now.
The 12.0 release is very over-full and I don't expect this to gitlab-ce~3011693 item to fit, unfortunately. Moving forward a bit - if for some reason this is more urgent than I'm giving credit for, please let me know.
@erushton we didn't complete this in %11.11 and planning is already complete for %12.0 so I'm moving it to %12.2. Let me know if it should be somewhere else.
@erushton this is a gitlab-ce~3011693 type item we have pushed out of releases for a while. I'm moving it to the %Backlog for now, but if we need to to be a higher priority from an engineering perspective, let me know and we can reassess.
@grzesiek I was thinking something in the line of this. Bear in mind it's just a quick draft of an idea.
This would not require us to define a class for each status as the implementation is already simple enough.
classCi::Statusdefinitialize(collection)@collection=collectionenddefcalculatecasewhenbuilds==skipped&&warnings>0'success'whenbuilds==skipped'skipped'whenbuilds==success'success'# ..else'failed'endendprivatedefall@all||=(@collection.respond_to?(:exclude_ignored)?@collection.exclude_ignored:@collection.all).to_aenddefbuildsall.countenddefwarnings@warnings||=all.count{|obj|obj.respond_to?(:failed_but_allowed?)&&obj.failed_but_allowed?)}end# reduce all these methods with :define_methoddefcreated@created||=all.count(&:created?)enddefskipped@skipped||=all.count(&:skipped?)enddefsuccess@success||=all.count(&:success?)end#...more methodsend
Then HasStatus.status method could be changed into:
defself.statusCi::Status.new(all).calculateend
This IMO is very easy to test even with an array of stubs.
Do you think something like that would help removing the tech debt we have?
Is the @collection a group of statuses or a group of builds? Would it be possible to accomplish the same with statuses only? I guess it might be more difficult without statuses being custom objects since a string can't encapsulate allow_failure / exclude_ignored.
I wonder if making it possible to encapsulate all information about a status in a class would be a good idea.
We do have some classes like this already in lib/gitlab/ci/status, would it be possible to reuse them?
Let's loop @ayufan in here, since this is a very important change that can have a significant impact on the product.
One reason why it might be better to decouple statuses from builds is that we might not store a status in a build after we implement gitlab-ce#23257. We do have a few options there, one is storing statuses in a separate table. It means that we won't be able to use builds only to calculate a status. This is a very interesting and complex problem.
Whatever we do, it needs to be performant handling at least 1000 builds: performant CPU-wise and Memory-wise. We had that in the past (written in Ruby) and I changed that to pure-SQL, because of performance. Simply, doing that in SQL was like 100-1000x faster.
So, first I would like us to learn the cost of the current system, and compare that to the proposal. I would strongly evaluate proposal strictly from the performance point of view as a baseline (where the weight of performance in choosing the final solution is like 80% of the rate), readability is important, but I think that since this is executed millions of times it is less important than highly-optimized approach.
I personally thing that handing pure Ruby objects (without database interaction, except of fetching data) can be as fast as evaluating this on the database side, but we should measure it somehow to avoid possible performance degradation
I personally thing that handing pure Ruby objects (without database interaction, except of fetching data) can be as fast as evaluating this on the database side, but we should measure it somehow to avoid possible performance degradation
Maybe, but I think that this needs to be carefully evaluated. As fast, it will never be the same fast, it will be slower, as:
You need to fetch additional data,
You need to instantiate objects,
You need to process that in Ruby where you share interpreter with other threads.
The penalty of calculating in Ruby seems to be higher than in PostgreSQL.
@grzesiek ,there are not many options to implement it.
Either it is going to be another case of N+1 SQL queries and therefore slower by definition or it is going to be massive query with all info returned at once and only difference would be that final status is calculated on ruby side instead of SQL case statement, but then there are no much readability/testability improvement, which per current description is the reason for this change.