Issue #234078
Created Aug 07, 2020 by Stefano Tenuta@stefano.tenuta

Cannot reuse built code from a previous step because of how Maven and Git work

Summary

Before starting: I'm aware that the Bug template may not be the best fit for this issue, but none of the other templates fit better. I'm opening it to start a discussion on how to deal with the scenario we're in.

The issue here is related to how both Maven and Git work.

Here's a little bit of context:

  • Git doesn't store file modification times, so every file produced by a git clone gets the current timestamp
  • Maven compares the files in target/classes with the files in src to decide whether they need to be compiled again, using timestamps to determine whether the source code is newer than the compiled output
  • GitLab CI jobs do a git clone to retrieve the repository, so all the files in the working directory have the current timestamp
  • GitLab CI jobs store artifact folders as a zip file that is unzipped in the following jobs, so the extracted artifacts keep the timestamp from when they were created in the previous job

In the Maven scenario, this means that the files in target/classes look older than the files in src, which triggers a full build instead of reusing the already built classes.
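The timestamp mismatch described above can be reproduced outside of CI. The following is only a sketch of the comparison, not Maven's actual stale-source detection (which is more involved); the file names and the fixed date are made up for illustration.

```shell
#!/bin/sh
# Simulate what a downstream job sees (hypothetical file names).
workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/target/classes"

# Artifact restored from the zip: keeps the older time from the previous job.
touch -t 202008071000 "$workdir/target/classes/App.class"
# Source file from a fresh git clone: modification time is "now".
touch "$workdir/src/App.java"

# A source file newer than its compiled output looks stale.
if [ "$workdir/src/App.java" -nt "$workdir/target/classes/App.class" ]; then
  status="stale"
else
  status="up-to-date"
fi
echo "$status"   # prints: stale
```

Here the source looks newer than the class file even though its content did not change, which is the situation the junit job ends up in.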

Why does this matter? Let's say we have a pipeline where the first step builds our project and multiple later steps execute other goals such as unit tests, code quality and so on. With the current behavior, Maven fully rebuilds the project in every job of the pipeline.

From our point of view, this is a big issue because it prevents us from sharing build artifacts between jobs.

Steps to reproduce

  1. Create a random Maven project
  2. Add the following pipeline
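# unit_tests is not a default stage, so the stages must be declared
stages:
  - build
  - unit_tests

build:
  stage: build
  image: maven:3.6-jdk-11
  script:
    - 'mvn test-compile'
  except:
    - tags
  artifacts:
    paths:
      - target/

junit:
  stage: unit_tests
  # assumption: same image as the build job, so that mvn is available
  image: maven:3.6-jdk-11
  script:
    - 'mvn verify'
  artifacts:
    reports:
      junit:
        - target/surefire-reports/TEST-*.xml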

What is the current bug behavior?

The junit job recompiles all of the code. This behavior can be observed by looking for the following string in the job's logs:

[INFO] Changes detected - recompiling the module!

What is the expected correct behavior?

The junit job shouldn't recompile and should have the following line instead:

[INFO] Nothing to compile - all classes are up to date
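
To check a job for this automatically, the two log lines quoted above can be searched for with grep. This is a minimal sketch; the function name and the log file are hypothetical, and it assumes the job log has been saved to a file.

```shell
#!/bin/sh
# Classify a Maven job log by the two compiler messages quoted above.
# The function name and the log path are hypothetical.
compile_status() {
  if grep -qF 'Changes detected - recompiling the module!' "$1"; then
    echo "recompiled"
  elif grep -qF 'Nothing to compile - all classes are up to date' "$1"; then
    echo "reused"
  else
    echo "unknown"
  fi
}

# Example with a throwaway log file.
log=$(mktemp)
echo '[INFO] Changes detected - recompiling the module!' > "$log"
result=$(compile_status "$log")
echo "$result"   # prints: recompiled
rm -f "$log"
```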

Possible fixes

While this is not a GitLab-related fix, I'll describe the workaround we're currently using. The idea is to extract our artifacts ourselves with unzip's -DD option, so that the extracted files get the current timestamp instead of the one they had when the artifact was created. This makes them newer than the files in the src folder.

To make this work, our example pipeline can be changed into this one:

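# unit_tests is not a default stage, so the stages must be declared
stages:
  - build
  - unit_tests

# Pack target/ ourselves (the maven image ships unzip but not zip)
.prepare-build-artifact: &prepare-build-artifact |
  apt update && apt install -y zip
  zip -r -q target.zip target/

# -DD skips restoring timestamps, so extracted files get the current time
.extract-build-artifact: &extract-build-artifact |
  unzip -DD -q target.zip

build:
  stage: build
  image: maven:3.6-jdk-11
  script:
    - 'mvn test-compile'
    - *prepare-build-artifact
  except:
    - tags
  artifacts:
    paths:
      - target.zip

junit:
  stage: unit_tests
  # assumption: same image as the build job, so that mvn and unzip are available
  image: maven:3.6-jdk-11
  script:
    - *extract-build-artifact
    - 'mvn verify'
  artifacts:
    reports:
      junit:
        - target/surefire-reports/TEST-*.xml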

While this works correctly, it's clearly a workaround that shouldn't be needed, for multiple reasons.

For example, it's difficult to find Docker images that include the zip executable (maven:3.6-jdk-11 includes only unzip), which forces us to add the apt install step, and that is not ideal. The alternative would be to build custom images that include zip, but that's not feasible in the long run: we build projects in multiple languages and use a lot of different Docker images for our pipelines, so we can't customize all of them just because zip is missing.

Also, this workaround adds two more steps to artifact management and makes the pipelines more complex, as we need to add the *extract-build-artifact step to every job.

Finally, a possible GitLab-related solution could be to allow us to specify some artifact extraction options (like the -DD one in our workaround).
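
A zip-free variant of the workaround might be to keep the regular target/ artifact and refresh the timestamps of the extracted files before calling Maven. The sketch below only simulates the timestamps in a temporary directory; the paths and dates are made up, and I have not verified that Maven treats the result exactly like the unzip -DD approach.

```shell
#!/bin/sh
# Simulated job workspace; all paths and dates are hypothetical.
ws=$(mktemp -d)
mkdir -p "$ws/src" "$ws/target/classes"
touch -t 202008071200 "$ws/src/App.java"              # source from the clone
touch -t 202008071000 "$ws/target/classes/App.class"  # artifact with its older timestamp

# The step a job would run before Maven: bump every file under target/ to "now".
find "$ws/target" -type f -exec touch {} +

# The compiled output should now look newer than the source.
if [ "$ws/target/classes/App.class" -nt "$ws/src/App.java" ]; then
  refreshed="yes"
else
  refreshed="no"
fi
echo "$refreshed"   # prints: yes
```

In a pipeline this would be a single script line, find target -type f -exec touch {} +, placed before mvn verify. It avoids installing zip, but it still has to be added to every job that consumes the artifact.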
