Java Spring Boot REST API discovery

Java Spring Boot is one of the most popular ways to develop web APIs, especially for enterprise customers. This makes it a great first framework to target for API Discovery.

Goal

Generate an OpenAPI schema from a typical Spring Boot application that does not already produce its own schema, without running the application.

Solution outline

  1. Target the compiled artifact
    • We could integrate into the build process, but it would be hard to do so reliably given the number of build systems and their customizations. We would also have to account for build system versioning.
    • Since Java artifacts are (reasonably) well-defined, targeting the compiled artifact leaves less variation for us to account for.
  2. Require the user to provide the Java runtime
    • Choosing the correct runtime (both flavor and version) would significantly increase complexity, and risk incompatibilities with user code. The user should already have the SDK on which they produced their JVM-based artifact in the first place.
    • This means that the solution will not be delivered via Docker, but will need to be executable on the user's runner.
  3. Store the solution in a package registry
    • Since we are not relying on Docker, we will need another mechanism for delivering the executable solution. GitLab's Package Registry can store the executable, and the solution template could fetch (e.g. via curl) the executable from the registry and then execute it.
    • We will need an additional project to expose the Package Registry, since the source project will be private.
  4. Don't run any user code
    • Minimize side effects; user code could do anything
    • It is impossible to completely prevent user code from running. Static initializers run when a class is initialized, and we cannot guarantee that nothing triggers initialization while we reflect on the metadata. As far as I know or can find, Java has no dedicated reflection-only load, although Class.forName(name, false, loader) does defer initialization until the class is first used.
    • Consider preventing the process from using the network to minimize side effects in case user code does run
  5. Use SpringDoc OpenAPI to do the heavy lifting.
    • SpringDoc knows how to interpret the Spring Boot metadata and translate it into an OpenAPI schema.
    • SpringDoc OpenAPI is designed to be used at runtime, which is not an option. However, mostly what it does is read metadata, which is available. So we will need to investigate if/how it can be used in a "reflection-only" mode.
      • Because we will not be running the code, we can only generate a schema based on metadata. If the user is manipulating config at runtime in a way that is not exposed through metadata, the schema we generate may be inaccurate.
  6. Matching versions (not necessary)
    • We could require the user to specify the version of Spring Boot (and possibly other dependencies) they are using. However, this goes against our goal of minimal configuration, so we should try not to require that if we can.
    • We will need to determine how closely we want to match the app's version of Spring Boot. Same minor version seems like a reasonable first attempt, but we may need to work around compatibility issues caused by version mismatches.
    • There are two reasons why determining the app's dependency versions might not be as straightforward as it would seem:
      • Spring Boot is made up of many libraries. If the app is following best practices, they should be using the same version for all of their Spring Boot dependencies. However, there are many reasons why apps end up using different versions for some dependencies. We will have to figure out how to resolve to a single version to be compatible with.
      • Because there is no concept of "assemblies" in Java, the actual identity of a dependency is mostly lost from the final artifact; only the filename remains visible. Java doesn't require the filename to have any information about the dependency in it. Fortunately the Maven repository format - used by both Maven and Gradle, the two dominant Java build systems - includes the version number in the filename. So we should be able to rely on this, and any edge case where such a filename is not included would be unsupported.
  7. Dynamically load Spring Boot dependencies (not necessary)
    • Because we are determining what version to use dynamically, we cannot simply put Spring Boot on our own classpath. We will need to either download the corresponding version of required Spring Boot dependencies prior to or at runtime, or bundle all supported versions of required dependencies in our own executable and load the correct one at runtime. I lean towards the latter, to minimize the reliance on external network connectivity and associated transient failures. But it might depend on what that does to artifact size.
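
To illustrate item 4, here is a minimal sketch (not a settled design) of reading annotation metadata from a class without running its static initializer. It relies on the standard Class.forName(name, false, loader) overload. The Route annotation and the Probe class are hypothetical stand-ins for a Spring mapping annotation and a user controller. This only covers our own reflection: anything that later instantiates the class, for example Spring creating a bean, would still trigger initialization.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class ReflectWithoutInit {

    /** Hypothetical stand-in for a mapping annotation such as Spring's @GetMapping. */
    @Retention(RetentionPolicy.RUNTIME)
    @interface Route {
        String value();
    }

    /** Flipped by Probe's static initializer so we can observe when it runs. */
    static volatile boolean probeInitialized = false;

    /** Hypothetical stand-in for a user controller class. */
    static class Probe {
        static {
            probeInitialized = true; // the "user code" side effect we want to avoid
        }

        @Route("/pets")
        public String listPets() {
            return "[]";
        }
    }

    /** Loads the class without initializing it and returns the @Route paths it declares. */
    static List<String> routesOf(String className, ClassLoader loader) {
        try {
            // The second argument (false) asks the JVM not to initialize the class.
            Class<?> type = Class.forName(className, false, loader);
            List<String> routes = new ArrayList<>();
            for (Method method : type.getDeclaredMethods()) {
                Route route = method.getAnnotation(Route.class);
                if (route != null) {
                    routes.add(route.value());
                }
            }
            return routes;
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Class not found: " + className, e);
        }
    }

    public static void main(String[] args) {
        String probeName = ReflectWithoutInit.class.getName() + "$Probe";
        System.out.println(routesOf(probeName, ReflectWithoutInit.class.getClassLoader()));
        System.out.println("initialized after reflection: " + probeInitialized);
        new Probe(); // first active use triggers the static initializer
        System.out.println("initialized after instantiation: " + probeInitialized);
    }
}
```

In this sketch the flag stays false after the reflective read and only flips once Probe is instantiated. That suggests the risk in item 4 comes less from our own reflection and more from whatever the Spring context does with the classes afterwards.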

Supported scenarios

  • Java 8, 11, 17 (LTS versions)
    • Latest Spring Boot still supports Java 8 (!), and it's an LTS.
    • Document support for the LTS versions, but note that non-LTS versions should also work.
  • Spring Boot 2.x
    • 1.5 was released more than 5 years ago. We can note that it should work but we won't officially support it.
  • Artifact types:
    • Spring Boot Executable JAR
    • Spring Boot Executable WAR
    • WAR
    • uber-JAR
      • Unlike the other formats here, a shaded uber-JAR completely destroys the identity of its dependencies - even their filenames - in the process of creating the artifact. If knowing the version of the app's dependencies is important, we would need another way of determining them.
      • It doesn't appear that we need to know the version of Spring Boot in use, so we can likely support this as well.
  • Build systems
    • Maven with Spring Boot plugin
    • Maven without Spring Boot plugin
    • Gradle with Spring Boot plugin
    • Gradle without Spring Boot plugin
      • Spring Boot pushes the plugins pretty strongly, but it is certainly possible to build a container-dependent WAR or an executable uber-JAR without them. As long as the resulting artifact is one of the supported types, the build system shouldn't matter.
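
The artifact types listed above can probably be told apart by their entry names alone. The sketch below is a heuristic under stated assumptions, not a definitive detector. It assumes that Spring Boot executable archives carry the loader classes under org/springframework/boot/loader/, that executable JARs use BOOT-INF/, and that WARs use WEB-INF/. It also assumes nested libraries follow the Maven-style artifactId-version.jar naming. The names ArtifactInspector, Kind, classify and versionOf are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ArtifactInspector {

    enum Kind { SPRING_BOOT_JAR, SPRING_BOOT_WAR, WAR, PLAIN_OR_UBER_JAR }

    // Assumption: "<artifactId>-<version>.jar", where the version starts with a digit.
    private static final Pattern LIB_NAME = Pattern.compile("(.+?)-(\\d[\\w.+-]*)\\.jar");

    /** Guesses the artifact type from the names of its archive entries. */
    static Kind classify(Collection<String> entryNames) {
        boolean hasLoader = anyStartsWith(entryNames, "org/springframework/boot/loader/");
        boolean hasBootInf = anyStartsWith(entryNames, "BOOT-INF/");
        boolean hasWebInf = anyStartsWith(entryNames, "WEB-INF/");
        if (hasLoader && hasBootInf) return Kind.SPRING_BOOT_JAR;
        if (hasLoader && hasWebInf) return Kind.SPRING_BOOT_WAR;
        if (hasWebInf) return Kind.WAR;
        return Kind.PLAIN_OR_UBER_JAR; // no nested libraries, so no filenames to inspect
    }

    /** Extracts the version from a nested library entry name, if it follows the Maven convention. */
    static Optional<String> versionOf(String entryName) {
        String fileName = entryName.substring(entryName.lastIndexOf('/') + 1);
        Matcher matcher = LIB_NAME.matcher(fileName);
        return matcher.matches() ? Optional.of(matcher.group(2)) : Optional.empty();
    }

    private static boolean anyStartsWith(Collection<String> names, String prefix) {
        for (String name : names) {
            if (name.startsWith(prefix)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList(
                "org/springframework/boot/loader/JarLauncher.class",
                "BOOT-INF/classes/com/example/App.class",
                "BOOT-INF/lib/spring-boot-2.3.4.RELEASE.jar");
        System.out.println(classify(entries));
        for (String entry : entries) {
            versionOf(entry).ifPresent(v -> System.out.println(entry + " -> " + v));
        }
    }
}
```

In practice the entry names would come from java.util.jar.JarFile. They are passed in as plain strings here so that the heuristic can be exercised without building real archives.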

Out of scope for v1

  • JAX-RS: Spring Boot supports defining endpoints via JAX-RS annotations instead of its own annotations. However, SpringDoc does not support JAX-RS. The Swagger library does support JAX-RS; we can consider adding support for it in the future.
  • Kotlin: although it is very likely that a Spring Boot application written in Kotlin would "just work", explicit support for that scenario can come later

Project setup

  1. Create project for code api-discovery-src
    1. Make @mikeeddington an Owner
    • Do we want one repository for api-discovery-src, or a separate repository for each supported language (e.g. api-discovery-java-src)?
  2. Create CI template to use in testing
    1. Save generated OpenAPI document as job artifact
  3. Create initial build and test jobs
  4. Find or create a pet store example project to test against
  5. Create an entry Python script that looks for Java assets and launches the Java + Spring -> OpenAPI tool
    1. *.war, *.jar, *.class
    • Can we assume that any Java assets that we find are what we want to analyze? Or do we need the user to specify the artifact?
    • Given that we will be executing on the user's runner, we should keep the dependencies to a minimum. I recommend avoiding Python and sticking to shell scripts.
    • How do we handle OS/shell differences in our template?
  6. Main code will be written in Kotlin
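
Whichever language the entry point from step 5 ends up in, the asset search itself is small. The sketch below is one possible shape for it, written in Java to match the target platform (the same calls are available from Kotlin). The ArtifactFinder name is hypothetical, and treating exactly one match as the artifact to analyze is an assumption, not a decision recorded above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ArtifactFinder {

    /** True for the asset types listed in step 5: *.war, *.jar, *.class. */
    static boolean isJavaAsset(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".war") || name.endsWith(".jar") || name.endsWith(".class");
    }

    /** Recursively collects candidate Java assets under the given directory. */
    static List<Path> findCandidates(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(ArtifactFinder::isJavaAsset)
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> found = findCandidates(root);
        if (found.size() == 1) {
            System.out.println("Analyzing " + found.get(0));
        } else {
            // Assumption: with zero or several candidates we ask the user to choose.
            System.out.println("Found " + found.size() + " candidates; an explicit artifact path is needed");
        }
    }
}
```

A build directory usually contains many JARs and class files, so a blind search is likely to return several candidates. That bears on the open question of whether the user needs to specify the artifact.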

Open questions / spikes

  1. Can SpringDoc be run in "offline" mode, i.e. without actually running the web app or user code?
    • By creating our own Spring Context and wiring up Spring Boot, SpringDoc and the user code, we can invoke the OpenAPIResource bean and generate the schema
    • Controllers are wired up as Singletons by default, which Spring creates eagerly by default. We can post-process all bean definitions to define them as Lazy, to prevent their eager creation.
    • SpringDoc uses Spring APIs which have the side effect of creating the controller beans, but it doesn't actually need the controllers; those calls have to be intercepted to prevent the bean creation
  2. How do we find the customer code to register? Can we scan everything on the classpath?
    • Running a component scan on "" and excluding "org.springframework." seems to work. We can also exclude other known packages (like "org.springdoc.*") for safety.
  3. How to determine what version of Spring Boot the app is using?
    • How do we normalize different versions of Spring Boot libraries being used?
    • Is this necessary? SpringDoc seems to claim that its latest version can work all the way back to Spring Boot 1.5.0 (Jan 2017) and 2.0.0 (Feb 2018).
      • The other problem would be if our custom Spring context setup has to be version dependent; but that seems unlikely.
    • SpringDoc works with all versions of Spring Boot we would care to support, so we don't need the Spring Boot version.
    • If we want to know the Spring Boot version (for logging, or to provide more tailored messaging, for example), we can pull it from the META-INF/MANIFEST.MF file inside the artifact using JarFile, IF the artifact was built with the Spring Boot plugin.
  4. If API Discovery needs to load different versions of its own dependencies to match different versions of target app dependencies, what's the best way to do that?
  5. Can we prevent the API Discovery process from using the network, to minimize side effects in case user code is executed? What restrictions will that place on what API Discovery can do?
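    • There are several ways to use the Java SecurityManager to prevent access to the network. The most straightforward way would be to create a custom SecurityManager and set it at the start of the main method. Note that SecurityManager is deprecated for removal as of Java 17 (JEP 411), so this approach may not remain viable on newer runtimes.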
  6. What versions of Spring Boot should we support? What exactly should "support" mean in this context: testing, documentation, promise of bug fixes? Does GitLab have a standard definition of "supported"?
    • For example, we could say "we have tested against 2.3 and believe it works, but since it is no longer supported by Spring, we do not commit to fixing any issues that only affect 2.3"
    • Spring seems to have a pretty aggressive timeline on support; 2.4 came out in November of 2020, and knowing enterprise upgrade efforts I am confident that many companies on older versions have not upgraded yet.
    • See Supported scenarios
    • 5 years is a nice round number, which also happens to align with 2.0, another nice round number.
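
As a sketch of the manifest lookup mentioned under question 3: the example below reads the Spring-Boot-Version main attribute with java.util.jar.JarFile. That attribute is what the Spring Boot build plugins are understood to write. The BootVersionReader name is hypothetical. The result is empty when the artifact has no manifest or was built without the plugin, so callers should treat the version as optional information only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

final class BootVersionReader {

    /** Returns the Spring-Boot-Version manifest attribute, if the artifact declares one. */
    static Optional<String> springBootVersion(Path artifact) {
        try (JarFile jar = new JarFile(artifact.toFile())) {
            Manifest manifest = jar.getManifest(); // null when the archive has no manifest
            if (manifest == null) {
                return Optional.empty();
            }
            return Optional.ofNullable(manifest.getMainAttributes().getValue("Spring-Boot-Version"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: BootVersionReader <artifact.jar|artifact.war>");
            return;
        }
        String version = springBootVersion(Paths.get(args[0])).orElse("unknown");
        System.out.println("Spring Boot version: " + version);
    }
}
```

Because JarFile only reads the archive, this lookup does not load or run any user code.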

Other resources

Edited by David Nelson