Skip to content

Refactor Golden Master spec approach for testing GFM support

Problem to solve

Gitlab-Flavored Markdown (GFM) is large and has many edge cases to test. The Content Editor currently supports rendering some of the GFM content but we eventually need to support it all. If something is added to GFM, we need to have some kind of way to test our support against a Golden Master (GM) specification.

Background

See this original MR description for definitions and explanation of what "Golden Master Testing" (AKA "Characterization Testing") is, and how we are using it.

Requirements

  1. Frontend: We should have test coverage that the Content Editor can properly serialize HTML to Markdown for all GFM source elements which it currently supports.
  2. Frontend: We should have test coverage that the Content Editor can properly render the expected HTML for all GFM source elements which it currently supports (not currently supported until we implement this).
  3. Backend: We should ensure that for all GFM elements, the backend always renders the expected HTML, for all supported GFM source elements.
  4. If any of this this ever changes unexpectedly, tests will start failing, and force the same change to be made on the backend and frontend.

Implementation Plan

What we are planning on doing is actually a type Golden Master testing with modifications:

  1. The original markdown examples used to drive the tests are taken from the YAML, and can be considered a form of "fixture" in this case.
  2. The HTML in the YAML is the "Golden Master", but we are going to use it to assert against TWO different implementations of markdown rendering:
    1. The frontend one implemented as Jest specs.
      1. This will assert both HTML -> markdown serialization (what it currently does), as well as
      2. Markdown -> HTML rendering (not currently supported until we implement this). This will likely be a standalone module outside of the Content Editor.
    2. The backend one implemented as requests specs
      1. This will assert markdown -> HTML conversion by the backend.

Tasks

  • Include sanitized HTML example in YAML. Initially this may not include source mapping annotations, because these may vary between the frontend and backend.
  • Add an pending flag, which can be a string description, or a hash/object with a backend and/or frontend key containing a string description.
  • Continue to support multiple entries with the same Markdown but a different context attribute, for different variations of content editor with different extensions enabled. Instead of continuing to support this as a separate context YAML field, it will be an extension-based naming convention for the different variations (e.g. attachment_image.group, attachment_image.project, attachment_image.project_wiki)
  • Provide a mechanism to substitute variable values in example HTML, e.g. URLs and database IDs.
  • Change the existing frontend content editor markdown specs to eliminate usage of frontend fixtures.
  • Write a new request spec to assert the backend behavior, and remove the existing frontend fixture generation spec.
  • Add an environment variable (FOCUSED_MARKDOWN_EXAMPLES) which can be set to a comma-delimited list of canonical names (YAML keys) of a markdown fixture example names in order to run them focused.
  • Write some documentation for the approach (it is at the top of the main YAML config file)
  • Backfill YAML for all supported GFM marks
  • Handle EE-specific YAML with separate spec/config files
    • For frontend
    • For backend
  • Get all frontend and backend tests passing with new approach, delete old specs and any related supporting code.
  • Figure out how to make rspec print the entire text of the html diff. We can defer this to a follow on issue, or just leave the TODO in the code if we are OK with the hack to conditionally puts for now.

References

Screenshots or Screencasts (strongly suggested)

How to setup and validate locally (strongly suggested)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Related to #338268 (closed)

example of using Nokogiri for normalization
-        post api_url, params: { text: markdown, gfm: true }
+        post api_url, params: { text: example_markdown, gfm: true }
         expect(response).to be_successful
-
         response_html = Gitlab::Json.parse(response.body).fetch('html')
-        unescaped_response = CGI.unescape(response_html)
-        normalized_html = normalize_html(html)
-        normalized_unescaped_response = normalize_html(unescaped_response)
-        expect(normalized_unescaped_response).to eq(normalized_html)
+        normalized_response = normalize_html(response_html)
+
+        expect(normalized_response).to eq(normalized_example_html)
       end
   def normalize_html(html)
  •    p html
  •    html_doc = Nokogiri::XML(html) do |config|
  •      config.noblanks
  •    end
  •    p html_doc.to_xhtml(indent: 2)
  •    # TODO: Provide a way to do subsitutions of variable values (urls, ids, etc) in the HTML.
  •    html
     end
Edited by Chad Woolley

Merge request reports