Skip to content

Add HamlLint::Linter::NoPlainNodes linter

Luke Bennett requested to merge haml-lint-no-plain-nodes into master

What does this MR do?

Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/57972

Adds linter to find unwrapped text in haml for i18n.

Also adds .haml-lint_todo.yml, which haml-lint uses to ignore existing lints. Generated using haml-lint --auto-gen-config --auto-gen-exclude-limit 5000 (docs) This currently ignores 2075 offences.

Add HamlLint::Linter::NoPlainNodes linter

Add a simple haml_lint linter to report all plain nodes. "Plain nodes" in HAML are scritpless plaintext leaf nodes.

Internal details

Simplified spec

-# Fail:
%span Hello world
%span
  Hello
  world
%span Hello #{worldText}

-# Pass:
%span= _('Hello world')
%span
  = _('Hello world')
%span= _('Hello %{worldText}) % { worldText: 'world' }

Simplified flow

Visiting all tags...

Multiline text
  1. Continue if tag has unwrapped text (AKA plain node)
  2. Continue if text is not nested (deep or shallow) in pre or code tag
  3. Continue if text is alphanumeric
  4. Find adjacent plain nodes and reduce them into a single string
  5. Report text

Step 2: : Do we want this or we do want to depend on inline disable comments incase there is text in pre or code tags that we want externalised? From a brief look at initial failure outputs, pre and code tags mostly have things like cli commands in them. These could instead be helper/presenter methods, but this really just moves the problem to rubocop.

Step 2+3: There was also a consideration to ignore text nodes that are html entities _(we could match a list from the spec, or a rough solution would match /^&[a-z]+;$/. I have ignored this for now but open to implement if we would prefer it to disable comments.

Step 2+3: After initial BE review, we are taking a more generic approach and not ignoring these cases. Some of the lints may not seem as actionable but ultimately they should probably be looked at and fixed or ignored using an inline comment. If we see there are some really common ignore cases, then we can start to change the linter to ignore them without comments. If we want.

Step 4: To explain step 4, this is to improve the report. We can just report each visit_plain node, but each line of a multiline strings is reported by itself. E.g...

%div
  Hello
  World
  %span /
  Universe

is parsed to (something like)

TagNode = { tag_name: 'div', children: [PlainNode, PlainNode, TagNode, PlainNode] }

step 4 turns the above example into 3 reports instead of 4. Hello and World are appended into 1 report for Hello World, / is reported by itself and Universe is reported by itself.

Inline text

We get the source code string for the line in question and remove the attributes and tag using substitution. The leftover should be the inline plain text.

Interpolated text

Interpolated text is treated as a script. We first check that the node has a script value and perform the same process as inline text above. The difference is, if the returned string starts with =, we have a standard haml script and we ignore this, letting rubocop do the rest.

How do I run this linter locally?

bundle exec rake haml_lint
# OR
bundle exec rake haml_lint -- -i NoPlainNodes

Example failures (current failures)

$ bundle exec rake haml_lint

CI failure: https://gitlab.com/gitlab-org/gitlab-ee/-/jobs/208696710

app/views/admin/appearances/_system_header_footer_form.html.haml:9 [W] NoPlainNodes: `Things` should be rewritten as `= _('Things')`
app/views/admin/appearances/_system_header_footer_form.html.haml:10 [W] NoPlainNodes: `really fun!` should be rewritten as `= _('really fun!')`
app/views/admin/appearances/_system_header_footer_form.html.haml:11 [W] NoPlainNodes: `I know right` should be rewritten as `= _('I know right')`
app/views/admin/appearances/_system_header_footer_form.html.haml:14 [W] NoPlainNodes: `well try #{'this'} #{1}!` should be rewritten as `= _('well try #{'this'} #{1}!')`

Does this MR meet the acceptance criteria?

Conformity

Performance and testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Luke Bennett

Merge request reports