Have Banzai filters declare their dependency orderings; validate in spec.
While writing this part of the description for https://gitlab.com/gitlab-org/gitlab/-/merge_requests/224625+:
> We also have to bring `AttributesFilter` up — `IframeLinkFilter` depends on it having already run, so we need to preserve its relative ordering to `IframeLinkFilter`.
… I thought: wouldn't it be nice to encode this in a way that can be tested? Look at `GfmPipeline` right now:
```ruby
def self.filters
  @filters ||= FilterArray[
    Filter::CodeLanguageFilter,
    Filter::JsonTableFilter,
    Filter::PlantumlFilter,
    Filter::SpacedLinkFilter,
    # ======== Sanitization boundary ========
    # Items above this point must not be moved below this point, as they depend
    # on running before SanitizationFilter and SanitizeLinkFilter for safety.
    Filter::SanitizationFilter,
    Filter::SanitizeLinkFilter,
    # =======================================
    Filter::KrokiFilter,
    Filter::GollumTagsFilter,
    Filter::WikiLinkGollumFilter,
    Filter::AttributesFilter,
    Filter::IframeLinkFilter, # keep before <img> handling filters
    Filter::AssetProxyFilter,
    Filter::MathFilter,
    Filter::ColorFilter,
    Filter::MermaidFilter,
    Filter::VideoLinkFilter,
    Filter::AudioLinkFilter,
    Filter::HeadingAccessibilityFilter,
    Filter::TableOfContentsTagFilter,
    Filter::AutolinkFilter,
    Filter::SuggestionFilter,
    Filter::FootnoteFilter,
    Filter::InlineDiffFilter,
    *reference_filters,
    Filter::ImageLazyLoadFilter, # keep after reference filters
    Filter::ImageLinkFilter, # keep after reference filters
    Filter::ExternalLinkFilter, # keep after ImageLinkFilter
    Filter::EmojiFilter,
    Filter::CustomEmojiFilter,
    Filter::TaskListFilter,
    Filter::SetDirectionFilter,
    Filter::SyntaxHighlightFilter # this filter should remain at the end
  ]
end
```
What if the information in these comments were instead encoded in the filters themselves? Then we could have a spec that validates each declared pipeline, and ordering breakages would surface in CI rather than relying on someone catching them in review.
We might declare things such as "depends on X having already run" (a hard requirement on both ordering and presence) as well as "must appear after X if present" (no dependency on X, but if X is present, this filter cannot come before it). The same goes for before-ordering, though probably without a hard-requirement variant. We could also encode the sanitization boundary as a hard requirement and not just a fancy comment.
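To make the idea concrete, here is a rough sketch of what such declarations and the check behind a spec might look like. Everything in it is hypothetical: `OrderingDeclarations`, `depends_on`, `runs_after`, `runs_before` and `ordering_violations` are made-up names, and the filter classes are empty stand-ins rather than the real `Banzai::Filter` classes.

```ruby
# Hypothetical DSL; none of these names exist in Banzai today.
module OrderingDeclarations
  # Hard requirement: each `other` must be present AND run earlier.
  def depends_on(*others)
    ordering_rules.concat(others.map { |other| [:depends_on, other] })
  end

  # Soft ordering: only enforced when `other` is in the pipeline.
  def runs_after(*others)
    ordering_rules.concat(others.map { |other| [:runs_after, other] })
  end

  def runs_before(*others)
    ordering_rules.concat(others.map { |other| [:runs_before, other] })
  end

  def ordering_rules
    @ordering_rules ||= []
  end
end

# Returns human-readable violations for an ordered array of filter classes.
def ordering_violations(filters)
  position = filters.each_with_index.to_h

  filters.flat_map do |filter|
    next [] unless filter.respond_to?(:ordering_rules)

    filter.ordering_rules.filter_map do |kind, other|
      mine = position[filter]
      theirs = position[other]

      if theirs.nil?
        # Absence only matters for hard dependencies.
        "#{filter} requires #{other}, which is missing" if kind == :depends_on
      elsif kind == :runs_before
        "#{filter} must run before #{other}" if mine > theirs
      elsif mine < theirs
        "#{filter} must run after #{other}"
      end
    end
  end
end

# Stand-in filters mirroring two of the comments in GfmPipeline.
class AttributesFilter; end
class ImageLinkFilter; end

class IframeLinkFilter
  extend OrderingDeclarations
  depends_on AttributesFilter
end

class ExternalLinkFilter
  extend OrderingDeclarations
  runs_after ImageLinkFilter
end

puts ordering_violations([IframeLinkFilter, ExternalLinkFilter, ImageLinkFilter])
```

A pipeline spec could then iterate over every declared pipeline and expect `ordering_violations(pipeline.filters)` to be empty. How this interacts with splatted groups such as `*reference_filters` is left open here.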
(We may even like to have some custom predicates — what if `IframeLinkFilter` can check that all following filters don't declare a `CSS` or `XPATH` that looks at `img`? This feels too error-prone though.)
We may like to declare that pipelines must have either a `SanitizationFilter` or an `HtmlEntityFilter`.
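A pipeline-level presence rule like that could be checked along these lines. This is only a sketch: `REQUIRED_ONE_OF` and `pipelines_missing_required_filter` are invented names, the pipeline names are placeholders, and the classes are stand-ins for the real filters.

```ruby
# Stand-in classes; the real ones live under Banzai::Filter.
class SanitizationFilter; end
class HtmlEntityFilter; end
class MathFilter; end

# Hypothetical rule: every pipeline needs at least one of these.
REQUIRED_ONE_OF = [SanitizationFilter, HtmlEntityFilter].freeze

# Takes a hash of pipeline name => ordered filters and returns the names
# of pipelines that contain none of the required filters.
def pipelines_missing_required_filter(pipelines)
  pipelines.reject { |_name, filters| (filters & REQUIRED_ONE_OF).any? }.keys
end

pipelines = {
  "sanitized" => [SanitizationFilter, MathFilter],
  "escaped" => [HtmlEntityFilter],
  "unsafe" => [MathFilter]
}

puts pipelines_missing_required_filter(pipelines)
```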