Filter hits / blocked counter support on MV3

The SDK currently has an event emitter (EWE.reporting.onBlockableItem) which emits events under various conditions. It is currently fully functional only in MV2.

Current confirmed status:

Content filters work correctly.
Frame-level allowing filters work in that they emit an event when an allowing filter is applied to a frame (e.g. $document). However, these events are currently re-emitted each time the service worker starts.
Popup blocking works correctly.
Unmatched requests do not work. It's unclear whether we'll be able to support this (maybe through the webRequest API and onRuleMatchedDebug?).
Request filtering won't work at all. It would need to be replaced with DNR function calls, or by predicting what we expect DNR to be doing and emitting events based on that.

Notes:

Service workers are expected to remove this listener just like the others.
Auto badge
Crumbs
Prototype
onRuleMatchedDebug
getMatchedRules

Why is this required?

knowing the number of filter hits by tab ID: fixes blocked counter
knowing the filter text for a filter hit: fixes filters information in the issue report data and developer tools panel
knowing the subscription of the filter for a filter hit: fixes the subscriptions information in the issue report data and developer tools panel
knowing the request that the filter hit: fixes the request information in the issue report data and developer tools panel

Proposal:

For element hiding, popup, and allowing filter logging we can still use the current onBlockableItem event.
For rules that have matched (note, not filters) we could use getMatchedRules.
This would use the same onBlockableItem event dispatcher but we'll support a new matchInfo.method called DNR.
When using the above method, the behaviour will be different and limited in the following ways:
- We need a way of determining which filter is associated with the matched rule. We could encode this into the id of the rule. This would likely need to happen in core.
- This will need to be batched with some delay.
- It's unclear whether unmatched requests could work in this context.
- Lots of metadata won't be available (all information about the request except the tab ID). Relevant Chromium issue here.
Related to these efforts is this issue in core: adblockpluscore#476 (closed)
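
A rough sketch of how getMatchedRules results could be turned into onBlockableItem-style events with matchInfo.method set to "DNR" is given here. The event shape, the `lookupFilterText` callback and the `emit` dispatcher are assumptions for illustration, not the SDK's actual internals. `dnr` stands in for chrome.declarativeNetRequest, whose getMatchedRules call needs the declarativeNetRequestFeedback permission. How matches at exactly `minTimeStamp` are treated would need to be verified.

```javascript
// Convert DNR match records into onBlockableItem-style event objects.
// Assumption: the event shape ({request, filter, matchInfo}) mirrors what
// onBlockableItem emits today; only the tab ID is known about the request.
function toBlockableItems(rulesMatchedInfo, lookupFilterText) {
  return rulesMatchedInfo.map(info => ({
    request: {tabId: info.tabId, timeStamp: info.timeStamp},
    filter: {text: lookupFilterText(info.rule.rulesetId, info.rule.ruleId)},
    matchInfo: {method: "DNR"}
  }));
}

// Returns a poll() function intended to be called on a timer, so that
// matches are reported in batches with some delay.
// `dnr` is expected to look like chrome.declarativeNetRequest.
// `emit` is a hypothetical dispatcher for the resulting events.
function createMatchedRulesPoller(dnr, lookupFilterText, emit) {
  let lastTimeStamp = 0;

  return async function poll() {
    let {rulesMatchedInfo} = await dnr.getMatchedRules({
      minTimeStamp: lastTimeStamp
    });

    for (let item of toBlockableItems(rulesMatchedInfo, lookupFilterText)) {
      emit(item);
    }

    // Remember the newest match so the next batch starts from there.
    for (let info of rulesMatchedInfo) {
      lastTimeStamp = Math.max(lastTimeStamp, info.timeStamp);
    }

    return rulesMatchedInfo.length;
  };
}
```

In an extension this might be wired up as `createMatchedRulesPoller(chrome.declarativeNetRequest, lookupFilterText, emit)`, with `poll()` scheduled at whatever interval the batching delay allows.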

Specific experiments.

Add a new matching filter property to the filtering options of onBlockableItem ("DNR" maybe?)
Use getMatchedRules to gather the matched requests for a given tab ID (this already exists)
When we get data from getMatchedRules, we'll need to use the information there to determine which filter the rule came from.
Can we associate the specific request with the filter?

Approaches for how to map the relationship between filters and their rules.

A) A JSON file per ruleset containing the full mapping information, with the rule ID as the key and the filter text as the value.

ruleset file 1
{
 "4": "filtertext1",
 "5": "filtertext2"
}

---- 

ruleset file 2
{
 "3": "filtertext3",
 "4": "filtertext4"
}
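
A minimal sketch of the lookup for approach A, assuming the mapping files have already been loaded into memory. The ruleset IDs ("ruleset1", "ruleset2") and the `lookupFilterText` helper are hypothetical names. Because rule IDs are only unique within a ruleset (ID 4 appears in both files above), the lookup is keyed by ruleset ID first.

```javascript
// Hypothetical in-memory copy of the mapping files, one object per ruleset.
const filterMaps = new Map([
  ["ruleset1", {"4": "filtertext1", "5": "filtertext2"}],
  ["ruleset2", {"3": "filtertext3", "4": "filtertext4"}]
]);

// Returns the filter text for a matched rule, or null if it is unknown.
function lookupFilterText(rulesetId, ruleId) {
  let mapping = filterMaps.get(rulesetId);
  if (!mapping) {
    return null;
  }

  // JSON object keys are strings, so the numeric rule ID is converted.
  let text = mapping[String(ruleId)];
  return typeof text == "string" ? text : null;
}
```

The same rule ID resolves to different filters depending on the ruleset, for example `lookupFilterText("ruleset1", 4)` versus `lookupFilterText("ruleset2", 4)`.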

Pros:

We fully control the mapping.
Very simple approach.
Data would be readily available.

Cons:

Filesize increase unless we handle content filters as a separate thing
(Potentially) Increase memory usage as we'd probably want to load this up at start time?
(Potentially) Performance impacts due to an extra file that is needed.

Todo:

Determine exactly how file size and memory would be impacted with this approach.

B) Set the id of the rule in such a way that we can reference the correct filter text through its line number in the original subscription. (https://gitlab.com/eyeo/adblockplus/abc/webext-sdk/-/issues/407)

Because a filter could have multiple rules we'd need to increment the id of the rule by some minimum increment (to reserve some gap for filters that create more than 1 rule).

This would read the relevant subscription file.
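
The ID scheme could look roughly like the following sketch. The gap size (`RULES_PER_FILTER`), the function names and the choice of 1-based line numbers that count the header line are all assumptions made for illustration; the open questions about which line number to use still apply.

```javascript
// Hypothetical gap reserved per filter, so that one filter can produce
// up to this many rules without colliding with the next filter's IDs.
const RULES_PER_FILTER = 10;

// Builds a rule ID from a 1-based line number and the index of the rule
// among the rules generated from that line's filter.
function encodeRuleId(lineNumber, ruleIndex) {
  if (!Number.isInteger(lineNumber) || lineNumber < 1) {
    throw new RangeError("lineNumber must be a positive integer");
  }
  if (!Number.isInteger(ruleIndex) || ruleIndex < 0 ||
      ruleIndex >= RULES_PER_FILTER) {
    throw new RangeError("ruleIndex does not fit in the reserved gap");
  }
  return lineNumber * RULES_PER_FILTER + ruleIndex;
}

// Recovers the line number and rule index from a rule ID.
function decodeRuleId(ruleId) {
  return {
    lineNumber: Math.floor(ruleId / RULES_PER_FILTER),
    ruleIndex: ruleId % RULES_PER_FILTER
  };
}

// Looks up the filter text in the subscription's lines (header included).
function filterTextForRule(subscriptionLines, ruleId) {
  let {lineNumber} = decodeRuleId(ruleId);
  let text = subscriptionLines[lineNumber - 1];
  return typeof text == "string" ? text : null;
}
```

A fixed gap keeps decoding trivial, at the cost of capping how many rules a single filter may generate.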

Pros:

We wouldn't need any extra files.
We might already have this data in the filter engine.

Cons:

We would have to change abpcore to accommodate getting the line number from the file being converted.
This is a complex idea. Which "line number" do we use? Should the header be included?

Todo:

Determine how to adapt core to enable this line number to rule ID mapping.
Can we find the correct filter in the filter engine based on this line number?

C) We add a "filter" property to the generated rule file.

This would read the relevant ruleset file.
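
One possible way of reading such a property is sketched here, assuming the ruleset file is plain JSON that the extension can load itself and that Chrome continues to accept the extra key. The ruleset content and the `indexFiltersByRuleId` helper are hypothetical.

```javascript
// Hypothetical ruleset content: standard DNR rules plus a custom "filter"
// property holding the original filter text.
const rulesetJson = `[
  {
    "id": 1,
    "priority": 1,
    "action": {"type": "block"},
    "condition": {"urlFilter": "||ads.example^"},
    "filter": "||ads.example^"
  },
  {
    "id": 2,
    "priority": 1,
    "action": {"type": "block"},
    "condition": {"urlFilter": "||tracker.example^"},
    "filter": "||tracker.example^"
  },
  {
    "id": 3,
    "priority": 1,
    "action": {"type": "block"},
    "condition": {"urlFilter": "||nofilter.example^"}
  }
]`;

// Builds a rule ID -> filter text index, skipping rules without the
// custom property.
function indexFiltersByRuleId(rules) {
  let index = new Map();
  for (let rule of rules) {
    if (typeof rule.filter == "string") {
      index.set(rule.id, rule.filter);
    }
  }
  return index;
}

const filterIndex = indexFiltersByRuleId(JSON.parse(rulesetJson));
```

In an extension, the JSON text could presumably be obtained with `fetch(chrome.runtime.getURL(path))`, where `path` is the ruleset's path from the manifest.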

Pros:

Cons:

Chrome might validate this in the future and prevent this from being used.
The objects would be custom and no longer match the Chrome API

Todo:

How do we actually read this property?

Edited Dec 13, 2022 by Rowan Deysel