Differences in resource-links stats in coverage between staging and production
Background
Staging and production calculate resource-links stats in coverage differently:
- in production, it is based on the "full-text.application:text-mining,full-text.application:unspecified" filter
- in staging, it is based on the "has-full-text:1" filter
This means that DOIs with full-text.application = similarity-checking are included in resource-links in staging but not in production.
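To make the two calculations easier to compare, here is a minimal sketch that builds the works queries for either filter. The helper name `works_url` and the constant names are hypothetical, chosen only for illustration; the filter strings and URL shape are the ones quoted in this issue.

```python
# Filter strings as quoted in this issue.
PRODUCTION_FILTER = "full-text.application:text-mining,full-text.application:unspecified"
STAGING_FILTER = "has-full-text:1"


def works_url(base, member_id, until_pub_date, extra_filter=None):
    """Build a /members/{id}/works query URL (hypothetical helper).

    The date filter is always applied; extra_filter is appended when given.
    """
    filters = [f"until-pub-date:{until_pub_date}"]
    if extra_filter:
        filters.append(extra_filter)
    return f"{base}/members/{member_id}/works?filter={','.join(filters)}"


# Denominator query (all backfile works) and the two candidate numerators.
total_url = works_url("http://api.crossref.org", 138, 2017)
production_style_url = works_url("http://api.crossref.org", 138, 2017, PRODUCTION_FILTER)
staging_style_url = works_url("http://api.crossref.org", 138, 2017, STAGING_FILTER)
```

Swapping the base for http://api.staging-legacy.crossref.org gives the corresponding staging queries.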
For example, for member 138, in production we have in coverage:
"coverage": {
"resource-links-backfile": 0.002330097137019038,
...
}
This is roughly equal to the number of results from http://api.crossref.org/members/138/works?filter=until-pub-date:2017,full-text.application:text-mining,full-text.application:unspecified (0) divided by the number of results from http://api.crossref.org/members/138/works?filter=until-pub-date:2017 (15450).
It is definitely not equal to the number of results from http://api.crossref.org/members/138/works?filter=until-pub-date:2017,has-full-text:1 (5245) divided by the number of results from http://api.crossref.org/members/138/works?filter=until-pub-date:2017 (15450).
In staging the same member has the following in coverage:
"coverage": {
"resource-links-backfile": 0.3183856502242152,
...
}
This time, it is roughly equal to the number of results from http://api.staging-legacy.crossref.org/members/138/works?filter=until-pub-date:2017,has-full-text:1 (213) divided by the number of results from http://api.staging-legacy.crossref.org/members/138/works?filter=until-pub-date:2017 (669).
It is definitely not equal to the number of results from http://api.staging-legacy.crossref.org/members/138/works?filter=until-pub-date:2017,full-text.application:text-mining,full-text.application:unspecified (1) divided by the number of results from http://api.staging-legacy.crossref.org/members/138/works?filter=until-pub-date:2017 (669).
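The comparison above can be checked with a small sketch that uses only the counts and coverage values quoted in this issue. The function names `coverage_ratio` and `closest_filter` are hypothetical, and nothing here calls the API.

```python
def coverage_ratio(matching, total):
    """Fraction of works matching a filter; 0.0 when there are no works."""
    return matching / total if total else 0.0


def closest_filter(reported, candidates):
    """Return the filter whose ratio is nearest to the reported coverage value."""
    return min(candidates, key=lambda name: abs(candidates[name] - reported))


APPLICATION_FILTER = "full-text.application:text-mining,full-text.application:unspecified"
HAS_FULL_TEXT_FILTER = "has-full-text:1"

# Counts for member 138 as quoted in this issue.
production_candidates = {
    APPLICATION_FILTER: coverage_ratio(0, 15450),
    HAS_FULL_TEXT_FILTER: coverage_ratio(5245, 15450),
}
staging_candidates = {
    APPLICATION_FILTER: coverage_ratio(1, 669),
    HAS_FULL_TEXT_FILTER: coverage_ratio(213, 669),
}

production_match = closest_filter(0.002330097137019038, production_candidates)
staging_match = closest_filter(0.3183856502242152, staging_candidates)

print("production resource-links-backfile is closest to:", production_match)
print("staging resource-links-backfile is closest to:", staging_match)
```

Picking the nearest candidate rather than testing for equality reflects the "roughly equal" wording above: coverage values and live query counts are not necessarily computed at the same moment.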
Definition of ready
- Product owner: @ppolischuk1
- Tech lead: @dtkaczyk
- Service:: label applied
- Definition of done updated
- Weight applied
Definition of done
- Unit tests identified, implemented, and passing
- Code reviewed
- Available via a staging URL
- Knowledge base reviewed and updated
- Public documentation reviewed and updated
- Consider any impacts to current or future architecture/infrastructure, and update specifications and documentation as needed
- Acceptance criteria met
  - resource-links should be based on the "full-text.application:text-mining,full-text.application:unspecified" filter, matching production behavior.
  - resource-links should not include "full-text.application = similarity-checking"