gogna / gnparser
Commit 7ba7a4e6, authored Nov 18, 2019 by Dmitry Mozzherin
#71 no parsing for "Unamed clade"
Parent: 14cb6fed
Pipeline #96770598 passed with stages in 4 minutes and 16 seconds
Showing 3 changed files with 18 additions and 2 deletions
CHANGELOG.md +7 -1
preprocess/preprocess.go +1 -1
testdata/test_data.txt +10 -0
CHANGELOG.md
...
@@ -4,8 +4,10 @@
## [v0.12.0]
- Add [#71]: do not parse 'Unamed clade...'.
- Add [#69]: gnparser as a shared C library.
- Make dynamic version using ldflags.
- Add: Make dynamic version using ldflags.
- Fix [#70]: parse 'Remera cvancarai' correctly.
## [v0.11.0]
...
@@ -106,6 +108,7 @@ array of names instead of a stream.
This document follows [changelog guidelines]
[v0.12.0]: https://gitlab.com/gogna/gnparser/compare/v0.11.0...v0.12.0
[v0.11.0]: https://gitlab.com/gogna/gnparser/compare/v0.10.0...v0.11.0
[v0.10.0]: https://gitlab.com/gogna/gnparser/compare/v0.9.0...v0.10.0
[v0.9.0]: https://gitlab.com/gogna/gnparser/compare/v0.8.0...v0.9.0
...
@@ -119,6 +122,9 @@ This document follows [changelog guidelines]
[v0.6.0]: https://gitlab.com/gogna/gnparser/compare/v0.5.1...v0.6.0
[v0.5.1]: https://gitlab.com/gogna/gnparser/tree/v0.5.1
[#71]: https://gitlab.com/gogna/gnparser/issues/71
[#70]: https://gitlab.com/gogna/gnparser/issues/70
[#69]: https://gitlab.com/gogna/gnparser/issues/69
[#68]: https://gitlab.com/gogna/gnparser/issues/68
[#67]: https://gitlab.com/gogna/gnparser/issues/67
[#66]: https://gitlab.com/gogna/gnparser/issues/66
...
preprocess/preprocess.go
...
@@ -19,7 +19,7 @@ var virusRe = regexp.MustCompile(
 	`(alpha|beta)?satellites?)\b`,
 )
 var noParseRe = regexp.MustCompile(
-	`(^(Not|None|Unidentified)[\W_].*|.*[Ii]ncertae\s+[Ss]edis.*` +
+	`(^(Not|None|Un(n?amed|identified))[\W_].*|.*[Ii]ncertae\s+[Ss]edis.*` +
 		`|[Ii]nc\.\s*[Ss]ed\.|phytoplasma\b|plasmids?\b|[^A-Z]RNA[^A-Z]*)`,
 )
 var notesRe = regexp.MustCompile(
...
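
To see what the pattern change does, here is a minimal sketch that copies the updated noParseRe expression from the hunk above and runs it against a few name strings. The helper isNoParse and the use of MatchString are assumptions made for illustration; the diff does not show how gnparser itself applies the expression.

```go
package main

import (
	"fmt"
	"regexp"
)

// noParseRe is copied from the updated side of the hunk above.
var noParseRe = regexp.MustCompile(
	`(^(Not|None|Un(n?amed|identified))[\W_].*|.*[Ii]ncertae\s+[Ss]edis.*` +
		`|[Ii]nc\.\s*[Ss]ed\.|phytoplasma\b|plasmids?\b|[^A-Z]RNA[^A-Z]*)`,
)

// isNoParse is a hypothetical helper, not part of gnparser's API.
// It reports whether the pattern matches anywhere in the name string.
func isNoParse(name string) bool {
	return noParseRe.MatchString(name)
}

func main() {
	names := []string{
		"Unnamed clade",    // correct spelling, added in this commit
		"Unamed clade",     // misspelling from issue #71
		"Unidentified sp.", // already covered before the change
		"Notassigned",      // no separator after "Not", so still parsed
	}
	for _, name := range names {
		fmt.Printf("%q no-parse: %v\n", name, isNoParse(name))
	}
}
```

The group `Un(n?amed|identified)` accepts "Unnamed", "Unamed", and "Unidentified", and the following `[\W_]` requires a non-word character or underscore right after the prefix. That is why "Notassigned" does not match and is still parsed as a uninomial in the test data.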
testdata/test_data.txt
...
@@ -3035,6 +3035,16 @@ Notassigned
Notassigned
{"parsed":true,"quality":1,"verbatim":"Notassigned","normalized":"Notassigned","canonicalName":{"full":"Notassigned","simple":"Notassigned","stem":"Notassigned"},"details":[{"uninomial":{"value":"Notassigned"}}],"positions":[["uninomial",0,11]],"surrogate":false,"virus":false,"hybrid":false,"bacteria":false,"nameStringId":"8c07b58a-be4e-5c31-871b-cffe36b9860a","parserVersion":"test_version"}
8c07b58a-be4e-5c31-871b-cffe36b9860a|Notassigned|Notassigned|Notassigned|Notassigned|||1
Unnamed clade
noparse
{"parsed":false,"quality":0,"verbatim":"Unnamed clade","surrogate":false,"virus":false,"hybrid":false,"bacteria":false,"nameStringId":"d510b662-0a4d-5678-a1a7-c58b20d25fa0","parserVersion":"test_version"}
d510b662-0a4d-5678-a1a7-c58b20d25fa0|Unnamed clade||||||0
Unamed clade
noparse
{"parsed":false,"quality":0,"verbatim":"Unamed clade","surrogate":false,"virus":false,"hybrid":false,"bacteria":false,"nameStringId":"be6943d3-fa83-5e5d-9515-7cc339473d4d","parserVersion":"test_version"}
be6943d3-fa83-5e5d-9515-7cc339473d4d|Unamed clade||||||0
#>
#SECTION: No parsing -- genus with apostrophe<
...
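
The expected JSON for the new test cases can be checked with a short decoding sketch. The struct below keeps only three of the fields visible in the test data, and decodeExpected is a hypothetical helper; neither is taken from gnparser's own test code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// expected holds only the fields this sketch inspects.
// The JSON in the test data carries more fields, which Unmarshal ignores.
type expected struct {
	Parsed   bool   `json:"parsed"`
	Quality  int    `json:"quality"`
	Verbatim string `json:"verbatim"`
}

// decodeExpected is a hypothetical helper that decodes one JSON line.
func decodeExpected(line string) (expected, error) {
	var e expected
	err := json.Unmarshal([]byte(line), &e)
	return e, err
}

func main() {
	// JSON line copied from the "Unamed clade" test case above.
	line := `{"parsed":false,"quality":0,"verbatim":"Unamed clade","surrogate":false,"virus":false,"hybrid":false,"bacteria":false,"nameStringId":"be6943d3-fa83-5e5d-9515-7cc339473d4d","parserVersion":"test_version"}`

	e, err := decodeExpected(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("verbatim=%q parsed=%v quality=%d\n", e.Verbatim, e.Parsed, e.Quality)
}
```

A name string that hits the no-parse pattern is expected to come back with "parsed" set to false and "quality" set to 0, while "verbatim" preserves the input unchanged.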