diff --git a/docs/themanual/tidewater/database.md b/docs/themanual/tidewater/database.md
index 6cd73a9075465190a8bf96a3f468b39d454a8378..625f2b618a1c2cdecb94d3edafba0ad42ccc04bc 100644
--- a/docs/themanual/tidewater/database.md
+++ b/docs/themanual/tidewater/database.md
@@ -15,12 +15,20 @@ Databases should be Unicode, and columns should be `text`.
 
 [dc11]: https://www.dublincore.org/specifications/dublin-core/dces/
 
-It is possible that an item might have multiple values for any given
-  metadata term.
-This can be represented in the database by catenating the values with
-  a `U+FFFF` delimiter.
-Note that `U+FFFF` would not otherwise be representable in OAI-PMH data
-  (it is invalid in XML); any `U+FFFF` which makes up part of a value
-  **must** be replaced with `U+FFFD � REPLACEMENT CHARACTER` prior to
-  its being stored in the database, to avoid its accidental
-  interpretation as a value delimiter.
+The columns for metadata fields have the following format :—
+
+    Value ::= Char*
+    Language ::= Char*
+    TaggedValue ::= Value (#xFFFE Language)?
+    Field ::= TaggedValue (#xFFFF TaggedValue)*
+
+Note that `U+FFFE` and `U+FFFF` would not otherwise be representable in
+  OAI-PMH data (they are invalid in XML); any `U+FFFE` or `U+FFFF`
+  which makes up part of a value **must** be replaced with
+  `U+FFFD � REPLACEMENT CHARACTER` prior to its being stored in the
+  database, to avoid its accidental interpretation as a delimiter.
+
+The Tidewater consumer script makes use of an additional column,
+  `source_iri`, to identify items according to their identifiers in the
+  API it consumes from.
+The web app does not utilize this column for anything.
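
As an illustration of the field format this change documents, here is a minimal Python sketch of how a column value might be written and read back. The helper names (`sanitize`, `encode_field`, `decode_field`) are hypothetical and are not taken from the Tidewater code. Applying the `U+FFFD` substitution to language tags as well as values is an assumption; the text only requires it for values.

```python
# Hypothetical helpers illustrating the documented grammar; these names
# are not part of Tidewater.
VALUE_SEP = "\uFFFF"    # separates TaggedValues within a Field
LANG_SEP = "\uFFFE"     # separates a Value from its optional Language
REPLACEMENT = "\uFFFD"  # U+FFFD REPLACEMENT CHARACTER


def sanitize(text):
    """Replace any delimiter code point in the text with U+FFFD."""
    return text.replace(VALUE_SEP, REPLACEMENT).replace(LANG_SEP, REPLACEMENT)


def encode_field(tagged_values):
    """Serialize (value, language-or-None) pairs into one column string."""
    parts = []
    for value, language in tagged_values:
        part = sanitize(value)
        if language is not None:
            # Assumption: language tags are sanitized the same way as values.
            part += LANG_SEP + sanitize(language)
        parts.append(part)
    return VALUE_SEP.join(parts)


def decode_field(field):
    """Parse a column string back into (value, language-or-None) pairs."""
    result = []
    for part in field.split(VALUE_SEP):
        value, sep, language = part.partition(LANG_SEP)
        # No U+FFFE means the TaggedValue carries no language tag.
        result.append((value, language if sep else None))
    return result
```

Under this sketch, `encode_field([("Hello", "en"), ("plain", None)])` yields `Hello`, `U+FFFE`, `en`, `U+FFFF`, `plain` as a single string. Because the grammar requires at least one `TaggedValue`, an empty column decodes to a single empty value with no language rather than to an empty list.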