Conversations re. Common Farm Schema: Geometry, Location, Farm / Field / Herd level data
Update
- Convert these conversations into concrete actions achievable by the end of 2024
- Update Geometry to be more aggressive using regex
- Add farm (or organization) type
- Add herd convention
There are several conversations we need to resolve in order to complete the CFC for the upcoming use cases.
Location
(consulted OT slack, Will, Octavio)
Currently, farmOS generates some calculated outputs in lat and lon, which are helpful in the API for someone visually reviewing the location but do not follow a convention like GeoJSON or WKT. In practice, however, farmOS requires WKT in its API, mostly because WKT has a nice library and is a pretty efficient transfer format.
Post on chat in OpenTEAM with pluses and minuses -->
@nerds - question. I'm looking at WKT and GeoJSON to use in the common farm format schema. The main goal of this schema is as a 'file transfer format' to move data around and/or to push data into models or other outputs. There are many entities (fields, farms, etc.) that have locations so this is an important attribute. Here's what chat gpt has to say (nice summary): https://chat.openai.com/share/87cd9393-171e-4aa3-a8c5-9117daf74bb0
GeoJSON
- plus: parsable by default without libraries because it's an object
- plus: broadly used in many places
- plus: not lossy (can contain all information of all other formats without loss, even Shape)
- minus: more wordy
WKT
- plus: less wordy, more efficient
- minus: more lossy (compared to GeoJSON and other formats like Shape)
I'm leaning towards GeoJSON but thought I'd throw it out there in case I'm missing things. Interestingly, aggateway's Adapt uses Shape, probably because the big players w/ machines use Shape from their machines... but I feel that we can always go from shape --> geojson and back (hopefully!), but using shape natively isn't very cross-format friendly. All thoughts welcome.
I think there are compelling reasons at this point to use GeoJSON for the CFC, but we'd want to discuss this to ensure that conversion from WKT is reasonable and can occur with high confidence.
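To make the "conversion with high confidence" question concrete, here is a minimal round-trip sketch in plain Python. It is not farmOS's converter or any library's API: the function names (wkt_to_geojson, geojson_to_wkt) are hypothetical, it handles only 2D POINT and POLYGON, and it assumes the WKT is written as x y (longitude latitude), which matches GeoJSON's [longitude, latitude] position order. A real implementation would most likely rely on an established geometry library instead.

```python
import re


def _parse_positions(text):
    """Turn 'x y, x y, ...' into a list of [x, y] positions (2D only)."""
    positions = []
    for token in text.split(","):
        parts = token.split()
        if len(parts) != 2:
            raise ValueError(f"expected 'x y', got {token.strip()!r}")
        positions.append([float(parts[0]), float(parts[1])])
    return positions


def wkt_to_geojson(wkt):
    """Convert a 2D WKT POINT or POLYGON into a GeoJSON geometry dict."""
    match = re.fullmatch(
        r"\s*(POINT|POLYGON)\s*\((.*)\)\s*", wkt, re.IGNORECASE | re.DOTALL
    )
    if match is None:
        raise ValueError("only 2D POINT and POLYGON are handled in this sketch")
    kind, body = match.group(1).upper(), match.group(2)
    if kind == "POINT":
        positions = _parse_positions(body)
        if len(positions) != 1:
            raise ValueError("POINT must contain exactly one position")
        return {"type": "Point", "coordinates": positions[0]}
    # POLYGON: every parenthesised group is one ring (outer ring first, then holes).
    rings = re.findall(r"\(([^()]*)\)", body)
    if not rings:
        raise ValueError("POLYGON has no rings")
    return {"type": "Polygon", "coordinates": [_parse_positions(r) for r in rings]}


def geojson_to_wkt(geometry):
    """Convert a GeoJSON Point or Polygon geometry dict back into WKT."""
    if geometry["type"] == "Point":
        x, y = geometry["coordinates"]
        return f"POINT ({x} {y})"
    if geometry["type"] == "Polygon":
        rings = ", ".join(
            "(" + ", ".join(f"{x} {y}" for x, y in ring) + ")"
            for ring in geometry["coordinates"]
        )
        return f"POLYGON ({rings})"
    raise ValueError("only Point and Polygon are handled in this sketch")


# Round trip: WKT -> GeoJSON -> WKT should give back the same geometry.
field_wkt = "POLYGON ((-72.5 44.0, -72.4 44.0, -72.4 44.1, -72.5 44.0))"
field_geojson = wkt_to_geojson(field_wkt)
assert wkt_to_geojson(geojson_to_wkt(field_geojson)) == field_geojson
```

A round-trip check like the last two lines, run over real exports, is one way we could measure how confident we can be in the conversion before committing to GeoJSON.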
Questions to answer
(consulted Juliet, Octavio)
Geometry
It seems 1) location is really important for most schemas and 2) we have 2 main buckets of uses for these schemas. They are:
- Minimum location - point-based is ok
  - Examples - Cover Crop Tool, Cool Farm Tool, COMET Planner
  - Description: these uses only require a single point, which usually is used to pull county-level or region-level data, to support model outputs / recommendations. This is often put at the 'farm' level - meaning the farm address, or farm county, rather than at the 'field' level.
- Area location - area geometry is required for fields
  - Examples - COMET Farm, SurveyStack Stratification service (or FarmLab and others), etc.
  - Description: these uses require a valid area, and the area is processed to generate important outputs (points, soil types, etc.). So we should validate that geometries are areas or area collections and contain no points (possibly... this is debatable, I suppose).
Questions to answer
- Should we make 2 different 'base' schemas?
- If so... how can we clearly differentiate them?
- If so... exactly how should we validate them in the area-based schema... only areas? multiple areas AND nothing else? multiple areas (may also contain other stuff)... etc.
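To help compare the validation options in the last question, here is a rough sketch of how each one could be checked against a GeoJSON geometry. The policy names ("single_area_only", "areas_only", "areas_plus_other") and the function names are hypothetical, made up for this note. The sketch only looks at geometry types; it does not check that rings are closed or otherwise valid.

```python
AREA_TYPES = {"Polygon", "MultiPolygon"}


def leaf_geometries(geometry):
    """Yield every non-collection geometry, descending into GeometryCollections."""
    if geometry.get("type") == "GeometryCollection":
        for member in geometry.get("geometries", []):
            yield from leaf_geometries(member)
    else:
        yield geometry


def count_areas(geometry):
    """Number of individual polygons in a Polygon or MultiPolygon geometry."""
    if geometry.get("type") == "Polygon":
        return 1
    if geometry.get("type") == "MultiPolygon":
        return len(geometry.get("coordinates", []))
    return 0


def passes_area_policy(geometry, policy):
    """Check a GeoJSON geometry against one of three hypothetical policies."""
    leaves = list(leaf_geometries(geometry))
    areas = sum(count_areas(g) for g in leaves)
    has_other = any(g.get("type") not in AREA_TYPES for g in leaves)
    if policy == "single_area_only":   # exactly one area, nothing else
        return areas == 1 and not has_other
    if policy == "areas_only":         # one or more areas, nothing else
        return areas >= 1 and not has_other
    if policy == "areas_plus_other":   # one or more areas, other geometry tolerated
        return areas >= 1
    raise ValueError(f"unknown policy: {policy!r}")


# Illustrative geometries only; coordinates are placeholders.
field = {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}
sample_point = {"type": "Point", "coordinates": [0.5, 0.5]}
mixed = {"type": "GeometryCollection", "geometries": [field, sample_point]}

assert passes_area_policy(field, "areas_only")
assert not passes_area_policy(mixed, "areas_only")
assert passes_area_policy(mixed, "areas_plus_other")
```

The three policies differ by only one condition each, so whichever we choose, switching later would be a small change to the validator rather than to the schema itself.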
Farm / Field / Herd level data
We now have many use cases which contain a lot of what, from farmOS's perspective, we might call 'metadata'. Examples include the farm's address, the farmer's name, or the farm's certification status. This can also occur at the level of other assets - for example, a field's certification status. So where should we store this information in the CFC?
Here are some 'buckets' of this kind of information we should design around.
- Data collected about a thing which has a current (right now) and historical (at some point in the past) form.
  - GHG calculation of a field (output from CFT, COMET Planner or COMET Farm)
  - Certification status of a herd or animal (output of a supply chain tool)
  - Biodiversity score of an area (input of Regen Digital's DAFF supply chain tracking tool)
  - Diversity score of a farm (output of the Regen Score)
  - Suggestion: Any multi-year data (i.e. a score which occurs in 2022 for the 2022 season, in 2023 for the 2023 season, etc.) should probably be represented as logs. We have an example of this with Regen Digital and the proposed biodiversity observations. It's possible we need logs for model run returns, which specify a time period. This should have a convention built around it.
- Data collected about a thing which will not change often, and is not tracked meaningfully over time
  - Historical management status (input for COMET Farm, COMET Planner)
  - County / State / Country (input for CFT, COMET Farm, COMET Planner)
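As a rough illustration of the two buckets above, the sketch below keeps slow-changing facts as attributes on the asset and keeps per-season values as log-like records with an explicit time period. Every field name here (asset_id, metric, period_start, period_end, attributes) and every value is a hypothetical placeholder, not an existing farmOS or CFC structure.

```python
from datetime import date

# Bucket 2: facts that rarely change live directly on the asset (placeholder values).
field = {
    "id": "field-1",
    "type": "field",
    "attributes": {"county": "Example County", "country": "Example Country"},
}

# Bucket 1: values with a current and a historical form become one log per period.
logs = [
    {"asset_id": "field-1", "metric": "biodiversity_score", "value": 0.4,
     "period_start": date(2022, 1, 1), "period_end": date(2022, 12, 31)},
    {"asset_id": "field-1", "metric": "biodiversity_score", "value": 0.6,
     "period_start": date(2023, 1, 1), "period_end": date(2023, 12, 31)},
]


def history(logs, asset_id, metric):
    """All logged values for one asset and metric, oldest period first."""
    matching = [
        log for log in logs
        if log["asset_id"] == asset_id and log["metric"] == metric
    ]
    return sorted(matching, key=lambda log: log["period_end"])


def current_value(logs, asset_id, metric):
    """Value from the most recent period, or None if nothing was logged."""
    records = history(logs, asset_id, metric)
    return records[-1]["value"] if records else None


assert current_value(logs, field["id"], "biodiversity_score") == 0.6
assert current_value(logs, field["id"], "ghg_estimate") is None
```

With this shape, the "current" value is just the latest log, so nothing has to be overwritten when a new season's score or model run arrives.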
Questions to answer
- What are the priority use cases to consider?
  - farmOS
  - Cool Farm Tool
  - COMET Farm
  - COMET Planner
  - Regen Score
  - Range C
  - PCSC Reporting data (not the project-level, but the farm-level)
- Who should we ask about this to get feedback?
  - Laura Morton
  - Scott Newby
  - Greg, Octavio, Emily
  - CFT Team (tech @ cft)
  - Kevin at COMET
  - ADAPT Standards working group team or lead.
  - Other people that may need to manage or use this - like LiteFarm...
  - ??
- What are all (or as many as possible) of the types of data we should be handling that we can't adequately handle now? Also... what should we not attempt to handle (edges)?
- Where will we put this information? Will we use any external resources (like RDF context or others) in our solution?
Funded through Action: scoping.