Replication Key Signpost Behavior not consistent with expected behavior
## Summary

My understanding was that the `get_replication_key_signpost` method ensures that the replication key value stored in state never exceeds a certain value. For example, I overrode the method to return `2021-08-04T19:08:35+00:00` (as a datetime object), but the replication key value stored in my state is `2021-08-04T19:17:29Z`, which is greater than the signpost value. Am I misunderstanding what the behavior of that method should be?

## Steps to reproduce

- Ingest a stream with any timestamp field that will be used for replication (you can generate a stream with fake data)
- In your stream class, override the `get_replication_key_signpost` method to return a datetime that is earlier than the highest date in your fake data
- Make sure the stream is unsorted
- Run ELT with a job_id, so state gets saved
- Check the value of the replication key in state: it will not be the `get_replication_key_signpost` value, but the highest value in the fake stream data

## What is the current bug behavior?

The timestamp stored in state for a stream can be higher than what the `get_replication_key_signpost` method returns.

## What is the expected correct behavior?
The timestamp of a stream stored in state should never be higher than what `get_replication_key_signpost` returns.

## Relevant logs and/or screenshots

For example, I ingested a stream that output the following data:

```json
{"id": "1112931637383", "updatedAt": "2021-08-04T16:38:04Z"}
{"id": "1112931637323", "updatedAt": "2021-08-04T03:58:39Z"}
{"id": "111293163735", "updatedAt": "2021-08-04T11:25:30Z"}
{"id": "1112931637", "updatedAt": "2021-08-04T16:37:54Z"}
{"id": "1112931637343", "updatedAt": "2021-08-04T10:42:30Z"}
{"id": "1112931637393", "updatedAt": "2021-08-04T09:44:58Z"}
{"id": "111293163738211", "updatedAt": "2021-08-04T08:36:52Z"}
{"id": "11129316373812", "updatedAt": "2021-08-04T07:55:35Z"}
{"id": "11129316373817", "updatedAt": "2021-08-04T07:24:55Z"}
{"id": "11129316373892", "updatedAt": "2021-08-04T09:53:29Z"}
```

State file:

```json
{
  "bookmarks": {
    "fake_stream": {
      "replication_key": "updatedAt",
      "replication_key_value": "2021-08-04T16:38:04Z"
    }
  }
}
```

However, the value of `replication_key_signpost` was `2021-08-02T00:00:00+00:00`.

## Possible fixes

It looks like this [method](https://gitlab.com/meltano/sdk/-/blob/main/singer_sdk/streams/core.py#L522) gets the signpost and passes it to this [method](https://gitlab.com/meltano/sdk/-/blob/main/singer_sdk/helpers/_state.py#L188), but it is never use
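
To make the expected behavior concrete, here is a minimal sketch in plain Python of what I assumed the signpost would do: take the highest replication key value seen in an unsorted stream and cap it at the signpost before it is written to state. This is not the SDK's code. The names `parse_ts` and `capped_bookmark` are hypothetical helpers made up for illustration, and the records are a subset of the sample data above.

```python
from datetime import datetime, timezone
from typing import Iterable, Mapping, Optional


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def capped_bookmark(
    records: Iterable[Mapping[str, str]],
    replication_key: str,
    signpost: datetime,
) -> Optional[datetime]:
    """Return the highest replication key value seen, capped at the signpost.

    Hypothetical helper: it models the behavior I expected, not the SDK internals.
    """
    highest: Optional[datetime] = None
    for record in records:
        value = parse_ts(record[replication_key])
        if highest is None or value > highest:
            highest = value
    if highest is None:
        return None  # no records, so there is nothing to bookmark
    # The stored value should never be above the signpost.
    return min(highest, signpost)


records = [
    {"id": "1112931637383", "updatedAt": "2021-08-04T16:38:04Z"},
    {"id": "1112931637323", "updatedAt": "2021-08-04T03:58:39Z"},
    {"id": "111293163735", "updatedAt": "2021-08-04T11:25:30Z"},
]
signpost = datetime(2021, 8, 2, tzinfo=timezone.utc)

bookmark = capped_bookmark(records, "updatedAt", signpost)
print(bookmark.isoformat())  # 2021-08-02T00:00:00+00:00
```

With this logic, the state above would hold `2021-08-02T00:00:00+00:00` rather than `2021-08-04T16:38:04Z`, which is the result I expected from overriding `get_replication_key_signpost`.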