Standardize dashboard date filter presets to be in UTC timezone
Summary
Update dashboard date filter presets to use dates in the UTC timezone.
Details
The product analytics dashboard date filter offers presets such as Today and Last 7 days, which are currently resolved in the user's (client) timezone.
We need to resolve them as UTC dates instead. Otherwise, users whose timezones fall on different calendar days at the same moment would query and view different sets of data.
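To illustrate the mismatch, here is a minimal sketch that formats the same instant as a calendar date in different timezones. The helper name `calendarDateIn` and the chosen timezones are hypothetical and used only for illustration; it relies on the standard `Intl.DateTimeFormat` API.

```typescript
// Hypothetical helper: the calendar date (YYYY-MM-DD) that an instant
// falls on in a given IANA timezone.
function calendarDateIn(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).formatToParts(instant);
  const get = (type: string): string => {
    const part = parts.find((p) => p.type === type);
    return part ? part.value : '';
  };
  return `${get('year')}-${get('month')}-${get('day')}`;
}

// One instant, three different ideas of which day "Today" is.
const instant = new Date('2023-04-25T23:30:00Z');
const losAngeles = calendarDateIn(instant, 'America/Los_Angeles');
const auckland = calendarDateIn(instant, 'Pacific/Auckland');
const utc = calendarDateIn(instant, 'UTC');
```

With a client-bound preset, the two users would send different dates for Today even though they opened the dashboard at the same moment.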
See old description for more details and context.
Proposal
Update dashboard date filter presets to use dates in the UTC timezone.
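A minimal sketch of what UTC-based presets could look like follows. The names `Preset`, `toUtcDate`, and `utcPresetRange` are hypothetical, and the assumption that Last 7 days includes the current UTC day is mine, not something the existing date range controls confirm.

```typescript
// Hypothetical preset identifiers; only the two presets named in this issue.
type Preset = 'today' | 'last7Days';

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// toISOString() always renders in UTC, so its first 10 characters
// are the UTC calendar date regardless of the client timezone.
function toUtcDate(d: Date): string {
  return d.toISOString().slice(0, 10);
}

// Returns [start, end] as UTC dates.
// Assumption: "Last 7 days" covers today plus the 6 preceding UTC days.
function utcPresetRange(preset: Preset, now: Date = new Date()): [string, string] {
  const days = preset === 'today' ? 1 : 7;
  const end = toUtcDate(now);
  const start = toUtcDate(new Date(now.getTime() - (days - 1) * MS_PER_DAY));
  return [start, end];
}

const now = new Date('2023-04-25T23:30:00Z');
const todayRange = utcPresetRange('today', now);
const weekRange = utcPresetRange('last7Days', now);
```

Because the arithmetic is done on the UTC timestamp, every client derives the same dates for the same instant.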
Old description
Previous title: Product Analytics - "today" date range returns incomplete set
Summary
As I was testing a new GDK and cluster, I found events weren't being picked up by the dashboards, regardless of whether my date range dropdown was set to
This was tested in the Jitsu version of the Analytics Stack, but likely also applies to the Snowplow version.
Details
The way we currently query for a single day, e.g., Today, is as follows:
{
  "query": {
    ...
    "filters": [
      ...
      {
        "member": "TrackedEvents.utcTime",
        "operator": "inDateRange",
        "values": [
          "2023-04-25",
          "2023-04-25"
        ]
      }
    ],
    ...
  }
}
That is, we supply the same day as both the start and the end of the date range.
However, **this returned an empty result even though I know I sent events from an example application**, and I verified that they are in the ClickHouse database.
I experimented with the query by removing the second value, because Cube's playground won't allow you to set the same value twice in the array for an inDateRange filter:
{
  "query": {
    ...
    "filters": [
      ...
      {
        "member": "TrackedEvents.utcTime",
        "operator": "inDateRange",
        "values": [
          "2023-04-25"
        ]
      }
    ],
    ...
  }
}
This returned the complete set of events that I had sent.
You can run a query as follows, which will return the complete set as well:
{
  "query": {
    ...
    "timeDimensions": [
      {
        "dimension": "TrackedEvents.utcTime",
        "granularity": "day",
        "dateRange": "Today"
      }
    ],
    ...
  }
}
I suspect this has something to do with timezones. Looking at how Cube's querying tool does things, though, it may be better to switch over to timeDimensions, which we already use to set the granularity of the result set. I wouldn't recommend using the plain-English date ranges, as I'll explain below.
It gets more interesting!
Testing this on a project in production, I tried the existing inDateRange filter as we currently query, except with today's date as a single value rather than two:
It returns values from two days, but it gets even weirder if you query using "Today" as the dateRange in timeDimensions:
Now, as of writing this issue, it's still 2023-04-25 in my timezone. I suspect "Today" is not worth using, as it likely resolves in the server's timezone. Supplying the literal date filters against the data itself, which is in UTC, and is in my opinion the right approach.
Test environment
- GDK
- Connected to external cluster on GCP, running latest versions of images as of 2023-04-25
Steps to reproduce
- Connect GDK to cluster
- Instrument example application
- Generate events
- View either dashboard and see values returned as 0 even though you've generated events
- Also verifiable in production
Proposal
For single-day date ranges, supply only one value in the inDateRange filter query.
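As a rough sketch of this proposal, the filter could be built by a small helper that collapses a single-day range into one value. The `DateRangeFilter` type and `buildDateRangeFilter` function are hypothetical names, not part of the existing codebase; the member and operator come from the queries above.

```typescript
// Shape of the filter entry used in the queries above.
interface DateRangeFilter {
  member: string;
  operator: 'inDateRange';
  values: string[];
}

// Hypothetical helper: send one value when start and end are the same day,
// since the two-identical-values form returned an empty result in testing.
function buildDateRangeFilter(member: string, start: string, end: string): DateRangeFilter {
  const values = start === end ? [start] : [start, end];
  return { member, operator: 'inDateRange', values };
}

const singleDay = buildDateRangeFilter('TrackedEvents.utcTime', '2023-04-25', '2023-04-25');
const multiDay = buildDateRangeFilter('TrackedEvents.utcTime', '2023-04-19', '2023-04-25');
```

Multi-day ranges keep the existing two-value form, so only the single-day case changes.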
I'm not sure whether this is compatible with pre-aggregations or with how our date range controls are set up, but using timeDimensions instead of a filter on the time would simplify things.
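For comparison, here is a sketch of the timeDimensions variant using literal dates rather than "Today". The `TimeDimension` type and `buildTimeDimension` function are hypothetical, and whether a one-element dateRange behaves the same as the one-element inDateRange filter is an assumption that would need to be verified against Cube.

```typescript
// Shape of the timeDimensions entry used in the query above,
// but with literal dates instead of a plain-English range.
interface TimeDimension {
  dimension: string;
  granularity: string;
  dateRange: string[];
}

// Hypothetical helper. Assumption: a single-day range is sent as one value,
// mirroring the inDateRange proposal; verify this against Cube before relying on it.
function buildTimeDimension(
  dimension: string,
  start: string,
  end: string,
  granularity: string = 'day',
): TimeDimension {
  const dateRange = start === end ? [start] : [start, end];
  return { dimension, granularity, dateRange };
}

const todayDimension = buildTimeDimension('TrackedEvents.utcTime', '2023-04-25', '2023-04-25');
const weekDimension = buildTimeDimension('TrackedEvents.utcTime', '2023-04-19', '2023-04-25');
```

This keeps the date range and the granularity in one place, so no separate filter on TrackedEvents.utcTime is needed.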