Improve statistics calculation runtime

Description

A daily job calculates statistics for different things. Especially the parts for users and stores are shittily implemented doing way too many SQL queries, (currently) leading to the following runtimes:

  • Foodsaver stats: 11 minutes for ~300000 users. The list of users is queried followed by 11 queries per user. At least 8 of those could be merged to query things for all users directly.

  • Store stats: 50 minutes for ~15000? stores. The list of all stores is queried followed by one query for each store to query the store team followed by 4 queries for each team member of the store. All of those queries could be done on the top level with proper grouping.

Goal

Improve performance. Especially the store stats tend to have approximately quadratic runtime, e.g. although one hour for now is not a problem, it will be much longer with not that much more users/stores and thus may become a problem soon.

Evidence

See cron mails in #fs-ops Slack channel

Code: https://gitlab.com/foodsharing-dev/foodsharing/blob/master/src/Modules/Stats/StatsControl.php#L28

and

https://gitlab.com/foodsharing-dev/foodsharing/blob/master/src/Modules/Stats/StatsControl.php#L99

Related

Edited by Johannes