
[FIX] Fix the issues with >=4.1 Django and the require valid function

Aaron Kable requested to merge aaronkable/django-esi:join-debug into master

What

On Django >=4.1, the require valid function was generating a query that is slow in the MySQL planner. This put massive MySQL load on the servers, caused some tasks in Corp Tools and Member Audit to fail, and stopped people from adding tokens with lots of scopes.

This fix forces multiple queries to ensure each one is simple enough not to cause the issue.
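As a rough illustration of the "multiple simple queries" idea, the sketch below runs one small query per scope and intersects the resulting token ids in Python. It is not the code in this merge request: it uses the standard-library sqlite3 module as a stand-in for MySQL and the Django ORM, and the table names (`token`, `scope`, `token_scopes`), the scope names, and the helper `tokens_with_all_scopes` are all hypothetical.

```python
import sqlite3


def tokens_with_all_scopes(conn, scope_names):
    """Return ids of tokens that hold every scope in scope_names.

    Runs one single-join query per scope and intersects the results,
    instead of one query with a join per scope.
    An empty scope list returns an empty set in this sketch.
    """
    matching = None
    for name in scope_names:
        rows = conn.execute(
            "SELECT ts.token_id FROM token_scopes ts "
            "JOIN scope s ON s.id = ts.scope_id WHERE s.name = ?",
            (name,),
        )
        ids = {row[0] for row in rows}
        matching = ids if matching is None else matching & ids
        if not matching:
            break  # no token can satisfy the remaining scopes
    return matching or set()


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE token (id INTEGER PRIMARY KEY);
    CREATE TABLE scope (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE token_scopes (token_id INTEGER, scope_id INTEGER);
    INSERT INTO token (id) VALUES (1), (2);
    INSERT INTO scope (id, name) VALUES (1, 'scope.a'), (2, 'scope.b');
    INSERT INTO token_scopes VALUES (1, 1), (1, 2), (2, 1);
    """
)

both = tokens_with_all_scopes(conn, ["scope.a", "scope.b"])
```

Each query here touches a fixed, small number of tables no matter how many scopes are requested, so the planner never has to consider a large join order. The cost is more round trips to the database.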

How we got here

Debugging was done mainly via messaging in Discord, as this does not appear to affect everyone and remote debugging was needed. I could sometimes get it to happen in my development environment if MySQL had been under considerable load for an extended (>48h) period.

  • Token decorators with many scopes hang at the database level, causing huge load on the server.
  • Something changed between Django 4.0.10 and 4.1.x.
  • The token scope filter must be applied as chained filters, one per scope, which creates many joins in the SQL query. Q objects cannot be used, because conditions combined in a single filter apply to the same related row.
  • MySQL appears to be hanging in the "statistics" state in all cases.
  • Setting the MySQL setting optimizer_search_depth to 1 makes it work, but this is less than ideal for everything else.
  • Setting the MySQL setting optimizer_search_depth to 5 is faster but still times out.
  • Setting the MySQL setting optimizer_search_depth to 0 is faster but still times out.
  • Django is reversing the where clauses on the require valid qs join, specifically the exclude of out-of-date tokens. I haven't been able to find where this changed in Django, nor do I think we should be looking to get it fixed at that level.
  • I profiled all the queries generated under Django 4.0.10 and 5.0.3 and diffed them: https://www.diffchecker.com/syjyDvEf/
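To make the "many joins" point above concrete, the sketch below builds the general shape of SQL that one chained filter per scope produces: a pair of joins (through table plus scope table) for every requested scope. It is only an approximation written against a hypothetical schema with sqlite3 from the standard library. The table names, the aliases, and the helper `chained_filter_sql` are assumptions, and the exact SQL that Django emits for django-esi will differ.

```python
import sqlite3


def chained_filter_sql(scope_names):
    """Build a single SELECT with two joins per requested scope.

    Mimics the shape of a query produced by filtering a many-to-many
    relation once per scope (hypothetical table and alias names).
    """
    joins, wheres, params = [], [], []
    for i, name in enumerate(scope_names):
        joins.append(
            f"JOIN token_scopes ts{i} ON ts{i}.token_id = t.id "
            f"JOIN scope s{i} ON s{i}.id = ts{i}.scope_id"
        )
        wheres.append(f"s{i}.name = ?")
        params.append(name)
    sql = "SELECT t.id FROM token t " + " ".join(joins)
    if wheres:
        sql += " WHERE " + " AND ".join(wheres)
    return sql, params


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE token (id INTEGER PRIMARY KEY);
    CREATE TABLE scope (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE token_scopes (token_id INTEGER, scope_id INTEGER);
    INSERT INTO token (id) VALUES (1), (2);
    INSERT INTO scope (id, name) VALUES
        (1, 'scope.a'), (2, 'scope.b'), (3, 'scope.c');
    INSERT INTO token_scopes VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 2);
    """
)

sql, params = chained_filter_sql(["scope.a", "scope.b", "scope.c"])
join_count = sql.count("JOIN ")  # grows linearly with the number of scopes
token_ids = [row[0] for row in conn.execute(sql, params)]
```

With three scopes the statement already contains six joins. A token requested with dozens of scopes yields a join list long enough that the number of join orders the optimizer may evaluate becomes very large, which is consistent with lowering optimizer_search_depth changing the behaviour.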

No Tests?

As this is a database-level bug that cannot be consistently recreated, I have not investigated writing a test case for it. If someone is able to consistently recreate the issue I will happily revisit and write some tests, but after spending the better part of 2 weeks trying to recreate this and going nowhere, that's where we are at.

Real World Testing

I have installed this on ~5 Auths that have had issues. There were no issues post install and everything works as expected.

To fix

  1. Stop auth and workers
  2. pip install the fixed django-esi version
  3. Clear all tasks if there is a massive backlog
  4. Restart the server (this was easier than restarting MySQL when helping other people)
  5. Confirm Auth is running after the reboot
Edited by Aaron Kable
