
[Refactor] Node utility function efficiency boosts

Summary

This merge request optimises a handful of functions in the nutils unit to save a few cycles during some types of node analysis.

  • actualtargetnode has been refactored to use a while-loop instead of recursion. Tail recursion optimisation does take place, but the result is not the most efficient code because both n and result need to be tracked. With a while-loop, only result changes throughout, which alleviates register pressure and produces more efficient assembly language.
  • get_open_const_array has been refactored to use a while-loop instead of recursion. As with actualtargetnode, this produces more efficient code than letting tail recursion optimisation take place.
  • node_reset_flags, node_reset_pass1_write, node_tree_set_filepos and checktreenodetypes have been made inline since they just pass their formal parameters into foreachnodestatic with very little transformation.
  • node_count and node_count_weighted have been made inline and refactored to use a local stack variable (passed through the arg parameter) instead of a global variable, allowing for better cache locality and reducing the chance of subtle bugs with global variables.
  • IsSingleStatement has been refactored to use a while-loop instead of recursion. In this case the recursion is not tail recursion, so the improvement is much more pronounced; for example, the routine becomes a leaf function and, under x86_64-win64 and aarch64-linux, no longer requires a stack frame at all!
  • has_no_code has been refactored to use a while-loop to replace one instance of recursion. The second instance of recursion could not be removed as it is within a repeat-until-loop, but the end result is still slightly more efficient.
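
The two patterns used above (recursion replaced by a while-loop, and a global counter replaced by a local passed through the arg parameter) can be sketched as follows. This is an illustration in C rather than the Pascal of the nutils unit, and every name in it (node, unwrap_recursive, unwrap_loop, for_each_node, count_nodes) is hypothetical; it does not reproduce the compiler's actual node classes or the real signature of foreachnodestatic.

```c
#include <stddef.h>

/* Hypothetical, simplified node type; not the compiler's real node class. */
typedef enum { NODE_LEAF, NODE_WRAPPER } node_kind;

typedef struct node {
    node_kind kind;
    struct node *left;
    struct node *right;
} node;

/* Recursive form: peel off wrapper nodes until a non-wrapper is reached.
   Both the parameter and the eventual result are live across the call. */
static node *unwrap_recursive(node *n) {
    if (n != NULL && n->kind == NODE_WRAPPER)
        return unwrap_recursive(n->left);
    return n;
}

/* Loop form: a single variable, result, is updated in place. */
static node *unwrap_loop(node *n) {
    node *result = n;
    while (result != NULL && result->kind == NODE_WRAPPER)
        result = result->left;
    return result;
}

/* Callback-based walker, loosely analogous to foreachnodestatic
   (assumed shape: visit every node, forwarding an opaque arg). */
typedef void (*node_callback)(node *n, void *arg);

static void for_each_node(node *n, node_callback cb, void *arg) {
    if (n == NULL)
        return;
    cb(n, arg);
    for_each_node(n->left, cb, arg);
    for_each_node(n->right, cb, arg);
}

/* The callback increments the counter that arg points to. */
static void count_callback(node *n, void *arg) {
    (void)n;
    ++*(int *)arg;
}

/* The counter lives on the caller's stack instead of in a global. */
static int count_nodes(node *root) {
    int count = 0;
    for_each_node(root, count_callback, &count);
    return count;
}
```

In this sketch count_nodes keeps no state between calls, so it cannot be affected by a stale or concurrently modified global, which is the kind of subtle bug the node_count change is meant to avoid.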

System

  • Processor architecture: Cross-platform

What is the current bug behavior?

N/A

What is the behavior after applying this patch?

The compiler should run very slightly faster (compiler code improvements have been noted on x86_64-win64 and aarch64-linux).
