
spectrum: speed up beamforming gain calculation in three-gpp-spectrum-propagation-loss-model.cc

Here are the before and after profiles of the patch for `lte-lena-comparison-user --simTag=test2-user --trafficScenario=2 --simulator=5GLENA --technology=LTE --numRings=2 --ueNumPergNb=5 --calibration=false --freqScenario=0 --operationMode=FDD --direction=UL --RngRun=1 --RngSeed=1` (from the nr codebase), using the release profile.

Before

(Screenshot: Screenshot_from_2023-10-30_12-19-21 — profiler output before the patch)

As can be seen, sincos is the main villain (57% of retired/completed/unaborted instructions). The caller of all those sincos invocations is none other than ThreeGppSpectrumPropagationLossModel::CalcBeamformingGain().

The reason those sincos calls take almost half of the simulation time (cycles not in halt) is not only that trigonometric functions are slow, but also that the phases passed as arguments are first range-reduced to something like [-Pi/2, Pi/2] for best performance. That argument reduction causes a ton of branch misses, which starve the CPU backend and slow everything down.

Since the delay components of the channel don't change frequently, and neither do the frequency bands or the number of clusters, we can in theory cache the computed propagation-delay phase terms, eliminating both the trigonometric function calls and the branch misses caused by the wrapping. After doing this, we get the following.

After

(Screenshot: Screenshot_from_2023-10-30_13-55-56 — profiler output after the patch)

In terms of speedup, we get about 1.4x (~300 s -> ~214 s).

P.S.: A previous version of this MR incorrectly claimed a much larger speedup, but the number of events was completely off... 😅

Edited by Gabriel Ferreira
