wifi: MacLow::NormalAckTimeout should not call WifiRemoteStationManager::ReportDataFailed
When testing the aggregation rework, the following example crashed:
src/wifi/examples/wifi-manager-example --wifiManager=MinstrelHt --standard=802.11ac --serverChannelWidth=40 --clientChannelWidth=40 --serverShortGuardInterval=800 --clientShortGuardInterval=800 --serverNss=1 --clientNss=1 --stepTime=0.1
After investigating for a while, I realized that the crash is not caused by my changes, but by what looks to me like erroneous behavior in a corner case exposed by the example.
From the logs (which I can share) I noticed that no problems arise when the transmission of an A-MPDU fails repeatedly, until the maximum number of retries is reached and the transmission is aborted. The corner case is when the transmission of an A-MPDU fails repeatedly but at the last attempt a single MPDU (in an S-MPDU) is transmitted. In this case, the example crashes with:
msg="Max retries reached and m_longRetry not cleared properly. longRetry= 7", file=../src/wifi/model/minstrel-ht-wifi-manager.cc, line=767
The difference with an S-MPDU is that a normal ack is requested. Hence, when the ack timeout expires, MacLow::NormalAckTimeout calls WifiRemoteStationManager::ReportDataFailed, which increases the station long retry count by one, thus making it exceed the maximum number of retries and triggering the fatal error.
I believe that MacLow::NormalAckTimeout should not call WifiRemoteStationManager::ReportDataFailed directly. Instead, much like MacLow::BlockAckTimeout does, it should let [Qos]Txop::MissedAck decide whether to call WifiRemoteStationManager::ReportDataFailed or WifiRemoteStationManager::ReportFinalDataFailed, based on whether a retransmission is needed or not.
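To illustrate the difference between the two call paths, here is a minimal standalone sketch. It is not the ns-3 code: ToyStation, the retry limit of 7, the exact condition under which a failure report is rejected, and all the free functions below are hypothetical stand-ins that only mimic the counter behavior described above.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the per-station state; not the ns-3 classes.
struct ToyStation
{
  uint32_t longRetry = 0;  // failed attempts counted so far
  uint32_t maxRetries = 7; // assumed retry limit
};

// Counts one failed attempt. Throws where the real rate manager would abort
// (assumption: a report is rejected once the counter is already at the limit).
void ReportDataFailed (ToyStation &s)
{
  if (s.longRetry >= s.maxRetries)
    {
      throw std::logic_error ("max retries reached, longRetry not cleared");
    }
  ++s.longRetry;
}

// Gives up on the frame and clears the counter.
void ReportFinalDataFailed (ToyStation &s)
{
  s.longRetry = 0;
}

// Toy MissedAck: picks the report based on whether a retry is still allowed.
void MissedAck (ToyStation &s)
{
  assert (s.longRetry <= s.maxRetries);
  if (s.longRetry < s.maxRetries)
    {
      ReportDataFailed (s);
    }
  else
    {
      ReportFinalDataFailed (s);
    }
}

// Current path: the timeout handler reports the failure unconditionally
// (the upper-layer notification that follows is omitted in this sketch).
void NormalAckTimeoutCurrent (ToyStation &s)
{
  ReportDataFailed (s);
}

// Proposed path: the timeout handler leaves the decision to MissedAck.
void NormalAckTimeoutProposed (ToyStation &s)
{
  MissedAck (s);
}

// Returns true if the current path aborts when the counter is at the limit.
bool CurrentPathAborts ()
{
  ToyStation s;
  s.longRetry = s.maxRetries; // state after repeated failed attempts
  try
    {
      NormalAckTimeoutCurrent (s);
    }
  catch (const std::logic_error &)
    {
      return true;
    }
  return false;
}

// Returns the counter value left by the proposed path in the same state.
uint32_t ProposedPathCounter ()
{
  ToyStation s;
  s.longRetry = s.maxRetries;
  NormalAckTimeoutProposed (s);
  return s.longRetry;
}
```

In this toy model, CurrentPathAborts () hits the error path because the counter is incremented without checking whether a retransmission is still allowed, while ProposedPathCounter () ends with a cleared counter because the final-failure branch is taken instead.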
I prepared this patch:
and verified that all tests and examples pass with and without my aggregation rework.