Skip to content

stuck NPC debugging & fix overlooked NPE with badluckmitigation

Logg requested to merge Loggy/openrsc:stuck-npc-debugging into develop

Conker was able to help me examine the memory state of a stuck NPC as captured from a heap dump on the live server.

A stuck NPC has a defined player in their combatWith property, and also has a defined combatEvent which is not running. The NPC behaviour oscillates between ROAM and COMBAT modes, since both handlers point at each other in this weird state.

It is hard to understand how an NPC can get into this state, since there should really only be three possible ways that a combat event can be set running = false:

  1. The NPC is damaged, it detects they've reached 0 Hits remaining, and they are npc.remove()'d.
  2. The NPC retreats, ending the combat event.
  3. The combat event detects itself if it is in an invalid state, and chooses to end itself. This can happen if the player is force logged out during combat for example.

In all the above cases, it calls CombatEvent.resetCombat(), which removes the event from both combat opponents.

We think something is probably causing an exception, which would abort the resetCombat() function either during or before it can finish, but still sets the CombatEvent as no longer running.

In this PR, I have done a few things to try to mitigate this.

  1. I have removed this code inside resetCombat(). It doesn't do anything and should be removed. The "party" member may not be initialized, in which case, I think it does not equal null either. This old cruft is unlikely to be the problem, however, given the relative rarity of the bug.
				if(attackerMob.isPlayer()){
					Player p2;
					p2 = ((Player) attackerMob);
					if (p2.getParty() != null){
						for (Player player : getWorld().getPlayers()) {
							if(p2.getParty() == player.getParty()){
								//ActionSender.sendParty(p);
							}
						}
					}
				}
  1. I have made it so that even if the combat event is not running, when resetCombat() is called, it will reset various variables on the mobs associated with the (not running) combat event. This should be safe, based on my inspection of everywhere that the function is called from, and based on what the programmer likely expects when calling resetCombat(). By changing this, in the event that the combat event is somehow stopped before resetCombat() is called, but resetCombat() is indeed called, then it will be able to resetCombat successfully. I am not sure that there is actually any case in the code where this can happen, but I have made this change regardless.

  2. I have introduced explicit checking at the end of a tick for any NPCs who have non-running non-null combat events with a defined opponent. When an NPC is found in this invalid state, it fixes them, and reports the error to Discord, so that I can be made aware that the bug leading to that invalid state is not fixed. This is a bandaid fix, and a slightly refined version might end up being the final fix if the "correct" fix can't be identified.

Also, I found and fixed a critical error in the new (undeployed) BadLuckMitigation feature. On servers where the feature is not initialized, it caused a NPE, and every single NPC remained stuck in combat on death.

Merge request reports