Raising an exception crashes while walking stack frames, especially on aarch64
Summary
When I raise an exception, my app crashes with a SIGSEGV.
System Information
- Operating system: Android
- Processor architecture: AARCH64
- Compiler version: 3.2.3 (2021/10/24)
Steps to reproduce
Raise an exception and "get lucky"
It only happens on some devices.
Example Project
try
  raise Exception.Create('abc');
except
  on e: Exception do ; // handler is intentionally empty
end;
What is the current bug behavior?
The process dies with SIGSEGV (SEGV_MAPERR, fault addr 0x30); see the log below.
What is the expected (correct) behavior?
The exception should be raised and caught by the handler without crashing.
Relevant logs and/or screenshots
12-19 21:47:29.056 4023 4023 F libc : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x30 in tid 4023 (ibela.videlibri), pid 4023 (ibela.videlibri)
12-19 21:47:29.162 4108 4108 I crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstone
12-19 21:47:29.164 748 748 I /system/bin/tombstoned: received crash request for pid 4023
12-19 21:47:29.168 4108 4108 I crash_dump64: performing dump of process 4023 (target tid = 4023)
12-19 21:47:29.184 4108 4108 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
12-19 21:47:29.184 4108 4108 F DEBUG : Build fingerprint: 'Nokia/Core2_00WW/CO2N_sprout:10/QP1A.190711.020/00WW_4_200:user/release-keys'
12-19 21:47:29.184 4108 4108 F DEBUG : Revision: '0'
12-19 21:47:29.184 4108 4108 F DEBUG : ABI: 'arm64'
12-19 21:47:29.194 4108 4108 F DEBUG : Timestamp: 2021-12-19 21:47:29+0100
12-19 21:47:29.195 4108 4108 F DEBUG : pid: 4023, tid: 4023, name: ibela.videlibri >>> de.benibela.videlibri <<<
12-19 21:47:29.195 4108 4108 F DEBUG : uid: 10202
12-19 21:47:29.195 4108 4108 F DEBUG : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x30
12-19 21:47:29.195 4108 4108 F DEBUG : Cause: null pointer dereference
12-19 21:47:29.195 4108 4108 F DEBUG : x0 0000000000000028 x1 d7b1d20d532eb138 x2 0000007cf57dced0 x3 0000007fd39677e0
12-19 21:47:29.195 4108 4108 F DEBUG : x4 0000007fd39684e0 x5 0000007cf6ddc096 x6 00000000ffffffff x7 0000000000000000
12-19 21:47:29.195 4108 4108 F DEBUG : x8 0000007de396f128 x9 0000000000000001 x10 0000000000000001 x11 0000007d5dd56448
12-19 21:47:29.195 4108 4108 F DEBUG : x12 0000007d5dd5649c x13 0000007d5dd564f0 x14 0000007d5dd56550 x15 0000000000000000
12-19 21:47:29.195 4108 4108 F DEBUG : x16 0000007cf5a20610 x17 0000007de2693b58 x18 0000007de40de000 x19 0000007fd3967ad8
12-19 21:47:29.195 4108 4108 F DEBUG : x20 0000007fd3967ae0 x21 0000000000000028 x22 0000007ddeeff210 x23 0000007d5560cc80
12-19 21:47:29.195 4108 4108 F DEBUG : x24 0000000000000004 x25 0000007de396f020 x26 0000007de37b4cb0 x27 0000000000000001
12-19 21:47:29.195 4108 4108 F DEBUG : x28 0000000000000000 x29 0000007fd3967ac0
12-19 21:47:29.195 4108 4108 F DEBUG : sp 0000007fd3967aa0 lr 0000007cf57d1f24 pc 0000007cf57c38dc
12-19 21:47:29.195 4108 4108 F DEBUG :
12-19 21:47:29.195 4108 4108 F DEBUG : backtrace:
12-19 21:47:29.195 4108 4108 F DEBUG : #00 pc 00000000000828dc /data/app/de.benibela.videlibri-E_LkJHh8RFSqM7IKlFmL7g==/lib/arm64/liblclapp.so (BuildId: 61c2af45dcb2e9ed6844810546a35357a5d8390e)
Possible fixes
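The immediate cause is get_caller_addr. The faulting pc 828dc from the backtrace is the second ldur below. x0 is 0x28 at that point, so the load reads from 0x28 + 8 = 0x30, which is the reported fault address: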
00000000000828d0 <SYSTEM_$$_GET_CALLER_ADDR$POINTER$POINTER$$POINTER>:
828d0: b4000080 cbz x0, 828e0 <SYSTEM_$$_GET_CALLER_ADDR$POINTER$POINTER$$POINTER+0x10>
828d4: f8400000 ldur x0, [x0]
828d8: b4000040 cbz x0, 828e0 <SYSTEM_$$_GET_CALLER_ADDR$POINTER$POINTER$$POINTER+0x10>
828dc: f8408000 ldur x0, [x0,#8]
828e0: d65f03c0 ret
828e4: 00000000 .inst 0x00000000 ; undefined
The function cannot be sure that the values it reads from the stack are valid addresses. It only checks for nil, so any non-zero garbage gets dereferenced. This is especially likely in a .so on Android, where the frame chain might run into the stack of the JVM.
If I remove one of the ldurs, it does not crash. The first ldur moves one frame up, and that looks wrong: does the function then return the caller's caller's address instead of the caller's address? That would also mean one call is missing from the stack trace.
I guess it is called via fpc_raiseexception -> PushExceptObject -> get_caller_stackinfo -> get_caller_addr.
All of this is very unsafe. PushExceptObject tries to check that it has a valid frame pointer, but get_caller_addr then reads from [x0,#8], which might be invalid even if x0 is valid. It should check that [x0,#8] is on the stack before reading from it. Perhaps it needs two more parameters with the start and end address of the stack.
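A minimal sketch of what such a bounds-checked variant could look like, written in plain Pascal instead of assembler. The names SafeGetCallerAddr and OnStack and the StackLo/StackHi parameters are hypothetical and not part of the RTL. The sketch keeps the same two loads as the disassembly above and only adds the range and alignment checks. How the caller obtains the real stack bounds is left open. The main block exercises the function on a fake stack instead of a real one.

```pascal
program SafeCallerAddrSketch;
{$mode objfpc}

{ True if the Size bytes starting at P lie inside [Lo, Hi)
  and P is aligned to the pointer size. }
function OnStack(P, Lo, Hi: Pointer; Size: PtrUInt): Boolean;
begin
  Result := (PtrUInt(P) mod SizeOf(Pointer) = 0) and
            (PtrUInt(P) >= PtrUInt(Lo)) and
            (PtrUInt(P) <= PtrUInt(Hi)) and
            (PtrUInt(Hi) - PtrUInt(P) >= Size);
end;

{ Hypothetical replacement for get_caller_addr: same two loads,
  but each address is validated against the stack bounds first. }
function SafeGetCallerAddr(FrameBP, StackLo, StackHi: Pointer): Pointer;
var
  Parent: Pointer;
begin
  Result := nil;
  { First load: follow the saved frame pointer stored at [FrameBP]. }
  if not OnStack(FrameBP, StackLo, StackHi, SizeOf(Pointer)) then
    Exit;
  Parent := PPointer(FrameBP)^;
  { Second load: the return address at [Parent + 8]. The frame record
    is two pointers wide, so both slots must be on the stack. }
  if not OnStack(Parent, StackLo, StackHi, 2 * SizeOf(Pointer)) then
    Exit;
  Result := PPointer(PByte(Parent) + SizeOf(Pointer))^;
end;

var
  Fake: array[0..7] of Pointer;  { stands in for a real stack }
  Lo, Hi: Pointer;
begin
  FillChar(Fake, SizeOf(Fake), 0);
  Lo := @Fake[0];
  Hi := PByte(Lo) + SizeOf(Fake);

  Fake[0] := @Fake[4];           { frame 0: saved fp points to frame 1 }
  Fake[5] := Pointer($1234);     { frame 1: saved return address }
  if SafeGetCallerAddr(Lo, Lo, Hi) <> Pointer($1234) then
    Halt(1);

  Fake[0] := Pointer($28);       { garbage like x0 in the crash log }
  if SafeGetCallerAddr(Lo, Lo, Hi) <> nil then
    Halt(2);

  WriteLn('ok');
end.
```

With the garbage value 0x28 from the log in the saved frame pointer slot, the sketch returns nil instead of reading from 0x30. The program exits with a non-zero code if either check fails.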