Crash while walking exception stack frames, especially on aarch64

Summary

When I raise an exception, my app crashes with a SIGSEGV inside get_caller_addr.

System Information

  • Operating system: Android
  • Processor architecture: AARCH64
  • Compiler version: 3.2.3 (2021/10/24)

Steps to reproduce

Raise an exception and "get lucky": the crash depends on what happens to be on the stack at that moment.

It only happens on some devices.

Example Project

  try
    raise Exception.Create('abc');
  except
    on e: Exception do ;
  end;

What is the current bug behavior?

The process is killed with SIGSEGV (SEGV_MAPERR, fault addr 0x30) when the exception is raised.

What is the expected (correct) behavior?

The exception should be raised and caught without crashing.

Relevant logs and/or screenshots

12-19 21:47:29.056  4023  4023 F libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x30 in tid 4023 (ibela.videlibri), pid 4023 (ibela.videlibri)
12-19 21:47:29.162  4108  4108 I crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstone
12-19 21:47:29.164   748   748 I /system/bin/tombstoned: received crash request for pid 4023
12-19 21:47:29.168  4108  4108 I crash_dump64: performing dump of process 4023 (target tid = 4023)
12-19 21:47:29.184  4108  4108 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
12-19 21:47:29.184  4108  4108 F DEBUG   : Build fingerprint: 'Nokia/Core2_00WW/CO2N_sprout:10/QP1A.190711.020/00WW_4_200:user/release-keys'
12-19 21:47:29.184  4108  4108 F DEBUG   : Revision: '0'
12-19 21:47:29.184  4108  4108 F DEBUG   : ABI: 'arm64'
12-19 21:47:29.194  4108  4108 F DEBUG   : Timestamp: 2021-12-19 21:47:29+0100
12-19 21:47:29.195  4108  4108 F DEBUG   : pid: 4023, tid: 4023, name: ibela.videlibri  >>> de.benibela.videlibri <<<
12-19 21:47:29.195  4108  4108 F DEBUG   : uid: 10202
12-19 21:47:29.195  4108  4108 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x30
12-19 21:47:29.195  4108  4108 F DEBUG   : Cause: null pointer dereference
12-19 21:47:29.195  4108  4108 F DEBUG   :     x0  0000000000000028  x1  d7b1d20d532eb138  x2  0000007cf57dced0  x3  0000007fd39677e0
12-19 21:47:29.195  4108  4108 F DEBUG   :     x4  0000007fd39684e0  x5  0000007cf6ddc096  x6  00000000ffffffff  x7  0000000000000000
12-19 21:47:29.195  4108  4108 F DEBUG   :     x8  0000007de396f128  x9  0000000000000001  x10 0000000000000001  x11 0000007d5dd56448
12-19 21:47:29.195  4108  4108 F DEBUG   :     x12 0000007d5dd5649c  x13 0000007d5dd564f0  x14 0000007d5dd56550  x15 0000000000000000
12-19 21:47:29.195  4108  4108 F DEBUG   :     x16 0000007cf5a20610  x17 0000007de2693b58  x18 0000007de40de000  x19 0000007fd3967ad8
12-19 21:47:29.195  4108  4108 F DEBUG   :     x20 0000007fd3967ae0  x21 0000000000000028  x22 0000007ddeeff210  x23 0000007d5560cc80
12-19 21:47:29.195  4108  4108 F DEBUG   :     x24 0000000000000004  x25 0000007de396f020  x26 0000007de37b4cb0  x27 0000000000000001
12-19 21:47:29.195  4108  4108 F DEBUG   :     x28 0000000000000000  x29 0000007fd3967ac0
12-19 21:47:29.195  4108  4108 F DEBUG   :     sp  0000007fd3967aa0  lr  0000007cf57d1f24  pc  0000007cf57c38dc
12-19 21:47:29.195  4108  4108 F DEBUG   :
12-19 21:47:29.195  4108  4108 F DEBUG   : backtrace:
12-19 21:47:29.195  4108  4108 F DEBUG   :       #00 pc 00000000000828dc  /data/app/de.benibela.videlibri-E_LkJHh8RFSqM7IKlFmL7g==/lib/arm64/liblclapp.so (BuildId: 61c2af45dcb2e9ed6844810546a35357a5d8390e)

Possible fixes

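The immediate cause is get_caller_addr. The faulting pc in the backtrace, 0x828dc, is the second ldur below. x0 holds 0x28 at that point, so the load from [x0,#8] hits the reported fault address 0x30.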

00000000000828d0 <SYSTEM_$$_GET_CALLER_ADDR$POINTER$POINTER$$POINTER>:
   828d0:       b4000080        cbz     x0, 828e0 <SYSTEM_$$_GET_CALLER_ADDR$POINTER$POINTER$$POINTER+0x10>
   828d4:       f8400000        ldur    x0, [x0]
   828d8:       b4000040        cbz     x0, 828e0 <SYSTEM_$$_GET_CALLER_ADDR$POINTER$POINTER$$POINTER+0x10>
   828dc:       f8408000        ldur    x0, [x0,#8]             
   828e0:       d65f03c0        ret
   828e4:       00000000        .inst   0x00000000 ; undefined

The function cannot be sure that the values it finds on the stack are valid frame pointers. Any non-zero garbage can be there, especially in a .so on Android, where the walk might run into stack frames that belong to the JVM.

If I remove one of the two ldur instructions, it does not crash. The first ldur moves one frame up, which looks wrong: does the function then return the address of the caller's caller rather than the caller's address? In that case one call would also be missing from the stack trace.

I guess it is called via fpc_raiseexception -> PushExceptObject -> get_caller_stackinfo -> get_caller_addr.

All of this is very unsafe. PushExceptObject tries to check that it has a valid frame pointer, but get_caller_addr then reads from [x0,#8], which can be an invalid address even if x0 itself is valid. It would be better to check that [x0,#8] lies on the stack before reading from it. Perhaps the function needs two more parameters with the start and end address of the stack.
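
To make the suggestion concrete, here is a minimal sketch of a bounds-checked variant. It is not the RTL implementation. CheckedCallerAddr, SlotOnStack and the two bound parameters are hypothetical names, and how the real stack bounds would be obtained (especially on a thread created by the Java runtime) is left open. It performs the same two loads as the disassembly above, but refuses to dereference anything outside the given range. The demo walks a fake stack held in a local array, so it can be run on any platform without touching real frames.

```pascal
program checkedcalleraddr;
{$mode objfpc}

{ Hypothetical helper, not part of the RTL: true if the pointer-sized slot
  at p is aligned and lies completely inside [stackLow, stackHigh). }
function SlotOnStack(p, stackLow, stackHigh: Pointer): Boolean;
begin
  Result := (PtrUInt(p) >= PtrUInt(stackLow))
        and (PtrUInt(p) < PtrUInt(stackHigh))
        and (PtrUInt(stackHigh) - PtrUInt(p) >= SizeOf(Pointer))
        and ((PtrUInt(p) and (SizeOf(Pointer) - 1)) = 0);
end;

{ Hypothetical checked variant of get_caller_addr. It does the same two
  loads as the disassembly, but validates every address first and
  returns nil instead of faulting. }
function CheckedCallerAddr(framebp, stackLow, stackHigh: Pointer): Pointer;
var
  parent, retSlot: Pointer;
begin
  Result := nil;
  if not SlotOnStack(framebp, stackLow, stackHigh) then
    Exit;
  parent := PPointer(framebp)^;            // first ldur: saved frame pointer
  if not SlotOnStack(parent, stackLow, stackHigh) then
    Exit;
  retSlot := Pointer(PtrUInt(parent) + SizeOf(Pointer));
  if not SlotOnStack(retSlot, stackLow, stackHigh) then
    Exit;
  Result := PPointer(retSlot)^;            // second ldur: [x0,#8]
end;

var
  fakeStack: array[0..7] of Pointer;       // stand-in for a real stack
  stackLo, stackHi: Pointer;
begin
  FillChar(fakeStack, SizeOf(fakeStack), 0);
  stackLo := @fakeStack[0];
  stackHi := Pointer(PtrUInt(stackLo) + SizeOf(fakeStack));

  // Well-formed chain: the record at index 0 points to the record at index 4.
  fakeStack[0] := @fakeStack[4];
  fakeStack[5] := Pointer($2222);          // return address in the parent record
  WriteLn('valid chain returns $2222: ',
    CheckedCallerAddr(@fakeStack[0], stackLo, stackHi) = Pointer($2222));

  // Saved frame pointer replaced by the garbage value from the crash log.
  fakeStack[0] := Pointer($28);
  WriteLn('garbage frame pointer returns nil: ',
    CheckedCallerAddr(@fakeStack[0], stackLo, stackHi) = nil);
end.
```

A range check like this only proves that the address is readable stack memory, not that it is a genuine frame record, so the returned address can still be meaningless. It would, however, turn the crash shown in the log into a truncated backtrace.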