I want to run the Android Emulator on AArch64. The OS is openEuler 22.03 LTS SP1 (CentOS-like). I tried many emulator builds from ci.android, but they all crashed with SIGSEGV.
I got the Android Emulator from this link: .zip
Everything is set up: cmdline-tools, platforms (android-33), platform-tools (adb), and system-images (33-arm64-v8a). My command line to start the emulator is: -avd arm_avd -no-window -no-audio -gpu swiftshader -feature -Vulkan -no-snapshot -delay-adb -cores 6 -verbose
But after a few seconds, I get a SIGSEGV. In GDB it looks like this:
Thread 28 "qemu-system-aar" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xffffea478a80 (LWP 15429)]
0x0000aaaaab1f32c8 in ?? ()
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34-105.oe2203sp1.aarch64 libX11-1.7.2-5.oe2203sp1.aarch64 libXau-1.0.10-1.oe2203sp1.aarch64 libgcc-10.3.1-20.aarch64 libxcb-1.15-1.oe2203sp1.aarch64 systemd-libs-249-43.oe2203sp1.aarch64
(gdb) bt
#0 0x0000aaaaab1f32c8 in ?? ()
#1 0x0000aaaaaaf62ff8 in ?? ()
#2 0x0000aaaaab1ae318 in ?? ()
#3 0x0000aaaaab12a624 in ?? ()
#4 0x0000aaaaab11f154 in ?? ()
#5 0x0000aaaaab12b67c in ?? ()
#6 0x0000aaaaab037224 in ?? ()
#7 0x0000aaaaab03c2e0 in ?? ()
#8 0x0000aaaaab03dc38 in ?? ()
#9 0x0000aaaaab01ee94 in ?? ()
#10 0x0000aaaaab3d8fd0 in ?? ()
#11 0x0000fffff6d20320 in ?? () from /usr/lib64/libc.so.6
#12 0x0000fffff6d8645c in ?? () from /usr/lib64/libc.so.6
The SIGSEGV comes from qemu-system-aarch64-headless, and here are the mappings:
(gdb) info proc mappings
process 15376
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0xaaaaaaaa0000 0xaaaaacb11000 0x2071000 0x0 /.../android/emulator/qemu/linux-aarch64/qemu-system-aarch64-headless
0xaaaaacb20000 0xaaaaacc90000 0x170000 0x2070000 /.../android/emulator/qemu/linux-aarch64/qemu-system-aarch64-headless
0xaaaaacc90000 0xaaaaacd70000 0xe0000 0x21e0000 /.../android/emulator/qemu/linux-aarch64/qemu-system-aarch64-headless
0xaaaaacd70000 0xaaaad7454000 0x2a6e4000 0x0 [heap]
0xffff47800000 0xffff47801000 0x1000 0x0
And the direct cause of SIGSEGV is quite clear:
(gdb) i reg
x0 0x5be16d520bce 101023759862734
x1 0xaaaabc5dbe40 187650281422400
x2 0xaaaaac42ad48 187650011213128
......
x23 0x28 40
x24 0xffffea479240 281474612302400 # kept for a reason, see below
......
(gdb) disassemble 0x0000aaaaab1f32c0,0x0000aaaaab1f32cf
Dump of assembler code from 0xaaaaab1f32c0 to 0xaaaaab1f32cf:
0x0000aaaaab1f32c0: b 0xaaaaaad8c4b0 <g_strdup@plt>
0x0000aaaaab1f32c4: nop
=> 0x0000aaaaab1f32c8: ldr x0, [x0] # accesses an invalid address
0x0000aaaaab1f32cc: ret
End of assembler dump.
I didn't compile the emulator myself, and there is no debug build on ci.android, so there are no symbols. I used a disassembler to get more information.
After decompiling the code at 0x0000aaaaab1f32c8 and 0x0000aaaaaaf62ff8, I found the exact source location: external/qemu/hw/virtio/virtio.c:virtio_reset. Here are the instructions and the source code:
aaaaaaf62fa8 58 d0 3b d5 mrs x24,tpidr_el0
aaaaaaf62fac 17 e9 00 90 adrp x23,PTR_FUN_aaaaacc82000
aaaaaaf62fb0 f7 52 40 f9 ldr x23,[x23, #0xa0]=>PTR_aaaaacc820a0 # which is a direct value (0x28)
aaaaaaf62fb4 f9 6b 04 a9 stp x25,x26,[sp, #local_10]
aaaaaaf62fb8 c4 40 0a 94 bl FUN_aaaaab1f32c8
aaaaaaf62fbc 7a c2 10 91 add x26,x19,#0x430
aaaaaaf62fc0 b9 62 3a 91 add x25,x21,#0xe98
aaaaaaf62fc4 44 a3 05 91 add x4=>s_virtio_reset_aaaaac44e598,x26,#0x168 = "virtio_reset"
aaaaaaf62fc8 e2 03 19 aa mov x2=>s_/buildbot/src/android/emu-master_aaaaac4 = "/buildbot/src/android/emu-master-dev/external/qemu/hw/virtio/virtio.c
aaaaaaf62fcc 03 96 80 52 mov w3,#0x4b0
aaaaaaf62fd0 41 a6 00 90 adrp x1,DAT_aaaaac42a000
aaaaaaf62fd4 21 80 25 91 add x1=>s_virtio-device_aaaaac42a960,x1,#0x960 = "virtio-device"
aaaaaaf62fd8 76 40 0a 94 bl FUN_aaaaab1f31b0
aaaaaaf62fdc f6 03 00 aa mov x22,param_1
aaaaaaf62fe0 01 00 80 52 mov w1,#0x0
aaaaaaf62fe4 e0 03 14 aa mov param_1,x20
aaaaaaf62fe8 6e ff ff 97 bl FUN_aaaaaaf62da0
aaaaaaf62fec 00 6b 77 f8 ldr param_1,[x24, x23, LSL #0x0]
aaaaaaf62ff0 20 10 00 b4 cbz param_1,LAB_aaaaaaf631f4
aaaaaaf62ff4 b5 40 0a 94 bl FUN_aaaaab1f32c8 # SIGSEGV here
===================================================================================
void virtio_reset(void *opaque)
{
VirtIODevice *vdev = opaque;
VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
int i;
virtio_set_status(vdev, 0);
if (current_cpu) {
/* Guest initiated reset */
vdev->device_endian = virtio_current_cpu_endian(); // SIGSEGV here
} else {
/* System reset */
vdev->device_endian = virtio_default_endian();
}
current_cpu is a thread_local variable, declared in cpu.h and ultimately located in qemu-system-aarch64-headless. As the instructions show, current_cpu is stored at tpidr_el0+0x28, and tpidr_el0 is the value held in register x24 in gdb.
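To make the addressing concrete, here is a minimal sketch (not emulator code) of how the offset of a thread-local variable from the thread pointer can be measured. The names demo_cpu, demo_seed, thread_pointer, tls_offset, tls_ranges_overlap, and check_tls_layout are made up for illustration. The inline assembly assumes GCC or Clang; the x86-64 Linux branch is there only so the sketch also builds off AArch64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for current_cpu and ul_jrand_seed. */
static __thread void *demo_cpu;
static __thread unsigned short demo_seed[3];

/* Read the thread pointer (tpidr_el0 on AArch64). */
static uintptr_t thread_pointer(void)
{
    uintptr_t tp;
#if defined(__aarch64__)
    __asm__ volatile("mrs %0, tpidr_el0" : "=r"(tp));
#elif defined(__x86_64__) && defined(__linux__)
    __asm__ volatile("movq %%fs:0, %0" : "=r"(tp));
#else
    tp = (uintptr_t)__builtin_thread_pointer();
#endif
    return tp;
}

/* Signed distance of a thread-local object from the thread pointer. */
static ptrdiff_t tls_offset(const void *addr)
{
    return (ptrdiff_t)((uintptr_t)addr - thread_pointer());
}

/* Returns 1 if the two thread-local objects share any byte, else 0. */
static int tls_ranges_overlap(void)
{
    ptrdiff_t a = tls_offset(&demo_cpu);
    ptrdiff_t b = tls_offset(demo_seed);
    ptrdiff_t a_end = a + (ptrdiff_t)sizeof demo_cpu;
    ptrdiff_t b_end = b + (ptrdiff_t)sizeof demo_seed;
    return a < b_end && b < a_end;
}

/* In a correctly linked program this assertion is expected to hold. */
static void check_tls_layout(void)
{
    assert(!tls_ranges_overlap());
}
```

Within one correctly linked program, the two variables get distinct offsets and check_tls_layout() passes. In my case the executable and the shared library each apply the same fixed offset (0x28) to tpidr_el0, which is the situation this check is meant to rule out.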
So I restarted the program and set a watchpoint on the memory at 0xffffea479240+0x28, and I found the culprit:
Thread 28 "qemu-system-aar" hit Hardware watchpoint 3: *(long *)0xffffea479268
Old value = 0
New value = 35312
0x0000fffff7ce842c in ?? () from /opt/android/emulator/lib64/libandroid-emu-metrics.so
(gdb) c
Continuing.
WARNING | Failed to setup emulator in a timely fashion!
[Thread 0xffffe8c48a80 (LWP 15933) exited]
[Thread 0xffff5adfea80 (LWP 15944) exited]
Thread 28 "qemu-system-aar" hit Hardware watchpoint 3: *(long *)0xffffea479268
Old value = 35312
New value = 1788054000
0x0000fffff7ce8440 in ?? () from /opt/android/emulator/lib64/libandroid-emu-metrics.so
(gdb) c
Continuing.
Thread 28 "qemu-system-aar" hit Hardware watchpoint 3: *(long *)0xffffea479268
Old value = 1788054000
New value = 113977335187952
0x0000fffff7ce8448 in ?? () from /opt/android/emulator/lib64/libandroid-emu-metrics.so
After decompiling the code at 0x0000fffff7ce842c, 0x0000fffff7ce8440, and 0x0000fffff7ce8448, I found this (libuuid randutils.c:random_get_bytes):
LAB_fffff7ce83f0 XREF[1]: fffff7ce83c4(j)
fffff7ce83f0 e4 82 fc 97 bl <EXTERNAL>::getpid __pid_t getpid(void)
fffff7ce83f4 f3 03 00 2a mov w19,w0
fffff7ce83f8 f6 96 fc 97 bl <EXTERNAL>::getuid __uid_t getuid(void)
fffff7ce83fc a1 0b 43 a9 ldp x1,x2,[x29, #local_10]
fffff7ce8400 00 40 13 4a eor w0,w0,w19, LSL #0x10
fffff7ce8404 53 d0 3b d5 mrs x19,tpidr_el0
fffff7ce8408 73 02 40 91 add x19,x19,#0x0, LSL #12
fffff7ce840c 73 a2 00 91 add x19,x19,#0x28 # is 0x28 hardcoded at build time?
fffff7ce8410 21 00 02 4a eor w1,w1,w2
fffff7ce8414 20 00 00 4a eor w0,w1,w0
fffff7ce8418 d2 6d fc 97 bl <EXTERNAL>::srandom void srandom(uint __seed)
fffff7ce841c d9 82 fc 97 bl <EXTERNAL>::getpid __pid_t getpid(void)
fffff7ce8420 a1 1b 40 f9 ldr x1,[x29, #local_10]
fffff7ce8424 00 00 01 4a eor w0,w0,w1
fffff7ce8428 60 02 00 79 strh w0,[x19] # write here
fffff7ce842c a5 70 fc 97 bl <EXTERNAL>::getppid __pid_t getppid(void)
fffff7ce8430 a1 0b 43 a9 ldp x1,x2,[x29, #local_10]
fffff7ce8434 41 00 01 ca eor x1,x2,x1
fffff7ce8438 00 00 02 4a eor w0,w0,w2
fffff7ce843c 60 06 00 79 strh w0,[x19, #0x2] # write here
fffff7ce8440 21 fc 50 93 asr x1,x1,#0x10
fffff7ce8444 61 0a 00 79 strh w1,[x19, #0x4] # write here
fffff7ce8448 c5 ff ff 17 b LAB_fffff7ce835c
=========================================================================
.......
#ifdef HAVE_TLS /* true in this build */
#define THREAD_LOCAL static __thread
#else
#define THREAD_LOCAL static
#endif
#if defined(__linux__) && defined(__NR_gettid) && defined(HAVE_JRAND48)
#define DO_JRAND_MIX
THREAD_LOCAL unsigned short ul_jrand_seed[3];
#endif
........
void random_get_bytes(void *buf, size_t nbytes)
{
......
#ifdef DO_JRAND_MIX
{
unsigned short tmp_seed[3];
memcpy(tmp_seed, ul_jrand_seed, sizeof(tmp_seed));
ul_jrand_seed[2] = ul_jrand_seed[2] ^ syscall(__NR_gettid);
for (cp = buf, i = 0; i < nbytes; i++)
*cp++ ^= (jrand48(tmp_seed) >> 7) & 0xFF;
memcpy(ul_jrand_seed, tmp_seed,
sizeof(ul_jrand_seed)-sizeof(unsigned short));
}
#endif
return;
}
.......
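The three values reported by the watchpoint are consistent with three consecutive 16-bit stores into an 8-byte slot that started as zero, which is what the strh instructions in the listing do. Below is a small sketch of that arithmetic, assuming a little-endian machine. The seed halfwords 0x89F0, 0x6A93, and 0x67A9 are not taken from any source file; they are read back from the watchpoint values. The names observed_seed and slot_after_seed_stores are made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Halfwords derived from the watchpoint log (an inference, not source data). */
static const uint16_t observed_seed[3] = { 0x89F0, 0x6A93, 0x67A9 };

/*
 * Store the first `count` seed halfwords over an 8-byte slot that starts
 * as zero (a NULL pointer), then return the slot read as a 64-bit value.
 * The result matches the gdb output only on a little-endian machine.
 */
static uint64_t slot_after_seed_stores(const uint16_t seed[3], int count)
{
    unsigned char slot[8] = { 0 };
    uint64_t value;

    assert(count >= 0 && count <= 3);
    for (int i = 0; i < count; i++)
        memcpy(slot + 2 * i, &seed[i], sizeof seed[i]);

    memcpy(&value, slot, sizeof value);
    return value;
}
```

With observed_seed, counts 1, 2, and 3 give 35312, 1788054000, and 113977335187952, matching the "New value" lines in the log. The slot that virtio_reset reads as current_cpu is therefore no longer zero, so the `if (current_cpu)` branch is taken with a garbage pointer.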
And finally, I am stuck here. I don't understand why two different thread_local variables use the exact same memory address. Can somebody help me?
- Is the bug in the emulator itself? If so, why does ci.android show it passing the tests?
- Is something wrong with my OS? I reinstalled the OS several times, which did not help.
Update:
I found a workaround by modifying the instructions in libandroid-emu-metrics.so. Since ul_jrand_seed[3] uses only six bytes, I changed add x19,x19,#0x28 to add x19,x19,#0x20, and it works perfectly. But the question remains: why do two different thread_local variables use the exact same memory address?
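For anyone who wants to reproduce the patch, here is a minimal sketch of the instruction bytes involved, assuming the standard A64 "ADD (immediate)" encoding for 64-bit registers with no shift. The helper names are made up. The original bytes 73 a2 00 91 appear in the listing above; the patched bytes are computed from the encoding rather than copied from a dump.

```c
#include <assert.h>
#include <stdint.h>

/* Encode A64 "ADD Xd, Xn, #imm12" (64-bit, shift field = 0). */
static uint32_t encode_add_x_imm(unsigned rd, unsigned rn, unsigned imm12)
{
    assert(rd < 32 && rn < 32 && imm12 < 4096);
    return 0x91000000u | ((uint32_t)imm12 << 10) | ((uint32_t)rn << 5) | rd;
}

/* Split an instruction word into the little-endian byte order seen in the file. */
static void insn_to_le_bytes(uint32_t insn, unsigned char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)(insn >> (8 * i));
}
```

encode_add_x_imm(19, 19, 0x28) yields 0x9100a273, which is 73 a2 00 91 in file order, and encode_add_x_imm(19, 19, 0x20) yields 0x91008273, which is 73 82 00 91. Only one byte changes, from a2 to 82.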