* tst-arm-mte bug: PSTATE.TCO is cleared on exceptions
@ 2020-04-20 10:29 Szabolcs Nagy
  2020-04-22  4:39 ` Richard Henderson
From: Szabolcs Nagy @ 2020-04-20 10:29 UTC
To: Richard Henderson; +Cc: nd, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 605 bytes --]

i'm using the branch at

https://github.com/rth7680/qemu/tree/tgt-arm-mte

to test armv8.5-a mte and hope this is ok to report bugs here.

i'm doing tests in qemu-system-aarch64 with linux userspace
code and it seems TCO bit gets cleared after syscalls or other
kernel entry, but PSTATE is expected to be restored, so i
suspect it is a qemu bug.

i think the architecture saves/restores PSTATE using SPSR_ELx
on exceptions.

i used the linux branch
https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=devel/mte-v2

attached a reproducer that segfaults in qemu but should work.

thanks.

[-- Attachment #2: bug.c --]
[-- Type: text/x-csrc, Size: 1216 bytes --]

// CFLAGS = -march=armv8.5-a+memtag
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/prctl.h>

#define TAG_SHIFT 56

#ifndef PROT_MTE
#define PROT_MTE 0x20
#endif

#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#define PR_GET_TAGGED_ADDR_CTRL 56
#define PR_TAGGED_ADDR_ENABLE 1UL
#endif

#ifndef PR_MTE_TCF_SYNC
#define PR_MTE_TCF_SYNC 2UL
#define PR_MTE_TAG_SHIFT 3
#endif

int main()
{
	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE|PR_MTE_TCF_SYNC|(0xffff << PR_MTE_TAG_SHIFT), 0, 0, 0))
		abort();
	unsigned long *a = mmap(0, 1<<12, PROT_READ|PROT_WRITE|PROT_MTE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
	if (a == MAP_FAILED)
		abort();
	// tag ptr a
	a = (void*)((unsigned long)a|(1UL<<TAG_SHIFT));
	// tag memory a[0], a[1]
	asm volatile ("stg %1, %0" : "=Q"(*a) : "r"(a));
	// turn tag checks off
	asm volatile ("msr tco, 1");
	a[0]=1; // ok
	a[1]=2; // ok
	a[2]=3; // tag mismatch but tco==1 so ok
	write(1, "foo\n", 4);
	// PSTATE.TCO (bit 25) should be still set after the syscall
	unsigned long x;
	asm volatile ("mrs %0, tco" : "=r"(x));
	printf("tco = 0x%lx\n", x);
	a[3]=4; // tag mismatch, segfaults if tco==0
	return 0;
}
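
[Editorial sketch, not part of the original message.] The "ok" and "tag mismatch" comments in the reproducer can be checked with a small host-side model. This is only a simplified sketch: it assumes the MTE rules that one allocation tag covers a 16-byte granule and that the pointer's logical tag sits in bits 59:56, and it ignores everything else (fault modes, untagged memory, the kernel's tag mask). All `sketch_*` names are hypothetical and come from neither QEMU nor Linux.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_SHIFT 56                              /* same shift as bug.c */
#define TAG_MASK  (UINT64_C(0xF) << TAG_SHIFT)    /* logical tag, bits 59:56 */
#define GRANULE   16                              /* bytes covered by one allocation tag */

/* Put a 4-bit logical tag into a pointer value (hypothetical helper). */
static uint64_t sketch_set_tag(uint64_t addr, unsigned tag)
{
    return (addr & ~TAG_MASK) | ((uint64_t)(tag & 0xF) << TAG_SHIFT);
}

/* Read the 4-bit logical tag back out of a pointer value. */
static unsigned sketch_get_tag(uint64_t addr)
{
    return (unsigned)((addr >> TAG_SHIFT) & 0xF);
}

/* Which tag granule a byte offset within the mapping falls into. */
static size_t sketch_granule_of(size_t byte_offset)
{
    return byte_offset / GRANULE;
}

/* Simplified check: an access passes if checks are overridden (TCO set)
 * or if the pointer tag equals the memory tag of the granule. */
static bool sketch_access_ok(unsigned ptr_tag, unsigned mem_tag, bool tco)
{
    return tco || ptr_tag == mem_tag;
}
```

With 8-byte `unsigned long` elements, a[0] and a[1] share granule 0 (the one tagged by `stg`), while a[2] and a[3] fall into granule 1, which still has tag 0. Under this model the pointer tag 1 mismatches there, so those stores only succeed while TCO is set.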
* Re: tst-arm-mte bug: PSTATE.TCO is cleared on exceptions
  2020-04-20 10:29 tst-arm-mte bug: PSTATE.TCO is cleared on exceptions Szabolcs Nagy
@ 2020-04-22  4:39 ` Richard Henderson
  2020-04-24 19:47   ` Richard Henderson
From: Richard Henderson @ 2020-04-22 4:39 UTC
To: Szabolcs Nagy; +Cc: nd, qemu-devel

On 4/20/20 3:29 AM, Szabolcs Nagy wrote:
> i'm using the branch at
>
> https://github.com/rth7680/qemu/tree/tgt-arm-mte
>
> to test armv8.5-a mte and hope this is ok to report bugs here.
>
> i'm doing tests in qemu-system-aarch64 with linux userspace
> code and it seems TCO bit gets cleared after syscalls or other
> kernel entry, but PSTATE is expected to be restored, so i
> suspect it is a qemu bug.
>
> i think the architecture saves/restores PSTATE using SPSR_ELx
> on exceptions.

Yep.  I failed to update aarch64_pstate_valid_mask for TCO.
Will fix.  Thanks,

r~

>
> i used the linux branch
> https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=devel/mte-v2
>
> attached a reproducer that segfaults in qemu but should work.
>
> thanks.
>
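
[Editorial sketch, not part of the original message.] The failure mode described in this reply can be illustrated with a toy model: if the saved SPSR_ELx value is filtered through a mask of valid bits on exception return, any bit missing from that mask is silently dropped. The code below is not QEMU's aarch64_pstate_valid_mask. The function names, the `have_mte` parameter, and the reduced set of bits are assumptions made for illustration; only the bit positions (TCO at bit 25, NZCV at 31:28, DAIF at 9:6) are architectural.

```c
#include <stdint.h>

#define PSTATE_TCO  (UINT32_C(1) << 25)    /* tag check override */
#define PSTATE_NZCV (UINT32_C(0xF) << 28)  /* condition flags */
#define PSTATE_DAIF (UINT32_C(0xF) << 6)   /* interrupt mask bits */

/* Hypothetical mask of SPSR bits that survive an exception return.
 * TCO is only accepted when have_mte is nonzero. */
static uint32_t sketch_valid_mask(int have_mte)
{
    uint32_t valid = PSTATE_NZCV | PSTATE_DAIF;
    if (have_mte) {
        valid |= PSTATE_TCO;
    }
    return valid;
}

/* Hypothetical exception return: restore PSTATE from the saved SPSR,
 * keeping only the bits the mask allows. */
static uint32_t sketch_exception_return(uint32_t spsr, uint32_t valid_mask)
{
    return spsr & valid_mask;
}
```

Calling `sketch_exception_return(PSTATE_TCO | PSTATE_NZCV, sketch_valid_mask(0))` yields a value with TCO cleared, which matches the symptom reported after the write() syscall. With `sketch_valid_mask(1)` the bit is preserved.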
* Re: tst-arm-mte bug: PSTATE.TCO is cleared on exceptions
  2020-04-22  4:39 ` Richard Henderson
@ 2020-04-24 19:47   ` Richard Henderson
  2020-05-06 12:57     ` Szabolcs Nagy
From: Richard Henderson @ 2020-04-24 19:47 UTC
To: Szabolcs Nagy; +Cc: nd, qemu-devel

On 4/21/20 9:39 PM, Richard Henderson wrote:
> On 4/20/20 3:29 AM, Szabolcs Nagy wrote:
>> i'm using the branch at
>>
>> https://github.com/rth7680/qemu/tree/tgt-arm-mte
>>
>> to test armv8.5-a mte and hope this is ok to report bugs here.
>>
>> i'm doing tests in qemu-system-aarch64 with linux userspace
>> code and it seems TCO bit gets cleared after syscalls or other
>> kernel entry, but PSTATE is expected to be restored, so i
>> suspect it is a qemu bug.
>>
>> i think the architecture saves/restores PSTATE using SPSR_ELx
>> on exceptions.
>
> Yep.  I failed to update aarch64_pstate_valid_mask for TCO.
> Will fix.  Thanks,

Fixed on the branch.

I still need to work out how best to plumb the arm,armv8.5-memtag property, so
the devel/mte-v3 kernel branch isn't usable as-is for the moment.  For myself,
I've just commented that test out for now.

r~
* Re: tst-arm-mte bug: PSTATE.TCO is cleared on exceptions
  2020-04-24 19:47   ` Richard Henderson
@ 2020-05-06 12:57     ` Szabolcs Nagy
  2020-05-07  9:59       ` Szabolcs Nagy
From: Szabolcs Nagy @ 2020-05-06 12:57 UTC
To: Richard Henderson; +Cc: nd, qemu-devel

The 04/24/2020 12:47, Richard Henderson wrote:
> On 4/21/20 9:39 PM, Richard Henderson wrote:
> > Yep.  I failed to update aarch64_pstate_valid_mask for TCO.
> > Will fix.  Thanks,
>
> Fixed on the branch.
>
> I still need to work out how best to plumb the arm,armv8.5-memtag property so
> the devel/mte-v3 kernel branch isn't usable as-is for the moment.  For myself,
> I've just commented that test out for now.

The fix worked well thanks
(in linux devel/mte-v3 i reverted the patch that
introduced arm,armv8.5-memtag)

However later on during testing malloc with PROT_MTE
i got a qemu assert failure:

Bail out! ERROR:/S/target/arm/mte_helper.c:97:allocation_tag_mem: assertion failed: (tag_size <= in_page)

i can reproduce it, but i don't know how to debug it
further, i don't know what the application is doing
when this happens, nor what the kernel is doing.

i rebuilt qemu with --enable-debug but now it's very
slow (still booting into linux 3h later).

let me know if there are ways to narrow this down.
* Re: tst-arm-mte bug: PSTATE.TCO is cleared on exceptions
  2020-05-06 12:57     ` Szabolcs Nagy
@ 2020-05-07  9:59       ` Szabolcs Nagy
  2020-05-07 17:21         ` Richard Henderson
From: Szabolcs Nagy @ 2020-05-07 9:59 UTC
To: Richard Henderson; +Cc: nd, qemu-devel

The 05/06/2020 13:57, Szabolcs Nagy wrote:
> However later on during testing malloc with PROT_MTE
> i got a qemu assert failure:
>
> Bail out! ERROR:/S/target/arm/mte_helper.c:97:allocation_tag_mem: assertion failed: (tag_size <= in_page)
>
> i can reproduce it, but i don't know how to debug it
> further, i don't know what the application is doing
> when this happens, nor what the kernel is doing.

actually i know what the application is doing,
it's in an mmap when qemu aborts:

...
23:15:17.379227 munmap(0x100ffff9675a000, 8192) = 0
23:15:17.428456 mmap(NULL, 8192, PROT_READ|PROT_WRITE|0x20, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff9675a000
23:15:17.502543 mmap(NULL, 36864, PROT_READ|PROT_WRITE|0x20, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff96707000
23:15:17.572469 munmap(0x100ffff96707000, 36864) = 0
23:15:17.645050 munmap(0x100ffff9675a000, 8192) = 0
23:15:17.721526 mmap(NULL, 8192, PROT_READ|PROT_WRITE|0x20, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff9675a000
23:15:17.779768 mmap(NULL, 36864, PROT_READ|PROT_WRITE|0x20, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff96707000
23:15:17.840278 newfstatat(3, "usr/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
23:15:18.164292 unlinkat(3, "usr/lib/.apk.1e1bebb420b720c23f29fc2cacd5581b598339805fd12c00", 0) = 0
23:15:18.357742 symlinkat("libXau.so.6.0.0", 3, "usr/lib/.apk.1e1bebb420b720c23f29fc2cacd5581b598339805fd12c00") = 0
23:15:18.469921 fchownat(3, "usr/lib/.apk.1e1bebb420b720c23f29fc2cacd5581b598339805fd12c00", 0, 0, AT_SYMLINK_NOFOLLOW) = 0
23:15:18.638698 unlinkat(3, "usr/lib/.apk.93d31976aebb056b6e2d9577dc8a2f112e28756d03f736a4", 0) = 0
23:15:18.760374 openat(3, "usr/lib/.apk.93d31976aebb056b6e2d9577dc8a2f112e28756d03f736a4", O_RDWR|O_CREAT|O_EXCL|O_TRUNC|O_LARGEFILE|O_CLOEXEC, 0755) = 8
23:15:18.916049 write(8, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\300\r\0\0\0\0\0\0@\0\0\0\0\0\0\0\3700\0\0\0\0\0\0\0\0\0\0@\08\0\6\0@\0\26\0\25\0\1\0\0\0\5\0"..., 13944) = 13944
23:15:18.961239 close(8) = 0
23:15:20.137627 fchownat(3, "usr/lib/.apk.93d31976aebb056b6e2d9577dc8a2f112e28756d03f736a4", 0, 0, 0) = 0
23:15:20.289924 utimensat(3, "usr/lib/.apk.93d31976aebb056b6e2d9577dc8a2f112e28756d03f736a4", [{tv_sec=1579395233, tv_nsec=0} /* 2020-01-19T00:53:53+0000 */, {tv_sec=1579395233, tv_nsec=0} /* 2020-01-19T00:53:53+0000 */], 0) = 0
23:15:20.467212 munmap(0x100ffff96707000, 36864) = 0
23:15:20.503631 munmap(0x100ffff9675a000, 8192) = 0
23:15:20.550130 mmap(NULL, 8192, PROT_READ|PROT_WRITE|0x20, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0Connection to localhost closed by remote host.

(this allocator does a lot of small mmap and munmap)
but i can't tell what happens on the kernel side.

is there some recommended way to turn some form
of tracing on in qemu before i execute the
problematic application?

or is it better if i try to extract a reproducer?
(that does not use the network)

>
> i rebuilt qemu with --enable-debug but now it's very
> slow (still booting into linux 3h later).

this is too slow, things time out.
* Re: tst-arm-mte bug: PSTATE.TCO is cleared on exceptions
  2020-05-07  9:59       ` Szabolcs Nagy
@ 2020-05-07 17:21         ` Richard Henderson
  2020-05-18 12:59           ` Szabolcs Nagy
From: Richard Henderson @ 2020-05-07 17:21 UTC
To: Szabolcs Nagy; +Cc: nd, qemu-devel

On 5/7/20 2:59 AM, Szabolcs Nagy wrote:
> is there some recommended way to turn some form
> of tracing on in qemu before i execute the
> problematic application?

I didn't add any tracing within mte.  I can do so if we can guess what we're
looking for.

> or is it better if i try to extract a reproducer?
> (that does not use the network)

A reproducer would be most helpful.

Something that can help is saving a VM snapshot with the kernel booted and the
user logged in, just ready to run the test program.  Then you can get back to
exactly the state you want before things go wrong, even with a different qemu
build.

r~
* Re: tst-arm-mte bug: PSTATE.TCO is cleared on exceptions
  2020-05-07 17:21         ` Richard Henderson
@ 2020-05-18 12:59           ` Szabolcs Nagy
  2020-05-19 18:46             ` Richard Henderson
From: Szabolcs Nagy @ 2020-05-18 12:59 UTC
To: Richard Henderson; +Cc: nd, qemu-devel

The 05/07/2020 10:21, Richard Henderson wrote:
> A reproducer would be most helpful.
>
> Something that can help is saving a VM snapshot with the kernel booted and the
> user logged in, just ready to run the test program.  Then you can get back to
> exactly the state you want before things go wrong, even with a different qemu
> build.

i got some time to create a reproducer (with public code),
temporarily hosting the binaries at

http://port70.net/~nsz/tmp/qemu-bug.tar.gz

~251M

here

echo ./bug.sh | ./qemu-bug.sh

crashes in about 1 minute (where qemu-bug.sh loads a snapshot
with root shell and ./bug.sh triggers the bug)

the disk rootfs is based on
https://distfiles.adelielinux.org/adelie/1.0/iso/rc1/adelie-rootfs-aarch64-1.0-rc1-20200206.txz

the kernel Image is linux mte-v3 with reverting the commit
"arm64: mte: Check the DT memory nodes for MTE support"

qemu is static linked from the branch tgt-arm-mte.

the userspace workload that triggers the bug is using
the adelie linux package manager with a malloc with tagging.
(the malloc implementation is a modified version of
https://github.com/richfelker/mallocng-draft
the code is on the disk image, it has known issues,
but it should not crash qemu)

i will remove the file after a few days.
hope this helps.
* Re: tst-arm-mte bug: PSTATE.TCO is cleared on exceptions
  2020-05-18 12:59           ` Szabolcs Nagy
@ 2020-05-19 18:46             ` Richard Henderson
From: Richard Henderson @ 2020-05-19 18:46 UTC
To: Szabolcs Nagy; +Cc: nd, qemu-devel

On 5/18/20 5:59 AM, Szabolcs Nagy wrote:
> i got some time to create a reproducer (with public code),

Thanks.  I've grabbed it.  I'll try it out soon.

r~