* [LTP] Question about kernel/syscall/signal/signal06.c
From: Hongzhi, Song @ 2019-07-17 9:21 UTC
To: ltp
Hi Wang,
Sorry to bother you.
I find that signal06 fails on qemux86-64 when qemu has a small number of cores,
e.g. "qemu -smp 1/2/4/6".
ERROR INFO:
signal06    0  TINFO  :  loop = 23
signal06    1  TFAIL  :  signal06.c:87: Bug Reproduced!
But if I boot qemu with "-smp 16", the case has a good chance of passing.
I have two questions about this case:
1. Why does the number of cores affect the case?
2. In the failure case, what breaks the "while loop" shown in the code
below?
	while (D == VALUE && loop < LOOPS) {
		/* sys_tkill(pid, SIGHUP); asm to avoid save/reload
		 * fp regs around c call */
		asm ("" : : "a"(__NR_tkill), "D"(pid), "S"(SIGHUP));
		asm ("syscall" : : : "ax");
		loop++;
	}
	...
	if (loop == LOOPS) {
		tst_resm(TPASS, "%s call succeeded", TCID);
	} else {
>>>		tst_resm(TFAIL, "Bug Reproduced!");
		tst_exit();
	}
Thanks.
--Hongzhi
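For readers unfamiliar with the inline asm in the loop above: it issues the raw tkill syscall so that the compiler does not save and reload floating-point registers around a C call. The following is only a minimal sketch of the same signal-to-self pattern using the portable `syscall()` wrapper instead of asm. The helper name `send_hups_to_self` and the empty counting handler are assumptions made for illustration; they are not taken from signal06.c, and this sketch does not try to reproduce the race.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Counts handler invocations; sig_atomic_t is safe to modify in a handler. */
static volatile sig_atomic_t hup_count;

static void on_sighup(int sig)
{
	(void)sig;
	hup_count++;
}

/*
 * Hypothetical helper (not part of LTP): send SIGHUP to the calling
 * thread n times through the raw tkill syscall and return how many
 * times the handler ran, or -1 on error.
 */
static int send_hups_to_self(int n)
{
	struct sigaction sa;
	sigset_t set;
	long tid = syscall(SYS_gettid);
	int i;

	/* Make sure SIGHUP is not blocked in this thread. */
	sigemptyset(&set);
	sigaddset(&set, SIGHUP);
	if (sigprocmask(SIG_UNBLOCK, &set, NULL) == -1)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sighup;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGHUP, &sa, NULL) == -1)
		return -1;

	hup_count = 0;
	for (i = 0; i < n; i++) {
		/* Same syscall as the asm in the quoted loop: tkill(tid, SIGHUP). */
		if (syscall(SYS_tkill, tid, SIGHUP) == -1)
			return -1;
	}
	return (int)hup_count;
}
```

Because the signal is directed at the calling thread and is unblocked, the handler runs before each `syscall()` returns, so the count should match the number of signals sent. The quoted test passes `pid` as the thread id, which works for the main thread because its tid equals the process id.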
* [LTP] Question about kernel/syscall/signal/signal06.c
From: Cyril Hrubis @ 2019-07-17 9:46 UTC
To: ltp

Hi!
> I find signal06 fails on qemux86-64 when qemu has a small number cores,
> e.g. "qemu -smp 1/2/4/6".
>
> ERROR INFO:
>
> signal06    0  TINFO  :  loop = 23
> signal06    1  TFAIL  :  signal06.c:87: Bug Reproduced!
>
> But if boot qemu with "-smp 16", the case has great chance to pass.
>
>
> I have two questions about this case:
>
> 1. I don't know why multi-core will affect the case.

Have you looked into the code? The test is trying to reproduce a race
condition between two threads, so of course the number of cores affects
the reproducibility.

> 2. On failure situation, what does break the "while loop" shown in below
> code.

A bug in a kernel that fails to restore the FPU state.

--
Cyril Hrubis
chrubis@suse.cz
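To make the answer to question 2 concrete: the quoted loop has only two exit conditions, so a log showing `loop = 23` followed by TFAIL means the value read back from `D` no longer matched `VALUE`. The sketch below models just that decision. The constants `LOOPS` and `VALUE` are assumed placeholder values, and the enum and function names are hypothetical; none of this is the LTP code itself.

```c
/* Assumed placeholder values; check signal06.c for the real ones. */
#define LOOPS 30000
#define VALUE 123.456

enum loop_outcome {
	LOOP_CONTINUES,      /* D still intact and the limit not reached */
	LOOP_PASS,           /* ran all LOOPS iterations */
	LOOP_BUG_REPRODUCED  /* left the loop early: D != VALUE */
};

/*
 * Hypothetical helper: classify the state of the quoted
 * "while (D == VALUE && loop < LOOPS)" loop and the if/else after it.
 */
static enum loop_outcome classify(double d, int loop)
{
	if (d == VALUE && loop < LOOPS)
		return LOOP_CONTINUES;
	return (loop == LOOPS) ? LOOP_PASS : LOOP_BUG_REPRODUCED;
}
```

With these definitions, `classify(0.0, 23)` yields `LOOP_BUG_REPRODUCED`, which corresponds to the reported failure: the loop stopped at iteration 23 because the floating-point value changed, not because the counter ran out.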
* [LTP] Question about kernel/syscall/signal/signal06.c
From: Hongzhi, Song @ 2019-07-19 8:13 UTC
To: ltp

This case fails when booting qemux86-64 with 1 or 2 cores.

Using git bisect, I found that [kernel 5.2-rc1: 0d714dba162] causes the failure.

If I check out a commit before 0d714dba162, the case passes on the
same qemu configuration.

--Hongzhi
* [LTP] Question about kernel/syscall/signal/signal06.c
From: Li Wang @ 2019-07-19 8:44 UTC
To: ltp

On Fri, Jul 19, 2019 at 4:14 PM Hongzhi, Song
<hongzhi.song@windriver.com> wrote:
>
> This case fails when boot qemux86-64 with 1/2 cores.
>
> I find [kernel 5.2-rc1: 0d714dba162] causes the failure by git bisect.
>
> If git checkout a commit before 0d714dba162, the case will pass on the
> same qemu configuration.

It sounds like a new FPU regression. I will give this test a try.

@Hongzhi, could you provide more information about your test machine
(e.g. lscpu, uname -r) and the test results with 1 vCPU and 2 vCPUs?

[Cc'ing the FPU developer on this thread]

--
Regards,
Li Wang
* [LTP] Question about kernel/syscall/signal/signal06.c
From: Hongzhi, Song @ 2019-07-22 1:56 UTC
To: ltp

On 7/19/19 4:44 PM, Li Wang wrote:
> On Fri, Jul 19, 2019 at 4:14 PM Hongzhi, Song
> <hongzhi.song@windriver.com> wrote:
>> This case fails when boot qemux86-64 with 1/2 cores.
>>
>> I find [kernel 5.2-rc1: 0d714dba162] causes the failure by git bisect.

Hi Li Wang,

Sorry, I made a small mistake: the exact tag is [5.1-rc3 : 0d714dba162]

commit 0d714dba162620fd8b9f5b3104a487e041353c4d
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Wed Apr 3 18:41:48 2019 +0200

    x86/fpu: Update xstate's PKRU value on write_pkru()

    During the context switch the xstate is loaded which also includes the
    PKRU value.

    If xstate is restored on return to userland it is required
    that the PKRU value in xstate is the same as the one in the CPU.

    Save the PKRU in xstate during modification.

>> If git checkout a commit before 0d714dba162, the case will pass on the
>> same qemu configuration.
> It sounds like a new regression on fpu. I will have a try on this test then.
>
> @Hongzhi, could you provide more info of your test machine? (e.g.
> lscpu, uname -r)
> and test result with 1vcpu, 2vcpus?

I tested "-smp 1/2/4" and "-cpu Skylake-Client-IBRS/core2duo"; all of
them failed.

1. This is my qemu boot cmdline:

qemu-system-x86_64 -device
virtio-net-pci,netdev=net0,mac=52:54:00:12:35:02 -netdev
user,id=net0,hostfwd=tcp::2222-:22,hostfwd=tcp::2323-:23,tftp=images/qemux86-64
-drive file=image.rootfs.ext4,if=virtio,format=raw -vga vmware
-show-cursor -usb -device usb-tablet -object
rng-random,filename=/dev/urandom,id=rng0 -device
virtio-rng-pci,rng=rng0 -nographic -m 256 -cpu Skylake-Client-IBRS
-serial mon:stdio -serial null -kernel linux/arch/x86/boot/bzImage
-append 'root=/dev/vda rw highres=off console=ttyS0 mem=256M ip=dhcp
vga=0 uvesafb.mode_option=640x480-32 oprofile.timer=1
uvesafb.task_timeout=-1 '

2. lscpu

root@qemux86-64:~# lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   40 bits physical, 48 bits virtual
CPU(s):                          4
On-line CPU(s) list:             0
Thread(s) per core:              1
Core(s) per socket:              1
Socket(s):                       1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           94
Model name:                      Intel Core Processor (Skylake, IBRS)
Stepping:                        3
CPU MHz:                         3100.012
BogoMIPS:                        6200.02
L1d cache:                       32 KiB
L1i cache:                       32 KiB
L2 cache:                        4 MiB
L3 cache:                        16 MiB
Vulnerability L1tf:              Mitigation; PTE Inversion
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full generic retpoline,
                                 STIBP disabled, RSB filling
Flags:                           fpu de pse tsc msr pae mce cx8 apic sep mtrr
                                 pge mca cmov pat pse36 clflush mmx fxsr sse
                                 sse2 syscall nx rdtscp lm constant_tsc
                                 rep_good nopl xtopology cpuid pni pclmulqdq
                                 ssse3 cx16 sse4_1 sse4_2 movbe popcnt aes
                                 xsave hypervisor lahf_lm abm pti fsgsbase
                                 bmi1 smep bmi2 erms adx smap xsaveopt
                                 xgetbv1 arat

3. uname -r

root@qemux86-64:~# uname -r
5.1.0-rc3-Linux-standard

Thanks.

--Hongzhi
* [LTP] Question about kernel/syscall/signal/signal06.c
From: Li Wang @ 2019-07-24 9:56 UTC
To: ltp

Hi Hongzhi,

On Mon, Jul 22, 2019 at 9:59 AM Hongzhi, Song
<hongzhi.song@windriver.com> wrote:
> I tested "-smp 1/2/4" and "-cpu Skylake-Client-IBRS/core2duo", all of
> them failed.

Thanks for the information. I tried the mainline kernel v5.2 on a KVM
system (with 1 or 2 Skylake vCPUs) but could not reproduce your failure.
I'm not sure whether I missed anything; maybe the virtualization method
is related. I will try your command line when I am available.

--
Regards,
Li Wang
* [LTP] Question about kernel/syscall/signal/signal06.c
From: Sebastian Andrzej Siewior @ 2019-08-07 10:15 UTC
To: ltp

I just woke up from hibernation and assume that this has not been handled
yet, so...

On 2019-07-22 09:56:55 [+0800], Hongzhi, Song wrote:
>
> On 7/19/19 4:44 PM, Li Wang wrote:
> > On Fri, Jul 19, 2019 at 4:14 PM Hongzhi, Song
> > <hongzhi.song@windriver.com> wrote:
> > > This case fails when boot qemux86-64 with 1/2 cores.
> > >
> > > I find [kernel 5.2-rc1: 0d714dba162] causes the failure by git bisect.
>
> Hi Li,Wang
>
> Sorry for my a bit mistake, the exact tag is [5.1-rc3 : 0d714dba162]
>
> commit 0d714dba162620fd8b9f5b3104a487e041353c4d
> Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Date:   Wed Apr 3 18:41:48 2019 +0200
>
>     x86/fpu: Update xstate's PKRU value on write_pkru()
>
>     During the context switch the xstate is loaded which also includes the
>     PKRU value.
>
>     If xstate is restored on return to userland it is required
>     that the PKRU value in xstate is the same as the one in the CPU.
>
>     Save the PKRU in xstate during modification.

This commit is about PKRU handling, and I don't see the PKU bits in your
lscpu output. So I assume this commit itself is not related, but rather the
FPU rework in general.

> 3. uname -r
>
> root@qemux86-64:~# uname -r
> 5.1.0-rc3-Linux-standard

This information is confusing. I can reproduce a test case failure at
0d714dba162, but it passes with the latest supported kernel. Please let me
know if this problem still exists with 5.3-rc3 or 5.2.7. I can't reproduce
it on any of those kernels.
5.1 is EOL and the commit in question was merged into 5.2-rc1.

> Thanks.
>
> --Hongzhi

Sebastian
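Sebastian's observation about the missing PKU bits can be checked mechanically against a flags string such as the one in the lscpu output above. The sketch below is one way to do a whole-word lookup. The helper name `has_cpu_flag` is hypothetical, and treating `pku` as the flag name Linux reports for protection keys is an assumption of this example rather than something stated in the thread.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `name` appears as a whole,
 * whitespace-delimited word in `flags` (e.g. the "Flags:" value
 * printed by lscpu). Substrings such as "sse4" inside "sse4_1"
 * do not count as a match.
 */
static bool has_cpu_flag(const char *flags, const char *name)
{
	size_t len = strlen(name);
	const char *p = flags;

	if (len == 0)
		return false;

	while ((p = strstr(p, name)) != NULL) {
		bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' ||
			    p[len] == '\t' || p[len] == '\n';

		if (starts && ends)
			return true;
		p++; /* keep scanning after a partial match */
	}
	return false;
}
```

Applied to the flags reported in the thread, a lookup for `xsave` succeeds while a lookup for `pku` does not, which is consistent with the conclusion that the PKRU commit is unlikely to be the direct cause on this guest.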