* x86 PMU broken in current Linus' tree @ 2016-08-02 12:04 Jiri Kosina 2016-08-02 12:55 ` Peter Zijlstra 0 siblings, 1 reply; 13+ messages in thread From: Jiri Kosina @ 2016-08-02 12:04 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin Cc: x86, linux-kernel With current Linus' tree (HEAD == 731c7d3a20), I am getting bogus MSR write warning during bootup, and kernel panic when shutting PMUs down during poweroff. The MSR warning is below, the camera capture of the poweroff panic can be found at http://www.jikos.cz/jikos/junk/pmu-panic.jpg The last previous kernel version that I've booted on this particular machine was 4.7.0-rc4, and it had neither of those symptoms, so I can eventually bisect if needed. === [ snip ] == [ 0.136000] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU L9400 @ 1.86GHz (family: 0x6, model: 0x17, stepping: 0x6) [ 0.136000] Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver. [ 0.136000] ... version: 2 [ 0.136000] ... bit width: 40 [ 0.136000] ... generic registers: 2 [ 0.136000] ... value mask: 000000ffffffffff [ 0.136000] ... max period: 000000007fffffff [ 0.136000] ... fixed-purpose events: 3 [ 0.136000] ... event mask: 0000000700000003 [ 0.136000] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter. [ 0.136000] unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acc (x86_perf_event_set_period+0xdc/0x190) [ 0.136000] ffff8e39f941b000 ffff8e39fbc0a440 000000000000001e ffff8e39fbc0a660 [ 0.136000] ffff8e39f9523b78 ffffffff90004bd0 ffff8e39f941b000 ffff8e39fbc0a760 [ 0.136000] ffff8e39fbc0a440 ffff8e39f9523bc0 ffffffff900053ef 00000000f951c2c0 [ 0.136000] Call Trace: [ 0.136000] [<ffffffff90004bd0>] x86_pmu_start+0x50/0x110 [ 0.136000] [<ffffffff900053ef>] x86_pmu_enable+0x27f/0x2f0 [ 0.136000] [<ffffffff90175642>] perf_pmu_enable+0x22/0x30 [ 0.136000] [<ffffffff901766b1>] ctx_resched+0x51/0x60 [ 0.136000] [<ffffffff901768a0>] __perf_event_enable+0x1e0/0x240 [ 0.136000] [<ffffffff9016e5f9>] event_function+0xa9/0x180 [ 0.136000] [<ffffffff901766c0>] ? ctx_resched+0x60/0x60 [ 0.136000] [<ffffffff9016fcef>] remote_function+0x3f/0x50 [ 0.136000] [<ffffffff901012b6>] generic_exec_single+0xb6/0x140 [ 0.136000] [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50 [ 0.136000] [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50 [ 0.136000] [<ffffffff901013f7>] smp_call_function_single+0xb7/0x110 [ 0.136000] [<ffffffff9016e984>] cpu_function_call+0x34/0x40 [ 0.136000] [<ffffffff9016e550>] ? list_del_event+0x150/0x150 [ 0.136000] [<ffffffff9016ecda>] event_function_call+0x11a/0x120 [ 0.136000] [<ffffffff901766c0>] ? ctx_resched+0x60/0x60 [ 0.136000] [<ffffffff9016ed79>] _perf_event_enable+0x49/0x70 [ 0.136000] [<ffffffff901736ac>] perf_event_enable+0x1c/0x40 [ 0.136000] [<ffffffff9013cad2>] watchdog_enable+0x132/0x1d0 [ 0.136000] [<ffffffff90092440>] smpboot_thread_fn+0xe0/0x1d0 [ 0.136000] [<ffffffff90092360>] ? sort_range+0x30/0x30 [ 0.136000] [<ffffffff9008e7e2>] kthread+0xf2/0x110 [ 0.136000] [<ffffffff9069e611>] ? wait_for_completion+0xe1/0x110 [ 0.136000] [<ffffffff906a3b2f>] ret_from_fork+0x1f/0x40 [ 0.136000] [<ffffffff9008e6f0>] ? kthread_create_on_node+0x220/0x220 === [ snip ] === -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 PMU broken in current Linus' tree 2016-08-02 12:04 x86 PMU broken in current Linus' tree Jiri Kosina @ 2016-08-02 12:55 ` Peter Zijlstra 2016-08-02 13:31 ` Jiri Kosina 2016-08-08 10:26 ` Peter Zijlstra 0 siblings, 2 replies; 13+ messages in thread From: Peter Zijlstra @ 2016-08-02 12:55 UTC (permalink / raw) To: Jiri Kosina Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel On Tue, Aug 02, 2016 at 02:04:58PM +0200, Jiri Kosina wrote: > With current Linus' tree (HEAD == 731c7d3a20), I am getting bogus MSR > write warning during bootup, and kernel panic when shutting PMUs down > during poweroff. > > The MSR warning is below, the camera capture of the poweroff panic can be > found at > > http://www.jikos.cz/jikos/junk/pmu-panic.jpg > > The last previous kernel version that I've booted on this particular > machine was 4.7.0-rc4, and it had neither of those symptoms, so I can > eventually bisect if needed. > > === [ snip ] == > [ 0.136000] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU L9400 @ 1.86GHz (family: 0x6, model: 0x17, stepping: 0x6) > [ 0.136000] Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver. > [ 0.136000] ... version: 2 > [ 0.136000] ... bit width: 40 > [ 0.136000] ... generic registers: 2 > [ 0.136000] ... value mask: 000000ffffffffff > [ 0.136000] ... max period: 000000007fffffff > [ 0.136000] ... fixed-purpose events: 3 > [ 0.136000] ... event mask: 0000000700000003 > [ 0.136000] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter. > [ 0.136000] unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acc (x86_perf_event_set_period+0xdc/0x190) 'Curious'.. :/ x86_perf_event_set_period() only does: wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask); and hwc->event ends up being: MSR_ARCH_PERFMON_PERFCTR0 + index >From which we can deduce that index = 0xdf - 0xc1 = 30, which is somewhat larger than the max reported number of counters (2). Lemme go see how that can happen. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 PMU broken in current Linus' tree 2016-08-02 12:55 ` Peter Zijlstra @ 2016-08-02 13:31 ` Jiri Kosina 2016-08-08 10:26 ` Peter Zijlstra 1 sibling, 0 replies; 13+ messages in thread From: Jiri Kosina @ 2016-08-02 13:31 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel On Tue, 2 Aug 2016, Peter Zijlstra wrote: > > With current Linus' tree (HEAD == 731c7d3a20), I am getting bogus MSR > > write warning during bootup, and kernel panic when shutting PMUs down > > during poweroff. > > > > The MSR warning is below, the camera capture of the poweroff panic can be > > found at > > > > http://www.jikos.cz/jikos/junk/pmu-panic.jpg > > > > The last previous kernel version that I've booted on this particular > > machine was 4.7.0-rc4, and it had neither of those symptoms, so I can > > eventually bisect if needed. > > > > === [ snip ] == > > [ 0.136000] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU L9400 @ 1.86GHz (family: 0x6, model: 0x17, stepping: 0x6) > > [ 0.136000] Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver. > > [ 0.136000] ... version: 2 > > [ 0.136000] ... bit width: 40 > > [ 0.136000] ... generic registers: 2 > > [ 0.136000] ... value mask: 000000ffffffffff > > [ 0.136000] ... max period: 000000007fffffff > > [ 0.136000] ... fixed-purpose events: 3 > > [ 0.136000] ... event mask: 0000000700000003 > > [ 0.136000] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter. > > [ 0.136000] unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acc (x86_perf_event_set_period+0xdc/0x190) > > 'Curious'.. :/ > > x86_perf_event_set_period() only does: > > wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask); > > and hwc->event ends up being: > > MSR_ARCH_PERFMON_PERFCTR0 + index > > From which we can deduce that index = 0xdf - 0xc1 = 30, which is > somewhat larger than the max reported number of counters (2). > > Lemme go see how that can happen. FTR, I tried the very same kernel on Xeon E5, and the issue didn't pop up. So it might be somehow specific to the older Core2, or somehow otherwise not really completely generic problem. -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 PMU broken in current Linus' tree 2016-08-02 12:55 ` Peter Zijlstra 2016-08-02 13:31 ` Jiri Kosina @ 2016-08-08 10:26 ` Peter Zijlstra 2016-08-08 14:41 ` Jiri Kosina 1 sibling, 1 reply; 13+ messages in thread From: Peter Zijlstra @ 2016-08-08 10:26 UTC (permalink / raw) To: Jiri Kosina Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel, Mike Galbraith On Tue, Aug 02, 2016 at 02:55:24PM +0200, Peter Zijlstra wrote: > On Tue, Aug 02, 2016 at 02:04:58PM +0200, Jiri Kosina wrote: > > With current Linus' tree (HEAD == 731c7d3a20), I am getting bogus MSR > > write warning during bootup, and kernel panic when shutting PMUs down > > during poweroff. > > > > The MSR warning is below, the camera capture of the poweroff panic can be > > found at > > > > http://www.jikos.cz/jikos/junk/pmu-panic.jpg > > > > The last previous kernel version that I've booted on this particular > > machine was 4.7.0-rc4, and it had neither of those symptoms, so I can > > eventually bisect if needed. > > > > === [ snip ] == > > [ 0.136000] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU L9400 @ 1.86GHz (family: 0x6, model: 0x17, stepping: 0x6) > > [ 0.136000] Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver. > > [ 0.136000] ... version: 2 > > [ 0.136000] ... bit width: 40 > > [ 0.136000] ... generic registers: 2 > > [ 0.136000] ... value mask: 000000ffffffffff > > [ 0.136000] ... max period: 000000007fffffff > > [ 0.136000] ... fixed-purpose events: 3 > > [ 0.136000] ... event mask: 0000000700000003 > > [ 0.136000] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter. > > [ 0.136000] unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acc (x86_perf_event_set_period+0xdc/0x190) > > 'Curious'.. :/ > > x86_perf_event_set_period() only does: > > wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask); > > and hwc->event ends up being: > > MSR_ARCH_PERFMON_PERFCTR0 + index > > From which we can deduce that index = 0xdf - 0xc1 = 30, which is > somewhat larger than the max reported number of counters (2). > > Lemme go see how that can happen. So I can reproduce on my Lenovo T500 which has a Core2 as well. By long and tedious printk() it looks like the event constraint: FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */ which is updated in: intel_pmu_init() to include the generic counter masks, gets corrupted for some reason. But the moment I put printk()s in there to print the idxmsk64 values, everything works as expected again. I'll go prod mode. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: x86 PMU broken in current Linus' tree 2016-08-08 10:26 ` Peter Zijlstra @ 2016-08-08 14:41 ` Jiri Kosina 2016-08-08 16:12 ` [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations Jiri Kosina 0 siblings, 1 reply; 13+ messages in thread From: Jiri Kosina @ 2016-08-08 14:41 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel, Mike Galbraith, Borislav Petkov On Mon, 8 Aug 2016, Peter Zijlstra wrote: > So I can reproduce on my Lenovo T500 which has a Core2 as well. By long > and tedious printk() it looks like the event constraint: > > FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */ [ adding Boris to CC ] This made me wonder, and it turned out that HWEIGHT() doesn't seem to be doing the right thing on machines that don't have popcnt instruction (which is probably the distinguishing factor here). How about the patch below? It fixes the symptoms on my system. From: Jiri Kosina <jkosina@suse.cz> Subject: [PATCH] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations The open-coded version of __sw_hweight32() and __sw_hweight64() are broken; the variable <-> register mapping in the comments doesn't match the registers being used, resulting in wrong calculations. This has a potential to demonstrate itself by various syptoms on machines where this is actually being used to compute the hweight (due to lack of popcnt insn). On my core2 system, this demonstrates itself as NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter. unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acc t_set_period+0xdc/0x190) ffff8e39f941b000 ffff8e39fbc0a440 000000000000001e ffff8e39fbc0a660 ffff8e39f9523b78 ffffffff90004bd0 ffff8e39f941b000 ffff8e39fbc0a760 ffff8e39fbc0a440 ffff8e39f9523bc0 ffffffff900053ef 00000000f951c2c0 Call Trace: [<ffffffff90004bd0>] x86_pmu_start+0x50/0x110 [<ffffffff900053ef>] x86_pmu_enable+0x27f/0x2f0 [<ffffffff90175642>] perf_pmu_enable+0x22/0x30 [<ffffffff901766b1>] ctx_resched+0x51/0x60 [<ffffffff901768a0>] __perf_event_enable+0x1e0/0x240 [<ffffffff9016e5f9>] event_function+0xa9/0x180 [<ffffffff901766c0>] ? ctx_resched+0x60/0x60 [<ffffffff9016fcef>] remote_function+0x3f/0x50 [<ffffffff901012b6>] generic_exec_single+0xb6/0x140 [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50 [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50 [<ffffffff901013f7>] smp_call_function_single+0xb7/0x110 [<ffffffff9016e984>] cpu_function_call+0x34/0x40 [<ffffffff9016e550>] ? list_del_event+0x150/0x150 [<ffffffff9016ecda>] event_function_call+0x11a/0x120 [<ffffffff901766c0>] ? ctx_resched+0x60/0x60 [<ffffffff9016ed79>] _perf_event_enable+0x49/0x70 [<ffffffff901736ac>] perf_event_enable+0x1c/0x40 [<ffffffff9013cad2>] watchdog_enable+0x132/0x1d0 [<ffffffff90092440>] smpboot_thread_fn+0xe0/0x1d0 [<ffffffff90092360>] ? sort_range+0x30/0x30 [<ffffffff9008e7e2>] kthread+0xf2/0x110 [<ffffffff9069e611>] ? wait_for_completion+0xe1/0x110 [<ffffffff906a3b2f>] ret_from_fork+0x1f/0x40 [<ffffffff9008e6f0>] ? kthread_create_on_node+0x220/0x220 as FIXED_EVENT_CONSTRAINT() doesn't procude correct results, and followup panic in x86_pmu_stop(). YMMV. Fix this by rewriting the code so that it actually matches the math in comments. Fixes: f5967101e9d ("x86/hweight: Get rid of the special calling convention") Signed-off-by: Jiri Kosina <jkosina@suse.cz> --- arch/x86/lib/hweight.S | 43 +++++++++++++++++++++---------------------- 1 file changed, 21 insertions(+), 22 deletions(-) diff --git a/arch/x86/lib/hweight.S b/arch/x86/lib/hweight.S index 02de3d7..9d0540f 100644 --- a/arch/x86/lib/hweight.S +++ b/arch/x86/lib/hweight.S @@ -8,24 +8,23 @@ */ ENTRY(__sw_hweight32) + __ASM_SIZE(push,) %__ASM_REG(dx) #ifdef CONFIG_X86_64 - movl %edi, %eax # w + movl %edi, %edx # w -> t #endif - __ASM_SIZE(push,) %__ASM_REG(dx) - movl %eax, %edx # w -> t - shrl %edx # t >>= 1 - andl $0x55555555, %edx # t &= 0x55555555 - subl %edx, %eax # w -= t + shrl %edx # w >>= 1 + andl $0x55555555, %edx # w &= 0x55555555 + subl %edx, %edi # w -= t + movl %edi, %eax # w -> t - movl %eax, %edx # w -> t - shrl $2, %eax # w_tmp >>= 2 - andl $0x33333333, %edx # t &= 0x33333333 - andl $0x33333333, %eax # w_tmp &= 0x33333333 - addl %edx, %eax # w = w_tmp + t + shrl $2, %edi # w_tmp >>= 2 + andl $0x33333333, %eax # t &= 0x33333333 + andl $0x33333333, %edi # w_tmp &= 0x33333333 + addl %eax, %edi # w = w_tmp + t - movl %eax, %edx # w -> t - shrl $4, %edx # t >>= 4 - addl %edx, %eax # w_tmp += t + movl %edi, %eax # w -> t + shrl $4, %eax # t >>= 4 + addl %edi, %eax # w_tmp += t andl $0x0f0f0f0f, %eax # w_tmp &= 0x0f0f0f0f imull $0x01010101, %eax, %eax # w_tmp *= 0x01010101 shrl $24, %eax # w = w_tmp >> 24 @@ -40,20 +39,20 @@ ENTRY(__sw_hweight64) movq %rdi, %rdx # w -> t movabsq $0x5555555555555555, %rax shrq %rdx # t >>= 1 - andq %rdx, %rax # t &= 0x5555555555555555 + andq %rax, %rdx # t &= 0x5555555555555555 + subq %rdx, %rdi # w -= t movabsq $0x3333333333333333, %rdx - subq %rax, %rdi # w -= t movq %rdi, %rax # w -> t shrq $2, %rdi # w_tmp >>= 2 andq %rdx, %rax # t &= 0x3333333333333333 - andq %rdi, %rdx # w_tmp &= 0x3333333333333333 - addq %rdx, %rax # w = w_tmp + t - - movq %rax, %rdx # w -> t - shrq $4, %rdx # t >>= 4 - addq %rdx, %rax # w_tmp += t + andq %rdx, %rdi # w_tmp &= 0x3333333333333333 movabsq $0x0f0f0f0f0f0f0f0f, %rdx + addq %rax, %rdi # w = w_tmp + t + + movq %rdi, %rax # w -> t + shrq $4, %rax # t >>= 4 + addq %rdi, %rax # w_tmp += t andq %rdx, %rax # w_tmp &= 0x0f0f0f0f0f0f0f0f movabsq $0x0101010101010101, %rdx imulq %rdx, %rax # w_tmp *= 0x0101010101010101 -- Jiri Kosina SUSE Labs ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations 2016-08-08 14:41 ` Jiri Kosina @ 2016-08-08 16:12 ` Jiri Kosina 2016-08-08 16:34 ` kbuild test robot ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Jiri Kosina @ 2016-08-08 16:12 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel, Mike Galbraith, Borislav Petkov From: Jiri Kosina <jkosina@suse.cz> The open-coded version of __sw_hweight32() and __sw_hweight64() are broken; the variable <-> register mapping in the comments doesn't match the registers being used, resulting in wrong calculations. This has a potential to demonstrate itself by various syptoms on machines where this is actually being used to compute the hweight (due to lack of popcnt insn). On my core2 system, this demonstrates itself as NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter. unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acct_set_period+0xdc/0x190) ffff8e39f941b000 ffff8e39fbc0a440 000000000000001e ffff8e39fbc0a660 ffff8e39f9523b78 ffffffff90004bd0 ffff8e39f941b000 ffff8e39fbc0a760 ffff8e39fbc0a440 ffff8e39f9523bc0 ffffffff900053ef 00000000f951c2c0 Call Trace: [<ffffffff90004bd0>] x86_pmu_start+0x50/0x110 [<ffffffff900053ef>] x86_pmu_enable+0x27f/0x2f0 [<ffffffff90175642>] perf_pmu_enable+0x22/0x30 [<ffffffff901766b1>] ctx_resched+0x51/0x60 [<ffffffff901768a0>] __perf_event_enable+0x1e0/0x240 [<ffffffff9016e5f9>] event_function+0xa9/0x180 [<ffffffff901766c0>] ? ctx_resched+0x60/0x60 [<ffffffff9016fcef>] remote_function+0x3f/0x50 [<ffffffff901012b6>] generic_exec_single+0xb6/0x140 [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50 [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50 [<ffffffff901013f7>] smp_call_function_single+0xb7/0x110 [<ffffffff9016e984>] cpu_function_call+0x34/0x40 [<ffffffff9016e550>] ? list_del_event+0x150/0x150 [<ffffffff9016ecda>] event_function_call+0x11a/0x120 [<ffffffff901766c0>] ? ctx_resched+0x60/0x60 [<ffffffff9016ed79>] _perf_event_enable+0x49/0x70 [<ffffffff901736ac>] perf_event_enable+0x1c/0x40 [<ffffffff9013cad2>] watchdog_enable+0x132/0x1d0 [<ffffffff90092440>] smpboot_thread_fn+0xe0/0x1d0 [<ffffffff90092360>] ? sort_range+0x30/0x30 [<ffffffff9008e7e2>] kthread+0xf2/0x110 [<ffffffff9069e611>] ? wait_for_completion+0xe1/0x110 [<ffffffff906a3b2f>] ret_from_fork+0x1f/0x40 [<ffffffff9008e6f0>] ? kthread_create_on_node+0x220/0x220 as FIXED_EVENT_CONSTRAINT() doesn't produce correct results, and followup panic in x86_pmu_stop(). YMMV. Fix this by rewriting the code so that it actually matches the math lib/hweight.c. While at it, nuke the original comments altogether; the "variable" names don't really match what's being used in the C-version of the function, and I find them to be more misleading than helping. This version of gcc: gcc (SUSE Linux) 4.8.5 produces identical assembly code from lib/hweight.c (modulo frame pointer maths). Fixes: f5967101e9d ("x86/hweight: Get rid of the special calling convention") Signed-off-by: Jiri Kosina <jkosina@suse.cz> --- v1 -> v2: remove the comments altogether; put "approved by gcc" note into changelog arch/x86/lib/hweight.S | 69 +++++++++++++++++++++++++------------------------- 1 file changed, 34 insertions(+), 35 deletions(-) diff --git a/arch/x86/lib/hweight.S b/arch/x86/lib/hweight.S index 02de3d7..e1cea44f 100644 --- a/arch/x86/lib/hweight.S +++ b/arch/x86/lib/hweight.S @@ -8,56 +8,55 @@ */ ENTRY(__sw_hweight32) -#ifdef CONFIG_X86_64 - movl %edi, %eax # w -#endif __ASM_SIZE(push,) %__ASM_REG(dx) - movl %eax, %edx # w -> t - shrl %edx # t >>= 1 - andl $0x55555555, %edx # t &= 0x55555555 - subl %edx, %eax # w -= t - movl %eax, %edx # w -> t - shrl $2, %eax # w_tmp >>= 2 - andl $0x33333333, %edx # t &= 0x33333333 - andl $0x33333333, %eax # w_tmp &= 0x33333333 - addl %edx, %eax # w = w_tmp + t + movl %edi, %edx + + shrl %edx + andl $0x55555555, %edx + subl %edx, %edi + movl %edi, %eax + + shrl $2, %edi + andl $0x33333333, %eax + andl $0x33333333, %edi + addl %eax, %edi - movl %eax, %edx # w -> t - shrl $4, %edx # t >>= 4 - addl %edx, %eax # w_tmp += t - andl $0x0f0f0f0f, %eax # w_tmp &= 0x0f0f0f0f - imull $0x01010101, %eax, %eax # w_tmp *= 0x01010101 - shrl $24, %eax # w = w_tmp >> 24 + movl %edi, %eax + shrl $4, %eax + addl %edi, %eax + andl $0x0f0f0f0f, %eax + imull $0x01010101, %eax, %eax + shrl $24, %eax __ASM_SIZE(pop,) %__ASM_REG(dx) ret ENDPROC(__sw_hweight32) ENTRY(__sw_hweight64) -#ifdef CONFIG_X86_64 + pushq %rdx - movq %rdi, %rdx # w -> t + movq %rdi, %rdx movabsq $0x5555555555555555, %rax - shrq %rdx # t >>= 1 - andq %rdx, %rax # t &= 0x5555555555555555 + shrq %rdx + andq %rax, %rdx + subq %rdx, %rdi movabsq $0x3333333333333333, %rdx - subq %rax, %rdi # w -= t - - movq %rdi, %rax # w -> t - shrq $2, %rdi # w_tmp >>= 2 - andq %rdx, %rax # t &= 0x3333333333333333 - andq %rdi, %rdx # w_tmp &= 0x3333333333333333 - addq %rdx, %rax # w = w_tmp + t - movq %rax, %rdx # w -> t - shrq $4, %rdx # t >>= 4 - addq %rdx, %rax # w_tmp += t + movq %rdi, %rax + shrq $2, %rdi + andq %rdx, %rax + andq %rdx, %rdi movabsq $0x0f0f0f0f0f0f0f0f, %rdx - andq %rdx, %rax # w_tmp &= 0x0f0f0f0f0f0f0f0f + addq %rax, %rdi + + movq %rdi, %rax + shrq $4, %rax + addq %rdi, %rax + andq %rdx, %rax movabsq $0x0101010101010101, %rdx - imulq %rdx, %rax # w_tmp *= 0x0101010101010101 - shrq $56, %rax # w = w_tmp >> 56 + imulq %rdx, %rax + shrq $56, %rax popq %rdx ret -- Jiri Kosina SUSE Labs ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations 2016-08-08 16:12 ` [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations Jiri Kosina @ 2016-08-08 16:34 ` kbuild test robot 2016-08-08 18:48 ` Jiri Kosina 2016-08-10 13:59 ` Ingo Molnar 2 siblings, 0 replies; 13+ messages in thread From: kbuild test robot @ 2016-08-08 16:34 UTC (permalink / raw) To: Jiri Kosina Cc: kbuild-all, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel, Mike Galbraith, Borislav Petkov [-- Attachment #1: Type: text/plain, Size: 2538 bytes --] Hi Jiri, [auto build test ERROR on tip/auto-latest] [also build test ERROR on v4.8-rc1 next-20160808] [cannot apply to tip/x86/core] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Jiri-Kosina/x86-hweight-fix-open-coded-versions-of-32bit-and-64bit-hweight-calculations/20160809-001505 config: i386-tinyconfig (attached as .config) compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705 reproduce: # save the attached .config to linux build tree make ARCH=i386 All errors (new ones prefixed by >>): >> arch/x86/lib/hweight.S:63:2: error: #else without #if #else /* CONFIG_X86_32 */ ^~~~ >> arch/x86/lib/hweight.S:75:2: error: #endif without #if #endif ^~~~~ vim +63 arch/x86/lib/hweight.S f5967101 Borislav Petkov 2016-05-30 57 movabsq $0x0101010101010101, %rdx 4792d6cc Jiri Kosina 2016-08-08 58 imulq %rdx, %rax 4792d6cc Jiri Kosina 2016-08-08 59 shrq $56, %rax f5967101 Borislav Petkov 2016-05-30 60 f5967101 Borislav Petkov 2016-05-30 61 popq %rdx f5967101 Borislav Petkov 2016-05-30 62 ret f5967101 Borislav Petkov 2016-05-30 @63 #else /* CONFIG_X86_32 */ f5967101 Borislav Petkov 2016-05-30 64 /* We're getting an u64 arg in (%eax,%edx): unsigned long hweight64(__u64 w) */ f5967101 Borislav Petkov 2016-05-30 65 pushl %ecx f5967101 Borislav Petkov 2016-05-30 66 f5967101 Borislav Petkov 2016-05-30 67 call __sw_hweight32 f5967101 Borislav Petkov 2016-05-30 68 movl %eax, %ecx # stash away result f5967101 Borislav Petkov 2016-05-30 69 movl %edx, %eax # second part of input f5967101 Borislav Petkov 2016-05-30 70 call __sw_hweight32 f5967101 Borislav Petkov 2016-05-30 71 addl %ecx, %eax # result f5967101 Borislav Petkov 2016-05-30 72 f5967101 Borislav Petkov 2016-05-30 73 popl %ecx f5967101 Borislav Petkov 2016-05-30 74 ret f5967101 Borislav Petkov 2016-05-30 @75 #endif f5967101 Borislav Petkov 2016-05-30 76 ENDPROC(__sw_hweight64) :::::: The code at line 63 was first introduced by commit :::::: f5967101e9de12addcda4510dfbac66d7c5779c3 x86/hweight: Get rid of the special calling convention :::::: TO: Borislav Petkov <bp@suse.de> :::::: CC: Ingo Molnar <mingo@kernel.org> --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation [-- Attachment #2: .config.gz --] [-- Type: application/octet-stream, Size: 6290 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations 2016-08-08 16:12 ` [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations Jiri Kosina 2016-08-08 16:34 ` kbuild test robot @ 2016-08-08 18:48 ` Jiri Kosina 2016-08-08 18:53 ` Borislav Petkov 2016-08-10 13:59 ` Ingo Molnar 2 siblings, 1 reply; 13+ messages in thread From: Jiri Kosina @ 2016-08-08 18:48 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel, Mike Galbraith, Borislav Petkov Okay, scratch this thread, it's been just covered (and Linus already applied it) in https://lkml.kernel.org/r/1470677729-10561-1-git-send-email-ville.syrjala@linux.intel.com -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations 2016-08-08 18:48 ` Jiri Kosina @ 2016-08-08 18:53 ` Borislav Petkov 2016-08-08 18:55 ` Jiri Kosina 0 siblings, 1 reply; 13+ messages in thread From: Borislav Petkov @ 2016-08-08 18:53 UTC (permalink / raw) To: Jiri Kosina Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel, Mike Galbraith, Borislav Petkov On Mon, Aug 08, 2016 at 08:48:44PM +0200, Jiri Kosina wrote: > Okay, scratch this thread, it's been just covered (and Linus already > applied it) in > > https://lkml.kernel.org/r/1470677729-10561-1-git-send-email-ville.syrjala@linux.intel.com Can you confirm it fixes the issue on your machine too? Thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations 2016-08-08 18:53 ` Borislav Petkov @ 2016-08-08 18:55 ` Jiri Kosina 0 siblings, 0 replies; 13+ messages in thread From: Jiri Kosina @ 2016-08-08 18:55 UTC (permalink / raw) To: Borislav Petkov Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel, Mike Galbraith, Borislav Petkov On Mon, 8 Aug 2016, Borislav Petkov wrote: > > Okay, scratch this thread, it's been just covered (and Linus already > > applied it) in > > > > https://lkml.kernel.org/r/1470677729-10561-1-git-send-email-ville.syrjala@linux.intel.com > > Can you confirm it fixes the issue on your machine too? Confirmed, it does. Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations 2016-08-08 16:12 ` [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations Jiri Kosina 2016-08-08 16:34 ` kbuild test robot 2016-08-08 18:48 ` Jiri Kosina @ 2016-08-10 13:59 ` Ingo Molnar 2016-08-10 14:05 ` Jiri Kosina 2 siblings, 1 reply; 13+ messages in thread From: Ingo Molnar @ 2016-08-10 13:59 UTC (permalink / raw) To: Jiri Kosina Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel, Mike Galbraith, Borislav Petkov * Jiri Kosina <jikos@kernel.org> wrote: > ENTRY(__sw_hweight64) > -#ifdef CONFIG_X86_64 > + So this removes an #ifdef without removing the #else branch. Help? Thanks, Ingo ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations 2016-08-10 13:59 ` Ingo Molnar @ 2016-08-10 14:05 ` Jiri Kosina 2016-08-11 0:09 ` Ingo Molnar 0 siblings, 1 reply; 13+ messages in thread From: Jiri Kosina @ 2016-08-10 14:05 UTC (permalink / raw) To: Ingo Molnar Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel, Mike Galbraith, Borislav Petkov On Wed, 10 Aug 2016, Ingo Molnar wrote: > > ENTRY(__sw_hweight64) > > -#ifdef CONFIG_X86_64 > > + > > So this removes an #ifdef without removing the #else branch. Hi Ingo, this patch is obsolete; the real issue has already been fixed by 65ea11ec6 ("x86/hweight: Don't clobber %rdi") in Linus' tree already. Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations 2016-08-10 14:05 ` Jiri Kosina @ 2016-08-11 0:09 ` Ingo Molnar 0 siblings, 0 replies; 13+ messages in thread From: Ingo Molnar @ 2016-08-11 0:09 UTC (permalink / raw) To: Jiri Kosina Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, x86, linux-kernel, Mike Galbraith, Borislav Petkov * Jiri Kosina <jikos@kernel.org> wrote: > On Wed, 10 Aug 2016, Ingo Molnar wrote: > > > > ENTRY(__sw_hweight64) > > > -#ifdef CONFIG_X86_64 > > > + > > > > So this removes an #ifdef without removing the #else branch. > > Hi Ingo, > > this patch is obsolete; the real issue has already been fixed by 65ea11ec6 > ("x86/hweight: Don't clobber %rdi") in Linus' tree already. Ok, great! Thanks, Ingo ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2016-08-11 0:09 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-08-02 12:04 x86 PMU broken in current Linus' tree Jiri Kosina 2016-08-02 12:55 ` Peter Zijlstra 2016-08-02 13:31 ` Jiri Kosina 2016-08-08 10:26 ` Peter Zijlstra 2016-08-08 14:41 ` Jiri Kosina 2016-08-08 16:12 ` [PATCH v2] x86/hweight: fix open-coded versions of 32bit and 64bit hweight calculations Jiri Kosina 2016-08-08 16:34 ` kbuild test robot 2016-08-08 18:48 ` Jiri Kosina 2016-08-08 18:53 ` Borislav Petkov 2016-08-08 18:55 ` Jiri Kosina 2016-08-10 13:59 ` Ingo Molnar 2016-08-10 14:05 ` Jiri Kosina 2016-08-11 0:09 ` Ingo Molnar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox