* fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog @ 2010-08-13 10:21 Sergey Senozhatsky 2010-08-16 8:22 ` Peter Zijlstra 2010-08-18 19:33 ` Andrew Morton 0 siblings, 2 replies; 41+ messages in thread From: Sergey Senozhatsky @ 2010-08-13 10:21 UTC (permalink / raw) To: Andrew Morton; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel Hello, Got this traces today: [ 67.703556] BUG: using smp_processor_id() in preemptible [00000000] code: s2disk/5139 [ 67.703563] caller is touch_nmi_watchdog+0x15/0x2c [ 67.703566] Pid: 5139, comm: s2disk Not tainted 2.6.36-rc0-git12-07921-g60bf26a-dirty #116 [ 67.703568] Call Trace: [ 67.703575] [<ffffffff811f6bf1>] debug_smp_processor_id+0xc9/0xe4 [ 67.703578] [<ffffffff81092766>] touch_nmi_watchdog+0x15/0x2c [ 67.703584] [<ffffffff81222950>] acpi_os_stall+0x34/0x40 [ 67.703589] [<ffffffff812398d2>] acpi_ex_system_do_stall+0x34/0x38 [ 67.703591] [<ffffffff81238396>] acpi_ex_opcode_1A_0T_0R+0x6d/0xa1 [ 67.703595] [<ffffffff8122e280>] acpi_ds_exec_end_op+0xf8/0x578 [ 67.703598] [<ffffffff812457f9>] acpi_ps_parse_loop+0x88a/0xa55 [ 67.703604] [<ffffffff81244a00>] acpi_ps_parse_aml+0x104/0x3c4 [ 67.703607] [<ffffffff81246198>] acpi_ps_execute_method+0x20f/0x2f3 [ 67.703610] [<ffffffff8124021f>] acpi_ns_evaluate+0x18b/0x2d2 [ 67.703614] [<ffffffff8123fad0>] acpi_evaluate_object+0x1b8/0x2fc [ 67.703617] [<ffffffff8123e020>] ? acpi_get_sleep_type_data+0x21c/0x236 [ 67.703620] [<ffffffff8123d9fb>] acpi_enter_sleep_state_prep+0x61/0xd9 [ 67.703623] [<ffffffff81224205>] acpi_sleep_prepare+0x4f/0x56 [ 67.703626] [<ffffffff81224268>] __acpi_pm_prepare+0x13/0x2e [ 67.703629] [<ffffffff81224448>] acpi_pm_prepare+0xe/0x1f [ 67.703632] [<ffffffff81224466>] acpi_hibernation_pre_snapshot+0xd/0x1e [ 67.703637] [<ffffffff81071b80>] hibernation_snapshot+0xaf/0x258 [ 67.703641] [<ffffffff81074dca>] snapshot_ioctl+0x25c/0x547 [ 67.703645] [<ffffffff81056efc>] ? __srcu_read_unlock+0x3b/0x57 [ 67.703649] [<ffffffff810e7f7d>] vfs_ioctl+0x31/0xa2 [ 67.703652] [<ffffffff810e88dc>] do_vfs_ioctl+0x47c/0x4af [ 67.703655] [<ffffffff8125ee3c>] ? n_tty_write+0x0/0x35e [ 67.703659] [<ffffffff8100203a>] ? sysret_check+0x2e/0x69 [ 67.703662] [<ffffffff810e8960>] sys_ioctl+0x51/0x75 [ 67.703665] [<ffffffff81002002>] system_call_fastpath+0x16/0x1b [ 67.703668] BUG: using smp_processor_id() in preemptible [00000000] code: s2disk/5139 [ 67.703670] caller is touch_softlockup_watchdog+0x15/0x2b [ 67.703672] Pid: 5139, comm: s2disk Not tainted 2.6.36-rc0-git12-07921-g60bf26a-dirty #116 [ 67.703674] Call Trace: [ 67.703677] [<ffffffff811f6bf1>] debug_smp_processor_id+0xc9/0xe4 [ 67.703680] [<ffffffff8109273b>] touch_softlockup_watchdog+0x15/0x2b [ 67.703682] [<ffffffff81092779>] touch_nmi_watchdog+0x28/0x2c [ 67.703685] [<ffffffff81222950>] acpi_os_stall+0x34/0x40 [ 67.703688] [<ffffffff812398d2>] acpi_ex_system_do_stall+0x34/0x38 [ 67.703690] [<ffffffff81238396>] acpi_ex_opcode_1A_0T_0R+0x6d/0xa1 [ 67.703693] [<ffffffff8122e280>] acpi_ds_exec_end_op+0xf8/0x578 [ 67.703696] [<ffffffff812457f9>] acpi_ps_parse_loop+0x88a/0xa55 [ 67.703699] [<ffffffff81244a00>] acpi_ps_parse_aml+0x104/0x3c4 [ 67.703702] [<ffffffff81246198>] acpi_ps_execute_method+0x20f/0x2f3 [ 67.703705] [<ffffffff8124021f>] acpi_ns_evaluate+0x18b/0x2d2 [ 67.703708] [<ffffffff8123fad0>] acpi_evaluate_object+0x1b8/0x2fc [ 67.703710] [<ffffffff8123e020>] ? acpi_get_sleep_type_data+0x21c/0x236 [ 67.703714] [<ffffffff8123d9fb>] acpi_enter_sleep_state_prep+0x61/0xd9 [ 67.703717] [<ffffffff81224205>] acpi_sleep_prepare+0x4f/0x56 [ 67.703719] [<ffffffff81224268>] __acpi_pm_prepare+0x13/0x2e [ 67.703722] [<ffffffff81224448>] acpi_pm_prepare+0xe/0x1f [ 67.703725] [<ffffffff81224466>] acpi_hibernation_pre_snapshot+0xd/0x1e [ 67.703728] [<ffffffff81071b80>] hibernation_snapshot+0xaf/0x258 [ 67.703731] [<ffffffff81074dca>] snapshot_ioctl+0x25c/0x547 [ 67.703733] [<ffffffff81056efc>] ? __srcu_read_unlock+0x3b/0x57 [ 67.703736] [<ffffffff810e7f7d>] vfs_ioctl+0x31/0xa2 [ 67.703739] [<ffffffff810e88dc>] do_vfs_ioctl+0x47c/0x4af [ 67.703741] [<ffffffff8125ee3c>] ? n_tty_write+0x0/0x35e [ 67.703744] [<ffffffff8100203a>] ? sysret_check+0x2e/0x69 [ 67.703747] [<ffffffff810e8960>] sys_ioctl+0x51/0x75 [ 67.703750] [<ffffffff81002002>] system_call_fastpath+0x16/0x1b Please kindly review the patch: Avoid using smp_processor_id in touch_softlockup_watchdog and touch_nmi_watchdog. Patch also "removes" second call to smp_processor_id in __touch_watchdog (smp_processor_id itself and smp_processor_id in __get_cpu_var). --- diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 613bc1f..8822f1e 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -116,13 +116,14 @@ static unsigned long get_sample_period(void) static void __touch_watchdog(void) { int this_cpu = smp_processor_id(); - - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); } void touch_softlockup_watchdog(void) { - __get_cpu_var(watchdog_touch_ts) = 0; + int this_cpu = get_cpu(); + per_cpu(watchdog_touch_ts, this_cpu) = 0; + put_cpu(); } EXPORT_SYMBOL(touch_softlockup_watchdog); @@ -142,7 +143,9 @@ void touch_all_softlockup_watchdogs(void) #ifdef CONFIG_HARDLOCKUP_DETECTOR void touch_nmi_watchdog(void) { - __get_cpu_var(watchdog_nmi_touch) = true; + int this_cpu = get_cpu(); + per_cpu(watchdog_nmi_touch, this_cpu) = true; + put_cpu(); touch_softlockup_watchdog(); } EXPORT_SYMBOL(touch_nmi_watchdog); ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-13 10:21 fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog Sergey Senozhatsky @ 2010-08-16 8:22 ` Peter Zijlstra 2010-08-16 13:34 ` Don Zickus 2010-08-18 19:33 ` Andrew Morton 1 sibling, 1 reply; 41+ messages in thread From: Peter Zijlstra @ 2010-08-16 8:22 UTC (permalink / raw) To: Sergey Senozhatsky; +Cc: Andrew Morton, Ingo Molnar, linux-kernel, Don Zickus On Fri, 2010-08-13 at 13:21 +0300, Sergey Senozhatsky wrote: > [ 67.703556] BUG: using smp_processor_id() in preemptible [00000000] code: s2disk/5139 > [ 67.703563] caller is touch_nmi_watchdog+0x15/0x2c > [ 67.703566] Pid: 5139, comm: s2disk Not tainted 2.6.36-rc0-git12-07921-g60bf26a-dirty #116 > [ 67.703568] Call Trace: > [ 67.703575] [<ffffffff811f6bf1>] debug_smp_processor_id+0xc9/0xe4 > [ 67.703578] [<ffffffff81092766>] touch_nmi_watchdog+0x15/0x2c > [ 67.703584] [<ffffffff81222950>] acpi_os_stall+0x34/0x40 > [ 67.703589] [<ffffffff812398d2>] acpi_ex_system_do_stall+0x34/0x38 Which could mean two things, either ACPI got funny on us, or Don's new watchdog stuff has a hole in it. > --- > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 613bc1f..8822f1e 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -116,13 +116,14 @@ static unsigned long get_sample_period(void) > static void __touch_watchdog(void) > { > int this_cpu = smp_processor_id(); > - > - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); > + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); > } That change seems sensible enough.. > void touch_softlockup_watchdog(void) > { > - __get_cpu_var(watchdog_touch_ts) = 0; > + int this_cpu = get_cpu(); > + per_cpu(watchdog_touch_ts, this_cpu) = 0; > + put_cpu(); > } > EXPORT_SYMBOL(touch_softlockup_watchdog); > > @@ -142,7 +143,9 @@ void touch_all_softlockup_watchdogs(void) > #ifdef CONFIG_HARDLOCKUP_DETECTOR > void touch_nmi_watchdog(void) > { > - __get_cpu_var(watchdog_nmi_touch) = true; > + int this_cpu = get_cpu(); > + per_cpu(watchdog_nmi_touch, this_cpu) = true; > + put_cpu(); > touch_softlockup_watchdog(); > } > EXPORT_SYMBOL(touch_nmi_watchdog); These other two really are about assumptions we make on the call sites, which at the very least are violated by ACPI. Don/Ingo, remember if we require touch_*_watchdog callers to have preemption disabled? Or is the proposed patch sensible? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-16 8:22 ` Peter Zijlstra @ 2010-08-16 13:34 ` Don Zickus 2010-08-16 13:46 ` Peter Zijlstra 2010-08-16 14:06 ` Yong Zhang 0 siblings, 2 replies; 41+ messages in thread From: Don Zickus @ 2010-08-16 13:34 UTC (permalink / raw) To: Peter Zijlstra Cc: Sergey Senozhatsky, Andrew Morton, Ingo Molnar, linux-kernel On Mon, Aug 16, 2010 at 10:22:50AM +0200, Peter Zijlstra wrote: > On Fri, 2010-08-13 at 13:21 +0300, Sergey Senozhatsky wrote: > > > [ 67.703556] BUG: using smp_processor_id() in preemptible [00000000] code: s2disk/5139 > > [ 67.703563] caller is touch_nmi_watchdog+0x15/0x2c > > [ 67.703566] Pid: 5139, comm: s2disk Not tainted 2.6.36-rc0-git12-07921-g60bf26a-dirty #116 > > [ 67.703568] Call Trace: > > [ 67.703575] [<ffffffff811f6bf1>] debug_smp_processor_id+0xc9/0xe4 > > [ 67.703578] [<ffffffff81092766>] touch_nmi_watchdog+0x15/0x2c > > [ 67.703584] [<ffffffff81222950>] acpi_os_stall+0x34/0x40 > > [ 67.703589] [<ffffffff812398d2>] acpi_ex_system_do_stall+0x34/0x38 > > Which could mean two things, either ACPI got funny on us, or Don's new > watchdog stuff has a hole in it. it could. :-) > > > > --- > > > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > > index 613bc1f..8822f1e 100644 > > --- a/kernel/watchdog.c > > +++ b/kernel/watchdog.c > > @@ -116,13 +116,14 @@ static unsigned long get_sample_period(void) > > static void __touch_watchdog(void) > > { > > int this_cpu = smp_processor_id(); > > - > > - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); > > + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); > > } > > That change seems sensible enough.. ok. > > > void touch_softlockup_watchdog(void) > > { > > - __get_cpu_var(watchdog_touch_ts) = 0; > > + int this_cpu = get_cpu(); > > + per_cpu(watchdog_touch_ts, this_cpu) = 0; > > + put_cpu(); > > } > > EXPORT_SYMBOL(touch_softlockup_watchdog); > > > > @@ -142,7 +143,9 @@ void touch_all_softlockup_watchdogs(void) > > #ifdef CONFIG_HARDLOCKUP_DETECTOR > > void touch_nmi_watchdog(void) > > { > > - __get_cpu_var(watchdog_nmi_touch) = true; > > + int this_cpu = get_cpu(); > > + per_cpu(watchdog_nmi_touch, this_cpu) = true; > > + put_cpu(); > > touch_softlockup_watchdog(); > > } > > EXPORT_SYMBOL(touch_nmi_watchdog); > > These other two really are about assumptions we make on the call sites, > which at the very least are violated by ACPI. > > Don/Ingo, remember if we require touch_*_watchdog callers to have > preemption disabled? Or is the proposed patch sensible? I don't recall any requirement to have preemption disabled when using those functions. It seems sensible to put it in the touch_{softlockup|nmi}_watchdog code. I assume the reason for having preemption disabled when using smp_processor_id() is that the code could migrate to another cpu when rescheduled? I don't see a problem with the patch, but my low level understanding of the __get_cpu_var vs. per_cpu isn't very strong. Cheers, Don > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-16 13:34 ` Don Zickus @ 2010-08-16 13:46 ` Peter Zijlstra 2010-08-16 14:08 ` [PATCH] fix BUG " Sergey Senozhatsky 2010-08-16 14:12 ` fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog Don Zickus 2010-08-16 14:06 ` Yong Zhang 1 sibling, 2 replies; 41+ messages in thread From: Peter Zijlstra @ 2010-08-16 13:46 UTC (permalink / raw) To: Don Zickus; +Cc: Sergey Senozhatsky, Andrew Morton, Ingo Molnar, linux-kernel On Mon, 2010-08-16 at 09:34 -0400, Don Zickus wrote: > > Don/Ingo, remember if we require touch_*_watchdog callers to have > > preemption disabled? Or is the proposed patch sensible? > > I don't recall any requirement to have preemption disabled when using > those functions. It seems sensible to put it in the > touch_{softlockup|nmi}_watchdog code. OK, in that case the patch looks sensible. > I assume the reason for having preemption disabled when using > smp_processor_id() is that the code could migrate to another cpu when > rescheduled? Right, if you can freely schedule, you can get migrated, which means you can get migrated between having determined the return value and using it, at which point the computed value is meaningless. > I don't see a problem with the patch, but my low level understanding of > the __get_cpu_var vs. per_cpu isn't very strong. __get_cpu_var() gets you the value on the current cpu, per_cpu() takes a cpu argument. ^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-16 13:46 ` Peter Zijlstra @ 2010-08-16 14:08 ` Sergey Senozhatsky 2010-08-16 14:30 ` Don Zickus 2010-08-17 2:59 ` Frederic Weisbecker 2010-08-16 14:12 ` fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog Don Zickus 1 sibling, 2 replies; 41+ messages in thread From: Sergey Senozhatsky @ 2010-08-16 14:08 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Don Zickus, Andrew Morton, Ingo Molnar, linux-kernel Fix: acpi_os_stall calls touch_nmi_watchdog and touch_softlockup_watchdog with preemption enabled causing 'BUG: using smp_processor_id() in preemptible code'. Patch also removes double smp_processor_id call (smp_processor_id itself and in __get_cpu_var) in __touch_watchdog. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> --- diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 613bc1f..8822f1e 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -116,13 +116,14 @@ static unsigned long get_sample_period(void) static void __touch_watchdog(void) { int this_cpu = smp_processor_id(); - - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); } void touch_softlockup_watchdog(void) { - __get_cpu_var(watchdog_touch_ts) = 0; + int this_cpu = get_cpu(); + per_cpu(watchdog_touch_ts, this_cpu) = 0; + put_cpu(); } EXPORT_SYMBOL(touch_softlockup_watchdog); @@ -142,7 +143,9 @@ void touch_all_softlockup_watchdogs(void) #ifdef CONFIG_HARDLOCKUP_DETECTOR void touch_nmi_watchdog(void) { - __get_cpu_var(watchdog_nmi_touch) = true; + int this_cpu = get_cpu(); + per_cpu(watchdog_nmi_touch, this_cpu) = true; + put_cpu(); touch_softlockup_watchdog(); } EXPORT_SYMBOL(touch_nmi_watchdog); ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-16 14:08 ` [PATCH] fix BUG " Sergey Senozhatsky @ 2010-08-16 14:30 ` Don Zickus 2010-08-17 4:27 ` Yong Zhang 2010-08-17 2:59 ` Frederic Weisbecker 1 sibling, 1 reply; 41+ messages in thread From: Don Zickus @ 2010-08-16 14:30 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Peter Zijlstra, Andrew Morton, Ingo Molnar, linux-kernel, fweisbec cc'ing Frederic On Mon, Aug 16, 2010 at 05:08:29PM +0300, Sergey Senozhatsky wrote: > Fix: acpi_os_stall calls touch_nmi_watchdog and touch_softlockup_watchdog > with preemption enabled causing 'BUG: using smp_processor_id() in preemptible > code'. > > Patch also removes double smp_processor_id call (smp_processor_id itself and in > __get_cpu_var) in __touch_watchdog. > > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Don Zickus <dzickus@redhat.com> > > --- > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 613bc1f..8822f1e 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -116,13 +116,14 @@ static unsigned long get_sample_period(void) > static void __touch_watchdog(void) > { > int this_cpu = smp_processor_id(); > - > - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); > + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); > } > > void touch_softlockup_watchdog(void) > { > - __get_cpu_var(watchdog_touch_ts) = 0; > + int this_cpu = get_cpu(); > + per_cpu(watchdog_touch_ts, this_cpu) = 0; > + put_cpu(); > } > EXPORT_SYMBOL(touch_softlockup_watchdog); > > @@ -142,7 +143,9 @@ void touch_all_softlockup_watchdogs(void) > #ifdef CONFIG_HARDLOCKUP_DETECTOR > void touch_nmi_watchdog(void) > { > - __get_cpu_var(watchdog_nmi_touch) = true; > + int this_cpu = get_cpu(); > + per_cpu(watchdog_nmi_touch, this_cpu) = true; > + put_cpu(); > touch_softlockup_watchdog(); > } > EXPORT_SYMBOL(touch_nmi_watchdog); > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-16 14:30 ` Don Zickus @ 2010-08-17 4:27 ` Yong Zhang 0 siblings, 0 replies; 41+ messages in thread From: Yong Zhang @ 2010-08-17 4:27 UTC (permalink / raw) To: Don Zickus Cc: Sergey Senozhatsky, Peter Zijlstra, Andrew Morton, Ingo Molnar, linux-kernel, fweisbec On Mon, Aug 16, 2010 at 10:30 PM, Don Zickus <dzickus@redhat.com> wrote: >> Patch also removes double smp_processor_id call (smp_processor_id itself and in >> __get_cpu_var) in __touch_watchdog. After checking touch_softlockup_watchdog() and touch_nmi_watchdog() in before version, __raw_get_cpu_var() is used there. Thanks, Yong >> >> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> > > Acked-by: Don Zickus <dzickus@redhat.com> > >> >> --- >> >> diff --git a/kernel/watchdog.c b/kernel/watchdog.c >> index 613bc1f..8822f1e 100644 >> --- a/kernel/watchdog.c >> +++ b/kernel/watchdog.c >> @@ -116,13 +116,14 @@ static unsigned long get_sample_period(void) >> static void __touch_watchdog(void) >> { >> int this_cpu = smp_processor_id(); >> - >> - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); >> + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); >> } >> >> void touch_softlockup_watchdog(void) >> { >> - __get_cpu_var(watchdog_touch_ts) = 0; >> + int this_cpu = get_cpu(); >> + per_cpu(watchdog_touch_ts, this_cpu) = 0; >> + put_cpu(); >> } >> EXPORT_SYMBOL(touch_softlockup_watchdog); >> >> @@ -142,7 +143,9 @@ void touch_all_softlockup_watchdogs(void) >> #ifdef CONFIG_HARDLOCKUP_DETECTOR >> void touch_nmi_watchdog(void) >> { >> - __get_cpu_var(watchdog_nmi_touch) = true; >> + int this_cpu = get_cpu(); >> + per_cpu(watchdog_nmi_touch, this_cpu) = true; >> + put_cpu(); >> touch_softlockup_watchdog(); >> } >> EXPORT_SYMBOL(touch_nmi_watchdog); >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-16 14:08 ` [PATCH] fix BUG " Sergey Senozhatsky 2010-08-16 14:30 ` Don Zickus @ 2010-08-17 2:59 ` Frederic Weisbecker 2010-08-17 3:16 ` Yong Zhang 2010-08-17 7:56 ` [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog (v2) Sergey Senozhatsky 1 sibling, 2 replies; 41+ messages in thread From: Frederic Weisbecker @ 2010-08-17 2:59 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Peter Zijlstra, Don Zickus, Andrew Morton, Ingo Molnar, linux-kernel On Mon, Aug 16, 2010 at 05:08:29PM +0300, Sergey Senozhatsky wrote: > void touch_softlockup_watchdog(void) > { > - __get_cpu_var(watchdog_touch_ts) = 0; > + int this_cpu = get_cpu(); > + per_cpu(watchdog_touch_ts, this_cpu) = 0; > + put_cpu(); > } If preemption is disabled and you deal with the current cpu, then please use __get_cpu_var, it makes the code more readable: void touch_softlockup_watchdog(void) { preempt_disable(); __get_cpu_var(watchdog_touch_ts) = 0; preempt_enable(); } Same below. Thanks. > EXPORT_SYMBOL(touch_softlockup_watchdog); > > @@ -142,7 +143,9 @@ void touch_all_softlockup_watchdogs(void) > #ifdef CONFIG_HARDLOCKUP_DETECTOR > void touch_nmi_watchdog(void) > { > - __get_cpu_var(watchdog_nmi_touch) = true; > + int this_cpu = get_cpu(); > + per_cpu(watchdog_nmi_touch, this_cpu) = true; > + put_cpu(); > touch_softlockup_watchdog(); > } > EXPORT_SYMBOL(touch_nmi_watchdog); > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-17 2:59 ` Frederic Weisbecker @ 2010-08-17 3:16 ` Yong Zhang 2010-08-17 8:39 ` Sergey Senozhatsky 2010-08-17 7:56 ` [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog (v2) Sergey Senozhatsky 1 sibling, 1 reply; 41+ messages in thread From: Yong Zhang @ 2010-08-17 3:16 UTC (permalink / raw) To: Frederic Weisbecker Cc: Sergey Senozhatsky, Peter Zijlstra, Don Zickus, Andrew Morton, Ingo Molnar, linux-kernel On Tue, Aug 17, 2010 at 10:59 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote: > If preemption is disabled and you deal with the current cpu, > then please use __get_cpu_var, it makes the code more > readable: > > > void touch_softlockup_watchdog(void) > { > preempt_disable(); > __(watchdog_touch_ts) = 0; > preempt_enable(); > } Why not use __raw_get_cpu_var() instead? You know adding preempt protection in touch_softlockup_watchdog() just suppress the warning. Am I missing something? Thanks, Yong ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-17 3:16 ` Yong Zhang @ 2010-08-17 8:39 ` Sergey Senozhatsky 2010-08-17 9:05 ` Yong Zhang 0 siblings, 1 reply; 41+ messages in thread From: Sergey Senozhatsky @ 2010-08-17 8:39 UTC (permalink / raw) To: Yong Zhang Cc: Frederic Weisbecker, Peter Zijlstra, Don Zickus, Andrew Morton, Ingo Molnar, linux-kernel [-- Attachment #1: Type: text/plain, Size: 923 bytes --] Hello, On (08/17/10 11:16), Yong Zhang wrote: > On Tue, Aug 17, 2010 at 10:59 AM, Frederic Weisbecker > <fweisbec@gmail.com> wrote: > > If preemption is disabled and you deal with the current cpu, > > then please use __get_cpu_var, it makes the code more > > readable: > > > > > > void touch_softlockup_watchdog(void) > > { > > preempt_disable(); > > __(watchdog_touch_ts) = 0; > > preempt_enable(); > > } > > Why not use __raw_get_cpu_var() instead? > You know adding preempt protection in touch_softlockup_watchdog() > just suppress the warning. Am I missing something? > Sorry, my low level understanding of the __raw_get_cpu_var isn't very strong. I assume it uses current_thread_info()->cpu in some cases (right?) or percpu_from_op. Should it be acpi_os_stall preepmt_disable touch_nmi_watchdog touch_softlockup_watchdog preempt_enable ? Sergey [-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-17 8:39 ` Sergey Senozhatsky @ 2010-08-17 9:05 ` Yong Zhang 2010-08-17 9:24 ` Sergey Senozhatsky 0 siblings, 1 reply; 41+ messages in thread From: Yong Zhang @ 2010-08-17 9:05 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Frederic Weisbecker, Peter Zijlstra, Don Zickus, Andrew Morton, Ingo Molnar, linux-kernel On Tue, Aug 17, 2010 at 4:39 PM, Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote: >> Why not use __raw_get_cpu_var() instead? >> You know adding preempt protection in touch_softlockup_watchdog() >> just suppress the warning. Am I missing something? >> > > Sorry, my low level understanding of the __raw_get_cpu_var isn't very strong. > I assume it uses current_thread_info()->cpu in some cases (right?) or > percpu_from_op. The difference is __raw_get_cpu_var() is using raw_smp_processor_id(). > > > Should it be > acpi_os_stall > preepmt_disable > touch_nmi_watchdog > touch_softlockup_watchdog > preempt_enable Actually I don't think this is helpful for the whole function. Because if acpi_os_stall() migrate(I don't know if it could) to another CPU just before preepmt_disable(), we'll be on the wrong way. Adding preempt protection is just smoothing the warning. So I prefer using __raw_get_cpu_var() as what we have been done before. Thanks, Yong ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-17 9:05 ` Yong Zhang @ 2010-08-17 9:24 ` Sergey Senozhatsky 2010-08-17 9:37 ` Yong Zhang 0 siblings, 1 reply; 41+ messages in thread From: Sergey Senozhatsky @ 2010-08-17 9:24 UTC (permalink / raw) To: Yong Zhang Cc: Sergey Senozhatsky, Frederic Weisbecker, Peter Zijlstra, Don Zickus, Andrew Morton, Ingo Molnar, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1491 bytes --] On (08/17/10 17:05), Yong Zhang wrote: > >> Why not use __raw_get_cpu_var() instead? > >> You know adding preempt protection in touch_softlockup_watchdog() > >> just suppress the warning. Am I missing something? > >> > > > > Sorry, my low level understanding of the __raw_get_cpu_var isn't very strong. > > I assume it uses current_thread_info()->cpu in some cases (right?) or > > percpu_from_op. > > The difference is __raw_get_cpu_var() is using raw_smp_processor_id(). > > > > > > > Should it be > > acpi_os_stall > > preepmt_disable > > touch_nmi_watchdog > > touch_softlockup_watchdog > > preempt_enable > > Actually I don't think this is helpful for the whole function. Because > if acpi_os_stall() > migrate(I don't know if it could) to another CPU just before > preepmt_disable(), we'll > be on the wrong way. Adding preempt protection is just smoothing the warning. > OK. Suppose (I don't know if it could) migration has happen acpi_os_stall __migration__ touch_nmi_watchdog How calling raw_smp_processor_id() (which is current_thread_info()->cpu) vs. preepmt_disable - smp_processor_id() will give us different CPUs? > So I prefer using __raw_get_cpu_var() as what we have been done before. > Hm... 26e09c6eee14f4827b55137ba0eedc4e77cd50ab static void __touch_watchdog(void) { - int this_cpu = raw_smp_processor_id(); + int this_cpu = smp_processor_id(); Sergey [-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-17 9:24 ` Sergey Senozhatsky @ 2010-08-17 9:37 ` Yong Zhang 2010-08-17 10:28 ` Sergey Senozhatsky 2010-08-17 10:39 ` Sergey Senozhatsky 0 siblings, 2 replies; 41+ messages in thread From: Yong Zhang @ 2010-08-17 9:37 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Frederic Weisbecker, Peter Zijlstra, Don Zickus, Andrew Morton, Ingo Molnar, linux-kernel On Tue, Aug 17, 2010 at 5:24 PM, Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote: > OK. Suppose (I don't know if it could) migration has happen > > acpi_os_stall > __migration__ > touch_nmi_watchdog > > How calling raw_smp_processor_id() (which is current_thread_info()->cpu) > vs. preepmt_disable - smp_processor_id() will give us different CPUs? I don't mean you will get different CPUS(sorry for my poor english). I mean if the migration could happen, you want to touch_nmi_watchdog() on CPU A(otherwise the watchdog will shout on us), but eventually we touch_nmi_watchdog() on CPU B(because of migration), and this is not what we want. So preempt_disable() is redundant here. > >> So I prefer using __raw_get_cpu_var() as what we have been done before. >> > > Hm... > > 26e09c6eee14f4827b55137ba0eedc4e77cd50ab f69bcf60c3f17aa367e16eef7bc6ab001ea6d58a 2508ce1845a3b256798532b2c6b7997c2dc6533b you can get the previous touch_*_watchdog there. Thanks, Yong ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-17 9:37 ` Yong Zhang @ 2010-08-17 10:28 ` Sergey Senozhatsky 2010-08-17 12:48 ` Yong Zhang 2010-08-17 10:39 ` Sergey Senozhatsky 1 sibling, 1 reply; 41+ messages in thread From: Sergey Senozhatsky @ 2010-08-17 10:28 UTC (permalink / raw) To: Yong Zhang Cc: Sergey Senozhatsky, Frederic Weisbecker, Peter Zijlstra, Don Zickus, Andrew Morton, Ingo Molnar, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1364 bytes --] On (08/17/10 17:37), Yong Zhang wrote: > On Tue, Aug 17, 2010 at 5:24 PM, Sergey Senozhatsky > <sergey.senozhatsky@gmail.com> wrote: > > OK. Suppose (I don't know if it could) migration has happen > > > > acpi_os_stall > > __migration__ > > touch_nmi_watchdog > > > > How calling raw_smp_processor_id() (which is current_thread_info()->cpu) > > vs. preepmt_disable - smp_processor_id() will give us different CPUs? > > I don't mean you will get different CPUS(sorry for my poor english). > I mean if the migration could happen, you want to touch_nmi_watchdog() > on CPU A(otherwise the watchdog will shout on us), but eventually we > touch_nmi_watchdog() on CPU B(because of migration), > and this is not what we want. > > So preempt_disable() is redundant here. > Shouldn't we be for sure not preepmtible when calling __raw_get_cpu_var? preempt_disable is reduntant here because current_thread_info()->cpu is atomic and we just don't want preempt_(enable|disable) overhead? Sergey > > > >> So I prefer using __raw_get_cpu_var() as what we have been done before. > >> > > > > Hm... > > > > 26e09c6eee14f4827b55137ba0eedc4e77cd50ab > > f69bcf60c3f17aa367e16eef7bc6ab001ea6d58a > 2508ce1845a3b256798532b2c6b7997c2dc6533b > > you can get the previous touch_*_watchdog there. > > Thanks, > Yong > [-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-17 10:28 ` Sergey Senozhatsky @ 2010-08-17 12:48 ` Yong Zhang 0 siblings, 0 replies; 41+ messages in thread From: Yong Zhang @ 2010-08-17 12:48 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Frederic Weisbecker, Peter Zijlstra, Don Zickus, Andrew Morton, Ingo Molnar, linux-kernel On Tue, Aug 17, 2010 at 01:28:19PM +0300, Sergey Senozhatsky wrote: > > So preempt_disable() is redundant here. > > > > Shouldn't we be for sure not preepmtible when calling __raw_get_cpu_var? IMHO, it's the caller's responsibility. > > preempt_disable is reduntant here because current_thread_info()->cpu is > atomic and we just don't want preempt_(enable|disable) overhead? Yep. Thanks, Yong ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-17 9:37 ` Yong Zhang 2010-08-17 10:28 ` Sergey Senozhatsky @ 2010-08-17 10:39 ` Sergey Senozhatsky 2010-08-17 12:56 ` Yong Zhang 2010-08-17 13:13 ` Don Zickus 1 sibling, 2 replies; 41+ messages in thread From: Sergey Senozhatsky @ 2010-08-17 10:39 UTC (permalink / raw) To: Yong Zhang Cc: Frederic Weisbecker, Peter Zijlstra, Don Zickus, Andrew Morton, Ingo Molnar, linux-kernel Please kindly review. --- diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 613bc1f..22dd388 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -116,13 +116,12 @@ static unsigned long get_sample_period(void) static void __touch_watchdog(void) { int this_cpu = smp_processor_id(); - - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); } void touch_softlockup_watchdog(void) { - __get_cpu_var(watchdog_touch_ts) = 0; + __raw_get_cpu_var(watchdog_touch_ts) = 0; } EXPORT_SYMBOL(touch_softlockup_watchdog); @@ -142,7 +141,7 @@ void touch_all_softlockup_watchdogs(void) #ifdef CONFIG_HARDLOCKUP_DETECTOR void touch_nmi_watchdog(void) { - __get_cpu_var(watchdog_nmi_touch) = true; + __raw_get_cpu_var(watchdog_nmi_touch) = true; touch_softlockup_watchdog(); } EXPORT_SYMBOL(touch_nmi_watchdog); ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-17 10:39 ` Sergey Senozhatsky @ 2010-08-17 12:56 ` Yong Zhang 2010-08-17 13:13 ` Don Zickus 1 sibling, 0 replies; 41+ messages in thread From: Yong Zhang @ 2010-08-17 12:56 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Frederic Weisbecker, Peter Zijlstra, Don Zickus, Andrew Morton, Ingo Molnar, linux-kernel On Tue, Aug 17, 2010 at 01:39:48PM +0300, Sergey Senozhatsky wrote: > Please kindly review. > > --- > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 613bc1f..22dd388 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -116,13 +116,12 @@ static unsigned long get_sample_period(void) > static void __touch_watchdog(void) > { > int this_cpu = smp_processor_id(); > - > - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); > + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); > } The two caller of __touch_watchdog() is: 1)watchdog_timer_fn(): it's preempt disabled when called. 2)watchdog(): it's bound to one cpu. Then means using smp_processor_id() safely. So I think this change is needless, but anyway it's harmless. Below looks fine to me. But you still need comments from others. Thanks, Yong > > void touch_softlockup_watchdog(void) > { > - __get_cpu_var(watchdog_touch_ts) = 0; > + __raw_get_cpu_var(watchdog_touch_ts) = 0; > } > EXPORT_SYMBOL(touch_softlockup_watchdog); > > @@ -142,7 +141,7 @@ void touch_all_softlockup_watchdogs(void) > #ifdef CONFIG_HARDLOCKUP_DETECTOR > void touch_nmi_watchdog(void) > { > - __get_cpu_var(watchdog_nmi_touch) = true; > + __raw_get_cpu_var(watchdog_nmi_touch) = true; > touch_softlockup_watchdog(); > } > EXPORT_SYMBOL(touch_nmi_watchdog); ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-17 10:39 ` Sergey Senozhatsky 2010-08-17 12:56 ` Yong Zhang @ 2010-08-17 13:13 ` Don Zickus 2010-08-18 2:48 ` Frederic Weisbecker 1 sibling, 1 reply; 41+ messages in thread From: Don Zickus @ 2010-08-17 13:13 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Yong Zhang, Frederic Weisbecker, Peter Zijlstra, Andrew Morton, Ingo Molnar, linux-kernel On Tue, Aug 17, 2010 at 01:39:48PM +0300, Sergey Senozhatsky wrote: > Please kindly review. I don't have a deep enough understanding of the subtleties between per_cpu, __get_cpu_var, and __raw_get_cpu_var to really say which is correct. To me, all three versions of your patch look they do the same thing. Technically, it seems like preempt_disable/enable would be the correct thing to do. But as someone pointed out earlier, if the code is preempted and switches cpu, then the touch_*_watchdog effectively becomes a no-op (which I guess it can do even with the preempt_disable/enable surrounding it). So I have no idea. I am going to wait for smarter people than me to provide an opinion. :-) Cheers, Don > > --- > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 613bc1f..22dd388 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -116,13 +116,12 @@ static unsigned long get_sample_period(void) > static void __touch_watchdog(void) > { > int this_cpu = smp_processor_id(); > - > - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); > + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); > } > > void touch_softlockup_watchdog(void) > { > - __get_cpu_var(watchdog_touch_ts) = 0; > + __raw_get_cpu_var(watchdog_touch_ts) = 0; > } > EXPORT_SYMBOL(touch_softlockup_watchdog); > > @@ -142,7 +141,7 @@ void touch_all_softlockup_watchdogs(void) > #ifdef CONFIG_HARDLOCKUP_DETECTOR > void touch_nmi_watchdog(void) > { > - __get_cpu_var(watchdog_nmi_touch) = true; > + __raw_get_cpu_var(watchdog_nmi_touch) = true; > touch_softlockup_watchdog(); > } > EXPORT_SYMBOL(touch_nmi_watchdog); > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-17 13:13 ` Don Zickus @ 2010-08-18 2:48 ` Frederic Weisbecker 2010-08-18 20:01 ` Andrew Morton 0 siblings, 1 reply; 41+ messages in thread From: Frederic Weisbecker @ 2010-08-18 2:48 UTC (permalink / raw) To: Don Zickus, Len Brown Cc: Sergey Senozhatsky, Yong Zhang, Peter Zijlstra, Andrew Morton, Ingo Molnar, linux-kernel On Tue, Aug 17, 2010 at 09:13:20AM -0400, Don Zickus wrote: > On Tue, Aug 17, 2010 at 01:39:48PM +0300, Sergey Senozhatsky wrote: > > Please kindly review. > > I don't have a deep enough understanding of the subtleties between > per_cpu, __get_cpu_var, and __raw_get_cpu_var to really say which is > correct. To me, all three versions of your patch look they do the same > thing. > > Technically, it seems like preempt_disable/enable would be the correct > thing to do. But as someone pointed out earlier, if the code is preempted > and switches cpu, then the touch_*_watchdog effectively becomes a no-op > (which I guess it can do even with the preempt_disable/enable surrounding > it). So I have no idea. I am going to wait for smarter people than me to > provide an opinion. :-) > > Cheers, > Don (Adding Len Brown in Cc. Len, this is about acpi_os_stall() that touches the watchdog while running in a preemptable section, this triggers warnings because of the use of local cpu accessors. We are debating about the appropriate way to solve this). The more I think about it, the more I think that doesn't make sense to have touch_nmi_watchdog() callable from preemptable code. It is buggy by nature. If you run in a preemptable section, then interrupts can fire, and if they can, the nmi watchdog is fine and doesn't need to be touched. Here the problem is more in the softlockup watchdog, because even if you run in a preemptable section, if you run a !CONFIG_PREEMPT kernel, then you can't be preempted and the watchdog won't be scheduled until the udelay loop finishes. But to solve that you would need cond_resched() calls, not touching the watchdog. Because touching the softlockup watchdog doesn't make sense either if you can migrate: you can run the udelay on CPU 0, then migrate on CPU 1 and call touch_softlockup_watchdog() from there. Which makes definetely no sense. This is buggy. And because we want to avoid such buggy uses of the touch_whatever_watchdog() APIs, these function must continue to check they are called from non-preemptable code. Randomly touching the watchdog could hide real lockups to the user. The problem is on the caller. Considering such udelays loop: * if it's in a irq disabled section, call touch_nmi_watchdog(), because this could prevent the nmi watchdog irq from firing * if it's in a non-preemptable section, call touch_softlockup_watchdog(), because this could prevent the softlockup watchdog task from beeing scheduled * if it's from a preemptable task context, this should call cond_resched() to avoid huge latencies on !CONFIG_PREEMPT But acpi_os_stall() seem to be called from 4 different places, and these places may run in different context like the above described. The ACPI code should probably use more specific busy-loop APIs, depending on the context it runs. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-18 2:48 ` Frederic Weisbecker @ 2010-08-18 20:01 ` Andrew Morton 2010-08-19 2:27 ` Don Zickus 2010-08-20 2:57 ` Don Zickus 0 siblings, 2 replies; 41+ messages in thread From: Andrew Morton @ 2010-08-18 20:01 UTC (permalink / raw) To: Frederic Weisbecker Cc: Don Zickus, Len Brown, Sergey Senozhatsky, Yong Zhang, Peter Zijlstra, Ingo Molnar, linux-kernel, linux-acpi, Andy Grover On Wed, 18 Aug 2010 04:48:05 +0200 Frederic Weisbecker <fweisbec@gmail.com> wrote: > On Tue, Aug 17, 2010 at 09:13:20AM -0400, Don Zickus wrote: > > On Tue, Aug 17, 2010 at 01:39:48PM +0300, Sergey Senozhatsky wrote: > > > Please kindly review. > > > > I don't have a deep enough understanding of the subtleties between > > per_cpu, __get_cpu_var, and __raw_get_cpu_var to really say which is > > correct. To me, all three versions of your patch look they do the same > > thing. > > > > Technically, it seems like preempt_disable/enable would be the correct > > thing to do. But as someone pointed out earlier, if the code is preempted > > and switches cpu, then the touch_*_watchdog effectively becomes a no-op > > (which I guess it can do even with the preempt_disable/enable surrounding > > it). So I have no idea. I am going to wait for smarter people than me to > > provide an opinion. :-) > > > > Cheers, > > Don > > > (Adding Len Brown in Cc. I'm not sure who looks after osl.c. I added linux-acpi to cc. > Len, this is about acpi_os_stall() that touches the watchdog while > running in a preemptable section, this triggers warnings because of > the use of local cpu accessors. We are debating about the appropriate > way to solve this). > > The more I think about it, the more I think that doesn't make sense > to have touch_nmi_watchdog() callable from preemptable code. > > It is buggy by nature. > > If you run in a preemptable section, then interrupts can fire, and if > they can, the nmi watchdog is fine and doesn't need to be touched. > > Here the problem is more in the softlockup watchdog, because even if you > run in a preemptable section, if you run a !CONFIG_PREEMPT kernel, then > you can't be preempted and the watchdog won't be scheduled until the > udelay loop finishes. But to solve that you would need cond_resched() > calls, not touching the watchdog. > > Because touching the softlockup watchdog doesn't make sense either > if you can migrate: you can run the udelay on CPU 0, then migrate on > CPU 1 and call touch_softlockup_watchdog() from there. Which makes > definetely no sense. This is buggy. > > And because we want to avoid such buggy uses of the touch_whatever_watchdog() > APIs, these function must continue to check they are called from non-preemptable > code. Randomly touching the watchdog could hide real lockups to the user. > > The problem is on the caller. Considering such udelays loop: > > * if it's in a irq disabled section, call touch_nmi_watchdog(), because this > could prevent the nmi watchdog irq from firing > * if it's in a non-preemptable section, call touch_softlockup_watchdog(), because > this could prevent the softlockup watchdog task from beeing scheduled > * if it's from a preemptable task context, this should call cond_resched() to > avoid huge latencies on !CONFIG_PREEMPT > > > But acpi_os_stall() seem to be called from 4 different places, and these places > may run in different context like the above described. > > The ACPI code should probably use more specific busy-loop APIs, depending on the > context it runs. The touch_nmi_watchdog() was added to acpi_os_stall() by little old me in 2003. It was committed by Andy with the patch title "ACPI: Correctly handle NMI watchdog during long stalls (Andrew Morton)". My title was "ACPI poweroff trigers the NMI watchdog". My changelog was ACPI poweroff trigers the NMI watchdog. Fix. (My spelling has improved with age). So. If we remove that touch, will poweroff still trigger the NMI? Dunno. The surprise new requirement that touch_nmi_watchdog() be called from non-preemptible code does seem to make sense IMO. It's hard to see why anyone would be touching the watchdog unless he's spinning in irqs-off code. Except, of course, when we have a utility function which can be called from wither irqs-on or irqs-off: acpi_os_stall(). That being said, it's not good to introduce new API requirements by accident! An audit of all callers should first be performed, at least. The surprise new requirement that touch_softlockup_watchdog() be called from non-preemptible code doesn't make sense IMO. If I have a piece of code in the kernel which I expect to sit in TASK_UNINTERRUPTIBLE state for three minutes waiting for my egg to boil, I should be able to do that and I should be able to touch the softlockup detector without needing to go non-preemptible. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-18 20:01 ` Andrew Morton @ 2010-08-19 2:27 ` Don Zickus 2010-08-20 2:57 ` Don Zickus 1 sibling, 0 replies; 41+ messages in thread From: Don Zickus @ 2010-08-19 2:27 UTC (permalink / raw) To: Andrew Morton Cc: Frederic Weisbecker, Len Brown, Sergey Senozhatsky, Yong Zhang, Peter Zijlstra, Ingo Molnar, linux-kernel, linux-acpi, Andy Grover On Wed, Aug 18, 2010 at 01:01:56PM -0700, Andrew Morton wrote: > The surprise new requirement that touch_nmi_watchdog() be called from > non-preemptible code does seem to make sense IMO. It's hard to see why > anyone would be touching the watchdog unless he's spinning in irqs-off > code. Except, of course, when we have a utility function which can be > called from wither irqs-on or irqs-off: acpi_os_stall(). > > That being said, it's not good to introduce new API requirements by > accident! An audit of all callers should first be performed, at least. > > > The surprise new requirement that touch_softlockup_watchdog() be called > from non-preemptible code doesn't make sense IMO. If I have a piece of > code in the kernel which I expect to sit in TASK_UNINTERRUPTIBLE state > for three minutes waiting for my egg to boil, I should be able to do > that and I should be able to touch the softlockup detector without > needing to go non-preemptible. Wow. So after re-reading what the original touch_*_watchdog code did and what I copied to kernel/watchdog.c, I'm a little embarrassed on how I managed to mangle the internals of both those functions. While the idea is the same, the semantics are clearly different. touch_nmi_watchdog had a for_each_cpu_present loop, which means it didn't have to deal with the preempt issue. touch_softlockup_watchdog used __raw_get_cpu_var to excuse itself from dealing with the preempt issue. I'll put together a patch that brings those functions back in line with what they used to be. Sorry for the trouble. Cheers, Don ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-18 20:01 ` Andrew Morton 2010-08-19 2:27 ` Don Zickus @ 2010-08-20 2:57 ` Don Zickus 2010-08-20 3:42 ` Andrew Morton ` (2 more replies) 1 sibling, 3 replies; 41+ messages in thread From: Don Zickus @ 2010-08-20 2:57 UTC (permalink / raw) To: Andrew Morton Cc: Frederic Weisbecker, Len Brown, Sergey Senozhatsky, Yong Zhang, Peter Zijlstra, Ingo Molnar, linux-kernel, linux-acpi, Andy Grover On Wed, Aug 18, 2010 at 01:01:56PM -0700, Andrew Morton wrote: > The surprise new requirement that touch_nmi_watchdog() be called from > non-preemptible code does seem to make sense IMO. It's hard to see why > anyone would be touching the watchdog unless he's spinning in irqs-off > code. Except, of course, when we have a utility function which can be > called from wither irqs-on or irqs-off: acpi_os_stall(). > > That being said, it's not good to introduce new API requirements by > accident! An audit of all callers should first be performed, at least. > > > The surprise new requirement that touch_softlockup_watchdog() be called > from non-preemptible code doesn't make sense IMO. If I have a piece of > code in the kernel which I expect to sit in TASK_UNINTERRUPTIBLE state > for three minutes waiting for my egg to boil, I should be able to do > that and I should be able to touch the softlockup detector without > needing to go non-preemptible. Ok, so here is my patch that syncs the touch_*_watchdog back in line with the old semantics. Hopefully this will undo any harm I caused. ------------cut -->--------------------------- >From b372e821c804982438db090db6b4a2f753c78091 Mon Sep 17 00:00:00 2001 From: Don Zickus <dzickus@redhat.com> Date: Thu, 19 Aug 2010 22:48:26 -0400 Subject: [PATCH] [lockup detector] sync touch_*_watchdog back to old semantics During my rewrite, the semantics of touch_nmi_watchdog and touch_softlockup_watchdog changed enough to break some drivers (mostly over preemptable regions). This change brings those touch_*_watchdog functions back in line to how they used to work. Signed-off-by: Don Zickus <dzickus@redhat.com> --- kernel/watchdog.c | 17 ++++++++++++----- 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 613bc1f..99e35a2 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -122,7 +122,7 @@ static void __touch_watchdog(void) void touch_softlockup_watchdog(void) { - __get_cpu_var(watchdog_touch_ts) = 0; + __raw_get_cpu_var(watchdog_touch_ts) = 0; } EXPORT_SYMBOL(touch_softlockup_watchdog); @@ -142,7 +142,14 @@ void touch_all_softlockup_watchdogs(void) #ifdef CONFIG_HARDLOCKUP_DETECTOR void touch_nmi_watchdog(void) { - __get_cpu_var(watchdog_nmi_touch) = true; + if (watchdog_enabled) { + unsigned cpu; + + for_each_present_cpu(cpu) { + if (per_cpu(watchdog_nmi_touch, cpu) != true) + per_cpu(watchdog_nmi_touch, cpu) = true; + } + } touch_softlockup_watchdog(); } EXPORT_SYMBOL(touch_nmi_watchdog); @@ -430,6 +437,9 @@ static int watchdog_enable(int cpu) wake_up_process(p); } + /* if any cpu succeeds, watchdog is considered enabled for the system */ + watchdog_enabled = 1; + return 0; } @@ -452,9 +462,6 @@ static void watchdog_disable(int cpu) per_cpu(softlockup_watchdog, cpu) = NULL; kthread_stop(p); } - - /* if any cpu succeeds, watchdog is considered enabled for the system */ - watchdog_enabled = 1; } static void watchdog_enable_all_cpus(void) -- 1.7.2.1 ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-20 2:57 ` Don Zickus @ 2010-08-20 3:42 ` Andrew Morton 2010-08-20 12:34 ` Don Zickus 2010-08-26 17:17 ` acpi_os_stall() and touch_nmi_watchdog() (was Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog) Len Brown 2010-08-20 15:02 ` [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog Yong Zhang 2010-08-26 10:14 ` Maxim Levitsky 2 siblings, 2 replies; 41+ messages in thread From: Andrew Morton @ 2010-08-20 3:42 UTC (permalink / raw) To: Don Zickus Cc: Frederic Weisbecker, Len Brown, Sergey Senozhatsky, Yong Zhang, Peter Zijlstra, Ingo Molnar, linux-kernel, linux-acpi, Andy Grover, H. Peter Anvin On Thu, 19 Aug 2010 22:57:49 -0400 Don Zickus <dzickus@redhat.com> wrote: > On Wed, Aug 18, 2010 at 01:01:56PM -0700, Andrew Morton wrote: > > The surprise new requirement that touch_nmi_watchdog() be called from > > non-preemptible code does seem to make sense IMO. It's hard to see why > > anyone would be touching the watchdog unless he's spinning in irqs-off > > code. Except, of course, when we have a utility function which can be > > called from wither irqs-on or irqs-off: acpi_os_stall(). > > > > That being said, it's not good to introduce new API requirements by > > accident! An audit of all callers should first be performed, at least. > > > > > > The surprise new requirement that touch_softlockup_watchdog() be called > > from non-preemptible code doesn't make sense IMO. If I have a piece of > > code in the kernel which I expect to sit in TASK_UNINTERRUPTIBLE state > > for three minutes waiting for my egg to boil, I should be able to do > > that and I should be able to touch the softlockup detector without > > needing to go non-preemptible. > > Ok, so here is my patch that syncs the touch_*_watchdog back in line with > the old semantics. Hopefully this will undo any harm I caused. > > ------------cut -->--------------------------- > > >From b372e821c804982438db090db6b4a2f753c78091 Mon Sep 17 00:00:00 2001 > From: Don Zickus <dzickus@redhat.com> > Date: Thu, 19 Aug 2010 22:48:26 -0400 > Subject: [PATCH] [lockup detector] sync touch_*_watchdog back to old semantics > > During my rewrite, the semantics of touch_nmi_watchdog and > touch_softlockup_watchdog changed enough to break some drivers > (mostly over preemptable regions). > > This change brings those touch_*_watchdog functions back in line > to how they used to work. > > Signed-off-by: Don Zickus <dzickus@redhat.com> > --- > kernel/watchdog.c | 17 ++++++++++++----- > 1 files changed, 12 insertions(+), 5 deletions(-) > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 613bc1f..99e35a2 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -122,7 +122,7 @@ static void __touch_watchdog(void) > > void touch_softlockup_watchdog(void) > { > - __get_cpu_var(watchdog_touch_ts) = 0; > + __raw_get_cpu_var(watchdog_touch_ts) = 0; > } > EXPORT_SYMBOL(touch_softlockup_watchdog); > > @@ -142,7 +142,14 @@ void touch_all_softlockup_watchdogs(void) > #ifdef CONFIG_HARDLOCKUP_DETECTOR > void touch_nmi_watchdog(void) > { > - __get_cpu_var(watchdog_nmi_touch) = true; > + if (watchdog_enabled) { > + unsigned cpu; > + > + for_each_present_cpu(cpu) { > + if (per_cpu(watchdog_nmi_touch, cpu) != true) > + per_cpu(watchdog_nmi_touch, cpu) = true; > + } > + } > touch_softlockup_watchdog(); > } > EXPORT_SYMBOL(touch_nmi_watchdog); > @@ -430,6 +437,9 @@ static int watchdog_enable(int cpu) > wake_up_process(p); > } > > + /* if any cpu succeeds, watchdog is considered enabled for the system */ > + watchdog_enabled = 1; > + > return 0; > } > > @@ -452,9 +462,6 @@ static void watchdog_disable(int cpu) > per_cpu(softlockup_watchdog, cpu) = NULL; > kthread_stop(p); > } > - > - /* if any cpu succeeds, watchdog is considered enabled for the system */ > - watchdog_enabled = 1; > } > > static void watchdog_enable_all_cpus(void) hm, the code seems a bit screwy. Maybe it was always thus. watchdog_enabled gets set in the per-cpu function but it gets cleared in the all-cpus function. Asymmetric. Also afacit the action of cpu-hotunplug+cpu-hotplug will reenable the watchdog on a CPU which was supposed to have it disabled. Perhaps you could recheck that and make sure it all makes sense - perhaps we need a separate state variable which is purely "current setting of /proc/sys/kernel/nmi_watchdog" and doesn't get altered internally. Anyway, I'll be disappearing for a few days so perhaps Frederic or hpa can help get this all fixed/merged up? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-20 3:42 ` Andrew Morton @ 2010-08-20 12:34 ` Don Zickus 2010-08-26 17:17 ` acpi_os_stall() and touch_nmi_watchdog() (was Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog) Len Brown 1 sibling, 0 replies; 41+ messages in thread From: Don Zickus @ 2010-08-20 12:34 UTC (permalink / raw) To: Andrew Morton Cc: Frederic Weisbecker, Len Brown, Sergey Senozhatsky, Yong Zhang, Peter Zijlstra, Ingo Molnar, linux-kernel, linux-acpi, Andy Grover, H. Peter Anvin On Thu, Aug 19, 2010 at 08:42:56PM -0700, Andrew Morton wrote: > On Thu, 19 Aug 2010 22:57:49 -0400 Don Zickus <dzickus@redhat.com> wrote: > > > On Wed, Aug 18, 2010 at 01:01:56PM -0700, Andrew Morton wrote: > > @@ -430,6 +437,9 @@ static int watchdog_enable(int cpu) > > wake_up_process(p); > > } > > > > + /* if any cpu succeeds, watchdog is considered enabled for the system */ > > + watchdog_enabled = 1; > > + > > return 0; > > } > > > > @@ -452,9 +462,6 @@ static void watchdog_disable(int cpu) > > per_cpu(softlockup_watchdog, cpu) = NULL; > > kthread_stop(p); > > } > > - > > - /* if any cpu succeeds, watchdog is considered enabled for the system */ > > - watchdog_enabled = 1; > > } > > > > static void watchdog_enable_all_cpus(void) > > hm, the code seems a bit screwy. Maybe it was always thus. No, watchdog_enabled was something newly created for the lockup dectector. > > watchdog_enabled gets set in the per-cpu function but it gets cleared > in the all-cpus function. Asymmetric. Yes it is by design. I was using watchdog_enabled as a global state variable. As soon as one cpu was enabled, I would set the bit. But only if all the cpus disabled the watchdog would I clear the bit. > > Also afacit the action of cpu-hotunplug+cpu-hotplug will reenable the > watchdog on a CPU which was supposed to have it disabled. Perhaps you > could recheck that and make sure it all makes sense - perhaps we need a > separate state variable which is purely "current setting of > /proc/sys/kernel/nmi_watchdog" and doesn't get altered internally. I wasn't tracking it on a per cpu basis. I didn't see a need to. The watchdog should globally be on/off across the system. If a system comes up and one of the cpus could not bring the watchdog online for some reason, then that is a problem. If a cpu-hotunplug+cpu-hotplug fixes it, all the better. :-) Also, if I wanted to track it per cpu, there is a bunch of status bits in per-cpu variables that could let the code know whether a particular cpu watchdog is on/off for either hardlockup or softlockup. Cheers, Don ^ permalink raw reply [flat|nested] 41+ messages in thread
* acpi_os_stall() and touch_nmi_watchdog() (was Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog) 2010-08-20 3:42 ` Andrew Morton 2010-08-20 12:34 ` Don Zickus @ 2010-08-26 17:17 ` Len Brown 1 sibling, 0 replies; 41+ messages in thread From: Len Brown @ 2010-08-26 17:17 UTC (permalink / raw) To: Andrew Morton Cc: Don Zickus, Frederic Weisbecker, Len Brown, Sergey Senozhatsky, Yong Zhang, Peter Zijlstra, Ingo Molnar, linux-kernel, linux-acpi, Andy Grover, H. Peter Anvin acpi_os_stall() is used in two ways. The typical way is what triggered this e-mail thread. It implements the AML "Stall()" operator, and is called with interrupts enabled with durations <= 100 usec. So one would expect it to be identical to udelay(). The exception case is when ACPICA calls it with interrupts off and huge durations when we wrote the poweroff or sleep register, yet we find outselves still running... Apparently akpm added touch_nmi_watchdog() to keep the watchdog from firing in this exception case. Is it useful to have the watchdog running when we are waiting for firmware to poweroff the machine? If no, maybe we should turn it off as part of the shutdown process rather than using yet another invocation of touch_nmi_watchdog()? Is calling delay() with IRQs disabled the best thing we can do after we ask the firmware to cut power and it takes a long time? thanks, Len Brown, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-20 2:57 ` Don Zickus 2010-08-20 3:42 ` Andrew Morton @ 2010-08-20 15:02 ` Yong Zhang 2010-08-26 10:14 ` Maxim Levitsky 2 siblings, 0 replies; 41+ messages in thread From: Yong Zhang @ 2010-08-20 15:02 UTC (permalink / raw) To: Don Zickus Cc: Andrew Morton, Frederic Weisbecker, Len Brown, Sergey Senozhatsky, Peter Zijlstra, Ingo Molnar, linux-kernel, linux-acpi, Andy Grover On Thu, Aug 19, 2010 at 10:57:49PM -0400, Don Zickus wrote: > On Wed, Aug 18, 2010 at 01:01:56PM -0700, Andrew Morton wrote: > > The surprise new requirement that touch_nmi_watchdog() be called from > > non-preemptible code does seem to make sense IMO. It's hard to see why > > anyone would be touching the watchdog unless he's spinning in irqs-off > > code. Except, of course, when we have a utility function which can be > > called from wither irqs-on or irqs-off: acpi_os_stall(). > > > > That being said, it's not good to introduce new API requirements by > > accident! An audit of all callers should first be performed, at least. > > > > > > The surprise new requirement that touch_softlockup_watchdog() be called > > from non-preemptible code doesn't make sense IMO. If I have a piece of > > code in the kernel which I expect to sit in TASK_UNINTERRUPTIBLE state > > for three minutes waiting for my egg to boil, I should be able to do > > that and I should be able to touch the softlockup detector without > > needing to go non-preemptible. > > Ok, so here is my patch that syncs the touch_*_watchdog back in line with > the old semantics. Hopefully this will undo any harm I caused. > > ------------cut -->--------------------------- > > >From b372e821c804982438db090db6b4a2f753c78091 Mon Sep 17 00:00:00 2001 > From: Don Zickus <dzickus@redhat.com> > Date: Thu, 19 Aug 2010 22:48:26 -0400 > Subject: [PATCH] [lockup detector] sync touch_*_watchdog back to old semantics > > During my rewrite, the semantics of touch_nmi_watchdog and > touch_softlockup_watchdog changed enough to break some drivers > (mostly over preemptable regions). > > This change brings those touch_*_watchdog functions back in line > to how they used to work. This one looks good to me. Thank you Don. -Yong > > Signed-off-by: Don Zickus <dzickus@redhat.com> > --- > kernel/watchdog.c | 17 ++++++++++++----- > 1 files changed, 12 insertions(+), 5 deletions(-) > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 613bc1f..99e35a2 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -122,7 +122,7 @@ static void __touch_watchdog(void) > > void touch_softlockup_watchdog(void) > { > - __get_cpu_var(watchdog_touch_ts) = 0; > + __raw_get_cpu_var(watchdog_touch_ts) = 0; > } > EXPORT_SYMBOL(touch_softlockup_watchdog); > > @@ -142,7 +142,14 @@ void touch_all_softlockup_watchdogs(void) > #ifdef CONFIG_HARDLOCKUP_DETECTOR > void touch_nmi_watchdog(void) > { > - __get_cpu_var(watchdog_nmi_touch) = true; > + if (watchdog_enabled) { > + unsigned cpu; > + > + for_each_present_cpu(cpu) { > + if (per_cpu(watchdog_nmi_touch, cpu) != true) > + per_cpu(watchdog_nmi_touch, cpu) = true; > + } > + } > touch_softlockup_watchdog(); > } > EXPORT_SYMBOL(touch_nmi_watchdog); > @@ -430,6 +437,9 @@ static int watchdog_enable(int cpu) > wake_up_process(p); > } > > + /* if any cpu succeeds, watchdog is considered enabled for the system */ > + watchdog_enabled = 1; > + > return 0; > } > > @@ -452,9 +462,6 @@ static void watchdog_disable(int cpu) > per_cpu(softlockup_watchdog, cpu) = NULL; > kthread_stop(p); > } > - > - /* if any cpu succeeds, watchdog is considered enabled for the system */ > - watchdog_enabled = 1; > } > > static void watchdog_enable_all_cpus(void) > -- > 1.7.2.1 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-20 2:57 ` Don Zickus 2010-08-20 3:42 ` Andrew Morton 2010-08-20 15:02 ` [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog Yong Zhang @ 2010-08-26 10:14 ` Maxim Levitsky 2010-08-26 14:40 ` Don Zickus 2 siblings, 1 reply; 41+ messages in thread From: Maxim Levitsky @ 2010-08-26 10:14 UTC (permalink / raw) To: Don Zickus Cc: Andrew Morton, Frederic Weisbecker, Len Brown, Sergey Senozhatsky, Yong Zhang, Peter Zijlstra, Ingo Molnar, linux-kernel, linux-acpi, Andy Grover On Thu, 2010-08-19 at 22:57 -0400, Don Zickus wrote: > On Wed, Aug 18, 2010 at 01:01:56PM -0700, Andrew Morton wrote: > > The surprise new requirement that touch_nmi_watchdog() be called from > > non-preemptible code does seem to make sense IMO. It's hard to see why > > anyone would be touching the watchdog unless he's spinning in irqs-off > > code. Except, of course, when we have a utility function which can be > > called from wither irqs-on or irqs-off: acpi_os_stall(). > > > > That being said, it's not good to introduce new API requirements by > > accident! An audit of all callers should first be performed, at least. > > > > > > The surprise new requirement that touch_softlockup_watchdog() be called > > from non-preemptible code doesn't make sense IMO. If I have a piece of > > code in the kernel which I expect to sit in TASK_UNINTERRUPTIBLE state > > for three minutes waiting for my egg to boil, I should be able to do > > that and I should be able to touch the softlockup detector without > > needing to go non-preemptible. > > Ok, so here is my patch that syncs the touch_*_watchdog back in line with > the old semantics. Hopefully this will undo any harm I caused. Was this patch forgotten? Best regards, Maxim Levitsky ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-26 10:14 ` Maxim Levitsky @ 2010-08-26 14:40 ` Don Zickus 0 siblings, 0 replies; 41+ messages in thread From: Don Zickus @ 2010-08-26 14:40 UTC (permalink / raw) To: Maxim Levitsky Cc: Andrew Morton, Frederic Weisbecker, Len Brown, Sergey Senozhatsky, Yong Zhang, Peter Zijlstra, Ingo Molnar, linux-kernel, linux-acpi, Andy Grover On Thu, Aug 26, 2010 at 01:14:31PM +0300, Maxim Levitsky wrote: > On Thu, 2010-08-19 at 22:57 -0400, Don Zickus wrote: > > On Wed, Aug 18, 2010 at 01:01:56PM -0700, Andrew Morton wrote: > > > The surprise new requirement that touch_nmi_watchdog() be called from > > > non-preemptible code does seem to make sense IMO. It's hard to see why > > > anyone would be touching the watchdog unless he's spinning in irqs-off > > > code. Except, of course, when we have a utility function which can be > > > called from wither irqs-on or irqs-off: acpi_os_stall(). > > > > > > That being said, it's not good to introduce new API requirements by > > > accident! An audit of all callers should first be performed, at least. > > > > > > > > > The surprise new requirement that touch_softlockup_watchdog() be called > > > from non-preemptible code doesn't make sense IMO. If I have a piece of > > > code in the kernel which I expect to sit in TASK_UNINTERRUPTIBLE state > > > for three minutes waiting for my egg to boil, I should be able to do > > > that and I should be able to touch the softlockup detector without > > > needing to go non-preemptible. > > > > Ok, so here is my patch that syncs the touch_*_watchdog back in line with > > the old semantics. Hopefully this will undo any harm I caused. > > Was this patch forgotten? Hm, apparently it was separated out by the mail server. Here it is again with some of the headers removed I guess. Cheers, Don From: Don Zickus <dzickus@redhat.com> Date: Thu, 19 Aug 2010 22:48:26 -0400 Subject: [PATCH] [lockup detector] sync touch_*_watchdog back to old semantics During my rewrite, the semantics of touch_nmi_watchdog and touch_softlockup_watchdog changed enough to break some drivers (mostly over preemptable regions). This change brings those touch_*_watchdog functions back in line to how they used to work. Signed-off-by: Don Zickus <dzickus@redhat.com> --- kernel/watchdog.c | 17 ++++++++++++----- 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 613bc1f..99e35a2 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -122,7 +122,7 @@ static void __touch_watchdog(void) void touch_softlockup_watchdog(void) { - __get_cpu_var(watchdog_touch_ts) = 0; + __raw_get_cpu_var(watchdog_touch_ts) = 0; } EXPORT_SYMBOL(touch_softlockup_watchdog); @@ -142,7 +142,14 @@ void touch_all_softlockup_watchdogs(void) #ifdef CONFIG_HARDLOCKUP_DETECTOR void touch_nmi_watchdog(void) { - __get_cpu_var(watchdog_nmi_touch) = true; + if (watchdog_enabled) { + unsigned cpu; + + for_each_present_cpu(cpu) { + if (per_cpu(watchdog_nmi_touch, cpu) != true) + per_cpu(watchdog_nmi_touch, cpu) = true; + } + } touch_softlockup_watchdog(); } EXPORT_SYMBOL(touch_nmi_watchdog); @@ -430,6 +437,9 @@ static int watchdog_enable(int cpu) wake_up_process(p); } + /* if any cpu succeeds, watchdog is considered enabled for the system */ + watchdog_enabled = 1; + return 0; } @@ -452,9 +462,6 @@ static void watchdog_disable(int cpu) per_cpu(softlockup_watchdog, cpu) = NULL; kthread_stop(p); } - - /* if any cpu succeeds, watchdog is considered enabled for the system */ - watchdog_enabled = 1; } static void watchdog_enable_all_cpus(void) -- 1.7.2.1 ^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog (v2) 2010-08-17 2:59 ` Frederic Weisbecker 2010-08-17 3:16 ` Yong Zhang @ 2010-08-17 7:56 ` Sergey Senozhatsky 1 sibling, 0 replies; 41+ messages in thread From: Sergey Senozhatsky @ 2010-08-17 7:56 UTC (permalink / raw) To: Frederic Weisbecker Cc: Peter Zijlstra, Don Zickus, Andrew Morton, Ingo Molnar, linux-kernel Fix: acpi_os_stall calls touch_nmi_watchdog and touch_softlockup_watchdog with preemption enabled causing 'BUG: using smp_processor_id() in preemptible code'. Patch also removes double smp_processor_id call (smp_processor_id itself and in __get_cpu_var) in __touch_watchdog. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> --- diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 613bc1f..cb4f4d4 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -116,13 +116,14 @@ static unsigned long get_sample_period(void) static void __touch_watchdog(void) { int this_cpu = smp_processor_id(); - - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); } void touch_softlockup_watchdog(void) { + preempt_disable(); __get_cpu_var(watchdog_touch_ts) = 0; + preempt_enable(); } EXPORT_SYMBOL(touch_softlockup_watchdog); @@ -142,7 +143,10 @@ void touch_all_softlockup_watchdogs(void) #ifdef CONFIG_HARDLOCKUP_DETECTOR void touch_nmi_watchdog(void) { + preempt_disable(); __get_cpu_var(watchdog_nmi_touch) = true; + preempt_enable(); + touch_softlockup_watchdog(); } EXPORT_SYMBOL(touch_nmi_watchdog); ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-16 13:46 ` Peter Zijlstra 2010-08-16 14:08 ` [PATCH] fix BUG " Sergey Senozhatsky @ 2010-08-16 14:12 ` Don Zickus 2010-08-16 14:29 ` Peter Zijlstra 1 sibling, 1 reply; 41+ messages in thread From: Don Zickus @ 2010-08-16 14:12 UTC (permalink / raw) To: Peter Zijlstra Cc: Sergey Senozhatsky, Andrew Morton, Ingo Molnar, linux-kernel On Mon, Aug 16, 2010 at 03:46:58PM +0200, Peter Zijlstra wrote: > > I don't see a problem with the patch, but my low level understanding of > > the __get_cpu_var vs. per_cpu isn't very strong. > > __get_cpu_var() gets you the value on the current cpu, per_cpu() takes a > cpu argument. Well I know that much. :-) It seems that __get_cpu_var depends on preemption being disabled whereas per_cpu does not? Though for some reason I thought __get_cpu_var would be more atomic when it grabbed the current cpu such that you wouldn't need to disable preemption. Guess not. Cheers, Don ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-16 14:12 ` fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog Don Zickus @ 2010-08-16 14:29 ` Peter Zijlstra 0 siblings, 0 replies; 41+ messages in thread From: Peter Zijlstra @ 2010-08-16 14:29 UTC (permalink / raw) To: Don Zickus; +Cc: Sergey Senozhatsky, Andrew Morton, Ingo Molnar, linux-kernel On Mon, 2010-08-16 at 10:12 -0400, Don Zickus wrote: > On Mon, Aug 16, 2010 at 03:46:58PM +0200, Peter Zijlstra wrote: > > > I don't see a problem with the patch, but my low level understanding of > > > the __get_cpu_var vs. per_cpu isn't very strong. > > > > __get_cpu_var() gets you the value on the current cpu, per_cpu() takes a > > cpu argument. > > Well I know that much. :-) It seems that __get_cpu_var depends on > preemption being disabled whereas per_cpu does not? Though for some > reason I thought __get_cpu_var would be more atomic when it grabbed the > current cpu such that you wouldn't need to disable preemption. Guess not. Indeed, it can't be implemented atomically on all smp systems, hence its really nothing other than a 'convenient' short for per_cpu(foo, smp_processor_id()). ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-16 13:34 ` Don Zickus 2010-08-16 13:46 ` Peter Zijlstra @ 2010-08-16 14:06 ` Yong Zhang 1 sibling, 0 replies; 41+ messages in thread From: Yong Zhang @ 2010-08-16 14:06 UTC (permalink / raw) To: Don Zickus Cc: Peter Zijlstra, Sergey Senozhatsky, Andrew Morton, Ingo Molnar, linux-kernel On Mon, Aug 16, 2010 at 09:34:52AM -0400, Don Zickus wrote: > I don't recall any requirement to have preemption disabled when using > those functions. Isn't that implicit? I mean the caller of touch_{softlockup|nmi}_watchdog will sticky to that cpu before it finish running. > It seems sensible to put it in the > touch_{softlockup|nmi}_watchdog code. I don't think so. Such as: ... preempt_disable() <===A touch_{softlockup|nmi}_watchdog <===B preempt_enable() <===C ... You just scroll A and C into B, but what will happen before preempt occur before A? > > I assume the reason for having preemption disabled when using > smp_processor_id() is that the code could migrate to another cpu when > rescheduled? If the migration could happen, then we could touch the wrong cpu-data, and the detection on the original cpu will trigger anyway. > > I don't see a problem with the patch, but my low level understanding of > the __get_cpu_var vs. per_cpu isn't very strong. Maybe we should use __raw_get_cpu_var() instead. Thanks, Yong ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-13 10:21 fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog Sergey Senozhatsky 2010-08-16 8:22 ` Peter Zijlstra @ 2010-08-18 19:33 ` Andrew Morton 2010-08-18 21:44 ` Cyrill Gorcunov 2010-09-22 9:00 ` [PATCH] avoid second smp_processor_id() call in __touch_watchdog Sergey Senozhatsky 1 sibling, 2 replies; 41+ messages in thread From: Andrew Morton @ 2010-08-18 19:33 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, Don Zickus, Cyrill Gorcunov, Frederic Weisbecker On Fri, 13 Aug 2010 13:21:58 +0300 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote: > Hello, > > Got this traces today: > > ... > > Avoid using smp_processor_id in touch_softlockup_watchdog and touch_nmi_watchdog. > Patch also "removes" second call to smp_processor_id in __touch_watchdog > (smp_processor_id itself and smp_processor_id in __get_cpu_var). > > --- > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 613bc1f..8822f1e 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -116,13 +116,14 @@ static unsigned long get_sample_period(void) > static void __touch_watchdog(void) > { > int this_cpu = smp_processor_id(); > - > - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); > + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); > } Fair enough, although strictly speaking this should be done in a separate and later patch. > void touch_softlockup_watchdog(void) > { > - __get_cpu_var(watchdog_touch_ts) = 0; > + int this_cpu = get_cpu(); > + per_cpu(watchdog_touch_ts, this_cpu) = 0; > + put_cpu(); > } > EXPORT_SYMBOL(touch_softlockup_watchdog); > > @@ -142,7 +143,9 @@ void touch_all_softlockup_watchdogs(void) > #ifdef CONFIG_HARDLOCKUP_DETECTOR > void touch_nmi_watchdog(void) > { > - __get_cpu_var(watchdog_nmi_touch) = true; > + int this_cpu = get_cpu(); > + per_cpu(watchdog_nmi_touch, this_cpu) = true; > + put_cpu(); > touch_softlockup_watchdog(); > } > EXPORT_SYMBOL(touch_nmi_watchdog); Why did this start happening? Surely we've called touch_softlockup_watchdog() from within preemptible code before now. Methinks that : commit 26e09c6eee14f4827b55137ba0eedc4e77cd50ab : Author: Don Zickus <dzickus@redhat.com> : AuthorDate: Mon May 17 18:06:04 2010 -0400 : Commit: Frederic Weisbecker <fweisbec@gmail.com> : CommitDate: Wed May 19 11:32:14 2010 +0200 : : lockup_detector: Convert per_cpu to __get_cpu_var for readability was simply broken? That would be strange, given that it's been sitting around since May 17. If we don't want to revert 26e09c6eee14f4827b55137ba0eedc4e77cd50ab then I'd suggest that we simply switch to raw_smp_processor_id(): those newly-added get_cpu/put_cpu calls don't do anything useful. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog 2010-08-18 19:33 ` Andrew Morton @ 2010-08-18 21:44 ` Cyrill Gorcunov 2010-09-22 9:00 ` [PATCH] avoid second smp_processor_id() call in __touch_watchdog Sergey Senozhatsky 1 sibling, 0 replies; 41+ messages in thread From: Cyrill Gorcunov @ 2010-08-18 21:44 UTC (permalink / raw) To: Andrew Morton Cc: Sergey Senozhatsky, Peter Zijlstra, Ingo Molnar, linux-kernel, Don Zickus, Frederic Weisbecker On Wed, Aug 18, 2010 at 12:33:46PM -0700, Andrew Morton wrote: > On Fri, 13 Aug 2010 13:21:58 +0300 > Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote: > > > Hello, > > > > Got this traces today: > > > > ... > > ... > > void touch_softlockup_watchdog(void) > > { > > - __get_cpu_var(watchdog_touch_ts) = 0; > > + int this_cpu = get_cpu(); > > + per_cpu(watchdog_touch_ts, this_cpu) = 0; > > + put_cpu(); > > } > > EXPORT_SYMBOL(touch_softlockup_watchdog); > > > > @@ -142,7 +143,9 @@ void touch_all_softlockup_watchdogs(void) > > #ifdef CONFIG_HARDLOCKUP_DETECTOR > > void touch_nmi_watchdog(void) > > { > > - __get_cpu_var(watchdog_nmi_touch) = true; > > + int this_cpu = get_cpu(); > > + per_cpu(watchdog_nmi_touch, this_cpu) = true; > > + put_cpu(); > > touch_softlockup_watchdog(); > > } > > EXPORT_SYMBOL(touch_nmi_watchdog); > > Why did this start happening? Surely we've called > touch_softlockup_watchdog() from within preemptible code before now. indeed, and we've been using __raw interface before (2.6.18) void touch_softlockup_watchdog(void) { __raw_get_cpu_var(touch_timestamp) = jiffies; } > Methinks that > > : commit 26e09c6eee14f4827b55137ba0eedc4e77cd50ab > : Author: Don Zickus <dzickus@redhat.com> > : AuthorDate: Mon May 17 18:06:04 2010 -0400 > : Commit: Frederic Weisbecker <fweisbec@gmail.com> > : CommitDate: Wed May 19 11:32:14 2010 +0200 > : > : lockup_detector: Convert per_cpu to __get_cpu_var for readability > > was simply broken? That would be strange, given that it's been sitting > around since May 17. > > If we don't want to revert 26e09c6eee14f4827b55137ba0eedc4e77cd50ab > then I'd suggest that we simply switch to raw_smp_processor_id(): those > newly-added get_cpu/put_cpu calls don't do anything useful. > I think it's fine to use __get_cpu_var in touch_nmi_watchdog (which should not be called with irq enabled since it ticks anyway then, at least on x86) for hardware nmi watchdog, can't conclude anything about softlockup (except that we had __raw interface before) since I'm not familiar with soflockup watchdog at moment. -- Cyrill ^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH] avoid second smp_processor_id() call in __touch_watchdog 2010-08-18 19:33 ` Andrew Morton 2010-08-18 21:44 ` Cyrill Gorcunov @ 2010-09-22 9:00 ` Sergey Senozhatsky 2010-09-22 14:41 ` Don Zickus ` (2 more replies) 1 sibling, 3 replies; 41+ messages in thread From: Sergey Senozhatsky @ 2010-09-22 9:00 UTC (permalink / raw) To: Andrew Morton Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, Don Zickus, Cyrill Gorcunov, Frederic Weisbecker Hello, Per our previous conversation: Andrew Morton wrote: > Fair enough, although strictly speaking this should be done in a > separate and later patch. > Avoid double smp_processor_id() call in __touch_watchdog (smp_processor_id() itself and later call in __get_cpu_var()) Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> --- diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 7f9c3c5..03d97c5 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -116,8 +116,7 @@ static unsigned long get_sample_period(void) static void __touch_watchdog(void) { int this_cpu = smp_processor_id(); - - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); } void touch_softlockup_watchdog(void) ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] avoid second smp_processor_id() call in __touch_watchdog 2010-09-22 9:00 ` [PATCH] avoid second smp_processor_id() call in __touch_watchdog Sergey Senozhatsky @ 2010-09-22 14:41 ` Don Zickus 2010-09-22 16:27 ` Frederic Weisbecker 2010-09-24 19:34 ` Don Zickus 2 siblings, 0 replies; 41+ messages in thread From: Don Zickus @ 2010-09-22 14:41 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Andrew Morton, Peter Zijlstra, Ingo Molnar, linux-kernel, Cyrill Gorcunov, Frederic Weisbecker On Wed, Sep 22, 2010 at 12:00:12PM +0300, Sergey Senozhatsky wrote: > Hello, > > Per our previous conversation: > > Andrew Morton wrote: > > Fair enough, although strictly speaking this should be done in a > > separate and later patch. > > > > > Avoid double smp_processor_id() call in __touch_watchdog (smp_processor_id() > itself and later call in __get_cpu_var()) > > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Thanks! I'll apply it. Cheers, Don > > --- > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 7f9c3c5..03d97c5 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -116,8 +116,7 @@ static unsigned long get_sample_period(void) > static void __touch_watchdog(void) > { > int this_cpu = smp_processor_id(); > - > - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); > + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); > } > > void touch_softlockup_watchdog(void) > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] avoid second smp_processor_id() call in __touch_watchdog 2010-09-22 9:00 ` [PATCH] avoid second smp_processor_id() call in __touch_watchdog Sergey Senozhatsky 2010-09-22 14:41 ` Don Zickus @ 2010-09-22 16:27 ` Frederic Weisbecker 2010-09-22 16:39 ` Peter Zijlstra 2010-09-24 19:34 ` Don Zickus 2 siblings, 1 reply; 41+ messages in thread From: Frederic Weisbecker @ 2010-09-22 16:27 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Andrew Morton, Peter Zijlstra, Ingo Molnar, linux-kernel, Don Zickus, Cyrill Gorcunov On Wed, Sep 22, 2010 at 12:00:12PM +0300, Sergey Senozhatsky wrote: > Hello, > > Per our previous conversation: > > Andrew Morton wrote: > > Fair enough, although strictly speaking this should be done in a > > separate and later patch. > > > > > Avoid double smp_processor_id() call in __touch_watchdog (smp_processor_id() > itself and later call in __get_cpu_var()) > > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> > > --- > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 7f9c3c5..03d97c5 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -116,8 +116,7 @@ static unsigned long get_sample_period(void) > static void __touch_watchdog(void) > { > int this_cpu = smp_processor_id(); > - > - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); > + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); I'm not sure we want this. This is called by the watchdog internally, from the timer or the cpu bound thread, so we probably should better keep __get_cpu_var() because it checks that we are not in a preemptable section. Most of the time, accessing local data using per_cpu instead of __get_cpu_var is wrong. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] avoid second smp_processor_id() call in __touch_watchdog 2010-09-22 16:27 ` Frederic Weisbecker @ 2010-09-22 16:39 ` Peter Zijlstra 2010-09-22 16:47 ` Frederic Weisbecker 0 siblings, 1 reply; 41+ messages in thread From: Peter Zijlstra @ 2010-09-22 16:39 UTC (permalink / raw) To: Frederic Weisbecker Cc: Sergey Senozhatsky, Andrew Morton, Ingo Molnar, linux-kernel, Don Zickus, Cyrill Gorcunov On Wed, 2010-09-22 at 18:27 +0200, Frederic Weisbecker wrote: > > I'm not sure we want this. This is called by the watchdog internally, > from the timer or the cpu bound thread, so we probably should better > keep __get_cpu_var() because it checks that we are not in a preemptable > section. The smp_processor_id() right at the start already does that. That said, I doubt it really matter one way or the other, compilers have been known to do CSE for quite a while. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] avoid second smp_processor_id() call in __touch_watchdog 2010-09-22 16:39 ` Peter Zijlstra @ 2010-09-22 16:47 ` Frederic Weisbecker 0 siblings, 0 replies; 41+ messages in thread From: Frederic Weisbecker @ 2010-09-22 16:47 UTC (permalink / raw) To: Peter Zijlstra Cc: Sergey Senozhatsky, Andrew Morton, Ingo Molnar, linux-kernel, Don Zickus, Cyrill Gorcunov On Wed, Sep 22, 2010 at 06:39:19PM +0200, Peter Zijlstra wrote: > On Wed, 2010-09-22 at 18:27 +0200, Frederic Weisbecker wrote: > > > > I'm not sure we want this. This is called by the watchdog internally, > > from the timer or the cpu bound thread, so we probably should better > > keep __get_cpu_var() because it checks that we are not in a preemptable > > section. > > The smp_processor_id() right at the start already does that. > > That said, I doubt it really matter one way or the other, compilers have > been known to do CSE for quite a while. I don't mind personally. We indeed have this smp_processor_id() that does the check already. But that's also for readability: reviewers that are used to deal with per cpu datas are also used to see per_cpu() for remote percpu data access and get_cpu_var() for local percpu. Plus some archs may override their __my_cpu_offset implementation to provide a faster access. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] avoid second smp_processor_id() call in __touch_watchdog 2010-09-22 9:00 ` [PATCH] avoid second smp_processor_id() call in __touch_watchdog Sergey Senozhatsky 2010-09-22 14:41 ` Don Zickus 2010-09-22 16:27 ` Frederic Weisbecker @ 2010-09-24 19:34 ` Don Zickus 2010-09-25 17:43 ` Sergey Senozhatsky 2 siblings, 1 reply; 41+ messages in thread From: Don Zickus @ 2010-09-24 19:34 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Andrew Morton, Peter Zijlstra, Ingo Molnar, linux-kernel, Cyrill Gorcunov, Frederic Weisbecker On Wed, Sep 22, 2010 at 12:00:12PM +0300, Sergey Senozhatsky wrote: > Hello, > > Per our previous conversation: > > Andrew Morton wrote: > > Fair enough, although strictly speaking this should be done in a > > separate and later patch. > > > > > Avoid double smp_processor_id() call in __touch_watchdog (smp_processor_id() > itself and later call in __get_cpu_var()) Dropping this patch then based on Peter's and Frederic's feedback that the compiler probably already optimizes for this, leaving readability as a good excuse not to change anything. Cheers, Don > > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> > > --- > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 7f9c3c5..03d97c5 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -116,8 +116,7 @@ static unsigned long get_sample_period(void) > static void __touch_watchdog(void) > { > int this_cpu = smp_processor_id(); > - > - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); > + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); > } > > void touch_softlockup_watchdog(void) > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] avoid second smp_processor_id() call in __touch_watchdog 2010-09-24 19:34 ` Don Zickus @ 2010-09-25 17:43 ` Sergey Senozhatsky 0 siblings, 0 replies; 41+ messages in thread From: Sergey Senozhatsky @ 2010-09-25 17:43 UTC (permalink / raw) To: Don Zickus Cc: Sergey Senozhatsky, Andrew Morton, Peter Zijlstra, Ingo Molnar, linux-kernel, Cyrill Gorcunov, Frederic Weisbecker [-- Attachment #1: Type: text/plain, Size: 2428 bytes --] On (09/24/10 15:34), Don Zickus wrote: > On Wed, Sep 22, 2010 at 12:00:12PM +0300, Sergey Senozhatsky wrote: > > Hello, > > > > Per our previous conversation: > > > > Andrew Morton wrote: > > > Fair enough, although strictly speaking this should be done in a > > > separate and later patch. > > > > > > > > > Avoid double smp_processor_id() call in __touch_watchdog (smp_processor_id() > > itself and later call in __get_cpu_var()) > > Dropping this patch then based on Peter's and Frederic's feedback that the > compiler probably already optimizes for this, leaving readability as a good > excuse not to change anything. > Hello, gcc 4.5.1, x86_64 Dump of assembler code for function __touch_watchdog: <+0>: push %rbp <+1>: mov %rsp,%rbp <+4>: push %r12 <+6>: push %rbx <+7>: mov $0x0,%rbx <+14>: callq 0x43 <__touch_watchdog+19> <+19>: mov %eax,%r12d <+22>: callq 0x4b <__touch_watchdog+27> <+27>: mov %eax,%eax <+29>: mov %r12d,%edi <+32>: add 0x0(,%rax,8),%rbx <+40>: callq 0x5d <__touch_watchdog+45> <+45>: shr $0x1e,%rax <+49>: mov %rax,(%rbx) <+52>: pop %rbx <+53>: pop %r12 <+55>: leaveq <+56>: retq patched __touch_watchdog: Dump of assembler code for function __touch_watchdog: <+0>: push %rbp <+1>: mov %rsp,%rbp <+4>: push %rbx <+5>: mov $0x0,%rbx <+12>: sub $0x8,%rsp <+16>: callq 0x45 <__touch_watchdog+21> <+21>: movslq %eax,%rdx <+24>: mov %eax,%edi <+26>: add 0x0(,%rdx,8),%rbx <+34>: callq 0x57 <__touch_watchdog+39> <+39>: shr $0x1e,%rax <+43>: mov %rax,(%rbx) <+46>: pop %rax <+47>: pop %rbx <+48>: leaveq <+49>: retq Sergey > > > > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> > > > > --- > > > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > > index 7f9c3c5..03d97c5 100644 > > --- a/kernel/watchdog.c > > +++ b/kernel/watchdog.c > > @@ -116,8 +116,7 @@ static unsigned long get_sample_period(void) > > static void __touch_watchdog(void) > > { > > int this_cpu = smp_processor_id(); > > - > > - __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); > > + per_cpu(watchdog_touch_ts, this_cpu) = get_timestamp(this_cpu); > > } > > > > void touch_softlockup_watchdog(void) > > > [-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2010-09-25 17:43 UTC | newest] Thread overview: 41+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-08-13 10:21 fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog Sergey Senozhatsky 2010-08-16 8:22 ` Peter Zijlstra 2010-08-16 13:34 ` Don Zickus 2010-08-16 13:46 ` Peter Zijlstra 2010-08-16 14:08 ` [PATCH] fix BUG " Sergey Senozhatsky 2010-08-16 14:30 ` Don Zickus 2010-08-17 4:27 ` Yong Zhang 2010-08-17 2:59 ` Frederic Weisbecker 2010-08-17 3:16 ` Yong Zhang 2010-08-17 8:39 ` Sergey Senozhatsky 2010-08-17 9:05 ` Yong Zhang 2010-08-17 9:24 ` Sergey Senozhatsky 2010-08-17 9:37 ` Yong Zhang 2010-08-17 10:28 ` Sergey Senozhatsky 2010-08-17 12:48 ` Yong Zhang 2010-08-17 10:39 ` Sergey Senozhatsky 2010-08-17 12:56 ` Yong Zhang 2010-08-17 13:13 ` Don Zickus 2010-08-18 2:48 ` Frederic Weisbecker 2010-08-18 20:01 ` Andrew Morton 2010-08-19 2:27 ` Don Zickus 2010-08-20 2:57 ` Don Zickus 2010-08-20 3:42 ` Andrew Morton 2010-08-20 12:34 ` Don Zickus 2010-08-26 17:17 ` acpi_os_stall() and touch_nmi_watchdog() (was Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog) Len Brown 2010-08-20 15:02 ` [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog Yong Zhang 2010-08-26 10:14 ` Maxim Levitsky 2010-08-26 14:40 ` Don Zickus 2010-08-17 7:56 ` [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog (v2) Sergey Senozhatsky 2010-08-16 14:12 ` fix BUG: using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog Don Zickus 2010-08-16 14:29 ` Peter Zijlstra 2010-08-16 14:06 ` Yong Zhang 2010-08-18 19:33 ` Andrew Morton 2010-08-18 21:44 ` Cyrill Gorcunov 2010-09-22 9:00 ` [PATCH] avoid second smp_processor_id() call in __touch_watchdog Sergey Senozhatsky 2010-09-22 14:41 ` Don Zickus 2010-09-22 16:27 ` Frederic Weisbecker 2010-09-22 16:39 ` Peter Zijlstra 2010-09-22 16:47 ` Frederic Weisbecker 2010-09-24 19:34 ` Don Zickus 2010-09-25 17:43 ` Sergey Senozhatsky
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).