* [PATCH] disable irq's and check need_resched before safe_halt
@ 2007-08-07 13:49 Dimitri Sivanich
2007-08-07 20:11 ` Luck, Tony
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Dimitri Sivanich @ 2007-08-07 13:49 UTC (permalink / raw)
To: linux-ia64
While sending interrupts to a cpu to repeatedly wake a thread, on occasion that thread will take a full timer tick cycle (4002 usec in my case) to wake up.
The problem concerns a race condition in the code around the safe_halt() call in the default_idle() routine. Setting 'nohalt' on the kernel command line causes the long wakeups to disappear.
void
default_idle (void)
{
	local_irq_enable();
	while (!need_resched()) {
-->		if (can_do_pal_halt)
-->			safe_halt();
		else
A timer tick could arrive between the check for !need_resched and the actual call to safe_halt() (which does a pal call to PAL_HALT_LIGHT). By the time the timer tick completes, a thread that might now need to run could get held up for as long as a timer tick waiting for the halted cpu.
I'm proposing that we disable irq's and check need_resched again before calling safe_halt(). Does anyone see any problem with this approach?
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Index: linux/arch/ia64/kernel/process.c
===================================================================
--- linux.orig/arch/ia64/kernel/process.c 2007-08-02 15:05:56.427236082 -0500
+++ linux/arch/ia64/kernel/process.c 2007-08-06 19:42:20.147944967 -0500
@@ -198,9 +198,13 @@ default_idle (void)
 {
 	local_irq_enable();
 	while (!need_resched()) {
-		if (can_do_pal_halt)
-			safe_halt();
-		else
+		if (can_do_pal_halt) {
+			local_irq_disable();
+			if (!need_resched()) {
+				safe_halt();
+			}
+			local_irq_enable();
+		} else
 			cpu_relax();
 	}
 }
* RE: [PATCH] disable irq's and check need_resched before safe_halt
From: Luck, Tony @ 2007-08-07 20:11 UTC (permalink / raw)
To: linux-ia64
This looks like it re-introduces code that Ken Chen backed out
about two years ago. Here's Ken's commit that explains what
broke last time we made the idle loop look like this. Now
that code was wider ranging ... messing with TIF bits too,
so maybe this is different this time?
-Tony
commit 1e185b97b4364063f1135604b87f8d8469944233
Author: Chen, Kenneth W <kenneth.w.chen@intel.com>
Date: Tue Nov 15 14:37:05 2005 -0800
[PATCH] ia64: cpu_idle performance bug fix
Our performance validation on 2.6.15-rc1 caught a disastrous performance
regression on ia64 with netperf (-98%) and volanomark (-58%) compares to
previous kernel version 2.6.14-git7. See the following chart (result
group 1 & 2).
http://kernel-perf.sourceforge.net/results.machine_id&.html
We have root caused it to commit 64c7c8f88559624abdbe12b5da6502e8879f8d28
This changeset broke the ia64 task resched notification. In
sched.c:resched_task(), a reschedule IPI is conditioned upon
TIF_POLLING_NRFLAG. However, the above changeset unconditionally set
the polling thread flag for idle tasks regardless whether pal_halt_light
is in use or not. As a result, resched IPI is not sent from
resched_task(). And since the default behavior on ia64 is to use
pal_halt_light, we end up delaying the rescheduling task until next
timer tick, and thus cause the performance regression.
This fixes the performance bug. I'm glad our performance suite is
turning up bad performance bug like this in time.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c
index e92ea64..4305d2b 100644
--- a/arch/ia64/kernel/process.c
+++ b/arch/ia64/kernel/process.c
@@ -202,12 +202,9 @@ default_idle (void)
 {
 	local_irq_enable();
 	while (!need_resched()) {
-		if (can_do_pal_halt) {
-			local_irq_disable();
-			if (!need_resched())
-				safe_halt();
-			local_irq_enable();
-		} else
+		if (can_do_pal_halt)
+			safe_halt();
+		else
 			cpu_relax();
 	}
 }
@@ -272,10 +269,14 @@ cpu_idle (void)
 {
 	void (*mark_idle)(int) = ia64_mark_idle;
 	int cpu = smp_processor_id();
-	set_thread_flag(TIF_POLLING_NRFLAG);

 	/* endless idle loop with no priority at all */
 	while (1) {
+		if (can_do_pal_halt)
+			clear_thread_flag(TIF_POLLING_NRFLAG);
+		else
+			set_thread_flag(TIF_POLLING_NRFLAG);
+
 		if (!need_resched()) {
 			void (*idle)(void);
 #ifdef CONFIG_SMP
* Re: [PATCH] disable irq's and check need_resched before safe_halt
From: Ken Chen @ 2007-08-07 21:26 UTC (permalink / raw)
To: linux-ia64
On 8/7/07, Luck, Tony <tony.luck@intel.com> wrote:
> This looks like it re-introduces code that Ken Chen backed out
> about two years ago. Here's Ken's commit that explains what
> broke last time we made the idle loop look like this. Now
> that code was wider ranging ... messing with TIF bits too,
> so maybe this is different this time?
Yeah, I think the TIF flag was the key in fixing the resched IPI
notification. The change in default_idle() is an optimization.
I'm horrified to see the same code coming back: doing interrupt
enable/disable in the most inner while loop. Disable interrupt is
just crude, but I suppose that's the only way to resolve the race
condition? Looking at other arch like x86_64, it is also doing the
same thing.
- Ken
* Re: [PATCH] disable irq's and check need_resched before safe_halt
From: Dimitri Sivanich @ 2007-08-08 15:01 UTC (permalink / raw)
To: linux-ia64
On Tue, Aug 07, 2007 at 02:26:16PM -0700, Ken Chen wrote:
> I'm horrified to see the same code coming back: doing interrupt
> enable/disable in the most inner while loop. Disable interrupt is
> just crude, but I suppose that's the only way to resolve the race
> condition? Looking at other arch like x86_64, it is also doing the
> same thing.
If anyone can suggest a better alternative to fix this race condition,
I'd certainly consider it.
I suppose one alternative might be to move the local_irq_enable()
down into the default_idle loop so that we don't have to enable and
disable irq's the first time through if pal_halt will be called.
* Re: [PATCH] disable irq's and check need_resched before safe_halt
From: Hidetoshi Seto @ 2007-08-09 3:36 UTC (permalink / raw)
To: linux-ia64
Luck, Tony wrote:
> commit 1e185b97b4364063f1135604b87f8d8469944233
> Author: Chen, Kenneth W <kenneth.w.chen@intel.com>
> Date: Tue Nov 15 14:37:05 2005 -0800
>
> [PATCH] ia64: cpu_idle performance bug fix
>
> Our performance validation on 2.6.15-rc1 caught a disastrous performance
> regression on ia64 with netperf (-98%) and volanomark (-58%) compares to
> previous kernel version 2.6.14-git7. See the following chart (result
> group 1 & 2).
>
> http://kernel-perf.sourceforge.net/results.machine_id&.html
>
> We have root caused it to commit 64c7c8f88559624abdbe12b5da6502e8879f8d28
>
> This changeset broke the ia64 task resched notification. In
> sched.c:resched_task(), a reschedule IPI is conditioned upon
> TIF_POLLING_NRFLAG. However, the above changeset unconditionally set
> the polling thread flag for idle tasks regardless whether pal_halt_light
> is in use or not. As a result, resched IPI is not sent from
> resched_task(). And since the default behavior on ia64 is to use
> pal_halt_light, we end up delaying the rescheduling task until next
> timer tick, and thus cause the performance regression.
>
> This fixes the performance bug. I'm glad our performance suite is
> turning up bad performance bug like this in time.
>
> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
>
> diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c
> index e92ea64..4305d2b 100644
> --- a/arch/ia64/kernel/process.c
> +++ b/arch/ia64/kernel/process.c
> @@ -202,12 +202,9 @@ default_idle (void)
>  {
>  	local_irq_enable();
>  	while (!need_resched()) {
> -		if (can_do_pal_halt) {
> -			local_irq_disable();
> -			if (!need_resched())
> -				safe_halt();
> -			local_irq_enable();
> -		} else
> +		if (can_do_pal_halt)
> +			safe_halt();
> +		else
>  			cpu_relax();
>  	}
>  }
> @@ -272,10 +269,14 @@ cpu_idle (void)
>  {
>  	void (*mark_idle)(int) = ia64_mark_idle;
>  	int cpu = smp_processor_id();
> -	set_thread_flag(TIF_POLLING_NRFLAG);
>
>  	/* endless idle loop with no priority at all */
>  	while (1) {
> +		if (can_do_pal_halt)
> +			clear_thread_flag(TIF_POLLING_NRFLAG);
> +		else
> +			set_thread_flag(TIF_POLLING_NRFLAG);
> +
>  		if (!need_resched()) {
>  			void (*idle)(void);
>  #ifdef CONFIG_SMP
The latter hunk of this patch makes sense, since a CPU in safe_halt()
does not poll the TIF_NEED_RESCHED flag. Such CPUs therefore need to
request a resched IPI by clearing the TIF_POLLING_NRFLAG flag
(which has since been replaced by TS_POLLING).
But I could not see the point of the former hunk, because:
- safe_halt() is an alias of ia64_pal_halt_light(), that is
a PAL procedure. According to Intel Itanium ASDM rev2.2:
"PAL procedures are not interruptible by external
interrupt or NMI, since PSR.i must be 0 when the
PAL procedure is called.(11.10.2.2)"
- PAL transitions the state of CPU from LIGHT HALT to normal
on receipt of unmasked external interrupt. An unmasked
external interrupt is defined based on the current setting
of the TPR control register, but not PSR.i.
And the priority of IPI(254) is higher than timer(239).
So both of IPI and timer can wake up the CPU in LIGHT HALT.
I guess this former hunk is not needed, but I could be wrong.
Thanks,
H.Seto
* Re: [PATCH] disable irq's and check need_resched before safe_halt
From: Dimitri Sivanich @ 2007-08-09 12:41 UTC (permalink / raw)
To: linux-ia64
On Thu, Aug 09, 2007 at 12:36:25PM +0900, Hidetoshi Seto wrote:
> Luck, Tony wrote:
> >diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c
> >index e92ea64..4305d2b 100644
> >--- a/arch/ia64/kernel/process.c
> >+++ b/arch/ia64/kernel/process.c
> >@@ -202,12 +202,9 @@ default_idle (void)
> > {
> > 	local_irq_enable();
> > 	while (!need_resched()) {
> >-		if (can_do_pal_halt) {
> >-			local_irq_disable();
> >-			if (!need_resched())
> >-				safe_halt();
> >-			local_irq_enable();
> >-		} else
> >+		if (can_do_pal_halt)
> >+			safe_halt();
> >+		else
> > 			cpu_relax();
> > 	}
> > }
>
..
..
>
> But I could not see the point of the former hunk, because:
>
> - safe_halt() is an alias of ia64_pal_halt_light(), that is
> a PAL procedure. According to Intel Itanium ASDM rev2.2:
>
> "PAL procedures are not interruptible by external
> interrupt or NMI, since PSR.i must be 0 when the
> PAL procedure is called.(11.10.2.2)"
>
> - PAL transitions the state of CPU from LIGHT HALT to normal
> on receipt of unmasked external interrupt. An unmasked
> external interrupt is defined based on the current setting
> of the TPR control register, but not PSR.i.
>
> And the priority of IPI(254) is higher than timer(239).
> So both of IPI and timer can wake up the CPU in LIGHT HALT.
>
> I guess this former hunk is not needed, but I could be wrong.
>
> Thanks,
> H.Seto
And the problem with the former hunk is that it reintroduces the
race between checking !need_resched() and receiving a timer interrupt
before safe_halt(). You could have a thread needing execution at
the time the cpu enters LIGHT HALT. If irq's are disabled, LIGHT
HALT will return relatively quickly with the pending timer interrupt,
rather than having to wait for the next one.
* RE: [PATCH] disable irq's and check need_resched before safe_halt
From: Luck, Tony @ 2007-08-13 20:36 UTC (permalink / raw)
To: linux-ia64
> Yeah, I think the TIF flag was the key in fixing the resched IPI
> notification. The change in default_idle() is an optimization.
>
> I'm horrified to see the same code coming back: doing interrupt
> enable/disable in the most inner while loop. Disable interrupt is
> just crude, but I suppose that's the only way to resolve the race
> condition? Looking at other arch like x86_64, it is also doing the
> same thing.
Yup. x86 does pretty much the same thing.
I had the perf. team run the problem benchmarks (netperf and volanomark)
with this patch applied ... and they saw no performance difference.
The patch is bundled to go in with the next batch that I send
to Linus.
-Tony