* [PATCH] x86_64 add missing enter_idle() calls
@ 2006-10-06 8:16 Stephane Eranian
2006-10-16 10:08 ` Andi Kleen
0 siblings, 1 reply; 11+ messages in thread
From: Stephane Eranian @ 2006-10-06 8:16 UTC (permalink / raw)
To: linux-kernel; +Cc: ak, Stephane Eranian
Hi,
Unless I am mistaken, I think we are missing some calls to enter_idle()
in the x86_64 tree. The following patch adds a bunch of missing
enter_idle() callbacks for some of the "direct" interrupt handlers.
changelog:
- adds missing enter_idle() calls to most of the "direct" interrupt
handlers.
signed-off-by: stephane eranian <eranian@hpl.hp.com>
diff --git a/arch/x86_64/kernel/apic.c b/arch/x86_64/kernel/apic.c
index af4a1c7..74ed3b8 100644
--- a/arch/x86_64/kernel/apic.c
+++ b/arch/x86_64/kernel/apic.c
@@ -1015,6 +1015,7 @@ #if 0
}
#endif
irq_exit();
+ enter_idle();
}
/*
@@ -1047,6 +1048,7 @@ asmlinkage void smp_error_interrupt(void
printk (KERN_DEBUG "APIC error on CPU%d: %02x(%02x)\n",
smp_processor_id(), v , v1);
irq_exit();
+ enter_idle();
}
int disable_apic;
diff --git a/arch/x86_64/kernel/irq.c b/arch/x86_64/kernel/irq.c
index b8a407f..28c73d8 100644
--- a/arch/x86_64/kernel/irq.c
+++ b/arch/x86_64/kernel/irq.c
@@ -127,6 +127,7 @@ #endif
irq_exit();
set_irq_regs(old_regs);
+ enter_idle();
return 1;
}
diff --git a/arch/x86_64/kernel/mce_amd.c b/arch/x86_64/kernel/mce_amd.c
index 883fe74..4b458eb 100644
--- a/arch/x86_64/kernel/mce_amd.c
+++ b/arch/x86_64/kernel/mce_amd.c
@@ -224,6 +224,7 @@ asmlinkage void mce_threshold_interrupt(
}
out:
irq_exit();
+ enter_idle();
}
/*
diff --git a/arch/x86_64/kernel/mce_intel.c b/arch/x86_64/kernel/mce_intel.c
index 6551505..030b1e3 100644
--- a/arch/x86_64/kernel/mce_intel.c
+++ b/arch/x86_64/kernel/mce_intel.c
@@ -27,6 +27,7 @@ asmlinkage void smp_thermal_interrupt(vo
mce_log_therm_throt_event(smp_processor_id(), msr_val);
irq_exit();
+ enter_idle();
}
static void __cpuinit intel_init_thermal(struct cpuinfo_x86 *c)
diff --git a/arch/x86_64/kernel/process.c b/arch/x86_64/kernel/process.c
diff --git a/arch/x86_64/kernel/smp.c b/arch/x86_64/kernel/smp.c
index 4f67697..c1d70e3 100644
--- a/arch/x86_64/kernel/smp.c
+++ b/arch/x86_64/kernel/smp.c
@@ -520,5 +520,6 @@ asmlinkage void smp_call_function_interr
mb();
atomic_inc(&call_data->finished);
}
+ enter_idle();
}
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] x86_64 add missing enter_idle() calls
2006-10-06 8:16 [PATCH] x86_64 add missing enter_idle() calls Stephane Eranian
@ 2006-10-16 10:08 ` Andi Kleen
2006-10-16 14:13 ` Stephane Eranian
0 siblings, 1 reply; 11+ messages in thread
From: Andi Kleen @ 2006-10-16 10:08 UTC (permalink / raw)
To: eranian; +Cc: linux-kernel
On Friday 06 October 2006 10:16, Stephane Eranian wrote:
> Hi,
>
> Unless I am mistaken, I think we are missing some calls to enter_idle()
> in the x86_64 tree. The following patch adds a bunch of missing
> enter_idle() callbacks for some of the "direct" interrupt handlers.
>
> changelog:
> - adds missing enter_idle() calls to most of the "direct" interrupt
> handlers.
HLT returns after an interrupt and then does enter_idle()
At least that's the theory. Do you have any evidence it doesn't work?
-Andi
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] x86_64 add missing enter_idle() calls
2006-10-16 10:08 ` Andi Kleen
@ 2006-10-16 14:13 ` Stephane Eranian
2006-10-16 14:36 ` Andi Kleen
0 siblings, 1 reply; 11+ messages in thread
From: Stephane Eranian @ 2006-10-16 14:13 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, akpm
Andi,
On Mon, Oct 16, 2006 at 12:08:13PM +0200, Andi Kleen wrote:
> On Friday 06 October 2006 10:16, Stephane Eranian wrote:
> > Hi,
> >
> > Unless I am mistaken, I think we are missing some calls to enter_idle()
> > in the x86_64 tree. The following patch adds a bunch of missing
> > enter_idle() callbacks for some of the "direct" interrupt handlers.
> >
> > changelog:
> > - adds missing enter_idle() calls to most of the "direct" interrupt
> > handlers.
>
> HLT returns after an interrupt and then does enter_idle()
With the original code, the number of callbacks you see for IDLE_START and
IDLE_STOP is not too obvious.
On an idle system Opteron 250 with HZ=250, one would expect to see for a 10s duration:
- for CPU0 : IDLE_START = IDLE_STOP = about 5000 calls
- for other CPUs: IDLE_START = IDLE_STOP = about 2500 calls
With the original code, you get the following number of calls:
CPU0.IDLE_START = 44 (enter_idle)
CPU0.IDLE_STOP = 5206 (exit_idle)
CPU1.IDLE_START = 27 (enter_idle)
CPU1.IDLE_STOP = 2528 (exit_idle)
Now, of course, you may get "batched" interrupts where you do not return to idle
before you process the next interrupt. But the difference seems quite high here.
Do you have an explanation for this?
--
-Stephane
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] x86_64 add missing enter_idle() calls
2006-10-16 14:13 ` Stephane Eranian
@ 2006-10-16 14:36 ` Andi Kleen
2006-10-16 14:44 ` Stephane Eranian
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Andi Kleen @ 2006-10-16 14:36 UTC (permalink / raw)
To: eranian; +Cc: linux-kernel, akpm
> With the original code, the number of callbacks you see for IDLE_START and
> IDLE_STOP is not too obvious.
>
> On an idle system Opteron 250 with HZ=250, one would expect to see for a 10s duration:
> - for CPU0 : IDLE_START = IDLE_STOP = about 5000 calls
> - for other CPUs: IDLE_START = IDLE_STOP = about 2500 calls
Yes.
> With the original code, you get the following number of calls:
>
> CPU0.IDLE_START = 44 (enter_idle)
> CPU0.IDLE_STOP = 5206 (exit_idle)
>
> CPU1.IDLE_START = 27 (enter_idle)
> CPU1.IDLE_STOP = 2528 (exit_idle)
>
> Now, of course, you may get "batched" interrupts where you do not return to idle
> before you process the next interrupt. But the difference seems quite high here.
Shouldn't happen for timer interrupts.
>
> Do you have an explanation for this?
Hmm, the last time I fixed this when you complained (post .18) i added a counter for
entry/exit and verified that it was balanced. I haven't rechecked since then.
I don't know why your numbers are off. You're using the latest git tree, right?
-Andi
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] x86_64 add missing enter_idle() calls
2006-10-16 14:36 ` Andi Kleen
@ 2006-10-16 14:44 ` Stephane Eranian
2006-10-17 16:47 ` Stephane Eranian
2006-10-21 9:18 ` Stephane Eranian
2 siblings, 0 replies; 11+ messages in thread
From: Stephane Eranian @ 2006-10-16 14:44 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, akpm
On Mon, Oct 16, 2006 at 04:36:52PM +0200, Andi Kleen wrote:
>
> > With the original code, the number of callbacks you see for IDLE_START and
> > IDLE_STOP is not too obvious.
> >
> > On an idle system Opteron 250 with HZ=250, one would expect to see for a 10s duration:
> > - for CPU0 : IDLE_START = IDLE_STOP = about 5000 calls
> > - for other CPUs: IDLE_START = IDLE_STOP = about 2500 calls
>
> Yes.
>
> > With the original code, you get the following number of calls:
> >
> > CPU0.IDLE_START = 44 (enter_idle)
> > CPU0.IDLE_STOP = 5206 (exit_idle)
> >
> > CPU1.IDLE_START = 27 (enter_idle)
> > CPU1.IDLE_STOP = 2528 (exit_idle)
> >
> > Now, of course, you may get "batched" interrupts where you do not return to idle
> > before you process the next interrupt. But the difference seems quite high here.
>
> Shouldn't happen for timer interrupts.
> >
> > Do you have an explanation for this?
>
> Hmm, the last time I fixed this when you complained (post .18) i added a counter for
> entry/exit and verified that it was balanced. I haven't rechecked since then.
> I don't know why your numbers are off. You're using the latest git tree, right?
>
No, I am still using 2.6.18. I saw your change in git (thanks for that). I need to try
with this tree and see what happens.
Thanks.
--
-Stephane
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] x86_64 add missing enter_idle() calls
2006-10-16 14:36 ` Andi Kleen
2006-10-16 14:44 ` Stephane Eranian
@ 2006-10-17 16:47 ` Stephane Eranian
2006-10-21 9:18 ` Stephane Eranian
2 siblings, 0 replies; 11+ messages in thread
From: Stephane Eranian @ 2006-10-17 16:47 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, akpm
Andi,
On Mon, Oct 16, 2006 at 04:36:52PM +0200, Andi Kleen wrote:
>
> > With the original code, the number of callbacks you see for IDLE_START and
> > IDLE_STOP is not too obvious.
> >
> > On an idle system Opteron 250 with HZ=250, one would expect to see for a 10s duration:
> > - for CPU0 : IDLE_START = IDLE_STOP = about 5000 calls
> > - for other CPUs: IDLE_START = IDLE_STOP = about 2500 calls
>
> Yes.
>
> Hmm, the last time I fixed this when you complained (post .18) i added a counter for
> entry/exit and verified that it was balanced. I haven't rechecked since then.
> I don't know why your numbers are off. You're using the latest git tree, right?
>
i have nowupgraded to 2.6.19-rc2 from GIT. It is slightly better but far from
the expected counts as shown above. I have intrumented the idle notifier to
get to the bottom of this. I have modified the code as follows:
static void enter_idle(void)
{
pfm_enter_idle++;
write_pda(isidle, 1);
atomic_notifier_call_chain(&idle_notifier, IDLE_START, NULL);
}
static void __exit_idle(void)
{
pfm_exit_idle1++;
if (read_pda(isidle) == 0)
return;
pfm_exit_idle2++;
write_pda(isidle, 0);
atomic_notifier_call_chain(&idle_notifier, IDLE_END, NULL);
}
/* Called from interrupts to signify idle end */
void exit_idle(void)
{
/* idle loop has pid 0 */
if (current->pid)
return;
pfm_exit_idle3++;
__exit_idle();
}
I export the counters via /sys. I run the following test:
reset-counters; sleep 10; print counters for CPU0
Here is what I get on CPU0 on an idle system after 10s:
enter_idle = 37 calls
exit_idle1 = 5209
exit_idle2 = 37 actual notifier calls (match enter_idle)
exit_idle3 = 5172
That means that the notifier was only called 37 times, far from 5000 expected.
Based on where the counts are for for idle1 and idle2, it appears that a lot of
calls for exit_idle() are blocked by 'if (read_pda(isidle) == 0) return' which
indicates that by the time we get to the interrupt handler, we are not in the low
level idle function anymore. In other words, we get interrupted before we get a
chance to go back to idle loop. The number of calls to exit_idle() is as expected
though. This means that in the idle loop, somehow, we do not loop very much. It is
as if we were interrupted a lot before we enter it and at the tail of the loop
(after __exit_idle).
I must admit I am still puzzled by the results and I do not have a good explanation
so far.
--
-Stephane
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] x86_64 add missing enter_idle() calls
2006-10-16 14:36 ` Andi Kleen
2006-10-16 14:44 ` Stephane Eranian
2006-10-17 16:47 ` Stephane Eranian
@ 2006-10-21 9:18 ` Stephane Eranian
2006-10-21 13:22 ` Andi Kleen
2 siblings, 1 reply; 11+ messages in thread
From: Stephane Eranian @ 2006-10-21 9:18 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, akpm, Stephane Eranian
Andi,
On Mon, Oct 16, 2006 at 04:36:52PM +0200, Andi Kleen wrote:
> > With the original code, the number of callbacks you see for IDLE_START and
> > IDLE_STOP is not too obvious.
> >
> > On an idle system Opteron 250 with HZ=250, one would expect to see for a 10s duration:
> > - for CPU0 : IDLE_START = IDLE_STOP = about 5000 calls
> > - for other CPUs: IDLE_START = IDLE_STOP = about 2500 calls
>
> Yes.
>
> > With the original code, you get the following number of calls:
> >
> > CPU0.IDLE_START = 44 (enter_idle)
> > CPU0.IDLE_STOP = 5206 (exit_idle)
> >
> > CPU1.IDLE_START = 27 (enter_idle)
> > CPU1.IDLE_STOP = 2528 (exit_idle)
> >
>
> Hmm, the last time I fixed this when you complained (post .18) i added a counter for
> entry/exit and verified that it was balanced. I haven't rechecked since then.
> I don't know why your numbers are off. You're using the latest git tree, right?
As I reported earlier, going to the Git kernel did not really change the
number of invocations of the idle notifier. I was very puzzled by this, so I
chased it some more.
I finally found the culprit for this. The current code is wrong for the
simple reason that the cpu_idle() function is NOT always the lowest level
idle loop function. For enter_idle()/__exit_idle() to work correctly they
must be placed in the lowest-level idle loop. The cpu_idle() eventually ends
up in the idle() function, but this one may have a loop in it! This is the
case when idle()=cpu_default_idle() and idle()=poll_idle(), for instance.
The reason why the idle notifier was called so few times, even though we had
the right number of interrupts, is simply because we were not getting out of
the idle() function. So I can, indeed, confirm that an interrupt in HLT
instruction gets you out, but HLT is in a loop from which you do not get out
unless you need to reschedule. By moving enter_idle()/__exit_idle() to
cpu_default_idle() I got the right number of calls for the idle notifier.
I see two solutions to this:
- move enter_idle()/__exit_idle() to the actual lowest-level loop,
in cpu_default_idle() and not in cpu_idle(). We would also have
to do something similar to poll_idle(), or any similar idle function
which contains a loop.
- add exit_idle() to all the local interrupt handlers, like my
initial patch was doing and leave the ente_idle()/__exit_idle()
where they are today.
--
-Stephane
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] x86_64 add missing enter_idle() calls
2006-10-21 9:18 ` Stephane Eranian
@ 2006-10-21 13:22 ` Andi Kleen
2006-10-24 10:00 ` Stephane Eranian
2006-10-25 21:29 ` Stephane Eranian
0 siblings, 2 replies; 11+ messages in thread
From: Andi Kleen @ 2006-10-21 13:22 UTC (permalink / raw)
To: eranian; +Cc: linux-kernel, akpm
> I finally found the culprit for this. The current code is wrong for the
> simple reason that the cpu_idle() function is NOT always the lowest level
> idle loop function. For enter_idle()/__exit_idle() to work correctly they
> must be placed in the lowest-level idle loop. The cpu_idle() eventually ends
> up in the idle() function, but this one may have a loop in it! This is the
> case when idle()=cpu_default_idle() and idle()=poll_idle(), for instance.
Ah now I remember - i had actually fixed that (it was the cleanup-idle-loops
patch) that moved the loops one level up. But then I disabled the patch
at the request of Andrew because it conflicted with some ACPI idle changes.
I'll readd it for .20, then things should be ok.
-Andi
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] x86_64 add missing enter_idle() calls
2006-10-21 13:22 ` Andi Kleen
@ 2006-10-24 10:00 ` Stephane Eranian
2006-10-25 21:29 ` Stephane Eranian
1 sibling, 0 replies; 11+ messages in thread
From: Stephane Eranian @ 2006-10-24 10:00 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, akpm
Andi,
On Sat, Oct 21, 2006 at 03:22:53PM +0200, Andi Kleen wrote:
>
> > I finally found the culprit for this. The current code is wrong for the
> > simple reason that the cpu_idle() function is NOT always the lowest level
> > idle loop function. For enter_idle()/__exit_idle() to work correctly they
> > must be placed in the lowest-level idle loop. The cpu_idle() eventually ends
> > up in the idle() function, but this one may have a loop in it! This is the
> > case when idle()=cpu_default_idle() and idle()=poll_idle(), for instance.
>
> Ah now I remember - i had actually fixed that (it was the cleanup-idle-loops
> patch) that moved the loops one level up. But then I disabled the patch
> at the request of Andrew because it conflicted with some ACPI idle changes.
>
> I'll readd it for .20, then things should be ok.
>
Ok, that's good. In the meantime, I need to produce the i386 equivalent.
Given how poll_idle() works (tight loop), I don't think we can just add
enter_idle()/exit_idle() around the loop, we also need to cover the interrupt
handlers, because that is the only place where we can catch activity considered
useful.
--
-Stephane
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] x86_64 add missing enter_idle() calls
2006-10-21 13:22 ` Andi Kleen
2006-10-24 10:00 ` Stephane Eranian
@ 2006-10-25 21:29 ` Stephane Eranian
2006-10-27 16:03 ` Andi Kleen
1 sibling, 1 reply; 11+ messages in thread
From: Stephane Eranian @ 2006-10-25 21:29 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, akpm
Andi,
Looking a the exit_idle() code:
static void __exit_idle(void)
{
if (read_pda(isidle) == 0)
return;
write_pda(isidle, 0);
atomic_notifier_call_chain(&idle_notifier, IDLE_END, NULL);
}
I am wondering whether you are exposed to a race condition w.r.t. to interrupts.
Supposed you are in idle, you get an interrupt and you execute __exit_idle(), the
test evaluate to false but before you can change the value of isidle, you get
a higher priority interrupt which then also calls __exit_idle(), the test is still
false and you invoke the notifier, when you return from this interrupt you also
clear the isidle, but you call the notifier yet a second time.
I think that isidle needs to be test_and_clear atomically for this to guarantee
only one call the notifier on __exit_idle().
what do you think?
On Sat, Oct 21, 2006 at 03:22:53PM +0200, Andi Kleen wrote:
>
> > I finally found the culprit for this. The current code is wrong for the
> > simple reason that the cpu_idle() function is NOT always the lowest level
> > idle loop function. For enter_idle()/__exit_idle() to work correctly they
> > must be placed in the lowest-level idle loop. The cpu_idle() eventually ends
> > up in the idle() function, but this one may have a loop in it! This is the
> > case when idle()=cpu_default_idle() and idle()=poll_idle(), for instance.
>
> Ah now I remember - i had actually fixed that (it was the cleanup-idle-loops
> patch) that moved the loops one level up. But then I disabled the patch
> at the request of Andrew because it conflicted with some ACPI idle changes.
>
> I'll readd it for .20, then things should be ok.
>
> -Andi
--
-Stephane
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] x86_64 add missing enter_idle() calls
2006-10-25 21:29 ` Stephane Eranian
@ 2006-10-27 16:03 ` Andi Kleen
0 siblings, 0 replies; 11+ messages in thread
From: Andi Kleen @ 2006-10-27 16:03 UTC (permalink / raw)
To: eranian; +Cc: linux-kernel, akpm
On Wednesday 25 October 2006 14:29, Stephane Eranian wrote:
> Andi,
>
> Looking a the exit_idle() code:
>
> static void __exit_idle(void)
> {
> if (read_pda(isidle) == 0)
> return;
> write_pda(isidle, 0);
> atomic_notifier_call_chain(&idle_notifier, IDLE_END, NULL);
> }
>
> I am wondering whether you are exposed to a race condition w.r.t. to
> interrupts. Supposed you are in idle, you get an interrupt and you execute
> __exit_idle(), the test evaluate to false but before you can change the
> value of isidle, you get a higher priority interrupt which then also calls
> __exit_idle(), the test is still false and you invoke the notifier, when
> you return from this interrupt you also clear the isidle, but you call the
> notifier yet a second time.
>
> I think that isidle needs to be test_and_clear atomically for this to
> guarantee only one call the notifier on __exit_idle().
>
> what do you think?
Agreed. I will fix that. Thanks
-Andi
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2006-10-27 16:04 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-10-06 8:16 [PATCH] x86_64 add missing enter_idle() calls Stephane Eranian
2006-10-16 10:08 ` Andi Kleen
2006-10-16 14:13 ` Stephane Eranian
2006-10-16 14:36 ` Andi Kleen
2006-10-16 14:44 ` Stephane Eranian
2006-10-17 16:47 ` Stephane Eranian
2006-10-21 9:18 ` Stephane Eranian
2006-10-21 13:22 ` Andi Kleen
2006-10-24 10:00 ` Stephane Eranian
2006-10-25 21:29 ` Stephane Eranian
2006-10-27 16:03 ` Andi Kleen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox