* [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
@ 2010-09-19 16:28 Avi Kivity
2010-09-27 8:38 ` Avi Kivity
2010-09-27 10:31 ` Joerg Roedel
0 siblings, 2 replies; 15+ messages in thread
From: Avi Kivity @ 2010-09-19 16:28 UTC (permalink / raw)
To: x86; +Cc: linux-kernel, kvm
On machines without monitor/mwait we use an sti; hlt sequence to atomically
enable interrupts and put the cpu to sleep. The sequence uses the "interrupt
shadow" property of the sti instruction: interrupts are enabled only after
the instruction following sti has been executed. This means an interrupt
cannot happen in the middle of the sequence, which would leave us with
the interrupt processed but the cpu halted.
The interrupt shadow, however, can be broken by an nmi; the following
sequence
sti
nmi ... iret
# interrupt shadow disabled
intr ... iret
hlt
puts the cpu to sleep, even though the interrupt may need additional processing
after the hlt (like scheduling a task).
sti is explicitly documented as not blocking NMIs in its interrupt shadow,
though many processors do inhibit nmi immediately after sti.
Avoid the race by checking, during an nmi, if we hit the safe halt sequence.
If we did, increment the instruction pointer past the hlt instruction; this
allows an immediately following interrupt to return to a safe place.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/include/asm/irqflags.h | 18 +++++++++++++++++-
arch/x86/kernel/traps.c | 14 ++++++++++++++
2 files changed, 31 insertions(+), 1 deletions(-)
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 9e2b952..f412167 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -44,9 +44,25 @@ static inline void native_irq_enable(void)
asm volatile("sti": : :"memory");
}
+extern void safe_halt_addr(void);
+
static inline void native_safe_halt(void)
{
- asm volatile("sti; hlt": : :"memory");
+ asm volatile("sti \n\t"
+ /*
+ * If NMI hits us here, it negates the interrupt shadow
+ * induced by STI. So the NMI handler checks for
+ * safe_halt_addr and skips the hlt if it loses the
+ * interrupt shadow.
+ *
+ * If native_safe_halt() is ever instantiated more
+ * than once, this will fail to build, and we'll need
+ * a list of addresses in a special section.
+ */
+ ".globl safe_halt_addr \n\t"
+ "safe_halt_addr: \n\t"
+ "hlt"
+ : : :"memory");
}
static inline void native_halt(void)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 60788de..f67da48 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -438,6 +438,20 @@ do_nmi(struct pt_regs *regs, long error_code)
inc_irq_stat(__nmi_count);
+ /*
+ * We hit in the middle of an sti; hlt instruction. When we return,
+ * the interrupt shadow cast by sti will no longer be in effect; then,
+ * if an interrupt causes a wakeup, we won't notice it since the hlt
+ * will take effect and block the cpu.
+ *
+ * If we detect this situation, fix it by advancing the instruction
+ * pointer past the hlt instruction; if the interrupt doesn't happen,
+ * we'll spend a few cycles falling asleep again, but that's better
+ * than a missed wakeup.
+ */
+ if (regs->cs == __KERNEL_CS && regs->ip == (ulong)safe_halt_addr)
+ ++regs->ip;
+
if (!ignore_nmis)
default_do_nmi(regs);
--
1.7.2.3
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-19 16:28 [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr Avi Kivity
@ 2010-09-27 8:38 ` Avi Kivity
2010-09-27 9:13 ` Alexander Graf
2010-09-27 10:31 ` Joerg Roedel
1 sibling, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-09-27 8:38 UTC (permalink / raw)
To: x86; +Cc: linux-kernel, kvm
On 09/19/2010 06:28 PM, Avi Kivity wrote:
> On machines without monitor/mwait we use an sti; hlt sequence to atomically
> enable interrupts and put the cpu to sleep. The sequence uses the "interrupt
> shadow" property of the sti instruction: interrupts are enabled only after
> the instruction following sti has been executed. This means an interrupt
> cannot happen in the middle of the sequence, which would leave us with
> the interrupt processed but the cpu halted.
>
> The interrupt shadow, however, can be broken by an nmi; the following
> sequence
>
> sti
> nmi ... iret
> # interrupt shadow disabled
> intr ... iret
> hlt
>
> puts the cpu to sleep, even though the interrupt may need additional processing
> after the hlt (like scheduling a task).
>
> sti is explicitly documented as not blocking NMIs in its interrupt shadow,
> though many processors do inhibit nmi immediately after sti.
>
> Avoid the race by checking, during an nmi, if we hit the safe halt sequence.
> If we did, increment the instruction pointer past the hlt instruction; this
> allows an immediately following interrupt to return to a safe place.
>
> Signed-off-by: Avi Kivity <avi@redhat.com>
Ping.
> ---
> arch/x86/include/asm/irqflags.h | 18 +++++++++++++++++-
> arch/x86/kernel/traps.c | 14 ++++++++++++++
> 2 files changed, 31 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
> index 9e2b952..f412167 100644
> --- a/arch/x86/include/asm/irqflags.h
> +++ b/arch/x86/include/asm/irqflags.h
> @@ -44,9 +44,25 @@ static inline void native_irq_enable(void)
> asm volatile("sti": : :"memory");
> }
>
> +extern void safe_halt_addr(void);
> +
> static inline void native_safe_halt(void)
> {
> - asm volatile("sti; hlt": : :"memory");
> + asm volatile("sti \n\t"
> + /*
> + * If NMI hits us here, it negates the interrupt shadow
> + * induced by STI. So the NMI handler checks for
> + * safe_halt_addr and skips the hlt if it loses the
> + * interrupt shadow.
> + *
> + * If native_safe_halt() is ever instantiated more
> + * than once, this will fail to build, and we'll need
> + * a list of addresses in a special section.
> + */
> + ".globl safe_halt_addr \n\t"
> + "safe_halt_addr: \n\t"
> + "hlt"
> + : : :"memory");
> }
>
> static inline void native_halt(void)
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 60788de..f67da48 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -438,6 +438,20 @@ do_nmi(struct pt_regs *regs, long error_code)
>
> inc_irq_stat(__nmi_count);
>
> + /*
> + * We hit in the middle of an sti; hlt instruction. When we return,
> + * the interrupt shadow cast by sti will no longer be in effect; then,
> + * if an interrupt causes a wakeup, we won't notice it since the hlt
> + * will take effect and block the cpu.
> + *
> + * If we detect this situation, fix it by advancing the instruction
> + * pointer past the hlt instruction; if the interrupt doesn't happen,
> + * we'll spend a few cycles falling asleep again, but that's better
> + * than a missed wakeup.
> + */
> + if (regs->cs == __KERNEL_CS && regs->ip == (ulong)safe_halt_addr)
> + ++regs->ip;
> +
> if (!ignore_nmis)
> default_do_nmi(regs);
>
--
error compiling committee.c: too many arguments to function
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-27 8:38 ` Avi Kivity
@ 2010-09-27 9:13 ` Alexander Graf
2010-09-27 9:15 ` Alexander Graf
0 siblings, 1 reply; 15+ messages in thread
From: Alexander Graf @ 2010-09-27 9:13 UTC (permalink / raw)
To: Avi Kivity; +Cc: x86, linux-kernel, kvm
On 27.09.2010, at 10:38, Avi Kivity wrote:
> On 09/19/2010 06:28 PM, Avi Kivity wrote:
>> On machines without monitor/mwait we use an sti; hlt sequence to atomically
>> enable interrupts and put the cpu to sleep. The sequence uses the "interrupt
>> shadow" property of the sti instruction: interrupts are enabled only after
>> the instruction following sti has been executed. This means an interrupt
>> cannot happen in the middle of the sequence, which would leave us with
>> the interrupt processed but the cpu halted.
>>
>> The interrupt shadow, however, can be broken by an nmi; the following
>> sequence
>>
>> sti
>> nmi ... iret
>> # interrupt shadow disabled
>> intr ... iret
>> hlt
>>
>> puts the cpu to sleep, even though the interrupt may need additional processing
>> after the hlt (like scheduling a task).
>>
>> sti is explicitly documented as not blocking NMIs in its interrupt shadow,
>> though many processors do inhibit nmi immediately after sti.
>>
>> Avoid the race by checking, during an nmi, if we hit the safe halt sequence.
>> If we did, increment the instruction pointer past the hlt instruction; this
>> allows an immediately following interrupt to return to a safe place.
>>
>> Signed-off-by: Avi Kivity <avi@redhat.com>
>
> Ping.
Wow, this is incredibly ugly :). Can't we just mask NMIs when the interrupt shadow is active?
Alex
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-27 9:13 ` Alexander Graf
@ 2010-09-27 9:15 ` Alexander Graf
2010-09-27 9:17 ` Avi Kivity
0 siblings, 1 reply; 15+ messages in thread
From: Alexander Graf @ 2010-09-27 9:15 UTC (permalink / raw)
To: Alexander Graf; +Cc: Avi Kivity, x86, linux-kernel, kvm
On 27.09.2010, at 11:13, Alexander Graf wrote:
>
> On 27.09.2010, at 10:38, Avi Kivity wrote:
>
>> On 09/19/2010 06:28 PM, Avi Kivity wrote:
>>> On machines without monitor/mwait we use an sti; hlt sequence to atomically
>>> enable interrupts and put the cpu to sleep. The sequence uses the "interrupt
>>> shadow" property of the sti instruction: interrupts are enabled only after
>>> the instruction following sti has been executed. This means an interrupt
>>> cannot happen in the middle of the sequence, which would leave us with
>>> the interrupt processed but the cpu halted.
>>>
>>> The interrupt shadow, however, can be broken by an nmi; the following
>>> sequence
>>>
>>> sti
>>> nmi ... iret
>>> # interrupt shadow disabled
>>> intr ... iret
>>> hlt
>>>
>>> puts the cpu to sleep, even though the interrupt may need additional processing
>>> after the hlt (like scheduling a task).
>>>
>>> sti is explicitly documented as not blocking NMIs in its interrupt shadow,
>>> though many processors do inhibit nmi immediately after sti.
>>>
>>> Avoid the race by checking, during an nmi, if we hit the safe halt sequence.
>>> If we did, increment the instruction pointer past the hlt instruction; this
>>> allows an immediately following interrupt to return to a safe place.
>>>
>>> Signed-off-by: Avi Kivity <avi@redhat.com>
>>
>> Ping.
>
> Wow, this is incredibly ugly :). Can't we just mask NMIs when the interrupt shadow is active?
Yeah, that's me writing without thinking. So this means that the race can also happen on real hardware?
Alex
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-27 9:15 ` Alexander Graf
@ 2010-09-27 9:17 ` Avi Kivity
2010-09-27 9:22 ` Alexander Graf
0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-09-27 9:17 UTC (permalink / raw)
To: Alexander Graf; +Cc: x86, linux-kernel, kvm
On 09/27/2010 11:15 AM, Alexander Graf wrote:
> >
> > Wow, this is incredibly ugly :). Can't we just mask NMIs when the interrupt shadow is active?
I plan to do that, for all the code that's out there relying on the STI
interrupt shadow masking NMIs.
> Yeah, that's me writing without thinking. So this means that the race can also happen on real hardware?
>
Yes. At least on documented hardware. Some (most? all?) hardware does
mask NMIs after STI.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-27 9:17 ` Avi Kivity
@ 2010-09-27 9:22 ` Alexander Graf
2010-09-27 9:27 ` Avi Kivity
0 siblings, 1 reply; 15+ messages in thread
From: Alexander Graf @ 2010-09-27 9:22 UTC (permalink / raw)
To: Avi Kivity; +Cc: x86, linux-kernel, kvm
On 27.09.2010, at 11:17, Avi Kivity wrote:
> On 09/27/2010 11:15 AM, Alexander Graf wrote:
>> >
>> > Wow, this is incredibly ugly :). Can't we just mask NMIs when the interrupt shadow is active?
>
> I plan to do that, for all the code that's out there relying on the STI interrupt shadow masking NMIs.
>
>> Yeah, that's me writing without thinking. So this means that the race can also happen on real hardware?
>>
>
> Yes. At least on documented hardware. Some (most? all?) hardware does mask NMIs after STI.
If all hardware masks NMIs after STI, wouldn't it be better to update the spec and declare KVM buggy for injecting NMIs there?
Alex
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-27 9:22 ` Alexander Graf
@ 2010-09-27 9:27 ` Avi Kivity
2010-09-27 9:36 ` Alexander Graf
2010-09-27 21:55 ` H. Peter Anvin
0 siblings, 2 replies; 15+ messages in thread
From: Avi Kivity @ 2010-09-27 9:27 UTC (permalink / raw)
To: Alexander Graf; +Cc: x86, linux-kernel, kvm
On 09/27/2010 11:22 AM, Alexander Graf wrote:
> On 27.09.2010, at 11:17, Avi Kivity wrote:
>
> > On 09/27/2010 11:15 AM, Alexander Graf wrote:
> >> >
> >> > Wow, this is incredibly ugly :). Can't we just mask NMIs when the interrupt shadow is active?
> >
> > I plan to do that, for all the code that's out there relying on the STI interrupt shadow masking NMIs.
> >
> >> Yeah, that's me writing without thinking. So this means that the race can also happen on real hardware?
> >>
> >
> > Yes. At least on documented hardware. Some (most? all?) hardware does mask NMIs after STI.
>
> If all hardware masks NMIs after STI, wouldn't it be better to update the spec and declare KVM buggy for injecting NMIs there?
>
I don't have write permissions for the spec. If you can verify that all
existing and future hardware will mask NMI after STI and convince the
spec owners to update the specifications, I'm all for it; it's certainly
a cleaner solution.
Note these days hardware includes virtual hardware; though it's less
affected. Missing a wakeup is critical for real time systems --
virtualized systems are unlikely to notice it unless they have exactly
one interrupt source.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-27 9:27 ` Avi Kivity
@ 2010-09-27 9:36 ` Alexander Graf
2010-09-27 21:55 ` H. Peter Anvin
1 sibling, 0 replies; 15+ messages in thread
From: Alexander Graf @ 2010-09-27 9:36 UTC (permalink / raw)
To: Avi Kivity; +Cc: x86, LKML List, KVM list, Joerg Roedel, Wilfred Yu
On 27.09.2010, at 11:27, Avi Kivity wrote:
> On 09/27/2010 11:22 AM, Alexander Graf wrote:
>> On 27.09.2010, at 11:17, Avi Kivity wrote:
>>
>> > On 09/27/2010 11:15 AM, Alexander Graf wrote:
>> >> >
>> >> > Wow, this is incredibly ugly :). Can't we just mask NMIs when the interrupt shadow is active?
>> >
>> > I plan to do that, for all the code that's out there relying on the STI interrupt shadow masking NMIs.
>> >
>> >> Yeah, that's me writing without thinking. So this means that the race can also happen on real hardware?
>> >>
>> >
>> > Yes. At least on documented hardware. Some (most? all?) hardware does mask NMIs after STI.
>>
>> If all hardware masks NMIs after STI, wouldn't it be better to update the spec and declare KVM buggy for injecting NMIs there?
>>
>
> I don't have write permissions for the spec. If you can verify that all existing and future hardware will mask NMI after STI and convince the spec owners to update the specifications, I'm all for it; it's certainly a cleaner solution.
*shrug* I don't have permissions for that either, but we can CC people who might get into touch with some who can.
Alex
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-27 9:27 ` Avi Kivity
2010-09-27 9:36 ` Alexander Graf
@ 2010-09-27 21:55 ` H. Peter Anvin
2010-09-28 8:50 ` Avi Kivity
1 sibling, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2010-09-27 21:55 UTC (permalink / raw)
To: Avi Kivity; +Cc: Alexander Graf, x86, linux-kernel, kvm
On 09/27/2010 02:27 AM, Avi Kivity wrote:
>
> I don't have write permissions for the spec. If you can verify that all
> existing and future hardware will mask NMI after STI and convince the
> spec owners to update the specifications, I'm all for it; it's certainly
> a cleaner solution.
>
I'm trying to see if this is doable. As you can well imagine, though,
it takes time.
-hpa
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-27 21:55 ` H. Peter Anvin
@ 2010-09-28 8:50 ` Avi Kivity
2010-09-28 9:22 ` Roedel, Joerg
2010-09-28 15:34 ` H. Peter Anvin
0 siblings, 2 replies; 15+ messages in thread
From: Avi Kivity @ 2010-09-28 8:50 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Alexander Graf, x86, linux-kernel, kvm, Joerg Roedel
On 09/27/2010 11:55 PM, H. Peter Anvin wrote:
> On 09/27/2010 02:27 AM, Avi Kivity wrote:
> >
> > I don't have write permissions for the spec. If you can verify that all
> > existing and future hardware will mask NMI after STI and convince the
> > spec owners to update the specifications, I'm all for it; it's certainly
> > a cleaner solution.
> >
>
> I'm trying to see if this is doable. As you can well imagine, though,
> it takes time.
Joerg, can you look at this from the AMD side?
First, do AMD processors block NMI after an IF-enabling STI? The
documentation explicitly states that they don't:
> Sets the interrupt flag (IF) in the rFLAGS register to 1, thereby
> allowing external interrupts received on
> the INTR input. Interrupts received on the non-maskable interrupt
> (NMI) input are not affected by this
> instruction.
though that may refer to the general effect of EFLAGS.IF, not STI.
Second, if NMIs are blocked, is it possible to retro-document this?
Personally I think the safer route is to take the patch. There are
other processors besides Intel and AMD and we can't test all of them,
not to mention various emulators and virtual machine monitors out there.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-28 8:50 ` Avi Kivity
@ 2010-09-28 9:22 ` Roedel, Joerg
2010-09-28 15:34 ` H. Peter Anvin
1 sibling, 0 replies; 15+ messages in thread
From: Roedel, Joerg @ 2010-09-28 9:22 UTC (permalink / raw)
To: Avi Kivity
Cc: H. Peter Anvin, Alexander Graf, x86@kernel.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org
On Tue, Sep 28, 2010 at 04:50:24AM -0400, Avi Kivity wrote:
> On 09/27/2010 11:55 PM, H. Peter Anvin wrote:
> > On 09/27/2010 02:27 AM, Avi Kivity wrote:
> > >
> > > I don't have write permissions for the spec. If you can verify that all
> > > existing and future hardware will mask NMI after STI and convince the
> > > spec owners to update the specifications, I'm all for it; it's certainly
> > > a cleaner solution.
> > >
> >
> > I'm trying to see if this is doable. As you can well imagine, though,
> > it takes time.
>
> Joerg, can you look at this from the AMD side?
Yes, I will discuss this with the hardware people. It will take some time
here too.
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-28 8:50 ` Avi Kivity
2010-09-28 9:22 ` Roedel, Joerg
@ 2010-09-28 15:34 ` H. Peter Anvin
2010-09-28 16:30 ` Avi Kivity
1 sibling, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2010-09-28 15:34 UTC (permalink / raw)
To: Avi Kivity; +Cc: Alexander Graf, x86, linux-kernel, kvm, Joerg Roedel
On 09/28/2010 01:50 AM, Avi Kivity wrote:
>
> Personally I think the safer route is to take the patch. There are
> other processors besides Intel and AMD and we can't test all of them,
> not to mention various emulators and virtual machine monitors out there.
>
Speaking for the smoldering crater that used to be *Transmeta*, I'm
(from memory) quite certain they blocked NMI and that this was
intentional behavior.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-28 15:34 ` H. Peter Anvin
@ 2010-09-28 16:30 ` Avi Kivity
0 siblings, 0 replies; 15+ messages in thread
From: Avi Kivity @ 2010-09-28 16:30 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Alexander Graf, x86, linux-kernel, kvm, Joerg Roedel
On 09/28/2010 05:34 PM, H. Peter Anvin wrote:
> On 09/28/2010 01:50 AM, Avi Kivity wrote:
> >
> > Personally I think the safer route is to take the patch. There are
> > other processors besides Intel and AMD and we can't test all of them,
> > not to mention various emulators and virtual machine monitors out there.
> >
>
> Speaking for the smoldering crater that used to be *Transmeta*, I'm
> (from memory) quite certain they blocked NMI and that this was
> intentional behavior.
>
We'll need:
- Intel
- AMD
- Geode
- Via
- kvm (currently no, but plan to)
- qemu
- vmware
- others?
It should be relatively simple to write a small test case to test this.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-19 16:28 [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr Avi Kivity
2010-09-27 8:38 ` Avi Kivity
@ 2010-09-27 10:31 ` Joerg Roedel
2010-09-27 14:17 ` Avi Kivity
1 sibling, 1 reply; 15+ messages in thread
From: Joerg Roedel @ 2010-09-27 10:31 UTC (permalink / raw)
To: Avi Kivity; +Cc: x86, linux-kernel, kvm
On Sun, Sep 19, 2010 at 06:28:19PM +0200, Avi Kivity wrote:
> On machines without monitor/mwait we use an sti; hlt sequence to atomically
> enable interrupts and put the cpu to sleep. The sequence uses the "interrupt
> shadow" property of the sti instruction: interrupts are enabled only after
> the instruction following sti has been executed. This means an interrupt
> cannot happen in the middle of the sequence, which would leave us with
> the interrupt processed but the cpu halted.
>
> The interrupt shadow, however, can be broken by an nmi; the following
> sequence
>
> sti
> nmi ... iret
> # interrupt shadow disabled
> intr ... iret
> hlt
>
> puts the cpu to sleep, even though the interrupt may need additional
> processing after the hlt (like scheduling a task).
Doesn't the interrupt return path check for a re-schedule condition
before iret? So I believe the handler would not jump back to the
idle task if something else becomes runnable in the interrupt handler,
no?
Joerg
* Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
2010-09-27 10:31 ` Joerg Roedel
@ 2010-09-27 14:17 ` Avi Kivity
0 siblings, 0 replies; 15+ messages in thread
From: Avi Kivity @ 2010-09-27 14:17 UTC (permalink / raw)
To: Joerg Roedel; +Cc: x86, linux-kernel, kvm
On 09/27/2010 12:31 PM, Joerg Roedel wrote:
> On Sun, Sep 19, 2010 at 06:28:19PM +0200, Avi Kivity wrote:
> > On machines without monitor/mwait we use an sti; hlt sequence to atomically
> > enable interrupts and put the cpu to sleep. The sequence uses the "interrupt
> > shadow" property of the sti instruction: interrupts are enabled only after
> > the instruction following sti has been executed. This means an interrupt
> > cannot happen in the middle of the sequence, which would leave us with
> > the interrupt processed but the cpu halted.
> >
> > The interrupt shadow, however, can be broken by an nmi; the following
> > sequence
> >
> > sti
> > nmi ... iret
> > # interrupt shadow disabled
> > intr ... iret
> > hlt
> >
> > puts the cpu to sleep, even though the interrupt may need additional
> > processing after the hlt (like scheduling a task).
>
> Doesn't the interrupt return path check for a re-schedule condition
> before iret? So to my believe the handler would not jump back to the
> idle task if something else becomes running in the interrupt handler,
> no?
>
Perhaps on preemptible kernels? But at least on non-preemptible
kernels, you can't just switch tasks while running kernel code.
void cpu_idle(void)
{
        current_thread_info()->status |= TS_POLLING;

        /*
         * If we're the non-boot CPU, nothing set the stack canary up
         * for us. CPU0 already has it initialized but no harm in
         * doing it again. This is a good place for updating it, as
         * we wont ever return from this function (so the invalid
         * canaries already on the stack wont ever trigger).
         */
        boot_init_stack_canary();

        /* endless idle loop with no priority at all */
        while (1) {
                tick_nohz_stop_sched_tick(1);
                while (!need_resched()) {
                        rmb();
                        if (cpu_is_offline(smp_processor_id()))
                                play_dead();
                        /*
                         * Idle routines should keep interrupts disabled
                         * from here on, until they go to idle.
                         * Otherwise, idle callbacks can misfire.
                         */
                        local_irq_disable();
                        enter_idle();
                        /* Don't trace irqs off for idle */
                        stop_critical_timings();
                        pm_idle();
                        start_critical_timings();
                        trace_power_end(smp_processor_id());
                        /* In many cases the interrupt that ended idle
                           has already called exit_idle. But some idle
                           loops can be woken up without interrupt. */
                        __exit_idle();
                }
                tick_nohz_restart_sched_tick();
                preempt_enable_no_resched();
                schedule();
                preempt_disable();
        }
}
Looks like we rely on an explicit schedule() - pm_idle() is called with
preemption disabled.
(pm_idle eventually calls safe_halt() if no other idle method is used)
--
error compiling committee.c: too many arguments to function