RCU question

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* RCU question
@ 2004-12-09 23:59 George Anzinger
  2004-12-10  4:31 ` Dipankar Sarma
  0 siblings, 1 reply; 44+ messages in thread
From: George Anzinger @ 2004-12-09 23:59 UTC (permalink / raw)
  To: Dipankar Sarma, Manfred Spraul, lkml

I am working on VST code.  This code is called from the idle loop to check for 
future timers.  It then sets up a timer to interrupt in time to handle the 
nearest timer and turns off the time base interrupt source.  As part of 
qualifying the entry to this state I want to make sure there is no pending work 
so, from the idle task I have this:

	if (local_softirq_pending())
		do_softirq();

	BUG_ON(local_softirq_pending());

I did not really expect to find any pending softirqs, but, not only are there 
some, they don't go away and the system BUGs.  The offender is the RCU task. 
The question is: is this normal or is there something wrong?
-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-09 23:59 RCU question George Anzinger
@ 2004-12-10  4:31 ` Dipankar Sarma
  2004-12-10 19:42   ` George Anzinger
  0 siblings, 1 reply; 44+ messages in thread
From: Dipankar Sarma @ 2004-12-10  4:31 UTC (permalink / raw)
  To: ganzinger; +Cc: Manfred Spraul, lkml

On Thu, Dec 09, 2004 at 03:59:45PM -0800, George Anzinger wrote:
> I am working on VST code.  This code is called from the idle loop to check 
> for future timers.  It then sets up a timer to interrupt in time to handle 
> the nearest timer and turns off the time base interrupt source.  As part of 
> qualifying the entry to this state I want to make sure there is no pending 
> work so, from the idle task I have this:
> 
> 	if (local_softirq_pending())
> 		do_softirq();
> 
> 	BUG_ON(local_softirq_pending());
> 
> I did not really expect to find any pending softirqs, but, not only are 
> there some, they don't go away and the system BUGs.  The offender is the 
> RCU task. The question is: is this normal or is there something wrong?

Why do you think there would not be any softirq pending after do_softirq() ?
What if the cpu gets a network interrupt which raises a softirq ?
And yes, RCU processing in softirq context can re-raise the softirq.
AFAICS, it is perfectly normal.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-10  4:31 ` Dipankar Sarma
@ 2004-12-10 19:42   ` George Anzinger
  2004-12-10 20:40     ` Dipankar Sarma
  0 siblings, 1 reply; 44+ messages in thread
From: George Anzinger @ 2004-12-10 19:42 UTC (permalink / raw)
  To: dipankar; +Cc: ganzinger, Manfred Spraul, lkml

Dipankar Sarma wrote:
> On Thu, Dec 09, 2004 at 03:59:45PM -0800, George Anzinger wrote:
> 
>>I am working on VST code.  This code is called from the idle loop to check 
>>for future timers.  It then sets up a timer to interrupt in time to handle 
>>the nearest timer and turns off the time base interrupt source.  As part of 
>>qualifying the entry to this state I want to make sure there is no pending 
>>work so, from the idle task I have this:
>>
>>	if (local_softirq_pending())
>>		do_softirq();
>>
>>	BUG_ON(local_softirq_pending());
>>
>>I did not really expect to find any pending softirqs, but, not only are 
>>there some, they don't go away and the system BUGs.  The offender is the 
>>RCU task. The question is: is this normal or is there something wrong?
> 
> 
> Why do you think there would not be any softirq pending after do_softirq() ?
> What if the cpu gets a network interrupt which raises a softirq ?

Yes, but it is serviced on interrupt exit and the task level code would never 
see it.

> And yes, RCU processing in softirq context can re-raise the softirq.
> AFAICS, it is perfectly normal.

My assumption was that, this being the idle task, RCU would be more than happy 
to finish all its pending tasks.

It may be necessary for me to rethink the conditions required to go into the VST 
state.  I had assumed that it required NO softirq pending as a pre condition. 
 From this point on we would have the interrupt system off until the hardware 
sleep instruction (hlt in the x86 case).

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-10 19:42   ` George Anzinger
@ 2004-12-10 20:40     ` Dipankar Sarma
  2004-12-10 20:45       ` Lee Revell
  0 siblings, 1 reply; 44+ messages in thread
From: Dipankar Sarma @ 2004-12-10 20:40 UTC (permalink / raw)
  To: George Anzinger; +Cc: ganzinger, Manfred Spraul, lkml

On Fri, Dec 10, 2004 at 11:42:55AM -0800, George Anzinger wrote:
> Dipankar Sarma wrote:
> >And yes, RCU processing in softirq context can re-raise the softirq.
> >AFAICS, it is perfectly normal.
> 
> My assumption was that, this being the idle task, RCU would be more than 
> happy to finish all its pending tasks.

We try to avoid really long running softirqs (RCU tasklet in this case)
for better scheduling latency. A long running rcu tasklet during
an idle cpu may delay running of an RT process that becomes runnable
during the rcu tasklet.

> 
> It may be necessary for me to rethink the conditions required to go into 
> the VST state.  I had assumed that it required NO softirq pending as a pre 
> condition. From this point on we would have the interrupt system off until 
> the hardware sleep instruction (hlt in the x86 case).

Unfortunately, we aren't there yet. But it is in my TODO list for a 
generic nohz system.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-10 20:40     ` Dipankar Sarma
@ 2004-12-10 20:45       ` Lee Revell
  2004-12-10 21:02         ` George Anzinger
  0 siblings, 1 reply; 44+ messages in thread
From: Lee Revell @ 2004-12-10 20:45 UTC (permalink / raw)
  To: dipankar; +Cc: George Anzinger, ganzinger, Manfred Spraul, lkml

On Sat, 2004-12-11 at 02:10 +0530, Dipankar Sarma wrote:
> On Fri, Dec 10, 2004 at 11:42:55AM -0800, George Anzinger wrote:
> > Dipankar Sarma wrote:
> > >And yes, RCU processing in softirq context can re-raise the softirq.
> > >AFAICS, it is perfectly normal.
> > 
> > My assumption was that, this being the idle task, RCU would be more than 
> > happy to finish all its pending tasks.
> 
> We try to avoid really long running softirqs (RCU tasklet in this case)
> for better scheduling latency. A long running rcu tasklet during
> an idle cpu may delay running of an RT process that becomes runnable
> during the rcu tasklet.
> 

Well, softirqs should really be preemptible if you care about RT task
latency.  Ingo's patches have had this for months.  Works great.  Maybe
it's time to push it upstream.

Lee


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-10 20:45       ` Lee Revell
@ 2004-12-10 21:02         ` George Anzinger
  2004-12-10 22:58           ` Zwane Mwaikambo
  0 siblings, 1 reply; 44+ messages in thread
From: George Anzinger @ 2004-12-10 21:02 UTC (permalink / raw)
  To: Lee Revell; +Cc: dipankar, ganzinger, Manfred Spraul, lkml

Lee Revell wrote:
> On Sat, 2004-12-11 at 02:10 +0530, Dipankar Sarma wrote:
> 
>>On Fri, Dec 10, 2004 at 11:42:55AM -0800, George Anzinger wrote:
>>
>>>Dipankar Sarma wrote:
>>>
>>>>And yes, RCU processing in softirq context can re-raise the softirq.
>>>>AFAICS, it is perfectly normal.
>>>
>>>My assumption was that, this being the idle task, RCU would be more than 
>>>happy to finish all its pending tasks.
>>
>>We try to avoid really long running softirqs (RCU tasklet in this case)
>>for better scheduling latency. A long running rcu tasklet during
>>an idle cpu may delay running of an RT process that becomes runnable
>>during the rcu tasklet.
>>
> 
> 
> Well, softirqs should really be preemptible if you care about RT task
> latency.  Ingo's patches have had this for months.  Works great.  Maybe
> it's time to push it upstream.

Yes, I understand, and soft_irq() does turn on interrupts...
I was thinking of something like:

	while(softirq_pending()) {
		local_irq_enable();
		do_softirq();
		local_irq_disable();
	}
	
	<proceed to idle hlt...>
-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-10 21:02         ` George Anzinger
@ 2004-12-10 22:58           ` Zwane Mwaikambo
  2004-12-11  2:22             ` George Anzinger
  0 siblings, 1 reply; 44+ messages in thread
From: Zwane Mwaikambo @ 2004-12-10 22:58 UTC (permalink / raw)
  To: George Anzinger; +Cc: Lee Revell, dipankar, ganzinger, Manfred Spraul, lkml

On Fri, 10 Dec 2004, George Anzinger wrote:

> > Well, softirqs should really be preemptible if you care about RT task
> > latency.  Ingo's patches have had this for months.  Works great.  Maybe
> > it's time to push it upstream.
> 
> Yes, I understand, and soft_irq() does turn on interrupts...
> I was thinking of something like:
> 
> 	while(softirq_pending()) {
> 		local_irq_enable();
> 		do_softirq();
> 		local_irq_disable();
> 	}
> 		<proceed to idle hlt...>

But that's a deadlock and if you enable interrupts you race.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-10 22:58           ` Zwane Mwaikambo
@ 2004-12-11  2:22             ` George Anzinger
  2004-12-11  2:45               ` Zwane Mwaikambo
  0 siblings, 1 reply; 44+ messages in thread
From: George Anzinger @ 2004-12-11  2:22 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Lee Revell, dipankar, ganzinger, Manfred Spraul, lkml

Zwane Mwaikambo wrote:
> On Fri, 10 Dec 2004, George Anzinger wrote:
> 
> 
>>>Well, softirqs should really be preemptible if you care about RT task
>>>latency.  Ingo's patches have had this for months.  Works great.  Maybe
>>>it's time to push it upstream.
>>
>>Yes, I understand, and soft_irq() does turn on interrupts...
>>I was thinking of something like:
>>
>>	while(softirq_pending()) {
>>		local_irq_enable();
>>		do_softirq();
>>		local_irq_disable();
>>	}
>>		<proceed to idle hlt...>
> 
> 
> But that's a deadlock and if you enable interrupts you race.

Again, I remind you we are in the idle task.  Nothing more important to do.  Or 
do you mean that softirq_pending() will NEVER return false?

The other question is: "Is useful work being done?"

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-11  2:22             ` George Anzinger
@ 2004-12-11  2:45               ` Zwane Mwaikambo
  2004-12-11  3:29                 ` George Anzinger
  0 siblings, 1 reply; 44+ messages in thread
From: Zwane Mwaikambo @ 2004-12-11  2:45 UTC (permalink / raw)
  To: George Anzinger; +Cc: Lee Revell, dipankar, ganzinger, Manfred Spraul, lkml

On Fri, 10 Dec 2004, George Anzinger wrote:

> > But that's a deadlock and if you enable interrupts you race.
> 
> Again, I remind you we are in the idle task.  Nothing more important to do.
> Or do you mean that softirq_pending() will NEVER return false?
> 
> The other question is: "Is useful work being done?"

We're in the idle task but obviously interrupts (such as network) are 
still coming in. So you may take an interrupt after your while 
(softirq_pending()) loop has exited.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-11  2:45               ` Zwane Mwaikambo
@ 2004-12-11  3:29                 ` George Anzinger
  2004-12-11 14:52                   ` Zwane Mwaikambo
  0 siblings, 1 reply; 44+ messages in thread
From: George Anzinger @ 2004-12-11  3:29 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Lee Revell, dipankar, ganzinger, Manfred Spraul, lkml

Zwane Mwaikambo wrote:
> On Fri, 10 Dec 2004, George Anzinger wrote:
> 
> 
>>>But that's a deadlock and if you enable interrupts you race.
>>
>>Again, I remind you we are in the idle task.  Nothing more important to do.
>>Or do you mean that softirq_pending() will NEVER return false?
>>
>>The other question is: "Is useful work being done?"
> 
> 
> We're in the idle task but obviously interrupts (such as network) are 
> still coming in. So you may take an interrupt after your while 
> (softirq_pending()) loop has exited.

That is ok.  Either we have interrupts off and no softirqs are pending and we 
proceed to the "hlt" (where the interrupt will be taken), or softirqs are 
pending, we turn interrupts on, do the softirq, turn interrupts off and try 
again.  Unless some tasklet (RCU?) never "gives up" or we will exit the while 
with interrupts off and move on to the "hlt".  Or did I miss something?
-
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-11  3:29                 ` George Anzinger
@ 2004-12-11 14:52                   ` Zwane Mwaikambo
  2004-12-11 16:32                     ` Manfred Spraul
  0 siblings, 1 reply; 44+ messages in thread
From: Zwane Mwaikambo @ 2004-12-11 14:52 UTC (permalink / raw)
  To: George Anzinger; +Cc: Lee Revell, dipankar, ganzinger, Manfred Spraul, lkml

On Fri, 10 Dec 2004, George Anzinger wrote:

> That is ok.  Either we have interrupts off and no softirqs are pending and we
> proceed to the "hlt" (where the interrupt will be taken), or softirqs are
> pending, we turn interrupts on, do the softirq, turn interrupts off and try
> again.  Unless some tasklet (RCU?) never "gives up" or we will exit the while
> with interrupts off and move on to the "hlt".  Or did I miss something?

But the point is that you cannot execute hlt with interrupts disabled.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-11 14:52                   ` Zwane Mwaikambo
@ 2004-12-11 16:32                     ` Manfred Spraul
  2004-12-11 16:52                       ` George Anzinger
  0 siblings, 1 reply; 44+ messages in thread
From: Manfred Spraul @ 2004-12-11 16:32 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: George Anzinger, Lee Revell, dipankar, ganzinger, lkml

Zwane Mwaikambo wrote:

>On Fri, 10 Dec 2004, George Anzinger wrote:
>
>  
>
>>That is ok.  Either we have interrupts off and no softirqs are pending and we
>>proceed to the "hlt" (where the interrupt will be taken), or softirqs are
>>pending, we turn interrupts on, do the softirq, turn interrupts off and try
>>again.  Unless some tasklet (RCU?) never "gives up" or we will exit the while
>>with interrupts off and move on to the "hlt".  Or did I miss something?
>>    
>>
>
>But the point is that you cannot execute hlt with interrupts disabled.
>  
>
The trick is the sti instruction: It enables interrupt processing after 
the following instruction.

Thus
    sti
    hlt

cannot race - it atomically enables interrupts and waits.

--
    Manfred


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-11 16:32                     ` Manfred Spraul
@ 2004-12-11 16:52                       ` George Anzinger
  2004-12-12  2:53                         ` Zwane Mwaikambo
  0 siblings, 1 reply; 44+ messages in thread
From: George Anzinger @ 2004-12-11 16:52 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: Zwane Mwaikambo, Lee Revell, dipankar, ganzinger, lkml

Manfred Spraul wrote:
> Zwane Mwaikambo wrote:
> 
>> On Fri, 10 Dec 2004, George Anzinger wrote:
>>
>>  
>>
>>> That is ok.  Either we have interrupts off and no softirqs are 
>>> pending and we
>>> proceed to the "hlt" (where the interrupt will be taken), or softirqs 
>>> are
>>> pending, we turn interrupts on, do the softirq, turn interrupts off 
>>> and try
>>> again.  Unless some tasklet (RCU?) never "gives up" or we will exit 
>>> the while
>>> with interrupts off and move on to the "hlt".  Or did I miss something?
>>>   
>>
>>
>> But the point is that you cannot execute hlt with interrupts disabled.
>>  
>>
> The trick is the sti instruction: It enables interrupt processing after 
> the following instruction.
> 
> Thus
>    sti
>    hlt
> 
> cannot race - it atomically enables interrupts and waits.

Exactly :)

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-11 16:52                       ` George Anzinger
@ 2004-12-12  2:53                         ` Zwane Mwaikambo
  2004-12-12  8:59                           ` Manfred Spraul
  0 siblings, 1 reply; 44+ messages in thread
From: Zwane Mwaikambo @ 2004-12-12  2:53 UTC (permalink / raw)
  To: George Anzinger; +Cc: Manfred Spraul, Lee Revell, dipankar, ganzinger, lkml

On Sat, 11 Dec 2004, George Anzinger wrote:

> Manfred Spraul wrote:
> > >  
> > The trick is the sti instruction: It enables interrupt processing after the
> > following instruction.
> > 
> > Thus
> >    sti
> >    hlt
> > 
> > cannot race - it atomically enables interrupts and waits.
> 
> Exactly :)

Ok i wasn't aware that it was safe_halt() that he was referring too, my 
poor assumption. But regardless, this seems highly fragile and relying on 
behaviour which may change across processor models/vendors. I also found 
the following excerpt from (http://sandpile.org/ia32/inter.htm) which you 
may find interesting;

"Intel processors don't suppress SMI or NMI after an STI instruction. 
Since the INTR suppresion is not preserved across an SMI or NMI handler, 
this may result in an INTR being serviced after the STI, which constitutes 
a violation of the INTR suppresion. Therefore, ideally the STI instruction 
also suppresses SMI and NMI."

George thanks for persisting and explaining your point, i can be very slow 
=)

	Zwane

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-12  2:53                         ` Zwane Mwaikambo
@ 2004-12-12  8:59                           ` Manfred Spraul
  2004-12-12  9:37                             ` Andrea Arcangeli
  2004-12-12 16:26                             ` Zwane Mwaikambo
  0 siblings, 2 replies; 44+ messages in thread
From: Manfred Spraul @ 2004-12-12  8:59 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: George Anzinger, Lee Revell, dipankar, ganzinger, lkml

Zwane Mwaikambo wrote:

>"Intel processors don't suppress SMI or NMI after an STI instruction. 
>Since the INTR suppresion is not preserved across an SMI or NMI handler, 
>this may result in an INTR being serviced after the STI, which constitutes 
>a violation of the INTR suppresion.
>  
>
Interesting find.
It means that our NMI irq return path should check if it points to a hlt 
instruction and if yes, then increase the saved EIP by one before doing 
the iretd, right?

--
    Manfred

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-12  8:59                           ` Manfred Spraul
@ 2004-12-12  9:37                             ` Andrea Arcangeli
  2004-12-12 10:22                               ` Manfred Spraul
  2004-12-12 16:26                             ` Zwane Mwaikambo
  1 sibling, 1 reply; 44+ messages in thread
From: Andrea Arcangeli @ 2004-12-12  9:37 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Zwane Mwaikambo, George Anzinger, Lee Revell, dipankar, ganzinger,
	lkml

On Sun, Dec 12, 2004 at 09:59:00AM +0100, Manfred Spraul wrote:
> It means that our NMI irq return path should check if it points to a hlt 
> instruction and if yes, then increase the saved EIP by one before doing 
> the iretd, right?

I don't think we'll ever post any event through nmi, so it doesn't
matter. We only care to be waken by real irqs, not nmi/smi. Idle loop is
fine to ignore the actions of the nmi handlers and to hang into the
"hlt".

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-12  9:37                             ` Andrea Arcangeli
@ 2004-12-12 10:22                               ` Manfred Spraul
  2004-12-12 12:15                                 ` Andrea Arcangeli
  2004-12-12 16:51                                 ` RCU question George Anzinger
  0 siblings, 2 replies; 44+ messages in thread
From: Manfred Spraul @ 2004-12-12 10:22 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Zwane Mwaikambo, George Anzinger, Lee Revell, dipankar, ganzinger,
	lkml

Andrea Arcangeli wrote:

>On Sun, Dec 12, 2004 at 09:59:00AM +0100, Manfred Spraul wrote:
>  
>
>>It means that our NMI irq return path should check if it points to a hlt 
>>instruction and if yes, then increase the saved EIP by one before doing 
>>the iretd, right?
>>    
>>
>
>I don't think we'll ever post any event through nmi, so it doesn't
>matter. We only care to be waken by real irqs, not nmi/smi. Idle loop is
>fine to ignore the actions of the nmi handlers and to hang into the
>"hlt".
>  
>
No, You misunderstood the problem:

sti
** NMI handler
** normal interrupt arrives, is queued by the cpu
** irqd from NMI handler
** cpu notices the normal interrupt, handles it.
** normal interrupt does a wakeup, schedules a tasklet, whatever
** irqd from normal interupt
hlt << cpu sleeps.

Thus: lost wakeup.

--
    Manfred
**

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-12 10:22                               ` Manfred Spraul
@ 2004-12-12 12:15                                 ` Andrea Arcangeli
  2004-12-14 21:40                                   ` Lee Revell
  2004-12-12 16:51                                 ` RCU question George Anzinger
  1 sibling, 1 reply; 44+ messages in thread
From: Andrea Arcangeli @ 2004-12-12 12:15 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Zwane Mwaikambo, George Anzinger, Lee Revell, dipankar, ganzinger,
	lkml

On Sun, Dec 12, 2004 at 11:22:49AM +0100, Manfred Spraul wrote:
> Andrea Arcangeli wrote:
> 
> >On Sun, Dec 12, 2004 at 09:59:00AM +0100, Manfred Spraul wrote:
> > 
> >
> >>It means that our NMI irq return path should check if it points to a hlt 
> >>instruction and if yes, then increase the saved EIP by one before doing 
> >>the iretd, right?
> >>   
> >>
> >
> >I don't think we'll ever post any event through nmi, so it doesn't
> >matter. We only care to be waken by real irqs, not nmi/smi. Idle loop is
> >fine to ignore the actions of the nmi handlers and to hang into the
> >"hlt".
> > 
> >
> No, You misunderstood the problem:
> 
> sti
> ** NMI handler
> ** normal interrupt arrives, is queued by the cpu
> ** irqd from NMI handler
> ** cpu notices the normal interrupt, handles it.
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ok. The above just wasn't obvious to me because iret of an nmi is doing
the same thing that sti does (and the nmi itself is like a cli).
Shouldn't iret wait 1 instruction too or is there a special case about
iret?  The specs only tells sti waits 1 instruction, but they don't tell
anything about iret (nor that it waits nor that it doesn't wait).  I
realized now the link posted here assumes iret isn't going to wait 1
instruction before processing pending irqs which is reasonable given the
specs don't tell anything about iret, but I didn't imagine there was a
difference between sti and iret (I mean only when iret is going to
change the interrupt enable flag from 0 to 1 just like sti does).

Overall this is a very minor issue (unless HZ is 0), it would only
introduce a 1/HZ latency to the irq that get posted while the nmi
handler is running, and the nmi handlers never runs in production.
Forcing idle=poll when the nmi watchdog is enabled is probably a
reasonable fix.

As for the SMI, I wonder how you plan to fix it.  To me it sounds like a
minor mistake that iret isn't equivalent to sti when it toggles the irq
enable bitflag (infact I don't see a way to fix it for smi, though I
know very little about smi).

thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-12 12:15                                 ` Andrea Arcangeli
@ 2004-12-14 21:40                                   ` Lee Revell
  2004-12-14 22:23                                     ` [patch, 2.6.10-rc3] safe_hlt() & NMIs Ingo Molnar
  0 siblings, 1 reply; 44+ messages in thread
From: Lee Revell @ 2004-12-14 21:40 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Manfred Spraul, Zwane Mwaikambo, George Anzinger, dipankar,
	ganzinger, lkml, Ingo Molnar

On Sun, 2004-12-12 at 13:15 +0100, Andrea Arcangeli wrote:
> Overall this is a very minor issue (unless HZ is 0), it would only
> introduce a 1/HZ latency to the irq that get posted while the nmi
> handler is running, and the nmi handlers never runs in production.

Ingo, couldn't this account for some of the inexplicable outliers some
people were seeing in latency tests?

Lee


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-14 21:40                                   ` Lee Revell
@ 2004-12-14 22:23                                     ` Ingo Molnar
  2004-12-14 22:47                                       ` Ingo Molnar
                                                         ` (2 more replies)
  0 siblings, 3 replies; 44+ messages in thread
From: Ingo Molnar @ 2004-12-14 22:23 UTC (permalink / raw)
  To: Lee Revell
  Cc: Andrea Arcangeli, Manfred Spraul, Zwane Mwaikambo,
	George Anzinger, dipankar, ganzinger, lkml, Andrew Morton,
	Linus Torvalds, Andi Kleen


* Lee Revell <rlrevell@joe-job.com> wrote:

> On Sun, 2004-12-12 at 13:15 +0100, Andrea Arcangeli wrote:
> > Overall this is a very minor issue (unless HZ is 0), it would only
> > introduce a 1/HZ latency to the irq that get posted while the nmi
> > handler is running, and the nmi handlers never runs in production.
> 
> Ingo, couldn't this account for some of the inexplicable outliers some
> people were seeing in latency tests?

indeed, there could be a connection, and it's certainly a fun race. The
proper fix is Manfred's suggestion: check whether the EIP is a kernel
text address, and if yes, whether it's a HLT instruction - and if yes
then increase EIP by 1. I've included the fix in the -33-02 -RT patch.
Andrew, Linus: upstream fix is below - i think it's post-2.6.10 stuff.
Tested it on SMP and UP x86, using both the IO-APIC and the local-APIC
based NMI watchdog.

i think x64 needs a similar fix as well.

	Ingo

--- linux/arch/i386/kernel/traps.c.orig
+++ linux/arch/i386/kernel/traps.c
@@ -670,6 +670,17 @@ fastcall void do_nmi(struct pt_regs * re
 
 	cpu = smp_processor_id();
 
+	/*
+	 * Fix up obscure CPU behavior: if we interrupt safe_hlt() via
+	 * the NMI then we might miss a reschedule if an interrupt is
+	 * posted to the CPU and executes before the HLT instruction.
+	 *
+	 * We check whether the EIP is kernelspace, and if yes, whether
+	 * the instruction is HLT:
+	 */
+	if (__kernel_text_address(regs->eip) && *(char *)regs->eip == 0xf4)
+		regs->eip++;
+
 #ifdef CONFIG_HOTPLUG_CPU
 	if (!cpu_online(cpu)) {
 		nmi_exit();

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-14 22:23                                     ` [patch, 2.6.10-rc3] safe_hlt() & NMIs Ingo Molnar
@ 2004-12-14 22:47                                       ` Ingo Molnar
  2004-12-14 23:09                                         ` Linus Torvalds
  2004-12-14 23:41                                         ` Andrea Arcangeli
  2004-12-14 23:00                                       ` Linus Torvalds
  2004-12-15  6:27                                       ` Avi Kivity
  2 siblings, 2 replies; 44+ messages in thread
From: Ingo Molnar @ 2004-12-14 22:47 UTC (permalink / raw)
  To: Lee Revell
  Cc: Andrea Arcangeli, Manfred Spraul, Zwane Mwaikambo,
	George Anzinger, dipankar, ganzinger, lkml, Andrew Morton,
	Linus Torvalds, Andi Kleen


* Ingo Molnar <mingo@elte.hu> wrote:

> indeed, there could be a connection, and it's certainly a fun race.
> The proper fix is Manfred's suggestion: check whether the EIP is a
> kernel text address, and if yes, whether it's a HLT instruction - and
> if yes then increase EIP by 1. I've included the fix in the -33-02 -RT
> patch. Andrew, Linus: upstream fix is below - i think it's post-2.6.10
> stuff. Tested it on SMP and UP x86, using both the IO-APIC and the
> local-APIC based NMI watchdog.
> 
> i think x64 needs a similar fix as well.

find the correct patch below. I've tested it with an NMI watchdog
frequency artificially increased to 10 KHz, and i've instrumented the
new branch in the NMI handler, but even under heavy IRQ load i was not
able to trigger the branch. Maybe newer CPUs handle this case somehow
and make sti;hlt truly atomic? I tried this on an old Celeron
(Mendocino) and on an Athlon64.

	Ingo

Signed-off-by: Ingo Molnar <mingo@elte.hu>

--- linux/arch/i386/kernel/traps.c.orig
+++ linux/arch/i386/kernel/traps.c
@@ -670,6 +670,18 @@ fastcall void do_nmi(struct pt_regs * re
 
 	cpu = smp_processor_id();
 
+	/*
+	 * Fix up obscure CPU behavior: if we interrupt safe_hlt() via
+	 * the NMI then we might miss a reschedule if an interrupt is
+	 * posted to the CPU and executes before the HLT instruction.
+	 *
+	 * We check whether the EIP is kernelspace, and if yes, whether
+	 * the instruction is HLT:
+	 */
+	if (__kernel_text_address(regs->eip) &&
+					*(unsigned char *)regs->eip == 0xf4)
+		regs->eip++;
+
 #ifdef CONFIG_HOTPLUG_CPU
 	if (!cpu_online(cpu)) {
 		nmi_exit();

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-14 22:47                                       ` Ingo Molnar
@ 2004-12-14 23:09                                         ` Linus Torvalds
  2004-12-15  8:52                                           ` Ingo Molnar
  2004-12-16  0:37                                           ` Alan Cox
  2004-12-14 23:41                                         ` Andrea Arcangeli
  1 sibling, 2 replies; 44+ messages in thread
From: Linus Torvalds @ 2004-12-14 23:09 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Lee Revell, Andrea Arcangeli, Manfred Spraul, Zwane Mwaikambo,
	George Anzinger, dipankar, ganzinger, lkml, Andrew Morton,
	Andi Kleen



On Tue, 14 Dec 2004, Ingo Molnar wrote:
> 
> find the correct patch below. I've tested it with an NMI watchdog
> frequency artificially increased to 10 KHz, and i've instrumented the
> new branch in the NMI handler, but even under heavy IRQ load i was not
> able to trigger the branch. Maybe newer CPUs handle this case somehow
> and make sti;hlt truly atomic?

Now that you mention it, I have this dim memory of the one-instruction
"sti-shadow" actually disabling NMI's (and debug traps) too. The CPU 
literally doesn't test for async events following "sti". 

Or maybe that was "mov->ss". That one also has that strange "black hole"  
for one instruction.

Hmm.. You could be evil and try to fill up 64kB worth of memory with a 
"mov %ax,%ss", and jump to it in vm86 mode and see what happens. The eip 
will just keep wrapping around...

		Linus

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-14 23:09                                         ` Linus Torvalds
@ 2004-12-15  8:52                                           ` Ingo Molnar
  2004-12-15 15:44                                             ` Linus Torvalds
  2004-12-16  0:37                                           ` Alan Cox
  1 sibling, 1 reply; 44+ messages in thread
From: Ingo Molnar @ 2004-12-15  8:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Lee Revell, Andrea Arcangeli, Manfred Spraul, Zwane Mwaikambo,
	George Anzinger, dipankar, ganzinger, lkml, Andrew Morton,
	Andi Kleen


* Linus Torvalds <torvalds@osdl.org> wrote:

> > find the correct patch below. I've tested it with an NMI watchdog
> > frequency artificially increased to 10 KHz, and i've instrumented the
> > new branch in the NMI handler, but even under heavy IRQ load i was not
> > able to trigger the branch. Maybe newer CPUs handle this case somehow
> > and make sti;hlt truly atomic?
> 
> Now that you mention it, I have this dim memory of the one-instruction
> "sti-shadow" actually disabling NMI's (and debug traps) too. The CPU
> literally doesn't test for async events following "sti". 

i ran the stresstest overnight with the 10 KHz NMI, and not a single
time did the new branch trigger, out of hundreds of millions of IRQs and
NMIs. I think this suggests that the race doesnt exist in current CPUs.

	Ingo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-15  8:52                                           ` Ingo Molnar
@ 2004-12-15 15:44                                             ` Linus Torvalds
  2004-12-15 16:35                                               ` Ingo Molnar
  0 siblings, 1 reply; 44+ messages in thread
From: Linus Torvalds @ 2004-12-15 15:44 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Lee Revell, Andrea Arcangeli, Manfred Spraul, Zwane Mwaikambo,
	George Anzinger, dipankar, ganzinger, lkml, Andrew Morton,
	Andi Kleen

On Wed, 15 Dec 2004, Ingo Molnar wrote:
> 
> i ran the stresstest overnight with the 10 KHz NMI, and not a single
> time did the new branch trigger, out of hundreds of millions of IRQs and
> NMIs. I think this suggests that the race doesnt exist in current CPUs.

That may well be true, but I'm not convinced your test is meaningful or 
shows anything.

The thing is, either the CPU is busy, or it's idle. If it's busy, you'll
never see this. And if it's idle, it will always be _in_ the "halt"  
instruction.

The only way to see the case is in the borderline cases, and if/when there
are multiple different interrupts (first non-NMI interrupt takes it out of
the hlt, and then the NMI happens to catch the sti). And quite frankly, I
don't see how you would stress-test it. A 1kHz timer interrupt with a
10kHz NMI interrupt is still very infrequent interrupts...

		Linus

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-15 15:44                                             ` Linus Torvalds
@ 2004-12-15 16:35                                               ` Ingo Molnar
  0 siblings, 0 replies; 44+ messages in thread
From: Ingo Molnar @ 2004-12-15 16:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Lee Revell, Andrea Arcangeli, Manfred Spraul, Zwane Mwaikambo,
	George Anzinger, dipankar, ganzinger, lkml, Andrew Morton,
	Andi Kleen


* Linus Torvalds <torvalds@osdl.org> wrote:

> 
> 
> On Wed, 15 Dec 2004, Ingo Molnar wrote:
> > 
> > i ran the stresstest overnight with the 10 KHz NMI, and not a single
> > time did the new branch trigger, out of hundreds of millions of IRQs and
> > NMIs. I think this suggests that the race doesnt exist in current CPUs.
> 
> That may well be true, but I'm not convinced your test is meaningful
> or shows anything.
> 
> The thing is, either the CPU is busy, or it's idle. If it's busy,
> you'll never see this. And if it's idle, it will always be _in_ the
> "halt" instruction.

i deliberately started a test where there was roughly 50% idle time.

> The only way to see the case is in the borderline cases, and if/when
> there are multiple different interrupts (first non-NMI interrupt takes
> it out of the hlt, and then the NMI happens to catch the sti). And
> quite frankly, I don't see how you would stress-test it. A 1kHz timer
> interrupt with a 10kHz NMI interrupt is still very infrequent
> interrupts...

i started an infinite loop that generated disk IRQs, and started a
network test that generated network IRQs. The IRQ rate was roughly
10K/sec - this combined with the 10K/sec NMI rate should be an adequate
mix. (I also made sure that it's really default_idle that is used.)

	Ingo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-14 23:09                                         ` Linus Torvalds
  2004-12-15  8:52                                           ` Ingo Molnar
@ 2004-12-16  0:37                                           ` Alan Cox
  2004-12-16  1:58                                             ` Linus Torvalds
  2004-12-16  2:10                                             ` Zwane Mwaikambo
  1 sibling, 2 replies; 44+ messages in thread
From: Alan Cox @ 2004-12-16  0:37 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Lee Revell, Andrea Arcangeli, Manfred Spraul,
	Zwane Mwaikambo, George Anzinger, dipankar, ganzinger, lkml,
	Andrew Morton, Andi Kleen

On Maw, 2004-12-14 at 23:09, Linus Torvalds wrote:
> On Tue, 14 Dec 2004, Ingo Molnar wrote:
> Now that you mention it, I have this dim memory of the one-instruction
> "sti-shadow" actually disabling NMI's (and debug traps) too. The CPU 
> literally doesn't test for async events following "sti". 
> 
> Or maybe that was "mov->ss". That one also has that strange "black hole"  
> for one instruction.

The mov to ss one is a bit more magic than that however. If you write
3Gb of mov->ss into memory (ie about 64 pages to thrash the cache and
slow it plus mmap repeatedly) and run it you don't get a vastly long irq
delay at least on intel, not tried the others. 

Alan


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-16  0:37                                           ` Alan Cox
@ 2004-12-16  1:58                                             ` Linus Torvalds
  2004-12-16 14:51                                               ` Ingo Molnar
  2004-12-16  2:10                                             ` Zwane Mwaikambo
  1 sibling, 1 reply; 44+ messages in thread
From: Linus Torvalds @ 2004-12-16  1:58 UTC (permalink / raw)
  To: Alan Cox
  Cc: Ingo Molnar, Lee Revell, Andrea Arcangeli, Manfred Spraul,
	Zwane Mwaikambo, George Anzinger, dipankar, ganzinger, lkml,
	Andrew Morton, Andi Kleen



On Thu, 16 Dec 2004, Alan Cox wrote:

> On Maw, 2004-12-14 at 23:09, Linus Torvalds wrote:
> > On Tue, 14 Dec 2004, Ingo Molnar wrote:
> > Now that you mention it, I have this dim memory of the one-instruction
> > "sti-shadow" actually disabling NMI's (and debug traps) too. The CPU 
> > literally doesn't test for async events following "sti". 
> > 
> > Or maybe that was "mov->ss". That one also has that strange "black hole"  
> > for one instruction.
> 
> The mov to ss one is a bit more magic than that however. If you write
> 3Gb of mov->ss into memory (ie about 64 pages to thrash the cache and
> slow it plus mmap repeatedly) and run it you don't get a vastly long irq
> delay at least on intel, not tried the others. 

The irq window should actually be open every alternate instruction, I 
think. Although it's not actually architected, and I thought that there 
was some errata for some CPU about this..

			Linus

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-16  1:58                                             ` Linus Torvalds
@ 2004-12-16 14:51                                               ` Ingo Molnar
  2004-12-16 15:08                                                 ` Maciej W. Rozycki
  2004-12-16 15:54                                                 ` Linus Torvalds
  0 siblings, 2 replies; 44+ messages in thread
From: Ingo Molnar @ 2004-12-16 14:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Lee Revell, Andrea Arcangeli, Manfred Spraul,
	Zwane Mwaikambo, George Anzinger, dipankar, ganzinger, lkml,
	Andrew Morton, Andi Kleen


* Linus Torvalds <torvalds@osdl.org> wrote:

> The irq window should actually be open every alternate instruction, I
> think. Although it's not actually architected, and I thought that
> there was some errata for some CPU about this..

i have generated an instruction-granularity profile of kernel code
executing the following sequence, driven by the NMI watchdog interrupt:

 asm ("cli; cli; sti; cli; sti; cli; sti; cli; sti; cli; sti; ");
 asm ("cli; cli; sti; cli; sti; cli; sti; cli; sti; cli; sti; ");
 asm ("cli; cli; sti; cli; sti; cli; sti; cli; sti; cli; sti; ");

the first CLI is done twice, to prove that the NMI profiling works and
that the kernel can be interrupted in those places. Then i called this
kernel code in a loop. Here's the result:

c0125ee9:     1529 	fa                   	cli    
                 ^---------------------------------- # of profiler hits
c0125eea:      507 	fb                   	sti    
c0125eeb:        0 	fa                   	cli    
c0125eec:     3719 	fb                   	sti    
c0125eed:        0 	fa                   	cli    
c0125eee:     1579 	fb                   	sti    
c0125eef:        0 	fa                   	cli    
c0125ef0:     3317 	fb                   	sti    
c0125ef1:        0 	fa                   	cli    
c0125ef2:     3030 	fb                   	sti    
c0125ef3:        0 	fa                   	cli    
c0125ef4:     2497 	fa                   	cli    
c0125ef5:     1055 	fb                   	sti    
c0125ef6:        0 	fa                   	cli    
c0125ef7:     4674 	fb                   	sti    
c0125ef8:        0 	fa                   	cli    
c0125ef9:     3827 	fb                   	sti    
c0125efa:        0 	fa                   	cli    
c0125efb:     1622 	fb                   	sti    
c0125efc:        0 	fa                   	cli    
c0125efd:     3155 	fb                   	sti    
c0125efe:        0 	fa                   	cli    
c0125eff:     1273 	fa                   	cli    
c0125f00:      512 	fb                   	sti    
c0125f01:        0 	fa                   	cli    
c0125f02:     1312 	fb                   	sti    
c0125f03:        0 	fa                   	cli    
c0125f04:     1426 	fb                   	sti    
c0125f05:        0 	fa                   	cli    
c0125f06:     1507 	fb                   	sti    
c0125f07:        0 	fa                   	cli    
c0125f08:     2720 	fb                   	sti    
c0125f09:        0 	fa                   	cli    
c0125f0a:     2469 	fa                   	cli    
c0125f0b:      787 	fb                   	sti    
c0125f0c:        0 	fa                   	cli    
c0125f0d:     2085 	fb                   	sti    
c0125f0e:        0 	fa                   	cli    

the 'cli' is always a 'black hole' to the NMI, while the second of two
consecutive cli's are not.

i also played a bit with the %ss instructions, and combined them with
the cli/sti instructions and other instructions in various ways, and
with a bit of experimenting found the following, somewhat surprising
results:

c0125f33:     1016 	66 8c d0             	mov    %ss,%ax
c0125f36:     6626 	8e d0                	mov    %eax,%ss
c0125f38:    34715 	8e d0                	mov    %eax,%ss
c0125f3a:    14682 	8e d0                	mov    %eax,%ss
c0125f3c:     4521 	8e d0                	mov    %eax,%ss
c0125f3e:     7564 	8e d0                	mov    %eax,%ss
c0125f40:     3861 	66 8e d0             	mov    %ax,%ss
c0125f43:        0 	66 8c d1             	mov    %ss,%cx
c0125f46:     1061 	66 8c da             	mov    %ds,%dx
c0125f49:     7660 	8e d1                	mov    %ecx,%ss
c0125f4b:    11322 	17                   	pop    %ss
c0125f4c:        0 	fb                   	sti    
c0125f4d:     8935 	8e d1                	mov    %ecx,%ss
c0125f4f:        0 	fa                   	cli    
c0125f50:     2198 	66 8c d1             	mov    %ss,%cx
c0125f53:      735 	66 8c da             	mov    %ds,%dx
c0125f56:        0 	8e da                	mov    %edx,%ds
c0125f58:     6400 	8e d0                	mov    %eax,%ss
c0125f5a:     3062 	8e d0                	mov    %eax,%ss
c0125f5c:     3552 	8e d0                	mov    %eax,%ss
c0125f5e:     4818 	8e d0                	mov    %eax,%ss
c0125f60:        0 	fb                   	sti    
c0125f61:        0 	66 8c da             	mov    %ds,%dx
c0125f64:    17788 	8e d0                	mov    %eax,%ss
c0125f66:    64694 	8e d0                	mov    %eax,%ss
c0125f68:    12837 	8e d0                	mov    %eax,%ss
c0125f6a:     9859 	8e d0                	mov    %eax,%ss
c0125f6c:        0 	fb                   	sti    
c0125f6d:    74506 	8e d0                	mov    %eax,%ss
c0125f6f:        0 	fb                   	sti    
c0125f70:     8589 	fa                   	cli    
c0125f71:    10248 	8e d0                	mov    %eax,%ss
c0125f73:     3825 	8e d0                	mov    %eax,%ss
c0125f75:     4903 	8e d0                	mov    %eax,%ss
c0125f77:    71134 	8e d0                	mov    %eax,%ss
c0125f79:        0 	fb                   	sti    
c0125f7a:        0 	fa                   	cli    
c0125f7b:     7461 	8e d0                	mov    %eax,%ss
c0125f7d:        0 	66 8c d0             	mov    %ss,%ax
c0125f80:    39387 	8e d0                	mov    %eax,%ss
c0125f82:        0 	fa                   	cli    
c0125f83:    41484 	8e d0                	mov    %eax,%ss
c0125f85:        0 	fa                   	cli    
c0125f86:     4490 	8e d0                	mov    %eax,%ss
c0125f88:        0 	fa                   	cli    
c0125f89:     6024 	8e d0                	mov    %eax,%ss
c0125f8b:    15454 	8e d0                	mov    %eax,%ss
c0125f8d:        0 	fb                   	sti    
c0125f8e:        0 	fb                   	sti    
c0125f8f:   115104 	fb                   	sti    
c0125f90:    39061 	fb                   	sti    

it shows a number of interesting effects:

- "mov %eax,%ss" followed by the _same_ instruction cancels the 
  black-hole. This i suspect is done to prevent the lockup in vm86
  mode.

- an %ss black-hole instruction followed by 'sti' cancels sti's
  black-hole. This is unlikely to occur in real kernel code, but we
  might want to add a 'nop' in front of safe_halt()'s sti, to make sure
  the black-hole takes effect.

- in one case a two-instruction blackhole was created - but this might 
  be some prefetch effect.

i played around with the instructions a bit to manufacture combinations
that enlengthen the black-hole but failed :) This was on an Athlon64.

	Ingo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-16 14:51                                               ` Ingo Molnar
@ 2004-12-16 15:08                                                 ` Maciej W. Rozycki
  2004-12-16 15:11                                                   ` Ingo Molnar
  2004-12-16 15:54                                                 ` Linus Torvalds
  1 sibling, 1 reply; 44+ messages in thread
From: Maciej W. Rozycki @ 2004-12-16 15:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Alan Cox, Lee Revell, Andrea Arcangeli,
	Manfred Spraul, Zwane Mwaikambo, George Anzinger, dipankar,
	ganzinger, lkml, Andrew Morton, Andi Kleen

On Thu, 16 Dec 2004, Ingo Molnar wrote:

> c0125ee9:     1529 	fa                   	cli    
>                  ^---------------------------------- # of profiler hits
> c0125eea:      507 	fb                   	sti    
> c0125eeb:        0 	fa                   	cli    
> c0125eec:     3719 	fb                   	sti    
> c0125eed:        0 	fa                   	cli    
> c0125eee:     1579 	fb                   	sti    
> c0125eef:        0 	fa                   	cli    
> c0125ef0:     3317 	fb                   	sti    
> c0125ef1:        0 	fa                   	cli    
> c0125ef2:     3030 	fb                   	sti    
> c0125ef3:        0 	fa                   	cli    
> c0125ef4:     2497 	fa                   	cli    
> c0125ef5:     1055 	fb                   	sti    
> c0125ef6:        0 	fa                   	cli    
[...]
> the 'cli' is always a 'black hole' to the NMI, while the second of two
> consecutive cli's are not.

 It looks like the 'sti' is actually the black hole -- remember interrupts
are traps, that is they are probed for and taken after instruction
execution.

  Maciej

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-16 15:08                                                 ` Maciej W. Rozycki
@ 2004-12-16 15:11                                                   ` Ingo Molnar
  2004-12-16 15:42                                                     ` Maciej W. Rozycki
  0 siblings, 1 reply; 44+ messages in thread
From: Ingo Molnar @ 2004-12-16 15:11 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Linus Torvalds, Alan Cox, Lee Revell, Andrea Arcangeli,
	Manfred Spraul, Zwane Mwaikambo, George Anzinger, dipankar,
	ganzinger, lkml, Andrew Morton, Andi Kleen


* Maciej W. Rozycki <macro@linux-mips.org> wrote:

> On Thu, 16 Dec 2004, Ingo Molnar wrote:
> 
> > c0125ee9:     1529 	fa                   	cli    
> >                  ^---------------------------------- # of profiler hits
> > c0125eea:      507 	fb                   	sti    
> > c0125eeb:        0 	fa                   	cli    
> > c0125eec:     3719 	fb                   	sti    
> > c0125eed:        0 	fa                   	cli    
> > c0125eee:     1579 	fb                   	sti    
> > c0125eef:        0 	fa                   	cli    
> > c0125ef0:     3317 	fb                   	sti    
> > c0125ef1:        0 	fa                   	cli    
> > c0125ef2:     3030 	fb                   	sti    
> > c0125ef3:        0 	fa                   	cli    
> > c0125ef4:     2497 	fa                   	cli    
> > c0125ef5:     1055 	fb                   	sti    
> > c0125ef6:        0 	fa                   	cli    
> [...]
> > the 'cli' is always a 'black hole' to the NMI, while the second of two
> > consecutive cli's are not.
> 
>  It looks like the 'sti' is actually the black hole -- remember
> interrupts are traps, that is they are probed for and taken after
> instruction execution.

The 'sti' "shadows" the cli, i.e. we'll never get an interrupt that gets
inbetween 'sti;cli'. I.e. sti is the black-hole generator, and 'cli' is
in the black hole. In that sense the 'cli' is in a black hole to the
NMI: the NMI will never see cli as the 'next to be executed'
instruction.

	Ingo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-16 15:11                                                   ` Ingo Molnar
@ 2004-12-16 15:42                                                     ` Maciej W. Rozycki
  0 siblings, 0 replies; 44+ messages in thread
From: Maciej W. Rozycki @ 2004-12-16 15:42 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Alan Cox, Lee Revell, Andrea Arcangeli,
	Manfred Spraul, Zwane Mwaikambo, George Anzinger, dipankar,
	ganzinger, lkml, Andrew Morton, Andi Kleen

On Thu, 16 Dec 2004, Ingo Molnar wrote:

> The 'sti' "shadows" the cli, i.e. we'll never get an interrupt that gets
> inbetween 'sti;cli'. I.e. sti is the black-hole generator, and 'cli' is
> in the black hole. In that sense the 'cli' is in a black hole to the
> NMI: the NMI will never see cli as the 'next to be executed'
> instruction.

 That's what I meant indeed, but I'd like to emphasise, for readers to be
aware, the black hole is not tied to the 'cli' instruction itself in any
way.  The black-holed instruction needs not be a 'cli' -- it can be an
arbitrary one except from ones creating black holes as you've observed
with your test.

  Maciej

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-16 14:51                                               ` Ingo Molnar
  2004-12-16 15:08                                                 ` Maciej W. Rozycki
@ 2004-12-16 15:54                                                 ` Linus Torvalds
  1 sibling, 0 replies; 44+ messages in thread
From: Linus Torvalds @ 2004-12-16 15:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Alan Cox, Lee Revell, Andrea Arcangeli, Manfred Spraul,
	Zwane Mwaikambo, George Anzinger, dipankar, ganzinger, lkml,
	Andrew Morton, Andi Kleen



On Thu, 16 Dec 2004, Ingo Molnar wrote:
> 
> i also played a bit with the %ss instructions, and combined them with
> the cli/sti instructions and other instructions in various ways, and
> with a bit of experimenting found the following, somewhat surprising
> results:
>
> [ snip ]
> 
> it shows a number of interesting effects:
> 
> - "mov %eax,%ss" followed by the _same_ instruction cancels the 
>   black-hole. This i suspect is done to prevent the lockup in vm86
>   mode.

I don't think it's the "same instruction". Looking at the pattern, I think
that a "mov->ss" always checks interrupts _before_ it executes, and never 
checks interrupts _after_ it executes.

So I think the pattern is (for your athlon64):

 - regular instructions check interrupts before they execute, _except_ if 
   the "dontcheck" flag was set. They clear "dontcheck" after execution.
 - "mov->ss" always checks interrupts before it executes, regardless of
   "dontcheck". It always sets "dontcheck".
 - "sti" sets "dontcheck" if interrupts were disabled before.

So you can get two-instruction holes by doing the sequence

	/* interrupts disabled */
	mov->ss
	sti
	/* any instruction except cli/mov->ss */

but no other combination (series of "mov->ss" will always check _before_
each "mov->ss", and series of "sti" will obviously only have interrupts 
disabled for the _first_ sti).

And I suspect this is very much micro-architecture-dependent, although the 
Athlon64 rules seem very simple and straightforward.

		Linus

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-16  0:37                                           ` Alan Cox
  2004-12-16  1:58                                             ` Linus Torvalds
@ 2004-12-16  2:10                                             ` Zwane Mwaikambo
  2004-12-16 13:26                                               ` Alan Cox
  1 sibling, 1 reply; 44+ messages in thread
From: Zwane Mwaikambo @ 2004-12-16  2:10 UTC (permalink / raw)
  To: Alan Cox
  Cc: Linus Torvalds, Ingo Molnar, Lee Revell, Andrea Arcangeli,
	Manfred Spraul, George Anzinger, dipankar, ganzinger, lkml,
	Andrew Morton, Andi Kleen

On Thu, 16 Dec 2004, Alan Cox wrote:

> On Maw, 2004-12-14 at 23:09, Linus Torvalds wrote:
> > On Tue, 14 Dec 2004, Ingo Molnar wrote:
> > Now that you mention it, I have this dim memory of the one-instruction
> > "sti-shadow" actually disabling NMI's (and debug traps) too. The CPU 
> > literally doesn't test for async events following "sti". 
> > 
> > Or maybe that was "mov->ss". That one also has that strange "black hole"  
> > for one instruction.
> 
> The mov to ss one is a bit more magic than that however. If you write
> 3Gb of mov->ss into memory (ie about 64 pages to thrash the cache and
> slow it plus mmap repeatedly) and run it you don't get a vastly long irq
> delay at least on intel, not tried the others. 

Might this be because you can't rely on interrupt suppression for back to 
back suppressing instructions?


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-16  2:10                                             ` Zwane Mwaikambo
@ 2004-12-16 13:26                                               ` Alan Cox
  0 siblings, 0 replies; 44+ messages in thread
From: Alan Cox @ 2004-12-16 13:26 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Linus Torvalds, Ingo Molnar, Lee Revell, Andrea Arcangeli,
	Manfred Spraul, George Anzinger, dipankar, ganzinger, lkml,
	Andrew Morton, Andi Kleen

On Iau, 2004-12-16 at 02:10, Zwane Mwaikambo wrote:
> Might this be because you can't rely on interrupt suppression for back to 
> back suppressing instructions?

The documentation seems to have little to say on this. I've also not
tried things like interleaved mov->ss, sti to see how the interlocking
is done. It would make sense given the original 8086 reason was to allow
ss/sp to be loaded cleanly.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-14 22:47                                       ` Ingo Molnar
  2004-12-14 23:09                                         ` Linus Torvalds
@ 2004-12-14 23:41                                         ` Andrea Arcangeli
  1 sibling, 0 replies; 44+ messages in thread
From: Andrea Arcangeli @ 2004-12-14 23:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Lee Revell, Manfred Spraul, Zwane Mwaikambo, George Anzinger,
	dipankar, ganzinger, lkml, Andrew Morton, Linus Torvalds,
	Andi Kleen

On Tue, Dec 14, 2004 at 11:47:06PM +0100, Ingo Molnar wrote:
> find the correct patch below. I've tested it with an NMI watchdog
> frequency artificially increased to 10 KHz, and i've instrumented the

Nice test, it'd be nice to trigger it in real life.

on the lines of the 64k movl ss, I wonder if we could create an huge
piece of memory like this:

new_htl:
cli
sti
htl
cli
sti
htl
[..]
jmp original_hlt

and to call new_htl from original_hlt instead of sti;hlt. A dozen megs
of the above should boost the probability of getting interrupted in
"hlt" quite a bit.

However even if the nmi can execute on top of the "hlt" instruction, it
doesn't necessairly mean the next pending irq will execute before
executing 'hlt' too, so it'd need a bit more of instrumentation to as
well track down the race as happening (it's not enough to see the branch
in the nmi handler to be taken). The additional instrumentation should
be quite easy though, just copying the same nmi code to the irq handler
should do the trick.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-14 22:23                                     ` [patch, 2.6.10-rc3] safe_hlt() & NMIs Ingo Molnar
  2004-12-14 22:47                                       ` Ingo Molnar
@ 2004-12-14 23:00                                       ` Linus Torvalds
  2004-12-15  5:04                                         ` Andi Kleen
  2004-12-15  6:27                                       ` Avi Kivity
  2 siblings, 1 reply; 44+ messages in thread
From: Linus Torvalds @ 2004-12-14 23:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Lee Revell, Andrea Arcangeli, Manfred Spraul, Zwane Mwaikambo,
	George Anzinger, dipankar, ganzinger, lkml, Andrew Morton,
	Andi Kleen

On Tue, 14 Dec 2004, Ingo Molnar wrote:
> 
> indeed, there could be a connection, and it's certainly a fun race. The
> proper fix is Manfred's suggestion: check whether the EIP is a kernel
> text address, and if yes, whether it's a HLT instruction - and if yes
> then increase EIP by 1.

You do it the wrong way, though. This is not safe:

	if (__kernel_text_address(regs->eip) && *(char *)regs->eip == 0xf4)

does _entirely_ the wrong thing if CS is not the kernel CS. 

It can trigger with a regular use CS if you were to run the 4G:4G patches, 
but more realistically, I think you can make ii trigger even with a 
standard kernel by creating a local code segment in your LDT, and then 
trying to confuse the kernel that way.

Now, as long as the _only_ thing it does is increment the eip, the worst 
that can happen is that it screws over the user program that must have 
worked at this a bit, but the basic point is that you shouldn't do this. 
In _theory_ you could confuse a real program that wasn't doing anything 
bad.

Checking for kernel CS also requires checking that it's not vm86 mode, 
btw. So that's not just a "regs->xcs & 0xffff == __KERNEL_CS" either.

But something like

	static inline int kernel_mode(struct pt_regs *regs)
	{
		return !((regs->eflags & VM_MASK) | (regs->xcs & 3));
	}

should DTRT.

Can you pls double-check my thinking, and test?

			Linus

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-14 23:00                                       ` Linus Torvalds
@ 2004-12-15  5:04                                         ` Andi Kleen
  0 siblings, 0 replies; 44+ messages in thread
From: Andi Kleen @ 2004-12-15  5:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Lee Revell, Andrea Arcangeli, Manfred Spraul,
	Zwane Mwaikambo, George Anzinger, dipankar, ganzinger, lkml,
	Andrew Morton, Andi Kleen

> But something like
> 
> 	static inline int kernel_mode(struct pt_regs *regs)
> 	{
> 		return !((regs->eflags & VM_MASK) | (regs->xcs & 3));
> 	}
> 
> should DTRT.
> 
> Can you pls double-check my thinking, and test?

Reasoning looks correct to me. 

-Andi

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-14 22:23                                     ` [patch, 2.6.10-rc3] safe_hlt() & NMIs Ingo Molnar
  2004-12-14 22:47                                       ` Ingo Molnar
  2004-12-14 23:00                                       ` Linus Torvalds
@ 2004-12-15  6:27                                       ` Avi Kivity
  2004-12-15  8:51                                         ` Ingo Molnar
  2 siblings, 1 reply; 44+ messages in thread
From: Avi Kivity @ 2004-12-15  6:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Lee Revell, Andrea Arcangeli, Manfred Spraul, Zwane Mwaikambo,
	George Anzinger, dipankar, ganzinger, lkml, Andrew Morton,
	Linus Torvalds, Andi Kleen

Ingo Molnar wrote:

>+	if (__kernel_text_address(regs->eip) && *(char *)regs->eip == 0xf4)
>  
>
shouldn't that cast be (unsigned char *), otherwise the test will always 
fail?

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
  2004-12-15  6:27                                       ` Avi Kivity
@ 2004-12-15  8:51                                         ` Ingo Molnar
  0 siblings, 0 replies; 44+ messages in thread
From: Ingo Molnar @ 2004-12-15  8:51 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Lee Revell, Andrea Arcangeli, Manfred Spraul, Zwane Mwaikambo,
	George Anzinger, dipankar, ganzinger, lkml, Andrew Morton,
	Linus Torvalds, Andi Kleen


* Avi Kivity <avi@argo.co.il> wrote:

> Ingo Molnar wrote:
> 
> >+	if (__kernel_text_address(regs->eip) && *(char *)regs->eip == 0xf4)
> > 
> >
> shouldn't that cast be (unsigned char *), otherwise the test will
> always fail?

yes, i fixed this in the second patch. (the compiler warned about it
too)

	Ingo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-12 10:22                               ` Manfred Spraul
  2004-12-12 12:15                                 ` Andrea Arcangeli
@ 2004-12-12 16:51                                 ` George Anzinger
  2004-12-12 22:40                                   ` Manfred Spraul
  1 sibling, 1 reply; 44+ messages in thread
From: George Anzinger @ 2004-12-12 16:51 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Andrea Arcangeli, Zwane Mwaikambo, Lee Revell, dipankar,
	ganzinger, lkml

Manfred Spraul wrote:
> Andrea Arcangeli wrote:
> 
>> On Sun, Dec 12, 2004 at 09:59:00AM +0100, Manfred Spraul wrote:
>>  
>>
>>> It means that our NMI irq return path should check if it points to a 
>>> hlt instruction and if yes, then increase the saved EIP by one before 
>>> doing the iretd, right?
>>>   
>>
>>
>> I don't think we'll ever post any event through nmi, so it doesn't
>> matter. We only care to be waken by real irqs, not nmi/smi. Idle loop is
>> fine to ignore the actions of the nmi handlers and to hang into the
>> "hlt".
>>  
>>
> No, You misunderstood the problem:
> 
> sti
> ** NMI handler
> ** normal interrupt arrives, is queued by the cpu
> ** irqd from NMI handler
> ** cpu notices the normal interrupt, handles it.
> ** normal interrupt does a wakeup, schedules a tasklet, whatever

I think you are forgetting that the system does the full context switch from the 
interrupt handler (well, actually from entry.S) and does not do the irqd until 
it is time to go back to the idle thread (i.e. there is nothing left to do), so..
> ** irqd from normal interupt
> hlt << cpu sleeps.

What we loose here is that idle does not go around its little loop again.  If an 
interrupt becomes pending on the way to the hlt, i.e. while entry.S has 
interrupts masked and is doing the irqd, it will be handled prior to the hlt so 
we could loose several of these idle loop spins, until no interrupt is pending 
allowing the hlt to be executed.  On the next interrupt/irqd the hlt will exit. 
  So what is lost is one or more spins round the idle loop.

The "normal" idle loop just looks at the need_resched flag and goes right back 
to the hlt, however, idle, it self, never sets this flag, only interrupt code 
can set it at this point, and the interrupt exit takes action to clear it so I 
don't see it every being found set in the idle loop (I suppose one could do a 
test to see if it is ever found set here), so, in theory, the net effect should 
be nill.

Did I miss something?

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-12 16:51                                 ` RCU question George Anzinger
@ 2004-12-12 22:40                                   ` Manfred Spraul
  2004-12-13  5:22                                     ` George Anzinger
  0 siblings, 1 reply; 44+ messages in thread
From: Manfred Spraul @ 2004-12-12 22:40 UTC (permalink / raw)
  To: george
  Cc: Andrea Arcangeli, Zwane Mwaikambo, Lee Revell, dipankar,
	ganzinger, lkml

George Anzinger wrote:

>
> The "normal" idle loop just looks at the need_resched flag and goes 
> right back to the hlt,

That's the problem: If a the tasklet does a wakeup then the reschedule 
is delayed until the next interrupt. Testing need_resched and executing 
hlt must be atomic, but it isn't - NMIs break the atomicity.
Not a big deal, except if someone implements a tickless kernel. I think 
we can ignore it for now [or was the thread started by someone who 
want's to disable the hardware timer when the system is really idle?]

--
    Manfred

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-12 22:40                                   ` Manfred Spraul
@ 2004-12-13  5:22                                     ` George Anzinger
  0 siblings, 0 replies; 44+ messages in thread
From: George Anzinger @ 2004-12-13  5:22 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Andrea Arcangeli, Zwane Mwaikambo, Lee Revell, dipankar,
	ganzinger, lkml

Manfred Spraul wrote:
> George Anzinger wrote:
> 
>>
>> The "normal" idle loop just looks at the need_resched flag and goes 
>> right back to the hlt,
> 
> 
> That's the problem: If a the tasklet does a wakeup then the reschedule 
> is delayed until the next interrupt. 

Not so.  On the interrupt that runs the tasklet, on the way out via entry.S, the 
need_resched flag is checked and acted on.  Thus the task switch is done prio to 
getting back to the hlt.

> Testing need_resched and executing 
> hlt must be atomic, but it isn't - NMIs break the atomicity.

Actually this is not required, especially if preemption is turned on.

> Not a big deal, except if someone implements a tickless kernel. 

Well, it is not tickless, but VST that I am working on :).  The notion is to 
turn off the ticks when in idle and there are not time events in the list.

I think
> we can ignore it for now [or was the thread started by someone who 
> want's to disable the hardware timer when the system is really idle?]

Yep, me!  But still, I keep a timer around to exit, it is just way more than a 
tick later (depending on what the next entry in the time list needs).
> 

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: RCU question
  2004-12-12  8:59                           ` Manfred Spraul
  2004-12-12  9:37                             ` Andrea Arcangeli
@ 2004-12-12 16:26                             ` Zwane Mwaikambo
  1 sibling, 0 replies; 44+ messages in thread
From: Zwane Mwaikambo @ 2004-12-12 16:26 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: George Anzinger, Lee Revell, dipankar, ganzinger, lkml

On Sun, 12 Dec 2004, Manfred Spraul wrote:

> Zwane Mwaikambo wrote:
> 
> > "Intel processors don't suppress SMI or NMI after an STI instruction. Since
> > the INTR suppresion is not preserved across an SMI or NMI handler, this may
> > result in an INTR being serviced after the STI, which constitutes a
> > violation of the INTR suppresion.
> >  
> Interesting find.
> It means that our NMI irq return path should check if it points to a hlt
> instruction and if yes, then increase the saved EIP by one before doing the
> iretd, right?

Yeah that should do it, but then we also have to worry about SMIs, perhaps 
we could add similar logic to interrupt return path instead?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch, 2.6.10-rc3] safe_hlt() & NMIs
@ 2004-12-17 23:35 Chuck Ebbert
  0 siblings, 0 replies; 44+ messages in thread
From: Chuck Ebbert @ 2004-12-17 23:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, Andi Kleen, Andrew Morton

On Tue, 14 Dec 2004 at 15:00:56 -0800 (PST) Linus Torvalds wrote:

> Checking for kernel CS also requires checking that it's not vm86 mode, 
> btw. So that's not just a "regs->xcs & 0xffff == __KERNEL_CS" either.
>
> But something like
> 
>       static inline int kernel_mode(struct pt_regs *regs)
>       {
>               return !((regs->eflags & VM_MASK) | (regs->xcs & 3));
>       }
>
> should DTRT.
>
> Can you pls double-check my thinking, and test?


 There is already a user_mode() macro in asm-i386/ptrace.h but it's slow.

 x86_64's macro is ugly but at least there's no logical-or in it.

 Your i386 code is better, so how about applying this patch (boots/runs/is faster
on my tests):

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>

--- linux-2.6.9.1/include/asm-i386/ptrace.h     2004-10-19 15:28:18.000000000 -0400
+++ linux-2.6.9.2/include/asm-i386/ptrace.h     2004-12-17 16:59:39.956099664 -0500
@@ -55,7 +55,11 @@ struct pt_regs {
 #define PTRACE_SET_THREAD_AREA    26
 
 #ifdef __KERNEL__
-#define user_mode(regs) ((VM_MASK & (regs)->eflags) || (3 & (regs)->xcs))
+static inline int kernel_mode(struct pt_regs *regs)
+{
+       return !((3 & (regs)->xcs) | (VM_MASK & (regs)->eflags));
+}
+#define user_mode(regs) (!kernel_mode(regs))
 #define instruction_pointer(regs) ((regs)->eip)
 #if defined(CONFIG_SMP) && defined(CONFIG_FRAME_POINTER)
 extern unsigned long profile_pc(struct pt_regs *regs);

--
Please take it as a sign of my infinite respect for you,
that I insist on you doing all the work.
                                        -- Rusty Russell

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2004-12-17 23:38 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-12-09 23:59 RCU question George Anzinger
2004-12-10  4:31 ` Dipankar Sarma
2004-12-10 19:42   ` George Anzinger
2004-12-10 20:40     ` Dipankar Sarma
2004-12-10 20:45       ` Lee Revell
2004-12-10 21:02         ` George Anzinger
2004-12-10 22:58           ` Zwane Mwaikambo
2004-12-11  2:22             ` George Anzinger
2004-12-11  2:45               ` Zwane Mwaikambo
2004-12-11  3:29                 ` George Anzinger
2004-12-11 14:52                   ` Zwane Mwaikambo
2004-12-11 16:32                     ` Manfred Spraul
2004-12-11 16:52                       ` George Anzinger
2004-12-12  2:53                         ` Zwane Mwaikambo
2004-12-12  8:59                           ` Manfred Spraul
2004-12-12  9:37                             ` Andrea Arcangeli
2004-12-12 10:22                               ` Manfred Spraul
2004-12-12 12:15                                 ` Andrea Arcangeli
2004-12-14 21:40                                   ` Lee Revell
2004-12-14 22:23                                     ` [patch, 2.6.10-rc3] safe_hlt() & NMIs Ingo Molnar
2004-12-14 22:47                                       ` Ingo Molnar
2004-12-14 23:09                                         ` Linus Torvalds
2004-12-15  8:52                                           ` Ingo Molnar
2004-12-15 15:44                                             ` Linus Torvalds
2004-12-15 16:35                                               ` Ingo Molnar
2004-12-16  0:37                                           ` Alan Cox
2004-12-16  1:58                                             ` Linus Torvalds
2004-12-16 14:51                                               ` Ingo Molnar
2004-12-16 15:08                                                 ` Maciej W. Rozycki
2004-12-16 15:11                                                   ` Ingo Molnar
2004-12-16 15:42                                                     ` Maciej W. Rozycki
2004-12-16 15:54                                                 ` Linus Torvalds
2004-12-16  2:10                                             ` Zwane Mwaikambo
2004-12-16 13:26                                               ` Alan Cox
2004-12-14 23:41                                         ` Andrea Arcangeli
2004-12-14 23:00                                       ` Linus Torvalds
2004-12-15  5:04                                         ` Andi Kleen
2004-12-15  6:27                                       ` Avi Kivity
2004-12-15  8:51                                         ` Ingo Molnar
2004-12-12 16:51                                 ` RCU question George Anzinger
2004-12-12 22:40                                   ` Manfred Spraul
2004-12-13  5:22                                     ` George Anzinger
2004-12-12 16:26                             ` Zwane Mwaikambo
  -- strict thread matches above, loose matches on Subject: below --
2004-12-17 23:35 [patch, 2.6.10-rc3] safe_hlt() & NMIs Chuck Ebbert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox