All of lore.kernel.org
 help / color / mirror / Atom feed
* schedule() vs softirqs
@ 2006-12-15 17:27 Hollis Blanchard
  2006-12-15 17:36 ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Hollis Blanchard @ 2006-12-15 17:27 UTC (permalink / raw)
  To: xen-devel; +Cc: xen-ppc-devel

PowerPC's timer interrupt (called the decrementer) is a one-shot timer,
not periodic. When it goes off, entering the hypervisor, we first set it
very high so it won't interrupt hypervisor code, then
raise_softirq(TIMER_SOFTIRQ). We know that timer_softirq_action() will
then call reprogam_timer(), which will reprogram the decrementer to the
appropriate value.

We recently uncovered a bug on PowerPC where if a timer tick arrives
just inside schedule() while interrupts are still enabled, the
decrementer is never reprogrammed to that appropriate value. This is
because once inside schedule(), we never handle any subsequent softirqs:
we call context_switch() and resume the guest.

I believe the tick problem affects periodic timers (i.e. x86) as well,
though less drastically. With a CPU-bound guest, it would result in
dropped ticks: TIMER_SOFTIRQ is set and not handled, and when the timer
expires again it is re-set. In other cases, it would result in some
timer ticks being delivered very late. I don't know what effect this
might have on guests, perhaps with sensitive time-slewing code.

In addition, when SCHEDULE_SOFTIRQ is set, all "greater" softirqs
(including NMI) will not be handled until the next hypervisor
invocation.

This is pretty anti-social behavior for a softirq handler. One solution
would be to have schedule() *not* call context_switch() directly, but
rather set a flag (or a "next vcpu" pointer) and return. That would
allow other softirqs to be processed normally. Once do_softirq() returns
to assembly, we can check the "next vcpu" pointer and call
context_switch().

(This solution would enable a PowerPC optimization as well: we would
like to lazily save non-volatile registers. We can't do this unless the
exception handler regains control from do_softirq() before
context_switch() is called.)

Thoughts?

-- 
Hollis Blanchard
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: schedule() vs softirqs
  2006-12-15 17:27 schedule() vs softirqs Hollis Blanchard
@ 2006-12-15 17:36 ` Keir Fraser
  2006-12-15 19:09   ` [Xen-devel] " Hollis Blanchard
  0 siblings, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2006-12-15 17:36 UTC (permalink / raw)
  To: Hollis Blanchard, xen-devel; +Cc: xen-ppc-devel

On 15/12/06 17:27, "Hollis Blanchard" <hollisb@us.ibm.com> wrote:

> We recently uncovered a bug on PowerPC where if a timer tick arrives
> just inside schedule() while interrupts are still enabled, the
> decrementer is never reprogrammed to that appropriate value. This is
> because once inside schedule(), we never handle any subsequent softirqs:
> we call context_switch() and resume the guest.

Easily fixed. You need to handle softirqs in the exit path to guest context.
You need to do this final check with interrupts disabled to avoid races.

 -- Keir

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] schedule() vs softirqs
  2006-12-15 17:36 ` Keir Fraser
@ 2006-12-15 19:09   ` Hollis Blanchard
  2006-12-15 20:00     ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Hollis Blanchard @ 2006-12-15 19:09 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, xen-ppc-devel

On Fri, 2006-12-15 at 17:36 +0000, Keir Fraser wrote:
> On 15/12/06 17:27, "Hollis Blanchard" <hollisb@us.ibm.com> wrote:
> 
> > We recently uncovered a bug on PowerPC where if a timer tick arrives
> > just inside schedule() while interrupts are still enabled, the
> > decrementer is never reprogrammed to that appropriate value. This is
> > because once inside schedule(), we never handle any subsequent softirqs:
> > we call context_switch() and resume the guest.
> 
> Easily fixed. You need to handle softirqs in the exit path to guest context.
> You need to do this final check with interrupts disabled to avoid races.

Ah OK, I see now how x86 is doing that. I don't think that code flow
really makes sense: why would you jump out of do_softirq() into assembly
just to call do_softirq() again?

Also, that doesn't solve the lazy register saving problem.

However, I think I see how we can implement our desired context_switch()
scheme in arch-specific code. The context_switch() call in schedule()
will return, so please don't add a BUG() after that. :)

-- 
Hollis Blanchard
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: schedule() vs softirqs
  2006-12-15 19:09   ` [Xen-devel] " Hollis Blanchard
@ 2006-12-15 20:00     ` Keir Fraser
  2006-12-15 20:41       ` Hollis Blanchard
  0 siblings, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2006-12-15 20:00 UTC (permalink / raw)
  To: Hollis Blanchard, Keir Fraser; +Cc: xen-devel, xen-ppc-devel

On 15/12/06 19:09, "Hollis Blanchard" <hollisb@us.ibm.com> wrote:

> Ah OK, I see now how x86 is doing that. I don't think that code flow
> really makes sense: why would you jump out of do_softirq() into assembly
> just to call do_softirq() again?

Well, that's the way it works out on x86. It is a bit odd, but it works and
is unlikely to affect performance. I think returning from schedule() would
have its own problems (e.g., context switch from idle domain to guest would
return to the idle loop, which we'd need explicit code to bail from, etc).

> Also, that doesn't solve the lazy register saving problem.

I assume this is a PPC-specific issue?

> However, I think I see how we can implement our desired context_switch()
> scheme in arch-specific code. The context_switch() call in schedule()
> will return, so please don't add a BUG() after that. :)

We already support this mode of operation for IA64 which always returns from
schedule().

 -- Keir

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: schedule() vs softirqs
  2006-12-15 20:00     ` Keir Fraser
@ 2006-12-15 20:41       ` Hollis Blanchard
  2006-12-15 21:39         ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Hollis Blanchard @ 2006-12-15 20:41 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, xen-ppc-devel

On Fri, 2006-12-15 at 20:00 +0000, Keir Fraser wrote:
> 
> > Also, that doesn't solve the lazy register saving problem.
> 
> I assume this is a PPC-specific issue?

It's an issue with any architecture with a large number of registers
which aren't automatically saved by hardware (and a C ABI that makes
some of them non-volatile).

x86 has a small number of registers. ia64 automatically saves them (from
what I understand). So of the currently-supported architectures, yes,
that leaves PowerPC.

-- 
Hollis Blanchard
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: schedule() vs softirqs
  2006-12-15 20:41       ` Hollis Blanchard
@ 2006-12-15 21:39         ` Keir Fraser
  2006-12-15 23:51           ` Hollis Blanchard
  0 siblings, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2006-12-15 21:39 UTC (permalink / raw)
  To: Hollis Blanchard, Keir Fraser; +Cc: xen-devel, xen-ppc-devel

On 15/12/06 20:41, "Hollis Blanchard" <hollisb@us.ibm.com> wrote:

> It's an issue with any architecture with a large number of registers
> which aren't automatically saved by hardware (and a C ABI that makes
> some of them non-volatile).
> 
> x86 has a small number of registers. ia64 automatically saves them (from
> what I understand). So of the currently-supported architectures, yes,
> that leaves PowerPC.

I see. It sounds like returning from context_switch() is perhaps the right
thing for powerpc. That would be easier if you have per-cpu stacks (like
ia64). If not there are issues in saving register state later (and hence
delaying your call to context_saved()) as there are calls to do_softirq()
outside your asm code (well, not many, but there is one in domain.c for
example) where you won't end up executing your do_softirq() wrapper. In
general we'd like to reserve the right to include voluntary yield points,
and that won't mix well with lazy register saves and per-physical-cpu
stacks.

 -- Keir

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: schedule() vs softirqs
  2006-12-15 21:39         ` Keir Fraser
@ 2006-12-15 23:51           ` Hollis Blanchard
  0 siblings, 0 replies; 7+ messages in thread
From: Hollis Blanchard @ 2006-12-15 23:51 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, xen-ppc-devel

On Fri, 2006-12-15 at 21:39 +0000, Keir Fraser wrote:
> On 15/12/06 20:41, "Hollis Blanchard" <hollisb@us.ibm.com> wrote:
> 
> > It's an issue with any architecture with a large number of registers
> > which aren't automatically saved by hardware (and a C ABI that makes
> > some of them non-volatile).
> >
> > x86 has a small number of registers. ia64 automatically saves them (from
> > what I understand). So of the currently-supported architectures, yes,
> > that leaves PowerPC.
> 
> I see. It sounds like returning from context_switch() is perhaps the right
> thing for powerpc. That would be easier if you have per-cpu stacks (like
> ia64).

Yup, we have per-cpu stacks.

> If not there are issues in saving register state later (and hence
> delaying your call to context_saved()) as there are calls to do_softirq()
> outside your asm code (well, not many, but there is one in domain.c for
> example) where you won't end up executing your do_softirq() wrapper. In
> general we'd like to reserve the right to include voluntary yield points,
> and that won't mix well with lazy register saves and per-physical-cpu
> stacks.

Oh, we have per-physical-cpu stacks. We can do that because there's no
such thing as a "hypervisor thread" which could block in hypervisor
space and need to be restored later.

Are you saying in the future you want to have hypervisor threads, and so
we'll need per-VIRTUAL-cpu stacks?

-- 
Hollis Blanchard
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2006-12-15 23:51 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-12-15 17:27 schedule() vs softirqs Hollis Blanchard
2006-12-15 17:36 ` Keir Fraser
2006-12-15 19:09   ` [Xen-devel] " Hollis Blanchard
2006-12-15 20:00     ` Keir Fraser
2006-12-15 20:41       ` Hollis Blanchard
2006-12-15 21:39         ` Keir Fraser
2006-12-15 23:51           ` Hollis Blanchard

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.