credit scheduler and HYPERVISOR_yield()

* credit scheduler and HYPERVISOR_yield()
@ 2007-10-08 23:41 John Levon
  2007-10-09  1:23 ` Atsushi SAKAI
  2007-10-09  7:06 ` Emmanuel Ackaouy
  0 siblings, 2 replies; 17+ messages in thread
From: John Levon @ 2007-10-08 23:41 UTC (permalink / raw)
  To: xen-devel

It looks like a HYPERVISOR_yield() call will end up placing the yielded
VCPU at the head of the run queue if there are only equal-or-lower priority
VCPUs on the queue. Shouldn't it place it after any equal-priority VCPUs
on the list?

thanks
john

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-08 23:41 credit scheduler and HYPERVISOR_yield() John Levon
@ 2007-10-09  1:23 ` Atsushi SAKAI
  2007-10-09  1:42   ` John Levon
  2007-10-09  7:06 ` Emmanuel Ackaouy
  1 sibling, 1 reply; 17+ messages in thread
From: Atsushi SAKAI @ 2007-10-09  1:23 UTC (permalink / raw)
  To: John Levon; +Cc: xen-devel

Hi, John

Did you find a problem in your test (with HYPERVISOR_yield())?
Please describe your problem first.

Thanks
Atsushi SAKAI


John Levon <levon@movementarian.org> wrote:

> 
> It looks like a HYPERVISOR_yield() call will end up placing the yielded
> VCPU at the head of the run queue if there are only equal-or-lower priority
> VCPUs on the queue. Shouldn't it place it after any equal-priority VCPUs
> on the list?
> 
> thanks
> john

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-09  1:23 ` Atsushi SAKAI
@ 2007-10-09  1:42   ` John Levon
  0 siblings, 0 replies; 17+ messages in thread
From: John Levon @ 2007-10-09  1:42 UTC (permalink / raw)
  To: Atsushi SAKAI; +Cc: xen-devel

On Tue, Oct 09, 2007 at 10:23:19AM +0900, Atsushi SAKAI wrote:

> Did you find a problem in your test (with HYPERVISOR_yield())?
> Please describe your problem first.

The problem is with the semantics of HYPERVISOR_yield(). We use this after
doing an IPI when we want to wait for the other CPU to respond before
continuing. If we're just going into the hypervisor and straight back
out again until our credit expires, then it's clearly sub-optimal for
this case.

I haven't looked hard at the implementation, or even verified this
behaviour, but it does look like that.

regards
john

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-08 23:41 credit scheduler and HYPERVISOR_yield() John Levon
  2007-10-09  1:23 ` Atsushi SAKAI
@ 2007-10-09  7:06 ` Emmanuel Ackaouy
  2007-10-09 12:15   ` John Levon
  1 sibling, 1 reply; 17+ messages in thread
From: Emmanuel Ackaouy @ 2007-10-09  7:06 UTC (permalink / raw)
  To: John Levon; +Cc: xen-devel

Hi John.

The expected behavior of yield() (or any schedule operation really) is
that the current VCPU will be placed on the runq behind all VCPUs of
equal or greater priority.

Looking at __runq_insert() in sched_credit.c, it looks correct to me in
that respect.
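
For illustration only, here is a toy Python model of that insertion
rule. It is not the code from sched_credit.c; the function name
runq_insert, the tuple layout and the numeric priority values are
assumptions chosen for the sketch (higher number means higher priority).

```python
# Toy model of priority-ordered run-queue insertion (not the Xen source).
# Priority values are illustrative assumptions; higher means more urgent.
TS_OVER, TS_UNDER = -2, -1

def runq_insert(runq, vcpu):
    """Insert vcpu behind every queued entry of equal or greater priority."""
    pos = 0
    while pos < len(runq) and runq[pos][1] >= vcpu[1]:
        pos += 1
    runq.insert(pos, vcpu)
    return runq

runq = [("a", TS_UNDER), ("b", TS_UNDER), ("c", TS_OVER)]
runq_insert(runq, ("yielder", TS_UNDER))
print([name for name, _ in runq])  # ['a', 'b', 'yielder', 'c']
```

In this model the yielding VCPU lands behind its equal-priority peers
and ahead of the lower-priority one.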

Can you clarify what's going wrong?

On Oct 9, 2007, at 1:41, John Levon wrote:

>
> It looks like a HYPERVISOR_yield() call will end up placing the yielded
> VCPU at the head of the run queue if there are only equal-or-lower priority
> VCPUs on the queue. Shouldn't it place it after any equal-priority VCPUs
> on the list?
>
> thanks
> john

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-09  7:06 ` Emmanuel Ackaouy
@ 2007-10-09 12:15   ` John Levon
  2007-10-09 13:22     ` George Dunlap
  0 siblings, 1 reply; 17+ messages in thread
From: John Levon @ 2007-10-09 12:15 UTC (permalink / raw)
  To: Emmanuel Ackaouy; +Cc: xen-devel

On Tue, Oct 09, 2007 at 09:06:14AM +0200, Emmanuel Ackaouy wrote:

> The expected behavior of yield() (or any schedule operation really) is
> that the current VCPU will be placed on the runq behind all VCPUs of
> equal or greater priority.
> 
> Looking at __runq_insert() in sched_credit.c, it looks correct to me in
> that respect.
> 
> Can you clarify what's going wrong?

It looks fine... no idea how I misread the code.

sorry,
john

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-09 12:15   ` John Levon
@ 2007-10-09 13:22     ` George Dunlap
  2007-10-09 14:48       ` Emmanuel Ackaouy
  2007-10-14 18:45       ` John Levon
  0 siblings, 2 replies; 17+ messages in thread
From: George Dunlap @ 2007-10-09 13:22 UTC (permalink / raw)
  To: John Levon; +Cc: Emmanuel Ackaouy, xen-devel

The code does what it's designed to -- put the current vcpu behind any
vcpus of equal priority.

But that behavior isn't always the ideal, specifically in situations
like John describes.  The credit scheduler has two basic priorities --
TS_UNDER (hasn't yet used all its credits) and TS_OVER (used all its
credits).  The scheduler switches between these two based on how much
cpu time a vcpu has actually had compared to how much it's allocated.
This is a very clever way to make sure that each vcpu gets its fair
share, but that spare cycles are still used effectively.

What this means in the case of a yield(), unfortunately, is that if a
given vcpu is the only vcpu on its processor with credits left, all it
can do is burn up its extra credits spinning or calling yield() to no
effect.

A simple option would be, for the credit scheduler, to temporarily
reduce the priority from TS_UNDER to TS_OVER.  This will cause it to
actually yield if there are any other vcpus that can run.  The next time
accounting is done, the priority will be reset, and it should get more
time because of the time it's given up.
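
To make the difference concrete, here is a small Python sketch (a toy
model, not Xen code). The names do_yield, insert and make_runq, the
dict layout and the numeric priorities are all assumptions for
illustration. It compares a yield that keeps TS_UNDER with one that
temporarily drops to TS_OVER, when every other queued vcpu is TS_OVER.

```python
# Toy model: what a yield picks next, with and without temporary demotion.
TS_OVER, TS_UNDER = -2, -1  # illustrative values; higher means more urgent

def insert(runq, vcpu):
    """Queue vcpu behind all entries of equal or greater priority."""
    pos = 0
    while pos < len(runq) and runq[pos]["pri"] >= vcpu["pri"]:
        pos += 1
    runq.insert(pos, vcpu)

def do_yield(runq, current, demote):
    """Requeue `current` after a yield; return the name of the vcpu run next."""
    if demote:
        current["pri"] = TS_OVER  # proposed: stay demoted until next accounting
    insert(runq, current)
    return runq.pop(0)["name"]

def make_runq():
    return [{"name": "other1", "pri": TS_OVER},
            {"name": "other2", "pri": TS_OVER}]

print(do_yield(make_runq(), {"name": "yielder", "pri": TS_UNDER}, demote=False))
print(do_yield(make_runq(), {"name": "yielder", "pri": TS_UNDER}, demote=True))
```

In this model the first call picks "yielder" again (the yield has no
effect), while the second picks "other1".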

Thoughts?

 -George

On 10/9/07, John Levon <levon@movementarian.org> wrote:
> On Tue, Oct 09, 2007 at 09:06:14AM +0200, Emmanuel Ackaouy wrote:
>
> > The expected behavior of yield() (or any schedule operation really) is
> > that the current VCPU will be placed on the runq behind all VCPUs of
> > equal or greater priority.
> >
> > Looking at __runq_insert() in sched_credit.c, it looks correct to me in
> > that respect.
> >
> > Can you clarify what's going wrong?
>
> It looks fine... no idea how I misread the code.
>
> sorry,
> john

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-09 13:22     ` George Dunlap
@ 2007-10-09 14:48       ` Emmanuel Ackaouy
  2007-10-14 18:45       ` John Levon
  1 sibling, 0 replies; 17+ messages in thread
From: Emmanuel Ackaouy @ 2007-10-09 14:48 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, John Levon

On Oct 9, 2007, at 15:22, George Dunlap wrote:
> What this means in the case of a yield(), unfortunately, is that if a
> given vcpu is the only vcpu on its processor with credits left, all it
> can do is burn up its extra credits spinning or calling yield() to no
> effect.
>
> A simple option would be, for the credit scheduler, to temporarily
> reduce the priority from TS_UNDER to TS_OVER.  This will cause it to
> actually yield if there are any other vcpus that can run.  The next time
> accounting is done, the priority will be reset, and it should get more
> time because of the time it's given up.

Temporarily changing the priority to TS_OVER strikes me as a
reasonable idea. However, changing it for an average of half of
the accounting period (1/2 * 100ms = 50ms) is hardly "temporary".
A VCPU that would call yield() more than once every 50ms or
so -- which isn't unreasonable -- would never be able to run at
TS_UNDER. That would totally distort accounting fairness for
users of yield(). Maybe something more in the temporary spirit
of the TS_BOOST priority (but lower not higher than TS_UNDER)
would be better?
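
As a rough check of those numbers, here is a small Python sketch. It is
a simplified model under stated assumptions, not a measurement: a yield
demotes the VCPU until the next accounting tick, ticks come every 100ms,
and the VCPU yields at fixed wall-clock intervals. The function names
are made up for the example.

```python
ACCT_PERIOD_MS = 100  # accounting period quoted above

def mean_demotion_ms(period_ms=ACCT_PERIOD_MS):
    """Average demoted time when a yield lands at a uniformly random phase."""
    # Sample the phase at the midpoint of each millisecond of one period.
    remaining = [period_ms - (p + 0.5) for p in range(period_ms)]
    return sum(remaining) / len(remaining)

def under_fraction(yield_interval_ms, period_ms=ACCT_PERIOD_MS,
                   horizon_ms=10_000):
    """Share of wall time at TS_UNDER if each yield demotes until the next tick."""
    under = 0
    demoted = False
    for t in range(horizon_ms):
        if t % period_ms == 0:
            demoted = False   # accounting resets the priority
        if t % yield_interval_ms == yield_interval_ms - 1:
            demoted = True    # the VCPU yields at the end of each interval
        if not demoted:
            under += 1
    return under / horizon_ms

print(mean_demotion_ms())  # 50.0
for interval in (10, 50, 1000):
    print(interval, under_fraction(interval))
```

In this model the share of time spent at TS_UNDER shrinks quickly as
yields become more frequent, which is the distortion described above.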

It may be worthwhile to consider if yield() can be replaced with
more intelligent mechanisms for VCPU synchronization of SMP
guests. In the case of ACKed IPIs for example, if all target VCPUs
are not running at the time of the IPI initiation, it might be a good
idea to put the source to sleep until all targets have ACKed.
If all target VCPUs are running though, I suspect things will work
best if the IPI initiator does not yield at all.

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-09 13:22     ` George Dunlap
  2007-10-09 14:48       ` Emmanuel Ackaouy
@ 2007-10-14 18:45       ` John Levon
  2007-10-14 19:20         ` Emmanuel Ackaouy
  1 sibling, 1 reply; 17+ messages in thread
From: John Levon @ 2007-10-14 18:45 UTC (permalink / raw)
  To: George Dunlap; +Cc: Emmanuel Ackaouy, xen-devel

On Tue, Oct 09, 2007 at 02:22:13PM +0100, George Dunlap wrote:

> What this means in the case of a yield(), unfortunately, is that if a
> given vcpu is the only vcpu on its processor with credits left, all it
> can do is burn up its extra credits spinning or calling yield() to no
> effect.
> 
> A simple option would be, for the credit scheduler, to temporarily
> reduce the priority from TS_UNDER to TS_OVER.  This will cause it to

We prototyped this change and it made quite a difference (though didn't
solve our problems entirely). Would it be possible to get a proper fix
available?

Emmanuel Ackaouy wrote:

> It may be worthwhile to consider if yield() can be replaced with
> more intelligent mechanisms for VCPU synchronization of SMP
> guests. In the case of ACKed IPIs for example, if all target VCPUs
> are not running at the time of the IPI initiation, it might be a good
> idea to put the source to sleep until all targets have ACKed.
> If all target VCPUs are running though, I suspect things will work
> best if the IPI initiator does not yield at all.

This seems like a bad idea since we may be IPIing to several CPUs and we
don't want to sleep whilst we can usefully move on and IPI the other
CPUs (even if they can't quite respond yet).

cheers
john

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-14 18:45       ` John Levon
@ 2007-10-14 19:20         ` Emmanuel Ackaouy
  2007-10-14 19:49           ` John Levon
  2007-10-15 12:26           ` George Dunlap
  0 siblings, 2 replies; 17+ messages in thread
From: Emmanuel Ackaouy @ 2007-10-14 19:20 UTC (permalink / raw)
  To: John Levon; +Cc: George Dunlap, xen-devel

On Oct 14, 2007, at 20:45, John Levon wrote:
> Emmanuel Ackaouy wrote:
>> It may be worthwhile to consider if yield() can be replaced with
>> more intelligent mechanisms for VCPU synchronization of SMP
>> guests. In the case of ACKed IPIs for example, if all target VCPUs
>> are not running at the time of the IPI initiation, it might be a good
>> idea to put the source to sleep until all targets have ACKed.
>> If all target VCPUs are running though, I suspect things will work
>> best if the IPI initiator does not yield at all.
>
> This seems like a bad idea since we may be IPIing to several CPUs and
> we don't want to sleep whilst we can usefully move on and IPI the
> other CPUs (even if they can't quite respond yet).

Why can't you initiate the IPI to all the destination CPUs first
and then wait for them to ACK, going to sleep if it looks like at
least one of them won't be able to ACK in a reasonable
timeframe (for example if it is asleep or on the run queue of
the running VCPU's physical CPU)?

I'm probably not understanding what you're trying to do?

> On Tue, Oct 09, 2007 at 02:22:13PM +0100, George Dunlap wrote:
>> A simple option would be, for the credit scheduler, to temporarily
>> reduce the priority from TS_UNDER to TS_OVER.  This will cause it to
>
> We prototyped this change and it made quite a difference (though didn't
> solve our problems entirely). Would it be possible to get a proper fix
> available?

Doing the change that George proposed may help in your case
but I suspect that, as I described in my previous post, it will cause
problems for other workloads.

I think it is reasonable for a yield() operation to yield to runnable
VCPUs of equal or higher priority than the running VCPU. That
is the behavior of the scheduler today. Maybe your problem can
be addressed without changing the behavior of yield?

With that said, it's unlikely that I'll be making a change to the
scheduler myself: I haven't worked at XenSource for some time
now and don't have the resources (not to mention time) to test
any such change. I'm happy to learn about your problem and
suggest potential fixes but I'm probably not the person you need
to convince if you want to make a significant scheduler change
these days. Arguably, a number of things need to be done in
the Xen scheduler and synchronization primitives to improve
the performance of SMP guests. It may be worthwhile to have
a generic discussion about that on top of the specific problem
you're encountering.

Cheers,
Emmanuel.

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-14 19:20         ` Emmanuel Ackaouy
@ 2007-10-14 19:49           ` John Levon
  2007-10-14 21:25             ` Emmanuel Ackaouy
  2007-10-15 12:26           ` George Dunlap
  1 sibling, 1 reply; 17+ messages in thread
From: John Levon @ 2007-10-14 19:49 UTC (permalink / raw)
  To: Emmanuel Ackaouy; +Cc: George Dunlap, xen-devel

On Sun, Oct 14, 2007 at 09:20:50PM +0200, Emmanuel Ackaouy wrote:

> >>It may be worthwhile to consider if yield() can be replaced with
> >>more intelligent mechanisms for VCPU synchronization of SMP
> >>guests. In the case of ACKed IPIs for example, if all target VCPUs
> >>are not running at the time of the IPI initiation, it might be a good
> >>idea to put the source to sleep until all targets have ACKed.
> >>If all target VCPUs are running though, I suspect things will work
> >>best if the IPI initiator does not yield at all.
> >

> >This seems like a bad idea since we may be IPIing to several CPUs and
> >we don't want to sleep whilst we can usefully move on and IPI the
> >other CPUs (even if they can't quite respond yet).
> 
> Why can't you initiate the IPI to all the destination CPUs first

We do, maybe I misunderstood what you were suggesting.

> Doing the change that George proposed may help in your case
> but I suspect that, as I described in my previous post, it will cause
> problems for other workloads.
> 
> I think it is reasonable for a yield() operation to yield to runnable
> VCPUs of equal or higher priority than the running VCPU. That
> is the behavior of the scheduler today.

Well yes, except the priorities are "wrong". We've explicitly asked to
not be scheduled, it doesn't seem right for the scheduler not to heed
that suggestion.

> Maybe your problem can be addressed without changing the behavior of
> yield?

For this particular problem, sure.

> With that said, it's unlikely that I'll be making a change to the
> scheduler myself: I haven't worked at XenSource for some time

That's fine...

regards
john

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-14 19:49           ` John Levon
@ 2007-10-14 21:25             ` Emmanuel Ackaouy
  2007-10-14 21:50               ` John Levon
  0 siblings, 1 reply; 17+ messages in thread
From: Emmanuel Ackaouy @ 2007-10-14 21:25 UTC (permalink / raw)
  To: John Levon; +Cc: George Dunlap, xen-devel

On Oct 14, 2007, at 21:49, John Levon wrote:
> On Sun, Oct 14, 2007 at 09:20:50PM +0200, Emmanuel Ackaouy wrote:
>> Why can't you initiate the IPI to all the destination CPUs first
>
> We do, maybe I misunderstood what you were suggesting.

That's OK. I still don't understand what problem you're working
on myself. I think you're suggesting you'd like to always
deschedule a VCPU initiating an IPI for potentially multiple
time slices. That seems crazy to me.

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-14 21:25             ` Emmanuel Ackaouy
@ 2007-10-14 21:50               ` John Levon
  0 siblings, 0 replies; 17+ messages in thread
From: John Levon @ 2007-10-14 21:50 UTC (permalink / raw)
  To: Emmanuel Ackaouy; +Cc: George Dunlap, xen-devel

On Sun, Oct 14, 2007 at 11:25:43PM +0200, Emmanuel Ackaouy wrote:

> >On Sun, Oct 14, 2007 at 09:20:50PM +0200, Emmanuel Ackaouy wrote:
> >>Why can't you initiate the IPI to all the destination CPUs first
> >
> >We do, maybe I misunderstood what you were suggesting.
> 
> That's OK. I still don't understand what problem you're working
> on myself. I think you're suggesting you'd like to always
> deschedule a VCPU initiating an IPI for potentially multiple
> time slices. That seems crazy to me.

Yes, that would be crazy indeed.

I'd like a VCPU that does a yield to actually yield; from what I can
tell, that's typically not happening right now.

john

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-14 19:20         ` Emmanuel Ackaouy
  2007-10-14 19:49           ` John Levon
@ 2007-10-15 12:26           ` George Dunlap
  2007-10-15 12:32             ` John Levon
                               ` (2 more replies)
  1 sibling, 3 replies; 17+ messages in thread
From: George Dunlap @ 2007-10-15 12:26 UTC (permalink / raw)
  To: Emmanuel Ackaouy; +Cc: xen-devel, John Levon

On 10/14/07, Emmanuel Ackaouy <ackaouy@gmail.com> wrote:
> Doing the change that George proposed may help in your case
> but I suspect that, as I described in my previous post, it will cause
> problems for other workloads.
>
> I think it is reasonable for a yield() operation to yield to runnable
> VCPUs of equal or higher priority than the running VCPU. That
> is the behavior of the scheduler today. Maybe your problem can
> be addressed without changing the behavior of yield?

Part of the problem is that for the credit scheduler, the "priority"
is used a bit differently.  It changes, and it has no fundamental
relationship between more important work and less important work; it's
just a mechanism for implementing time allocations. (And a very clever
way, I might add.)

It's clear that "yield-I-really-mean-it" is useful for smp
synchronization issues (like yielding when waiting for a spinlock held
by scheduled-out vcpus, or waiting for a scheduled-out processor to
ACK an IPI).  But I can't really think of a situation where
"yield-to-other-cpus-that-haven't-used-all-their-credits-yet" is
particularly useful.  Can you think of an example?

Perhaps a better implementation of "yield-I-really-mean-it" would be:
* Reduce the priority only if there are no vcpus of the same priority
in the queue; and perhaps, only if there are no vcpus in the queue and
no work to steal.
* As soon as the vcpu in question is scheduled, raise its priority again.

That should avoid some of the problems you've pointed out with the
yield-reduces-priority technique.
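
A rough Python sketch of the first variant of that idea, as a toy model
only: the names yield_refined, pick_next and saved_pri, the dict layout
and the numeric priorities are assumptions for illustration, and the
"no work to steal" condition is left out.

```python
# Toy model: demote on yield only when no peer shares the priority,
# and undo the demotion as soon as the vcpu is scheduled again.
TS_OVER, TS_UNDER = -2, -1  # illustrative values; higher means more urgent

def insert(runq, vcpu):
    """Queue vcpu behind all entries of equal or greater priority."""
    pos = 0
    while pos < len(runq) and runq[pos]["pri"] >= vcpu["pri"]:
        pos += 1
    runq.insert(pos, vcpu)

def yield_refined(runq, current):
    """Demote only when no queued vcpu has the same priority as current."""
    if not any(v["pri"] == current["pri"] for v in runq):
        current["saved_pri"] = current["pri"]
        current["pri"] = TS_OVER
    insert(runq, current)

def pick_next(runq):
    """Pop the head; restore a temporarily lowered priority on scheduling."""
    vcpu = runq.pop(0)
    if "saved_pri" in vcpu:
        vcpu["pri"] = vcpu.pop("saved_pri")
    return vcpu

runq = [{"name": "other", "pri": TS_OVER}]
yielder = {"name": "yielder", "pri": TS_UNDER}
yield_refined(runq, yielder)
first = pick_next(runq)    # "other" gets to run
insert(runq, first)        # and is requeued after its slice
second = pick_next(runq)   # "yielder" runs again, back at TS_UNDER
print(first["name"], second["name"], second["pri"] == TS_UNDER)
# other yielder True
```

Because the priority is restored on the next schedule rather than at the
next accounting tick, the demotion in this model lasts for one pass
through the queue only.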

> Arguably, a number of things need to be done in
> the Xen scheduler and synchronization primitives to improve
> the performance of SMP guests. It may be worthwhile to have
> a generic discussion about that on top of the specific problem
> you're encountering.

Here are some random ideas:
* Expose to the guest, via the shared-info page, which vcpus are
actively scheduled or not.
* Implement some kind of a yield or block primitive, like:
+ yield to a specific vcpu (i.e., the one holding the lock you want)
+ block with a vcpu mask.  The vcpu will then be blocked until each of
the vcpus in the mask has been scheduled at least once.
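
To show what the "block with a vcpu mask" idea might look like, here is
a minimal Python model. No such hypercall is described in this thread as
existing; the class MaskBlock and its methods are hypothetical names
invented for the sketch.

```python
class MaskBlock:
    """Hypothetical 'block until these vcpus have each run once' primitive."""

    def __init__(self):
        self.waiting = {}  # blocked vcpu -> set of vcpus it still waits for

    def block(self, vcpu, mask):
        """Block `vcpu` until every vcpu in `mask` has been scheduled."""
        self.waiting[vcpu] = set(mask)

    def on_scheduled(self, vcpu):
        """Called by the modelled scheduler when `vcpu` runs; returns wakeups."""
        woken = []
        for waiter, pending in list(self.waiting.items()):
            pending.discard(vcpu)
            if not pending:
                del self.waiting[waiter]
                woken.append(waiter)
        return woken

mb = MaskBlock()
mb.block(0, {1, 2})        # vcpu 0 waits for vcpus 1 and 2
print(mb.on_scheduled(1))  # []
print(mb.on_scheduled(2))  # [0]
```

The "yield to a specific vcpu" variant corresponds to a mask with a
single entry in this model.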

Thoughts?

 -George

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-15 12:26           ` George Dunlap
@ 2007-10-15 12:32             ` John Levon
  2007-10-15 12:43             ` Samuel Thibault
  2007-10-15 17:13             ` Emmanuel Ackaouy
  2 siblings, 0 replies; 17+ messages in thread
From: John Levon @ 2007-10-15 12:32 UTC (permalink / raw)
  To: George Dunlap; +Cc: Emmanuel Ackaouy, xen-devel

On Mon, Oct 15, 2007 at 01:26:06PM +0100, George Dunlap wrote:

> Part of the problem is that for the credit scheduler, the "priority"
> is used a bit differently.  It changes, and it has no fundamental
> relationship between more important work and less important work; it's
> just a mechanism for implementing time allocations. (And a very clever
> way, I might add.)
> 
> It's clear that "yield-I-really-mean-it" is useful for smp
> synchronization issues (like yielding when waiting for a spinlock held
> by scheduled-out vcpus, or waiting for a scheduled-out processor to
> ACK an IPI).  But I can't really think of a situation where
> "yield-to-other-cpus-that-haven't-used-all-their-credits-yet" is
> particularly useful.  Can you think of an example?
> 
> Perhaps a better implementation of "yield-I-really-mean-it" would be:
> * Reduce the priority only if there are no vcpus of the same priority
> in the queue; and perhaps, only if there are no vcpus in the queue and
> no work to steal.

Isn't this the opposite of what our case needs? That is, we yield, and
we want to schedule another VCPU, whether it's the same priority or not.

> > Arguably, a number of things need to be done in
> > the Xen scheduler and synchronization primitives to improve
> > the performance of SMP guests. It may be worthwhile to have
> > a generic discussion about that on top of the specific problem
> > you're encountering.
> 
> Here are some random ideas:
> * Expose to the guest, via the shared-info page, which vcpus are
> actively scheduled or not.

That info is already available via the runstate (although we don't use
it, and it wouldn't help us - the problem is that the 'other' VCPU
doesn't get scheduled when we yield, not that we don't know whether to
yield or not.)

> * Implement some kind of a yield or block primitive, like:
> + yield to a specific vcpu (i.e., the one holding the lock you want)
> + block with a vcpu mask.  The vcpu will then be blocked until each of
> the vcpus in the mask has been scheduled at least once.

Possible if the scheduler can't be fixed in a similar way.

regards,
john

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-15 12:26           ` George Dunlap
  2007-10-15 12:32             ` John Levon
@ 2007-10-15 12:43             ` Samuel Thibault
  2007-10-15 17:13             ` Emmanuel Ackaouy
  2 siblings, 0 replies; 17+ messages in thread
From: Samuel Thibault @ 2007-10-15 12:43 UTC (permalink / raw)
  To: George Dunlap; +Cc: Emmanuel Ackaouy, xen-devel, John Levon

Hi,

George Dunlap, on Mon 15 Oct 2007 13:26:06 +0100, wrote:
> I can't really think of a situation where
> "yield-to-other-cpus-that-haven't-used-all-their-credits-yet" is
> particularly useful.  Can you think of an example?

That could actually be the counterpart of "yield-I-really-mean-it":
- vCPU0 yields-really-means-it so as to hopefully schedule vCPU1.
- vCPU1 realizes why it got scheduled and does the needed urgent job.
- vCPU1 "yields-to-other-cpus-thatblabla", to let Xen know it has
  finished its urgent job and the usual priorities can be taken into
  account again.
- vCPU0 gets scheduled again because it actually had higher priority.

> Here are some random ideas:
> * Expose to the guest, via the shared-info page, which vcpus are
> actively scheduled or not.

That could be useful, but one can't safely rely on it, since it may
change asynchronously.

> * Implement some kind of a yield or block primitive, like:
> + yield to a specific vcpu (i.e., the one holding the lock you want)

That should be quite fine.  Xen could use it as a strong scheduling
hint. If scheduling that vCPU immediately would break some quota rules
for instance, Xen could still remember that it shouldn't reschedule the
calling vCPU before having scheduled the target vCPU at least once.

> + block with a vcpu mask.  The vcpu will then be blocked until each of
> the vcpus in the mask has been scheduled at least once.

That could be also called yield_to_vcpus actually.

Samuel

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-15 12:26           ` George Dunlap
  2007-10-15 12:32             ` John Levon
  2007-10-15 12:43             ` Samuel Thibault
@ 2007-10-15 17:13             ` Emmanuel Ackaouy
  2007-10-15 17:45               ` Keir Fraser
  2 siblings, 1 reply; 17+ messages in thread
From: Emmanuel Ackaouy @ 2007-10-15 17:13 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, John Levon

I suspect yield() was first devised as a simple synchronization mechanism
for uni-processor round robin schedulers.

Then strict priorities were added to make certain tasks (like pagers) run
more aggressively than "normal" ones. As long as these high priority
threads don't use the yield() mechanism, things are fine. I believe you are
pointing out that from the perspective of the yield() mechanism, all
time-share priorities (UNDER and OVER) should be considered one and
the same because they are not strict priorities. This is a good observation
and I agree with you (as long as reasonable uses of yield() don't cause
fairness to go out the window).

However, before you go and fix yield(), you might want to consider this:

1- It's been proposed before that things like dom0 VCPUs be scheduled
with a priority strictly greater than any domU VCPU. If strict priorities are
introduced into the Xen scheduler at some point in the future, code that
assumes that a yield() from a VCPU will allow all other runnable VCPUs
in the system a chance to run ahead of it will break (again).

2- Priorities aside, on an SMP host (i.e. all computers) with distributed run
queues, it is non-trivial to guarantee that a VCPU will not be rescheduled
until all other runnable VCPUs have had a chance to run first. If you can
come up with a simple and scalable way to do it, great. I suspect you will
need to approximate this definition of yield() though, perhaps by using
some form of directed yield, targeted at one or more VCPUs, as you have
suggested.

3- Yield really isn't a great model to do synchronization in an SMP world.
If you're going to para-virtualize your IPI and spinlock paths, as you pointed
out in your last mail, you might as well do something that can be directed
and block if necessary.

I guess my point is that instead of working real hard to try and maintain
the old yield behavior ("don't run again until all other runnable VCPUs
have had a chance to run first") on an SMP scheduler which potentially
also has to deal with strict priorities, you'd be better off spending your
energy on building and optimizing simpler and more targeted
synchronization mechanisms and using those instead. User level
threads libraries may be a good place to look for inspiration if you're
really worried about the costs of supervisor to hypervisor context
switches. I'm not a huge fan of shared pages but it was popular to
write papers about them for user level thread synchronization back in
the 90s.

In the case of IPIs, you're already going into the hypervisor so you
should be able to do something straightforward with a sleeping
semaphore. Maybe you spin a little before you sleep though to give
running VCPUs a chance to respond before you give up the end of
your time slice.

For spinlocks, I suspect turning a spinlock into a sleeping lock after
a reasonable number of spins would work well too.
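
As a sketch of the spin-then-sleep pattern for acknowledged IPIs, here
is a user-level analogue in Python using threads. It is not Xen or guest
kernel code; send_ipi_and_wait, the deliver callback and spin_limit are
hypothetical names for the example, and a threading.Semaphore stands in
for the sleeping semaphore.

```python
import threading

def send_ipi_and_wait(targets, deliver, spin_limit=1000):
    """Notify all targets first, spin briefly for ACKs, then sleep for the rest.

    Returns True if the caller had to sleep, False if spinning was enough.
    """
    acks = threading.Semaphore(0)
    for target in targets:
        deliver(target, acks.release)  # target calls the ack when done
    outstanding = len(targets)
    spins = 0
    while outstanding and spins < spin_limit:
        if acks.acquire(blocking=False):
            outstanding -= 1           # an already-running target responded
        else:
            spins += 1
    slept = outstanding > 0
    for _ in range(outstanding):
        acks.acquire()                 # block instead of burning the time slice
    return slept

def immediate(target, ack):
    ack()                              # models a target that is running now

def delayed(target, ack):
    threading.Timer(0.05, ack).start() # models a target that is scheduled out

print(send_ipi_and_wait([1, 2], immediate))
print(send_ipi_and_wait([1, 2], delayed, spin_limit=10))
```

With running targets the initiator never sleeps; with scheduled-out
targets it gives up the CPU after a bounded number of spins.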

In the long run, it would probably be beneficial to remove most uses
of the generic yield mechanism.

Emmanuel.

* Re: credit scheduler and HYPERVISOR_yield()
  2007-10-15 17:13             ` Emmanuel Ackaouy
@ 2007-10-15 17:45               ` Keir Fraser
  0 siblings, 0 replies; 17+ messages in thread
From: Keir Fraser @ 2007-10-15 17:45 UTC (permalink / raw)
  To: Emmanuel Ackaouy, George Dunlap; +Cc: xen-devel, John Levon

On 15/10/07 18:13, "Emmanuel Ackaouy" <ackaouy@gmail.com> wrote:

> In the case of IPIs, you're already going into the hypervisor so you
> should be able to do something straightforward with a sleeping
> semaphore. Maybe you spin a little before you sleep though to give
> running VCPUs a chance to respond before you give up the end of
> your time slice.

Actually a blocking spinlock could be implemented with no Xen changes.
Change the spinlock function in Linux to spin a few times and then set a
waiting bit in a cpumask and then SCHEDOP_poll on a per-VCPU spinlock-wakeup
event channel. On spin_unlock, the unlocker sends an event to any VCPU in
the cpumask, using each VCPU's spinlock-wakeup event channel.
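
A user-level Python analogue of that scheme, as a sketch only: threads
stand in for VCPUs, a threading.Event per waiter stands in for the
per-VCPU spinlock-wakeup event channel, and Event.wait() stands in for
SCHEDOP_poll. The class BlockingSpinlock and its fields are hypothetical
names; this is not the Linux or Xen implementation.

```python
import threading

class BlockingSpinlock:
    def __init__(self, spin_limit=100):
        self._locked = threading.Lock()   # stands in for the atomic lock word
        self._waiters = {}                # thread id -> wakeup Event ("cpumask")
        self._guard = threading.Lock()    # protects the waiter table
        self.spin_limit = spin_limit

    def lock(self):
        while True:
            for _ in range(self.spin_limit):      # spin a few times first
                if self._locked.acquire(blocking=False):
                    return
            wake = threading.Event()
            me = threading.get_ident()
            with self._guard:
                self._waiters[me] = wake          # set our waiting bit
            # Re-check after registering so an unlock cannot be missed.
            got_it = self._locked.acquire(blocking=False)
            if not got_it:
                wake.wait()                       # analogue of SCHEDOP_poll
            with self._guard:
                self._waiters.pop(me, None)
            if got_it:
                return

    def unlock(self):
        self._locked.release()
        with self._guard:
            waiters = list(self._waiters.values())
        for wake in waiters:
            wake.set()                            # "send an event" to waiters

def run_demo(workers=4, rounds=500):
    """Increment a shared counter under the lock from several threads."""
    lock = BlockingSpinlock(spin_limit=50)
    state = {"count": 0}

    def worker():
        for _ in range(rounds):
            lock.lock()
            state["count"] += 1
            lock.unlock()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]

print(run_demo())
```

The waiter registers itself before re-checking the lock, so an unlock
that happens in between either leaves the lock free for the re-check or
finds the waiter in the table and wakes it.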

Yes, this is probably nicer than using yield() and praying.

 -- Keir
