From: Avi Kivity <avi@redhat.com>
To: Mike Galbraith <efault@gmx.de>
Cc: Rik van Riel <riel@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Chris Wright <chrisw@sous-sol.org>
Subject: Re: [RFC -v2 PATCH 2/3] sched: add yield_to function
Date: Sun, 19 Dec 2010 11:19:12 +0200 [thread overview]
Message-ID: <4D0DCE10.7000200@redhat.com> (raw)
In-Reply-To: <1292753156.16367.104.camel@marge.simson.net>
On 12/19/2010 12:05 PM, Mike Galbraith wrote:
> On Sun, 2010-12-19 at 08:21 +0200, Avi Kivity wrote:
> > On 12/18/2010 09:06 PM, Mike Galbraith wrote:
>
> > > Hm, so it needs to be very cheap, and highly repeatable.
> > >
> > > What if: so you're trying to get spinners out of the way right? You
> > > somehow know they're spinning, so instead of trying to boost some task,
> > > can you do a directed yield in terms of directing a spinner that you
> > > have the right to diddle to yield. Drop his lag, and resched him. He's
> > > not accomplishing anything anyway.
> >
> > There are a couple of problems with this approach:
> >
> > - current yield() is a no-op
>
> That's why you'd drop lag: set vruntime to max(se->vruntime, cfs_rq->min_vruntime).
Internal scheduler terminology again, don't follow.
> > - even if it weren't, the process (containing the spinner and the
> > lock-holder) would yield as a whole.
>
> I don't get this part. How does the whole process yield if one thread
> yields?
The process is the sum of its threads. If a thread loses 1 msec of
runtime by yielding, the process as a whole loses that 1 msec.
If the lock is held for, say, 100 usec, it would be better for the
process to spin rather than yield.
With directed yield the process loses nothing by yielding to one of its
threads.
> > If it yielded for exactly the time
> > needed (until the lock holder releases the lock), it wouldn't matter,
> > since the spinner isn't accomplishing anything, but we don't know what
> > the exact time is. So we want to preserve our entitlement.
>
> And that's the hard part. If you can drop lag, you may hurt yourself, but
> at least only yourself.
We already have a "hurt only yourself" thing. We sleep for 100 usec
when we detect spinning. It's awful.
> > With a pure yield implementation the process would get less than its
> > fair share, even discounting spin time, which we'd be happy to donate to
> > the rest of the system.
We aren't happy to donate it to the rest of the system, since it will
cause a guest with lots of internal contention to make very little
forward progress.
> >
> > > If the only thing running is virtualization, and nobody else can use the
> > > interface being invented, all is fair, but this passing of vruntime
> > > around is problematic when innocent bystanders may want to play too.
> >
> > We definitely want to maintain fairness. Both with a dedicated virt
> > host and with a mixed workload.
>
> That makes it difficult to the point of impossible.
>
> You want a specific task to run NOW for good reasons, but any number of
> tasks may want the same godlike power for equally good reasons.
I don't want it to run now. I want it to run before some other task. I
don't care if N other tasks run before both. So no godlike powers
needed, simply a courteous "after you".
> You could create a force select which only godly tasks could use that
> didn't try to play games with vruntimes, just let the bugger run, and
> let him also eat the latency hit he'll pay for that extra bit of cpu IFF
> you didn't care about being able to mix loads.
>
> Or, you could just bump his nice level with an automated return to
> previous level on resched.
>
> Any intervention has unavoidable consequences for all comers though.
Since task A is running now, clearly the scheduler thinks it deserves to
run. What I want to do is take just enough of the "deserves" part to
make it not run any more, and move it to task B.
> > >
> > > Yep, so much for accounting.
> >
> > What's the problem exactly? What's the difference, system-wide, with
> > the donor continuing to run for that same entitlement? Other tasks see
> > the same thing.
>
> SOME tasks receive gifts from the void. The difference is the bias.
Isn't fork() a gift from the void?
> > > > > Where did the entitlement come from if task A running alone on cpu A
> > > > > tosses some entitlement over the fence to his pal task B on cpu B.. and
> > > > > keeps on trucking on cpu A? Where does that leave task C, B's
> > > > > competition?
> > > >
> > > > Eventually C would replace A, since its share will be exhausted. If C
> > > > is pinned... good question. How does fairness work with pinned tasks?
> > >
> > > In the case I described, C had its pocket picked by A.
> >
> > Would that happen if global fairness was maintained?
>
> What's that? :)
If you run three tasks on a two cpu box, each gets 2/3 of a cpu.
> No task may run until there are enough of you to fill
> the box?
Why is that a consequence of global fairness? Three tasks each get 100%
of a cpu on a 4-cpu box, and the fourth cpu idles. Is that not fair for
some reason?
> God help you when somebody else wakes up Mr. Early-bird? ...
What?
> >
> > I guess random perturbations cause task migrations periodically and
> > things balance out. But it seems weird to have this devotion to
> > fairness on a single cpu and completely ignore fairness on a macro level.
>
> It doesn't ignore it completely, it just doesn't try to do all the math
> continuously (danger Will Robinson: Peter has scary patches). Prodding
> it in the right general direction with migrations is cheaper.
Doesn't seem to work from my brief experiment.
--
error compiling committee.c: too many arguments to function