From: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
To: Peter Zijlstra
Cc: kvm@vger.kernel.org, Mike Galbraith, qemu-devel@nongnu.org, Chris Wright, Anthony Liguori, Avi Kivity
Date: Wed, 1 Dec 2010 23:30:40 +0530
Subject: [Qemu-devel] Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)
Message-ID: <20101201180040.GH8073@linux.vnet.ibm.com>
In-Reply-To: <1291225502.32004.1787.camel@laptop>

On Wed, Dec 01, 2010 at 06:45:02PM +0100, Peter Zijlstra wrote:
> On Wed, 2010-12-01 at 22:59 +0530, Srivatsa Vaddagiri wrote:
> >
> > yield_task_fair(...)
> > {
> >
> > +	ideal_runtime = sched_slice(cfs_rq, curr);
> > +	delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
> > +	rem_time_slice = ideal_runtime - delta_exec;
> > +
> > +	current->donate_time += rem_time_slice > some_threshold ?
> > +				some_threshold : rem_time_slice;
> >
> > 	...
> > }
> >
> > sched_slice(...)
> > {
> > 	slice = ...
> >
> > +	slice += current->donate_time;
> >
> > }
> >
> > or something close to it. I am a bit reluctant to go that route myself,
> > unless the fairness issue with plain yield is quite bad.
>
> That really won't do anything. You need to adjust both tasks' vruntime.

We are dealing with just one task here (the task that is yielding). After
recording how much timeslice we are "giving up" in current->donate_time
(donate_time is perhaps not the right name to use), we adjust the yielding
task's vruntime as per the existing logic (for example, to send it to the
back of the runqueue). When the yielding task gets to run again, the lock is
hopefully available for it to grab, and we let it run longer than the
default sched_slice() to compensate for the time it gave up earlier to other
threads in the same runqueue.
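To be a bit more concrete about the consumption side (the pseudocode above
only ever adds to donate_time): the tick path would let the task overrun its
normal slice by the banked amount and then forget the debt. A rough sketch
only, not compile-tested, with donate_time still just a placeholder name:

	/* check_preempt_tick(): sched_slice() above already folds in the
	 * banked donate_time, so the task is allowed to run past its
	 * normal slice; once that stretched slice has been used up, clear
	 * the banked time so it is not counted twice, and reschedule. */
	ideal_runtime = sched_slice(cfs_rq, curr);
	delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
	if (delta_exec > ideal_runtime) {
		current->donate_time = 0;
		resched_task(rq_of(cfs_rq)->curr);
	}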
This compensation ensures that yielding upon lock contention does not leak
bandwidth in favor of other guests. Again, I don't know how much of a
fairness issue this is in practice, so unless we see some numbers I'd prefer
sticking to a plain yield() upon lock contention (for unmodified guests,
that is).

> Also, I really wouldn't touch the yield() implementation, nor
> would I expose any such time donation crap to userspace.

- vatsa