From mboxrd@z Thu Jan  1 00:00:00 1970
From: Peter Zijlstra
Subject: Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)
Date: Wed, 01 Dec 2010 20:09:42 +0100
Message-ID: <1291230582.32004.1927.camel@laptop>
References: <1290530963-3448-1-git-send-email-aliguori@us.ibm.com>
	<4CECCA39.4060702@redhat.com> <4CED1A23.9030607@linux.vnet.ibm.com>
	<4CED1FD3.1000801@redhat.com> <20101201123742.GA3780@linux.vnet.ibm.com>
	<4CF6460C.5070604@redhat.com> <20101201161221.GA8073@linux.vnet.ibm.com>
	<1291220718.32004.1696.camel@laptop> <20101201172953.GF8073@linux.vnet.ibm.com>
	<1291225502.32004.1787.camel@laptop> <20101201180040.GH8073@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8BIT
Cc: Avi Kivity , Anthony Liguori , qemu-devel@nongnu.org, kvm@vger.kernel.org,
	Chris Wright , Ingo Molnar , Mike Galbraith
To: vatsa@linux.vnet.ibm.com
Return-path:
Received: from casper.infradead.org ([85.118.1.10]:55802 "EHLO casper.infradead.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753173Ab0LATJZ
	convert rfc822-to-8bit (ORCPT ); Wed, 1 Dec 2010 14:09:25 -0500
In-Reply-To: <20101201180040.GH8073@linux.vnet.ibm.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Wed, 2010-12-01 at 23:30 +0530, Srivatsa Vaddagiri wrote:
> On Wed, Dec 01, 2010 at 06:45:02PM +0100, Peter Zijlstra wrote:
> > On Wed, 2010-12-01 at 22:59 +0530, Srivatsa Vaddagiri wrote:
> > >
> > > yield_task_fair(...)
> > > {
> > >
> > > +	ideal_runtime = sched_slice(cfs_rq, curr);
> > > +	delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
> > > +	rem_time_slice = ideal_runtime - delta_exec;
> > > +
> > > +	current->donate_time += rem_time_slice > some_threshold ?
> > > +				some_threshold : rem_time_slice;
> > >
> > >	...
> > > }
> > >
> > >
> > > sched_slice(...)
> > > {
> > >	slice = ...
> > >
> > > +	slice += current->donate_time;
> > >
> > > }
> > >
> > > or something close to it.
> > > I am a bit reluctant to go that route myself, unless the
> > > fairness issue with plain yield is quite bad.
> >
> > That really won't do anything. You need to adjust both tasks'
> > vruntime.
>
> We are dealing with just one task here (the task that is yielding).
> After recording how much timeslice we are "giving up" in
> current->donate_time (donate_time is perhaps not the right name to
> use), we adjust the yielding task's vruntime as per the existing
> logic (for example, to make it go to the back of the runqueue). When
> the yielding task gets to run again, the lock is hopefully available
> for it to grab, and we let it run longer than the default
> sched_slice() to compensate for the time it gave up previously to
> other threads on the same runqueue. This ensures that, by yielding
> upon lock contention, we are not leaking bandwidth in favor of other
> guests. Again, I don't know how much of a fairness issue this is in
> practice, so unless we see some numbers I'd prefer sticking to a
> plain yield() upon lock contention (for unmodified guests, that is).

No, that won't work. Once you've given up time you cannot add it back
without destroying fairness. You can limit the unfairness by limiting
the amount of feedback, but I really dislike such 'yield' semantics.