Date: Thu, 2 Dec 2010 17:17:00 +0530
From: Srivatsa Vaddagiri
To: Avi Kivity
Cc: Peter Zijlstra, kvm@vger.kernel.org, Mike Galbraith, qemu-devel@nongnu.org,
 Chris Wright, Anthony Liguori
Subject: [Qemu-devel] Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)
Message-ID: <20101202114700.GA18445@linux.vnet.ibm.com>
In-Reply-To: <4CF76440.30500@redhat.com>
References: <4CED1FD3.1000801@redhat.com> <20101201123742.GA3780@linux.vnet.ibm.com>
 <4CF6460C.5070604@redhat.com> <20101201161221.GA8073@linux.vnet.ibm.com>
 <1291220718.32004.1696.camel@laptop> <20101201172953.GF8073@linux.vnet.ibm.com>
 <1291225502.32004.1787.camel@laptop> <20101201180040.GH8073@linux.vnet.ibm.com>
 <1291230582.32004.1927.camel@laptop> <4CF76440.30500@redhat.com>
Reply-To: vatsa@linux.vnet.ibm.com
List-Id: qemu-devel.nongnu.org

On Thu, Dec 02, 2010 at 11:17:52AM +0200, Avi Kivity wrote:
> On 12/01/2010 09:09 PM, Peter Zijlstra wrote:
> >>
> >> We are dealing with just one task here (the task that is yielding).
> >> After recording how much timeslice we are "giving up" in current->donate_time
> >> (donate_time is perhaps not the right name to use), we adjust the yielding
> >> task's vruntime as per existing logic (for ex: to make it go to back of
> >> runqueue). When the yielding tasks gets to run again, lock is hopefully
> >> available for it to grab, we let it run longer than the default sched_slice()
> >> to compensate for what time it gave up previously to other threads in same
> >> runqueue. This ensures that because of yielding upon lock contention, we are not
> >> leaking bandwidth in favor of other guests. Again I don't know how much of
> >> fairness issue this is in practice, so unless we see some numbers I'd prefer
> >> sticking to plain yield() upon lock-contention (for unmodified guests that is).
> >
> >No, that won't work. Once you've given up time you cannot add it back
> >without destroying fairness.

Over shorter intervals perhaps. Over a longer interval (a few seconds to a
couple of minutes), fairness should not be affected by this feedback? In any
case, don't we have similar issues with directed yield as well?
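
To make the bookkeeping concrete, here is a rough userspace toy of the accounting
I was describing above -- not the actual patch; the names (struct toy_task,
TOY_SCHED_SLICE_NS) and the numbers are placeholders for illustration only:

#include <stdio.h>

#define TOY_SCHED_SLICE_NS 4000000ULL   /* placeholder default slice: 4ms */

struct toy_task {
    const char *name;
    unsigned long long vruntime;     /* virtual runtime, in ns */
    unsigned long long donate_time;  /* timeslice given up at yield, in ns */
};

/* Yield on lock contention: remember how much of the slice we gave up
 * and move behind everyone else in the (toy) runqueue. */
static void toy_yield_on_contention(struct toy_task *t,
                                    unsigned long long ran_ns,
                                    unsigned long long max_vruntime_in_rq)
{
    t->donate_time += TOY_SCHED_SLICE_NS - ran_ns;
    t->vruntime = max_vruntime_in_rq + 1;
}

/* Next time the task runs, let it run longer than the default slice to
 * win back what it gave up earlier. */
static unsigned long long toy_next_slice(struct toy_task *t)
{
    unsigned long long slice = TOY_SCHED_SLICE_NS + t->donate_time;

    t->donate_time = 0;
    return slice;
}

int main(void)
{
    struct toy_task vcpu = { "vcpu0", 0, 0 };

    /* vcpu0 hits a contended lock after running only 1ms of its slice */
    toy_yield_on_contention(&vcpu, 1000000ULL, 8000000ULL);

    printf("%s: next slice = %llu ns (default %llu ns)\n",
           vcpu.name, toy_next_slice(&vcpu), TOY_SCHED_SLICE_NS);
    return 0;
}

This only illustrates the accounting; it says nothing about how much of such
feedback is safe to allow, which is the point of contention above.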
> >You can limit the unfairness by limiting the amount of feedback, but I
> >really dislike such 'yield' semantics.
>
> Agreed.
>
> What I'd like to see in directed yield is donating exactly the
> amount of vruntime that's needed to make the target thread run.

I presume this requires the target vcpu to move left in the rb-tree so that it
runs earlier than currently scheduled, and that it doesn't involve any change
to the sched_period() of the target vcpu?

I was just wondering how this would work in the case of buggy guests. Let's say
a guest ran into an AB<->BA deadlock: VCPU0 spins on lock B (held by VCPU1
currently), while VCPU1 spins on lock A (held by VCPU0 currently). Both keep
boosting each other's vruntime, potentially affecting fairness for other guests
(to the point of starving them, perhaps)?

- vatsa

> The donating thread won't get its vruntime back, unless the other thread
> hits contention itself and does a directed yield back. So even if
> your lock is ping-ponged around, the guest doesn't lose vruntime
> compared to other processes on the host.
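
For my own understanding, here is a toy model of the vruntime donation as I read
the above -- again not kernel code, and the numbers are made up. The property it
tries to capture is that whatever the target gains, the donor pays for, so the
combined vruntime of the two vcpus does not change relative to other tasks on
the host:

#include <stdio.h>

struct toy_vcpu {
    const char *name;
    long long vruntime;   /* in ns; smaller means it runs earlier */
};

/* Donate just enough vruntime to let @target run before the current
 * leftmost task (@min_vruntime), charging the donor for the difference. */
static void toy_directed_yield(struct toy_vcpu *donor, struct toy_vcpu *target,
                               long long min_vruntime)
{
    long long needed = target->vruntime - min_vruntime;

    if (needed <= 0)
        return;                  /* target would run next anyway */

    target->vruntime -= needed;  /* moves left in the rb-tree */
    donor->vruntime  += needed;  /* donor does not get this back */
}

int main(void)
{
    struct toy_vcpu vcpu0 = { "vcpu0", 100 };
    struct toy_vcpu vcpu1 = { "vcpu1", 400 };
    long long min_vruntime = 100;

    /* vcpu0 spins on a lock held by vcpu1: boost vcpu1 at vcpu0's expense */
    toy_directed_yield(&vcpu0, &vcpu1, min_vruntime);

    printf("%s=%lld %s=%lld sum=%lld\n",
           vcpu0.name, vcpu0.vruntime, vcpu1.name, vcpu1.vruntime,
           vcpu0.vruntime + vcpu1.vruntime);   /* sum stays at 500 */
    return 0;
}

Compiling and running this with a plain C compiler shows the sum unchanged by
the donation; whether repeated ping-ponging between two deadlocked vcpus causes
other problems is exactly the question above.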