From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from [140.186.70.92] (port=54648 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Pa8IE-0004Tk-3h
	for qemu-devel@nongnu.org; Tue, 04 Jan 2011 09:55:26 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1Pa8IB-0000ZU-Vz
	for qemu-devel@nongnu.org; Tue, 04 Jan 2011 09:55:21 -0500
Received: from mail-yx0-f173.google.com ([209.85.213.173]:44795)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1Pa8IB-0000Yo-MO
	for qemu-devel@nongnu.org; Tue, 04 Jan 2011 09:55:19 -0500
Received: by yxl31 with SMTP id 31so6881741yxl.4
	for <qemu-devel@nongnu.org>; Tue, 04 Jan 2011 06:55:18 -0800 (PST)
Message-ID: <4D2334D4.2020104@codemonkey.ws>
Date: Tue, 04 Jan 2011 08:55:16 -0600
From: Anthony Liguori <anthony@codemonkey.ws>
MIME-Version: 1.0
References: <4D219AF5.2030204@web.de> <4D219E6D.8060902@redhat.com>
	<4D232BF6.6050102@codemonkey.ws> <4D232E4E.5030600@redhat.com>
In-Reply-To: <4D232E4E.5030600@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: Role of qemu_fair_mutex
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>, Jan Kiszka <jan.kiszka@web.de>, qemu-devel <qemu-devel@nongnu.org>, kvm <kvm@vger.kernel.org>

On 01/04/2011 08:27 AM, Avi Kivity wrote:
> On 01/04/2011 04:17 PM, Anthony Liguori wrote:
>> On 01/03/2011 04:01 AM, Avi Kivity wrote:
>>> On 01/03/2011 11:46 AM, Jan Kiszka wrote:
>>>> Hi,
>>>>
>>>> at least in kvm mode, the qemu_fair_mutex seems to have lost its
>>>> function of balancing qemu_global_mutex access between the io-thread
>>>> and vcpus. It's now only taken by the latter, isn't it?
>>>>
>>>> This and the fact that qemu-kvm does not use this kind of lock made me
>>>> wonder what its role is and if it is still relevant in practice. I'd
>>>> like to unify the execution models of qemu-kvm and qemu, and this lock
>>>> is the most obvious difference (there are surely more subtle ones as
>>>> well...).
>>>>
>>>
>>> IIRC it was used for tcg, which has a problem that kvm doesn't have: 
>>> a tcg vcpu needs to hold qemu_mutex when it runs, which means there 
>>> will always be contention on qemu_mutex.  In the absence of 
>>> fairness, the tcg thread could dominate qemu_mutex and starve the 
>>> iothread.
>>
>> No, it's actually the opposite IIRC.
>>
>> TCG relies on the following behavior.   A guest VCPU runs until 1) it 
>> encounters a HLT instruction 2) an event occurs that forces the TCG 
>> execution to break.
>>
>> (2) really means that the TCG thread receives a signal.  Usually, 
>> this is the periodic timer signal.
>
> What about a completion?  An I/O completes, the I/O thread wakes up, 
> needs to acquire the global lock (and force tcg off it), inject an 
> interrupt, and go back to sleep.

An I/O completion causes an fd to become readable.  That makes select() 
return, and the io thread then attempts to acquire the qemu_mutex.  
Under TCG, acquiring the mutex also means the io thread sends a SIG_IPI 
to the TCG VCPU thread.


>>
>> When the TCG thread breaks, it needs to let the IO thread run for at least 
>> one iteration.  Coordinating the execution of the IO thread such that 
>> it's guaranteed to run at least once and then having it drop the qemu 
>> mutex long enough for the TCG thread to acquire it is the purpose of 
>> the qemu_fair_mutex.
>
> That doesn't compute - the iothread doesn't hog the global lock (it 
> sleeps most of the time, and drops the lock while sleeping), so the 
> iothread cannot starve out tcg.

The fact that the iothread drops the global lock during sleep is a 
detail that shouldn't affect correctness.  The IO thread is absolutely 
allowed to run for arbitrary periods of time without dropping the qemu 
mutex.

>   On the other hand, tcg does hog the global lock, so it needs to be 
> made to give it up so the iothread can run, for example my completion 
> example.

It's very easy to ask TCG to give up the qemu_mutex by using 
cpu_interrupt().  It will drop the qemu_mutex and it will not attempt to 
acquire it again until you restart the VCPU.

> I think the abstraction we need here is a priority lock, with higher 
> priority given to the iothread.  A lock() operation that takes 
> precedence would atomically signal the current owner to drop the lock.

The I/O thread can reliably acquire the lock whenever it needs to.

If you drop all of the qemu_fair_mutex stuff and leave the qemu_mutex 
getting dropped around select, TCG will generally work.  But this is 
not race free: just dropping a lock does not guarantee that the thread 
waiting for it is the one that acquires it next, so it is not a 
reliable hand off.

I think a generation counter could work, and a condition variable could 
work too.
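
A minimal sketch of one way a generation counter and a condition 
variable could be combined, assuming plain pthreads.  None of this is 
QEMU code: struct handoff_lock and all function names are hypothetical, 
and sched_yield() merely stands in for the io thread sleeping in 
select() without the lock.  It covers only the "io thread is guaranteed 
to run at least once" half of the problem:

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

struct handoff_lock {
    pthread_mutex_t mutex;      /* stands in for qemu_mutex */
    pthread_cond_t cond;        /* signalled when generation advances */
    unsigned long generation;   /* completed io thread iterations */
    int stop;                   /* demo only: ask the io thread to exit */
};

static void handoff_init(struct handoff_lock *l)
{
    pthread_mutex_init(&l->mutex, NULL);
    pthread_cond_init(&l->cond, NULL);
    l->generation = 0;
    l->stop = 0;
}

/* io thread: one iteration under the lock.  Returns 0 once stopped. */
static int io_iteration(struct handoff_lock *l)
{
    int keep_going;

    pthread_mutex_lock(&l->mutex);
    keep_going = !l->stop;
    if (keep_going) {
        /* ... fd handlers and timers would be dispatched here ... */
        l->generation++;
        pthread_cond_broadcast(&l->cond);
    }
    pthread_mutex_unlock(&l->mutex);
    return keep_going;
}

/*
 * VCPU thread, called with the mutex held.  Releases the mutex while
 * waiting and returns with it held again, after the io thread has
 * completed at least one iteration.  Returns how many were observed.
 */
static unsigned long vcpu_yield_to_io(struct handoff_lock *l)
{
    unsigned long seen = l->generation;

    while (l->generation == seen) {
        pthread_cond_wait(&l->cond, &l->mutex);
    }
    assert(l->generation != seen);
    return l->generation - seen;
}

static void *io_thread_fn(void *opaque)
{
    struct handoff_lock *l = opaque;

    while (io_iteration(l)) {
        sched_yield();
    }
    return NULL;
}

/* Returns the number of rounds in which the io thread was seen to run. */
static int run_handoff_demo(int rounds)
{
    struct handoff_lock l;
    pthread_t io;
    int ok = 0;

    handoff_init(&l);
    if (pthread_create(&io, NULL, io_thread_fn, &l) != 0) {
        return -1;
    }
    for (int i = 0; i < rounds; i++) {
        pthread_mutex_lock(&l.mutex);
        if (vcpu_yield_to_io(&l) >= 1) {
            ok++;
        }
        pthread_mutex_unlock(&l.mutex);
    }
    pthread_mutex_lock(&l.mutex);
    l.stop = 1;
    pthread_mutex_unlock(&l.mutex);
    pthread_join(io, NULL);
    pthread_cond_destroy(&l.cond);
    pthread_mutex_destroy(&l.mutex);
    return ok;
}
```

Because the waiter checks the counter rather than just dropping and 
retaking the lock, it cannot win the race and continue before the io 
thread has had its turn.  The opposite direction (making sure the VCPU 
thread gets the lock back promptly) is not addressed here.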

Regards,

Anthony Liguori


> Under kvm we'd run a normal mutex, so it wouldn't need to take the 
> extra mutex.
>