From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:45273)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1TLajt-00077U-Uk for qemu-devel@nongnu.org;
	Tue, 09 Oct 2012 10:24:59 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1TLajo-0005rI-Vt for qemu-devel@nongnu.org;
	Tue, 09 Oct 2012 10:24:53 -0400
Received: from mx1.redhat.com ([209.132.183.28]:3121)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1TLajo-0005r4-MY for qemu-devel@nongnu.org;
	Tue, 09 Oct 2012 10:24:48 -0400
Message-ID: <50743390.3@redhat.com>
Date: Tue, 09 Oct 2012 16:24:16 +0200
From: Avi Kivity
MIME-Version: 1.0
References: <1348577763-12920-1-git-send-email-pbonzini@redhat.com>
	<20121008113932.GB16332@stefanha-thinkpad.redhat.com>
	<5072CE54.8020208@redhat.com>
	<20121009090811.GB13775@stefanha-thinkpad.redhat.com>
	<5073EDB3.3020804@redhat.com> <5073FE3A.1090903@redhat.com>
	<507401D8.8090203@redhat.com> <507405B5.4060108@redhat.com>
	<507410BD.6050901@redhat.com> <50741218.90000@redhat.com>
	<5074171A.2030904@redhat.com> <5074226A.3030907@redhat.com>
	<507424E5.4060705@redhat.com> <50742B97.2060608@redhat.com>
In-Reply-To: <50742B97.2060608@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re:
	[RFC PATCH 00/17] Support for multiple "AIO contexts""
To: Paolo Bonzini
Cc: Kevin Wolf, Anthony Liguori, Ping Fan Liu, Stefan Hajnoczi,
	qemu-devel@nongnu.org, Jan Kiszka

On 10/09/2012 03:50 PM, Paolo Bonzini wrote:
> On 09/10/2012 15:21, Avi Kivity wrote:
>> On 10/09/2012 03:11 PM, Paolo Bonzini wrote:
>>>> But no, it's actually impossible.  Hotplug may be triggered from a
>>>> vcpu thread, which clearly can't be stopped.
>>>
>>> Hotplug should always be asynchronous (because that's how hardware
>>> works), so it should always be possible to delegate the actual work
>>> to a non-VCPU thread.  Or not?
>>
>> The actual device deletion can happen from a different thread, as
>> long as you isolate the device before.  That's part of the garbage
>> collector idea.
>>
>> vcpu thread:
>>   rcu_read_lock
>>   lookup
>>   dispatch
>>     mmio handler
>>       isolate
>>       queue(delete_work)
>>   rcu_read_unlock
>>
>> worker thread:
>>   process queue
>>     delete_work
>>       synchronize_rcu() / stop_machine()
>>       acquire qemu lock
>>       delete object
>>       drop qemu lock
>>
>> Compared to the garbage collector idea, this drops fine-grained
>> locking for the qdev tree, a significant advantage.  But it still
>> suffers from dispatching inside the rcu critical section, which is
>> something we want to avoid.
>
> But we are not Linux, and I think the tradeoffs are different for RCU
> in Linux vs. QEMU.
>
> For CPUs in the kernel, running user code is just one way to get
> things done; QEMU threads are much more event driven, and their whole
> purpose is to either run the guest or sleep, until "something happens"
> (VCPU exit or readable fd).  In other words, QEMU threads should be
> able to stay most of the time in KVM_RUN or select() for any workload
> (to some approximation).

If you're streaming data (the saturated iothread from that other
thread), live migrating, or running a block job on fast storage, this
isn't necessarily true.  You could, however, make sure each thread
polls the rcu state periodically.

> Not just that: we do not need to minimize RCU critical sections,
> because anyway we want to minimize the time spent in QEMU, period.
>
> So I believe that to some approximation, in QEMU we can completely
> ignore everything else, and behave as if threads were always under
> rcu_read_lock(), except if in KVM_RUN/select.
> KVM_RUN and select are
> what Paul McKenney calls extended quiescent states, and in fact the
> following mapping works:
>
>   rcu_extended_quiesce_start() -> rcu_read_unlock();
>   rcu_extended_quiesce_end()   -> rcu_read_lock();
>   rcu_read_lock/unlock()       -> nop
>
> This in turn means that dispatching inside the RCU critical section is
> not really bad.

I believe you still cannot synchronize_rcu() while in an rcu critical
section, per the rcu documentation, even when lock/unlock map to nops.
Of course we could violate that and it wouldn't know a thing, but I
prefer to stick to the established pattern.

-- 
error compiling committee.c: too many arguments to function