Message-ID: <507405B5.4060108@redhat.com>
Date: Tue, 09 Oct 2012 13:08:37 +0200
From: Paolo Bonzini
MIME-Version: 1.0
References: <1348577763-12920-1-git-send-email-pbonzini@redhat.com>
 <20121008113932.GB16332@stefanha-thinkpad.redhat.com>
 <5072CE54.8020208@redhat.com>
 <20121009090811.GB13775@stefanha-thinkpad.redhat.com>
 <5073EDB3.3020804@redhat.com>
 <5073FE3A.1090903@redhat.com>
 <507401D8.8090203@redhat.com>
In-Reply-To: <507401D8.8090203@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts""
To: Avi Kivity
Cc: Kevin Wolf, Stefan Hajnoczi, Anthony Liguori, Ping Fan Liu, qemu-devel@nongnu.org

On 09/10/2012 12:52, Avi Kivity wrote:
> On 10/09/2012 12:36 PM, Paolo Bonzini wrote:
>> On 09/10/2012 11:26, Avi Kivity wrote:
>>> On 10/09/2012 11:08 AM, Stefan Hajnoczi wrote:
>>>> Here are the steps that have been mentioned:
>>>>
>>>> 1. aio fastpath - for raw-posix and other aio block drivers, can we reduce I/O
>>>> request latency by skipping block layer coroutines?
>>>
>>> Is coroutine overhead noticeable?
>>
>> I'm thinking more about throughput than latency.  If the iothread
>> becomes CPU-bound, then everything is noticeable.
>
> That's not strictly a coroutine issue.  Switching to ordinary threads
> may make the problem worse, since there will clearly be contention.

The point is that you don't need either coroutines or userspace threads
if you use native AIO.  longjmp/setjmp is probably a smaller overhead
than the many syscalls involved in poll + eventfd reads + io_submit +
io_getevents, but it's not cheap either.

Also, if you process AIO in batches you risk overflowing the pool of
free coroutines, which gets expensive real fast (allocate/free the
stack, do the expensive getcontext/swapcontext instead of the cheaper
longjmp/setjmp, etc.).  It seems better to sidestep the issue
completely; it's a small amount of work.

> What is the I/O processing time we have?  If it's say 10 microseconds,
> then we'll have 100,000 context switches per second assuming a device
> lock and a saturated iothread (split into multiple threads).

Hopefully, with a saturated dedicated iothread you would not have any
context switches at all, and a single CPU would simply be dedicated to
virtio processing.

> The coroutine work may have laid the groundwork for fine-grained
> locking.  I'm doubtful we should use qcow when we want >100K IOPS though.

Yep.  Going away from coroutines is a solution in search of a problem;
it would introduce several new variables (kernel scheduling, more
expensive lock contention, starving the thread pool with locked
threads, ...), all for a case where performance hardly matters.
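
To put the comparison in concrete terms, the poll + eventfd read +
io_submit + io_getevents chain looks roughly like the following with
plain Linux AIO.  This is a minimal, untested sketch (link with -laio);
the file name, sizes and queue depth are made up and error handling is
omitted, but the libaio calls themselves are the standard ones.

    /* Native AIO sketch: submit a read, wait on the eventfd, reap the
     * completion directly -- no coroutine or userspace thread involved. */
    #define _GNU_SOURCE
    #include <libaio.h>
    #include <sys/eventfd.h>
    #include <fcntl.h>
    #include <poll.h>
    #include <unistd.h>
    #include <stdint.h>
    #include <stdlib.h>

    #define MAX_EVENTS 128

    int main(void)
    {
        io_context_t ctx = 0;
        int efd = eventfd(0, EFD_CLOEXEC);
        int fd = open("disk.img", O_RDONLY | O_DIRECT);  /* made-up image */
        struct iocb iocb;
        struct iocb *iocbs[1] = { &iocb };
        struct io_event events[MAX_EVENTS];
        struct pollfd pfd;
        uint64_t n;
        void *buf;
        int done;

        io_setup(MAX_EVENTS, &ctx);
        posix_memalign(&buf, 512, 4096);

        /* Submit one read; completion is signalled through the eventfd. */
        io_prep_pread(&iocb, fd, buf, 4096, 0);
        io_set_eventfd(&iocb, efd);
        io_submit(ctx, 1, iocbs);

        /* Event-loop side: poll the eventfd, drain it, reap completions. */
        pfd.fd = efd;
        pfd.events = POLLIN;
        poll(&pfd, 1, -1);
        read(efd, &n, sizeof(n));
        done = io_getevents(ctx, 1, MAX_EVENTS, events, NULL);
        /* events[0..done-1] can be completed right here. */

        io_destroy(ctx);
        close(efd);
        close(fd);
        free(buf);
        return done < 0;
    }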

>>>> I'm also curious about virtqueue_pop()/virtqueue_push() outside the QEMU mutex
>>>> although that might be blocked by the current work around MMIO/PIO dispatch
>>>> outside the global mutex.
>>>
>>> It is, yes.
>>
>> It should only require unlocked memory map/unmap, not MMIO dispatch.
>> The MMIO/PIO bits are taken care of by ioeventfd.
>
> The ring, or indirect descriptors, or the data, can all be on mmio.
> IIRC the virtio spec forbids that, but the APIs have to be general.  We
> don't have cpu_physical_memory_map_nommio() (or
> address_space_map_nommio(), as soon as the coding style committee
> ratifies struct literals).

cpu_physical_memory_map could still take the QEMU lock in the slow
bounce-buffer case; see the sketch of that locking scheme below.

BTW the block layer has been using struct literals for a long time, and
we're just as happy as you are about them. :)
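
To make the bounce-buffer point concrete, here is a self-contained toy
model of the idea.  It is not QEMU code: the pthread mutex stands in
for the global mutex, and guest_ram_ptr()/bounce_map() are invented
helpers, not QEMU APIs.  The point is only that the RAM-backed fast
path never touches the lock; only the rare non-RAM fallback does.

    /* Toy model: direct RAM mappings need no lock; only the bounce-buffer
     * fallback (MMIO or out-of-range addresses) takes the global mutex. */
    #include <pthread.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static pthread_mutex_t global_mutex = PTHREAD_MUTEX_INITIALIZER;
    static uint8_t guest_ram[4096];             /* pretend guest RAM */

    /* Fast path: if the range is RAM-backed, return a direct pointer. */
    static void *guest_ram_ptr(uint64_t addr, uint64_t len)
    {
        if (addr + len <= sizeof(guest_ram)) {
            return guest_ram + addr;
        }
        return NULL;                            /* MMIO or out of range */
    }

    /* Slow path: copy through a bounce buffer while holding the lock. */
    static void *bounce_map(uint64_t addr, uint64_t len)
    {
        void *bounce;
        pthread_mutex_lock(&global_mutex);
        bounce = malloc(len);
        memset(bounce, 0, len);   /* stand-in for filling the buffer */
        pthread_mutex_unlock(&global_mutex);
        return bounce;
    }

    static void *map(uint64_t addr, uint64_t len, bool *bounced)
    {
        void *p = guest_ram_ptr(addr, len);
        *bounced = (p == NULL);
        return p ? p : bounce_map(addr, len);
    }

    int main(void)
    {
        bool bounced;
        void *p;

        p = map(0x100, 64, &bounced);     /* RAM-backed: lock never taken */
        printf("first map bounced: %s\n", bounced ? "yes" : "no");

        p = map(0x10000, 64, &bounced);   /* not RAM: bounce buffer + lock */
        printf("second map bounced: %s\n", bounced ? "yes" : "no");
        if (bounced) {
            free(p);
        }
        return 0;
    }

Paolo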