From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52610)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <paolo.bonzini@gmail.com>) id 1XDDeS-0000gY-Iv
	for qemu-devel@nongnu.org; Fri, 01 Aug 2014 10:17:50 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <paolo.bonzini@gmail.com>) id 1XDDeJ-00081I-Ru
	for qemu-devel@nongnu.org; Fri, 01 Aug 2014 10:17:44 -0400
Received: from mail-qa0-x22b.google.com ([2607:f8b0:400d:c00::22b]:41423)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <paolo.bonzini@gmail.com>) id 1XDDeJ-00080w-Nm
	for qemu-devel@nongnu.org; Fri, 01 Aug 2014 10:17:35 -0400
Received: by mail-qa0-f43.google.com with SMTP id w8so3949967qac.16
	for <qemu-devel@nongnu.org>; Fri, 01 Aug 2014 07:17:35 -0700 (PDT)
Sender: Paolo Bonzini <paolo.bonzini@gmail.com>
Message-ID: <53DBA179.8090305@redhat.com>
Date: Fri, 01 Aug 2014 16:17:29 +0200
From: Paolo Bonzini <pbonzini@redhat.com>
MIME-Version: 1.0
References: <1406720388-18671-1-git-send-email-ming.lei@canonical.com>	<1406720388-18671-2-git-send-email-ming.lei@canonical.com>	<53D8F6F0.7040106@redhat.com>	<CACVXFVOY-jA6uxvTqxSqORL5AN5D-QK546Qr3pGDeyrX7NJeBw@mail.gmail.com>	<53D981C0.4030708@redhat.com>	<CACVXFVMniMoquw-BQ86VZKPT-1n6p6gp7m01MtioZf=+BugidQ@mail.gmail.com>	<53DA0940.2030207@redhat.com>	<CACVXFVMsbu+VT9E0iLJ3_rX94MOr_rZaX8UPzQVPfyC5NK3T4g@mail.gmail.com>	<53DA6F20.3050200@redhat.com>	<CACVXFVPEeTpUJtdMtOifJ4ow4TOKE_e6h91wtV8XzcHTmxRoWA@mail.gmail.com>	<20140801131330.GE7258@stefanha-thinkpad.redhat.com>
	<CACVXFVNSZFYdvBWtFq3q3dwcWyUntKEnDQH2fgchqKA1ymjn=A@mail.gmail.com>
In-Reply-To: <CACVXFVNSZFYdvBWtFq3q3dwcWyUntKEnDQH2fgchqKA1ymjn=A@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 01/15] qemu coroutine: support bypass mode
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Ming Lei <ming.lei@canonical.com>, Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Peter Maydell <peter.maydell@linaro.org>, Fam Zheng <famz@redhat.com>, qemu-devel <qemu-devel@nongnu.org>, "Michael S. Tsirkin" <mst@redhat.com>

On 01/08/2014 15:48, Ming Lei wrote:
> On Fri, Aug 1, 2014 at 9:13 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>> On Fri, Aug 01, 2014 at 10:54:02AM +0800, Ming Lei wrote:
>>> On Fri, Aug 1, 2014 at 12:30 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> On 31/07/2014 18:13, Ming Lei wrote:
>>>>> Here is the 'perf report' result on the cycles event, with and without
>>>>> coroutine bypass:
>>>>>
>>>>>     http://pastebin.com/ae0vnQ6V
>>>>>
>>>>> From the profiling result, it looks like bdrv_co_do_preadv() is a bit
>>>>> slow without coroutine bypass.
>>>>
>>>> Yeah, I can count at least 3.3% of the time spent here:
>>>>
>>>> 0.87%          bdrv_co_do_preadv
>>>> 0.79%          bdrv_aligned_preadv
>>>> 0.71%          qemu_coroutine_switch
>>>> 0.52%          tracked_request_begin
>>>> 0.45%          coroutine_swap
>>>>
>>>> Another ~3% wasted in malloc, etc.
>>>
>>> That should be related to coroutines and the BH in bdrv_co_do_rw().
>>> In this post I didn't apply Stefan's coroutine resize patch, which might
>>> decrease the use of malloc() for coroutines.
>>
>> Please rerun with "[PATCH v3 0/2] coroutine: dynamically scale pool
>> size".
> 
> No problem, will do that. Actually, my last RFC post of this patchset
> was based on your coroutine resize patches.
> 
> I will provide the profile data tomorrow.
> 
>>
>>> At least, coroutines aren't cheap, judging from the profile result.
>>
>> Instead of bypassing coroutines we should first understand the overhead
>> that they impose.  Is it due to the coroutine implementation (switching
>> stacks) or due to the bdrv_co_*() code that happens to use coroutines
>> but is slow for other reasons?
> 
> From the 3rd patch (block: support to bypass qemu coroutinue)
> and the 5th patch (dataplane: enable selective bypassing coroutine),
> the change is to bypass the coroutine and the BH, and the rest of the bdrv
> code path is the same, so it is due to the coroutine implementation, IMO.

But your code breaks all sorts of invariants.  For example, the aiocb
must be valid when bdrv_aio_readv/writev return.  virtio-blk does not
use it, but virtio-scsi does.  If we apply your patches now, we will
have to redo this work soon.

Basically we should be rewriting parts of block.c so that
bdrv_co_readv/writev call bdrv_aio_readv/writev instead of vice versa.
Coroutine creation should be pushed down to
bdrv_aligned_preadv/bdrv_aligned_pwritev and, in the fast path, you can
simply call the driver's bdrv_aio_readv/writev.
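
To make the direction concrete, here is a rough sketch of the dispatch.
It is not real QEMU code: every toy_* name and Toy* type is a
hypothetical stand-in, and the fast-path condition (aligned request, no
tracking needed) is only an assumption about what "fast path" could
mean.  The point is that the AIO entry point comes first, a coroutine is
created only on the slow path, and the aiocb handed back is valid on
return in both cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are not QEMU's real types. */
typedef struct ToyAIOCB {
    bool in_flight;       /* the caller may still hold or cancel it */
    bool used_coroutine;  /* records which path served the request */
} ToyAIOCB;

typedef struct ToyRequest {
    bool aligned;         /* assumed fast-path condition */
    bool needs_tracking;  /* assumed reason to take the slow path */
} ToyRequest;

/* Counts slow-path entries so the effect of the dispatch is observable. */
static int toy_coroutines_created;

/* Stand-in for the driver's native AIO entry point (fast path). */
static ToyAIOCB *toy_driver_aio_readv(ToyAIOCB *acb)
{
    acb->in_flight = true;
    acb->used_coroutine = false;
    return acb;
}

/* Stand-in for the slow path: the only place a coroutine is created. */
static ToyAIOCB *toy_aligned_preadv_co(ToyAIOCB *acb)
{
    toy_coroutines_created++;
    acb->in_flight = true;
    acb->used_coroutine = true;
    return acb;
}

/* Top-level AIO read: picks the path and returns the aiocb to the caller. */
ToyAIOCB *toy_aio_readv(const ToyRequest *req, ToyAIOCB *acb)
{
    ToyAIOCB *ret;

    if (req->aligned && !req->needs_tracking) {
        ret = toy_driver_aio_readv(acb);   /* no coroutine at all */
    } else {
        ret = toy_aligned_preadv_co(acb);  /* coroutine pushed down here */
    }

    /* Invariant: the aiocb must be valid when the AIO call returns. */
    assert(ret != NULL && ret->in_flight);
    return ret;
}
```

With this shape the caller never needs to know which path was taken,
so a user that keeps the aiocb around (as virtio-scsi does) keeps
working unchanged.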

Paolo