From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:44096)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from <pl@kamp.de>)
	id 1XuIvk-0003xf-JM
	for qemu-devel@nongnu.org; Fri, 28 Nov 2014 05:37:46 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <pl@kamp.de>) id 1XuIvf-0000iY-1r
	for qemu-devel@nongnu.org; Fri, 28 Nov 2014 05:37:40 -0500
Received: from mx-v6.kamp.de ([2a02:248:0:51::16]:48776 helo=mx01.kamp.de)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <pl@kamp.de>)
	id 1XuIve-0000gl-Nt
	for qemu-devel@nongnu.org; Fri, 28 Nov 2014 05:37:34 -0500
Message-ID: <54785067.60905@kamp.de>
Date: Fri, 28 Nov 2014 11:37:27 +0100
From: Peter Lieven <pl@kamp.de>
MIME-Version: 1.0
References: <1417084026-12307-1-git-send-email-pl@kamp.de>
	<1417084026-12307-4-git-send-email-pl@kamp.de>
	<547753F7.2030709@redhat.com> <54782EC3.10005@kamp.de>
	<54784E55.6060405@redhat.com>
In-Reply-To: <54784E55.6060405@redhat.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per
 thread for the pool
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Paolo Bonzini <pbonzini@redhat.com>, ming.lei@canonical.com, Kevin Wolf <kwolf@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, Markus Armbruster <armbru@redhat.com>

Am 28.11.2014 um 11:28 schrieb Paolo Bonzini:
>
> On 28/11/2014 09:13, Peter Lieven wrote:
>> Am 27.11.2014 um 17:40 schrieb Paolo Bonzini:
>>> On 27/11/2014 11:27, Peter Lieven wrote:
>>>> +static __thread struct CoRoutinePool {
>>>> +    Coroutine *ptrs[POOL_MAX_SIZE];
>>>> +    unsigned int size;
>>>> +    unsigned int nextfree;
>>>> +} CoPool;
>>>>  
>>> The per-thread ring unfortunately didn't work well last time it was
>>> tested.  Devices that do not use ioeventfd (not just the slow ones, even
>>> decently performing ones like ahci, nvme or megasas) will create the
>>> coroutine in the VCPU thread, and destroy it in the iothread.  The
>>> result is that coroutines cannot be reused.
>>>
>>> Can you check if this is still the case?
>> I already tested at least for IDE and for ioeventfd=off. The coroutine
>> is created in the vCPU thread and destroyed in the I/O thread.
>>
>> I also have a more complicated version which sets up a per-thread coroutine pool only
>> for dataplane, avoiding the lock for dedicated iothreads.
>>
>> For those who want to take a look:
>>
>> https://github.com/plieven/qemu/commit/325bc4ef5c7039337fa785744b145e2bdbb7b62e
> Can you test it against the patch I just sent in Kevin's linux-aio
> coroutine thread?

I was already doing it ;-) At least with test-coroutine.c...

master:
Run operation 40000000 iterations 12.851414 s, 3112K operations/s, 321ns per coroutine

paolo:
Run operation 40000000 iterations 11.951720 s, 3346K operations/s, 298ns per coroutine

plieven/perf_master2:
Run operation 40000000 iterations 9.013785 s, 4437K operations/s, 225ns per coroutine

plieven/perf_master:
Run operation 40000000 iterations 11.072883 s, 3612K operations/s, 276ns per coroutine
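
For easier comparison, the figures above can be turned into a gain relative
to master. The following is only a small sketch: the helper names are made up
for illustration and are not part of test-coroutine.c. With the numbers above,
perf_master2 comes out at roughly 43% more operations per second than master.

```c
#include <stdio.h>

/* Nanoseconds per coroutine for a run of `iterations` that took `seconds`. */
static double ns_per_op(double seconds, double iterations)
{
    return seconds * 1e9 / iterations;
}

/* Throughput gain of a candidate run over a baseline run, in percent.
 * Both runs must use the same iteration count. */
static double gain_percent(double baseline_s, double candidate_s)
{
    return (baseline_s / candidate_s - 1.0) * 100.0;
}

/* Print one line per branch, using the run times quoted in this mail. */
static void print_comparison(void)
{
    static const struct {
        const char *name;
        double seconds;
    } runs[] = {
        { "master",               12.851414 },
        { "paolo",                11.951720 },
        { "plieven/perf_master2",  9.013785 },
        { "plieven/perf_master",  11.072883 },
    };
    const double iterations = 40000000.0;

    for (size_t i = 0; i < sizeof(runs) / sizeof(runs[0]); i++) {
        printf("%-22s %6.1f ns/coroutine  %+5.1f%% vs master\n",
               runs[i].name,
               ns_per_op(runs[i].seconds, iterations),
               gain_percent(runs[0].seconds, runs[i].seconds));
    }
}
```

Calling print_comparison() from main() prints the four lines.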

However, perf_master and perf_master2 seem to have a regression regarding nesting.
@Kevin: Could that be the reason why they perform badly in some scenarios?
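
One scenario where a per-thread pool cannot help at all is the one Paolo
describes above: the coroutine is created in the vCPU thread and destroyed in
the iothread. Below is a minimal, single-threaded model of that effect. It is
not the code from the patch: the two "threads" are modelled as two explicit
Pool structs instead of a __thread variable, the pool is a simple stack rather
than a ring, and the names Pool, pool_get, pool_put and run_requests are
hypothetical.

```c
#include <stdlib.h>

#define POOL_MAX_SIZE 64

/* Placeholder for the real coroutine object. */
typedef struct Coroutine {
    int unused;
} Coroutine;

/* Stand-in for one thread's private pool (a __thread variable in the patch). */
typedef struct Pool {
    Coroutine *ptrs[POOL_MAX_SIZE];
    unsigned int size;
    unsigned long allocs;   /* number of fresh allocations, for illustration */
} Pool;

/* Take a coroutine from the pool, or allocate a new one if it is empty. */
static Coroutine *pool_get(Pool *p)
{
    Coroutine *co;

    if (p->size > 0) {
        return p->ptrs[--p->size];
    }
    co = malloc(sizeof(*co));
    if (co == NULL) {
        abort();
    }
    p->allocs++;
    return co;
}

/* Return a coroutine to the pool, or free it if the pool is full. */
static void pool_put(Pool *p, Coroutine *co)
{
    if (p->size < POOL_MAX_SIZE) {
        p->ptrs[p->size++] = co;
    } else {
        free(co);
    }
}

/* Run n create/destroy cycles and return how many fresh allocations the
 * creating side needed. */
static unsigned long run_requests(Pool *creator, Pool *destroyer,
                                  unsigned int n)
{
    unsigned long before = creator->allocs;

    for (unsigned int i = 0; i < n; i++) {
        Coroutine *co = pool_get(creator);
        pool_put(destroyer, co);
    }
    return creator->allocs - before;
}
```

With creator == destroyer, only the first request allocates and every later
one is served from the pool. With two different pools, the creating side never
gets anything back, so every request allocates, while the destroying side
fills up to POOL_MAX_SIZE and frees the rest.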


Regarding the bypass that is being discussed: if it is not just a benchmark thing but really necessary
for some people's use cases, why not add a new aio mode like "bypass" and use the bypass only then?
If the performance is really needed, the user might accept losing features like I/O throttling, filters etc. in exchange.

Peter