From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:54081)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1XuJiV-0006NX-4C for qemu-devel@nongnu.org; Fri, 28 Nov 2014 06:28:12 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from )
	id 1XuJiM-0007nj-37 for qemu-devel@nongnu.org; Fri, 28 Nov 2014 06:28:03 -0500
Received: from mx-v6.kamp.de ([2a02:248:0:51::16]:47377 helo=mx01.kamp.de)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1XuJiL-0007nW-QW for qemu-devel@nongnu.org; Fri, 28 Nov 2014 06:27:54 -0500
Message-ID: <54785C32.30805@kamp.de>
Date: Fri, 28 Nov 2014 12:27:46 +0100
From: Peter Lieven
MIME-Version: 1.0
References: <1417084026-12307-1-git-send-email-pl@kamp.de> <1417084026-12307-4-git-send-email-pl@kamp.de> <547753F7.2030709@redhat.com> <54782EC3.10005@kamp.de> <54784E55.6060405@redhat.com> <54785067.60905@kamp.de> <547858FF.5070602@redhat.com> <54785AA5.9070409@kamp.de> <54785B2E.9070203@redhat.com>
In-Reply-To: <54785B2E.9070203@redhat.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool
To: Paolo Bonzini , ming.lei@canonical.com, Kevin Wolf , Stefan Hajnoczi , "qemu-devel@nongnu.org" , Markus Armbruster

On 28.11.2014 at 12:23, Paolo Bonzini wrote:
>
> On 28/11/2014 12:21, Peter Lieven wrote:
>> On 28.11.2014 at 12:14, Paolo Bonzini wrote:
>>>> master:
>>>> Run operation 40000000 iterations 12.851414 s, 3112K operations/s, 321ns per coroutine
>>>>
>>>> paolo:
>>>> Run operation 40000000 iterations 11.951720 s, 3346K operations/s, 298ns per coroutine
>>> Nice. :)
>>>
>>> Can you please try "coroutine: Use __thread … " together, too? I still
>>> see 11% time spent in pthread_getspecific, and I get ~10% more indeed if
>>> I apply it here (my times are 191/160/145).
>> indeed:
>>
>> Run operation 40000000 iterations 10.138684 s, 3945K operations/s, 253ns per coroutine
> Your perf_master2 uses the ring buffer unconditionally, right? I wonder
> if we can use a similar algorithm but with arrays instead of lists...

Do you mean an algorithm similar to perf_master2, or one similar to the
current implementation?

The ring buffer seems to have a drawback when it comes to excessive
coroutine nesting. My idea was that you do not throw away hot coroutines
when the pool is full. However, I do not know whether this is really a
problem, since the pool is only full when there is not much I/O. Or is
this assumption too simple?

Peter
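
To make the "do not throw away hot coroutines when the pool is full" idea
concrete, here is a minimal sketch. It is not the code from the RFC patch:
the names RingPool, pool_put, pool_get and POOL_SIZE are hypothetical, and
plain int pointers stand in for coroutines. It only assumes that a full
pool should evict its oldest (coldest) entry instead of rejecting the one
that was just released.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4             /* hypothetical size, kept small for illustration */

typedef struct {
    int *slot[POOL_SIZE];       /* stand-in for pooled coroutine pointers */
    unsigned head;              /* next slot to write, one past the newest entry */
    unsigned count;             /* number of valid entries */
} RingPool;

/*
 * Store a just-released (hot) item. If the pool is full, the oldest
 * entry is overwritten and returned so the caller can free it;
 * otherwise NULL is returned.
 */
int *pool_put(RingPool *p, int *item)
{
    int *evicted = NULL;

    assert(p->count <= POOL_SIZE);
    if (p->count == POOL_SIZE) {
        evicted = p->slot[p->head];   /* when full, head points at the oldest */
    } else {
        p->count++;
    }
    p->slot[p->head] = item;
    p->head = (p->head + 1) % POOL_SIZE;
    return evicted;
}

/* Take the most recently released item (LIFO), or NULL if the pool is empty. */
int *pool_get(RingPool *p)
{
    if (p->count == 0) {
        return NULL;
    }
    p->head = (p->head + POOL_SIZE - 1) % POOL_SIZE;
    p->count--;
    return p->slot[p->head];
}
```

With POOL_SIZE of 4, putting a fifth item returns the first one for
freeing, and subsequent gets hand back the newest items first. A
list-based pool that simply refuses new entries when full would instead
keep the four oldest items and drop the hot one.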