From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH v2] posix-aio-compat: fix latency issues
Date: Tue, 23 Aug 2011 07:40:09 -0500
Message-ID: <4E539FA9.3010507@codemonkey.ws>
References: <1313294689-21572-1-git-send-email-avi@redhat.com> <4E5291DF.1070603@siemens.com>
To: Stefan Hajnoczi
Cc: Jan Kiszka, Kevin Wolf, kvm@vger.kernel.org, Avi Kivity, qemu-devel@nongnu.org

On 08/23/2011 06:01 AM, Stefan Hajnoczi wrote:
> On Mon, Aug 22, 2011 at 6:29 PM, Jan Kiszka wrote:
>> On 2011-08-14 06:04, Avi Kivity wrote:
>>> In certain circumstances, posix-aio-compat can incur a lot of latency:
>>>  - threads are created by vcpu threads, so if vcpu affinity is set,
>>>    aio threads inherit vcpu affinity.  This can cause many aio threads
>>>    to compete for one cpu.
>>>  - we can create up to max_threads (64) aio threads in one go; since a
>>>    pthread_create can take around 30μs, we have up to 2ms of cpu time
>>>    under a global lock.
>>>
>>> Fix by:
>>>  - moving thread creation to the main thread, so we inherit the main
>>>    thread's affinity instead of the vcpu thread's affinity.
>>>  - if a thread is currently being created, and we need to create yet
>>>    another thread, let the thread being born create the new thread,
>>>    reducing the amount of time we spend under the main thread.
>>>  - drop the local lock while creating a thread (we may still hold the
>>>    global mutex, though)
>>>
>>> Note this doesn't eliminate latency completely; scheduler artifacts or
>>> lack of host cpu resources can still cause it.  We may want pre-allocated
>>> threads when this cannot be tolerated.
>>>
>>> Thanks to Uli Obergfell of Red Hat for his excellent analysis and suggestions.
>>
>> At this chance: What is the state of getting rid of the remaining delta
>> between upstream's version and qemu-kvm?
>
> That would be nice.  qemu-kvm.git uses a signalfd to handle I/O
> completion whereas qemu.git uses a signal, writes to a pipe from the
> signal handler, and uses qemu_notify_event() to break the vcpu.  Once
> the force iothread patch is merged we should be able to move to
> qemu-kvm.git's signalfd approach.

No need to use a signal at all, actually.  The use of a signal is
historic and was required to work around the TCG race that I referred to
in another thread.  You should be able to just use an eventfd or pipe.

Better yet, we should look at using GThreadPool to replace posix-aio-compat.

Regards,

Anthony Liguori

>
> Stefan
>