From: Jan Kiszka
Subject: Re: [Qemu-devel] [PATCH v2] posix-aio-compat: fix latency issues
Date: Tue, 23 Aug 2011 15:02:28 +0200
Message-ID: <4E53A4E4.1090807@siemens.com>
In-Reply-To: <4E539FA9.3010507@codemonkey.ws>
References: <1313294689-21572-1-git-send-email-avi@redhat.com> <4E5291DF.1070603@siemens.com> <4E539FA9.3010507@codemonkey.ws>
To: Anthony Liguori
Cc: Stefan Hajnoczi, Kevin Wolf, "kvm@vger.kernel.org", Avi Kivity, "qemu-devel@nongnu.org"

On 2011-08-23 14:40, Anthony Liguori wrote:
> On 08/23/2011 06:01 AM, Stefan Hajnoczi wrote:
>> On Mon, Aug 22, 2011 at 6:29 PM, Jan Kiszka wrote:
>>> On 2011-08-14 06:04, Avi Kivity wrote:
>>>> In certain circumstances, posix-aio-compat can incur a lot of latency:
>>>>  - threads are created by vcpu threads, so if vcpu affinity is set,
>>>>    aio threads inherit vcpu affinity.  This can cause many aio threads
>>>>    to compete for one cpu.
>>>>  - we can create up to max_threads (64) aio threads in one go; since a
>>>>    pthread_create can take around 30µs, we have up to 2ms of cpu time
>>>>    under a global lock.
>>>>
>>>> Fix by:
>>>>  - moving thread creation to the main thread, so we inherit the main
>>>>    thread's affinity instead of the vcpu thread's affinity.
>>>>  - if a thread is currently being created, and we need to create yet
>>>>    another thread, let the thread being born create the new thread,
>>>>    reducing the amount of time we spend under the main thread.
>>>>  - drop the local lock while creating a thread (we may still hold the
>>>>    global mutex, though)
>>>>
>>>> Note this doesn't eliminate latency completely; scheduler artifacts or
>>>> lack of host cpu resources can still cause it.  We may want pre-allocated
>>>> threads when this cannot be tolerated.
>>>>
>>>> Thanks to Uli Obergfell of Red Hat for his excellent analysis and
>>>> suggestions.
>>>
>>> At this chance: What is the state of getting rid of the remaining delta
>>> between upstream's version and qemu-kvm?
>>
>> That would be nice.  qemu-kvm.git uses a signalfd to handle I/O
>> completion whereas qemu.git uses a signal, writes to a pipe from the
>> signal handler, and uses qemu_notify_event() to break the vcpu.  Once
>> the force-iothread patch is merged, we should be able to move to
>> qemu-kvm.git's signalfd approach.
>
> No need to use a signal at all, actually.  The use of a signal is
> historic and was required to work around the TCG race that I referred to
> in another thread.
>
> You should be able to just use an eventfd or pipe.
>
> Better yet, we should look at using GThreadPool to replace
> posix-aio-compat.

When interacting with the thread pool is part of some time-critical path
(easily possible with a real-time Linux guest), general-purpose
implementations like what glib offers are typically not an option. They do
not provide sufficient customizability, specifically no control over their
internal synchronization and allocation policies. The same applies to the
other, rather primitive glib threading and locking services.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux