From: Gerhard Wiesinger
Date: Sat, 9 Jan 2016 17:46:33 +0100
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] QEMU/KVM performance gets worse - high load - high interrupts - high context switches

On 08.12.2015 10:39, Gerhard Wiesinger wrote:
> Hello,
>
> Yesterday I looked at the munin statistics on my KVM host and I saw
> that performance is getting worse: load is rising, and interrupts and
> context switches are high and rising as well. The VMs and the
> applications themselves didn't change in a way that would explain this.
>
> You can find the graphs at: http://www.wiesinger.com/tmp/kvm/
> I guess the last spike was the upgrade from FC22 to FC23 or a kernel
> update. The values were even lower on older versions.
>
> To me it looks like the high interrupt load and context switches are
> the root cause. Interrupts inside each VM are <100, so with 10 VMs I'd
> expect 1000 + baseload => <2000; see the statistics below.
>
> All VMs use virtio for disk/network except one (IDE/rtl8139).
>
> # Host as well as all guests (except 2 VMs):
> uname -a
> Linux kvm 4.2.6-301.fc23.x86_64 #1 SMP Fri Nov 20 22:22:41 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>
> qemu-system-x86-2.4.1-1.fc23.x86_64
>
> Platform:
>
> All VMs have the pc-i440fx-2.4 profile (I upgraded yesterday from
> pc-i440fx-2.3 without any change).
>
> Any ideas? Is anyone seeing the same issue?
>
> Ciao,
> Gerhard
>
> kvm: no VM running
>  r  b   swpd    free   buff   cache  si  so  bi  bo   in   cs  us  sy  id  wa  st
>  0  0      0 3308516 102408 3798568   0   0   0  12  197  679   0   0  99   0   0
>  0  0      0 3308516 102416 3798564   0   0   0  42  197  914   0   0  99   1   0
>  0  0      0 3308516 102416 3798568   0   0   0   0  190  791   0   0 100   0   0
>  2  0      0 3308484 102416 3798568   0   0   0   0  129  440   0   0 100   0   0
>
> kvm: 2 VMs running
> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
>  r  b   swpd    free   buff   cache  si  so  bi  bo   in   cs  us  sy  id  wa  st
>  1  0      0 2641464 103052 3814700   0   0   0   0 2715 5648   3   2  95   0   0
>  0  0      0 2641340 103052 3814700   0   0   0   0 2601 5555   1   2  97   0   0
>  1  0      0 2641308 103052 3814700   0   0   0   5 2687 5708   3   2  95   0   0
>  0  0      0 2640620 103060 3814628   0   0   0  30 2779 5756   4   3  93   1   0
>  0  0      0 2640644 103060 3814636   0   0   0   0 2436 5364   1   2  97   0   0
>  1  0      0 2640520 103060 3814636   0   0   0 119 2734 5975   3   2  95   0   0
>
> kvm: all 10 VMs running
> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
>  r  b   swpd    free   buff   cache  si  so  bi  bo   in    cs  us  sy  id  wa  st
>  1  0      0   60408  78892 3371984   0   0   0  85 9015 17357   4   9  87   0   0
>  2  0      0   60408  78892 3371968   0   0   0  47 9375 17797   9   9  82   0   0
>  0  0      0   60472  78892 3372092   0   0  40  60 8882 17343   4   8  86   1   0
>  1  0      0   60316  78892 3372080   0   0   0  59 8863 17517   4   8  87   0   0
>  0  0      0   59540  78900 3372092   0   0   0  55 9135 17796   8   9  81   1   0
>  0  0      0   59168  78900 3372112   0   0   0  51 8931 17484   4   9  87   0   0
>
> cat /proc/cpuinfo
> processor  : 0
> vendor_id  : GenuineIntel
> cpu family : 6
> model      : 15
> model name : Intel(R) Core(TM)2 Quad CPU @ 2.66GHz
> stepping   : 7

OK, I found what the problem is. Analysis via:

1.) kvm_stat
2.) /usr/bin/perf record -p <pid>
    /usr/bin/perf report -i perf.data > perf-report.txt

cat perf-report.txt
# Overhead  Command          Shared Object       Symbol
# ........  ...............  ..................  ...............................
#
    15.75%  qemu-system-x86  [kernel.kallsyms]   [k] __fget
     8.33%  qemu-system-x86  [kernel.kallsyms]   [k] _raw_spin_lock_irqsave
     7.54%  qemu-system-x86  [kernel.kallsyms]   [k] fput
     6.61%  qemu-system-x86  [kernel.kallsyms]   [k] do_sys_poll
     3.60%  qemu-system-x86  [kernel.kallsyms]   [k] __pollwait
     2.20%  qemu-system-x86  [kernel.kallsyms]   [k] _raw_write_unlock_irqrestore
     2.09%  qemu-system-x86  libpthread-2.22.so  [.] pthread_mutex_lock
...

Found also:
1.) https://bugzilla.redhat.com/show_bug.cgi?id=949547
2.) https://www.kraxel.org/blog/2014/03/qemu-and-usb-tablet-cpu-consumtion/

After reading that I did the following:

# On 10 Linux VMs I removed:
# 1.) the serial device itself
# 2.) the VirtIO serial PCI controller
# 3.) the USB mouse tablet
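For libvirt-managed guests, the three devices correspond roughly to the
following domain XML elements (a sketch only; the attribute values shown
are the usual defaults, not copied from my configs) and can be deleted
with virsh edit <guestname>:

  <serial type='pty'>
    <target port='0'/>
  </serial>
  <controller type='virtio-serial' index='0'/>
  <input type='tablet' bus='usb'/>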
# Positive consequences via munin monitoring:
# Fork rate: 40 => 13
# Processes in "running" state: 15 => <1
# CPU temperature (core dependent): 65-70°C => 56-64°C
# CPU usage: system 47% => 15%, user 76% => 50%
# Context switches: 20k => 7.5k
# Interrupts: 16k => 9k
# Load average: 2.8 => 1 => back at the level of one year ago!

Any idea why the serial device/PCI controller and the USB mouse tablet
consume so much CPU on the latest kernel and/or qemu? Does anyone have
the same experience?

Thanks.

Ciao,
Gerhard
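PS: In case someone wants to reproduce the measurement, this is roughly
how perf can be attached to a single QEMU process (the pgrep pattern and
the 30 second sampling window are illustrative, not my exact invocation):

# sample the oldest matching qemu-system-x86 process for 30 seconds
perf record -p $(pgrep -of qemu-system-x86) -- sleep 30
perf report -i perf.data > perf-report.txt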