Date: Thu, 19 Jun 2014 18:39:28 +0800
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] dataplane performance on s390
To: Fam Zheng
Cc: pbonzini@redhat.com, Karl Rister, qemu-devel@nongnu.org
Message-ID: <20140619103928.GD2766@stefanha-thinkpad.redhat.com>
In-Reply-To: <20140610014038.GA11308@T430.nay.redhat.com>
References: <53961C5B.9020201@us.ibm.com> <20140610014038.GA11308@T430.nay.redhat.com>

On Tue, Jun 10, 2014 at 09:40:38AM +0800, Fam Zheng wrote:
> On Mon, 06/09 15:43, Karl Rister wrote:
> > Hi All
> >
> > I was asked by our development team to do a performance sniff test of the
> > latest dataplane code on s390 and compare it against qemu.git.  Here is a
> > brief description of the configuration, the testing done, and then the
> > results.
> >
> > Configuration:
> >
> > Host:  26 CPU LPAR, 64GB, 8 zFCP adapters
> > Guest: 4 VCPU, 1GB, 128 virtio block devices
> >
> > Each virtio block device maps to a dm-multipath device in the host with
> > 8 paths.
> > Multipath is configured with the service-time policy.  All block
> > devices are configured to use the deadline IO scheduler.
> >
> > Test:
> >
> > FIO is used to run 4 scenarios: sequential read, sequential write,
> > random read, and random write.  Sequential scenarios use a 128KB
> > request size and random scenarios use an 8KB request size.  Each
> > scenario is run with an increasing number of jobs, from 1 to 128
> > (powers of 2).  Each job is bound to an individual file on an ext3
> > file system on a virtio device and uses O_DIRECT, libaio, and
> > iodepth=1.  Each test is run three times for 2 minutes each; the first
> > iteration (a warmup) is thrown out and the remaining two iterations
> > are averaged together.
> >
> > Results:
> >
> > Baseline:  qemu.git 93f94f9018229f146ed6bbe9e5ff72d67e4bd7ab
> > Dataplane: bdrv_set_aio_context 0ab50cde71aa27f39b8a3ea4766ff82671adb2a4
>
> Hi Karl,
>
> Thanks for the results.
>
> The throughput differences look minimal.  Where is the bandwidth
> saturated in these tests?  And why use iodepth=1 rather than more?
>
> Thanks,
> Fam
>
> >
> > Sequential Read:
> >
> > Overall a slight throughput regression with a noticeable reduction in
> > CPU efficiency.
> >
> > 1 Job:   Throughput regressed -1.4%, CPU improved -0.83%
> > 2 Job:   Throughput regressed -2.5%, CPU regressed +2.81%
> > 4 Job:   Throughput regressed -2.2%, CPU regressed +12.22%
> > 8 Job:   Throughput regressed -0.7%, CPU regressed +9.77%
> > 16 Job:  Throughput regressed -3.4%, CPU regressed +7.04%
> > 32 Job:  Throughput regressed -1.8%, CPU regressed +12.03%
> > 64 Job:  Throughput regressed -0.1%, CPU regressed +10.60%
> > 128 Job: Throughput increased +0.3%, CPU regressed +10.70%
> >
> > Sequential Write:
> >
> > Mostly regressed throughput, although it improves as the job count
> > increases and even shows some gains at the higher job counts.  CPU
> > efficiency is regressed.
> >
> > 1 Job:   Throughput regressed -1.9%, CPU regressed +0.90%
> > 2 Job:   Throughput regressed -2.0%, CPU regressed +1.07%
> > 4 Job:   Throughput regressed -2.4%, CPU regressed +8.68%
> > 8 Job:   Throughput regressed -2.0%, CPU regressed +4.23%
> > 16 Job:  Throughput regressed -5.0%, CPU regressed +10.53%
> > 32 Job:  Throughput improved +7.6%, CPU regressed +7.37%
> > 64 Job:  Throughput regressed -0.6%, CPU regressed +7.29%
> > 128 Job: Throughput improved +8.3%, CPU regressed +6.68%
> >
> > Random Read:
> >
> > Again, mostly throughput regressions except for the largest job
> > counts.  CPU efficiency is regressed at all data points.
> >
> > 1 Job:   Throughput regressed -3.0%, CPU regressed +0.14%
> > 2 Job:   Throughput regressed -3.6%, CPU regressed +6.86%
> > 4 Job:   Throughput regressed -5.1%, CPU regressed +11.11%
> > 8 Job:   Throughput regressed -8.6%, CPU regressed +12.32%
> > 16 Job:  Throughput regressed -5.7%, CPU regressed +12.99%
> > 32 Job:  Throughput regressed -7.4%, CPU regressed +7.62%
> > 64 Job:  Throughput improved +10.0%, CPU regressed +10.83%
> > 128 Job: Throughput improved +10.7%, CPU regressed +10.85%
> >
> > Random Write:
> >
> > Throughput and CPU regressed at all but one data point.
> >
> > 1 Job:   Throughput regressed -2.3%, CPU improved -1.50%
> > 2 Job:   Throughput regressed -2.2%, CPU regressed +0.16%
> > 4 Job:   Throughput regressed -1.0%, CPU regressed +8.36%
> > 8 Job:   Throughput regressed -8.6%, CPU regressed +12.47%
> > 16 Job:  Throughput regressed -3.1%, CPU regressed +12.40%
> > 32 Job:  Throughput regressed -0.2%, CPU regressed +11.59%
> > 64 Job:  Throughput regressed -1.9%, CPU regressed +12.65%
> > 128 Job: Throughput improved +5.6%, CPU regressed +11.68%
> >
> > * CPU consumption is an efficiency calculation of usage per MB of
> > throughput.

Thanks for sharing!
This is actually not too bad considering that the bdrv_set_aio_context()
code uses the QEMU block layer while the older qemu.git code uses a custom
Linux AIO code path.

The CPU efficiency regression is interesting.  Do you have any profiling
data that shows where the hot spots are?

Thanks,
Stefan
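[Editor's note: for readers wanting to approximate the workload Karl
describes, one of his fio scenarios could be sketched as a job file along
these lines.  This is an illustration, not his actual configuration: the
job name, filename, and file size are hypothetical, while the request
size, engine, queue depth, direct I/O, and 2-minute time-based runs come
from the report above.]

```ini
; Sketch of one random-read scenario from the report: 8KB requests,
; O_DIRECT, libaio, iodepth=1, time-based 2-minute run.
; filename and size are hypothetical; the report gives neither.
[global]
ioengine=libaio
direct=1
iodepth=1
time_based
runtime=120

[randread-8k]
rw=randread
bs=8k
filename=/mnt/vdev/fio.dat
size=1g
```

The job count sweep (1 to 128 in powers of 2) would then be driven by
varying numjobs, e.g. `fio --numjobs=4 randread.fio`, with each job bound
to its own file as Karl describes.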