Message-ID: <53961C5B.9020201@us.ibm.com>
Date: Mon, 09 Jun 2014 15:43:07 -0500
From: Karl Rister
To: qemu-devel@nongnu.org
Subject: [Qemu-devel] dataplane performance on s390

Hi All

I was asked by our development team to do a performance sniff test of the latest dataplane code on s390 and compare
it against qemu.git. Here is a brief description of the configuration, the testing done, and then the results.

Configuration:

Host: 26 CPU LPAR, 64GB, 8 zFCP adapters
Guest: 4 VCPU, 1GB, 128 virtio block devices

Each virtio block device maps to a dm-multipath device in the host with 8 paths. Multipath is configured with the service-time policy. All block devices are configured to use the deadline IO scheduler.

Test:

FIO is used to run 4 scenarios: sequential read, sequential write, random read, and random write. Sequential scenarios use a 128KB request size and random scenarios use an 8KB request size. Each scenario is run with an increasing number of jobs, from 1 to 128 (powers of 2). Each job is bound to an individual file on an ext3 file system on a virtio device and uses O_DIRECT, libaio, and iodepth=1. Each test is run three times for 2 minutes each; the first iteration (a warmup) is thrown out and the next two iterations are averaged together.

Results:

Baseline: qemu.git 93f94f9018229f146ed6bbe9e5ff72d67e4bd7ab
Dataplane: bdrv_set_aio_context 0ab50cde71aa27f39b8a3ea4766ff82671adb2a4

Sequential Read:

Overall a slight throughput regression with a noticeable reduction in CPU efficiency.

1 Job: Throughput regressed -1.4%, CPU improved -0.83%
2 Job: Throughput regressed -2.5%, CPU regressed +2.81%
4 Job: Throughput regressed -2.2%, CPU regressed +12.22%
8 Job: Throughput regressed -0.7%, CPU regressed +9.77%
16 Job: Throughput regressed -3.4%, CPU regressed +7.04%
32 Job: Throughput regressed -1.8%, CPU regressed +12.03%
64 Job: Throughput regressed -0.1%, CPU regressed +10.60%
128 Job: Throughput increased +0.3%, CPU regressed +10.70%

Sequential Write:

Mostly regressed throughput, although it gets better as job count increases and even has some gains at higher job counts. CPU efficiency is regressed.

1 Job: Throughput regressed -1.9%, CPU regressed +0.90%
2 Job: Throughput regressed -2.0%, CPU regressed +1.07%
4 Job: Throughput regressed -2.4%, CPU regressed +8.68%
8 Job: Throughput regressed -2.0%, CPU regressed +4.23%
16 Job: Throughput regressed -5.0%, CPU regressed +10.53%
32 Job: Throughput improved +7.6%, CPU regressed +7.37%
64 Job: Throughput regressed -0.6%, CPU regressed +7.29%
128 Job: Throughput improved +8.3%, CPU regressed +6.68%

Random Read:

Again, mostly throughput regressions except for the largest job counts. CPU efficiency is regressed at all data points.

1 Job: Throughput regressed -3.0%, CPU regressed +0.14%
2 Job: Throughput regressed -3.6%, CPU regressed +6.86%
4 Job: Throughput regressed -5.1%, CPU regressed +11.11%
8 Job: Throughput regressed -8.6%, CPU regressed +12.32%
16 Job: Throughput regressed -5.7%, CPU regressed +12.99%
32 Job: Throughput regressed -7.4%, CPU regressed +7.62%
64 Job: Throughput improved +10.0%, CPU regressed +10.83%
128 Job: Throughput improved +10.7%, CPU regressed +10.85%

Random Write:

Throughput and CPU regressed at all but one data point.

1 Job: Throughput regressed -2.3%, CPU improved -1.50%
2 Job: Throughput regressed -2.2%, CPU regressed +0.16%
4 Job: Throughput regressed -1.0%, CPU regressed +8.36%
8 Job: Throughput regressed -8.6%, CPU regressed +12.47%
16 Job: Throughput regressed -3.1%, CPU regressed +12.40%
32 Job: Throughput regressed -0.2%, CPU regressed +11.59%
64 Job: Throughput regressed -1.9%, CPU regressed +12.65%
128 Job: Throughput improved +5.6%, CPU regressed +11.68%

* CPU consumption is an efficiency calculation of usage per MB of throughput.

-- 
Karl Rister
IBM Linux/KVM Development Optimization
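
For readers who want to reproduce the style of the percentages above, here is a minimal Python sketch of the arithmetic as described in the Test section and the footnote: drop the warmup iteration, average the remaining iterations, express CPU cost as usage per MB of throughput, and report the dataplane result as a percent change from baseline. The exact formula and units used for the report are not given in this mail, so the definition of cpu_per_mb (CPU utilization divided by MB/s) is an assumption, and all function names and sample numbers are hypothetical.

```python
from statistics import mean


def steady_state_mean(samples):
    """Discard the first (warmup) iteration and average the rest."""
    if len(samples) < 2:
        raise ValueError("need a warmup iteration plus at least one measured iteration")
    return mean(samples[1:])


def cpu_per_mb(cpu_util, throughput_mb_s):
    """CPU cost per MB of throughput (assumed: utilization / MB/s)."""
    return cpu_util / throughput_mb_s


def pct_change(baseline, candidate):
    """Percent change of candidate relative to baseline."""
    return (candidate - baseline) / baseline * 100.0


# Hypothetical three-iteration samples; these are NOT values from the report.
base_tp = steady_state_mean([90.0, 100.0, 100.0])   # baseline throughput, MB/s
data_tp = steady_state_mean([85.0, 98.0, 98.0])     # dataplane throughput, MB/s
base_cpu = steady_state_mean([40.0, 50.0, 50.0])    # baseline CPU utilization, %
data_cpu = steady_state_mean([45.0, 53.9, 53.9])    # dataplane CPU utilization, %

# Negative throughput delta = regression; positive CPU-per-MB delta = regression.
tp_delta = pct_change(base_tp, data_tp)
cpu_delta = pct_change(cpu_per_mb(base_cpu, base_tp),
                       cpu_per_mb(data_cpu, data_tp))

print(f"Throughput {tp_delta:+.1f}%, CPU {cpu_delta:+.2f}%")
```

Under this definition a positive CPU figure means more CPU was spent per MB moved, which matches how "CPU regressed +N%" is used in the results, while a positive throughput figure is an improvement.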