From mboxrd@z Thu Jan  1 00:00:00 1970
From: Steven Haigh <netwiz@crc.id.au>
Subject: Re: IO speed limited by size of IO request (for RBD
	driver)
Date: Wed, 08 May 2013 20:32:58 +1000
Message-ID: <518A29DA.3080501@crc.id.au>
References: <CAF6-1L5fePSH1eddGSa5e9ZChuNp3Gdph-wrfjJo75Ts=9Ra4Q@mail.gmail.com>
	<51768FA5.6090609@crc.id.au> <5176957E.6010306@citrix.com>
	<51769B9D.4000708@crc.id.au> <51769CFD.7020907@citrix.com>
	<51769E1E.6040902@crc.id.au> <5176A19A.2010802@citrix.com>
	<5176A440.8040303@crc.id.au> <5176A520.5030503@citrix.com>
	<5176A61F.6050607@crc.id.au> <5176A6DD.5000404@citrix.com>
	<5176AFF9.4020003@crc.id.au> <5176B237.8020803@citrix.com>
	<5176C073.3050409@crc.id.au> <5176CF56.8000505@citrix.com>
	<5176DB88.1070200@crc.id.au> <517A89DA.3030804@citrix.com>
	<517A8C44.5020103@crc.id.au> <517B3088.7070809@crc.id.au>
	<517B790A.3020009@citrix.com> <517B838C.9040607@crc.id.au>
	<517B8DE3.90306@crc.id.au> <517E3195.8090204@citrix.com>
	<517EC975.7030807@crc.id.au> <517ECE64.6000503@crc.id.au>
	<9F2C4E7DFB7839489C89757A66C5AD620E57EA@LONPEX01CL03.citrite.net>
	<518A0AB8.90506@crc.id.au> <518A0DC8.4080501@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
Content-Transfer-Encoding: quoted-printable
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <518A0DC8.4080501@citrix.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: =?ISO-8859-1?Q?Roger_Pau_Monn=E9?= <roger.pau@citrix.com>
Cc: Felipe Franciosi <felipe.franciosi@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
List-Id: xen-devel@lists.xenproject.org

On 8/05/2013 6:33 PM, Roger Pau Monné wrote:
> On 08/05/13 10:20, Steven Haigh wrote:
>> On 30/04/2013 8:07 PM, Felipe Franciosi wrote:
>>> I noticed you copied your results from "dd", but I didn't see any conclusions drawn from experiment.
>>>
>>> Did I understand it wrong or now you have comparable performance on dom0 and domU when using DIRECT?
>>>
>>> domU:
>>> # dd if=/dev/zero of=output.zero bs=1M count=2048 oflag=direct
>>> 2048+0 records in
>>> 2048+0 records out
>>> 2147483648 bytes (2.1 GB) copied, 25.4705 s, 84.3 MB/s
>>>
>>> dom0:
>>> # dd if=/dev/zero of=output.zero bs=1M count=2048 oflag=direct
>>> 2048+0 records in
>>> 2048+0 records out
>>> 2147483648 bytes (2.1 GB) copied, 24.8914 s, 86.3 MB/s
>>>
>>>
>>> I think that if the performance differs when NOT using DIRECT, the issue must be related to the way your guest is flushing the cache. This must be generating a workload that doesn't perform well on Xen's PV protocol.
>>
>> Just wondering if there is any further input on this... While DIRECT
>> writes are as good as can be expected, NON-DIRECT writes in certain
>> cases (specifically with a mdadm raid in the Dom0) are affected by about
>> a 50% loss in throughput...
>>
>> The hard part is that this is the default mode of writing!
>
> As another test with indirect descriptors, could you change
> xen_blkif_max_segments in xen-blkfront.c to 128 (it is 32 by default),
> recompile the DomU kernel and see if that helps?

Ok, here we go... compiled as 3.8.0-2 with the above change. 3.8.0-2 is
running on both the Dom0 and DomU.

# dd if=/dev/zero of=output.zero bs=1M count=2048
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 22.1703 s, 96.9 MB/s

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
            0.34    0.00   17.10    0.00    0.23   82.33

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdd             980.97 11936.47   53.11  429.78     4.00    48.77   223.81    12.75   26.10   2.11 101.79
sdc             872.71 11957.87   45.98  435.67     3.55    49.30   224.71    13.77   28.43   2.11 101.49
sde             949.26 11981.88   51.30  429.33     3.91    48.90   225.03    21.29   43.91   2.27 109.08
sdf             915.52 11968.52   48.58  428.88     3.73    48.92   225.84    21.44   44.68   2.27 108.56
md2               0.00     0.00    0.00 1155.61     0.00    97.51   172.80     0.00    0.00   0.00   0.00

# dd if=/dev/zero of=output.zero bs=1M count=2048 oflag=direct
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 25.3708 s, 84.6 MB/s

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
            0.11    0.00   13.92    0.00    0.22   85.75

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdd               0.00 13986.08    0.00  263.20     0.00    55.76   433.87     0.43    1.63   1.07  28.27
sdc             202.10 13741.55    6.52  256.57     0.81    54.77   432.65     0.50    1.88   1.25  32.78
sde              47.96 11437.57    1.55  261.77     0.19    45.79   357.63     0.80    3.02   1.85  48.60
sdf            2233.37 11756.13   71.93  191.38     8.99    46.80   433.90     1.49    5.66   3.27  86.15
md2               0.00     0.00    0.00  731.93     0.00    91.49   256.00     0.00    0.00   0.00   0.00

Now this is pretty much exactly what I would expect the system to do...
~96MB/sec buffered, and 85MB/sec direct.
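
As a rough aid for reading the avgrq-sz column in the iostat output above: iostat reports that column in 512-byte sectors, so the md2 figures can be converted to KiB per request. The sketch below only does that unit conversion on the two md2 values copied from the samples; the helper name is made up for illustration and no conclusion about cause is implied.

```python
SECTOR_BYTES = 512  # iostat reports avgrq-sz in 512-byte sectors


def sectors_to_kib(sectors):
    """Convert an iostat avgrq-sz value (in sectors) to KiB."""
    return sectors * SECTOR_BYTES / 1024


# md2 avgrq-sz values copied from the two iostat samples above
md2_avgrq_sz = {"buffered": 172.80, "direct": 256.00}

for mode, sectors in md2_avgrq_sz.items():
    print(f"md2 {mode}: {sectors_to_kib(sectors):.1f} KiB per request")
```

That works out to about 86.4 KiB per request for the buffered run and 128.0 KiB for the direct run, as seen at the md2 level in the Dom0.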

So it turns out that xen_blkif_max_segments at 32 is a killer in the
DomU. That makes me wonder what we can do about this in kernels that
don't have your series of patches applied. And what about the backend
stuff in 3.8.x etc?
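
To put the 32 vs 128 segment setting in perspective, here is a minimal sketch of the upper bound on data carried by one request, assuming each segment covers at most one 4 KiB page (the x86 page size; that figure and the helper name are assumptions for illustration, not taken from the patch series). The 11-segment row is the classic per-request limit without indirect descriptors, included only for comparison.

```python
PAGE_SIZE = 4096  # assumption: x86 page size, one page at most per segment


def max_request_kib(segments, page_size=PAGE_SIZE):
    """Upper bound on the payload of a single block request, in KiB."""
    return segments * page_size // 1024


# 11: limit without indirect descriptors; 32: default tested earlier; 128: value tested here
for segs in (11, 32, 128):
    print(f"{segs:3d} segments -> up to {max_request_kib(segs)} KiB per request")
```

Under that assumption the limits are 44 KiB, 128 KiB and 512 KiB respectively, so raising the value from 32 to 128 quadruples the largest request the frontend can issue.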

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
Fax: (03) 8338 0299