From: Ronald Moesbergen
Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev
Date: Mon, 6 Jul 2009 16:37:04 +0200
To: Vladislav Bolkhovitin
Cc: Wu Fengguang, linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
 kosaki.motohiro@jp.fujitsu.com, Alan.Brunelle@hp.com,
 hifumi.hisashi@oss.ntt.co.jp, linux-fsdevel@vger.kernel.org,
 jens.axboe@oracle.com, randy.dunlap@oracle.com, Bart Van Assche
References: <4A3CD62B.1020407@vlnb.net> <20090629142124.GA28945@localhost>
 <20090629150109.GA3534@localhost> <4A48DFC5.3090205@vlnb.net>
 <20090630010414.GB31418@localhost> <4A49EEF9.6010205@vlnb.net>
 <4A4DE3C1.5080307@vlnb.net> <4A51DC0A.10302@vlnb.net>
In-Reply-To: <4A51DC0A.10302@vlnb.net>

2009/7/6 Vladislav Bolkhovitin:
> (Restored the original list of recipients in this thread, as I was asked.)
>
> Hi Ronald,
>
> Ronald Moesbergen, on 07/04/2009 07:19 PM wrote:
>>
>> 2009/7/3 Vladislav Bolkhovitin:
>>>
>>> Ronald Moesbergen, on 07/03/2009 01:14 PM wrote:
>>>>>>
>>>>>> OK, now I tend to agree on decreasing max_sectors_kb and increasing
>>>>>> read_ahead_kb. But before actually trying to push that idea I'd like to
>>>>>> - do more benchmarks
>>>>>> - figure out why context readahead didn't help SCST performance
>>>>>>   (previous traces show that context readahead is submitting perfect
>>>>>>   large io requests, so I wonder if it's some io scheduler bug)
>>>>>
>>>>> Because, as we found out, without your http://lkml.org/lkml/2009/5/21/319
>>>>> patch read-ahead was nearly disabled, hence there was no difference in
>>>>> which algorithm was used?
>>>>>
>>>>> Ronald, can you run the following tests, please? This time with two
>>>>> hosts, initiator (client) and target (server), connected using 1 Gbps
>>>>> iSCSI. It would be best if the client ran vanilla 2.6.29, but any other
>>>>> kernel is fine as well, just specify which. Blockdev-perftest should be
>>>>> run as before in buffered mode, i.e. with the "-a" switch.
>>>>>
>>>>> 1. All defaults on the client; on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 patch with all default
>>>>> settings.
>>>>>
>>>>> 2. All defaults on the client; on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 patch with default RA
>>>>> size and 64KB max_sectors_kb.
>>>>>
>>>>> 3. All defaults on the client; on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 patch with 2MB RA size
>>>>> and default max_sectors_kb.
>>>>>
>>>>> 4. All defaults on the client; on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 patch with 2MB RA size
>>>>> and 64KB max_sectors_kb.
>>>>>
>>>>> 5. All defaults on the client; on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 patch and with the
>>>>> context RA patch. RA size and max_sectors_kb are default.
>>>>> For your convenience I committed the backported context RA patches
>>>>> into the SCST SVN repository.
>>>>>
>>>>> 6. All defaults on the client; on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 and context RA patches
>>>>> with default RA size and 64KB max_sectors_kb.
>>>>>
>>>>> 7. All defaults on the client; on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 and context RA patches
>>>>> with 2MB RA size and default max_sectors_kb.
>>>>>
>>>>> 8. All defaults on the client; on the server vanilla 2.6.29 with
>>>>> Fengguang's http://lkml.org/lkml/2009/5/21/319 and context RA patches
>>>>> with 2MB RA size and 64KB max_sectors_kb.
>>>>>
>>>>> 9. On the client default RA size and 64KB max_sectors_kb; on the server
>>>>> vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
>>>>> context RA patches with 2MB RA size and 64KB max_sectors_kb.
>>>>>
>>>>> 10. On the client 2MB RA size and default max_sectors_kb; on the server
>>>>> vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
>>>>> context RA patches with 2MB RA size and 64KB max_sectors_kb.
>>>>>
>>>>> 11. On the client 2MB RA size and 64KB max_sectors_kb; on the server
>>>>> vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
>>>>> context RA patches with 2MB RA size and 64KB max_sectors_kb.
>>>>
>>>> Ok, done. Performance is pretty bad overall :(
>>>>
>>>> The kernels I used:
>>>> client kernel: 2.6.26-15lenny3 (debian)
>>>> server kernel: 2.6.29.5 with blk_dev_run patch
>>>>
>>>> And I adjusted the blockdev-perftest script to drop caches on both the
>>>> server (via ssh) and the client.
>>>>
>>>> The results:
>>>>
>>
>> ... previous results ...
>>
>>> Those are on the server without the io_context-2.6.29 and readahead-2.6.29
>>> patches applied and with the CFQ scheduler, correct?
>>>
>>> Then we see how the reordering of requests caused by many I/O threads
>>> submitting I/O in separate I/O contexts badly affects performance, and no
>>> RA, especially with the default 128KB RA size, can solve it. Less
>>> max_sectors_kb on the client => more requests it sends at once => more
>>> reordering on the server => worse throughput. Although, Fengguang, in
>>> theory, context RA with a 2MB RA size should considerably help it, no?
>>>
>>> Ronald, can you perform those tests again with both the io_context-2.6.29
>>> and readahead-2.6.29 patches applied on the server, please?
>>
>> Hi Vlad,
>>
>> I have retested with the patches you requested (and got access to the
>> systems today :) ). The results are better, but still not great.
>>
>> client kernel: 2.6.26-15lenny3 (debian)
>> server kernel: 2.6.29.5 with io_context and readahead patch
>>
>> 5) client: default, server: default
>> blocksize        R        R        R   R(avg,   R(std,        R
>>   (bytes)      (s)      (s)      (s)    MB/s)    MB/s)   (IOPS)
>>  67108864   18.303   19.867   18.481   54.299    1.961    0.848
>>  33554432   18.321   17.681   18.708   56.181    1.314    1.756
>>  16777216   17.816   17.406   19.257   56.494    2.410    3.531
>>   8388608   18.077   17.727   19.338   55.789    2.056    6.974
>>   4194304   17.918   16.601   18.287   58.276    2.454   14.569
>>   2097152   17.426   17.334   17.610   58.661    0.384   29.331
>>   1048576   19.358   18.764   17.253   55.607    2.734   55.607
>>    524288   17.951   18.163   17.440   57.379    0.983  114.757
>>    262144   18.196   17.724   17.520   57.499    0.907  229.995
>>    131072   18.342   18.259   17.551   56.751    1.131  454.010
>>     65536   17.733   18.572   17.134   57.548    1.893  920.766
>>     32768   19.081   19.321   17.364   55.213    2.673 1766.818
>>     16384   17.181   18.729   17.731   57.343    2.033 3669.932
>>
>> 6) client: default, server: 64 max_sectors_kb, RA default
>> blocksize        R        R        R   R(avg,   R(std,        R
>>   (bytes)      (s)      (s)      (s)    MB/s)    MB/s)   (IOPS)
>>  67108864   21.790   20.062   19.534   50.153    2.304    0.784
>>  33554432   20.212   19.744   19.564   51.623    0.706    1.613
>>  16777216   20.404   19.329   19.738   51.680    1.148    3.230
>>   8388608   20.170   20.772   19.509   50.852    1.304    6.356
>>   4194304   19.334   18.742   18.522   54.296    0.978   13.574
>>   2097152   19.413   18.858   18.884   53.758    0.715   26.879
>>   1048576   20.472   18.755   18.476   53.347    2.377   53.347
>>    524288   19.120   20.104   18.404   53.378    1.925  106.756
>>    262144   20.337   19.213   18.636   52.866    1.901  211.464
>>    131072   19.199   18.312   19.970   53.510    1.900  428.083
>>     65536   19.855   20.114   19.592   51.584    0.555  825.342
>>     32768   20.586   18.724   20.340   51.592    2.204 1650.941
>>     16384   21.119   19.834   19.594   50.792    1.651 3250.669
>>
>> 7) client: default, server: default max_sectors_kb, RA 2MB
>> blocksize        R        R        R   R(avg,   R(std,        R
>>   (bytes)      (s)      (s)      (s)    MB/s)    MB/s)   (IOPS)
>>  67108864   17.767   16.489   16.949   60.050    1.842    0.938
>>  33554432   16.777   17.034   17.102   60.341    0.500    1.886
>>  16777216   18.509   16.784   16.971   58.891    2.537    3.681
>>   8388608   18.058   17.949   17.599   57.313    0.632    7.164
>>   4194304   18.286   17.648   17.026   58.055    1.692   14.514
>>   2097152   17.387   18.451   17.875   57.226    1.388   28.613
>>   1048576   18.270   17.698   17.570   57.397    0.969   57.397
>>    524288   16.708   17.900   17.233   59.306    1.668  118.611
>>    262144   18.041   17.381   18.035   57.484    1.011  229.934
>>    131072   17.994   17.777   18.146   56.981    0.481  455.844
>>     65536   17.097   18.597   17.737   57.563    1.975  921.011
>>     32768   17.167   17.035   19.693   57.254    3.721 1832.127
>>     16384   17.144   16.664   17.623   59.762    1.367 3824.774
>>
>> 8) client: default, server: 64 max_sectors_kb, RA 2MB
>> blocksize        R        R        R   R(avg,   R(std,        R
>>   (bytes)      (s)      (s)      (s)    MB/s)    MB/s)   (IOPS)
>>  67108864   20.003   21.133   19.308   50.894    1.881    0.795
>>  33554432   19.448   20.015   18.908   52.657    1.222    1.646
>>  16777216   19.964   19.350   19.106   52.603    0.967    3.288
>>   8388608   18.961   19.213   19.318   53.437    0.419    6.680
>>   4194304   18.135   19.508   19.361   53.948    1.788   13.487
>>   2097152   18.753   19.471   18.367   54.315    1.306   27.158
>>   1048576   19.189   18.586   18.867   54.244    0.707   54.244
>>    524288   18.985   19.199   18.840   53.874    0.417  107.749
>>    262144   19.064   21.143   19.674   51.398    2.204  205.592
>>    131072   18.691   18.664   19.116   54.406    0.594  435.245
>>     65536   18.468   20.673   18.554   53.389    2.729  854.229
>>     32768   20.401   21.156   19.552   50.323    1.623 1610.331
>>     16384   19.532   20.028   20.466   51.196    0.977 3276.567
>>
>> 9) client: 64 max_sectors_kb, default RA. server: 64 max_sectors_kb, RA 2MB
>> blocksize        R        R        R   R(avg,   R(std,        R
>>   (bytes)      (s)      (s)      (s)    MB/s)    MB/s)   (IOPS)
>>  67108864   16.458   16.649   17.346   60.919    1.364    0.952
>>  33554432   16.479   16.744   17.069   61.096    0.878    1.909
>>  16777216   17.128   16.585   17.112   60.456    0.910    3.778
>>   8388608   17.322   16.780   16.885   60.262    0.824    7.533
>>   4194304   17.530   16.725   16.756   60.250    1.299   15.063
>>   2097152   16.580   17.875   16.619   60.221    2.076   30.110
>>   1048576   17.550   17.406   17.075   59.049    0.681   59.049
>>    524288   16.492   18.211   16.832   59.718    2.519  119.436
>>    262144   17.241   17.115   17.365   59.397    0.352  237.588
>>    131072   17.430   16.902   17.511   59.271    0.936  474.167
>>     65536   16.726   16.894   17.246   60.404    0.768  966.461
>>     32768   16.662   17.517   17.052   59.989    1.224 1919.658
>>     16384   17.429   16.793   16.753   60.285    1.085 3858.268
>>
>> 10) client: default max_sectors_kb, 2MB RA. server: 64 max_sectors_kb, RA 2MB
>> blocksize        R        R        R   R(avg,   R(std,        R
>>   (bytes)      (s)      (s)      (s)    MB/s)    MB/s)   (IOPS)
>>  67108864   17.601   18.334   17.379   57.650    1.307    0.901
>>  33554432   18.281   18.128   17.169   57.381    1.610    1.793
>>  16777216   17.660   17.875   17.356   58.091    0.703    3.631
>>   8388608   17.724   17.810   18.383   56.992    0.918    7.124
>>   4194304   17.475   17.770   19.003   56.704    2.031   14.176
>>   2097152   17.287   17.674   18.492   57.516    1.604   28.758
>>   1048576   17.972   17.460   18.777   56.721    1.689   56.721
>>    524288   18.680   18.952   19.445   53.837    0.890  107.673
>>    262144   18.070   18.337   18.639   55.817    0.707  223.270
>>    131072   16.990   16.651   16.862   60.832    0.507  486.657
>>     65536   17.707   16.972   17.520   58.870    1.066  941.924
>>     32768   17.767   17.208   17.205   58.887    0.885 1884.399
>>     16384   18.258   17.252   18.035   57.407    1.407 3674.059
>>
>> 11) client: 64 max_sectors_kb, 2MB RA. server: 64 max_sectors_kb, RA 2MB
>> blocksize        R        R        R   R(avg,   R(std,        R
>>   (bytes)      (s)      (s)      (s)    MB/s)    MB/s)   (IOPS)
>>  67108864   17.993   18.307   18.718   55.850    0.902    0.873
>>  33554432   19.554   18.485   17.902   54.988    1.993    1.718
>>  16777216   18.829   18.236   18.748   55.052    0.785    3.441
>>   8388608   21.152   19.065   18.738   52.257    2.745    6.532
>>   4194304   19.131   19.703   17.850   54.288    2.268   13.572
>>   2097152   19.093   19.152   19.509   53.196    0.504   26.598
>>   1048576   19.371   18.775   18.804   53.953    0.772   53.953
>>    524288   20.003   17.911   18.602   54.470    2.476  108.940
>>    262144   19.182   19.460   18.476   53.809    1.183  215.236
>>    131072   19.403   19.192   18.907   53.429    0.567  427.435
>>     65536   19.502   19.656   18.599   53.219    1.309  851.509
>>     32768   18.746   18.747   18.250   55.119    0.701 1763.817
>>     16384   20.977   19.437   18.840   51.951    2.319 3324.862
>
> The results look inconsistent with what you had previously (89.7 MB/s).
> How can you explain it?

I had more patches applied with that test (scst_exec_req_fifo-2.6.29,
put_page_callback-2.6.29), and I used a different dd command:

dd if=/dev/sdc of=/dev/zero bs=512K count=2000

But all that said, I can't reproduce speeds that high now. I must have
made a mistake back then (maybe I forgot to clear the pagecache).

> I think, most likely, there was some confusion between the tested and
> patched versions of the kernel, or you forgot to apply the io_context
> patch. Please recheck.

The tests above were definitely done right. I just rechecked the
patches, and I do see an average increase of about 10 MB/s over an
unpatched kernel, but overall the performance is still pretty bad.

Ronald.
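PS: for anyone reproducing these runs: the RA and max_sectors_kb values
in the test matrix map onto the usual block-layer sysfs knobs, roughly
like this (sdb is just a stand-in for the disk under test on either
host; adjust to your device names):

  # 2MB readahead (the default is 128KB):
  echo 2048 > /sys/block/sdb/queue/read_ahead_kb
  # 64KB maximum request size (the default is typically 512KB):
  echo 64 > /sys/block/sdb/queue/max_sectors_kb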
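PPS: the cache dropping mentioned above amounts to running something
like the following from the client before each run ("server" standing
in for the target's hostname):

  sync
  echo 3 > /proc/sys/vm/drop_caches
  ssh root@server 'sync; echo 3 > /proc/sys/vm/drop_caches'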