From: Wu Fengguang <fengguang.wu@intel.com>
To: Trond Myklebust <Trond.Myklebust@netapp.com>, linux-nfs@vger.kernel.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.cz>, Christoph Hellwig <hch@lst.de>,
	Dave Chinner <david@fromorbit.com>,
	Greg Thelen <gthelen@google.com>,
	Minchan Kim <minchan.kim@gmail.com>,
	Vivek Goyal <vgoyal@redhat.com>,
	Andrea Righi <arighi@develer.com>, linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 00/11] IO-less dirty throttling v12
Date: Mon, 10 Oct 2011 22:28:46 +0800
Message-ID: <20111010142846.GA21218@localhost>
In-Reply-To: <20111010130722.GA11387@localhost>

Hi Trond,

> As for the NFS performance, the dd tests show that adding a writeback
> wait queue to limit the number of NFS PG_writeback pages (patches
> will follow) yields a 48% throughput gain by itself:
> 
>       3.1.0-rc8-ioless6+         3.1.0-rc8-nfs-wq+  
> ------------------------  ------------------------  
>                    22.43       +81.8%        40.77  NFS-thresh=100M/nfs-10dd-1M-32p-32768M-100M:10-X
>                    28.21       +52.6%        43.07  NFS-thresh=100M/nfs-1dd-1M-32p-32768M-100M:10-X
>                    29.21       +55.4%        45.39  NFS-thresh=100M/nfs-2dd-1M-32p-32768M-100M:10-X
>                    14.12       +40.4%        19.83  NFS-thresh=10M/nfs-10dd-1M-32p-32768M-10M:10-X
>                    29.44       +11.4%        32.81  NFS-thresh=10M/nfs-1dd-1M-32p-32768M-10M:10-X
>                     9.09      +240.9%        30.97  NFS-thresh=10M/nfs-2dd-1M-32p-32768M-10M:10-X
>                    25.68       +84.6%        47.42  NFS-thresh=1G/nfs-10dd-1M-32p-32768M-1024M:10-X
>                    41.06        +7.6%        44.20  NFS-thresh=1G/nfs-1dd-1M-32p-32768M-1024M:10-X
>                    39.13       +25.9%        49.26  NFS-thresh=1G/nfs-2dd-1M-32p-32768M-1024M:10-X
>                   238.38       +48.4%       353.72  TOTAL
> 
> This results in a 28% overall improvement over the vanilla kernel:
> 
>       3.1.0-rc4-vanilla+         3.1.0-rc8-nfs-wq+  
> ------------------------  ------------------------  
>                    20.89       +95.2%        40.77  NFS-thresh=100M/nfs-10dd-1M-32p-32768M-100M:10-X
>                    39.43        +9.2%        43.07  NFS-thresh=100M/nfs-1dd-1M-32p-32768M-100M:10-X
>                    26.60       +70.6%        45.39  NFS-thresh=100M/nfs-2dd-1M-32p-32768M-100M:10-X
>                    12.70       +56.1%        19.83  NFS-thresh=10M/nfs-10dd-1M-32p-32768M-10M:10-X
>                    27.41       +19.7%        32.81  NFS-thresh=10M/nfs-1dd-1M-32p-32768M-10M:10-X
>                    26.52       +16.8%        30.97  NFS-thresh=10M/nfs-2dd-1M-32p-32768M-10M:10-X
>                    40.70       +16.5%        47.42  NFS-thresh=1G/nfs-10dd-1M-32p-32768M-1024M:10-X
>                    45.28        -2.4%        44.20  NFS-thresh=1G/nfs-1dd-1M-32p-32768M-1024M:10-X
>                    35.74       +37.8%        49.26  NFS-thresh=1G/nfs-2dd-1M-32p-32768M-1024M:10-X
>                   275.28       +28.5%       353.72  TOTAL
> 
> As for the main concern, the NFS commits: the wait queue patch
> increases the (nr_commits / bytes_written) ratio by +74% for the
> thresh=1G,10dd case and +55% for the thresh=100M,10dd case, and the
> change is mostly negligible in the other 1dd and 2dd cases, which
> looks acceptable.
> 
> The other noticeable change with the wait queue is that the RTT time per

Sorry, it's not RTT but mainly the local queue time of the WRITE RPCs.

> write is reduced by one to two orders of magnitude in many of the
> cases below (from dozens of seconds to hundreds of milliseconds).

I also measured the stddev of the network bandwidth, and found
smoother network transfers in general with the wait queue, which is
expected.

thresh=1G
        vanilla       ioless6       nfs-wq
1dd     83088173.728  53468627.578  53627922.011
2dd     52398918.208  43733074.167  53531381.177
10dd    67792638.857  44734947.283  39681731.234
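
For reference, the stddev here is the plain sample standard deviation
over periodic throughput samples, sqrt(sum((x_i - mean)^2) / (n - 1)).
A minimal userspace sketch of the computation (not the actual
measurement script; the sample units are whatever the bandwidth probe
reports):

#include <math.h>
#include <stddef.h>

/* Sample standard deviation of n periodic throughput samples. */
static double bw_stddev(const double *x, size_t n)
{
	double mean = 0.0, var = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		mean += x[i];
	mean /= n;

	for (i = 0; i < n; i++)
		var += (x[i] - mean) * (x[i] - mean);

	return sqrt(var / (n - 1));	/* requires n >= 2 */
}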

However, the major difference should still be that the writeback wait
queue significantly reduces the local queue time of the WRITE RPCs.

The wait queue patch looks reasonable in that it keeps the pages in
the PG_dirty state rather than prematurely moving them to PG_writeback
only to have them queued up for dozens of seconds before transmission.
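
To illustrate the idea, a minimal sketch; all the names here
(nfs_wb_wait, nfs_wb_pages, nfs_wb_limit) are made up for illustration
and this is not the actual RFC patch, which also has to handle
per-server limits and wakeup races:

#include <linux/wait.h>
#include <linux/atomic.h>

static DECLARE_WAIT_QUEUE_HEAD(nfs_wb_wait);
static atomic_long_t nfs_wb_pages = ATOMIC_LONG_INIT(0);
static unsigned long nfs_wb_limit = 1024;	/* max PG_writeback pages */

/* Called before moving a PG_dirty page to PG_writeback. */
static void nfs_wb_throttle(void)
{
	/* Keep the page PG_dirty until there is room in the pipeline. */
	wait_event(nfs_wb_wait,
		   atomic_long_read(&nfs_wb_pages) < nfs_wb_limit);
	atomic_long_inc(&nfs_wb_pages);
}

/* Called when the WRITE RPC completes and PG_writeback is cleared. */
static void nfs_wb_done(void)
{
	if (atomic_long_dec_return(&nfs_wb_pages) < nfs_wb_limit)
		wake_up(&nfs_wb_wait);
}

This way the queueing happens while the pages are still PG_dirty, so
the dirty throttling keeps seeing them, instead of the pages sitting
in PG_writeback for dozens of seconds waiting for xmit.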

It should be safe because this is exactly the old, proven behavior
from before the per-bdi writeback patches introduced in 2.6.32. The
2nd patch, which scales nfs_congestion_kb proportionally to the dirty
threshold, is a new change, though.
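
The shape of that 2nd change, as a sketch only; the 1/8 fraction and
the helper name are illustrative assumptions, not the patch's actual
code:

#include <linux/mm.h>

extern int nfs_congestion_kb;	/* the existing NFS writeback threshold */

/*
 * Hypothetical helper: recompute nfs_congestion_kb from the global
 * dirty threshold whenever the latter changes, instead of sizing it
 * from total RAM once at module load.
 */
static void nfs_scale_congestion_kb(unsigned long dirty_thresh_pages)
{
	/* pages -> kilobytes: shift by (PAGE_SHIFT - 10) */
	nfs_congestion_kb = (dirty_thresh_pages / 8) << (PAGE_SHIFT - 10);
}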

Thanks,
Fengguang


Thread overview: 29+ messages
2011-10-03 13:42 [PATCH 00/11] IO-less dirty throttling v12 Wu Fengguang
2011-10-03 13:42 ` [PATCH 01/11] writeback: account per-bdi accumulated dirtied pages Wu Fengguang
2011-10-03 13:42 ` [PATCH 02/11] writeback: dirty position control Wu Fengguang
2011-10-03 13:42 ` [PATCH 03/11] writeback: add bg_threshold parameter to __bdi_update_bandwidth() Wu Fengguang
2011-10-03 13:42 ` [PATCH 04/11] writeback: dirty rate control Wu Fengguang
2011-10-03 13:42 ` [PATCH 05/11] writeback: stabilize bdi->dirty_ratelimit Wu Fengguang
2011-10-03 13:42 ` [PATCH 06/11] writeback: per task dirty rate limit Wu Fengguang
2011-10-03 13:42 ` [PATCH 07/11] writeback: IO-less balance_dirty_pages() Wu Fengguang
2011-10-03 13:42 ` [PATCH 08/11] writeback: limit max dirty pause time Wu Fengguang
2011-10-03 13:42 ` [PATCH 09/11] writeback: control " Wu Fengguang
2011-10-03 13:42 ` [PATCH 10/11] writeback: dirty position control - bdi reserve area Wu Fengguang
2011-10-03 13:42 ` [PATCH 11/11] writeback: per-bdi background threshold Wu Fengguang
2011-10-03 13:59 ` [PATCH 00/11] IO-less dirty throttling v12 Wu Fengguang
2011-10-05  1:42   ` Wu Fengguang
2011-10-04 19:52 ` Vivek Goyal
2011-10-05 13:56   ` Wu Fengguang
2011-10-05 15:16   ` Andi Kleen
2011-10-10 12:14 ` Peter Zijlstra
2011-10-10 13:07   ` Wu Fengguang
2011-10-10 13:10     ` [RFC][PATCH 1/2] nfs: writeback pages wait queue Wu Fengguang
2011-10-10 13:11       ` [RFC][PATCH 2/2] nfs: scale writeback threshold proportional to dirty threshold Wu Fengguang
2011-10-18  8:53         ` Wu Fengguang
2011-10-18  8:59           ` Wu Fengguang
2011-10-20  2:49             ` Wu Fengguang
2011-10-18  8:51       ` [RFC][PATCH 1/2] nfs: writeback pages wait queue Wu Fengguang
2011-10-20  3:59         ` Wu Fengguang
2011-10-10 14:28     ` Wu Fengguang [this message]
2011-10-17  3:03       ` [PATCH 00/11] IO-less dirty throttling v12 Wu Fengguang
2011-10-20  3:39 ` Wu Fengguang
