From: Fengguang Wu <fengguang.wu@intel.com>
To: Namjae Jeon <linkinjeon@gmail.com>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
Namjae Jeon <namjae.jeon@samsung.com>,
linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH 3/3] writeback: add dirty_ratio_time per bdi variable (NFS write performance)
Date: Tue, 21 Aug 2012 21:04:58 +0800
Message-ID: <20120821130458.GD22321@localhost>
In-Reply-To: <CAKYAXd9L5Kwj3SEHO=a60c65OXRt73q+-w-Uux6OJ=6Q-3jvYQ@mail.gmail.com>

On Tue, Aug 21, 2012 at 03:00:13PM +0900, Namjae Jeon wrote:
> 2012/8/20, Fengguang Wu <fengguang.wu@intel.com>:
> > On Mon, Aug 20, 2012 at 09:48:42AM +0900, Namjae Jeon wrote:
> >> 2012/8/19, Fengguang Wu <fengguang.wu@intel.com>:
> >> > On Sat, Aug 18, 2012 at 05:50:02AM -0400, Namjae Jeon wrote:
> >> >> From: Namjae Jeon <namjae.jeon@samsung.com>
> >> >>
> >> >> This patch is based on suggestion by Wu Fengguang:
> >> >> https://lkml.org/lkml/2011/8/19/19
> >> >>
> >> >> The kernel has a mechanism to do writeback as per dirty_ratio and
> >> >> dirty_background_ratio. It also maintains a per-task dirty rate limit
> >> >> to keep the number of dirty pages balanced at any given instant by
> >> >> doing bdi bandwidth estimation.
> >> >>
> >> >> The kernel also has max_ratio/min_ratio tunables to specify the
> >> >> percentage of writecache used to control per-bdi dirty limits and
> >> >> task throttling.
> >> >>
> >> >> However, there might be a use case where the user wants a writeback
> >> >> tuning parameter to flush dirty data at a desired/tuned time interval.
> >> >>
> >> >> dirty_background_time provides an interface where the user can tune
> >> >> the background writeback start time using
> >> >> /sys/block/sda/bdi/dirty_background_time
> >> >>
> >> >> dirty_background_time is used along with the average bdi write
> >> >> bandwidth estimation to start background writeback.
> >> >
> >> > Here lies my major concern about dirty_background_time: the write
> >> > bandwidth estimation is an _estimation_ and will surely become wildly
> >> > wrong in some cases. So a dirty_background_time implementation based
> >> > on it will not always meet the user's expectations.
> >> >
> >> > One important case is, some users (eg. Dave Chinner) explicitly take
> >> > advantage of the existing behavior to quickly create & delete a big
> >> > 1GB temp file without worrying about triggering unnecessary IOs.
> >> >
> >> Hi, Wu.
> >> Okay, I have a question.
> >>
> >> If we make dirty_writeback_interval per-bdi so that a short interval
> >> can be tuned, instead of adding background_time, we can get a similar
> >> performance improvement.
> >> /sys/block/<device>/bdi/dirty_writeback_interval
> >> /sys/block/<device>/bdi/dirty_expire_interval
> >>
> >> NFS write performance improvement is just one use case.
> >>
> >> If we can set the interval/time per bdi, other use cases will emerge
> >> from applying it.
> >
> > Per-bdi interval/time tunables, if there comes such a need, will in
> > essence be for data caching and safety. If we turn them into a
> > requirement for better performance, users will potentially be
> > stretched in choosing the "right" value that balances data caching,
> > safety and performance. Hmm, not a comfortable prospect.
> Hi Wu.
> First, thanks for sharing the information.
>
> I change the writeback interval on the NFS server only.

OK, sorry for missing that part!

> I think that this does not change data cache/page caching behaviour
> on the NFS client. The NFS client will start sending write requests as
> per the default NFS/writeback logic. So, no change in NFS client data
> caching behaviour.
>
> Also, on the NFS server it does not change system-wide caching
> behaviour. It only modifies the caching/writeback behaviour of a
> particular “bdi” on the NFS server so that the NFS client can see
> better WRITE speed.

But would you default to dirty_background_time=0, where the special
value 0 means no change to the original behavior? That will address
David's very reasonable concern. Otherwise quite a few users are going
to be surprised by the new behavior after upgrading the kernel.
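
A rough sketch of the semantics asked about above, not the patch's actual
code: with dirty_background_time=0 the existing background threshold is used
unchanged, and a nonzero value caps it by the amount of data the bdi is
estimated to write in that time. The function name, the millisecond unit, the
pages-per-second bandwidth unit and the min() against the global threshold are
all assumptions made for illustration; Python is used only for brevity.

```python
def background_thresh_pages(global_bg_thresh, bdi_write_bw, dirty_background_time_ms):
    """Return the dirty page count at which background writeback would start.

    global_bg_thresh         -- existing background threshold, in pages
    bdi_write_bw             -- estimated bdi write bandwidth, in pages/second
    dirty_background_time_ms -- hypothetical per-bdi tunable, in milliseconds;
                                0 is the special "no change" value
    """
    if dirty_background_time_ms == 0:
        # Special value 0: keep the original behavior untouched.
        return global_bg_thresh

    # Pages the bdi is estimated to write within the configured time.
    time_thresh = bdi_write_bw * dirty_background_time_ms // 1000

    # Assumption: the time-based threshold can only lower the existing one.
    return min(global_bg_thresh, time_thresh)


# Example: 100000-page global threshold, 25600 pages/s estimated bandwidth.
print(background_thresh_pages(100000, 25600, 0))     # tunable disabled
print(background_thresh_pages(100000, 25600, 1000))  # one second of writeout
```

Since bdi_write_bw is only an estimate, the time-based threshold inherits its
error, which is why a default of 0 would leave existing users unaffected.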
> I will share several performance test results, as Dave suggested.
>
> >
> >> > The numbers are impressive! FYI, I tried another NFS specific approach
> >> > to avoid big NFS COMMITs, which achieved similar performance gains:
> >>
> >> > nfs: writeback pages wait queue
> >> > https://lkml.org/lkml/2011/10/20/235
> This patch looks like a client-side optimization to me (I need to check more).

Yes.

> Do we need the server-side optimization as well, as Bruce suggested?

Sure.

Thanks,
Fengguang