linux-fsdevel.vger.kernel.org archive mirror
From: Wu Fengguang <fengguang.wu@intel.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.cz>, Christoph Hellwig <hch@lst.de>,
	Dave Chinner <david@fromorbit.com>,
	Greg Thelen <gthelen@google.com>,
	Minchan Kim <minchan.kim@gmail.com>,
	Andrea Righi <arighi@develer.com>, linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 00/11] IO-less dirty throttling v12
Date: Wed, 5 Oct 2011 21:56:16 +0800
Message-ID: <20111005135615.GA16438@localhost>
In-Reply-To: <20111004195206.GG28306@redhat.com>

On Wed, Oct 05, 2011 at 03:52:06AM +0800, Vivek Goyal wrote:
> On Mon, Oct 03, 2011 at 09:42:28PM +0800, Wu Fengguang wrote:
> > Hi,
> > 
> > This is the minimal IO-less balance_dirty_pages() changes that are expected to
> > be regression free (well, except for NFS).
> > 
> >         git://github.com/fengguang/linux.git dirty-throttling-v12
> > 
> > Tests results will be posted in a separate email.
> 
> Looks like we are solving two problems.
> 
> - IO less balance_dirty_pages()
> - Throttling based on ratelimit instead of based on number of dirty pages.
> 
> The second piece is the one which has complicated calculations for
> calculating the global/bdi rates and logic for stabilizing the rates etc.
> 
> IIUC, second piece is primarily needed for better latencies for writers.

Well, yes. The bdi->dirty_ratelimit estimation turns out to be the
most confusing part of the patchset... Complexity aside, the algorithm
does work pretty well in the tests (except in small memory cases,
where its estimation accuracy no longer matters).

Note that the bdi->dirty_ratelimit estimation, even when it goes
wrong, is very unlikely to cause large regressions. The known
regressions mostly originate from the IO-less nature of the throttling.

> Will it make sense to break down this work in two patch series. First
> push IO less balance dirty pages and then all the complicated pieces
> of ratelimits.
> 
> ratelimit allowed you to come up with sleep time for the process. Without
> that I think you shall have to fall back to what Jan Kara had done,
> calculation based on number of pages.

If we drop all the smoothness considerations, the minimal
implementation would be close to this patch:

        [PATCH 05/35] writeback: IO-less balance_dirty_pages()
        http://www.spinics.net/lists/linux-mm/msg12880.html

However, experience shows that it may lead to much worse latencies
than the vanilla kernel in JBOD cases. This is because the vanilla
kernel has the option to break out of the loop once it has written
enough pages, whereas the IO-less balance_dirty_pages() will just wait
until the dirty pages drop below the (rushed high) bdi threshold,
which could take a long time.
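
To make the latency difference concrete, here is a toy Python model
of the two exit conditions. It is not kernel code: the function names,
the page counts and the writeout bandwidth are all hypothetical, and
it assumes the device drains dirty pages at a constant rate.

```python
def vanilla_wait_seconds(write_chunk_pages, write_bw_pages_per_s):
    """Vanilla-style exit (toy model): the task leaves the loop once it
    has written its own chunk of pages, however many pages remain dirty."""
    return write_chunk_pages / write_bw_pages_per_s


def threshold_wait_seconds(bdi_dirty_pages, bdi_thresh_pages,
                           write_bw_pages_per_s):
    """Threshold-only exit (toy model): the task waits until the bdi's
    dirty pages have dropped back below the bdi threshold."""
    excess = max(0, bdi_dirty_pages - bdi_thresh_pages)
    return excess / write_bw_pages_per_s


if __name__ == "__main__":
    bw = 25600  # hypothetical: 100 MB/s with 4 KB pages
    # Hypothetical 1536-page (6 MB) chunk written by the task itself.
    print(vanilla_wait_seconds(1536, bw))
    # Hypothetical bdi with 100000 pages above its threshold.
    print(threshold_wait_seconds(200000, 100000, bw))
```

In this model the threshold-only wait grows with the distance between
the dirty page count and the threshold, while the vanilla-style wait
is bounded by the chunk size.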

Another problem is that the IO-less balance_dirty_pages() basically
does

        on every N pages dirtied, sleep for M jiffies

In the current patchset, we get the desired N with the formula

        N = bdi->dirty_ratelimit * desired_M / HZ

with dirty_ratelimit in pages per second and desired_M in jiffies.

Without dirty_ratelimit, it would be a problem to estimate an
adequate N that works well across various workloads.
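
The "N pages, then M jiffies" relation can be sketched in Python as
below. This is a minimal sketch, not the patchset's code: it assumes
the ratelimit is a rate in pages per second and the pause is counted
in jiffies, so the two are tied together by rate * time. The helper
names, the HZ value and the floor of one page are illustrative
assumptions.

```python
HZ = 1000  # assumed tick rate; real kernel configurations differ


def pages_per_pause(dirty_ratelimit, desired_pause_jiffies, hz=HZ):
    """N: pages a task may dirty between two sleeps so that each sleep
    lasts about desired_pause_jiffies. The floor of 1 page is an
    assumption of this sketch."""
    return max(1, dirty_ratelimit * desired_pause_jiffies // hz)


def pause_jiffies(pages_dirtied, dirty_ratelimit, hz=HZ):
    """M: sleep time that holds a task which just dirtied pages_dirtied
    pages down to dirty_ratelimit pages per second."""
    if dirty_ratelimit <= 0:
        raise ValueError("no usable ratelimit, cannot derive a pause")
    return hz * pages_dirtied // dirty_ratelimit


if __name__ == "__main__":
    rate = 25600  # hypothetical: 100 MB/s with 4 KB pages
    n = pages_per_pause(rate, desired_pause_jiffies=10)
    print(n, pause_jiffies(n, rate))
```

The ValueError branch mirrors the point made above: without a
ratelimit estimate there is nothing to derive N or M from.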

And to avoid regressions, patches 8, 9, 10 and 11 (maybe in updated
form) will still be necessary, followed by a complete rerun of all the
test cases to find and fix any new regressions.

Overall it may cost too much (if it is possible at all, considering
the two problems listed above) to try out the above steps. The
underlying question is "whether we can introduce the dirty_ratelimit
complexities later". Considering that the complexity itself is not
likely to cause problems other than loss of smoothness, it looks more
beneficial to test the ready-made code earlier in production
environments than to spend a lot of effort stripping it out and
testing new code, only to add it back in some future release.

Thanks,
Fengguang


Thread overview: 29+ messages
2011-10-03 13:42 [PATCH 00/11] IO-less dirty throttling v12 Wu Fengguang
2011-10-03 13:42 ` [PATCH 01/11] writeback: account per-bdi accumulated dirtied pages Wu Fengguang
2011-10-03 13:42 ` [PATCH 02/11] writeback: dirty position control Wu Fengguang
2011-10-03 13:42 ` [PATCH 03/11] writeback: add bg_threshold parameter to __bdi_update_bandwidth() Wu Fengguang
2011-10-03 13:42 ` [PATCH 04/11] writeback: dirty rate control Wu Fengguang
2011-10-03 13:42 ` [PATCH 05/11] writeback: stabilize bdi->dirty_ratelimit Wu Fengguang
2011-10-03 13:42 ` [PATCH 06/11] writeback: per task dirty rate limit Wu Fengguang
2011-10-03 13:42 ` [PATCH 07/11] writeback: IO-less balance_dirty_pages() Wu Fengguang
2011-10-03 13:42 ` [PATCH 08/11] writeback: limit max dirty pause time Wu Fengguang
2011-10-03 13:42 ` [PATCH 09/11] writeback: control " Wu Fengguang
2011-10-03 13:42 ` [PATCH 10/11] writeback: dirty position control - bdi reserve area Wu Fengguang
2011-10-03 13:42 ` [PATCH 11/11] writeback: per-bdi background threshold Wu Fengguang
2011-10-03 13:59 ` [PATCH 00/11] IO-less dirty throttling v12 Wu Fengguang
2011-10-05  1:42   ` Wu Fengguang
2011-10-04 19:52 ` Vivek Goyal
2011-10-05 13:56   ` Wu Fengguang [this message]
2011-10-05 15:16   ` Andi Kleen
2011-10-10 12:14 ` Peter Zijlstra
2011-10-10 13:07   ` Wu Fengguang
2011-10-10 13:10     ` [RFC][PATCH 1/2] nfs: writeback pages wait queue Wu Fengguang
2011-10-10 13:11       ` [RFC][PATCH 2/2] nfs: scale writeback threshold proportional to dirty threshold Wu Fengguang
2011-10-18  8:53         ` Wu Fengguang
2011-10-18  8:59           ` Wu Fengguang
2011-10-20  2:49             ` Wu Fengguang
2011-10-18  8:51       ` [RFC][PATCH 1/2] nfs: writeback pages wait queue Wu Fengguang
2011-10-20  3:59         ` Wu Fengguang
2011-10-10 14:28     ` [PATCH 00/11] IO-less dirty throttling v12 Wu Fengguang
2011-10-17  3:03       ` Wu Fengguang
2011-10-20  3:39 ` Wu Fengguang
