From: Jan Kara <jack@suse.cz>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>, Jan Kara <jack@suse.cz>,
Mel Gorman <mel@csn.ul.ie>,
Andrew Morton <akpm@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Dave Chinner <david@fromorbit.com>,
Chris Mason <chris.mason@oracle.com>,
Nick Piggin <npiggin@suse.de>, Rik van Riel <riel@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Andrea Arcangeli <aarcange@redhat.com>
Subject: Re: [PATCH 0/9] Reduce writeback from page reclaim context V5
Date: Tue, 3 Aug 2010 14:52:49 +0200
Message-ID: <20100803125249.GD3322@quack.suse.cz>
In-Reply-To: <20100803073449.GA21452@localhost>

On Tue 03-08-10 15:34:49, Wu Fengguang wrote:
> On Thu, Jul 29, 2010 at 04:45:23PM +0800, Christoph Hellwig wrote:
> > Btw, I'm very happy with all this writeback related progress we've made
> > for the 2.6.36 cycle. The only major thing that's really missing, and
> > which should help dramatically with the I/O patterns is stopping direct
> > writeback from balance_dirty_pages(). I've seen patches from Wu and
> > Jan for this and lots of discussion. If we get either variant in
> > this should be one of the best VM releases from the filesystem point of
> > view.
>
> Sorry for the delay. But I'm not feeling good about the current
> patches, both mine and Jan's.
>
> Accounting overheads/accuracy are the obvious problem. Both patches do
> not perform well on large NUMA machines and fast storage. They are found
> hard to improve in previous discussions.

Yes, my patch for balance_dirty_pages() has a problem with percpu counter
(im)precision, and resorting to a pure atomic type could result in bouncing
of the cache line among CPUs completing the IO (at least that is the reason
why all other BDI stats are per-cpu, I believe).

We could solve the problem by doing the accounting at page IO submission
time (there, using the atomic type should be fine as we mostly submit IO
from the flusher thread anyway). It's just that doing the accounting at
completion time has the nice property that we really hold the throttled
thread up to the moment when the VM can really reuse the pages.

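
To illustrate the percpu counter (im)precision mentioned above, here is a
toy model in Python, not kernel code. It is only loosely modeled on how a
batched per-CPU counter behaves; the class name, the method names, and the
values of 64 CPUs and a batch of 32 are assumptions made up for the example.

```python
class BatchedCounter:
    """Toy model of a per-CPU counter with batched folding (not kernel code)."""

    def __init__(self, ncpus, batch):
        self.batch = batch
        self.shared = 0              # the shared value every CPU would contend on
        self.local = [0] * ncpus     # per-CPU deltas, cheap to update

    def add(self, cpu, amount):
        self.local[cpu] += amount
        # Fold into the shared value only once the local delta is large enough.
        if abs(self.local[cpu]) >= self.batch:
            self.shared += self.local[cpu]
            self.local[cpu] = 0

    def read_approx(self):
        # Cheap read: ignores whatever is still parked in the per-CPU deltas.
        return self.shared

    def read_exact(self):
        # Expensive read: has to visit every CPU.
        return self.shared + sum(self.local)

    def max_error(self):
        # Each CPU can hide at most batch - 1 units from read_approx().
        return len(self.local) * (self.batch - 1)


counter = BatchedCounter(ncpus=64, batch=32)
for cpu in range(64):
    for _ in range(31):              # stay just under the batch on every CPU
        counter.add(cpu, 1)

print(counter.read_approx(), counter.read_exact(), counter.max_error())
```

In this worst case the cheap read sees nothing at all while 1984 units are
pending, and the bound grows linearly with the number of CPUs. Replacing the
per-CPU deltas with a single shared value removes the error but makes every
update touch the same memory, which is the cache line bouncing concern.
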
> We might do dirty throttling based on throughput, ignoring the
> writeback completions totally. The basic idea is, for current process,
> we already have a per-bdi-and-task threshold B as the local throttle

Do we? The limit is currently just per-bdi, isn't it? Or do you mean
the ratelimiting, i.e. how often we call balance_dirty_pages()?
That is per-cpu, if I'm right.

> target. When dirty pages go beyond B*80% for example, we start
> throttling the task's writeback throughput. The closer to B, the
> lower the throughput. When it reaches B or the global threshold, we
> completely stop it. The hope is, the throughput will be sustained at
> some balance point. This will need careful calculation to be stable
> and robust.

But what exactly do you mean by throttling the task in your scenario?
What would it wait on?

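
To make the quoted proposal concrete, here is a minimal sketch of such a
ramp in Python. The proposal only says that throughput should fall as the
dirty page count approaches B and stop at B; the linear shape, the function
name, and the parameters below are assumptions for illustration. The global
threshold is left out, and the sketch says nothing about what the task
would actually wait on.

```python
def allowed_throughput(dirty, threshold, full_rate, start_fraction=0.8):
    """Hypothetical linear throttle ramp between start_fraction * B and B.

    dirty          -- current dirty pages counted against the task
    threshold      -- the per-bdi-and-task threshold B
    full_rate      -- unthrottled throughput, in any consistent unit
    start_fraction -- where throttling begins (0.8 follows the B*80% example)
    """
    start = threshold * start_fraction
    if dirty <= start:
        return full_rate             # below the ramp: no throttling
    if dirty >= threshold:
        return 0.0                   # at or beyond B: stop completely
    # Inside the ramp: scale linearly from full_rate down to zero.
    return full_rate * (threshold - dirty) / (threshold - start)


for dirty in (700, 800, 900, 950, 1000):
    print(dirty, allowed_throughput(dirty, threshold=1000, full_rate=100.0))
```

With B = 1000 the rate stays at full speed up to 800 dirty pages, is halved
at 900, and reaches zero at 1000. Whether a ramp like this settles at a
stable balance point depends on how the rate is enforced, which is exactly
the open question above.
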
> In this way, the throttle can be made very smooth. My old experiments
> show that the current writeback completion based throttling fluctuates
> a lot for the stall time. In particular it makes bumpy writeback for
> NFS, so that sometimes the network pipe is not active at all and
> performance is impacted noticeably.
>
> By the way, we'll harvest a writeback IO controller :)
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR