From: Wu Fengguang <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Dave Chinner <david@fromorbit.com>,
Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
Linux Memory Management List <linux-mm@kvack.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 2/4] writeback: avoid duplicate balance_dirty_pages_ratelimited() calls
Date: Thu, 14 Apr 2011 08:30:45 +0800
Message-ID: <20110414003045.GB6097@localhost>
In-Reply-To: <20110413215307.GD4648@quack.suse.cz>
On Thu, Apr 14, 2011 at 05:53:07AM +0800, Jan Kara wrote:
> On Wed 13-04-11 16:59:39, Wu Fengguang wrote:
> > When dd in 512bytes, balance_dirty_pages_ratelimited() could be called 8
> > times for the same page, but obviously the page is only dirtied once.
> >
> > Fix it with a (slightly racy) PageDirty() test.
> >
> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > ---
> > mm/filemap.c | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > --- linux-next.orig/mm/filemap.c 2011-04-13 16:46:01.000000000 +0800
> > +++ linux-next/mm/filemap.c 2011-04-13 16:47:26.000000000 +0800
> > @@ -2313,6 +2313,7 @@ static ssize_t generic_perform_write(str
> > long status = 0;
> > ssize_t written = 0;
> > unsigned int flags = 0;
> > + unsigned int dirty;
> >
> > /*
> > * Copies from kernel address space cannot fail (NFSD is a big user).
> > @@ -2361,6 +2362,7 @@ again:
> > pagefault_enable();
> > flush_dcache_page(page);
> >
> > + dirty = PageDirty(page);
> This isn't completely right as we sometimes dirty the page in
> ->write_begin() (see e.g. block_write_begin() when we allocate blocks under
> an already uptodate page) and in such cases we would not call
> balance_dirty_pages(). So I'm not sure we can really do this
> optimization (although it's sad)...
Good catch, thanks! I evaluated three possible options; the last one
looks the most promising (although it is a radical change).

- do radix_tree_tag_get() before calling ->write_begin()
  simple but heavyweight
- add balance_dirty_pages_ratelimited() in __block_write_begin()
  seems not easy, either
- accurately account the dirtied pages in account_page_dirtied() rather than
  in balance_dirty_pages_ratelimited_nr(). This diff on top of my patchset
  illustrates the idea, but cases like direct IO still need to be sorted out ...

--- linux-next.orig/mm/page-writeback.c 2011-04-14 07:50:09.000000000 +0800
+++ linux-next/mm/page-writeback.c 2011-04-14 07:52:35.000000000 +0800
@@ -1295,8 +1295,6 @@ void balance_dirty_pages_ratelimited_nr(
 	if (!bdi_cap_account_dirty(bdi))
 		return;
 
-	current->nr_dirtied += nr_pages_dirtied;
-
 	if (dirty_exceeded_recently(bdi, MAX_PAUSE)) {
 		unsigned long max = current->nr_dirtied +
 						(128 >> (PAGE_SHIFT - 10));
@@ -1752,6 +1750,7 @@ void account_page_dirtied(struct page *p
 		__inc_bdi_stat(mapping->backing_dev_info, BDI_DIRTIED);
 		task_dirty_inc(current);
 		task_io_account_write(PAGE_CACHE_SIZE);
+		current->nr_dirtied++;
 	}
 }
 EXPORT_SYMBOL(account_page_dirtied);
> > mark_page_accessed(page);
> > status = a_ops->write_end(file, mapping, pos, bytes, copied,
> > page, fsdata);
> > @@ -2387,7 +2389,8 @@ again:
> > pos += copied;
> > written += copied;
> >
> > - balance_dirty_pages_ratelimited(mapping);
> > + if (!dirty)
> > + balance_dirty_pages_ratelimited(mapping);
> >
> > } while (iov_iter_count(i));
>
> Honza
> --
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR