linux-fsdevel.vger.kernel.org archive mirror
From: Andrew Morton <akpm@linux-foundation.org>
To: "Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Chinner <david@fromorbit.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	xfs@oss.sgi.com, "Theodore Ts'o" <tytso@mit.edu>
Subject: Re: [PATCH 3/4] writeback: pay attention to wbc->nr_to_write in write_cache_pages
Date: Fri, 30 Apr 2010 12:43:29 -0700
Message-ID: <20100430124329.10a4c02b.akpm@linux-foundation.org>
In-Reply-To: <87sk6dwka6.fsf@linux.vnet.ibm.com>

On Fri, 30 Apr 2010 11:31:53 +0530
"Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com> wrote:

> On Thu, 29 Apr 2010 14:39:31 -0700, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Tue, 20 Apr 2010 12:41:53 +1000
> > Dave Chinner <david@fromorbit.com> wrote:
> > 
> > > If a filesystem writes more than one page in ->writepage, write_cache_pages
> > > fails to notice this and continues to attempt writeback when wbc->nr_to_write
> > > has gone negative - this trace was captured from XFS:
> > > 
> > > 
> > >     wbc_writeback_start: towrt=1024
> > >     wbc_writepage: towrt=1024
> > >     wbc_writepage: towrt=0
> > >     wbc_writepage: towrt=-1
> > >     wbc_writepage: towrt=-5
> > >     wbc_writepage: towrt=-21
> > >     wbc_writepage: towrt=-85
> > > 
> > 
> > Bug.
> > 
> > AFAICT it's a regression introduced by
> > 
> > : commit 17bc6c30cf6bfffd816bdc53682dd46fc34a2cf4
> > : Author:     Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > : AuthorDate: Thu Oct 16 10:09:17 2008 -0400
> > : Commit:     Theodore Ts'o <tytso@mit.edu>
> > : CommitDate: Thu Oct 16 10:09:17 2008 -0400
> > : 
> > :     vfs: Add no_nrwrite_index_update writeback control flag
> > 
> > I suggest that what you do here is remove the local `nr_to_write' from
> > write_cache_pages() and go back to directly using wbc->nr_to_write
> > within the loop.
> > 
> > And thus we restore the convention that if the fs writes back more than
> > a single page, it subtracts (nr_written - 1) from wbc->nr_to_write.
> > 
> 
> My mistake, I never expected writepage to write more than one page.

The writeback code is tricky and easy to break in subtle ways.

> The
> interface said 'writepage' so it was natural to expect that it writes only
> one page. BTW the reason for the change is to give file systems which
> accumulate dirty pages using write_cache_pages and attempt to write
> them out later a chance to properly manage nr_to_write. Something like
> 
> ext4_da_writepages
> -- write_cache_pages
> ---- collect dirty page
> ---- return
> -- return
> -- now try to writeout all the collected dirty pages (say 100)
> ---- Only able to allocate blocks for 50 pages,
>      so update nr_to_write -= 50 and mark rest of 50 pages as dirty
>      again
> 
> So we want wbc->nr_to_write updated only by ext4_da_writepages. 

So you want a ->writepage() implementation which doesn't actually write
a page at all - it just remembers that page for later.

Maybe that fs shouldn't be calling write_cache_pages() at all.  After
all, write_cache_pages() is a wrapper which emits a sequence of calls
to ->writepage(), and ->writepage() writes a page.
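
To make the accounting problem concrete, here is a rough userspace model
of the two loop styles.  It is only a sketch, not the kernel code: the
struct, the function names and the fixed cluster size are all made up
for illustration.  The model assumes a ->writepage() that writes a whole
cluster and follows the (nr_written - 1) convention mentioned earlier in
the thread.

```c
/* Illustrative stand-in for the kernel's struct writeback_control. */
struct wbc_sketch {
	long nr_to_write;	/* pages the caller still wants written */
};

/*
 * Hypothetical clustering ->writepage(): writes up to `cluster` pages
 * in one call and subtracts (nr_written - 1) from nr_to_write.  The
 * caller is expected to account for the remaining one page.
 */
long clustering_writepage(struct wbc_sketch *wbc, long dirty, long cluster)
{
	long nr_written = dirty < cluster ? dirty : cluster;

	wbc->nr_to_write -= nr_written - 1;
	return nr_written;
}

/*
 * Loop that tracks its budget in a local copy.  It never sees the
 * filesystem's updates, so it keeps calling ->writepage() while
 * wbc->nr_to_write goes negative.  Returns the number of calls made.
 */
long loop_with_local_copy(struct wbc_sketch *wbc, long dirty, long cluster)
{
	long nr_to_write = wbc->nr_to_write;
	long calls = 0;

	while (dirty > 0 && nr_to_write > 0) {
		dirty -= clustering_writepage(wbc, dirty, cluster);
		nr_to_write--;
		calls++;
	}
	return calls;
}

/*
 * Loop that uses wbc->nr_to_write directly.  Each call costs the
 * budget one page here plus (nr_written - 1) inside ->writepage(),
 * so the loop stops once the budget is used up.
 */
long loop_with_wbc(struct wbc_sketch *wbc, long dirty, long cluster)
{
	long calls = 0;

	while (dirty > 0 && wbc->nr_to_write > 0) {
		dirty -= clustering_writepage(wbc, dirty, cluster);
		wbc->nr_to_write--;
		calls++;
	}
	return calls;
}
```

In this model, with a budget of 1024 pages, plenty of dirty pages and a
cluster of 64, the second loop stops after 16 calls with nr_to_write at
zero, while the first makes 1024 calls and leaves wbc->nr_to_write far
below zero.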

Rather than hacking around, subverting things and breaking core kernel
code, let's step back and think more clearly about what to do.

One option would be to implement a new address_space_operation which
provides the new semantics in a well-understood fashion.  Let's call it
writepage_prepare(?).  Then reimplement write_cache_pages() so that if
->writepage_prepare() is available, it handles it in a sensible fashion
and doesn't break traditional filesystems.

Or simply implement a new, different version of write_cache_pages() for
filesystems which wish to buffer in this fashion.  The new
write_cache_pages_prepare()(?) would call ->writepage_prepare(). 
Internally it might share implementation with write_cache_pages().
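
One possible shape for this, again as a userspace sketch only: no such
operation exists, the names ->writepage_prepare() and
write_cache_pages_prepare() are just the ones floated above, and every
struct, the batch limit and the "allocatable" parameter are assumptions
made for illustration.  The walker only hands pages to the filesystem
and never touches nr_to_write; the filesystem's ->writepages() is the
single place that updates it.

```c
#include <stddef.h>

#define BATCH_MAX 128	/* arbitrary limit for this sketch */

struct wbc_sketch  { long nr_to_write; };
struct page_sketch { int dirty; };

/* Pages the filesystem has remembered for later writeout. */
struct batch_sketch {
	struct page_sketch *pages[BATCH_MAX];
	size_t count;
};

/* Hypothetical ops table holding only the proposed new operation. */
struct aops_sketch {
	int (*writepage_prepare)(struct page_sketch *page, void *fs_private);
};

/*
 * Walker: passes each dirty page to ->writepage_prepare() and leaves
 * the writeback budget alone.  Returns the number of pages collected.
 */
size_t write_cache_pages_prepare_sketch(const struct aops_sketch *ops,
					struct page_sketch *pages, size_t n,
					void *fs_private)
{
	size_t collected = 0;

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].dirty)
			continue;
		if (ops->writepage_prepare(&pages[i], fs_private) != 0)
			break;	/* batch is full */
		collected++;
	}
	return collected;
}

/* Filesystem side: remember the page, write nothing yet. */
int batch_add(struct page_sketch *page, void *fs_private)
{
	struct batch_sketch *b = fs_private;

	if (b->count == BATCH_MAX)
		return -1;
	b->pages[b->count++] = page;
	return 0;
}

/*
 * Filesystem ->writepages(): collect first, then write as many pages
 * as blocks could be allocated for.  Only this function updates
 * wbc->nr_to_write.  Returns the number of pages written.
 */
size_t fs_writepages_sketch(struct page_sketch *pages, size_t n,
			    struct wbc_sketch *wbc, size_t allocatable)
{
	const struct aops_sketch ops = { .writepage_prepare = batch_add };
	struct batch_sketch batch = { .count = 0 };
	size_t written = 0;

	write_cache_pages_prepare_sketch(&ops, pages, n, &batch);

	for (size_t i = 0; i < batch.count && written < allocatable; i++) {
		batch.pages[i]->dirty = 0;	/* "written": page is clean */
		written++;
	}
	/* Pages beyond `written` were never cleaned, so they stay dirty. */
	wbc->nr_to_write -= (long)written;
	return written;
}
```

With 100 dirty pages and blocks available for only 50, this model
writes 50 pages, subtracts 50 from nr_to_write and leaves the other 50
dirty, which is the ext4 case described earlier in the thread.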

There are lots of options.  But the way in which write_cache_pages()
was extended to handle this ext4 requirement was rather unclean,
non-obvious and, umm, broken!
