From: Andrew Morton <akpm@linux-foundation.org>
To: Nick Piggin <npiggin@suse.de>
Cc: linux-fsdevel@vger.kernel.org,
Dave Kleikamp <shaggy@linux.vnet.ibm.com>,
Badari Pulavarty <pbadari@gmail.com>
Subject: Re: [patch][rfc] fs: fix nobh error handling
Date: Tue, 7 Aug 2007 19:33:47 -0700 [thread overview]
Message-ID: <20070807193347.fbcd7f38.akpm@linux-foundation.org> (raw)
In-Reply-To: <20070808021838.GA11018@wotan.suse.de>
On Wed, 8 Aug 2007 04:18:38 +0200 Nick Piggin <npiggin@suse.de> wrote:
> On Tue, Aug 07, 2007 at 06:09:03PM -0700, Andrew Morton wrote:
> > On Tue, 7 Aug 2007 07:51:29 +0200
> > Nick Piggin <npiggin@suse.de> wrote:
> >
> > > nobh mode error handling is not just pretty slack, it's wrong.
> > >
> > > One cannot zero out the whole page to ensure new blocks are zeroed,
> > > because it just brings the whole page "uptodate" with zeroes even if
> > > that may not be the correct uptodate data. Also, other parts of the
> > > page may already contain dirty data which would get lost by zeroing
> > > it out. Thirdly, the writeback of zeroes to the new blocks will also
> > > erase existing blocks. All these conditions are pagecache and/or
> > > filesystem corruption.
> > >
> > > The problem comes about because we didn't keep track of which buffers
> > > actually are new or old. However it is not enough just to keep only
> > > this state, because at the point we start dirtying parts of the page
> > > (new blocks, with zeroes), the handling of IO errors becomes impossible
> > > without buffers because the page may only be partially uptodate, in
> > > which case the page flags allone cannot capture the state of the parts
> > > of the page.
> > >
> > > So allocate all buffers for the page upfront, but leave them unattached
> > > so that they don't pick up any other references and can be freed when
> > > we're done. If the error path is hit, then zero the new buffers as the
> > > regular buffer path does, then attach the buffers to the page so that
> > > it can actually be written out correctly and be subject to the normal
> > > IO error handling paths.
> > >
> > > As an upshot, we save 1K of kernel stack on ia64 or powerpc 64K page
> > > systems.
> > >
> > > Signed-off-by: Nick Piggin <npiggin@suse.de>
> > >
> >
> > With this change, nobh_prepare_write() can magically attach buffers to the
> > page. But a filesystem which is running in nobh mode wouldn't expect that,
> > and could quite legitimately go BUG, or leak data or, more seriously and
> > much less fixably, just go and overwrite page->private, because it "knows"
> > that nobody else is using ->private.
>
> I was fairly sure that a filesystem can not assume buffers won't be
> attached, because there are various error case paths thta do exactly
> the same thing (eg. nobh_writepage can call __block_write_full_page
> which will attach buffers).
oh crap, that's sad. Either we broke it later on or I misremembered.
> Does any filesystem assume this? Is it not already broken?
Yes, it would be broken.
>
> > I'd have thought that it would be better to not attach the buffers and to
> > go ahead and do whatever synchronous IO is needed in the error recovery
> > code, then free those buffers again.
>
> It is hard because if the synchronous IO fails, then what do you do?
Do what we usually do when an IO error happens: crash the kernel? (Sorry,
have been spending too long at bugzilla.kernel.org)
> You could try making it up as you go along, but of course if we _can_
> attach the buffers here then it would be preferable to do that. IMO.
>
>
> > Also, you have a couple of (cheerily uncommented) PagePrivate() tests in
> > there which should be page_has_buffers().
>
> Yeah, I guess the whole thing needs more commenting :P
> page_has_buffers... right, I'll change that.
Did it get much testing?
next prev parent reply other threads:[~2007-08-08 2:34 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-08-07 5:51 [patch][rfc] fs: fix nobh error handling Nick Piggin
2007-08-08 1:09 ` Andrew Morton
2007-08-08 2:18 ` Nick Piggin
2007-08-08 2:33 ` Andrew Morton [this message]
2007-08-08 3:12 ` Nick Piggin
2007-08-08 13:07 ` Dave Kleikamp
2007-08-08 14:39 ` Mingming Cao
2007-08-09 1:45 ` Nick Piggin
2007-08-20 4:33 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070807193347.fbcd7f38.akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=npiggin@suse.de \
--cc=pbadari@gmail.com \
--cc=shaggy@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).