From: NeilBrown <neilb@suse.de>
To: Christoph Hellwig <hch@infradead.org>
Cc: Andrew Patterson <andrew.patterson@hp.com>,
	Jens Axboe <axboe@kernel.dk>,
	linux-raid@vger.kernel.org, dm-devel@redhat.com,
	linux-kernel@vger.kernel.org, James.Bottomley@suse.de
Subject: Re: [PATCH] Fix over-zealous flush_disk when changing device size.
Date: Fri, 4 Mar 2011 11:16:24 +1100
Message-ID: <20110304111624.4be27aaf@notabene.brown>
In-Reply-To: <20110303143120.GA8134@infradead.org>

On Thu, 3 Mar 2011 09:31:20 -0500 Christoph Hellwig <hch@infradead.org> wrote:

> On Thu, Feb 17, 2011 at 04:50:57PM +1100, NeilBrown wrote:
> > 
> > Hi Andrew (and others)
> >  I wonder if you would review the following for me and comment.
> 
> Please send things in this area through -fsdevel next time, thanks!

Will try to remember - it is sometimes hard to get this sort of patch before
the right audience ... I thought "block layer" rather than "file systems" :-(

Thanks for finding it anyway.

> 
> > There are two cases when we call flush_disk.
> > In one, the device has disappeared (check_disk_change) so any
> > data we hold becomes irrelevant.
> > In the other, the device has changed size (check_disk_size_change)
> > so data we hold may be irrelevant.
> > 
> > In both cases it makes sense to discard any 'clean' buffers,
> > so they will be read back from the device if needed.
> 
> Does it?  If the device has disappeared we can't read them back anyway.

I think that is the point - return an error rather than stale data.

> If the device has resized to a smaller size the same is true about
> those buffers that have gone away, and if it has resized to a larger
> size invalidating anything doesn't make sense at all.  I think this
> area needs more love than a quick kill_dirty hackjob.

I tend to agree.  I wasn't entirely convinced by the changelog comments on
the original offending patch, but I couldn't convince myself there was no
justification either, and I wanted to fix the corruption I saw - while close
to the end of a release cycle - without introducing any new regressions.
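
For anyone following along, the distinction the patch draws can be modelled
in a few lines of user-space C.  This is only a sketch, not code from the
patch: model_flush() and the buf_state enum are hypothetical names, and the
real buffer cache is far more involved.  Clean buffers are dropped in both
cases; dirty buffers are dropped only when the caller asks for it because
the device has gone away.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum buf_state { BUF_EMPTY, BUF_CLEAN, BUF_DIRTY };

/*
 * Drop cached buffers.  Clean buffers are always dropped, so later reads
 * must go back to the device.  Dirty buffers are dropped only when
 * kill_dirty is set (device gone, nowhere safe to write them).
 * Returns the number of buffers dropped.
 */
static size_t model_flush(enum buf_state *bufs, size_t n, bool kill_dirty)
{
	size_t dropped = 0;

	for (size_t i = 0; i < n; i++) {
		if (bufs[i] == BUF_CLEAN ||
		    (bufs[i] == BUF_DIRTY && kill_dirty)) {
			bufs[i] = BUF_EMPTY;
			dropped++;
		}
	}
	return dropped;
}
```

In this model a size change would call model_flush(bufs, n, false) and a
device removal would call model_flush(bufs, n, true).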

> 
> > In the former case it makes sense to discard 'dirty' buffers
> > as there will never be anywhere safe to write the data.  In the
> > second case it *does*not* make sense to discard dirty buffers
> > as that will lead to file system corruption when you simply enlarge
> > the containing devices.
> 
> Doing anything like this at the buffer cache layer or inode cache layer
> doesn't make any sense.  If a device goes away or shrinks below the
> filesystem size the filesystem simply needs to be shut down and in the
> former case the admin needs to start a manual repair.  Trying to do
> any botch jobs in lower layers never works in practice.

Amen.
What I personally would really like to see is an interface for the block
device to say to the filesystem (or more specifically: whatever has bdclaimed
it) "I am about to resize to $X - is that OK?" and also "I have resized -
deal with it".
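
To make that concrete, here is a rough user-space sketch of what such a
two-step interface might look like.  Everything in it is an assumption for
illustration: resize_ops, model_bdev, model_resize() and the model_fs
holder are hypothetical names, not existing kernel interfaces.  The device
first asks whoever has claimed it whether the new size is acceptable, and
only on agreement changes size and then notifies the holder.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callbacks a claimant (e.g. a filesystem) would register. */
struct resize_ops {
	/* "I am about to resize to new_sectors - is that OK?" */
	bool (*may_resize)(void *holder, uint64_t new_sectors);
	/* "I have resized - deal with it" */
	void (*resized)(void *holder, uint64_t new_sectors);
};

/* Hypothetical stand-in for a claimed block device. */
struct model_bdev {
	uint64_t sectors;
	const struct resize_ops *ops;	/* NULL when nobody has claimed it */
	void *holder;
};

/* Returns 0 on success, -1 if the holder vetoed the new size. */
static int model_resize(struct model_bdev *b, uint64_t new_sectors)
{
	if (b->ops && b->ops->may_resize &&
	    !b->ops->may_resize(b->holder, new_sectors))
		return -1;

	b->sectors = new_sectors;

	if (b->ops && b->ops->resized)
		b->ops->resized(b->holder, new_sectors);
	return 0;
}

/* Example holder: refuses to let the device shrink below the filesystem. */
struct model_fs {
	uint64_t fs_sectors;	/* sectors the filesystem occupies */
	uint64_t seen_sectors;	/* last device size it was told about */
};

static bool fs_may_resize(void *h, uint64_t new_sectors)
{
	return new_sectors >= ((struct model_fs *)h)->fs_sectors;
}

static void fs_resized(void *h, uint64_t new_sectors)
{
	((struct model_fs *)h)->seen_sectors = new_sectors;
}

static const struct resize_ops fs_ops = { fs_may_resize, fs_resized };
```

With a holder like this, growing the device succeeds and the holder is
told the new size, while a shrink below fs_sectors is refused before
anything changes, so no cached data has to be thrown away behind the
filesystem's back.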

> 
> For now I think the best short term fix is to simply revert commit
> 608aeef17a91747d6303de4df5e2c2e6899a95e8
> 
> 	"Call flush_disk() after detecting an online resize."

You may be right, but I suspect that Andrew Patterson had a real issue to
solve which led to submitting it, and I'd really like to understand that
issue before I would feel confident just reverting it.

Andrew:  are you out there?  Can you provide some background for your patch?

Thanks,
NeilBrown
