Re: [PATCH] blockdev: fix for racing mount/umount

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Jeff Mahoney <jeffm@suse.com>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Andrew Morton <akpm@osdl.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Subject: Re: [PATCH] blockdev: fix for racing mount/umount
Date: Tue, 15 Mar 2005 15:59:06 -0500	[thread overview]
Message-ID: <42374C9A.1010000@suse.com> (raw)
In-Reply-To: <Pine.LNX.4.58.0503150746320.6119@ppc970.osdl.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Tue, 15 Mar 2005, Jeff Mahoney wrote:
> 
>>This patch is another take at fixing the race between mount and umount
>>resetting the blocksize and causing buffer errors, infinite loops in
>>__getblk_slow, and possibly other undiscovered effects.
> 
> 
> Ok. I had to go back and look up the original problem, and having looked 
> at this a bit more, I wonder whether the real problem is not that we do 
> that silly "set blocksize back to the original one" at umount time in the 
> first place.
> 
> (It happens very indirectly, though the "->kill_sb()" fn pointer, which 
> ends up doing kill_block_super on a regular block device).
> 
> Maybe we should just get rid of it entirely? There's really no point to 
> it.
> 
> Instead, to make things repeatable, we'd always just set the blocksize to
> its default value at the first open. We already do that anyway, don't we?
> 
> Wouldn't that approach also just fix things? And then the fix would 
> literally be to just remove the set_blocksize() call in kill_block_super. 
> At that point, we know that the only people who set the block size are 
> either
>  - a first opener
>  - somebody who got exclusive access (ie a filesystem that sets it at 
>    mount-time)
> 
> (Yeah, it's a bit more complex than that one-liner, since somebody would 
> need to back me up on not being totally tripping on some 'shrooms. But Al 
> can probably do that trivially)
> 
> Or maybe I misunderstood the problem. Jeff?

*smacks head*

Yeah, I think this will fix the problem in a much cleaner way. I wanted
to try to preserve the old behavior, but you validly point out that it's
silly to do so. I don't believe that any opener of a block device has
any expectation that the block size be anything in particular. They
could open a block device that is mounted, or one that isn't. If they
need it a different size, they can explicitly set it. The set_blocksize
in kill_block_super is just being nice and returning it the default.

Here's a quick analysis:

Previous to the superblock being shut down, sget will return a
superblock which will then lead to get_sb_bdev returning -EBUSY as expected.

After the superblock being shut down, bd_release releases the holders.
This could allow another opener to open it exclusively, but then the new
mount would just return -EBUSY, which is safe. Otherwise, we'll take a
holder reference under bdev_lock.

Then, blkdev_put will sync and kill the blkdev under bdev->bd_sem if it
holds the sole opener reference. This is what usually will happen.

If it doesn't hold the only reference, blkdev_get in the new mount beat
us there and won't need to reset the blocksize. The bdev doesn't get
synced and killed either, though. This isn't that big a deal, since
generic_shutdown_super->fsync_super takes care of syncing the fs, but
the ->write_super call following it could leave the superblock unsynced
under certain circumstances[*]. So, rather than completely remove the
set_blocksize, I'd prefer to replace it with a sync_blockdev instead.

- -Jeff

[*]: It's a very specific corner case: Currently, since most filesystems
use a block size that is different than the device block size,
set_blocksize resetting the block size can be counted on to sync any
writes that occured during that final ->write_super call. If we remove
set_blocksize, then that final sync may not occur if there is another
opener (such as a process reading the block device).

Yes, it's a case of shooting-yourself-in-the-foot, but say it's a pen
drive and knowing that it's umounted, the user removes the device. The
final write is lost - because a read-only operation was in progress.
That doesn't seem right to me, and can be fixed pretty easily by
substituting a sync_blockdev for that set_blocksize rather than removing
it completely.

- --
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)

iD8DBQFCN0yaLPWxlyuTD7IRAtp0AJ4xQDNjS+B6wq3sKu2t6c4tj6PHowCeJeqR
54cAg0bsuf1pMU0kSDJLy20=
=t5Vf
-----END PGP SIGNATURE-----

next prev parent reply	other threads:[~2005-03-15 20:58 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-03-15 14:14 [PATCH] blockdev: fix for racing mount/umount Jeff Mahoney
2005-03-15 16:00 ` Linus Torvalds
2005-03-15 17:50   ` Al Viro
2005-03-15 20:59   ` Jeff Mahoney [this message]
2005-03-17 19:51   ` Jeff Mahoney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=42374C9A.1010000@suse.com \
    --to=jeffm@suse.com \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@osdl.org \
    --cc=viro@parcelfarce.linux.theplanet.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox