From: Jan Kara <jack@suse.cz>
To: "Fernando Luis Vázquez Cao" <fernando_b1@lab.ntt.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
Dave Chinner <dchinner@redhat.com>,
Christoph Hellwig <hch@infradead.org>,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 3/8] fsfreeze: freeze_super and thaw_bdev don't play well together
Date: Fri, 13 Jul 2012 15:45:37 +0200 [thread overview]
Message-ID: <20120713134537.GF20361@quack.suse.cz> (raw)
In-Reply-To: <1342083943.7033.12.camel@nexus.lab.ntt.co.jp>
On Thu 12-07-12 18:05:43, Fernando Luis Vázquez Cao wrote:
> Changes from Dave Chinner's version:
> - Remove s_frozen check in freeze_super which is not needed now that it is
> re-entrant.
> - Decrement freeze counter if the freeze_fs callback fails.
>
> ---
>
> thaw_bdev() has re-entrancy guards to allow freezes to nest
> together. That is, it ensures that the filesystem is not thawed
> until the last thaw command is issued. This is needed to prevent the
> filesystem from being unfrozen while an existing freezer is still
> operating on the filesystem in a frozen state (e.g. dm-snapshot).
>
> Currently, freeze_super() and thaw_super() bypasses these guards,
> and as a result manual freezing and unfreezing via the ioctl methods
> do not nest correctly. hence mixing userspace directed freezes with
> block device level freezes result in inconsistency due to premature
> thawing of the filesystem.
>
> Move the re-enterency guards to the superblock and into freeze_super
> and thaw_super() so that userspace directed freezes nest correctly
> again. Caveat: Things work as expected as long as direct calls to
> thaw_super are always in response to a previous sb level freeze. In
> other words an unpaired call to thaw_super can still thaw a
> filesystem frozen using freeze_bdev (this issue could be addressed
> in a follow-up patch if deemed necessary).
>
> This patch retains the bdev level mutex and counter to keep the
> "feature" that we can freeze a block device that does not have a
> filesystem mounted yet.
>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Jan Kara <jack@suse.cz>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
> ---
Some mostly minor coments below.
> diff -urNp vfs-orig/fs/block_dev.c vfs/fs/block_dev.c
> --- vfs-orig/fs/block_dev.c 2012-07-12 14:31:38.936631141 +0900
> +++ vfs/fs/block_dev.c 2012-07-12 15:03:57.032627014 +0900
> @@ -257,16 +257,18 @@ int fsync_bdev(struct block_device *bdev
> EXPORT_SYMBOL(fsync_bdev);
>
> /**
> - * freeze_bdev -- lock a filesystem and force it into a consistent state
> + * freeze_bdev -- lock a block device
> * @bdev: blockdevice to lock
> *
> - * If a superblock is found on this device, we take the s_umount semaphore
> - * on it to make sure nobody unmounts until the snapshot creation is done.
> - * The reference counter (bd_fsfreeze_count) guarantees that only the last
> - * unfreeze process can unfreeze the frozen filesystem actually when multiple
> - * freeze requests arrive simultaneously. It counts up in freeze_bdev() and
> - * count down in thaw_bdev(). When it becomes 0, thaw_bdev() will unfreeze
> - * actually.
> + * Locks the block device and, if present, the associated filesystem too.
> + *
> + * The reference counter (bd_fsfreeze_count) is used to implement the feature
> + * that allows one to freeze a block device that does not have a filesystem
> + * mounted yet. For filesystems using mount_bdev the kernel takes care of
> + * things by preventing the mount operation from succeeding if the underlying
> + * block device is frozen. Other filesystems should check this counter or risk
> + * a situation where a freeze_bdev user (e.g. dm snapshot) and mount race,
> + * which may lead to inconsistencies.
> */
> struct super_block *freeze_bdev(struct block_device *bdev)
> {
> @@ -274,17 +276,7 @@ struct super_block *freeze_bdev(struct b
> int error = 0;
>
> mutex_lock(&bdev->bd_fsfreeze_mutex);
> - if (++bdev->bd_fsfreeze_count > 1) {
> - /*
> - * We don't even need to grab a reference - the first call
> - * to freeze_bdev grab an active reference and only the last
> - * thaw_bdev drops it.
> - */
> - sb = get_super(bdev);
> - drop_super(sb);
> - mutex_unlock(&bdev->bd_fsfreeze_mutex);
> - return sb;
> - }
> + bdev->bd_fsfreeze_count++;
>
> sb = get_active_super(bdev);
> if (!sb)
> @@ -297,30 +289,33 @@ struct super_block *freeze_bdev(struct b
> return ERR_PTR(error);
> }
> deactivate_super(sb);
> - out:
> +out:
> sync_blockdev(bdev);
> mutex_unlock(&bdev->bd_fsfreeze_mutex);
> - return sb; /* thaw_bdev releases s->s_umount */
> + return sb;
> }
> EXPORT_SYMBOL(freeze_bdev);
>
> /**
> - * __thaw_bdev -- unlock filesystem
> + * __thaw_bdev -- unlock a block device
> * @bdev: blockdevice to unlock
> * @sb: associated superblock
> * @emergency: emergency thaw
> *
> - * Unlocks the filesystem and marks it writeable again after freeze_bdev().
> + * Unlocks the block device and, if present, the associated filesystem too.
> */
> static int __thaw_bdev(struct block_device *bdev, struct super_block *sb, int emergency)
> {
> int error = -EINVAL;
>
> mutex_lock(&bdev->bd_fsfreeze_mutex);
> +
> if (!bdev->bd_fsfreeze_count)
> goto out;
>
> - if (--bdev->bd_fsfreeze_count > 0 || !sb) {
> + bdev->bd_fsfreeze_count--;
> +
> + if (!sb) {
> error = 0;
> goto out;
> }
> @@ -336,13 +331,6 @@ out:
> return error;
> }
>
> -/**
> - * thaw_bdev -- unlock filesystem
> - * @bdev: blockdevice to unlock
> - * @sb: associated superblock
> - *
> - * Unlocks the filesystem and marks it writeable again after freeze_bdev().
> - */
> int thaw_bdev(struct block_device *bdev, struct super_block *sb)
> {
> return __thaw_bdev(bdev, sb, 0);
It's a bit confusing (for reviewer) to remove the documentation here when
you remove the whole function in a later patch...
...
> @@ -1233,29 +1240,33 @@ static int __thaw_super(struct super_blo
> {
> int error = 0;
>
> - if (sb->s_frozen == SB_UNFROZEN) {
> + mutex_lock(&sb->s_freeze_mutex);
> + if (!sb->s_freeze_count) {
> error = -EINVAL;
> - goto out;
> + goto out_unlock;
> }
> + sb->s_freeze_count = emergency ? 1 : sb->s_freeze_count;
It would be cleaner to do this somewhere in do_thaw_one() and not here.
Also you won't have to pass the emergency parameter then... Also it might
be more logical for __thaw_super() to expect also s_freeze_mutex locked and
handle the locking in thaw_super().
Honza
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2012-07-13 13:45 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-12 8:57 [PATCH 0/8 v2] fsfreeze: miscellaneous fixes and cleanups Fernando Luis Vázquez Cao
2012-07-12 9:02 ` [PATCH 1/8] fsfreeze: Prevent emergency thaw from looping infinitely Fernando Luis Vázquez Cao
2012-07-13 12:59 ` Jan Kara
2012-07-17 5:13 ` Fernando Luis Vazquez Cao
2012-07-12 9:04 ` [PATCH 2/8] fsfreeze: emergency thaw will deadlock on s_umount Fernando Luis Vázquez Cao
2012-07-13 13:17 ` Jan Kara
2012-08-09 6:00 ` Fernando Luis Vazquez Cao
2012-09-13 6:23 ` Fernando Luis Vazquez Cao
2012-07-12 9:05 ` [PATCH 3/8] fsfreeze: freeze_super and thaw_bdev don't play well together Fernando Luis Vázquez Cao
2012-07-13 13:45 ` Jan Kara [this message]
2012-08-09 9:00 ` Fernando Luis Vazquez Cao
2012-07-12 9:07 ` [PATCH 4/8] fsfreeze: switch to using super methods where possible Fernando Luis Vázquez Cao
2012-07-13 13:50 ` Jan Kara
2012-07-12 9:09 ` [PATCH 5/8] fsfreeze: move emergency thaw code to fs/super.c Fernando Luis Vázquez Cao
2012-07-13 13:53 ` Jan Kara
2012-07-12 9:10 ` [PATCH 6/8] fsfreeze: add vfs ioctl to check freeze state Fernando Luis Vázquez Cao
2012-07-13 13:54 ` Jan Kara
2012-07-15 22:45 ` Dave Chinner
2012-09-13 6:19 ` Fernando Luis Vazquez Cao
2012-09-13 7:18 ` Dave Chinner
2012-09-13 8:19 ` Fernando Luis Vazquez Cao
2012-09-14 0:15 ` Dave Chinner
2012-09-14 1:46 ` Fernando Luis Vazquez Cao
2012-09-14 6:28 ` Dave Chinner
2012-09-14 8:18 ` Fernando Luis Vazquez Cao
2012-09-13 6:11 ` Fernando Luis Vazquez Cao
2012-07-12 9:11 ` [PATCH 7/8] fsfreeze: add block device " Fernando Luis Vázquez Cao
2012-07-12 9:12 ` [PATCH 8/8] fsfreeze: update Documentation/filesystems/Locking Fernando Luis Vázquez Cao
2012-07-13 14:11 ` Jan Kara
2012-07-17 1:42 ` Fernando Luis Vazquez Cao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120713134537.GF20361@quack.suse.cz \
--to=jack@suse.cz \
--cc=dchinner@redhat.com \
--cc=fernando_b1@lab.ntt.co.jp \
--cc=hch@infradead.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).