From: Jan Kara <jack@suse.cz>
To: "Fernando Luis Vázquez Cao" <fernando_b1@lab.ntt.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
Josef Bacik <jbacik@fusionio.com>,
Eric Sandeen <sandeen@redhat.com>,
Dave Chinner <dchinner@redhat.com>,
Christoph Hellwig <hch@infradead.org>, Jan Kara <jack@suse.cz>,
Luiz Capitulino <lcapitulino@redhat.com>,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 12/17] fsfreeze: sb-level/bdev-level fsfreeze integration
Date: Wed, 9 Jan 2013 17:37:37 +0100 [thread overview]
Message-ID: <20130109163737.GE17353@quack.suse.cz> (raw)
In-Reply-To: <1357558704.8183.19.camel@nexus.lab.ntt.co.jp>
On Mon 07-01-13 20:38:24, Fernando Luis Vázquez Cao wrote:
> As things stand now a filesystem frozen through the in-kernel bdev level API
> can be thawed using the userspace sb level API, which can lead to accidental
> corruption of filesystem snapshots and backups.
>
> To address this problem we modify the in-kernel API so that we can tell
> fsfreeze that a kernel initiated freeze is in progress and that the filesystem
> should not be thawed no matter how many times the FITHAW ioctl is invoked.
I'm not sure if this isn't going too far in the direction of trying to
prevent sysadmin to shoot himself in the foot. For well written applications
where FITHAW and FIFREEZE are paired, things should work OK after your
initial fixes. And if someone calls unpaired FITHAW, things can break
spectacularly anyway for other users of FIFREEZE. So I just wouldn't bother
with any more protections. What do you think?
Honza
> Cc: linux-fsdevel@vger.kernel.org
> Cc: Josef Bacik <jbacik@fusionio.com>
> Cc: Eric Sandeen <sandeen@redhat.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Dave Chinner <dchinner@redhat.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Luiz Capitulino <lcapitulino@redhat.com>
> Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
> ---
>
> diff -urNp linux-3.8-rc1-orig/fs/block_dev.c linux-3.8-rc1/fs/block_dev.c
> --- linux-3.8-rc1-orig/fs/block_dev.c 2012-12-25 16:22:48.268018000 +0900
> +++ linux-3.8-rc1/fs/block_dev.c 2012-12-25 16:32:09.712018000 +0900
> @@ -238,7 +238,7 @@ struct super_block *freeze_bdev(struct b
> sb = get_active_super(bdev);
> if (!sb)
> goto out;
> - error = freeze_super(sb);
> + error = __freeze_super(sb, true);
> if (error) {
> deactivate_super(sb);
> bdev->bd_fsfreeze_count--;
> @@ -265,6 +265,7 @@ int thaw_bdev(struct block_device *bdev,
> int error = -EINVAL;
>
> mutex_lock(&bdev->bd_fsfreeze_mutex);
> +
> if (!bdev->bd_fsfreeze_count)
> goto out;
>
> @@ -273,20 +274,10 @@ int thaw_bdev(struct block_device *bdev,
> goto out;
> }
>
> - error = thaw_super(sb);
> - /*
> - * If the superblock is already unfrozen, i.e. thaw_super() returned
> - * -EINVAL, we consider the block device level thaw successful. This
> - * behavior is important in a scenario where a filesystem frozen using
> - * freeze_bdev() is thawed through the superblock level API; if we
> - * caused the subsequent thaw_bdev() to fail bdev->bd_fsfreeze_count
> - * would not go back to 0 which means that future calls to freeze_bdev()
> - * would not freeze the superblock, just increase the counter.
> - */
> - if (error && error != -EINVAL)
> + error = __thaw_super(sb, true);
> +
> + if (error)
> bdev->bd_fsfreeze_count++;
> - else
> - error = 0;
> out:
> mutex_unlock(&bdev->bd_fsfreeze_mutex);
> return error;
> diff -urNp linux-3.8-rc1-orig/fs/namespace.c linux-3.8-rc1/fs/namespace.c
> --- linux-3.8-rc1-orig/fs/namespace.c 2012-12-25 16:31:10.780018000 +0900
> +++ linux-3.8-rc1/fs/namespace.c 2012-12-25 16:32:09.712018000 +0900
> @@ -1103,6 +1103,11 @@ static void thaw_mount(struct mount *mnt
> * superblock succeeds (once it has been detached the fsfreeze
> * ioctls become unusable). Thus, force-thaw sb so that all tasks
> * in fsfreeze wait queue are woken up.
> + *
> + * thaw_super_force() does not actually thaw the sb if the freeze
> + * counter was locked (i.e. was frozen through the block device
> + * level API). In such a case the freeze counter is set to one
> + * thus guaranteeing that the sb will get thawed unlock time.
> */
> thaw_super_force(sb); /* Drops superblock lock. */
> }
> diff -urNp linux-3.8-rc1-orig/fs/super.c linux-3.8-rc1/fs/super.c
> --- linux-3.8-rc1-orig/fs/super.c 2012-12-25 16:31:10.780018000 +0900
> +++ linux-3.8-rc1/fs/super.c 2012-12-25 16:32:09.712018000 +0900
> @@ -1301,15 +1301,20 @@ static void sb_wait_write(struct super_b
> }
>
> /**
> - * freeze_super - lock the filesystem and force it into a consistent state
> + * __freeze_super - lock the filesystem and force it into a consistent state
> * @sb: the super to lock
> + * @lock: should we lock the freeze counter?
> *
> * Syncs the super to make sure the filesystem is consistent and calls the fs's
> - * freeze_fs. The reference counter (s_freeze_count) guarantees that only the
> - * last unfreeze process can unfreeze the frozen filesystem actually when
> - * multiple freeze requests arrive simultaneously. It counts up in
> - * freeze_super() and counts down in thaw_super(). When it becomes 0,
> - * thaw_super() will execute the unfreeze.
> + * freeze_fs. Freezes can nest which has two implications: the filesystem level
> + * freeze occurs during the first nested freeze, the actual filesystem thaw
> + * occurs only when the last thaw operation brings the freeze counter down to
> + * zero.
> + *
> + * If @lock is true the freeze counter is increased after a successful freeze
> + * but it cannot go back to zero (and the filesystem get actually thawed) until
> + * the the counter is unlocked using this function's thaw counterpart. The
> + * freeze counter lock does not nest.
> *
> * During this function, sb->s_writers.frozen goes through these values:
> *
> @@ -1334,15 +1339,24 @@ static void sb_wait_write(struct super_b
> * freezing. Then we transition to SB_FREEZE_COMPLETE state. This state is
> * mostly auxiliary for filesystems to verify they do not modify frozen fs.
> *
> - * sb->s_writers.frozen and sb->s_freeze_count are protected by sb->s_umount.
> + * sb->s_writers.frozen, sb->s_freeze_count and sb->s_freeze_locked are
> + * protected by sb->s_umount.
> */
> -int freeze_super(struct super_block *sb)
> +int __freeze_super(struct super_block *sb, bool lock)
> {
> int ret = 0;
> + bool locked_old = sb->s_freeze_locked;
>
> atomic_inc(&sb->s_active);
> down_write(&sb->s_umount);
>
> + /* The freeze counter lock does not nest. */
> + if (sb->s_freeze_locked && lock) {
> + ret = -EBUSY;
> + goto out_deactivate;
> + }
> +
> + sb->s_freeze_locked = lock ? true : sb->s_freeze_locked;
> if (++sb->s_freeze_count > 1)
> goto out_deactivate;
>
> @@ -1390,6 +1404,7 @@ int freeze_super(struct super_block *sb)
> if (ret) {
> printk(KERN_ERR
> "VFS:Filesystem freeze failed\n");
> + sb->s_freeze_locked = locked_old;
> sb->s_freeze_count--;
> sb->s_writers.frozen = SB_UNFROZEN;
> smp_wmb();
> @@ -1397,11 +1412,13 @@ int freeze_super(struct super_block *sb)
> goto out_deactivate;
> }
> }
> +
> /*
> * This is just for debugging purposes so that fs can warn if it
> * sees write activity when frozen is set to SB_FREEZE_COMPLETE.
> */
> sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> +
> out_unlock:
> up_write(&sb->s_umount);
> return ret;
> @@ -1409,6 +1426,18 @@ out_deactivate:
> deactivate_locked_super(sb);
> return ret;
> }
> +
> +/**
> + * freeze_super - lock the filesystem and force it into a consistent state
> + * @sb: the super to lock
> + *
> + * This is a wrapper around __freeze_super() which does the actual work of
> + * freezing the filesystem. fsfreeze counter lock is not requested.
> + */
> +int freeze_super(struct super_block *sb)
> +{
> + return __freeze_super(sb, false);
> +}
> EXPORT_SYMBOL(freeze_super);
>
> /**
> @@ -1449,34 +1478,56 @@ out:
> }
>
> /**
> - * thaw_super - unlock filesystem
> + * __thaw_super - unlock filesystem
> * @sb: the super to thaw
> + * @unlock: should we unlock the freeze counter?
> *
> - * Unlocks the filesystem and marks it writeable again after freeze_super().
> + * Tries to decrease the freeze counter and when it reaches zero unlocks the
> + * filesystem and marks it writeable again. If the counter is locked it cannot
> + * go back to zero (and thus trigger the actual filesystem thaw) unless @unlock
> + * is true.
> *
> * Returns -EINVAL if @sb is not frozen, 0 if it succeeded or the corresponding
> * error code otherwise. If the unfreeze fails, @sb is left in the frozen state.
> */
> -int thaw_super(struct super_block *sb)
> +int __thaw_super(struct super_block *sb, bool unlock)
> {
> int error = 0;
>
> down_write(&sb->s_umount);
>
> - if (!sb->s_freeze_count) {
> + /*
> + * An unfrozen filesystem cannot be thawed. Similarly, an unlocked
> + * freeze counter cannot be unlocked.
> + */
> + if (!sb->s_freeze_count || (!sb->s_freeze_locked && unlock)) {
> error = -EINVAL;
> goto out_unlock;
> }
>
> - if (--sb->s_freeze_count > 0)
> + /*
> + * Freezes nest so only the last call (freeze counter down to one) can
> + * trigger the actual filesystem thaw.
> + */
> + if (sb->s_freeze_count > 1) {
> + sb->s_freeze_count--;
> + sb->s_freeze_locked = unlock ? false: sb->s_freeze_locked;
> + goto out_unlock;
> + }
> + /* A locked filesystem cannot be thawed unless unlock was requested. */
> + else if (sb->s_freeze_locked && !unlock) {
> + error = -EINVAL;
> goto out_unlock;
> + }
>
> error = raw_thaw_super(sb, false);
>
> - if (error) {
> - sb->s_freeze_count++;
> - goto out_unlock;
> + if (!error) {
> + sb->s_freeze_count = 0;
> + sb->s_freeze_locked = false;
> }
> + else
> + goto out_unlock;
>
> /* Active reference released after last thaw. */
> deactivate_locked_super(sb);
> @@ -1486,6 +1537,19 @@ out_unlock:
> up_write(&sb->s_umount);
> return error;
> }
> +
> +/**
> + * thaw_super - unlock filesystem
> + * @sb: the super to unlock
> + *
> + * This is a wrapper around __thaw_super() which does the actual work of
> + * thawing the filesystem. Release of the fsfreeze counter lock is not
> + * requested.
> + */
> +int thaw_super(struct super_block *sb)
> +{
> + return __thaw_super(sb, false);
> +}
> EXPORT_SYMBOL(thaw_super);
>
> /**
> @@ -1505,10 +1569,19 @@ int thaw_super_force(struct super_block
> up_write(&sb->s_umount);
> return -EINVAL;
> }
> +
> + if (sb->s_freeze_locked) {
> + /* Ensure superblock gets thawed at unlock time */
> + sb->s_freeze_count = 1;
> + up_write(&sb->s_umount);
> + return -EINVAL;
> + }
> +
> sb->s_freeze_count = 0;
> raw_thaw_super(sb, true);
> /* Active reference released after last thaw. */
> deactivate_locked_super(sb);
> +
> return 0;
> }
>
> diff -urNp linux-3.8-rc1-orig/include/linux/fs.h linux-3.8-rc1/include/linux/fs.h
> --- linux-3.8-rc1-orig/include/linux/fs.h 2012-12-25 16:31:10.784018000 +0900
> +++ linux-3.8-rc1/include/linux/fs.h 2012-12-25 16:32:09.712018000 +0900
> @@ -1323,6 +1323,9 @@ struct super_block {
>
> /* Number of nested freezes */
> int s_freeze_count;
> +
> + /* Is freeze state locked? */
> + bool s_freeze_locked;
> };
>
> /* superblock cache pruning functions */
> @@ -1881,7 +1884,9 @@ extern int vfs_statfs(struct path *, str
> extern int user_statfs(const char __user *, struct kstatfs *);
> extern int fd_statfs(int, struct kstatfs *);
> extern int vfs_ustat(dev_t, struct kstatfs *);
> +extern int __freeze_super(struct super_block *sb, bool lock);
> extern int freeze_super(struct super_block *super);
> +extern int __thaw_super(struct super_block *sb, bool unlock);
> extern int thaw_super(struct super_block *super);
> extern int thaw_super_force(struct super_block *super);
> extern void emergency_thaw_all(void);
>
>
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2013-01-09 16:37 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-01-07 11:18 [PATCH v6 0/17] fsfreeze: miscellaneous fixes and cleanups Fernando Luis Vázquez Cao
2013-01-07 11:21 ` [PATCH 1/17] vfs: add __iterate_supers() and helpers around it Fernando Luis Vázquez Cao
2013-01-07 11:22 ` [PATCH 2/17] fsfreeze: add unlocked version of thaw_super Fernando Luis Vázquez Cao
2013-01-07 11:23 ` [PATCH 3/17] fsfreeze: fix emergency thaw infinite loop Fernando Luis Vázquez Cao
2013-01-07 11:26 ` [PATCH 4/17] fsfreeze: emergency thaw will deadlock on s_umount Fernando Luis Vázquez Cao
2013-01-09 16:12 ` Jan Kara
2013-01-07 11:27 ` [PATCH 5/17] xfs: switch to using super methods for fsfreeze Fernando Luis Vázquez Cao
2013-01-07 11:29 ` [PATCH 6/17] fsfreeze: move emergency thaw code to fs/super.c Fernando Luis Vázquez Cao
2013-01-07 11:30 ` [PATCH 7/17] fsfreeze: fix nested freezing of sb-less bdevs Fernando Luis Vázquez Cao
2013-01-09 16:24 ` Jan Kara
2013-01-07 11:32 ` [PATCH 8/17] fsfreeze: allow bdev level thaws when the sb is unfrozen Fernando Luis Vázquez Cao
2013-01-09 16:26 ` Jan Kara
2013-01-07 11:34 ` [PATCH 9/17] fsfreeze: freeze_super and thaw_bdev don't play well together Fernando Luis Vázquez Cao
2013-01-07 11:35 ` [PATCH 10/17] fsfreeze: automatically thaw on umount Fernando Luis Vázquez Cao
2013-01-09 17:20 ` Jan Kara
2013-01-10 9:14 ` Fernando Luis Vazquez Cao
2013-01-07 11:36 ` [PATCH 11/17] fsfreeze: add thaw_super_force Fernando Luis Vázquez Cao
2013-01-07 11:38 ` [PATCH 12/17] fsfreeze: sb-level/bdev-level fsfreeze integration Fernando Luis Vázquez Cao
2013-01-09 16:37 ` Jan Kara [this message]
2013-01-10 9:57 ` Fernando Luis Vazquez Cao
2013-01-07 11:39 ` [PATCH 13/17] fsfreeze: unfreeze bdevs in addition to filesystems during emergency thaw Fernando Luis Vázquez Cao
2013-01-09 16:41 ` Jan Kara
2013-01-07 11:41 ` [PATCH 14/17] vfs: leverage bd_super in get_super and get_active_super Fernando Luis Vázquez Cao
2013-01-09 16:44 ` Jan Kara
2013-01-07 11:42 ` [PATCH 15/17] btrfs: store pointer to superblock in bd_super Fernando Luis Vázquez Cao
2013-01-07 11:43 ` [PATCH 16/17] fsfreeze: allow freeze counter lock nesting Fernando Luis Vázquez Cao
2013-01-07 11:44 ` [PATCH 17/17] fsfreeze: export freeze_count through mountinfo Fernando Luis Vázquez Cao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130109163737.GE17353@quack.suse.cz \
--to=jack@suse.cz \
--cc=dchinner@redhat.com \
--cc=fernando_b1@lab.ntt.co.jp \
--cc=hch@infradead.org \
--cc=jbacik@fusionio.com \
--cc=lcapitulino@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=sandeen@redhat.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).