public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze
@ 2009-08-26  5:00 Fernando Luis Vázquez Cao
  2009-08-26 17:38 ` Christoph Hellwig
  2009-08-27 14:06 ` [PATCH 4/4] filesystem freeze: add ISFROZEN ioctl Fernando Luis Vázquez Cao
  0 siblings, 2 replies; 13+ messages in thread
From: Fernando Luis Vázquez Cao @ 2009-08-26  5:00 UTC (permalink / raw)
  To: t-sato, m-hamaguchi
  Cc: Christoph Hellwig, Al Viro, Andrew Morton,
	Linux Kernel Mailing List, Eric Sandeen

The current locking scheme for filesystem freeze avoids races between
freeze_bdev() and do_umount() by taking the s_umount semaphore.

If freeze_bdev() wins the race the process that invoked sys_umount
will sleep until thaw_bdev releases the semaphore. Unfortunately, this
will never happen because thaw_bdev bails out early the
bd_fsfreeze_count check having failed (the count is 0).

The problem is that the block_device that ioctl_fsthaw() passes to
thaw_bdev() is not the one that we freezed because before sleeping in
deactivate_super() do_umount() released the dentry (dput()) and freed
the vfs mount (free_vfsmnt()).

This patch works around this issue by checking the freeze level in
do_umount().

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
---

diff -urNp linux-2.6.31-rc7-orig/fs/namespace.c linux-2.6.31-rc7/fs/namespace.c
--- linux-2.6.31-rc7-orig/fs/namespace.c	2009-08-25 16:39:46.000000000 +0900
+++ linux-2.6.31-rc7/fs/namespace.c	2009-08-26 11:30:10.000000000 +0900
@@ -1033,6 +1033,8 @@ static int do_umount(struct vfsmount *mn
  	if (retval)
  		return retval;

+	vfs_check_frozen(sb, SB_FREEZE_WRITE);
+
  	/*
  	 * Allow userspace to request a mountpoint be expired rather than
  	 * unmounting unconditionally. Unmount only happens if:

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze
  2009-08-26  5:00 [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze Fernando Luis Vázquez Cao
@ 2009-08-26 17:38 ` Christoph Hellwig
  2009-08-27 10:11   ` Fernando Luis Vázquez Cao
  2009-08-27 14:06 ` [PATCH 4/4] filesystem freeze: add ISFROZEN ioctl Fernando Luis Vázquez Cao
  1 sibling, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2009-08-26 17:38 UTC (permalink / raw)
  To: Fernando Luis V??zquez Cao
  Cc: t-sato, m-hamaguchi, Christoph Hellwig, Al Viro, Andrew Morton,
	Linux Kernel Mailing List, Eric Sandeen

On Wed, Aug 26, 2009 at 02:00:01PM +0900, Fernando Luis V??zquez Cao wrote:
> The current locking scheme for filesystem freeze avoids races between
> freeze_bdev() and do_umount() by taking the s_umount semaphore.
> 
> If freeze_bdev() wins the race the process that invoked sys_umount
> will sleep until thaw_bdev releases the semaphore. Unfortunately, this
> will never happen because thaw_bdev bails out early the
> bd_fsfreeze_count check having failed (the count is 0).
> 
> The problem is that the block_device that ioctl_fsthaw() passes to
> thaw_bdev() is not the one that we freezed because before sleeping in
> deactivate_super() do_umount() released the dentry (dput()) and freed
> the vfs mount (free_vfsmnt()).
> 
> This patch works around this issue by checking the freeze level in
> do_umount()


This should be solved my my freeze locking rewrite:

	http://marc.info/?l=linux-fsdevel&m=124933489118480&w=2
	http://marc.info/?l=linux-fsdevel&m=124933491918517&w=2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze
  2009-08-26 17:38 ` Christoph Hellwig
@ 2009-08-27 10:11   ` Fernando Luis Vázquez Cao
  2009-08-27 11:54     ` Christoph Hellwig
                       ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Fernando Luis Vázquez Cao @ 2009-08-27 10:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: t-sato, m-hamaguchi, Al Viro, Andrew Morton,
	Linux Kernel Mailing List, Eric Sandeen

Christoph Hellwig wrote:
> On Wed, Aug 26, 2009 at 02:00:01PM +0900, Fernando Luis V??zquez Cao wrote:
>> The current locking scheme for filesystem freeze avoids races between
>> freeze_bdev() and do_umount() by taking the s_umount semaphore.
>>
>> If freeze_bdev() wins the race the process that invoked sys_umount
>> will sleep until thaw_bdev releases the semaphore. Unfortunately, this
>> will never happen because thaw_bdev bails out early the
>> bd_fsfreeze_count check having failed (the count is 0).
>>
>> The problem is that the block_device that ioctl_fsthaw() passes to
>> thaw_bdev() is not the one that we freezed because before sleeping in
>> deactivate_super() do_umount() released the dentry (dput()) and freed
>> the vfs mount (free_vfsmnt()).
>>
>> This patch works around this issue by checking the freeze level in
>> do_umount()
> 
> 
> This should be solved my my freeze locking rewrite:
> 
> 	http://marc.info/?l=linux-fsdevel&m=124933489118480&w=2
> 	http://marc.info/?l=linux-fsdevel&m=124933491918517&w=2

Hi Christoph,

Thank you for the pointers. I gave those patches a spin, but it seems
the umount case is not being tackled. I suggest rejecting the umount
for frozen filesystems. What do you think?

I will be replying to this email with a forward port of your patches
along with my own patches that fix the locking for umount and add a
new ioctl to check the freeze state of the filesystem (this is helpful
to create clean resource agents for HA solutions).

- Fernando

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze
  2009-08-27 10:11   ` Fernando Luis Vázquez Cao
@ 2009-08-27 11:54     ` Christoph Hellwig
  2009-08-27 12:16       ` Fernando Luis Vázquez Cao
  2009-08-27 14:05     ` [PATCH 1/4] freeze_bdev: kill bd_mount_sem Fernando Luis Vázquez Cao
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2009-08-27 11:54 UTC (permalink / raw)
  To: Fernando Luis V?zquez Cao
  Cc: Christoph Hellwig, t-sato, m-hamaguchi, Al Viro, Andrew Morton,
	Linux Kernel Mailing List, Eric Sandeen

On Thu, Aug 27, 2009 at 07:11:29PM +0900, Fernando Luis V?zquez Cao wrote:
> Thank you for the pointers. I gave those patches a spin, but it seems
> the umount case is not being tackled. I suggest rejecting the umount
> for frozen filesystems. What do you think?
> 
> I will be replying to this email with a forward port of your patches
> along with my own patches that fix the locking for umount and add a
> new ioctl to check the freeze state of the filesystem (this is helpful
> to create clean resource agents for HA solutions).

I think we should just reject the umount for a forzen filesystem with
-EBUSY like I do for remount in those patches.  I somehow thought I did
this but didn't.  Is that good for your use case?


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze
  2009-08-27 11:54     ` Christoph Hellwig
@ 2009-08-27 12:16       ` Fernando Luis Vázquez Cao
  0 siblings, 0 replies; 13+ messages in thread
From: Fernando Luis Vázquez Cao @ 2009-08-27 12:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: t-sato, m-hamaguchi, Al Viro, Andrew Morton,
	Linux Kernel Mailing List, Eric Sandeen

Christoph Hellwig wrote:
> On Thu, Aug 27, 2009 at 07:11:29PM +0900, Fernando Luis V?zquez Cao wrote:
>> Thank you for the pointers. I gave those patches a spin, but it seems
>> the umount case is not being tackled. I suggest rejecting the umount
>> for frozen filesystems. What do you think?
>>
>> I will be replying to this email with a forward port of your patches
>> along with my own patches that fix the locking for umount and add a
>> new ioctl to check the freeze state of the filesystem (this is helpful
>> to create clean resource agents for HA solutions).
> 
> I think we should just reject the umount for a forzen filesystem with
> -EBUSY like I do for remount in those patches.  I somehow thought I did
> this but didn't.  Is that good for your use case?

Yes, absolutely. Today I will be sending patches that do just that.
Hopefully you will find the new ioctl (ISFROZEN) I am adding
helpful too.

Thanks,

Fernando

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/4] freeze_bdev: kill bd_mount_sem
  2009-08-27 10:11   ` Fernando Luis Vázquez Cao
  2009-08-27 11:54     ` Christoph Hellwig
@ 2009-08-27 14:05     ` Fernando Luis Vázquez Cao
  2009-08-27 14:06     ` [PATCH 2/4] freeze_bdev: grab active reference to frozen superblocks Fernando Luis Vázquez Cao
  2009-08-27 14:06     ` [PATCH 3/4] Do not allow umounting of frozen filesystems Fernando Luis Vázquez Cao
  3 siblings, 0 replies; 13+ messages in thread
From: Fernando Luis Vázquez Cao @ 2009-08-27 14:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: t-sato, m-hamaguchi, Al Viro, Andrew Morton,
	Linux Kernel Mailing List, Eric Sandeen

Author: Christoph Hellwig <hch@lst.de>
Subject: [PATCH 1/4] freeze_bdev: kill bd_mount_sem

Now that we have the freeze count there is not much reason for bd_mount_sem
anymore.  The actual freeze/thaw operations are serialized using the
bd_fsfreeze_mutex, and the only other place we take bd_mount_sem is
get_sb_bdev which tries to prevent mounting a filesystem while the block
device is frozen.  Instead of add a check for bd_fsfreeze_count and
return -EBUSY if a filesystem is frozen.  While that is a change in user
visible behaviour a failing mount is much better for this case rather
than having the mount process stuck uninterruptible for a long time.


Signed-off-by: Christoph Hellwig <hch@lst.de>
---

diff -urNp linux-2.6.31-rc7-orig/fs/block_dev.c linux-2.6.31-rc7/fs/block_dev.c
--- linux-2.6.31-rc7-orig/fs/block_dev.c	2009-08-27 19:34:35.000000000 +0900
+++ linux-2.6.31-rc7/fs/block_dev.c	2009-08-27 19:36:22.000000000 +0900
@@ -216,8 +216,6 @@ EXPORT_SYMBOL(fsync_bdev);
   * freeze_bdev  --  lock a filesystem and force it into a consistent state
   * @bdev:	blockdevice to lock
   *
- * This takes the block device bd_mount_sem to make sure no new mounts
- * happen on bdev until thaw_bdev() is called.
   * If a superblock is found on this device, we take the s_umount semaphore
   * on it to make sure nobody unmounts until the snapshot creation is done.
   * The reference counter (bd_fsfreeze_count) guarantees that only the last
@@ -240,7 +238,6 @@ struct super_block *freeze_bdev(struct b
  	}
  	bdev->bd_fsfreeze_count++;

-	down(&bdev->bd_mount_sem);
  	sb = get_super(bdev);
  	if (sb && !(sb->s_flags & MS_RDONLY)) {
  		sb->s_frozen = SB_FREEZE_WRITE;
@@ -260,7 +257,6 @@ struct super_block *freeze_bdev(struct b
  					"VFS:Filesystem freeze failed\n");
  				sb->s_frozen = SB_UNFROZEN;
  				drop_super(sb);
-				up(&bdev->bd_mount_sem);
  				bdev->bd_fsfreeze_count--;
  				mutex_unlock(&bdev->bd_fsfreeze_mutex);
  				return ERR_PTR(error);
@@ -271,7 +267,7 @@ struct super_block *freeze_bdev(struct b
  	sync_blockdev(bdev);
  	mutex_unlock(&bdev->bd_fsfreeze_mutex);

-	return sb;	/* thaw_bdev releases s->s_umount and bd_mount_sem */
+	return sb;	/* thaw_bdev releases s->s_umount */
  }
  EXPORT_SYMBOL(freeze_bdev);

@@ -321,7 +317,6 @@ int thaw_bdev(struct block_device *bdev,
  		drop_super(sb);
  	}

-	up(&bdev->bd_mount_sem);
  	mutex_unlock(&bdev->bd_fsfreeze_mutex);
  	return 0;
  }
@@ -431,7 +426,6 @@ static void init_once(void *foo)

  	memset(bdev, 0, sizeof(*bdev));
  	mutex_init(&bdev->bd_mutex);
-	sema_init(&bdev->bd_mount_sem, 1);
  	INIT_LIST_HEAD(&bdev->bd_inodes);
  	INIT_LIST_HEAD(&bdev->bd_list);
  #ifdef CONFIG_SYSFS
diff -urNp linux-2.6.31-rc7-orig/fs/super.c linux-2.6.31-rc7/fs/super.c
--- linux-2.6.31-rc7-orig/fs/super.c	2009-08-27 19:34:35.000000000 +0900
+++ linux-2.6.31-rc7/fs/super.c	2009-08-27 19:36:22.000000000 +0900
@@ -740,9 +740,14 @@ int get_sb_bdev(struct file_system_type
  	 * will protect the lockfs code from trying to start a snapshot
  	 * while we are mounting
  	 */
-	down(&bdev->bd_mount_sem);
+	mutex_lock(&bdev->bd_fsfreeze_mutex);
+	if (bdev->bd_fsfreeze_count > 0) {
+		mutex_unlock(&bdev->bd_fsfreeze_mutex);
+		error = -EBUSY;
+		goto error_bdev;
+	}
  	s = sget(fs_type, test_bdev_super, set_bdev_super, bdev);
-	up(&bdev->bd_mount_sem);
+	mutex_unlock(&bdev->bd_fsfreeze_mutex);
  	if (IS_ERR(s))
  		goto error_s;

diff -urNp linux-2.6.31-rc7-orig/include/linux/fs.h linux-2.6.31-rc7/include/linux/fs.h
--- linux-2.6.31-rc7-orig/include/linux/fs.h	2009-08-27 19:34:35.000000000 +0900
+++ linux-2.6.31-rc7/include/linux/fs.h	2009-08-27 19:36:22.000000000 +0900
@@ -640,7 +640,6 @@ struct block_device {
  	struct super_block *	bd_super;
  	int			bd_openers;
  	struct mutex		bd_mutex;	/* open/close mutex */
-	struct semaphore	bd_mount_sem;
  	struct list_head	bd_inodes;
  	void *			bd_holder;
  	int			bd_holders;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/4] freeze_bdev: grab active reference to frozen superblocks
  2009-08-27 10:11   ` Fernando Luis Vázquez Cao
  2009-08-27 11:54     ` Christoph Hellwig
  2009-08-27 14:05     ` [PATCH 1/4] freeze_bdev: kill bd_mount_sem Fernando Luis Vázquez Cao
@ 2009-08-27 14:06     ` Fernando Luis Vázquez Cao
  2009-08-27 14:06     ` [PATCH 3/4] Do not allow umounting of frozen filesystems Fernando Luis Vázquez Cao
  3 siblings, 0 replies; 13+ messages in thread
From: Fernando Luis Vázquez Cao @ 2009-08-27 14:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: t-sato, m-hamaguchi, Al Viro, Andrew Morton,
	Linux Kernel Mailing List, Eric Sandeen

Author: Christoph Hellwig <hch@lst.de>
Subject: [PATCH 2/4] freeze_bdev: grab active reference to frozen superblocks

Currently we held s_umount while a filesystem is frozen, despite that we
might return to userspace and unlock it from a different process.  Instead
grab an active reference to keep the file system busy and add an explicit
check for frozen filesystems in remount and reject the remount instead
of blocking on s_umount.

Add a new get_active_super helper to super.c for use by freeze_bdev that
grabs an active reference to a superblock from a given block device.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---

diff -urNp linux-2.6.31-rc7-orig/fs/block_dev.c linux-2.6.31-rc7/fs/block_dev.c
--- linux-2.6.31-rc7-orig/fs/block_dev.c	2009-08-27 19:38:47.000000000 +0900
+++ linux-2.6.31-rc7/fs/block_dev.c	2009-08-27 19:38:55.000000000 +0900
@@ -230,43 +230,54 @@ struct super_block *freeze_bdev(struct b
  	int error = 0;

  	mutex_lock(&bdev->bd_fsfreeze_mutex);
-	if (bdev->bd_fsfreeze_count > 0) {
-		bdev->bd_fsfreeze_count++;
+	if (++bdev->bd_fsfreeze_count > 1) {
+		/*
+		 * We don't even need to grab a reference - the first call
+		 * to freeze_bdev grab an active reference and only the last
+		 * thaw_bdev drops it.
+		 */
  		sb = get_super(bdev);
+		drop_super(sb);
  		mutex_unlock(&bdev->bd_fsfreeze_mutex);
  		return sb;
  	}
-	bdev->bd_fsfreeze_count++;

-	sb = get_super(bdev);
-	if (sb && !(sb->s_flags & MS_RDONLY)) {
-		sb->s_frozen = SB_FREEZE_WRITE;
-		smp_wmb();
-
-		sync_filesystem(sb);
-
-		sb->s_frozen = SB_FREEZE_TRANS;
-		smp_wmb();
-
-		sync_blockdev(sb->s_bdev);
-
-		if (sb->s_op->freeze_fs) {
-			error = sb->s_op->freeze_fs(sb);
-			if (error) {
-				printk(KERN_ERR
-					"VFS:Filesystem freeze failed\n");
-				sb->s_frozen = SB_UNFROZEN;
-				drop_super(sb);
-				bdev->bd_fsfreeze_count--;
-				mutex_unlock(&bdev->bd_fsfreeze_mutex);
-				return ERR_PTR(error);
-			}
+	sb = get_active_super(bdev);
+	if (!sb)
+		goto out;
+	if (sb->s_flags & MS_RDONLY) {
+		deactivate_locked_super(sb);
+		mutex_unlock(&bdev->bd_fsfreeze_mutex);
+		return sb;
+	}
+
+	sb->s_frozen = SB_FREEZE_WRITE;
+	smp_wmb();
+
+	sync_filesystem(sb);
+
+	sb->s_frozen = SB_FREEZE_TRANS;
+	smp_wmb();
+
+	sync_blockdev(sb->s_bdev);
+
+	if (sb->s_op->freeze_fs) {
+		error = sb->s_op->freeze_fs(sb);
+		if (error) {
+			printk(KERN_ERR
+				"VFS:Filesystem freeze failed\n");
+			sb->s_frozen = SB_UNFROZEN;
+			deactivate_locked_super(sb);
+			bdev->bd_fsfreeze_count--;
+			mutex_unlock(&bdev->bd_fsfreeze_mutex);
+			return ERR_PTR(error);
  		}
  	}
+	up_write(&sb->s_umount);

+ out:
  	sync_blockdev(bdev);
  	mutex_unlock(&bdev->bd_fsfreeze_mutex);
-
  	return sb;	/* thaw_bdev releases s->s_umount */
  }
  EXPORT_SYMBOL(freeze_bdev);
@@ -280,43 +291,44 @@ EXPORT_SYMBOL(freeze_bdev);
   */
  int thaw_bdev(struct block_device *bdev, struct super_block *sb)
  {
-	int error = 0;
+	int error = -EINVAL;

  	mutex_lock(&bdev->bd_fsfreeze_mutex);
-	if (!bdev->bd_fsfreeze_count) {
-		mutex_unlock(&bdev->bd_fsfreeze_mutex);
-		return -EINVAL;
-	}
+	if (!bdev->bd_fsfreeze_count)
+		goto out_unlock;

-	bdev->bd_fsfreeze_count--;
-	if (bdev->bd_fsfreeze_count > 0) {
-		if (sb)
-			drop_super(sb);
-		mutex_unlock(&bdev->bd_fsfreeze_mutex);
-		return 0;
-	}
-
-	if (sb) {
-		BUG_ON(sb->s_bdev != bdev);
-		if (!(sb->s_flags & MS_RDONLY)) {
-			if (sb->s_op->unfreeze_fs) {
-				error = sb->s_op->unfreeze_fs(sb);
-				if (error) {
-					printk(KERN_ERR
-						"VFS:Filesystem thaw failed\n");
-					sb->s_frozen = SB_FREEZE_TRANS;
-					bdev->bd_fsfreeze_count++;
-					mutex_unlock(&bdev->bd_fsfreeze_mutex);
-					return error;
-				}
-			}
-			sb->s_frozen = SB_UNFROZEN;
-			smp_wmb();
-			wake_up(&sb->s_wait_unfrozen);
+	error = 0;
+	if (--bdev->bd_fsfreeze_count > 0)
+		goto out_unlock;
+
+	if (!sb)
+		goto out_unlock;
+
+	BUG_ON(sb->s_bdev != bdev);
+	down_write(&sb->s_umount);
+	if (sb->s_flags & MS_RDONLY)
+		goto out_deactivate;
+
+	if (sb->s_op->unfreeze_fs) {
+		error = sb->s_op->unfreeze_fs(sb);
+		if (error) {
+			printk(KERN_ERR
+				"VFS:Filesystem thaw failed\n");
+			sb->s_frozen = SB_FREEZE_TRANS;
+			bdev->bd_fsfreeze_count++;
+			mutex_unlock(&bdev->bd_fsfreeze_mutex);
+			return error;
  		}
-		drop_super(sb);
  	}

+	sb->s_frozen = SB_UNFROZEN;
+	smp_wmb();
+	wake_up(&sb->s_wait_unfrozen);
+
+out_deactivate:
+	if (sb)
+		deactivate_locked_super(sb);
+out_unlock:
  	mutex_unlock(&bdev->bd_fsfreeze_mutex);
  	return 0;
  }
diff -urNp linux-2.6.31-rc7-orig/fs/super.c linux-2.6.31-rc7/fs/super.c
--- linux-2.6.31-rc7-orig/fs/super.c	2009-08-27 19:38:47.000000000 +0900
+++ linux-2.6.31-rc7/fs/super.c	2009-08-27 19:38:55.000000000 +0900
@@ -468,6 +468,48 @@ rescan:
  }

  EXPORT_SYMBOL(get_super);
+
+/**
+ * get_active_super - get an active reference to the superblock of a device
+ * @bdev: device to get the superblock for
+ *
+ * Scans the superblock list and finds the superblock of the file system
+ * mounted on the device given.  Returns the superblock with an active
+ * reference and s_umount held exclusively or %NULL if none was found.
+ */
+struct super_block *get_active_super(struct block_device *bdev)
+{
+	struct super_block *sb;
+
+	if (!bdev)
+		return NULL;
+
+	spin_lock(&sb_lock);
+	list_for_each_entry(sb, &super_blocks, s_list) {
+		if (sb->s_bdev != bdev)
+			continue;
+
+		sb->s_count++;
+		spin_unlock(&sb_lock);
+		down_write(&sb->s_umount);
+		if (sb->s_root) {
+			spin_lock(&sb_lock);
+			if (sb->s_count > S_BIAS) {
+				atomic_inc(&sb->s_active);
+				sb->s_count--;
+				spin_unlock(&sb_lock);
+				return sb;
+			}
+			spin_unlock(&sb_lock);
+		}
+		up_write(&sb->s_umount);
+		put_super(sb);
+		yield();
+		spin_lock(&sb_lock);
+	}
+	spin_unlock(&sb_lock);
+	return NULL;
+}

  struct super_block * user_get_super(dev_t dev)
  {
@@ -530,11 +572,15 @@ int do_remount_sb(struct super_block *sb
  {
  	int retval;
  	int remount_rw;
-	
+
+	if (sb->s_frozen != SB_UNFROZEN)
+		return -EBUSY;
+
  #ifdef CONFIG_BLOCK
  	if (!(flags & MS_RDONLY) && bdev_read_only(sb->s_bdev))
  		return -EACCES;
  #endif
+
  	if (flags & MS_RDONLY)
  		acct_auto_close(sb);
  	shrink_dcache_sb(sb);
diff -urNp linux-2.6.31-rc7-orig/include/linux/fs.h linux-2.6.31-rc7/include/linux/fs.h
--- linux-2.6.31-rc7-orig/include/linux/fs.h	2009-08-27 19:38:47.000000000 +0900
+++ linux-2.6.31-rc7/include/linux/fs.h	2009-08-27 19:38:55.000000000 +0900
@@ -2320,6 +2320,7 @@ extern void get_filesystem(struct file_s
  extern void put_filesystem(struct file_system_type *fs);
  extern struct file_system_type *get_fs_type(const char *name);
  extern struct super_block *get_super(struct block_device *);
+extern struct super_block *get_active_super(struct block_device *bdev);
  extern struct super_block *user_get_super(dev_t);
  extern void drop_super(struct super_block *sb);


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/4] Do not allow umounting of frozen filesystems
  2009-08-27 10:11   ` Fernando Luis Vázquez Cao
                       ` (2 preceding siblings ...)
  2009-08-27 14:06     ` [PATCH 2/4] freeze_bdev: grab active reference to frozen superblocks Fernando Luis Vázquez Cao
@ 2009-08-27 14:06     ` Fernando Luis Vázquez Cao
  2009-09-22 11:07       ` Al Viro
  3 siblings, 1 reply; 13+ messages in thread
From: Fernando Luis Vázquez Cao @ 2009-08-27 14:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: t-sato, m-hamaguchi, Al Viro, Andrew Morton,
	Linux Kernel Mailing List, Eric Sandeen

Instead of making umount users wait until the filesystem is
unfreezed return EBUSY, which is very convenient in HA
configurations.

This could have been implemented at a lower level but it would
require considerable plumbing in functions such as release_mounts
which do not return errors.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
---

diff -urNp linux-2.6.31-rc7-orig/fs/namespace.c linux-2.6.31-rc7/fs/namespace.c
--- linux-2.6.31-rc7-orig/fs/namespace.c	2009-08-27 19:34:35.000000000 +0900
+++ linux-2.6.31-rc7/fs/namespace.c	2009-08-27 22:45:14.000000000 +0900
@@ -1086,6 +1086,14 @@ static int do_umount(struct vfsmount *mn
  		return retval;
  	}

+	if (sb->s_bdev != NULL) {
+		mutex_lock(&sb->s_bdev->bd_fsfreeze_mutex);
+		if (sb->s_frozen != SB_UNFROZEN) {
+			mutex_unlock(&sb->s_bdev->bd_fsfreeze_mutex);
+			return -EBUSY;
+		}
+	}
+
  	down_write(&namespace_sem);
  	spin_lock(&vfsmount_lock);
  	event++;
@@ -1104,6 +1112,10 @@ static int do_umount(struct vfsmount *mn
  		security_sb_umount_busy(mnt);
  	up_write(&namespace_sem);
  	release_mounts(&umount_list);
+
+	if (sb->s_bdev != NULL)
+		mutex_unlock(&sb->s_bdev->bd_fsfreeze_mutex);
+
  	return retval;
  }


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 4/4] filesystem freeze: add ISFROZEN ioctl
  2009-08-26  5:00 [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze Fernando Luis Vázquez Cao
  2009-08-26 17:38 ` Christoph Hellwig
@ 2009-08-27 14:06 ` Fernando Luis Vázquez Cao
  1 sibling, 0 replies; 13+ messages in thread
From: Fernando Luis Vázquez Cao @ 2009-08-27 14:06 UTC (permalink / raw)
  To: Fernando Luis Vázquez Cao
  Cc: t-sato, m-hamaguchi, Christoph Hellwig, Al Viro, Andrew Morton,
	Linux Kernel Mailing List, Eric Sandeen

The ISFROZEN ioctl can be use by HA and monitoring software to check
the freeze state of a filesystem.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
---

diff -urNp linux-2.6.31-rc7-orig/fs/block_dev.c linux-2.6.31-rc7/fs/block_dev.c
--- linux-2.6.31-rc7-orig/fs/block_dev.c	2009-08-27 14:48:12.000000000 +0900
+++ linux-2.6.31-rc7/fs/block_dev.c	2009-08-27 18:49:46.000000000 +0900
@@ -334,6 +334,21 @@ out_unlock:
  }
  EXPORT_SYMBOL(thaw_bdev);

+/**
+ *
+ */
+int isfrozen_bdev(struct block_device *bdev, struct super_block *sb)
+{
+	int ret;
+
+	mutex_lock(&bdev->bd_fsfreeze_mutex);
+	ret = sb->s_frozen > SB_UNFROZEN ? 1:0;
+	mutex_unlock(&bdev->bd_fsfreeze_mutex);
+
+	return ret;
+}
+EXPORT_SYMBOL(isfrozen_bdev);
+
  static int blkdev_writepage(struct page *page, struct writeback_control *wbc)
  {
  	return block_write_full_page(page, blkdev_get_block, wbc);
diff -urNp linux-2.6.31-rc7-orig/fs/ioctl.c linux-2.6.31-rc7/fs/ioctl.c
--- linux-2.6.31-rc7-orig/fs/ioctl.c	2009-08-27 10:08:00.000000000 +0900
+++ linux-2.6.31-rc7/fs/ioctl.c	2009-08-27 18:49:17.000000000 +0900
@@ -536,6 +536,20 @@ static int ioctl_fsthaw(struct file *fil
  	return thaw_bdev(sb->s_bdev, sb);
  }

+static int ioctl_isfrozen(struct file *filp)
+{
+	struct super_block *sb = filp->f_path.dentry->d_inode->i_sb;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	/* If a blockdevice-backed filesystem isn't specified, return EINVAL. */
+	if (sb->s_bdev == NULL)
+		return -EINVAL;
+
+	return isfrozen_bdev(sb->s_bdev, sb);
+}
+
  /*
   * When you add any new common ioctls to the switches above and below
   * please update compat_sys_ioctl() too.
@@ -586,6 +600,12 @@ int do_vfs_ioctl(struct file *filp, unsi
  		error = ioctl_fsthaw(filp);
  		break;

+	case FIISFROZEN:
+		error = ioctl_isfrozen(filp);
+		if (error >= 0)
+			return put_user(error, (int __user *)arg);
+		break;
+
  	case FS_IOC_FIEMAP:
  		return ioctl_fiemap(filp, arg);

diff -urNp linux-2.6.31-rc7-orig/include/linux/fs.h linux-2.6.31-rc7/include/linux/fs.h
--- linux-2.6.31-rc7-orig/include/linux/fs.h	2009-08-27 14:48:12.000000000 +0900
+++ linux-2.6.31-rc7/include/linux/fs.h	2009-08-27 16:33:19.000000000 +0900
@@ -306,6 +306,7 @@ struct inodes_stat_t {
  #define FIGETBSZ   _IO(0x00,2)	/* get the block size used for bmap */
  #define FIFREEZE	_IOWR('X', 119, int)	/* Freeze */
  #define FITHAW		_IOWR('X', 120, int)	/* Thaw */
+#define FIISFROZEN	_IOWR('X', 121, int)	/* Is frozen? */

  #define	FS_IOC_GETFLAGS			_IOR('f', 1, long)
  #define	FS_IOC_SETFLAGS			_IOW('f', 2, long)
@@ -1955,6 +1956,7 @@ extern int sync_blockdev(struct block_de
  extern struct super_block *freeze_bdev(struct block_device *);
  extern void emergency_thaw_all(void);
  extern int thaw_bdev(struct block_device *bdev, struct super_block *sb);
+extern int isfrozen_bdev(struct block_device *bdev, struct super_block *sb);
  extern int fsync_bdev(struct block_device *);
  #else
  static inline void bd_forget(struct inode *inode) {}
@@ -1970,6 +1972,12 @@ static inline int thaw_bdev(struct block
  {
  	return 0;
  }
+
+extern int isfrozen_bdev(struct block_device *bdev, struct super_block *sb)
+{
+	return 0;
+}
+
  #endif
  extern int sync_filesystem(struct super_block *);
  extern const struct file_operations def_blk_fops;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] Do not allow umounting of frozen filesystems
  2009-08-27 14:06     ` [PATCH 3/4] Do not allow umounting of frozen filesystems Fernando Luis Vázquez Cao
@ 2009-09-22 11:07       ` Al Viro
  2009-09-22 15:57         ` Eric Sandeen
  2009-09-22 16:41         ` Fernando Luis Vazquez Cao
  0 siblings, 2 replies; 13+ messages in thread
From: Al Viro @ 2009-09-22 11:07 UTC (permalink / raw)
  To: Fernando Luis V?zquez Cao
  Cc: Christoph Hellwig, t-sato, m-hamaguchi, Andrew Morton,
	Linux Kernel Mailing List, Eric Sandeen

On Thu, Aug 27, 2009 at 11:06:07PM +0900, Fernando Luis V?zquez Cao wrote:
> Instead of making umount users wait until the filesystem is
> unfreezed return EBUSY, which is very convenient in HA
> configurations.
>
> This could have been implemented at a lower level but it would
> require considerable plumbing in functions such as release_mounts
> which do not return errors.

> +	if (sb->s_bdev != NULL) {
> +		mutex_lock(&sb->s_bdev->bd_fsfreeze_mutex);
> +		if (sb->s_frozen != SB_UNFROZEN) {
> +			mutex_unlock(&sb->s_bdev->bd_fsfreeze_mutex);
> +			return -EBUSY;
> +		}
> +	}

NAK.  First of all, it _partially_ breaks umount -l for no good reason.
If the first fs on the mountpoint is frozen, we fail; if it's deeper
we succeed just fine (and delay actual fs shutdown until the thaw).

As far as I can see, the real problem is that fsthaw ioctl has braindead
API; it takes some opened file on fs in question.  Why not do a bdev
ioctl instead?  Then we could let umount go ahead just fine, leaving
fs frozen (and not shut down until it thaws).  And whoever does thaw
(via bdev ioctl) will automatically trigger the actual fs shutdown.
Just with Christoph's pair of patches...

IOW, I'd rather add two new ioctls (check if frozen/thaw), both by
bdev.  On top of the first two patches in this set.

Comments?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] Do not allow umounting of frozen filesystems
  2009-09-22 11:07       ` Al Viro
@ 2009-09-22 15:57         ` Eric Sandeen
  2009-09-22 16:46           ` Fernando Luis Vazquez Cao
  2009-09-22 16:41         ` Fernando Luis Vazquez Cao
  1 sibling, 1 reply; 13+ messages in thread
From: Eric Sandeen @ 2009-09-22 15:57 UTC (permalink / raw)
  To: Al Viro
  Cc: Fernando Luis V?zquez Cao, Christoph Hellwig, t-sato, m-hamaguchi,
	Andrew Morton, Linux Kernel Mailing List

Al Viro wrote:
...

> IOW, I'd rather add two new ioctls (check if frozen/thaw), both by
> bdev.  On top of the first two patches in this set.

> Comments?

The check ioctls would be very very useful, IMHO.  Many filesystem tools
refuse to operate on a mounted filesystem, but in some cases it'd be
safe to do readonly operations on a frozen, mounted fs.

-Eric

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] Do not allow umounting of frozen filesystems
  2009-09-22 11:07       ` Al Viro
  2009-09-22 15:57         ` Eric Sandeen
@ 2009-09-22 16:41         ` Fernando Luis Vazquez Cao
  1 sibling, 0 replies; 13+ messages in thread
From: Fernando Luis Vazquez Cao @ 2009-09-22 16:41 UTC (permalink / raw)
  To: Al Viro
  Cc: Christoph Hellwig, t-sato, m-hamaguchi, Andrew Morton,
	Linux Kernel Mailing List, Eric Sandeen

Al Viro さんは書きました:
> On Thu, Aug 27, 2009 at 11:06:07PM +0900, Fernando Luis V?zquez Cao wrote:
>   
>> Instead of making umount users wait until the filesystem is
>> unfreezed return EBUSY, which is very convenient in HA
>> configurations.
>>
>> This could have been implemented at a lower level but it would
>> require considerable plumbing in functions such as release_mounts
>> which do not return errors.
>>     
>
>   
>> +	if (sb->s_bdev != NULL) {
>> +		mutex_lock(&sb->s_bdev->bd_fsfreeze_mutex);
>> +		if (sb->s_frozen != SB_UNFROZEN) {
>> +			mutex_unlock(&sb->s_bdev->bd_fsfreeze_mutex);
>> +			return -EBUSY;
>> +		}
>> +	}
>>     
>
> NAK.  First of all, it _partially_ breaks umount -l for no good reason.
> If the first fs on the mountpoint is frozen, we fail; if it's deeper
> we succeed just fine (and delay actual fs shutdown until the thaw).
>
> As far as I can see, the real problem is that fsthaw ioctl has braindead
> API; it takes some opened file on fs in question.  Why not do a bdev
> ioctl instead?  Then we could let umount go ahead just fine, leaving
> fs frozen (and not shut down until it thaws).  And whoever does thaw
> (via bdev ioctl) will automatically trigger the actual fs shutdown.
> Just with Christoph's pair of patches...
>   

I basically agree with you. The current API creates a lot of locking 
issues that could be tackled
more cleanly with the bdev ioctls you suggest.

> IOW, I'd rather add two new ioctls (check if frozen/thaw), both by
> bdev.  On top of the first two patches in this set.
>   

I am happy to see you would welcome a check ioctl.

If there is consensus on the bdev ioctl approach I could send patches.

Thanks,

Fernando

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] Do not allow umounting of frozen filesystems
  2009-09-22 15:57         ` Eric Sandeen
@ 2009-09-22 16:46           ` Fernando Luis Vazquez Cao
  0 siblings, 0 replies; 13+ messages in thread
From: Fernando Luis Vazquez Cao @ 2009-09-22 16:46 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Al Viro, Christoph Hellwig, t-sato, m-hamaguchi, Andrew Morton,
	Linux Kernel Mailing List

Eric Sandeen さんは書きました:
> Al Viro wrote:
> ...
>
>   
>> IOW, I'd rather add two new ioctls (check if frozen/thaw), both by
>> bdev.  On top of the first two patches in this set.
>>     
>
>   
>> Comments?
>>     
>
> The check ioctls would be very very useful, IMHO.  Many filesystem tools
> refuse to operate on a mounted filesystem, but in some cases it'd be
> safe to do readonly operations on a frozen, mounted fs.
>   

Yep, and HA software would benefit from the check ioctls too.

- Fernando

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2009-09-22 16:45 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-08-26  5:00 [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze Fernando Luis Vázquez Cao
2009-08-26 17:38 ` Christoph Hellwig
2009-08-27 10:11   ` Fernando Luis Vázquez Cao
2009-08-27 11:54     ` Christoph Hellwig
2009-08-27 12:16       ` Fernando Luis Vázquez Cao
2009-08-27 14:05     ` [PATCH 1/4] freeze_bdev: kill bd_mount_sem Fernando Luis Vázquez Cao
2009-08-27 14:06     ` [PATCH 2/4] freeze_bdev: grab active reference to frozen superblocks Fernando Luis Vázquez Cao
2009-08-27 14:06     ` [PATCH 3/4] Do not allow umounting of frozen filesystems Fernando Luis Vázquez Cao
2009-09-22 11:07       ` Al Viro
2009-09-22 15:57         ` Eric Sandeen
2009-09-22 16:46           ` Fernando Luis Vazquez Cao
2009-09-22 16:41         ` Fernando Luis Vazquez Cao
2009-08-27 14:06 ` [PATCH 4/4] filesystem freeze: add ISFROZEN ioctl Fernando Luis Vázquez Cao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox