* [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze @ 2009-08-26 5:00 Fernando Luis Vázquez Cao 2009-08-26 17:38 ` Christoph Hellwig 2009-08-27 14:06 ` [PATCH 4/4] filesystem freeze: add ISFROZEN ioctl Fernando Luis Vázquez Cao 0 siblings, 2 replies; 13+ messages in thread From: Fernando Luis Vázquez Cao @ 2009-08-26 5:00 UTC (permalink / raw) To: t-sato, m-hamaguchi Cc: Christoph Hellwig, Al Viro, Andrew Morton, Linux Kernel Mailing List, Eric Sandeen The current locking scheme for filesystem freeze avoids races between freeze_bdev() and do_umount() by taking the s_umount semaphore. If freeze_bdev() wins the race the process that invoked sys_umount will sleep until thaw_bdev releases the semaphore. Unfortunately, this will never happen because thaw_bdev bails out early the bd_fsfreeze_count check having failed (the count is 0). The problem is that the block_device that ioctl_fsthaw() passes to thaw_bdev() is not the one that we freezed because before sleeping in deactivate_super() do_umount() released the dentry (dput()) and freed the vfs mount (free_vfsmnt()). This patch works around this issue by checking the freeze level in do_umount(). Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> --- diff -urNp linux-2.6.31-rc7-orig/fs/namespace.c linux-2.6.31-rc7/fs/namespace.c --- linux-2.6.31-rc7-orig/fs/namespace.c 2009-08-25 16:39:46.000000000 +0900 +++ linux-2.6.31-rc7/fs/namespace.c 2009-08-26 11:30:10.000000000 +0900 @@ -1033,6 +1033,8 @@ static int do_umount(struct vfsmount *mn if (retval) return retval; + vfs_check_frozen(sb, SB_FREEZE_WRITE); + /* * Allow userspace to request a mountpoint be expired rather than * unmounting unconditionally. Unmount only happens if: ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze 2009-08-26 5:00 [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze Fernando Luis Vázquez Cao @ 2009-08-26 17:38 ` Christoph Hellwig 2009-08-27 10:11 ` Fernando Luis Vázquez Cao 2009-08-27 14:06 ` [PATCH 4/4] filesystem freeze: add ISFROZEN ioctl Fernando Luis Vázquez Cao 1 sibling, 1 reply; 13+ messages in thread From: Christoph Hellwig @ 2009-08-26 17:38 UTC (permalink / raw) To: Fernando Luis V??zquez Cao Cc: t-sato, m-hamaguchi, Christoph Hellwig, Al Viro, Andrew Morton, Linux Kernel Mailing List, Eric Sandeen On Wed, Aug 26, 2009 at 02:00:01PM +0900, Fernando Luis V??zquez Cao wrote: > The current locking scheme for filesystem freeze avoids races between > freeze_bdev() and do_umount() by taking the s_umount semaphore. > > If freeze_bdev() wins the race the process that invoked sys_umount > will sleep until thaw_bdev releases the semaphore. Unfortunately, this > will never happen because thaw_bdev bails out early the > bd_fsfreeze_count check having failed (the count is 0). > > The problem is that the block_device that ioctl_fsthaw() passes to > thaw_bdev() is not the one that we freezed because before sleeping in > deactivate_super() do_umount() released the dentry (dput()) and freed > the vfs mount (free_vfsmnt()). > > This patch works around this issue by checking the freeze level in > do_umount() This should be solved my my freeze locking rewrite: http://marc.info/?l=linux-fsdevel&m=124933489118480&w=2 http://marc.info/?l=linux-fsdevel&m=124933491918517&w=2 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze 2009-08-26 17:38 ` Christoph Hellwig @ 2009-08-27 10:11 ` Fernando Luis Vázquez Cao 2009-08-27 11:54 ` Christoph Hellwig ` (3 more replies) 0 siblings, 4 replies; 13+ messages in thread From: Fernando Luis Vázquez Cao @ 2009-08-27 10:11 UTC (permalink / raw) To: Christoph Hellwig Cc: t-sato, m-hamaguchi, Al Viro, Andrew Morton, Linux Kernel Mailing List, Eric Sandeen Christoph Hellwig wrote: > On Wed, Aug 26, 2009 at 02:00:01PM +0900, Fernando Luis V??zquez Cao wrote: >> The current locking scheme for filesystem freeze avoids races between >> freeze_bdev() and do_umount() by taking the s_umount semaphore. >> >> If freeze_bdev() wins the race the process that invoked sys_umount >> will sleep until thaw_bdev releases the semaphore. Unfortunately, this >> will never happen because thaw_bdev bails out early the >> bd_fsfreeze_count check having failed (the count is 0). >> >> The problem is that the block_device that ioctl_fsthaw() passes to >> thaw_bdev() is not the one that we freezed because before sleeping in >> deactivate_super() do_umount() released the dentry (dput()) and freed >> the vfs mount (free_vfsmnt()). >> >> This patch works around this issue by checking the freeze level in >> do_umount() > > > This should be solved my my freeze locking rewrite: > > http://marc.info/?l=linux-fsdevel&m=124933489118480&w=2 > http://marc.info/?l=linux-fsdevel&m=124933491918517&w=2 Hi Christoph, Thank you for the pointers. I gave those patches a spin, but it seems the umount case is not being tackled. I suggest rejecting the umount for frozen filesystems. What do you think? I will be replying to this email with a forward port of your patches along with my own patches that fix the locking for umount and add a new ioctl to check the freeze state of the filesystem (this is helpful to create clean resource agents for HA solutions). - Fernando ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze 2009-08-27 10:11 ` Fernando Luis Vázquez Cao @ 2009-08-27 11:54 ` Christoph Hellwig 2009-08-27 12:16 ` Fernando Luis Vázquez Cao 2009-08-27 14:05 ` [PATCH 1/4] freeze_bdev: kill bd_mount_sem Fernando Luis Vázquez Cao ` (2 subsequent siblings) 3 siblings, 1 reply; 13+ messages in thread From: Christoph Hellwig @ 2009-08-27 11:54 UTC (permalink / raw) To: Fernando Luis V?zquez Cao Cc: Christoph Hellwig, t-sato, m-hamaguchi, Al Viro, Andrew Morton, Linux Kernel Mailing List, Eric Sandeen On Thu, Aug 27, 2009 at 07:11:29PM +0900, Fernando Luis V?zquez Cao wrote: > Thank you for the pointers. I gave those patches a spin, but it seems > the umount case is not being tackled. I suggest rejecting the umount > for frozen filesystems. What do you think? > > I will be replying to this email with a forward port of your patches > along with my own patches that fix the locking for umount and add a > new ioctl to check the freeze state of the filesystem (this is helpful > to create clean resource agents for HA solutions). I think we should just reject the umount for a forzen filesystem with -EBUSY like I do for remount in those patches. I somehow thought I did this but didn't. Is that good for your use case? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze 2009-08-27 11:54 ` Christoph Hellwig @ 2009-08-27 12:16 ` Fernando Luis Vázquez Cao 0 siblings, 0 replies; 13+ messages in thread From: Fernando Luis Vázquez Cao @ 2009-08-27 12:16 UTC (permalink / raw) To: Christoph Hellwig Cc: t-sato, m-hamaguchi, Al Viro, Andrew Morton, Linux Kernel Mailing List, Eric Sandeen Christoph Hellwig wrote: > On Thu, Aug 27, 2009 at 07:11:29PM +0900, Fernando Luis V?zquez Cao wrote: >> Thank you for the pointers. I gave those patches a spin, but it seems >> the umount case is not being tackled. I suggest rejecting the umount >> for frozen filesystems. What do you think? >> >> I will be replying to this email with a forward port of your patches >> along with my own patches that fix the locking for umount and add a >> new ioctl to check the freeze state of the filesystem (this is helpful >> to create clean resource agents for HA solutions). > > I think we should just reject the umount for a forzen filesystem with > -EBUSY like I do for remount in those patches. I somehow thought I did > this but didn't. Is that good for your use case? Yes, absolutely. Today I will be sending patches that do just that. Hopefully you will find the new ioctl (ISFROZEN) I am adding helpful too. Thanks, Fernando ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/4] freeze_bdev: kill bd_mount_sem 2009-08-27 10:11 ` Fernando Luis Vázquez Cao 2009-08-27 11:54 ` Christoph Hellwig @ 2009-08-27 14:05 ` Fernando Luis Vázquez Cao 2009-08-27 14:06 ` [PATCH 2/4] freeze_bdev: grab active reference to frozen superblocks Fernando Luis Vázquez Cao 2009-08-27 14:06 ` [PATCH 3/4] Do not allow umounting of frozen filesystems Fernando Luis Vázquez Cao 3 siblings, 0 replies; 13+ messages in thread From: Fernando Luis Vázquez Cao @ 2009-08-27 14:05 UTC (permalink / raw) To: Christoph Hellwig Cc: t-sato, m-hamaguchi, Al Viro, Andrew Morton, Linux Kernel Mailing List, Eric Sandeen Author: Christoph Hellwig <hch@lst.de> Subject: [PATCH 1/4] freeze_bdev: kill bd_mount_sem Now that we have the freeze count there is not much reason for bd_mount_sem anymore. The actual freeze/thaw operations are serialized using the bd_fsfreeze_mutex, and the only other place we take bd_mount_sem is get_sb_bdev which tries to prevent mounting a filesystem while the block device is frozen. Instead of add a check for bd_fsfreeze_count and return -EBUSY if a filesystem is frozen. While that is a change in user visible behaviour a failing mount is much better for this case rather than having the mount process stuck uninterruptible for a long time. Signed-off-by: Christoph Hellwig <hch@lst.de> --- diff -urNp linux-2.6.31-rc7-orig/fs/block_dev.c linux-2.6.31-rc7/fs/block_dev.c --- linux-2.6.31-rc7-orig/fs/block_dev.c 2009-08-27 19:34:35.000000000 +0900 +++ linux-2.6.31-rc7/fs/block_dev.c 2009-08-27 19:36:22.000000000 +0900 @@ -216,8 +216,6 @@ EXPORT_SYMBOL(fsync_bdev); * freeze_bdev -- lock a filesystem and force it into a consistent state * @bdev: blockdevice to lock * - * This takes the block device bd_mount_sem to make sure no new mounts - * happen on bdev until thaw_bdev() is called. * If a superblock is found on this device, we take the s_umount semaphore * on it to make sure nobody unmounts until the snapshot creation is done. * The reference counter (bd_fsfreeze_count) guarantees that only the last @@ -240,7 +238,6 @@ struct super_block *freeze_bdev(struct b } bdev->bd_fsfreeze_count++; - down(&bdev->bd_mount_sem); sb = get_super(bdev); if (sb && !(sb->s_flags & MS_RDONLY)) { sb->s_frozen = SB_FREEZE_WRITE; @@ -260,7 +257,6 @@ struct super_block *freeze_bdev(struct b "VFS:Filesystem freeze failed\n"); sb->s_frozen = SB_UNFROZEN; drop_super(sb); - up(&bdev->bd_mount_sem); bdev->bd_fsfreeze_count--; mutex_unlock(&bdev->bd_fsfreeze_mutex); return ERR_PTR(error); @@ -271,7 +267,7 @@ struct super_block *freeze_bdev(struct b sync_blockdev(bdev); mutex_unlock(&bdev->bd_fsfreeze_mutex); - return sb; /* thaw_bdev releases s->s_umount and bd_mount_sem */ + return sb; /* thaw_bdev releases s->s_umount */ } EXPORT_SYMBOL(freeze_bdev); @@ -321,7 +317,6 @@ int thaw_bdev(struct block_device *bdev, drop_super(sb); } - up(&bdev->bd_mount_sem); mutex_unlock(&bdev->bd_fsfreeze_mutex); return 0; } @@ -431,7 +426,6 @@ static void init_once(void *foo) memset(bdev, 0, sizeof(*bdev)); mutex_init(&bdev->bd_mutex); - sema_init(&bdev->bd_mount_sem, 1); INIT_LIST_HEAD(&bdev->bd_inodes); INIT_LIST_HEAD(&bdev->bd_list); #ifdef CONFIG_SYSFS diff -urNp linux-2.6.31-rc7-orig/fs/super.c linux-2.6.31-rc7/fs/super.c --- linux-2.6.31-rc7-orig/fs/super.c 2009-08-27 19:34:35.000000000 +0900 +++ linux-2.6.31-rc7/fs/super.c 2009-08-27 19:36:22.000000000 +0900 @@ -740,9 +740,14 @@ int get_sb_bdev(struct file_system_type * will protect the lockfs code from trying to start a snapshot * while we are mounting */ - down(&bdev->bd_mount_sem); + mutex_lock(&bdev->bd_fsfreeze_mutex); + if (bdev->bd_fsfreeze_count > 0) { + mutex_unlock(&bdev->bd_fsfreeze_mutex); + error = -EBUSY; + goto error_bdev; + } s = sget(fs_type, test_bdev_super, set_bdev_super, bdev); - up(&bdev->bd_mount_sem); + mutex_unlock(&bdev->bd_fsfreeze_mutex); if (IS_ERR(s)) goto error_s; diff -urNp linux-2.6.31-rc7-orig/include/linux/fs.h linux-2.6.31-rc7/include/linux/fs.h --- linux-2.6.31-rc7-orig/include/linux/fs.h 2009-08-27 19:34:35.000000000 +0900 +++ linux-2.6.31-rc7/include/linux/fs.h 2009-08-27 19:36:22.000000000 +0900 @@ -640,7 +640,6 @@ struct block_device { struct super_block * bd_super; int bd_openers; struct mutex bd_mutex; /* open/close mutex */ - struct semaphore bd_mount_sem; struct list_head bd_inodes; void * bd_holder; int bd_holders; ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 2/4] freeze_bdev: grab active reference to frozen superblocks 2009-08-27 10:11 ` Fernando Luis Vázquez Cao 2009-08-27 11:54 ` Christoph Hellwig 2009-08-27 14:05 ` [PATCH 1/4] freeze_bdev: kill bd_mount_sem Fernando Luis Vázquez Cao @ 2009-08-27 14:06 ` Fernando Luis Vázquez Cao 2009-08-27 14:06 ` [PATCH 3/4] Do not allow umounting of frozen filesystems Fernando Luis Vázquez Cao 3 siblings, 0 replies; 13+ messages in thread From: Fernando Luis Vázquez Cao @ 2009-08-27 14:06 UTC (permalink / raw) To: Christoph Hellwig Cc: t-sato, m-hamaguchi, Al Viro, Andrew Morton, Linux Kernel Mailing List, Eric Sandeen Author: Christoph Hellwig <hch@lst.de> Subject: [PATCH 2/4] freeze_bdev: grab active reference to frozen superblocks Currently we held s_umount while a filesystem is frozen, despite that we might return to userspace and unlock it from a different process. Instead grab an active reference to keep the file system busy and add an explicit check for frozen filesystems in remount and reject the remount instead of blocking on s_umount. Add a new get_active_super helper to super.c for use by freeze_bdev that grabs an active reference to a superblock from a given block device. Signed-off-by: Christoph Hellwig <hch@lst.de> --- diff -urNp linux-2.6.31-rc7-orig/fs/block_dev.c linux-2.6.31-rc7/fs/block_dev.c --- linux-2.6.31-rc7-orig/fs/block_dev.c 2009-08-27 19:38:47.000000000 +0900 +++ linux-2.6.31-rc7/fs/block_dev.c 2009-08-27 19:38:55.000000000 +0900 @@ -230,43 +230,54 @@ struct super_block *freeze_bdev(struct b int error = 0; mutex_lock(&bdev->bd_fsfreeze_mutex); - if (bdev->bd_fsfreeze_count > 0) { - bdev->bd_fsfreeze_count++; + if (++bdev->bd_fsfreeze_count > 1) { + /* + * We don't even need to grab a reference - the first call + * to freeze_bdev grab an active reference and only the last + * thaw_bdev drops it. + */ sb = get_super(bdev); + drop_super(sb); mutex_unlock(&bdev->bd_fsfreeze_mutex); return sb; } - bdev->bd_fsfreeze_count++; - sb = get_super(bdev); - if (sb && !(sb->s_flags & MS_RDONLY)) { - sb->s_frozen = SB_FREEZE_WRITE; - smp_wmb(); - - sync_filesystem(sb); - - sb->s_frozen = SB_FREEZE_TRANS; - smp_wmb(); - - sync_blockdev(sb->s_bdev); - - if (sb->s_op->freeze_fs) { - error = sb->s_op->freeze_fs(sb); - if (error) { - printk(KERN_ERR - "VFS:Filesystem freeze failed\n"); - sb->s_frozen = SB_UNFROZEN; - drop_super(sb); - bdev->bd_fsfreeze_count--; - mutex_unlock(&bdev->bd_fsfreeze_mutex); - return ERR_PTR(error); - } + sb = get_active_super(bdev); + if (!sb) + goto out; + if (sb->s_flags & MS_RDONLY) { + deactivate_locked_super(sb); + mutex_unlock(&bdev->bd_fsfreeze_mutex); + return sb; + } + + sb->s_frozen = SB_FREEZE_WRITE; + smp_wmb(); + + sync_filesystem(sb); + + sb->s_frozen = SB_FREEZE_TRANS; + smp_wmb(); + + sync_blockdev(sb->s_bdev); + + if (sb->s_op->freeze_fs) { + error = sb->s_op->freeze_fs(sb); + if (error) { + printk(KERN_ERR + "VFS:Filesystem freeze failed\n"); + sb->s_frozen = SB_UNFROZEN; + deactivate_locked_super(sb); + bdev->bd_fsfreeze_count--; + mutex_unlock(&bdev->bd_fsfreeze_mutex); + return ERR_PTR(error); } } + up_write(&sb->s_umount); + out: sync_blockdev(bdev); mutex_unlock(&bdev->bd_fsfreeze_mutex); - return sb; /* thaw_bdev releases s->s_umount */ } EXPORT_SYMBOL(freeze_bdev); @@ -280,43 +291,44 @@ EXPORT_SYMBOL(freeze_bdev); */ int thaw_bdev(struct block_device *bdev, struct super_block *sb) { - int error = 0; + int error = -EINVAL; mutex_lock(&bdev->bd_fsfreeze_mutex); - if (!bdev->bd_fsfreeze_count) { - mutex_unlock(&bdev->bd_fsfreeze_mutex); - return -EINVAL; - } + if (!bdev->bd_fsfreeze_count) + goto out_unlock; - bdev->bd_fsfreeze_count--; - if (bdev->bd_fsfreeze_count > 0) { - if (sb) - drop_super(sb); - mutex_unlock(&bdev->bd_fsfreeze_mutex); - return 0; - } - - if (sb) { - BUG_ON(sb->s_bdev != bdev); - if (!(sb->s_flags & MS_RDONLY)) { - if (sb->s_op->unfreeze_fs) { - error = sb->s_op->unfreeze_fs(sb); - if (error) { - printk(KERN_ERR - "VFS:Filesystem thaw failed\n"); - sb->s_frozen = SB_FREEZE_TRANS; - bdev->bd_fsfreeze_count++; - mutex_unlock(&bdev->bd_fsfreeze_mutex); - return error; - } - } - sb->s_frozen = SB_UNFROZEN; - smp_wmb(); - wake_up(&sb->s_wait_unfrozen); + error = 0; + if (--bdev->bd_fsfreeze_count > 0) + goto out_unlock; + + if (!sb) + goto out_unlock; + + BUG_ON(sb->s_bdev != bdev); + down_write(&sb->s_umount); + if (sb->s_flags & MS_RDONLY) + goto out_deactivate; + + if (sb->s_op->unfreeze_fs) { + error = sb->s_op->unfreeze_fs(sb); + if (error) { + printk(KERN_ERR + "VFS:Filesystem thaw failed\n"); + sb->s_frozen = SB_FREEZE_TRANS; + bdev->bd_fsfreeze_count++; + mutex_unlock(&bdev->bd_fsfreeze_mutex); + return error; } - drop_super(sb); } + sb->s_frozen = SB_UNFROZEN; + smp_wmb(); + wake_up(&sb->s_wait_unfrozen); + +out_deactivate: + if (sb) + deactivate_locked_super(sb); +out_unlock: mutex_unlock(&bdev->bd_fsfreeze_mutex); return 0; } diff -urNp linux-2.6.31-rc7-orig/fs/super.c linux-2.6.31-rc7/fs/super.c --- linux-2.6.31-rc7-orig/fs/super.c 2009-08-27 19:38:47.000000000 +0900 +++ linux-2.6.31-rc7/fs/super.c 2009-08-27 19:38:55.000000000 +0900 @@ -468,6 +468,48 @@ rescan: } EXPORT_SYMBOL(get_super); + +/** + * get_active_super - get an active reference to the superblock of a device + * @bdev: device to get the superblock for + * + * Scans the superblock list and finds the superblock of the file system + * mounted on the device given. Returns the superblock with an active + * reference and s_umount held exclusively or %NULL if none was found. + */ +struct super_block *get_active_super(struct block_device *bdev) +{ + struct super_block *sb; + + if (!bdev) + return NULL; + + spin_lock(&sb_lock); + list_for_each_entry(sb, &super_blocks, s_list) { + if (sb->s_bdev != bdev) + continue; + + sb->s_count++; + spin_unlock(&sb_lock); + down_write(&sb->s_umount); + if (sb->s_root) { + spin_lock(&sb_lock); + if (sb->s_count > S_BIAS) { + atomic_inc(&sb->s_active); + sb->s_count--; + spin_unlock(&sb_lock); + return sb; + } + spin_unlock(&sb_lock); + } + up_write(&sb->s_umount); + put_super(sb); + yield(); + spin_lock(&sb_lock); + } + spin_unlock(&sb_lock); + return NULL; +} struct super_block * user_get_super(dev_t dev) { @@ -530,11 +572,15 @@ int do_remount_sb(struct super_block *sb { int retval; int remount_rw; - + + if (sb->s_frozen != SB_UNFROZEN) + return -EBUSY; + #ifdef CONFIG_BLOCK if (!(flags & MS_RDONLY) && bdev_read_only(sb->s_bdev)) return -EACCES; #endif + if (flags & MS_RDONLY) acct_auto_close(sb); shrink_dcache_sb(sb); diff -urNp linux-2.6.31-rc7-orig/include/linux/fs.h linux-2.6.31-rc7/include/linux/fs.h --- linux-2.6.31-rc7-orig/include/linux/fs.h 2009-08-27 19:38:47.000000000 +0900 +++ linux-2.6.31-rc7/include/linux/fs.h 2009-08-27 19:38:55.000000000 +0900 @@ -2320,6 +2320,7 @@ extern void get_filesystem(struct file_s extern void put_filesystem(struct file_system_type *fs); extern struct file_system_type *get_fs_type(const char *name); extern struct super_block *get_super(struct block_device *); +extern struct super_block *get_active_super(struct block_device *bdev); extern struct super_block *user_get_super(dev_t); extern void drop_super(struct super_block *sb); ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 3/4] Do not allow umounting of frozen filesystems 2009-08-27 10:11 ` Fernando Luis Vázquez Cao ` (2 preceding siblings ...) 2009-08-27 14:06 ` [PATCH 2/4] freeze_bdev: grab active reference to frozen superblocks Fernando Luis Vázquez Cao @ 2009-08-27 14:06 ` Fernando Luis Vázquez Cao 2009-09-22 11:07 ` Al Viro 3 siblings, 1 reply; 13+ messages in thread From: Fernando Luis Vázquez Cao @ 2009-08-27 14:06 UTC (permalink / raw) To: Christoph Hellwig Cc: t-sato, m-hamaguchi, Al Viro, Andrew Morton, Linux Kernel Mailing List, Eric Sandeen Instead of making umount users wait until the filesystem is unfreezed return EBUSY, which is very convenient in HA configurations. This could have been implemented at a lower level but it would require considerable plumbing in functions such as release_mounts which do not return errors. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> --- diff -urNp linux-2.6.31-rc7-orig/fs/namespace.c linux-2.6.31-rc7/fs/namespace.c --- linux-2.6.31-rc7-orig/fs/namespace.c 2009-08-27 19:34:35.000000000 +0900 +++ linux-2.6.31-rc7/fs/namespace.c 2009-08-27 22:45:14.000000000 +0900 @@ -1086,6 +1086,14 @@ static int do_umount(struct vfsmount *mn return retval; } + if (sb->s_bdev != NULL) { + mutex_lock(&sb->s_bdev->bd_fsfreeze_mutex); + if (sb->s_frozen != SB_UNFROZEN) { + mutex_unlock(&sb->s_bdev->bd_fsfreeze_mutex); + return -EBUSY; + } + } + down_write(&namespace_sem); spin_lock(&vfsmount_lock); event++; @@ -1104,6 +1112,10 @@ static int do_umount(struct vfsmount *mn security_sb_umount_busy(mnt); up_write(&namespace_sem); release_mounts(&umount_list); + + if (sb->s_bdev != NULL) + mutex_unlock(&sb->s_bdev->bd_fsfreeze_mutex); + return retval; } ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/4] Do not allow umounting of frozen filesystems 2009-08-27 14:06 ` [PATCH 3/4] Do not allow umounting of frozen filesystems Fernando Luis Vázquez Cao @ 2009-09-22 11:07 ` Al Viro 2009-09-22 15:57 ` Eric Sandeen 2009-09-22 16:41 ` Fernando Luis Vazquez Cao 0 siblings, 2 replies; 13+ messages in thread From: Al Viro @ 2009-09-22 11:07 UTC (permalink / raw) To: Fernando Luis V?zquez Cao Cc: Christoph Hellwig, t-sato, m-hamaguchi, Andrew Morton, Linux Kernel Mailing List, Eric Sandeen On Thu, Aug 27, 2009 at 11:06:07PM +0900, Fernando Luis V?zquez Cao wrote: > Instead of making umount users wait until the filesystem is > unfreezed return EBUSY, which is very convenient in HA > configurations. > > This could have been implemented at a lower level but it would > require considerable plumbing in functions such as release_mounts > which do not return errors. > + if (sb->s_bdev != NULL) { > + mutex_lock(&sb->s_bdev->bd_fsfreeze_mutex); > + if (sb->s_frozen != SB_UNFROZEN) { > + mutex_unlock(&sb->s_bdev->bd_fsfreeze_mutex); > + return -EBUSY; > + } > + } NAK. First of all, it _partially_ breaks umount -l for no good reason. If the first fs on the mountpoint is frozen, we fail; if it's deeper we succeed just fine (and delay actual fs shutdown until the thaw). As far as I can see, the real problem is that fsthaw ioctl has braindead API; it takes some opened file on fs in question. Why not do a bdev ioctl instead? Then we could let umount go ahead just fine, leaving fs frozen (and not shut down until it thaws). And whoever does thaw (via bdev ioctl) will automatically trigger the actual fs shutdown. Just with Christoph's pair of patches... IOW, I'd rather add two new ioctls (check if frozen/thaw), both by bdev. On top of the first two patches in this set. Comments? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/4] Do not allow umounting of frozen filesystems 2009-09-22 11:07 ` Al Viro @ 2009-09-22 15:57 ` Eric Sandeen 2009-09-22 16:46 ` Fernando Luis Vazquez Cao 2009-09-22 16:41 ` Fernando Luis Vazquez Cao 1 sibling, 1 reply; 13+ messages in thread From: Eric Sandeen @ 2009-09-22 15:57 UTC (permalink / raw) To: Al Viro Cc: Fernando Luis V?zquez Cao, Christoph Hellwig, t-sato, m-hamaguchi, Andrew Morton, Linux Kernel Mailing List Al Viro wrote: ... > IOW, I'd rather add two new ioctls (check if frozen/thaw), both by > bdev. On top of the first two patches in this set. > Comments? The check ioctls would be very very useful, IMHO. Many filesystem tools refuse to operate on a mounted filesystem, but in some cases it'd be safe to do readonly operations on a frozen, mounted fs. -Eric ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/4] Do not allow umounting of frozen filesystems 2009-09-22 15:57 ` Eric Sandeen @ 2009-09-22 16:46 ` Fernando Luis Vazquez Cao 0 siblings, 0 replies; 13+ messages in thread From: Fernando Luis Vazquez Cao @ 2009-09-22 16:46 UTC (permalink / raw) To: Eric Sandeen Cc: Al Viro, Christoph Hellwig, t-sato, m-hamaguchi, Andrew Morton, Linux Kernel Mailing List Eric Sandeen さんは書きました: > Al Viro wrote: > ... > > >> IOW, I'd rather add two new ioctls (check if frozen/thaw), both by >> bdev. On top of the first two patches in this set. >> > > >> Comments? >> > > The check ioctls would be very very useful, IMHO. Many filesystem tools > refuse to operate on a mounted filesystem, but in some cases it'd be > safe to do readonly operations on a frozen, mounted fs. > Yep, and HA software would benefit from the check ioctls too. - Fernando ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/4] Do not allow umounting of frozen filesystems 2009-09-22 11:07 ` Al Viro 2009-09-22 15:57 ` Eric Sandeen @ 2009-09-22 16:41 ` Fernando Luis Vazquez Cao 1 sibling, 0 replies; 13+ messages in thread From: Fernando Luis Vazquez Cao @ 2009-09-22 16:41 UTC (permalink / raw) To: Al Viro Cc: Christoph Hellwig, t-sato, m-hamaguchi, Andrew Morton, Linux Kernel Mailing List, Eric Sandeen Al Viro さんは書きました: > On Thu, Aug 27, 2009 at 11:06:07PM +0900, Fernando Luis V?zquez Cao wrote: > >> Instead of making umount users wait until the filesystem is >> unfreezed return EBUSY, which is very convenient in HA >> configurations. >> >> This could have been implemented at a lower level but it would >> require considerable plumbing in functions such as release_mounts >> which do not return errors. >> > > >> + if (sb->s_bdev != NULL) { >> + mutex_lock(&sb->s_bdev->bd_fsfreeze_mutex); >> + if (sb->s_frozen != SB_UNFROZEN) { >> + mutex_unlock(&sb->s_bdev->bd_fsfreeze_mutex); >> + return -EBUSY; >> + } >> + } >> > > NAK. First of all, it _partially_ breaks umount -l for no good reason. > If the first fs on the mountpoint is frozen, we fail; if it's deeper > we succeed just fine (and delay actual fs shutdown until the thaw). > > As far as I can see, the real problem is that fsthaw ioctl has braindead > API; it takes some opened file on fs in question. Why not do a bdev > ioctl instead? Then we could let umount go ahead just fine, leaving > fs frozen (and not shut down until it thaws). And whoever does thaw > (via bdev ioctl) will automatically trigger the actual fs shutdown. > Just with Christoph's pair of patches... > I basically agree with you. The current API creates a lot of locking issues that could be tackled more cleanly with the bdev ioctls you suggest. > IOW, I'd rather add two new ioctls (check if frozen/thaw), both by > bdev. On top of the first two patches in this set. > I am happy to see you would welcome a check ioctl. If there is consensus on the bdev ioctl approach I could send patches. Thanks, Fernando ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 4/4] filesystem freeze: add ISFROZEN ioctl 2009-08-26 5:00 [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze Fernando Luis Vázquez Cao 2009-08-26 17:38 ` Christoph Hellwig @ 2009-08-27 14:06 ` Fernando Luis Vázquez Cao 1 sibling, 0 replies; 13+ messages in thread From: Fernando Luis Vázquez Cao @ 2009-08-27 14:06 UTC (permalink / raw) To: Fernando Luis Vázquez Cao Cc: t-sato, m-hamaguchi, Christoph Hellwig, Al Viro, Andrew Morton, Linux Kernel Mailing List, Eric Sandeen The ISFROZEN ioctl can be use by HA and monitoring software to check the freeze state of a filesystem. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> --- diff -urNp linux-2.6.31-rc7-orig/fs/block_dev.c linux-2.6.31-rc7/fs/block_dev.c --- linux-2.6.31-rc7-orig/fs/block_dev.c 2009-08-27 14:48:12.000000000 +0900 +++ linux-2.6.31-rc7/fs/block_dev.c 2009-08-27 18:49:46.000000000 +0900 @@ -334,6 +334,21 @@ out_unlock: } EXPORT_SYMBOL(thaw_bdev); +/** + * + */ +int isfrozen_bdev(struct block_device *bdev, struct super_block *sb) +{ + int ret; + + mutex_lock(&bdev->bd_fsfreeze_mutex); + ret = sb->s_frozen > SB_UNFROZEN ? 1:0; + mutex_unlock(&bdev->bd_fsfreeze_mutex); + + return ret; +} +EXPORT_SYMBOL(isfrozen_bdev); + static int blkdev_writepage(struct page *page, struct writeback_control *wbc) { return block_write_full_page(page, blkdev_get_block, wbc); diff -urNp linux-2.6.31-rc7-orig/fs/ioctl.c linux-2.6.31-rc7/fs/ioctl.c --- linux-2.6.31-rc7-orig/fs/ioctl.c 2009-08-27 10:08:00.000000000 +0900 +++ linux-2.6.31-rc7/fs/ioctl.c 2009-08-27 18:49:17.000000000 +0900 @@ -536,6 +536,20 @@ static int ioctl_fsthaw(struct file *fil return thaw_bdev(sb->s_bdev, sb); } +static int ioctl_isfrozen(struct file *filp) +{ + struct super_block *sb = filp->f_path.dentry->d_inode->i_sb; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + /* If a blockdevice-backed filesystem isn't specified, return EINVAL. */ + if (sb->s_bdev == NULL) + return -EINVAL; + + return isfrozen_bdev(sb->s_bdev, sb); +} + /* * When you add any new common ioctls to the switches above and below * please update compat_sys_ioctl() too. @@ -586,6 +600,12 @@ int do_vfs_ioctl(struct file *filp, unsi error = ioctl_fsthaw(filp); break; + case FIISFROZEN: + error = ioctl_isfrozen(filp); + if (error >= 0) + return put_user(error, (int __user *)arg); + break; + case FS_IOC_FIEMAP: return ioctl_fiemap(filp, arg); diff -urNp linux-2.6.31-rc7-orig/include/linux/fs.h linux-2.6.31-rc7/include/linux/fs.h --- linux-2.6.31-rc7-orig/include/linux/fs.h 2009-08-27 14:48:12.000000000 +0900 +++ linux-2.6.31-rc7/include/linux/fs.h 2009-08-27 16:33:19.000000000 +0900 @@ -306,6 +306,7 @@ struct inodes_stat_t { #define FIGETBSZ _IO(0x00,2) /* get the block size used for bmap */ #define FIFREEZE _IOWR('X', 119, int) /* Freeze */ #define FITHAW _IOWR('X', 120, int) /* Thaw */ +#define FIISFROZEN _IOWR('X', 121, int) /* Is frozen? */ #define FS_IOC_GETFLAGS _IOR('f', 1, long) #define FS_IOC_SETFLAGS _IOW('f', 2, long) @@ -1955,6 +1956,7 @@ extern int sync_blockdev(struct block_de extern struct super_block *freeze_bdev(struct block_device *); extern void emergency_thaw_all(void); extern int thaw_bdev(struct block_device *bdev, struct super_block *sb); +extern int isfrozen_bdev(struct block_device *bdev, struct super_block *sb); extern int fsync_bdev(struct block_device *); #else static inline void bd_forget(struct inode *inode) {} @@ -1970,6 +1972,12 @@ static inline int thaw_bdev(struct block { return 0; } + +extern int isfrozen_bdev(struct block_device *bdev, struct super_block *sb) +{ + return 0; +} + #endif extern int sync_filesystem(struct super_block *); extern const struct file_operations def_blk_fops; ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2009-09-22 16:45 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-08-26 5:00 [RFC, PATCH] filesystem freeze: fix sys_umount induced perpetual freeze Fernando Luis Vázquez Cao 2009-08-26 17:38 ` Christoph Hellwig 2009-08-27 10:11 ` Fernando Luis Vázquez Cao 2009-08-27 11:54 ` Christoph Hellwig 2009-08-27 12:16 ` Fernando Luis Vázquez Cao 2009-08-27 14:05 ` [PATCH 1/4] freeze_bdev: kill bd_mount_sem Fernando Luis Vázquez Cao 2009-08-27 14:06 ` [PATCH 2/4] freeze_bdev: grab active reference to frozen superblocks Fernando Luis Vázquez Cao 2009-08-27 14:06 ` [PATCH 3/4] Do not allow umounting of frozen filesystems Fernando Luis Vázquez Cao 2009-09-22 11:07 ` Al Viro 2009-09-22 15:57 ` Eric Sandeen 2009-09-22 16:46 ` Fernando Luis Vazquez Cao 2009-09-22 16:41 ` Fernando Luis Vazquez Cao 2009-08-27 14:06 ` [PATCH 4/4] filesystem freeze: add ISFROZEN ioctl Fernando Luis Vázquez Cao
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox