From: Jan Kara
Subject: Re: [PATCH 12/17] fsfreeze: sb-level/bdev-level fsfreeze integration
Date: Wed, 9 Jan 2013 17:37:37 +0100
Message-ID: <20130109163737.GE17353@quack.suse.cz>
References: <1357557492.8183.1.camel@nexus.lab.ntt.co.jp> <1357558704.8183.19.camel@nexus.lab.ntt.co.jp>
In-Reply-To: <1357558704.8183.19.camel@nexus.lab.ntt.co.jp>
To: Fernando Luis Vázquez Cao
Cc: Al Viro, Josef Bacik, Eric Sandeen, Dave Chinner, Christoph Hellwig, Jan Kara, Luiz Capitulino, linux-fsdevel@vger.kernel.org

On Mon 07-01-13 20:38:24, Fernando Luis Vázquez Cao wrote:
> As things stand now a filesystem frozen through the in-kernel bdev level API
> can be thawed using the userspace sb level API, which can lead to accidental
> corruption of filesystem snapshots and backups.
> 
> To address this problem we modify the in-kernel API so that we can tell
> fsfreeze that a kernel initiated freeze is in progress and that the filesystem
> should not be thawed no matter how many times the FITHAW ioctl is invoked.
  I'm not sure this isn't going too far in the direction of trying to
prevent the sysadmin from shooting himself in the foot. For well-written
applications where FIFREEZE and FITHAW are paired, things should work OK
after your initial fixes. And if someone issues an unpaired FITHAW, things
can break spectacularly anyway for other users of FIFREEZE. So I just
wouldn't bother with any more protections. What do you think?
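
To make the pairing argument concrete, here is a toy userspace model of a
nested freeze counter in the spirit of s_freeze_count, which the quoted patch
builds on. It is only a sketch: toy_sb, toy_freeze(), toy_thaw() and the two
scenario helpers are made-up names, not kernel or ioctl interfaces, and the
model ignores s_umount locking and the real freeze/unfreeze work.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model only; every name below is hypothetical, not a kernel API. */
struct toy_sb {
    int freeze_count;   /* number of outstanding freezes */
};

/* Models FIFREEZE: only the first freeze would really freeze the fs. */
int toy_freeze(struct toy_sb *sb)
{
    sb->freeze_count++;
    return 0;
}

/* Models FITHAW: -EINVAL if not frozen; the last thaw unfreezes the fs. */
int toy_thaw(struct toy_sb *sb)
{
    if (sb->freeze_count == 0)
        return -EINVAL;
    sb->freeze_count--;
    assert(sb->freeze_count >= 0);
    return 0;
}

bool toy_is_frozen(const struct toy_sb *sb)
{
    return sb->freeze_count > 0;
}

/*
 * Two well-behaved users, A and B: every freeze has a matching thaw.
 * Returns true if the fs stayed frozen while B still held its freeze
 * and ended up unfrozen after B's thaw.
 */
bool toy_paired_users(void)
{
    struct toy_sb sb = { 0 };
    bool frozen_for_b;

    toy_freeze(&sb);                    /* A freezes */
    toy_freeze(&sb);                    /* B freezes */
    toy_thaw(&sb);                      /* A thaws; B still holds a freeze */
    frozen_for_b = toy_is_frozen(&sb);
    toy_thaw(&sb);                      /* B thaws; counter reaches zero */
    return frozen_for_b && !toy_is_frozen(&sb);
}

/*
 * A issues one stray thaw on top of its paired one.
 * Returns true if the fs is unfrozen although B never thawed.
 */
bool toy_unpaired_thaw_breaks_b(void)
{
    struct toy_sb sb = { 0 };

    toy_freeze(&sb);    /* A freezes */
    toy_freeze(&sb);    /* B freezes */
    toy_thaw(&sb);      /* A's paired thaw */
    toy_thaw(&sb);      /* A's stray thaw consumes B's freeze */
    return !toy_is_frozen(&sb);
}
```

In this model toy_paired_users() keeps the filesystem frozen until B thaws,
while in toy_unpaired_thaw_breaks_b() a single stray thaw from A leaves B
running against an unfrozen filesystem.
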
								Honza

> Cc: linux-fsdevel@vger.kernel.org
> Cc: Josef Bacik
> Cc: Eric Sandeen
> Cc: Christoph Hellwig
> Cc: Dave Chinner
> Cc: Jan Kara
> Cc: Luiz Capitulino
> Signed-off-by: Fernando Luis Vazquez Cao
> ---
> 
> diff -urNp linux-3.8-rc1-orig/fs/block_dev.c linux-3.8-rc1/fs/block_dev.c
> --- linux-3.8-rc1-orig/fs/block_dev.c	2012-12-25 16:22:48.268018000 +0900
> +++ linux-3.8-rc1/fs/block_dev.c	2012-12-25 16:32:09.712018000 +0900
> @@ -238,7 +238,7 @@ struct super_block *freeze_bdev(struct b
>  	sb = get_active_super(bdev);
>  	if (!sb)
>  		goto out;
> -	error = freeze_super(sb);
> +	error = __freeze_super(sb, true);
>  	if (error) {
>  		deactivate_super(sb);
>  		bdev->bd_fsfreeze_count--;
> @@ -265,6 +265,7 @@ int thaw_bdev(struct block_device *bdev,
>  	int error = -EINVAL;
>  
>  	mutex_lock(&bdev->bd_fsfreeze_mutex);
> +
>  	if (!bdev->bd_fsfreeze_count)
>  		goto out;
>  
> @@ -273,20 +274,10 @@ int thaw_bdev(struct block_device *bdev,
>  		goto out;
>  	}
>  
> -	error = thaw_super(sb);
> -	/*
> -	 * If the superblock is already unfrozen, i.e. thaw_super() returned
> -	 * -EINVAL, we consider the block device level thaw successful. This
> -	 * behavior is important in a scenario where a filesystem frozen using
> -	 * freeze_bdev() is thawed through the superblock level API; if we
> -	 * caused the subsequent thaw_bdev() to fail bdev->bd_fsfreeze_count
> -	 * would not go back to 0 which means that future calls to freeze_bdev()
> -	 * would not freeze the superblock, just increase the counter.
> -	 */
> -	if (error && error != -EINVAL)
> +	error = __thaw_super(sb, true);
> +
> +	if (error)
>  		bdev->bd_fsfreeze_count++;
> -	else
> -		error = 0;
>  out:
>  	mutex_unlock(&bdev->bd_fsfreeze_mutex);
>  	return error;
> diff -urNp linux-3.8-rc1-orig/fs/namespace.c linux-3.8-rc1/fs/namespace.c
> --- linux-3.8-rc1-orig/fs/namespace.c	2012-12-25 16:31:10.780018000 +0900
> +++ linux-3.8-rc1/fs/namespace.c	2012-12-25 16:32:09.712018000 +0900
> @@ -1103,6 +1103,11 @@ static void thaw_mount(struct mount *mnt
>  	 * superblock succeeds (once it has been detached the fsfreeze
>  	 * ioctls become unusable). Thus, force-thaw sb so that all tasks
>  	 * in fsfreeze wait queue are woken up.
> +	 *
> +	 * thaw_super_force() does not actually thaw the sb if the freeze
> +	 * counter was locked (i.e. was frozen through the block device
> +	 * level API). In such a case the freeze counter is set to one
> +	 * thus guaranteeing that the sb will get thawed unlock time.
>  	 */
>  	thaw_super_force(sb); /* Drops superblock lock. */
>  }
> diff -urNp linux-3.8-rc1-orig/fs/super.c linux-3.8-rc1/fs/super.c
> --- linux-3.8-rc1-orig/fs/super.c	2012-12-25 16:31:10.780018000 +0900
> +++ linux-3.8-rc1/fs/super.c	2012-12-25 16:32:09.712018000 +0900
> @@ -1301,15 +1301,20 @@ static void sb_wait_write(struct super_b
>  }
>  
>  /**
> - * freeze_super - lock the filesystem and force it into a consistent state
> + * __freeze_super - lock the filesystem and force it into a consistent state
>   * @sb: the super to lock
> + * @lock: should we lock the freeze counter?
>   *
>   * Syncs the super to make sure the filesystem is consistent and calls the fs's
> - * freeze_fs. The reference counter (s_freeze_count) guarantees that only the
> - * last unfreeze process can unfreeze the frozen filesystem actually when
> - * multiple freeze requests arrive simultaneously. It counts up in
> - * freeze_super() and counts down in thaw_super(). When it becomes 0,
> - * thaw_super() will execute the unfreeze.
> + * freeze_fs. Freezes can nest which has two implications: the filesystem level
> + * freeze occurs during the first nested freeze, the actual filesystem thaw
> + * occurs only when the last thaw operation brings the freeze counter down to
> + * zero.
> + *
> + * If @lock is true the freeze counter is increased after a successful freeze
> + * but it cannot go back to zero (and the filesystem get actually thawed) until
> + * the the counter is unlocked using this function's thaw counterpart. The
> + * freeze counter lock does not nest.
>   *
>   * During this function, sb->s_writers.frozen goes through these values:
>   *
> @@ -1334,15 +1339,24 @@ static void sb_wait_write(struct super_b
>   * freezing. Then we transition to SB_FREEZE_COMPLETE state. This state is
>   * mostly auxiliary for filesystems to verify they do not modify frozen fs.
>   *
> - * sb->s_writers.frozen and sb->s_freeze_count are protected by sb->s_umount.
> + * sb->s_writers.frozen, sb->s_freeze_count and sb->s_freeze_locked are
> + * protected by sb->s_umount.
>   */
> -int freeze_super(struct super_block *sb)
> +int __freeze_super(struct super_block *sb, bool lock)
>  {
>  	int ret = 0;
> +	bool locked_old = sb->s_freeze_locked;
>  
>  	atomic_inc(&sb->s_active);
>  	down_write(&sb->s_umount);
>  
> +	/* The freeze counter lock does not nest. */
> +	if (sb->s_freeze_locked && lock) {
> +		ret = -EBUSY;
> +		goto out_deactivate;
> +	}
> +
> +	sb->s_freeze_locked = lock ?
true : sb->s_freeze_locked;
>  	if (++sb->s_freeze_count > 1)
>  		goto out_deactivate;
>  
> @@ -1390,6 +1404,7 @@ int freeze_super(struct super_block *sb)
>  		if (ret) {
>  			printk(KERN_ERR
>  				"VFS:Filesystem freeze failed\n");
> +			sb->s_freeze_locked = locked_old;
>  			sb->s_freeze_count--;
>  			sb->s_writers.frozen = SB_UNFROZEN;
>  			smp_wmb();
> @@ -1397,11 +1412,13 @@ int freeze_super(struct super_block *sb)
>  			goto out_deactivate;
>  		}
>  	}
> +
>  	/*
>  	 * This is just for debugging purposes so that fs can warn if it
>  	 * sees write activity when frozen is set to SB_FREEZE_COMPLETE.
>  	 */
>  	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> +
>  out_unlock:
>  	up_write(&sb->s_umount);
>  	return ret;
> @@ -1409,6 +1426,18 @@ out_deactivate:
>  	deactivate_locked_super(sb);
>  	return ret;
>  }
> +
> +/**
> + * freeze_super - lock the filesystem and force it into a consistent state
> + * @sb: the super to lock
> + *
> + * This is a wrapper around __freeze_super() which does the actual work of
> + * freezing the filesystem. fsfreeze counter lock is not requested.
> + */
> +int freeze_super(struct super_block *sb)
> +{
> +	return __freeze_super(sb, false);
> +}
>  EXPORT_SYMBOL(freeze_super);
>  
>  /**
> @@ -1449,34 +1478,56 @@ out:
>  }
>  
>  /**
> - * thaw_super - unlock filesystem
> + * __thaw_super - unlock filesystem
>   * @sb: the super to thaw
> + * @unlock: should we unlock the freeze counter?
>   *
> - * Unlocks the filesystem and marks it writeable again after freeze_super().
> + * Tries to decrease the freeze counter and when it reaches zero unlocks the
> + * filesystem and marks it writeable again. If the counter is locked it cannot
> + * go back to zero (and thus trigger the actual filesystem thaw) unless @unlock
> + * is true.
>   *
>   * Returns -EINVAL if @sb is not frozen, 0 if it succeeded or the corresponding
>   * error code otherwise. If the unfreeze fails, @sb is left in the frozen state.
>   */
> -int thaw_super(struct super_block *sb)
> +int __thaw_super(struct super_block *sb, bool unlock)
>  {
>  	int error = 0;
>  
>  	down_write(&sb->s_umount);
>  
> -	if (!sb->s_freeze_count) {
> +	/*
> +	 * An unfrozen filesystem cannot be thawed. Similarly, an unlocked
> +	 * freeze counter cannot be unlocked.
> +	 */
> +	if (!sb->s_freeze_count || (!sb->s_freeze_locked && unlock)) {
>  		error = -EINVAL;
>  		goto out_unlock;
>  	}
>  
> -	if (--sb->s_freeze_count > 0)
> +	/*
> +	 * Freezes nest so only the last call (freeze counter down to one) can
> +	 * trigger the actual filesystem thaw.
> +	 */
> +	if (sb->s_freeze_count > 1) {
> +		sb->s_freeze_count--;
> +		sb->s_freeze_locked = unlock ? false: sb->s_freeze_locked;
> +		goto out_unlock;
> +	}
> +	/* A locked filesystem cannot be thawed unless unlock was requested. */
> +	else if (sb->s_freeze_locked && !unlock) {
> +		error = -EINVAL;
>  		goto out_unlock;
> +	}
>  
>  	error = raw_thaw_super(sb, false);
>  
> -	if (error) {
> -		sb->s_freeze_count++;
> -		goto out_unlock;
> +	if (!error) {
> +		sb->s_freeze_count = 0;
> +		sb->s_freeze_locked = false;
>  	}
> +	else
> +		goto out_unlock;
>  
>  	/* Active reference released after last thaw. */
>  	deactivate_locked_super(sb);
> @@ -1486,6 +1537,19 @@ out_unlock:
>  	up_write(&sb->s_umount);
>  	return error;
>  }
> +
> +/**
> + * thaw_super - unlock filesystem
> + * @sb: the super to unlock
> + *
> + * This is a wrapper around __thaw_super() which does the actual work of
> + * thawing the filesystem. Release of the fsfreeze counter lock is not
> + * requested.
> + */
> +int thaw_super(struct super_block *sb)
> +{
> +	return __thaw_super(sb, false);
> +}
>  EXPORT_SYMBOL(thaw_super);
>  
>  /**
> @@ -1505,10 +1569,19 @@ int thaw_super_force(struct super_block
>  		up_write(&sb->s_umount);
>  		return -EINVAL;
>  	}
> +
> +	if (sb->s_freeze_locked) {
> +		/* Ensure superblock gets thawed at unlock time */
> +		sb->s_freeze_count = 1;
> +		up_write(&sb->s_umount);
> +		return -EINVAL;
> +	}
> +
>  	sb->s_freeze_count = 0;
>  	raw_thaw_super(sb, true);
>  	/* Active reference released after last thaw. */
>  	deactivate_locked_super(sb);
> +
>  	return 0;
>  }
>  
> diff -urNp linux-3.8-rc1-orig/include/linux/fs.h linux-3.8-rc1/include/linux/fs.h
> --- linux-3.8-rc1-orig/include/linux/fs.h	2012-12-25 16:31:10.784018000 +0900
> +++ linux-3.8-rc1/include/linux/fs.h	2012-12-25 16:32:09.712018000 +0900
> @@ -1323,6 +1323,9 @@ struct super_block {
>  
>  	/* Number of nested freezes */
>  	int s_freeze_count;
> +
> +	/* Is freeze state locked? */
> +	bool s_freeze_locked;
>  };
>  
>  /* superblock cache pruning functions */
> @@ -1881,7 +1884,9 @@ extern int vfs_statfs(struct path *, str
>  extern int user_statfs(const char __user *, struct kstatfs *);
>  extern int fd_statfs(int, struct kstatfs *);
>  extern int vfs_ustat(dev_t, struct kstatfs *);
> +extern int __freeze_super(struct super_block *sb, bool lock);
>  extern int freeze_super(struct super_block *super);
> +extern int __thaw_super(struct super_block *sb, bool unlock);
>  extern int thaw_super(struct super_block *super);
>  extern int thaw_super_force(struct super_block *super);
>  extern void emergency_thaw_all(void);
> 
> 
-- 
Jan Kara
SUSE Labs, CR