From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jan Kara Subject: Re: [PATCH 4/9] fsfreeze: emergency thaw will deadlock on s_umount Date: Mon, 8 Oct 2012 15:57:33 +0200 Message-ID: <20121008135733.GC9243@quack.suse.cz> References: <1349414653.7347.2.camel@nexus.lab.ntt.co.jp> <1349415353.7347.8.camel@nexus.lab.ntt.co.jp> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Al Viro , Josef Bacik , Eric Sandeen , Dave Chinner , Christoph Hellwig , Jan Kara , linux-fsdevel@vger.kernel.org To: Fernando Luis =?iso-8859-1?Q?V=E1zquez?= Cao Return-path: Received: from cantor2.suse.de ([195.135.220.15]:57207 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750933Ab2JHN5h (ORCPT ); Mon, 8 Oct 2012 09:57:37 -0400 Content-Disposition: inline In-Reply-To: <1349415353.7347.8.camel@nexus.lab.ntt.co.jp> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Fri 05-10-12 14:35:53, Fernando Luis V=E1zquez Cao wrote: > The emergency thaw process uses iterate_super() which holds the > sb->s_umount lock in read mode. The current thaw_super() code takes > the sb->s_umount lock in write mode, hence leading to an instant > deadlock. > =20 > Use the unlocked version of thaw_super() to do the thawing and replac= e > iterate_supers() with __iterate_supers() so that the unfreeze operati= on can ^^ iterate_supers_write() > be performed with s_umount held as the locking rules for fsfreeze ind= icate. >=20 > As a bonus, by using thaw_super(), which does not nest, instead of th= aw_bdev() > when can get rid of the ugly while loop. >=20 > Jan Kara pointed out that with this approach we will leave the block = devices > frozen, but this is a problem we have had since the introduction of t= he > superblock level API: if we thaw the filesystem using the superblock = level API > (be it through the thaw ioctl or emergency thaw) the bdev level freez= e > reference counter (bd_fsfreeze_count) will not be updated and even th= ough > subsequent calls to thaw_bdev() will decrease it it will never get ba= ck to 0 > (if thaw_super() returns an error, and it will when the superblock is= unfrozen, > thaw_bdev() will return without decreasing the counter). The solution= I propose > (and will be implementing in the followup patch "fsfreeze: freeze_sup= er and > thaw_bdev don't play well together") is letting bd_fsfreeze_count > become zero when the superblock sitting on top of it is unfrozen, so = that > future calls to freeze_bdev() actually try to freeze the superblock. >=20 > Cc: Josef Bacik > Cc: Eric Sandeen > Cc: Christoph Hellwig > Cc: Jan Kara > Cc: Dave Chinner > Signed-off-by: Fernando Luis Vazquez Cao > --- >=20 > diff -urNp linux-3.6.0-rc7-orig/fs/buffer.c linux-3.6.0-rc7/fs/buffer= =2Ec > --- linux-3.6.0-rc7-orig/fs/buffer.c 2012-09-26 13:20:14.842365056 +0= 900 > +++ linux-3.6.0-rc7/fs/buffer.c 2012-09-26 15:02:22.630595704 +0900 > @@ -513,15 +513,28 @@ repeat: > =20 > static void do_thaw_one(struct super_block *sb, void *unused) > { > - char b[BDEVNAME_SIZE]; > - while (sb->s_bdev && !thaw_bdev(sb->s_bdev, sb)) > - printk(KERN_WARNING "Emergency Thaw on %s\n", > + int res; > + > + if (sb->s_bdev) { > + char b[BDEVNAME_SIZE]; > + printk(KERN_WARNING "Emergency Thaw on %s.\n", > bdevname(sb->s_bdev, b)); > + } > + > + /* We got here from __iterate_supers with the superblock lock taken > + * so we can call the lockless version of thaw_super() safely. */ > + res =3D __thaw_super(sb); > + /* If we are going to drop the final active reference call > + * deactivate_locked_super to clean things up. In the general case > + * we avoid calling deactivate_locked_super() because it would rela= se > + * the superblock lock, which is __iterate_supers()'s job. */ > + if (!res && !atomic_add_unless(&sb->s_active, -1, 1)) > + deactivate_locked_super(sb); This just looks wrong. When we *do* end up calling deactivate_locked_super() we will return with sb unlocked which makes iterate_supers_write() unlock already unlocked lock. What I would put h= ere is: if (!res) { deactivate_locked_super(sb); /* * We have to re-acquire s_umount because * iterate_supers_write() will unlock it. It still holds * passive reference so sb cannot be freed under us. */ down_write(&sb->s_umount); } =09 Is there any problem with this I miss? Honza --=20 Jan Kara SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel= " in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html