From: Josef Bacik <josef@redhat.com>
To: Josef Bacik <josef@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>,
Jeffrey Merkey <jeffmerkey@gmail.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
viro@zeniv.linux.org.uk
Subject: Re: 2.6.34 echo j > /proc/sysrq-trigger causes inifnite unfreeze/Thaw event
Date: Mon, 7 Jun 2010 17:59:25 -0400 [thread overview]
Message-ID: <20100607215925.GF2336@localhost.localdomain> (raw)
In-Reply-To: <20100607213631.GE2336@localhost.localdomain>
On Mon, Jun 07, 2010 at 05:36:31PM -0400, Josef Bacik wrote:
> On Mon, Jun 07, 2010 at 11:05:42AM +1000, Dave Chinner wrote:
> > On Thu, Jun 03, 2010 at 11:30:30PM -0600, Jeffrey Merkey wrote:
> > > causes the FS Thaw stuff in fs/buffer.c to enter an infinite loop
> > > filling the /var/log/messages with junk and causing the hard drive to
> > > crank away endlessly.
> >
> > Hmmm, looks pretty obvious what the 2.6.34 bug is:
> >
> > while (sb->s_bdev && !thaw_bdev(sb->s_bdev, sb))
> > printk(KERN_WARNING "Emergency Thaw on %s\n",
> > bdevname(sb->s_bdev, b));
> >
> > thaw_bdev() returns 0 on success or not frozen, and returns non-zero
> > only if the unfreeze failed. Looks like it was broken from the start
> > to me.
> >
> > Fixing that endless loop shows some other problems on 2.6.35,
> > though: the emergency unfreeze is not unfreezing frozen XFS
> > filesystems. This appears to be caused by
> > 18e9e5104fcd9a973ffe3eed3816c87f2a1b6cd2 ("Introduce freeze_super
> > and thaw_super for the fsfreeze ioctl").
> >
> > It appears that this introduces a significant mismatch between the
> > bdev freeze/thaw and the super freze/thaw. That is, if you freeze
> > with the sb method, you can only unfreeze via the sb method.
> > however, if you freeze via the bdev method, you can unfreeze by
> > either the bdev or sb method. This breaks the nesting of the
> > freeze/thaw operations between dm and userspace, which can lead to
> > premature thawing of the filesystem.
> >
> > Then there is this deadlock:
> >
> > iterate_supers(do_thaw_one) does:
> >
> > down_read(&sb->s_umount);
> > do_thaw_one(sb)
> > thaw_bdev(sb->s_bdev, sb))
> > thaw_super(sb)
> > down_write(&sb->s_umount);
> >
> > Which is an instant deadlock.
> >
> > These problems were hidden by the fact that the emergency thaw code
> > was not getting past the thaw_bdev guards and so not triggering
> > this deadlock.
> >
> > Al, Josef, what's the best way to fix this mess?
> >
>
> Well we can do something like the following
>
> 1) Make a __thaw_super() that just does all the work currently in thaw_super(),
> just without taking the s_umount semaphore.
> 2) Make an thaw_bdev_force or something like that that just sets
> bd_fsfreeze_count to 0 and calls __thaw_super(). The original intent was to
> make us call thaw until the thaw actually occured, so might as well just make it
> quick and painless.
> 3) Make do_thaw_one() call __thaw_super if sb->s_bdev doesn't exist. I'm not
> sure if this happens currently, but it's nice just in case.
>
> This takes care of the s_umount problem and makes sure that do_thaw_one does
> actually thaw the device. Does this sound kosher to everybody? Thanks,
>
Ok here is my half-assed 15 minute fix. I've not even compile tested it, but it
should get the general idea of what I'm talking about across. Comments?
Complaints? Flames? Thanks,
Josef
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 7346c96..086361c 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -272,6 +272,10 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
{
int error = -EINVAL;
+ if (!sb)
+ return error;
+
+ down_write(&sb->s_umount);
mutex_lock(&bdev->bd_fsfreeze_mutex);
if (!bdev->bd_fsfreeze_count)
goto out;
@@ -280,21 +284,47 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
if (--bdev->bd_fsfreeze_count > 0)
goto out;
- if (!sb)
- goto out;
-
- error = thaw_super(sb);
- if (error) {
+ error = __thaw_super(sb);
+ if (error && error != -EINVAL)
bdev->bd_fsfreeze_count++;
- mutex_unlock(&bdev->bd_fsfreeze_mutex);
- return error;
- }
+ else if (error == -EINVAL)
+ error = 0;
out:
mutex_unlock(&bdev->bd_fsfreeze_mutex);
- return 0;
+ up_write(&sb->s_umount);
+
+ return error;
}
EXPORT_SYMBOL(thaw_bdev);
+/**
+ * thaw_bdev_emergency -- unlock a filesystem regardless of its freeze count
+ * @bdev: blockdevice to unlock
+ * @sb: associated superblock
+ *
+ * Unlocks the filesystem and marks it writeable again regardless of the freeze
+ * count. Caller must down_write(&sb->s_umount).
+ */
+int thaw_bdev_emergency(struct block_device *bdev, struct super_block *sb)
+{
+ int error = -EINVAL;
+
+ if (!sb)
+ return error;
+
+ mutex_lock(&bdev->bd_fsfreeze_mutex);
+ bdev->bd_fsfreeze_count = 0;
+ error = __thaw_super(sb);
+ if (error && error != -EINVAL)
+ bdev->bd_fsfreeze_count = 1;
+ else if (error == -EINVAL)
+ error = 0;
+ mutex_unlock(&bdev->bd_fsfreeze_mutex);
+
+ return error;
+}
+EXPORT_SYMBOL(thaw_bdev_emergency);
+
static int blkdev_writepage(struct page *page, struct writeback_control *wbc)
{
return block_write_full_page(page, blkdev_get_block, wbc);
diff --git a/fs/buffer.c b/fs/buffer.c
index d54812b..62d97c6 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -564,9 +564,19 @@ repeat:
static void do_thaw_one(struct super_block *sb, void *unused)
{
char b[BDEVNAME_SIZE];
- while (sb->s_bdev && !thaw_bdev(sb->s_bdev, sb))
+
+ while (sb->s_bdev && !thaw_bdev_emergency(sb->s_bdev, sb))
printk(KERN_WARNING "Emergency Thaw on %s\n",
bdevname(sb->s_bdev, b));
+ if (!sb->s_bdev) {
+ while (1) {
+ int ret;
+
+ ret = __thaw_super(sb);
+ if (!ret || ret == -EINVAL)
+ break;
+ }
+ }
}
static void do_thaw_all(struct work_struct *work)
diff --git a/fs/super.c b/fs/super.c
index 5c35bc7..5dee40b 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -987,20 +987,18 @@ int freeze_super(struct super_block *sb)
EXPORT_SYMBOL(freeze_super);
/**
- * thaw_super -- unlock filesystem
+ * __thaw_super -- unlock filesystem
* @sb: the super to thaw
*
* Unlocks the filesystem and marks it writeable again after freeze_super().
+ * Caller must down_write(&sb->s_umount).
*/
-int thaw_super(struct super_block *sb)
+int __thaw_super(struct super_block *sb)
{
int error;
- down_write(&sb->s_umount);
- if (sb->s_frozen == SB_UNFROZEN) {
- up_write(&sb->s_umount);
+ if (sb->s_frozen == SB_UNFROZEN)
return -EINVAL;
- }
if (sb->s_flags & MS_RDONLY)
goto out;
@@ -1011,7 +1009,6 @@ int thaw_super(struct super_block *sb)
printk(KERN_ERR
"VFS:Filesystem thaw failed\n");
sb->s_frozen = SB_FREEZE_TRANS;
- up_write(&sb->s_umount);
return error;
}
}
@@ -1020,10 +1017,28 @@ out:
sb->s_frozen = SB_UNFROZEN;
smp_wmb();
wake_up(&sb->s_wait_unfrozen);
- deactivate_locked_super(sb);
+ deactivate_super(sb);
return 0;
}
+EXPORT_SYMBOL(__thaw_super);
+
+/**
+ * thaw_super -- unlock filesystem
+ * @sb: the super to thaw
+ *
+ * Unlocks the filesystem and marks it writeable again after freeze_super().
+ */
+int thaw_super(struct super_block *sb)
+{
+ int error;
+
+ down_write(&sb->s_umount);
+ error = __thaw_super(sb);
+ up_write(&sb->s_umount);
+
+ return error;
+}
EXPORT_SYMBOL(thaw_super);
static struct vfsmount *fs_set_subtype(struct vfsmount *mnt, const char *fstype)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 471e1ff..6dd6d4f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1803,6 +1803,7 @@ extern int iterate_mounts(int (*)(struct vfsmount *, void *), void *,
extern int vfs_statfs(struct dentry *, struct kstatfs *);
extern int freeze_super(struct super_block *super);
extern int thaw_super(struct super_block *super);
+extern int __thaw_super(struct super_block *super);
extern int current_umask(void);
@@ -1953,6 +1954,7 @@ extern int sync_blockdev(struct block_device *bdev);
extern struct super_block *freeze_bdev(struct block_device *);
extern void emergency_thaw_all(void);
extern int thaw_bdev(struct block_device *bdev, struct super_block *sb);
+int thaw_bdev_emergency(struct block_device *bdev, struct super_block *sb);
extern int fsync_bdev(struct block_device *);
#else
static inline void bd_forget(struct inode *inode) {}
@@ -1968,6 +1970,12 @@ static inline int thaw_bdev(struct block_device *bdev, struct super_block *sb)
{
return 0;
}
+
+static inline int thaw_bdev_emergency(struct block_device *bdev,
+ struct super_block *sb)
+{
+ return 0;
+}
#endif
extern int sync_filesystem(struct super_block *);
extern const struct file_operations def_blk_fops;
next prev parent reply other threads:[~2010-06-07 21:59 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-06-04 5:30 2.6.34 echo j > /proc/sysrq-trigger causes inifnite unfreeze/Thaw event Jeffrey Merkey
2010-06-07 1:05 ` Dave Chinner
2010-06-07 4:23 ` Eric Sandeen
2010-06-07 21:36 ` Josef Bacik
2010-06-07 21:59 ` Josef Bacik [this message]
2010-06-07 23:23 ` Dave Chinner
2010-06-08 2:07 ` Josef Bacik
2010-06-08 2:26 ` Dave Chinner
2010-06-08 12:58 ` Dave Chinner
2010-06-08 14:56 ` Josef Bacik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100607215925.GF2336@localhost.localdomain \
--to=josef@redhat.com \
--cc=david@fromorbit.com \
--cc=jeffmerkey@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.