From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Pavel Machek <pavel@ucw.cz>
Cc: David Chinner <dgc@sgi.com>,
Nigel Cunningham <ncunningham@linuxmail.org>,
Andrew Morton <akpm@osdl.org>,
LKML <linux-kernel@vger.kernel.org>,
xfs@oss.sgi.com, Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH] Freeze bdevs when freezing processes.
Date: Mon, 30 Oct 2006 00:29:24 +0100
Message-ID: <200610300029.25555.rjw@sisk.pl>
In-Reply-To: <20061029173537.GA3022@elf.ucw.cz>
Hi,
On Sunday, 29 October 2006 18:35, Pavel Machek wrote:
> Hi!
>
> > > > > As you have them at the moment, the threads seem to be freezing fine.
> > > > > The issue I've seen in the past related not to threads but to timer
> > > > > based activity. Admittedly it was 2.6.14 when I last looked at it, but
> > > > > there used to be a possibility for XFS to submit I/O from a timer when
> > > > > the threads are frozen but the bdev isn't frozen. Has that changed?
> > > >
> > > > I didn't think we've ever done that - periodic or delayed operations
> > > > are passed off to the kernel threads to execute. A stack trace
> > > > (if you still have it) would be really help here.
> > > >
> > > > Hmmm - we have a couple of per-cpu work queues as well that are
> > > > used on I/O completion and that can, in some circumstances,
> > > > trigger new transactions. If we are only flush metadata, then
> > > > I don't think that any more I/o will be issued, but I could be
> > > > wrong (maze of twisty passages).
> > >
> > > Well, I think this exactly is the problem, because worker_threads run with
> > > PF_NOFREEZE set (as I've just said in another message).
> >
> > Ok, so freezing the filesystem is the only way you can prevent
> > this as the workqueues are flushed as part of quiescing the filesystem.
>
> Well, alternative is to teach XFS to sense that we are being frozen
> and stop disk writes in such case.
>
> OTOH freeze_bdevs is perhaps not that bad solution...
Okay, appended is a patch that implements the freezing of bdevs in a slightly
different way than Nigel's patch did.
As Christoph suggested, I have put freeze_filesystems() and thaw_filesystems()
into fs/buffer.c and introduced the MS_FROZEN flag to mark frozen
filesystems.
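To illustrate the flag handling outside the kernel, here is a rough userspace
sketch. It is not part of the patch and only models the idea: struct demo_sb,
DEMO_RDONLY, DEMO_FROZEN, demo_freeze_all() and demo_thaw_all() are made-up
names, an array stands in for the super_blocks list, and the extra checks the
patch makes (s_root, SB_FREEZE_TRANS) are left out.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for MS_RDONLY and MS_FROZEN; values are illustrative. */
#define DEMO_RDONLY (1u << 0)
#define DEMO_FROZEN (1u << 21)

/* Hypothetical, heavily reduced model of a superblock. */
struct demo_sb {
	const char *name;
	unsigned int flags;
	int has_bdev;		/* models sb->s_bdev != NULL */
};

/*
 * Walk the array backwards, so that entries added later (e.g. a loopback
 * mount sitting on top of another filesystem) are frozen first. Entries
 * without a block device, read-only ones and already frozen ones are
 * skipped. Returns the number of entries frozen by this call.
 */
static int demo_freeze_all(struct demo_sb *sbs, size_t n)
{
	int frozen = 0;

	for (size_t i = n; i-- > 0; ) {
		struct demo_sb *sb = &sbs[i];

		if (!sb->has_bdev || (sb->flags & (DEMO_RDONLY | DEMO_FROZEN)))
			continue;

		/* The kernel code would call freeze_bdev(sb->s_bdev) here. */
		sb->flags |= DEMO_FROZEN;
		printf("freeze %s\n", sb->name);
		frozen++;
	}
	return frozen;
}

/*
 * Walk the array forwards and thaw only what demo_freeze_all() marked, so
 * anything frozen by somebody else is left alone. Returns the number of
 * entries thawed by this call.
 */
static int demo_thaw_all(struct demo_sb *sbs, size_t n)
{
	int thawed = 0;

	for (size_t i = 0; i < n; i++) {
		struct demo_sb *sb = &sbs[i];

		if (!(sb->flags & DEMO_FROZEN))
			continue;

		sb->flags &= ~DEMO_FROZEN;
		/* The kernel code would call thaw_bdev(sb->s_bdev, sb) here. */
		printf("thaw %s\n", sb->name);
		thawed++;
	}
	return thawed;
}
```

The point of the flag in this model is that a second freeze pass is a no-op
and the thaw pass touches only entries that the freeze pass marked.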
It seems to work fine, except that I regularly get the following trace from
lockdep during suspend (it is not 100% reproducible, though):
Stopping tasks...
=============================================
[ INFO: possible recursive locking detected ]
2.6.19-rc2-mm2 #15
---------------------------------------------
s2disk/5564 is trying to acquire lock:
(&bdev->bd_mount_mutex){--..}, at: [<ffffffff80475e79>] mutex_lock+0x9/0x10
but task is already holding lock:
(&bdev->bd_mount_mutex){--..}, at: [<ffffffff80475e79>] mutex_lock+0x9/0x10
other info that might help us debug this:
3 locks held by s2disk/5564:
#0: (&bdev->bd_mount_mutex){--..}, at: [<ffffffff80475e79>] mutex_lock+0x9/0x10
#1: (&type->s_umount_key#16){----}, at: [<ffffffff80291647>] get_super+0x67/0xc0
#2: (&journal->j_barrier){--..}, at: [<ffffffff80475e79>] mutex_lock+0x9/0x10
stack backtrace:
Call Trace:
[<ffffffff8020af79>] dump_trace+0xb9/0x430
[<ffffffff8020b333>] show_trace+0x43/0x60
[<ffffffff8020b635>] dump_stack+0x15/0x20
[<ffffffff8024a1d1>] __lock_acquire+0x881/0xc60
[<ffffffff8024a94d>] lock_acquire+0x8d/0xc0
[<ffffffff80475cd4>] __mutex_lock_slowpath+0xd4/0x270
[<ffffffff80475e79>] mutex_lock+0x9/0x10
[<ffffffff802b2bb6>] freeze_bdev+0x16/0x80
[<ffffffff802b3105>] freeze_filesystems+0x55/0x80
[<ffffffff80255942>] freeze_processes+0x1e2/0x360
[<ffffffff802592a3>] snapshot_ioctl+0x163/0x610
[<ffffffff8029cf0b>] do_ioctl+0x6b/0xa0
[<ffffffff8029d1eb>] vfs_ioctl+0x2ab/0x2d0
[<ffffffff8029d27a>] sys_ioctl+0x6a/0xa0
[<ffffffff80209c2e>] system_call+0x7e/0x83
[<00002afb13a4d8a9>]
done.
Shrinking memory... done (19126 pages freed)
Greetings,
Rafael
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
fs/buffer.c | 38 +++++++++++++++++++++++++++++++++
include/linux/buffer_head.h | 2 +
include/linux/fs.h | 1
kernel/power/process.c | 50 +++++++++++++++++++++++++++++---------------
4 files changed, 74 insertions(+), 17 deletions(-)
Index: linux-2.6.19-rc2-mm2/kernel/power/process.c
===================================================================
--- linux-2.6.19-rc2-mm2.orig/kernel/power/process.c
+++ linux-2.6.19-rc2-mm2/kernel/power/process.c
@@ -14,6 +14,7 @@
#include <linux/module.h>
#include <linux/syscalls.h>
#include <linux/freezer.h>
+#include <linux/buffer_head.h>
/*
* Timeout for stopping processes
@@ -119,7 +120,7 @@ int freeze_processes(void)
read_unlock(&tasklist_lock);
todo += nr_user;
if (!user_frozen && !nr_user) {
- sys_sync();
+ freeze_filesystems();
start_time = jiffies;
}
user_frozen = !nr_user;
@@ -156,28 +157,43 @@ int freeze_processes(void)
void thaw_some_processes(int all)
{
struct task_struct *g, *p;
- int pass = 0; /* Pass 0 = Kernel space, 1 = Userspace */
printk("Restarting tasks... ");
read_lock(&tasklist_lock);
- do {
- do_each_thread(g, p) {
- /*
- * is_user = 0 if kernel thread or borrowed mm,
- * 1 otherwise.
- */
- int is_user = !!(p->mm && !(p->flags & PF_BORROWED_MM));
- if (!freezeable(p) || (is_user != pass))
- continue;
- if (!thaw_process(p))
- printk(KERN_INFO
- "Strange, %s not stopped\n", p->comm);
- } while_each_thread(g, p);
- pass++;
- } while (pass < 2 && all);
+ do_each_thread(g, p) {
+ if (!freezeable(p))
+ continue;
+
+ /* Don't thaw userland processes, for now */
+ if (p->mm && !(p->flags & PF_BORROWED_MM))
+ continue;
+
+ if (!thaw_process(p))
+ printk(KERN_INFO " Strange, %s not stopped\n", p->comm);
+ } while_each_thread(g, p);
+
+ read_unlock(&tasklist_lock);
+ if (!all)
+ goto Exit;
+
+ thaw_filesystems();
+ read_lock(&tasklist_lock);
+
+ do_each_thread(g, p) {
+ if (!freezeable(p))
+ continue;
+
+ /* Kernel threads should have been thawed already */
+ if (!p->mm || (p->flags & PF_BORROWED_MM))
+ continue;
+
+ if (!thaw_process(p))
+ printk(KERN_INFO " Strange, %s not stopped\n", p->comm);
+ } while_each_thread(g, p);
read_unlock(&tasklist_lock);
+Exit:
schedule();
printk("done.\n");
}
Index: linux-2.6.19-rc2-mm2/include/linux/buffer_head.h
===================================================================
--- linux-2.6.19-rc2-mm2.orig/include/linux/buffer_head.h
+++ linux-2.6.19-rc2-mm2/include/linux/buffer_head.h
@@ -170,6 +170,8 @@ wait_queue_head_t *bh_waitq_head(struct
int fsync_bdev(struct block_device *);
struct super_block *freeze_bdev(struct block_device *);
void thaw_bdev(struct block_device *, struct super_block *);
+void freeze_filesystems(void);
+void thaw_filesystems(void);
int fsync_super(struct super_block *);
int fsync_no_super(struct block_device *);
struct buffer_head *__find_get_block(struct block_device *, sector_t, int);
Index: linux-2.6.19-rc2-mm2/include/linux/fs.h
===================================================================
--- linux-2.6.19-rc2-mm2.orig/include/linux/fs.h
+++ linux-2.6.19-rc2-mm2/include/linux/fs.h
@@ -120,6 +120,7 @@ extern int dir_notify_enable;
#define MS_PRIVATE (1<<18) /* change to private */
#define MS_SLAVE (1<<19) /* change to slave */
#define MS_SHARED (1<<20) /* change to shared */
+#define MS_FROZEN (1<<21) /* Frozen by freeze_filesystems() */
#define MS_ACTIVE (1<<30)
#define MS_NOUSER (1<<31)
Index: linux-2.6.19-rc2-mm2/fs/buffer.c
===================================================================
--- linux-2.6.19-rc2-mm2.orig/fs/buffer.c
+++ linux-2.6.19-rc2-mm2/fs/buffer.c
@@ -244,6 +244,44 @@ void thaw_bdev(struct block_device *bdev
}
EXPORT_SYMBOL(thaw_bdev);
+/**
+ * freeze_filesystems - lock all filesystems and force them into a consistent
+ * state
+ */
+void freeze_filesystems(void)
+{
+ struct super_block *sb;
+
+ /*
+ * Freeze in reverse order so filesystems dependent upon others are
+ * frozen in the right order (e.g. loopback on ext3).
+ */
+ list_for_each_entry_reverse(sb, &super_blocks, s_list) {
+ if (!sb->s_root || !sb->s_bdev ||
+ (sb->s_frozen == SB_FREEZE_TRANS) ||
+ (sb->s_flags & MS_RDONLY) ||
+ (sb->s_flags & MS_FROZEN))
+ continue;
+
+ freeze_bdev(sb->s_bdev);
+ sb->s_flags |= MS_FROZEN;
+ }
+}
+
+/**
+ * thaw_filesystems - unlock all filesystems
+ */
+void thaw_filesystems(void)
+{
+ struct super_block *sb;
+
+ list_for_each_entry(sb, &super_blocks, s_list)
+ if (sb->s_flags & MS_FROZEN) {
+ sb->s_flags &= ~MS_FROZEN;
+ thaw_bdev(sb->s_bdev, sb);
+ }
+}
+
/*
* Various filesystems appear to want __find_get_block to be non-blocking.
* But it's the page lock which protects the buffers. To get around this,