public inbox for linux-xfs@vger.kernel.org
From: Nigel Cunningham <ncunningham@linuxmail.org>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>, David Chinner <dgc@sgi.com>,
	Andrew Morton <akpm@osdl.org>,
	LKML <linux-kernel@vger.kernel.org>,
	xfs@oss.sgi.com, Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH] Freeze bdevs when freezing processes.
Date: Mon, 30 Oct 2006 10:46:26 +1100
Message-ID: <1162165587.16177.7.camel@nigel.suspend2.net>
In-Reply-To: <200610300029.25555.rjw@sisk.pl>

Hi.

On Mon, 2006-10-30 at 00:29 +0100, Rafael J. Wysocki wrote:
> Hi,
> 
> On Sunday, 29 October 2006 18:35, Pavel Machek wrote:
> > Hi!
> > 
> > > > > > As you have them at the moment, the threads seem to be freezing fine.
> > > > > > The issue I've seen in the past related not to threads but to timer
> > > > > > based activity. Admittedly it was 2.6.14 when I last looked at it, but
> > > > > > there used to be a possibility for XFS to submit I/O from a timer when
> > > > > > the threads are frozen but the bdev isn't frozen. Has that changed?
> > > > > 
> > > > > I didn't think we've ever done that - periodic or delayed operations
> > > > > are passed off to the kernel threads to execute. A stack trace
> > > > > > (if you still have it) would really help here.
> > > > > 
> > > > > Hmmm - we have a couple of per-cpu work queues as well that are
> > > > > used on I/O completion and that can, in some circumstances,
> > > > > > trigger new transactions. If we are only flushing metadata, then
> > > > > > I don't think that any more I/O will be issued, but I could be
> > > > > wrong (maze of twisty passages).
> > > > 
> > > > Well, I think this is exactly the problem, because worker_threads run with
> > > > PF_NOFREEZE set (as I've just said in another message).
> > > 
> > > Ok, so freezing the filesystem is the only way you can prevent
> > > this as the workqueues are flushed as part of quiescing the filesystem.
> > 
> > Well, the alternative is to teach XFS to sense that we are being frozen
> > and stop disk writes in that case.
> > 
> > OTOH freeze_bdevs is perhaps not that bad a solution...
> 
> Okay, appended is a patch that implements the freezing of bdevs in a slightly
> different way than Nigel's patch did.
> 
> As Christoph suggested, I have put freeze_filesystems() and thaw_filesystems()
> into fs/buffer.c and introduced the MS_FROZEN flag to mark frozen
> filesystems.
> 
> It seems to work fine, except I get the following trace from lockdep during
> the suspend on a regular basis (not 100% reproducible, though):
> 
> Stopping tasks...
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.19-rc2-mm2 #15
> ---------------------------------------------
> s2disk/5564 is trying to acquire lock:
>  (&bdev->bd_mount_mutex){--..}, at: [<ffffffff80475e79>] mutex_lock+0x9/0x10
> 
> but task is already holding lock:
>  (&bdev->bd_mount_mutex){--..}, at: [<ffffffff80475e79>] mutex_lock+0x9/0x10
> 
> other info that might help us debug this:
> 3 locks held by s2disk/5564:
>  #0:  (&bdev->bd_mount_mutex){--..}, at: [<ffffffff80475e79>] mutex_lock+0x9/0x10
>  #1:  (&type->s_umount_key#16){----}, at: [<ffffffff80291647>] get_super+0x67/0xc0
>  #2:  (&journal->j_barrier){--..}, at: [<ffffffff80475e79>] mutex_lock+0x9/0x10
> 
> stack backtrace:
> 
> Call Trace:
>  [<ffffffff8020af79>] dump_trace+0xb9/0x430
>  [<ffffffff8020b333>] show_trace+0x43/0x60
>  [<ffffffff8020b635>] dump_stack+0x15/0x20
>  [<ffffffff8024a1d1>] __lock_acquire+0x881/0xc60
>  [<ffffffff8024a94d>] lock_acquire+0x8d/0xc0
>  [<ffffffff80475cd4>] __mutex_lock_slowpath+0xd4/0x270
>  [<ffffffff80475e79>] mutex_lock+0x9/0x10
>  [<ffffffff802b2bb6>] freeze_bdev+0x16/0x80
>  [<ffffffff802b3105>] freeze_filesystems+0x55/0x80
>  [<ffffffff80255942>] freeze_processes+0x1e2/0x360
>  [<ffffffff802592a3>] snapshot_ioctl+0x163/0x610
>  [<ffffffff8029cf0b>] do_ioctl+0x6b/0xa0
>  [<ffffffff8029d1eb>] vfs_ioctl+0x2ab/0x2d0
>  [<ffffffff8029d27a>] sys_ioctl+0x6a/0xa0
>  [<ffffffff80209c2e>] system_call+0x7e/0x83
>  [<00002afb13a4d8a9>]
> 
> done.
> Shrinking memory... done (19126 pages freed)
> 
> Greetings,
> Rafael
> 
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

Heh. I've just prepared almost exactly the same patch (except for
unwinding the process thawing).

I haven't ever seen those mutex warnings - is that due to something new
in -mm or one of those gazillion new warnings you can enable in vanilla?

Apart from that, I'll add:

Signed-off-by: Nigel Cunningham <nigel@suspend2.net>

Regards,

Nigel
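
On the lockdep question, one plausible reading of the quoted trace, offered
as a sketch and not as a confirmed diagnosis: lockdep tracks locks by class
rather than by instance. If freeze_bdev() keeps bd_mount_mutex held until
the matching thaw (an assumption here, the thread does not state it), then
freezing a second block device takes another mutex of the same class while
the first is still held, and that is reported as possible recursive locking.
The toy model below is Python, not kernel code; ToyLockdep, freeze_all and
the bdev names are hypothetical and exist only for illustration.

```python
# Toy userspace model of a lock-class check. This is NOT how the kernel's
# lockdep is implemented; it only illustrates "same class, different
# instance" producing a recursive-locking report.


class ToyLockdep:
    def __init__(self):
        self.held = []       # (lock_class, instance) pairs currently held
        self.warnings = []   # human-readable reports

    def acquire(self, lock_class, instance):
        # Report if any held lock belongs to the same class, even when it
        # is a different instance (i.e. a different block device).
        for held_class, held_instance in self.held:
            if held_class == lock_class:
                self.warnings.append(
                    "possible recursive locking: %s (%s held, %s wanted)"
                    % (lock_class, held_instance, instance)
                )
                break
        self.held.append((lock_class, instance))

    def release(self, lock_class, instance):
        self.held.remove((lock_class, instance))


def freeze_all(dep, bdevs):
    # ASSUMPTION: each freeze takes the device's mount mutex and keeps it
    # held until thaw, so the locks accumulate across the loop.
    for bdev in bdevs:
        dep.acquire("&bdev->bd_mount_mutex", bdev)


def thaw_all(dep, bdevs):
    for bdev in bdevs:
        dep.release("&bdev->bd_mount_mutex", bdev)


dep = ToyLockdep()
freeze_all(dep, ["bdev_a", "bdev_b"])
for line in dep.warnings:
    print(line)
thaw_all(dep, ["bdev_a", "bdev_b"])
```

Under that assumption the report would only show up with more than one
frozen filesystem, and it would not by itself indicate a real deadlock,
since the two mutexes belong to different devices.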
