From: Steven Whitehouse <swhiteho@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] GFS2: Umount recovery race fix
Date: Tue, 11 Aug 2009 10:39:14 +0100
Message-ID: <1249983554.3337.6.camel@localhost.localdomain>
In-Reply-To: <20090810223159.GA28306@redhat.com>

Hi,
On Mon, 2009-08-10 at 17:31 -0500, David Teigland wrote:
> On Thu, May 14, 2009 at 02:13:17PM +0100, Steven Whitehouse wrote:
> >
> > This patch fixes a race condition where we can receive recovery
> > requests part way through processing a umount. This was causing
> > problems since the recovery thread had already gone away.
>
> Do you have some logs showing specifically what happened in both kernel and
> userland?
>
> > Looking in more detail at the recovery code, it was really trying
> > to implement a slight variation on a work queue, and that happens to
> > align nicely with the recently introduced slow-work subsystem. As a
> > result I've updated the code to use slow-work, rather than its own home
> > grown variety of work queue.
> >
> > When using the wait_on_bit() function, I noticed that the wait function
> > that was supplied as an argument was appearing in the WCHAN field, so
> > I've updated the function names in order to produce more meaningful
> > output.
>
> That description doesn't explain how the specific bug was fixed.
>
> I'm guessing that this is the patch that broke gfs2 recovery, although there
> are others that muck around with the sysfs control files.
>
> This is what appears in /var/log/messages,
>
> gfs_controld[7901]: start_journal_recovery 3 error -1
>
> And from the daemon debug log,
>
> 1249942342 foo start_journal_recovery jid 3
> 1249942342 foo set /sys/fs/gfs2/bull:foo/lock_module/recover to 3
> 1249942342 foo set open /sys/fs/gfs2/bull:foo/lock_module/recover error -1 13
> 1249942342 start_journal_recovery 3 error -1
>
> Dave
>
I think I've tracked down the issue. Let me know if the patch I just
posted doesn't fix what you are seeing,
Steve.
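
For readers following the quoted debug log: the line ending in "error -1 13" appears to report a failed open() on the sysfs "recover" file, and, assuming the two numbers are the return value and errno, 13 is EACCES ("Permission denied") on Linux. The sketch below is not the gfs_controld code; it is a minimal illustration of writing a journal id to a sysfs control file and reporting the failure in a similar shape. The helper names set_sysfs_value and describe are hypothetical, and the path is the one shown in the log.

```python
import errno
import os


def set_sysfs_value(path, value):
    """Write value to path.

    Returns (0, 0) on success, or (-1, errno) if open() or write() fails,
    mirroring the "error -1 13" shape seen in the quoted log.
    """
    try:
        fd = os.open(path, os.O_WRONLY)
    except OSError as e:
        return -1, e.errno
    try:
        os.write(fd, str(value).encode())
    except OSError as e:
        return -1, e.errno
    finally:
        os.close(fd)
    return 0, 0


def describe(err):
    """Return the symbolic errno name and its message, e.g. for 13."""
    return f"{errno.errorcode.get(err, '?')}: {os.strerror(err)}"


if __name__ == "__main__":
    # Path copied from the log above; it only exists on a node with
    # that gfs2 filesystem mounted.
    path = "/sys/fs/gfs2/bull:foo/lock_module/recover"
    rc, err = set_sysfs_value(path, 3)
    if rc < 0:
        print(f"set open {path} error {rc} {err} ({describe(err)})")
```

An EACCES at open time would be consistent with the control file not being writable by the caller, which fits the suspicion in the quoted text that a sysfs-related change is involved; that reading is an inference from the log, not something the thread confirms.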