cluster-devel.redhat.com archive mirror
From: Steven Whitehouse <swhiteho@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] GFS2: Umount recovery race fix
Date: Tue, 11 Aug 2009 10:39:14 +0100
Message-ID: <1249983554.3337.6.camel@localhost.localdomain>
In-Reply-To: <20090810223159.GA28306@redhat.com>

Hi,

On Mon, 2009-08-10 at 17:31 -0500, David Teigland wrote:
> On Thu, May 14, 2009 at 02:13:17PM +0100, Steven Whitehouse wrote:
> > 
> > This patch fixes a race condition where we can receive recovery
> > requests part way through processing a umount. This was causing
> > problems since the recovery thread had already gone away.
> 
> Do you have some logs showing specifically what happened in both kernel and
> userland?
> 
> > Looking in more detail at the recovery code, it was really trying
> > to implement a slight variation on a work queue, and that happens to
> > align nicely with the recently introduced slow-work subsystem. As a
> > result I've updated the code to use slow-work, rather than its own home
> > grown variety of work queue.
> > 
> > When using the wait_on_bit() function, I noticed that the wait function
> > that was supplied as an argument was appearing in the WCHAN field, so
> > I've updated the function names in order to produce more meaningful
> > output.
> 
> That description doesn't explain how the specific bug was fixed.
> 
> I'm guessing that this is the patch that broke gfs2 recovery, although there
> are others that muck around with the sysfs control files.
> 
> This is what appears in /var/log/messages,
> 
> gfs_controld[7901]: start_journal_recovery 3 error -1
> 
> And from the daemon debug log,
> 
> 1249942342 foo start_journal_recovery jid 3
> 1249942342 foo set /sys/fs/gfs2/bull:foo/lock_module/recover to 3
> 1249942342 foo set open /sys/fs/gfs2/bull:foo/lock_module/recover error -1 13
> 1249942342 start_journal_recovery 3 error -1
> 
> Dave
> 
I think I've tracked down the issue. Let me know if the patch I just
posted doesn't fix what you are seeing,

Steve.
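
For reference, the "error -1 13" in the quoted daemon debug log most likely means that open() returned -1 with errno 13, which on Linux is EACCES (permission denied) on the sysfs "recover" file. Below is a minimal sketch of that userland step, assuming the daemon simply opens the control file and writes the journal id as text. The helper name set_control_file is hypothetical and this is not the gfs_controld source.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from gfs_controld): write a short
 * string such as a journal id to a control file, e.g. a sysfs
 * attribute. Returns 0 on success or a negative errno value.
 */
static int set_control_file(const char *path, const char *value)
{
    assert(path != NULL && value != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -errno;              /* e.g. -EACCES, which is -13 on Linux */

    size_t len = strlen(value);
    ssize_t n = write(fd, value, len);
    int saved = errno;              /* keep errno before close() can change it */
    close(fd);

    if (n < 0)
        return -saved;
    return ((size_t)n == len) ? 0 : -EIO;   /* treat a short write as an error */
}
```

With a helper like this, a call on the recover file would return -EACCES if the attribute is not writable by the caller, which would be consistent with the "-1 13" pair in the log.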



