From: Steven Whitehouse <swhiteho@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] GFS2: Umount recovery race fix
Date: Tue, 11 Aug 2009 10:39:14 +0100
Message-ID: <1249983554.3337.6.camel@localhost.localdomain>
In-Reply-To: <20090810223159.GA28306@redhat.com>

Hi,

On Mon, 2009-08-10 at 17:31 -0500, David Teigland wrote:
> On Thu, May 14, 2009 at 02:13:17PM +0100, Steven Whitehouse wrote:
> > 
> > This patch fixes a race condition where we can receive recovery
> > requests part way through processing a umount. This was causing
> > problems since the recovery thread had already gone away.
> 
> Do you have some logs showing specifically what happened in both kernel and
> userland?
> 
> > Looking in more detail at the recovery code, it was really trying
> > to implement a slight variation on a work queue, and that happens to
> > align nicely with the recently introduced slow-work subsystem. As a
> > result I've updated the code to use slow-work, rather than its own home
> > grown variety of work queue.
> > 
> > When using the wait_on_bit() function, I noticed that the wait function
> > that was supplied as an argument was appearing in the WCHAN field, so
> > I've updated the function names in order to produce more meaningful
> > output.
> 
> That description doesn't explain how the specific bug was fixed.
> 
> I'm guessing that this is the patch that broke gfs2 recovery, although there
> are others that muck around with the sysfs control files.
> 
> This is what appears in /var/log/messages,
> 
> gfs_controld[7901]: start_journal_recovery 3 error -1
> 
> And from the daemon debug log,
> 
> 1249942342 foo start_journal_recovery jid 3
> 1249942342 foo set /sys/fs/gfs2/bull:foo/lock_module/recover to 3
> 1249942342 foo set open /sys/fs/gfs2/bull:foo/lock_module/recover error -1 13
> 1249942342 start_journal_recovery 3 error -1
> 
> Dave
> 
I think I've tracked down the issue. Let me know if the patch I just
posted doesn't fix what you are seeing,

Steve.
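
The failing line in the daemon debug log quoted above ends in "-1 13". Assuming those two numbers are the return value of open() and the resulting errno (the log itself does not label them), 13 is EACCES on Linux, which would point at a permissions problem on the sysfs recover file rather than a missing file. A minimal Python sketch for decoding such a line follows; decode_open_error is a hypothetical helper written for illustration and is not part of gfs_controld.

```python
import errno
import os
import re

# Line copied verbatim from the quoted daemon debug log.
LOG_LINE = ("1249942342 foo set open "
            "/sys/fs/gfs2/bull:foo/lock_module/recover error -1 13")


def decode_open_error(line):
    """Decode a 'set open <path> error <ret> <errno>' debug line.

    Assumption: the last two integers are the open() return value and
    errno, in that order. Returns None if the line does not match.
    """
    m = re.search(r"set open (\S+) error (-?\d+) (\d+)$", line)
    if m is None:
        return None
    path = m.group(1)
    ret = int(m.group(2))
    err = int(m.group(3))
    # errno.errorcode maps the number to its symbolic name (e.g. EACCES);
    # os.strerror gives the platform's human-readable message.
    name = errno.errorcode.get(err, "UNKNOWN")
    return path, ret, name, os.strerror(err)


if __name__ == "__main__":
    print(decode_open_error(LOG_LINE))
```

If the assumption about the field order holds, the symbolic name reported for this line is EACCES, which fits David's remark about patches that touch the sysfs control files.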
