From: Zdenek Kabelac <zkabelac@redhat.com>
To: linux-lvm@redhat.com
Subject: Re: [linux-lvm] clvmd on cman waits forever holding the P_#global lock on node re-join
Date: Thu, 13 Dec 2012 11:04:40 +0100
Message-ID: <50C9A838.3080709@redhat.com>
In-Reply-To: <50C90FC1.4060803@yahoo.co.uk>
On 13.12.2012 00:14, Dmitry Panov wrote:
> Hi everyone,
>
> I've been testing clvm recently and noticed that often the operations are
> blocked when a node rejoins the cluster after being fenced or power cycled.
> I've done some investigation and found a number of issues relating to clvm.
> Here is what's happening:
>
>
> - When a node is fenced there is no "port closed" message sent to clvm which
> means the node id remains in the updown hash, although the node itself is
> removed from the nodes list after a "configuration changed" message is received.
>
> - Then, when the node rejoins, another "configuration changed" message arrives
> but because the node id is still in the hash, it is assumed that clvmd on that
> node is running even though it might not be the case yet (in my case clvmd is
> a pacemaker resource so it takes a couple of seconds before it's started).
>
> - This causes the expected_replies count to be set to a higher number than it
> should be, and as a result there are never enough replies received.
>
> - There is a problem with the handling of cmd_timeout, which appears to have
> been fixed today (what a coincidence!) by this patch:
> https://www.redhat.com/archives/lvm-devel/2012-December/msg00024.html
> I was hitting this bug because I'm using Linux Cluster Management Console,
> which polls LVM often enough that the timeout code never ran. I had fixed
> this independently, and even though my efforts are now probably wasted, I'm
> attaching a patch for your consideration. I believe it enforces the timeout
> more strictly.
>
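The sequence described in the points above can be replayed in a small model. This is only an illustrative sketch: the class and method names are hypothetical and do not correspond to clvmd's real data structures, and the prune_stale option is just one conceivable fix, not something proposed in the thread's patches.

```python
# Hypothetical model of the membership bookkeeping described above.
# None of these names are taken from the clvmd sources.

class ClusterView:
    def __init__(self):
        self.members = set()    # nodes currently in the cluster
        self.clvmd_up = set()   # nodes believed to run clvmd (the "updown hash")

    def port_opened(self, node):
        self.clvmd_up.add(node)

    def port_closed(self, node):
        self.clvmd_up.discard(node)

    def config_changed(self, members, prune_stale=False):
        self.members = set(members)
        if prune_stale:
            # Assumed fix: forget daemons on nodes that are no longer members.
            self.clvmd_up &= self.members

    def expected_replies(self):
        # Replies are expected from every member believed to run clvmd.
        return len(self.members & self.clvmd_up)


def replay(prune_stale):
    view = ClusterView()
    view.config_changed({1, 2}, prune_stale)
    view.port_opened(1)
    view.port_opened(2)
    # Node 2 is fenced: a "configuration changed" arrives, but no "port closed".
    view.config_changed({1}, prune_stale)
    # Node 2 rejoins; its clvmd has not started yet, so no "port opened" either.
    view.config_changed({1, 2}, prune_stale)
    return view.expected_replies()
```

In this model replay(False) counts the rejoined node even though its daemon is not running, while replay(True) counts only node 1.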
> Now, the questions:
>
> 1. If the problem with the stuck entry in the updown hash is fixed, it will
> cause operations to fail until clvmd is started on the re-joined node. Is there any
> particular reason for making them fail? Is it to avoid a race condition when
> newly started clvmd might not receive a message generated by an 'old' node?
>
> 2. The current expected_replies counter seems a bit flawed to me because it
> will fail if a node leaves the cluster before it sends a reply. Should it be
> handled differently? For example instead of a simple counter we could have a
> list of nodes which should be updated when a node leaves the cluster.
>
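The alternative raised in question 2 (tracking which nodes still owe a reply instead of keeping a bare counter) could look roughly like the sketch below. It is an assumption-laden illustration: PendingReplies and its methods are hypothetical names, not part of clvmd.

```python
# Hypothetical sketch: track outstanding replies per node instead of a counter.

class PendingReplies:
    def __init__(self, nodes):
        # Nodes from which a reply is still outstanding.
        self.waiting = set(nodes)

    def reply_received(self, node):
        self.waiting.discard(node)

    def node_left(self, node):
        # A node that left the cluster can no longer reply, so stop waiting for it.
        self.waiting.discard(node)

    def complete(self):
        return not self.waiting


def demo():
    pending = PendingReplies({1, 2, 3})
    pending.reply_received(1)
    pending.reply_received(3)
    before = pending.complete()   # node 2 has not replied yet
    pending.node_left(2)          # node 2 is fenced before replying
    return before, pending.complete()
```

With a plain counter the request in demo() would wait for a third reply that never comes; with the set, removing the departed node completes it.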
Hmmm, this rather looks like a logic problem, either in
the if() expression in the (select_status == 0) branch,
or the somewhat 'magical' gulm fix applied in 2005 for add_to_lvmqueue()
should be running not only when a message arrives.
Neither patch seems to fix the bug; both rather try to work around the
broken logic in the main loop. It will need some thinking.
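For illustration only, the symptom reported in the original message (frequent polling keeps the timeout code from running) can be modelled as below. This is not clvmd code: run_loop, its parameters, and the event format are hypothetical, and the sketch assumes the timeout check lives only in the branch taken when select returns 0.

```python
# Hypothetical model of a main loop whose command timeout is only checked
# when select() reports no activity (select_status == 0).

def run_loop(events, cmd_timeout, check_every_iteration):
    """events is a list of (now, select_status) pairs, one per loop iteration.

    Returns the time at which the timeout fires, or None if it never does.
    """
    started = 0  # time at which the pending command was queued
    for now, select_status in events:
        idle = select_status == 0
        if (idle or check_every_iteration) and now - started >= cmd_timeout:
            return now
    return None


# A client polls every second, so select never returns 0.
busy = [(t, 1) for t in range(1, 11)]
```

Under these assumptions run_loop(busy, 5, False) never fires, while run_loop(busy, 5, True) fires at time 5.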
Zdenek