From: Alexander Aring <aahringo@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH v5.19-rc1 0/7] fs: dlm: recovery error handling
Date: Fri, 10 Jun 2022 13:06:09 -0400
Message-ID: <20220610170616.3480642-1-aahringo@redhat.com>
Hi,
I have had these patches lying around for a long time, and it is probably
time to bring them up. The series makes three changes to dlm recovery
handling:
1.
The dlm_lsop_recover_prep() callback should be called exactly once, when
the lockspace is stopped, and not again if the lockspace is already
stopped while recovery is running.
This changes the currently possible call sequence:
dlm_lsop_recover_prep()
...
dlm_lsop_recover_prep()
dlm_lsop_recover_done()
to a sequence with exactly one prep call:
dlm_lsop_recover_prep()
dlm_lsop_recover_done()
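
To illustrate the intended call pattern, here is a minimal userspace
sketch, not the kernel code. All names (toy_ls, toy_ls_stop() and so on)
are hypothetical stand-ins; the only assumption taken from the text above
is that prep should fire on the running -> stopped transition and not on
a repeated stop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy lockspace; not the kernel's struct dlm_ls. */
struct toy_ls {
	bool running;    /* models "the lockspace is running" */
	int prep_calls;  /* how often the prep callback fired */
	int done_calls;  /* how often the done callback fired */
};

/* Stand-ins for dlm_lsop_recover_prep() / dlm_lsop_recover_done(). */
void toy_recover_prep(struct toy_ls *ls) { ls->prep_calls++; }
void toy_recover_done(struct toy_ls *ls) { ls->done_calls++; }

/* Stop the lockspace. May be called again while recovery is running. */
void toy_ls_stop(struct toy_ls *ls)
{
	bool was_running = ls->running;

	ls->running = false;
	/* Only the running -> stopped transition triggers prep. */
	if (was_running)
		toy_recover_prep(ls);
}

/* Recovery finished: mark the lockspace running and report done. */
void toy_ls_finish_recovery(struct toy_ls *ls)
{
	ls->running = true;
	toy_recover_done(ls);
}
```

Calling toy_ls_stop() twice followed by toy_ls_finish_recovery() yields
one prep call and one done call in this model.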
2.
When a lockspace is created, new_lockspace() currently waits only until
the members have been successfully pinged and then returns to the caller.
However, recovery might still be running at that point. Almost all dlm
users work around this by waiting for the dlm_lsop_recover_done() callback
to know that the lockspace can be used. With this series, new_lockspace()
waits until recovery is done. This should be backwards compatible with
existing dlm users; they can drop their own handling if they want.
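
The difference in when new_lockspace() returns can be sketched with a toy,
single-threaded model. This is only an illustration under assumptions: the
phases, the sim_* names and the step function are hypothetical, and the
real code waits on kernel synchronization primitives rather than stepping
a state machine.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical phases of a simulated lockspace start, in order. */
enum sim_phase { SIM_JOINING, SIM_MEMBERS_PINGED, SIM_RECOVERY_DONE };

struct sim_lockspace {
	enum sim_phase phase;
	bool recover_done_seen; /* set by the recover_done stand-in */
};

/* Stand-in for the dlm_lsop_recover_done() callback. */
void sim_recover_done(struct sim_lockspace *ls)
{
	ls->recover_done_seen = true;
}

/* Advance the simulated recovery by one phase. */
void sim_recovery_step(struct sim_lockspace *ls)
{
	if (ls->phase == SIM_JOINING) {
		ls->phase = SIM_MEMBERS_PINGED;
	} else if (ls->phase == SIM_MEMBERS_PINGED) {
		ls->phase = SIM_RECOVERY_DONE;
		sim_recover_done(ls);
	}
}

/* Old behaviour: return as soon as the members have been pinged. */
void sim_new_lockspace_old(struct sim_lockspace *ls)
{
	ls->phase = SIM_JOINING;
	ls->recover_done_seen = false;
	while (ls->phase < SIM_MEMBERS_PINGED)
		sim_recovery_step(ls);
}

/* Proposed behaviour: return only once recovery is done. */
void sim_new_lockspace_wait(struct sim_lockspace *ls)
{
	ls->phase = SIM_JOINING;
	ls->recover_done_seen = false;
	while (ls->phase < SIM_RECOVERY_DONE)
		sim_recovery_step(ls);
}
```

In this model a caller of sim_new_lockspace_old() still has to wait for
recover_done itself, while a caller of sim_new_lockspace_wait() can use
the lockspace right away.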
3.
There are two ways recovery can be triggered. Either somebody called
new_lockspace(), in which case a waiter waits until recovery is done, or
it is a completely asynchronous process, e.g. nodes joining or leaving the
lockspace. In the async case there is no caller waiting for dlm recovery
to finish, so there is no error handling that reacts to possible recovery
errors. This patch series introduces a "best effort" approach: simply
retry/schedule() the recovery on error and hope the error gets resolved.
If it is not resolved within 5 retries, panic() will fence the node.
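
The retry policy can be sketched in userspace roughly as follows. This is
a sketch under assumptions, not the code from the patches: the toy_* names
are hypothetical, "5 retries" is read here as one initial attempt plus up
to five further attempts, and the function returns the error where the
kernel series would call panic().

```c
#include <assert.h>
#include <errno.h>

/* Retry limit taken from the cover letter text. */
#define TOY_MAX_RECOVERY_RETRIES 5

/* Hypothetical recovery attempt: returns 0 on success or -EAGAIN. */
typedef int (*toy_recover_fn)(void *ctx);

/*
 * Run recovery, retrying while it reports -EAGAIN. Returns the last
 * result; a final -EAGAIN means the retries were exhausted, which is
 * the point where the kernel series would panic() to fence the node.
 */
int toy_recover_with_retry(toy_recover_fn recover, void *ctx,
			   int *retries_used)
{
	int retries = 0;
	int error;

	for (;;) {
		error = recover(ctx);
		if (error != -EAGAIN)
			break; /* anything other than -EAGAIN ends the loop */
		if (retries == TOY_MAX_RECOVERY_RETRIES)
			break; /* give up after the last retry */
		retries++; /* kernel code would schedule() before retrying */
	}

	if (retries_used)
		*retries_used = retries;
	return error;
}

/* Example attempt: fails with -EAGAIN until the countdown hits zero. */
int toy_flaky_recover(void *ctx)
{
	int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		return -EAGAIN;
	}
	return 0;
}
```

With toy_flaky_recover() and three pending failures, the loop succeeds on
the fourth attempt; with more than five pending failures it gives up and
returns -EAGAIN.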
- Alex
Alexander Aring (7):
  fs: dlm: add notes for recovery and membership handling
  fs: dlm: call dlm_lsop_recover_prep once
  fs: dlm: let new_lockspace() wait until recovery
  fs: dlm: handle recovery result outside of ls_recover
  fs: dlm: handle recovery -EAGAIN case as retry
  fs: dlm: change -EINVAL recovery error to -EAGAIN
  fs: dlm: add WARN_ON for non waiter case
 fs/dlm/dlm_internal.h |  4 +--
 fs/dlm/lock.c         |  5 +++-
 fs/dlm/lockspace.c    |  9 ++++---
 fs/dlm/member.c       | 30 +++++++++++-----------
 fs/dlm/recoverd.c     | 60 ++++++++++++++++++++++++++++++++++++++++---
 5 files changed, 82 insertions(+), 26 deletions(-)
--
2.31.1