From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH 07/32] gfs2: Ignore dlm recovery requests if gfs2 is withdrawn
Date: Wed, 13 Nov 2019 15:30:05 -0600
Message-ID: <20191113213030.237431-8-rpeterso@redhat.com>
In-Reply-To: <20191113213030.237431-1-rpeterso@redhat.com>

When a node fails, user space informs dlm of the node failure,
and dlm instructs gfs2 on the surviving nodes to perform journal
recovery. It does this by calling various callback functions in
lock_dlm.c. To mark its progress, lock_dlm keeps generation
numbers and recover bits in the lvb of a dlm "control" lock,
which all nodes read to determine which journals need to be
replayed.

The gfs2 instances on all nodes get the same recovery requests
from dlm, so they all try to do the recovery, but only one will
be granted the exclusive lock on the journal. The others fail
with a "Busy" message on their "try lock."

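As a rough user-space illustration of that "try lock" behavior
(this is not gfs2 code: journal_lock, journal_trylock and
recover_on_all_nodes are made-up names, and the nodes are
simulated one after another in a single process):

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the exclusive lock on one journal. */
struct journal_lock {
	atomic_flag held;
};

/* Non-blocking attempt: 0 if we now own the lock, -EBUSY otherwise. */
static int journal_trylock(struct journal_lock *jl)
{
	return atomic_flag_test_and_set(&jl->held) ? -EBUSY : 0;
}

static void journal_unlock(struct journal_lock *jl)
{
	atomic_flag_clear(&jl->held);
}

/*
 * Every surviving node receives the same recovery request and tries
 * the lock. Returns how many nodes actually got to replay the journal.
 */
static int recover_on_all_nodes(struct journal_lock *jl, int nr_nodes)
{
	int replayed = 0;

	assert(nr_nodes > 0);
	for (int node = 0; node < nr_nodes; node++) {
		if (journal_trylock(jl) == 0)
			replayed++;	/* this node replays the journal */
		/* else: "Busy", another node already holds the lock */
	}
	journal_unlock(jl);	/* the winner releases when it is done */
	return replayed;
}
```

With three simulated nodes, only the first try succeeds and the
other two see -EBUSY, so the function reports one replay.
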
However, when a node is withdrawn, it cannot safely do any
recovery or replay any journals. To make matters worse, gfs2
might withdraw as a result of attempting recovery, for example
if the device goes offline or an HBA fails. But today's gfs2
code doesn't check for being withdrawn at any step in the
recovery process. What's worse is that these callbacks from dlm
have no return code, so there is no way to indicate failure back
to dlm. We can send a "Recovery failed" uevent eventually, but
that tells user space what happened, not dlm's kernel code.

Before this patch, lock_dlm would perform its recovery steps but
ignore the result, and eventually it would still update its
generation number in the lvb, despite the fact that it may have
withdrawn or encountered an error. The other nodes would then
see the newer generation number in the lvb and conclude that
they don't need to do recovery because the generation number
is newer than the last one they saw. They think a different
node has already recovered the journal.

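The failure mode can be modeled in a few lines of user-space C.
This is only a sketch under simplifying assumptions: the real lvb
layout and comparison logic in lock_dlm.c are more involved, and
control_lvb, publish_generation_unchecked and
peer_thinks_recovery_needed are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the control lock lvb. */
struct control_lvb {
	uint32_t generation;	/* last generation marked as recovered */
};

/*
 * Pre-patch flow on the node that attempted recovery: the replay
 * result is ignored and the generation is published regardless.
 */
static void publish_generation_unchecked(struct control_lvb *lvb,
					 uint32_t gen, bool replay_ok)
{
	assert(lvb != NULL);
	(void)replay_ok;	/* failure or withdraw is never consulted */
	lvb->generation = gen;
}

/*
 * Decision on a peer node: a generation newer than the last one it
 * saw is taken to mean some other node already recovered the journal.
 */
static bool peer_thinks_recovery_needed(const struct control_lvb *lvb,
					uint32_t last_seen_gen)
{
	assert(lvb != NULL);
	return lvb->generation <= last_seen_gen;
}
```

If the attempting node publishes generation 5 after a failed
replay, a peer that last saw generation 4 stops believing recovery
is needed, even though nobody replayed the journal.
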
This patch adds checks to several of the callbacks used by dlm
in its recovery state machine so that they log a message and
return early if an I/O error has occurred or if the file system
is withdrawn. That prevents the lvb bits from being updated, so
dlm and user space still see the need for recovery to take place.

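The guard pattern can be sketched outside the kernel like this.
It is an illustration only: fs_state and recover_done_cb are
hypothetical names, and a plain bool stands in for the
gfs2_withdrawn() check used in the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-filesystem state. */
struct fs_state {
	bool withdrawn;			/* models gfs2_withdrawn(sdp) */
	uint32_t lvb_generation;	/* models the generation in the lvb */
};

/*
 * Callback with no return value, like the dlm recovery callbacks.
 * Since failure cannot be reported to the caller, the only safe
 * option when withdrawn is to leave the shared state untouched.
 */
static void recover_done_cb(void *arg, uint32_t generation)
{
	struct fs_state *fs = arg;

	assert(fs != NULL);
	if (fs->withdrawn) {
		fprintf(stderr, "recover_done ignored due to withdraw.\n");
		return;
	}
	fs->lvb_generation = generation;
}
```

A withdrawn instance leaves lvb_generation at its old value, so a
peer comparing generations still sees that recovery is pending.
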
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/lock_dlm.c | 18 ++++++++++++++++++
 fs/gfs2/recovery.c |  5 +++++
 2 files changed, 23 insertions(+)

diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
index 7c7197343ee2..57fdf53d2246 100644
--- a/fs/gfs2/lock_dlm.c
+++ b/fs/gfs2/lock_dlm.c
@@ -1079,6 +1079,10 @@ static void gdlm_recover_prep(void *arg)
 	struct gfs2_sbd *sdp = arg;
 	struct lm_lockstruct *ls = &sdp->sd_lockstruct;
 
+	if (gfs2_withdrawn(sdp)) {
+		fs_err(sdp, "recover_prep ignored due to withdraw.\n");
+		return;
+	}
 	spin_lock(&ls->ls_recover_spin);
 	ls->ls_recover_block = ls->ls_recover_start;
 	set_bit(DFL_DLM_RECOVERY, &ls->ls_recover_flags);
@@ -1101,6 +1105,11 @@ static void gdlm_recover_slot(void *arg, struct dlm_slot *slot)
 	struct lm_lockstruct *ls = &sdp->sd_lockstruct;
 	int jid = slot->slot - 1;
 
+	if (gfs2_withdrawn(sdp)) {
+		fs_err(sdp, "recover_slot jid %d ignored due to withdraw.\n",
+		       jid);
+		return;
+	}
 	spin_lock(&ls->ls_recover_spin);
 	if (ls->ls_recover_size < jid + 1) {
 		fs_err(sdp, "recover_slot jid %d gen %u short size %d\n",
@@ -1125,6 +1134,10 @@ static void gdlm_recover_done(void *arg, struct dlm_slot *slots, int num_slots,
 	struct gfs2_sbd *sdp = arg;
 	struct lm_lockstruct *ls = &sdp->sd_lockstruct;
 
+	if (gfs2_withdrawn(sdp)) {
+		fs_err(sdp, "recover_done ignored due to withdraw.\n");
+		return;
+	}
 	/* ensure the ls jid arrays are large enough */
 	set_recover_size(sdp, slots, num_slots);
 
@@ -1152,6 +1165,11 @@ static void gdlm_recovery_result(struct gfs2_sbd *sdp, unsigned int jid,
 {
 	struct lm_lockstruct *ls = &sdp->sd_lockstruct;
 
+	if (gfs2_withdrawn(sdp)) {
+		fs_err(sdp, "recovery_result jid %d ignored due to withdraw.\n",
+		       jid);
+		return;
+	}
 	if (test_bit(DFL_NO_DLM_OPS, &ls->ls_recover_flags))
 		return;
 
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index 85f830e56945..f1762f2daf50 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -306,6 +306,11 @@ void gfs2_recover_func(struct work_struct *work)
 	int jlocked = 0;
 
 	t_start = ktime_get();
+	if (gfs2_withdrawn(sdp)) {
+		fs_err(sdp, "jid=%u: Recovery not attempted due to withdraw.\n",
+		       jd->jd_jid);
+		goto fail;
+	}
 	if (sdp->sd_args.ar_spectator)
 		goto fail;
 	if (jd->jd_jid != sdp->sd_lockstruct.ls_jid) {
--
2.23.0