From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH 11/28] gfs2: Ignore dlm recovery requests if gfs2 is withdrawn
Date: Thu, 20 Feb 2020 13:53:12 -0600
Message-ID: <20200220195329.952027-12-rpeterso@redhat.com>
In-Reply-To: <20200220195329.952027-1-rpeterso@redhat.com>

When a node fails, user space informs dlm of the node failure,
and dlm instructs gfs2 on the surviving nodes to perform journal
recovery. It does this by calling various callback functions in
lock_dlm.c. To mark its progress, lock_dlm keeps generation numbers
and recover bits in a dlm "control" lock lvb, which all nodes read
to determine which journals need to be replayed.

The gfs2 instances on all nodes get the same recovery requests from
dlm, so they all try to do the recovery, but only one will be
granted the exclusive lock on the journal. The others fail
with a "Busy" message on their "try lock."

However, when a node is withdrawn, it cannot safely do any
recovery or replay any journals. To make matters worse,
gfs2 might withdraw as a result of attempting recovery. For
example, this might happen if the device goes offline, or if
an HBA fails. But today's gfs2 code doesn't check for being
withdrawn at any step in the recovery process. What's worse is
that these callbacks from dlm have no return code, so there is
no way to indicate failure back to dlm. We can send a
"Recovery failed" uevent eventually, but that tells user space
what happened, not dlm's kernel code.

Before this patch, lock_dlm would perform its recovery steps but
ignore the result, and eventually it would still update its
generation number in the lvb, even though it may have withdrawn
or encountered an error. The other nodes would then see the newer
generation number in the lvb and conclude that they don't need to
do recovery because the generation number is newer than the last
one they saw. They would assume a different node had already
recovered the journal.

This patch adds checks to several of the callbacks used by dlm
in its recovery state machine so that the callbacks are skipped
if an I/O error has occurred or if the file system is withdrawn.
That prevents the lvb bits from being updated, and therefore dlm
and user space still see the need for recovery to take place.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
---
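Illustrative note, not part of the commit: the stale-generation problem
described above can be modelled in a few lines of user-space C. This is
only a rough sketch under simplifying assumptions. Every name in it
(toy_lvb, toy_node, toy_recover_done, toy_peer_still_wants_recovery,
toy_simulate) is hypothetical and does not exist in gfs2 or dlm, and the
real lvb also carries per-journal recover bits that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and functions are hypothetical and are
 * not part of gfs2 or dlm. */
struct toy_lvb {
	unsigned int generation;	/* stands in for the lvb generation */
};

struct toy_node {
	bool withdrawn;			/* stands in for gfs2_withdrawn(sdp) */
	unsigned int last_seen_gen;	/* last generation this node observed */
};

/* Stands in for a recovery callback that ends by publishing a new
 * generation. With 'guarded' set, a withdrawn node returns early and
 * leaves the lvb untouched. */
static void toy_recover_done(const struct toy_node *node, struct toy_lvb *lvb,
			     bool guarded)
{
	assert(node != NULL && lvb != NULL);
	if (guarded && node->withdrawn)
		return;
	lvb->generation++;
}

/* A peer skips recovery when the lvb generation is newer than the
 * last one it saw. */
static bool toy_peer_still_wants_recovery(const struct toy_node *peer,
					  const struct toy_lvb *lvb)
{
	assert(peer != NULL && lvb != NULL);
	return lvb->generation <= peer->last_seen_gen;
}

/* Runs one round: a withdrawn node handles the callback, then a healthy
 * peer inspects the lvb. Returns true if the peer still sees the need
 * for recovery. */
static bool toy_simulate(bool guarded)
{
	struct toy_lvb lvb = { .generation = 7 };
	struct toy_node failed = { .withdrawn = true, .last_seen_gen = 7 };
	struct toy_node peer = { .withdrawn = false, .last_seen_gen = 7 };

	toy_recover_done(&failed, &lvb, guarded);
	return toy_peer_still_wants_recovery(&peer, &lvb);
}
```

In this model, toy_simulate(false) returns false: the withdrawn node
bumps the generation and the peer wrongly concludes that recovery was
already done. toy_simulate(true) returns true: the early return leaves
the generation alone, so the peer still sees the need for recovery.
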
 fs/gfs2/lock_dlm.c | 18 ++++++++++++++++++
 fs/gfs2/recovery.c |  5 +++++
 2 files changed, 23 insertions(+)

diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
index 7c7197343ee2..57fdf53d2246 100644
--- a/fs/gfs2/lock_dlm.c
+++ b/fs/gfs2/lock_dlm.c
@@ -1079,6 +1079,10 @@ static void gdlm_recover_prep(void *arg)
 	struct gfs2_sbd *sdp = arg;
 	struct lm_lockstruct *ls = &sdp->sd_lockstruct;
 
+	if (gfs2_withdrawn(sdp)) {
+		fs_err(sdp, "recover_prep ignored due to withdraw.\n");
+		return;
+	}
 	spin_lock(&ls->ls_recover_spin);
 	ls->ls_recover_block = ls->ls_recover_start;
 	set_bit(DFL_DLM_RECOVERY, &ls->ls_recover_flags);
@@ -1101,6 +1105,11 @@ static void gdlm_recover_slot(void *arg, struct dlm_slot *slot)
 	struct lm_lockstruct *ls = &sdp->sd_lockstruct;
 	int jid = slot->slot - 1;
 
+	if (gfs2_withdrawn(sdp)) {
+		fs_err(sdp, "recover_slot jid %d ignored due to withdraw.\n",
+		       jid);
+		return;
+	}
 	spin_lock(&ls->ls_recover_spin);
 	if (ls->ls_recover_size < jid + 1) {
 		fs_err(sdp, "recover_slot jid %d gen %u short size %d\n",
@@ -1125,6 +1134,10 @@ static void gdlm_recover_done(void *arg, struct dlm_slot *slots, int num_slots,
 	struct gfs2_sbd *sdp = arg;
 	struct lm_lockstruct *ls = &sdp->sd_lockstruct;
 
+	if (gfs2_withdrawn(sdp)) {
+		fs_err(sdp, "recover_done ignored due to withdraw.\n");
+		return;
+	}
 	/* ensure the ls jid arrays are large enough */
 	set_recover_size(sdp, slots, num_slots);
 
@@ -1152,6 +1165,11 @@ static void gdlm_recovery_result(struct gfs2_sbd *sdp, unsigned int jid,
 {
 	struct lm_lockstruct *ls = &sdp->sd_lockstruct;
 
+	if (gfs2_withdrawn(sdp)) {
+		fs_err(sdp, "recovery_result jid %d ignored due to withdraw.\n",
+		       jid);
+		return;
+	}
 	if (test_bit(DFL_NO_DLM_OPS, &ls->ls_recover_flags))
 		return;
 
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index 85f830e56945..8cc26bef4e64 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -305,6 +305,11 @@ void gfs2_recover_func(struct work_struct *work)
 	int error = 0;
 	int jlocked = 0;
 
+	if (gfs2_withdrawn(sdp)) {
+		fs_err(sdp, "jid=%u: Recovery not attempted due to withdraw.\n",
+		       jd->jd_jid);
+		goto fail;
+	}
 	t_start = ktime_get();
 	if (sdp->sd_args.ar_spectator)
 		goto fail;
--
2.24.1