* [Cluster-devel] [GFS2 v3 PATCH 0/3] gfs2: Misc withdraw patches (version 3)
@ 2022-08-18 18:32 Bob Peterson
2022-08-18 18:32 ` [Cluster-devel] [GFS2 v3 PATCH 1/3] gfs2: Prevent double iput for journal on error Bob Peterson
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Bob Peterson @ 2022-08-18 18:32 UTC (permalink / raw)
To: cluster-devel.redhat.com
This is version 3 which has further simplification and improvements.
This patch set fixes a few bugs in how gfs2 handles file systems after
withdraw. In an ideal world, after a file system is withdrawn, users
should be able to unmount the file system without problems. However, we
discovered three problems that prevented clean unmounts:
1. A duplicate iput of the journal after attempted recovery caused
kernel panics after withdraw.
2. After withdraw, unmount would hang for its alloted timeout period
when glocks had waiters queued that, due to the withdraw, could
never be granted.
3. Unmount would similarly hang when the withdraw prevented an outgoing
request to dlm, but so the glock was never unlocked.
Bob Peterson (3):
gfs2: Prevent double iput for journal on error
gfs2: Dequeue waiters when withdrawn
gfs2: Clear flags when withdraw prevents xmote
fs/gfs2/glock.c | 36 ++++++++++++++++++++++++++++++++++--
fs/gfs2/glock.h | 1 +
fs/gfs2/util.c | 6 ++++++
3 files changed, 41 insertions(+), 2 deletions(-)
--
2.37.2
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Cluster-devel] [GFS2 v3 PATCH 1/3] gfs2: Prevent double iput for journal on error
2022-08-18 18:32 [Cluster-devel] [GFS2 v3 PATCH 0/3] gfs2: Misc withdraw patches (version 3) Bob Peterson
@ 2022-08-18 18:32 ` Bob Peterson
2022-08-18 18:32 ` [Cluster-devel] [GFS2 v3 PATCH 2/3] gfs2: Dequeue waiters when withdrawn Bob Peterson
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Bob Peterson @ 2022-08-18 18:32 UTC (permalink / raw)
To: cluster-devel.redhat.com
When a gfs2 file system is withdrawn it does iput on its journal to
allow recovery from another cluster node. If it's unable to get a
replacement inode for whatever reason, the journal descriptor would
still be pointing at the evicted inode. So when unmount clears out the
list of journals, it would do a second iput referencing the pointer.
To avoid this, set the inode pointer to NULL.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
fs/gfs2/util.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 8241029a2a5d..95c79a3ec161 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -204,6 +204,7 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
* exception code in glock_dq.
*/
iput(inode);
+ sdp->sd_jdesc->jd_inode = NULL;
/*
* Wait until the journal inode's glock is freed. This allows try locks
* on other nodes to be successful, otherwise we remain the owner of
--
2.37.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Cluster-devel] [GFS2 v3 PATCH 2/3] gfs2: Dequeue waiters when withdrawn
2022-08-18 18:32 [Cluster-devel] [GFS2 v3 PATCH 0/3] gfs2: Misc withdraw patches (version 3) Bob Peterson
2022-08-18 18:32 ` [Cluster-devel] [GFS2 v3 PATCH 1/3] gfs2: Prevent double iput for journal on error Bob Peterson
@ 2022-08-18 18:32 ` Bob Peterson
2022-08-18 18:32 ` [Cluster-devel] [GFS2 v3 PATCH 3/3] gfs2: Clear flags when withdraw prevents xmote Bob Peterson
2022-08-25 15:59 ` [Cluster-devel] [GFS2 v3 PATCH 0/3] gfs2: Misc withdraw patches (version 3) Andreas Gruenbacher
3 siblings, 0 replies; 5+ messages in thread
From: Bob Peterson @ 2022-08-18 18:32 UTC (permalink / raw)
To: cluster-devel.redhat.com
When a withdraw occurs, ordinary (not system) glocks may not be granted
anymore. Later, when the file system is unmounted, gfs2_gl_hash_clear()
tries to clear out all the glocks, but these un-grantable pending
waiters prevent some glocks from being freed. So the unmount hangs, at
least for its ten-minute timeout period.
This patch takes measures to remove any pending waiters from
the glocks that will never be granted. This allows the unmount to
proceed in a reasonable period of time.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
fs/gfs2/glock.c | 14 ++++++++++++++
fs/gfs2/glock.h | 1 +
fs/gfs2/util.c | 5 +++++
3 files changed, 20 insertions(+)
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 0b36a16659b6..0ffe7d474cd5 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -2194,6 +2194,20 @@ static void dump_glock_func(struct gfs2_glock *gl)
dump_glock(NULL, gl, true);
}
+static void withdraw_dq(struct gfs2_glock *gl)
+{
+ spin_lock(&gl->gl_lockref.lock);
+ if (!__lockref_is_dead(&gl->gl_lockref) &&
+ glock_blocked_by_withdraw(gl))
+ do_error(gl, LM_OUT_ERROR); /* remove pending waiters */
+ spin_unlock(&gl->gl_lockref.lock);
+}
+
+void gfs2_gl_dq_holders(struct gfs2_sbd *sdp)
+{
+ glock_hash_walk(withdraw_dq, sdp);
+}
+
/**
* gfs2_gl_hash_clear - Empty out the glock hash table
* @sdp: the filesystem
diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
index 5aed8b500cf5..0199a3dcb114 100644
--- a/fs/gfs2/glock.h
+++ b/fs/gfs2/glock.h
@@ -274,6 +274,7 @@ extern void gfs2_cancel_delete_work(struct gfs2_glock *gl);
extern bool gfs2_delete_work_queued(const struct gfs2_glock *gl);
extern void gfs2_flush_delete_work(struct gfs2_sbd *sdp);
extern void gfs2_gl_hash_clear(struct gfs2_sbd *sdp);
+extern void gfs2_gl_dq_holders(struct gfs2_sbd *sdp);
extern void gfs2_glock_thaw(struct gfs2_sbd *sdp);
extern void gfs2_glock_add_to_lru(struct gfs2_glock *gl);
extern void gfs2_glock_free(struct gfs2_glock *gl);
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 95c79a3ec161..88185a341504 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -164,6 +164,11 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
}
if (!ret)
gfs2_make_fs_ro(sdp);
+ /*
+ * Dequeue any pending non-system glock holders that can no
+ * longer be granted because the file system is withdrawn.
+ */
+ gfs2_gl_dq_holders(sdp);
gfs2_freeze_unlock(&freeze_gh);
}
--
2.37.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Cluster-devel] [GFS2 v3 PATCH 3/3] gfs2: Clear flags when withdraw prevents xmote
2022-08-18 18:32 [Cluster-devel] [GFS2 v3 PATCH 0/3] gfs2: Misc withdraw patches (version 3) Bob Peterson
2022-08-18 18:32 ` [Cluster-devel] [GFS2 v3 PATCH 1/3] gfs2: Prevent double iput for journal on error Bob Peterson
2022-08-18 18:32 ` [Cluster-devel] [GFS2 v3 PATCH 2/3] gfs2: Dequeue waiters when withdrawn Bob Peterson
@ 2022-08-18 18:32 ` Bob Peterson
2022-08-25 15:59 ` [Cluster-devel] [GFS2 v3 PATCH 0/3] gfs2: Misc withdraw patches (version 3) Andreas Gruenbacher
3 siblings, 0 replies; 5+ messages in thread
From: Bob Peterson @ 2022-08-18 18:32 UTC (permalink / raw)
To: cluster-devel.redhat.com
There are a couple places in function do_xmote where normal processing
is circumvented due to withdraws in progress. However, since we bypass
most of do_xmote() we bypass telling dlm to lock the dlm lock, which
means dlm will never respond with a completion callback. Since the
completion callback ordinarily clears GLF_LOCK, this patch changes
function do_xmote to handle those situations more gracefully so the
file system may be unmounted after withdraw.
A very similar situation happens with the GLF_DEMOTE_IN_PROGRESS flag,
which is cleared by function finish_xmote(). Since the withdraw causes
us to skip the majority of do_xmote, it therefore also skips the call
to finish_xmote() so the DEMOTE_IN_PROGRESS flag needs to be cleared
manually.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
fs/gfs2/glock.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 0ffe7d474cd5..8d84094555b3 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -59,6 +59,8 @@ typedef void (*glock_examiner) (struct gfs2_glock * gl);
static void do_xmote(struct gfs2_glock *gl, struct gfs2_holder *gh, unsigned int target);
static void __gfs2_glock_dq(struct gfs2_holder *gh);
+static void handle_callback(struct gfs2_glock *gl, unsigned int state,
+ unsigned long delay, bool remote);
static struct dentry *gfs2_root;
static struct workqueue_struct *glock_workqueue;
@@ -730,7 +732,8 @@ static bool is_system_glock(struct gfs2_glock *gl)
*
*/
-static void do_xmote(struct gfs2_glock *gl, struct gfs2_holder *gh, unsigned int target)
+static void do_xmote(struct gfs2_glock *gl, struct gfs2_holder *gh,
+ unsigned int target)
__releases(&gl->gl_lockref.lock)
__acquires(&gl->gl_lockref.lock)
{
@@ -741,7 +744,8 @@ __acquires(&gl->gl_lockref.lock)
if (target != LM_ST_UNLOCKED && glock_blocked_by_withdraw(gl) &&
gh && !(gh->gh_flags & LM_FLAG_NOEXP))
- return;
+ goto skip_inval;
+
lck_flags &= (LM_FLAG_TRY | LM_FLAG_TRY_1CB | LM_FLAG_NOEXP |
LM_FLAG_PRIORITY);
GLOCK_BUG_ON(gl, gl->gl_state == target);
@@ -826,6 +830,20 @@ __acquires(&gl->gl_lockref.lock)
(target != LM_ST_UNLOCKED ||
test_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags))) {
if (!is_system_glock(gl)) {
+ handle_callback(gl, LM_ST_UNLOCKED, 0, false); /* sets demote */
+ /*
+ * Ordinarily, we would call dlm and its callback would call
+ * finish_xmote, which would call state_change() to the new state.
+ * Since we withdrew, we won't call dlm, so call state_change
+ * manually, but to the UNLOCKED state we desire.
+ */
+ state_change(gl, LM_ST_UNLOCKED);
+ /*
+ * We skip telling dlm to do the locking, so we won't get a
+ * reply that would otherwise clear GLF_LOCK. So we clear it here.
+ */
+ clear_bit(GLF_LOCK, &gl->gl_flags);
+ clear_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags);
gfs2_glock_queue_work(gl, GL_GLOCK_DFT_HOLD);
goto out;
} else {
--
2.37.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Cluster-devel] [GFS2 v3 PATCH 0/3] gfs2: Misc withdraw patches (version 3)
2022-08-18 18:32 [Cluster-devel] [GFS2 v3 PATCH 0/3] gfs2: Misc withdraw patches (version 3) Bob Peterson
` (2 preceding siblings ...)
2022-08-18 18:32 ` [Cluster-devel] [GFS2 v3 PATCH 3/3] gfs2: Clear flags when withdraw prevents xmote Bob Peterson
@ 2022-08-25 15:59 ` Andreas Gruenbacher
3 siblings, 0 replies; 5+ messages in thread
From: Andreas Gruenbacher @ 2022-08-25 15:59 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Thu, Aug 18, 2022 at 8:32 PM Bob Peterson <rpeterso@redhat.com> wrote:
> This is version 3 which has further simplification and improvements.
>
> This patch set fixes a few bugs in how gfs2 handles file systems after
> withdraw. In an ideal world, after a file system is withdrawn, users
> should be able to unmount the file system without problems. However, we
> discovered three problems that prevented clean unmounts:
>
> 1. A duplicate iput of the journal after attempted recovery caused
> kernel panics after withdraw.
> 2. After withdraw, unmount would hang for its alloted timeout period
> when glocks had waiters queued that, due to the withdraw, could
> never be granted.
> 3. Unmount would similarly hang when the withdraw prevented an outgoing
> request to dlm, but so the glock was never unlocked.
>
> Bob Peterson (3):
> gfs2: Prevent double iput for journal on error
> gfs2: Dequeue waiters when withdrawn
> gfs2: Clear flags when withdraw prevents xmote
>
> fs/gfs2/glock.c | 36 ++++++++++++++++++++++++++++++++++--
> fs/gfs2/glock.h | 1 +
> fs/gfs2/util.c | 6 ++++++
> 3 files changed, 41 insertions(+), 2 deletions(-)
>
> --
> 2.37.2
>
Pushed to for-next.
Thanks,
Andreas
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-08-25 15:59 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-08-18 18:32 [Cluster-devel] [GFS2 v3 PATCH 0/3] gfs2: Misc withdraw patches (version 3) Bob Peterson
2022-08-18 18:32 ` [Cluster-devel] [GFS2 v3 PATCH 1/3] gfs2: Prevent double iput for journal on error Bob Peterson
2022-08-18 18:32 ` [Cluster-devel] [GFS2 v3 PATCH 2/3] gfs2: Dequeue waiters when withdrawn Bob Peterson
2022-08-18 18:32 ` [Cluster-devel] [GFS2 v3 PATCH 3/3] gfs2: Clear flags when withdraw prevents xmote Bob Peterson
2022-08-25 15:59 ` [Cluster-devel] [GFS2 v3 PATCH 0/3] gfs2: Misc withdraw patches (version 3) Andreas Gruenbacher
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).