* [Cluster-devel] [GFS2 PATCH 1/4] gfs2: return errors from gfs2_ail_empty_gl
2023-04-21 19:07 [Cluster-devel] [GFS2 PATCH 0/4] Fix revoke processing at unmount and ro Bob Peterson
@ 2023-04-21 19:07 ` Bob Peterson
2023-04-24 14:07 ` Andreas Gruenbacher
2023-04-21 19:07 ` [Cluster-devel] [GFS2 PATCH 2/4] gfs2: Perform second log flush in gfs2_make_fs_ro Bob Peterson
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Bob Peterson @ 2023-04-21 19:07 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, function gfs2_ail_empty_gl did not return errors it
encountered from __gfs2_trans_begin. Those errors usually came from the
fact that the file system was made read-only, often due to unmount,
(but theoretically could be due to -o remount,ro) which prevented
the transaction from starting.
The inability to start a transaction prevented its revokes from being
properly written to the journal for glocks during unmount (and
transition to ro).
That meant glocks could be unlocked without the metadata properly
revoked in the journal. So other nodes could grab the glock thinking
that their lvb values were correct, but in fact corresponded to the
glock without its revokes properly synced. That presented as lvb
mismatch errors.
This patch allows gfs2_ail_empty_gl to return the error properly to
the caller.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
fs/gfs2/glops.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 6e33c8058059..8e245d793e6b 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -90,7 +90,7 @@ static int gfs2_ail_empty_gl(struct gfs2_glock *gl)
struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
struct gfs2_trans tr;
unsigned int revokes;
- int ret;
+ int ret = 0;
revokes = atomic_read(&gl->gl_ail_count);
@@ -124,15 +124,15 @@ static int gfs2_ail_empty_gl(struct gfs2_glock *gl)
memset(&tr, 0, sizeof(tr));
set_bit(TR_ONSTACK, &tr.tr_flags);
ret = __gfs2_trans_begin(&tr, sdp, 0, revokes, _RET_IP_);
- if (ret)
- goto flush;
- __gfs2_ail_flush(gl, 0, revokes);
- gfs2_trans_end(sdp);
+ if (!ret) {
+ __gfs2_ail_flush(gl, 0, revokes);
+ gfs2_trans_end(sdp);
+ }
flush:
gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
GFS2_LFC_AIL_EMPTY_GL);
- return 0;
+ return ret;
}
void gfs2_ail_flush(struct gfs2_glock *gl, bool fsync)
@@ -326,7 +326,9 @@ static int inode_go_sync(struct gfs2_glock *gl)
ret = gfs2_inode_metasync(gl);
if (!error)
error = ret;
- gfs2_ail_empty_gl(gl);
+ ret = gfs2_ail_empty_gl(gl);
+ if (!error)
+ error = ret;
/*
* Writeback of the data mapping may cause the dirty flag to be set
* so we have to clear it again here.
--
2.39.2
^ permalink raw reply related [flat|nested] 8+ messages in thread* [Cluster-devel] [GFS2 PATCH 1/4] gfs2: return errors from gfs2_ail_empty_gl
2023-04-21 19:07 ` [Cluster-devel] [GFS2 PATCH 1/4] gfs2: return errors from gfs2_ail_empty_gl Bob Peterson
@ 2023-04-24 14:07 ` Andreas Gruenbacher
0 siblings, 0 replies; 8+ messages in thread
From: Andreas Gruenbacher @ 2023-04-24 14:07 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Fri, Apr 21, 2023 at 9:07?PM Bob Peterson <rpeterso@redhat.com> wrote:
> Before this patch, function gfs2_ail_empty_gl did not return errors it
> encountered from __gfs2_trans_begin. Those errors usually came from the
> fact that the file system was made read-only, often due to unmount,
> (but theoretically could be due to -o remount,ro) which prevented
> the transaction from starting.
>
> The inability to start a transaction prevented its revokes from being
> properly written to the journal for glocks during unmount (and
> transition to ro).
>
> That meant glocks could be unlocked without the metadata properly
> revoked in the journal. So other nodes could grab the glock thinking
> that their lvb values were correct, but in fact corresponded to the
> glock without its revokes properly synced. That presented as lvb
> mismatch errors.
>
> This patch allows gfs2_ail_empty_gl to return the error properly to
> the caller.
Alright,
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
> fs/gfs2/glops.c | 16 +++++++++-------
> 1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
> index 6e33c8058059..8e245d793e6b 100644
> --- a/fs/gfs2/glops.c
> +++ b/fs/gfs2/glops.c
> @@ -90,7 +90,7 @@ static int gfs2_ail_empty_gl(struct gfs2_glock *gl)
> struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
> struct gfs2_trans tr;
> unsigned int revokes;
> - int ret;
> + int ret = 0;
>
> revokes = atomic_read(&gl->gl_ail_count);
>
> @@ -124,15 +124,15 @@ static int gfs2_ail_empty_gl(struct gfs2_glock *gl)
> memset(&tr, 0, sizeof(tr));
> set_bit(TR_ONSTACK, &tr.tr_flags);
> ret = __gfs2_trans_begin(&tr, sdp, 0, revokes, _RET_IP_);
> - if (ret)
> - goto flush;
> - __gfs2_ail_flush(gl, 0, revokes);
> - gfs2_trans_end(sdp);
> + if (!ret) {
> + __gfs2_ail_flush(gl, 0, revokes);
> + gfs2_trans_end(sdp);
> + }
but the above hunk doesn't help, so let me skip that.
> flush:
> gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
> GFS2_LFC_AIL_EMPTY_GL);
> - return 0;
> + return ret;
> }
>
> void gfs2_ail_flush(struct gfs2_glock *gl, bool fsync)
> @@ -326,7 +326,9 @@ static int inode_go_sync(struct gfs2_glock *gl)
> ret = gfs2_inode_metasync(gl);
> if (!error)
> error = ret;
> - gfs2_ail_empty_gl(gl);
> + ret = gfs2_ail_empty_gl(gl);
> + if (!error)
> + error = ret;
> /*
> * Writeback of the data mapping may cause the dirty flag to be set
> * so we have to clear it again here.
> --
> 2.39.2
>
Thanks,
Andreas
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Cluster-devel] [GFS2 PATCH 2/4] gfs2: Perform second log flush in gfs2_make_fs_ro
2023-04-21 19:07 [Cluster-devel] [GFS2 PATCH 0/4] Fix revoke processing at unmount and ro Bob Peterson
2023-04-21 19:07 ` [Cluster-devel] [GFS2 PATCH 1/4] gfs2: return errors from gfs2_ail_empty_gl Bob Peterson
@ 2023-04-21 19:07 ` Bob Peterson
2023-04-24 14:08 ` Andreas Gruenbacher
2023-04-21 19:07 ` [Cluster-devel] [GFS2 PATCH 3/4] gfs2: Issue message when revokes cannot be written Bob Peterson
2023-04-21 19:07 ` [Cluster-devel] [GFS2 PATCH 4/4] gfs2: gfs2_ail_empty_gl no log flush on error Bob Peterson
3 siblings, 1 reply; 8+ messages in thread
From: Bob Peterson @ 2023-04-21 19:07 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch function gfs2_make_fs_ro called gfs2_log_flush once to
finalize the log. However, if there's dirty metadata, log flushes tend
to sync the metadata and formulate revokes. Before this patch, those
revokes may not be written out to the journal immediately, which meant
unresolved glocks could still have revokes in their ail lists. When the
glock worker runs, it tries to transition the glock, but the unresolved
revokes in the ail still need to be written, so it tries to start a
transaction. It's impossible to start a transaction because at that
point the GFS2_LOG_HEAD_FLUSH_SHUTDOWN has been set by gfs2_make_fs_ro.
That cause the glock worker to get an error, and unable to write the
revokes. The calling sequence looked something like this:
gfs2_make_fs_ro
gfs2_log_flush - with GFS2_LOG_HEAD_FLUSH_SHUTDOWN flag set
if (flags & GFS2_LOG_HEAD_FLUSH_SHUTDOWN)
clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
...meanwhile...
glock_work_func
do_xmote
rgrp_go_sync (or possibly inode_go_sync)
...
gfs2_ail_empty_gl
__gfs2_trans_begin
if (unlikely(!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))) {
...
return -EROFS;
The previous patch in the series ("gfs2: return errors from
gfs2_ail_empty_gl") now causes the transaction error to no longer be
ignored, so it causes a warning from MOST of the xfstests:
WARNING: CPU: 11 PID: X at fs/gfs2/super.c:603 gfs2_put_super [gfs2]
which corresponds to:
WARN_ON(gfs2_withdrawing(sdp));
The withdraw was triggered silently from do_xmote by:
if (unlikely(sdp->sd_log_error && !gfs2_withdrawn(sdp)))
gfs2_withdraw_delayed(sdp);
This patch adds a second log_flush to gfs2_make_fs_ro: one to sync the
data and one to sync any outstanding revokes and finalize the journal.
Note that both of these log flushes need to be "special," in other
words, not GFS2_LOG_HEAD_FLUSH_NORMAL.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
fs/gfs2/super.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index a83fa62106f0..5eed8c237500 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -552,6 +552,15 @@ void gfs2_make_fs_ro(struct gfs2_sbd *sdp)
gfs2_quota_sync(sdp->sd_vfs, 0);
gfs2_statfs_sync(sdp->sd_vfs, 0);
+ /* We do two log flushes here. The first one commits dirty inodes
+ * and rgrps to the journal, but queues up revokes to the ail list.
+ * The second flush writes out and removes the revokes.
+ *
+ * The first must be done before the FLUSH_SHUTDOWN code
+ * clears the LIVE flag, otherwise it will not be able to start
+ * a transaction to write its revokes, and the error will cause
+ * a withdraw of the file system. */
+ gfs2_log_flush(sdp, NULL, GFS2_LFC_MAKE_FS_RO);
gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_SHUTDOWN |
GFS2_LFC_MAKE_FS_RO);
wait_event_timeout(sdp->sd_log_waitq,
--
2.39.2
^ permalink raw reply related [flat|nested] 8+ messages in thread* [Cluster-devel] [GFS2 PATCH 2/4] gfs2: Perform second log flush in gfs2_make_fs_ro
2023-04-21 19:07 ` [Cluster-devel] [GFS2 PATCH 2/4] gfs2: Perform second log flush in gfs2_make_fs_ro Bob Peterson
@ 2023-04-24 14:08 ` Andreas Gruenbacher
2023-04-24 16:08 ` Bob Peterson
0 siblings, 1 reply; 8+ messages in thread
From: Andreas Gruenbacher @ 2023-04-24 14:08 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Fri, Apr 21, 2023 at 9:07?PM Bob Peterson <rpeterso@redhat.com> wrote:
> Before this patch function gfs2_make_fs_ro called gfs2_log_flush once to
> finalize the log. However, if there's dirty metadata, log flushes tend
> to sync the metadata and formulate revokes. Before this patch, those
> revokes may not be written out to the journal immediately, which meant
> unresolved glocks could still have revokes in their ail lists. When the
> glock worker runs, it tries to transition the glock, but the unresolved
> revokes in the ail still need to be written, so it tries to start a
> transaction. It's impossible to start a transaction because at that
> point the GFS2_LOG_HEAD_FLUSH_SHUTDOWN has been set by gfs2_make_fs_ro.
Do you mean that at that point, the SDF_JOURNAL_LIVE flag has already
been cleared?
> That cause the glock worker to get an error, and unable to write the
> revokes. The calling sequence looked something like this:
>
> gfs2_make_fs_ro
> gfs2_log_flush - with GFS2_LOG_HEAD_FLUSH_SHUTDOWN flag set
> if (flags & GFS2_LOG_HEAD_FLUSH_SHUTDOWN)
> clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
> ...meanwhile...
> glock_work_func
> do_xmote
> rgrp_go_sync (or possibly inode_go_sync)
> ...
> gfs2_ail_empty_gl
> __gfs2_trans_begin
> if (unlikely(!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))) {
> ...
> return -EROFS;
>
> The previous patch in the series ("gfs2: return errors from
> gfs2_ail_empty_gl") now causes the transaction error to no longer be
> ignored, so it causes a warning from MOST of the xfstests:
>
> WARNING: CPU: 11 PID: X at fs/gfs2/super.c:603 gfs2_put_super [gfs2]
>
> which corresponds to:
>
> WARN_ON(gfs2_withdrawing(sdp));
>
> The withdraw was triggered silently from do_xmote by:
>
> if (unlikely(sdp->sd_log_error && !gfs2_withdrawn(sdp)))
> gfs2_withdraw_delayed(sdp);
>
> This patch adds a second log_flush to gfs2_make_fs_ro: one to sync the
> data and one to sync any outstanding revokes and finalize the journal.
> Note that both of these log flushes need to be "special," in other
> words, not GFS2_LOG_HEAD_FLUSH_NORMAL.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
> fs/gfs2/super.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index a83fa62106f0..5eed8c237500 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -552,6 +552,15 @@ void gfs2_make_fs_ro(struct gfs2_sbd *sdp)
> gfs2_quota_sync(sdp->sd_vfs, 0);
> gfs2_statfs_sync(sdp->sd_vfs, 0);
>
> + /* We do two log flushes here. The first one commits dirty inodes
> + * and rgrps to the journal, but queues up revokes to the ail list.
> + * The second flush writes out and removes the revokes.
> + *
> + * The first must be done before the FLUSH_SHUTDOWN code
> + * clears the LIVE flag, otherwise it will not be able to start
> + * a transaction to write its revokes, and the error will cause
> + * a withdraw of the file system. */
> + gfs2_log_flush(sdp, NULL, GFS2_LFC_MAKE_FS_RO);
> gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_SHUTDOWN |
> GFS2_LFC_MAKE_FS_RO);
> wait_event_timeout(sdp->sd_log_waitq,
> --
> 2.39.2
>
Thanks,
Andreas
^ permalink raw reply [flat|nested] 8+ messages in thread* [Cluster-devel] [GFS2 PATCH 2/4] gfs2: Perform second log flush in gfs2_make_fs_ro
2023-04-24 14:08 ` Andreas Gruenbacher
@ 2023-04-24 16:08 ` Bob Peterson
0 siblings, 0 replies; 8+ messages in thread
From: Bob Peterson @ 2023-04-24 16:08 UTC (permalink / raw)
To: cluster-devel.redhat.com
On 4/24/23 10:08 AM, Andreas Gruenbacher wrote:
>> point the GFS2_LOG_HEAD_FLUSH_SHUTDOWN has been set by gfs2_make_fs_ro.
>
> Do you mean that at that point, the SDF_JOURNAL_LIVE flag has already
> been cleared?
Ah, yes, you are correct. That was a think-o. Please adjust as appropriate.
Regards,
Bob Peterson
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Cluster-devel] [GFS2 PATCH 3/4] gfs2: Issue message when revokes cannot be written
2023-04-21 19:07 [Cluster-devel] [GFS2 PATCH 0/4] Fix revoke processing at unmount and ro Bob Peterson
2023-04-21 19:07 ` [Cluster-devel] [GFS2 PATCH 1/4] gfs2: return errors from gfs2_ail_empty_gl Bob Peterson
2023-04-21 19:07 ` [Cluster-devel] [GFS2 PATCH 2/4] gfs2: Perform second log flush in gfs2_make_fs_ro Bob Peterson
@ 2023-04-21 19:07 ` Bob Peterson
2023-04-21 19:07 ` [Cluster-devel] [GFS2 PATCH 4/4] gfs2: gfs2_ail_empty_gl no log flush on error Bob Peterson
3 siblings, 0 replies; 8+ messages in thread
From: Bob Peterson @ 2023-04-21 19:07 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, function gfs2_ail_empty_gl would silently return an
error to the caller. This would get silently set into sd_log_error which
would cause a withdraw, but there was no indication why the file system
was withdrawn. This patch adds a fs_err to log the appropriate error
message.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
fs/gfs2/glops.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 8e245d793e6b..4b3949f2020b 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -127,6 +127,8 @@ static int gfs2_ail_empty_gl(struct gfs2_glock *gl)
if (!ret) {
__gfs2_ail_flush(gl, 0, revokes);
gfs2_trans_end(sdp);
+ } else {
+ fs_err(sdp, "Transaction error %d: Unable to write revokes.", ret);
}
flush:
--
2.39.2
^ permalink raw reply related [flat|nested] 8+ messages in thread* [Cluster-devel] [GFS2 PATCH 4/4] gfs2: gfs2_ail_empty_gl no log flush on error
2023-04-21 19:07 [Cluster-devel] [GFS2 PATCH 0/4] Fix revoke processing at unmount and ro Bob Peterson
` (2 preceding siblings ...)
2023-04-21 19:07 ` [Cluster-devel] [GFS2 PATCH 3/4] gfs2: Issue message when revokes cannot be written Bob Peterson
@ 2023-04-21 19:07 ` Bob Peterson
3 siblings, 0 replies; 8+ messages in thread
From: Bob Peterson @ 2023-04-21 19:07 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch function gfs2_ail_empty_gl called gfs2_log_flush even
in cases where it encountered an error. It should probably skip the log
flush and leave the file system in an inconsistent state, letting a
subsequent withdraw force the journal to be replayed to reestablish
metadata consistency.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
fs/gfs2/glops.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 4b3949f2020b..ef31218060aa 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -132,8 +132,9 @@ static int gfs2_ail_empty_gl(struct gfs2_glock *gl)
}
flush:
- gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
- GFS2_LFC_AIL_EMPTY_GL);
+ if (!ret)
+ gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
+ GFS2_LFC_AIL_EMPTY_GL);
return ret;
}
--
2.39.2
^ permalink raw reply related [flat|nested] 8+ messages in thread