From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Subject: [PATCH 03/10] xfs: clear BAD_SUMMARY if unmounting an unhealthy filesystem
Date: Mon, 01 Apr 2019 10:10:28 -0700
Message-ID: <155413862812.4966.6543791189302248422.stgit@magnolia>
In-Reply-To: <155413860964.4966.6087725033542837255.stgit@magnolia>
From: Darrick J. Wong <darrick.wong@oracle.com>
If we know the filesystem metadata isn't healthy during unmount, we want
to encourage the administrator to run xfs_repair right away. They can't
do that if the BAD_SUMMARY state makes us leave the log unclean at
unmount in order to force summary counter recalculation on the next
mount, so clear that state if the fs is unhealthy.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
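For reviewers who want to poke at the decision logic outside the kernel,
here is a minimal userspace sketch of what xfs_health_unmount() does in
this patch as posted. It is only a model under stated assumptions: the
struct model_fs type, the MODEL_* flag values and model_health_unmount()
are hypothetical names invented for illustration and are not kernel
definitions. The per-AG, realtime and per-fs sick masks are plain
integers here instead of being read from the perag and mount structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag values for this sketch; not the kernel's definitions. */
#define MODEL_HEALTH_FS_COUNTERS	(1u << 0)	/* summary counters */
#define MODEL_HEALTH_FS_OTHER		(1u << 1)	/* any other per-fs metadata */

#define MODEL_MAX_AGS	4

/* Userspace stand-in for the health state tracked per mount and per AG. */
struct model_fs {
	bool		shutdown;	/* models XFS_FORCED_SHUTDOWN(mp) */
	unsigned int	fs_sick;	/* per-fs sick mask */
	unsigned int	rt_sick;	/* realtime volume sick mask */
	unsigned int	ag_sick[MODEL_MAX_AGS];
	unsigned int	agcount;
};

/*
 * Model of the unmount-time check.  Returns true if the "please run
 * xfs_repair" warning would be printed.  When it warns, it also clears
 * the summary counter flag so that nothing stands in the way of writing
 * a clean unmount record.
 */
static bool
model_health_unmount(
	struct model_fs	*fs)
{
	unsigned int	agno;
	unsigned int	sick;
	bool		warn = false;

	assert(fs != NULL && fs->agcount <= MODEL_MAX_AGS);

	/* A shut down filesystem is left alone. */
	if (fs->shutdown)
		return false;

	/* Any sick AG is enough to warn. */
	for (agno = 0; agno < fs->agcount; agno++) {
		if (fs->ag_sick[agno])
			warn = true;
	}

	/* Same for the realtime volume. */
	if (fs->rt_sick)
		warn = true;

	/* Keep the per-fs sample around for the counter check below. */
	sick = fs->fs_sick;
	if (sick)
		warn = true;

	if (warn) {
		fprintf(stderr,
"Uncorrected metadata errors detected; please run xfs_repair.\n");
		if (sick & MODEL_HEALTH_FS_COUNTERS)
			fs->fs_sick &= ~MODEL_HEALTH_FS_COUNTERS;
	}
	return warn;
}
```

Note that in this model, as in the patch, a filesystem whose only
problem is the summary counter flag also warns and has the flag cleared,
because the per-fs mask is nonzero. Whether that is the desired
behaviour is a fair question for review.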
fs/xfs/libxfs/xfs_health.h | 2 +
fs/xfs/xfs_health.c | 59 ++++++++++++++++++++++++++++++++++++++++++++
fs/xfs/xfs_mount.c | 2 +
fs/xfs/xfs_trace.h | 3 ++
4 files changed, 66 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
index 0d51bd2689ea..269b124dc1d7 100644
--- a/fs/xfs/libxfs/xfs_health.h
+++ b/fs/xfs/libxfs/xfs_health.h
@@ -148,6 +148,8 @@ void xfs_inode_mark_sick(struct xfs_inode *ip, unsigned int mask);
void xfs_inode_mark_healthy(struct xfs_inode *ip, unsigned int mask);
unsigned int xfs_inode_measure_sickness(struct xfs_inode *ip);
+void xfs_health_unmount(struct xfs_mount *mp);
+
/* Now some helpers. */
static inline bool
diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
index e9d6859f7501..6e2da858c356 100644
--- a/fs/xfs/xfs_health.c
+++ b/fs/xfs/xfs_health.c
@@ -19,6 +19,65 @@
#include "xfs_trace.h"
#include "xfs_health.h"
+/*
+ * Warn about metadata corruption that we detected but haven't fixed, and
+ * make sure we're not sitting on anything that would get in the way of
+ * recovery.
+ */
+void
+xfs_health_unmount(
+ struct xfs_mount *mp)
+{
+ struct xfs_perag *pag;
+ xfs_agnumber_t agno;
+ unsigned int sick;
+ bool warn = false;
+
+ if (XFS_FORCED_SHUTDOWN(mp))
+ return;
+
+ /* Measure AG corruption levels. */
+ for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) {
+ pag = xfs_perag_get(mp, agno);
+ spin_lock(&pag->pag_state_lock);
+ if (pag->pag_sick) {
+ trace_xfs_ag_unfixed_corruption(mp, agno, pag->pag_sick);
+ warn = true;
+ }
+ spin_unlock(&pag->pag_state_lock);
+ xfs_perag_put(pag);
+ }
+
+ /* Measure realtime volume corruption levels. */
+ sick = xfs_rt_measure_sickness(mp);
+ if (sick) {
+ trace_xfs_rt_unfixed_corruption(mp, sick);
+ warn = true;
+ }
+
+ /* Measure fs corruption and keep the sample around for the warning. */
+ sick = xfs_fs_measure_sickness(mp);
+ if (sick) {
+ trace_xfs_fs_unfixed_corruption(mp, sick);
+ warn = true;
+ }
+
+ if (warn) {
+ xfs_warn(mp,
+"Uncorrected metadata errors detected; please run xfs_repair.");
+
+ /*
+ * If we have unhealthy metadata, we want the admin to run
+ * xfs_repair after unmounting. They can't do that if the log
+ * is written out without a clean unmount record, which is what
+ * happens when the summary counters are marked unhealthy to
+ * force their recalculation at the next mount, so clear that flag.
+ */
+ if (sick & XFS_HEALTH_FS_COUNTERS)
+ xfs_fs_mark_healthy(mp, XFS_HEALTH_FS_COUNTERS);
+ }
+}
+
/* Mark unhealthy per-fs metadata. */
void
xfs_fs_mark_sick(
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index a43ca655a431..f0f73d598a0c 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1075,6 +1075,7 @@ xfs_mountfs(
*/
cancel_delayed_work_sync(&mp->m_reclaim_work);
xfs_reclaim_inodes(mp, SYNC_WAIT);
+ xfs_health_unmount(mp);
out_log_dealloc:
mp->m_flags |= XFS_MOUNT_UNMOUNTING;
xfs_log_mount_cancel(mp);
@@ -1157,6 +1158,7 @@ xfs_unmountfs(
*/
cancel_delayed_work_sync(&mp->m_reclaim_work);
xfs_reclaim_inodes(mp, SYNC_WAIT);
+ xfs_health_unmount(mp);
xfs_qm_unmount(mp);
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index f079841c7af6..2464ea351f83 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -3461,8 +3461,10 @@ DEFINE_EVENT(xfs_fs_corrupt_class, name, \
TP_ARGS(mp, flags))
DEFINE_FS_CORRUPT_EVENT(xfs_fs_mark_sick);
DEFINE_FS_CORRUPT_EVENT(xfs_fs_mark_healthy);
+DEFINE_FS_CORRUPT_EVENT(xfs_fs_unfixed_corruption);
DEFINE_FS_CORRUPT_EVENT(xfs_rt_mark_sick);
DEFINE_FS_CORRUPT_EVENT(xfs_rt_mark_healthy);
+DEFINE_FS_CORRUPT_EVENT(xfs_rt_unfixed_corruption);
DECLARE_EVENT_CLASS(xfs_ag_corrupt_class,
TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno, unsigned int flags),
@@ -3488,6 +3490,7 @@ DEFINE_EVENT(xfs_ag_corrupt_class, name, \
TP_ARGS(mp, agno, flags))
DEFINE_AG_CORRUPT_EVENT(xfs_ag_mark_sick);
DEFINE_AG_CORRUPT_EVENT(xfs_ag_mark_healthy);
+DEFINE_AG_CORRUPT_EVENT(xfs_ag_unfixed_corruption);
DECLARE_EVENT_CLASS(xfs_inode_corrupt_class,
TP_PROTO(struct xfs_inode *ip, unsigned int flags),
Thread overview: 41+ messages
2019-04-01 17:10 [PATCH 00/10] xfs: online health tracking support Darrick J. Wong
2019-04-01 17:10 ` [PATCH 01/10] xfs: track metadata health levels Darrick J. Wong
2019-04-02 13:22 ` Brian Foster
2019-04-02 13:30 ` Darrick J. Wong
2019-04-01 17:10 ` [PATCH 02/10] xfs: replace the BAD_SUMMARY mount flag with the equivalent health code Darrick J. Wong
2019-04-02 13:22 ` Brian Foster
2019-04-01 17:10 ` Darrick J. Wong [this message]
2019-04-02 13:24 ` [PATCH 03/10] xfs: clear BAD_SUMMARY if unmounting an unhealthy filesystem Brian Foster
2019-04-02 13:40 ` Darrick J. Wong
2019-04-02 13:53 ` Brian Foster
2019-04-02 18:16 ` Darrick J. Wong
2019-04-02 18:32 ` Brian Foster
2019-04-01 17:10 ` [PATCH 04/10] xfs: expand xfs_fsop_geom Darrick J. Wong
2019-04-02 17:34 ` Brian Foster
2019-04-02 21:53 ` Dave Chinner
2019-04-02 22:31 ` Darrick J. Wong
2019-04-01 17:10 ` [PATCH 05/10] xfs: add a new ioctl to describe allocation group geometry Darrick J. Wong
2019-04-02 17:34 ` Brian Foster
2019-04-02 21:35 ` Darrick J. Wong
2019-04-01 17:10 ` [PATCH 06/10] xfs: report fs and rt health via geometry structure Darrick J. Wong
2019-04-02 17:35 ` Brian Foster
2019-04-02 18:23 ` Darrick J. Wong
2019-04-02 23:34 ` Darrick J. Wong
2019-04-01 17:10 ` [PATCH 07/10] xfs: report AG health via AG geometry ioctl Darrick J. Wong
2019-04-03 14:30 ` Brian Foster
2019-04-03 16:11 ` Darrick J. Wong
2019-04-04 11:48 ` Brian Foster
2019-04-05 20:33 ` Darrick J. Wong
2019-04-08 11:34 ` Brian Foster
2019-04-09 3:25 ` Darrick J. Wong
2019-04-01 17:11 ` [PATCH 08/10] xfs: report inode health via bulkstat Darrick J. Wong
2019-04-01 17:11 ` [PATCH 09/10] xfs: scrub/repair should update filesystem metadata health Darrick J. Wong
2019-04-04 11:50 ` Brian Foster
2019-04-04 18:01 ` Darrick J. Wong
2019-04-05 13:07 ` Brian Foster
2019-04-05 20:54 ` Darrick J. Wong
2019-04-08 11:35 ` Brian Foster
2019-04-09 3:30 ` Darrick J. Wong
2019-04-01 17:11 ` [PATCH 10/10] xfs: update health status if we get a clean bill of health Darrick J. Wong
2019-04-04 11:51 ` Brian Foster
2019-04-04 15:48 ` Darrick J. Wong