From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Subject: [PATCH 2/2] xfs: reduce free inode accounting overhead
Date: Wed, 20 May 2020 07:48:40 +1000 [thread overview]
Message-ID: <20200519214840.2570159-3-david@fromorbit.com> (raw)
In-Reply-To: <20200519214840.2570159-1-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
Shaokun Zhang reported that XFs was using substantial CPU time in
percpu_count_sum() when running a single threaded benchmark on
a high CPU count (128p) machine from xfs_mod_ifree(). The issue
is that the filesystem is empty when the benchmark runs, so inode
allocation is running with a very low inode free count.
With the percpu counter batching, this means comparisons when the
counter is less that 128 * 256 = 32768 use the slow path of adding
up all the counters across the CPUs, and this is expensive on high
CPU count machines.
The summing in xfs_mod_ifree() is only used to fire an assert if an
underrun occurs. The error is ignored by the higher level code.
Hence this is really just debug code and we don't need to run it
on production kernels, nor do we need such debug checks to return
error values just to trigger an assert.
Finally, xfs_mod_icount/xfs_mod_ifree are only called from
xfs_trans_unreserve_and_mod_sb(), so get rid of them and just
directly call the percpu_counter_add/percpu_counter_compare
functions. The compare functions are now run only on debug builds as
they are internal to ASSERT() checks and so only compiled in when
ASSERTs are active (CONFIG_XFS_DEBUG=y or CONFIG_XFS_WARN=y).
Reported-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_mount.c | 33 ---------------------------------
fs/xfs/xfs_mount.h | 2 --
fs/xfs/xfs_trans.c | 17 +++++++++++++----
3 files changed, 13 insertions(+), 39 deletions(-)
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index bb91f04266b9a..d5dcf98698600 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1189,39 +1189,6 @@ xfs_log_sbcount(xfs_mount_t *mp)
return xfs_sync_sb(mp, true);
}
-/*
- * Deltas for the inode count are +/-64, hence we use a large batch size
- * of 128 so we don't need to take the counter lock on every update.
- */
-#define XFS_ICOUNT_BATCH 128
-int
-xfs_mod_icount(
- struct xfs_mount *mp,
- int64_t delta)
-{
- percpu_counter_add_batch(&mp->m_icount, delta, XFS_ICOUNT_BATCH);
- if (__percpu_counter_compare(&mp->m_icount, 0, XFS_ICOUNT_BATCH) < 0) {
- ASSERT(0);
- percpu_counter_add(&mp->m_icount, -delta);
- return -EINVAL;
- }
- return 0;
-}
-
-int
-xfs_mod_ifree(
- struct xfs_mount *mp,
- int64_t delta)
-{
- percpu_counter_add(&mp->m_ifree, delta);
- if (percpu_counter_compare(&mp->m_ifree, 0) < 0) {
- ASSERT(0);
- percpu_counter_add(&mp->m_ifree, -delta);
- return -EINVAL;
- }
- return 0;
-}
-
/*
* Deltas for the block count can vary from 1 to very large, but lock contention
* only occurs on frequent small block count updates such as in the delayed
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index aba5a15792792..4835581f3eb00 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -392,8 +392,6 @@ extern int xfs_initialize_perag(xfs_mount_t *mp, xfs_agnumber_t agcount,
xfs_agnumber_t *maxagi);
extern void xfs_unmountfs(xfs_mount_t *);
-extern int xfs_mod_icount(struct xfs_mount *mp, int64_t delta);
-extern int xfs_mod_ifree(struct xfs_mount *mp, int64_t delta);
extern int xfs_mod_fdblocks(struct xfs_mount *mp, int64_t delta,
bool reserved);
extern int xfs_mod_frextents(struct xfs_mount *mp, int64_t delta);
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 4522ceaaf57ba..b055a5ab53465 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -545,7 +545,12 @@ xfs_trans_apply_sb_deltas(
* used block counts are not updated in the on disk superblock. In this case,
* XFS_TRANS_SB_DIRTY will not be set when the transaction is updated but we
* still need to update the incore superblock with the changes.
+ *
+ * Deltas for the inode count are +/-64, hence we use a large batch size of 128
+ * so we don't need to take the counter lock on every update.
*/
+#define XFS_ICOUNT_BATCH 128
+
void
xfs_trans_unreserve_and_mod_sb(
struct xfs_trans *tp)
@@ -585,13 +590,17 @@ xfs_trans_unreserve_and_mod_sb(
}
if (idelta) {
- error = xfs_mod_icount(mp, idelta);
- ASSERT(!error);
+ percpu_counter_add_batch(&mp->m_icount, idelta,
+ XFS_ICOUNT_BATCH);
+ if (idelta < 0)
+ ASSERT(__percpu_counter_compare(&mp->m_icount, 0,
+ XFS_ICOUNT_BATCH) >= 0);
}
if (ifreedelta) {
- error = xfs_mod_ifree(mp, ifreedelta);
- ASSERT(!error);
+ percpu_counter_add(&mp->m_ifree, ifreedelta);
+ if (ifreedelta < 0)
+ ASSERT(percpu_counter_compare(&mp->m_ifree, 0) >= 0);
}
if (rtxdelta == 0 && !(tp->t_flags & XFS_TRANS_SB_DIRTY))
--
2.26.2.761.g0e0b3e54be
next prev parent reply other threads:[~2020-05-19 21:48 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-19 21:48 [PATCH 0/2] xfs: fix unecessary percpu counter overhead Dave Chinner
2020-05-19 21:48 ` [PATCH 1/2] xfs: gut error handling in xfs_trans_unreserve_and_mod_sb() Dave Chinner
2020-05-20 6:53 ` Christoph Hellwig
2020-05-20 7:03 ` Dave Chinner
2020-05-20 7:33 ` [PATCH 1/2 V2] " Dave Chinner
2020-05-20 7:48 ` Christoph Hellwig
2020-05-20 20:27 ` Darrick J. Wong
2020-05-20 21:55 ` Dave Chinner
2020-05-20 22:28 ` Darrick J. Wong
2020-05-20 22:37 ` Dave Chinner
2020-05-19 21:48 ` Dave Chinner [this message]
2020-05-20 6:56 ` [PATCH 2/2] xfs: reduce free inode accounting overhead Christoph Hellwig
2020-05-20 20:43 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200519214840.2570159-3-david@fromorbit.com \
--to=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox