public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Waiman Long <Waiman.Long@hpe.com>
To: Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	Dave Chinner <dchinner@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Scott J Norton <scott.norton@hp.com>,
	linux-kernel@vger.kernel.org, Waiman Long <Waiman.Long@hpe.com>,
	xfs@oss.sgi.com, Ingo Molnar <mingo@redhat.com>,
	Douglas Hatch <doug.hatch@hp.com>
Subject: [RFC PATCH 2/2] xfs: Allow degeneration of m_fdblocks/m_ifree to global counters
Date: Fri,  4 Mar 2016 21:51:39 -0500	[thread overview]
Message-ID: <1457146299-1601-3-git-send-email-Waiman.Long@hpe.com> (raw)
In-Reply-To: <1457146299-1601-1-git-send-email-Waiman.Long@hpe.com>

Small XFS filesystems on systems with large number of CPUs can incur a
significant overhead due to excessive calls to the percpu_counter_sum()
function which needs to walk through a large number of different
cachelines.

This patch uses the newly added percpu_counter_set_limit() API to
potentially switch the m_fdblocks and m_ifree per-cpu counters to
a global counter with locks at filesystem mount time if its size
is small relatively to the number of CPUs available.

A possible use case is the use of the NVDIMM as an application scratch
storage area for log file and other small files. Current battery-backed
NVDIMMs are pretty small in size, e.g. 8G per DIMM. So we cannot create
large filesystem on top of them.

On a 4-socket 80-thread system running 4.5-rc6 kernel, this patch can
improve the throughput of the AIM7 XFS disk workload by 25%. Before
the patch, the perf profile was:

  18.68%   0.08%  reaim  [k] __percpu_counter_compare
  18.05%   9.11%  reaim  [k] __percpu_counter_sum
   0.37%   0.36%  reaim  [k] __percpu_counter_add

After the patch, the perf profile was:

   0.73%   0.36%  reaim  [k] __percpu_counter_add
   0.27%   0.27%  reaim  [k] __percpu_counter_compare

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 fs/xfs/xfs_mount.c |    1 -
 fs/xfs/xfs_mount.h |    5 +++++
 fs/xfs/xfs_super.c |    6 ++++++
 3 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index bb753b3..fe74b91 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1163,7 +1163,6 @@ xfs_mod_ifree(
  * a large batch count (1024) to minimise global counter updates except when
  * we get near to ENOSPC and we have to be very accurate with our updates.
  */
-#define XFS_FDBLOCKS_BATCH	1024
 int
 xfs_mod_fdblocks(
 	struct xfs_mount	*mp,
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index b570984..d9520f4 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -206,6 +206,11 @@ typedef struct xfs_mount {
 #define	XFS_WSYNC_WRITEIO_LOG	14	/* 16k */
 
 /*
+ * FD blocks batch size for per-cpu compare
+ */
+#define XFS_FDBLOCKS_BATCH	1024
+
+/*
  * Allow large block sizes to be reported to userspace programs if the
  * "largeio" mount option is used.
  *
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 59c9b7b..c0b4f79 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1412,6 +1412,12 @@ xfs_reinit_percpu_counters(
 	percpu_counter_set(&mp->m_icount, mp->m_sb.sb_icount);
 	percpu_counter_set(&mp->m_ifree, mp->m_sb.sb_ifree);
 	percpu_counter_set(&mp->m_fdblocks, mp->m_sb.sb_fdblocks);
+
+	/*
+	 * Use default batch size for m_ifree
+	 */
+	percpu_counter_set_limit(&mp->m_ifree, 0);
+	percpu_counter_set_limit(&mp->m_fdblocks, 4 * XFS_FDBLOCKS_BATCH);
 }
 
 static void
-- 
1.7.1

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  parent reply	other threads:[~2016-03-05  2:52 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-05  2:51 [RFC PATCH 0/2] percpu_counter: Enable switching to global counter Waiman Long
2016-03-05  2:51 ` [RFC PATCH 1/2] percpu_counter: Allow falling back to global counter on large system Waiman Long
2016-03-07 18:24   ` Christoph Lameter
     [not found]     ` <56E9B219.7090500@hpe.com>
2016-03-18  1:58       ` Christoph Lameter
2016-03-05  2:51 ` Waiman Long [this message]
2016-03-05  6:34 ` [RFC PATCH 0/2] percpu_counter: Enable switching to global counter Dave Chinner
     [not found]   ` <56DDBCEB.8060307@hpe.com>
2016-03-07 21:33     ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1457146299-1601-3-git-send-email-Waiman.Long@hpe.com \
    --to=waiman.long@hpe.com \
    --cc=cl@linux-foundation.org \
    --cc=dchinner@redhat.com \
    --cc=doug.hatch@hp.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=scott.norton@hp.com \
    --cc=tj@kernel.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox