public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/3] Use generic percpu counters in XFS
@ 2010-11-29  0:36 Dave Chinner
  2010-11-29  0:36 ` [PATCH 1/3] lib: percpu counter add unless less than functionality Dave Chinner
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Dave Chinner @ 2010-11-29  0:36 UTC (permalink / raw)
  To: xfs; +Cc: linux-kernel, a.p.zijlstra

This series replaces the XFS per-cpu superblock counters with generic per-cpu
counters. It adds a specialised operation to the generic counter implementation
and then switches the XFS percpu superblock counters over to use the generic
percpu counter implementation. The result is several hundred lines of complex
code removed from XFS.



* [PATCH 1/3] lib: percpu counter add unless less than functionality
  2010-11-29  0:36 [PATCH 0/3] Use generic percpu counters in XFS Dave Chinner
@ 2010-11-29  0:36 ` Dave Chinner
  2010-11-29 12:08   ` Peter Zijlstra
  2010-11-29  0:36 ` [PATCH 2/3] xfs: use generic per-cpu counter infrastructure Dave Chinner
  2010-11-29  0:36 ` [PATCH 3/3] xfs: demultiplex xfs_icsb_modify_counters() Dave Chinner
  2 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2010-11-29  0:36 UTC (permalink / raw)
  To: xfs; +Cc: linux-kernel, a.p.zijlstra

From: Dave Chinner <dchinner@redhat.com>

To use the generic percpu counter infrastructure for counters that
require conditional addition based on a threshold value, we need
special handling of the counter. Further, the caller needs to know
whether the conditional addition occurred so it can determine what
action to take. Examples of this sort of usage are resource counters
that cannot go below zero (e.g. filesystem free blocks).

To allow XFS to replace its complex roll-your-own per-cpu
superblock counters, a single generic conditional function is
required: percpu_counter_add_unless_lt(). This will add the amount
to the counter unless the result would be less than the given
threshold. A caller supplied threshold is required because XFS does
not necessarily use the same threshold for every counter.

percpu_counter_add_unless_lt() attempts to minimise counter lock
acquisitions by only taking the counter lock when the threshold is
within the error range of the current counter value. Hence when the
threshold is not within the counter error range, the counter will
still have the same scalability characteristics as the normal
percpu_counter_add() function.

Adding this functionality to the generic percpu counters allows us
to remove the much more complex and less efficient XFS percpu
counter code (~700 lines of code) and replace it with generic
percpu counters.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 include/linux/percpu_counter.h |   20 ++++++++++
 lib/percpu_counter.c           |   78 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 98 insertions(+), 0 deletions(-)

diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h
index 46f6ba5..465b658f 100644
--- a/include/linux/percpu_counter.h
+++ b/include/linux/percpu_counter.h
@@ -41,6 +41,8 @@ void percpu_counter_set(struct percpu_counter *fbc, s64 amount);
 void __percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32 batch);
 s64 __percpu_counter_sum(struct percpu_counter *fbc);
 int percpu_counter_compare(struct percpu_counter *fbc, s64 rhs);
+int percpu_counter_add_unless_lt(struct percpu_counter *fbc, s64 amount,
+							s64 threshold);
 
 static inline void percpu_counter_add(struct percpu_counter *fbc, s64 amount)
 {
@@ -153,6 +155,24 @@ static inline int percpu_counter_initialized(struct percpu_counter *fbc)
 	return 1;
 }
 
+static inline int percpu_counter_add_unless_lt(struct percpu_counter *fbc, s64 amount,
+							s64 threshold)
+{
+	s64 count;
+	int ret = -1;
+
+	preempt_disable();
+	count = fbc->count + amount;
+	if (count < threshold)
+		goto out;
+	fbc->count = count;
+	ret = count == threshold ? 0 : 1;
+out:
+	preempt_enable();
+	return ret;
+}
+
 #endif	/* CONFIG_SMP */
 
 static inline void percpu_counter_inc(struct percpu_counter *fbc)
diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
index 604678d..ba625b7 100644
--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -213,6 +213,84 @@ int percpu_counter_compare(struct percpu_counter *fbc, s64 rhs)
 }
 EXPORT_SYMBOL(percpu_counter_compare);
 
+/**
+ * percpu_counter_add_unless_lt - add to a counter avoiding underruns
+ * @fbc:	counter
+ * @amount:	amount to add
+ * @threshold:	underrun threshold
+ *
+ * Add @amount to @fbc if and only if the result of the addition is
+ * greater than or equal to @threshold.  Return 1 if greater and added,
+ * 0 if equal and added, and -1 if an underrun would have occurred.
+ *
+ * This is useful for operations that must accurately and atomically add a
+ * delta to a counter only if the result remains at or above a given
+ * threshold (e.g. for free space accounting with ENOSPC checking in
+ * filesystems).
+ */
+int percpu_counter_add_unless_lt(struct percpu_counter *fbc, s64 amount,
+						s64 threshold)
+{
+	s64	count;
+	s64	error = 2 * percpu_counter_batch * num_online_cpus();
+	int	cpu;
+	int	ret = -1;
+
+	preempt_disable();
+
+	/* Check to see if rough count will be sufficient for comparison */
+	count = percpu_counter_read(fbc);
+	if (count + amount < threshold - error)
+		goto out;
+
+	/*
+	 * If the counter is over the threshold and the change is less than the
+	 * batch size, we might be able to avoid locking.
+	 */
+	if (count > threshold + error && abs(amount) < percpu_counter_batch) {
+		__percpu_counter_add(fbc, amount, percpu_counter_batch);
+		ret = 1;
+		goto out;
+	}
+
+	/*
+	 * If the result is over the error threshold, we can just add it
+	 * into the global counter ignoring what is in the per-cpu counters
+	 * as they will not change the result of the calculation.
+	 */
+	spin_lock(&fbc->lock);
+	if (fbc->count + amount > threshold + error) {
+		fbc->count += amount;
+		ret = 1;
+		goto out_unlock;
+	}
+
+	/*
+	 * Result is within the error margin. Run an open-coded sum of the
+	 * per-cpu counters to get the exact value at this point in time,
+	 * and if the result would be above the threshold, add the amount to
+	 * the global counter.
+	 */
+	count = fbc->count;
+	for_each_online_cpu(cpu) {
+		s32 *pcount = per_cpu_ptr(fbc->counters, cpu);
+		count += *pcount;
+	}
+	WARN_ON(count < threshold);
+
+	if (count + amount >= threshold) {
+		ret = 0;
+		if (count + amount > threshold)
+			ret = 1;
+		fbc->count += amount;
+	}
+out_unlock:
+	spin_unlock(&fbc->lock);
+out:
+	preempt_enable();
+	return ret;
+}
+EXPORT_SYMBOL(percpu_counter_add_unless_lt);
+
 static int __init percpu_counter_startup(void)
 {
 	compute_batch_value();
-- 
1.7.2.3



* [PATCH 2/3] xfs: use generic per-cpu counter infrastructure
  2010-11-29  0:36 [PATCH 0/3] Use generic percpu counters in XFS Dave Chinner
  2010-11-29  0:36 ` [PATCH 1/3] lib: percpu counter add unless less than functionality Dave Chinner
@ 2010-11-29  0:36 ` Dave Chinner
  2010-11-29  9:16   ` Christoph Hellwig
  2010-11-29  0:36 ` [PATCH 3/3] xfs: demultiplex xfs_icsb_modify_counters() Dave Chinner
  2 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2010-11-29  0:36 UTC (permalink / raw)
  To: xfs; +Cc: linux-kernel, a.p.zijlstra

From: Dave Chinner <dchinner@redhat.com>

XFS has a per-cpu counter implementation for in-core superblock
counters that pre-dated the generic implementation. It is complex
and baroque as it is tailored directly to the needs of ENOSPC
detection.

Now that the generic percpu counter infrastructure has the
percpu_counter_add_unless_lt() function that implements the
necessary threshold checks for us, switch the XFS per-cpu
superblock counters to use the generic percpu counter
infrastructure.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/linux-2.6/xfs_linux.h |    9 -
 fs/xfs/linux-2.6/xfs_super.c |    4 +-
 fs/xfs/xfs_fsops.c           |    4 +-
 fs/xfs/xfs_mount.c           |  842 ++++++++++--------------------------------
 fs/xfs/xfs_mount.h           |   71 +---
 5 files changed, 211 insertions(+), 719 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_linux.h b/fs/xfs/linux-2.6/xfs_linux.h
index 214ddd7..9fa4f2a 100644
--- a/fs/xfs/linux-2.6/xfs_linux.h
+++ b/fs/xfs/linux-2.6/xfs_linux.h
@@ -88,15 +88,6 @@
 #include <xfs_super.h>
 #include <xfs_buf.h>
 
-/*
- * Feature macros (disable/enable)
- */
-#ifdef CONFIG_SMP
-#define HAVE_PERCPU_SB	/* per cpu superblock counters are a 2.6 feature */
-#else
-#undef  HAVE_PERCPU_SB	/* per cpu superblock counters are a 2.6 feature */
-#endif
-
 #define irix_sgid_inherit	xfs_params.sgid_inherit.val
 #define irix_symlink_mode	xfs_params.symlink_mode.val
 #define xfs_panic_mask		xfs_params.panic_mask.val
diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 6f02771..6cbebb2 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -1227,9 +1227,9 @@ xfs_fs_statfs(
 	statp->f_fsid.val[0] = (u32)id;
 	statp->f_fsid.val[1] = (u32)(id >> 32);
 
-	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
-
 	spin_lock(&mp->m_sb_lock);
+	xfs_icsb_sync_counters(mp);
+
 	statp->f_bsize = sbp->sb_blocksize;
 	lsize = sbp->sb_logstart ? sbp->sb_logblocks : 0;
 	statp->f_blocks = sbp->sb_dblocks - lsize;
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index a7c116e..fb9a9c8 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -478,8 +478,8 @@ xfs_fs_counts(
 	xfs_mount_t		*mp,
 	xfs_fsop_counts_t	*cnt)
 {
-	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
 	spin_lock(&mp->m_sb_lock);
+	xfs_icsb_sync_counters(mp);
 	cnt->freedata = mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
 	cnt->freertx = mp->m_sb.sb_frextents;
 	cnt->freeino = mp->m_sb.sb_ifree;
@@ -540,7 +540,7 @@ xfs_reserve_blocks(
 	 */
 retry:
 	spin_lock(&mp->m_sb_lock);
-	xfs_icsb_sync_counters_locked(mp, 0);
+	xfs_icsb_sync_counters(mp);
 
 	/*
 	 * If our previous reservation was larger than the current value,
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 19e9dfa..f0353fb 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -46,19 +46,6 @@
 
 STATIC void	xfs_unmountfs_wait(xfs_mount_t *);
 
-
-#ifdef HAVE_PERCPU_SB
-STATIC void	xfs_icsb_balance_counter(xfs_mount_t *, xfs_sb_field_t,
-						int);
-STATIC void	xfs_icsb_balance_counter_locked(xfs_mount_t *, xfs_sb_field_t,
-						int);
-STATIC void	xfs_icsb_disable_counter(xfs_mount_t *, xfs_sb_field_t);
-#else
-
-#define xfs_icsb_balance_counter(mp, a, b)		do { } while (0)
-#define xfs_icsb_balance_counter_locked(mp, a, b)	do { } while (0)
-#endif
-
 static const struct {
 	short offset;
 	short type;	/* 0 = integer
@@ -280,6 +267,112 @@ xfs_free_perag(
 	}
 }
 
+
+/*
+ * Per-cpu incore superblock counters
+ *
+ * Simple concept, difficult implementation, now somewhat simplified by generic
+ * per-cpu counter support.  This provides distributed per cpu counters for
+ * contended fields (e.g.  free block count).
+ *
+ * Difficulties arise in that the incore sb is used for ENOSPC checking, and
+ * hence needs to be accurately read when we are running low on space. Hence We
+ * need to check against counter error bounds and determine how accurately to
+ * sum based on that metric. The percpu counters take care of this for us,
+ * so we only need to modify the fast path to handle per-cpu counter error
+ * cases.
+ */
+static inline int
+xfs_icsb_add(
+	struct xfs_mount	*mp,
+	int			counter,
+	int64_t			delta,
+	int64_t			threshold)
+{
+	int			ret;
+
+	ret = percpu_counter_add_unless_lt(&mp->m_icsb[counter], delta,
+								threshold);
+	if (ret < 0)
+		return -ENOSPC;
+	return 0;
+}
+
+static inline void
+xfs_icsb_set(
+	struct xfs_mount	*mp,
+	int			counter,
+	int64_t			value)
+{
+	percpu_counter_set(&mp->m_icsb[counter], value);
+}
+
+static inline int64_t
+xfs_icsb_sum(
+	struct xfs_mount	*mp,
+	int			counter)
+{
+	return percpu_counter_sum_positive(&mp->m_icsb[counter]);
+}
+
+static inline int64_t
+xfs_icsb_read(
+	struct xfs_mount	*mp,
+	int			counter)
+{
+	return percpu_counter_read_positive(&mp->m_icsb[counter]);
+}
+
+void
+xfs_icsb_reinit_counters(
+	struct xfs_mount	*mp)
+{
+	xfs_icsb_set(mp, XFS_ICSB_FDBLOCKS, mp->m_sb.sb_fdblocks);
+	xfs_icsb_set(mp, XFS_ICSB_IFREE, mp->m_sb.sb_ifree);
+	xfs_icsb_set(mp, XFS_ICSB_ICOUNT, mp->m_sb.sb_icount);
+}
+
+int
+xfs_icsb_init_counters(
+	struct xfs_mount	*mp)
+{
+	int			i;
+	int			error;
+
+	for (i = 0; i < XFS_ICSB_MAX; i++) {
+		error = percpu_counter_init(&mp->m_icsb[i], 0);
+		if (error)
+			goto out_error;
+	}
+	xfs_icsb_reinit_counters(mp);
+	return 0;
+
+out_error:
+	for (; i >= 0; i--)
+		percpu_counter_destroy(&mp->m_icsb[i]);
+	return error;
+}
+
+void
+xfs_icsb_destroy_counters(
+	xfs_mount_t	*mp)
+{
+	int		i;
+
+	for (i = 0; i < XFS_ICSB_MAX; i++)
+		percpu_counter_destroy(&mp->m_icsb[i]);
+}
+
+void
+xfs_icsb_sync_counters(
+	xfs_mount_t	*mp)
+{
+	assert_spin_locked(&mp->m_sb_lock);
+	mp->m_sb.sb_icount = xfs_icsb_sum(mp, XFS_ICSB_ICOUNT);
+	mp->m_sb.sb_ifree = xfs_icsb_sum(mp, XFS_ICSB_IFREE);
+	mp->m_sb.sb_fdblocks = xfs_icsb_sum(mp, XFS_ICSB_FDBLOCKS);
+}
+
 /*
  * Check size of device based on the (data/realtime) block count.
  * Note: this check is used by the growfs code as well as mount.
@@ -1562,7 +1655,9 @@ xfs_log_sbcount(
 	if (!xfs_fs_writable(mp))
 		return 0;
 
-	xfs_icsb_sync_counters(mp, 0);
+	spin_lock(&mp->m_sb_lock);
+	xfs_icsb_sync_counters(mp);
+	spin_unlock(&mp->m_sb_lock);
 
 	/*
 	 * we don't need to do this if we are updating the superblock
@@ -1674,9 +1769,8 @@ xfs_mod_incore_sb_unlocked(
 	int64_t		delta,
 	int		rsvd)
 {
-	int		scounter;	/* short counter for 32 bit fields */
-	long long	lcounter;	/* long counter for 64 bit fields */
-	long long	res_used, rem;
+	int		scounter = 0;	/* short counter for 32 bit fields */
+	long long	lcounter = 0;	/* long counter for 64 bit fields */
 
 	/*
 	 * With the in-core superblock spin lock held, switch
@@ -1685,66 +1779,6 @@ xfs_mod_incore_sb_unlocked(
 	 * 0, then do not apply the delta and return EINVAL.
 	 */
 	switch (field) {
-	case XFS_SBS_ICOUNT:
-		lcounter = (long long)mp->m_sb.sb_icount;
-		lcounter += delta;
-		if (lcounter < 0) {
-			ASSERT(0);
-			return XFS_ERROR(EINVAL);
-		}
-		mp->m_sb.sb_icount = lcounter;
-		return 0;
-	case XFS_SBS_IFREE:
-		lcounter = (long long)mp->m_sb.sb_ifree;
-		lcounter += delta;
-		if (lcounter < 0) {
-			ASSERT(0);
-			return XFS_ERROR(EINVAL);
-		}
-		mp->m_sb.sb_ifree = lcounter;
-		return 0;
-	case XFS_SBS_FDBLOCKS:
-		lcounter = (long long)
-			mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
-		res_used = (long long)(mp->m_resblks - mp->m_resblks_avail);
-
-		if (delta > 0) {		/* Putting blocks back */
-			if (res_used > delta) {
-				mp->m_resblks_avail += delta;
-			} else {
-				rem = delta - res_used;
-				mp->m_resblks_avail = mp->m_resblks;
-				lcounter += rem;
-			}
-		} else {				/* Taking blocks away */
-			lcounter += delta;
-			if (lcounter >= 0) {
-				mp->m_sb.sb_fdblocks = lcounter +
-							XFS_ALLOC_SET_ASIDE(mp);
-				return 0;
-			}
-
-			/*
-			 * We are out of blocks, use any available reserved
-			 * blocks if were allowed to.
-			 */
-			if (!rsvd)
-				return XFS_ERROR(ENOSPC);
-
-			lcounter = (long long)mp->m_resblks_avail + delta;
-			if (lcounter >= 0) {
-				mp->m_resblks_avail = lcounter;
-				return 0;
-			}
-			printk_once(KERN_WARNING
-				"Filesystem \"%s\": reserve blocks depleted! "
-				"Consider increasing reserve pool size.",
-				mp->m_fsname);
-			return XFS_ERROR(ENOSPC);
-		}
-
-		mp->m_sb.sb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp);
-		return 0;
 	case XFS_SBS_FREXTENTS:
 		lcounter = (long long)mp->m_sb.sb_frextents;
 		lcounter += delta;
@@ -1846,9 +1880,7 @@ xfs_mod_incore_sb(
 {
 	int			status;
 
-#ifdef HAVE_PERCPU_SB
 	ASSERT(field < XFS_SBS_ICOUNT || field > XFS_SBS_FDBLOCKS);
-#endif
 	spin_lock(&mp->m_sb_lock);
 	status = xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
 	spin_unlock(&mp->m_sb_lock);
@@ -1907,6 +1939,89 @@ unwind:
 	return error;
 }
 
+int
+xfs_icsb_modify_counters(
+	xfs_mount_t	*mp,
+	xfs_sb_field_t	field,
+	int64_t		delta,
+	int		rsvd)
+{
+	int64_t		lcounter;
+	int64_t		res_used;
+	int		ret = 0;
+
+
+	switch (field) {
+	case XFS_SBS_ICOUNT:
+		ret = xfs_icsb_add(mp, XFS_ICSB_ICOUNT, delta, 0);
+		if (ret < 0) {
+			ASSERT(0);
+			return XFS_ERROR(EINVAL);
+		}
+		return 0;
+
+	case XFS_SBS_IFREE:
+		ret = xfs_icsb_add(mp, XFS_ICSB_IFREE, delta, 0);
+		if (ret < 0) {
+			ASSERT(0);
+			return XFS_ERROR(EINVAL);
+		}
+		return 0;
+
+	case XFS_SBS_FDBLOCKS:
+		/*
+		 * if we are putting blocks back, put them into the reserve
+		 * block pool first.
+		 */
+		if (mp->m_resblks != mp->m_resblks_avail && delta > 0) {
+			spin_lock(&mp->m_sb_lock);
+			res_used = (int64_t)(mp->m_resblks -
+						mp->m_resblks_avail);
+			if (res_used > delta) {
+				mp->m_resblks_avail += delta;
+				delta = 0;
+			} else {
+				delta -= res_used;
+				mp->m_resblks_avail = mp->m_resblks;
+			}
+			spin_unlock(&mp->m_sb_lock);
+			if (!delta)
+				return 0;
+		}
+
+		/* try the change */
+		ret = xfs_icsb_add(mp, XFS_ICSB_FDBLOCKS, delta,
+						XFS_ALLOC_SET_ASIDE(mp));
+		if (likely(ret >= 0))
+			return 0;
+
+		/* ENOSPC */
+		ASSERT(ret == -ENOSPC);
+		ASSERT(delta < 0);
+
+		if (!rsvd)
+			return XFS_ERROR(ENOSPC);
+
+		spin_lock(&mp->m_sb_lock);
+		lcounter = (int64_t)mp->m_resblks_avail + delta;
+		if (lcounter >= 0) {
+			mp->m_resblks_avail = lcounter;
+			spin_unlock(&mp->m_sb_lock);
+			return 0;
+		}
+		spin_unlock(&mp->m_sb_lock);
+		printk_once(KERN_WARNING
+			"Filesystem \"%s\": reserve blocks depleted! "
+			"Consider increasing reserve pool size.",
+			mp->m_fsname);
+		return XFS_ERROR(ENOSPC);
+	default:
+		ASSERT(0);
+		return XFS_ERROR(EINVAL);
+	}
+	return 0;
+}
+
 /*
  * xfs_getsb() is called to obtain the buffer for the superblock.
  * The buffer is returned locked and read in from disk.
@@ -2000,572 +2115,3 @@ xfs_dev_is_read_only(
 	}
 	return 0;
 }
-
-#ifdef HAVE_PERCPU_SB
-/*
- * Per-cpu incore superblock counters
- *
- * Simple concept, difficult implementation
- *
- * Basically, replace the incore superblock counters with a distributed per cpu
- * counter for contended fields (e.g.  free block count).
- *
- * Difficulties arise in that the incore sb is used for ENOSPC checking, and
- * hence needs to be accurately read when we are running low on space. Hence
- * there is a method to enable and disable the per-cpu counters based on how
- * much "stuff" is available in them.
- *
- * Basically, a counter is enabled if there is enough free resource to justify
- * running a per-cpu fast-path. If the per-cpu counter runs out (i.e. a local
- * ENOSPC), then we disable the counters to synchronise all callers and
- * re-distribute the available resources.
- *
- * If, once we redistributed the available resources, we still get a failure,
- * we disable the per-cpu counter and go through the slow path.
- *
- * The slow path is the current xfs_mod_incore_sb() function.  This means that
- * when we disable a per-cpu counter, we need to drain its resources back to
- * the global superblock. We do this after disabling the counter to prevent
- * more threads from queueing up on the counter.
- *
- * Essentially, this means that we still need a lock in the fast path to enable
- * synchronisation between the global counters and the per-cpu counters. This
- * is not a problem because the lock will be local to a CPU almost all the time
- * and have little contention except when we get to ENOSPC conditions.
- *
- * Basically, this lock becomes a barrier that enables us to lock out the fast
- * path while we do things like enabling and disabling counters and
- * synchronising the counters.
- *
- * Locking rules:
- *
- * 	1. m_sb_lock before picking up per-cpu locks
- * 	2. per-cpu locks always picked up via for_each_online_cpu() order
- * 	3. accurate counter sync requires m_sb_lock + per cpu locks
- * 	4. modifying per-cpu counters requires holding per-cpu lock
- * 	5. modifying global counters requires holding m_sb_lock
- *	6. enabling or disabling a counter requires holding the m_sb_lock 
- *	   and _none_ of the per-cpu locks.
- *
- * Disabled counters are only ever re-enabled by a balance operation
- * that results in more free resources per CPU than a given threshold.
- * To ensure counters don't remain disabled, they are rebalanced when
- * the global resource goes above a higher threshold (i.e. some hysteresis
- * is present to prevent thrashing).
- */
-
-#ifdef CONFIG_HOTPLUG_CPU
-/*
- * hot-plug CPU notifier support.
- *
- * We need a notifier per filesystem as we need to be able to identify
- * the filesystem to balance the counters out. This is achieved by
- * having a notifier block embedded in the xfs_mount_t and doing pointer
- * magic to get the mount pointer from the notifier block address.
- */
-STATIC int
-xfs_icsb_cpu_notify(
-	struct notifier_block *nfb,
-	unsigned long action,
-	void *hcpu)
-{
-	xfs_icsb_cnts_t *cntp;
-	xfs_mount_t	*mp;
-
-	mp = (xfs_mount_t *)container_of(nfb, xfs_mount_t, m_icsb_notifier);
-	cntp = (xfs_icsb_cnts_t *)
-			per_cpu_ptr(mp->m_sb_cnts, (unsigned long)hcpu);
-	switch (action) {
-	case CPU_UP_PREPARE:
-	case CPU_UP_PREPARE_FROZEN:
-		/* Easy Case - initialize the area and locks, and
-		 * then rebalance when online does everything else for us. */
-		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
-		break;
-	case CPU_ONLINE:
-	case CPU_ONLINE_FROZEN:
-		xfs_icsb_lock(mp);
-		xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0);
-		xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0);
-		xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
-		xfs_icsb_unlock(mp);
-		break;
-	case CPU_DEAD:
-	case CPU_DEAD_FROZEN:
-		/* Disable all the counters, then fold the dead cpu's
-		 * count into the total on the global superblock and
-		 * re-enable the counters. */
-		xfs_icsb_lock(mp);
-		spin_lock(&mp->m_sb_lock);
-		xfs_icsb_disable_counter(mp, XFS_SBS_ICOUNT);
-		xfs_icsb_disable_counter(mp, XFS_SBS_IFREE);
-		xfs_icsb_disable_counter(mp, XFS_SBS_FDBLOCKS);
-
-		mp->m_sb.sb_icount += cntp->icsb_icount;
-		mp->m_sb.sb_ifree += cntp->icsb_ifree;
-		mp->m_sb.sb_fdblocks += cntp->icsb_fdblocks;
-
-		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
-
-		xfs_icsb_balance_counter_locked(mp, XFS_SBS_ICOUNT, 0);
-		xfs_icsb_balance_counter_locked(mp, XFS_SBS_IFREE, 0);
-		xfs_icsb_balance_counter_locked(mp, XFS_SBS_FDBLOCKS, 0);
-		spin_unlock(&mp->m_sb_lock);
-		xfs_icsb_unlock(mp);
-		break;
-	}
-
-	return NOTIFY_OK;
-}
-#endif /* CONFIG_HOTPLUG_CPU */
-
-int
-xfs_icsb_init_counters(
-	xfs_mount_t	*mp)
-{
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
-	mp->m_sb_cnts = alloc_percpu(xfs_icsb_cnts_t);
-	if (mp->m_sb_cnts == NULL)
-		return -ENOMEM;
-
-#ifdef CONFIG_HOTPLUG_CPU
-	mp->m_icsb_notifier.notifier_call = xfs_icsb_cpu_notify;
-	mp->m_icsb_notifier.priority = 0;
-	register_hotcpu_notifier(&mp->m_icsb_notifier);
-#endif /* CONFIG_HOTPLUG_CPU */
-
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
-	}
-
-	mutex_init(&mp->m_icsb_mutex);
-
-	/*
-	 * start with all counters disabled so that the
-	 * initial balance kicks us off correctly
-	 */
-	mp->m_icsb_counters = -1;
-	return 0;
-}
-
-void
-xfs_icsb_reinit_counters(
-	xfs_mount_t	*mp)
-{
-	xfs_icsb_lock(mp);
-	/*
-	 * start with all counters disabled so that the
-	 * initial balance kicks us off correctly
-	 */
-	mp->m_icsb_counters = -1;
-	xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0);
-	xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0);
-	xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
-	xfs_icsb_unlock(mp);
-}
-
-void
-xfs_icsb_destroy_counters(
-	xfs_mount_t	*mp)
-{
-	if (mp->m_sb_cnts) {
-		unregister_hotcpu_notifier(&mp->m_icsb_notifier);
-		free_percpu(mp->m_sb_cnts);
-	}
-	mutex_destroy(&mp->m_icsb_mutex);
-}
-
-STATIC void
-xfs_icsb_lock_cntr(
-	xfs_icsb_cnts_t	*icsbp)
-{
-	while (test_and_set_bit(XFS_ICSB_FLAG_LOCK, &icsbp->icsb_flags)) {
-		ndelay(1000);
-	}
-}
-
-STATIC void
-xfs_icsb_unlock_cntr(
-	xfs_icsb_cnts_t	*icsbp)
-{
-	clear_bit(XFS_ICSB_FLAG_LOCK, &icsbp->icsb_flags);
-}
-
-
-STATIC void
-xfs_icsb_lock_all_counters(
-	xfs_mount_t	*mp)
-{
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		xfs_icsb_lock_cntr(cntp);
-	}
-}
-
-STATIC void
-xfs_icsb_unlock_all_counters(
-	xfs_mount_t	*mp)
-{
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		xfs_icsb_unlock_cntr(cntp);
-	}
-}
-
-STATIC void
-xfs_icsb_count(
-	xfs_mount_t	*mp,
-	xfs_icsb_cnts_t	*cnt,
-	int		flags)
-{
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
-	memset(cnt, 0, sizeof(xfs_icsb_cnts_t));
-
-	if (!(flags & XFS_ICSB_LAZY_COUNT))
-		xfs_icsb_lock_all_counters(mp);
-
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		cnt->icsb_icount += cntp->icsb_icount;
-		cnt->icsb_ifree += cntp->icsb_ifree;
-		cnt->icsb_fdblocks += cntp->icsb_fdblocks;
-	}
-
-	if (!(flags & XFS_ICSB_LAZY_COUNT))
-		xfs_icsb_unlock_all_counters(mp);
-}
-
-STATIC int
-xfs_icsb_counter_disabled(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field)
-{
-	ASSERT((field >= XFS_SBS_ICOUNT) && (field <= XFS_SBS_FDBLOCKS));
-	return test_bit(field, &mp->m_icsb_counters);
-}
-
-STATIC void
-xfs_icsb_disable_counter(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field)
-{
-	xfs_icsb_cnts_t	cnt;
-
-	ASSERT((field >= XFS_SBS_ICOUNT) && (field <= XFS_SBS_FDBLOCKS));
-
-	/*
-	 * If we are already disabled, then there is nothing to do
-	 * here. We check before locking all the counters to avoid
-	 * the expensive lock operation when being called in the
-	 * slow path and the counter is already disabled. This is
-	 * safe because the only time we set or clear this state is under
-	 * the m_icsb_mutex.
-	 */
-	if (xfs_icsb_counter_disabled(mp, field))
-		return;
-
-	xfs_icsb_lock_all_counters(mp);
-	if (!test_and_set_bit(field, &mp->m_icsb_counters)) {
-		/* drain back to superblock */
-
-		xfs_icsb_count(mp, &cnt, XFS_ICSB_LAZY_COUNT);
-		switch(field) {
-		case XFS_SBS_ICOUNT:
-			mp->m_sb.sb_icount = cnt.icsb_icount;
-			break;
-		case XFS_SBS_IFREE:
-			mp->m_sb.sb_ifree = cnt.icsb_ifree;
-			break;
-		case XFS_SBS_FDBLOCKS:
-			mp->m_sb.sb_fdblocks = cnt.icsb_fdblocks;
-			break;
-		default:
-			BUG();
-		}
-	}
-
-	xfs_icsb_unlock_all_counters(mp);
-}
-
-STATIC void
-xfs_icsb_enable_counter(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field,
-	uint64_t	count,
-	uint64_t	resid)
-{
-	xfs_icsb_cnts_t	*cntp;
-	int		i;
-
-	ASSERT((field >= XFS_SBS_ICOUNT) && (field <= XFS_SBS_FDBLOCKS));
-
-	xfs_icsb_lock_all_counters(mp);
-	for_each_online_cpu(i) {
-		cntp = per_cpu_ptr(mp->m_sb_cnts, i);
-		switch (field) {
-		case XFS_SBS_ICOUNT:
-			cntp->icsb_icount = count + resid;
-			break;
-		case XFS_SBS_IFREE:
-			cntp->icsb_ifree = count + resid;
-			break;
-		case XFS_SBS_FDBLOCKS:
-			cntp->icsb_fdblocks = count + resid;
-			break;
-		default:
-			BUG();
-			break;
-		}
-		resid = 0;
-	}
-	clear_bit(field, &mp->m_icsb_counters);
-	xfs_icsb_unlock_all_counters(mp);
-}
-
-void
-xfs_icsb_sync_counters_locked(
-	xfs_mount_t	*mp,
-	int		flags)
-{
-	xfs_icsb_cnts_t	cnt;
-
-	xfs_icsb_count(mp, &cnt, flags);
-
-	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_ICOUNT))
-		mp->m_sb.sb_icount = cnt.icsb_icount;
-	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_IFREE))
-		mp->m_sb.sb_ifree = cnt.icsb_ifree;
-	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_FDBLOCKS))
-		mp->m_sb.sb_fdblocks = cnt.icsb_fdblocks;
-}
-
-/*
- * Accurate update of per-cpu counters to incore superblock
- */
-void
-xfs_icsb_sync_counters(
-	xfs_mount_t	*mp,
-	int		flags)
-{
-	spin_lock(&mp->m_sb_lock);
-	xfs_icsb_sync_counters_locked(mp, flags);
-	spin_unlock(&mp->m_sb_lock);
-}
-
-/*
- * Balance and enable/disable counters as necessary.
- *
- * Thresholds for re-enabling counters are somewhat magic.  inode counts are
- * chosen to be the same number as single on disk allocation chunk per CPU, and
- * free blocks is something far enough zero that we aren't going thrash when we
- * get near ENOSPC. We also need to supply a minimum we require per cpu to
- * prevent looping endlessly when xfs_alloc_space asks for more than will
- * be distributed to a single CPU but each CPU has enough blocks to be
- * reenabled.
- *
- * Note that we can be called when counters are already disabled.
- * xfs_icsb_disable_counter() optimises the counter locking in this case to
- * prevent locking every per-cpu counter needlessly.
- */
-
-#define XFS_ICSB_INO_CNTR_REENABLE	(uint64_t)64
-#define XFS_ICSB_FDBLK_CNTR_REENABLE(mp) \
-		(uint64_t)(512 + XFS_ALLOC_SET_ASIDE(mp))
-STATIC void
-xfs_icsb_balance_counter_locked(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t  field,
-	int		min_per_cpu)
-{
-	uint64_t	count, resid;
-	int		weight = num_online_cpus();
-	uint64_t	min = (uint64_t)min_per_cpu;
-
-	/* disable counter and sync counter */
-	xfs_icsb_disable_counter(mp, field);
-
-	/* update counters  - first CPU gets residual*/
-	switch (field) {
-	case XFS_SBS_ICOUNT:
-		count = mp->m_sb.sb_icount;
-		resid = do_div(count, weight);
-		if (count < max(min, XFS_ICSB_INO_CNTR_REENABLE))
-			return;
-		break;
-	case XFS_SBS_IFREE:
-		count = mp->m_sb.sb_ifree;
-		resid = do_div(count, weight);
-		if (count < max(min, XFS_ICSB_INO_CNTR_REENABLE))
-			return;
-		break;
-	case XFS_SBS_FDBLOCKS:
-		count = mp->m_sb.sb_fdblocks;
-		resid = do_div(count, weight);
-		if (count < max(min, XFS_ICSB_FDBLK_CNTR_REENABLE(mp)))
-			return;
-		break;
-	default:
-		BUG();
-		count = resid = 0;	/* quiet, gcc */
-		break;
-	}
-
-	xfs_icsb_enable_counter(mp, field, count, resid);
-}
-
-STATIC void
-xfs_icsb_balance_counter(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t  fields,
-	int		min_per_cpu)
-{
-	spin_lock(&mp->m_sb_lock);
-	xfs_icsb_balance_counter_locked(mp, fields, min_per_cpu);
-	spin_unlock(&mp->m_sb_lock);
-}
-
-int
-xfs_icsb_modify_counters(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field,
-	int64_t		delta,
-	int		rsvd)
-{
-	xfs_icsb_cnts_t	*icsbp;
-	long long	lcounter;	/* long counter for 64 bit fields */
-	int		ret = 0;
-
-	might_sleep();
-again:
-	preempt_disable();
-	icsbp = this_cpu_ptr(mp->m_sb_cnts);
-
-	/*
-	 * if the counter is disabled, go to slow path
-	 */
-	if (unlikely(xfs_icsb_counter_disabled(mp, field)))
-		goto slow_path;
-	xfs_icsb_lock_cntr(icsbp);
-	if (unlikely(xfs_icsb_counter_disabled(mp, field))) {
-		xfs_icsb_unlock_cntr(icsbp);
-		goto slow_path;
-	}
-
-	switch (field) {
-	case XFS_SBS_ICOUNT:
-		lcounter = icsbp->icsb_icount;
-		lcounter += delta;
-		if (unlikely(lcounter < 0))
-			goto balance_counter;
-		icsbp->icsb_icount = lcounter;
-		break;
-
-	case XFS_SBS_IFREE:
-		lcounter = icsbp->icsb_ifree;
-		lcounter += delta;
-		if (unlikely(lcounter < 0))
-			goto balance_counter;
-		icsbp->icsb_ifree = lcounter;
-		break;
-
-	case XFS_SBS_FDBLOCKS:
-		BUG_ON((mp->m_resblks - mp->m_resblks_avail) != 0);
-
-		lcounter = icsbp->icsb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
-		lcounter += delta;
-		if (unlikely(lcounter < 0))
-			goto balance_counter;
-		icsbp->icsb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp);
-		break;
-	default:
-		BUG();
-		break;
-	}
-	xfs_icsb_unlock_cntr(icsbp);
-	preempt_enable();
-	return 0;
-
-slow_path:
-	preempt_enable();
-
-	/*
-	 * serialise with a mutex so we don't burn lots of cpu on
-	 * the superblock lock. We still need to hold the superblock
-	 * lock, however, when we modify the global structures.
-	 */
-	xfs_icsb_lock(mp);
-
-	/*
-	 * Now running atomically.
-	 *
-	 * If the counter is enabled, someone has beaten us to rebalancing.
-	 * Drop the lock and try again in the fast path....
-	 */
-	if (!(xfs_icsb_counter_disabled(mp, field))) {
-		xfs_icsb_unlock(mp);
-		goto again;
-	}
-
-	/*
-	 * The counter is currently disabled. Because we are
-	 * running atomically here, we know a rebalance cannot
-	 * be in progress. Hence we can go straight to operating
-	 * on the global superblock. We do not call xfs_mod_incore_sb()
-	 * here even though we need to get the m_sb_lock. Doing so
-	 * will cause us to re-enter this function and deadlock.
-	 * Hence we get the m_sb_lock ourselves and then call
-	 * xfs_mod_incore_sb_unlocked() as the unlocked path operates
-	 * directly on the global counters.
-	 */
-	spin_lock(&mp->m_sb_lock);
-	ret = xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
-	spin_unlock(&mp->m_sb_lock);
-
-	/*
-	 * Now that we've modified the global superblock, we
-	 * may be able to re-enable the distributed counters
-	 * (e.g. lots of space just got freed). After that
-	 * we are done.
-	 */
-	if (ret != ENOSPC)
-		xfs_icsb_balance_counter(mp, field, 0);
-	xfs_icsb_unlock(mp);
-	return ret;
-
-balance_counter:
-	xfs_icsb_unlock_cntr(icsbp);
-	preempt_enable();
-
-	/*
-	 * We may have multiple threads here if multiple per-cpu
-	 * counters run dry at the same time. This will mean we can
-	 * do more balances than strictly necessary but it is not
-	 * the common slowpath case.
-	 */
-	xfs_icsb_lock(mp);
-
-	/*
-	 * running atomically.
-	 *
-	 * This will leave the counter in the correct state for future
-	 * accesses. After the rebalance, we simply try again and our retry
-	 * will either succeed through the fast path or slow path without
-	 * another balance operation being required.
-	 */
-	xfs_icsb_balance_counter(mp, field, delta);
-	xfs_icsb_unlock(mp);
-	goto again;
-}
-
-#endif
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 5861b49..42d31df 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -65,44 +65,19 @@ struct xfs_nameops;
 struct xfs_ail;
 struct xfs_quotainfo;
 
-#ifdef HAVE_PERCPU_SB
-
 /*
- * Valid per-cpu incore superblock counters. Note that if you add new counters,
- * you may need to define new counter disabled bit field descriptors as there
- * are more possible fields in the superblock that can fit in a bitfield on a
- * 32 bit platform. The XFS_SBS_* values for the current current counters just
- * fit.
+ * Per-cpu incore superblock counters.
  */
-typedef struct xfs_icsb_cnts {
-	uint64_t	icsb_fdblocks;
-	uint64_t	icsb_ifree;
-	uint64_t	icsb_icount;
-	unsigned long	icsb_flags;
-} xfs_icsb_cnts_t;
-
-#define XFS_ICSB_FLAG_LOCK	(1 << 0)	/* counter lock bit */
-
-#define XFS_ICSB_LAZY_COUNT	(1 << 1)	/* accuracy not needed */
+enum {
+	XFS_ICSB_FDBLOCKS = 0,
+	XFS_ICSB_IFREE,
+	XFS_ICSB_ICOUNT,
+	XFS_ICSB_MAX,
+};
 
-extern int	xfs_icsb_init_counters(struct xfs_mount *);
-extern void	xfs_icsb_reinit_counters(struct xfs_mount *);
-extern void	xfs_icsb_destroy_counters(struct xfs_mount *);
-extern void	xfs_icsb_sync_counters(struct xfs_mount *, int);
-extern void	xfs_icsb_sync_counters_locked(struct xfs_mount *, int);
 extern int	xfs_icsb_modify_counters(struct xfs_mount *, xfs_sb_field_t,
 						int64_t, int);
 
-#else
-#define xfs_icsb_init_counters(mp)		(0)
-#define xfs_icsb_destroy_counters(mp)		do { } while (0)
-#define xfs_icsb_reinit_counters(mp)		do { } while (0)
-#define xfs_icsb_sync_counters(mp, flags)	do { } while (0)
-#define xfs_icsb_sync_counters_locked(mp, flags) do { } while (0)
-#define xfs_icsb_modify_counters(mp, field, delta, rsvd) \
-	xfs_mod_incore_sb(mp, field, delta, rsvd)
-#endif
-
 typedef struct xfs_mount {
 	struct super_block	*m_super;
 	xfs_tid_t		m_tid;		/* next unused tid for fs */
@@ -186,12 +161,6 @@ typedef struct xfs_mount {
 	struct xfs_chash	*m_chash;	/* fs private inode per-cluster
 						 * hash table */
 	atomic_t		m_active_trans;	/* number trans frozen */
-#ifdef HAVE_PERCPU_SB
-	xfs_icsb_cnts_t __percpu *m_sb_cnts;	/* per-cpu superblock counters */
-	unsigned long		m_icsb_counters; /* disabled per-cpu counters */
-	struct notifier_block	m_icsb_notifier; /* hotplug cpu notifier */
-	struct mutex		m_icsb_mutex;	/* balancer sync lock */
-#endif
 	struct xfs_mru_cache	*m_filestream;  /* per-mount filestream data */
 	struct task_struct	*m_sync_task;	/* generalised sync thread */
 	xfs_sync_work_t		m_sync_work;	/* work item for VFS_SYNC */
@@ -202,6 +171,7 @@ typedef struct xfs_mount {
 	__int64_t		m_update_flags;	/* sb flags we need to update
 						   on the next remount,rw */
 	struct shrinker		m_inode_shrink;	/* inode reclaim shrinker */
+	struct percpu_counter	m_icsb[XFS_ICSB_MAX];
 } xfs_mount_t;
 
 /*
@@ -333,26 +303,6 @@ struct xfs_perag *xfs_perag_get_tag(struct xfs_mount *mp, xfs_agnumber_t agno,
 void	xfs_perag_put(struct xfs_perag *pag);
 
 /*
- * Per-cpu superblock locking functions
- */
-#ifdef HAVE_PERCPU_SB
-static inline void
-xfs_icsb_lock(xfs_mount_t *mp)
-{
-	mutex_lock(&mp->m_icsb_mutex);
-}
-
-static inline void
-xfs_icsb_unlock(xfs_mount_t *mp)
-{
-	mutex_unlock(&mp->m_icsb_mutex);
-}
-#else
-#define xfs_icsb_lock(mp)
-#define xfs_icsb_unlock(mp)
-#endif
-
-/*
  * This structure is for use by the xfs_mod_incore_sb_batch() routine.
  * xfs_growfs can specify a few fields which are more than int limit
  */
@@ -379,6 +329,11 @@ extern int	xfs_sb_validate_fsb_count(struct xfs_sb *, __uint64_t);
 
 extern int	xfs_dev_is_read_only(struct xfs_mount *, char *);
 
+extern int	xfs_icsb_init_counters(struct xfs_mount *);
+extern void	xfs_icsb_reinit_counters(struct xfs_mount *);
+extern void	xfs_icsb_destroy_counters(struct xfs_mount *);
+extern void	xfs_icsb_sync_counters(struct xfs_mount *);
+
 #endif	/* __KERNEL__ */
 
 extern void	xfs_mod_sb(struct xfs_trans *, __int64_t);
-- 
1.7.2.3


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] xfs: demultiplex xfs_icsb_modify_counters()
  2010-11-29  0:36 [PATCH 0/3] Use generic percpu counters in XFS Dave Chinner
  2010-11-29  0:36 ` [PATCH 1/3] lib: percpu counter add unless less than functionality Dave Chinner
  2010-11-29  0:36 ` [PATCH 2/3] xfs: use generic per-cpu counter infrastructure Dave Chinner
@ 2010-11-29  0:36 ` Dave Chinner
  2010-11-29  9:19   ` Christoph Hellwig
  2 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2010-11-29  0:36 UTC (permalink / raw)
  To: xfs; +Cc: linux-kernel, a.p.zijlstra

From: Dave Chinner <dchinner@redhat.com>

With the conversion to percpu counters, xfs_icsb_modify_counters() really does
not need to exist. Convert the inode counter modifications to use a common
helper function for the one place that calls them, and add another function for
the free block modification and convert all the callers to use that.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_bmap.c  |   34 +++++-------
 fs/xfs/xfs_fsops.c |    3 +-
 fs/xfs/xfs_mount.c |  160 +++++++++++++++++++++++++---------------------------
 fs/xfs/xfs_mount.h |    4 +-
 fs/xfs/xfs_trans.c |   19 +++---
 5 files changed, 103 insertions(+), 117 deletions(-)

diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 4111cd3..6a47556 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -614,8 +614,8 @@ xfs_bmap_add_extent(
 			nblks += cur->bc_private.b.allocated;
 		ASSERT(nblks <= da_old);
 		if (nblks < da_old)
-			xfs_icsb_modify_counters(ip->i_mount, XFS_SBS_FDBLOCKS,
-				(int64_t)(da_old - nblks), rsvd);
+			xfs_icsb_modify_free_blocks(ip->i_mount,
+					(int64_t)(da_old - nblks), rsvd);
 	}
 	/*
 	 * Clear out the allocated field, done with it now in any case.
@@ -1079,7 +1079,7 @@ xfs_bmap_add_extent_delay_real(
 		diff = (int)(temp + temp2 - startblockval(PREV.br_startblock) -
 			(cur ? cur->bc_private.b.allocated : 0));
 		if (diff > 0 &&
-		    xfs_icsb_modify_counters(ip->i_mount, XFS_SBS_FDBLOCKS,
+		    xfs_icsb_modify_free_blocks(ip->i_mount,
 					     -((int64_t)diff), rsvd)) {
 			/*
 			 * Ick gross gag me with a spoon.
@@ -1090,8 +1090,7 @@ xfs_bmap_add_extent_delay_real(
 					temp--;
 					diff--;
 					if (!diff ||
-					    !xfs_icsb_modify_counters(ip->i_mount,
-						    XFS_SBS_FDBLOCKS,
+					    !xfs_icsb_modify_free_blocks(ip->i_mount,
 						    -((int64_t)diff), rsvd))
 						break;
 				}
@@ -1099,8 +1098,7 @@ xfs_bmap_add_extent_delay_real(
 					temp2--;
 					diff--;
 					if (!diff ||
-					    !xfs_icsb_modify_counters(ip->i_mount,
-						    XFS_SBS_FDBLOCKS,
+					    !xfs_icsb_modify_free_blocks(ip->i_mount,
 						    -((int64_t)diff), rsvd))
 						break;
 				}
@@ -1769,8 +1767,8 @@ xfs_bmap_add_extent_hole_delay(
 	}
 	if (oldlen != newlen) {
 		ASSERT(oldlen > newlen);
-		xfs_icsb_modify_counters(ip->i_mount, XFS_SBS_FDBLOCKS,
-			(int64_t)(oldlen - newlen), rsvd);
+		xfs_icsb_modify_free_blocks(ip->i_mount,
+					(int64_t)(oldlen - newlen), rsvd);
 		/*
 		 * Nothing to do for disk quota accounting here.
 		 */
@@ -3114,10 +3112,9 @@ xfs_bmap_del_extent(
 	 * Nothing to do for disk quota accounting here.
 	 */
 	ASSERT(da_old >= da_new);
-	if (da_old > da_new) {
-		xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS,
-			(int64_t)(da_old - da_new), rsvd);
-	}
+	if (da_old > da_new)
+		xfs_icsb_modify_free_blocks(ip->i_mount,
+					(int64_t)(da_old - da_new), rsvd);
 done:
 	*logflagsp = flags;
 	return error;
@@ -4530,14 +4527,12 @@ xfs_bmapi(
 							-((int64_t)extsz), (flags &
 							XFS_BMAPI_RSVBLOCKS));
 				} else {
-					error = xfs_icsb_modify_counters(mp,
-							XFS_SBS_FDBLOCKS,
+					error = xfs_icsb_modify_free_blocks(mp,
 							-((int64_t)alen), (flags &
 							XFS_BMAPI_RSVBLOCKS));
 				}
 				if (!error) {
-					error = xfs_icsb_modify_counters(mp,
-							XFS_SBS_FDBLOCKS,
+					error = xfs_icsb_modify_free_blocks(mp,
 							-((int64_t)indlen), (flags &
 							XFS_BMAPI_RSVBLOCKS));
 					if (error && rt)
@@ -4546,8 +4541,7 @@ xfs_bmapi(
 							(int64_t)extsz, (flags &
 							XFS_BMAPI_RSVBLOCKS));
 					else if (error)
-						xfs_icsb_modify_counters(mp,
-							XFS_SBS_FDBLOCKS,
+						xfs_icsb_modify_free_blocks(mp,
 							(int64_t)alen, (flags &
 							XFS_BMAPI_RSVBLOCKS));
 				}
@@ -5210,7 +5204,7 @@ xfs_bunmapi(
 					ip, -((long)del.br_blockcount), 0,
 					XFS_QMOPT_RES_RTBLKS);
 			} else {
-				xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS,
+				xfs_icsb_modify_free_blocks(mp,
 						(int64_t)del.br_blockcount, rsvd);
 				(void)xfs_trans_reserve_quota_nblks(NULL,
 					ip, -((long)del.br_blockcount), 0,
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index fb9a9c8..be34ff2 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -596,8 +596,7 @@ out:
 		 * the extra reserve blocks from the reserve.....
 		 */
 		int error;
-		error = xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS,
-						 fdblks_delta, 0);
+		error = xfs_icsb_modify_free_blocks(mp, fdblks_delta, 0);
 		if (error == ENOSPC)
 			goto retry;
 	}
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index f0353fb..3905fc3 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -373,6 +373,83 @@ xfs_icsb_sync_counters(
 	mp->m_sb.sb_fdblocks = xfs_icsb_sum(mp, XFS_ICSB_FDBLOCKS);
 }
 
+int
+xfs_icsb_modify_inodes(
+	xfs_mount_t	*mp,
+	int		cntr,
+	int64_t		delta,
+	int		rsvd)
+{
+	int		ret = 0;
+
+	ASSERT(cntr == XFS_ICSB_ICOUNT || cntr == XFS_ICSB_IFREE);
+
+	ret = xfs_icsb_add(mp, cntr, delta, 0);
+	if (ret < 0) {
+		ASSERT(0);
+		return XFS_ERROR(EINVAL);
+	}
+	return 0;
+}
+
+int
+xfs_icsb_modify_free_blocks(
+	xfs_mount_t	*mp,
+	int64_t		delta,
+	int		rsvd)
+{
+	int64_t		lcounter;
+	int64_t		res_used;
+	int		ret = 0;
+
+	/*
+	 * if we are putting blocks back, put them into the reserve
+	 * block pool first.
+	 */
+	if (unlikely(mp->m_resblks != mp->m_resblks_avail) && delta > 0) {
+		spin_lock(&mp->m_sb_lock);
+		res_used = (int64_t)(mp->m_resblks -
+					mp->m_resblks_avail);
+		if (res_used > delta) {
+			mp->m_resblks_avail += delta;
+			delta = 0;
+		} else {
+			delta -= res_used;
+			mp->m_resblks_avail = mp->m_resblks;
+		}
+		spin_unlock(&mp->m_sb_lock);
+		if (!delta)
+			return 0;
+	}
+
+	/* try the change */
+	ret = xfs_icsb_add(mp, XFS_ICSB_FDBLOCKS, delta,
+					XFS_ALLOC_SET_ASIDE(mp));
+	if (likely(ret >= 0))
+		return 0;
+
+	/* ENOSPC */
+	ASSERT(ret == -ENOSPC);
+	ASSERT(delta < 0);
+
+	if (!rsvd)
+		return XFS_ERROR(ENOSPC);
+
+	spin_lock(&mp->m_sb_lock);
+	lcounter = (int64_t)mp->m_resblks_avail + delta;
+	if (lcounter >= 0) {
+		mp->m_resblks_avail = lcounter;
+		spin_unlock(&mp->m_sb_lock);
+		return 0;
+	}
+	spin_unlock(&mp->m_sb_lock);
+	printk_once(KERN_WARNING
+		"Filesystem \"%s\": reserve blocks depleted! "
+		"Consider increasing reserve pool size.",
+		mp->m_fsname);
+	return XFS_ERROR(ENOSPC);
+}
+
 /*
  * Check size of device based on the (data/realtime) block count.
  * Note: this check is used by the growfs code as well as mount.
@@ -1939,89 +2016,6 @@ unwind:
 	return error;
 }
 
-int
-xfs_icsb_modify_counters(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field,
-	int64_t		delta,
-	int		rsvd)
-{
-	int64_t		lcounter;
-	int64_t		res_used;
-	int		ret = 0;
-
-
-	switch (field) {
-	case XFS_SBS_ICOUNT:
-		ret = xfs_icsb_add(mp, cntr, delta, 0);
-		if (ret < 0) {
-			ASSERT(0);
-			return XFS_ERROR(EINVAL);
-		}
-		return 0;
-
-	case XFS_SBS_IFREE:
-		ret = xfs_icsb_add(mp, XFS_ICSB_IFREE, delta, 0);
-		if (ret < 0) {
-			ASSERT(0);
-			return XFS_ERROR(EINVAL);
-		}
-		return 0;
-
-	case XFS_SBS_FDBLOCKS:
-		/*
-		 * if we are putting blocks back, put them into the reserve
-		 * block pool first.
-		 */
-		if (mp->m_resblks != mp->m_resblks_avail && delta > 0) {
-			spin_lock(&mp->m_sb_lock);
-			res_used = (int64_t)(mp->m_resblks -
-						mp->m_resblks_avail);
-			if (res_used > delta) {
-				mp->m_resblks_avail += delta;
-				delta = 0;
-			} else {
-				delta -= res_used;
-				mp->m_resblks_avail = mp->m_resblks;
-			}
-			spin_unlock(&mp->m_sb_lock);
-			if (!delta)
-				return 0;
-		}
-
-		/* try the change */
-		ret = xfs_icsb_add(mp, XFS_ICSB_FDBLOCKS, delta,
-						XFS_ALLOC_SET_ASIDE(mp));
-		if (likely(ret >= 0))
-			return 0;
-
-		/* ENOSPC */
-		ASSERT(ret == -ENOSPC);
-		ASSERT(delta < 0);
-
-		if (!rsvd)
-			return XFS_ERROR(ENOSPC);
-
-		spin_lock(&mp->m_sb_lock);
-		lcounter = (int64_t)mp->m_resblks_avail + delta;
-		if (lcounter >= 0) {
-			mp->m_resblks_avail = lcounter;
-			spin_unlock(&mp->m_sb_lock);
-			return 0;
-		}
-		spin_unlock(&mp->m_sb_lock);
-		printk_once(KERN_WARNING
-			"Filesystem \"%s\": reserve blocks depleted! "
-			"Consider increasing reserve pool size.",
-			mp->m_fsname);
-		return XFS_ERROR(ENOSPC);
-	default:
-		ASSERT(0);
-		return XFS_ERROR(EINVAL);
-	}
-	return 0;
-}
-
 /*
  * xfs_getsb() is called to obtain the buffer for the superblock.
  * The buffer is returned locked and read in from disk.
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 42d31df..c8ff435 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -75,8 +75,8 @@ enum {
 	XFS_ICSB_MAX,
 };
 
-extern int	xfs_icsb_modify_counters(struct xfs_mount *, xfs_sb_field_t,
-						int64_t, int);
+extern int	xfs_icsb_modify_inodes(struct xfs_mount *, int, int64_t, int);
+extern int	xfs_icsb_modify_free_blocks(struct xfs_mount *, int64_t, int);
 
 typedef struct xfs_mount {
 	struct super_block	*m_super;
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index f6d956b..cf2214d 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -696,7 +696,7 @@ xfs_trans_reserve(
 	 * fail if the count would go below zero.
 	 */
 	if (blocks > 0) {
-		error = xfs_icsb_modify_counters(tp->t_mountp, XFS_SBS_FDBLOCKS,
+		error = xfs_icsb_modify_free_blocks(tp->t_mountp,
 					  -((int64_t)blocks), rsvd);
 		if (error != 0) {
 			current_restore_flags_nested(&tp->t_pflags, PF_FSTRANS);
@@ -767,8 +767,8 @@ undo_log:
 
 undo_blocks:
 	if (blocks > 0) {
-		xfs_icsb_modify_counters(tp->t_mountp, XFS_SBS_FDBLOCKS,
-					 (int64_t)blocks, rsvd);
+		xfs_icsb_modify_free_blocks(tp->t_mountp,
+						(int64_t)blocks, rsvd);
 		tp->t_blk_res = 0;
 	}
 
@@ -1045,21 +1045,20 @@ xfs_trans_unreserve_and_mod_sb(
 
 	/* apply the per-cpu counters */
 	if (blkdelta) {
-		error = xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS,
-						 blkdelta, rsvd);
+		error = xfs_icsb_modify_free_blocks(mp, blkdelta, rsvd);
 		if (error)
 			goto out;
 	}
 
 	if (idelta) {
-		error = xfs_icsb_modify_counters(mp, XFS_SBS_ICOUNT,
+		error = xfs_icsb_modify_inodes(mp, XFS_ICSB_ICOUNT,
 						 idelta, rsvd);
 		if (error)
 			goto out_undo_fdblocks;
 	}
 
 	if (ifreedelta) {
-		error = xfs_icsb_modify_counters(mp, XFS_SBS_IFREE,
+		error = xfs_icsb_modify_inodes(mp, XFS_ICSB_IFREE,
 						 ifreedelta, rsvd);
 		if (error)
 			goto out_undo_icount;
@@ -1129,13 +1128,13 @@ xfs_trans_unreserve_and_mod_sb(
 
 out_undo_ifreecount:
 	if (ifreedelta)
-		xfs_icsb_modify_counters(mp, XFS_SBS_IFREE, -ifreedelta, rsvd);
+		xfs_icsb_modify_inodes(mp, XFS_ICSB_IFREE, -ifreedelta, rsvd);
 out_undo_icount:
 	if (idelta)
-		xfs_icsb_modify_counters(mp, XFS_SBS_ICOUNT, -idelta, rsvd);
+		xfs_icsb_modify_inodes(mp, XFS_ICSB_ICOUNT, -idelta, rsvd);
 out_undo_fdblocks:
 	if (blkdelta)
-		xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS, -blkdelta, rsvd);
+		xfs_icsb_modify_free_blocks(mp, -blkdelta, rsvd);
 out:
 	ASSERT(error == 0);
 	return;
-- 
1.7.2.3


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] xfs: use generic per-cpu counter infrastructure
  2010-11-29  0:36 ` [PATCH 2/3] xfs: use generic per-cpu counter infrastructure Dave Chinner
@ 2010-11-29  9:16   ` Christoph Hellwig
  2010-11-30  4:37     ` Dave Chinner
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2010-11-29  9:16 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, linux-kernel, a.p.zijlstra

On Mon, Nov 29, 2010 at 11:36:40AM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> XFS has a per-cpu counter implementation for in-core superblock
> counters that pre-dated the generic implementation. It is complex
> and baroque as it is tailored directly to the needs of ENOSPC
> detection.
> 
> Now that the generic percpu counter infrastructure has the
> percpu_counter_add_unless_lt() function that implements the
> necessary threshold checks for us, switch the XFS per-cpu
> superblock counters to use the generic percpu counter
> infrastructure.

Looks good, but a few comments below:

> +/*
> + * Per-cpu incore superblock counters
> + *
> + * Simple concept, difficult implementation, now somewhat simplified by generic
> + * per-cpu counter support.  This provides distributed per cpu counters for
> + * contended fields (e.g.  free block count).

Historic comments like "now somewhat simplified by ..." don't make any
sense after only a short while.  I'd just remove the first sentence
above, as the details of the problems are explained much better later.

> +static inline int
> +xfs_icsb_add(
> +	struct xfs_mount	*mp,
> +	int			counter,
> +	int64_t			delta,
> +	int64_t			threshold)
> +{
> +	int			ret;
> +
> +	ret = percpu_counter_add_unless_lt(&mp->m_icsb[counter], delta,
> +								threshold);
> +	if (ret < 0)
> +		return -ENOSPC;
> +	return 0;
> +}
> +
> +static inline void
> +xfs_icsb_set(
> +	struct xfs_mount	*mp,
> +	int			counter,
> +	int64_t			value)
> +{
> +	percpu_counter_set(&mp->m_icsb[counter], value);
> +}
> +
> +static inline int64_t
> +xfs_icsb_sum(
> +	struct xfs_mount	*mp,
> +	int			counter)
> +{
> +	return percpu_counter_sum_positive(&mp->m_icsb[counter]);
> +}

I still don't like these wrappers.  They are all local to xfs_mount.c,
and each has only a single function calling it.  See the RFC patch below
which removes them, and imho makes the code more readable.  Especially
in the xfs_icsb_add case, where we can get rid of one level of error
remapping and go directly from the weird percpu return values to the
positive xfs errors instead of taking a detour via the negative linux
errors.

> +static inline int64_t
> +xfs_icsb_read(
> +	struct xfs_mount	*mp,
> +	int			counter)
> +{
> +	return percpu_counter_read_positive(&mp->m_icsb[counter]);
> +}

this one is entirely unused.

Index: xfs/fs/xfs/xfs_mount.c
===================================================================
--- xfs.orig/fs/xfs/xfs_mount.c	2010-11-29 09:43:31.423011248 +0100
+++ xfs/fs/xfs/xfs_mount.c	2010-11-29 09:56:32.546255345 +0100
@@ -282,54 +282,14 @@ xfs_free_perag(
  * so we only need to modify the fast path to handle per-cpu counter error
  * cases.
  */
-static inline int
-xfs_icsb_add(
-	struct xfs_mount	*mp,
-	int			counter,
-	int64_t			delta,
-	int64_t			threshold)
-{
-	int			ret;
-
-	ret = percpu_counter_add_unless_lt(&mp->m_icsb[counter], delta,
-								threshold);
-	if (ret < 0)
-		return -ENOSPC;
-	return 0;
-}
-
-static inline void
-xfs_icsb_set(
-	struct xfs_mount	*mp,
-	int			counter,
-	int64_t			value)
-{
-	percpu_counter_set(&mp->m_icsb[counter], value);
-}
-
-static inline int64_t
-xfs_icsb_sum(
-	struct xfs_mount	*mp,
-	int			counter)
-{
-	return percpu_counter_sum_positive(&mp->m_icsb[counter]);
-}
-
-static inline int64_t
-xfs_icsb_read(
-	struct xfs_mount	*mp,
-	int			counter)
-{
-	return percpu_counter_read_positive(&mp->m_icsb[counter]);
-}
-
 void
 xfs_icsb_reinit_counters(
 	struct xfs_mount	*mp)
 {
-	xfs_icsb_set(mp, XFS_ICSB_FDBLOCKS, mp->m_sb.sb_fdblocks);
-	xfs_icsb_set(mp, XFS_ICSB_IFREE, mp->m_sb.sb_ifree);
-	xfs_icsb_set(mp, XFS_ICSB_ICOUNT, mp->m_sb.sb_icount);
+	percpu_counter_set(&mp->m_icsb[XFS_ICSB_FDBLOCKS],
+			   mp->m_sb.sb_fdblocks);
+	percpu_counter_set(&mp->m_icsb[XFS_ICSB_IFREE], mp->m_sb.sb_ifree);
+	percpu_counter_set(&mp->m_icsb[XFS_ICSB_ICOUNT], mp->m_sb.sb_icount);
 }
 
 int
@@ -368,9 +328,12 @@ xfs_icsb_sync_counters(
 	xfs_mount_t	*mp)
 {
 	assert_spin_locked(&mp->m_sb_lock);
-	mp->m_sb.sb_icount = xfs_icsb_sum(mp, XFS_ICSB_ICOUNT);
-	mp->m_sb.sb_ifree = xfs_icsb_sum(mp, XFS_ICSB_IFREE);
-	mp->m_sb.sb_fdblocks = xfs_icsb_sum(mp, XFS_ICSB_FDBLOCKS);
+	mp->m_sb.sb_icount =
+		percpu_counter_sum_positive(&mp->m_icsb[XFS_ICSB_ICOUNT]);
+	mp->m_sb.sb_ifree =
+		percpu_counter_sum_positive(&mp->m_icsb[XFS_ICSB_IFREE]);
+	mp->m_sb.sb_fdblocks =
+		percpu_counter_sum_positive(&mp->m_icsb[XFS_ICSB_FDBLOCKS]);
 }
 
 /*
@@ -1953,7 +1916,7 @@ xfs_icsb_modify_counters(
 
 	switch (field) {
 	case XFS_SBS_ICOUNT:
-		ret = xfs_icsb_add(mp, cntr, delta, 0);
+		ret = percpu_counter_add_unless_lt(&mp->m_icsb[XFS_SBS_ICOUNT], delta, 0);
 		if (ret < 0) {
 			ASSERT(0);
 			return XFS_ERROR(EINVAL);
@@ -1961,7 +1924,7 @@ xfs_icsb_modify_counters(
 		return 0;
 
 	case XFS_SBS_IFREE:
-		ret = xfs_icsb_add(mp, XFS_ICSB_IFREE, delta, 0);
+		ret = percpu_counter_add_unless_lt(&mp->m_icsb[XFS_SBS_IFREE], delta, 0);
 		if (ret < 0) {
 			ASSERT(0);
 			return XFS_ERROR(EINVAL);
@@ -1990,13 +1953,12 @@ xfs_icsb_modify_counters(
 		}
 
 		/* try the change */
-		ret = xfs_icsb_add(mp, XFS_ICSB_FDBLOCKS, delta,
-						XFS_ALLOC_SET_ASIDE(mp));
+		ret = percpu_counter_add_unless_lt(&mp->m_icsb[XFS_ICSB_FDBLOCKS],
+						   delta, XFS_ALLOC_SET_ASIDE(mp));
 		if (likely(ret >= 0))
 			return 0;
 
 		/* ENOSPC */
-		ASSERT(ret == -ENOSPC);
 		ASSERT(delta < 0);
 
 		if (!rsvd)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] xfs: demultiplex xfs_icsb_modify_counters()
  2010-11-29  0:36 ` [PATCH 3/3] xfs: demultiplex xfs_icsb_modify_counters() Dave Chinner
@ 2010-11-29  9:19   ` Christoph Hellwig
  2010-11-30  4:38     ` Dave Chinner
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2010-11-29  9:19 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, linux-kernel, a.p.zijlstra

On Mon, Nov 29, 2010 at 11:36:41AM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> With the conversion to percpu counters, xfs_icsb_modify_counters() really does
> not need to exist. Convert the inode counter modifications to use a common
> helper function for the one place that calls them, and add another function for
> the free block modification and convert all the callers to use that.

> +xfs_icsb_modify_inodes(
> +	xfs_mount_t	*mp,

struct xfs_mount, please.

> +	int		cntr,
> +	int64_t		delta,
> +	int		rsvd)

the rsvd argument isn't used at all.

> +{
> +	int		ret = 0;
> +
> +	ASSERT(cntr == XFS_ICSB_ICOUNT || cntr == XFS_ICSB_IFREE);
> +
> +	ret = xfs_icsb_add(mp, cntr, delta, 0);
> +	if (ret < 0) {
> +		ASSERT(0);
> +		return XFS_ERROR(EINVAL);
> +	}
> +	return 0;

You could get rid of the ret variable as we don't care about the
value.  I also don't think we need the assert here - the caller already
does one for us.

> +xfs_icsb_modify_free_blocks(
> +	xfs_mount_t	*mp,

same here.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] lib: percpu counter add unless less than functionality
  2010-11-29  0:36 ` [PATCH 1/3] lib: percpu counter add unless less than functionality Dave Chinner
@ 2010-11-29 12:08   ` Peter Zijlstra
  2010-11-30  4:30     ` Dave Chinner
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2010-11-29 12:08 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, linux-kernel

On Mon, 2010-11-29 at 11:36 +1100, Dave Chinner wrote:
> +int percpu_counter_add_unless_lt(struct percpu_counter *fbc, s64 amount,
> +                                                       s64 threshold);

It might make sense to implement that as:

  __percpu_counter_add_unless_lt(struct percpu_counter *fbc,
				 s64 amount,
				 s64 threshold,
				 s32 batch);

That way it's consistent with the __percpu_counter_add() implementation
where you can also specify the batch value.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] lib: percpu counter add unless less than functionality
  2010-11-29 12:08   ` Peter Zijlstra
@ 2010-11-30  4:30     ` Dave Chinner
  0 siblings, 0 replies; 12+ messages in thread
From: Dave Chinner @ 2010-11-30  4:30 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: xfs, linux-kernel

On Mon, Nov 29, 2010 at 01:08:50PM +0100, Peter Zijlstra wrote:
> On Mon, 2010-11-29 at 11:36 +1100, Dave Chinner wrote:
> > +int percpu_counter_add_unless_lt(struct percpu_counter *fbc, s64 amount,
> > +                                                       s64 threshold);
> 
> It might make sense to implement that as:
> 
>   __percpu_counter_add_unless_lt(struct percpu_counter *fbc,
> 				 s64 amount,
> 				 s64 threshold,
> 				 s32 batch);
> 
> That way it's consistent with the __percpu_counter_add() implementation
> where you can also specify the batch value.

Good point. I'll fix that up.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] xfs: use generic per-cpu counter infrastructure
  2010-11-29  9:16   ` Christoph Hellwig
@ 2010-11-30  4:37     ` Dave Chinner
  0 siblings, 0 replies; 12+ messages in thread
From: Dave Chinner @ 2010-11-30  4:37 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs, linux-kernel, a.p.zijlstra

On Mon, Nov 29, 2010 at 04:16:10AM -0500, Christoph Hellwig wrote:
> On Mon, Nov 29, 2010 at 11:36:40AM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > XFS has a per-cpu counter implementation for in-core superblock
> > counters that pre-dated the generic implementation. It is complex
> > and baroque as it is tailored directly to the needs of ENOSPC
> > detection.
> > 
> > Now that the generic percpu counter infrastructure has the
> > percpu_counter_add_unless_lt() function that implements the
> > necessary threshold checks for us, switch the XFS per-cpu
> > superblock counters to use the generic percpu counter
> > infrastructure.
> 
> Looks good, but a few comments below:
> 
> > +/*
> > + * Per-cpu incore superblock counters
> > + *
> > + * Simple concept, difficult implementation, now somewhat simplified by generic
> > + * per-cpu counter support.  This provides distributed per cpu counters for
> > + * contended fields (e.g.  free block count).
> 
> Historic comments like "now somewhat simplified by ..." don't make any
> sense after only a short while.  I'd just remove the first sentence
> above, as the details of the problems are explained much better later.

Ok, will do.
> 
> > +static inline int
> > +xfs_icsb_add(
> > +	struct xfs_mount	*mp,
> > +	int			counter,
> > +	int64_t			delta,
> > +	int64_t			threshold)
> > +{
> > +	int			ret;
> > +
> > +	ret = percpu_counter_add_unless_lt(&mp->m_icsb[counter], delta,
> > +								threshold);
> > +	if (ret < 0)
> > +		return -ENOSPC;
> > +	return 0;
> > +}
> > +
> > +static inline void
> > +xfs_icsb_set(
> > +	struct xfs_mount	*mp,
> > +	int			counter,
> > +	int64_t			value)
> > +{
> > +	percpu_counter_set(&mp->m_icsb[counter], value);
> > +}
> > +
> > +static inline int64_t
> > +xfs_icsb_sum(
> > +	struct xfs_mount	*mp,
> > +	int			counter)
> > +{
> > +	return percpu_counter_sum_positive(&mp->m_icsb[counter]);
> > +}
> 
> I still don't like these wrappers.  They are all local to xfs_mount.c,
> and each has only a single function calling it.  See the RFC patch below
> which removes them, and imho makes the code more readable.  Especially
> in the xfs_icsb_add case, where we can get rid of one level of error
> remapping and go directly from the weird percpu return values to the
> positive xfs errors instead of taking a detour via the negative linux
> errors.

Ok, if we need to tweak the batch size in future, then we can deal
with it then. I'll clean it up as you suggest...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] xfs: demultiplex xfs_icsb_modify_counters()
  2010-11-29  9:19   ` Christoph Hellwig
@ 2010-11-30  4:38     ` Dave Chinner
  0 siblings, 0 replies; 12+ messages in thread
From: Dave Chinner @ 2010-11-30  4:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs, linux-kernel, a.p.zijlstra

On Mon, Nov 29, 2010 at 04:19:20AM -0500, Christoph Hellwig wrote:
> On Mon, Nov 29, 2010 at 11:36:41AM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > With the conversion to percpu counters, xfs_icsb_modify_counters() really does
> > not need to exist. Convert the inode counter modifications to use a common
> > helper function for the one place that calls them, and add another function for
> > the free block modification and convert all the callers to use that.
> 
> > +xfs_icsb_modify_inodes(
> > +	xfs_mount_t	*mp,
> 
> struct xfs_mount, please.
> 
> > +	int		cntr,
> > +	int64_t		delta,
> > +	int		rsvd)
> 
> the rsvd argument isn't used at all.
> 
> > +{
> > +	int		ret = 0;
> > +
> > +	ASSERT(cntr == XFS_ICSB_ICOUNT || cntr == XFS_ICSB_IFREE);
> > +
> > +	ret = xfs_icsb_add(mp, cntr, delta, 0);
> > +	if (ret < 0) {
> > +		ASSERT(0);
> > +		return XFS_ERROR(EINVAL);
> > +	}
> > +	return 0;
> 
> You could get rid of the ret variable as we don't care about the
> value.  I also don't think we need the assert here - the caller already
> does one for us.
> 
> > +xfs_icsb_modify_free_blocks(
> > +	xfs_mount_t	*mp,
> 
> same here.

Ok, makes sense. I'll clean it up.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 2/3] xfs: use generic per-cpu counter infrastructure
  2010-12-13  1:21 [PATCH 0/3] Use generic percpu counters in XFS V2 Dave Chinner
@ 2010-12-13  1:21 ` Dave Chinner
  2010-12-16 15:36   ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2010-12-13  1:21 UTC (permalink / raw)
  To: xfs; +Cc: linux-kernel, a.p.zijlstra

From: Dave Chinner <dchinner@redhat.com>

XFS has a per-cpu counter implementation for in-core superblock
counters that pre-dated the generic implementation. It is complex
and baroque as it is tailored directly to the needs of ENOSPC
detection.

Now that the generic percpu counter infrastructure has the
percpu_counter_add_unless_lt() function that implements the
necessary threshold checks for us, switch the XFS per-cpu
superblock counters to use the generic percpu counter
infrastructure.
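For readers unfamiliar with the new primitive, here is a minimal single-threaded userspace model of the add-unless-less-than semantics. The function name and return convention mirror the patch's usage (>= 0 means the add was applied), but this is illustrative only; the real API operates on a distributed struct percpu_counter and must account for per-cpu error bounds near the threshold.

```c
#include <stdint.h>

/*
 * Single-threaded model of the percpu_counter_add_unless_lt()
 * semantics: apply "delta" unless the result would drop below
 * "threshold", in which case the counter is left unchanged.
 * Returns 0 when the add was applied and -1 when it was rejected.
 */
static int
counter_add_unless_lt(int64_t *count, int64_t delta, int64_t threshold)
{
	if (*count + delta < threshold)
		return -1;
	*count += delta;
	return 0;
}
```

With a free-block count of 100 and threshold 0, a delta of -40 is applied (count becomes 60), while a further -70 is rejected and the count is left at 60, which is exactly the ENOSPC condition the XFS fast path needs to detect.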

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/linux-2.6/xfs_linux.h |    9 -
 fs/xfs/linux-2.6/xfs_super.c |    4 +-
 fs/xfs/xfs_fsops.c           |    4 +-
 fs/xfs/xfs_mount.c           |  802 ++++++++----------------------------------
 fs/xfs/xfs_mount.h           |   71 +---
 5 files changed, 171 insertions(+), 719 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_linux.h b/fs/xfs/linux-2.6/xfs_linux.h
index 214ddd7..9fa4f2a 100644
--- a/fs/xfs/linux-2.6/xfs_linux.h
+++ b/fs/xfs/linux-2.6/xfs_linux.h
@@ -88,15 +88,6 @@
 #include <xfs_super.h>
 #include <xfs_buf.h>
 
-/*
- * Feature macros (disable/enable)
- */
-#ifdef CONFIG_SMP
-#define HAVE_PERCPU_SB	/* per cpu superblock counters are a 2.6 feature */
-#else
-#undef  HAVE_PERCPU_SB	/* per cpu superblock counters are a 2.6 feature */
-#endif
-
 #define irix_sgid_inherit	xfs_params.sgid_inherit.val
 #define irix_symlink_mode	xfs_params.symlink_mode.val
 #define xfs_panic_mask		xfs_params.panic_mask.val
diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index c45b323..abcda07 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -1229,9 +1229,9 @@ xfs_fs_statfs(
 	statp->f_fsid.val[0] = (u32)id;
 	statp->f_fsid.val[1] = (u32)(id >> 32);
 
-	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
-
 	spin_lock(&mp->m_sb_lock);
+	xfs_icsb_sync_counters(mp);
+
 	statp->f_bsize = sbp->sb_blocksize;
 	lsize = sbp->sb_logstart ? sbp->sb_logblocks : 0;
 	statp->f_blocks = sbp->sb_dblocks - lsize;
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index a7c116e..fb9a9c8 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -478,8 +478,8 @@ xfs_fs_counts(
 	xfs_mount_t		*mp,
 	xfs_fsop_counts_t	*cnt)
 {
-	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
 	spin_lock(&mp->m_sb_lock);
+	xfs_icsb_sync_counters(mp);
 	cnt->freedata = mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
 	cnt->freertx = mp->m_sb.sb_frextents;
 	cnt->freeino = mp->m_sb.sb_ifree;
@@ -540,7 +540,7 @@ xfs_reserve_blocks(
 	 */
 retry:
 	spin_lock(&mp->m_sb_lock);
-	xfs_icsb_sync_counters_locked(mp, 0);
+	xfs_icsb_sync_counters(mp);
 
 	/*
 	 * If our previous reservation was larger than the current value,
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 19e9dfa..16be1b1 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -46,19 +46,6 @@
 
 STATIC void	xfs_unmountfs_wait(xfs_mount_t *);
 
-
-#ifdef HAVE_PERCPU_SB
-STATIC void	xfs_icsb_balance_counter(xfs_mount_t *, xfs_sb_field_t,
-						int);
-STATIC void	xfs_icsb_balance_counter_locked(xfs_mount_t *, xfs_sb_field_t,
-						int);
-STATIC void	xfs_icsb_disable_counter(xfs_mount_t *, xfs_sb_field_t);
-#else
-
-#define xfs_icsb_balance_counter(mp, a, b)		do { } while (0)
-#define xfs_icsb_balance_counter_locked(mp, a, b)	do { } while (0)
-#endif
-
 static const struct {
 	short offset;
 	short type;	/* 0 = integer
@@ -281,6 +268,71 @@ xfs_free_perag(
 }
 
 /*
+ * Per-cpu incore superblock counters
+ *
+ * This provides distributed per cpu counters for contended fields (e.g. free
+ * block count).  Difficulties arise in that the incore sb is used for ENOSPC
+ * checking, and hence needs to be accurately read when we are running low on
+ * space. Hence We need to check against counter error bounds and determine how
+ * accurately to sum based on that metric. The percpu counters take care of
+ * this for us, so we only need to modify the fast path to handle per-cpu
+ * counter error cases.
+ */
+void
+xfs_icsb_reinit_counters(
+	struct xfs_mount	*mp)
+{
+	percpu_counter_set(&mp->m_icsb[XFS_ICSB_FDBLOCKS],
+						mp->m_sb.sb_fdblocks);
+	percpu_counter_set(&mp->m_icsb[XFS_ICSB_IFREE], mp->m_sb.sb_ifree);
+	percpu_counter_set(&mp->m_icsb[XFS_ICSB_ICOUNT], mp->m_sb.sb_icount);
+}
+
+int
+xfs_icsb_init_counters(
+	struct xfs_mount	*mp)
+{
+	int			i;
+	int			error;
+
+	for (i = 0; i < XFS_ICSB_MAX; i++) {
+		error = percpu_counter_init(&mp->m_icsb[i], 0);
+		if (error)
+			goto out_error;
+	}
+	xfs_icsb_reinit_counters(mp);
+	return 0;
+
+out_error:
+	for (; i >= 0; i--)
+		percpu_counter_destroy(&mp->m_icsb[i]);
+	return error;
+}
+
+void
+xfs_icsb_destroy_counters(
+	xfs_mount_t	*mp)
+{
+	int		i;
+
+	for (i = 0; i < XFS_ICSB_MAX; i++)
+		percpu_counter_destroy(&mp->m_icsb[i]);
+}
+
+void
+xfs_icsb_sync_counters(
+	xfs_mount_t	*mp)
+{
+	assert_spin_locked(&mp->m_sb_lock);
+	mp->m_sb.sb_icount =
+		percpu_counter_sum_positive(&mp->m_icsb[XFS_ICSB_ICOUNT]);
+	mp->m_sb.sb_ifree =
+		percpu_counter_sum_positive(&mp->m_icsb[XFS_ICSB_IFREE]);
+	mp->m_sb.sb_fdblocks =
+		percpu_counter_sum_positive(&mp->m_icsb[XFS_ICSB_FDBLOCKS]);
+}
+
+/*
  * Check size of device based on the (data/realtime) block count.
  * Note: this check is used by the growfs code as well as mount.
  */
@@ -1562,7 +1614,9 @@ xfs_log_sbcount(
 	if (!xfs_fs_writable(mp))
 		return 0;
 
-	xfs_icsb_sync_counters(mp, 0);
+	spin_lock(&mp->m_sb_lock);
+	xfs_icsb_sync_counters(mp);
+	spin_unlock(&mp->m_sb_lock);
 
 	/*
 	 * we don't need to do this if we are updating the superblock
@@ -1674,9 +1728,8 @@ xfs_mod_incore_sb_unlocked(
 	int64_t		delta,
 	int		rsvd)
 {
-	int		scounter;	/* short counter for 32 bit fields */
-	long long	lcounter;	/* long counter for 64 bit fields */
-	long long	res_used, rem;
+	int		scounter = 0;	/* short counter for 32 bit fields */
+	long long	lcounter = 0;	/* long counter for 64 bit fields */
 
 	/*
 	 * With the in-core superblock spin lock held, switch
@@ -1685,66 +1738,6 @@ xfs_mod_incore_sb_unlocked(
 	 * 0, then do not apply the delta and return EINVAL.
 	 */
 	switch (field) {
-	case XFS_SBS_ICOUNT:
-		lcounter = (long long)mp->m_sb.sb_icount;
-		lcounter += delta;
-		if (lcounter < 0) {
-			ASSERT(0);
-			return XFS_ERROR(EINVAL);
-		}
-		mp->m_sb.sb_icount = lcounter;
-		return 0;
-	case XFS_SBS_IFREE:
-		lcounter = (long long)mp->m_sb.sb_ifree;
-		lcounter += delta;
-		if (lcounter < 0) {
-			ASSERT(0);
-			return XFS_ERROR(EINVAL);
-		}
-		mp->m_sb.sb_ifree = lcounter;
-		return 0;
-	case XFS_SBS_FDBLOCKS:
-		lcounter = (long long)
-			mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
-		res_used = (long long)(mp->m_resblks - mp->m_resblks_avail);
-
-		if (delta > 0) {		/* Putting blocks back */
-			if (res_used > delta) {
-				mp->m_resblks_avail += delta;
-			} else {
-				rem = delta - res_used;
-				mp->m_resblks_avail = mp->m_resblks;
-				lcounter += rem;
-			}
-		} else {				/* Taking blocks away */
-			lcounter += delta;
-			if (lcounter >= 0) {
-				mp->m_sb.sb_fdblocks = lcounter +
-							XFS_ALLOC_SET_ASIDE(mp);
-				return 0;
-			}
-
-			/*
-			 * We are out of blocks, use any available reserved
-			 * blocks if were allowed to.
-			 */
-			if (!rsvd)
-				return XFS_ERROR(ENOSPC);
-
-			lcounter = (long long)mp->m_resblks_avail + delta;
-			if (lcounter >= 0) {
-				mp->m_resblks_avail = lcounter;
-				return 0;
-			}
-			printk_once(KERN_WARNING
-				"Filesystem \"%s\": reserve blocks depleted! "
-				"Consider increasing reserve pool size.",
-				mp->m_fsname);
-			return XFS_ERROR(ENOSPC);
-		}
-
-		mp->m_sb.sb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp);
-		return 0;
 	case XFS_SBS_FREXTENTS:
 		lcounter = (long long)mp->m_sb.sb_frextents;
 		lcounter += delta;
@@ -1846,9 +1839,7 @@ xfs_mod_incore_sb(
 {
 	int			status;
 
-#ifdef HAVE_PERCPU_SB
 	ASSERT(field < XFS_SBS_ICOUNT || field > XFS_SBS_FDBLOCKS);
-#endif
 	spin_lock(&mp->m_sb_lock);
 	status = xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
 	spin_unlock(&mp->m_sb_lock);
@@ -1907,6 +1898,90 @@ unwind:
 	return error;
 }
 
+int
+xfs_icsb_modify_counters(
+	xfs_mount_t	*mp,
+	xfs_sb_field_t	field,
+	int64_t		delta,
+	int		rsvd)
+{
+	int64_t		lcounter;
+	int64_t		res_used;
+	int		ret = 0;
+
+
+	switch (field) {
+	case XFS_SBS_ICOUNT:
+		ret = percpu_counter_add_unless_lt(&mp->m_icsb[XFS_SBS_ICOUNT],
+							delta, 0);
+		if (ret < 0) {
+			ASSERT(0);
+			return XFS_ERROR(EINVAL);
+		}
+		return 0;
+
+	case XFS_SBS_IFREE:
+		ret = percpu_counter_add_unless_lt(&mp->m_icsb[XFS_SBS_IFREE],
+							delta, 0);
+		if (ret < 0) {
+			ASSERT(0);
+			return XFS_ERROR(EINVAL);
+		}
+		return 0;
+
+	case XFS_SBS_FDBLOCKS:
+		/*
+		 * if we are putting blocks back, put them into the reserve
+		 * block pool first.
+		 */
+		if (mp->m_resblks != mp->m_resblks_avail && delta > 0) {
+			spin_lock(&mp->m_sb_lock);
+			res_used = (int64_t)(mp->m_resblks -
+						mp->m_resblks_avail);
+			if (res_used > delta) {
+				mp->m_resblks_avail += delta;
+				delta = 0;
+			} else {
+				delta -= res_used;
+				mp->m_resblks_avail = mp->m_resblks;
+			}
+			spin_unlock(&mp->m_sb_lock);
+			if (!delta)
+				return 0;
+		}
+
+		/* try the change */
+		ret = percpu_counter_add_unless_lt(&mp->m_icsb[XFS_ICSB_FDBLOCKS],
+						delta, XFS_ALLOC_SET_ASIDE(mp));
+		if (likely(ret >= 0))
+			return 0;
+
+		/* ENOSPC */
+		ASSERT(delta < 0);
+
+		if (!rsvd)
+			return XFS_ERROR(ENOSPC);
+
+		spin_lock(&mp->m_sb_lock);
+		lcounter = (int64_t)mp->m_resblks_avail + delta;
+		if (lcounter >= 0) {
+			mp->m_resblks_avail = lcounter;
+			spin_unlock(&mp->m_sb_lock);
+			return 0;
+		}
+		spin_unlock(&mp->m_sb_lock);
+		printk_once(KERN_WARNING
+			"Filesystem \"%s\": reserve blocks depleted! "
+			"Consider increasing reserve pool size.",
+			mp->m_fsname);
+		return XFS_ERROR(ENOSPC);
+	default:
+		ASSERT(0);
+		return XFS_ERROR(EINVAL);
+	}
+	return 0;
+}
+
 /*
  * xfs_getsb() is called to obtain the buffer for the superblock.
  * The buffer is returned locked and read in from disk.
@@ -2000,572 +2075,3 @@ xfs_dev_is_read_only(
 	}
 	return 0;
 }
-
-#ifdef HAVE_PERCPU_SB
-/*
- * Per-cpu incore superblock counters
- *
- * Simple concept, difficult implementation
- *
- * Basically, replace the incore superblock counters with a distributed per cpu
- * counter for contended fields (e.g.  free block count).
- *
- * Difficulties arise in that the incore sb is used for ENOSPC checking, and
- * hence needs to be accurately read when we are running low on space. Hence
- * there is a method to enable and disable the per-cpu counters based on how
- * much "stuff" is available in them.
- *
- * Basically, a counter is enabled if there is enough free resource to justify
- * running a per-cpu fast-path. If the per-cpu counter runs out (i.e. a local
- * ENOSPC), then we disable the counters to synchronise all callers and
- * re-distribute the available resources.
- *
- * If, once we redistributed the available resources, we still get a failure,
- * we disable the per-cpu counter and go through the slow path.
- *
- * The slow path is the current xfs_mod_incore_sb() function.  This means that
- * when we disable a per-cpu counter, we need to drain its resources back to
- * the global superblock. We do this after disabling the counter to prevent
- * more threads from queueing up on the counter.
- *
- * Essentially, this means that we still need a lock in the fast path to enable
- * synchronisation between the global counters and the per-cpu counters. This
- * is not a problem because the lock will be local to a CPU almost all the time
- * and have little contention except when we get to ENOSPC conditions.
- *
- * Basically, this lock becomes a barrier that enables us to lock out the fast
- * path while we do things like enabling and disabling counters and
- * synchronising the counters.
- *
- * Locking rules:
- *
- * 	1. m_sb_lock before picking up per-cpu locks
- * 	2. per-cpu locks always picked up via for_each_online_cpu() order
- * 	3. accurate counter sync requires m_sb_lock + per cpu locks
- * 	4. modifying per-cpu counters requires holding per-cpu lock
- * 	5. modifying global counters requires holding m_sb_lock
- *	6. enabling or disabling a counter requires holding the m_sb_lock 
- *	   and _none_ of the per-cpu locks.
- *
- * Disabled counters are only ever re-enabled by a balance operation
- * that results in more free resources per CPU than a given threshold.
- * To ensure counters don't remain disabled, they are rebalanced when
- * the global resource goes above a higher threshold (i.e. some hysteresis
- * is present to prevent thrashing).
- */
-
-#ifdef CONFIG_HOTPLUG_CPU
-/*
- * hot-plug CPU notifier support.
- *
- * We need a notifier per filesystem as we need to be able to identify
- * the filesystem to balance the counters out. This is achieved by
- * having a notifier block embedded in the xfs_mount_t and doing pointer
- * magic to get the mount pointer from the notifier block address.
- */
-STATIC int
-xfs_icsb_cpu_notify(
-	struct notifier_block *nfb,
-	unsigned long action,
-	void *hcpu)
-{
-	xfs_icsb_cnts_t *cntp;
-	xfs_mount_t	*mp;
-
-	mp = (xfs_mount_t *)container_of(nfb, xfs_mount_t, m_icsb_notifier);
-	cntp = (xfs_icsb_cnts_t *)
-			per_cpu_ptr(mp->m_sb_cnts, (unsigned long)hcpu);
-	switch (action) {
-	case CPU_UP_PREPARE:
-	case CPU_UP_PREPARE_FROZEN:
-		/* Easy Case - initialize the area and locks, and
-		 * then rebalance when online does everything else for us. */
-		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
-		break;
-	case CPU_ONLINE:
-	case CPU_ONLINE_FROZEN:
-		xfs_icsb_lock(mp);
-		xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0);
-		xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0);
-		xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
-		xfs_icsb_unlock(mp);
-		break;
-	case CPU_DEAD:
-	case CPU_DEAD_FROZEN:
-		/* Disable all the counters, then fold the dead cpu's
-		 * count into the total on the global superblock and
-		 * re-enable the counters. */
-		xfs_icsb_lock(mp);
-		spin_lock(&mp->m_sb_lock);
-		xfs_icsb_disable_counter(mp, XFS_SBS_ICOUNT);
-		xfs_icsb_disable_counter(mp, XFS_SBS_IFREE);
-		xfs_icsb_disable_counter(mp, XFS_SBS_FDBLOCKS);
-
-		mp->m_sb.sb_icount += cntp->icsb_icount;
-		mp->m_sb.sb_ifree += cntp->icsb_ifree;
-		mp->m_sb.sb_fdblocks += cntp->icsb_fdblocks;
-
-		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
-
-		xfs_icsb_balance_counter_locked(mp, XFS_SBS_ICOUNT, 0);
-		xfs_icsb_balance_counter_locked(mp, XFS_SBS_IFREE, 0);
-		xfs_icsb_balance_counter_locked(mp, XFS_SBS_FDBLOCKS, 0);
-		spin_unlock(&mp->m_sb_lock);
-		xfs_icsb_unlock(mp);
-		break;
-	}
-
-	return NOTIFY_OK;
-}
-#endif /* CONFIG_HOTPLUG_CPU */
-
-int
-xfs_icsb_init_counters(
-	xfs_mount_t	*mp)
-{
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
-	mp->m_sb_cnts = alloc_percpu(xfs_icsb_cnts_t);
-	if (mp->m_sb_cnts == NULL)
-		return -ENOMEM;
-
-#ifdef CONFIG_HOTPLUG_CPU
-	mp->m_icsb_notifier.notifier_call = xfs_icsb_cpu_notify;
-	mp->m_icsb_notifier.priority = 0;
-	register_hotcpu_notifier(&mp->m_icsb_notifier);
-#endif /* CONFIG_HOTPLUG_CPU */
-
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
-	}
-
-	mutex_init(&mp->m_icsb_mutex);
-
-	/*
-	 * start with all counters disabled so that the
-	 * initial balance kicks us off correctly
-	 */
-	mp->m_icsb_counters = -1;
-	return 0;
-}
-
-void
-xfs_icsb_reinit_counters(
-	xfs_mount_t	*mp)
-{
-	xfs_icsb_lock(mp);
-	/*
-	 * start with all counters disabled so that the
-	 * initial balance kicks us off correctly
-	 */
-	mp->m_icsb_counters = -1;
-	xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0);
-	xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0);
-	xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
-	xfs_icsb_unlock(mp);
-}
-
-void
-xfs_icsb_destroy_counters(
-	xfs_mount_t	*mp)
-{
-	if (mp->m_sb_cnts) {
-		unregister_hotcpu_notifier(&mp->m_icsb_notifier);
-		free_percpu(mp->m_sb_cnts);
-	}
-	mutex_destroy(&mp->m_icsb_mutex);
-}
-
-STATIC void
-xfs_icsb_lock_cntr(
-	xfs_icsb_cnts_t	*icsbp)
-{
-	while (test_and_set_bit(XFS_ICSB_FLAG_LOCK, &icsbp->icsb_flags)) {
-		ndelay(1000);
-	}
-}
-
-STATIC void
-xfs_icsb_unlock_cntr(
-	xfs_icsb_cnts_t	*icsbp)
-{
-	clear_bit(XFS_ICSB_FLAG_LOCK, &icsbp->icsb_flags);
-}
-
-
-STATIC void
-xfs_icsb_lock_all_counters(
-	xfs_mount_t	*mp)
-{
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		xfs_icsb_lock_cntr(cntp);
-	}
-}
-
-STATIC void
-xfs_icsb_unlock_all_counters(
-	xfs_mount_t	*mp)
-{
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		xfs_icsb_unlock_cntr(cntp);
-	}
-}
-
-STATIC void
-xfs_icsb_count(
-	xfs_mount_t	*mp,
-	xfs_icsb_cnts_t	*cnt,
-	int		flags)
-{
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
-	memset(cnt, 0, sizeof(xfs_icsb_cnts_t));
-
-	if (!(flags & XFS_ICSB_LAZY_COUNT))
-		xfs_icsb_lock_all_counters(mp);
-
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		cnt->icsb_icount += cntp->icsb_icount;
-		cnt->icsb_ifree += cntp->icsb_ifree;
-		cnt->icsb_fdblocks += cntp->icsb_fdblocks;
-	}
-
-	if (!(flags & XFS_ICSB_LAZY_COUNT))
-		xfs_icsb_unlock_all_counters(mp);
-}
-
-STATIC int
-xfs_icsb_counter_disabled(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field)
-{
-	ASSERT((field >= XFS_SBS_ICOUNT) && (field <= XFS_SBS_FDBLOCKS));
-	return test_bit(field, &mp->m_icsb_counters);
-}
-
-STATIC void
-xfs_icsb_disable_counter(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field)
-{
-	xfs_icsb_cnts_t	cnt;
-
-	ASSERT((field >= XFS_SBS_ICOUNT) && (field <= XFS_SBS_FDBLOCKS));
-
-	/*
-	 * If we are already disabled, then there is nothing to do
-	 * here. We check before locking all the counters to avoid
-	 * the expensive lock operation when being called in the
-	 * slow path and the counter is already disabled. This is
-	 * safe because the only time we set or clear this state is under
-	 * the m_icsb_mutex.
-	 */
-	if (xfs_icsb_counter_disabled(mp, field))
-		return;
-
-	xfs_icsb_lock_all_counters(mp);
-	if (!test_and_set_bit(field, &mp->m_icsb_counters)) {
-		/* drain back to superblock */
-
-		xfs_icsb_count(mp, &cnt, XFS_ICSB_LAZY_COUNT);
-		switch(field) {
-		case XFS_SBS_ICOUNT:
-			mp->m_sb.sb_icount = cnt.icsb_icount;
-			break;
-		case XFS_SBS_IFREE:
-			mp->m_sb.sb_ifree = cnt.icsb_ifree;
-			break;
-		case XFS_SBS_FDBLOCKS:
-			mp->m_sb.sb_fdblocks = cnt.icsb_fdblocks;
-			break;
-		default:
-			BUG();
-		}
-	}
-
-	xfs_icsb_unlock_all_counters(mp);
-}
-
-STATIC void
-xfs_icsb_enable_counter(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field,
-	uint64_t	count,
-	uint64_t	resid)
-{
-	xfs_icsb_cnts_t	*cntp;
-	int		i;
-
-	ASSERT((field >= XFS_SBS_ICOUNT) && (field <= XFS_SBS_FDBLOCKS));
-
-	xfs_icsb_lock_all_counters(mp);
-	for_each_online_cpu(i) {
-		cntp = per_cpu_ptr(mp->m_sb_cnts, i);
-		switch (field) {
-		case XFS_SBS_ICOUNT:
-			cntp->icsb_icount = count + resid;
-			break;
-		case XFS_SBS_IFREE:
-			cntp->icsb_ifree = count + resid;
-			break;
-		case XFS_SBS_FDBLOCKS:
-			cntp->icsb_fdblocks = count + resid;
-			break;
-		default:
-			BUG();
-			break;
-		}
-		resid = 0;
-	}
-	clear_bit(field, &mp->m_icsb_counters);
-	xfs_icsb_unlock_all_counters(mp);
-}
-
-void
-xfs_icsb_sync_counters_locked(
-	xfs_mount_t	*mp,
-	int		flags)
-{
-	xfs_icsb_cnts_t	cnt;
-
-	xfs_icsb_count(mp, &cnt, flags);
-
-	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_ICOUNT))
-		mp->m_sb.sb_icount = cnt.icsb_icount;
-	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_IFREE))
-		mp->m_sb.sb_ifree = cnt.icsb_ifree;
-	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_FDBLOCKS))
-		mp->m_sb.sb_fdblocks = cnt.icsb_fdblocks;
-}
-
-/*
- * Accurate update of per-cpu counters to incore superblock
- */
-void
-xfs_icsb_sync_counters(
-	xfs_mount_t	*mp,
-	int		flags)
-{
-	spin_lock(&mp->m_sb_lock);
-	xfs_icsb_sync_counters_locked(mp, flags);
-	spin_unlock(&mp->m_sb_lock);
-}
-
-/*
- * Balance and enable/disable counters as necessary.
- *
- * Thresholds for re-enabling counters are somewhat magic.  inode counts are
- * chosen to be the same number as single on disk allocation chunk per CPU, and
- * free blocks is something far enough zero that we aren't going thrash when we
- * get near ENOSPC. We also need to supply a minimum we require per cpu to
- * prevent looping endlessly when xfs_alloc_space asks for more than will
- * be distributed to a single CPU but each CPU has enough blocks to be
- * reenabled.
- *
- * Note that we can be called when counters are already disabled.
- * xfs_icsb_disable_counter() optimises the counter locking in this case to
- * prevent locking every per-cpu counter needlessly.
- */
-
-#define XFS_ICSB_INO_CNTR_REENABLE	(uint64_t)64
-#define XFS_ICSB_FDBLK_CNTR_REENABLE(mp) \
-		(uint64_t)(512 + XFS_ALLOC_SET_ASIDE(mp))
-STATIC void
-xfs_icsb_balance_counter_locked(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t  field,
-	int		min_per_cpu)
-{
-	uint64_t	count, resid;
-	int		weight = num_online_cpus();
-	uint64_t	min = (uint64_t)min_per_cpu;
-
-	/* disable counter and sync counter */
-	xfs_icsb_disable_counter(mp, field);
-
-	/* update counters  - first CPU gets residual*/
-	switch (field) {
-	case XFS_SBS_ICOUNT:
-		count = mp->m_sb.sb_icount;
-		resid = do_div(count, weight);
-		if (count < max(min, XFS_ICSB_INO_CNTR_REENABLE))
-			return;
-		break;
-	case XFS_SBS_IFREE:
-		count = mp->m_sb.sb_ifree;
-		resid = do_div(count, weight);
-		if (count < max(min, XFS_ICSB_INO_CNTR_REENABLE))
-			return;
-		break;
-	case XFS_SBS_FDBLOCKS:
-		count = mp->m_sb.sb_fdblocks;
-		resid = do_div(count, weight);
-		if (count < max(min, XFS_ICSB_FDBLK_CNTR_REENABLE(mp)))
-			return;
-		break;
-	default:
-		BUG();
-		count = resid = 0;	/* quiet, gcc */
-		break;
-	}
-
-	xfs_icsb_enable_counter(mp, field, count, resid);
-}
-
-STATIC void
-xfs_icsb_balance_counter(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t  fields,
-	int		min_per_cpu)
-{
-	spin_lock(&mp->m_sb_lock);
-	xfs_icsb_balance_counter_locked(mp, fields, min_per_cpu);
-	spin_unlock(&mp->m_sb_lock);
-}
-
-int
-xfs_icsb_modify_counters(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field,
-	int64_t		delta,
-	int		rsvd)
-{
-	xfs_icsb_cnts_t	*icsbp;
-	long long	lcounter;	/* long counter for 64 bit fields */
-	int		ret = 0;
-
-	might_sleep();
-again:
-	preempt_disable();
-	icsbp = this_cpu_ptr(mp->m_sb_cnts);
-
-	/*
-	 * if the counter is disabled, go to slow path
-	 */
-	if (unlikely(xfs_icsb_counter_disabled(mp, field)))
-		goto slow_path;
-	xfs_icsb_lock_cntr(icsbp);
-	if (unlikely(xfs_icsb_counter_disabled(mp, field))) {
-		xfs_icsb_unlock_cntr(icsbp);
-		goto slow_path;
-	}
-
-	switch (field) {
-	case XFS_SBS_ICOUNT:
-		lcounter = icsbp->icsb_icount;
-		lcounter += delta;
-		if (unlikely(lcounter < 0))
-			goto balance_counter;
-		icsbp->icsb_icount = lcounter;
-		break;
-
-	case XFS_SBS_IFREE:
-		lcounter = icsbp->icsb_ifree;
-		lcounter += delta;
-		if (unlikely(lcounter < 0))
-			goto balance_counter;
-		icsbp->icsb_ifree = lcounter;
-		break;
-
-	case XFS_SBS_FDBLOCKS:
-		BUG_ON((mp->m_resblks - mp->m_resblks_avail) != 0);
-
-		lcounter = icsbp->icsb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
-		lcounter += delta;
-		if (unlikely(lcounter < 0))
-			goto balance_counter;
-		icsbp->icsb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp);
-		break;
-	default:
-		BUG();
-		break;
-	}
-	xfs_icsb_unlock_cntr(icsbp);
-	preempt_enable();
-	return 0;
-
-slow_path:
-	preempt_enable();
-
-	/*
-	 * serialise with a mutex so we don't burn lots of cpu on
-	 * the superblock lock. We still need to hold the superblock
-	 * lock, however, when we modify the global structures.
-	 */
-	xfs_icsb_lock(mp);
-
-	/*
-	 * Now running atomically.
-	 *
-	 * If the counter is enabled, someone has beaten us to rebalancing.
-	 * Drop the lock and try again in the fast path....
-	 */
-	if (!(xfs_icsb_counter_disabled(mp, field))) {
-		xfs_icsb_unlock(mp);
-		goto again;
-	}
-
-	/*
-	 * The counter is currently disabled. Because we are
-	 * running atomically here, we know a rebalance cannot
-	 * be in progress. Hence we can go straight to operating
-	 * on the global superblock. We do not call xfs_mod_incore_sb()
-	 * here even though we need to get the m_sb_lock. Doing so
-	 * will cause us to re-enter this function and deadlock.
-	 * Hence we get the m_sb_lock ourselves and then call
-	 * xfs_mod_incore_sb_unlocked() as the unlocked path operates
-	 * directly on the global counters.
-	 */
-	spin_lock(&mp->m_sb_lock);
-	ret = xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
-	spin_unlock(&mp->m_sb_lock);
-
-	/*
-	 * Now that we've modified the global superblock, we
-	 * may be able to re-enable the distributed counters
-	 * (e.g. lots of space just got freed). After that
-	 * we are done.
-	 */
-	if (ret != ENOSPC)
-		xfs_icsb_balance_counter(mp, field, 0);
-	xfs_icsb_unlock(mp);
-	return ret;
-
-balance_counter:
-	xfs_icsb_unlock_cntr(icsbp);
-	preempt_enable();
-
-	/*
-	 * We may have multiple threads here if multiple per-cpu
-	 * counters run dry at the same time. This will mean we can
-	 * do more balances than strictly necessary but it is not
-	 * the common slowpath case.
-	 */
-	xfs_icsb_lock(mp);
-
-	/*
-	 * running atomically.
-	 *
-	 * This will leave the counter in the correct state for future
-	 * accesses. After the rebalance, we simply try again and our retry
-	 * will either succeed through the fast path or slow path without
-	 * another balance operation being required.
-	 */
-	xfs_icsb_balance_counter(mp, field, delta);
-	xfs_icsb_unlock(mp);
-	goto again;
-}
-
-#endif
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 5861b49..42d31df 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -65,44 +65,19 @@ struct xfs_nameops;
 struct xfs_ail;
 struct xfs_quotainfo;
 
-#ifdef HAVE_PERCPU_SB
-
 /*
- * Valid per-cpu incore superblock counters. Note that if you add new counters,
- * you may need to define new counter disabled bit field descriptors as there
- * are more possible fields in the superblock that can fit in a bitfield on a
- * 32 bit platform. The XFS_SBS_* values for the current current counters just
- * fit.
+ * Per-cpu incore superblock counters.
  */
-typedef struct xfs_icsb_cnts {
-	uint64_t	icsb_fdblocks;
-	uint64_t	icsb_ifree;
-	uint64_t	icsb_icount;
-	unsigned long	icsb_flags;
-} xfs_icsb_cnts_t;
-
-#define XFS_ICSB_FLAG_LOCK	(1 << 0)	/* counter lock bit */
-
-#define XFS_ICSB_LAZY_COUNT	(1 << 1)	/* accuracy not needed */
+enum {
+	XFS_ICSB_FDBLOCKS = 0,
+	XFS_ICSB_IFREE,
+	XFS_ICSB_ICOUNT,
+	XFS_ICSB_MAX,
+};
 
-extern int	xfs_icsb_init_counters(struct xfs_mount *);
-extern void	xfs_icsb_reinit_counters(struct xfs_mount *);
-extern void	xfs_icsb_destroy_counters(struct xfs_mount *);
-extern void	xfs_icsb_sync_counters(struct xfs_mount *, int);
-extern void	xfs_icsb_sync_counters_locked(struct xfs_mount *, int);
 extern int	xfs_icsb_modify_counters(struct xfs_mount *, xfs_sb_field_t,
 						int64_t, int);
 
-#else
-#define xfs_icsb_init_counters(mp)		(0)
-#define xfs_icsb_destroy_counters(mp)		do { } while (0)
-#define xfs_icsb_reinit_counters(mp)		do { } while (0)
-#define xfs_icsb_sync_counters(mp, flags)	do { } while (0)
-#define xfs_icsb_sync_counters_locked(mp, flags) do { } while (0)
-#define xfs_icsb_modify_counters(mp, field, delta, rsvd) \
-	xfs_mod_incore_sb(mp, field, delta, rsvd)
-#endif
-
 typedef struct xfs_mount {
 	struct super_block	*m_super;
 	xfs_tid_t		m_tid;		/* next unused tid for fs */
@@ -186,12 +161,6 @@ typedef struct xfs_mount {
 	struct xfs_chash	*m_chash;	/* fs private inode per-cluster
 						 * hash table */
 	atomic_t		m_active_trans;	/* number trans frozen */
-#ifdef HAVE_PERCPU_SB
-	xfs_icsb_cnts_t __percpu *m_sb_cnts;	/* per-cpu superblock counters */
-	unsigned long		m_icsb_counters; /* disabled per-cpu counters */
-	struct notifier_block	m_icsb_notifier; /* hotplug cpu notifier */
-	struct mutex		m_icsb_mutex;	/* balancer sync lock */
-#endif
 	struct xfs_mru_cache	*m_filestream;  /* per-mount filestream data */
 	struct task_struct	*m_sync_task;	/* generalised sync thread */
 	xfs_sync_work_t		m_sync_work;	/* work item for VFS_SYNC */
@@ -202,6 +171,7 @@ typedef struct xfs_mount {
 	__int64_t		m_update_flags;	/* sb flags we need to update
 						   on the next remount,rw */
 	struct shrinker		m_inode_shrink;	/* inode reclaim shrinker */
+	struct percpu_counter	m_icsb[XFS_ICSB_MAX];
 } xfs_mount_t;
 
 /*
@@ -333,26 +303,6 @@ struct xfs_perag *xfs_perag_get_tag(struct xfs_mount *mp, xfs_agnumber_t agno,
 void	xfs_perag_put(struct xfs_perag *pag);
 
 /*
- * Per-cpu superblock locking functions
- */
-#ifdef HAVE_PERCPU_SB
-static inline void
-xfs_icsb_lock(xfs_mount_t *mp)
-{
-	mutex_lock(&mp->m_icsb_mutex);
-}
-
-static inline void
-xfs_icsb_unlock(xfs_mount_t *mp)
-{
-	mutex_unlock(&mp->m_icsb_mutex);
-}
-#else
-#define xfs_icsb_lock(mp)
-#define xfs_icsb_unlock(mp)
-#endif
-
-/*
  * This structure is for use by the xfs_mod_incore_sb_batch() routine.
  * xfs_growfs can specify a few fields which are more than int limit
  */
@@ -379,6 +329,11 @@ extern int	xfs_sb_validate_fsb_count(struct xfs_sb *, __uint64_t);
 
 extern int	xfs_dev_is_read_only(struct xfs_mount *, char *);
 
+extern int	xfs_icsb_init_counters(struct xfs_mount *);
+extern void	xfs_icsb_reinit_counters(struct xfs_mount *);
+extern void	xfs_icsb_destroy_counters(struct xfs_mount *);
+extern void	xfs_icsb_sync_counters(struct xfs_mount *);
+
 #endif	/* __KERNEL__ */
 
 extern void	xfs_mod_sb(struct xfs_trans *, __int64_t);
-- 
1.7.2.3



* Re: [PATCH 2/3] xfs: use generic per-cpu counter infrastructure
  2010-12-13  1:21 ` [PATCH 2/3] xfs: use generic per-cpu counter infrastructure Dave Chinner
@ 2010-12-16 15:36   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2010-12-16 15:36 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, linux-kernel, a.p.zijlstra

On Mon, Dec 13, 2010 at 12:21:52PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> XFS has a per-cpu counter implementation for in-core superblock
> counters that pre-dated the generic implementation. It is complex
> and baroque as it is tailored directly to the needs of ENOSPC
> detection.
> 
> Now that the generic percpu counter infrastructure has the
> percpu_counter_add_unless_lt() function that implements the
> necessary threshold checks for us, switch the XFS per-cpu
> superblock counters to use the generic percpu counter
> infrastructure.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

Looks good to me,

Reviewed-by: Christoph Hellwig <hch@lst.de>

A little nitpick:

> -#ifdef HAVE_PERCPU_SB
>  	ASSERT(field < XFS_SBS_ICOUNT || field > XFS_SBS_FDBLOCKS);
> -#endif

No need to keep this assert - xfs_mod_incore_sb_unlocked already
has one for unknown fields.
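A toy model of why the range assert is redundant: the percpu-handled fields have no case in the unlocked switch after this series, so they already fall through to the default case and fail there. Field names and the EINVAL constant are abbreviated stand-ins for illustration:

```c
#define EINVAL	22

/* Fields 0..2 (icount, ifree, fdblocks) are percpu-handled and must
 * never reach the locked slow path; anything else goes through the
 * switch.  A percpu or unknown field lands in the default case, so a
 * range assert in the caller adds nothing. */
enum { SBS_ICOUNT, SBS_IFREE, SBS_FDBLOCKS, SBS_FREXTENTS };

static int
mod_incore_sb_unlocked_model(int field)
{
	switch (field) {
	case SBS_FREXTENTS:
		return 0;	/* legitimately handled under m_sb_lock */
	default:
		return EINVAL;	/* unknown or percpu-only field */
	}
}
```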



end of thread, other threads:[~2010-12-16 15:36 UTC | newest]

Thread overview: 12+ messages
2010-11-29  0:36 [PATCH 0/3] Use generic percpu counters in XFS Dave Chinner
2010-11-29  0:36 ` [PATCH 1/3] lib: percpu counter add unless less than functionality Dave Chinner
2010-11-29 12:08   ` Peter Zijlstra
2010-11-30  4:30     ` Dave Chinner
2010-11-29  0:36 ` [PATCH 2/3] xfs: use generic per-cpu counter infrastructure Dave Chinner
2010-11-29  9:16   ` Christoph Hellwig
2010-11-30  4:37     ` Dave Chinner
2010-11-29  0:36 ` [PATCH 3/3] xfs: demultiplex xfs_icsb_modify_counters() Dave Chinner
2010-11-29  9:19   ` Christoph Hellwig
2010-11-30  4:38     ` Dave Chinner
  -- strict thread matches above, loose matches on Subject: below --
2010-12-13  1:21 [PATCH 0/3] Use generic percpu counters in XFS V2 Dave Chinner
2010-12-13  1:21 ` [PATCH 2/3] xfs: use generic per-cpu counter infrastructure Dave Chinner
2010-12-16 15:36   ` Christoph Hellwig
