From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 03/34] lib: percpu counter add unless less than functionality
Date: Tue, 21 Dec 2010 18:28:59 +1100 [thread overview]
Message-ID: <1292916570-25015-4-git-send-email-david@fromorbit.com> (raw)
In-Reply-To: <1292916570-25015-1-git-send-email-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
To use the generic percpu counter infrastructure for counters that
require conditional addition based on a threshold value, we need
special handling of the counter. Further, the caller needs to know
the status of the conditional addition to determine what action to
take depending on whether the addition occurred or not. Examples of
this sort of usage are resource counters that cannot go below zero
(e.g. filesystem free blocks).
To allow XFS to replace its complex roll-your-own per-cpu
superblock counters, a single generic conditional function is
required: percpu_counter_add_unless_lt(). This will add the amount
to the counter unless the result would be less than the given
threshold. A caller-supplied threshold is required because XFS does
not necessarily use the same threshold for every counter.
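For illustration only (not part of this patch), a free block
reservation built on the new primitive could look something like the
sketch below; the function name and the reserved-pool parameter are
hypothetical and exist only for the example:

#include <linux/percpu_counter.h>
#include <linux/errno.h>

/*
 * Hypothetical sketch: try to reserve nblocks free blocks, failing
 * with ENOSPC if the counter would drop below the reserved pool.
 */
static int example_reserve_blocks(struct percpu_counter *free_blocks,
                                  s64 nblocks, s64 reserved_pool)
{
        /* subtract nblocks unless the result would fall below the pool */
        if (percpu_counter_add_unless_lt(free_blocks, -nblocks,
                                         reserved_pool) < 0)
                return -ENOSPC;         /* nothing was subtracted */
        return 0;                       /* counter is now >= reserved_pool */
}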
percpu_counter_add_unless_lt() attempts to minimise counter lock
traversals by only taking the counter lock when the threshold is
within the error range of the current counter value. Hence when the
threshold is not within the counter error range, the counter will
still have the same scalability characteristics as the normal
percpu_counter_add() function.
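To put a rough number on that error range (a worked example, not
taken from the patch itself): the implementation treats
2 * batch * num_online_cpus() as the worst-case drift between the
approximate global count and the true value. Assuming a batch size
of 32 on a 4-CPU machine:

        /* worked example only: batch = 32, 4 online CPUs */
        s64 error = 2 * 32 * 4;         /* = 256 */

so when the current approximate count is more than 256 above the
threshold (and the change is smaller than the batch), the add goes
through the normal lockless per-cpu path; only counts within that
256-count window of the threshold fall back to taking the counter
lock and summing the per-cpu deltas exactly.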
Adding this functionality to the generic percpu counters allows us
to remove the much more complex and less efficient XFS percpu
counter code (~700 lines of code) and replace it with generic
percpu counters.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
include/linux/percpu_counter.h | 27 ++++++++++++++
lib/percpu_counter.c | 79 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 106 insertions(+), 0 deletions(-)
diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h
index 46f6ba5..ad18779 100644
--- a/include/linux/percpu_counter.h
+++ b/include/linux/percpu_counter.h
@@ -41,12 +41,21 @@ void percpu_counter_set(struct percpu_counter *fbc, s64 amount);
void __percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32 batch);
s64 __percpu_counter_sum(struct percpu_counter *fbc);
int percpu_counter_compare(struct percpu_counter *fbc, s64 rhs);
+int __percpu_counter_add_unless_lt(struct percpu_counter *fbc, s64 amount,
+ s64 threshold, s32 batch);
static inline void percpu_counter_add(struct percpu_counter *fbc, s64 amount)
{
__percpu_counter_add(fbc, amount, percpu_counter_batch);
}
+static inline int percpu_counter_add_unless_lt(struct percpu_counter *fbc,
+ s64 amount, s64 threshold)
+{
+ return __percpu_counter_add_unless_lt(fbc, amount, threshold,
+ percpu_counter_batch);
+}
+
static inline s64 percpu_counter_sum_positive(struct percpu_counter *fbc)
{
s64 ret = __percpu_counter_sum(fbc);
@@ -153,6 +162,24 @@ static inline int percpu_counter_initialized(struct percpu_counter *fbc)
return 1;
}
+static inline int percpu_counter_add_unless_lt(struct percpu_counter *fbc, s64 amount,
+ s64 threshold)
+{
+ s64 count;
+ int ret = -1;
+
+ preempt_disable();
+ count = fbc->count + amount;
+ if (count < threshold)
+ goto out;
+ fbc->count = count;
+ ret = count == threshold ? 0 : 1;
+out:
+ preempt_enable();
+ return ret;
+}
+
+
#endif /* CONFIG_SMP */
static inline void percpu_counter_inc(struct percpu_counter *fbc)
diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
index 604678d..eacccb7 100644
--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -213,6 +213,85 @@ int percpu_counter_compare(struct percpu_counter *fbc, s64 rhs)
}
EXPORT_SYMBOL(percpu_counter_compare);
+/**
+ * __percpu_counter_add_unless_lt - add to a counter avoiding underruns
+ * @fbc: counter
+ * @amount: amount to add
+ * @threshold: underrun threshold
+ * @batch: percpu counter batch size.
+ *
+ * Add @amount to @fbc if and only if the result of the addition is greater
+ * than or equal to @threshold. Return 1 if greater and added, 0 if equal and
+ * added, and -1 if an underrun would have occurred and nothing was added.
+ *
+ * This is useful for operations that must atomically add a delta to a counter
+ * only if the result stays at or above a given threshold (e.g. for free space
+ * accounting with ENOSPC checking in filesystems).
+ */
+int __percpu_counter_add_unless_lt(struct percpu_counter *fbc, s64 amount,
+ s64 threshold, s32 batch)
+{
+ s64 count;
+ s64 error = 2 * batch * num_online_cpus();
+ int cpu;
+ int ret = -1;
+
+ preempt_disable();
+
+ /* Check to see if rough count will be sufficient for comparison */
+ count = percpu_counter_read(fbc);
+ if (count + amount < threshold - error)
+ goto out;
+
+ /*
+ * If the counter is over the threshold and the change is less than the
+ * batch size, we might be able to avoid locking.
+ */
+ if (count > threshold + error && abs(amount) < batch) {
+ __percpu_counter_add(fbc, amount, batch);
+ ret = 1;
+ goto out;
+ }
+
+ /*
+ * If the result is over the error threshold, we can just add it
+ * into the global counter ignoring what is in the per-cpu counters
+ * as they will not change the result of the calculation.
+ */
+ spin_lock(&fbc->lock);
+ if (fbc->count + amount > threshold + error) {
+ fbc->count += amount;
+ ret = 1;
+ goto out_unlock;
+ }
+
+ /*
+ * Result is within the error margin. Run an open-coded sum of the
+ * per-cpu counters to get the exact value at this point in time, and
+ * if the result is greater than or equal to the threshold, add the
+ * amount to the global counter.
+ */
+ count = fbc->count;
+ for_each_online_cpu(cpu) {
+ s32 *pcount = per_cpu_ptr(fbc->counters, cpu);
+ count += *pcount;
+ }
+ WARN_ON(count < threshold);
+
+ if (count + amount >= threshold) {
+ ret = 0;
+ if (count + amount > threshold)
+ ret = 1;
+ fbc->count += amount;
+ }
+out_unlock:
+ spin_unlock(&fbc->lock);
+out:
+ preempt_enable();
+ return ret;
+}
+EXPORT_SYMBOL(__percpu_counter_add_unless_lt);
+
static int __init percpu_counter_startup(void)
{
compute_batch_value();
--
1.7.2.3
Thread overview (52+ messages):
2010-12-21 7:28 [PATCH 00/34] xfs: scalability patchset for 2.6.38 Dave Chinner
2010-12-21 7:28 ` [PATCH 01/34] xfs: provide a inode iolock lockdep class Dave Chinner
2010-12-21 15:15 ` Christoph Hellwig
2010-12-21 7:28 ` [PATCH 02/34] xfs: use KM_NOFS for allocations during attribute list operations Dave Chinner
2010-12-21 15:16 ` Christoph Hellwig
2010-12-21 7:28 ` Dave Chinner [this message]
2010-12-22 2:20 ` [PATCH 03/34] lib: percpu counter add unless less than functionality Alex Elder
2010-12-22 3:46 ` Dave Chinner
2010-12-21 7:29 ` [PATCH 04/34] xfs: use generic per-cpu counter infrastructure Dave Chinner
2010-12-21 7:29 ` [PATCH 05/34] xfs: demultiplex xfs_icsb_modify_counters() Dave Chinner
2010-12-21 7:29 ` [PATCH 06/34] xfs: dynamic speculative EOF preallocation Dave Chinner
2010-12-21 15:15 ` Christoph Hellwig
2010-12-21 21:42 ` Dave Chinner
2010-12-21 23:44 ` Dave Chinner
2010-12-22 2:29 ` Alex Elder
2010-12-29 12:56 ` Christoph Hellwig
2010-12-27 14:57 ` Alex Elder
2010-12-27 15:00 ` Alex Elder
2011-01-06 18:16 ` Christoph Hellwig
2010-12-21 7:29 ` [PATCH 07/34] xfs: don't truncate prealloc from frequently accessed inodes Dave Chinner
2010-12-21 16:45 ` Christoph Hellwig
2010-12-21 7:29 ` [PATCH 08/34] xfs: rcu free inodes Dave Chinner
2010-12-21 7:29 ` [PATCH 09/34] xfs: convert inode cache lookups to use RCU locking Dave Chinner
2010-12-21 7:29 ` [PATCH 10/34] xfs: convert pag_ici_lock to a spin lock Dave Chinner
2010-12-21 7:29 ` [PATCH 11/34] xfs: convert xfsbud shrinker to a per-buftarg shrinker Dave Chinner
2010-12-21 7:29 ` [PATCH 12/34] xfs: add a lru to the XFS buffer cache Dave Chinner
2010-12-21 7:29 ` [PATCH 13/34] xfs: connect up buffer reclaim priority hooks Dave Chinner
2010-12-21 7:29 ` [PATCH 14/34] xfs: fix EFI transaction cancellation Dave Chinner
2010-12-21 7:29 ` [PATCH 15/34] xfs: Pull EFI/EFD handling out from under the AIL lock Dave Chinner
2010-12-21 7:29 ` [PATCH 16/34] xfs: clean up xfs_ail_delete() Dave Chinner
2010-12-21 7:29 ` [PATCH 17/34] xfs: bulk AIL insertion during transaction commit Dave Chinner
2010-12-21 7:29 ` [PATCH 18/34] xfs: reduce the number of AIL push wakeups Dave Chinner
2010-12-21 7:29 ` [PATCH 19/34] xfs: consume iodone callback items on buffers as they are processed Dave Chinner
2010-12-21 7:29 ` [PATCH 20/34] xfs: remove all the inodes on a buffer from the AIL in bulk Dave Chinner
2010-12-22 2:20 ` Alex Elder
2010-12-22 3:49 ` Dave Chinner
2010-12-21 7:29 ` [PATCH 22/34] xfs: use AIL bulk delete function to implement single delete Dave Chinner
2010-12-21 7:29 ` [PATCH 23/34] xfs: convert log grant ticket queues to list heads Dave Chinner
2010-12-21 7:29 ` [PATCH 24/34] xfs: fact out common grant head/log tail verification code Dave Chinner
2010-12-21 7:29 ` [PATCH 25/34] xfs: rework log grant space calculations Dave Chinner
2010-12-21 7:29 ` [PATCH 26/34] xfs: combine grant heads into a single 64 bit integer Dave Chinner
2010-12-21 7:29 ` [PATCH 27/34] xfs: use wait queues directly for the log wait queues Dave Chinner
2010-12-21 7:29 ` [PATCH 28/34] xfs: make AIL tail pushing independent of the grant lock Dave Chinner
2010-12-21 7:29 ` [PATCH 29/34] xfs: convert l_last_sync_lsn to an atomic variable Dave Chinner
2010-12-21 7:29 ` [PATCH 30/34] xfs: convert l_tail_lsn " Dave Chinner
2010-12-29 12:52 ` Christoph Hellwig
2010-12-29 15:49 ` Alex Elder
2010-12-21 7:29 ` [PATCH 31/34] xfs: convert log grant heads to atomic variables Dave Chinner
2010-12-21 7:29 ` [PATCH 32/34] xfs: introduce new locks for the log grant ticket wait queues Dave Chinner
2010-12-21 7:29 ` [PATCH 33/34] xfs: convert grant head manipulations to lockless algorithm Dave Chinner
2010-12-21 7:29 ` [PATCH 34/34] xfs: kill useless spinlock_destroy macro Dave Chinner
2010-12-23 1:15 ` [PATCH 00/34] xfs: scalability patchset for 2.6.38 Dave Chinner