* [PATCH v4 00/12] Lockless update of reference count protected by spinlock
@ 2013-07-04  3:32 Waiman Long
From: Waiman Long @ 2013-07-04  3:32 UTC
  To: Alexander Viro, Jeff Layton, Miklos Szeredi, Ingo Molnar,
	Thomas Gleixner
  Cc: Waiman Long, linux-fsdevel, linux-kernel, Peter Zijlstra,
	Steven Rostedt, Linus Torvalds, Benjamin Herrenschmidt,
	Andi Kleen, Chandramouleeswaran, Aswin, Norton, Scott J

v3->v4:
 - Replace the v3 helper-function access to d_lock and d_count with
   macros that redefine the old d_lock name to the spinlock and the
   new d_refcount name to the reference count. This greatly reduces
   the size of the patchset from 25 to 12 patches and makes it easier
   to review.
v2->v3:
 - Completely revamp the packaging by adding a new lockref data
   structure that combines the spinlock with the reference
   count. Helper functions are also added to manipulate the new data
   structure. That results in modifying over 50 files, but the changes
   were trivial in most of them.
 - Change initial spinlock wait to use a timeout.
 - Force 64-bit alignment of the spinlock & reference count structure.
 - Add a new way to use the combo by using a new union and helper
   functions.

v1->v2:
 - Add one more layer of indirection to LOCK_WITH_REFCOUNT macro.
 - Add __LINUX_SPINLOCK_REFCOUNT_H protection to spinlock_refcount.h.
 - Add some generic get/put macros into spinlock_refcount.h.

This patchset adds a generic mechanism for atomically updating a
reference count that is protected by a spinlock without actually
acquiring the lock itself. If the lockless update doesn't succeed,
the caller has to acquire the lock and update the reference count
the old way. This helps in situations where frequent reference
count updates cause heavy spinlock contention.
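
In outline, the fast path treats the spinlock and the reference count
as a single 64-bit word, takes a snapshot of it, and retries the
update with a compare-and-exchange only while the snapshot shows the
lock to be free. A minimal sketch of the idea (the names below are
made up for illustration; patch 1 has the actual
__lockref_add_unless() code):

	union lock_count {
		u64	lock_count;
		struct {
			unsigned int	refcnt;	/* Reference count */
			spinlock_t	lock;
		};
	};

	static int try_lockless_inc(union lock_count *lc)
	{
		union lock_count old, new;

		old.lock_count = ACCESS_ONCE(lc->lock_count);
		if (spin_is_locked(&old.lock))
			return 0;	/* caller falls back to spin_lock() */
		new.lock_count = old.lock_count;
		new.refcnt++;
		return cmpxchg64(&lc->lock_count, old.lock_count,
				 new.lock_count) == old.lock_count;
	}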

The d_lock and d_count fields of struct dentry in dcache.h were
modified to use the new lockref data structure, and the d_lock name
is now a macro for the actual spinlock. The d_count name, however,
cannot be reused as it collides with other uses elsewhere in the
kernel. So a new d_refcount name is used for the reference count.
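
For illustration, the renaming trick amounts to something like the
following sketch (the d_lockcnt field name here is hypothetical; see
patch 12 for the real layout):

	struct dentry {
		/* ... other dentry fields ... */
		struct lockref d_lockcnt;	/* lock + reference count */
		/* ... other dentry fields ... */
	};

	#define d_lock		d_lockcnt.lock
	#define d_refcount	d_lockcnt.refcnt

With such macros in place, existing d_lock references compile
unchanged, while d_count users are renamed to d_refcount by patches
3-11.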

This patchset yields a significant performance improvement in the
short workload of the AIM7 benchmark on an 8-socket x86-64 machine
with 80 cores.

patch 1:	Introduce the new lockref data structure
patch 2:	Enable x86 architecture to use the feature
patches 3-11:	Rename all the d_count references to d_refcount
patch 12:	Change the dentry structure to use the lockref
		structure to improve performance for high contention
		cases

Thanks to Thomas Gleixner, Andi Kleen and Linus for their valuable
input in shaping this patchset.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>

Waiman Long (12):
  spinlock: A new lockref structure for lockless update of refcount
  spinlock: Enable x86 architecture to do lockless refcount update
  dcache: rename d_count field of dentry to d_refcount
  auto-fs: rename d_count field of dentry to d_refcount
  ceph-fs: rename d_count field of dentry to d_refcount
  coda-fs: rename d_count field of dentry to d_refcount
  config-fs: rename d_count field of dentry to d_refcount
  ecrypt-fs: rename d_count field of dentry to d_refcount
  file locking: rename d_count field of dentry to d_refcount
  nfs: rename d_count field of dentry to d_refcount
  nilfs2: rename d_count field of dentry to d_refcount
  dcache: Enable lockless update of refcount in dentry structure

 arch/x86/Kconfig                         |    3 +
 arch/x86/include/asm/spinlock_refcount.h |    1 +
 fs/autofs4/expire.c                      |    8 +-
 fs/autofs4/root.c                        |    2 +-
 fs/ceph/inode.c                          |    4 +-
 fs/ceph/mds_client.c                     |    2 +-
 fs/coda/dir.c                            |    4 +-
 fs/configfs/dir.c                        |    2 +-
 fs/dcache.c                              |   72 +++++-----
 fs/ecryptfs/inode.c                      |    2 +-
 fs/locks.c                               |    2 +-
 fs/namei.c                               |    6 +-
 fs/nfs/dir.c                             |    8 +-
 fs/nfs/unlink.c                          |    2 +-
 fs/nilfs2/super.c                        |    2 +-
 include/asm-generic/spinlock_refcount.h  |   46 ++++++
 include/linux/dcache.h                   |   21 ++--
 include/linux/spinlock_refcount.h        |  107 ++++++++++++++
 kernel/Kconfig.locks                     |    5 +
 lib/Makefile                             |    2 +
 lib/spinlock_refcount.c                  |  229 ++++++++++++++++++++++++++++++
 21 files changed, 466 insertions(+), 64 deletions(-)
 create mode 100644 arch/x86/include/asm/spinlock_refcount.h
 create mode 100644 include/asm-generic/spinlock_refcount.h
 create mode 100644 include/linux/spinlock_refcount.h
 create mode 100644 lib/spinlock_refcount.c


* [PATCH v4 01/12] spinlock: A new lockref structure for lockless update of refcount
From: Waiman Long @ 2013-07-04  3:32 UTC
  To: Alexander Viro, Jeff Layton, Miklos Szeredi, Ingo Molnar,
	Thomas Gleixner
  Cc: Waiman Long, linux-fsdevel, linux-kernel, Peter Zijlstra,
	Steven Rostedt, Linus Torvalds, Benjamin Herrenschmidt,
	Andi Kleen, Chandramouleeswaran, Aswin, Norton, Scott J

This patch introduces a new set of spinlock_refcount.h header files
to be included by kernel code that wants to do a faster lockless
update of a reference count protected by a spinlock.

The new lockref structure consists of just the spinlock and the
reference count data. Helper functions are defined in the new
<linux/spinlock_refcount.h> header file to access the content of
the new structure. There is a generic structure defined for all
architectures, but each architecture can also optionally define its
own structure and use its own helper functions.
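
As a rough usage sketch (struct foo and its helpers are invented
here for illustration), an embedding structure calls the lockref
helpers instead of touching the lock and the count directly:

	#include <linux/slab.h>
	#include <linux/spinlock_refcount.h>

	struct foo {
		struct lockref f_cnt;
		/* other fields */
	};

	static void foo_get(struct foo *f)
	{
		lockref_get(&f->f_cnt);		/* lockless when possible */
	}

	static void foo_put(struct foo *f)
	{
		if (lockref_put_or_locked(&f->f_cnt))
			return;			/* count was > 1 */
		/* count was 1 and f_cnt.lock is now held */
		spin_unlock(&f->f_cnt.lock);
		kfree(f);
	}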

Two new config parameters are introduced:
1. SPINLOCK_REFCOUNT
2. ARCH_SPINLOCK_REFCOUNT

The first one is defined in kernel/Kconfig.locks and is used to
enable or disable the faster lockless reference count update
optimization. The second one has to be defined in each
architecture's Kconfig file to enable the optimization for that
architecture. Each architecture therefore has to opt in to this
optimization or it won't get it. This gives each architecture plenty
of time to test the feature before deciding to use it or to replace
it with a better architecture-specific solution.

This optimization won't work on non-SMP systems or when spinlock
debugging is turned on, so it is disabled when either of them is
true. It also won't work with full preempt-RT and should be turned
off in that case as well.

The current patch allows 3 levels of access to the new lockref
structure:

1. The lockless update optimization is turned off (SPINLOCK_REFCOUNT=n).
2. The lockless update optimization is turned on and the generic version
   is used (SPINLOCK_REFCOUNT=y and ARCH_SPINLOCK_REFCOUNT=y).
3. The lockless update optimization is turned on and the architecture
   provides its own version.

To maximize the chance of doing a lockless update in the generic
version, the inlined __lockref_add_unless() function will wait for
a certain amount of time if the lock is not free before attempting
the update.

The new code also attempts the lockless atomic update twice before
falling back to the old code path of acquiring the lock and then
updating the count. This is because a fair amount of contention
remains when only a single attempt is made.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 include/asm-generic/spinlock_refcount.h |   46 ++++++
 include/linux/spinlock_refcount.h       |  107 ++++++++++++++
 kernel/Kconfig.locks                    |    5 +
 lib/Makefile                            |    2 +
 lib/spinlock_refcount.c                 |  229 +++++++++++++++++++++++++++++++
 5 files changed, 389 insertions(+), 0 deletions(-)
 create mode 100644 include/asm-generic/spinlock_refcount.h
 create mode 100644 include/linux/spinlock_refcount.h
 create mode 100644 lib/spinlock_refcount.c

diff --git a/include/asm-generic/spinlock_refcount.h b/include/asm-generic/spinlock_refcount.h
new file mode 100644
index 0000000..469b96b
--- /dev/null
+++ b/include/asm-generic/spinlock_refcount.h
@@ -0,0 +1,46 @@
+/*
+ * Spinlock with reference count combo
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * (c) Copyright 2013 Hewlett-Packard Development Company, L.P.
+ *
+ * Authors: Waiman Long <waiman.long@hp.com>
+ */
+#ifndef __ASM_GENERIC_SPINLOCK_REFCOUNT_H
+#define __ASM_GENERIC_SPINLOCK_REFCOUNT_H
+
+/*
+ * The lockref structure defines a combined spinlock with reference count
+ * data structure to be embedded in a larger structure. The combined data
+ * structure is always 8-byte aligned. So proper placement of this structure
+ * in the larger embedding data structure is needed to ensure that there is
+ * no hole in it.
+ */
+struct __aligned(sizeof(u64)) lockref {
+	union {
+		u64		__lock_count;
+		struct {
+			unsigned int	refcnt;	/* Reference count */
+			spinlock_t	lock;
+		};
+	};
+};
+
+/*
+ * Struct lockref helper functions
+ */
+extern void lockref_get(struct lockref *lockcnt);
+extern int  lockref_put(struct lockref *lockcnt);
+extern int  lockref_get_not_zero(struct lockref *lockcnt);
+extern int  lockref_put_or_locked(struct lockref *lockcnt);
+
+#endif /* __ASM_GENERIC_SPINLOCK_REFCOUNT_H */
diff --git a/include/linux/spinlock_refcount.h b/include/linux/spinlock_refcount.h
new file mode 100644
index 0000000..28b0b89
--- /dev/null
+++ b/include/linux/spinlock_refcount.h
@@ -0,0 +1,107 @@
+/*
+ * Spinlock with reference count combo
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * (c) Copyright 2013 Hewlett-Packard Development Company, L.P.
+ *
+ * Authors: Waiman Long <waiman.long@hp.com>
+ */
+#ifndef __LINUX_SPINLOCK_REFCOUNT_H
+#define __LINUX_SPINLOCK_REFCOUNT_H
+
+#include <linux/spinlock.h>
+
+#ifdef CONFIG_SPINLOCK_REFCOUNT
+#include <asm/spinlock_refcount.h>
+#else
+/*
+ * If the spinlock & reference count optimization feature is disabled,
+ * the spinlock and the reference count are accessed separately.
+ */
+struct lockref {
+	unsigned int refcnt;	/* Reference count */
+	spinlock_t   lock;
+};
+
+/*
+ * Struct lockref helper functions
+ */
+/*
+ * lockref_get - increments reference count while not locked
+ * @lockcnt: pointer to lockref structure
+ */
+static __always_inline void
+lockref_get(struct lockref *lockcnt)
+{
+	spin_lock(&lockcnt->lock);
+	lockcnt->refcnt++;
+	spin_unlock(&lockcnt->lock);
+}
+
+/*
+ * lockref_get_not_zero - increments count unless the count is 0
+ * @lockcnt: pointer to lockref structure
+ * Return: 1 if count updated successfully or 0 if count is 0
+ */
+static __always_inline int
+lockref_get_not_zero(struct lockref *lockcnt)
+{
+	int retval = 0;
+
+	spin_lock(&lockcnt->lock);
+	if (likely(lockcnt->refcnt)) {
+		lockcnt->refcnt++;
+		retval = 1;
+	}
+	spin_unlock(&lockcnt->lock);
+	return retval;
+}
+
+/*
+ * lockref_put - decrements count unless count <= 1 before decrement
+ * @lockcnt: pointer to lockref structure
+ * Return: 1 if count updated successfully or 0 if count <= 1
+ */
+static __always_inline int
+lockref_put(struct lockref *lockcnt)
+{
+	int retval = 0;
+
+	spin_lock(&lockcnt->lock);
+	if (likely(lockcnt->refcnt > 1)) {
+		lockcnt->refcnt--;
+		retval = 1;
+	}
+	spin_unlock(&lockcnt->lock);
+	return retval;
+}
+
+/*
+ * lockref_put_or_locked - decrements count unless count <= 1 before the
+ *			   decrement; otherwise the lock is taken and held
+ * @lockcnt: pointer to lockref structure
+ * Return: 1 if count updated successfully or 0 if count <= 1 and lock taken
+ */
+static __always_inline int
+lockref_put_or_locked(struct lockref *lockcnt)
+{
+	spin_lock(&lockcnt->lock);
+	if (likely(lockcnt->refcnt > 1)) {
+		lockcnt->refcnt--;
+		spin_unlock(&lockcnt->lock);
+		return 1;
+	}
+	return 0;	/* Count is 1 & lock taken */
+}
+
+#endif /* CONFIG_SPINLOCK_REFCOUNT */
+#endif /* __LINUX_SPINLOCK_REFCOUNT_H */
diff --git a/kernel/Kconfig.locks b/kernel/Kconfig.locks
index 44511d1..d1f8670 100644
--- a/kernel/Kconfig.locks
+++ b/kernel/Kconfig.locks
@@ -223,3 +223,8 @@ endif
 config MUTEX_SPIN_ON_OWNER
 	def_bool y
 	depends on SMP && !DEBUG_MUTEXES
+
+config SPINLOCK_REFCOUNT
+	def_bool y
+	depends on ARCH_SPINLOCK_REFCOUNT && SMP
+	depends on !GENERIC_LOCKBREAK && !DEBUG_SPINLOCK && !DEBUG_LOCK_ALLOC
diff --git a/lib/Makefile b/lib/Makefile
index c55a037..0367915 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -179,3 +179,5 @@ quiet_cmd_build_OID_registry = GEN     $@
 clean-files	+= oid_registry_data.c
 
 obj-$(CONFIG_UCS2_STRING) += ucs2_string.o
+
+obj-$(CONFIG_SPINLOCK_REFCOUNT) += spinlock_refcount.o
diff --git a/lib/spinlock_refcount.c b/lib/spinlock_refcount.c
new file mode 100644
index 0000000..1f599a7
--- /dev/null
+++ b/lib/spinlock_refcount.c
@@ -0,0 +1,229 @@
+/*
+ * Generic spinlock with reference count combo
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * (C) Copyright 2013 Hewlett-Packard Development Company, L.P.
+ *
+ * Authors: Waiman Long <waiman.long@hp.com>
+ */
+
+#include <linux/spinlock.h>
+#include <asm-generic/spinlock_refcount.h>
+
+#ifdef _SHOW_WAIT_LOOP_TIME
+#include <linux/time.h>
+#endif
+
+/*
+ * The maximum number of spins to wait for the lock to be released before
+ * attempting a lockless update. The purpose of the timeout is to
+ * limit the amount of unfairness to the thread that is doing the reference
+ * count update. Otherwise, it is theoretically possible for that thread to
+ * be starved for a really long time, causing all kinds of problems. If it
+ * times out and the lock is still not free, the code will fall back to the
+ * traditional way of queuing up to acquire the lock before updating the count.
+ *
+ * The actual time spent in the wait loop very much depends on the CPU being
+ * used. On a 2.4GHz Westmere CPU, the execution time of a PAUSE instruction
+ * (cpu_relax) in a 4k loop is about 16us. The lock checking and looping
+ * overhead is about 8us. With 4 cpu_relax's in the loop, it will wait
+ * about 72us before the count reaches 0. With cacheline contention, the
+ * wait time can go up to 3x as much (about 210us). Having multiple
+ * cpu_relax's in the wait loop does seem to reduce cacheline contention
+ * somewhat and give slightly better performance.
+ *
+ * The preset timeout value is rather arbitrary and really depends on the CPU
+ * being used. If customization is needed, we could add a config variable for
+ * that. The exact value is not that important. A longer wait time will
+ * increase the chance of doing a lockless update, but it results in more
+ * unfairness to the waiting thread and vice versa.
+ */
+#ifndef CONFIG_LOCKREF_WAIT_SHIFT
+#define CONFIG_LOCKREF_WAIT_SHIFT	12
+#endif
+#define	LOCKREF_SPIN_WAIT_MAX	(1<<CONFIG_LOCKREF_WAIT_SHIFT)
+
+/**
+ * __lockref_add_unless - atomically add the given value to the count unless
+ *			  the lock was acquired or the count is at or below
+ *			  the given threshold value.
+ *
+ * @plockcnt : pointer to the combined lock and count 8-byte data
+ * @plock    : pointer to the spinlock
+ * @pcount   : pointer to the reference count
+ * @value    : value to be added
+ * @threshold: threshold value for acquiring the lock
+ * Return    : 1 if operation succeeds, 0 otherwise
+ *
+ * If the lock was not acquired, __lockref_add_unless() atomically adds the
+ * given value to the reference count unless the given threshold is reached.
+ * If the lock was acquired or the threshold was reached, 0 is returned and
+ * the caller will have to acquire the lock and update the count accordingly
+ * (can be done in a non-atomic way).
+ *
+ * N.B. The lockdep code won't be run as this optimization should be disabled
+ *	when LOCK_STAT is enabled.
+ */
+static __always_inline int
+__lockref_add_unless(u64 *plockcnt, spinlock_t *plock, unsigned int *pcount,
+		     int value, int threshold)
+{
+	struct lockref old, new;
+	int   get_lock, loopcnt;
+#ifdef _SHOW_WAIT_LOOP_TIME
+	struct timespec tv1, tv2;
+#endif
+
+	/*
+	 * Code doesn't work if raw spinlock is larger than 4 bytes
+	 * or is empty.
+	 */
+	BUILD_BUG_ON(sizeof(arch_spinlock_t) == 0);
+	if (sizeof(arch_spinlock_t) > 4)
+		return 0;	/* Need to acquire the lock explicitly */
+
+	/*
+	 * Wait a certain amount of time until the lock is free or the loop
+	 * counter reaches 0. Doing multiple cpu_relax() helps to reduce
+	 * contention in the spinlock cacheline and achieve better performance.
+	 */
+#ifdef _SHOW_WAIT_LOOP_TIME
+	getnstimeofday(&tv1);
+#endif
+	loopcnt = LOCKREF_SPIN_WAIT_MAX;
+	while (loopcnt && spin_is_locked(plock)) {
+		loopcnt--;
+		cpu_relax();
+		cpu_relax();
+		cpu_relax();
+		cpu_relax();
+	}
+#ifdef _SHOW_WAIT_LOOP_TIME
+	if (loopcnt == 0) {
+		unsigned long ns;
+
+		getnstimeofday(&tv2);
+		ns = (tv2.tv_sec - tv1.tv_sec) * NSEC_PER_SEC +
+		     (tv2.tv_nsec - tv1.tv_nsec);
+		pr_info("lockref wait loop time = %lu ns\n", ns);
+	}
+#endif
+
+#ifdef CONFIG_LOCK_STAT
+	if (loopcnt != LOCKREF_SPIN_WAIT_MAX)
+		lock_contended(&plock->dep_map, _RET_IP_);
+#endif
+	old.__lock_count = ACCESS_ONCE(*plockcnt);
+	get_lock = ((threshold >= 0) && (old.refcnt <= threshold));
+	if (likely(!get_lock && !spin_is_locked(&old.lock))) {
+		new.__lock_count = old.__lock_count;
+		new.refcnt += value;
+		if (cmpxchg64(plockcnt, old.__lock_count, new.__lock_count)
+		    == old.__lock_count)
+			return 1;
+		cpu_relax();
+		cpu_relax();
+		/*
+		 * Try one more time
+		 */
+		old.__lock_count = ACCESS_ONCE(*plockcnt);
+		get_lock = ((threshold >= 0) && (old.refcnt <= threshold));
+		if (likely(!get_lock && !spin_is_locked(&old.lock))) {
+			new.__lock_count = old.__lock_count;
+			new.refcnt += value;
+			if (cmpxchg64(plockcnt, old.__lock_count,
+				      new.__lock_count) == old.__lock_count)
+				return 1;
+			cpu_relax();
+		}
+	}
+	return 0;	/* The caller will need to acquire the lock */
+}
+
+/*
+ * Generic macros to increment and decrement reference count
+ * @sname : pointer to lockref structure
+ * @lock  : name of spinlock in structure
+ * @count : name of reference count in structure
+ *
+ * LOCKREF_GET()  - increments count unless it is locked
+ * LOCKREF_GET0() - increments count unless it is locked or count is 0
+ * LOCKREF_PUT()  - decrements count unless it is locked or count is 1
+ */
+#define LOCKREF_GET(sname, lock, count)					\
+	__lockref_add_unless(&(sname)->__lock_count, &(sname)->lock,	\
+			     &(sname)->count, 1, -1)
+#define LOCKREF_GET0(sname, lock, count)				\
+	__lockref_add_unless(&(sname)->__lock_count, &(sname)->lock,	\
+			     &(sname)->count, 1, 0)
+#define LOCKREF_PUT(sname, lock, count)					\
+	__lockref_add_unless(&(sname)->__lock_count, &(sname)->lock,	\
+			     &(sname)->count, -1, 1)
+
+/*
+ * Struct lockref helper functions
+ */
+/*
+ * lockref_get - increments reference count
+ * @lockcnt: pointer to struct lockref structure
+ */
+void
+lockref_get(struct lockref *lockcnt)
+{
+	if (LOCKREF_GET(lockcnt, lock, refcnt))
+		return;
+	spin_lock(&lockcnt->lock);
+	lockcnt->refcnt++;
+	spin_unlock(&lockcnt->lock);
+}
+EXPORT_SYMBOL(lockref_get);
+
+/*
+ * lockref_get_not_zero - increments count unless the count is 0
+ * @lockcnt: pointer to struct lockref structure
+ * Return: 1 if count updated successfully, 0 otherwise
+ */
+int
+lockref_get_not_zero(struct lockref *lockcnt)
+{
+	return LOCKREF_GET0(lockcnt, lock, refcnt);
+}
+EXPORT_SYMBOL(lockref_get_not_zero);
+
+/*
+ * lockref_put - decrements count unless the count <= 1
+ * @lockcnt: pointer to struct lockref structure
+ * Return: 1 if count updated successfully, 0 otherwise
+ */
+int
+lockref_put(struct lockref *lockcnt)
+{
+	return LOCKREF_PUT(lockcnt, lock, refcnt);
+}
+EXPORT_SYMBOL(lockref_put);
+
+/*
+ * lockref_put_or_locked - decrements count unless the count is <= 1
+ *			   otherwise, the lock will be taken
+ * @lockcnt: pointer to struct lockref structure
+ * Return: 1 if count updated successfully or 0 if count <= 1 and lock taken
+ */
+int
+lockref_put_or_locked(struct lockref *lockcnt)
+{
+	if (LOCKREF_PUT(lockcnt, lock, refcnt))
+		return 1;
+	spin_lock(&lockcnt->lock);
+	/*
+	 * The lockless attempt can fail on contention even when the
+	 * count is > 1, so recheck under the lock before reporting
+	 * count <= 1 with the lock held.
+	 */
+	if (likely(lockcnt->refcnt > 1)) {
+		lockcnt->refcnt--;
+		spin_unlock(&lockcnt->lock);
+		return 1;
+	}
+	return 0;
+}
+EXPORT_SYMBOL(lockref_put_or_locked);
-- 
1.7.1



* [PATCH v4 02/12] spinlock: Enable x86 architecture to do lockless refcount update
From: Waiman Long @ 2013-07-04  3:32 UTC
  To: Alexander Viro, Jeff Layton, Miklos Szeredi, Ingo Molnar,
	Thomas Gleixner
  Cc: Waiman Long, linux-fsdevel, linux-kernel, Peter Zijlstra,
	Steven Rostedt, Linus Torvalds, Benjamin Herrenschmidt,
	Andi Kleen, Chandramouleeswaran, Aswin, Norton, Scott J

There are two steps to enable each architecture to do lockless
reference count update:
1. Define the ARCH_SPINLOCK_REFCOUNT config parameter in its Kconfig
   file.
2. Add a <asm/spinlock_refcount.h> architecture specific header file.

This patch does both steps for the x86 architecture, making it use
the generic version available.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 arch/x86/Kconfig                         |    3 +++
 arch/x86/include/asm/spinlock_refcount.h |    1 +
 2 files changed, 4 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/spinlock_refcount.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index fe120da..649ed4b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -253,6 +253,9 @@ config ARCH_CPU_PROBE_RELEASE
 config ARCH_SUPPORTS_UPROBES
 	def_bool y
 
+config ARCH_SPINLOCK_REFCOUNT
+	def_bool y
+
 source "init/Kconfig"
 source "kernel/Kconfig.freezer"
 
diff --git a/arch/x86/include/asm/spinlock_refcount.h b/arch/x86/include/asm/spinlock_refcount.h
new file mode 100644
index 0000000..ab6224f
--- /dev/null
+++ b/arch/x86/include/asm/spinlock_refcount.h
@@ -0,0 +1 @@
+#include <asm-generic/spinlock_refcount.h>
-- 
1.7.1


* [PATCH v4 03/12] dcache: rename d_count field of dentry to d_refcount
From: Waiman Long @ 2013-07-04  3:32 UTC
  To: Alexander Viro, Jeff Layton, Miklos Szeredi, Ingo Molnar,
	Thomas Gleixner
  Cc: Waiman Long, linux-fsdevel, linux-kernel, Peter Zijlstra,
	Steven Rostedt, Linus Torvalds, Benjamin Herrenschmidt,
	Andi Kleen, Chandramouleeswaran, Aswin, Norton, Scott J

Before converting the d_lock and d_count fields of the dentry data
structure to the new lockref structure, we need to consider the
implications of such a change. All current references to d_count and
d_lock have to be changed accordingly.

One way to minimize the changes is to redefine the original field
names as macros for the new names. For d_lock, this is possible and
saves a lot of changes, as the name is not used anywhere else in
the kernel. For d_count, it is not possible because that name is
used elsewhere in the kernel for different things.

Therefore, we need to change the d_count field name to a different
one that is unique in the kernel source tree. The new name is
d_refcount.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 fs/dcache.c            |   54 ++++++++++++++++++++++++------------------------
 fs/namei.c             |    6 ++--
 include/linux/dcache.h |    6 ++--
 3 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index f09b908..61d2f11 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -54,7 +54,7 @@
  *   - d_flags
  *   - d_name
  *   - d_lru
- *   - d_count
+ *   - d_refcount
  *   - d_unhashed()
  *   - d_parent and d_subdirs
  *   - childrens' d_child and d_parent
@@ -229,7 +229,7 @@ static void __d_free(struct rcu_head *head)
  */
 static void d_free(struct dentry *dentry)
 {
-	BUG_ON(dentry->d_count);
+	BUG_ON(dentry->d_refcount);
 	this_cpu_dec(nr_dentry);
 	if (dentry->d_op && dentry->d_op->d_release)
 		dentry->d_op->d_release(dentry);
@@ -467,7 +467,7 @@ relock:
 	}
 
 	if (ref)
-		dentry->d_count--;
+		dentry->d_refcount--;
 	/*
 	 * inform the fs via d_prune that this dentry is about to be
 	 * unhashed and destroyed.
@@ -513,12 +513,12 @@ void dput(struct dentry *dentry)
 		return;
 
 repeat:
-	if (dentry->d_count == 1)
+	if (dentry->d_refcount == 1)
 		might_sleep();
 	spin_lock(&dentry->d_lock);
-	BUG_ON(!dentry->d_count);
-	if (dentry->d_count > 1) {
-		dentry->d_count--;
+	BUG_ON(!dentry->d_refcount);
+	if (dentry->d_refcount > 1) {
+		dentry->d_refcount--;
 		spin_unlock(&dentry->d_lock);
 		return;
 	}
@@ -535,7 +535,7 @@ repeat:
 	dentry->d_flags |= DCACHE_REFERENCED;
 	dentry_lru_add(dentry);
 
-	dentry->d_count--;
+	dentry->d_refcount--;
 	spin_unlock(&dentry->d_lock);
 	return;
 
@@ -590,7 +590,7 @@ int d_invalidate(struct dentry * dentry)
 	 * We also need to leave mountpoints alone,
 	 * directory or not.
 	 */
-	if (dentry->d_count > 1 && dentry->d_inode) {
+	if (dentry->d_refcount > 1 && dentry->d_inode) {
 		if (S_ISDIR(dentry->d_inode->i_mode) || d_mountpoint(dentry)) {
 			spin_unlock(&dentry->d_lock);
 			return -EBUSY;
@@ -606,7 +606,7 @@ EXPORT_SYMBOL(d_invalidate);
 /* This must be called with d_lock held */
 static inline void __dget_dlock(struct dentry *dentry)
 {
-	dentry->d_count++;
+	dentry->d_refcount++;
 }
 
 static inline void __dget(struct dentry *dentry)
@@ -634,8 +634,8 @@ repeat:
 		goto repeat;
 	}
 	rcu_read_unlock();
-	BUG_ON(!ret->d_count);
-	ret->d_count++;
+	BUG_ON(!ret->d_refcount);
+	ret->d_refcount++;
 	spin_unlock(&ret->d_lock);
 	return ret;
 }
@@ -718,7 +718,7 @@ restart:
 	spin_lock(&inode->i_lock);
 	hlist_for_each_entry(dentry, &inode->i_dentry, d_alias) {
 		spin_lock(&dentry->d_lock);
-		if (!dentry->d_count) {
+		if (!dentry->d_refcount) {
 			__dget_dlock(dentry);
 			__d_drop(dentry);
 			spin_unlock(&dentry->d_lock);
@@ -734,7 +734,7 @@ EXPORT_SYMBOL(d_prune_aliases);
 
 /*
  * Try to throw away a dentry - free the inode, dput the parent.
- * Requires dentry->d_lock is held, and dentry->d_count == 0.
+ * Requires dentry->d_lock is held, and dentry->d_refcount == 0.
  * Releases dentry->d_lock.
  *
  * This may fail if locks cannot be acquired no problem, just try again.
@@ -764,8 +764,8 @@ static void try_prune_one_dentry(struct dentry *dentry)
 	dentry = parent;
 	while (dentry) {
 		spin_lock(&dentry->d_lock);
-		if (dentry->d_count > 1) {
-			dentry->d_count--;
+		if (dentry->d_refcount > 1) {
+			dentry->d_refcount--;
 			spin_unlock(&dentry->d_lock);
 			return;
 		}
@@ -793,7 +793,7 @@ static void shrink_dentry_list(struct list_head *list)
 		 * the LRU because of laziness during lookup.  Do not free
 		 * it - just keep it off the LRU list.
 		 */
-		if (dentry->d_count) {
+		if (dentry->d_refcount) {
 			dentry_lru_del(dentry);
 			spin_unlock(&dentry->d_lock);
 			continue;
@@ -913,7 +913,7 @@ static void shrink_dcache_for_umount_subtree(struct dentry *dentry)
 			dentry_lru_del(dentry);
 			__d_shrink(dentry);
 
-			if (dentry->d_count != 0) {
+			if (dentry->d_refcount != 0) {
 				printk(KERN_ERR
 				       "BUG: Dentry %p{i=%lx,n=%s}"
 				       " still in use (%d)"
@@ -922,7 +922,7 @@ static void shrink_dcache_for_umount_subtree(struct dentry *dentry)
 				       dentry->d_inode ?
 				       dentry->d_inode->i_ino : 0UL,
 				       dentry->d_name.name,
-				       dentry->d_count,
+				       dentry->d_refcount,
 				       dentry->d_sb->s_type->name,
 				       dentry->d_sb->s_id);
 				BUG();
@@ -933,7 +933,7 @@ static void shrink_dcache_for_umount_subtree(struct dentry *dentry)
 				list_del(&dentry->d_u.d_child);
 			} else {
 				parent = dentry->d_parent;
-				parent->d_count--;
+				parent->d_refcount--;
 				list_del(&dentry->d_u.d_child);
 			}
 
@@ -981,7 +981,7 @@ void shrink_dcache_for_umount(struct super_block *sb)
 
 	dentry = sb->s_root;
 	sb->s_root = NULL;
-	dentry->d_count--;
+	dentry->d_refcount--;
 	shrink_dcache_for_umount_subtree(dentry);
 
 	while (!hlist_bl_empty(&sb->s_anon)) {
@@ -1147,7 +1147,7 @@ resume:
 		 * loop in shrink_dcache_parent() might not make any progress
 		 * and loop forever.
 		 */
-		if (dentry->d_count) {
+		if (dentry->d_refcount) {
 			dentry_lru_del(dentry);
 		} else if (!(dentry->d_flags & DCACHE_SHRINK_LIST)) {
 			dentry_lru_move_list(dentry, dispose);
@@ -1269,7 +1269,7 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
 	smp_wmb();
 	dentry->d_name.name = dname;
 
-	dentry->d_count = 1;
+	dentry->d_refcount = 1;
 	dentry->d_flags = 0;
 	spin_lock_init(&dentry->d_lock);
 	seqcount_init(&dentry->d_seq);
@@ -1970,7 +1970,7 @@ struct dentry *__d_lookup(const struct dentry *parent, const struct qstr *name)
 				goto next;
 		}
 
-		dentry->d_count++;
+		dentry->d_refcount++;
 		found = dentry;
 		spin_unlock(&dentry->d_lock);
 		break;
@@ -2069,7 +2069,7 @@ again:
 	spin_lock(&dentry->d_lock);
 	inode = dentry->d_inode;
 	isdir = S_ISDIR(inode->i_mode);
-	if (dentry->d_count == 1) {
+	if (dentry->d_refcount == 1) {
 		if (!spin_trylock(&inode->i_lock)) {
 			spin_unlock(&dentry->d_lock);
 			cpu_relax();
@@ -2937,7 +2937,7 @@ resume:
 		}
 		if (!(dentry->d_flags & DCACHE_GENOCIDE)) {
 			dentry->d_flags |= DCACHE_GENOCIDE;
-			dentry->d_count--;
+			dentry->d_refcount--;
 		}
 		spin_unlock(&dentry->d_lock);
 	}
@@ -2945,7 +2945,7 @@ resume:
 		struct dentry *child = this_parent;
 		if (!(this_parent->d_flags & DCACHE_GENOCIDE)) {
 			this_parent->d_flags |= DCACHE_GENOCIDE;
-			this_parent->d_count--;
+			this_parent->d_refcount--;
 		}
 		this_parent = try_to_ascend(this_parent, locked, seq);
 		if (!this_parent)
diff --git a/fs/namei.c b/fs/namei.c
index 9ed9361..3a59198 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -536,8 +536,8 @@ static int unlazy_walk(struct nameidata *nd, struct dentry *dentry)
 		 * a reference at this point.
 		 */
 		BUG_ON(!IS_ROOT(dentry) && dentry->d_parent != parent);
-		BUG_ON(!parent->d_count);
-		parent->d_count++;
+		BUG_ON(!parent->d_refcount);
+		parent->d_refcount++;
 		spin_unlock(&dentry->d_lock);
 	}
 	spin_unlock(&parent->d_lock);
@@ -3280,7 +3280,7 @@ void dentry_unhash(struct dentry *dentry)
 {
 	shrink_dcache_parent(dentry);
 	spin_lock(&dentry->d_lock);
-	if (dentry->d_count == 1)
+	if (dentry->d_refcount == 1)
 		__d_drop(dentry);
 	spin_unlock(&dentry->d_lock);
 }
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 1a6bb81..f0fd214 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -112,7 +112,7 @@ struct dentry {
 	unsigned char d_iname[DNAME_INLINE_LEN];	/* small names */
 
 	/* Ref lookup also touches following */
-	unsigned int d_count;		/* protected by d_lock */
+	unsigned int d_refcount;	/* protected by d_lock */
 	spinlock_t d_lock;		/* per dentry lock */
 	const struct dentry_operations *d_op;
 	struct super_block *d_sb;	/* The root of the dentry tree */
@@ -319,7 +319,7 @@ static inline int __d_rcu_to_refcount(struct dentry *dentry, unsigned seq)
 	assert_spin_locked(&dentry->d_lock);
 	if (!read_seqcount_retry(&dentry->d_seq, seq)) {
 		ret = 1;
-		dentry->d_count++;
+		dentry->d_refcount++;
 	}
 
 	return ret;
@@ -352,7 +352,7 @@ extern char *dentry_path(struct dentry *, char *, int);
 static inline struct dentry *dget_dlock(struct dentry *dentry)
 {
 	if (dentry)
-		dentry->d_count++;
+		dentry->d_refcount++;
 	return dentry;
 }
 
-- 
1.7.1

