From: Malcolm Crossley <malcolm.crossley@citrix.com>
To: JBeulich@suse.com, ian.campbell@citrix.com,
andrew.cooper3@citrix.com, Marcos.Matsunaga@oracle.com,
keir@xen.org, konrad.wilk@oracle.com,
george.dunlap@eu.citrix.com
Cc: xen-devel@lists.xenproject.org, dario.faggioli@citrix.com,
stefano.stabellini@citrix.com
Subject: Re: [PATCHv3 1/3] rwlock: Add per-cpu reader-writer lock infrastructure
Date: Fri, 18 Dec 2015 10:08:00 +0000
Message-ID: <5673DB00.6050008@citrix.com>
In-Reply-To: <1450356747-29039-2-git-send-email-malcolm.crossley@citrix.com>
I didn't spot that the percpu rwlock owner ASSERT was the wrong way
round. Please review version 4 of the series instead.
Sorry for the noise.
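To be clear about what "the wrong way round" refers to: it is the owner
check in the patch below, e.g. in _percpu_write_lock:

    ASSERT(per_cpudata != percpu_rwlock->percpu_owner);

which presumably becomes an equality check in v4:

    ASSERT(per_cpudata == percpu_rwlock->percpu_owner);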
On 17/12/15 12:52, Malcolm Crossley wrote:
> Per-cpu read-write locks allow the fast path read case to have low
> overhead by only setting/clearing a per-cpu variable when taking the
> read lock. The per-cpu read fast path also avoids locked
> compare-and-swap operations, which can be particularly slow on
> coherent multi-socket systems, especially if there is heavy usage of
> the read lock itself.
>
> The per-cpu reader-writer lock uses a local variable to control
> the read lock fast path. This allows a writer to disable the fast
> path and ensures the readers switch to using the underlying
> read-write lock implementation instead of the per-cpu variable.
>
> Once the writer has taken the write lock and disabled the fast path,
> it must poll the per-cpu variable for all CPUs which have entered
> the critical section for the specific read-write lock the writer is
> attempting to take. This design allows a single per-cpu variable
> to be used for read/write locks belonging to separate data structures.
> If two or more different per-cpu read locks are taken
> simultaneously then the per-cpu data structure is not used and the
> implementation takes the read lock of the underlying read-write lock;
> this behaviour is equivalent to the slow path in terms of performance.
> The per-cpu rwlock is not recursion safe for taking the per-cpu
> read lock because there is no recursion count variable; this is
> functionally equivalent to standard spin locks.
>
> Slow path readers, once unblocked, set the per-cpu variable and then
> drop the read lock. This simplifies the implementation and allows the
> fairness of the underlying read-write lock to be taken advantage of.
>
> There is more overhead on the per-cpu write lock path due to checking
> each CPU's fast path per-cpu variable, but this overhead is likely to
> be hidden by the required delay of waiting for readers to exit the
> critical section. The loop is optimised to only iterate over
> the per-cpu data of active readers of the rwlock. The cpumask_t for
> tracking the active readers is stored in a single per-cpu data
> location and thus the write lock is not pre-emption safe. Therefore
> the per-cpu write lock can only be used with interrupts disabled.
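
For reviewers, a minimal usage sketch of the API this patch adds. The
"mytable" resource and the init/reader/writer functions are purely
illustrative; only the DEFINE_PERCPU_RWLOCK_GLOBAL,
percpu_rwlock_resource_init and percpu_read/write_lock/unlock macros
come from this patch:

    /* One per-cpu owner pointer, shared by all locks of this class. */
    DEFINE_PERCPU_RWLOCK_GLOBAL(mytable_percpu_rwlock);

    struct mytable {
        percpu_rwlock_t lock;
        /* ... data protected by the lock ... */
    };

    static void mytable_init(struct mytable *t)
    {
        /* Tie this lock instance to the per-cpu owner variable. */
        percpu_rwlock_resource_init(&t->lock, mytable_percpu_rwlock);
    }

    static void mytable_read(struct mytable *t)
    {
        /* Fast path: a per-cpu pointer write plus a barrier. */
        percpu_read_lock(mytable_percpu_rwlock, &t->lock);
        /* ... read-side critical section ... */
        percpu_read_unlock(mytable_percpu_rwlock, &t->lock);
    }

    static void mytable_write(struct mytable *t)
    {
        unsigned long flags;

        /* As noted above, the write side is only safe with interrupts
         * disabled. */
        local_irq_save(flags);
        percpu_write_lock(mytable_percpu_rwlock, &t->lock);
        /* ... write-side critical section ... */
        percpu_write_unlock(mytable_percpu_rwlock, &t->lock);
        local_irq_restore(flags);
    }

Patches 2/3 and 3/3 convert the grant table and p2m rwlocks to this
pattern.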
>
> Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com>
> --
> Changes since v2:
> - Remove stray hard tabs
>
> Changes since v1:
> - Removed restriction on taking two or more different percpu rw
> locks simultaneously
> - Moved fast-path/slow-path barrier to be per lock instead of global
> - Created separate percpu_rwlock_t type and added macros to
> initialise new type
> - Added helper macros for using the percpu rwlock itself
> - Moved active_readers cpumask off the stack and into a percpu
> variable
> ---
> xen/common/spinlock.c | 46 +++++++++++++++++
> xen/include/asm-arm/percpu.h | 5 ++
> xen/include/asm-x86/percpu.h | 6 +++
> xen/include/xen/percpu.h | 4 ++
> xen/include/xen/spinlock.h | 115 +++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 176 insertions(+)
>
> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
> index 7f89694..524ab6e 100644
> --- a/xen/common/spinlock.c
> +++ b/xen/common/spinlock.c
> @@ -10,6 +10,8 @@
> #include <asm/processor.h>
> #include <asm/atomic.h>
>
> +static DEFINE_PER_CPU(cpumask_t, percpu_rwlock_readers);
> +
> #ifndef NDEBUG
>
> static atomic_t spin_debug __read_mostly = ATOMIC_INIT(0);
> @@ -492,6 +494,50 @@ int _rw_is_write_locked(rwlock_t *lock)
> return (lock->lock == RW_WRITE_FLAG); /* writer in critical section? */
> }
>
> +void _percpu_write_lock(percpu_rwlock_t **per_cpudata,
> + percpu_rwlock_t *percpu_rwlock)
> +{
> + unsigned int cpu;
> + cpumask_t *rwlock_readers = &this_cpu(percpu_rwlock_readers);
> +
> +#ifndef NDEBUG
> + /* Validate the correct per_cpudata variable has been provided. */
> + ASSERT(per_cpudata != percpu_rwlock->percpu_owner);
> +#endif
> + /*
> + * First take the write lock to protect against other writers or slow
> + * path readers.
> + */
> + write_lock(&percpu_rwlock->rwlock);
> +
> + /* Now set the global variable so that readers start using read_lock. */
> + percpu_rwlock->writer_activating = 1;
> + smp_mb();
> +
> + /* Using a per cpu cpumask is only safe if there is no nesting. */
> + ASSERT(!in_irq());
> + cpumask_copy(rwlock_readers, &cpu_online_map);
> +
> + /* Check if there are any percpu readers in progress on this rwlock. */
> + for ( ; ; )
> + {
> + for_each_cpu(cpu, rwlock_readers)
> + {
> + /*
> + * Remove any percpu readers not contending on this rwlock
> + * from our check mask.
> + */
> + if ( per_cpu_ptr(per_cpudata, cpu) != percpu_rwlock )
> + __cpumask_clear_cpu(cpu, rwlock_readers);
> + }
> + /* Check if we've cleared all percpu readers from check mask. */
> + if ( cpumask_empty(rwlock_readers) )
> + break;
> + /* Give the coherency fabric a break. */
> + cpu_relax();
> + };
> +}
> +
> #ifdef LOCK_PROFILE
>
> struct lock_profile_anc {
> diff --git a/xen/include/asm-arm/percpu.h b/xen/include/asm-arm/percpu.h
> index 71e7649..c308a56 100644
> --- a/xen/include/asm-arm/percpu.h
> +++ b/xen/include/asm-arm/percpu.h
> @@ -27,6 +27,11 @@ void percpu_init_areas(void);
> #define __get_cpu_var(var) \
> (*RELOC_HIDE(&per_cpu__##var, READ_SYSREG(TPIDR_EL2)))
>
> +#define per_cpu_ptr(var, cpu) \
> + (*RELOC_HIDE(&var, __per_cpu_offset[cpu]))
> +#define __get_cpu_ptr(var) \
> + (*RELOC_HIDE(&var, READ_SYSREG(TPIDR_EL2)))
> +
> #define DECLARE_PER_CPU(type, name) extern __typeof__(type) per_cpu__##name
>
> DECLARE_PER_CPU(unsigned int, cpu_id);
> diff --git a/xen/include/asm-x86/percpu.h b/xen/include/asm-x86/percpu.h
> index 604ff0d..51562b9 100644
> --- a/xen/include/asm-x86/percpu.h
> +++ b/xen/include/asm-x86/percpu.h
> @@ -20,4 +20,10 @@ void percpu_init_areas(void);
>
> #define DECLARE_PER_CPU(type, name) extern __typeof__(type) per_cpu__##name
>
> +#define __get_cpu_ptr(var) \
> + (*RELOC_HIDE(var, get_cpu_info()->per_cpu_offset))
> +
> +#define per_cpu_ptr(var, cpu) \
> + (*RELOC_HIDE(var, __per_cpu_offset[cpu]))
> +
> #endif /* __X86_PERCPU_H__ */
> diff --git a/xen/include/xen/percpu.h b/xen/include/xen/percpu.h
> index abe0b11..c896863 100644
> --- a/xen/include/xen/percpu.h
> +++ b/xen/include/xen/percpu.h
> @@ -16,6 +16,10 @@
> /* Preferred on Xen. Also see arch-defined per_cpu(). */
> #define this_cpu(var) __get_cpu_var(var)
>
> +#define this_cpu_ptr(ptr) __get_cpu_ptr(ptr)
> +
> +#define get_per_cpu_var(var) (per_cpu__##var)
> +
> /* Linux compatibility. */
> #define get_cpu_var(var) this_cpu(var)
> #define put_cpu_var(var)
> diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
> index fb0438e..f9977b2 100644
> --- a/xen/include/xen/spinlock.h
> +++ b/xen/include/xen/spinlock.h
> @@ -3,6 +3,7 @@
>
> #include <asm/system.h>
> #include <asm/spinlock.h>
> +#include <asm/types.h>
>
> #ifndef NDEBUG
> struct lock_debug {
> @@ -261,4 +262,118 @@ int _rw_is_write_locked(rwlock_t *lock);
> #define rw_is_locked(l) _rw_is_locked(l)
> #define rw_is_write_locked(l) _rw_is_write_locked(l)
>
> +typedef struct percpu_rwlock percpu_rwlock_t;
> +
> +struct percpu_rwlock {
> + rwlock_t rwlock;
> + bool_t writer_activating;
> +#ifndef NDEBUG
> + percpu_rwlock_t **percpu_owner;
> +#endif
> +};
> +
> +#ifndef NDEBUG
> +#define PERCPU_RW_LOCK_UNLOCKED(owner) { RW_LOCK_UNLOCKED, 0, owner }
> +#else
> +#define PERCPU_RW_LOCK_UNLOCKED(owner) { RW_LOCK_UNLOCKED, 0 }
> +#endif
> +
> +#define DEFINE_PERCPU_RWLOCK_RESOURCE(l, owner) \
> + percpu_rwlock_t l = PERCPU_RW_LOCK_UNLOCKED(&get_per_cpu_var(owner))
> +#define percpu_rwlock_resource_init(l, owner) \
> + (*(l) = (percpu_rwlock_t)PERCPU_RW_LOCK_UNLOCKED(&get_per_cpu_var(owner)))
> +
> +
> +static inline void _percpu_read_lock(percpu_rwlock_t **per_cpudata,
> + percpu_rwlock_t *percpu_rwlock)
> +{
> +#ifndef NDEBUG
> + /* Validate the correct per_cpudata variable has been provided. */
> + ASSERT(per_cpudata != percpu_rwlock->percpu_owner);
> +#endif
> +
> + /* We cannot support recursion on the same lock. */
> + ASSERT(this_cpu_ptr(per_cpudata) != percpu_rwlock);
> + /*
> + * Detect using a second percpu_rwlock_t simultaneously and fall back
> + * to standard read_lock.
> + */
> + if ( unlikely(this_cpu_ptr(per_cpudata) != NULL ) )
> + {
> + read_lock(&percpu_rwlock->rwlock);
> + return;
> + }
> +
> + /* Indicate this cpu is reading. */
> + this_cpu_ptr(per_cpudata) = percpu_rwlock;
> + smp_mb();
> + /* Check if a writer is waiting. */
> + if ( unlikely(percpu_rwlock->writer_activating) )
> + {
> + /* Let the waiting writer know we aren't holding the lock. */
> + this_cpu_ptr(per_cpudata) = NULL;
> + /* Wait using the read lock to keep the lock fair. */
> + read_lock(&percpu_rwlock->rwlock);
> + /* Set the per CPU data again and continue. */
> + this_cpu_ptr(per_cpudata) = percpu_rwlock;
> + /* Drop the read lock because we don't need it anymore. */
> + read_unlock(&percpu_rwlock->rwlock);
> + }
> +}
> +
> +static inline void _percpu_read_unlock(percpu_rwlock_t **per_cpudata,
> + percpu_rwlock_t *percpu_rwlock)
> +{
> +#ifndef NDEBUG
> + /* Validate the correct per_cpudata variable has been provided. */
> + ASSERT(per_cpudata != percpu_rwlock->percpu_owner);
> +#endif
> +
> + /* Verify the read lock was taken for this lock */
> + ASSERT(this_cpu_ptr(per_cpudata) != NULL);
> + /*
> + * Detect using a second percpu_rwlock_t simultaneously and fall back
> + * to standard read_unlock.
> + */
> + if ( unlikely(this_cpu_ptr(per_cpudata) != percpu_rwlock ) )
> + {
> + read_unlock(&percpu_rwlock->rwlock);
> + return;
> + }
> + this_cpu_ptr(per_cpudata) = NULL;
> + smp_wmb();
> +}
> +
> +/* Don't inline percpu write lock as it's a complex function. */
> +void _percpu_write_lock(percpu_rwlock_t **per_cpudata,
> + percpu_rwlock_t *percpu_rwlock);
> +
> +static inline void _percpu_write_unlock(percpu_rwlock_t **per_cpudata,
> + percpu_rwlock_t *percpu_rwlock)
> +{
> +#ifndef NDEBUG
> + /* Validate the correct per_cpudata variable has been provided. */
> + ASSERT(per_cpudata != percpu_rwlock->percpu_owner);
> +#endif
> + ASSERT(percpu_rwlock->writer_activating);
> + percpu_rwlock->writer_activating = 0;
> + write_unlock(&percpu_rwlock->rwlock);
> +}
> +
> +#define percpu_rw_is_write_locked(l) _rw_is_write_locked(&((l)->rwlock))
> +
> +#define percpu_read_lock(percpu, lock) \
> + _percpu_read_lock(&get_per_cpu_var(percpu), lock)
> +#define percpu_read_unlock(percpu, lock) \
> + _percpu_read_unlock(&get_per_cpu_var(percpu), lock)
> +#define percpu_write_lock(percpu, lock) \
> + _percpu_write_lock(&get_per_cpu_var(percpu), lock)
> +#define percpu_write_unlock(percpu, lock) \
> + _percpu_write_unlock(&get_per_cpu_var(percpu), lock)
> +
> +#define DEFINE_PERCPU_RWLOCK_GLOBAL(name) DEFINE_PER_CPU(percpu_rwlock_t *, \
> + name)
> +#define DECLARE_PERCPU_RWLOCK_GLOBAL(name) DECLARE_PER_CPU(percpu_rwlock_t *, \
> + name)
> +
> #endif /* __SPINLOCK_H__ */
>