From: Malcolm Crossley <malcolm.crossley@citrix.com>
To: JBeulich@suse.com, ian.campbell@citrix.com,
andrew.cooper3@citrix.com, Marcos.Matsunaga@oracle.com,
keir@xen.org, konrad.wilk@oracle.com,
george.dunlap@eu.citrix.com
Cc: xen-devel@lists.xenproject.org, dario.faggioli@citrix.com,
stefano.stabellini@citrix.com
Subject: Re: [PATCHv3 1/3] rwlock: Add per-cpu reader-writer lock infrastructure
Date: Fri, 18 Dec 2015 10:08:00 +0000
Message-ID: <5673DB00.6050008@citrix.com>
In-Reply-To: <1450356747-29039-2-git-send-email-malcolm.crossley@citrix.com>
I didn't spot that the percpu rwlock owner ASSERT was the wrong way
round. Please review version 4 of the series instead.
Sorry for the noise.
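To be clear about what "the wrong way round" refers to: it is the owner
check in the patch below, e.g. in _percpu_write_lock:

    ASSERT(per_cpudata != percpu_rwlock->percpu_owner);

which presumably becomes an equality check in v4:

    ASSERT(per_cpudata == percpu_rwlock->percpu_owner);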
On 17/12/15 12:52, Malcolm Crossley wrote:
> Per-cpu read-write locks allow the fast path read case to have low
> overhead by only setting/clearing a per-cpu variable when taking the
> read lock. The per-cpu read fast path also avoids locked
> compare-and-swap operations, which can be particularly slow on
> coherent multi-socket systems, especially if there is heavy usage of
> the read lock itself.
>
> The per-cpu reader-writer lock uses a local variable to control
> the read lock fast path. This allows a writer to disable the fast
> path and ensures the readers switch to using the underlying
> read-write lock implementation instead of the per-cpu variable.
>
> Once the writer has taken the write lock and disabled the fast path,
> it must poll the per-cpu variable for all CPUs which have entered
> the critical section for the specific read-write lock the writer is
> attempting to take. This design allows a single per-cpu variable
> to be used for read/write locks belonging to separate data structures.
> If two or more different per-cpu read locks are taken
> simultaneously then the per-cpu data structure is not used and the
> implementation takes the read lock of the underlying read-write lock;
> this behaviour is equivalent to the slow path in terms of performance.
> The per-cpu rwlock is not recursion safe for taking the per-cpu
> read lock because there is no recursion count variable; this is
> functionally equivalent to standard spin locks.
>
> Slow path readers, once unblocked, set the per-cpu variable and then
> drop the read lock. This simplifies the implementation and allows the
> fairness of the underlying read-write lock to be taken advantage of.
>
> There is more overhead on the per-cpu write lock path due to checking
> each CPU's fast path per-cpu variable, but this overhead is likely to
> be hidden by the required delay of waiting for readers to exit the
> critical section. The loop is optimised to only iterate over
> the per-cpu data of active readers of the rwlock. The cpumask_t for
> tracking the active readers is stored in a single per-cpu data
> location and thus the write lock is not pre-emption safe. Therefore
> the per-cpu write lock can only be used with interrupts disabled.
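
For reviewers, a minimal usage sketch of the API this patch adds. The
"mytable" resource and the init/reader/writer functions are purely
illustrative; only the DEFINE_PERCPU_RWLOCK_GLOBAL,
percpu_rwlock_resource_init and percpu_read/write_lock/unlock macros
come from this patch:

    /* One per-cpu owner pointer, shared by all locks of this class. */
    DEFINE_PERCPU_RWLOCK_GLOBAL(mytable_percpu_rwlock);

    struct mytable {
        percpu_rwlock_t lock;
        /* ... data protected by the lock ... */
    };

    static void mytable_init(struct mytable *t)
    {
        /* Tie this lock instance to the per-cpu owner variable. */
        percpu_rwlock_resource_init(&t->lock, mytable_percpu_rwlock);
    }

    static void mytable_read(struct mytable *t)
    {
        /* Fast path: a per-cpu pointer write plus a barrier. */
        percpu_read_lock(mytable_percpu_rwlock, &t->lock);
        /* ... read-side critical section ... */
        percpu_read_unlock(mytable_percpu_rwlock, &t->lock);
    }

    static void mytable_write(struct mytable *t)
    {
        unsigned long flags;

        /* As noted above, the write side is only safe with interrupts
         * disabled. */
        local_irq_save(flags);
        percpu_write_lock(mytable_percpu_rwlock, &t->lock);
        /* ... write-side critical section ... */
        percpu_write_unlock(mytable_percpu_rwlock, &t->lock);
        local_irq_restore(flags);
    }

Patches 2/3 and 3/3 convert the grant table and p2m rwlocks to this
pattern.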
>
> Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com>
> --
> Changes since v2:
> - Remove stray hard tabs
>
> Changes since v1:
> - Removed restriction on taking two or more different percpu rw
> locks simultaneously
> - Moved fast-path/slow-path barrier to be per lock instead of global
> - Created separate percpu_rwlock_t type and added macros to
> initialise new type
> - Added helper macros for using the percpu rwlock itself
> - Moved active_readers cpumask off the stack and into a percpu
> variable
> ---
> xen/common/spinlock.c | 46 +++++++++++++++++
> xen/include/asm-arm/percpu.h | 5 ++
> xen/include/asm-x86/percpu.h | 6 +++
> xen/include/xen/percpu.h | 4 ++
> xen/include/xen/spinlock.h | 115 +++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 176 insertions(+)
>
> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
> index 7f89694..524ab6e 100644
> --- a/xen/common/spinlock.c
> +++ b/xen/common/spinlock.c
> @@ -10,6 +10,8 @@
> #include <asm/processor.h>
> #include <asm/atomic.h>
>
> +static DEFINE_PER_CPU(cpumask_t, percpu_rwlock_readers);
> +
> #ifndef NDEBUG
>
> static atomic_t spin_debug __read_mostly = ATOMIC_INIT(0);
> @@ -492,6 +494,50 @@ int _rw_is_write_locked(rwlock_t *lock)
> return (lock->lock == RW_WRITE_FLAG); /* writer in critical section? */
> }
>
> +void _percpu_write_lock(percpu_rwlock_t **per_cpudata,
> + percpu_rwlock_t *percpu_rwlock)
> +{
> + unsigned int cpu;
> + cpumask_t *rwlock_readers = &this_cpu(percpu_rwlock_readers);
> +
> +#ifndef NDEBUG
> + /* Validate the correct per_cpudata variable has been provided. */
> + ASSERT(per_cpudata != percpu_rwlock->percpu_owner);
> +#endif
> + /*
> + * First take the write lock to protect against other writers or slow
> + * path readers.
> + */
> + write_lock(&percpu_rwlock->rwlock);
> +
> + /* Now set the global variable so that readers start using read_lock. */
> + percpu_rwlock->writer_activating = 1;
> + smp_mb();
> +
> + /* Using a per cpu cpumask is only safe if there is no nesting. */
> + ASSERT(!in_irq());
> + cpumask_copy(rwlock_readers, &cpu_online_map);
> +
> + /* Check if there are any percpu readers in progress on this rwlock. */
> + for ( ; ; )
> + {
> + for_each_cpu(cpu, rwlock_readers)
> + {
> + /*
> + * Remove any percpu readers not contending on this rwlock
> + * from our check mask.
> + */
> + if ( per_cpu_ptr(per_cpudata, cpu) != percpu_rwlock )
> + __cpumask_clear_cpu(cpu, rwlock_readers);
> + }
> + /* Check if we've cleared all percpu readers from check mask. */
> + if ( cpumask_empty(rwlock_readers) )
> + break;
> + /* Give the coherency fabric a break. */
> + cpu_relax();
> + };
> +}
> +
> #ifdef LOCK_PROFILE
>
> struct lock_profile_anc {
> diff --git a/xen/include/asm-arm/percpu.h b/xen/include/asm-arm/percpu.h
> index 71e7649..c308a56 100644
> --- a/xen/include/asm-arm/percpu.h
> +++ b/xen/include/asm-arm/percpu.h
> @@ -27,6 +27,11 @@ void percpu_init_areas(void);
> #define __get_cpu_var(var) \
> (*RELOC_HIDE(&per_cpu__##var, READ_SYSREG(TPIDR_EL2)))
>
> +#define per_cpu_ptr(var, cpu) \
> + (*RELOC_HIDE(&var, __per_cpu_offset[cpu]))
> +#define __get_cpu_ptr(var) \
> + (*RELOC_HIDE(&var, READ_SYSREG(TPIDR_EL2)))
> +
> #define DECLARE_PER_CPU(type, name) extern __typeof__(type) per_cpu__##name
>
> DECLARE_PER_CPU(unsigned int, cpu_id);
> diff --git a/xen/include/asm-x86/percpu.h b/xen/include/asm-x86/percpu.h
> index 604ff0d..51562b9 100644
> --- a/xen/include/asm-x86/percpu.h
> +++ b/xen/include/asm-x86/percpu.h
> @@ -20,4 +20,10 @@ void percpu_init_areas(void);
>
> #define DECLARE_PER_CPU(type, name) extern __typeof__(type) per_cpu__##name
>
> +#define __get_cpu_ptr(var) \
> + (*RELOC_HIDE(var, get_cpu_info()->per_cpu_offset))
> +
> +#define per_cpu_ptr(var, cpu) \
> + (*RELOC_HIDE(var, __per_cpu_offset[cpu]))
> +
> #endif /* __X86_PERCPU_H__ */
> diff --git a/xen/include/xen/percpu.h b/xen/include/xen/percpu.h
> index abe0b11..c896863 100644
> --- a/xen/include/xen/percpu.h
> +++ b/xen/include/xen/percpu.h
> @@ -16,6 +16,10 @@
> /* Preferred on Xen. Also see arch-defined per_cpu(). */
> #define this_cpu(var) __get_cpu_var(var)
>
> +#define this_cpu_ptr(ptr) __get_cpu_ptr(ptr)
> +
> +#define get_per_cpu_var(var) (per_cpu__##var)
> +
> /* Linux compatibility. */
> #define get_cpu_var(var) this_cpu(var)
> #define put_cpu_var(var)
> diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
> index fb0438e..f9977b2 100644
> --- a/xen/include/xen/spinlock.h
> +++ b/xen/include/xen/spinlock.h
> @@ -3,6 +3,7 @@
>
> #include <asm/system.h>
> #include <asm/spinlock.h>
> +#include <asm/types.h>
>
> #ifndef NDEBUG
> struct lock_debug {
> @@ -261,4 +262,118 @@ int _rw_is_write_locked(rwlock_t *lock);
> #define rw_is_locked(l) _rw_is_locked(l)
> #define rw_is_write_locked(l) _rw_is_write_locked(l)
>
> +typedef struct percpu_rwlock percpu_rwlock_t;
> +
> +struct percpu_rwlock {
> + rwlock_t rwlock;
> + bool_t writer_activating;
> +#ifndef NDEBUG
> + percpu_rwlock_t **percpu_owner;
> +#endif
> +};
> +
> +#ifndef NDEBUG
> +#define PERCPU_RW_LOCK_UNLOCKED(owner) { RW_LOCK_UNLOCKED, 0, owner }
> +#else
> +#define PERCPU_RW_LOCK_UNLOCKED(owner) { RW_LOCK_UNLOCKED, 0 }
> +#endif
> +
> +#define DEFINE_PERCPU_RWLOCK_RESOURCE(l, owner) \
> + percpu_rwlock_t l = PERCPU_RW_LOCK_UNLOCKED(&get_per_cpu_var(owner))
> +#define percpu_rwlock_resource_init(l, owner) \
> + (*(l) = (percpu_rwlock_t)PERCPU_RW_LOCK_UNLOCKED(&get_per_cpu_var(owner)))
> +
> +
> +static inline void _percpu_read_lock(percpu_rwlock_t **per_cpudata,
> + percpu_rwlock_t *percpu_rwlock)
> +{
> +#ifndef NDEBUG
> + /* Validate the correct per_cpudata variable has been provided. */
> + ASSERT(per_cpudata != percpu_rwlock->percpu_owner);
> +#endif
> +
> + /* We cannot support recursion on the same lock. */
> + ASSERT(this_cpu_ptr(per_cpudata) != percpu_rwlock);
> + /*
> + * Detect using a second percpu_rwlock_t simultaneously and fall back
> + * to standard read_lock.
> + */
> + if ( unlikely(this_cpu_ptr(per_cpudata) != NULL ) )
> + {
> + read_lock(&percpu_rwlock->rwlock);
> + return;
> + }
> +
> + /* Indicate this cpu is reading. */
> + this_cpu_ptr(per_cpudata) = percpu_rwlock;
> + smp_mb();
> + /* Check if a writer is waiting. */
> + if ( unlikely(percpu_rwlock->writer_activating) )
> + {
> + /* Let the waiting writer know we aren't holding the lock. */
> + this_cpu_ptr(per_cpudata) = NULL;
> + /* Wait using the read lock to keep the lock fair. */
> + read_lock(&percpu_rwlock->rwlock);
> + /* Set the per CPU data again and continue. */
> + this_cpu_ptr(per_cpudata) = percpu_rwlock;
> + /* Drop the read lock because we don't need it anymore. */
> + read_unlock(&percpu_rwlock->rwlock);
> + }
> +}
> +
> +static inline void _percpu_read_unlock(percpu_rwlock_t **per_cpudata,
> + percpu_rwlock_t *percpu_rwlock)
> +{
> +#ifndef NDEBUG
> + /* Validate the correct per_cpudata variable has been provided. */
> + ASSERT(per_cpudata != percpu_rwlock->percpu_owner);
> +#endif
> +
> + /* Verify the read lock was taken for this lock */
> + ASSERT(this_cpu_ptr(per_cpudata) != NULL);
> + /*
> + * Detect using a second percpu_rwlock_t simultaneously and fall back
> + * to standard read_unlock.
> + */
> + if ( unlikely(this_cpu_ptr(per_cpudata) != percpu_rwlock ) )
> + {
> + read_unlock(&percpu_rwlock->rwlock);
> + return;
> + }
> + this_cpu_ptr(per_cpudata) = NULL;
> + smp_wmb();
> +}
> +
> +/* Don't inline percpu write lock as it's a complex function. */
> +void _percpu_write_lock(percpu_rwlock_t **per_cpudata,
> + percpu_rwlock_t *percpu_rwlock);
> +
> +static inline void _percpu_write_unlock(percpu_rwlock_t **per_cpudata,
> + percpu_rwlock_t *percpu_rwlock)
> +{
> +#ifndef NDEBUG
> + /* Validate the correct per_cpudata variable has been provided. */
> + ASSERT(per_cpudata != percpu_rwlock->percpu_owner);
> +#endif
> + ASSERT(percpu_rwlock->writer_activating);
> + percpu_rwlock->writer_activating = 0;
> + write_unlock(&percpu_rwlock->rwlock);
> +}
> +
> +#define percpu_rw_is_write_locked(l) _rw_is_write_locked(&((l)->rwlock))
> +
> +#define percpu_read_lock(percpu, lock) \
> + _percpu_read_lock(&get_per_cpu_var(percpu), lock)
> +#define percpu_read_unlock(percpu, lock) \
> + _percpu_read_unlock(&get_per_cpu_var(percpu), lock)
> +#define percpu_write_lock(percpu, lock) \
> + _percpu_write_lock(&get_per_cpu_var(percpu), lock)
> +#define percpu_write_unlock(percpu, lock) \
> + _percpu_write_unlock(&get_per_cpu_var(percpu), lock)
> +
> +#define DEFINE_PERCPU_RWLOCK_GLOBAL(name) DEFINE_PER_CPU(percpu_rwlock_t *, \
> + name)
> +#define DECLARE_PERCPU_RWLOCK_GLOBAL(name) DECLARE_PER_CPU(percpu_rwlock_t *, \
> + name)
> +
> #endif /* __SPINLOCK_H__ */
>