[PATCHv5 0/5] Use ticket locks for spinlocks

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCHv5 0/5] Use ticket locks for spinlocks
@ 2015-05-11 14:37 David Vrabel
  2015-05-11 14:37 ` [PATCHv4 1/5] x86: provide add_sized() David Vrabel
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: David Vrabel @ 2015-05-11 14:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
	David Vrabel, Jan Beulich, Jennifer Herbert

Use ticket locks for spin locks instead of the current byte locks.
Ticket locks are fair.  This is an important property for hypervisor
locks.

Note that spin_lock_irq() and spin_lock_irqsave() now spin with irqs
disabled (previously, they would spin with irqs enabled if possible).
This is required to prevent deadlocks when the irq handler tries to
take the same lock with a higher ticket.

We have analysed the affect of this series on interrupt latency (by
measuring the duration of irq disable/enable regions) and there is no
signficant impact.

http://xenbits.xen.org/people/dvrabel/ticket-locks/busy_comp.png

Thanks to Jennifer Herbert for performing this analysis.

Performance results (using a earlier prototype) of aggregate network
throughput between 20 VIF pairs shows good improvements.

http://xenbits.xen.org/people/dvrabel/3336.png

This benchmark is limited by contention on the map track lock.

Changes in v5:
- _spin_barrier() fixes (re-add smp_mb(), profiling, cpu_relax()).
- Cleanup x86 add_sized().

Changes in v4:
- Use new add_sized() to increment ticket head.
- Add arch_lock_acquire_barrier() and arch_lock_release_barrier().
- Rename xadd() to arch_fetch_and_add().
- Shrink struct domain by 16 bytes.

Changes in v3:
- xadd() implementation for arm32 and arm64.
- Add barriers required on arm.
- Only wait for the current lock holder in _spin_barrier().

Changes in v2:
- Reimplemented in C, requiring only an arch-specific xadd() op
- Optimize spin_lock_recursive() to call spin_lock() since trylock
  loops are bad with ticket locks.

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCHv4 1/5] x86: provide add_sized()
  2015-05-11 14:37 [PATCHv5 0/5] Use ticket locks for spinlocks David Vrabel
@ 2015-05-11 14:37 ` David Vrabel
  2015-05-13  9:23   ` Jan Beulich
  2015-05-11 14:37 ` [PATCHv4 2/5] arm: " David Vrabel
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: David Vrabel @ 2015-05-11 14:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
	David Vrabel, Jan Beulich, Jennifer Herbert

add_sized(ptr, inc) adds inc to the value at ptr using only the correct
size of loads and stores for the type of *ptr.  The add is /not/ atomic.

This is needed for ticket locks to ensure the increment of the head ticket
does not affect the tail ticket.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/include/asm-x86/atomic.h |   27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/xen/include/asm-x86/atomic.h b/xen/include/asm-x86/atomic.h
index 52af083..74fa0bc 100644
--- a/xen/include/asm-x86/atomic.h
+++ b/xen/include/asm-x86/atomic.h
@@ -14,6 +14,14 @@ static inline void name(volatile type *addr, type val) \
 { asm volatile("mov" size " %1,%0": "=m" (*(volatile type *)addr) \
 :reg (val) barrier); }
 
+#define build_add_sized(name, size, type, reg) \
+    static inline void name(volatile type *addr, type val)              \
+    {                                                                   \
+        asm volatile("add" size " %1,%0"                                \
+                     : "=m" (*addr)                                     \
+                     : reg (val));                                      \
+    }
+
 build_read_atomic(read_u8_atomic, "b", uint8_t, "=q", )
 build_read_atomic(read_u16_atomic, "w", uint16_t, "=r", )
 build_read_atomic(read_u32_atomic, "l", uint32_t, "=r", )
@@ -24,8 +32,14 @@ build_write_atomic(write_u16_atomic, "w", uint16_t, "r", )
 build_write_atomic(write_u32_atomic, "l", uint32_t, "r", )
 build_write_atomic(write_u64_atomic, "q", uint64_t, "r", )
 
+build_add_sized(add_u8_sized, "b", uint8_t, "qi")
+build_add_sized(add_u16_sized, "w", uint16_t, "ri")
+build_add_sized(add_u32_sized, "l", uint32_t, "ri")
+build_add_sized(add_u64_sized, "q", uint64_t, "ri")
+
 #undef build_read_atomic
 #undef build_write_atomic
+#undef build_add_sized
 
 void __bad_atomic_size(void);
 
@@ -53,6 +67,19 @@ void __bad_atomic_size(void);
     }                                                     \
 })
 
+#define add_sized(p, x) ({                                \
+    typeof(*(p)) __x = (x);                               \
+    unsigned long x_ = (unsigned long)__x;                \
+    switch ( sizeof(*(p)) )                               \
+    {                                                     \
+    case 1: add_u8_sized((uint8_t *)(p), x_); break;      \
+    case 2: add_u16_sized((uint16_t *)(p), x_); break;    \
+    case 4: add_u32_sized((uint32_t *)(p), x_); break;    \
+    case 8: add_u64_sized((uint64_t *)(p), x_); break;    \
+    default: __bad_atomic_size(); break;                  \
+    }                                                     \
+})
+
 /*
  * NB. I've pushed the volatile qualifier into the operations. This allows
  * fast accessors such as _atomic_read() and _atomic_set() which don't give
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCHv4 2/5] arm: provide add_sized()
  2015-05-11 14:37 [PATCHv5 0/5] Use ticket locks for spinlocks David Vrabel
  2015-05-11 14:37 ` [PATCHv4 1/5] x86: provide add_sized() David Vrabel
@ 2015-05-11 14:37 ` David Vrabel
  2015-05-11 14:45   ` David Vrabel
  2015-05-11 14:37 ` [PATCHv4 3/5] xen: use ticket locks for spin locks David Vrabel
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: David Vrabel @ 2015-05-11 14:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
	David Vrabel, Jan Beulich, Jennifer Herbert

add_sized(ptr, inc) adds inc to the value at ptr using only the correct
size of loads and stores for the type of *ptr.  The add is /not/ atomic.

This is needed for ticket locks to ensure the increment of the head ticket
does not affect the tail ticket.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 xen/include/asm-arm/atomic.h |   27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/xen/include/asm-arm/atomic.h b/xen/include/asm-arm/atomic.h
index 7d15fb0..d80ae5d 100644
--- a/xen/include/asm-arm/atomic.h
+++ b/xen/include/asm-arm/atomic.h
@@ -23,6 +23,17 @@ static inline void name(volatile type *addr, type val) \
                  : reg (val));                         \
 }
 
+#define build_add_sized(name, size, width, type, reg) \
+static inline void name(volatile type *addr, type val)                  \
+{                                                                       \
+    type t;                                                             \
+    asm volatile("ldr" size " %"width"1,%0\n"                           \
+                 "add %"width"1,%"width"1,%"width"2\n"                  \
+                 "str" size " %"width"1,%0"                             \
+                 : "=m" (*(volatile type *)addr), "=r" (t)              \
+                 : reg (val));                                          \
+}
+
 #if defined (CONFIG_ARM_32)
 #define BYTE ""
 #define WORD ""
@@ -46,6 +57,10 @@ build_atomic_read(read_u64_atomic, "x", uint64_t, "=r")
 build_atomic_write(write_u64_atomic, "x", uint64_t, "r")
 #endif
 
+build_add_sized(add_u8_sized, "b", BYTE, uint8_t, "ri")
+build_add_sized(add_u16_sized, "h", WORD, uint16_t, "ri")
+build_add_sized(add_u32_sized, "", WORD, uint32_t, "ri")
+
 void __bad_atomic_size(void);
 
 #define read_atomic(p) ({                                               \
@@ -70,6 +85,18 @@ void __bad_atomic_size(void);
     __x;                                                                \
 })
 
+#define add_sized(p, x) ({                                              \
+    typeof(*p) __x = (x);                                               \
+    switch ( sizeof(*p) )                                               \
+    {                                                                   \
+    case 1: add_u8_sized((uint8_t *)p, (uint8_t)__x); break;            \
+    case 2: add_u16_sized((uint16_t *)p, (uint16_t)__x); break;         \
+    case 4: add_u32_sized((uint32_t *)p, (uint32_t)__x); break;         \
+    default: __bad_atomic_size(); break;                                \
+    }                                                                   \
+    __x;                                                                \
+})
+    
 /*
  * NB. I've pushed the volatile qualifier into the operations. This allows
  * fast accessors such as _atomic_read() and _atomic_set() which don't give
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCHv4 3/5] xen: use ticket locks for spin locks
  2015-05-11 14:37 [PATCHv5 0/5] Use ticket locks for spinlocks David Vrabel
  2015-05-11 14:37 ` [PATCHv4 1/5] x86: provide add_sized() David Vrabel
  2015-05-11 14:37 ` [PATCHv4 2/5] arm: " David Vrabel
@ 2015-05-11 14:37 ` David Vrabel
  2015-05-13  9:31   ` Jan Beulich
  2015-05-14 10:36   ` Tim Deegan
  2015-05-11 14:37 ` [PATCHv4 4/5] x86, arm: remove asm/spinlock.h from all architectures David Vrabel
  2015-05-11 14:37 ` [PATCHv4 5/5] x86: reduce struct hvm_domain size David Vrabel
  4 siblings, 2 replies; 19+ messages in thread
From: David Vrabel @ 2015-05-11 14:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
	David Vrabel, Jan Beulich, Jennifer Herbert

Replace the byte locks with ticket locks.  Ticket locks are: a) fair;
and b) peform better when contented since they spin without an atomic
operation.

The lock is split into two ticket values: head and tail.  A locker
acquires a ticket by (atomically) increasing tail and using the
previous tail value.  A CPU holds the lock if its ticket == head.  The
lock is released by increasing head.

spin_lock_irq() and spin_lock_irqsave() now spin with irqs disabled
(previously, they would spin with irqs enabled if possible).  This is
required to prevent deadlocks when the irq handler tries to take the
same lock with a higher ticket.

Architectures need only provide arch_fetch_and_add() and two barriers:
arch_lock_acquire_barrier() and arch_lock_release_barrier().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/common/spinlock.c        |  118 ++++++++++++++++++++++++------------------
 xen/include/asm-arm/system.h |    3 ++
 xen/include/asm-x86/system.h |    3 ++
 xen/include/xen/spinlock.h   |   16 ++++--
 4 files changed, 86 insertions(+), 54 deletions(-)

diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 5fd8b1c..49948dd 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -115,94 +115,100 @@ void spin_debug_disable(void)
 
 #endif
 
+static always_inline spinlock_tickets_t observe_lock(spinlock_tickets_t *t)
+{
+    spinlock_tickets_t v;
+
+    smp_rmb();
+    v.head_tail = read_atomic(&t->head_tail);
+    return v;
+}
+
+static always_inline u16 observe_head(spinlock_tickets_t *t)
+{
+    smp_rmb();
+    return read_atomic(&t->head);
+}
+
 void _spin_lock(spinlock_t *lock)
 {
+    spinlock_tickets_t tickets = { .tail = 1, };
     LOCK_PROFILE_VAR;
 
     check_lock(&lock->debug);
-    while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
+    tickets.head_tail = arch_fetch_and_add(&lock->tickets.head_tail,
+                                           tickets.head_tail);
+    while ( tickets.tail != observe_head(&lock->tickets) )
     {
         LOCK_PROFILE_BLOCK;
-        while ( likely(_raw_spin_is_locked(&lock->raw)) )
-            cpu_relax();
+        cpu_relax();
     }
     LOCK_PROFILE_GOT;
     preempt_disable();
+    arch_lock_acquire_barrier();
 }
 
 void _spin_lock_irq(spinlock_t *lock)
 {
-    LOCK_PROFILE_VAR;
-
     ASSERT(local_irq_is_enabled());
     local_irq_disable();
-    check_lock(&lock->debug);
-    while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
-    {
-        LOCK_PROFILE_BLOCK;
-        local_irq_enable();
-        while ( likely(_raw_spin_is_locked(&lock->raw)) )
-            cpu_relax();
-        local_irq_disable();
-    }
-    LOCK_PROFILE_GOT;
-    preempt_disable();
+    _spin_lock(lock);
 }
 
 unsigned long _spin_lock_irqsave(spinlock_t *lock)
 {
     unsigned long flags;
-    LOCK_PROFILE_VAR;
 
     local_irq_save(flags);
-    check_lock(&lock->debug);
-    while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
-    {
-        LOCK_PROFILE_BLOCK;
-        local_irq_restore(flags);
-        while ( likely(_raw_spin_is_locked(&lock->raw)) )
-            cpu_relax();
-        local_irq_disable();
-    }
-    LOCK_PROFILE_GOT;
-    preempt_disable();
+    _spin_lock(lock);
     return flags;
 }
 
 void _spin_unlock(spinlock_t *lock)
 {
+    arch_lock_release_barrier();
     preempt_enable();
     LOCK_PROFILE_REL;
-    _raw_spin_unlock(&lock->raw);
+    add_sized(&lock->tickets.head, 1);
 }
 
 void _spin_unlock_irq(spinlock_t *lock)
 {
-    preempt_enable();
-    LOCK_PROFILE_REL;
-    _raw_spin_unlock(&lock->raw);
+    _spin_unlock(lock);
     local_irq_enable();
 }
 
 void _spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags)
 {
-    preempt_enable();
-    LOCK_PROFILE_REL;
-    _raw_spin_unlock(&lock->raw);
+    _spin_unlock(lock);
     local_irq_restore(flags);
 }
 
 int _spin_is_locked(spinlock_t *lock)
 {
     check_lock(&lock->debug);
-    return _raw_spin_is_locked(&lock->raw);
+    return lock->tickets.head != lock->tickets.tail;
 }
 
 int _spin_trylock(spinlock_t *lock)
 {
+    spinlock_tickets_t old, new;
+
     check_lock(&lock->debug);
-    if ( !_raw_spin_trylock(&lock->raw) )
+    old = observe_lock(&lock->tickets);
+    if ( old.head != old.tail )
         return 0;
+    new = old;
+    new.tail++;
+    if ( cmpxchg(&lock->tickets.head_tail,
+                 old.head_tail, new.head_tail) != old.head_tail )
+    {
+        /*
+         * cmpxchg() is a full barrier so no need for an
+         * arch_lock_acquire_barrier().
+         */
+        return 0;
+    }
 #ifdef LOCK_PROFILE
     if (lock->profile)
         lock->profile->time_locked = NOW();
@@ -213,27 +219,32 @@ int _spin_trylock(spinlock_t *lock)
 
 void _spin_barrier(spinlock_t *lock)
 {
+    spinlock_tickets_t sample;
 #ifdef LOCK_PROFILE
     s_time_t block = NOW();
-    u64      loop = 0;
+#endif
 
     check_barrier(&lock->debug);
-    do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
-    if ((loop > 1) && lock->profile)
+    smp_mb();
+    sample = observe_lock(&lock->tickets);
+    if ( sample.head != sample.tail )
     {
-        lock->profile->time_block += NOW() - block;
-        lock->profile->block_cnt++;
-    }
-#else
-    check_barrier(&lock->debug);
-    do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
+        while ( observe_head(&lock->tickets) != sample.tail )
+            cpu_relax();
+#ifdef LOCK_PROFILE
+        if ( lock->profile )
+        {
+            lock->profile->time_block += NOW() - block;
+            lock->profile->block_cnt++;
+        }
 #endif
+    }
     smp_mb();
 }
 
 int _spin_trylock_recursive(spinlock_t *lock)
 {
-    int cpu = smp_processor_id();
+    unsigned int cpu = smp_processor_id();
 
     /* Don't allow overflow of recurse_cpu field. */
     BUILD_BUG_ON(NR_CPUS > 0xfffu);
@@ -256,8 +267,17 @@ int _spin_trylock_recursive(spinlock_t *lock)
 
 void _spin_lock_recursive(spinlock_t *lock)
 {
-    while ( !spin_trylock_recursive(lock) )
-        cpu_relax();
+    unsigned int cpu = smp_processor_id();
+
+    if ( likely(lock->recurse_cpu != cpu) )
+    {
+        _spin_lock(lock);
+        lock->recurse_cpu = cpu;
+    }
+
+    /* We support only fairly shallow recursion, else the counter overflows. */
+    ASSERT(lock->recurse_cnt < 0xfu);
+    lock->recurse_cnt++;
 }
 
 void _spin_unlock_recursive(spinlock_t *lock)
diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
index 2eb96e8..f0e222f 100644
--- a/xen/include/asm-arm/system.h
+++ b/xen/include/asm-arm/system.h
@@ -53,6 +53,9 @@
 
 #define arch_fetch_and_add(x, v) __sync_fetch_and_add(x, v)
 
+#define arch_lock_acquire_barrier() smp_mb()
+#define arch_lock_release_barrier() smp_mb()
+
 extern struct vcpu *__context_switch(struct vcpu *prev, struct vcpu *next);
 
 #endif
diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
index 9fb70f5..4f66f78 100644
--- a/xen/include/asm-x86/system.h
+++ b/xen/include/asm-x86/system.h
@@ -185,6 +185,9 @@ static always_inline unsigned long __xadd(
 #define set_mb(var, value) do { xchg(&var, value); } while (0)
 #define set_wmb(var, value) do { var = value; wmb(); } while (0)
 
+#define arch_lock_acquire_barrier() barrier()
+#define arch_lock_release_barrier() barrier()
+
 #define local_irq_disable()     asm volatile ( "cli" : : : "memory" )
 #define local_irq_enable()      asm volatile ( "sti" : : : "memory" )
 
diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
index eda9b2e..bafbc74 100644
--- a/xen/include/xen/spinlock.h
+++ b/xen/include/xen/spinlock.h
@@ -80,8 +80,7 @@ struct lock_profile_qhead {
     static struct lock_profile *__lock_profile_##name                         \
     __used_section(".lockprofile.data") =                                     \
     &__lock_profile_data_##name
-#define _SPIN_LOCK_UNLOCKED(x) { _RAW_SPIN_LOCK_UNLOCKED, 0xfffu, 0,          \
-                                 _LOCK_DEBUG, x }
+#define _SPIN_LOCK_UNLOCKED(x) { { 0 }, 0xfffu, 0, _LOCK_DEBUG, x }
 #define SPIN_LOCK_UNLOCKED _SPIN_LOCK_UNLOCKED(NULL)
 #define DEFINE_SPINLOCK(l)                                                    \
     spinlock_t l = _SPIN_LOCK_UNLOCKED(NULL);                                 \
@@ -117,8 +116,7 @@ extern void spinlock_profile_reset(unsigned char key);
 
 struct lock_profile_qhead { };
 
-#define SPIN_LOCK_UNLOCKED                                                    \
-    { _RAW_SPIN_LOCK_UNLOCKED, 0xfffu, 0, _LOCK_DEBUG }
+#define SPIN_LOCK_UNLOCKED { { 0 }, 0xfffu, 0, _LOCK_DEBUG }
 #define DEFINE_SPINLOCK(l) spinlock_t l = SPIN_LOCK_UNLOCKED
 
 #define spin_lock_init_prof(s, l) spin_lock_init(&((s)->l))
@@ -127,8 +125,16 @@ struct lock_profile_qhead { };
 
 #endif
 
+typedef union {
+    u32 head_tail;
+    struct {
+        u16 head;
+        u16 tail;
+    };
+} spinlock_tickets_t;
+
 typedef struct spinlock {
-    raw_spinlock_t raw;
+    spinlock_tickets_t tickets;
     u16 recurse_cpu:12;
     u16 recurse_cnt:4;
     struct lock_debug debug;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCHv4 4/5] x86, arm: remove asm/spinlock.h from all architectures
  2015-05-11 14:37 [PATCHv5 0/5] Use ticket locks for spinlocks David Vrabel
                   ` (2 preceding siblings ...)
  2015-05-11 14:37 ` [PATCHv4 3/5] xen: use ticket locks for spin locks David Vrabel
@ 2015-05-11 14:37 ` David Vrabel
  2015-05-11 14:37 ` [PATCHv4 5/5] x86: reduce struct hvm_domain size David Vrabel
  4 siblings, 0 replies; 19+ messages in thread
From: David Vrabel @ 2015-05-11 14:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
	David Vrabel, Jan Beulich, Jennifer Herbert

Now that all architecture use a common ticket lock implementation for
spinlocks, remove the architecture specific byte lock implementations.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 xen/arch/arm/README.LinuxPrimitives  |   28 ---------------
 xen/include/asm-arm/arm32/spinlock.h |   66 ----------------------------------
 xen/include/asm-arm/arm64/spinlock.h |   63 --------------------------------
 xen/include/asm-arm/spinlock.h       |   23 ------------
 xen/include/asm-x86/spinlock.h       |   37 -------------------
 xen/include/xen/spinlock.h           |    1 -
 6 files changed, 218 deletions(-)
 delete mode 100644 xen/include/asm-arm/arm32/spinlock.h
 delete mode 100644 xen/include/asm-arm/arm64/spinlock.h
 delete mode 100644 xen/include/asm-arm/spinlock.h
 delete mode 100644 xen/include/asm-x86/spinlock.h

diff --git a/xen/arch/arm/README.LinuxPrimitives b/xen/arch/arm/README.LinuxPrimitives
index 7f33fc7..3115f51 100644
--- a/xen/arch/arm/README.LinuxPrimitives
+++ b/xen/arch/arm/README.LinuxPrimitives
@@ -25,16 +25,6 @@ linux/arch/arm64/include/asm/atomic.h   xen/include/asm-arm/arm64/atomic.h
 
 ---------------------------------------------------------------------
 
-spinlocks: last sync @ v3.16-rc6 (last commit: 95c4189689f9)
-
-linux/arch/arm64/include/asm/spinlock.h xen/include/asm-arm/arm64/spinlock.h
-
-Skipped:
-  5686b06 arm64: lockref: add support for lockless lockrefs using cmpxchg
-  52ea2a5 arm64: locks: introduce ticket-based spinlock implementation
-
----------------------------------------------------------------------
-
 mem*: last sync @ v3.16-rc6 (last commit: d875c9b37240)
 
 linux/arch/arm64/lib/memchr.S           xen/arch/arm/arm64/lib/memchr.S
@@ -103,24 +93,6 @@ linux/arch/arm/include/asm/atomic.h     xen/include/asm-arm/arm32/atomic.h
 
 ---------------------------------------------------------------------
 
-spinlocks: last sync: 15e7e5c1ebf5
-
-linux/arch/arm/include/asm/spinlock.h   xen/include/asm-arm/arm32/spinlock.h
-
-*** Linux has switched to ticket locks but we still use bitlocks.
-
-resync to v3.14-rc7:
-
-  7c8746a ARM: 7955/1: spinlock: ensure we have a compiler barrier before sev
-  0cbad9c ARM: 7854/1: lockref: add support for lockless lockrefs using cmpxchg64
-  9bb17be ARM: locks: prefetch the destination word for write prior to strex
-  27a8479 ARM: smp_on_up: move inline asm ALT_SMP patching macro out of spinlock.
-  00efaa0 ARM: 7812/1: rwlocks: retry trylock operation if strex fails on free lo
-  afa31d8 ARM: 7811/1: locks: use early clobber in arch_spin_trylock
-  73a6fdc ARM: spinlock: use inner-shareable dsb variant prior to sev instruction
-
----------------------------------------------------------------------
-
 mem*: last sync @ v3.16-rc6 (last commit: d98b90ea22b0)
 
 linux/arch/arm/lib/copy_template.S      xen/arch/arm/arm32/lib/copy_template.S
diff --git a/xen/include/asm-arm/arm32/spinlock.h b/xen/include/asm-arm/arm32/spinlock.h
deleted file mode 100644
index bc0343c..0000000
--- a/xen/include/asm-arm/arm32/spinlock.h
+++ /dev/null
@@ -1,66 +0,0 @@
-#ifndef __ASM_ARM32_SPINLOCK_H
-#define __ASM_ARM32_SPINLOCK_H
-
-static inline void dsb_sev(void)
-{
-    __asm__ __volatile__ (
-        "dsb\n"
-        "sev\n"
-        );
-}
-
-typedef struct {
-    volatile unsigned int lock;
-} raw_spinlock_t;
-
-#define _RAW_SPIN_LOCK_UNLOCKED { 0 }
-
-#define _raw_spin_is_locked(x)          ((x)->lock != 0)
-
-static always_inline void _raw_spin_unlock(raw_spinlock_t *lock)
-{
-    ASSERT(_raw_spin_is_locked(lock));
-
-    smp_mb();
-
-    __asm__ __volatile__(
-"   str     %1, [%0]\n"
-    :
-    : "r" (&lock->lock), "r" (0)
-    : "cc");
-
-    dsb_sev();
-}
-
-static always_inline int _raw_spin_trylock(raw_spinlock_t *lock)
-{
-    unsigned long contended, res;
-
-    do {
-        __asm__ __volatile__(
-    "   ldrex   %0, [%2]\n"
-    "   teq     %0, #0\n"
-    "   strexeq %1, %3, [%2]\n"
-    "   movne   %1, #0\n"
-        : "=&r" (contended), "=r" (res)
-        : "r" (&lock->lock), "r" (1)
-        : "cc");
-    } while (res);
-
-    if (!contended) {
-        smp_mb();
-        return 1;
-    } else {
-        return 0;
-    }
-}
-
-#endif /* __ASM_SPINLOCK_H */
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/include/asm-arm/arm64/spinlock.h b/xen/include/asm-arm/arm64/spinlock.h
deleted file mode 100644
index 5ae034d..0000000
--- a/xen/include/asm-arm/arm64/spinlock.h
+++ /dev/null
@@ -1,63 +0,0 @@
-/*
- * Derived from Linux arch64 spinlock.h which is:
- * Copyright (C) 2012 ARM Ltd.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef __ASM_ARM64_SPINLOCK_H
-#define __ASM_ARM64_SPINLOCK_H
-
-typedef struct {
-    volatile unsigned int lock;
-} raw_spinlock_t;
-
-#define _RAW_SPIN_LOCK_UNLOCKED { 0 }
-
-#define _raw_spin_is_locked(x)          ((x)->lock != 0)
-
-static always_inline void _raw_spin_unlock(raw_spinlock_t *lock)
-{
-    ASSERT(_raw_spin_is_locked(lock));
-
-    asm volatile(
-        "       stlr    %w1, %0\n"
-        : "=Q" (lock->lock) : "r" (0) : "memory");
-}
-
-static always_inline int _raw_spin_trylock(raw_spinlock_t *lock)
-{
-    unsigned int tmp;
-
-    asm volatile(
-        "2:     ldaxr   %w0, %1\n"
-        "       cbnz    %w0, 1f\n"
-        "       stxr    %w0, %w2, %1\n"
-        "       cbnz    %w0, 2b\n"
-        "1:\n"
-        : "=&r" (tmp), "+Q" (lock->lock)
-        : "r" (1)
-        : "cc", "memory");
-
-    return !tmp;
-}
-
-#endif /* __ASM_SPINLOCK_H */
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/include/asm-arm/spinlock.h b/xen/include/asm-arm/spinlock.h
deleted file mode 100644
index a064f73..0000000
--- a/xen/include/asm-arm/spinlock.h
+++ /dev/null
@@ -1,23 +0,0 @@
-#ifndef __ASM_SPINLOCK_H
-#define __ASM_SPINLOCK_H
-
-#include <xen/config.h>
-#include <xen/lib.h>
-
-#if defined(CONFIG_ARM_32)
-# include <asm/arm32/spinlock.h>
-#elif defined(CONFIG_ARM_64)
-# include <asm/arm64/spinlock.h>
-#else
-# error "unknown ARM variant"
-#endif
-
-#endif /* __ASM_SPINLOCK_H */
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/include/asm-x86/spinlock.h b/xen/include/asm-x86/spinlock.h
deleted file mode 100644
index 757e20b..0000000
--- a/xen/include/asm-x86/spinlock.h
+++ /dev/null
@@ -1,37 +0,0 @@
-#ifndef __ASM_SPINLOCK_H
-#define __ASM_SPINLOCK_H
-
-#include <xen/config.h>
-#include <xen/lib.h>
-#include <asm/atomic.h>
-
-typedef struct {
-    volatile s16 lock;
-} raw_spinlock_t;
-
-#define _RAW_SPIN_LOCK_UNLOCKED /*(raw_spinlock_t)*/ { 1 }
-
-#define _raw_spin_is_locked(x) ((x)->lock <= 0)
-
-static always_inline void _raw_spin_unlock(raw_spinlock_t *lock)
-{
-    ASSERT(_raw_spin_is_locked(lock));
-    asm volatile (
-        "movw $1,%0" 
-        : "=m" (lock->lock) : : "memory" );
-}
-
-static always_inline int _raw_spin_trylock(raw_spinlock_t *lock)
-{
-    s16 oldval;
-    asm volatile (
-        "xchgw %w0,%1"
-        :"=r" (oldval), "=m" (lock->lock)
-        :"0" ((s16)0) : "memory" );
-    return (oldval > 0);
-}
-
-#define _raw_read_unlock(l) \
-    asm volatile ( "lock; dec%z0 %0" : "+m" ((l)->lock) :: "memory" )
-
-#endif /* __ASM_SPINLOCK_H */
diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
index bafbc74..311685a 100644
--- a/xen/include/xen/spinlock.h
+++ b/xen/include/xen/spinlock.h
@@ -2,7 +2,6 @@
 #define __SPINLOCK_H__
 
 #include <asm/system.h>
-#include <asm/spinlock.h>
 
 #ifndef NDEBUG
 struct lock_debug {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCHv4 5/5] x86: reduce struct hvm_domain size
  2015-05-11 14:37 [PATCHv5 0/5] Use ticket locks for spinlocks David Vrabel
                   ` (3 preceding siblings ...)
  2015-05-11 14:37 ` [PATCHv4 4/5] x86, arm: remove asm/spinlock.h from all architectures David Vrabel
@ 2015-05-11 14:37 ` David Vrabel
  2015-05-14 11:24   ` Tim Deegan
  4 siblings, 1 reply; 19+ messages in thread
From: David Vrabel @ 2015-05-11 14:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
	David Vrabel, Jan Beulich, Jennifer Herbert

Pack struct hvm_domain to reduce it by 8 bytes.  Thus reducing the
size of struct domain by 8 bytes.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/include/asm-x86/hvm/domain.h |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index 0f8b19a..fb30903 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -115,12 +115,6 @@ struct hvm_domain {
     /* VRAM dirty support.  Protect with the domain paging lock. */
     struct sh_dirty_vram *dirty_vram;
 
-    /* If one of vcpus of this domain is in no_fill_mode or
-     * mtrr/pat between vcpus is not the same, set is_in_uc_mode
-     */
-    spinlock_t             uc_lock;
-    bool_t                 is_in_uc_mode;
-
     /* Pass-through */
     struct hvm_iommu       hvm_iommu;
 
@@ -135,6 +129,12 @@ struct hvm_domain {
     bool_t                 qemu_mapcache_invalidate;
     bool_t                 is_s3_suspended;
 
+    /* If one of vcpus of this domain is in no_fill_mode or
+     * mtrr/pat between vcpus is not the same, set is_in_uc_mode
+     */
+    bool_t                 is_in_uc_mode;
+    spinlock_t             uc_lock;
+
     /*
      * TSC value that VCPUs use to calculate their tsc_offset value.
      * Used during initialization and save/restore.
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 2/5] arm: provide add_sized()
  2015-05-11 14:37 ` [PATCHv4 2/5] arm: " David Vrabel
@ 2015-05-11 14:45   ` David Vrabel
  2015-05-13  9:26     ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: David Vrabel @ 2015-05-11 14:45 UTC (permalink / raw)
  To: David Vrabel, xen-devel
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Jennifer Herbert,
	Tim Deegan, Jan Beulich

On 11/05/15 15:37, David Vrabel wrote:
> add_sized(ptr, inc) adds inc to the value at ptr using only the correct
> size of loads and stores for the type of *ptr.  The add is /not/ atomic.
> 
> This is needed for ticket locks to ensure the increment of the head ticket
> does not affect the tail ticket.
[...]
>  
> +#define add_sized(p, x) ({                                              \
> +    typeof(*p) __x = (x);                                               \
> +    switch ( sizeof(*p) )                                               \
> +    {                                                                   \
> +    case 1: add_u8_sized((uint8_t *)p, (uint8_t)__x); break;            \
> +    case 2: add_u16_sized((uint16_t *)p, (uint16_t)__x); break;         \
> +    case 4: add_u32_sized((uint32_t *)p, (uint32_t)__x); break;         \
> +    default: __bad_atomic_size(); break;                                \
> +    }                                                                   \
> +    __x;                                                                \

I should have cleaned up this arm implementation in the same way I did
x86.  Please drop this "__x;" before committing, or I can resubmit if it
is preferred.

Sorry.

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 1/5] x86: provide add_sized()
  2015-05-11 14:37 ` [PATCHv4 1/5] x86: provide add_sized() David Vrabel
@ 2015-05-13  9:23   ` Jan Beulich
  2015-05-13  9:35     ` David Vrabel
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2015-05-13  9:23 UTC (permalink / raw)
  To: David Vrabel
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
	Jennifer Herbert, xen-devel

>>> On 11.05.15 at 16:37, <david.vrabel@citrix.com> wrote:
> @@ -53,6 +67,19 @@ void __bad_atomic_size(void);
>      }                                                     \
>  })
>  
> +#define add_sized(p, x) ({                                \
> +    typeof(*(p)) __x = (x);                               \
> +    unsigned long x_ = (unsigned long)__x;                \

So is there a particular reason you kept this double type conversion?
As said earlier, I can see why write_atomic() wants it, but I don't see
the need here.

Jan

> +    switch ( sizeof(*(p)) )                               \
> +    {                                                     \
> +    case 1: add_u8_sized((uint8_t *)(p), x_); break;      \
> +    case 2: add_u16_sized((uint16_t *)(p), x_); break;    \
> +    case 4: add_u32_sized((uint32_t *)(p), x_); break;    \
> +    case 8: add_u64_sized((uint64_t *)(p), x_); break;    \
> +    default: __bad_atomic_size(); break;                  \
> +    }                                                     \
> +})

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 2/5] arm: provide add_sized()
  2015-05-11 14:45   ` David Vrabel
@ 2015-05-13  9:26     ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2015-05-13  9:26 UTC (permalink / raw)
  To: David Vrabel, Ian Campbell
  Cc: Andrew Cooper, Tim Deegan, Keir Fraser, Jennifer Herbert,
	xen-devel

>>> On 11.05.15 at 16:45, <david.vrabel@citrix.com> wrote:
> On 11/05/15 15:37, David Vrabel wrote:
>> add_sized(ptr, inc) adds inc to the value at ptr using only the correct
>> size of loads and stores for the type of *ptr.  The add is /not/ atomic.
>> 
>> This is needed for ticket locks to ensure the increment of the head ticket
>> does not affect the tail ticket.
> [...]
>>  
>> +#define add_sized(p, x) ({                                              \
>> +    typeof(*p) __x = (x);                                               \
>> +    switch ( sizeof(*p) )                                               \
>> +    {                                                                   \
>> +    case 1: add_u8_sized((uint8_t *)p, (uint8_t)__x); break;            \
>> +    case 2: add_u16_sized((uint16_t *)p, (uint16_t)__x); break;         \
>> +    case 4: add_u32_sized((uint32_t *)p, (uint32_t)__x); break;         \
>> +    default: __bad_atomic_size(); break;                                \
>> +    }                                                                   \
>> +    __x;                                                                \
> 
> I should have cleaned up this arm implementation in the same way I did
> x86.  Please drop this "__x;" before committing, or I can resubmit if it
> is preferred.

No problem. But I suppose the casts on __x inside the individual
case statements aren't needed either.

Also it looks a little odd that x86 handles 8-byte operands here,
but ARM (and namely ARM64) doesn't.

Jan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 3/5] xen: use ticket locks for spin locks
  2015-05-11 14:37 ` [PATCHv4 3/5] xen: use ticket locks for spin locks David Vrabel
@ 2015-05-13  9:31   ` Jan Beulich
  2015-05-14 10:36   ` Tim Deegan
  1 sibling, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2015-05-13  9:31 UTC (permalink / raw)
  To: David Vrabel
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
	Jennifer Herbert, xen-devel

>>> On 11.05.15 at 16:37, <david.vrabel@citrix.com> wrote:
> Replace the byte locks with ticket locks.  Ticket locks are: a) fair;
> and b) peform better when contented since they spin without an atomic
> operation.
> 
> The lock is split into two ticket values: head and tail.  A locker
> acquires a ticket by (atomically) increasing tail and using the
> previous tail value.  A CPU holds the lock if its ticket == head.  The
> lock is released by increasing head.
> 
> spin_lock_irq() and spin_lock_irqsave() now spin with irqs disabled
> (previously, they would spin with irqs enabled if possible).  This is
> required to prevent deadlocks when the irq handler tries to take the
> same lock with a higher ticket.
> 
> Architectures need only provide arch_fetch_and_add() and two barriers:
> arch_lock_acquire_barrier() and arch_lock_release_barrier().
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 1/5] x86: provide add_sized()
  2015-05-13  9:23   ` Jan Beulich
@ 2015-05-13  9:35     ` David Vrabel
  2015-05-13  9:48       ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: David Vrabel @ 2015-05-13  9:35 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
	Jennifer Herbert, xen-devel

On 13/05/15 10:23, Jan Beulich wrote:
>>>> On 11.05.15 at 16:37, <david.vrabel@citrix.com> wrote:
>> @@ -53,6 +67,19 @@ void __bad_atomic_size(void);
>>      }                                                     \
>>  })
>>  
>> +#define add_sized(p, x) ({                                \
>> +    typeof(*(p)) __x = (x);                               \
>> +    unsigned long x_ = (unsigned long)__x;                \
> 
> So is there a particular reason you kept this double type conversion?
> As said earlier, I can see why write_atomic() wants it, but I don't see
> the need here.

I kept it in case typeof (*p) wasn't a numeric type (like a struct { u32
a; }) and needed the cast to unsigned long, although in hindsight I'm
not sure this really makes any sense.

David

>> +    switch ( sizeof(*(p)) )                               \
>> +    {                                                     \
>> +    case 1: add_u8_sized((uint8_t *)(p), x_); break;      \
>> +    case 2: add_u16_sized((uint16_t *)(p), x_); break;    \
>> +    case 4: add_u32_sized((uint32_t *)(p), x_); break;    \
>> +    case 8: add_u64_sized((uint64_t *)(p), x_); break;    \
>> +    default: __bad_atomic_size(); break;                  \
>> +    }                                                     \
>> +})
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 1/5] x86: provide add_sized()
  2015-05-13  9:35     ` David Vrabel
@ 2015-05-13  9:48       ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2015-05-13  9:48 UTC (permalink / raw)
  To: David Vrabel
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
	Jennifer Herbert, xen-devel

>>> On 13.05.15 at 11:35, <david.vrabel@citrix.com> wrote:
> On 13/05/15 10:23, Jan Beulich wrote:
>>>>> On 11.05.15 at 16:37, <david.vrabel@citrix.com> wrote:
>>> @@ -53,6 +67,19 @@ void __bad_atomic_size(void);
>>>      }                                                     \
>>>  })
>>>  
>>> +#define add_sized(p, x) ({                                \
>>> +    typeof(*(p)) __x = (x);                               \
>>> +    unsigned long x_ = (unsigned long)__x;                \
>> 
>> So is there a particular reason you kept this double type conversion?
>> As said earlier, I can see why write_atomic() wants it, but I don't see
>> the need here.
> 
> I kept it in case typeof (*p) wasn't a numeric type (like a struct { u32
> a; }) and needed the cast to unsigned long, although in hindsight I'm
> not sure this really makes any sense.

Indeed - such an attempt of a conversion would result in a
compilation error. The only thing that could need this would be
pointers, but (as said before) I don't think add_sized makes
much sense for pointers (it would be actively misleading for
pointers to other than void or 1-byte types).

Jan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 3/5] xen: use ticket locks for spin locks
  2015-05-11 14:37 ` [PATCHv4 3/5] xen: use ticket locks for spin locks David Vrabel
  2015-05-13  9:31   ` Jan Beulich
@ 2015-05-14 10:36   ` Tim Deegan
  2015-05-14 20:05     ` Jan Beulich
  1 sibling, 1 reply; 19+ messages in thread
From: Tim Deegan @ 2015-05-14 10:36 UTC (permalink / raw)
  To: David Vrabel
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Jennifer Herbert,
	Jan Beulich, xen-devel

At 15:37 +0100 on 11 May (1431358623), David Vrabel wrote:
> Replace the byte locks with ticket locks.  Ticket locks are: a) fair;
> and b) peform better when contented since they spin without an atomic
> operation.
> 
> The lock is split into two ticket values: head and tail.  A locker
> acquires a ticket by (atomically) increasing tail and using the
> previous tail value.  A CPU holds the lock if its ticket == head.  The
> lock is released by increasing head.
> 
> spin_lock_irq() and spin_lock_irqsave() now spin with irqs disabled
> (previously, they would spin with irqs enabled if possible).  This is
> required to prevent deadlocks when the irq handler tries to take the
> same lock with a higher ticket.
> 
> Architectures need only provide arch_fetch_and_add() and two barriers:
> arch_lock_acquire_barrier() and arch_lock_release_barrier().
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

Looking good.  I have two nits about comments, and one bug, which I
missed in the last version:

>  int _spin_trylock(spinlock_t *lock)
>  {
> +    spinlock_tickets_t old, new;
> +
>      check_lock(&lock->debug);
> -    if ( !_raw_spin_trylock(&lock->raw) )
> +    old = observe_lock(&lock->tickets);
> +    if ( old.head != old.tail )
>          return 0;
> +    new = old;
> +    new.tail++;
> +    if ( cmpxchg(&lock->tickets.head_tail,
> +                 old.head_tail, new.head_tail) != old.head_tail )
> +    {
> +        /*
> +         * cmpxchg() is a full barrier so no need for an
> +         * arch_lock_acquire_barrier().
> +         */

This comment belongs in the other branch, where we have taken the lock.

> +        return 0;
> +    }
>  #ifdef LOCK_PROFILE
>      if (lock->profile)
>          lock->profile->time_locked = NOW();
> @@ -213,27 +219,32 @@ int _spin_trylock(spinlock_t *lock)
>  
>  void _spin_barrier(spinlock_t *lock)
>  {
> +    spinlock_tickets_t sample;
>  #ifdef LOCK_PROFILE
>      s_time_t block = NOW();
> -    u64      loop = 0;
> +#endif
>  
>      check_barrier(&lock->debug);
> -    do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
> -    if ((loop > 1) && lock->profile)
> +    smp_mb();
> +    sample = observe_lock(&lock->tickets);
> +    if ( sample.head != sample.tail )
>      {
> -        lock->profile->time_block += NOW() - block;
> -        lock->profile->block_cnt++;
> -    }
> -#else
> -    check_barrier(&lock->debug);
> -    do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
> +        while ( observe_head(&lock->tickets) != sample.tail )

This test should be "observe_head(&lock->tickets) == sample.head",
i.e. wait until the thread that held the lock has released it.
Checking for it to reach the tail is unnecessary (other threads that
were queueing for the lock at the sample time don't matter) and
dangerous (on a contended lock head might pass sample.tail without us
happening to observe it being == ).

> +            cpu_relax();
> +#ifdef LOCK_PROFILE
> +        if ( lock->profile )
> +        {
> +            lock->profile->time_block += NOW() - block;
> +            lock->profile->block_cnt++;
> +        }
>  #endif
> +    }
>      smp_mb();
>  }
[...]
> --- a/xen/include/asm-x86/system.h
> +++ b/xen/include/asm-x86/system.h
> @@ -185,6 +185,9 @@ static always_inline unsigned long __xadd(
>  #define set_mb(var, value) do { xchg(&var, value); } while (0)
>  #define set_wmb(var, value) do { var = value; wmb(); } while (0)
>  
> +#define arch_lock_acquire_barrier() barrier()
> +#define arch_lock_release_barrier() barrier()

This could do with a comment explaining why they're not smp_mb().

Summarizing the last thread: on x86 the only reordering is of reads
with older writes.  In the lock case, the read in observe_head() can
only be reordered with writes that precede it, and moving a write
_into_ a locked section is OK.  In the release case, the write in
add_sized() can only be reordered with reads that follow it, and
hoisting a read _into_ a locked region is OK.

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 5/5] x86: reduce struct hvm_domain size
  2015-05-11 14:37 ` [PATCHv4 5/5] x86: reduce struct hvm_domain size David Vrabel
@ 2015-05-14 11:24   ` Tim Deegan
  2015-05-14 12:08     ` David Vrabel
  0 siblings, 1 reply; 19+ messages in thread
From: Tim Deegan @ 2015-05-14 11:24 UTC (permalink / raw)
  To: David Vrabel
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Jennifer Herbert,
	Jan Beulich, xen-devel

At 15:37 +0100 on 11 May (1431358625), David Vrabel wrote:
> Pack struct hvm_domain to reduce it by 8 bytes.  Thus reducing the
> size of struct domain by 8 bytes.

In my builds (non-debug, on current staging), this makes no difference
to struct domain.  struct hvm_domain gets 8 bytes smaller (2144 ->
2136), but struct arch_domain remains at 2944 bytes (it just gets an
extra 8 bytes in the padding at the end to round it up to
__cacheline_aligned).

Tim.

> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>  xen/include/asm-x86/hvm/domain.h |   12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
> index 0f8b19a..fb30903 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -115,12 +115,6 @@ struct hvm_domain {
>      /* VRAM dirty support.  Protect with the domain paging lock. */
>      struct sh_dirty_vram *dirty_vram;
>  
> -    /* If one of vcpus of this domain is in no_fill_mode or
> -     * mtrr/pat between vcpus is not the same, set is_in_uc_mode
> -     */
> -    spinlock_t             uc_lock;
> -    bool_t                 is_in_uc_mode;
> -
>      /* Pass-through */
>      struct hvm_iommu       hvm_iommu;
>  
> @@ -135,6 +129,12 @@ struct hvm_domain {
>      bool_t                 qemu_mapcache_invalidate;
>      bool_t                 is_s3_suspended;
>  
> +    /* If one of vcpus of this domain is in no_fill_mode or
> +     * mtrr/pat between vcpus is not the same, set is_in_uc_mode
> +     */
> +    bool_t                 is_in_uc_mode;
> +    spinlock_t             uc_lock;
> +
>      /*
>       * TSC value that VCPUs use to calculate their tsc_offset value.
>       * Used during initialization and save/restore.
> -- 
> 1.7.10.4
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 5/5] x86: reduce struct hvm_domain size
  2015-05-14 11:24   ` Tim Deegan
@ 2015-05-14 12:08     ` David Vrabel
  0 siblings, 0 replies; 19+ messages in thread
From: David Vrabel @ 2015-05-14 12:08 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Jennifer Herbert,
	Jan Beulich, xen-devel

On 14/05/15 12:24, Tim Deegan wrote:
> At 15:37 +0100 on 11 May (1431358625), David Vrabel wrote:
>> Pack struct hvm_domain to reduce it by 8 bytes.  Thus reducing the
>> size of struct domain by 8 bytes.
> 
> In my builds (non-debug, on current staging), this makes no difference
> to struct domain.  struct hvm_domain gets 8 bytes smaller (2144 ->
> 2136), but struct arch_domain remains at 2944 bytes (it just gets an
> extra 8 bytes in the padding at the end to round it up to
> __cacheline_aligned).

It used to make a difference.  Perhaps some other recently change
negates the benefit?

In which case, this patch could be dropped.

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 3/5] xen: use ticket locks for spin locks
  2015-05-14 10:36   ` Tim Deegan
@ 2015-05-14 20:05     ` Jan Beulich
  2015-05-14 20:51       ` Jan Beulich
  2015-05-14 20:55       ` Tim Deegan
  0 siblings, 2 replies; 19+ messages in thread
From: Jan Beulich @ 2015-05-14 20:05 UTC (permalink / raw)
  To: david.vrabel, tim
  Cc: andrew.cooper3, keir, jennifer.herbert, ian.campbell, xen-devel

>>> Tim Deegan <tim@xen.org> 05/14/15 12:36 PM >>>
>At 15:37 +0100 on 11 May (1431358623), David Vrabel wrote:
>>  void _spin_barrier(spinlock_t *lock)
>>  {
>> +    spinlock_tickets_t sample;
>>  #ifdef LOCK_PROFILE
>>      s_time_t block = NOW();
>> -    u64      loop = 0;
>> +#endif
>>  
>>      check_barrier(&lock->debug);
>> -    do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
>> -    if ((loop > 1) && lock->profile)
>> +    smp_mb();
>> +    sample = observe_lock(&lock->tickets);
>> +    if ( sample.head != sample.tail )
>>      {
>> -        lock->profile->time_block += NOW() - block;
>> -        lock->profile->block_cnt++;
>> -    }
>> -#else
>> -    check_barrier(&lock->debug);
>> -    do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
>> +        while ( observe_head(&lock->tickets) != sample.tail )
>
>This test should be "observe_head(&lock->tickets) == sample.head",
>i.e. wait until the thread that held the lock has released it.
>Checking for it to reach the tail is unnecessary (other threads that
>were queueing for the lock at the sample time don't matter) and
>dangerous (on a contended lock head might pass sample.tail without us
>happening to observe it being == ).

The observation of there being a problem is correct, but the suggested solution
doesn't seem to be. The new code being

    if ( sample.head != sample.tail )
    {
        while ( observe_head(&lock->tickets) == sample.tail )
            cpu_relax();

means that if head didn't change between the full sample and the head sample
we'd end the loop right away, which can't be right. We really need to wait for
head to reach or pass the sampled tail.

Jan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 3/5] xen: use ticket locks for spin locks
  2015-05-14 20:05     ` Jan Beulich
@ 2015-05-14 20:51       ` Jan Beulich
  2015-05-14 20:55       ` Tim Deegan
  1 sibling, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2015-05-14 20:51 UTC (permalink / raw)
  To: david.vrabel, tim
  Cc: andrew.cooper3, keir, jennifer.herbert, ian.campbell, xen-devel

>>> "Jan Beulich" <jbeulich@suse.com> 05/14/15 10:06 PM >>>
>>>> Tim Deegan <tim@xen.org> 05/14/15 12:36 PM >>>
>>At 15:37 +0100 on 11 May (1431358623), David Vrabel wrote:
>>>  void _spin_barrier(spinlock_t *lock)
>>>  {
>>> +    spinlock_tickets_t sample;
>>>  #ifdef LOCK_PROFILE
>>>      s_time_t block = NOW();
>>> -    u64      loop = 0;
>>> +#endif
>>>  
>>>      check_barrier(&lock->debug);
>>> -    do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
>>> -    if ((loop > 1) && lock->profile)
>>> +    smp_mb();
>>> +    sample = observe_lock(&lock->tickets);
>>> +    if ( sample.head != sample.tail )
>>>      {
>>> -        lock->profile->time_block += NOW() - block;
>>> -        lock->profile->block_cnt++;
>>> -    }
>>> -#else
>>> -    check_barrier(&lock->debug);
>>> -    do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
>>> +        while ( observe_head(&lock->tickets) != sample.tail )
>>
>>This test should be "observe_head(&lock->tickets) == sample.head",
>>i.e. wait until the thread that held the lock has released it.
>>Checking for it to reach the tail is unnecessary (other threads that
>>were queueing for the lock at the sample time don't matter) and
>>dangerous (on a contended lock head might pass sample.tail without us
>>happening to observe it being == ).
>
>The observation of there being a problem is correct, but the suggested solution
>doesn't seem to be. The new code being
>
>if ( sample.head != sample.tail )
>{
>while ( observe_head(&lock->tickets) == sample.tail )
>cpu_relax();
>
>means that if head didn't change between the full sample and the head sample
>we'd end the loop right away, which can't be right. We really need to wait for
>head to reach or pass the sampled tail.

Just realized that this would be too heavy again (and more difficult to do than
necessary) - I think on the left side we need observe_tail() instead of
observe_head(), and then the == is correct.

Jan

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 3/5] xen: use ticket locks for spin locks
  2015-05-14 20:05     ` Jan Beulich
  2015-05-14 20:51       ` Jan Beulich
@ 2015-05-14 20:55       ` Tim Deegan
  2015-05-14 21:01         ` Jan Beulich
  1 sibling, 1 reply; 19+ messages in thread
From: Tim Deegan @ 2015-05-14 20:55 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir, ian.campbell, andrew.cooper3, jennifer.herbert,
	david.vrabel, xen-devel

At 21:05 +0100 on 14 May (1431637535), Jan Beulich wrote:
> >>> Tim Deegan <tim@xen.org> 05/14/15 12:36 PM >>>
> >At 15:37 +0100 on 11 May (1431358623), David Vrabel wrote:
> >>  void _spin_barrier(spinlock_t *lock)
> >>  {
> >> +    spinlock_tickets_t sample;
> >>  #ifdef LOCK_PROFILE
> >>      s_time_t block = NOW();
> >> -    u64      loop = 0;
> >> +#endif
> >>  
> >>      check_barrier(&lock->debug);
> >> -    do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
> >> -    if ((loop > 1) && lock->profile)
> >> +    smp_mb();
> >> +    sample = observe_lock(&lock->tickets);
> >> +    if ( sample.head != sample.tail )
> >>      {
> >> -        lock->profile->time_block += NOW() - block;
> >> -        lock->profile->block_cnt++;
> >> -    }
> >> -#else
> >> -    check_barrier(&lock->debug);
> >> -    do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
> >> +        while ( observe_head(&lock->tickets) != sample.tail )
> >
> >This test should be "observe_head(&lock->tickets) == sample.head",
> >i.e. wait until the thread that held the lock has released it.
> >Checking for it to reach the tail is unnecessary (other threads that
> >were queueing for the lock at the sample time don't matter) and
> >dangerous (on a contended lock head might pass sample.tail without us
> >happening to observe it being == ).
> 
> The observation of there being a problem is correct, but the suggested solution
> doesn't seem to be. The new code being
> 
>     if ( sample.head != sample.tail )
>     {
>         while ( observe_head(&lock->tickets) == sample.tail )
>             cpu_relax();
> 
> means that if head didn't change between the full sample and the head sample
> we'd end the loop right away, which can't be right. We really need to wait for
> head to reach or pass the sampled tail.

I think you misread what I asked for.  We wait until the observed head
doesn't match the sampled _head_, i.e. for whoever had the lock when
we sampled it to realease it:

     if ( sample.head != sample.tail )
     {
         while ( observe_head(&lock->tickets) == sample.head )
             cpu_relax();

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv4 3/5] xen: use ticket locks for spin locks
  2015-05-14 20:55       ` Tim Deegan
@ 2015-05-14 21:01         ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2015-05-14 21:01 UTC (permalink / raw)
  To: tim
  Cc: keir, ian.campbell, andrew.cooper3, jennifer.herbert,
	david.vrabel, xen-devel

>>> Tim Deegan <tim@xen.org> 05/14/15 10:55 PM >>>
>At 21:05 +0100 on 14 May (1431637535), Jan Beulich wrote:
>> >>> Tim Deegan <tim@xen.org> 05/14/15 12:36 PM >>>
>> >At 15:37 +0100 on 11 May (1431358623), David Vrabel wrote:
>> >> +        while ( observe_head(&lock->tickets) != sample.tail )
>> >
>> >This test should be "observe_head(&lock->tickets) == sample.head",
>> >i.e. wait until the thread that held the lock has released it.
>> >Checking for it to reach the tail is unnecessary (other threads that
>> >were queueing for the lock at the sample time don't matter) and
>> >dangerous (on a contended lock head might pass sample.tail without us
>> >happening to observe it being == ).
>> 
>> The observation of there being a problem is correct, but the suggested solution
>> doesn't seem to be. The new code being
>> 
>>     if ( sample.head != sample.tail )
>>     {
>>         while ( observe_head(&lock->tickets) == sample.tail )
>>             cpu_relax();
>> 
>> means that if head didn't change between the full sample and the head sample
>> we'd end the loop right away, which can't be right. We really need to wait for
>> head to reach or pass the sampled tail.
>
>I think you misread what I asked for.  We wait until the observed head
>doesn't match the sampled _head_, i.e. for whoever had the lock when
>we sampled it to realease it:
>
>if ( sample.head != sample.tail )
>{
>while ( observe_head(&lock->tickets) == sample.head )
>cpu_relax();

Indeed. And in my simultaneously written reply to that mail I (once again)
mixed up head and tail. Sorry for the noise then.

Jan

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-05-14 21:01 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-11 14:37 [PATCHv5 0/5] Use ticket locks for spinlocks David Vrabel
2015-05-11 14:37 ` [PATCHv4 1/5] x86: provide add_sized() David Vrabel
2015-05-13  9:23   ` Jan Beulich
2015-05-13  9:35     ` David Vrabel
2015-05-13  9:48       ` Jan Beulich
2015-05-11 14:37 ` [PATCHv4 2/5] arm: " David Vrabel
2015-05-11 14:45   ` David Vrabel
2015-05-13  9:26     ` Jan Beulich
2015-05-11 14:37 ` [PATCHv4 3/5] xen: use ticket locks for spin locks David Vrabel
2015-05-13  9:31   ` Jan Beulich
2015-05-14 10:36   ` Tim Deegan
2015-05-14 20:05     ` Jan Beulich
2015-05-14 20:51       ` Jan Beulich
2015-05-14 20:55       ` Tim Deegan
2015-05-14 21:01         ` Jan Beulich
2015-05-11 14:37 ` [PATCHv4 4/5] x86, arm: remove asm/spinlock.h from all architectures David Vrabel
2015-05-11 14:37 ` [PATCHv4 5/5] x86: reduce struct hvm_domain size David Vrabel
2015-05-14 11:24   ` Tim Deegan
2015-05-14 12:08     ` David Vrabel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.