* [PATCHv4 1/8] x86: provide arch_fetch_and_add()
2015-04-30 15:33 [PATCHv4 0/8] Use ticket locks for spinlocks David Vrabel
@ 2015-04-30 15:33 ` David Vrabel
2015-05-07 15:59 ` Jan Beulich
2015-04-30 15:33 ` [PATCHv4 2/8] arm: " David Vrabel
` (6 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: David Vrabel @ 2015-04-30 15:33 UTC (permalink / raw)
To: xen-devel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
David Vrabel, Jan Beulich, Jennifer Herbert
arch_fetch_and_add() atomically adds a value and returns the previous
value.
This is needed to implement ticket locks.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
xen/include/asm-x86/system.h | 48 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)
diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
index 7111329..47d4a70 100644
--- a/xen/include/asm-x86/system.h
+++ b/xen/include/asm-x86/system.h
@@ -118,6 +118,54 @@ static always_inline unsigned long __cmpxchg(
})
/*
+ * Undefined symbol to cause link failure if a wrong size is used with
+ * arch_fetch_and_add().
+ */
+extern unsigned long __bad_fetch_and_add_size(void);
+
+static always_inline unsigned long __xadd(
+ volatile void *ptr, unsigned long v, int size)
+{
+ switch ( size )
+ {
+ case 1:
+ asm volatile ( "lock; xaddb %b0,%1"
+ : "+r" (v), "+m" (*__xg((volatile void *)ptr))
+ :: "memory");
+ return v;
+ case 2:
+ asm volatile ( "lock; xaddw %w0,%1"
+ : "+r" (v), "+m" (*__xg((volatile void *)ptr))
+ :: "memory");
+ return v;
+ case 4:
+ asm volatile ( "lock; xaddl %k0,%1"
+ : "+r" (v), "+m" (*__xg((volatile void *)ptr))
+ :: "memory");
+ return v;
+ case 8:
+ asm volatile ( "lock; xaddq %q0,%1"
+ : "+r" (v), "+m" (*__xg((volatile void *)ptr))
+ :: "memory");
+
+ return v;
+ default:
+ return __bad_fetch_and_add_size();
+ }
+}
+
+/*
+ * Atomically add @v to the 1, 2, 4, or 8 byte value at @ptr. Returns
+ * the previous value.
+ *
+ * This is a full memory barrier.
+ */
+#define arch_fetch_and_add(ptr, v) \
+ ({ \
+ (typeof(*(ptr)))__xadd(ptr, (typeof(*(ptr)))(v), sizeof(*(ptr))); \
+ })
+
+/*
* Both Intel and AMD agree that, from a programmer's viewpoint:
* Loads cannot be reordered relative to other loads.
* Stores cannot be reordered relative to other stores.
--
1.7.10.4
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCHv4 1/8] x86: provide arch_fetch_and_add()
2015-04-30 15:33 ` [PATCHv4 1/8] x86: provide arch_fetch_and_add() David Vrabel
@ 2015-05-07 15:59 ` Jan Beulich
0 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2015-05-07 15:59 UTC (permalink / raw)
To: David Vrabel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
Jennifer Herbert, xen-devel
>>> On 30.04.15 at 17:33, <david.vrabel@citrix.com> wrote:
> +static always_inline unsigned long __xadd(
> + volatile void *ptr, unsigned long v, int size)
> +{
> + switch ( size )
> + {
> + case 1:
> + asm volatile ( "lock; xaddb %b0,%1"
> + : "+r" (v), "+m" (*__xg((volatile void *)ptr))
"ptr" already is "volatile void *", so why the cast? Or is this just
an artifact of __xchg() doing the same odd thing?
> +#define arch_fetch_and_add(ptr, v) \
> + ({ \
> + (typeof(*(ptr)))__xadd(ptr, (typeof(*(ptr)))(v), sizeof(*(ptr))); \
> + })
It being a single expression inside the braces I don't think you really
need the ({}) GNU extension here.
Both of course easily adjusted while committing, if you agree.
Jan
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCHv4 2/8] arm: provide arch_fetch_and_add()
2015-04-30 15:33 [PATCHv4 0/8] Use ticket locks for spinlocks David Vrabel
2015-04-30 15:33 ` [PATCHv4 1/8] x86: provide arch_fetch_and_add() David Vrabel
@ 2015-04-30 15:33 ` David Vrabel
2015-04-30 15:33 ` [PATCHv4 3/8] x86: provide add_sized() David Vrabel
` (5 subsequent siblings)
7 siblings, 0 replies; 21+ messages in thread
From: David Vrabel @ 2015-04-30 15:33 UTC (permalink / raw)
To: xen-devel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
David Vrabel, Jan Beulich, Jennifer Herbert
arch_fetch_and_add() atomically adds a value and returns the previous
value.
This generic arm implementation uses the GCC __sync_fetch_and_add()
builtin. This builtin resulted in suitable inlined asm for GCC 4.8.3
(arm64) and GCC 4.6.3 (arm32).
This is needed to implement ticket locks.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
xen/include/asm-arm/system.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
index ce3d38a..2eb96e8 100644
--- a/xen/include/asm-arm/system.h
+++ b/xen/include/asm-arm/system.h
@@ -51,6 +51,8 @@
# error "unknown ARM variant"
#endif
+#define arch_fetch_and_add(x, v) __sync_fetch_and_add(x, v)
+
extern struct vcpu *__context_switch(struct vcpu *prev, struct vcpu *next);
#endif
--
1.7.10.4
^ permalink raw reply related [flat|nested] 21+ messages in thread* [PATCHv4 3/8] x86: provide add_sized()
2015-04-30 15:33 [PATCHv4 0/8] Use ticket locks for spinlocks David Vrabel
2015-04-30 15:33 ` [PATCHv4 1/8] x86: provide arch_fetch_and_add() David Vrabel
2015-04-30 15:33 ` [PATCHv4 2/8] arm: " David Vrabel
@ 2015-04-30 15:33 ` David Vrabel
2015-05-07 16:05 ` Jan Beulich
2015-04-30 15:33 ` [PATCHv4 4/8] arm: " David Vrabel
` (4 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: David Vrabel @ 2015-04-30 15:33 UTC (permalink / raw)
To: xen-devel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
David Vrabel, Jan Beulich, Jennifer Herbert
add_sized(ptr, inc) adds inc to the value at ptr using only the correct
size of loads and stores for the type of *ptr. The add is /not/ atomic.
This is needed for ticket locks to ensure the increment of the head ticket
does not affect the tail ticket.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
xen/include/asm-x86/atomic.h | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/xen/include/asm-x86/atomic.h b/xen/include/asm-x86/atomic.h
index aadded0..d97ea54 100644
--- a/xen/include/asm-x86/atomic.h
+++ b/xen/include/asm-x86/atomic.h
@@ -14,6 +14,14 @@ static inline void name(volatile type *addr, type val) \
{ asm volatile("mov" size " %1,%0": "=m" (*(volatile type *)addr) \
:reg (val) barrier); }
+#define build_add_sized(name, size, type, reg) \
+ static inline void name(volatile type *addr, type val) \
+ { \
+ asm volatile("add" size " %1,%0" \
+ : "=m" (*(volatile type *)addr) \
+ : reg (val)); \
+ }
+
build_read_atomic(read_u8_atomic, "b", uint8_t, "=q", )
build_read_atomic(read_u16_atomic, "w", uint16_t, "=r", )
build_read_atomic(read_u32_atomic, "l", uint32_t, "=r", )
@@ -25,8 +33,14 @@ build_write_atomic(write_u32_atomic, "l", uint32_t, "r", )
build_read_atomic(read_u64_atomic, "q", uint64_t, "=r", )
build_write_atomic(write_u64_atomic, "q", uint64_t, "r", )
+build_add_sized(add_u8_sized, "b", uint8_t, "qi")
+build_add_sized(add_u16_sized, "w", uint16_t, "ri")
+build_add_sized(add_u32_sized, "l", uint32_t, "ri")
+build_add_sized(add_u64_sized, "q", uint64_t, "ri")
+
#undef build_read_atomic
#undef build_write_atomic
+#undef build_add_sized
void __bad_atomic_size(void);
@@ -55,6 +69,20 @@ void __bad_atomic_size(void);
__x; \
})
+#define add_sized(p, x) ({ \
+ typeof(*(p)) __x = (x); \
+ unsigned long x_ = (unsigned long)__x; \
+ switch ( sizeof(*(p)) ) \
+ { \
+ case 1: add_u8_sized((uint8_t *)(p), (uint8_t)x_); break; \
+ case 2: add_u16_sized((uint16_t *)(p), (uint16_t)x_); break; \
+ case 4: add_u32_sized((uint32_t *)(p), (uint32_t)x_); break; \
+ case 8: add_u64_sized((uint64_t *)(p), (uint64_t)x_); break; \
+ default: __bad_atomic_size(); break; \
+ } \
+ __x; \
+})
+
/*
* NB. I've pushed the volatile qualifier into the operations. This allows
* fast accessors such as _atomic_read() and _atomic_set() which don't give
--
1.7.10.4
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCHv4 3/8] x86: provide add_sized()
2015-04-30 15:33 ` [PATCHv4 3/8] x86: provide add_sized() David Vrabel
@ 2015-05-07 16:05 ` Jan Beulich
0 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2015-05-07 16:05 UTC (permalink / raw)
To: David Vrabel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
Jennifer Herbert, xen-devel
>>> On 30.04.15 at 17:33, <david.vrabel@citrix.com> wrote:
> @@ -55,6 +69,20 @@ void __bad_atomic_size(void);
> __x; \
> })
>
> +#define add_sized(p, x) ({ \
> + typeof(*(p)) __x = (x); \
> + unsigned long x_ = (unsigned long)__x; \
I don't see the need for this triple type conversion (together with the
code below): original -> typeof(*(p)) -> unsigned long -> uintN_t.
(For write_atomic() I think this aids writing pointers, but add_sized()
surely isn't meant to do that?)
> + switch ( sizeof(*(p)) ) \
> + { \
> + case 1: add_u8_sized((uint8_t *)(p), (uint8_t)x_); break; \
> + case 2: add_u16_sized((uint16_t *)(p), (uint16_t)x_); break; \
> + case 4: add_u32_sized((uint32_t *)(p), (uint32_t)x_); break; \
> + case 8: add_u64_sized((uint64_t *)(p), (uint64_t)x_); break; \
> + default: __bad_atomic_size(); break; \
> + } \
> + __x; \
I don't see why write_atomic() needs this, and I even less so
understand why add_sized() would need to return its second
input.
Jan
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCHv4 4/8] arm: provide add_sized()
2015-04-30 15:33 [PATCHv4 0/8] Use ticket locks for spinlocks David Vrabel
` (2 preceding siblings ...)
2015-04-30 15:33 ` [PATCHv4 3/8] x86: provide add_sized() David Vrabel
@ 2015-04-30 15:33 ` David Vrabel
2015-05-05 13:49 ` Ian Campbell
2015-04-30 15:33 ` [PATCHv4 5/8] xen: use ticket locks for spin locks David Vrabel
` (3 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: David Vrabel @ 2015-04-30 15:33 UTC (permalink / raw)
To: xen-devel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
David Vrabel, Jan Beulich, Jennifer Herbert
add_sized(ptr, inc) adds inc to the value at ptr using only the correct
size of loads and stores for the type of *ptr. The add is /not/ atomic.
This is needed for ticket locks to ensure the increment of the head ticket
does not affect the tail ticket.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
xen/include/asm-arm/atomic.h | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/xen/include/asm-arm/atomic.h b/xen/include/asm-arm/atomic.h
index 7d15fb0..d80ae5d 100644
--- a/xen/include/asm-arm/atomic.h
+++ b/xen/include/asm-arm/atomic.h
@@ -23,6 +23,17 @@ static inline void name(volatile type *addr, type val) \
: reg (val)); \
}
+#define build_add_sized(name, size, width, type, reg) \
+static inline void name(volatile type *addr, type val) \
+{ \
+ type t; \
+ asm volatile("ldr" size " %"width"1,%0\n" \
+ "add %"width"1,%"width"1,%"width"2\n" \
+ "str" size " %"width"1,%0" \
+ : "=m" (*(volatile type *)addr), "=r" (t) \
+ : reg (val)); \
+}
+
#if defined (CONFIG_ARM_32)
#define BYTE ""
#define WORD ""
@@ -46,6 +57,10 @@ build_atomic_read(read_u64_atomic, "x", uint64_t, "=r")
build_atomic_write(write_u64_atomic, "x", uint64_t, "r")
#endif
+build_add_sized(add_u8_sized, "b", BYTE, uint8_t, "ri")
+build_add_sized(add_u16_sized, "h", WORD, uint16_t, "ri")
+build_add_sized(add_u32_sized, "", WORD, uint32_t, "ri")
+
void __bad_atomic_size(void);
#define read_atomic(p) ({ \
@@ -70,6 +85,18 @@ void __bad_atomic_size(void);
__x; \
})
+#define add_sized(p, x) ({ \
+ typeof(*p) __x = (x); \
+ switch ( sizeof(*p) ) \
+ { \
+ case 1: add_u8_sized((uint8_t *)p, (uint8_t)__x); break; \
+ case 2: add_u16_sized((uint16_t *)p, (uint16_t)__x); break; \
+ case 4: add_u32_sized((uint32_t *)p, (uint32_t)__x); break; \
+ default: __bad_atomic_size(); break; \
+ } \
+ __x; \
+})
+
/*
* NB. I've pushed the volatile qualifier into the operations. This allows
* fast accessors such as _atomic_read() and _atomic_set() which don't give
--
1.7.10.4
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCHv4 4/8] arm: provide add_sized()
2015-04-30 15:33 ` [PATCHv4 4/8] arm: " David Vrabel
@ 2015-05-05 13:49 ` Ian Campbell
2015-05-05 13:54 ` Tim Deegan
0 siblings, 1 reply; 21+ messages in thread
From: Ian Campbell @ 2015-05-05 13:49 UTC (permalink / raw)
To: David Vrabel
Cc: Keir Fraser, Tim Deegan, Jennifer Herbert, Jan Beulich,
Andrew Cooper, xen-devel
On Thu, 2015-04-30 at 16:33 +0100, David Vrabel wrote:
> add_sized(ptr, inc) adds inc to the value at ptr using only the correct
> size of loads and stores for the type of *ptr. The add is /not/ atomic.
atomic.h is an odd place for it then ;-) But given you use the
infrastructure I can't suggest a better home.
> This is needed for ticket locks to ensure the increment of the head ticket
> does not affect the tail ticket.
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCHv4 4/8] arm: provide add_sized()
2015-05-05 13:49 ` Ian Campbell
@ 2015-05-05 13:54 ` Tim Deegan
0 siblings, 0 replies; 21+ messages in thread
From: Tim Deegan @ 2015-05-05 13:54 UTC (permalink / raw)
To: Ian Campbell
Cc: Keir Fraser, Andrew Cooper, Jennifer Herbert, David Vrabel,
Jan Beulich, xen-devel
At 14:49 +0100 on 05 May (1430837355), Ian Campbell wrote:
> On Thu, 2015-04-30 at 16:33 +0100, David Vrabel wrote:
> > add_sized(ptr, inc) adds inc to the value at ptr using only the correct
> > size of loads and stores for the type of *ptr. The add is /not/ atomic.
>
> atomic.h is an odd place for it then ;-)
The reads and writes _are_ atomic, though. It's just that the whole
r/m/w is not.
Tim.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCHv4 5/8] xen: use ticket locks for spin locks
2015-04-30 15:33 [PATCHv4 0/8] Use ticket locks for spinlocks David Vrabel
` (3 preceding siblings ...)
2015-04-30 15:33 ` [PATCHv4 4/8] arm: " David Vrabel
@ 2015-04-30 15:33 ` David Vrabel
2015-05-05 13:56 ` Ian Campbell
2015-05-08 9:36 ` Jan Beulich
2015-04-30 15:33 ` [PATCHv4 6/8] x86, arm: remove asm/spinlock.h from all architectures David Vrabel
` (2 subsequent siblings)
7 siblings, 2 replies; 21+ messages in thread
From: David Vrabel @ 2015-04-30 15:33 UTC (permalink / raw)
To: xen-devel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
David Vrabel, Jan Beulich, Jennifer Herbert
Replace the byte locks with ticket locks. Ticket locks are: a) fair;
and b) peform better when contented since they spin without an atomic
operation.
The lock is split into two ticket values: head and tail. A locker
acquires a ticket by (atomically) increasing tail and using the
previous tail value. A CPU holds the lock if its ticket == head. The
lock is released by increasing head.
Architectures need only provide arch_fetch_and_add() and two barriers:
arch_lock_acquire_barrier() and arch_lock_release_barrier().
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
xen/common/spinlock.c | 112 ++++++++++++++++++++++++------------------
xen/include/asm-arm/system.h | 3 ++
xen/include/asm-x86/system.h | 3 ++
xen/include/xen/spinlock.h | 16 ++++--
4 files changed, 80 insertions(+), 54 deletions(-)
diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 5fd8b1c..087b643 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -115,93 +115,93 @@ void spin_debug_disable(void)
#endif
+static always_inline spinlock_tickets_t observe_lock(spinlock_tickets_t *t)
+{
+ spinlock_tickets_t v;
+
+ smp_rmb();
+ v.head_tail = read_atomic(&t->head_tail);
+ return v;
+}
+
+static always_inline u16 observe_head(spinlock_tickets_t *t)
+{
+ smp_rmb();
+ return read_atomic(&t->head);
+}
+
void _spin_lock(spinlock_t *lock)
{
+ spinlock_tickets_t tickets = { .tail = 1, };
LOCK_PROFILE_VAR;
check_lock(&lock->debug);
- while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
+ tickets.head_tail = arch_fetch_and_add(&lock->tickets.head_tail,
+ tickets.head_tail);
+ while ( tickets.tail != observe_head(&lock->tickets) )
{
LOCK_PROFILE_BLOCK;
- while ( likely(_raw_spin_is_locked(&lock->raw)) )
- cpu_relax();
+ cpu_relax();
}
LOCK_PROFILE_GOT;
preempt_disable();
+ arch_lock_acquire_barrier();
}
void _spin_lock_irq(spinlock_t *lock)
{
- LOCK_PROFILE_VAR;
-
ASSERT(local_irq_is_enabled());
local_irq_disable();
- check_lock(&lock->debug);
- while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
- {
- LOCK_PROFILE_BLOCK;
- local_irq_enable();
- while ( likely(_raw_spin_is_locked(&lock->raw)) )
- cpu_relax();
- local_irq_disable();
- }
- LOCK_PROFILE_GOT;
- preempt_disable();
+ _spin_lock(lock);
}
unsigned long _spin_lock_irqsave(spinlock_t *lock)
{
unsigned long flags;
- LOCK_PROFILE_VAR;
local_irq_save(flags);
- check_lock(&lock->debug);
- while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
- {
- LOCK_PROFILE_BLOCK;
- local_irq_restore(flags);
- while ( likely(_raw_spin_is_locked(&lock->raw)) )
- cpu_relax();
- local_irq_disable();
- }
- LOCK_PROFILE_GOT;
- preempt_disable();
+ _spin_lock(lock);
return flags;
}
void _spin_unlock(spinlock_t *lock)
{
+ arch_lock_release_barrier();
preempt_enable();
LOCK_PROFILE_REL;
- _raw_spin_unlock(&lock->raw);
+ add_sized(&lock->tickets.head, 1);
}
void _spin_unlock_irq(spinlock_t *lock)
{
- preempt_enable();
- LOCK_PROFILE_REL;
- _raw_spin_unlock(&lock->raw);
+ _spin_unlock(lock);
local_irq_enable();
}
void _spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags)
{
- preempt_enable();
- LOCK_PROFILE_REL;
- _raw_spin_unlock(&lock->raw);
+ _spin_unlock(lock);
local_irq_restore(flags);
}
int _spin_is_locked(spinlock_t *lock)
{
check_lock(&lock->debug);
- return _raw_spin_is_locked(&lock->raw);
+ return lock->tickets.head != lock->tickets.tail;
}
int _spin_trylock(spinlock_t *lock)
{
+ spinlock_tickets_t old, new;
+
check_lock(&lock->debug);
- if ( !_raw_spin_trylock(&lock->raw) )
+ old = observe_lock(&lock->tickets);
+ if ( old.head != old.tail )
+ return 0;
+ new = old;
+ new.tail++;
+ if ( cmpxchg(&lock->tickets.head_tail, old.head_tail, new.head_tail)
+ != old.head_tail )
return 0;
#ifdef LOCK_PROFILE
if (lock->profile)
@@ -213,27 +213,32 @@ int _spin_trylock(spinlock_t *lock)
void _spin_barrier(spinlock_t *lock)
{
+ spinlock_tickets_t sample;
#ifdef LOCK_PROFILE
s_time_t block = NOW();
- u64 loop = 0;
+#endif
check_barrier(&lock->debug);
- do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
- if ((loop > 1) && lock->profile)
+ sample = observe_lock(&lock->tickets);
+ if ( sample.head != sample.tail )
{
- lock->profile->time_block += NOW() - block;
- lock->profile->block_cnt++;
- }
-#else
- check_barrier(&lock->debug);
- do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
+ while ( observe_head(&lock->tickets) != sample.tail )
+ {
+#ifdef LOCK_PROFILE
+ if ( lock->profile )
+ {
+ lock->profile->time_block += NOW() - block;
+ lock->profile->block_cnt++;
+ }
#endif
+ }
+ }
smp_mb();
}
int _spin_trylock_recursive(spinlock_t *lock)
{
- int cpu = smp_processor_id();
+ unsigned int cpu = smp_processor_id();
/* Don't allow overflow of recurse_cpu field. */
BUILD_BUG_ON(NR_CPUS > 0xfffu);
@@ -256,8 +261,17 @@ int _spin_trylock_recursive(spinlock_t *lock)
void _spin_lock_recursive(spinlock_t *lock)
{
- while ( !spin_trylock_recursive(lock) )
- cpu_relax();
+ unsigned int cpu = smp_processor_id();
+
+ if ( likely(lock->recurse_cpu != cpu) )
+ {
+ spin_lock(lock);
+ lock->recurse_cpu = cpu;
+ }
+
+ /* We support only fairly shallow recursion, else the counter overflows. */
+ ASSERT(lock->recurse_cnt < 0xfu);
+ lock->recurse_cnt++;
}
void _spin_unlock_recursive(spinlock_t *lock)
diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
index 2eb96e8..f0e222f 100644
--- a/xen/include/asm-arm/system.h
+++ b/xen/include/asm-arm/system.h
@@ -53,6 +53,9 @@
#define arch_fetch_and_add(x, v) __sync_fetch_and_add(x, v)
+#define arch_lock_acquire_barrier() smp_mb()
+#define arch_lock_release_barrier() smp_mb()
+
extern struct vcpu *__context_switch(struct vcpu *prev, struct vcpu *next);
#endif
diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
index 47d4a70..3cadd2d 100644
--- a/xen/include/asm-x86/system.h
+++ b/xen/include/asm-x86/system.h
@@ -187,6 +187,9 @@ static always_inline unsigned long __xadd(
#define set_mb(var, value) do { xchg(&var, value); } while (0)
#define set_wmb(var, value) do { var = value; wmb(); } while (0)
+#define arch_lock_acquire_barrier() barrier()
+#define arch_lock_release_barrier() barrier()
+
#define local_irq_disable() asm volatile ( "cli" : : : "memory" )
#define local_irq_enable() asm volatile ( "sti" : : : "memory" )
diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
index eda9b2e..bafbc74 100644
--- a/xen/include/xen/spinlock.h
+++ b/xen/include/xen/spinlock.h
@@ -80,8 +80,7 @@ struct lock_profile_qhead {
static struct lock_profile *__lock_profile_##name \
__used_section(".lockprofile.data") = \
&__lock_profile_data_##name
-#define _SPIN_LOCK_UNLOCKED(x) { _RAW_SPIN_LOCK_UNLOCKED, 0xfffu, 0, \
- _LOCK_DEBUG, x }
+#define _SPIN_LOCK_UNLOCKED(x) { { 0 }, 0xfffu, 0, _LOCK_DEBUG, x }
#define SPIN_LOCK_UNLOCKED _SPIN_LOCK_UNLOCKED(NULL)
#define DEFINE_SPINLOCK(l) \
spinlock_t l = _SPIN_LOCK_UNLOCKED(NULL); \
@@ -117,8 +116,7 @@ extern void spinlock_profile_reset(unsigned char key);
struct lock_profile_qhead { };
-#define SPIN_LOCK_UNLOCKED \
- { _RAW_SPIN_LOCK_UNLOCKED, 0xfffu, 0, _LOCK_DEBUG }
+#define SPIN_LOCK_UNLOCKED { { 0 }, 0xfffu, 0, _LOCK_DEBUG }
#define DEFINE_SPINLOCK(l) spinlock_t l = SPIN_LOCK_UNLOCKED
#define spin_lock_init_prof(s, l) spin_lock_init(&((s)->l))
@@ -127,8 +125,16 @@ struct lock_profile_qhead { };
#endif
+typedef union {
+ u32 head_tail;
+ struct {
+ u16 head;
+ u16 tail;
+ };
+} spinlock_tickets_t;
+
typedef struct spinlock {
- raw_spinlock_t raw;
+ spinlock_tickets_t tickets;
u16 recurse_cpu:12;
u16 recurse_cnt:4;
struct lock_debug debug;
--
1.7.10.4
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCHv4 5/8] xen: use ticket locks for spin locks
2015-04-30 15:33 ` [PATCHv4 5/8] xen: use ticket locks for spin locks David Vrabel
@ 2015-05-05 13:56 ` Ian Campbell
2015-05-05 13:58 ` David Vrabel
2015-05-08 9:36 ` Jan Beulich
1 sibling, 1 reply; 21+ messages in thread
From: Ian Campbell @ 2015-05-05 13:56 UTC (permalink / raw)
To: David Vrabel
Cc: Keir Fraser, Tim Deegan, Jennifer Herbert, Jan Beulich,
Andrew Cooper, xen-devel
On Thu, 2015-04-30 at 16:33 +0100, David Vrabel wrote:
>
> void _spin_lock_irq(spinlock_t *lock)
> {
> - LOCK_PROFILE_VAR;
> -
> ASSERT(local_irq_is_enabled());
> local_irq_disable();
> - check_lock(&lock->debug);
> - while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
> - {
> - LOCK_PROFILE_BLOCK;
> - local_irq_enable();
> - while ( likely(_raw_spin_is_locked(&lock->raw)) )
> - cpu_relax();
> - local_irq_disable();
> - }
> - LOCK_PROFILE_GOT;
> - preempt_disable();
> + _spin_lock(lock);
This (and the irqsave variant) differs in that it spins/waits with irq's
disabled unlike the previous implementation which enabled irqs while it
waited.
Is this change intentional? I think it warrants a mention and rationale
in the commit log if so.
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCHv4 5/8] xen: use ticket locks for spin locks
2015-05-05 13:56 ` Ian Campbell
@ 2015-05-05 13:58 ` David Vrabel
0 siblings, 0 replies; 21+ messages in thread
From: David Vrabel @ 2015-05-05 13:58 UTC (permalink / raw)
To: Ian Campbell
Cc: Keir Fraser, Tim Deegan, Jennifer Herbert, Jan Beulich,
Andrew Cooper, xen-devel
On 05/05/15 14:56, Ian Campbell wrote:
> On Thu, 2015-04-30 at 16:33 +0100, David Vrabel wrote:
>>
>> void _spin_lock_irq(spinlock_t *lock)
>> {
>> - LOCK_PROFILE_VAR;
>> -
>> ASSERT(local_irq_is_enabled());
>> local_irq_disable();
>> - check_lock(&lock->debug);
>> - while ( unlikely(!_raw_spin_trylock(&lock->raw)) )
>> - {
>> - LOCK_PROFILE_BLOCK;
>> - local_irq_enable();
>> - while ( likely(_raw_spin_is_locked(&lock->raw)) )
>> - cpu_relax();
>> - local_irq_disable();
>> - }
>> - LOCK_PROFILE_GOT;
>> - preempt_disable();
>> + _spin_lock(lock);
>
> This (and the irqsave variant) differs in that it spins/waits with irq's
> disabled unlike the previous implementation which enabled irqs while it
> waited.
>
> Is this change intentional? I think it warrants a mention and rationale
> in the commit log if so.
It's mentioned in the cover letter, but I'll add it to the commit
description as well.
David
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCHv4 5/8] xen: use ticket locks for spin locks
2015-04-30 15:33 ` [PATCHv4 5/8] xen: use ticket locks for spin locks David Vrabel
2015-05-05 13:56 ` Ian Campbell
@ 2015-05-08 9:36 ` Jan Beulich
2015-05-11 13:13 ` David Vrabel
1 sibling, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2015-05-08 9:36 UTC (permalink / raw)
To: David Vrabel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
Jennifer Herbert, xen-devel
>>> On 30.04.15 at 17:33, <david.vrabel@citrix.com> wrote:
> int _spin_trylock(spinlock_t *lock)
> {
> + spinlock_tickets_t old, new;
> +
> check_lock(&lock->debug);
> - if ( !_raw_spin_trylock(&lock->raw) )
> + old = observe_lock(&lock->tickets);
> + if ( old.head != old.tail )
> + return 0;
> + new = old;
> + new.tail++;
> + if ( cmpxchg(&lock->tickets.head_tail, old.head_tail, new.head_tail)
> + != old.head_tail )
> return 0;
Maybe worth a comment here that arch_lock_acquire_barrier() is
not needed (implicitly done by cmpxchg())?
> @@ -213,27 +213,32 @@ int _spin_trylock(spinlock_t *lock)
>
> void _spin_barrier(spinlock_t *lock)
> {
> + spinlock_tickets_t sample;
> #ifdef LOCK_PROFILE
> s_time_t block = NOW();
> - u64 loop = 0;
> +#endif
>
> check_barrier(&lock->debug);
> - do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
> - if ((loop > 1) && lock->profile)
> + sample = observe_lock(&lock->tickets);
> + if ( sample.head != sample.tail )
> {
> - lock->profile->time_block += NOW() - block;
> - lock->profile->block_cnt++;
> - }
> -#else
> - check_barrier(&lock->debug);
> - do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
> + while ( observe_head(&lock->tickets) != sample.tail )
> + {
> +#ifdef LOCK_PROFILE
> + if ( lock->profile )
> + {
> + lock->profile->time_block += NOW() - block;
> + lock->profile->block_cnt++;
> + }
If you want to keep this inside the loop (which seems a bad idea)
you'd need to update "block" in each iteration and increment
->block_cnt only the first time through.
> #endif
Wouldn't there be a cpu_relax() on order here?
> + }
> + }
> smp_mb();
> }
The old code had smp_mb() before _and_ after the check - is it really
correct to drop the one before (or effectively replace it by smp_rmb()
in observe_{lock,head}())?
> @@ -256,8 +261,17 @@ int _spin_trylock_recursive(spinlock_t *lock)
>
> void _spin_lock_recursive(spinlock_t *lock)
> {
> - while ( !spin_trylock_recursive(lock) )
> - cpu_relax();
> + unsigned int cpu = smp_processor_id();
> +
> + if ( likely(lock->recurse_cpu != cpu) )
> + {
> + spin_lock(lock);
While benign with what xen/spinlock.h currently has, it would seem
to me that in a function named _spin_lock_recursive() this ought to
be _spin_lock().
Jan
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCHv4 5/8] xen: use ticket locks for spin locks
2015-05-08 9:36 ` Jan Beulich
@ 2015-05-11 13:13 ` David Vrabel
2015-05-11 13:29 ` Tim Deegan
0 siblings, 1 reply; 21+ messages in thread
From: David Vrabel @ 2015-05-11 13:13 UTC (permalink / raw)
To: Jan Beulich, David Vrabel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Jennifer Herbert,
Tim Deegan, xen-devel
On 08/05/15 10:36, Jan Beulich wrote:
>>
>> + }
>> + }
>> smp_mb();
>> }
>
> The old code had smp_mb() before _and_ after the check - is it really
> correct to drop the one before (or effectively replace it by smp_rmb()
> in observe_{lock,head}())?
Typical usage is:
d->is_dying = DOMDYING_dying;
spin_barrier(&d->domain_lock);
So there needs to be a barrier before we check that the lock is
released. i.e., I removed the wrong smp_mb().
I don't see the need for the second barrier since there's no stores in
_spin_barrier() and observe_lock() and observe_head() both have their
required read barriers.
David
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCHv4 5/8] xen: use ticket locks for spin locks
2015-05-11 13:13 ` David Vrabel
@ 2015-05-11 13:29 ` Tim Deegan
0 siblings, 0 replies; 21+ messages in thread
From: Tim Deegan @ 2015-05-11 13:29 UTC (permalink / raw)
To: David Vrabel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Jennifer Herbert,
Jan Beulich, xen-devel
At 14:13 +0100 on 11 May (1431353629), David Vrabel wrote:
> On 08/05/15 10:36, Jan Beulich wrote:
> >>
> >> + }
> >> + }
> >> smp_mb();
> >> }
> >
> > The old code had smp_mb() before _and_ after the check - is it really
> > correct to drop the one before (or effectively replace it by smp_rmb()
> > in observe_{lock,head}())?
>
> Typical usage is:
>
> d->is_dying = DOMDYING_dying;
> spin_barrier(&d->domain_lock);
>
> So there needs to be a barrier before we check that the lock is
> released. i.e., I removed the wrong smp_mb().
>
> I don't see the need for the second barrier since there's no stores in
> _spin_barrier() and observe_lock() and observe_head() both have their
> required read barriers.
You need the second smp_mb() to make sure that any post-barrier writes
don't happen before those reads (i.e. before the barrier).
E.g. evtchn_destroy() uses a barrier before modifying some state
(though in that case I think the read barrier might be enough because
the writes depend on reads, but you get the idea).
Tim.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCHv4 6/8] x86, arm: remove asm/spinlock.h from all architectures
2015-04-30 15:33 [PATCHv4 0/8] Use ticket locks for spinlocks David Vrabel
` (4 preceding siblings ...)
2015-04-30 15:33 ` [PATCHv4 5/8] xen: use ticket locks for spin locks David Vrabel
@ 2015-04-30 15:33 ` David Vrabel
2015-05-05 13:49 ` Ian Campbell
2015-04-30 15:33 ` [PATCHv4 7/8] x86: reduce struct paging_domain size David Vrabel
2015-04-30 15:33 ` [PATCHv4 8/8] x86: reduce struct hvm_domain size David Vrabel
7 siblings, 1 reply; 21+ messages in thread
From: David Vrabel @ 2015-04-30 15:33 UTC (permalink / raw)
To: xen-devel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
David Vrabel, Jan Beulich, Jennifer Herbert
Now that all architecture use a common ticket lock implementation for
spinlocks, remove the architecture specific byte lock implementations.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
xen/arch/arm/README.LinuxPrimitives | 28 ---------------
xen/include/asm-arm/arm32/spinlock.h | 66 ----------------------------------
xen/include/asm-arm/arm64/spinlock.h | 63 --------------------------------
xen/include/asm-arm/spinlock.h | 23 ------------
xen/include/asm-x86/spinlock.h | 37 -------------------
xen/include/xen/spinlock.h | 1 -
6 files changed, 218 deletions(-)
delete mode 100644 xen/include/asm-arm/arm32/spinlock.h
delete mode 100644 xen/include/asm-arm/arm64/spinlock.h
delete mode 100644 xen/include/asm-arm/spinlock.h
delete mode 100644 xen/include/asm-x86/spinlock.h
diff --git a/xen/arch/arm/README.LinuxPrimitives b/xen/arch/arm/README.LinuxPrimitives
index 7f33fc7..3115f51 100644
--- a/xen/arch/arm/README.LinuxPrimitives
+++ b/xen/arch/arm/README.LinuxPrimitives
@@ -25,16 +25,6 @@ linux/arch/arm64/include/asm/atomic.h xen/include/asm-arm/arm64/atomic.h
---------------------------------------------------------------------
-spinlocks: last sync @ v3.16-rc6 (last commit: 95c4189689f9)
-
-linux/arch/arm64/include/asm/spinlock.h xen/include/asm-arm/arm64/spinlock.h
-
-Skipped:
- 5686b06 arm64: lockref: add support for lockless lockrefs using cmpxchg
- 52ea2a5 arm64: locks: introduce ticket-based spinlock implementation
-
----------------------------------------------------------------------
-
mem*: last sync @ v3.16-rc6 (last commit: d875c9b37240)
linux/arch/arm64/lib/memchr.S xen/arch/arm/arm64/lib/memchr.S
@@ -103,24 +93,6 @@ linux/arch/arm/include/asm/atomic.h xen/include/asm-arm/arm32/atomic.h
---------------------------------------------------------------------
-spinlocks: last sync: 15e7e5c1ebf5
-
-linux/arch/arm/include/asm/spinlock.h xen/include/asm-arm/arm32/spinlock.h
-
-*** Linux has switched to ticket locks but we still use bitlocks.
-
-resync to v3.14-rc7:
-
- 7c8746a ARM: 7955/1: spinlock: ensure we have a compiler barrier before sev
- 0cbad9c ARM: 7854/1: lockref: add support for lockless lockrefs using cmpxchg64
- 9bb17be ARM: locks: prefetch the destination word for write prior to strex
- 27a8479 ARM: smp_on_up: move inline asm ALT_SMP patching macro out of spinlock.
- 00efaa0 ARM: 7812/1: rwlocks: retry trylock operation if strex fails on free lo
- afa31d8 ARM: 7811/1: locks: use early clobber in arch_spin_trylock
- 73a6fdc ARM: spinlock: use inner-shareable dsb variant prior to sev instruction
-
----------------------------------------------------------------------
-
mem*: last sync @ v3.16-rc6 (last commit: d98b90ea22b0)
linux/arch/arm/lib/copy_template.S xen/arch/arm/arm32/lib/copy_template.S
diff --git a/xen/include/asm-arm/arm32/spinlock.h b/xen/include/asm-arm/arm32/spinlock.h
deleted file mode 100644
index bc0343c..0000000
--- a/xen/include/asm-arm/arm32/spinlock.h
+++ /dev/null
@@ -1,66 +0,0 @@
-#ifndef __ASM_ARM32_SPINLOCK_H
-#define __ASM_ARM32_SPINLOCK_H
-
-static inline void dsb_sev(void)
-{
- __asm__ __volatile__ (
- "dsb\n"
- "sev\n"
- );
-}
-
-typedef struct {
- volatile unsigned int lock;
-} raw_spinlock_t;
-
-#define _RAW_SPIN_LOCK_UNLOCKED { 0 }
-
-#define _raw_spin_is_locked(x) ((x)->lock != 0)
-
-static always_inline void _raw_spin_unlock(raw_spinlock_t *lock)
-{
- ASSERT(_raw_spin_is_locked(lock));
-
- smp_mb();
-
- __asm__ __volatile__(
-" str %1, [%0]\n"
- :
- : "r" (&lock->lock), "r" (0)
- : "cc");
-
- dsb_sev();
-}
-
-static always_inline int _raw_spin_trylock(raw_spinlock_t *lock)
-{
- unsigned long contended, res;
-
- do {
- __asm__ __volatile__(
- " ldrex %0, [%2]\n"
- " teq %0, #0\n"
- " strexeq %1, %3, [%2]\n"
- " movne %1, #0\n"
- : "=&r" (contended), "=r" (res)
- : "r" (&lock->lock), "r" (1)
- : "cc");
- } while (res);
-
- if (!contended) {
- smp_mb();
- return 1;
- } else {
- return 0;
- }
-}
-
-#endif /* __ASM_SPINLOCK_H */
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/include/asm-arm/arm64/spinlock.h b/xen/include/asm-arm/arm64/spinlock.h
deleted file mode 100644
index 5ae034d..0000000
--- a/xen/include/asm-arm/arm64/spinlock.h
+++ /dev/null
@@ -1,63 +0,0 @@
-/*
- * Derived from Linux arch64 spinlock.h which is:
- * Copyright (C) 2012 ARM Ltd.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef __ASM_ARM64_SPINLOCK_H
-#define __ASM_ARM64_SPINLOCK_H
-
-typedef struct {
- volatile unsigned int lock;
-} raw_spinlock_t;
-
-#define _RAW_SPIN_LOCK_UNLOCKED { 0 }
-
-#define _raw_spin_is_locked(x) ((x)->lock != 0)
-
-static always_inline void _raw_spin_unlock(raw_spinlock_t *lock)
-{
- ASSERT(_raw_spin_is_locked(lock));
-
- asm volatile(
- " stlr %w1, %0\n"
- : "=Q" (lock->lock) : "r" (0) : "memory");
-}
-
-static always_inline int _raw_spin_trylock(raw_spinlock_t *lock)
-{
- unsigned int tmp;
-
- asm volatile(
- "2: ldaxr %w0, %1\n"
- " cbnz %w0, 1f\n"
- " stxr %w0, %w2, %1\n"
- " cbnz %w0, 2b\n"
- "1:\n"
- : "=&r" (tmp), "+Q" (lock->lock)
- : "r" (1)
- : "cc", "memory");
-
- return !tmp;
-}
-
-#endif /* __ASM_SPINLOCK_H */
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/include/asm-arm/spinlock.h b/xen/include/asm-arm/spinlock.h
deleted file mode 100644
index a064f73..0000000
--- a/xen/include/asm-arm/spinlock.h
+++ /dev/null
@@ -1,23 +0,0 @@
-#ifndef __ASM_SPINLOCK_H
-#define __ASM_SPINLOCK_H
-
-#include <xen/config.h>
-#include <xen/lib.h>
-
-#if defined(CONFIG_ARM_32)
-# include <asm/arm32/spinlock.h>
-#elif defined(CONFIG_ARM_64)
-# include <asm/arm64/spinlock.h>
-#else
-# error "unknown ARM variant"
-#endif
-
-#endif /* __ASM_SPINLOCK_H */
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/include/asm-x86/spinlock.h b/xen/include/asm-x86/spinlock.h
deleted file mode 100644
index 757e20b..0000000
--- a/xen/include/asm-x86/spinlock.h
+++ /dev/null
@@ -1,37 +0,0 @@
-#ifndef __ASM_SPINLOCK_H
-#define __ASM_SPINLOCK_H
-
-#include <xen/config.h>
-#include <xen/lib.h>
-#include <asm/atomic.h>
-
-typedef struct {
- volatile s16 lock;
-} raw_spinlock_t;
-
-#define _RAW_SPIN_LOCK_UNLOCKED /*(raw_spinlock_t)*/ { 1 }
-
-#define _raw_spin_is_locked(x) ((x)->lock <= 0)
-
-static always_inline void _raw_spin_unlock(raw_spinlock_t *lock)
-{
- ASSERT(_raw_spin_is_locked(lock));
- asm volatile (
- "movw $1,%0"
- : "=m" (lock->lock) : : "memory" );
-}
-
-static always_inline int _raw_spin_trylock(raw_spinlock_t *lock)
-{
- s16 oldval;
- asm volatile (
- "xchgw %w0,%1"
- :"=r" (oldval), "=m" (lock->lock)
- :"0" ((s16)0) : "memory" );
- return (oldval > 0);
-}
-
-#define _raw_read_unlock(l) \
- asm volatile ( "lock; dec%z0 %0" : "+m" ((l)->lock) :: "memory" )
-
-#endif /* __ASM_SPINLOCK_H */
diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
index bafbc74..311685a 100644
--- a/xen/include/xen/spinlock.h
+++ b/xen/include/xen/spinlock.h
@@ -2,7 +2,6 @@
#define __SPINLOCK_H__
#include <asm/system.h>
-#include <asm/spinlock.h>
#ifndef NDEBUG
struct lock_debug {
--
1.7.10.4
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCHv4 6/8] x86, arm: remove asm/spinlock.h from all architectures
2015-04-30 15:33 ` [PATCHv4 6/8] x86, arm: remove asm/spinlock.h from all architectures David Vrabel
@ 2015-05-05 13:49 ` Ian Campbell
0 siblings, 0 replies; 21+ messages in thread
From: Ian Campbell @ 2015-05-05 13:49 UTC (permalink / raw)
To: David Vrabel
Cc: Keir Fraser, Tim Deegan, Jennifer Herbert, Jan Beulich,
Andrew Cooper, xen-devel
On Thu, 2015-04-30 at 16:33 +0100, David Vrabel wrote:
> Now that all architecture use a common ticket lock implementation for
> spinlocks, remove the architecture specific byte lock implementations.
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> Reviewed-by: Tim Deegan <tim@xen.org>
> Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCHv4 7/8] x86: reduce struct paging_domain size
2015-04-30 15:33 [PATCHv4 0/8] Use ticket locks for spinlocks David Vrabel
` (5 preceding siblings ...)
2015-04-30 15:33 ` [PATCHv4 6/8] x86, arm: remove asm/spinlock.h from all architectures David Vrabel
@ 2015-04-30 15:33 ` David Vrabel
2015-04-30 15:33 ` [PATCHv4 8/8] x86: reduce struct hvm_domain size David Vrabel
7 siblings, 0 replies; 21+ messages in thread
From: David Vrabel @ 2015-04-30 15:33 UTC (permalink / raw)
To: xen-devel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
David Vrabel, Jan Beulich, Jennifer Herbert
Pack struct paging_domain to reduce it by 8 bytes. Thus reducing the
size of struct domain by 8 bytes.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
xen/include/asm-x86/domain.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index e5102cc..26125af 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -186,6 +186,8 @@ struct paging_domain {
/* flags to control paging operation */
u32 mode;
+ /* Has that pool ever run out of memory? */
+ bool_t p2m_alloc_failed;
/* extension for shadow paging support */
struct shadow_domain shadow;
/* extension for hardware-assited paging */
@@ -210,8 +212,6 @@ struct paging_domain {
* (used by p2m and log-dirty code for their tries) */
struct page_info * (*alloc_page)(struct domain *d);
void (*free_page)(struct domain *d, struct page_info *pg);
- /* Has that pool ever run out of memory? */
- bool_t p2m_alloc_failed;
};
struct paging_vcpu {
--
1.7.10.4
^ permalink raw reply related [flat|nested] 21+ messages in thread* [PATCHv4 8/8] x86: reduce struct hvm_domain size
2015-04-30 15:33 [PATCHv4 0/8] Use ticket locks for spinlocks David Vrabel
` (6 preceding siblings ...)
2015-04-30 15:33 ` [PATCHv4 7/8] x86: reduce struct paging_domain size David Vrabel
@ 2015-04-30 15:33 ` David Vrabel
2015-05-08 9:47 ` Jan Beulich
7 siblings, 1 reply; 21+ messages in thread
From: David Vrabel @ 2015-04-30 15:33 UTC (permalink / raw)
To: xen-devel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
David Vrabel, Jan Beulich, Jennifer Herbert
Pack struct hvm_domain to reduce it by 8 bytes. Thus reducing the
size of struct domain by 8 bytes.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
xen/include/asm-x86/hvm/domain.h | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index 0702bf5..8bdf228 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -116,12 +116,6 @@ struct hvm_domain {
/* VRAM dirty support. Protect with the domain paging lock. */
struct sh_dirty_vram *dirty_vram;
- /* If one of vcpus of this domain is in no_fill_mode or
- * mtrr/pat between vcpus is not the same, set is_in_uc_mode
- */
- spinlock_t uc_lock;
- bool_t is_in_uc_mode;
-
/* Pass-through */
struct hvm_iommu hvm_iommu;
@@ -137,6 +131,12 @@ struct hvm_domain {
bool_t is_s3_suspended;
bool_t introspection_enabled;
+ /* If one of vcpus of this domain is in no_fill_mode or
+ * mtrr/pat between vcpus is not the same, set is_in_uc_mode
+ */
+ bool_t is_in_uc_mode;
+ spinlock_t uc_lock;
+
/*
* TSC value that VCPUs use to calculate their tsc_offset value.
* Used during initialization and save/restore.
--
1.7.10.4
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCHv4 8/8] x86: reduce struct hvm_domain size
2015-04-30 15:33 ` [PATCHv4 8/8] x86: reduce struct hvm_domain size David Vrabel
@ 2015-05-08 9:47 ` Jan Beulich
2015-05-11 13:17 ` David Vrabel
0 siblings, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2015-05-08 9:47 UTC (permalink / raw)
To: David Vrabel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Tim Deegan,
Jennifer Herbert, xen-devel
>>> On 30.04.15 at 17:33, <david.vrabel@citrix.com> wrote:
> Pack struct hvm_domain to reduce it by 8 bytes. Thus reducing the
> size of struct domain by 8 bytes.
Is that really true _after_ the change to ticket locks?
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -116,12 +116,6 @@ struct hvm_domain {
> /* VRAM dirty support. Protect with the domain paging lock. */
> struct sh_dirty_vram *dirty_vram;
>
> - /* If one of vcpus of this domain is in no_fill_mode or
> - * mtrr/pat between vcpus is not the same, set is_in_uc_mode
> - */
> - spinlock_t uc_lock;
> - bool_t is_in_uc_mode;
> -
> /* Pass-through */
> struct hvm_iommu hvm_iommu;
Here the block sits between two 8-byte aligned fields.
> @@ -137,6 +131,12 @@ struct hvm_domain {
> bool_t is_s3_suspended;
> bool_t introspection_enabled;
>
> + /* If one of vcpus of this domain is in no_fill_mode or
> + * mtrr/pat between vcpus is not the same, set is_in_uc_mode
> + */
> + bool_t is_in_uc_mode;
> + spinlock_t uc_lock;
> +
> /*
> * TSC value that VCPUs use to calculate their tsc_offset value.
> * Used during initialization and save/restore.
And here it follows 5 bool_t-s, and is being followed by an 8-byte
aligned field. I.e. without ticket locks it exactly fills the 3 byte gap,
but with ticket locks it requires a second 8-byte slot.
Additionally I wonder whether the reduced distance between
uc_lock and msixtbl_list_lock would now lead to (or, going forward,
at least risk) them being on the same cache line.
Jan
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCHv4 8/8] x86: reduce struct hvm_domain size
2015-05-08 9:47 ` Jan Beulich
@ 2015-05-11 13:17 ` David Vrabel
0 siblings, 0 replies; 21+ messages in thread
From: David Vrabel @ 2015-05-11 13:17 UTC (permalink / raw)
To: Jan Beulich, David Vrabel
Cc: Keir Fraser, Ian Campbell, Andrew Cooper, Jennifer Herbert,
Tim Deegan, xen-devel
On 08/05/15 10:47, Jan Beulich wrote:
>>>> On 30.04.15 at 17:33, <david.vrabel@citrix.com> wrote:
>> Pack struct hvm_domain to reduce it by 8 bytes. Thus reducing the
>> size of struct domain by 8 bytes.
>
> Is that really true _after_ the change to ticket locks?
Yes.
>> @@ -137,6 +131,12 @@ struct hvm_domain {
>> bool_t is_s3_suspended;
>> bool_t introspection_enabled;
>>
>> + /* If one of vcpus of this domain is in no_fill_mode or
>> + * mtrr/pat between vcpus is not the same, set is_in_uc_mode
>> + */
>> + bool_t is_in_uc_mode;
>> + spinlock_t uc_lock;
>> +
>> /*
>> * TSC value that VCPUs use to calculate their tsc_offset value.
>> * Used during initialization and save/restore.
>
> And here it follows 5 bool_t-s, and is being followed by an 8-byte
> aligned field. I.e. without ticket locks it exactly fills the 3 byte gap,
> but with ticket locks it requires a second 8-byte slot.
No. The old byte locks were 4-bytes in size (not 2 bytes).
> Additionally I wonder whether the reduced distance between
> uc_lock and msixtbl_list_lock would now lead to (or, going forward,
> at least risk) them being on the same cache line.
They're still on different cache lines.
David
^ permalink raw reply [flat|nested] 21+ messages in thread