Re: [PATCH v9 05/12] RISC-V: Atomic and Locking Code

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Will Deacon <will.deacon@arm.com>
To: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Daniel Lustig <dlustig@nvidia.com>, Arnd Bergmann <arnd@arndb.de>,
	Olof Johansson <olof@lixom.net>,
	linux-kernel@vger.kernel.org, patches@groups.riscv.org,
	peterz@infradead.org, boqun.feng@gmail.com
Subject: Re: [PATCH v9 05/12] RISC-V: Atomic and Locking Code
Date: Wed, 15 Nov 2017 18:06:01 +0000	[thread overview]
Message-ID: <20171115180600.GR19071@arm.com> (raw)
In-Reply-To: <mhng-f07acce6-f1e0-4f6b-ad20-7a6921f089ea@palmer-si-x1c4>

Hi Palmer,

On Tue, Nov 14, 2017 at 12:30:59PM -0800, Palmer Dabbelt wrote:
> On Tue, 24 Oct 2017 07:10:33 PDT (-0700), will.deacon@arm.com wrote:
> >On Tue, Sep 26, 2017 at 06:56:31PM -0700, Palmer Dabbelt wrote:
> >>+ATOMIC_OPS(add, add, +,  i)
> >>+ATOMIC_OPS(sub, add, +, -i)
> >>+ATOMIC_OPS(and, and, &,  i)
> >>+ATOMIC_OPS( or,  or, |,  i)
> >>+ATOMIC_OPS(xor, xor, ^,  i)
> >
> >What is the point in the c_op parameter to these things?
> 
> I guess there isn't one, it just lingered from a handful of refactorings.
> It's used in some of the other functions if we need to do a C operation
> after the atomic.  How does this look?
> 
> commit 5db229491a205ad0e1aa18287e3b342176c62d30 (HEAD -> staging-mm)
> Author: Palmer Dabbelt <palmer@dabbelt.com>
> Date:   Tue Nov 14 11:35:37 2017 -0800
> 
>    RISC-V: Remove c_op from ATOMIC_OP
> 
>    This was an unused macro parameter.
> 
>    Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>

You can do the same thing for FETCH_OP, I think.

> >>+ATOMIC_OPS(add, add, +,  i,      , _relaxed)
> >>+ATOMIC_OPS(add, add, +,  i, .aq  , _acquire)
> >>+ATOMIC_OPS(add, add, +,  i, .rl  , _release)
> >>+ATOMIC_OPS(add, add, +,  i, .aqrl,         )
> >
> >Have you checked that .aqrl is equivalent to "ordered", since there are
> >interpretations where that isn't the case. Specifically:
> >
> >// all variables zero at start of time
> >P0:
> >WRITE_ONCE(x) = 1;
> >atomic_add_return(y, 1);
> >WRITE_ONCE(z) = 1;
> >
> >P1:
> >READ_ONCE(z) // reads 1
> >smp_rmb();
> >READ_ONCE(x) // must not read 0
> 
> I haven't.  We don't quite have a formal memory model specification yet.
> I've added Daniel Lustig, who is creating that model.  He should have a
> better idea

Thanks. You really do need to ensure that, as it's heavily relied upon.

> >>+/* These barriers do not need to enforce ordering on devices, just memory. */
> >>+#define smp_mb()	RISCV_FENCE(rw,rw)
> >>+#define smp_rmb()	RISCV_FENCE(r,r)
> >>+#define smp_wmb()	RISCV_FENCE(w,w)
> >>+
> >>+/*
> >>+ * These fences exist to enforce ordering around the relaxed AMOs.  The
> >>+ * documentation defines that
> >>+ * "
> >>+ *     atomic_fetch_add();
> >>+ *   is equivalent to:
> >>+ *     smp_mb__before_atomic();
> >>+ *     atomic_fetch_add_relaxed();
> >>+ *     smp_mb__after_atomic();
> >>+ * "
> >>+ * So we emit full fences on both sides.
> >>+ */
> >>+#define __smb_mb__before_atomic()	smp_mb()
> >>+#define __smb_mb__after_atomic()	smp_mb()
> >
> >Now I'm confused, because you're also spitting out .aqrl for those afaict.
> >Do you really need full barriers *and* .aqrl, or am I misunderstanding
> >something here?
> 
> Here's the section of atomic_t.txt that I was reading
> 
>    The barriers:
>      smp_mb__{before,after}_atomic()
>    only apply to the RMW ops and can be used to augment/upgrade the ordering
>    inherent to the used atomic op. These barriers provide a full smp_mb().
>    These helper barriers exist because architectures have varying implicit
>    ordering on their SMP atomic primitives. For example our TSO architectures
>    provide full ordered atomics and these barriers are no-ops.
>    Thus:
>      atomic_fetch_add();
>    is equivalent to:
>      smp_mb__before_atomic();
>      atomic_fetch_add_relaxed();
>      smp_mb__after_atomic();
>    However the atomic_fetch_add() might be implemented more efficiently.
> 
> so I think what we've got there is correct: the barriers go with
> atomic_fetch_add_relaxed(), not atomic_fetch_add().  asm-generic/barrier.h
> has
> 
>    #ifndef __smp_mb__before_atomic
>    #define __smp_mb__before_atomic()       __smp_mb()
>    #endif
>    #ifndef __smp_mb__after_atomic
>    #define __smp_mb__after_atomic()        __smp_mb()
>    #endif
> 
> so I think we can just drop these entirely.  How does this look?
> 
> commit da682f7ee5d2af4aae7026ef40b5b5fb8e103938
> Author: Palmer Dabbelt <palmer@dabbelt.com>
> Date:   Tue Nov 14 11:49:42 2017 -0800
> 
>    RISC-V: Remove __smp_bp__{before,after}_atomic
> 
>    These duplicate the asm-generic definitions are therefor aren't useful.
> 
>    Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>

Yes, that looks good to me. I was misunderstanding things, but you're right
that you don't need to do anything special for this.

> >>+
> >>+/*
> >>+ * These barriers prevent accesses performed outside a spinlock from being moved
> >>+ * inside a spinlock.  Since RISC-V sets the aq/rl bits on our spinlock only
> >>+ * enforce release consistency, we need full fences here.
> >>+ */
> >>+#define smb_mb__before_spinlock()	smp_mb()
> >
> >We killed this macro, so you don't need to define it.
> 
> Thanks!
> 
> commit 382a1f8b33a04fc0f991e062f70f4c65ca888bca
> Author: Palmer Dabbelt <palmer@dabbelt.com>
> Date:   Tue Nov 14 11:50:37 2017 -0800
> 
>    RISC-V: Remove smb_mb__{before,after}_spinlock()
> 
>    These are obselete.
> 
>    Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
> 
> diff --git a/arch/riscv/include/asm/barrier.h b/arch/riscv/include/asm/barrier.h
> index 455ee16127fb..773c4e039cd7 100644
> --- a/arch/riscv/include/asm/barrier.h
> +++ b/arch/riscv/include/asm/barrier.h
> @@ -38,14 +38,6 @@
> #define smp_rmb()	RISCV_FENCE(r,r)
> #define smp_wmb()	RISCV_FENCE(w,w)
> 
> -/*
> - * These barriers prevent accesses performed outside a spinlock from being moved
> - * inside a spinlock.  Since RISC-V sets the aq/rl bits on our spinlock only
> - * enforce release consistency, we need full fences here.
> - */
> -#define smb_mb__before_spinlock()	smp_mb()
> -#define smb_mb__after_spinlock()	smp_mb()

You might still need the __after version, particularly if your lock
operation has only acquire semantics. The __before version has been removed.

> >>+#define __test_and_op_bit_ord(op, mod, nr, addr, ord)		\
> >>+({								\
> >>+	unsigned long __res, __mask;				\
> >>+	__mask = BIT_MASK(nr);					\
> >>+	__asm__ __volatile__ (					\
> >>+		__AMO(op) #ord " %0, %2, %1"			\
> >>+		: "=r" (__res), "+A" (addr[BIT_WORD(nr)])	\
> >>+		: "r" (mod(__mask))				\
> >>+		: "memory");					\
> >>+	((__res & __mask) != 0);				\
> >>+})
> >
> >This looks broken to me -- the value-returning test bitops need to be fully
> >ordered.
> 
> Yep, you're right -- I just mis-read atomic_ops.rst (and re-misread it the
> first time when I went to check again).  I think this should do it
> 
> commit 9951b6ed76bffb714517d81d9ffceb0eb1796229
> Author: Palmer Dabbelt <palmer@dabbelt.com>
> Date:   Tue Nov 14 12:06:06 2017 -0800
> 
>    RISC-V: __test_and_op_bit_ord should be strongly ordered
> 
>    I mis-read the documentation.
> 
>    Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
> 
> diff --git a/arch/riscv/include/asm/bitops.h b/arch/riscv/include/asm/bitops.h
> index 7c281ef1d583..f30daf26f08f 100644
> --- a/arch/riscv/include/asm/bitops.h
> +++ b/arch/riscv/include/asm/bitops.h
> @@ -67,7 +67,7 @@
> 		: "memory");
> 
> #define __test_and_op_bit(op, mod, nr, addr) 			\
> -	__test_and_op_bit_ord(op, mod, nr, addr, )
> +	__test_and_op_bit_ord(op, mod, nr, addr, .aqrl)

Yes, modulo my earlier comment and litmus test.

> >>+#define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
> >>+#define arch_spin_is_locked(x)	((x)->lock != 0)
> >
> >Missing READ_ONCE.
> 
> Thanks!
> 
> commit 64e80b0bf3898a88de09f4e12090b002b57efede
> Author: Palmer Dabbelt <palmer@dabbelt.com>
> Date:   Tue Nov 14 12:18:49 2017 -0800
> 
>    RISC-V: Add READ_ONCE in arch_spin_is_locked()
>    Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>

Also looks good.

> >>+static inline void arch_read_lock(arch_rwlock_t *lock)
> >>+{
> >>+	int tmp;
> >>+
> >>+	__asm__ __volatile__(
> >>+		"1:	lr.w	%1, %0\n"
> >>+		"	bltz	%1, 1b\n"
> >>+		"	addi	%1, %1, 1\n"
> >>+		"	sc.w.aq	%1, %1, %0\n"
> >>+		"	bnez	%1, 1b\n"
> >>+		: "+A" (lock->lock), "=&r" (tmp)
> >>+		:: "memory");
> >>+}
> >>+
> >>+static inline void arch_write_lock(arch_rwlock_t *lock)
> >>+{
> >>+	int tmp;
> >>+
> >>+	__asm__ __volatile__(
> >>+		"1:	lr.w	%1, %0\n"
> >>+		"	bnez	%1, 1b\n"
> >>+		"	li	%1, -1\n"
> >>+		"	sc.w.aq	%1, %1, %0\n"
> >>+		"	bnez	%1, 1b\n"
> >>+		: "+A" (lock->lock), "=&r" (tmp)
> >>+		:: "memory");
> >>+}
> >
> >I think you have the same starvation issues as we have on arm64 here. I
> >strongly recommend moving over to qrwlock :)

I still strongly recommend this!

> >
> >>+#ifndef _ASM_RISCV_TLBFLUSH_H
> >>+#define _ASM_RISCV_TLBFLUSH_H
> >>+
> >>+#ifdef CONFIG_MMU
> >>+
> >>+/* Flush entire local TLB */
> >>+static inline void local_flush_tlb_all(void)
> >>+{
> >>+	__asm__ __volatile__ ("sfence.vma" : : : "memory");
> >>+}
> >>+
> >>+/* Flush one page from local TLB */
> >>+static inline void local_flush_tlb_page(unsigned long addr)
> >>+{
> >>+	__asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory");
> >>+}
> >
> >Is this serialised against prior updates to the page table (set_pte) and
> >also against subsequent instruction fetch?
> 
> This is a store -> (load, store) fence.  The ordering is between stores that
> touch paging data structures and the implicit loads that come from any
> memory access when paging is enabled.  As far as I can tell, it does not
> enforce any instruction fetch ordering constraints.

That sounds pretty suspicious to me. You need to ensure that speculative
instruction fetch doesn't fetch instructions from the old mapping. Also,
store -> (load, store) fences don't generally order the page table walk,
which is what you need to order here.

Will

next prev parent reply	other threads:[~2017-11-15 18:06 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-27  1:56 RISC-V Linux Port v9 Palmer Dabbelt
2017-09-27  1:56 ` Palmer Dabbelt
2017-09-27  1:56 ` [PATCH v9 02/12] lib: Add shared copies of some GCC library routines Palmer Dabbelt
     [not found] ` <20170927015638.19443-1-palmer-96lFi9zoCfxBDgjK7y7TUQ@public.gmane.org>
2017-09-27  1:56   ` [PATCH v9 01/12] MAINTAINERS: Add RISC-V Palmer Dabbelt
2017-09-27  1:56     ` Palmer Dabbelt
2017-09-27  1:56   ` [PATCH v9 03/12] dt-bindings: RISC-V CPU Bindings Palmer Dabbelt
2017-09-27  1:56     ` Palmer Dabbelt
2017-10-05 10:16     ` Mark Rutland
2017-11-20  7:35       ` [patches] " Jonathan Neuschäfer
2017-11-20  7:35         ` Jonathan Neuschäfer
2017-11-20 19:45         ` Palmer Dabbelt
2017-09-27  1:56   ` [PATCH v9 05/12] RISC-V: Atomic and Locking Code Palmer Dabbelt
2017-09-27  1:56     ` Palmer Dabbelt
     [not found]     ` <20170927015638.19443-6-palmer-96lFi9zoCfxBDgjK7y7TUQ@public.gmane.org>
2017-10-24 14:10       ` Will Deacon
2017-10-24 14:10         ` Will Deacon
2017-11-14 20:30         ` Palmer Dabbelt
2017-11-15 18:06           ` Will Deacon [this message]
2017-11-15 19:48             ` [patches] " Palmer Dabbelt
2017-11-15 23:59               ` Daniel Lustig
2017-11-16  1:19                 ` Boqun Feng
2017-11-16  1:31                   ` Daniel Lustig
2017-11-16  1:52                     ` Boqun Feng
2017-11-16  6:40                       ` Daniel Lustig
2017-11-16 10:25                         ` Will Deacon
2017-11-16 17:12                           ` Daniel Lustig
2017-11-16  2:08                 ` Palmer Dabbelt
2017-09-27  1:56   ` [PATCH v9 10/12] RISC-V: Paging and MMU Palmer Dabbelt
2017-09-27  1:56     ` Palmer Dabbelt
2017-09-27  1:56   ` [PATCH v9 12/12] RISC-V: Build Infrastructure Palmer Dabbelt
2017-09-27  1:56     ` Palmer Dabbelt
2017-09-27  1:56 ` [PATCH v9 04/12] RISC-V: Init and Halt Code Palmer Dabbelt
2017-10-05 11:01   ` Mark Rutland
2018-07-30 23:42     ` Palmer Dabbelt
2018-07-30 23:42       ` Palmer Dabbelt
2018-07-31 13:03       ` Mark Rutland
2018-07-31 13:03         ` Mark Rutland
2017-09-27  1:56 ` [PATCH v9 06/12] RISC-V: Generic library routines and assembly Palmer Dabbelt
2017-09-27  1:56 ` [PATCH v9 07/12] RISC-V: ELF and module implementation Palmer Dabbelt
2017-09-27  1:56 ` [PATCH v9 08/12] RISC-V: Task implementation Palmer Dabbelt
2017-09-27  1:56 ` [PATCH v9 09/12] RISC-V: Device, timer, IRQs, and the SBI Palmer Dabbelt
2017-09-27  1:56 ` [PATCH v9 11/12] RISC-V: User-facing API Palmer Dabbelt
2017-09-27  6:08 ` RISC-V Linux Port v9 Arnd Bergmann
     [not found]   ` <CAK8P3a1ROLHHY9HE0LtPywhTMqBaHjp8kd4ZVBr3iWTyveBh9A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-10-05  0:21     ` Palmer Dabbelt
2017-10-05  0:21       ` Palmer Dabbelt
2017-10-05  7:34       ` Arnd Bergmann
2017-10-05  7:34         ` Arnd Bergmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171115180600.GR19071@arm.com \
    --to=will.deacon@arm.com \
    --cc=arnd@arndb.de \
    --cc=boqun.feng@gmail.com \
    --cc=dlustig@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=olof@lixom.net \
    --cc=palmer@dabbelt.com \
    --cc=patches@groups.riscv.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.