From: Will Deacon <will@kernel.org>
To: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org, boqun.feng@gmail.com,
	catalin.marinas@arm.com, peterz@infradead.org
Subject: Re: [PATCH 4/5] arm64: atomics: lse: improve constraints for simple ops
Date: Mon, 13 Dec 2021 19:40:23 +0000
Message-ID: <20211213194022.GD12868@willie-the-truck>
In-Reply-To: <20211210151410.2782645-5-mark.rutland@arm.com>

On Fri, Dec 10, 2021 at 03:14:09PM +0000, Mark Rutland wrote:
> We have overly conservative assembly constraints for the basic FEAT_LSE
> atomic instructions, and using more accurate and permissive constraints
> will allow for better code generation.
> 
> The FEAT_LSE basic atomic instructions come in two forms:
> 
> 	LD{op}{order}{size} <Rs>, <Rt>, [<Rn>]
> 	ST{op}{order}{size} <Rs>, [<Rn>]
> 
> The ST* forms are aliases of the LD* forms where:
> 
> 	ST{op}{order}{size} <Rs>, [<Rn>]
> Is:
> 	LD{op}{order}{size} <Rs>, XZR, [<Rn>]
> 
> For either form, both <Rs> and <Rn> are read but not written back to,
> and <Rt> is written with the original value of the memory location.
> Where (<Rt> == <Rs>) or (<Rt> == <Rn>), <Rt> is written *after* the
> other register value(s) are consumed. There are no UNPREDICTABLE or
> CONSTRAINED UNPREDICTABLE behaviours when any pair of <Rs>, <Rt>, or
> <Rn> are the same register.
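
To make the write-back ordering described above concrete, here is a
small C model of LDADD and its STADD alias. It is only a sketch: regs[],
read_reg(), write_reg(), model_ldadd() and model_stadd() are names made
up for illustration, the model is not atomic, and it does not handle
<Rn> == 31 (which would name SP rather than XZR on real hardware).

```c
#include <assert.h>
#include <stdint.h>

#define XZR 31	/* as <Rs>/<Rt>: reads as zero, writes are discarded */

/* Hypothetical register file for the model; regs[31] is never used. */
static uint64_t regs[32];

static uint64_t read_reg(unsigned int r)
{
	return r == XZR ? 0 : regs[r];
}

static void write_reg(unsigned int r, uint64_t val)
{
	if (r != XZR)
		regs[r] = val;
}

/* Model of: LDADD <Rs>, <Rt>, [<Rn>] (64-bit, no ordering, not atomic) */
static void model_ldadd(unsigned int rs, unsigned int rt, unsigned int rn)
{
	uint64_t addend, old, *addr;

	assert(rn != XZR);	/* SP as a base register is not modelled */

	/* Both inputs are consumed before anything is written back. */
	addend = read_reg(rs);
	addr = (uint64_t *)(uintptr_t)read_reg(rn);

	old = *addr;
	*addr = old + addend;

	/* <Rt> is written last, so it may alias <Rs> or <Rn>. */
	write_reg(rt, old);
}

/* Model of: STADD <Rs>, [<Rn>], i.e. LDADD <Rs>, XZR, [<Rn>] */
static void model_stadd(unsigned int rs, unsigned int rn)
{
	model_ldadd(rs, XZR, rn);
}
```

With regs[0] = 2 and regs[1] holding the address of a word containing
40, model_ldadd(0, 0, 1) leaves 42 in memory and 40 in regs[0], which is
the <Rt> == <Rs> case.
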
> 
> Our current inline assembly always uses <Rs> == <Rt>, treating this
> register as both an input and an output (using a '+r' constraint). This
> forces the compiler to do some unnecessary register shuffling and/or
> redundant value generation.
> 
> For example, the compiler cannot reuse the <Rs> value, and currently GCC
> 11.1.0 will compile:
> 
> 	__lse_atomic_add(1, a);
> 	__lse_atomic_add(1, b);
> 	__lse_atomic_add(1, c);
> 
> As:
> 
> 	mov     w3, #0x1
> 	mov     w4, w3
> 	stadd   w4, [x0]
> 	mov     w0, w3
> 	stadd   w0, [x1]
> 	stadd   w3, [x2]
> 
> We can improve this with more accurate constraints, separating <Rs> and
> <Rt>, where <Rs> is an input-only register ('r'), and <Rt> is an
> output-only value ('=r'). As <Rt> is written back after <Rs> is
> consumed, it does not need to be earlyclobber ('=&r'), leaving the
> compiler free to use the same register for both <Rs> and <Rt> where this
> is desirable.
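
For reference, the constraint shape described above looks roughly like
the following. This is a standalone sketch, not the code from the patch:
sketch_atomic_add() and sketch_atomic_fetch_add() are hypothetical
names, plain int stands in for atomic_t, and the asm path is only built
when the compiler is already targeting LSE (__ARM_FEATURE_ATOMICS),
with a relaxed __atomic builtin as the fallback everywhere else.

```c
#include <assert.h>

/* Non-value-returning op: <Rs> is input-only, memory is read-write. */
static inline void sketch_atomic_add(int i, int *counter)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_ATOMICS)
	__asm__ volatile(
	"	stadd	%w[i], %[v]\n"
	: [v] "+Q" (*counter)
	: [i] "r" (i));
#else
	/* Fallback so the sketch builds without LSE. */
	__atomic_fetch_add(counter, i, __ATOMIC_RELAXED);
#endif
}

/*
 * FETCH op: <Rt> is a separate output-only operand. It is plain "=r"
 * rather than "=&r", so the compiler may still pick the same register
 * for [i] and [old].
 */
static inline int sketch_atomic_fetch_add(int i, int *counter)
{
	int old;

#if defined(__aarch64__) && defined(__ARM_FEATURE_ATOMICS)
	__asm__ volatile(
	"	ldadd	%w[i], %w[old], %[v]\n"
	: [v] "+Q" (*counter),
	  [old] "=r" (old)
	: [i] "r" (i));
#else
	old = __atomic_fetch_add(counter, i, __ATOMIC_RELAXED);
#endif
	return old;
}
```

Because [i] is never marked as written, the compiler is free to keep a
constant such as 1 in a single register across several back-to-back
calls.
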
> 
> At the same time, the redundant 'r' constraint for `v` is removed, as
> the `+Q` constraint is sufficient.
> 
> With this change, the above example becomes:
> 
> 	mov     w3, #0x1
> 	stadd   w3, [x0]
> 	stadd   w3, [x1]
> 	stadd   w3, [x2]
> 
> I've made this change for the non-value-returning and FETCH ops. The
> RETURN ops have a multi-instruction sequence for which we cannot use the
> same constraints, and a subsequent patch will rewrite the RETURN ops in
> terms of the FETCH ops, relying on the compiler's ability to reuse
> the <Rs> value.
> 
> This is intended as an optimization.
> There should be no functional change as a result of this patch.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/atomic_lse.h | 30 +++++++++++++++++------------
>  1 file changed, 18 insertions(+), 12 deletions(-)

Makes sense to me. I'm not sure _why_ the current constraints are so weird;
maybe a hangover from when we patched them inline? Anywho:

Acked-by: Will Deacon <will@kernel.org>

Will