From: will.deacon@arm.com (Will Deacon)
To: linux-arm-kernel@lists.infradead.org
Subject: [RFC PATCH] arm64: lse: provide additional GPR to 'fetch' LL/SC fallback variants
Date: Tue, 7 Aug 2018 17:56:34 +0100
Message-ID: <20180807165634.GA21809@arm.com>
In-Reply-To: <20180804095553.16358-1-ard.biesheuvel@linaro.org>

Hi Ard,

On Sat, Aug 04, 2018 at 11:55:53AM +0200, Ard Biesheuvel wrote:
> When support for ARMv8.1 LSE atomics is compiled in, the original
> LL/SC implementations are demoted to fallbacks that are invoked
> via function calls on systems that do not implement the new instructions.
> 
> Due to the fact that these function calls may occur from modules that
> are located further than 128 MB away from their targets in the core
> kernel, such calls may be indirected via PLT entries, which are permitted
> to clobber registers x16 and x17. Since we must assume that those
> registers do not retain their value across a function call to such an
> LL/SC fallback, and given that those function calls are hidden from the
> compiler entirely, we must assume that calling any of the LSE atomics
> routines clobbers x16 and x17 (and x30, for that matter).
> 
> Fortunately, there is an upside: having two scratch registers available
> permits the compiler to emit many of the LL/SC fallbacks without having
> to preserve/restore registers on the stack, which would penalise the
> users of the LL/SC fallbacks even more, given that they are already
> putting up with the function call overhead.
> 
> However, the 'fetch' variants need an additional scratch register in
> order to execute without the need to preserve registers on the stack.
> 
> So let's give those routines an additional scratch register 'x15' when
> emitted as an LL/SC fallback, and ensure that the register is marked as
> clobbered at the associated LSE call sites (but not anywhere else).

Hmm, doesn't this mean that we'll needlessly spill/reload in the case that
we have LSE atomics in the CPU? I'd rather keep the LSE code as fast as
possible if ARM64_LSE_ATOMICS=y, and allow people to disable the config
option if they want to get the best performance for the LL/SC variants.

Will
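
For readers following the thread, here is a minimal sketch of the mechanism
under discussion. It is not the code from the patch. The names
SKETCH_LL_SC_CLOBBERS, SKETCH_LL_SC_FETCH_CLOBBERS, sketch_fetch_add and
sketch_demo are hypothetical, and the LL/SC loop is inlined here rather than
reached through a bl to an out-of-line fallback, so x30 is listed only to
mirror what a real call site would have to declare. The point is the clobber
list: every register named there is one the compiler cannot keep live across
the asm statement, whether or not the CPU ends up executing an LSE
instruction at that site. On non-arm64 builds the sketch falls back to a
compiler builtin so that it still compiles and runs.

```c
#include <assert.h>

/* Registers a hidden call into an out-of-line LL/SC fallback may trash:
 * x16/x17 (usable by PLT veneers) and x30 (link register written by bl). */
#define SKETCH_LL_SC_CLOBBERS		"x16", "x17", "x30"

/* Hypothetical name: the same list plus x15, for 'fetch' variants only. */
#define SKETCH_LL_SC_FETCH_CLOBBERS	SKETCH_LL_SC_CLOBBERS, "x15"

/* Atomically add i to *v and return the value *v held before the add. */
static inline int sketch_fetch_add(int i, int *v)
{
#if defined(__aarch64__)
	int old;

	__asm__ __volatile__(
	"1:	ldxr	w15, [%[ptr]]\n"	/* w15 = old value (extra scratch) */
	"	add	w16, w15, %w[inc]\n"	/* w16 = new value */
	"	stxr	w17, w16, [%[ptr]]\n"	/* w17 = 0 if the store succeeded */
	"	cbnz	w17, 1b\n"		/* retry if exclusivity was lost */
	"	mov	%w[old], w15\n"
	: [old] "=&r" (old), "+Q" (*v)
	: [ptr] "r" (v), [inc] "r" (i)
	: SKETCH_LL_SC_FETCH_CLOBBERS, "memory");

	return old;
#else
	/* Portable stand-in so the sketch builds on other architectures. */
	return __atomic_fetch_add(v, i, __ATOMIC_RELAXED);
#endif
}

/* Tiny usage example: returns the final counter value. */
int sketch_demo(void)
{
	int counter = 5;
	int old = sketch_fetch_add(3, &counter);

	assert(old == 5);
	return counter;
}
```

With x15 in the list, the loop needs no stack spill of its own, but any
caller value that the compiler would otherwise have kept in x15 must be moved
or spilled around the statement. That cost at the call site is what the reply
above is concerned about for systems that do implement LSE.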

