* Feature request for enabling SCTLR_ELx.nAA
@ 2023-02-22 23:17 Richard Henderson
2023-02-23 12:15 ` Catalin Marinas
0 siblings, 1 reply; 4+ messages in thread
From: Richard Henderson @ 2023-02-22 23:17 UTC
To: linux-arm-kernel; +Cc: Alex Bennée
Hi guys,
It would be helpful to have a prctl for enabling nAA. Since we already have
task->thread.sctlr_user, it would seem that this would not require any additional overhead
during __switch_to().
My use case is the QEMU JIT, and being able to make use of LDAR/STLR instead of explicit
DMB in some cases. At the moment, I can only make this replacement when the address is
provably aligned, which is tricky to do with the time budget of a JIT, so the replacement
rarely triggers. This ought to make a difference when emulating strongly ordered guests
like x86.
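The trade-off described above can be sketched as a small selection function. This is only an illustration, not QEMU code: the names `load_strategy`, `pick_load_strategy`, and the `naa_enabled` flag are hypothetical, and the sketch ignores the 16-byte window that LSE2 allows, treating "provably aligned" as natural alignment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical strategies a JIT backend could choose for an ordered guest load. */
enum load_strategy {
    LOAD_LDAR,           /* single load-acquire instruction */
    LOAD_PLAIN_PLUS_DMB  /* plain load plus an explicit DMB barrier */
};

/*
 * align_known_log2: log2 of the alignment the JIT managed to prove (0 = none).
 * size_log2:        log2 of the access size in bytes (0..3).
 * naa_enabled:      assumption: true if unaligned LDAR/STLR would not fault.
 */
static enum load_strategy pick_load_strategy(unsigned align_known_log2,
                                             unsigned size_log2,
                                             bool naa_enabled)
{
    assert(size_log2 <= 3);

    /* With nAA set, alignment no longer needs to be proven at all. */
    if (naa_enabled || align_known_log2 >= size_log2)
        return LOAD_LDAR;

    /* Otherwise fall back to the always-safe barrier sequence. */
    return LOAD_PLAIN_PLUS_DMB;
}
```

With `naa_enabled` false, an 8-byte load of unknown alignment falls back to the barrier sequence; with it true, the same load can use LDAR.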
Thanks,
r~
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: Feature request for enabling SCTLR_ELx.nAA
2023-02-22 23:17 Feature request for enabling SCTLR_ELx.nAA Richard Henderson
@ 2023-02-23 12:15 ` Catalin Marinas
2023-02-23 15:36 ` Richard Henderson
2023-02-24 17:09 ` Will Deacon
0 siblings, 2 replies; 4+ messages in thread
From: Catalin Marinas @ 2023-02-23 12:15 UTC
To: Richard Henderson
Cc: linux-arm-kernel, Alex Bennée, Will Deacon, Mark Rutland
On Wed, Feb 22, 2023 at 01:17:55PM -1000, Richard Henderson wrote:
> It would be helpful to have a prctl for enabling nAA. Since we already have
> task->thread.sctlr_user, it would seem that this would not require any
> additional overhead during __switch_to().
This shouldn't be difficult to add.
> My use case is the QEMU JIT, and being able to make use of LDAR/STLR instead
> of explicit DMB in some cases. At the moment, I can only make this
> replacement when the address is provably aligned, which is tricky to do with
> the time budget of a JIT, so the replacement rarely triggers. This ought to
> make a difference when emulating strongly ordered guests like x86.
It looks like in 4.17 (commit 7206dc93a58f, "arm64: Expose Arm v8.4
features") we exposed the LSE2 features as HWCAP_USCAT (unaligned
single-copy atomicity) but that still restricts LDAR/STLR to a 16-byte
boundary as there is no control for SCTLR_EL1.nAA.
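As a rough illustration of the 16-byte restriction mentioned above, the check below tests whether an access spans two 16-byte quantities. The helper name `crosses_16_byte_boundary` is made up for this sketch; it models only the address arithmetic, not the fault behaviour of any particular CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true if the access [addr, addr + size) spans two different
 * 16-byte aligned quantities. Hypothetical helper for illustration only.
 */
static bool crosses_16_byte_boundary(uintptr_t addr, size_t size)
{
    assert(size >= 1 && size <= 16);

    /* Offset within the 16-byte quantity plus the size must fit in 16. */
    return (addr & 15u) + size > 16u;
}
```

For example, an 8-byte access at offset 8 within a 16-byte quantity stays inside it, while the same access at offset 9 does not.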
Given that allowing unaligned accesses could break atomicity, I wouldn't
set this bit to 1 permanently, since leaving it clear helps catch tricky
software bugs.
So a prctl() makes more sense. If your intended use is just preserving
the acquire/release semantics, I don't think these are affected by the
atomicity rules even if they go across a 16-byte boundary.
Adding Will and Mark for their view on this.
--
Catalin
* Re: Feature request for enabling SCTLR_ELx.nAA
2023-02-23 12:15 ` Catalin Marinas
@ 2023-02-23 15:36 ` Richard Henderson
2023-02-24 17:09 ` Will Deacon
1 sibling, 0 replies; 4+ messages in thread
From: Richard Henderson @ 2023-02-23 15:36 UTC
To: Catalin Marinas
Cc: linux-arm-kernel, Alex Bennée, Will Deacon, Mark Rutland
On 2/23/23 02:15, Catalin Marinas wrote:
> Given that allowing unaligned accesses could break atomicity, I wouldn't
> set this bit to 1 permanently, since leaving it clear helps catch tricky
> software bugs.
> So a prctl() makes more sense. If your intended use is just preserving
> the acquire/release semantics, I don't think these are affected by the
> atomicity rules even if they go across a 16-byte boundary.
Yes, my intended use is just the acquire/release.
As I read the Arm ARM pseudo-code for aarch64/functions/memory/Mem, the !aligned case
devolves to a series of bytes, but with the same acctype, so each byte is AccType_ORDERED.
Which is just fine for my use case.
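The reading above can be modelled with a toy function. This is not the Arm ARM pseudo-code itself, only a sketch of the behaviour as described: an aligned access stays whole, an unaligned one becomes a series of byte accesses that all keep the original access type. The types `acc_type` and `sub_access` and the function `split_access` are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the access type carried by each memory access. */
enum acc_type { ACC_NORMAL, ACC_ORDERED };

struct sub_access {
    uint64_t addr;
    size_t size;
    enum acc_type type;
};

/*
 * Splits one access into the pieces this toy model would perform.
 * Returns the number of entries written to out (at most 8).
 */
static size_t split_access(uint64_t addr, size_t size, enum acc_type type,
                           struct sub_access out[8])
{
    assert(size == 1 || size == 2 || size == 4 || size == 8);

    if ((addr & (size - 1)) == 0) {
        /* Naturally aligned: performed as a single access. */
        out[0] = (struct sub_access){ addr, size, type };
        return 1;
    }

    /* Unaligned: one byte at a time, each keeping the same access type. */
    for (size_t i = 0; i < size; i++)
        out[i] = (struct sub_access){ addr + i, 1, type };
    return size;
}
```

In this model an unaligned 4-byte ordered access yields four 1-byte accesses, each still tagged `ACC_ORDERED`, which is the property the acquire/release use case relies on.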
r~
* Re: Feature request for enabling SCTLR_ELx.nAA
2023-02-23 12:15 ` Catalin Marinas
2023-02-23 15:36 ` Richard Henderson
@ 2023-02-24 17:09 ` Will Deacon
1 sibling, 0 replies; 4+ messages in thread
From: Will Deacon @ 2023-02-24 17:09 UTC
To: Catalin Marinas
Cc: Richard Henderson, linux-arm-kernel, Alex Bennée,
Mark Rutland
On Thu, Feb 23, 2023 at 12:15:43PM +0000, Catalin Marinas wrote:
> On Wed, Feb 22, 2023 at 01:17:55PM -1000, Richard Henderson wrote:
> > It would be helpful to have a prctl for enabling nAA. Since we already have
> > task->thread.sctlr_user, it would seem that this would not require any
> > additional overhead during __switch_to().
>
> This shouldn't be difficult to add.
>
> > My use case is the QEMU JIT, and being able to make use of LDAR/STLR instead
> > of explicit DMB in some cases. At the moment, I can only make this
> > replacement when the address is provably aligned, which is tricky to do with
> > the time budget of a JIT, so the replacement rarely triggers. This ought to
> > make a difference when emulating strongly ordered guests like x86.
>
> It looks like in 4.17 (commit 7206dc93a58f, "arm64: Expose Arm v8.4
> features") we exposed the LSE2 features as HWCAP_USCAT (unaligned
> single-copy atomicity) but that still restricts LDAR/STLR to a 16-byte
> boundary as there is no control for SCTLR_EL1.nAA.
>
> Given that allowing unaligned accesses could break atomicity, I wouldn't
> set this bit to 1 permanently, since leaving it clear helps catch tricky
> software bugs.
> So a prctl() makes more sense. If your intended use is just preserving
> the acquire/release semantics, I don't think these are affected by the
> atomicity rules even if they go across a 16-byte boundary.
>
> Adding Will and Mark for their view on this.
I'd definitely want to see some numbers to justify the complexity of a new
prctl(), but otherwise it sounds fine as long as it's opt-in and cleared on
exec().
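The "opt-in and cleared on exec()" requirement can be sketched with a userspace toy model. None of this is kernel code: `sketch_thread`, `sketch_set_naa`, and `sketch_on_exec` are hypothetical names, and the per-thread `sctlr_user` field is only mimicked here. The bit position (bit 6 of SCTLR_EL1 for nAA) is taken from the architecture and should be checked against the Arm ARM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* nAA bit position in SCTLR_EL1; the macro name is local to this sketch. */
#define SKETCH_SCTLR_NAA (UINT64_C(1) << 6)

/* Toy stand-in for the per-thread user-controlled SCTLR bits. */
struct sketch_thread {
    uint64_t sctlr_user;
};

/* Opt-in: the bit is only ever set on explicit request by the task. */
static void sketch_set_naa(struct sketch_thread *t, bool enable)
{
    if (enable)
        t->sctlr_user |= SKETCH_SCTLR_NAA;
    else
        t->sctlr_user &= ~SKETCH_SCTLR_NAA;
}

/* On exec the new program image starts with the default (clear). */
static void sketch_on_exec(struct sketch_thread *t)
{
    t->sctlr_user &= ~SKETCH_SCTLR_NAA;
}

static bool sketch_naa_enabled(const struct sketch_thread *t)
{
    return (t->sctlr_user & SKETCH_SCTLR_NAA) != 0;
}
```

The model leaves other bits of `sctlr_user` untouched, so clearing nAA on exec would not disturb unrelated per-thread settings.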
Will