linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH 0/4] arm64: Support the TSO memory model
@ 2024-04-11  0:51 Hector Martin
  2024-04-11  0:51 ` [PATCH 1/4] prctl: Introduce PR_{SET,GET}_MEM_MODEL Hector Martin
                   ` (6 more replies)
  0 siblings, 7 replies; 30+ messages in thread
From: Hector Martin @ 2024-04-11  0:51 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Marc Zyngier, Mark Rutland
  Cc: Zayd Qumsieh, Justin Lu, Ryan Houdek, Mark Brown, Ard Biesheuvel,
	Mateusz Guzik, Anshuman Khandual, Oliver Upton, Miguel Luis,
	Joey Gouly, Christoph Paasch, Kees Cook, Sami Tolvanen,
	Baoquan He, Joel Granados, Dawei Li, Andrew Morton,
	Florent Revest, David Hildenbrand, Stefan Roesch, Andy Chiu,
	Josh Triplett, Oleg Nesterov, Helge Deller, Zev Weiss,
	Ondrej Mosnacek, Miguel Ojeda, linux-arm-kernel, linux-kernel,
	Asahi Linux, Hector Martin

x86 CPUs implement a stricter memory model (TSO) than ARM64. For this
reason, x86 emulation on baseline ARM64 systems requires very expensive
memory model emulation. Having hardware that supports this natively is
therefore very attractive. Such hardware, in fact, exists. This series
adds support for userspace to identify when TSO is available and
toggle it on, if supported.
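
To illustrate why the emulation is expensive, here is a minimal sketch (not
part of the series; all names are hypothetical) of the classic
message-passing pattern. Under TSO, stores stay ordered with other stores
and loads with other loads, so x86 code gets this ordering from plain
accesses. On weakly ordered ARM64 it has to be requested explicitly, and an
emulator that cannot tell which accesses matter has to do so for every load
and store.

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;       /* plain data published by the writer */
static atomic_int ready;  /* flag that publishes the payload */

static void *writer(void *arg)
{
	(void)arg;
	payload = 42;
	/*
	 * Release store: the payload write may not be reordered after the
	 * flag write. TSO hardware gives this ordering to every store;
	 * weakly ordered ARM64 needs an ordered store or a barrier here.
	 */
	atomic_store_explicit(&ready, 1, memory_order_release);
	return NULL;
}

static void *reader(void *arg)
{
	int *out = arg;

	/* Acquire load: the payload read may not be hoisted above it. */
	while (!atomic_load_explicit(&ready, memory_order_acquire))
		;
	*out = payload;
	return NULL;
}

/* Runs one writer/reader pair and returns the value the reader observed. */
static int run_message_passing(void)
{
	pthread_t w, r;
	int seen = 0;

	payload = 0;
	atomic_store(&ready, 0);
	pthread_create(&r, NULL, reader, &seen);
	pthread_create(&w, NULL, writer, NULL);
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	return seen;
}
```

With the release/acquire pair in place the reader is guaranteed to see the
payload. With hardware TSO enabled, an emulator could translate the
original x86 accesses to plain ARM64 loads and stores instead.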

Some ARM64 CPUs intrinsically implement the TSO memory model, while
others expose it as an IMPDEF control. Apple Silicon SoCs are in the
latter category. Using TSO for x86 emulation on chips that support it
has been shown to provide a massive performance boost [1].

Patch 1 introduces the PR_{SET,GET}_MEM_MODEL userspace control, which
is initially not implemented for any architectures.
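
As a rough sketch of how userspace might use this control (not taken from
the patches): the numeric values below are assumptions for illustration
and must be taken from the patched include/uapi/linux/prctl.h; the helper
names are hypothetical. It is also an assumption that PR_GET_MEM_MODEL
returns the current model as the prctl() return value.

```c
#include <errno.h>
#include <sys/prctl.h>

/*
 * ASSUMPTION: these values are placeholders for illustration only. The
 * authoritative definitions are in the series' include/uapi/linux/prctl.h.
 */
#ifndef PR_GET_MEM_MODEL
#define PR_GET_MEM_MODEL		0x6d4d444c
#endif
#ifndef PR_SET_MEM_MODEL
#define PR_SET_MEM_MODEL		0x4d4d444c
#define PR_SET_MEM_MODEL_DEFAULT	0
#define PR_SET_MEM_MODEL_TSO		1
#endif

/*
 * Hypothetical helper: asks the kernel to switch the calling thread to TSO.
 * Returns 1 on success, 0 if the request was rejected (for example EINVAL
 * on a kernel or CPU without support).
 */
static int try_enable_tso(void)
{
	if (prctl(PR_SET_MEM_MODEL, PR_SET_MEM_MODEL_TSO, 0, 0, 0) == 0)
		return 1;
	return 0;
}

/*
 * Hypothetical helper: queries the current memory model.
 * Returns -1 (with errno set) if the control is not available.
 */
static int get_mem_model(void)
{
	return prctl(PR_GET_MEM_MODEL, 0, 0, 0, 0);
}
```

An emulator would call try_enable_tso() once per thread at startup and
fall back to software memory model emulation when it returns 0.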

Patch 2 implements it for CPUs that, to the best of my knowledge, always
implement the TSO memory model. This uses the cpufeature mechanism to
enable it only if *all* cores in the system meet the requirements.

Patch 3 adds the scaffolding necessary to save/restore the ACTLR_EL1
register across context switches. This register contains IMPDEF flags
related to CPU execution, and on Apple CPUs this is where the runtime
TSO toggle bit is implemented. Other CPUs could conceivably benefit from
this scaffolding if they also use ACTLR_EL1 for things that could
ostensibly be runtime controlled and context-switched. For this to work,
ACTLR_EL1 must have a uniform layout across all cores in the system.
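
The save/restore idea can be sketched in plain userspace C as follows.
This is a simulation only, not the kernel code from the series: the
variable cpu_actlr stands in for the real register, and the struct, the
function name and the bit position are all hypothetical.

```c
#include <stdint.h>

/* ASSUMPTION: hypothetical bit position, for illustration only. */
#define ACTLR_TSO_BIT	(UINT64_C(1) << 1)

/* Per-task copy of the register, as it would live in thread state. */
struct task {
	uint64_t actlr;
};

/* Stand-in for the per-CPU ACTLR_EL1 register. */
static uint64_t cpu_actlr;

/*
 * Simulated context switch: save the outgoing task's register value and
 * install the incoming task's value, skipping the write when it would
 * not change anything.
 */
static void actlr_switch(struct task *prev, struct task *next)
{
	prev->actlr = cpu_actlr;
	if (next->actlr != cpu_actlr)
		cpu_actlr = next->actlr;
}
```

Because the whole register value is copied between tasks, its layout has
to mean the same thing on every core a task can migrate to, which is why
the uniform layout requirement matters.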

Finally, patch 4 implements PR_{SET,GET}_MEM_MODEL for Apple CPUs by
hooking it up to flip the appropriate ACTLR_EL1 bit when the Apple TSO
feature is detected (on all CPUs, which also implies the uniform
ACTLR_EL1 layout).

This series has been brewing in the downstream Asahi Linux tree [2] for a
while now, and ships to thousands of users. A subset have been using it
with FEX-Emu, which already supports this feature. This rebase on
v6.9-rc1 is only build-tested (all intermediate commits with and without
the config enabled, on ARM64) but I'll update the downstream branch soon
with this version and get it pushed out to users/testers.

The Apple support works on bare metal and *should* work exactly the same
way on macOS VMs (as alluded to by Zayd in his independent submission [3]),
though I haven't personally verified this. KVM support for this is left
for a future patchset.

(Apologies for the large Cc: list; I want to make sure nobody who got
Cced on Zayd's alternate take is left out of this one.) 

[1] https://fex-emu.com/FEX-2306/
[2] https://github.com/AsahiLinux/linux/tree/bits/220-tso
[3] https://lore.kernel.org/lkml/20240410211652.16640-1-zayd_qumsieh@apple.com/

To: Catalin Marinas <catalin.marinas@arm.com>
To: Will Deacon <will@kernel.org>
To: Marc Zyngier <maz@kernel.org>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Zayd Qumsieh <zayd_qumsieh@apple.com>
Cc: Justin Lu <ih_justin@apple.com>
Cc: Ryan Houdek <Houdek.Ryan@fex-emu.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Miguel Luis <miguel.luis@oracle.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Christoph Paasch <cpaasch@apple.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Joel Granados <j.granados@samsung.com>
Cc: Dawei Li <dawei.li@shingroup.cn>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Florent Revest <revest@chromium.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Stefan Roesch <shr@devkernel.io>
Cc: Andy Chiu <andy.chiu@sifive.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Zev Weiss <zev@bewilderbeest.net>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: Asahi Linux <asahi@lists.linux.dev>

Signed-off-by: Hector Martin <marcan@marcan.st>
---
Hector Martin (4):
      prctl: Introduce PR_{SET,GET}_MEM_MODEL
      arm64: Implement PR_{GET,SET}_MEM_MODEL for always-TSO CPUs
      arm64: Introduce scaffolding to add ACTLR_EL1 to thread state
      arm64: Implement Apple IMPDEF TSO memory model control

 arch/arm64/Kconfig                        | 14 ++++++
 arch/arm64/include/asm/apple_cpufeature.h | 15 +++++++
 arch/arm64/include/asm/cpufeature.h       | 10 +++++
 arch/arm64/include/asm/processor.h        |  3 ++
 arch/arm64/kernel/Makefile                |  3 +-
 arch/arm64/kernel/cpufeature.c            | 11 ++---
 arch/arm64/kernel/cpufeature_impdef.c     | 61 ++++++++++++++++++++++++++
 arch/arm64/kernel/process.c               | 71 +++++++++++++++++++++++++++++++
 arch/arm64/kernel/setup.c                 |  8 ++++
 arch/arm64/tools/cpucaps                  |  2 +
 include/linux/memory_ordering_model.h     | 11 +++++
 include/uapi/linux/prctl.h                |  5 +++
 kernel/sys.c                              | 21 +++++++++
 13 files changed, 229 insertions(+), 6 deletions(-)
---
base-commit: 4cece764965020c22cff7665b18a012006359095
change-id: 20240411-tso-e86fdceb94b8

Best regards,
-- 
Hector Martin <marcan@marcan.st>


Thread overview: 30+ messages
2024-04-11  0:51 [PATCH 0/4] arm64: Support the TSO memory model Hector Martin
2024-04-11  0:51 ` [PATCH 1/4] prctl: Introduce PR_{SET,GET}_MEM_MODEL Hector Martin
2024-04-11  0:51 ` [PATCH 2/4] arm64: Implement PR_{GET,SET}_MEM_MODEL for always-TSO CPUs Hector Martin
2024-04-11  0:51 ` [PATCH 3/4] arm64: Introduce scaffolding to add ACTLR_EL1 to thread state Hector Martin
2024-04-11  0:51 ` [PATCH 4/4] arm64: Implement Apple IMPDEF TSO memory model control Hector Martin
2024-04-11  1:37 ` [PATCH 0/4] arm64: Support the TSO memory model Neal Gompa
2024-04-11 13:28 ` Will Deacon
2024-04-11 14:19   ` Hector Martin
2024-04-11 18:43     ` Hector Martin
2024-04-16  2:22       ` Zayd Qumsieh
2024-04-19 16:58         ` Will Deacon
2024-04-19 18:05           ` Catalin Marinas
2024-04-19 16:58     ` Will Deacon
2024-04-20 11:37       ` Marc Zyngier
2024-05-02  0:10         ` Zayd Qumsieh
2024-05-02 13:25           ` Marc Zyngier
2024-05-06  8:20             ` Jonas Oberhauser
2024-04-20 12:13       ` Eric Curtin
2024-04-20 12:15         ` Eric Curtin
2024-05-06 11:21         ` Sergio Lopez Pascual
2024-05-06 16:12           ` Marc Zyngier
2024-05-06 16:20             ` Eric Curtin
2024-05-06 22:04             ` Sergio Lopez Pascual
2024-05-02  0:16   ` Zayd Qumsieh
2024-05-07 10:24   ` Alex Bennée
2024-05-07 14:52     ` Ard Biesheuvel
2024-05-09 11:13       ` Catalin Marinas
2024-05-09 12:31         ` Neal Gompa
2024-05-09 12:56           ` Catalin Marinas
2024-04-16  2:11 ` Zayd Qumsieh