From: Mark Rutland <mark.rutland@arm.com>
To: linux-arm-kernel@lists.infradead.org
Cc: ardb@kernel.org, bertrand.marquis@arm.com,
	boris.ostrovsky@oracle.com, broonie@kernel.org,
	catalin.marinas@arm.com, daniel.lezcano@linaro.org,
	james.morse@arm.com, jgross@suse.com, mark.rutland@arm.com,
	maz@kernel.org, oliver.upton@linux.dev, pcc@google.com,
	sstabellini@kernel.org, suzuki.poulose@arm.com,
	tglx@linutronix.de, vladimir.murzin@arm.com, will@kernel.org
Subject: [PATCH 00/37] arm64: Remove cpus_have_const_cap()
Date: Tue, 19 Sep 2023 10:28:13 +0100
Message-ID: <20230919092850.1940729-1-mark.rutland@arm.com>

For historical reasons, cpus_have_const_cap() does more than its name
implies, and its current behaviour is more harmful than helpful. This
series removes cpus_have_const_cap(), removing some redundant code and
making the kernel more robust.

Currently, cpus_have_const_cap() is implemented as:

| static __always_inline bool cpus_have_const_cap(int num)
| {
|         if (is_hyp_code())
|                 return cpus_have_final_cap(num);
|         else if (system_capabilities_finalized())
|                 return __cpus_have_const_cap(num);
|         else
|                 return cpus_have_cap(num);
| }

For hyp code this is safe and practically ideal. We finalize system
cpucaps and patch the relevant alternatives before KVM is initialized,
and so the alternative branch generated by cpus_have_final_cap() is
guaranteed to observe the finalized value of the cpucap.
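
To make the dispatch above concrete, here is a minimal userspace model of
the two non-hyp paths. It is only a sketch, not kernel code: every
model_* name and MODEL_CAP_EXAMPLE is made up for illustration, and the
patched alternative branch is modelled as a plain variable rather than a
patched instruction.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cpucap index, for illustration only. */
enum { MODEL_CAP_EXAMPLE = 0, MODEL_NCAPS = 1 };

/* Stand-in for the cpucap bitmap which cpus_have_cap() tests. */
static bool model_detected[MODEL_NCAPS];
/* Stand-in for the state of the patched alternative branches. */
static bool model_patched[MODEL_NCAPS];
/* Stand-in for system_capabilities_finalized(). */
static bool model_finalized;

/* Models the non-hyp paths of cpus_have_const_cap(). */
static bool model_cpus_have_const_cap(int num)
{
	if (model_finalized)
		return model_patched[num];	/* alternative branch */
	return model_detected[num];		/* bitmap test */
}

/* Models detection of a cpucap on the boot CPU. */
static void model_detect(int num)
{
	model_detected[num] = true;
}

/* Models patching alternatives and finalizing system cpucaps. */
static void model_finalize(void)
{
	for (int i = 0; i < MODEL_NCAPS; i++)
		model_patched[i] = model_detected[i];
	model_finalized = true;
}

/*
 * Records what a non-hyp caller observes before detection, in the window
 * between detection and patching, and after finalization.
 */
static void model_observe(bool seen[3])
{
	seen[0] = model_cpus_have_const_cap(MODEL_CAP_EXAMPLE);
	model_detect(MODEL_CAP_EXAMPLE);
	seen[1] = model_cpus_have_const_cap(MODEL_CAP_EXAMPLE);
	model_finalize();
	seen[2] = model_cpus_have_const_cap(MODEL_CAP_EXAMPLE);
}
```

In this model the result flips from false to true at detection time, i.e.
before finalization, which is the window the rest of this cover letter is
concerned with.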

For non-hyp code this is potentially unsafe and sub-optimal:

1) System cpucaps are detected on the boot CPU while secondary CPUs are
   executing code. This leads to potential races around cpucaps being
   detected, where the cpucaps can change at arbitrary points in time,
   potentially in the middle of sequences which depend on them not
   changing, e.g.

   CPU 0			CPU 1

                                // doesn't save PMR
                                flags = local_daif_save();
   // detects PSEUDO-NMI
                                // attempts to restore PMR
                                local_daif_restore(flags);

   This can potentially lead to erratic behaviour, and for stateful
   sequences it would be better to use alternatives such that the entire
   sequence is patched atomically.

2) For several cpucaps we perform some enablement/initialization work
   between detecting the cpucap and patching alternatives. For some
   features (e.g. SVE and SME) we need to record some additional
   properties (e.g. vector lengths) before patching alternatives.

   If patched alternative sequences consume any of the recorded
   properties, it's possible that these race with the
   enablement/initialization and consume stale values, which could
   potentially result in erratic behaviour. It would be better to use
   alternatives such that the enablement/initialization is guaranteed to
   happen before any such usage.

3) Most code doesn't run between cpucaps being detected and their
   alternatives being patched, and will have redundant code generated,
   with an alternative branch for system_capabilities_finalized(), and a
   bitmap test for cpus_have_cap(). This bloats the kernel and wastes
   I-cache resources, and the resulting branching structure pessimizes
   compiler output.

   This is especially noticeable in parts of the kernel which need to
   test a number of cpucaps in quick succession, such as exception
   handlers in entry-common.c and state save/restore in fpsimd.c. Using
   alternative branches directly can dramatically improve the code
   generated for such paths (e.g. making the entry code several KB
   smaller in some configurations).
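
The interleaving shown in 1) can be replayed deterministically in
userspace. The following is a single-threaded sketch under simplifying
assumptions: the model_* names and struct model_flags are hypothetical,
the PMR value is arbitrary, and the real local_daif_save() and
local_daif_restore() encode their state differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the PSEUDO-NMI cpucap as seen by a bitmap test. */
static bool model_has_pseudo_nmi;
/* Stand-in for the CPU's PMR; 0xe0 is an arbitrary example value. */
static unsigned long model_pmr = 0xe0;

/* Hypothetical saved state: PMR is only saved when the cap is set. */
struct model_flags {
	bool pmr_valid;
	unsigned long pmr;
};

/* Models a save which tests the cpucap each time it runs. */
static struct model_flags model_daif_save(void)
{
	struct model_flags flags = { .pmr_valid = false, .pmr = 0 };

	if (model_has_pseudo_nmi) {
		flags.pmr_valid = true;
		flags.pmr = model_pmr;
	}
	return flags;
}

/* Models a restore which tests the cpucap independently of the save. */
static void model_daif_restore(struct model_flags flags)
{
	if (model_has_pseudo_nmi)
		model_pmr = flags.pmr;	/* may never have been saved */
}

/*
 * Replays the interleaving from 1): CPU 1 saves, CPU 0 detects the
 * cpucap, CPU 1 restores. Returns true if PMR was clobbered with a value
 * that was never saved.
 */
static bool model_replay_race(void)
{
	unsigned long before = model_pmr;
	struct model_flags flags;

	flags = model_daif_save();	/* CPU 1: doesn't save PMR */
	model_has_pseudo_nmi = true;	/* CPU 0: detects PSEUDO-NMI */
	model_daif_restore(flags);	/* CPU 1: attempts to restore PMR */

	return model_pmr != before && !flags.pmr_valid;
}
```

If both halves of the sequence were controlled by a single patched
alternative, the save and the restore could not disagree about the cpucap
in this way.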

This series attempts to address the above issues by removing
cpus_have_const_cap() and migrating code over to alternative branches
wherever possible:

* Patches 1 and 2 address a couple of bugs I spotted where cpucaps
  are consumed prior to being initialized.

* Patches 3 to 5 rework some low-level cpucap helpers and add new
  helpers which are used later in the series.

* Patches 6 to 8 rework some feature enablement code so that this can
  work in the window between cpucap detection and alternative patching
  without the need to use cpus_have_const_cap().

* Patch 9 moves KVM entirely over to cpus_have_final_cap().

* Patches 10 to 12 clean up the ARM64_HAS_NO_FPSIMD cpucap, inverting
  this and making it behave the same way as all other system cpucaps.

* Patches 13 to 36 migrate code away from cpus_have_const_cap().

* Patch 37 removes the now-unused cpus_have_const_cap().
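
As a rough illustration of the migration pattern, the userspace sketch
below separates a caller which can run in the window between detection
and patching from one which only runs after finalization. All model_*
names and MODEL_CAP_FOO are hypothetical, and returning -1 for a
too-early final check is a choice made for this sketch so that misuse is
observable without stopping the program.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cpucap index, for illustration only. */
enum { MODEL_CAP_FOO = 0, MODEL_NCAPS = 1 };

static bool model_detected[MODEL_NCAPS];	/* bitmap stand-in */
static bool model_patched[MODEL_NCAPS];		/* alternative stand-in */
static bool model_finalized;

/* Stand-in for an explicit bitmap test, usable before finalization. */
static bool model_cpus_have_cap(int num)
{
	return model_detected[num];
}

/*
 * Stand-in for a check which is only meaningful after finalization.
 * Returns -1 when called too early, otherwise 0 or 1.
 */
static int model_cpus_have_final_cap(int num)
{
	if (!model_finalized)
		return -1;
	return model_patched[num] ? 1 : 0;
}

/* Enablement code which runs in the window keeps an explicit bitmap test. */
static bool model_enable_foo_early(void)
{
	return model_cpus_have_cap(MODEL_CAP_FOO);
}

/* Code which only runs after finalization needs just the single check. */
static int model_system_supports_foo(void)
{
	return model_cpus_have_final_cap(MODEL_CAP_FOO);
}

/* Models patching alternatives and finalizing system cpucaps. */
static void model_finalize(void)
{
	for (int i = 0; i < MODEL_NCAPS; i++)
		model_patched[i] = model_detected[i];
	model_finalized = true;
}
```

Making each call site state which kind of check it relies on is what
makes a too-early caller visible, rather than having it silently fall
back to a bitmap test.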

The series is based on v6.6-rc2.

Mark.

Mark Rutland (37):
  clocksource/drivers/arm_arch_timer: Initialize evtstrm after
    finalizing cpucaps
  arm64/arm: xen: enlighten: Fix KPTI checks
  arm64: Factor out cpucap definitions
  arm64: Add cpucap_is_possible()
  arm64: Add cpus_have_final_boot_cap()
  arm64: Rework setup_cpu_features()
  arm64: Fixup user features at boot time
  arm64: Split kpti_install_ng_mappings()
  arm64: kvm: Use cpus_have_final_cap() explicitly
  arm64: Explicitly save/restore CPACR when probing SVE and SME
  arm64: Rename SVE/SME cpu_enable functions
  arm64: Use a positive cpucap for FP/SIMD
  arm64: Avoid cpus_have_const_cap() for
    ARM64_HAS_{ADDRESS,GENERIC}_AUTH
  arm64: Avoid cpus_have_const_cap() for ARM64_HAS_ARMv8_4_TTL
  arm64: Avoid cpus_have_const_cap() for ARM64_HAS_BTI
  arm64: Avoid cpus_have_const_cap() for ARM64_HAS_CACHE_DIC
  arm64: Avoid cpus_have_const_cap() for ARM64_HAS_CNP
  arm64: Avoid cpus_have_const_cap() for ARM64_HAS_DIT
  arm64: Avoid cpus_have_const_cap() for ARM64_HAS_GIC_PRIO_MASKING
  arm64: Avoid cpus_have_const_cap() for ARM64_HAS_PAN
  arm64: Avoid cpus_have_const_cap() for ARM64_HAS_EPAN
  arm64: Avoid cpus_have_const_cap() for ARM64_HAS_RNG
  arm64: Avoid cpus_have_const_cap() for ARM64_HAS_WFXT
  arm64: Avoid cpus_have_const_cap() for ARM64_HAS_TLB_RANGE
  arm64: Avoid cpus_have_const_cap() for ARM64_MTE
  arm64: Avoid cpus_have_const_cap() for ARM64_SSBS
  arm64: Avoid cpus_have_const_cap() for ARM64_SPECTRE_V2
  arm64: Avoid cpus_have_const_cap() for ARM64_{SVE,SME,SME2,FA64}
  arm64: Avoid cpus_have_const_cap() for ARM64_UNMAP_KERNEL_AT_EL0
  arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_843419
  arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_1542419
  arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_1742098
  arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_2645198
  arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_CAVIUM_23154
  arm64: Avoid cpus_have_const_cap() for
    ARM64_WORKAROUND_NVIDIA_CARMEL_CNP
  arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_REPEAT_TLBI
  arm64: Remove cpus_have_const_cap()

 arch/arm/xen/enlighten.c                    |  25 +--
 arch/arm64/include/asm/alternative-macros.h |   8 +-
 arch/arm64/include/asm/arch_gicv3.h         |   8 +
 arch/arm64/include/asm/archrandom.h         |   2 +-
 arch/arm64/include/asm/cacheflush.h         |   2 +-
 arch/arm64/include/asm/cpucaps.h            |  67 ++++++++
 arch/arm64/include/asm/cpufeature.h         |  96 +++++------
 arch/arm64/include/asm/fpsimd.h             |  35 +++-
 arch/arm64/include/asm/irqflags.h           |  20 +--
 arch/arm64/include/asm/kvm_emulate.h        |   4 +-
 arch/arm64/include/asm/kvm_host.h           |   2 +-
 arch/arm64/include/asm/kvm_mmu.h            |   2 +-
 arch/arm64/include/asm/mmu.h                |   2 +-
 arch/arm64/include/asm/mmu_context.h        |  28 ++--
 arch/arm64/include/asm/module.h             |   3 +-
 arch/arm64/include/asm/pgtable-prot.h       |   6 +-
 arch/arm64/include/asm/spectre.h            |   2 +-
 arch/arm64/include/asm/tlbflush.h           |   7 +-
 arch/arm64/include/asm/vectors.h            |   2 +-
 arch/arm64/kernel/cpu_errata.c              |  17 --
 arch/arm64/kernel/cpufeature.c              | 167 ++++++++++++--------
 arch/arm64/kernel/efi.c                     |   3 +-
 arch/arm64/kernel/fpsimd.c                  |  81 ++++++----
 arch/arm64/kernel/module-plts.c             |   7 +-
 arch/arm64/kernel/process.c                 |   2 +-
 arch/arm64/kernel/proton-pack.c             |   2 +-
 arch/arm64/kernel/smp.c                     |   3 +-
 arch/arm64/kernel/suspend.c                 |  13 +-
 arch/arm64/kernel/sys_compat.c              |   2 +-
 arch/arm64/kernel/traps.c                   |   2 +-
 arch/arm64/kernel/vdso.c                    |   2 +-
 arch/arm64/kvm/arm.c                        |  10 +-
 arch/arm64/kvm/guest.c                      |   4 +-
 arch/arm64/kvm/hyp/pgtable.c                |   4 +-
 arch/arm64/kvm/mmu.c                        |   2 +-
 arch/arm64/kvm/sys_regs.c                   |   2 +-
 arch/arm64/kvm/vgic/vgic-v3.c               |   2 +-
 arch/arm64/lib/delay.c                      |   2 +-
 arch/arm64/mm/fault.c                       |   2 +-
 arch/arm64/mm/hugetlbpage.c                 |   3 +-
 arch/arm64/mm/mmap.c                        |   2 +-
 arch/arm64/mm/mmu.c                         |   3 +-
 arch/arm64/mm/proc.S                        |   3 +-
 arch/arm64/tools/Makefile                   |   4 +-
 arch/arm64/tools/cpucaps                    |   2 +-
 arch/arm64/tools/gen-cpucaps.awk            |   6 +-
 drivers/clocksource/arm_arch_timer.c        |  31 +++-
 drivers/irqchip/irq-gic-v3.c                |  11 --
 include/linux/cpuhotplug.h                  |   2 +
 49 files changed, 429 insertions(+), 288 deletions(-)
 create mode 100644 arch/arm64/include/asm/cpucaps.h

-- 
2.30.2

