* [PATCH v8 0/8] kvm/coresight: Support exclude guest and exclude host
@ 2024-11-27 10:01 James Clark
  2024-11-27 10:01 ` [PATCH v8 1/8] KVM: arm64: Get rid of __kvm_get_mdcr_el2() and related warts James Clark
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: James Clark @ 2024-11-27 10:01 UTC (permalink / raw)
  To: maz, kvmarm, oliver.upton, suzuki.poulose, coresight
  Cc: James Clark, Joey Gouly, Zenghui Yu, Catalin Marinas, Will Deacon,
	Mike Leach, Alexander Shishkin, Mark Brown, Anshuman Khandual,
	Rob Herring (Arm), Shiqi Liu, Fuad Tabba, James Morse,
	Raghavendra Rao Ananta, linux-arm-kernel, linux-kernel

FEAT_TRF is a Coresight feature that allows trace capture to be
completely filtered at different exception levels, unlike the existing
TRCVICTLR controls, which may still emit the target addresses of
branches even when the following trace is filtered.

Without FEAT_TRF, it was possible to start a trace session on a host and
also collect trace from the guest as TRCVICTLR was never programmed to
exclude guests (and it could still emit target addresses even if it
was).

With FEAT_TRF, the current behavior of trace in guests depends on
whether nVHE or VHE is being used. Both of the examples below are from
the host's point of view, as Coresight isn't accessible from guests.
This patchset is only relevant when FEAT_TRF exists; otherwise there
is no change.

Current behavior:

  nVHE/pKVM:

  Because the host and the guest are both using TRFCR_EL1, trace will be
  generated in guests according to the same filter rules the host is
  using. For example, if the host is tracing userspace only, then guest
  userspace trace will also be collected.

  (This is further limited by whether TRBE is used because an issue
  with TRBE means that it's completely disabled in nVHE guests, but it's
  possible to have other tracing components.)

  VHE:

  With VHE, the host filters will be in TRFCR_EL2, but the filters in
  TRFCR_EL1 will be active when the guest is running. Because we don't
  write to TRFCR_EL1, guest trace will be completely disabled.
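
To make the register interplay concrete, here is a minimal C-style
sketch of the two current cases (illustrative only, not code from this
series; the TRFCR_ELx_* and SYS_TRFCR_* names are the existing
definitions touched later in this series):

  /*
   * nVHE: host and guest share TRFCR_EL1, so guests inherit the
   * host's filters. E.g. a host tracing EL0 only:
   */
  write_sysreg_s(TRFCR_ELx_E0TRE, SYS_TRFCR_EL1);
  /* ... guest entry leaves TRFCR_EL1 alone, so guest EL0 is traced */

  /*
   * VHE: the host's filters live in TRFCR_EL2; TRFCR_EL1 (which is
   * what applies while the guest runs) is never written and stays 0,
   * so guest trace is off.
   */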

New behavior:

The guest filtering rules from the Perf session are now honored for both
nVHE and VHE modes. This is done either by writing to TRFCR_EL12 at the
start of the Perf session and doing nothing further (VHE), or by caching
the guest value and writing it at guest switch (nVHE). In pKVM, trace is
now disabled for both protected and unprotected guests.
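
A rough sketch of the new flow (kvm_etm_set_guest_trfcr() is the
interface named in the changelogs below; the nVHE storage detail is
paraphrased rather than taken verbatim from the patches):

  void kvm_etm_set_guest_trfcr(u64 guest_trfcr)
  {
  	if (has_vhe()) {
  		/* TRFCR_EL12 is the guest's TRFCR_EL1: one write, done */
  		write_sysreg_s(guest_trfcr, SYS_TRFCR_EL12);
  	} else {
  		/* nVHE: stash the value; hyp swaps it in at guest switch */
  	}
  }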

There is also an optimization where the Coresight drivers pass their
enabled state to KVM. This means that in the common case, when the
feature isn't in use, KVM doesn't have to touch any sysregs.
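
From the driver side, the intent-based interface looks roughly like
this (kvm_enable_trbe() is named in the V7 changelog below; the
surrounding function is a placeholder, not the driver's real code):

  /* in the TRBE driver's buffer-enable path (sketch) */
  static void trbe_enable_sketch(void)
  {
  	/* ... program the trace buffer registers ... */
  	kvm_enable_trbe();	/* KVM now knows TRBE is in use here */
  }

  /* the disable path does the reverse, so when no session is running
   * KVM can skip the TRBE/TRFCR sysreg accesses entirely */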

Applies to kvmarm/next (60ad25e14a) but includes two commits from Oliver
for a conflicting change to move TRBE and SPE flags to host data [7].

---

Changes since V7 [6]:
  * Drop SPE changes
  * Change the interface to be based on intent, i.e. kvm_enable_trbe()
    rather than passing the raw register value
  * Drop change to re-use vcpu_flags mechanism in favour of [7]
  * Simplify by using the same switch function both to and from the guest

Changes since V6 [5]:
  * Implement a better "do nothing" case where both the SPE and Coresight
    drivers give the enabled state to KVM, allowing some register
    reads to be dropped.
  * Move the state and feature flags out of the vCPU into the per-CPU
    host_debug_state.
  * Simplify the switch logic by adding a new flag HOST_STATE_SWAP_TRFCR
    and only storing a single TRFCR value.
  * Rename vcpu flag macros to a more generic kvm_flag...

Changes since V5 [4]:
  * Sort new sysreg entries by encoding
  * Add a comment about sorting arch/arm64/tools/sysreg
  * Warn on preemptible() before calling smp_processor_id()
  * Pickup tags
  * Change TRFCR_EL2 from SysregFields to Sysreg because it was only
    used once

Changes since V4 [3]:
  * Remove all V3 changes that made it work in pKVM and just disable
    trace there instead
  * Restore PMU host/hyp state sharing back to how it was
    (kvm_pmu_update_vcpu_events())
  * Simplify some of the duplication in the comments and function docs
  * Add a WARN_ON_ONCE() if kvm_etm_set_guest_trfcr() is called when
    the trace filtering feature doesn't exist.
  * Split sysreg change into a tools update followed by the new register
    addition

Changes since V3:
  * Create a new shared area to store the host state instead of copying
    it before each VCPU run
  * Drop commit that moved SPE and trace registers from host_debug_state
    into the kvm sysregs array because the guest values were never used
  * Document kvm_etm_set_guest_trfcr()
  * Guard kvm_etm_set_guest_trfcr() with a feature check
  * Drop Mark B and Suzuki's review tags on the sysreg patch because it
    turned out that it broke the Perf build and needed some unconventional
    changes to fix it (namely, updating the tools copy of the headers in
    the same commit as the kernel changes)

Changes since V2:

  * Add a new iflag to signify presence of FEAT_TRF and keep the
    existing TRBE iflag. This fixes the issue where TRBLIMITR_EL1 was
    being accessed even if TRBE didn't exist
  * Reword a commit message

Changes since V1:

  * Squashed all the arm64/tools/sysreg changes into the first commit
  * Add a new commit to move SPE and TRBE regs into the kvm sysreg array
  * Add a comment above the TRFCR global that it's per host CPU rather
    than vcpu

Changes since nVHE RFC [1]:

 * Re-write just in terms of the register value to be written for the
   host and the guest. This removes some logic from the hyp code and
   a value of kvm_vcpu_arch:trfcr_el1 = 0 no longer means "don't
   restore".
 * Remove all the conditional compilation and new files.
 * Change the kvm_etm_update_vcpu_events macro to a function.
 * Re-use DEBUG_STATE_SAVE_TRFCR so iflags don't need to be expanded
   anymore.
 * Expand the cover letter.

Changes since VHE v3 [2]:

 * Use the same interface as nVHE mode so TRFCR_EL12 is now written by
   kvm.

[1]: https://lore.kernel.org/kvmarm/20230804101317.460697-1-james.clark@arm.com/
[2]: https://lore.kernel.org/kvmarm/20230905102117.2011094-1-james.clark@arm.com/
[3]: https://lore.kernel.org/linux-arm-kernel/20240104162714.1062610-1-james.clark@arm.com/
[4]: https://lore.kernel.org/all/20240220100924.2761706-1-james.clark@arm.com/
[5]: https://lore.kernel.org/linux-arm-kernel/20240226113044.228403-1-james.clark@arm.com/
[6]: https://lore.kernel.org/kvmarm/20241112103717.589952-1-james.clark@linaro.org/T/#t
[7]: https://lore.kernel.org/kvmarm/20241115224924.2132364-4-oliver.upton@linux.dev/

James Clark (6):
  arm64/sysreg: Add a comment that the sysreg file should be sorted
  tools: arm64: Update sysreg.h header files
  arm64/sysreg/tools: Move TRFCR definitions to sysreg
  KVM: arm64: coresight: Give TRBE enabled state to KVM
  KVM: arm64: Support trace filtering for guests
  coresight: Pass guest TRFCR value to KVM

Oliver Upton (2):
  KVM: arm64: Get rid of __kvm_get_mdcr_el2() and related warts
  KVM: arm64: Track presence of SPE/TRBE in kvm_host_data instead of
    vCPU

 arch/arm64/include/asm/kvm_asm.h              |   5 +-
 arch/arm64/include/asm/kvm_host.h             |  39 +-
 arch/arm64/include/asm/sysreg.h               |  12 -
 arch/arm64/kvm/arm.c                          |   5 +-
 arch/arm64/kvm/debug.c                        |  92 ++--
 arch/arm64/kvm/hyp/nvhe/debug-sr.c            |  61 +--
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |   6 -
 arch/arm64/kvm/hyp/vhe/debug-sr.c             |   5 -
 arch/arm64/tools/sysreg                       |  38 ++
 .../coresight/coresight-etm4x-core.c          |  43 +-
 drivers/hwtracing/coresight/coresight-etm4x.h |   2 +-
 drivers/hwtracing/coresight/coresight-priv.h  |   3 +
 drivers/hwtracing/coresight/coresight-trbe.c  |   5 +
 tools/arch/arm64/include/asm/sysreg.h         | 410 +++++++++++++++++-
 tools/include/linux/kasan-tags.h              |  15 +
 15 files changed, 609 insertions(+), 132 deletions(-)
 create mode 100644 tools/include/linux/kasan-tags.h

-- 
2.34.1




* [PATCH v8 1/8] KVM: arm64: Get rid of __kvm_get_mdcr_el2() and related warts
  2024-11-27 10:01 [PATCH v8 0/8] kvm/coresight: Support exclude guest and exclude host James Clark
@ 2024-11-27 10:01 ` James Clark
  2024-11-27 10:01 ` [PATCH v8 2/8] KVM: arm64: Track presence of SPE/TRBE in kvm_host_data instead of vCPU James Clark
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2024-11-27 10:01 UTC (permalink / raw)
  To: maz, kvmarm, oliver.upton, suzuki.poulose, coresight
  Cc: James Clark, Joey Gouly, Zenghui Yu, Catalin Marinas, Will Deacon,
	Mike Leach, Alexander Shishkin, Mark Rutland, Anshuman Khandual,
	Fuad Tabba, James Morse, Shiqi Liu, Mark Brown,
	Raghavendra Rao Ananta, linux-arm-kernel, linux-kernel

From: Oliver Upton <oliver.upton@linux.dev>

KVM caches MDCR_EL2 on a per-CPU basis in order to preserve the
configuration of MDCR_EL2.HPMN while running a guest. This is a bit
gross, since we're relying on some baked configuration rather than the
hardware definition of implemented counters.

Discover the number of implemented counters by reading PMCR_EL0.N
instead. This works because:

 - In VHE the kernel runs at EL2, and N always returns the number of
   counters implemented in hardware

 - In {n,h}VHE, the EL2 setup code programs MDCR_EL2.HPMN with the EL2
   view of PMCR_EL0.N for the host

Lastly, avoid traps under nested virtualization by saving PMCR_EL0.N in
host data.
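
In short, the cached per-CPU mdcr_el2 read is replaced by deriving
HPMN from PMCR_EL0.N, along these lines (paraphrasing the diff below):

  /* once per CPU, at hyp init */
  *host_data_ptr(nr_event_counters) =
  	FIELD_GET(ARMV8_PMU_PMCR_N, read_sysreg(pmcr_el0));

  /* when building a vcpu's mdcr_el2 */
  vcpu->arch.mdcr_el2 =
  	FIELD_PREP(MDCR_EL2_HPMN, *host_data_ptr(nr_event_counters));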

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: James Clark <james.clark@linaro.org>
---
 arch/arm64/include/asm/kvm_asm.h   |  5 +----
 arch/arm64/include/asm/kvm_host.h  |  7 +++++--
 arch/arm64/kvm/arm.c               |  2 +-
 arch/arm64/kvm/debug.c             | 29 +++++++++++------------------
 arch/arm64/kvm/hyp/nvhe/debug-sr.c |  5 -----
 arch/arm64/kvm/hyp/nvhe/hyp-main.c |  6 ------
 arch/arm64/kvm/hyp/vhe/debug-sr.c  |  5 -----
 7 files changed, 18 insertions(+), 41 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index ca2590344313..063185c202ce 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -53,8 +53,7 @@
 enum __kvm_host_smccc_func {
 	/* Hypercalls available only prior to pKVM finalisation */
 	/* __KVM_HOST_SMCCC_FUNC___kvm_hyp_init */
-	__KVM_HOST_SMCCC_FUNC___kvm_get_mdcr_el2 = __KVM_HOST_SMCCC_FUNC___kvm_hyp_init + 1,
-	__KVM_HOST_SMCCC_FUNC___pkvm_init,
+	__KVM_HOST_SMCCC_FUNC___pkvm_init = __KVM_HOST_SMCCC_FUNC___kvm_hyp_init + 1,
 	__KVM_HOST_SMCCC_FUNC___pkvm_create_private_mapping,
 	__KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector,
 	__KVM_HOST_SMCCC_FUNC___kvm_enable_ssbs,
@@ -247,8 +246,6 @@ extern void __kvm_adjust_pc(struct kvm_vcpu *vcpu);
 extern u64 __vgic_v3_get_gic_config(void);
 extern void __vgic_v3_init_lrs(void);
 
-extern u64 __kvm_get_mdcr_el2(void);
-
 #define __KVM_EXTABLE(from, to)						\
 	"	.pushsection	__kvm_ex_table, \"a\"\n"		\
 	"	.align		3\n"					\
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f333b189fb43..ad514434f3fe 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -642,7 +642,7 @@ struct kvm_host_data {
 	 * host_debug_state contains the host registers which are
 	 * saved and restored during world switches.
 	 */
-	 struct {
+	struct {
 		/* {Break,watch}point registers */
 		struct kvm_guest_debug_arch regs;
 		/* Statistical profiling extension */
@@ -652,6 +652,9 @@ struct kvm_host_data {
 		/* Values of trap registers for the host before guest entry. */
 		u64 mdcr_el2;
 	} host_debug_state;
+
+	/* Number of programmable event counters (PMCR_EL0.N) for this CPU */
+	unsigned int nr_event_counters;
 };
 
 struct kvm_host_psci_config {
@@ -1332,7 +1335,7 @@ static inline bool kvm_system_needs_idmapped_vectors(void)
 
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 
-void kvm_arm_init_debug(void);
+void kvm_init_host_debug_data(void);
 void kvm_arm_vcpu_init_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index a102c3aebdbc..ab1bf9ccf385 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2109,6 +2109,7 @@ static void cpu_set_hyp_vector(void)
 static void cpu_hyp_init_context(void)
 {
 	kvm_init_host_cpu_context(host_data_ptr(host_ctxt));
+	kvm_init_host_debug_data();
 
 	if (!is_kernel_in_hyp_mode())
 		cpu_init_hyp_mode();
@@ -2117,7 +2118,6 @@ static void cpu_hyp_init_context(void)
 static void cpu_hyp_init_features(void)
 {
 	cpu_set_hyp_vector();
-	kvm_arm_init_debug();
 
 	if (is_kernel_in_hyp_mode())
 		kvm_timer_init_vhe();
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index ce8886122ed3..0dbabb6e1108 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -21,8 +21,6 @@
 				DBG_MDSCR_KDE | \
 				DBG_MDSCR_MDE)
 
-static DEFINE_PER_CPU(u64, mdcr_el2);
-
 /*
  * save/restore_guest_debug_regs
  *
@@ -65,21 +63,6 @@ static void restore_guest_debug_regs(struct kvm_vcpu *vcpu)
 		*vcpu_cpsr(vcpu) &= ~DBG_SPSR_SS;
 }
 
-/**
- * kvm_arm_init_debug - grab what we need for debug
- *
- * Currently the sole task of this function is to retrieve the initial
- * value of mdcr_el2 so we can preserve MDCR_EL2.HPMN which has
- * presumably been set-up by some knowledgeable bootcode.
- *
- * It is called once per-cpu during CPU hyp initialisation.
- */
-
-void kvm_arm_init_debug(void)
-{
-	__this_cpu_write(mdcr_el2, kvm_call_hyp_ret(__kvm_get_mdcr_el2));
-}
-
 /**
  * kvm_arm_setup_mdcr_el2 - configure vcpu mdcr_el2 value
  *
@@ -99,7 +82,8 @@ static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu *vcpu)
 	 * This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
 	 * to disable guest access to the profiling and trace buffers
 	 */
-	vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK;
+	vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN,
+					 *host_data_ptr(nr_event_counters));
 	vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
 				MDCR_EL2_TPMS |
 				MDCR_EL2_TTRF |
@@ -343,3 +327,12 @@ void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu)
 	vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_SPE);
 	vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_TRBE);
 }
+
+void kvm_init_host_debug_data(void)
+{
+	u64 dfr0 = read_sysreg(id_aa64dfr0_el1);
+
+	if (cpuid_feature_extract_signed_field(dfr0, ID_AA64DFR0_EL1_PMUVer_SHIFT) > 0)
+		*host_data_ptr(nr_event_counters) = FIELD_GET(ARMV8_PMU_PMCR_N,
+							      read_sysreg(pmcr_el0));
+}
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
index 53efda0235cf..1e2a26d0196e 100644
--- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
@@ -106,8 +106,3 @@ void __debug_switch_to_host(struct kvm_vcpu *vcpu)
 {
 	__debug_switch_to_host_common(vcpu);
 }
-
-u64 __kvm_get_mdcr_el2(void)
-{
-	return read_sysreg(mdcr_el2);
-}
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 6aa0b13d86e5..16f5da3a884a 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -264,11 +264,6 @@ static void handle___vgic_v3_init_lrs(struct kvm_cpu_context *host_ctxt)
 	__vgic_v3_init_lrs();
 }
 
-static void handle___kvm_get_mdcr_el2(struct kvm_cpu_context *host_ctxt)
-{
-	cpu_reg(host_ctxt, 1) = __kvm_get_mdcr_el2();
-}
-
 static void handle___vgic_v3_save_vmcr_aprs(struct kvm_cpu_context *host_ctxt)
 {
 	DECLARE_REG(struct vgic_v3_cpu_if *, cpu_if, host_ctxt, 1);
@@ -384,7 +379,6 @@ typedef void (*hcall_t)(struct kvm_cpu_context *);
 
 static const hcall_t host_hcall[] = {
 	/* ___kvm_hyp_init */
-	HANDLE_FUNC(__kvm_get_mdcr_el2),
 	HANDLE_FUNC(__pkvm_init),
 	HANDLE_FUNC(__pkvm_create_private_mapping),
 	HANDLE_FUNC(__pkvm_cpu_set_vector),
diff --git a/arch/arm64/kvm/hyp/vhe/debug-sr.c b/arch/arm64/kvm/hyp/vhe/debug-sr.c
index 289689b2682d..0100339b09e0 100644
--- a/arch/arm64/kvm/hyp/vhe/debug-sr.c
+++ b/arch/arm64/kvm/hyp/vhe/debug-sr.c
@@ -19,8 +19,3 @@ void __debug_switch_to_host(struct kvm_vcpu *vcpu)
 {
 	__debug_switch_to_host_common(vcpu);
 }
-
-u64 __kvm_get_mdcr_el2(void)
-{
-	return read_sysreg(mdcr_el2);
-}
-- 
2.34.1




* [PATCH v8 2/8] KVM: arm64: Track presence of SPE/TRBE in kvm_host_data instead of vCPU
  2024-11-27 10:01 [PATCH v8 0/8] kvm/coresight: Support exclude guest and exclude host James Clark
  2024-11-27 10:01 ` [PATCH v8 1/8] KVM: arm64: Get rid of __kvm_get_mdcr_el2() and related warts James Clark
@ 2024-11-27 10:01 ` James Clark
  2024-11-27 10:01 ` [PATCH v8 3/8] arm64/sysreg: Add a comment that the sysreg file should be sorted James Clark
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2024-11-27 10:01 UTC (permalink / raw)
  To: maz, kvmarm, oliver.upton, suzuki.poulose, coresight
  Cc: James Clark, Joey Gouly, Zenghui Yu, Catalin Marinas, Will Deacon,
	Mike Leach, Alexander Shishkin, Mark Brown, Anshuman Khandual,
	Fuad Tabba, Shiqi Liu, James Morse, Raghavendra Rao Ananta,
	linux-arm-kernel, linux-kernel

From: Oliver Upton <oliver.upton@linux.dev>

Add flags to kvm_host_data to track whether SPE/TRBE is present and
programmable on a per-CPU basis. Set the flags up at init rather than
vcpu_load() as the programmability of these buffers is unlikely to
change.
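
The resulting checks read directly off the per-CPU flags (usage taken
from the diff below):

  /* at init, once per CPU */
  if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceBuffer_SHIFT) &&
      !(read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_EL1_P))
  	host_data_set_flag(HAS_TRBE);

  /* on the nVHE switch path */
  if (host_data_test_flag(HAS_TRBE))
  	__debug_save_trace(host_data_ptr(host_debug_state.trfcr_el1));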

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: James Clark <james.clark@linaro.org>
---
 arch/arm64/include/asm/kvm_host.h  | 19 +++++++++-------
 arch/arm64/kvm/arm.c               |  3 ---
 arch/arm64/kvm/debug.c             | 36 ++++++++----------------------
 arch/arm64/kvm/hyp/nvhe/debug-sr.c |  8 +++----
 4 files changed, 24 insertions(+), 42 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index ad514434f3fe..7e3478386351 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -610,6 +610,10 @@ struct cpu_sve_state {
  * field.
  */
 struct kvm_host_data {
+#define KVM_HOST_DATA_FLAG_HAS_SPE	0
+#define KVM_HOST_DATA_FLAG_HAS_TRBE	1
+	unsigned long flags;
+
 	struct kvm_cpu_context host_ctxt;
 
 	/*
@@ -911,10 +915,6 @@ struct kvm_vcpu_arch {
 #define EXCEPT_AA64_EL2_SERR	__vcpu_except_flags(7)
 /* Guest debug is live */
 #define DEBUG_DIRTY		__vcpu_single_flag(iflags, BIT(4))
-/* Save SPE context if active  */
-#define DEBUG_STATE_SAVE_SPE	__vcpu_single_flag(iflags, BIT(5))
-/* Save TRBE context if active  */
-#define DEBUG_STATE_SAVE_TRBE	__vcpu_single_flag(iflags, BIT(6))
 
 /* SVE enabled for host EL0 */
 #define HOST_SVE_ENABLED	__vcpu_single_flag(sflags, BIT(0))
@@ -1310,6 +1310,13 @@ DECLARE_KVM_HYP_PER_CPU(struct kvm_host_data, kvm_host_data);
 	 &this_cpu_ptr_hyp_sym(kvm_host_data)->f)
 #endif
 
+#define host_data_test_flag(flag)					\
+	(test_bit(KVM_HOST_DATA_FLAG_##flag, host_data_ptr(flags)))
+#define host_data_set_flag(flag)					\
+	set_bit(KVM_HOST_DATA_FLAG_##flag, host_data_ptr(flags))
+#define host_data_clear_flag(flag)					\
+	clear_bit(KVM_HOST_DATA_FLAG_##flag, host_data_ptr(flags))
+
 /* Check whether the FP regs are owned by the guest */
 static inline bool guest_owns_fp_regs(void)
 {
@@ -1370,10 +1377,6 @@ static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
 	return (!has_vhe() && attr->exclude_host);
 }
 
-/* Flags for host debug state */
-void kvm_arch_vcpu_load_debug_state_flags(struct kvm_vcpu *vcpu);
-void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu);
-
 #ifdef CONFIG_KVM
 void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr);
 void kvm_clr_pmu_events(u64 clr);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index ab1bf9ccf385..3822774840e1 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -617,15 +617,12 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 
 	vcpu_set_pauth_traps(vcpu);
 
-	kvm_arch_vcpu_load_debug_state_flags(vcpu);
-
 	if (!cpumask_test_cpu(cpu, vcpu->kvm->arch.supported_cpus))
 		vcpu_set_on_unsupported_cpu(vcpu);
 }
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
-	kvm_arch_vcpu_put_debug_state_flags(vcpu);
 	kvm_arch_vcpu_put_fp(vcpu);
 	if (has_vhe())
 		kvm_vcpu_put_vhe(vcpu);
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index 0dbabb6e1108..dd9e139dfd13 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -299,40 +299,22 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
 	}
 }
 
-void kvm_arch_vcpu_load_debug_state_flags(struct kvm_vcpu *vcpu)
+void kvm_init_host_debug_data(void)
 {
-	u64 dfr0;
+	u64 dfr0 = read_sysreg(id_aa64dfr0_el1);
+
+	if (cpuid_feature_extract_signed_field(dfr0, ID_AA64DFR0_EL1_PMUVer_SHIFT) > 0)
+		*host_data_ptr(nr_event_counters) = FIELD_GET(ARMV8_PMU_PMCR_N,
+							      read_sysreg(pmcr_el0));
 
-	/* For VHE, there is nothing to do */
 	if (has_vhe())
 		return;
 
-	dfr0 = read_sysreg(id_aa64dfr0_el1);
-	/*
-	 * If SPE is present on this CPU and is available at current EL,
-	 * we may need to check if the host state needs to be saved.
-	 */
 	if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_PMSVer_SHIFT) &&
-	    !(read_sysreg_s(SYS_PMBIDR_EL1) & BIT(PMBIDR_EL1_P_SHIFT)))
-		vcpu_set_flag(vcpu, DEBUG_STATE_SAVE_SPE);
+	    !(read_sysreg_s(SYS_PMBIDR_EL1) & PMBIDR_EL1_P))
+		host_data_set_flag(HAS_SPE);
 
-	/* Check if we have TRBE implemented and available at the host */
 	if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceBuffer_SHIFT) &&
 	    !(read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_EL1_P))
-		vcpu_set_flag(vcpu, DEBUG_STATE_SAVE_TRBE);
-}
-
-void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu)
-{
-	vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_SPE);
-	vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_TRBE);
-}
-
-void kvm_init_host_debug_data(void)
-{
-	u64 dfr0 = read_sysreg(id_aa64dfr0_el1);
-
-	if (cpuid_feature_extract_signed_field(dfr0, ID_AA64DFR0_EL1_PMUVer_SHIFT) > 0)
-		*host_data_ptr(nr_event_counters) = FIELD_GET(ARMV8_PMU_PMCR_N,
-							      read_sysreg(pmcr_el0));
+		host_data_set_flag(HAS_TRBE);
 }
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
index 1e2a26d0196e..858bb38e273f 100644
--- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
@@ -82,10 +82,10 @@ static void __debug_restore_trace(u64 trfcr_el1)
 void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
 {
 	/* Disable and flush SPE data generation */
-	if (vcpu_get_flag(vcpu, DEBUG_STATE_SAVE_SPE))
+	if (host_data_test_flag(HAS_SPE))
 		__debug_save_spe(host_data_ptr(host_debug_state.pmscr_el1));
 	/* Disable and flush Self-Hosted Trace generation */
-	if (vcpu_get_flag(vcpu, DEBUG_STATE_SAVE_TRBE))
+	if (host_data_test_flag(HAS_TRBE))
 		__debug_save_trace(host_data_ptr(host_debug_state.trfcr_el1));
 }
 
@@ -96,9 +96,9 @@ void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
 
 void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu)
 {
-	if (vcpu_get_flag(vcpu, DEBUG_STATE_SAVE_SPE))
+	if (host_data_test_flag(HAS_SPE))
 		__debug_restore_spe(*host_data_ptr(host_debug_state.pmscr_el1));
-	if (vcpu_get_flag(vcpu, DEBUG_STATE_SAVE_TRBE))
+	if (host_data_test_flag(HAS_TRBE))
 		__debug_restore_trace(*host_data_ptr(host_debug_state.trfcr_el1));
 }
 
-- 
2.34.1




* [PATCH v8 3/8] arm64/sysreg: Add a comment that the sysreg file should be sorted
  2024-11-27 10:01 [PATCH v8 0/8] kvm/coresight: Support exclude guest and exclude host James Clark
  2024-11-27 10:01 ` [PATCH v8 1/8] KVM: arm64: Get rid of __kvm_get_mdcr_el2() and related warts James Clark
  2024-11-27 10:01 ` [PATCH v8 2/8] KVM: arm64: Track presence of SPE/TRBE in kvm_host_data instead of vCPU James Clark
@ 2024-11-27 10:01 ` James Clark
  2024-11-27 10:01 ` [PATCH v8 4/8] tools: arm64: Update sysreg.h header files James Clark
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2024-11-27 10:01 UTC (permalink / raw)
  To: maz, kvmarm, oliver.upton, suzuki.poulose, coresight
  Cc: James Clark, Mark Brown, James Clark, Joey Gouly, Zenghui Yu,
	Catalin Marinas, Will Deacon, Mike Leach, Alexander Shishkin,
	Anshuman Khandual, James Morse, Fuad Tabba, Shiqi Liu,
	Raghavendra Rao Ananta, linux-arm-kernel, linux-kernel

From: James Clark <james.clark@arm.com>

There are a few entries, particularly at the end of the file, that
aren't in order. To avoid confusion, add a comment that might help new
entries be added in the right place.

Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
---
 arch/arm64/tools/sysreg | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index ed3bf6a0f5c1..a26c0da0c42d 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -48,6 +48,8 @@
 # feature that introduces them (eg, FEAT_LS64_ACCDATA introduces enumeration
 # item ACCDATA) though it may be more taseful to do something else.
 
+# Please try to keep entries in this file sorted by sysreg encoding.
+
 Sysreg	OSDTRRX_EL1	2	0	0	0	2
 Res0	63:32
 Field	31:0	DTRRX
-- 
2.34.1




* [PATCH v8 4/8] tools: arm64: Update sysreg.h header files
  2024-11-27 10:01 [PATCH v8 0/8] kvm/coresight: Support exclude guest and exclude host James Clark
                   ` (2 preceding siblings ...)
  2024-11-27 10:01 ` [PATCH v8 3/8] arm64/sysreg: Add a comment that the sysreg file should be sorted James Clark
@ 2024-11-27 10:01 ` James Clark
  2024-11-27 10:01 ` [PATCH v8 5/8] arm64/sysreg/tools: Move TRFCR definitions to sysreg James Clark
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2024-11-27 10:01 UTC (permalink / raw)
  To: maz, kvmarm, oliver.upton, suzuki.poulose, coresight
  Cc: James Clark, Mark Brown, James Clark, Joey Gouly, Zenghui Yu,
	Catalin Marinas, Will Deacon, Mike Leach, Alexander Shishkin,
	Anshuman Khandual, James Morse, Shiqi Liu, Fuad Tabba,
	Raghavendra Rao Ananta, linux-arm-kernel, linux-kernel

From: James Clark <james.clark@arm.com>

Created with the following:

  cp include/linux/kasan-tags.h tools/include/linux/
  cp arch/arm64/include/asm/sysreg.h tools/arch/arm64/include/asm/

Update the tools copy of sysreg.h so that the next commit to add a new
register doesn't have unrelated changes in it. Because the new version
of sysreg.h includes kasan-tags.h, that file also now needs to be copied
into tools.

Acked-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
---
 tools/arch/arm64/include/asm/sysreg.h | 398 +++++++++++++++++++++++++-
 tools/include/linux/kasan-tags.h      |  15 +
 2 files changed, 405 insertions(+), 8 deletions(-)
 create mode 100644 tools/include/linux/kasan-tags.h

diff --git a/tools/arch/arm64/include/asm/sysreg.h b/tools/arch/arm64/include/asm/sysreg.h
index cd8420e8c3ad..345e81e0d2b3 100644
--- a/tools/arch/arm64/include/asm/sysreg.h
+++ b/tools/arch/arm64/include/asm/sysreg.h
@@ -11,6 +11,7 @@
 
 #include <linux/bits.h>
 #include <linux/stringify.h>
+#include <linux/kasan-tags.h>
 
 #include <asm/gpr-num.h>
 
@@ -108,6 +109,9 @@
 #define set_pstate_ssbs(x)		asm volatile(SET_PSTATE_SSBS(x))
 #define set_pstate_dit(x)		asm volatile(SET_PSTATE_DIT(x))
 
+/* Register-based PAN access, for save/restore purposes */
+#define SYS_PSTATE_PAN			sys_reg(3, 0, 4, 2, 3)
+
 #define __SYS_BARRIER_INSN(CRm, op2, Rt) \
 	__emit_inst(0xd5000000 | sys_insn(0, 3, 3, (CRm), (op2)) | ((Rt) & 0x1f))
 
@@ -123,6 +127,37 @@
 #define SYS_DC_CIGSW			sys_insn(1, 0, 7, 14, 4)
 #define SYS_DC_CIGDSW			sys_insn(1, 0, 7, 14, 6)
 
+#define SYS_IC_IALLUIS			sys_insn(1, 0, 7, 1, 0)
+#define SYS_IC_IALLU			sys_insn(1, 0, 7, 5, 0)
+#define SYS_IC_IVAU			sys_insn(1, 3, 7, 5, 1)
+
+#define SYS_DC_IVAC			sys_insn(1, 0, 7, 6, 1)
+#define SYS_DC_IGVAC			sys_insn(1, 0, 7, 6, 3)
+#define SYS_DC_IGDVAC			sys_insn(1, 0, 7, 6, 5)
+
+#define SYS_DC_CVAC			sys_insn(1, 3, 7, 10, 1)
+#define SYS_DC_CGVAC			sys_insn(1, 3, 7, 10, 3)
+#define SYS_DC_CGDVAC			sys_insn(1, 3, 7, 10, 5)
+
+#define SYS_DC_CVAU			sys_insn(1, 3, 7, 11, 1)
+
+#define SYS_DC_CVAP			sys_insn(1, 3, 7, 12, 1)
+#define SYS_DC_CGVAP			sys_insn(1, 3, 7, 12, 3)
+#define SYS_DC_CGDVAP			sys_insn(1, 3, 7, 12, 5)
+
+#define SYS_DC_CVADP			sys_insn(1, 3, 7, 13, 1)
+#define SYS_DC_CGVADP			sys_insn(1, 3, 7, 13, 3)
+#define SYS_DC_CGDVADP			sys_insn(1, 3, 7, 13, 5)
+
+#define SYS_DC_CIVAC			sys_insn(1, 3, 7, 14, 1)
+#define SYS_DC_CIGVAC			sys_insn(1, 3, 7, 14, 3)
+#define SYS_DC_CIGDVAC			sys_insn(1, 3, 7, 14, 5)
+
+/* Data cache zero operations */
+#define SYS_DC_ZVA			sys_insn(1, 3, 7, 4, 1)
+#define SYS_DC_GVA			sys_insn(1, 3, 7, 4, 3)
+#define SYS_DC_GZVA			sys_insn(1, 3, 7, 4, 4)
+
 /*
  * Automatically generated definitions for system registers, the
  * manual encodings below are in the process of being converted to
@@ -162,6 +197,84 @@
 #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
 #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
 
+#define SYS_BRBINF_EL1(n)		sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 0))
+#define SYS_BRBINFINJ_EL1		sys_reg(2, 1, 9, 1, 0)
+#define SYS_BRBSRC_EL1(n)		sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 1))
+#define SYS_BRBSRCINJ_EL1		sys_reg(2, 1, 9, 1, 1)
+#define SYS_BRBTGT_EL1(n)		sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 2))
+#define SYS_BRBTGTINJ_EL1		sys_reg(2, 1, 9, 1, 2)
+#define SYS_BRBTS_EL1			sys_reg(2, 1, 9, 0, 2)
+
+#define SYS_BRBCR_EL1			sys_reg(2, 1, 9, 0, 0)
+#define SYS_BRBFCR_EL1			sys_reg(2, 1, 9, 0, 1)
+#define SYS_BRBIDR0_EL1			sys_reg(2, 1, 9, 2, 0)
+
+#define SYS_TRCITECR_EL1		sys_reg(3, 0, 1, 2, 3)
+#define SYS_TRCACATR(m)			sys_reg(2, 1, 2, ((m & 7) << 1), (2 | (m >> 3)))
+#define SYS_TRCACVR(m)			sys_reg(2, 1, 2, ((m & 7) << 1), (0 | (m >> 3)))
+#define SYS_TRCAUTHSTATUS		sys_reg(2, 1, 7, 14, 6)
+#define SYS_TRCAUXCTLR			sys_reg(2, 1, 0, 6, 0)
+#define SYS_TRCBBCTLR			sys_reg(2, 1, 0, 15, 0)
+#define SYS_TRCCCCTLR			sys_reg(2, 1, 0, 14, 0)
+#define SYS_TRCCIDCCTLR0		sys_reg(2, 1, 3, 0, 2)
+#define SYS_TRCCIDCCTLR1		sys_reg(2, 1, 3, 1, 2)
+#define SYS_TRCCIDCVR(m)		sys_reg(2, 1, 3, ((m & 7) << 1), 0)
+#define SYS_TRCCLAIMCLR			sys_reg(2, 1, 7, 9, 6)
+#define SYS_TRCCLAIMSET			sys_reg(2, 1, 7, 8, 6)
+#define SYS_TRCCNTCTLR(m)		sys_reg(2, 1, 0, (4 | (m & 3)), 5)
+#define SYS_TRCCNTRLDVR(m)		sys_reg(2, 1, 0, (0 | (m & 3)), 5)
+#define SYS_TRCCNTVR(m)			sys_reg(2, 1, 0, (8 | (m & 3)), 5)
+#define SYS_TRCCONFIGR			sys_reg(2, 1, 0, 4, 0)
+#define SYS_TRCDEVARCH			sys_reg(2, 1, 7, 15, 6)
+#define SYS_TRCDEVID			sys_reg(2, 1, 7, 2, 7)
+#define SYS_TRCEVENTCTL0R		sys_reg(2, 1, 0, 8, 0)
+#define SYS_TRCEVENTCTL1R		sys_reg(2, 1, 0, 9, 0)
+#define SYS_TRCEXTINSELR(m)		sys_reg(2, 1, 0, (8 | (m & 3)), 4)
+#define SYS_TRCIDR0			sys_reg(2, 1, 0, 8, 7)
+#define SYS_TRCIDR10			sys_reg(2, 1, 0, 2, 6)
+#define SYS_TRCIDR11			sys_reg(2, 1, 0, 3, 6)
+#define SYS_TRCIDR12			sys_reg(2, 1, 0, 4, 6)
+#define SYS_TRCIDR13			sys_reg(2, 1, 0, 5, 6)
+#define SYS_TRCIDR1			sys_reg(2, 1, 0, 9, 7)
+#define SYS_TRCIDR2			sys_reg(2, 1, 0, 10, 7)
+#define SYS_TRCIDR3			sys_reg(2, 1, 0, 11, 7)
+#define SYS_TRCIDR4			sys_reg(2, 1, 0, 12, 7)
+#define SYS_TRCIDR5			sys_reg(2, 1, 0, 13, 7)
+#define SYS_TRCIDR6			sys_reg(2, 1, 0, 14, 7)
+#define SYS_TRCIDR7			sys_reg(2, 1, 0, 15, 7)
+#define SYS_TRCIDR8			sys_reg(2, 1, 0, 0, 6)
+#define SYS_TRCIDR9			sys_reg(2, 1, 0, 1, 6)
+#define SYS_TRCIMSPEC(m)		sys_reg(2, 1, 0, (m & 7), 7)
+#define SYS_TRCITEEDCR			sys_reg(2, 1, 0, 2, 1)
+#define SYS_TRCOSLSR			sys_reg(2, 1, 1, 1, 4)
+#define SYS_TRCPRGCTLR			sys_reg(2, 1, 0, 1, 0)
+#define SYS_TRCQCTLR			sys_reg(2, 1, 0, 1, 1)
+#define SYS_TRCRSCTLR(m)		sys_reg(2, 1, 1, (m & 15), (0 | (m >> 4)))
+#define SYS_TRCRSR			sys_reg(2, 1, 0, 10, 0)
+#define SYS_TRCSEQEVR(m)		sys_reg(2, 1, 0, (m & 3), 4)
+#define SYS_TRCSEQRSTEVR		sys_reg(2, 1, 0, 6, 4)
+#define SYS_TRCSEQSTR			sys_reg(2, 1, 0, 7, 4)
+#define SYS_TRCSSCCR(m)			sys_reg(2, 1, 1, (m & 7), 2)
+#define SYS_TRCSSCSR(m)			sys_reg(2, 1, 1, (8 | (m & 7)), 2)
+#define SYS_TRCSSPCICR(m)		sys_reg(2, 1, 1, (m & 7), 3)
+#define SYS_TRCSTALLCTLR		sys_reg(2, 1, 0, 11, 0)
+#define SYS_TRCSTATR			sys_reg(2, 1, 0, 3, 0)
+#define SYS_TRCSYNCPR			sys_reg(2, 1, 0, 13, 0)
+#define SYS_TRCTRACEIDR			sys_reg(2, 1, 0, 0, 1)
+#define SYS_TRCTSCTLR			sys_reg(2, 1, 0, 12, 0)
+#define SYS_TRCVICTLR			sys_reg(2, 1, 0, 0, 2)
+#define SYS_TRCVIIECTLR			sys_reg(2, 1, 0, 1, 2)
+#define SYS_TRCVIPCSSCTLR		sys_reg(2, 1, 0, 3, 2)
+#define SYS_TRCVISSCTLR			sys_reg(2, 1, 0, 2, 2)
+#define SYS_TRCVMIDCCTLR0		sys_reg(2, 1, 3, 2, 2)
+#define SYS_TRCVMIDCCTLR1		sys_reg(2, 1, 3, 3, 2)
+#define SYS_TRCVMIDCVR(m)		sys_reg(2, 1, 3, ((m & 7) << 1), 1)
+
+/* ETM */
+#define SYS_TRCOSLAR			sys_reg(2, 1, 1, 0, 4)
+
+#define SYS_BRBCR_EL2			sys_reg(2, 4, 9, 0, 0)
+
 #define SYS_MIDR_EL1			sys_reg(3, 0, 0, 0, 0)
 #define SYS_MPIDR_EL1			sys_reg(3, 0, 0, 0, 5)
 #define SYS_REVIDR_EL1			sys_reg(3, 0, 0, 0, 6)
@@ -202,15 +315,38 @@
 #define SYS_ERXCTLR_EL1			sys_reg(3, 0, 5, 4, 1)
 #define SYS_ERXSTATUS_EL1		sys_reg(3, 0, 5, 4, 2)
 #define SYS_ERXADDR_EL1			sys_reg(3, 0, 5, 4, 3)
+#define SYS_ERXPFGF_EL1			sys_reg(3, 0, 5, 4, 4)
+#define SYS_ERXPFGCTL_EL1		sys_reg(3, 0, 5, 4, 5)
+#define SYS_ERXPFGCDN_EL1		sys_reg(3, 0, 5, 4, 6)
 #define SYS_ERXMISC0_EL1		sys_reg(3, 0, 5, 5, 0)
 #define SYS_ERXMISC1_EL1		sys_reg(3, 0, 5, 5, 1)
+#define SYS_ERXMISC2_EL1		sys_reg(3, 0, 5, 5, 2)
+#define SYS_ERXMISC3_EL1		sys_reg(3, 0, 5, 5, 3)
 #define SYS_TFSR_EL1			sys_reg(3, 0, 5, 6, 0)
 #define SYS_TFSRE0_EL1			sys_reg(3, 0, 5, 6, 1)
 
 #define SYS_PAR_EL1			sys_reg(3, 0, 7, 4, 0)
 
 #define SYS_PAR_EL1_F			BIT(0)
+/* When PAR_EL1.F == 1 */
 #define SYS_PAR_EL1_FST			GENMASK(6, 1)
+#define SYS_PAR_EL1_PTW			BIT(8)
+#define SYS_PAR_EL1_S			BIT(9)
+#define SYS_PAR_EL1_AssuredOnly		BIT(12)
+#define SYS_PAR_EL1_TopLevel		BIT(13)
+#define SYS_PAR_EL1_Overlay		BIT(14)
+#define SYS_PAR_EL1_DirtyBit		BIT(15)
+#define SYS_PAR_EL1_F1_IMPDEF		GENMASK_ULL(63, 48)
+#define SYS_PAR_EL1_F1_RES0		(BIT(7) | BIT(10) | GENMASK_ULL(47, 16))
+#define SYS_PAR_EL1_RES1		BIT(11)
+/* When PAR_EL1.F == 0 */
+#define SYS_PAR_EL1_SH			GENMASK_ULL(8, 7)
+#define SYS_PAR_EL1_NS			BIT(9)
+#define SYS_PAR_EL1_F0_IMPDEF		BIT(10)
+#define SYS_PAR_EL1_NSE			BIT(11)
+#define SYS_PAR_EL1_PA			GENMASK_ULL(51, 12)
+#define SYS_PAR_EL1_ATTR		GENMASK_ULL(63, 56)
+#define SYS_PAR_EL1_F0_RES0		(GENMASK_ULL(6, 1) | GENMASK_ULL(55, 52))
 
 /*** Statistical Profiling Extension ***/
 #define PMSEVFR_EL1_RES0_IMP	\
@@ -274,6 +410,8 @@
 #define SYS_ICC_IGRPEN0_EL1		sys_reg(3, 0, 12, 12, 6)
 #define SYS_ICC_IGRPEN1_EL1		sys_reg(3, 0, 12, 12, 7)
 
+#define SYS_ACCDATA_EL1			sys_reg(3, 0, 13, 0, 5)
+
 #define SYS_CNTKCTL_EL1			sys_reg(3, 0, 14, 1, 0)
 
 #define SYS_AIDR_EL1			sys_reg(3, 1, 0, 0, 7)
@@ -286,7 +424,6 @@
 #define SYS_PMCNTENCLR_EL0		sys_reg(3, 3, 9, 12, 2)
 #define SYS_PMOVSCLR_EL0		sys_reg(3, 3, 9, 12, 3)
 #define SYS_PMSWINC_EL0			sys_reg(3, 3, 9, 12, 4)
-#define SYS_PMSELR_EL0			sys_reg(3, 3, 9, 12, 5)
 #define SYS_PMCEID0_EL0			sys_reg(3, 3, 9, 12, 6)
 #define SYS_PMCEID1_EL0			sys_reg(3, 3, 9, 12, 7)
 #define SYS_PMCCNTR_EL0			sys_reg(3, 3, 9, 13, 0)
@@ -369,6 +506,7 @@
 
 #define SYS_SCTLR_EL2			sys_reg(3, 4, 1, 0, 0)
 #define SYS_ACTLR_EL2			sys_reg(3, 4, 1, 0, 1)
+#define SYS_SCTLR2_EL2			sys_reg(3, 4, 1, 0, 3)
 #define SYS_HCR_EL2			sys_reg(3, 4, 1, 1, 0)
 #define SYS_MDCR_EL2			sys_reg(3, 4, 1, 1, 1)
 #define SYS_CPTR_EL2			sys_reg(3, 4, 1, 1, 2)
@@ -382,12 +520,15 @@
 #define SYS_VTCR_EL2			sys_reg(3, 4, 2, 1, 2)
 
 #define SYS_TRFCR_EL2			sys_reg(3, 4, 1, 2, 1)
-#define SYS_HDFGRTR_EL2			sys_reg(3, 4, 3, 1, 4)
-#define SYS_HDFGWTR_EL2			sys_reg(3, 4, 3, 1, 5)
+#define SYS_VNCR_EL2			sys_reg(3, 4, 2, 2, 0)
 #define SYS_HAFGRTR_EL2			sys_reg(3, 4, 3, 1, 6)
 #define SYS_SPSR_EL2			sys_reg(3, 4, 4, 0, 0)
 #define SYS_ELR_EL2			sys_reg(3, 4, 4, 0, 1)
 #define SYS_SP_EL1			sys_reg(3, 4, 4, 1, 0)
+#define SYS_SPSR_irq			sys_reg(3, 4, 4, 3, 0)
+#define SYS_SPSR_abt			sys_reg(3, 4, 4, 3, 1)
+#define SYS_SPSR_und			sys_reg(3, 4, 4, 3, 2)
+#define SYS_SPSR_fiq			sys_reg(3, 4, 4, 3, 3)
 #define SYS_IFSR32_EL2			sys_reg(3, 4, 5, 0, 1)
 #define SYS_AFSR0_EL2			sys_reg(3, 4, 5, 1, 0)
 #define SYS_AFSR1_EL2			sys_reg(3, 4, 5, 1, 1)
@@ -449,24 +590,49 @@
 
 #define SYS_CONTEXTIDR_EL2		sys_reg(3, 4, 13, 0, 1)
 #define SYS_TPIDR_EL2			sys_reg(3, 4, 13, 0, 2)
+#define SYS_SCXTNUM_EL2			sys_reg(3, 4, 13, 0, 7)
+
+#define __AMEV_op2(m)			(m & 0x7)
+#define __AMEV_CRm(n, m)		(n | ((m & 0x8) >> 3))
+#define __SYS__AMEVCNTVOFF0n_EL2(m)	sys_reg(3, 4, 13, __AMEV_CRm(0x8, m), __AMEV_op2(m))
+#define SYS_AMEVCNTVOFF0n_EL2(m)	__SYS__AMEVCNTVOFF0n_EL2(m)
+#define __SYS__AMEVCNTVOFF1n_EL2(m)	sys_reg(3, 4, 13, __AMEV_CRm(0xA, m), __AMEV_op2(m))
+#define SYS_AMEVCNTVOFF1n_EL2(m)	__SYS__AMEVCNTVOFF1n_EL2(m)
 
 #define SYS_CNTVOFF_EL2			sys_reg(3, 4, 14, 0, 3)
 #define SYS_CNTHCTL_EL2			sys_reg(3, 4, 14, 1, 0)
+#define SYS_CNTHP_TVAL_EL2		sys_reg(3, 4, 14, 2, 0)
+#define SYS_CNTHP_CTL_EL2		sys_reg(3, 4, 14, 2, 1)
+#define SYS_CNTHP_CVAL_EL2		sys_reg(3, 4, 14, 2, 2)
+#define SYS_CNTHV_TVAL_EL2		sys_reg(3, 4, 14, 3, 0)
+#define SYS_CNTHV_CTL_EL2		sys_reg(3, 4, 14, 3, 1)
+#define SYS_CNTHV_CVAL_EL2		sys_reg(3, 4, 14, 3, 2)
 
 /* VHE encodings for architectural EL0/1 system registers */
+#define SYS_BRBCR_EL12			sys_reg(2, 5, 9, 0, 0)
 #define SYS_SCTLR_EL12			sys_reg(3, 5, 1, 0, 0)
+#define SYS_CPACR_EL12			sys_reg(3, 5, 1, 0, 2)
+#define SYS_SCTLR2_EL12			sys_reg(3, 5, 1, 0, 3)
+#define SYS_ZCR_EL12			sys_reg(3, 5, 1, 2, 0)
+#define SYS_TRFCR_EL12			sys_reg(3, 5, 1, 2, 1)
+#define SYS_SMCR_EL12			sys_reg(3, 5, 1, 2, 6)
 #define SYS_TTBR0_EL12			sys_reg(3, 5, 2, 0, 0)
 #define SYS_TTBR1_EL12			sys_reg(3, 5, 2, 0, 1)
 #define SYS_TCR_EL12			sys_reg(3, 5, 2, 0, 2)
+#define SYS_TCR2_EL12			sys_reg(3, 5, 2, 0, 3)
 #define SYS_SPSR_EL12			sys_reg(3, 5, 4, 0, 0)
 #define SYS_ELR_EL12			sys_reg(3, 5, 4, 0, 1)
 #define SYS_AFSR0_EL12			sys_reg(3, 5, 5, 1, 0)
 #define SYS_AFSR1_EL12			sys_reg(3, 5, 5, 1, 1)
 #define SYS_ESR_EL12			sys_reg(3, 5, 5, 2, 0)
 #define SYS_TFSR_EL12			sys_reg(3, 5, 5, 6, 0)
+#define SYS_FAR_EL12			sys_reg(3, 5, 6, 0, 0)
+#define SYS_PMSCR_EL12			sys_reg(3, 5, 9, 9, 0)
 #define SYS_MAIR_EL12			sys_reg(3, 5, 10, 2, 0)
 #define SYS_AMAIR_EL12			sys_reg(3, 5, 10, 3, 0)
 #define SYS_VBAR_EL12			sys_reg(3, 5, 12, 0, 0)
+#define SYS_CONTEXTIDR_EL12		sys_reg(3, 5, 13, 0, 1)
+#define SYS_SCXTNUM_EL12		sys_reg(3, 5, 13, 0, 7)
 #define SYS_CNTKCTL_EL12		sys_reg(3, 5, 14, 1, 0)
 #define SYS_CNTP_TVAL_EL02		sys_reg(3, 5, 14, 2, 0)
 #define SYS_CNTP_CTL_EL02		sys_reg(3, 5, 14, 2, 1)
@@ -477,6 +643,183 @@
 
 #define SYS_SP_EL2			sys_reg(3, 6,  4, 1, 0)
 
+/* AT instructions */
+#define AT_Op0 1
+#define AT_CRn 7
+
+#define OP_AT_S1E1R	sys_insn(AT_Op0, 0, AT_CRn, 8, 0)
+#define OP_AT_S1E1W	sys_insn(AT_Op0, 0, AT_CRn, 8, 1)
+#define OP_AT_S1E0R	sys_insn(AT_Op0, 0, AT_CRn, 8, 2)
+#define OP_AT_S1E0W	sys_insn(AT_Op0, 0, AT_CRn, 8, 3)
+#define OP_AT_S1E1RP	sys_insn(AT_Op0, 0, AT_CRn, 9, 0)
+#define OP_AT_S1E1WP	sys_insn(AT_Op0, 0, AT_CRn, 9, 1)
+#define OP_AT_S1E1A	sys_insn(AT_Op0, 0, AT_CRn, 9, 2)
+#define OP_AT_S1E2R	sys_insn(AT_Op0, 4, AT_CRn, 8, 0)
+#define OP_AT_S1E2W	sys_insn(AT_Op0, 4, AT_CRn, 8, 1)
+#define OP_AT_S12E1R	sys_insn(AT_Op0, 4, AT_CRn, 8, 4)
+#define OP_AT_S12E1W	sys_insn(AT_Op0, 4, AT_CRn, 8, 5)
+#define OP_AT_S12E0R	sys_insn(AT_Op0, 4, AT_CRn, 8, 6)
+#define OP_AT_S12E0W	sys_insn(AT_Op0, 4, AT_CRn, 8, 7)
+#define OP_AT_S1E2A	sys_insn(AT_Op0, 4, AT_CRn, 9, 2)
+
+/* TLBI instructions */
+#define TLBI_Op0	1
+
+#define TLBI_Op1_EL1	0	/* Accessible from EL1 or higher */
+#define TLBI_Op1_EL2	4	/* Accessible from EL2 or higher */
+
+#define TLBI_CRn_XS	8	/* Extra Slow (the common one) */
+#define TLBI_CRn_nXS	9	/* not Extra Slow (which nobody uses)*/
+
+#define TLBI_CRm_IPAIS	0	/* S2 Inner-Shareable */
+#define TLBI_CRm_nROS	1	/* non-Range, Outer-Sharable */
+#define TLBI_CRm_RIS	2	/* Range, Inner-Sharable */
+#define TLBI_CRm_nRIS	3	/* non-Range, Inner-Sharable */
+#define TLBI_CRm_IPAONS	4	/* S2 Outer and Non-Shareable */
+#define TLBI_CRm_ROS	5	/* Range, Outer-Sharable */
+#define TLBI_CRm_RNS	6	/* Range, Non-Sharable */
+#define TLBI_CRm_nRNS	7	/* non-Range, Non-Sharable */
+
+#define OP_TLBI_VMALLE1OS		sys_insn(1, 0, 8, 1, 0)
+#define OP_TLBI_VAE1OS			sys_insn(1, 0, 8, 1, 1)
+#define OP_TLBI_ASIDE1OS		sys_insn(1, 0, 8, 1, 2)
+#define OP_TLBI_VAAE1OS			sys_insn(1, 0, 8, 1, 3)
+#define OP_TLBI_VALE1OS			sys_insn(1, 0, 8, 1, 5)
+#define OP_TLBI_VAALE1OS		sys_insn(1, 0, 8, 1, 7)
+#define OP_TLBI_RVAE1IS			sys_insn(1, 0, 8, 2, 1)
+#define OP_TLBI_RVAAE1IS		sys_insn(1, 0, 8, 2, 3)
+#define OP_TLBI_RVALE1IS		sys_insn(1, 0, 8, 2, 5)
+#define OP_TLBI_RVAALE1IS		sys_insn(1, 0, 8, 2, 7)
+#define OP_TLBI_VMALLE1IS		sys_insn(1, 0, 8, 3, 0)
+#define OP_TLBI_VAE1IS			sys_insn(1, 0, 8, 3, 1)
+#define OP_TLBI_ASIDE1IS		sys_insn(1, 0, 8, 3, 2)
+#define OP_TLBI_VAAE1IS			sys_insn(1, 0, 8, 3, 3)
+#define OP_TLBI_VALE1IS			sys_insn(1, 0, 8, 3, 5)
+#define OP_TLBI_VAALE1IS		sys_insn(1, 0, 8, 3, 7)
+#define OP_TLBI_RVAE1OS			sys_insn(1, 0, 8, 5, 1)
+#define OP_TLBI_RVAAE1OS		sys_insn(1, 0, 8, 5, 3)
+#define OP_TLBI_RVALE1OS		sys_insn(1, 0, 8, 5, 5)
+#define OP_TLBI_RVAALE1OS		sys_insn(1, 0, 8, 5, 7)
+#define OP_TLBI_RVAE1			sys_insn(1, 0, 8, 6, 1)
+#define OP_TLBI_RVAAE1			sys_insn(1, 0, 8, 6, 3)
+#define OP_TLBI_RVALE1			sys_insn(1, 0, 8, 6, 5)
+#define OP_TLBI_RVAALE1			sys_insn(1, 0, 8, 6, 7)
+#define OP_TLBI_VMALLE1			sys_insn(1, 0, 8, 7, 0)
+#define OP_TLBI_VAE1			sys_insn(1, 0, 8, 7, 1)
+#define OP_TLBI_ASIDE1			sys_insn(1, 0, 8, 7, 2)
+#define OP_TLBI_VAAE1			sys_insn(1, 0, 8, 7, 3)
+#define OP_TLBI_VALE1			sys_insn(1, 0, 8, 7, 5)
+#define OP_TLBI_VAALE1			sys_insn(1, 0, 8, 7, 7)
+#define OP_TLBI_VMALLE1OSNXS		sys_insn(1, 0, 9, 1, 0)
+#define OP_TLBI_VAE1OSNXS		sys_insn(1, 0, 9, 1, 1)
+#define OP_TLBI_ASIDE1OSNXS		sys_insn(1, 0, 9, 1, 2)
+#define OP_TLBI_VAAE1OSNXS		sys_insn(1, 0, 9, 1, 3)
+#define OP_TLBI_VALE1OSNXS		sys_insn(1, 0, 9, 1, 5)
+#define OP_TLBI_VAALE1OSNXS		sys_insn(1, 0, 9, 1, 7)
+#define OP_TLBI_RVAE1ISNXS		sys_insn(1, 0, 9, 2, 1)
+#define OP_TLBI_RVAAE1ISNXS		sys_insn(1, 0, 9, 2, 3)
+#define OP_TLBI_RVALE1ISNXS		sys_insn(1, 0, 9, 2, 5)
+#define OP_TLBI_RVAALE1ISNXS		sys_insn(1, 0, 9, 2, 7)
+#define OP_TLBI_VMALLE1ISNXS		sys_insn(1, 0, 9, 3, 0)
+#define OP_TLBI_VAE1ISNXS		sys_insn(1, 0, 9, 3, 1)
+#define OP_TLBI_ASIDE1ISNXS		sys_insn(1, 0, 9, 3, 2)
+#define OP_TLBI_VAAE1ISNXS		sys_insn(1, 0, 9, 3, 3)
+#define OP_TLBI_VALE1ISNXS		sys_insn(1, 0, 9, 3, 5)
+#define OP_TLBI_VAALE1ISNXS		sys_insn(1, 0, 9, 3, 7)
+#define OP_TLBI_RVAE1OSNXS		sys_insn(1, 0, 9, 5, 1)
+#define OP_TLBI_RVAAE1OSNXS		sys_insn(1, 0, 9, 5, 3)
+#define OP_TLBI_RVALE1OSNXS		sys_insn(1, 0, 9, 5, 5)
+#define OP_TLBI_RVAALE1OSNXS		sys_insn(1, 0, 9, 5, 7)
+#define OP_TLBI_RVAE1NXS		sys_insn(1, 0, 9, 6, 1)
+#define OP_TLBI_RVAAE1NXS		sys_insn(1, 0, 9, 6, 3)
+#define OP_TLBI_RVALE1NXS		sys_insn(1, 0, 9, 6, 5)
+#define OP_TLBI_RVAALE1NXS		sys_insn(1, 0, 9, 6, 7)
+#define OP_TLBI_VMALLE1NXS		sys_insn(1, 0, 9, 7, 0)
+#define OP_TLBI_VAE1NXS			sys_insn(1, 0, 9, 7, 1)
+#define OP_TLBI_ASIDE1NXS		sys_insn(1, 0, 9, 7, 2)
+#define OP_TLBI_VAAE1NXS		sys_insn(1, 0, 9, 7, 3)
+#define OP_TLBI_VALE1NXS		sys_insn(1, 0, 9, 7, 5)
+#define OP_TLBI_VAALE1NXS		sys_insn(1, 0, 9, 7, 7)
+#define OP_TLBI_IPAS2E1IS		sys_insn(1, 4, 8, 0, 1)
+#define OP_TLBI_RIPAS2E1IS		sys_insn(1, 4, 8, 0, 2)
+#define OP_TLBI_IPAS2LE1IS		sys_insn(1, 4, 8, 0, 5)
+#define OP_TLBI_RIPAS2LE1IS		sys_insn(1, 4, 8, 0, 6)
+#define OP_TLBI_ALLE2OS			sys_insn(1, 4, 8, 1, 0)
+#define OP_TLBI_VAE2OS			sys_insn(1, 4, 8, 1, 1)
+#define OP_TLBI_ALLE1OS			sys_insn(1, 4, 8, 1, 4)
+#define OP_TLBI_VALE2OS			sys_insn(1, 4, 8, 1, 5)
+#define OP_TLBI_VMALLS12E1OS		sys_insn(1, 4, 8, 1, 6)
+#define OP_TLBI_RVAE2IS			sys_insn(1, 4, 8, 2, 1)
+#define OP_TLBI_RVALE2IS		sys_insn(1, 4, 8, 2, 5)
+#define OP_TLBI_ALLE2IS			sys_insn(1, 4, 8, 3, 0)
+#define OP_TLBI_VAE2IS			sys_insn(1, 4, 8, 3, 1)
+#define OP_TLBI_ALLE1IS			sys_insn(1, 4, 8, 3, 4)
+#define OP_TLBI_VALE2IS			sys_insn(1, 4, 8, 3, 5)
+#define OP_TLBI_VMALLS12E1IS		sys_insn(1, 4, 8, 3, 6)
+#define OP_TLBI_IPAS2E1OS		sys_insn(1, 4, 8, 4, 0)
+#define OP_TLBI_IPAS2E1			sys_insn(1, 4, 8, 4, 1)
+#define OP_TLBI_RIPAS2E1		sys_insn(1, 4, 8, 4, 2)
+#define OP_TLBI_RIPAS2E1OS		sys_insn(1, 4, 8, 4, 3)
+#define OP_TLBI_IPAS2LE1OS		sys_insn(1, 4, 8, 4, 4)
+#define OP_TLBI_IPAS2LE1		sys_insn(1, 4, 8, 4, 5)
+#define OP_TLBI_RIPAS2LE1		sys_insn(1, 4, 8, 4, 6)
+#define OP_TLBI_RIPAS2LE1OS		sys_insn(1, 4, 8, 4, 7)
+#define OP_TLBI_RVAE2OS			sys_insn(1, 4, 8, 5, 1)
+#define OP_TLBI_RVALE2OS		sys_insn(1, 4, 8, 5, 5)
+#define OP_TLBI_RVAE2			sys_insn(1, 4, 8, 6, 1)
+#define OP_TLBI_RVALE2			sys_insn(1, 4, 8, 6, 5)
+#define OP_TLBI_ALLE2			sys_insn(1, 4, 8, 7, 0)
+#define OP_TLBI_VAE2			sys_insn(1, 4, 8, 7, 1)
+#define OP_TLBI_ALLE1			sys_insn(1, 4, 8, 7, 4)
+#define OP_TLBI_VALE2			sys_insn(1, 4, 8, 7, 5)
+#define OP_TLBI_VMALLS12E1		sys_insn(1, 4, 8, 7, 6)
+#define OP_TLBI_IPAS2E1ISNXS		sys_insn(1, 4, 9, 0, 1)
+#define OP_TLBI_RIPAS2E1ISNXS		sys_insn(1, 4, 9, 0, 2)
+#define OP_TLBI_IPAS2LE1ISNXS		sys_insn(1, 4, 9, 0, 5)
+#define OP_TLBI_RIPAS2LE1ISNXS		sys_insn(1, 4, 9, 0, 6)
+#define OP_TLBI_ALLE2OSNXS		sys_insn(1, 4, 9, 1, 0)
+#define OP_TLBI_VAE2OSNXS		sys_insn(1, 4, 9, 1, 1)
+#define OP_TLBI_ALLE1OSNXS		sys_insn(1, 4, 9, 1, 4)
+#define OP_TLBI_VALE2OSNXS		sys_insn(1, 4, 9, 1, 5)
+#define OP_TLBI_VMALLS12E1OSNXS		sys_insn(1, 4, 9, 1, 6)
+#define OP_TLBI_RVAE2ISNXS		sys_insn(1, 4, 9, 2, 1)
+#define OP_TLBI_RVALE2ISNXS		sys_insn(1, 4, 9, 2, 5)
+#define OP_TLBI_ALLE2ISNXS		sys_insn(1, 4, 9, 3, 0)
+#define OP_TLBI_VAE2ISNXS		sys_insn(1, 4, 9, 3, 1)
+#define OP_TLBI_ALLE1ISNXS		sys_insn(1, 4, 9, 3, 4)
+#define OP_TLBI_VALE2ISNXS		sys_insn(1, 4, 9, 3, 5)
+#define OP_TLBI_VMALLS12E1ISNXS		sys_insn(1, 4, 9, 3, 6)
+#define OP_TLBI_IPAS2E1OSNXS		sys_insn(1, 4, 9, 4, 0)
+#define OP_TLBI_IPAS2E1NXS		sys_insn(1, 4, 9, 4, 1)
+#define OP_TLBI_RIPAS2E1NXS		sys_insn(1, 4, 9, 4, 2)
+#define OP_TLBI_RIPAS2E1OSNXS		sys_insn(1, 4, 9, 4, 3)
+#define OP_TLBI_IPAS2LE1OSNXS		sys_insn(1, 4, 9, 4, 4)
+#define OP_TLBI_IPAS2LE1NXS		sys_insn(1, 4, 9, 4, 5)
+#define OP_TLBI_RIPAS2LE1NXS		sys_insn(1, 4, 9, 4, 6)
+#define OP_TLBI_RIPAS2LE1OSNXS		sys_insn(1, 4, 9, 4, 7)
+#define OP_TLBI_RVAE2OSNXS		sys_insn(1, 4, 9, 5, 1)
+#define OP_TLBI_RVALE2OSNXS		sys_insn(1, 4, 9, 5, 5)
+#define OP_TLBI_RVAE2NXS		sys_insn(1, 4, 9, 6, 1)
+#define OP_TLBI_RVALE2NXS		sys_insn(1, 4, 9, 6, 5)
+#define OP_TLBI_ALLE2NXS		sys_insn(1, 4, 9, 7, 0)
+#define OP_TLBI_VAE2NXS			sys_insn(1, 4, 9, 7, 1)
+#define OP_TLBI_ALLE1NXS		sys_insn(1, 4, 9, 7, 4)
+#define OP_TLBI_VALE2NXS		sys_insn(1, 4, 9, 7, 5)
+#define OP_TLBI_VMALLS12E1NXS		sys_insn(1, 4, 9, 7, 6)
+
+/* Misc instructions */
+#define OP_GCSPUSHX			sys_insn(1, 0, 7, 7, 4)
+#define OP_GCSPOPCX			sys_insn(1, 0, 7, 7, 5)
+#define OP_GCSPOPX			sys_insn(1, 0, 7, 7, 6)
+#define OP_GCSPUSHM			sys_insn(1, 3, 7, 7, 0)
+
+#define OP_BRB_IALL			sys_insn(1, 1, 7, 2, 4)
+#define OP_BRB_INJ			sys_insn(1, 1, 7, 2, 5)
+#define OP_CFP_RCTX			sys_insn(1, 3, 7, 3, 4)
+#define OP_DVP_RCTX			sys_insn(1, 3, 7, 3, 5)
+#define OP_COSP_RCTX			sys_insn(1, 3, 7, 3, 6)
+#define OP_CPP_RCTX			sys_insn(1, 3, 7, 3, 7)
+
 /* Common SCTLR_ELx flags. */
 #define SCTLR_ELx_ENTP2	(BIT(60))
 #define SCTLR_ELx_DSSBS	(BIT(44))
@@ -555,16 +898,14 @@
 /* Position the attr at the correct index */
 #define MAIR_ATTRIDX(attr, idx)		((attr) << ((idx) * 8))
 
-/* id_aa64pfr0 */
-#define ID_AA64PFR0_EL1_ELx_64BIT_ONLY		0x1
-#define ID_AA64PFR0_EL1_ELx_32BIT_64BIT		0x2
-
 /* id_aa64mmfr0 */
 #define ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MIN	0x0
+#define ID_AA64MMFR0_EL1_TGRAN4_LPA2		ID_AA64MMFR0_EL1_TGRAN4_52_BIT
 #define ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MAX	0x7
 #define ID_AA64MMFR0_EL1_TGRAN64_SUPPORTED_MIN	0x0
 #define ID_AA64MMFR0_EL1_TGRAN64_SUPPORTED_MAX	0x7
 #define ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MIN	0x1
+#define ID_AA64MMFR0_EL1_TGRAN16_LPA2		ID_AA64MMFR0_EL1_TGRAN16_52_BIT
 #define ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MAX	0xf
 
 #define ARM64_MIN_PARANGE_BITS		32
@@ -572,6 +913,7 @@
 #define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_DEFAULT	0x0
 #define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_NONE		0x1
 #define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_MIN		0x2
+#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2		0x3
 #define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_MAX		0x7
 
 #ifdef CONFIG_ARM64_PA_BITS_52
@@ -582,11 +924,13 @@
 
 #if defined(CONFIG_ARM64_4K_PAGES)
 #define ID_AA64MMFR0_EL1_TGRAN_SHIFT		ID_AA64MMFR0_EL1_TGRAN4_SHIFT
+#define ID_AA64MMFR0_EL1_TGRAN_LPA2		ID_AA64MMFR0_EL1_TGRAN4_52_BIT
 #define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MIN	ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MIN
 #define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MAX	ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MAX
 #define ID_AA64MMFR0_EL1_TGRAN_2_SHIFT		ID_AA64MMFR0_EL1_TGRAN4_2_SHIFT
 #elif defined(CONFIG_ARM64_16K_PAGES)
 #define ID_AA64MMFR0_EL1_TGRAN_SHIFT		ID_AA64MMFR0_EL1_TGRAN16_SHIFT
+#define ID_AA64MMFR0_EL1_TGRAN_LPA2		ID_AA64MMFR0_EL1_TGRAN16_52_BIT
 #define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MIN	ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MIN
 #define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MAX	ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MAX
 #define ID_AA64MMFR0_EL1_TGRAN_2_SHIFT		ID_AA64MMFR0_EL1_TGRAN16_2_SHIFT
@@ -610,6 +954,19 @@
 #define SYS_GCR_EL1_RRND	(BIT(16))
 #define SYS_GCR_EL1_EXCL_MASK	0xffffUL
 
+#ifdef CONFIG_KASAN_HW_TAGS
+/*
+ * KASAN always uses a whole byte for its tags. With CONFIG_KASAN_HW_TAGS it
+ * only uses tags in the range 0xF0-0xFF, which we map to MTE tags 0x0-0xF.
+ */
+#define __MTE_TAG_MIN		(KASAN_TAG_MIN & 0xf)
+#define __MTE_TAG_MAX		(KASAN_TAG_MAX & 0xf)
+#define __MTE_TAG_INCL		GENMASK(__MTE_TAG_MAX, __MTE_TAG_MIN)
+#define KERNEL_GCR_EL1_EXCL	(SYS_GCR_EL1_EXCL_MASK & ~__MTE_TAG_INCL)
+#else
+#define KERNEL_GCR_EL1_EXCL	SYS_GCR_EL1_EXCL_MASK
+#endif
+
 #define KERNEL_GCR_EL1		(SYS_GCR_EL1_RRND | KERNEL_GCR_EL1_EXCL)
 
 /* RGSR_EL1 Definitions */
@@ -716,6 +1073,22 @@
 
 #define PIRx_ELx_PERM(idx, perm)	((perm) << ((idx) * 4))
 
+/*
+ * Permission Overlay Extension (POE) permission encodings.
+ */
+#define POE_NONE	UL(0x0)
+#define POE_R		UL(0x1)
+#define POE_X		UL(0x2)
+#define POE_RX		UL(0x3)
+#define POE_W		UL(0x4)
+#define POE_RW		UL(0x5)
+#define POE_XW		UL(0x6)
+#define POE_RXW		UL(0x7)
+#define POE_MASK	UL(0xf)
+
+/* Initial value for Permission Overlay Extension for EL0 */
+#define POR_EL0_INIT	POE_RXW
+
 #define ARM64_FEATURE_FIELD_BITS	4
 
 /* Defined for compatibility only, do not add new users. */
@@ -789,15 +1162,21 @@
 /*
  * For registers without architectural names, or simply unsupported by
  * GAS.
+ *
+ * __check_r forces warnings to be generated by the compiler when
+ * evaluating r which wouldn't normally happen due to being passed to
+ * the assembler via __stringify(r).
  */
 #define read_sysreg_s(r) ({						\
 	u64 __val;							\
+	u32 __maybe_unused __check_r = (u32)(r);			\
 	asm volatile(__mrs_s("%0", r) : "=r" (__val));			\
 	__val;								\
 })
 
 #define write_sysreg_s(v, r) do {					\
 	u64 __val = (u64)(v);						\
+	u32 __maybe_unused __check_r = (u32)(r);			\
 	asm volatile(__msr_s(r, "%x0") : : "rZ" (__val));		\
 } while (0)
 
@@ -827,6 +1206,8 @@
 	par;								\
 })
 
+#define SYS_FIELD_VALUE(reg, field, val)	reg##_##field##_##val
+
 #define SYS_FIELD_GET(reg, field, val)		\
 		 FIELD_GET(reg##_##field##_MASK, val)
 
@@ -834,7 +1215,8 @@
 		 FIELD_PREP(reg##_##field##_MASK, val)
 
 #define SYS_FIELD_PREP_ENUM(reg, field, val)		\
-		 FIELD_PREP(reg##_##field##_MASK, reg##_##field##_##val)
+		 FIELD_PREP(reg##_##field##_MASK,	\
+			    SYS_FIELD_VALUE(reg, field, val))
 
 #endif
 
diff --git a/tools/include/linux/kasan-tags.h b/tools/include/linux/kasan-tags.h
new file mode 100644
index 000000000000..4f85f562512c
--- /dev/null
+++ b/tools/include/linux/kasan-tags.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_KASAN_TAGS_H
+#define _LINUX_KASAN_TAGS_H
+
+#define KASAN_TAG_KERNEL	0xFF /* native kernel pointers tag */
+#define KASAN_TAG_INVALID	0xFE /* inaccessible memory tag */
+#define KASAN_TAG_MAX		0xFD /* maximum value for random tags */
+
+#ifdef CONFIG_KASAN_HW_TAGS
+#define KASAN_TAG_MIN		0xF0 /* minimum value for random tags */
+#else
+#define KASAN_TAG_MIN		0x00 /* minimum value for random tags */
+#endif
+
+#endif /* LINUX_KASAN_TAGS_H */
-- 
2.34.1




* [PATCH v8 5/8] arm64/sysreg/tools: Move TRFCR definitions to sysreg
  2024-11-27 10:01 [PATCH v8 0/8] kvm/coresight: Support exclude guest and exclude host James Clark
                   ` (3 preceding siblings ...)
  2024-11-27 10:01 ` [PATCH v8 4/8] tools: arm64: Update sysreg.h header files James Clark
@ 2024-11-27 10:01 ` James Clark
  2024-11-27 10:01 ` [PATCH v8 6/8] KVM: arm64: coresight: Give TRBE enabled state to KVM James Clark
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2024-11-27 10:01 UTC (permalink / raw)
  To: maz, kvmarm, oliver.upton, suzuki.poulose, coresight
  Cc: James Clark, Mark Brown, James Clark, Joey Gouly, Zenghui Yu,
	Catalin Marinas, Will Deacon, Mike Leach, Alexander Shishkin,
	Anshuman Khandual, Fuad Tabba, Shiqi Liu, James Morse,
	Raghavendra Rao Ananta, linux-arm-kernel, linux-kernel

From: James Clark <james.clark@arm.com>

Convert TRFCR to automatic generation. Add separate definitions for ELx
and EL2 as TRFCR_EL1 doesn't have CX. This also mirrors the previous
definition so no code change is required.

Also add TRFCR_EL12 which will start to be used in a later commit.

Unfortunately, to avoid breaking the Perf build with duplicate
definition errors, the tools copy of the sysreg.h header needs to be
updated at the same time rather than the usual second commit. This is
because the generated version of sysreg
(arch/arm64/include/generated/asm/sysreg-defs.h), is currently shared
and tools/ does not have its own copy.
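
With the generated definitions, TRFCR values can be assembled with the
generic field helpers already in sysreg.h, e.g. (an illustrative
sketch, not code from this patch):

  u64 trfcr = TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE |
  	    SYS_FIELD_PREP_ENUM(TRFCR_ELx, TS, VIRTUAL);

  /* from VHE EL2, program the guest's view */
  write_sysreg_s(trfcr, SYS_TRFCR_EL12);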

Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
---
 arch/arm64/include/asm/sysreg.h       | 12 ---------
 arch/arm64/tools/sysreg               | 36 +++++++++++++++++++++++++++
 tools/arch/arm64/include/asm/sysreg.h | 12 ---------
 3 files changed, 36 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 345e81e0d2b3..150416682e2c 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -283,8 +283,6 @@
 #define SYS_RGSR_EL1			sys_reg(3, 0, 1, 0, 5)
 #define SYS_GCR_EL1			sys_reg(3, 0, 1, 0, 6)
 
-#define SYS_TRFCR_EL1			sys_reg(3, 0, 1, 2, 1)
-
 #define SYS_TCR_EL1			sys_reg(3, 0, 2, 0, 2)
 
 #define SYS_APIAKEYLO_EL1		sys_reg(3, 0, 2, 1, 0)
@@ -519,7 +517,6 @@
 #define SYS_VTTBR_EL2			sys_reg(3, 4, 2, 1, 0)
 #define SYS_VTCR_EL2			sys_reg(3, 4, 2, 1, 2)
 
-#define SYS_TRFCR_EL2			sys_reg(3, 4, 1, 2, 1)
 #define SYS_VNCR_EL2			sys_reg(3, 4, 2, 2, 0)
 #define SYS_HAFGRTR_EL2			sys_reg(3, 4, 3, 1, 6)
 #define SYS_SPSR_EL2			sys_reg(3, 4, 4, 0, 0)
@@ -983,15 +980,6 @@
 /* Safe value for MPIDR_EL1: Bit31:RES1, Bit30:U:0, Bit24:MT:0 */
 #define SYS_MPIDR_SAFE_VAL	(BIT(31))
 
-#define TRFCR_ELx_TS_SHIFT		5
-#define TRFCR_ELx_TS_MASK		((0x3UL) << TRFCR_ELx_TS_SHIFT)
-#define TRFCR_ELx_TS_VIRTUAL		((0x1UL) << TRFCR_ELx_TS_SHIFT)
-#define TRFCR_ELx_TS_GUEST_PHYSICAL	((0x2UL) << TRFCR_ELx_TS_SHIFT)
-#define TRFCR_ELx_TS_PHYSICAL		((0x3UL) << TRFCR_ELx_TS_SHIFT)
-#define TRFCR_EL2_CX			BIT(3)
-#define TRFCR_ELx_ExTRE			BIT(1)
-#define TRFCR_ELx_E0TRE			BIT(0)
-
 /* GIC Hypervisor interface registers */
 /* ICH_MISR_EL2 bit definitions */
 #define ICH_MISR_EOI		(1 << 0)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index a26c0da0c42d..27a7afd5329a 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1994,6 +1994,22 @@ Sysreg	CPACR_EL1	3	0	1	0	2
 Fields	CPACR_ELx
 EndSysreg
 
+SysregFields TRFCR_ELx
+Res0	63:7
+UnsignedEnum	6:5	TS
+	0b0001	VIRTUAL
+	0b0010	GUEST_PHYSICAL
+	0b0011	PHYSICAL
+EndEnum
+Res0	4:2
+Field	1	ExTRE
+Field	0	E0TRE
+EndSysregFields
+
+Sysreg	TRFCR_EL1	3	0	1	2	1
+Fields	TRFCR_ELx
+EndSysreg
+
 Sysreg	SMPRI_EL1	3	0	1	2	4
 Res0	63:4
 Field	3:0	PRIORITY
@@ -2536,6 +2552,22 @@ Field	1	ICIALLU
 Field	0	ICIALLUIS
 EndSysreg
 
+Sysreg TRFCR_EL2	3	4	1	2	1
+Res0	63:7
+UnsignedEnum	6:5	TS
+	0b0000	USE_TRFCR_EL1_TS
+	0b0001	VIRTUAL
+	0b0010	GUEST_PHYSICAL
+	0b0011	PHYSICAL
+EndEnum
+Res0	4
+Field	3	CX
+Res0	2
+Field	1	E2TRE
+Field	0	E0HTRE
+EndSysreg
+
+
 Sysreg HDFGRTR_EL2	3	4	3	1	4
 Field	63	PMBIDR_EL1
 Field	62	nPMSNEVFR_EL1
@@ -2946,6 +2978,10 @@ Sysreg	ZCR_EL12	3	5	1	2	0
 Fields	ZCR_ELx
 EndSysreg
 
+Sysreg	TRFCR_EL12	3	5	1	2	1
+Fields	TRFCR_ELx
+EndSysreg
+
 Sysreg	SMCR_EL12	3	5	1	2	6
 Fields	SMCR_ELx
 EndSysreg
diff --git a/tools/arch/arm64/include/asm/sysreg.h b/tools/arch/arm64/include/asm/sysreg.h
index 345e81e0d2b3..150416682e2c 100644
--- a/tools/arch/arm64/include/asm/sysreg.h
+++ b/tools/arch/arm64/include/asm/sysreg.h
@@ -283,8 +283,6 @@
 #define SYS_RGSR_EL1			sys_reg(3, 0, 1, 0, 5)
 #define SYS_GCR_EL1			sys_reg(3, 0, 1, 0, 6)
 
-#define SYS_TRFCR_EL1			sys_reg(3, 0, 1, 2, 1)
-
 #define SYS_TCR_EL1			sys_reg(3, 0, 2, 0, 2)
 
 #define SYS_APIAKEYLO_EL1		sys_reg(3, 0, 2, 1, 0)
@@ -519,7 +517,6 @@
 #define SYS_VTTBR_EL2			sys_reg(3, 4, 2, 1, 0)
 #define SYS_VTCR_EL2			sys_reg(3, 4, 2, 1, 2)
 
-#define SYS_TRFCR_EL2			sys_reg(3, 4, 1, 2, 1)
 #define SYS_VNCR_EL2			sys_reg(3, 4, 2, 2, 0)
 #define SYS_HAFGRTR_EL2			sys_reg(3, 4, 3, 1, 6)
 #define SYS_SPSR_EL2			sys_reg(3, 4, 4, 0, 0)
@@ -983,15 +980,6 @@
 /* Safe value for MPIDR_EL1: Bit31:RES1, Bit30:U:0, Bit24:MT:0 */
 #define SYS_MPIDR_SAFE_VAL	(BIT(31))
 
-#define TRFCR_ELx_TS_SHIFT		5
-#define TRFCR_ELx_TS_MASK		((0x3UL) << TRFCR_ELx_TS_SHIFT)
-#define TRFCR_ELx_TS_VIRTUAL		((0x1UL) << TRFCR_ELx_TS_SHIFT)
-#define TRFCR_ELx_TS_GUEST_PHYSICAL	((0x2UL) << TRFCR_ELx_TS_SHIFT)
-#define TRFCR_ELx_TS_PHYSICAL		((0x3UL) << TRFCR_ELx_TS_SHIFT)
-#define TRFCR_EL2_CX			BIT(3)
-#define TRFCR_ELx_ExTRE			BIT(1)
-#define TRFCR_ELx_E0TRE			BIT(0)
-
 /* GIC Hypervisor interface registers */
 /* ICH_MISR_EL2 bit definitions */
 #define ICH_MISR_EOI		(1 << 0)
-- 
2.34.1




* [PATCH v8 6/8] KVM: arm64: coresight: Give TRBE enabled state to KVM
  2024-11-27 10:01 [PATCH v8 0/8] kvm/coresight: Support exclude guest and exclude host James Clark
                   ` (4 preceding siblings ...)
  2024-11-27 10:01 ` [PATCH v8 5/8] arm64/sysreg/tools: Move TRFCR definitions to sysreg James Clark
@ 2024-11-27 10:01 ` James Clark
  2024-12-20 17:05   ` Marc Zyngier
  2024-11-27 10:01 ` [PATCH v8 7/8] KVM: arm64: Support trace filtering for guests James Clark
  2024-11-27 10:01 ` [PATCH v8 8/8] coresight: Pass guest TRFCR value to KVM James Clark
  7 siblings, 1 reply; 16+ messages in thread
From: James Clark @ 2024-11-27 10:01 UTC (permalink / raw)
  To: maz, kvmarm, oliver.upton, suzuki.poulose, coresight
  Cc: James Clark, Joey Gouly, Zenghui Yu, Catalin Marinas, Will Deacon,
	Mike Leach, Alexander Shishkin, Mark Brown, Anshuman Khandual,
	Rob Herring (Arm), Shiqi Liu, Fuad Tabba, James Morse,
	Raghavendra Rao Ananta, linux-arm-kernel, linux-kernel

Currently in nVHE, KVM has to check if TRBE is enabled on every guest
switch even if it was never used. Because it's a debug feature and is
more likely to not be used than used, give KVM the TRBE buffer status to
allow a much simpler and faster do-nothing path in the hyp.

This is always called with preemption disabled except for probe/hotplug
which gets wrapped with preempt_disable().

Protected mode disables trace regardless of TRBE (because
guest_trfcr_el1 is always 0), which was not previously done. HAS_TRBE
becomes redundant, but HAS_TRF is now required for this.
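
As an illustration (mirroring the hunks below, not additional code),
the intended call pattern from the TRBE driver is:

	/* enable path */
	trblimitr |= TRBLIMITR_EL1_E;
	write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
	kvm_enable_trbe();	/* hyp now swaps TRFCR on guest switch */

	/* disable path */
	trblimitr &= ~TRBLIMITR_EL1_E;
	write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
	kvm_disable_trbe();	/* hyp fast path does nothing again */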

Signed-off-by: James Clark <james.clark@linaro.org>
---
 arch/arm64/include/asm/kvm_host.h            | 10 +++-
 arch/arm64/kvm/debug.c                       | 25 ++++++++--
 arch/arm64/kvm/hyp/nvhe/debug-sr.c           | 51 +++++++++++---------
 drivers/hwtracing/coresight/coresight-trbe.c |  5 ++
 4 files changed, 65 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 7e3478386351..ba251caa593b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -611,7 +611,8 @@ struct cpu_sve_state {
  */
 struct kvm_host_data {
 #define KVM_HOST_DATA_FLAG_HAS_SPE	0
-#define KVM_HOST_DATA_FLAG_HAS_TRBE	1
+#define KVM_HOST_DATA_FLAG_HAS_TRF	1
+#define KVM_HOST_DATA_FLAG_TRBE_ENABLED	2
 	unsigned long flags;
 
 	struct kvm_cpu_context host_ctxt;
@@ -657,6 +658,9 @@ struct kvm_host_data {
 		u64 mdcr_el2;
 	} host_debug_state;
 
+	/* Guest trace filter value */
+	u64 guest_trfcr_el1;
+
 	/* Number of programmable event counters (PMCR_EL0.N) for this CPU */
 	unsigned int nr_event_counters;
 };
@@ -1381,6 +1385,8 @@ static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
 void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr);
 void kvm_clr_pmu_events(u64 clr);
 bool kvm_set_pmuserenr(u64 val);
+void kvm_enable_trbe(void);
+void kvm_disable_trbe(void);
 #else
 static inline void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr) {}
 static inline void kvm_clr_pmu_events(u64 clr) {}
@@ -1388,6 +1394,8 @@ static inline bool kvm_set_pmuserenr(u64 val)
 {
 	return false;
 }
+static inline void kvm_enable_trbe(void) {}
+static inline void kvm_disable_trbe(void) {}
 #endif
 
 void kvm_vcpu_load_vhe(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index dd9e139dfd13..0c340ae7b5d1 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -314,7 +314,26 @@ void kvm_init_host_debug_data(void)
 	    !(read_sysreg_s(SYS_PMBIDR_EL1) & PMBIDR_EL1_P))
 		host_data_set_flag(HAS_SPE);
 
-	if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceBuffer_SHIFT) &&
-	    !(read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_EL1_P))
-		host_data_set_flag(HAS_TRBE);
+	if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceFilt_SHIFT))
+		host_data_set_flag(HAS_TRF);
 }
+
+void kvm_enable_trbe(void)
+{
+	if (has_vhe() || is_protected_kvm_enabled() ||
+	    WARN_ON_ONCE(preemptible()))
+		return;
+
+	host_data_set_flag(TRBE_ENABLED);
+}
+EXPORT_SYMBOL_GPL(kvm_enable_trbe);
+
+void kvm_disable_trbe(void)
+{
+	if (has_vhe() || is_protected_kvm_enabled() ||
+	    WARN_ON_ONCE(preemptible()))
+		return;
+
+	host_data_clear_flag(TRBE_ENABLED);
+}
+EXPORT_SYMBOL_GPL(kvm_disable_trbe);
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
index 858bb38e273f..9479bee41801 100644
--- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
@@ -51,32 +51,39 @@ static void __debug_restore_spe(u64 pmscr_el1)
 	write_sysreg_el1(pmscr_el1, SYS_PMSCR);
 }
 
-static void __debug_save_trace(u64 *trfcr_el1)
+static void __trace_do_switch(u64 *saved_trfcr, u64 new_trfcr)
 {
-	*trfcr_el1 = 0;
+	*saved_trfcr = read_sysreg_el1(SYS_TRFCR);
+	write_sysreg_el1(new_trfcr, SYS_TRFCR);
 
-	/* Check if the TRBE is enabled */
-	if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_EL1_E))
+	/* No need to drain if going to an enabled state or from disabled state */
+	if (new_trfcr || !*saved_trfcr)
 		return;
-	/*
-	 * Prohibit trace generation while we are in guest.
-	 * Since access to TRFCR_EL1 is trapped, the guest can't
-	 * modify the filtering set by the host.
-	 */
-	*trfcr_el1 = read_sysreg_el1(SYS_TRFCR);
-	write_sysreg_el1(0, SYS_TRFCR);
+
 	isb();
-	/* Drain the trace buffer to memory */
 	tsb_csync();
 }
 
-static void __debug_restore_trace(u64 trfcr_el1)
+static bool __trace_needs_switch(void)
 {
-	if (!trfcr_el1)
-		return;
+	return host_data_test_flag(TRBE_ENABLED) ||
+	       (is_protected_kvm_enabled() && host_data_test_flag(HAS_TRF));
+}
 
-	/* Restore trace filter controls */
-	write_sysreg_el1(trfcr_el1, SYS_TRFCR);
+static void __trace_switch_to_guest(void)
+{
+	/* Unsupported with TRBE so disable */
+	if (host_data_test_flag(TRBE_ENABLED))
+		*host_data_ptr(guest_trfcr_el1) = 0;
+
+	__trace_do_switch(host_data_ptr(host_debug_state.trfcr_el1),
+			  *host_data_ptr(guest_trfcr_el1));
+}
+
+static void __trace_switch_to_host(void)
+{
+	__trace_do_switch(host_data_ptr(guest_trfcr_el1),
+			  *host_data_ptr(host_debug_state.trfcr_el1));
 }
 
 void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
@@ -84,9 +91,9 @@ void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
 	/* Disable and flush SPE data generation */
 	if (host_data_test_flag(HAS_SPE))
 		__debug_save_spe(host_data_ptr(host_debug_state.pmscr_el1));
-	/* Disable and flush Self-Hosted Trace generation */
-	if (host_data_test_flag(HAS_TRBE))
-		__debug_save_trace(host_data_ptr(host_debug_state.trfcr_el1));
+
+	if (__trace_needs_switch())
+		__trace_switch_to_guest();
 }
 
 void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
@@ -98,8 +105,8 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu)
 {
 	if (host_data_test_flag(HAS_SPE))
 		__debug_restore_spe(*host_data_ptr(host_debug_state.pmscr_el1));
-	if (host_data_test_flag(HAS_TRBE))
-		__debug_restore_trace(*host_data_ptr(host_debug_state.trfcr_el1));
+	if (__trace_needs_switch())
+		__trace_switch_to_host();
 }
 
 void __debug_switch_to_host(struct kvm_vcpu *vcpu)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 96a32b213669..9c0f8c43e6fe 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -18,6 +18,7 @@
 #include <asm/barrier.h>
 #include <asm/cpufeature.h>
 #include <linux/vmalloc.h>
+#include <linux/kvm_host.h>
 
 #include "coresight-self-hosted-trace.h"
 #include "coresight-trbe.h"
@@ -221,6 +222,7 @@ static inline void set_trbe_enabled(struct trbe_cpudata *cpudata, u64 trblimitr)
 	 */
 	trblimitr |= TRBLIMITR_EL1_E;
 	write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
+	kvm_enable_trbe();
 
 	/* Synchronize the TRBE enable event */
 	isb();
@@ -239,6 +241,7 @@ static inline void set_trbe_disabled(struct trbe_cpudata *cpudata)
 	 */
 	trblimitr &= ~TRBLIMITR_EL1_E;
 	write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
+	kvm_disable_trbe();
 
 	if (trbe_needs_drain_after_disable(cpudata))
 		trbe_drain_buffer();
@@ -253,8 +256,10 @@ static void trbe_drain_and_disable_local(struct trbe_cpudata *cpudata)
 
 static void trbe_reset_local(struct trbe_cpudata *cpudata)
 {
+	preempt_disable();
 	trbe_drain_and_disable_local(cpudata);
 	write_sysreg_s(0, SYS_TRBLIMITR_EL1);
+	preempt_enable();
 	write_sysreg_s(0, SYS_TRBPTR_EL1);
 	write_sysreg_s(0, SYS_TRBBASER_EL1);
 	write_sysreg_s(0, SYS_TRBSR_EL1);
-- 
2.34.1




* [PATCH v8 7/8] KVM: arm64: Support trace filtering for guests
  2024-11-27 10:01 [PATCH v8 0/8] kvm/coresight: Support exclude guest and exclude host James Clark
                   ` (5 preceding siblings ...)
  2024-11-27 10:01 ` [PATCH v8 6/8] KVM: arm64: coresight: Give TRBE enabled state to KVM James Clark
@ 2024-11-27 10:01 ` James Clark
  2024-12-21 12:34   ` Marc Zyngier
  2024-11-27 10:01 ` [PATCH v8 8/8] coresight: Pass guest TRFCR value to KVM James Clark
  7 siblings, 1 reply; 16+ messages in thread
From: James Clark @ 2024-11-27 10:01 UTC (permalink / raw)
  To: maz, kvmarm, oliver.upton, suzuki.poulose, coresight
  Cc: James Clark, Joey Gouly, Zenghui Yu, Catalin Marinas, Will Deacon,
	Mike Leach, Alexander Shishkin, Mark Brown, Anshuman Khandual,
	James Morse, Rob Herring (Arm), Shiqi Liu, Fuad Tabba,
	Raghavendra Rao Ananta, linux-arm-kernel, linux-kernel

For nVHE, switch the filter value in and out if the Coresight driver
asks for it. This will support filters for guests when sinks other than
TRBE are used.

For VHE, just write the filter directly to TRFCR_EL1 where trace can be
used even with TRBE sinks.

Signed-off-by: James Clark <james.clark@linaro.org>
---
 arch/arm64/include/asm/kvm_host.h  |  5 +++++
 arch/arm64/kvm/debug.c             | 28 ++++++++++++++++++++++++++++
 arch/arm64/kvm/hyp/nvhe/debug-sr.c |  1 +
 3 files changed, 34 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index ba251caa593b..cce07887551b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -613,6 +613,7 @@ struct kvm_host_data {
 #define KVM_HOST_DATA_FLAG_HAS_SPE	0
 #define KVM_HOST_DATA_FLAG_HAS_TRF	1
 #define KVM_HOST_DATA_FLAG_TRBE_ENABLED	2
+#define KVM_HOST_DATA_FLAG_GUEST_FILTER	3
 	unsigned long flags;
 
 	struct kvm_cpu_context host_ctxt;
@@ -1387,6 +1388,8 @@ void kvm_clr_pmu_events(u64 clr);
 bool kvm_set_pmuserenr(u64 val);
 void kvm_enable_trbe(void);
 void kvm_disable_trbe(void);
+void kvm_set_trfcr(u64 guest_trfcr);
+void kvm_clear_trfcr(void);
 #else
 static inline void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr) {}
 static inline void kvm_clr_pmu_events(u64 clr) {}
@@ -1396,6 +1399,8 @@ static inline bool kvm_set_pmuserenr(u64 val)
 }
 static inline void kvm_enable_trbe(void) {}
 static inline void kvm_disable_trbe(void) {}
+static inline void kvm_set_trfcr(u64 guest_trfcr) {}
+static inline void kvm_clear_trfcr(void) {}
 #endif
 
 void kvm_vcpu_load_vhe(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index 0c340ae7b5d1..9266f2776991 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -337,3 +337,31 @@ void kvm_disable_trbe(void)
 	host_data_clear_flag(TRBE_ENABLED);
 }
 EXPORT_SYMBOL_GPL(kvm_disable_trbe);
+
+void kvm_set_trfcr(u64 guest_trfcr)
+{
+	if (is_protected_kvm_enabled() || WARN_ON_ONCE(preemptible()))
+		return;
+
+	if (has_vhe())
+		write_sysreg_s(guest_trfcr, SYS_TRFCR_EL12);
+	else {
+		*host_data_ptr(guest_trfcr_el1) = guest_trfcr;
+		host_data_set_flag(GUEST_FILTER);
+	}
+}
+EXPORT_SYMBOL_GPL(kvm_set_trfcr);
+
+void kvm_clear_trfcr(void)
+{
+	if (is_protected_kvm_enabled() || WARN_ON_ONCE(preemptible()))
+		return;
+
+	if (has_vhe())
+		write_sysreg_s(0, SYS_TRFCR_EL12);
+	else {
+		*host_data_ptr(guest_trfcr_el1) = 0;
+		host_data_clear_flag(GUEST_FILTER);
+	}
+}
+EXPORT_SYMBOL_GPL(kvm_clear_trfcr);
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
index 9479bee41801..7edee7ace433 100644
--- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
@@ -67,6 +67,7 @@ static void __trace_do_switch(u64 *saved_trfcr, u64 new_trfcr)
 static bool __trace_needs_switch(void)
 {
 	return host_data_test_flag(TRBE_ENABLED) ||
+	       host_data_test_flag(GUEST_FILTER) ||
 	       (is_protected_kvm_enabled() && host_data_test_flag(HAS_TRF));
 }
 
-- 
2.34.1




* [PATCH v8 8/8] coresight: Pass guest TRFCR value to KVM
  2024-11-27 10:01 [PATCH v8 0/8] kvm/coresight: Support exclude guest and exclude host James Clark
                   ` (6 preceding siblings ...)
  2024-11-27 10:01 ` [PATCH v8 7/8] KVM: arm64: Support trace filtering for guests James Clark
@ 2024-11-27 10:01 ` James Clark
  7 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2024-11-27 10:01 UTC (permalink / raw)
  To: maz, kvmarm, oliver.upton, suzuki.poulose, coresight
  Cc: James Clark, Joey Gouly, Zenghui Yu, Catalin Marinas, Will Deacon,
	Mike Leach, James Clark, Alexander Shishkin, Anshuman Khandual,
	Rob Herring (Arm), James Morse, Shiqi Liu, Fuad Tabba, Mark Brown,
	Raghavendra Rao Ananta, linux-arm-kernel, linux-kernel

From: James Clark <james.clark@arm.com>

Currently the userspace and kernel filters for guests are never set, so
no trace will be generated for them. Add support for tracing guests by
passing the desired TRFCR value to KVM so it can be applied to the
guest.

By writing either E1TRE or E0TRE, filtering on either guest kernel or
guest userspace is also supported. If both E1TRE and E0TRE are cleared
when exclude_guest is set, that option is supported too. This change
also brings exclude_host support, which would be difficult to add as a
separate commit without excess churn or an intermediate state that
generates no trace at all.

Testing
=======

The addresses were counted with the following:

  $ perf report -D | grep -Eo 'EL2|EL1|EL0' | sort | uniq -c

Guest kernel only:

  $ perf record -e cs_etm//Gk -a -- true
    535 EL1
      1 EL2

Guest user only (only 5 addresses because the guest runs slowly in the
model):

  $ perf record -e cs_etm//Gu -a -- true
    5 EL0

Host kernel only:

  $  perf record -e cs_etm//Hk -a -- true
   3501 EL2

Host userspace only:

  $  perf record -e cs_etm//Hu -a -- true
    408 EL0
      1 EL2

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
---
 .../coresight/coresight-etm4x-core.c          | 43 ++++++++++++++++---
 drivers/hwtracing/coresight/coresight-etm4x.h |  2 +-
 drivers/hwtracing/coresight/coresight-priv.h  |  3 ++
 3 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index 66d44a404ad0..afe56e1b9364 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -6,6 +6,7 @@
 #include <linux/acpi.h>
 #include <linux/bitops.h>
 #include <linux/kernel.h>
+#include <linux/kvm_host.h>
 #include <linux/moduleparam.h>
 #include <linux/init.h>
 #include <linux/types.h>
@@ -271,9 +272,23 @@ static void etm4x_prohibit_trace(struct etmv4_drvdata *drvdata)
 	/* If the CPU doesn't support FEAT_TRF, nothing to do */
 	if (!drvdata->trfcr)
 		return;
+
+	kvm_clear_trfcr();
 	cpu_prohibit_trace();
 }
 
+static u64 etm4x_get_kern_user_filter(struct etmv4_drvdata *drvdata)
+{
+	u64 trfcr = drvdata->trfcr;
+
+	if (drvdata->config.mode & ETM_MODE_EXCL_KERN)
+		trfcr &= ~TRFCR_ELx_ExTRE;
+	if (drvdata->config.mode & ETM_MODE_EXCL_USER)
+		trfcr &= ~TRFCR_ELx_E0TRE;
+
+	return trfcr;
+}
+
 /*
  * etm4x_allow_trace - Allow CPU tracing in the respective ELs,
  * as configured by the drvdata->config.mode for the current
@@ -286,18 +301,28 @@ static void etm4x_prohibit_trace(struct etmv4_drvdata *drvdata)
  */
 static void etm4x_allow_trace(struct etmv4_drvdata *drvdata)
 {
-	u64 trfcr = drvdata->trfcr;
+	u64 trfcr, guest_trfcr;
 
 	/* If the CPU doesn't support FEAT_TRF, nothing to do */
-	if (!trfcr)
+	if (!drvdata->trfcr)
 		return;
 
-	if (drvdata->config.mode & ETM_MODE_EXCL_KERN)
-		trfcr &= ~TRFCR_ELx_ExTRE;
-	if (drvdata->config.mode & ETM_MODE_EXCL_USER)
-		trfcr &= ~TRFCR_ELx_E0TRE;
+	if (drvdata->config.mode & ETM_MODE_EXCL_HOST)
+		trfcr = drvdata->trfcr & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE);
+	else
+		trfcr = etm4x_get_kern_user_filter(drvdata);
 
 	write_trfcr(trfcr);
+
+	/* Set filters for guests and pass to KVM */
+	if (drvdata->config.mode & ETM_MODE_EXCL_GUEST)
+		guest_trfcr = drvdata->trfcr & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE);
+	else
+		guest_trfcr = etm4x_get_kern_user_filter(drvdata);
+
+	/* TRFCR_EL1 doesn't have CX so mask it out. */
+	guest_trfcr &= ~TRFCR_EL2_CX;
+	kvm_set_trfcr(guest_trfcr);
 }
 
 #ifdef CONFIG_ETM4X_IMPDEF_FEATURE
@@ -655,6 +680,12 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
 	if (attr->exclude_user)
 		config->mode = ETM_MODE_EXCL_USER;
 
+	if (attr->exclude_host)
+		config->mode |= ETM_MODE_EXCL_HOST;
+
+	if (attr->exclude_guest)
+		config->mode |= ETM_MODE_EXCL_GUEST;
+
 	/* Always start from the default config */
 	etm4_set_default_config(config);
 
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
index 9e9165f62e81..1119762b5cec 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.h
+++ b/drivers/hwtracing/coresight/coresight-etm4x.h
@@ -817,7 +817,7 @@ enum etm_impdef_type {
  * @s_ex_level: Secure ELs where tracing is supported.
  */
 struct etmv4_config {
-	u32				mode;
+	u64				mode;
 	u32				pe_sel;
 	u32				cfg;
 	u32				eventctrl0;
diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
index 05f891ca6b5c..76403530f33e 100644
--- a/drivers/hwtracing/coresight/coresight-priv.h
+++ b/drivers/hwtracing/coresight/coresight-priv.h
@@ -42,6 +42,9 @@ extern const struct device_type coresight_dev_type[];
 
 #define ETM_MODE_EXCL_KERN	BIT(30)
 #define ETM_MODE_EXCL_USER	BIT(31)
+#define ETM_MODE_EXCL_HOST	BIT(32)
+#define ETM_MODE_EXCL_GUEST	BIT(33)
+
 struct cs_pair_attribute {
 	struct device_attribute attr;
 	u32 lo_off;
-- 
2.34.1




* Re: [PATCH v8 6/8] KVM: arm64: coresight: Give TRBE enabled state to KVM
  2024-11-27 10:01 ` [PATCH v8 6/8] KVM: arm64: coresight: Give TRBE enabled state to KVM James Clark
@ 2024-12-20 17:05   ` Marc Zyngier
  2024-12-20 17:32     ` James Clark
  0 siblings, 1 reply; 16+ messages in thread
From: Marc Zyngier @ 2024-12-20 17:05 UTC (permalink / raw)
  To: James Clark
  Cc: kvmarm, oliver.upton, suzuki.poulose, coresight, Joey Gouly,
	Zenghui Yu, Catalin Marinas, Will Deacon, Mike Leach,
	Alexander Shishkin, Mark Brown, Anshuman Khandual,
	Rob Herring (Arm), Shiqi Liu, Fuad Tabba, James Morse,
	Raghavendra Rao Ananta, linux-arm-kernel, linux-kernel

On Wed, 27 Nov 2024 10:01:23 +0000,
James Clark <james.clark@linaro.org> wrote:
> 
> Currently in nVHE, KVM has to check if TRBE is enabled on every guest
> switch even if it was never used. Because it's a debug feature and is
> more likely to not be used than used, give KVM the TRBE buffer status to
> allow a much simpler and faster do-nothing path in the hyp.
> 
> This is always called with preemption disabled except for probe/hotplug
> which gets wrapped with preempt_disable().
> 
> Protected mode disables trace regardless of TRBE (because
> guest_trfcr_el1 is always 0), which was not previously done. HAS_TRBE
> becomes redundant, but HAS_TRF is now required for this.
> 
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
>  arch/arm64/include/asm/kvm_host.h            | 10 +++-
>  arch/arm64/kvm/debug.c                       | 25 ++++++++--
>  arch/arm64/kvm/hyp/nvhe/debug-sr.c           | 51 +++++++++++---------
>  drivers/hwtracing/coresight/coresight-trbe.c |  5 ++
>  4 files changed, 65 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 7e3478386351..ba251caa593b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -611,7 +611,8 @@ struct cpu_sve_state {
>   */
>  struct kvm_host_data {
>  #define KVM_HOST_DATA_FLAG_HAS_SPE	0
> -#define KVM_HOST_DATA_FLAG_HAS_TRBE	1
> +#define KVM_HOST_DATA_FLAG_HAS_TRF	1
> +#define KVM_HOST_DATA_FLAG_TRBE_ENABLED	2
>  	unsigned long flags;
>  
>  	struct kvm_cpu_context host_ctxt;
> @@ -657,6 +658,9 @@ struct kvm_host_data {
>  		u64 mdcr_el2;
>  	} host_debug_state;
>  
> +	/* Guest trace filter value */
> +	u64 guest_trfcr_el1;

Guest value? Or host state while running the guest? If the former,
then this has nothing to do here. If the latter, this should be
spelled out (trfcr_in_guest?), and the comment amended.

> +
>  	/* Number of programmable event counters (PMCR_EL0.N) for this CPU */
>  	unsigned int nr_event_counters;
>  };
> @@ -1381,6 +1385,8 @@ static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
>  void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr);
>  void kvm_clr_pmu_events(u64 clr);
>  bool kvm_set_pmuserenr(u64 val);
> +void kvm_enable_trbe(void);
> +void kvm_disable_trbe(void);
>  #else
>  static inline void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr) {}
>  static inline void kvm_clr_pmu_events(u64 clr) {}
> @@ -1388,6 +1394,8 @@ static inline bool kvm_set_pmuserenr(u64 val)
>  {
>  	return false;
>  }
> +static inline void kvm_enable_trbe(void) {}
> +static inline void kvm_disable_trbe(void) {}
>  #endif
>  
>  void kvm_vcpu_load_vhe(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index dd9e139dfd13..0c340ae7b5d1 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -314,7 +314,26 @@ void kvm_init_host_debug_data(void)
>  	    !(read_sysreg_s(SYS_PMBIDR_EL1) & PMBIDR_EL1_P))
>  		host_data_set_flag(HAS_SPE);
>  
> -	if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceBuffer_SHIFT) &&
> -	    !(read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_EL1_P))
> -		host_data_set_flag(HAS_TRBE);
> +	if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceFilt_SHIFT))
> +		host_data_set_flag(HAS_TRF);
>  }
> +
> +void kvm_enable_trbe(void)
> +{
> +	if (has_vhe() || is_protected_kvm_enabled() ||
> +	    WARN_ON_ONCE(preemptible()))
> +		return;
> +
> +	host_data_set_flag(TRBE_ENABLED);
> +}
> +EXPORT_SYMBOL_GPL(kvm_enable_trbe);
> +
> +void kvm_disable_trbe(void)
> +{
> +	if (has_vhe() || is_protected_kvm_enabled() ||
> +	    WARN_ON_ONCE(preemptible()))
> +		return;
> +
> +	host_data_clear_flag(TRBE_ENABLED);
> +}
> +EXPORT_SYMBOL_GPL(kvm_disable_trbe);
> diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> index 858bb38e273f..9479bee41801 100644
> --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> @@ -51,32 +51,39 @@ static void __debug_restore_spe(u64 pmscr_el1)
>  	write_sysreg_el1(pmscr_el1, SYS_PMSCR);
>  }
>  
> -static void __debug_save_trace(u64 *trfcr_el1)
> +static void __trace_do_switch(u64 *saved_trfcr, u64 new_trfcr)
>  {
> -	*trfcr_el1 = 0;
> +	*saved_trfcr = read_sysreg_el1(SYS_TRFCR);
> +	write_sysreg_el1(new_trfcr, SYS_TRFCR);
>  
> -	/* Check if the TRBE is enabled */
> -	if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_EL1_E))
> +	/* No need to drain if going to an enabled state or from disabled state */
> +	if (new_trfcr || !*saved_trfcr)

What if TRFCR_EL1.TS is set to something non-zero? I'd rather you
check for the E*TRE bits instead of assuming things.
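
Something along these lines (untested, only to illustrate checking the
E*TRE bits rather than the whole register):

	u64 mask = TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE;

	/* Only drain when tracing was on and is now being turned off */
	if ((new_trfcr & mask) || !(*saved_trfcr & mask))
		return;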

>  		return;
> -	/*
> -	 * Prohibit trace generation while we are in guest.
> -	 * Since access to TRFCR_EL1 is trapped, the guest can't
> -	 * modify the filtering set by the host.
> -	 */
> -	*trfcr_el1 = read_sysreg_el1(SYS_TRFCR);
> -	write_sysreg_el1(0, SYS_TRFCR);
> +
>  	isb();
> -	/* Drain the trace buffer to memory */
>  	tsb_csync();
>  }
>  
> -static void __debug_restore_trace(u64 trfcr_el1)
> +static bool __trace_needs_switch(void)
>  {
> -	if (!trfcr_el1)
> -		return;
> +	return host_data_test_flag(TRBE_ENABLED) ||
> +	       (is_protected_kvm_enabled() && host_data_test_flag(HAS_TRF));
> +}
>  
> -	/* Restore trace filter controls */
> -	write_sysreg_el1(trfcr_el1, SYS_TRFCR);
> +static void __trace_switch_to_guest(void)
> +{
> +	/* Unsupported with TRBE so disable */
> +	if (host_data_test_flag(TRBE_ENABLED))
> +		*host_data_ptr(guest_trfcr_el1) = 0;
> +
> +	__trace_do_switch(host_data_ptr(host_debug_state.trfcr_el1),
> +			  *host_data_ptr(guest_trfcr_el1));
> +}
> +
> +static void __trace_switch_to_host(void)
> +{
> +	__trace_do_switch(host_data_ptr(guest_trfcr_el1),
> +			  *host_data_ptr(host_debug_state.trfcr_el1));
>  }
>  
>  void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
> @@ -84,9 +91,9 @@ void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
>  	/* Disable and flush SPE data generation */
>  	if (host_data_test_flag(HAS_SPE))
>  		__debug_save_spe(host_data_ptr(host_debug_state.pmscr_el1));
> -	/* Disable and flush Self-Hosted Trace generation */
> -	if (host_data_test_flag(HAS_TRBE))
> -		__debug_save_trace(host_data_ptr(host_debug_state.trfcr_el1));
> +
> +	if (__trace_needs_switch())
> +		__trace_switch_to_guest();
>  }
>  
>  void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
> @@ -98,8 +105,8 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu)
>  {
>  	if (host_data_test_flag(HAS_SPE))
>  		__debug_restore_spe(*host_data_ptr(host_debug_state.pmscr_el1));
> -	if (host_data_test_flag(HAS_TRBE))
> -		__debug_restore_trace(*host_data_ptr(host_debug_state.trfcr_el1));
> +	if (__trace_needs_switch())
> +		__trace_switch_to_host();
>  }
>  
>  void __debug_switch_to_host(struct kvm_vcpu *vcpu)
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index 96a32b213669..9c0f8c43e6fe 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -18,6 +18,7 @@
>  #include <asm/barrier.h>
>  #include <asm/cpufeature.h>
>  #include <linux/vmalloc.h>
> +#include <linux/kvm_host.h>

Ordering of include files.

>  
>  #include "coresight-self-hosted-trace.h"
>  #include "coresight-trbe.h"
> @@ -221,6 +222,7 @@ static inline void set_trbe_enabled(struct trbe_cpudata *cpudata, u64 trblimitr)
>  	 */
>  	trblimitr |= TRBLIMITR_EL1_E;
>  	write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
> +	kvm_enable_trbe();
>  
>  	/* Synchronize the TRBE enable event */
>  	isb();
> @@ -239,6 +241,7 @@ static inline void set_trbe_disabled(struct trbe_cpudata *cpudata)
>  	 */
>  	trblimitr &= ~TRBLIMITR_EL1_E;
>  	write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
> +	kvm_disable_trbe();
>  
>  	if (trbe_needs_drain_after_disable(cpudata))
>  		trbe_drain_buffer();
> @@ -253,8 +256,10 @@ static void trbe_drain_and_disable_local(struct trbe_cpudata *cpudata)
>  
>  static void trbe_reset_local(struct trbe_cpudata *cpudata)
>  {
> +	preempt_disable();
>  	trbe_drain_and_disable_local(cpudata);
>  	write_sysreg_s(0, SYS_TRBLIMITR_EL1);
> +	preempt_enable();

This looks terribly wrong. If you need to disable preemption here, why
doesn't the critical section cover all register accesses? Surely you
don't want to nuke another CPU's context?

But looking at the calling sites, this makes even less sense. The two
callers of this thing mess with *per-CPU* interrupts. Dealing with
per-CPU interrupts in preemptible context is a big no-no (hint: they
start with a call to smp_processor_id()).

So what is this supposed to ensure?

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [PATCH v8 6/8] KVM: arm64: coresight: Give TRBE enabled state to KVM
  2024-12-20 17:05   ` Marc Zyngier
@ 2024-12-20 17:32     ` James Clark
  2024-12-21 11:54       ` Marc Zyngier
  0 siblings, 1 reply; 16+ messages in thread
From: James Clark @ 2024-12-20 17:32 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, oliver.upton, suzuki.poulose, coresight, Joey Gouly,
	Zenghui Yu, Catalin Marinas, Will Deacon, Mike Leach,
	Alexander Shishkin, Mark Brown, Anshuman Khandual,
	Rob Herring (Arm), Shiqi Liu, Fuad Tabba, James Morse,
	Raghavendra Rao Ananta, linux-arm-kernel, linux-kernel



On 20/12/2024 5:05 pm, Marc Zyngier wrote:
> On Wed, 27 Nov 2024 10:01:23 +0000,
> James Clark <james.clark@linaro.org> wrote:
>>
>> Currently in nVHE, KVM has to check if TRBE is enabled on every guest
>> switch even if it was never used. Because it's a debug feature and is
>> more likely to not be used than used, give KVM the TRBE buffer status to
>> allow a much simpler and faster do-nothing path in the hyp.
>>
>> This is always called with preemption disabled except for probe/hotplug
>> which gets wrapped with preempt_disable().
>>
>> Protected mode disables trace regardless of TRBE (because
>> guest_trfcr_el1 is always 0), which was not previously done. HAS_TRBE
>> becomes redundant, but HAS_TRF is now required for this.
>>
>> Signed-off-by: James Clark <james.clark@linaro.org>
>> ---
>>   arch/arm64/include/asm/kvm_host.h            | 10 +++-
>>   arch/arm64/kvm/debug.c                       | 25 ++++++++--
>>   arch/arm64/kvm/hyp/nvhe/debug-sr.c           | 51 +++++++++++---------
>>   drivers/hwtracing/coresight/coresight-trbe.c |  5 ++
>>   4 files changed, 65 insertions(+), 26 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 7e3478386351..ba251caa593b 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -611,7 +611,8 @@ struct cpu_sve_state {
>>    */
>>   struct kvm_host_data {
>>   #define KVM_HOST_DATA_FLAG_HAS_SPE	0
>> -#define KVM_HOST_DATA_FLAG_HAS_TRBE	1
>> +#define KVM_HOST_DATA_FLAG_HAS_TRF	1
>> +#define KVM_HOST_DATA_FLAG_TRBE_ENABLED	2
>>   	unsigned long flags;
>>   
>>   	struct kvm_cpu_context host_ctxt;
>> @@ -657,6 +658,9 @@ struct kvm_host_data {
>>   		u64 mdcr_el2;
>>   	} host_debug_state;
>>   
>> +	/* Guest trace filter value */
>> +	u64 guest_trfcr_el1;
> 
> Guest value? Or host state while running the guest? If the former,
> then this has nothing to do here. If the latter, this should be
> spelled out (trfcr_in_guest?), and the comment amended.
> 
>> +
>>   	/* Number of programmable event counters (PMCR_EL0.N) for this CPU */
>>   	unsigned int nr_event_counters;
>>   };
>> @@ -1381,6 +1385,8 @@ static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
>>   void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr);
>>   void kvm_clr_pmu_events(u64 clr);
>>   bool kvm_set_pmuserenr(u64 val);
>> +void kvm_enable_trbe(void);
>> +void kvm_disable_trbe(void);
>>   #else
>>   static inline void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr) {}
>>   static inline void kvm_clr_pmu_events(u64 clr) {}
>> @@ -1388,6 +1394,8 @@ static inline bool kvm_set_pmuserenr(u64 val)
>>   {
>>   	return false;
>>   }
>> +static inline void kvm_enable_trbe(void) {}
>> +static inline void kvm_disable_trbe(void) {}
>>   #endif
>>   
>>   void kvm_vcpu_load_vhe(struct kvm_vcpu *vcpu);
>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>> index dd9e139dfd13..0c340ae7b5d1 100644
>> --- a/arch/arm64/kvm/debug.c
>> +++ b/arch/arm64/kvm/debug.c
>> @@ -314,7 +314,26 @@ void kvm_init_host_debug_data(void)
>>   	    !(read_sysreg_s(SYS_PMBIDR_EL1) & PMBIDR_EL1_P))
>>   		host_data_set_flag(HAS_SPE);
>>   
>> -	if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceBuffer_SHIFT) &&
>> -	    !(read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_EL1_P))
>> -		host_data_set_flag(HAS_TRBE);
>> +	if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceFilt_SHIFT))
>> +		host_data_set_flag(HAS_TRF);
>>   }
>> +
>> +void kvm_enable_trbe(void)
>> +{
>> +	if (has_vhe() || is_protected_kvm_enabled() ||
>> +	    WARN_ON_ONCE(preemptible()))
>> +		return;
>> +
>> +	host_data_set_flag(TRBE_ENABLED);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_enable_trbe);
>> +
>> +void kvm_disable_trbe(void)
>> +{
>> +	if (has_vhe() || is_protected_kvm_enabled() ||
>> +	    WARN_ON_ONCE(preemptible()))
>> +		return;
>> +
>> +	host_data_clear_flag(TRBE_ENABLED);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_disable_trbe);
>> diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
>> index 858bb38e273f..9479bee41801 100644
>> --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
>> +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
>> @@ -51,32 +51,39 @@ static void __debug_restore_spe(u64 pmscr_el1)
>>   	write_sysreg_el1(pmscr_el1, SYS_PMSCR);
>>   }
>>   
>> -static void __debug_save_trace(u64 *trfcr_el1)
>> +static void __trace_do_switch(u64 *saved_trfcr, u64 new_trfcr)
>>   {
>> -	*trfcr_el1 = 0;
>> +	*saved_trfcr = read_sysreg_el1(SYS_TRFCR);
>> +	write_sysreg_el1(new_trfcr, SYS_TRFCR);
>>   
>> -	/* Check if the TRBE is enabled */
>> -	if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_EL1_E))
>> +	/* No need to drain if going to an enabled state or from disabled state */
>> +	if (new_trfcr || !*saved_trfcr)
> 
> What if TRFCR_EL1.TS is set to something non-zero? I'd rather you
> check for the E*TRE bits instead of assuming things.
> 

Yeah it's probably better that way. TS is actually always set when any 
tracing session starts and then never cleared, so doing it the simpler 
way made it always flush even after tracing finished, which probably 
wasn't great.

>>   		return;
>> -	/*
>> -	 * Prohibit trace generation while we are in guest.
>> -	 * Since access to TRFCR_EL1 is trapped, the guest can't
>> -	 * modify the filtering set by the host.
>> -	 */
>> -	*trfcr_el1 = read_sysreg_el1(SYS_TRFCR);
>> -	write_sysreg_el1(0, SYS_TRFCR);
>> +
>>   	isb();
>> -	/* Drain the trace buffer to memory */
>>   	tsb_csync();
>>   }
>>   
>> -static void __debug_restore_trace(u64 trfcr_el1)
>> +static bool __trace_needs_switch(void)
>>   {
>> -	if (!trfcr_el1)
>> -		return;
>> +	return host_data_test_flag(TRBE_ENABLED) ||
>> +	       (is_protected_kvm_enabled() && host_data_test_flag(HAS_TRF));
>> +}
>>   
>> -	/* Restore trace filter controls */
>> -	write_sysreg_el1(trfcr_el1, SYS_TRFCR);
>> +static void __trace_switch_to_guest(void)
>> +{
>> +	/* Unsupported with TRBE so disable */
>> +	if (host_data_test_flag(TRBE_ENABLED))
>> +		*host_data_ptr(guest_trfcr_el1) = 0;
>> +
>> +	__trace_do_switch(host_data_ptr(host_debug_state.trfcr_el1),
>> +			  *host_data_ptr(guest_trfcr_el1));
>> +}
>> +
>> +static void __trace_switch_to_host(void)
>> +{
>> +	__trace_do_switch(host_data_ptr(guest_trfcr_el1),
>> +			  *host_data_ptr(host_debug_state.trfcr_el1));
>>   }
>>   
>>   void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
>> @@ -84,9 +91,9 @@ void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
>>   	/* Disable and flush SPE data generation */
>>   	if (host_data_test_flag(HAS_SPE))
>>   		__debug_save_spe(host_data_ptr(host_debug_state.pmscr_el1));
>> -	/* Disable and flush Self-Hosted Trace generation */
>> -	if (host_data_test_flag(HAS_TRBE))
>> -		__debug_save_trace(host_data_ptr(host_debug_state.trfcr_el1));
>> +
>> +	if (__trace_needs_switch())
>> +		__trace_switch_to_guest();
>>   }
>>   
>>   void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
>> @@ -98,8 +105,8 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu)
>>   {
>>   	if (host_data_test_flag(HAS_SPE))
>>   		__debug_restore_spe(*host_data_ptr(host_debug_state.pmscr_el1));
>> -	if (host_data_test_flag(HAS_TRBE))
>> -		__debug_restore_trace(*host_data_ptr(host_debug_state.trfcr_el1));
>> +	if (__trace_needs_switch())
>> +		__trace_switch_to_host();
>>   }
>>   
>>   void __debug_switch_to_host(struct kvm_vcpu *vcpu)
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>> index 96a32b213669..9c0f8c43e6fe 100644
>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>> @@ -18,6 +18,7 @@
>>   #include <asm/barrier.h>
>>   #include <asm/cpufeature.h>
>>   #include <linux/vmalloc.h>
>> +#include <linux/kvm_host.h>
> 
> Ordering of include files.
> 
>>   
>>   #include "coresight-self-hosted-trace.h"
>>   #include "coresight-trbe.h"
>> @@ -221,6 +222,7 @@ static inline void set_trbe_enabled(struct trbe_cpudata *cpudata, u64 trblimitr)
>>   	 */
>>   	trblimitr |= TRBLIMITR_EL1_E;
>>   	write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
>> +	kvm_enable_trbe();
>>   
>>   	/* Synchronize the TRBE enable event */
>>   	isb();
>> @@ -239,6 +241,7 @@ static inline void set_trbe_disabled(struct trbe_cpudata *cpudata)
>>   	 */
>>   	trblimitr &= ~TRBLIMITR_EL1_E;
>>   	write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
>> +	kvm_disable_trbe();
>>   
>>   	if (trbe_needs_drain_after_disable(cpudata))
>>   		trbe_drain_buffer();
>> @@ -253,8 +256,10 @@ static void trbe_drain_and_disable_local(struct trbe_cpudata *cpudata)
>>   
>>   static void trbe_reset_local(struct trbe_cpudata *cpudata)
>>   {
>> +	preempt_disable();
>>   	trbe_drain_and_disable_local(cpudata);
>>   	write_sysreg_s(0, SYS_TRBLIMITR_EL1);
>> +	preempt_enable();
> 
> This looks terribly wrong. If you need to disable preemption here, why
> doesn't the critical section cover all register accesses? Surely you
> don't want to nuke another CPU's context?
> 
> But looking at the calling sites, this makes even less sense. The two
> callers of this thing mess with *per-CPU* interrupts. Dealing with
> per-CPU interrupts in preemptible context is a big no-no (hint: they
> start with a call to smp_processor_id()).
> 
> So what is this supposed to ensure?
> 
> 	M.
> 

These ones are only intended to silence the WARN_ON_ONCE(preemptible()) 
in kvm_enable_trbe() when this is called from boot/hotplug 
(arm_trbe_enable_cpu()). Preemption isn't disabled, but a guest can't 
run at that point either.

The "real" calls to kvm_enable_trbe() _are_ called from an atomic 
context. I think there was a previous review comment about when it was 
safe to call the KVM parts of this change, which is why I added the 
warning making sure it was always called with preemption disabled. But 
actually I could remove the warning and these preempt_disables() and 
replace them with a comment.

Thanks
James




* Re: [PATCH v8 6/8] KVM: arm64: coresight: Give TRBE enabled state to KVM
  2024-12-20 17:32     ` James Clark
@ 2024-12-21 11:54       ` Marc Zyngier
  2024-12-23 11:36         ` James Clark
  0 siblings, 1 reply; 16+ messages in thread
From: Marc Zyngier @ 2024-12-21 11:54 UTC (permalink / raw)
  To: James Clark
  Cc: kvmarm, oliver.upton, suzuki.poulose, coresight, Joey Gouly,
	Zenghui Yu, Catalin Marinas, Will Deacon, Mike Leach,
	Alexander Shishkin, Mark Brown, Anshuman Khandual,
	Rob Herring (Arm), Shiqi Liu, Fuad Tabba, James Morse,
	Raghavendra Rao Ananta, linux-arm-kernel, linux-kernel

On Fri, 20 Dec 2024 17:32:17 +0000,
James Clark <james.clark@linaro.org> wrote:
> 
> 
> 
> On 20/12/2024 5:05 pm, Marc Zyngier wrote:
> > On Wed, 27 Nov 2024 10:01:23 +0000,
> > James Clark <james.clark@linaro.org> wrote:
> >> 
> >> Currently in nVHE, KVM has to check if TRBE is enabled on every guest
> >> switch even if it was never used. Because it's a debug feature and is
> >> more likely to not be used than used, give KVM the TRBE buffer status to
> >> allow a much simpler and faster do-nothing path in the hyp.
> >> 
> >> This is always called with preemption disabled except for probe/hotplug
> >> which gets wrapped with preempt_disable().
> >> 
> >> Protected mode disables trace regardless of TRBE (because
> >> guest_trfcr_el1 is always 0), which was not previously done. HAS_TRBE
> >> becomes redundant, but HAS_TRF is now required for this.
> >> 
> >> Signed-off-by: James Clark <james.clark@linaro.org>
> >> ---
> >>   arch/arm64/include/asm/kvm_host.h            | 10 +++-
> >>   arch/arm64/kvm/debug.c                       | 25 ++++++++--
> >>   arch/arm64/kvm/hyp/nvhe/debug-sr.c           | 51 +++++++++++---------
> >>   drivers/hwtracing/coresight/coresight-trbe.c |  5 ++
> >>   4 files changed, 65 insertions(+), 26 deletions(-)
> >> 
> >> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> >> index 7e3478386351..ba251caa593b 100644
> >> --- a/arch/arm64/include/asm/kvm_host.h
> >> +++ b/arch/arm64/include/asm/kvm_host.h
> >> @@ -611,7 +611,8 @@ struct cpu_sve_state {
> >>    */
> >>   struct kvm_host_data {
> >>   #define KVM_HOST_DATA_FLAG_HAS_SPE	0
> >> -#define KVM_HOST_DATA_FLAG_HAS_TRBE	1
> >> +#define KVM_HOST_DATA_FLAG_HAS_TRF	1
> >> +#define KVM_HOST_DATA_FLAG_TRBE_ENABLED	2
> >>   	unsigned long flags;
> >>     	struct kvm_cpu_context host_ctxt;
> >> @@ -657,6 +658,9 @@ struct kvm_host_data {
> >>   		u64 mdcr_el2;
> >>   	} host_debug_state;
> >>   +	/* Guest trace filter value */
> >> +	u64 guest_trfcr_el1;
> > 
> > Guest value? Or host state while running the guest? If the former,
> > then this has nothing to do here. If the latter, this should be
> > spelled out (trfcr_in_guest?), and the comment amended.
> > 
> >> +
> >>   	/* Number of programmable event counters (PMCR_EL0.N) for this CPU */
> >>   	unsigned int nr_event_counters;
> >>   };
> >> @@ -1381,6 +1385,8 @@ static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
> >>   void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr);
> >>   void kvm_clr_pmu_events(u64 clr);
> >>   bool kvm_set_pmuserenr(u64 val);
> >> +void kvm_enable_trbe(void);
> >> +void kvm_disable_trbe(void);
> >>   #else
> >>   static inline void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr) {}
> >>   static inline void kvm_clr_pmu_events(u64 clr) {}
> >> @@ -1388,6 +1394,8 @@ static inline bool kvm_set_pmuserenr(u64 val)
> >>   {
> >>   	return false;
> >>   }
> >> +static inline void kvm_enable_trbe(void) {}
> >> +static inline void kvm_disable_trbe(void) {}
> >>   #endif
> >>     void kvm_vcpu_load_vhe(struct kvm_vcpu *vcpu);
> >> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> >> index dd9e139dfd13..0c340ae7b5d1 100644
> >> --- a/arch/arm64/kvm/debug.c
> >> +++ b/arch/arm64/kvm/debug.c
> >> @@ -314,7 +314,26 @@ void kvm_init_host_debug_data(void)
> >>   	    !(read_sysreg_s(SYS_PMBIDR_EL1) & PMBIDR_EL1_P))
> >>   		host_data_set_flag(HAS_SPE);
> >> -	if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceBuffer_SHIFT) &&
> >> -	    !(read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_EL1_P))
> >> -		host_data_set_flag(HAS_TRBE);
> >> +	if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceFilt_SHIFT))
> >> +		host_data_set_flag(HAS_TRF);
> >>   }
> >> +
> >> +void kvm_enable_trbe(void)
> >> +{
> >> +	if (has_vhe() || is_protected_kvm_enabled() ||
> >> +	    WARN_ON_ONCE(preemptible()))
> >> +		return;
> >> +
> >> +	host_data_set_flag(TRBE_ENABLED);
> >> +}
> >> +EXPORT_SYMBOL_GPL(kvm_enable_trbe);
> >> +
> >> +void kvm_disable_trbe(void)
> >> +{
> >> +	if (has_vhe() || is_protected_kvm_enabled() ||
> >> +	    WARN_ON_ONCE(preemptible()))
> >> +		return;
> >> +
> >> +	host_data_clear_flag(TRBE_ENABLED);
> >> +}
> >> +EXPORT_SYMBOL_GPL(kvm_disable_trbe);
> >> diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> >> index 858bb38e273f..9479bee41801 100644
> >> --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> >> +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> >> @@ -51,32 +51,39 @@ static void __debug_restore_spe(u64 pmscr_el1)
> >>   	write_sysreg_el1(pmscr_el1, SYS_PMSCR);
> >>   }
> >>   -static void __debug_save_trace(u64 *trfcr_el1)
> >> +static void __trace_do_switch(u64 *saved_trfcr, u64 new_trfcr)
> >>   {
> >> -	*trfcr_el1 = 0;
> >> +	*saved_trfcr = read_sysreg_el1(SYS_TRFCR);
> >> +	write_sysreg_el1(new_trfcr, SYS_TRFCR);
> >>   -	/* Check if the TRBE is enabled */
> >> -	if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_EL1_E))
> >> +	/* No need to drain if going to an enabled state or from disabled state */
> >> +	if (new_trfcr || !*saved_trfcr)
> > 
> > What if TRFCR_EL1.TS is set to something non-zero? I'd rather you
> > check for the E*TRE bits instead of assuming things.
> > 
> 
> Yeah it's probably better that way. TS is actually always set when any
> tracing session starts and then never cleared, so doing it the simpler
> way made it always flush even after tracing finished, which probably
> wasn't great.

Quite. Can you please *test* these things?

[...]

> >> @@ -253,8 +256,10 @@ static void trbe_drain_and_disable_local(struct trbe_cpudata *cpudata)
> >>     static void trbe_reset_local(struct trbe_cpudata *cpudata)
> >>   {
> >> +	preempt_disable();
> >>   	trbe_drain_and_disable_local(cpudata);
> >>   	write_sysreg_s(0, SYS_TRBLIMITR_EL1);
> >> +	preempt_enable();
> > 
> > This looks terribly wrong. If you need to disable preemption here, why
> > doesn't the critical section cover all register accesses? Surely you
> > don't want to nuke another CPU's context?
> > 
> > But looking at the calling sites, this makes even less sense. The two
> > callers of this thing mess with *per-CPU* interrupts. Dealing with
> > per-CPU interrupts in preemptible context is a big no-no (hint: they
> > start with a call to smp_processor_id()).
> > 
> > So what is this supposed to ensure?
> > 
> > 	M.
> > 
> 
> These ones are only intended to silence the
> WARN_ON_ONCE(preemptible()) in kvm_enable_trbe() when this is called
> from boot/hotplug (arm_trbe_enable_cpu()). Preemption isn't disabled,
> but a guest can't run at that point either.
>
> The "real" calls to kvm_enable_trbe() _are_ called from an atomic
> context. I think there was a previous review comment about when it was
> safe to call the KVM parts of this change, which is why I added the
> warning making sure it was always called with preemption disabled. But
> actually I could remove the warning and these preempt_disables() and
> replace them with a comment.

You should keep the WARN_ON(), and either *never* end up calling this
stuff during a CPUHP event, or handle the fact that preemption isn't
initialised yet. For example by checking whether the current CPU is
online.
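
Something along those lines, maybe (completely untested, reusing the
names from the patch):

	if (has_vhe() || is_protected_kvm_enabled() ||
	    WARN_ON_ONCE(cpu_online(raw_smp_processor_id()) &&
			 preemptible()))
		return;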

But this sort of random spreading of preemption disabling is not an
acceptable outcome.

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [PATCH v8 7/8] KVM: arm64: Support trace filtering for guests
  2024-11-27 10:01 ` [PATCH v8 7/8] KVM: arm64: Support trace filtering for guests James Clark
@ 2024-12-21 12:34   ` Marc Zyngier
  2024-12-23 11:28     ` James Clark
  0 siblings, 1 reply; 16+ messages in thread
From: Marc Zyngier @ 2024-12-21 12:34 UTC (permalink / raw)
  To: James Clark
  Cc: kvmarm, oliver.upton, suzuki.poulose, coresight, Joey Gouly,
	Zenghui Yu, Catalin Marinas, Will Deacon, Mike Leach,
	Alexander Shishkin, Mark Brown, Anshuman Khandual, James Morse,
	Rob Herring (Arm), Shiqi Liu, Fuad Tabba, Raghavendra Rao Ananta,
	linux-arm-kernel, linux-kernel

On Wed, 27 Nov 2024 10:01:24 +0000,
James Clark <james.clark@linaro.org> wrote:
> 
> For nVHE, switch the filter value in and out if the Coresight driver
> asks for it. This will support filters for guests when sinks other than
> TRBE are used.
> 
> For VHE, just write the filter directly to TRFCR_EL1 where trace can be
> used even with TRBE sinks.
> 
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
>  arch/arm64/include/asm/kvm_host.h  |  5 +++++
>  arch/arm64/kvm/debug.c             | 28 ++++++++++++++++++++++++++++
>  arch/arm64/kvm/hyp/nvhe/debug-sr.c |  1 +
>  3 files changed, 34 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index ba251caa593b..cce07887551b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -613,6 +613,7 @@ struct kvm_host_data {
>  #define KVM_HOST_DATA_FLAG_HAS_SPE	0
>  #define KVM_HOST_DATA_FLAG_HAS_TRF	1
>  #define KVM_HOST_DATA_FLAG_TRBE_ENABLED	2
> +#define KVM_HOST_DATA_FLAG_GUEST_FILTER	3

Guest filter what? This is meaningless.

>  	unsigned long flags;
>  
>  	struct kvm_cpu_context host_ctxt;
> @@ -1387,6 +1388,8 @@ void kvm_clr_pmu_events(u64 clr);
>  bool kvm_set_pmuserenr(u64 val);
>  void kvm_enable_trbe(void);
>  void kvm_disable_trbe(void);
> +void kvm_set_trfcr(u64 guest_trfcr);
> +void kvm_clear_trfcr(void);
>  #else
>  static inline void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr) {}
>  static inline void kvm_clr_pmu_events(u64 clr) {}
> @@ -1396,6 +1399,8 @@ static inline bool kvm_set_pmuserenr(u64 val)
>  }
>  static inline void kvm_enable_trbe(void) {}
>  static inline void kvm_disable_trbe(void) {}
> +static inline void kvm_set_trfcr(u64 guest_trfcr) {}
> +static inline void kvm_clear_trfcr(void) {}
>  #endif
>  
>  void kvm_vcpu_load_vhe(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index 0c340ae7b5d1..9266f2776991 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -337,3 +337,31 @@ void kvm_disable_trbe(void)
>  	host_data_clear_flag(TRBE_ENABLED);
>  }
>  EXPORT_SYMBOL_GPL(kvm_disable_trbe);
> +
> +void kvm_set_trfcr(u64 guest_trfcr)

Again. Is this the guest's view? or the host view while running the
guest? I asked the question on the previous patch, and you didn't
reply.

> +{
> +	if (is_protected_kvm_enabled() || WARN_ON_ONCE(preemptible()))
> +		return;
> +
> +	if (has_vhe())
> +		write_sysreg_s(guest_trfcr, SYS_TRFCR_EL12);
> +	else {
> +		*host_data_ptr(guest_trfcr_el1) = guest_trfcr;
> +		host_data_set_flag(GUEST_FILTER);
> +	}

Oh come on. This is basic coding style, see section 3 in
Documentation/process/coding-style.rst.

> +}
> +EXPORT_SYMBOL_GPL(kvm_set_trfcr);
> +
> +void kvm_clear_trfcr(void)
> +{
> +	if (is_protected_kvm_enabled() || WARN_ON_ONCE(preemptible()))
> +		return;
> +
> +	if (has_vhe())
> +		write_sysreg_s(0, SYS_TRFCR_EL12);
> +	else {
> +		*host_data_ptr(guest_trfcr_el1) = 0;
> +		host_data_clear_flag(GUEST_FILTER);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(kvm_clear_trfcr);

Why do we have two helpers? Clearly, calling kvm_set_trfcr() with
E{1,0}TRE=={0,0} should result in *disabling* things.  Except it
doesn't, and you should fix it. Once that is fixed, it becomes obvious
that kvm_clear_trfcr() serves no purpose.

To sum it up, KVM's API should reflect the architecture instead of
making things up.
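
To illustrate the point (untested, only a sketch of the shape): a
single entry point where an E{1,0}TRE=={0,0} value drops the guest
filter altogether:

	void kvm_set_trfcr(u64 trfcr_in_guest)
	{
		if (is_protected_kvm_enabled() || WARN_ON_ONCE(preemptible()))
			return;

		if (has_vhe()) {
			write_sysreg_s(trfcr_in_guest, SYS_TRFCR_EL12);
			return;
		}

		*host_data_ptr(guest_trfcr_el1) = trfcr_in_guest;
		if (trfcr_in_guest & (TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE))
			host_data_set_flag(GUEST_FILTER);
		else
			host_data_clear_flag(GUEST_FILTER);
	}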

> diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> index 9479bee41801..7edee7ace433 100644
> --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> @@ -67,6 +67,7 @@ static void __trace_do_switch(u64 *saved_trfcr, u64 new_trfcr)
>  static bool __trace_needs_switch(void)
>  {
>  	return host_data_test_flag(TRBE_ENABLED) ||
> +	       host_data_test_flag(GUEST_FILTER) ||
>  	       (is_protected_kvm_enabled() && host_data_test_flag(HAS_TRF));

Wouldn't it make more sense to just force the "GUEST_FILTER" flag in
the pKVM case, and drop the 3rd term altogether?

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [PATCH v8 7/8] KVM: arm64: Support trace filtering for guests
  2024-12-21 12:34   ` Marc Zyngier
@ 2024-12-23 11:28     ` James Clark
  2024-12-26 14:22       ` Marc Zyngier
  0 siblings, 1 reply; 16+ messages in thread
From: James Clark @ 2024-12-23 11:28 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton
  Cc: kvmarm, suzuki.poulose, coresight, Joey Gouly, Zenghui Yu,
	Catalin Marinas, Will Deacon, Mike Leach, Alexander Shishkin,
	Mark Brown, Anshuman Khandual, James Morse, Rob Herring (Arm),
	Shiqi Liu, Fuad Tabba, Raghavendra Rao Ananta, linux-arm-kernel,
	linux-kernel



On 21/12/2024 12:34 pm, Marc Zyngier wrote:
> On Wed, 27 Nov 2024 10:01:24 +0000,
> James Clark <james.clark@linaro.org> wrote:
>>
>> For nVHE, switch the filter value in and out if the Coresight driver
>> asks for it. This will support filters for guests when sinks other than
>> TRBE are used.
>>
>> For VHE, just write the filter directly to TRFCR_EL1 where trace can be
>> used even with TRBE sinks.
>>
>> Signed-off-by: James Clark <james.clark@linaro.org>
>> ---
>>   arch/arm64/include/asm/kvm_host.h  |  5 +++++
>>   arch/arm64/kvm/debug.c             | 28 ++++++++++++++++++++++++++++
>>   arch/arm64/kvm/hyp/nvhe/debug-sr.c |  1 +
>>   3 files changed, 34 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index ba251caa593b..cce07887551b 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -613,6 +613,7 @@ struct kvm_host_data {
>>   #define KVM_HOST_DATA_FLAG_HAS_SPE	0
>>   #define KVM_HOST_DATA_FLAG_HAS_TRF	1
>>   #define KVM_HOST_DATA_FLAG_TRBE_ENABLED	2
>> +#define KVM_HOST_DATA_FLAG_GUEST_FILTER	3
> 
> Guest filter what? This is meaningless.
> 

KVM_HOST_DATA_FLAG_SWAP_TRFCR maybe?

>>   	unsigned long flags;
>>   
>>   	struct kvm_cpu_context host_ctxt;
>> @@ -1387,6 +1388,8 @@ void kvm_clr_pmu_events(u64 clr);
>>   bool kvm_set_pmuserenr(u64 val);
>>   void kvm_enable_trbe(void);
>>   void kvm_disable_trbe(void);
>> +void kvm_set_trfcr(u64 guest_trfcr);
>> +void kvm_clear_trfcr(void);
>>   #else
>>   static inline void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr) {}
>>   static inline void kvm_clr_pmu_events(u64 clr) {}
>> @@ -1396,6 +1399,8 @@ static inline bool kvm_set_pmuserenr(u64 val)
>>   }
>>   static inline void kvm_enable_trbe(void) {}
>>   static inline void kvm_disable_trbe(void) {}
>> +static inline void kvm_set_trfcr(u64 guest_trfcr) {}
>> +static inline void kvm_clear_trfcr(void) {}
>>   #endif
>>   
>>   void kvm_vcpu_load_vhe(struct kvm_vcpu *vcpu);
>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>> index 0c340ae7b5d1..9266f2776991 100644
>> --- a/arch/arm64/kvm/debug.c
>> +++ b/arch/arm64/kvm/debug.c
>> @@ -337,3 +337,31 @@ void kvm_disable_trbe(void)
>>   	host_data_clear_flag(TRBE_ENABLED);
>>   }
>>   EXPORT_SYMBOL_GPL(kvm_disable_trbe);
>> +
>> +void kvm_set_trfcr(u64 guest_trfcr)
> 
> Again. Is this the guest's view? or the host view while running the
> guest? I asked the question on the previous patch, and you didn't
> reply.
> 

Ah, sorry, I missed that one:

 > Guest value? Or host state while running the guest? If the former,
 > then this has nothing to do here. If the latter, this should be
 > spelled out (trfcr_in_guest?), and the comment amended.

Yes, the latter; guest TRFCR reads are undef anyway. I can rename this 
and the host_data variable to trfcr_in_guest.
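
For clarity, the renamed field with an amended comment might look 
something like this (a sketch; the comment wording is mine):

	/* TRFCR value to install while running a guest: host state
	 * describing the guest window, not the guest's view of the
	 * register (guest TRFCR accesses are undef anyway) */
	u64 trfcr_in_guest;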

>> +{
>> +	if (is_protected_kvm_enabled() || WARN_ON_ONCE(preemptible()))
>> +		return;
>> +
>> +	if (has_vhe())
>> +		write_sysreg_s(guest_trfcr, SYS_TRFCR_EL12);
>> +	else {
>> +		*host_data_ptr(guest_trfcr_el1) = guest_trfcr;
>> +		host_data_set_flag(GUEST_FILTER);
>> +	}
> 
> Oh come on. This is basic coding style, see section 3 in
> Documentation/process/coding-style.rst.
> 

Oops, I'd have thought checkpatch could catch something like that. Will fix.

>> +}
>> +EXPORT_SYMBOL_GPL(kvm_set_trfcr);
>> +
>> +void kvm_clear_trfcr(void)
>> +{
>> +	if (is_protected_kvm_enabled() || WARN_ON_ONCE(preemptible()))
>> +		return;
>> +
>> +	if (has_vhe())
>> +		write_sysreg_s(0, SYS_TRFCR_EL12);
>> +	else {
>> +		*host_data_ptr(guest_trfcr_el1) = 0;
>> +		host_data_clear_flag(GUEST_FILTER);
>> +	}
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_clear_trfcr);
> 
> Why do we have two helpers? Clearly, calling kvm_set_trfcr() with
> E{1,0}TRE=={0,0} should result in *disabling* things.  Except it
> doesn't, and you should fix it. Once that is fixed, it becomes
> obvious that kvm_clear_trfcr() serves no purpose.
> 

With only one kvm_set_trfcr() there's no way to distinguish between 
swapping in a value of 0 and stopping the swapping altogether. I 
thought we wanted a single flag that gated the register accesses so 
that the hyp mostly does nothing? With only kvm_set_trfcr() you first 
need to check FEAT_TRF, then you need to compare the real register 
with trfcr_in_guest to know whether to swap or not, every time.

Actually I think some of the previous versions had something like this 
but it was a bit more complicated.

Maybe set/clear_trfcr() aren't great names. Perhaps 
kvm_set_trfcr_in_guest() and kvm_disable_trfcr_in_guest(), with the 
second one hinting that it stops the swapping regardless of what the 
values are?

I don't think calling kvm_set_trfcr() with E{1,0}TRE=={0,0} is actually 
broken in this version; it means that the Coresight driver wants that 
value to be installed for guests. So it should actually _enable_ 
swapping in the value of 0, not disable anything.
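
To make the distinction concrete, here's how the two calls differ from 
the hyp's point of view under the v8 names (the call sites are 
illustrative):

	/* perf session with exclude_guest: install 0 for guests.
	 * GUEST_FILTER is set, so the hyp swaps 0 in and out. */
	kvm_set_trfcr(0);

	/* session torn down: no opinion about guest tracing any more.
	 * GUEST_FILTER is cleared, so the hyp stops touching TRFCR on
	 * guest switch. */
	kvm_clear_trfcr();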

> To sum it up, KVM's API should reflect the architecture instead of
> making things up.
> 

We had kvm_set_trfcr(u64 host_trfcr, u64 guest_trfcr) in the last 
version, which also serves the purpose I mentioned above, because you 
can check whether they're the same and disable the swapping. I don't 
know if that counts as reflecting the architecture better, but Oliver 
mentioned he preferred it more "intent" based, which is why I added 
the clear_trfcr().

>> diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
>> index 9479bee41801..7edee7ace433 100644
>> --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
>> +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
>> @@ -67,6 +67,7 @@ static void __trace_do_switch(u64 *saved_trfcr, u64 new_trfcr)
>>   static bool __trace_needs_switch(void)
>>   {
>>   	return host_data_test_flag(TRBE_ENABLED) ||
>> +	       host_data_test_flag(GUEST_FILTER) ||
>>   	       (is_protected_kvm_enabled() && host_data_test_flag(HAS_TRF));
> 
> Wouldn't it make more sense to just force the "GUEST_FILTER" flag in
> the pKVM case, and drop the 3rd term altogether?
> 
> 	M.
> 

Yep, we can set GUEST_FILTER once at startup in the pKVM case, and the 
third term gets dropped along with the HAS_TRF check. That's a lot 
simpler.
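
Something like this, roughly (a sketch against the v8 naming; the 
exact init site is illustrative):

	/* kvm_init_host_debug_data(): pKVM unconditionally hides
	 * trace from guests, so decide once at startup rather than
	 * re-testing protected mode on every guest switch */
	if (is_protected_kvm_enabled() && host_data_test_flag(HAS_TRF))
		host_data_set_flag(GUEST_FILTER);

and __trace_needs_switch() loses the third term:

	static bool __trace_needs_switch(void)
	{
		return host_data_test_flag(TRBE_ENABLED) ||
		       host_data_test_flag(GUEST_FILTER);
	}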

Thanks
James




* Re: [PATCH v8 6/8] KVM: arm64: coresight: Give TRBE enabled state to KVM
  2024-12-21 11:54       ` Marc Zyngier
@ 2024-12-23 11:36         ` James Clark
  0 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2024-12-23 11:36 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, oliver.upton, suzuki.poulose, coresight, Joey Gouly,
	Zenghui Yu, Catalin Marinas, Will Deacon, Mike Leach,
	Alexander Shishkin, Mark Brown, Anshuman Khandual,
	Rob Herring (Arm), Shiqi Liu, Fuad Tabba, James Morse,
	Raghavendra Rao Ananta, linux-arm-kernel, linux-kernel



On 21/12/2024 11:54 am, Marc Zyngier wrote:
> On Fri, 20 Dec 2024 17:32:17 +0000,
> James Clark <james.clark@linaro.org> wrote:
>>
>>
>>
>> On 20/12/2024 5:05 pm, Marc Zyngier wrote:
>>> On Wed, 27 Nov 2024 10:01:23 +0000,
>>> James Clark <james.clark@linaro.org> wrote:
>>>>
>>>> Currently in nVHE, KVM has to check if TRBE is enabled on every guest
>>>> switch even if it was never used. Because it's a debug feature and is
>>>> more likely to not be used than used, give KVM the TRBE buffer status to
>>>> allow a much simpler and faster do-nothing path in the hyp.
>>>>
>>>> This is always called with preemption disabled except for probe/hotplug
>>>> which gets wrapped with preempt_disable().
>>>>
>>>> Protected mode disables trace regardless of TRBE (because
>>>> guest_trfcr_el1 is always 0), which was not previously done. HAS_TRBE
>>>> becomes redundant, but HAS_TRF is now required for this.
>>>>
>>>> Signed-off-by: James Clark <james.clark@linaro.org>
>>>> ---
>>>>    arch/arm64/include/asm/kvm_host.h            | 10 +++-
>>>>    arch/arm64/kvm/debug.c                       | 25 ++++++++--
>>>>    arch/arm64/kvm/hyp/nvhe/debug-sr.c           | 51 +++++++++++---------
>>>>    drivers/hwtracing/coresight/coresight-trbe.c |  5 ++
>>>>    4 files changed, 65 insertions(+), 26 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>>>> index 7e3478386351..ba251caa593b 100644
>>>> --- a/arch/arm64/include/asm/kvm_host.h
>>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>>> @@ -611,7 +611,8 @@ struct cpu_sve_state {
>>>>     */
>>>>    struct kvm_host_data {
>>>>    #define KVM_HOST_DATA_FLAG_HAS_SPE	0
>>>> -#define KVM_HOST_DATA_FLAG_HAS_TRBE	1
>>>> +#define KVM_HOST_DATA_FLAG_HAS_TRF	1
>>>> +#define KVM_HOST_DATA_FLAG_TRBE_ENABLED	2
>>>>    	unsigned long flags;
>>>>      	struct kvm_cpu_context host_ctxt;
>>>> @@ -657,6 +658,9 @@ struct kvm_host_data {
>>>>    		u64 mdcr_el2;
>>>>    	} host_debug_state;
>>>>    +	/* Guest trace filter value */
>>>> +	u64 guest_trfcr_el1;
>>>
>>> Guest value? Or host state while running the guest? If the former,
>>> then this has nothing to do here. If the latter, this should be
>>> spelled out (trfcr_in_guest?), and the comment amended.
>>>
>>>> +
>>>>    	/* Number of programmable event counters (PMCR_EL0.N) for this CPU */
>>>>    	unsigned int nr_event_counters;
>>>>    };
>>>> @@ -1381,6 +1385,8 @@ static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
>>>>    void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr);
>>>>    void kvm_clr_pmu_events(u64 clr);
>>>>    bool kvm_set_pmuserenr(u64 val);
>>>> +void kvm_enable_trbe(void);
>>>> +void kvm_disable_trbe(void);
>>>>    #else
>>>>    static inline void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr) {}
>>>>    static inline void kvm_clr_pmu_events(u64 clr) {}
>>>> @@ -1388,6 +1394,8 @@ static inline bool kvm_set_pmuserenr(u64 val)
>>>>    {
>>>>    	return false;
>>>>    }
>>>> +static inline void kvm_enable_trbe(void) {}
>>>> +static inline void kvm_disable_trbe(void) {}
>>>>    #endif
>>>>      void kvm_vcpu_load_vhe(struct kvm_vcpu *vcpu);
>>>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>>>> index dd9e139dfd13..0c340ae7b5d1 100644
>>>> --- a/arch/arm64/kvm/debug.c
>>>> +++ b/arch/arm64/kvm/debug.c
>>>> @@ -314,7 +314,26 @@ void kvm_init_host_debug_data(void)
>>>>    	    !(read_sysreg_s(SYS_PMBIDR_EL1) & PMBIDR_EL1_P))
>>>>    		host_data_set_flag(HAS_SPE);
>>>> -	if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceBuffer_SHIFT) &&
>>>> -	    !(read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_EL1_P))
>>>> -		host_data_set_flag(HAS_TRBE);
>>>> +	if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceFilt_SHIFT))
>>>> +		host_data_set_flag(HAS_TRF);
>>>>    }
>>>> +
>>>> +void kvm_enable_trbe(void)
>>>> +{
>>>> +	if (has_vhe() || is_protected_kvm_enabled() ||
>>>> +	    WARN_ON_ONCE(preemptible()))
>>>> +		return;
>>>> +
>>>> +	host_data_set_flag(TRBE_ENABLED);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(kvm_enable_trbe);
>>>> +
>>>> +void kvm_disable_trbe(void)
>>>> +{
>>>> +	if (has_vhe() || is_protected_kvm_enabled() ||
>>>> +	    WARN_ON_ONCE(preemptible()))
>>>> +		return;
>>>> +
>>>> +	host_data_clear_flag(TRBE_ENABLED);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(kvm_disable_trbe);
>>>> diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
>>>> index 858bb38e273f..9479bee41801 100644
>>>> --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
>>>> +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
>>>> @@ -51,32 +51,39 @@ static void __debug_restore_spe(u64 pmscr_el1)
>>>>    	write_sysreg_el1(pmscr_el1, SYS_PMSCR);
>>>>    }
>>>>    -static void __debug_save_trace(u64 *trfcr_el1)
>>>> +static void __trace_do_switch(u64 *saved_trfcr, u64 new_trfcr)
>>>>    {
>>>> -	*trfcr_el1 = 0;
>>>> +	*saved_trfcr = read_sysreg_el1(SYS_TRFCR);
>>>> +	write_sysreg_el1(new_trfcr, SYS_TRFCR);
>>>>    -	/* Check if the TRBE is enabled */
>>>> -	if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_EL1_E))
>>>> +	/* No need to drain if going to an enabled state or from disabled state */
>>>> +	if (new_trfcr || !*saved_trfcr)
>>>
>>> What if TRFCR_EL1.TS is set to something non-zero? I'd rather you
>>> check for the E*TRE bits instead of assuming things.
>>>
>>
>> Yeah it's probably better that way. TS is actually always set when any
>> tracing session starts and then never cleared, so doing it the simpler
>> way made it always flush even after tracing finished, which probably
>> wasn't great.
> 
> Quite. Can you please *test* these things?
> 
> [...]
> 

Sorry to confuse things, I wasn't 100% accurate here: yes, it's tested 
and working. It works because of the split set/clear_trfcr() API. The 
Coresight driver specifically calls clear at the end of the session 
rather than a set of 0, which signals that this function doesn't need 
to be called, so there's no excessive swapping.

Secondly, the buffer flushing case is triggered by TRBE_ENABLED, which 
forces TRFCR to 0, so "if (new_trfcr)" is an OK way to gate the flush.
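
That said, checking the E*TRE bits explicitly would look something 
like this (a sketch, assuming the TRFCR_EL1_E0TRE/E1TRE field macros 
generated by the sysreg conversion earlier in the series):

	static void __trace_do_switch(u64 *saved_trfcr, u64 new_trfcr)
	{
		u64 enable = TRFCR_EL1_E0TRE | TRFCR_EL1_E1TRE;

		*saved_trfcr = read_sysreg_el1(SYS_TRFCR);
		write_sysreg_el1(new_trfcr, SYS_TRFCR);

		/* Only drain when tracing goes from enabled to
		 * disabled; looking at the E*TRE bits means a stale
		 * TS field can't force a drain after a session ends */
		if ((new_trfcr & enable) || !(*saved_trfcr & enable))
			return;

		isb();
		/* Drain the trace buffer to memory */
		tsb_csync();
	}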


>>>> @@ -253,8 +256,10 @@ static void trbe_drain_and_disable_local(struct trbe_cpudata *cpudata)
>>>>      static void trbe_reset_local(struct trbe_cpudata *cpudata)
>>>>    {
>>>> +	preempt_disable();
>>>>    	trbe_drain_and_disable_local(cpudata);
>>>>    	write_sysreg_s(0, SYS_TRBLIMITR_EL1);
>>>> +	preempt_enable();
>>>
>>> This looks terribly wrong. If you need to disable preemption here, why
>>> doesn't the critical section cover all register accesses? Surely you
>>> don't want to nuke another CPU's context?
>>>
>>> But looking at the calling sites, this makes even less sense. The two
>>> callers of this thing mess with *per-CPU* interrupts. Dealing with
>>> per-CPU interrupts in preemptible context is a big no-no (hint: they
>>> start with a call to smp_processor_id()).
>>>
>>> So what is this supposed to ensure?
>>>
>>> 	M.
>>>
>>
>> These ones are only intended to silence the
>> WARN_ON_ONCE(preemptible()) in kvm_enable_trbe() when this is called
>> from boot/hotplug (arm_trbe_enable_cpu()). Preemption isn't disabled,
>> but a guest can't run at that point either.
>>
>> The "real" calls to kvm_enable_trbe() _are_ called from an atomic
>> context. I think there was a previous review comment about when it was
>> safe to call the KVM parts of this change, which is why I added the
>> warning making sure it was always called with preemption disabled. But
>> actually I could remove the warning and these preempt_disables() and
>> replace them with a comment.
> 
> You should keep the WARN_ON(), and either *never* end up calling this
> stuff during a CPUHP event, or handle the fact that preemption isn't
> initialised yet. For example by checking whether the current CPU is
> online.
> 
> But this sort of random spreading of preemption disabling is not an
> acceptable outcome.
> 
> 	M.
> 

I'll look into this again. This was my initial attempt, but I couldn't 
find any easily accessible state that allowed it to be done this way. 
Maybe I missed something, but the obvious checks like cpu_online() 
were already true at that point.
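
One possible shape, for the record (a sketch of the "only warn once 
the CPU is fully up" idea, untested; whether cpu_active() is the right 
predicate during a CPUHP_AP_ONLINE_DYN callback is an assumption on my 
part):

	void kvm_enable_trbe(void)
	{
		if (has_vhe() || is_protected_kvm_enabled())
			return;

		/*
		 * Hotplug bring-up callbacks run on the target CPU
		 * before it becomes active, so no vCPU can be
		 * scheduled there even though preemptible() may be
		 * true; only warn once the CPU is fully up.
		 */
		WARN_ON_ONCE(preemptible() &&
			     cpu_active(raw_smp_processor_id()));

		host_data_set_flag(TRBE_ENABLED);
	}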

Thanks
James





* Re: [PATCH v8 7/8] KVM: arm64: Support trace filtering for guests
  2024-12-23 11:28     ` James Clark
@ 2024-12-26 14:22       ` Marc Zyngier
  0 siblings, 0 replies; 16+ messages in thread
From: Marc Zyngier @ 2024-12-26 14:22 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Upton, kvmarm, suzuki.poulose, coresight, Joey Gouly,
	Zenghui Yu, Catalin Marinas, Will Deacon, Mike Leach,
	Alexander Shishkin, Mark Brown, Anshuman Khandual, James Morse,
	Rob Herring (Arm), Shiqi Liu, Fuad Tabba, Raghavendra Rao Ananta,
	linux-arm-kernel, linux-kernel

On Mon, 23 Dec 2024 11:28:06 +0000,
James Clark <james.clark@linaro.org> wrote:
> 
> 
> 
> On 21/12/2024 12:34 pm, Marc Zyngier wrote:
> > On Wed, 27 Nov 2024 10:01:24 +0000,
> > James Clark <james.clark@linaro.org> wrote:
> >> 
> >> For nVHE, switch the filter value in and out if the Coresight driver
> >> asks for it. This will support filters for guests when sinks other than
> >> TRBE are used.
> >> 
> >> For VHE, just write the filter directly to TRFCR_EL1 where trace can be
> >> used even with TRBE sinks.
> >> 
> >> Signed-off-by: James Clark <james.clark@linaro.org>
> >> ---
> >>   arch/arm64/include/asm/kvm_host.h  |  5 +++++
> >>   arch/arm64/kvm/debug.c             | 28 ++++++++++++++++++++++++++++
> >>   arch/arm64/kvm/hyp/nvhe/debug-sr.c |  1 +
> >>   3 files changed, 34 insertions(+)
> >> 
> >> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> >> index ba251caa593b..cce07887551b 100644
> >> --- a/arch/arm64/include/asm/kvm_host.h
> >> +++ b/arch/arm64/include/asm/kvm_host.h
> >> @@ -613,6 +613,7 @@ struct kvm_host_data {
> >>   #define KVM_HOST_DATA_FLAG_HAS_SPE	0
> >>   #define KVM_HOST_DATA_FLAG_HAS_TRF	1
> >>   #define KVM_HOST_DATA_FLAG_TRBE_ENABLED	2
> >> +#define KVM_HOST_DATA_FLAG_GUEST_FILTER	3
> > 
> > Guest filter what? This is meaningless.
> > 
> 
> KVM_HOST_DATA_FLAG_SWAP_TRFCR maybe?

A flag indicates a state, just like a level interrupt. So it cannot be
a verb that indicates an action, after which the flag should be
dropped (just like an edge interrupt). Maybe if you explained what
this is supposed to indicate, we could come up with a better name.

I would have thought that something like EL1_TRACING_ENABLED would be
adequate, but it is unusually hard to understand what this is supposed
to be doing.

> 
> >>   	unsigned long flags;
> >>     	struct kvm_cpu_context host_ctxt;
> >> @@ -1387,6 +1388,8 @@ void kvm_clr_pmu_events(u64 clr);
> >>   bool kvm_set_pmuserenr(u64 val);
> >>   void kvm_enable_trbe(void);
> >>   void kvm_disable_trbe(void);
> >> +void kvm_set_trfcr(u64 guest_trfcr);
> >> +void kvm_clear_trfcr(void);
> >>   #else
> >>   static inline void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr) {}
> >>   static inline void kvm_clr_pmu_events(u64 clr) {}
> >> @@ -1396,6 +1399,8 @@ static inline bool kvm_set_pmuserenr(u64 val)
> >>   }
> >>   static inline void kvm_enable_trbe(void) {}
> >>   static inline void kvm_disable_trbe(void) {}
> >> +static inline void kvm_set_trfcr(u64 guest_trfcr) {}
> >> +static inline void kvm_clear_trfcr(void) {}
> >>   #endif
> >>     void kvm_vcpu_load_vhe(struct kvm_vcpu *vcpu);
> >> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> >> index 0c340ae7b5d1..9266f2776991 100644
> >> --- a/arch/arm64/kvm/debug.c
> >> +++ b/arch/arm64/kvm/debug.c
> >> @@ -337,3 +337,31 @@ void kvm_disable_trbe(void)
> >>   	host_data_clear_flag(TRBE_ENABLED);
> >>   }
> >>   EXPORT_SYMBOL_GPL(kvm_disable_trbe);
> >> +
> >> +void kvm_set_trfcr(u64 guest_trfcr)
> > 
> > Again. Is this the guest's view? or the host view while running the
> > guest? I asked the question on the previous patch, and you didn't
> > reply.
> > 
> 
> Ah, sorry, I missed that one:
> 
> > Guest value? Or host state while running the guest? If the former,
> > then this has nothing to do here. If the latter, this should be
> > spelled out (trfcr_in_guest?), and the comment amended.
> 
> Yes, the latter; guest TRFCR reads are undef anyway. I can rename this
> and the host_data variable to trfcr_in_guest.
> 
> >> +{
> >> +	if (is_protected_kvm_enabled() || WARN_ON_ONCE(preemptible()))
> >> +		return;
> >> +
> >> +	if (has_vhe())
> >> +		write_sysreg_s(guest_trfcr, SYS_TRFCR_EL12);
> >> +	else {
> >> +		*host_data_ptr(guest_trfcr_el1) = guest_trfcr;
> >> +		host_data_set_flag(GUEST_FILTER);
> >> +	}
> > 
> > Oh come on. This is basic coding style, see section 3 in
> > Documentation/process/coding-style.rst.
> > 
> 
> Oops, I'd have thought checkpatch could catch something like that. Will fix.

Checkpatch serves little to no purpose, really, and relying on it is a
very bad idea.

>
> >> +}
> >> +EXPORT_SYMBOL_GPL(kvm_set_trfcr);
> >> +
> >> +void kvm_clear_trfcr(void)
> >> +{
> >> +	if (is_protected_kvm_enabled() || WARN_ON_ONCE(preemptible()))
> >> +		return;
> >> +
> >> +	if (has_vhe())
> >> +		write_sysreg_s(0, SYS_TRFCR_EL12);
> >> +	else {
> >> +		*host_data_ptr(guest_trfcr_el1) = 0;
> >> +		host_data_clear_flag(GUEST_FILTER);
> >> +	}
> >> +}
> >> +EXPORT_SYMBOL_GPL(kvm_clear_trfcr);
> > 
> > Why do we have two helpers? Clearly, calling kvm_set_trfcr() with
> > E{1,0}TRE=={0,0} should result in *disabling* things.  Except it
> > doesn't, and you should fix it. Once that is fixed, it becomes
> > obvious that kvm_clear_trfcr() serves no purpose.
> > 
> 
> With only one kvm_set_trfcr() there's no way to distinguish between
> swapping in a value of 0 and stopping the swapping altogether. I
> thought we wanted a single flag that gated the register accesses so
> that the hyp mostly does nothing? With only kvm_set_trfcr() you first
> need to check FEAT_TRF, then you need to compare the real register
> with trfcr_in_guest to know whether to swap or not, every time.
> 
> Actually I think some of the previous versions had something like this
> but it was a bit more complicated.
> 
> Maybe set/clear_trfcr() aren't great names. Perhaps
> kvm_set_trfcr_in_guest() and kvm_disable_trfcr_in_guest(), with the
> second one hinting that it stops the swapping regardless of what the
> values are?

I really think the way you name things is getting in the way of simply
*understanding* what you are doing.

You don't disable or enable TRFCR. You enable or disable EL1 tracing
while in guest context. TRFCR is the tool by which you are doing that,
and the value you pass to these helpers is the tracing configuration.

> 
> I don't think calling kvm_set_trfcr() with E{1,0}TRE=={0,0} is
> actually broken in this version; it means that the Coresight driver
> wants that value to be installed for guests. So it should actually
> _enable_ swapping in the value of 0, not disable anything.

But you can decide whether there is a need for "swapping" if the
configuration is different from what's on the host, can't you?

> 
> > To sum it up, KVM's API should reflect the architecture instead of
> > making things up.
> > 
> 
> We had kvm_set_trfcr(u64 host_trfcr, u64 guest_trfcr) in the last
> version, which also serves the purpose I mentioned above, because you
> can check whether they're the same and disable the swapping. I don't
> know if that counts as reflecting the architecture better, but Oliver
> mentioned he preferred it more "intent" based, which is why I added
> the clear_trfcr().

My personal take on this would be something like:

void kvm_tracing_set_el1_configuration(u64 trfcr_while_in_guest)
{
	[VHE stuff omitted]

	*host_data_ptr(guest_trfcr_el1) = trfcr_while_in_guest;
	if (read_sysreg_s(SYS_TRFCR_EL1) != trfcr_while_in_guest)
		host_data_set_flag(EL1_TRACING_CONFIGURED);
	else
		host_data_clear_flag(EL1_TRACING_CONFIGURED);
}

and that'd be about it.
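
For completeness, the driver side would then collapse to something 
like this (a sketch; write_trfcr() assumed to be the existing helper 
from coresight-self-hosted-trace.h):

	/* session start: program the host filter first, since the
	 * helper compares against the live register, then describe
	 * what EL1 tracing should look like while a guest runs */
	write_trfcr(host_trfcr);
	kvm_tracing_set_el1_configuration(guest_trfcr);

	/* session stop: both configurations are now 0, so the helper
	 * sees host == guest and clears the flag by itself */
	write_trfcr(0);
	kvm_tracing_set_el1_configuration(0);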

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



end of thread

Thread overview: 16+ messages
2024-11-27 10:01 [PATCH v8 0/8] kvm/coresight: Support exclude guest and exclude host James Clark
2024-11-27 10:01 ` [PATCH v8 1/8] KVM: arm64: Get rid of __kvm_get_mdcr_el2() and related warts James Clark
2024-11-27 10:01 ` [PATCH v8 2/8] KVM: arm64: Track presence of SPE/TRBE in kvm_host_data instead of vCPU James Clark
2024-11-27 10:01 ` [PATCH v8 3/8] arm64/sysreg: Add a comment that the sysreg file should be sorted James Clark
2024-11-27 10:01 ` [PATCH v8 4/8] tools: arm64: Update sysreg.h header files James Clark
2024-11-27 10:01 ` [PATCH v8 5/8] arm64/sysreg/tools: Move TRFCR definitions to sysreg James Clark
2024-11-27 10:01 ` [PATCH v8 6/8] KVM: arm64: coresight: Give TRBE enabled state to KVM James Clark
2024-12-20 17:05   ` Marc Zyngier
2024-12-20 17:32     ` James Clark
2024-12-21 11:54       ` Marc Zyngier
2024-12-23 11:36         ` James Clark
2024-11-27 10:01 ` [PATCH v8 7/8] KVM: arm64: Support trace filtering for guests James Clark
2024-12-21 12:34   ` Marc Zyngier
2024-12-23 11:28     ` James Clark
2024-12-26 14:22       ` Marc Zyngier
2024-11-27 10:01 ` [PATCH v8 8/8] coresight: Pass guest TRFCR value to KVM James Clark
