linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/4] arm64: Support perf event modifiers :G and :H
@ 2018-11-22 11:25 Andrew Murray
  2018-11-22 11:25 ` [PATCH v3 1/4] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Andrew Murray @ 2018-11-22 11:25 UTC (permalink / raw)
  To: linux-arm-kernel

This patchset provides support for perf event modifiers :G and :H which
allows for filtering of PMU events between host and guests when used
with KVM.

As the underlying hardware cannot distinguish between guest and host
context, the performance counters must be stopped and started upon
entry/exit to the guest. This is performed at EL2 in a way that
minimizes overhead and improves accuracy of recording events that only
occur in the requested context.

This has been tested with VHE and non-VHE kernels with a KVM guest.

Changes from v2:

 - Ensured that exclude_kernel works for guest
 - Removed unnecessary exclusion of EL2 with exclude_host on !VHE
 - Renamed kvm_clr_set_host_pmu_events to reflect args order
 - Added additional information to isb patch

Changes from v1:

 - Removed unnecessary exclusion of EL1 with exclude_guest on VHE
 - Removed unnecessary isb from existing perf_event.c driver
 - Folded perf_event.c patches together
 - Added additional information to last patch commit message


Andrew Murray (4):
  arm64: arm_pmu: remove unnecessary isb instruction
  arm64: KVM: add accessors to track guest/host only counters
  arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  arm64: KVM: Enable support for :G/:H perf event modifiers

 arch/arm64/include/asm/kvm_host.h | 20 +++++++++++++++
 arch/arm64/kernel/perf_event.c    | 53 +++++++++++++++++++++++++++++++++------
 arch/arm64/kvm/hyp/switch.c       | 38 ++++++++++++++++++++++++++++
 3 files changed, 103 insertions(+), 8 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 1/4] arm64: arm_pmu: remove unnecessary isb instruction
  2018-11-22 11:25 [PATCH v3 0/4] arm64: Support perf event modifiers :G and :H Andrew Murray
@ 2018-11-22 11:25 ` Andrew Murray
  2018-11-22 11:25 ` [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Andrew Murray @ 2018-11-22 11:25 UTC (permalink / raw)
  To: linux-arm-kernel

The armv8pmu_enable_event_counter function issues an isb instruction
after enabling a pair of counters - this doesn't provide any value
and is inconsistent with the armv8pmu_disable_event_counter.

In any case armv8pmu_enable_event_counter is always called with the
PMU stopped. Starting the PMU with armv8pmu_start results in an isb
instruction being issued prior to writing to PMCR_EL0.

Let's remove the unnecessary isb instruction.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/kernel/perf_event.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 8e38d52..de564ae 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -652,7 +652,6 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event)
 	armv8pmu_enable_counter(idx);
 	if (armv8pmu_event_is_chained(event))
 		armv8pmu_enable_counter(idx - 1);
-	isb();
 }
 
 static inline int armv8pmu_disable_counter(int idx)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters
  2018-11-22 11:25 [PATCH v3 0/4] arm64: Support perf event modifiers :G and :H Andrew Murray
  2018-11-22 11:25 ` [PATCH v3 1/4] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
@ 2018-11-22 11:25 ` Andrew Murray
  2018-12-03 14:31   ` Will Deacon
  2018-11-22 11:25 ` [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
  2018-11-22 11:25 ` [PATCH v3 4/4] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
  3 siblings, 1 reply; 11+ messages in thread
From: Andrew Murray @ 2018-11-22 11:25 UTC (permalink / raw)
  To: linux-arm-kernel

In order to effeciently enable/disable guest/host only perf counters
at guest entry/exit we add bitfields to kvm_cpu_context for guest and
host only events as well as accessors for updating them.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
---
 arch/arm64/include/asm/kvm_host.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 1550192..4a828eb 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -203,6 +203,8 @@ struct kvm_cpu_context {
 	};
 
 	struct kvm_vcpu *__hyp_running_vcpu;
+	u32 events_host_only;
+	u32 events_guest_only;
 };
 
 typedef struct kvm_cpu_context kvm_cpu_context_t;
@@ -472,6 +474,24 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 {
 	return kvm_arch_vcpu_run_map_fp(vcpu);
 }
+static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set)
+{
+	kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state);
+
+	ctx->events_host_only &= ~clr;
+	ctx->events_host_only |= set;
+}
+
+static inline void kvm_clr_set_guest_pmu_events(u32 clr, u32 set)
+{
+	kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state);
+
+	ctx->events_guest_only &= ~clr;
+	ctx->events_guest_only |= set;
+}
+#else
+static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set) {}
+static inline void kvm_clr_set_guest_pmu_events(u32 clr, u32 set) {}
 #endif
 
 static inline void kvm_arm_vhe_guest_enter(void)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-11-22 11:25 [PATCH v3 0/4] arm64: Support perf event modifiers :G and :H Andrew Murray
  2018-11-22 11:25 ` [PATCH v3 1/4] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
  2018-11-22 11:25 ` [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
@ 2018-11-22 11:25 ` Andrew Murray
  2018-12-03 15:04   ` Will Deacon
  2018-11-22 11:25 ` [PATCH v3 4/4] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
  3 siblings, 1 reply; 11+ messages in thread
From: Andrew Murray @ 2018-11-22 11:25 UTC (permalink / raw)
  To: linux-arm-kernel

Add support for the :G and :H attributes in perf by handling the
exclude_host/exclude_guest event attributes.

We notify KVM of counters that we wish to be enabled or disabled on
guest entry/exit and thus defer from starting or stopping :G events
as per the events exclude_host attribute.

With both VHE and non-VHE we switch the counters between host/guest
at EL2. We are able to eliminate counters counting host events on
the boundaries of guest entry/exit when using :G by filtering out
EL2 for exclude_host. However when using :H unless exclude_hv is set
on non-VHE then there is a small blackout window at the guest
entry/exit where host events are not captured.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
---
 arch/arm64/kernel/perf_event.c | 52 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 45 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index de564ae..e0619ba 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -26,6 +26,7 @@
 
 #include <linux/acpi.h>
 #include <linux/clocksource.h>
+#include <linux/kvm_host.h>
 #include <linux/of.h>
 #include <linux/perf/arm_pmu.h>
 #include <linux/platform_device.h>
@@ -647,11 +648,23 @@ static inline int armv8pmu_enable_counter(int idx)
 
 static inline void armv8pmu_enable_event_counter(struct perf_event *event)
 {
+	struct perf_event_attr *attr = &event->attr;
 	int idx = event->hw.idx;
+	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
 
-	armv8pmu_enable_counter(idx);
 	if (armv8pmu_event_is_chained(event))
-		armv8pmu_enable_counter(idx - 1);
+		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
+
+	if (attr->exclude_host)
+		kvm_clr_set_guest_pmu_events(0, counter_bits);
+	if (attr->exclude_guest)
+		kvm_clr_set_host_pmu_events(0, counter_bits);
+
+	if (!attr->exclude_host) {
+		armv8pmu_enable_counter(idx);
+		if (armv8pmu_event_is_chained(event))
+			armv8pmu_enable_counter(idx - 1);
+	}
 }
 
 static inline int armv8pmu_disable_counter(int idx)
@@ -664,11 +677,23 @@ static inline int armv8pmu_disable_counter(int idx)
 static inline void armv8pmu_disable_event_counter(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	struct perf_event_attr *attr = &event->attr;
 	int idx = hwc->idx;
+	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
 
 	if (armv8pmu_event_is_chained(event))
-		armv8pmu_disable_counter(idx - 1);
-	armv8pmu_disable_counter(idx);
+		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
+
+	if (attr->exclude_host)
+		kvm_clr_set_guest_pmu_events(counter_bits, 0);
+	if (attr->exclude_guest)
+		kvm_clr_set_host_pmu_events(counter_bits, 0);
+
+	if (!attr->exclude_host) {
+		if (armv8pmu_event_is_chained(event))
+			armv8pmu_disable_counter(idx - 1);
+		armv8pmu_disable_counter(idx);
+	}
 }
 
 static inline int armv8pmu_enable_intens(int idx)
@@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
 	 * Therefore we ignore exclude_hv in this configuration, since
 	 * there's no hypervisor to sample anyway. This is consistent
 	 * with other architectures (x86 and Power).
+	 *
+	 * To eliminate counting host events on the boundaries of
+	 * guest entry/exit we ensure EL2 is not included in hyp mode
+	 * with !exclude_host.
 	 */
 	if (is_kernel_in_hyp_mode()) {
-		if (!attr->exclude_kernel)
+		if (!attr->exclude_kernel && !attr->exclude_host)
 			config_base |= ARMV8_PMU_INCLUDE_EL2;
 	} else {
-		if (attr->exclude_kernel)
-			config_base |= ARMV8_PMU_EXCLUDE_EL1;
 		if (!attr->exclude_hv)
 			config_base |= ARMV8_PMU_INCLUDE_EL2;
 	}
+
+	/*
+	 * Filter out !VHE kernels and guest kernels
+	 */
+	if (attr->exclude_kernel)
+		config_base |= ARMV8_PMU_EXCLUDE_EL1;
+
 	if (attr->exclude_user)
 		config_base |= ARMV8_PMU_EXCLUDE_EL0;
 
@@ -976,6 +1010,10 @@ static void armv8pmu_reset(void *info)
 		armv8pmu_disable_intens(idx);
 	}
 
+	/* Clear the counters we flip at guest entry/exit */
+	kvm_clr_set_host_pmu_events(U32_MAX, 0);
+	kvm_clr_set_guest_pmu_events(U32_MAX, 0);
+
 	/*
 	 * Initialize & Reset PMNC. Request overflow interrupt for
 	 * 64 bit cycle counter but cheat in armv8pmu_write_counter().
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 4/4] arm64: KVM: Enable support for :G/:H perf event modifiers
  2018-11-22 11:25 [PATCH v3 0/4] arm64: Support perf event modifiers :G and :H Andrew Murray
                   ` (2 preceding siblings ...)
  2018-11-22 11:25 ` [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
@ 2018-11-22 11:25 ` Andrew Murray
  3 siblings, 0 replies; 11+ messages in thread
From: Andrew Murray @ 2018-11-22 11:25 UTC (permalink / raw)
  To: linux-arm-kernel

Enable/disable event counters as appropriate when entering and exiting
the guest to enable support for guest or host only event counting.

For both VHE and non-VHE we switch the counters between host/guest at
EL2. EL2 is filtered out by the PMU when we are using the :G modifier.

The PMU may be on when we change which counters are enabled however
we avoid adding an isb as we instead rely on existing context
synchronisation events: the isb in kvm_arm_vhe_guest_exit for VHE and
the eret from the hvc in kvm_call_hyp.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
---
 arch/arm64/kvm/hyp/switch.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index d496ef5..ebf0aac 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -373,6 +373,32 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+static bool __hyp_text __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt)
+{
+	u32 host_only = host_ctxt->events_host_only;
+	u32 guest_only = host_ctxt->events_guest_only;
+
+	if (host_only)
+		write_sysreg(host_only, pmcntenclr_el0);
+
+	if (guest_only)
+		write_sysreg(guest_only, pmcntenset_el0);
+
+	return (host_only || guest_only);
+}
+
+static void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
+{
+	u32 host_only = host_ctxt->events_host_only;
+	u32 guest_only = host_ctxt->events_guest_only;
+
+	if (guest_only)
+		write_sysreg(guest_only, pmcntenclr_el0);
+
+	if (host_only)
+		write_sysreg(host_only, pmcntenset_el0);
+}
+
 /*
  * Return true when we were able to fixup the guest exit and should return to
  * the guest, false when we should restore the host state and return to the
@@ -488,12 +514,15 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpu_context *host_ctxt;
 	struct kvm_cpu_context *guest_ctxt;
+	bool pmu_switch_needed;
 	u64 exit_code;
 
 	host_ctxt = vcpu->arch.host_cpu_context;
 	host_ctxt->__hyp_running_vcpu = vcpu;
 	guest_ctxt = &vcpu->arch.ctxt;
 
+	pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
+
 	sysreg_save_host_state_vhe(host_ctxt);
 
 	__activate_traps(vcpu);
@@ -524,6 +553,9 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 
 	__debug_switch_to_host(vcpu);
 
+	if (pmu_switch_needed)
+		__pmu_switch_to_host(host_ctxt);
+
 	return exit_code;
 }
 
@@ -532,6 +564,7 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpu_context *host_ctxt;
 	struct kvm_cpu_context *guest_ctxt;
+	bool pmu_switch_needed;
 	u64 exit_code;
 
 	vcpu = kern_hyp_va(vcpu);
@@ -540,6 +573,8 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 	host_ctxt->__hyp_running_vcpu = vcpu;
 	guest_ctxt = &vcpu->arch.ctxt;
 
+	pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
+
 	__sysreg_save_state_nvhe(host_ctxt);
 
 	__activate_traps(vcpu);
@@ -586,6 +621,9 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 	 */
 	__debug_switch_to_host(vcpu);
 
+	if (pmu_switch_needed)
+		__pmu_switch_to_host(host_ctxt);
+
 	return exit_code;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters
  2018-11-22 11:25 ` [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
@ 2018-12-03 14:31   ` Will Deacon
  2018-12-03 15:05     ` Andrew Murray
  0 siblings, 1 reply; 11+ messages in thread
From: Will Deacon @ 2018-12-03 14:31 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel

On Thu, Nov 22, 2018 at 11:25:48AM +0000, Andrew Murray wrote:
> In order to effeciently enable/disable guest/host only perf counters
> at guest entry/exit we add bitfields to kvm_cpu_context for guest and
> host only events as well as accessors for updating them.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> ---
>  arch/arm64/include/asm/kvm_host.h | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 1550192..4a828eb 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -203,6 +203,8 @@ struct kvm_cpu_context {
>  	};
>  
>  	struct kvm_vcpu *__hyp_running_vcpu;
> +	u32 events_host_only;
> +	u32 events_guest_only;
>  };
>  
>  typedef struct kvm_cpu_context kvm_cpu_context_t;
> @@ -472,6 +474,24 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
>  {
>  	return kvm_arch_vcpu_run_map_fp(vcpu);
>  }
> +static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set)
> +{
> +	kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state);
> +
> +	ctx->events_host_only &= ~clr;
> +	ctx->events_host_only |= set;
> +}

Do you actually need this clr/set function, or can you just use the
facilities provided by bitfield.h at the callsites?

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-11-22 11:25 ` [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
@ 2018-12-03 15:04   ` Will Deacon
  2018-12-03 22:37     ` Andrew Murray
  0 siblings, 1 reply; 11+ messages in thread
From: Will Deacon @ 2018-12-03 15:04 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel

Hi Andrew,

On Thu, Nov 22, 2018 at 11:25:49AM +0000, Andrew Murray wrote:
> Add support for the :G and :H attributes in perf by handling the
> exclude_host/exclude_guest event attributes.
> 
> We notify KVM of counters that we wish to be enabled or disabled on
> guest entry/exit and thus defer from starting or stopping :G events
> as per the events exclude_host attribute.
> 
> With both VHE and non-VHE we switch the counters between host/guest
> at EL2. We are able to eliminate counters counting host events on
> the boundaries of guest entry/exit when using :G by filtering out
> EL2 for exclude_host. However when using :H unless exclude_hv is set
> on non-VHE then there is a small blackout window at the guest
> entry/exit where host events are not captured.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> ---
>  arch/arm64/kernel/perf_event.c | 52 ++++++++++++++++++++++++++++++++++++------
>  1 file changed, 45 insertions(+), 7 deletions(-)

I've got a cold at the moment, so it could just be that, but I struggled to
follow the logic in this patch. Comments below.

> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index de564ae..e0619ba 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -26,6 +26,7 @@
>  
>  #include <linux/acpi.h>
>  #include <linux/clocksource.h>
> +#include <linux/kvm_host.h>
>  #include <linux/of.h>
>  #include <linux/perf/arm_pmu.h>
>  #include <linux/platform_device.h>
> @@ -647,11 +648,23 @@ static inline int armv8pmu_enable_counter(int idx)
>  
>  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
>  {
> +	struct perf_event_attr *attr = &event->attr;
>  	int idx = event->hw.idx;
> +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>  
> -	armv8pmu_enable_counter(idx);
>  	if (armv8pmu_event_is_chained(event))
> -		armv8pmu_enable_counter(idx - 1);
> +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> +
> +	if (attr->exclude_host)
> +		kvm_clr_set_guest_pmu_events(0, counter_bits);
> +	if (attr->exclude_guest)
> +		kvm_clr_set_host_pmu_events(0, counter_bits);

Hmm, so if I have both exclude_host and exclude_guest set then we'll set
the counter in both events_host_only and events_guest_only, which is weird
and probably not what you want.

I think I'd find it easier to understand if you dropped the "only" suffix on
the kvm masks, and you actually made the logic positive so that an event
that counts in both host and guest has its bit set in both of the masks.

> +
> +	if (!attr->exclude_host) {
> +		armv8pmu_enable_counter(idx);
> +		if (armv8pmu_event_is_chained(event))
> +			armv8pmu_enable_counter(idx - 1);
> +	}
>  }
>  
>  static inline int armv8pmu_disable_counter(int idx)
> @@ -664,11 +677,23 @@ static inline int armv8pmu_disable_counter(int idx)
>  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
>  {
>  	struct hw_perf_event *hwc = &event->hw;
> +	struct perf_event_attr *attr = &event->attr;
>  	int idx = hwc->idx;
> +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
>  
>  	if (armv8pmu_event_is_chained(event))
> -		armv8pmu_disable_counter(idx - 1);
> -	armv8pmu_disable_counter(idx);
> +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> +
> +	if (attr->exclude_host)
> +		kvm_clr_set_guest_pmu_events(counter_bits, 0);
> +	if (attr->exclude_guest)
> +		kvm_clr_set_host_pmu_events(counter_bits, 0);
> +
> +	if (!attr->exclude_host) {
> +		if (armv8pmu_event_is_chained(event))
> +			armv8pmu_disable_counter(idx - 1);
> +		armv8pmu_disable_counter(idx);
> +	}

It looks like you're relying on the perf_event_attr remaining unchanged
between the enable/disable calls for a given event. However, I'm not
convinced that's the case -- have you checked that e.g.
PERF_EVENT_IOC_MODIFY_ATTRIBUTES can't rewrite the thing behind your
back?

>  }
>  
>  static inline int armv8pmu_enable_intens(int idx)
> @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>  	 * Therefore we ignore exclude_hv in this configuration, since
>  	 * there's no hypervisor to sample anyway. This is consistent
>  	 * with other architectures (x86 and Power).
> +	 *
> +	 * To eliminate counting host events on the boundaries of
> +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> +	 * with !exclude_host.
>  	 */
>  	if (is_kernel_in_hyp_mode()) {
> -		if (!attr->exclude_kernel)
> +		if (!attr->exclude_kernel && !attr->exclude_host)
>  			config_base |= ARMV8_PMU_INCLUDE_EL2;

Do you know what perf passes by default here? I'm worried that we're
introducing a user-visible change to existing programs by changing this
check.

>  	} else {
> -		if (attr->exclude_kernel)
> -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
>  		if (!attr->exclude_hv)
>  			config_base |= ARMV8_PMU_INCLUDE_EL2;
>  	}
> +
> +	/*
> +	 * Filter out !VHE kernels and guest kernels
> +	 */
> +	if (attr->exclude_kernel)
> +		config_base |= ARMV8_PMU_EXCLUDE_EL1;

Why have you moved this hunk?

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters
  2018-12-03 14:31   ` Will Deacon
@ 2018-12-03 15:05     ` Andrew Murray
  2018-12-03 16:58       ` Will Deacon
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Murray @ 2018-12-03 15:05 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel

On Mon, Dec 03, 2018 at 02:31:14PM +0000, Will Deacon wrote:
> On Thu, Nov 22, 2018 at 11:25:48AM +0000, Andrew Murray wrote:
> > In order to effeciently enable/disable guest/host only perf counters
> > at guest entry/exit we add bitfields to kvm_cpu_context for guest and
> > host only events as well as accessors for updating them.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h | 20 ++++++++++++++++++++
> >  1 file changed, 20 insertions(+)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 1550192..4a828eb 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -203,6 +203,8 @@ struct kvm_cpu_context {
> >  	};
> >  
> >  	struct kvm_vcpu *__hyp_running_vcpu;
> > +	u32 events_host_only;
> > +	u32 events_guest_only;
> >  };
> >  
> >  typedef struct kvm_cpu_context kvm_cpu_context_t;
> > @@ -472,6 +474,24 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
> >  {
> >  	return kvm_arch_vcpu_run_map_fp(vcpu);
> >  }
> > +static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set)
> > +{
> > +	kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state);
> > +
> > +	ctx->events_host_only &= ~clr;
> > +	ctx->events_host_only |= set;
> > +}
> 
> Do you actually need this clr/set function, or can you just use the
> facilities provided by bitfield.h at the callsites?

This modifies events_{host|guest}_only which is conditionally defined
(CONFIG_KVM) yet used by arch/arm64/kernel/perf_events which is also
conditionally defined (CONFIG_HW_PERF_EVENTS). By moving this away
from the callsite we avoid a load of horrible #ifdef's and allow this
function to become a nop when KVM isn't enabled.

Is there value in updating this function to use set_bit/clear_bit
functions in bitops/atomic.h ?

Andrew Murray

> 
> Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters
  2018-12-03 15:05     ` Andrew Murray
@ 2018-12-03 16:58       ` Will Deacon
  2018-12-03 22:44         ` Andrew Murray
  0 siblings, 1 reply; 11+ messages in thread
From: Will Deacon @ 2018-12-03 16:58 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel

On Mon, Dec 03, 2018 at 03:05:52PM +0000, Andrew Murray wrote:
> On Mon, Dec 03, 2018 at 02:31:14PM +0000, Will Deacon wrote:
> > On Thu, Nov 22, 2018 at 11:25:48AM +0000, Andrew Murray wrote:
> > > In order to effeciently enable/disable guest/host only perf counters
> > > at guest entry/exit we add bitfields to kvm_cpu_context for guest and
> > > host only events as well as accessors for updating them.
> > > 
> > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > ---
> > >  arch/arm64/include/asm/kvm_host.h | 20 ++++++++++++++++++++
> > >  1 file changed, 20 insertions(+)
> > > 
> > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > index 1550192..4a828eb 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -203,6 +203,8 @@ struct kvm_cpu_context {
> > >  	};
> > >  
> > >  	struct kvm_vcpu *__hyp_running_vcpu;
> > > +	u32 events_host_only;
> > > +	u32 events_guest_only;
> > >  };
> > >  
> > >  typedef struct kvm_cpu_context kvm_cpu_context_t;
> > > @@ -472,6 +474,24 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
> > >  {
> > >  	return kvm_arch_vcpu_run_map_fp(vcpu);
> > >  }
> > > +static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set)
> > > +{
> > > +	kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state);
> > > +
> > > +	ctx->events_host_only &= ~clr;
> > > +	ctx->events_host_only |= set;
> > > +}
> > 
> > Do you actually need this clr/set function, or can you just use the
> > facilities provided by bitfield.h at the callsites?
> 
> This modifies events_{host|guest}_only which is conditionally defined
> (CONFIG_KVM) yet used by arch/arm64/kernel/perf_events which is also
> conditionally defined (CONFIG_HW_PERF_EVENTS). By moving this away
> from the callsite we avoid a load of horrible #ifdef's and allow this
> function to become a nop when KVM isn't enabled.
> 
> Is there value in updating this function to use set_bit/clear_bit
> functions in bitops/atomic.h ?

You don't need the atomicity, so you'd be looking at the '__' variants
instead. My main concern is that we're introducing a clr_set_* API which
only ever clears or sets, so I think it's needlessly complicated. If you
need two separate macros that are just #defined to __set_bit/__clear_bit
or expand to nothing then that could work too.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  2018-12-03 15:04   ` Will Deacon
@ 2018-12-03 22:37     ` Andrew Murray
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Murray @ 2018-12-03 22:37 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel

On Mon, Dec 03, 2018 at 03:04:14PM +0000, Will Deacon wrote:
> Hi Andrew,
> 
> On Thu, Nov 22, 2018 at 11:25:49AM +0000, Andrew Murray wrote:
> > Add support for the :G and :H attributes in perf by handling the
> > exclude_host/exclude_guest event attributes.
> > 
> > We notify KVM of counters that we wish to be enabled or disabled on
> > guest entry/exit and thus defer from starting or stopping :G events
> > as per the events exclude_host attribute.
> > 
> > With both VHE and non-VHE we switch the counters between host/guest
> > at EL2. We are able to eliminate counters counting host events on
> > the boundaries of guest entry/exit when using :G by filtering out
> > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > on non-VHE then there is a small blackout window at the guest
> > entry/exit where host events are not captured.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > ---
> >  arch/arm64/kernel/perf_event.c | 52 ++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 45 insertions(+), 7 deletions(-)
> 
> I've got a cold at the moment, so it could just be that, but I struggled to
> follow the logic in this patch. Comments below.
> 
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index de564ae..e0619ba 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -26,6 +26,7 @@
> >  
> >  #include <linux/acpi.h>
> >  #include <linux/clocksource.h>
> > +#include <linux/kvm_host.h>
> >  #include <linux/of.h>
> >  #include <linux/perf/arm_pmu.h>
> >  #include <linux/platform_device.h>
> > @@ -647,11 +648,23 @@ static inline int armv8pmu_enable_counter(int idx)
> >  
> >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> >  {
> > +	struct perf_event_attr *attr = &event->attr;
> >  	int idx = event->hw.idx;
> > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> >  
> > -	armv8pmu_enable_counter(idx);
> >  	if (armv8pmu_event_is_chained(event))
> > -		armv8pmu_enable_counter(idx - 1);
> > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > +
> > +	if (attr->exclude_host)
> > +		kvm_clr_set_guest_pmu_events(0, counter_bits);
> > +	if (attr->exclude_guest)
> > +		kvm_clr_set_host_pmu_events(0, counter_bits);
> 
> Hmm, so if I have both exclude_host and exclude_guest set then we'll set
> the counter in both events_host_only and events_guest_only, which is weird
> and probably not what you want.

perf_event_attr is passed from userspace, whilst the perf tool doesn't allow
you to set exclude_host and exclude_guest at the same time it's feasiable for
another user to do so. So indeed the current behaviour wouldn't be expected.

> 
> I think I'd find it easier to understand if you dropped the "only" suffix on
> the kvm masks, and you actually made the logic positive so that an event
> that counts in both host and guest has its bit set in both of the masks.

No problem. This actually fixes a bug when multiple events are enabled.

> 
> > +
> > +	if (!attr->exclude_host) {
> > +		armv8pmu_enable_counter(idx);
> > +		if (armv8pmu_event_is_chained(event))
> > +			armv8pmu_enable_counter(idx - 1);
> > +	}
> >  }
> >  
> >  static inline int armv8pmu_disable_counter(int idx)
> > @@ -664,11 +677,23 @@ static inline int armv8pmu_disable_counter(int idx)
> >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> >  {
> >  	struct hw_perf_event *hwc = &event->hw;
> > +	struct perf_event_attr *attr = &event->attr;
> >  	int idx = hwc->idx;
> > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> >  
> >  	if (armv8pmu_event_is_chained(event))
> > -		armv8pmu_disable_counter(idx - 1);
> > -	armv8pmu_disable_counter(idx);
> > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > +
> > +	if (attr->exclude_host)
> > +		kvm_clr_set_guest_pmu_events(counter_bits, 0);
> > +	if (attr->exclude_guest)
> > +		kvm_clr_set_host_pmu_events(counter_bits, 0);
> > +
> > +	if (!attr->exclude_host) {
> > +		if (armv8pmu_event_is_chained(event))
> > +			armv8pmu_disable_counter(idx - 1);
> > +		armv8pmu_disable_counter(idx);
> > +	}
> 
> It looks like you're relying on the perf_event_attr remaining unchanged
> between the enable/disable calls for a given event. However, I'm not
> convinced that's the case -- have you checked that e.g.
> PERF_EVENT_IOC_MODIFY_ATTRIBUTES can't rewrite the thing behind your
> back?

Even better - I can unconditionally clear the counter_bits here seeing as the
event is being disabled.

> 
> >  }
> >  
> >  static inline int armv8pmu_enable_intens(int idx)
> > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> >  	 * Therefore we ignore exclude_hv in this configuration, since
> >  	 * there's no hypervisor to sample anyway. This is consistent
> >  	 * with other architectures (x86 and Power).
> > +	 *
> > +	 * To eliminate counting host events on the boundaries of
> > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > +	 * with !exclude_host.
> >  	 */
> >  	if (is_kernel_in_hyp_mode()) {
> > -		if (!attr->exclude_kernel)
> > +		if (!attr->exclude_kernel && !attr->exclude_host)
> >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> 
> Do you know what perf passes by default here? I'm worried that we're
> introducing a user-visible change to existing programs by changing this
> check.

When running 'perf xxx' exclude_guest is set and exclude_host is unset. And the
opposite when running 'perf kvm xxx'. So for existing users that do not use
virtualisation then there is no change.

> 
> >  	} else {
> > -		if (attr->exclude_kernel)
> > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> >  		if (!attr->exclude_hv)
> >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> >  	}
> > +
> > +	/*
> > +	 * Filter out !VHE kernels and guest kernels
> > +	 */
> > +	if (attr->exclude_kernel)
> > +		config_base |= ARMV8_PMU_EXCLUDE_EL1;
> 
> Why have you moved this hunk?

When guests use perf the guest PMU is emulated by KVM (virt/kvm/arm/pmu) and is
backed by host perf events, these events always have the exclude_host flag set.
Therefore to filter these kernel events we need to filter EL1. By moving this
hunk we ensure guest kernel filtering works even when the host is VHE. (The
side effect is that we unnecessarily filter EL1 on VHE host kernel events, but
this has no effect as no-one is using this except any guest.)

Thanks,

Andrew Murray

> 
> Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters
  2018-12-03 16:58       ` Will Deacon
@ 2018-12-03 22:44         ` Andrew Murray
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Murray @ 2018-12-03 22:44 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas,
	Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel

On Mon, Dec 03, 2018 at 04:58:43PM +0000, Will Deacon wrote:
> On Mon, Dec 03, 2018 at 03:05:52PM +0000, Andrew Murray wrote:
> > On Mon, Dec 03, 2018 at 02:31:14PM +0000, Will Deacon wrote:
> > > On Thu, Nov 22, 2018 at 11:25:48AM +0000, Andrew Murray wrote:
> > > > In order to effeciently enable/disable guest/host only perf counters
> > > > at guest entry/exit we add bitfields to kvm_cpu_context for guest and
> > > > host only events as well as accessors for updating them.
> > > > 
> > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > > ---
> > > >  arch/arm64/include/asm/kvm_host.h | 20 ++++++++++++++++++++
> > > >  1 file changed, 20 insertions(+)
> > > > 
> > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > > index 1550192..4a828eb 100644
> > > > --- a/arch/arm64/include/asm/kvm_host.h
> > > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > > @@ -203,6 +203,8 @@ struct kvm_cpu_context {
> > > >  	};
> > > >  
> > > >  	struct kvm_vcpu *__hyp_running_vcpu;
> > > > +	u32 events_host_only;
> > > > +	u32 events_guest_only;
> > > >  };
> > > >  
> > > >  typedef struct kvm_cpu_context kvm_cpu_context_t;
> > > > @@ -472,6 +474,24 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
> > > >  {
> > > >  	return kvm_arch_vcpu_run_map_fp(vcpu);
> > > >  }
> > > > +static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set)
> > > > +{
> > > > +	kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state);
> > > > +
> > > > +	ctx->events_host_only &= ~clr;
> > > > +	ctx->events_host_only |= set;
> > > > +}
> > > 
> > > Do you actually need this clr/set function, or can you just use the
> > > facilities provided by bitfield.h at the callsites?
> > 
> > This modifies events_{host|guest}_only which is conditionally defined
> > (CONFIG_KVM) yet used by arch/arm64/kernel/perf_events which is also
> > conditionally defined (CONFIG_HW_PERF_EVENTS). By moving this away
> > from the callsite we avoid a load of horrible #ifdef's and allow this
> > function to become a nop when KVM isn't enabled.
> > 
> > Is there value in updating this function to use set_bit/clear_bit
> > functions in bitops/atomic.h ?
> 
> You don't need the atomicity, so you'd be looking at the '__' variants
> instead. My main concern is that we're introducing a clr_set_* API which
> only ever clears or sets, so I think it's needlessly complicated. If you
> need two separate macros that are just #defined to __set_bit/__clear_bit
> or expand to nothing then that could work too.

Though it would be 4 macros - a set and clear for both events_host and
events_guest. Though with the proposed changes in the other patch, this
could end up being 3 macros seeing as we'd always clear both events_host
and events_guest at the same time.

Thanks,

Andrew Murray

> 
> Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-12-03 22:44 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-11-22 11:25 [PATCH v3 0/4] arm64: Support perf event modifiers :G and :H Andrew Murray
2018-11-22 11:25 ` [PATCH v3 1/4] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
2018-11-22 11:25 ` [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
2018-12-03 14:31   ` Will Deacon
2018-12-03 15:05     ` Andrew Murray
2018-12-03 16:58       ` Will Deacon
2018-12-03 22:44         ` Andrew Murray
2018-11-22 11:25 ` [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
2018-12-03 15:04   ` Will Deacon
2018-12-03 22:37     ` Andrew Murray
2018-11-22 11:25 ` [PATCH v3 4/4] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).