* [PATCH v3 0/4] arm64: Support perf event modifiers :G and :H @ 2018-11-22 11:25 Andrew Murray 2018-11-22 11:25 ` [PATCH v3 1/4] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray ` (3 more replies) 0 siblings, 4 replies; 11+ messages in thread From: Andrew Murray @ 2018-11-22 11:25 UTC (permalink / raw) To: linux-arm-kernel This patchset provides support for perf event modifiers :G and :H which allows for filtering of PMU events between host and guests when used with KVM. As the underlying hardware cannot distinguish between guest and host context, the performance counters must be stopped and started upon entry/exit to the guest. This is performed at EL2 in a way that minimizes overhead and improves accuracy of recording events that only occur in the requested context. This has been tested with VHE and non-VHE kernels with a KVM guest. Changes from v2: - Ensured that exclude_kernel works for guest - Removed unnecessary exclusion of EL2 with exclude_host on !VHE - Renamed kvm_clr_set_host_pmu_events to reflect args order - Added additional information to isb patch Changes from v1: - Removed unnecessary exclusion of EL1 with exclude_guest on VHE - Removed unnecessary isb from existing perf_event.c driver - Folded perf_event.c patches together - Added additional information to last patch commit message Andrew Murray (4): arm64: arm_pmu: remove unnecessary isb instruction arm64: KVM: add accessors to track guest/host only counters arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes arm64: KVM: Enable support for :G/:H perf event modifiers arch/arm64/include/asm/kvm_host.h | 20 +++++++++++++++ arch/arm64/kernel/perf_event.c | 53 +++++++++++++++++++++++++++++++++------ arch/arm64/kvm/hyp/switch.c | 38 ++++++++++++++++++++++++++++ 3 files changed, 103 insertions(+), 8 deletions(-) -- 2.7.4 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v3 1/4] arm64: arm_pmu: remove unnecessary isb instruction 2018-11-22 11:25 [PATCH v3 0/4] arm64: Support perf event modifiers :G and :H Andrew Murray @ 2018-11-22 11:25 ` Andrew Murray 2018-11-22 11:25 ` [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters Andrew Murray ` (2 subsequent siblings) 3 siblings, 0 replies; 11+ messages in thread From: Andrew Murray @ 2018-11-22 11:25 UTC (permalink / raw) To: linux-arm-kernel The armv8pmu_enable_event_counter function issues an isb instruction after enabling a pair of counters - this doesn't provide any value and is inconsistent with the armv8pmu_disable_event_counter. In any case armv8pmu_enable_event_counter is always called with the PMU stopped. Starting the PMU with armv8pmu_start results in an isb instruction being issued prior to writing to PMCR_EL0. Let's remove the unnecessary isb instruction. Signed-off-by: Andrew Murray <andrew.murray@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> --- arch/arm64/kernel/perf_event.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 8e38d52..de564ae 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -652,7 +652,6 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event) armv8pmu_enable_counter(idx); if (armv8pmu_event_is_chained(event)) armv8pmu_enable_counter(idx - 1); - isb(); } static inline int armv8pmu_disable_counter(int idx) -- 2.7.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters 2018-11-22 11:25 [PATCH v3 0/4] arm64: Support perf event modifiers :G and :H Andrew Murray 2018-11-22 11:25 ` [PATCH v3 1/4] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray @ 2018-11-22 11:25 ` Andrew Murray 2018-12-03 14:31 ` Will Deacon 2018-11-22 11:25 ` [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray 2018-11-22 11:25 ` [PATCH v3 4/4] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray 3 siblings, 1 reply; 11+ messages in thread From: Andrew Murray @ 2018-11-22 11:25 UTC (permalink / raw) To: linux-arm-kernel In order to effeciently enable/disable guest/host only perf counters at guest entry/exit we add bitfields to kvm_cpu_context for guest and host only events as well as accessors for updating them. Signed-off-by: Andrew Murray <andrew.murray@arm.com> --- arch/arm64/include/asm/kvm_host.h | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 1550192..4a828eb 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -203,6 +203,8 @@ struct kvm_cpu_context { }; struct kvm_vcpu *__hyp_running_vcpu; + u32 events_host_only; + u32 events_guest_only; }; typedef struct kvm_cpu_context kvm_cpu_context_t; @@ -472,6 +474,24 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu) { return kvm_arch_vcpu_run_map_fp(vcpu); } +static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set) +{ + kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state); + + ctx->events_host_only &= ~clr; + ctx->events_host_only |= set; +} + +static inline void kvm_clr_set_guest_pmu_events(u32 clr, u32 set) +{ + kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state); + + ctx->events_guest_only &= ~clr; + ctx->events_guest_only |= set; +} +#else +static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set) {} +static inline void kvm_clr_set_guest_pmu_events(u32 clr, u32 set) {} #endif static inline void kvm_arm_vhe_guest_enter(void) -- 2.7.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters 2018-11-22 11:25 ` [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters Andrew Murray @ 2018-12-03 14:31 ` Will Deacon 2018-12-03 15:05 ` Andrew Murray 0 siblings, 1 reply; 11+ messages in thread From: Will Deacon @ 2018-12-03 14:31 UTC (permalink / raw) To: Andrew Murray Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas, Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel On Thu, Nov 22, 2018 at 11:25:48AM +0000, Andrew Murray wrote: > In order to effeciently enable/disable guest/host only perf counters > at guest entry/exit we add bitfields to kvm_cpu_context for guest and > host only events as well as accessors for updating them. > > Signed-off-by: Andrew Murray <andrew.murray@arm.com> > --- > arch/arm64/include/asm/kvm_host.h | 20 ++++++++++++++++++++ > 1 file changed, 20 insertions(+) > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > index 1550192..4a828eb 100644 > --- a/arch/arm64/include/asm/kvm_host.h > +++ b/arch/arm64/include/asm/kvm_host.h > @@ -203,6 +203,8 @@ struct kvm_cpu_context { > }; > > struct kvm_vcpu *__hyp_running_vcpu; > + u32 events_host_only; > + u32 events_guest_only; > }; > > typedef struct kvm_cpu_context kvm_cpu_context_t; > @@ -472,6 +474,24 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu) > { > return kvm_arch_vcpu_run_map_fp(vcpu); > } > +static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set) > +{ > + kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state); > + > + ctx->events_host_only &= ~clr; > + ctx->events_host_only |= set; > +} Do you actually need this clr/set function, or can you just use the facilities provided by bitfield.h at the callsites? Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters 2018-12-03 14:31 ` Will Deacon @ 2018-12-03 15:05 ` Andrew Murray 2018-12-03 16:58 ` Will Deacon 0 siblings, 1 reply; 11+ messages in thread From: Andrew Murray @ 2018-12-03 15:05 UTC (permalink / raw) To: Will Deacon Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas, Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel On Mon, Dec 03, 2018 at 02:31:14PM +0000, Will Deacon wrote: > On Thu, Nov 22, 2018 at 11:25:48AM +0000, Andrew Murray wrote: > > In order to effeciently enable/disable guest/host only perf counters > > at guest entry/exit we add bitfields to kvm_cpu_context for guest and > > host only events as well as accessors for updating them. > > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com> > > --- > > arch/arm64/include/asm/kvm_host.h | 20 ++++++++++++++++++++ > > 1 file changed, 20 insertions(+) > > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > > index 1550192..4a828eb 100644 > > --- a/arch/arm64/include/asm/kvm_host.h > > +++ b/arch/arm64/include/asm/kvm_host.h > > @@ -203,6 +203,8 @@ struct kvm_cpu_context { > > }; > > > > struct kvm_vcpu *__hyp_running_vcpu; > > + u32 events_host_only; > > + u32 events_guest_only; > > }; > > > > typedef struct kvm_cpu_context kvm_cpu_context_t; > > @@ -472,6 +474,24 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu) > > { > > return kvm_arch_vcpu_run_map_fp(vcpu); > > } > > +static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set) > > +{ > > + kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state); > > + > > + ctx->events_host_only &= ~clr; > > + ctx->events_host_only |= set; > > +} > > Do you actually need this clr/set function, or can you just use the > facilities provided by bitfield.h at the callsites? This modifies events_{host|guest}_only which is conditionally defined (CONFIG_KVM) yet used by arch/arm64/kernel/perf_events which is also conditionally defined (CONFIG_HW_PERF_EVENTS). By moving this away from the callsite we avoid a load of horrible #ifdef's and allow this function to become a nop when KVM isn't enabled. Is there value in updating this function to use set_bit/clear_bit functions in bitops/atomic.h ? Andrew Murray > > Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters 2018-12-03 15:05 ` Andrew Murray @ 2018-12-03 16:58 ` Will Deacon 2018-12-03 22:44 ` Andrew Murray 0 siblings, 1 reply; 11+ messages in thread From: Will Deacon @ 2018-12-03 16:58 UTC (permalink / raw) To: Andrew Murray Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas, Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel On Mon, Dec 03, 2018 at 03:05:52PM +0000, Andrew Murray wrote: > On Mon, Dec 03, 2018 at 02:31:14PM +0000, Will Deacon wrote: > > On Thu, Nov 22, 2018 at 11:25:48AM +0000, Andrew Murray wrote: > > > In order to effeciently enable/disable guest/host only perf counters > > > at guest entry/exit we add bitfields to kvm_cpu_context for guest and > > > host only events as well as accessors for updating them. > > > > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com> > > > --- > > > arch/arm64/include/asm/kvm_host.h | 20 ++++++++++++++++++++ > > > 1 file changed, 20 insertions(+) > > > > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > > > index 1550192..4a828eb 100644 > > > --- a/arch/arm64/include/asm/kvm_host.h > > > +++ b/arch/arm64/include/asm/kvm_host.h > > > @@ -203,6 +203,8 @@ struct kvm_cpu_context { > > > }; > > > > > > struct kvm_vcpu *__hyp_running_vcpu; > > > + u32 events_host_only; > > > + u32 events_guest_only; > > > }; > > > > > > typedef struct kvm_cpu_context kvm_cpu_context_t; > > > @@ -472,6 +474,24 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu) > > > { > > > return kvm_arch_vcpu_run_map_fp(vcpu); > > > } > > > +static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set) > > > +{ > > > + kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state); > > > + > > > + ctx->events_host_only &= ~clr; > > > + ctx->events_host_only |= set; > > > +} > > > > Do you actually need this clr/set function, or can you just use the > > facilities provided by bitfield.h at the callsites? > > This modifies events_{host|guest}_only which is conditionally defined > (CONFIG_KVM) yet used by arch/arm64/kernel/perf_events which is also > conditionally defined (CONFIG_HW_PERF_EVENTS). By moving this away > from the callsite we avoid a load of horrible #ifdef's and allow this > function to become a nop when KVM isn't enabled. > > Is there value in updating this function to use set_bit/clear_bit > functions in bitops/atomic.h ? You don't need the atomicity, so you'd be looking at the '__' variants instead. My main concern is that we're introducing a clr_set_* API which only ever clears or sets, so I think it's needlessly complicated. If you need two separate macros that are just #defined to __set_bit/__clear_bit or expand to nothing then that could work too. Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters 2018-12-03 16:58 ` Will Deacon @ 2018-12-03 22:44 ` Andrew Murray 0 siblings, 0 replies; 11+ messages in thread From: Andrew Murray @ 2018-12-03 22:44 UTC (permalink / raw) To: Will Deacon Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas, Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel On Mon, Dec 03, 2018 at 04:58:43PM +0000, Will Deacon wrote: > On Mon, Dec 03, 2018 at 03:05:52PM +0000, Andrew Murray wrote: > > On Mon, Dec 03, 2018 at 02:31:14PM +0000, Will Deacon wrote: > > > On Thu, Nov 22, 2018 at 11:25:48AM +0000, Andrew Murray wrote: > > > > In order to effeciently enable/disable guest/host only perf counters > > > > at guest entry/exit we add bitfields to kvm_cpu_context for guest and > > > > host only events as well as accessors for updating them. > > > > > > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com> > > > > --- > > > > arch/arm64/include/asm/kvm_host.h | 20 ++++++++++++++++++++ > > > > 1 file changed, 20 insertions(+) > > > > > > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > > > > index 1550192..4a828eb 100644 > > > > --- a/arch/arm64/include/asm/kvm_host.h > > > > +++ b/arch/arm64/include/asm/kvm_host.h > > > > @@ -203,6 +203,8 @@ struct kvm_cpu_context { > > > > }; > > > > > > > > struct kvm_vcpu *__hyp_running_vcpu; > > > > + u32 events_host_only; > > > > + u32 events_guest_only; > > > > }; > > > > > > > > typedef struct kvm_cpu_context kvm_cpu_context_t; > > > > @@ -472,6 +474,24 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu) > > > > { > > > > return kvm_arch_vcpu_run_map_fp(vcpu); > > > > } > > > > +static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set) > > > > +{ > > > > + kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state); > > > > + > > > > + ctx->events_host_only &= ~clr; > > > > + ctx->events_host_only |= set; > > > > +} > > > > > > Do you actually need this clr/set function, or can you just use the > > > facilities provided by bitfield.h at the callsites? > > > > This modifies events_{host|guest}_only which is conditionally defined > > (CONFIG_KVM) yet used by arch/arm64/kernel/perf_events which is also > > conditionally defined (CONFIG_HW_PERF_EVENTS). By moving this away > > from the callsite we avoid a load of horrible #ifdef's and allow this > > function to become a nop when KVM isn't enabled. > > > > Is there value in updating this function to use set_bit/clear_bit > > functions in bitops/atomic.h ? > > You don't need the atomicity, so you'd be looking at the '__' variants > instead. My main concern is that we're introducing a clr_set_* API which > only ever clears or sets, so I think it's needlessly complicated. If you > need two separate macros that are just #defined to __set_bit/__clear_bit > or expand to nothing then that could work too. Though it would be 4 macros - a set and clear for both events_host and events_guest. Though with the proposed changes in the other patch, this could end up being 3 macros seeing as we'd always clear both events_host and events_guest at the same time. Thanks, Andrew Murray > > Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes 2018-11-22 11:25 [PATCH v3 0/4] arm64: Support perf event modifiers :G and :H Andrew Murray 2018-11-22 11:25 ` [PATCH v3 1/4] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray 2018-11-22 11:25 ` [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters Andrew Murray @ 2018-11-22 11:25 ` Andrew Murray 2018-12-03 15:04 ` Will Deacon 2018-11-22 11:25 ` [PATCH v3 4/4] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray 3 siblings, 1 reply; 11+ messages in thread From: Andrew Murray @ 2018-11-22 11:25 UTC (permalink / raw) To: linux-arm-kernel Add support for the :G and :H attributes in perf by handling the exclude_host/exclude_guest event attributes. We notify KVM of counters that we wish to be enabled or disabled on guest entry/exit and thus defer from starting or stopping :G events as per the events exclude_host attribute. With both VHE and non-VHE we switch the counters between host/guest at EL2. We are able to eliminate counters counting host events on the boundaries of guest entry/exit when using :G by filtering out EL2 for exclude_host. However when using :H unless exclude_hv is set on non-VHE then there is a small blackout window at the guest entry/exit where host events are not captured. Signed-off-by: Andrew Murray <andrew.murray@arm.com> --- arch/arm64/kernel/perf_event.c | 52 ++++++++++++++++++++++++++++++++++++------ 1 file changed, 45 insertions(+), 7 deletions(-) diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index de564ae..e0619ba 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -26,6 +26,7 @@ #include <linux/acpi.h> #include <linux/clocksource.h> +#include <linux/kvm_host.h> #include <linux/of.h> #include <linux/perf/arm_pmu.h> #include <linux/platform_device.h> @@ -647,11 +648,23 @@ static inline int armv8pmu_enable_counter(int idx) static inline void armv8pmu_enable_event_counter(struct perf_event *event) { + struct perf_event_attr *attr = &event->attr; int idx = event->hw.idx; + u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx)); - armv8pmu_enable_counter(idx); if (armv8pmu_event_is_chained(event)) - armv8pmu_enable_counter(idx - 1); + counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1)); + + if (attr->exclude_host) + kvm_clr_set_guest_pmu_events(0, counter_bits); + if (attr->exclude_guest) + kvm_clr_set_host_pmu_events(0, counter_bits); + + if (!attr->exclude_host) { + armv8pmu_enable_counter(idx); + if (armv8pmu_event_is_chained(event)) + armv8pmu_enable_counter(idx - 1); + } } static inline int armv8pmu_disable_counter(int idx) @@ -664,11 +677,23 @@ static inline int armv8pmu_disable_counter(int idx) static inline void armv8pmu_disable_event_counter(struct perf_event *event) { struct hw_perf_event *hwc = &event->hw; + struct perf_event_attr *attr = &event->attr; int idx = hwc->idx; + u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx)); if (armv8pmu_event_is_chained(event)) - armv8pmu_disable_counter(idx - 1); - armv8pmu_disable_counter(idx); + counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1)); + + if (attr->exclude_host) + kvm_clr_set_guest_pmu_events(counter_bits, 0); + if (attr->exclude_guest) + kvm_clr_set_host_pmu_events(counter_bits, 0); + + if (!attr->exclude_host) { + if (armv8pmu_event_is_chained(event)) + armv8pmu_disable_counter(idx - 1); + armv8pmu_disable_counter(idx); + } } static inline int armv8pmu_enable_intens(int idx) @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event, * Therefore we ignore exclude_hv in this configuration, since * there's no hypervisor to sample anyway. This is consistent * with other architectures (x86 and Power). + * + * To eliminate counting host events on the boundaries of + * guest entry/exit we ensure EL2 is not included in hyp mode + * with !exclude_host. */ if (is_kernel_in_hyp_mode()) { - if (!attr->exclude_kernel) + if (!attr->exclude_kernel && !attr->exclude_host) config_base |= ARMV8_PMU_INCLUDE_EL2; } else { - if (attr->exclude_kernel) - config_base |= ARMV8_PMU_EXCLUDE_EL1; if (!attr->exclude_hv) config_base |= ARMV8_PMU_INCLUDE_EL2; } + + /* + * Filter out !VHE kernels and guest kernels + */ + if (attr->exclude_kernel) + config_base |= ARMV8_PMU_EXCLUDE_EL1; + if (attr->exclude_user) config_base |= ARMV8_PMU_EXCLUDE_EL0; @@ -976,6 +1010,10 @@ static void armv8pmu_reset(void *info) armv8pmu_disable_intens(idx); } + /* Clear the counters we flip at guest entry/exit */ + kvm_clr_set_host_pmu_events(U32_MAX, 0); + kvm_clr_set_guest_pmu_events(U32_MAX, 0); + /* * Initialize & Reset PMNC. Request overflow interrupt for * 64 bit cycle counter but cheat in armv8pmu_write_counter(). -- 2.7.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes 2018-11-22 11:25 ` [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray @ 2018-12-03 15:04 ` Will Deacon 2018-12-03 22:37 ` Andrew Murray 0 siblings, 1 reply; 11+ messages in thread From: Will Deacon @ 2018-12-03 15:04 UTC (permalink / raw) To: Andrew Murray Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas, Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel Hi Andrew, On Thu, Nov 22, 2018 at 11:25:49AM +0000, Andrew Murray wrote: > Add support for the :G and :H attributes in perf by handling the > exclude_host/exclude_guest event attributes. > > We notify KVM of counters that we wish to be enabled or disabled on > guest entry/exit and thus defer from starting or stopping :G events > as per the events exclude_host attribute. > > With both VHE and non-VHE we switch the counters between host/guest > at EL2. We are able to eliminate counters counting host events on > the boundaries of guest entry/exit when using :G by filtering out > EL2 for exclude_host. However when using :H unless exclude_hv is set > on non-VHE then there is a small blackout window at the guest > entry/exit where host events are not captured. > > Signed-off-by: Andrew Murray <andrew.murray@arm.com> > --- > arch/arm64/kernel/perf_event.c | 52 ++++++++++++++++++++++++++++++++++++------ > 1 file changed, 45 insertions(+), 7 deletions(-) I've got a cold at the moment, so it could just be that, but I struggled to follow the logic in this patch. Comments below. > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c > index de564ae..e0619ba 100644 > --- a/arch/arm64/kernel/perf_event.c > +++ b/arch/arm64/kernel/perf_event.c > @@ -26,6 +26,7 @@ > > #include <linux/acpi.h> > #include <linux/clocksource.h> > +#include <linux/kvm_host.h> > #include <linux/of.h> > #include <linux/perf/arm_pmu.h> > #include <linux/platform_device.h> > @@ -647,11 +648,23 @@ static inline int armv8pmu_enable_counter(int idx) > > static inline void armv8pmu_enable_event_counter(struct perf_event *event) > { > + struct perf_event_attr *attr = &event->attr; > int idx = event->hw.idx; > + u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx)); > > - armv8pmu_enable_counter(idx); > if (armv8pmu_event_is_chained(event)) > - armv8pmu_enable_counter(idx - 1); > + counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1)); > + > + if (attr->exclude_host) > + kvm_clr_set_guest_pmu_events(0, counter_bits); > + if (attr->exclude_guest) > + kvm_clr_set_host_pmu_events(0, counter_bits); Hmm, so if I have both exclude_host and exclude_guest set then we'll set the counter in both events_host_only and events_guest_only, which is weird and probably not what you want. I think I'd find it easier to understand if you dropped the "only" suffix on the kvm masks, and you actually made the logic positive so that an event that counts in both host and guest has its bit set in both of the masks. > + > + if (!attr->exclude_host) { > + armv8pmu_enable_counter(idx); > + if (armv8pmu_event_is_chained(event)) > + armv8pmu_enable_counter(idx - 1); > + } > } > > static inline int armv8pmu_disable_counter(int idx) > @@ -664,11 +677,23 @@ static inline int armv8pmu_disable_counter(int idx) > static inline void armv8pmu_disable_event_counter(struct perf_event *event) > { > struct hw_perf_event *hwc = &event->hw; > + struct perf_event_attr *attr = &event->attr; > int idx = hwc->idx; > + u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx)); > > if (armv8pmu_event_is_chained(event)) > - armv8pmu_disable_counter(idx - 1); > - armv8pmu_disable_counter(idx); > + counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1)); > + > + if (attr->exclude_host) > + kvm_clr_set_guest_pmu_events(counter_bits, 0); > + if (attr->exclude_guest) > + kvm_clr_set_host_pmu_events(counter_bits, 0); > + > + if (!attr->exclude_host) { > + if (armv8pmu_event_is_chained(event)) > + armv8pmu_disable_counter(idx - 1); > + armv8pmu_disable_counter(idx); > + } It looks like you're relying on the perf_event_attr remaining unchanged between the enable/disable calls for a given event. However, I'm not convinced that's the case -- have you checked that e.g. PERF_EVENT_IOC_MODIFY_ATTRIBUTES can't rewrite the thing behind your back? > } > > static inline int armv8pmu_enable_intens(int idx) > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event, > * Therefore we ignore exclude_hv in this configuration, since > * there's no hypervisor to sample anyway. This is consistent > * with other architectures (x86 and Power). > + * > + * To eliminate counting host events on the boundaries of > + * guest entry/exit we ensure EL2 is not included in hyp mode > + * with !exclude_host. > */ > if (is_kernel_in_hyp_mode()) { > - if (!attr->exclude_kernel) > + if (!attr->exclude_kernel && !attr->exclude_host) > config_base |= ARMV8_PMU_INCLUDE_EL2; Do you know what perf passes by default here? I'm worried that we're introducing a user-visible change to existing programs by changing this check. > } else { > - if (attr->exclude_kernel) > - config_base |= ARMV8_PMU_EXCLUDE_EL1; > if (!attr->exclude_hv) > config_base |= ARMV8_PMU_INCLUDE_EL2; > } > + > + /* > + * Filter out !VHE kernels and guest kernels > + */ > + if (attr->exclude_kernel) > + config_base |= ARMV8_PMU_EXCLUDE_EL1; Why have you moved this hunk? Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes 2018-12-03 15:04 ` Will Deacon @ 2018-12-03 22:37 ` Andrew Murray 0 siblings, 0 replies; 11+ messages in thread From: Andrew Murray @ 2018-12-03 22:37 UTC (permalink / raw) To: Will Deacon Cc: Mark Rutland, Suzuki K Poulose, Marc Zyngier, Catalin Marinas, Julien Thierry, Christoffer Dall, kvmarm, linux-arm-kernel On Mon, Dec 03, 2018 at 03:04:14PM +0000, Will Deacon wrote: > Hi Andrew, > > On Thu, Nov 22, 2018 at 11:25:49AM +0000, Andrew Murray wrote: > > Add support for the :G and :H attributes in perf by handling the > > exclude_host/exclude_guest event attributes. > > > > We notify KVM of counters that we wish to be enabled or disabled on > > guest entry/exit and thus defer from starting or stopping :G events > > as per the events exclude_host attribute. > > > > With both VHE and non-VHE we switch the counters between host/guest > > at EL2. We are able to eliminate counters counting host events on > > the boundaries of guest entry/exit when using :G by filtering out > > EL2 for exclude_host. However when using :H unless exclude_hv is set > > on non-VHE then there is a small blackout window at the guest > > entry/exit where host events are not captured. > > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com> > > --- > > arch/arm64/kernel/perf_event.c | 52 ++++++++++++++++++++++++++++++++++++------ > > 1 file changed, 45 insertions(+), 7 deletions(-) > > I've got a cold at the moment, so it could just be that, but I struggled to > follow the logic in this patch. Comments below. > > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c > > index de564ae..e0619ba 100644 > > --- a/arch/arm64/kernel/perf_event.c > > +++ b/arch/arm64/kernel/perf_event.c > > @@ -26,6 +26,7 @@ > > > > #include <linux/acpi.h> > > #include <linux/clocksource.h> > > +#include <linux/kvm_host.h> > > #include <linux/of.h> > > #include <linux/perf/arm_pmu.h> > > #include <linux/platform_device.h> > > @@ -647,11 +648,23 @@ static inline int armv8pmu_enable_counter(int idx) > > > > static inline void armv8pmu_enable_event_counter(struct perf_event *event) > > { > > + struct perf_event_attr *attr = &event->attr; > > int idx = event->hw.idx; > > + u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx)); > > > > - armv8pmu_enable_counter(idx); > > if (armv8pmu_event_is_chained(event)) > > - armv8pmu_enable_counter(idx - 1); > > + counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1)); > > + > > + if (attr->exclude_host) > > + kvm_clr_set_guest_pmu_events(0, counter_bits); > > + if (attr->exclude_guest) > > + kvm_clr_set_host_pmu_events(0, counter_bits); > > Hmm, so if I have both exclude_host and exclude_guest set then we'll set > the counter in both events_host_only and events_guest_only, which is weird > and probably not what you want. perf_event_attr is passed from userspace, whilst the perf tool doesn't allow you to set exclude_host and exclude_guest at the same time it's feasiable for another user to do so. So indeed the current behaviour wouldn't be expected. > > I think I'd find it easier to understand if you dropped the "only" suffix on > the kvm masks, and you actually made the logic positive so that an event > that counts in both host and guest has its bit set in both of the masks. No problem. This actually fixes a bug when multiple events are enabled. > > > + > > + if (!attr->exclude_host) { > > + armv8pmu_enable_counter(idx); > > + if (armv8pmu_event_is_chained(event)) > > + armv8pmu_enable_counter(idx - 1); > > + } > > } > > > > static inline int armv8pmu_disable_counter(int idx) > > @@ -664,11 +677,23 @@ static inline int armv8pmu_disable_counter(int idx) > > static inline void armv8pmu_disable_event_counter(struct perf_event *event) > > { > > struct hw_perf_event *hwc = &event->hw; > > + struct perf_event_attr *attr = &event->attr; > > int idx = hwc->idx; > > + u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx)); > > > > if (armv8pmu_event_is_chained(event)) > > - armv8pmu_disable_counter(idx - 1); > > - armv8pmu_disable_counter(idx); > > + counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1)); > > + > > + if (attr->exclude_host) > > + kvm_clr_set_guest_pmu_events(counter_bits, 0); > > + if (attr->exclude_guest) > > + kvm_clr_set_host_pmu_events(counter_bits, 0); > > + > > + if (!attr->exclude_host) { > > + if (armv8pmu_event_is_chained(event)) > > + armv8pmu_disable_counter(idx - 1); > > + armv8pmu_disable_counter(idx); > > + } > > It looks like you're relying on the perf_event_attr remaining unchanged > between the enable/disable calls for a given event. However, I'm not > convinced that's the case -- have you checked that e.g. > PERF_EVENT_IOC_MODIFY_ATTRIBUTES can't rewrite the thing behind your > back? Even better - I can unconditionally clear the counter_bits here seeing as the event is being disabled. > > > } > > > > static inline int armv8pmu_enable_intens(int idx) > > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event, > > * Therefore we ignore exclude_hv in this configuration, since > > * there's no hypervisor to sample anyway. This is consistent > > * with other architectures (x86 and Power). > > + * > > + * To eliminate counting host events on the boundaries of > > + * guest entry/exit we ensure EL2 is not included in hyp mode > > + * with !exclude_host. > > */ > > if (is_kernel_in_hyp_mode()) { > > - if (!attr->exclude_kernel) > > + if (!attr->exclude_kernel && !attr->exclude_host) > > config_base |= ARMV8_PMU_INCLUDE_EL2; > > Do you know what perf passes by default here? I'm worried that we're > introducing a user-visible change to existing programs by changing this > check. When running 'perf xxx' exclude_guest is set and exclude_host is unset. And the opposite when running 'perf kvm xxx'. So for existing users that do not use virtualisation then there is no change. > > > } else { > > - if (attr->exclude_kernel) > > - config_base |= ARMV8_PMU_EXCLUDE_EL1; > > if (!attr->exclude_hv) > > config_base |= ARMV8_PMU_INCLUDE_EL2; > > } > > + > > + /* > > + * Filter out !VHE kernels and guest kernels > > + */ > > + if (attr->exclude_kernel) > > + config_base |= ARMV8_PMU_EXCLUDE_EL1; > > Why have you moved this hunk? When guests use perf the guest PMU is emulated by KVM (virt/kvm/arm/pmu) and is backed by host perf events, these events always have the exclude_host flag set. Therefore to filter these kernel events we need to filter EL1. By moving this hunk we ensure guest kernel filtering works even when the host is VHE. (The side effect is that we unnecessarily filter EL1 on VHE host kernel events, but this has no effect as no-one is using this except any guest.) Thanks, Andrew Murray > > Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v3 4/4] arm64: KVM: Enable support for :G/:H perf event modifiers 2018-11-22 11:25 [PATCH v3 0/4] arm64: Support perf event modifiers :G and :H Andrew Murray ` (2 preceding siblings ...) 2018-11-22 11:25 ` [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray @ 2018-11-22 11:25 ` Andrew Murray 3 siblings, 0 replies; 11+ messages in thread From: Andrew Murray @ 2018-11-22 11:25 UTC (permalink / raw) To: linux-arm-kernel Enable/disable event counters as appropriate when entering and exiting the guest to enable support for guest or host only event counting. For both VHE and non-VHE we switch the counters between host/guest at EL2. EL2 is filtered out by the PMU when we are using the :G modifier. The PMU may be on when we change which counters are enabled however we avoid adding an isb as we instead rely on existing context synchronisation events: the isb in kvm_arm_vhe_guest_exit for VHE and the eret from the hvc in kvm_call_hyp. Signed-off-by: Andrew Murray <andrew.murray@arm.com> --- arch/arm64/kvm/hyp/switch.c | 38 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index d496ef5..ebf0aac 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -373,6 +373,32 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu) return true; } +static bool __hyp_text __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt) +{ + u32 host_only = host_ctxt->events_host_only; + u32 guest_only = host_ctxt->events_guest_only; + + if (host_only) + write_sysreg(host_only, pmcntenclr_el0); + + if (guest_only) + write_sysreg(guest_only, pmcntenset_el0); + + return (host_only || guest_only); +} + +static void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt) +{ + u32 host_only = host_ctxt->events_host_only; + u32 guest_only = host_ctxt->events_guest_only; + + if (guest_only) + write_sysreg(guest_only, pmcntenclr_el0); + + if (host_only) + write_sysreg(host_only, pmcntenset_el0); +} + /* * Return true when we were able to fixup the guest exit and should return to * the guest, false when we should restore the host state and return to the @@ -488,12 +514,15 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu) { struct kvm_cpu_context *host_ctxt; struct kvm_cpu_context *guest_ctxt; + bool pmu_switch_needed; u64 exit_code; host_ctxt = vcpu->arch.host_cpu_context; host_ctxt->__hyp_running_vcpu = vcpu; guest_ctxt = &vcpu->arch.ctxt; + pmu_switch_needed = __pmu_switch_to_guest(host_ctxt); + sysreg_save_host_state_vhe(host_ctxt); __activate_traps(vcpu); @@ -524,6 +553,9 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu) __debug_switch_to_host(vcpu); + if (pmu_switch_needed) + __pmu_switch_to_host(host_ctxt); + return exit_code; } @@ -532,6 +564,7 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu) { struct kvm_cpu_context *host_ctxt; struct kvm_cpu_context *guest_ctxt; + bool pmu_switch_needed; u64 exit_code; vcpu = kern_hyp_va(vcpu); @@ -540,6 +573,8 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu) host_ctxt->__hyp_running_vcpu = vcpu; guest_ctxt = &vcpu->arch.ctxt; + pmu_switch_needed = __pmu_switch_to_guest(host_ctxt); + __sysreg_save_state_nvhe(host_ctxt); __activate_traps(vcpu); @@ -586,6 +621,9 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu) */ __debug_switch_to_host(vcpu); + if (pmu_switch_needed) + __pmu_switch_to_host(host_ctxt); + return exit_code; } -- 2.7.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
end of thread, other threads:[~2018-12-03 22:44 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-11-22 11:25 [PATCH v3 0/4] arm64: Support perf event modifiers :G and :H Andrew Murray 2018-11-22 11:25 ` [PATCH v3 1/4] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray 2018-11-22 11:25 ` [PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters Andrew Murray 2018-12-03 14:31 ` Will Deacon 2018-12-03 15:05 ` Andrew Murray 2018-12-03 16:58 ` Will Deacon 2018-12-03 22:44 ` Andrew Murray 2018-11-22 11:25 ` [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray 2018-12-03 15:04 ` Will Deacon 2018-12-03 22:37 ` Andrew Murray 2018-11-22 11:25 ` [PATCH v3 4/4] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).