linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Christoffer Dall <christoffer.dall@arm.com>
To: Andrew Murray <andrew.murray@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Julien Thierry <julien.thierry@arm.com>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	joro@8bytes.org, Suzuki K Poulose <suzuki.poulose@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
Date: Tue, 18 Dec 2018 15:38:33 +0100	[thread overview]
Message-ID: <20181218143833.GD25383@e113682-lin.lund.arm.com> (raw)
In-Reply-To: <20181218132531.GF47018@e119886-lin.cambridge.arm.com>

On Tue, Dec 18, 2018 at 01:25:32PM +0000, Andrew Murray wrote:
> On Tue, Dec 18, 2018 at 01:02:26PM +0100, Christoffer Dall wrote:
> > On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> > > Add support for the :G and :H attributes in perf by handling the
> > > exclude_host/exclude_guest event attributes.
> > > 
> > > We notify KVM of counters that we wish to be enabled or disabled on
> > > guest entry/exit and thus defer from starting or stopping :G events
> > > as per the events exclude_host attribute.
> > > 
> > > With both VHE and non-VHE we switch the counters between host/guest
> > > at EL2. We are able to eliminate counters counting host events on
> > > the boundaries of guest entry/exit when using :G by filtering out
> > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > on non-VHE then there is a small blackout window at the guest
> > > entry/exit where host events are not captured.
> > > 
> > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > ---
> > >  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> > >  1 file changed, 44 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > index de564ae..4a3c73d 100644
> > > --- a/arch/arm64/kernel/perf_event.c
> > > +++ b/arch/arm64/kernel/perf_event.c
> > > @@ -26,6 +26,7 @@
> > >  
> > >  #include <linux/acpi.h>
> > >  #include <linux/clocksource.h>
> > > +#include <linux/kvm_host.h>
> > >  #include <linux/of.h>
> > >  #include <linux/perf/arm_pmu.h>
> > >  #include <linux/platform_device.h>
> > > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> > >  
> > >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> > >  {
> > > +	struct perf_event_attr *attr = &event->attr;
> > >  	int idx = event->hw.idx;
> > > +	int flags = 0;
> > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > >  
> > > -	armv8pmu_enable_counter(idx);
> > >  	if (armv8pmu_event_is_chained(event))
> > > -		armv8pmu_enable_counter(idx - 1);
> > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > +
> > > +	if (!attr->exclude_host)
> > > +		flags |= KVM_PMU_EVENTS_HOST;
> > > +	if (!attr->exclude_guest)
> > > +		flags |= KVM_PMU_EVENTS_GUEST;
> > > +
> > > +	kvm_set_pmu_events(counter_bits, flags);
> > > +
> > > +	if (!attr->exclude_host) {
> > > +		armv8pmu_enable_counter(idx);
> > > +		if (armv8pmu_event_is_chained(event))
> > > +			armv8pmu_enable_counter(idx - 1);
> > > +	}
> > >  }
> > >  
> > >  static inline int armv8pmu_disable_counter(int idx)
> > > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > >  {
> > >  	struct hw_perf_event *hwc = &event->hw;
> > > +	struct perf_event_attr *attr = &event->attr;
> > >  	int idx = hwc->idx;
> > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > >  
> > >  	if (armv8pmu_event_is_chained(event))
> > > -		armv8pmu_disable_counter(idx - 1);
> > > -	armv8pmu_disable_counter(idx);
> > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > +
> > > +	kvm_clr_pmu_events(counter_bits);
> > > +
> > > +	if (!attr->exclude_host) {
> > > +		if (armv8pmu_event_is_chained(event))
> > > +			armv8pmu_disable_counter(idx - 1);
> > > +		armv8pmu_disable_counter(idx);
> > > +	}
> > >  }
> > >  
> > >  static inline int armv8pmu_enable_intens(int idx)
> > > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > >  	 * Therefore we ignore exclude_hv in this configuration, since
> > >  	 * there's no hypervisor to sample anyway. This is consistent
> > >  	 * with other architectures (x86 and Power).
> > > +	 *
> > > +	 * To eliminate counting host events on the boundaries of
> > > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > > +	 * with !exclude_host.
> > >  	 */
> > >  	if (is_kernel_in_hyp_mode()) {
> > > -		if (!attr->exclude_kernel)
> > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > >  	} else {
> > > -		if (attr->exclude_kernel)
> > > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > >  		if (!attr->exclude_hv)
> > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > 
> > I'm not sure about the current use of exclude_hv here.  The comment says
> > it's consistent with other architectures, but I can't find an example to
> > confirm this, and I don't think we have a comparable thing to the split
> > of the hypervisor between EL1 and EL2 we have on non-VHE.
> > 
> > Joerg told me the semantics were designed to be:
> > 
> > 	exclude_hv: When running as a guest, stop counting events when
> > 		    the HV runs.
> 
> Can the definition of "guest" here refer to both type 1 and type 2
> hypervisor guests? Or do we assume type 1 only?
> 

A guest is a guest.  Linux can run as a guest under a hypervisor with a
type 1 or type 2 design, doesn't matter for this conversation.

> > 
> > 	exclude_host: When Linux runs as a HV itself, only count events
> > 	              while a guest is running.
> > 
> > 	exclude_guest: When Linux runs as a HV, only count events when
> > 	               running in host mode.
> > 
> > (But tools/perf/design.txt does not really confirm this).
> > 
> > On arm64 that would mean:
> > 
> > 	exclude_hv: As a host, no effect.
> > 		    As a guest, set the counter to include EL2 for a
> > 		    hypervisor to emulate.

[...]

> 
> Though more correctly we should count EL2 *and EL1* events whilst pinned
> to the KVM task and whilst running outside of the guest.  This then
> covers both !VHE and VHE and allows for fair comparasion between !VHE
> and VHE systems.

Yes, if the guest has cleared exclude_hv and if we can properly detect
that from the hypervisor.

> 
> This then gives us the unique benefit of the type 2 host being able to
> examine the hypervisor overhead of its individual guests.

Not sure I understand this part.

> 
> The only issue here is that the type 2 host wouldn't be able to examine
> the HV overhead of all its guests across the system as you wouldn't be
> able to rely on the perf task pinning to distinguish between EL1 from
> host and EL1 from guests in a !VHE system. I'm not sure the best way
> to overcome this limitation.

Why can't you disable EL1 counting whilst running in the host, and
enable EL1 counting whilst running in the guest?

> > 
> > 	exclude_host: As a guest, has no effect.
> > 		      Don't count EL1 host or EL2, but count EL1 guest
> > 		      by enabling EL1 counting at EL2 when entering a
> > 		      guest, and disabling EL1 counting when returning
> > 		      from a guest.
> > 
> > 	exclude_guest: As a guest, has no effect.  As a host, disable
> > 		       EL1 counting at EL2 when entering a guest.
> > 
> > Not sure if we break anything by changing the behavior on arm64 now, but
> > I really doubt that being able to exclude an arbitrary part (the one tha
> > happens to run in EL2 on non-VHE systems) is meaningful, and the fact
> > that behavior and semantics change depending on the version of the
> > underlying CPU is not great, if what you care about is understanding the
> > system's performance.
> 
> This is a bit strange. It's arbitary as it only represents a bit of the
> HV overhead - this is solved though by counting the whole overhead (EL1
> and EL2 instead (but only counting outside the guest and pinned to the
> guest tasks).

Not sure I understand your point here?

> 
> > 
> > Thoughts?
> > 
> 
> Though if I've understood you correctly, you're suggesting that the only
> time we count EL2 is when exclude_hv is not set on the immediate guest
> of a type 1 hypervisor?
> 
No, I didn't say anything about a type 1 or type 2 hypervisor, and I
think that distinction is completely irrelevant to the discussion at
hand.  I also don't know what an immediate guest is -- is there any
other kind?

I don't think exclude_hv, exclude_host, and exclude_guest are directly
tied to a single CPU mode.  The only 'modes' you need to consider for
Linux are 'guest' and 'host' when Linux can run VMs and, 'self' and
'hypervisor' when Linux is a guest.

When Linux can run VMs, you count EL2 events when exclude_host is not
set.

(When Linux is a guest, and you set/clear exclude_hv, for this to work,
you need some way of informing your hypervisor that you want to know
about events happening in the hypervisor.  This could be a PV interface,
or maybe this can work by the guest setting/clearing the NSH bit in its
virtual PMU registers, which then amusingly can get translated into
actually counting in EL1/EL2 (non-VHE) or EL2 (VHE) by KVM's PMU
emulation code.  The method used is specific to the hypervisor used, but
not specific to whether the hypervisor is type-1 or type-2.)


Thanks,

    Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2018-12-18 14:38 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-12 10:29 [PATCH v8 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
2018-12-12 10:29 ` [PATCH v8 1/5] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
2018-12-12 10:29 ` [PATCH v8 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data Andrew Murray
2018-12-12 10:37   ` Suzuki K Poulose
2018-12-12 12:31     ` Andrew Murray
2018-12-18 11:40   ` Christoffer Dall
2018-12-12 10:29 ` [PATCH v8 3/5] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
2018-12-12 10:56   ` Suzuki K Poulose
2018-12-12 10:29 ` [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
2018-12-12 10:42   ` Suzuki K Poulose
2018-12-12 10:47     ` Julien Thierry
2018-12-12 10:51       ` Suzuki K Poulose
2018-12-12 10:55   ` Suzuki K Poulose
2018-12-12 14:38     ` Andrew Murray
2018-12-18 12:02   ` Christoffer Dall
2018-12-18 13:25     ` Andrew Murray
2018-12-18 14:38       ` Christoffer Dall [this message]
2018-12-18 16:27         ` Andrew Murray
2018-12-18 18:51           ` Christoffer Dall
2018-12-18 19:19             ` Andrew Jones
2019-01-04 15:32     ` Will Deacon
2019-01-08 10:18       ` Christoffer Dall
2019-01-08 11:25         ` Andrew Murray
2019-01-08 11:50           ` Marc Zyngier
2019-01-08 12:03             ` Christoffer Dall
2019-01-08 12:12               ` Marc Zyngier
2019-01-08 12:20                 ` Christoffer Dall
2019-01-08 12:14           ` Christoffer Dall
2019-01-08 12:39             ` Andrew Murray
2018-12-12 10:29 ` [PATCH v8 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
2018-12-12 10:53   ` Suzuki K Poulose
2018-12-12 14:46     ` Andrew Murray

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181218143833.GD25383@e113682-lin.lund.arm.com \
    --to=christoffer.dall@arm.com \
    --cc=andrew.murray@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=joro@8bytes.org \
    --cc=julien.thierry@arm.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=marc.zyngier@arm.com \
    --cc=mark.rutland@arm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).