kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Shannon Zhao <shannon.zhao@linaro.org>
To: Christoffer Dall <christoffer.dall@linaro.org>
Cc: kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
	will.deacon@arm.com, marc.zyngier@arm.com,
	alex.bennee@linaro.org, zhaoshenglong@huawei.com
Subject: Re: [PATCH 07/18] KVM: ARM64: PMU: Add perf event map and introduce perf event creating function
Date: Tue, 21 Jul 2015 09:35:03 +0800	[thread overview]
Message-ID: <55ADA1C7.5040002@linaro.org> (raw)
In-Reply-To: <20150717143043.GR14024@cbox>



On 2015/7/17 22:30, Christoffer Dall wrote:
> On Mon, Jul 06, 2015 at 10:17:37AM +0800, shannon.zhao@linaro.org wrote:
>> From: Shannon Zhao <shannon.zhao@linaro.org>
>>
>> When we use tools like perf on host, perf passes the event type and the
>> id in this type category to kernel, then kernel will map them to event
>> number and write this number to PMU PMEVTYPER<n>_EL0 register. While
>> we're trapping and emulating guest accesses to PMU registers, we get the
>> event number and map it to the event type and the id reversely.
> 
> There's something with the nomenclature that makes this really hard to
> read.
> 
> you mention here: "event type", "the id", and "event number".  The
> former two I think are perf identifiers, and the latter is the hardware
> event number, is this right?
> 

Yeah, right.

>>
>> Check whether the event type is same with the one to be set.
> 
> when?
> 
In function kvm_pmu_set_counter_event_type before create a new perf event.

>> +	if ((data & ARMV8_EVTYPE_EVENT) == pmc->eventsel)
>> +		return;


>> If not,
>> stop counter to monitor current event and find the event type map id.
>> According to the bits of data to configure this perf_event attr and
>> set exclude_host to 1 for guest. Then call perf_event API to create the
>> corresponding event and save the event pointer.
>>
>> Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
>> ---
>>  include/kvm/arm_pmu.h |   4 ++
>>  virt/kvm/arm/pmu.c    | 173 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 177 insertions(+)
>>
>> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
>> index 27d14ca..1050b24 100644
>> --- a/include/kvm/arm_pmu.h
>> +++ b/include/kvm/arm_pmu.h
>> @@ -45,9 +45,13 @@ struct kvm_pmu {
>>  
>>  #ifdef CONFIG_KVM_ARM_PMU
>>  void kvm_pmu_vcpu_reset(struct kvm_vcpu *vcpu);
>> +void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, unsigned long data,
>> +				    unsigned long select_idx);
>>  void kvm_pmu_init(struct kvm_vcpu *vcpu);
>>  #else
>>  static inline void kvm_pmu_vcpu_reset(struct kvm_vcpu *vcpu) {}
>> +void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, unsigned long data,
>> +				    unsigned long select_idx) {}
>>  static inline void kvm_pmu_init(struct kvm_vcpu *vcpu) {}
>>  #endif
>>  
>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>> index dc252d0..50a3c82 100644
>> --- a/virt/kvm/arm/pmu.c
>> +++ b/virt/kvm/arm/pmu.c
>> @@ -18,8 +18,68 @@
>>  #include <linux/cpu.h>
>>  #include <linux/kvm.h>
>>  #include <linux/kvm_host.h>
>> +#include <linux/perf_event.h>
>>  #include <kvm/arm_pmu.h>
>>  
>> +/* PMU HW events mapping. */
>> +static struct kvm_pmu_hw_event_map {
>> +	unsigned eventsel;
>> +	unsigned event_type;
>> +} kvm_pmu_hw_events[] = {
>> +	[0] = { 0x11, PERF_COUNT_HW_CPU_CYCLES },
>> +	[1] = { 0x08, PERF_COUNT_HW_INSTRUCTIONS },
>> +	[2] = { 0x04, PERF_COUNT_HW_CACHE_REFERENCES },
>> +	[3] = { 0x03, PERF_COUNT_HW_CACHE_MISSES },
>> +	[4] = { 0x10, PERF_COUNT_HW_BRANCH_MISSES },
>> +};
>> +
>> +/* PMU HW cache events mapping. */
>> +static struct kvm_pmu_hw_cache_event_map {
>> +	unsigned eventsel;
>> +	unsigned cache_type;
>> +	unsigned cache_op;
>> +	unsigned cache_result;
>> +} kvm_pmu_hw_cache_events[] = {
>> +	[0] = { 0x04, PERF_COUNT_HW_CACHE_L1D, PERF_COUNT_HW_CACHE_OP_READ,
>> +		      PERF_COUNT_HW_CACHE_RESULT_ACCESS },
>> +	[1] = { 0x03, PERF_COUNT_HW_CACHE_L1D, PERF_COUNT_HW_CACHE_OP_READ,
>> +		      PERF_COUNT_HW_CACHE_RESULT_MISS },
>> +	[2] = { 0x04, PERF_COUNT_HW_CACHE_L1D, PERF_COUNT_HW_CACHE_OP_WRITE,
>> +		      PERF_COUNT_HW_CACHE_RESULT_ACCESS },
>> +	[3] = { 0x03, PERF_COUNT_HW_CACHE_L1D, PERF_COUNT_HW_CACHE_OP_WRITE,
>> +		      PERF_COUNT_HW_CACHE_RESULT_MISS },
> 
> seems to me that the four entries above will never be used, because
> you'll always match them in kvm_pmu_hw_events above?
> 

Yes, I found this before, but for the completeness I list them.

> Is this because perf map multiple generic perf events to the same
> hardware event? 
I think so.

> Does it matter if we register this with perf as one or
> the other then?
> 
I think it's ok because the hardware event numbers are same.


>> +	[4] = { 0x12, PERF_COUNT_HW_CACHE_BPU, PERF_COUNT_HW_CACHE_OP_READ,
>> +		      PERF_COUNT_HW_CACHE_RESULT_ACCESS },
>> +	[5] = { 0x10, PERF_COUNT_HW_CACHE_BPU, PERF_COUNT_HW_CACHE_OP_READ,
>> +		      PERF_COUNT_HW_CACHE_RESULT_MISS },
>> +	[6] = { 0x12, PERF_COUNT_HW_CACHE_BPU, PERF_COUNT_HW_CACHE_OP_WRITE,
>> +		      PERF_COUNT_HW_CACHE_RESULT_ACCESS },
>> +	[7] = { 0x10, PERF_COUNT_HW_CACHE_BPU, PERF_COUNT_HW_CACHE_OP_WRITE,
>> +		      PERF_COUNT_HW_CACHE_RESULT_MISS },
>> +};
>> +
>> +/**
>> + * kvm_pmu_stop_counter - stop PMU counter for the selected counter
>> + * @vcpu: The vcpu pointer
>> + * @select_idx: The counter index
>> + *
>> + * If this counter has been configured to monitor some event, disable and
>> + * release it.
>> + */
>> +static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu,
>> +				 unsigned long select_idx)
>> +{
>> +	struct kvm_pmu *pmu = &vcpu->arch.pmu;
>> +	struct kvm_pmc *pmc = &pmu->pmc[select_idx];
>> +
>> +	if (pmc->perf_event) {
>> +		perf_event_disable(pmc->perf_event);
>> +		perf_event_release_kernel(pmc->perf_event);
>> +	}
>> +	pmc->perf_event = NULL;
>> +	pmc->eventsel = 0xff;
> 
> why is 0xff 'unused' or reserved?  If we're choosing this arbitrarily,
> why is it not 0x3ff?  Should we create a define for this?
> 

Yeah, 0x3ff may be better.

>> +}
>> +
>>  /**
>>   * kvm_pmu_vcpu_reset - reset pmu state for cpu
>>   * @vcpu: The vcpu pointer
>> @@ -27,12 +87,125 @@
>>   */
>>  void kvm_pmu_vcpu_reset(struct kvm_vcpu *vcpu)
>>  {
>> +	int i;
>>  	struct kvm_pmu *pmu = &vcpu->arch.pmu;
>>  
>> +	for (i = 0; i < ARMV8_MAX_COUNTERS; i++)
>> +		kvm_pmu_stop_counter(vcpu, i);
>> +	pmu->overflow_status = 0;
>>  	pmu->irq_pending = false;
>>  }
>>  
>>  /**
>> + * kvm_pmu_find_hw_event - find hardware event
>> + * @pmu: The pmu pointer
>> + * @event_select: The number of selected event type
>> + *
>> + * Based on the number of selected event type, find out whether it belongs to
>> + * PERF_TYPE_HARDWARE. If so, return the corresponding event id.
>> + */
>> +static unsigned kvm_pmu_find_hw_event(struct kvm_pmu *pmu,
>> +				      unsigned long event_select)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(kvm_pmu_hw_events); i++)
>> +		if (kvm_pmu_hw_events[i].eventsel == event_select)
>> +			break;
>> +
>> +	if (i == ARRAY_SIZE(kvm_pmu_hw_events))
>> +		return PERF_COUNT_HW_MAX;
>> +
>> +	return kvm_pmu_hw_events[i].event_type;
> 
> you can just return this directly in the loop if you have a match and
> unconditionally return PERF_COUNT_HW_MAX outside the loop without having
> to check the loop condition.
> 
ok
>> +}
>> +
>> +/**
>> + * kvm_pmu_find_hw_cache_event - find hardware cache event
>> + * @pmu: The pmu pointer
>> + * @event_select: The number of selected event type
>> + *
>> + * Based on the number of selected event type, find out whether it belongs to
>> + * PERF_TYPE_HW_CACHE. If so, return the corresponding event id.
>> + */
>> +static unsigned kvm_pmu_find_hw_cache_event(struct kvm_pmu *pmu,
>> +					    unsigned long event_select)
>> +{
>> +	int i;
>> +	unsigned config;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(kvm_pmu_hw_cache_events); i++)
>> +		if (kvm_pmu_hw_cache_events[i].eventsel == event_select)
>> +			break;
>> +
>> +	if (i == ARRAY_SIZE(kvm_pmu_hw_cache_events))
>> +		return PERF_COUNT_HW_CACHE_MAX;
> 
> I feel like I just read this code, can we reuse it with a pointer to the
> array?
> 
>> +
>> +	config = (kvm_pmu_hw_cache_events[i].cache_type & 0xff)
>> +		 | ((kvm_pmu_hw_cache_events[i].cache_op & 0xff) << 8)
>> +		 | ((kvm_pmu_hw_cache_events[i].cache_result & 0xff) << 16);
>> +
>> +	return config;
>> +}
>> +
>> +/**
>> + * kvm_pmu_set_counter_event_type - set selected counter to monitor some event
>> + * @vcpu: The vcpu pointer
>> + * @data: The data guest writes to PMXEVTYPER_EL0
>> + * @select_idx: The number of selected counter
>> + *
>> + * Firstly check whether the event type is same with the one to be set.
>> + * If not, stop counter to monitor current event and find the event type map id.
>> + * According to the bits of data to configure this perf_event attr and set
>> + * exclude_host to 1 for guest. Then call perf_event API to create the
>> + * corresponding event and save the event pointer.
> 
> This text seems to be describing more how the function does something,
> as opposed to what it does and why.  I found it a little hard to read.
> 
>> + */
>> +void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, unsigned long data,
>> +				    unsigned long select_idx)
>> +{
>> +	struct kvm_pmu *pmu = &vcpu->arch.pmu;
>> +	struct kvm_pmc *pmc = &pmu->pmc[select_idx];
>> +	struct perf_event *event;
>> +	struct perf_event_attr attr;
>> +	unsigned config, type = PERF_TYPE_RAW;
>> +
>> +	if ((data & ARMV8_EVTYPE_EVENT) == pmc->eventsel)
>> +		return;
>> +
>> +	kvm_pmu_stop_counter(vcpu, select_idx);
>> +	pmc->eventsel = data & ARMV8_EVTYPE_EVENT;
>> +
>> +	config = kvm_pmu_find_hw_event(pmu, pmc->eventsel);
>> +	if (config != PERF_COUNT_HW_MAX) {
>> +		type = PERF_TYPE_HARDWARE;
>> +	} else {
>> +		config = kvm_pmu_find_hw_cache_event(pmu, pmc->eventsel);
>> +		if (config != PERF_COUNT_HW_CACHE_MAX)
>> +			type = PERF_TYPE_HW_CACHE;
>> +	}
>> +
>> +	if (type == PERF_TYPE_RAW)
>> +		config = pmc->eventsel;
> 
> don't you need to memset attr to 0 first?
> 
ok
> otherwise, how do you ensure that for example exclude_guest is always
> clear?
> 
>> +
>> +	attr.type = type;
>> +	attr.size = sizeof(attr);
>> +	attr.pinned = true;
>> +	attr.exclude_user = data & ARMV8_EXCLUDE_EL0 ? 1 : 0;
>> +	attr.exclude_kernel = data & ARMV8_EXCLUDE_EL1 ? 1 : 0;
>> +	attr.exclude_hv = data & ARMV8_INCLUDE_EL2 ? 0 : 1;
> 
> should the guest be able to see something counted in the hypervisor ever
> or should that only be the host being able to see that?
> 
> my gut feeling is that the hypervisor should be hidden from the guest
> and that exclude_hv = 0, is the right choice.  But this is a question
> about the semantics of perf, I suppose.
> 

This is what I'm not sure. I just thought about if guest has complete
ELs(EL3-EL0) and how to deal with nested virtualization.

>> +	attr.exclude_host = 1;
>> +	attr.config = config;
>> +	attr.sample_period = (-pmc->counter) & (((u64)1 << 32) - 1);
> 
> whoa, what is this scary calculation?
> 
> definitely needs an explanation?
> 
>> +
>> +	event = perf_event_create_kernel_counter(&attr, -1, current, NULL, pmc);
>> +	if (IS_ERR(event)) {
>> +		kvm_err("kvm: pmu event creation failed %ld\n",
>> +			  PTR_ERR(event));
> 
> doesn't this mean we'll spam the kernel log if the guest supplies
> bogus/unsupported event numbers?
> 

X86 uses printk_once, is that ok to us here?

> In that case it shoudl be kvm_debug and the guest should be able to see
> this somehow (e.g. events don't count).
> 

Yes, will think about this.

>> +		return;
>> +	}
>> +	pmc->perf_event = event;
>> +}
>> +
>> +/**
>>   * kvm_pmu_init - Initialize global PMU state for per vcpu
>>   * @vcpu: The vcpu pointer
>>   *
>> -- 
>> 2.1.0
>>

-- 
Shannon

  reply	other threads:[~2015-07-21  1:35 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-06  2:17 [PATCH 00/18] KVM: ARM64: Add guest PMU support shannon.zhao
2015-07-06  2:17 ` [PATCH 01/18] ARM64: Move PMU register related defines to asm/pmu.h shannon.zhao
2015-07-08 17:18   ` Will Deacon
2015-07-06  2:17 ` [PATCH 02/18] KVM: ARM64: Add initial support for PMU shannon.zhao
2015-07-16 18:25   ` Christoffer Dall
2015-07-17  8:13     ` Shannon Zhao
2015-07-17  9:58       ` Christoffer Dall
2015-07-17 11:34         ` Shannon Zhao
2015-07-17 12:48           ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 03/18] KVM: ARM64: Add offset defines for PMU registers shannon.zhao
2015-07-16 18:45   ` Christoffer Dall
2015-07-17  8:25     ` Shannon Zhao
2015-07-17 10:17       ` Christoffer Dall
2015-07-17 11:40         ` Shannon Zhao
2015-07-06  2:17 ` [PATCH 04/18] KVM: ARM64: Add reset and access handlers for PMCR_EL0 register shannon.zhao
2015-07-16 19:55   ` Christoffer Dall
2015-07-17  8:45     ` Shannon Zhao
2015-07-17 10:21       ` Christoffer Dall
2015-07-21  1:16         ` Shannon Zhao
2015-08-03 19:39           ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 05/18] KVM: ARM64: Add reset and access handlers for PMSELR_EL0 register shannon.zhao
2015-07-06  2:17 ` [PATCH 06/18] KVM: ARM64: Add reset and access handlers for PMCEID0_EL0 and PMCEID1_EL0 register shannon.zhao
2015-07-17 13:51   ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 07/18] KVM: ARM64: PMU: Add perf event map and introduce perf event creating function shannon.zhao
2015-07-17 14:30   ` Christoffer Dall
2015-07-21  1:35     ` Shannon Zhao [this message]
2015-08-03 19:55       ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 08/18] KVM: ARM64: Add reset and access handlers for PMXEVTYPER_EL0 register shannon.zhao
2015-07-06  2:17 ` [PATCH 09/18] KVM: ARM64: Add reset and access handlers for PMXEVCNTR_EL0 register shannon.zhao
2015-07-17 14:41   ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 10/18] KVM: ARM64: Add reset and access handlers for PMCCNTR_EL0 register shannon.zhao
2015-07-17 14:42   ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 11/18] KVM: ARM64: Add reset and access handlers for PMCNTENSET_EL0 and PMCNTENCLR_EL0 register shannon.zhao
2015-07-17 14:52   ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 12/18] KVM: ARM64: Add reset and access handlers for PMINTENSET_EL1 and PMINTENCLR_EL1 register shannon.zhao
2015-07-17 14:56   ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 13/18] KVM: ARM64: Add reset and access handlers for PMOVSSET_EL0 and PMOVSCLR_EL0 register shannon.zhao
2015-07-17 14:59   ` Christoffer Dall
2015-07-17 15:02   ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 14/18] KVM: ARM64: Add reset and access handlers for PMUSERENR_EL0 register shannon.zhao
2015-07-17 15:01   ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 15/18] KVM: ARM64: Add reset and access handlers for PMSWINC_EL0 register shannon.zhao
2015-07-17 15:13   ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 16/18] KVM: ARM64: Add access handlers for PMEVCNTRn_EL0 and PMEVTYPERn_EL0 register shannon.zhao
2015-07-17 15:19   ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 17/18] KVM: ARM64: Add PMU overflow interrupt routing shannon.zhao
2015-07-17 15:28   ` Christoffer Dall
2015-07-06  2:17 ` [PATCH 18/18] KVM: ARM64: Add KVM_CAP_ARM_PMU and KVM_ARM_PMU_SET_IRQ shannon.zhao
2015-07-17 15:32   ` Christoffer Dall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55ADA1C7.5040002@linaro.org \
    --to=shannon.zhao@linaro.org \
    --cc=alex.bennee@linaro.org \
    --cc=christoffer.dall@linaro.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=marc.zyngier@arm.com \
    --cc=will.deacon@arm.com \
    --cc=zhaoshenglong@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).