linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: nleeder@codeaurora.org (Leeder, Neil)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 2/2] perf: add arm64 smmuv3 pmu driver
Date: Mon, 7 Aug 2017 17:18:35 -0400	[thread overview]
Message-ID: <4fc734dc-fde2-3f33-2f53-54affe7a14a9@codeaurora.org> (raw)
In-Reply-To: <1de250db-87d1-ed85-d0a7-43319139412b@arm.com>

Hi Robin,
Thank you for your comments.

On 8/7/2017 10:31 AM, Robin Murphy wrote:
> On 04/08/17 20:59, Neil Leeder wrote:
>> PMUs are named smmu_0_<phys_addr_page> where <phys_addr_page>
>> is the physical page address of the SMMU PMCG.
>> For example, the SMMU PMCG at 0xff88840000 is named smmu_0_ff88840
> 
> This seems a bit rough - is it at feasible to at least chase the node
> reference and namespace them by the associated component, e.g. something
> like "arm-smmu-v3.x:pmcg.y"? The user can always dig the associated
> physical address out of /proc/iomem if necessary.
> 
That looks like it may be better - I'll look into it. 

>> Filtering by stream id is done by specifying filtering parameters
>> with the event. Options are:
>>   filter_enable    - 0 = no filtering, 1 = filtering enabled
>>   filter_span      - 0 = exact match, 1 = pattern match
>>   filter_sec       - applies to non-secure (0) or secure (1) namespace
> 
> I'm a little dubious as to how useful it is to expose this, since we
> can't see the true value of SCR.SO so have no way of knowing what we'll
> actually end up counting.

I can remove the sec filter.

>> +config ARM_SMMUV3_PMU
>> +	 bool "ARM SMMUv3 PMU"
>> +	 depends on PERF_EVENTS && ARM64 && ACPI
> 
> PERF_EVENTS is already a top-level dependency now.
> 
OK

>> +#include <linux/msi.h>
> 
> Is MSI support planned?
> 
Not in this patchset. I'll remove the include.

>> +#define SMMU_PMCG_EVCNTR0               0x0
>> +#define SMMU_PMCG_EVCNTR(n, stride)     (SMMU_PMCG_EVCNTR0 + (n) * (stride))
>> +#define SMMU_PMCG_EVTYPER0              0x400
>> +#define SMMU_PMCG_EVTYPER(n)            (SMMU_PMCG_EVTYPER0 + (n) * 4)
>> +#define SMMU_PMCG_EVTYPER_SEC_SID_SHIFT       30
>> +#define SMMU_PMCG_EVTYPER_SID_SPAN_SHIFT      29
>> +#define SMMU_PMCG_EVTYPER_EVENT_MASK          GENMASK(15, 0)
>> +#define SMMU_PMCG_SVR0                  0x600
>> +#define SMMU_PMCG_SVR(n, stride)        (SMMU_PMCG_SVR0 + (n) * (stride))
>> +#define SMMU_PMCG_SMR0                  0xA00
>> +#define SMMU_PMCG_SMR(n)                (SMMU_PMCG_SMR0 + (n) * 4)
>> +#define SMMU_PMCG_CNTENSET0             0xC00
>> +#define SMMU_PMCG_CNTENCLR0             0xC20
>> +#define SMMU_PMCG_INTENSET0             0xC40
>> +#define SMMU_PMCG_INTENCLR0             0xC60
>> +#define SMMU_PMCG_OVSCLR0               0xC80
>> +#define SMMU_PMCG_OVSSET0               0xCC0
>> +#define SMMU_PMCG_CAPR                  0xD88
>> +#define SMMU_PMCG_SCR                   0xDF8
>> +#define SMMU_PMCG_CFGR                  0xE00
>> +#define SMMU_PMCG_CFGR_SID_FILTER_TYPE        BIT(23)
>> +#define SMMU_PMCG_CFGR_CAPTURE                BIT(22)
>> +#define SMMU_PMCG_CFGR_MSI                    BIT(21)
>> +#define SMMU_PMCG_CFGR_RELOC_CTRS             BIT(20)
>> +#define SMMU_PMCG_CFGR_SIZE_MASK              GENMASK(13, 8)
>> +#define SMMU_PMCG_CFGR_SIZE_SHIFT             8
>> +#define SMMU_PMCG_CFGR_COUNTER_SIZE_32        31
>> +#define SMMU_PMCG_CFGR_NCTR_MASK              GENMASK(5, 0)
>> +#define SMMU_PMCG_CFGR_NCTR_SHIFT             0
>> +#define SMMU_PMCG_CR                    0xE04
>> +#define SMMU_PMCG_CR_ENABLE                   BIT(0)
>> +#define SMMU_PMCG_CEID0                 0xE20
>> +#define SMMU_PMCG_CEID1                 0xE28
>> +#define SMMU_PMCG_IRQ_CTRL              0xE50
>> +#define SMMU_PMCG_IRQ_CTRL_IRQEN              BIT(0)
>> +#define SMMU_PMCG_IRQ_CTRLACK           0xE54
>> +#define SMMU_PMCG_IRQ_CTRLACK_IRQEN           BIT(0)
>> +#define SMMU_PMCG_IRQ_CFG0              0xE58
>> +#define SMMU_PMCG_IRQ_CFG0_ADDR_MASK          GENMASK(51, 2)
>> +#define SMMU_PMCG_IRQ_CFG1              0xE60
>> +#define SMMU_PMCG_IRQ_CFG2              0xE64
>> +#define SMMU_PMCG_IRQ_STATUS            0xE68
>> +#define SMMU_PMCG_IRQ_STATUS_IRQ_ABT          BIT(0)
>> +#define SMMU_PMCG_AIDR                  0xE70
> 
> Several of these are unused (although at least IRQ0_CFG1 probably should
> be, to zero it to a known state if we aren't supporting MSIs).
> 
I can remove the unused defines and clear IRQ_CFG1.

>> +	bool reg_size_32;
> 
> This guy is redundant...
> 
[...]
>> +	if (smmu_pmu->reg_size_32)
> 
> ...since it would be just as efficient to directly test
> smmu_pmu->counter_mask & BIT(32) here and below.
> 
OK

>> +
>> +	for (i = 0; i < smmu_pmu->num_counters; i++) {
>> +		smmu_pmu_counter_disable(smmu_pmu, i);
>> +		smmu_pmu_interrupt_disable(smmu_pmu, i);
>> +	}
> 
> Surely it would be far quicker and simpler to do this?
> 
> 	writeq(smmu_pmu->counter_present_mask,
> 		smmu_pmu->reg_base + SMMU_PMCG_CNTENCLR0);
> 	writeq(smmu_pmu->counter_present_mask,
> 		smmu_pmu->reg_base + SMMU_PMCG_INTENCLR0);
> 
OK

>> +static inline bool smmu_pmu_has_overflowed(struct smmu_pmu *smmu_pmu, u64 ovsr)
>> +{
>> +	return !!(ovsr & smmu_pmu->counter_present_mask);
>> +}
> 
> Personally, I find these helpers abstracting simple reads/writes
> actually make the code harder to follow, especially when they're each
> used a grand total of once. That may well just be me, though.
> 
At least this one will go away with the below change to the interrupt handler.

>> +	new = SMMU_COUNTER_RELOAD;
> 
> Given that we *are* following the "use the top counter bit as an
> implicit overflow bit" pattern of arm_pmu, it feels a bit weird to not
> use the actual half-maximum value here (especially since it's easily
> computable from counter_mask). I'm about 85% sure it probably still
> works, but as ever inconsistency breeds uncertainty.

I thought that if we're happy with BIT(31) working fine with 32-bit registers,
it should also work for larger registers, so there was no need to waste more of
their bits. But I can change it to be half-max for all of them.

>> +static irqreturn_t smmu_pmu_handle_irq(int irq_num, void *data)
>> +{
>> +	struct smmu_pmu *smmu_pmu = data;
>> +	u64 ovsr;
>> +	unsigned int idx;
>> +
>> +	ovsr = smmu_pmu_getreset_ovsr(smmu_pmu);
>> +	if (!smmu_pmu_has_overflowed(smmu_pmu, ovsr))
> 
> You have an architectural guarantee that unimplemented bits of OVSSET0
> are RES0, so checking !ovsr is sufficient.
> 
OK

>> +	/* Verify specified event is supported on this PMU */
>> +	event_id = get_event(event);
>> +	if ((event_id >= SMMU_MAX_EVENT_ID) ||
> 
> What about raw imp-def events?
> 
I can keep the check for common events, but also allow any raw event
in the imp-def range.

>> +	    (!test_bit(event_id, smmu_pmu->supported_events))) {
>> +		dev_dbg_ratelimited(&smmu_pmu->pdev->dev,
>> +				    "Invalid event %d for this PMU\n",
>> +				    event_id);
>> +		return -EINVAL;
>> +	}
[...]

>> +static int smmu_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
>> +{
>> +	struct smmu_pmu *smmu_pmu;
>> +	unsigned int target;
>> +
>> +	smmu_pmu = hlist_entry_safe(node, struct smmu_pmu, node);
> 
> Is it ever valid for node to be NULL? If we can't trust it to be one of
> the PMUs we registered I think all bets are off anyway.
> 
I was following the logic in arm-ccn.c and arm-cci.c. If it works for them
I would hope it works here too.

>> +
>> +	ceid_64 = readq(smmu_pmu->reg_base + SMMU_PMCG_CEID0);
>> +	ceid[0] = ceid_64 & GENMASK(31, 0);
> 
> It took a second look to determine that that masking does nothing...
> 
>> +	ceid[1] = ceid_64 >> 32;
>> +	ceid_64 = readq(smmu_pmu->reg_base + SMMU_PMCG_CEID1);
>> +	ceid[2] = ceid_64 & GENMASK(31, 0);
>> +	ceid[3] = ceid_64 >> 32;
>> +	bitmap_from_u32array(smmu_pmu->supported_events, SMMU_MAX_EVENT_ID,
>> +			     ceid, SMMU_NUM_EVENTS_U32);
> 
> ...but then the whole lot might be cleaner and simpler with a u64[2]
> cast to u32* (or unioned to u32[4]) as necessary.
> 
I've rewritten this about 4 different ways and didn't love any of them,
including this one. I can re-do this as you suggest.

>> +static struct platform_driver smmu_pmu_driver = {
>> +	.driver = {
>> +		.name = "arm-smmu-pmu",
> 
> Nit: "arm-smmu-v3-pmu" please, for consistency with the IOMMU driver
> naming. There is a SMMUv2 PMU driver in the works, too ;)
> 
ok

Thanks,

Neil
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

  reply	other threads:[~2017-08-07 21:18 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-04 19:59 [PATCH 0/2] arm64 SMMUv3 PMU driver with IORT support Neil Leeder
2017-08-04 19:59 ` [PATCH 1/2] acpi: arm64: add iort support for PMCG Neil Leeder
2017-08-07 11:17   ` Robin Murphy
2017-08-07 20:52     ` Leeder, Neil
2017-08-07 16:44   ` Lorenzo Pieralisi
2017-08-07 21:00     ` Leeder, Neil
2018-01-30 10:39   ` Shameerali Kolothum Thodi
2018-01-30 18:00     ` Lorenzo Pieralisi
2018-01-31 12:10       ` Shameerali Kolothum Thodi
2018-01-31 12:34         ` Lorenzo Pieralisi
2017-08-04 19:59 ` [PATCH 2/2] perf: add arm64 smmuv3 pmu driver Neil Leeder
2017-08-07 14:31   ` Robin Murphy
2017-08-07 21:18     ` Leeder, Neil [this message]
2017-12-05  5:01     ` Linu Cherian
2018-03-29  7:03   ` Yisheng Xie
     [not found]     ` <e55ab4404143ea0b3cc4795a93e37480@codeaurora.org>
2018-04-01  5:44       ` Neil Leeder
2018-04-02  6:37         ` Yisheng Xie
2018-04-02 14:24           ` Hanjun Guo
2018-04-02 17:59             ` Neil Leeder
2018-04-03  1:15               ` Hanjun Guo
2018-04-04 11:35                 ` Lorenzo Pieralisi
2018-05-02 14:20           ` Agustin Vega-Frias
2018-05-03  9:22             ` Shameerali Kolothum Thodi
2018-04-18 11:05     ` Shameerali Kolothum Thodi
2018-04-19  1:17       ` Yisheng Xie
2017-08-09  7:56 ` [PATCH 0/2] arm64 SMMUv3 PMU driver with IORT support Hanjun Guo
2017-08-09 15:48   ` Leeder, Neil
2017-08-10  1:26     ` Hanjun Guo
2017-08-11  3:28       ` Leeder, Neil
2017-10-12 10:58         ` Hanjun Guo
2017-10-12 11:05           ` Lorenzo Pieralisi
2017-10-12 11:11             ` Hanjun Guo
2017-10-31 23:33 ` Yury Norov
2017-11-02 20:38   ` Leeder, Neil
2017-12-10  2:35 ` Linu Cherian
2017-12-18 14:48   ` Robin Murphy
2017-12-18 15:39     ` Marc Zyngier
2017-12-19  6:55       ` Linu Cherian
2017-12-19 12:11         ` Marc Zyngier
2017-12-19  6:36     ` Linu Cherian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4fc734dc-fde2-3f33-2f53-54affe7a14a9@codeaurora.org \
    --to=nleeder@codeaurora.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).