From: agustinv@codeaurora.org (agustinv at codeaurora.org)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH V1] perf: qcom: Add L3 cache PMU driver
Date: Mon, 21 Mar 2016 12:06:57 -0400 [thread overview]
Message-ID: <79190bfdf24ac993206e9aa00f0c437b@codeaurora.org> (raw)
In-Reply-To: <20160321103507.GB17326@leverpostej>
On 2016-03-21 06:35, Mark Rutland wrote:
> On Fri, Mar 18, 2016 at 04:37:02PM -0400, Agustin Vega-Frias wrote:
>> This adds a new dynamic PMU to the Perf Events framework to program
>> and control the L3 cache PMUs in some Qualcomm Technologies SOCs.
>>
>> The driver supports a distributed cache architecture where the overall
>> cache is comprised of multiple slices each with its own PMU. The
>> driver
>> aggregates counts across the whole system to provide a global picture
>> of the metrics selected by the user.
>>
>> The driver exports formatting and event information to sysfs so it can
>> be used by the perf user space tools with the syntaxes:
>> perf stat -a -e l3cache/read-miss/
>> perf stat -a -e l3cache/event=0x21/
>>
>> Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
>> ---
>> arch/arm64/kernel/Makefile | 4 +
>> arch/arm64/kernel/perf_event_qcom_l3_cache.c | 816
>> +++++++++++++++++++++++++++
>> 2 files changed, 820 insertions(+)
>> create mode 100644 arch/arm64/kernel/perf_event_qcom_l3_cache.c
>
> Move this to drivers/bus (where the CCI and CCN PMU drivers live), or
> drivers/perf (where some common infrastructure lives).
>
> This isn't architectural, and isn't CPU-specific, so it has no reason
> to
> live in arch/arm64.
>
I will move it to drivers/perf. Thanks.
>> +static
>> +int qcom_l3_cache__event_init(struct perf_event *event)
>> +{
>> + struct hw_perf_event *hwc = &event->hw;
>> +
>> + if (event->attr.type != l3cache_pmu.pmu.type)
>> + return -ENOENT;
>> +
>> + /*
>> + * There are no per-counter mode filters in the PMU.
>> + */
>> + if (event->attr.exclude_user || event->attr.exclude_kernel ||
>> + event->attr.exclude_hv || event->attr.exclude_idle)
>> + return -EINVAL;
>> +
>> + hwc->idx = -1;
>> +
>> + /*
>> + * Sampling not supported since these events are not
>> core-attributable.
>> + */
>> + if (hwc->sample_period)
>> + return -EINVAL;
>> +
>> + /*
>> + * Task mode not available, we run the counters as system counters,
>> + * not attributable to any CPU and therefore cannot attribute
>> per-task.
>> + */
>> + if (event->cpu < 0)
>> + return -EINVAL;
>> +
>> + return 0;
>> +}
>
> Please follow what the other system PMUs do and (forcefully) ensure
> that all
> events share the same CPU in event_init.
>
> Otherwise, events can exist in multiple percpu contexts, and management
> thereof can race on things line pmu_{enable,disable}.
>
> You'll also want to expose a cpumask to userspace for that, to ensure
> that it
> only opens events on a single CPU.
>
> For an example, see drivers/bus/arm-ccn.c. That also handles migrating
> events to a new CPU upon hotplug.
Understood. I will look at the CCN implementation as reference,
>
>> +static
>> +int qcom_l3_cache__event_add(struct perf_event *event, int flags)
>> +{
>> + struct l3cache_pmu *system = to_l3cache_pmu(event->pmu);
>> + struct hw_perf_event *hwc = &event->hw;
>> + int idx;
>> + int prev_cpu;
>> + int err = 0;
>> +
>> + /*
>> + * We need to disable the pmu while adding the event, otherwise
>> + * the perf tick might kick-in and re-add this event.
>> + */
>> + perf_pmu_disable(event->pmu);
>> +
>> + /*
>> + * This ensures all events are on the same CPU context. No need to
>> open
>> + * these on all CPUs since they are system events. The strategy here
>> is
>> + * to set system->cpu when the first event is created and from that
>> + * point on, only events in the same CPU context will be allowed to
>> be
>> + * active. system->cpu will get reset back to -1 when the last event
>> + * is deleted, please see qcom_l3_cache__event_del below.
>> + */
>> + prev_cpu = atomic_cmpxchg(&system->cpu, -1, event->cpu);
>> + if ((event->cpu != prev_cpu) && (prev_cpu != -1)) {
>> + err = -EAGAIN;
>> + goto out;
>> + }
>
> As above, handle this in event_init, as other system PMUs do.
>
> Handling it here is rather late, and unnecessarily fragile.
Understood. I will look at the CCN implementation as reference,
>
>> +/*
>> + * In some platforms interrupt resources might not come directly from
>> the GIC,
>> + * but from separate IRQ circuitry that signals a summary IRQ to the
>> GIC and
>> + * is handled by a secondary IRQ controller. This is problematic
>> under ACPI boot
>> + * because the ACPI core does not use the Resource Source field on
>> the Extended
>> + * Interrupt Descriptor, which in theory could be used to specify an
>> alternative
>> + * IRQ controller.
>> +
>> + * For this reason in these platforms we implement the secondary IRQ
>> controller
>> + * using the gpiolib and specify the IRQs as GpioInt resources, so
>> when getting
>> + * an IRQ from the device we first try platform_get_irq and if it
>> fails we try
>> + * devm_gpiod_get_index/gpiod_to_irq.
>> + */
>> +static
>> +int qcom_l3_cache_pmu_get_slice_irq(struct platform_device *pdev,
>> + struct platform_device *sdev)
>> +{
>> + int virq = platform_get_irq(sdev, 0);
>> + struct gpio_desc *desc;
>> +
>> + if (virq >= 0)
>> + return virq;
>> +
>> + desc = devm_gpiod_get_index(&sdev->dev, NULL, 0, GPIOD_ASIS);
>> + if (IS_ERR(desc))
>> + return -ENOENT;
>> +
>> + return gpiod_to_irq(desc);
>> +}
>> +
>
> As Marc Zyngier pointed out in another thread, you should represent
> your
> interrupt controller as an interrupt controller rather than pretending
> it is a
> GPIO controller.
>
> Drivers should be able to remain blissfully unaware what the other end
> of their
> interrupt line is wired up to, and shouldn't have to jump through hoops
> like
> the above.
Understood. We need to close the loop with Rafael J. Wysocki w.r.t. ACPI
support for multiple regular IRQ controllers similar to DT.
>
> Thanks,
> Mark.
Thanks,
Agustin
next prev parent reply other threads:[~2016-03-21 16:06 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-03-18 20:37 [PATCH V1] perf: qcom: Add L3 cache PMU driver Agustin Vega-Frias
2016-03-21 9:04 ` Peter Zijlstra
2016-03-21 15:56 ` agustinv at codeaurora.org
2016-03-21 16:00 ` Peter Zijlstra
2016-03-21 10:35 ` Mark Rutland
2016-03-21 10:54 ` Will Deacon
2016-03-21 12:04 ` Peter Zijlstra
2016-03-21 16:37 ` agustinv at codeaurora.org
2016-03-21 16:06 ` agustinv at codeaurora.org [this message]
2016-03-22 18:33 ` agustinv at codeaurora.org
[not found] ` <56F26FF1.90002@arm.com>
2016-03-23 12:36 ` agustinv at codeaurora.org
2016-03-23 14:46 ` Peter Zijlstra
2016-03-22 20:48 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=79190bfdf24ac993206e9aa00f0c437b@codeaurora.org \
--to=agustinv@codeaurora.org \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).