From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com>,
Tvrtko Ursulin <tvrtko.ursulin@intel.com>,
intel-xe@lists.freedesktop.org, rodrigo.vivi@intel.com
Subject: Re: [Intel-xe] [PATCH v6 3/3] drm/xe/pmu: Enable PMU interface
Date: Wed, 13 Sep 2023 22:16:54 -0700 [thread overview]
Message-ID: <87edj1tjuh.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <a681fe7a-4487-a8a6-1830-bdc23cc98ff9@linux.intel.com>
On Wed, 13 Sep 2023 21:41:59 -0700, Aravind Iddamsetty wrote:
>
Hi Aravind,
> > On Fri, 01 Sep 2023 00:06:48 -0700, Aravind Iddamsetty wrote:
> >
> >> +static u64 __engine_group_busyness_read(struct xe_gt *gt, int sample_type)
> >> +{
> >> + u64 val;
> >> +
> >> + switch (sample_type) {
> >> + case __XE_SAMPLE_RENDER_GROUP_BUSY:
> >> + val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
> >> + break;
> >> + case __XE_SAMPLE_COPY_GROUP_BUSY:
> >> + val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
> >> + break;
> >> + case __XE_SAMPLE_MEDIA_GROUP_BUSY:
> >> + /*
> >> + * This GSI offset is not replicated in media specific range
> >> + * BSPEC: 67789, 46559
> >> + */
> >> + if (gt->info.type == XE_GT_TYPE_MEDIA)
> >> + val = xe_mmio_read32(gt->tile->primary_gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> > Hmm, this appeared in this version. Is this really needed? Did you test
> > this on MTL media gt?
> >
> > E.g. if this is needed we'd need to do similar stuff in freq_act_show()
> > when reading MTL_MIRROR_TARGET_WP1, but we don't do this there. So we don't
> > see this done anywhere else in XE.
> >
> > I don't understand this multicast steering but appears XE mmio read/write
> > functions might already be taking care of this via:
> >
> > if (reg.addr < gt->mmio.adj_limit)
>
> As per Bspec 67789 and 46559, the XE_OAG_ANY_MEDIA_FF_BUSY_FREE register
> falls in the range 0038D120 - 0038DFFF, which belongs to
> DVPMEDIAVCCRO3D. That means the register is not replicated for the media
> gt, hence the offset adjustment math above is not valid in this case.
>
> I did verify on MTL, but the register is not updating properly; the same
> was reported by the Windows folks as well. So I can split this change out
> into a separate patch and track it for MTL.
Yes sounds good, please copy Matt Roper on the new MTL patch and see what
he says.
> >
> > So could you please check this out and take out this code if not needed.
> >
> >> + else
> >> + val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> >> + break;
> >> + case __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY:
> >> + val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
> >> + break;
> >> + default:
> >> + drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
> >> + }
> >> +
> >> + return xe_gt_clock_cycles_to_ns(gt, val * 16);
> >> +}
> >> +
> >> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
> >> +{
> >> + int sample_type = config_counter(config) - 1;
> >> + const unsigned int gt_id = gt->info.id;
> >> + struct xe_device *xe = gt->tile->xe;
> >> + struct xe_pmu *pmu = &xe->pmu;
> >> + unsigned long flags;
> >> + bool device_awake;
> >> + u64 val;
> >> +
> >> + device_awake = xe_device_mem_access_get_if_ongoing(xe);
> >> + if (device_awake) {
> >> + XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FW_GT));
> >> + val = __engine_group_busyness_read(gt, sample_type);
> >> + XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT));
> >> + xe_device_mem_access_put(xe);
> >> + }
> >> +
> >> + spin_lock_irqsave(&pmu->lock, flags);
> >> +
> >> + if (device_awake)
> >> + pmu->sample[gt_id][sample_type] = val;
> >> + else
> >> + val = pmu->sample[gt_id][sample_type];
> >> +
> >> + spin_unlock_irqrestore(&pmu->lock, flags);
> >> +
> >> + return val;
> >> +}
> >> +
> >> +static void engine_group_busyness_store(struct xe_gt *gt)
> >> +{
> >> + struct xe_pmu *pmu = &gt->tile->xe->pmu;
> >> + unsigned int gt_id = gt->info.id;
> >> + unsigned long flags;
> >> + int i;
> >> +
> >> + spin_lock_irqsave(&pmu->lock, flags);
> >> +
> >> + for (i = __XE_SAMPLE_RENDER_GROUP_BUSY; i <= __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY; i++)
> >> + pmu->sample[gt_id][i] = __engine_group_busyness_read(gt, i);
> > Just a reminder about what we discussed about this earlier. Here we are
> > saving all these samples unconditionally, whether or not PMU will ever be
> > used. We have deferred the optimization to only save enabled counters to a
> > later patch. So we might still come back and do that.
> OK
> >
> >> +static void xe_pmu_event_read(struct perf_event *event)
> >> +{
> >> + struct xe_device *xe =
> >> + container_of(event->pmu, typeof(*xe), pmu.base);
> >> + struct hw_perf_event *hwc = &event->hw;
> >> + struct xe_pmu *pmu = &xe->pmu;
> >> + u64 prev, new;
> >> +
> >> + if (pmu->closed) {
> >> + event->hw.state = PERF_HES_STOPPED;
> >> + return;
> >> + }
> >> +again:
> >> + prev = local64_read(&hwc->prev_count);
> >> + new = __xe_pmu_event_read(event);
> >> +
> >> + if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
> >> + goto again;
> >> +
> >> + local64_add(new - prev, &event->count);
> > Just curious, in case you know, why do we do this here rather than just a:
> >
> > local64_set(&event->count, new);
> >
> > The code is the same as i915 so is likely correct, so no issue, if you
> > don't know just ignore this comment. No changes needed here.
> So a PMU counter can be opened by multiple clients simultaneously, in which
> case we must always report the value relative to each client's view, hence
> that math.
OK.
> >
> >> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> >> index 86f16d50e9cc..8a59350b3cea 100644
> >> --- a/include/uapi/drm/xe_drm.h
> >> +++ b/include/uapi/drm/xe_drm.h
> >> @@ -1056,6 +1056,45 @@ struct drm_xe_vm_madvise {
> >> __u64 reserved[2];
> >> };
> >>
> >> +/* PMU event config IDs */
> >> +
> >> +/*
> >> + * The below config IDs needs to be filled in the struct perf_even_attr field
> >> + * as part of perf_event_open call to read a particular event.
> >> + *
> >> + * check man perf_event_open.
> >> + *
> >> + * example on how to open an event:
> >> + * .. code-block:: C
> >> + * struct perf_event_attr pe;
> >> + * long long count;
> >> + * int cpu = 0;
> >> + * int fd;
> >> + *
> >> + * memset(&pe, 0, sizeof(struct perf_event_attr));
> >> + * pe.type = type // eg: /sys/bus/event_source/devices/xe_0000_56_00.0/type
> >> + * pe.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED;
> >> + * pe.use_clockid = 1;
> >> + * pe.clockid = CLOCK_MONOTONIC;
> >> + * pe.config = XE_PMU_INTERRUPTS(0);
> >> + *
> >> + * fd = perf_event_open(&pe, -1, cpu, -1, 0);
> >> + */
> > Pretty good, let me fix up the formatting, style and some typos. But please
> > check.
> >
> > /**
> > * XE PMU event config ID's
> > *
> > * Check 'man perf_event_open' to use these ID's in 'struct
> > * perf_event_attr' as part of perf_event_open syscall to read particular
> > * event counts.
> > *
> > * For example, to open the XE_PMU_INTERRUPTS(0) event:
> > *
> > * .. code-block:: C
> > *
> > * struct perf_event_attr attr = {};
> > * long long count;
> > * int cpu = 0;
> > * int fd;
> > *
> > * attr.type = type; /* /sys/bus/event_source/devices/xe_0000_56_00.0/type */
> > * attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED;
> > * attr.use_clockid = 1;
> > * attr.clockid = CLOCK_MONOTONIC;
> > * attr.config = XE_PMU_INTERRUPTS(0);
> > *
> > * fd = syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
> > */
> We have a perf_event_open() API, so we needn't use syscall; I'll mention
> that. But the rest of the text looks good.
Where is the perf_event_open() API? I only see it in IGT and that doesn't
count here. That is why I changed it to syscall.
> >
> > Apart from the couple of things above to be fixed up, I don't want to hold
> > onto this review any longer, so this is now:
> >
> > Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
>
> thanks, so will respin the series by nuking the MTL change.
Thanks.
--
Ashutosh
Thread overview: 15+ messages
2023-09-01 7:06 [Intel-xe] [PATCH v6 0/3] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
2023-09-01 7:06 ` [Intel-xe] [PATCH v3 1/3] drm/xe: Get GT clock to nanosecs Aravind Iddamsetty
2023-09-01 7:06 ` [Intel-xe] [PATCH 2/3] drm/xe: Use spinlock in forcewake instead of mutex Aravind Iddamsetty
2023-09-01 7:06 ` [Intel-xe] [PATCH v6 3/3] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
2023-09-02 5:00 ` Dixit, Ashutosh
2023-09-14 4:41 ` Aravind Iddamsetty
2023-09-14 5:16 ` Dixit, Ashutosh [this message]
2023-09-14 5:38 ` Aravind Iddamsetty
2023-09-01 7:28 ` [Intel-xe] ✓ CI.Patch_applied: success for drm/xe/pmu: Enable PMU interface (rev6) Patchwork
2023-09-01 7:29 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
2023-09-01 7:30 ` [Intel-xe] ✓ CI.KUnit: success " Patchwork
2023-09-01 7:37 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-09-01 7:37 ` [Intel-xe] ✗ CI.Hooks: failure " Patchwork
2023-09-01 7:37 ` [Intel-xe] ✗ CI.checksparse: warning " Patchwork
2023-09-01 8:11 ` [Intel-xe] ✗ CI.BAT: failure " Patchwork