Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
To: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com>,
	Tvrtko Ursulin <tvrtko.ursulin@intel.com>,
	intel-xe@lists.freedesktop.org, rodrigo.vivi@intel.com
Subject: Re: [Intel-xe] [PATCH v6 3/3] drm/xe/pmu: Enable PMU interface
Date: Thu, 14 Sep 2023 11:08:01 +0530	[thread overview]
Message-ID: <a328c5db-7ae0-d178-289e-3a6ce3021425@linux.intel.com> (raw)
In-Reply-To: <87edj1tjuh.wl-ashutosh.dixit@intel.com>


On 14/09/23 10:46, Dixit, Ashutosh wrote:

Hi Ashutosh,

> On Wed, 13 Sep 2023 21:41:59 -0700, Aravind Iddamsetty wrote:
> Hi Aravind,
>
>>> On Fri, 01 Sep 2023 00:06:48 -0700, Aravind Iddamsetty wrote:
>>>
>>>> +static u64 __engine_group_busyness_read(struct xe_gt *gt, int sample_type)
>>>> +{
>>>> +	u64 val;
>>>> +
>>>> +	switch (sample_type) {
>>>> +	case __XE_SAMPLE_RENDER_GROUP_BUSY:
>>>> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
>>>> +		break;
>>>> +	case __XE_SAMPLE_COPY_GROUP_BUSY:
>>>> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
>>>> +		break;
>>>> +	case __XE_SAMPLE_MEDIA_GROUP_BUSY:
>>>> +		/*
>>>> +		 * This GSI offset is not replicated in media specific range
>>>> +		 * BSPEC: 67789, 46559
>>>> +		 */
>>>> +		if (gt->info.type == XE_GT_TYPE_MEDIA)
>>>> +			val = xe_mmio_read32(gt->tile->primary_gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
>>> Hmm, this appeared in this version. Is this really needed? Did you test
>>> this on MTL media gt?
>>>
>>> E.g. if this is needed we'd need to do similar stuff in freq_act_show()
>>> when reading MTL_MIRROR_TARGET_WP1, but we don't do this there. So we don't
>>> see this done anywhere else in XE.
>>>
>>> I don't understand this multicast steering but appears XE mmio read/write
>>> functions might already be taking care of this via:
>>>
>>> 	if (reg.addr < gt->mmio.adj_limit)
>> As per the BSPEC: 67789, 46559 the XE_OAG_ANY_MEDIA_FF_BUSY_FREE register
>> falls in the range 0038D120 - 0038DFFF but this belongs to
>> DVPMEDIAVCCRO3D, means the above register is not replicated for media gt
>> hencethe above register math is not valid in these case.
>>
>> I did verify on MTL but the register is not updating properly the same
>> was reported by windows folks as well, so I can do one thing can make
>> this change as separate patch and track that for MTL.
> Yes sounds good, please copy Matt Roper on the new MTL patch and see what
> he says.
>
>>> So could you please check this out and take out this code if not needed.
>>>
>>>> +		else
>>>> +			val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
>>>> +		break;
>>>> +	case __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY:
>>>> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
>>>> +		break;
>>>> +	default:
>>>> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
>>>> +	}
>>>> +
>>>> +	return xe_gt_clock_cycles_to_ns(gt, val * 16);
>>>> +}
>>>> +
>>>> +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config)
>>>> +{
>>>> +	int sample_type = config_counter(config) - 1;
>>>> +	const unsigned int gt_id = gt->info.id;
>>>> +	struct xe_device *xe = gt->tile->xe;
>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>> +	unsigned long flags;
>>>> +	bool device_awake;
>>>> +	u64 val;
>>>> +
>>>> +	device_awake = xe_device_mem_access_get_if_ongoing(xe);
>>>> +	if (device_awake) {
>>>> +		XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FW_GT));
>>>> +		val = __engine_group_busyness_read(gt, sample_type);
>>>> +		XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT));
>>>> +		xe_device_mem_access_put(xe);
>>>> +	}
>>>> +
>>>> +	spin_lock_irqsave(&pmu->lock, flags);
>>>> +
>>>> +	if (device_awake)
>>>> +		pmu->sample[gt_id][sample_type] = val;
>>>> +	else
>>>> +		val = pmu->sample[gt_id][sample_type];
>>>> +
>>>> +	spin_unlock_irqrestore(&pmu->lock, flags);
>>>> +
>>>> +	return val;
>>>> +}
>>>> +
>>>> +static void engine_group_busyness_store(struct xe_gt *gt)
>>>> +{
>>>> +	struct xe_pmu *pmu = &gt->tile->xe->pmu;
>>>> +	unsigned int gt_id = gt->info.id;
>>>> +	unsigned long flags;
>>>> +	int i;
>>>> +
>>>> +	spin_lock_irqsave(&pmu->lock, flags);
>>>> +
>>>> +	for (i = __XE_SAMPLE_RENDER_GROUP_BUSY; i <= __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY; i++)
>>>> +		pmu->sample[gt_id][i] = __engine_group_busyness_read(gt, i);
>>> Just a reminder about what we discussed about this earlier. Here we are
>>> saving all these samples unconditionally, whether or not PMU will ever be
>>> used. We have deferred the optimization to only save enabled counters to a
>>> later patch. So we might still come back and do that.
>> OK
>>>> +static void xe_pmu_event_read(struct perf_event *event)
>>>> +{
>>>> +	struct xe_device *xe =
>>>> +		container_of(event->pmu, typeof(*xe), pmu.base);
>>>> +	struct hw_perf_event *hwc = &event->hw;
>>>> +	struct xe_pmu *pmu = &xe->pmu;
>>>> +	u64 prev, new;
>>>> +
>>>> +	if (pmu->closed) {
>>>> +		event->hw.state = PERF_HES_STOPPED;
>>>> +		return;
>>>> +	}
>>>> +again:
>>>> +	prev = local64_read(&hwc->prev_count);
>>>> +	new = __xe_pmu_event_read(event);
>>>> +
>>>> +	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
>>>> +		goto again;
>>>> +
>>>> +	local64_add(new - prev, &event->count);
>>> Just curious, in case you know, why do we do this here rather than just a:
>>>
>>> 	local64_set(new, &event->count);
>>>
>>> The code is the same as i915 so is likely correct, so no issue, if you
>>> don't know just ignore this comment. No changes needed here.
>> So a PMU counter can be opened by multiple clients simultaneously in which case
>> we shall always report the relative value per the clients view, hence
>> that math.
> OK.
>
>>>> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
>>>> index 86f16d50e9cc..8a59350b3cea 100644
>>>> --- a/include/uapi/drm/xe_drm.h
>>>> +++ b/include/uapi/drm/xe_drm.h
>>>> @@ -1056,6 +1056,45 @@ struct drm_xe_vm_madvise {
>>>> 	__u64 reserved[2];
>>>>  };
>>>>
>>>> +/* PMU event config IDs */
>>>> +
>>>> +/*
>>>> + * The below config IDs needs to be filled in the struct perf_even_attr field
>>>> + * as part of perf_event_open call to read a particular event.
>>>> + *
>>>> + * check man perf_event_open.
>>>> + *
>>>> + * example on how to open an event:
>>>> + * .. code-block:: C
>>>> + *	struct perf_event_attr pe;
>>>> + *	long long count;
>>>> + *	int cpu = 0;
>>>> + *	int fd;
>>>> + *
>>>> + *	memset(&pe, 0, sizeof(struct perf_event_attr));
>>>> + *	pe.type = type // eg: /sys/bus/event_source/devices/xe_0000_56_00.0/type
>>>> + *	pe.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED;
>>>> + *	pe.use_clockid = 1;
>>>> + *	pe.clockid = CLOCK_MONOTONIC;
>>>> + *	pe.config = XE_PMU_INTERRUPTS(0);
>>>> + *
>>>> + *	fd = perf_event_open(&pe, -1, cpu, -1, 0);
>>>> + */
>>> Pretty good, let me fix up the formatting, style and some typos. But please
>>> check.
>>>
>>> /**
>>>  * XE PMU event config ID's
>>>  *
>>>  * Check 'man perf_event_open' to use these ID's in 'struct
>>>  * perf_event_attr' as part of perf_event_open syscall to read particular
>>>  * event counts.
>>>  *
>>>  * For example, to open the XE_PMU_INTERRUPTS(0) event:
>>>  *
>>>  * .. code-block:: C
>>>  *
>>>  *	struct perf_event_attr attr = {};
>>>  *	long long count;
>>>  *	int cpu = 0;
>>>  *	int fd;
>>>  *
>>>  *	attr.type = type; /* /sys/bus/event_source/devices/xe_0000_56_00.0/type */
>>>  *	attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED;
>>>  *	attr.use_clockid = 1;
>>>  *	attr.clockid = CLOCK_MONOTONIC;
>>>  *	attr.config = XE_PMU_INTERRUPTS(0);
>>>  *
>>>  *	fd = syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
>>>  */
>> we have a perf_event_open API we needn't use syscall so i'll mention
>> that, but rest of the text looks good.
> Where is the perf_event_open() API? I only see it in IGT and that doesn't
> count here. That is why I changed it to syscall.

ya you are right there is no glibc wrapper for this.

Thanks,
Aravind.
>
>>> Apart from the couple of things above to be fixed up, I don't want to hold
>>> onto this review any longer, so this is now:
>>>
>>> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
>> thanks, so will respin the series by nuking the MTL change.
> Thanks.
> --
> Ashutosh

  reply	other threads:[~2023-09-14  5:29 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-01  7:06 [Intel-xe] [PATCH v6 0/3] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
2023-09-01  7:06 ` [Intel-xe] [PATCH v3 1/3] drm/xe: Get GT clock to nanosecs Aravind Iddamsetty
2023-09-01  7:06 ` [Intel-xe] [PATCH 2/3] drm/xe: Use spinlock in forcewake instead of mutex Aravind Iddamsetty
2023-09-01  7:06 ` [Intel-xe] [PATCH v6 3/3] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
2023-09-02  5:00   ` Dixit, Ashutosh
2023-09-14  4:41     ` Aravind Iddamsetty
2023-09-14  5:16       ` Dixit, Ashutosh
2023-09-14  5:38         ` Aravind Iddamsetty [this message]
2023-09-01  7:28 ` [Intel-xe] ✓ CI.Patch_applied: success for drm/xe/pmu: Enable PMU interface (rev6) Patchwork
2023-09-01  7:29 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
2023-09-01  7:30 ` [Intel-xe] ✓ CI.KUnit: success " Patchwork
2023-09-01  7:37 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-09-01  7:37 ` [Intel-xe] ✗ CI.Hooks: failure " Patchwork
2023-09-01  7:37 ` [Intel-xe] ✗ CI.checksparse: warning " Patchwork
2023-09-01  8:11 ` [Intel-xe] ✗ CI.BAT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a328c5db-7ae0-d178-289e-3a6ce3021425@linux.intel.com \
    --to=aravind.iddamsetty@linux.intel.com \
    --cc=ashutosh.dixit@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=krishnaiah.bommu@intel.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=tvrtko.ursulin@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox