From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 38D0EC83F2F for ; Thu, 31 Aug 2023 17:12:24 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D441410E6A5; Thu, 31 Aug 2023 17:12:23 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTPS id E0FFC10E6A9 for ; Thu, 31 Aug 2023 17:12:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1693501941; x=1725037941; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version:content-transfer-encoding; bh=sygOc06ltmuD3HpuOVSB+3GcZZ2/eNY8UH8taJOOKtc=; b=Io0/5p5jrlg3lKnONIETJQE4lNOu0VbiVNyQGUFeA+deAQWcZ/e7Ynyx GGx4UjZ44fGOXtcUBLY36kZf8z+DndDhPlzMv/epXD9W2zQq+P1sf7gid JY4tRqUhy2YB+QN1DNacqZeES27kA210YcCMoa4NdVbi6VtSXbDYieuLi Am1RoBWlPXbAUcrLesbWEQ8voV3sNJbB1ynXSEApI1OLMpkInXlD1qUFY +IsHJVib24l+6h0ltrptCk0NlKyoYshvZRa1+9SB2y5ZDZtgV1Fn6bAQL bAdm7G6Lubp3skrgfxSJ1gAd3xmKZmTWl8G37NTRvueb3YfsULP25hs+1 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10819"; a="373410009" X-IronPort-AV: E=Sophos;i="6.02,217,1688454000"; d="scan'208";a="373410009" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Aug 2023 10:12:21 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10819"; a="716436074" X-IronPort-AV: E=Sophos;i="6.02,217,1688454000"; d="scan'208";a="716436074" Received: from adixit-mobl.amr.corp.intel.com (HELO adixit-arch.intel.com) ([10.212.199.173]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Aug 2023 10:12:20 -0700 Date: Thu, 31 Aug 2023 09:58:54 -0700 Message-ID: <877cpb173l.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: Aravind Iddamsetty In-Reply-To: <3b85e034-b1a4-29d1-2e4f-08ea6562b6ba@linux.intel.com> References: <20230830051544.369643-1-aravind.iddamsetty@linux.intel.com> <20230830051544.369643-4-aravind.iddamsetty@linux.intel.com> <87a5u724xe.wl-ashutosh.dixit@intel.com> <3b85e034-b1a4-29d1-2e4f-08ea6562b6ba@linux.intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/29.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Subject: Re: [Intel-xe] [PATCH v5 3/3] drm/xe/pmu: Enable PMU interface X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Bommu Krishnaiah , intel-xe@lists.freedesktop.org, Tvrtko Ursulin Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" On Thu, 31 Aug 2023 03:29:11 -0700, Aravind Iddamsetty wrote: > Hi Aravind, Hmm, what happened to the email formatting here? > On Tue, 29 Aug 2023 22:15:44 -0700, Aravind Iddamsetty wrote: > > diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c > new file mode 100644 > index 000000000000..41dd422812ff > --- /dev/null > +++ b/drivers/gpu/drm/xe/xe_pmu.c > @@ -0,0 +1,679 @@ > +// SPDX-License-Identifier: MIT > +/* > + * Copyright =A9 2023 Intel Corporation > + */ > + > +#include > +#include > +#include > + > +#include "regs/xe_gt_regs.h" > +#include "xe_device.h" > +#include "xe_gt_clock.h" > +#include "xe_mmio.h" > + > +static cpumask_t xe_pmu_cpumask; > +static unsigned int xe_pmu_target_cpu =3D -1; > + > +static unsigned int config_gt_id(const u64 config) > +{ > + return config >> __XE_PMU_GT_SHIFT; > +} > + > +static u64 config_counter(const u64 config) > +{ > + return config & ~(~0ULL << __XE_PMU_GT_SHIFT); > +} > + > +static int engine_busyness_sample_type(u64 config) > +{ > + int type =3D 0; > > > Why initialize? The switch statement should have a default with a BUG/WAR= N_ON > below? Also see the comment below. > > + > + switch (config) { > + case XE_PMU_RENDER_GROUP_BUSY(0): > + type =3D __XE_SAMPLE_RENDER_GROUP_BUSY; > + break; > + case XE_PMU_COPY_GROUP_BUSY(0): > + type =3D __XE_SAMPLE_COPY_GROUP_BUSY; > + break; > + case XE_PMU_MEDIA_GROUP_BUSY(0): > + type =3D __XE_SAMPLE_MEDIA_GROUP_BUSY; > + break; > + case XE_PMU_ANY_ENGINE_GROUP_BUSY(0): > + type =3D __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY; > + break; > + } > + > + return type; > +} > > > I am thinking this function is not really needed. We can just do: > > int sample_type =3D config - 1; > > or > > int sample_type =3D config_counter(config) - 1; > > It might not always be true in future, the configs can start from any ran= ge. Disagree. This is uapi. Once it is exposed it cannot change. I am talking about this: +#define XE_PMU_INTERRUPTS(gt) ___XE_PMU_OTHER(gt, 0) +#define XE_PMU_RENDER_GROUP_BUSY(gt) ___XE_PMU_OTHER(gt, 1) +#define XE_PMU_COPY_GROUP_BUSY(gt) ___XE_PMU_OTHER(gt, 2) +#define XE_PMU_MEDIA_GROUP_BUSY(gt) ___XE_PMU_OTHER(gt, 3) +#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt) ___XE_PMU_OTHER(gt, 4) How can this "start from any range"? We can only add new counters after these, not before these, correct? > in engine_group_busyness_read? See comment at __xe_pmu_event_read below. > > + > +static void xe_pmu_event_destroy(struct perf_event *event) > +{ > + struct xe_device *xe =3D > + container_of(event->pmu, typeof(*xe), pmu.base); > + > + drm_WARN_ON(&xe->drm, event->parent); > + > + drm_dev_put(&xe->drm); > +} > + > +static u64 __engine_group_busyness_read(struct xe_gt *gt, int sample_typ= e) > +{ > + u64 val =3D 0; No need to initialize here I think. We are not really expecting to drop into the default case, which should be caught much before we enter this function. > + > + switch (sample_type) { > + case __XE_SAMPLE_RENDER_GROUP_BUSY: > + val =3D xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE); > + break; > + case __XE_SAMPLE_COPY_GROUP_BUSY: > + val =3D xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE); > + break; > + case __XE_SAMPLE_MEDIA_GROUP_BUSY: > + val =3D xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE); > + break; > + case __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY: > + val =3D xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE); > + break; > + default: > + drm_warn(>->tile->xe->drm, "unknown pmu event\n"); > + } > + > + return xe_gt_clock_cycles_to_ns(gt, val * 16); > +} > + > +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config) > +{ > + int sample_type =3D engine_busyness_sample_type(config); > > > If config is event->attr.config, this can just be 'config_counter(config)= - 1'. > See comment at __xe_pmu_event_read below. > > + const unsigned int gt_id =3D gt->info.id; > + struct xe_device *xe =3D gt->tile->xe; > + struct xe_pmu *pmu =3D &xe->pmu; > + unsigned long flags; > + bool device_awake; > + u64 val; > + > + device_awake =3D xe_device_mem_access_get_if_ongoing(xe); > + if (device_awake) { > + XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FW_GT)); > + val =3D __engine_group_busyness_read(gt, sample_type); > + XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT)); > + xe_device_mem_access_put(xe); > + } > + > + spin_lock_irqsave(&pmu->lock, flags); > + > + if (device_awake) > + pmu->sample[gt_id][sample_type] =3D val; > + else > + val =3D pmu->sample[gt_id][sample_type]; > + > + spin_unlock_irqrestore(&pmu->lock, flags); > + > + return val; > +} > + > +static void engine_group_busyness_store(struct xe_gt *gt) > +{ > + struct xe_pmu *pmu =3D >->tile->xe->pmu; > + unsigned int gt_id =3D gt->info.id; > + unsigned long flags; > + int i; > + > + spin_lock_irqsave(&pmu->lock, flags); > + > + for (i =3D __XE_SAMPLE_RENDER_GROUP_BUSY; i <=3D __XE_SAMPLE_ANY_ENGINE= _GROUP_BUSY; i++) { > + pmu->sample[gt_id][i] =3D __engine_group_busyness_read(gt, i); > > > This is not quite right. At the minimum we need to take forcewake > here. Also since this is called in both suspend and runtime_suspend code > paths we might also need to the take the runtime_pm reference. > > The pm reference and forcewake are already taken in suspend paths hence > didn't add here again as this is called only from those paths. > > check xe_gt_suspend. Sorry, you are right, I missed it. So this is fine. > > > > I think the simplest might be to just construct 'config' > (event->attr.config) here and call engine_group_busyness_read? Something > like: > > for (i =3D __XE_SAMPLE_RENDER_GROUP_BUSY; i <=3D __XE_SAMPLE_ANY_ENGINE_G= ROUP_BUSY; i++) { > config =3D ; // Construct config using gt_id and i > engine_group_busyness_read(gt, i); > } > > This will automatically save the read values in pmu->sample[][] if the > device is awake. Thoughts? > > I think this is best kept separate from usual read paths(which are > atomic) didn't want to club them. Also because forcewakes and pm > reference are taken separately in suspend path. Sure, no changes needed here. Just get rid of the braces to keep checkpatch happy. > > > > + } > + > + spin_unlock_irqrestore(&pmu->lock, flags); > +} > + > +static int > +config_status(struct xe_device *xe, u64 config) > +{ > + unsigned int max_gt_id =3D xe->info.gt_count > 1 ? 1 : 0; > > > What is this for? See below. > > reminiscent of my previous code, will clean it up. > > > > + unsigned int gt_id =3D config_gt_id(config); > + struct xe_gt *gt =3D xe_device_get_gt(xe, gt_id); > + > + if (gt_id > max_gt_id) > > > Maybe this can just be: > > if (gt_id >=3D XE_PMU_MAX_GT) > > + return -ENOENT; > + > + switch (config_counter(config)) { > + case XE_PMU_INTERRUPTS(0): > + if (gt_id) > + return -ENOENT; > + break; > + case XE_PMU_RENDER_GROUP_BUSY(0): > + case XE_PMU_COPY_GROUP_BUSY(0): > + case XE_PMU_ANY_ENGINE_GROUP_BUSY(0): > + if (gt->info.type =3D=3D XE_GT_TYPE_MEDIA) > + return -ENOENT; > + break; > + case XE_PMU_MEDIA_GROUP_BUSY(0): > + if (!(gt->info.engine_mask & (BIT(XE_HW_ENGINE_VCS0) | BIT(XE_HW_ENGIN= E_VECS0)))) > + return -ENOENT; > + break; > + default: > + return -ENOENT; > + } > + > + return 0; > +} > + > +static int xe_pmu_event_init(struct perf_event *event) > +{ > + struct xe_device *xe =3D > + container_of(event->pmu, typeof(*xe), pmu.base); > + struct xe_pmu *pmu =3D &xe->pmu; > + int ret; > + > + if (pmu->closed) > + return -ENODEV; > + > + if (event->attr.type !=3D event->pmu->type) > + return -ENOENT; > + > + /* unsupported modes and filters */ > + if (event->attr.sample_period) /* no sampling */ > + return -EINVAL; > + > + if (has_branch_stack(event)) > + return -EOPNOTSUPP; > + > + if (event->cpu < 0) > + return -EINVAL; > + > + /* only allow running on one cpu at a time */ > + if (!cpumask_test_cpu(event->cpu, &xe_pmu_cpumask)) > + return -EINVAL; > + > + ret =3D config_status(xe, event->attr.config); > + if (ret) > + return ret; > + > + if (!event->parent) { > + drm_dev_get(&xe->drm); > + event->destroy =3D xe_pmu_event_destroy; > + } > + > + return 0; > +} > + > +static u64 __xe_pmu_event_read(struct perf_event *event) > +{ > + struct xe_device *xe =3D > + container_of(event->pmu, typeof(*xe), pmu.base); > + const unsigned int gt_id =3D config_gt_id(event->attr.config); > + const u64 config =3D config_counter(event->attr.config); > > > Probably nit but this config being different from event->attr.config is > confusing. Let's use 'event->attr.config' throughout as argument to > functions and use config_counter() to get rid of gt_id. So get rid of this > config variable. > > + struct xe_gt *gt =3D xe_device_get_gt(xe, gt_id); > + struct xe_pmu *pmu =3D &xe->pmu; > + u64 val =3D 0; > + > + switch (config) { > > > switch (config_counter(event->attr.config)) > > + case XE_PMU_INTERRUPTS(0): > + val =3D READ_ONCE(pmu->irq_count); > + break; > + case XE_PMU_RENDER_GROUP_BUSY(0): > + case XE_PMU_COPY_GROUP_BUSY(0): > + case XE_PMU_ANY_ENGINE_GROUP_BUSY(0): > + case XE_PMU_MEDIA_GROUP_BUSY(0): > + val =3D engine_group_busyness_read(gt, config); > > > engine_group_busyness_read(gt, event->attr.config); > > hmmm ok. > > > > Also, need a default case. > > + } > + > + return val; > +} > + > +static void xe_pmu_event_read(struct perf_event *event) > +{ > + struct xe_device *xe =3D > + container_of(event->pmu, typeof(*xe), pmu.base); > + struct hw_perf_event *hwc =3D &event->hw; > + struct xe_pmu *pmu =3D &xe->pmu; > + u64 prev, new; > + > + if (pmu->closed) { > + event->hw.state =3D PERF_HES_STOPPED; > + return; > + } > +again: > + prev =3D local64_read(&hwc->prev_count); > + new =3D __xe_pmu_event_read(event); > + > + if (local64_cmpxchg(&hwc->prev_count, prev, new) !=3D prev) > + goto again; > + > + local64_add(new - prev, &event->count); > +} > + > +static void xe_pmu_enable(struct perf_event *event) > +{ > + /* > + * Store the current counter value so we can report the correct delta > + * for all listeners. Even when the event was already enabled and has > + * an existing non-zero value. > + */ > + local64_set(&event->hw.prev_count, __xe_pmu_event_read(event)); > > > Right now nothing is being enabled here (unlike i915) so the function name > xe_pmu_enable looks weird. Not sure, maybe leave as is for when things get > added in the future? > > +static int xe_pmu_cpu_online(unsigned int cpu, struct hlist_node *node) > +{ > + struct xe_pmu *pmu =3D hlist_entry_safe(node, typeof(*pmu), cpuhp.node); > + > + BUG_ON(!pmu->base.event_init); > + > + /* Select the first online CPU as a designated reader. */ > + if (cpumask_empty(&xe_pmu_cpumask)) > + cpumask_set_cpu(cpu, &xe_pmu_cpumask); > + > + return 0; > +} > + > +static int xe_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node) > +{ > + struct xe_pmu *pmu =3D hlist_entry_safe(node, typeof(*pmu), cpuhp.node); > + unsigned int target =3D xe_pmu_target_cpu; > + > + BUG_ON(!pmu->base.event_init); > > > nit but wondering if we should remove these two BUG_ON's (and save a coup= le > of checkpatch warnings even though the BUG_ON's are in i915) and just do > something like: > > if (!pmu->base.event_init) > return 0; > > The reason for the BUG_ON's seems to be that these functions can be called > after module_init but before probe. > > xe_pmu_cpu_online() doesn't depend on pmu at all so looks like the BUG_ON > can just be dropped? > > the xe_pmu_cpu_online/offline are not invoked when they are registered w= ith > cpuhp_setup_state_multi, but rather when cpuhp_state_add_instance() is ca= lled > which is done post the PMU is initialized hence the check for BUG_ON. cpuhp_setup_state_multi is called at module_init time. cpuhp_state_add_instance is called from xe_pmu_register, i.e. during device probe when pmu->base.event_init is already initialized. Therefore seems even less reason to have the BUG_ON's. Just a few minor issues left now so I am hoping we can wrap this marathon review up soon :) Thanks. -- Ashutosh