From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 218A2EB64DD for ; Wed, 9 Aug 2023 07:28:18 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D09DE10E252; Wed, 9 Aug 2023 07:28:17 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8177010E255 for ; Wed, 9 Aug 2023 07:28:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1691566095; x=1723102095; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version:content-transfer-encoding; bh=EaChrCqMjoolV2gOpoWo+L/hoPzl0+bkgRQJKyRzvlg=; b=PxCfQdYUxXW5QvhBAnetPuTOKQYjvhOfBxoPtz1U0qwtH/5XeSyklGv/ Pua8TZF40AB8go8SUfhmdfjK0ZfRYAvktOLihgzidtQd2N1KLRifSg5nq d8MR/g7Qsrr5Kb0IHSboOjbqOCkVNfKHDEQzbIyeVvo44q3TP5ziwPcWA /TFOSAzN3t2hbKBT7ls/ePmId9uJGNDNKyreFzsGZ3Sxh5+yHzF6oDxw6 ra/PPe3vJE/6iCpzadb7Azk73RsZrUJJgi1/G+Mo37p34GKdzLCy1lLw+ MM8jDw00tQS7olkbhp99W6OfEIMxYIgTQIuPvAvx9nhagvGyM9tYwLfdy Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="373829968" X-IronPort-AV: E=Sophos;i="6.01,158,1684825200"; d="scan'208";a="373829968" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Aug 2023 00:28:14 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="855414997" X-IronPort-AV: E=Sophos;i="6.01,158,1684825200"; d="scan'208";a="855414997" Received: from adixit-mobl.amr.corp.intel.com (HELO adixit-arch.intel.com) ([10.209.147.231]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Aug 2023 00:28:14 -0700 Date: Wed, 09 Aug 2023 00:28:12 -0700 Message-ID: <877cq4y8s3.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: Aravind Iddamsetty In-Reply-To: <20230808115436.400611-3-aravind.iddamsetty@intel.com> References: <20230808115436.400611-1-aravind.iddamsetty@intel.com> <20230808115436.400611-3-aravind.iddamsetty@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/29.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Subject: Re: [Intel-xe] [PATCH v3 2/2] drm/xe/pmu: Enable PMU interface X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Bommu Krishnaiah , intel-xe@lists.freedesktop.org, Tvrtko Ursulin Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" On Tue, 08 Aug 2023 04:54:36 -0700, Aravind Iddamsetty wrote: > Hi Aravind, Spotted a few remaining things. See if it's possible to fix these up and send another version. > diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c > new file mode 100644 > index 000000000000..9637f8283641 > --- /dev/null > +++ b/drivers/gpu/drm/xe/xe_pmu.c > @@ -0,0 +1,673 @@ > +// SPDX-License-Identifier: MIT > +/* > + * Copyright =A9 2023 Intel Corporation > + */ > + > +#include > +#include > +#include > + > +#include "regs/xe_gt_regs.h" > +#include "xe_device.h" > +#include "xe_gt_clock.h" > +#include "xe_mmio.h" > + > +static cpumask_t xe_pmu_cpumask; > +static unsigned int xe_pmu_target_cpu =3D -1; > + > +static unsigned int config_gt_id(const u64 config) > +{ > + return config >> __XE_PMU_GT_SHIFT; > +} > + > +static u64 config_counter(const u64 config) > +{ > + return config & ~(~0ULL << __XE_PMU_GT_SHIFT); > +} > + > +static int engine_busyness_sample_type(u64 config) > +{ > + int type =3D 0; > + > + switch (config) { > + case XE_PMU_RENDER_GROUP_BUSY(0): > + type =3D __XE_SAMPLE_RENDER_GROUP_BUSY; One extra space here... > +static u64 __engine_group_busyness_read(struct xe_gt *gt, int sample_typ= e) > +{ > + u64 val =3D 0; > + What is the forcewake domain for these registers? Don't we need to get forcewake before reading these. Something like: XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL)); > + switch (sample_type) { > + case __XE_SAMPLE_RENDER_GROUP_BUSY: > + val =3D xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE); > + break; > + case __XE_SAMPLE_COPY_GROUP_BUSY: > + val =3D xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE); > + break; > + case __XE_SAMPLE_MEDIA_GROUP_BUSY: > + val =3D xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE); > + break; > + case __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY: > + val =3D xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE); > + break; > + default: > + drm_warn(>->tile->xe->drm, "unknown pmu event\n"); > + } And similarly here: XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL)); > +static u64 engine_group_busyness_read(struct xe_gt *gt, u64 config) > +{ > + int sample_type =3D engine_busyness_sample_type(config); > + const unsigned int gt_id =3D gt->info.id; > + struct xe_device *xe =3D gt->tile->xe; > + struct xe_pmu *pmu =3D &xe->pmu; > + unsigned long flags; > + bool device_awake; > + u64 val; > + > + device_awake =3D xe_device_mem_access_get_if_ongoing(xe); > + if (device_awake) > + val =3D __engine_group_busyness_read(gt, sample_type); > + > + spin_lock_irqsave(&pmu->lock, flags); > + > + if (device_awake) { > + pmu->sample[gt_id][sample_type] =3D val; > + xe_device_mem_access_put(xe); I think we can add the xe_device_mem_access_put right after __engine_group_busyness_read: if (device_awake) { val =3D __engine_group_busyness_read(gt, sample_type); xe_device_mem_access_put(xe); } > + } else { > + val =3D pmu->sample[gt_id][sample_type]; > + } > + > + spin_unlock_irqrestore(&pmu->lock, flags); > + > + return val; > +} > + > +static void engine_group_busyness_store(struct xe_gt *gt) > +{ > + struct xe_pmu *pmu =3D >->tile->xe->pmu; > + unsigned int gt_id =3D gt->info.id; > + unsigned long flags; > + u64 val; > + int i; > + > + spin_lock_irqsave(&pmu->lock, flags); > + > + for (i =3D __XE_SAMPLE_RENDER_GROUP_BUSY; i <=3D __XE_SAMPLE_ANY_ENGINE= _GROUP_BUSY; i++) { Change the termination condition to 'i < __XE_NUM_PMU_SAMPLERS'. > + val =3D __engine_group_busyness_read(gt, i); > + pmu->sample[gt_id][i] =3D val; Let's eliminate val and just do: pmu->sample[gt_id][i] =3D __engine_group_busyness_read(gt, i); > +static int > +config_status(struct xe_device *xe, u64 config) > +{ > + unsigned int max_gt_id =3D xe->info.gt_count > 1 ? 1 : 0; > + unsigned int gt_id =3D config_gt_id(config); > + struct xe_gt *gt =3D xe_device_get_gt(xe, gt_id); > + > + if (gt_id > max_gt_id) > + return -ENOENT; > + > + switch (config_counter(config)) { > + case XE_PMU_INTERRUPTS(0): > + if (gt_id) > + return -ENOENT; > + break; > + case XE_PMU_RENDER_GROUP_BUSY(0): > + case XE_PMU_COPY_GROUP_BUSY(0): > + case XE_PMU_ANY_ENGINE_GROUP_BUSY(0): > + break; > + case XE_PMU_MEDIA_GROUP_BUSY(0): > + if (!(gt->info.engine_mask & (BIT(XE_HW_ENGINE_VCS0) | BIT(XE_HW_ENGIN= E_VECS0)))) Why change this? "if (gt->info.type !=3D XE_GT_TYPE_MEDIA)" doesn't work? > +void xe_pmu_register(struct xe_pmu *pmu) > +{ > + struct xe_device *xe =3D container_of(pmu, typeof(*xe), pmu); > + const struct attribute_group *attr_groups[] =3D { > + &pmu->events_attr_group, > + &xe_pmu_cpumask_attr_group, > + NULL > + }; > + > + int ret =3D -ENOMEM; > + > + spin_lock_init(&pmu->lock); > + pmu->cpuhp.cpu =3D -1; > + > + pmu->name =3D kasprintf(GFP_KERNEL, > + "xe_%s", > + dev_name(xe->drm.dev)); > + if (pmu->name) > + /* tools/perf reserves colons as special. */ > + strreplace((char *)pmu->name, ':', '_'); > + > + if (!pmu->name) > + goto err; > + > + pmu->events_attr_group.name =3D "events"; > + pmu->events_attr_group.attrs =3D create_event_attributes(pmu); > + if (!pmu->events_attr_group.attrs) > + goto err_name; > + > + pmu->base.attr_groups =3D kmemdup(attr_groups, sizeof(attr_groups), > + GFP_KERNEL); > + if (!pmu->base.attr_groups) > + goto err_attr; > + > + pmu->base.module =3D THIS_MODULE; > + pmu->base.task_ctx_nr =3D perf_invalid_context; > + pmu->base.event_init =3D xe_pmu_event_init; > + pmu->base.add =3D xe_pmu_event_add; > + pmu->base.del =3D xe_pmu_event_del; > + pmu->base.start =3D xe_pmu_event_start; > + pmu->base.stop =3D xe_pmu_event_stop; > + pmu->base.read =3D xe_pmu_event_read; > + pmu->base.event_idx =3D xe_pmu_event_event_idx; > + > + ret =3D perf_pmu_register(&pmu->base, pmu->name, -1); > + if (ret) > + goto err_groups; > + > + ret =3D xe_pmu_register_cpuhp_state(pmu); > + if (ret) > + goto err_unreg; > + > + ret =3D drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu); > + if (!ret) > + return; The error unwinding is not correct here, should be something like: ret =3D drmm_add_action_or_reset(&xe->drm, xe_pmu_unregister, pmu); if (ret) goto err_cpuhp; return; err_cpuhp: xe_pmu_unregister_cpuhp_state(pmu); > +err_unreg: > + perf_pmu_unregister(&pmu->base); > +err_groups: > + kfree(pmu->base.attr_groups); > +err_attr: > + pmu->base.event_init =3D NULL; > + free_event_attributes(pmu); > +err_name: > + kfree(pmu->name); > +err: > + drm_notice(&xe->drm, "Failed to register PMU!\n"); > +} > diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pm= u_types.h > new file mode 100644 > index 000000000000..a950c892e364 > --- /dev/null > +++ b/drivers/gpu/drm/xe/xe_pmu_types.h > @@ -0,0 +1,76 @@ > +/* SPDX-License-Identifier: MIT */ > +/* > + * Copyright =A9 2023 Intel Corporation > + */ > + > +#ifndef _XE_PMU_TYPES_H_ > +#define _XE_PMU_TYPES_H_ > + > +#include > +#include > +#include > + > +enum { > + __XE_SAMPLE_RENDER_GROUP_BUSY, > + __XE_SAMPLE_COPY_GROUP_BUSY, > + __XE_SAMPLE_MEDIA_GROUP_BUSY, > + __XE_SAMPLE_ANY_ENGINE_GROUP_BUSY, > + __XE_NUM_PMU_SAMPLERS > +}; > + > +#define XE_MAX_GT_PER_TILE 2 > + > +struct xe_pmu { > + /** > + * @cpuhp: Struct used for CPU hotplug handling. > + */ > + struct { > + struct hlist_node node; > + unsigned int cpu; > + } cpuhp; > + /** > + * @base: PMU base. > + */ > + struct pmu base; > + /** > + * @closed: xe is unregistering. > + */ > + bool closed; > + /** > + * @name: Name as registered with perf core. > + */ > + const char *name; > + /** > + * @lock: Lock protecting enable mask and ref count handling. > + */ > + spinlock_t lock; > + /** > + * @sample: Current and previous (raw) counters. > + * > + * These counters are updated when the device is awake. > + * > + */ > + u64 sample[XE_MAX_GT_PER_TILE][__XE_NUM_PMU_SAMPLERS]; s/XE_MAX_GT_PER_TILE/XE_MAX_GT/ since the PMU is for the entire device not per tile, as I mentioned earlier. Thanks. -- Ashutosh