From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7C21CC02183 for ; Sat, 18 Jan 2025 02:34:31 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3349710E332; Sat, 18 Jan 2025 02:34:31 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="esVmTjQX"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) by gabe.freedesktop.org (Postfix) with ESMTPS id F329110E332 for ; Sat, 18 Jan 2025 02:34:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1737167669; x=1768703669; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version; bh=zYsY9TC7IFn1VVaX36sHcrd9aVdiZTvyWS4PFdMbkLY=; b=esVmTjQXgItR4XOwduMlJQRIKusvuVY0WXEx0vgsCVU+wUg/EDZAPxq3 CP0e9QX1BR1XmfjAq1FxlOaYi0qJWNFBXySznTNwDs+UkISiGpUIO6bwT AucmQYnwoOApZtQQwTDY0PGJP5T06HV1xxPciH5Rrr/G5YAcQYN6JypYC dCSgoHOuIMD9pB12dvdrEO3upX05Z5ztiOmrZKgsA7KKtznwxlCdKM2p/ 9HF7Uz5aWiQVRLENvWJa4aqwxCP8nW6L/xjreDEuy0keVF+qzLPKteeul k/mhkACQ0wYE/+1pIpWf62v8bjEJBMCtoTGzZRsGfwOfDJJ+lWsRA+vRy Q==; X-CSE-ConnectionGUID: QR0hlokNRQOz+RLf9NQ3tw== X-CSE-MsgGUID: g693IILER5GXTLTKh5K6Iw== X-IronPort-AV: E=McAfee;i="6700,10204,11314"; a="37729564" X-IronPort-AV: E=Sophos;i="6.12,310,1728975600"; d="scan'208";a="37729564" Received: from orviesa009.jf.intel.com ([10.64.159.149]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Jan 2025 18:34:28 -0800 X-CSE-ConnectionGUID: cJCHFUE+S2CA9UeMXhxQ8A== X-CSE-MsgGUID: rT3gUvURQVW9hGlM7dorGA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.13,213,1732608000"; d="scan'208";a="105778545" Received: from orsosgc001.jf.intel.com (HELO orsosgc001.intel.com) ([10.165.21.142]) by orviesa009-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Jan 2025 18:34:29 -0800 Date: Fri, 17 Jan 2025 18:34:28 -0800 Message-ID: <85ikqcetor.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: Harish Chegondi Cc: , , , , , , , , Subject: Re: [PATCH v8 3/7] drm/xe/eustall: Implement EU stall sampling APIs for Xe_HPC In-Reply-To: <28e9b33c1fd5164d611d1ea3caa45388278efe43.1736970203.git.harish.chegondi@intel.com> References: <28e9b33c1fd5164d611d1ea3caa45388278efe43.1736970203.git.harish.chegondi@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/28.2 (x86_64-redhat-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" On Wed, 15 Jan 2025 12:02:09 -0800, Harish Chegondi wrote: > Hi Harish, I've just started reviewing this large patch. Just sending out initial comments for now in case you want to start looking at the coments and work on it meanwhile. I will pick up reviewing it again next week. > Implement EU stall sampling APIs introduced in the previous patch for > Xe_HPC (PVC). Add register definitions and the code that accesses these > registers to the APIs. > > Add initialization and clean up functions and their implementations, > EU stall enable and disable functions, poll() and read() implementations. > > A timer thread periodically polls the EU stall data buffer write pointer > registers to look for any new data and caches the write pointer. The read > function compares the cached read and write pointers and copies any new > data to the user space. > > v8: Updated copyright year in xe_eu_stall_regs.h to 2025. > Renamed struct drm_xe_eu_stall_data_pvc to struct xe_eu_stall_data_pvc > since it is a local structure. > v6: Fix buffer wrap around over write bug (Matt Olson) Once again, overall advice for the patch: think carefully if a static function needs documentation. And if it needs documentation, does it need each parameter described, even if that conveys 0 information. My recommendation is to only include documentation for functions called by other files (function exported via the .h). At least let's not follow a rule where every function has a header. This is towards kernel recommendation of minimizing documentation and having self-documenting code. > > Signed-off-by: Harish Chegondi > --- > drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h | 29 + > drivers/gpu/drm/xe/xe_eu_stall.c | 759 ++++++++++++++++++++- > drivers/gpu/drm/xe/xe_eu_stall.h | 43 ++ > drivers/gpu/drm/xe/xe_gt.c | 6 + > drivers/gpu/drm/xe/xe_gt_types.h | 3 + > drivers/gpu/drm/xe/xe_trace.h | 33 + > 6 files changed, 847 insertions(+), 26 deletions(-) This patch is also a little bit too big. I think at least it can be split into two: the read and the initialization parts can be separated out. Anyway for now I will plod on with this review. > create mode 100644 drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h > > diff --git a/drivers/gpu/drm/xe/xe_eu_stall.c b/drivers/gpu/drm/xe/xe_eu_stall.c > index 48dcc7cb7791..c388d733b857 100644 > --- a/drivers/gpu/drm/xe/xe_eu_stall.c > +++ b/drivers/gpu/drm/xe/xe_eu_stall.c > @@ -8,15 +8,27 @@ > #include > #include > > +#include > #include > > +#include "xe_bo.h" > +#include "xe_pm.h" > +#include "xe_trace.h" > #include "xe_macros.h" > #include "xe_device.h" > +#include "xe_gt_mcr.h" > #include "xe_eu_stall.h" > #include "xe_gt_printk.h" > +#include "xe_force_wake.h" > #include "xe_gt_topology.h" > #include "xe_observation.h" #includes should be in alphabetical order. See other files. > > +#include "regs/xe_gt_regs.h" > +#include "regs/xe_eu_stall_regs.h" > + > +#define POLL_FREQUENCY_HZ 100 > +#define POLL_PERIOD_NS (NSEC_PER_SEC / POLL_FREQUENCY_HZ) > + > /** > * struct eu_stall_open_properties - EU stall sampling properties received > * from user space at open. > @@ -31,6 +43,50 @@ struct eu_stall_open_properties { > struct xe_gt *gt; > }; > > +/** > + * struct xe_eu_stall_data_pvc - EU stall data format for PVC > + * > + * Bits Field > + * 0 to 28 IP (addr) > + * 29 to 36 active count > + * 37 to 44 other count > + * 45 to 52 control count > + * 53 to 60 pipestall count > + * 61 to 68 send count > + * 69 to 76 dist_acc count > + * 77 to 84 sbid count > + * 85 to 92 sync count > + * 93 to 100 inst_fetch count > + */ > +struct xe_eu_stall_data_pvc { > + __u64 ip_addr:29; > + __u64 active_count:8; > + __u64 other_count:8; > + __u64 control_count:8; > + __u64 pipestall_count:8; > + __u64 send_count:8; > + __u64 dist_acc_count:8; > + __u64 sbid_count:8; > + __u64 sync_count:8; > + __u64 inst_fetch_count:8; > + __u64 unused_bits:27; > + __u64 unused[6]; > +} __packed; What is the purpose of including these data structs in the code? Afaict they are not used anywhere, except for sizeof()? You might as well as just store the sizeof in a table? > + > +static u64 per_xecore_buf_size = SZ_512K; > + > +static unsigned long > +xe_eu_stall_data_record_size(struct xe_device *xe) > +{ > + enum xe_platform platform = xe->info.platform; Don't use unnecessary variables, just use xe->info.platform. > + unsigned long record_size = 0; Don't initialize I think, add an else with xe_gt_WARN_ON, if that works. > + > + if (platform == XE_PVC) Use GRAPHICS_VERx100 or GRAPHICS_VER, so that later you can just say 'if (GRAPHICS_VER(xe) >= 20)' to avoid changing code for each platform. PVC is GRAPHICS_VERx100(xe) 1260, though just for PVC you can use xe->info.platform. > @@ -265,19 +982,9 @@ int xe_eu_stall_stream_open(struct drm_device *dev, > return -EINVAL; > } > > - if (xe_observation_paranoid && !perfmon_capable()) { > - xe_gt_dbg(props.gt, "Insufficient privileges for EU stall monitoring\n"); > - return -EACCES; > - } > + mutex_lock(&props.gt->eu_stall->lock); > + ret = xe_eu_stall_stream_open_locked(dev, &props, file); > + mutex_unlock(&props.gt->eu_stall->lock); > > - if (!has_eu_stall_sampling_support(xe)) { > - xe_gt_dbg(props.gt, "EU stall monitoring is not supported on this platform\n"); > - return -EPERM; > - } > - stream_fd = anon_inode_getfd("[xe_eu_stall]", &fops_eu_stall, > - NULL, 0); > - if (stream_fd < 0) > - xe_gt_dbg(props.gt, "EU stall inode get fd failed : %d\n", stream_fd); > - > - return stream_fd; As I mentioned in the other patch, we need to see why is so much code moving around? Generally we should not have to delete lines of code in these patches. Ashutosh