From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 79D7EECE588 for ; Mon, 9 Sep 2024 19:27:33 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 17B4E10E04C; Mon, 9 Sep 2024 19:27:33 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="bpxNmvxr"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.21]) by gabe.freedesktop.org (Postfix) with ESMTPS id A253410E04C for ; Mon, 9 Sep 2024 19:27:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1725910051; x=1757446051; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version; bh=alb5l53gLuxQJJEkhruAP7Fk9Ze4wRnmT7gBnNqMNDM=; b=bpxNmvxr1OZFpQOAAvYf0DTTx8w0zXwSuDi03lJIaNo8AmY4mAMsqQo+ 1jUwj8RWsRreG1oJSiJYwwF8BwlMotmlIXpQtVN8rDWC23ZCzyhjF6xhb /Rgb7fH7it5G05zw0J16IXHAPNtON6aXT5AXB4C1PcXl65okdOMs0qVgP sgia6jSdH2rJDq2hR33zw5tvXOJzCyed0mylLftXwKY2ftvll9EzHhJgJ qoKglPkxWXmy3h0c7QI3EYRPC0pD4fjGa+c7oXGDrn59my0otOUcCCmUU CATwWdXPEhzJRqM0AHseULOQBrfr1H+BFIyVhXfSknCqLsJwD6GsKA1sa Q==; X-CSE-ConnectionGUID: 4eDCNC/fTXK+snxpa6vicQ== X-CSE-MsgGUID: uab/VB8yQh6CwJnX63oJZw== X-IronPort-AV: E=McAfee;i="6700,10204,11190"; a="24570410" X-IronPort-AV: E=Sophos;i="6.10,215,1719903600"; d="scan'208";a="24570410" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa113.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Sep 2024 12:27:07 -0700 X-CSE-ConnectionGUID: nqf/mPXxSm2nyVlDsRuBng== X-CSE-MsgGUID: mHdsMUasTaGTCdGoVtTO7w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,215,1719903600"; d="scan'208";a="71177958" Received: from zzhan38-mobl1.amr.corp.intel.com (HELO adixit-arch.intel.com) ([10.125.32.6]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Sep 2024 12:27:07 -0700 Date: Mon, 09 Sep 2024 12:27:05 -0700 Message-ID: <874j6o1vcm.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: Harish Chegondi Cc: , , , , , , shubham.kumar@intel.com Subject: Re: [PATCH v3 1/1] drm/xe/eustall: Add support for EU stall sampling In-Reply-To: <5f7adcc0dffec68bf88b282a5d2f92a9f05834ca.1725866097.git.harish.chegondi@intel.com> References: <5f7adcc0dffec68bf88b282a5d2f92a9f05834ca.1725866097.git.harish.chegondi@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/29.4 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" On Mon, 09 Sep 2024 00:36:40 -0700, Harish Chegondi wrote: > Hi Harish, A few new comments on just the uapi, which I missed out previously. > diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h > index b6fbe4988f2e..c836227d064f 100644 > --- a/include/uapi/drm/xe_drm.h > +++ b/include/uapi/drm/xe_drm.h > @@ -1395,6 +1395,8 @@ struct drm_xe_wait_user_fence { > enum drm_xe_observation_type { > /** @DRM_XE_OBSERVATION_TYPE_OA: OA observation stream type */ > DRM_XE_OBSERVATION_TYPE_OA, > + /** @DRM_XE_OBSERVATION_TYPE_EU_STALL: EU stall sampling observation stream type */ > + DRM_XE_OBSERVATION_TYPE_EU_STALL, > }; > > /** > @@ -1694,6 +1696,138 @@ struct drm_xe_oa_stream_info { > __u64 reserved[3]; > }; > > +/** > + * enum drm_xe_eu_stall_property_id - EU stall data stream property ids. > + * > + * These properties are passed to the driver as a chain of > + * @drm_xe_ext_set_property structures with @property set to these > + * properties' enums and @value set to the corresponding values of these > + * properties. @drm_xe_user_extension base.name should be set to > + * @DRM_XE_EU_STALL_EXTENSION_SET_PROPERTY. > + */ > +enum drm_xe_eu_stall_property_id { > +#define DRM_XE_EU_STALL_EXTENSION_SET_PROPERTY 0 > + /** > + * @DRM_XE_EU_STALL_PROP_BUF_SZ: Per DSS Memory Buffer Size. > + * Valid values are 128 KB, 256 KB, and 512 KB. > + */ > + DRM_XE_EU_STALL_PROP_BUF_SZ = 1, > + > + /** > + * @DRM_XE_EU_STALL_PROP_SAMPLE_RATE: Sampling rate > + * in multiples of 251 cycles. Valid values are 1 to 7. > + * If the value is 1, sampling interval is 251 cycles. > + * If the value is 7, sampling interval is 7 x 251 cycles. > + */ > + DRM_XE_EU_STALL_PROP_SAMPLE_RATE, > + > + /** > + * @DRM_XE_EU_STALL_PROP_POLL_PERIOD: EU stall data > + * poll period in nanoseconds. should be at least 100000 ns. > + */ > + DRM_XE_EU_STALL_PROP_POLL_PERIOD, > + > + /** > + * @DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT: Minimum number of > + * EU stall data rows to be present in the kernel buffer for > + * poll() to set POLLIN (data present). > + */ > + DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT, > + > + /** > + * @DRM_XE_EU_STALL_PROP_GT_ID: GT ID of the GT on which > + * EU stall data will be captured. > + */ > + DRM_XE_EU_STALL_PROP_GT_ID, > + > + /** > + * @DRM_XE_EU_STALL_PROP_OPEN_DISABLED: A value of 1 will open > + * the EU stall data stream without enabling EU stall sampling. > + */ > + DRM_XE_EU_STALL_PROP_OPEN_DISABLED, > +}; If we add new properties in the future, UMD's would need to be able to query if a particular kernel supports these new properties. For this, we need a query even before the EU Stall stream is opened. Here we should follow the same scheme as in OA and introduce an EU stall 'capabilities'. For how this is done in OA, see 'struct drm_xe_oa_unit' in xe_drm.h which is returned from DRM_XE_DEVICE_QUERY_OA_UNITS. For example, after the https://patchwork.freedesktop.org/series/137058/ series, OA capabilities will look like this: /** @capabilities: OA capabilities bit-mask */ __u64 capabilities; #define DRM_XE_OA_CAPS_BASE (1 << 0) #define DRM_XE_OA_CAPS_SYNCS (1 << 1) Let's call this DRM_XE_DEVICE_QUERY_EU_STALL for EU stall. > + > +/** > + * struct drm_xe_eu_stall_stream_info - EU stall stream info returned from > + * @DRM_XE_OBSERVATION_IOCTL_INFO observation stream fd ioctl > + */ > +struct drm_xe_eu_stall_stream_info { > + /** @extensions: Pointer to the first extension struct, if any */ > + __u64 extensions; > + > + /** @record_size: size of each EU stall data record */ > + __u64 record_size; If this is only platform specific, not stream specific, we should add this also to the DRM_XE_DEVICE_QUERY_EU_STALL. If we do this, we won't have a 'struct drm_xe_eu_stall_stream_info', unless we need to return some other stream specific information in the future. > + > + /** @reserved: reserved for future use */ > + __u64 reserved[3]; > +}; Also, in response to the EIO return code, if we ever want to return a status different from "RECORDS DROPPED" (say a status that GPU was reset during data collection), we need to support DRM_XE_OBSERVATION_IOCTL_STATUS for EU stall, similar to how it is done for OA. IMO, DRM_XE_DEVICE_QUERY_EU_STALL and DRM_XE_OBSERVATION_IOCTL_STATUS need to be introduced now, as part of this patch. Otherwise UMD's will see error returns for older kernels, if these are introduced later. > + > +/** > + * struct drm_xe_eu_stall_data_pvc - EU stall data format for PVC > + * > + * Bits Field > + * 0 to 28 IP (addr) > + * 29 to 36 active count > + * 37 to 44 other count > + * 45 to 52 control count > + * 53 to 60 pipestall count > + * 61 to 68 send count > + * 69 to 76 dist_acc count > + * 77 to 84 sbid count > + * 85 to 92 sync count > + * 93 to 100 inst_fetch count > + */ > +struct drm_xe_eu_stall_data_pvc { > + __u64 ip_addr:29; > + __u64 active_count:8; > + __u64 other_count:8; > + __u64 control_count:8; > + __u64 pipestall_count:8; > + __u64 send_count:8; > + __u64 dist_acc_count:8; > + __u64 sbid_count:8; > + __u64 sync_count:8; > + __u64 inst_fetch_count:8; > + __u64 unused_bits:27; > + __u64 unused[6]; > +} __attribute__((packed)); > + > +/** > + * struct drm_xe_eu_stall_data_xe2 - EU stall data format for LNL, BMG > + * > + * Bits Field > + * 0 to 28 IP (addr) > + * 29 to 36 Tdr count > + * 37 to 44 other count > + * 45 to 52 control count > + * 53 to 60 pipestall count > + * 61 to 68 send count > + * 69 to 76 dist_acc count > + * 77 to 84 sbid count > + * 85 to 92 sync count > + * 93 to 100 inst_fetch count > + * 101 to 108 Active count > + * 109 to 111 Exid > + * 112 EndFlag (is always 1) > + */ > +struct drm_xe_eu_stall_data_xe2 { > + __u64 ip_addr:29; > + __u64 tdr_count:8; > + __u64 other_count:8; > + __u64 control_count:8; > + __u64 pipestall_count:8; > + __u64 send_count:8; > + __u64 dist_acc_count:8; > + __u64 sbid_count:8; > + __u64 sync_count:8; > + __u64 inst_fetch_count:8; > + __u64 active_count:8; > + __u64 ex_id:3; > + __u64 end_flag:1; > + __u64 unused_bits:15; > + __u64 unused[6]; > +} __attribute__((packed)); IMO, these data struct's are uapi between HW and UMD, not between KMD and UMD, so these structs should *not* be in xe_drm.h. KMD just forwards opaque records to UMD, UMD uses "outside" knowledge to make sense of these records. We have not included such record formats for OA either. Record formats are documented in Bspec and UMD's know how to interpret the records. Also note that Linux kernel does not use bit-fields (to avoid issues related to endianness). Thanks. -- Ashutosh