From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F41B5C0015E for ; Sat, 12 Aug 2023 00:34:37 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A4AB110E19B; Sat, 12 Aug 2023 00:34:37 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTPS id B99E610E381 for ; Sat, 12 Aug 2023 00:34:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1691800474; x=1723336474; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version; bh=42C95P/NYJNV3wPzZz5UTMT/7TEfeQbIPTILkArme8I=; b=IRW8UUvsuJpEBaX10vvqqNNYgsIMqdLgqIT9SoEv3A+N6466SycwSAZq LuHDvOWHwkcQ+mLbRJeMoD904T/sj3hFJ0WT9pjEWNrrU7I2PtQsfbfGe xLEfPJU7f0OIArh6s4ny0s51+22bcJ3l/OYYi+jFaOMvst8gH4oS0Jx9A UBjbYJY2vUSSCnAEGjsg0j18XuTz5/Ij8RhNZN3cVk/y0f7lH6o8b5q4K s7uqH+76Vs0vA2TKlgSCmLtedcglKgfYctBUHjhNtfS8W/7V5zuxNkE4Y xYndFf0qrPToIR0UkLzHlj76goLOOmDPlzQup5QlL33zxKrzx+gPzBlaH g==; X-IronPort-AV: E=McAfee;i="6600,9927,10799"; a="369259904" X-IronPort-AV: E=Sophos;i="6.01,166,1684825200"; d="scan'208";a="369259904" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2023 17:34:33 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10799"; a="798210047" X-IronPort-AV: E=Sophos;i="6.01,166,1684825200"; d="scan'208";a="798210047" Received: from adixit-mobl.amr.corp.intel.com (HELO adixit-arch.intel.com) ([10.209.112.39]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2023 17:34:33 -0700 Date: Fri, 11 Aug 2023 17:32:40 -0700 Message-ID: <87ttt5yuaf.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: Umesh Nerlige Ramappa In-Reply-To: References: <20230804213253.2381705-1-umesh.nerlige.ramappa@intel.com> <20230804213253.2381705-4-umesh.nerlige.ramappa@intel.com> <87v8dlyx2n.wl-ashutosh.dixit@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/29.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII Subject: Re: [Intel-xe] [PATCH 3/3] drm/xe: Correlate engine and cpu timestamps with better accuracy X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lionel G Landwerlin , intel-xe@lists.freedesktop.org Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" On Fri, 11 Aug 2023 17:30:18 -0700, Umesh Nerlige Ramappa wrote: > > On Fri, Aug 11, 2023 at 04:32:32PM -0700, Dixit, Ashutosh wrote: > > On Fri, 04 Aug 2023 14:47:59 -0700, Umesh Nerlige Ramappa wrote: > >> > > > > Hi Umesh, > > > >> On Fri, Aug 04, 2023 at 02:32:53PM -0700, Umesh Nerlige Ramappa wrote: > >> > +/** > >> > + * struct drm_xe_query_cs_cycles - correlate CPU and GPU timestamps > >> > + * > >> > + * If a query is made with a struct drm_xe_device_query where .query > >> > + * is equal to DRM_XE_QUERY_CS_CYCLES, then the reply uses > >> > + * struct drm_xe_query_cs_cycles in .data. > >> > + * > >> > + * struct drm_xe_query_cs_cycles is allocated by the user and .data points to > >> > + * this allocated structure. The user must pass .eci and .clockid as inputs to > >> > + * this query. eci determines the engine and tile info required to fetch the > >> > + * relevant GPU timestamp. clockid is used to return the specific CPU > >> > + * timestamp. > >> > + * > >> > + * The query returns the command streamer cycles and the frequency that can > >> > + * be used to calculate the command streamer timestamp. In addition the > >> > + * query returns a set of cpu timestamps that indicate when the command > >> > + * streamer cycle count was captured. > >> > + */ > >> > +struct drm_xe_query_cs_cycles { > >> > + /** Engine for which command streamer cycles is queried. */ > >> > + struct drm_xe_engine_class_instance eci; > >> > + > >> > + /** MBZ (pad eci to 64 bit) */ > >> > + __u16 rsvd; > >> > >> I need some inputs on the rsvd field here. Looks like struct > >> drm_xe_engine_class_instance may need padding (64 bit aligned) if used this > >> way. Is this the right way to pad it? Should the padding be moved to > >> struct drm_xe_engine_class_instance? OR should struct > >> drm_xe_engine_class_instance be packed? > > > > Does it need to be 64 bit aligned? The next field can be an unaligned u64 > > field, it should be ok, x86 at least should be able to handle it. > > Then compiler would potentially insert a pad between eci and the next > u64. The idea is to avoid an uninitialized padding. The compiler may or may not add padding. So the next u64 field might or might not be 8 byte aligned. Either way it should not make a difference to functionality, unaligned accesses may be a little slower than aligned access. But in this case probably doesn't matter much. > > > > But yes, nice to have all struct's 64 bit aligned. IMO the right way to do > > it would be to pad 'struct drm_xe_engine_class_instance' to be 64 bit > > aligned. Because if 'struct drm_xe_engine_class_instance' changes, the > > above '__u16 rsvd' field might not work and is error prone and will also > > need to be modified. So better to pad 'struct drm_xe_engine_class_instance' > > to be 64 bit aligned. > > Agree, padding drm_xe_engine_class_instance makes sense. Yeah better to explicitly pad drm_xe_engine_class_instance. Thanks. -- Ashutosh >> >> > + > >> > + /** > >> > + * Command streamer cycles as read from the command streamer > >> > + * register at 0x358 offset. > >> > + */ > >> > + __u64 cs_cycles; > >> > + > >> > + /** Frequency of the cs cycles in Hz. */ > >> > + __u64 cs_frequency; > >> > + > >> > + /** > >> > + * CPU timestamp in ns. The timestamp is captured before reading the > >> > + * cs_cycles register using the reference clockid set by the user. > >> > + */ > >> > + __u64 cpu_timestamp; > >> > + > >> > + /** > >> > + * Time delta in ns captured around reading the lower dword of the > >> > + * cs_cycles register. > >> > + */ > >> > + __u64 cpu_delta; > >> > + > >> > + /** > >> > + * Reference clock id for CPU timestamp. For definition, see > >> > + * clock_gettime(2) and perf_event_open(2). Supported clock ids are > >> > + * CLOCK_MONOTONIC, CLOCK_MONOTONIC_RAW, CLOCK_REALTIME, CLOCK_BOOTTIME, > >> > + * CLOCK_TAI. > >> > + */ > >> > + __s32 clockid; > >> > + > >> > + /** Width of the cs cycle counter in bits. */ > >> > + __u32 width; > >> > +}; > >> > + > >> > /** > >> > * struct drm_xe_query_mem_usage - describe memory regions and usage > >> > * > >> > @@ -395,6 +471,7 @@ struct drm_xe_device_query { > >> > #define DRM_XE_DEVICE_QUERY_GTS 3 > >> > #define DRM_XE_DEVICE_QUERY_HWCONFIG 4 > >> > #define DRM_XE_DEVICE_QUERY_GT_TOPOLOGY 5 > >> > +#define DRM_XE_QUERY_CS_CYCLES 6 > >> > /** @query: The type of data to query */ > >> > __u32 query; > >> > > >> > @@ -737,24 +814,6 @@ struct drm_xe_exec_queue_set_property { > >> > __u64 reserved[2]; > >> > }; > >> > > >> > -/** struct drm_xe_engine_class_instance - instance of an engine class */ > >> > -struct drm_xe_engine_class_instance { > >> > -#define DRM_XE_ENGINE_CLASS_RENDER 0 > >> > -#define DRM_XE_ENGINE_CLASS_COPY 1 > >> > -#define DRM_XE_ENGINE_CLASS_VIDEO_DECODE 2 > >> > -#define DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE 3 > >> > -#define DRM_XE_ENGINE_CLASS_COMPUTE 4 > >> > - /* > >> > - * Kernel only class (not actual hardware engine class). Used for > >> > - * creating ordered queues of VM bind operations. > >> > - */ > >> > -#define DRM_XE_ENGINE_CLASS_VM_BIND 5 > >> > - __u16 engine_class; > >> > - > >> > - __u16 engine_instance; > >> > - __u16 gt_id; > >> > -}; > >> > - > >> > struct drm_xe_exec_queue_create { > >> > #define XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY 0 > >> > /** @extensions: Pointer to the first extension struct, if any */ > >> > -- > >> > 2.38.1 > >> >