From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Lionel G Landwerlin <lionel.g.landwerlin@linux.intel.com>,
intel-xe@lists.freedesktop.org
Subject: Re: [Intel-xe] [PATCH 3/3] drm/xe: Correlate engine and cpu timestamps with better accuracy
Date: Fri, 11 Aug 2023 17:32:40 -0700 [thread overview]
Message-ID: <87ttt5yuaf.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <ZNbSmrA4flUumeLg@orsosgc001.jf.intel.com>
On Fri, 11 Aug 2023 17:30:18 -0700, Umesh Nerlige Ramappa wrote:
>
> On Fri, Aug 11, 2023 at 04:32:32PM -0700, Dixit, Ashutosh wrote:
> > On Fri, 04 Aug 2023 14:47:59 -0700, Umesh Nerlige Ramappa wrote:
> >>
> >
> > Hi Umesh,
> >
> >> On Fri, Aug 04, 2023 at 02:32:53PM -0700, Umesh Nerlige Ramappa wrote:
> >> > +/**
> >> > + * struct drm_xe_query_cs_cycles - correlate CPU and GPU timestamps
> >> > + *
> >> > + * If a query is made with a struct drm_xe_device_query where .query
> >> > + * is equal to DRM_XE_QUERY_CS_CYCLES, then the reply uses
> >> > + * struct drm_xe_query_cs_cycles in .data.
> >> > + *
> >> > + * struct drm_xe_query_cs_cycles is allocated by the user and .data points to
> >> > + * this allocated structure. The user must pass .eci and .clockid as inputs to
> >> > + * this query. eci determines the engine and tile info required to fetch the
> >> > + * relevant GPU timestamp. clockid is used to return the specific CPU
> >> > + * timestamp.
> >> > + *
> >> > + * The query returns the command streamer cycles and the frequency that can
> >> > + * be used to calculate the command streamer timestamp. In addition the
> >> > + * query returns a set of cpu timestamps that indicate when the command
> >> > + * streamer cycle count was captured.
> >> > + */
> >> > +struct drm_xe_query_cs_cycles {
> >> > + /** Engine for which command streamer cycles is queried. */
> >> > + struct drm_xe_engine_class_instance eci;
> >> > +
> >> > + /** MBZ (pad eci to 64 bit) */
> >> > + __u16 rsvd;
> >>
> >> I need some inputs on the rsvd field here. Looks like struct
> >> drm_xe_engine_class_instance may need padding (64 bit aligned) if used this
> >> way. Is this the right way to pad it? Should the padding be moved to
> >> struct drm_xe_engine_class_instance? OR should struct
> >> drm_xe_engine_class_instance be packed?
> >
> > Does it need to be 64 bit aligned? The next field can be an unaligned u64
> > field, it should be ok, x86 at least should be able to handle it.
>
> Then compiler would potentially insert a pad between eci and the next
> u64. The idea is to avoid an uninitialized padding.
The compiler may or may not add padding. So the next u64 field might or
might not be 8 byte aligned. Either way it should not make a difference to
functionality, unaligned accesses may be a little slower than aligned
access. But in this case probably doesn't matter much.
> >
> > But yes, nice to have all struct's 64 bit aligned. IMO the right way to do
> > it would be to pad 'struct drm_xe_engine_class_instance' to be 64 bit
> > aligned. Because if 'struct drm_xe_engine_class_instance' changes, the
> > above '__u16 rsvd' field might not work and is error prone and will also
> > need to be modified. So better to pad 'struct drm_xe_engine_class_instance'
> > to be 64 bit aligned.
>
> Agree, padding drm_xe_engine_class_instance makes sense.
Yeah better to explicitly pad drm_xe_engine_class_instance.
Thanks.
--
Ashutosh
>> >> > +
> >> > + /**
> >> > + * Command streamer cycles as read from the command streamer
> >> > + * register at 0x358 offset.
> >> > + */
> >> > + __u64 cs_cycles;
> >> > +
> >> > + /** Frequency of the cs cycles in Hz. */
> >> > + __u64 cs_frequency;
> >> > +
> >> > + /**
> >> > + * CPU timestamp in ns. The timestamp is captured before reading the
> >> > + * cs_cycles register using the reference clockid set by the user.
> >> > + */
> >> > + __u64 cpu_timestamp;
> >> > +
> >> > + /**
> >> > + * Time delta in ns captured around reading the lower dword of the
> >> > + * cs_cycles register.
> >> > + */
> >> > + __u64 cpu_delta;
> >> > +
> >> > + /**
> >> > + * Reference clock id for CPU timestamp. For definition, see
> >> > + * clock_gettime(2) and perf_event_open(2). Supported clock ids are
> >> > + * CLOCK_MONOTONIC, CLOCK_MONOTONIC_RAW, CLOCK_REALTIME, CLOCK_BOOTTIME,
> >> > + * CLOCK_TAI.
> >> > + */
> >> > + __s32 clockid;
> >> > +
> >> > + /** Width of the cs cycle counter in bits. */
> >> > + __u32 width;
> >> > +};
> >> > +
> >> > /**
> >> > * struct drm_xe_query_mem_usage - describe memory regions and usage
> >> > *
> >> > @@ -395,6 +471,7 @@ struct drm_xe_device_query {
> >> > #define DRM_XE_DEVICE_QUERY_GTS 3
> >> > #define DRM_XE_DEVICE_QUERY_HWCONFIG 4
> >> > #define DRM_XE_DEVICE_QUERY_GT_TOPOLOGY 5
> >> > +#define DRM_XE_QUERY_CS_CYCLES 6
> >> > /** @query: The type of data to query */
> >> > __u32 query;
> >> >
> >> > @@ -737,24 +814,6 @@ struct drm_xe_exec_queue_set_property {
> >> > __u64 reserved[2];
> >> > };
> >> >
> >> > -/** struct drm_xe_engine_class_instance - instance of an engine class */
> >> > -struct drm_xe_engine_class_instance {
> >> > -#define DRM_XE_ENGINE_CLASS_RENDER 0
> >> > -#define DRM_XE_ENGINE_CLASS_COPY 1
> >> > -#define DRM_XE_ENGINE_CLASS_VIDEO_DECODE 2
> >> > -#define DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE 3
> >> > -#define DRM_XE_ENGINE_CLASS_COMPUTE 4
> >> > - /*
> >> > - * Kernel only class (not actual hardware engine class). Used for
> >> > - * creating ordered queues of VM bind operations.
> >> > - */
> >> > -#define DRM_XE_ENGINE_CLASS_VM_BIND 5
> >> > - __u16 engine_class;
> >> > -
> >> > - __u16 engine_instance;
> >> > - __u16 gt_id;
> >> > -};
> >> > -
> >> > struct drm_xe_exec_queue_create {
> >> > #define XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY 0
> >> > /** @extensions: Pointer to the first extension struct, if any */
> >> > --
> >> > 2.38.1
> >> >
next prev parent reply other threads:[~2023-08-12 0:34 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-04 21:32 [Intel-xe] [PATCH 0/3] Port CPU-GPU timestamp correlation to XE KMD Umesh Nerlige Ramappa
2023-08-04 21:32 ` [Intel-xe] [PATCH 1/3] drm/xe: Fix array bounds check for queries Umesh Nerlige Ramappa
2023-08-05 6:46 ` Niranjana Vishwanathapura
2023-08-04 21:32 ` [Intel-xe] [PATCH 2/3] drm/xe: Set the correct type for xe_to_user_engine_class Umesh Nerlige Ramappa
2023-08-05 6:54 ` Niranjana Vishwanathapura
2023-08-04 21:32 ` [Intel-xe] [PATCH 3/3] drm/xe: Correlate engine and cpu timestamps with better accuracy Umesh Nerlige Ramappa
2023-08-04 21:47 ` Umesh Nerlige Ramappa
2023-08-11 23:32 ` Dixit, Ashutosh
2023-08-12 0:30 ` Umesh Nerlige Ramappa
2023-08-12 0:32 ` Dixit, Ashutosh [this message]
2023-08-09 19:55 ` Souza, Jose
2023-08-09 20:18 ` Umesh Nerlige Ramappa
2023-08-04 22:38 ` [Intel-xe] ✓ CI.Patch_applied: success for Port CPU-GPU timestamp correlation to XE KMD Patchwork
2023-08-04 22:39 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
2023-08-04 22:40 ` [Intel-xe] ✓ CI.KUnit: success " Patchwork
2023-08-04 22:44 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-08-04 22:44 ` [Intel-xe] ✗ CI.Hooks: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ttt5yuaf.wl-ashutosh.dixit@intel.com \
--to=ashutosh.dixit@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=lionel.g.landwerlin@linux.intel.com \
--cc=umesh.nerlige.ramappa@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.