Re: [Intel-xe] [PATCH DONTMERGE] drm/xe: uapi review submission

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: intel-xe@lists.freedesktop.org,
	Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Subject: Re: [Intel-xe] [PATCH DONTMERGE] drm/xe: uapi review submission
Date: Wed, 05 Jul 2023 15:33:20 -0700	[thread overview]
Message-ID: <87v8eyc7kf.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <20230705211923.GP6953@mdroper-desk1.amr.corp.intel.com>

On Wed, 05 Jul 2023 14:19:23 -0700, Matt Roper wrote:
>
> On Fri, Jun 30, 2023 at 04:40:54PM -0700, Dixit, Ashutosh wrote:
> > On Fri, 30 Jun 2023 03:00:59 -0700, Thomas Hellström wrote:
> > >
> >
> > I have a question about the toplogy query below. I am not hugely familiar
> > with why/how this particular struct was chosen nor the history here, but
> > anyway.
> >
> > > +/**
> > > + * struct drm_xe_query_topology_mask - describe the topology mask of a GT
> > > + *
> > > + * This is the hardware topology which reflects the internal physical
> > > + * structure of the GPU.
> > > + *
> > > + * If a query is made with a struct drm_xe_device_query where .query
> > > + * is equal to DRM_XE_DEVICE_QUERY_GT_TOPOLOGY, then the reply uses
> > > + * struct drm_xe_query_topology_mask in .data.
> > > + */
> > > +struct drm_xe_query_topology_mask {
> > > +	/** @gt_id: GT ID the mask is associated with */
> > > +	__u16 gt_id;
> > > +
> > > +	/*
> > > +	 * To query the mask of Dual Sub Slices (DSS) available for geometry
> > > +	 * operations. For example a query response containing the following
> > > +	 * in mask:
> > > +	 *   DSS_GEOMETRY    ff ff ff ff 00 00 00 00
> > > +	 * means 32 DSS are available for geometry.
> > > +	 */
> > > +#define XE_TOPO_DSS_GEOMETRY	(1 << 0)
> > > +	/*
> > > +	 * To query the mask of Dual Sub Slices (DSS) available for compute
> > > +	 * operations. For example a query response containing the following
> > > +	 * in mask:
> > > +	 *   DSS_COMPUTE    ff ff ff ff 00 00 00 00
> > > +	 * means 32 DSS are available for compute.
> > > +	 */
> > > +#define XE_TOPO_DSS_COMPUTE	(1 << 1)
> > > +	/*
> > > +	 * To query the mask of Execution Units (EU) available per Dual Sub
> > > +	 * Slices (DSS). For example a query response containing the following
> > > +	 * in mask:
> > > +	 *   EU_PER_DSS    ff ff 00 00 00 00 00 00
> > > +	 * means each DSS has 16 EU.
> > > +	 */
> > > +#define XE_TOPO_EU_PER_DSS	(1 << 2)
> > > +	/** @type: type of mask */
> > > +	__u16 type;
> > > +
> > > +	/** @num_bytes: number of bytes in requested mask */
> > > +	__u32 num_bytes;
> > > +
> > > +	/** @mask: little-endian mask of @num_bytes */
> > > +	__u8 mask[];
> > > +};
> >
> > So typically to consume the above struct, userspace needs additional
> > information, specifically 'max_subslices' and 'max_eus_per_subslice' which
> > was included in i915 'struct drm_i915_query_topology_info'.
> >
> > For example to consume 'struct drm_xe_query_topology_mask' I had recently
> > to write the following code in IGT because this information was not
> > available through 'struct drm_xe_query_topology_mask':
> >
> >	/* Fixed fields, see fill_topology_info() and intel_sseu_set_info() in i915 */
> >	i915_topinfo.max_slices = 1;			/* always 1 */
> >	if (IS_PONTEVECCHIO(xe_dev_id(drm_fd))) {
> >		i915_topinfo.max_subslices = 64;
> >		i915_topinfo.max_eus_per_subslice = 8;
> >	} else if (intel_graphics_ver(xe_dev_id(drm_fd)) >= IP_VER(12, 50)) {
> >		i915_topinfo.max_subslices = 32;
> >		i915_topinfo.max_eus_per_subslice = 16;
> >	} else if (intel_graphics_ver(xe_dev_id(drm_fd)) >= IP_VER(12, 0)) {
> >		i915_topinfo.max_subslices = 6;
> >		i915_topinfo.max_eus_per_subslice = 16;
> >	} else {
> >		igt_assert(0);
> >	}
> >
> > So, if we are going to expose 'struct drm_xe_query_topology_mask' as it is
> > above through xe uapi, are we assuming that userspace has out of band
> > knowledge of 'max_subslices' and 'max_eus_per_subslice'? Or should this
> > information be (somehow) added to the above struct?
> >
> > Another option (which sort of works but only approximately) would be to set
> > 'num_bytes' field in 'struct drm_xe_query_topology_mask' to actual number
> > of bytes in the mask. At present it is set unconditionally to 8 in
> > query_gt_topology(), irrespective of the actual num bytes in the masks
> > which is basically (again in i915 parlance):
>
> I'm not sure what you're saying here.  num_bytes is set to the actual
> size of the mask data returned; it's not hardcoded as a constant value,
> although in practice it won't change very often since it's not tied to
> the specific platform you're running on.
>
> The size of the mask may be larger than the true set of possible DSS/EUs
> on a given platform, but that's expected.  For example, if some platform
> can only ever have a maximum of 4 DSS, but we tell userspace they're
> getting 8 bytes (64 bits), that's fine since the upper bits will always
> be 0 and can just be ignored.
>
> Remember --- this API is not intended to be give information about
> platform theoretical maximums (e.g., "Foobar Lake will never have more
> than 23 DSS") since if userspace drivers actually need that information
> it must come from the GuC's hwconfig, not from the driver code.  This
> interface is exclusively about conveying the fusing information of which
> units are physically present on your specific chip.

Ah, thanks Matt, what I missed was the GuC hwconfig part. That should solve
my problem.

Thanks.
--
Ashutosh

> >	i915_topinfo.subslice_stride = DIV_ROUND_UP(i915_topinfo.max_subslices, 8);
> >	i915_topinfo.eu_stride = DIV_ROUND_UP(i915_topinfo.max_eus_per_subslice, 8);
> >
> > If we go the 'num_bytes' route, that would be just an implementation change
> > not a uapi change.
> >
> > Thanks.
> > --
> > Ashutosh
>
> --
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation

next prev parent reply	other threads:[~2023-07-05 22:46 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-30 10:00 [Intel-xe] [PATCH DONTMERGE] drm/xe: uapi review submission Thomas Hellström
2023-06-30 10:03 ` [Intel-xe] ✓ CI.Patch_applied: success for " Patchwork
2023-06-30 10:03 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
2023-06-30 10:05 ` [Intel-xe] ✓ CI.KUnit: success " Patchwork
2023-06-30 10:08 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-06-30 10:09 ` [Intel-xe] ✓ CI.Hooks: " Patchwork
2023-06-30 10:10 ` [Intel-xe] ✗ CI.checksparse: warning " Patchwork
2023-06-30 10:57 ` [Intel-xe] ○ CI.BAT: info " Patchwork
2023-06-30 12:21 ` [Intel-xe] [PATCH DONTMERGE] " Thomas Hellström
2023-06-30 23:40 ` Dixit, Ashutosh
2023-07-05 21:19   ` Matt Roper
2023-07-05 22:33     ` Dixit, Ashutosh [this message]
2023-07-06  4:20 ` Matt Roper
2023-07-06  6:14   ` Thomas Hellström
2023-07-06 22:30     ` Matt Roper
2023-07-07  7:22       ` Thomas Hellström
2023-07-06 22:48     ` Matt Roper
     [not found] ` <76c3d534-453a-7fd3-6fc8-cf30e9c8d464@intel.com>
2023-09-14  1:40   ` Zeng, Oak
2023-09-25 21:42     ` Rodrigo Vivi
2023-09-25 22:07       ` Zeng, Oak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87v8eyc7kf.wl-ashutosh.dixit@intel.com \
    --to=ashutosh.dixit@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=lionel.g.landwerlin@intel.com \
    --cc=matthew.d.roper@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.