* Re: [Intel-xe] [PATCH v3 33/43] drm/xe/uapi: Convert tile_mask to a pt_placement_hint
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 33/43] drm/xe/uapi: Convert tile_mask to a pt_placement_hint Francois Dugast
@ 2023-11-09  9:29   ` Matthew Brost
  2023-11-09 19:05     ` Rodrigo Vivi
  0 siblings, 1 reply; 53+ messages in thread
From: Matthew Brost @ 2023-11-09  9:29 UTC (permalink / raw)
  To: Francois Dugast; +Cc: intel-xe, Rodrigo Vivi

On Thu, Nov 09, 2023 at 03:44:47PM +0000, Francois Dugast wrote:
> From: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
> The previous tile_mask was also an optional hint, and only used
> for the page-table tree placement. However, it was too tied
> to the tile concept itself. Let's clear things up and make
> this generic enough by accepting any valid memory region mask.
> It could even be a near_mem_region taken directly from the engine_info.
> pt stands for page table.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Francois Dugast <francois.dugast@intel.com>

I thought we landed on converting tile_mask to sched_group_mask? I do
not like pt_placement_hint at all; as I've stated, what we actually care
about is creating mappings for exec queues. The sched_group_mask is
still a hint, basically saying that at minimum you must create a mapping
for these sched groups, perhaps more. The driver is free to place a PPGTT
(or multiple) anywhere it wants to, based on the platform.

e.g. On PVC we have two scheduling groups, and two PPGTTs (one per tile in VRAM)
e.g. On MTL we have two scheduling groups, and one PPGTT (sysmem)
e.g. On a hypothetical future unified-memory platform we have two scheduling groups, and one PPGTT (shared across tiles in unified VRAM)

Matt

> ---
>  drivers/gpu/drm/xe/xe_vm.c | 14 ++++++++++----
>  include/uapi/drm/xe_drm.h  | 16 +++++++++++++---
>  2 files changed, 23 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index f8559ebad9bc..ad3b5ea6f91a 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -3018,11 +3018,16 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  			goto release_vm_lock;
>  		}
>  
> -		if (bind_ops[i].tile_mask) {
> +		if (bind_ops[i].pt_placement_hint) {
>  			u64 valid_tiles = BIT(xe->info.tile_count) - 1;
> +			/*
> +			 * System memory is currently ignored from this hint,
> +			 * which gets entirely converted to a tile_mask
> +			 */
> +			u8 system_memory = 0x1;
>  
> -			if (XE_IOCTL_DBG(xe, bind_ops[i].tile_mask &
> -					 ~valid_tiles)) {
> +			if (XE_IOCTL_DBG(xe, bind_ops[i].pt_placement_hint &
> +					 ~valid_tiles & ~system_memory)) {
>  				err = -EINVAL;
>  				goto release_vm_lock;
>  			}
> @@ -3099,7 +3104,8 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  		u32 op = bind_ops[i].op;
>  		u32 flags = bind_ops[i].flags;
>  		u64 obj_offset = bind_ops[i].obj_offset;
> -		u8 tile_mask = bind_ops[i].tile_mask;
> +		/* Remove the system memory bit when converting to tiles */
> +		u8 tile_mask = bind_ops[i].pt_placement_hint & ~0x1;
>  		u32 prefetch_region = bind_ops[i].prefetch_mem_region_instance;
>  
>  		ops[i] = vm_bind_ioctl_ops_create(vm, bos[i], obj_offset,
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> index 3cbfc17d9ffa..144a423868cf 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -853,10 +853,20 @@ struct drm_xe_vm_bind_op {
>  	__u64 addr;
>  
>  	/**
> -	 * @tile_mask: Mask for which tiles to create binds for, 0 == All tiles,
> -	 * only applies to creating new VMAs
> +	 * @pt_placement_hint: An optional memory_region bit-mask hint, which
> +	 * only applies when creating new VMAs. Default value '0' is the
> +	 * recommended value.
> +	 *
> +	 * It hints the optimal placement for the page-table tree for this VMA.
> +	 * For instance, when userspace is using engines living in a secondary
> +	 * tile with allocated BOs near those engines, that same
> +	 * @near_mem_region could be used in this hint field.
> +	 *
> +	 * Since it is a hint, the Xe kernel driver is free to ignore this mask
> +	 * and choose the best location for the page-table, taking into
> +	 * consideration the running hardware and runtime constrains.
>  	 */
> -	__u64 tile_mask;
> +	__u64 pt_placement_hint;
>  
>  #define DRM_XE_VM_BIND_OP_MAP		0x0
>  #define DRM_XE_VM_BIND_OP_UNMAP		0x1
> -- 
> 2.34.1
> 
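
For readers following the mask arithmetic in the xe_vm.c hunk above, here is
a minimal userspace sketch of the same check. The helper names
check_pt_placement_hint() and hint_to_tile_mask() are hypothetical and exist
only for illustration; the logic is copied from the quoted diff, not from any
merged driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* The quoted hunk treats bit 0 of the hint as system memory. */
#define SYSTEM_MEMORY_BIT 0x1ull

/*
 * Hypothetical helper mirroring the check in the quoted hunk:
 * reject any bit that is neither a valid tile nor the system memory bit.
 */
static int check_pt_placement_hint(uint64_t hint, unsigned int tile_count)
{
	uint64_t valid_tiles;

	/* Shifting by 64 or more would be undefined behavior. */
	assert(tile_count < 64);
	valid_tiles = (1ull << tile_count) - 1;

	if (hint & ~valid_tiles & ~SYSTEM_MEMORY_BIT)
		return -EINVAL;

	return 0;
}

/*
 * Hypothetical helper mirroring the conversion in the quoted hunk:
 * the system memory bit is dropped before the value is used as a tile mask.
 */
static uint8_t hint_to_tile_mask(uint64_t hint)
{
	return (uint8_t)(hint & ~SYSTEM_MEMORY_BIT);
}
```

With two tiles, a hint of 0x3 passes the check and converts to a tile mask of
0x2, while a hint of 0x4 is rejected. Since bit 0 is always cleared during the
conversion, a hint of 0x1 converts to a tile mask of 0.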


* Re: [Intel-xe] [PATCH v3 35/43] drm/xe/uapi: Refactor engine information
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 35/43] drm/xe/uapi: Refactor engine information Francois Dugast
@ 2023-11-09 12:07   ` Matthew Brost
  0 siblings, 0 replies; 53+ messages in thread
From: Matthew Brost @ 2023-11-09 12:07 UTC (permalink / raw)
  To: Francois Dugast; +Cc: intel-xe, Rodrigo Vivi

On Thu, Nov 09, 2023 at 03:44:49PM +0000, Francois Dugast wrote:
> From: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
> First of all, let's add the tile and gt IDs to the engine_info.
> We originally tried to abstract tile from the uAPI, but that is
> not future proof since the tile might be important info for
> user space regarding cache line information.
> 
> Now that we have gt_id as part of the info, let's convert
> the instance.gt_id into a generic scheduling group id number.
> For all the current platforms, the scheduling group is the
> GT ID underneath, but at least the API becomes flexible enough
> to allow different kinds of engine grouping without necessarily
> getting tied to the GT ID.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/xe/xe_exec_queue.c      | 17 +++++++++--------
>  drivers/gpu/drm/xe/xe_query.c           | 13 ++++++++++---
>  drivers/gpu/drm/xe/xe_wait_user_fence.c |  4 ++--
>  include/uapi/drm/xe_drm.h               | 10 ++++++++--
>  4 files changed, 29 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
> index 064f25e5e3a5..e30363bb5152 100644
> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> @@ -500,13 +500,13 @@ find_hw_engine(struct xe_device *xe,
>  	if (eci.engine_class > ARRAY_SIZE(user_to_xe_engine_class))
>  		return NULL;
>  
> -	if (eci.gt_id >= xe->info.gt_count)
> +	if (eci.sched_group_id >= xe->info.gt_count)
>  		return NULL;
>  
>  	idx = array_index_nospec(eci.engine_class,
>  				 ARRAY_SIZE(user_to_xe_engine_class));
>  
> -	return xe_gt_hw_engine(xe_device_get_gt(xe, eci.gt_id),
> +	return xe_gt_hw_engine(xe_device_get_gt(xe, eci.sched_group_id),
>  			       user_to_xe_engine_class[idx],
>  			       eci.engine_instance, true);
>  }
> @@ -547,7 +547,7 @@ static u32 calc_validate_logical_mask(struct xe_device *xe, struct xe_gt *gt,
>  	int len = num_bb_per_exec * num_eng_per_bb;
>  	int i, j, n;
>  	u16 class;
> -	u16 gt_id;
> +	u16 sched_group_id;
>  	u32 return_mask = 0, prev_mask;
>  
>  	if (XE_IOCTL_DBG(xe, !xe_device_uc_enabled(xe) &&
> @@ -569,12 +569,13 @@ static u32 calc_validate_logical_mask(struct xe_device *xe, struct xe_gt *gt,
>  			if (XE_IOCTL_DBG(xe, xe_hw_engine_is_reserved(hwe)))
>  				return 0;
>  
> -			if (XE_IOCTL_DBG(xe, n && eci[n].gt_id != gt_id) ||
> +			if (XE_IOCTL_DBG(xe, n &&
> +					 eci[n].sched_group_id != sched_group_id) ||
>  			    XE_IOCTL_DBG(xe, n && eci[n].engine_class != class))
>  				return 0;
>  
>  			class = eci[n].engine_class;
> -			gt_id = eci[n].gt_id;
> +			sched_group_id = eci[n].sched_group_id;
>  
>  			if (num_bb_per_exec == 1 || !i)
>  				return_mask |= BIT(eci[n].engine_instance);
> @@ -623,7 +624,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
>  	if (XE_IOCTL_DBG(xe, err))
>  		return -EFAULT;
>  
> -	if (XE_IOCTL_DBG(xe, eci[0].gt_id >= xe->info.gt_count))
> +	if (XE_IOCTL_DBG(xe, eci[0].sched_group_id >= xe->info.gt_count))
>  		return -EINVAL;
>  
>  	if (eci[0].engine_class >= DRM_XE_ENGINE_CLASS_VM_BIND_ASYNC) {
> @@ -636,7 +637,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
>  			if (xe_gt_is_media_type(gt))
>  				continue;
>  
> -			eci[0].gt_id = gt->info.id;
> +			eci[0].sched_group_id = gt->info.id;
>  			logical_mask = bind_exec_queue_logical_mask(xe, gt, eci,
>  								    args->num_bb_per_exec,
>  								    args->num_eng_per_bb);
> @@ -677,7 +678,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
>  					      &q->multi_gt_link);
>  		}
>  	} else {
> -		gt = xe_device_get_gt(xe, eci[0].gt_id);
> +		gt = xe_device_get_gt(xe, eci[0].sched_group_id);
>  		logical_mask = calc_validate_logical_mask(xe, gt, eci,
>  							  args->num_bb_per_exec,
>  							  args->num_eng_per_bb);
> diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
> index e5db18c91f01..99e1bfa9b446 100644
> --- a/drivers/gpu/drm/xe/xe_query.c
> +++ b/drivers/gpu/drm/xe/xe_query.c
> @@ -131,10 +131,10 @@ query_engine_cycles(struct xe_device *xe,
>  		return -EINVAL;
>  
>  	eci = &resp.eci;
> -	if (eci->gt_id > XE_MAX_GT_PER_TILE)
> +	if (eci->sched_group_id > XE_MAX_GT_PER_TILE)
>  		return -EINVAL;
>  
> -	gt = xe_device_get_gt(xe, eci->gt_id);
> +	gt = xe_device_get_gt(xe, eci->sched_group_id);
>  	if (!gt)
>  		return -EINVAL;
>  
> @@ -215,8 +215,15 @@ static int query_engines(struct xe_device *xe,
>  				xe_to_user_engine_class[hwe->class];
>  			hw_engine_info[i].instance.engine_instance =
>  				hwe->logical_instance;
> -			hw_engine_info[i].instance.gt_id = gt->info.id;
> +			/*
> +			 * Scheduling Group ID is the global GT ID for the
> +			 * current hardware, although the API is flexible
> +			 */
> +			hw_engine_info[i].instance.sched_group_id = gt->info.id;
>  			hw_engine_info[i].instance.pad = 0;
> +			hw_engine_info[i].tile_id = gt_to_tile(gt)->id;
> +			hw_engine_info[i].gt_id = gt->info.id;
> +
>  			/*
>  			 * The mem_regions indexes in the mask below need to
>  			 * directly identify the struct
> diff --git a/drivers/gpu/drm/xe/xe_wait_user_fence.c b/drivers/gpu/drm/xe/xe_wait_user_fence.c
> index 4d5c2555ce41..dcbb1c578b22 100644
> --- a/drivers/gpu/drm/xe/xe_wait_user_fence.c
> +++ b/drivers/gpu/drm/xe/xe_wait_user_fence.c
> @@ -68,10 +68,10 @@ static int check_hw_engines(struct xe_device *xe,
>  		enum xe_engine_class user_class =
>  			user_to_xe_engine_class[eci[i].engine_class];
>  
> -		if (eci[i].gt_id >= xe->info.tile_count)
> +		if (eci[i].sched_group_id >= xe->info.tile_count)
>  			return -EINVAL;
>  
> -		if (!xe_gt_hw_engine(xe_device_get_gt(xe, eci[i].gt_id),
> +		if (!xe_gt_hw_engine(xe_device_get_gt(xe, eci[i].sched_group_id),
>  				     user_class, eci[i].engine_instance, true))
>  			return -EINVAL;
>  	}
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> index df8c5663f899..342f22c2d9f0 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -211,8 +211,8 @@ struct drm_xe_engine_class_instance {
>  	__u16 engine_class;
>  	/** @engine_instance: Engine instance */
>  	__u16 engine_instance;
> -	/** @gt_id: GT ID the instance is associated with */
> -	__u16 gt_id;
> +	/** @sched_group_id: Scheduling Group ID for this engine instance */
> +	__u16 sched_group_id;
>  	/** @pad: MBZ */
>  	__u16 pad;
>  };
> @@ -228,6 +228,12 @@ struct drm_xe_query_engine_info {
>  	/** @instance: The @drm_xe_engine_class_instance */
>  	struct drm_xe_engine_class_instance instance;
>  
> +	/** @tile_id: Tile ID where this Engine lives */
> +	__u16 tile_id;
> +
> +	/** @gt_id: GT ID where this Engine lives */
> +	__u16 gt_id;
> +
>  	/**
>  	 * @near_mem_regions: Bit mask of instances from
>  	 * drm_xe_query_mem_regions that is near this engine.
> -- 
> 2.34.1
> 
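
To illustrate how user space might consume the refactored engine information,
here is a minimal sketch that counts the engines belonging to one scheduling
group. The two structs are simplified local mirrors of the fields shown in the
xe_drm.h hunk above (the remaining members of drm_xe_query_engine_info are
omitted), and count_engines_in_group() is a hypothetical helper, not part of
the uAPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified mirror of struct drm_xe_engine_class_instance after the patch. */
struct engine_class_instance {
	uint16_t engine_class;
	uint16_t engine_instance;
	uint16_t sched_group_id;
	uint16_t pad;
};

/* Simplified mirror of the start of struct drm_xe_query_engine_info. */
struct engine_info {
	struct engine_class_instance instance;
	uint16_t tile_id;
	uint16_t gt_id;
};

/* Hypothetical helper: count engines whose instance is in the given group. */
static size_t count_engines_in_group(const struct engine_info *engines,
				     size_t num_engines, uint16_t group)
{
	size_t i, count = 0;

	assert(engines || num_engines == 0);

	for (i = 0; i < num_engines; i++)
		if (engines[i].instance.sched_group_id == group)
			count++;

	return count;
}
```

The sketch keys on sched_group_id rather than gt_id on purpose: per the commit
message the two are equal on current platforms, but only the former is meant
to identify the scheduling group.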


* Re: [Intel-xe] [PATCH v3 13/43] drm/xe/uapi: Separate bo_create placement from flags
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 13/43] drm/xe/uapi: Separate bo_create placement from flags Francois Dugast
@ 2023-11-09 14:58   ` Matthew Brost
  0 siblings, 0 replies; 53+ messages in thread
From: Matthew Brost @ 2023-11-09 14:58 UTC (permalink / raw)
  To: Francois Dugast; +Cc: intel-xe, Rodrigo Vivi

On Thu, Nov 09, 2023 at 03:44:27PM +0000, Francois Dugast wrote:
> From: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
> Although the flags are about the creation, the memory placement
> of the BO deserves a proper dedicated field in the uapi.
> 
> Besides being clearer, it also allows removing the
> 'magic' shifts from the flags, which were a concern during the
> uapi reviews.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/xe/xe_bo.c | 15 +++++++--------
>  include/uapi/drm/xe_drm.h  | 12 ++++++------
>  2 files changed, 13 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index b955c89da42c..87971f4faa58 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -1799,19 +1799,18 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
>  	u32 handle;
>  	int err;
>  
> -	if (XE_IOCTL_DBG(xe, args->extensions) || XE_IOCTL_DBG(xe, args->pad) ||
> +	if (XE_IOCTL_DBG(xe, args->extensions) ||
>  	    XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
>  		return -EINVAL;
>  
> +	/* at least one valid memory placement must be specified */
> +	if (XE_IOCTL_DBG(xe, !(args->placement & xe->info.mem_region_mask)))
> +		return -EINVAL;
> +
>  	if (XE_IOCTL_DBG(xe, args->flags &
>  			 ~(DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING |
>  			   DRM_XE_GEM_CREATE_FLAG_SCANOUT |
> -			   DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM |
> -			   xe->info.mem_region_mask)))
> -		return -EINVAL;
> -
> -	/* at least one memory type must be specified */
> -	if (XE_IOCTL_DBG(xe, !(args->flags & xe->info.mem_region_mask)))
> +			   DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM)))
>  		return -EINVAL;
>  
>  	if (XE_IOCTL_DBG(xe, args->handle))
> @@ -1832,7 +1831,7 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
>  	if (args->flags & DRM_XE_GEM_CREATE_FLAG_SCANOUT)
>  		bo_flags |= XE_BO_SCANOUT_BIT;
>  
> -	bo_flags |= args->flags << (ffs(XE_BO_CREATE_SYSTEM_BIT) - 1);
> +	bo_flags |= args->placement << (ffs(XE_BO_CREATE_SYSTEM_BIT) - 1);
>  
>  	if (args->flags & DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM) {
>  		if (XE_IOCTL_DBG(xe, !(bo_flags & XE_BO_CREATE_VRAM_MASK)))
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> index 2ed69b02a2e8..3685eeff4b8d 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -622,9 +622,12 @@ struct drm_xe_gem_create {
>  	 */
>  	__u64 size;
>  
> -#define DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING		(0x1 << 24)
> -#define DRM_XE_GEM_CREATE_FLAG_SCANOUT			(0x1 << 25)
> -#define DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM	(0x1 << 26)
> +	/** @placement: A mask of memory instances of where BO can be placed. */
> +	__u32 placement;
> +
> +#define DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING		(1 << 0)
> +#define DRM_XE_GEM_CREATE_FLAG_SCANOUT			(1 << 1)
> +#define DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM	(1 << 2)
>  	/**
>  	 * @flags: Flags, currently a mask of memory instances of where BO can
>  	 * be placed
> @@ -648,9 +651,6 @@ struct drm_xe_gem_create {
>  	 */
>  	__u32 handle;
>  
> -	/** @pad: MBZ */
> -	__u32 pad;
> -
>  	/** @reserved: Reserved */
>  	__u64 reserved[2];
>  };
> -- 
> 2.34.1
> 
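
The argument validation that results from splitting placement out of flags can
be sketched in userspace as follows. The flag values are the ones from the
xe_drm.h hunk above; the shortened macro names and check_gem_create() are
hypothetical and only mirror the order of the checks in the quoted
xe_gem_create_ioctl() hunk.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values as defined by the quoted patch, with shortened names. */
#define GEM_CREATE_FLAG_DEFER_BACKING		(1u << 0)
#define GEM_CREATE_FLAG_SCANOUT			(1u << 1)
#define GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM	(1u << 2)

#define GEM_CREATE_KNOWN_FLAGS (GEM_CREATE_FLAG_DEFER_BACKING | \
				GEM_CREATE_FLAG_SCANOUT | \
				GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM)

/*
 * Hypothetical helper mirroring the two checks in the quoted hunk:
 * at least one valid placement bit must be set, and no unknown flag
 * bit may be set.
 */
static int check_gem_create(uint32_t placement, uint32_t flags,
			    uint32_t mem_region_mask)
{
	if (!(placement & mem_region_mask))
		return -EINVAL;

	if (flags & ~GEM_CREATE_KNOWN_FLAGS)
		return -EINVAL;

	return 0;
}
```

Because placement now has its own field, the flag bits can start at bit 0 and
the memory region mask no longer has to be carved out of the flags value.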


* Re: [Intel-xe] [PATCH v3 27/43] drm/xe/uapi: Standardize the FLAG naming and assignment
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 27/43] drm/xe/uapi: Standardize the FLAG naming and assignment Francois Dugast
@ 2023-11-09 15:10   ` Matthew Brost
  0 siblings, 0 replies; 53+ messages in thread
From: Matthew Brost @ 2023-11-09 15:10 UTC (permalink / raw)
  To: Francois Dugast; +Cc: intel-xe, Rodrigo Vivi

On Thu, Nov 09, 2023 at 03:44:41PM +0000, Francois Dugast wrote:
> From: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
> Only cosmetic things. No functional change on this patch.
> Define every flag with (1 << n) and use singular FLAG name.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/xe/xe_query.c |  2 +-
>  include/uapi/drm/xe_drm.h     | 20 ++++++++++----------
>  2 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
> index bc2b4609a38d..71a4943cab20 100644
> --- a/drivers/gpu/drm/xe/xe_query.c
> +++ b/drivers/gpu/drm/xe/xe_query.c
> @@ -333,7 +333,7 @@ static int query_config(struct xe_device *xe, struct drm_xe_device_query *query)
>  		xe->info.devid | (xe->info.revid << 16);
>  	if (xe_device_get_root_tile(xe)->mem.vram.usable_size)
>  		config->info[DRM_XE_QUERY_CONFIG_FLAGS] =
> -			DRM_XE_QUERY_CONFIG_FLAGS_HAS_VRAM;
> +			DRM_XE_QUERY_CONFIG_FLAG_HAS_VRAM;
>  	config->info[DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT] =
>  		xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K ? SZ_64K : SZ_4K;
>  	config->info[DRM_XE_QUERY_CONFIG_VA_BITS] = xe->info.va_bits;
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> index 0c004b24f820..5217558a32d0 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -354,7 +354,7 @@ struct drm_xe_query_mem_regions {
>   *  - %DRM_XE_QUERY_CONFIG_FLAGS - Flags describing the device
>   *    configuration, see list below
>   *
> - *    - %DRM_XE_QUERY_CONFIG_FLAGS_HAS_VRAM - Flag is set if the device
> + *    - %DRM_XE_QUERY_CONFIG_FLAG_HAS_VRAM - Flag is set if the device
>   *      has usable VRAM
>   *  - %DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT - Minimal memory alignment
>   *    required by this device, typically SZ_4K or SZ_64K
> @@ -371,7 +371,7 @@ struct drm_xe_query_config {
>  
>  #define DRM_XE_QUERY_CONFIG_REV_AND_DEVICE_ID		0
>  #define DRM_XE_QUERY_CONFIG_FLAGS			1
> -	#define DRM_XE_QUERY_CONFIG_FLAGS_HAS_VRAM	(0x1 << 0)
> +	#define DRM_XE_QUERY_CONFIG_FLAG_HAS_VRAM	(1 << 0)
>  	/*
>  	 * DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT - This returns the
>  	 * maximum value of the &min_page_size across all memory regions
> @@ -755,10 +755,10 @@ struct drm_xe_vm_create {
>  	/** @extensions: Pointer to the first extension struct, if any */
>  	__u64 extensions;
>  
> -#define DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE	(0x1 << 0)
> -#define DRM_XE_VM_CREATE_FLAG_COMPUTE_MODE	(0x1 << 1)
> -#define DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT	(0x1 << 2)
> -#define DRM_XE_VM_CREATE_FLAG_FAULT_MODE	(0x1 << 3)
> +#define DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE	(1 << 0)
> +#define DRM_XE_VM_CREATE_FLAG_COMPUTE_MODE	(1 << 1)
> +#define DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT	(1 << 2)
> +#define DRM_XE_VM_CREATE_FLAG_FAULT_MODE	(1 << 3)
>  	/** @flags: Flags */
>  	__u32 flags;
>  
> @@ -852,10 +852,10 @@ struct drm_xe_vm_bind_op {
>  	/** @op: Bind operation to perform */
>  	__u32 op;
>  
> -#define DRM_XE_VM_BIND_FLAG_READONLY	(0x1 << 0)
> -#define DRM_XE_VM_BIND_FLAG_ASYNC	(0x1 << 1)
> -#define DRM_XE_VM_BIND_FLAG_IMMEDIATE	(0x1 << 2)
> -#define DRM_XE_VM_BIND_FLAG_NULL	(0x1 << 3)
> +#define DRM_XE_VM_BIND_FLAG_READONLY	(1 << 0)
> +#define DRM_XE_VM_BIND_FLAG_ASYNC	(1 << 1)
> +#define DRM_XE_VM_BIND_FLAG_IMMEDIATE	(1 << 2)
> +#define DRM_XE_VM_BIND_FLAG_NULL	(1 << 3)
>  	/** @flags: Bind flags */
>  	__u32 flags;
>  
> -- 
> 2.34.1
> 
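
The query_config() context lines in the hunk above pack the device and
revision IDs as devid | (revid << 16) and report VRAM through a single flag
bit. A minimal sketch of how user space could unpack those two info entries
follows; the helper names and the shortened flag macro are hypothetical, and
the 8-bit revision width is an assumption taken from the query documentation
in this series.

```c
#include <stdint.h>

/* Same value as DRM_XE_QUERY_CONFIG_FLAG_HAS_VRAM in the quoted patch. */
#define QUERY_CONFIG_FLAG_HAS_VRAM (1u << 0)

/* Hypothetical helper: device ID lives in the lower 16 bits. */
static uint16_t config_device_id(uint64_t rev_and_device_id)
{
	return (uint16_t)(rev_and_device_id & 0xffff);
}

/* Hypothetical helper: revision is assumed to be the next 8 bits. */
static uint8_t config_revision(uint64_t rev_and_device_id)
{
	return (uint8_t)((rev_and_device_id >> 16) & 0xff);
}

/* Hypothetical helper: test the VRAM flag in the FLAGS info entry. */
static int config_has_vram(uint64_t flags)
{
	return (flags & QUERY_CONFIG_FLAG_HAS_VRAM) != 0;
}
```

For example, an info value of 0x0856a0 unpacks to device ID 0x56a0 and
revision 0x08.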


* [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2
@ 2023-11-09 15:44 Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 01/43] drm/xe/uapi: Add documentation for query Francois Dugast
                   ` (46 more replies)
  0 siblings, 47 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

This is the second take of uAPI updates that break compatibility,
which is not acceptable once we are merged upstream. So, let's break
it before it is too late, and start upstreaming a good, reliable and
clean uAPI.

v2: Rebase, drop "RFC", more uAPI fixes and cleanup

v3:
- Rebase
- Checkpatch
- Apply fixups and squash 
- Do not remove num_params 
- Skip "[PATCH v2 01/50] fixup! drm/xe: Correlate engine and cpu
  timestamps with better accuracy" already merged by Lucas 
- Skip "[PATCH v2 40/50] drm/xe/uapi: Add link to Xe documentation"
  as location will change 
- Change "[PATCH v2 12/50] fixup! drm/xe: Correlate engine and cpu
  timestamps with better accuracy" to not be a fixup 
- Fix commit message of "[PATCH v2 24/50] xe/xe_bo: Reject bo
  creation of unaligned size" 
- Include already provided "Reviewed-by" 

Aravind Iddamsetty (1):
  drm/xe/pmu: Drop interrupt pmu event

Francois Dugast (17):
  drm/xe/uapi: Add documentation for query
  drm/xe/uapi: Document DRM_XE_DEVICE_QUERY_HWCONFIG
  drm/xe: Extend uAPI to query HuC micro-controler firmware version
  drm/xe/uapi: Remove useless XE_QUERY_CONFIG_NUM_PARAM
  drm/xe/uapi: Add missing DRM_ prefix in uAPI constants
  drm/xe/uapi: Add _FLAG to uAPI constants usable for flags
  drm/xe/uapi: Make constant comments visible in kernel doc
  drm/xe/uapi: Change rsvd to pad in struct drm_xe_class_instance
  drm/xe/uapi: Remove unused inaccessible memory region
  drm/xe/uapi: Remove unused QUERY_CONFIG_MEM_REGION_COUNT
  drm/xe/uapi: Remove unused QUERY_CONFIG_GT_COUNT
  drm/xe/uapi: Replace BO with GEM in documentation
  drm/xe/uapi: Align on a common way to return arrays (memory regions)
  drm/xe/uapi: Align on a common way to return arrays (gt)
  drm/xe/uapi: Align on a common way to return arrays (engines)
  drm/xe/uapi: Add block diagram of a device
  drm/xe/uapi: Add examples of user space code

José Roberto de Souza (2):
  drm/xe: Add uAPI to query micro-controler firmware version
  drm/xe: Make DRM_XE_DEVICE_QUERY_ENGINES future proof

Mauro Carvalho Chehab (1):
  drm/xe/uapi: Reject bo creation of unaligned size

Mika Kuoppala (1):
  drm/xe: Extend drm_xe_vm_bind_op

Rodrigo Vivi (21):
  drm/xe/uapi: Remove GT_TYPE_REMOTE
  drm/xe/uapi: Kill VM_MADVISE IOCTL
  drm/xe/uapi: Separate bo_create placement from flags
  drm/xe/uapi: Rename *_mem_regions masks
  drm/xe/uapi: Rename query's mem_usage to mem_regions
  drm/xe/uapi: Fix indentation issues that sometimes causes build
    warning
  drm/xe/uapi: Order sections
  drm/xe/uapi: More uAPI documentation additions and cosmetic updates
  drm/xe/uapi: Split xe_sync types from flags
  drm/xe/uapi: Standardize the FLAG naming and assignment
  drm/xe/uapi: Differentiate WAIT_OP from WAIT_MASK
  drm/xe/uapi: Move xe_exec after xe_exec_queue
  drm/xe/uapi: Move memory_region masks from GT to engine
  drm/xe/uapi: Document the memory_region bitmask
  drm/xe/uapi: Be more specific about the vm_bind prefetch region
  drm/xe/uapi: Convert tile_mask to a pt_placement_hint
  drm/xe/uapi: Exec queue documentation and variable renaming
  drm/xe/uapi: Refactor engine information
  drm/xe/uapi: Crystal Reference Clock updates
  drm/xe/uapi: Add Tile ID information to the GT info query
  drm/xe/uapi: Remove bogus engine list from the wait_user_fence IOCTL

 drivers/gpu/drm/xe/Makefile              |    1 -
 drivers/gpu/drm/xe/tests/xe_dma_buf.c    |    8 +-
 drivers/gpu/drm/xe/xe_bo.c               |   51 +-
 drivers/gpu/drm/xe/xe_bo_types.h         |    3 +
 drivers/gpu/drm/xe/xe_devcoredump.c      |    8 +-
 drivers/gpu/drm/xe/xe_device.c           |    8 +-
 drivers/gpu/drm/xe/xe_exec.c             |    4 +-
 drivers/gpu/drm/xe/xe_exec_queue.c       |   86 +-
 drivers/gpu/drm/xe/xe_exec_queue.h       |    4 +-
 drivers/gpu/drm/xe/xe_exec_queue_types.h |    4 +-
 drivers/gpu/drm/xe/xe_gt.c               |    2 +-
 drivers/gpu/drm/xe/xe_gt_clock.c         |    4 +-
 drivers/gpu/drm/xe/xe_gt_types.h         |    4 +-
 drivers/gpu/drm/xe/xe_guc_submit.c       |   32 +-
 drivers/gpu/drm/xe/xe_irq.c              |   18 -
 drivers/gpu/drm/xe/xe_pmu.c              |   25 +-
 drivers/gpu/drm/xe/xe_pmu_types.h        |    8 -
 drivers/gpu/drm/xe/xe_query.c            |  220 ++--
 drivers/gpu/drm/xe/xe_ring_ops.c         |    8 +-
 drivers/gpu/drm/xe/xe_sched_job.c        |   10 +-
 drivers/gpu/drm/xe/xe_sync.c             |   27 +-
 drivers/gpu/drm/xe/xe_sync_types.h       |    1 +
 drivers/gpu/drm/xe/xe_trace.h            |    8 +-
 drivers/gpu/drm/xe/xe_vm.c               |  115 +-
 drivers/gpu/drm/xe/xe_vm_doc.h           |   14 +-
 drivers/gpu/drm/xe/xe_vm_madvise.c       |  299 -----
 drivers/gpu/drm/xe/xe_vm_madvise.h       |   15 -
 drivers/gpu/drm/xe/xe_wait_user_fence.c  |   74 +-
 include/uapi/drm/xe_drm.h                | 1334 ++++++++++++++--------
 29 files changed, 1250 insertions(+), 1145 deletions(-)
 delete mode 100644 drivers/gpu/drm/xe/xe_vm_madvise.c
 delete mode 100644 drivers/gpu/drm/xe/xe_vm_madvise.h

-- 
2.34.1



* [Intel-xe] [PATCH v3 01/43] drm/xe/uapi: Add documentation for query
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 02/43] drm/xe: Extend drm_xe_vm_bind_op Francois Dugast
                   ` (45 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast, Rodrigo Vivi

Provide a description of the keys used in the struct
drm_xe_query_config info array. Document the behavior
of the driver for IOCTL DRM_IOCTL_XE_DEVICE_QUERY
depending on the size value provided in struct
drm_xe_device_query.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/637
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 include/uapi/drm/xe_drm.h | 41 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 9bd7092a7ea4..0b1482d5f709 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -321,14 +321,43 @@ struct drm_xe_query_config {
 	/** @pad: MBZ */
 	__u32 pad;
 
+	/*
+	 * Device ID (lower 16 bits) and the device revision (next
+	 * 8 bits)
+	 */
 #define XE_QUERY_CONFIG_REV_AND_DEVICE_ID	0
+	/*
+	 * Flags describing the device configuration, see list below
+	 */
 #define XE_QUERY_CONFIG_FLAGS			1
+	/*
+	 * Flag is set if the device has usable VRAM
+	 */
 	#define XE_QUERY_CONFIG_FLAGS_HAS_VRAM		(0x1 << 0)
+	/*
+	 * Minimal memory alignment required by this device,
+	 * typically SZ_4K or SZ_64K
+	 */
 #define XE_QUERY_CONFIG_MIN_ALIGNMENT		2
+	/*
+	 * Maximum bits of a virtual address
+	 */
 #define XE_QUERY_CONFIG_VA_BITS			3
+	/*
+	 * Total number of GTs for the entire device
+	 */
 #define XE_QUERY_CONFIG_GT_COUNT		4
+	/*
+	 * Total number of accessible memory regions
+	 */
 #define XE_QUERY_CONFIG_MEM_REGION_COUNT	5
+	/*
+	 * Value of the highest available exec queue priority
+	 */
 #define XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY	6
+	/*
+	 * Number of elements in the info array
+	 */
 #define XE_QUERY_CONFIG_NUM_PARAM		(XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY + 1)
 	/** @info: array of elements containing the config info */
 	__u64 info[];
@@ -440,9 +469,15 @@ struct drm_xe_query_topology_mask {
 /**
  * struct drm_xe_device_query - main structure to query device information
  *
- * If size is set to 0, the driver fills it with the required size for the
- * requested type of data to query. If size is equal to the required size,
- * the queried information is copied into data.
+ * The user selects the type of data to query among DRM_XE_DEVICE_QUERY_*
+ * and sets the value in the query member. This determines the type of
+ * the structure provided by the driver in data, among struct drm_xe_query_*.
+ *
+ * If size is set to 0, the driver fills it with the required size for
+ * the requested type of data to query. If size is equal to the required
+ * size, the queried information is copied into data. If size is set to
+ * a value different from 0 and different from the required size, the
+ * IOCTL call returns -EINVAL.
  *
  * For example the following code snippet allows retrieving and printing
  * information about the device engines with DRM_XE_DEVICE_QUERY_ENGINES:
-- 
2.34.1
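
The size handling documented above implies a two-call pattern: first query
with size 0 to learn the required size, then query again with exactly that
size. The following is a minimal userspace sketch of those three rules using a
mock in place of the real IOCTL; struct mock_query and mock_device_query() are
hypothetical stand-ins, not the driver's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the size and data members of the query struct. */
struct mock_query {
	uint32_t size;
	void *data;
};

/*
 * Hypothetical mock of the documented behavior:
 *  - size == 0: report the required size and return 0
 *  - size == required size: copy the payload into data and return 0
 *  - any other size: return -EINVAL
 */
static int mock_device_query(struct mock_query *q, const void *payload,
			     uint32_t required_size)
{
	assert(q);

	if (q->size == 0) {
		q->size = required_size;
		return 0;
	}

	if (q->size != required_size)
		return -EINVAL;

	memcpy(q->data, payload, required_size);
	return 0;
}
```

A caller would zero the query, call once to obtain the size, allocate a buffer
of that size, point data at it, and call again. Passing a stale or guessed
size fails with -EINVAL rather than truncating the result.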



* [Intel-xe] [PATCH v3 02/43] drm/xe: Extend drm_xe_vm_bind_op
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 01/43] drm/xe/uapi: Add documentation for query Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 03/43] drm/xe: Add uAPI to query micro-controler firmware version Francois Dugast
                   ` (44 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast, Lucas De Marchi, Rodrigo Vivi

From: Mika Kuoppala <mika.kuoppala@linux.intel.com>

The bind API is extensible, but there is no mechanism to
extend a single bind op. Add an extensions field to
struct drm_xe_vm_bind_op.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 include/uapi/drm/xe_drm.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 0b1482d5f709..edbc58a4769c 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -656,6 +656,9 @@ struct drm_xe_vm_destroy {
 };
 
 struct drm_xe_vm_bind_op {
+	/** @extensions: Pointer to the first extension struct, if any */
+	__u64 extensions;
+
 	/**
 	 * @obj: GEM object to operate on, MBZ for MAP_USERPTR, MBZ for UNMAP
 	 */
-- 
2.34.1



* [Intel-xe] [PATCH v3 03/43] drm/xe: Add uAPI to query micro-controler firmware version
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 01/43] drm/xe/uapi: Add documentation for query Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 02/43] drm/xe: Extend drm_xe_vm_bind_op Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 04/43] drm/xe/uapi: Document DRM_XE_DEVICE_QUERY_HWCONFIG Francois Dugast
                   ` (43 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast, Rodrigo Vivi

From: José Roberto de Souza <jose.souza@intel.com>

Due to a bug in GuC firmware, Mesa can't enable by default the usage of
compute engines in DG2 and newer.

A new GuC firmware fixed the issue but until now there was no way
for Mesa to know if KMD was running with the fixed GuC version or not,
so this uAPI is required.

It may be expanded in future to query other firmware versions too.

More information:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661
Mesa usage:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25233

v2:
- changed to submission version
- added branch version to be future proof
- checking if pads and reserved are zero

v3:
- add braces around case XE_QUERY_UC_TYPE_GUC to make CI happy

v4:
- squashed commits
- make it very clear and documented that it is about the submission
  version, and also what that actually means.

Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 41 +++++++++++++++++++++++++++++++++
 include/uapi/drm/xe_drm.h     | 43 +++++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 10b9878ec95a..063f9bf071a3 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -498,6 +498,46 @@ static int query_gt_topology(struct xe_device *xe,
 	return 0;
 }
 
+static int
+query_uc_fw_version(struct xe_device *xe, struct drm_xe_device_query *query)
+{
+	struct drm_xe_query_uc_fw_version __user *query_ptr = u64_to_user_ptr(query->data);
+	size_t size = sizeof(struct drm_xe_query_uc_fw_version);
+	struct drm_xe_query_uc_fw_version resp;
+
+	if (query->size == 0) {
+		query->size = size;
+		return 0;
+	} else if (XE_IOCTL_DBG(xe, query->size != size)) {
+		return -EINVAL;
+	}
+
+	if (copy_from_user(&resp, query_ptr, size))
+		return -EFAULT;
+
+	if (XE_IOCTL_DBG(xe, resp.reserved || resp.pad2 || resp.reserved2))
+		return -EINVAL;
+
+	switch (resp.uc_type) {
+	case DRM_XE_QUERY_UC_TYPE_GUC_SUBMISSION: {
+		struct xe_guc *guc = &xe->tiles[0].primary_gt->uc.guc;
+
+		resp.major_ver = guc->submission_state.version.major;
+		resp.minor_ver = guc->submission_state.version.minor;
+		resp.patch_ver = guc->submission_state.version.patch;
+		resp.branch_ver = 0;
+		break;
+	}
+	default:
+		return -EINVAL;
+	}
+
+	if (copy_to_user(query_ptr, &resp, size))
+		return -EFAULT;
+
+	return 0;
+}
+
 static int (* const xe_query_funcs[])(struct xe_device *xe,
 				      struct drm_xe_device_query *query) = {
 	query_engines,
@@ -507,6 +547,7 @@ static int (* const xe_query_funcs[])(struct xe_device *xe,
 	query_hwconfig,
 	query_gt_topology,
 	query_engine_cycles,
+	query_uc_fw_version,
 };
 
 int xe_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index edbc58a4769c..169ae928802b 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -466,6 +466,48 @@ struct drm_xe_query_topology_mask {
 	__u8 mask[];
 };
 
+/**
+ * struct drm_xe_query_uc_fw_version - query a micro-controller firmware version
+ *
+ * Given a uc_type this will return the major, minor, patch and branch version
+ * of the micro-controller firmware.
+ *
+ * The @uc_type can be:
+ *  - %DRM_XE_QUERY_UC_TYPE_GUC_SUBMISSION - This is the GuC Submission Version,
+ * a.k.a 'VF version'. It is not the actual GuC blob version. A running GuC can
+ * support multiple VF APIs with different Submission Versions. This version is
+ * negotiated by the VF KMD with GuC during VF initialization. In most of the
+ * current available GuC blobs, this is a 1-1 relationship where the Submission
+ * version could be inferred from the running version and vice-versa. However,
+ * the submission version is the most useful information for the user space
+ * perspective and needs.
+ *  - %DRM_XE_QUERY_UC_TYPE_HUC - The actual HuC blob that is currently running
+ * in the platform. It returns 0 when HuC is not currently loaded.
+ */
+struct drm_xe_query_uc_fw_version {
+	/** @uc_type: The micro-controller type to query firmware version */
+#define DRM_XE_QUERY_UC_TYPE_GUC_SUBMISSION	0
+	__u16 uc_type;
+
+	/** @reserved: Reserved */
+	__u16 reserved;
+
+	/** @major_ver: major uc fw version */
+	__u32 major_ver;
+	/** @minor_ver: minor uc fw version */
+	__u32 minor_ver;
+	/** @patch_ver: patch uc fw version */
+	__u32 patch_ver;
+	/** @branch_ver: branch uc fw version */
+	__u32 branch_ver;
+
+	/** @pad2: MBZ */
+	__u32 pad2;
+
+	/** @reserved2: Reserved */
+	__u64 reserved2;
+};
+
 /**
  * struct drm_xe_device_query - main structure to query device information
  *
@@ -518,6 +560,7 @@ struct drm_xe_device_query {
 #define DRM_XE_DEVICE_QUERY_HWCONFIG		4
 #define DRM_XE_DEVICE_QUERY_GT_TOPOLOGY		5
 #define DRM_XE_DEVICE_QUERY_ENGINE_CYCLES	6
+#define DRM_XE_DEVICE_QUERY_UC_FW_VERSION	7
 	/** @query: The type of data to query */
 	__u32 query;
 
-- 
2.34.1
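
For readers following along, the two-call pattern implemented by
query_uc_fw_version() above (size == 0 returns the required size, the
second call fills the struct) can be sketched from the user space side as
below. This is only a sketch under stated assumptions: the structs are
local mirrors of the uAPI structs in this patch, and fake_query_ioctl() is
a hypothetical stand-in for the real DRM ioctl so the example runs without
a device. The version numbers it returns are made up.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct drm_xe_query_uc_fw_version from this patch. */
struct uc_fw_version {
	uint16_t uc_type;
	uint16_t reserved;
	uint32_t major_ver;
	uint32_t minor_ver;
	uint32_t patch_ver;
	uint32_t branch_ver;
	uint32_t pad2;
	uint64_t reserved2;
};

/* Local mirror of the drm_xe_device_query fields used by the handler. */
struct device_query {
	uint32_t query;
	uint32_t size;
	uint64_t data;
};

#define QUERY_UC_FW_VERSION	7	/* DRM_XE_DEVICE_QUERY_UC_FW_VERSION */
#define UC_TYPE_GUC_SUBMISSION	0	/* DRM_XE_QUERY_UC_TYPE_GUC_SUBMISSION */

/* Hypothetical stand-in for the kernel side; mimics the checks in the patch. */
static int fake_query_ioctl(struct device_query *q)
{
	struct uc_fw_version resp;

	if (q->query != QUERY_UC_FW_VERSION)
		return -EINVAL;
	if (q->size == 0) {
		q->size = sizeof(resp);	/* first call: report required size */
		return 0;
	}
	if (q->size != sizeof(resp))
		return -EINVAL;

	memcpy(&resp, (void *)(uintptr_t)q->data, sizeof(resp));
	if (resp.reserved || resp.pad2 || resp.reserved2)
		return -EINVAL;		/* reserved fields must be zero */
	if (resp.uc_type != UC_TYPE_GUC_SUBMISSION)
		return -EINVAL;

	resp.major_ver = 1;		/* made-up version for the sketch */
	resp.minor_ver = 1;
	resp.patch_ver = 0;
	resp.branch_ver = 0;
	memcpy((void *)(uintptr_t)q->data, &resp, sizeof(resp));
	return 0;
}

/* User space side: ask for the size, then for the data. */
static int query_guc_submission_version(struct uc_fw_version *out)
{
	struct device_query q = { .query = QUERY_UC_FW_VERSION };
	int ret;

	assert(out != NULL);

	ret = fake_query_ioctl(&q);
	if (ret)
		return ret;
	if (q.size != sizeof(*out))
		return -EINVAL;		/* layout mismatch with this kernel */

	memset(out, 0, sizeof(*out));	/* zero the reserved/pad fields */
	out->uc_type = UC_TYPE_GUC_SUBMISSION;
	q.data = (uint64_t)(uintptr_t)out;
	return fake_query_ioctl(&q);
}
```

Zeroing the whole struct before the second call matters because the
handler rejects non-zero reserved and pad fields with -EINVAL.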



* [Intel-xe] [PATCH v3 04/43] drm/xe/uapi: Document DRM_XE_DEVICE_QUERY_HWCONFIG
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (2 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 03/43] drm/xe: Add uAPI to query micro-controller firmware version Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 05/43] drm/xe: Extend uAPI to query HuC micro-controller firmware version Francois Dugast
                   ` (42 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

Add documentation on the content and format of the data returned when
using query type DRM_XE_DEVICE_QUERY_HWCONFIG. The list of keys can be
found in IGT under lib/intel_hwconfig_types.h.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 include/uapi/drm/xe_drm.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 169ae928802b..68cf67461846 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -557,6 +557,11 @@ struct drm_xe_device_query {
 #define DRM_XE_DEVICE_QUERY_MEM_USAGE		1
 #define DRM_XE_DEVICE_QUERY_CONFIG		2
 #define DRM_XE_DEVICE_QUERY_GT_LIST		3
+	/*
+	 * Query type to retrieve the hardware configuration of the device
+	 * such as information on slices, memory, caches, and so on. It is
+	 * provided as a table of attributes (key / value).
+	 */
 #define DRM_XE_DEVICE_QUERY_HWCONFIG		4
 #define DRM_XE_DEVICE_QUERY_GT_TOPOLOGY		5
 #define DRM_XE_DEVICE_QUERY_ENGINE_CYCLES	6
-- 
2.34.1
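
As an illustration of walking such a key / value table, here is a minimal
sketch. The exact record layout is an assumption and is not stated in this
patch: each attribute is taken to be a 32-bit key, a 32-bit length counted
in dwords, then that many 32-bit values. hwconfig_for_each() and the
callback type are hypothetical names for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback invoked once per attribute. */
typedef void (*hwconfig_attr_fn)(uint32_t key, const uint32_t *vals,
				 uint32_t len, void *ctx);

/*
 * Walk a blob of { key, len, vals[len] } records (ASSUMED layout).
 * Returns the number of attributes, or -1 if the blob is truncated
 * or has trailing bytes that do not form a full record.
 */
static int hwconfig_for_each(const uint32_t *blob, size_t size_bytes,
			     hwconfig_attr_fn fn, void *ctx)
{
	size_t n = size_bytes / sizeof(uint32_t);
	size_t pos = 0;
	int count = 0;

	assert(blob != NULL || n == 0);

	while (pos + 2 <= n) {
		uint32_t key = blob[pos];
		uint32_t len = blob[pos + 1];

		/* The values must fit in what is left of the blob. */
		if (len > n - pos - 2)
			return -1;

		if (fn)
			fn(key, &blob[pos + 2], len, ctx);

		pos += 2 + (size_t)len;
		count++;
	}

	return pos == n ? count : -1;
}
```

The bounds check is done before touching the values so that a malformed
or truncated table is rejected instead of being read past its end.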



* [Intel-xe] [PATCH v3 05/43] drm/xe: Extend uAPI to query HuC micro-controller firmware version
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (3 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 04/43] drm/xe/uapi: Document DRM_XE_DEVICE_QUERY_HWCONFIG Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 06/43] drm/xe/uapi: Remove useless XE_QUERY_CONFIG_NUM_PARAM Francois Dugast
                   ` (41 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

The infrastructure to query the GuC firmware version is already in place.
Extend it with a new micro-controller type to query the HuC firmware
version. User space can use it to know if HuC is running.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 9 +++++++++
 include/uapi/drm/xe_drm.h     | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 063f9bf071a3..a7f34669bb9a 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -528,6 +528,15 @@ query_uc_fw_version(struct xe_device *xe, struct drm_xe_device_query *query)
 		resp.branch_ver = 0;
 		break;
 	}
+	case DRM_XE_QUERY_UC_TYPE_HUC: {
+		struct xe_huc *huc = &xe->tiles[0].primary_gt->uc.huc;
+
+		resp.major_ver = huc->fw.major_ver_found;
+		resp.minor_ver = huc->fw.minor_ver_found;
+		resp.patch_ver = huc->fw.patch_ver_found;
+		resp.branch_ver = 0;
+		break;
+	}
 	default:
 		return -EINVAL;
 	}
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 68cf67461846..f03aea937459 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -487,6 +487,7 @@ struct drm_xe_query_topology_mask {
 struct drm_xe_query_uc_fw_version {
 	/** @uc_type: The micro-controller type to query firmware version */
 #define DRM_XE_QUERY_UC_TYPE_GUC_SUBMISSION	0
+#define DRM_XE_QUERY_UC_TYPE_HUC		1
 	__u16 uc_type;
 
 	/** @reserved: Reserved */
-- 
2.34.1



* [Intel-xe] [PATCH v3 06/43] drm/xe/uapi: Remove useless XE_QUERY_CONFIG_NUM_PARAM
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (4 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 05/43] drm/xe: Extend uAPI to query HuC micro-controller firmware version Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 07/43] drm/xe/uapi: Add missing DRM_ prefix in uAPI constants Francois Dugast
                   ` (40 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

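num_params can be used to retrieve the size of the info array for the
specific version of the kernel being used, so XE_QUERY_CONFIG_NUM_PARAM
does not need to be part of the uAPI. Move the define into xe_query.c.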

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 1 +
 include/uapi/drm/xe_drm.h     | 4 ----
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index a7f34669bb9a..5942ab0811a2 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -305,6 +305,7 @@ static int query_memory_usage(struct xe_device *xe,
 
 static int query_config(struct xe_device *xe, struct drm_xe_device_query *query)
 {
+#define XE_QUERY_CONFIG_NUM_PARAM	(XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY + 1)
 	u32 num_params = XE_QUERY_CONFIG_NUM_PARAM;
 	size_t size =
 		sizeof(struct drm_xe_query_config) + num_params * sizeof(u64);
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index f03aea937459..dcd4680ae788 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -355,10 +355,6 @@ struct drm_xe_query_config {
 	 * Value of the highest available exec queue priority
 	 */
 #define XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY	6
-	/*
-	 * Number of elements in the info array
-	 */
-#define XE_QUERY_CONFIG_NUM_PARAM		(XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY + 1)
 	/** @info: array of elements containing the config info */
 	__u64 info[];
 };
-- 
2.34.1
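
To illustrate the point of the commit message, user space can bound its
accesses to info[] with the num_params value returned by the running
kernel instead of a compile-time constant. A minimal sketch, assuming a
local mirror of struct drm_xe_query_config (the pad field is an
assumption); config_alloc() and config_get() are hypothetical helpers, and
config_alloc() stands in for the buffer a real query would fill.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Local mirror of struct drm_xe_query_config; pad is assumed. */
struct query_config {
	uint32_t num_params;
	uint32_t pad;
	uint64_t info[];
};

/* Index taken from the uAPI header quoted in this patch. */
#define CONFIG_MAX_EXEC_QUEUE_PRIORITY	6

/* Stand-in for the buffer the kernel would fill in a real query. */
static struct query_config *config_alloc(uint32_t num_params)
{
	struct query_config *cfg;

	cfg = calloc(1, sizeof(*cfg) + num_params * sizeof(uint64_t));
	if (cfg)
		cfg->num_params = num_params;
	return cfg;
}

/* Read info[index] only if this kernel actually reported it. */
static bool config_get(const struct query_config *cfg, uint32_t index,
		       uint64_t *value)
{
	assert(cfg != NULL && value != NULL);

	if (index >= cfg->num_params)
		return false;

	*value = cfg->info[index];
	return true;
}
```

With this shape, a binary built against a newer header still behaves on an
older kernel: an index the kernel does not know about is simply reported
as unavailable.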



* [Intel-xe] [PATCH v3 07/43] drm/xe/uapi: Add missing DRM_ prefix in uAPI constants
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (5 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 06/43] drm/xe/uapi: Remove useless XE_QUERY_CONFIG_NUM_PARAM Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 08/43] drm/xe/uapi: Add _FLAG to uAPI constants usable for flags Francois Dugast
                   ` (39 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

Most constants defined in xe_drm.h use DRM_XE_ as a prefix, which helps
identify the namespace. Make this systematic and add the prefix where it
was missing.

v2:
- fix vertical alignment of define values
- remove double DRM_ in some variables (José Roberto de Souza)

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c         |  14 +--
 drivers/gpu/drm/xe/xe_exec_queue.c |  20 ++---
 drivers/gpu/drm/xe/xe_gt.c         |   2 +-
 drivers/gpu/drm/xe/xe_pmu.c        |  20 ++---
 drivers/gpu/drm/xe/xe_query.c      |  36 ++++----
 drivers/gpu/drm/xe/xe_vm.c         |  54 ++++++------
 drivers/gpu/drm/xe/xe_vm_doc.h     |  12 +--
 drivers/gpu/drm/xe/xe_vm_madvise.c |   8 +-
 include/uapi/drm/xe_drm.h          | 136 ++++++++++++++---------------
 9 files changed, 151 insertions(+), 151 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 1ae0543882a0..f01817c6d022 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -209,7 +209,7 @@ static int __xe_bo_placement_for_flags(struct xe_device *xe, struct xe_bo *bo,
 
 	/* The order of placements should indicate preferred location */
 
-	if (bo->props.preferred_mem_class == XE_MEM_REGION_CLASS_SYSMEM) {
+	if (bo->props.preferred_mem_class == DRM_XE_MEM_REGION_CLASS_SYSMEM) {
 		try_add_system(bo, places, bo_flags, &c);
 		try_add_vram(xe, bo, places, bo_flags, &c);
 	} else {
@@ -1804,9 +1804,9 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
 		return -EINVAL;
 
 	if (XE_IOCTL_DBG(xe, args->flags &
-			 ~(XE_GEM_CREATE_FLAG_DEFER_BACKING |
-			   XE_GEM_CREATE_FLAG_SCANOUT |
-			   XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM |
+			 ~(DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING |
+			   DRM_XE_GEM_CREATE_FLAG_SCANOUT |
+			   DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM |
 			   xe->info.mem_region_mask)))
 		return -EINVAL;
 
@@ -1826,15 +1826,15 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
 	if (XE_IOCTL_DBG(xe, args->size & ~PAGE_MASK))
 		return -EINVAL;
 
-	if (args->flags & XE_GEM_CREATE_FLAG_DEFER_BACKING)
+	if (args->flags & DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING)
 		bo_flags |= XE_BO_DEFER_BACKING;
 
-	if (args->flags & XE_GEM_CREATE_FLAG_SCANOUT)
+	if (args->flags & DRM_XE_GEM_CREATE_FLAG_SCANOUT)
 		bo_flags |= XE_BO_SCANOUT_BIT;
 
 	bo_flags |= args->flags << (ffs(XE_BO_CREATE_SYSTEM_BIT) - 1);
 
-	if (args->flags & XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM) {
+	if (args->flags & DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM) {
 		if (XE_IOCTL_DBG(xe, !(bo_flags & XE_BO_CREATE_VRAM_MASK)))
 			return -EINVAL;
 
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index 4fd44a9203e4..59e8d1ed34f7 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -406,14 +406,14 @@ typedef int (*xe_exec_queue_set_property_fn)(struct xe_device *xe,
 					     u64 value, bool create);
 
 static const xe_exec_queue_set_property_fn exec_queue_set_property_funcs[] = {
-	[XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY] = exec_queue_set_priority,
-	[XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE] = exec_queue_set_timeslice,
-	[XE_EXEC_QUEUE_SET_PROPERTY_PREEMPTION_TIMEOUT] = exec_queue_set_preemption_timeout,
-	[XE_EXEC_QUEUE_SET_PROPERTY_PERSISTENCE] = exec_queue_set_persistence,
-	[XE_EXEC_QUEUE_SET_PROPERTY_JOB_TIMEOUT] = exec_queue_set_job_timeout,
-	[XE_EXEC_QUEUE_SET_PROPERTY_ACC_TRIGGER] = exec_queue_set_acc_trigger,
-	[XE_EXEC_QUEUE_SET_PROPERTY_ACC_NOTIFY] = exec_queue_set_acc_notify,
-	[XE_EXEC_QUEUE_SET_PROPERTY_ACC_GRANULARITY] = exec_queue_set_acc_granularity,
+	[DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY] = exec_queue_set_priority,
+	[DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE] = exec_queue_set_timeslice,
+	[DRM_XE_EXEC_QUEUE_SET_PROPERTY_PREEMPTION_TIMEOUT] = exec_queue_set_preemption_timeout,
+	[DRM_XE_EXEC_QUEUE_SET_PROPERTY_PERSISTENCE] = exec_queue_set_persistence,
+	[DRM_XE_EXEC_QUEUE_SET_PROPERTY_JOB_TIMEOUT] = exec_queue_set_job_timeout,
+	[DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_TRIGGER] = exec_queue_set_acc_trigger,
+	[DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_NOTIFY] = exec_queue_set_acc_notify,
+	[DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_GRANULARITY] = exec_queue_set_acc_granularity,
 };
 
 static int exec_queue_user_ext_set_property(struct xe_device *xe,
@@ -445,7 +445,7 @@ typedef int (*xe_exec_queue_user_extension_fn)(struct xe_device *xe,
 					       bool create);
 
 static const xe_exec_queue_set_property_fn exec_queue_user_extension_funcs[] = {
-	[XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY] = exec_queue_user_ext_set_property,
+	[DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY] = exec_queue_user_ext_set_property,
 };
 
 #define MAX_USER_EXTENSIONS	16
@@ -764,7 +764,7 @@ int xe_exec_queue_get_property_ioctl(struct drm_device *dev, void *data,
 		return -ENOENT;
 
 	switch (args->property) {
-	case XE_EXEC_QUEUE_GET_PROPERTY_BAN:
+	case DRM_XE_EXEC_QUEUE_GET_PROPERTY_BAN:
 		args->value = !!(q->flags & EXEC_QUEUE_FLAG_BANNED);
 		ret = 0;
 		break;
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 73c090762771..80fa48c95e60 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -556,7 +556,7 @@ static void xe_uevent_gt_reset_failure(struct pci_dev *pdev, u8 tile_id, u8 gt_i
 {
 	char *reset_event[4];
 
-	reset_event[0] = XE_RESET_FAILED_UEVENT "=NEEDS_RESET";
+	reset_event[0] = DRM_XE_RESET_FAILED_UEVENT "=NEEDS_RESET";
 	reset_event[1] = kasprintf(GFP_KERNEL, "TILE_ID=%d", tile_id);
 	reset_event[2] = kasprintf(GFP_KERNEL, "GT_ID=%d", gt_id);
 	reset_event[3] = NULL;
diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
index abfc0b3aeac4..8378ca3007d9 100644
--- a/drivers/gpu/drm/xe/xe_pmu.c
+++ b/drivers/gpu/drm/xe/xe_pmu.c
@@ -114,17 +114,17 @@ config_status(struct xe_device *xe, u64 config)
 		return -ENOENT;
 
 	switch (config_counter(config)) {
-	case XE_PMU_INTERRUPTS(0):
+	case DRM_XE_PMU_INTERRUPTS(0):
 		if (gt_id)
 			return -ENOENT;
 		break;
-	case XE_PMU_RENDER_GROUP_BUSY(0):
-	case XE_PMU_COPY_GROUP_BUSY(0):
-	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
+	case DRM_XE_PMU_RENDER_GROUP_BUSY(0):
+	case DRM_XE_PMU_COPY_GROUP_BUSY(0):
+	case DRM_XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
 		if (gt->info.type == XE_GT_TYPE_MEDIA)
 			return -ENOENT;
 		break;
-	case XE_PMU_MEDIA_GROUP_BUSY(0):
+	case DRM_XE_PMU_MEDIA_GROUP_BUSY(0):
 		if (!(gt->info.engine_mask & (BIT(XE_HW_ENGINE_VCS0) | BIT(XE_HW_ENGINE_VECS0))))
 			return -ENOENT;
 		break;
@@ -185,13 +185,13 @@ static u64 __xe_pmu_event_read(struct perf_event *event)
 	u64 val;
 
 	switch (config_counter(config)) {
-	case XE_PMU_INTERRUPTS(0):
+	case DRM_XE_PMU_INTERRUPTS(0):
 		val = READ_ONCE(pmu->irq_count);
 		break;
-	case XE_PMU_RENDER_GROUP_BUSY(0):
-	case XE_PMU_COPY_GROUP_BUSY(0):
-	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
-	case XE_PMU_MEDIA_GROUP_BUSY(0):
+	case DRM_XE_PMU_RENDER_GROUP_BUSY(0):
+	case DRM_XE_PMU_COPY_GROUP_BUSY(0):
+	case DRM_XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
+	case DRM_XE_PMU_MEDIA_GROUP_BUSY(0):
 		val = engine_group_busyness_read(gt, config);
 		break;
 	default:
diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 5942ab0811a2..995930e47ea2 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -261,7 +261,7 @@ static int query_memory_usage(struct xe_device *xe,
 		return -ENOMEM;
 
 	man = ttm_manager_type(&xe->ttm, XE_PL_TT);
-	usage->regions[0].mem_class = XE_MEM_REGION_CLASS_SYSMEM;
+	usage->regions[0].mem_class = DRM_XE_MEM_REGION_CLASS_SYSMEM;
 	usage->regions[0].instance = 0;
 	usage->regions[0].min_page_size = PAGE_SIZE;
 	usage->regions[0].total_size = man->size << PAGE_SHIFT;
@@ -273,7 +273,7 @@ static int query_memory_usage(struct xe_device *xe,
 		man = ttm_manager_type(&xe->ttm, i);
 		if (man) {
 			usage->regions[usage->num_regions].mem_class =
-				XE_MEM_REGION_CLASS_VRAM;
+				DRM_XE_MEM_REGION_CLASS_VRAM;
 			usage->regions[usage->num_regions].instance =
 				usage->num_regions;
 			usage->regions[usage->num_regions].min_page_size =
@@ -305,8 +305,8 @@ static int query_memory_usage(struct xe_device *xe,
 
 static int query_config(struct xe_device *xe, struct drm_xe_device_query *query)
 {
-#define XE_QUERY_CONFIG_NUM_PARAM	(XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY + 1)
-	u32 num_params = XE_QUERY_CONFIG_NUM_PARAM;
+#define DRM_XE_QUERY_CONFIG_NUM_PARAM	(DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY + 1)
+	u32 num_params = DRM_XE_QUERY_CONFIG_NUM_PARAM;
 	size_t size =
 		sizeof(struct drm_xe_query_config) + num_params * sizeof(u64);
 	struct drm_xe_query_config __user *query_ptr =
@@ -325,18 +325,18 @@ static int query_config(struct xe_device *xe, struct drm_xe_device_query *query)
 		return -ENOMEM;
 
 	config->num_params = num_params;
-	config->info[XE_QUERY_CONFIG_REV_AND_DEVICE_ID] =
+	config->info[DRM_XE_QUERY_CONFIG_REV_AND_DEVICE_ID] =
 		xe->info.devid | (xe->info.revid << 16);
 	if (xe_device_get_root_tile(xe)->mem.vram.usable_size)
-		config->info[XE_QUERY_CONFIG_FLAGS] =
-			XE_QUERY_CONFIG_FLAGS_HAS_VRAM;
-	config->info[XE_QUERY_CONFIG_MIN_ALIGNMENT] =
+		config->info[DRM_XE_QUERY_CONFIG_FLAGS] =
+			DRM_XE_QUERY_CONFIG_FLAGS_HAS_VRAM;
+	config->info[DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT] =
 		xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K ? SZ_64K : SZ_4K;
-	config->info[XE_QUERY_CONFIG_VA_BITS] = xe->info.va_bits;
-	config->info[XE_QUERY_CONFIG_GT_COUNT] = xe->info.gt_count;
-	config->info[XE_QUERY_CONFIG_MEM_REGION_COUNT] =
+	config->info[DRM_XE_QUERY_CONFIG_VA_BITS] = xe->info.va_bits;
+	config->info[DRM_XE_QUERY_CONFIG_GT_COUNT] = xe->info.gt_count;
+	config->info[DRM_XE_QUERY_CONFIG_MEM_REGION_COUNT] =
 		hweight_long(xe->info.mem_region_mask);
-	config->info[XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY] =
+	config->info[DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY] =
 		xe_exec_queue_device_get_max_priority(xe);
 
 	if (copy_to_user(query_ptr, config, size)) {
@@ -372,11 +372,11 @@ static int query_gt_list(struct xe_device *xe, struct drm_xe_device_query *query
 	gt_list->num_gt = xe->info.gt_count;
 	for_each_gt(gt, xe, id) {
 		if (xe_gt_is_media_type(gt))
-			gt_list->gt_list[id].type = XE_QUERY_GT_TYPE_MEDIA;
+			gt_list->gt_list[id].type = DRM_XE_QUERY_GT_TYPE_MEDIA;
 		else if (gt_to_tile(gt)->id > 0)
-			gt_list->gt_list[id].type = XE_QUERY_GT_TYPE_REMOTE;
+			gt_list->gt_list[id].type = DRM_XE_QUERY_GT_TYPE_REMOTE;
 		else
-			gt_list->gt_list[id].type = XE_QUERY_GT_TYPE_MAIN;
+			gt_list->gt_list[id].type = DRM_XE_QUERY_GT_TYPE_MAIN;
 		gt_list->gt_list[id].gt_id = gt->info.id;
 		gt_list->gt_list[id].clock_freq = gt->info.clock_freq;
 		if (!IS_DGFX(xe))
@@ -474,21 +474,21 @@ static int query_gt_topology(struct xe_device *xe,
 	for_each_gt(gt, xe, id) {
 		topo.gt_id = id;
 
-		topo.type = XE_TOPO_DSS_GEOMETRY;
+		topo.type = DRM_XE_TOPO_DSS_GEOMETRY;
 		query_ptr = copy_mask(query_ptr, &topo,
 				      gt->fuse_topo.g_dss_mask,
 				      sizeof(gt->fuse_topo.g_dss_mask));
 		if (IS_ERR(query_ptr))
 			return PTR_ERR(query_ptr);
 
-		topo.type = XE_TOPO_DSS_COMPUTE;
+		topo.type = DRM_XE_TOPO_DSS_COMPUTE;
 		query_ptr = copy_mask(query_ptr, &topo,
 				      gt->fuse_topo.c_dss_mask,
 				      sizeof(gt->fuse_topo.c_dss_mask));
 		if (IS_ERR(query_ptr))
 			return PTR_ERR(query_ptr);
 
-		topo.type = XE_TOPO_EU_PER_DSS;
+		topo.type = DRM_XE_TOPO_EU_PER_DSS;
 		query_ptr = copy_mask(query_ptr, &topo,
 				      gt->fuse_topo.eu_mask_per_dss,
 				      sizeof(gt->fuse_topo.eu_mask_per_dss));
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index d45f4f1d490f..ca4abbb86585 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -2183,8 +2183,8 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 	       (ULL)bo_offset_or_userptr);
 
 	switch (operation) {
-	case XE_VM_BIND_OP_MAP:
-	case XE_VM_BIND_OP_MAP_USERPTR:
+	case DRM_XE_VM_BIND_OP_MAP:
+	case DRM_XE_VM_BIND_OP_MAP_USERPTR:
 		ops = drm_gpuvm_sm_map_ops_create(&vm->gpuvm, addr, range,
 						  obj, bo_offset_or_userptr);
 		if (IS_ERR(ops))
@@ -2195,13 +2195,13 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 
 			op->tile_mask = tile_mask;
 			op->map.immediate =
-				flags & XE_VM_BIND_FLAG_IMMEDIATE;
+				flags & DRM_XE_VM_BIND_FLAG_IMMEDIATE;
 			op->map.read_only =
-				flags & XE_VM_BIND_FLAG_READONLY;
-			op->map.is_null = flags & XE_VM_BIND_FLAG_NULL;
+				flags & DRM_XE_VM_BIND_FLAG_READONLY;
+			op->map.is_null = flags & DRM_XE_VM_BIND_FLAG_NULL;
 		}
 		break;
-	case XE_VM_BIND_OP_UNMAP:
+	case DRM_XE_VM_BIND_OP_UNMAP:
 		ops = drm_gpuvm_sm_unmap_ops_create(&vm->gpuvm, addr, range);
 		if (IS_ERR(ops))
 			return ops;
@@ -2212,7 +2212,7 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 			op->tile_mask = tile_mask;
 		}
 		break;
-	case XE_VM_BIND_OP_PREFETCH:
+	case DRM_XE_VM_BIND_OP_PREFETCH:
 		ops = drm_gpuvm_prefetch_ops_create(&vm->gpuvm, addr, range);
 		if (IS_ERR(ops))
 			return ops;
@@ -2224,7 +2224,7 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 			op->prefetch.region = region;
 		}
 		break;
-	case XE_VM_BIND_OP_UNMAP_ALL:
+	case DRM_XE_VM_BIND_OP_UNMAP_ALL:
 		xe_assert(vm->xe, bo);
 
 		err = xe_bo_lock(bo, true);
@@ -2828,13 +2828,13 @@ static int vm_bind_ioctl_ops_execute(struct xe_vm *vm,
 
 #ifdef TEST_VM_ASYNC_OPS_ERROR
 #define SUPPORTED_FLAGS	\
-	(FORCE_ASYNC_OP_ERROR | XE_VM_BIND_FLAG_ASYNC | \
-	 XE_VM_BIND_FLAG_READONLY | XE_VM_BIND_FLAG_IMMEDIATE | \
-	 XE_VM_BIND_FLAG_NULL | 0xffff)
+	(FORCE_ASYNC_OP_ERROR | DRM_XE_VM_BIND_FLAG_ASYNC | \
+	 DRM_XE_VM_BIND_FLAG_READONLY | DRM_XE_VM_BIND_FLAG_IMMEDIATE | \
+	 DRM_XE_VM_BIND_FLAG_NULL | 0xffff)
 #else
 #define SUPPORTED_FLAGS	\
-	(XE_VM_BIND_FLAG_ASYNC | XE_VM_BIND_FLAG_READONLY | \
-	 XE_VM_BIND_FLAG_IMMEDIATE | XE_VM_BIND_FLAG_NULL | \
+	(DRM_XE_VM_BIND_FLAG_ASYNC | DRM_XE_VM_BIND_FLAG_READONLY | \
+	 DRM_XE_VM_BIND_FLAG_IMMEDIATE | DRM_XE_VM_BIND_FLAG_NULL | \
 	 0xffff)
 #endif
 #define XE_64K_PAGE_MASK 0xffffull
@@ -2882,45 +2882,45 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe,
 		u32 obj = (*bind_ops)[i].obj;
 		u64 obj_offset = (*bind_ops)[i].obj_offset;
 		u32 region = (*bind_ops)[i].region;
-		bool is_null = flags & XE_VM_BIND_FLAG_NULL;
+		bool is_null = flags & DRM_XE_VM_BIND_FLAG_NULL;
 
 		if (i == 0) {
-			*async = !!(flags & XE_VM_BIND_FLAG_ASYNC);
+			*async = !!(flags & DRM_XE_VM_BIND_FLAG_ASYNC);
 			if (XE_IOCTL_DBG(xe, !*async && args->num_syncs)) {
 				err = -EINVAL;
 				goto free_bind_ops;
 			}
 		} else if (XE_IOCTL_DBG(xe, *async !=
-					!!(flags & XE_VM_BIND_FLAG_ASYNC))) {
+					!!(flags & DRM_XE_VM_BIND_FLAG_ASYNC))) {
 			err = -EINVAL;
 			goto free_bind_ops;
 		}
 
-		if (XE_IOCTL_DBG(xe, op > XE_VM_BIND_OP_PREFETCH) ||
+		if (XE_IOCTL_DBG(xe, op > DRM_XE_VM_BIND_OP_PREFETCH) ||
 		    XE_IOCTL_DBG(xe, flags & ~SUPPORTED_FLAGS) ||
 		    XE_IOCTL_DBG(xe, obj && is_null) ||
 		    XE_IOCTL_DBG(xe, obj_offset && is_null) ||
-		    XE_IOCTL_DBG(xe, op != XE_VM_BIND_OP_MAP &&
+		    XE_IOCTL_DBG(xe, op != DRM_XE_VM_BIND_OP_MAP &&
 				 is_null) ||
 		    XE_IOCTL_DBG(xe, !obj &&
-				 op == XE_VM_BIND_OP_MAP &&
+				 op == DRM_XE_VM_BIND_OP_MAP &&
 				 !is_null) ||
 		    XE_IOCTL_DBG(xe, !obj &&
-				 op == XE_VM_BIND_OP_UNMAP_ALL) ||
+				 op == DRM_XE_VM_BIND_OP_UNMAP_ALL) ||
 		    XE_IOCTL_DBG(xe, addr &&
-				 op == XE_VM_BIND_OP_UNMAP_ALL) ||
+				 op == DRM_XE_VM_BIND_OP_UNMAP_ALL) ||
 		    XE_IOCTL_DBG(xe, range &&
-				 op == XE_VM_BIND_OP_UNMAP_ALL) ||
+				 op == DRM_XE_VM_BIND_OP_UNMAP_ALL) ||
 		    XE_IOCTL_DBG(xe, obj &&
-				 op == XE_VM_BIND_OP_MAP_USERPTR) ||
+				 op == DRM_XE_VM_BIND_OP_MAP_USERPTR) ||
 		    XE_IOCTL_DBG(xe, obj &&
-				 op == XE_VM_BIND_OP_PREFETCH) ||
+				 op == DRM_XE_VM_BIND_OP_PREFETCH) ||
 		    XE_IOCTL_DBG(xe, region &&
-				 op != XE_VM_BIND_OP_PREFETCH) ||
+				 op != DRM_XE_VM_BIND_OP_PREFETCH) ||
 		    XE_IOCTL_DBG(xe, !(BIT(region) &
 				       xe->info.mem_region_mask)) ||
 		    XE_IOCTL_DBG(xe, obj &&
-				 op == XE_VM_BIND_OP_UNMAP)) {
+				 op == DRM_XE_VM_BIND_OP_UNMAP)) {
 			err = -EINVAL;
 			goto free_bind_ops;
 		}
@@ -2929,7 +2929,7 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe,
 		    XE_IOCTL_DBG(xe, addr & ~PAGE_MASK) ||
 		    XE_IOCTL_DBG(xe, range & ~PAGE_MASK) ||
 		    XE_IOCTL_DBG(xe, !range &&
-				 op != XE_VM_BIND_OP_UNMAP_ALL)) {
+				 op != DRM_XE_VM_BIND_OP_UNMAP_ALL)) {
 			err = -EINVAL;
 			goto free_bind_ops;
 		}
diff --git a/drivers/gpu/drm/xe/xe_vm_doc.h b/drivers/gpu/drm/xe/xe_vm_doc.h
index b1b2dc4a6089..516f4dc97223 100644
--- a/drivers/gpu/drm/xe/xe_vm_doc.h
+++ b/drivers/gpu/drm/xe/xe_vm_doc.h
@@ -32,9 +32,9 @@
  * Operations
  * ----------
  *
- * XE_VM_BIND_OP_MAP		- Create mapping for a BO
- * XE_VM_BIND_OP_UNMAP		- Destroy mapping for a BO / userptr
- * XE_VM_BIND_OP_MAP_USERPTR	- Create mapping for userptr
+ * DRM_XE_VM_BIND_OP_MAP		- Create mapping for a BO
+ * DRM_XE_VM_BIND_OP_UNMAP		- Destroy mapping for a BO / userptr
+ * DRM_XE_VM_BIND_OP_MAP_USERPTR	- Create mapping for userptr
  *
  * Implementation details
  * ~~~~~~~~~~~~~~~~~~~~~~
@@ -113,7 +113,7 @@
  * VM uses to report errors to. The ufence wait interface can be used to wait on
  * a VM going into an error state. Once an error is reported the VM's async
  * worker is paused. While the VM's async worker is paused sync,
- * XE_VM_BIND_OP_UNMAP operations are allowed (this can free memory). Once the
+ * DRM_XE_VM_BIND_OP_UNMAP operations are allowed (this can free memory). Once the
  * uses believe the error state is fixed, the async worker can be resumed via
  * XE_VM_BIND_OP_RESTART operation. When VM async bind work is restarted, the
  * first operation processed is the operation that caused the original error.
@@ -193,7 +193,7 @@
  * In a VM is in fault mode (TODO: link to fault mode), new bind operations that
  * create mappings are by default are deferred to the page fault handler (first
  * use). This behavior can be overriden by setting the flag
- * XE_VM_BIND_FLAG_IMMEDIATE which indicates to creating the mapping
+ * DRM_XE_VM_BIND_FLAG_IMMEDIATE which indicates to creating the mapping
  * immediately.
  *
  * User pointer
@@ -322,7 +322,7 @@
  *
  * By default, on a faulting VM binds just allocate the VMA and the actual
  * updating of the page tables is defered to the page fault handler. This
- * behavior can be overridden by setting the flag XE_VM_BIND_FLAG_IMMEDIATE in
+ * behavior can be overridden by setting the flag DRM_XE_VM_BIND_FLAG_IMMEDIATE in
  * the VM bind which will then do the bind immediately.
  *
  * Page fault handler
diff --git a/drivers/gpu/drm/xe/xe_vm_madvise.c b/drivers/gpu/drm/xe/xe_vm_madvise.c
index 0ef7d483d050..72d051ecac5c 100644
--- a/drivers/gpu/drm/xe/xe_vm_madvise.c
+++ b/drivers/gpu/drm/xe/xe_vm_madvise.c
@@ -19,10 +19,10 @@ static int madvise_preferred_mem_class(struct xe_device *xe, struct xe_vm *vm,
 {
 	int i, err;
 
-	if (XE_IOCTL_DBG(xe, value > XE_MEM_REGION_CLASS_VRAM))
+	if (XE_IOCTL_DBG(xe, value > DRM_XE_MEM_REGION_CLASS_VRAM))
 		return -EINVAL;
 
-	if (XE_IOCTL_DBG(xe, value == XE_MEM_REGION_CLASS_VRAM &&
+	if (XE_IOCTL_DBG(xe, value == DRM_XE_MEM_REGION_CLASS_VRAM &&
 			 !xe->info.is_dgfx))
 		return -EINVAL;
 
@@ -75,10 +75,10 @@ static int madvise_preferred_mem_class_gt(struct xe_device *xe,
 	u32 gt_id = upper_32_bits(value);
 	u32 mem_class = lower_32_bits(value);
 
-	if (XE_IOCTL_DBG(xe, mem_class > XE_MEM_REGION_CLASS_VRAM))
+	if (XE_IOCTL_DBG(xe, mem_class > DRM_XE_MEM_REGION_CLASS_VRAM))
 		return -EINVAL;
 
-	if (XE_IOCTL_DBG(xe, mem_class == XE_MEM_REGION_CLASS_VRAM &&
+	if (XE_IOCTL_DBG(xe, mem_class == DRM_XE_MEM_REGION_CLASS_VRAM &&
 			 !xe->info.is_dgfx))
 		return -EINVAL;
 
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index dcd4680ae788..641e94a5f9c1 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -19,12 +19,12 @@ extern "C" {
 /**
  * DOC: uevent generated by xe on it's pci node.
  *
- * XE_RESET_FAILED_UEVENT - Event is generated when attempt to reset gt
+ * DRM_XE_RESET_FAILED_UEVENT - Event is generated when attempt to reset gt
  * fails. The value supplied with the event is always "NEEDS_RESET".
  * Additional information supplied is tile id and gt id of the gt unit for
  * which reset has failed.
  */
-#define XE_RESET_FAILED_UEVENT "DEVICE_STATUS"
+#define DRM_XE_RESET_FAILED_UEVENT "DEVICE_STATUS"
 
 /**
  * struct xe_user_extension - Base class for defining a chain of extensions
@@ -103,8 +103,8 @@ struct xe_user_extension {
 #define DRM_XE_VM_CREATE		0x03
 #define DRM_XE_VM_DESTROY		0x04
 #define DRM_XE_VM_BIND			0x05
-#define DRM_XE_EXEC_QUEUE_CREATE		0x06
-#define DRM_XE_EXEC_QUEUE_DESTROY		0x07
+#define DRM_XE_EXEC_QUEUE_CREATE	0x06
+#define DRM_XE_EXEC_QUEUE_DESTROY	0x07
 #define DRM_XE_EXEC			0x08
 #define DRM_XE_EXEC_QUEUE_SET_PROPERTY	0x09
 #define DRM_XE_WAIT_USER_FENCE		0x0a
@@ -150,14 +150,14 @@ struct drm_xe_engine_class_instance {
  * enum drm_xe_memory_class - Supported memory classes.
  */
 enum drm_xe_memory_class {
-	/** @XE_MEM_REGION_CLASS_SYSMEM: Represents system memory. */
-	XE_MEM_REGION_CLASS_SYSMEM = 0,
+	/** @DRM_XE_MEM_REGION_CLASS_SYSMEM: Represents system memory. */
+	DRM_XE_MEM_REGION_CLASS_SYSMEM = 0,
 	/**
-	 * @XE_MEM_REGION_CLASS_VRAM: On discrete platforms, this
+	 * @DRM_XE_MEM_REGION_CLASS_VRAM: On discrete platforms, this
 	 * represents the memory that is local to the device, which we
 	 * call VRAM. Not valid on integrated platforms.
 	 */
-	XE_MEM_REGION_CLASS_VRAM
+	DRM_XE_MEM_REGION_CLASS_VRAM
 };
 
 /**
@@ -217,7 +217,7 @@ struct drm_xe_query_mem_region {
 	 * always equal the @total_size, since all of it will be CPU
 	 * accessible.
 	 *
-	 * Note this is only tracked for XE_MEM_REGION_CLASS_VRAM
+	 * Note this is only tracked for DRM_XE_MEM_REGION_CLASS_VRAM
 	 * regions (for other types the value here will always equal
 	 * zero).
 	 */
@@ -229,7 +229,7 @@ struct drm_xe_query_mem_region {
 	 * Requires CAP_PERFMON or CAP_SYS_ADMIN to get reliable
 	 * accounting. Without this the value here will always equal
 	 * zero.  Note this is only currently tracked for
-	 * XE_MEM_REGION_CLASS_VRAM regions (for other types the value
+	 * DRM_XE_MEM_REGION_CLASS_VRAM regions (for other types the value
 	 * here will always be zero).
 	 */
 	__u64 cpu_visible_used;
@@ -325,36 +325,36 @@ struct drm_xe_query_config {
 	 * Device ID (lower 16 bits) and the device revision (next
 	 * 8 bits)
 	 */
-#define XE_QUERY_CONFIG_REV_AND_DEVICE_ID	0
+#define DRM_XE_QUERY_CONFIG_REV_AND_DEVICE_ID		0
 	/*
 	 * Flags describing the device configuration, see list below
 	 */
-#define XE_QUERY_CONFIG_FLAGS			1
+#define DRM_XE_QUERY_CONFIG_FLAGS			1
 	/*
 	 * Flag is set if the device has usable VRAM
 	 */
-	#define XE_QUERY_CONFIG_FLAGS_HAS_VRAM		(0x1 << 0)
+	#define DRM_XE_QUERY_CONFIG_FLAGS_HAS_VRAM	(0x1 << 0)
 	/*
 	 * Minimal memory alignment required by this device,
 	 * typically SZ_4K or SZ_64K
 	 */
-#define XE_QUERY_CONFIG_MIN_ALIGNMENT		2
+#define DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT		2
 	/*
 	 * Maximum bits of a virtual address
 	 */
-#define XE_QUERY_CONFIG_VA_BITS			3
+#define DRM_XE_QUERY_CONFIG_VA_BITS			3
 	/*
 	 * Total number of GTs for the entire device
 	 */
-#define XE_QUERY_CONFIG_GT_COUNT		4
+#define DRM_XE_QUERY_CONFIG_GT_COUNT			4
 	/*
 	 * Total number of accessible memory regions
 	 */
-#define XE_QUERY_CONFIG_MEM_REGION_COUNT	5
+#define DRM_XE_QUERY_CONFIG_MEM_REGION_COUNT		5
 	/*
 	 * Value of the highest available exec queue priority
 	 */
-#define XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY	6
+#define DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY	6
 	/** @info: array of elements containing the config info */
 	__u64 info[];
 };
@@ -368,9 +368,9 @@ struct drm_xe_query_config {
  * implementing graphics and/or media operations.
  */
 struct drm_xe_query_gt {
-#define XE_QUERY_GT_TYPE_MAIN		0
-#define XE_QUERY_GT_TYPE_REMOTE		1
-#define XE_QUERY_GT_TYPE_MEDIA		2
+#define DRM_XE_QUERY_GT_TYPE_MAIN		0
+#define DRM_XE_QUERY_GT_TYPE_REMOTE		1
+#define DRM_XE_QUERY_GT_TYPE_MEDIA		2
 	/** @type: GT type: Main, Remote, or Media */
 	__u16 type;
 	/** @gt_id: Unique ID of this GT within the PCI Device */
@@ -435,7 +435,7 @@ struct drm_xe_query_topology_mask {
 	 *   DSS_GEOMETRY    ff ff ff ff 00 00 00 00
 	 * means 32 DSS are available for geometry.
 	 */
-#define XE_TOPO_DSS_GEOMETRY	(1 << 0)
+#define DRM_XE_TOPO_DSS_GEOMETRY	(1 << 0)
 	/*
 	 * To query the mask of Dual Sub Slices (DSS) available for compute
 	 * operations. For example a query response containing the following
@@ -443,7 +443,7 @@ struct drm_xe_query_topology_mask {
 	 *   DSS_COMPUTE    ff ff ff ff 00 00 00 00
 	 * means 32 DSS are available for compute.
 	 */
-#define XE_TOPO_DSS_COMPUTE	(1 << 1)
+#define DRM_XE_TOPO_DSS_COMPUTE		(1 << 1)
 	/*
 	 * To query the mask of Execution Units (EU) available per Dual Sub
 	 * Slices (DSS). For example a query response containing the following
@@ -451,7 +451,7 @@ struct drm_xe_query_topology_mask {
 	 *   EU_PER_DSS    ff ff 00 00 00 00 00 00
 	 * means each DSS has 16 EU.
 	 */
-#define XE_TOPO_EU_PER_DSS	(1 << 2)
+#define DRM_XE_TOPO_EU_PER_DSS		(1 << 2)
 	/** @type: type of mask */
 	__u16 type;
 
@@ -587,8 +587,8 @@ struct drm_xe_gem_create {
 	 */
 	__u64 size;
 
-#define XE_GEM_CREATE_FLAG_DEFER_BACKING	(0x1 << 24)
-#define XE_GEM_CREATE_FLAG_SCANOUT		(0x1 << 25)
+#define DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING		(0x1 << 24)
+#define DRM_XE_GEM_CREATE_FLAG_SCANOUT			(0x1 << 25)
 /*
  * When using VRAM as a possible placement, ensure that the corresponding VRAM
  * allocation will always use the CPU accessible part of VRAM. This is important
@@ -604,7 +604,7 @@ struct drm_xe_gem_create {
  * display surfaces, therefore the kernel requires setting this flag for such
  * objects, otherwise an error is thrown on small-bar systems.
  */
-#define XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM	(0x1 << 26)
+#define DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM	(0x1 << 26)
 	/**
 	 * @flags: Flags, currently a mask of memory instances of where BO can
 	 * be placed
@@ -671,14 +671,14 @@ struct drm_xe_ext_set_property {
 };
 
 struct drm_xe_vm_create {
-#define XE_VM_EXTENSION_SET_PROPERTY	0
+#define DRM_XE_VM_EXTENSION_SET_PROPERTY	0
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
 
-#define DRM_XE_VM_CREATE_SCRATCH_PAGE	(0x1 << 0)
-#define DRM_XE_VM_CREATE_COMPUTE_MODE	(0x1 << 1)
-#define DRM_XE_VM_CREATE_ASYNC_DEFAULT	(0x1 << 2)
-#define DRM_XE_VM_CREATE_FAULT_MODE	(0x1 << 3)
+#define DRM_XE_VM_CREATE_SCRATCH_PAGE		(0x1 << 0)
+#define DRM_XE_VM_CREATE_COMPUTE_MODE		(0x1 << 1)
+#define DRM_XE_VM_CREATE_ASYNC_DEFAULT		(0x1 << 2)
+#define DRM_XE_VM_CREATE_FAULT_MODE		(0x1 << 3)
 	/** @flags: Flags */
 	__u32 flags;
 
@@ -737,29 +737,29 @@ struct drm_xe_vm_bind_op {
 	 */
 	__u64 tile_mask;
 
-#define XE_VM_BIND_OP_MAP		0x0
-#define XE_VM_BIND_OP_UNMAP		0x1
-#define XE_VM_BIND_OP_MAP_USERPTR	0x2
-#define XE_VM_BIND_OP_UNMAP_ALL		0x3
-#define XE_VM_BIND_OP_PREFETCH		0x4
+#define DRM_XE_VM_BIND_OP_MAP		0x0
+#define DRM_XE_VM_BIND_OP_UNMAP		0x1
+#define DRM_XE_VM_BIND_OP_MAP_USERPTR	0x2
+#define DRM_XE_VM_BIND_OP_UNMAP_ALL	0x3
+#define DRM_XE_VM_BIND_OP_PREFETCH	0x4
 	/** @op: Bind operation to perform */
 	__u32 op;
 
-#define XE_VM_BIND_FLAG_READONLY	(0x1 << 0)
-#define XE_VM_BIND_FLAG_ASYNC		(0x1 << 1)
+#define DRM_XE_VM_BIND_FLAG_READONLY	(0x1 << 0)
+#define DRM_XE_VM_BIND_FLAG_ASYNC	(0x1 << 1)
 	/*
 	 * Valid on a faulting VM only, do the MAP operation immediately rather
 	 * than deferring the MAP to the page fault handler.
 	 */
-#define XE_VM_BIND_FLAG_IMMEDIATE	(0x1 << 2)
+#define DRM_XE_VM_BIND_FLAG_IMMEDIATE	(0x1 << 2)
 	/*
 	 * When the NULL flag is set, the page tables are setup with a special
 	 * bit which indicates writes are dropped and all reads return zero.  In
-	 * the future, the NULL flags will only be valid for XE_VM_BIND_OP_MAP
+	 * the future, the NULL flags will only be valid for DRM_XE_VM_BIND_OP_MAP
 	 * operations, the BO handle MBZ, and the BO offset MBZ. This flag is
 	 * intended to implement VK sparse bindings.
 	 */
-#define XE_VM_BIND_FLAG_NULL		(0x1 << 3)
+#define DRM_XE_VM_BIND_FLAG_NULL	(0x1 << 3)
 	/** @flags: Bind flags */
 	__u32 flags;
 
@@ -840,14 +840,14 @@ struct drm_xe_exec_queue_set_property {
 	/** @exec_queue_id: Exec queue ID */
 	__u32 exec_queue_id;
 
-#define XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY		0
-#define XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE		1
-#define XE_EXEC_QUEUE_SET_PROPERTY_PREEMPTION_TIMEOUT	2
-#define XE_EXEC_QUEUE_SET_PROPERTY_PERSISTENCE		3
-#define XE_EXEC_QUEUE_SET_PROPERTY_JOB_TIMEOUT		4
-#define XE_EXEC_QUEUE_SET_PROPERTY_ACC_TRIGGER		5
-#define XE_EXEC_QUEUE_SET_PROPERTY_ACC_NOTIFY		6
-#define XE_EXEC_QUEUE_SET_PROPERTY_ACC_GRANULARITY	7
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY			0
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE		1
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_PREEMPTION_TIMEOUT	2
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_PERSISTENCE		3
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_JOB_TIMEOUT		4
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_TRIGGER		5
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_NOTIFY		6
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_GRANULARITY		7
 	/** @property: property to set */
 	__u32 property;
 
@@ -859,7 +859,7 @@ struct drm_xe_exec_queue_set_property {
 };
 
 struct drm_xe_exec_queue_create {
-#define XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY               0
+#define DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY               0
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
 
@@ -898,7 +898,7 @@ struct drm_xe_exec_queue_get_property {
 	/** @exec_queue_id: Exec queue ID */
 	__u32 exec_queue_id;
 
-#define XE_EXEC_QUEUE_GET_PROPERTY_BAN			0
+#define DRM_XE_EXEC_QUEUE_GET_PROPERTY_BAN	0
 	/** @property: property to get */
 	__u32 property;
 
@@ -1087,8 +1087,8 @@ struct drm_xe_vm_madvise {
 	 * For DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS usage, see enum
 	 * drm_xe_memory_class.
 	 */
-#define DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS	0
-#define DRM_XE_VM_MADVISE_PREFERRED_GT		1
+#define DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS		0
+#define DRM_XE_VM_MADVISE_PREFERRED_GT			1
 	/*
 	 * In this case lower 32 bits are mem class, upper 32 are GT.
 	 * Combination provides a single IOCTL plus migrate VMA to preferred
@@ -1099,25 +1099,25 @@ struct drm_xe_vm_madvise {
 	 * The CPU will do atomic memory operations to this VMA. Must be set on
 	 * some devices for atomics to behave correctly.
 	 */
-#define DRM_XE_VM_MADVISE_CPU_ATOMIC		3
+#define DRM_XE_VM_MADVISE_CPU_ATOMIC			3
 	/*
 	 * The device will do atomic memory operations to this VMA. Must be set
 	 * on some devices for atomics to behave correctly.
 	 */
-#define DRM_XE_VM_MADVISE_DEVICE_ATOMIC		4
+#define DRM_XE_VM_MADVISE_DEVICE_ATOMIC			4
 	/*
 	 * Priority WRT to eviction (moving from preferred memory location due
 	 * to memory pressure). The lower the priority, the more likely to be
 	 * evicted.
 	 */
-#define DRM_XE_VM_MADVISE_PRIORITY		5
-#define		DRM_XE_VMA_PRIORITY_LOW		0
+#define DRM_XE_VM_MADVISE_PRIORITY			5
+#define		DRM_XE_VMA_PRIORITY_LOW			0
 		/* Default */
-#define		DRM_XE_VMA_PRIORITY_NORMAL	1
+#define		DRM_XE_VMA_PRIORITY_NORMAL		1
 		/* Must be user with elevated privileges */
-#define		DRM_XE_VMA_PRIORITY_HIGH	2
+#define		DRM_XE_VMA_PRIORITY_HIGH		2
 	/* Pin the VMA in memory, must be user with elevated privileges */
-#define DRM_XE_VM_MADVISE_PIN			6
+#define DRM_XE_VM_MADVISE_PIN				6
 	/** @property: property to set */
 	__u32 property;
 
@@ -1138,7 +1138,7 @@ struct drm_xe_vm_madvise {
  * in 'struct perf_event_attr' as part of perf_event_open syscall to read a
  * particular event.
  *
- * For example to open the XE_PMU_INTERRUPTS(0):
+ * For example to open the DRM_XE_PMU_INTERRUPTS(0):
  *
  * .. code-block:: C
  *
@@ -1152,7 +1152,7 @@ struct drm_xe_vm_madvise {
  *	attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED;
  *	attr.use_clockid = 1;
  *	attr.clockid = CLOCK_MONOTONIC;
- *	attr.config = XE_PMU_INTERRUPTS(0);
+ *	attr.config = DRM_XE_PMU_INTERRUPTS(0);
  *
  *	fd = syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
  */
@@ -1165,11 +1165,11 @@ struct drm_xe_vm_madvise {
 #define ___XE_PMU_OTHER(gt, x) \
 	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
 
-#define XE_PMU_INTERRUPTS(gt)			___XE_PMU_OTHER(gt, 0)
-#define XE_PMU_RENDER_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
-#define XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
-#define XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
-#define XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
+#define DRM_XE_PMU_INTERRUPTS(gt)		___XE_PMU_OTHER(gt, 0)
+#define DRM_XE_PMU_RENDER_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 1)
+#define DRM_XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
+#define DRM_XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
+#define DRM_XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
 
 #if defined(__cplusplus)
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread
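
For readers following the rename, here is a minimal sketch of how userspace might work with two of the value layouts documented in the header above: the config entry described as "Device ID (lower 16 bits) and the device revision (next 8 bits)", and the madvise value described as "lower 32 bits are mem class, upper 32 are GT". All example_* helpers are hypothetical and not part of the uAPI. The two memory class values are mirrored locally as defines (the header declares them in enum drm_xe_memory_class) so that the sketch builds without the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors enum drm_xe_memory_class from the patch (SYSMEM = 0, VRAM follows). */
#define DRM_XE_MEM_REGION_CLASS_SYSMEM	0
#define DRM_XE_MEM_REGION_CLASS_VRAM	1

/* Hypothetical: device ID is the lower 16 bits of the config entry. */
static uint16_t example_device_id(uint64_t info)
{
	return (uint16_t)(info & 0xffff);
}

/* Hypothetical: device revision is the next 8 bits. */
static uint8_t example_device_rev(uint64_t info)
{
	return (uint8_t)((info >> 16) & 0xff);
}

/* Hypothetical: pack a GT ID (upper 32 bits) and a memory class (lower 32 bits). */
static uint64_t example_pack_mem_class_gt(uint32_t gt_id, uint32_t mem_class)
{
	/* The kernel rejects anything above the VRAM class with -EINVAL. */
	assert(mem_class <= DRM_XE_MEM_REGION_CLASS_VRAM);
	return ((uint64_t)gt_id << 32) | mem_class;
}

/* Same split the kernel performs with upper_32_bits() / lower_32_bits(). */
static uint32_t example_gt_id(uint64_t value)
{
	return (uint32_t)(value >> 32);
}

static uint32_t example_mem_class(uint64_t value)
{
	return (uint32_t)(value & 0xffffffffu);
}
```

A caller would pass info[DRM_XE_QUERY_CONFIG_REV_AND_DEVICE_ID] to the first two helpers. Note that the kernel additionally rejects the VRAM class on integrated devices (the !xe->info.is_dgfx check), which this sketch does not model.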

* [Intel-xe] [PATCH v3 08/43] drm/xe/uapi: Add _FLAG to uAPI constants usable for flags
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (6 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 07/43] drm/xe/uapi: Add missing DRM_ prefix in uAPI constants Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 09/43] drm/xe/uapi: Make constant comments visible in kernel doc Francois Dugast
                   ` (38 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

Most constants defined in xe_drm.h that can be used as flags are
named DRM_XE_*_FLAG_*, which makes them easy to identify. Make this
systematic and add _FLAG where it was missing.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_sync.c            | 16 ++++++-------
 drivers/gpu/drm/xe/xe_vm.c              | 32 ++++++++++++-------------
 drivers/gpu/drm/xe/xe_vm_doc.h          |  2 +-
 drivers/gpu/drm/xe/xe_wait_user_fence.c | 10 ++++----
 include/uapi/drm/xe_drm.h               | 30 +++++++++++------------
 5 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_sync.c b/drivers/gpu/drm/xe/xe_sync.c
index 73ef259aa387..eafe53c2f55d 100644
--- a/drivers/gpu/drm/xe/xe_sync.c
+++ b/drivers/gpu/drm/xe/xe_sync.c
@@ -110,14 +110,14 @@ int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
 		return -EFAULT;
 
 	if (XE_IOCTL_DBG(xe, sync_in.flags &
-			 ~(SYNC_FLAGS_TYPE_MASK | DRM_XE_SYNC_SIGNAL)) ||
+			 ~(SYNC_FLAGS_TYPE_MASK | DRM_XE_SYNC_FLAG_SIGNAL)) ||
 	    XE_IOCTL_DBG(xe, sync_in.pad) ||
 	    XE_IOCTL_DBG(xe, sync_in.reserved[0] || sync_in.reserved[1]))
 		return -EINVAL;
 
-	signal = sync_in.flags & DRM_XE_SYNC_SIGNAL;
+	signal = sync_in.flags & DRM_XE_SYNC_FLAG_SIGNAL;
 	switch (sync_in.flags & SYNC_FLAGS_TYPE_MASK) {
-	case DRM_XE_SYNC_SYNCOBJ:
+	case DRM_XE_SYNC_FLAG_SYNCOBJ:
 		if (XE_IOCTL_DBG(xe, no_dma_fences && signal))
 			return -EOPNOTSUPP;
 
@@ -135,7 +135,7 @@ int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
 		}
 		break;
 
-	case DRM_XE_SYNC_TIMELINE_SYNCOBJ:
+	case DRM_XE_SYNC_FLAG_TIMELINE_SYNCOBJ:
 		if (XE_IOCTL_DBG(xe, no_dma_fences && signal))
 			return -EOPNOTSUPP;
 
@@ -165,12 +165,12 @@ int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
 		}
 		break;
 
-	case DRM_XE_SYNC_DMA_BUF:
+	case DRM_XE_SYNC_FLAG_DMA_BUF:
 		if (XE_IOCTL_DBG(xe, "TODO"))
 			return -EINVAL;
 		break;
 
-	case DRM_XE_SYNC_USER_FENCE:
+	case DRM_XE_SYNC_FLAG_USER_FENCE:
 		if (XE_IOCTL_DBG(xe, !signal))
 			return -EOPNOTSUPP;
 
@@ -225,7 +225,7 @@ int xe_sync_entry_add_deps(struct xe_sync_entry *sync, struct xe_sched_job *job)
 void xe_sync_entry_signal(struct xe_sync_entry *sync, struct xe_sched_job *job,
 			  struct dma_fence *fence)
 {
-	if (!(sync->flags & DRM_XE_SYNC_SIGNAL))
+	if (!(sync->flags & DRM_XE_SYNC_FLAG_SIGNAL))
 		return;
 
 	if (sync->chain_fence) {
@@ -253,7 +253,7 @@ void xe_sync_entry_signal(struct xe_sync_entry *sync, struct xe_sched_job *job,
 			dma_fence_put(fence);
 		}
 	} else if ((sync->flags & SYNC_FLAGS_TYPE_MASK) ==
-		   DRM_XE_SYNC_USER_FENCE) {
+		   DRM_XE_SYNC_FLAG_USER_FENCE) {
 		job->user_fence.used = true;
 		job->user_fence.addr = sync->addr;
 		job->user_fence.value = sync->timeline_value;
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index ca4abbb86585..76926ee756c7 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1918,10 +1918,10 @@ static int xe_vm_unbind(struct xe_vm *vm, struct xe_vma *vma,
 	return 0;
 }
 
-#define ALL_DRM_XE_VM_CREATE_FLAGS (DRM_XE_VM_CREATE_SCRATCH_PAGE | \
-				    DRM_XE_VM_CREATE_COMPUTE_MODE | \
-				    DRM_XE_VM_CREATE_ASYNC_DEFAULT | \
-				    DRM_XE_VM_CREATE_FAULT_MODE)
+#define ALL_DRM_XE_VM_CREATE_FLAGS (DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE | \
+				    DRM_XE_VM_CREATE_FLAG_COMPUTE_MODE | \
+				    DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT | \
+				    DRM_XE_VM_CREATE_FLAG_FAULT_MODE)
 
 int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 		       struct drm_file *file)
@@ -1939,9 +1939,9 @@ int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 		return -EINVAL;
 
 	if (XE_WA(xe_root_mmio_gt(xe), 14016763929))
-		args->flags |= DRM_XE_VM_CREATE_SCRATCH_PAGE;
+		args->flags |= DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE;
 
-	if (XE_IOCTL_DBG(xe, args->flags & DRM_XE_VM_CREATE_FAULT_MODE &&
+	if (XE_IOCTL_DBG(xe, args->flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE &&
 			 !xe->info.supports_usm))
 		return -EINVAL;
 
@@ -1951,32 +1951,32 @@ int xe_vm_create_ioctl(struct drm_device *dev, void *data,
 	if (XE_IOCTL_DBG(xe, args->flags & ~ALL_DRM_XE_VM_CREATE_FLAGS))
 		return -EINVAL;
 
-	if (XE_IOCTL_DBG(xe, args->flags & DRM_XE_VM_CREATE_SCRATCH_PAGE &&
-			 args->flags & DRM_XE_VM_CREATE_FAULT_MODE))
+	if (XE_IOCTL_DBG(xe, args->flags & DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE &&
+			 args->flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE))
 		return -EINVAL;
 
-	if (XE_IOCTL_DBG(xe, args->flags & DRM_XE_VM_CREATE_COMPUTE_MODE &&
-			 args->flags & DRM_XE_VM_CREATE_FAULT_MODE))
+	if (XE_IOCTL_DBG(xe, args->flags & DRM_XE_VM_CREATE_FLAG_COMPUTE_MODE &&
+			 args->flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE))
 		return -EINVAL;
 
-	if (XE_IOCTL_DBG(xe, args->flags & DRM_XE_VM_CREATE_FAULT_MODE &&
+	if (XE_IOCTL_DBG(xe, args->flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE &&
 			 xe_device_in_non_fault_mode(xe)))
 		return -EINVAL;
 
-	if (XE_IOCTL_DBG(xe, !(args->flags & DRM_XE_VM_CREATE_FAULT_MODE) &&
+	if (XE_IOCTL_DBG(xe, !(args->flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE) &&
 			 xe_device_in_fault_mode(xe)))
 		return -EINVAL;
 
 	if (XE_IOCTL_DBG(xe, args->extensions))
 		return -EINVAL;
 
-	if (args->flags & DRM_XE_VM_CREATE_SCRATCH_PAGE)
+	if (args->flags & DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE)
 		flags |= XE_VM_FLAG_SCRATCH_PAGE;
-	if (args->flags & DRM_XE_VM_CREATE_COMPUTE_MODE)
+	if (args->flags & DRM_XE_VM_CREATE_FLAG_COMPUTE_MODE)
 		flags |= XE_VM_FLAG_COMPUTE_MODE;
-	if (args->flags & DRM_XE_VM_CREATE_ASYNC_DEFAULT)
+	if (args->flags & DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT)
 		flags |= XE_VM_FLAG_ASYNC_DEFAULT;
-	if (args->flags & DRM_XE_VM_CREATE_FAULT_MODE)
+	if (args->flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE)
 		flags |= XE_VM_FLAG_FAULT_MODE;
 
 	vm = xe_vm_create(xe, flags);
diff --git a/drivers/gpu/drm/xe/xe_vm_doc.h b/drivers/gpu/drm/xe/xe_vm_doc.h
index 516f4dc97223..bdc6659891a5 100644
--- a/drivers/gpu/drm/xe/xe_vm_doc.h
+++ b/drivers/gpu/drm/xe/xe_vm_doc.h
@@ -18,7 +18,7 @@
  * Scratch page
  * ------------
  *
- * If the VM is created with the flag, DRM_XE_VM_CREATE_SCRATCH_PAGE, set the
+ * If the VM is created with the flag, DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE, set the
  * entire page table structure defaults pointing to blank page allocated by the
  * VM. Invalid memory access rather than fault just read / write to this page.
  *
diff --git a/drivers/gpu/drm/xe/xe_wait_user_fence.c b/drivers/gpu/drm/xe/xe_wait_user_fence.c
index 78686908f7fb..13562db6c07f 100644
--- a/drivers/gpu/drm/xe/xe_wait_user_fence.c
+++ b/drivers/gpu/drm/xe/xe_wait_user_fence.c
@@ -79,8 +79,8 @@ static int check_hw_engines(struct xe_device *xe,
 	return 0;
 }
 
-#define VALID_FLAGS	(DRM_XE_UFENCE_WAIT_SOFT_OP | \
-			 DRM_XE_UFENCE_WAIT_ABSTIME)
+#define VALID_FLAGS	(DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP | \
+			 DRM_XE_UFENCE_WAIT_FLAG_ABSTIME)
 #define MAX_OP		DRM_XE_UFENCE_WAIT_LTE
 
 static long to_jiffies_timeout(struct xe_device *xe,
@@ -107,7 +107,7 @@ static long to_jiffies_timeout(struct xe_device *xe,
 	 * Save the timeout to an u64 variable because nsecs_to_jiffies
 	 * might return a value that overflows s32 variable.
 	 */
-	if (args->flags & DRM_XE_UFENCE_WAIT_ABSTIME)
+	if (args->flags & DRM_XE_UFENCE_WAIT_FLAG_ABSTIME)
 		t = drm_timeout_abs_to_jiffies(args->timeout);
 	else
 		t = nsecs_to_jiffies(args->timeout);
@@ -137,7 +137,7 @@ int xe_wait_user_fence_ioctl(struct drm_device *dev, void *data,
 		u64_to_user_ptr(args->instances);
 	u64 addr = args->addr;
 	int err;
-	bool no_engines = args->flags & DRM_XE_UFENCE_WAIT_SOFT_OP;
+	bool no_engines = args->flags & DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP;
 	long timeout;
 	ktime_t start;
 
@@ -206,7 +206,7 @@ int xe_wait_user_fence_ioctl(struct drm_device *dev, void *data,
 	}
 	remove_wait_queue(&xe->ufence_wq, &w_wait);
 
-	if (!(args->flags & DRM_XE_UFENCE_WAIT_ABSTIME)) {
+	if (!(args->flags & DRM_XE_UFENCE_WAIT_FLAG_ABSTIME)) {
 		args->timeout -= ktime_to_ns(ktime_sub(ktime_get(), start));
 		if (args->timeout < 0)
 			args->timeout = 0;
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 641e94a5f9c1..145d8b19dfca 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -675,10 +675,10 @@ struct drm_xe_vm_create {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
 
-#define DRM_XE_VM_CREATE_SCRATCH_PAGE		(0x1 << 0)
-#define DRM_XE_VM_CREATE_COMPUTE_MODE		(0x1 << 1)
-#define DRM_XE_VM_CREATE_ASYNC_DEFAULT		(0x1 << 2)
-#define DRM_XE_VM_CREATE_FAULT_MODE		(0x1 << 3)
+#define DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE	(0x1 << 0)
+#define DRM_XE_VM_CREATE_FLAG_COMPUTE_MODE	(0x1 << 1)
+#define DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT	(0x1 << 2)
+#define DRM_XE_VM_CREATE_FLAG_FAULT_MODE	(0x1 << 3)
 	/** @flags: Flags */
 	__u32 flags;
 
@@ -924,11 +924,11 @@ struct drm_xe_sync {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
 
-#define DRM_XE_SYNC_SYNCOBJ		0x0
-#define DRM_XE_SYNC_TIMELINE_SYNCOBJ	0x1
-#define DRM_XE_SYNC_DMA_BUF		0x2
-#define DRM_XE_SYNC_USER_FENCE		0x3
-#define DRM_XE_SYNC_SIGNAL		0x10
+#define DRM_XE_SYNC_FLAG_SYNCOBJ		0x0
+#define DRM_XE_SYNC_FLAG_TIMELINE_SYNCOBJ	0x1
+#define DRM_XE_SYNC_FLAG_DMA_BUF		0x2
+#define DRM_XE_SYNC_FLAG_USER_FENCE		0x3
+#define DRM_XE_SYNC_FLAG_SIGNAL		0x10
 	__u32 flags;
 
 	/** @pad: MBZ */
@@ -1014,8 +1014,8 @@ struct drm_xe_wait_user_fence {
 	/** @op: wait operation (type of comparison) */
 	__u16 op;
 
-#define DRM_XE_UFENCE_WAIT_SOFT_OP	(1 << 0)	/* e.g. Wait on VM bind */
-#define DRM_XE_UFENCE_WAIT_ABSTIME	(1 << 1)
+#define DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP	(1 << 0)	/* e.g. Wait on VM bind */
+#define DRM_XE_UFENCE_WAIT_FLAG_ABSTIME	(1 << 1)
 	/** @flags: wait flags */
 	__u16 flags;
 
@@ -1033,10 +1033,10 @@ struct drm_xe_wait_user_fence {
 	__u64 mask;
 	/**
 	 * @timeout: how long to wait before bailing, value in nanoseconds.
-	 * Without DRM_XE_UFENCE_WAIT_ABSTIME flag set (relative timeout)
+	 * Without DRM_XE_UFENCE_WAIT_FLAG_ABSTIME flag set (relative timeout)
 	 * it contains timeout expressed in nanoseconds to wait (fence will
 	 * expire at now() + timeout).
-	 * When DRM_XE_UFENCE_WAIT_ABSTIME flat is set (absolute timeout) wait
+	 * When DRM_XE_UFENCE_WAIT_FLAG_ABSTIME flag is set (absolute timeout) wait
 	 * will end at timeout (uses system MONOTONIC_CLOCK).
 	 * Passing negative timeout leads to neverending wait.
 	 *
@@ -1049,13 +1049,13 @@ struct drm_xe_wait_user_fence {
 
 	/**
 	 * @num_engines: number of engine instances to wait on, must be zero
-	 * when DRM_XE_UFENCE_WAIT_SOFT_OP set
+	 * when DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP set
 	 */
 	__u64 num_engines;
 
 	/**
 	 * @instances: user pointer to array of drm_xe_engine_class_instance to
-	 * wait on, must be NULL when DRM_XE_UFENCE_WAIT_SOFT_OP set
+	 * wait on, must be NULL when DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP set
 	 */
 	__u64 instances;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread
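
As an illustration of the flag handling touched by this patch, the sketch below mirrors two checks with the renamed constants: the VM create flag validation from xe_vm_create_ioctl(), and the split of drm_xe_sync.flags into a type and a signal bit from xe_sync_entry_parse(). It is a userspace approximation, not the driver code. The example_* names are hypothetical, and EXAMPLE_SYNC_TYPE_MASK is an assumption: the kernel's SYNC_FLAGS_TYPE_MASK is not shown in the patch, and 0x3 is used here only because it covers the four type values. Device capability checks (supports_usm, fault mode state) are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the patch. */
#define DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE	(0x1 << 0)
#define DRM_XE_VM_CREATE_FLAG_COMPUTE_MODE	(0x1 << 1)
#define DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT	(0x1 << 2)
#define DRM_XE_VM_CREATE_FLAG_FAULT_MODE	(0x1 << 3)

#define DRM_XE_SYNC_FLAG_SYNCOBJ		0x0
#define DRM_XE_SYNC_FLAG_TIMELINE_SYNCOBJ	0x1
#define DRM_XE_SYNC_FLAG_DMA_BUF		0x2
#define DRM_XE_SYNC_FLAG_USER_FENCE		0x3
#define DRM_XE_SYNC_FLAG_SIGNAL			0x10

#define EXAMPLE_ALL_VM_CREATE_FLAGS (DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE | \
				     DRM_XE_VM_CREATE_FLAG_COMPUTE_MODE | \
				     DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT | \
				     DRM_XE_VM_CREATE_FLAG_FAULT_MODE)

/* Assumption: stands in for the kernel's SYNC_FLAGS_TYPE_MASK. */
#define EXAMPLE_SYNC_TYPE_MASK	0x3

/* Hypothetical: true if the flag combination passes the checks shown in the diff. */
static bool example_vm_create_flags_ok(uint32_t flags)
{
	if (flags & ~(uint32_t)EXAMPLE_ALL_VM_CREATE_FLAGS)
		return false;	/* unknown bits */
	if ((flags & DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE) &&
	    (flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE))
		return false;	/* scratch page excludes fault mode */
	if ((flags & DRM_XE_VM_CREATE_FLAG_COMPUTE_MODE) &&
	    (flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE))
		return false;	/* compute mode excludes fault mode */
	return true;
}

struct example_sync {
	uint32_t type;	/* one of the DRM_XE_SYNC_FLAG_* type values */
	bool signal;	/* DRM_XE_SYNC_FLAG_SIGNAL was set */
};

/* Hypothetical: returns 0 on success, -1 where the kernel would return -EINVAL. */
static int example_sync_decode(uint32_t flags, struct example_sync *out)
{
	assert(out);
	if (flags & ~(uint32_t)(EXAMPLE_SYNC_TYPE_MASK | DRM_XE_SYNC_FLAG_SIGNAL))
		return -1;
	out->type = flags & EXAMPLE_SYNC_TYPE_MASK;
	out->signal = (flags & DRM_XE_SYNC_FLAG_SIGNAL) != 0;
	return 0;
}
```

The sync constants now all carry _FLAG, but only DRM_XE_SYNC_FLAG_SIGNAL is a bit flag; the four type values are an enumeration stored in the low bits, which is why they are compared after masking rather than tested with a bitwise AND.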

* [Intel-xe] [PATCH v3 09/43] drm/xe/uapi: Make constant comments visible in kernel doc
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (7 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 08/43] drm/xe/uapi: Add _FLAG to uAPI constants usable for flags Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 10/43] drm/xe/uapi: Change rsvd to pad in struct drm_xe_class_instance Francois Dugast
                   ` (37 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

As there is no direct way to make comments on constants visible in
the kernel doc, move them to the description of the structure in
which the constants are used. They then appear in the "Description"
section of the struct documentation.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 include/uapi/drm/xe_drm.h | 264 ++++++++++++++++++++++----------------
 1 file changed, 150 insertions(+), 114 deletions(-)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 145d8b19dfca..697973fff24c 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -126,23 +126,40 @@ struct xe_user_extension {
 #define DRM_IOCTL_XE_WAIT_USER_FENCE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence)
 #define DRM_IOCTL_XE_VM_MADVISE			 DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_MADVISE, struct drm_xe_vm_madvise)
 
-/** struct drm_xe_engine_class_instance - instance of an engine class */
+/**
+ * struct drm_xe_engine_class_instance - instance of an engine class
+ *
+ * The @engine_class can be:
+ *  - %DRM_XE_ENGINE_CLASS_RENDER
+ *  - %DRM_XE_ENGINE_CLASS_COPY
+ *  - %DRM_XE_ENGINE_CLASS_VIDEO_DECODE
+ *  - %DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE
+ *  - %DRM_XE_ENGINE_CLASS_COMPUTE
+ *  - %DRM_XE_ENGINE_CLASS_VM_BIND_ASYNC - Kernel only class (not actual
+ *    hardware engine class) used for creating ordered queues of
+ *    asynchronous VM bind operations.
+ *  - %DRM_XE_ENGINE_CLASS_VM_BIND_SYNC - Kernel only class (not actual
+ *    synchronous VM bind operations.
+ *
+ */
 struct drm_xe_engine_class_instance {
 #define DRM_XE_ENGINE_CLASS_RENDER		0
 #define DRM_XE_ENGINE_CLASS_COPY		1
 #define DRM_XE_ENGINE_CLASS_VIDEO_DECODE	2
 #define DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE	3
 #define DRM_XE_ENGINE_CLASS_COMPUTE		4
-	/*
-	 * Kernel only classes (not actual hardware engine class). Used for
-	 * creating ordered queues of VM bind operations.
-	 */
 #define DRM_XE_ENGINE_CLASS_VM_BIND_ASYNC	5
 #define DRM_XE_ENGINE_CLASS_VM_BIND_SYNC	6
+	/**
+	 * @engine_class: Class of this instance among possible
+	 * DRM_XE_ENGINE_CLASS_*
+	 */
 	__u16 engine_class;
-
+	/** @engine_instance: Engine instance */
 	__u16 engine_instance;
+	/** @gt_id: GT ID the instance is associated with */
 	__u16 gt_id;
+	/** @rsvd: Reserved */
 	__u16 rsvd;
 };
 
@@ -313,6 +330,24 @@ struct drm_xe_query_mem_usage {
  * If a query is made with a struct drm_xe_device_query where .query
  * is equal to DRM_XE_DEVICE_QUERY_CONFIG, then the reply uses
  * struct drm_xe_query_config in .data.
+ *
+ * The index in @info can be:
+ *  - %DRM_XE_QUERY_CONFIG_REV_AND_DEVICE_ID - Device ID (lower 16 bits)
+ *    and the device revision (next 8 bits)
+ *  - %DRM_XE_QUERY_CONFIG_FLAGS - Flags describing the device
+ *    configuration, see list below
+ *
+ *    - %DRM_XE_QUERY_CONFIG_FLAGS_HAS_VRAM - Flag is set if the device
+ *      has usable VRAM
+ *  - %DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT - Minimal memory alignment
+ *    required by this device, typically SZ_4K or SZ_64K
+ *  - %DRM_XE_QUERY_CONFIG_VA_BITS - Maximum bits of a virtual address
+ *  - %DRM_XE_QUERY_CONFIG_GT_COUNT - Total number of GTs for the entire
+ *    device
+ *  - %DRM_XE_QUERY_CONFIG_MEM_REGION_COUNT - Total number of accessible
+ *    memory regions
+ *  - %DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY - Value of the highest
+ *    available exec queue priority
  */
 struct drm_xe_query_config {
 	/** @num_params: number of parameters returned in info */
@@ -321,39 +356,13 @@ struct drm_xe_query_config {
 	/** @pad: MBZ */
 	__u32 pad;
 
-	/*
-	 * Device ID (lower 16 bits) and the device revision (next
-	 * 8 bits)
-	 */
 #define DRM_XE_QUERY_CONFIG_REV_AND_DEVICE_ID		0
-	/*
-	 * Flags describing the device configuration, see list below
-	 */
 #define DRM_XE_QUERY_CONFIG_FLAGS			1
-	/*
-	 * Flag is set if the device has usable VRAM
-	 */
 	#define DRM_XE_QUERY_CONFIG_FLAGS_HAS_VRAM	(0x1 << 0)
-	/*
-	 * Minimal memory alignment required by this device,
-	 * typically SZ_4K or SZ_64K
-	 */
 #define DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT		2
-	/*
-	 * Maximum bits of a virtual address
-	 */
 #define DRM_XE_QUERY_CONFIG_VA_BITS			3
-	/*
-	 * Total number of GTs for the entire device
-	 */
 #define DRM_XE_QUERY_CONFIG_GT_COUNT			4
-	/*
-	 * Total number of accessible memory regions
-	 */
 #define DRM_XE_QUERY_CONFIG_MEM_REGION_COUNT		5
-	/*
-	 * Value of the highest available exec queue priority
-	 */
 #define DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY	6
 	/** @info: array of elements containing the config info */
 	__u64 info[];
@@ -366,6 +375,7 @@ struct drm_xe_query_config {
  * existing GT individual descriptions.
  * Graphics Technology (GT) is a subset of a GPU/tile that is responsible for
  * implementing graphics and/or media operations.
+ *
  */
 struct drm_xe_query_gt {
 #define DRM_XE_QUERY_GT_TYPE_MAIN		0
@@ -423,34 +433,31 @@ struct drm_xe_query_gt_list {
  * If a query is made with a struct drm_xe_device_query where .query
  * is equal to DRM_XE_DEVICE_QUERY_GT_TOPOLOGY, then the reply uses
  * struct drm_xe_query_topology_mask in .data.
+ *
+ * The @type can be:
+ *  - %DRM_XE_TOPO_DSS_GEOMETRY - To query the mask of Dual Sub Slices
+ *    (DSS) available for geometry operations. For example a query response
+ *    containing the following in mask:
+ *    ``DSS_GEOMETRY    ff ff ff ff 00 00 00 00``
+ *    means 32 DSS are available for geometry.
+ *  - %DRM_XE_TOPO_DSS_COMPUTE - To query the mask of Dual Sub Slices
+ *    (DSS) available for compute operations. For example a query response
+ *    containing the following in mask:
+ *    ``DSS_COMPUTE    ff ff ff ff 00 00 00 00``
+ *    means 32 DSS are available for compute.
+ *  - %DRM_XE_TOPO_EU_PER_DSS - To query the mask of Execution Units (EU)
+ *    available per Dual Sub Slices (DSS). For example a query response
+ *    containing the following in mask:
+ *    ``EU_PER_DSS    ff ff 00 00 00 00 00 00``
+ *    means each DSS has 16 EU.
+ *
  */
 struct drm_xe_query_topology_mask {
 	/** @gt_id: GT ID the mask is associated with */
 	__u16 gt_id;
 
-	/*
-	 * To query the mask of Dual Sub Slices (DSS) available for geometry
-	 * operations. For example a query response containing the following
-	 * in mask:
-	 *   DSS_GEOMETRY    ff ff ff ff 00 00 00 00
-	 * means 32 DSS are available for geometry.
-	 */
 #define DRM_XE_TOPO_DSS_GEOMETRY	(1 << 0)
-	/*
-	 * To query the mask of Dual Sub Slices (DSS) available for compute
-	 * operations. For example a query response containing the following
-	 * in mask:
-	 *   DSS_COMPUTE    ff ff ff ff 00 00 00 00
-	 * means 32 DSS are available for compute.
-	 */
 #define DRM_XE_TOPO_DSS_COMPUTE		(1 << 1)
-	/*
-	 * To query the mask of Execution Units (EU) available per Dual Sub
-	 * Slices (DSS). For example a query response containing the following
-	 * in mask:
-	 *   EU_PER_DSS    ff ff 00 00 00 00 00 00
-	 * means each DSS has 16 EU.
-	 */
 #define DRM_XE_TOPO_EU_PER_DSS		(1 << 2)
 	/** @type: type of mask */
 	__u16 type;
@@ -512,6 +519,19 @@ struct drm_xe_query_uc_fw_version {
  * and sets the value in the query member. This determines the type of
  * the structure provided by the driver in data, among struct drm_xe_query_*.
  *
+ * The @query can be:
+ *  - %DRM_XE_DEVICE_QUERY_ENGINES
+ *  - %DRM_XE_DEVICE_QUERY_MEM_USAGE
+ *  - %DRM_XE_DEVICE_QUERY_CONFIG
+ *  - %DRM_XE_DEVICE_QUERY_GT_LIST
+ *  - %DRM_XE_DEVICE_QUERY_HWCONFIG - Query type to retrieve the hardware
+ *    configuration of the device such as information on slices, memory,
+ *    caches, and so on. It is provided as a table of key / value
+ *    attributes.
+ *  - %DRM_XE_DEVICE_QUERY_GT_TOPOLOGY
+ *  - %DRM_XE_DEVICE_QUERY_ENGINE_CYCLES
+ *  - %DRM_XE_DEVICE_QUERY_UC_FW_VERSION
+ *
  * If size is set to 0, the driver fills it with the required size for
  * the requested type of data to query. If size is equal to the required
  * size, the queried information is copied into data. If size is set to
@@ -554,11 +574,6 @@ struct drm_xe_device_query {
 #define DRM_XE_DEVICE_QUERY_MEM_USAGE		1
 #define DRM_XE_DEVICE_QUERY_CONFIG		2
 #define DRM_XE_DEVICE_QUERY_GT_LIST		3
-	/*
-	 * Query type to retrieve the hardware configuration of the device
-	 * such as information on slices, memory, caches, and so on. It is
-	 * provided as a table of attributes (key / value).
-	 */
 #define DRM_XE_DEVICE_QUERY_HWCONFIG		4
 #define DRM_XE_DEVICE_QUERY_GT_TOPOLOGY		5
 #define DRM_XE_DEVICE_QUERY_ENGINE_CYCLES	6
@@ -576,6 +591,29 @@ struct drm_xe_device_query {
 	__u64 reserved[2];
 };
 
+/**
+ * struct drm_xe_gem_create - structure for gem creation
+ *
+ * The @flags can be:
+ *  - %DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING
+ *  - %DRM_XE_GEM_CREATE_FLAG_SCANOUT
+ *  - %DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM - When using VRAM as a
+ *    possible placement, ensure that the corresponding VRAM allocation
+ *    will always use the CPU accessible part of VRAM. This is important
+ *    for small-bar systems (on full-bar systems this gets turned into a
+ *    noop).
+ *    Note1: System memory can be used as an extra placement if the kernel
+ *    should spill the allocation to system memory, if space can't be made
+ *    available in the CPU accessible part of VRAM (giving the same
+ *    behaviour as the i915 interface, see
+ *    I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS).
+ *    Note2: For clear-color CCS surfaces the kernel needs to read the
+ *    clear-color value stored in the buffer, and on discrete platforms we
+ *    need to use VRAM for display surfaces, therefore the kernel requires
+ *    setting this flag for such objects, otherwise an error is thrown on
+ *    small-bar systems.
+ *
+ */
 struct drm_xe_gem_create {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
@@ -589,21 +627,6 @@ struct drm_xe_gem_create {
 
 #define DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING		(0x1 << 24)
 #define DRM_XE_GEM_CREATE_FLAG_SCANOUT			(0x1 << 25)
-/*
- * When using VRAM as a possible placement, ensure that the corresponding VRAM
- * allocation will always use the CPU accessible part of VRAM. This is important
- * for small-bar systems (on full-bar systems this gets turned into a noop).
- *
- * Note: System memory can be used as an extra placement if the kernel should
- * spill the allocation to system memory, if space can't be made available in
- * the CPU accessible part of VRAM (giving the same behaviour as the i915
- * interface, see I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS).
- *
- * Note: For clear-color CCS surfaces the kernel needs to read the clear-color
- * value stored in the buffer, and on discrete platforms we need to use VRAM for
- * display surfaces, therefore the kernel requires setting this flag for such
- * objects, otherwise an error is thrown on small-bar systems.
- */
 #define DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM	(0x1 << 26)
 	/**
 	 * @flags: Flags, currently a mask of memory instances of where BO can
@@ -700,6 +723,30 @@ struct drm_xe_vm_destroy {
 	__u64 reserved[2];
 };
 
+/**
+ * struct drm_xe_vm_bind_op - run bind operations
+ *
+ * The @op can be:
+ *  - %DRM_XE_VM_BIND_OP_MAP
+ *  - %DRM_XE_VM_BIND_OP_UNMAP
+ *  - %DRM_XE_VM_BIND_OP_MAP_USERPTR
+ *  - %DRM_XE_VM_BIND_OP_UNMAP_ALL
+ *  - %DRM_XE_VM_BIND_OP_PREFETCH
+ *
+ * and the @flags can be:
+ *  - %DRM_XE_VM_BIND_FLAG_READONLY
+ *  - %DRM_XE_VM_BIND_FLAG_ASYNC
+ *  - %DRM_XE_VM_BIND_FLAG_IMMEDIATE - Valid on a faulting VM only, do the
+ *    MAP operation immediately rather than deferring the MAP to the page
+ *    fault handler.
+ *  - %DRM_XE_VM_BIND_FLAG_NULL - When the NULL flag is set, the page
+ *    tables are setup with a special bit which indicates writes are
+ *    dropped and all reads return zero. In the future, the NULL flags
+ *    will only be valid for DRM_XE_VM_BIND_OP_MAP operations, the BO
+ *    handle MBZ, and the BO offset MBZ. This flag is intended to
+ *    implement VK sparse bindings.
+ *
+ */
 struct drm_xe_vm_bind_op {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
@@ -747,23 +794,12 @@ struct drm_xe_vm_bind_op {
 
 #define DRM_XE_VM_BIND_FLAG_READONLY	(0x1 << 0)
 #define DRM_XE_VM_BIND_FLAG_ASYNC	(0x1 << 1)
-	/*
-	 * Valid on a faulting VM only, do the MAP operation immediately rather
-	 * than deferring the MAP to the page fault handler.
-	 */
 #define DRM_XE_VM_BIND_FLAG_IMMEDIATE	(0x1 << 2)
-	/*
-	 * When the NULL flag is set, the page tables are setup with a special
-	 * bit which indicates writes are dropped and all reads return zero.  In
-	 * the future, the NULL flags will only be valid for DRM_XE_VM_BIND_OP_MAP
-	 * operations, the BO handle MBZ, and the BO offset MBZ. This flag is
-	 * intended to implement VK sparse bindings.
-	 */
 #define DRM_XE_VM_BIND_FLAG_NULL	(0x1 << 3)
 	/** @flags: Bind flags */
 	__u32 flags;
 
-	/** @mem_region: Memory region to prefetch VMA to, instance not a mask */
+	/** @region: Memory region to prefetch VMA to, instance not a mask */
 	__u32 region;
 
 	/** @reserved: Reserved */
@@ -1063,6 +1099,35 @@ struct drm_xe_wait_user_fence {
 	__u64 reserved[2];
 };
 
+/**
+ * struct drm_xe_vm_madvise - give advice about use of memory
+ *
+ * The @property can be:
+ *  - %DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS - Setting the preferred
+ *    location will trigger a migrate of the VMA backing store to new
+ *    location if the backing store is already allocated.
+ *    For DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS usage, see enum
+ *    drm_xe_memory_class.
+ *  - %DRM_XE_VM_MADVISE_PREFERRED_GT
+ *  - %DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS_GT - In this case lower 32 bits
+ *    are mem class, upper 32 are GT. Combination provides a single IOCTL
+ *    plus migrate VMA to preferred location.
+ *  - %DRM_XE_VM_MADVISE_CPU_ATOMIC - The CPU will do atomic memory
+ *    operations to this VMA. Must be set on some devices for atomics to
+ *    behave correctly.
+ *  - %DRM_XE_VM_MADVISE_DEVICE_ATOMIC - The device will do atomic memory
+ *    operations to this VMA. Must be set on some devices for atomics to
+ *    behave correctly.
+ *  - %DRM_XE_VM_MADVISE_PRIORITY - Priority WRT to eviction (moving from
+ *    preferred memory location due to memory pressure). The lower the
+ *    priority, the more likely to be evicted.
+ *
+ *    - %DRM_XE_VMA_PRIORITY_LOW
+ *    - %DRM_XE_VMA_PRIORITY_NORMAL - Default
+ *    - %DRM_XE_VMA_PRIORITY_HIGH - Must be user with elevated privileges
+ *  - %DRM_XE_VM_MADVISE_PIN - Pin the VMA in memory, must be user with
+ *    elevated privileges
+ */
 struct drm_xe_vm_madvise {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
@@ -1079,44 +1144,15 @@ struct drm_xe_vm_madvise {
 	/** @addr: Address of the VMA to operation on */
 	__u64 addr;
 
-	/*
-	 * Setting the preferred location will trigger a migrate of the VMA
-	 * backing store to new location if the backing store is already
-	 * allocated.
-	 *
-	 * For DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS usage, see enum
-	 * drm_xe_memory_class.
-	 */
 #define DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS		0
 #define DRM_XE_VM_MADVISE_PREFERRED_GT			1
-	/*
-	 * In this case lower 32 bits are mem class, upper 32 are GT.
-	 * Combination provides a single IOCTL plus migrate VMA to preferred
-	 * location.
-	 */
 #define DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS_GT	2
-	/*
-	 * The CPU will do atomic memory operations to this VMA. Must be set on
-	 * some devices for atomics to behave correctly.
-	 */
 #define DRM_XE_VM_MADVISE_CPU_ATOMIC			3
-	/*
-	 * The device will do atomic memory operations to this VMA. Must be set
-	 * on some devices for atomics to behave correctly.
-	 */
 #define DRM_XE_VM_MADVISE_DEVICE_ATOMIC			4
-	/*
-	 * Priority WRT to eviction (moving from preferred memory location due
-	 * to memory pressure). The lower the priority, the more likely to be
-	 * evicted.
-	 */
 #define DRM_XE_VM_MADVISE_PRIORITY			5
 #define		DRM_XE_VMA_PRIORITY_LOW			0
-		/* Default */
 #define		DRM_XE_VMA_PRIORITY_NORMAL		1
-		/* Must be user with elevated privileges */
 #define		DRM_XE_VMA_PRIORITY_HIGH		2
-	/* Pin the VMA in memory, must be user with elevated privileges */
 #define DRM_XE_VM_MADVISE_PIN				6
 	/** @property: property to set */
 	__u32 property;
-- 
2.34.1
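
The topology masks documented in the kernel-doc of this patch can be read by counting set bits. The following is a minimal sketch, not driver code: the helper name `count_mask_bits` is an assumption made for illustration and is not part of xe_drm.h, and the byte values in the usage note are the ones quoted in the comments above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the uAPI): count the bits set in a
 * topology mask buffer such as the one returned in
 * struct drm_xe_query_topology_mask.
 */
unsigned int count_mask_bits(const uint8_t *mask, size_t num_bytes)
{
	unsigned int count = 0;

	for (size_t i = 0; i < num_bytes; i++) {
		uint8_t byte = mask[i];

		/* Clear the lowest set bit until the byte is empty. */
		while (byte) {
			byte &= (uint8_t)(byte - 1);
			count++;
		}
	}

	return count;
}
```

Fed the DSS_GEOMETRY example bytes (ff ff ff ff 00 00 00 00) this returns 32, and for the EU_PER_DSS example (ff ff 00 00 00 00 00 00) it returns 16, which matches the documented interpretation.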



* [Intel-xe] [PATCH v3 10/43] drm/xe/uapi: Change rsvd to pad in struct drm_xe_class_instance
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (8 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 09/43] drm/xe/uapi: Make constant comments visible in kernel doc Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 11/43] drm/xe/uapi: Remove GT_TYPE_REMOTE Francois Dugast
                   ` (36 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

Change rsvd to pad in struct drm_xe_engine_class_instance to prevent
the field from being used in the future.

v2: Change from fixup to regular commit because this touches the
    uAPI (Francois Dugast)

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 5 ++++-
 include/uapi/drm/xe_drm.h     | 4 ++--
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 995930e47ea2..808244e668a3 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -215,7 +215,10 @@ static int query_engines(struct xe_device *xe,
 				xe_to_user_engine_class[hwe->class];
 			hw_engine_info[i].engine_instance =
 				hwe->logical_instance;
-			hw_engine_info[i++].gt_id = gt->info.id;
+			hw_engine_info[i].gt_id = gt->info.id;
+			hw_engine_info[i].pad = 0;
+
+			i++;
 		}
 
 	if (copy_to_user(query_ptr, hw_engine_info, size)) {
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 697973fff24c..5d7836f8137d 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -159,8 +159,8 @@ struct drm_xe_engine_class_instance {
 	__u16 engine_instance;
 	/** @gt_id: GT ID the instance is associated with */
 	__u16 gt_id;
-	/** @rsvd: Reserved */
-	__u16 rsvd;
+	/** @pad: MBZ */
+	__u16 pad;
 };
 
 /**
-- 
2.34.1
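
To illustrate the MBZ (must be zero) convention that the rename above makes explicit, here is a small sketch. It assumes a local mirror of the struct (`struct engine_class_instance`, with `engine_class` as the first field) and two hypothetical helpers; none of these names come from the patch, which only zeroes `pad` when filling the query reply.

```c
#include <stdint.h>
#include <string.h>

/*
 * Local mirror of struct drm_xe_engine_class_instance for illustration
 * only; the real definition lives in include/uapi/drm/xe_drm.h.
 */
struct engine_class_instance {
	uint16_t engine_class;
	uint16_t engine_instance;
	uint16_t gt_id;
	uint16_t pad; /* MBZ */
};

/* Hypothetical helper: fill an instance while leaving pad zeroed. */
void engine_instance_init(struct engine_class_instance *eci,
			  uint16_t engine_class, uint16_t engine_instance,
			  uint16_t gt_id)
{
	/* Zero everything first so that pad never carries stale data. */
	memset(eci, 0, sizeof(*eci));
	eci->engine_class = engine_class;
	eci->engine_instance = engine_instance;
	eci->gt_id = gt_id;
}

/* Hypothetical check: returns 1 when the MBZ field is zero, 0 otherwise. */
int engine_instance_pad_is_zero(const struct engine_class_instance *eci)
{
	return eci->pad == 0;
}
```

Zeroing the whole struct before filling it keeps the padding usable for a future extension, since a non-zero value can then only mean that a caller set it on purpose.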



* [Intel-xe] [PATCH v3 11/43] drm/xe/uapi: Remove GT_TYPE_REMOTE
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (9 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 10/43] drm/xe/uapi: Change rsvd to pad in struct drm_xe_class_instance Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 12/43] drm/xe/uapi: Kill VM_MADVISE IOCTL Francois Dugast
                   ` (35 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Carl Zhang, Francois Dugast, Matt Roper, Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

With the split between tile and GT, this is currently unused.
It also causes confusion, because main vs. remote is more a
concept of the tile itself than of the GT.

So the MAIN one is the traditional GT used for every operation
on older platforms, and for render/graphics and compute on platforms
that contain the stand-alone Media GT.

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 2 --
 include/uapi/drm/xe_drm.h     | 5 ++---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 808244e668a3..be5cfb29216b 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -376,8 +376,6 @@ static int query_gt_list(struct xe_device *xe, struct drm_xe_device_query *query
 	for_each_gt(gt, xe, id) {
 		if (xe_gt_is_media_type(gt))
 			gt_list->gt_list[id].type = DRM_XE_QUERY_GT_TYPE_MEDIA;
-		else if (gt_to_tile(gt)->id > 0)
-			gt_list->gt_list[id].type = DRM_XE_QUERY_GT_TYPE_REMOTE;
 		else
 			gt_list->gt_list[id].type = DRM_XE_QUERY_GT_TYPE_MAIN;
 		gt_list->gt_list[id].gt_id = gt->info.id;
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 5d7836f8137d..2fa0d1f5b47a 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -379,9 +379,8 @@ struct drm_xe_query_config {
  */
 struct drm_xe_query_gt {
 #define DRM_XE_QUERY_GT_TYPE_MAIN		0
-#define DRM_XE_QUERY_GT_TYPE_REMOTE		1
-#define DRM_XE_QUERY_GT_TYPE_MEDIA		2
-	/** @type: GT type: Main, Remote, or Media */
+#define DRM_XE_QUERY_GT_TYPE_MEDIA		1
+	/** @type: GT type: Main or Media */
 	__u16 type;
 	/** @gt_id: Unique ID of this GT within the PCI Device */
 	__u16 gt_id;
-- 
2.34.1
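
The effect of this patch on the reported GT type can be sketched as follows. This is an illustration under stated assumptions: `struct example_gt` and `example_gt_query_type` are hypothetical stand-ins for the driver's GT state and the logic in query_gt_list(); only the two macro values are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as they stand after this patch. */
#define DRM_XE_QUERY_GT_TYPE_MAIN	0
#define DRM_XE_QUERY_GT_TYPE_MEDIA	1

/* Hypothetical stand-in for the driver's per-GT state. */
struct example_gt {
	bool is_media;
	uint8_t tile_id;
};

/* Hypothetical helper mirroring the simplified branch in query_gt_list(). */
uint16_t example_gt_query_type(const struct example_gt *gt)
{
	/* The tile ID no longer influences the reported type. */
	return gt->is_media ? DRM_XE_QUERY_GT_TYPE_MEDIA
			    : DRM_XE_QUERY_GT_TYPE_MAIN;
}
```

A non-media GT on tile 1, which was previously reported as REMOTE, is now reported as MAIN.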



* [Intel-xe] [PATCH v3 12/43] drm/xe/uapi: Kill VM_MADVISE IOCTL
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (10 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 11/43] drm/xe/uapi: Remove GT_TYPE_REMOTE Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 13/43] drm/xe/uapi: Separate bo_create placement from flags Francois Dugast
                   ` (34 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

Remove the unused IOCTL.
Since no userspace uses it, it must be removed before the driver
can be accepted upstream.

At this point we are breaking compatibility for good, so that we
don't need to break it once we are in-tree. So let's also use this
breakage to sort the IOCTL entries and fix all the small
indentation and line issues.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/Makefile        |   1 -
 drivers/gpu/drm/xe/xe_bo.c         |   2 +-
 drivers/gpu/drm/xe/xe_bo_types.h   |   3 +
 drivers/gpu/drm/xe/xe_device.c     |   8 +-
 drivers/gpu/drm/xe/xe_vm_madvise.c | 299 -----------------------------
 drivers/gpu/drm/xe/xe_vm_madvise.h |  15 --
 include/uapi/drm/xe_drm.h          |  92 ++-------
 7 files changed, 18 insertions(+), 402 deletions(-)
 delete mode 100644 drivers/gpu/drm/xe/xe_vm_madvise.c
 delete mode 100644 drivers/gpu/drm/xe/xe_vm_madvise.h

diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index a1a8847e2ba3..84a007f8925d 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -114,7 +114,6 @@ xe-y += xe_bb.o \
 	xe_uc_debugfs.o \
 	xe_uc_fw.o \
 	xe_vm.o \
-	xe_vm_madvise.o \
 	xe_wait_user_fence.o \
 	xe_wa.o \
 	xe_wopcm.o
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index f01817c6d022..b955c89da42c 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1236,7 +1236,7 @@ struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 	bo->props.preferred_mem_class = XE_BO_PROPS_INVALID;
 	bo->props.preferred_gt = XE_BO_PROPS_INVALID;
 	bo->props.preferred_mem_type = XE_BO_PROPS_INVALID;
-	bo->ttm.priority = DRM_XE_VMA_PRIORITY_NORMAL;
+	bo->ttm.priority = XE_BO_PRIORITY_NORMAL;
 	INIT_LIST_HEAD(&bo->pinned_link);
 #ifdef CONFIG_PROC_FS
 	INIT_LIST_HEAD(&bo->client_link);
diff --git a/drivers/gpu/drm/xe/xe_bo_types.h b/drivers/gpu/drm/xe/xe_bo_types.h
index 051fe990c133..4bff60996168 100644
--- a/drivers/gpu/drm/xe/xe_bo_types.h
+++ b/drivers/gpu/drm/xe/xe_bo_types.h
@@ -19,6 +19,9 @@ struct xe_vm;
 
 #define XE_BO_MAX_PLACEMENTS	3
 
+/* TODO: To be selected with VM_MADVISE */
+#define	XE_BO_PRIORITY_NORMAL	1
+
 /** @xe_bo: XE buffer object */
 struct xe_bo {
 	/** @ttm: TTM base buffer object */
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 515cdf599fab..4ea24de135cf 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -36,7 +36,6 @@
 #include "xe_ttm_stolen_mgr.h"
 #include "xe_ttm_sys_mgr.h"
 #include "xe_vm.h"
-#include "xe_vm_madvise.h"
 #include "xe_wait_user_fence.h"
 #include "xe_hwmon.h"
 
@@ -117,18 +116,17 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(XE_VM_CREATE, xe_vm_create_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_VM_DESTROY, xe_vm_destroy_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_VM_BIND, xe_vm_bind_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XE_EXEC, xe_exec_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_EXEC_QUEUE_CREATE, xe_exec_queue_create_ioctl,
 			  DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(XE_EXEC_QUEUE_GET_PROPERTY, xe_exec_queue_get_property_ioctl,
-			  DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_EXEC_QUEUE_DESTROY, xe_exec_queue_destroy_ioctl,
 			  DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(XE_EXEC, xe_exec_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_EXEC_QUEUE_SET_PROPERTY, xe_exec_queue_set_property_ioctl,
 			  DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XE_EXEC_QUEUE_GET_PROPERTY, xe_exec_queue_get_property_ioctl,
+			  DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(XE_WAIT_USER_FENCE, xe_wait_user_fence_ioctl,
 			  DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(XE_VM_MADVISE, xe_vm_madvise_ioctl, DRM_RENDER_ALLOW),
 };
 
 static const struct file_operations xe_driver_fops = {
diff --git a/drivers/gpu/drm/xe/xe_vm_madvise.c b/drivers/gpu/drm/xe/xe_vm_madvise.c
deleted file mode 100644
index 72d051ecac5c..000000000000
--- a/drivers/gpu/drm/xe/xe_vm_madvise.c
+++ /dev/null
@@ -1,299 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2021 Intel Corporation
- */
-
-#include "xe_vm_madvise.h"
-
-#include <linux/nospec.h>
-
-#include <drm/ttm/ttm_tt.h>
-#include <drm/xe_drm.h>
-
-#include "xe_bo.h"
-#include "xe_vm.h"
-
-static int madvise_preferred_mem_class(struct xe_device *xe, struct xe_vm *vm,
-				       struct xe_vma **vmas, int num_vmas,
-				       u64 value)
-{
-	int i, err;
-
-	if (XE_IOCTL_DBG(xe, value > DRM_XE_MEM_REGION_CLASS_VRAM))
-		return -EINVAL;
-
-	if (XE_IOCTL_DBG(xe, value == DRM_XE_MEM_REGION_CLASS_VRAM &&
-			 !xe->info.is_dgfx))
-		return -EINVAL;
-
-	for (i = 0; i < num_vmas; ++i) {
-		struct xe_bo *bo;
-
-		bo = xe_vma_bo(vmas[i]);
-
-		err = xe_bo_lock(bo, true);
-		if (err)
-			return err;
-		bo->props.preferred_mem_class = value;
-		xe_bo_placement_for_flags(xe, bo, bo->flags);
-		xe_bo_unlock(bo);
-	}
-
-	return 0;
-}
-
-static int madvise_preferred_gt(struct xe_device *xe, struct xe_vm *vm,
-				struct xe_vma **vmas, int num_vmas, u64 value)
-{
-	int i, err;
-
-	if (XE_IOCTL_DBG(xe, value > xe->info.tile_count))
-		return -EINVAL;
-
-	for (i = 0; i < num_vmas; ++i) {
-		struct xe_bo *bo;
-
-		bo = xe_vma_bo(vmas[i]);
-
-		err = xe_bo_lock(bo, true);
-		if (err)
-			return err;
-		bo->props.preferred_gt = value;
-		xe_bo_placement_for_flags(xe, bo, bo->flags);
-		xe_bo_unlock(bo);
-	}
-
-	return 0;
-}
-
-static int madvise_preferred_mem_class_gt(struct xe_device *xe,
-					  struct xe_vm *vm,
-					  struct xe_vma **vmas, int num_vmas,
-					  u64 value)
-{
-	int i, err;
-	u32 gt_id = upper_32_bits(value);
-	u32 mem_class = lower_32_bits(value);
-
-	if (XE_IOCTL_DBG(xe, mem_class > DRM_XE_MEM_REGION_CLASS_VRAM))
-		return -EINVAL;
-
-	if (XE_IOCTL_DBG(xe, mem_class == DRM_XE_MEM_REGION_CLASS_VRAM &&
-			 !xe->info.is_dgfx))
-		return -EINVAL;
-
-	if (XE_IOCTL_DBG(xe, gt_id > xe->info.tile_count))
-		return -EINVAL;
-
-	for (i = 0; i < num_vmas; ++i) {
-		struct xe_bo *bo;
-
-		bo = xe_vma_bo(vmas[i]);
-
-		err = xe_bo_lock(bo, true);
-		if (err)
-			return err;
-		bo->props.preferred_mem_class = mem_class;
-		bo->props.preferred_gt = gt_id;
-		xe_bo_placement_for_flags(xe, bo, bo->flags);
-		xe_bo_unlock(bo);
-	}
-
-	return 0;
-}
-
-static int madvise_cpu_atomic(struct xe_device *xe, struct xe_vm *vm,
-			      struct xe_vma **vmas, int num_vmas, u64 value)
-{
-	int i, err;
-
-	for (i = 0; i < num_vmas; ++i) {
-		struct xe_bo *bo;
-
-		bo = xe_vma_bo(vmas[i]);
-		if (XE_IOCTL_DBG(xe, !(bo->flags & XE_BO_CREATE_SYSTEM_BIT)))
-			return -EINVAL;
-
-		err = xe_bo_lock(bo, true);
-		if (err)
-			return err;
-		bo->props.cpu_atomic = !!value;
-
-		/*
-		 * All future CPU accesses must be from system memory only, we
-		 * just invalidate the CPU page tables which will trigger a
-		 * migration on next access.
-		 */
-		if (bo->props.cpu_atomic)
-			ttm_bo_unmap_virtual(&bo->ttm);
-		xe_bo_unlock(bo);
-	}
-
-	return 0;
-}
-
-static int madvise_device_atomic(struct xe_device *xe, struct xe_vm *vm,
-				 struct xe_vma **vmas, int num_vmas, u64 value)
-{
-	int i, err;
-
-	for (i = 0; i < num_vmas; ++i) {
-		struct xe_bo *bo;
-
-		bo = xe_vma_bo(vmas[i]);
-		if (XE_IOCTL_DBG(xe, !(bo->flags & XE_BO_CREATE_VRAM0_BIT) &&
-				 !(bo->flags & XE_BO_CREATE_VRAM1_BIT)))
-			return -EINVAL;
-
-		err = xe_bo_lock(bo, true);
-		if (err)
-			return err;
-		bo->props.device_atomic = !!value;
-		xe_bo_unlock(bo);
-	}
-
-	return 0;
-}
-
-static int madvise_priority(struct xe_device *xe, struct xe_vm *vm,
-			    struct xe_vma **vmas, int num_vmas, u64 value)
-{
-	int i, err;
-
-	if (XE_IOCTL_DBG(xe, value > DRM_XE_VMA_PRIORITY_HIGH))
-		return -EINVAL;
-
-	if (XE_IOCTL_DBG(xe, value == DRM_XE_VMA_PRIORITY_HIGH &&
-			 !capable(CAP_SYS_NICE)))
-		return -EPERM;
-
-	for (i = 0; i < num_vmas; ++i) {
-		struct xe_bo *bo;
-
-		bo = xe_vma_bo(vmas[i]);
-
-		err = xe_bo_lock(bo, true);
-		if (err)
-			return err;
-		bo->ttm.priority = value;
-		ttm_bo_move_to_lru_tail(&bo->ttm);
-		xe_bo_unlock(bo);
-	}
-
-	return 0;
-}
-
-static int madvise_pin(struct xe_device *xe, struct xe_vm *vm,
-		       struct xe_vma **vmas, int num_vmas, u64 value)
-{
-	drm_warn(&xe->drm, "NIY");
-	return 0;
-}
-
-typedef int (*madvise_func)(struct xe_device *xe, struct xe_vm *vm,
-			    struct xe_vma **vmas, int num_vmas, u64 value);
-
-static const madvise_func madvise_funcs[] = {
-	[DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS] = madvise_preferred_mem_class,
-	[DRM_XE_VM_MADVISE_PREFERRED_GT] = madvise_preferred_gt,
-	[DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS_GT] =
-		madvise_preferred_mem_class_gt,
-	[DRM_XE_VM_MADVISE_CPU_ATOMIC] = madvise_cpu_atomic,
-	[DRM_XE_VM_MADVISE_DEVICE_ATOMIC] = madvise_device_atomic,
-	[DRM_XE_VM_MADVISE_PRIORITY] = madvise_priority,
-	[DRM_XE_VM_MADVISE_PIN] = madvise_pin,
-};
-
-static struct xe_vma **
-get_vmas(struct xe_vm *vm, int *num_vmas, u64 addr, u64 range)
-{
-	struct xe_vma **vmas, **__vmas;
-	struct drm_gpuva *gpuva;
-	int max_vmas = 8;
-
-	lockdep_assert_held(&vm->lock);
-
-	vmas = kmalloc(max_vmas * sizeof(*vmas), GFP_KERNEL);
-	if (!vmas)
-		return NULL;
-
-	drm_gpuvm_for_each_va_range(gpuva, &vm->gpuvm, addr, addr + range) {
-		struct xe_vma *vma = gpuva_to_vma(gpuva);
-
-		if (xe_vma_is_userptr(vma))
-			continue;
-
-		if (*num_vmas == max_vmas) {
-			max_vmas <<= 1;
-			__vmas = krealloc(vmas, max_vmas * sizeof(*vmas),
-					  GFP_KERNEL);
-			if (!__vmas)
-				return NULL;
-			vmas = __vmas;
-		}
-
-		vmas[*num_vmas] = vma;
-		*num_vmas += 1;
-	}
-
-	return vmas;
-}
-
-int xe_vm_madvise_ioctl(struct drm_device *dev, void *data,
-			struct drm_file *file)
-{
-	struct xe_device *xe = to_xe_device(dev);
-	struct xe_file *xef = to_xe_file(file);
-	struct drm_xe_vm_madvise *args = data;
-	struct xe_vm *vm;
-	struct xe_vma **vmas = NULL;
-	int num_vmas = 0, err = 0, idx;
-
-	if (XE_IOCTL_DBG(xe, args->extensions) ||
-	    XE_IOCTL_DBG(xe, args->pad || args->pad2) ||
-	    XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
-		return -EINVAL;
-
-	if (XE_IOCTL_DBG(xe, args->property > ARRAY_SIZE(madvise_funcs)))
-		return -EINVAL;
-
-	vm = xe_vm_lookup(xef, args->vm_id);
-	if (XE_IOCTL_DBG(xe, !vm))
-		return -EINVAL;
-
-	if (XE_IOCTL_DBG(xe, !xe_vm_in_fault_mode(vm))) {
-		err = -EINVAL;
-		goto put_vm;
-	}
-
-	down_read(&vm->lock);
-
-	if (XE_IOCTL_DBG(xe, xe_vm_is_closed_or_banned(vm))) {
-		err = -ENOENT;
-		goto unlock_vm;
-	}
-
-	vmas = get_vmas(vm, &num_vmas, args->addr, args->range);
-	if (XE_IOCTL_DBG(xe, err))
-		goto unlock_vm;
-
-	if (XE_IOCTL_DBG(xe, !vmas)) {
-		err = -ENOMEM;
-		goto unlock_vm;
-	}
-
-	if (XE_IOCTL_DBG(xe, !num_vmas)) {
-		err = -EINVAL;
-		goto unlock_vm;
-	}
-
-	idx = array_index_nospec(args->property, ARRAY_SIZE(madvise_funcs));
-	err = madvise_funcs[idx](xe, vm, vmas, num_vmas, args->value);
-
-unlock_vm:
-	up_read(&vm->lock);
-put_vm:
-	xe_vm_put(vm);
-	kfree(vmas);
-	return err;
-}
diff --git a/drivers/gpu/drm/xe/xe_vm_madvise.h b/drivers/gpu/drm/xe/xe_vm_madvise.h
deleted file mode 100644
index eecd33acd248..000000000000
--- a/drivers/gpu/drm/xe/xe_vm_madvise.h
+++ /dev/null
@@ -1,15 +0,0 @@
-/* SPDX-License-Identifier: MIT */
-/*
- * Copyright © 2021 Intel Corporation
- */
-
-#ifndef _XE_VM_MADVISE_H_
-#define _XE_VM_MADVISE_H_
-
-struct drm_device;
-struct drm_file;
-
-int xe_vm_madvise_ioctl(struct drm_device *dev, void *data,
-			struct drm_file *file);
-
-#endif
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 2fa0d1f5b47a..2ed69b02a2e8 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -103,28 +103,26 @@ struct xe_user_extension {
 #define DRM_XE_VM_CREATE		0x03
 #define DRM_XE_VM_DESTROY		0x04
 #define DRM_XE_VM_BIND			0x05
-#define DRM_XE_EXEC_QUEUE_CREATE	0x06
-#define DRM_XE_EXEC_QUEUE_DESTROY	0x07
-#define DRM_XE_EXEC			0x08
+#define DRM_XE_EXEC			0x06
+#define DRM_XE_EXEC_QUEUE_CREATE	0x07
+#define DRM_XE_EXEC_QUEUE_DESTROY	0x08
 #define DRM_XE_EXEC_QUEUE_SET_PROPERTY	0x09
-#define DRM_XE_WAIT_USER_FENCE		0x0a
-#define DRM_XE_VM_MADVISE		0x0b
-#define DRM_XE_EXEC_QUEUE_GET_PROPERTY	0x0c
-
+#define DRM_XE_EXEC_QUEUE_GET_PROPERTY	0x0a
+#define DRM_XE_WAIT_USER_FENCE		0x0b
 /* Must be kept compact -- no holes */
+
 #define DRM_IOCTL_XE_DEVICE_QUERY		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_DEVICE_QUERY, struct drm_xe_device_query)
 #define DRM_IOCTL_XE_GEM_CREATE			DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_GEM_CREATE, struct drm_xe_gem_create)
 #define DRM_IOCTL_XE_GEM_MMAP_OFFSET		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_GEM_MMAP_OFFSET, struct drm_xe_gem_mmap_offset)
 #define DRM_IOCTL_XE_VM_CREATE			DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_CREATE, struct drm_xe_vm_create)
-#define DRM_IOCTL_XE_VM_DESTROY			 DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_DESTROY, struct drm_xe_vm_destroy)
-#define DRM_IOCTL_XE_VM_BIND			 DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_BIND, struct drm_xe_vm_bind)
+#define DRM_IOCTL_XE_VM_DESTROY			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_DESTROY, struct drm_xe_vm_destroy)
+#define DRM_IOCTL_XE_VM_BIND			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_BIND, struct drm_xe_vm_bind)
+#define DRM_IOCTL_XE_EXEC			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC, struct drm_xe_exec)
 #define DRM_IOCTL_XE_EXEC_QUEUE_CREATE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_CREATE, struct drm_xe_exec_queue_create)
+#define DRM_IOCTL_XE_EXEC_QUEUE_DESTROY		DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_DESTROY, struct drm_xe_exec_queue_destroy)
+#define DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY	DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_SET_PROPERTY, struct drm_xe_exec_queue_set_property)
 #define DRM_IOCTL_XE_EXEC_QUEUE_GET_PROPERTY	DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_GET_PROPERTY, struct drm_xe_exec_queue_get_property)
-#define DRM_IOCTL_XE_EXEC_QUEUE_DESTROY		 DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_DESTROY, struct drm_xe_exec_queue_destroy)
-#define DRM_IOCTL_XE_EXEC			 DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC, struct drm_xe_exec)
-#define DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY	 DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_SET_PROPERTY, struct drm_xe_exec_queue_set_property)
 #define DRM_IOCTL_XE_WAIT_USER_FENCE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence)
-#define DRM_IOCTL_XE_VM_MADVISE			 DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_MADVISE, struct drm_xe_vm_madvise)
 
 /**
  * struct drm_xe_engine_class_instance - instance of an engine class
@@ -1098,74 +1096,6 @@ struct drm_xe_wait_user_fence {
 	__u64 reserved[2];
 };
 
-/**
- * struct drm_xe_vm_madvise - give advice about use of memory
- *
- * The @property can be:
- *  - %DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS - Setting the preferred
- *    location will trigger a migrate of the VMA backing store to new
- *    location if the backing store is already allocated.
- *    For DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS usage, see enum
- *    drm_xe_memory_class.
- *  - %DRM_XE_VM_MADVISE_PREFERRED_GT
- *  - %DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS_GT - In this case lower 32 bits
- *    are mem class, upper 32 are GT. Combination provides a single IOCTL
- *    plus migrate VMA to preferred location.
- *  - %DRM_XE_VM_MADVISE_CPU_ATOMIC - The CPU will do atomic memory
- *    operations to this VMA. Must be set on some devices for atomics to
- *    behave correctly.
- *  - %DRM_XE_VM_MADVISE_DEVICE_ATOMIC - The device will do atomic memory
- *    operations to this VMA. Must be set on some devices for atomics to
- *    behave correctly.
- *  - %DRM_XE_VM_MADVISE_PRIORITY - Priority WRT to eviction (moving from
- *    preferred memory location due to memory pressure). The lower the
- *    priority, the more likely to be evicted.
- *
- *    - %DRM_XE_VMA_PRIORITY_LOW
- *    - %DRM_XE_VMA_PRIORITY_NORMAL - Default
- *    - %DRM_XE_VMA_PRIORITY_HIGH - Must be user with elevated privileges
- *  - %DRM_XE_VM_MADVISE_PIN - Pin the VMA in memory, must be user with
- *    elevated privileges
- */
-struct drm_xe_vm_madvise {
-	/** @extensions: Pointer to the first extension struct, if any */
-	__u64 extensions;
-
-	/** @vm_id: The ID VM in which the VMA exists */
-	__u32 vm_id;
-
-	/** @pad: MBZ */
-	__u32 pad;
-
-	/** @range: Number of bytes in the VMA */
-	__u64 range;
-
-	/** @addr: Address of the VMA to operation on */
-	__u64 addr;
-
-#define DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS		0
-#define DRM_XE_VM_MADVISE_PREFERRED_GT			1
-#define DRM_XE_VM_MADVISE_PREFERRED_MEM_CLASS_GT	2
-#define DRM_XE_VM_MADVISE_CPU_ATOMIC			3
-#define DRM_XE_VM_MADVISE_DEVICE_ATOMIC			4
-#define DRM_XE_VM_MADVISE_PRIORITY			5
-#define		DRM_XE_VMA_PRIORITY_LOW			0
-#define		DRM_XE_VMA_PRIORITY_NORMAL		1
-#define		DRM_XE_VMA_PRIORITY_HIGH		2
-#define DRM_XE_VM_MADVISE_PIN				6
-	/** @property: property to set */
-	__u32 property;
-
-	/** @pad2: MBZ */
-	__u32 pad2;
-
-	/** @value: property value */
-	__u64 value;
-
-	/** @reserved: Reserved */
-	__u64 reserved[2];
-};
-
 /**
  * DOC: XE PMU event config IDs
  *
-- 
2.34.1
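
The "Must be kept compact -- no holes" rule for the IOCTL numbers can be checked mechanically. The sketch below is illustrative only: the array and the function `ioctl_nrs_are_compact` are hypothetical names, and the table lists just the numbers visible in the hunk above (0x03 through 0x0b).

```c
#include <stdbool.h>
#include <stddef.h>

/* IOCTL numbers as renumbered by this patch, limited to the ones shown. */
const unsigned int example_xe_ioctl_nrs[] = {
	0x03, /* DRM_XE_VM_CREATE */
	0x04, /* DRM_XE_VM_DESTROY */
	0x05, /* DRM_XE_VM_BIND */
	0x06, /* DRM_XE_EXEC */
	0x07, /* DRM_XE_EXEC_QUEUE_CREATE */
	0x08, /* DRM_XE_EXEC_QUEUE_DESTROY */
	0x09, /* DRM_XE_EXEC_QUEUE_SET_PROPERTY */
	0x0a, /* DRM_XE_EXEC_QUEUE_GET_PROPERTY */
	0x0b, /* DRM_XE_WAIT_USER_FENCE */
};

/* Hypothetical check: true when an ascending list has no gaps. */
bool ioctl_nrs_are_compact(const unsigned int *nrs, size_t count)
{
	for (size_t i = 1; i < count; i++) {
		if (nrs[i] != nrs[i - 1] + 1)
			return false;
	}

	return true;
}
```

Simply dropping DRM_XE_VM_MADVISE (0x0b) from the old numbering would have left a gap between 0x0a and 0x0c, which is why the remaining entries are renumbered in the same patch.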



* [Intel-xe] [PATCH v3 13/43] drm/xe/uapi: Separate bo_create placement from flags
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (11 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 12/43] drm/xe/uapi: Kill VM_MADVISE IOCTL Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 14:58   ` Matthew Brost
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 14/43] drm/xe/uapi: Remove unused inaccessible memory region Francois Dugast
                   ` (33 subsequent siblings)
  46 siblings, 1 reply; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

Although the flags are about the creation, the memory placement
of the BO deserves a proper dedicated field in the uapi.

Besides being clearer, this also allows removing the 'magic' shifts
from the flags, which were a concern during the uapi reviews.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
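Not part of the patch: a minimal userspace-style sketch of the two
argument checks performed by the xe_bo.c hunk below, assuming a plain
integer stands in for xe->info.mem_region_mask. The helper name
check_gem_create_args() is hypothetical; the flag values are copied from
the xe_drm.h hunk of this patch.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values as defined by this patch (no longer shifted by 24). */
#define DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING		(1 << 0)
#define DRM_XE_GEM_CREATE_FLAG_SCANOUT			(1 << 1)
#define DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM	(1 << 2)

/*
 * Hypothetical helper mirroring the checks in xe_gem_create_ioctl().
 * mem_region_mask is an assumption standing in for
 * xe->info.mem_region_mask.
 */
static int check_gem_create_args(uint32_t placement, uint32_t flags,
				 uint32_t mem_region_mask)
{
	const uint32_t known_flags = DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING |
				     DRM_XE_GEM_CREATE_FLAG_SCANOUT |
				     DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM;

	/* at least one valid memory placement must be specified */
	if (!(placement & mem_region_mask))
		return -EINVAL;

	/* flags may only carry creation flags, never placement bits */
	if (flags & ~known_flags)
		return -EINVAL;

	return 0;
}
```

With a region mask of 0x3, a placement of 0x1 is accepted, while a
placement of 0x0 or 0x4 is rejected. An old-style flag such as
(0x1 << 24) is now rejected as an unknown flag. As in the hunk, a
placement is only rejected when it has no valid bit at all.
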
 drivers/gpu/drm/xe/xe_bo.c | 15 +++++++--------
 include/uapi/drm/xe_drm.h  | 12 ++++++------
 2 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index b955c89da42c..87971f4faa58 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1799,19 +1799,18 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
 	u32 handle;
 	int err;
 
-	if (XE_IOCTL_DBG(xe, args->extensions) || XE_IOCTL_DBG(xe, args->pad) ||
+	if (XE_IOCTL_DBG(xe, args->extensions) ||
 	    XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
 		return -EINVAL;
 
+	/* at least one valid memory placement must be specified */
+	if (XE_IOCTL_DBG(xe, !(args->placement & xe->info.mem_region_mask)))
+		return -EINVAL;
+
 	if (XE_IOCTL_DBG(xe, args->flags &
 			 ~(DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING |
 			   DRM_XE_GEM_CREATE_FLAG_SCANOUT |
-			   DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM |
-			   xe->info.mem_region_mask)))
-		return -EINVAL;
-
-	/* at least one memory type must be specified */
-	if (XE_IOCTL_DBG(xe, !(args->flags & xe->info.mem_region_mask)))
+			   DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM)))
 		return -EINVAL;
 
 	if (XE_IOCTL_DBG(xe, args->handle))
@@ -1832,7 +1831,7 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
 	if (args->flags & DRM_XE_GEM_CREATE_FLAG_SCANOUT)
 		bo_flags |= XE_BO_SCANOUT_BIT;
 
-	bo_flags |= args->flags << (ffs(XE_BO_CREATE_SYSTEM_BIT) - 1);
+	bo_flags |= args->placement << (ffs(XE_BO_CREATE_SYSTEM_BIT) - 1);
 
 	if (args->flags & DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM) {
 		if (XE_IOCTL_DBG(xe, !(bo_flags & XE_BO_CREATE_VRAM_MASK)))
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 2ed69b02a2e8..3685eeff4b8d 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -622,9 +622,12 @@ struct drm_xe_gem_create {
 	 */
 	__u64 size;
 
-#define DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING		(0x1 << 24)
-#define DRM_XE_GEM_CREATE_FLAG_SCANOUT			(0x1 << 25)
-#define DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM	(0x1 << 26)
+	/** @placement: A mask of memory instances of where BO can be placed. */
+	__u32 placement;
+
+#define DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING		(1 << 0)
+#define DRM_XE_GEM_CREATE_FLAG_SCANOUT			(1 << 1)
+#define DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM	(1 << 2)
 	/**
 	 * @flags: Flags, currently a mask of memory instances of where BO can
 	 * be placed
@@ -648,9 +651,6 @@ struct drm_xe_gem_create {
 	 */
 	__u32 handle;
 
-	/** @pad: MBZ */
-	__u32 pad;
-
 	/** @reserved: Reserved */
 	__u64 reserved[2];
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 14/43] drm/xe/uapi: Remove unused inaccessible memory region
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (12 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 13/43] drm/xe/uapi: Separate bo_create placement from flags Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 15/43] drm/xe/uapi: Remove unused QUERY_CONFIG_MEM_REGION_COUNT Francois Dugast
                   ` (32 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

This field is not used, and it is simply the complement of the other two
region masks combined: native_mem_regions and slow_mem_regions.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 include/uapi/drm/xe_drm.h | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 3685eeff4b8d..549db8cb4e21 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -396,11 +396,6 @@ struct drm_xe_query_gt {
 	 * they live on a different GPU/Tile.
 	 */
 	__u64 slow_mem_regions;
-	/**
-	 * @inaccessible_mem_regions: Bit mask of instances from
-	 * drm_xe_query_mem_usage that is not accessible by this GT at all.
-	 */
-	__u64 inaccessible_mem_regions;
 	/** @reserved: Reserved */
 	__u64 reserved[8];
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 15/43] drm/xe/uapi: Remove unused QUERY_CONFIG_MEM_REGION_COUNT
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (13 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 14/43] drm/xe/uapi: Remove unused inaccessible memory region Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 16/43] drm/xe/uapi: Remove unused QUERY_CONFIG_GT_COUNT Francois Dugast
                   ` (31 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

As part of uAPI cleanup, remove this constant which is not used. Memory
regions can be queried with DRM_XE_DEVICE_QUERY_MEM_USAGE.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
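Not part of the patch: the removed config entry was computed as
hweight_long(xe->info.mem_region_mask), i.e. the number of set bits in
the region mask. A userspace driver that still wants that number can read
num_regions from the memory region query, or count the bits of a region
mask it already holds. The sketch below shows the latter;
count_mem_regions() is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical helper: count the set bits of a memory region mask. */
static unsigned int count_mem_regions(uint64_t mem_region_mask)
{
	unsigned int count = 0;

	while (mem_region_mask) {
		/* clear the lowest set bit */
		mem_region_mask &= mem_region_mask - 1;
		count++;
	}

	return count;
}
```
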
 drivers/gpu/drm/xe/xe_query.c | 2 --
 include/uapi/drm/xe_drm.h     | 5 +----
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index be5cfb29216b..63321f8f1b9b 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -337,8 +337,6 @@ static int query_config(struct xe_device *xe, struct drm_xe_device_query *query)
 		xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K ? SZ_64K : SZ_4K;
 	config->info[DRM_XE_QUERY_CONFIG_VA_BITS] = xe->info.va_bits;
 	config->info[DRM_XE_QUERY_CONFIG_GT_COUNT] = xe->info.gt_count;
-	config->info[DRM_XE_QUERY_CONFIG_MEM_REGION_COUNT] =
-		hweight_long(xe->info.mem_region_mask);
 	config->info[DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY] =
 		xe_exec_queue_device_get_max_priority(xe);
 
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 549db8cb4e21..0073d660698b 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -342,8 +342,6 @@ struct drm_xe_query_mem_usage {
  *  - %DRM_XE_QUERY_CONFIG_VA_BITS - Maximum bits of a virtual address
  *  - %DRM_XE_QUERY_CONFIG_GT_COUNT - Total number of GTs for the entire
  *    device
- *  - %DRM_XE_QUERY_CONFIG_MEM_REGION_COUNT - Total number of accessible
- *    memory regions
  *  - %DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY - Value of the highest
  *    available exec queue priority
  */
@@ -360,8 +358,7 @@ struct drm_xe_query_config {
 #define DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT		2
 #define DRM_XE_QUERY_CONFIG_VA_BITS			3
 #define DRM_XE_QUERY_CONFIG_GT_COUNT			4
-#define DRM_XE_QUERY_CONFIG_MEM_REGION_COUNT		5
-#define DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY	6
+#define DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY	5
 	/** @info: array of elements containing the config info */
 	__u64 info[];
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 16/43] drm/xe/uapi: Remove unused QUERY_CONFIG_GT_COUNT
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (14 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 15/43] drm/xe/uapi: Remove unused QUERY_CONFIG_MEM_REGION_COUNT Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 17/43] drm/xe/uapi: Rename *_mem_regions masks Francois Dugast
                   ` (30 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

As part of uAPI cleanup, remove this constant which is not used. The number
of GTs is provided as num_gt in drm_xe_query_gt_list.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 1 -
 include/uapi/drm/xe_drm.h     | 5 +----
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 63321f8f1b9b..8c31c02ac2d7 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -336,7 +336,6 @@ static int query_config(struct xe_device *xe, struct drm_xe_device_query *query)
 	config->info[DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT] =
 		xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K ? SZ_64K : SZ_4K;
 	config->info[DRM_XE_QUERY_CONFIG_VA_BITS] = xe->info.va_bits;
-	config->info[DRM_XE_QUERY_CONFIG_GT_COUNT] = xe->info.gt_count;
 	config->info[DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY] =
 		xe_exec_queue_device_get_max_priority(xe);
 
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 0073d660698b..9cb7dfd129eb 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -340,8 +340,6 @@ struct drm_xe_query_mem_usage {
  *  - %DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT - Minimal memory alignment
  *    required by this device, typically SZ_4K or SZ_64K
  *  - %DRM_XE_QUERY_CONFIG_VA_BITS - Maximum bits of a virtual address
- *  - %DRM_XE_QUERY_CONFIG_GT_COUNT - Total number of GTs for the entire
- *    device
  *  - %DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY - Value of the highest
  *    available exec queue priority
  */
@@ -357,8 +355,7 @@ struct drm_xe_query_config {
 	#define DRM_XE_QUERY_CONFIG_FLAGS_HAS_VRAM	(0x1 << 0)
 #define DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT		2
 #define DRM_XE_QUERY_CONFIG_VA_BITS			3
-#define DRM_XE_QUERY_CONFIG_GT_COUNT			4
-#define DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY	5
+#define DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY	4
 	/** @info: array of elements containing the config info */
 	__u64 info[];
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 17/43] drm/xe/uapi: Rename *_mem_regions masks
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (15 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 16/43] drm/xe/uapi: Remove unused QUERY_CONFIG_GT_COUNT Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 18/43] drm/xe/uapi: Rename query's mem_usage to mem_regions Francois Dugast
                   ` (29 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

- 'native' doesn't make much sense on integrated devices.
- 'slow' is not necessarily true and is not a natural opposite of
  'native'.

Instead, let's use 'near' vs 'far'. It makes sense with all the current
Intel GPUs and it is future proof. Right now, there's absolutely no need
to define which of the 'far' memory regions are slower, whether in terms
of latency, number of hops or bandwidth.

In case this becomes a requirement in the future, a new query could be
added to indicate the 'distance' between a given engine and a
memory_region. But for now, this fulfills all of the current
requirements in the most straightforward way for the userspace drivers.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
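Not part of the patch: a small sketch of how the two masks relate, based
on the query_gt_list() hunk below. It assumes, as that hunk implies, that
bit 0 of the region mask is system memory and bit (tile id + 1) is the
VRAM of that tile. The helper names and the example masks are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Near regions of a GT: system memory on integrated devices, the VRAM of
 * the GT's own tile on discrete devices.
 */
static uint64_t near_mem_regions(bool is_dgfx, unsigned int tile_id)
{
	if (!is_dgfx)
		return 0x1;

	return (UINT64_C(1) << tile_id) << 1;
}

/* Far regions: every region of the device that is not near. */
static uint64_t far_mem_regions(uint64_t mem_region_mask, uint64_t near)
{
	return mem_region_mask ^ near;
}
```

For a two-tile discrete device with a region mask of 0x7, a GT on tile 0
gets near = 0x2 and far = 0x5 (system memory plus the other tile's VRAM).
An integrated device with a mask of 0x1 gets near = 0x1 and far = 0x0.
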
 drivers/gpu/drm/xe/xe_query.c |  8 ++++----
 include/uapi/drm/xe_drm.h     | 17 +++++++++--------
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 8c31c02ac2d7..573532f1bbb0 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -378,12 +378,12 @@ static int query_gt_list(struct xe_device *xe, struct drm_xe_device_query *query
 		gt_list->gt_list[id].gt_id = gt->info.id;
 		gt_list->gt_list[id].clock_freq = gt->info.clock_freq;
 		if (!IS_DGFX(xe))
-			gt_list->gt_list[id].native_mem_regions = 0x1;
+			gt_list->gt_list[id].near_mem_regions = 0x1;
 		else
-			gt_list->gt_list[id].native_mem_regions =
+			gt_list->gt_list[id].near_mem_regions =
 				BIT(gt_to_tile(gt)->id) << 1;
-		gt_list->gt_list[id].slow_mem_regions = xe->info.mem_region_mask ^
-			gt_list->gt_list[id].native_mem_regions;
+		gt_list->gt_list[id].far_mem_regions = xe->info.mem_region_mask ^
+			gt_list->gt_list[id].near_mem_regions;
 	}
 
 	if (copy_to_user(query_ptr, gt_list, size)) {
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 9cb7dfd129eb..89be8e2f9852 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -379,17 +379,18 @@ struct drm_xe_query_gt {
 	/** @clock_freq: A clock frequency for timestamp */
 	__u32 clock_freq;
 	/**
-	 * @native_mem_regions: Bit mask of instances from
-	 * drm_xe_query_mem_usage that lives on the same GPU/Tile and have
-	 * direct access.
+	 * @near_mem_regions: Bit mask of instances from
+	 * drm_xe_query_mem_usage that is near the current engines of this GT.
 	 */
-	__u64 native_mem_regions;
+	__u64 near_mem_regions;
 	/**
-	 * @slow_mem_regions: Bit mask of instances from
-	 * drm_xe_query_mem_usage that this GT can indirectly access, although
-	 * they live on a different GPU/Tile.
+	 * @far_mem_regions: Bit mask of instances from
+	 * drm_xe_query_mem_usage that is far from the engines of this GT.
+	 * In general, it has extra indirections when compared to the
+	 * @near_mem_regions. For a discrete device this could mean system
+	 * memory and memory living in a different Tile.
 	 */
-	__u64 slow_mem_regions;
+	__u64 far_mem_regions;
 	/** @reserved: Reserved */
 	__u64 reserved[8];
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 18/43] drm/xe/uapi: Rename query's mem_usage to mem_regions
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (16 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 17/43] drm/xe/uapi: Rename *_mem_regions masks Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 19/43] drm/xe: Make DRM_XE_DEVICE_QUERY_ENGINES future proof Francois Dugast
                   ` (28 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast, Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

'Usage' gives the impression of telemetry information, where someone
would query to see how the memory is currently used, the available
size, etc. However, this API is more than that. It provides a global
view of all the memory regions available in the system, and user
space needs this information so that it can then use the
mem_region masks that are returned for the engine access.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 16 ++++++++--------
 include/uapi/drm/xe_drm.h     | 16 ++++++++--------
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 573532f1bbb0..5573855dd1a2 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -230,7 +230,7 @@ static int query_engines(struct xe_device *xe,
 	return 0;
 }
 
-static size_t calc_memory_usage_size(struct xe_device *xe)
+static size_t calc_mem_regions_size(struct xe_device *xe)
 {
 	u32 num_managers = 1;
 	int i;
@@ -239,15 +239,15 @@ static size_t calc_memory_usage_size(struct xe_device *xe)
 		if (ttm_manager_type(&xe->ttm, i))
 			num_managers++;
 
-	return offsetof(struct drm_xe_query_mem_usage, regions[num_managers]);
+	return offsetof(struct drm_xe_query_mem_regions, regions[num_managers]);
 }
 
-static int query_memory_usage(struct xe_device *xe,
-			      struct drm_xe_device_query *query)
+static int query_mem_regions(struct xe_device *xe,
+			     struct drm_xe_device_query *query)
 {
-	size_t size = calc_memory_usage_size(xe);
-	struct drm_xe_query_mem_usage *usage;
-	struct drm_xe_query_mem_usage __user *query_ptr =
+	size_t size = calc_mem_regions_size(xe);
+	struct drm_xe_query_mem_regions *usage;
+	struct drm_xe_query_mem_regions __user *query_ptr =
 		u64_to_user_ptr(query->data);
 	struct ttm_resource_manager *man;
 	int ret, i;
@@ -549,7 +549,7 @@ query_uc_fw_version(struct xe_device *xe, struct drm_xe_device_query *query)
 static int (* const xe_query_funcs[])(struct xe_device *xe,
 				      struct drm_xe_device_query *query) = {
 	query_engines,
-	query_memory_usage,
+	query_mem_regions,
 	query_config,
 	query_gt_list,
 	query_hwconfig,
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 89be8e2f9852..91692cec1d6b 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -307,13 +307,13 @@ struct drm_xe_query_engine_cycles {
 };
 
 /**
- * struct drm_xe_query_mem_usage - describe memory regions and usage
+ * struct drm_xe_query_mem_regions - describe memory regions
  *
  * If a query is made with a struct drm_xe_device_query where .query
- * is equal to DRM_XE_DEVICE_QUERY_MEM_USAGE, then the reply uses
- * struct drm_xe_query_mem_usage in .data.
+ * is equal to DRM_XE_DEVICE_QUERY_MEM_REGIONS, then the reply uses
+ * struct drm_xe_query_mem_regions in .data.
  */
-struct drm_xe_query_mem_usage {
+struct drm_xe_query_mem_regions {
 	/** @num_regions: number of memory regions returned in @regions */
 	__u32 num_regions;
 	/** @pad: MBZ */
@@ -380,12 +380,12 @@ struct drm_xe_query_gt {
 	__u32 clock_freq;
 	/**
 	 * @near_mem_regions: Bit mask of instances from
-	 * drm_xe_query_mem_usage that is near the current engines of this GT.
+	 * drm_xe_query_mem_regions that is near the current engines of this GT.
 	 */
 	__u64 near_mem_regions;
 	/**
 	 * @far_mem_regions: Bit mask of instances from
-	 * drm_xe_query_mem_usage that is far from the engines of this GT.
+	 * drm_xe_query_mem_regions that is far from the engines of this GT.
 	 * In general, it has extra indirections when compared to the
 	 * @near_mem_regions. For a discrete device this could mean system
 	 * memory and memory living in a different Tile.
@@ -508,7 +508,7 @@ struct drm_xe_query_uc_fw_version {
  *
  * The @query can be:
  *  - %DRM_XE_DEVICE_QUERY_ENGINES
- *  - %DRM_XE_DEVICE_QUERY_MEM_USAGE
+ *  - %DRM_XE_DEVICE_QUERY_MEM_REGIONS
  *  - %DRM_XE_DEVICE_QUERY_CONFIG
  *  - %DRM_XE_DEVICE_QUERY_GT_LIST - Query type to retrieve the hardware
  *    configuration of the device such as information on slices, memory,
@@ -558,7 +558,7 @@ struct drm_xe_device_query {
 	__u64 extensions;
 
 #define DRM_XE_DEVICE_QUERY_ENGINES		0
-#define DRM_XE_DEVICE_QUERY_MEM_USAGE		1
+#define DRM_XE_DEVICE_QUERY_MEM_REGIONS		1
 #define DRM_XE_DEVICE_QUERY_CONFIG		2
 #define DRM_XE_DEVICE_QUERY_GT_LIST		3
 #define DRM_XE_DEVICE_QUERY_HWCONFIG		4
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 19/43] drm/xe: Make DRM_XE_DEVICE_QUERY_ENGINES future proof
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (17 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 18/43] drm/xe/uapi: Rename query's mem_usage to mem_regions Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 20/43] drm/xe/uapi: Replace BO with GEM in documentation Francois Dugast
                   ` (27 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast, Rodrigo Vivi

From: José Roberto de Souza <jose.souza@intel.com>

We have at least 2 future features (OA and future media engine
capabilities) that will require Xe to provide more information about
engines to UMDs.

But this information should not just be added to
drm_xe_engine_class_instance for a couple of reasons:
- drm_xe_engine_class_instance is used as input to other structs/uAPIs
and those uAPIs don't care about any of these future new engine fields
- those new fields are useless information after initialization for
some UMDs, so they should not need to carry them around

So here my proposal is to make DRM_XE_DEVICE_QUERY_ENGINES return an
array of drm_xe_query_engine_info that contains
drm_xe_engine_class_instance and 5 reserved u64s to be used for future
features.

Reference OA:
https://patchwork.freedesktop.org/patch/558362/?series=121084&rev=6

Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo Rebased]
---
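Not part of the patch: a minimal sketch of the new reply layout and of
filling one array element. The struct layouts are local copies based on
the xe_drm.h hunk below; the 16-bit type of the class, instance and gt_id
fields is an assumption, and engine_info_count() and fill_engine_info()
are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy for illustration; field widths are assumed. */
struct drm_xe_engine_class_instance {
	uint16_t engine_class;
	uint16_t engine_instance;
	uint16_t gt_id;
	uint16_t pad;
};

/* Element type returned by DRM_XE_DEVICE_QUERY_ENGINES after this patch. */
struct drm_xe_query_engine_info {
	struct drm_xe_engine_class_instance instance;
	uint64_t reserved[5];
};

/* Number of engines described by a reply of reply_size bytes. */
static size_t engine_info_count(size_t reply_size)
{
	return reply_size / sizeof(struct drm_xe_query_engine_info);
}

/* Fill element i, zeroing the padding and reserved words of that element. */
static void fill_engine_info(struct drm_xe_query_engine_info *info, size_t i,
			     uint16_t engine_class, uint16_t engine_instance,
			     uint16_t gt_id)
{
	info[i].instance.engine_class = engine_class;
	info[i].instance.engine_instance = engine_instance;
	info[i].instance.gt_id = gt_id;
	info[i].instance.pad = 0;
	memset(info[i].reserved, 0, sizeof(info[i].reserved));
}
```

A UMD that only needs to create exec queues can copy out the instance
member of each element and drop the rest after initialization.
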
 drivers/gpu/drm/xe/xe_query.c | 15 ++++++++-------
 include/uapi/drm/xe_drm.h     | 19 +++++++++++++++++++
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 5573855dd1a2..bc2b4609a38d 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -53,7 +53,7 @@ static size_t calc_hw_engine_info_size(struct xe_device *xe)
 			i++;
 		}
 
-	return i * sizeof(struct drm_xe_engine_class_instance);
+	return i * sizeof(struct drm_xe_query_engine_info);
 }
 
 typedef u64 (*__ktime_func_t)(void);
@@ -186,9 +186,9 @@ static int query_engines(struct xe_device *xe,
 			 struct drm_xe_device_query *query)
 {
 	size_t size = calc_hw_engine_info_size(xe);
-	struct drm_xe_engine_class_instance __user *query_ptr =
+	struct drm_xe_query_engine_info __user *query_ptr =
 		u64_to_user_ptr(query->data);
-	struct drm_xe_engine_class_instance *hw_engine_info;
+	struct drm_xe_query_engine_info *hw_engine_info;
 	struct xe_hw_engine *hwe;
 	enum xe_hw_engine_id id;
 	struct xe_gt *gt;
@@ -211,12 +211,13 @@ static int query_engines(struct xe_device *xe,
 			if (xe_hw_engine_is_reserved(hwe))
 				continue;
 
-			hw_engine_info[i].engine_class =
+			hw_engine_info[i].instance.engine_class =
 				xe_to_user_engine_class[hwe->class];
-			hw_engine_info[i].engine_instance =
+			hw_engine_info[i].instance.engine_instance =
 				hwe->logical_instance;
-			hw_engine_info[i].gt_id = gt->info.id;
-			hw_engine_info[i].pad = 0;
+			hw_engine_info[i].instance.gt_id = gt->info.id;
+			hw_engine_info[i].instance.pad = 0;
+			memset(hw_engine_info[i].reserved, 0, sizeof(hw_engine_info[i].reserved));
 
 			i++;
 		}
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 91692cec1d6b..8d83f8c2ca04 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -127,6 +127,10 @@ struct xe_user_extension {
 /**
  * struct drm_xe_engine_class_instance - instance of an engine class
  *
+ * It is returned as part of the @drm_xe_query_engine_info, but it also is
+ * used as the input of engine selection for both @drm_xe_exec_queue_create
+ * and @drm_xe_query_engine_cycles
+ *
  * The @engine_class can be:
  *  - %DRM_XE_ENGINE_CLASS_RENDER
  *  - %DRM_XE_ENGINE_CLASS_COPY
@@ -161,6 +165,21 @@ struct drm_xe_engine_class_instance {
 	__u16 pad;
 };
 
+/**
+ * struct drm_xe_query_engine_info - describe hardware engine
+ *
+ * If a query is made with a struct @drm_xe_device_query where .query
+ * is equal to %DRM_XE_DEVICE_QUERY_ENGINES, then the reply uses an array of
+ * struct @drm_xe_query_engine_info in .data.
+ */
+struct drm_xe_query_engine_info {
+	/** @instance: The @drm_xe_engine_class_instance */
+	struct drm_xe_engine_class_instance instance;
+
+	/** @reserved: Reserved */
+	__u64 reserved[5];
+};
+
 /**
  * enum drm_xe_memory_class - Supported memory classes.
  */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 20/43] drm/xe/uapi: Replace BO with GEM in documentation
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (18 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 19/43] drm/xe: Make DRM_XE_DEVICE_QUERY_ENGINES future proof Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 21/43] drm/xe/pmu: Drop interrupt pmu event Francois Dugast
                   ` (26 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

Align documentation with names of constants and structs, which
use GEM instead of BO.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 include/uapi/drm/xe_drm.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 8d83f8c2ca04..d66d8aa09270 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -631,14 +631,14 @@ struct drm_xe_gem_create {
 	 */
 	__u64 size;
 
-	/** @placement: A mask of memory instances of where BO can be placed. */
+	/** @placement: A mask of memory instances of where GEM can be placed. */
 	__u32 placement;
 
 #define DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING		(1 << 0)
 #define DRM_XE_GEM_CREATE_FLAG_SCANOUT			(1 << 1)
 #define DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM	(1 << 2)
 	/**
-	 * @flags: Flags, currently a mask of memory instances of where BO can
+	 * @flags: Flags, currently a mask of memory instances of where GEM can
 	 * be placed
 	 */
 	__u32 flags;
@@ -646,7 +646,7 @@ struct drm_xe_gem_create {
 	/**
 	 * @vm_id: Attached VM, if any
 	 *
-	 * If a VM is specified, this BO must:
+	 * If a VM is specified, this GEM must:
 	 *
 	 *  1. Only ever be bound to that VM.
 	 *  2. Cannot be exported as a PRIME fd.
@@ -748,8 +748,8 @@ struct drm_xe_vm_destroy {
  *  - %DRM_XE_VM_BIND_FLAG_NULL - When the NULL flag is set, the page
  *    tables are setup with a special bit which indicates writes are
  *    dropped and all reads return zero. In the future, the NULL flags
- *    will only be valid for DRM_XE_VM_BIND_OP_MAP operations, the BO
- *    handle MBZ, and the BO offset MBZ. This flag is intended to
+ *    will only be valid for DRM_XE_VM_BIND_OP_MAP operations, the GEM
+ *    handle MBZ, and the GEM offset MBZ. This flag is intended to
  *    implement VK sparse bindings.
  *
  */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 21/43] drm/xe/pmu: Drop interrupt pmu event
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (19 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 20/43] drm/xe/uapi: Replace BO with GEM in documentation Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 22/43] drm/xe/uapi: Reject bo creation of unaligned size Francois Dugast
                   ` (25 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Tvrtko Ursulin, Francois Dugast, Rodrigo Vivi

From: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>

Drop interrupt event from PMU as that is not useful and not being used
by any UMD.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
---
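Not part of the patch: a sketch of how the event config values are
composed after the renumbering in the xe_drm.h hunk below. The value of
__XE_PMU_GT_SHIFT is not shown in this patch, so 56 is an assumption made
only for illustration. The two decoders are hypothetical stand-ins named
after the helpers used in xe_pmu.c.

```c
#include <stdint.h>

/* Assumed value; the real definition lives in xe_drm.h. */
#define __XE_PMU_GT_SHIFT	(56)

#define ___XE_PMU_OTHER(gt, x) \
	(((uint64_t)(x)) | ((uint64_t)(gt) << __XE_PMU_GT_SHIFT))

/* Event IDs after this patch: the remaining events move down by one. */
#define DRM_XE_PMU_RENDER_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 0)
#define DRM_XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
#define DRM_XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
#define DRM_XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 3)

/* Hypothetical decoder: GT id stored in the bits above the shift. */
static unsigned int config_gt_id(uint64_t config)
{
	return (unsigned int)(config >> __XE_PMU_GT_SHIFT);
}

/* Hypothetical decoder: event counter stored in the bits below the shift. */
static uint64_t config_counter(uint64_t config)
{
	return config & ~(~UINT64_C(0) << __XE_PMU_GT_SHIFT);
}
```

Since counter 1 meant render-group-busy before this patch and means
copy-group-busy after it, userspace has to be rebuilt against the updated
header rather than keep using hard-coded config values.
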
 drivers/gpu/drm/xe/xe_irq.c       | 18 ------------------
 drivers/gpu/drm/xe/xe_pmu.c       |  9 ---------
 drivers/gpu/drm/xe/xe_pmu_types.h |  8 --------
 include/uapi/drm/xe_drm.h         | 13 ++++++-------
 4 files changed, 6 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
index 21d5273d7e61..205147511aa0 100644
--- a/drivers/gpu/drm/xe/xe_irq.c
+++ b/drivers/gpu/drm/xe/xe_irq.c
@@ -27,20 +27,6 @@
 #define IIR(offset)				XE_REG(offset + 0x8)
 #define IER(offset)				XE_REG(offset + 0xc)
 
-/*
- * Interrupt statistic for PMU. Increments the counter only if the
- * interrupt originated from the GPU so interrupts from a device which
- * shares the interrupt line are not accounted.
- */
-static __always_inline void xe_pmu_irq_stats(struct xe_device *xe)
-{
-	/*
-	 * A clever compiler translates that into INC. A not so clever one
-	 * should at least prevent store tearing.
-	 */
-	WRITE_ONCE(xe->pmu.irq_count, xe->pmu.irq_count + 1);
-}
-
 static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
 {
 	u32 val = xe_mmio_read32(mmio, reg);
@@ -360,8 +346,6 @@ static irqreturn_t xelp_irq_handler(int irq, void *arg)
 
 	xe_display_irq_enable(xe, gu_misc_iir);
 
-	xe_pmu_irq_stats(xe);
-
 	return IRQ_HANDLED;
 }
 
@@ -458,8 +442,6 @@ static irqreturn_t dg1_irq_handler(int irq, void *arg)
 	dg1_intr_enable(xe, false);
 	xe_display_irq_enable(xe, gu_misc_iir);
 
-	xe_pmu_irq_stats(xe);
-
 	return IRQ_HANDLED;
 }
 
diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
index 8378ca3007d9..3e13e48a4f45 100644
--- a/drivers/gpu/drm/xe/xe_pmu.c
+++ b/drivers/gpu/drm/xe/xe_pmu.c
@@ -114,10 +114,6 @@ config_status(struct xe_device *xe, u64 config)
 		return -ENOENT;
 
 	switch (config_counter(config)) {
-	case DRM_XE_PMU_INTERRUPTS(0):
-		if (gt_id)
-			return -ENOENT;
-		break;
 	case DRM_XE_PMU_RENDER_GROUP_BUSY(0):
 	case DRM_XE_PMU_COPY_GROUP_BUSY(0):
 	case DRM_XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
@@ -181,13 +177,9 @@ static u64 __xe_pmu_event_read(struct perf_event *event)
 	const unsigned int gt_id = config_gt_id(event->attr.config);
 	const u64 config = event->attr.config;
 	struct xe_gt *gt = xe_device_get_gt(xe, gt_id);
-	struct xe_pmu *pmu = &xe->pmu;
 	u64 val;
 
 	switch (config_counter(config)) {
-	case DRM_XE_PMU_INTERRUPTS(0):
-		val = READ_ONCE(pmu->irq_count);
-		break;
 	case DRM_XE_PMU_RENDER_GROUP_BUSY(0):
 	case DRM_XE_PMU_COPY_GROUP_BUSY(0):
 	case DRM_XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
@@ -361,7 +353,6 @@ create_event_attributes(struct xe_pmu *pmu)
 		const char *unit;
 		bool global;
 	} events[] = {
-		__global_event(0, "interrupts", NULL),
 		__event(1, "render-group-busy", "ns"),
 		__event(2, "copy-group-busy", "ns"),
 		__event(3, "media-group-busy", "ns"),
diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
index 4ccc7e9042f6..9cadbd243f57 100644
--- a/drivers/gpu/drm/xe/xe_pmu_types.h
+++ b/drivers/gpu/drm/xe/xe_pmu_types.h
@@ -51,14 +51,6 @@ struct xe_pmu {
 	 *
 	 */
 	u64 sample[XE_PMU_MAX_GT][__XE_NUM_PMU_SAMPLERS];
-	/**
-	 * @irq_count: Number of interrupts
-	 *
-	 * Intentionally unsigned long to avoid atomics or heuristics on 32bit.
-	 * 4e9 interrupts are a lot and postprocessing can really deal with an
-	 * occasional wraparound easily. It's 32bit after all.
-	 */
-	unsigned long irq_count;
 	/**
 	 * @events_attr_group: Device events attribute group.
 	 */
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index d66d8aa09270..9210df850ece 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -1112,7 +1112,7 @@ struct drm_xe_wait_user_fence {
  * in 'struct perf_event_attr' as part of perf_event_open syscall to read a
  * particular event.
  *
- * For example to open the DRM_XE_PMU_INTERRUPTS(0):
+ * For example to open the DRM_XE_PMU_RENDER_GROUP_BUSY(0):
  *
  * .. code-block:: C
  *
@@ -1126,7 +1126,7 @@ struct drm_xe_wait_user_fence {
  *	attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED;
  *	attr.use_clockid = 1;
  *	attr.clockid = CLOCK_MONOTONIC;
- *	attr.config = DRM_XE_PMU_INTERRUPTS(0);
+ *	attr.config = DRM_XE_PMU_RENDER_GROUP_BUSY(0);
  *
  *	fd = syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
  */
@@ -1139,11 +1139,10 @@ struct drm_xe_wait_user_fence {
 #define ___XE_PMU_OTHER(gt, x) \
 	(((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
 
-#define DRM_XE_PMU_INTERRUPTS(gt)		___XE_PMU_OTHER(gt, 0)
-#define DRM_XE_PMU_RENDER_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 1)
-#define DRM_XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
-#define DRM_XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 3)
-#define DRM_XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 4)
+#define DRM_XE_PMU_RENDER_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 0)
+#define DRM_XE_PMU_COPY_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 1)
+#define DRM_XE_PMU_MEDIA_GROUP_BUSY(gt)		___XE_PMU_OTHER(gt, 2)
+#define DRM_XE_PMU_ANY_ENGINE_GROUP_BUSY(gt)	___XE_PMU_OTHER(gt, 3)
 
 #if defined(__cplusplus)
 }
-- 
2.34.1
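
For readers following the renumbering in the xe_drm.h hunk above, here
is a minimal sketch of how a PMU config value is composed once the
interrupts event is gone. The names sketch_pmu_config() and
SKETCH_PMU_GT_SHIFT are hypothetical, and the shift of 56 is an
assumption made for illustration only; the real value is whatever
__XE_PMU_GT_SHIFT is defined as in the header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: illustrative stand-in for __XE_PMU_GT_SHIFT. */
#define SKETCH_PMU_GT_SHIFT 56

/* Counter indices as they read after this patch (interrupts dropped). */
enum sketch_pmu_counter {
	SKETCH_RENDER_GROUP_BUSY = 0,
	SKETCH_COPY_GROUP_BUSY = 1,
	SKETCH_MEDIA_GROUP_BUSY = 2,
	SKETCH_ANY_ENGINE_GROUP_BUSY = 3,
};

/* Same shape as ___XE_PMU_OTHER(gt, x): counter in the low bits, gt id on top. */
static uint64_t sketch_pmu_config(unsigned int gt, uint64_t counter)
{
	/* Keep the gt id within the bits left above the assumed shift. */
	assert(gt < (1u << (64 - SKETCH_PMU_GT_SHIFT)));
	return counter | ((uint64_t)gt << SKETCH_PMU_GT_SHIFT);
}
```

With this layout, render-group-busy on GT 0 becomes config 0, the slot
previously taken by the interrupts event, so userspace built against
the old numbering would have to be rebuilt against the new header.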



* [Intel-xe] [PATCH v3 22/43] drm/xe/uapi: Reject bo creation of unaligned size
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (20 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 21/43] drm/xe/pmu: Drop interrupt pmu event Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 23/43] drm/xe/uapi: Fix indentation issues that sometimes causes build warning Francois Dugast
                   ` (24 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

From: Mauro Carvalho Chehab <mauro.chehab@linux.intel.com>

For xe bo creation we require the passed size to match the system or
vram minimum page alignment. This ensures userspace is aware of region
constraints: allocations with an unaligned size are rejected with
EINVAL.
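
As an illustration of what this means for userspace, here is a minimal
sketch of rounding a requested size before calling the create IOCTL.
The helpers sketch_align_up() and sketch_check_bo_size() are
hypothetical and not part of the uAPI; the sketch also assumes that
min_page_size is a power of two (e.g. 4K or 64K).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper, not part of the uAPI: round size up to alignment. */
static uint64_t sketch_align_up(uint64_t size, uint64_t alignment)
{
	/* Assumption: alignment is a non-zero power of two. */
	assert(alignment && (alignment & (alignment - 1)) == 0);
	return (size + alignment - 1) & ~(alignment - 1);
}

/*
 * Models the rule described above for user-created BOs: an unaligned
 * size is rejected with -EINVAL rather than being rounded up silently.
 */
static int sketch_check_bo_size(uint64_t size, uint64_t min_page_size)
{
	return sketch_align_up(size, min_page_size) == size ? 0 : -EINVAL;
}
```

For example, a 4K request against a region that needs 64K pages would
fail the check, whereas sketch_align_up(4096, 65536) yields a size
that passes.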

v2:
- Rebase, Update uAPI documentation. (Thomas)
v3:
- Adjust the dma-buf kunit test accordingly. (Thomas)
v4:
- Fixed rebase conflicts and updated commit message. (Francois)

Signed-off-by: Mauro Carvalho Chehab <mauro.chehab@linux.intel.com>
Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/tests/xe_dma_buf.c |  8 +++++++-
 drivers/gpu/drm/xe/xe_bo.c            | 24 ++++++++++++++++--------
 include/uapi/drm/xe_drm.h             | 25 +++++++++++++++++--------
 3 files changed, 40 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
index 18c00bc03024..a6756b554069 100644
--- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
@@ -109,14 +109,20 @@ static void xe_test_dmabuf_import_same_driver(struct xe_device *xe)
 	struct drm_gem_object *import;
 	struct dma_buf *dmabuf;
 	struct xe_bo *bo;
+	size_t size;
 
 	/* No VRAM on this device? */
 	if (!ttm_manager_type(&xe->ttm, XE_PL_VRAM0) &&
 	    (params->mem_mask & XE_BO_CREATE_VRAM0_BIT))
 		return;
 
+	size = PAGE_SIZE;
+	if ((params->mem_mask & XE_BO_CREATE_VRAM0_BIT) &&
+	    xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
+		size = SZ_64K;
+
 	kunit_info(test, "running %s\n", __func__);
-	bo = xe_bo_create(xe, NULL, NULL, PAGE_SIZE, ttm_bo_type_device,
+	bo = xe_bo_create(xe, NULL, NULL, size, ttm_bo_type_device,
 			  XE_BO_CREATE_USER_BIT | params->mem_mask);
 	if (IS_ERR(bo)) {
 		KUNIT_FAIL(test, "xe_bo_create() failed with err=%ld\n",
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 87971f4faa58..38a67015caef 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1202,6 +1202,7 @@ struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 	};
 	struct ttm_placement *placement;
 	uint32_t alignment;
+	size_t aligned_size;
 	int err;
 
 	/* Only kernel objects should set GT */
@@ -1212,23 +1213,30 @@ struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 		return ERR_PTR(-EINVAL);
 	}
 
-	if (!bo) {
-		bo = xe_bo_alloc();
-		if (IS_ERR(bo))
-			return bo;
-	}
-
 	if (flags & (XE_BO_CREATE_VRAM_MASK | XE_BO_CREATE_STOLEN_BIT) &&
 	    !(flags & XE_BO_CREATE_IGNORE_MIN_PAGE_SIZE_BIT) &&
 	    xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K) {
-		size = ALIGN(size, SZ_64K);
+		aligned_size = ALIGN(size, SZ_64K);
+		if (type != ttm_bo_type_device)
+			size = ALIGN(size, SZ_64K);
 		flags |= XE_BO_INTERNAL_64K;
 		alignment = SZ_64K >> PAGE_SHIFT;
+
 	} else {
-		size = ALIGN(size, PAGE_SIZE);
+		aligned_size = ALIGN(size, SZ_4K);
+		flags &= ~XE_BO_INTERNAL_64K;
 		alignment = SZ_4K >> PAGE_SHIFT;
 	}
 
+	if (type == ttm_bo_type_device && aligned_size != size)
+		return ERR_PTR(-EINVAL);
+
+	if (!bo) {
+		bo = xe_bo_alloc();
+		if (IS_ERR(bo))
+			return bo;
+	}
+
 	bo->tile = tile;
 	bo->size = size;
 	bo->flags = flags;
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 9210df850ece..f796d05157a4 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -219,11 +219,13 @@ struct drm_xe_query_mem_region {
 	 *
 	 * When the kernel allocates memory for this region, the
 	 * underlying pages will be at least @min_page_size in size.
-	 *
-	 * Important note: When userspace allocates a GTT address which
-	 * can point to memory allocated from this region, it must also
-	 * respect this minimum alignment. This is enforced by the
-	 * kernel.
+	 * Buffer objects with an allowable placement in this region must be
+	 * created with a size aligned to this value.
+	 * GPU virtual address mappings of (parts of) buffer objects that
+	 * may be placed in this region must also have their GPU virtual
+	 * address and range aligned to this value.
+	 * Affected IOCTLS will return %-EINVAL if alignment restrictions are
+	 * not met.
 	 */
 	__u32 min_page_size;
 	/**
@@ -372,6 +374,14 @@ struct drm_xe_query_config {
 #define DRM_XE_QUERY_CONFIG_REV_AND_DEVICE_ID		0
 #define DRM_XE_QUERY_CONFIG_FLAGS			1
 	#define DRM_XE_QUERY_CONFIG_FLAGS_HAS_VRAM	(0x1 << 0)
+	/*
+	 * DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT - This returns the
+	 * maximum value of the &min_page_size across all memory regions
+	 * the device implements. User-space code that does not want
+	 * to track @min_page_size per region can use this value for
+	 * a buffer-object size and GPU virtual address and -range
+	 * alignment value that is valid for all regions.
+	 */
 #define DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT		2
 #define DRM_XE_QUERY_CONFIG_VA_BITS			3
 #define DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY	4
@@ -625,9 +635,8 @@ struct drm_xe_gem_create {
 	__u64 extensions;
 
 	/**
-	 * @size: Requested size for the object
-	 *
-	 * The (page-aligned) allocated size for the object will be returned.
+	 * @size: Size of the object to be created, must match region
+	 * (system or vram) minimum alignment (&min_page_size).
 	 */
 	__u64 size;
 
-- 
2.34.1



* [Intel-xe] [PATCH v3 23/43] drm/xe/uapi: Fix indentation issues that sometimes causes build warning
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (21 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 22/43] drm/xe/uapi: Reject bo creation of unaligned size Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 24/43] drm/xe/uapi: Order sections Francois Dugast
                   ` (23 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

These issues were not seen every time, but once another error was hit
they were printed out:

./include/uapi/drm/xe_drm.h:493: WARNING: Definition list ends without \
 a blank line; unexpected unindent.
./include/uapi/drm/xe_drm.h:500: ERROR: Unexpected indentation.
./include/uapi/drm/xe_drm.h:501: WARNING: Block quote ends without a \
 blank line; unexpected unindent.

This patch fixes the build issues, and also the presentation of the
uc_type list in the built HTML doc.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 include/uapi/drm/xe_drm.h | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index f796d05157a4..96db432d91bc 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -493,15 +493,16 @@ struct drm_xe_query_topology_mask {
  *
  * The @uc_type can be:
  *  - %DRM_XE_QUERY_UC_TYPE_GUC_SUBMISSION - This is the GuC Submission Version,
- * a.k.a 'VF version'. It is not the actual GuC blob version. A running GuC can
- * support multiple VF APIs with different Submission Versions. This version is
- * negotiated by the VF KMD with GuC during VF initialization. In most of the
- * current available GuC blobs, this is a 1-1 relationship where the Submission
- * version could be inferred from the running version and vice-versa. However,
- * the submission version is the most useful information for the user space
- * perspective and needs.
+ *    a.k.a 'VF version'. It is not the actual GuC blob version. A running GuC can
+ *    support multiple VF APIs with different Submission Versions. This version is
+ *    negotiated by the VF KMD with GuC during VF initialization. In most of the
+ *    current available GuC blobs, this is a 1-1 relationship where the Submission
+ *    version could be inferred from the running version and vice-versa. However,
+ *    the submission version is the most useful information for the user space
+ *    perspective and needs.
  *  - %DRM_XE_QUERY_TYPE_HUC - The actual HuC blob that is currently running
- * in the platform. It returns 0 when HuC is not currently loaded.
+ *    in the platform. It returns 0 when HuC is not currently loaded.
+ *
  */
 struct drm_xe_query_uc_fw_version {
 	/** @uc_type: The micro-controller type to query firmware version */
-- 
2.34.1



* [Intel-xe] [PATCH v3 24/43] drm/xe/uapi: Order sections
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (22 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 23/43] drm/xe/uapi: Fix indentation issues that sometimes causes build warning Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 25/43] drm/xe/uapi: More uAPI documentation additions and cosmetic updates Francois Dugast
                   ` (22 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast, Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

This patch doesn't modify any text or uapi entries themselves.
It only moves things up and down, aiming for a better organization of
the uAPI.

While fixing the documentation I noticed that query_engine_cs_cycles
was in the middle of the memory_region info. Then I noticed more
mismatches in the order when compared to the order in which the IOCTL
and QUERY entries are declared. So this patch aims to bring some
order to the uAPI so that it gets easier to read and the generated
documentation is able to tell a consistent story.

Overall order:

1. IOCTL definition
2. Extension definition and helper structs
3. IOCTL's Query structs in the order of the Query's entries.
4. The rest of IOCTL structs in the order of IOCTL declaration.
5. uEvents
6. PMU

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 include/uapi/drm/xe_drm.h | 398 +++++++++++++++++++-------------------
 1 file changed, 204 insertions(+), 194 deletions(-)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 96db432d91bc..50dd9e0aad76 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -12,19 +12,53 @@
 extern "C" {
 #endif
 
-/* Please note that modifications to all structs defined here are
+/*
+ * Please note that modifications to all structs defined here are
  * subject to backwards-compatibility constraints.
+ *
+ * Sections in this file are organized as follows:
+ *   1. IOCTL definition
+ *   2. Extension definition and helper structs
+ *   3. IOCTL's Query structs in the order of the Query's entries.
+ *   4. The rest of IOCTL structs in the order of IOCTL declaration.
+ *   5. uEvents
+ *   6. PMU
+ *
  */
 
-/**
- * DOC: uevent generated by xe on it's pci node.
+/*
+ * xe specific ioctls.
  *
- * DRM_XE_RESET_FAILED_UEVENT - Event is generated when attempt to reset gt
- * fails. The value supplied with the event is always "NEEDS_RESET".
- * Additional information supplied is tile id and gt id of the gt unit for
- * which reset has failed.
+ * The device specific ioctl range is [DRM_COMMAND_BASE, DRM_COMMAND_END) ie
+ * [0x40, 0xa0) (a0 is excluded). The numbers below are defined as offset
+ * against DRM_COMMAND_BASE and should be between [0x0, 0x60).
  */
-#define DRM_XE_RESET_FAILED_UEVENT "DEVICE_STATUS"
+#define DRM_XE_DEVICE_QUERY		0x00
+#define DRM_XE_GEM_CREATE		0x01
+#define DRM_XE_GEM_MMAP_OFFSET		0x02
+#define DRM_XE_VM_CREATE		0x03
+#define DRM_XE_VM_DESTROY		0x04
+#define DRM_XE_VM_BIND			0x05
+#define DRM_XE_EXEC			0x06
+#define DRM_XE_EXEC_QUEUE_CREATE	0x07
+#define DRM_XE_EXEC_QUEUE_DESTROY	0x08
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY	0x09
+#define DRM_XE_EXEC_QUEUE_GET_PROPERTY	0x0a
+#define DRM_XE_WAIT_USER_FENCE		0x0b
+/* Must be kept compact -- no holes */
+
+#define DRM_IOCTL_XE_DEVICE_QUERY		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_DEVICE_QUERY, struct drm_xe_device_query)
+#define DRM_IOCTL_XE_GEM_CREATE			DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_GEM_CREATE, struct drm_xe_gem_create)
+#define DRM_IOCTL_XE_GEM_MMAP_OFFSET		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_GEM_MMAP_OFFSET, struct drm_xe_gem_mmap_offset)
+#define DRM_IOCTL_XE_VM_CREATE			DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_CREATE, struct drm_xe_vm_create)
+#define DRM_IOCTL_XE_VM_DESTROY			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_DESTROY, struct drm_xe_vm_destroy)
+#define DRM_IOCTL_XE_VM_BIND			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_BIND, struct drm_xe_vm_bind)
+#define DRM_IOCTL_XE_EXEC			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC, struct drm_xe_exec)
+#define DRM_IOCTL_XE_EXEC_QUEUE_CREATE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_CREATE, struct drm_xe_exec_queue_create)
+#define DRM_IOCTL_XE_EXEC_QUEUE_DESTROY		DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_DESTROY, struct drm_xe_exec_queue_destroy)
+#define DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY	DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_SET_PROPERTY, struct drm_xe_exec_queue_set_property)
+#define DRM_IOCTL_XE_EXEC_QUEUE_GET_PROPERTY	DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_GET_PROPERTY, struct drm_xe_exec_queue_get_property)
+#define DRM_IOCTL_XE_WAIT_USER_FENCE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence)
 
 /**
  * struct xe_user_extension - Base class for defining a chain of extensions
@@ -90,39 +124,23 @@ struct xe_user_extension {
 	__u32 pad;
 };
 
-/*
- * xe specific ioctls.
- *
- * The device specific ioctl range is [DRM_COMMAND_BASE, DRM_COMMAND_END) ie
- * [0x40, 0xa0) (a0 is excluded). The numbers below are defined as offset
- * against DRM_COMMAND_BASE and should be between [0x0, 0x60).
- */
-#define DRM_XE_DEVICE_QUERY		0x00
-#define DRM_XE_GEM_CREATE		0x01
-#define DRM_XE_GEM_MMAP_OFFSET		0x02
-#define DRM_XE_VM_CREATE		0x03
-#define DRM_XE_VM_DESTROY		0x04
-#define DRM_XE_VM_BIND			0x05
-#define DRM_XE_EXEC			0x06
-#define DRM_XE_EXEC_QUEUE_CREATE	0x07
-#define DRM_XE_EXEC_QUEUE_DESTROY	0x08
-#define DRM_XE_EXEC_QUEUE_SET_PROPERTY	0x09
-#define DRM_XE_EXEC_QUEUE_GET_PROPERTY	0x0a
-#define DRM_XE_WAIT_USER_FENCE		0x0b
-/* Must be kept compact -- no holes */
+/** struct drm_xe_ext_set_property - XE set property extension */
+struct drm_xe_ext_set_property {
+	/** @base: base user extension */
+	struct xe_user_extension base;
 
-#define DRM_IOCTL_XE_DEVICE_QUERY		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_DEVICE_QUERY, struct drm_xe_device_query)
-#define DRM_IOCTL_XE_GEM_CREATE			DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_GEM_CREATE, struct drm_xe_gem_create)
-#define DRM_IOCTL_XE_GEM_MMAP_OFFSET		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_GEM_MMAP_OFFSET, struct drm_xe_gem_mmap_offset)
-#define DRM_IOCTL_XE_VM_CREATE			DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_CREATE, struct drm_xe_vm_create)
-#define DRM_IOCTL_XE_VM_DESTROY			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_DESTROY, struct drm_xe_vm_destroy)
-#define DRM_IOCTL_XE_VM_BIND			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_BIND, struct drm_xe_vm_bind)
-#define DRM_IOCTL_XE_EXEC			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC, struct drm_xe_exec)
-#define DRM_IOCTL_XE_EXEC_QUEUE_CREATE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_CREATE, struct drm_xe_exec_queue_create)
-#define DRM_IOCTL_XE_EXEC_QUEUE_DESTROY		DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_DESTROY, struct drm_xe_exec_queue_destroy)
-#define DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY	DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_SET_PROPERTY, struct drm_xe_exec_queue_set_property)
-#define DRM_IOCTL_XE_EXEC_QUEUE_GET_PROPERTY	DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_GET_PROPERTY, struct drm_xe_exec_queue_get_property)
-#define DRM_IOCTL_XE_WAIT_USER_FENCE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence)
+	/** @property: property to set */
+	__u32 property;
+
+	/** @pad: MBZ */
+	__u32 pad;
+
+	/** @value: property value */
+	__u64 value;
+
+	/** @reserved: Reserved */
+	__u64 reserved[2];
+};
 
 /**
  * struct drm_xe_engine_class_instance - instance of an engine class
@@ -273,60 +291,6 @@ struct drm_xe_query_mem_region {
 	__u64 reserved[6];
 };
 
-/**
- * struct drm_xe_query_engine_cycles - correlate CPU and GPU timestamps
- *
- * If a query is made with a struct drm_xe_device_query where .query is equal to
- * DRM_XE_DEVICE_QUERY_ENGINE_CYCLES, then the reply uses struct drm_xe_query_engine_cycles
- * in .data. struct drm_xe_query_engine_cycles is allocated by the user and
- * .data points to this allocated structure.
- *
- * The query returns the engine cycles and the frequency that can
- * be used to calculate the engine timestamp. In addition the
- * query returns a set of cpu timestamps that indicate when the command
- * streamer cycle count was captured.
- */
-struct drm_xe_query_engine_cycles {
-	/**
-	 * @eci: This is input by the user and is the engine for which command
-	 * streamer cycles is queried.
-	 */
-	struct drm_xe_engine_class_instance eci;
-
-	/**
-	 * @clockid: This is input by the user and is the reference clock id for
-	 * CPU timestamp. For definition, see clock_gettime(2) and
-	 * perf_event_open(2). Supported clock ids are CLOCK_MONOTONIC,
-	 * CLOCK_MONOTONIC_RAW, CLOCK_REALTIME, CLOCK_BOOTTIME, CLOCK_TAI.
-	 */
-	__s32 clockid;
-
-	/** @width: Width of the engine cycle counter in bits. */
-	__u32 width;
-
-	/**
-	 * @engine_cycles: Engine cycles as read from its register
-	 * at 0x358 offset.
-	 */
-	__u64 engine_cycles;
-
-	/** @engine_frequency: Frequency of the engine cycles in Hz. */
-	__u64 engine_frequency;
-
-	/**
-	 * @cpu_timestamp: CPU timestamp in ns. The timestamp is captured before
-	 * reading the engine_cycles register using the reference clockid set by the
-	 * user.
-	 */
-	__u64 cpu_timestamp;
-
-	/**
-	 * @cpu_delta: Time delta in ns captured around reading the lower dword
-	 * of the engine_cycles register.
-	 */
-	__u64 cpu_delta;
-};
-
 /**
  * struct drm_xe_query_mem_regions - describe memory regions
  *
@@ -485,6 +449,60 @@ struct drm_xe_query_topology_mask {
 	__u8 mask[];
 };
 
+/**
+ * struct drm_xe_query_engine_cycles - correlate CPU and GPU timestamps
+ *
+ * If a query is made with a struct drm_xe_device_query where .query is equal to
+ * DRM_XE_DEVICE_QUERY_ENGINE_CYCLES, then the reply uses struct drm_xe_query_engine_cycles
+ * in .data. struct drm_xe_query_engine_cycles is allocated by the user and
+ * .data points to this allocated structure.
+ *
+ * The query returns the engine cycles and the frequency that can
+ * be used to calculate the engine timestamp. In addition the
+ * query returns a set of cpu timestamps that indicate when the command
+ * streamer cycle count was captured.
+ */
+struct drm_xe_query_engine_cycles {
+	/**
+	 * @eci: This is input by the user and is the engine for which command
+	 * streamer cycles is queried.
+	 */
+	struct drm_xe_engine_class_instance eci;
+
+	/**
+	 * @clockid: This is input by the user and is the reference clock id for
+	 * CPU timestamp. For definition, see clock_gettime(2) and
+	 * perf_event_open(2). Supported clock ids are CLOCK_MONOTONIC,
+	 * CLOCK_MONOTONIC_RAW, CLOCK_REALTIME, CLOCK_BOOTTIME, CLOCK_TAI.
+	 */
+	__s32 clockid;
+
+	/** @width: Width of the engine cycle counter in bits. */
+	__u32 width;
+
+	/**
+	 * @engine_cycles: Engine cycles as read from its register
+	 * at 0x358 offset.
+	 */
+	__u64 engine_cycles;
+
+	/** @engine_frequency: Frequency of the engine cycles in Hz. */
+	__u64 engine_frequency;
+
+	/**
+	 * @cpu_timestamp: CPU timestamp in ns. The timestamp is captured before
+	 * reading the engine_cycles register using the reference clockid set by the
+	 * user.
+	 */
+	__u64 cpu_timestamp;
+
+	/**
+	 * @cpu_delta: Time delta in ns captured around reading the lower dword
+	 * of the engine_cycles register.
+	 */
+	__u64 cpu_delta;
+};
+
 /**
  * struct drm_xe_query_uc_fw_version - query a micro-controller firmware version
  *
@@ -691,24 +709,6 @@ struct drm_xe_gem_mmap_offset {
 	__u64 reserved[2];
 };
 
-/** struct drm_xe_ext_set_property - XE set property extension */
-struct drm_xe_ext_set_property {
-	/** @base: base user extension */
-	struct xe_user_extension base;
-
-	/** @property: property to set */
-	__u32 property;
-
-	/** @pad: MBZ */
-	__u32 pad;
-
-	/** @value: property value */
-	__u64 value;
-
-	/** @reserved: Reserved */
-	__u64 reserved[2];
-};
-
 struct drm_xe_vm_create {
 #define DRM_XE_VM_EXTENSION_SET_PROPERTY	0
 	/** @extensions: Pointer to the first extension struct, if any */
@@ -880,31 +880,67 @@ struct drm_xe_vm_bind {
 /* Monitor 64MB contiguous region with 2M sub-granularity */
 #define XE_ACC_GRANULARITY_64M 3
 
-/**
- * struct drm_xe_exec_queue_set_property - exec queue set property
- *
- * Same namespace for extensions as drm_xe_exec_queue_create
- */
-struct drm_xe_exec_queue_set_property {
+struct drm_xe_sync {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
 
-	/** @exec_queue_id: Exec queue ID */
+#define DRM_XE_SYNC_FLAG_SYNCOBJ		0x0
+#define DRM_XE_SYNC_FLAG_TIMELINE_SYNCOBJ	0x1
+#define DRM_XE_SYNC_FLAG_DMA_BUF		0x2
+#define DRM_XE_SYNC_FLAG_USER_FENCE		0x3
+#define DRM_XE_SYNC_FLAG_SIGNAL		0x10
+	__u32 flags;
+
+	/** @pad: MBZ */
+	__u32 pad;
+
+	union {
+		__u32 handle;
+
+		/**
+		 * @addr: Address of user fence. When sync passed in via exec
+		 * IOCTL this a GPU address in the VM. When sync passed in via
+		 * VM bind IOCTL this is a user pointer. In either case, it is
+		 * the users responsibility that this address is present and
+		 * mapped when the user fence is signalled. Must be qword
+		 * aligned.
+		 */
+		__u64 addr;
+	};
+
+	__u64 timeline_value;
+
+	/** @reserved: Reserved */
+	__u64 reserved[2];
+};
+
+struct drm_xe_exec {
+	/** @extensions: Pointer to the first extension struct, if any */
+	__u64 extensions;
+
+	/** @exec_queue_id: Exec queue ID for the batch buffer */
 	__u32 exec_queue_id;
 
-#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY			0
-#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE		1
-#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_PREEMPTION_TIMEOUT	2
-#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_PERSISTENCE		3
-#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_JOB_TIMEOUT		4
-#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_TRIGGER		5
-#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_NOTIFY		6
-#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_GRANULARITY		7
-	/** @property: property to set */
-	__u32 property;
+	/** @num_syncs: Amount of struct drm_xe_sync in array. */
+	__u32 num_syncs;
 
-	/** @value: property value */
-	__u64 value;
+	/** @syncs: Pointer to struct drm_xe_sync array. */
+	__u64 syncs;
+
+	/**
+	 * @address: address of batch buffer if num_batch_buffer == 1 or an
+	 * array of batch buffer addresses
+	 */
+	__u64 address;
+
+	/**
+	 * @num_batch_buffer: number of batch buffer in this exec, must match
+	 * the width of the engine
+	 */
+	__u16 num_batch_buffer;
+
+	/** @pad: MBZ */
+	__u16 pad[3];
 
 	/** @reserved: Reserved */
 	__u64 reserved[2];
@@ -943,24 +979,6 @@ struct drm_xe_exec_queue_create {
 	__u64 reserved[2];
 };
 
-struct drm_xe_exec_queue_get_property {
-	/** @extensions: Pointer to the first extension struct, if any */
-	__u64 extensions;
-
-	/** @exec_queue_id: Exec queue ID */
-	__u32 exec_queue_id;
-
-#define DRM_XE_EXEC_QUEUE_GET_PROPERTY_BAN	0
-	/** @property: property to get */
-	__u32 property;
-
-	/** @value: property value */
-	__u64 value;
-
-	/** @reserved: Reserved */
-	__u64 reserved[2];
-};
-
 struct drm_xe_exec_queue_destroy {
 	/** @exec_queue_id: Exec queue ID */
 	__u32 exec_queue_id;
@@ -972,67 +990,49 @@ struct drm_xe_exec_queue_destroy {
 	__u64 reserved[2];
 };
 
-struct drm_xe_sync {
+/**
+ * struct drm_xe_exec_queue_set_property - exec queue set property
+ *
+ * Same namespace for extensions as drm_xe_exec_queue_create
+ */
+struct drm_xe_exec_queue_set_property {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
 
-#define DRM_XE_SYNC_FLAG_SYNCOBJ		0x0
-#define DRM_XE_SYNC_FLAG_TIMELINE_SYNCOBJ	0x1
-#define DRM_XE_SYNC_FLAG_DMA_BUF		0x2
-#define DRM_XE_SYNC_FLAG_USER_FENCE		0x3
-#define DRM_XE_SYNC_FLAG_SIGNAL		0x10
-	__u32 flags;
-
-	/** @pad: MBZ */
-	__u32 pad;
-
-	union {
-		__u32 handle;
+	/** @exec_queue_id: Exec queue ID */
+	__u32 exec_queue_id;
 
-		/**
-		 * @addr: Address of user fence. When sync passed in via exec
-		 * IOCTL this a GPU address in the VM. When sync passed in via
-		 * VM bind IOCTL this is a user pointer. In either case, it is
-		 * the users responsibility that this address is present and
-		 * mapped when the user fence is signalled. Must be qword
-		 * aligned.
-		 */
-		__u64 addr;
-	};
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY			0
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE		1
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_PREEMPTION_TIMEOUT	2
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_PERSISTENCE		3
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_JOB_TIMEOUT		4
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_TRIGGER		5
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_NOTIFY		6
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_GRANULARITY		7
+	/** @property: property to set */
+	__u32 property;
 
-	__u64 timeline_value;
+	/** @value: property value */
+	__u64 value;
 
 	/** @reserved: Reserved */
 	__u64 reserved[2];
 };
 
-struct drm_xe_exec {
+struct drm_xe_exec_queue_get_property {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
 
-	/** @exec_queue_id: Exec queue ID for the batch buffer */
+	/** @exec_queue_id: Exec queue ID */
 	__u32 exec_queue_id;
 
-	/** @num_syncs: Amount of struct drm_xe_sync in array. */
-	__u32 num_syncs;
-
-	/** @syncs: Pointer to struct drm_xe_sync array. */
-	__u64 syncs;
-
-	/**
-	 * @address: address of batch buffer if num_batch_buffer == 1 or an
-	 * array of batch buffer addresses
-	 */
-	__u64 address;
-
-	/**
-	 * @num_batch_buffer: number of batch buffer in this exec, must match
-	 * the width of the engine
-	 */
-	__u16 num_batch_buffer;
+#define DRM_XE_EXEC_QUEUE_GET_PROPERTY_BAN	0
+	/** @property: property to get */
+	__u32 property;
 
-	/** @pad: MBZ */
-	__u16 pad[3];
+	/** @value: property value */
+	__u64 value;
 
 	/** @reserved: Reserved */
 	__u64 reserved[2];
@@ -1115,6 +1115,16 @@ struct drm_xe_wait_user_fence {
 	__u64 reserved[2];
 };
 
+/**
+ * DOC: uevent generated by xe on it's pci node.
+ *
+ * DRM_XE_RESET_FAILED_UEVENT - Event is generated when attempt to reset gt
+ * fails. The value supplied with the event is always "NEEDS_RESET".
+ * Additional information supplied is tile id and gt id of the gt unit for
+ * which reset has failed.
+ */
+#define DRM_XE_RESET_FAILED_UEVENT "DEVICE_STATUS"
+
 /**
  * DOC: XE PMU event config IDs
  *
-- 
2.34.1



* [Intel-xe] [PATCH v3 25/43] drm/xe/uapi: More uAPI documentation additions and cosmetic updates
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (23 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 24/43] drm/xe/uapi: Order sections Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 26/43] drm/xe/uapi: Split xe_sync types from flags Francois Dugast
                   ` (21 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

No functional change in this patch.

Let's ensure all of our structs are documented to a consistent
standard. Also, let's have an overview and list of IOCTLs at the
very beginning of the generated HTML doc.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 include/uapi/drm/xe_drm.h | 134 ++++++++++++++++++++++++++++++++++----
 1 file changed, 122 insertions(+), 12 deletions(-)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 50dd9e0aad76..65cbeaeacedb 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -26,6 +26,29 @@ extern "C" {
  *
  */
 
+/**
+ * DOC: Xe uAPI Overview
+ *
+ * This section aims to describe the Xe's IOCTL entries, its structs, and other
+ * Xe related uAPI such as uevents and PMU (Performance Monitoring Unit) related
+ * entries and usage.
+ *
+ * List of supported IOCTLs:
+ *  - &DRM_IOCTL_XE_DEVICE_QUERY
+ *  - &DRM_IOCTL_XE_GEM_CREATE
+ *  - &DRM_IOCTL_XE_GEM_MMAP_OFFSET
+ *  - &DRM_IOCTL_XE_VM_CREATE
+ *  - &DRM_IOCTL_XE_VM_DESTROY
+ *  - &DRM_IOCTL_XE_VM_BIND
+ *  - &DRM_IOCTL_XE_EXEC
+ *  - &DRM_IOCTL_XE_EXEC_QUEUE_CREATE
+ *  - &DRM_IOCTL_XE_EXEC_QUEUE_DESTROY
+ *  - &DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY
+ *  - &DRM_IOCTL_XE_EXEC_QUEUE_GET_PROPERTY
+ *  - &DRM_IOCTL_XE_WAIT_USER_FENCE
+ *
+ */
+
 /*
  * xe specific ioctls.
  *
@@ -61,7 +84,10 @@ extern "C" {
 #define DRM_IOCTL_XE_WAIT_USER_FENCE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence)
 
 /**
- * struct xe_user_extension - Base class for defining a chain of extensions
+ * DOC: Xe IOCTL Extensions
+ *
+ * Before detailing the IOCTLs and their structs, it is important to highlight
+ * that every IOCTL in Xe is extensible.
  *
  * Many interfaces need to grow over time. In most cases we can simply
  * extend the struct and have userspace pass in more data. Another option,
@@ -95,7 +121,10 @@ extern "C" {
  * Typically the struct xe_user_extension would be embedded in some uAPI
  * struct, and in this case we would feed it the head of the chain(i.e ext1),
  * which would then apply all of the above extensions.
- *
+ */
+
+/**
+ * struct xe_user_extension - Base class for defining a chain of extensions
  */
 struct xe_user_extension {
 	/**
@@ -124,7 +153,12 @@ struct xe_user_extension {
 	__u32 pad;
 };
 
-/** struct drm_xe_ext_set_property - XE set property extension */
+/**
+ * struct drm_xe_ext_set_property - Generic set property extension
+ *
+ * A generic struct that allows any of Xe's IOCTLs to be extended
+ * with a set_property operation.
+ */
 struct drm_xe_ext_set_property {
 	/** @base: base user extension */
 	struct xe_user_extension base;
@@ -287,7 +321,7 @@ struct drm_xe_query_mem_region {
 	 * here will always be zero).
 	 */
 	__u64 cpu_visible_used;
-	/** @reserved: MBZ */
+	/** @reserved: Reserved */
 	__u64 reserved[6];
 };
 
@@ -360,7 +394,6 @@ struct drm_xe_query_config {
  * existing GT individual descriptions.
  * Graphics Technology (GT) is a subset of a GPU/tile that is responsible for
  * implementing graphics and/or media operations.
- *
  */
 struct drm_xe_query_gt {
 #define DRM_XE_QUERY_GT_TYPE_MAIN		0
@@ -548,7 +581,8 @@ struct drm_xe_query_uc_fw_version {
 };
 
 /**
- * struct drm_xe_device_query - main structure to query device information
+ * struct drm_xe_device_query - Input of &DRM_IOCTL_XE_DEVICE_QUERY - The
+ * main structure to query device information
  *
  * The user selects the type of data to query among DRM_XE_DEVICE_QUERY_*
  * and sets the value in the query member. This determines the type of
@@ -627,7 +661,8 @@ struct drm_xe_device_query {
 };
 
 /**
- * struct drm_xe_gem_create - structure for gem creation
+ * struct drm_xe_gem_create - Input of &DRM_IOCTL_XE_GEM_CREATE - A structure for
+ * gem creation
  *
  * The @flags can be:
  *  - %DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING
@@ -692,6 +727,9 @@ struct drm_xe_gem_create {
 	__u64 reserved[2];
 };
 
+/**
+ * struct drm_xe_gem_mmap_offset - Input of &DRM_IOCTL_XE_GEM_MMAP_OFFSET
+ */
 struct drm_xe_gem_mmap_offset {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
@@ -709,6 +747,9 @@ struct drm_xe_gem_mmap_offset {
 	__u64 reserved[2];
 };
 
+/**
+ * struct drm_xe_vm_create - Input of &DRM_IOCTL_XE_VM_CREATE
+ */
 struct drm_xe_vm_create {
 #define DRM_XE_VM_EXTENSION_SET_PROPERTY	0
 	/** @extensions: Pointer to the first extension struct, if any */
@@ -728,6 +769,9 @@ struct drm_xe_vm_create {
 	__u64 reserved[2];
 };
 
+/**
+ * struct drm_xe_vm_destroy - Input of &DRM_IOCTL_XE_VM_DESTROY
+ */
 struct drm_xe_vm_destroy {
 	/** @vm_id: VM ID */
 	__u32 vm_id;
@@ -822,6 +866,9 @@ struct drm_xe_vm_bind_op {
 	__u64 reserved[2];
 };
 
+/**
+ * struct drm_xe_vm_bind - Input of &DRM_IOCTL_XE_VM_BIND
+ */
 struct drm_xe_vm_bind {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
@@ -880,6 +927,19 @@ struct drm_xe_vm_bind {
 /* Monitor 64MB contiguous region with 2M sub-granularity */
 #define XE_ACC_GRANULARITY_64M 3
 
+/**
+ * struct drm_xe_sync - Main structure for sync objects and user fences
+ *
+ * This can be used with both @drm_xe_exec or with @drm_xe_vm_bind
+ *
+ * The @flags can be:
+ *  - %DRM_XE_SYNC_FLAG_SYNCOBJ
+ *  - %DRM_XE_SYNC_FLAG_TIMELINE_SYNCOBJ
+ *  - %DRM_XE_SYNC_FLAG_DMA_BUF
+ *  - %DRM_XE_SYNC_FLAG_USER_FENCE
+ *  - %DRM_XE_SYNC_FLAG_SIGNAL
+ *
+ */
 struct drm_xe_sync {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
@@ -889,17 +949,19 @@ struct drm_xe_sync {
 #define DRM_XE_SYNC_FLAG_DMA_BUF		0x2
 #define DRM_XE_SYNC_FLAG_USER_FENCE		0x3
 #define DRM_XE_SYNC_FLAG_SIGNAL		0x10
+	/** @flags: Sync Flags */
 	__u32 flags;
 
 	/** @pad: MBZ */
 	__u32 pad;
 
 	union {
+		/** @handle: Handle to the sync object */
 		__u32 handle;
 
 		/**
-		 * @addr: Address of user fence. When sync passed in via exec
-		 * IOCTL this a GPU address in the VM. When sync passed in via
+		 * @addr: Address of user fence. When sync is passed in via exec
+		 * IOCTL this is a GPU address in the VM. When sync passed in via
 		 * VM bind IOCTL this is a user pointer. In either case, it is
 		 * the users responsibility that this address is present and
 		 * mapped when the user fence is signalled. Must be qword
@@ -908,12 +970,19 @@ struct drm_xe_sync {
 		__u64 addr;
 	};
 
+	/**
+	 * @timeline_value: Input for the timeline sync object. Needs to be
+	 * different than 0 when used with %DRM_XE_SYNC_FLAG_TIMELINE_SYNCOBJ.
+	 */
 	__u64 timeline_value;
 
 	/** @reserved: Reserved */
 	__u64 reserved[2];
 };
 
+/**
+ * struct drm_xe_exec - Input of &DRM_IOCTL_XE_EXEC
+ */
 struct drm_xe_exec {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
@@ -946,6 +1015,9 @@ struct drm_xe_exec {
 	__u64 reserved[2];
 };
 
+/**
+ * struct drm_xe_exec_queue_create - Input of &DRM_IOCTL_XE_EXEC_QUEUE_CREATE
+ */
 struct drm_xe_exec_queue_create {
 #define DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY               0
 	/** @extensions: Pointer to the first extension struct, if any */
@@ -979,6 +1051,9 @@ struct drm_xe_exec_queue_create {
 	__u64 reserved[2];
 };
 
+/**
+ * struct drm_xe_exec_queue_destroy - Input of &DRM_IOCTL_XE_EXEC_QUEUE_DESTROY
+ */
 struct drm_xe_exec_queue_destroy {
 	/** @exec_queue_id: Exec queue ID */
 	__u32 exec_queue_id;
@@ -991,9 +1066,18 @@ struct drm_xe_exec_queue_destroy {
 };
 
 /**
- * struct drm_xe_exec_queue_set_property - exec queue set property
+ * struct drm_xe_exec_queue_set_property - Input of &DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY
+ *
+ * The @property can be:
+ *  - %DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY
+ *  - %DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE
+ *  - %DRM_XE_EXEC_QUEUE_SET_PROPERTY_PREEMPTION_TIMEOUT
+ *  - %DRM_XE_EXEC_QUEUE_SET_PROPERTY_PERSISTENCE
+ *  - %DRM_XE_EXEC_QUEUE_SET_PROPERTY_JOB_TIMEOUT
+ *  - %DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_TRIGGER
+ *  - %DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_NOTIFY
+ *  - %DRM_XE_EXEC_QUEUE_SET_PROPERTY_ACC_GRANULARITY
  *
- * Same namespace for extensions as drm_xe_exec_queue_create
  */
 struct drm_xe_exec_queue_set_property {
 	/** @extensions: Pointer to the first extension struct, if any */
@@ -1020,6 +1104,13 @@ struct drm_xe_exec_queue_set_property {
 	__u64 reserved[2];
 };
 
+/**
+ * struct drm_xe_exec_queue_get_property - Input of &DRM_IOCTL_XE_EXEC_QUEUE_GET_PROPERTY
+ *
+ * The @property can be:
+ *  - %DRM_XE_EXEC_QUEUE_GET_PROPERTY_BAN
+ *
+ */
 struct drm_xe_exec_queue_get_property {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
@@ -1039,7 +1130,7 @@ struct drm_xe_exec_queue_get_property {
 };
 
 /**
- * struct drm_xe_wait_user_fence - wait user fence
+ * struct drm_xe_wait_user_fence - Input of &DRM_IOCTL_XE_WAIT_USER_FENCE
  *
  * Wait on user fence, XE will wake-up on every HW engine interrupt in the
  * instances list and check if user fence is complete::
@@ -1047,6 +1138,25 @@ struct drm_xe_exec_queue_get_property {
  *	(*addr & MASK) OP (VALUE & MASK)
  *
  * Returns to user on user fence completion or timeout.
+ *
+ * The wait @op can be:
+ *  - %DRM_XE_UFENCE_WAIT_EQ
+ *  - %DRM_XE_UFENCE_WAIT_NEQ
+ *  - %DRM_XE_UFENCE_WAIT_GT
+ *  - %DRM_XE_UFENCE_WAIT_GTE
+ *  - %DRM_XE_UFENCE_WAIT_LT
+ *  - %DRM_XE_UFENCE_WAIT_LTE
+ *
+ * The wait @flags can be:
+ *  - %DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP
+ *  - %DRM_XE_UFENCE_WAIT_FLAG_ABSTIME
+ *
+ * The wait @mask can be:
+ *  - %DRM_XE_UFENCE_WAIT_U8
+ *  - %DRM_XE_UFENCE_WAIT_U16
+ *  - %DRM_XE_UFENCE_WAIT_U32
+ *  - %DRM_XE_UFENCE_WAIT_U64
+ *
  */
 struct drm_xe_wait_user_fence {
 	/** @extensions: Pointer to the first extension struct, if any */
-- 
2.34.1



* [Intel-xe] [PATCH v3 26/43] drm/xe/uapi: Split xe_sync types from flags
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (24 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 25/43] drm/xe/uapi: More uAPI documentation additions and cosmetic updates Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 27/43] drm/xe/uapi: Standardize the FLAG naming and assignment Francois Dugast
                   ` (20 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast, Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

Let's continue the uapi clean-up by splitting values into their own
exclusive fields instead of overloading a single one: the sync type
moves out of flags and into a new type field.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_sync.c       | 23 +++++++----------------
 drivers/gpu/drm/xe/xe_sync_types.h |  1 +
 include/uapi/drm/xe_drm.h          | 26 +++++++++++++-------------
 3 files changed, 21 insertions(+), 29 deletions(-)
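
To illustrate the split described above, here is a minimal sketch of the
validation order that xe_sync_entry_parse() follows after this patch. The
defines are copied from the diff; struct sync_sketch and
sync_sketch_validate() are hypothetical, trimmed stand-ins, and the
no_dma_fences and timeline_value checks are left out.

```c
#include <errno.h>
#include <stdint.h>

/* Values copied from the patch so the sketch builds on its own. */
#define DRM_XE_SYNC_TYPE_SYNCOBJ		0x0
#define DRM_XE_SYNC_TYPE_TIMELINE_SYNCOBJ	0x1
#define DRM_XE_SYNC_TYPE_USER_FENCE		0x2
#define DRM_XE_SYNC_FLAG_SIGNAL			(1 << 0)

/* Hypothetical, trimmed stand-in for struct drm_xe_sync. */
struct sync_sketch {
	uint32_t type;
	uint32_t flags;
};

/* Hypothetical helper mirroring the type/flags checks of the parser. */
int sync_sketch_validate(const struct sync_sketch *s)
{
	/* After the split, SIGNAL is the only bit allowed in flags. */
	if (s->flags & ~(uint32_t)DRM_XE_SYNC_FLAG_SIGNAL)
		return -EINVAL;

	switch (s->type) {
	case DRM_XE_SYNC_TYPE_SYNCOBJ:
	case DRM_XE_SYNC_TYPE_TIMELINE_SYNCOBJ:
		return 0;
	case DRM_XE_SYNC_TYPE_USER_FENCE:
		/* A user fence without SIGNAL is rejected. */
		return (s->flags & DRM_XE_SYNC_FLAG_SIGNAL) ? 0 : -EOPNOTSUPP;
	default:
		return -EINVAL;
	}
}
```

With the old layout the value 0x3 in flags selected a user fence; in this
sketch the same value in flags fails the first check, since the type now
lives in its own field.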

diff --git a/drivers/gpu/drm/xe/xe_sync.c b/drivers/gpu/drm/xe/xe_sync.c
index eafe53c2f55d..883987b27c4e 100644
--- a/drivers/gpu/drm/xe/xe_sync.c
+++ b/drivers/gpu/drm/xe/xe_sync.c
@@ -17,8 +17,6 @@
 #include "xe_macros.h"
 #include "xe_sched_job_types.h"
 
-#define SYNC_FLAGS_TYPE_MASK 0x3
-
 struct user_fence {
 	struct xe_device *xe;
 	struct kref refcount;
@@ -109,15 +107,13 @@ int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
 	if (copy_from_user(&sync_in, sync_user, sizeof(*sync_user)))
 		return -EFAULT;
 
-	if (XE_IOCTL_DBG(xe, sync_in.flags &
-			 ~(SYNC_FLAGS_TYPE_MASK | DRM_XE_SYNC_FLAG_SIGNAL)) ||
-	    XE_IOCTL_DBG(xe, sync_in.pad) ||
+	if (XE_IOCTL_DBG(xe, sync_in.flags & ~DRM_XE_SYNC_FLAG_SIGNAL) ||
 	    XE_IOCTL_DBG(xe, sync_in.reserved[0] || sync_in.reserved[1]))
 		return -EINVAL;
 
 	signal = sync_in.flags & DRM_XE_SYNC_FLAG_SIGNAL;
-	switch (sync_in.flags & SYNC_FLAGS_TYPE_MASK) {
-	case DRM_XE_SYNC_FLAG_SYNCOBJ:
+	switch (sync_in.type) {
+	case DRM_XE_SYNC_TYPE_SYNCOBJ:
 		if (XE_IOCTL_DBG(xe, no_dma_fences && signal))
 			return -EOPNOTSUPP;
 
@@ -135,7 +131,7 @@ int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
 		}
 		break;
 
-	case DRM_XE_SYNC_FLAG_TIMELINE_SYNCOBJ:
+	case DRM_XE_SYNC_TYPE_TIMELINE_SYNCOBJ:
 		if (XE_IOCTL_DBG(xe, no_dma_fences && signal))
 			return -EOPNOTSUPP;
 
@@ -165,12 +161,7 @@ int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
 		}
 		break;
 
-	case DRM_XE_SYNC_FLAG_DMA_BUF:
-		if (XE_IOCTL_DBG(xe, "TODO"))
-			return -EINVAL;
-		break;
-
-	case DRM_XE_SYNC_FLAG_USER_FENCE:
+	case DRM_XE_SYNC_TYPE_USER_FENCE:
 		if (XE_IOCTL_DBG(xe, !signal))
 			return -EOPNOTSUPP;
 
@@ -192,6 +183,7 @@ int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
 		return -EINVAL;
 	}
 
+	sync->type = sync_in.type;
 	sync->flags = sync_in.flags;
 	sync->timeline_value = sync_in.timeline_value;
 
@@ -252,8 +244,7 @@ void xe_sync_entry_signal(struct xe_sync_entry *sync, struct xe_sched_job *job,
 			user_fence_put(sync->ufence);
 			dma_fence_put(fence);
 		}
-	} else if ((sync->flags & SYNC_FLAGS_TYPE_MASK) ==
-		   DRM_XE_SYNC_FLAG_USER_FENCE) {
+	} else if (sync->type == DRM_XE_SYNC_TYPE_USER_FENCE) {
 		job->user_fence.used = true;
 		job->user_fence.addr = sync->addr;
 		job->user_fence.value = sync->timeline_value;
diff --git a/drivers/gpu/drm/xe/xe_sync_types.h b/drivers/gpu/drm/xe/xe_sync_types.h
index 24fccc26cb53..852db5e7884f 100644
--- a/drivers/gpu/drm/xe/xe_sync_types.h
+++ b/drivers/gpu/drm/xe/xe_sync_types.h
@@ -21,6 +21,7 @@ struct xe_sync_entry {
 	struct user_fence *ufence;
 	u64 addr;
 	u64 timeline_value;
+	u32 type;
 	u32 flags;
 };
 
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 65cbeaeacedb..0c004b24f820 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -932,11 +932,12 @@ struct drm_xe_vm_bind {
  *
  * This can be used with both @drm_xe_exec or with @drm_xe_vm_bind
  *
+ * The @type can be:
+ *  - %DRM_XE_SYNC_TYPE_SYNCOBJ - A simple drm sync object
+ *  - %DRM_XE_SYNC_TYPE_TIMELINE_SYNCOBJ - A timelined sync object
+ *  - %DRM_XE_SYNC_TYPE_USER_FENCE - A user fence
+ *
  * The @flags can be:
- *  - %DRM_XE_SYNC_FLAG_SYNCOBJ
- *  - %DRM_XE_SYNC_FLAG_TIMELINE_SYNCOBJ
- *  - %DRM_XE_SYNC_FLAG_DMA_BUF
- *  - %DRM_XE_SYNC_FLAG_USER_FENCE
  *  - %DRM_XE_SYNC_FLAG_SIGNAL
  *
  */
@@ -944,17 +945,16 @@ struct drm_xe_sync {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
 
-#define DRM_XE_SYNC_FLAG_SYNCOBJ		0x0
-#define DRM_XE_SYNC_FLAG_TIMELINE_SYNCOBJ	0x1
-#define DRM_XE_SYNC_FLAG_DMA_BUF		0x2
-#define DRM_XE_SYNC_FLAG_USER_FENCE		0x3
-#define DRM_XE_SYNC_FLAG_SIGNAL		0x10
+#define DRM_XE_SYNC_TYPE_SYNCOBJ		0x0
+#define DRM_XE_SYNC_TYPE_TIMELINE_SYNCOBJ	0x1
+#define DRM_XE_SYNC_TYPE_USER_FENCE		0x2
+	/** @type: Type of this sync object */
+	__u32 type;
+
+#define DRM_XE_SYNC_FLAG_SIGNAL	(1 << 0)
 	/** @flags: Sync Flags */
 	__u32 flags;
 
-	/** @pad: MBZ */
-	__u32 pad;
-
 	union {
 		/** @handle: Handle to the sync object */
 		__u32 handle;
@@ -972,7 +972,7 @@ struct drm_xe_sync {
 
 	/**
 	 * @timeline_value: Input for the timeline sync object. Needs to be
-	 * different than 0 when used with %DRM_XE_SYNC_FLAG_TIMELINE_SYNCOBJ.
+	 * different than 0 when used with %DRM_XE_SYNC_TYPE_TIMELINE_SYNCOBJ.
 	 */
 	__u64 timeline_value;
 
-- 
2.34.1



* [Intel-xe] [PATCH v3 27/43] drm/xe/uapi: Standardize the FLAG naming and assignment
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (25 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 26/43] drm/xe/uapi: Split xe_sync types from flags Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:10   ` Matthew Brost
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 28/43] drm/xe/uapi: Differentiate WAIT_OP from WAIT_MASK Francois Dugast
                   ` (19 subsequent siblings)
  46 siblings, 1 reply; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

Cosmetic changes only. No functional change in this patch.
Define every flag with (1 << n) and use the singular FLAG name.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c |  2 +-
 include/uapi/drm/xe_drm.h     | 20 ++++++++++----------
 2 files changed, 11 insertions(+), 11 deletions(-)
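
As a small illustration of the (1 << n) convention, the sketch below tests
a single flag bit in the config query result. The defines are copied from
the diff; treating the info array as 64-bit entries is an assumption, and
both helpers are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the patch so the sketch builds on its own. */
#define DRM_XE_QUERY_CONFIG_FLAGS		1
#define DRM_XE_QUERY_CONFIG_FLAG_HAS_VRAM	(1 << 0)

#define DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE	(1 << 0)
#define DRM_XE_VM_CREATE_FLAG_COMPUTE_MODE	(1 << 1)
#define DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT	(1 << 2)
#define DRM_XE_VM_CREATE_FLAG_FAULT_MODE	(1 << 3)

/* Hypothetical helper: info[] is assumed to be the array from query_config. */
bool config_has_vram(const uint64_t *info)
{
	return (info[DRM_XE_QUERY_CONFIG_FLAGS] &
		DRM_XE_QUERY_CONFIG_FLAG_HAS_VRAM) != 0;
}

/* True when exactly one bit is set, which every (1 << n) flag satisfies. */
bool is_single_bit(uint32_t flag)
{
	return flag && !(flag & (flag - 1));
}
```

Because each flag is one distinct bit, flags of the same field can be
OR-ed together and tested individually with a plain mask.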

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index bc2b4609a38d..71a4943cab20 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -333,7 +333,7 @@ static int query_config(struct xe_device *xe, struct drm_xe_device_query *query)
 		xe->info.devid | (xe->info.revid << 16);
 	if (xe_device_get_root_tile(xe)->mem.vram.usable_size)
 		config->info[DRM_XE_QUERY_CONFIG_FLAGS] =
-			DRM_XE_QUERY_CONFIG_FLAGS_HAS_VRAM;
+			DRM_XE_QUERY_CONFIG_FLAG_HAS_VRAM;
 	config->info[DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT] =
 		xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K ? SZ_64K : SZ_4K;
 	config->info[DRM_XE_QUERY_CONFIG_VA_BITS] = xe->info.va_bits;
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 0c004b24f820..5217558a32d0 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -354,7 +354,7 @@ struct drm_xe_query_mem_regions {
  *  - %DRM_XE_QUERY_CONFIG_FLAGS - Flags describing the device
  *    configuration, see list below
  *
- *    - %DRM_XE_QUERY_CONFIG_FLAGS_HAS_VRAM - Flag is set if the device
+ *    - %DRM_XE_QUERY_CONFIG_FLAG_HAS_VRAM - Flag is set if the device
  *      has usable VRAM
  *  - %DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT - Minimal memory alignment
  *    required by this device, typically SZ_4K or SZ_64K
@@ -371,7 +371,7 @@ struct drm_xe_query_config {
 
 #define DRM_XE_QUERY_CONFIG_REV_AND_DEVICE_ID		0
 #define DRM_XE_QUERY_CONFIG_FLAGS			1
-	#define DRM_XE_QUERY_CONFIG_FLAGS_HAS_VRAM	(0x1 << 0)
+	#define DRM_XE_QUERY_CONFIG_FLAG_HAS_VRAM	(1 << 0)
 	/*
 	 * DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT - This returns the
 	 * maximum value of the &min_page_size across all memory regions
@@ -755,10 +755,10 @@ struct drm_xe_vm_create {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
 
-#define DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE	(0x1 << 0)
-#define DRM_XE_VM_CREATE_FLAG_COMPUTE_MODE	(0x1 << 1)
-#define DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT	(0x1 << 2)
-#define DRM_XE_VM_CREATE_FLAG_FAULT_MODE	(0x1 << 3)
+#define DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE	(1 << 0)
+#define DRM_XE_VM_CREATE_FLAG_COMPUTE_MODE	(1 << 1)
+#define DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT	(1 << 2)
+#define DRM_XE_VM_CREATE_FLAG_FAULT_MODE	(1 << 3)
 	/** @flags: Flags */
 	__u32 flags;
 
@@ -852,10 +852,10 @@ struct drm_xe_vm_bind_op {
 	/** @op: Bind operation to perform */
 	__u32 op;
 
-#define DRM_XE_VM_BIND_FLAG_READONLY	(0x1 << 0)
-#define DRM_XE_VM_BIND_FLAG_ASYNC	(0x1 << 1)
-#define DRM_XE_VM_BIND_FLAG_IMMEDIATE	(0x1 << 2)
-#define DRM_XE_VM_BIND_FLAG_NULL	(0x1 << 3)
+#define DRM_XE_VM_BIND_FLAG_READONLY	(1 << 0)
+#define DRM_XE_VM_BIND_FLAG_ASYNC	(1 << 1)
+#define DRM_XE_VM_BIND_FLAG_IMMEDIATE	(1 << 2)
+#define DRM_XE_VM_BIND_FLAG_NULL	(1 << 3)
 	/** @flags: Bind flags */
 	__u32 flags;
 
-- 
2.34.1



* [Intel-xe] [PATCH v3 28/43] drm/xe/uapi: Differentiate WAIT_OP from WAIT_MASK
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (26 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 27/43] drm/xe/uapi: Standardize the FLAG naming and assignment Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 29/43] drm/xe/uapi: Move xe_exec after xe_exec_queue Francois Dugast
                   ` (18 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast, Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

On one hand, the WAIT_OP represents the operation used for waiting, such
as ==, !=, > and so on. On the other hand, the mask is applied to the
value used for comparison. Split those two to bring clarity to the uapi.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_wait_user_fence.c | 14 ++++-----
 include/uapi/drm/xe_drm.h               | 41 +++++++++++++------------
 2 files changed, 28 insertions(+), 27 deletions(-)
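
The difference between the operation and the mask can be sketched as
follows. This follows the (*addr & MASK) OP (VALUE & MASK) rule from the
header and the switch in do_compare(); the defines are copied from the
diff, while ufence_compare() and its return convention (1 passed, 0 not
passed, -1 unknown op) are hypothetical.

```c
#include <stdint.h>

/* Values copied from the patch so the sketch builds on its own. */
#define DRM_XE_UFENCE_WAIT_OP_EQ	0x0
#define DRM_XE_UFENCE_WAIT_OP_NEQ	0x1
#define DRM_XE_UFENCE_WAIT_OP_GT	0x2
#define DRM_XE_UFENCE_WAIT_OP_GTE	0x3
#define DRM_XE_UFENCE_WAIT_OP_LT	0x4
#define DRM_XE_UFENCE_WAIT_OP_LTE	0x5

#define DRM_XE_UFENCE_WAIT_MASK_U8	0xffu
#define DRM_XE_UFENCE_WAIT_MASK_U16	0xffffu
#define DRM_XE_UFENCE_WAIT_MASK_U32	0xffffffffu
#define DRM_XE_UFENCE_WAIT_MASK_U64	0xffffffffffffffffu

/*
 * Hypothetical helper: rvalue is the value read from the fence address.
 * The mask selects the bits to compare, the op selects the comparison.
 */
int ufence_compare(uint64_t rvalue, uint64_t value, uint64_t mask, uint16_t op)
{
	switch (op) {
	case DRM_XE_UFENCE_WAIT_OP_EQ:
		return (rvalue & mask) == (value & mask);
	case DRM_XE_UFENCE_WAIT_OP_NEQ:
		return (rvalue & mask) != (value & mask);
	case DRM_XE_UFENCE_WAIT_OP_GT:
		return (rvalue & mask) > (value & mask);
	case DRM_XE_UFENCE_WAIT_OP_GTE:
		return (rvalue & mask) >= (value & mask);
	case DRM_XE_UFENCE_WAIT_OP_LT:
		return (rvalue & mask) < (value & mask);
	case DRM_XE_UFENCE_WAIT_OP_LTE:
		return (rvalue & mask) <= (value & mask);
	default:
		return -1;
	}
}
```

The same op gives a different answer depending on the mask: 0x1ff and 0xff
are equal under MASK_U8 but not under MASK_U16.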

diff --git a/drivers/gpu/drm/xe/xe_wait_user_fence.c b/drivers/gpu/drm/xe/xe_wait_user_fence.c
index 13562db6c07f..4d5c2555ce41 100644
--- a/drivers/gpu/drm/xe/xe_wait_user_fence.c
+++ b/drivers/gpu/drm/xe/xe_wait_user_fence.c
@@ -25,22 +25,22 @@ static int do_compare(u64 addr, u64 value, u64 mask, u16 op)
 		return -EFAULT;
 
 	switch (op) {
-	case DRM_XE_UFENCE_WAIT_EQ:
+	case DRM_XE_UFENCE_WAIT_OP_EQ:
 		passed = (rvalue & mask) == (value & mask);
 		break;
-	case DRM_XE_UFENCE_WAIT_NEQ:
+	case DRM_XE_UFENCE_WAIT_OP_NEQ:
 		passed = (rvalue & mask) != (value & mask);
 		break;
-	case DRM_XE_UFENCE_WAIT_GT:
+	case DRM_XE_UFENCE_WAIT_OP_GT:
 		passed = (rvalue & mask) > (value & mask);
 		break;
-	case DRM_XE_UFENCE_WAIT_GTE:
+	case DRM_XE_UFENCE_WAIT_OP_GTE:
 		passed = (rvalue & mask) >= (value & mask);
 		break;
-	case DRM_XE_UFENCE_WAIT_LT:
+	case DRM_XE_UFENCE_WAIT_OP_LT:
 		passed = (rvalue & mask) < (value & mask);
 		break;
-	case DRM_XE_UFENCE_WAIT_LTE:
+	case DRM_XE_UFENCE_WAIT_OP_LTE:
 		passed = (rvalue & mask) <= (value & mask);
 		break;
 	default:
@@ -81,7 +81,7 @@ static int check_hw_engines(struct xe_device *xe,
 
 #define VALID_FLAGS	(DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP | \
 			 DRM_XE_UFENCE_WAIT_FLAG_ABSTIME)
-#define MAX_OP		DRM_XE_UFENCE_WAIT_LTE
+#define MAX_OP		DRM_XE_UFENCE_WAIT_OP_LTE
 
 static long to_jiffies_timeout(struct xe_device *xe,
 			       struct drm_xe_wait_user_fence *args)
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 5217558a32d0..c30f7ea3fae3 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -1140,22 +1140,22 @@ struct drm_xe_exec_queue_get_property {
  * Returns to user on user fence completion or timeout.
  *
  * The wait @op can be:
- *  - %DRM_XE_UFENCE_WAIT_EQ
- *  - %DRM_XE_UFENCE_WAIT_NEQ
- *  - %DRM_XE_UFENCE_WAIT_GT
- *  - %DRM_XE_UFENCE_WAIT_GTE
- *  - %DRM_XE_UFENCE_WAIT_LT
- *  - %DRM_XE_UFENCE_WAIT_LTE
+ *  - %DRM_XE_UFENCE_WAIT_OP_EQ
+ *  - %DRM_XE_UFENCE_WAIT_OP_NEQ
+ *  - %DRM_XE_UFENCE_WAIT_OP_GT
+ *  - %DRM_XE_UFENCE_WAIT_OP_GTE
+ *  - %DRM_XE_UFENCE_WAIT_OP_LT
+ *  - %DRM_XE_UFENCE_WAIT_OP_LTE
  *
  * The wait @flags can be:
  *  - %DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP
  *  - %DRM_XE_UFENCE_WAIT_FLAG_ABSTIME
  *
  * The wait @mask can be:
- *  - %DRM_XE_UFENCE_WAIT_U8
- *  - %DRM_XE_UFENCE_WAIT_U16
- *  - %DRM_XE_UFENCE_WAIT_U32
- *  - %DRM_XE_UFENCE_WAIT_U64
+ *  - %DRM_XE_UFENCE_WAIT_MASK_U8
+ *  - %DRM_XE_UFENCE_WAIT_MASK_U16
+ *  - %DRM_XE_UFENCE_WAIT_MASK_U32
+ *  - %DRM_XE_UFENCE_WAIT_MASK_U64
  *
  */
 struct drm_xe_wait_user_fence {
@@ -1167,12 +1167,12 @@ struct drm_xe_wait_user_fence {
 	 */
 	__u64 addr;
 
-#define DRM_XE_UFENCE_WAIT_EQ	0
-#define DRM_XE_UFENCE_WAIT_NEQ	1
-#define DRM_XE_UFENCE_WAIT_GT	2
-#define DRM_XE_UFENCE_WAIT_GTE	3
-#define DRM_XE_UFENCE_WAIT_LT	4
-#define DRM_XE_UFENCE_WAIT_LTE	5
+#define DRM_XE_UFENCE_WAIT_OP_EQ	0x0
+#define DRM_XE_UFENCE_WAIT_OP_NEQ	0x1
+#define DRM_XE_UFENCE_WAIT_OP_GT	0x2
+#define DRM_XE_UFENCE_WAIT_OP_GTE	0x3
+#define DRM_XE_UFENCE_WAIT_OP_LT	0x4
+#define DRM_XE_UFENCE_WAIT_OP_LTE	0x5
 	/** @op: wait operation (type of comparison) */
 	__u16 op;
 
@@ -1187,12 +1187,13 @@ struct drm_xe_wait_user_fence {
 	/** @value: compare value */
 	__u64 value;
 
-#define DRM_XE_UFENCE_WAIT_U8		0xffu
-#define DRM_XE_UFENCE_WAIT_U16		0xffffu
-#define DRM_XE_UFENCE_WAIT_U32		0xffffffffu
-#define DRM_XE_UFENCE_WAIT_U64		0xffffffffffffffffu
+#define DRM_XE_UFENCE_WAIT_MASK_U8	0xffu
+#define DRM_XE_UFENCE_WAIT_MASK_U16	0xffffu
+#define DRM_XE_UFENCE_WAIT_MASK_U32	0xffffffffu
+#define DRM_XE_UFENCE_WAIT_MASK_U64	0xffffffffffffffffu
 	/** @mask: comparison mask */
 	__u64 mask;
+
 	/**
 	 * @timeout: how long to wait before bailing, value in nanoseconds.
 	 * Without DRM_XE_UFENCE_WAIT_FLAG_ABSTIME flag set (relative timeout)
-- 
2.34.1



* [Intel-xe] [PATCH v3 29/43] drm/xe/uapi: Move xe_exec after xe_exec_queue
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (27 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 28/43] drm/xe/uapi: Differentiate WAIT_OP from WAIT_MASK Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 30/43] drm/xe/uapi: Move memory_region masks from GT to engine Francois Dugast
                   ` (17 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

Although the exec ioctl is a very important one, it makes no sense
to explain xe_exec before explaining the exec_queue. So, let's
move it down to improve the flow of the documentation and the
readability of the code.

It is important to highlight that this patch is changing all
the ioctl numbers in a non-backward compatible way. However, we
are doing this final uapi clean-up before we submit our first
pull-request to be part of the upstream Kernel. Once we get
there, no other change like this will ever happen and all the
backward compatibility will be respected.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 include/uapi/drm/xe_drm.h | 82 +++++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 41 deletions(-)
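
To make the renumbering concrete, the sketch below lists the numbers from
the hunk before and after this patch and checks the "no holes" rule. The
numbers are copied from the diff and DRM_COMMAND_BASE is the value from
drm.h; the enum, the two tables and both helpers are hypothetical and only
cover the entries visible in the hunk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DRM_COMMAND_BASE	0x40	/* value from include/uapi/drm/drm.h */

/* Index names used only by this sketch. */
enum {
	SK_VM_CREATE, SK_VM_DESTROY, SK_VM_BIND, SK_EXEC,
	SK_EXEC_QUEUE_CREATE, SK_EXEC_QUEUE_DESTROY,
	SK_EXEC_QUEUE_SET_PROPERTY, SK_EXEC_QUEUE_GET_PROPERTY,
	SK_WAIT_USER_FENCE, SK_COUNT
};

/* Numbers before this patch. */
const uint8_t xe_nr_old[SK_COUNT] = {
	[SK_VM_CREATE] = 0x03, [SK_VM_DESTROY] = 0x04, [SK_VM_BIND] = 0x05,
	[SK_EXEC] = 0x06, [SK_EXEC_QUEUE_CREATE] = 0x07,
	[SK_EXEC_QUEUE_DESTROY] = 0x08, [SK_EXEC_QUEUE_SET_PROPERTY] = 0x09,
	[SK_EXEC_QUEUE_GET_PROPERTY] = 0x0a, [SK_WAIT_USER_FENCE] = 0x0b,
};

/* Numbers after this patch. */
const uint8_t xe_nr_new[SK_COUNT] = {
	[SK_VM_CREATE] = 0x03, [SK_VM_DESTROY] = 0x04, [SK_VM_BIND] = 0x05,
	[SK_EXEC] = 0x0a, [SK_EXEC_QUEUE_CREATE] = 0x06,
	[SK_EXEC_QUEUE_DESTROY] = 0x07, [SK_EXEC_QUEUE_SET_PROPERTY] = 0x08,
	[SK_EXEC_QUEUE_GET_PROPERTY] = 0x09, [SK_WAIT_USER_FENCE] = 0x0b,
};

/* Command number as used inside the DRM_IOW/DRM_IOWR macros. */
unsigned int xe_command_nr(uint8_t nr)
{
	return DRM_COMMAND_BASE + nr;
}

/* True when the n numbers cover first..first+n-1 exactly once (n <= 64). */
bool nr_is_compact(const uint8_t *nr, size_t n, uint8_t first)
{
	uint64_t seen = 0;

	for (size_t i = 0; i < n; i++) {
		if (nr[i] < first)
			return false;

		size_t off = (size_t)(nr[i] - first);

		if (off >= n || off >= 64 || (seen & (UINT64_C(1) << off)))
			return false;
		seen |= UINT64_C(1) << off;
	}

	return true;
}
```

Both tables stay compact, but a binary built against the old header would
issue command 0x46 for exec, which after this patch is exec queue create.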

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index c30f7ea3fae3..5c023feaa6d5 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -62,11 +62,11 @@ extern "C" {
 #define DRM_XE_VM_CREATE		0x03
 #define DRM_XE_VM_DESTROY		0x04
 #define DRM_XE_VM_BIND			0x05
-#define DRM_XE_EXEC			0x06
-#define DRM_XE_EXEC_QUEUE_CREATE	0x07
-#define DRM_XE_EXEC_QUEUE_DESTROY	0x08
-#define DRM_XE_EXEC_QUEUE_SET_PROPERTY	0x09
-#define DRM_XE_EXEC_QUEUE_GET_PROPERTY	0x0a
+#define DRM_XE_EXEC_QUEUE_CREATE	0x06
+#define DRM_XE_EXEC_QUEUE_DESTROY	0x07
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY	0x08
+#define DRM_XE_EXEC_QUEUE_GET_PROPERTY	0x09
+#define DRM_XE_EXEC			0x0a
 #define DRM_XE_WAIT_USER_FENCE		0x0b
 /* Must be kept compact -- no holes */
 
@@ -76,11 +76,11 @@ extern "C" {
 #define DRM_IOCTL_XE_VM_CREATE			DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_VM_CREATE, struct drm_xe_vm_create)
 #define DRM_IOCTL_XE_VM_DESTROY			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_DESTROY, struct drm_xe_vm_destroy)
 #define DRM_IOCTL_XE_VM_BIND			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_BIND, struct drm_xe_vm_bind)
-#define DRM_IOCTL_XE_EXEC			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC, struct drm_xe_exec)
 #define DRM_IOCTL_XE_EXEC_QUEUE_CREATE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_CREATE, struct drm_xe_exec_queue_create)
 #define DRM_IOCTL_XE_EXEC_QUEUE_DESTROY		DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_DESTROY, struct drm_xe_exec_queue_destroy)
 #define DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY	DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_SET_PROPERTY, struct drm_xe_exec_queue_set_property)
 #define DRM_IOCTL_XE_EXEC_QUEUE_GET_PROPERTY	DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_GET_PROPERTY, struct drm_xe_exec_queue_get_property)
+#define DRM_IOCTL_XE_EXEC			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC, struct drm_xe_exec)
 #define DRM_IOCTL_XE_WAIT_USER_FENCE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence)
 
 /**
@@ -980,41 +980,6 @@ struct drm_xe_sync {
 	__u64 reserved[2];
 };
 
-/**
- * struct drm_xe_exec - Input of &DRM_IOCTL_XE_EXEC
- */
-struct drm_xe_exec {
-	/** @extensions: Pointer to the first extension struct, if any */
-	__u64 extensions;
-
-	/** @exec_queue_id: Exec queue ID for the batch buffer */
-	__u32 exec_queue_id;
-
-	/** @num_syncs: Amount of struct drm_xe_sync in array. */
-	__u32 num_syncs;
-
-	/** @syncs: Pointer to struct drm_xe_sync array. */
-	__u64 syncs;
-
-	/**
-	 * @address: address of batch buffer if num_batch_buffer == 1 or an
-	 * array of batch buffer addresses
-	 */
-	__u64 address;
-
-	/**
-	 * @num_batch_buffer: number of batch buffer in this exec, must match
-	 * the width of the engine
-	 */
-	__u16 num_batch_buffer;
-
-	/** @pad: MBZ */
-	__u16 pad[3];
-
-	/** @reserved: Reserved */
-	__u64 reserved[2];
-};
-
 /**
  * struct drm_xe_exec_queue_create - Input of &DRM_IOCTL_XE_EXEC_QUEUE_CREATE
  */
@@ -1129,6 +1094,41 @@ struct drm_xe_exec_queue_get_property {
 	__u64 reserved[2];
 };
 
+/**
+ * struct drm_xe_exec - Input of &DRM_IOCTL_XE_EXEC
+ */
+struct drm_xe_exec {
+	/** @extensions: Pointer to the first extension struct, if any */
+	__u64 extensions;
+
+	/** @exec_queue_id: Exec queue ID for the batch buffer */
+	__u32 exec_queue_id;
+
+	/** @num_syncs: Amount of struct drm_xe_sync in array. */
+	__u32 num_syncs;
+
+	/** @syncs: Pointer to struct drm_xe_sync array. */
+	__u64 syncs;
+
+	/**
+	 * @address: address of batch buffer if num_batch_buffer == 1 or an
+	 * array of batch buffer addresses
+	 */
+	__u64 address;
+
+	/**
+	 * @num_batch_buffer: number of batch buffer in this exec, must match
+	 * the width of the engine
+	 */
+	__u16 num_batch_buffer;
+
+	/** @pad: MBZ */
+	__u16 pad[3];
+
+	/** @reserved: Reserved */
+	__u64 reserved[2];
+};
+
 /**
  * struct drm_xe_wait_user_fence - Input of &DRM_IOCTL_XE_WAIT_USER_FENCE
  *
-- 
2.34.1



* [Intel-xe] [PATCH v3 30/43] drm/xe/uapi: Move memory_region masks from GT to engine
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (28 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 29/43] drm/xe/uapi: Move xe_exec after xe_exec_queue Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 31/43] drm/xe/uapi: Document the memory_region bitmask Francois Dugast
                   ` (16 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

On tiled platforms, memory is tied more closely to the Tile
than to the GT.
The distance (near vs far) makes more sense from the Engine
perspective than from the GT perspective.

So, let's move this out from the GT and into the engine info.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 14 +++++++-------
 include/uapi/drm/xe_drm.h     | 27 ++++++++++++++-------------
 2 files changed, 21 insertions(+), 20 deletions(-)
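
The mask computation that moves into query_engines() can be sketched as
below. The arithmetic mirrors the added lines in xe_query.c; the helper
names are hypothetical, and the example mask 0x7 (system memory in bit 0,
one VRAM region per tile in bits 1 and 2) is an assumption used only for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: near mask for an engine on the given tile.
 * Integrated parts use bit 0; discrete parts use BIT(tile_id) << 1.
 */
uint64_t sketch_near_mem_regions(bool is_dgfx, unsigned int tile_id)
{
	if (!is_dgfx)
		return 0x1;

	return (UINT64_C(1) << tile_id) << 1;
}

/* Hypothetical helper: everything in the device mask that is not near. */
uint64_t sketch_far_mem_regions(uint64_t mem_region_mask, uint64_t near)
{
	return mem_region_mask ^ near;
}
```

For an engine on tile 1 of a two-tile discrete device with the assumed
mask 0x7, near is 0x4 and far is 0x3, i.e. system memory plus the VRAM of
the other tile.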

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 71a4943cab20..40af8bcc9f02 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -217,6 +217,13 @@ static int query_engines(struct xe_device *xe,
 				hwe->logical_instance;
 			hw_engine_info[i].instance.gt_id = gt->info.id;
 			hw_engine_info[i].instance.pad = 0;
+			if (!IS_DGFX(xe))
+				hw_engine_info[i].near_mem_regions = 0x1;
+			else
+				hw_engine_info[i].near_mem_regions =
+					BIT(gt_to_tile(gt)->id) << 1;
+			hw_engine_info[i].far_mem_regions = xe->info.mem_region_mask ^
+				hw_engine_info[i].near_mem_regions;
 			memset(hw_engine_info->reserved, 0, sizeof(hw_engine_info->reserved));
 
 			i++;
@@ -378,13 +385,6 @@ static int query_gt_list(struct xe_device *xe, struct drm_xe_device_query *query
 			gt_list->gt_list[id].type = DRM_XE_QUERY_GT_TYPE_MAIN;
 		gt_list->gt_list[id].gt_id = gt->info.id;
 		gt_list->gt_list[id].clock_freq = gt->info.clock_freq;
-		if (!IS_DGFX(xe))
-			gt_list->gt_list[id].near_mem_regions = 0x1;
-		else
-			gt_list->gt_list[id].near_mem_regions =
-				BIT(gt_to_tile(gt)->id) << 1;
-		gt_list->gt_list[id].far_mem_regions = xe->info.mem_region_mask ^
-			gt_list->gt_list[id].near_mem_regions;
 	}
 
 	if (copy_to_user(query_ptr, gt_list, size)) {
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 5c023feaa6d5..365208caa22e 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -228,6 +228,20 @@ struct drm_xe_query_engine_info {
 	/** @instance: The @drm_xe_engine_class_instance */
 	struct drm_xe_engine_class_instance instance;
 
+	/**
+	 * @near_mem_regions: Bit mask of instances from
+	 * drm_xe_query_mem_regions that is near this engine.
+	 */
+	__u64 near_mem_regions;
+	/**
+	 * @far_mem_regions: Bit mask of instances from
+	 * drm_xe_query_mem_regions that is far from this engine.
+	 * In general, it has extra indirections when compared to the
+	 * @near_mem_regions. For a discrete device this could mean system
+	 * memory and memory living in a different Tile.
+	 */
+	__u64 far_mem_regions;
+
 	/** @reserved: Reserved */
 	__u64 reserved[5];
 };
@@ -404,19 +418,6 @@ struct drm_xe_query_gt {
 	__u16 gt_id;
 	/** @clock_freq: A clock frequency for timestamp */
 	__u32 clock_freq;
-	/**
-	 * @near_mem_regions: Bit mask of instances from
-	 * drm_xe_query_mem_regions that is near the current engines of this GT.
-	 */
-	__u64 near_mem_regions;
-	/**
-	 * @far_mem_regions: Bit mask of instances from
-	 * drm_xe_query_mem_regions that is far from the engines of this GT.
-	 * In general, it has extra indirections when compared to the
-	 * @near_mem_regions. For a discrete device this could mean system
-	 * memory and memory living in a different Tile.
-	 */
-	__u64 far_mem_regions;
 	/** @reserved: Reserved */
 	__u64 reserved[8];
 };
-- 
2.34.1



* [Intel-xe] [PATCH v3 31/43] drm/xe/uapi: Document the memory_region bitmask
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (29 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 30/43] drm/xe/uapi: Move memory_region masks from GT to engine Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 32/43] drm/xe/uapi: Be more specific about the vm_bind prefetch region Francois Dugast
                   ` (15 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast, Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

The uAPI should stay generic with regard to the bitmask. It is
userspace's responsibility to check the type/class of the memory,
without making any assumptions.

Also add comments inside the code to explain how it is actually
constructed so we don't accidentally change the assignment of
the instance and the masks.

No functional change in this patch. It only explains and documents
the memory_region masks. Follow-up work to organize all memory
regions around struct xe_mem_regions is desired, but is not part
of this patch.
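
To illustrate the "no assumption" rule from the userspace side, here is a
minimal sketch that filters a region mask by memory class instead of
relying on bit positions. The struct and the class constants are local,
hypothetical stand-ins chosen to keep the example self-contained; they
are not the real definitions from xe_drm.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the uAPI types; not the real xe_drm.h layout. */
enum { MEM_CLASS_SYSMEM = 0, MEM_CLASS_VRAM = 1 };

struct mem_region {
	uint16_t mem_class;
	uint16_t instance;	/* index of this region's bit in the masks */
};

/*
 * Return the subset of @mask whose regions have class @mem_class.
 * The class is looked up per region; no bit position is assumed.
 */
static uint64_t filter_mask_by_class(const struct mem_region *regions,
				     size_t num_regions, uint64_t mask,
				     uint16_t mem_class)
{
	uint64_t out = 0;
	size_t i;

	for (i = 0; i < num_regions; i++) {
		uint64_t bit;

		if (regions[i].instance >= 64)
			continue;	/* cannot be represented in a 64-bit mask */
		bit = UINT64_C(1) << regions[i].instance;
		if ((mask & bit) && regions[i].mem_class == mem_class)
			out |= bit;
	}
	return out;
}
```

A caller would pass the regions returned by the memory-region query and
an engine's near or far mask, and would get back only the bits that
really are of the requested class, whatever order the regions are
reported in.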

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 19 +++++++++++++++++++
 include/uapi/drm/xe_drm.h     | 23 ++++++++++++++++++-----
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 40af8bcc9f02..e5db18c91f01 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -217,6 +217,20 @@ static int query_engines(struct xe_device *xe,
 				hwe->logical_instance;
 			hw_engine_info[i].instance.gt_id = gt->info.id;
 			hw_engine_info[i].instance.pad = 0;
+			/*
+			 * The mem_regions indexes in the mask below need to
+			 * directly identify the struct
+			 * drm_xe_query_mem_regions' instance constructed at
+			 * query_mem_regions()
+			 *
+			 * For our current platforms:
+			 * Bit 0 -> System Memory
+			 * Bit 1 -> VRAM0 on Tile0
+			 * Bit 2 -> VRAM1 on Tile1
+			 * However the uAPI is generic and it's userspace's
+			 * responsibility to check the mem_class, without any
+			 * assumption.
+			 */
 			if (!IS_DGFX(xe))
 				hw_engine_info[i].near_mem_regions = 0x1;
 			else
@@ -273,6 +287,11 @@ static int query_mem_regions(struct xe_device *xe,
 
 	man = ttm_manager_type(&xe->ttm, XE_PL_TT);
 	usage->regions[0].mem_class = DRM_XE_MEM_REGION_CLASS_SYSMEM;
+	/*
+	 * The instance needs to be a unique number that represents the index
+	 * in the placement mask used at xe_gem_create_ioctl() for the
+	 * xe_bo_create() placement.
+	 */
 	usage->regions[0].instance = 0;
 	usage->regions[0].min_page_size = PAGE_SIZE;
 	usage->regions[0].total_size = man->size << PAGE_SHIFT;
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 365208caa22e..df3e5e31a2c9 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -231,6 +231,10 @@ struct drm_xe_query_engine_info {
 	/**
 	 * @near_mem_regions: Bit mask of instances from
 	 * drm_xe_query_mem_regions that is near this engine.
+	 * Each index in this mask refers directly to the struct
+	 * drm_xe_query_mem_regions' instance, no assumptions should
+	 * be made about order. The type of each region is described
+	 * by struct drm_xe_query_mem_regions' mem_class.
 	 */
 	__u64 near_mem_regions;
 	/**
@@ -239,6 +243,10 @@ struct drm_xe_query_engine_info {
 	 * In general, it has extra indirections when compared to the
 	 * @near_mem_regions. For a discrete device this could mean system
 	 * memory and memory living in a different Tile.
+	 * Each index in this mask refers directly to the struct
+	 * drm_xe_query_mem_regions' instance, no assumptions should
+	 * be made about order. The type of each region is described
+	 * by struct drm_xe_query_mem_regions' mem_class.
 	 */
 	__u64 far_mem_regions;
 
@@ -272,10 +280,9 @@ struct drm_xe_query_mem_region {
 	 */
 	__u16 mem_class;
 	/**
-	 * @instance: The instance for this region.
-	 *
-	 * The @mem_class and @instance taken together will always give
-	 * a unique pair.
+	 * @instance: The unique ID for this region, which serves as the
+	 * index in the placement bitmask used as argument for
+	 * &DRM_IOCTL_XE_GEM_CREATE
 	 */
 	__u16 instance;
 	/** @pad: MBZ */
@@ -695,7 +702,13 @@ struct drm_xe_gem_create {
 	 */
 	__u64 size;
 
-	/** @placement: A mask of memory instances of where GEM can be placed. */
+	/**
+	 * @placement: A mask of memory instances of where GEM can be placed.
+	 * Each index in this mask refers directly to the struct
+	 * drm_xe_query_mem_regions' instance, no assumptions should
+	 * be made about order. The type of each region is described
+	 * by struct drm_xe_query_mem_regions' mem_class.
+	 */
 	__u32 placement;
 
 #define DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING		(1 << 0)
-- 
2.34.1



* [Intel-xe] [PATCH v3 32/43] drm/xe/uapi: Be more specific about the vm_bind prefetch region
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (30 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 31/43] drm/xe/uapi: Document the memory_region bitmask Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 33/43] drm/xe/uapi: Convert tile_mask to a pt_placement_hint Francois Dugast
                   ` (14 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

Let's bring a bit of clarity to the 'region' field that is part
of the vm_bind operation struct. Rename and document it to make
it obvious that it is a region instance and not a mask, and that
it should only be used with the prefetch operation itself.
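
The instance/mask distinction can be sketched as follows. This is an
illustration only: the helper names are hypothetical and the code is
plain userspace C, not the driver's implementation. An instance of 2
selects bit 2, which is the mask 0x4; passing 0x4 where an instance is
expected would instead name region 4.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if @instance names a region present in @mem_region_mask. */
static bool region_instance_is_valid(uint32_t instance,
				     uint64_t mem_region_mask)
{
	if (instance >= 64)
		return false;	/* shifting by >= 64 would be undefined */
	return (mem_region_mask >> instance) & 1;
}

/* Convert an instance into the single-bit form used by mask fields. */
static uint64_t region_instance_to_mask(uint32_t instance)
{
	return instance < 64 ? UINT64_C(1) << instance : 0;
}
```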

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_vm.c | 15 ++++++++-------
 include/uapi/drm/xe_drm.h  |  8 ++++++--
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 76926ee756c7..f8559ebad9bc 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -2167,7 +2167,8 @@ static void print_op(struct xe_device *xe, struct drm_gpuva_op *op)
 static struct drm_gpuva_ops *
 vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 			 u64 bo_offset_or_userptr, u64 addr, u64 range,
-			 u32 operation, u32 flags, u8 tile_mask, u32 region)
+			 u32 operation, u32 flags, u8 tile_mask,
+			 u32 prefetch_region)
 {
 	struct drm_gem_object *obj = bo ? &bo->ttm.base : NULL;
 	struct drm_gpuva_ops *ops;
@@ -2221,7 +2222,7 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 			struct xe_vma_op *op = gpuva_op_to_vma_op(__op);
 
 			op->tile_mask = tile_mask;
-			op->prefetch.region = region;
+			op->prefetch.region = prefetch_region;
 		}
 		break;
 	case DRM_XE_VM_BIND_OP_UNMAP_ALL:
@@ -2881,7 +2882,7 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe,
 		u32 flags = (*bind_ops)[i].flags;
 		u32 obj = (*bind_ops)[i].obj;
 		u64 obj_offset = (*bind_ops)[i].obj_offset;
-		u32 region = (*bind_ops)[i].region;
+		u32 prefetch_region = (*bind_ops)[i].prefetch_mem_region_instance;
 		bool is_null = flags & DRM_XE_VM_BIND_FLAG_NULL;
 
 		if (i == 0) {
@@ -2915,9 +2916,9 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe,
 				 op == DRM_XE_VM_BIND_OP_MAP_USERPTR) ||
 		    XE_IOCTL_DBG(xe, obj &&
 				 op == DRM_XE_VM_BIND_OP_PREFETCH) ||
-		    XE_IOCTL_DBG(xe, region &&
+		    XE_IOCTL_DBG(xe, prefetch_region &&
 				 op != DRM_XE_VM_BIND_OP_PREFETCH) ||
-		    XE_IOCTL_DBG(xe, !(BIT(region) &
+		    XE_IOCTL_DBG(xe, !(BIT(prefetch_region) &
 				       xe->info.mem_region_mask)) ||
 		    XE_IOCTL_DBG(xe, obj &&
 				 op == DRM_XE_VM_BIND_OP_UNMAP)) {
@@ -3099,11 +3100,11 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		u32 flags = bind_ops[i].flags;
 		u64 obj_offset = bind_ops[i].obj_offset;
 		u8 tile_mask = bind_ops[i].tile_mask;
-		u32 region = bind_ops[i].region;
+		u32 prefetch_region = bind_ops[i].prefetch_mem_region_instance;
 
 		ops[i] = vm_bind_ioctl_ops_create(vm, bos[i], obj_offset,
 						  addr, range, op, flags,
-						  tile_mask, region);
+						  tile_mask, prefetch_region);
 		if (IS_ERR(ops[i])) {
 			err = PTR_ERR(ops[i]);
 			ops[i] = NULL;
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index df3e5e31a2c9..3cbfc17d9ffa 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -873,8 +873,12 @@ struct drm_xe_vm_bind_op {
 	/** @flags: Bind flags */
 	__u32 flags;
 
-	/** @region: Memory region to prefetch VMA to, instance not a mask */
-	__u32 region;
+	/**
+	 * @prefetch_mem_region_instance: Memory region to prefetch VMA to.
+	 * It is a region instance, not a mask.
+	 * To be used only with %DRM_XE_VM_BIND_OP_PREFETCH operation.
+	 */
+	__u32 prefetch_mem_region_instance;
 
 	/** @reserved: Reserved */
 	__u64 reserved[2];
-- 
2.34.1



* [Intel-xe] [PATCH v3 33/43] drm/xe/uapi: Convert tile_mask to a pt_placement_hint
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (31 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 32/43] drm/xe/uapi: Be more specific about the vm_bind prefetch region Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09  9:29   ` Matthew Brost
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 34/43] drm/xe/uapi: Exec queue documentation and variable renaming Francois Dugast
                   ` (13 subsequent siblings)
  46 siblings, 1 reply; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast, Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

The previous tile_mask was also an optional hint, used only for
the page-table tree placement. However, it was too closely tied
to the tile concept itself. Let's clear things up and make this
generic enough by accepting any valid memory region mask.
It could even be a near_mem_regions mask taken directly from the
engine_info. 'pt' stands for page table.
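
As a rough sketch of how a memory region mask could relate to a tile
mask, assume the bit layout noted in the memory_region documentation
patch of this series (bit 0 is system memory, bit N+1 is the VRAM on
tile N). That layout is platform-specific, and the helpers below are
hypothetical illustrations, not necessarily the conversion the driver
performs.

```c
#include <stdbool.h>
#include <stdint.h>

#define SYSMEM_BIT UINT64_C(0x1)	/* assumed: bit 0 is system memory */

/* Accept only the system memory bit and one VRAM bit per existing tile. */
static bool pt_hint_is_valid(uint64_t hint, unsigned int tile_count)
{
	uint64_t valid_tiles, valid_regions;

	if (tile_count == 0 || tile_count > 8)
		return false;	/* the tile mask is assumed to fit in 8 bits */
	valid_tiles = (UINT64_C(1) << tile_count) - 1;
	valid_regions = SYSMEM_BIT | (valid_tiles << 1);
	return (hint & ~valid_regions) == 0;
}

/* Drop the system memory bit, then shift so that tile 0 lands on bit 0. */
static uint8_t pt_hint_to_tile_mask(uint64_t hint)
{
	return (uint8_t)((hint & ~SYSMEM_BIT) >> 1);
}
```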

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_vm.c | 14 ++++++++++----
 include/uapi/drm/xe_drm.h  | 16 +++++++++++++---
 2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index f8559ebad9bc..ad3b5ea6f91a 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -3018,11 +3018,16 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 			goto release_vm_lock;
 		}
 
-		if (bind_ops[i].tile_mask) {
+		if (bind_ops[i].pt_placement_hint) {
 			u64 valid_tiles = BIT(xe->info.tile_count) - 1;
+			/*
+			 * System memory is currently ignored from this hint,
+			 * which gets entirely converted to a tile_mask
+			 */
+			u8 system_memory = 0x1;
 
-			if (XE_IOCTL_DBG(xe, bind_ops[i].tile_mask &
-					 ~valid_tiles)) {
+			if (XE_IOCTL_DBG(xe, bind_ops[i].pt_placement_hint &
+					 ~valid_tiles & ~system_memory)) {
 				err = -EINVAL;
 				goto release_vm_lock;
 			}
@@ -3099,7 +3104,8 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		u32 op = bind_ops[i].op;
 		u32 flags = bind_ops[i].flags;
 		u64 obj_offset = bind_ops[i].obj_offset;
-		u8 tile_mask = bind_ops[i].tile_mask;
+		/* Remove the system memory bit when converting to tiles */
+		u8 tile_mask = bind_ops[i].pt_placement_hint & ~0x1;
 		u32 prefetch_region = bind_ops[i].prefetch_mem_region_instance;
 
 		ops[i] = vm_bind_ioctl_ops_create(vm, bos[i], obj_offset,
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 3cbfc17d9ffa..144a423868cf 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -853,10 +853,20 @@ struct drm_xe_vm_bind_op {
 	__u64 addr;
 
 	/**
-	 * @tile_mask: Mask for which tiles to create binds for, 0 == All tiles,
-	 * only applies to creating new VMAs
+	 * @pt_placement_hint: An optional memory_region bit-mask hint, which
+	 * only applies when creating new VMAs. Default value '0' is the
+	 * recommended value.
+	 *
+	 * It hints the optimal placement for the page-table tree for this VMA.
+	 * For instance, when userspace is using engines living in a secondary
+	 * tile with allocated BOs near those engines, that same
+	 * @near_mem_region could be used in this hint field.
+	 *
+	 * Since it is a hint, the Xe kernel driver is free to ignore this mask
+	 * and choose the best location for the page-table, taking into
+	 * consideration the running hardware and runtime constraints.
 	 */
-	__u64 tile_mask;
+	__u64 pt_placement_hint;
 
 #define DRM_XE_VM_BIND_OP_MAP		0x0
 #define DRM_XE_VM_BIND_OP_UNMAP		0x1
-- 
2.34.1



* [Intel-xe] [PATCH v3 34/43] drm/xe/uapi: Exec queue documentation and variable renaming
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (32 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 33/43] drm/xe/uapi: Convert tile_mask to a pt_placement_hint Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 35/43] drm/xe/uapi: Refactor engine information Francois Dugast
                   ` (12 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

Rename 'placement' to num_eng_per_bb and 'width' to num_bb_per_exec, and
document them graphically.

Let's make it obvious and straightforward. Not only is it important to
have clear and descriptive variable names, but 'placement' is now also
used in many places around the memory_region selection that decides
where the BO or the page table will live, and 'width' is too generic,
with many other common meanings in the graphics world.
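
To make the two renamed quantities concrete, here is a small sketch of
how they combine. The index formula and the mask mirror expressions
found in the diff below (n = j * num_bb_per_exec + i and
(1 << num_bb_per_exec) - 1); the helper names themselves are
hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Total number of engine entries userspace provides for one exec queue. */
static unsigned int eci_array_len(uint16_t num_bb_per_exec,
				  uint16_t num_eng_per_bb)
{
	return (unsigned int)num_bb_per_exec * num_eng_per_bb;
}

/*
 * Position in the flat engine array of candidate engine @eng
 * (0..num_eng_per_bb-1) for batch buffer @bb (0..num_bb_per_exec-1).
 */
static unsigned int eci_index(unsigned int bb, unsigned int eng,
			      unsigned int num_bb_per_exec)
{
	return eng * num_bb_per_exec + bb;
}

/* Mask with one bit set per batch buffer submitted in a single exec. */
static uint32_t bb_mask(uint16_t num_bb_per_exec)
{
	if (num_bb_per_exec >= 32)
		return UINT32_MAX;	/* avoid an undefined shift */
	return (UINT32_C(1) << num_bb_per_exec) - 1;
}
```

With num_bb_per_exec = 2 and num_eng_per_bb = 3 the array holds six
entries, and the entries for one candidate placement sit next to each
other, one per batch buffer.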

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_devcoredump.c      |  8 +--
 drivers/gpu/drm/xe/xe_exec.c             |  4 +-
 drivers/gpu/drm/xe/xe_exec_queue.c       | 49 +++++++-------
 drivers/gpu/drm/xe/xe_exec_queue.h       |  4 +-
 drivers/gpu/drm/xe/xe_exec_queue_types.h |  4 +-
 drivers/gpu/drm/xe/xe_guc_submit.c       | 32 ++++-----
 drivers/gpu/drm/xe/xe_ring_ops.c         |  8 +--
 drivers/gpu/drm/xe/xe_sched_job.c        | 10 +--
 drivers/gpu/drm/xe/xe_trace.h            |  8 +--
 include/uapi/drm/xe_drm.h                | 84 ++++++++++++++++++++++--
 10 files changed, 141 insertions(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
index 68abc0b195be..b4e8de4903b9 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.c
+++ b/drivers/gpu/drm/xe/xe_devcoredump.c
@@ -130,7 +130,7 @@ static void devcoredump_snapshot(struct xe_devcoredump *coredump,
 	struct xe_hw_engine *hwe;
 	enum xe_hw_engine_id id;
 	u32 adj_logical_mask = q->logical_mask;
-	u32 width_mask = (0x1 << q->width) - 1;
+	u32 num_bb_per_exec_mask = (0x1 << q->num_bb_per_exec) - 1;
 	int i;
 	bool cookie;
 
@@ -138,10 +138,10 @@ static void devcoredump_snapshot(struct xe_devcoredump *coredump,
 	ss->boot_time = ktime_get_boottime();
 
 	cookie = dma_fence_begin_signalling();
-	for (i = 0; q->width > 1 && i < XE_HW_ENGINE_MAX_INSTANCE;) {
+	for (i = 0; q->num_bb_per_exec > 1 && i < XE_HW_ENGINE_MAX_INSTANCE;) {
 		if (adj_logical_mask & BIT(i)) {
-			adj_logical_mask |= width_mask << i;
-			i += q->width;
+			adj_logical_mask |= num_bb_per_exec_mask << i;
+			i += q->num_bb_per_exec;
 		} else {
 			++i;
 		}
diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
index 28e84a0bbeb0..ca922635db89 100644
--- a/drivers/gpu/drm/xe/xe_exec.c
+++ b/drivers/gpu/drm/xe/xe_exec.c
@@ -161,7 +161,7 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	if (XE_IOCTL_DBG(xe, q->flags & EXEC_QUEUE_FLAG_VM))
 		return -EINVAL;
 
-	if (XE_IOCTL_DBG(xe, q->width != args->num_batch_buffer))
+	if (XE_IOCTL_DBG(xe, q->num_bb_per_exec != args->num_batch_buffer))
 		return -EINVAL;
 
 	if (XE_IOCTL_DBG(xe, q->flags & EXEC_QUEUE_FLAG_BANNED)) {
@@ -189,7 +189,7 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 
 	if (xe_exec_queue_is_parallel(q)) {
 		err = __copy_from_user(addresses, addresses_user, sizeof(u64) *
-				       q->width);
+				       q->num_bb_per_exec);
 		if (err) {
 			err = -EFAULT;
 			goto err_syncs;
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index 59e8d1ed34f7..064f25e5e3a5 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -33,7 +33,8 @@ enum xe_exec_queue_sched_prop {
 static struct xe_exec_queue *__xe_exec_queue_create(struct xe_device *xe,
 						    struct xe_vm *vm,
 						    u32 logical_mask,
-						    u16 width, struct xe_hw_engine *hwe,
+						    u16 num_bb_per_exec,
+						    struct xe_hw_engine *hwe,
 						    u32 flags)
 {
 	struct xe_exec_queue *q;
@@ -44,7 +45,7 @@ static struct xe_exec_queue *__xe_exec_queue_create(struct xe_device *xe,
 	/* only kernel queues can be permanent */
 	XE_WARN_ON((flags & EXEC_QUEUE_FLAG_PERMANENT) && !(flags & EXEC_QUEUE_FLAG_KERNEL));
 
-	q = kzalloc(sizeof(*q) + sizeof(struct xe_lrc) * width, GFP_KERNEL);
+	q = kzalloc(sizeof(*q) + sizeof(struct xe_lrc) * num_bb_per_exec, GFP_KERNEL);
 	if (!q)
 		return ERR_PTR(-ENOMEM);
 
@@ -55,7 +56,7 @@ static struct xe_exec_queue *__xe_exec_queue_create(struct xe_device *xe,
 	if (vm)
 		q->vm = xe_vm_get(vm);
 	q->class = hwe->class;
-	q->width = width;
+	q->num_bb_per_exec = num_bb_per_exec;
 	q->logical_mask = logical_mask;
 	q->fence_irq = &gt->fence_irq[hwe->class];
 	q->ring_ops = gt->ring_ops[hwe->class];
@@ -77,7 +78,7 @@ static struct xe_exec_queue *__xe_exec_queue_create(struct xe_device *xe,
 		q->bind.fence_seqno = XE_FENCE_INITIAL_SEQNO;
 	}
 
-	for (i = 0; i < width; ++i) {
+	for (i = 0; i < num_bb_per_exec; ++i) {
 		err = xe_lrc_init(q->lrc + i, hwe, q, vm, SZ_16K);
 		if (err)
 			goto err_lrc;
@@ -108,7 +109,7 @@ static struct xe_exec_queue *__xe_exec_queue_create(struct xe_device *xe,
 }
 
 struct xe_exec_queue *xe_exec_queue_create(struct xe_device *xe, struct xe_vm *vm,
-					   u32 logical_mask, u16 width,
+					   u32 logical_mask, u16 num_bb_per_exec,
 					   struct xe_hw_engine *hwe, u32 flags)
 {
 	struct xe_exec_queue *q;
@@ -119,7 +120,7 @@ struct xe_exec_queue *xe_exec_queue_create(struct xe_device *xe, struct xe_vm *v
 		if (err)
 			return ERR_PTR(err);
 	}
-	q = __xe_exec_queue_create(xe, vm, logical_mask, width, hwe, flags);
+	q = __xe_exec_queue_create(xe, vm, logical_mask, num_bb_per_exec, hwe, flags);
 	if (vm)
 		xe_vm_unlock(vm);
 
@@ -170,7 +171,7 @@ void xe_exec_queue_fini(struct xe_exec_queue *q)
 {
 	int i;
 
-	for (i = 0; i < q->width; ++i)
+	for (i = 0; i < q->num_bb_per_exec; ++i)
 		xe_lrc_finish(q->lrc + i);
 	if (q->vm)
 		xe_vm_put(q->vm);
@@ -512,15 +513,15 @@ find_hw_engine(struct xe_device *xe,
 
 static u32 bind_exec_queue_logical_mask(struct xe_device *xe, struct xe_gt *gt,
 					struct drm_xe_engine_class_instance *eci,
-					u16 width, u16 num_placements)
+					u16 num_bb_per_exec, u16 num_eng_per_bb)
 {
 	struct xe_hw_engine *hwe;
 	enum xe_hw_engine_id id;
 	u32 logical_mask = 0;
 
-	if (XE_IOCTL_DBG(xe, width != 1))
+	if (XE_IOCTL_DBG(xe, num_bb_per_exec != 1))
 		return 0;
-	if (XE_IOCTL_DBG(xe, num_placements != 1))
+	if (XE_IOCTL_DBG(xe, num_eng_per_bb != 1))
 		return 0;
 	if (XE_IOCTL_DBG(xe, eci[0].engine_instance != 0))
 		return 0;
@@ -541,9 +542,9 @@ static u32 bind_exec_queue_logical_mask(struct xe_device *xe, struct xe_gt *gt,
 
 static u32 calc_validate_logical_mask(struct xe_device *xe, struct xe_gt *gt,
 				      struct drm_xe_engine_class_instance *eci,
-				      u16 width, u16 num_placements)
+				      u16 num_bb_per_exec, u16 num_eng_per_bb)
 {
-	int len = width * num_placements;
+	int len = num_bb_per_exec * num_eng_per_bb;
 	int i, j, n;
 	u16 class;
 	u16 gt_id;
@@ -553,13 +554,13 @@ static u32 calc_validate_logical_mask(struct xe_device *xe, struct xe_gt *gt,
 			 len > 1))
 		return 0;
 
-	for (i = 0; i < width; ++i) {
+	for (i = 0; i < num_bb_per_exec; ++i) {
 		u32 current_mask = 0;
 
-		for (j = 0; j < num_placements; ++j) {
+		for (j = 0; j < num_eng_per_bb; ++j) {
 			struct xe_hw_engine *hwe;
 
-			n = j * width + i;
+			n = j * num_bb_per_exec + i;
 
 			hwe = find_hw_engine(xe, eci[n]);
 			if (XE_IOCTL_DBG(xe, !hwe))
@@ -575,7 +576,7 @@ static u32 calc_validate_logical_mask(struct xe_device *xe, struct xe_gt *gt,
 			class = eci[n].engine_class;
 			gt_id = eci[n].gt_id;
 
-			if (width == 1 || !i)
+			if (num_bb_per_exec == 1 || !i)
 				return_mask |= BIT(eci[n].engine_instance);
 			current_mask |= BIT(eci[n].engine_instance);
 		}
@@ -612,7 +613,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 	    XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
 		return -EINVAL;
 
-	len = args->width * args->num_placements;
+	len = args->num_bb_per_exec * args->num_eng_per_bb;
 	if (XE_IOCTL_DBG(xe, !len || len > XE_HW_ENGINE_MAX_INSTANCE))
 		return -EINVAL;
 
@@ -637,8 +638,8 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 
 			eci[0].gt_id = gt->info.id;
 			logical_mask = bind_exec_queue_logical_mask(xe, gt, eci,
-								    args->width,
-								    args->num_placements);
+								    args->num_bb_per_exec,
+								    args->num_eng_per_bb);
 			if (XE_IOCTL_DBG(xe, !logical_mask))
 				return -EINVAL;
 
@@ -651,7 +652,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 
 			migrate_vm = xe_migrate_get_vm(gt_to_tile(gt)->migrate);
 			new = xe_exec_queue_create(xe, migrate_vm, logical_mask,
-						   args->width, hwe,
+						   args->num_bb_per_exec, hwe,
 						   EXEC_QUEUE_FLAG_PERSISTENT |
 						   EXEC_QUEUE_FLAG_VM |
 						   (sync ? 0 :
@@ -678,8 +679,8 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 	} else {
 		gt = xe_device_get_gt(xe, eci[0].gt_id);
 		logical_mask = calc_validate_logical_mask(xe, gt, eci,
-							  args->width,
-							  args->num_placements);
+							  args->num_bb_per_exec,
+							  args->num_eng_per_bb);
 		if (XE_IOCTL_DBG(xe, !logical_mask))
 			return -EINVAL;
 
@@ -704,7 +705,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 		}
 
 		q = xe_exec_queue_create(xe, vm, logical_mask,
-					 args->width, hwe,
+					 args->num_bb_per_exec, hwe,
 					 xe_vm_no_dma_fences(vm) ? 0 :
 					 EXEC_QUEUE_FLAG_PERSISTENT);
 		up_read(&vm->lock);
@@ -827,7 +828,7 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q)
 	if (xe_exec_queue_is_parallel(q)) {
 		int i;
 
-		for (i = 0; i < q->width; ++i) {
+		for (i = 0; i < q->num_bb_per_exec; ++i) {
 			if (xe_lrc_seqno(&q->lrc[i]) !=
 			    q->lrc[i].fence_ctx.next_seqno - 1)
 				return false;
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h b/drivers/gpu/drm/xe/xe_exec_queue.h
index 59a54bfb9a8c..6782f3ce9faf 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.h
+++ b/drivers/gpu/drm/xe/xe_exec_queue.h
@@ -15,7 +15,7 @@ struct xe_device;
 struct xe_file;
 
 struct xe_exec_queue *xe_exec_queue_create(struct xe_device *xe, struct xe_vm *vm,
-					   u32 logical_mask, u16 width,
+					   u32 logical_mask, u16 num_bb_per_exec,
 					   struct xe_hw_engine *hw_engine, u32 flags);
 struct xe_exec_queue *xe_exec_queue_create_class(struct xe_device *xe, struct xe_gt *gt,
 						 struct xe_vm *vm,
@@ -40,7 +40,7 @@ static inline void xe_exec_queue_put(struct xe_exec_queue *q)
 
 static inline bool xe_exec_queue_is_parallel(struct xe_exec_queue *q)
 {
-	return q->width > 1;
+	return q->num_bb_per_exec > 1;
 }
 
 bool xe_exec_queue_is_lr(struct xe_exec_queue *q);
diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h
index ecd761177567..eb924a3e5d98 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue_types.h
+++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h
@@ -47,8 +47,8 @@ struct xe_exec_queue {
 	u32 logical_mask;
 	/** @name: name of this exec queue */
 	char name[MAX_FENCE_NAME_LEN];
-	/** @width: width (number BB submitted per exec) of this exec queue */
-	u16 width;
+	/** @num_bb_per_exec: the width of this exec queue */
+	u16 num_bb_per_exec;
 	/** @fence_irq: fence IRQ used to signal job completion */
 	struct xe_hw_fence_irq *fence_irq;
 
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 870dc5c532fa..b5a41a772445 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -259,7 +259,7 @@ static void __release_guc_id(struct xe_guc *guc, struct xe_exec_queue *q, u32 xa
 	if (xe_exec_queue_is_parallel(q))
 		bitmap_release_region(guc->submission_state.guc_ids_bitmap,
 				      q->guc->id - GUC_ID_START_MLRC,
-				      order_base_2(q->width));
+				      order_base_2(q->num_bb_per_exec));
 	else
 		ida_simple_remove(&guc->submission_state.guc_ids, q->guc->id);
 }
@@ -283,7 +283,7 @@ static int alloc_guc_id(struct xe_guc *guc, struct xe_exec_queue *q)
 		void *bitmap = guc->submission_state.guc_ids_bitmap;
 
 		ret = bitmap_find_free_region(bitmap, GUC_ID_NUMBER_MLRC,
-					      order_base_2(q->width));
+					      order_base_2(q->num_bb_per_exec));
 	} else {
 		ret = ida_simple_get(&guc->submission_state.guc_ids, 0,
 				     GUC_ID_NUMBER_SLRC, GFP_NOWAIT);
@@ -295,7 +295,7 @@ static int alloc_guc_id(struct xe_guc *guc, struct xe_exec_queue *q)
 	if (xe_exec_queue_is_parallel(q))
 		q->guc->id += GUC_ID_START_MLRC;
 
-	for (i = 0; i < q->width; ++i) {
+	for (i = 0; i < q->num_bb_per_exec; ++i) {
 		ptr = xa_store(&guc->submission_state.exec_queue_lookup,
 			       q->guc->id + i, q, GFP_NOWAIT);
 		if (IS_ERR(ptr)) {
@@ -315,7 +315,7 @@ static int alloc_guc_id(struct xe_guc *guc, struct xe_exec_queue *q)
 static void release_guc_id(struct xe_guc *guc, struct xe_exec_queue *q)
 {
 	mutex_lock(&guc->submission_state.lock);
-	__release_guc_id(guc, q, q->width);
+	__release_guc_id(guc, q, q->num_bb_per_exec);
 	mutex_unlock(&guc->submission_state.lock);
 }
 
@@ -426,11 +426,11 @@ static void __register_mlrc_engine(struct xe_guc *guc,
 	action[len++] = info->wq_base_lo;
 	action[len++] = info->wq_base_hi;
 	action[len++] = info->wq_size;
-	action[len++] = q->width;
+	action[len++] = q->num_bb_per_exec;
 	action[len++] = info->hwlrca_lo;
 	action[len++] = info->hwlrca_hi;
 
-	for (i = 1; i < q->width; ++i) {
+	for (i = 1; i < q->num_bb_per_exec; ++i) {
 		struct xe_lrc *lrc = q->lrc + i;
 
 		action[len++] = lower_32_bits(xe_lrc_descriptor(lrc));
@@ -578,7 +578,7 @@ static void wq_item_append(struct xe_exec_queue *q)
 	struct iosys_map map = xe_lrc_parallel_map(q->lrc);
 #define WQ_HEADER_SIZE	4	/* Includes 1 LRC address too */
 	u32 wqi[XE_HW_ENGINE_MAX_INSTANCE + (WQ_HEADER_SIZE - 1)];
-	u32 wqi_size = (q->width + (WQ_HEADER_SIZE - 1)) * sizeof(u32);
+	u32 wqi_size = (q->num_bb_per_exec + (WQ_HEADER_SIZE - 1)) * sizeof(u32);
 	u32 len_dw = (wqi_size / sizeof(u32)) - 1;
 	int i = 0, j;
 
@@ -595,7 +595,7 @@ static void wq_item_append(struct xe_exec_queue *q)
 	wqi[i++] = FIELD_PREP(WQ_GUC_ID_MASK, q->guc->id) |
 		FIELD_PREP(WQ_RING_TAIL_MASK, q->lrc->ring.tail / sizeof(u64));
 	wqi[i++] = 0;
-	for (j = 1; j < q->width; ++j) {
+	for (j = 1; j < q->num_bb_per_exec; ++j) {
 		struct xe_lrc *lrc = q->lrc + j;
 
 		wqi[i++] = lrc->ring.tail / sizeof(u64);
@@ -766,17 +766,17 @@ static void simple_error_capture(struct xe_exec_queue *q)
 	struct xe_hw_engine *hwe;
 	enum xe_hw_engine_id id;
 	u32 adj_logical_mask = q->logical_mask;
-	u32 width_mask = (0x1 << q->width) - 1;
+	u32 width_mask = (0x1 << q->num_bb_per_exec) - 1;
 	int i;
 	bool cookie;
 
 	if (q->vm && !q->vm->error_capture.capture_once) {
 		q->vm->error_capture.capture_once = true;
 		cookie = dma_fence_begin_signalling();
-		for (i = 0; q->width > 1 && i < XE_HW_ENGINE_MAX_INSTANCE;) {
+		for (i = 0; q->num_bb_per_exec > 1 && i < XE_HW_ENGINE_MAX_INSTANCE;) {
 			if (adj_logical_mask & BIT(i)) {
 				adj_logical_mask |= width_mask << i;
-				i += q->width;
+				i += q->num_bb_per_exec;
 			} else {
 				++i;
 			}
@@ -1462,7 +1462,7 @@ static void guc_exec_queue_start(struct xe_exec_queue *q)
 		int i;
 
 		trace_xe_exec_queue_resubmit(q);
-		for (i = 0; i < q->width; ++i)
+		for (i = 0; i < q->num_bb_per_exec; ++i)
 			xe_lrc_set_ring_head(q->lrc + i, q->lrc[i].ring.tail);
 		drm_sched_resubmit_jobs(sched);
 	}
@@ -1508,7 +1508,7 @@ g2h_exec_queue_lookup(struct xe_guc *guc, u32 guc_id)
 	}
 
 	xe_assert(xe, guc_id >= q->guc->id);
-	xe_assert(xe, guc_id < (q->guc->id + q->width));
+	xe_assert(xe, guc_id < (q->guc->id + q->num_bb_per_exec));
 
 	return q;
 }
@@ -1768,20 +1768,20 @@ xe_guc_exec_queue_snapshot_capture(struct xe_exec_queue *q)
 	memcpy(&snapshot->name, &q->name, sizeof(snapshot->name));
 	snapshot->class = q->class;
 	snapshot->logical_mask = q->logical_mask;
-	snapshot->width = q->width;
+	snapshot->width = q->num_bb_per_exec;
 	snapshot->refcount = kref_read(&q->refcount);
 	snapshot->sched_timeout = sched->timeout;
 	snapshot->sched_props.timeslice_us = q->sched_props.timeslice_us;
 	snapshot->sched_props.preempt_timeout_us =
 		q->sched_props.preempt_timeout_us;
 
-	snapshot->lrc = kmalloc_array(q->width, sizeof(struct lrc_snapshot),
+	snapshot->lrc = kmalloc_array(q->num_bb_per_exec, sizeof(struct lrc_snapshot),
 				      GFP_ATOMIC);
 
 	if (!snapshot->lrc) {
 		drm_err(&xe->drm, "Skipping GuC Engine LRC snapshot.\n");
 	} else {
-		for (i = 0; i < q->width; ++i) {
+		for (i = 0; i < q->num_bb_per_exec; ++i) {
 			struct xe_lrc *lrc = q->lrc + i;
 
 			snapshot->lrc[i].context_desc =
diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
index 59e0aa2d6a4c..d3d671784e8e 100644
--- a/drivers/gpu/drm/xe/xe_ring_ops.c
+++ b/drivers/gpu/drm/xe/xe_ring_ops.c
@@ -383,7 +383,7 @@ static void emit_job_gen12_gsc(struct xe_sched_job *job)
 {
 	struct xe_gt *gt = job->q->gt;
 
-	xe_gt_assert(gt, job->q->width <= 1); /* no parallel submission for GSCCS */
+	xe_gt_assert(gt, job->q->num_bb_per_exec <= 1); /* no parallel submission for GSCCS */
 
 	__emit_job_gen12_simple(job, job->q->lrc,
 				job->batch_addr[0],
@@ -400,7 +400,7 @@ static void emit_job_gen12_copy(struct xe_sched_job *job)
 		return;
 	}
 
-	for (i = 0; i < job->q->width; ++i)
+	for (i = 0; i < job->q->num_bb_per_exec; ++i)
 		__emit_job_gen12_simple(job, job->q->lrc + i,
 				        job->batch_addr[i],
 				        xe_sched_job_seqno(job));
@@ -411,7 +411,7 @@ static void emit_job_gen12_video(struct xe_sched_job *job)
 	int i;
 
 	/* FIXME: Not doing parallel handshake for now */
-	for (i = 0; i < job->q->width; ++i)
+	for (i = 0; i < job->q->num_bb_per_exec; ++i)
 		__emit_job_gen12_video(job, job->q->lrc + i,
 				       job->batch_addr[i],
 				       xe_sched_job_seqno(job));
@@ -421,7 +421,7 @@ static void emit_job_gen12_render_compute(struct xe_sched_job *job)
 {
 	int i;
 
-	for (i = 0; i < job->q->width; ++i)
+	for (i = 0; i < job->q->num_bb_per_exec; ++i)
 		__emit_job_gen12_render_compute(job, job->q->lrc + i,
 						job->batch_addr[i],
 						xe_sched_job_seqno(job));
diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c
index adbd82f8744e..1884b6b6b398 100644
--- a/drivers/gpu/drm/xe/xe_sched_job.c
+++ b/drivers/gpu/drm/xe/xe_sched_job.c
@@ -117,13 +117,13 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
 	} else {
 		struct dma_fence_array *cf;
 
-		fences = kmalloc_array(q->width, sizeof(*fences), GFP_KERNEL);
+		fences = kmalloc_array(q->num_bb_per_exec, sizeof(*fences), GFP_KERNEL);
 		if (!fences) {
 			err = -ENOMEM;
 			goto err_sched_job;
 		}
 
-		for (j = 0; j < q->width; ++j) {
+		for (j = 0; j < q->num_bb_per_exec; ++j) {
 			fences[j] = xe_lrc_create_seqno_fence(q->lrc + j);
 			if (IS_ERR(fences[j])) {
 				err = PTR_ERR(fences[j]);
@@ -131,7 +131,7 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
 			}
 		}
 
-		cf = dma_fence_array_create(q->width, fences,
+		cf = dma_fence_array_create(q->num_bb_per_exec, fences,
 					    q->parallel.composite_fence_ctx,
 					    q->parallel.composite_fence_seqno++,
 					    false);
@@ -142,13 +142,13 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
 		}
 
 		/* Sanity check */
-		for (j = 0; j < q->width; ++j)
+		for (j = 0; j < q->num_bb_per_exec; ++j)
 			xe_assert(job_to_xe(job), cf->base.seqno == fences[j]->seqno);
 
 		job->fence = &cf->base;
 	}
 
-	width = q->width;
+	width = q->num_bb_per_exec;
 	if (is_migration)
 		width = 2;
 
diff --git a/drivers/gpu/drm/xe/xe_trace.h b/drivers/gpu/drm/xe/xe_trace.h
index 1536130e56f6..d49b6d9c480a 100644
--- a/drivers/gpu/drm/xe/xe_trace.h
+++ b/drivers/gpu/drm/xe/xe_trace.h
@@ -112,7 +112,7 @@ DECLARE_EVENT_CLASS(xe_exec_queue,
 			     __field(enum xe_engine_class, class)
 			     __field(u32, logical_mask)
 			     __field(u8, gt_id)
-			     __field(u16, width)
+			     __field(u16, num_bb_per_exec)
 			     __field(u16, guc_id)
 			     __field(u32, guc_state)
 			     __field(u32, flags)
@@ -122,15 +122,15 @@ DECLARE_EVENT_CLASS(xe_exec_queue,
 			   __entry->class = q->class;
 			   __entry->logical_mask = q->logical_mask;
 			   __entry->gt_id = q->gt->info.id;
-			   __entry->width = q->width;
+			   __entry->num_bb_per_exec = q->num_bb_per_exec;
 			   __entry->guc_id = q->guc->id;
 			   __entry->guc_state = atomic_read(&q->guc->state);
 			   __entry->flags = q->flags;
 			   ),
 
-		    TP_printk("%d:0x%x, gt=%d, width=%d, guc_id=%d, guc_state=0x%x, flags=0x%x",
+		    TP_printk("%d:0x%x, gt=%d, num_bb_per_exec=%d, guc_id=%d, guc_state=0x%x, flags=0x%x",
 			      __entry->class, __entry->logical_mask,
-			      __entry->gt_id, __entry->width, __entry->guc_id,
+			      __entry->gt_id, __entry->num_bb_per_exec, __entry->guc_id,
 			      __entry->guc_state, __entry->flags)
 );
 
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 144a423868cf..df8c5663f899 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -1008,6 +1008,68 @@ struct drm_xe_sync {
 	__u64 reserved[2];
 };
 
+/**
+ * DOC: Execution Queue
+ *
+ * The Execution Queue abstracts the Hardware Engine that is going to be used
+ * with the execution of the Batch Buffers in &DRM_IOCTL_XE_EXEC
+ *
+ * In a regular usage of this execution queue, only one hardware engine pointer
+ * would be given as input of the @instances below and both @num_bb_per_exec and
+ * @num_eng_per_bb would be set to '1'.
+ *
+ * Regular execution example::
+ *
+ *                    ┌─────┐
+ *                    │ BB0 │
+ *                    └──┬──┘
+ *                       │     @num_bb_per_exec = 1
+ *                       │     @num_eng_per_bb = 1
+ *                       │     @instances = {Engine0}
+ *                       ▼
+ *                   ┌───────┐
+ *                   │Engine0│
+ *                   └───────┘
+ *
+ * However this execution queue is flexible to be used for parallel submission or
+ * for load balancing submission (a.k.a virtual load balancing).
+ *
+ * In a parallel submission, different batch buffers will be simultaneously
+ * dispatched to different engines listed in @instances, in a 1-1 relationship.
+ *
+ * Parallel execution example::
+ *
+ *               ┌─────┐   ┌─────┐
+ *               │ BB0 │   │ BB1 │
+ *               └──┬──┘   └──┬──┘
+ *                  │         │     @num_bb_per_exec = 2
+ *                  │         │     @num_eng_per_bb = 1
+ *                  │         │     @instances = {Engine0, Engine1}
+ *                  ▼         ▼
+ *              ┌───────┐ ┌───────┐
+ *              │Engine0│ │Engine1│
+ *              └───────┘ └───────┘
+ *
+ * On a load balancing submission, each batch buffer is virtually dispatched
+ * to all of the listed engine @instances. Then, underneath driver, firmware, or
+ * hardware can select the best available engine to actually run the job.
+ *
+ * Virtual Load Balancing example::
+ *
+ *                    ┌─────┐
+ *                    │ BB0 │
+ *                    └──┬──┘
+ *                       │      @num_bb_per_exec = 1
+ *                       │      @num_eng_per_bb = 2
+ *                       │      @instances = {Engine0, Engine1}
+ *                  ┌────┴────┐
+ *                  │         │
+ *                  ▼         ▼
+ *              ┌───────┐ ┌───────┐
+ *              │Engine0│ │Engine1│
+ *              └───────┘ └───────┘
+ */
+
 /**
  * struct drm_xe_exec_queue_create - Input of &DRM_IOCTL_XE_EXEC_QUEUE_CREATE
  */
@@ -1016,11 +1078,17 @@ struct drm_xe_exec_queue_create {
 	/** @extensions: Pointer to the first extension struct, if any */
 	__u64 extensions;
 
-	/** @width: submission width (number BB per exec) for this exec queue */
-	__u16 width;
+	/**
+	 * @num_bb_per_exec: Indicates a submission width for this exec queue,
+	 * for how many batch buffers can be submitted in parallel.
+	 */
+	__u16 num_bb_per_exec;
 
-	/** @num_placements: number of valid placements for this exec queue */
-	__u16 num_placements;
+	/**
+	 * @num_eng_per_bb: Indicates how many possible engines are available
+	 * at @instances for the Xe to distribute the load.
+	 */
+	__u16 num_eng_per_bb;
 
 	/** @vm_id: VM to use for this exec queue */
 	__u32 vm_id;
@@ -1035,8 +1103,10 @@ struct drm_xe_exec_queue_create {
 	 * @instances: user pointer to a 2-d array of struct
 	 * drm_xe_engine_class_instance
 	 *
-	 * length = width (i) * num_placements (j)
-	 * index = j + i * width
+	 * Every engine in the array needs to have the same @sched_group_id
+	 *
+	 * length = num_bb_per_exec (i) * num_eng_per_bb (j)
+	 * index = j + i * num_bb_per_exec
 	 */
 	__u64 instances;
 
@@ -1146,7 +1216,7 @@ struct drm_xe_exec {
 
 	/**
 	 * @num_batch_buffer: number of batch buffer in this exec, must match
-	 * the width of the engine
+	 * the @num_bb_per_exec of the struct drm_xe_exec_queue_create
 	 */
 	__u16 num_batch_buffer;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 35/43] drm/xe/uapi: Refactor engine information
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (33 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 34/43] drm/xe/uapi: Exec queue documentation and variable renaming Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 12:07   ` Matthew Brost
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 36/43] drm/xe/uapi: Crystal Reference Clock updates Francois Dugast
                   ` (11 subsequent siblings)
  46 siblings, 1 reply; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

First of all, let's add the tile and GT IDs to the engine_info.
We originally tried to abstract the tile away from the uAPI, but
that is not future proof, since the tile might be important
information for user space with regard to cache lines.

Now that we have gt_id as part of the info, let's convert
the instance.gt_id into a generic scheduling group ID.
For all the current platforms, the scheduling group is the
GT ID underneath, but at least the API becomes flexible enough
to allow different kinds of engine grouping without necessarily
being tied to the GT ID.
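
As an illustration only, the rule that every engine passed at exec queue
creation must share one scheduling group (and one engine class) could be
pre-checked in user space roughly as below. The struct is a hypothetical
local mirror of struct drm_xe_engine_class_instance, and the helper name
eci_same_group_and_class() is an assumption; it is not part of this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct drm_xe_engine_class_instance after this patch. */
struct eci {
	uint16_t engine_class;
	uint16_t engine_instance;
	uint16_t sched_group_id;
	uint16_t pad;
};

/*
 * Return true when all entries share the scheduling group and engine class
 * of the first entry. An empty or single-entry list trivially passes.
 */
static bool eci_same_group_and_class(const struct eci *eci, size_t len)
{
	size_t n;

	for (n = 1; n < len; ++n) {
		if (eci[n].sched_group_id != eci[0].sched_group_id ||
		    eci[n].engine_class != eci[0].engine_class)
			return false;
	}

	return true;
}
```

This mirrors the comparison done in calc_validate_logical_mask(), which
rejects the list as soon as one entry differs from the previous one.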

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_exec_queue.c      | 17 +++++++++--------
 drivers/gpu/drm/xe/xe_query.c           | 13 ++++++++++---
 drivers/gpu/drm/xe/xe_wait_user_fence.c |  4 ++--
 include/uapi/drm/xe_drm.h               | 10 ++++++++--
 4 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index 064f25e5e3a5..e30363bb5152 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -500,13 +500,13 @@ find_hw_engine(struct xe_device *xe,
 	if (eci.engine_class > ARRAY_SIZE(user_to_xe_engine_class))
 		return NULL;
 
-	if (eci.gt_id >= xe->info.gt_count)
+	if (eci.sched_group_id >= xe->info.gt_count)
 		return NULL;
 
 	idx = array_index_nospec(eci.engine_class,
 				 ARRAY_SIZE(user_to_xe_engine_class));
 
-	return xe_gt_hw_engine(xe_device_get_gt(xe, eci.gt_id),
+	return xe_gt_hw_engine(xe_device_get_gt(xe, eci.sched_group_id),
 			       user_to_xe_engine_class[idx],
 			       eci.engine_instance, true);
 }
@@ -547,7 +547,7 @@ static u32 calc_validate_logical_mask(struct xe_device *xe, struct xe_gt *gt,
 	int len = num_bb_per_exec * num_eng_per_bb;
 	int i, j, n;
 	u16 class;
-	u16 gt_id;
+	u16 sched_group_id;
 	u32 return_mask = 0, prev_mask;
 
 	if (XE_IOCTL_DBG(xe, !xe_device_uc_enabled(xe) &&
@@ -569,12 +569,13 @@ static u32 calc_validate_logical_mask(struct xe_device *xe, struct xe_gt *gt,
 			if (XE_IOCTL_DBG(xe, xe_hw_engine_is_reserved(hwe)))
 				return 0;
 
-			if (XE_IOCTL_DBG(xe, n && eci[n].gt_id != gt_id) ||
+			if (XE_IOCTL_DBG(xe, n &&
+					 eci[n].sched_group_id != sched_group_id) ||
 			    XE_IOCTL_DBG(xe, n && eci[n].engine_class != class))
 				return 0;
 
 			class = eci[n].engine_class;
-			gt_id = eci[n].gt_id;
+			sched_group_id = eci[n].sched_group_id;
 
 			if (num_bb_per_exec == 1 || !i)
 				return_mask |= BIT(eci[n].engine_instance);
@@ -623,7 +624,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 	if (XE_IOCTL_DBG(xe, err))
 		return -EFAULT;
 
-	if (XE_IOCTL_DBG(xe, eci[0].gt_id >= xe->info.gt_count))
+	if (XE_IOCTL_DBG(xe, eci[0].sched_group_id >= xe->info.gt_count))
 		return -EINVAL;
 
 	if (eci[0].engine_class >= DRM_XE_ENGINE_CLASS_VM_BIND_ASYNC) {
@@ -636,7 +637,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 			if (xe_gt_is_media_type(gt))
 				continue;
 
-			eci[0].gt_id = gt->info.id;
+			eci[0].sched_group_id = gt->info.id;
 			logical_mask = bind_exec_queue_logical_mask(xe, gt, eci,
 								    args->num_bb_per_exec,
 								    args->num_eng_per_bb);
@@ -677,7 +678,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 					      &q->multi_gt_link);
 		}
 	} else {
-		gt = xe_device_get_gt(xe, eci[0].gt_id);
+		gt = xe_device_get_gt(xe, eci[0].sched_group_id);
 		logical_mask = calc_validate_logical_mask(xe, gt, eci,
 							  args->num_bb_per_exec,
 							  args->num_eng_per_bb);
diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index e5db18c91f01..99e1bfa9b446 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -131,10 +131,10 @@ query_engine_cycles(struct xe_device *xe,
 		return -EINVAL;
 
 	eci = &resp.eci;
-	if (eci->gt_id > XE_MAX_GT_PER_TILE)
+	if (eci->sched_group_id > XE_MAX_GT_PER_TILE)
 		return -EINVAL;
 
-	gt = xe_device_get_gt(xe, eci->gt_id);
+	gt = xe_device_get_gt(xe, eci->sched_group_id);
 	if (!gt)
 		return -EINVAL;
 
@@ -215,8 +215,15 @@ static int query_engines(struct xe_device *xe,
 				xe_to_user_engine_class[hwe->class];
 			hw_engine_info[i].instance.engine_instance =
 				hwe->logical_instance;
-			hw_engine_info[i].instance.gt_id = gt->info.id;
+			/*
+			 * Scheduling Group ID is the global GT ID for the
+			 * current hardware, although the API is flexible
+			 */
+			hw_engine_info[i].instance.sched_group_id = gt->info.id;
 			hw_engine_info[i].instance.pad = 0;
+			hw_engine_info[i].tile_id = gt_to_tile(gt)->id;
+			hw_engine_info[i].gt_id = gt->info.id;
+
 			/*
 			 * The mem_regions indexes in the mask below need to
 			 * directly identify the struct
diff --git a/drivers/gpu/drm/xe/xe_wait_user_fence.c b/drivers/gpu/drm/xe/xe_wait_user_fence.c
index 4d5c2555ce41..dcbb1c578b22 100644
--- a/drivers/gpu/drm/xe/xe_wait_user_fence.c
+++ b/drivers/gpu/drm/xe/xe_wait_user_fence.c
@@ -68,10 +68,10 @@ static int check_hw_engines(struct xe_device *xe,
 		enum xe_engine_class user_class =
 			user_to_xe_engine_class[eci[i].engine_class];
 
-		if (eci[i].gt_id >= xe->info.tile_count)
+		if (eci[i].sched_group_id >= xe->info.tile_count)
 			return -EINVAL;
 
-		if (!xe_gt_hw_engine(xe_device_get_gt(xe, eci[i].gt_id),
+		if (!xe_gt_hw_engine(xe_device_get_gt(xe, eci[i].sched_group_id),
 				     user_class, eci[i].engine_instance, true))
 			return -EINVAL;
 	}
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index df8c5663f899..342f22c2d9f0 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -211,8 +211,8 @@ struct drm_xe_engine_class_instance {
 	__u16 engine_class;
 	/** @engine_instance: Engine instance */
 	__u16 engine_instance;
-	/** @gt_id: GT ID the instance is associated with */
-	__u16 gt_id;
+	/** @sched_group_id: Scheduling Group ID for this engine instance */
+	__u16 sched_group_id;
 	/** @pad: MBZ */
 	__u16 pad;
 };
@@ -228,6 +228,12 @@ struct drm_xe_query_engine_info {
 	/** @instance: The @drm_xe_engine_class_instance */
 	struct drm_xe_engine_class_instance instance;
 
+	/** @tile_id: Tile ID where this Engine lives */
+	__u16 tile_id;
+
+	/** @gt_id: GT ID where this Engine lives */
+	__u16 gt_id;
+
 	/**
 	 * @near_mem_regions: Bit mask of instances from
 	 * drm_xe_query_mem_regions that is near this engine.
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 36/43] drm/xe/uapi: Crystal Reference Clock updates
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (34 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 35/43] drm/xe/uapi: Refactor engine information Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 37/43] drm/xe/uapi: Add Tile ID information to the GT info query Francois Dugast
                   ` (10 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Matt Roper, Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

First of all, let's remove the duplication.
But also, let's rename it to remove the word 'frequency'
from it. In general, the first thing people associate with frequency
is the frequency at which the GTs operate to execute the
GPU instructions.

This frequency, however, is a crystal reference clock frequency,
which is the base of everything else, and in the case of this
uAPI it is used to calculate a more precise timestamp.
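
A minimal user space sketch of that calculation, assuming the cycle count
comes from the engine cycles query and the clock (in Hz) from the GT's
reference_clock. The helper name cycles_to_ns() is hypothetical. It rounds
to the closest nanosecond like xe_gt_clock_cycles_to_ns(), but splits the
division so the intermediate product cannot overflow 64 bits.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Convert a cycle count to nanoseconds using the reference clock in Hz.
 * Whole seconds and the remainder are converted separately: the remainder
 * is below reference_clock (a 32-bit value), so remainder * NSEC_PER_SEC
 * always fits in 64 bits.
 */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t reference_clock)
{
	uint64_t secs, rem;

	if (!reference_clock)
		return 0; /* no clock reported, nothing to convert */

	secs = cycles / reference_clock;
	rem = cycles % reference_clock;

	return secs * NSEC_PER_SEC +
	       (rem * NSEC_PER_SEC + reference_clock / 2) / reference_clock;
}
```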

v2: (Suggested by Jose) Remove the engine_cs and keep the GT info one
since it might be useful for other SRIOV cases where the engine_cs
will be zeroed. So, grabbing from the GT_LIST should be cleaner.

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Jose Souza <jose.souza@intel.com>

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_gt_clock.c |  4 ++--
 drivers/gpu/drm/xe/xe_gt_types.h |  4 ++--
 drivers/gpu/drm/xe/xe_query.c    |  8 +-------
 include/uapi/drm/xe_drm.h        | 11 ++++-------
 4 files changed, 9 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gt_clock.c b/drivers/gpu/drm/xe/xe_gt_clock.c
index 25a18eaad9c4..937054e31d72 100644
--- a/drivers/gpu/drm/xe/xe_gt_clock.c
+++ b/drivers/gpu/drm/xe/xe_gt_clock.c
@@ -75,11 +75,11 @@ int xe_gt_clock_init(struct xe_gt *gt)
 		freq >>= 3 - REG_FIELD_GET(RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK, c0);
 	}
 
-	gt->info.clock_freq = freq;
+	gt->info.reference_clock = freq;
 	return 0;
 }
 
 u64 xe_gt_clock_cycles_to_ns(const struct xe_gt *gt, u64 count)
 {
-	return DIV_ROUND_CLOSEST_ULL(count * NSEC_PER_SEC, gt->info.clock_freq);
+	return DIV_ROUND_CLOSEST_ULL(count * NSEC_PER_SEC, gt->info.reference_clock);
 }
diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index d3f2793684e2..56b0f22ee78d 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -107,8 +107,8 @@ struct xe_gt {
 		enum xe_gt_type type;
 		/** @id: Unique ID of this GT within the PCI Device */
 		u8 id;
-		/** @clock_freq: clock frequency */
-		u32 clock_freq;
+		/** @reference_clock: clock frequency */
+		u32 reference_clock;
 		/** @engine_mask: mask of engines present on GT */
 		u64 engine_mask;
 		/**
diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 99e1bfa9b446..2fcb2a4846ef 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -146,8 +146,6 @@ query_engine_cycles(struct xe_device *xe,
 	if (!hwe)
 		return -EINVAL;
 
-	resp.engine_frequency = gt->info.clock_freq;
-
 	xe_device_mem_access_get(xe);
 	xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
 
@@ -163,10 +161,6 @@ query_engine_cycles(struct xe_device *xe,
 	xe_device_mem_access_put(xe);
 	resp.width = 36;
 
-	/* Only write to the output fields of user query */
-	if (put_user(resp.engine_frequency, &query_ptr->engine_frequency))
-		return -EFAULT;
-
 	if (put_user(resp.cpu_timestamp, &query_ptr->cpu_timestamp))
 		return -EFAULT;
 
@@ -410,7 +404,7 @@ static int query_gt_list(struct xe_device *xe, struct drm_xe_device_query *query
 		else
 			gt_list->gt_list[id].type = DRM_XE_QUERY_GT_TYPE_MAIN;
 		gt_list->gt_list[id].gt_id = gt->info.id;
-		gt_list->gt_list[id].clock_freq = gt->info.clock_freq;
+		gt_list->gt_list[id].reference_clock = gt->info.reference_clock;
 	}
 
 	if (copy_to_user(query_ptr, gt_list, size)) {
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 342f22c2d9f0..6018062df378 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -429,8 +429,8 @@ struct drm_xe_query_gt {
 	__u16 type;
 	/** @gt_id: Unique ID of this GT within the PCI Device */
 	__u16 gt_id;
-	/** @clock_freq: A clock frequency for timestamp */
-	__u32 clock_freq;
+	/** @reference_clock: A clock frequency for timestamp */
+	__u32 reference_clock;
 	/** @reserved: Reserved */
 	__u64 reserved[8];
 };
@@ -504,8 +504,8 @@ struct drm_xe_query_topology_mask {
  * in .data. struct drm_xe_query_engine_cycles is allocated by the user and
  * .data points to this allocated structure.
  *
- * The query returns the engine cycles and the frequency that can
- * be used to calculate the engine timestamp. In addition the
+ * The query returns the engine cycles, which along with GT's @reference_clock,
+ * can be used to calculate the engine timestamp. In addition the
  * query returns a set of cpu timestamps that indicate when the command
  * streamer cycle count was captured.
  */
@@ -533,9 +533,6 @@ struct drm_xe_query_engine_cycles {
 	 */
 	__u64 engine_cycles;
 
-	/** @engine_frequency: Frequency of the engine cycles in Hz. */
-	__u64 engine_frequency;
-
 	/**
 	 * @cpu_timestamp: CPU timestamp in ns. The timestamp is captured before
 	 * reading the engine_cycles register using the reference clockid set by the
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 37/43] drm/xe/uapi: Add Tile ID information to the GT info query
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (35 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 36/43] drm/xe/uapi: Crystal Reference Clock updates Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 38/43] drm/xe/uapi: Remove bogus engine list from the wait_user_fence IOCTL Francois Dugast
                   ` (9 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

This is for information only, so that userspace can use it
to correlate different GTs.

Make the API symmetric between Engine and GT info.

There's no need right now to include a tile_query entry
since there's no other information that we need from tile
that is not already exposed through different queries.

However, this could be added later if we have different Tile
information that could matter to userspace. But let's keep
the API ready for a direct reference to Tile ID based on
the GT entry.
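
For illustration, correlating GTs by tile in user space could look like
the sketch below. The struct is a hypothetical, trimmed mirror of the
fields of struct drm_xe_query_gt touched here, and gts_on_tile() is an
assumed helper name, not something this patch provides.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed mirror of the GT info fields used below. */
struct gt_info {
	uint16_t type;
	uint16_t tile_id;
	uint16_t gt_id;
};

/* Count how many GTs in the list report the given tile ID. */
static size_t gts_on_tile(const struct gt_info *gts, size_t num_gt,
			  uint16_t tile_id)
{
	size_t i, count = 0;

	for (i = 0; i < num_gt; ++i)
		if (gts[i].tile_id == tile_id)
			count++;

	return count;
}
```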

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 1 +
 include/uapi/drm/xe_drm.h     | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 2fcb2a4846ef..eea89c0e7243 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -403,6 +403,7 @@ static int query_gt_list(struct xe_device *xe, struct drm_xe_device_query *query
 			gt_list->gt_list[id].type = DRM_XE_QUERY_GT_TYPE_MEDIA;
 		else
 			gt_list->gt_list[id].type = DRM_XE_QUERY_GT_TYPE_MAIN;
+		gt_list->gt_list[id].tile_id = gt_to_tile(gt)->id;
 		gt_list->gt_list[id].gt_id = gt->info.id;
 		gt_list->gt_list[id].reference_clock = gt->info.reference_clock;
 	}
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 6018062df378..174e3b98b361 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -427,6 +427,8 @@ struct drm_xe_query_gt {
 #define DRM_XE_QUERY_GT_TYPE_MEDIA		1
 	/** @type: GT type: Main or Media */
 	__u16 type;
+	/** @tile_id: Tile ID where this GT lives (Information only) */
+	__u16 tile_id;
 	/** @gt_id: Unique ID of this GT within the PCI Device */
 	__u16 gt_id;
 	/** @reference_clock: A clock frequency for timestamp */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 38/43] drm/xe/uapi: Remove bogus engine list from the wait_user_fence IOCTL
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (36 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 37/43] drm/xe/uapi: Add Tile ID information to the GT info query Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 39/43] drm/xe/uapi: Align on a common way to return arrays (memory regions) Francois Dugast
                   ` (8 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Rodrigo Vivi

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

Right now this is only checking if the engine list is sane and nothing
else. In the end every operation with this IOCTL is a soft check.
So, let's formalize that and only use this IOCTL to wait on the fence.

Upon timeout, userspace then needs to inspect the engine properties,
like BAN, in order to determine the reset status and any other
information that is (or can be added) there.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_wait_user_fence.c | 56 +------------------------
 include/uapi/drm/xe_drm.h               | 17 +-------
 2 files changed, 3 insertions(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_wait_user_fence.c b/drivers/gpu/drm/xe/xe_wait_user_fence.c
index dcbb1c578b22..a9d231548498 100644
--- a/drivers/gpu/drm/xe/xe_wait_user_fence.c
+++ b/drivers/gpu/drm/xe/xe_wait_user_fence.c
@@ -58,29 +58,7 @@ static const enum xe_engine_class user_to_xe_engine_class[] = {
 	[DRM_XE_ENGINE_CLASS_COMPUTE] = XE_ENGINE_CLASS_COMPUTE,
 };
 
-static int check_hw_engines(struct xe_device *xe,
-			    struct drm_xe_engine_class_instance *eci,
-			    int num_engines)
-{
-	int i;
-
-	for (i = 0; i < num_engines; ++i) {
-		enum xe_engine_class user_class =
-			user_to_xe_engine_class[eci[i].engine_class];
-
-		if (eci[i].sched_group_id >= xe->info.tile_count)
-			return -EINVAL;
-
-		if (!xe_gt_hw_engine(xe_device_get_gt(xe, eci[i].sched_group_id),
-				     user_class, eci[i].engine_instance, true))
-			return -EINVAL;
-	}
-
-	return 0;
-}
-
-#define VALID_FLAGS	(DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP | \
-			 DRM_XE_UFENCE_WAIT_FLAG_ABSTIME)
+#define VALID_FLAGS	(DRM_XE_UFENCE_WAIT_FLAG_ABSTIME)
 #define MAX_OP		DRM_XE_UFENCE_WAIT_OP_LTE
 
 static long to_jiffies_timeout(struct xe_device *xe,
@@ -132,12 +110,8 @@ int xe_wait_user_fence_ioctl(struct drm_device *dev, void *data,
 	struct xe_device *xe = to_xe_device(dev);
 	DEFINE_WAIT_FUNC(w_wait, woken_wake_function);
 	struct drm_xe_wait_user_fence *args = data;
-	struct drm_xe_engine_class_instance eci[XE_HW_ENGINE_MAX_INSTANCE];
-	struct drm_xe_engine_class_instance __user *user_eci =
-		u64_to_user_ptr(args->instances);
 	u64 addr = args->addr;
 	int err;
-	bool no_engines = args->flags & DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP;
 	long timeout;
 	ktime_t start;
 
@@ -151,41 +125,13 @@ int xe_wait_user_fence_ioctl(struct drm_device *dev, void *data,
 	if (XE_IOCTL_DBG(xe, args->op > MAX_OP))
 		return -EINVAL;
 
-	if (XE_IOCTL_DBG(xe, no_engines &&
-			 (args->num_engines || args->instances)))
-		return -EINVAL;
-
-	if (XE_IOCTL_DBG(xe, !no_engines && !args->num_engines))
-		return -EINVAL;
-
 	if (XE_IOCTL_DBG(xe, addr & 0x7))
 		return -EINVAL;
 
-	if (XE_IOCTL_DBG(xe, args->num_engines > XE_HW_ENGINE_MAX_INSTANCE))
-		return -EINVAL;
-
-	if (!no_engines) {
-		err = copy_from_user(eci, user_eci,
-				     sizeof(struct drm_xe_engine_class_instance) *
-			     args->num_engines);
-		if (XE_IOCTL_DBG(xe, err))
-			return -EFAULT;
-
-		if (XE_IOCTL_DBG(xe, check_hw_engines(xe, eci,
-						      args->num_engines)))
-			return -EINVAL;
-	}
-
 	timeout = to_jiffies_timeout(xe, args);
 
 	start = ktime_get();
 
-	/*
-	 * FIXME: Very simple implementation at the moment, single wait queue
-	 * for everything. Could be optimized to have a wait queue for every
-	 * hardware engine. Open coding as 'do_compare' can sleep which doesn't
-	 * work with the wait_event_* macros.
-	 */
 	add_wait_queue(&xe->ufence_wq, &w_wait);
 	for (;;) {
 		err = do_compare(addr, args->value, args->mask, args->op);
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 174e3b98b361..3e3f2428e6c6 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -1279,8 +1279,7 @@ struct drm_xe_wait_user_fence {
 	/** @op: wait operation (type of comparison) */
 	__u16 op;
 
-#define DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP	(1 << 0)	/* e.g. Wait on VM bind */
-#define DRM_XE_UFENCE_WAIT_FLAG_ABSTIME	(1 << 1)
+#define DRM_XE_UFENCE_WAIT_FLAG_ABSTIME	(1 << 0)
 	/** @flags: wait flags */
 	__u16 flags;
 
@@ -1313,20 +1312,8 @@ struct drm_xe_wait_user_fence {
 	 */
 	__s64 timeout;
 
-	/**
-	 * @num_engines: number of engine instances to wait on, must be zero
-	 * when DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP set
-	 */
-	__u64 num_engines;
-
-	/**
-	 * @instances: user pointer to array of drm_xe_engine_class_instance to
-	 * wait on, must be NULL when DRM_XE_UFENCE_WAIT_FLAG_SOFT_OP set
-	 */
-	__u64 instances;
-
 	/** @reserved: Reserved */
-	__u64 reserved[2];
+	__u64 reserved[4];
 };
 
 /**
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Intel-xe] [PATCH v3 39/43] drm/xe/uapi: Align on a common way to return arrays (memory regions)
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (37 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 38/43] drm/xe/uapi: Remove bogus engine list from the wait_user_fence IOCTL Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 40/43] drm/xe/uapi: Align on a common way to return arrays (gt) Francois Dugast
                   ` (7 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

The uAPI provides queries which return arrays of elements. As of now
the format used in the struct is different depending on which element
is queried. Fix this for memory regions by applying the pattern below:

    struct drm_xe_query_X {
       __u32 num_X;
       struct drm_xe_X Xs[];
       ...
    }

This removes "query" in the name of struct drm_xe_query_mem_region
as it is not returned from the query IOCTL. There is no functional
change.
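
A rough sketch of how user space might size and walk a result that follows
this pattern. All type and function names below (example_item,
example_query_items, example_query_size(), example_total_size()) are
hypothetical stand-ins, not the uAPI structs. The size helper uses the
portable form of the offsetof() computation done in
calc_mem_regions_size().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element type, standing in for struct drm_xe_X. */
struct example_item {
	uint16_t mem_class;
	uint16_t instance;
	uint64_t total_size;
};

/* Hypothetical container: a count followed by a flexible array. */
struct example_query_items {
	uint32_t num_items;
	uint32_t pad;
	struct example_item items[];
};

/* Bytes needed for a container holding n elements. */
static size_t example_query_size(uint32_t n)
{
	return offsetof(struct example_query_items, items) +
	       (size_t)n * sizeof(struct example_item);
}

/* Walk the array using the count stored in the container itself. */
static uint64_t example_total_size(const struct example_query_items *q)
{
	uint64_t sum = 0;
	uint32_t i;

	for (i = 0; i < q->num_items; ++i)
		sum += q->items[i].total_size;

	return sum;
}
```

With one layout for every array query, the same sizing and iteration code
shape can be reused regardless of which element is queried.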

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 44 ++++++++++++++++++-----------------
 include/uapi/drm/xe_drm.h     | 38 +++++++++++++++---------------
 2 files changed, 42 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index eea89c0e7243..4ee8e848e24a 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -262,15 +262,15 @@ static size_t calc_mem_regions_size(struct xe_device *xe)
 		if (ttm_manager_type(&xe->ttm, i))
 			num_managers++;
 
-	return offsetof(struct drm_xe_query_mem_regions, regions[num_managers]);
+	return offsetof(struct drm_xe_query_mem_region, mem_regions[num_managers]);
 }
 
-static int query_mem_regions(struct xe_device *xe,
-			     struct drm_xe_device_query *query)
+static int query_mem_region(struct xe_device *xe,
+			    struct drm_xe_device_query *query)
 {
 	size_t size = calc_mem_regions_size(xe);
-	struct drm_xe_query_mem_regions *usage;
-	struct drm_xe_query_mem_regions __user *query_ptr =
+	struct drm_xe_query_mem_region *usage;
+	struct drm_xe_query_mem_region __user *query_ptr =
 		u64_to_user_ptr(query->data);
 	struct ttm_resource_manager *man;
 	int ret, i;
@@ -287,41 +287,43 @@ static int query_mem_regions(struct xe_device *xe,
 		return -ENOMEM;
 
 	man = ttm_manager_type(&xe->ttm, XE_PL_TT);
-	usage->regions[0].mem_class = DRM_XE_MEM_REGION_CLASS_SYSMEM;
+	usage->mem_regions[0].mem_class = DRM_XE_MEM_REGION_CLASS_SYSMEM;
 	/*
 	 * The instance needs to be a unique number that represents the index
 	 * in the placement mask used at xe_gem_create_ioctl() for the
 	 * xe_bo_create() placement.
 	 */
-	usage->regions[0].instance = 0;
-	usage->regions[0].min_page_size = PAGE_SIZE;
-	usage->regions[0].total_size = man->size << PAGE_SHIFT;
+	usage->mem_regions[0].instance = 0;
+	usage->mem_regions[0].min_page_size = PAGE_SIZE;
+	usage->mem_regions[0].total_size = man->size << PAGE_SHIFT;
 	if (perfmon_capable())
-		usage->regions[0].used = ttm_resource_manager_usage(man);
-	usage->num_regions = 1;
+		usage->mem_regions[0].used = ttm_resource_manager_usage(man);
+	usage->num_mem_regions = 1;
 
 	for (i = XE_PL_VRAM0; i <= XE_PL_VRAM1; ++i) {
 		man = ttm_manager_type(&xe->ttm, i);
 		if (man) {
-			usage->regions[usage->num_regions].mem_class =
+			usage->mem_regions[usage->num_mem_regions].mem_class =
 				DRM_XE_MEM_REGION_CLASS_VRAM;
-			usage->regions[usage->num_regions].instance =
-				usage->num_regions;
-			usage->regions[usage->num_regions].min_page_size =
+			usage->mem_regions[usage->num_mem_regions].instance =
+				usage->num_mem_regions;
+			usage->mem_regions[usage->num_mem_regions].min_page_size =
 				xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K ?
 				SZ_64K : PAGE_SIZE;
-			usage->regions[usage->num_regions].total_size =
+			usage->mem_regions[usage->num_mem_regions].total_size =
 				man->size;
 
 			if (perfmon_capable()) {
 				xe_ttm_vram_get_used(man,
-						     &usage->regions[usage->num_regions].used,
-						     &usage->regions[usage->num_regions].cpu_visible_used);
+						     &usage->mem_regions
+						     [usage->num_mem_regions].used,
+						     &usage->mem_regions
+						     [usage->num_mem_regions].cpu_visible_used);
 			}
 
-			usage->regions[usage->num_regions].cpu_visible_size =
+			usage->mem_regions[usage->num_mem_regions].cpu_visible_size =
 				xe_ttm_vram_get_cpu_visible_size(man);
-			usage->num_regions++;
+			usage->num_mem_regions++;
 		}
 	}
 
@@ -571,7 +573,7 @@ query_uc_fw_version(struct xe_device *xe, struct drm_xe_device_query *query)
 static int (* const xe_query_funcs[])(struct xe_device *xe,
 				      struct drm_xe_device_query *query) = {
 	query_engines,
-	query_mem_regions,
+	query_mem_region,
 	query_config,
 	query_gt_list,
 	query_hwconfig,
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 3e3f2428e6c6..74007c0ea970 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -236,23 +236,23 @@ struct drm_xe_query_engine_info {
 
 	/**
 	 * @near_mem_regions: Bit mask of instances from
-	 * drm_xe_query_mem_regions that is near this engine.
+	 * drm_xe_query_mem_region that is near this engine.
 	 * Each index in this mask refers directly to the struct
-	 * drm_xe_query_mem_regions' instance, no assumptions should
+	 * drm_xe_query_mem_region's instance, no assumptions should
 	 * be made about order. The type of each region is described
-	 * by struct drm_xe_query_mem_regions' mem_class.
+	 * by struct drm_xe_mem_region's mem_class.
 	 */
 	__u64 near_mem_regions;
 	/**
 	 * @far_mem_regions: Bit mask of instances from
-	 * drm_xe_query_mem_regions that is far from this engine.
+	 * drm_xe_query_mem_region that is far from this engine.
 	 * In general, it has extra indirections when compared to the
 	 * @near_mem_regions. For a discrete device this could mean system
 	 * memory and memory living in a different Tile.
 	 * Each index in this mask refers directly to the struct
-	 * drm_xe_query_mem_regions' instance, no assumptions should
+	 * drm_xe_query_mem_region's instance, no assumptions should
 	 * be made about order. The type of each region is described
-	 * by struct drm_xe_query_mem_regions' mem_class.
+	 * by struct drm_xe_mem_region's mem_class.
 	 */
 	__u64 far_mem_regions;
 
@@ -275,10 +275,10 @@ enum drm_xe_memory_class {
 };
 
 /**
- * struct drm_xe_query_mem_region - Describes some region as known to
+ * struct drm_xe_mem_region - Describes some region as known to
  * the driver.
  */
-struct drm_xe_query_mem_region {
+struct drm_xe_mem_region {
 	/**
 	 * @mem_class: The memory class describing this region.
 	 *
@@ -353,19 +353,19 @@ struct drm_xe_query_mem_region {
 };
 
 /**
- * struct drm_xe_query_mem_regions - describe memory regions
+ * struct drm_xe_query_mem_region - describe memory regions
  *
  * If a query is made with a struct drm_xe_device_query where .query
- * is equal to DRM_XE_DEVICE_QUERY_MEM_REGIONS, then the reply uses
- * struct drm_xe_query_mem_regions in .data.
+ * is equal to DRM_XE_DEVICE_QUERY_MEM_REGION, then the reply uses
+ * struct drm_xe_query_mem_region in .data.
  */
-struct drm_xe_query_mem_regions {
-	/** @num_regions: number of memory regions returned in @regions */
-	__u32 num_regions;
+struct drm_xe_query_mem_region {
+	/** @num_mem_regions: number of memory regions returned in @mem_regions */
+	__u32 num_mem_regions;
 	/** @pad: MBZ */
 	__u32 pad;
-	/** @regions: The returned regions for this device */
-	struct drm_xe_query_mem_region regions[];
+	/** @mem_regions: The returned memory regions for this device */
+	struct drm_xe_mem_region mem_regions[];
 };
 
 /**
@@ -653,7 +653,7 @@ struct drm_xe_device_query {
 	__u64 extensions;
 
 #define DRM_XE_DEVICE_QUERY_ENGINES		0
-#define DRM_XE_DEVICE_QUERY_MEM_REGIONS		1
+#define DRM_XE_DEVICE_QUERY_MEM_REGION		1
 #define DRM_XE_DEVICE_QUERY_CONFIG		2
 #define DRM_XE_DEVICE_QUERY_GT_LIST		3
 #define DRM_XE_DEVICE_QUERY_HWCONFIG		4
@@ -710,9 +710,9 @@ struct drm_xe_gem_create {
 	/**
 	 * @placement: A mask of memory instances of where GEM can be placed.
 	 * Each index in this mask refers directly to the struct
-	 * drm_xe_query_mem_regions' instance, no assumptions should
+	 * drm_xe_query_mem_region's instance, no assumptions should
 	 * be made about order. The type of each region is described
-	 * by struct drm_xe_query_mem_regions' mem_class.
+	 * by struct drm_xe_mem_region's mem_class.
 	 */
 	__u32 placement;
 
-- 
2.34.1
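
For readers following the rename, here is a rough user-space sketch of how a reply shaped like struct drm_xe_query_mem_region could be turned into a placement mask. It is only an illustration: the mock_ structs below are local stand-ins that copy a few of the fields visible in this patch (the real definitions in xe_drm.h carry more fields), and the helpers mock_mem_region_reply() and mock_placement_mask() are hypothetical names, not part of the uAPI.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Local stand-ins, NOT the real uAPI definitions. */
enum mock_mem_class {
	MOCK_MEM_CLASS_SYSMEM = 0,
	MOCK_MEM_CLASS_VRAM = 1,
};

struct mock_mem_region {
	uint16_t mem_class;
	uint16_t instance;	/* bit index of this region in a placement mask */
	uint32_t pad;
	uint64_t total_size;
};

/* Header plus flexible array, as in struct drm_xe_query_mem_region. */
struct mock_query_mem_region {
	uint32_t num_mem_regions;
	uint32_t pad;
	struct mock_mem_region mem_regions[];
};

/* Fake reply filled in the same order as the kernel code: sysmem, then VRAM. */
struct mock_query_mem_region *mock_mem_region_reply(uint32_t num_vram)
{
	struct mock_query_mem_region *q;
	uint32_t i;

	assert(num_vram < 32);
	q = calloc(1, sizeof(*q) + (1 + (size_t)num_vram) * sizeof(q->mem_regions[0]));
	if (!q)
		return NULL;

	q->mem_regions[0].mem_class = MOCK_MEM_CLASS_SYSMEM;
	q->mem_regions[0].instance = 0;
	q->num_mem_regions = 1;

	for (i = 0; i < num_vram; i++) {
		q->mem_regions[q->num_mem_regions].mem_class = MOCK_MEM_CLASS_VRAM;
		q->mem_regions[q->num_mem_regions].instance =
			(uint16_t)q->num_mem_regions;
		q->num_mem_regions++;
	}
	return q;
}

/* OR together BIT(instance) for every region of the requested class. */
uint32_t mock_placement_mask(const struct mock_query_mem_region *q,
			     uint16_t mem_class)
{
	uint32_t mask = 0;
	uint32_t i;

	for (i = 0; i < q->num_mem_regions; i++) {
		assert(q->mem_regions[i].instance < 32);
		if (q->mem_regions[i].mem_class == mem_class)
			mask |= UINT32_C(1) << q->mem_regions[i].instance;
	}
	return mask;
}
```

The mask is derived from each region's instance rather than from its array position, since the uAPI comments say no assumptions should be made about order.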



* [Intel-xe] [PATCH v3 40/43] drm/xe/uapi: Align on a common way to return arrays (gt)
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (38 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 39/43] drm/xe/uapi: Align on a common way to return arrays (memory regions) Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 41/43] drm/xe/uapi: Align on a common way to return arrays (engines) Francois Dugast
                   ` (6 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

The uAPI provides queries which return arrays of elements. As of now
the format used in the struct is different depending on which element
is queried. Fix this for gt by applying the pattern below:

    struct drm_xe_query_X {
       __u32 num_X;
       struct drm_xe_X Xs[];
       ...
    }

However, strictly following this rule would bring back the name "gts",
which was dropped in commit ("drm/xe/uapi: Rename gts to gt_list"), so
make an exception for gt and keep num_gt (singular) and "gt_list".
Also, this change removes "query" from the name of the per-GT struct
drm_xe_query_gt (now struct drm_xe_gt), as it is not returned directly
from the query IOCTL. There is no functional change.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c |  8 ++++----
 include/uapi/drm/xe_drm.h     | 20 ++++++++++----------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 4ee8e848e24a..0077033fa753 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -381,11 +381,11 @@ static int query_config(struct xe_device *xe, struct drm_xe_device_query *query)
 static int query_gt_list(struct xe_device *xe, struct drm_xe_device_query *query)
 {
 	struct xe_gt *gt;
-	size_t size = sizeof(struct drm_xe_query_gt_list) +
-		xe->info.gt_count * sizeof(struct drm_xe_query_gt);
-	struct drm_xe_query_gt_list __user *query_ptr =
+	size_t size = sizeof(struct drm_xe_query_gt) +
+		xe->info.gt_count * sizeof(struct drm_xe_gt);
+	struct drm_xe_query_gt __user *query_ptr =
 		u64_to_user_ptr(query->data);
-	struct drm_xe_query_gt_list *gt_list;
+	struct drm_xe_query_gt *gt_list;
 	u8 id;
 
 	if (query->size == 0) {
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 74007c0ea970..467d6877f887 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -415,14 +415,14 @@ struct drm_xe_query_config {
 };
 
 /**
- * struct drm_xe_query_gt - describe an individual GT.
+ * struct drm_xe_gt - describe an individual GT.
  *
- * To be used with drm_xe_query_gt_list, which will return a list with all the
+ * To be used with drm_xe_query_gt, which will return a list with all the
  * existing GT individual descriptions.
  * Graphics Technology (GT) is a subset of a GPU/tile that is responsible for
  * implementing graphics and/or media operations.
  */
-struct drm_xe_query_gt {
+struct drm_xe_gt {
 #define DRM_XE_QUERY_GT_TYPE_MAIN		0
 #define DRM_XE_QUERY_GT_TYPE_MEDIA		1
 	/** @type: GT type: Main or Media */
@@ -438,19 +438,19 @@ struct drm_xe_query_gt {
 };
 
 /**
- * struct drm_xe_query_gt_list - A list with GT description items.
+ * struct drm_xe_query_gt - A list with GT description items.
  *
  * If a query is made with a struct drm_xe_device_query where .query
- * is equal to DRM_XE_DEVICE_QUERY_GT_LIST, then the reply uses struct
- * drm_xe_query_gt_list in .data.
+ * is equal to DRM_XE_DEVICE_QUERY_GT, then the reply uses struct
+ * drm_xe_query_gt in .data.
  */
-struct drm_xe_query_gt_list {
+struct drm_xe_query_gt {
 	/** @num_gt: number of GT items returned in gt_list */
 	__u32 num_gt;
 	/** @pad: MBZ */
 	__u32 pad;
 	/** @gt_list: The GT list returned for this device */
-	struct drm_xe_query_gt gt_list[];
+	struct drm_xe_gt gt_list[];
 };
 
 /**
@@ -605,7 +605,7 @@ struct drm_xe_query_uc_fw_version {
  *  - %DRM_XE_DEVICE_QUERY_ENGINES
  *  - %DRM_XE_DEVICE_QUERY_MEM_REGIONS
  *  - %DRM_XE_DEVICE_QUERY_CONFIG
- *  - %DRM_XE_DEVICE_QUERY_GT_LIST - Query type to retrieve the hardware
+ *  - %DRM_XE_DEVICE_QUERY_GT - Query type to retrieve the hardware
  *    configuration of the device such as information on slices, memory,
  *    caches, and so on. It is provided as a table of key / value
  *    attributes.
@@ -655,7 +655,7 @@ struct drm_xe_device_query {
 #define DRM_XE_DEVICE_QUERY_ENGINES		0
 #define DRM_XE_DEVICE_QUERY_MEM_REGION		1
 #define DRM_XE_DEVICE_QUERY_CONFIG		2
-#define DRM_XE_DEVICE_QUERY_GT_LIST		3
+#define DRM_XE_DEVICE_QUERY_GT			3
 #define DRM_XE_DEVICE_QUERY_HWCONFIG		4
 #define DRM_XE_DEVICE_QUERY_GT_TOPOLOGY		5
 #define DRM_XE_DEVICE_QUERY_ENGINE_CYCLES	6
-- 
2.34.1



* [Intel-xe] [PATCH v3 41/43] drm/xe/uapi: Align on a common way to return arrays (engines)
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (39 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 40/43] drm/xe/uapi: Align on a common way to return arrays (gt) Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 42/43] drm/xe/uapi: Add block diagram of a device Francois Dugast
                   ` (5 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

The uAPI provides queries which return arrays of elements. As of now
the format used in the struct is different depending on which element
is queried. Fix this for engines by applying the pattern below:

        struct drm_xe_query_X {
           __u32 num_X;
           struct drm_xe_X Xs[];
           ...
        }

Instead of directly returning an array of struct
drm_xe_query_engine_info, a new struct drm_xe_query_engine is
introduced. It in turn contains an array of struct drm_xe_engine,
which holds the information about each engine.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 41 +++++++++---------
 include/uapi/drm/xe_drm.h     | 78 +++++++++++++++++++++--------------
 2 files changed, 69 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 0077033fa753..70a488e5ebbb 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -53,7 +53,8 @@ static size_t calc_hw_engine_info_size(struct xe_device *xe)
 			i++;
 		}
 
-	return i * sizeof(struct drm_xe_query_engine_info);
+	return sizeof(struct drm_xe_query_engine) +
+		i * sizeof(struct drm_xe_engine);
 }
 
 typedef u64 (*__ktime_func_t)(void);
@@ -180,9 +181,9 @@ static int query_engines(struct xe_device *xe,
 			 struct drm_xe_device_query *query)
 {
 	size_t size = calc_hw_engine_info_size(xe);
-	struct drm_xe_query_engine_info __user *query_ptr =
+	struct drm_xe_query_engine __user *query_ptr =
 		u64_to_user_ptr(query->data);
-	struct drm_xe_query_engine_info *hw_engine_info;
+	struct drm_xe_query_engine *engines;
 	struct xe_hw_engine *hwe;
 	enum xe_hw_engine_id id;
 	struct xe_gt *gt;
@@ -196,8 +197,8 @@ static int query_engines(struct xe_device *xe,
 		return -EINVAL;
 	}
 
-	hw_engine_info = kmalloc(size, GFP_KERNEL);
-	if (!hw_engine_info)
+	engines = kmalloc(size, GFP_KERNEL);
+	if (!engines)
 		return -ENOMEM;
 
 	for_each_gt(gt, xe, gt_id)
@@ -205,18 +206,18 @@ static int query_engines(struct xe_device *xe,
 			if (xe_hw_engine_is_reserved(hwe))
 				continue;
 
-			hw_engine_info[i].instance.engine_class =
+			engines->engines[i].instance.engine_class =
 				xe_to_user_engine_class[hwe->class];
-			hw_engine_info[i].instance.engine_instance =
+			engines->engines[i].instance.engine_instance =
 				hwe->logical_instance;
 			/*
 			 * Scheduling Group ID is the global GT ID for the
 			 * current hardware, although the API is flexible
 			 */
-			hw_engine_info[i].instance.sched_group_id = gt->info.id;
-			hw_engine_info[i].instance.pad = 0;
-			hw_engine_info[i].tile_id = gt_to_tile(gt)->id;
-			hw_engine_info[i].gt_id = gt->info.id;
+			engines->engines[i].instance.sched_group_id = gt->info.id;
+			engines->engines[i].instance.pad = 0;
+			engines->engines[i].tile_id = gt_to_tile(gt)->id;
+			engines->engines[i].gt_id = gt->info.id;
 
 			/*
 			 * The mem_regions indexes in the mask below need to
@@ -233,22 +234,24 @@ static int query_engines(struct xe_device *xe,
 			 * assumption.
 			 */
 			if (!IS_DGFX(xe))
-				hw_engine_info[i].near_mem_regions = 0x1;
+				engines->engines[i].near_mem_regions = 0x1;
 			else
-				hw_engine_info[i].near_mem_regions =
+				engines->engines[i].near_mem_regions =
 					BIT(gt_to_tile(gt)->id) << 1;
-			hw_engine_info[i].far_mem_regions = xe->info.mem_region_mask ^
-				hw_engine_info[i].near_mem_regions;
-			memset(hw_engine_info->reserved, 0, sizeof(hw_engine_info->reserved));
+			engines->engines[i].far_mem_regions = xe->info.mem_region_mask ^
+				engines->engines[i].near_mem_regions;
+			memset(engines->engines[i].reserved, 0, sizeof(engines->engines[i].reserved));
 
 			i++;
 		}
 
-	if (copy_to_user(query_ptr, hw_engine_info, size)) {
-		kfree(hw_engine_info);
+	engines->num_engines = i;
+
+	if (copy_to_user(query_ptr, engines, size)) {
+		kfree(engines);
 		return -EFAULT;
 	}
-	kfree(hw_engine_info);
+	kfree(engines);
 
 	return 0;
 }
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 467d6877f887..d586c8aeb279 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -179,9 +179,9 @@ struct drm_xe_ext_set_property {
 /**
  * struct drm_xe_engine_class_instance - instance of an engine class
  *
- * It is returned as part of the @drm_xe_query_engine_info, but it also is
- * used as the input of engine selection for both @drm_xe_exec_queue_create
- * and @drm_xe_query_engine_cycles
+ * It is returned as part of the @drm_xe_engine, but it also is used as
+ * the input of engine selection for both @drm_xe_exec_queue_create and
+ * @drm_xe_query_engine_cycles
  *
  * The @engine_class can be:
  *  - %DRM_XE_ENGINE_CLASS_RENDER
@@ -218,13 +218,9 @@ struct drm_xe_engine_class_instance {
 };
 
 /**
- * struct drm_xe_query_engine_info - describe hardware engine
- *
- * If a query is made with a struct @drm_xe_device_query where .query
- * is equal to %DRM_XE_DEVICE_QUERY_ENGINES, then the reply uses an array of
- * struct @drm_xe_query_engine_info in .data.
+ * struct drm_xe_engine - describe hardware engine
  */
-struct drm_xe_query_engine_info {
+struct drm_xe_engine {
 	/** @instance: The @drm_xe_engine_class_instance */
 	struct drm_xe_engine_class_instance instance;
 
@@ -260,6 +256,22 @@ struct drm_xe_query_engine_info {
 	__u64 reserved[5];
 };
 
+/**
+ * struct drm_xe_query_engine - describe engines
+ *
+ * If a query is made with a struct @drm_xe_device_query where .query
+ * is equal to %DRM_XE_DEVICE_QUERY_ENGINES, then the reply uses
+ * struct @drm_xe_query_engine in .data.
+ */
+struct drm_xe_query_engine {
+	/** @num_engines: number of engines returned in @engines */
+	__u32 num_engines;
+	/** @pad: MBZ */
+	__u32 pad;
+	/** @engines: The returned engines for this device */
+	struct drm_xe_engine engines[];
+};
+
 /**
  * enum drm_xe_memory_class - Supported memory classes.
  */
@@ -625,28 +637,32 @@ struct drm_xe_query_uc_fw_version {
  *
  * .. code-block:: C
  *
- *	struct drm_xe_engine_class_instance *hwe;
- *	struct drm_xe_device_query query = {
- *		.extensions = 0,
- *		.query = DRM_XE_DEVICE_QUERY_ENGINES,
- *		.size = 0,
- *		.data = 0,
- *	};
- *	ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query);
- *	hwe = malloc(query.size);
- *	query.data = (uintptr_t)hwe;
- *	ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query);
- *	int num_engines = query.size / sizeof(*hwe);
- *	for (int i = 0; i < num_engines; i++) {
- *		printf("Engine %d: %s\n", i,
- *			hwe[i].engine_class == DRM_XE_ENGINE_CLASS_RENDER ? "RENDER":
- *			hwe[i].engine_class == DRM_XE_ENGINE_CLASS_COPY ? "COPY":
- *			hwe[i].engine_class == DRM_XE_ENGINE_CLASS_VIDEO_DECODE ? "VIDEO_DECODE":
- *			hwe[i].engine_class == DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE ? "VIDEO_ENHANCE":
- *			hwe[i].engine_class == DRM_XE_ENGINE_CLASS_COMPUTE ? "COMPUTE":
- *			"UNKNOWN");
- *	}
- *	free(hwe);
+ *     struct drm_xe_query_engine *engines;
+ *     struct drm_xe_device_query query = {
+ *         .extensions = 0,
+ *         .query = DRM_XE_DEVICE_QUERY_ENGINES,
+ *         .size = 0,
+ *         .data = 0,
+ *     };
+ *     ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query);
+ *     engines = malloc(query.size);
+ *     query.data = (uintptr_t)engines;
+ *     ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query);
+ *     for (int i = 0; i < engines->num_engines; i++) {
+ *         printf("Engine %d: %s\n", i,
+ *             engines->engines[i].instance.engine_class ==
+ *                 DRM_XE_ENGINE_CLASS_RENDER ? "RENDER":
+ *             engines->engines[i].instance.engine_class ==
+ *                 DRM_XE_ENGINE_CLASS_COPY ? "COPY":
+ *             engines->engines[i].instance.engine_class ==
+ *                 DRM_XE_ENGINE_CLASS_VIDEO_DECODE ? "VIDEO_DECODE":
+ *             engines->engines[i].instance.engine_class ==
+ *                 DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE ? "VIDEO_ENHANCE":
+ *             engines->engines[i].instance.engine_class ==
+ *                 DRM_XE_ENGINE_CLASS_COMPUTE ? "COMPUTE":
+ *             "UNKNOWN");
+ *     }
+ *     free(engines);
  */
 struct drm_xe_device_query {
 	/** @extensions: Pointer to the first extension struct, if any */
-- 
2.34.1
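
The documentation example in this patch skips error handling, so here is a hedged sketch of the same two-call pattern (first call with size 0 to learn the size, second call to fetch the data) with the checks added. Nothing here talks to the real driver: mock_query_engines() is a hypothetical stand-in for the DRM_IOCTL_XE_DEVICE_QUERY ioctl, the mock_ structs keep only a few assumed fields, and the size-mismatch behaviour is an assumption based on the -EINVAL path visible in query_engines().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the uAPI structs, reduced to a few fields. */
struct mock_engine {
	uint16_t engine_class;
	uint16_t engine_instance;
	uint32_t pad;
};

struct mock_query_engine {
	uint32_t num_engines;
	uint32_t pad;
	struct mock_engine engines[];
};

struct mock_device_query {
	uint32_t size;
	uint64_t data;
};

#define MOCK_NUM_ENGINES 3

/*
 * Stand-in for the ioctl: size == 0 reports the required size, a matching
 * size fills the caller's buffer, anything else is rejected.
 */
int mock_query_engines(struct mock_device_query *query)
{
	size_t size = sizeof(struct mock_query_engine) +
		MOCK_NUM_ENGINES * sizeof(struct mock_engine);
	struct mock_query_engine *out;
	uint32_t i;

	if (query->size == 0) {
		query->size = (uint32_t)size;
		return 0;
	}
	if (query->size != size)
		return -EINVAL;

	out = (struct mock_query_engine *)(uintptr_t)query->data;
	memset(out, 0, size);
	out->num_engines = MOCK_NUM_ENGINES;
	for (i = 0; i < MOCK_NUM_ENGINES; i++)
		out->engines[i].engine_instance = (uint16_t)i;
	return 0;
}

/* Two-call pattern: ask for the size, allocate, then fetch the data. */
struct mock_query_engine *mock_get_engines(void)
{
	struct mock_device_query query = { .size = 0, .data = 0 };
	struct mock_query_engine *engines;

	if (mock_query_engines(&query) || query.size == 0)
		return NULL;

	engines = malloc(query.size);
	if (!engines)
		return NULL;

	query.data = (uint64_t)(uintptr_t)engines;
	if (mock_query_engines(&query)) {
		free(engines);
		return NULL;
	}
	assert(engines->num_engines == MOCK_NUM_ENGINES);
	return engines;
}
```

With the new layout the caller reads the element count from num_engines instead of dividing query.size by the element size, which is the main user-visible change of this patch.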



* [Intel-xe] [PATCH v3 42/43] drm/xe/uapi: Add block diagram of a device
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (40 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 41/43] drm/xe/uapi: Align on a common way to return arrays (engines) Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 43/43] drm/xe/uapi: Add examples of user space code Francois Dugast
                   ` (4 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

In order to make proper use of the uAPI, a prerequisite is to understand
some key concepts about the discrete GPU devices which are supported
by the Xe driver. For example, some structs defined in the uAPI are an
abstraction of a hardware component with a specific role.

This diagram helps to build a mental representation of a device as it
is seen by the Xe driver. As written in the documentation, it is not
intended to be a literal representation of an existing device. A lot
more information could be added but the intention for the overview is
to keep it simple, and go into detail as needed in other sections.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 include/uapi/drm/xe_drm.h | 41 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index d586c8aeb279..4339e6fe47e4 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -26,6 +26,47 @@ extern "C" {
  *
  */
 
+/**
+ * DOC: Xe Device Block Diagram
+ *
+ * The diagram below represents a high-level simplification of a discrete
+ * GPU supported by the Xe driver. It shows some device components which
+ * are necessary to understand this API, as well as their relations
+ * to each other. This diagram does not represent real hardware::
+ *
+ *   ┌──────────────────────────────────────────────────────────────────┐
+ *   │ ┌──────────────────────────────────────────────────┐ ┌─────────┐ │
+ *   │ │              ┌───────────────────────┐           │ │ ┌─────┐ │ │
+ *   │ │              │         VRAM0         │           │ │ │VRAM1│ │ │
+ *   │ │              └───────────┬───────────┘           │ │ └──┬──┘ │ │
+ *   │ │ ┌────────────────────────┴─────────────────────┐ │ │ ┌──┴──┐ │ │
+ *   │ │ │ ┌─────────────────────┐  ┌─────────────────┐ │ │ │ │     │ │ │
+ *   │ │ │ │ ┌──┐ ┌──┐ ┌──┐ ┌──┐ │  │ ┌─────┐ ┌─────┐ │ │ │ │ │     │ │ │
+ *   │ │ │ │ │EU│ │EU│ │EU│ │EU│ │  │ │RCS0 │ │BCS0 │ │ │ │ │ │     │ │ │
+ *   │ │ │ │ └──┘ └──┘ └──┘ └──┘ │  │ └─────┘ └─────┘ │ │ │ │ │     │ │ │
+ *   │ │ │ │ ┌──┐ ┌──┐ ┌──┐ ┌──┐ │  │ ┌─────┐ ┌─────┐ │ │ │ │ │     │ │ │
+ *   │ │ │ │ │EU│ │EU│ │EU│ │EU│ │  │ │VCS0 │ │VCS1 │ │ │ │ │ │     │ │ │
+ *   │ │ │ │ └──┘ └──┘ └──┘ └──┘ │  │ └─────┘ └─────┘ │ │ │ │ │     │ │ │
+ *   │ │ │ │ ┌──┐ ┌──┐ ┌──┐ ┌──┐ │  │ ┌─────┐ ┌─────┐ │ │ │ │ │     │ │ │
+ *   │ │ │ │ │EU│ │EU│ │EU│ │EU│ │  │ │VECS0│ │VECS1│ │ │ │ │ │ ... │ │ │
+ *   │ │ │ │ └──┘ └──┘ └──┘ └──┘ │  │ └─────┘ └─────┘ │ │ │ │ │     │ │ │
+ *   │ │ │ │ ┌──┐ ┌──┐ ┌──┐ ┌──┐ │  │ ┌─────┐ ┌─────┐ │ │ │ │ │     │ │ │
+ *   │ │ │ │ │EU│ │EU│ │EU│ │EU│ │  │ │CCS0 │ │CCS1 │ │ │ │ │ │     │ │ │
+ *   │ │ │ │ └──┘ └──┘ └──┘ └──┘ │  │ └─────┘ └─────┘ │ │ │ │ │     │ │ │
+ *   │ │ │ └─────────DSS─────────┘  │ ┌─────┐ ┌─────┐ │ │ │ │ │     │ │ │
+ *   │ │ │                          │ │CCS2 │ │CCS3 │ │ │ │ │ │     │ │ │
+ *   │ │ │ ┌─────┐ ┌─────┐ ┌─────┐  │ └─────┘ └─────┘ │ │ │ │ │     │ │ │
+ *   │ │ │ │ ... │ │ ... │ │ ... │  │                 │ │ │ │ │     │ │ │
+ *   │ │ │ └─DSS─┘ └─DSS─┘ └─DSS─┘  └─────Engines─────┘ │ │ │ │     │ │ │
+ *   │ │ └───────────────────────────GT0────────────────┘ │ │ └─GT1─┘ │ │
+ *   │ └────────────────────────────Tile0─────────────────┘ └─ Tile1──┘ │
+ *   └─────────────────────────────Device0───────┬──────────────────────┘
+ *                                               │
+ *                                               │
+ *                        ───────────────────────┴────────── PCI bus
+ *
+ */
+
 /**
  * DOC: Xe uAPI Overview
  *
-- 
2.34.1



* [Intel-xe] [PATCH v3 43/43] drm/xe/uapi: Add examples of user space code
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (41 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 42/43] drm/xe/uapi: Add block diagram of a device Francois Dugast
@ 2023-11-09 15:44 ` Francois Dugast
  2023-11-09 16:05 ` [Intel-xe] ✗ CI.Patch_applied: failure for uAPI Alignment - take 2 Patchwork
                   ` (3 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Francois Dugast @ 2023-11-09 15:44 UTC (permalink / raw)
  To: intel-xe; +Cc: Francois Dugast

Complete the documentation of some structs by adding functional
examples of user space code. Those examples are intentionally kept
very simple. Put together, they provide a foundation for a minimal
application that executes a job using the Xe driver.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 include/uapi/drm/xe_drm.h | 84 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 4339e6fe47e4..089d05b69da5 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -958,6 +958,30 @@ struct drm_xe_vm_bind_op {
 
 /**
  * struct drm_xe_vm_bind - Input of &DRM_IOCTL_XE_VM_BIND
+ *
+ * Below is an example of a minimal use of @drm_xe_vm_bind to
+ * asynchronously bind the buffer `data` at address `BIND_ADDRESS` to
+ * illustrate `userptr`. It can be synchronized by using the example
+ * provided for @drm_xe_sync.
+ *
+ * .. code-block:: C
+ *
+ *     data = aligned_alloc(ALIGNMENT, BO_SIZE);
+ *     struct drm_xe_vm_bind bind = {
+ *         .vm_id = vm,
+ *         .num_binds = 1,
+ *         .bind.obj = 0,
+ *         .bind.obj_offset = to_user_pointer(data),
+ *         .bind.range = BO_SIZE,
+ *         .bind.addr = BIND_ADDRESS,
+ *         .bind.op = DRM_XE_VM_BIND_OP_MAP_USERPTR,
+ *         .bind.flags = DRM_XE_VM_BIND_FLAG_ASYNC,
+ *         .num_syncs = 1,
+ *         .syncs = &sync,
+ *         .exec_queue_id = 0,
+ *     };
+ *     ioctl(fd, DRM_IOCTL_XE_VM_BIND, &bind);
+ *
  */
 struct drm_xe_vm_bind {
 	/** @extensions: Pointer to the first extension struct, if any */
@@ -1030,6 +1054,30 @@ struct drm_xe_vm_bind {
  * The @flags can be:
  *  - %DRM_XE_SYNC_FLAG_SIGNAL
  *
+ * A minimal use of @drm_xe_sync looks like this:
+ *
+ * .. code-block:: C
+ *
+ *     struct drm_xe_sync sync = {
+ *         .flags = DRM_XE_SYNC_FLAG_SIGNAL,
+ *         .type = DRM_XE_SYNC_TYPE_SYNCOBJ,
+ *     };
+ *     struct drm_syncobj_create syncobj_create = { 0 };
+ *     ioctl(fd, DRM_IOCTL_SYNCOBJ_CREATE, &syncobj_create);
+ *     sync.handle = syncobj_create.handle;
+ *         ...
+ *         use of &sync in drm_xe_exec or drm_xe_vm_bind
+ *         ...
+ *     struct drm_syncobj_wait wait = {
+ *         .handles = &sync.handle,
+ *         .timeout_nsec = INT64_MAX,
+ *         .count_handles = 1,
+ *         .flags = 0,
+ *         .first_signaled = 0,
+ *         .pad = 0,
+ *     };
+ *     ioctl(fd, DRM_IOCTL_SYNCOBJ_WAIT, &wait);
+ *
  */
 struct drm_xe_sync {
 	/** @extensions: Pointer to the first extension struct, if any */
@@ -1134,6 +1182,25 @@ struct drm_xe_sync {
 
 /**
  * struct drm_xe_exec_queue_create - Input of &DRM_IOCTL_XE_EXEC_QUEUE_CREATE
+ *
+ * The example below shows how to use @drm_xe_exec_queue_create to create
+ * a simple exec_queue (no parallel submission) of class
+ * &DRM_XE_ENGINE_CLASS_RENDER.
+ *
+ * .. code-block:: C
+ *
+ *     struct drm_xe_engine_class_instance instance = {
+ *         .engine_class = DRM_XE_ENGINE_CLASS_RENDER,
+ *     };
+ *     struct drm_xe_exec_queue_create exec_queue_create = {
+ *         .extensions = 0,
+ *         .vm_id = vm,
+ *         .num_bb_per_exec = 1,
+ *         .num_eng_per_bb = 1,
+ *         .instances = to_user_pointer(&instance),
+ *     };
+ *     ioctl(fd, DRM_IOCTL_XE_EXEC_QUEUE_CREATE, &exec_queue_create);
+ *
  */
 struct drm_xe_exec_queue_create {
 #define DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY               0
@@ -1256,6 +1323,23 @@ struct drm_xe_exec_queue_get_property {
 
 /**
  * struct drm_xe_exec - Input of &DRM_IOCTL_XE_EXEC
+ *
+ * This is an example of using @drm_xe_exec to execute the object
+ * at BIND_ADDRESS (see example in @drm_xe_vm_bind) by an exec_queue
+ * (see example in @drm_xe_exec_queue_create). It can be synchronized
+ * by using the example provided for @drm_xe_sync.
+ *
+ * .. code-block:: C
+ *
+ *     struct drm_xe_exec exec = {
+ *         .exec_queue_id = exec_queue,
+ *         .syncs = &sync,
+ *         .num_syncs = 1,
+ *         .address = BIND_ADDRESS,
+ *         .num_batch_buffer = 1,
+ *     };
+ *     ioctl(fd, DRM_IOCTL_XE_EXEC, &exec);
+ *
  */
 struct drm_xe_exec {
 	/** @extensions: Pointer to the first extension struct, if any */
-- 
2.34.1



* [Intel-xe] ✗ CI.Patch_applied: failure for uAPI Alignment - take 2
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (42 preceding siblings ...)
  2023-11-09 15:44 ` [Intel-xe] [PATCH v3 43/43] drm/xe/uapi: Add examples of user space code Francois Dugast
@ 2023-11-09 16:05 ` Patchwork
  2023-11-09 17:11 ` [Intel-xe] [PATCH v3 00/43] " Souza, Jose
                   ` (2 subsequent siblings)
  46 siblings, 0 replies; 53+ messages in thread
From: Patchwork @ 2023-11-09 16:05 UTC (permalink / raw)
  To: Francois Dugast; +Cc: intel-xe

== Series Details ==

Series: uAPI Alignment - take 2
URL   : https://patchwork.freedesktop.org/series/126203/
State : failure

== Summary ==

=== Applying kernel patches on branch 'drm-xe-next' with base: ===
Base commit: 2096aea11 fixup! drm/xe/gsc: add gsc device support
=== git am output follows ===
Applying: drm/xe/uapi: Add documentation for query
Applying: drm/xe: Extend drm_xe_vm_bind_op
Applying: drm/xe: Add uAPI to query micro-controler firmware version
Applying: drm/xe/uapi: Document DRM_XE_DEVICE_QUERY_HWCONFIG
Applying: drm/xe: Extend uAPI to query HuC micro-controler firmware version
Applying: drm/xe/uapi: Remove useless XE_QUERY_CONFIG_NUM_PARAM
Applying: drm/xe/uapi: Add missing DRM_ prefix in uAPI constants
Applying: drm/xe/uapi: Add _FLAG to uAPI constants usable for flags
Applying: drm/xe/uapi: Make constant comments visible in kernel doc
Applying: drm/xe/uapi: Change rsvd to pad in struct drm_xe_class_instance
Applying: drm/xe/uapi: Remove GT_TYPE_REMOTE
Applying: drm/xe/uapi: Kill VM_MADVISE IOCTL
Applying: drm/xe/uapi: Separate bo_create placement from flags
Applying: drm/xe/uapi: Remove unused inaccessible memory region
Applying: drm/xe/uapi: Remove unused QUERY_CONFIG_MEM_REGION_COUNT
Applying: drm/xe/uapi: Remove unused QUERY_CONFIG_GT_COUNT
Applying: drm/xe/uapi: Rename *_mem_regions masks
Applying: drm/xe/uapi: Rename query's mem_usage to mem_regions
Applying: drm/xe: Make DRM_XE_DEVICE_QUERY_ENGINES future proof
Applying: drm/xe/uapi: Replace BO with GEM in documentation
Applying: drm/xe/pmu: Drop interrupt pmu event
Applying: drm/xe/uapi: Reject bo creation of unaligned size
Applying: drm/xe/uapi: Fix indentation issues that sometimes causes build warning
Applying: drm/xe/uapi: Order sections
Applying: drm/xe/uapi: More uAPI documentation additions and cosmetic updates
Applying: drm/xe/uapi: Split xe_sync types from flags
Applying: drm/xe/uapi: Standardize the FLAG naming and assignment
Applying: drm/xe/uapi: Differentiate WAIT_OP from WAIT_MASK
Applying: drm/xe/uapi: Move xe_exec after xe_exec_queue
Applying: drm/xe/uapi: Move memory_region masks from GT to engine
Applying: drm/xe/uapi: Document the memory_region bitmask
Applying: drm/xe/uapi: Be more specific about the vm_bind prefetch region
Applying: drm/xe/uapi: Convert tile_mask to a pt_placement_hint
Applying: drm/xe/uapi: Exec queue documentation and variable renaming
Applying: drm/xe/uapi: Refactor engine information
Applying: drm/xe/uapi: Crystal Reference Clock updates
Applying: drm/xe/uapi: Add Tile ID information to the GT info query
Applying: drm/xe/uapi: Remove bogus engine list from the wait_user_fence IOCTL
Applying: drm/xe/uapi: Align on a common way to return arrays (memory regions)
Applying: drm/xe/uapi: Align on a common way to return arrays (gt)
Applying: drm/xe/uapi: Align on a common way to return arrays (engines)
Applying: drm/xe/uapi: Add block diagram of a device
Applying: drm/xe/uapi: Add examples of user space code




* Re: [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (43 preceding siblings ...)
  2023-11-09 16:05 ` [Intel-xe] ✗ CI.Patch_applied: failure for uAPI Alignment - take 2 Patchwork
@ 2023-11-09 17:11 ` Souza, Jose
  2023-11-15 23:29 ` [Intel-xe] ✗ CI.Patch_applied: failure for uAPI Alignment - take 2 (rev2) Patchwork
  2023-11-17 21:35 ` Patchwork
  46 siblings, 0 replies; 53+ messages in thread
From: Souza, Jose @ 2023-11-09 17:11 UTC (permalink / raw)
  To: intel-xe@lists.freedesktop.org, Vivi,  Rodrigo, Dugast, Francois

This series is huuugeeee!
It will take a while to get all the patches reviewed in KMD and UMD, so let's start by reducing it by merging the patches that don't break uAPI?

Here are the ones that I think can be merged without breaking any uAPI, because they are just comment changes, removals of uAPIs not used by UMDs, or
additions of new uAPI.

drm/xe/uapi: Add documentation for query
drm/xe: Add uAPI to query micro-controler firmware version
drm/xe/uapi: Document DRM_XE_DEVICE_QUERY_HWCONFIG
drm/xe: Extend uAPI to query HuC micro-controler firmware version
drm/xe/uapi: Make constant comments visible in kernel doc
drm/xe/uapi: Kill VM_MADVISE IOCTL
drm/xe/pmu: Drop interrupt pmu event
drm/xe/uapi: Fix indentation issues that sometimes causes build warning
drm/xe/uapi: Order sections
drm/xe/uapi: More uAPI documentation additions and cosmetic updates
drm/xe/uapi: Document the memory_region bitmask
drm/xe/uapi: Add block diagram of a device

On Thu, 2023-11-09 at 15:44 +0000, Francois Dugast wrote:
> This is the second take of uAPI updates that would lead to
> breakage in the compatibility, which is not acceptable after
> we are merged upstream. So, let's break it before it is too late,
> and start upstreaming a good, reliable and clean uapi.
> 
> v2: Rebase, drop "RFC", more uAPI fixes and cleanup
> 
> v3:
> - Rebase
> - Checkpatch
> - Apply fixups and squash 
> - Do not remove num_params 
> - Skip "[PATCH v2 01/50] fixup! drm/xe: Correlate engine and cpu
>   timestamps with better accuracy" already merged by Lucas 
> - Skip "[PATCH v2 40/50] drm/xe/uapi: Add link to Xe documentation"
>   as location will change 
> - Change "[PATCH v2 12/50] fixup! drm/xe: Correlate engine and cpu
>   timestamps with better accuracy" to not be a fixup 
> - Fix commit message of "[PATCH v2 24/50] xe/xe_bo: Reject bo
>   creation of unaligned size" 
> - Include already provided "Reviewed-by" 
> 
> Aravind Iddamsetty (1):
>   drm/xe/pmu: Drop interrupt pmu event
> 
> Francois Dugast (17):
>   drm/xe/uapi: Add documentation for query
>   drm/xe/uapi: Document DRM_XE_DEVICE_QUERY_HWCONFIG
>   drm/xe: Extend uAPI to query HuC micro-controler firmware version
>   drm/xe/uapi: Remove useless XE_QUERY_CONFIG_NUM_PARAM
>   drm/xe/uapi: Add missing DRM_ prefix in uAPI constants
>   drm/xe/uapi: Add _FLAG to uAPI constants usable for flags
>   drm/xe/uapi: Make constant comments visible in kernel doc
>   drm/xe/uapi: Change rsvd to pad in struct drm_xe_class_instance
>   drm/xe/uapi: Remove unused inaccessible memory region
>   drm/xe/uapi: Remove unused QUERY_CONFIG_MEM_REGION_COUNT
>   drm/xe/uapi: Remove unused QUERY_CONFIG_GT_COUNT
>   drm/xe/uapi: Replace BO with GEM in documentation
>   drm/xe/uapi: Align on a common way to return arrays (memory regions)
>   drm/xe/uapi: Align on a common way to return arrays (gt)
>   drm/xe/uapi: Align on a common way to return arrays (engines)
>   drm/xe/uapi: Add block diagram of a device
>   drm/xe/uapi: Add examples of user space code
> 
> José Roberto de Souza (2):
>   drm/xe: Add uAPI to query micro-controler firmware version
>   drm/xe: Make DRM_XE_DEVICE_QUERY_ENGINES future proof
> 
> Mauro Carvalho Chehab (1):
>   drm/xe/uapi: Reject bo creation of unaligned size
> 
> Mika Kuoppala (1):
>   drm/xe: Extend drm_xe_vm_bind_op
> 
> Rodrigo Vivi (21):
>   drm/xe/uapi: Remove GT_TYPE_REMOTE
>   drm/xe/uapi: Kill VM_MADVISE IOCTL
>   drm/xe/uapi: Separate bo_create placement from flags
>   drm/xe/uapi: Rename *_mem_regions masks
>   drm/xe/uapi: Rename query's mem_usage to mem_regions
>   drm/xe/uapi: Fix indentation issues that sometimes causes build
>     warning
>   drm/xe/uapi: Order sections
>   drm/xe/uapi: More uAPI documentation additions and cosmetic updates
>   drm/xe/uapi: Split xe_sync types from flags
>   drm/xe/uapi: Standardize the FLAG naming and assignment
>   drm/xe/uapi: Differentiate WAIT_OP from WAIT_MASK
>   drm/xe/uapi: Move xe_exec after xe_exec_queue
>   drm/xe/uapi: Move memory_region masks from GT to engine
>   drm/xe/uapi: Document the memory_region bitmask
>   drm/xe/uapi: Be more specific about the vm_bind prefetch region
>   drm/xe/uapi: Convert tile_mask to a pt_placement_hint
>   drm/xe/uapi: Exec queue documentation and variable renaming
>   drm/xe/uapi: Refactor engine information
>   drm/xe/uapi: Crystal Reference Clock updates
>   drm/xe/uapi: Add Tile ID information to the GT info query
>   drm/xe/uapi: Remove bogus engine list from the wait_user_fence IOCTL
> 
>  drivers/gpu/drm/xe/Makefile              |    1 -
>  drivers/gpu/drm/xe/tests/xe_dma_buf.c    |    8 +-
>  drivers/gpu/drm/xe/xe_bo.c               |   51 +-
>  drivers/gpu/drm/xe/xe_bo_types.h         |    3 +
>  drivers/gpu/drm/xe/xe_devcoredump.c      |    8 +-
>  drivers/gpu/drm/xe/xe_device.c           |    8 +-
>  drivers/gpu/drm/xe/xe_exec.c             |    4 +-
>  drivers/gpu/drm/xe/xe_exec_queue.c       |   86 +-
>  drivers/gpu/drm/xe/xe_exec_queue.h       |    4 +-
>  drivers/gpu/drm/xe/xe_exec_queue_types.h |    4 +-
>  drivers/gpu/drm/xe/xe_gt.c               |    2 +-
>  drivers/gpu/drm/xe/xe_gt_clock.c         |    4 +-
>  drivers/gpu/drm/xe/xe_gt_types.h         |    4 +-
>  drivers/gpu/drm/xe/xe_guc_submit.c       |   32 +-
>  drivers/gpu/drm/xe/xe_irq.c              |   18 -
>  drivers/gpu/drm/xe/xe_pmu.c              |   25 +-
>  drivers/gpu/drm/xe/xe_pmu_types.h        |    8 -
>  drivers/gpu/drm/xe/xe_query.c            |  220 ++--
>  drivers/gpu/drm/xe/xe_ring_ops.c         |    8 +-
>  drivers/gpu/drm/xe/xe_sched_job.c        |   10 +-
>  drivers/gpu/drm/xe/xe_sync.c             |   27 +-
>  drivers/gpu/drm/xe/xe_sync_types.h       |    1 +
>  drivers/gpu/drm/xe/xe_trace.h            |    8 +-
>  drivers/gpu/drm/xe/xe_vm.c               |  115 +-
>  drivers/gpu/drm/xe/xe_vm_doc.h           |   14 +-
>  drivers/gpu/drm/xe/xe_vm_madvise.c       |  299 -----
>  drivers/gpu/drm/xe/xe_vm_madvise.h       |   15 -
>  drivers/gpu/drm/xe/xe_wait_user_fence.c  |   74 +-
>  include/uapi/drm/xe_drm.h                | 1334 ++++++++++++++--------
>  29 files changed, 1250 insertions(+), 1145 deletions(-)
>  delete mode 100644 drivers/gpu/drm/xe/xe_vm_madvise.c
>  delete mode 100644 drivers/gpu/drm/xe/xe_vm_madvise.h
> 



* Re: [Intel-xe] [PATCH v3 33/43] drm/xe/uapi: Convert tile_mask to a pt_placement_hint
  2023-11-09  9:29   ` Matthew Brost
@ 2023-11-09 19:05     ` Rodrigo Vivi
  0 siblings, 0 replies; 53+ messages in thread
From: Rodrigo Vivi @ 2023-11-09 19:05 UTC (permalink / raw)
  To: Matthew Brost; +Cc: Francois Dugast, intel-xe

On Thu, Nov 09, 2023 at 09:29:49AM +0000, Matthew Brost wrote:
> On Thu, Nov 09, 2023 at 03:44:47PM +0000, Francois Dugast wrote:
> > From: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > 
> > The previous tile_mask was also an optional hint, and only used
> > for the page-table tree placement. However, it was so tied
> > with the tile concept itself. Let's clarify things up and make
> > this generic enough. So accept any valid memory region mask.
> > It could even be a direct near_mem_region gotten from the engine_info.
> > pt stands for page table.
> > 
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Signed-off-by: Francois Dugast <francois.dugast@intel.com>
> 
> I thought we landed on converting tile_mask to sched_group_mask?

My bad. I had forgotten or misunderstood that...

> I do
> not like pt_placement_hint at all as I've stated; what we actually care
> about is creating mappings for exec queues. The sched_group_mask is
> still a hint basically saying at minimum you must create a mapping for
> these sched groups perhaps more. The driver is free to place a PPGTT (or
> multiple) anywhere it wants to based on the platform.

Or I might have changed it while documenting it: the documentation was
mostly about the placement of the PPGTT, which is what this was doing,
and I confused myself.

I believe that, with your text here as the doc, sched_group_mask makes more sense.
Let's change it.

> 
> e.g. On PVC we have two scheduling groups, and two PPGTT (one per tile in VRAM)
> e.g. On MTL we have two scheduling groups, and one PPGTT (sysmem)



* [Intel-xe] ✗ CI.Patch_applied: failure for uAPI Alignment - take 2 (rev2)
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (44 preceding siblings ...)
  2023-11-09 17:11 ` [Intel-xe] [PATCH v3 00/43] " Souza, Jose
@ 2023-11-15 23:29 ` Patchwork
  2023-11-17 21:35 ` Patchwork
  46 siblings, 0 replies; 53+ messages in thread
From: Patchwork @ 2023-11-15 23:29 UTC (permalink / raw)
  To: Francois Dugast; +Cc: intel-xe

== Series Details ==

Series: uAPI Alignment - take 2 (rev2)
URL   : https://patchwork.freedesktop.org/series/126203/
State : failure

== Summary ==

=== Applying kernel patches on branch 'drm-xe-next' with base: ===
Base commit: eba8bfb1d fixup! drm/xe/display: Implement display support
=== git am output follows ===
error: patch failed: drivers/gpu/drm/xe/xe_guc_submit.c:1462
error: drivers/gpu/drm/xe/xe_guc_submit.c: patch does not apply
hint: Use 'git am --show-current-patch' to see the failed patch
Applying: drm/xe/uapi: Add documentation for query
Applying: drm/xe: Extend drm_xe_vm_bind_op
Applying: drm/xe: Add uAPI to query micro-controler firmware version
Applying: drm/xe/uapi: Document DRM_XE_DEVICE_QUERY_HWCONFIG
Applying: drm/xe: Extend uAPI to query HuC micro-controler firmware version
Applying: drm/xe/uapi: Remove useless XE_QUERY_CONFIG_NUM_PARAM
Applying: drm/xe/uapi: Add missing DRM_ prefix in uAPI constants
Applying: drm/xe/uapi: Add _FLAG to uAPI constants usable for flags
Applying: drm/xe/uapi: Make constant comments visible in kernel doc
Applying: drm/xe/uapi: Change rsvd to pad in struct drm_xe_class_instance
Applying: drm/xe/uapi: Remove GT_TYPE_REMOTE
Applying: drm/xe/uapi: Kill VM_MADVISE IOCTL
Applying: drm/xe/uapi: Separate bo_create placement from flags
Applying: drm/xe/uapi: Remove unused inaccessible memory region
Applying: drm/xe/uapi: Remove unused QUERY_CONFIG_MEM_REGION_COUNT
Applying: drm/xe/uapi: Remove unused QUERY_CONFIG_GT_COUNT
Applying: drm/xe/uapi: Rename *_mem_regions masks
Applying: drm/xe/uapi: Rename query's mem_usage to mem_regions
Applying: drm/xe: Make DRM_XE_DEVICE_QUERY_ENGINES future proof
Applying: drm/xe/uapi: Replace BO with GEM in documentation
Applying: drm/xe/pmu: Drop interrupt pmu event
Applying: drm/xe/uapi: Reject bo creation of unaligned size
Applying: drm/xe/uapi: Fix indentation issues that sometimes causes build warning
Applying: drm/xe/uapi: Order sections
Applying: drm/xe/uapi: More uAPI documentation additions and cosmetic updates
Applying: drm/xe/uapi: Split xe_sync types from flags
Applying: drm/xe/uapi: Standardize the FLAG naming and assignment
Applying: drm/xe/uapi: Differentiate WAIT_OP from WAIT_MASK
Applying: drm/xe/uapi: Move xe_exec after xe_exec_queue
Applying: drm/xe/uapi: Move memory_region masks from GT to engine
Applying: drm/xe/uapi: Document the memory_region bitmask
Applying: drm/xe/uapi: Be more specific about the vm_bind prefetch region
Applying: drm/xe/uapi: Convert tile_mask to a pt_placement_hint
Applying: drm/xe/uapi: Exec queue documentation and variable renaming
Patch failed at 0034 drm/xe/uapi: Exec queue documentation and variable renaming
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".




* [Intel-xe] ✗ CI.Patch_applied: failure for uAPI Alignment - take 2 (rev2)
  2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
                   ` (45 preceding siblings ...)
  2023-11-15 23:29 ` [Intel-xe] ✗ CI.Patch_applied: failure for uAPI Alignment - take 2 (rev2) Patchwork
@ 2023-11-17 21:35 ` Patchwork
  46 siblings, 0 replies; 53+ messages in thread
From: Patchwork @ 2023-11-17 21:35 UTC (permalink / raw)
  To: Francois Dugast; +Cc: intel-xe

== Series Details ==

Series: uAPI Alignment - take 2 (rev2)
URL   : https://patchwork.freedesktop.org/series/126203/
State : failure

== Summary ==

=== Applying kernel patches on branch 'drm-xe-next' with base: ===
Base commit: 3b8183b7e drm/xe/uapi: Be more specific about the vm_bind prefetch region
=== git am output follows ===
error: patch failed: include/uapi/drm/xe_drm.h:321
error: include/uapi/drm/xe_drm.h: patch does not apply
hint: Use 'git am --show-current-patch' to see the failed patch
Applying: drm/xe/uapi: Add documentation for query
Patch failed at 0001 drm/xe/uapi: Add documentation for query
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".




Thread overview: 53+ messages
2023-11-09 15:44 [Intel-xe] [PATCH v3 00/43] uAPI Alignment - take 2 Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 01/43] drm/xe/uapi: Add documentation for query Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 02/43] drm/xe: Extend drm_xe_vm_bind_op Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 03/43] drm/xe: Add uAPI to query micro-controler firmware version Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 04/43] drm/xe/uapi: Document DRM_XE_DEVICE_QUERY_HWCONFIG Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 05/43] drm/xe: Extend uAPI to query HuC micro-controler firmware version Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 06/43] drm/xe/uapi: Remove useless XE_QUERY_CONFIG_NUM_PARAM Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 07/43] drm/xe/uapi: Add missing DRM_ prefix in uAPI constants Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 08/43] drm/xe/uapi: Add _FLAG to uAPI constants usable for flags Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 09/43] drm/xe/uapi: Make constant comments visible in kernel doc Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 10/43] drm/xe/uapi: Change rsvd to pad in struct drm_xe_class_instance Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 11/43] drm/xe/uapi: Remove GT_TYPE_REMOTE Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 12/43] drm/xe/uapi: Kill VM_MADVISE IOCTL Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 13/43] drm/xe/uapi: Separate bo_create placement from flags Francois Dugast
2023-11-09 14:58   ` Matthew Brost
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 14/43] drm/xe/uapi: Remove unused inaccessible memory region Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 15/43] drm/xe/uapi: Remove unused QUERY_CONFIG_MEM_REGION_COUNT Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 16/43] drm/xe/uapi: Remove unused QUERY_CONFIG_GT_COUNT Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 17/43] drm/xe/uapi: Rename *_mem_regions masks Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 18/43] drm/xe/uapi: Rename query's mem_usage to mem_regions Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 19/43] drm/xe: Make DRM_XE_DEVICE_QUERY_ENGINES future proof Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 20/43] drm/xe/uapi: Replace BO with GEM in documentation Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 21/43] drm/xe/pmu: Drop interrupt pmu event Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 22/43] drm/xe/uapi: Reject bo creation of unaligned size Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 23/43] drm/xe/uapi: Fix indentation issues that sometimes causes build warning Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 24/43] drm/xe/uapi: Order sections Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 25/43] drm/xe/uapi: More uAPI documentation additions and cosmetic updates Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 26/43] drm/xe/uapi: Split xe_sync types from flags Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 27/43] drm/xe/uapi: Standardize the FLAG naming and assignment Francois Dugast
2023-11-09 15:10   ` Matthew Brost
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 28/43] drm/xe/uapi: Differentiate WAIT_OP from WAIT_MASK Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 29/43] drm/xe/uapi: Move xe_exec after xe_exec_queue Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 30/43] drm/xe/uapi: Move memory_region masks from GT to engine Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 31/43] drm/xe/uapi: Document the memory_region bitmask Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 32/43] drm/xe/uapi: Be more specific about the vm_bind prefetch region Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 33/43] drm/xe/uapi: Convert tile_mask to a pt_placement_hint Francois Dugast
2023-11-09  9:29   ` Matthew Brost
2023-11-09 19:05     ` Rodrigo Vivi
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 34/43] drm/xe/uapi: Exec queue documentation and variable renaming Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 35/43] drm/xe/uapi: Refactor engine information Francois Dugast
2023-11-09 12:07   ` Matthew Brost
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 36/43] drm/xe/uapi: Crystal Reference Clock updates Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 37/43] drm/xe/uapi: Add Tile ID information to the GT info query Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 38/43] drm/xe/uapi: Remove bogus engine list from the wait_user_fence IOCTL Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 39/43] drm/xe/uapi: Align on a common way to return arrays (memory regions) Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 40/43] drm/xe/uapi: Align on a common way to return arrays (gt) Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 41/43] drm/xe/uapi: Align on a common way to return arrays (engines) Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 42/43] drm/xe/uapi: Add block diagram of a device Francois Dugast
2023-11-09 15:44 ` [Intel-xe] [PATCH v3 43/43] drm/xe/uapi: Add examples of user space code Francois Dugast
2023-11-09 16:05 ` [Intel-xe] ✗ CI.Patch_applied: failure for uAPI Alignment - take 2 Patchwork
2023-11-09 17:11 ` [Intel-xe] [PATCH v3 00/43] " Souza, Jose
2023-11-15 23:29 ` [Intel-xe] ✗ CI.Patch_applied: failure for uAPI Alignment - take 2 (rev2) Patchwork
2023-11-17 21:35 ` Patchwork