Igt-dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [igt-dev] ✗ Fi.CI.BUILD: failure for Compression support for Lunarlake
  2023-11-10 21:24 [igt-dev] [PATCH i-g-t 0/5] Compression support for Lunarlake Akshata Jahagirdar
@ 2023-11-10 10:29 ` Patchwork
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 1/5] lib: add blt command properties for lunarlake Akshata Jahagirdar
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2023-11-10 10:29 UTC (permalink / raw)
  To: Akshata Jahagirdar; +Cc: igt-dev

== Series Details ==

Series: Compression support for Lunarlake
URL   : https://patchwork.freedesktop.org/series/126254/
State : failure

== Summary ==

Applying: lib: add blt command properties for lunarlake
Applying: lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy
Using index info to reconstruct a base tree...
M	lib/intel_blt.c
M	lib/intel_blt.h
Falling back to patching base and 3-way merge...
Auto-merging lib/intel_blt.h
Auto-merging lib/intel_blt.c
CONFLICT (content): Merge conflict in lib/intel_blt.c
Patch failed at 0002 lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy Akshata Jahagirdar
@ 2023-11-10 10:59   ` Matthew Auld
  2023-11-15 21:06     ` Jahagirdar, Akshata
  0 siblings, 1 reply; 16+ messages in thread
From: Matthew Auld @ 2023-11-10 10:59 UTC (permalink / raw)
  To: Akshata Jahagirdar; +Cc: igt-dev, ayaz.siddiqui

On 10/11/2023 21:24, Akshata Jahagirdar wrote:
> The Main-to-CCS Ratio for XE2 has been changed to 512:1. Update the CCS_RATIO macro to select relevant ratio based on platform.
> Since the PAGE_SIZE of sysmem is 4K, update the size of ctrl_copy to reflect this change.

Nit: very long line.

> 
> Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
> ---
>   lib/intel_blt.c | 19 +++++++++++--------
>   lib/intel_blt.h |  2 +-
>   2 files changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/lib/intel_blt.c b/lib/intel_blt.c
> index c0593930c..8168ce7b5 100644
> --- a/lib/intel_blt.c
> +++ b/lib/intel_blt.c
> @@ -948,15 +948,16 @@ int blt_block_copy(int fd,
>   	return ret;
>   }
>   
> -static uint16_t __ccs_size(const struct blt_ctrl_surf_copy_data *surf)
> +static uint16_t __ccs_size(int fd, const struct blt_ctrl_surf_copy_data *surf)
>   {
>   	uint32_t src_size, dst_size;
> +	uint16_t ccsratio = CCS_RATIO(fd);
>   
>   	src_size = surf->src.access_type == DIRECT_ACCESS ?
> -				surf->src.size : surf->src.size / CCS_RATIO;
> +				surf->src.size : surf->src.size / ccsratio;
>   
>   	dst_size = surf->dst.access_type == DIRECT_ACCESS ?
> -				surf->dst.size : surf->dst.size / CCS_RATIO;
> +				surf->dst.size : surf->dst.size / ccsratio;
>   
>   	igt_assert_f(src_size <= dst_size, "dst size must be >= src size for CCS copy\n");
>   
> @@ -1118,6 +1119,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>   	uint64_t dst_offset, src_offset, bb_offset, alignment;
>   	uint32_t bbe = MI_BATCH_BUFFER_END;
>   	uint32_t *bb;
> +	uint16_t num_ccs_blocks = xe_get_default_alignment(fd) / (CCS_RATIO(fd));
>   
>   	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
>   	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
> @@ -1136,7 +1138,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>   		data.xe2.dw00.dst_access_type = surf->dst.access_type;
>   
>   		/* Ensure dst has size capable to keep src ccs aux */
> -		data.xe2.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
> +		data.xe2.dw00.size_of_ctrl_copy = __ccs_size(fd, surf) / num_ccs_blocks - 1;
>   		data.xe2.dw00.length = 0x3;
>   
>   		data.xe2.dw01.src_address_lo = src_offset;
> @@ -1155,7 +1157,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>   		data.gen12.dw00.dst_access_type = surf->dst.access_type;
>   
>   		/* Ensure dst has size capable to keep src ccs aux */
> -		data.gen12.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
> +		data.gen12.dw00.size_of_ctrl_copy = __ccs_size(fd,surf) / num_ccs_blocks - 1;
>   		data.gen12.dw00.length = 0x3;
>   
>   		data.gen12.dw01.src_address_lo = src_offset;
> @@ -1794,7 +1796,8 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>   	uint64_t size = width * height * bpp / 8;
>   	uint32_t stride = tiling == T_LINEAR ? width * 4 : width;
>   	uint32_t handle;
> -
> +	uint8_t pat_index = compression ? intel_get_pat_idx_uc_comp(blt->fd) : intel_get_pat_idx_uc(blt->fd);

intel_get_pat_idx_uc_comp() has not been defined yet in the series?

> +	uint16_t cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;

Maybe make this conditional on xe2+ and compression = true? I figure we 
still want wb (if possible) for non-compressed stuff.

Also commit doesn't really explain this change in behaviour? Should this 
change be separated?

>   	igt_assert_f(blt->driver, "Driver isn't set, have you called blt_copy_init()?\n");
>   
>   	obj = calloc(1, sizeof(*obj));
> @@ -1808,13 +1811,13 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>   			flags |= XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM;
>   
>   		size = ALIGN(size, xe_get_default_alignment(blt->fd));
> -		handle = xe_bo_create_flags(blt->fd, 0, size, flags);
> +		handle = xe_bo_create_caching(blt->fd, 0, size, region, cpu_caching);
>   	} else {
>   		igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
>   							  &size, region) == 0);
>   	}
>   
> -	blt_set_object(obj, handle, size, region, mocs_index, DEFAULT_PAT_INDEX, tiling,
> +	blt_set_object(obj, handle, size, region, mocs_index, pat_index, tiling,
>   		       compression, compression_type);
>   	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
>   
> diff --git a/lib/intel_blt.h b/lib/intel_blt.h
> index 5934ccd67..a7aeab595 100644
> --- a/lib/intel_blt.h
> +++ b/lib/intel_blt.h
> @@ -52,7 +52,7 @@
>   #include "igt.h"
>   #include "intel_cmds_info.h"
>   
> -#define CCS_RATIO 256
> +#define CCS_RATIO(xe) AT_LEAST_GEN(intel_get_drm_devid(xe), 20) ? 512 : 256
>   
>   enum blt_color_depth {
>   	CD_8bit,

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 3/5] lib/intel_pat : Add uc_comp pat_index
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 3/5] lib/intel_pat : Add uc_comp pat_index Akshata Jahagirdar
@ 2023-11-10 11:06   ` Matthew Auld
  2023-11-15 21:38     ` Jahagirdar, Akshata
  0 siblings, 1 reply; 16+ messages in thread
From: Matthew Auld @ 2023-11-10 11:06 UTC (permalink / raw)
  To: Akshata Jahagirdar; +Cc: igt-dev, ayaz.siddiqui

On 10/11/2023 21:24, Akshata Jahagirdar wrote:
> Compression in XE2 is programmed through pat-index attribute. Add a dedicated pat-index for compression for XE2 and later platforms. The helper function returns uc_pat_index for previous platforms.
> 
> Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
> ---
>   lib/intel_pat.c | 12 +++++++++++-
>   lib/intel_pat.h |  1 +
>   2 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/intel_pat.c b/lib/intel_pat.c
> index 2b892ee52..36f13642f 100644
> --- a/lib/intel_pat.c
> +++ b/lib/intel_pat.c
> @@ -11,7 +11,7 @@ struct intel_pat_cache {
>   	uint8_t uc; /* UC + COH_NONE */
>   	uint8_t wt; /* WT + COH_NONE */
>   	uint8_t wb; /* WB + COH_AT_LEAST_1WAY */
> -
> +	uint8_t uc_comp; /* UC + COH_NONE + COMPRESSION*/

Maybe prefix this with xe2? i.e xe2_uc_comp.

>   	uint8_t max_index;
>   };
>   
> @@ -23,6 +23,7 @@ static void intel_get_pat_idx(int fd, struct intel_pat_cache *pat)
>   		pat->uc = 3;
>   		pat->wt = 15; /* Compressed + WB-transient */
>   		pat->wb = 2;
> +		pat->uc_comp = 12; /* Compressed + UC */
>   		pat->max_index = 31;
>   	} else if (IS_METEORLAKE(dev_id)) {
>   		pat->uc = 2;
> @@ -60,6 +61,15 @@ uint8_t intel_get_pat_idx_uc(int fd)
>   	return pat.uc;
>   }
>   
> +uint8_t intel_get_pat_idx_uc_comp(int fd)
> +{
> +	struct intel_pat_cache pat = {};
> +	uint16_t dev_id = intel_get_drm_devid(fd);
> +	
> +	intel_get_pat_idx(fd, &pat);
> +	return (intel_get_device_info(dev_id)->graphics_ver >= 20) ? pat.uc_comp : pat.uc;

Maybe make just make this xe2_get_pat_idx_uc_comp()? Or if we really do 
want to return plain uc on pre-xe2 then maybe make that clearer somehow 
in the api? intel_get_pat_idx_uc_comp_if_possible()? But perhaps caller 
should just handle this.

> +}
> +
>   uint8_t intel_get_pat_idx_wt(int fd)
>   {
>   	struct intel_pat_cache pat = {};
> diff --git a/lib/intel_pat.h b/lib/intel_pat.h
> index c24dbc275..6d68b9fc2 100644
> --- a/lib/intel_pat.h
> +++ b/lib/intel_pat.h
> @@ -13,6 +13,7 @@
>   uint8_t intel_get_max_pat_index(int fd);
>   
>   uint8_t intel_get_pat_idx_uc(int fd);
> +uint8_t intel_get_pat_idx_uc_comp(int fd);
>   uint8_t intel_get_pat_idx_wt(int fd);
>   uint8_t intel_get_pat_idx_wb(int fd);
>   

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [igt-dev] [PATCH i-g-t 0/5] Compression support for Lunarlake
@ 2023-11-10 21:24 Akshata Jahagirdar
  2023-11-10 10:29 ` [igt-dev] ✗ Fi.CI.BUILD: failure for " Patchwork
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Akshata Jahagirdar @ 2023-11-10 21:24 UTC (permalink / raw)
  Cc: igt-dev, ayaz.siddiqui, Akshata Jahagirdar, matthew.auld

Series enables the compression feature for Lunarlake and address various
changes of the feature from Gen12. The Main-to-CCS ratio has been
changed to 512:1. This changes the calculations and value for fields
such as CCS Copy size for blitter command APIs. 
This patch series updates tests xe_ccs and gem_ccs for XE2.  

These changes are based on top of the "vm_bind pat_index" patch here:
https://patchwork.freedesktop.org/series/124667/

v2: Lots of improvements/tweaks (Kamil)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [igt-dev] [PATCH i-g-t 1/5] lib: add blt command properties for lunarlake
  2023-11-10 21:24 [igt-dev] [PATCH i-g-t 0/5] Compression support for Lunarlake Akshata Jahagirdar
  2023-11-10 10:29 ` [igt-dev] ✗ Fi.CI.BUILD: failure for " Patchwork
@ 2023-11-10 21:24 ` Akshata Jahagirdar
  2023-11-13  8:42   ` Karolina Stolarek
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy Akshata Jahagirdar
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Akshata Jahagirdar @ 2023-11-10 21:24 UTC (permalink / raw)
  Cc: igt-dev, ayaz.siddiqui, Akshata Jahagirdar, matthew.auld

Add blt_cmd_info structs to describe properties of XY_BLOCK_COPY and XY_FAST_COPY blitter commands for XE2 platform. Updated the definitions for Lunarlake.

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
 lib/intel_cmds_info.c   | 24 ++++++++++++++++++++++++
 lib/intel_cmds_info.h   |  1 +
 lib/intel_device_info.c |  2 +-
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/lib/intel_cmds_info.c b/lib/intel_cmds_info.c
index 2e51ec081..fc3554271 100644
--- a/lib/intel_cmds_info.c
+++ b/lib/intel_cmds_info.c
@@ -67,6 +67,22 @@ static const struct blt_cmd_info
 						 BLT_CMD_EXTENDED |
 						 BLT_CMD_SUPPORTS_COMPRESSION);
 
+static const struct blt_cmd_info				
+		xe2_xy_block_copy = BLT_INFO_EXT(XY_BLOCK_COPY,
+						 BIT(T_LINEAR) |
+						 BIT(T_XMAJOR) |
+						 BIT(T_TILE4)  |
+						 BIT(T_TILE64),
+						 BLT_CMD_EXTENDED |
+						 BLT_CMD_SUPPORTS_COMPRESSION);
+
+static const struct blt_cmd_info
+		xe2_xy_fast_copy = BLT_INFO(XY_FAST_COPY,
+					    BIT(T_LINEAR) |
+					    BIT(T_XMAJOR) |
+					    BIT(T_TILE4)  |
+					    BIT(T_TILE64));
+
 static const struct blt_cmd_info
 		mtl_xy_block_copy = BLT_INFO_EXT(XY_BLOCK_COPY,
 						 BIT(T_LINEAR) |
@@ -169,6 +185,14 @@ const struct intel_cmds_info gen12_pvc_cmds_info = {
 	}
 };
 
+
+const struct intel_cmds_info xe2_cmds_info  = {
+	.blt_cmds = {
+		[XY_FAST_COPY] = &xe2_xy_fast_copy,
+		[XY_BLOCK_COPY] = &xe2_xy_block_copy,
+	}
+};
+
 const struct blt_cmd_info *blt_get_cmd_info(const struct intel_cmds_info *cmds_info,
 					    enum blt_cmd_type cmd)
 {
diff --git a/lib/intel_cmds_info.h b/lib/intel_cmds_info.h
index f9e3932d1..0a83b6a44 100644
--- a/lib/intel_cmds_info.h
+++ b/lib/intel_cmds_info.h
@@ -55,6 +55,7 @@ extern const struct intel_cmds_info gen12_cmds_info;
 extern const struct intel_cmds_info gen12_dg2_cmds_info;
 extern const struct intel_cmds_info gen12_mtl_cmds_info;
 extern const struct intel_cmds_info gen12_pvc_cmds_info;
+extern const struct intel_cmds_info xe2_cmds_info;
 
 #define for_each_tiling(__tiling) \
 	for (__tiling = T_LINEAR; __tiling < __BLT_MAX_TILING; __tiling++)
diff --git a/lib/intel_device_info.c b/lib/intel_device_info.c
index 34817f7b6..a669797c3 100644
--- a/lib/intel_device_info.c
+++ b/lib/intel_device_info.c
@@ -511,7 +511,7 @@ static const struct intel_device_info intel_lunarlake_info = {
 	.has_4tile = true,
 	.is_lunarlake = true,
 	.codename = "lunarlake",
-	.cmds_info = &gen12_pvc_cmds_info,
+	.cmds_info = &xe2_cmds_info,
 };
 
 static const struct pci_id_match intel_device_match[] = {
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy
  2023-11-10 21:24 [igt-dev] [PATCH i-g-t 0/5] Compression support for Lunarlake Akshata Jahagirdar
  2023-11-10 10:29 ` [igt-dev] ✗ Fi.CI.BUILD: failure for " Patchwork
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 1/5] lib: add blt command properties for lunarlake Akshata Jahagirdar
@ 2023-11-10 21:24 ` Akshata Jahagirdar
  2023-11-10 10:59   ` Matthew Auld
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 3/5] lib/intel_pat : Add uc_comp pat_index Akshata Jahagirdar
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Akshata Jahagirdar @ 2023-11-10 21:24 UTC (permalink / raw)
  Cc: igt-dev, ayaz.siddiqui, Akshata Jahagirdar, matthew.auld

The Main-to-CCS Ratio for XE2 has been changed to 512:1. Update the CCS_RATIO macro to select relevant ratio based on platform.
Since the PAGE_SIZE of sysmem is 4K, update the size of ctrl_copy to reflect this change.

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
 lib/intel_blt.c | 19 +++++++++++--------
 lib/intel_blt.h |  2 +-
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/lib/intel_blt.c b/lib/intel_blt.c
index c0593930c..8168ce7b5 100644
--- a/lib/intel_blt.c
+++ b/lib/intel_blt.c
@@ -948,15 +948,16 @@ int blt_block_copy(int fd,
 	return ret;
 }
 
-static uint16_t __ccs_size(const struct blt_ctrl_surf_copy_data *surf)
+static uint16_t __ccs_size(int fd, const struct blt_ctrl_surf_copy_data *surf)
 {
 	uint32_t src_size, dst_size;
+	uint16_t ccsratio = CCS_RATIO(fd);
 
 	src_size = surf->src.access_type == DIRECT_ACCESS ?
-				surf->src.size : surf->src.size / CCS_RATIO;
+				surf->src.size : surf->src.size / ccsratio;
 
 	dst_size = surf->dst.access_type == DIRECT_ACCESS ?
-				surf->dst.size : surf->dst.size / CCS_RATIO;
+				surf->dst.size : surf->dst.size / ccsratio;
 
 	igt_assert_f(src_size <= dst_size, "dst size must be >= src size for CCS copy\n");
 
@@ -1118,6 +1119,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
 	uint64_t dst_offset, src_offset, bb_offset, alignment;
 	uint32_t bbe = MI_BATCH_BUFFER_END;
 	uint32_t *bb;
+	uint16_t num_ccs_blocks = xe_get_default_alignment(fd) / (CCS_RATIO(fd));
 
 	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
 	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
@@ -1136,7 +1138,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
 		data.xe2.dw00.dst_access_type = surf->dst.access_type;
 
 		/* Ensure dst has size capable to keep src ccs aux */
-		data.xe2.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
+		data.xe2.dw00.size_of_ctrl_copy = __ccs_size(fd, surf) / num_ccs_blocks - 1;
 		data.xe2.dw00.length = 0x3;
 
 		data.xe2.dw01.src_address_lo = src_offset;
@@ -1155,7 +1157,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
 		data.gen12.dw00.dst_access_type = surf->dst.access_type;
 
 		/* Ensure dst has size capable to keep src ccs aux */
-		data.gen12.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
+		data.gen12.dw00.size_of_ctrl_copy = __ccs_size(fd,surf) / num_ccs_blocks - 1;
 		data.gen12.dw00.length = 0x3;
 
 		data.gen12.dw01.src_address_lo = src_offset;
@@ -1794,7 +1796,8 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
 	uint64_t size = width * height * bpp / 8;
 	uint32_t stride = tiling == T_LINEAR ? width * 4 : width;
 	uint32_t handle;
-
+	uint8_t pat_index = compression ? intel_get_pat_idx_uc_comp(blt->fd) : intel_get_pat_idx_uc(blt->fd);
+	uint16_t cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
 	igt_assert_f(blt->driver, "Driver isn't set, have you called blt_copy_init()?\n");
 
 	obj = calloc(1, sizeof(*obj));
@@ -1808,13 +1811,13 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
 			flags |= XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM;
 
 		size = ALIGN(size, xe_get_default_alignment(blt->fd));
-		handle = xe_bo_create_flags(blt->fd, 0, size, flags);
+		handle = xe_bo_create_caching(blt->fd, 0, size, region, cpu_caching);
 	} else {
 		igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
 							  &size, region) == 0);
 	}
 
-	blt_set_object(obj, handle, size, region, mocs_index, DEFAULT_PAT_INDEX, tiling,
+	blt_set_object(obj, handle, size, region, mocs_index, pat_index, tiling,
 		       compression, compression_type);
 	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
 
diff --git a/lib/intel_blt.h b/lib/intel_blt.h
index 5934ccd67..a7aeab595 100644
--- a/lib/intel_blt.h
+++ b/lib/intel_blt.h
@@ -52,7 +52,7 @@
 #include "igt.h"
 #include "intel_cmds_info.h"
 
-#define CCS_RATIO 256
+#define CCS_RATIO(xe) AT_LEAST_GEN(intel_get_drm_devid(xe), 20) ? 512 : 256
 
 enum blt_color_depth {
 	CD_8bit,
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [igt-dev] [PATCH i-g-t 3/5] lib/intel_pat : Add uc_comp pat_index
  2023-11-10 21:24 [igt-dev] [PATCH i-g-t 0/5] Compression support for Lunarlake Akshata Jahagirdar
                   ` (2 preceding siblings ...)
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy Akshata Jahagirdar
@ 2023-11-10 21:24 ` Akshata Jahagirdar
  2023-11-10 11:06   ` Matthew Auld
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 4/5] tests/intel/gem_ccs: Add compression support for Lunarlake Akshata Jahagirdar
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 5/5] tests/intel/xe_ccs: " Akshata Jahagirdar
  5 siblings, 1 reply; 16+ messages in thread
From: Akshata Jahagirdar @ 2023-11-10 21:24 UTC (permalink / raw)
  Cc: igt-dev, ayaz.siddiqui, Akshata Jahagirdar, matthew.auld

Compression in XE2 is programmed through pat-index attribute. Add a dedicated pat-index for compression for XE2 and later platforms. The helper function returns uc_pat_index for previous platforms.

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
 lib/intel_pat.c | 12 +++++++++++-
 lib/intel_pat.h |  1 +
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/lib/intel_pat.c b/lib/intel_pat.c
index 2b892ee52..36f13642f 100644
--- a/lib/intel_pat.c
+++ b/lib/intel_pat.c
@@ -11,7 +11,7 @@ struct intel_pat_cache {
 	uint8_t uc; /* UC + COH_NONE */
 	uint8_t wt; /* WT + COH_NONE */
 	uint8_t wb; /* WB + COH_AT_LEAST_1WAY */
-
+	uint8_t uc_comp; /* UC + COH_NONE + COMPRESSION*/
 	uint8_t max_index;
 };
 
@@ -23,6 +23,7 @@ static void intel_get_pat_idx(int fd, struct intel_pat_cache *pat)
 		pat->uc = 3;
 		pat->wt = 15; /* Compressed + WB-transient */
 		pat->wb = 2;
+		pat->uc_comp = 12; /* Compressed + UC */
 		pat->max_index = 31;
 	} else if (IS_METEORLAKE(dev_id)) {
 		pat->uc = 2;
@@ -60,6 +61,15 @@ uint8_t intel_get_pat_idx_uc(int fd)
 	return pat.uc;
 }
 
+uint8_t intel_get_pat_idx_uc_comp(int fd)
+{
+	struct intel_pat_cache pat = {};
+	uint16_t dev_id = intel_get_drm_devid(fd);
+	
+	intel_get_pat_idx(fd, &pat);
+	return (intel_get_device_info(dev_id)->graphics_ver >= 20) ? pat.uc_comp : pat.uc;
+}
+
 uint8_t intel_get_pat_idx_wt(int fd)
 {
 	struct intel_pat_cache pat = {};
diff --git a/lib/intel_pat.h b/lib/intel_pat.h
index c24dbc275..6d68b9fc2 100644
--- a/lib/intel_pat.h
+++ b/lib/intel_pat.h
@@ -13,6 +13,7 @@
 uint8_t intel_get_max_pat_index(int fd);
 
 uint8_t intel_get_pat_idx_uc(int fd);
+uint8_t intel_get_pat_idx_uc_comp(int fd);
 uint8_t intel_get_pat_idx_wt(int fd);
 uint8_t intel_get_pat_idx_wb(int fd);
 
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [igt-dev] [PATCH i-g-t 4/5] tests/intel/gem_ccs: Add compression support for Lunarlake
  2023-11-10 21:24 [igt-dev] [PATCH i-g-t 0/5] Compression support for Lunarlake Akshata Jahagirdar
                   ` (3 preceding siblings ...)
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 3/5] lib/intel_pat : Add uc_comp pat_index Akshata Jahagirdar
@ 2023-11-10 21:24 ` Akshata Jahagirdar
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 5/5] tests/intel/xe_ccs: " Akshata Jahagirdar
  5 siblings, 0 replies; 16+ messages in thread
From: Akshata Jahagirdar @ 2023-11-10 21:24 UTC (permalink / raw)
  Cc: igt-dev, ayaz.siddiqui, Akshata Jahagirdar, matthew.auld

Update the ccs_ratio call to get value according to platform.

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
 tests/intel/gem_ccs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/intel/gem_ccs.c b/tests/intel/gem_ccs.c
index 0a691778d..5a17be4cb 100644
--- a/tests/intel/gem_ccs.c
+++ b/tests/intel/gem_ccs.c
@@ -99,7 +99,7 @@ static void surf_copy(int i915,
 	struct blt_block_copy_data_ext ext = {};
 	struct blt_ctrl_surf_copy_data surf = {};
 	uint32_t bb1, bb2, ccs, ccs2, *ccsmap, *ccsmap2;
-	uint64_t bb_size, ccssize = mid->size / CCS_RATIO;
+	uint64_t bb_size, ccssize = mid->size / (CCS_RATIO(i915));
 	uint32_t *ccscopy;
 	uint8_t uc_mocs = intel_get_uc_mocs_index(i915);
 	int result;
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [igt-dev] [PATCH i-g-t 5/5] tests/intel/xe_ccs: Add compression support for Lunarlake
  2023-11-10 21:24 [igt-dev] [PATCH i-g-t 0/5] Compression support for Lunarlake Akshata Jahagirdar
                   ` (4 preceding siblings ...)
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 4/5] tests/intel/gem_ccs: Add compression support for Lunarlake Akshata Jahagirdar
@ 2023-11-10 21:24 ` Akshata Jahagirdar
  5 siblings, 0 replies; 16+ messages in thread
From: Akshata Jahagirdar @ 2023-11-10 21:24 UTC (permalink / raw)
  Cc: igt-dev, ayaz.siddiqui, Akshata Jahagirdar, matthew.auld

In XE2 IGFX platform, sysmem also participates in compression. So, create all blt objects in sysmem itself, and update the pat-index to reflect the compression status.
Since we need to align the buffer object size with page_size and also have the src size and dst size of CCS copy to be equal, change the default width and height to 1024.

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
 tests/intel/xe_ccs.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/tests/intel/xe_ccs.c b/tests/intel/xe_ccs.c
index 647a6bd2e..d0444d0df 100644
--- a/tests/intel/xe_ccs.c
+++ b/tests/intel/xe_ccs.c
@@ -63,8 +63,8 @@ static struct param {
 	.write_png = false,
 	.print_bb = false,
 	.print_surface_info = false,
-	.width = 512,
-	.height = 512,
+	.width = 1024,
+	.height = 1024,
 };
 
 struct test_config {
@@ -95,7 +95,7 @@ static void surf_copy(int xe,
 	struct blt_block_copy_data_ext ext = {};
 	struct blt_ctrl_surf_copy_data surf = {};
 	uint32_t bb1, bb2, ccs, ccs2, *ccsmap, *ccsmap2;
-	uint64_t bb_size, ccssize = mid->size / CCS_RATIO;
+	uint64_t bb_size, ccssize = mid->size / (CCS_RATIO(xe));
 	uint32_t *ccscopy;
 	uint8_t uc_mocs = intel_get_uc_mocs_index(xe);
 	uint32_t sysmem = system_memory(xe);
@@ -103,13 +103,13 @@ static void surf_copy(int xe,
 
 	igt_assert(mid->compression);
 	ccscopy = (uint32_t *) malloc(ccssize);
-	ccs = xe_bo_create_flags(xe, 0, ccssize, sysmem);
-	ccs2 = xe_bo_create_flags(xe, 0, ccssize, sysmem);
+	ccs = xe_bo_create_caching(xe, 0, ccssize, sysmem, DRM_XE_GEM_CPU_CACHING_WC);
+	ccs2 = xe_bo_create_caching(xe, 0, ccssize, sysmem, DRM_XE_GEM_CPU_CACHING_WC);
 
 	blt_ctrl_surf_copy_init(xe, &surf);
 	surf.print_bb = param.print_bb;
 	blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
-				 uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
+				 uc_mocs, intel_get_pat_idx_uc_comp(xe), BLT_INDIRECT_ACCESS);
 	blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs,
 				 DEFAULT_PAT_INDEX, DIRECT_ACCESS);
 	bb_size = xe_get_default_alignment(xe);
@@ -157,7 +157,7 @@ static void surf_copy(int xe,
 	blt_set_ctrl_surf_object(&surf.src, ccs, sysmem, ccssize,
 				 uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
 	blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
-				 uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
+				 uc_mocs, intel_get_pat_idx_uc_comp(xe), INDIRECT_ACCESS);
 	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
 	intel_ctx_xe_sync(ctx, true);
 
@@ -234,10 +234,10 @@ static int blt_block_copy3(int xe,
 	igt_assert_f(blt3, "block-copy3 requires data to do blit\n");
 
 	alignment = xe_get_default_alignment(xe);
-	get_offset(ahnd, blt3->src.handle, blt3->src.size, alignment);
-	get_offset(ahnd, blt3->mid.handle, blt3->mid.size, alignment);
-	get_offset(ahnd, blt3->dst.handle, blt3->dst.size, alignment);
-	get_offset(ahnd, blt3->final.handle, blt3->final.size, alignment);
+	get_offset_pat_index(ahnd, blt3->src.handle, blt3->src.size, alignment, blt3->src.pat_index);
+	get_offset_pat_index(ahnd, blt3->mid.handle, blt3->mid.size, alignment, blt3->mid.pat_index);
+	get_offset_pat_index(ahnd, blt3->dst.handle, blt3->dst.size, alignment, blt3->dst.pat_index);
+	get_offset_pat_index(ahnd, blt3->final.handle, blt3->final.size, alignment, blt3->final.pat_index);
 	bb_offset = get_offset(ahnd, blt3->bb.handle, blt3->bb.size, alignment);
 
 	/* First blit src -> mid */
@@ -291,8 +291,8 @@ static void block_copy(int xe,
 	uint64_t bb_size = xe_get_default_alignment(xe);
 	uint64_t ahnd = intel_allocator_open(xe, ctx->vm, INTEL_ALLOCATOR_RELOC);
 	uint32_t run_id = mid_tiling;
-	uint32_t mid_region = region2, bb;
-	uint32_t width = param.width, height = param.height;
+	uint32_t mid_region = (AT_LEAST_GEN(intel_get_drm_devid(xe), 20) & !xe_has_vram(xe)) ? region1 : region2;
+	uint32_t width = param.width, height = param.height, bb;
 	enum blt_compression mid_compression = config->compression;
 	int mid_compression_format = param.compression_format;
 	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
@@ -413,8 +413,8 @@ static void block_multicopy(int xe,
 	uint64_t bb_size = xe_get_default_alignment(xe);
 	uint64_t ahnd = intel_allocator_open(xe, ctx->vm, INTEL_ALLOCATOR_RELOC);
 	uint32_t run_id = mid_tiling;
-	uint32_t mid_region = region2, bb;
-	uint32_t width = param.width, height = param.height;
+	uint32_t mid_region = (AT_LEAST_GEN(intel_get_drm_devid(xe), 20) & !xe_has_vram(xe)) ? region1 : region2;
+	uint32_t width = param.width, height = param.height, bb;
 	enum blt_compression mid_compression = config->compression;
 	int mid_compression_format = param.compression_format;
 	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
@@ -539,8 +539,8 @@ static void block_copy_test(int xe,
 			region1 = igt_collection_get_value(regions, 0);
 			region2 = igt_collection_get_value(regions, 1);
 
-			/* Compressed surface must be in device memory */
-			if (config->compression && !XE_IS_VRAM_MEMORY_REGION(xe, region2))
+			/* if not XE2, then Compressed surface must be in device memory */
+			if (config->compression && !(AT_LEAST_GEN((intel_get_drm_devid(xe)), 20)) && !XE_IS_VRAM_MEMORY_REGION(xe, region2))
 				continue;
 
 			regtxt = xe_memregion_dynamic_subtest_name(xe, regions);
@@ -621,8 +621,8 @@ const char *help_str =
 	"  -p\tWrite PNG\n"
 	"  -s\tPrint surface info\n"
 	"  -t\tTiling format (0 - linear, 1 - XMAJOR, 2 - YMAJOR, 3 - TILE4, 4 - TILE64)\n"
-	"  -W\tWidth (default 512)\n"
-	"  -H\tHeight (default 512)"
+	"  -W\tWidth (default 1024)\n"
+	"  -H\tHeight (default 1024)"
 	;
 
 igt_main_args("bf:pst:W:H:", NULL, help_str, opt_handler, NULL)
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 1/5] lib: add blt command properties for lunarlake
  2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 1/5] lib: add blt command properties for lunarlake Akshata Jahagirdar
@ 2023-11-13  8:42   ` Karolina Stolarek
  2023-11-15 21:55     ` Jahagirdar, Akshata
  0 siblings, 1 reply; 16+ messages in thread
From: Karolina Stolarek @ 2023-11-13  8:42 UTC (permalink / raw)
  To: Akshata Jahagirdar; +Cc: igt-dev, ayaz.siddiqui, matthew.auld


On 10.11.2023 22:24, Akshata Jahagirdar wrote:
> Add blt_cmd_info structs to describe properties of XY_BLOCK_COPY and XY_FAST_COPY blitter commands for XE2 platform. Updated the definitions for Lunarlake.

Nit: break the commit message at 80 column

> 
> Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
> ---
>   lib/intel_cmds_info.c   | 24 ++++++++++++++++++++++++
>   lib/intel_cmds_info.h   |  1 +
>   lib/intel_device_info.c |  2 +-
>   3 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/intel_cmds_info.c b/lib/intel_cmds_info.c
> index 2e51ec081..fc3554271 100644
> --- a/lib/intel_cmds_info.c
> +++ b/lib/intel_cmds_info.c
> @@ -67,6 +67,22 @@ static const struct blt_cmd_info
>   						 BLT_CMD_EXTENDED |
>   						 BLT_CMD_SUPPORTS_COMPRESSION);
>   
> +static const struct blt_cmd_info				
> +		xe2_xy_block_copy = BLT_INFO_EXT(XY_BLOCK_COPY,
> +						 BIT(T_LINEAR) |
> +						 BIT(T_XMAJOR) |
> +						 BIT(T_TILE4)  |
> +						 BIT(T_TILE64),
> +						 BLT_CMD_EXTENDED |
> +						 BLT_CMD_SUPPORTS_COMPRESSION);

I know this might be too much work, but could we confirm that XMAJOR is
actually supported with block copy on this platform? This could be as
easy as running xe_ccs test with -p parameter, and then checking the mid
image and the tiling pattern. But I won't argue it's essential for r-b,
it's more of a sanity check.

> +
> +static const struct blt_cmd_info
> +		xe2_xy_fast_copy = BLT_INFO(XY_FAST_COPY,
> +					    BIT(T_LINEAR) |
> +					    BIT(T_XMAJOR) |
> +					    BIT(T_TILE4)  |
> +					    BIT(T_TILE64));

This looks a lot like dg2_xy_fast_copy definition. Why not reuse it when
defining xe2_cmds_info?

With these comments addressed:

Reviewed-by: Karolina Stolarek <karolina.stolarek@intel.com>

> +
>   static const struct blt_cmd_info
>   		mtl_xy_block_copy = BLT_INFO_EXT(XY_BLOCK_COPY,
>   						 BIT(T_LINEAR) |
> @@ -169,6 +185,14 @@ const struct intel_cmds_info gen12_pvc_cmds_info = {
>   	}
>   };
>   
> +
> +const struct intel_cmds_info xe2_cmds_info  = {
> +	.blt_cmds = {
> +		[XY_FAST_COPY] = &xe2_xy_fast_copy,
> +		[XY_BLOCK_COPY] = &xe2_xy_block_copy,
> +	}
> +};
> +
>   const struct blt_cmd_info *blt_get_cmd_info(const struct intel_cmds_info *cmds_info,
>   					    enum blt_cmd_type cmd)
>   {
> diff --git a/lib/intel_cmds_info.h b/lib/intel_cmds_info.h
> index f9e3932d1..0a83b6a44 100644
> --- a/lib/intel_cmds_info.h
> +++ b/lib/intel_cmds_info.h
> @@ -55,6 +55,7 @@ extern const struct intel_cmds_info gen12_cmds_info;
>   extern const struct intel_cmds_info gen12_dg2_cmds_info;
>   extern const struct intel_cmds_info gen12_mtl_cmds_info;
>   extern const struct intel_cmds_info gen12_pvc_cmds_info;
> +extern const struct intel_cmds_info xe2_cmds_info;
>   
>   #define for_each_tiling(__tiling) \
>   	for (__tiling = T_LINEAR; __tiling < __BLT_MAX_TILING; __tiling++)
> diff --git a/lib/intel_device_info.c b/lib/intel_device_info.c
> index 34817f7b6..a669797c3 100644
> --- a/lib/intel_device_info.c
> +++ b/lib/intel_device_info.c
> @@ -511,7 +511,7 @@ static const struct intel_device_info intel_lunarlake_info = {
>   	.has_4tile = true,
>   	.is_lunarlake = true,
>   	.codename = "lunarlake",
> -	.cmds_info = &gen12_pvc_cmds_info,
> +	.cmds_info = &xe2_cmds_info,
>   };
>   
>   static const struct pci_id_match intel_device_match[] = {

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy
  2023-11-10 10:59   ` Matthew Auld
@ 2023-11-15 21:06     ` Jahagirdar, Akshata
  0 siblings, 0 replies; 16+ messages in thread
From: Jahagirdar, Akshata @ 2023-11-15 21:06 UTC (permalink / raw)
  To: Auld, Matthew; +Cc: igt-dev@lists.freedesktop.org, Siddiqui, Ayaz A

Hi,

Comments inline.

Best,
Akshata

-----Original Message-----
From: Auld, Matthew <matthew.auld@intel.com> 
Sent: Friday, November 10, 2023 3:00 AM
To: Jahagirdar, Akshata <akshata.jahagirdar@intel.com>
Cc: igt-dev@lists.freedesktop.org; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com>; Stolarek, Karolina <karolina.stolarek@intel.com>; Kempczynski, Zbigniew <zbigniew.kempczynski@intel.com>
Subject: Re: [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy

On 10/11/2023 21:24, Akshata Jahagirdar wrote:
> The Main-to-CCS Ratio for XE2 has been changed to 512:1. Update the CCS_RATIO macro to select relevant ratio based on platform.
> Since the PAGE_SIZE of sysmem is 4K, update the size of ctrl_copy to reflect this change.

Nit: very long line.

Will fix this.

> 
> Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
> ---
>   lib/intel_blt.c | 19 +++++++++++--------
>   lib/intel_blt.h |  2 +-
>   2 files changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/lib/intel_blt.c b/lib/intel_blt.c index 
> c0593930c..8168ce7b5 100644
> --- a/lib/intel_blt.c
> +++ b/lib/intel_blt.c
> @@ -948,15 +948,16 @@ int blt_block_copy(int fd,
>   	return ret;
>   }
>   
> -static uint16_t __ccs_size(const struct blt_ctrl_surf_copy_data 
> *surf)
> +static uint16_t __ccs_size(int fd, const struct 
> +blt_ctrl_surf_copy_data *surf)
>   {
>   	uint32_t src_size, dst_size;
> +	uint16_t ccsratio = CCS_RATIO(fd);
>   
>   	src_size = surf->src.access_type == DIRECT_ACCESS ?
> -				surf->src.size : surf->src.size / CCS_RATIO;
> +				surf->src.size : surf->src.size / ccsratio;
>   
>   	dst_size = surf->dst.access_type == DIRECT_ACCESS ?
> -				surf->dst.size : surf->dst.size / CCS_RATIO;
> +				surf->dst.size : surf->dst.size / ccsratio;
>   
>   	igt_assert_f(src_size <= dst_size, "dst size must be >= src size 
> for CCS copy\n");
>   
> @@ -1118,6 +1119,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>   	uint64_t dst_offset, src_offset, bb_offset, alignment;
>   	uint32_t bbe = MI_BATCH_BUFFER_END;
>   	uint32_t *bb;
> +	uint16_t num_ccs_blocks = xe_get_default_alignment(fd) / 
> +(CCS_RATIO(fd));
>   
>   	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
>   	igt_assert_f(surf, "ctrl-surf-copy requires data to do 
> ctrl-surf-copy blit\n"); @@ -1136,7 +1138,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>   		data.xe2.dw00.dst_access_type = surf->dst.access_type;
>   
>   		/* Ensure dst has size capable to keep src ccs aux */
> -		data.xe2.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
> +		data.xe2.dw00.size_of_ctrl_copy = __ccs_size(fd, surf) / 
> +num_ccs_blocks - 1;
>   		data.xe2.dw00.length = 0x3;
>   
>   		data.xe2.dw01.src_address_lo = src_offset; @@ -1155,7 +1157,7 @@ 
> uint64_t emit_blt_ctrl_surf_copy(int fd,
>   		data.gen12.dw00.dst_access_type = surf->dst.access_type;
>   
>   		/* Ensure dst has size capable to keep src ccs aux */
> -		data.gen12.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
> +		data.gen12.dw00.size_of_ctrl_copy = __ccs_size(fd,surf) / 
> +num_ccs_blocks - 1;
>   		data.gen12.dw00.length = 0x3;
>   
>   		data.gen12.dw01.src_address_lo = src_offset; @@ -1794,7 +1796,8 @@ 
> blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>   	uint64_t size = width * height * bpp / 8;
>   	uint32_t stride = tiling == T_LINEAR ? width * 4 : width;
>   	uint32_t handle;
> -
> +	uint8_t pat_index = compression ? intel_get_pat_idx_uc_comp(blt->fd) 
> +: intel_get_pat_idx_uc(blt->fd);

intel_get_pat_idx_uc_comp() has not been defined yet in the series?

The patch order seems to be messed up, will fix this.

> +	uint16_t cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;

Maybe make this conditional on xe2+ and compression = true? I figure we still want wb (if possible) for non-compressed stuff.

Also commit doesn't really explain this change in behaviour? Should this change be separated?

Yes, adding the condition sounds good. Will address this change is commit message too.

>   	igt_assert_f(blt->driver, "Driver isn't set, have you called 
> blt_copy_init()?\n");
>   
>   	obj = calloc(1, sizeof(*obj));
> @@ -1808,13 +1811,13 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>   			flags |= XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM;
>   
>   		size = ALIGN(size, xe_get_default_alignment(blt->fd));
> -		handle = xe_bo_create_flags(blt->fd, 0, size, flags);
> +		handle = xe_bo_create_caching(blt->fd, 0, size, region, 
> +cpu_caching);
>   	} else {
>   		igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
>   							  &size, region) == 0);
>   	}
>   
> -	blt_set_object(obj, handle, size, region, mocs_index, DEFAULT_PAT_INDEX, tiling,
> +	blt_set_object(obj, handle, size, region, mocs_index, pat_index, 
> +tiling,
>   		       compression, compression_type);
>   	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
>   
> diff --git a/lib/intel_blt.h b/lib/intel_blt.h index 
> 5934ccd67..a7aeab595 100644
> --- a/lib/intel_blt.h
> +++ b/lib/intel_blt.h
> @@ -52,7 +52,7 @@
>   #include "igt.h"
>   #include "intel_cmds_info.h"
>   
> -#define CCS_RATIO 256
> +#define CCS_RATIO(xe) AT_LEAST_GEN(intel_get_drm_devid(xe), 20) ? 512 
> +: 256
>   
>   enum blt_color_depth {
>   	CD_8bit,

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 3/5] lib/intel_pat : Add uc_comp pat_index
  2023-11-10 11:06   ` Matthew Auld
@ 2023-11-15 21:38     ` Jahagirdar, Akshata
  0 siblings, 0 replies; 16+ messages in thread
From: Jahagirdar, Akshata @ 2023-11-15 21:38 UTC (permalink / raw)
  To: Auld, Matthew; +Cc: igt-dev@lists.freedesktop.org, Siddiqui, Ayaz A

HI,

Comments inline.

Best,
Akshata

-----Original Message-----
From: Auld, Matthew <matthew.auld@intel.com> 
Sent: Friday, November 10, 2023 3:06 AM
To: Jahagirdar, Akshata <akshata.jahagirdar@intel.com>
Cc: igt-dev@lists.freedesktop.org; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com>; Stolarek, Karolina <karolina.stolarek@intel.com>; Kempczynski, Zbigniew <zbigniew.kempczynski@intel.com>
Subject: Re: [PATCH i-g-t 3/5] lib/intel_pat : Add uc_comp pat_index

On 10/11/2023 21:24, Akshata Jahagirdar wrote:
> Compression in XE2 is programmed through pat-index attribute. Add a dedicated pat-index for compression for XE2 and later platforms. The helper function returns uc_pat_index for previous platforms.
> 
> Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
> ---
>   lib/intel_pat.c | 12 +++++++++++-
>   lib/intel_pat.h |  1 +
>   2 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/intel_pat.c b/lib/intel_pat.c index 
> 2b892ee52..36f13642f 100644
> --- a/lib/intel_pat.c
> +++ b/lib/intel_pat.c
> @@ -11,7 +11,7 @@ struct intel_pat_cache {
>   	uint8_t uc; /* UC + COH_NONE */
>   	uint8_t wt; /* WT + COH_NONE */
>   	uint8_t wb; /* WB + COH_AT_LEAST_1WAY */
> -
> +	uint8_t uc_comp; /* UC + COH_NONE + COMPRESSION*/

Maybe prefix this with xe2? i.e xe2_uc_comp.

AJ: yes, will fix this.

>   	uint8_t max_index;
>   };
>   
> @@ -23,6 +23,7 @@ static void intel_get_pat_idx(int fd, struct intel_pat_cache *pat)
>   		pat->uc = 3;
>   		pat->wt = 15; /* Compressed + WB-transient */
>   		pat->wb = 2;
> +		pat->uc_comp = 12; /* Compressed + UC */
>   		pat->max_index = 31;
>   	} else if (IS_METEORLAKE(dev_id)) {
>   		pat->uc = 2;
> @@ -60,6 +61,15 @@ uint8_t intel_get_pat_idx_uc(int fd)
>   	return pat.uc;
>   }
>   
> +uint8_t intel_get_pat_idx_uc_comp(int fd) {
> +	struct intel_pat_cache pat = {};
> +	uint16_t dev_id = intel_get_drm_devid(fd);
> +	
> +	intel_get_pat_idx(fd, &pat);
> +	return (intel_get_device_info(dev_id)->graphics_ver >= 20) ? 
> +pat.uc_comp : pat.uc;

Maybe make just make this xe2_get_pat_idx_uc_comp()? Or if we really do want to return plain uc on pre-xe2 then maybe make that clearer somehow in the api? intel_get_pat_idx_uc_comp_if_possible()? But perhaps caller should just handle this.

AJ: yeah, I think we can make it xe2_get_pat_idx_uc_comp() and not return pre-xe2, instead handle it from caller.

> +}
> +
>   uint8_t intel_get_pat_idx_wt(int fd)
>   {
>   	struct intel_pat_cache pat = {};
> diff --git a/lib/intel_pat.h b/lib/intel_pat.h index 
> c24dbc275..6d68b9fc2 100644
> --- a/lib/intel_pat.h
> +++ b/lib/intel_pat.h
> @@ -13,6 +13,7 @@
>   uint8_t intel_get_max_pat_index(int fd);
>   
>   uint8_t intel_get_pat_idx_uc(int fd);
> +uint8_t intel_get_pat_idx_uc_comp(int fd);
>   uint8_t intel_get_pat_idx_wt(int fd);
>   uint8_t intel_get_pat_idx_wb(int fd);
>   

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 1/5] lib: add blt command properties for lunarlake
  2023-11-13  8:42   ` Karolina Stolarek
@ 2023-11-15 21:55     ` Jahagirdar, Akshata
  0 siblings, 0 replies; 16+ messages in thread
From: Jahagirdar, Akshata @ 2023-11-15 21:55 UTC (permalink / raw)
  To: Stolarek, Karolina
  Cc: igt-dev@lists.freedesktop.org, Siddiqui, Ayaz A, Auld, Matthew

HI,
Comments inline.


-----Original Message-----
From: Stolarek, Karolina <karolina.stolarek@intel.com> 
Sent: Monday, November 13, 2023 12:42 AM
To: Jahagirdar, Akshata <akshata.jahagirdar@intel.com>
Cc: igt-dev@lists.freedesktop.org; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com>; Kempczynski, Zbigniew <zbigniew.kempczynski@intel.com>; Auld, Matthew <matthew.auld@intel.com>
Subject: Re: [PATCH i-g-t 1/5] lib: add blt command properties for lunarlake


On 10.11.2023 22:24, Akshata Jahagirdar wrote:
> Add blt_cmd_info structs to describe properties of XY_BLOCK_COPY and XY_FAST_COPY blitter commands for XE2 platform. Updated the definitions for Lunarlake.

Nit: break the commit message at 80 column

AJ: Will fix this.

> 
> Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
> ---
>   lib/intel_cmds_info.c   | 24 ++++++++++++++++++++++++
>   lib/intel_cmds_info.h   |  1 +
>   lib/intel_device_info.c |  2 +-
>   3 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/intel_cmds_info.c b/lib/intel_cmds_info.c index 
> 2e51ec081..fc3554271 100644
> --- a/lib/intel_cmds_info.c
> +++ b/lib/intel_cmds_info.c
> @@ -67,6 +67,22 @@ static const struct blt_cmd_info
>   						 BLT_CMD_EXTENDED |
>   						 BLT_CMD_SUPPORTS_COMPRESSION);
>   
> +static const struct blt_cmd_info				
> +		xe2_xy_block_copy = BLT_INFO_EXT(XY_BLOCK_COPY,
> +						 BIT(T_LINEAR) |
> +						 BIT(T_XMAJOR) |
> +						 BIT(T_TILE4)  |
> +						 BIT(T_TILE64),
> +						 BLT_CMD_EXTENDED |
> +						 BLT_CMD_SUPPORTS_COMPRESSION);

I know this might be too much work, but could we confirm that XMAJOR is actually supported with block copy on this platform? This could be as easy as running xe_ccs test with -p parameter, and then checking the mid image and the tiling pattern. But I won't argue it's essential for r-b, it's more of a sanity check.

AJ: I have verified this through bspec, as well as tests are passing on the platform.
> +
> +static const struct blt_cmd_info
> +		xe2_xy_fast_copy = BLT_INFO(XY_FAST_COPY,
> +					    BIT(T_LINEAR) |
> +					    BIT(T_XMAJOR) |
> +					    BIT(T_TILE4)  |
> +					    BIT(T_TILE64));

This looks a lot like dg2_xy_fast_copy definition. Why not reuse it when defining xe2_cmds_info?

AJ: Yes, I thought of keeping it separate to address any other differences.

With these comments addressed:

Reviewed-by: Karolina Stolarek <karolina.stolarek@intel.com>

> +
>   static const struct blt_cmd_info
>   		mtl_xy_block_copy = BLT_INFO_EXT(XY_BLOCK_COPY,
>   						 BIT(T_LINEAR) |
> @@ -169,6 +185,14 @@ const struct intel_cmds_info gen12_pvc_cmds_info = {
>   	}
>   };
>   
> +
> +const struct intel_cmds_info xe2_cmds_info  = {
> +	.blt_cmds = {
> +		[XY_FAST_COPY] = &xe2_xy_fast_copy,
> +		[XY_BLOCK_COPY] = &xe2_xy_block_copy,
> +	}
> +};
> +
>   const struct blt_cmd_info *blt_get_cmd_info(const struct intel_cmds_info *cmds_info,
>   					    enum blt_cmd_type cmd)
>   {
> diff --git a/lib/intel_cmds_info.h b/lib/intel_cmds_info.h index 
> f9e3932d1..0a83b6a44 100644
> --- a/lib/intel_cmds_info.h
> +++ b/lib/intel_cmds_info.h
> @@ -55,6 +55,7 @@ extern const struct intel_cmds_info gen12_cmds_info;
>   extern const struct intel_cmds_info gen12_dg2_cmds_info;
>   extern const struct intel_cmds_info gen12_mtl_cmds_info;
>   extern const struct intel_cmds_info gen12_pvc_cmds_info;
> +extern const struct intel_cmds_info xe2_cmds_info;
>   
>   #define for_each_tiling(__tiling) \
>   	for (__tiling = T_LINEAR; __tiling < __BLT_MAX_TILING; __tiling++) 
> diff --git a/lib/intel_device_info.c b/lib/intel_device_info.c index 
> 34817f7b6..a669797c3 100644
> --- a/lib/intel_device_info.c
> +++ b/lib/intel_device_info.c
> @@ -511,7 +511,7 @@ static const struct intel_device_info intel_lunarlake_info = {
>   	.has_4tile = true,
>   	.is_lunarlake = true,
>   	.codename = "lunarlake",
> -	.cmds_info = &gen12_pvc_cmds_info,
> +	.cmds_info = &xe2_cmds_info,
>   };
>   
>   static const struct pci_id_match intel_device_match[] = {

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy
  2023-11-16 15:38 ` [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy Akshata Jahagirdar
@ 2023-11-16 15:08   ` Kamil Konieczny
  2023-11-16 18:44     ` Mishra, Pallavi
  0 siblings, 1 reply; 16+ messages in thread
From: Kamil Konieczny @ 2023-11-16 15:08 UTC (permalink / raw)
  To: igt-dev; +Cc: matthew.auld, ayaz.siddiqui, Akshata Jahagirdar

Hi Akshata,
On 2023-11-16 at 07:38:31 -0800, Akshata Jahagirdar wrote:

your patch do not apply, please first make sure it will merge
with upstream igt repo, then send new version. Also add version
number like:

[PATCH i-g-t v3 2/5] lib/intel_blt: Update calculation of ...

> The Main-to-CCS Ratio for XE2 has been changed to 512:1.
> Update the CCS_RATIO macro to select relevant ratio based on platform.
> Since the PAGE_SIZE of sysmem is 4K, update the size of ctrl_copy to reflect this change.
> The pat-index and caching mode for compression need to change to xe2_uc_comp and WC in case
> of compression, else they just take the default value of pat_index and WB.

Could you split this patch into two, first with calculations and
next one with pat index?

> 
> Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
> ---
>  lib/intel_blt.c | 23 +++++++++++++++--------
>  lib/intel_blt.h |  2 +-
>  2 files changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/lib/intel_blt.c b/lib/intel_blt.c
> index c0593930c..764b27213 100644
> --- a/lib/intel_blt.c
> +++ b/lib/intel_blt.c
> @@ -948,15 +948,16 @@ int blt_block_copy(int fd,
>  	return ret;
>  }
>  
> -static uint16_t __ccs_size(const struct blt_ctrl_surf_copy_data *surf)
> +static uint16_t __ccs_size(int fd, const struct blt_ctrl_surf_copy_data *surf)
>  {
>  	uint32_t src_size, dst_size;
> +	uint16_t ccsratio = CCS_RATIO(fd);
>  
>  	src_size = surf->src.access_type == DIRECT_ACCESS ?
> -				surf->src.size : surf->src.size / CCS_RATIO;
> +				surf->src.size : surf->src.size / ccsratio;
>  
>  	dst_size = surf->dst.access_type == DIRECT_ACCESS ?
> -				surf->dst.size : surf->dst.size / CCS_RATIO;
> +				surf->dst.size : surf->dst.size / ccsratio;
>  
>  	igt_assert_f(src_size <= dst_size, "dst size must be >= src size for CCS copy\n");
>  
> @@ -1118,6 +1119,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>  	uint64_t dst_offset, src_offset, bb_offset, alignment;
>  	uint32_t bbe = MI_BATCH_BUFFER_END;
>  	uint32_t *bb;
> +	uint16_t num_ccs_blocks = xe_get_default_alignment(fd) / (CCS_RATIO(fd));
>  
>  	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
>  	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
> @@ -1136,7 +1138,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>  		data.xe2.dw00.dst_access_type = surf->dst.access_type;
>  
>  		/* Ensure dst has size capable to keep src ccs aux */
> -		data.xe2.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
> +		data.xe2.dw00.size_of_ctrl_copy = __ccs_size(fd, surf) / num_ccs_blocks - 1;
>  		data.xe2.dw00.length = 0x3;
>  
>  		data.xe2.dw01.src_address_lo = src_offset;
> @@ -1155,7 +1157,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>  		data.gen12.dw00.dst_access_type = surf->dst.access_type;
>  
>  		/* Ensure dst has size capable to keep src ccs aux */
> -		data.gen12.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
> +		data.gen12.dw00.size_of_ctrl_copy = __ccs_size(fd,surf) / num_ccs_blocks - 1;
>  		data.gen12.dw00.length = 0x3;
>  
>  		data.gen12.dw01.src_address_lo = src_offset;
> @@ -1794,7 +1796,8 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>  	uint64_t size = width * height * bpp / 8;
>  	uint32_t stride = tiling == T_LINEAR ? width * 4 : width;
>  	uint32_t handle;
> -
> +	uint8_t pat_index = intel_get_pat_idx_uc(blt->fd);
> +	uint16_t cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
>  	igt_assert_f(blt->driver, "Driver isn't set, have you called blt_copy_init()?\n");
>  
>  	obj = calloc(1, sizeof(*obj));
> @@ -1804,17 +1807,21 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,

This hunk do not apply.

Regards,
Kamil

>  	if (blt->driver == INTEL_DRIVER_XE) {
>  		uint64_t flags = region;
>  
> +		if(AT_LEAST_GEN(intel_get_drm_devid(xe), 20) && mid->compression){
> +			pat_index = xe2_get_pat_idx_uc_comp(blt->fd);
> +			cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
> +		}
>  		if (create_mapping && region != system_memory(blt->fd))
>  			flags |= XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM;
>  
>  		size = ALIGN(size, xe_get_default_alignment(blt->fd));
> -		handle = xe_bo_create_flags(blt->fd, 0, size, flags);
> +		handle = xe_bo_create_caching(blt->fd, 0, size, region, cpu_caching);
>  	} else {
>  		igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
>  							  &size, region) == 0);
>  	}
>  
> -	blt_set_object(obj, handle, size, region, mocs_index, DEFAULT_PAT_INDEX, tiling,
> +	blt_set_object(obj, handle, size, region, mocs_index, pat_index, tiling,
>  		       compression, compression_type);
>  	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
>  
> diff --git a/lib/intel_blt.h b/lib/intel_blt.h
> index 5934ccd67..a7aeab595 100644
> --- a/lib/intel_blt.h
> +++ b/lib/intel_blt.h
> @@ -52,7 +52,7 @@
>  #include "igt.h"
>  #include "intel_cmds_info.h"
>  
> -#define CCS_RATIO 256
> +#define CCS_RATIO(xe) AT_LEAST_GEN(intel_get_drm_devid(xe), 20) ? 512 : 256
>  
>  enum blt_color_depth {
>  	CD_8bit,
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy
  2023-11-16 15:38 [igt-dev] [PATCH i-g-t 0/5] Compression " Akshata Jahagirdar
@ 2023-11-16 15:38 ` Akshata Jahagirdar
  2023-11-16 15:08   ` Kamil Konieczny
  0 siblings, 1 reply; 16+ messages in thread
From: Akshata Jahagirdar @ 2023-11-16 15:38 UTC (permalink / raw)
  Cc: igt-dev, ayaz.siddiqui, Akshata Jahagirdar, matthew.auld

The Main-to-CCS Ratio for XE2 has been changed to 512:1.
Update the CCS_RATIO macro to select relevant ratio based on platform.
Since the PAGE_SIZE of sysmem is 4K, update the size of ctrl_copy to reflect this change.
The pat-index and caching mode for compression need to change to xe2_uc_comp and WC in case
of compression, else they just take the default value of pat_index and WB.

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
 lib/intel_blt.c | 23 +++++++++++++++--------
 lib/intel_blt.h |  2 +-
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/lib/intel_blt.c b/lib/intel_blt.c
index c0593930c..764b27213 100644
--- a/lib/intel_blt.c
+++ b/lib/intel_blt.c
@@ -948,15 +948,16 @@ int blt_block_copy(int fd,
 	return ret;
 }
 
-static uint16_t __ccs_size(const struct blt_ctrl_surf_copy_data *surf)
+static uint16_t __ccs_size(int fd, const struct blt_ctrl_surf_copy_data *surf)
 {
 	uint32_t src_size, dst_size;
+	uint16_t ccsratio = CCS_RATIO(fd);
 
 	src_size = surf->src.access_type == DIRECT_ACCESS ?
-				surf->src.size : surf->src.size / CCS_RATIO;
+				surf->src.size : surf->src.size / ccsratio;
 
 	dst_size = surf->dst.access_type == DIRECT_ACCESS ?
-				surf->dst.size : surf->dst.size / CCS_RATIO;
+				surf->dst.size : surf->dst.size / ccsratio;
 
 	igt_assert_f(src_size <= dst_size, "dst size must be >= src size for CCS copy\n");
 
@@ -1118,6 +1119,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
 	uint64_t dst_offset, src_offset, bb_offset, alignment;
 	uint32_t bbe = MI_BATCH_BUFFER_END;
 	uint32_t *bb;
+	uint16_t num_ccs_blocks = xe_get_default_alignment(fd) / (CCS_RATIO(fd));
 
 	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
 	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
@@ -1136,7 +1138,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
 		data.xe2.dw00.dst_access_type = surf->dst.access_type;
 
 		/* Ensure dst has size capable to keep src ccs aux */
-		data.xe2.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
+		data.xe2.dw00.size_of_ctrl_copy = __ccs_size(fd, surf) / num_ccs_blocks - 1;
 		data.xe2.dw00.length = 0x3;
 
 		data.xe2.dw01.src_address_lo = src_offset;
@@ -1155,7 +1157,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
 		data.gen12.dw00.dst_access_type = surf->dst.access_type;
 
 		/* Ensure dst has size capable to keep src ccs aux */
-		data.gen12.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
+		data.gen12.dw00.size_of_ctrl_copy = __ccs_size(fd,surf) / num_ccs_blocks - 1;
 		data.gen12.dw00.length = 0x3;
 
 		data.gen12.dw01.src_address_lo = src_offset;
@@ -1794,7 +1796,8 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
 	uint64_t size = width * height * bpp / 8;
 	uint32_t stride = tiling == T_LINEAR ? width * 4 : width;
 	uint32_t handle;
-
+	uint8_t pat_index = intel_get_pat_idx_uc(blt->fd);
+	uint16_t cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
 	igt_assert_f(blt->driver, "Driver isn't set, have you called blt_copy_init()?\n");
 
 	obj = calloc(1, sizeof(*obj));
@@ -1804,17 +1807,21 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
 	if (blt->driver == INTEL_DRIVER_XE) {
 		uint64_t flags = region;
 
+		if(AT_LEAST_GEN(intel_get_drm_devid(xe), 20) && mid->compression){
+			pat_index = xe2_get_pat_idx_uc_comp(blt->fd);
+			cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
+		}
 		if (create_mapping && region != system_memory(blt->fd))
 			flags |= XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM;
 
 		size = ALIGN(size, xe_get_default_alignment(blt->fd));
-		handle = xe_bo_create_flags(blt->fd, 0, size, flags);
+		handle = xe_bo_create_caching(blt->fd, 0, size, region, cpu_caching);
 	} else {
 		igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
 							  &size, region) == 0);
 	}
 
-	blt_set_object(obj, handle, size, region, mocs_index, DEFAULT_PAT_INDEX, tiling,
+	blt_set_object(obj, handle, size, region, mocs_index, pat_index, tiling,
 		       compression, compression_type);
 	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
 
diff --git a/lib/intel_blt.h b/lib/intel_blt.h
index 5934ccd67..a7aeab595 100644
--- a/lib/intel_blt.h
+++ b/lib/intel_blt.h
@@ -52,7 +52,7 @@
 #include "igt.h"
 #include "intel_cmds_info.h"
 
-#define CCS_RATIO 256
+#define CCS_RATIO(xe) AT_LEAST_GEN(intel_get_drm_devid(xe), 20) ? 512 : 256
 
 enum blt_color_depth {
 	CD_8bit,
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy
  2023-11-16 15:08   ` Kamil Konieczny
@ 2023-11-16 18:44     ` Mishra, Pallavi
  0 siblings, 0 replies; 16+ messages in thread
From: Mishra, Pallavi @ 2023-11-16 18:44 UTC (permalink / raw)
  To: Kamil Konieczny, igt-dev@lists.freedesktop.org
  Cc: Jahagirdar, Akshata, Auld, Matthew, Siddiqui, Ayaz A



> -----Original Message-----
> From: igt-dev <igt-dev-bounces@lists.freedesktop.org> On Behalf Of Kamil
> Konieczny
> Sent: Thursday, November 16, 2023 7:09 AM
> To: igt-dev@lists.freedesktop.org
> Cc: Auld, Matthew <matthew.auld@intel.com>; Siddiqui, Ayaz A
> <ayaz.siddiqui@intel.com>; Jahagirdar, Akshata
> <akshata.jahagirdar@intel.com>
> Subject: Re: [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of
> ccs_size and size_of_ctrl_copy
> 
> Hi Akshata,
> On 2023-11-16 at 07:38:31 -0800, Akshata Jahagirdar wrote:
> 
> your patch do not apply, please first make sure it will merge with upstream igt

These changes are based on https://patchwork.freedesktop.org/series/124667/ which is not merged yet. Hence patch does not apply.

Thanks,
Pallavi

> repo, then send new version. Also add version number like:
> 
> [PATCH i-g-t v3 2/5] lib/intel_blt: Update calculation of ...
> 
> > The Main-to-CCS Ratio for XE2 has been changed to 512:1.
> > Update the CCS_RATIO macro to select relevant ratio based on platform.
> > Since the PAGE_SIZE of sysmem is 4K, update the size of ctrl_copy to reflect
> this change.
> > The pat-index and caching mode for compression need to change to
> > xe2_uc_comp and WC in case of compression, else they just take the default
> value of pat_index and WB.
> 
> Could you split this patch into two, first with calculations and next one with
> pat index?
> 
> >
> > Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
> > ---
> >  lib/intel_blt.c | 23 +++++++++++++++--------  lib/intel_blt.h |  2 +-
> >  2 files changed, 16 insertions(+), 9 deletions(-)
> >
> > diff --git a/lib/intel_blt.c b/lib/intel_blt.c index
> > c0593930c..764b27213 100644
> > --- a/lib/intel_blt.c
> > +++ b/lib/intel_blt.c
> > @@ -948,15 +948,16 @@ int blt_block_copy(int fd,
> >  	return ret;
> >  }
> >
> > -static uint16_t __ccs_size(const struct blt_ctrl_surf_copy_data
> > *surf)
> > +static uint16_t __ccs_size(int fd, const struct
> > +blt_ctrl_surf_copy_data *surf)
> >  {
> >  	uint32_t src_size, dst_size;
> > +	uint16_t ccsratio = CCS_RATIO(fd);
> >
> >  	src_size = surf->src.access_type == DIRECT_ACCESS ?
> > -				surf->src.size : surf->src.size / CCS_RATIO;
> > +				surf->src.size : surf->src.size / ccsratio;
> >
> >  	dst_size = surf->dst.access_type == DIRECT_ACCESS ?
> > -				surf->dst.size : surf->dst.size / CCS_RATIO;
> > +				surf->dst.size : surf->dst.size / ccsratio;
> >
> >  	igt_assert_f(src_size <= dst_size, "dst size must be >= src size for
> > CCS copy\n");
> >
> > @@ -1118,6 +1119,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
> >  	uint64_t dst_offset, src_offset, bb_offset, alignment;
> >  	uint32_t bbe = MI_BATCH_BUFFER_END;
> >  	uint32_t *bb;
> > +	uint16_t num_ccs_blocks = xe_get_default_alignment(fd) /
> > +(CCS_RATIO(fd));
> >
> >  	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
> >  	igt_assert_f(surf, "ctrl-surf-copy requires data to do
> > ctrl-surf-copy blit\n"); @@ -1136,7 +1138,7 @@ uint64_t
> emit_blt_ctrl_surf_copy(int fd,
> >  		data.xe2.dw00.dst_access_type = surf->dst.access_type;
> >
> >  		/* Ensure dst has size capable to keep src ccs aux */
> > -		data.xe2.dw00.size_of_ctrl_copy = __ccs_size(surf) /
> CCS_RATIO - 1;
> > +		data.xe2.dw00.size_of_ctrl_copy = __ccs_size(fd, surf) /
> > +num_ccs_blocks - 1;
> >  		data.xe2.dw00.length = 0x3;
> >
> >  		data.xe2.dw01.src_address_lo = src_offset; @@ -1155,7
> +1157,7 @@
> > uint64_t emit_blt_ctrl_surf_copy(int fd,
> >  		data.gen12.dw00.dst_access_type = surf->dst.access_type;
> >
> >  		/* Ensure dst has size capable to keep src ccs aux */
> > -		data.gen12.dw00.size_of_ctrl_copy = __ccs_size(surf) /
> CCS_RATIO - 1;
> > +		data.gen12.dw00.size_of_ctrl_copy = __ccs_size(fd,surf) /
> > +num_ccs_blocks - 1;
> >  		data.gen12.dw00.length = 0x3;
> >
> >  		data.gen12.dw01.src_address_lo = src_offset; @@ -1794,7
> +1796,8 @@
> > blt_create_object(const struct blt_copy_data *blt, uint32_t region,
> >  	uint64_t size = width * height * bpp / 8;
> >  	uint32_t stride = tiling == T_LINEAR ? width * 4 : width;
> >  	uint32_t handle;
> > -
> > +	uint8_t pat_index = intel_get_pat_idx_uc(blt->fd);
> > +	uint16_t cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
> >  	igt_assert_f(blt->driver, "Driver isn't set, have you called
> > blt_copy_init()?\n");
> >
> >  	obj = calloc(1, sizeof(*obj));
> > @@ -1804,17 +1807,21 @@ blt_create_object(const struct blt_copy_data
> > *blt, uint32_t region,
> 
> This hunk do not apply.
> 
> Regards,
> Kamil
> 
> >  	if (blt->driver == INTEL_DRIVER_XE) {
> >  		uint64_t flags = region;
> >
> > +		if(AT_LEAST_GEN(intel_get_drm_devid(xe), 20) && mid-
> >compression){
> > +			pat_index = xe2_get_pat_idx_uc_comp(blt->fd);
> > +			cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
> > +		}
> >  		if (create_mapping && region != system_memory(blt->fd))
> >  			flags |=
> XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM;
> >
> >  		size = ALIGN(size, xe_get_default_alignment(blt->fd));
> > -		handle = xe_bo_create_flags(blt->fd, 0, size, flags);
> > +		handle = xe_bo_create_caching(blt->fd, 0, size, region,
> > +cpu_caching);
> >  	} else {
> >  		igt_assert(__gem_create_in_memory_regions(blt->fd,
> &handle,
> >  							  &size, region) == 0);
> >  	}
> >
> > -	blt_set_object(obj, handle, size, region, mocs_index,
> DEFAULT_PAT_INDEX, tiling,
> > +	blt_set_object(obj, handle, size, region, mocs_index, pat_index,
> > +tiling,
> >  		       compression, compression_type);
> >  	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
> >
> > diff --git a/lib/intel_blt.h b/lib/intel_blt.h index
> > 5934ccd67..a7aeab595 100644
> > --- a/lib/intel_blt.h
> > +++ b/lib/intel_blt.h
> > @@ -52,7 +52,7 @@
> >  #include "igt.h"
> >  #include "intel_cmds_info.h"
> >
> > -#define CCS_RATIO 256
> > +#define CCS_RATIO(xe) AT_LEAST_GEN(intel_get_drm_devid(xe), 20) ? 512
> > +: 256
> >
> >  enum blt_color_depth {
> >  	CD_8bit,
> > --
> > 2.34.1
> >

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-11-16 18:46 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-10 21:24 [igt-dev] [PATCH i-g-t 0/5] Compression support for Lunarlake Akshata Jahagirdar
2023-11-10 10:29 ` [igt-dev] ✗ Fi.CI.BUILD: failure for " Patchwork
2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 1/5] lib: add blt command properties for lunarlake Akshata Jahagirdar
2023-11-13  8:42   ` Karolina Stolarek
2023-11-15 21:55     ` Jahagirdar, Akshata
2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy Akshata Jahagirdar
2023-11-10 10:59   ` Matthew Auld
2023-11-15 21:06     ` Jahagirdar, Akshata
2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 3/5] lib/intel_pat : Add uc_comp pat_index Akshata Jahagirdar
2023-11-10 11:06   ` Matthew Auld
2023-11-15 21:38     ` Jahagirdar, Akshata
2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 4/5] tests/intel/gem_ccs: Add compression support for Lunarlake Akshata Jahagirdar
2023-11-10 21:24 ` [igt-dev] [PATCH i-g-t 5/5] tests/intel/xe_ccs: " Akshata Jahagirdar
  -- strict thread matches above, loose matches on Subject: below --
2023-11-16 15:38 [igt-dev] [PATCH i-g-t 0/5] Compression " Akshata Jahagirdar
2023-11-16 15:38 ` [igt-dev] [PATCH i-g-t 2/5] lib/intel_blt: Update calculation of ccs_size and size_of_ctrl_copy Akshata Jahagirdar
2023-11-16 15:08   ` Kamil Konieczny
2023-11-16 18:44     ` Mishra, Pallavi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox