* Re: [PATCH 0/6] Implement compression support on BMG
2024-07-10 7:53 [PATCH 0/6] Implement compression support on BMG Akshata Jahagirdar
@ 2024-07-09 10:33 ` Matthew Auld
2024-07-09 19:07 ` Jahagirdar, Akshata
2024-07-10 7:53 ` [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx Akshata Jahagirdar
` (6 subsequent siblings)
7 siblings, 1 reply; 25+ messages in thread
From: Matthew Auld @ 2024-07-09 10:33 UTC (permalink / raw)
To: Akshata Jahagirdar, intel-xe
Cc: matthew.d.roper, himal.prasad.ghimiray, lucas.demarchi
On 09/07/2024 11:49, Akshata Jahagirdar wrote:
> According to the SAS for BMG compression, we need to decompress during eviction,
> and not recompress on restore. Due to this, we need to introduce encoding pat_index
> in case of vram too. This patch explores the solution of setting up an additional
> identity map for the vram, this time at the end of previous mapping offset and
> with compressed pat_index.
> We then select the appropriate mapping during eviction/restore/clear.
Hey, I don't see this series on the ml. Are you subscribed?
>
> Akshata Jahagirdar (6):
> drm/xe/xe2: Introduce identity map for compressed pat for vram
> drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
> drm/xe/migrate: Add kunit to test clear functionality
> drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx
> drm/xe/migrate: Add kunit to test migration functionality for BMG
> drm/xe/xe2: Do not run xe_bo_test for xe2+ dgfx
>
> drivers/gpu/drm/xe/tests/xe_bo.c | 6 +
> drivers/gpu/drm/xe/tests/xe_migrate.c | 388 +++++++++++++++++++++
> drivers/gpu/drm/xe/tests/xe_migrate_test.c | 1 +
> drivers/gpu/drm/xe/tests/xe_migrate_test.h | 1 +
> drivers/gpu/drm/xe/xe_device.h | 5 +
> drivers/gpu/drm/xe/xe_migrate.c | 66 +++-
> 6 files changed, 449 insertions(+), 18 deletions(-)
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/6] Implement compression support on BMG
2024-07-09 10:33 ` Matthew Auld
@ 2024-07-09 19:07 ` Jahagirdar, Akshata
0 siblings, 0 replies; 25+ messages in thread
From: Jahagirdar, Akshata @ 2024-07-09 19:07 UTC (permalink / raw)
To: Matthew Auld, intel-xe
Cc: matthew.d.roper, himal.prasad.ghimiray, lucas.demarchi
On 7/9/2024 3:33 AM, Matthew Auld wrote:
> On 09/07/2024 11:49, Akshata Jahagirdar wrote:
>> According to the SAS for BMG compression, we need to decompress
>> during eviction,
>> and not recompress on restore. Due to this, we need to introduce
>> encoding pat_index
>> in case of vram too. This patch explores the solution of setting up
>> an additional
>> identity map for the vram, this time at the end of previous mapping
>> offset and
>> with compressed pat_index.
>> We then select the appropriate mapping during eviction/restore/clear.
>
> Hey, I don't see this series on the ml. Are you subscribed?
Hey Matt,
Yes, I am. It seems there is some issue with my mailing list
subscription, so it's not showing up on the website. I just resubscribed
and will send this out again.
>
>>
>> Akshata Jahagirdar (6):
>> drm/xe/xe2: Introduce identity map for compressed pat for vram
>> drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
>> drm/xe/migrate: Add kunit to test clear functionality
>> drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx
>> drm/xe/migrate: Add kunit to test migration functionality for BMG
>> drm/xe/xe2: Do not run xe_bo_test for xe2+ dgfx
>>
>> drivers/gpu/drm/xe/tests/xe_bo.c | 6 +
>> drivers/gpu/drm/xe/tests/xe_migrate.c | 388 +++++++++++++++++++++
>> drivers/gpu/drm/xe/tests/xe_migrate_test.c | 1 +
>> drivers/gpu/drm/xe/tests/xe_migrate_test.h | 1 +
>> drivers/gpu/drm/xe/xe_device.h | 5 +
>> drivers/gpu/drm/xe/xe_migrate.c | 66 +++-
>> 6 files changed, 449 insertions(+), 18 deletions(-)
>>
* [PATCH 0/6] Implement compression support on BMG
@ 2024-07-10 7:53 Akshata Jahagirdar
2024-07-09 10:33 ` Matthew Auld
` (7 more replies)
0 siblings, 8 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-10 7:53 UTC (permalink / raw)
To: intel-xe; +Cc: lucas.demarchi, lucas.demarchi, Akshata Jahagirdar
According to the SAS for BMG compression, we need to decompress during eviction,
and not recompress on restore. Due to this, we need to introduce encoding pat_index
in case of vram too. This patch explores the solution of setting up an additional
identity map for the vram, this time at the end of previous mapping offset and
with compressed pat_index.
We then select the appropriate mapping during eviction/restore/clear.
Akshata Jahagirdar (6):
drm/xe/xe2: Introduce identity map for compressed pat for vram
drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
drm/xe/migrate: Add kunit to test clear functionality
drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx
drm/xe/migrate: Add kunit to test migration functionality for BMG
drm/xe/xe2: Do not run xe_bo_test for xe2+ dgfx
drivers/gpu/drm/xe/tests/xe_bo.c | 6 +
drivers/gpu/drm/xe/tests/xe_migrate.c | 388 +++++++++++++++++++++
drivers/gpu/drm/xe/tests/xe_migrate_test.c | 1 +
drivers/gpu/drm/xe/tests/xe_migrate_test.h | 1 +
drivers/gpu/drm/xe/xe_device.h | 5 +
drivers/gpu/drm/xe/xe_migrate.c | 66 +++-
6 files changed, 449 insertions(+), 18 deletions(-)
--
2.34.1
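The layout described in the cover letter (a second identity map of VRAM placed where the first one ends, using the compressed pat_index) can be pictured with a small address calculation. This is only a sketch: the struct, its field names, and vram_identity_addr() are hypothetical and do not appear in the series, and the real driver works with page-table entries rather than plain offsets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout: the uncompressed identity map of VRAM starts at
 * identity_base in the migration VM, and the compressed-PAT identity map
 * starts right where the first one ends. None of these names come from
 * the series itself.
 */
struct vram_identity_layout {
	uint64_t identity_base; /* start of the uncompressed identity map */
	uint64_t vram_size;     /* bytes of VRAM covered by each map */
};

/*
 * Return the migration-VM address that aliases vram_offset through the
 * chosen mapping (uncompressed or compressed pat_index).
 */
static inline uint64_t vram_identity_addr(const struct vram_identity_layout *l,
					  uint64_t vram_offset, bool compressed)
{
	assert(vram_offset < l->vram_size);
	return l->identity_base + (compressed ? l->vram_size : 0) + vram_offset;
}
```

Both addresses refer to the same VRAM byte; only the PAT attribute used for the access differs, which is what lets the copy engine decompress or leave data untouched depending on the mapping picked.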
* [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
2024-07-10 7:53 [PATCH 0/6] Implement compression support on BMG Akshata Jahagirdar
2024-07-09 10:33 ` Matthew Auld
@ 2024-07-10 7:53 ` Akshata Jahagirdar
2024-07-10 8:01 ` Nirmoy Das
` (6 more replies)
2024-07-10 8:17 ` [PATCH 0/6] Implement compression support on BMG Akshata Jahagirdar
` (5 subsequent siblings)
7 siblings, 7 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-10 7:53 UTC (permalink / raw)
To: intel-xe; +Cc: lucas.demarchi, lucas.demarchi, Akshata Jahagirdar
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index which then indirectly updates the
compression status as uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 5 +++++
drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 0a2a3e7fd402..c3093506c28c 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
return xe->info.has_flat_ccs;
}
+static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
static inline bool xe_device_has_sriov(struct xe_device *xe)
{
return xe->info.has_sriov;
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..2fc2cf375b1e 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_device_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1
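As a worked example of the min_chunk_size expression touched in the xe_migrate_init() hunk, the arithmetic can be reproduced in a few lines. This is a sketch under assumptions: ccs_bytes() is a stand-in for xe_device_ccs_bytes(), and the main-memory-to-CCS ratio is passed in by the caller as an assumed value rather than taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K  0x1000ULL
#define SZ_64K 0x10000ULL

/*
 * Stand-in for xe_device_ccs_bytes(): CCS metadata bytes needed for
 * `size` bytes of main memory, rounded up. bytes_per_ccs_byte is an
 * assumed platform ratio supplied by the caller.
 */
static uint64_t ccs_bytes(uint64_t size, uint64_t bytes_per_ccs_byte)
{
	assert(bytes_per_ccs_byte != 0);
	return (size + bytes_per_ccs_byte - 1) / bytes_per_ccs_byte;
}

/* Same expression as in the hunk: the chunk that maps to 4K of CCS metadata. */
static uint64_t min_chunk_size(uint64_t bytes_per_ccs_byte)
{
	return SZ_4K * SZ_64K / ccs_bytes(SZ_64K, bytes_per_ccs_byte);
}
```

With an assumed ratio of 256 main-memory bytes per CCS byte, 64K of memory needs 256 bytes of CCS, so min_chunk_size(256) is 4K * 64K / 256 = 1 MiB. When the new predicate returns false, this assignment is skipped and the raw-CCS chunk constraint does not apply.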
* Re: [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
2024-07-10 7:53 ` [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx Akshata Jahagirdar
@ 2024-07-10 8:01 ` Nirmoy Das
2024-07-10 8:17 ` Akshata Jahagirdar
` (5 subsequent siblings)
6 siblings, 0 replies; 25+ messages in thread
From: Nirmoy Das @ 2024-07-10 8:01 UTC (permalink / raw)
To: Akshata Jahagirdar, intel-xe; +Cc: lucas.demarchi, lucas.demarchi, Matthew Auld
Hi Akshata,
I have two patches (2 and 3) in
https://patchwork.freedesktop.org/series/135743/ that deal with this,
which I sent to handle bo clear for igfx. Those two patches are
independent of my series, and you can incorporate them.
https://patchwork.freedesktop.org/patch/602287/?series=135743&rev=1
https://patchwork.freedesktop.org/patch/602288/?series=135743&rev=1
Regards,
Nirmoy
On 7/10/2024 10:17 AM, Akshata Jahagirdar wrote:
> For Xe2 dGPU, we clear the bo by modifying the VRAM using an
> uncompressed pat index which then indirectly updates the
> compression status as uncompressed, i.e. zeroed CCS.
> So xe_migrate_clear() should be updated for BMG to not
> emit CCS surf copy commands.
>
> Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
> ---
> drivers/gpu/drm/xe/xe_device.h | 5 +++++
> drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
> 2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 0a2a3e7fd402..c3093506c28c 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
> return xe->info.has_flat_ccs;
> }
>
> +static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
> +{
> + return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
> +}
> +
> static inline bool xe_device_has_sriov(struct xe_device *xe)
> {
> return xe->info.has_sriov;
> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> index fa23a7e7ec43..2fc2cf375b1e 100644
> --- a/drivers/gpu/drm/xe/xe_migrate.c
> +++ b/drivers/gpu/drm/xe/xe_migrate.c
> @@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
> return ERR_PTR(err);
>
> if (IS_DGFX(xe)) {
> - if (xe_device_has_flat_ccs(xe))
> + if (xe_device_needs_ccs_emit(xe))
> /* min chunk size corresponds to 4K of CCS Metadata */
> m->min_chunk_size = SZ_4K * SZ_64K /
> xe_device_ccs_bytes(xe, SZ_64K);
> @@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
> clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
> avail_pts);
>
> - if (xe_device_has_flat_ccs(xe))
> + if (xe_device_needs_ccs_emit(xe))
> batch_size += EMIT_COPY_CCS_DW;
>
> /* Clear commands */
> @@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
> if (!clear_system_ccs)
> emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
>
> - if (xe_device_has_flat_ccs(xe)) {
> + if (xe_device_needs_ccs_emit(xe)) {
> emit_copy_ccs(gt, bb, clear_L0_ofs, true,
> m->cleared_mem_ofs, false, clear_L0);
> flush_flags = MI_FLUSH_DW_CCS;
* [PATCH 0/6] Implement compression support on BMG
2024-07-10 7:53 [PATCH 0/6] Implement compression support on BMG Akshata Jahagirdar
2024-07-09 10:33 ` Matthew Auld
2024-07-10 7:53 ` [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx Akshata Jahagirdar
@ 2024-07-10 8:17 ` Akshata Jahagirdar
2024-07-11 5:54 ` Akshata Jahagirdar
` (4 subsequent siblings)
7 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-10 8:17 UTC (permalink / raw)
To: intel-xe; +Cc: lucas.demarchi, lucas.demarchi, Akshata Jahagirdar
According to the SAS for BMG compression, we need to decompress during eviction,
and not recompress on restore. Due to this, we need to introduce encoding pat_index
in case of vram too. This patch explores the solution of setting up an additional
identity map for the vram, this time at the end of previous mapping offset and
with compressed pat_index.
We then select the appropriate mapping during eviction/restore/clear.
Akshata Jahagirdar (6):
drm/xe/xe2: Introduce identity map for compressed pat for vram
drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
drm/xe/migrate: Add kunit to test clear functionality
drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx
drm/xe/migrate: Add kunit to test migration functionality for BMG
drm/xe/xe2: Do not run xe_bo_test for xe2+ dgfx
drivers/gpu/drm/xe/tests/xe_bo.c | 6 +
drivers/gpu/drm/xe/tests/xe_migrate.c | 388 +++++++++++++++++++++
drivers/gpu/drm/xe/tests/xe_migrate_test.c | 1 +
drivers/gpu/drm/xe/tests/xe_migrate_test.h | 1 +
drivers/gpu/drm/xe/xe_device.h | 5 +
drivers/gpu/drm/xe/xe_migrate.c | 66 +++-
6 files changed, 449 insertions(+), 18 deletions(-)
--
2.34.1
* [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
2024-07-10 7:53 ` [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx Akshata Jahagirdar
2024-07-10 8:01 ` Nirmoy Das
@ 2024-07-10 8:17 ` Akshata Jahagirdar
2024-07-11 11:27 ` Akshata Jahagirdar
` (4 subsequent siblings)
6 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-10 8:17 UTC (permalink / raw)
To: intel-xe; +Cc: lucas.demarchi, lucas.demarchi, Akshata Jahagirdar
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index which then indirectly updates the
compression status as uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 5 +++++
drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 0a2a3e7fd402..c3093506c28c 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
return xe->info.has_flat_ccs;
}
+static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
static inline bool xe_device_has_sriov(struct xe_device *xe)
{
return xe->info.has_sriov;
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..2fc2cf375b1e 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_device_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1
* [PATCH 0/6] Implement compression support on BMG
2024-07-10 7:53 [PATCH 0/6] Implement compression support on BMG Akshata Jahagirdar
` (2 preceding siblings ...)
2024-07-10 8:17 ` [PATCH 0/6] Implement compression support on BMG Akshata Jahagirdar
@ 2024-07-11 5:54 ` Akshata Jahagirdar
2024-07-11 11:27 ` Akshata Jahagirdar
` (3 subsequent siblings)
7 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-11 5:54 UTC (permalink / raw)
To: intel-xe; +Cc: akshatajahagirdar6, Akshata Jahagirdar
On Xe2 the compression has moved to a unified universal model
(exactly one compression mode/format), where compression is now
controlled via PAT on per-page basis. This now means KMD can
decompress freely. This was problematic on DG2 since we had
multiple compression formats, and the compression format used
on a particular buffer was unknown to the KMD, so instead the
raw CCS state needed to be copied around when evicting VRAM.
In addition mixed VRAM and system memory buffers were not
supported with compression enabled.
On Xe2 dGPU, compression is still only supported with VRAM;
however, we can now support compression with mixed VRAM and system
memory buffers, with GPU access being seamless underneath,
so long as the KMD does the VRAM -> sysmem move
using compressed -> uncompressed, to decompress the data.
CPU access to such buffers is also possible, under the premise
that userspace first decompresses the corresponding pages being
accessed. If the pages are already in system memory then KMD would
have already decompressed them. When restoring such buffers with
sysmem -> VRAM the KMD can't easily know which pages were originally
compressed, so we always use uncompressed -> uncompressed here.
With this it also means we can drop all the raw CCS handling
on such platforms (including needing to allocate extra CCS storage).
In order to support this we now need to have two different identity
mappings for compressed and uncompressed VRAM.
The additional identity map is the VRAM with compressed pat_index.
We then select the appropriate mapping during migration/clear.
Akshata Jahagirdar (6):
drm/xe/xe2: Introduce identity map for compressed pat for vram
drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
drm/xe/migrate: Add kunit to test clear functionality
drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx
drm/xe/migrate: Add kunit to test migration functionality for BMG
drm/xe/xe2: Do not run xe_bo_test for xe2+ dgfx
drivers/gpu/drm/xe/tests/xe_bo.c | 6 +
drivers/gpu/drm/xe/tests/xe_migrate.c | 388 +++++++++++++++++++++
drivers/gpu/drm/xe/tests/xe_migrate_test.c | 1 +
drivers/gpu/drm/xe/tests/xe_migrate_test.h | 1 +
drivers/gpu/drm/xe/xe_device.h | 5 +
drivers/gpu/drm/xe/xe_migrate.c | 66 +++-
6 files changed, 449 insertions(+), 18 deletions(-)
--
2.34.1
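The direction rules in this cover letter (VRAM -> sysmem goes compressed -> uncompressed, sysmem -> VRAM goes uncompressed -> uncompressed) can be summarised as a small selection function. This is a sketch, not the driver's code: the enum, the struct, and pick_copy_mappings() are hypothetical names, and cases the cover letter does not describe (for example VRAM -> VRAM) fall through to uncompressed here purely as a placeholder.

```c
#include <assert.h>
#include <stdbool.h>

enum mem_kind { MEM_SYSMEM, MEM_VRAM };

/* Which identity mapping each side of a copy should go through. */
struct copy_mappings {
	bool src_compressed; /* read source via the compressed-PAT mapping */
	bool dst_compressed; /* write destination via the compressed-PAT mapping */
};

static struct copy_mappings pick_copy_mappings(enum mem_kind src,
					       enum mem_kind dst)
{
	struct copy_mappings m = { false, false };

	assert(src == MEM_SYSMEM || src == MEM_VRAM);
	assert(dst == MEM_SYSMEM || dst == MEM_VRAM);

	/*
	 * Eviction: reading VRAM through the compressed mapping and writing
	 * system memory uncompressed makes the copy itself decompress.
	 */
	if (src == MEM_VRAM && dst == MEM_SYSMEM)
		m.src_compressed = true;

	/*
	 * Restore (sysmem -> VRAM): the KMD cannot easily tell which pages
	 * were compressed before eviction, so both sides stay uncompressed.
	 */
	return m;
}
```

Keeping the decision in one place makes it easy to see that no path ever writes through the compressed mapping in this model, which matches the "do not recompress on restore" rule.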
* [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
[not found] <cover.1720677099.git.akshata.jahagirdar@intel.com>
@ 2024-07-11 5:55 ` Akshata Jahagirdar
0 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-11 5:55 UTC (permalink / raw)
To: intel-xe; +Cc: akshatajahagirdar6, Akshata Jahagirdar
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index which then indirectly updates the
compression status as uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 5 +++++
drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 0a2a3e7fd402..c3093506c28c 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
return xe->info.has_flat_ccs;
}
+static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
static inline bool xe_device_has_sriov(struct xe_device *xe)
{
return xe->info.has_sriov;
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..2fc2cf375b1e 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_device_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1
* [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
[not found] <cover.1720689220.git.akshata.jahagirdar@intel.com>
@ 2024-07-11 9:18 ` Akshata Jahagirdar
2024-07-11 9:19 ` Akshata Jahagirdar
2024-07-11 12:09 ` Matthew Auld
0 siblings, 2 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-11 9:18 UTC (permalink / raw)
To: intel-xe
Cc: matthew.d.roper, matthew.auld, himal.prasad.ghimiray,
lucas.demarchi, Akshata Jahagirdar
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index which then indirectly updates the
compression status as uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 5 +++++
drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 0a2a3e7fd402..c3093506c28c 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
return xe->info.has_flat_ccs;
}
+static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
static inline bool xe_device_has_sriov(struct xe_device *xe)
{
return xe->info.has_sriov;
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..2fc2cf375b1e 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_device_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1
* [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
2024-07-11 9:18 ` Akshata Jahagirdar
@ 2024-07-11 9:19 ` Akshata Jahagirdar
2024-07-11 12:09 ` Matthew Auld
1 sibling, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-11 9:19 UTC (permalink / raw)
To: intel-xe; +Cc: akshatajahagirdar6, Akshata Jahagirdar
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index which then indirectly updates the
compression status as uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 5 +++++
drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 0a2a3e7fd402..c3093506c28c 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
return xe->info.has_flat_ccs;
}
+static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
static inline bool xe_device_has_sriov(struct xe_device *xe)
{
return xe->info.has_sriov;
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..2fc2cf375b1e 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_device_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1
* [PATCH 0/6] Implement compression support on BMG
2024-07-10 7:53 [PATCH 0/6] Implement compression support on BMG Akshata Jahagirdar
` (3 preceding siblings ...)
2024-07-11 5:54 ` Akshata Jahagirdar
@ 2024-07-11 11:27 ` Akshata Jahagirdar
2024-07-11 12:34 ` Ghimiray, Himal Prasad
` (2 subsequent siblings)
7 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-11 11:27 UTC (permalink / raw)
To: intel-xe; +Cc: lucas.demarchi, lucas.demarchi, Akshata Jahagirdar
According to the SAS for BMG compression, we need to decompress during eviction,
and not recompress on restore. Due to this, we need to introduce encoding pat_index
in case of vram too. This patch explores the solution of setting up an additional
identity map for the vram, this time at the end of previous mapping offset and
with compressed pat_index.
We then select the appropriate mapping during eviction/restore/clear.
Akshata Jahagirdar (6):
drm/xe/xe2: Introduce identity map for compressed pat for vram
drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
drm/xe/migrate: Add kunit to test clear functionality
drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx
drm/xe/migrate: Add kunit to test migration functionality for BMG
drm/xe/xe2: Do not run xe_bo_test for xe2+ dgfx
drivers/gpu/drm/xe/tests/xe_bo.c | 6 +
drivers/gpu/drm/xe/tests/xe_migrate.c | 388 +++++++++++++++++++++
drivers/gpu/drm/xe/tests/xe_migrate_test.c | 1 +
drivers/gpu/drm/xe/tests/xe_migrate_test.h | 1 +
drivers/gpu/drm/xe/xe_device.h | 5 +
drivers/gpu/drm/xe/xe_migrate.c | 66 +++-
6 files changed, 449 insertions(+), 18 deletions(-)
--
2.34.1
* [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
2024-07-10 7:53 ` [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx Akshata Jahagirdar
2024-07-10 8:01 ` Nirmoy Das
2024-07-10 8:17 ` Akshata Jahagirdar
@ 2024-07-11 11:27 ` Akshata Jahagirdar
2024-07-11 12:42 ` Akshata Jahagirdar
` (3 subsequent siblings)
6 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-11 11:27 UTC (permalink / raw)
To: intel-xe; +Cc: lucas.demarchi, lucas.demarchi, Akshata Jahagirdar
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index which then indirectly updates the
compression status as uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 5 +++++
drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 0a2a3e7fd402..c3093506c28c 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
return xe->info.has_flat_ccs;
}
+static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
static inline bool xe_device_has_sriov(struct xe_device *xe)
{
return xe->info.has_sriov;
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..2fc2cf375b1e 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_device_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1
* Re: [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
2024-07-11 9:18 ` Akshata Jahagirdar
2024-07-11 9:19 ` Akshata Jahagirdar
@ 2024-07-11 12:09 ` Matthew Auld
2024-07-12 4:09 ` Jahagirdar, Akshata
1 sibling, 1 reply; 25+ messages in thread
From: Matthew Auld @ 2024-07-11 12:09 UTC (permalink / raw)
To: Akshata Jahagirdar, intel-xe
Cc: matthew.d.roper, himal.prasad.ghimiray, lucas.demarchi
On 11/07/2024 10:18, Akshata Jahagirdar wrote:
> For Xe2 dGPU, we clear the bo by modifying the VRAM using an
> uncompressed pat index which then indirectly updates the
> compression status as uncompressed, i.e. zeroed CCS.
> So xe_migrate_clear() should be updated for BMG to not
> emit CCS surf copy commands.
>
> Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
> ---
> drivers/gpu/drm/xe/xe_device.h | 5 +++++
> drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
> 2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 0a2a3e7fd402..c3093506c28c 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
> return xe->info.has_flat_ccs;
> }
>
> +static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
> +{
> + return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
> +}
> +
This should in theory be hyper-specific to the migration code
implementation. I think it is best to keep it in xe_migrate.c instead of
exporting it (if possible).
With that,
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
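For illustration only, and not part of the posted patch: a minimal
user-space sketch of the check under discussion, written as a file-local
helper. The struct and the mock_* names are hypothetical stand-ins for the
few struct xe_device fields the real helper reads; only the boolean
expression is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the device fields the check depends on. */
struct mock_xe_device {
	bool has_flat_ccs;	/* models xe->info.has_flat_ccs */
	bool is_dgfx;		/* models IS_DGFX(xe) */
	int graphics_ver;	/* models GRAPHICS_VER(xe) */
};

/*
 * Same expression as in the patch: CCS copy commands are emitted only when
 * the device has flat CCS and is not an Xe2+ (graphics version >= 20)
 * discrete part.
 */
static bool mock_migrate_needs_ccs_emit(const struct mock_xe_device *xe)
{
	return xe->has_flat_ccs && !(xe->graphics_ver >= 20 && xe->is_dgfx);
}
```

Under these assumptions the helper returns false only for the Xe2+ discrete
case among flat-CCS devices; an integrated Xe2 device or an older discrete
device with flat CCS still takes the CCS copy path.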
> static inline bool xe_device_has_sriov(struct xe_device *xe)
> {
> return xe->info.has_sriov;
> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> index fa23a7e7ec43..2fc2cf375b1e 100644
> --- a/drivers/gpu/drm/xe/xe_migrate.c
> +++ b/drivers/gpu/drm/xe/xe_migrate.c
> @@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
> return ERR_PTR(err);
>
> if (IS_DGFX(xe)) {
> - if (xe_device_has_flat_ccs(xe))
> + if (xe_device_needs_ccs_emit(xe))
> /* min chunk size corresponds to 4K of CCS Metadata */
> m->min_chunk_size = SZ_4K * SZ_64K /
> xe_device_ccs_bytes(xe, SZ_64K);
> @@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
> clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
> avail_pts);
>
> - if (xe_device_has_flat_ccs(xe))
> + if (xe_device_needs_ccs_emit(xe))
> batch_size += EMIT_COPY_CCS_DW;
>
> /* Clear commands */
> @@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
> if (!clear_system_ccs)
> emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
>
> - if (xe_device_has_flat_ccs(xe)) {
> + if (xe_device_needs_ccs_emit(xe)) {
> emit_copy_ccs(gt, bb, clear_L0_ofs, true,
> m->cleared_mem_ofs, false, clear_L0);
> flush_flags = MI_FLUSH_DW_CCS;
* Re: [PATCH 0/6] Implement compression support on BMG
2024-07-10 7:53 [PATCH 0/6] Implement compression support on BMG Akshata Jahagirdar
` (4 preceding siblings ...)
2024-07-11 11:27 ` Akshata Jahagirdar
@ 2024-07-11 12:34 ` Ghimiray, Himal Prasad
2024-07-11 12:42 ` Akshata Jahagirdar
2024-07-11 13:07 ` Akshata Jahagirdar
7 siblings, 0 replies; 25+ messages in thread
From: Ghimiray, Himal Prasad @ 2024-07-11 12:34 UTC (permalink / raw)
To: Akshata Jahagirdar, intel-xe; +Cc: akshatajahagirdar6
On 11-07-2024 11:24, Akshata Jahagirdar wrote:
> On Xe2 the compression has moved to a unified universal model
> (exactly one compression mode/format), where compression is now
> controlled via PAT on per-page basis. This now means KMD can
> decompress freely. This was problematic on DG2 since we had
> multiple compression formats, and the compression format used
> on a particular buffer was unknown to the KMD, so instead the
> raw CCS state needed to be copied around when evicting VRAM.
> In addition mixed VRAM and system memory buffers were not
> supported with compression enabled.
>
> On Xe2 dGPU compression is still only supported with VRAM,
> however we can now support compression with VRAM and system
> memory buffers, with GPU access being seamless underneath,
> so long as when doing VRAM -> sysmem the KMD does the move
> using compressed -> uncompressed, to decompress it.
> CPU access to such buffers is also possible, under the premise
> that userspace first decompresses the corresponding pages being
> accessed. If the pages are already in system memory then KMD would
> have already decompressed them. When restoring such buffers with
> sysmem -> VRAM the KMD can't easily know which pages were originally
> compressed, so we always use uncompressed -> uncompressed here.
> With this it also means we can drop all the raw CCS handling
> on such platforms (including needing to allocate extra CCS storage).
>
> In order to support this we now need to have two different identity
> mappings for compressed and uncompressed VRAM.
> The additional identity map is the VRAM with compressed pat_index.
> We then select the appropriate mapping during migration/clear.
>
I have gone through all the patches. Please address the indentation and
checkpatch errors.
With the above addressed, all patches LGTM.
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> Akshata Jahagirdar (6):
> drm/xe/xe2: Introduce identity map for compressed pat for vram
> drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
> drm/xe/migrate: Add kunit to test clear functionality
> drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx
> drm/xe/migrate: Add kunit to test migration functionality for BMG
> drm/xe/xe2: Do not run xe_bo_test for xe2+ dgfx
>
> drivers/gpu/drm/xe/tests/xe_bo.c | 6 +
> drivers/gpu/drm/xe/tests/xe_migrate.c | 388 +++++++++++++++++++++
> drivers/gpu/drm/xe/tests/xe_migrate_test.c | 1 +
> drivers/gpu/drm/xe/tests/xe_migrate_test.h | 1 +
> drivers/gpu/drm/xe/xe_device.h | 5 +
> drivers/gpu/drm/xe/xe_migrate.c | 66 +++-
> 6 files changed, 449 insertions(+), 18 deletions(-)
>
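For illustration only: the mapping selection described in the quoted cover
letter can be sketched as below. Everything here is an assumption made for
the sketch, not the series' actual code: the mock_* names, the enum, and
the layout in which the compressed identity map starts exactly one VRAM
size after the uncompressed one ("at the end of previous mapping offset").
The real driver may align or place the second map differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical operation kinds, named after the cover letter's wording. */
enum mock_op {
	MOCK_OP_EVICT,		/* VRAM -> sysmem: compressed -> uncompressed */
	MOCK_OP_RESTORE,	/* sysmem -> VRAM: uncompressed -> uncompressed */
	MOCK_OP_CLEAR,		/* clear through the uncompressed pat_index */
};

/* Only eviction reads VRAM through the compressed mapping (to decompress). */
static bool mock_vram_uses_compressed_map(enum mock_op op)
{
	return op == MOCK_OP_EVICT;
}

/*
 * Translate a VRAM address into an offset inside one of the two identity
 * maps. Assumed layout: [identity_base, identity_base + vram_size) is the
 * uncompressed map, and the compressed map follows immediately after it.
 */
static uint64_t mock_vram_ofs(uint64_t identity_base, uint64_t vram_size,
			      uint64_t vram_addr, enum mock_op op)
{
	uint64_t base = identity_base;

	if (mock_vram_uses_compressed_map(op))
		base += vram_size;

	return base + vram_addr;
}
```

The point of the sketch is only that the same VRAM address resolves to two
different offsets depending on the operation, so the copy engine sees the
PAT encoding that matches the intended compressed or uncompressed access.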
* [PATCH 0/6] Implement compression support on BMG
2024-07-10 7:53 [PATCH 0/6] Implement compression support on BMG Akshata Jahagirdar
` (5 preceding siblings ...)
2024-07-11 12:34 ` Ghimiray, Himal Prasad
@ 2024-07-11 12:42 ` Akshata Jahagirdar
2024-07-11 13:07 ` Akshata Jahagirdar
7 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-11 12:42 UTC (permalink / raw)
To: intel-xe; +Cc: akshatajahagirdar6, Akshata Jahagirdar
According to the SAS for BMG compression, we need to decompress during eviction
and must not recompress on restore. Because of this, VRAM mappings also need to
encode a pat_index. This series sets up an additional identity map for VRAM,
placed at the end of the previous mapping's offset and using the compressed
pat_index.
We then select the appropriate mapping during eviction/restore/clear.
Akshata Jahagirdar (6):
drm/xe/xe2: Introduce identity map for compressed pat for vram
drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
drm/xe/migrate: Add kunit to test clear functionality
drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx
drm/xe/migrate: Add kunit to test migration functionality for BMG
drm/xe/xe2: Do not run xe_bo_test for xe2+ dgfx
drivers/gpu/drm/xe/tests/xe_bo.c | 6 +
drivers/gpu/drm/xe/tests/xe_migrate.c | 388 +++++++++++++++++++++
drivers/gpu/drm/xe/tests/xe_migrate_test.c | 1 +
drivers/gpu/drm/xe/tests/xe_migrate_test.h | 1 +
drivers/gpu/drm/xe/xe_device.h | 5 +
drivers/gpu/drm/xe/xe_migrate.c | 66 +++-
6 files changed, 449 insertions(+), 18 deletions(-)
--
2.34.1
* [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
2024-07-10 7:53 ` [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx Akshata Jahagirdar
` (2 preceding siblings ...)
2024-07-11 11:27 ` Akshata Jahagirdar
@ 2024-07-11 12:42 ` Akshata Jahagirdar
2024-07-11 13:07 ` Akshata Jahagirdar
` (2 subsequent siblings)
6 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-11 12:42 UTC (permalink / raw)
To: intel-xe; +Cc: akshatajahagirdar6, Akshata Jahagirdar
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index, which then indirectly updates the
compression status to uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 5 +++++
drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 0a2a3e7fd402..c3093506c28c 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
return xe->info.has_flat_ccs;
}
+static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
static inline bool xe_device_has_sriov(struct xe_device *xe)
{
return xe->info.has_sriov;
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..2fc2cf375b1e 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_device_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1
* [PATCH 0/6] Implement compression support on BMG
2024-07-10 7:53 [PATCH 0/6] Implement compression support on BMG Akshata Jahagirdar
` (6 preceding siblings ...)
2024-07-11 12:42 ` Akshata Jahagirdar
@ 2024-07-11 13:07 ` Akshata Jahagirdar
7 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-11 13:07 UTC (permalink / raw)
To: intel-xe; +Cc: akshatajahagirdar6, Akshata Jahagirdar
According to the SAS for BMG compression, we need to decompress during eviction
and must not recompress on restore. Because of this, VRAM mappings also need to
encode a pat_index. This series sets up an additional identity map for VRAM,
placed at the end of the previous mapping's offset and using the compressed
pat_index.
We then select the appropriate mapping during eviction/restore/clear.
Akshata Jahagirdar (6):
drm/xe/xe2: Introduce identity map for compressed pat for vram
drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
drm/xe/migrate: Add kunit to test clear functionality
drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx
drm/xe/migrate: Add kunit to test migration functionality for BMG
drm/xe/xe2: Do not run xe_bo_test for xe2+ dgfx
drivers/gpu/drm/xe/tests/xe_bo.c | 6 +
drivers/gpu/drm/xe/tests/xe_migrate.c | 388 +++++++++++++++++++++
drivers/gpu/drm/xe/tests/xe_migrate_test.c | 1 +
drivers/gpu/drm/xe/tests/xe_migrate_test.h | 1 +
drivers/gpu/drm/xe/xe_device.h | 5 +
drivers/gpu/drm/xe/xe_migrate.c | 66 +++-
6 files changed, 449 insertions(+), 18 deletions(-)
--
2.34.1
* [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
2024-07-10 7:53 ` [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx Akshata Jahagirdar
` (3 preceding siblings ...)
2024-07-11 12:42 ` Akshata Jahagirdar
@ 2024-07-11 13:07 ` Akshata Jahagirdar
2024-07-12 11:52 ` [PATCH 1/6] drm/xe/migrate: Sample patch for testing Akshata Jahagirdar
2024-07-12 11:53 ` Akshata Jahagirdar
6 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-11 13:07 UTC (permalink / raw)
To: intel-xe; +Cc: akshatajahagirdar6, Akshata Jahagirdar
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index, which then indirectly updates the
compression status to uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 5 +++++
drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 0a2a3e7fd402..c3093506c28c 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
return xe->info.has_flat_ccs;
}
+static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
static inline bool xe_device_has_sriov(struct xe_device *xe)
{
return xe->info.has_sriov;
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..2fc2cf375b1e 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_device_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1
* [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
2024-07-11 9:18 [PATCH 0/6] Implement compression support on BMG Akshata Jahagirdar
@ 2024-07-12 3:11 ` Akshata Jahagirdar
0 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-12 3:11 UTC (permalink / raw)
To: intel-xe; +Cc: san0582, Jahagirdar, Akshata
From: "Jahagirdar, Akshata" <akshata.jahagirdar@intel.com>
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index, which then indirectly updates the
compression status to uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 5 +++++
drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 0a2a3e7fd402..c3093506c28c 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
return xe->info.has_flat_ccs;
}
+static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
static inline bool xe_device_has_sriov(struct xe_device *xe)
{
return xe->info.has_sriov;
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..2fc2cf375b1e 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_device_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1
* Re: [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
2024-07-11 12:09 ` Matthew Auld
@ 2024-07-12 4:09 ` Jahagirdar, Akshata
0 siblings, 0 replies; 25+ messages in thread
From: Jahagirdar, Akshata @ 2024-07-12 4:09 UTC (permalink / raw)
To: Matthew Auld, intel-xe
Cc: matthew.d.roper, himal.prasad.ghimiray, lucas.demarchi
On 7/11/2024 5:09 AM, Matthew Auld wrote:
> On 11/07/2024 10:18, Akshata Jahagirdar wrote:
>> For Xe2 dGPU, we clear the bo by modifying the VRAM using an
>> uncompressed pat index, which then indirectly updates the
>> compression status to uncompressed, i.e. zeroed CCS.
>> So xe_migrate_clear() should be updated for BMG to not
>> emit CCS surf copy commands.
>>
>> Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_device.h | 5 +++++
>> drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
>> 2 files changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_device.h
>> b/drivers/gpu/drm/xe/xe_device.h
>> index 0a2a3e7fd402..c3093506c28c 100644
>> --- a/drivers/gpu/drm/xe/xe_device.h
>> +++ b/drivers/gpu/drm/xe/xe_device.h
>> @@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct
>> xe_device *xe)
>> return xe->info.has_flat_ccs;
>> }
>> +static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
>> +{
>> + return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 &&
>> IS_DGFX(xe));
>> +}
>> +
>
> This should in theory be hyper specific to the migration code
> implementation. I think best keep in xe_migrate.c, instead of
> exporting (if possible).
>
> With that,
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>
Thank you for your review.
Should I move this change in internal as well?
-Akshata
>
>> static inline bool xe_device_has_sriov(struct xe_device *xe)
>> {
>> return xe->info.has_sriov;
>> diff --git a/drivers/gpu/drm/xe/xe_migrate.c
>> b/drivers/gpu/drm/xe/xe_migrate.c
>> index fa23a7e7ec43..2fc2cf375b1e 100644
>> --- a/drivers/gpu/drm/xe/xe_migrate.c
>> +++ b/drivers/gpu/drm/xe/xe_migrate.c
>> @@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile
>> *tile)
>> return ERR_PTR(err);
>> if (IS_DGFX(xe)) {
>> - if (xe_device_has_flat_ccs(xe))
>> + if (xe_device_needs_ccs_emit(xe))
>> /* min chunk size corresponds to 4K of CCS Metadata */
>> m->min_chunk_size = SZ_4K * SZ_64K /
>> xe_device_ccs_bytes(xe, SZ_64K);
>> @@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct
>> xe_migrate *m,
>> clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
>> avail_pts);
>> - if (xe_device_has_flat_ccs(xe))
>> + if (xe_device_needs_ccs_emit(xe))
>> batch_size += EMIT_COPY_CCS_DW;
>> /* Clear commands */
>> @@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct
>> xe_migrate *m,
>> if (!clear_system_ccs)
>> emit_clear(gt, bb, clear_L0_ofs, clear_L0,
>> XE_PAGE_SIZE, clear_vram);
>> - if (xe_device_has_flat_ccs(xe)) {
>> + if (xe_device_needs_ccs_emit(xe)) {
>> emit_copy_ccs(gt, bb, clear_L0_ofs, true,
>> m->cleared_mem_ofs, false, clear_L0);
>> flush_flags = MI_FLUSH_DW_CCS;
* [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
2024-07-12 6:39 [PATCH 0/6] Implement compression support on BMG Akshata Jahagirdar
@ 2024-07-12 6:39 ` Akshata Jahagirdar
0 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-12 6:39 UTC (permalink / raw)
To: intel-xe
Cc: akshatajahagirdar6, Jahagirdar, Akshata, Matthew Auld,
Himal Prasad Ghimiray
From: "Jahagirdar, Akshata" <akshata.jahagirdar@intel.com>
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index, which then indirectly updates the
compression status to uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
---
drivers/gpu/drm/xe/xe_migrate.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..85eec95c9bc2 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -347,6 +347,11 @@ static u32 xe_migrate_usm_logical_mask(struct xe_gt *gt)
return logical_mask;
}
+static bool xe_migrate_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
/**
* xe_migrate_init() - Initialize a migrate context
* @tile: Back-pointer to the tile we're initializing for.
@@ -420,7 +425,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_migrate_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1039,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_migrate_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1067,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_migrate_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1
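For illustration only: the min_chunk_size expression touched by the diff
above can be worked through numerically. The sketch assumes that the CCS
size of a buffer is its size divided by a fixed main-memory-to-CCS ratio,
rounded up; the ratio is passed in as a parameter because the exact value
per platform is not stated in this thread. All mock_* names are
hypothetical.

```c
#include <stdint.h>

#define MOCK_SZ_4K	0x1000ull
#define MOCK_SZ_64K	0x10000ull

/*
 * Assumed model of xe_device_ccs_bytes(): one CCS byte covers
 * main_bytes_per_ccs_byte bytes of main memory, rounded up.
 */
static uint64_t mock_ccs_bytes(uint64_t size, uint64_t main_bytes_per_ccs_byte)
{
	return (size + main_bytes_per_ccs_byte - 1) / main_bytes_per_ccs_byte;
}

/*
 * Mirrors the expression in xe_migrate_init():
 *   SZ_4K * SZ_64K / xe_device_ccs_bytes(xe, SZ_64K)
 * i.e. the amount of main memory whose CCS metadata is 4K in size.
 */
static uint64_t mock_min_chunk_size(uint64_t main_bytes_per_ccs_byte)
{
	return MOCK_SZ_4K * MOCK_SZ_64K /
	       mock_ccs_bytes(MOCK_SZ_64K, main_bytes_per_ccs_byte);
}
```

With an assumed ratio of 256 main-memory bytes per CCS byte, 64K of memory
has 256 bytes of CCS and the minimum chunk comes out as 1 MiB, which is the
amount of memory described by 4K of CCS metadata. With the patch applied,
this branch is no longer taken on Xe2+ discrete devices.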
* [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
[not found] <cover.1720768378.git.akshata.jahagirdar@intel.com>
@ 2024-07-12 7:24 ` Akshata Jahagirdar
0 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-12 7:24 UTC (permalink / raw)
To: intel-xe
Cc: akshatajahagirdar6, Akshata Jahagirdar, Matthew Auld,
Himal Prasad Ghimiray
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index, which then indirectly updates the
compression status to uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
---
drivers/gpu/drm/xe/xe_migrate.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..85eec95c9bc2 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -347,6 +347,11 @@ static u32 xe_migrate_usm_logical_mask(struct xe_gt *gt)
return logical_mask;
}
+static bool xe_migrate_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
/**
* xe_migrate_init() - Initialize a migrate context
* @tile: Back-pointer to the tile we're initializing for.
@@ -420,7 +425,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_migrate_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1039,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_migrate_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1067,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_migrate_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1
* [PATCH 1/6] drm/xe/migrate: Sample patch for testing
2024-07-10 7:53 ` [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx Akshata Jahagirdar
` (4 preceding siblings ...)
2024-07-11 13:07 ` Akshata Jahagirdar
@ 2024-07-12 11:52 ` Akshata Jahagirdar
2024-07-12 11:53 ` Akshata Jahagirdar
6 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-12 11:52 UTC (permalink / raw)
To: intel-xe; +Cc: akshatajahagirdar6, Akshata Jahagirdar
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index, which then indirectly updates the
compression status to uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 5 +++++
drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 0a2a3e7fd402..c3093506c28c 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
return xe->info.has_flat_ccs;
}
+static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
static inline bool xe_device_has_sriov(struct xe_device *xe)
{
return xe->info.has_sriov;
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..2fc2cf375b1e 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_device_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1
* [PATCH 1/6] drm/xe/migrate: Sample patch for testing
2024-07-10 7:53 ` [PATCH 1/6] drm/xe/migrate: Handle clear ccs logic for xe2 dgfx Akshata Jahagirdar
` (5 preceding siblings ...)
2024-07-12 11:52 ` [PATCH 1/6] drm/xe/migrate: Sample patch for testing Akshata Jahagirdar
@ 2024-07-12 11:53 ` Akshata Jahagirdar
6 siblings, 0 replies; 25+ messages in thread
From: Akshata Jahagirdar @ 2024-07-12 11:53 UTC (permalink / raw)
To: intel-xe; +Cc: lucas.demarchi, Akshata Jahagirdar
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index, which then indirectly updates the
compression status to uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 5 +++++
drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 0a2a3e7fd402..c3093506c28c 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -144,6 +144,11 @@ static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
return xe->info.has_flat_ccs;
}
+static inline bool xe_device_needs_ccs_emit(struct xe_device *xe)
+{
+ return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
static inline bool xe_device_has_sriov(struct xe_device *xe)
{
return xe->info.has_sriov;
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index fa23a7e7ec43..2fc2cf375b1e 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -420,7 +420,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
return ERR_PTR(err);
if (IS_DGFX(xe)) {
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
/* min chunk size corresponds to 4K of CCS Metadata */
m->min_chunk_size = SZ_4K * SZ_64K /
xe_device_ccs_bytes(xe, SZ_64K);
@@ -1034,7 +1034,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
avail_pts);
- if (xe_device_has_flat_ccs(xe))
+ if (xe_device_needs_ccs_emit(xe))
batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */
@@ -1062,7 +1062,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
if (!clear_system_ccs)
emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
- if (xe_device_has_flat_ccs(xe)) {
+ if (xe_device_needs_ccs_emit(xe)) {
emit_copy_ccs(gt, bb, clear_L0_ofs, true,
m->cleared_mem_ofs, false, clear_L0);
flush_flags = MI_FLUSH_DW_CCS;
--
2.34.1