* [Intel-gfx] [PATCH v5 0/9] Update AUX invalidation sequence
@ 2023-07-18 13:38 Andi Shyti
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 1/9] drm-tip: 2023y-07m-17d-16h-04m-53s UTC integration manifest Andi Shyti
` (9 more replies)
0 siblings, 10 replies; 13+ messages in thread
From: Andi Shyti @ 2023-07-18 13:38 UTC (permalink / raw)
To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala,
Nirmoy Das, Andrzej Hajda
Cc: Intel GFX, DRI Devel
Hi,
new hardware directives require a small adaptation of the AUX
invalidation sequence.
In this version we support all the engines affected by this
change.
The stable backport is challenging because the code has changed
considerably since the original patch that this series fixes.
This series keeps growing with code refactorings and with
features added and fixed.
Thanks a lot Nirmoy, Andrzej and Matt for your review and for the
fruitful discussions!
Thanks,
Andi
Changelog:
=========
v4 -> v5
- The AUX CCS is added as a device property instead of checking
against FLAT CCS. This adds the new HAS_AUX_CCS check
(Patch 2, new).
- Small, trivial refactorings here and there.
- Extended the flags{0,1}/bit_group_{0,1} renaming to other
functions.
- Created an intel_emit_pipe_control_cs() wrapper for submitting
the pipe control.
- Quiesce memory for all the engines, not just RCS (Patch 6,
new).
- The PIPE_CONTROL_CCS_FLUSH is added to all the engines.
- Remove redundant EMIT_FLUSH_CCS mode flag.
- Remove unnecessary NOOPs from the command streamer for
invalidating the CCS table.
- Use INVALID_MMIO_REG and gen12_get_aux_inv_reg() instead of
__MMIO(0) and reg.reg.
- Remove useless wrapper and just use gen12_get_aux_inv_reg().
v3 -> v4
- A trivial patch 3 is added to rename the flags with
bit_group_{0,1} to align with the datasheet naming.
- Patch 4 fixes a mistake of mine where the CCS flag was
applied to the wrong bit group.
v2 -> v3
- added r-b from Nirmoy in patch 1 and 4.
- added patch 3 which enables the ccs_flush in the pipe control
for mtl+ compute and render engines.
- added redundant checks in patch 2 for enabling the EMIT_FLUSH
flag.
v1 -> v2
- add a clean up preliminary patch for the existing registers
- add support for more engines
- add the Fixes tag
Andi Shyti (6):
drm/i915/gt: Cleanup aux invalidation registers
drm/i915: Add the has_aux_ccs device property
drm/i915/gt: Rename flags with bit_group_X according to the datasheet
drm/i915/gt: Refactor intel_emit_pipe_control_cs() in a single
function
drm/i915/gt: Ensure memory quiesced before invalidation for all
engines
drm/i915/gt: Support aux invalidation on all engines
Jonathan Cavitt (2):
drm/i915/gt: Ensure memory quiesced before invalidation
drm/i915/gt: Poll aux invalidation register bit on invalidation
Robert Foss (1):
drm-tip: 2023y-07m-17d-16h-04m-53s UTC integration manifest
drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 216 +++++++++++++------
drivers/gpu/drm/i915/gt/gen8_engine_cs.h | 21 +-
drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 2 +
drivers/gpu/drm/i915/gt/intel_gt_regs.h | 16 +-
drivers/gpu/drm/i915/gt/intel_lrc.c | 17 +-
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_pci.c | 5 +-
drivers/gpu/drm/i915/intel_device_info.h | 1 +
integration-manifest | 24 +++
9 files changed, 204 insertions(+), 99 deletions(-)
create mode 100644 integration-manifest
--
2.40.1
* [Intel-gfx] [PATCH v5 1/9] drm-tip: 2023y-07m-17d-16h-04m-53s UTC integration manifest
2023-07-18 13:38 [Intel-gfx] [PATCH v5 0/9] Update AUX invalidation sequence Andi Shyti
@ 2023-07-18 13:38 ` Andi Shyti
2023-07-18 13:44 ` Andi Shyti
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 2/9] drm/i915/gt: Cleanup aux invalidation registers Andi Shyti
` (8 subsequent siblings)
9 siblings, 1 reply; 13+ messages in thread
From: Andi Shyti @ 2023-07-18 13:38 UTC (permalink / raw)
To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala,
Nirmoy Das, Andrzej Hajda
Cc: Intel GFX, DRI Devel
From: Robert Foss <rfoss@kernel.org>
---
integration-manifest | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
create mode 100644 integration-manifest
diff --git a/integration-manifest b/integration-manifest
new file mode 100644
index 0000000000000..8642016b34817
--- /dev/null
+++ b/integration-manifest
@@ -0,0 +1,24 @@
+drm drm-fixes 38d88d5e97c9032ebeca092b9372209f2ca92cdf
+ Merge tag 'amd-drm-fixes-6.5-2023-07-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
+drm-misc drm-misc-fixes 05abb3be91d8788328231ee02973ab3d47f5e3d2
+ dma-buf/dma-resv: Stop leaking on krealloc() failure
+drm-intel drm-intel-fixes fdf0eaf11452d72945af31804e2a1048ee1b574c
+ Linux 6.5-rc2
+drm drm-next 6c7f27441d6af776a89147027c6f4a11c0162c64
+ Merge tag 'drm-misc-next-2023-07-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
+drm-misc drm-misc-next-fixes 59bba51ec2a50e3dc5c3ee80f0a23207346303ff
+ drm/panel: Fine tune Starry-ili9882t panel HFP and HBP
+drm-intel drm-intel-next-fixes f6cf3883df471abbcf1553127681dc244c8ff8dd
+ drm/i915: use mock device info for creating mock device
+drm-misc drm-misc-next 41639b3a8b0f1f194dfe0577d99db70613f78626
+ drm/bridge: anx7625: Use common macros for HDCP capabilities
+drm-intel drm-intel-next c5741c5c1122b7944d9af185c83ab7056153259e
+ drm/i915/display: Do not use stolen on MTL
+drm-intel drm-intel-gt-next 8529e3777b7644d41105a06141574a24795f8348
+ drm/i915/gt: Do not use stolen on MTL
+drm-intel topic/core-for-CI c0ea2fa0491287dea97b384bec1b5a614408b8e3
+ drm/i915/gsc: define gsc fw
+drm-misc topic/i915-ttm 1e3944578b749449bd7fa6bf0bae4c3d3f5f1733
+ Merge tag 'amd-drm-next-5.16-2021-09-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
+drm topic/nouveau-misc dfc4005f8c172eea359f9db08c3b2b0ff0153699
+ drm/nouveau/disp: move DAC load detection method
--
2.40.1
* [Intel-gfx] [PATCH v5 2/9] drm/i915/gt: Cleanup aux invalidation registers
2023-07-18 13:38 [Intel-gfx] [PATCH v5 0/9] Update AUX invalidation sequence Andi Shyti
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 1/9] drm-tip: 2023y-07m-17d-16h-04m-53s UTC integration manifest Andi Shyti
@ 2023-07-18 13:38 ` Andi Shyti
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 3/9] drm/i915: Add the has_aux_ccs device property Andi Shyti
` (7 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Andi Shyti @ 2023-07-18 13:38 UTC (permalink / raw)
To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala,
Nirmoy Das, Andrzej Hajda
Cc: Intel GFX, DRI Devel
Fix the 'NV' suffix of the definitions, which is supposed to be
'INV'.
Take the chance to also order the registers by address and to
rename GEN12_GFX_CCS_AUX_NV to GEN12_CCS_AUX_INV, consistently
with all the other similar registers.
Also remove the VD1, VD3 and VE1 registers, which don't exist,
and add BCS0 and CCS0.
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
---
drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 8 ++++----
drivers/gpu/drm/i915/gt/intel_gt_regs.h | 16 ++++++++--------
drivers/gpu/drm/i915/gt/intel_lrc.c | 6 +++---
3 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 23857cc08eca1..563efee055602 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -287,8 +287,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
if (!HAS_FLAT_CCS(rq->engine->i915)) {
/* hsdes: 1809175790 */
- cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_GFX_CCS_AUX_NV);
+ cs = gen12_emit_aux_table_inv(rq->engine->gt, cs,
+ GEN12_CCS_AUX_INV);
}
*cs++ = preparser_disable(false);
@@ -348,10 +348,10 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
if (aux_inv) { /* hsdes: 1809175790 */
if (rq->engine->class == VIDEO_DECODE_CLASS)
cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_VD0_AUX_NV);
+ cs, GEN12_VD0_AUX_INV);
else
cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_VE0_AUX_NV);
+ cs, GEN12_VE0_AUX_INV);
}
if (mode & EMIT_INVALIDATE)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 718cb2c80f79e..2cdfb2f713d02 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -332,9 +332,11 @@
#define GEN8_PRIVATE_PAT_HI _MMIO(0x40e0 + 4)
#define GEN10_PAT_INDEX(index) _MMIO(0x40e0 + (index) * 4)
#define BSD_HWS_PGA_GEN7 _MMIO(0x4180)
-#define GEN12_GFX_CCS_AUX_NV _MMIO(0x4208)
-#define GEN12_VD0_AUX_NV _MMIO(0x4218)
-#define GEN12_VD1_AUX_NV _MMIO(0x4228)
+
+#define GEN12_CCS_AUX_INV _MMIO(0x4208)
+#define GEN12_VD0_AUX_INV _MMIO(0x4218)
+#define GEN12_VE0_AUX_INV _MMIO(0x4238)
+#define GEN12_BCS0_AUX_INV _MMIO(0x4248)
#define GEN8_RTCR _MMIO(0x4260)
#define GEN8_M1TCR _MMIO(0x4264)
@@ -342,14 +344,12 @@
#define GEN8_BTCR _MMIO(0x426c)
#define GEN8_VTCR _MMIO(0x4270)
-#define GEN12_VD2_AUX_NV _MMIO(0x4298)
-#define GEN12_VD3_AUX_NV _MMIO(0x42a8)
-#define GEN12_VE0_AUX_NV _MMIO(0x4238)
-
#define BLT_HWS_PGA_GEN7 _MMIO(0x4280)
-#define GEN12_VE1_AUX_NV _MMIO(0x42b8)
+#define GEN12_VD2_AUX_INV _MMIO(0x4298)
+#define GEN12_CCS0_AUX_INV _MMIO(0x42c8)
#define AUX_INV REG_BIT(0)
+
#define VEBOX_HWS_PGA_GEN7 _MMIO(0x4380)
#define GEN12_AUX_ERR_DBG _MMIO(0x43f4)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 1b710102390bf..235f3fab60a98 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1374,7 +1374,7 @@ gen12_emit_indirect_ctx_rcs(const struct intel_context *ce, u32 *cs)
/* hsdes: 1809175790 */
if (!HAS_FLAT_CCS(ce->engine->i915))
cs = gen12_emit_aux_table_inv(ce->engine->gt,
- cs, GEN12_GFX_CCS_AUX_NV);
+ cs, GEN12_CCS_AUX_INV);
/* Wa_16014892111 */
if (IS_MTL_GRAPHICS_STEP(ce->engine->i915, M, STEP_A0, STEP_B0) ||
@@ -1403,10 +1403,10 @@ gen12_emit_indirect_ctx_xcs(const struct intel_context *ce, u32 *cs)
if (!HAS_FLAT_CCS(ce->engine->i915)) {
if (ce->engine->class == VIDEO_DECODE_CLASS)
cs = gen12_emit_aux_table_inv(ce->engine->gt,
- cs, GEN12_VD0_AUX_NV);
+ cs, GEN12_VD0_AUX_INV);
else if (ce->engine->class == VIDEO_ENHANCEMENT_CLASS)
cs = gen12_emit_aux_table_inv(ce->engine->gt,
- cs, GEN12_VE0_AUX_NV);
+ cs, GEN12_VE0_AUX_INV);
}
return cs;
--
2.40.1
* [Intel-gfx] [PATCH v5 3/9] drm/i915: Add the has_aux_ccs device property
2023-07-18 13:38 [Intel-gfx] [PATCH v5 0/9] Update AUX invalidation sequence Andi Shyti
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 1/9] drm-tip: 2023y-07m-17d-16h-04m-53s UTC integration manifest Andi Shyti
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 2/9] drm/i915/gt: Cleanup aux invalidation registers Andi Shyti
@ 2023-07-18 13:38 ` Andi Shyti
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 4/9] drm/i915/gt: Ensure memory quiesced before invalidation Andi Shyti
` (6 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Andi Shyti @ 2023-07-18 13:38 UTC (permalink / raw)
To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala,
Nirmoy Das, Andrzej Hajda
Cc: Intel GFX, DRI Devel
We have always assumed that a device has either AUX or FLAT
CCS, but this approximation does not always hold and further
per-device checks are needed.
Add the "has_aux_ccs" flag to the intel_device_info structure so
that each device explicitly indicates whether it uses AUX CCS.
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
---
drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 4 ++--
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_pci.c | 5 ++++-
drivers/gpu/drm/i915/intel_device_info.h | 1 +
4 files changed, 8 insertions(+), 3 deletions(-)
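
Note, not part of the patch: a minimal userspace sketch of why
!HAS_FLAT_CCS() and HAS_AUX_CCS() are not interchangeable. Only
the two flag names come from this patch; the structure, the helper
names and the three example devices are hypothetical and do not
correspond to real platform entries.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for struct intel_device_info: only the two
 * flag names are taken from the patch.
 */
struct sketch_device_info {
	const char *name;
	bool has_flat_ccs;
	bool has_aux_ccs;
};

/* Old check: anything without flat CCS is assumed to use aux CCS. */
static bool sketch_needs_aux_inv_old(const struct sketch_device_info *info)
{
	return !info->has_flat_ccs;
}

/* New check: the device states explicitly whether it uses aux CCS. */
static bool sketch_needs_aux_inv_new(const struct sketch_device_info *info)
{
	return info->has_aux_ccs;
}

/* Illustrative devices only, not real platforms. */
static const struct sketch_device_info sketch_dev_aux = {
	.name = "aux-ccs", .has_flat_ccs = false, .has_aux_ccs = true,
};
static const struct sketch_device_info sketch_dev_flat = {
	.name = "flat-ccs", .has_flat_ccs = true, .has_aux_ccs = false,
};
static const struct sketch_device_info sketch_dev_none = {
	.name = "no-ccs", .has_flat_ccs = false, .has_aux_ccs = false,
};

static void sketch_check_ccs_flags(void)
{
	/* Both checks agree when exactly one of the flags is set. */
	assert(sketch_needs_aux_inv_old(&sketch_dev_aux) ==
	       sketch_needs_aux_inv_new(&sketch_dev_aux));
	assert(sketch_needs_aux_inv_old(&sketch_dev_flat) ==
	       sketch_needs_aux_inv_new(&sketch_dev_flat));

	/* They disagree for a device that has neither kind of CCS. */
	assert(sketch_needs_aux_inv_old(&sketch_dev_none));
	assert(!sketch_needs_aux_inv_new(&sketch_dev_none));
}
```

The third device is the case the commit message refers to: the
negated flat CCS check would request an aux invalidation there,
while the dedicated flag would not.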
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 563efee055602..0d4d5e0407a2d 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -267,7 +267,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
else if (engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
- if (!HAS_FLAT_CCS(rq->engine->i915))
+ if (HAS_AUX_CCS(rq->engine->i915))
count = 8 + 4;
else
count = 8;
@@ -307,7 +307,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
if (mode & EMIT_INVALIDATE) {
cmd += 2;
- if (!HAS_FLAT_CCS(rq->engine->i915) &&
+ if (HAS_AUX_CCS(rq->engine->i915) &&
(rq->engine->class == VIDEO_DECODE_CLASS ||
rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
aux_inv = rq->engine->mask &
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 682ef2b5c7d59..e9cc048b5727a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -848,6 +848,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
* stored in lmem to support the 3D and media compression formats.
*/
#define HAS_FLAT_CCS(i915) (INTEL_INFO(i915)->has_flat_ccs)
+#define HAS_AUX_CCS(i915) (INTEL_INFO(i915)->has_aux_ccs)
#define HAS_GT_UC(i915) (INTEL_INFO(i915)->has_gt_uc)
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index fcacdc21643cf..c9ff1d11a9fce 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -643,7 +643,8 @@ static const struct intel_device_info jsl_info = {
TGL_CACHELEVEL, \
.has_global_mocs = 1, \
.has_pxp = 1, \
- .max_pat_index = 3
+ .max_pat_index = 3, \
+ .has_aux_ccs = 1
static const struct intel_device_info tgl_info = {
GEN12_FEATURES,
@@ -775,6 +776,7 @@ static const struct intel_device_info dg2_info = {
static const struct intel_device_info ats_m_info = {
DG2_FEATURES,
+ .has_aux_ccs = 1,
.require_force_probe = 1,
.tuning_thread_rr_after_dep = 1,
};
@@ -827,6 +829,7 @@ static const struct intel_device_info mtl_info = {
.__runtime.media.ip.ver = 13,
PLATFORM(INTEL_METEORLAKE),
.extra_gt_list = xelpmp_extra_gt,
+ .has_aux_ccs = 1,
.has_flat_ccs = 0,
.has_gmd_id = 1,
.has_guc_deprivilege = 1,
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index dbfe6443457b5..93485507506cc 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -151,6 +151,7 @@ enum intel_ppgtt_type {
func(has_reset_engine); \
func(has_3d_pipeline); \
func(has_4tile); \
+ func(has_aux_ccs); \
func(has_flat_ccs); \
func(has_global_mocs); \
func(has_gmd_id); \
--
2.40.1
* [Intel-gfx] [PATCH v5 4/9] drm/i915/gt: Ensure memory quiesced before invalidation
2023-07-18 13:38 [Intel-gfx] [PATCH v5 0/9] Update AUX invalidation sequence Andi Shyti
` (2 preceding siblings ...)
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 3/9] drm/i915: Add the has_aux_ccs device property Andi Shyti
@ 2023-07-18 13:38 ` Andi Shyti
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 5/9] drm/i915/gt: Rename flags with bit_group_X according to the datasheet Andi Shyti
` (5 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Andi Shyti @ 2023-07-18 13:38 UTC (permalink / raw)
To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala,
Nirmoy Das, Andrzej Hajda
Cc: Intel GFX, DRI Devel
From: Jonathan Cavitt <jonathan.cavitt@intel.com>
All memory traffic must be quiesced before requesting
an aux invalidation on platforms that use Aux CCS.
Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
---
drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
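
Note, not part of the patch: a small userspace sketch of how the
new condition evaluates. In C, '&' binds tighter than '||', so the
test reads as (mode & EMIT_FLUSH) || HAS_AUX_CCS(). The numeric
values of the mode bits and the helper name below are assumptions
made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SKETCH_EMIT_INVALIDATE	(1u << 0)
#define SKETCH_EMIT_FLUSH	(1u << 1)

/*
 * Mirrors the condition added by the patch: the flush block runs when
 * a flush is requested or, unconditionally, when the platform uses
 * aux CCS.
 */
static bool sketch_emits_flush(uint32_t mode, bool has_aux_ccs)
{
	return mode & SKETCH_EMIT_FLUSH || has_aux_ccs;
}

static void sketch_check_flush_condition(void)
{
	/* Without aux CCS only an explicit flush request triggers it. */
	assert(!sketch_emits_flush(SKETCH_EMIT_INVALIDATE, false));
	assert(sketch_emits_flush(SKETCH_EMIT_FLUSH, false));

	/* With aux CCS the flush is emitted for any mode. */
	assert(sketch_emits_flush(SKETCH_EMIT_INVALIDATE, true));
	assert(sketch_emits_flush(0, true));
}
```

On aux CCS platforms the flush block is therefore entered on every
call, also when the caller asked only for an invalidation.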
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 0d4d5e0407a2d..6fd1f254b84a2 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -202,7 +202,11 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
{
struct intel_engine_cs *engine = rq->engine;
- if (mode & EMIT_FLUSH) {
+ /*
+ * Aux invalidations on Aux CCS platforms require
+ * memory traffic is quiesced prior.
+ */
+ if (mode & EMIT_FLUSH || HAS_AUX_CCS(engine->i915)) {
u32 flags = 0;
int err;
u32 *cs;
--
2.40.1
* [Intel-gfx] [PATCH v5 5/9] drm/i915/gt: Rename flags with bit_group_X according to the datasheet
2023-07-18 13:38 [Intel-gfx] [PATCH v5 0/9] Update AUX invalidation sequence Andi Shyti
` (3 preceding siblings ...)
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 4/9] drm/i915/gt: Ensure memory quiesced before invalidation Andi Shyti
@ 2023-07-18 13:38 ` Andi Shyti
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 6/9] drm/i915/gt: Refactor intel_emit_pipe_control_cs() in a single function Andi Shyti
` (4 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Andi Shyti @ 2023-07-18 13:38 UTC (permalink / raw)
To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala,
Nirmoy Das, Andrzej Hajda
Cc: Intel GFX, DRI Devel
In preparation for the next patch, align the naming of the pipe
control flag sets with the datasheet (BSPEC 47112).
The variable "flags" in gen12_emit_flush_rcs() holds the set of
flags that the datasheet calls Bit Group 1, so rename it to
bit_group_1.
Also define Bit Group 0 as bit_group_0, where currently only the
PIPE_CONTROL0_HDC_PIPELINE_FLUSH bit is set.
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
---
drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 34 +++++++++++++-----------
drivers/gpu/drm/i915/gt/gen8_engine_cs.h | 18 ++++++++-----
2 files changed, 29 insertions(+), 23 deletions(-)
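
Note, not part of the patch: a userspace sketch of the 6-dword
packet built by __gen8_emit_pipe_control(), showing where the two
bit groups land. The header value is a placeholder and not the real
GFX_OP_PIPE_CONTROL() encoding; the two (1 << 13) flag values are
the ones from intel_gpu_commands.h as touched later in this series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PIPE_CONTROL_DWORDS	6

/* Placeholder header, not the real GFX_OP_PIPE_CONTROL() encoding. */
#define SKETCH_OP_PIPE_CONTROL		0xf0000000u

/* Same bit position, different dword, different meaning. */
#define PIPE_CONTROL_CCS_FLUSH		(1u << 13)	/* bit group 0 */
#define PIPE_CONTROL_DEPTH_STALL	(1u << 13)	/* bit group 1 */

static uint32_t *sketch_emit_pipe_control(uint32_t *batch,
					  uint32_t bit_group_0,
					  uint32_t bit_group_1,
					  uint32_t offset)
{
	memset(batch, 0, SKETCH_PIPE_CONTROL_DWORDS * sizeof(uint32_t));

	batch[0] = SKETCH_OP_PIPE_CONTROL | bit_group_0; /* dword 0 */
	batch[1] = bit_group_1;                          /* dword 1 */
	batch[2] = offset;

	return batch + SKETCH_PIPE_CONTROL_DWORDS;
}

static void sketch_check_pipe_control(void)
{
	uint32_t batch[SKETCH_PIPE_CONTROL_DWORDS];
	uint32_t *end;

	end = sketch_emit_pipe_control(batch, PIPE_CONTROL_CCS_FLUSH,
				       PIPE_CONTROL_DEPTH_STALL, 0x40);

	assert(end == batch + SKETCH_PIPE_CONTROL_DWORDS);
	assert(batch[0] == (SKETCH_OP_PIPE_CONTROL | (1u << 13)));
	assert(batch[1] == (1u << 13));
	assert(batch[2] == 0x40);
	assert(batch[3] == 0 && batch[4] == 0 && batch[5] == 0);
}
```

Because the same bit position means different things in dword 0
and dword 1, naming the arguments after the datasheet bit groups
makes it harder to put a flag in the wrong dword.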
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 6fd1f254b84a2..c9951bcf091a2 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -207,7 +207,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
* memory traffic is quiesced prior.
*/
if (mode & EMIT_FLUSH || HAS_AUX_CCS(engine->i915)) {
- u32 flags = 0;
+ u32 bit_group_0 = 0;
+ u32 bit_group_1 = 0;
int err;
u32 *cs;
@@ -215,32 +216,33 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
if (err)
return err;
- flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
- flags |= PIPE_CONTROL_FLUSH_L3;
- flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
- flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+ bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
+
+ bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
+ bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
+ bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
+ bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
/* Wa_1409600907:tgl,adl-p */
- flags |= PIPE_CONTROL_DEPTH_STALL;
- flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
- flags |= PIPE_CONTROL_FLUSH_ENABLE;
+ bit_group_1 |= PIPE_CONTROL_DEPTH_STALL;
+ bit_group_1 |= PIPE_CONTROL_DC_FLUSH_ENABLE;
+ bit_group_1 |= PIPE_CONTROL_FLUSH_ENABLE;
- flags |= PIPE_CONTROL_STORE_DATA_INDEX;
- flags |= PIPE_CONTROL_QW_WRITE;
+ bit_group_1 |= PIPE_CONTROL_STORE_DATA_INDEX;
+ bit_group_1 |= PIPE_CONTROL_QW_WRITE;
- flags |= PIPE_CONTROL_CS_STALL;
+ bit_group_1 |= PIPE_CONTROL_CS_STALL;
if (!HAS_3D_PIPELINE(engine->i915))
- flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
+ bit_group_1 &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
else if (engine->class == COMPUTE_CLASS)
- flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
+ bit_group_1 &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
cs = intel_ring_begin(rq, 6);
if (IS_ERR(cs))
return PTR_ERR(cs);
- cs = gen12_emit_pipe_control(cs,
- PIPE_CONTROL0_HDC_PIPELINE_FLUSH,
- flags, LRC_PPHWSP_SCRATCH_ADDR);
+ cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
+ LRC_PPHWSP_SCRATCH_ADDR);
intel_ring_advance(rq, cs);
}
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
index 655e5c00ddc27..a44eda096557c 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
@@ -49,25 +49,29 @@ u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg);
static inline u32 *
-__gen8_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, u32 offset)
+__gen8_emit_pipe_control(u32 *batch, u32 bit_group_0,
+ u32 bit_group_1, u32 offset)
{
memset(batch, 0, 6 * sizeof(u32));
- batch[0] = GFX_OP_PIPE_CONTROL(6) | flags0;
- batch[1] = flags1;
+ batch[0] = GFX_OP_PIPE_CONTROL(6) | bit_group_0;
+ batch[1] = bit_group_1;
batch[2] = offset;
return batch + 6;
}
-static inline u32 *gen8_emit_pipe_control(u32 *batch, u32 flags, u32 offset)
+static inline u32 *gen8_emit_pipe_control(u32 *batch,
+ u32 bit_group_1, u32 offset)
{
- return __gen8_emit_pipe_control(batch, 0, flags, offset);
+ return __gen8_emit_pipe_control(batch, 0, bit_group_1, offset);
}
-static inline u32 *gen12_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, u32 offset)
+static inline u32 *gen12_emit_pipe_control(u32 *batch, u32 bit_group_0,
+ u32 bit_group_1, u32 offset)
{
- return __gen8_emit_pipe_control(batch, flags0, flags1, offset);
+ return __gen8_emit_pipe_control(batch, bit_group_0,
+ bit_group_1, offset);
}
static inline u32 *
--
2.40.1
* [Intel-gfx] [PATCH v5 6/9] drm/i915/gt: Refactor intel_emit_pipe_control_cs() in a single function
2023-07-18 13:38 [Intel-gfx] [PATCH v5 0/9] Update AUX invalidation sequence Andi Shyti
` (4 preceding siblings ...)
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 5/9] drm/i915/gt: Rename flags with bit_group_X according to the datasheet Andi Shyti
@ 2023-07-18 13:38 ` Andi Shyti
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 7/9] drm/i915/gt: Ensure memory quiesced before invalidation for all engines Andi Shyti
` (3 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Andi Shyti @ 2023-07-18 13:38 UTC (permalink / raw)
To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala,
Nirmoy Das, Andrzej Hajda
Cc: Intel GFX, DRI Devel
Just a trivial refactoring to reduce code duplication. This will
come in handy in the next commits.
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 44 +++++++++++++-----------
1 file changed, 23 insertions(+), 21 deletions(-)
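
Note, not part of the patch: in this version the wrapper returns the
command stream pointer (or an error pointer) and the callers do not
look at it. The userspace sketch below shows one possible way of
propagating the error as an int, assuming an int-returning variant
of the helper. The ring structure and all sketch_* names are
hypothetical; ERR_PTR(), PTR_ERR() and IS_ERR() are simplified
userspace imitations of the kernel helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO	4095
#define SKETCH_RING_DWORDS	16

/* Simplified userspace imitations of the kernel error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Hypothetical ring with a fixed number of dwords. */
struct sketch_ring {
	uint32_t buf[SKETCH_RING_DWORDS];
	unsigned int used;
};

/* Reserve n dwords, or return an error pointer if they do not fit. */
static uint32_t *sketch_ring_begin(struct sketch_ring *ring, unsigned int n)
{
	uint32_t *cs;

	if (ring->used + n > SKETCH_RING_DWORDS)
		return ERR_PTR(-ENOSPC);

	cs = &ring->buf[ring->used];
	ring->used += n;

	return cs;
}

/* int-returning variant: the failure reaches the caller as an errno. */
static int sketch_emit_pipe_control_cs(struct sketch_ring *ring)
{
	uint32_t *cs;
	int i;

	cs = sketch_ring_begin(ring, 6);
	if (IS_ERR(cs))
		return (int)PTR_ERR(cs);

	for (i = 0; i < 6; i++)
		cs[i] = 0;

	return 0;
}

static void sketch_check_error_propagation(void)
{
	struct sketch_ring ring = { .used = 0 };

	assert(sketch_emit_pipe_control_cs(&ring) == 0);	/* 6 used */
	assert(sketch_emit_pipe_control_cs(&ring) == 0);	/* 12 used */
	/* A third packet would need 18 dwords and does not fit. */
	assert(sketch_emit_pipe_control_cs(&ring) == -ENOSPC);
	assert(ring.used == 12);
}
```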
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index c9951bcf091a2..3275e55b18d90 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -177,23 +177,31 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv
return cs;
}
+static u32 *intel_emit_pipe_control_cs(struct i915_request *rq, u32 bit_group_0,
+ u32 bit_group_1, u32 offset)
+{
+ u32 *cs;
+
+ cs = intel_ring_begin(rq, 6);
+ if (IS_ERR(cs))
+ return cs;
+
+ cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
+ LRC_PPHWSP_SCRATCH_ADDR);
+ intel_ring_advance(rq, cs);
+
+ return cs;
+}
+
static int mtl_dummy_pipe_control(struct i915_request *rq)
{
/* Wa_14016712196 */
if (IS_MTL_GRAPHICS_STEP(rq->engine->i915, M, STEP_A0, STEP_B0) ||
- IS_MTL_GRAPHICS_STEP(rq->engine->i915, P, STEP_A0, STEP_B0)) {
- u32 *cs;
-
- /* dummy PIPE_CONTROL + depth flush */
- cs = intel_ring_begin(rq, 6);
- if (IS_ERR(cs))
- return PTR_ERR(cs);
- cs = gen12_emit_pipe_control(cs,
- 0,
- PIPE_CONTROL_DEPTH_CACHE_FLUSH,
- LRC_PPHWSP_SCRATCH_ADDR);
- intel_ring_advance(rq, cs);
- }
+ IS_MTL_GRAPHICS_STEP(rq->engine->i915, P, STEP_A0, STEP_B0))
+ intel_emit_pipe_control_cs(rq,
+ 0,
+ PIPE_CONTROL_DEPTH_CACHE_FLUSH,
+ LRC_PPHWSP_SCRATCH_ADDR);
return 0;
}
@@ -210,7 +218,6 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
u32 bit_group_0 = 0;
u32 bit_group_1 = 0;
int err;
- u32 *cs;
err = mtl_dummy_pipe_control(rq);
if (err)
@@ -237,13 +244,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
else if (engine->class == COMPUTE_CLASS)
bit_group_1 &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
- cs = intel_ring_begin(rq, 6);
- if (IS_ERR(cs))
- return PTR_ERR(cs);
-
- cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
- LRC_PPHWSP_SCRATCH_ADDR);
- intel_ring_advance(rq, cs);
+ intel_emit_pipe_control_cs(rq, bit_group_0, bit_group_1,
+ LRC_PPHWSP_SCRATCH_ADDR);
}
if (mode & EMIT_INVALIDATE) {
--
2.40.1
* [Intel-gfx] [PATCH v5 7/9] drm/i915/gt: Ensure memory quiesced before invalidation for all engines
2023-07-18 13:38 [Intel-gfx] [PATCH v5 0/9] Update AUX invalidation sequence Andi Shyti
` (5 preceding siblings ...)
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 6/9] drm/i915/gt: Refactor intel_emit_pipe_control_cs() in a single function Andi Shyti
@ 2023-07-18 13:38 ` Andi Shyti
2023-07-18 14:12 ` Nirmoy Das
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 8/9] drm/i915/gt: Poll aux invalidation register bit on invalidation Andi Shyti
` (2 subsequent siblings)
9 siblings, 1 reply; 13+ messages in thread
From: Andi Shyti @ 2023-07-18 13:38 UTC (permalink / raw)
To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala,
Nirmoy Das, Andrzej Hajda
Cc: Intel GFX, DRI Devel
Commit af9e423a8aae ("drm/i915/gt: Ensure memory quiesced before
invalidation") made sure that memory is quiesced before
invalidating the AUX CCS table. Do the same for all the other
engines, not just RCS.
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
---
drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 71 +++++++++++++++++---
drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 1 +
2 files changed, 62 insertions(+), 10 deletions(-)
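
Note, not part of the patch: a userspace sketch of how the aux_inv
mask in gen12_emit_flush_xcs() is computed. The engine bit layout
(RCS0 at bit 0, nine blitter engines at bits 1..9, video decode
from bit 10) and the 32-bit GENMASK imitation are assumptions made
for this illustration, as is the premise that each engine's mask
carries only its own bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-bit imitations of the kernel BIT()/GENMASK() helpers. */
#define SKETCH_BIT(n)		(1u << (n))
#define SKETCH_GENMASK(h, l)	((~0u << (l)) & (~0u >> (31 - (h))))

/* Assumed engine bit layout, for illustration only. */
enum {
	SKETCH_RCS0 = 0,
	SKETCH_BCS0 = 1,
	SKETCH_BCS_LAST = 9,
	SKETCH_VCS0 = 10,
};

/* Non-zero only on aux CCS platforms and for non-blitter engines. */
static uint32_t sketch_aux_inv_mask(uint32_t engine_mask, bool has_aux_ccs)
{
	if (!has_aux_ccs)
		return 0;

	return engine_mask & ~SKETCH_GENMASK(SKETCH_BCS_LAST, SKETCH_BCS0);
}

static void sketch_check_aux_inv_mask(void)
{
	/* Bits 1..9 cover the blitter engines in this layout. */
	assert(SKETCH_GENMASK(SKETCH_BCS_LAST, SKETCH_BCS0) == 0x3feu);

	/* A video decode engine keeps its bit on aux CCS platforms. */
	assert(sketch_aux_inv_mask(SKETCH_BIT(SKETCH_VCS0), true) ==
	       SKETCH_BIT(SKETCH_VCS0));

	/* Blitter engines are always masked out. */
	assert(sketch_aux_inv_mask(SKETCH_BIT(SKETCH_BCS0), true) == 0);

	/* Without aux CCS nothing is invalidated. */
	assert(sketch_aux_inv_mask(SKETCH_BIT(SKETCH_VCS0), false) == 0);
}
```

Under these assumptions a blitter engine always ends up with
aux_inv == 0, so the COPY_ENGINE_CLASS case inside the
"if (aux_inv)" block would not be reached; this may be worth
double checking against the real engine masks.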
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 3275e55b18d90..2f40cd515cc78 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -225,6 +225,13 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
+ /*
+ * When required, in MTL+ platforms we need to
+ * set the CCS_FLUSH bit in the pipe control
+ */
+ if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
+ bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
+
bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
@@ -309,20 +316,64 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
{
intel_engine_mask_t aux_inv = 0;
- u32 cmd, *cs;
+ u32 cmd = 4;
+ u32 *cs;
- cmd = 4;
- if (mode & EMIT_INVALIDATE) {
+ if (mode & EMIT_INVALIDATE)
cmd += 2;
- if (HAS_AUX_CCS(rq->engine->i915) &&
- (rq->engine->class == VIDEO_DECODE_CLASS ||
- rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
- aux_inv = rq->engine->mask &
- ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
- if (aux_inv)
- cmd += 4;
+ if (HAS_AUX_CCS(rq->engine->i915))
+ aux_inv = rq->engine->mask &
+ ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
+
+ /*
+ * Aux invalidations on Aux CCS platforms require
+ * memory traffic is quiesced prior.
+ */
+ if (aux_inv) {
+ u32 bit_group_0 = 0;
+ u32 bit_group_1 = 0;
+
+ cmd += 4;
+
+ bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
+
+ switch (rq->engine->class) {
+ case VIDEO_DECODE_CLASS:
+ bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
+ bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+ bit_group_1 |= PIPE_CONTROL_DC_FLUSH_ENABLE;
+ bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
+ bit_group_1 |= PIPE_CONTROL_CS_STALL;
+
+ /*
+ * When required, in MTL+ platforms we need to
+ * set the CCS_FLUSH bit in the pipe control
+ */
+ if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
+ bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
+
+ break;
+
+ case VIDEO_ENHANCEMENT_CLASS:
+ case COMPUTE_CLASS:
+ bit_group_1 |= MI_FLUSH_DW;
+
+ break;
+
+ case COPY_ENGINE_CLASS:
+ /*
+ * When required, in MTL+ platforms we need to
+ * set the CCS_FLUSH bit in the pipe control
+ */
+ if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
+ bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
+ break;
}
+
+ if (bit_group_1 || bit_group_0)
+ intel_emit_pipe_control_cs(rq, bit_group_0, bit_group_1,
+ LRC_PPHWSP_SCRATCH_ADDR);
}
cs = intel_ring_begin(rq, cmd);
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 5d143e2a8db03..5df7cce23197c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -299,6 +299,7 @@
#define PIPE_CONTROL_QW_WRITE (1<<14)
#define PIPE_CONTROL_POST_SYNC_OP_MASK (3<<14)
#define PIPE_CONTROL_DEPTH_STALL (1<<13)
+#define PIPE_CONTROL_CCS_FLUSH (1<<13) /* MTL+ */
#define PIPE_CONTROL_WRITE_FLUSH (1<<12)
#define PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH (1<<12) /* gen6+ */
#define PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE (1<<11) /* MBZ on ILK */
--
2.40.1
* [Intel-gfx] [PATCH v5 8/9] drm/i915/gt: Poll aux invalidation register bit on invalidation
2023-07-18 13:38 [Intel-gfx] [PATCH v5 0/9] Update AUX invalidation sequence Andi Shyti
` (6 preceding siblings ...)
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 7/9] drm/i915/gt: Ensure memory quiesced before invalidation for all engines Andi Shyti
@ 2023-07-18 13:38 ` Andi Shyti
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 9/9] drm/i915/gt: Support aux invalidation on all engines Andi Shyti
2023-07-18 15:51 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Update AUX invalidation sequence (rev5) Patchwork
9 siblings, 0 replies; 13+ messages in thread
From: Andi Shyti @ 2023-07-18 13:38 UTC (permalink / raw)
To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala,
Nirmoy Das, Andrzej Hajda
Cc: Intel GFX, DRI Devel
From: Jonathan Cavitt <jonathan.cavitt@intel.com>
For platforms that use Aux CCS, wait for aux invalidation to
complete by polling the aux invalidation register until its bit
is cleared.
Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
---
drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 17 ++++++++++++-----
drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 1 +
2 files changed, 13 insertions(+), 5 deletions(-)
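
Note, not part of the patch: a userspace sketch of the dword
accounting behind the "+= 8" changes. The sequence is 3 dwords for
the register write plus 5 dwords for the semaphore wait, where it
used to be 3 dwords plus one MI_NOOP. The two command headers are
placeholders and not the real encodings, and the comments on the
individual semaphore dwords are my reading of the command layout;
AUX_INV (bit 0) and the 0x4208 offset of GEN12_CCS_AUX_INV come
from earlier in this series.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder headers, not the real MI_* encodings. */
#define SKETCH_MI_LRI			0x11000001u
#define SKETCH_MI_SEMAPHORE_POLL	0x0e000003u

#define AUX_INV				(1u << 0)
#define SKETCH_CCS_AUX_INV_OFFSET	0x4208u

#define SKETCH_AUX_INV_DWORDS		8

static uint32_t *sketch_emit_aux_table_inv(uint32_t *cs, uint32_t reg)
{
	/* Register write that requests the invalidation: 3 dwords. */
	*cs++ = SKETCH_MI_LRI;
	*cs++ = reg;
	*cs++ = AUX_INV;

	/* Register poll until the value reads back as 0: 5 dwords. */
	*cs++ = SKETCH_MI_SEMAPHORE_POLL;
	*cs++ = 0;	/* value to compare against */
	*cs++ = reg;	/* register being polled */
	*cs++ = 0;
	*cs++ = 0;

	return cs;
}

static void sketch_check_aux_table_inv(void)
{
	uint32_t batch[SKETCH_AUX_INV_DWORDS];
	uint32_t *end;

	end = sketch_emit_aux_table_inv(batch, SKETCH_CCS_AUX_INV_OFFSET);

	/* The space reserved by the callers has to match this count. */
	assert(end - batch == SKETCH_AUX_INV_DWORDS);

	/* The same register is written and then polled. */
	assert(batch[1] == SKETCH_CCS_AUX_INV_OFFSET);
	assert(batch[5] == SKETCH_CCS_AUX_INV_OFFSET);
	assert(batch[2] == AUX_INV);
	assert(batch[4] == 0);
}
```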
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 2f40cd515cc78..029ef1a5d3b6a 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -172,7 +172,15 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv
*cs++ = MI_LOAD_REGISTER_IMM(1) | MI_LRI_MMIO_REMAP_EN;
*cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
*cs++ = AUX_INV;
- *cs++ = MI_NOOP;
+
+ *cs++ = MI_SEMAPHORE_WAIT_TOKEN |
+ MI_SEMAPHORE_REGISTER_POLL |
+ MI_SEMAPHORE_POLL |
+ MI_SEMAPHORE_SAD_EQ_SDD;
+ *cs++ = 0;
+ *cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
+ *cs++ = 0;
+ *cs++ = 0;
return cs;
}
@@ -282,10 +290,9 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
else if (engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
+ count = 8;
if (HAS_AUX_CCS(rq->engine->i915))
- count = 8 + 4;
- else
- count = 8;
+ count += 8;
cs = intel_ring_begin(rq, count);
if (IS_ERR(cs))
@@ -334,7 +341,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
u32 bit_group_0 = 0;
u32 bit_group_1 = 0;
- cmd += 4;
+ cmd += 8;
bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 5df7cce23197c..2bd8d98d21102 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -121,6 +121,7 @@
#define MI_SEMAPHORE_TARGET(engine) ((engine)<<15)
#define MI_SEMAPHORE_WAIT MI_INSTR(0x1c, 2) /* GEN8+ */
#define MI_SEMAPHORE_WAIT_TOKEN MI_INSTR(0x1c, 3) /* GEN12+ */
+#define MI_SEMAPHORE_REGISTER_POLL (1 << 16)
#define MI_SEMAPHORE_POLL (1 << 15)
#define MI_SEMAPHORE_SAD_GT_SDD (0 << 12)
#define MI_SEMAPHORE_SAD_GTE_SDD (1 << 12)
--
2.40.1
* [Intel-gfx] [PATCH v5 9/9] drm/i915/gt: Support aux invalidation on all engines
2023-07-18 13:38 [Intel-gfx] [PATCH v5 0/9] Update AUX invalidation sequence Andi Shyti
` (7 preceding siblings ...)
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 8/9] drm/i915/gt: Poll aux invalidation register bit on invalidation Andi Shyti
@ 2023-07-18 13:38 ` Andi Shyti
2023-07-18 15:51 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Update AUX invalidation sequence (rev5) Patchwork
9 siblings, 0 replies; 13+ messages in thread
From: Andi Shyti @ 2023-07-18 13:38 UTC (permalink / raw)
To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala,
Nirmoy Das, Andrzej Hajda
Cc: Intel GFX, DRI Devel
Refactor the code so that all the operations around the aux
table invalidation are kept in one single place.
With this refactoring, add the remaining engines on which the
invalidation should be performed.
Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
---
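Note for readers (not part of the commit message): the refactoring
relies on a sentinel register meaning "this engine has no aux
invalidation". Here is a rough standalone sketch of that pattern,
assuming that a zero offset marks an invalid register. All demo_*
names and the offsets are hypothetical; they only mirror the shape
of gen12_get_aux_inv_reg() and gen12_emit_aux_table_inv().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t offset; } demo_reg_t;

enum demo_engine_id {
	DEMO_RCS0, DEMO_BCS0, DEMO_VCS0, DEMO_VCS1,
	DEMO_VCS2, DEMO_VECS0, DEMO_CCS0,
};

static demo_reg_t demo_reg(uint32_t offset)
{
	demo_reg_t r = { offset };

	return r;
}

/* Assumption: offset 0 is the "invalid register" sentinel. */
static bool demo_reg_valid(demo_reg_t r)
{
	return r.offset != 0;
}

/* Map an engine to its invalidation register, or to the sentinel. */
static demo_reg_t demo_get_aux_inv_reg(bool has_aux_ccs, enum demo_engine_id id)
{
	if (!has_aux_ccs)
		return demo_reg(0);

	switch (id) {
	case DEMO_RCS0:  return demo_reg(0x100); /* placeholder offsets */
	case DEMO_BCS0:  return demo_reg(0x200);
	case DEMO_VCS0:  return demo_reg(0x300);
	case DEMO_VCS2:  return demo_reg(0x400);
	case DEMO_VECS0: return demo_reg(0x500);
	case DEMO_CCS0:  return demo_reg(0x600);
	default:         return demo_reg(0);
	}
}

/* Emit nothing for engines without a register, so callers need no checks. */
static uint32_t *demo_emit_aux_table_inv(bool has_aux_ccs,
					 enum demo_engine_id id, uint32_t *cs)
{
	demo_reg_t inv_reg = demo_get_aux_inv_reg(has_aux_ccs, id);

	assert(cs != NULL);

	if (!demo_reg_valid(inv_reg))
		return cs;

	*cs++ = inv_reg.offset; /* stands in for the real command sequence */
	return cs;
}
```

With the check folded into the emit helper, call sites such as the
indirect context and flush paths can call it unconditionally, which
is what lets the per-class if/else blocks be dropped.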
drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 58 +++++++++++++++---------
drivers/gpu/drm/i915/gt/gen8_engine_cs.h | 3 +-
drivers/gpu/drm/i915/gt/intel_lrc.c | 17 +------
3 files changed, 41 insertions(+), 37 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 029ef1a5d3b6a..05f5794ce7fa7 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -165,9 +165,36 @@ static u32 preparser_disable(bool state)
return MI_ARB_CHECK | 1 << 8 | state;
}
-u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg)
+static i915_reg_t gen12_get_aux_inv_reg(struct intel_engine_cs *engine)
{
- u32 gsi_offset = gt->uncore->gsi_offset;
+ if (!HAS_AUX_CCS(engine->i915))
+ return INVALID_MMIO_REG;
+
+ switch (engine->id) {
+ case RCS0:
+ return GEN12_CCS_AUX_INV;
+ case BCS0:
+ return GEN12_BCS0_AUX_INV;
+ case VCS0:
+ return GEN12_VD0_AUX_INV;
+ case VCS2:
+ return GEN12_VD2_AUX_INV;
+ case VECS0:
+ return GEN12_VE0_AUX_INV;
+ case CCS0:
+ return GEN12_CCS0_AUX_INV;
+ default:
+ return INVALID_MMIO_REG;
+ }
+}
+
+u32 *gen12_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs)
+{
+ i915_reg_t inv_reg = gen12_get_aux_inv_reg(engine);
+ u32 gsi_offset = engine->gt->uncore->gsi_offset;
+
+ if (!i915_mmio_reg_valid(inv_reg))
+ return cs;
*cs++ = MI_LOAD_REGISTER_IMM(1) | MI_LRI_MMIO_REMAP_EN;
*cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
@@ -201,6 +228,11 @@ static u32 *intel_emit_pipe_control_cs(struct i915_request *rq, u32 bit_group_0,
return cs;
}
+static bool gen12_engine_has_aux_inv(struct intel_engine_cs *engine)
+{
+ return i915_mmio_reg_valid(gen12_get_aux_inv_reg(engine));
+}
+
static int mtl_dummy_pipe_control(struct i915_request *rq)
{
/* Wa_14016712196 */
@@ -307,11 +339,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
- if (!HAS_FLAT_CCS(rq->engine->i915)) {
- /* hsdes: 1809175790 */
- cs = gen12_emit_aux_table_inv(rq->engine->gt, cs,
- GEN12_CCS_AUX_INV);
- }
+ cs = gen12_emit_aux_table_inv(engine, cs);
*cs++ = preparser_disable(false);
intel_ring_advance(rq, cs);
@@ -322,22 +350,17 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
{
- intel_engine_mask_t aux_inv = 0;
u32 cmd = 4;
u32 *cs;
if (mode & EMIT_INVALIDATE)
cmd += 2;
- if (HAS_AUX_CCS(rq->engine->i915))
- aux_inv = rq->engine->mask &
- ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
-
/*
* Aux invalidations on Aux CCS platforms require
* memory traffic is quiesced prior.
*/
- if (aux_inv) {
+ if (gen12_engine_has_aux_inv(rq->engine)) {
u32 bit_group_0 = 0;
u32 bit_group_1 = 0;
@@ -411,14 +434,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
*cs++ = 0; /* upper addr */
*cs++ = 0; /* value */
- if (aux_inv) { /* hsdes: 1809175790 */
- if (rq->engine->class == VIDEO_DECODE_CLASS)
- cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_VD0_AUX_INV);
- else
- cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_VE0_AUX_INV);
- }
+ cs = gen12_emit_aux_table_inv(rq->engine, cs);
if (mode & EMIT_INVALIDATE)
*cs++ = preparser_disable(false);
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
index a44eda096557c..867ba697aceb8 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
@@ -13,6 +13,7 @@
#include "intel_gt_regs.h"
#include "intel_gpu_commands.h"
+struct intel_engine_cs;
struct intel_gt;
struct i915_request;
@@ -46,7 +47,7 @@ u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
u32 *gen11_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
-u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg);
+u32 *gen12_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs);
static inline u32 *
__gen8_emit_pipe_control(u32 *batch, u32 bit_group_0,
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 235f3fab60a98..119deb9f938c7 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1371,10 +1371,7 @@ gen12_emit_indirect_ctx_rcs(const struct intel_context *ce, u32 *cs)
IS_DG2_G11(ce->engine->i915))
cs = gen8_emit_pipe_control(cs, PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE, 0);
- /* hsdes: 1809175790 */
- if (!HAS_FLAT_CCS(ce->engine->i915))
- cs = gen12_emit_aux_table_inv(ce->engine->gt,
- cs, GEN12_CCS_AUX_INV);
+ cs = gen12_emit_aux_table_inv(ce->engine, cs);
/* Wa_16014892111 */
if (IS_MTL_GRAPHICS_STEP(ce->engine->i915, M, STEP_A0, STEP_B0) ||
@@ -1399,17 +1396,7 @@ gen12_emit_indirect_ctx_xcs(const struct intel_context *ce, u32 *cs)
PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE,
0);
- /* hsdes: 1809175790 */
- if (!HAS_FLAT_CCS(ce->engine->i915)) {
- if (ce->engine->class == VIDEO_DECODE_CLASS)
- cs = gen12_emit_aux_table_inv(ce->engine->gt,
- cs, GEN12_VD0_AUX_INV);
- else if (ce->engine->class == VIDEO_ENHANCEMENT_CLASS)
- cs = gen12_emit_aux_table_inv(ce->engine->gt,
- cs, GEN12_VE0_AUX_INV);
- }
-
- return cs;
+ return gen12_emit_aux_table_inv(ce->engine, cs);
}
static void
--
2.40.1
* Re: [Intel-gfx] [PATCH v5 1/9] drm-tip: 2023y-07m-17d-16h-04m-53s UTC integration manifest
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 1/9] drm-tip: 2023y-07m-17d-16h-04m-53s UTC integration manifest Andi Shyti
@ 2023-07-18 13:44 ` Andi Shyti
0 siblings, 0 replies; 13+ messages in thread
From: Andi Shyti @ 2023-07-18 13:44 UTC (permalink / raw)
To: Andi Shyti
Cc: Intel GFX, Jonathan Cavitt, DRI Devel, Chris Wilson,
Andrzej Hajda, Matt Roper, Nirmoy Das
Sorry! wrong format-patch :)
Please ignore patch 1.
Andi
On Tue, Jul 18, 2023 at 03:38:28PM +0200, Andi Shyti wrote:
> From: Robert Foss <rfoss@kernel.org>
>
> ---
> integration-manifest | 24 ++++++++++++++++++++++++
> 1 file changed, 24 insertions(+)
> create mode 100644 integration-manifest
>
> diff --git a/integration-manifest b/integration-manifest
> new file mode 100644
> index 0000000000000..8642016b34817
> --- /dev/null
> +++ b/integration-manifest
> @@ -0,0 +1,24 @@
> +drm drm-fixes 38d88d5e97c9032ebeca092b9372209f2ca92cdf
> + Merge tag 'amd-drm-fixes-6.5-2023-07-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
> +drm-misc drm-misc-fixes 05abb3be91d8788328231ee02973ab3d47f5e3d2
> + dma-buf/dma-resv: Stop leaking on krealloc() failure
> +drm-intel drm-intel-fixes fdf0eaf11452d72945af31804e2a1048ee1b574c
> + Linux 6.5-rc2
> +drm drm-next 6c7f27441d6af776a89147027c6f4a11c0162c64
> + Merge tag 'drm-misc-next-2023-07-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
> +drm-misc drm-misc-next-fixes 59bba51ec2a50e3dc5c3ee80f0a23207346303ff
> + drm/panel: Fine tune Starry-ili9882t panel HFP and HBP
> +drm-intel drm-intel-next-fixes f6cf3883df471abbcf1553127681dc244c8ff8dd
> + drm/i915: use mock device info for creating mock device
> +drm-misc drm-misc-next 41639b3a8b0f1f194dfe0577d99db70613f78626
> + drm/bridge: anx7625: Use common macros for HDCP capabilities
> +drm-intel drm-intel-next c5741c5c1122b7944d9af185c83ab7056153259e
> + drm/i915/display: Do not use stolen on MTL
> +drm-intel drm-intel-gt-next 8529e3777b7644d41105a06141574a24795f8348
> + drm/i915/gt: Do not use stolen on MTL
> +drm-intel topic/core-for-CI c0ea2fa0491287dea97b384bec1b5a614408b8e3
> + drm/i915/gsc: define gsc fw
> +drm-misc topic/i915-ttm 1e3944578b749449bd7fa6bf0bae4c3d3f5f1733
> + Merge tag 'amd-drm-next-5.16-2021-09-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
> +drm topic/nouveau-misc dfc4005f8c172eea359f9db08c3b2b0ff0153699
> + drm/nouveau/disp: move DAC load detection method
> --
> 2.40.1
* Re: [Intel-gfx] [PATCH v5 7/9] drm/i915/gt: Ensure memory quiesced before invalidation for all engines
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 7/9] drm/i915/gt: Ensure memory quiesced before invalidation for all engines Andi Shyti
@ 2023-07-18 14:12 ` Nirmoy Das
0 siblings, 0 replies; 13+ messages in thread
From: Nirmoy Das @ 2023-07-18 14:12 UTC (permalink / raw)
To: Andi Shyti, Jonathan Cavitt, Matt Roper, Chris Wilson,
Mika Kuoppala, Andrzej Hajda
Cc: Intel GFX, DRI Devel
Hi Andi,
On 7/18/2023 3:38 PM, Andi Shyti wrote:
> Commit af9e423a8aae ("drm/i915/gt: Ensure memory quiesced before
> invalidation") has made sure that the memory is quiesced before
> invalidating the AUX CCS table. Do it for all the other engines
> and not just RCS.
>
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> ---
> drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 71 +++++++++++++++++---
> drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 1 +
> 2 files changed, 62 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index 3275e55b18d90..2f40cd515cc78 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -225,6 +225,13 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>
> bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
>
> + /*
> + * When required, in MTL+ platforms we need to
> + * set the CCS_FLUSH bit in the pipe control
> + */
> + if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
> + bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
> +
> bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
> bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
> bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> @@ -309,20 +316,64 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
> int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
> {
> intel_engine_mask_t aux_inv = 0;
> - u32 cmd, *cs;
> + u32 cmd = 4;
> + u32 *cs;
>
> - cmd = 4;
> - if (mode & EMIT_INVALIDATE) {
> + if (mode & EMIT_INVALIDATE)
> cmd += 2;
>
> - if (HAS_AUX_CCS(rq->engine->i915) &&
> - (rq->engine->class == VIDEO_DECODE_CLASS ||
> - rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
> - aux_inv = rq->engine->mask &
> - ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
> - if (aux_inv)
> - cmd += 4;
> + if (HAS_AUX_CCS(rq->engine->i915))
> + aux_inv = rq->engine->mask &
> + ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
> +
> + /*
> + * Aux invalidations on Aux CCS platforms require
> + * memory traffic is quiesced prior.
> + */
> + if (aux_inv) {
> + u32 bit_group_0 = 0;
> + u32 bit_group_1 = 0;
> +
> + cmd += 4;
> +
> + bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
> +
> + switch (rq->engine->class) {
> + case VIDEO_DECODE_CLASS:
> + bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> + bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> + bit_group_1 |= PIPE_CONTROL_DC_FLUSH_ENABLE;
> + bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
> + bit_group_1 |= PIPE_CONTROL_CS_STALL;
> +
> + /*
> + * When required, in MTL+ platforms we need to
> + * set the CCS_FLUSH bit in the pipe control
> + */
> + if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
> + bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
> +
> + break;
> +
> + case VIDEO_ENHANCEMENT_CLASS:
> + case COMPUTE_CLASS:
> + bit_group_1 |= MI_FLUSH_DW;
> +
> + break;
> +
> + case COPY_ENGINE_CLASS:
> + /*
> + * When required, in MTL+ platforms we need to
> + * set the CCS_FLUSH bit in the pipe control
> + */
> + if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
> + bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
> + break;
> }
> +
> + if (bit_group_1 || bit_group_0)
> + intel_emit_pipe_control_cs(rq, bit_group_0, bit_group_1,
> + LRC_PPHWSP_SCRATCH_ADDR);
Pipe control is limited to the render and compute engines, and
gen12_emit_flush_xcs() gets called only for the other engines (BCS, VE, VD), AFAIU. So I imagine the changes for this patch as:
gen12_emit_flush_rcs()
  pipe_control with CCS_FLUSH
  AUX CCS inval
gen12_emit_flush_xcs()
  MI_FLUSH_DW (with CCS flush for BCS)
  AUX CCS inval
(Note that the CCS flush bit for MI_FLUSH_DW is bit 16.)
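To make the outline above concrete, here is a small standalone sketch
of the per-class choice. The flush_* names are made up for
illustration. The bit positions are taken from this thread (bit 13
from the PIPE_CONTROL_CCS_FLUSH define in the patch, bit 16 from the
note above) and should be checked against the hardware documentation.
Any platform gating (for example MTL+ only) is left out of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum flush_engine_class {
	FLUSH_RENDER,
	FLUSH_COMPUTE,
	FLUSH_COPY,
	FLUSH_VIDEO_DECODE,
	FLUSH_VIDEO_ENHANCE,
	FLUSH_CLASS_COUNT,
};

struct flush_plan {
	bool use_pipe_control;  /* false means MI_FLUSH_DW is used instead */
	uint32_t ccs_flush_bit; /* 0 when no CCS flush bit is requested */
};

/* Pick the flush command and CCS flush bit for one engine class. */
static struct flush_plan flush_plan_for(enum flush_engine_class cls)
{
	struct flush_plan plan = { false, 0 };

	assert(cls < FLUSH_CLASS_COUNT);

	switch (cls) {
	case FLUSH_RENDER:
	case FLUSH_COMPUTE:
		/* gen12_emit_flush_rcs() path: pipe control with CCS flush */
		plan.use_pipe_control = true;
		plan.ccs_flush_bit = 1u << 13;
		break;
	case FLUSH_COPY:
		/* gen12_emit_flush_xcs() path: MI_FLUSH_DW with CCS flush */
		plan.ccs_flush_bit = 1u << 16;
		break;
	default:
		/* VD and VE: plain MI_FLUSH_DW, no CCS flush bit */
		break;
	}

	return plan;
}
```

In both paths the AUX CCS invalidation would follow the flush, as in
the outline.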
Regards,
Nirmoy
> }
>
> cs = intel_ring_begin(rq, cmd);
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index 5d143e2a8db03..5df7cce23197c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -299,6 +299,7 @@
> #define PIPE_CONTROL_QW_WRITE (1<<14)
> #define PIPE_CONTROL_POST_SYNC_OP_MASK (3<<14)
> #define PIPE_CONTROL_DEPTH_STALL (1<<13)
> +#define PIPE_CONTROL_CCS_FLUSH (1<<13) /* MTL+ */
> #define PIPE_CONTROL_WRITE_FLUSH (1<<12)
> #define PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH (1<<12) /* gen6+ */
> #define PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE (1<<11) /* MBZ on ILK */
* [Intel-gfx] ✗ Fi.CI.BUILD: failure for Update AUX invalidation sequence (rev5)
2023-07-18 13:38 [Intel-gfx] [PATCH v5 0/9] Update AUX invalidation sequence Andi Shyti
` (8 preceding siblings ...)
2023-07-18 13:38 ` [Intel-gfx] [PATCH v5 9/9] drm/i915/gt: Support aux invalidation on all engines Andi Shyti
@ 2023-07-18 15:51 ` Patchwork
9 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2023-07-18 15:51 UTC (permalink / raw)
To: Andi Shyti; +Cc: intel-gfx
== Series Details ==
Series: Update AUX invalidation sequence (rev5)
URL : https://patchwork.freedesktop.org/series/119798/
State : failure
== Summary ==
Error: patch https://patchwork.freedesktop.org/api/1.0/series/119798/revisions/5/mbox/ not applied
Applying: drm-tip: 2023y-07m-17d-16h-04m-53s UTC integration manifest
Using index info to reconstruct a base tree...
Falling back to patching base and 3-way merge...
CONFLICT (add/add): Merge conflict in integration-manifest
Auto-merging integration-manifest
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 drm-tip: 2023y-07m-17d-16h-04m-53s UTC integration manifest
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Build failed, no error log produced