From: Kenneth Graunke <kenneth@whitecape.org>
To: intel-gfx@lists.freedesktop.org,
Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>,
Matt Roper <matthew.d.roper@intel.com>,
dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH v2 3/6] drm/i915/gt: Fix context workarounds with non-masked regs
Date: Tue, 27 Jun 2023 11:32:41 -0700 [thread overview]
Message-ID: <2394779.plcHdMng3P@mizzik> (raw)
In-Reply-To: <20230624171757.3906095-4-lucas.demarchi@intel.com>
[-- Attachment #1: Type: text/plain, Size: 4740 bytes --]
On Saturday, June 24, 2023 10:17:54 AM PDT Lucas De Marchi wrote:
> Most of the context workarounds tweak masked registers, but not all. For
> masked registers, when writing the value it's sufficient to just write
> the wa->set_bits since that will take care of both the clr and set bits
> as well as not overwriting other bits.
>
> However there are some workarounds, the registers are non-masked. Up
> until now the driver was simply emitting a MI_LOAD_REGISTER_IMM with the
> set_bits to program the register via the GPU in the WA bb. This has the
> side effect of overwriting the content of the register outside of bits
> that should be set and also doesn't handle the bits that should be
> cleared.
>
> Kenneth reported that on DG2, mesa was seeing a weird behavior due to
> the kernel programming of L3SQCREG5 in dg2_ctx_gt_tuning_init(). With
> the GPU idle, that register could be read via intel_reg as 0x00e001ff,
> but during a 3D workload it would change to 0x0000007f. So the
> programming of that tuning was affecting more than the bits in
> L3_PWM_TIMER_INIT_VAL_MASK. Matt Roper noticed the lack of rmw for the
> context workarounds due to the use of MI_LOAD_REGISTER_IMM.
>
> So, for registers that are not masked, read its value via mmio, modify
> and then set it in the buffer to be written by the GPU. This should take
> care in a simple way of programming just the bits required by the
> tuning/workaround. If in future there are registers that involved that
> can't be read by the CPU, a more complex approach may be required like
> a) issuing additional instructions to read and modify; or b) scan the
> golden context and patch it in place before saving it; or something
> else. But for now this should suffice.
>
> Scanning the context workarounds for all platforms, these are the
> impacted ones with the respective registers
>
> mtl: DRAW_WATERMARK
> mtl/dg2: XEHP_L3SQCREG5, XEHP_FF_MODE2
>
> ICL has some non-masked registers in the context workarounds:
> GEN8_L3CNTLREG, IVB_FBC_RT_BASE and VB_FBC_RT_BASE_UPPER, but there
> shouldn't be an impact. The first is already being manually read and the
> other 2 are intentionally overwriting the entire register. Same
> reasoning applies to GEN12_FF_MODE2: the WA is intentionally
> overwriting all the bits to avoid a read-modify-write.
>
> v2: Reword commit message wrt GEN12_FF_MODE2 and the changed behavior
> on preparatory patches.
>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23783#note_1968971
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_workarounds.c | 27 ++++++++++++++++++++-
> 1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 7d48bd57b6ef..9291c2b4ca0e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -986,6 +986,9 @@ void intel_engine_init_ctx_wa(struct intel_engine_cs *engine)
> int intel_engine_emit_ctx_wa(struct i915_request *rq)
> {
> struct i915_wa_list *wal = &rq->engine->ctx_wa_list;
> + struct intel_uncore *uncore = rq->engine->uncore;
> + enum forcewake_domains fw;
> + unsigned long flags;
> struct i915_wa *wa;
> unsigned int i;
> u32 *cs;
> @@ -1002,13 +1005,35 @@ int intel_engine_emit_ctx_wa(struct i915_request *rq)
> if (IS_ERR(cs))
> return PTR_ERR(cs);
>
> + fw = wal_get_fw_for_rmw(uncore, wal);
> +
> + intel_gt_mcr_lock(wal->gt, &flags);
> + spin_lock(&uncore->lock);
> + intel_uncore_forcewake_get__locked(uncore, fw);
> +
> *cs++ = MI_LOAD_REGISTER_IMM(wal->count);
> for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
> + u32 val;
> +
> + if (wa->masked_reg || wa->set == U32_MAX) {
I think you still want:
if (wa->masked_reg || wa->set == U32_MAX || wa->clr == U32_MAX) {
since there's no point to doing a read just to mask off 100% of the
values. Harmless, of course, but unnecessary.
Either way, patches 1-5 are:
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
> + val = wa->set;
> + } else {
> + val = wa->is_mcr ?
> + intel_gt_mcr_read_any_fw(wal->gt, wa->mcr_reg) :
> + intel_uncore_read_fw(uncore, wa->reg);
> + val &= ~wa->clr;
> + val |= wa->set;
> + }
> +
> *cs++ = i915_mmio_reg_offset(wa->reg);
> - *cs++ = wa->set;
> + *cs++ = val;
> }
> *cs++ = MI_NOOP;
>
> + intel_uncore_forcewake_put__locked(uncore, fw);
> + spin_unlock(&uncore->lock);
> + intel_gt_mcr_unlock(wal->gt, flags);
> +
> intel_ring_advance(rq, cs);
>
> ret = rq->engine->emit_flush(rq, EMIT_BARRIER);
>
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
WARNING: multiple messages have this Message-ID (diff)
From: Kenneth Graunke <kenneth@whitecape.org>
To: intel-gfx@lists.freedesktop.org,
Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>,
Matt Roper <matthew.d.roper@intel.com>,
dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v2 3/6] drm/i915/gt: Fix context workarounds with non-masked regs
Date: Tue, 27 Jun 2023 11:32:41 -0700 [thread overview]
Message-ID: <2394779.plcHdMng3P@mizzik> (raw)
In-Reply-To: <20230624171757.3906095-4-lucas.demarchi@intel.com>
[-- Attachment #1: Type: text/plain, Size: 4740 bytes --]
On Saturday, June 24, 2023 10:17:54 AM PDT Lucas De Marchi wrote:
> Most of the context workarounds tweak masked registers, but not all. For
> masked registers, when writing the value it's sufficient to just write
> the wa->set_bits since that will take care of both the clr and set bits
> as well as not overwriting other bits.
>
> However there are some workarounds, the registers are non-masked. Up
> until now the driver was simply emitting a MI_LOAD_REGISTER_IMM with the
> set_bits to program the register via the GPU in the WA bb. This has the
> side effect of overwriting the content of the register outside of bits
> that should be set and also doesn't handle the bits that should be
> cleared.
>
> Kenneth reported that on DG2, mesa was seeing a weird behavior due to
> the kernel programming of L3SQCREG5 in dg2_ctx_gt_tuning_init(). With
> the GPU idle, that register could be read via intel_reg as 0x00e001ff,
> but during a 3D workload it would change to 0x0000007f. So the
> programming of that tuning was affecting more than the bits in
> L3_PWM_TIMER_INIT_VAL_MASK. Matt Roper noticed the lack of rmw for the
> context workarounds due to the use of MI_LOAD_REGISTER_IMM.
>
> So, for registers that are not masked, read its value via mmio, modify
> and then set it in the buffer to be written by the GPU. This should take
> care in a simple way of programming just the bits required by the
> tuning/workaround. If in future there are registers that involved that
> can't be read by the CPU, a more complex approach may be required like
> a) issuing additional instructions to read and modify; or b) scan the
> golden context and patch it in place before saving it; or something
> else. But for now this should suffice.
>
> Scanning the context workarounds for all platforms, these are the
> impacted ones with the respective registers
>
> mtl: DRAW_WATERMARK
> mtl/dg2: XEHP_L3SQCREG5, XEHP_FF_MODE2
>
> ICL has some non-masked registers in the context workarounds:
> GEN8_L3CNTLREG, IVB_FBC_RT_BASE and VB_FBC_RT_BASE_UPPER, but there
> shouldn't be an impact. The first is already being manually read and the
> other 2 are intentionally overwriting the entire register. Same
> reasoning applies to GEN12_FF_MODE2: the WA is intentionally
> overwriting all the bits to avoid a read-modify-write.
>
> v2: Reword commit message wrt GEN12_FF_MODE2 and the changed behavior
> on preparatory patches.
>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23783#note_1968971
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_workarounds.c | 27 ++++++++++++++++++++-
> 1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 7d48bd57b6ef..9291c2b4ca0e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -986,6 +986,9 @@ void intel_engine_init_ctx_wa(struct intel_engine_cs *engine)
> int intel_engine_emit_ctx_wa(struct i915_request *rq)
> {
> struct i915_wa_list *wal = &rq->engine->ctx_wa_list;
> + struct intel_uncore *uncore = rq->engine->uncore;
> + enum forcewake_domains fw;
> + unsigned long flags;
> struct i915_wa *wa;
> unsigned int i;
> u32 *cs;
> @@ -1002,13 +1005,35 @@ int intel_engine_emit_ctx_wa(struct i915_request *rq)
> if (IS_ERR(cs))
> return PTR_ERR(cs);
>
> + fw = wal_get_fw_for_rmw(uncore, wal);
> +
> + intel_gt_mcr_lock(wal->gt, &flags);
> + spin_lock(&uncore->lock);
> + intel_uncore_forcewake_get__locked(uncore, fw);
> +
> *cs++ = MI_LOAD_REGISTER_IMM(wal->count);
> for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
> + u32 val;
> +
> + if (wa->masked_reg || wa->set == U32_MAX) {
I think you still want:
if (wa->masked_reg || wa->set == U32_MAX || wa->clr == U32_MAX) {
since there's no point to doing a read just to mask off 100% of the
values. Harmless, of course, but unnecessary.
Either way, patches 1-5 are:
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
> + val = wa->set;
> + } else {
> + val = wa->is_mcr ?
> + intel_gt_mcr_read_any_fw(wal->gt, wa->mcr_reg) :
> + intel_uncore_read_fw(uncore, wa->reg);
> + val &= ~wa->clr;
> + val |= wa->set;
> + }
> +
> *cs++ = i915_mmio_reg_offset(wa->reg);
> - *cs++ = wa->set;
> + *cs++ = val;
> }
> *cs++ = MI_NOOP;
>
> + intel_uncore_forcewake_put__locked(uncore, fw);
> + spin_unlock(&uncore->lock);
> + intel_gt_mcr_unlock(wal->gt, flags);
> +
> intel_ring_advance(rq, cs);
>
> ret = rq->engine->emit_flush(rq, EMIT_BARRIER);
>
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2023-06-27 18:32 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-24 17:17 [Intel-gfx] [PATCH v2 0/6] Fix ctx workarounds for non-masked regs Lucas De Marchi
2023-06-24 17:17 ` Lucas De Marchi
2023-06-24 17:17 ` [Intel-gfx] [PATCH v2 1/6] drm/i915/gt: Move wal_get_fw_for_rmw() Lucas De Marchi
2023-06-24 17:17 ` Lucas De Marchi
2023-06-24 17:17 ` [Intel-gfx] [PATCH v2 2/6] drm/i915/gt: Clear all bits from GEN12_FF_MODE2 Lucas De Marchi
2023-06-24 17:17 ` Lucas De Marchi
2023-06-25 18:39 ` [Intel-gfx] " Matt Roper
2023-06-25 18:39 ` Matt Roper
2023-06-24 17:17 ` [Intel-gfx] [PATCH v2 3/6] drm/i915/gt: Fix context workarounds with non-masked regs Lucas De Marchi
2023-06-24 17:17 ` Lucas De Marchi
2023-06-27 18:32 ` Kenneth Graunke [this message]
2023-06-27 18:32 ` Kenneth Graunke
2023-06-24 17:17 ` [Intel-gfx] [PATCH v2 4/6] drm/i915/gt: Drop read from GEN8_L3CNTLREG in ICL workaround Lucas De Marchi
2023-06-24 17:17 ` Lucas De Marchi
2023-06-24 17:17 ` [Intel-gfx] [PATCH v2 5/6] drm/i915/gt: Enable read back on XEHP_FF_MODE2 Lucas De Marchi
2023-06-24 17:17 ` Lucas De Marchi
2023-06-24 17:17 ` [Intel-gfx] [PATCH v2 6/6] drm/i915/gt: Remove bogus comment on IVB_FBC_RT_BASE_UPPER Lucas De Marchi
2023-06-24 17:17 ` Lucas De Marchi
2023-06-27 18:30 ` [Intel-gfx] " Kenneth Graunke
2023-06-27 18:30 ` Kenneth Graunke
2023-06-28 4:02 ` [Intel-gfx] " Lucas De Marchi
2023-06-28 4:02 ` Lucas De Marchi
2023-06-25 6:42 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for Fix ctx workarounds for non-masked regs Patchwork
2023-06-29 3:42 ` [Intel-gfx] ✗ Fi.CI.DOCS: warning for Fix ctx workarounds for non-masked regs (rev2) Patchwork
2023-06-29 8:46 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-06-29 16:56 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2394779.plcHdMng3P@mizzik \
--to=kenneth@whitecape.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-gfx@lists.freedesktop.org \
--cc=lucas.demarchi@intel.com \
--cc=matthew.d.roper@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.