Re: [PATCH 2/3] drm/i915/gt: Fix context workarounds with non-masked regs

Linux kernel -stable discussions
 help / color / mirror / Atom feed

From: Matt Roper <matthew.d.roper@intel.com>
To: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>,
	<intel-gfx@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>, <stable@vger.kernel.org>
Subject: Re: [PATCH 2/3] drm/i915/gt: Fix context workarounds with non-masked regs
Date: Fri, 23 Jun 2023 14:56:46 -0700	[thread overview]
Message-ID: <20230623215646.GJ5433@mdroper-desk1.amr.corp.intel.com> (raw)
In-Reply-To: <aet7uj3ldnjk5rdgc2fwsn42uqgx5kbxau3efvdgmnhq473s2m@d6uyzwskmfaf>

On Fri, Jun 23, 2023 at 02:05:20PM -0700, Lucas De Marchi wrote:
> On Fri, Jun 23, 2023 at 12:48:13PM -0700, Kenneth Graunke wrote:
> > On Friday, June 23, 2023 8:49:05 AM PDT Lucas De Marchi wrote:
> > > On Thu, Jun 22, 2023 at 04:37:21PM -0700, Kenneth Graunke wrote:
> > > >On Thursday, June 22, 2023 11:27:30 AM PDT Lucas De Marchi wrote:
> > > >> Most of the context workarounds tweak masked registers, but not all. For
> > > >> masked registers, when writing the value it's sufficient to just write
> > > >> the wa->set_bits since that will take care of both the clr and set bits
> > > >> as well as not overwriting other bits.
> > > >>
> > > >> However there are some workarounds, the registers are non-masked. Up
> > > >> until now the driver was simply emitting a MI_LOAD_REGISTER_IMM with the
> > > >> set_bits to program the register via the GPU in the WA bb. This has the
> > > >> side effect of overwriting the content of the register outside of bits
> > > >> that should be set and also doesn't handle the bits that should be
> > > >> cleared.
> > > >>
> > > >> Kenneth reported that on DG2, mesa was seeing a weird behavior due to
> > > >> the kernel programming of L3SQCREG5 in dg2_ctx_gt_tuning_init(). With
> > > >> the GPU idle, that register could be read via intel_reg as 0x00e001ff,
> > > >> but during a 3D workload it would change to 0x0000007f. So the
> > > >> programming of that tuning was affecting more than the bits in
> > > >> L3_PWM_TIMER_INIT_VAL_MASK. Matt Roper noticed the lack of rmw for the
> > > >> context workarounds due to the use of MI_LOAD_REGISTER_IMM.
> > > >>
> > > >> So, for registers that are not masked, read its value via mmio, modify
> > > >> and then set it in the buffer to be written by the GPU. This should take
> > > >> care in a simple way of programming just the bits required by the
> > > >> tuning/workaround. If in future there are registers that involved that
> > > >> can't be read by the CPU, a more complex approach may be required like
> > > >> a) issuing additional instructions to read and modify; or b) scan the
> > > >> golden context and patch it in place before saving it; or something
> > > >> else. But for now this should suffice.
> > > >>
> > > >> Scanning the context workarounds for all platforms, these are the
> > > >> impacted ones with the respective registers
> > > >>
> > > >> 	mtl: DRAW_WATERMARK
> > > >> 	mtl/dg2: XEHP_L3SQCREG5, XEHP_FF_MODE2
> > > >> 	gen12: GEN12_FF_MODE2
> > > >
> > > >Speaking of GEN12_FF_MODE2...there's a big scary comment above that
> > > >workaround write which says that register "will return the wrong value
> > > >when read."  I think with this patch, we'll start doing a RMW cycle for
> > > >the register, which could mix in some of this "wrong value".  The
> > > >comment mentions that the intention is to write the whole register,
> > > >as the default value is 0 for all fields.
> > > 
> > > Good point. That also means we don't need to backport this patch to
> > > stable kernel to any gen12, since overwritting the other bits is
> > > actually the intended behavior.
> > > 
> > > >
> > > >Maybe what we want to do is change gen12_ctx_gt_tuning_init to do
> > > >
> > > >    wa_write(wal, GEN12_FF_MODE2, FF_MODE2_TDS_TIMER_128);
> > > >
> > > >so it has a clear mask of ~0 instead of FF_MODE2_TDS_TIMER_MASK, and
> > > 
> > > In order to ignore read back when verifying, we would still need to use
> > > wa_add(), but changing the mask. We don't have a wa_write() that ends up
> > > with { .clr = ~0, .read_mask = 0 }.
> > > 
> > > 	wa_add(wal,
> > > 	       GEN12_FF_MODE2,
> > > 	       ~0, FF_MODE2_TDS_TIMER_128,
> > > 	       0, false);
> > 
> > Good point!  Though, I just noticed another bug here:
> > 
> > gen12_ctx_workarounds_init sets FF_MODE2_GS_TIMER_224 to avoid hangs
> > in the HS/DS unit, after gen12_ctx_gt_tuning_init set TDS_TIMER_128
> > for performance.  One of those is going to clobber the other; we're
> > likely losing the TDS tuning today.  Combining those workarounds into
> 
> we are not losing it today. As long as the wa list is the same, we do detect collisions when
> adding workarounds and they are coallesced before applying. However,
> indeed if we change this to make clear be ~0, then they will collide and
> we will see a warning.
> 
> Applying them together in a single operation would indeed solve it
> with a side-effect of moving this back to the workarounds. Either that
> or
> 
> a) we handle the read_back == 0 && clear == U32_MAX specially when
>    adding WAs. If that is true, then the check for collisions can
>    be adjusted to allow that.
> 
> b) we give up on this approach and proceed with one of
> 
> 	1) scan the ctx wa list. If it has any non-masked register,
> 	   we submit a job to read it from the GPU side. MCR will
> 	   make this harder as the steering from the GPU side is
> 	   different than the CPU
> 
> 	2) emit additional commands to read and modify the register from
> 	   the GPU side
> 
> 	3) find the register in the golden context and patch it in place
> 
> 
> 
> 
> > one place seems like an easy way to fix that.
> 
> I'm leaning towards this option in the hope we don't have have
> another GEN12_FF_MODE2 in future.
> 
> Matt, we've been pushing towards separating the tuning from the WAs, but
> here we'd go the other way. Anything against doing this for now?

That's probably fine as long as we leave a comment behind in the tuning
section explaining why that specific setting is found in a different
spot.


Matt


> 
> thanks
> Lucas De Marchi
> 
> > 
> > > >then in this patch update your condition below from
> > > >
> > > >+		if (wa->masked_reg || wa->set == U32_MAX) {
> > > >
> > > >to
> > > >
> > > >+		if (wa->masked_reg || wa->set == U32_MAX || wa->clear == U32_MAX) {
> > > 
> > > yeah... and maybe also warn if wa->read is 0, which means it's one
> > > of the registers we can't/shouldn't read from the CPU.
> > > 
> > > >
> > > >because if we're clearing all bits then we don't care about doing a
> > > >read-modify-write either.
> > > 
> > > thanks
> > > Lucas De Marchi
> > > 
> > > >
> > > >--Ken
> > > 
> > > 
> > > 
> > 
> 
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

next prev parent reply	other threads:[~2023-06-23 21:57 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-22 18:27 [PATCH 1/3] drm/i915/gt: Move wal_get_fw_for_rmw() Lucas De Marchi
2023-06-22 18:27 ` [PATCH 2/3] drm/i915/gt: Fix context workarounds with non-masked regs Lucas De Marchi
2023-06-22 23:37   ` Kenneth Graunke
2023-06-23 15:49     ` Lucas De Marchi
2023-06-23 19:48       ` Kenneth Graunke
2023-06-23 21:05         ` Lucas De Marchi
2023-06-23 21:56           ` Matt Roper [this message]
2023-06-24 18:12             ` [Intel-gfx] " Lucas De Marchi
2023-06-22 18:27 ` [PATCH 3/3] drm/i915/gt: Drop read from GEN8_L3CNTLREG in ICL workaround Lucas De Marchi
2023-06-22 18:29   ` kernel test robot
2023-06-22 23:37 ` [PATCH 1/3] drm/i915/gt: Move wal_get_fw_for_rmw() Kenneth Graunke

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230623215646.GJ5433@mdroper-desk1.amr.corp.intel.com \
    --to=matthew.d.roper@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=kenneth@whitecape.org \
    --cc=lucas.demarchi@intel.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox