Re: [Intel-gfx] [RFC 2/2] drm/i915: Remove PAT hack from i915_gem_object_can_bypass_llc

From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: "Intel-gfx@lists.freedesktop.org"
	<Intel-gfx@lists.freedesktop.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	Chris Wilson <chris.p.wilson@linux.intel.com>
Subject: Re: [Intel-gfx] [RFC 2/2] drm/i915: Remove PAT hack from i915_gem_object_can_bypass_llc
Date: Mon, 17 Jul 2023 11:55:30 +0100
Message-ID: <b9afd2fe-426e-7d4a-2768-44c6d2507e29@linux.intel.com>
In-Reply-To: <20230715002023.GA138014@mdroper-desk1.amr.corp.intel.com>


On 15/07/2023 01:20, Matt Roper wrote:
> On Fri, Jul 14, 2023 at 11:11:30AM +0100, Tvrtko Ursulin wrote:
>>
>> On 14/07/2023 06:43, Yang, Fei wrote:
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> According to the comment in i915_gem_object_can_bypass_llc the
>>>> purpose of the function is to return false if the platform/object
>>>> has a caching mode where GPU can bypass the LLC.
>>>>
>>>> So far the only platforms which allegedly can do this are Jasperlake
>>>> and Elkhartlake, and that via MOCS (not PAT).
>>>>
>>>> Instead of blindly assuming that objects where userspace has set the
>>>> PAT index can (bypass the LLC), question is is there a such PAT index
>>>> on a platform. Probably starting with Meteorlake since that one is the
>>>> only one where set PAT extension can be currently used. Or if there is
>>>> a MOCS entry which can achieve the same thing on Meteorlake.
>>>>
>>>> If there is such PAT, now that i915 can be made to understand them
>>>> better, we can make the check more fine grained. Or if there is a MOCS
>>>> entry then we probably should apply the blanket IS_METEORLAKE condition.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> Fixes: 9275277d5324 ("drm/i915: use pat_index instead of cache_level")
>>>> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
>>>> Cc: Fei Yang <fei.yang@intel.com>
>>>> Cc: Andi Shyti <andi.shyti@linux.intel.com>
>>>> Cc: Matt Roper <matthew.d.roper@intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/gem/i915_gem_object.c | 6 ------
>>>>    1 file changed, 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>>> index 33a1e97d18b3..1e34171c4162 100644
>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>>> @@ -229,12 +229,6 @@ bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj)
>>>>         if (!(obj->flags & I915_BO_ALLOC_USER))
>>>>                 return false;
>>>>
>>>> -     /*
>>>> -      * Always flush cache for UMD objects at creation time.
>>>> -      */
>>>> -     if (obj->pat_set_by_user)
>>>
>>> I'm afraid this is going to break MESA. Can we run MESA tests with this patch?
>>
>> I can't, but question is why it would break Mesa which would need a nice
>> comment here?
>>
>> For instance should the check be IS_METEORLAKE?
>>
>> Or should it be "is wb" && "not has 1-way coherent"?
>>
>> Or both?
>>
>> Or, given how Meteorlake does not have LLC, how can anything bypass it
>> there? Or is it about snooping on Meteorlake and how?
> 
> I think the "LLC" in the function name is a bit misleading since this is
> really all just about the ability to avoid coherency (which might come
> from an LLC on some platforms or from snooping on others).
> 
> The concern is that the CPU writes to the buffer and those writes sit in
> a CPU cache without making it to RAM immediately.  If the GPU then
> reads the object with any of the non-coherent PAT settings that were
> introduced in Xe_LPG, it will not snoop the CPU cache and will read old,
> stale data from RAM.
> 
> So I think we'd want a condition like ("Xe_LPG or later" && "any non
> coherent PAT").  The WB/WT/UC status of the GPU behavior shouldn't
> matter here, just the coherency setting.

Right, sounds plausible to me. So with this series the new condition in this function would look like this:

i915_gem_object_can_bypass_llc(..)
{
...
	if (i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_WB) &&
	    i915_gem_object_has_cache_flag(obj, I915_CACHE_FLAG_COH1W) != 1)
		return true;

("!= 1" in the condition means either that the object is not coherent, or that i915 does not know because the table is incomplete, for example when a PAT index on some future platform was never defined.)
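
To illustrate the "!= 1" semantics, here is a minimal standalone sketch, not the code from this series. All names (cache_entry, entry_has_mode, entry_has_flag, entry_can_bypass) are hypothetical stand-ins, and the tri-state return convention (1 = yes, 0 = no, -1 = unknown) is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not the i915 definitions. */
enum cache_mode { MODE_UNKNOWN = 0, MODE_UC, MODE_WT, MODE_WB };

#define FLAG_COH1W (1u << 0)
#define FLAG_COH2W (1u << 1)

struct cache_entry {
	enum cache_mode mode;	/* MODE_UNKNOWN: PAT index missing from the table */
	unsigned int flags;
};

/* Assumed convention: 1 = mode matches, 0 = does not match, -1 = unknown. */
static int entry_has_mode(const struct cache_entry *e, enum cache_mode mode)
{
	assert(e != NULL);
	if (e->mode == MODE_UNKNOWN)
		return -1;
	return e->mode == mode ? 1 : 0;
}

/* Assumed convention: 1 = flag set, 0 = flag clear, -1 = unknown. */
static int entry_has_flag(const struct cache_entry *e, unsigned int flag)
{
	assert(e != NULL);
	if (e->mode == MODE_UNKNOWN)
		return -1;
	return (e->flags & flag) ? 1 : 0;
}

/*
 * Same shape as the condition above: -1 is truthy in C, and -1 != 1,
 * so an unknown entry is conservatively treated as able to bypass.
 */
static bool entry_can_bypass(const struct cache_entry *e)
{
	return entry_has_mode(e, MODE_WB) &&
	       entry_has_flag(e, FLAG_COH1W) != 1;
}
```

Under these assumptions a plain WB entry and an unknown entry both report true, while a WB entry with 1-way coherency reports false. Comparing against 1 rather than testing for 0 is what makes the unknown case fall on the safe side.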

That would catch any platform with non-coherent WB, as long as the PAT-to-i915-cache-mode tables are correct. It would currently only apply to Meteorlake:

#define MTL_CACHE_MODES \
	.cache_modes = { \
		[0] = I915_CACHE(WB), \
		[1] = I915_CACHE(WT), \
		[2] = I915_CACHE(UC), \
		[3] = _I915_CACHE(WB, COH1W), \
		[4] = __I915_CACHE(WB, BIT(I915_CACHE_FLAG_COH1W) | BIT(I915_CACHE_FLAG_COH2W)), \
	}

Or are you saying it should apply to UC and WT too somehow?
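
To make that question concrete, here is a small standalone sketch (again not i915 code) that evaluates two candidate predicates over a copy of the Meteorlake table above: "non-coherent WB", as in my snippet, and "any non-coherent PAT", which is how I read Matt's suggestion. The names pat_entry, mtl_pat, noncoherent_wb and noncoherent_any are hypothetical, and the table contents are simply transcribed from the MTL_CACHE_MODES macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only. */
enum pat_mode { PAT_UC, PAT_WT, PAT_WB };

#define COH1W (1u << 0)
#define COH2W (1u << 1)

struct pat_entry {
	enum pat_mode mode;
	unsigned int coh;	/* coherency flags */
};

/* Transcribed from the MTL_CACHE_MODES table above. */
static const struct pat_entry mtl_pat[] = {
	[0] = { PAT_WB, 0 },
	[1] = { PAT_WT, 0 },
	[2] = { PAT_UC, 0 },
	[3] = { PAT_WB, COH1W },
	[4] = { PAT_WB, COH1W | COH2W },
};

#define MTL_PAT_COUNT (sizeof(mtl_pat) / sizeof(mtl_pat[0]))

/* Candidate A: only write-back entries without 1-way coherency. */
static bool noncoherent_wb(size_t idx)
{
	assert(idx < MTL_PAT_COUNT);
	return mtl_pat[idx].mode == PAT_WB && !(mtl_pat[idx].coh & COH1W);
}

/* Candidate B: any entry without 1-way coherency, whatever the mode. */
static bool noncoherent_any(size_t idx)
{
	assert(idx < MTL_PAT_COUNT);
	return !(mtl_pat[idx].coh & COH1W);
}
```

With this table the two candidates agree on indices 0, 3 and 4 and differ only on indices 1 (WT) and 2 (UC), which candidate B also flags. Whether those two indices need the flush is exactly the open question.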

I'll also try to fold the sub-thread with Fei's reply in here.

So in terms of the stated issue with _CPU_ access from Mesa seeing stale data (non-zeroed pages) depending on the PAT index - I don't understand that yet. That seems to be entirely a CPU cache problem, and I do not understand how the PAT index gets into the picture.

But Fei, the patch proposed in your email looks like it would be covered by the snippet I have in this reply.

Regards,

Tvrtko

