From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: "Yang, Fei" <fei.yang@intel.com>,
"intel-gfx@lists.freedesktop.org"
<intel-gfx@lists.freedesktop.org>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>,
"Roper, Matthew D" <matthew.d.roper@intel.com>,
"dri-devel@lists.freedesktop.org"
<dri-devel@lists.freedesktop.org>
Subject: Re: [Intel-gfx] [PATCH 7/8] drm/i915: use pat_index instead of cache_level
Date: Mon, 24 Apr 2023 10:22:59 +0100 [thread overview]
Message-ID: <44564d4e-57c9-cc04-e146-fa065ab784de@linux.intel.com> (raw)
In-Reply-To: <BYAPR11MB2567E4680658824BD81D01229A669@BYAPR11MB2567.namprd11.prod.outlook.com>
On 23/04/2023 07:52, Yang, Fei wrote:
>> On 20/04/2023 00:00, fei.yang@intel.com wrote:
>>> From: Fei Yang <fei.yang@intel.com>
>>>
>>> Currently the KMD is using enum i915_cache_level to set caching policy for
>>> buffer objects. This is flaky because the PAT index which really controls
>>> the caching behavior in PTE has far more levels than what's defined in the
>>> enum. In addition, the PAT index is platform dependent, having to translate
>>> between i915_cache_level and PAT index is not reliable, and makes the code
>>> more complicated.
>>>
>>> >From UMD's perspective there is also a necessity to set caching policy for
>>> performance fine tuning. It's much easier for the UMD to directly use PAT
>>> index because the behavior of each PAT index is clearly defined in Bspec.
>>> Having the abstracted i915_cache_level sitting in between would only cause
>>> more ambiguity.
>>>
>>> For these reasons this patch replaces i915_cache_level with PAT index. Also
>>> note, the cache_level is not completely removed yet, because the KMD still
>>> has the need of creating buffer objects with simple cache settings such as
>>> cached, uncached, or writethrough. For such simple cases, using cache_level
>>> would help simplify the code.
>>>
>>> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
>>> Cc: Matt Roper <matthew.d.roper@intel.com>
>>> Signed-off-by: Fei Yang <fei.yang@intel.com>
>>> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
>>
>> I think have some ideas no how to perhaps make this simpler, please bear
>> with me.
>>
>> In my mind get/set caching ioctls need to be failing once explicit pat
>> index has been set by userspace. Or at least not return false information.
>
> By design we are ending the support for set caching ioctl. The patch is included
> in this series, "drm/i915/mtl: end support for set caching ioctl"
>
> + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
> + return -EOPNOTSUPP;
> +
>
>> And I don't like i915_gem_object_has_cache_level and
>> i915_gem_get_pat_index as a refactoring step.
>>
>> It also seems that the driver has a need to query the caching mode set
>> regardless of the route (of setting).
>
> Only for the objects created by the KMD. For UMD created objects with PAT
> index set KMD should never touch the setting.
>
>> So how about this.
>>
>> Three callers which query the caching mode: use_cpu_reloc, vm_fault_gtt,
>> gpu_write_needs_clflush.
>>
>> We convert them to be like:
>>
>> i915_gem_object_has_caching_mode(obj, PAT_UC / PAT_WT / ...);
>
> PAT_UC/WT/WB are platform dependent (https://gfxspecs.intel.com/Predator/Home/Index/45101),
> performing this check you would have to do something like,
>
> if (MTL)
> ...
> else if (PVC)
> ...
> else if (GEN12)
> ...
> else
> ...
No, it would be doable with a table as I suggested below. And that table
could be re-used for debugfs pretty-printing simplifying that code too.
>> Then apart from the per platform tables for mapping between cache level
>> to pat index, you add tables which map pat index to caching modes
>> (PAT_UC, etc, naming TBD, just enums or bitmasks also TBD, I haven't
>> looked at the bspec to see how exactly it works).
>>
>> You would use that table in the i915_gem_object_has_caching_mode helper,
>> called from the above three functions instead of obj->cache_level direct
>> comparison.
>>
>> I am assuming at least for instance cache_level != I915_CACHE_NONE would
>> be equivalent to i915_gem_object_has_caching_mode(obj, PAT_UC), etc.
>
> So far kernel only needs 4 cache levels defined in enum i915_cache_level,
> kernel doesn't need to understand all PAT indices. By desgin if the userspace
> is setting PAT index directly, kernel only needs to pass the setting to PTE.
>
> For objects created by kernel (including objects created by userspace without
> specifying pat index), there are only 4 options (defined in the cachelevel_to_pat).
>
> For objects created by userspace with PAT index set (GEM_CREATE + set_pat extension),
> kernel should not touch the setting, just pass it to the PAT index bits in PTE.
>
> That's why I was only checking cache_level. Handling PAT index is much more
> complicated because of its platform dependent nature and even the number of
> PAT indices varies from platform to platform. Fortunately kernel doesn't need
> to understand that.
Yeah but I think you maybe missed the spirit of my proposal - which is
to simplify the internal code paths by not having the duality of
cache_level-vs-pat almost all the way down. But instead cut it at the
top API level.
You have this:
+ .cachelevel_to_pat = { \
+ [I915_CACHE_NONE] = 0, \
+ [I915_CACHE_LLC] = 1, \
+ [I915_CACHE_L3_LLC] = 2, \
+ [I915_CACHE_WT] = 3, \
+ }
I propose to add something like:
.legacy_platform_pat = { /* pat index to driver logical flags */
[0] = PAT_UC,
[1] = PAT_WB | PAT_LLC, /* Just illustrating the principle */
};
i915->platform_pat = &legacy_platform_pat
i915_gem_object_has_caching_mode(obj, PAT_UC)
...
return i915->platform_pat & PAT_UC == PAT_UC /* give or take */
get/set_caching_ioctl
{
...
if (obj->has_pat_index) /* set in the extension only */
return -EINVAL;
debugfs:
i915_show_pat_flags(i915->platform_pat[obj->pat_index]); /* generic! */
etc...
Set obj->pat_index in the extension or set_cache_level _only_. No
internal code paths then need to use anything but it. No sprinkling of
conversion helpers needed or dubious has_cache_level query.
Regards,
Tvrtko
>> Same mapping table could also be used in debugfs (i915_cache_level_str)
>> to universally describe any obj->pat_index, with no need to have
>> anything platform dependend there.
>>
>> In set caching set you always set obj->pat_index and so low level code
>> can always just use that.
>>
>> Unless I am missing something (possible) I think like that we end up
>> with no i915_gem_get_pat_index sprinkled around and also no confusing
>> i915_gem_object_has_cache_level.
>>
>> Obj->pat_index would be a single point of truth, while obj->cache_level
>> is just a legacy field for get/set_caching ioctl - not used in the
>> internal driver flows.
>>
>> We would need an additional field for storing the boolean of whether
>> userspace had overriden the PAT.
>>
>> Regards,
>>
>> Tvrtko
next prev parent reply other threads:[~2023-04-24 9:23 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-19 23:00 [Intel-gfx] [PATCH 0/8] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
2023-04-19 23:00 ` [Intel-gfx] [PATCH 1/8] drm/i915/mtl: Set has_llc=0 fei.yang
2023-04-20 10:20 ` Das, Nirmoy
2023-04-19 23:00 ` [Intel-gfx] [PATCH 2/8] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
2023-04-20 20:29 ` Matt Roper
2023-04-19 23:00 ` [Intel-gfx] [PATCH 3/8] drm/i915/mtl: Add PTE encode function fei.yang
2023-04-20 20:40 ` Matt Roper
2023-04-21 17:27 ` Yang, Fei
2023-04-21 17:42 ` Matt Roper
2023-04-23 7:37 ` Yang, Fei
2023-04-24 17:20 ` Matt Roper
2023-04-24 18:41 ` Yang, Fei
2023-04-19 23:00 ` [Intel-gfx] [PATCH 4/8] drm/i915/mtl: workaround coherency issue for Media fei.yang
2023-04-20 8:26 ` Andrzej Hajda
2023-04-20 11:36 ` Das, Nirmoy
2023-04-20 20:52 ` Matt Roper
2023-04-19 23:00 ` [Intel-gfx] [PATCH 5/8] drm/i915/mtl: end support for set caching ioctl fei.yang
2023-04-20 21:05 ` Matt Roper
2023-04-19 23:00 ` [Intel-gfx] [PATCH 6/8] drm/i915: preparation for using PAT index fei.yang
2023-04-20 8:45 ` Andrzej Hajda
2023-04-20 21:14 ` Matt Roper
2023-04-19 23:00 ` [Intel-gfx] [PATCH 7/8] drm/i915: use pat_index instead of cache_level fei.yang
2023-04-20 10:13 ` Andrzej Hajda
2023-04-20 12:39 ` Tvrtko Ursulin
2023-04-20 20:34 ` Yang, Fei
2023-04-21 8:43 ` Tvrtko Ursulin
2023-04-21 10:17 ` Tvrtko Ursulin
2023-04-23 6:12 ` Yang, Fei
2023-04-24 8:41 ` Tvrtko Ursulin
2023-04-21 11:39 ` Tvrtko Ursulin
2023-04-23 6:52 ` Yang, Fei
2023-04-24 9:22 ` Tvrtko Ursulin [this message]
2023-04-19 23:00 ` [Intel-gfx] [PATCH 8/8] drm/i915: Allow user to set cache at BO creation fei.yang
2023-04-20 11:39 ` Andi Shyti
2023-04-20 13:06 ` Tvrtko Ursulin
2023-04-20 16:11 ` Yang, Fei
2023-04-20 16:29 ` Andi Shyti
2023-04-21 20:48 ` Jordan Justen
[not found] ` <BYAPR11MB2567F03AD43D7E2DE2628D5D9A669@BYAPR11MB2567.namprd11.prod.outlook.com>
[not found] ` <168232538771.392286.3227368099155268955@jljusten-skl>
2023-04-24 9:08 ` Tvrtko Ursulin
2023-04-24 17:13 ` Jordan Justen
2023-04-25 13:41 ` [Intel-gfx] IOCTL feature detection (Was: Re: [PATCH 8/8] drm/i915: Allow user to set cache at BO creation) Joonas Lahtinen
2023-04-25 17:21 ` Teres Alexis, Alan Previn
2023-04-25 18:19 ` Jordan Justen
2023-04-26 11:52 ` Daniel Vetter
2023-04-26 16:48 ` Teres Alexis, Alan Previn
2023-04-26 18:10 ` Ceraolo Spurio, Daniele
2023-04-26 20:04 ` Jordan Justen
2023-04-19 23:29 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/mtl: Define MOCS and PAT tables for MTL (rev8) Patchwork
2023-04-19 23:51 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2023-04-20 11:30 ` [Intel-gfx] [PATCH 0/8] drm/i915/mtl: Define MOCS and PAT tables for MTL Andi Shyti
-- strict thread matches above, loose matches on Subject: below --
2023-04-19 21:12 fei.yang
2023-04-19 21:12 ` [Intel-gfx] [PATCH 7/8] drm/i915: use pat_index instead of cache_level fei.yang
2023-04-19 22:09 ` Andi Shyti
2023-04-19 18:09 [Intel-gfx] [PATCH 0/8] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
2023-04-19 18:09 ` [Intel-gfx] [PATCH 7/8] drm/i915: use pat_index instead of cache_level fei.yang
2023-04-17 6:24 [Intel-gfx] [PATCH 0/8] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
2023-04-17 6:25 ` [Intel-gfx] [PATCH 7/8] drm/i915: use pat_index instead of cache_level fei.yang
2023-04-19 12:16 ` Andi Shyti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=44564d4e-57c9-cc04-e146-fa065ab784de@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=chris.p.wilson@linux.intel.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=fei.yang@intel.com \
--cc=intel-gfx@lists.freedesktop.org \
--cc=matthew.d.roper@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox