Re: [Intel-gfx] [PATCH v2 4/4] drm/i915/: Re-work clflush_write32

From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Michael Cheng <michael.cheng@intel.com>, intel-gfx@lists.freedesktop.org
Cc: lucas.demarchi@intel.com, matthew.auld@intel.com,
	mika.kuoppala@intel.com
Subject: Re: [Intel-gfx] [PATCH v2 4/4] drm/i915/: Re-work clflush_write32
Date: Tue, 1 Feb 2022 16:32:14 +0000
Message-ID: <8177f292-3c69-b475-9efa-0fa00e9c37d4@linux.intel.com>
In-Reply-To: <c097fde2-7b69-7d7c-ef06-ca81edc9046d@intel.com>


On 01/02/2022 15:41, Michael Cheng wrote:
> Ah, thanks for the clarification! While discussion goes on about the 
> route you suggested, could we land these patches (after addressing the 
> reviews) to unblock compiling i915 on arm?

I am 60-40 leaning towards no, since follow-up can be hard. I'd prefer a
little bit of discussion before merging.

Also, what will be the Arm implementation of drm_clflush_virt_range? 
Noob question - why is i915 the only driver calling it? Do other GPUs 
never need to flush CPU cache?

Regards,

Tvrtko

> On 2022-02-01 1:25 a.m., Tvrtko Ursulin wrote:
>>
>> On 31/01/2022 17:02, Michael Cheng wrote:
>>> Hey Tvrtko,
>>>
>>> Are you saying when adding drm_clflush_virt_range(addr, sizeof(addr)),
>>> this function forces an x86 code path only? If that is the case,
>>> drm_clflush_virt_range(addr, sizeof(addr)) currently has ifdefs that
>>> separate out x86 and powerpc, so we can add an ifdef for arm in the
>>> near future when needed.
>>
>> No, I was noticing that the change you are making in this patch, while
>> it indeed fixes a build failure, is in a code path which does not get
>> executed on Arm at all.
>>
>> So what effectively happens is a single assembly instruction gets 
>> replaced with a function call on all integrated GPUs up to and 
>> including Tigerlake.
>>
>> That was the slightly annoying part I was referring to and asking 
>> whether it was discussed before.
>>
>> Sadly I don't think there is a super nice solution apart from 
>> duplicating drm_clflush_virt_range as for example i915_clflush_range 
>> and having it static inline. That would allow the integrated GPU code 
>> path to remain of the same performance profile, while solving the Arm 
>> problem. However it would be code duplication so might be frowned upon.
>>
>> I'd be tempted to go that route but it is something which needs a bit 
>> of discussion if that hasn't happened already.
>>
>> Regards,
>>
>> Tvrtko
>>
>>> On 2022-01-31 6:55 a.m., Tvrtko Ursulin wrote:
>>>> On 28/01/2022 22:10, Michael Cheng wrote:
>>>>> Use drm_clflush_virt_range instead of clflushopt and remove the memory
>>>>> barrier, since drm_clflush_virt_range takes care of that.
>>>>>
>>>>> Signed-off-by: Michael Cheng <michael.cheng@intel.com>
>>>>> ---
>>>>>   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 8 +++-----
>>>>>   1 file changed, 3 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>>>>> index 498b458fd784..0854276ff7ba 100644
>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>>>>> @@ -1332,10 +1332,8 @@ static void *reloc_vaddr(struct i915_vma *vma,
>>>>>   static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
>>>>>   {
>>>>>       if (unlikely(flushes & (CLFLUSH_BEFORE | CLFLUSH_AFTER))) {
>>>>> -        if (flushes & CLFLUSH_BEFORE) {
>>>>> -            clflushopt(addr);
>>>>> -            mb();
>>>>> -        }
>>>>> +        if (flushes & CLFLUSH_BEFORE)
>>>>> +            drm_clflush_virt_range(addr, sizeof(addr));
>>>>>             *addr = value;
>>>>> @@ -1347,7 +1345,7 @@ static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
>>>>>            * to ensure ordering of clflush wrt to the system.
>>>>>            */
>>>>>           if (flushes & CLFLUSH_AFTER)
>>>>> -            clflushopt(addr);
>>>>> +            drm_clflush_virt_range(addr, sizeof(addr));
>>>>>       } else
>>>>>           *addr = value;
>>>>>   }
>>>>
>>>> Slightly annoying thing here (maybe in some other patches from the 
>>>> series as well) is that the change adds a function call to x86 only 
>>>> code path, because relocations are not supported on discrete as per:
>>>>
>>>> static int
>>>> eb_validate_vma(...)
>>>>         /* Relocations are disallowed for all platforms after TGL-LP. This
>>>>          * also covers all platforms with local memory.
>>>>          */
>>>>
>>>>         if (entry->relocation_count &&
>>>>             GRAPHICS_VER(eb->i915) >= 12 && !IS_TIGERLAKE(eb->i915))
>>>>                 return -EINVAL;
>>>>
>>>> How acceptable would it be, for the whole series, to introduce a static
>>>> inline i915 clflush wrapper and so be able to avoid function calls
>>>> on x86? Is this something that has been discussed and discounted
>>>> already?
>>>>
>>>> Regards,
>>>>
>>>> Tvrtko
>>>>
>>>> P.S. Hmm I am now reminded of my really old per-platform build
>>>> patches. With them you would be able to compile out large portions
>>>> of the driver when building for ARM. Probably about a third, if my
>>>> memory serves me right.
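
For illustration, the static inline wrapper floated in this thread (duplicating drm_clflush_virt_range as, for example, i915_clflush_range) might look roughly like the sketch below. It is a userspace approximation under several assumptions, not the driver's code: it uses the SSE2 intrinsics _mm_clflush() and _mm_mfence() in place of the kernel's clflushopt() and mb(), it assumes a 64-byte cache line, the SKETCH_* names and sketch_clflush_write32() are hypothetical stand-ins for CLFLUSH_BEFORE, CLFLUSH_AFTER and clflush_write32(), and the non-x86 branch is only a placeholder rather than a real Arm implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>            /* _mm_clflush(), _mm_mfence() */
#endif

#define SKETCH_CACHELINE    64u   /* assumed cache line size */
#define SKETCH_FLUSH_BEFORE 0x1u  /* hypothetical stand-in for CLFLUSH_BEFORE */
#define SKETCH_FLUSH_AFTER  0x2u  /* hypothetical stand-in for CLFLUSH_AFTER */

/* Hypothetical wrapper, named after the i915_clflush_range suggestion. */
static inline void i915_clflush_range(void *addr, size_t length)
{
#if defined(__SSE2__)
	/* Round down to the start of the first cache line touched. */
	uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(SKETCH_CACHELINE - 1);
	uintptr_t end = (uintptr_t)addr + length;

	_mm_mfence();             /* order earlier stores before the flushes */
	for (; p < end; p += SKETCH_CACHELINE)
		_mm_clflush((const void *)p);
	_mm_mfence();             /* order the flushes before later accesses */
#else
	/* Placeholder only: other architectures need their own maintenance. */
	(void)addr;
	(void)length;
#endif
}

/* Same shape as clflush_write32() in the patch, built on the inline wrapper. */
static inline void sketch_clflush_write32(uint32_t *addr, uint32_t value,
					  unsigned int flushes)
{
	assert(addr != NULL);

	if (flushes & SKETCH_FLUSH_BEFORE)
		i915_clflush_range(addr, sizeof(*addr));

	*addr = value;

	if (flushes & SKETCH_FLUSH_AFTER)
		i915_clflush_range(addr, sizeof(*addr));
}
```

Because both helpers are static inline, a caller such as sketch_clflush_write32(&slot, value, SKETCH_FLUSH_BEFORE | SKETCH_FLUSH_AFTER) lets the compiler emit the flush instructions in place instead of a call into a separate module, which is the performance concern raised for the integrated GPU path. The cost, as noted in the thread, is duplicated flushing logic.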

Thread overview: 22+ messages
2022-01-28 22:10 [Intel-gfx] [PATCH v2 0/4] Use drm_clflush* instead of clflush Michael Cheng
2022-01-28 22:10 ` [Intel-gfx] [PATCH v2 1/4] drm/i915/gt: Re-work intel_write_status_page Michael Cheng
2022-01-29  7:21   ` Bowman, Casey G
2022-01-28 22:10 ` [Intel-gfx] [PATCH v2 2/4] drm/i915/gt: Re-work invalidate_csb_entries Michael Cheng
2022-01-29  7:21   ` Bowman, Casey G
2022-01-31 13:51   ` Tvrtko Ursulin
2022-01-31 14:15     ` Mika Kuoppala
2022-02-01  9:32       ` Tvrtko Ursulin
2022-01-28 22:10 ` [Intel-gfx] [PATCH v2 3/4] drm/i915/gt: Re-work reset_csb Michael Cheng
2022-01-29  7:23   ` Bowman, Casey G
2022-01-28 22:10 ` [Intel-gfx] [PATCH v2 4/4] drm/i915/: Re-work clflush_write32 Michael Cheng
2022-01-29  7:24   ` Bowman, Casey G
2022-01-31 14:55   ` Tvrtko Ursulin
2022-01-31 17:02     ` Michael Cheng
2022-02-01  9:25       ` Tvrtko Ursulin
2022-02-01 15:41         ` Michael Cheng
2022-02-01 16:32           ` Tvrtko Ursulin [this message]
2022-02-02 16:35             ` Michael Cheng
2022-01-28 22:31 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Use drm_clflush* instead of clflush (rev2) Patchwork
2022-01-28 22:33 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2022-01-28 22:59 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-01-29  3:00 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
