From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Michael Cheng <michael.cheng@intel.com>, intel-gfx@lists.freedesktop.org
Cc: lucas.demarchi@intel.com, matthew.auld@intel.com,
mika.kuoppala@intel.com
Subject: Re: [Intel-gfx] [PATCH v2 4/4] drm/i915/: Re-work clflush_write32
Date: Tue, 1 Feb 2022 16:32:14 +0000
Message-ID: <8177f292-3c69-b475-9efa-0fa00e9c37d4@linux.intel.com>
In-Reply-To: <c097fde2-7b69-7d7c-ef06-ca81edc9046d@intel.com>
On 01/02/2022 15:41, Michael Cheng wrote:
> Ah, thanks for the clarification! While discussion goes on about the
> route you suggested, could we land these patches (after addressing the
> reviews) to unblock compiling i915 on arm?

I am 60-40 leaning towards no, since follow-up can be hard. I'd prefer a
little bit of discussion before merging.

Also, what will be the Arm implementation of drm_clflush_virt_range?
Noob question - why is i915 the only driver calling it? Do other GPUs
never need to flush CPU cache?

Regards,

Tvrtko

> On 2022-02-01 1:25 a.m., Tvrtko Ursulin wrote:
>>
>> On 31/01/2022 17:02, Michael Cheng wrote:
>>> Hey Tvrtko,
>>>
>>> Are you saying that adding drm_clflush_virt_range(addr, sizeof(addr))
>>> forces an x86-only code path? If that is the case,
>>> drm_clflush_virt_range(addr, sizeof(addr)) currently has ifdefs that
>>> separate out x86 and powerpc, so we can add an ifdef for arm in the
>>> near future when needed.
>>
>> No, I was noticing that the change you are making in this patch, while
>> it indeed fixes a build failure, touches a code path which does not get
>> executed on Arm at all.
>>
>> So what effectively happens is a single assembly instruction gets
>> replaced with a function call on all integrated GPUs up to and
>> including Tigerlake.
>>
>> That was the slightly annoying part I was referring to and asking
>> whether it was discussed before.
>>
>> Sadly I don't think there is a super nice solution apart from
>> duplicating drm_clflush_virt_range as for example i915_clflush_range
>> and having it static inline. That would allow the integrated GPU code
>> path to remain of the same performance profile, while solving the Arm
>> problem. However it would be code duplication so might be frowned upon.
>>
>> I'd be tempted to go that route but it is something which needs a bit
>> of discussion if that hasn't happened already.
>>
>> Regards,
>>
>> Tvrtko
>>
>>> On 2022-01-31 6:55 a.m., Tvrtko Ursulin wrote:
>>>> On 28/01/2022 22:10, Michael Cheng wrote:
>>>>> Use drm_clflush_virt_range instead of clflushopt and remove the memory
>>>>> barrier, since drm_clflush_virt_range takes care of that.
>>>>>
>>>>> Signed-off-by: Michael Cheng <michael.cheng@intel.com>
>>>>> ---
>>>>> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 8 +++-----
>>>>> 1 file changed, 3 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>>>>> b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>>>>> index 498b458fd784..0854276ff7ba 100644
>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>>>>> @@ -1332,10 +1332,8 @@ static void *reloc_vaddr(struct i915_vma *vma,
>>>>> static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
>>>>> {
>>>>> if (unlikely(flushes & (CLFLUSH_BEFORE | CLFLUSH_AFTER))) {
>>>>> - if (flushes & CLFLUSH_BEFORE) {
>>>>> - clflushopt(addr);
>>>>> - mb();
>>>>> - }
>>>>> + if (flushes & CLFLUSH_BEFORE)
>>>>> + drm_clflush_virt_range(addr, sizeof(addr));
>>>>> *addr = value;
>>>>> @@ -1347,7 +1345,7 @@ static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
>>>>> * to ensure ordering of clflush wrt to the system.
>>>>> */
>>>>> if (flushes & CLFLUSH_AFTER)
>>>>> - clflushopt(addr);
>>>>> + drm_clflush_virt_range(addr, sizeof(addr));
>>>>> } else
>>>>> *addr = value;
>>>>> }
>>>>
>>>> Slightly annoying thing here (maybe in some other patches from the
>>>> series as well) is that the change adds a function call to x86 only
>>>> code path, because relocations are not supported on discrete as per:
>>>>
>>>> static int
>>>> eb_validate_vma(...)
>>>> /* Relocations are disallowed for all platforms after TGL-LP. This
>>>> * also covers all platforms with local memory.
>>>> */
>>>>
>>>> if (entry->relocation_count &&
>>>> GRAPHICS_VER(eb->i915) >= 12 && !IS_TIGERLAKE(eb->i915))
>>>> return -EINVAL;
>>>>
>>>> How acceptable would it be, for the whole series, to introduce a
>>>> static inline i915 clflush wrapper and so be able to avoid function
>>>> calls on x86? Is this something that has been discussed and
>>>> discounted already?
>>>>
>>>> Regards,
>>>>
>>>> Tvrtko
>>>>
>>>> P.S. Hmm, I am now reminded of my really old per-platform build
>>>> patches. With them you would be able to compile out large portions
>>>> of the driver when building for Arm. Probably about a third, if my
>>>> memory serves me right.
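
The static inline wrapper floated in this thread (Tvrtko mentions i915_clflush_range as a possible name) could look roughly like the userspace sketch below. It is not the kernel implementation and not code from the series. The sketch_* names, the flag values, the fixed 64-byte cache line, and the non-x86 fallback are all assumptions made for illustration. The x86-64 branch uses the SSE2 intrinsics _mm_clflush() and _mm_mfence() in place of the kernel's clflushopt() and mb(). What it shows is that an inline helper keeps the flush in the caller on x86 while other architectures get a separate branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; the thread only names CLFLUSH_BEFORE/AFTER. */
#define SKETCH_CLFLUSH_BEFORE (1u << 0)
#define SKETCH_CLFLUSH_AFTER  (1u << 1)

#if defined(__x86_64__)
#include <emmintrin.h> /* _mm_clflush(), _mm_mfence() (SSE2) */

/* Assumed cache line size; real code would query the CPU instead. */
#define SKETCH_CACHELINE 64u

/* Flush every cache line overlapping [addr, addr + len), fenced on both sides. */
static inline void sketch_clflush_range(void *addr, size_t len)
{
	uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(SKETCH_CACHELINE - 1);
	uintptr_t end = (uintptr_t)addr + len;

	_mm_mfence();
	for (; p < end; p += SKETCH_CACHELINE)
		_mm_clflush((const void *)p);
	_mm_mfence();
}
#else
/*
 * Placeholder for other architectures: only a full memory barrier, NOT a
 * real cache flush. What Arm should do here is the open question in the
 * thread, so this branch deliberately does not guess.
 */
static inline void sketch_clflush_range(void *addr, size_t len)
{
	(void)addr;
	(void)len;
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
}
#endif

/*
 * Analogue of clflush_write32() from the quoted patch. Note that
 * sizeof(*addr) is the 4-byte value being written, whereas sizeof(addr)
 * would be the size of the pointer.
 */
static inline void sketch_clflush_write32(uint32_t *addr, uint32_t value,
					  unsigned int flushes)
{
	assert(addr != NULL);

	if (flushes & SKETCH_CLFLUSH_BEFORE)
		sketch_clflush_range(addr, sizeof(*addr));

	*addr = value;

	if (flushes & SKETCH_CLFLUSH_AFTER)
		sketch_clflush_range(addr, sizeof(*addr));
}
```

Because the helpers are static inline, a compiler can fold the flush loop into the caller, which is the performance property the review asks to preserve on integrated GPUs. The cost is the code duplication that the thread already flags as a likely objection.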
Thread overview: 22+ messages
2022-01-28 22:10 [Intel-gfx] [PATCH v2 0/4] Use drm_clflush* instead of clflush Michael Cheng
2022-01-28 22:10 ` [Intel-gfx] [PATCH v2 1/4] drm/i915/gt: Re-work intel_write_status_page Michael Cheng
2022-01-29 7:21 ` Bowman, Casey G
2022-01-28 22:10 ` [Intel-gfx] [PATCH v2 2/4] drm/i915/gt: Re-work invalidate_csb_entries Michael Cheng
2022-01-29 7:21 ` Bowman, Casey G
2022-01-31 13:51 ` Tvrtko Ursulin
2022-01-31 14:15 ` Mika Kuoppala
2022-02-01 9:32 ` Tvrtko Ursulin
2022-01-28 22:10 ` [Intel-gfx] [PATCH v2 3/4] drm/i915/gt: Re-work reset_csb Michael Cheng
2022-01-29 7:23 ` Bowman, Casey G
2022-01-28 22:10 ` [Intel-gfx] [PATCH v2 4/4] drm/i915/: Re-work clflush_write32 Michael Cheng
2022-01-29 7:24 ` Bowman, Casey G
2022-01-31 14:55 ` Tvrtko Ursulin
2022-01-31 17:02 ` Michael Cheng
2022-02-01 9:25 ` Tvrtko Ursulin
2022-02-01 15:41 ` Michael Cheng
2022-02-01 16:32 ` Tvrtko Ursulin [this message]
2022-02-02 16:35 ` Michael Cheng
2022-01-28 22:31 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Use drm_clflush* instead of clflush (rev2) Patchwork
2022-01-28 22:33 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2022-01-28 22:59 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-01-29 3:00 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork