Intel-GFX Archive on lore.kernel.org
From: Andrzej Hajda <andrzej.hajda@intel.com>
To: Matthew Auld <matthew.auld@intel.com>, intel-gfx@lists.freedesktop.org
Cc: Lucas De Marchi <lucas.demarchi@intel.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>
Subject: Re: [Intel-gfx] [PATCH] drm/i915/selftests: add prefetch padding to store_dw batchbuffer
Date: Wed, 19 Oct 2022 13:01:44 +0200
Message-ID: <cd6d9802-353c-642d-4ee7-af7b2eac5a01@intel.com>
In-Reply-To: <ca42bc29-ef8c-cb36-a8f7-897c7baee0ca@intel.com>

On 19.10.2022 11:14, Matthew Auld wrote:
> On 19/10/2022 10:12, Matthew Auld wrote:
>> On 19/10/2022 08:12, Andrzej Hajda wrote:
>>> Instruction prefetch mechanism requires that 512 bytes after the last
>>> command should be readable by EU. Otherwise DMAR errors and engine
>>> hangs can happen.
>>>
>>> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5278
>>> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
>>
>> Is there a Bspec ref for this? I would have assumed that EU was more 
>> about kernels/shaders, than simple MI commands? Also should we be 
>> hitting dmar errors for ppGTT if this were some kind of overfetch? 
>> AFAICT we always point entries back to scratch, unlike with say the 
>> GGTT where we might have stale entries, and unbinding should flush 
>> the tlb?
>
> s/unbinding/put_pages/

The Bspec reference is here [1], but now that you distinguish between
simple MI commands and kernels/shaders, I am not so sure it applies to
this case, so I will present the findings that led me to this conclusion:

My findings (on Raptor Lake):
1. The dmar errors always report the physical address of a recently
removed bb created by igt_emit_store_dw, at least in my tests.
2. intel_iommu enqueues a tlb flush during put_pages of this bb, but the
actual flush happens later, triggered by a timer.
3. Together with the dmar errors, GuC reports a CAT error on the
context/engine executing this batch (with IPEHR=MI_BATCH_BUFFER_END).
4. Errors happen only on vcs/vecs (???).
5. Errors happen only when the tested huge page has size SZ_2M - SZ_64K
or SZ_2M - SZ_4K. In both cases the end of the bb (calculated size 8kb)
is just a few dwords after the last cmd; in other cases there is much
more padding.
6. Enlarging the bb works (as in this patch).
7. Flushing the iommu tlb for the phys address of the bb just before
calling dma_unmap_sg (in i915_gem_gtt_finish_pages) helps as well :)
8. There is already some workaround present in i915_gem_gtt_finish_pages:

	/* XXX This does not prevent more requests being submitted! */
	if (unlikely(ggtt->do_idle_maps))
		/* Wait a bit, in the hope it avoids the hang */
		usleep_range(100, 250);

but it is only implemented for Gen5 and is slow. It also works, though
(probably because the tlb is flushed in the meantime).
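
To make finding 5 concrete, here is a minimal sketch of the size
arithmetic from igt_emit_store_dw, with and without the 512-byte padding.
It is only an illustration: the helper names are hypothetical, a 4 KiB
PAGE_SIZE is assumed, and it assumes the selftest passes
count = size >> PAGE_SHIFT (so 511 for SZ_2M - SZ_4K, 496 for
SZ_2M - SZ_64K and 512 for SZ_2M), which is not stated in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u /* assumption: 4 KiB pages */
#define PREFETCH_PAD    512u  /* padding added by the patch */

/* Round x up to a multiple of align (align must be a power of two). */
static size_t round_up_pow2(size_t x, size_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Bytes occupied by the commands, following the size formula in the
 * patch: 4 dwords per store plus one trailing dword.
 */
static size_t bb_cmd_bytes(size_t count)
{
	return (4 * count + 1) * sizeof(uint32_t);
}

/*
 * Bytes of the allocated bb left after the last command.
 * pad = 0 models the old code, pad = PREFETCH_PAD the patched code.
 */
static size_t bb_tail_slack(size_t count, size_t pad)
{
	size_t used = bb_cmd_bytes(count);

	return round_up_pow2(used + pad, PAGE_SIZE_BYTES) - used;
}
```

Under these assumptions, bb_tail_slack(511, 0) is 12 bytes and
bb_tail_slack(496, 0) is 252 bytes, both below 512, while
bb_tail_slack(512, 0) is 4092 bytes because the commands spill into a
third page. With pad = PREFETCH_PAD all three cases leave more than 4000
bytes after the last command.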

[1]: https://gfxspecs.intel.com/Predator/Home/Index/47286

Regards
Andrzej

>
>>
>>> ---
>>>   drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c 
>>> b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
>>> index 3c55e77b0f1b00..fe999a02f8e10a 100644
>>> --- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
>>> +++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
>>> @@ -50,7 +50,7 @@ igt_emit_store_dw(struct i915_vma *vma,
>>>       u32 *cmd;
>>>       int err;
>>> -    size = (4 * count + 1) * sizeof(u32);
>>> +    size = (4 * count + 1) * sizeof(u32) + 512;
>>>       size = round_up(size, PAGE_SIZE);
>>>       obj = i915_gem_object_create_internal(vma->vm->i915, size);
>>>       if (IS_ERR(obj))

Thread overview:
2022-10-19  7:12 [Intel-gfx] [PATCH] drm/i915/selftests: add prefetch padding to store_dw batchbuffer Andrzej Hajda
2022-10-19  8:16 ` [Intel-gfx] ✓ Fi.CI.BAT: success for " Patchwork
2022-10-19  9:12 ` [Intel-gfx] [PATCH] " Matthew Auld
2022-10-19  9:14   ` Matthew Auld
2022-10-19 11:01     ` Andrzej Hajda [this message]
2022-10-19 14:54 ` [Intel-gfx] ✗ Fi.CI.IGT: failure for " Patchwork