Intel-XE Archive on lore.kernel.org
From: "Welty, Brian" <brian.welty@intel.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
	Matt Roper <matthew.d.roper@intel.com>,
	 Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Subject: Re: [PATCH 1/1] drm/xe: Only use reserved BCS instances for usm migrate exec queue
Date: Wed, 8 May 2024 11:18:31 -0700	[thread overview]
Message-ID: <7660b286-5dc6-4edb-bdf7-d16da1614017@intel.com> (raw)
In-Reply-To: <ZjrmpDa05PSL6B1h@DUT025-TGLU.fm.intel.com>



On 5/7/2024 7:42 PM, Matthew Brost wrote:
> On Tue, May 07, 2024 at 04:59:25PM -0700, Welty, Brian wrote:
>>
>>
>> On 4/15/2024 12:04 PM, Matthew Brost wrote:
>>> The GuC context scheduling queue is 2 entries deep, thus it is possible
>>> for a migration job to be stuck behind a fault if migration exec queue
>>> shares engines with user jobs. This can deadlock as the migrate exec
>>> queue is required to service page faults. Avoid deadlock by only using
>>> reserved BCS instances for usm migrate exec queue.
>>
>> So the underlying concept was always broken here?
>>
> 
> It seems to be broken.
> 
>> With the mask of more than one engine, the virtual engine still won't always
>> pick an idle engine?  HW may end up picking an engine and
> 
> I thought the GuC would always pick the idle hardware engine but it
> doesn't appear to be the case (I can see a clear deadlock before this
> patch, resolved after). Maybe the GuC considers an engine with 1 exec
> queue on it idle, so it doesn't pick the reserved engine?
> 
> We probably should follow up with the GuC team on this, but for PVC as
> an SDV, I think we should get this merged.

Makes sense.

Reviewed-by: Brian Welty <brian.welty@intel.com>


> 
> Matt
> 
>> confusing it to have been idle?   Because the extra 2-depth thing is not
>> being considered?
>>
>>
>>>
>>> Fixes: a043fbab7af5 ("drm/xe/pvc: Use fast copy engines as migrate engine on PVC")
>>> Cc: Matt Roper <matthew.d.roper@intel.com>
>>> Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>> ---
>>>    drivers/gpu/drm/xe/xe_migrate.c | 14 +++++---------
>>>    1 file changed, 5 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
>>> index 9f6e9b7f11c8..c37bb7dfcf1f 100644
>>> --- a/drivers/gpu/drm/xe/xe_migrate.c
>>> +++ b/drivers/gpu/drm/xe/xe_migrate.c
>>> @@ -12,8 +12,6 @@
>>>    #include <drm/ttm/ttm_tt.h>
>>>    #include <drm/xe_drm.h>
>>> -#include <generated/xe_wa_oob.h>
>>> -
>>>    #include "instructions/xe_mi_commands.h"
>>>    #include "regs/xe_gpu_commands.h"
>>>    #include "regs/xe_gtt_defs.h"
>>> @@ -34,7 +32,6 @@
>>>    #include "xe_sync.h"
>>>    #include "xe_trace.h"
>>>    #include "xe_vm.h"
>>> -#include "xe_wa.h"
>>>    /**
>>>     * struct xe_migrate - migrate context.
>>> @@ -300,10 +297,6 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
>>>    }
>>>    /*
>>> - * Due to workaround 16017236439, odd instance hardware copy engines are
>>> - * faster than even instance ones.
>>> - * This function returns the mask involving all fast copy engines and the
>>> - * reserved copy engine to be used as logical mask for migrate engine.
>>>     * Including the reserved copy engine is required to avoid deadlocks due to
>>>     * migrate jobs servicing the faults gets stuck behind the job that faulted.
>>>     */
>>> @@ -317,8 +310,7 @@ static u32 xe_migrate_usm_logical_mask(struct xe_gt *gt)
>>>    		if (hwe->class != XE_ENGINE_CLASS_COPY)
>>>    			continue;
>>> -		if (!XE_WA(gt, 16017236439) ||
>>> -		    xe_gt_is_usm_hwe(gt, hwe) || hwe->instance & 1)
>>> +		if (xe_gt_is_usm_hwe(gt, hwe))
>>>    			logical_mask |= BIT(hwe->logical_instance);
>>>    	}
>>> @@ -369,6 +361,10 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
>>>    		if (!hwe || !logical_mask)
>>>    			return ERR_PTR(-EINVAL);
>>> +		/*
>>> +		 * XXX: Currently only reserving 1 (likely slow) BCS instance on
>>> +		 * PVC, may want to revisit if performance is needed.
>>> +		 */
>>>    		m->q = xe_exec_queue_create(xe, vm, logical_mask, 1, hwe,
>>>    					    EXEC_QUEUE_FLAG_KERNEL |
>>>    					    EXEC_QUEUE_FLAG_PERMANENT |


Thread overview: 14+ messages
2024-04-15 19:04 [PATCH 0/1] Fix faulting VM deadlock Matthew Brost
2024-04-15 19:04 ` [PATCH 1/1] drm/xe: Only use reserved BCS instances for usm migrate exec queue Matthew Brost
2024-05-07 23:59   ` Welty, Brian
2024-05-08  2:42     ` Matthew Brost
2024-05-08 18:18       ` Welty, Brian [this message]
2024-05-14  1:24         ` Welty, Brian
2024-04-16  0:03 ` ✓ CI.Patch_applied: success for Fix faulting VM deadlock Patchwork
2024-04-16  0:04 ` ✗ CI.checkpatch: warning " Patchwork
2024-04-16  0:04 ` ✓ CI.KUnit: success " Patchwork
2024-04-16  0:16 ` ✓ CI.Build: " Patchwork
2024-04-16  0:19 ` ✓ CI.Hooks: " Patchwork
2024-04-16  0:20 ` ✓ CI.checksparse: " Patchwork
2024-04-16  0:43 ` ✓ CI.BAT: " Patchwork
2024-04-16 17:03 ` ✓ CI.FULL: " Patchwork
