From: Matthew Brost <matthew.brost@intel.com>
To: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, Matt Roper <matthew.d.roper@intel.com>
Subject: Re: [PATCH v6 27/30] drm/xe/vf: Use primary GT ordered work queue on media GT on PTL VF
Date: Mon, 6 Oct 2025 15:51:12 -0700 [thread overview]
Message-ID: <aORH4NrjGSNcCxi/@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <t6aytzfw5frhua2ojrz3xyrobi25p424exjagpmovtvif5jwhw@ocmceysqldql>
On Mon, Oct 06, 2025 at 05:24:55PM -0500, Lucas De Marchi wrote:
> On Mon, Oct 06, 2025 at 04:10:35AM -0700, Matthew Brost wrote:
> > VF CCS restore is a primary GT operation on which the media GT depends.
> > Therefore, it doesn't make much sense to run these operations in
>
> I'd need to double check the previous patches to see the entire
> picture, but this seems weird at a first glance. The VF CCS restore is
> not the single work we queue in gt->ordered_wq. To me it seems more
> like "in what ordered queue we are going to queue the VF CCS restore. If
> it's global per device, why are we not using the device wq rather than
> making all the GT wq point to the same thing?
>
This is where things get convoluted. Four mechanisms manipulate the
scheduling state by starting or stopping the DRM scheduler:
- Job timeouts
- GT resets
- VF restore
- PM resume
All of these paths require mutual exclusion, or the scheduler design
breaks.
The first three ensure ordering by scheduling on the same ordered work
queue (GT-ordered WQ). The last one is guaranteed by holding PM
references in all the right places.
Another issue is that the first three items are all in the reclaim path
— the GT-ordered work queue is designed to handle this.
This patch [1] explains the entire scheduler design in detail.
Only PTL has the cross-VF/GT restore ordering requirement, so I figured
the path of least resistance is to just point all GTs to the primary
work queue.
[1] https://patchwork.freedesktop.org/patch/677980/?series=154627&rev=4
> > parallel. To address this, point the media GT's ordered work queue to
> > the primary GT's ordered work queue on platforms that require (PTL VFs)
> > CCS restore as part of VF post-migration recovery.
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_device_types.h | 2 ++
> > drivers/gpu/drm/xe/xe_gt.c | 7 +++++--
> > drivers/gpu/drm/xe/xe_gt.h | 2 +-
> > drivers/gpu/drm/xe/xe_pci.c | 6 +++++-
> > drivers/gpu/drm/xe/xe_pci_types.h | 1 +
> > drivers/gpu/drm/xe/xe_tile.c | 2 +-
> > 6 files changed, 15 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > index c66523bf4bf0..02c04ad7296e 100644
> > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > @@ -334,6 +334,8 @@ struct xe_device {
> > u8 skip_mtcfg:1;
> > /** @info.skip_pcode: skip access to PCODE uC */
> > u8 skip_pcode:1;
> > + /** @info.needs_shared_vf_gt_wq: needs shared GT WQ on VF */
> > + u8 needs_shared_vf_gt_wq:1;
> > } info;
> >
> > /** @wa_active: keep track of active workarounds */
> > diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > index cf484a2da35e..05465f358c96 100644
> > --- a/drivers/gpu/drm/xe/xe_gt.c
> > +++ b/drivers/gpu/drm/xe/xe_gt.c
> > @@ -65,7 +65,7 @@
> > #include "xe_wa.h"
> > #include "xe_wopcm.h"
> >
> > -struct xe_gt *xe_gt_alloc(struct xe_tile *tile)
> > +struct xe_gt *xe_gt_alloc(struct xe_tile *tile, bool use_primary_wq)
>
> If using the device wq is not an option (possibly because it would queue
> with other undesired work going on there), then I'd rather drop this
> bool passing here and make the decision inside this function:
>
> > {
> > struct drm_device *drm = &tile_to_xe(tile)->drm;
> > struct xe_gt *gt;
> > @@ -75,7 +75,10 @@ struct xe_gt *xe_gt_alloc(struct xe_tile *tile)
> > return ERR_PTR(-ENOMEM);
> >
> > gt->tile = tile;
>
> if (!xe->info.needs_shared_gt_wq || !tile->primary_gt->ordered_wq)
> ordered_wq = drmm_alloc_ordered_workqueue(drm, "gt-ordered-wq", WQ_MEM_RECLAIM);
> else
> ordered_wq = tile->primary_gt->ordered_wq;
> if (IS_ERR_OR_NUL(ordered_wq))
> return ordered_wq ? ERR_CAST(gt->ordered_wq) : ERR_PTR(-EINVAL);
>
> gt->ordered_wq = ordered_wq;
>
> ... or something like that so you use the xe info to decide it here
> rather than passing it down as a function arg.
>
Sure can refactor this as you suggest.
>
> > - gt->ordered_wq = drmm_alloc_ordered_workqueue(drm, "gt-ordered-wq", WQ_MEM_RECLAIM);
> > + if (use_primary_wq)
> > + gt->ordered_wq = tile->primary_gt->ordered_wq;
> > + else
> > + gt->ordered_wq = drmm_alloc_ordered_workqueue(drm, "gt-ordered-wq", WQ_MEM_RECLAIM);
> > if (IS_ERR(gt->ordered_wq))
> > return ERR_CAST(gt->ordered_wq);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
> > index 5df2ffe3ff83..9545c0c93ab6 100644
> > --- a/drivers/gpu/drm/xe/xe_gt.h
> > +++ b/drivers/gpu/drm/xe/xe_gt.h
> > @@ -28,7 +28,7 @@ static inline bool xe_fault_inject_gt_reset(void)
> > return IS_ENABLED(CONFIG_DEBUG_FS) && should_fail(>_reset_failure, 1);
> > }
> >
> > -struct xe_gt *xe_gt_alloc(struct xe_tile *tile);
> > +struct xe_gt *xe_gt_alloc(struct xe_tile *tile, bool use_primary_wq);
> > int xe_gt_init_early(struct xe_gt *gt);
> > int xe_gt_init(struct xe_gt *gt);
> > void xe_gt_mmio_init(struct xe_gt *gt);
> > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > index 3f42b91efa28..25a1d96a68e7 100644
> > --- a/drivers/gpu/drm/xe/xe_pci.c
> > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > @@ -347,6 +347,7 @@ static const struct xe_device_desc ptl_desc = {
> > .has_sriov = true,
> > .max_gt_per_tile = 2,
> > .needs_scratch = true,
> > + .needs_shared_vf_gt_wq = true,
>
> as per above... I think this needs to be detached from vf. There may be
> other reasons the wq needs to be shared.
>
I’d prefer to share the work queue only when absolutely necessary — such
as when PTL is on a VF and migration is supported.
> If we just make them point to a device wq as suggested above, then
> there's no extra issue with the ongoing work to disable GTs that Matt
> Roper is doing (https://patchwork.freedesktop.org/series/154739/).
> Otherwise we will need to think on how to reconciliate them.
>
I don’t see how it would ever be possible to disable the primary GT and
still do anything meaningful. You can’t allocate memory (e.g., for
clears), perform VM binds, or handle page faults without a copy engine.
Matt
> Lucas De Marchi
>
> > };
> >
> > #undef PLATFORM
> > @@ -598,6 +599,7 @@ static int xe_info_init_early(struct xe_device *xe,
> > xe->info.skip_mtcfg = desc->skip_mtcfg;
> > xe->info.skip_pcode = desc->skip_pcode;
> > xe->info.needs_scratch = desc->needs_scratch;
> > + xe->info.needs_shared_vf_gt_wq = desc->needs_shared_vf_gt_wq;
> >
> > xe->info.probe_display = IS_ENABLED(CONFIG_DRM_XE_DISPLAY) &&
> > xe_modparam.probe_display &&
> > @@ -766,7 +768,9 @@ static int xe_info_init(struct xe_device *xe,
> > * Allocate and setup media GT for platforms with standalone
> > * media.
> > */
> > - tile->media_gt = xe_gt_alloc(tile);
> > + tile->media_gt = xe_gt_alloc(tile,
> > + xe->info.needs_shared_vf_gt_wq &&
> > + IS_SRIOV_VF(xe));
> > if (IS_ERR(tile->media_gt))
> > return PTR_ERR(tile->media_gt);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_pci_types.h b/drivers/gpu/drm/xe/xe_pci_types.h
> > index 9b9766a3baa3..b11bf6abda5b 100644
> > --- a/drivers/gpu/drm/xe/xe_pci_types.h
> > +++ b/drivers/gpu/drm/xe/xe_pci_types.h
> > @@ -48,6 +48,7 @@ struct xe_device_desc {
> > u8 skip_guc_pc:1;
> > u8 skip_mtcfg:1;
> > u8 skip_pcode:1;
> > + u8 needs_shared_vf_gt_wq:1;
> > };
> >
> > struct xe_graphics_desc {
> > diff --git a/drivers/gpu/drm/xe/xe_tile.c b/drivers/gpu/drm/xe/xe_tile.c
> > index 6edb5062c1da..e9bcff2de563 100644
> > --- a/drivers/gpu/drm/xe/xe_tile.c
> > +++ b/drivers/gpu/drm/xe/xe_tile.c
> > @@ -157,7 +157,7 @@ int xe_tile_init_early(struct xe_tile *tile, struct xe_device *xe, u8 id)
> > if (err)
> > return err;
> >
> > - tile->primary_gt = xe_gt_alloc(tile);
> > + tile->primary_gt = xe_gt_alloc(tile, false);
> > if (IS_ERR(tile->primary_gt))
> > return PTR_ERR(tile->primary_gt);
> >
> > --
> > 2.34.1
> >
next prev parent reply other threads:[~2025-10-06 22:51 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-06 11:10 [PATCH v6 00/30] VF migration redesign Matthew Brost
2025-10-06 11:10 ` [PATCH v6 01/30] drm/xe: Add NULL checks to scratch LRC allocation Matthew Brost
2025-10-06 21:51 ` Lis, Tomasz
2025-10-06 11:10 ` [PATCH v6 02/30] drm/xe: Save off position in ring in which a job was programmed Matthew Brost
2025-10-06 11:10 ` [PATCH v6 03/30] drm/xe/guc: Track pending-enable source in submission state Matthew Brost
2025-10-06 11:10 ` [PATCH v6 04/30] drm/xe: Track LR jobs in DRM scheduler pending list Matthew Brost
2025-10-06 11:10 ` [PATCH v6 05/30] drm/xe: Don't change LRC ring head on job resubmission Matthew Brost
2025-10-06 11:10 ` [PATCH v6 06/30] drm/xe: Make LRC W/A scratch buffer usage consistent Matthew Brost
2025-10-06 11:10 ` [PATCH v6 07/30] drm/xe/vf: Add xe_gt_recovery_pending helper Matthew Brost
2025-10-06 13:10 ` Michal Wajdeczko
2025-10-06 11:10 ` [PATCH v6 08/30] drm/xe/vf: Make VF recovery run on per-GT worker Matthew Brost
2025-10-06 11:10 ` [PATCH v6 09/30] drm/xe/vf: Abort H2G sends during VF post-migration recovery Matthew Brost
2025-10-06 11:10 ` [PATCH v6 10/30] drm/xe/vf: Remove memory allocations from VF post migration recovery Matthew Brost
2025-10-06 11:10 ` [PATCH v6 11/30] drm/xe/vf: Close multi-GT GGTT shift race Matthew Brost
2025-10-06 14:27 ` Michal Wajdeczko
2025-10-06 14:56 ` Matthew Brost
2025-10-06 11:10 ` [PATCH v6 12/30] drm/xe/vf: Teardown VF post migration worker on driver unload Matthew Brost
2025-10-06 11:10 ` [PATCH v6 13/30] drm/xe/vf: Don't allow GT reset to be queued during VF post migration recovery Matthew Brost
2025-10-06 11:10 ` [PATCH v6 14/30] drm/xe/vf: Wakeup in GuC backend on " Matthew Brost
2025-10-06 14:35 ` Michal Wajdeczko
2025-10-06 15:54 ` Matthew Brost
2025-10-06 22:27 ` Lis, Tomasz
2025-10-06 23:07 ` Matthew Brost
2025-10-06 11:10 ` [PATCH v6 15/30] drm/xe/vf: Avoid indefinite blocking in preempt rebind worker for VFs supporting migration Matthew Brost
2025-10-06 11:10 ` [PATCH v6 16/30] drm/xe/vf: Use GUC_HXG_TYPE_EVENT for GuC context register Matthew Brost
2025-10-06 14:51 ` Michal Wajdeczko
2025-10-06 16:02 ` Matthew Brost
2025-10-06 22:21 ` Lis, Tomasz
2025-10-06 22:57 ` Matthew Brost
2025-10-06 11:10 ` [PATCH v6 17/30] drm/xe/vf: Flush and stop CTs in VF post migration recovery Matthew Brost
2025-10-06 11:10 ` [PATCH v6 18/30] drm/xe/vf: Reset TLB invalidations during " Matthew Brost
2025-10-06 11:10 ` [PATCH v6 19/30] drm/xe/vf: Kickstart after resfix in " Matthew Brost
2025-10-06 11:10 ` [PATCH v6 20/30] drm/xe/vf: Start CTs before resfix " Matthew Brost
2025-10-06 21:50 ` Lis, Tomasz
2025-10-06 11:10 ` [PATCH v6 21/30] drm/xe/vf: Abort VF post migration recovery on failure Matthew Brost
2025-10-06 11:10 ` [PATCH v6 22/30] drm/xe/vf: Replay GuC submission state on pause / unpause Matthew Brost
2025-10-06 11:10 ` [PATCH v6 23/30] drm/xe: Move queue init before LRC creation Matthew Brost
2025-10-06 15:22 ` Michal Wajdeczko
2025-10-06 21:33 ` Lis, Tomasz
2025-10-06 11:10 ` [PATCH v6 24/30] drm/xe/vf: Add debug prints for GuC replaying state during VF recovery Matthew Brost
2025-10-06 11:10 ` [PATCH v6 25/30] drm/xe/vf: Workaround for race condition in GuC firmware during VF pause Matthew Brost
2025-10-06 11:10 ` [PATCH v6 26/30] drm/xe: Use PPGTT addresses for TLB invalidation to avoid GGTT fixups Matthew Brost
2025-10-06 11:10 ` [PATCH v6 27/30] drm/xe/vf: Use primary GT ordered work queue on media GT on PTL VF Matthew Brost
2025-10-06 22:24 ` Lucas De Marchi
2025-10-06 22:51 ` Matthew Brost [this message]
2025-10-07 17:00 ` Lucas De Marchi
2025-10-07 17:22 ` Matthew Brost
2025-10-07 20:36 ` Lucas De Marchi
2025-10-07 21:18 ` Matthew Brost
2025-10-06 11:10 ` [PATCH v6 28/30] drm/xe/vf: Ensure media GT VF recovery runs after primary GT on PTL Matthew Brost
2025-10-06 11:10 ` [PATCH v6 29/30] drm/xe/vf: Rebase CCS save/restore BB GGTT addresses Matthew Brost
2025-10-06 11:10 ` [PATCH v6 30/30] drm/xe/guc: Increase wait timeout to 2sec after BUSY reply from GuC Matthew Brost
2025-10-06 11:17 ` ✗ CI.checkpatch: warning for VF migration redesign (rev6) Patchwork
2025-10-06 11:18 ` ✓ CI.KUnit: success " Patchwork
2025-10-06 12:24 ` ✗ Xe.CI.BAT: failure " Patchwork
2025-10-06 14:28 ` ✗ Xe.CI.Full: " Patchwork
2025-10-07 0:20 ` [PATCH v6 00/30] VF migration redesign Niranjana Vishwanathapura
2025-10-07 1:11 ` Matthew Brost
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aORH4NrjGSNcCxi/@lstrano-desk.jf.intel.com \
--to=matthew.brost@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=lucas.demarchi@intel.com \
--cc=matthew.d.roper@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox