From: Matthew Brost <matthew.brost@intel.com>
To: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, Matt Roper <matthew.d.roper@intel.com>
Subject: Re: [PATCH v6 27/30] drm/xe/vf: Use primary GT ordered work queue on media GT on PTL VF
Date: Tue, 7 Oct 2025 10:22:39 -0700 [thread overview]
Message-ID: <aOVMXzdQCB1e7LoV@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <h4mwspqv5chueicgkrvd4grucstgev3vfkuavaeqbweqecvznc@wjpe5lk7tkr3>
On Tue, Oct 07, 2025 at 12:00:32PM -0500, Lucas De Marchi wrote:
> On Mon, Oct 06, 2025 at 03:51:12PM -0700, Matthew Brost wrote:
> > On Mon, Oct 06, 2025 at 05:24:55PM -0500, Lucas De Marchi wrote:
> > > On Mon, Oct 06, 2025 at 04:10:35AM -0700, Matthew Brost wrote:
> > > > VF CCS restore is a primary GT operation on which the media GT depends.
> > > > Therefore, it doesn't make much sense to run these operations in
> > >
> > > I'd need to double check the previous patches to see the entire
> > > picture, but this seems weird at a first glance. The VF CCS restore is
> > > not the single work we queue in gt->ordered_wq. To me it seems more
> > > like "in what ordered queue we are going to queue the VF CCS restore. If
> > > it's global per device, why are we not using the device wq rather than
> > > making all the GT wq point to the same thing?
> > >
> >
> > This is where things get convoluted. Four mechanisms manipulate the
> > scheduling state by starting or stopping the DRM scheduler:
> >
> > - Job timeouts
> > - GT resets
> > - VF restore
> > - PM resume
> >
> > All of these paths require mutual exclusion, or the scheduler design
> > breaks.
> >
> > The first three ensure ordering by scheduling on the same ordered work
> > queue (GT-ordered WQ). The last one is guaranteed by holding PM
> > references in all the right places.
> >
> > Another issue is that the first three items are all in the reclaim path
> > — the GT-ordered work queue is designed to handle this.
> >
> > This patch [1] explains the entire scheduler design in detail.
> >
> > Only PTL has the cross-VF/GT restore ordering requirement, so I figured
> > the path of least resistance is to just point all GTs to the primary
> > work queue.
> >
> > [1] https://patchwork.freedesktop.org/patch/677980/?series=154627&rev=4
> >
> > > > parallel. To address this, point the media GT's ordered work queue to
> > > > the primary GT's ordered work queue on platforms that require (PTL VFs)
> > > > CCS restore as part of VF post-migration recovery.
> > > >
> > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/xe_device_types.h | 2 ++
> > > > drivers/gpu/drm/xe/xe_gt.c | 7 +++++--
> > > > drivers/gpu/drm/xe/xe_gt.h | 2 +-
> > > > drivers/gpu/drm/xe/xe_pci.c | 6 +++++-
> > > > drivers/gpu/drm/xe/xe_pci_types.h | 1 +
> > > > drivers/gpu/drm/xe/xe_tile.c | 2 +-
> > > > 6 files changed, 15 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > > > index c66523bf4bf0..02c04ad7296e 100644
> > > > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > > > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > > > @@ -334,6 +334,8 @@ struct xe_device {
> > > > u8 skip_mtcfg:1;
> > > > /** @info.skip_pcode: skip access to PCODE uC */
> > > > u8 skip_pcode:1;
> > > > + /** @info.needs_shared_vf_gt_wq: needs shared GT WQ on VF */
> > > > + u8 needs_shared_vf_gt_wq:1;
> > > > } info;
> > > >
> > > > /** @wa_active: keep track of active workarounds */
> > > > diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > > > index cf484a2da35e..05465f358c96 100644
> > > > --- a/drivers/gpu/drm/xe/xe_gt.c
> > > > +++ b/drivers/gpu/drm/xe/xe_gt.c
> > > > @@ -65,7 +65,7 @@
> > > > #include "xe_wa.h"
> > > > #include "xe_wopcm.h"
> > > >
> > > > -struct xe_gt *xe_gt_alloc(struct xe_tile *tile)
> > > > +struct xe_gt *xe_gt_alloc(struct xe_tile *tile, bool use_primary_wq)
> > >
> > > If using the device wq is not an option (possibly because it would queue
> > > with other undesired work going on there), then I'd rather drop this
> > > bool passing here and make the decision inside this function:
> > >
> > > > {
> > > > struct drm_device *drm = &tile_to_xe(tile)->drm;
> > > > struct xe_gt *gt;
> > > > @@ -75,7 +75,10 @@ struct xe_gt *xe_gt_alloc(struct xe_tile *tile)
> > > > return ERR_PTR(-ENOMEM);
> > > >
> > > > gt->tile = tile;
> > >
> > > if (!xe->info.needs_shared_gt_wq || !tile->primary_gt->ordered_wq)
> > > ordered_wq = drmm_alloc_ordered_workqueue(drm, "gt-ordered-wq", WQ_MEM_RECLAIM);
> > > else
> > > ordered_wq = tile->primary_gt->ordered_wq;
> > > if (IS_ERR_OR_NUL(ordered_wq))
> > > return ordered_wq ? ERR_CAST(gt->ordered_wq) : ERR_PTR(-EINVAL);
> > >
> > > gt->ordered_wq = ordered_wq;
> > >
> > > ... or something like that so you use the xe info to decide it here
> > > rather than passing it down as a function arg.
> > >
> >
> > Sure can refactor this as you suggest.
>
>
> another option that would avoid a bool arg would be to actually pass the
> wq from the caller.
>
> wq = NULL;
>
> if (xe->info.needs_shared_gt_wq)
> wq = tile->primary_gt->ordered_wq;
>
> xe_gt_alloc(tile, wq);
>
I posted a v7/v8 which did it like you suggest.
> >
> > >
> > > > - gt->ordered_wq = drmm_alloc_ordered_workqueue(drm, "gt-ordered-wq", WQ_MEM_RECLAIM);
> > > > + if (use_primary_wq)
> > > > + gt->ordered_wq = tile->primary_gt->ordered_wq;
> > > > + else
> > > > + gt->ordered_wq = drmm_alloc_ordered_workqueue(drm, "gt-ordered-wq", WQ_MEM_RECLAIM);
> > > > if (IS_ERR(gt->ordered_wq))
> > > > return ERR_CAST(gt->ordered_wq);
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
> > > > index 5df2ffe3ff83..9545c0c93ab6 100644
> > > > --- a/drivers/gpu/drm/xe/xe_gt.h
> > > > +++ b/drivers/gpu/drm/xe/xe_gt.h
> > > > @@ -28,7 +28,7 @@ static inline bool xe_fault_inject_gt_reset(void)
> > > > return IS_ENABLED(CONFIG_DEBUG_FS) && should_fail(>_reset_failure, 1);
> > > > }
> > > >
> > > > -struct xe_gt *xe_gt_alloc(struct xe_tile *tile);
> > > > +struct xe_gt *xe_gt_alloc(struct xe_tile *tile, bool use_primary_wq);
> > > > int xe_gt_init_early(struct xe_gt *gt);
> > > > int xe_gt_init(struct xe_gt *gt);
> > > > void xe_gt_mmio_init(struct xe_gt *gt);
> > > > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > > > index 3f42b91efa28..25a1d96a68e7 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pci.c
> > > > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > > > @@ -347,6 +347,7 @@ static const struct xe_device_desc ptl_desc = {
> > > > .has_sriov = true,
> > > > .max_gt_per_tile = 2,
> > > > .needs_scratch = true,
> > > > + .needs_shared_vf_gt_wq = true,
> > >
> > > as per above... I think this needs to be detached from vf. There may be
> > > other reasons the wq needs to be shared.
> > >
> >
> > I’d prefer to share the work queue only when absolutely necessary — such
> > as when PTL is on a VF and migration is supported.
>
>
> I missed that you left the VF condition out as an && for the allocation:
>
Yes.
> > > > + xe->info.needs_shared_vf_gt_wq &&
> > > > + IS_SRIOV_VF(xe));
>
> IMO a better way would be to call it `needs_shared_gt_wq` and then
> override the condition in the vf-specific function:
>
> sriov_update_device_info()
>
I'm trying to follow this one.
This a platform (static) and VF (dynamic) condition combination.
Are you suggesting removing platform information from xe_pci.c and just
have info bit in the device which the sriov code sets? I think we'd need
a platform check then in the sriov code which I thought in general we
want to avoid inline platform checks.
Maybe I'm misunderstanding here.
Matt
> Lucas De Marchi
>
> >
> > > If we just make them point to a device wq as suggested above, then
> > > there's no extra issue with the ongoing work to disable GTs that Matt
> > > Roper is doing (https://patchwork.freedesktop.org/series/154739/).
> > > Otherwise we will need to think on how to reconciliate them.
> > >
> >
> > I don’t see how it would ever be possible to disable the primary GT and
> > still do anything meaningful. You can’t allocate memory (e.g., for
> > clears), perform VM binds, or handle page faults without a copy engine.
> >
> > Matt
> >
> > > Lucas De Marchi
> > >
> > > > };
> > > >
> > > > #undef PLATFORM
> > > > @@ -598,6 +599,7 @@ static int xe_info_init_early(struct xe_device *xe,
> > > > xe->info.skip_mtcfg = desc->skip_mtcfg;
> > > > xe->info.skip_pcode = desc->skip_pcode;
> > > > xe->info.needs_scratch = desc->needs_scratch;
> > > > + xe->info.needs_shared_vf_gt_wq = desc->needs_shared_vf_gt_wq;
> > > >
> > > > xe->info.probe_display = IS_ENABLED(CONFIG_DRM_XE_DISPLAY) &&
> > > > xe_modparam.probe_display &&
> > > > @@ -766,7 +768,9 @@ static int xe_info_init(struct xe_device *xe,
> > > > * Allocate and setup media GT for platforms with standalone
> > > > * media.
> > > > */
> > > > - tile->media_gt = xe_gt_alloc(tile);
> > > > + tile->media_gt = xe_gt_alloc(tile,
> > > > + xe->info.needs_shared_vf_gt_wq &&
> > > > + IS_SRIOV_VF(xe));
> > > > if (IS_ERR(tile->media_gt))
> > > > return PTR_ERR(tile->media_gt);
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_pci_types.h b/drivers/gpu/drm/xe/xe_pci_types.h
> > > > index 9b9766a3baa3..b11bf6abda5b 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pci_types.h
> > > > +++ b/drivers/gpu/drm/xe/xe_pci_types.h
> > > > @@ -48,6 +48,7 @@ struct xe_device_desc {
> > > > u8 skip_guc_pc:1;
> > > > u8 skip_mtcfg:1;
> > > > u8 skip_pcode:1;
> > > > + u8 needs_shared_vf_gt_wq:1;
> > > > };
> > > >
> > > > struct xe_graphics_desc {
> > > > diff --git a/drivers/gpu/drm/xe/xe_tile.c b/drivers/gpu/drm/xe/xe_tile.c
> > > > index 6edb5062c1da..e9bcff2de563 100644
> > > > --- a/drivers/gpu/drm/xe/xe_tile.c
> > > > +++ b/drivers/gpu/drm/xe/xe_tile.c
> > > > @@ -157,7 +157,7 @@ int xe_tile_init_early(struct xe_tile *tile, struct xe_device *xe, u8 id)
> > > > if (err)
> > > > return err;
> > > >
> > > > - tile->primary_gt = xe_gt_alloc(tile);
> > > > + tile->primary_gt = xe_gt_alloc(tile, false);
> > > > if (IS_ERR(tile->primary_gt))
> > > > return PTR_ERR(tile->primary_gt);
> > > >
> > > > --
> > > > 2.34.1
> > > >
next prev parent reply other threads:[~2025-10-07 17:22 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-06 11:10 [PATCH v6 00/30] VF migration redesign Matthew Brost
2025-10-06 11:10 ` [PATCH v6 01/30] drm/xe: Add NULL checks to scratch LRC allocation Matthew Brost
2025-10-06 21:51 ` Lis, Tomasz
2025-10-06 11:10 ` [PATCH v6 02/30] drm/xe: Save off position in ring in which a job was programmed Matthew Brost
2025-10-06 11:10 ` [PATCH v6 03/30] drm/xe/guc: Track pending-enable source in submission state Matthew Brost
2025-10-06 11:10 ` [PATCH v6 04/30] drm/xe: Track LR jobs in DRM scheduler pending list Matthew Brost
2025-10-06 11:10 ` [PATCH v6 05/30] drm/xe: Don't change LRC ring head on job resubmission Matthew Brost
2025-10-06 11:10 ` [PATCH v6 06/30] drm/xe: Make LRC W/A scratch buffer usage consistent Matthew Brost
2025-10-06 11:10 ` [PATCH v6 07/30] drm/xe/vf: Add xe_gt_recovery_pending helper Matthew Brost
2025-10-06 13:10 ` Michal Wajdeczko
2025-10-06 11:10 ` [PATCH v6 08/30] drm/xe/vf: Make VF recovery run on per-GT worker Matthew Brost
2025-10-06 11:10 ` [PATCH v6 09/30] drm/xe/vf: Abort H2G sends during VF post-migration recovery Matthew Brost
2025-10-06 11:10 ` [PATCH v6 10/30] drm/xe/vf: Remove memory allocations from VF post migration recovery Matthew Brost
2025-10-06 11:10 ` [PATCH v6 11/30] drm/xe/vf: Close multi-GT GGTT shift race Matthew Brost
2025-10-06 14:27 ` Michal Wajdeczko
2025-10-06 14:56 ` Matthew Brost
2025-10-06 11:10 ` [PATCH v6 12/30] drm/xe/vf: Teardown VF post migration worker on driver unload Matthew Brost
2025-10-06 11:10 ` [PATCH v6 13/30] drm/xe/vf: Don't allow GT reset to be queued during VF post migration recovery Matthew Brost
2025-10-06 11:10 ` [PATCH v6 14/30] drm/xe/vf: Wakeup in GuC backend on " Matthew Brost
2025-10-06 14:35 ` Michal Wajdeczko
2025-10-06 15:54 ` Matthew Brost
2025-10-06 22:27 ` Lis, Tomasz
2025-10-06 23:07 ` Matthew Brost
2025-10-06 11:10 ` [PATCH v6 15/30] drm/xe/vf: Avoid indefinite blocking in preempt rebind worker for VFs supporting migration Matthew Brost
2025-10-06 11:10 ` [PATCH v6 16/30] drm/xe/vf: Use GUC_HXG_TYPE_EVENT for GuC context register Matthew Brost
2025-10-06 14:51 ` Michal Wajdeczko
2025-10-06 16:02 ` Matthew Brost
2025-10-06 22:21 ` Lis, Tomasz
2025-10-06 22:57 ` Matthew Brost
2025-10-06 11:10 ` [PATCH v6 17/30] drm/xe/vf: Flush and stop CTs in VF post migration recovery Matthew Brost
2025-10-06 11:10 ` [PATCH v6 18/30] drm/xe/vf: Reset TLB invalidations during " Matthew Brost
2025-10-06 11:10 ` [PATCH v6 19/30] drm/xe/vf: Kickstart after resfix in " Matthew Brost
2025-10-06 11:10 ` [PATCH v6 20/30] drm/xe/vf: Start CTs before resfix " Matthew Brost
2025-10-06 21:50 ` Lis, Tomasz
2025-10-06 11:10 ` [PATCH v6 21/30] drm/xe/vf: Abort VF post migration recovery on failure Matthew Brost
2025-10-06 11:10 ` [PATCH v6 22/30] drm/xe/vf: Replay GuC submission state on pause / unpause Matthew Brost
2025-10-06 11:10 ` [PATCH v6 23/30] drm/xe: Move queue init before LRC creation Matthew Brost
2025-10-06 15:22 ` Michal Wajdeczko
2025-10-06 21:33 ` Lis, Tomasz
2025-10-06 11:10 ` [PATCH v6 24/30] drm/xe/vf: Add debug prints for GuC replaying state during VF recovery Matthew Brost
2025-10-06 11:10 ` [PATCH v6 25/30] drm/xe/vf: Workaround for race condition in GuC firmware during VF pause Matthew Brost
2025-10-06 11:10 ` [PATCH v6 26/30] drm/xe: Use PPGTT addresses for TLB invalidation to avoid GGTT fixups Matthew Brost
2025-10-06 11:10 ` [PATCH v6 27/30] drm/xe/vf: Use primary GT ordered work queue on media GT on PTL VF Matthew Brost
2025-10-06 22:24 ` Lucas De Marchi
2025-10-06 22:51 ` Matthew Brost
2025-10-07 17:00 ` Lucas De Marchi
2025-10-07 17:22 ` Matthew Brost [this message]
2025-10-07 20:36 ` Lucas De Marchi
2025-10-07 21:18 ` Matthew Brost
2025-10-06 11:10 ` [PATCH v6 28/30] drm/xe/vf: Ensure media GT VF recovery runs after primary GT on PTL Matthew Brost
2025-10-06 11:10 ` [PATCH v6 29/30] drm/xe/vf: Rebase CCS save/restore BB GGTT addresses Matthew Brost
2025-10-06 11:10 ` [PATCH v6 30/30] drm/xe/guc: Increase wait timeout to 2sec after BUSY reply from GuC Matthew Brost
2025-10-06 11:17 ` ✗ CI.checkpatch: warning for VF migration redesign (rev6) Patchwork
2025-10-06 11:18 ` ✓ CI.KUnit: success " Patchwork
2025-10-06 12:24 ` ✗ Xe.CI.BAT: failure " Patchwork
2025-10-06 14:28 ` ✗ Xe.CI.Full: " Patchwork
2025-10-07 0:20 ` [PATCH v6 00/30] VF migration redesign Niranjana Vishwanathapura
2025-10-07 1:11 ` Matthew Brost
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aOVMXzdQCB1e7LoV@lstrano-desk.jf.intel.com \
--to=matthew.brost@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=lucas.demarchi@intel.com \
--cc=matthew.d.roper@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox