Intel-XE Archive on lore.kernel.org
From: Matthew Brost <matthew.brost@intel.com>
To: "K V P, Satyanarayana" <satyanarayana.k.v.p@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
	Michal Wajdeczko <michal.wajdeczko@intel.com>,
	Michal Winiarski <michal.winiarski@intel.com>,
	Tomasz Lis <tomasz.lis@intel.com>
Subject: Re: [PATCH v2] drm/xe/vf: Fix up GGTT on S4 resume under SR-IOV
Date: Fri, 24 Oct 2025 08:58:10 -0700
Message-ID: <aPuiEoi6zyEFw7U5@lstrano-desk.jf.intel.com>
In-Reply-To: <30bed83f-9730-43fe-a81a-719057d03e33@intel.com>

On Fri, Oct 24, 2025 at 09:39:18AM +0530, K V P, Satyanarayana wrote:
> 
> 
> On 24-10-2025 08:19, Matthew Brost wrote:
> > On Thu, Oct 23, 2025 at 12:16:11PM +0530, Satyanarayana K V P wrote:
> > > With SRIOV enabled, while resuming from S4, there is a possibility that,
> > > the VM might have been suspended on one VF and resumed on another VF.
> > > Since GGTT space is not virtualized, we need to fixup all the GGTT
> > > references while resuming.
> > > 
> > > While resuming from S4, check whether the GGTT space is same or not for
> > > the given VF and fix-up if it is different.
> > > 
> > 
> > Before I really think about if this will work - quick question, have you
> > been able to test this out and see GGTT shift after S4 exit? Also with
> > running workloads?
> > 
> > Matt
> Hi Matt,
>  Yes. I have checked it with idle system and by running basic 3D workload
> glxgears. They work fine.
> -Satya.
>
> > > Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
> > > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > > Cc: Michal Winiarski <michal.winiarski@intel.com>
> > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > Cc: Tomasz Lis <tomasz.lis@intel.com>
> > > ---
> > > V1 -> V2:
> > > - Rebased to latest drm-tip.
> > > ---
> > >   drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 32 +++++++++++++++++++++++++++++
> > >   drivers/gpu/drm/xe/xe_gt_sriov_vf.h |  1 +
> > >   drivers/gpu/drm/xe/xe_pm.c          | 12 +++++++++++
> > >   3 files changed, 45 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
> > > index d0b102ab6ce8..38528495478f 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
> > > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
> > > @@ -1375,3 +1375,35 @@ void xe_gt_sriov_vf_wait_valid_ggtt(struct xe_gt *gt)
> > >   					       HZ * 5);
> > >   	xe_gt_WARN_ON(gt, !ret);
> > >   }
> > > +
> > > +/**
> > > + * xe_sriov_vf_fixup_ggtt - Fix up GGTT on resume from S4.
> > > + * @gt: the &xe_gt.
> > > + *
> > > + * This function shall be called only by VF.
> > > + * Main GT and media GT share the same GGTT space. So, fixups are needed
> > > + * only for Main GT.
> > > + *
> > > + * Returns: 0 if the operation completed successfully, or a negative
> > > + * error code otherwise.
> > > + */
> > > +int xe_gt_sriov_vf_fixup_ggtt(struct xe_gt *gt)
> > > +{
> > > +	int err = 0;
> > > +
> > > +	xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt)));
> > > +
> > > +	if (gt->info.type != XE_GT_TYPE_MAIN)
> > > +		return err;

I'd return 0 here. It makes it more clear this is not an error case.

> > > +
> > > +	xe_guc_comm_init_early(&gt->uc.guc);

Do you need xe_guc_comm_init_early? This appears to be purely a software
thing that should have already been run on driver load, and I don't think
this state would be lost across S4.

> > > +	err = xe_gt_sriov_vf_bootstrap(gt);
> > > +	if (err)
> > > +		goto out;

I'd drop the 'out' label and just return an error here.
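
To make the two control-flow suggestions concrete, here is a minimal
user-space sketch of the shape I have in mind. The struct, the enum and
the stubbed helpers are hypothetical stand-ins so the snippet compiles on
its own; they are not the real xe definitions. The
xe_guc_comm_init_early call is left out only because the question above
is still open.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver types; not the real xe structs. */
enum gt_type { GT_TYPE_MAIN, GT_TYPE_MEDIA };

struct gt {
	enum gt_type type;
	bool is_vf;
	int bootstrap_ret;	/* value the stubbed bootstrap step returns */
	int fixups_ret;		/* value the stubbed fixup step returns */
	int fixups_calls;	/* how many times the fixup step ran */
};

/* Stub standing in for xe_gt_sriov_vf_bootstrap(). */
static int vf_bootstrap(struct gt *gt)
{
	return gt->bootstrap_ret;
}

/* Stub standing in for vf_post_migration_fixups(). */
static int vf_post_migration_fixups(struct gt *gt)
{
	gt->fixups_calls++;
	return gt->fixups_ret;
}

static int vf_fixup_ggtt(struct gt *gt)
{
	int err;

	assert(gt->is_vf);	/* stands in for xe_gt_assert() */

	/* Media GT shares the main GT's GGTT: nothing to do, not an error. */
	if (gt->type != GT_TYPE_MAIN)
		return 0;

	err = vf_bootstrap(gt);
	if (err)
		return err;	/* no 'out' label needed */

	return vf_post_migration_fixups(gt);
}
```

With only one exit action (returning the code), the early returns read
more directly than a label that does nothing but 'return err'.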

> > > +
> > > +	err = vf_post_migration_fixups(gt);
> > > +
> > > +out:
> > > +	return err;
> > > +}
> > > +
> > > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
> > > index af40276790fa..17004223f33a 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
> > > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
> > > @@ -39,5 +39,6 @@ void xe_gt_sriov_vf_print_runtime(struct xe_gt *gt, struct drm_printer *p);
> > >   void xe_gt_sriov_vf_print_version(struct xe_gt *gt, struct drm_printer *p);
> > >   void xe_gt_sriov_vf_wait_valid_ggtt(struct xe_gt *gt);
> > > +int xe_gt_sriov_vf_fixup_ggtt(struct xe_gt *gt);
> > >   #endif
> > > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > > index 210298c4bcb1..68bf08ae62f6 100644
> > > --- a/drivers/gpu/drm/xe/xe_pm.c
> > > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > > @@ -19,6 +19,7 @@
> > >   #include "xe_ggtt.h"
> > >   #include "xe_gt.h"
> > >   #include "xe_gt_idle.h"
> > > +#include "xe_gt_sriov_vf.h"
> > >   #include "xe_i2c.h"
> > >   #include "xe_irq.h"
> > >   #include "xe_late_bind_fw.h"
> > > @@ -248,6 +249,17 @@ int xe_pm_resume(struct xe_device *xe)
> > >   	xe_display_pm_resume_early(xe);
> > > +	/* GGTT fixups (if needed) have to be done before restoring BOs */
> > > +	if (IS_SRIOV_VF(xe)) {
> > > +		for_each_gt(gt, xe, id) {
> > > +			err = xe_gt_sriov_vf_fixup_ggtt(gt);
> > > +			if (err) {
> > > +				drm_err(&xe->drm, "GGTT fixups failed with %d\n", err);
> > > +				return err;

I think you need 'goto err' here so that xe_pm_block_end_signalling is
called and we get the 'Device resume failed' error message.

I also noticed that if 'xe_pcode_ready' fails, the error path just
returns too, which doesn't look right. While you are here, can you fix
that too?
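
Roughly, the error handling I am describing follows the pattern below.
This is a stand-alone sketch, not the real xe_pm_resume(): struct dev,
its fields and block_start_signalling() are hypothetical, and the two
failing steps are modelled as plain return values. The point is only
that every failure funnels through one label that logs and ends the
signalling block.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the device; not the real struct xe_device. */
struct dev {
	int pcode_ret;			/* result of the stubbed pcode check */
	int fixup_ret;			/* result of the stubbed GGTT fixups */
	bool signalling_blocked;	/* true while the block is open */
	bool failure_logged;		/* true once the failure was reported */
};

/* Hypothetical counterpart to the end-signalling call. */
static void block_start_signalling(struct dev *d)
{
	d->signalling_blocked = true;
}

/* Stands in for xe_pm_block_end_signalling(). */
static void block_end_signalling(struct dev *d)
{
	d->signalling_blocked = false;
}

static int pm_resume(struct dev *d)
{
	int err;

	block_start_signalling(d);

	err = d->pcode_ret;	/* stands in for xe_pcode_ready() */
	if (err)
		goto err;

	err = d->fixup_ret;	/* stands in for the per-GT GGTT fixups */
	if (err)
		goto err;

	block_end_signalling(d);
	return 0;

err:
	/* Single exit for failures: report, then end the signalling block. */
	fprintf(stderr, "Device resume failed (%d)\n", err);
	d->failure_logged = true;
	block_end_signalling(d);
	return err;
}
```

A bare 'return err' in either branch would skip both the message and the
end-signalling call, which is the problem with the current paths.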

Matt

> > > +			}
> > > +		}
> > > +	}
> > > +
> > >   	/*
> > >   	 * This only restores pinned memory which is the memory required for the
> > >   	 * GT(s) to resume.
> > > -- 
> > > 2.51.0
> > > 
> 

Thread overview: 8+ messages
2025-10-23  6:46 [PATCH v2] drm/xe/vf: Fix up GGTT on S4 resume under SR-IOV Satyanarayana K V P
2025-10-23  7:06 ` ✓ CI.KUnit: success for drm/xe/vf: Fix up GGTT on S4 resume under SR-IOV (rev2) Patchwork
2025-10-23  7:44 ` ✓ Xe.CI.BAT: " Patchwork
2025-10-23 13:25 ` ✗ Xe.CI.Full: failure " Patchwork
2025-10-24  2:49 ` [PATCH v2] drm/xe/vf: Fix up GGTT on S4 resume under SR-IOV Matthew Brost
2025-10-24  4:09   ` K V P, Satyanarayana
2025-10-24 15:58     ` Matthew Brost [this message]
2025-11-03 11:16 ` Michal Wajdeczko
