Intel-XE Archive on lore.kernel.org
From: "Lis, Tomasz" <tomasz.lis@intel.com>
To: Michal Wajdeczko <michal.wajdeczko@intel.com>,
	<intel-xe@lists.freedesktop.org>
Cc: "Michał Winiarski" <michal.winiarski@intel.com>,
	"Piotr Piórkowski" <piotr.piorkowski@intel.com>,
	"Matthew Brost" <matthew.brost@intel.com>,
	"Satyanarayana K V P" <satyanarayana.k.v.p@intel.com>
Subject: Re: [PATCH v3 3/5] drm/xe/vf: Skip fixups on VF migration before getting GGTT info
Date: Mon, 20 Oct 2025 21:30:50 +0200
Message-ID: <d9840b65-3f53-493a-ac99-fbdfe6a6c7be@intel.com>
In-Reply-To: <d3d9ae10-5128-47cf-8c6f-3aa9ec9f0202@intel.com>


On 10/17/2025 12:20 AM, Michal Wajdeczko wrote:
>
> On 10/16/2025 2:05 PM, Tomasz Lis wrote:
>> The GuC RESFIX state should be achievable only after a successful
>> handshake. If VF KMD has no GGTT configuration yet and we still got
>> into RESFIX state, then either we're dealing with unclean initial
>> state due to unusual actions before probe, or the migration
>> happened while xe init (started by probe) was running.
>>
>> In the 1st case (VF migration before probe), we should just skip the
>> migration procedure. The init procedure will ensure exit from RESFIX
>> state, as it starts the GuC handshake with a reset.
>>
>> In the 2nd case (VF migration during xe init), the migration procedure
>> should execute normally if the GGTT configuration was already acquired
>> from GuC, and can be skipped if it was not acquired.
> we initiate recovery as part of the MEMIRQ handling
>
> but to get a MEMIRQ interrupt from the GuC we need to set it up first;
> to set up the MEMIRQ vector we need to obtain GGTT data from the GuC,
> and only then can we register this MEMIRQ vector in the GuC
>
> so valid GGTT data seems to be a hard prerequisite for this scenario
>
> thus vf_ggtt_queried() should always be true, no?

True. Patch makes little sense. Will discard.

-Tomasz

>
>> This solution will avoid crashes due to the VF migration running
>> on uninitialized xe sub-structures. But it is not enough to allow
>> fully reliable migration during driver probe. In particular, the
>> probe might still fail in the following situation:
>>
>> * The VF is paused and migrated after GuC reset (vf_bootstrap) but
>> before config is acquired (vf_query_config). In such a case, GuC may
>> remain in RESFIX state, causing requests to time out.
>>
>> Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 12 ++++++++++++
>>   1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
>> index 34c68de6e2f3..bb0b71a47125 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
>> @@ -1149,6 +1149,12 @@ void xe_gt_sriov_vf_print_version(struct xe_gt *gt, struct drm_printer *p)
>>   		   pf_version->major, pf_version->minor);
>>   }
>>   
>> +static bool vf_ggtt_queried(struct xe_tile *tile)
>> +{
>> +	guard(mutex)(&tile->mem.ggtt->lock);
>> +	return xe_tile_sriov_vf_ggtt(tile) != 0;
>> +}
>> +
>>   static bool vf_post_migration_shutdown(struct xe_gt *gt)
>>   {
>>   	struct xe_device *xe = gt_to_xe(gt);
>> @@ -1260,6 +1266,11 @@ static void vf_post_migration_recovery(struct xe_gt *gt)
>>   	xe_gt_sriov_dbg(gt, "migration recovery in progress\n");
>>   
>>   	xe_pm_runtime_get(xe);
>> +
>> +	/* If during init and before GGTT configuration, skip the procedure. */
>> +	if (!vf_ggtt_queried(gt_to_tile(gt)))
>> +		goto skip;
>> +
>>   	retry = vf_post_migration_shutdown(gt);
>>   	if (retry)
>>   		goto queue;
>> @@ -1282,6 +1293,7 @@ static void vf_post_migration_recovery(struct xe_gt *gt)
>>   
>>   	vf_post_migration_kickstart(gt);
>>   
>> +skip:
>>   	xe_pm_runtime_put(xe);
>>   	xe_gt_sriov_notice(gt, "migration recovery ended\n");
>>   	return;
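
The dependency chain raised in the review (GGTT data first, then MEMIRQ
vector setup, then registration in GuC, and only then a MEMIRQ-triggered
recovery) can be sketched as a small standalone C model. Every name below
(vf_model, vf_init_step, the vf_model_* helpers) is hypothetical and exists
only for illustration; none of it is xe driver code, and the strict linear
ordering is an assumption taken from the review comment, not from the
driver source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical init steps, in the order described in the review comment. */
enum vf_init_step {
	VF_STEP_NONE = 0,
	VF_STEP_GGTT_QUERIED,      /* GGTT data obtained from GuC */
	VF_STEP_MEMIRQ_SETUP,      /* MEMIRQ vector set up (needs GGTT data) */
	VF_STEP_MEMIRQ_REGISTERED, /* MEMIRQ vector registered in GuC */
};

/* Hypothetical stand-in for the VF state; not a real xe structure. */
struct vf_model {
	enum vf_init_step step;
};

/* Advance by exactly one step; skipping a prerequisite is rejected. */
static bool vf_model_advance(struct vf_model *vf, enum vf_init_step next)
{
	if ((int)next != (int)vf->step + 1)
		return false;
	vf->step = next;
	return true;
}

/* Models the question answered by the patch's vf_ggtt_queried() helper. */
static bool vf_model_ggtt_queried(const struct vf_model *vf)
{
	return vf->step >= VF_STEP_GGTT_QUERIED;
}

/*
 * Recovery is started from MEMIRQ handling, so in this model it is only
 * reachable once the vector has been registered in GuC.
 */
static bool vf_model_recovery_possible(const struct vf_model *vf)
{
	return vf->step >= VF_STEP_MEMIRQ_REGISTERED;
}
```

Under these assumptions, vf_model_recovery_possible() returning true
implies vf_model_ggtt_queried() is also true, which is the reason the
extra check in the recovery path would never take the skip branch.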
