From: "Lis, Tomasz" <tomasz.lis@intel.com>
To: Michal Wajdeczko <michal.wajdeczko@intel.com>,
	<intel-xe@lists.freedesktop.org>
Cc: "Michał Winiarski" <michal.winiarski@intel.com>
Subject: Re: [PATCH v2 3/4] drm/xe/vf: Start post-migration fixups with provisinoning query
Date: Thu, 26 Sep 2024 23:32:43 +0200
Message-ID: <19f4d72f-9e69-4c19-b264-e7207537f3a2@intel.com>
In-Reply-To: <a41bb1ba-0f14-4616-967d-9ad8ef014b63@intel.com>


On 26.09.2024 16:27, Michal Wajdeczko wrote:
>
> On 24.09.2024 22:25, Tomasz Lis wrote:
>> During post-migration recovery, only MMIO communication to GuC is
>> allowed. The VF KMD needs to use that channel to ask for the new
>> provisioning, which includes a new GGTT range assigned to the VF.
> you likely need first to remove below assert from the
> xe_guc_mmio_send_recv()
>
> 	xe_assert(xe, !xe_guc_ct_enabled(&guc->ct));
>
>> v2: query config only instead of handshake; no need to get pm ref as
>>   it's now kept through whole recovery (mwajdeczko)
>>
>> Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_sriov_vf.c | 39 ++++++++++++++++++++++++++++++++
>>   1 file changed, 39 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_sriov_vf.c b/drivers/gpu/drm/xe/xe_sriov_vf.c
>> index d0c5a0b7e170..fe5eefa736c8 100644
>> --- a/drivers/gpu/drm/xe/xe_sriov_vf.c
>> +++ b/drivers/gpu/drm/xe/xe_sriov_vf.c
>> @@ -24,6 +24,34 @@ void xe_sriov_vf_init_early(struct xe_device *xe)
>>   	INIT_WORK(&xe->sriov.vf.migration.worker, migration_worker_func);
>>   }
>>   
>> +/**
>> + * vf_post_migration_requery_guc - Re-initialize GuC communication.
>> + * @xe: the &xe_device struct instance
>> + *
>> + * After migration, we need to reestablish communication with GuC and
>> + * re-query all VF configuration to make sure they match previous
>> + * provisioning. Note that most of VF provisioning shall be the same,
>> + * except GGTT range, since GGTT is not virtualized per-VF.
>> + *
>> + * Returns: 0 if the operation completed successfully, or a negative error
>> + * code otherwise.
>> + */
>> +static int vf_post_migration_requery_guc(struct xe_device *xe)
>> +{
>> +	struct xe_gt *gt;
>> +	unsigned int id;
>> +	int err, ret;
> 	int err, ret = 0;
>
>> +
>> +	err = 0;
>> +	for_each_gt(gt, xe, id) {
>> +		ret = xe_gt_sriov_vf_query_config(gt);
> 		err = xe_gt_sriov_vf_query_config(gt);
>
>> +		if (!err)
>> +			err = ret;
> 		ret = ret ?: err;
>
>> +	}
>> +
>> +	return err;
> 	return ret;

OK, will do, though it doesn't seem any better than it was before.

Well, except maybe for the ternary operator, which does look better
(though the two-operand "ret ?: err" form, with the middle operand
omitted, is a GNU extension and not part of C89).
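
For reference, a minimal standalone sketch of the two accumulation forms
discussed above. query_config() and the results table are hypothetical
stand-ins for xe_gt_sriov_vf_query_config() and the for_each_gt() loop;
none of this is driver code. Both forms keep the first non-zero result, so
they should behave identically. The second one is spelled with the full
conditional operator so that it builds as strict ISO C; with GNU C it could
be shortened to "ret = ret ?: err;".

```c
/* Hypothetical stand-in for xe_gt_sriov_vf_query_config(): returns the
 * canned per-GT result stored at index id. */
static int query_config(const int *results, int id)
{
	return results[id];
}

/* Form used in the patch: remember the first non-zero result. */
static int requery_if_form(const int *results, int n)
{
	int err = 0, ret, id;

	for (id = 0; id < n; id++) {
		ret = query_config(results, id);
		if (!err)
			err = ret;
	}
	return err;
}

/* Form suggested in review, written as "ret ? ret : err" so that no
 * GNU extension is needed. Once ret is non-zero it is never replaced. */
static int requery_ternary_form(const int *results, int n)
{
	int err, ret = 0, id;

	for (id = 0; id < n; id++) {
		err = query_config(results, id);
		ret = ret ? ret : err;
	}
	return ret;
}
```

In both versions every GT is still queried after a failure; only the
value that ends up being returned is fixed by the first error.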

>
>> +}
>> +
>>   /*
>>    * vf_post_migration_notify_resfix_done - Notify all GuCs about resource fixups apply finished.
>>    * @xe: the &xe_device struct instance
>> @@ -44,12 +72,23 @@ static void vf_post_migration_notify_resfix_done(struct xe_device *xe)
>>   
>>   static void vf_post_migration_recovery(struct xe_device *xe)
>>   {
>> +	int err;
>> +
>>   	drm_dbg(&xe->drm, "migration recovery in progress\n");
>>   	xe_pm_runtime_get(xe);
>> +	err = vf_post_migration_requery_guc(xe);
>> +	if (unlikely(err))
>> +		goto fail;
> shouldn't all this be below "add the recovery steps" line ?

If requery failed, why would we continue with fixups? We don't know the 
new GGTT range in that case.
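
To spell out the intended control flow, here is a minimal standalone
sketch with a mocked device. struct mock_dev, mock_requery() and
mock_recovery() are hypothetical names made up for illustration; the
comments name the real calls from the patch that they stand in for. The
point is only that a failed re-query skips the fixups, still drops the PM
reference, and ends with the device wedged.

```c
#include <stdbool.h>

/* Hypothetical mock of the state touched by the recovery path. */
struct mock_dev {
	int pm_refs;		/* runtime PM references currently held */
	bool fixups_done;	/* fixups applied and RESFIX_DONE sent */
	bool wedged;		/* device was declared wedged */
	int requery_result;	/* value the mocked re-query returns */
};

/* Stands in for vf_post_migration_requery_guc(). */
static int mock_requery(struct mock_dev *d)
{
	return d->requery_result;
}

/* Mirrors the shape of vf_post_migration_recovery() in the patch. */
static void mock_recovery(struct mock_dev *d)
{
	int err;

	d->pm_refs++;			/* xe_pm_runtime_get() */
	err = mock_requery(d);
	if (err)
		goto fail;

	d->fixups_done = true;		/* recovery steps + notify RESFIX_DONE */
	d->pm_refs--;			/* xe_pm_runtime_put() */
	return;
fail:
	d->pm_refs--;			/* reference is dropped on failure too */
	d->wedged = true;		/* xe_device_declare_wedged() */
}
```

Without a valid re-query result there is no new GGTT range to base the
fixups on, which is why the check sits ahead of the recovery steps.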

-Tomasz

>
>> +
>>   	/* FIXME: add the recovery steps */
>>   	vf_post_migration_notify_resfix_done(xe);
>>   	xe_pm_runtime_put(xe);
>>   	drm_notice(&xe->drm, "migration recovery ended\n");
>> +	return;
>> +fail:
>> +	xe_pm_runtime_put(xe);
>> +	drm_err(&xe->drm, "migration recovery failed (%pe)\n", ERR_PTR(err));
>> +	xe_device_declare_wedged(xe);
>>   }
>>   
>>   static void migration_worker_func(struct work_struct *w)
