From: "Lis, Tomasz" <tomasz.lis@intel.com>
To: Michal Wajdeczko <michal.wajdeczko@intel.com>,
<intel-xe@lists.freedesktop.org>
Cc: "Michał Winiarski" <michal.winiarski@intel.com>,
"Piotr Piórkowski" <piotr.piorkowski@intel.com>,
"Matthew Brost" <matthew.brost@intel.com>,
"Satyanarayana K V P" <satyanarayana.k.v.p@intel.com>
Subject: Re: [PATCH v6 2/4] drm/xe/vf: Fix GuC FW check for VF migration support
Date: Wed, 22 Oct 2025 00:39:13 +0200
Message-ID: <d4dc560a-aeea-43a7-997c-06f5b03f0745@intel.com>
In-Reply-To: <5b46eed6-fa1d-4ab2-8963-a8915b802c12@intel.com>
On 10/21/2025 8:39 PM, Michal Wajdeczko wrote:
>
> On 10/21/2025 8:12 PM, Tomasz Lis wrote:
>> The check whether GuC ABI version meets requirements shall be
>> performed after said version is received from GuC.
>>
>> Doing it in wrong order was triggering a warning:
>> xe 0000:00:02.1: [drm] Assertion `gt->sriov.vf.guc_version.major` failed!
>>
>> With this change, part of the VF migration support check is split out
>> and moved to after the GuC handshake.
>>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> Tested-by: Matthew Brost <matthew.brost@intel.com> #v1
>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6349
>> Fixes: ff1d2b5e3d28 ("drm/xe: Read VF GMD_ID with a specifically-allocated dummy GT")
>> Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
>> ---
>> v2: Use xe_sriov_vf_ccs_migration_bb_needed()
>>
>> v3: Update commit message, move check funct to ccs module (Michal),
>> rename xe_sriov_vf_migration_disable(), remove its duplicate
>>
>> v4: Limit scope of some functions to xe_sriov_vf_ccs file,
>> switched 'Fixes:' tag to a different commit (Michal)
>>
>> v5: Squashed with "Helper for telling whether CCS migration BBs are
>> needed", added kerneldoc, moved location of some checks (Michal)
>>
>> drivers/gpu/drm/xe/xe_sriov_vf.c | 33 +++++++------------
>> drivers/gpu/drm/xe/xe_sriov_vf.h | 1 +
>> drivers/gpu/drm/xe/xe_sriov_vf_ccs.c | 48 ++++++++++++++++++++++++++--
>> 3 files changed, 59 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_sriov_vf.c b/drivers/gpu/drm/xe/xe_sriov_vf.c
>> index 0d8135f3927c..13d6c094ae8f 100644
>> --- a/drivers/gpu/drm/xe/xe_sriov_vf.c
>> +++ b/drivers/gpu/drm/xe/xe_sriov_vf.c
>> @@ -133,7 +133,12 @@ bool xe_sriov_vf_migration_supported(struct xe_device *xe)
>> return !xe->sriov.vf.migration.disabled;
>> }
>>
>> -static void vf_disable_migration(struct xe_device *xe, const char *fmt, ...)
>> +/**
>> + * xe_sriov_vf_migration_disable - Turn off VF migration with given log message.
>> + * @xe: the &xe_device instance.
>> + * @fmt: format string for the log message, to be combined with following VAs.
>> + */
>> +void xe_sriov_vf_migration_disable(struct xe_device *xe, const char *fmt, ...)
>> {
>> struct va_format vaf;
>> va_list va_args;
>> @@ -156,22 +161,13 @@ static void vf_migration_init_early(struct xe_device *xe)
>> * supported at production quality.
>> */
>> if (!IS_ENABLED(CONFIG_DRM_XE_DEBUG))
>> - return vf_disable_migration(xe,
>> - "experimental feature not available on production builds");
>> + return xe_sriov_vf_migration_disable(xe,
>> + "experimental feature not available on production builds");
> indent ?
It makes sense to abandon the bracket-alignment rule when following it
would produce unreasonably long lines.
>
>>
>> if (GRAPHICS_VER(xe) < 20)
>> - return vf_disable_migration(xe, "requires gfx version >= 20, but only %u found",
>> - GRAPHICS_VER(xe));
>> -
>> - if (!IS_DGFX(xe)) {
>> - struct xe_uc_fw_version guc_version;
>> -
>> - xe_gt_sriov_vf_guc_versions(xe_device_get_gt(xe, 0), NULL, &guc_version);
>> - if (MAKE_GUC_VER_STRUCT(guc_version) < MAKE_GUC_VER(1, 23, 0))
>> - return vf_disable_migration(xe,
>> - "CCS migration requires GuC ABI >= 1.23 but only %u.%u found",
>> - guc_version.major, guc_version.minor);
>> - }
>> + return xe_sriov_vf_migration_disable(xe,
>> + "requires gfx version >= 20, but only %u found",
>> + GRAPHICS_VER(xe));
>> }
>>
>> /**
>> @@ -193,12 +189,7 @@ void xe_sriov_vf_init_early(struct xe_device *xe)
>> */
>> int xe_sriov_vf_init_late(struct xe_device *xe)
>> {
>> - int err = 0;
>> -
>> - if (xe_sriov_vf_migration_supported(xe))
>> - err = xe_sriov_vf_ccs_init(xe);
>> -
>> - return err;
>> + return xe_sriov_vf_ccs_init(xe);
>> }
>>
>> static int sa_info_vf_ccs(struct seq_file *m, void *data)
>> diff --git a/drivers/gpu/drm/xe/xe_sriov_vf.h b/drivers/gpu/drm/xe/xe_sriov_vf.h
>> index 4df95266b261..e967d4166a43 100644
>> --- a/drivers/gpu/drm/xe/xe_sriov_vf.h
>> +++ b/drivers/gpu/drm/xe/xe_sriov_vf.h
>> @@ -14,6 +14,7 @@ struct xe_device;
>> void xe_sriov_vf_init_early(struct xe_device *xe);
>> int xe_sriov_vf_init_late(struct xe_device *xe);
>> bool xe_sriov_vf_migration_supported(struct xe_device *xe);
>> +void xe_sriov_vf_migration_disable(struct xe_device *xe, const char *fmt, ...);
>> void xe_sriov_vf_debugfs_register(struct xe_device *xe, struct dentry *root);
>>
>> #endif
>> diff --git a/drivers/gpu/drm/xe/xe_sriov_vf_ccs.c b/drivers/gpu/drm/xe/xe_sriov_vf_ccs.c
>> index 790249801364..842e2a4e4774 100644
>> --- a/drivers/gpu/drm/xe/xe_sriov_vf_ccs.c
>> +++ b/drivers/gpu/drm/xe/xe_sriov_vf_ccs.c
>> @@ -10,6 +10,8 @@
>> #include "xe_device.h"
>> #include "xe_exec_queue.h"
>> #include "xe_exec_queue_types.h"
>> +#include "xe_gt_sriov_vf.h"
>> +#include "xe_guc.h"
>> #include "xe_guc_submit.h"
>> #include "xe_lrc.h"
>> #include "xe_migrate.h"
>> @@ -260,6 +262,49 @@ int xe_sriov_vf_ccs_register_context(struct xe_device *xe)
>> return err;
>> }
>>
>> +/*
>> + * Whether GuC requires CCS copy BBs for VF migration.
>> + * @xe: the &xe_device instance.
>> + *
>> + * Only selected platforms require VF KMD to maintain CCS copy BBs and linked LRCAs.
>> + *
>> + * Return: true if VF driver must participate in the CCS migration, false otherwise.
>> + */
>> +static bool vf_migration_ccs_bb_needed(struct xe_device *xe)
>> +{
>> + xe_assert(xe, IS_SRIOV_VF(xe));
>> +
>> + return !IS_DGFX(xe) && xe_device_has_flat_ccs(xe);
>> +}
>> +
>> +/*
>> + * Check for disable migration due to no CCS BBs support in GuC FW.
>> + * @xe: the &xe_device instance.
>> + *
>> + * Performs late disable of VF migration feature in case GuC FW cannot support it.
>> + *
>> + * Returns: True if VF migration with CCS BBs is supported, false othherwise.
> typo
ack
>
>> + */
>> +static bool vf_migration_ccs_bb_support_check(struct xe_device *xe)
>> +{
>> + struct xe_gt *gt = xe_device_get_gt(xe, 0);
> this will make static code analyzer unhappy
>
> likely xe_root_mmio_gt(xe) can be used instead to avoid that
I assume you're referring to the theoretical possibility of that function
returning NULL.
I'm not sure how the analyzer decides that one can return NULL and the
other can't, but I believe you. Will change.
>> + struct xe_uc_fw_version guc_version;
>> +
>> + if (!xe_sriov_vf_migration_supported(xe) ||
>> + !vf_migration_ccs_bb_needed(xe))
>> + return false;
> nit: IMO it would be cleaner if moved to the caller side ...
Is it really cleaner to expose multiple conditions, which are easily
packed into one, at the higher level?
I don't mind which way this goes, so I will change it. It is an unusual
request, though.
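
For reference, a standalone sketch of how the caller-side variant could
look. This is only an illustration and not the driver code: struct
fake_device, MAKE_VER() and the helper names are hypothetical stand-ins
for the xe structures and macros, and ccs_init() returns 1 where the real
function would go on to set up the CCS contexts, just so that the outcome
is observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver types; not the real xe structures. */
struct fw_version {
	unsigned int major, minor, patch;
};

struct fake_device {
	bool is_vf;
	bool is_dgfx;
	bool has_flat_ccs;
	bool migration_disabled;
	struct fw_version guc_abi;	/* meaningful only after the GuC handshake */
};

/* Assumed packing: one comparable integer per version triple. */
#define MAKE_VER(maj, min, pat)	(((maj) << 16) | ((min) << 8) | (pat))

static bool migration_supported(const struct fake_device *dev)
{
	return !dev->migration_disabled;
}

static bool ccs_bb_needed(const struct fake_device *dev)
{
	return !dev->is_dgfx && dev->has_flat_ccs;
}

/* With the early-outs moved to the caller, only the FW version check is left. */
static bool ccs_bb_support_check(struct fake_device *dev)
{
	const struct fw_version *v = &dev->guc_abi;

	if (MAKE_VER(v->major, v->minor, v->patch) < MAKE_VER(1u, 23u, 0u)) {
		dev->migration_disabled = true;
		fprintf(stderr,
			"CCS migration requires GuC ABI >= 1.23 but only %u.%u found\n",
			v->major, v->minor);
		return false;
	}

	return true;
}

/* Returns 1 if CCS contexts would be set up, 0 if the init is skipped. */
static int ccs_init(struct fake_device *dev)
{
	assert(dev->is_vf);

	/* All three conditions are now visible at the call site. */
	if (!migration_supported(dev) ||
	    !ccs_bb_needed(dev) ||
	    !ccs_bb_support_check(dev))
		return 0;

	return 1;
}
```

The behaviour is the same as with the conditions packed into the helper;
the only difference is where a reader has to look to see why the init is
skipped.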
-Tomasz
>
>> +
>> + xe_gt_sriov_vf_guc_versions(gt, NULL, &guc_version);
>> + if (MAKE_GUC_VER_STRUCT(guc_version) < MAKE_GUC_VER(1, 23, 0)) {
>> + xe_sriov_vf_migration_disable(xe,
>> + "CCS migration requires GuC ABI >= 1.23 but only %u.%u found",
>> + guc_version.major, guc_version.minor);
>> + return false;
>> + }
>> +
>> + return true;
>> +}
>> +
>> static void xe_sriov_vf_ccs_fini(void *arg)
>> {
>> struct xe_sriov_vf_ccs_ctx *ctx = arg;
>> @@ -292,9 +337,8 @@ int xe_sriov_vf_ccs_init(struct xe_device *xe)
>> int err;
>>
>> xe_assert(xe, IS_SRIOV_VF(xe));
>> - xe_assert(xe, xe_sriov_vf_migration_supported(xe));
>>
>> - if (IS_DGFX(xe) || !xe_device_has_flat_ccs(xe))
>> + if (!vf_migration_ccs_bb_support_check(xe))
> ... here
>
>> return 0;
>>
>> for_each_ccs_rw_ctx(ctx_id) {
> but otherwise LGTM, so if CI is happy, then
>
> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>