Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: "Michal Wajdeczko" <michal.wajdeczko@intel.com>,
	"Michał Winiarski" <michal.winiarski@intel.com>,
	"Sasha Levin" <sashal@kernel.org>,
	lucas.demarchi@intel.com, thomas.hellstrom@linux.intel.com,
	rodrigo.vivi@intel.com, airlied@gmail.com, simona@ffwll.ch,
	intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: [PATCH AUTOSEL 6.12 469/486] drm/xe/pf: Reset GuC VF config when unprovisioning critical resource
Date: Mon,  5 May 2025 18:39:05 -0400	[thread overview]
Message-ID: <20250505223922.2682012-469-sashal@kernel.org> (raw)
In-Reply-To: <20250505223922.2682012-1-sashal@kernel.org>

From: Michal Wajdeczko <michal.wajdeczko@intel.com>

[ Upstream commit 33f17e2cbd930a2a00eb007d9b241b6db010a880 ]

GuC firmware counts received VF configuration KLVs and may start
validation of the complete VF config even if some resources where
unprovisioned in the meantime, leading to unexpected errors like:

 $ echo 1 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/contexts_quota
 $ echo 0 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/contexts_quota
 $ echo 1 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/doorbells_quota
 $ echo 0 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/doorbells_quota
 $ echo 1 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/ggtt_quota
 tee: '/sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/ggtt_quota': Input/output error

To mitigate this problem trigger explicit VF config reset after
unprovisioning any of the critical resources (GGTT, context or
doorbell IDs) that GuC is monitoring.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129195947.764-3-michal.wajdeczko@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 37 +++++++++++++++++++---
 1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
index c9ed996b9cb0c..786f0dba41437 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
@@ -323,6 +323,26 @@ static int pf_push_full_vf_config(struct xe_gt *gt, unsigned int vfid)
 	return err;
 }
 
+static int pf_push_vf_cfg(struct xe_gt *gt, unsigned int vfid, bool reset)
+{
+	int err = 0;
+
+	xe_gt_assert(gt, vfid);
+	lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
+
+	if (reset)
+		err = pf_send_vf_cfg_reset(gt, vfid);
+	if (!err)
+		err = pf_push_full_vf_config(gt, vfid);
+
+	return err;
+}
+
+static int pf_refresh_vf_cfg(struct xe_gt *gt, unsigned int vfid)
+{
+	return pf_push_vf_cfg(gt, vfid, true);
+}
+
 static u64 pf_get_ggtt_alignment(struct xe_gt *gt)
 {
 	struct xe_device *xe = gt_to_xe(gt);
@@ -419,6 +439,10 @@ static int pf_provision_vf_ggtt(struct xe_gt *gt, unsigned int vfid, u64 size)
 			return err;
 
 		pf_release_vf_config_ggtt(gt, config);
+
+		err = pf_refresh_vf_cfg(gt, vfid);
+		if (unlikely(err))
+			return err;
 	}
 	xe_gt_assert(gt, !xe_ggtt_node_allocated(config->ggtt_region));
 
@@ -744,6 +768,10 @@ static int pf_provision_vf_ctxs(struct xe_gt *gt, unsigned int vfid, u32 num_ctx
 			return ret;
 
 		pf_release_config_ctxs(gt, config);
+
+		ret = pf_refresh_vf_cfg(gt, vfid);
+		if (unlikely(ret))
+			return ret;
 	}
 
 	if (!num_ctxs)
@@ -1041,6 +1069,10 @@ static int pf_provision_vf_dbs(struct xe_gt *gt, unsigned int vfid, u32 num_dbs)
 			return ret;
 
 		pf_release_config_dbs(gt, config);
+
+		ret = pf_refresh_vf_cfg(gt, vfid);
+		if (unlikely(ret))
+			return ret;
 	}
 
 	if (!num_dbs)
@@ -2003,10 +2035,7 @@ int xe_gt_sriov_pf_config_push(struct xe_gt *gt, unsigned int vfid, bool refresh
 	xe_gt_assert(gt, vfid);
 
 	mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
-	if (refresh)
-		err = pf_send_vf_cfg_reset(gt, vfid);
-	if (!err)
-		err = pf_push_full_vf_config(gt, vfid);
+	err = pf_push_vf_cfg(gt, vfid, refresh);
 	mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
 
 	if (unlikely(err)) {
-- 
2.39.5


  parent reply	other threads:[~2025-05-05 22:56 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20250505223922.2682012-1-sashal@kernel.org>
2025-05-05 22:34 ` [PATCH AUTOSEL 6.12 183/486] drm/xe: Nuke VM's mapping upon close Sasha Levin
2025-05-05 22:34 ` [PATCH AUTOSEL 6.12 184/486] drm/xe: Retry BO allocation Sasha Levin
2025-05-05 22:35 ` [PATCH AUTOSEL 6.12 236/486] drm/xe/vf: Retry sending MMIO request to GUC on timeout error Sasha Levin
2025-05-05 22:35 ` [PATCH AUTOSEL 6.12 237/486] drm/xe/pf: Create a link between PF and VF devices Sasha Levin
2025-05-05 22:35 ` [PATCH AUTOSEL 6.12 242/486] drm/xe: xe_gen_wa_oob: replace program_invocation_short_name Sasha Levin
2025-05-05 22:35 ` [PATCH AUTOSEL 6.12 275/486] drm/xe/oa: Ensure that polled read returns latest data Sasha Levin
2025-05-05 22:36 ` [PATCH AUTOSEL 6.12 343/486] drm/xe: Stop ignoring errors from xe_ttm_stolen_mgr_init() Sasha Levin
2025-05-05 22:37 ` [PATCH AUTOSEL 6.12 344/486] drm/xe: Fix xe_tile_init_noalloc() error propagation Sasha Levin
2025-05-05 22:37 ` [PATCH AUTOSEL 6.12 347/486] drm/xe/debugfs: fixed the return value of wedged_mode_set Sasha Levin
2025-05-05 22:37 ` [PATCH AUTOSEL 6.12 348/486] drm/xe/debugfs: Add missing xe_pm_runtime_put in wedge_mode_set Sasha Levin
2025-05-05 22:38 ` [PATCH AUTOSEL 6.12 439/486] drm/xe/relay: Don't use GFP_KERNEL for new transactions Sasha Levin
2025-05-05 22:39 ` Sasha Levin [this message]
2025-05-05 22:39 ` [PATCH AUTOSEL 6.12 477/486] drm/xe: Move suballocator init to after display init Sasha Levin
2025-05-05 22:39 ` [PATCH AUTOSEL 6.12 478/486] drm/xe: Do not attempt to bootstrap VF in execlists mode Sasha Levin
2025-05-05 22:39 ` [PATCH AUTOSEL 6.12 480/486] drm/xe/sa: Always call drm_suballoc_manager_fini() Sasha Levin
2025-05-05 22:39 ` [PATCH AUTOSEL 6.12 481/486] drm/xe: Reject BO eviction if BO is bound to current VM Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250505223922.2682012-469-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=airlied@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lucas.demarchi@intel.com \
    --cc=michal.wajdeczko@intel.com \
    --cc=michal.winiarski@intel.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=simona@ffwll.ch \
    --cc=stable@vger.kernel.org \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox