From: Tomasz Lis
To: intel-xe@lists.freedesktop.org
Cc: Michał Winiarski, Michał Wajdeczko, Piotr Piórkowski, Matthew Brost, Satyanarayana K V P
Subject: [PATCH v3 3/5] drm/xe/vf: Skip fixups on VF migration before getting GGTT info
Date: Thu, 16 Oct 2025 14:05:09 +0200
Message-Id: <20251016120511.856792-4-tomasz.lis@intel.com>
In-Reply-To: <20251016120511.856792-1-tomasz.lis@intel.com>
References: <20251016120511.856792-1-tomasz.lis@intel.com>

The GuC RESFIX state should be reachable only after a successful
handshake. If the VF KMD has no GGTT configuration yet and we still got
into the RESFIX state, then either we are dealing with an unclean
initial state due to unusual actions before probe, or the migration
happened while xe init (started by probe) was running.

In the first case (VF migration before probe), we should just skip the
migration procedure. The init procedure will ensure exit from the RESFIX
state, as it starts the GuC handshake with a reset.

In the second case (VF migration during xe init), the migration
procedure should execute normally if the GGTT configuration was already
acquired from GuC, and can be skipped if it was not.

This solution avoids crashes caused by the VF migration procedure
running on non-initialized xe sub-structures. However, it is not enough
to allow fully reliable migration during driver probe.
In particular, the probe might not complete successfully in the
following situation:
* The VF is paused and migrated after GuC reset (vf_bootstrap) but
  before the config is acquired (vf_query_config). In such a case, GuC
  may remain in the RESFIX state, causing requests to time out.

Signed-off-by: Tomasz Lis
---
 drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
index 34c68de6e2f3..bb0b71a47125 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
@@ -1149,6 +1149,12 @@ void xe_gt_sriov_vf_print_version(struct xe_gt *gt, struct drm_printer *p)
 		   pf_version->major, pf_version->minor);
 }
 
+static bool vf_ggtt_queried(struct xe_tile *tile)
+{
+	guard(mutex)(&tile->mem.ggtt->lock);
+	return xe_tile_sriov_vf_ggtt(tile) != 0;
+}
+
 static bool vf_post_migration_shutdown(struct xe_gt *gt)
 {
 	struct xe_device *xe = gt_to_xe(gt);
@@ -1260,6 +1266,11 @@ static void vf_post_migration_recovery(struct xe_gt *gt)
 	xe_gt_sriov_dbg(gt, "migration recovery in progress\n");
 
 	xe_pm_runtime_get(xe);
+
+	/* If during init and before GGTT configuration, skip the procedure. */
+	if (!vf_ggtt_queried(gt_to_tile(gt)))
+		goto skip;
+
 	retry = vf_post_migration_shutdown(gt);
 	if (retry)
 		goto queue;
@@ -1282,6 +1293,7 @@ static void vf_post_migration_recovery(struct xe_gt *gt)
 
 	vf_post_migration_kickstart(gt);
 
+skip:
 	xe_pm_runtime_put(xe);
 	xe_gt_sriov_notice(gt, "migration recovery ended\n");
 	return;
-- 
2.25.1
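
For readers who want to reason about the skip condition outside the kernel tree, the following is a minimal user-space sketch of the decision this patch adds. It is not driver code. The names model_tile, ggtt_cfg, model_ggtt_queried and model_post_migration_recovery are hypothetical stand-ins, and the sketch assumes only what the patch's comparison implies: that the value returned by xe_tile_sriov_vf_ggtt() stays zero until the GGTT configuration has been acquired. Locking and runtime PM are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-tile VF state; not the real struct xe_tile. */
struct model_tile {
	/* Assumed to be 0 until the GGTT configuration is acquired from GuC. */
	uint64_t ggtt_cfg;
};

enum model_outcome {
	MODEL_RECOVERY_SKIPPED,	/* recovery bypassed, init will reset GuC */
	MODEL_RECOVERY_RAN,	/* fixups were applied */
};

/* Models the intent of vf_ggtt_queried(): non-zero means config is known. */
bool model_ggtt_queried(const struct model_tile *tile)
{
	assert(tile != NULL);
	return tile->ggtt_cfg != 0;
}

/*
 * Models the early exit added to the recovery path: without a GGTT
 * configuration there is nothing to fix up, so the fixups are skipped.
 * The counter only makes the effect observable in this sketch.
 */
enum model_outcome model_post_migration_recovery(const struct model_tile *tile,
						 unsigned int *fixups_applied)
{
	if (!model_ggtt_queried(tile))
		return MODEL_RECOVERY_SKIPPED;

	(*fixups_applied)++;
	return MODEL_RECOVERY_RAN;
}
```

Calling model_post_migration_recovery() on a tile whose ggtt_cfg is still zero returns MODEL_RECOVERY_SKIPPED and leaves the counter untouched; once ggtt_cfg is non-zero the same call returns MODEL_RECOVERY_RAN. The sketch does not model the remaining gap described in the commit message (migration between vf_bootstrap and vf_query_config).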