Date: Thu, 26 Sep 2024 16:27:37 +0200
Subject: Re: [PATCH v2 3/4] drm/xe/vf: Start post-migration fixups with provisinoning query
To: Tomasz Lis, intel-xe@lists.freedesktop.org
Cc: Michał Winiarski
References: <20240924202553.1541574-1-tomasz.lis@intel.com> <20240924202553.1541574-4-tomasz.lis@intel.com>
From: Michal Wajdeczko
In-Reply-To: <20240924202553.1541574-4-tomasz.lis@intel.com>
List-Id: Intel Xe graphics driver

On 24.09.2024 22:25, Tomasz Lis wrote:
> During post-migration recovery, only MMIO communication to GuC is
> allowed. The VF KMD needs to use that channel to ask for the new
> provisioning, which includes a new GGTT range assigned to the VF.

you will likely first need to remove the assert below from
xe_guc_mmio_send_recv():

	xe_assert(xe, !xe_guc_ct_enabled(&guc->ct));

>
> v2: query config only instead of handshake; no need to get pm ref as
> it's now kept through whole recovery (mwajdeczko)
>
> Signed-off-by: Tomasz Lis
> ---
>  drivers/gpu/drm/xe/xe_sriov_vf.c | 39 ++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_sriov_vf.c b/drivers/gpu/drm/xe/xe_sriov_vf.c
> index d0c5a0b7e170..fe5eefa736c8 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_vf.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_vf.c
> @@ -24,6 +24,34 @@ void xe_sriov_vf_init_early(struct xe_device *xe)
>  	INIT_WORK(&xe->sriov.vf.migration.worker, migration_worker_func);
>  }
>
> +/**
> + * vf_post_migration_requery_guc - Re-initialize GuC communication.
> + * @xe: the &xe_device struct instance
> + *
> + * After migration, we need to reestablish communication with GuC and
> + * re-query all VF configuration to make sure they match previous
> + * provisioning. Note that most of VF provisioning shall be the same,
> + * except GGTT range, since GGTT is not virtualized per-VF.
> + *
> + * Returns: 0 if the operation completed successfully, or a negative error
> + * code otherwise.
> + */
> +static int vf_post_migration_requery_guc(struct xe_device *xe)
> +{
> +	struct xe_gt *gt;
> +	unsigned int id;
> +	int err, ret;

	int err, ret = 0;

> +
> +	err = 0;
> +	for_each_gt(gt, xe, id) {
> +		ret = xe_gt_sriov_vf_query_config(gt);

		err = xe_gt_sriov_vf_query_config(gt);

> +		if (!err)
> +			err = ret;

		ret = ret ?: err;

> +	}
> +
> +	return err;

	return ret;

> +}
> +
>  /*
>   * vf_post_migration_notify_resfix_done - Notify all GuCs about resource fixups apply finished.
>   * @xe: the &xe_device struct instance
> @@ -44,12 +72,23 @@ static void vf_post_migration_notify_resfix_done(struct xe_device *xe)
>
>  static void vf_post_migration_recovery(struct xe_device *xe)
>  {
> +	int err;
> +
>  	drm_dbg(&xe->drm, "migration recovery in progress\n");
>  	xe_pm_runtime_get(xe);
> +	err = vf_post_migration_requery_guc(xe);
> +	if (unlikely(err))
> +		goto fail;

shouldn't all of this go below the "add the recovery steps" line?

> +
>  	/* FIXME: add the recovery steps */
>  	vf_post_migration_notify_resfix_done(xe);
>  	xe_pm_runtime_put(xe);
>  	drm_notice(&xe->drm, "migration recovery ended\n");
> +	return;
> +fail:
> +	xe_pm_runtime_put(xe);
> +	drm_err(&xe->drm, "migration recovery failed (%pe)\n", ERR_PTR(err));
> +	xe_device_declare_wedged(xe);
>  }
>
>  static void migration_worker_func(struct work_struct *w)
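
For reference, the naming suggested in the loop comments above (err for the
per-GT result, ret for the value returned) can be tried outside the driver
with a minimal stand-alone sketch. Everything here is an assumption made for
illustration: struct fake_gt, fake_query_config() and requery_all() are
hypothetical stand-ins, not xe driver API. The sketch also spells the
GNU "ret = ret ?: err" shorthand as the portable "ret ? ret : err".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a GT; this is not the real struct xe_gt. */
struct fake_gt {
	int query_result;	/* 0 on success, negative errno on failure */
	int queried;		/* set to 1 once the query has run */
};

/* Hypothetical stand-in for xe_gt_sriov_vf_query_config(). */
static int fake_query_config(struct fake_gt *gt)
{
	gt->queried = 1;
	return gt->query_result;
}

/*
 * Query every GT, continuing past failures, and return the first error
 * seen, or 0 if every query succeeded.
 */
static int requery_all(struct fake_gt *gts, size_t count)
{
	int err, ret = 0;
	size_t i;

	assert(gts || !count);

	for (i = 0; i < count; i++) {
		err = fake_query_config(&gts[i]);
		/* Keep the first non-zero error; later ones are dropped. */
		ret = ret ? ret : err;
	}

	return ret;
}
```

With this shape the loop does not stop at the first failing GT: every GT is
still queried, and only the earliest error code is reported to the caller.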