From mboxrd@z Thu Jan 1 00:00:00 1970
From: Badal Nilawar
To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Cc: anshuman.gupta@intel.com, rodrigo.vivi@intel.com,
	alexander.usyskin@intel.com, gregkh@linuxfoundation.org,
	daniele.ceraolospurio@intel.com
Subject: [PATCH v7 5/9] drm/xe/xe_late_bind_fw: Load late binding firmware
Date: Tue, 8 Jul 2025 00:42:33 +0530
Message-Id: <20250707191237.1782824-6-badal.nilawar@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250707191237.1782824-1-badal.nilawar@intel.com>
References: <20250707191237.1782824-1-badal.nilawar@intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Load late binding firmware

v2:
 - s/EAGAIN/EBUSY/
 - Flush worker in suspend and driver unload (Daniele)
v3:
 - Use retry interval of 6s, in steps of 200ms, to
   allow other OS components to release the MEI CL handle (Sasha)
v4:
 - Return -ENODEV if component not added (Daniele)
 - Parse and print status returned by CSC
v5:
 - Use payload to check whether firmware is valid (Daniele)
 - Obtain the RPM reference before scheduling the worker to ensure the
   device remains awake until the worker completes firmware loading (Rodrigo)
v6:
 - In case of error, do not re-attempt fw download (Daniele)

Signed-off-by: Badal Nilawar
Reviewed-by: Daniele Ceraolo Spurio
---
 drivers/gpu/drm/xe/xe_late_bind_fw.c       | 155 ++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_late_bind_fw.h       |   1 +
 drivers/gpu/drm/xe/xe_late_bind_fw_types.h |   7 +
 3 files changed, 162 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c b/drivers/gpu/drm/xe/xe_late_bind_fw.c
index 54b815145a69..9804508ee90d 100644
--- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
@@ -16,6 +16,20 @@
 #include "xe_late_bind_fw.h"
 #include "xe_pcode.h"
 #include "xe_pcode_api.h"
+#include "xe_pm.h"
+
+/*
+ * The component should load quite quickly in most cases, but it could take
+ * a bit. Using a very big timeout just to cover the worst case scenario
+ */
+#define LB_INIT_TIMEOUT_MS 20000
+
+/*
+ * Retry interval set to 6 seconds, in steps of 200 ms, to allow time for
+ * other OS components to release the MEI CL handle
+ */
+#define LB_FW_LOAD_RETRY_MAXCOUNT 30
+#define LB_FW_LOAD_RETRY_PAUSE_MS 200
 
 static const u32 fw_id_to_type[] = {
 	[XE_LB_FW_FAN_CONTROL] = CSC_LATE_BINDING_TYPE_FAN_CONTROL,
@@ -31,6 +45,30 @@ late_bind_to_xe(struct xe_late_bind *late_bind)
 	return container_of(late_bind, struct xe_device, late_bind);
 }
 
+static const char *xe_late_bind_parse_status(uint32_t status)
+{
+	switch (status) {
+	case CSC_LATE_BINDING_STATUS_SUCCESS:
+		return "success";
+	case CSC_LATE_BINDING_STATUS_4ID_MISMATCH:
+		return "4Id Mismatch";
+	case CSC_LATE_BINDING_STATUS_ARB_FAILURE:
+		return "ARB Failure";
+	case CSC_LATE_BINDING_STATUS_GENERAL_ERROR:
+		return "General Error";
+	case CSC_LATE_BINDING_STATUS_INVALID_PARAMS:
+		return "Invalid Params";
+	case CSC_LATE_BINDING_STATUS_INVALID_SIGNATURE:
+		return "Invalid Signature";
+	case CSC_LATE_BINDING_STATUS_INVALID_PAYLOAD:
+		return "Invalid Payload";
+	case CSC_LATE_BINDING_STATUS_TIMEOUT:
+		return "Timeout";
+	default:
+		return "Unknown error";
+	}
+}
+
 static int xe_late_bind_fw_num_fans(struct xe_late_bind *late_bind)
 {
 	struct xe_device *xe = late_bind_to_xe(late_bind);
@@ -44,6 +82,99 @@ static int xe_late_bind_fw_num_fans(struct xe_late_bind *late_bind)
 	return 0;
 }
 
+static void xe_late_bind_wait_for_worker_completion(struct xe_late_bind *late_bind)
+{
+	struct xe_device *xe = late_bind_to_xe(late_bind);
+	struct xe_late_bind_fw *lbfw;
+	int fw_id;
+
+	for (fw_id = 0; fw_id < XE_LB_FW_MAX_ID; fw_id++) {
+		lbfw = &late_bind->late_bind_fw[fw_id];
+		if (lbfw->payload && late_bind->wq) {
+			drm_dbg(&xe->drm, "Flush work: load %s firmware\n",
+				fw_id_to_name[lbfw->id]);
+			flush_work(&lbfw->work);
+		}
+	}
+}
+
+static void xe_late_bind_work(struct work_struct *work)
+{
+	struct xe_late_bind_fw *lbfw = container_of(work, struct xe_late_bind_fw, work);
+	struct xe_late_bind *late_bind = container_of(lbfw, struct xe_late_bind,
+						      late_bind_fw[lbfw->id]);
+	struct xe_device *xe = late_bind_to_xe(late_bind);
+	int retry = LB_FW_LOAD_RETRY_MAXCOUNT;
+	int ret;
+	int slept;
+
+	xe_device_assert_mem_access(xe);
+
+	/* we can queue this before the component is bound */
+	for (slept = 0; slept < LB_INIT_TIMEOUT_MS; slept += 100) {
+		if (late_bind->component.ops)
+			break;
+		msleep(100);
+	}
+
+	if (!late_bind->component.ops) {
+		drm_err(&xe->drm, "Late bind component not bound\n");
+		/* Do not re-attempt fw load */
+		drmm_kfree(&xe->drm, (void *)lbfw->payload);
+		lbfw->payload = NULL;
+		goto out;
+	}
+
+	drm_dbg(&xe->drm, "Load %s firmware\n", fw_id_to_name[lbfw->id]);
+
+	do {
+		ret = late_bind->component.ops->push_config(late_bind->component.mei_dev,
+							    lbfw->type, lbfw->flags,
+							    lbfw->payload, lbfw->payload_size);
+		if (!ret)
+			break;
+		msleep(LB_FW_LOAD_RETRY_PAUSE_MS);
+	} while (--retry && ret == -EBUSY);
+
+	if (!ret) {
+		drm_dbg(&xe->drm, "Load %s firmware successful\n",
+			fw_id_to_name[lbfw->id]);
+		goto out;
+	}
+
+	if (ret > 0)
+		drm_err(&xe->drm, "Load %s firmware failed with err %d, %s\n",
+			fw_id_to_name[lbfw->id], ret, xe_late_bind_parse_status(ret));
+	else
+		drm_err(&xe->drm, "Load %s firmware failed with err %d\n",
+			fw_id_to_name[lbfw->id], ret);
+	/* Do not re-attempt fw load */
+	drmm_kfree(&xe->drm, (void *)lbfw->payload);
+	lbfw->payload = NULL;
+
+out:
+	xe_pm_runtime_put(xe);
+}
+
+int xe_late_bind_fw_load(struct xe_late_bind *late_bind)
+{
+	struct xe_device *xe = late_bind_to_xe(late_bind);
+	struct xe_late_bind_fw *lbfw;
+	int fw_id;
+
+	if (!late_bind->component_added)
+		return -ENODEV;
+
+	for (fw_id = 0; fw_id < XE_LB_FW_MAX_ID; fw_id++) {
+		lbfw = &late_bind->late_bind_fw[fw_id];
+		if (lbfw->payload) {
+			xe_pm_runtime_get_noresume(xe);
+			queue_work(late_bind->wq, &lbfw->work);
+		}
+	}
+	return 0;
+}
+
 static int __xe_late_bind_fw_init(struct xe_late_bind *late_bind, u32 fw_id)
 {
 	struct xe_device *xe = late_bind_to_xe(late_bind);
@@ -97,6 +228,7 @@ static int __xe_late_bind_fw_init(struct xe_late_bind *late_bind, u32 fw_id)
 
 	memcpy((void *)lb_fw->payload, fw->data, lb_fw->payload_size);
 	release_firmware(fw);
+	INIT_WORK(&lb_fw->work, xe_late_bind_work);
 
 	return 0;
 }
@@ -106,11 +238,16 @@ static int xe_late_bind_fw_init(struct xe_late_bind *late_bind)
 	int ret;
 	int fw_id;
 
+	late_bind->wq = alloc_ordered_workqueue("late-bind-ordered-wq", 0);
+	if (!late_bind->wq)
+		return -ENOMEM;
+
 	for (fw_id = 0; fw_id < XE_LB_FW_MAX_ID; fw_id++) {
 		ret = __xe_late_bind_fw_init(late_bind, fw_id);
 		if (ret)
 			return ret;
 	}
+
 	return 0;
 }
 
@@ -132,6 +269,8 @@ static void xe_late_bind_component_unbind(struct device *xe_kdev,
 	struct xe_device *xe = kdev_to_xe_device(xe_kdev);
 	struct xe_late_bind *late_bind = &xe->late_bind;
 
+	xe_late_bind_wait_for_worker_completion(late_bind);
+
 	late_bind->component.ops = NULL;
 }
 
@@ -145,7 +284,15 @@ static void xe_late_bind_remove(void *arg)
 	struct xe_late_bind *late_bind = arg;
 	struct xe_device *xe = late_bind_to_xe(late_bind);
 
+	xe_late_bind_wait_for_worker_completion(late_bind);
+
+	late_bind->component_added = false;
+
 	component_del(xe->drm.dev, &xe_late_bind_component_ops);
+	if (late_bind->wq) {
+		destroy_workqueue(late_bind->wq);
+		late_bind->wq = NULL;
+	}
 }
 
 /**
@@ -174,9 +321,15 @@ int xe_late_bind_init(struct xe_late_bind *late_bind)
 		return err;
 	}
 
+	late_bind->component_added = true;
+
 	err = devm_add_action_or_reset(xe->drm.dev, xe_late_bind_remove, late_bind);
 	if (err)
 		return err;
 
-	return xe_late_bind_fw_init(late_bind);
+	err = xe_late_bind_fw_init(late_bind);
+	if (err)
+		return err;
+
+	return xe_late_bind_fw_load(late_bind);
 }
diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.h b/drivers/gpu/drm/xe/xe_late_bind_fw.h
index 4c73571c3e62..28d56ed2bfdc 100644
--- a/drivers/gpu/drm/xe/xe_late_bind_fw.h
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw.h
@@ -11,5 +11,6 @@
 struct xe_late_bind;
 
 int xe_late_bind_init(struct xe_late_bind *late_bind);
+int xe_late_bind_fw_load(struct xe_late_bind *late_bind);
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw_types.h b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
index c4a8042f2600..3cc5fc0593b3 100644
--- a/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 
 #define XE_LB_MAX_PAYLOAD_SIZE SZ_4K
 
@@ -36,6 +37,8 @@ struct xe_late_bind_fw {
 	const u8 *payload;
 	/** @payload_size: late binding blob payload_size */
 	size_t payload_size;
+	/** @work: worker to upload latebind blob */
+	struct work_struct work;
 };
 
 /**
@@ -58,6 +61,10 @@ struct xe_late_bind {
 	struct xe_late_bind_component component;
 	/** @late_bind_fw: late binding firmware array */
 	struct xe_late_bind_fw late_bind_fw[XE_LB_FW_MAX_ID];
+	/** @wq: workqueue to submit request to download late bind blob */
+	struct workqueue_struct *wq;
+	/** @component_added: whether the component has been added */
+	bool component_added;
 };
 
 #endif
-- 
2.34.1
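
Note on the retry budget: the bounded retry in xe_late_bind_work() can be
modelled in userspace to check the "6 seconds, in steps of 200 ms" arithmetic.
The sketch below is only an illustration and is not part of the patch.
struct fake_dev, fake_push(), fake_msleep() and load_with_retry() are
hypothetical stand-ins for the MEI component's push_config() and msleep();
only the loop shape and the two constants are taken from the diff above.

```c
#include <assert.h>
#include <errno.h>

#define RETRY_MAXCOUNT 30	/* mirrors LB_FW_LOAD_RETRY_MAXCOUNT */
#define RETRY_PAUSE_MS 200	/* mirrors LB_FW_LOAD_RETRY_PAUSE_MS */

/* Hypothetical device model: reports -EBUSY busy_calls times, then result. */
struct fake_dev {
	int busy_calls;	/* remaining calls that will return -EBUSY */
	int result;	/* value returned once the device is no longer busy */
	int calls;	/* number of push attempts made so far */
	int slept_ms;	/* accumulated simulated sleep time */
};

/* Stand-in for push_config(): assumption, not the real MEI interface. */
static int fake_push(struct fake_dev *d)
{
	d->calls++;
	if (d->busy_calls > 0) {
		d->busy_calls--;
		return -EBUSY;
	}
	return d->result;
}

/* Stand-in for msleep(): records the pause instead of blocking. */
static void fake_msleep(struct fake_dev *d, int ms)
{
	d->slept_ms += ms;
}

/* Same loop shape as the patch: retry only while the result is -EBUSY. */
static int load_with_retry(struct fake_dev *d)
{
	int retry = RETRY_MAXCOUNT;
	int ret;

	assert(d);

	do {
		ret = fake_push(d);
		if (!ret)
			break;
		fake_msleep(d, RETRY_PAUSE_MS);
	} while (--retry && ret == -EBUSY);

	return ret;
}
```

With a device that stays busy, the loop makes 30 attempts and pauses
30 * 200 ms = 6000 ms in total before giving up with -EBUSY. With this loop
shape, an error other than -EBUSY still takes one 200 ms pause before the
loop exits.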