dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/9] Introducing firmware late binding
@ 2025-04-29 16:09 Badal Nilawar
  2025-04-29 16:09 ` [RFC 1/9] mei: bus: add mei_cldev_mtu interface Badal Nilawar
                   ` (9 more replies)
  0 siblings, 10 replies; 30+ messages in thread
From: Badal Nilawar @ 2025-04-29 16:09 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh,
	daniele.ceraolospurio

Introducing firmware late binding feature to enable firmware loading
for the devices, such as the fan controller and voltage regulator,
during the driver probe.
Typically, firmware for these devices are part of IFWI flash image but
can be replaced at probe after OEM tuning.

Alexander Usyskin (2):
  mei: bus: add mei_cldev_mtu interface
  mei: late_bind: add late binding component driver

Badal Nilawar (6):
  drm/xe/late_bind_fw: Introducing late_bind_fw
  drm/xe/xe_late_bind_fw: Initialize late binding firmware
  drm/xe/xe_late_bind_fw: Load late binding firmware
  drm/xe/xe_late_bind_fw: Reload late binding fw in rpm resume
  drm/xe/xe_late_bind_fw: Reload late binding fw in S2Idle/S3 resume
  drm/xe/xe_late_bind_fw: Introduce debug fs node to disable late
    binding

Rodrigo Vivi (1):
  {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl

 Documentation/userspace-api/fwctl/index.rst   |   1 +
 drivers/gpu/drm/xe/Kconfig                    |   2 +
 drivers/gpu/drm/xe/Makefile                   |   2 +
 drivers/gpu/drm/xe/xe_debugfs.c               |  42 +++
 drivers/gpu/drm/xe/xe_device.c                |   7 +
 drivers/gpu/drm/xe/xe_device_types.h          |   4 +
 drivers/gpu/drm/xe/xe_late_bind_fw.c          | 289 ++++++++++++++++++
 drivers/gpu/drm/xe/xe_late_bind_fw.h          |  18 ++
 drivers/gpu/drm/xe/xe_late_bind_fw_types.h    |  97 ++++++
 drivers/gpu/drm/xe/xe_pci.c                   |   5 +
 drivers/gpu/drm/xe/xe_pcode_fwctl.c           | 218 +++++++++++++
 drivers/gpu/drm/xe/xe_pcode_fwctl.h           |  13 +
 drivers/gpu/drm/xe/xe_pm.c                    |   7 +
 drivers/misc/mei/Kconfig                      |   1 +
 drivers/misc/mei/Makefile                     |   1 +
 drivers/misc/mei/bus.c                        |  13 +
 drivers/misc/mei/late_bind/Kconfig            |  12 +
 drivers/misc/mei/late_bind/Makefile           |   9 +
 drivers/misc/mei/late_bind/mei_late_bind.c    | 261 ++++++++++++++++
 include/drm/intel/i915_component.h            |   1 +
 .../drm/intel/xe_late_bind_mei_interface.h    |  49 +++
 include/linux/mei_cl_bus.h                    |   1 +
 include/uapi/fwctl/fwctl.h                    |   1 +
 include/uapi/fwctl/xe_pcode.h                 |  80 +++++
 24 files changed, 1134 insertions(+)
 create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw.c
 create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw.h
 create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw_types.h
 create mode 100644 drivers/gpu/drm/xe/xe_pcode_fwctl.c
 create mode 100644 drivers/gpu/drm/xe/xe_pcode_fwctl.h
 create mode 100644 drivers/misc/mei/late_bind/Kconfig
 create mode 100644 drivers/misc/mei/late_bind/Makefile
 create mode 100644 drivers/misc/mei/late_bind/mei_late_bind.c
 create mode 100644 include/drm/intel/xe_late_bind_mei_interface.h
 create mode 100644 include/uapi/fwctl/xe_pcode.h

-- 
2.34.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [RFC 1/9] mei: bus: add mei_cldev_mtu interface
  2025-04-29 16:09 [RFC 0/9] Introducing firmware late binding Badal Nilawar
@ 2025-04-29 16:09 ` Badal Nilawar
  2025-04-29 16:09 ` [RFC 2/9] mei: late_bind: add late binding component driver Badal Nilawar
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Badal Nilawar @ 2025-04-29 16:09 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh,
	daniele.ceraolospurio

From: Alexander Usyskin <alexander.usyskin@intel.com>

Allow to bus client to obtain client mtu.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 drivers/misc/mei/bus.c     | 13 +++++++++++++
 include/linux/mei_cl_bus.h |  1 +
 2 files changed, 14 insertions(+)

diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c
index 67176caf5416..f860b1b6eda0 100644
--- a/drivers/misc/mei/bus.c
+++ b/drivers/misc/mei/bus.c
@@ -614,6 +614,19 @@ u8 mei_cldev_ver(const struct mei_cl_device *cldev)
 }
 EXPORT_SYMBOL_GPL(mei_cldev_ver);
 
+/**
+ * mei_cldev_mtu - max message that client can send and receive
+ *
+ * @cldev: mei client device
+ *
+ * Return: mtu or 0 if client is not connected
+ */
+size_t mei_cldev_mtu(const struct mei_cl_device *cldev)
+{
+	return mei_cl_mtu(cldev->cl);
+}
+EXPORT_SYMBOL_GPL(mei_cldev_mtu);
+
 /**
  * mei_cldev_enabled - check whether the device is enabled
  *
diff --git a/include/linux/mei_cl_bus.h b/include/linux/mei_cl_bus.h
index 725fd7727422..a82755e1fc40 100644
--- a/include/linux/mei_cl_bus.h
+++ b/include/linux/mei_cl_bus.h
@@ -113,6 +113,7 @@ int mei_cldev_register_notif_cb(struct mei_cl_device *cldev,
 				mei_cldev_cb_t notif_cb);
 
 u8 mei_cldev_ver(const struct mei_cl_device *cldev);
+size_t mei_cldev_mtu(const struct mei_cl_device *cldev);
 
 void *mei_cldev_get_drvdata(const struct mei_cl_device *cldev);
 void mei_cldev_set_drvdata(struct mei_cl_device *cldev, void *data);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 2/9] mei: late_bind: add late binding component driver
  2025-04-29 16:09 [RFC 0/9] Introducing firmware late binding Badal Nilawar
  2025-04-29 16:09 ` [RFC 1/9] mei: bus: add mei_cldev_mtu interface Badal Nilawar
@ 2025-04-29 16:09 ` Badal Nilawar
  2025-05-07 22:42   ` Daniele Ceraolo Spurio
  2025-04-29 16:09 ` [RFC 3/9] drm/xe/late_bind_fw: Introducing late_bind_fw Badal Nilawar
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Badal Nilawar @ 2025-04-29 16:09 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh,
	daniele.ceraolospurio

From: Alexander Usyskin <alexander.usyskin@intel.com>

Add late binding component driver.
It allows pushing the late binding configuration from
Xe grphics driver to Intel discrete graphics card CSE device.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 drivers/misc/mei/Kconfig                      |   1 +
 drivers/misc/mei/Makefile                     |   1 +
 drivers/misc/mei/late_bind/Kconfig            |  12 +
 drivers/misc/mei/late_bind/Makefile           |   9 +
 drivers/misc/mei/late_bind/mei_late_bind.c    | 261 ++++++++++++++++++
 include/drm/intel/i915_component.h            |   1 +
 .../drm/intel/xe_late_bind_mei_interface.h    |  49 ++++
 7 files changed, 334 insertions(+)
 create mode 100644 drivers/misc/mei/late_bind/Kconfig
 create mode 100644 drivers/misc/mei/late_bind/Makefile
 create mode 100644 drivers/misc/mei/late_bind/mei_late_bind.c
 create mode 100644 include/drm/intel/xe_late_bind_mei_interface.h

diff --git a/drivers/misc/mei/Kconfig b/drivers/misc/mei/Kconfig
index 7575fee96cc6..771becc68095 100644
--- a/drivers/misc/mei/Kconfig
+++ b/drivers/misc/mei/Kconfig
@@ -84,5 +84,6 @@ config INTEL_MEI_VSC
 source "drivers/misc/mei/hdcp/Kconfig"
 source "drivers/misc/mei/pxp/Kconfig"
 source "drivers/misc/mei/gsc_proxy/Kconfig"
+source "drivers/misc/mei/late_bind/Kconfig"
 
 endif
diff --git a/drivers/misc/mei/Makefile b/drivers/misc/mei/Makefile
index 6f9fdbf1a495..84bfde888d81 100644
--- a/drivers/misc/mei/Makefile
+++ b/drivers/misc/mei/Makefile
@@ -31,6 +31,7 @@ CFLAGS_mei-trace.o = -I$(src)
 obj-$(CONFIG_INTEL_MEI_HDCP) += hdcp/
 obj-$(CONFIG_INTEL_MEI_PXP) += pxp/
 obj-$(CONFIG_INTEL_MEI_GSC_PROXY) += gsc_proxy/
+obj-$(CONFIG_INTEL_MEI_LATE_BIND) += late_bind/
 
 obj-$(CONFIG_INTEL_MEI_VSC_HW) += mei-vsc-hw.o
 mei-vsc-hw-y := vsc-tp.o
diff --git a/drivers/misc/mei/late_bind/Kconfig b/drivers/misc/mei/late_bind/Kconfig
new file mode 100644
index 000000000000..c5302303e5af
--- /dev/null
+++ b/drivers/misc/mei/late_bind/Kconfig
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025, Intel Corporation. All rights reserved.
+#
+config INTEL_MEI_LATE_BIND
+	tristate "Intel late binding support on ME Interface"
+	select INTEL_MEI_ME
+	depends on DRM_XE
+	help
+	  MEI Support for Late Binding for Intel graphics card.
+
+	  Enables the ME FW interfaces for Late Binding for
+          Xe display driver of Intel.
diff --git a/drivers/misc/mei/late_bind/Makefile b/drivers/misc/mei/late_bind/Makefile
new file mode 100644
index 000000000000..a0aeda5853f0
--- /dev/null
+++ b/drivers/misc/mei/late_bind/Makefile
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (c) 2025, Intel Corporation. All rights reserved.
+#
+# Makefile - Late Binding client driver for Intel MEI Bus Driver.
+
+subdir-ccflags-y += -I$(srctree)/drivers/misc/mei/
+
+obj-$(CONFIG_INTEL_MEI_LATE_BIND) += mei_late_bind.o
diff --git a/drivers/misc/mei/late_bind/mei_late_bind.c b/drivers/misc/mei/late_bind/mei_late_bind.c
new file mode 100644
index 000000000000..431fee664316
--- /dev/null
+++ b/drivers/misc/mei/late_bind/mei_late_bind.c
@@ -0,0 +1,261 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2025 Intel Corporation
+ */
+#include <drm/drm_connector.h>
+#include <drm/intel/i915_component.h>
+#include <drm/intel/xe_late_bind_mei_interface.h>
+#include <linux/component.h>
+#include <linux/pci.h>
+#include <linux/mei_cl_bus.h>
+#include <linux/module.h>
+#include <linux/overflow.h>
+#include <linux/slab.h>
+#include <linux/uuid.h>
+
+#include "mkhi.h"
+
+#define GFX_SRV_MKHI_LATE_BINDING_CMD 0x12
+#define GFX_SRV_MKHI_LATE_BINDING_RSP (GFX_SRV_MKHI_LATE_BINDING_CMD | 0x80)
+
+#define LATE_BIND_SEND_TIMEOUT_MSEC 3000
+#define LATE_BIND_RECV_TIMEOUT_MSEC 3000
+
+/**
+ * struct csc_heci_late_bind_req - late binding request
+ * @header: @ref mkhi_msg_hdr
+ * @type: type of the late binding payload
+ * @flags: flags to be passed to the firmware
+ * @reserved: reserved field
+ * @payload_size: size of the payload data in bytes
+ * @payload: data to be sent to the firmware
+ */
+struct csc_heci_late_bind_req {
+	struct mkhi_msg_hdr header;
+	u32 type;
+	u32 flags;
+	u32 reserved[2];
+	u32 payload_size;
+	u8  payload[] __counted_by(payload_size);
+} __packed;
+
+/**
+ * struct csc_heci_late_bind_rsp - late binding response
+ * @header: @ref mkhi_msg_hdr
+ * @type: type of the late binding payload
+ * @reserved: reserved field
+ * @status: status of the late binding command execution by firmware
+ */
+struct csc_heci_late_bind_rsp {
+	struct mkhi_msg_hdr header;
+	u32 type;
+	u32 reserved[2];
+	u32 status;
+} __packed;
+
+static int mei_late_bind_check_response(const struct device *dev, const struct mkhi_msg_hdr *hdr)
+{
+	if (hdr->group_id != MKHI_GROUP_ID_GFX) {
+		dev_err(dev, "Mismatch group id: 0x%x instead of 0x%x\n",
+			hdr->group_id, MKHI_GROUP_ID_GFX);
+		return -EINVAL;
+	}
+
+	if (hdr->command != GFX_SRV_MKHI_LATE_BINDING_RSP) {
+		dev_err(dev, "Mismatch command: 0x%x instead of 0x%x\n",
+			hdr->command, GFX_SRV_MKHI_LATE_BINDING_RSP);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/**
+ * mei_late_bind_push_config - Sends a config to the firmware.
+ * @dev: device struct corresponding to the mei device
+ * @type: payload type
+ * @flags: payload flags
+ * @payload: payload buffer
+ * @payload_size: payload buffer size
+ *
+ * Return: 0 success, negative errno value on transport failure,
+ *         positive status returned by FW
+ */
+static int mei_late_bind_push_config(struct device *dev, u32 type, u32 flags,
+				     const void *payload, size_t payload_size)
+{
+	struct mei_cl_device *cldev;
+	struct csc_heci_late_bind_req *req = NULL;
+	struct csc_heci_late_bind_rsp rsp;
+	size_t req_size;
+	int ret;
+
+	if (!dev || !payload || !payload_size)
+		return -EINVAL;
+
+	cldev = to_mei_cl_device(dev);
+
+	ret = mei_cldev_enable(cldev);
+	if (ret < 0) {
+		dev_dbg(dev, "mei_cldev_enable failed. %d\n", ret);
+		return ret;
+	}
+
+	req_size = struct_size(req, payload, payload_size);
+	if (req_size > mei_cldev_mtu(cldev)) {
+		dev_err(dev, "Payload is too big %zu\n", payload_size);
+		ret = -EMSGSIZE;
+		goto end;
+	}
+
+	req = kmalloc(req_size, GFP_KERNEL);
+	if (!req) {
+		ret = -ENOMEM;
+		goto end;
+	}
+
+	req->header.group_id = MKHI_GROUP_ID_GFX;
+	req->header.command = GFX_SRV_MKHI_LATE_BINDING_CMD;
+	req->type = type;
+	req->flags = flags;
+	req->reserved[0] = 0;
+	req->reserved[1] = 0;
+	req->payload_size = payload_size;
+	memcpy(req->payload, payload, payload_size);
+
+	ret = mei_cldev_send_timeout(cldev, (void *)req, req_size, LATE_BIND_SEND_TIMEOUT_MSEC);
+	if (ret < 0) {
+		dev_err(dev, "mei_cldev_send failed. %d\n", ret);
+		goto end;
+	}
+	ret = mei_cldev_recv_timeout(cldev, (void *)&rsp, sizeof(rsp), LATE_BIND_RECV_TIMEOUT_MSEC);
+	if (ret < 0) {
+		dev_err(dev, "mei_cldev_recv failed. %d\n", ret);
+		goto end;
+	}
+	ret = mei_late_bind_check_response(dev, &rsp.header);
+	if (ret) {
+		dev_err(dev, "bad result response from the firmware: 0x%x\n",
+			*(uint32_t *)&rsp.header);
+		goto end;
+	}
+	ret = (int)rsp.status;
+	dev_dbg(dev, "mei_late_bind_push_config status = %d\n", ret);
+
+end:
+	mei_cldev_disable(cldev);
+	kfree(req);
+	return ret;
+}
+
+static const struct xe_late_bind_component_ops mei_late_bind_ops = {
+	.owner = THIS_MODULE,
+	.push_config = mei_late_bind_push_config,
+};
+
+static int mei_component_master_bind(struct device *dev)
+{
+	return component_bind_all(dev, (void *)&mei_late_bind_ops);
+}
+
+static void mei_component_master_unbind(struct device *dev)
+{
+	component_unbind_all(dev, (void *)&mei_late_bind_ops);
+}
+
+static const struct component_master_ops mei_component_master_ops = {
+	.bind = mei_component_master_bind,
+	.unbind = mei_component_master_unbind,
+};
+
+/**
+ * mei_late_bind_component_match - compare function for matching mei late bind.
+ *
+ *    The function checks if requested is Intel VGA device
+ *    and the parent of requester and the grand parent of mei_if are the same
+ *    device.
+ *
+ * @dev: master device
+ * @subcomponent: subcomponent to match (I915_COMPONENT_LATE_BIND)
+ * @data: compare data (mei late-bind bus device)
+ *
+ * Return:
+ * * 1 - if components match
+ * * 0 - otherwise
+ */
+static int mei_late_bind_component_match(struct device *dev, int subcomponent,
+					 void *data)
+{
+	struct device *base = data;
+	struct pci_dev *pdev;
+
+	if (!dev)
+		return 0;
+
+	if (!dev_is_pci(dev))
+		return 0;
+
+	pdev = to_pci_dev(dev);
+
+	if (pdev->class != (PCI_CLASS_DISPLAY_VGA << 8) ||
+	    pdev->vendor != PCI_VENDOR_ID_INTEL)
+		return 0;
+
+	if (subcomponent != I915_COMPONENT_LATE_BIND)
+		return 0;
+
+	base = base->parent;
+	if (!base) /* mei device */
+		return 0;
+
+	base = base->parent; /* pci device */
+
+	return !!base && dev == base;
+}
+
+static int mei_late_bind_probe(struct mei_cl_device *cldev,
+			       const struct mei_cl_device_id *id)
+{
+	struct component_match *master_match = NULL;
+	int ret;
+
+	component_match_add_typed(&cldev->dev, &master_match,
+				  mei_late_bind_component_match, &cldev->dev);
+	if (IS_ERR_OR_NULL(master_match))
+		return -ENOMEM;
+
+	ret = component_master_add_with_match(&cldev->dev,
+					      &mei_component_master_ops,
+					      master_match);
+	if (ret < 0)
+		dev_err(&cldev->dev, "Master comp add failed %d\n", ret);
+
+	return ret;
+}
+
+static void mei_late_bind_remove(struct mei_cl_device *cldev)
+{
+	component_master_del(&cldev->dev, &mei_component_master_ops);
+}
+
+#define MEI_GUID_MKHI UUID_LE(0xe2c2afa2, 0x3817, 0x4d19, \
+			      0x9d, 0x95, 0x6, 0xb1, 0x6b, 0x58, 0x8a, 0x5d)
+
+static struct mei_cl_device_id mei_late_bind_tbl[] = {
+	{ .uuid = MEI_GUID_MKHI, .version = MEI_CL_VERSION_ANY },
+	{ }
+};
+MODULE_DEVICE_TABLE(mei, mei_late_bind_tbl);
+
+static struct mei_cl_driver mei_late_bind_driver = {
+	.id_table = mei_late_bind_tbl,
+	.name = KBUILD_MODNAME,
+	.probe = mei_late_bind_probe,
+	.remove	= mei_late_bind_remove,
+};
+
+module_mei_cl_driver(mei_late_bind_driver);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("MEI Late Binding");
diff --git a/include/drm/intel/i915_component.h b/include/drm/intel/i915_component.h
index 4ea3b17aa143..4945044d41e6 100644
--- a/include/drm/intel/i915_component.h
+++ b/include/drm/intel/i915_component.h
@@ -31,6 +31,7 @@ enum i915_component_type {
 	I915_COMPONENT_HDCP,
 	I915_COMPONENT_PXP,
 	I915_COMPONENT_GSC_PROXY,
+	I915_COMPONENT_LATE_BIND,
 };
 
 /* MAX_PORT is the number of port
diff --git a/include/drm/intel/xe_late_bind_mei_interface.h b/include/drm/intel/xe_late_bind_mei_interface.h
new file mode 100644
index 000000000000..4005c4c6184f
--- /dev/null
+++ b/include/drm/intel/xe_late_bind_mei_interface.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright (c) 2025 Intel Corporation
+ */
+
+#ifndef _XE_LATE_BIND_MEI_INTERFACE_H_
+#define _XE_LATE_BIND_MEI_INTERFACE_H_
+
+#include <linux/types.h>
+
+struct device;
+struct module;
+
+/**
+ * struct xe_late_bind_component_ops - ops for Late Binding services.
+ * @owner: Module providing the ops
+ * @push_config: Sends a config to FW.
+ */
+struct xe_late_bind_component_ops {
+	struct module *owner;
+
+	/**
+	 * @push_config: Sends a config to FW.
+	 * @dev: device struct corresponding to the mei device
+	 * @type: payload type
+	 * @flags: payload flags
+	 * @payload: payload buffer
+	 * @payload_size: payload buffer size
+	 *
+	 * Return: 0 success, negative errno value on transport failure,
+	 *         positive status returned by FW
+	 */
+	int (*push_config)(struct device *dev, u32 type, u32 flags,
+			   const void *payload, size_t payload_size);
+};
+
+/**
+ * struct xe_late_bind_component - Late Binding services component
+ * @mei_dev: device that provide Late Binding service.
+ * @ops: Ops implemented by Late Binding driver, used by Xe driver.
+ *
+ * Communication between Xe and MEI drivers for Late Binding services
+ */
+struct xe_late_bind_component {
+	struct device *mei_dev;
+	const struct xe_late_bind_component_ops *ops;
+};
+
+#endif /* _XE_LATE_BIND_MEI_INTERFACE_H_ */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 3/9] drm/xe/late_bind_fw: Introducing late_bind_fw
  2025-04-29 16:09 [RFC 0/9] Introducing firmware late binding Badal Nilawar
  2025-04-29 16:09 ` [RFC 1/9] mei: bus: add mei_cldev_mtu interface Badal Nilawar
  2025-04-29 16:09 ` [RFC 2/9] mei: late_bind: add late binding component driver Badal Nilawar
@ 2025-04-29 16:09 ` Badal Nilawar
  2025-05-07 21:38   ` Daniele Ceraolo Spurio
  2025-04-29 16:09 ` [RFC 4/9] drm/xe/xe_late_bind_fw: Initialize late binding firmware Badal Nilawar
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Badal Nilawar @ 2025-04-29 16:09 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh,
	daniele.ceraolospurio

Introducing late_bind_fw to enable firmware loading for the devices,
such as the fan controller and voltage regulator, during the driver probe.
Typically, firmware for these devices are part of IFWI flash image but
can be replaced at probe after OEM tuning.

Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 drivers/gpu/drm/xe/Kconfig                 |   1 +
 drivers/gpu/drm/xe/Makefile                |   1 +
 drivers/gpu/drm/xe/xe_device.c             |   3 +
 drivers/gpu/drm/xe/xe_device_types.h       |   4 +
 drivers/gpu/drm/xe/xe_late_bind_fw.c       | 104 +++++++++++++++++++++
 drivers/gpu/drm/xe/xe_late_bind_fw.h       |  16 ++++
 drivers/gpu/drm/xe/xe_late_bind_fw_types.h |  95 +++++++++++++++++++
 7 files changed, 224 insertions(+)
 create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw.c
 create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw.h
 create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw_types.h

diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig
index 9bce047901b2..a8cc1876a24f 100644
--- a/drivers/gpu/drm/xe/Kconfig
+++ b/drivers/gpu/drm/xe/Kconfig
@@ -44,6 +44,7 @@ config DRM_XE
 	select WANT_DEV_COREDUMP
 	select AUXILIARY_BUS
 	select HMM_MIRROR
+	select INTEL_MEI_LATE_BIND
 	help
 	  Experimental driver for Intel Xe series GPUs
 
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index c5d6681645ed..6de291a21965 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -76,6 +76,7 @@ xe-y += xe_bb.o \
 	xe_hw_fence.o \
 	xe_irq.o \
 	xe_lrc.o \
+	xe_late_bind_fw.o \
 	xe_migrate.o \
 	xe_mmio.o \
 	xe_mocs.o \
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 75e753e0a682..86a7b7065122 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -42,6 +42,7 @@
 #include "xe_hw_engine_group.h"
 #include "xe_hwmon.h"
 #include "xe_irq.h"
+#include "xe_late_bind_fw.h"
 #include "xe_memirq.h"
 #include "xe_mmio.h"
 #include "xe_module.h"
@@ -889,6 +890,8 @@ int xe_device_probe(struct xe_device *xe)
 	if (err)
 		return err;
 
+	xe_late_bind_init(&xe->late_bind);
+
 	err = xe_oa_init(xe);
 	if (err)
 		return err;
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 495bc00ebed4..57b63cc9b8ac 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -16,6 +16,7 @@
 #include "xe_devcoredump_types.h"
 #include "xe_heci_gsc.h"
 #include "xe_lmtt_types.h"
+#include "xe_late_bind_fw_types.h"
 #include "xe_memirq_types.h"
 #include "xe_oa_types.h"
 #include "xe_platform_types.h"
@@ -543,6 +544,9 @@ struct xe_device {
 	/** @heci_gsc: graphics security controller */
 	struct xe_heci_gsc heci_gsc;
 
+	/** @late_bind: xe mei late bind interface */
+	struct xe_late_bind late_bind;
+
 	/** @oa: oa observation subsystem */
 	struct xe_oa oa;
 
diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c b/drivers/gpu/drm/xe/xe_late_bind_fw.c
new file mode 100644
index 000000000000..7981fc500a78
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
@@ -0,0 +1,104 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+#include <linux/component.h>
+#include <linux/delay.h>
+
+#include <drm/drm_managed.h>
+#include <drm/intel/i915_component.h>
+#include <drm/intel/xe_late_bind_mei_interface.h>
+#include <drm/drm_print.h>
+
+#include "xe_device.h"
+#include "xe_late_bind_fw.h"
+
+static struct xe_device *
+late_bind_to_xe(struct xe_late_bind *late_bind)
+{
+	return container_of(late_bind, struct xe_device, late_bind);
+}
+
+static int xe_late_bind_component_bind(struct device *xe_kdev,
+				       struct device *mei_kdev, void *data)
+{
+	struct xe_device *xe = kdev_to_xe_device(xe_kdev);
+	struct xe_late_bind *late_bind = &xe->late_bind;
+	struct xe_late_bind_component *component;
+
+	component = drmm_kzalloc(&xe->drm, sizeof(*component), GFP_KERNEL);
+
+	mutex_lock(&late_bind->mutex);
+	component->mei_dev = mei_kdev;
+	component->ops = data;
+	mutex_unlock(&late_bind->mutex);
+
+	late_bind->component = component;
+
+	return 0;
+}
+
+static void xe_late_bind_component_unbind(struct device *xe_kdev,
+					  struct device *mei_kdev, void *data)
+{
+	struct xe_device *xe = kdev_to_xe_device(xe_kdev);
+	struct xe_late_bind *late_bind = &xe->late_bind;
+
+	mutex_lock(&late_bind->mutex);
+	late_bind->component = NULL;
+	mutex_unlock(&late_bind->mutex);
+}
+
+static const struct component_ops xe_late_bind_component_ops = {
+	.bind   = xe_late_bind_component_bind,
+	.unbind = xe_late_bind_component_unbind,
+};
+
+/**
+ * xe_late_bind_init() - add xe mei late binding component
+ *
+ * Return: 0 if the initialization was successful, a negative errno otherwise.
+ */
+int xe_late_bind_init(struct xe_late_bind *late_bind)
+{
+	struct xe_device *xe = late_bind_to_xe(late_bind);
+	int err;
+
+	if (xe->info.platform != XE_BATTLEMAGE)
+		return 0;
+
+	mutex_init(&late_bind->mutex);
+
+	if (!IS_ENABLED(CONFIG_INTEL_MEI_LATE_BIND)) {
+		drm_info(&xe->drm, "Can't init xe mei late bind missing mei component\n");
+		return -ENODEV;
+	}
+
+	err = component_add_typed(xe->drm.dev, &xe_late_bind_component_ops,
+				  I915_COMPONENT_LATE_BIND);
+	if (err < 0) {
+		drm_info(&xe->drm, "Failed to add mei late bind component (%pe)\n", ERR_PTR(err));
+		return err;
+	}
+
+	late_bind->component_added = true;
+
+	/* the component must be removed before unload, so can't use drmm for cleanup */
+
+	return 0;
+}
+
+/**
+ * xe_late_bind_remove() - remove the xe mei late binding component
+ */
+void xe_late_bind_remove(struct xe_late_bind *late_bind)
+{
+	struct xe_device *xe = late_bind_to_xe(late_bind);
+
+	if (!late_bind->component_added)
+		return;
+
+	component_del(xe->drm.dev, &xe_late_bind_component_ops);
+	late_bind->component_added = false;
+}
diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.h b/drivers/gpu/drm/xe/xe_late_bind_fw.h
new file mode 100644
index 000000000000..21299de54b47
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_LATE_BIND_FW_H_
+#define _XE_LATE_BIND_FW_H_
+
+#include <linux/types.h>
+
+struct xe_late_bind;
+
+int xe_late_bind_init(struct xe_late_bind *late_bind);
+void xe_late_bind_remove(struct xe_late_bind *late_bind);
+
+#endif
diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw_types.h b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
new file mode 100644
index 000000000000..ea11061ce556
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
@@ -0,0 +1,95 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_MEI_LATE_BIND_TYPES_H_
+#define _XE_MEI_LATE_BIND_TYPES_H_
+
+#include <linux/iosys-map.h>
+#include <linux/mutex.h>
+#include <linux/spinlock.h>
+#include <linux/types.h>
+#include <linux/workqueue.h>
+
+#define MAX_PAYLOAD_SIZE (1024 * 4)
+
+struct xe_bo;
+struct xe_late_bind_component;
+
+/**
+ * xe_late_bind_fw_type - enum to determine late binding fw type
+ */
+enum xe_late_bind_type {
+	CSC_LATE_BINDING_TYPE_FAN_CONTROL = 1,
+	CSC_LATE_BINDING_TYPE_VOLTAGE_REGULATOR
+};
+
+/**
+ * Late Binding flags
+ */
+enum csc_late_binding_flags {
+	/** Persistent across warm reset */
+	CSC_LATE_BINDING_FLAGS_IS_PERSISTENT = 0x1
+};
+
+/**
+ * xe_late_bind_fw_id - enum to determine late binding fw index
+ */
+enum xe_late_bind_fw_id {
+	FAN_CONTROL_ID = 0,
+	VOLTAGE_REGULATOR_ID,
+	MAX_ID
+};
+
+/**
+ * struct xe_late_bind_fw
+ */
+struct xe_late_bind_fw {
+	/** @late_bind_fw.valid */
+	bool valid;
+
+	/** @late_bind_fw.id */
+	u32 id;
+
+	/** @late_bind_fw.blob_path: late binding fw blob path */
+	char blob_path[PATH_MAX];
+
+	/** @late_bind_fw.type */
+	u32  type;
+
+	/** @late_bind_fw.type */
+	u32  flags;
+
+	/** @late_bind_fw.payload: to store the late binding blob */
+	u8  payload[MAX_PAYLOAD_SIZE];
+
+	/** @late_bind_fw.payload_size: late binding blob payload_size */
+	size_t payload_size;
+
+	/** @late_bind_fw.work: worker to upload latebind blob */
+	struct work_struct work;
+};
+
+/**
+ * struct xe_late_bind
+ */
+struct xe_late_bind {
+	/** @late_bind.component: struct for communication with mei component */
+	struct xe_late_bind_component *component;
+
+	/** @late_bind.component_added: whether the component has been added */
+	bool component_added;
+
+	/** @late_bind.wq: workqueue to submit request to download late bind blob */
+	struct workqueue_struct *wq;
+
+	/** @late_bind.mutex: protects the component binding and usage */
+	struct mutex mutex;
+
+	/** @late_bind.late_bind_fw: late binding firmwares */
+	struct xe_late_bind_fw late_bind_fw[MAX_ID];
+
+};
+
+#endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 4/9] drm/xe/xe_late_bind_fw: Initialize late binding firmware
  2025-04-29 16:09 [RFC 0/9] Introducing firmware late binding Badal Nilawar
                   ` (2 preceding siblings ...)
  2025-04-29 16:09 ` [RFC 3/9] drm/xe/late_bind_fw: Introducing late_bind_fw Badal Nilawar
@ 2025-04-29 16:09 ` Badal Nilawar
  2025-05-07 23:11   ` Daniele Ceraolo Spurio
  2025-04-29 16:09 ` [RFC 5/9] drm/xe/xe_late_bind_fw: Load " Badal Nilawar
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Badal Nilawar @ 2025-04-29 16:09 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh,
	daniele.ceraolospurio

Search for late binding firmware binaries and populate the meta data of
firmware structures.

Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c       |   2 +
 drivers/gpu/drm/xe/xe_late_bind_fw.c | 101 ++++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_late_bind_fw.h |   1 +
 3 files changed, 101 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 86a7b7065122..d83864e7189c 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -892,6 +892,8 @@ int xe_device_probe(struct xe_device *xe)
 
 	xe_late_bind_init(&xe->late_bind);
 
+	xe_late_bind_fw_init(&xe->late_bind);
+
 	err = xe_oa_init(xe);
 	if (err)
 		return err;
diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c b/drivers/gpu/drm/xe/xe_late_bind_fw.c
index 7981fc500a78..297238fd3d16 100644
--- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
@@ -5,6 +5,7 @@
 
 #include <linux/component.h>
 #include <linux/delay.h>
+#include <linux/firmware.h>
 
 #include <drm/drm_managed.h>
 #include <drm/intel/i915_component.h>
@@ -13,13 +14,108 @@
 
 #include "xe_device.h"
 #include "xe_late_bind_fw.h"
+#include "xe_pcode.h"
+#include "xe_pcode_api.h"
 
-static struct xe_device *
-late_bind_to_xe(struct xe_late_bind *late_bind)
+static const char * const fw_id_to_name[] = {
+		[FAN_CONTROL_ID] = "fan_control",
+		[VOLTAGE_REGULATOR_ID] = "voltage_regulator",
+	};
+
+static const u32 fw_id_to_type[] = {
+		[FAN_CONTROL_ID] = CSC_LATE_BINDING_TYPE_FAN_CONTROL,
+		[VOLTAGE_REGULATOR_ID] = CSC_LATE_BINDING_TYPE_VOLTAGE_REGULATOR
+	};
+
+static struct xe_device *late_bind_to_xe(struct xe_late_bind *late_bind)
 {
 	return container_of(late_bind, struct xe_device, late_bind);
 }
 
+static int late_bind_fw_num_fans(struct xe_late_bind *late_bind)
+{
+	struct xe_device *xe = late_bind_to_xe(late_bind);
+	struct xe_tile *root_tile = xe_device_get_root_tile(xe);
+	u32 uval;
+
+	if (!xe_pcode_read(root_tile,
+			   PCODE_MBOX(FAN_SPEED_CONTROL, FSC_READ_NUM_FANS, 0), &uval, NULL))
+		return uval;
+	else
+		return 0;
+}
+
+static int late_bind_fw_init(struct xe_late_bind *late_bind, u32 id)
+{
+	struct xe_device *xe = late_bind_to_xe(late_bind);
+	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
+	struct xe_late_bind_fw *lb_fw;
+	const struct firmware *fw;
+	u32 num_fans;
+	int ret;
+
+	if (!late_bind->component_added)
+		return 0;
+
+	if (id >= MAX_ID)
+		return -EINVAL;
+
+	lb_fw = &late_bind->late_bind_fw[id];
+
+	lb_fw->id = id;
+	lb_fw->type = fw_id_to_type[id];
+
+	if (lb_fw->type == CSC_LATE_BINDING_TYPE_FAN_CONTROL) {
+		num_fans = late_bind_fw_num_fans(late_bind);
+		drm_dbg(&xe->drm, "Number of Fans: %d\n", num_fans);
+		if (!num_fans)
+			return 0;
+	}
+
+	lb_fw->flags = CSC_LATE_BINDING_FLAGS_IS_PERSISTENT;
+
+	snprintf(lb_fw->blob_path, sizeof(lb_fw->blob_path), "xe/%s_8086_%04x_%04x_%04x.bin",
+		 fw_id_to_name[id], pdev->device,
+		 pdev->subsystem_vendor, pdev->subsystem_device);
+
+	drm_dbg(&xe->drm, "Request late binding firmware %s\n", lb_fw->blob_path);
+	ret = request_firmware(&fw, lb_fw->blob_path, xe->drm.dev);
+	if (ret) {
+		drm_err(&xe->drm, "Failed to request %s\n", lb_fw->blob_path);
+		lb_fw->valid = false;
+		return 0;
+	}
+
+	if (fw->size > MAX_PAYLOAD_SIZE)
+		lb_fw->payload_size = MAX_PAYLOAD_SIZE;
+	else
+		lb_fw->payload_size = fw->size;
+
+	memcpy(lb_fw->payload, fw->data, lb_fw->payload_size);
+	release_firmware(fw);
+	lb_fw->valid = true;
+
+	return 0;
+}
+
+/**
+ * xe_mei_late_bind_fw_init() - Initialize late bind firmware
+ *
+ * Return: 0 if the initialization was successful, a negative errno otherwise.
+ */
+int xe_late_bind_fw_init(struct xe_late_bind *late_bind)
+{
+	int id;
+	int ret;
+
+	for (id = 0; id < MAX_ID; id++) {
+		ret = late_bind_fw_init(late_bind, id);
+		if (ret)
+			return ret;
+	}
+	return ret;
+}
+
 static int xe_late_bind_component_bind(struct device *xe_kdev,
 				       struct device *mei_kdev, void *data)
 {
@@ -83,7 +179,6 @@ int xe_late_bind_init(struct xe_late_bind *late_bind)
 	}
 
 	late_bind->component_added = true;
-
 	/* the component must be removed before unload, so can't use drmm for cleanup */
 
 	return 0;
diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.h b/drivers/gpu/drm/xe/xe_late_bind_fw.h
index 21299de54b47..e88c637b15a6 100644
--- a/drivers/gpu/drm/xe/xe_late_bind_fw.h
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw.h
@@ -12,5 +12,6 @@ struct xe_late_bind;
 
 int xe_late_bind_init(struct xe_late_bind *late_bind);
 void xe_late_bind_remove(struct xe_late_bind *late_bind);
+int xe_late_bind_fw_init(struct xe_late_bind *late_bind);
 
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 5/9] drm/xe/xe_late_bind_fw: Load late binding firmware
  2025-04-29 16:09 [RFC 0/9] Introducing firmware late binding Badal Nilawar
                   ` (3 preceding siblings ...)
  2025-04-29 16:09 ` [RFC 4/9] drm/xe/xe_late_bind_fw: Initialize late binding firmware Badal Nilawar
@ 2025-04-29 16:09 ` Badal Nilawar
  2025-05-07 23:44   ` Daniele Ceraolo Spurio
  2025-04-29 16:09 ` [RFC 6/9] drm/xe/xe_late_bind_fw: Reload late binding fw in rpm resume Badal Nilawar
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Badal Nilawar @ 2025-04-29 16:09 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh,
	daniele.ceraolospurio

Load late binding firmware

Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c       |  2 +
 drivers/gpu/drm/xe/xe_late_bind_fw.c | 91 +++++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_late_bind_fw.h |  1 +
 3 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index d83864e7189c..30a416323b37 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -894,6 +894,8 @@ int xe_device_probe(struct xe_device *xe)
 
 	xe_late_bind_fw_init(&xe->late_bind);
 
+	xe_late_bind_fw_load(&xe->late_bind);
+
 	err = xe_oa_init(xe);
 	if (err)
 		return err;
diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c b/drivers/gpu/drm/xe/xe_late_bind_fw.c
index 297238fd3d16..7d2bc959027d 100644
--- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
@@ -16,6 +16,16 @@
 #include "xe_late_bind_fw.h"
 #include "xe_pcode.h"
 #include "xe_pcode_api.h"
+#include "xe_pm.h"
+
+/*
+ * The component should load quite quickly in most cases, but it could take
+ * a bit. Using a very big timeout just to cover the worst case scenario
+ */
+#define LB_INIT_TIMEOUT_MS 20000
+
+#define LB_FW_LOAD_RETRY_MAXCOUNT 40
+#define LB_FW_LOAD_RETRY_PAUSE_MS 50
 
 static const char * const fw_id_to_name[] = {
 		[FAN_CONTROL_ID] = "fan_control",
@@ -45,6 +55,78 @@ static int late_bind_fw_num_fans(struct xe_late_bind *late_bind)
 		return 0;
 }
 
+static void late_bind_work(struct work_struct *work)
+{
+	struct xe_late_bind_fw *lbfw = container_of(work, struct xe_late_bind_fw, work);
+	struct xe_late_bind *late_bind = container_of(lbfw, struct xe_late_bind,
+						      late_bind_fw[lbfw->id]);
+	struct xe_device *xe = late_bind_to_xe(late_bind);
+	int retry = LB_FW_LOAD_RETRY_MAXCOUNT;
+	int ret;
+	int slept;
+
+	if (!late_bind->component_added)
+		return;
+
+	if (!lbfw->valid)
+		return;
+
+	/* we can queue this before the component is bound */
+	for (slept = 0; slept < LB_INIT_TIMEOUT_MS; slept += 100) {
+		if (late_bind->component)
+			break;
+		msleep(100);
+	}
+
+	xe_pm_runtime_get(xe);
+	mutex_lock(&late_bind->mutex);
+	drm_dbg(&xe->drm, "Load %s firmware\n", fw_id_to_name[lbfw->id]);
+
+	do {
+		ret = late_bind->component->ops->push_config(late_bind->component->mei_dev,
+							     lbfw->type, lbfw->flags,
+							     lbfw->payload, lbfw->payload_size);
+		if (!ret)
+			break;
+		msleep(LB_FW_LOAD_RETRY_PAUSE_MS);
+	} while (--retry && ret == -EAGAIN);
+
+	if (ret)
+		drm_err(&xe->drm, "Load %s firmware failed with err %d\n",
+			fw_id_to_name[lbfw->id], ret);
+	else
+		drm_dbg(&xe->drm, "Load %s firmware successful\n",
+			fw_id_to_name[lbfw->id]);
+
+	mutex_unlock(&late_bind->mutex);
+	xe_pm_runtime_put(xe);
+}
+
+int xe_late_bind_fw_load(struct xe_late_bind *late_bind)
+{
+	struct xe_device *xe = late_bind_to_xe(late_bind);
+	struct xe_late_bind_fw *lbfw;
+	int id;
+
+	if (!late_bind->component_added)
+		return -EINVAL;
+
+	for (id = 0; id < MAX_ID; id++) {
+		lbfw = &late_bind->late_bind_fw[id];
+		if (lbfw->valid) {
+			drm_dbg(&xe->drm, "Queue work: to load %s firmware\n",
+				fw_id_to_name[lbfw->id]);
+			queue_work(late_bind->wq, &lbfw->work);
+		}
+	}
+	return 0;
+}
+
+/**
+ * late_bind_fw_init() - initialize late bind firmware
+ *
+ * Return: 0 if the initialization was successful, a negative errno otherwise.
+ */
 static int late_bind_fw_init(struct xe_late_bind *late_bind, u32 id)
 {
 	struct xe_device *xe = late_bind_to_xe(late_bind);
@@ -93,6 +175,7 @@ static int late_bind_fw_init(struct xe_late_bind *late_bind, u32 id)
 
 	memcpy(lb_fw->payload, fw->data, lb_fw->payload_size);
 	release_firmware(fw);
+	INIT_WORK(&lb_fw->work, late_bind_work);
 	lb_fw->valid = true;
 
 	return 0;
@@ -108,12 +191,17 @@ int xe_late_bind_fw_init(struct xe_late_bind *late_bind)
 	int id;
 	int ret;
 
+	late_bind->wq = create_singlethread_workqueue("late-bind-ordered-wq");
+	if (!late_bind->wq)
+		return -ENOMEM;
+
 	for (id = 0; id < MAX_ID; id++) {
 		ret = late_bind_fw_init(late_bind, id);
 		if (ret)
 			return ret;
 	}
-	return ret;
+
+	return 0;
 }
 
 static int xe_late_bind_component_bind(struct device *xe_kdev,
@@ -179,7 +267,6 @@ int xe_late_bind_init(struct xe_late_bind *late_bind)
 	}
 
 	late_bind->component_added = true;
-	/* the component must be removed before unload, so can't use drmm for cleanup */
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.h b/drivers/gpu/drm/xe/xe_late_bind_fw.h
index e88c637b15a6..edd0e4c0650e 100644
--- a/drivers/gpu/drm/xe/xe_late_bind_fw.h
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw.h
@@ -13,5 +13,6 @@ struct xe_late_bind;
 int xe_late_bind_init(struct xe_late_bind *late_bind);
 void xe_late_bind_remove(struct xe_late_bind *late_bind);
 int xe_late_bind_fw_init(struct xe_late_bind *late_bind);
+int xe_late_bind_fw_load(struct xe_late_bind *late_bind);
 
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 6/9] drm/xe/xe_late_bind_fw: Reload late binding fw in rpm resume
  2025-04-29 16:09 [RFC 0/9] Introducing firmware late binding Badal Nilawar
                   ` (4 preceding siblings ...)
  2025-04-29 16:09 ` [RFC 5/9] drm/xe/xe_late_bind_fw: Load " Badal Nilawar
@ 2025-04-29 16:09 ` Badal Nilawar
  2025-04-29 16:09 ` [RFC 7/9] drm/xe/xe_late_bind_fw: Reload late binding fw in S2Idle/S3 resume Badal Nilawar
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Badal Nilawar @ 2025-04-29 16:09 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh,
	daniele.ceraolospurio

Reload late binding fw during runtime resume.

Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 drivers/gpu/drm/xe/xe_pm.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 38514cef817e..da7ee67eefd1 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -21,6 +21,7 @@
 #include "xe_gt.h"
 #include "xe_guc.h"
 #include "xe_irq.h"
+#include "xe_late_bind_fw.h"
 #include "xe_pcode.h"
 #include "xe_pxp.h"
 #include "xe_trace.h"
@@ -556,6 +557,9 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 
 	xe_pxp_pm_resume(xe->pxp);
 
+	if (xe->d3cold.allowed)
+		xe_late_bind_fw_load(&xe->late_bind);
+
 out:
 	xe_rpm_lockmap_release(xe);
 	xe_pm_write_callback_task(xe, NULL);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 7/9] drm/xe/xe_late_bind_fw: Reload late binding fw in S2Idle/S3 resume
  2025-04-29 16:09 [RFC 0/9] Introducing firmware late binding Badal Nilawar
                   ` (5 preceding siblings ...)
  2025-04-29 16:09 ` [RFC 6/9] drm/xe/xe_late_bind_fw: Reload late binding fw in rpm resume Badal Nilawar
@ 2025-04-29 16:09 ` Badal Nilawar
  2025-04-29 16:09 ` [RFC 8/9] drm/xe/xe_late_bind_fw: Introduce debug fs node to disable late binding Badal Nilawar
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Badal Nilawar @ 2025-04-29 16:09 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh,
	daniele.ceraolospurio

Reload late binding fw during S2Idle/S3 resume.

Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 drivers/gpu/drm/xe/xe_pm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index da7ee67eefd1..d0a3019caa34 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -206,6 +206,9 @@ int xe_pm_resume(struct xe_device *xe)
 
 	xe_pxp_pm_resume(xe->pxp);
 
+	if (xe->d3cold.allowed)
+		xe_late_bind_fw_load(&xe->late_bind);
+
 	drm_dbg(&xe->drm, "Device resumed\n");
 	return 0;
 err:
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 8/9] drm/xe/xe_late_bind_fw: Introduce debug fs node to disable late binding
  2025-04-29 16:09 [RFC 0/9] Introducing firmware late binding Badal Nilawar
                   ` (6 preceding siblings ...)
  2025-04-29 16:09 ` [RFC 7/9] drm/xe/xe_late_bind_fw: Reload late binding fw in S2Idle/S3 resume Badal Nilawar
@ 2025-04-29 16:09 ` Badal Nilawar
  2025-04-29 16:09 ` [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl Badal Nilawar
  2025-04-30 11:47 ` [RFC 0/9] Introducing firmware late binding Jani Nikula
  9 siblings, 0 replies; 30+ messages in thread
From: Badal Nilawar @ 2025-04-29 16:09 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh,
	daniele.ceraolospurio

Introduce a debug filesystem node to disable late binding fw reload
during the system or runtime resume. This is intended for situations
where the late binding fw needs to be loaded from user mode.

Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 drivers/gpu/drm/xe/xe_debugfs.c            | 42 ++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_late_bind_fw.c       |  3 ++
 drivers/gpu/drm/xe/xe_late_bind_fw_types.h |  2 ++
 3 files changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
index d0503959a8ed..7f238a9e1a49 100644
--- a/drivers/gpu/drm/xe/xe_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_debugfs.c
@@ -185,12 +185,51 @@ static ssize_t wedged_mode_set(struct file *f, const char __user *ubuf,
 	return size;
 }
 
+static ssize_t disable_late_binding_show(struct file *f, char __user *ubuf,
+					 size_t size, loff_t *pos)
+{
+	struct xe_device *xe = file_inode(f)->i_private;
+	struct xe_late_bind *late_bind = &xe->late_bind;
+	char buf[32];
+	int len;
+
+	len = scnprintf(buf, sizeof(buf), "%d\n", late_bind->disable);
+
+	return simple_read_from_buffer(ubuf, size, pos, buf, len);
+}
+
+static ssize_t disable_late_binding_set(struct file *f, const char __user *ubuf,
+					size_t size, loff_t *pos)
+{
+	struct xe_device *xe = file_inode(f)->i_private;
+	struct xe_late_bind *late_bind = &xe->late_bind;
+	u32 uval;
+	ssize_t ret;
+
+	ret = kstrtouint_from_user(ubuf, size, sizeof(uval), &uval);
+	if (ret)
+		return ret;
+
+	if (uval > 1)
+		return -EINVAL;
+
+	late_bind->disable = (uval == 1) ? true : false;
+
+	return size;
+}
+
 static const struct file_operations wedged_mode_fops = {
 	.owner = THIS_MODULE,
 	.read = wedged_mode_show,
 	.write = wedged_mode_set,
 };
 
+static const struct file_operations disable_late_binding_fops = {
+	.owner = THIS_MODULE,
+	.read = disable_late_binding_show,
+	.write = disable_late_binding_set,
+};
+
 void xe_debugfs_register(struct xe_device *xe)
 {
 	struct ttm_device *bdev = &xe->ttm;
@@ -211,6 +250,9 @@ void xe_debugfs_register(struct xe_device *xe)
 	debugfs_create_file("wedged_mode", 0600, root, xe,
 			    &wedged_mode_fops);
 
+	debugfs_create_file("disable_late_binding", 0600, root, xe,
+			    &disable_late_binding_fops);
+
 	for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1; ++mem_type) {
 		man = ttm_manager_type(bdev, mem_type);
 
diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c b/drivers/gpu/drm/xe/xe_late_bind_fw.c
index 7d2bc959027d..1626cf793b23 100644
--- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
@@ -111,6 +111,9 @@ int xe_late_bind_fw_load(struct xe_late_bind *late_bind)
 	if (!late_bind->component_added)
 		return -EINVAL;
 
+	if (late_bind->disable)
+		return 0;
+
 	for (id = 0; id < MAX_ID; id++) {
 		lbfw = &late_bind->late_bind_fw[id];
 		if (lbfw->valid) {
diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw_types.h b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
index ea11061ce556..ccd482e67f91 100644
--- a/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
+++ b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
@@ -90,6 +90,8 @@ struct xe_late_bind {
 	/** @late_bind.late_bind_fw: late binding firmwares */
 	struct xe_late_bind_fw late_bind_fw[MAX_ID];
 
+	/** @late_bind.disable to block late binding reload during pm resume flow*/
+	bool disable;
 };
 
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl
  2025-04-29 16:09 [RFC 0/9] Introducing firmware late binding Badal Nilawar
                   ` (7 preceding siblings ...)
  2025-04-29 16:09 ` [RFC 8/9] drm/xe/xe_late_bind_fw: Introduce debug fs node to disable late binding Badal Nilawar
@ 2025-04-29 16:09 ` Badal Nilawar
  2025-05-01 15:44   ` Rodrigo Vivi
  2025-05-06 18:13   ` Jason Gunthorpe
  2025-04-30 11:47 ` [RFC 0/9] Introducing firmware late binding Jani Nikula
  9 siblings, 2 replies; 30+ messages in thread
From: Badal Nilawar @ 2025-04-29 16:09 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh,
	daniele.ceraolospurio

From: Rodrigo Vivi <rodrigo.vivi@intel.com>

Xe PCODE FWCTL implements the generic FWCTL IOCLTs to allow limited
access from user space (as admin) to some very specific PCODE
Mailboxes only related to hardware configuration.

PCODE is a Firmware in Intel GPUs which is the main responsible
component for power and thermal aspects of the Intel GPUs.

Each different Intel GPU came with different PCODE versions with
different mailboxes and different needs. In the lack of an unified
interface, the per platform sysfs entries at the device level is
trending to grow, to allow admins to control different aspects of
the Hardware.

In this first experiment, xe_pcode_fwctl only adds support for the
Battlemage late-binding firmware information.

Late-binding is the name given to 2 other new auxiliary firmware
blobs that for now lives in the Flash like PCODE, but that soon
it is coming to linux-firmware.git: Fan-controller and
Voltage-regulator. Then, PCODE provides some mailboxes where the
status of both late-binding firmware can be queried as specified
in the documentation that is added along with the new uAPI here.

RFC IMPORTANT NOTE:
===================
Admins will need to query this information. This code here aims
to be used by Level0-Sysman and/or Intel XPU Manager directly
from user space. But following the drm upstream rules, the
userspace code will need to be ready before we can consider
getting this patch merged!

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
---
 Documentation/userspace-api/fwctl/index.rst |   1 +
 drivers/gpu/drm/xe/Kconfig                  |   1 +
 drivers/gpu/drm/xe/Makefile                 |   1 +
 drivers/gpu/drm/xe/xe_pci.c                 |   5 +
 drivers/gpu/drm/xe/xe_pcode_fwctl.c         | 218 ++++++++++++++++++++
 drivers/gpu/drm/xe/xe_pcode_fwctl.h         |  13 ++
 include/uapi/fwctl/fwctl.h                  |   1 +
 include/uapi/fwctl/xe_pcode.h               |  80 +++++++
 8 files changed, 320 insertions(+)
 create mode 100644 drivers/gpu/drm/xe/xe_pcode_fwctl.c
 create mode 100644 drivers/gpu/drm/xe/xe_pcode_fwctl.h
 create mode 100644 include/uapi/fwctl/xe_pcode.h

diff --git a/Documentation/userspace-api/fwctl/index.rst b/Documentation/userspace-api/fwctl/index.rst
index 316ac456ad3b..186f8cf17583 100644
--- a/Documentation/userspace-api/fwctl/index.rst
+++ b/Documentation/userspace-api/fwctl/index.rst
@@ -12,3 +12,4 @@ to securely construct and execute RPCs inside device firmware.
    fwctl
    fwctl-cxl
    pds_fwctl
+   xe_pcode_fwctl
diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig
index a8cc1876a24f..ee77039b9256 100644
--- a/drivers/gpu/drm/xe/Kconfig
+++ b/drivers/gpu/drm/xe/Kconfig
@@ -45,6 +45,7 @@ config DRM_XE
 	select AUXILIARY_BUS
 	select HMM_MIRROR
 	select INTEL_MEI_LATE_BIND
+	select FWCTL
 	help
 	  Experimental driver for Intel Xe series GPUs
 
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 6de291a21965..c1f3b2e2da5f 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -86,6 +86,7 @@ xe-y += xe_bb.o \
 	xe_pat.o \
 	xe_pci.o \
 	xe_pcode.o \
+	xe_pcode_fwctl.o \
 	xe_pm.o \
 	xe_preempt_fence.o \
 	xe_pt.o \
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index 882398e09b7e..222e75c7427e 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -27,6 +27,7 @@
 #include "xe_module.h"
 #include "xe_pci_sriov.h"
 #include "xe_pci_types.h"
+#include "xe_pcode_fwctl.h"
 #include "xe_pm.h"
 #include "xe_sriov.h"
 #include "xe_step.h"
@@ -868,6 +869,10 @@ static int xe_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (err)
 		goto err_driver_cleanup;
 
+	err = xe_pcode_fwctl_init(xe);
+	if (err)
+		goto err_driver_cleanup;
+
 	drm_dbg(&xe->drm, "d3cold: capable=%s\n",
 		str_yes_no(xe->d3cold.capable));
 
diff --git a/drivers/gpu/drm/xe/xe_pcode_fwctl.c b/drivers/gpu/drm/xe/xe_pcode_fwctl.c
new file mode 100644
index 000000000000..d6443aa4a60a
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_pcode_fwctl.c
@@ -0,0 +1,218 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include "xe_pcode_fwctl.h"
+
+#include <linux/fwctl.h>
+#include <uapi/fwctl/xe_pcode.h>
+
+#include "xe_device.h"
+#include "xe_pcode_api.h"
+#include "xe_pcode.h"
+#include "xe_pm.h"
+
+/**
+ * DOC: XE PCODE FWCTL
+ *
+ * Xe PCODE FWCTL implements the generic FWCTL IOCLTs to allow limited access
+ * from user space (as admin) to some very specific PCODE Mailboxes.
+ *
+ * User space first needs to issue the ```FWCTL_INFO``` ioctl and check for the
+ * capability flag, which will indicate which group of Mailboxes commands are
+ * supported on that current running firmware.
+ *
+ * After verifying the availability of the desired Mailbox command,
+ * ```FWCTL_RPC``` needs to be issued with in and out parameter both using
+ * pointers to a ```struct fwctl_rpc_xe_pcode``` allocated by userspace.
+ * In and out length needs to be sizeof(struct fwctl_rpc_xe_pcode).
+ *
+ * Any command that is not listed in the include/uapi/fwctl/xe_pcode.h or not
+ * supported by the running firmware, will return ERR_PTR(-EBADMSG).
+ *
+ * Example:
+ *
+ * .. code-block:: C
+ *
+ *  struct fwctl_info_xe_pcode xe_pcode_info;
+ *
+ *  struct fwctl_info info = {
+ *           .size = sizeof(struct fwctl_info),
+ *           .flags = 0,
+ *           .out_device_type = 0,
+ *           .device_data_len = sizeof(struct fwctl_info_xe_pcode),
+ *           .out_device_data = (__aligned_u64) &xe_pcode_info,
+ *   };
+ *
+ *   fd = open("/dev/fwctl/fwctl0", O_RDWR);
+ *   if (fd < 0) {
+ *       perror("Failed to open /dev/fwctl/fwctl0");
+ *       return -1;
+ *   }
+ *
+ *   if (ioctl(fd, FWCTL_INFO, &info)) {
+ *           perror("ioctl(FWCTL_INFO) failed");
+ *           close(fd);
+ *           return -1;
+ *   }
+ *
+ *   if (xe_pcode_info.uctx_caps & FWCTL_XE_PCODE_LATEBINDING) {
+ *           struct fwctl_rpc_xe_pcode rpc_in = {
+ *                   .command = PCODE_CMD_LATE_BINDING,
+ *                   .param1 = PARAM1_GET_CAPABILITY_STATUS,
+ *           };
+ *
+ *           struct fwctl_rpc_xe_pcode rpc_out = {0};
+ *
+ *           struct fwctl_rpc rpc = {
+ *                   .size = sizeof(struct fwctl_rpc),
+ *                   .scope = FWCTL_RPC_CONFIGURATION,
+ *                   .in_len = sizeof(struct fwctl_rpc_xe_pcode),
+ *                   .out_len = sizeof(struct fwctl_rpc_xe_pcode),
+ *                   .in = (__aligned_u64) &rpc_in,
+ *                   .out = (__aligned_u64) &rpc_out,
+ *           };
+ *
+ *           if (ioctl(fd, FWCTL_RPC, &rpc)) {
+ *                   perror("ioctl(FWCTL_RPC) failed");
+ *                   close(fd);
+ *                   return -1;
+ *           }
+ *
+ */
+
+struct xe_pcode_fwctl_dev {
+	struct fwctl_device fwctl;
+	struct xe_device *xe;
+};
+
+DEFINE_FREE(xe_pcode_fwctl, struct xe_pcode_fwctl_dev *, if (_T) fwctl_put(&_T->fwctl))
+
+static int xe_pcode_fwctl_uctx_open(struct fwctl_uctx *uctx)
+{
+	struct xe_pcode_fwctl_dev *fwctl_dev =
+		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
+	struct xe_device *xe = fwctl_dev->xe;
+
+	xe_pm_runtime_get(xe);
+
+	return 0;
+}
+
+static void xe_pcode_fwctl_uctx_close(struct fwctl_uctx *uctx)
+{
+	struct xe_pcode_fwctl_dev *fwctl_dev =
+		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
+	struct xe_device *xe = fwctl_dev->xe;
+
+	xe_pm_runtime_put(xe);
+}
+
+static void *xe_pcode_fwctl_info(struct fwctl_uctx *uctx, size_t *length)
+{
+	struct xe_pcode_fwctl_dev *fwctl_dev =
+		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
+	struct xe_device *xe = fwctl_dev->xe;
+	struct fwctl_info_xe_pcode *info;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	if (xe->info.platform == XE_BATTLEMAGE)
+		info->uctx_caps = FWCTL_XE_PCODE_LATEBINDING;
+
+	*length = sizeof(*info);
+
+	return info;
+}
+
+static bool xe_pcode_fwctl_rpc_validate(struct fwctl_rpc_xe_pcode *rpc,
+					enum fwctl_rpc_scope scope)
+{
+	u32 mbox = PCODE_MBOX(rpc->command, rpc->param1, rpc->param2);
+
+	if (mbox == PCODE_MBOX(PCODE_CMD_LATE_BINDING,
+			       PARAM1_GET_CAPABILITY_STATUS, 0))
+		return scope == FWCTL_RPC_CONFIGURATION;
+
+	if (mbox == PCODE_MBOX(PCODE_CMD_LATE_BINDING,
+			       PARAM1_GET_VERSION_LOW, 0))
+		return (rpc->data0 == DATA0_TYPE_FAN_CONTROLLER ||
+			rpc->data0 == DATA0_TYPE_VOLTAGE_REGULATOR) &&
+			scope == FWCTL_RPC_CONFIGURATION;
+
+	return false;
+}
+
+static void *xe_pcode_fwctl_rpc(struct fwctl_uctx *uctx,
+				enum fwctl_rpc_scope scope,
+				void *in, size_t in_len, size_t *out_len)
+{
+	struct xe_pcode_fwctl_dev *fwctl_dev =
+		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
+	struct xe_tile *root_tile = xe_device_get_root_tile(fwctl_dev->xe);
+	struct fwctl_rpc_xe_pcode *rpc = in;
+	int err;
+
+	if (in_len != sizeof(struct fwctl_rpc_xe_pcode) ||
+	    *out_len != sizeof(struct fwctl_rpc_xe_pcode))
+		return ERR_PTR(-EMSGSIZE);
+
+	if (!xe_pcode_fwctl_rpc_validate(rpc, scope))
+		return ERR_PTR(-EBADMSG);
+
+	err = xe_pcode_read(root_tile, PCODE_MBOX(rpc->command,
+						  rpc->param1,
+						  rpc->param2),
+			    &rpc->data0,
+			    &rpc->data1);
+	if (err)
+		return ERR_PTR(err);
+
+	return rpc;
+}
+
+static const struct fwctl_ops xe_pcode_fwctl_ops = {
+	.device_type = FWCTL_DEVICE_TYPE_XE_PCODE,
+	.uctx_size = sizeof(struct fwctl_uctx),
+	.open_uctx = xe_pcode_fwctl_uctx_open,
+	.close_uctx = xe_pcode_fwctl_uctx_close,
+	.info = xe_pcode_fwctl_info,
+	.fw_rpc = xe_pcode_fwctl_rpc,
+};
+
+static void xe_pcode_fwctl_fini(void *dev)
+{
+	struct fwctl_device *fwctl = dev;
+
+	fwctl_unregister(fwctl);
+	fwctl_put(fwctl);
+}
+
+int xe_pcode_fwctl_init(struct xe_device *xe)
+{
+	struct xe_pcode_fwctl_dev *fwctl_dev __free(xe_pcode_fwctl) =
+		fwctl_alloc_device(xe->drm.dev, &xe_pcode_fwctl_ops,
+				   struct xe_pcode_fwctl_dev, fwctl);
+	int err;
+
+	/* For now xe_pcode_fwctl supports only Late-Binding commands on BMG */
+	if (xe->info.platform != XE_BATTLEMAGE)
+		return -ENODEV;
+
+	if (!fwctl_dev)
+		return -ENOMEM;
+
+	fwctl_dev->xe = xe;
+
+	err = fwctl_register(&fwctl_dev->fwctl);
+	if (err)
+		return err;
+
+	return devm_add_action_or_reset(xe->drm.dev, xe_pcode_fwctl_fini,
+					&fwctl_dev->fwctl);
+}
+
+MODULE_IMPORT_NS("FWCTL");
diff --git a/drivers/gpu/drm/xe/xe_pcode_fwctl.h b/drivers/gpu/drm/xe/xe_pcode_fwctl.h
new file mode 100644
index 000000000000..67386d7bf2ea
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_pcode_fwctl.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_PCODE_FWCTL_H_
+#define _XE_PCODE_FWCTL_H_
+
+struct xe_device;
+
+int xe_pcode_fwctl_init(struct xe_device *xe);
+
+#endif
diff --git a/include/uapi/fwctl/fwctl.h b/include/uapi/fwctl/fwctl.h
index 716ac0eee42d..9e7e84aef791 100644
--- a/include/uapi/fwctl/fwctl.h
+++ b/include/uapi/fwctl/fwctl.h
@@ -45,6 +45,7 @@ enum fwctl_device_type {
 	FWCTL_DEVICE_TYPE_MLX5 = 1,
 	FWCTL_DEVICE_TYPE_CXL = 2,
 	FWCTL_DEVICE_TYPE_PDS = 4,
+	FWCTL_DEVICE_TYPE_XE_PCODE = 5,
 };
 
 /**
diff --git a/include/uapi/fwctl/xe_pcode.h b/include/uapi/fwctl/xe_pcode.h
new file mode 100644
index 000000000000..8df6db34e5ce
--- /dev/null
+++ b/include/uapi/fwctl/xe_pcode.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _UAPI_FWCTL_XE_PCODE_H_
+#define _UAPI_FWCTL_XE_PCODE_H_
+
+#include <linux/types.h>
+
+/**
+ * struct fwctl_info_xe_pcode - FWCTL Information struct for Xe PCODE
+ *
+ * @uctx_caps:  bitmap of available capabilities:
+ *  - %FWCTL_XE_PCODE_LATEBINDING - Command to configure Late Bind FW such as
+ * Fan Controller and Voltage Regulator
+ * @rsvd: Reserved for future usage or flags
+ */
+struct fwctl_info_xe_pcode {
+	__u32 uctx_caps;
+	__u32 rsvd[3];
+};
+
+#define FWCTL_XE_PCODE_LATEBINDING	(1 << 0)
+
+/**
+ * struct fwctl_rpc_xe_pcode - FWCTL Remote Procedure Calls for Xe PCODE
+ */
+struct fwctl_rpc_xe_pcode {
+	/** @command: The main Mailbox command */
+	__u8 command;
+	/** @param1: A subcommand or a parameter of the main command */
+	__u16 param1;
+	/** @param2: A parameter of a subcommand or a subsubcommand */
+	__u16 param2;
+	/** @data0: The first 32 bits of data. In general data-in as param */
+	__u32 data0;
+	/** @data1: The other 32 bits of data. In general data-out */
+	__u32 data1;
+	/** @pad: Padding the uAPI struct - Must be 0. Not sent to firmware */
+	__u8 pad[3];
+};
+
+/**
+ * DOC: Late Binding Commands
+ *
+ * FWCTL info.uctx_caps: FWCTL_XE_PCODE_LATEBINDING
+ * FWCTL rpc.scope: FWCTL_RPC_CONFIGURATION
+ *
+ * Command	0x5C - LATE_BINDING
+ * Param1	0x0 - GET_CAPABILITY_STATUS
+ * Param2	0
+ * Data in	None
+ * Data out:
+ *
+ *  - Bit0: ate binding for V1 Fan Tables is supported.
+ *  - Bit3: Late binding for VR parameters.
+ *  - Bit16: Late binding done for V1 Fan tables
+ *  - Bit17: Late binding done for power co-efficients.
+ *  - Bit18: Late binding done for V2 Fan tables
+ *  - Bit19: Late binding done for VR Parameters
+ *
+ * Command	0x5C - LATE_BINDING
+ * Param1	0x1 - GET_VERSION_LOW
+ * Param2	0
+ * Data in - conveys the Type of the Late Binding Configuration:
+ *
+ *  - FAN_CONTROLLER = 1
+ *  - VOLTAGE_REGULATOR = 2
+ *
+ * Data out - Lower 32 bits of Version Number for Late Binding configuration
+ *            that has been applied successfully.
+ */
+#define PCODE_CMD_LATE_BINDING		0x5C
+#define  PARAM1_GET_CAPABILITY_STATUS	0x0
+#define  PARAM1_GET_VERSION_LOW		0x1
+#define   DATA0_TYPE_FAN_CONTROLLER	1
+#define   DATA0_TYPE_VOLTAGE_REGULATOR	2
+
+#endif /* _UAPI_FWCTL_XE_PCODE_H_ */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [RFC 0/9] Introducing firmware late binding
  2025-04-29 16:09 [RFC 0/9] Introducing firmware late binding Badal Nilawar
                   ` (8 preceding siblings ...)
  2025-04-29 16:09 ` [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl Badal Nilawar
@ 2025-04-30 11:47 ` Jani Nikula
  9 siblings, 0 replies; 30+ messages in thread
From: Jani Nikula @ 2025-04-30 11:47 UTC (permalink / raw)
  To: Badal Nilawar, intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh,
	daniele.ceraolospurio

On Tue, 29 Apr 2025, Badal Nilawar <badal.nilawar@intel.com> wrote:
> Introducing firmware late binding feature to enable firmware loading
> for the devices, such as the fan controller and voltage regulator,
> during the driver probe.
> Typically, firmware for these devices are part of IFWI flash image but
> can be replaced at probe after OEM tuning.

I replied to the previous version [1], the comment has been ignored, and
there's no version information in this series on what changed.

Please don't do that.

BR,
Jani.


[1] https://lore.kernel.org/r/87v7qnwdm3.fsf@intel.com


>
> Alexander Usyskin (2):
>   mei: bus: add mei_cldev_mtu interface
>   mei: late_bind: add late binding component driver
>
> Badal Nilawar (6):
>   drm/xe/late_bind_fw: Introducing late_bind_fw
>   drm/xe/xe_late_bind_fw: Initialize late binding firmware
>   drm/xe/xe_late_bind_fw: Load late binding firmware
>   drm/xe/xe_late_bind_fw: Reload late binding fw in rpm resume
>   drm/xe/xe_late_bind_fw: Reload late binding fw in S2Idle/S3 resume
>   drm/xe/xe_late_bind_fw: Introduce debug fs node to disable late
>     binding
>
> Rodrigo Vivi (1):
>   {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl
>
>  Documentation/userspace-api/fwctl/index.rst   |   1 +
>  drivers/gpu/drm/xe/Kconfig                    |   2 +
>  drivers/gpu/drm/xe/Makefile                   |   2 +
>  drivers/gpu/drm/xe/xe_debugfs.c               |  42 +++
>  drivers/gpu/drm/xe/xe_device.c                |   7 +
>  drivers/gpu/drm/xe/xe_device_types.h          |   4 +
>  drivers/gpu/drm/xe/xe_late_bind_fw.c          | 289 ++++++++++++++++++
>  drivers/gpu/drm/xe/xe_late_bind_fw.h          |  18 ++
>  drivers/gpu/drm/xe/xe_late_bind_fw_types.h    |  97 ++++++
>  drivers/gpu/drm/xe/xe_pci.c                   |   5 +
>  drivers/gpu/drm/xe/xe_pcode_fwctl.c           | 218 +++++++++++++
>  drivers/gpu/drm/xe/xe_pcode_fwctl.h           |  13 +
>  drivers/gpu/drm/xe/xe_pm.c                    |   7 +
>  drivers/misc/mei/Kconfig                      |   1 +
>  drivers/misc/mei/Makefile                     |   1 +
>  drivers/misc/mei/bus.c                        |  13 +
>  drivers/misc/mei/late_bind/Kconfig            |  12 +
>  drivers/misc/mei/late_bind/Makefile           |   9 +
>  drivers/misc/mei/late_bind/mei_late_bind.c    | 261 ++++++++++++++++
>  include/drm/intel/i915_component.h            |   1 +
>  .../drm/intel/xe_late_bind_mei_interface.h    |  49 +++
>  include/linux/mei_cl_bus.h                    |   1 +
>  include/uapi/fwctl/fwctl.h                    |   1 +
>  include/uapi/fwctl/xe_pcode.h                 |  80 +++++
>  24 files changed, 1134 insertions(+)
>  create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw.c
>  create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw.h
>  create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>  create mode 100644 drivers/gpu/drm/xe/xe_pcode_fwctl.c
>  create mode 100644 drivers/gpu/drm/xe/xe_pcode_fwctl.h
>  create mode 100644 drivers/misc/mei/late_bind/Kconfig
>  create mode 100644 drivers/misc/mei/late_bind/Makefile
>  create mode 100644 drivers/misc/mei/late_bind/mei_late_bind.c
>  create mode 100644 include/drm/intel/xe_late_bind_mei_interface.h
>  create mode 100644 include/uapi/fwctl/xe_pcode.h

-- 
Jani Nikula, Intel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl
  2025-04-29 16:09 ` [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl Badal Nilawar
@ 2025-05-01 15:44   ` Rodrigo Vivi
  2025-05-06 18:13   ` Jason Gunthorpe
  1 sibling, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2025-05-01 15:44 UTC (permalink / raw)
  To: Badal Nilawar, Dave Jiang, Jason Gunthorpe, Saeed Mahameed,
	onathan Cameron, Simona Vetter, Dave Airlie, linux-kernel
  Cc: intel-xe, dri-devel, anshuman.gupta, alexander.usyskin, gregkh,
	daniele.ceraolospurio

On Tue, Apr 29, 2025 at 09:39:56PM +0530, Badal Nilawar wrote:
> From: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
> Xe PCODE FWCTL implements the generic FWCTL IOCLTs to allow limited
> access from user space (as admin) to some very specific PCODE
> Mailboxes only related to hardware configuration.
> 
> PCODE is a Firmware in Intel GPUs which is the main responsible
> component for power and thermal aspects of the Intel GPUs.
> 
> Each different Intel GPU came with different PCODE versions with
> different mailboxes and different needs. In the lack of an unified
> interface, the per platform sysfs entries at the device level is
> trending to grow, to allow admins to control different aspects of
> the Hardware.
> 
> In this first experiment, xe_pcode_fwctl only adds support for the
> Battlemage late-binding firmware information.
> 
> Late-binding is the name given to 2 other new auxiliary firmware
> blobs that for now lives in the Flash like PCODE, but that soon
> it is coming to linux-firmware.git: Fan-controller and
> Voltage-regulator. Then, PCODE provides some mailboxes where the
> status of both late-binding firmware can be queried as specified
> in the documentation that is added along with the new uAPI here.
> 
> RFC IMPORTANT NOTE:
> ===================
> Admins will need to query this information. This code here aims
> to be used by Level0-Sysman and/or Intel XPU Manager directly
> from user space. But following the drm upstream rules, the
> userspace code will need to be ready before we can consider
> getting this patch merged!
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
> ---
>  Documentation/userspace-api/fwctl/index.rst |   1 +
>  drivers/gpu/drm/xe/Kconfig                  |   1 +
>  drivers/gpu/drm/xe/Makefile                 |   1 +
>  drivers/gpu/drm/xe/xe_pci.c                 |   5 +
>  drivers/gpu/drm/xe/xe_pcode_fwctl.c         | 218 ++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_pcode_fwctl.h         |  13 ++
>  include/uapi/fwctl/fwctl.h                  |   1 +
>  include/uapi/fwctl/xe_pcode.h               |  80 +++++++
>  8 files changed, 320 insertions(+)
>  create mode 100644 drivers/gpu/drm/xe/xe_pcode_fwctl.c
>  create mode 100644 drivers/gpu/drm/xe/xe_pcode_fwctl.h
>  create mode 100644 include/uapi/fwctl/xe_pcode.h
> 
> diff --git a/Documentation/userspace-api/fwctl/index.rst b/Documentation/userspace-api/fwctl/index.rst
> index 316ac456ad3b..186f8cf17583 100644
> --- a/Documentation/userspace-api/fwctl/index.rst
> +++ b/Documentation/userspace-api/fwctl/index.rst
> @@ -12,3 +12,4 @@ to securely construct and execute RPCs inside device firmware.
>     fwctl
>     fwctl-cxl
>     pds_fwctl
> +   xe_pcode_fwctl

I just noticed that I forgot to actually include this file and add to
the MAINTAINERS list under xe.

But the important part which is the documentation itself is inside
the include/uapi/fwctl/xe_pcode.h

For the record, the missing file is:
$ cat Documentation/userspace-api/fwctl/xe_pcode_fwctl.rst
.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)

==================
fwctl drm/xe pcode
==================

.. kernel-doc:: drivers/gpu/drm/xe/xe_pcode_fwctl.c
   :doc: XE PCODE FWCTL

uAPI
====

.. kernel-doc:: include/uapi/fwctl/xe_pcode.h
   :internal:

.. kernel-doc:: include/uapi/fwctl/xe_pcode.h
   :doc: Late Binding Commands


Also cc'ing FWCTL and DRM maintainers and LKML to ensure we get the proper
feedback on this first attempt of using the fwctl with xe.

Thanks,
Rodrigo.

> diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig
> index a8cc1876a24f..ee77039b9256 100644
> --- a/drivers/gpu/drm/xe/Kconfig
> +++ b/drivers/gpu/drm/xe/Kconfig
> @@ -45,6 +45,7 @@ config DRM_XE
>  	select AUXILIARY_BUS
>  	select HMM_MIRROR
>  	select INTEL_MEI_LATE_BIND
> +	select FWCTL
>  	help
>  	  Experimental driver for Intel Xe series GPUs
>  
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 6de291a21965..c1f3b2e2da5f 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -86,6 +86,7 @@ xe-y += xe_bb.o \
>  	xe_pat.o \
>  	xe_pci.o \
>  	xe_pcode.o \
> +	xe_pcode_fwctl.o \
>  	xe_pm.o \
>  	xe_preempt_fence.o \
>  	xe_pt.o \
> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> index 882398e09b7e..222e75c7427e 100644
> --- a/drivers/gpu/drm/xe/xe_pci.c
> +++ b/drivers/gpu/drm/xe/xe_pci.c
> @@ -27,6 +27,7 @@
>  #include "xe_module.h"
>  #include "xe_pci_sriov.h"
>  #include "xe_pci_types.h"
> +#include "xe_pcode_fwctl.h"
>  #include "xe_pm.h"
>  #include "xe_sriov.h"
>  #include "xe_step.h"
> @@ -868,6 +869,10 @@ static int xe_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	if (err)
>  		goto err_driver_cleanup;
>  
> +	err = xe_pcode_fwctl_init(xe);
> +	if (err)
> +		goto err_driver_cleanup;
> +
>  	drm_dbg(&xe->drm, "d3cold: capable=%s\n",
>  		str_yes_no(xe->d3cold.capable));
>  
> diff --git a/drivers/gpu/drm/xe/xe_pcode_fwctl.c b/drivers/gpu/drm/xe/xe_pcode_fwctl.c
> new file mode 100644
> index 000000000000..d6443aa4a60a
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pcode_fwctl.c
> @@ -0,0 +1,218 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include "xe_pcode_fwctl.h"
> +
> +#include <linux/fwctl.h>
> +#include <uapi/fwctl/xe_pcode.h>
> +
> +#include "xe_device.h"
> +#include "xe_pcode_api.h"
> +#include "xe_pcode.h"
> +#include "xe_pm.h"
> +
> +/**
> + * DOC: XE PCODE FWCTL
> + *
> + * Xe PCODE FWCTL implements the generic FWCTL IOCLTs to allow limited access
> + * from user space (as admin) to some very specific PCODE Mailboxes.
> + *
> + * User space first needs to issue the ```FWCTL_INFO``` ioctl and check for the
> + * capability flag, which will indicate which group of Mailboxes commands are
> + * supported on that current running firmware.
> + *
> + * After verifying the availability of the desired Mailbox command,
> + * ```FWCTL_RPC``` needs to be issued with in and out parameter both using
> + * pointers to a ```struct fwctl_rpc_xe_pcode``` allocated by userspace.
> + * In and out length needs to be sizeof(struct fwctl_rpc_xe_pcode).
> + *
> + * Any command that is not listed in the include/uapi/fwctl/xe_pcode.h or not
> + * supported by the running firmware, will return ERR_PTR(-EBADMSG).
> + *
> + * Example:
> + *
> + * .. code-block:: C
> + *
> + *  struct fwctl_info_xe_pcode xe_pcode_info;
> + *
> + *  struct fwctl_info info = {
> + *           .size = sizeof(struct fwctl_info),
> + *           .flags = 0,
> + *           .out_device_type = 0,
> + *           .device_data_len = sizeof(struct fwctl_info_xe_pcode),
> + *           .out_device_data = (__aligned_u64) &xe_pcode_info,
> + *   };
> + *
> + *   fd = open("/dev/fwctl/fwctl0", O_RDWR);
> + *   if (fd < 0) {
> + *       perror("Failed to open /dev/fwctl/fwctl0");
> + *       return -1;
> + *   }
> + *
> + *   if (ioctl(fd, FWCTL_INFO, &info)) {
> + *           perror("ioctl(FWCTL_INFO) failed");
> + *           close(fd);
> + *           return -1;
> + *   }
> + *
> + *   if (xe_pcode_info.uctx_caps & FWCTL_XE_PCODE_LATEBINDING) {
> + *           struct fwctl_rpc_xe_pcode rpc_in = {
> + *                   .command = PCODE_CMD_LATE_BINDING,
> + *                   .param1 = PARAM1_GET_CAPABILITY_STATUS,
> + *           };
> + *
> + *           struct fwctl_rpc_xe_pcode rpc_out = {0};
> + *
> + *           struct fwctl_rpc rpc = {
> + *                   .size = sizeof(struct fwctl_rpc),
> + *                   .scope = FWCTL_RPC_CONFIGURATION,
> + *                   .in_len = sizeof(struct fwctl_rpc_xe_pcode),
> + *                   .out_len = sizeof(struct fwctl_rpc_xe_pcode),
> + *                   .in = (__aligned_u64) &rpc_in,
> + *                   .out = (__aligned_u64) &rpc_out,
> + *           };
> + *
> + *           if (ioctl(fd, FWCTL_RPC, &rpc)) {
> + *                   perror("ioctl(FWCTL_RPC) failed");
> + *                   close(fd);
> + *                   return -1;
> + *           }
> + *
> + */
> +
> +struct xe_pcode_fwctl_dev {
> +	struct fwctl_device fwctl;
> +	struct xe_device *xe;
> +};
> +
> +DEFINE_FREE(xe_pcode_fwctl, struct xe_pcode_fwctl_dev *, if (_T) fwctl_put(&_T->fwctl))
> +
> +static int xe_pcode_fwctl_uctx_open(struct fwctl_uctx *uctx)
> +{
> +	struct xe_pcode_fwctl_dev *fwctl_dev =
> +		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
> +	struct xe_device *xe = fwctl_dev->xe;
> +
> +	xe_pm_runtime_get(xe);
> +
> +	return 0;
> +}
> +
> +static void xe_pcode_fwctl_uctx_close(struct fwctl_uctx *uctx)
> +{
> +	struct xe_pcode_fwctl_dev *fwctl_dev =
> +		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
> +	struct xe_device *xe = fwctl_dev->xe;
> +
> +	xe_pm_runtime_put(xe);
> +}
> +
> +static void *xe_pcode_fwctl_info(struct fwctl_uctx *uctx, size_t *length)
> +{
> +	struct xe_pcode_fwctl_dev *fwctl_dev =
> +		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
> +	struct xe_device *xe = fwctl_dev->xe;
> +	struct fwctl_info_xe_pcode *info;
> +
> +	info = kzalloc(sizeof(*info), GFP_KERNEL);
> +	if (!info)
> +		return ERR_PTR(-ENOMEM);
> +
> +	if (xe->info.platform == XE_BATTLEMAGE)
> +		info->uctx_caps = FWCTL_XE_PCODE_LATEBINDING;
> +
> +	*length = sizeof(*info);
> +
> +	return info;
> +}
> +
> +static bool xe_pcode_fwctl_rpc_validate(struct fwctl_rpc_xe_pcode *rpc,
> +					enum fwctl_rpc_scope scope)
> +{
> +	u32 mbox = PCODE_MBOX(rpc->command, rpc->param1, rpc->param2);
> +
> +	if (mbox == PCODE_MBOX(PCODE_CMD_LATE_BINDING,
> +			       PARAM1_GET_CAPABILITY_STATUS, 0))
> +		return scope == FWCTL_RPC_CONFIGURATION;
> +
> +	if (mbox == PCODE_MBOX(PCODE_CMD_LATE_BINDING,
> +			       PARAM1_GET_VERSION_LOW, 0))
> +		return (rpc->data0 == DATA0_TYPE_FAN_CONTROLLER ||
> +			rpc->data0 == DATA0_TYPE_VOLTAGE_REGULATOR) &&
> +			scope == FWCTL_RPC_CONFIGURATION;
> +
> +	return false;
> +}
> +
> +static void *xe_pcode_fwctl_rpc(struct fwctl_uctx *uctx,
> +				enum fwctl_rpc_scope scope,
> +				void *in, size_t in_len, size_t *out_len)
> +{
> +	struct xe_pcode_fwctl_dev *fwctl_dev =
> +		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
> +	struct xe_tile *root_tile = xe_device_get_root_tile(fwctl_dev->xe);
> +	struct fwctl_rpc_xe_pcode *rpc = in;
> +	int err;
> +
> +	if (in_len != sizeof(struct fwctl_rpc_xe_pcode) ||
> +	    *out_len != sizeof(struct fwctl_rpc_xe_pcode))
> +		return ERR_PTR(-EMSGSIZE);
> +
> +	if (!xe_pcode_fwctl_rpc_validate(rpc, scope))
> +		return ERR_PTR(-EBADMSG);
> +
> +	err = xe_pcode_read(root_tile, PCODE_MBOX(rpc->command,
> +						  rpc->param1,
> +						  rpc->param2),
> +			    &rpc->data0,
> +			    &rpc->data1);
> +	if (err)
> +		return ERR_PTR(err);
> +
> +	return rpc;
> +}
> +
> +static const struct fwctl_ops xe_pcode_fwctl_ops = {
> +	.device_type = FWCTL_DEVICE_TYPE_XE_PCODE,
> +	.uctx_size = sizeof(struct fwctl_uctx),
> +	.open_uctx = xe_pcode_fwctl_uctx_open,
> +	.close_uctx = xe_pcode_fwctl_uctx_close,
> +	.info = xe_pcode_fwctl_info,
> +	.fw_rpc = xe_pcode_fwctl_rpc,
> +};
> +
> +static void xe_pcode_fwctl_fini(void *dev)
> +{
> +	struct fwctl_device *fwctl = dev;
> +
> +	fwctl_unregister(fwctl);
> +	fwctl_put(fwctl);
> +}
> +
> +int xe_pcode_fwctl_init(struct xe_device *xe)
> +{
> +	struct xe_pcode_fwctl_dev *fwctl_dev __free(xe_pcode_fwctl) =
> +		fwctl_alloc_device(xe->drm.dev, &xe_pcode_fwctl_ops,
> +				   struct xe_pcode_fwctl_dev, fwctl);
> +	int err;
> +
> +	/* For now xe_pcode_fwctl supports only Late-Binding commands on BMG */
> +	if (xe->info.platform != XE_BATTLEMAGE)
> +		return -ENODEV;
> +
> +	if (!fwctl_dev)
> +		return -ENOMEM;
> +
> +	fwctl_dev->xe = xe;
> +
> +	err = fwctl_register(&fwctl_dev->fwctl);
> +	if (err)
> +		return err;
> +
> +	return devm_add_action_or_reset(xe->drm.dev, xe_pcode_fwctl_fini,
> +					&fwctl_dev->fwctl);
> +}
> +
> +MODULE_IMPORT_NS("FWCTL");
> diff --git a/drivers/gpu/drm/xe/xe_pcode_fwctl.h b/drivers/gpu/drm/xe/xe_pcode_fwctl.h
> new file mode 100644
> index 000000000000..67386d7bf2ea
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pcode_fwctl.h
> @@ -0,0 +1,13 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_PCODE_FWCTL_H_
> +#define _XE_PCODE_FWCTL_H_
> +
> +struct xe_device;
> +
> +int xe_pcode_fwctl_init(struct xe_device *xe);
> +
> +#endif
> diff --git a/include/uapi/fwctl/fwctl.h b/include/uapi/fwctl/fwctl.h
> index 716ac0eee42d..9e7e84aef791 100644
> --- a/include/uapi/fwctl/fwctl.h
> +++ b/include/uapi/fwctl/fwctl.h
> @@ -45,6 +45,7 @@ enum fwctl_device_type {
>  	FWCTL_DEVICE_TYPE_MLX5 = 1,
>  	FWCTL_DEVICE_TYPE_CXL = 2,
>  	FWCTL_DEVICE_TYPE_PDS = 4,
> +	FWCTL_DEVICE_TYPE_XE_PCODE = 5,
>  };
>  
>  /**
> diff --git a/include/uapi/fwctl/xe_pcode.h b/include/uapi/fwctl/xe_pcode.h
> new file mode 100644
> index 000000000000..8df6db34e5ce
> --- /dev/null
> +++ b/include/uapi/fwctl/xe_pcode.h
> @@ -0,0 +1,80 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _UAPI_FWCTL_XE_PCODE_H_
> +#define _UAPI_FWCTL_XE_PCODE_H_
> +
> +#include <linux/types.h>
> +
> +/**
> + * struct fwctl_info_xe_pcode - FWCTL Information struct for Xe PCODE
> + *
> + * @uctx_caps:  bitmap of available capabilities:
> + *  - %FWCTL_XE_PCODE_LATEBINDING - Command to configure Late Bind FW such as
> + * Fan Controller and Voltage Regulator
> + * @rsvd: Reserved for future usage or flags
> + */
> +struct fwctl_info_xe_pcode {
> +	__u32 uctx_caps;
> +	__u32 rsvd[3];
> +};
> +
> +#define FWCTL_XE_PCODE_LATEBINDING	(1 << 0)
> +
> +/**
> + * struct fwctl_rpc_xe_pcode - FWCTL Remote Procedure Calls for Xe PCODE
> + */
> +struct fwctl_rpc_xe_pcode {
> +	/** @command: The main Mailbox command */
> +	__u8 command;
> +	/** @param1: A subcommand or a parameter of the main command */
> +	__u16 param1;
> +	/** @param2: A parameter of a subcommand or a subsubcommand */
> +	__u16 param2;
> +	/** @data0: The first 32 bits of data. In general data-in as param */
> +	__u32 data0;
> +	/** @data1: The other 32 bits of data. In general data-out */
> +	__u32 data1;
> +	/** @pad: Padding the uAPI struct - Must be 0. Not sent to firmware */
> +	__u8 pad[3];
> +};
> +
> +/**
> + * DOC: Late Binding Commands
> + *
> + * FWCTL info.uctx_caps: FWCTL_XE_PCODE_LATEBINDING
> + * FWCTL rpc.scope: FWCTL_RPC_CONFIGURATION
> + *
> + * Command	0x5C - LATE_BINDING
> + * Param1	0x0 - GET_CAPABILITY_STATUS
> + * Param2	0
> + * Data in	None
> + * Data out:
> + *
> + *  - Bit0: ate binding for V1 Fan Tables is supported.
> + *  - Bit3: Late binding for VR parameters.
> + *  - Bit16: Late binding done for V1 Fan tables
> + *  - Bit17: Late binding done for power co-efficients.
> + *  - Bit18: Late binding done for V2 Fan tables
> + *  - Bit19: Late binding done for VR Parameters
> + *
> + * Command	0x5C - LATE_BINDING
> + * Param1	0x1 - GET_VERSION_LOW
> + * Param2	0
> + * Data in - conveys the Type of the Late Binding Configuration:
> + *
> + *  - FAN_CONTROLLER = 1
> + *  - VOLTAGE_REGULATOR = 2
> + *
> + * Data out - Lower 32 bits of Version Number for Late Binding configuration
> + *            that has been applied successfully.
> + */
> +#define PCODE_CMD_LATE_BINDING		0x5C
> +#define  PARAM1_GET_CAPABILITY_STATUS	0x0
> +#define  PARAM1_GET_VERSION_LOW		0x1
> +#define   DATA0_TYPE_FAN_CONTROLLER	1
> +#define   DATA0_TYPE_VOLTAGE_REGULATOR	2
> +
> +#endif /* _UAPI_FWCTL_XE_PCODE_H_ */
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl
  2025-04-29 16:09 ` [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl Badal Nilawar
  2025-05-01 15:44   ` Rodrigo Vivi
@ 2025-05-06 18:13   ` Jason Gunthorpe
  2025-05-07 19:49     ` Rodrigo Vivi
                       ` (2 more replies)
  1 sibling, 3 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2025-05-06 18:13 UTC (permalink / raw)
  To: Badal Nilawar
  Cc: intel-xe, dri-devel, anshuman.gupta, rodrigo.vivi,
	alexander.usyskin, gregkh, daniele.ceraolospurio

On Tue, Apr 29, 2025 at 09:39:56PM +0530, Badal Nilawar wrote:

> diff --git a/drivers/gpu/drm/xe/xe_pcode_fwctl.c b/drivers/gpu/drm/xe/xe_pcode_fwctl.c
> new file mode 100644

I really do prefer it if you can find a way to put the code in
drivers/fwctl instead of in DRM subsystem.
> +static int xe_pcode_fwctl_uctx_open(struct fwctl_uctx *uctx)
> +{
> +	struct xe_pcode_fwctl_dev *fwctl_dev =
> +		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
> +	struct xe_device *xe = fwctl_dev->xe;
> +
> +	xe_pm_runtime_get(xe);

Shouldn't this be in the RPC function? Why keep the device awake as
long as a the FD is open?

> +static void *xe_pcode_fwctl_rpc(struct fwctl_uctx *uctx,
> +				enum fwctl_rpc_scope scope,
> +				void *in, size_t in_len, size_t *out_len)
> +{
> +	struct xe_pcode_fwctl_dev *fwctl_dev =
> +		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
> +	struct xe_tile *root_tile = xe_device_get_root_tile(fwctl_dev->xe);
> +	struct fwctl_rpc_xe_pcode *rpc = in;
> +	int err;
> +
> +	if (in_len != sizeof(struct fwctl_rpc_xe_pcode) ||
> +	    *out_len != sizeof(struct fwctl_rpc_xe_pcode))
> +		return ERR_PTR(-EMSGSIZE);
> +
> +	if (!xe_pcode_fwctl_rpc_validate(rpc, scope))
> +		return ERR_PTR(-EBADMSG);

There should be an EPERM here if the scope is not good enough..
> +/**
> + * struct fwctl_rpc_xe_pcode - FWCTL Remote Procedure Calls for Xe PCODE
> + */
> +struct fwctl_rpc_xe_pcode {
> +	/** @command: The main Mailbox command */
> +	__u8 command;
> +	/** @param1: A subcommand or a parameter of the main command */
> +	__u16 param1;
> +	/** @param2: A parameter of a subcommand or a subsubcommand */
> +	__u16 param2;
> +	/** @data0: The first 32 bits of data. In general data-in as param */
> +	__u32 data0;
> +	/** @data1: The other 32 bits of data. In general data-out */
> +	__u32 data1;
> +	/** @pad: Padding the uAPI struct - Must be 0. Not sent to firmware */
> +	__u8 pad[3];
> +};

This has implicit padding? Make the padding explicit or use packed..
> +/**
> + * DOC: Late Binding Commands
> + *
> + * FWCTL info.uctx_caps: FWCTL_XE_PCODE_LATEBINDING
> + * FWCTL rpc.scope: FWCTL_RPC_CONFIGURATION
> + *
> + * Command	0x5C - LATE_BINDING
> + * Param1	0x0 - GET_CAPABILITY_STATUS
> + * Param2	0
> + * Data in	None
> + * Data out:
> + *
> + *  - Bit0: ate binding for V1 Fan Tables is supported.

"ate" is a typo?

This seems fine, though very simple in what it can do. Do you imagine
more commands down the road?

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl
  2025-05-06 18:13   ` Jason Gunthorpe
@ 2025-05-07 19:49     ` Rodrigo Vivi
  2025-05-07 22:04       ` Jason Gunthorpe
  2025-06-06 13:47       ` Nilawar, Badal
  2025-06-06 13:45     ` Nilawar, Badal
  2025-06-30 22:01     ` Rodrigo Vivi
  2 siblings, 2 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2025-05-07 19:49 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Badal Nilawar, intel-xe, dri-devel, anshuman.gupta,
	alexander.usyskin, gregkh, daniele.ceraolospurio

On Tue, May 06, 2025 at 03:13:53PM -0300, Jason Gunthorpe wrote:

Hi Jason,

first of all, thank you so much for creating this and for the review here.

> On Tue, Apr 29, 2025 at 09:39:56PM +0530, Badal Nilawar wrote:
> 
> > diff --git a/drivers/gpu/drm/xe/xe_pcode_fwctl.c b/drivers/gpu/drm/xe/xe_pcode_fwctl.c
> > new file mode 100644
> 
> I really do prefer it if you can find a way to put the code in
> drivers/fwctl instead of in DRM subsystem.

I was really in doubt about this and decided to go more towards the CXL way,
but if there's a stronger preference it would be okay by me if it is okay by
Sima and Dave as well.

> > +static int xe_pcode_fwctl_uctx_open(struct fwctl_uctx *uctx)
> > +{
> > +	struct xe_pcode_fwctl_dev *fwctl_dev =
> > +		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
> > +	struct xe_device *xe = fwctl_dev->xe;
> > +
> > +	xe_pm_runtime_get(xe);
> 
> Shouldn't this be in the RPC function? Why keep the device awake as
> long as a the FD is open?

In general I prefer the runtime on the outer bounds to avoid funny deadlock
cases. But right, in this case here it could be inside the xe_pcode calls
itself, that is when the mmio/mailboxes accesses will really happen.

> 
> > +static void *xe_pcode_fwctl_rpc(struct fwctl_uctx *uctx,
> > +				enum fwctl_rpc_scope scope,
> > +				void *in, size_t in_len, size_t *out_len)
> > +{
> > +	struct xe_pcode_fwctl_dev *fwctl_dev =
> > +		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
> > +	struct xe_tile *root_tile = xe_device_get_root_tile(fwctl_dev->xe);
> > +	struct fwctl_rpc_xe_pcode *rpc = in;
> > +	int err;
> > +
> > +	if (in_len != sizeof(struct fwctl_rpc_xe_pcode) ||
> > +	    *out_len != sizeof(struct fwctl_rpc_xe_pcode))
> > +		return ERR_PTR(-EMSGSIZE);
> > +
> > +	if (!xe_pcode_fwctl_rpc_validate(rpc, scope))
> > +		return ERR_PTR(-EBADMSG);
> 
> There should be an EPERM here if the scope is not good enough..

EPERM seems better indeed

> > +/**
> > + * struct fwctl_rpc_xe_pcode - FWCTL Remote Procedure Calls for Xe PCODE
> > + */
> > +struct fwctl_rpc_xe_pcode {
> > +	/** @command: The main Mailbox command */
> > +	__u8 command;
> > +	/** @param1: A subcommand or a parameter of the main command */
> > +	__u16 param1;
> > +	/** @param2: A parameter of a subcommand or a subsubcommand */
> > +	__u16 param2;
> > +	/** @data0: The first 32 bits of data. In general data-in as param */
> > +	__u32 data0;
> > +	/** @data1: The other 32 bits of data. In general data-out */
> > +	__u32 data1;
> > +	/** @pad: Padding the uAPI struct - Must be 0. Not sent to firmware */
> > +	__u8 pad[3];
> > +};
> 
> This has implicit padding? Make the padding explicit or use packed..
> > +/**
> > + * DOC: Late Binding Commands
> > + *
> > + * FWCTL info.uctx_caps: FWCTL_XE_PCODE_LATEBINDING
> > + * FWCTL rpc.scope: FWCTL_RPC_CONFIGURATION
> > + *
> > + * Command	0x5C - LATE_BINDING
> > + * Param1	0x0 - GET_CAPABILITY_STATUS
> > + * Param2	0
> > + * Data in	None
> > + * Data out:
> > + *
> > + *  - Bit0: ate binding for V1 Fan Tables is supported.
> 
> "ate" is a typo?

yes, for 'Late'

> 
> This seems fine, though very simple in what it can do. Do you imagine
> more commands down the road?

Indeed, and this first RFC was to test the waters and the overall acceptance,
reception. PCODE has more mailboxes that we might start using FWCTL more
instead of growing per platform spread sysfs files.

One last thing since I have your attention here. Was any time in the previous
fwctl discussions talked about the possibility of some extra usages for like
FW flashing or in-field-repair/tests where big data needs to filled bypassing
lockdown mode?

> 
> Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 3/9] drm/xe/late_bind_fw: Introducing late_bind_fw
  2025-04-29 16:09 ` [RFC 3/9] drm/xe/late_bind_fw: Introducing late_bind_fw Badal Nilawar
@ 2025-05-07 21:38   ` Daniele Ceraolo Spurio
  2025-06-03 13:52     ` Nilawar, Badal
  0 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-05-07 21:38 UTC (permalink / raw)
  To: Badal Nilawar, intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh



On 4/29/2025 9:09 AM, Badal Nilawar wrote:
> Introducing late_bind_fw to enable firmware loading for the devices,
> such as the fan controller and voltage regulator, during the driver probe.
> Typically, firmware for these devices are part of IFWI flash image but
> can be replaced at probe after OEM tuning.

This description does not fully match what's happening in the patch, as 
the main thing happening is the addition of the mei component.

>
> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
> ---
>   drivers/gpu/drm/xe/Kconfig                 |   1 +
>   drivers/gpu/drm/xe/Makefile                |   1 +
>   drivers/gpu/drm/xe/xe_device.c             |   3 +
>   drivers/gpu/drm/xe/xe_device_types.h       |   4 +
>   drivers/gpu/drm/xe/xe_late_bind_fw.c       | 104 +++++++++++++++++++++
>   drivers/gpu/drm/xe/xe_late_bind_fw.h       |  16 ++++
>   drivers/gpu/drm/xe/xe_late_bind_fw_types.h |  95 +++++++++++++++++++
>   7 files changed, 224 insertions(+)
>   create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw.c
>   create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw.h
>   create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>
> diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig
> index 9bce047901b2..a8cc1876a24f 100644
> --- a/drivers/gpu/drm/xe/Kconfig
> +++ b/drivers/gpu/drm/xe/Kconfig
> @@ -44,6 +44,7 @@ config DRM_XE
>   	select WANT_DEV_COREDUMP
>   	select AUXILIARY_BUS
>   	select HMM_MIRROR
> +	select INTEL_MEI_LATE_BIND

I'm not sure this is enough to guarantee that late bind will work. This 
selects the component, but the MEI_GSC child driver also needs to be 
built for the component to bind into it on dGPU. We can't select 
INTEL_MEI_GSC from here because that depends on the graphics driver, so 
we'd go circular. For other components (PXP, HDCP, SW proxy) what we did 
was notify the distros that they needed to enable the new config for the 
feature to work instead of selecting it from the Kconfig.

>   	help
>   	  Experimental driver for Intel Xe series GPUs
>   
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index c5d6681645ed..6de291a21965 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -76,6 +76,7 @@ xe-y += xe_bb.o \
>   	xe_hw_fence.o \
>   	xe_irq.o \
>   	xe_lrc.o \
> +	xe_late_bind_fw.o \
>   	xe_migrate.o \
>   	xe_mmio.o \
>   	xe_mocs.o \
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 75e753e0a682..86a7b7065122 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -42,6 +42,7 @@
>   #include "xe_hw_engine_group.h"
>   #include "xe_hwmon.h"
>   #include "xe_irq.h"
> +#include "xe_late_bind_fw.h"
>   #include "xe_memirq.h"
>   #include "xe_mmio.h"
>   #include "xe_module.h"
> @@ -889,6 +890,8 @@ int xe_device_probe(struct xe_device *xe)
>   	if (err)
>   		return err;
>   
> +	xe_late_bind_init(&xe->late_bind);
> +
>   	err = xe_oa_init(xe);
>   	if (err)
>   		return err;
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 495bc00ebed4..57b63cc9b8ac 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -16,6 +16,7 @@
>   #include "xe_devcoredump_types.h"
>   #include "xe_heci_gsc.h"
>   #include "xe_lmtt_types.h"
> +#include "xe_late_bind_fw_types.h"
>   #include "xe_memirq_types.h"
>   #include "xe_oa_types.h"
>   #include "xe_platform_types.h"
> @@ -543,6 +544,9 @@ struct xe_device {
>   	/** @heci_gsc: graphics security controller */
>   	struct xe_heci_gsc heci_gsc;
>   
> +	/** @late_bind: xe mei late bind interface */
> +	struct xe_late_bind late_bind;
> +
>   	/** @oa: oa observation subsystem */
>   	struct xe_oa oa;
>   
> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c b/drivers/gpu/drm/xe/xe_late_bind_fw.c
> new file mode 100644
> index 000000000000..7981fc500a78
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
> @@ -0,0 +1,104 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2024 Intel Corporation

2025?

> + */
> +
> +#include <linux/component.h>
> +#include <linux/delay.h>
> +
> +#include <drm/drm_managed.h>
> +#include <drm/intel/i915_component.h>
> +#include <drm/intel/xe_late_bind_mei_interface.h>
> +#include <drm/drm_print.h>
> +
> +#include "xe_device.h"
> +#include "xe_late_bind_fw.h"
> +
> +static struct xe_device *
> +late_bind_to_xe(struct xe_late_bind *late_bind)
> +{
> +	return container_of(late_bind, struct xe_device, late_bind);
> +}
> +
> +static int xe_late_bind_component_bind(struct device *xe_kdev,
> +				       struct device *mei_kdev, void *data)
> +{
> +	struct xe_device *xe = kdev_to_xe_device(xe_kdev);
> +	struct xe_late_bind *late_bind = &xe->late_bind;
> +	struct xe_late_bind_component *component;
> +
> +	component = drmm_kzalloc(&xe->drm, sizeof(*component), GFP_KERNEL);

The component is unbound and re-bound on every suspend/resume, so if you 
do allocs in the bind function without freeing them in the unbind you'll 
keep the old allocations around. Why do you need this to be dynamically 
allocated to begin with?

> +
> +	mutex_lock(&late_bind->mutex);
> +	component->mei_dev = mei_kdev;
> +	component->ops = data;
> +	mutex_unlock(&late_bind->mutex);

This is a local variable right now, so locking around it doesn't do 
anything.

> +
> +	late_bind->component = component;

This assignment instead you might want to protect.

> +
> +	return 0;
> +}
> +
> +static void xe_late_bind_component_unbind(struct device *xe_kdev,
> +					  struct device *mei_kdev, void *data)
> +{
> +	struct xe_device *xe = kdev_to_xe_device(xe_kdev);
> +	struct xe_late_bind *late_bind = &xe->late_bind;
> +
> +	mutex_lock(&late_bind->mutex);
> +	late_bind->component = NULL;
> +	mutex_unlock(&late_bind->mutex);
> +}
> +
> +static const struct component_ops xe_late_bind_component_ops = {
> +	.bind   = xe_late_bind_component_bind,
> +	.unbind = xe_late_bind_component_unbind,
> +};
> +
> +/**
> + * xe_late_bind_init() - add xe mei late binding component
> + *
> + * Return: 0 if the initialization was successful, a negative errno otherwise.
> + */
> +int xe_late_bind_init(struct xe_late_bind *late_bind)
> +{
> +	struct xe_device *xe = late_bind_to_xe(late_bind);
> +	int err;
> +
> +	if (xe->info.platform != XE_BATTLEMAGE)
> +		return 0;
> +
> +	mutex_init(&late_bind->mutex);
> +
> +	if (!IS_ENABLED(CONFIG_INTEL_MEI_LATE_BIND)) {

also need INTEL_MEI_GSC for BMG as mentioned above

> +		drm_info(&xe->drm, "Can't init xe mei late bind missing mei component\n");
> +		return -ENODEV;
> +	}
> +
> +	err = component_add_typed(xe->drm.dev, &xe_late_bind_component_ops,
> +				  I915_COMPONENT_LATE_BIND);
> +	if (err < 0) {
> +		drm_info(&xe->drm, "Failed to add mei late bind component (%pe)\n", ERR_PTR(err));
> +		return err;
> +	}
> +
> +	late_bind->component_added = true;
> +
> +	/* the component must be removed before unload, so can't use drmm for cleanup */

this has now changed (see 8e1ddfada453 ("drivers: base: devres: Allow to 
release group on device release") ), so you can use a devm action here.

> +
> +	return 0;
> +}
> +
> +/**
> + * xe_late_bind_remove() - remove the xe mei late binding component
> + */
> +void xe_late_bind_remove(struct xe_late_bind *late_bind)
> +{
> +	struct xe_device *xe = late_bind_to_xe(late_bind);
> +
> +	if (!late_bind->component_added)
> +		return;
> +
> +	component_del(xe->drm.dev, &xe_late_bind_component_ops);
> +	late_bind->component_added = false;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.h b/drivers/gpu/drm/xe/xe_late_bind_fw.h
> new file mode 100644
> index 000000000000..21299de54b47
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.h
> @@ -0,0 +1,16 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_LATE_BIND_FW_H_
> +#define _XE_LATE_BIND_FW_H_
> +
> +#include <linux/types.h>
> +
> +struct xe_late_bind;
> +
> +int xe_late_bind_init(struct xe_late_bind *late_bind);
> +void xe_late_bind_remove(struct xe_late_bind *late_bind);
> +
> +#endif
> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw_types.h b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
> new file mode 100644
> index 000000000000..ea11061ce556
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
> @@ -0,0 +1,95 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_MEI_LATE_BIND_TYPES_H_
> +#define _XE_MEI_LATE_BIND_TYPES_H_
> +
> +#include <linux/iosys-map.h>
> +#include <linux/mutex.h>
> +#include <linux/spinlock.h>
> +#include <linux/types.h>
> +#include <linux/workqueue.h>
> +
> +#define MAX_PAYLOAD_SIZE (1024 * 4)
> +
> +struct xe_bo;
> +struct xe_late_bind_component;
> +
> +/**
> + * xe_late_bind_fw_type - enum to determine late binding fw type
> + */
> +enum xe_late_bind_type {
> +	CSC_LATE_BINDING_TYPE_FAN_CONTROL = 1,
> +	CSC_LATE_BINDING_TYPE_VOLTAGE_REGULATOR
> +};
> +
> +/**
> + * Late Binding flags
> + */
> +enum csc_late_binding_flags {
> +	/** Persistent across warm reset */
> +	CSC_LATE_BINDING_FLAGS_IS_PERSISTENT = 0x1
> +};
> +
> +/**
> + * xe_late_bind_fw_id - enum to determine late binding fw index
> + */
> +enum xe_late_bind_fw_id {
> +	FAN_CONTROL_ID = 0,
> +	VOLTAGE_REGULATOR_ID,
> +	MAX_ID
> +};
> +
> +/**
> + * struct xe_late_bind_fw
> + */
> +struct xe_late_bind_fw {
> +	/** @late_bind_fw.valid */
> +	bool valid;
> +
> +	/** @late_bind_fw.id */
> +	u32 id;
> +
> +	/** @late_bind_fw.blob_path: late binding fw blob path */
> +	char blob_path[PATH_MAX];
> +
> +	/** @late_bind_fw.type */
> +	u32  type;
> +
> +	/** @late_bind_fw.type */
> +	u32  flags;
> +
> +	/** @late_bind_fw.payload: to store the late binding blob */
> +	u8  payload[MAX_PAYLOAD_SIZE];
> +
> +	/** @late_bind_fw.payload_size: late binding blob payload_size */
> +	size_t payload_size;
> +
> +	/** @late_bind_fw.work: worker to upload latebind blob */
> +	struct work_struct work;
> +};
> +
> +/**
> + * struct xe_late_bind
> + */
> +struct xe_late_bind {
> +	/** @late_bind.component: struct for communication with mei component */
> +	struct xe_late_bind_component *component;
> +
> +	/** @late_bind.component_added: whether the component has been added */
> +	bool component_added;
> +
> +	/** @late_bind.wq: workqueue to submit request to download late bind blob */
> +	struct workqueue_struct *wq;
> +
> +	/** @late_bind.mutex: protects the component binding and usage */
> +	struct mutex mutex;
> +
> +	/** @late_bind.late_bind_fw: late binding firmwares */
> +	struct xe_late_bind_fw late_bind_fw[MAX_ID];
> +
> +};
> +

A lot of the variables/enums in this file are unused in this patch. Can 
you move the additions to the patch in which they're actually used?

Daniele

> +#endif


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl
  2025-05-07 19:49     ` Rodrigo Vivi
@ 2025-05-07 22:04       ` Jason Gunthorpe
  2025-08-22 19:33         ` Rodrigo Vivi
  2025-06-06 13:47       ` Nilawar, Badal
  1 sibling, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2025-05-07 22:04 UTC (permalink / raw)
  To: Rodrigo Vivi
  Cc: Badal Nilawar, intel-xe, dri-devel, anshuman.gupta,
	alexander.usyskin, gregkh, daniele.ceraolospurio

On Wed, May 07, 2025 at 03:49:15PM -0400, Rodrigo Vivi wrote:

> One last thing since I have your attention here. Was any time in the previous
> fwctl discussions talked about the possibility of some extra usages for like
> FW flashing or in-field-repair/tests where big data needs to filled bypassing
> lockdown mode?

For FW flash I do suggest you try to use devlink's firmware flashing
interface. I think it would be really great if that could become a
cross-subsystem standard in Linux.

If that isn't going to work out then yes I would say fwctl should be
considered for flashing.

Saeed's original version had a "big data" memory pinned DMA capable
interface as well. With justification it could come into the fwctl
version. I'm not against it, but mindful that it widens what is
possible by quite a bit.

But you might not need something like that just for flash. Some
internal improvement kernel side to allow streaming from large
user-space RPC buffers instead of a single alloc and copy would be
sufficient..

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 2/9] mei: late_bind: add late binding component driver
  2025-04-29 16:09 ` [RFC 2/9] mei: late_bind: add late binding component driver Badal Nilawar
@ 2025-05-07 22:42   ` Daniele Ceraolo Spurio
  2025-05-08  5:41     ` Usyskin, Alexander
  0 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-05-07 22:42 UTC (permalink / raw)
  To: Badal Nilawar, intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh

<snip>
> diff --git a/include/drm/intel/xe_late_bind_mei_interface.h b/include/drm/intel/xe_late_bind_mei_interface.h
> new file mode 100644
> index 000000000000..4005c4c6184f
> --- /dev/null
> +++ b/include/drm/intel/xe_late_bind_mei_interface.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright (c) 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_LATE_BIND_MEI_INTERFACE_H_
> +#define _XE_LATE_BIND_MEI_INTERFACE_H_
> +
> +#include <linux/types.h>
> +
> +struct device;
> +struct module;
> +
> +/**
> + * struct xe_late_bind_component_ops - ops for Late Binding services.
> + * @owner: Module providing the ops
> + * @push_config: Sends a config to FW.
> + */
> +struct xe_late_bind_component_ops {
> +	struct module *owner;
> +
> +	/**
> +	 * @push_config: Sends a config to FW.
> +	 * @dev: device struct corresponding to the mei device
> +	 * @type: payload type
> +	 * @flags: payload flags
> +	 * @payload: payload buffer
> +	 * @payload_size: payload buffer size
> +	 *
> +	 * Return: 0 success, negative errno value on transport failure,
> +	 *         positive status returned by FW
> +	 */
> +	int (*push_config)(struct device *dev, u32 type, u32 flags,
> +			   const void *payload, size_t payload_size);
> +};
> +
> +/**
> + * struct xe_late_bind_component - Late Binding services component
> + * @mei_dev: device that provide Late Binding service.
> + * @ops: Ops implemented by Late Binding driver, used by Xe driver.
> + *
> + * Communication between Xe and MEI drivers for Late Binding services
> + */
> +struct xe_late_bind_component {
> +	struct device *mei_dev;
> +	const struct xe_late_bind_component_ops *ops;
> +};

Does this structure actually need to be defined here? Differently from 
other components, for this component we're only passing the 
xe_late_bind_component_ops via the component_bind_all call, so in the Xe 
driver we should be free to use wherever type we want to store this info.

Daniele

> +
> +#endif /* _XE_LATE_BIND_MEI_INTERFACE_H_ */


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 4/9] drm/xe/xe_late_bind_fw: Initialize late binding firmware
  2025-04-29 16:09 ` [RFC 4/9] drm/xe/xe_late_bind_fw: Initialize late binding firmware Badal Nilawar
@ 2025-05-07 23:11   ` Daniele Ceraolo Spurio
  2025-06-03 13:57     ` Nilawar, Badal
  0 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-05-07 23:11 UTC (permalink / raw)
  To: Badal Nilawar, intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh



On 4/29/2025 9:09 AM, Badal Nilawar wrote:
> Search for late binding firmware binaries and populate the meta data of
> firmware structures.
>
> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_device.c       |   2 +
>   drivers/gpu/drm/xe/xe_late_bind_fw.c | 101 ++++++++++++++++++++++++++-
>   drivers/gpu/drm/xe/xe_late_bind_fw.h |   1 +
>   3 files changed, 101 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 86a7b7065122..d83864e7189c 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -892,6 +892,8 @@ int xe_device_probe(struct xe_device *xe)
>   
>   	xe_late_bind_init(&xe->late_bind);
>   
> +	xe_late_bind_fw_init(&xe->late_bind);

Maybe call this from inside xe_late_bind_init?

> +
>   	err = xe_oa_init(xe);
>   	if (err)
>   		return err;
> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c b/drivers/gpu/drm/xe/xe_late_bind_fw.c
> index 7981fc500a78..297238fd3d16 100644
> --- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
> @@ -5,6 +5,7 @@
>   
>   #include <linux/component.h>
>   #include <linux/delay.h>
> +#include <linux/firmware.h>
>   
>   #include <drm/drm_managed.h>
>   #include <drm/intel/i915_component.h>
> @@ -13,13 +14,108 @@
>   
>   #include "xe_device.h"
>   #include "xe_late_bind_fw.h"
> +#include "xe_pcode.h"
> +#include "xe_pcode_api.h"
>   
> -static struct xe_device *
> -late_bind_to_xe(struct xe_late_bind *late_bind)
> +static const char * const fw_id_to_name[] = {
> +		[FAN_CONTROL_ID] = "fan_control",
> +		[VOLTAGE_REGULATOR_ID] = "voltage_regulator",
> +	};
> +
> +static const u32 fw_id_to_type[] = {
> +		[FAN_CONTROL_ID] = CSC_LATE_BINDING_TYPE_FAN_CONTROL,
> +		[VOLTAGE_REGULATOR_ID] = CSC_LATE_BINDING_TYPE_VOLTAGE_REGULATOR
> +	};
> +
> +static struct xe_device *late_bind_to_xe(struct xe_late_bind *late_bind)
>   {
>   	return container_of(late_bind, struct xe_device, late_bind);
>   }
>   
> +static int late_bind_fw_num_fans(struct xe_late_bind *late_bind)
> +{
> +	struct xe_device *xe = late_bind_to_xe(late_bind);
> +	struct xe_tile *root_tile = xe_device_get_root_tile(xe);
> +	u32 uval;
> +
> +	if (!xe_pcode_read(root_tile,
> +			   PCODE_MBOX(FAN_SPEED_CONTROL, FSC_READ_NUM_FANS, 0), &uval, NULL))
> +		return uval;
> +	else
> +		return 0;
> +}
> +
> +static int late_bind_fw_init(struct xe_late_bind *late_bind, u32 id)
> +{
> +	struct xe_device *xe = late_bind_to_xe(late_bind);
> +	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> +	struct xe_late_bind_fw *lb_fw;
> +	const struct firmware *fw;
> +	u32 num_fans;
> +	int ret;
> +
> +	if (!late_bind->component_added)
> +		return 0;
> +
> +	if (id >= MAX_ID)
> +		return -EINVAL;
> +
> +	lb_fw = &late_bind->late_bind_fw[id];
> +
> +	lb_fw->id = id;
> +	lb_fw->type = fw_id_to_type[id];
> +
> +	if (lb_fw->type == CSC_LATE_BINDING_TYPE_FAN_CONTROL) {
> +		num_fans = late_bind_fw_num_fans(late_bind);
> +		drm_dbg(&xe->drm, "Number of Fans: %d\n", num_fans);
> +		if (!num_fans)
> +			return 0;
> +	}
> +
> +	lb_fw->flags = CSC_LATE_BINDING_FLAGS_IS_PERSISTENT;
> +
> +	snprintf(lb_fw->blob_path, sizeof(lb_fw->blob_path), "xe/%s_8086_%04x_%04x_%04x.bin",
> +		 fw_id_to_name[id], pdev->device,
> +		 pdev->subsystem_vendor, pdev->subsystem_device);
> +
> +	drm_dbg(&xe->drm, "Request late binding firmware %s\n", lb_fw->blob_path);
> +	ret = request_firmware(&fw, lb_fw->blob_path, xe->drm.dev);

Are we expecting late binding firmwares for all possible cards to always 
be available? because if not (and therefore if this fetch can fail) we 
should change this to firmware_request_nowarn to avoid throwing errors

> +	if (ret) {
> +		drm_err(&xe->drm, "Failed to request %s\n", lb_fw->blob_path);

Same as above, if not finding the blob is a valid scenario then this 
should be a drm_dbg. Maybe even reword to make it clear it's not a 
failure but just the fact that there is no FW for the card.

> +		lb_fw->valid = false;
> +		return 0;
> +	}
> +
> +	if (fw->size > MAX_PAYLOAD_SIZE)
> +		lb_fw->payload_size = MAX_PAYLOAD_SIZE;

Is this safe? It feels weird to send a truncated firmware for something 
like voltage regulators. IMO if the firmware is too big we should throw 
and error and bail out.

> +	else
> +		lb_fw->payload_size = fw->size;
> +
> +	memcpy(lb_fw->payload, fw->data, lb_fw->payload_size);
> +	release_firmware(fw);
> +	lb_fw->valid = true;
> +
> +	return 0;
> +}
> +
> +/**
> + * xe_mei_late_bind_fw_init() - Initialize late bind firmware
> + *
> + * Return: 0 if the initialization was successful, a negative errno otherwise.
> + */
> +int xe_late_bind_fw_init(struct xe_late_bind *late_bind)
> +{
> +	int id;
> +	int ret;
> +
> +	for (id = 0; id < MAX_ID; id++) {
> +		ret = late_bind_fw_init(late_bind, id);
> +		if (ret)
> +			return ret;
> +	}
> +	return ret;
> +}
> +
>   static int xe_late_bind_component_bind(struct device *xe_kdev,
>   				       struct device *mei_kdev, void *data)
>   {
> @@ -83,7 +179,6 @@ int xe_late_bind_init(struct xe_late_bind *late_bind)
>   	}
>   
>   	late_bind->component_added = true;
> -

stray blank line removal

Daniele

>   	/* the component must be removed before unload, so can't use drmm for cleanup */
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.h b/drivers/gpu/drm/xe/xe_late_bind_fw.h
> index 21299de54b47..e88c637b15a6 100644
> --- a/drivers/gpu/drm/xe/xe_late_bind_fw.h
> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.h
> @@ -12,5 +12,6 @@ struct xe_late_bind;
>   
>   int xe_late_bind_init(struct xe_late_bind *late_bind);
>   void xe_late_bind_remove(struct xe_late_bind *late_bind);
> +int xe_late_bind_fw_init(struct xe_late_bind *late_bind);
>   
>   #endif


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 5/9] drm/xe/xe_late_bind_fw: Load late binding firmware
  2025-04-29 16:09 ` [RFC 5/9] drm/xe/xe_late_bind_fw: Load " Badal Nilawar
@ 2025-05-07 23:44   ` Daniele Ceraolo Spurio
  2025-06-04  5:36     ` Nilawar, Badal
  0 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-05-07 23:44 UTC (permalink / raw)
  To: Badal Nilawar, intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh



On 4/29/2025 9:09 AM, Badal Nilawar wrote:
> Load late binding firmware
>
> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
> ---
>   drivers/gpu/drm/xe/xe_device.c       |  2 +
>   drivers/gpu/drm/xe/xe_late_bind_fw.c | 91 +++++++++++++++++++++++++++-
>   drivers/gpu/drm/xe/xe_late_bind_fw.h |  1 +
>   3 files changed, 92 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index d83864e7189c..30a416323b37 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -894,6 +894,8 @@ int xe_device_probe(struct xe_device *xe)
>   
>   	xe_late_bind_fw_init(&xe->late_bind);
>   
> +	xe_late_bind_fw_load(&xe->late_bind);
> +

Why does this need to be a separated call from xe_late_bind_fw_init?

>   	err = xe_oa_init(xe);
>   	if (err)
>   		return err;
> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c b/drivers/gpu/drm/xe/xe_late_bind_fw.c
> index 297238fd3d16..7d2bc959027d 100644
> --- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
> @@ -16,6 +16,16 @@
>   #include "xe_late_bind_fw.h"
>   #include "xe_pcode.h"
>   #include "xe_pcode_api.h"
> +#include "xe_pm.h"
> +
> +/*
> + * The component should load quite quickly in most cases, but it could take
> + * a bit. Using a very big timeout just to cover the worst case scenario
> + */
> +#define LB_INIT_TIMEOUT_MS 20000
> +
> +#define LB_FW_LOAD_RETRY_MAXCOUNT 40
> +#define LB_FW_LOAD_RETRY_PAUSE_MS 50
>   
>   static const char * const fw_id_to_name[] = {
>   		[FAN_CONTROL_ID] = "fan_control",
> @@ -45,6 +55,78 @@ static int late_bind_fw_num_fans(struct xe_late_bind *late_bind)
>   		return 0;
>   }
>   
> +static void late_bind_work(struct work_struct *work)
> +{
> +	struct xe_late_bind_fw *lbfw = container_of(work, struct xe_late_bind_fw, work);
> +	struct xe_late_bind *late_bind = container_of(lbfw, struct xe_late_bind,
> +						      late_bind_fw[lbfw->id]);
> +	struct xe_device *xe = late_bind_to_xe(late_bind);
> +	int retry = LB_FW_LOAD_RETRY_MAXCOUNT;
> +	int ret;
> +	int slept;
> +
> +	if (!late_bind->component_added)
> +		return;
> +
> +	if (!lbfw->valid)
> +		return;
> +
> +	/* we can queue this before the component is bound */
> +	for (slept = 0; slept < LB_INIT_TIMEOUT_MS; slept += 100) {
> +		if (late_bind->component)
> +			break;
> +		msleep(100);
> +	}
> +
> +	xe_pm_runtime_get(xe);
> +	mutex_lock(&late_bind->mutex);

You're locking the mutex here but you're not checking that any of the 
protected data is valid (late_bind->component can be NULL when we exit 
from the above for loop, or it might have changed from valid to NULL in 
between).

> +	drm_dbg(&xe->drm, "Load %s firmware\n", fw_id_to_name[lbfw->id]);
> +
> +	do {
> +		ret = late_bind->component->ops->push_config(late_bind->component->mei_dev,
> +							     lbfw->type, lbfw->flags,
> +							     lbfw->payload, lbfw->payload_size);
> +		if (!ret)
> +			break;
> +		msleep(LB_FW_LOAD_RETRY_PAUSE_MS);
> +	} while (--retry && ret == -EAGAIN);

In which scenario can this call return -EAGAIN ? As far as I can see the 
mei driver only returns that on a non-blocking call, which is not what 
we're doing here. Am I missing a path somewhere?

> +
> +	if (ret)
> +		drm_err(&xe->drm, "Load %s firmware failed with err %d\n",
> +			fw_id_to_name[lbfw->id], ret);
> +	else
> +		drm_dbg(&xe->drm, "Load %s firmware successful\n",
> +			fw_id_to_name[lbfw->id]);
> +
> +	mutex_unlock(&late_bind->mutex);
> +	xe_pm_runtime_put(xe);
> +}
> +
> +int xe_late_bind_fw_load(struct xe_late_bind *late_bind)
> +{
> +	struct xe_device *xe = late_bind_to_xe(late_bind);
> +	struct xe_late_bind_fw *lbfw;
> +	int id;
> +
> +	if (!late_bind->component_added)
> +		return -EINVAL;
> +
> +	for (id = 0; id < MAX_ID; id++) {
> +		lbfw = &late_bind->late_bind_fw[id];
> +		if (lbfw->valid) {
> +			drm_dbg(&xe->drm, "Queue work: to load %s firmware\n",
> +				fw_id_to_name[lbfw->id]);
> +			queue_work(late_bind->wq, &lbfw->work);

Do we need to flush this work before suspend to make sure it has 
completed before the HW goes into D3Hot? Similarly, do we need to flush 
it on driver unload to make sure it's done before we de-allocate stuff?

> +		}
> +	}
> +	return 0;
> +}
> +
> +/**
> + * late_bind_fw_init() - initialize late bind firmware
> + *
> + * Return: 0 if the initialization was successful, a negative errno otherwise.
> + */
>   static int late_bind_fw_init(struct xe_late_bind *late_bind, u32 id)
>   {
>   	struct xe_device *xe = late_bind_to_xe(late_bind);
> @@ -93,6 +175,7 @@ static int late_bind_fw_init(struct xe_late_bind *late_bind, u32 id)
>   
>   	memcpy(lb_fw->payload, fw->data, lb_fw->payload_size);
>   	release_firmware(fw);
> +	INIT_WORK(&lb_fw->work, late_bind_work);
>   	lb_fw->valid = true;
>   
>   	return 0;
> @@ -108,12 +191,17 @@ int xe_late_bind_fw_init(struct xe_late_bind *late_bind)
>   	int id;
>   	int ret;
>   
> +	late_bind->wq = create_singlethread_workqueue("late-bind-ordered-wq");

Where is this WQ destroyed? Also, I think that using 
alloc_ordered_workqueue would be preferred.

Daniele

> +	if (!late_bind->wq)
> +		return -ENOMEM;
> +
>   	for (id = 0; id < MAX_ID; id++) {
>   		ret = late_bind_fw_init(late_bind, id);
>   		if (ret)
>   			return ret;
>   	}
> -	return ret;
> +
> +	return 0;
>   }
>   
>   static int xe_late_bind_component_bind(struct device *xe_kdev,
> @@ -179,7 +267,6 @@ int xe_late_bind_init(struct xe_late_bind *late_bind)
>   	}
>   
>   	late_bind->component_added = true;
> -	/* the component must be removed before unload, so can't use drmm for cleanup */
>   
>   	return 0;
>   }
> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.h b/drivers/gpu/drm/xe/xe_late_bind_fw.h
> index e88c637b15a6..edd0e4c0650e 100644
> --- a/drivers/gpu/drm/xe/xe_late_bind_fw.h
> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.h
> @@ -13,5 +13,6 @@ struct xe_late_bind;
>   int xe_late_bind_init(struct xe_late_bind *late_bind);
>   void xe_late_bind_remove(struct xe_late_bind *late_bind);
>   int xe_late_bind_fw_init(struct xe_late_bind *late_bind);
> +int xe_late_bind_fw_load(struct xe_late_bind *late_bind);
>   
>   #endif


^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [RFC 2/9] mei: late_bind: add late binding component driver
  2025-05-07 22:42   ` Daniele Ceraolo Spurio
@ 2025-05-08  5:41     ` Usyskin, Alexander
  2025-06-03 12:01       ` Nilawar, Badal
  0 siblings, 1 reply; 30+ messages in thread
From: Usyskin, Alexander @ 2025-05-08  5:41 UTC (permalink / raw)
  To: Ceraolo Spurio, Daniele, Nilawar, Badal,
	intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
  Cc: Gupta, Anshuman, Vivi, Rodrigo, gregkh@linuxfoundation.org

> 
> <snip>
> > diff --git a/include/drm/intel/xe_late_bind_mei_interface.h
> b/include/drm/intel/xe_late_bind_mei_interface.h
> > new file mode 100644
> > index 000000000000..4005c4c6184f
> > --- /dev/null
> > +++ b/include/drm/intel/xe_late_bind_mei_interface.h
> > @@ -0,0 +1,49 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright (c) 2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_LATE_BIND_MEI_INTERFACE_H_
> > +#define _XE_LATE_BIND_MEI_INTERFACE_H_
> > +
> > +#include <linux/types.h>
> > +
> > +struct device;
> > +struct module;
> > +
> > +/**
> > + * struct xe_late_bind_component_ops - ops for Late Binding services.
> > + * @owner: Module providing the ops
> > + * @push_config: Sends a config to FW.
> > + */
> > +struct xe_late_bind_component_ops {
> > +	struct module *owner;
> > +
> > +	/**
> > +	 * @push_config: Sends a config to FW.
> > +	 * @dev: device struct corresponding to the mei device
> > +	 * @type: payload type
> > +	 * @flags: payload flags
> > +	 * @payload: payload buffer
> > +	 * @payload_size: payload buffer size
> > +	 *
> > +	 * Return: 0 success, negative errno value on transport failure,
> > +	 *         positive status returned by FW
> > +	 */
> > +	int (*push_config)(struct device *dev, u32 type, u32 flags,
> > +			   const void *payload, size_t payload_size);
> > +};
> > +
> > +/**
> > + * struct xe_late_bind_component - Late Binding services component
> > + * @mei_dev: device that provide Late Binding service.
> > + * @ops: Ops implemented by Late Binding driver, used by Xe driver.
> > + *
> > + * Communication between Xe and MEI drivers for Late Binding services
> > + */
> > +struct xe_late_bind_component {
> > +	struct device *mei_dev;
> > +	const struct xe_late_bind_component_ops *ops;
> > +};
> 
> Does this structure actually need to be defined here? Differently from
> other components, for this component we're only passing the
> xe_late_bind_component_ops via the component_bind_all call, so in the Xe
> driver we should be free to use wherever type we want to store this info.
> 
> Daniele
> 

You are right, this struct may be dropped from this header.
Badal, let's move it out into Xe code.

- - 
Thanks,
Sasha


> > +
> > +#endif /* _XE_LATE_BIND_MEI_INTERFACE_H_ */


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 2/9] mei: late_bind: add late binding component driver
  2025-05-08  5:41     ` Usyskin, Alexander
@ 2025-06-03 12:01       ` Nilawar, Badal
  0 siblings, 0 replies; 30+ messages in thread
From: Nilawar, Badal @ 2025-06-03 12:01 UTC (permalink / raw)
  To: Usyskin, Alexander, Ceraolo Spurio, Daniele,
	intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
  Cc: Gupta, Anshuman, Vivi, Rodrigo, gregkh@linuxfoundation.org


On 08-05-2025 11:11, Usyskin, Alexander wrote:
>> <snip>
>>> diff --git a/include/drm/intel/xe_late_bind_mei_interface.h
>> b/include/drm/intel/xe_late_bind_mei_interface.h
>>> new file mode 100644
>>> index 000000000000..4005c4c6184f
>>> --- /dev/null
>>> +++ b/include/drm/intel/xe_late_bind_mei_interface.h
>>> @@ -0,0 +1,49 @@
>>> +/* SPDX-License-Identifier: MIT */
>>> +/*
>>> + * Copyright (c) 2025 Intel Corporation
>>> + */
>>> +
>>> +#ifndef _XE_LATE_BIND_MEI_INTERFACE_H_
>>> +#define _XE_LATE_BIND_MEI_INTERFACE_H_
>>> +
>>> +#include <linux/types.h>
>>> +
>>> +struct device;
>>> +struct module;
>>> +
>>> +/**
>>> + * struct xe_late_bind_component_ops - ops for Late Binding services.
>>> + * @owner: Module providing the ops
>>> + * @push_config: Sends a config to FW.
>>> + */
>>> +struct xe_late_bind_component_ops {
>>> +	struct module *owner;
>>> +
>>> +	/**
>>> +	 * @push_config: Sends a config to FW.
>>> +	 * @dev: device struct corresponding to the mei device
>>> +	 * @type: payload type
>>> +	 * @flags: payload flags
>>> +	 * @payload: payload buffer
>>> +	 * @payload_size: payload buffer size
>>> +	 *
>>> +	 * Return: 0 success, negative errno value on transport failure,
>>> +	 *         positive status returned by FW
>>> +	 */
>>> +	int (*push_config)(struct device *dev, u32 type, u32 flags,
>>> +			   const void *payload, size_t payload_size);
>>> +};
>>> +
>>> +/**
>>> + * struct xe_late_bind_component - Late Binding services component
>>> + * @mei_dev: device that provide Late Binding service.
>>> + * @ops: Ops implemented by Late Binding driver, used by Xe driver.
>>> + *
>>> + * Communication between Xe and MEI drivers for Late Binding services
>>> + */
>>> +struct xe_late_bind_component {
>>> +	struct device *mei_dev;
>>> +	const struct xe_late_bind_component_ops *ops;
>>> +};
>> Does this structure actually need to be defined here? Differently from
>> other components, for this component we're only passing the
>> xe_late_bind_component_ops via the component_bind_all call, so in the Xe
>> driver we should be free to use wherever type we want to store this info.
>>
>> Daniele
>>
> You are right, this struct may be dropped from this header.
> Badal, let's move it out into Xe code.

Sure, I will move this structure to xe code.

Thanks,
Badal

>
> - -
> Thanks,
> Sasha
>
>
>>> +
>>> +#endif /* _XE_LATE_BIND_MEI_INTERFACE_H_ */

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 3/9] drm/xe/late_bind_fw: Introducing late_bind_fw
  2025-05-07 21:38   ` Daniele Ceraolo Spurio
@ 2025-06-03 13:52     ` Nilawar, Badal
  0 siblings, 0 replies; 30+ messages in thread
From: Nilawar, Badal @ 2025-06-03 13:52 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh

[-- Attachment #1: Type: text/plain, Size: 12597 bytes --]


On 08-05-2025 03:08, Daniele Ceraolo Spurio wrote:
>
>
> On 4/29/2025 9:09 AM, Badal Nilawar wrote:
>> Introducing late_bind_fw to enable firmware loading for the devices,
>> such as the fan controller and voltage regulator, during the driver 
>> probe.
>> Typically, firmware for these devices are part of IFWI flash image but
>> can be replaced at probe after OEM tuning.
>
> This description does not fully match what's happening in the patch, 
> as the main thing happening is the addition of the mei component.
Sure I will update the description.
>
>>
>> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
>> ---
>>   drivers/gpu/drm/xe/Kconfig                 |   1 +
>>   drivers/gpu/drm/xe/Makefile                |   1 +
>>   drivers/gpu/drm/xe/xe_device.c             |   3 +
>>   drivers/gpu/drm/xe/xe_device_types.h       |   4 +
>>   drivers/gpu/drm/xe/xe_late_bind_fw.c       | 104 +++++++++++++++++++++
>>   drivers/gpu/drm/xe/xe_late_bind_fw.h       |  16 ++++
>>   drivers/gpu/drm/xe/xe_late_bind_fw_types.h |  95 +++++++++++++++++++
>>   7 files changed, 224 insertions(+)
>>   create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw.c
>>   create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw.h
>>   create mode 100644 drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>>
>> diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig
>> index 9bce047901b2..a8cc1876a24f 100644
>> --- a/drivers/gpu/drm/xe/Kconfig
>> +++ b/drivers/gpu/drm/xe/Kconfig
>> @@ -44,6 +44,7 @@ config DRM_XE
>>       select WANT_DEV_COREDUMP
>>       select AUXILIARY_BUS
>>       select HMM_MIRROR
>> +    select INTEL_MEI_LATE_BIND
>
> I'm not sure this is enough to guarantee that late bind will work. 
> This selects the component, but the MEI_GSC child driver also needs to 
> be built for the component to bind into it on dGPU. We can't select 
> INTEL_MEI_GSC from here because that depends on the graphics driver, 
> so we'd go circular. For other components (PXP, HDCP, SW proxy) what 
> we did was notify the distros that they needed to enable the new 
> config for the feature to work instead of selecting it from the Kconfig.
I will follow the approach used for other components. Is it ok to enable 
this feature for CI in a separate patch labeled "CI_only, do not review"?
>
>>       help
>>         Experimental driver for Intel Xe series GPUs
>>   diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
>> index c5d6681645ed..6de291a21965 100644
>> --- a/drivers/gpu/drm/xe/Makefile
>> +++ b/drivers/gpu/drm/xe/Makefile
>> @@ -76,6 +76,7 @@ xe-y += xe_bb.o \
>>       xe_hw_fence.o \
>>       xe_irq.o \
>>       xe_lrc.o \
>> +    xe_late_bind_fw.o \
>>       xe_migrate.o \
>>       xe_mmio.o \
>>       xe_mocs.o \
>> diff --git a/drivers/gpu/drm/xe/xe_device.c 
>> b/drivers/gpu/drm/xe/xe_device.c
>> index 75e753e0a682..86a7b7065122 100644
>> --- a/drivers/gpu/drm/xe/xe_device.c
>> +++ b/drivers/gpu/drm/xe/xe_device.c
>> @@ -42,6 +42,7 @@
>>   #include "xe_hw_engine_group.h"
>>   #include "xe_hwmon.h"
>>   #include "xe_irq.h"
>> +#include "xe_late_bind_fw.h"
>>   #include "xe_memirq.h"
>>   #include "xe_mmio.h"
>>   #include "xe_module.h"
>> @@ -889,6 +890,8 @@ int xe_device_probe(struct xe_device *xe)
>>       if (err)
>>           return err;
>>   +    xe_late_bind_init(&xe->late_bind);
>> +
>>       err = xe_oa_init(xe);
>>       if (err)
>>           return err;
>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
>> b/drivers/gpu/drm/xe/xe_device_types.h
>> index 495bc00ebed4..57b63cc9b8ac 100644
>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>> @@ -16,6 +16,7 @@
>>   #include "xe_devcoredump_types.h"
>>   #include "xe_heci_gsc.h"
>>   #include "xe_lmtt_types.h"
>> +#include "xe_late_bind_fw_types.h"
>>   #include "xe_memirq_types.h"
>>   #include "xe_oa_types.h"
>>   #include "xe_platform_types.h"
>> @@ -543,6 +544,9 @@ struct xe_device {
>>       /** @heci_gsc: graphics security controller */
>>       struct xe_heci_gsc heci_gsc;
>>   +    /** @late_bind: xe mei late bind interface */
>> +    struct xe_late_bind late_bind;
>> +
>>       /** @oa: oa observation subsystem */
>>       struct xe_oa oa;
>>   diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c 
>> b/drivers/gpu/drm/xe/xe_late_bind_fw.c
>> new file mode 100644
>> index 000000000000..7981fc500a78
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
>> @@ -0,0 +1,104 @@
>> +// SPDX-License-Identifier: MIT
>> +/*
>> + * Copyright © 2024 Intel Corporation
>
> 2025?
Ok
>
>> + */
>> +
>> +#include <linux/component.h>
>> +#include <linux/delay.h>
>> +
>> +#include <drm/drm_managed.h>
>> +#include <drm/intel/i915_component.h>
>> +#include <drm/intel/xe_late_bind_mei_interface.h>
>> +#include <drm/drm_print.h>
>> +
>> +#include "xe_device.h"
>> +#include "xe_late_bind_fw.h"
>> +
>> +static struct xe_device *
>> +late_bind_to_xe(struct xe_late_bind *late_bind)
>> +{
>> +    return container_of(late_bind, struct xe_device, late_bind);
>> +}
>> +
>> +static int xe_late_bind_component_bind(struct device *xe_kdev,
>> +                       struct device *mei_kdev, void *data)
>> +{
>> +    struct xe_device *xe = kdev_to_xe_device(xe_kdev);
>> +    struct xe_late_bind *late_bind = &xe->late_bind;
>> +    struct xe_late_bind_component *component;
>> +
>> +    component = drmm_kzalloc(&xe->drm, sizeof(*component), GFP_KERNEL);
>
> The component is unbound and re-bound on every suspend/resume, so if 
> you do allocs in the bind function without freeing them in the unbind 
> you'll keep the old allocations around. Why do you need this to be 
> dynamically allocated to begin with?
Dynamic allocation is not needed, I will drop the dynamic allocation and 
fix below comments.
>
>> +
>> +    mutex_lock(&late_bind->mutex);
>> +    component->mei_dev = mei_kdev;
>> +    component->ops = data;
>> +    mutex_unlock(&late_bind->mutex);
>
> This is a local variable right now, so locking around it doesn't do 
> anything.
>
>> +
>> +    late_bind->component = component;
>
> This assignment instead you might want to protect.
>
>> +
>> +    return 0;
>> +}
>> +
>> +static void xe_late_bind_component_unbind(struct device *xe_kdev,
>> +                      struct device *mei_kdev, void *data)
>> +{
>> +    struct xe_device *xe = kdev_to_xe_device(xe_kdev);
>> +    struct xe_late_bind *late_bind = &xe->late_bind;
>> +
>> +    mutex_lock(&late_bind->mutex);
>> +    late_bind->component = NULL;
>> +    mutex_unlock(&late_bind->mutex);
>> +}
>> +
>> +static const struct component_ops xe_late_bind_component_ops = {
>> +    .bind   = xe_late_bind_component_bind,
>> +    .unbind = xe_late_bind_component_unbind,
>> +};
>> +
>> +/**
>> + * xe_late_bind_init() - add xe mei late binding component
>> + *
>> + * Return: 0 if the initialization was successful, a negative errno 
>> otherwise.
>> + */
>> +int xe_late_bind_init(struct xe_late_bind *late_bind)
>> +{
>> +    struct xe_device *xe = late_bind_to_xe(late_bind);
>> +    int err;
>> +
>> +    if (xe->info.platform != XE_BATTLEMAGE)
>> +        return 0;
>> +
>> +    mutex_init(&late_bind->mutex);
>> +
>> +    if (!IS_ENABLED(CONFIG_INTEL_MEI_LATE_BIND)) {
>
> also need INTEL_MEI_GSC for BMG as mentioned above
Ok.
>
>> +        drm_info(&xe->drm, "Can't init xe mei late bind missing mei 
>> component\n");
>> +        return -ENODEV;
>> +    }
>> +
>> +    err = component_add_typed(xe->drm.dev, &xe_late_bind_component_ops,
>> +                  I915_COMPONENT_LATE_BIND);
>> +    if (err < 0) {
>> +        drm_info(&xe->drm, "Failed to add mei late bind component 
>> (%pe)\n", ERR_PTR(err));
>> +        return err;
>> +    }
>> +
>> +    late_bind->component_added = true;
>> +
>> +    /* the component must be removed before unload, so can't use 
>> drmm for cleanup */
>
> this has now changed (see 8e1ddfada453 ("drivers: base: devres: Allow 
> to release group on device release") ), so you can use a devm action 
> here.
Ok.
>
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>> + * xe_late_bind_remove() - remove the xe mei late binding component
>> + */
>> +void xe_late_bind_remove(struct xe_late_bind *late_bind)
>> +{
>> +    struct xe_device *xe = late_bind_to_xe(late_bind);
>> +
>> +    if (!late_bind->component_added)
>> +        return;
>> +
>> +    component_del(xe->drm.dev, &xe_late_bind_component_ops);
>> +    late_bind->component_added = false;
>> +}
>> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.h 
>> b/drivers/gpu/drm/xe/xe_late_bind_fw.h
>> new file mode 100644
>> index 000000000000..21299de54b47
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.h
>> @@ -0,0 +1,16 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright © 2025 Intel Corporation
>> + */
>> +
>> +#ifndef _XE_LATE_BIND_FW_H_
>> +#define _XE_LATE_BIND_FW_H_
>> +
>> +#include <linux/types.h>
>> +
>> +struct xe_late_bind;
>> +
>> +int xe_late_bind_init(struct xe_late_bind *late_bind);
>> +void xe_late_bind_remove(struct xe_late_bind *late_bind);
>> +
>> +#endif
>> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw_types.h 
>> b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>> new file mode 100644
>> index 000000000000..ea11061ce556
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw_types.h
>> @@ -0,0 +1,95 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright © 2025 Intel Corporation
>> + */
>> +
>> +#ifndef _XE_MEI_LATE_BIND_TYPES_H_
>> +#define _XE_MEI_LATE_BIND_TYPES_H_
>> +
>> +#include <linux/iosys-map.h>
>> +#include <linux/mutex.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/types.h>
>> +#include <linux/workqueue.h>
>> +
>> +#define MAX_PAYLOAD_SIZE (1024 * 4)
>> +
>> +struct xe_bo;
>> +struct xe_late_bind_component;
>> +
>> +/**
>> + * xe_late_bind_fw_type - enum to determine late binding fw type
>> + */
>> +enum xe_late_bind_type {
>> +    CSC_LATE_BINDING_TYPE_FAN_CONTROL = 1,
>> +    CSC_LATE_BINDING_TYPE_VOLTAGE_REGULATOR
>> +};
>> +
>> +/**
>> + * Late Binding flags
>> + */
>> +enum csc_late_binding_flags {
>> +    /** Persistent across warm reset */
>> +    CSC_LATE_BINDING_FLAGS_IS_PERSISTENT = 0x1
>> +};
>> +
>> +/**
>> + * xe_late_bind_fw_id - enum to determine late binding fw index
>> + */
>> +enum xe_late_bind_fw_id {
>> +    FAN_CONTROL_ID = 0,
>> +    VOLTAGE_REGULATOR_ID,
>> +    MAX_ID
>> +};
>> +
>> +/**
>> + * struct xe_late_bind_fw
>> + */
>> +struct xe_late_bind_fw {
>> +    /** @late_bind_fw.valid */
>> +    bool valid;
>> +
>> +    /** @late_bind_fw.id */
>> +    u32 id;
>> +
>> +    /** @late_bind_fw.blob_path: late binding fw blob path */
>> +    char blob_path[PATH_MAX];
>> +
>> +    /** @late_bind_fw.type */
>> +    u32  type;
>> +
>> +    /** @late_bind_fw.type */
>> +    u32  flags;
>> +
>> +    /** @late_bind_fw.payload: to store the late binding blob */
>> +    u8  payload[MAX_PAYLOAD_SIZE];
>> +
>> +    /** @late_bind_fw.payload_size: late binding blob payload_size */
>> +    size_t payload_size;
>> +
>> +    /** @late_bind_fw.work: worker to upload latebind blob */
>> +    struct work_struct work;
>> +};
>> +
>> +/**
>> + * struct xe_late_bind
>> + */
>> +struct xe_late_bind {
>> +    /** @late_bind.component: struct for communication with mei 
>> component */
>> +    struct xe_late_bind_component *component;
>> +
>> +    /** @late_bind.component_added: whether the component has been 
>> added */
>> +    bool component_added;
>> +
>> +    /** @late_bind.wq: workqueue to submit request to download late 
>> bind blob */
>> +    struct workqueue_struct *wq;
>> +
>> +    /** @late_bind.mutex: protects the component binding and usage */
>> +    struct mutex mutex;
>> +
>> +    /** @late_bind.late_bind_fw: late binding firmwares */
>> +    struct xe_late_bind_fw late_bind_fw[MAX_ID];
>> +
>> +};
>> +
>
> A lot of the variables/enums in this file are unused in this patch. 
> Can you move the additions to the patch in which they're actually used?

Sure, I will add variables/enums to this structure incrementally in 
subsequent patches.

Thanks,
Badal

>
> Daniele
>
>> +#endif
>

[-- Attachment #2: Type: text/html, Size: 23723 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 4/9] drm/xe/xe_late_bind_fw: Initialize late binding firmware
  2025-05-07 23:11   ` Daniele Ceraolo Spurio
@ 2025-06-03 13:57     ` Nilawar, Badal
  0 siblings, 0 replies; 30+ messages in thread
From: Nilawar, Badal @ 2025-06-03 13:57 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh


On 08-05-2025 04:41, Daniele Ceraolo Spurio wrote:
>
>
> On 4/29/2025 9:09 AM, Badal Nilawar wrote:
>> Search for late binding firmware binaries and populate the meta data of
>> firmware structures.
>>
>> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_device.c       |   2 +
>>   drivers/gpu/drm/xe/xe_late_bind_fw.c | 101 ++++++++++++++++++++++++++-
>>   drivers/gpu/drm/xe/xe_late_bind_fw.h |   1 +
>>   3 files changed, 101 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_device.c 
>> b/drivers/gpu/drm/xe/xe_device.c
>> index 86a7b7065122..d83864e7189c 100644
>> --- a/drivers/gpu/drm/xe/xe_device.c
>> +++ b/drivers/gpu/drm/xe/xe_device.c
>> @@ -892,6 +892,8 @@ int xe_device_probe(struct xe_device *xe)
>>         xe_late_bind_init(&xe->late_bind);
>>   +    xe_late_bind_fw_init(&xe->late_bind);
>
> Maybe call this from inside xe_late_bind_init?
Sure.
>
>> +
>>       err = xe_oa_init(xe);
>>       if (err)
>>           return err;
>> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c 
>> b/drivers/gpu/drm/xe/xe_late_bind_fw.c
>> index 7981fc500a78..297238fd3d16 100644
>> --- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
>> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
>> @@ -5,6 +5,7 @@
>>     #include <linux/component.h>
>>   #include <linux/delay.h>
>> +#include <linux/firmware.h>
>>     #include <drm/drm_managed.h>
>>   #include <drm/intel/i915_component.h>
>> @@ -13,13 +14,108 @@
>>     #include "xe_device.h"
>>   #include "xe_late_bind_fw.h"
>> +#include "xe_pcode.h"
>> +#include "xe_pcode_api.h"
>>   -static struct xe_device *
>> -late_bind_to_xe(struct xe_late_bind *late_bind)
>> +static const char * const fw_id_to_name[] = {
>> +        [FAN_CONTROL_ID] = "fan_control",
>> +        [VOLTAGE_REGULATOR_ID] = "voltage_regulator",
>> +    };
>> +
>> +static const u32 fw_id_to_type[] = {
>> +        [FAN_CONTROL_ID] = CSC_LATE_BINDING_TYPE_FAN_CONTROL,
>> +        [VOLTAGE_REGULATOR_ID] = 
>> CSC_LATE_BINDING_TYPE_VOLTAGE_REGULATOR
>> +    };
>> +
>> +static struct xe_device *late_bind_to_xe(struct xe_late_bind 
>> *late_bind)
>>   {
>>       return container_of(late_bind, struct xe_device, late_bind);
>>   }
>>   +static int late_bind_fw_num_fans(struct xe_late_bind *late_bind)
>> +{
>> +    struct xe_device *xe = late_bind_to_xe(late_bind);
>> +    struct xe_tile *root_tile = xe_device_get_root_tile(xe);
>> +    u32 uval;
>> +
>> +    if (!xe_pcode_read(root_tile,
>> +               PCODE_MBOX(FAN_SPEED_CONTROL, FSC_READ_NUM_FANS, 0), 
>> &uval, NULL))
>> +        return uval;
>> +    else
>> +        return 0;
>> +}
>> +
>> +static int late_bind_fw_init(struct xe_late_bind *late_bind, u32 id)
>> +{
>> +    struct xe_device *xe = late_bind_to_xe(late_bind);
>> +    struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
>> +    struct xe_late_bind_fw *lb_fw;
>> +    const struct firmware *fw;
>> +    u32 num_fans;
>> +    int ret;
>> +
>> +    if (!late_bind->component_added)
>> +        return 0;
>> +
>> +    if (id >= MAX_ID)
>> +        return -EINVAL;
>> +
>> +    lb_fw = &late_bind->late_bind_fw[id];
>> +
>> +    lb_fw->id = id;
>> +    lb_fw->type = fw_id_to_type[id];
>> +
>> +    if (lb_fw->type == CSC_LATE_BINDING_TYPE_FAN_CONTROL) {
>> +        num_fans = late_bind_fw_num_fans(late_bind);
>> +        drm_dbg(&xe->drm, "Number of Fans: %d\n", num_fans);
>> +        if (!num_fans)
>> +            return 0;
>> +    }
>> +
>> +    lb_fw->flags = CSC_LATE_BINDING_FLAGS_IS_PERSISTENT;
>> +
>> +    snprintf(lb_fw->blob_path, sizeof(lb_fw->blob_path), 
>> "xe/%s_8086_%04x_%04x_%04x.bin",
>> +         fw_id_to_name[id], pdev->device,
>> +         pdev->subsystem_vendor, pdev->subsystem_device);
>> +
>> +    drm_dbg(&xe->drm, "Request late binding firmware %s\n", 
>> lb_fw->blob_path);
>> +    ret = request_firmware(&fw, lb_fw->blob_path, xe->drm.dev);
>
> Are we expecting late binding firmwares for all possible cards to 
> always be available? because if not (and therefore if this fetch can 
> fail) we should change this to firmware_request_nowarn to avoid 
> throwing errors
No late binding firmware will not be available for all card, as 
suggested firmware_request_nowarn here.
>
>
>> +    if (ret) {
>> +        drm_err(&xe->drm, "Failed to request %s\n", lb_fw->blob_path);
>
> Same as above, if not finding the blob is a valid scenario then this 
> should be a drm_dbg. Maybe even reword to make it clear it's not a 
> failure but just the fact that there is no FW for the card.
I will change drm_err to drm_dbg.
>
>> +        lb_fw->valid = false;
>> +        return 0;
>> +    }
>> +
>> +    if (fw->size > MAX_PAYLOAD_SIZE)
>> +        lb_fw->payload_size = MAX_PAYLOAD_SIZE;
>
> Is this safe? It feels weird to send a truncated firmware for 
> something like voltage regulators. IMO if the firmware is too big we 
> should throw and error and bail out.
Sure, let's throw the error if firmware is too big.
>
>> +    else
>> +        lb_fw->payload_size = fw->size;
>> +
>> +    memcpy(lb_fw->payload, fw->data, lb_fw->payload_size);
>> +    release_firmware(fw);
>> +    lb_fw->valid = true;
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>> + * xe_mei_late_bind_fw_init() - Initialize late bind firmware
>> + *
>> + * Return: 0 if the initialization was successful, a negative errno 
>> otherwise.
>> + */
>> +int xe_late_bind_fw_init(struct xe_late_bind *late_bind)
>> +{
>> +    int id;
>> +    int ret;
>> +
>> +    for (id = 0; id < MAX_ID; id++) {
>> +        ret = late_bind_fw_init(late_bind, id);
>> +        if (ret)
>> +            return ret;
>> +    }
>> +    return ret;
>> +}
>> +
>>   static int xe_late_bind_component_bind(struct device *xe_kdev,
>>                          struct device *mei_kdev, void *data)
>>   {
>> @@ -83,7 +179,6 @@ int xe_late_bind_init(struct xe_late_bind *late_bind)
>>       }
>>         late_bind->component_added = true;
>> -
>
> stray blank line removal

Ok.

Thanks,
Badal

>
> Daniele
>
>>       /* the component must be removed before unload, so can't use 
>> drmm for cleanup */
>>         return 0;
>> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.h 
>> b/drivers/gpu/drm/xe/xe_late_bind_fw.h
>> index 21299de54b47..e88c637b15a6 100644
>> --- a/drivers/gpu/drm/xe/xe_late_bind_fw.h
>> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.h
>> @@ -12,5 +12,6 @@ struct xe_late_bind;
>>     int xe_late_bind_init(struct xe_late_bind *late_bind);
>>   void xe_late_bind_remove(struct xe_late_bind *late_bind);
>> +int xe_late_bind_fw_init(struct xe_late_bind *late_bind);
>>     #endif
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 5/9] drm/xe/xe_late_bind_fw: Load late binding firmware
  2025-05-07 23:44   ` Daniele Ceraolo Spurio
@ 2025-06-04  5:36     ` Nilawar, Badal
  0 siblings, 0 replies; 30+ messages in thread
From: Nilawar, Badal @ 2025-06-04  5:36 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-xe, dri-devel
  Cc: anshuman.gupta, rodrigo.vivi, alexander.usyskin, gregkh


On 08-05-2025 05:14, Daniele Ceraolo Spurio wrote:
>
>
> On 4/29/2025 9:09 AM, Badal Nilawar wrote:
>> Load late binding firmware
>>
>> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_device.c       |  2 +
>>   drivers/gpu/drm/xe/xe_late_bind_fw.c | 91 +++++++++++++++++++++++++++-
>>   drivers/gpu/drm/xe/xe_late_bind_fw.h |  1 +
>>   3 files changed, 92 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_device.c 
>> b/drivers/gpu/drm/xe/xe_device.c
>> index d83864e7189c..30a416323b37 100644
>> --- a/drivers/gpu/drm/xe/xe_device.c
>> +++ b/drivers/gpu/drm/xe/xe_device.c
>> @@ -894,6 +894,8 @@ int xe_device_probe(struct xe_device *xe)
>>         xe_late_bind_fw_init(&xe->late_bind);
>>   +    xe_late_bind_fw_load(&xe->late_bind);
>> +
>
> Why does this need to be a separated call from xe_late_bind_fw_init?
In subsequent patches xe_late_bind_fw_load called during S2idle/S3/rpm 
resume.
>
>>       err = xe_oa_init(xe);
>>       if (err)
>>           return err;
>> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c 
>> b/drivers/gpu/drm/xe/xe_late_bind_fw.c
>> index 297238fd3d16..7d2bc959027d 100644
>> --- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
>> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
>> @@ -16,6 +16,16 @@
>>   #include "xe_late_bind_fw.h"
>>   #include "xe_pcode.h"
>>   #include "xe_pcode_api.h"
>> +#include "xe_pm.h"
>> +
>> +/*
>> + * The component should load quite quickly in most cases, but it 
>> could take
>> + * a bit. Using a very big timeout just to cover the worst case 
>> scenario
>> + */
>> +#define LB_INIT_TIMEOUT_MS 20000
>> +
>> +#define LB_FW_LOAD_RETRY_MAXCOUNT 40
>> +#define LB_FW_LOAD_RETRY_PAUSE_MS 50
>>     static const char * const fw_id_to_name[] = {
>>           [FAN_CONTROL_ID] = "fan_control",
>> @@ -45,6 +55,78 @@ static int late_bind_fw_num_fans(struct 
>> xe_late_bind *late_bind)
>>           return 0;
>>   }
>>   +static void late_bind_work(struct work_struct *work)
>> +{
>> +    struct xe_late_bind_fw *lbfw = container_of(work, struct 
>> xe_late_bind_fw, work);
>> +    struct xe_late_bind *late_bind = container_of(lbfw, struct 
>> xe_late_bind,
>> +                              late_bind_fw[lbfw->id]);
>> +    struct xe_device *xe = late_bind_to_xe(late_bind);
>> +    int retry = LB_FW_LOAD_RETRY_MAXCOUNT;
>> +    int ret;
>> +    int slept;
>> +
>> +    if (!late_bind->component_added)
>> +        return;
>> +
>> +    if (!lbfw->valid)
>> +        return;
>> +
>> +    /* we can queue this before the component is bound */
>> +    for (slept = 0; slept < LB_INIT_TIMEOUT_MS; slept += 100) {
>> +        if (late_bind->component)
>> +            break;
>> +        msleep(100);
>> +    }
>> +
>> +    xe_pm_runtime_get(xe);
>> +    mutex_lock(&late_bind->mutex);
>
> You're locking the mutex here but you're not checking that any of the 
> protected data is valid (late_bind->component can be NULL when we exit 
> from the above for loop, or it might have changed from valid to NULL 
> in between).
>
>> +    drm_dbg(&xe->drm, "Load %s firmware\n", fw_id_to_name[lbfw->id]);
>> +
>> +    do {
>> +        ret = 
>> late_bind->component->ops->push_config(late_bind->component->mei_dev,
>> +                                 lbfw->type, lbfw->flags,
>> +                                 lbfw->payload, lbfw->payload_size);
>> +        if (!ret)
>> +            break;
>> +        msleep(LB_FW_LOAD_RETRY_PAUSE_MS);
>> +    } while (--retry && ret == -EAGAIN);
>
> In which scenario can this call return -EAGAIN ? As far as I can see 
> the mei driver only returns that on a non-blocking call, which is not 
> what we're doing here. Am I missing a path somewhere?
Discussed offline with Sasha, we should be using -EBUSY here as 
mei_cldev_enable() can return -EBUSY.
>
>> +
>> +    if (ret)
>> +        drm_err(&xe->drm, "Load %s firmware failed with err %d\n",
>> +            fw_id_to_name[lbfw->id], ret);
>> +    else
>> +        drm_dbg(&xe->drm, "Load %s firmware successful\n",
>> +            fw_id_to_name[lbfw->id]);
>> +
>> +    mutex_unlock(&late_bind->mutex);
>> +    xe_pm_runtime_put(xe);
>> +}
>> +
>> +int xe_late_bind_fw_load(struct xe_late_bind *late_bind)
>> +{
>> +    struct xe_device *xe = late_bind_to_xe(late_bind);
>> +    struct xe_late_bind_fw *lbfw;
>> +    int id;
>> +
>> +    if (!late_bind->component_added)
>> +        return -EINVAL;
>> +
>> +    for (id = 0; id < MAX_ID; id++) {
>> +        lbfw = &late_bind->late_bind_fw[id];
>> +        if (lbfw->valid) {
>> +            drm_dbg(&xe->drm, "Queue work: to load %s firmware\n",
>> +                fw_id_to_name[lbfw->id]);
>> +            queue_work(late_bind->wq, &lbfw->work);
>
> Do we need to flush this work before suspend to make sure it has 
> completed before the HW goes into D3Hot? Similarly, do we need to 
> flush it on driver unload to make sure it's done before we de-allocate 
> stuff?
Sure, I will flush this before unbind.
>
>> +        }
>> +    }
>> +    return 0;
>> +}
>> +
>> +/**
>> + * late_bind_fw_init() - initialize late bind firmware
>> + *
>> + * Return: 0 if the initialization was successful, a negative errno 
>> otherwise.
>> + */
>>   static int late_bind_fw_init(struct xe_late_bind *late_bind, u32 id)
>>   {
>>       struct xe_device *xe = late_bind_to_xe(late_bind);
>> @@ -93,6 +175,7 @@ static int late_bind_fw_init(struct xe_late_bind 
>> *late_bind, u32 id)
>>         memcpy(lb_fw->payload, fw->data, lb_fw->payload_size);
>>       release_firmware(fw);
>> +    INIT_WORK(&lb_fw->work, late_bind_work);
>>       lb_fw->valid = true;
>>         return 0;
>> @@ -108,12 +191,17 @@ int xe_late_bind_fw_init(struct xe_late_bind 
>> *late_bind)
>>       int id;
>>       int ret;
>>   +    late_bind->wq = 
>> create_singlethread_workqueue("late-bind-ordered-wq");
>
> Where is this WQ destroyed? Also, I think that using 
> alloc_ordered_workqueue would be preferred.
I missed that, will fix it.
>
> Daniele
>
>> +    if (!late_bind->wq)
>> +        return -ENOMEM;
>> +
>>       for (id = 0; id < MAX_ID; id++) {
>>           ret = late_bind_fw_init(late_bind, id);
>>           if (ret)
>>               return ret;
>>       }
>> -    return ret;
>> +
>> +    return 0;
>>   }
>>     static int xe_late_bind_component_bind(struct device *xe_kdev,
>> @@ -179,7 +267,6 @@ int xe_late_bind_init(struct xe_late_bind 
>> *late_bind)
>>       }
>>         late_bind->component_added = true;
>> -    /* the component must be removed before unload, so can't use 
>> drmm for cleanup */
>>         return 0;
>>   }
>> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.h 
>> b/drivers/gpu/drm/xe/xe_late_bind_fw.h
>> index e88c637b15a6..edd0e4c0650e 100644
>> --- a/drivers/gpu/drm/xe/xe_late_bind_fw.h
>> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.h
>> @@ -13,5 +13,6 @@ struct xe_late_bind;
>>   int xe_late_bind_init(struct xe_late_bind *late_bind);
>>   void xe_late_bind_remove(struct xe_late_bind *late_bind);
>>   int xe_late_bind_fw_init(struct xe_late_bind *late_bind);
>> +int xe_late_bind_fw_load(struct xe_late_bind *late_bind);
>>     #endif
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl
  2025-05-06 18:13   ` Jason Gunthorpe
  2025-05-07 19:49     ` Rodrigo Vivi
@ 2025-06-06 13:45     ` Nilawar, Badal
  2025-06-30 22:01     ` Rodrigo Vivi
  2 siblings, 0 replies; 30+ messages in thread
From: Nilawar, Badal @ 2025-06-06 13:45 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: intel-xe, dri-devel, anshuman.gupta, rodrigo.vivi,
	alexander.usyskin, gregkh, daniele.ceraolospurio

[-- Attachment #1: Type: text/plain, Size: 876 bytes --]


On 06-05-2025 23:43, Jason Gunthorpe wrote:
>> +/**
>> + * struct fwctl_rpc_xe_pcode - FWCTL Remote Procedure Calls for Xe PCODE
>> + */
>> +struct fwctl_rpc_xe_pcode {
>> +	/** @command: The main Mailbox command */
>> +	__u8 command;
>> +	/** @param1: A subcommand or a parameter of the main command */
>> +	__u16 param1;
>> +	/** @param2: A parameter of a subcommand or a subsubcommand */
>> +	__u16 param2;
>> +	/** @data0: The first 32 bits of data. In general data-in as param */
>> +	__u32 data0;
>> +	/** @data1: The other 32 bits of data. In general data-out */
>> +	__u32 data1;
>> +	/** @pad: Padding the uAPI struct - Must be 0. Not sent to firmware */
>> +	__u8 pad[3];
>> +};
> This has implicit padding? Make the padding explicit or use packed..

Thanks for pointing this out, was miss at my end, will go with explicit 
padding in next version

Regards,
Badal


[-- Attachment #2: Type: text/html, Size: 1421 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl
  2025-05-07 19:49     ` Rodrigo Vivi
  2025-05-07 22:04       ` Jason Gunthorpe
@ 2025-06-06 13:47       ` Nilawar, Badal
  1 sibling, 0 replies; 30+ messages in thread
From: Nilawar, Badal @ 2025-06-06 13:47 UTC (permalink / raw)
  To: Rodrigo Vivi, Jason Gunthorpe
  Cc: intel-xe, dri-devel, anshuman.gupta, alexander.usyskin, gregkh,
	daniele.ceraolospurio

[-- Attachment #1: Type: text/plain, Size: 729 bytes --]


On 08-05-2025 01:19, Rodrigo Vivi wrote:
>>> +static int xe_pcode_fwctl_uctx_open(struct fwctl_uctx *uctx)
>>> +{
>>> +	struct xe_pcode_fwctl_dev *fwctl_dev =
>>> +		container_of(uctx->fwctl, struct xe_pcode_fwctl_dev, fwctl);
>>> +	struct xe_device *xe = fwctl_dev->xe;
>>> +
>>> +	xe_pm_runtime_get(xe);
>> Shouldn't this be in the RPC function? Why keep the device awake as
>> long as a the FD is open?
> In general I prefer the runtime on the outer bounds to avoid funny deadlock
> cases. But right, in this case here it could be inside the xe_pcode calls
> itself, that is when the mmio/mailboxes accesses will really happen.

This I will handle in separate patch. For now will do runtime on rpc 
function.

Regards,
Badal

[-- Attachment #2: Type: text/html, Size: 1489 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl
  2025-05-06 18:13   ` Jason Gunthorpe
  2025-05-07 19:49     ` Rodrigo Vivi
  2025-06-06 13:45     ` Nilawar, Badal
@ 2025-06-30 22:01     ` Rodrigo Vivi
  2025-06-30 22:45       ` Jason Gunthorpe
  2 siblings, 1 reply; 30+ messages in thread
From: Rodrigo Vivi @ 2025-06-30 22:01 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Badal Nilawar, intel-xe, dri-devel, anshuman.gupta,
	alexander.usyskin, gregkh, daniele.ceraolospurio

On Tue, May 06, 2025 at 03:13:53PM -0300, Jason Gunthorpe wrote:
> On Tue, Apr 29, 2025 at 09:39:56PM +0530, Badal Nilawar wrote:
> 
> > diff --git a/drivers/gpu/drm/xe/xe_pcode_fwctl.c b/drivers/gpu/drm/xe/xe_pcode_fwctl.c
> > new file mode 100644
> 
> I really do prefer it if you can find a way to put the code in
> drivers/fwctl instead of in DRM subsystem.

Hi Jason,

Bringing this point back...

Is this a hard request from your side?

This fwctl here is mostly the front-end mirroring FW, but still using directly
the xe interfaces there for the communication with the pcode FW.

Also, there are changes coming to pcode interfaces there to align better with
i915 and xe code sharing. I believe it would be easier and beneficial to have
this under DRM subsystem near xe and pcode code.

But, please, let us know your thoughts here.

Thanks,
Rodrigo.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl
  2025-06-30 22:01     ` Rodrigo Vivi
@ 2025-06-30 22:45       ` Jason Gunthorpe
  0 siblings, 0 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2025-06-30 22:45 UTC (permalink / raw)
  To: Rodrigo Vivi
  Cc: Badal Nilawar, intel-xe, dri-devel, anshuman.gupta,
	alexander.usyskin, gregkh, daniele.ceraolospurio

On Mon, Jun 30, 2025 at 06:01:13PM -0400, Rodrigo Vivi wrote:
> On Tue, May 06, 2025 at 03:13:53PM -0300, Jason Gunthorpe wrote:
> > On Tue, Apr 29, 2025 at 09:39:56PM +0530, Badal Nilawar wrote:
> > 
> > > diff --git a/drivers/gpu/drm/xe/xe_pcode_fwctl.c b/drivers/gpu/drm/xe/xe_pcode_fwctl.c
> > > new file mode 100644
> > 
> > I really do prefer it if you can find a way to put the code in
> > drivers/fwctl instead of in DRM subsystem.
> 
> Hi Jason,
> 
> Bringing this point back...
> 
> Is this a hard request from your side?

No, cxl didn't do it..

But also we've seen this go bad historically to just randomly spread
code all over the tree because it is easier that way.

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl
  2025-05-07 22:04       ` Jason Gunthorpe
@ 2025-08-22 19:33         ` Rodrigo Vivi
  2025-08-28 12:48           ` Jason Gunthorpe
  0 siblings, 1 reply; 30+ messages in thread
From: Rodrigo Vivi @ 2025-08-22 19:33 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Badal Nilawar, intel-xe, dri-devel, anshuman.gupta,
	alexander.usyskin, gregkh, daniele.ceraolospurio

On Wed, May 07, 2025 at 07:04:44PM -0300, Jason Gunthorpe wrote:
> On Wed, May 07, 2025 at 03:49:15PM -0400, Rodrigo Vivi wrote:
> 
> > One last thing since I have your attention here. Was any time in the previous
> > fwctl discussions talked about the possibility of some extra usages for like
> > FW flashing or in-field-repair/tests where big data needs to filled bypassing
> > lockdown mode?
> 
> For FW flash I do suggest you try to use devlink's firmware flashing
> interface. I think it would be really great if that could become a
> cross-subsystem standard in Linux.

I took a look there and I do believe devlink works very well for fw flashing and
it is also already prepared for basically any pci device without any change.
Thanks for the suggestion.

But now I have to ask you about 2 other use cases that are not under the
umbrella of: configuration and provisioning, but perhaps at least partially
under the umbrella of debug:

- In-field-test-and-repair
- Error injection

Can fwctl be used for these use cases, supposing that it FW mailboxes
commands and responses directly with no modification to the fwctl
infra itself?

in-field-test I'd say it is debug, the 'repair' portion is the most
questionable one. But error-injection I believe it is entirely
debug.

What's your thoughts and guidance here?

Thanks,
Rodrigo.

> 
> If that isn't going to work out then yes I would say fwctl should be
> considered for flashing.
> 
> Saeed's original version had a "big data" memory pinned DMA capable
> interface as well. With justification it could come into the fwctl
> version. I'm not against it, but mindful that it widens what is
> possible by quite a bit.
> 
> But you might not need something like that just for flash. Some
> internal improvement kernel side to allow streaming from large
> user-space RPC buffers instead of a single alloc and copy would be
> sufficient..
> 
> Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl
  2025-08-22 19:33         ` Rodrigo Vivi
@ 2025-08-28 12:48           ` Jason Gunthorpe
  0 siblings, 0 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2025-08-28 12:48 UTC (permalink / raw)
  To: Rodrigo Vivi
  Cc: Badal Nilawar, intel-xe, dri-devel, anshuman.gupta,
	alexander.usyskin, gregkh, daniele.ceraolospurio

On Fri, Aug 22, 2025 at 03:33:17PM -0400, Rodrigo Vivi wrote:
> On Wed, May 07, 2025 at 07:04:44PM -0300, Jason Gunthorpe wrote:
> > On Wed, May 07, 2025 at 03:49:15PM -0400, Rodrigo Vivi wrote:
> > 
> > > One last thing since I have your attention here. Was any time in the previous
> > > fwctl discussions talked about the possibility of some extra usages for like
> > > FW flashing or in-field-repair/tests where big data needs to filled bypassing
> > > lockdown mode?
> > 
> > For FW flash I do suggest you try to use devlink's firmware flashing
> > interface. I think it would be really great if that could become a
> > cross-subsystem standard in Linux.
> 
> I took a look there and I do believe devlink works very well for fw flashing and
> it is also already prepared for basically any pci device without any change.
> Thanks for the suggestion.

Jiri also has generic plugin for fwupd:

https://github.com/fwupd/fwupd/pull/9053

So if you do the kernel side you get a whole ecosystem now..

> But now I have to ask you about 2 other use cases that are not under the
> umbrella of: configuration and provisioning, but perhaps at least partially
> under the umbrella of debug:
> 
> - In-field-test-and-repair
> - Error injection
> 
> Can fwctl be used for these use cases, supposing that it FW mailboxes
> commands and responses directly with no modification to the fwctl
> infra itself?

These sound like what fwctl was intended for, yes.

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2025-08-28 12:48 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-29 16:09 [RFC 0/9] Introducing firmware late binding Badal Nilawar
2025-04-29 16:09 ` [RFC 1/9] mei: bus: add mei_cldev_mtu interface Badal Nilawar
2025-04-29 16:09 ` [RFC 2/9] mei: late_bind: add late binding component driver Badal Nilawar
2025-05-07 22:42   ` Daniele Ceraolo Spurio
2025-05-08  5:41     ` Usyskin, Alexander
2025-06-03 12:01       ` Nilawar, Badal
2025-04-29 16:09 ` [RFC 3/9] drm/xe/late_bind_fw: Introducing late_bind_fw Badal Nilawar
2025-05-07 21:38   ` Daniele Ceraolo Spurio
2025-06-03 13:52     ` Nilawar, Badal
2025-04-29 16:09 ` [RFC 4/9] drm/xe/xe_late_bind_fw: Initialize late binding firmware Badal Nilawar
2025-05-07 23:11   ` Daniele Ceraolo Spurio
2025-06-03 13:57     ` Nilawar, Badal
2025-04-29 16:09 ` [RFC 5/9] drm/xe/xe_late_bind_fw: Load " Badal Nilawar
2025-05-07 23:44   ` Daniele Ceraolo Spurio
2025-06-04  5:36     ` Nilawar, Badal
2025-04-29 16:09 ` [RFC 6/9] drm/xe/xe_late_bind_fw: Reload late binding fw in rpm resume Badal Nilawar
2025-04-29 16:09 ` [RFC 7/9] drm/xe/xe_late_bind_fw: Reload late binding fw in S2Idle/S3 resume Badal Nilawar
2025-04-29 16:09 ` [RFC 8/9] drm/xe/xe_late_bind_fw: Introduce debug fs node to disable late binding Badal Nilawar
2025-04-29 16:09 ` [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl Badal Nilawar
2025-05-01 15:44   ` Rodrigo Vivi
2025-05-06 18:13   ` Jason Gunthorpe
2025-05-07 19:49     ` Rodrigo Vivi
2025-05-07 22:04       ` Jason Gunthorpe
2025-08-22 19:33         ` Rodrigo Vivi
2025-08-28 12:48           ` Jason Gunthorpe
2025-06-06 13:47       ` Nilawar, Badal
2025-06-06 13:45     ` Nilawar, Badal
2025-06-30 22:01     ` Rodrigo Vivi
2025-06-30 22:45       ` Jason Gunthorpe
2025-04-30 11:47 ` [RFC 0/9] Introducing firmware late binding Jani Nikula

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).