* [PATCH V2 0/4] Add versal-pci driver
@ 2024-12-10 18:37 Yidong Zhang
2024-12-10 18:37 ` [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci Yidong Zhang
` (4 more replies)
0 siblings, 5 replies; 28+ messages in thread
From: Yidong Zhang @ 2024-12-10 18:37 UTC (permalink / raw)
To: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu; +Cc: Yidong Zhang, lizhi.hou
This patchset introduces a new Linux kernel driver, versal-pci, for the AMD
Alveo Versal based PCIe card. The driver is based on the Linux FPGA driver
framework.
The AMD Alveo Versal based PCIe card, including the V70, is the first Alveo
production card leveraging the AMD XDNA architecture with AI Engines. It is
designed for AI inference efficiency and is tuned for video analytics and
natural language processing applications [1].
This versal-pci driver provides services including:
- leveraging the Linux firmware and FPGA frameworks to download management
firmware
- programming additional bit-streams for AMD Xilinx specific hardware
- communicating with the PCIe user function
- communicating with the firmware running on the PCIe card
- monitoring device health
The driver is licensed under GPL-2.0.
The firmware and bit-streams are distributed as closed binaries delivered
by AMD. Please see [1] for more information.
[1] https://www.amd.com/en/products/accelerators/alveo/v70.html
The driver has been refactored to address all comments from v1.
Changes since v1:
- Add driver architecture description.
- Change the driver name to versal-pci.
- Remove unnecessary memcpy in versal-pci-comm-chan.c
- Keep mod_timer because we need a single-threaded queue with delayed_work.
- Change the "comms" to "comm_chan".
- Remove regmap, use base+offset directly.
- Add skeleton ops with no implementation (TODO) for fpga_manager in the early
patch (0001) and add the implementation in a later patch (0004).
- Remove br_ops and FPGA region, no need.
- Add load_xclbin request from user PF driver via comm_chan service, and
read firmware from local cached data.
- Remove pcim_iomap_table and pcim_iomap_regions. Use pcim_iomap_region.
- Redo patches, removing unused definitions from each patch. Use
/* TODO */ to mark dependencies that will be addressed in a later patch.
- Fix memory leak in rm_check_health.
- Fix cmd with error status in rm_check_health.
- Change to vzalloc in all places.
- Add rm_queue_withdraw_cmd in fw_cancel.
Yidong Zhang (4):
drivers/fpga/amd: Add new driver amd versal-pci
drivers/fpga/amd: Add communication channel
drivers/fpga/amd: Add remote queue
drivers/fpga/amd: Add load xclbin and load firmware
MAINTAINERS | 6 +
drivers/fpga/Kconfig | 3 +
drivers/fpga/Makefile | 3 +
drivers/fpga/amd/Kconfig | 15 +
drivers/fpga/amd/Makefile | 8 +
drivers/fpga/amd/versal-pci-comm-chan.c | 271 ++++++++++++
drivers/fpga/amd/versal-pci-comm-chan.h | 14 +
drivers/fpga/amd/versal-pci-main.c | 445 ++++++++++++++++++++
drivers/fpga/amd/versal-pci-rm-queue.c | 324 +++++++++++++++
drivers/fpga/amd/versal-pci-rm-queue.h | 21 +
drivers/fpga/amd/versal-pci-rm-service.c | 497 +++++++++++++++++++++++
drivers/fpga/amd/versal-pci-rm-service.h | 229 +++++++++++
drivers/fpga/amd/versal-pci.h | 90 ++++
13 files changed, 1926 insertions(+)
create mode 100644 drivers/fpga/amd/Kconfig
create mode 100644 drivers/fpga/amd/Makefile
create mode 100644 drivers/fpga/amd/versal-pci-comm-chan.c
create mode 100644 drivers/fpga/amd/versal-pci-comm-chan.h
create mode 100644 drivers/fpga/amd/versal-pci-main.c
create mode 100644 drivers/fpga/amd/versal-pci-rm-queue.c
create mode 100644 drivers/fpga/amd/versal-pci-rm-queue.h
create mode 100644 drivers/fpga/amd/versal-pci-rm-service.c
create mode 100644 drivers/fpga/amd/versal-pci-rm-service.h
create mode 100644 drivers/fpga/amd/versal-pci.h
--
2.34.1
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2024-12-10 18:37 [PATCH V2 0/4] Add versal-pci driver Yidong Zhang
@ 2024-12-10 18:37 ` Yidong Zhang
2025-01-26 9:24 ` Xu Yilun
` (3 more replies)
2024-12-10 18:37 ` [PATCH V2 2/4] drivers/fpga/amd: Add communication channel Yidong Zhang
` (3 subsequent siblings)
4 siblings, 4 replies; 28+ messages in thread
From: Yidong Zhang @ 2024-12-10 18:37 UTC (permalink / raw)
To: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu
Cc: Yidong Zhang, lizhi.hou, DMG Karthik, Nishad Saraf,
Prapul Krishnamurthy, Hayden Laccabue
The AMD Versal based PCIe card, including the V70, is designed for AI
inference efficiency and is tuned for video analytics and natural language
processing applications.
The driver architecture:
+---------+ Communication +---------+ Remote +-----+------+
| | Channel | | Queue | | |
| User PF | <============> | Mgmt PF | <=======>| FW | FPGA |
+---------+ +---------+ +-----+------+
PL Data base FW
APU FW
PL Data (copy)
- PL (FPGA Program Logic)
- FW (Firmware)
There are two separate drivers in the original XRT [1] design:
- UserPF driver
- MgmtPF driver
The new AMD versal-pci driver will replace the MgmtPF driver for the Versal
PCIe card.
XRT [1] is already open source. It provides a runtime solution for many
different types of PCIe-based cards, along with utilities for managing and
programming the devices.
The name versal-pci stands for the AMD Versal brand PCIe device management
driver. This driver provides the following functionality:
- module and PCI device initialization
the driver attaches to the specific device ID of the V70 card and
initializes itself based on BAR resources for:
- communication channel:
a hardware message service between the mgmt PF and the user PF
- remote queue:
a hardware queue based ring buffer service between the mgmt PF and the
PCIe hardware firmware for programming the FPGA Program Logic, loading
firmware, and checking card health status.
- programming FW
- The base FW is downloaded onto the flash of the card.
- The APU FW is downloaded once after a POR (power on reset).
- Reloading the MgmtPF driver will not change any existing hardware.
- programming FPGA hardware binaries - PL Data
- using the FPGA framework ops to support re-programming the FPGA
- the re-programming request is initiated only from the existing UserPF
driver, and the MgmtPF driver loads the matched PL Data after
receiving the request over the communication channel. The matching PL
Data is indexed by the PL Data UUID and the Base FW UUID.
- The Base FW UUID identifies the unique base hardware; it is often
called the interface UUID.
- The PL Data UUID identifies the unique PL design generated for that
base hardware; it is often called the xclbin UUID.
- Example:
4fdebe35[...trimmed...]_96df7d[...trimmed...].xclbin
| | | |
+-- xclbin UUID --+ +--interface UUID --+
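For illustration, composing the matched firmware file name from the two
UUIDs can be sketched as below. This mirrors the snprintf() pattern used by
versal_pci_load_xclbin() in this patch; the helper name is hypothetical, and
the in-kernel code formats the xclbin UUID with the %pUb specifier instead
of taking a pre-formatted string:

```c
#include <stdio.h>
#include <string.h>
#include <assert.h>

/*
 * Illustrative sketch only: build the firmware name the driver requests,
 * "<xclbin UUID>_<interface UUID>.xclbin", relative to the firmware
 * search path. The real lookup lives in versal_pci_load_xclbin().
 */
static void build_xclbin_name(char *buf, size_t len,
			      const char *xclbin_uuid, /* PL Data UUID */
			      const char *intf_uuid)   /* Base FW UUID */
{
	/* request_firmware() resolves this relative to /lib/firmware */
	snprintf(buf, len, "xilinx/xclbins/%s_%s.xclbin",
		 xclbin_uuid, intf_uuid);
}
```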
[1] https://github.com/Xilinx/XRT/blob/master/README.rst
Co-developed-by: DMG Karthik <Karthik.DMG@amd.com>
Signed-off-by: DMG Karthik <Karthik.DMG@amd.com>
Co-developed-by: Nishad Saraf <nishads@amd.com>
Signed-off-by: Nishad Saraf <nishads@amd.com>
Co-developed-by: Prapul Krishnamurthy <prapulk@amd.com>
Signed-off-by: Prapul Krishnamurthy <prapulk@amd.com>
Co-developed-by: Hayden Laccabue <hayden.laccabue@amd.com>
Signed-off-by: Hayden Laccabue <hayden.laccabue@amd.com>
Signed-off-by: Yidong Zhang <yidong.zhang@amd.com>
---
MAINTAINERS | 6 +
drivers/fpga/Kconfig | 3 +
drivers/fpga/Makefile | 3 +
drivers/fpga/amd/Kconfig | 15 ++
drivers/fpga/amd/Makefile | 5 +
drivers/fpga/amd/versal-pci-main.c | 328 +++++++++++++++++++++++++++++
drivers/fpga/amd/versal-pci.h | 86 ++++++++
7 files changed, 446 insertions(+)
create mode 100644 drivers/fpga/amd/Kconfig
create mode 100644 drivers/fpga/amd/Makefile
create mode 100644 drivers/fpga/amd/versal-pci-main.c
create mode 100644 drivers/fpga/amd/versal-pci.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 17daa9ee9384..302c10004c5d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1194,6 +1194,12 @@ L: linux-spi@vger.kernel.org
S: Supported
F: drivers/spi/spi-amd.c
+AMD VERSAL PCI DRIVER
+M: Yidong Zhang <yidong.zhang@amd.com>
+L: linux-fpga@vger.kernel.org
+S: Supported
+F: drivers/fpga/amd/
+
AMD XGBE DRIVER
M: "Shyam Sundar S K" <Shyam-sundar.S-k@amd.com>
L: netdev@vger.kernel.org
diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
index 37b35f58f0df..dce060a7bd8f 100644
--- a/drivers/fpga/Kconfig
+++ b/drivers/fpga/Kconfig
@@ -290,4 +290,7 @@ config FPGA_MGR_LATTICE_SYSCONFIG_SPI
source "drivers/fpga/tests/Kconfig"
+# Driver files
+source "drivers/fpga/amd/Kconfig"
+
endif # FPGA
diff --git a/drivers/fpga/Makefile b/drivers/fpga/Makefile
index aeb89bb13517..8412f3e211cc 100644
--- a/drivers/fpga/Makefile
+++ b/drivers/fpga/Makefile
@@ -58,5 +58,8 @@ obj-$(CONFIG_FPGA_DFL_NIOS_INTEL_PAC_N3000) += dfl-n3000-nios.o
# Drivers for FPGAs which implement DFL
obj-$(CONFIG_FPGA_DFL_PCI) += dfl-pci.o
+# AMD PCIe Versal Management Driver
+obj-$(CONFIG_AMD_VERSAL_PCI) += amd/
+
# KUnit tests
obj-$(CONFIG_FPGA_KUNIT_TESTS) += tests/
diff --git a/drivers/fpga/amd/Kconfig b/drivers/fpga/amd/Kconfig
new file mode 100644
index 000000000000..b18a42a340ba
--- /dev/null
+++ b/drivers/fpga/amd/Kconfig
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config AMD_VERSAL_PCI
+ tristate "AMD Versal PCIe Management Driver"
+ select FW_LOADER
+ select FW_UPLOAD
+ depends on FPGA
+ depends on HAS_IOMEM
+ depends on PCI
+ help
+ AMD Versal PCIe Management Driver provides management services to
+ download firmware, program bitstream, and communicate with the User
+ function.
+
+ To compile this driver as a module, choose M here: the module will be called versal-pci.
diff --git a/drivers/fpga/amd/Makefile b/drivers/fpga/amd/Makefile
new file mode 100644
index 000000000000..5d1ef04b5e80
--- /dev/null
+++ b/drivers/fpga/amd/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_AMD_VERSAL_PCI) += versal-pci.o
+
+versal-pci-$(CONFIG_AMD_VERSAL_PCI) := versal-pci-main.o
diff --git a/drivers/fpga/amd/versal-pci-main.c b/drivers/fpga/amd/versal-pci-main.c
new file mode 100644
index 000000000000..a10ccf86802b
--- /dev/null
+++ b/drivers/fpga/amd/versal-pci-main.c
@@ -0,0 +1,328 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for Versal PCIe device
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc. All rights reserved.
+ */
+
+#include <linux/pci.h>
+
+#include "versal-pci.h"
+
+#define DRV_NAME "amd-versal-pci"
+
+#define PCI_DEVICE_ID_V70PQ2 0x50B0
+#define VERSAL_XCLBIN_MAGIC_ID "xclbin2"
+
+static int versal_pci_fpga_write_init(struct fpga_manager *mgr, struct fpga_image_info *info,
+ const char *buf, size_t count)
+{
+ /* TODO */
+ return 0;
+}
+
+static int versal_pci_fpga_write(struct fpga_manager *mgr, const char *buf,
+ size_t count)
+{
+ /* TODO */
+ return 0;
+}
+
+static int versal_pci_fpga_write_complete(struct fpga_manager *mgr,
+ struct fpga_image_info *info)
+{
+ /* TODO */
+ return 0;
+}
+
+static enum fpga_mgr_states versal_pci_fpga_state(struct fpga_manager *mgr)
+{
+ struct fpga_device *fdev = mgr->priv;
+
+ return fdev->state;
+}
+
+static const struct fpga_manager_ops versal_pci_fpga_ops = {
+ .write_init = versal_pci_fpga_write_init,
+ .write = versal_pci_fpga_write,
+ .write_complete = versal_pci_fpga_write_complete,
+ .state = versal_pci_fpga_state,
+};
+
+static void versal_pci_fpga_fini(struct fpga_device *fdev)
+{
+ fpga_mgr_unregister(fdev->mgr);
+}
+
+static void versal_pci_uuid_parse(struct versal_pci_device *vdev, uuid_t *uuid)
+{
+ char str[UUID_STRING_LEN + 1] = {0};
+ u8 i, j;
+
+ /* parse uuid into a valid uuid string format */
+ for (i = 0, j = 0; i < strlen(vdev->fw_id) && j < UUID_STRING_LEN; i++) {
+ str[j++] = vdev->fw_id[i];
+ if (j == 8 || j == 13 || j == 18 || j == 23)
+ str[j++] = '-';
+ }
+
+ uuid_parse(str, uuid);
+ vdev_info(vdev, "Interface uuid %pU", uuid);
+}
+
+static struct fpga_device *versal_pci_fpga_init(struct versal_pci_device *vdev)
+{
+ struct device *dev = &vdev->pdev->dev;
+ struct fpga_manager_info info = { 0 };
+ struct fpga_device *fdev;
+ int ret;
+
+ fdev = devm_kzalloc(dev, sizeof(*fdev), GFP_KERNEL);
+ if (!fdev)
+ return ERR_PTR(-ENOMEM);
+
+ fdev->vdev = vdev;
+
+ info = (struct fpga_manager_info) {
+ .name = "AMD Versal FPGA Manager",
+ .mops = &versal_pci_fpga_ops,
+ .priv = fdev,
+ };
+
+ fdev->mgr = fpga_mgr_register_full(dev, &info);
+ if (IS_ERR(fdev->mgr)) {
+ ret = PTR_ERR(fdev->mgr);
+ vdev_err(vdev, "Failed to register FPGA manager, err %d", ret);
+ return ERR_PTR(ret);
+ }
+
+ /* Place holder for rm_queue_get_fw_id(vdev->rdev) */
+ versal_pci_uuid_parse(vdev, &vdev->intf_uuid);
+
+ return fdev;
+}
+
+static int versal_pci_program_axlf(struct versal_pci_device *vdev, char *data, size_t size)
+{
+ const struct axlf *axlf = (struct axlf *)data;
+ struct fpga_image_info *image_info;
+ int ret;
+
+ image_info = fpga_image_info_alloc(&vdev->pdev->dev);
+ if (!image_info)
+ return -ENOMEM;
+
+ image_info->count = axlf->header.length;
+ image_info->buf = (char *)axlf;
+
+ ret = fpga_mgr_load(vdev->fdev->mgr, image_info);
+ if (ret) {
+ vdev_err(vdev, "failed to load xclbin: %d", ret);
+ goto exit;
+ }
+
+ vdev_info(vdev, "Downloaded axlf %pUb of size %zu Bytes", &axlf->header.uuid, size);
+ uuid_copy(&vdev->xclbin_uuid, &axlf->header.uuid);
+
+exit:
+ fpga_image_info_free(image_info);
+
+ return ret;
+}
+
+int versal_pci_load_xclbin(struct versal_pci_device *vdev, uuid_t *xuuid)
+{
+ const char *xclbin_location = "xilinx/xclbins";
+ char fw_name[100];
+ const struct firmware *fw;
+ int ret;
+
+ snprintf(fw_name, sizeof(fw_name), "%s/%pUb_%s.xclbin",
+ xclbin_location, xuuid, vdev->fw_id);
+
+ vdev_info(vdev, "trying to load %s", fw_name);
+ ret = request_firmware(&fw, fw_name, &vdev->pdev->dev);
+ if (ret) {
+ vdev_warn(vdev, "request xclbin fw %s failed %d", fw_name, ret);
+ return ret;
+ }
+ vdev_info(vdev, "loaded data size %zu", fw->size);
+
+ ret = versal_pci_program_axlf(vdev, (char *)fw->data, fw->size);
+ if (ret)
+ vdev_err(vdev, "program axlf %s failed %d", fw_name, ret);
+
+ release_firmware(fw);
+
+ return ret;
+}
+
+static enum fw_upload_err versal_pci_fw_prepare(struct fw_upload *fw_upload, const u8 *data,
+ u32 size)
+{
+ /* TODO */
+ return FW_UPLOAD_ERR_NONE;
+}
+
+static enum fw_upload_err versal_pci_fw_write(struct fw_upload *fw_upload, const u8 *data,
+ u32 offset, u32 size, u32 *written)
+{
+ /* TODO */
+ return FW_UPLOAD_ERR_NONE;
+}
+
+static enum fw_upload_err versal_pci_fw_poll_complete(struct fw_upload *fw_upload)
+{
+ /* TODO */
+ return FW_UPLOAD_ERR_NONE;
+}
+
+static void versal_pci_fw_cancel(struct fw_upload *fw_upload)
+{
+ /* TODO */
+}
+
+static void versal_pci_fw_cleanup(struct fw_upload *fw_upload)
+{
+ /* TODO */
+}
+
+static const struct fw_upload_ops versal_pci_fw_ops = {
+ .prepare = versal_pci_fw_prepare,
+ .write = versal_pci_fw_write,
+ .poll_complete = versal_pci_fw_poll_complete,
+ .cancel = versal_pci_fw_cancel,
+ .cleanup = versal_pci_fw_cleanup,
+};
+
+static void versal_pci_fw_upload_fini(struct firmware_device *fwdev)
+{
+ firmware_upload_unregister(fwdev->fw);
+ kfree(fwdev->name);
+}
+
+static u32 versal_pci_devid(struct versal_pci_device *vdev)
+{
+ return ((pci_domain_nr(vdev->pdev->bus) << 16) |
+ PCI_DEVID(vdev->pdev->bus->number, vdev->pdev->devfn));
+}
+
+static struct firmware_device *versal_pci_fw_upload_init(struct versal_pci_device *vdev)
+{
+ struct device *dev = &vdev->pdev->dev;
+ struct firmware_device *fwdev;
+ u32 devid;
+
+ fwdev = devm_kzalloc(dev, sizeof(*fwdev), GFP_KERNEL);
+ if (!fwdev)
+ return ERR_PTR(-ENOMEM);
+
+ devid = versal_pci_devid(vdev);
+ fwdev->name = kasprintf(GFP_KERNEL, "%s%x", DRV_NAME, devid);
+ if (!fwdev->name)
+ return ERR_PTR(-ENOMEM);
+
+ fwdev->fw = firmware_upload_register(THIS_MODULE, dev, fwdev->name,
+ &versal_pci_fw_ops, fwdev);
+ if (IS_ERR(fwdev->fw)) {
+ kfree(fwdev->name);
+ return ERR_CAST(fwdev->fw);
+ }
+
+ fwdev->vdev = vdev;
+
+ return fwdev;
+}
+
+static void versal_pci_device_teardown(struct versal_pci_device *vdev)
+{
+ versal_pci_fpga_fini(vdev->fdev);
+ versal_pci_fw_upload_fini(vdev->fwdev);
+}
+
+static int versal_pci_device_setup(struct versal_pci_device *vdev)
+{
+ int ret;
+
+ vdev->fwdev = versal_pci_fw_upload_init(vdev);
+ if (IS_ERR(vdev->fwdev)) {
+ ret = PTR_ERR(vdev->fwdev);
+ vdev_err(vdev, "Failed to init FW uploader, err %d", ret);
+ return ret;
+ }
+
+ vdev->fdev = versal_pci_fpga_init(vdev);
+ if (IS_ERR(vdev->fdev)) {
+ ret = PTR_ERR(vdev->fdev);
+ vdev_err(vdev, "Failed to init FPGA manager, err %d", ret);
+ goto upload_fini;
+ }
+
+ return 0;
+
+upload_fini:
+ versal_pci_fw_upload_fini(vdev->fwdev);
+
+ return ret;
+}
+
+static void versal_pci_remove(struct pci_dev *pdev)
+{
+ struct versal_pci_device *vdev = pci_get_drvdata(pdev);
+
+ versal_pci_device_teardown(vdev);
+}
+
+static int versal_pci_probe(struct pci_dev *pdev, const struct pci_device_id *pdev_id)
+{
+ struct versal_pci_device *vdev;
+ int ret;
+
+ vdev = devm_kzalloc(&pdev->dev, sizeof(*vdev), GFP_KERNEL);
+ if (!vdev)
+ return -ENOMEM;
+
+ pci_set_drvdata(pdev, vdev);
+ vdev->pdev = pdev;
+
+ ret = pcim_enable_device(pdev);
+ if (ret) {
+ vdev_err(vdev, "Failed to enable device %d", ret);
+ return ret;
+ }
+
+ vdev->io_regs = pcim_iomap_region(vdev->pdev, MGMT_BAR, DRV_NAME);
+ if (IS_ERR(vdev->io_regs)) {
+ vdev_err(vdev, "Failed to map RM shared memory BAR%d", MGMT_BAR);
+ return PTR_ERR(vdev->io_regs);
+ }
+
+ ret = versal_pci_device_setup(vdev);
+ if (ret) {
+ vdev_err(vdev, "Failed to setup Versal device %d", ret);
+ return ret;
+ }
+
+ vdev_dbg(vdev, "Successfully probed %s driver!", DRV_NAME);
+ return 0;
+}
+
+static const struct pci_device_id versal_pci_ids[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_XILINX, PCI_DEVICE_ID_V70PQ2), },
+ { 0 }
+};
+
+MODULE_DEVICE_TABLE(pci, versal_pci_ids);
+
+static struct pci_driver versal_pci_driver = {
+ .name = DRV_NAME,
+ .id_table = versal_pci_ids,
+ .probe = versal_pci_probe,
+ .remove = versal_pci_remove,
+};
+
+module_pci_driver(versal_pci_driver);
+
+MODULE_DESCRIPTION("AMD Versal PCIe Management Driver");
+MODULE_AUTHOR("XRT Team <runtimeca39d@amd.com>");
+MODULE_LICENSE("GPL");
diff --git a/drivers/fpga/amd/versal-pci.h b/drivers/fpga/amd/versal-pci.h
new file mode 100644
index 000000000000..1509bd0532ea
--- /dev/null
+++ b/drivers/fpga/amd/versal-pci.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Driver for Versal PCIe device
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc. All rights reserved.
+ */
+
+#ifndef __VERSAL_PCI_H
+#define __VERSAL_PCI_H
+
+#include <linux/firmware.h>
+#include <linux/fpga/fpga-mgr.h>
+
+#define MGMT_BAR 0
+
+#define vdev_info(vdev, fmt, args...) \
+ dev_info(&(vdev)->pdev->dev, "%s: "fmt, __func__, ##args)
+
+#define vdev_warn(vdev, fmt, args...) \
+ dev_warn(&(vdev)->pdev->dev, "%s: "fmt, __func__, ##args)
+
+#define vdev_err(vdev, fmt, args...) \
+ dev_err(&(vdev)->pdev->dev, "%s: "fmt, __func__, ##args)
+
+#define vdev_dbg(vdev, fmt, args...) \
+ dev_dbg(&(vdev)->pdev->dev, fmt, ##args)
+
+struct versal_pci_device;
+
+struct axlf_header {
+ __u64 length;
+ __u8 reserved1[24];
+ uuid_t rom_uuid;
+ __u8 reserved2[64];
+ uuid_t uuid;
+ __u8 reserved3[24];
+} __packed;
+
+struct axlf {
+ __u8 magic[8];
+ __u8 reserved[296];
+ struct axlf_header header;
+} __packed;
+
+struct fw_tnx {
+ struct rm_cmd *cmd;
+ __u32 opcode;
+ __u32 id;
+};
+
+struct fpga_device {
+ enum fpga_mgr_states state;
+ struct fpga_manager *mgr;
+ struct versal_pci_device *vdev;
+ struct fw_tnx fw;
+};
+
+struct firmware_device {
+ struct versal_pci_device *vdev;
+ struct fw_upload *fw;
+ __u8 *name;
+ __u32 fw_name_id;
+ struct rm_cmd *cmd;
+ __u32 id;
+ uuid_t uuid;
+};
+
+struct versal_pci_device {
+ struct pci_dev *pdev;
+
+ struct fpga_device *fdev;
+ struct firmware_device *fwdev;
+ struct device *device;
+
+ void __iomem *io_regs;
+ uuid_t xclbin_uuid;
+ uuid_t intf_uuid;
+ __u8 fw_id[UUID_STRING_LEN + 1];
+
+ __u8 *debugfs_root;
+};
+
+/* versal pci driver APIs */
+int versal_pci_load_xclbin(struct versal_pci_device *vdev, uuid_t *xclbin_uuid);
+
+#endif /* __VERSAL_PCI_H */
--
2.34.1
* [PATCH V2 2/4] drivers/fpga/amd: Add communication channel
2024-12-10 18:37 [PATCH V2 0/4] Add versal-pci driver Yidong Zhang
2024-12-10 18:37 ` [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci Yidong Zhang
@ 2024-12-10 18:37 ` Yidong Zhang
2025-01-26 10:19 ` Christophe JAILLET
2024-12-10 18:37 ` [PATCH V2 3/4] drivers/fpga/amd: Add remote queue Yidong Zhang
` (2 subsequent siblings)
4 siblings, 1 reply; 28+ messages in thread
From: Yidong Zhang @ 2024-12-10 18:37 UTC (permalink / raw)
To: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu
Cc: Yidong Zhang, lizhi.hou, Nishad Saraf
The communication channel (comm_chan) service runs between versal-pci and
the user PF driver. When the user PF driver requests a PL data download, the
comm_chan service handles the request by calling versal_pci_load_xclbin().
Co-developed-by: Nishad Saraf <nishads@amd.com>
Signed-off-by: Nishad Saraf <nishads@amd.com>
Signed-off-by: Yidong Zhang <yidong.zhang@amd.com>
---
drivers/fpga/amd/Makefile | 3 +-
drivers/fpga/amd/versal-pci-comm-chan.c | 271 ++++++++++++++++++++++++
drivers/fpga/amd/versal-pci-comm-chan.h | 14 ++
drivers/fpga/amd/versal-pci-main.c | 14 +-
drivers/fpga/amd/versal-pci.h | 2 +
5 files changed, 301 insertions(+), 3 deletions(-)
create mode 100644 drivers/fpga/amd/versal-pci-comm-chan.c
create mode 100644 drivers/fpga/amd/versal-pci-comm-chan.h
diff --git a/drivers/fpga/amd/Makefile b/drivers/fpga/amd/Makefile
index 5d1ef04b5e80..7a604785e5f9 100644
--- a/drivers/fpga/amd/Makefile
+++ b/drivers/fpga/amd/Makefile
@@ -2,4 +2,5 @@
obj-$(CONFIG_AMD_VERSAL_PCI) += versal-pci.o
-versal-pci-$(CONFIG_AMD_VERSAL_PCI) := versal-pci-main.o
+versal-pci-$(CONFIG_AMD_VERSAL_PCI) := versal-pci-main.o \
+ versal-pci-comm-chan.o
diff --git a/drivers/fpga/amd/versal-pci-comm-chan.c b/drivers/fpga/amd/versal-pci-comm-chan.c
new file mode 100644
index 000000000000..20ccb1ac7754
--- /dev/null
+++ b/drivers/fpga/amd/versal-pci-comm-chan.c
@@ -0,0 +1,271 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for Versal PCIe device
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc. All rights reserved.
+ */
+
+#include <linux/bitfield.h>
+#include <linux/pci.h>
+
+#include "versal-pci.h"
+#include "versal-pci-comm-chan.h"
+
+#define COMM_CHAN_PROTOCOL_VERSION 1
+#define COMM_CHAN_PCI_BAR_OFF 0x2000000
+#define COMM_CHAN_TIMER (HZ / 10)
+#define COMM_CHAN_DATA_LEN 16
+#define COMM_CHAN_DATA_TYPE_MASK GENMASK(7, 0)
+#define COMM_CHAN_DATA_EOM_MASK BIT(31)
+#define COMM_CHAN_MSG_END BIT(31)
+
+#define COMM_CHAN_REG_WRDATA_OFF 0x0
+#define COMM_CHAN_REG_RDDATA_OFF 0x8
+#define COMM_CHAN_REG_STATUS_OFF 0x10
+#define COMM_CHAN_REG_ERROR_OFF 0x14
+#define COMM_CHAN_REG_RIT_OFF 0x1C
+#define COMM_CHAN_REG_IS_OFF 0x20
+#define COMM_CHAN_REG_IE_OFF 0x24
+#define COMM_CHAN_REG_CTRL_OFF 0x2C
+#define COMM_CHAN_REGS_SIZE SZ_4K
+
+#define COMM_CHAN_IRQ_DISABLE_ALL 0
+#define COMM_CHAN_IRQ_RECEIVE_ENABLE BIT(1)
+#define COMM_CHAN_IRQ_CLEAR_ALL GENMASK(2, 0)
+#define COMM_CHAN_CLEAR_FIFO GENMASK(1, 0)
+#define COMM_CHAN_RECEIVE_THRESHOLD 15
+
+enum comm_chan_req_ops {
+ COMM_CHAN_REQ_OPS_UNKNOWN = 0,
+ COMM_CHAN_REQ_OPS_HOT_RESET = 5,
+ COMM_CHAN_REQ_OPS_GET_PROTOCOL_VERSION = 19,
+ COMM_CHAN_REQ_OPS_LOAD_XCLBIN_UUID = 20,
+ COMM_CHAN_REQ_OPS_MAX,
+};
+
+enum comm_chan_msg_type {
+ COMM_CHAN_MSG_INVALID = 0,
+ COMM_CHAN_MSG_START = 2,
+ COMM_CHAN_MSG_BODY = 3,
+};
+
+enum comm_chan_msg_service_type {
+ COMM_CHAN_MSG_SRV_RESPONSE = BIT(0),
+ COMM_CHAN_MSG_SRV_REQUEST = BIT(1),
+};
+
+struct comm_chan_hw_msg {
+ struct {
+ __u32 type;
+ __u32 payload_size;
+ } header;
+ struct {
+ __u64 id;
+ __u32 flags;
+ __u32 size;
+ __u32 payload[COMM_CHAN_DATA_LEN - 6];
+ } body;
+} __packed;
+
+struct comm_chan_srv_req {
+ __u64 flags;
+ __u32 opcode;
+ __u32 data[];
+};
+
+struct comm_chan_srv_ver_resp {
+ __u32 version;
+};
+
+struct comm_chan_srv_uuid_resp {
+ __u32 ret;
+};
+
+struct comm_chan_msg {
+ __u64 id;
+ __u32 flags;
+ __u32 len;
+ __u32 bytes_read;
+ __u32 data[10];
+};
+
+struct comm_chan_device {
+ struct versal_pci_device *vdev;
+ struct timer_list timer;
+ struct work_struct work;
+};
+
+static inline struct comm_chan_device *to_ccdev_work(struct work_struct *w)
+{
+ return container_of(w, struct comm_chan_device, work);
+}
+
+static inline struct comm_chan_device *to_ccdev_timer(struct timer_list *t)
+{
+ return container_of(t, struct comm_chan_device, timer);
+}
+
+static inline u32 comm_chan_read(struct comm_chan_device *cdev, u32 offset)
+{
+ return readl(cdev->vdev->io_regs + COMM_CHAN_PCI_BAR_OFF + offset);
+}
+
+static inline void comm_chan_write(struct comm_chan_device *cdev, u32 offset, const u32 value)
+{
+ writel(value, cdev->vdev->io_regs + COMM_CHAN_PCI_BAR_OFF + offset);
+}
+
+static u32 comm_chan_set_uuid_resp(void *payload, int ret)
+{
+ struct comm_chan_srv_uuid_resp *resp = (struct comm_chan_srv_uuid_resp *)payload;
+ u32 resp_len = sizeof(*resp);
+
+ resp->ret = (u32)ret;
+
+ return resp_len;
+}
+
+static u32 comm_chan_set_protocol_resp(void *payload)
+{
+ struct comm_chan_srv_ver_resp *resp = (struct comm_chan_srv_ver_resp *)payload;
+ u32 resp_len = sizeof(*resp);
+
+ resp->version = COMM_CHAN_PROTOCOL_VERSION;
+
+ return resp_len;
+}
+
+static void comm_chan_send_response(struct comm_chan_device *ccdev, u64 msg_id, void *payload)
+{
+ struct comm_chan_srv_req *req = (struct comm_chan_srv_req *)payload;
+ struct versal_pci_device *vdev = ccdev->vdev;
+ struct comm_chan_hw_msg response = {0};
+ u32 size;
+ int ret;
+ u8 i;
+
+ switch (req->opcode) {
+ case COMM_CHAN_REQ_OPS_GET_PROTOCOL_VERSION:
+ size = comm_chan_set_protocol_resp(response.body.payload);
+ break;
+ case COMM_CHAN_REQ_OPS_LOAD_XCLBIN_UUID:
+ ret = versal_pci_load_xclbin(vdev, (uuid_t *)req->data);
+ size = comm_chan_set_uuid_resp(response.body.payload, ret);
+ break;
+ default:
+ vdev_err(vdev, "Unsupported request opcode: %d", req->opcode);
+ *response.body.payload = -1;
+ size = sizeof(int);
+ }
+
+ vdev_dbg(vdev, "Response opcode: %d", req->opcode);
+
+ response.header.type = COMM_CHAN_MSG_START | COMM_CHAN_MSG_END;
+ response.header.payload_size = size;
+
+ response.body.flags = COMM_CHAN_MSG_SRV_RESPONSE;
+ response.body.size = size;
+ response.body.id = msg_id;
+
+ for (i = 0; i < COMM_CHAN_DATA_LEN; i++)
+ comm_chan_write(ccdev, COMM_CHAN_REG_WRDATA_OFF, ((u32 *)&response)[i]);
+}
+
+#define STATUS_IS_READY(status) ((status) & BIT(1))
+#define STATUS_IS_ERROR(status) ((status) & BIT(2))
+
+static void comm_chan_check_request(struct work_struct *w)
+{
+ struct comm_chan_device *ccdev = to_ccdev_work(w);
+ u32 status = 0, request[COMM_CHAN_DATA_LEN] = {0};
+ struct comm_chan_hw_msg *hw_msg;
+ u8 type, eom;
+ int i;
+
+ status = comm_chan_read(ccdev, COMM_CHAN_REG_IS_OFF);
+ if (!STATUS_IS_READY(status))
+ return;
+ if (STATUS_IS_ERROR(status)) {
+ vdev_err(ccdev->vdev, "An error has occurred with comms");
+ return;
+ }
+
+ /* ACK status */
+ comm_chan_write(ccdev, COMM_CHAN_REG_IS_OFF, status);
+
+ for (i = 0; i < COMM_CHAN_DATA_LEN; i++)
+ request[i] = comm_chan_read(ccdev, COMM_CHAN_REG_RDDATA_OFF);
+
+ hw_msg = (struct comm_chan_hw_msg *)request;
+ type = FIELD_GET(COMM_CHAN_DATA_TYPE_MASK, hw_msg->header.type);
+ eom = FIELD_GET(COMM_CHAN_DATA_EOM_MASK, hw_msg->header.type);
+
+ /* Only support fixed size 64B messages */
+ if (!eom || type != COMM_CHAN_MSG_START) {
+ vdev_err(ccdev->vdev, "Unsupported message format or length");
+ return;
+ }
+
+ if (hw_msg->body.flags != COMM_CHAN_MSG_SRV_REQUEST) {
+ vdev_err(ccdev->vdev, "Unsupported service request");
+ return;
+ }
+
+ if (hw_msg->body.size > sizeof(hw_msg->body.payload)) {
+ vdev_err(ccdev->vdev, "msg is too big: %d", hw_msg->body.size);
+ return;
+ }
+
+ /* Now decode and respond appropriately */
+ comm_chan_send_response(ccdev, hw_msg->body.id, hw_msg->body.payload);
+}
+
+static void comm_chan_sched_work(struct timer_list *t)
+{
+ struct comm_chan_device *ccdev = to_ccdev_timer(t);
+
+ /* Schedule a work in the general workqueue */
+ schedule_work(&ccdev->work);
+ /* Periodic timer */
+ mod_timer(&ccdev->timer, jiffies + COMM_CHAN_TIMER);
+}
+
+static void comm_chan_config(struct comm_chan_device *ccdev)
+{
+ /* Disable interrupts */
+ comm_chan_write(ccdev, COMM_CHAN_REG_IE_OFF, COMM_CHAN_IRQ_DISABLE_ALL);
+ /* Clear request and response FIFOs */
+ comm_chan_write(ccdev, COMM_CHAN_REG_CTRL_OFF, COMM_CHAN_CLEAR_FIFO);
+ /* Clear interrupts */
+ comm_chan_write(ccdev, COMM_CHAN_REG_IS_OFF, COMM_CHAN_IRQ_CLEAR_ALL);
+ /* Setup RIT reg */
+ comm_chan_write(ccdev, COMM_CHAN_REG_RIT_OFF, COMM_CHAN_RECEIVE_THRESHOLD);
+ /* Enable RIT interrupt */
+ comm_chan_write(ccdev, COMM_CHAN_REG_IE_OFF, COMM_CHAN_IRQ_RECEIVE_ENABLE);
+
+ /* Create and schedule timer to do recurring work */
+ INIT_WORK(&ccdev->work, &comm_chan_check_request);
+ timer_setup(&ccdev->timer, &comm_chan_sched_work, 0);
+ mod_timer(&ccdev->timer, jiffies + COMM_CHAN_TIMER);
+}
+
+void versal_pci_comm_chan_fini(struct comm_chan_device *ccdev)
+{
+ /* First stop scheduling new work then cancel work */
+ del_timer_sync(&ccdev->timer);
+ cancel_work_sync(&ccdev->work);
+}
+
+struct comm_chan_device *versal_pci_comm_chan_init(struct versal_pci_device *vdev)
+{
+ struct comm_chan_device *ccdev;
+
+ ccdev = devm_kzalloc(&vdev->pdev->dev, sizeof(*ccdev), GFP_KERNEL);
+ if (!ccdev)
+ return ERR_PTR(-ENOMEM);
+
+ ccdev->vdev = vdev;
+
+ comm_chan_config(ccdev);
+ return ccdev;
+}
diff --git a/drivers/fpga/amd/versal-pci-comm-chan.h b/drivers/fpga/amd/versal-pci-comm-chan.h
new file mode 100644
index 000000000000..7605abc5527f
--- /dev/null
+++ b/drivers/fpga/amd/versal-pci-comm-chan.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Driver for Versal PCIe device
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc. All rights reserved.
+ */
+
+#ifndef __VERSAL_PCI_COMM_CHAN_H
+#define __VERSAL_PCI_COMM_CHAN_H
+
+struct comm_chan_device *versal_pci_comm_chan_init(struct versal_pci_device *vdev);
+void versal_pci_comm_chan_fini(struct comm_chan_device *ccdev);
+
+#endif /* __VERSAL_PCI_COMM_CHAN_H */
diff --git a/drivers/fpga/amd/versal-pci-main.c b/drivers/fpga/amd/versal-pci-main.c
index a10ccf86802b..a3b83197c6d5 100644
--- a/drivers/fpga/amd/versal-pci-main.c
+++ b/drivers/fpga/amd/versal-pci-main.c
@@ -8,6 +8,7 @@
#include <linux/pci.h>
#include "versal-pci.h"
+#include "versal-pci-comm-chan.h"
#define DRV_NAME "amd-versal-pci"
@@ -238,6 +239,7 @@ static void versal_pci_device_teardown(struct versal_pci_device *vdev)
{
versal_pci_fpga_fini(vdev->fdev);
versal_pci_fw_upload_fini(vdev->fwdev);
+ versal_pci_comm_chan_fini(vdev->ccdev);
}
static int versal_pci_device_setup(struct versal_pci_device *vdev)
@@ -251,15 +253,23 @@ static int versal_pci_device_setup(struct versal_pci_device *vdev)
return ret;
}
+ vdev->ccdev = versal_pci_comm_chan_init(vdev);
+ if (IS_ERR(vdev->ccdev)) {
+ ret = PTR_ERR(vdev->ccdev);
+ vdev_err(vdev, "Failed to init comms channel, err %d", ret);
+ goto upload_fini;
+ }
+
vdev->fdev = versal_pci_fpga_init(vdev);
if (IS_ERR(vdev->fdev)) {
ret = PTR_ERR(vdev->fdev);
vdev_err(vdev, "Failed to init FPGA manager, err %d", ret);
- goto upload_fini;
+ goto comm_chan_fini;
}
return 0;
-
+comm_chan_fini:
+ versal_pci_comm_chan_fini(vdev->ccdev);
upload_fini:
versal_pci_fw_upload_fini(vdev->fwdev);
diff --git a/drivers/fpga/amd/versal-pci.h b/drivers/fpga/amd/versal-pci.h
index 1509bd0532ea..6c1ca3ce505d 100644
--- a/drivers/fpga/amd/versal-pci.h
+++ b/drivers/fpga/amd/versal-pci.h
@@ -26,6 +26,7 @@
dev_dbg(&(vdev)->pdev->dev, fmt, ##args)
struct versal_pci_device;
+struct comm_chan_device;
struct axlf_header {
__u64 length;
@@ -69,6 +70,7 @@ struct versal_pci_device {
struct pci_dev *pdev;
struct fpga_device *fdev;
+ struct comm_chan_device *ccdev;
struct firmware_device *fwdev;
struct device *device;
--
2.34.1
* [PATCH V2 3/4] drivers/fpga/amd: Add remote queue
2024-12-10 18:37 [PATCH V2 0/4] Add versal-pci driver Yidong Zhang
2024-12-10 18:37 ` [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci Yidong Zhang
2024-12-10 18:37 ` [PATCH V2 2/4] drivers/fpga/amd: Add communication channel Yidong Zhang
@ 2024-12-10 18:37 ` Yidong Zhang
2025-01-26 10:24 ` Christophe JAILLET
2024-12-10 18:37 ` [PATCH V2 4/4] drivers/fpga/amd: Add load xclbin and load firmware Yidong Zhang
2025-01-26 9:27 ` [PATCH V2 0/4] Add versal-pci driver Xu Yilun
4 siblings, 1 reply; 28+ messages in thread
From: Yidong Zhang @ 2024-12-10 18:37 UTC (permalink / raw)
To: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu
Cc: Yidong Zhang, lizhi.hou, Nishad Saraf, Prapul Krishnamurthy
The remote queue is a hardware-queue-based ring buffer service between the
mgmt PF and the PCIe card firmware. It is used for programming the FPGA
Programmable Logic, loading versal firmware, and checking card health status.
Co-developed-by: Nishad Saraf <nishads@amd.com>
Signed-off-by: Nishad Saraf <nishads@amd.com>
Co-developed-by: Prapul Krishnamurthy <prapulk@amd.com>
Signed-off-by: Prapul Krishnamurthy <prapulk@amd.com>
Signed-off-by: Yidong Zhang <yidong.zhang@amd.com>
---
drivers/fpga/amd/Makefile | 3 +-
drivers/fpga/amd/versal-pci-rm-queue.c | 326 +++++++++++++++++++++++
drivers/fpga/amd/versal-pci-rm-queue.h | 21 ++
drivers/fpga/amd/versal-pci-rm-service.h | 209 +++++++++++++++
4 files changed, 558 insertions(+), 1 deletion(-)
create mode 100644 drivers/fpga/amd/versal-pci-rm-queue.c
create mode 100644 drivers/fpga/amd/versal-pci-rm-queue.h
create mode 100644 drivers/fpga/amd/versal-pci-rm-service.h
diff --git a/drivers/fpga/amd/Makefile b/drivers/fpga/amd/Makefile
index 7a604785e5f9..b2ffdbf23400 100644
--- a/drivers/fpga/amd/Makefile
+++ b/drivers/fpga/amd/Makefile
@@ -3,4 +3,5 @@
obj-$(CONFIG_AMD_VERSAL_PCI) += versal-pci.o
versal-pci-$(CONFIG_AMD_VERSAL_PCI) := versal-pci-main.o \
- versal-pci-comm-chan.o
+ versal-pci-comm-chan.o \
+ versal-pci-rm-queue.o
diff --git a/drivers/fpga/amd/versal-pci-rm-queue.c b/drivers/fpga/amd/versal-pci-rm-queue.c
new file mode 100644
index 000000000000..92f2e1165052
--- /dev/null
+++ b/drivers/fpga/amd/versal-pci-rm-queue.c
@@ -0,0 +1,326 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for Versal PCIe device
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc. All rights reserved.
+ */
+
+#include <linux/pci.h>
+
+#include "versal-pci.h"
+#include "versal-pci-rm-queue.h"
+#include "versal-pci-rm-service.h"
+
+static inline struct rm_device *to_rdev_msg_monitor(struct work_struct *w)
+{
+ return container_of(w, struct rm_device, msg_monitor);
+}
+
+static inline struct rm_device *to_rdev_msg_timer(struct timer_list *t)
+{
+ return container_of(t, struct rm_device, msg_timer);
+}
+
+static inline u32 rm_io_read(struct rm_device *rdev, u32 offset)
+{
+ /* TODO */
+ return 0;
+}
+
+static inline int rm_io_write(struct rm_device *rdev, u32 offset, u32 value)
+{
+ /* TODO */
+ return 0;
+}
+
+static inline u32 rm_queue_read(struct rm_device *rdev, u32 offset)
+{
+ /* TODO */
+ return 0;
+}
+
+static inline void rm_queue_write(struct rm_device *rdev, u32 offset, u32 value)
+{
+ /* TODO */
+}
+
+static inline void rm_queue_bulk_read(struct rm_device *rdev, u32 offset,
+ u32 *value, u32 size)
+{
+ /* TODO */
+}
+
+static inline void rm_queue_bulk_write(struct rm_device *rdev, u32 offset,
+ u32 *value, u32 size)
+{
+ /* TODO */
+}
+
+static inline u32 rm_queue_get_cidx(struct rm_device *rdev, enum rm_queue_type type)
+{
+ u32 off;
+
+ if (type == RM_QUEUE_SQ)
+ off = offsetof(struct rm_queue_header, sq_cidx);
+ else
+ off = offsetof(struct rm_queue_header, cq_cidx);
+
+ return rm_queue_read(rdev, off);
+}
+
+static inline void rm_queue_set_cidx(struct rm_device *rdev, enum rm_queue_type type,
+ u32 value)
+{
+ u32 off;
+
+ if (type == RM_QUEUE_SQ)
+ off = offsetof(struct rm_queue_header, sq_cidx);
+ else
+ off = offsetof(struct rm_queue_header, cq_cidx);
+
+ rm_queue_write(rdev, off, value);
+}
+
+static inline u32 rm_queue_get_pidx(struct rm_device *rdev, enum rm_queue_type type)
+{
+ if (type == RM_QUEUE_SQ)
+ return rm_io_read(rdev, RM_IO_SQ_PIDX_OFF);
+ else
+ return rm_io_read(rdev, RM_IO_CQ_PIDX_OFF);
+}
+
+static inline int rm_queue_set_pidx(struct rm_device *rdev,
+ enum rm_queue_type type, u32 value)
+{
+ if (type == RM_QUEUE_SQ)
+ return rm_io_write(rdev, RM_IO_SQ_PIDX_OFF, value);
+ else
+ return rm_io_write(rdev, RM_IO_CQ_PIDX_OFF, value);
+}
+
+static inline u32 rm_queue_get_sq_slot_offset(struct rm_device *rdev)
+{
+ u32 index;
+
+ if ((rdev->sq.pidx - rdev->sq.cidx) >= rdev->queue_size)
+ return RM_INVALID_SLOT;
+
+ index = rdev->sq.pidx & (rdev->queue_size - 1);
+ return rdev->sq.offset + RM_CMD_SQ_SLOT_SIZE * index;
+}
+
+static inline u32 rm_queue_get_cq_slot_offset(struct rm_device *rdev)
+{
+ u32 index;
+
+ index = rdev->cq.cidx & (rdev->queue_size - 1);
+ return rdev->cq.offset + RM_CMD_CQ_SLOT_SIZE * index;
+}
+
+static int rm_queue_submit_cmd(struct rm_cmd *cmd)
+{
+ struct versal_pci_device *vdev = cmd->rdev->vdev;
+ struct rm_device *rdev = cmd->rdev;
+ u32 offset;
+ int ret;
+
+ mutex_lock(&rdev->queue);
+
+ offset = rm_queue_get_sq_slot_offset(rdev);
+ if (!offset) {
+ vdev_err(vdev, "No SQ slot available");
+ ret = -ENOSPC;
+ goto exit;
+ }
+
+ rm_queue_bulk_write(rdev, offset, (u32 *)&cmd->sq_msg,
+ sizeof(cmd->sq_msg));
+
+ ret = rm_queue_set_pidx(rdev, RM_QUEUE_SQ, ++rdev->sq.pidx);
+ if (ret) {
+ vdev_err(vdev, "Failed to update PIDX, ret %d", ret);
+ goto exit;
+ }
+
+ list_add_tail(&cmd->list, &rdev->submitted_cmds);
+exit:
+ mutex_unlock(&rdev->queue);
+ return ret;
+}
+
+void rm_queue_withdraw_cmd(struct rm_cmd *cmd)
+{
+ mutex_lock(&cmd->rdev->queue);
+ list_del(&cmd->list);
+ mutex_unlock(&cmd->rdev->queue);
+}
+
+static int rm_queue_wait_cmd_timeout(struct rm_cmd *cmd, unsigned long timeout)
+{
+ struct versal_pci_device *vdev = cmd->rdev->vdev;
+ int ret;
+
+ if (wait_for_completion_timeout(&cmd->executed, timeout)) {
+ ret = cmd->cq_msg.data.rcode;
+ if (!ret)
+ return 0;
+
+ vdev_err(vdev, "CMD returned with a failure: %d", ret);
+ return ret;
+ }
+
+ /*
+ * Each cmd should be completed before it times out. If we reach here,
+ * the cmd must be withdrawn and a hot reset should be issued.
+ */
+ vdev_err(vdev, "cmd timed out, please reset the card");
+ rm_queue_withdraw_cmd(cmd);
+ return -ETIME;
+}
+
+int rm_queue_send_cmd(struct rm_cmd *cmd, unsigned long timeout)
+{
+ int ret;
+
+ ret = rm_queue_submit_cmd(cmd);
+ if (ret)
+ return ret;
+
+ return rm_queue_wait_cmd_timeout(cmd, timeout);
+}
+
+static int rm_process_msg(struct rm_device *rdev)
+{
+ struct versal_pci_device *vdev = rdev->vdev;
+ struct rm_cmd *cmd, *next;
+ struct rm_cmd_cq_hdr header;
+ u32 offset;
+
+ offset = rm_queue_get_cq_slot_offset(rdev);
+ if (!offset) {
+ vdev_err(vdev, "Invalid CQ offset");
+ return -EINVAL;
+ }
+
+ rm_queue_bulk_read(rdev, offset, (u32 *)&header, sizeof(header));
+
+ list_for_each_entry_safe(cmd, next, &rdev->submitted_cmds, list) {
+ u32 value = 0;
+
+ if (cmd->sq_msg.hdr.id != header.id)
+ continue;
+
+ rm_queue_bulk_read(rdev, offset + sizeof(cmd->cq_msg.hdr),
+ (u32 *)&cmd->cq_msg.data,
+ sizeof(cmd->cq_msg.data));
+
+ rm_queue_write(rdev, offset, value);
+
+ list_del(&cmd->list);
+ complete(&cmd->executed);
+ return 0;
+ }
+
+ vdev_err(vdev, "Unknown cmd ID %d found in CQ", header.id);
+ return -EFAULT;
+}
+
+static void rm_check_msg(struct work_struct *w)
+{
+ struct rm_device *rdev = to_rdev_msg_monitor(w);
+ int ret;
+
+ mutex_lock(&rdev->queue);
+
+ rdev->sq.cidx = rm_queue_get_cidx(rdev, RM_QUEUE_SQ);
+ rdev->cq.pidx = rm_queue_get_pidx(rdev, RM_QUEUE_CQ);
+
+ while (rdev->cq.cidx < rdev->cq.pidx) {
+ ret = rm_process_msg(rdev);
+ if (ret)
+ break;
+
+ rdev->cq.cidx++;
+
+ rm_queue_set_cidx(rdev, RM_QUEUE_CQ, rdev->cq.cidx);
+ }
+
+ mutex_unlock(&rdev->queue);
+}
+
+static void rm_sched_work(struct timer_list *t)
+{
+ struct rm_device *rdev = to_rdev_msg_timer(t);
+
+ /* Schedule a work in the general workqueue */
+ schedule_work(&rdev->msg_monitor);
+ /* Periodic timer */
+ mod_timer(&rdev->msg_timer, jiffies + RM_COMPLETION_TIMER);
+}
+
+void rm_queue_fini(struct rm_device *rdev)
+{
+ del_timer_sync(&rdev->msg_timer);
+ cancel_work_sync(&rdev->msg_monitor);
+ mutex_destroy(&rdev->queue);
+}
+
+int rm_queue_init(struct rm_device *rdev)
+{
+ struct versal_pci_device *vdev = rdev->vdev;
+ struct rm_queue_header header = {0};
+ int ret;
+
+ INIT_LIST_HEAD(&rdev->submitted_cmds);
+ mutex_init(&rdev->queue);
+
+ rm_queue_bulk_read(rdev, RM_HDR_OFF, (u32 *)&header, sizeof(header));
+
+ if (header.magic != RM_QUEUE_HDR_MAGIC_NUM) {
+ vdev_err(vdev, "Invalid RM queue header");
+ ret = -ENODEV;
+ goto error;
+ }
+
+ if (!header.version) {
+ vdev_err(vdev, "Invalid RM queue header version");
+ ret = -ENODEV;
+ goto error;
+ }
+
+ sema_init(&rdev->sq.data_lock, 1);
+ sema_init(&rdev->cq.data_lock, 1);
+ rdev->queue_size = header.size;
+ rdev->sq.offset = header.sq_off;
+ rdev->cq.offset = header.cq_off;
+ rdev->sq.type = RM_QUEUE_SQ;
+ rdev->cq.type = RM_QUEUE_CQ;
+ rdev->sq.data_size = rdev->queue_buffer_size - RM_CMD_CQ_BUFFER_SIZE;
+ rdev->cq.data_size = RM_CMD_CQ_BUFFER_SIZE;
+ rdev->sq.data_offset = rdev->queue_buffer_start +
+ RM_CMD_CQ_BUFFER_OFFSET + RM_CMD_CQ_BUFFER_SIZE;
+ rdev->cq.data_offset = rdev->queue_buffer_start +
+ RM_CMD_CQ_BUFFER_OFFSET;
+ rdev->sq.cidx = header.sq_cidx;
+ rdev->cq.cidx = header.cq_cidx;
+
+ rdev->sq.pidx = rm_queue_get_pidx(rdev, RM_QUEUE_SQ);
+ rdev->cq.pidx = rm_queue_get_pidx(rdev, RM_QUEUE_CQ);
+
+ if (rdev->cq.cidx != rdev->cq.pidx) {
+ vdev_warn(vdev, "Clearing stale completions");
+ rdev->cq.cidx = rdev->cq.pidx;
+ rm_queue_set_cidx(rdev, RM_QUEUE_CQ, rdev->cq.cidx);
+ }
+
+ /* Create and schedule timer to do recurring work */
+ INIT_WORK(&rdev->msg_monitor, &rm_check_msg);
+ timer_setup(&rdev->msg_timer, &rm_sched_work, 0);
+ mod_timer(&rdev->msg_timer, jiffies + RM_COMPLETION_TIMER);
+
+ return 0;
+error:
+ mutex_destroy(&rdev->queue);
+ return ret;
+}
diff --git a/drivers/fpga/amd/versal-pci-rm-queue.h b/drivers/fpga/amd/versal-pci-rm-queue.h
new file mode 100644
index 000000000000..3bc7102c6a58
--- /dev/null
+++ b/drivers/fpga/amd/versal-pci-rm-queue.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Driver for Versal PCIe device
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc. All rights reserved.
+ */
+
+#ifndef __RM_QUEUE_H
+#define __RM_QUEUE_H
+
+struct rm_device;
+
+/* rm queue hardware setup */
+int rm_queue_init(struct rm_device *rdev);
+void rm_queue_fini(struct rm_device *rdev);
+
+/* rm queue common API */
+int rm_queue_send_cmd(struct rm_cmd *cmd, unsigned long timeout);
+void rm_queue_withdraw_cmd(struct rm_cmd *cmd);
+
+#endif /* __RM_QUEUE_H */
diff --git a/drivers/fpga/amd/versal-pci-rm-service.h b/drivers/fpga/amd/versal-pci-rm-service.h
new file mode 100644
index 000000000000..85a78257770a
--- /dev/null
+++ b/drivers/fpga/amd/versal-pci-rm-service.h
@@ -0,0 +1,209 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Driver for Versal PCIe device
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc. All rights reserved.
+ */
+
+#ifndef __RM_SERVICE_H
+#define __RM_SERVICE_H
+
+#define RM_HDR_OFF 0x0
+#define RM_HDR_MAGIC_NUM 0x564D5230
+#define RM_QUEUE_HDR_MAGIC_NUM 0x5847513F
+#define RM_PCI_IO_BAR_OFF 0x2010000
+#define RM_PCI_IO_SIZE SZ_4K
+#define RM_PCI_SHMEM_BAR_OFF 0x8000000
+#define RM_PCI_SHMEM_SIZE SZ_128M
+#define RM_PCI_SHMEM_HDR_SIZE 0x28
+
+#define RM_QUEUE_HDR_MAGIC_NUM_OFF 0x0
+#define RM_IO_SQ_PIDX_OFF 0x0
+#define RM_IO_CQ_PIDX_OFF 0x100
+
+#define RM_CMD_ID_MIN 1
+#define RM_CMD_ID_MAX (BIT(17) - 1)
+#define RM_CMD_SQ_HDR_OPS_MSK GENMASK(15, 0)
+#define RM_CMD_SQ_HDR_SIZE_MSK GENMASK(14, 0)
+#define RM_CMD_SQ_SLOT_SIZE SZ_512
+#define RM_CMD_CQ_SLOT_SIZE SZ_16
+#define RM_CMD_CQ_BUFFER_SIZE (1024 * 1024)
+#define RM_CMD_CQ_BUFFER_OFFSET 0x0
+#define RM_CMD_LOG_PAGE_TYPE_MASK GENMASK(15, 0)
+#define RM_CMD_VMR_CONTROL_MSK GENMASK(10, 8)
+#define RM_CMD_VMR_CONTROL_PS_MASK BIT(9)
+
+#define RM_CMD_WAIT_CONFIG_TIMEOUT msecs_to_jiffies(10 * 1000)
+#define RM_CMD_WAIT_DOWNLOAD_TIMEOUT msecs_to_jiffies(300 * 1000)
+
+#define RM_COMPLETION_TIMER (HZ / 10)
+#define RM_HEALTH_CHECK_TIMER (HZ)
+
+#define RM_INVALID_SLOT 0
+
+enum rm_queue_opcode {
+ RM_QUEUE_OP_LOAD_XCLBIN = 0x0,
+ RM_QUEUE_OP_GET_LOG_PAGE = 0x8,
+ RM_QUEUE_OP_LOAD_FW = 0xA,
+ RM_QUEUE_OP_LOAD_APU_FW = 0xD,
+ RM_QUEUE_OP_VMR_CONTROL = 0xE,
+ RM_QUEUE_OP_IDENTIFY = 0x202,
+};
+
+struct rm_cmd_sq_hdr {
+ __u16 opcode;
+ __u16 msg_size;
+ __u16 id;
+ __u16 reserved;
+} __packed;
+
+struct rm_cmd_cq_hdr {
+ __u16 id;
+ __u16 reserved;
+} __packed;
+
+struct rm_cmd_sq_bin {
+ __u64 address;
+ __u32 size;
+ __u32 reserved1;
+ __u32 reserved2;
+ __u32 reserved3;
+ __u64 reserved4;
+} __packed;
+
+struct rm_cmd_sq_log_page {
+ __u64 address;
+ __u32 size;
+ __u32 reserved1;
+ __u32 type;
+ __u32 reserved2;
+} __packed;
+
+struct rm_cmd_sq_ctrl {
+ __u32 status;
+} __packed;
+
+struct rm_cmd_sq_data {
+ union {
+ struct rm_cmd_sq_log_page page;
+ struct rm_cmd_sq_bin bin;
+ struct rm_cmd_sq_ctrl ctrl;
+ };
+} __packed;
+
+struct rm_cmd_cq_identify {
+ __u16 major;
+ __u16 minor;
+ __u32 reserved;
+} __packed;
+
+struct rm_cmd_cq_log_page {
+ __u32 len;
+ __u32 reserved;
+} __packed;
+
+struct rm_cmd_cq_control {
+ __u16 status;
+ __u16 reserved1;
+ __u32 reserved2;
+} __packed;
+
+struct rm_cmd_cq_data {
+ union {
+ struct rm_cmd_cq_identify identify;
+ struct rm_cmd_cq_log_page page;
+ struct rm_cmd_cq_control ctrl;
+ __u32 reserved[2];
+ };
+ __u32 rcode;
+} __packed;
+
+struct rm_cmd_sq_msg {
+ struct rm_cmd_sq_hdr hdr;
+ struct rm_cmd_sq_data data;
+} __packed;
+
+struct rm_cmd_cq_msg {
+ struct rm_cmd_cq_hdr hdr;
+ struct rm_cmd_cq_data data;
+} __packed;
+
+struct rm_cmd {
+ struct rm_device *rdev;
+ struct list_head list;
+ struct completion executed;
+ struct rm_cmd_sq_msg sq_msg;
+ struct rm_cmd_cq_msg cq_msg;
+ enum rm_queue_opcode opcode;
+ __u8 *buffer;
+ ssize_t size;
+};
+
+enum rm_queue_type {
+ RM_QUEUE_SQ,
+ RM_QUEUE_CQ
+};
+
+enum rm_cmd_log_page_type {
+ RM_CMD_LOG_PAGE_AXI_TRIP_STATUS = 0x0,
+ RM_CMD_LOG_PAGE_FW_ID = 0xA,
+};
+
+struct rm_queue {
+ enum rm_queue_type type;
+ __u32 pidx;
+ __u32 cidx;
+ __u32 offset;
+ __u32 data_offset;
+ __u32 data_size;
+ struct semaphore data_lock;
+};
+
+struct rm_queue_header {
+ __u32 magic;
+ __u32 version;
+ __u32 size;
+ __u32 sq_off;
+ __u32 sq_slot_size;
+ __u32 cq_off;
+ __u32 sq_cidx;
+ __u32 cq_cidx;
+};
+
+struct rm_header {
+ __u32 magic;
+ __u32 queue_base;
+ __u32 queue_size;
+ __u32 status_off;
+ __u32 status_len;
+ __u32 log_index;
+ __u32 log_off;
+ __u32 log_size;
+ __u32 data_start;
+ __u32 data_end;
+};
+
+struct rm_device {
+ struct versal_pci_device *vdev;
+
+ struct rm_header rm_metadata;
+ __u32 queue_buffer_start;
+ __u32 queue_buffer_size;
+ __u32 queue_base;
+
+ /* Lock to queue access */
+ struct mutex queue;
+ struct rm_queue sq;
+ struct rm_queue cq;
+ __u32 queue_size;
+
+ struct timer_list msg_timer;
+ struct work_struct msg_monitor;
+ struct timer_list health_timer;
+ struct work_struct health_monitor;
+ struct list_head submitted_cmds;
+
+ __u32 firewall_tripped;
+};
+
+#endif /* __RM_SERVICE_H */
--
2.34.1
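The completion-polling path in this patch (timer fires, schedules `msg_monitor`, which drains the CQ in `rm_check_msg()`) reduces to a small consumer loop. Below is a hypothetical userspace model of just that drain step; `struct cq_model`, `process()`, and `drain_cq()` are illustrative stand-ins, with `process()` playing the role of `rm_process_msg()`.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal model of the CQ drain loop in rm_check_msg(): snapshot the
 * producer index, then consume entries until cidx catches up. */
struct cq_model {
	uint32_t pidx;      /* producer index written by firmware */
	uint32_t cidx;      /* consumer index owned by the driver */
	uint32_t processed; /* completions handled, for illustration */
};

static int process(struct cq_model *cq)
{
	cq->processed++;
	return 0; /* rm_process_msg() would match the cmd ID here */
}

static void drain_cq(struct cq_model *cq)
{
	while (cq->cidx < cq->pidx) {
		if (process(cq))
			break; /* stop on error, leave cidx at the bad entry */
		cq->cidx++; /* the driver also writes cidx back to hardware */
	}
}
```

In the driver the loop runs under `rdev->queue` and publishes each new `cidx` via `rm_queue_set_cidx()`, so the firmware sees consumption progress even if an entry in the middle fails.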
* [PATCH V2 4/4] drivers/fpga/amd: Add load xclbin and load firmware
2024-12-10 18:37 [PATCH V2 0/4] Add versal-pci driver Yidong Zhang
` (2 preceding siblings ...)
2024-12-10 18:37 ` [PATCH V2 3/4] drivers/fpga/amd: Add remote queue Yidong Zhang
@ 2024-12-10 18:37 ` Yidong Zhang
2025-01-26 9:27 ` [PATCH V2 0/4] Add versal-pci driver Xu Yilun
4 siblings, 0 replies; 28+ messages in thread
From: Yidong Zhang @ 2024-12-10 18:37 UTC (permalink / raw)
To: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu
Cc: Yidong Zhang, lizhi.hou, Nishad Saraf, Prapul Krishnamurthy
- programming FW
  - The base FW is downloaded onto the flash of the card.
  - The APU FW is downloaded once after a POR (power on reset).
  - Reloading the MgmtPF driver will not change any existing hardware.
- programming FPGA hardware binaries - PL Data
  - using fpga framework ops to support re-programming the FPGA
  - the re-programming request is initiated from the existing UserPF
    driver only, and the MgmtPF driver loads the matching PL Data after
    receiving the request over the communication channel. The matching
    PL Data is indexed by the PL Data UUID and the Base FW UUID.
    - The Base FW UUID identifies the unique base hardware. Often
      called the interface UUID.
    - The PL Data UUID identifies the unique PL design generated on top
      of the base hardware. Often called the xclbin UUID.
    - Example:
      4fdebe35[...trimmed...]_96df7d[...trimmed...].xclbin
      |                     | |                   |
      +---- xclbin UUID ----+ +- interface UUID --+
Co-developed-by: Nishad Saraf <nishads@amd.com>
Signed-off-by: Nishad Saraf <nishads@amd.com>
Co-developed-by: Prapul Krishnamurthy <prapulk@amd.com>
Signed-off-by: Prapul Krishnamurthy <prapulk@amd.com>
Signed-off-by: Yidong Zhang <yidong.zhang@amd.com>
---
drivers/fpga/amd/Makefile | 3 +-
drivers/fpga/amd/versal-pci-main.c | 137 ++++++-
drivers/fpga/amd/versal-pci-rm-queue.c | 14 +-
drivers/fpga/amd/versal-pci-rm-service.c | 497 +++++++++++++++++++++++
drivers/fpga/amd/versal-pci-rm-service.h | 20 +
drivers/fpga/amd/versal-pci.h | 2 +
6 files changed, 649 insertions(+), 24 deletions(-)
create mode 100644 drivers/fpga/amd/versal-pci-rm-service.c
diff --git a/drivers/fpga/amd/Makefile b/drivers/fpga/amd/Makefile
index b2ffdbf23400..ffe4ead0d55e 100644
--- a/drivers/fpga/amd/Makefile
+++ b/drivers/fpga/amd/Makefile
@@ -4,4 +4,5 @@ obj-$(CONFIG_AMD_VERSAL_PCI) += versal-pci.o
versal-pci-$(CONFIG_AMD_VERSAL_PCI) := versal-pci-main.o \
versal-pci-comm-chan.o \
- versal-pci-rm-queue.o
+ versal-pci-rm-queue.o \
+ versal-pci-rm-service.o
diff --git a/drivers/fpga/amd/versal-pci-main.c b/drivers/fpga/amd/versal-pci-main.c
index a3b83197c6d5..b56449932c05 100644
--- a/drivers/fpga/amd/versal-pci-main.c
+++ b/drivers/fpga/amd/versal-pci-main.c
@@ -9,6 +9,8 @@
#include "versal-pci.h"
#include "versal-pci-comm-chan.h"
+#include "versal-pci-rm-service.h"
+#include "versal-pci-rm-queue.h"
#define DRV_NAME "amd-versal-pci"
@@ -18,22 +20,55 @@
static int versal_pci_fpga_write_init(struct fpga_manager *mgr, struct fpga_image_info *info,
const char *buf, size_t count)
{
- /* TODO */
- return 0;
+ struct fpga_device *fdev = mgr->priv;
+ struct fw_tnx *tnx = &fdev->fw;
+ int ret;
+
+ ret = rm_queue_create_cmd(fdev->vdev->rdev, tnx->opcode, &tnx->cmd);
+ if (ret) {
+ fdev->state = FPGA_MGR_STATE_WRITE_INIT_ERR;
+ return ret;
+ }
+
+ fdev->state = FPGA_MGR_STATE_WRITE_INIT;
+ return ret;
}
static int versal_pci_fpga_write(struct fpga_manager *mgr, const char *buf,
size_t count)
{
- /* TODO */
- return 0;
+ struct fpga_device *fdev = mgr->priv;
+ int ret;
+
+ ret = rm_queue_data_init(fdev->fw.cmd, buf, count);
+ if (ret) {
+ fdev->state = FPGA_MGR_STATE_WRITE_ERR;
+ rm_queue_destory_cmd(fdev->fw.cmd);
+ return ret;
+ }
+
+ fdev->state = FPGA_MGR_STATE_WRITE;
+ return ret;
}
static int versal_pci_fpga_write_complete(struct fpga_manager *mgr,
struct fpga_image_info *info)
{
- /* TODO */
- return 0;
+ struct fpga_device *fdev = mgr->priv;
+ int ret;
+
+ ret = rm_queue_send_cmd(fdev->fw.cmd, RM_CMD_WAIT_DOWNLOAD_TIMEOUT);
+ if (ret) {
+ fdev->state = FPGA_MGR_STATE_WRITE_COMPLETE_ERR;
+ vdev_err(fdev->vdev, "Send cmd failed:%d, cid:%d", ret, fdev->fw.id);
+ } else {
+ fdev->state = FPGA_MGR_STATE_WRITE_COMPLETE;
+ }
+
+ rm_queue_data_fini(fdev->fw.cmd);
+ rm_queue_destory_cmd(fdev->fw.cmd);
+ memset(&fdev->fw, 0, sizeof(fdev->fw));
+ return ret;
}
static enum fpga_mgr_states versal_pci_fpga_state(struct fpga_manager *mgr)
@@ -97,10 +132,20 @@ static struct fpga_device *versal_pci_fpga_init(struct versal_pci_device *vdev)
return ERR_PTR(ret);
}
- /* Place holder for rm_queue_get_fw_id(vdev->rdev) */
+ ret = rm_queue_get_fw_id(vdev->rdev);
+ if (ret) {
+ vdev_warn(vdev, "Failed to get fw_id");
+ ret = -EINVAL;
+ goto unregister_fpga_mgr;
+ }
versal_pci_uuid_parse(vdev, &vdev->intf_uuid);
return fdev;
+
+unregister_fpga_mgr:
+ fpga_mgr_unregister(fdev->mgr);
+
+ return ERR_PTR(ret);
}
static int versal_pci_program_axlf(struct versal_pci_device *vdev, char *data, size_t size)
@@ -161,31 +206,84 @@ int versal_pci_load_xclbin(struct versal_pci_device *vdev, uuid_t *xuuid)
static enum fw_upload_err versal_pci_fw_prepare(struct fw_upload *fw_upload, const u8 *data,
u32 size)
{
- /* TODO */
+ struct firmware_device *fwdev = fw_upload->dd_handle;
+ struct axlf *xsabin = (struct axlf *)data;
+ int ret;
+
+ ret = memcmp(xsabin->magic, VERSAL_XCLBIN_MAGIC_ID, sizeof(VERSAL_XCLBIN_MAGIC_ID));
+ if (ret) {
+ vdev_err(fwdev->vdev, "Invalid device firmware");
+ return FW_UPLOAD_ERR_INVALID_SIZE;
+ }
+
+ /* Firmware size must match the header length, be at least sizeof(struct axlf), and never exceed 1G */
+ if (!size || size != xsabin->header.length || size < sizeof(*xsabin) ||
+ size > 1024 * 1024 * 1024) {
+ vdev_err(fwdev->vdev, "Invalid device firmware size");
+ return FW_UPLOAD_ERR_INVALID_SIZE;
+ }
+
+ ret = rm_queue_create_cmd(fwdev->vdev->rdev, RM_QUEUE_OP_LOAD_FW,
+ &fwdev->cmd);
+ if (ret)
+ return FW_UPLOAD_ERR_RW_ERROR;
+
+ uuid_copy(&fwdev->uuid, &xsabin->header.uuid);
return FW_UPLOAD_ERR_NONE;
}
static enum fw_upload_err versal_pci_fw_write(struct fw_upload *fw_upload, const u8 *data,
u32 offset, u32 size, u32 *written)
{
- /* TODO */
+ struct firmware_device *fwdev = fw_upload->dd_handle;
+ int ret;
+
+ ret = rm_queue_data_init(fwdev->cmd, data, size);
+ if (ret)
+ return FW_UPLOAD_ERR_RW_ERROR;
+
+ *written = size;
return FW_UPLOAD_ERR_NONE;
}
static enum fw_upload_err versal_pci_fw_poll_complete(struct fw_upload *fw_upload)
{
- /* TODO */
+ struct firmware_device *fwdev = fw_upload->dd_handle;
+ int ret;
+
+ vdev_info(fwdev->vdev, "Programming device firmware: %pUb", &fwdev->uuid);
+
+ ret = rm_queue_send_cmd(fwdev->cmd, RM_CMD_WAIT_DOWNLOAD_TIMEOUT);
+ if (ret) {
+ vdev_err(fwdev->vdev, "Send cmd failed:%d, cid %d", ret, fwdev->id);
+ return FW_UPLOAD_ERR_HW_ERROR;
+ }
+
+ vdev_info(fwdev->vdev, "Successfully programmed device firmware: %pUb",
+ &fwdev->uuid);
return FW_UPLOAD_ERR_NONE;
}
static void versal_pci_fw_cancel(struct fw_upload *fw_upload)
{
- /* TODO */
+ struct firmware_device *fwdev = fw_upload->dd_handle;
+
+ vdev_warn(fwdev->vdev, "canceled");
+ rm_queue_withdraw_cmd(fwdev->cmd);
}
static void versal_pci_fw_cleanup(struct fw_upload *fw_upload)
{
- /* TODO */
+ struct firmware_device *fwdev = fw_upload->dd_handle;
+
+ if (!fwdev->cmd)
+ return;
+
+ rm_queue_data_fini(fwdev->cmd);
+ rm_queue_destory_cmd(fwdev->cmd);
+
+ fwdev->cmd = NULL;
+ fwdev->id = 0;
}
static const struct fw_upload_ops versal_pci_fw_ops = {
@@ -240,23 +338,31 @@ static void versal_pci_device_teardown(struct versal_pci_device *vdev)
versal_pci_fpga_fini(vdev->fdev);
versal_pci_fw_upload_fini(vdev->fwdev);
versal_pci_comm_chan_fini(vdev->ccdev);
+ versal_pci_rm_fini(vdev->rdev);
}
static int versal_pci_device_setup(struct versal_pci_device *vdev)
{
int ret;
+ vdev->rdev = versal_pci_rm_init(vdev);
+ if (IS_ERR(vdev->rdev)) {
+ ret = PTR_ERR(vdev->rdev);
+ vdev_err(vdev, "Failed to init remote queue, err %d", ret);
+ return ret;
+ }
+
vdev->fwdev = versal_pci_fw_upload_init(vdev);
if (IS_ERR(vdev->fwdev)) {
ret = PTR_ERR(vdev->fwdev);
vdev_err(vdev, "Failed to init FW uploader, err %d", ret);
- return ret;
+ goto rm_fini;
}
vdev->ccdev = versal_pci_comm_chan_init(vdev);
if (IS_ERR(vdev->ccdev)) {
ret = PTR_ERR(vdev->ccdev);
- vdev_err(vdev, "Failed to init comms channel, err %d", ret);
+ vdev_err(vdev, "Failed to init comm channel, err %d", ret);
goto upload_fini;
}
@@ -272,7 +378,8 @@ static int versal_pci_device_setup(struct versal_pci_device *vdev)
versal_pci_comm_chan_fini(vdev->ccdev);
upload_fini:
versal_pci_fw_upload_fini(vdev->fwdev);
-
+rm_fini:
+ versal_pci_rm_fini(vdev->rdev);
return ret;
}
diff --git a/drivers/fpga/amd/versal-pci-rm-queue.c b/drivers/fpga/amd/versal-pci-rm-queue.c
index 92f2e1165052..e62aa6791a95 100644
--- a/drivers/fpga/amd/versal-pci-rm-queue.c
+++ b/drivers/fpga/amd/versal-pci-rm-queue.c
@@ -23,37 +23,35 @@ static inline struct rm_device *to_rdev_msg_timer(struct timer_list *t)
static inline u32 rm_io_read(struct rm_device *rdev, u32 offset)
{
- /* TODO */
- return 0;
+ return rm_reg_read(rdev, RM_PCI_IO_BAR_OFF + offset);
}
static inline int rm_io_write(struct rm_device *rdev, u32 offset, u32 value)
{
- /* TODO */
+ rm_reg_write(rdev, RM_PCI_IO_BAR_OFF + offset, value);
return 0;
}
static inline u32 rm_queue_read(struct rm_device *rdev, u32 offset)
{
- /* TODO */
- return 0;
+ return rm_reg_read(rdev, RM_PCI_SHMEM_BAR_OFF + rdev->queue_base + offset);
}
static inline void rm_queue_write(struct rm_device *rdev, u32 offset, u32 value)
{
- /* TODO */
+ rm_reg_write(rdev, RM_PCI_SHMEM_BAR_OFF + rdev->queue_base + offset, value);
}
static inline void rm_queue_bulk_read(struct rm_device *rdev, u32 offset,
u32 *value, u32 size)
{
- /* TODO */
+ rm_bulk_reg_read(rdev, RM_PCI_SHMEM_BAR_OFF + rdev->queue_base + offset, value, size);
}
static inline void rm_queue_bulk_write(struct rm_device *rdev, u32 offset,
u32 *value, u32 size)
{
- /* TODO */
+ rm_bulk_reg_write(rdev, RM_PCI_SHMEM_BAR_OFF + rdev->queue_base + offset, value, size);
}
static inline u32 rm_queue_get_cidx(struct rm_device *rdev, enum rm_queue_type type)
diff --git a/drivers/fpga/amd/versal-pci-rm-service.c b/drivers/fpga/amd/versal-pci-rm-service.c
new file mode 100644
index 000000000000..0a1d96a432b2
--- /dev/null
+++ b/drivers/fpga/amd/versal-pci-rm-service.c
@@ -0,0 +1,497 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for Versal PCIe device
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc. All rights reserved.
+ */
+
+#include <linux/bitfield.h>
+#include <linux/pci.h>
+#include <linux/vmalloc.h>
+
+#include "versal-pci.h"
+#include "versal-pci-rm-service.h"
+#include "versal-pci-rm-queue.h"
+
+static DEFINE_IDA(rm_cmd_ids);
+
+static void rm_uninstall_health_monitor(struct rm_device *rdev);
+
+static inline struct rm_device *to_rdev_health_monitor(struct work_struct *w)
+{
+ return container_of(w, struct rm_device, health_monitor);
+}
+
+static inline struct rm_device *to_rdev_health_timer(struct timer_list *t)
+{
+ return container_of(t, struct rm_device, health_timer);
+}
+
+u32 rm_reg_read(struct rm_device *rdev, u32 offset)
+{
+ return readl(rdev->vdev->io_regs + offset);
+}
+
+void rm_reg_write(struct rm_device *rdev, u32 offset, const u32 value)
+{
+ writel(value, rdev->vdev->io_regs + offset);
+}
+
+void rm_bulk_reg_read(struct rm_device *rdev, u32 offset, u32 *value, size_t size)
+{
+ void __iomem *src = rdev->vdev->io_regs + offset;
+ void *dst = (void *)value;
+
+ memcpy_fromio(dst, src, size);
+ /* Barrier of reading data from device */
+ rmb();
+}
+
+void rm_bulk_reg_write(struct rm_device *rdev, u32 offset, const void *value, size_t size)
+{
+ void __iomem *dst = rdev->vdev->io_regs + offset;
+
+ memcpy_toio(dst, value, size);
+ /* Barrier of writing data to device */
+ wmb();
+}
+
+static inline u32 rm_shmem_read(struct rm_device *rdev, u32 offset)
+{
+ return rm_reg_read(rdev, RM_PCI_SHMEM_BAR_OFF + offset);
+}
+
+static inline void rm_shmem_bulk_read(struct rm_device *rdev, u32 offset,
+ u32 *value, u32 size)
+{
+ rm_bulk_reg_read(rdev, RM_PCI_SHMEM_BAR_OFF + offset, value, size);
+}
+
+static inline void rm_shmem_bulk_write(struct rm_device *rdev, u32 offset,
+ const void *value, u32 size)
+{
+ rm_bulk_reg_write(rdev, RM_PCI_SHMEM_BAR_OFF + offset, value, size);
+}
+
+void rm_queue_destory_cmd(struct rm_cmd *cmd)
+{
+ ida_free(&rm_cmd_ids, cmd->sq_msg.hdr.id);
+ kfree(cmd);
+}
+
+static int rm_queue_copy_response(struct rm_cmd *cmd, void *buffer, ssize_t len)
+{
+ struct rm_cmd_cq_log_page *result = &cmd->cq_msg.data.page;
+ u64 off = cmd->sq_msg.data.page.address;
+
+ if (!result->len || len < result->len) {
+ vdev_err(cmd->rdev->vdev, "Invalid response or buffer size");
+ return -EINVAL;
+ }
+
+ rm_shmem_bulk_read(cmd->rdev, off, (u32 *)buffer, result->len);
+ return 0;
+}
+
+static void rm_queue_payload_fini(struct rm_cmd *cmd)
+{
+ up(&cmd->rdev->cq.data_lock);
+}
+
+static int rm_queue_payload_init(struct rm_cmd *cmd,
+ enum rm_cmd_log_page_type type)
+{
+ struct rm_device *rdev = cmd->rdev;
+ int ret;
+
+ ret = down_interruptible(&rdev->cq.data_lock);
+ if (ret)
+ return ret;
+
+ cmd->sq_msg.data.page.address = rdev->cq.data_offset;
+ cmd->sq_msg.data.page.size = rdev->cq.data_size;
+ cmd->sq_msg.data.page.reserved1 = 0;
+ cmd->sq_msg.data.page.type = FIELD_PREP(RM_CMD_LOG_PAGE_TYPE_MASK,
+ type);
+ return 0;
+}
+
+void rm_queue_data_fini(struct rm_cmd *cmd)
+{
+ up(&cmd->rdev->sq.data_lock);
+}
+
+int rm_queue_data_init(struct rm_cmd *cmd, const char *buffer, ssize_t size)
+{
+ struct rm_device *rdev = cmd->rdev;
+ int ret;
+
+ if (!size || size > rdev->sq.data_size) {
+ vdev_err(rdev->vdev, "Unsupported file size");
+ return -ENOMEM;
+ }
+
+ ret = down_interruptible(&rdev->sq.data_lock);
+ if (ret)
+ return ret;
+
+ rm_shmem_bulk_write(cmd->rdev, rdev->sq.data_offset, buffer, size);
+
+ cmd->sq_msg.data.bin.address = rdev->sq.data_offset;
+ cmd->sq_msg.data.bin.size = size;
+ return 0;
+}
+
+int rm_queue_create_cmd(struct rm_device *rdev, enum rm_queue_opcode opcode,
+ struct rm_cmd **cmd_ptr)
+{
+ struct rm_cmd *cmd = NULL;
+ int ret, id;
+ u16 size;
+
+ if (rdev->firewall_tripped)
+ return -ENODEV;
+
+ cmd = kzalloc(sizeof(*cmd), GFP_KERNEL);
+ if (!cmd)
+ return -ENOMEM;
+ cmd->rdev = rdev;
+
+ switch (opcode) {
+ case RM_QUEUE_OP_LOAD_XCLBIN:
+ fallthrough;
+ case RM_QUEUE_OP_LOAD_FW:
+ fallthrough;
+ case RM_QUEUE_OP_LOAD_APU_FW:
+ size = sizeof(struct rm_cmd_sq_bin);
+ break;
+ case RM_QUEUE_OP_GET_LOG_PAGE:
+ size = sizeof(struct rm_cmd_sq_log_page);
+ break;
+ case RM_QUEUE_OP_IDENTIFY:
+ size = 0;
+ break;
+ case RM_QUEUE_OP_VMR_CONTROL:
+ size = sizeof(struct rm_cmd_sq_ctrl);
+ break;
+ default:
+ vdev_err(rdev->vdev, "Invalid cmd opcode %d", opcode);
+ ret = -EINVAL;
+ goto error;
+ }
+
+ cmd->opcode = opcode;
+ cmd->sq_msg.hdr.opcode = FIELD_PREP(RM_CMD_SQ_HDR_OPS_MSK, opcode);
+ cmd->sq_msg.hdr.msg_size = FIELD_PREP(RM_CMD_SQ_HDR_SIZE_MSK, size);
+
+ id = ida_alloc_range(&rm_cmd_ids, RM_CMD_ID_MIN, RM_CMD_ID_MAX, GFP_KERNEL);
+ if (id < 0) {
+ vdev_err(rdev->vdev, "Failed to alloc cmd ID: %d", id);
+ ret = id;
+ goto error;
+ }
+ cmd->sq_msg.hdr.id = id;
+
+ init_completion(&cmd->executed);
+
+ *cmd_ptr = cmd;
+ return 0;
+error:
+ kfree(cmd);
+ return ret;
+}
+
+static int rm_queue_verify(struct rm_device *rdev)
+{
+ struct versal_pci_device *vdev = rdev->vdev;
+ struct rm_cmd_cq_identify *result;
+ struct rm_cmd *cmd;
+ u32 major, minor;
+ int ret;
+
+ ret = rm_queue_create_cmd(rdev, RM_QUEUE_OP_IDENTIFY, &cmd);
+ if (ret)
+ return ret;
+
+ ret = rm_queue_send_cmd(cmd, RM_CMD_WAIT_CONFIG_TIMEOUT);
+ if (ret)
+ goto error;
+
+ result = &cmd->cq_msg.data.identify;
+ major = result->major;
+ minor = result->minor;
+ vdev_dbg(vdev, "VMR version %d.%d", major, minor);
+ if (!major) {
+ vdev_err(vdev, "VMR version is unsupported");
+ ret = -EOPNOTSUPP;
+ }
+
+error:
+ rm_queue_destory_cmd(cmd);
+ return ret;
+}
+
+static int rm_check_apu_status(struct rm_device *rdev, bool *status)
+{
+ struct rm_cmd_cq_control *result;
+ struct rm_cmd *cmd;
+ int ret;
+
+ ret = rm_queue_create_cmd(rdev, RM_QUEUE_OP_VMR_CONTROL, &cmd);
+ if (ret)
+ return ret;
+
+ ret = rm_queue_send_cmd(cmd, RM_CMD_WAIT_CONFIG_TIMEOUT);
+ if (ret)
+ goto error;
+
+ result = &cmd->cq_msg.data.ctrl;
+ *status = FIELD_GET(RM_CMD_VMR_CONTROL_PS_MASK, result->status);
+
+ rm_queue_destory_cmd(cmd);
+ return 0;
+
+error:
+ rm_queue_destory_cmd(cmd);
+ return ret;
+}
+
+static int rm_download_apu_fw(struct rm_device *rdev, char *data, ssize_t size)
+{
+ struct rm_cmd *cmd;
+ int ret;
+
+ ret = rm_queue_create_cmd(rdev, RM_QUEUE_OP_LOAD_APU_FW, &cmd);
+ if (ret)
+ return ret;
+
+ ret = rm_queue_data_init(cmd, data, size);
+ if (ret)
+ goto done;
+
+ ret = rm_queue_send_cmd(cmd, RM_CMD_WAIT_DOWNLOAD_TIMEOUT);
+
+done:
+ rm_queue_destory_cmd(cmd);
+ return ret;
+}
+
+int rm_queue_boot_apu(struct rm_device *rdev)
+{
+ char *bin = "xilinx/xrt-versal-apu.xsabin";
+ const struct firmware *fw = NULL;
+ bool status;
+ int ret;
+
+ ret = rm_check_apu_status(rdev, &status);
+ if (ret) {
+ vdev_err(rdev->vdev, "Failed to get APU status");
+ return ret;
+ }
+
+ if (status) {
+ vdev_dbg(rdev->vdev, "APU online. Skipping APU FW download");
+ return 0;
+ }
+
+ ret = request_firmware(&fw, bin, &rdev->vdev->pdev->dev);
+ if (ret) {
+ vdev_warn(rdev->vdev, "Request APU FW %s failed %d", bin, ret);
+ return ret;
+ }
+
+ vdev_dbg(rdev->vdev, "Starting... APU FW download");
+ ret = rm_download_apu_fw(rdev, (char *)fw->data, fw->size);
+ vdev_dbg(rdev->vdev, "Finished... APU FW download %d", ret);
+
+ if (ret)
+ vdev_err(rdev->vdev, "Failed to download APU FW, ret:%d", ret);
+
+ release_firmware(fw);
+
+ return ret;
+}
+
+static void rm_check_health(struct work_struct *w)
+{
+ struct rm_device *rdev = to_rdev_health_monitor(w);
+ u32 max_len = PAGE_SIZE;
+ struct rm_cmd *cmd;
+ int ret;
+
+ ret = rm_queue_create_cmd(rdev, RM_QUEUE_OP_GET_LOG_PAGE, &cmd);
+ if (ret)
+ return;
+
+ ret = rm_queue_payload_init(cmd, RM_CMD_LOG_PAGE_AXI_TRIP_STATUS);
+ if (ret)
+ goto destroy_cmd;
+
+ ret = rm_queue_send_cmd(cmd, RM_CMD_WAIT_CONFIG_TIMEOUT);
+ if (ret == -ETIME || ret == -EINVAL)
+ goto payload_fini;
+
+ if (ret) {
+ u32 log_len = cmd->cq_msg.data.page.len;
+
+ if (log_len > max_len) {
+ vdev_warn(rdev->vdev, "msg size %d is greater than requested %d",
+ log_len, max_len);
+ log_len = max_len;
+ }
+
+ if (log_len) {
+ char *buffer = vzalloc(log_len);
+
+ if (!buffer)
+ goto payload_fini;
+
+ ret = rm_queue_copy_response(cmd, buffer, log_len);
+ if (ret) {
+ vfree(buffer);
+ goto payload_fini;
+ }
+
+ vdev_err(rdev->vdev, "%s", buffer);
+ vfree(buffer);
+
+ } else {
+ vdev_err(rdev->vdev, "firewall check ret %d", ret);
+ }
+
+ rdev->firewall_tripped = 1;
+ }
+
+payload_fini:
+ rm_queue_payload_fini(cmd);
+destroy_cmd:
+ rm_queue_destory_cmd(cmd);
+
+ vdev_dbg(rdev->vdev, "check result: %d", ret);
+}
+
+static void rm_sched_health_check(struct timer_list *t)
+{
+ struct rm_device *rdev = to_rdev_health_timer(t);
+
+ if (rdev->firewall_tripped) {
+ vdev_err(rdev->vdev, "Firewall tripped, health check paused. Please reset card");
+ return;
+ }
+ /* Schedule a work in the general workqueue */
+ schedule_work(&rdev->health_monitor);
+ /* Periodic timer */
+ mod_timer(&rdev->health_timer, jiffies + RM_HEALTH_CHECK_TIMER);
+}
+
+static void rm_uninstall_health_monitor(struct rm_device *rdev)
+{
+ del_timer_sync(&rdev->health_timer);
+ cancel_work_sync(&rdev->health_monitor);
+}
+
+static void rm_install_health_monitor(struct rm_device *rdev)
+{
+ INIT_WORK(&rdev->health_monitor, &rm_check_health);
+ timer_setup(&rdev->health_timer, &rm_sched_health_check, 0);
+ mod_timer(&rdev->health_timer, jiffies + RM_HEALTH_CHECK_TIMER);
+}
+
+void versal_pci_rm_fini(struct rm_device *rdev)
+{
+ rm_uninstall_health_monitor(rdev);
+ rm_queue_fini(rdev);
+}
+
+struct rm_device *versal_pci_rm_init(struct versal_pci_device *vdev)
+{
+ struct rm_header *header;
+ struct rm_device *rdev;
+ u32 status;
+ int ret;
+
+ rdev = devm_kzalloc(&vdev->pdev->dev, sizeof(*rdev), GFP_KERNEL);
+ if (!rdev)
+ return ERR_PTR(-ENOMEM);
+
+ rdev->vdev = vdev;
+ header = &rdev->rm_metadata;
+
+ rm_shmem_bulk_read(rdev, RM_HDR_OFF, (u32 *)header, sizeof(*header));
+ if (header->magic != RM_HDR_MAGIC_NUM) {
+ vdev_err(vdev, "Invalid RM header 0x%x", header->magic);
+ ret = -ENODEV;
+ goto err;
+ }
+
+ status = rm_shmem_read(rdev, header->status_off);
+ if (!status) {
+ vdev_err(vdev, "RM status %d is not ready", status);
+ ret = -ENODEV;
+ goto err;
+ }
+
+ rdev->queue_buffer_size = header->data_end - header->data_start + 1;
+ rdev->queue_buffer_start = header->data_start;
+ rdev->queue_base = header->queue_base;
+
+ ret = rm_queue_init(rdev);
+ if (ret) {
+ vdev_err(vdev, "Failed to init cmd queue, ret %d", ret);
+ ret = -ENODEV;
+ goto err;
+ }
+
+ ret = rm_queue_verify(rdev);
+ if (ret) {
+ vdev_err(vdev, "Failed to verify cmd queue, ret %d", ret);
+ ret = -ENODEV;
+ goto queue_fini;
+ }
+
+ ret = rm_queue_boot_apu(rdev);
+ if (ret) {
+ vdev_err(vdev, "Failed to bringup APU, ret %d", ret);
+ ret = -ENODEV;
+ goto queue_fini;
+ }
+
+ rm_install_health_monitor(rdev);
+
+ return rdev;
+queue_fini:
+ rm_queue_fini(rdev);
+err:
+ return ERR_PTR(ret);
+}
+
+int rm_queue_get_fw_id(struct rm_device *rdev)
+{
+ struct rm_cmd *cmd;
+ int ret;
+
+ ret = rm_queue_create_cmd(rdev, RM_QUEUE_OP_GET_LOG_PAGE, &cmd);
+ if (ret)
+ return ret;
+
+ ret = rm_queue_payload_init(cmd, RM_CMD_LOG_PAGE_FW_ID);
+ if (ret)
+ goto destroy_cmd;
+
+ ret = rm_queue_send_cmd(cmd, RM_CMD_WAIT_CONFIG_TIMEOUT);
+ if (ret)
+ goto payload_fini;
+
+ ret = rm_queue_copy_response(cmd, rdev->vdev->fw_id, sizeof(rdev->vdev->fw_id));
+ if (ret)
+ goto payload_fini;
+
+ vdev_info(rdev->vdev, "fw_id %s", rdev->vdev->fw_id);
+
+payload_fini:
+ rm_queue_payload_fini(cmd);
+destroy_cmd:
+ rm_queue_destory_cmd(cmd);
+
+ return ret;
+}
diff --git a/drivers/fpga/amd/versal-pci-rm-service.h b/drivers/fpga/amd/versal-pci-rm-service.h
index 85a78257770a..90796567f0d3 100644
--- a/drivers/fpga/amd/versal-pci-rm-service.h
+++ b/drivers/fpga/amd/versal-pci-rm-service.h
@@ -206,4 +206,24 @@ struct rm_device {
__u32 firewall_tripped;
};
+/* rm service init api */
+struct rm_device *versal_pci_rm_init(struct versal_pci_device *vdev);
+void versal_pci_rm_fini(struct rm_device *rdev);
+
+/* rm services APIs */
+int rm_queue_create_cmd(struct rm_device *rdev, enum rm_queue_opcode opcode,
+ struct rm_cmd **cmd_ptr);
+void rm_queue_destory_cmd(struct rm_cmd *cmd);
+
+int rm_queue_data_init(struct rm_cmd *cmd, const char *buffer, ssize_t size);
+void rm_queue_data_fini(struct rm_cmd *cmd);
+int rm_queue_get_fw_id(struct rm_device *rdev);
+int rm_queue_boot_apu(struct rm_device *rdev);
+
+/* rm bar register operation APIs */
+u32 rm_reg_read(struct rm_device *rdev, u32 offset);
+void rm_reg_write(struct rm_device *rdev, u32 offset, const u32 value);
+void rm_bulk_reg_read(struct rm_device *rdev, u32 offset, u32 *value, size_t size);
+void rm_bulk_reg_write(struct rm_device *rdev, u32 offset, const void *value, size_t size);
+
#endif /* __RM_SERVICE_H */
diff --git a/drivers/fpga/amd/versal-pci.h b/drivers/fpga/amd/versal-pci.h
index 6c1ca3ce505d..5d3f793a5b68 100644
--- a/drivers/fpga/amd/versal-pci.h
+++ b/drivers/fpga/amd/versal-pci.h
@@ -27,6 +27,7 @@
struct versal_pci_device;
struct comm_chan_device;
+struct rm_cmd;
struct axlf_header {
__u64 length;
@@ -69,6 +70,7 @@ struct firmware_device {
struct versal_pci_device {
struct pci_dev *pdev;
+ struct rm_device *rdev;
struct fpga_device *fdev;
struct comm_chan_device *ccdev;
struct firmware_device *fwdev;
--
2.34.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2024-12-10 18:37 ` [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci Yidong Zhang
@ 2025-01-26 9:24 ` Xu Yilun
2025-01-26 19:50 ` Yidong Zhang
2025-01-26 10:12 ` Christophe JAILLET
` (2 subsequent siblings)
3 siblings, 1 reply; 28+ messages in thread
From: Xu Yilun @ 2025-01-26 9:24 UTC (permalink / raw)
To: Yidong Zhang
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Prapul Krishnamurthy, Hayden Laccabue
> +static int versal_pci_program_axlf(struct versal_pci_device *vdev, char *data, size_t size)
> +{
> + const struct axlf *axlf = (struct axlf *)data;
> + struct fpga_image_info *image_info;
> + int ret;
> +
> + image_info = fpga_image_info_alloc(&vdev->pdev->dev);
> + if (!image_info)
> + return -ENOMEM;
> +
> + image_info->count = axlf->header.length;
> + image_info->buf = (char *)axlf;
> +
> + ret = fpga_mgr_load(vdev->fdev->mgr, image_info);
I see, but it does not work like this. fpga_mgr_load() is intended to be
called by fpga_region; any reprogramming API should come from fpga_region,
and fpga_region could provide a uAPI for userspace reprogramming.
If your driver acts both as an fpga_mgr backend and an fpga_mgr kAPI user,
then there is no point in using the FPGA framework at all.
Thanks,
Yilun
> + if (ret) {
> + vdev_err(vdev, "failed to load xclbin: %d", ret);
> + goto exit;
> + }
> +
> + vdev_info(vdev, "Downloaded axlf %pUb of size %zu Bytes", &axlf->header.uuid, size);
> + uuid_copy(&vdev->xclbin_uuid, &axlf->header.uuid);
> +
> +exit:
> + fpga_image_info_free(image_info);
> +
> + return ret;
> +}
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH V2 0/4] Add versal-pci driver
2024-12-10 18:37 [PATCH V2 0/4] Add versal-pci driver Yidong Zhang
` (3 preceding siblings ...)
2024-12-10 18:37 ` [PATCH V2 4/4] drivers/fpga/amd: Add load xclbin and load firmware Yidong Zhang
@ 2025-01-26 9:27 ` Xu Yilun
4 siblings, 0 replies; 28+ messages in thread
From: Xu Yilun @ 2025-01-26 9:27 UTC (permalink / raw)
To: Yidong Zhang; +Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou
On Tue, Dec 10, 2024 at 10:37:29AM -0800, Yidong Zhang wrote:
> This patchset introduces a new Linux Kernel Driver, versal-pci for AMD
> Alevo Versal based PCIe Card. The driver is based on Linux fpga driver
> framework.
>
> The AMD Alevo Versal based PCIe Card, including V70, is the first Alevo
> production card leveraging AMD XDNA architecture with AI Engines. It is
> designed for AI inference efficiency and is tuned for video analytics and
> natural language processing applications [1].
>
> This versal-pci driver provides services, including:
> - leveraging linux firmware and FPGA framework to download management
> firmware
> - program additional bit-streams for AMD Xilinx specific hardware
> - communicate with PCIe user function
> - communicate with firmware running on the PCIe Card
> - monitor device health
>
> The driver is licensed under GPL-2.0.
>
> The firmware and bit-streams are distributed as a closed binary, delivered
> by AMD. Please see [1] for more information.
>
> [1] https://www.amd.com/en/products/accelerators/alveo/v70.html
>
> Refactor driver to address all comments from v1.
> Changes since v1:
> - Add driver architecture description.
> - Change the driver name to versal-pci
> - Remove unnecessary memcpy in versal-pci-comm-chan.c
> - Keep mod_timer because we need single_threaded_queue with delayed_work.
> - Change the "comms" to "comm_chan".
> - Remove regmap, use base+offset directly.
> - Add skeleton ops with on implementation (TODO) for fpga_manager in early
> patch (0001) and add implementation in later patch(0004).
> - Remove br_ops and FPGA region, no need.
I never said the FPGA region is not needed. The FPGA region is essential for
managing the reprogramming and re-enumeration.
Thanks,
Yilun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2024-12-10 18:37 ` [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci Yidong Zhang
2025-01-26 9:24 ` Xu Yilun
@ 2025-01-26 10:12 ` Christophe JAILLET
2025-01-26 19:56 ` Yidong Zhang
2025-01-26 10:16 ` Christophe JAILLET
2025-01-26 10:32 ` Xu Yilun
3 siblings, 1 reply; 28+ messages in thread
From: Christophe JAILLET @ 2025-01-26 10:12 UTC (permalink / raw)
To: Yidong Zhang, linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu
Cc: lizhi.hou, DMG Karthik, Nishad Saraf, Prapul Krishnamurthy,
Hayden Laccabue
On 10/12/2024 at 19:37, Yidong Zhang wrote:
> AMD Versal based PCIe card, including V70, is designed for AI inference
> efficiency and is tuned for video analytics and natural language processing
> applications.
...
> +static void versal_pci_uuid_parse(struct versal_pci_device *vdev, uuid_t *uuid)
> +{
> + char str[UUID_STRING_LEN];
> + u8 i, j;
> +
> + /* parse uuid into a valid uuid string format */
> + for (i = 0, j = 0; i < strlen(vdev->fw_id) && i < sizeof(str); i++) {
Unneeded extra space in "i = 0"
I think that the compiler already does it on its own, but the strlen
could be computed before the for loop.
> + str[j++] = vdev->fw_id[i];
> + if (j == 8 || j == 13 || j == 18 || j == 23)
> + str[j++] = '-';
> + }
> +
> + uuid_parse(str, uuid);
> + vdev_info(vdev, "Interface uuid %pU", uuid);
> +}
> +
> +static struct fpga_device *versal_pci_fpga_init(struct versal_pci_device *vdev)
> +{
> + struct device *dev = &vdev->pdev->dev;
> + struct fpga_manager_info info = { 0 };
Is the { 0 } needed?
Isn't the assignment below enough?
> + struct fpga_device *fdev;
> + int ret;
> +
> + fdev = devm_kzalloc(dev, sizeof(*fdev), GFP_KERNEL);
> + if (!fdev)
> + return ERR_PTR(-ENOMEM);
> +
> + fdev->vdev = vdev;
> +
> + info = (struct fpga_manager_info) {
> + .name = "AMD Versal FPGA Manager",
> + .mops = &versal_pci_fpga_ops,
> + .priv = fdev,
> + };
> +
> + fdev->mgr = fpga_mgr_register_full(dev, &info);
> + if (IS_ERR(fdev->mgr)) {
> + ret = PTR_ERR(fdev->mgr);
> + vdev_err(vdev, "Failed to register FPGA manager, err %d", ret);
> + return ERR_PTR(ret);
> + }
> +
> + /* Place holder for rm_queue_get_fw_id(vdev->rdev) */
> + versal_pci_uuid_parse(vdev, &vdev->intf_uuid);
> +
> + return fdev;
> +}
...
> +static struct firmware_device *versal_pci_fw_upload_init(struct versal_pci_device *vdev)
> +{
> + struct device *dev = &vdev->pdev->dev;
> + struct firmware_device *fwdev;
> + u32 devid;
> +
> + fwdev = devm_kzalloc(dev, sizeof(*fwdev), GFP_KERNEL);
> + if (!fwdev)
> + return ERR_PTR(-ENOMEM);
> +
> + devid = versal_pci_devid(vdev);
> + fwdev->name = kasprintf(GFP_KERNEL, "%s%x", DRV_NAME, devid);
Why is fwdev managed, and not fwdev->name?
It looks ok as-is, but using devm_kasprintf() would save a few lines of
code.
> + if (!fwdev->name)
> + return ERR_PTR(-ENOMEM);
> +
> + fwdev->fw = firmware_upload_register(THIS_MODULE, dev, fwdev->name,
> + &versal_pci_fw_ops, fwdev);
> + if (IS_ERR(fwdev->fw)) {
> + kfree(fwdev->name);
> + return ERR_CAST(fwdev->fw);
> + }
> +
> + fwdev->vdev = vdev;
> +
> + return fwdev;
> +}
...
> +static int versal_pci_probe(struct pci_dev *pdev, const struct pci_device_id *pdev_id)
> +{
> + struct versal_pci_device *vdev;
> + int ret;
> +
> + vdev = devm_kzalloc(&pdev->dev, sizeof(*vdev), GFP_KERNEL);
> + if (!vdev)
> + return -ENOMEM;
> +
> + pci_set_drvdata(pdev, vdev);
> + vdev->pdev = pdev;
> +
> + ret = pcim_enable_device(pdev);
> + if (ret) {
> + vdev_err(vdev, "Failed to enable device %d", ret);
> + return ret;
> + }
> +
> + vdev->io_regs = pcim_iomap_region(vdev->pdev, MGMT_BAR, DRV_NAME);
> + if (IS_ERR(vdev->io_regs)) {
> + vdev_err(vdev, "Failed to map RM shared memory BAR%d", MGMT_BAR);
> + return PTR_ERR(vdev->io_regs);
> + }
> +
> + ret = versal_pci_device_setup(vdev);
> + if (ret) {
> + vdev_err(vdev, "Failed to setup Versal device %d", ret);
> + return ret;
> + }
> +
> + vdev_dbg(vdev, "Successfully probed %s driver!", DRV_NAME);
Usually, such debug messages are not needed.
No strong opinion about it.
> + return 0;
> +}
...
CJ
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2024-12-10 18:37 ` [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci Yidong Zhang
2025-01-26 9:24 ` Xu Yilun
2025-01-26 10:12 ` Christophe JAILLET
@ 2025-01-26 10:16 ` Christophe JAILLET
2025-01-26 10:32 ` Xu Yilun
3 siblings, 0 replies; 28+ messages in thread
From: Christophe JAILLET @ 2025-01-26 10:16 UTC (permalink / raw)
To: Yidong Zhang, linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu
Cc: lizhi.hou, DMG Karthik, Nishad Saraf, Prapul Krishnamurthy,
Hayden Laccabue
On 10/12/2024 at 19:37, Yidong Zhang wrote:
> AMD Versal based PCIe card, including V70, is designed for AI inference
> efficiency and is tuned for video analytics and natural language processing
> applications.
...
> +#define vdev_info(vdev, fmt, args...) \
> + dev_info(&(vdev)->pdev->dev, "%s: "fmt, __func__, ##args)
> +
> +#define vdev_warn(vdev, fmt, args...) \
> + dev_warn(&(vdev)->pdev->dev, "%s: "fmt, __func__, ##args)
> +
> +#define vdev_err(vdev, fmt, args...) \
> + dev_err(&(vdev)->pdev->dev, "%s: "fmt, __func__, ##args)
> +
> +#define vdev_dbg(vdev, fmt, args...) \
> + dev_dbg(&(vdev)->pdev->dev, fmt, ##args)
These helpers could also add the trailing \n.
It is apparently not added in the messages themselves.
CJ
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH V2 2/4] drivers/fpga/amd: Add communication channel
2024-12-10 18:37 ` [PATCH V2 2/4] drivers/fpga/amd: Add communication channel Yidong Zhang
@ 2025-01-26 10:19 ` Christophe JAILLET
2025-01-26 19:57 ` Yidong Zhang
0 siblings, 1 reply; 28+ messages in thread
From: Christophe JAILLET @ 2025-01-26 10:19 UTC (permalink / raw)
To: Yidong Zhang, linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu
Cc: lizhi.hou, Nishad Saraf
On 10/12/2024 at 19:37, Yidong Zhang wrote:
> The communication channel (comm_chan) service is between versal-pci and the
> user PF driver. When the user PF driver requests PL data download, the
> comm_chan service will handle the request by versal_pci_load_xclbin.
...
> +enum comm_chan_req_ops {
> + COMM_CHAN_REQ_OPS_UNKNOWN = 0,
> + COMM_CHAN_REQ_OPS_HOT_RESET = 5,
> + COMM_CHAN_REQ_OPS_GET_PROTOCOL_VERSION = 19,
> + COMM_CHAN_REQ_OPS_LOAD_XCLBIN_UUID = 20,
> + COMM_CHAN_REQ_OPS_MAX,
Unneeded comma after a terminator.
> +};
...
> +static void comm_chan_check_request(struct work_struct *w)
> +{
> + struct comm_chan_device *ccdev = to_ccdev_work(w);
> + u32 status = 0, request[COMM_CHAN_DATA_LEN] = {0};
These 2 initialisations are not needed.
> + struct comm_chan_hw_msg *hw_msg;
> + u8 type, eom;
> + int i;
> +
> + status = comm_chan_read(ccdev, COMM_CHAN_REG_IS_OFF);
> + if (!STATUS_IS_READY(status))
> + return;
> + if (STATUS_IS_ERROR(status)) {
> + vdev_err(ccdev->vdev, "An error has occurred with comms");
> + return;
> + }
> +
> + /* ACK status */
> + comm_chan_write(ccdev, COMM_CHAN_REG_IS_OFF, status);
> +
> + for (i = 0; i < COMM_CHAN_DATA_LEN; i++)
> + request[i] = comm_chan_read(ccdev, COMM_CHAN_REG_RDDATA_OFF);
> +
...
CJ
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH V2 3/4] drivers/fpga/amd: Add remote queue
2024-12-10 18:37 ` [PATCH V2 3/4] drivers/fpga/amd: Add remote queue Yidong Zhang
@ 2025-01-26 10:24 ` Christophe JAILLET
0 siblings, 0 replies; 28+ messages in thread
From: Christophe JAILLET @ 2025-01-26 10:24 UTC (permalink / raw)
To: Yidong Zhang, linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu
Cc: lizhi.hou, Nishad Saraf, Prapul Krishnamurthy
On 10/12/2024 at 19:37, Yidong Zhang wrote:
> The remote queue is a hardware queue based ring buffer service between mgmt
> PF and PCIe hardware firmware for programming FPGA Program Logic, loading
> versal firmware and checking card healthy status.
...
> +static int rm_queue_submit_cmd(struct rm_cmd *cmd)
> +{
> + struct versal_pci_device *vdev = cmd->rdev->vdev;
> + struct rm_device *rdev = cmd->rdev;
> + u32 offset;
> + int ret;
> +
> + mutex_lock(&rdev->queue);
Using guard(mutex) would slightly simplify the code.
> +
> + offset = rm_queue_get_sq_slot_offset(rdev);
> + if (!offset) {
> + vdev_err(vdev, "No SQ slot available");
> + ret = -ENOSPC;
> + goto exit;
> + }
> +
> + rm_queue_bulk_write(rdev, offset, (u32 *)&cmd->sq_msg,
> + sizeof(cmd->sq_msg));
> +
> + ret = rm_queue_set_pidx(rdev, RM_QUEUE_SQ, ++rdev->sq.pidx);
> + if (ret) {
> + vdev_err(vdev, "Failed to update PIDX, ret %d", ret);
> + goto exit;
> + }
> +
> + list_add_tail(&cmd->list, &rdev->submitted_cmds);
> +exit:
> + mutex_unlock(&rdev->queue);
> + return ret;
> +}
> +
> +void rm_queue_withdraw_cmd(struct rm_cmd *cmd)
> +{
> + mutex_lock(&cmd->rdev->queue);
Using guard(mutex) would slightly simplify the code.
> + list_del(&cmd->list);
> + mutex_unlock(&cmd->rdev->queue);
> +}
...
> +static void rm_check_msg(struct work_struct *w)
> +{
> + struct rm_device *rdev = to_rdev_msg_monitor(w);
> + int ret;
> +
> + mutex_lock(&rdev->queue);
Using guard(mutex) would slightly simplify the code.
> +
> + rdev->sq.cidx = rm_queue_get_cidx(rdev, RM_QUEUE_SQ);
> + rdev->cq.pidx = rm_queue_get_pidx(rdev, RM_QUEUE_CQ);
> +
> + while (rdev->cq.cidx < rdev->cq.pidx) {
> + ret = rm_process_msg(rdev);
> + if (ret)
> + break;
> +
> + rdev->cq.cidx++;
> +
> + rm_queue_set_cidx(rdev, RM_QUEUE_CQ, rdev->cq.cidx);
> + }
> +
> + mutex_unlock(&rdev->queue);
> +}
...
> +int rm_queue_init(struct rm_device *rdev)
> +{
> + struct versal_pci_device *vdev = rdev->vdev;
> + struct rm_queue_header header = {0};
> + int ret;
> +
> + INIT_LIST_HEAD(&rdev->submitted_cmds);
> + mutex_init(&rdev->queue);
devm_mutex_init()?
In order to slightly simplify the code.
> +
> + rm_queue_bulk_read(rdev, RM_HDR_OFF, (u32 *)&header, sizeof(header));
> +
> + if (header.magic != RM_QUEUE_HDR_MAGIC_NUM) {
> + vdev_err(vdev, "Invalid RM queue header");
> + ret = -ENODEV;
> + goto error;
> + }
...
CJ
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2024-12-10 18:37 ` [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci Yidong Zhang
` (2 preceding siblings ...)
2025-01-26 10:16 ` Christophe JAILLET
@ 2025-01-26 10:32 ` Xu Yilun
2025-01-26 19:46 ` Yidong Zhang
3 siblings, 1 reply; 28+ messages in thread
From: Xu Yilun @ 2025-01-26 10:32 UTC (permalink / raw)
To: Yidong Zhang
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Prapul Krishnamurthy, Hayden Laccabue
On Tue, Dec 10, 2024 at 10:37:30AM -0800, Yidong Zhang wrote:
> AMD Versal based PCIe card, including V70, is designed for AI inference
> efficiency and is tuned for video analytics and natural language processing
> applications.
>
> The driver architecture:
>
> +---------+ Communication +---------+ Remote +-----+------+
> | | Channel | | Queue | | |
> | User PF | <============> | Mgmt PF | <=======>| FW | FPGA |
> +---------+ +---------+ +-----+------+
> PL Data base FW
> APU FW
> PL Data (copy)
> - PL (FPGA Program Logic)
> - FW (Firmware)
>
> There are 2 separate drivers from the original XRT[1] design.
> - UserPF driver
> - MgmtPF driver
>
> The new AMD versal-pci driver will replace the MgmtPF driver for Versal
> PCIe card.
>
> The XRT[1] is already open-sourced. It includes solution of runtime for
> many different type of PCIe Based cards. It also provides utilities for
> managing and programming the devices.
>
> The AMD versal-pci stands for AMD Versal brand PCIe device management
> driver. This driver provides the following functionalities:
>
> - module and PCI device initialization
> this driver will attach to specific device id of V70 card;
> the driver will initialize itself based on bar resources for
> - communication channel:
> a hardware message service between mgmt PF and user PF
> - remote queue:
> a hardware queue based ring buffer service between mgmt PF and PCIe
> hardware firmware for programming FPGA Program Logic, loading
> firmware and checking card healthy status.
>
> - programming FW
> - The base FW is downloaded onto the flash of the card.
> - The APU FW is downloaded once after a POR (power on reset).
> - Reloading the MgmtPF driver will not change any existing hardware.
>
> - programming FPGA hardware binaries - PL Data
> - using fpga framework ops to support re-programing FPGA
> - the re-programming request will be initiated from the existing UserPF
> driver only, and the MgmtPF driver load the matched PL Data after
> receiving request from the communication channel. The matching PL
I think this is not how the generic FPGA framework is supposed to work. An FPGA
region user (your userPF driver) should not also be the reprogramming requester.
The user driver cannot deal with an unexpected HW change if it happens.
After reprogramming, the user driver may no longer match the device,
and if the user driver is still working on it, it may crash.
The expected behavior is, the FPGA region removes user devices (thus
detaches user drivers), does reprogramming, re-enumerates/rescans and
matches new devices with new drivers. And I think that's what Nava is
working on.
BTW, AFAICS the expected flow is easier to implement for of-fpga-region,
but harder for PCI devices. Still, I think that's the right direction and we
should try to work it out.
Thanks,
Yilun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-01-26 10:32 ` Xu Yilun
@ 2025-01-26 19:46 ` Yidong Zhang
2025-02-06 4:15 ` Xu Yilun
0 siblings, 1 reply; 28+ messages in thread
From: Yidong Zhang @ 2025-01-26 19:46 UTC (permalink / raw)
To: Xu Yilun
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Prapul Krishnamurthy, Hayden Laccabue
On 1/26/25 02:32, Xu Yilun wrote:
> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
>
>
> On Tue, Dec 10, 2024 at 10:37:30AM -0800, Yidong Zhang wrote:
>> AMD Versal based PCIe card, including V70, is designed for AI inference
>> efficiency and is tuned for video analytics and natural language processing
>> applications.
>>
>> The driver architecture:
>>
>> +---------+ Communication +---------+ Remote +-----+------+
>> | | Channel | | Queue | | |
>> | User PF | <============> | Mgmt PF | <=======>| FW | FPGA |
>> +---------+ +---------+ +-----+------+
>> PL Data base FW
>> APU FW
>> PL Data (copy)
>> - PL (FPGA Program Logic)
>> - FW (Firmware)
>>
>> There are 2 separate drivers from the original XRT[1] design.
>> - UserPF driver
>> - MgmtPF driver
>>
>> The new AMD versal-pci driver will replace the MgmtPF driver for Versal
>> PCIe card.
>>
>> The XRT[1] is already open-sourced. It includes solution of runtime for
>> many different type of PCIe Based cards. It also provides utilities for
>> managing and programming the devices.
>>
>> The AMD versal-pci stands for AMD Versal brand PCIe device management
>> driver. This driver provides the following functionalities:
>>
>> - module and PCI device initialization
>> this driver will attach to specific device id of V70 card;
>> the driver will initialize itself based on bar resources for
>> - communication channel:
>> a hardware message service between mgmt PF and user PF
>> - remote queue:
>> a hardware queue based ring buffer service between mgmt PF and PCIe
>> hardware firmware for programming FPGA Program Logic, loading
>> firmware and checking card healthy status.
>>
>> - programming FW
>> - The base FW is downloaded onto the flash of the card.
>> - The APU FW is downloaded once after a POR (power on reset).
>> - Reloading the MgmtPF driver will not change any existing hardware.
>>
>> - programming FPGA hardware binaries - PL Data
>> - using fpga framework ops to support re-programing FPGA
>> - the re-programming request will be initiated from the existing UserPF
>> driver only, and the MgmtPF driver load the matched PL Data after
>> receiving request from the communication channel. The matching PL
>
> I think this is not the way the FPGA generic framework should do. A FPGA
> region user (your userPF driver) should not also be the reprogram requester.
> The user driver cannot deal with the unexpected HW change if it happens.
> Maybe after reprogramming, the user driver cannot match the device
> anymore, and if user driver is still working on it, crash.
One thing to clarify. The current design is:
The userPF driver is the only requester. The mgmtPF has no uAPI to
reprogram the FPGA.
>
> The expected behavior is, the FPGA region removes user devices (thus
> detaches user drivers), does reprogramming, re-enumerates/rescans and
> matches new devices with new drivers. And I think that's what Nava is
> working on.
>
Nava's work is different from our current design. In our design, the
separate userPF driver detaches all services before requesting the
mgmtPF to program the FPGA; after the programming is done, the userPF
re-enumerates/rescans to match the new devices.
The mgmtPF driver is a utility driver responsible for communicating with
the mgmt PF PCIe BAR resources.
> BTW, AFAICS the expected flow is easier to implement for of-fpga-region,
> but harder for PCI devices. But I think that's the right direction and
> should try to work it out.
Let me recap the suggested design to check that I understand it correctly...
You are thinking that the mgmtPF (aka versal-pci) driver should have a
uAPI to trigger the FPGA re-programming, and use Nava's callback ops to
detach the separate userPF driver; after re-programming is done, re-attach
the userPF driver and allow it to re-enumerate everything to match the
new hardware.
If my understanding is correct, this is doable.
As long as we can keep our userPF driver as a separate driver, the code
change won't be too big.
>
> Thanks,
> Yilun
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-01-26 9:24 ` Xu Yilun
@ 2025-01-26 19:50 ` Yidong Zhang
0 siblings, 0 replies; 28+ messages in thread
From: Yidong Zhang @ 2025-01-26 19:50 UTC (permalink / raw)
To: Xu Yilun
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Prapul Krishnamurthy, Hayden Laccabue
On 1/26/25 01:24, Xu Yilun wrote:
>> +static int versal_pci_program_axlf(struct versal_pci_device *vdev, char *data, size_t size)
>> +{
>> + const struct axlf *axlf = (struct axlf *)data;
>> + struct fpga_image_info *image_info;
>> + int ret;
>> +
>> + image_info = fpga_image_info_alloc(&vdev->pdev->dev);
>> + if (!image_info)
>> + return -ENOMEM;
>> +
>> + image_info->count = axlf->header.length;
>> + image_info->buf = (char *)axlf;
>> +
>> + ret = fpga_mgr_load(vdev->fdev->mgr, image_info);
>
> I see, but this is not working like this. fpga_mgr_load() is intended to be
> called by fpga_region, any reprogramming API should come from fpga_region,
> and fpga_region could provide uAPI for userspace reprogramming.
>
> If your driver act both as a fpga_mgr backend and a fpga_mgr kAPI user,
> then you don't have to bother using fpga framework at all.
The versal-pci driver is more of a utility driver that handles requests
from the separate userPF driver.
Thanks,
David
>
> Thanks,
> Yilun
>
>> + if (ret) {
>> + vdev_err(vdev, "failed to load xclbin: %d", ret);
>> + goto exit;
>> + }
>> +
>> + vdev_info(vdev, "Downloaded axlf %pUb of size %zu Bytes", &axlf->header.uuid, size);
>> + uuid_copy(&vdev->xclbin_uuid, &axlf->header.uuid);
>> +
>> +exit:
>> + fpga_image_info_free(image_info);
>> +
>> + return ret;
>> +}
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-01-26 10:12 ` Christophe JAILLET
@ 2025-01-26 19:56 ` Yidong Zhang
0 siblings, 0 replies; 28+ messages in thread
From: Yidong Zhang @ 2025-01-26 19:56 UTC (permalink / raw)
To: Christophe JAILLET, linux-kernel, linux-fpga, mdf, hao.wu,
yilun.xu
Cc: lizhi.hou, DMG Karthik, Nishad Saraf, Prapul Krishnamurthy,
Hayden Laccabue
On 1/26/25 02:12, Christophe JAILLET wrote:
>
> On 10/12/2024 at 19:37, Yidong Zhang wrote:
>> AMD Versal based PCIe card, including V70, is designed for AI inference
>> efficiency and is tuned for video analytics and natural language
>> processing
>> applications.
>
> ...
>
>> +static void versal_pci_uuid_parse(struct versal_pci_device *vdev,
>> uuid_t *uuid)
>> +{
>> + char str[UUID_STRING_LEN];
>> + u8 i, j;
>> +
>> + /* parse uuid into a valid uuid string format */
>> + for (i = 0, j = 0; i < strlen(vdev->fw_id) && i < sizeof(str);
>> i++) {
>
> Unneeded extra space in "i = 0"
Great catch! I will fix this.
>
> I think that the compiler already does it on its own, but the strlen
> could be computed before the for loop.
I will change the code.
>
>> + str[j++] = vdev->fw_id[i];
>> + if (j == 8 || j == 13 || j == 18 || j == 23)
>> + str[j++] = '-';
>> + }
>> +
>> + uuid_parse(str, uuid);
>> + vdev_info(vdev, "Interface uuid %pU", uuid);
>> +}
>> +
>> +static struct fpga_device *versal_pci_fpga_init(struct
>> versal_pci_device *vdev)
>> +{
>> + struct device *dev = &vdev->pdev->dev;
>> + struct fpga_manager_info info = { 0 };
>
> Is the { 0 } needed?
> Isn't the assigment below enough?
Right. I will remove the unnecessary { 0 }.
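A minimal sketch of how the declaration could look after the change, folding the later compound-literal assignment into a designated initializer at the declaration (names taken from the quoted patch context; untested):

```c
	struct fpga_manager_info info = {
		.name = "AMD Versal FPGA Manager",
		.mops = &versal_pci_fpga_ops,
		.priv = fdev,
	};
```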
>
>> + struct fpga_device *fdev;
>> + int ret;
>> +
>> + fdev = devm_kzalloc(dev, sizeof(*fdev), GFP_KERNEL);
>> + if (!fdev)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + fdev->vdev = vdev;
>> +
>> + info = (struct fpga_manager_info) {
>> + .name = "AMD Versal FPGA Manager",
>> + .mops = &versal_pci_fpga_ops,
>> + .priv = fdev,
>> + };
>> +
>> + fdev->mgr = fpga_mgr_register_full(dev, &info);
>> + if (IS_ERR(fdev->mgr)) {
>> + ret = PTR_ERR(fdev->mgr);
>> + vdev_err(vdev, "Failed to register FPGA manager, err
>> %d", ret);
>> + return ERR_PTR(ret);
>> + }
>> +
>> + /* Place holder for rm_queue_get_fw_id(vdev->rdev) */
>> + versal_pci_uuid_parse(vdev, &vdev->intf_uuid);
>> +
>> + return fdev;
>> +}
>
> ...
>
>> +static struct firmware_device *versal_pci_fw_upload_init(struct
>> versal_pci_device *vdev)
>> +{
>> + struct device *dev = &vdev->pdev->dev;
>> + struct firmware_device *fwdev;
>> + u32 devid;
>> +
>> + fwdev = devm_kzalloc(dev, sizeof(*fwdev), GFP_KERNEL);
>> + if (!fwdev)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + devid = versal_pci_devid(vdev);
>> + fwdev->name = kasprintf(GFP_KERNEL, "%s%x", DRV_NAME, devid);
>
> Why is fwdev managed, and not fwdev->name?
> It looks ok as-is, but using devm_kasprintf() would save a few lines of
> code.
I will change the code. Great suggestion.
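For illustration, a sketch of how the function might look with devm_kasprintf(), so the name string is device-managed and the manual kfree() on the error path goes away (based on the quoted patch context; untested, not the final code):

```c
static struct firmware_device *
versal_pci_fw_upload_init(struct versal_pci_device *vdev)
{
	struct device *dev = &vdev->pdev->dev;
	struct firmware_device *fwdev;

	fwdev = devm_kzalloc(dev, sizeof(*fwdev), GFP_KERNEL);
	if (!fwdev)
		return ERR_PTR(-ENOMEM);

	/* devm-managed: freed automatically when the device detaches,
	 * so no kfree() is needed on the error path below. */
	fwdev->name = devm_kasprintf(dev, GFP_KERNEL, "%s%x", DRV_NAME,
				     versal_pci_devid(vdev));
	if (!fwdev->name)
		return ERR_PTR(-ENOMEM);

	fwdev->fw = firmware_upload_register(THIS_MODULE, dev, fwdev->name,
					     &versal_pci_fw_ops, fwdev);
	if (IS_ERR(fwdev->fw))
		return ERR_CAST(fwdev->fw);

	fwdev->vdev = vdev;

	return fwdev;
}
```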
>
>> + if (!fwdev->name)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + fwdev->fw = firmware_upload_register(THIS_MODULE, dev, fwdev->name,
>> + &versal_pci_fw_ops, fwdev);
>> + if (IS_ERR(fwdev->fw)) {
>> + kfree(fwdev->name);
>> + return ERR_CAST(fwdev->fw);
>> + }
>> +
>> + fwdev->vdev = vdev;
>> +
>> + return fwdev;
>> +}
>
> ...
>
>> +static int versal_pci_probe(struct pci_dev *pdev, const struct
>> pci_device_id *pdev_id)
>> +{
>> + struct versal_pci_device *vdev;
>> + int ret;
>> +
>> + vdev = devm_kzalloc(&pdev->dev, sizeof(*vdev), GFP_KERNEL);
>> + if (!vdev)
>> + return -ENOMEM;
>> +
>> + pci_set_drvdata(pdev, vdev);
>> + vdev->pdev = pdev;
>> +
>> + ret = pcim_enable_device(pdev);
>> + if (ret) {
>> + vdev_err(vdev, "Failed to enable device %d", ret);
>> + return ret;
>> + }
>> +
>> + vdev->io_regs = pcim_iomap_region(vdev->pdev, MGMT_BAR, DRV_NAME);
>> + if (IS_ERR(vdev->io_regs)) {
>> + vdev_err(vdev, "Failed to map RM shared memory BAR%d",
>> MGMT_BAR);
>> + return PTR_ERR(vdev->io_regs);
>> + }
>> +
>> + ret = versal_pci_device_setup(vdev);
>> + if (ret) {
>> + vdev_err(vdev, "Failed to setup Versal device %d", ret);
>> + return ret;
>> + }
>> +
>> + vdev_dbg(vdev, "Successfully probed %s driver!", DRV_NAME);
>
> Usually, such debug messages are not needed.
> No strong opinion about it.
I will remove this. The dbg message is only enabled in a debug kernel,
but this line isn't necessary anyway. I will fix it.
>
>> + return 0;
>> +}
>
> ...
>
> CJ
* Re: [PATCH V2 2/4] drivers/fpga/amd: Add communication channel
2025-01-26 10:19 ` Christophe JAILLET
@ 2025-01-26 19:57 ` Yidong Zhang
0 siblings, 0 replies; 28+ messages in thread
From: Yidong Zhang @ 2025-01-26 19:57 UTC (permalink / raw)
To: Christophe JAILLET, linux-kernel, linux-fpga, mdf, hao.wu,
yilun.xu
Cc: lizhi.hou, Nishad Saraf
On 1/26/25 02:19, Christophe JAILLET wrote:
> Le 10/12/2024 à 19:37, Yidong Zhang a écrit :
>> The communication channel (comm_chan) service is between versal-pci
>> and the
>> user PF driver. When the user PF driver requests PL data download, the
>> comm_chan service will handle the request by versal_pci_load_xclbin.
>
> ...
>
>> +enum comm_chan_req_ops {
>> + COMM_CHAN_REQ_OPS_UNKNOWN = 0,
>> + COMM_CHAN_REQ_OPS_HOT_RESET = 5,
>> + COMM_CHAN_REQ_OPS_GET_PROTOCOL_VERSION = 19,
>> + COMM_CHAN_REQ_OPS_LOAD_XCLBIN_UUID = 20,
>> + COMM_CHAN_REQ_OPS_MAX,
>
> Unneeded comma after a terminator.
Will fix this.
>
>> +};
>
> ...
>
>> +static void comm_chan_check_request(struct work_struct *w)
>> +{
>> + struct comm_chan_device *ccdev = to_ccdev_work(w);
>> + u32 status = 0, request[COMM_CHAN_DATA_LEN] = {0};
>
> These 2 initialisations are not needed.
Will fix this.
>
>> + struct comm_chan_hw_msg *hw_msg;
>> + u8 type, eom;
>> + int i;
>> +
>> + status = comm_chan_read(ccdev, COMM_CHAN_REG_IS_OFF);
>> + if (!STATUS_IS_READY(status))
>> + return;
>> + if (STATUS_IS_ERROR(status)) {
>> + vdev_err(ccdev->vdev, "An error has occurred with comms");
>> + return;
>> + }
>> +
>> + /* ACK status */
>> + comm_chan_write(ccdev, COMM_CHAN_REG_IS_OFF, status);
>> +
>> + for (i = 0; i < COMM_CHAN_DATA_LEN; i++)
>> + request[i] = comm_chan_read(ccdev,
>> COMM_CHAN_REG_RDDATA_OFF);
>> +
>
> ...
>
> CJ
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-01-26 19:46 ` Yidong Zhang
@ 2025-02-06 4:15 ` Xu Yilun
2025-02-06 4:31 ` Yidong Zhang
0 siblings, 1 reply; 28+ messages in thread
From: Xu Yilun @ 2025-02-06 4:15 UTC (permalink / raw)
To: Yidong Zhang
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Prapul Krishnamurthy, Hayden Laccabue
On Sun, Jan 26, 2025 at 11:46:54AM -0800, Yidong Zhang wrote:
>
>
> On 1/26/25 02:32, Xu Yilun wrote:
> > On Tue, Dec 10, 2024 at 10:37:30AM -0800, Yidong Zhang wrote:
> > > AMD Versal based PCIe card, including V70, is designed for AI inference
> > > efficiency and is tuned for video analytics and natural language processing
> > > applications.
> > >
> > > The driver architecture:
> > >
> > > +---------+ Communication +---------+ Remote +-----+------+
> > > | | Channel | | Queue | | |
> > > | User PF | <============> | Mgmt PF | <=======>| FW | FPGA |
> > > +---------+ +---------+ +-----+------+
> > > PL Data base FW
> > > APU FW
> > > PL Data (copy)
> > > - PL (FPGA Program Logic)
> > > - FW (Firmware)
> > >
> > > There are 2 separate drivers from the original XRT[1] design.
> > > - UserPF driver
> > > - MgmtPF driver
> > >
> > > The new AMD versal-pci driver will replace the MgmtPF driver for Versal
> > > PCIe card.
> > >
> > > The XRT[1] is already open-sourced. It includes solution of runtime for
> > > many different type of PCIe Based cards. It also provides utilities for
> > > managing and programming the devices.
> > >
> > > The AMD versal-pci stands for AMD Versal brand PCIe device management
> > > driver. This driver provides the following functionalities:
> > >
> > > - module and PCI device initialization
> > > this driver will attach to specific device id of V70 card;
> > > the driver will initialize itself based on bar resources for
> > > - communication channel:
> > > a hardware message service between mgmt PF and user PF
> > > - remote queue:
> > > a hardware queue based ring buffer service between mgmt PF and PCIe
> > > hardware firmware for programming FPGA Program Logic, loading
> > > firmware and checking card healthy status.
> > >
> > > - programming FW
> > > - The base FW is downloaded onto the flash of the card.
> > > - The APU FW is downloaded once after a POR (power on reset).
> > > - Reloading the MgmtPF driver will not change any existing hardware.
> > >
> > > - programming FPGA hardware binaries - PL Data
> > > - using fpga framework ops to support re-programing FPGA
> > > - the re-programming request will be initiated from the existing UserPF
> > > driver only, and the MgmtPF driver load the matched PL Data after
> > > receiving request from the communication channel. The matching PL
> >
> > I think this is not the way the FPGA generic framework should do. A FPGA
> > region user (your userPF driver) should not also be the reprogram requester.
> > The user driver cannot deal with the unexpected HW change if it happens.
> > Maybe after reprogramming, the user driver cannot match the device
> > anymore, and if user driver is still working on it, crash.
>
> One thing to clarify. The current design is:
>
> The userPF driver is the only requester. The mgmtPF has no uAPI to reprogram
> the FPGA.
>
>
> >
> > The expected behavior is, the FPGA region removes user devices (thus
> > detaches user drivers), does reprogramming, re-enumerates/rescans and
> > matches new devices with new drivers. And I think that's what Nava is
> > working on.
> >
>
> Nava's work is different than our current design, our current design is:
>
> the separate userPF driver will detach all services before requesting to the
> mgmtPF to program the FPGA, and after the programming is done, the userPF
> will re-enumerate/rescan the matching new devices.
That doesn't align with the device-driver model; a device driver should
not re-enumerate/rescan a matching device.
>
> The mgmtPF is a util driver which is responsible for communicating with the
> mgmtPF PCIe bar resources.
>
>
> > BTW, AFAICS the expected flow is easier to implement for of-fpga-region,
> > but harder for PCI devices. But I think that's the right direction and
> > should try to work it out.
>
> Could I recap the suggested design if I understand that correctly...
>
> You are thinking that the mgmtPF (aka. versal-pci) driver should have a uAPI
This should be the unified fpga-region class uAPI.
> to trigger the FPGA re-programing; and using Nava's callback ops to detach
> the separate userPF driver; after re-programing is done, re-attch the userPF
No need to detach a specific driver, remove all devices in the
fpga-region. I imagine this could also be a generic flow for all PCI/e
based FPGA cards.
Thanks,
Yilun
> driver and allow the userPF driver re-enumerate all to match the new
> hardware.
>
> I think my understanding is correct, it is doable.
>
> As long as we can keep our userPF driver as separate driver, the code change
> won't be too big.
>
> >
> > Thanks,
> > Yilun
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-02-06 4:15 ` Xu Yilun
@ 2025-02-06 4:31 ` Yidong Zhang
2025-02-07 2:19 ` Xu Yilun
0 siblings, 1 reply; 28+ messages in thread
From: Yidong Zhang @ 2025-02-06 4:31 UTC (permalink / raw)
To: Xu Yilun
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Prapul Krishnamurthy, Hayden Laccabue
On 2/5/25 20:15, Xu Yilun wrote:
> On Sun, Jan 26, 2025 at 11:46:54AM -0800, Yidong Zhang wrote:
>>
>>
>> On 1/26/25 02:32, Xu Yilun wrote:
>>> On Tue, Dec 10, 2024 at 10:37:30AM -0800, Yidong Zhang wrote:
>>>> AMD Versal based PCIe card, including V70, is designed for AI inference
>>>> efficiency and is tuned for video analytics and natural language processing
>>>> applications.
>>>>
>>>> The driver architecture:
>>>>
>>>> +---------+ Communication +---------+ Remote +-----+------+
>>>> | | Channel | | Queue | | |
>>>> | User PF | <============> | Mgmt PF | <=======>| FW | FPGA |
>>>> +---------+ +---------+ +-----+------+
>>>> PL Data base FW
>>>> APU FW
>>>> PL Data (copy)
>>>> - PL (FPGA Program Logic)
>>>> - FW (Firmware)
>>>>
>>>> There are 2 separate drivers from the original XRT[1] design.
>>>> - UserPF driver
>>>> - MgmtPF driver
>>>>
>>>> The new AMD versal-pci driver will replace the MgmtPF driver for Versal
>>>> PCIe card.
>>>>
>>>> The XRT[1] is already open-sourced. It includes solution of runtime for
>>>> many different type of PCIe Based cards. It also provides utilities for
>>>> managing and programming the devices.
>>>>
>>>> The AMD versal-pci stands for AMD Versal brand PCIe device management
>>>> driver. This driver provides the following functionalities:
>>>>
>>>> - module and PCI device initialization
>>>> this driver will attach to specific device id of V70 card;
>>>> the driver will initialize itself based on bar resources for
>>>> - communication channel:
>>>> a hardware message service between mgmt PF and user PF
>>>> - remote queue:
>>>> a hardware queue based ring buffer service between mgmt PF and PCIe
>>>> hardware firmware for programming FPGA Program Logic, loading
>>>> firmware and checking card healthy status.
>>>>
>>>> - programming FW
>>>> - The base FW is downloaded onto the flash of the card.
>>>> - The APU FW is downloaded once after a POR (power on reset).
>>>> - Reloading the MgmtPF driver will not change any existing hardware.
>>>>
>>>> - programming FPGA hardware binaries - PL Data
>>>> - using fpga framework ops to support re-programing FPGA
>>>> - the re-programming request will be initiated from the existing UserPF
>>>> driver only, and the MgmtPF driver load the matched PL Data after
>>>> receiving request from the communication channel. The matching PL
>>>
>>> I think this is not the way the FPGA generic framework should do. A FPGA
>>> region user (your userPF driver) should not also be the reprogram requester.
>>> The user driver cannot deal with the unexpected HW change if it happens.
>>> Maybe after reprogramming, the user driver cannot match the device
>>> anymore, and if user driver is still working on it, crash.
>>
>> One thing to clarify. The current design is:
>>
>> The userPF driver is the only requester. The mgmtPF has no uAPI to reprogram
>> the FPGA.
>>
>>
>>>
>>> The expected behavior is, the FPGA region removes user devices (thus
>>> detaches user drivers), does reprogramming, re-enumerates/rescans and
>>> matches new devices with new drivers. And I think that's what Nava is
>>> working on.
>>>
>>
>> Nava's work is different than our current design, our current design is:
>>
>> the separate userPF driver will detach all services before requesting to the
>> mgmtPF to program the FPGA, and after the programming is done, the userPF
>> will re-enumerate/rescan the matching new devices.
>
> That's not align with the Device-Driver Model, A device driver should not
> re-enumerate/rescan a matching device.
Thanks, Yilun.
I will need to discuss this with my team. Our current userPF driver
organizes services (rom, sensor, msix, etc.) as platform drivers and
re-enumerates (online/offline) when there is a hardware change.
>
>>
>> The mgmtPF is a util driver which is responsible for communicating with the
>> mgmtPF PCIe bar resources.
>>
>>
>>> BTW, AFAICS the expected flow is easier to implement for of-fpga-region,
>>> but harder for PCI devices. But I think that's the right direction and
>>> should try to work it out.
>>
>> Could I recap the suggested design if I understand that correctly...
>>
>> You are thinking that the mgmtPF (aka. versal-pci) driver should have a uAPI
>
> This should be the unified fpga-region class uAPI.
>
>> to trigger the FPGA re-programing; and using Nava's callback ops to detach
>> the separate userPF driver; after re-programing is done, re-attch the userPF
>
> No need to detach a specific driver, remove all devices in the
> fpga-region. I imagine this could also be a generic flow for all PCI/e
> based FPGA cards.
I see your point. Is there an existing example in the current fpga
drivers for PCI/e based cards?
We will need to talk to our management team and re-design our driver.
I think we have 2 approaches:
1) Align with Linux fpga, having one driver for both userPF and mgmtPF; or
2) find a different location (maybe drivers/misc) for the versal-pci
driver, because it is a utility driver and need not be tied to the fpga
framework.
Please let me know your thoughts. Which way is acceptable to you?
Thanks,
David
>
> Thanks,
> Yilun
>
>> driver and allow the userPF driver re-enumerate all to match the new
>> hardware.
>>
>> I think my understanding is correct, it is doable.
>>
>> As long as we can keep our userPF driver as separate driver, the code change
>> won't be too big.
>>
>>>
>>> Thanks,
>>> Yilun
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-02-06 4:31 ` Yidong Zhang
@ 2025-02-07 2:19 ` Xu Yilun
2025-02-07 3:16 ` Yidong Zhang
0 siblings, 1 reply; 28+ messages in thread
From: Xu Yilun @ 2025-02-07 2:19 UTC (permalink / raw)
To: Yidong Zhang
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Prapul Krishnamurthy, Hayden Laccabue
> > No need to detach a specific driver, remove all devices in the
> > fpga-region. I imagine this could also be a generic flow for all PCI/e
> > based FPGA cards.
>
> I see your point. Is there an existing example in current fpga driver for
> PCI/e based cards?
No. The fpga-region re-enumeration arch is still WIP, so no existing
implementation.
>
> We will need to talk to our management team and re-design our driver.
> I think we have 2 approaches:
> 1) Align with linux fpga, having one driver for both userPF and mgmtPF; or
Don't get you. Linux FPGA doesn't require one driver for both PFs.
> 2) find a different location (maybe driver/misc) for the version-pci driver,
> because it is an utility driver, not need to be tied with fpga framework.
I'm not the misc maintainer, but I assume in-tree utility driver +
out-of-tree client driver is not generally welcomed.
Thanks,
Yilun
>
> Please let me know you thoughts. Which way is acceptable by you.
>
> Thanks,
> David
> >
> > Thanks,
> > Yilun
> >
> > > driver and allow the userPF driver re-enumerate all to match the new
> > > hardware.
> > >
> > > I think my understanding is correct, it is doable.
> > >
> > > As long as we can keep our userPF driver as separate driver, the code change
> > > won't be too big.
> > >
> > > >
> > > > Thanks,
> > > > Yilun
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-02-07 2:19 ` Xu Yilun
@ 2025-02-07 3:16 ` Yidong Zhang
2025-02-07 4:40 ` Xu Yilun
0 siblings, 1 reply; 28+ messages in thread
From: Yidong Zhang @ 2025-02-07 3:16 UTC (permalink / raw)
To: Xu Yilun
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Prapul Krishnamurthy, Hayden Laccabue
On 2/6/25 18:19, Xu Yilun wrote:
>>> No need to detach a specific driver, remove all devices in the
>>> fpga-region. I imagine this could also be a generic flow for all PCI/e
>>> based FPGA cards.
>>
>> I see your point. Is there an existing example in current fpga driver for
>> PCI/e based cards?
>
> No. The fpga-region re-enumeration arch is still WIP, so no existing
> implementation.
Got you. I can also help to test or provide feedback.
Actually, I sent Nava my prototype of using configfs for the non-OF
driver. He should have the updated patch soon.
>
>>
>> We will need to talk to our management team and re-design our driver.
>> I think we have 2 approaches:
>> 1) Align with linux fpga, having one driver for both userPF and mgmtPF; or
>
> Don't get you. Linux FPGA doesn't require one driver for both PFs.
I assume that when you said "generic flow for removing all devices in
the fpga-region", you meant that there is a single driver which uses the
fpga-region callbacks to remove all devices and then re-program the FPGA.
On AMD V70 hardware, the userPF has a different device id than the
mgmtPF. Thus, our current design has 2 separate PCIe drivers, one for
each device id.
Thus, in our current design, the fpga-region should be in the userPF
driver, which has callbacks to remove all devices. The mgmtPF is more
like a utility which only handles requests from the userPF, but it has
the management privilege. Usually, cloud vendors require the mgmtPF to
be deployed in their secured domain (which can be a separate physical
machine).
We can keep 2 drivers, one for the userPF and one for the mgmtPF. The
userPF and mgmtPF communicate with each other via the communication
channel because they can be on physically different machines.
Are you looking for us to upstream both of them together?
If yes, I still think that the fpga-region should be in the userPF
driver. The mgmtPF driver is still a utility driver.
Please correct me if I am wrong. :)
>
>> 2) find a different location (maybe driver/misc) for the version-pci driver,
>> because it is an utility driver, not need to be tied with fpga framework.
>
> I'm not the misc maintainer, but I assume in-tree utility driver +
> out-of-tree client driver is not generally welcomed.
Great info! Thanks! I will have to discuss this with my team too.
My understanding, based on your comments above, is that drivers/fpga
prefers to use fpga-region ops to handle FPGA re-programming changes.
The current versal-pci driver is a utility driver.
The fpga-region should be within the userPF driver.
We can try to make our userPF driver in-tree as well, but the current
plan is to do that later.
If you prefer that we do both of them together, I can talk to my team.
Thanks,
David
>
> Thanks,
> Yilun
>
>>
>> Please let me know you thoughts. Which way is acceptable by you.
>>
>> Thanks,
>> David
>>>
>>> Thanks,
>>> Yilun
>>>
>>>> driver and allow the userPF driver re-enumerate all to match the new
>>>> hardware.
>>>>
>>>> I think my understanding is correct, it is doable.
>>>>
>>>> As long as we can keep our userPF driver as separate driver, the code change
>>>> won't be too big.
>>>>
>>>>>
>>>>> Thanks,
>>>>> Yilun
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-02-07 3:16 ` Yidong Zhang
@ 2025-02-07 4:40 ` Xu Yilun
2025-02-10 11:33 ` Yidong Zhang
0 siblings, 1 reply; 28+ messages in thread
From: Xu Yilun @ 2025-02-07 4:40 UTC (permalink / raw)
To: Yidong Zhang
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Prapul Krishnamurthy, Hayden Laccabue
On Thu, Feb 06, 2025 at 07:16:25PM -0800, Yidong Zhang wrote:
>
>
> On 2/6/25 18:19, Xu Yilun wrote:
> > > > No need to detach a specific driver, remove all devices in the
> > > > fpga-region. I imagine this could also be a generic flow for all PCI/e
> > > > based FPGA cards.
> > >
> > > I see your point. Is there an existing example in current fpga driver for
> > > PCI/e based cards?
> >
> > No. The fpga-region re-enumeration arch is still WIP, so no existing
> > implementation.
>
> Got you. I can also help to test or provide feedback.
> Actually, I had sent Nava my protype of using configfs for the non-OF
> driver. He should have the updated patch soon.
>
> >
> > >
> > > We will need to talk to our management team and re-design our driver.
> > > I think we have 2 approaches:
> > > 1) Align with linux fpga, having one driver for both userPF and mgmtPF; or
> >
> > Don't get you. Linux FPGA doesn't require one driver for both PFs.
>
> I assume when you said "generic flow for removing all devices in
> fpga-region" means that there is a single driver which use the fpga-region
> callbacks to remove all devices and then re-progream the FPGA.
>
> On AMD V70 hardware, the userPF has different deviceid than the mgmtPF.
> Thus, our current design is having 2 separate pcie drivers for each
> different deviceid.
>
> Thus, in our current design, the fpga-region should be in the userPF driver
> which has callbacks to remove all devices. But in mgmtPF, it is more like a
A question: if the FPGA logic is cleared, does the userPF still exist on
the PCIe bus?
> utility which only handles request from the userPF but it has the management
> privilege. Usually, cloud vendors require the mgmtPF deployed in their
> secured domain (can be a separate physical machine).
I thought the mgmtPF & userPF were just 2 functions on one PCIe card,
are they not? Then how does the mgmtPF write data to another physical
machine and influence the userPF?
Thanks,
Yilun
>
> We can keep 2 drivers, one for userPF, one for mgmtPF. The userPF and mgmtPF
> communicate each other via the communication channel because they can be
> physically on different machine.
>
> Are you looking for us to upstream both of them together?
> If yes, I still think that the fpga-region should be in the userPF driver.
> The mgmtPF driver is still a utility driver.
>
> Please correct me if I am wrong.:)
>
> >
> > > 2) find a different location (maybe driver/misc) for the version-pci driver,
> > > because it is an utility driver, not need to be tied with fpga framework.
> >
> > I'm not the misc maintainer, but I assume in-tree utility driver +
> > out-of-tree client driver is not generally welcomed.
>
> Great info! Thanks! I will have to discuss this with my team too.
>
> My understanding, so far based on your comments above, the drivers/fpga
> prefer to use fpga-region ops to handle FPGA re-program changes.
>
> The current versal-pci driver is a utility driver.
> The fpga-region should be within the userPF driver.
>
> We can try to make our userPF driver in-tree as well. But the current plan
> is to do it later.
>
> If you prefer we do both of them together. I can talk to my team.
>
> Thanks,
> David
> >
> > Thanks,
> > Yilun
> >
> > >
> > > Please let me know you thoughts. Which way is acceptable by you.
> > >
> > > Thanks,
> > > David
> > > >
> > > > Thanks,
> > > > Yilun
> > > >
> > > > > driver and allow the userPF driver re-enumerate all to match the new
> > > > > hardware.
> > > > >
> > > > > I think my understanding is correct, it is doable.
> > > > >
> > > > > As long as we can keep our userPF driver as separate driver, the code change
> > > > > won't be too big.
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > Yilun
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-02-07 4:40 ` Xu Yilun
@ 2025-02-10 11:33 ` Yidong Zhang
2025-02-11 9:09 ` Xu Yilun
0 siblings, 1 reply; 28+ messages in thread
From: Yidong Zhang @ 2025-02-10 11:33 UTC (permalink / raw)
To: Xu Yilun
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Hayden Laccabue
On 2/6/25 20:40, Xu Yilun wrote:
> On Thu, Feb 06, 2025 at 07:16:25PM -0800, Yidong Zhang wrote:
>>
>>
>> On 2/6/25 18:19, Xu Yilun wrote:
>>>>> No need to detach a specific driver, remove all devices in the
>>>>> fpga-region. I imagine this could also be a generic flow for all PCI/e
>>>>> based FPGA cards.
>>>>
>>>> I see your point. Is there an existing example in current fpga driver for
>>>> PCI/e based cards?
>>>
>>> No. The fpga-region re-enumeration arch is still WIP, so no existing
>>> implementation.
>>
>> Got you. I can also help to test or provide feedback.
>> Actually, I had sent Nava my protype of using configfs for the non-OF
>> driver. He should have the updated patch soon.
>>
>>>
>>>>
>>>> We will need to talk to our management team and re-design our driver.
>>>> I think we have 2 approaches:
>>>> 1) Align with linux fpga, having one driver for both userPF and mgmtPF; or
>>>
>>> Don't get you. Linux FPGA doesn't require one driver for both PFs.
>>
>> I assume when you said "generic flow for removing all devices in
>> fpga-region" means that there is a single driver which use the fpga-region
>> callbacks to remove all devices and then re-progream the FPGA.
>>
>> On AMD V70 hardware, the userPF has different deviceid than the mgmtPF.
>> Thus, our current design is having 2 separate pcie drivers for each
>> different deviceid.
>>
>> Thus, in our current design, the fpga-region should be in the userPF driver
>> which has callbacks to remove all devices. But in mgmtPF, it is more like a
>
> A question, if the FPGA logic is cleared, does the userPF still exist on
> PCIe bus?
I am not sure I fully understand what you mean by "FPGA logic is
cleared". Based on our design, the userPF still exists, but all services
need to be re-configured after the hardware is changed.
>
>> utility which only handles request from the userPF but it has the management
>> privilege. Usually, cloud vendors require the mgmtPF deployed in their
>> secured domain (can be a separate physical machine).
>
> I though mgmtPF & userPF are just 2 functions on one PCIe card, they are not?
> Then how the mgmtPF writes data to another physical machine and
> influence the userPF?
>
They are functions, but they also have different device ids, so we can
bind a different driver to each of them.
I think I should interpret it more accurately as Host and VM. Please see
details here:
https://xilinx.github.io/XRT/2019.2/html/cloud_vendor_support.html
But my point is that we do have 2 separate drivers.
Please let me know if you need more detailed info.
/David
> Thanks,
> Yilun
>
>>
>> We can keep 2 drivers, one for userPF, one for mgmtPF. The userPF and mgmtPF
>> communicate each other via the communication channel because they can be
>> physically on different machine.
>>
>> Are you looking for us to upstream both of them together?
>> If yes, I still think that the fpga-region should be in the userPF driver.
>> The mgmtPF driver is still a utility driver.
>>
>> Please correct me if I am wrong.:)
>>
>>>
>>>> 2) find a different location (maybe driver/misc) for the version-pci driver,
>>>> because it is an utility driver, not need to be tied with fpga framework.
>>>
>>> I'm not the misc maintainer, but I assume in-tree utility driver +
>>> out-of-tree client driver is not generally welcomed.
>>
>> Great info! Thanks! I will have to discuss this with my team too.
>>
>> My understanding, so far based on your comments above, the drivers/fpga
>> prefer to use fpga-region ops to handle FPGA re-program changes.
>>
>> The current versal-pci driver is a utility driver.
>> The fpga-region should be within the userPF driver.
>>
>> We can try to make our userPF driver in-tree as well. But the current plan
>> is to do it later.
>>
>> If you prefer we do both of them together. I can talk to my team.
>>
>> Thanks,
>> David
>>>
>>> Thanks,
>>> Yilun
>>>
>>>>
>>>> Please let me know you thoughts. Which way is acceptable by you.
>>>>
>>>> Thanks,
>>>> David
>>>>>
>>>>> Thanks,
>>>>> Yilun
>>>>>
>>>>>> driver and allow the userPF driver re-enumerate all to match the new
>>>>>> hardware.
>>>>>>
>>>>>> I think my understanding is correct, it is doable.
>>>>>>
>>>>>> As long as we can keep our userPF driver as separate driver, the code change
>>>>>> won't be too big.
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Yilun
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-02-10 11:33 ` Yidong Zhang
@ 2025-02-11 9:09 ` Xu Yilun
2025-02-11 11:31 ` Yidong Zhang
0 siblings, 1 reply; 28+ messages in thread
From: Xu Yilun @ 2025-02-11 9:09 UTC (permalink / raw)
To: Yidong Zhang
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Hayden Laccabue
On Mon, Feb 10, 2025 at 03:33:07AM -0800, Yidong Zhang wrote:
>
>
> On 2/6/25 20:40, Xu Yilun wrote:
> > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
> >
> >
> > On Thu, Feb 06, 2025 at 07:16:25PM -0800, Yidong Zhang wrote:
> > >
> > >
> > > On 2/6/25 18:19, Xu Yilun wrote:
> > > >
> > > >
> > > > > > No need to detach a specific driver, remove all devices in the
> > > > > > fpga-region. I imagine this could also be a generic flow for all PCI/e
> > > > > > based FPGA cards.
> > > > >
> > > > > I see your point. Is there an existing example in current fpga driver for
> > > > > PCI/e based cards?
> > > >
> > > > No. The fpga-region re-enumeration arch is still WIP, so no existing
> > > > implementation.
> > >
> > > Got it. I can also help to test or provide feedback.
> > > Actually, I had sent Nava my prototype of using configfs for the non-OF
> > > driver. He should have the updated patch soon.
> > >
> > > >
> > > > >
> > > > > We will need to talk to our management team and re-design our driver.
> > > > > I think we have 2 approaches:
> > > > > 1) Align with linux fpga, having one driver for both userPF and mgmtPF; or
> > > >
> > > > Don't get you. Linux FPGA doesn't require one driver for both PFs.
> > >
> > > I assume that when you said "generic flow for removing all devices in
> > > fpga-region", you meant that there is a single driver which uses the fpga-region
> > > callbacks to remove all devices and then re-program the FPGA.
> > >
> > > On AMD V70 hardware, the userPF has a different device ID than the mgmtPF.
> > > Thus, our current design has 2 separate PCIe drivers, one for each
> > > device ID.
> > >
> > > Thus, in our current design, the fpga-region should be in the userPF driver
> > > which has callbacks to remove all devices. But in mgmtPF, it is more like a
> >
> > A question: if the FPGA logic is cleared, does the userPF still exist on
> > the PCIe bus?
>
> I am not sure I fully understand what you mean by "FPGA logic is cleared".
> Based on our design, the userPF exists, but all services need to be
> re-configured after the hardware is changed.
I briefly read your XRT docs. IIUC we are talking about reprogramming
the "Dynamic Region", not the "Shell".
The userPF's PCIe basic functionality is in the Shell and is outside of
our reprogramming discussion. So the userPF PCI device always exists even if
the "Dynamic Region" is being reprogrammed. In this case, we should
create an fpga-region for this "Dynamic Region" in the userPF driver.
The userPF uses some HW communication channel to request reprogramming, so
the fpga-manager could also be implemented in the userPF driver.
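As a rough sketch of that wiring (all names here are hypothetical, not the actual XRT implementation, and the fpga_manager_info/fpga_region_info registration style follows recent kernels, so exact signatures may differ on your baseline):

```c
#include <linux/fpga/fpga-mgr.h>
#include <linux/fpga/fpga-region.h>
#include <linux/pci.h>

/* Hypothetical ops for reprogramming the "Dynamic Region". A real
 * implementation would forward the image over the HW communication
 * channel to the mgmtPF side. */
static int dyn_write_init(struct fpga_manager *mgr,
			  struct fpga_image_info *info,
			  const char *buf, size_t count)
{
	return 0; /* ask firmware to prepare for reprogramming */
}

static int dyn_write(struct fpga_manager *mgr, const char *buf, size_t count)
{
	return 0; /* stream image data through the channel */
}

static const struct fpga_manager_ops dyn_mgr_ops = {
	.write_init = dyn_write_init,
	.write      = dyn_write,
};

static int userpf_register_fpga(struct pci_dev *pdev)
{
	struct fpga_manager_info mgr_info = {
		.name = "versal-dynamic-region",	/* illustrative */
		.mops = &dyn_mgr_ops,
	};
	struct fpga_region_info rgn_info = {};
	struct fpga_manager *mgr;
	struct fpga_region *rgn;

	/* fpga-manager and fpga-region both live in the userPF driver,
	 * parented to the userPF PCI device. */
	mgr = fpga_mgr_register_full(&pdev->dev, &mgr_info);
	if (IS_ERR(mgr))
		return PTR_ERR(mgr);

	rgn_info.mgr = mgr;
	rgn = fpga_region_register_full(&pdev->dev, &rgn_info);
	if (IS_ERR(rgn)) {
		fpga_mgr_unregister(mgr);
		return PTR_ERR(rgn);
	}
	return 0;
}
```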
And I agree the mgmtPF driver is a utility driver that just monitors
the communication channel and serves requests, with nothing to do with the
FPGA framework. I have no objection to putting it in the misc or firmware
directory.
What I'm not sure about is whether a standalone utility driver in the kernel tree
is preferred (at least I don't prefer it). You may talk to other maintainers.
>
>
> >
> > > utility which only handles requests from the userPF but has the management
> > > privilege. Usually, cloud vendors require the mgmtPF to be deployed in their
> > > secured domain (which can be a separate physical machine).
> >
> > I thought mgmtPF & userPF were just 2 functions on one PCIe card; are they not?
> > Then how does the mgmtPF write data to another physical machine and
> > influence the userPF?
> >
>
> They are functions, but they also have different device IDs, so we can bind
> a different driver onto each of them.
>
> I think, to be accurate, I should interpret it as Host and VM. See details
> here please:
> https://xilinx.github.io/XRT/2019.2/html/cloud_vendor_support.html
>
> But, my point is that we do have 2 separate drivers.
OK, having 2 separate drivers is definitely fine.
Thanks,
Yilun
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-02-11 9:09 ` Xu Yilun
@ 2025-02-11 11:31 ` Yidong Zhang
2025-03-01 8:20 ` Xu Yilun
0 siblings, 1 reply; 28+ messages in thread
From: Yidong Zhang @ 2025-02-11 11:31 UTC (permalink / raw)
To: Xu Yilun
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Hayden Laccabue
On 2/11/25 01:09, Xu Yilun wrote:
>
>
> On Mon, Feb 10, 2025 at 03:33:07AM -0800, Yidong Zhang wrote:
>>
>>
>> On 2/6/25 20:40, Xu Yilun wrote:
>>>
>>>
>>> On Thu, Feb 06, 2025 at 07:16:25PM -0800, Yidong Zhang wrote:
>>>>
>>>>
>>>> On 2/6/25 18:19, Xu Yilun wrote:
>>>>>
>>>>>
>>>>>>> No need to detach a specific driver, remove all devices in the
>>>>>>> fpga-region. I imagine this could also be a generic flow for all PCI/e
>>>>>>> based FPGA cards.
>>>>>>
>>>>>> I see your point. Is there an existing example in current fpga driver for
>>>>>> PCI/e based cards?
>>>>>
>>>>> No. The fpga-region re-enumeration arch is still WIP, so no existing
>>>>> implementation.
>>>>
>>>> Got you. I can also help to test or provide feedback.
>>>> Actually, I had sent Nava my prototype of using configfs for the non-OF
>>>> driver. He should have the updated patch soon.
>>>>
>>>>>
>>>>>>
>>>>>> We will need to talk to our management team and re-design our driver.
>>>>>> I think we have 2 approaches:
>>>>>> 1) Align with linux fpga, having one driver for both userPF and mgmtPF; or
>>>>>
>>>>> Don't get you. Linux FPGA doesn't require one driver for both PFs.
>>>>
>>>> I assume that when you said "generic flow for removing all devices in
>>>> fpga-region", you meant that there is a single driver which uses the fpga-region
>>>> callbacks to remove all devices and then re-program the FPGA.
>>>>
>>>> On AMD V70 hardware, the userPF has different deviceid than the mgmtPF.
>>>> Thus, our current design is having 2 separate pcie drivers for each
>>>> different deviceid.
>>>>
>>>> Thus, in our current design, the fpga-region should be in the userPF driver
>>>> which has callbacks to remove all devices. But in mgmtPF, it is more like a
>>>
>>> A question, if the FPGA logic is cleared, does the userPF still exist on
>>> PCIe bus?
>>
>> I am not sure I fully understand what you mean by "FPGA logic is cleared".
>> Based on our design, the userPF exists, but all services need to be
>> re-configured after the hardware is changed.
>
> I briefly read your XRT docs. IIUC we are talking about reprogramming
> the "Dynamic Region", not the "Shell".
>
> The userPF's PCIe basic functionality is in Shell and is out of
> our reprogramming discussion. So userPF PCI device always exists even if
> the "Dynamic Region" is being reprogrammed. In this case, we should
> create fpga-region for this "Dynamic Region" in userPF driver.
>
> userPF uses some HW communication channel to request reprogramming, so
> fpga-manager could also be implemented in userPF driver.
>
Thanks for taking the time to read our documents. You are exactly
correct. We will use fpga-manager and fpga-region in our userPF driver
for upstreaming. If you think this is acceptable to the Linux FPGA
framework, I will talk to my team about the next step.
> And I agree the mgmtPF driver is a utility driver that just monitors
> the communication channel and serves requests, nothing to do with
> FPGA framework. I have no objection to put it in misc or firmware
> directory.
>
> What I'm not sure about is whether a standalone utility driver in the kernel tree
> is preferred (at least I don't prefer it). You may talk to other maintainers.
>
My last question for this topic:
If we decide to upstream both the userPF and mgmtPF drivers together, could
they both be within drivers/fpga/amd as in-tree drivers? This will
help users find the source code easily.
Thanks,
David
>>
>>
>>>
>>>> utility which only handles requests from the userPF but has the management
>>>> privilege. Usually, cloud vendors require the mgmtPF to be deployed in their
>>>> secured domain (which can be a separate physical machine).
>>>
>>> I thought mgmtPF & userPF were just 2 functions on one PCIe card; are they not?
>>> Then how does the mgmtPF write data to another physical machine and
>>> influence the userPF?
>>>
>>
>> They are functions, but they also have different device IDs, so we can bind
>> a different driver onto each of them.
>>
>> I think, to be accurate, I should interpret it as Host and VM. See details
>> here please:
>> https://xilinx.github.io/XRT/2019.2/html/cloud_vendor_support.html
>>
>> But, my point is that we do have 2 separate drivers.
>
> OK, having 2 separate drivers is definitely fine.
>
> Thanks,
> Yilun
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-02-11 11:31 ` Yidong Zhang
@ 2025-03-01 8:20 ` Xu Yilun
2025-03-01 19:03 ` Yidong Zhang
0 siblings, 1 reply; 28+ messages in thread
From: Xu Yilun @ 2025-03-01 8:20 UTC (permalink / raw)
To: Yidong Zhang
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Hayden Laccabue
> My last question for this topic:
> If we decide to upstream both the userPF and mgmtPF drivers together, could
> they both be within drivers/fpga/amd as in-tree drivers? This will help users
I haven't looked into your full driver stack. Generally, if your drivers are
all about reprogramming, then yes. If they are also about all kinds of
acceleration functions, you'd better split them out into different domains.
I may not have enough knowledge to get this right.
Thanks,
Yilun
> find source code easily.
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-03-01 8:20 ` Xu Yilun
@ 2025-03-01 19:03 ` Yidong Zhang
2025-03-03 7:57 ` Xu Yilun
0 siblings, 1 reply; 28+ messages in thread
From: Yidong Zhang @ 2025-03-01 19:03 UTC (permalink / raw)
To: Xu Yilun
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Hayden Laccabue
On 3/1/25 00:20, Xu Yilun wrote:
>
>
>> My last question for this topic:
>> If we decide to upstream both the userPF and mgmtPF drivers together, could
>> they both be within drivers/fpga/amd as in-tree drivers? This will help users
>
> I haven't looked into your full driver stack. Generally, if your drivers are
> all about reprogramming, then yes. If they are also about all kinds of
> acceleration functions, you'd better split them out into different domains.
> I may not have enough knowledge to get this right.
>
The driver has more features than just re-programming. The re-programming
is already done in the embedded firmware; that's why the mgmtPF driver is
just a utility driver.
The userPF driver has features such as:
xdma (already in drivers/xilinx/xdma as a platform driver)
qdma (already in drivers/amd/qdma as a platform driver)
mailbox and more, which have not been upstreamed into the Linux kernel yet.
The driver architecture is:

userPF driver (as pci_driver)
    qdma (as platform_driver)
    ..
    mailbox (as platform_driver)
        /\
        ||
        \/
mailbox (as platform_driver)
mgmtPF driver (as pci_driver)
        /\
        ||
        \/
Embedded firmware (re-programming done here)
Right now, I am working on upstreaming the mgmtPF driver as a pci_driver.
In the future, I think the userPF driver should fit into drivers/fpga,
given that it should manage all these platform_drivers and utilize the
fpga_region callbacks to online/offline services due to hardware changes
after re-programming.
Thanks,
David
> Thanks,
> Yilun
>
>> find source code easily.
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-03-01 19:03 ` Yidong Zhang
@ 2025-03-03 7:57 ` Xu Yilun
2025-03-03 17:00 ` Yidong Zhang
0 siblings, 1 reply; 28+ messages in thread
From: Xu Yilun @ 2025-03-03 7:57 UTC (permalink / raw)
To: Yidong Zhang
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Hayden Laccabue
On Sat, Mar 01, 2025 at 11:03:29AM -0800, Yidong Zhang wrote:
>
>
> On 3/1/25 00:20, Xu Yilun wrote:
> >
> >
> > > My last question for this topic:
> > > If we decide to upstream both the userPF and mgmtPF drivers together, could
> > > they both be within drivers/fpga/amd as in-tree drivers? This will help users
> >
> > I haven't looked into your full driver stack. Generally, if your drivers are
> > all about reprogramming, then yes. If they are also about all kinds of
> > acceleration functions, you'd better split them out into different domains.
> > I may not have enough knowledge to get this right.
> >
>
> The driver has more features than just re-programming. The re-programming is
> already done in the embedded firmware; that's why the mgmtPF driver is just a
> utility driver.
>
> The userPF driver has features such as:
> xdma (already in drivers/xilinx/xdma as platform driver)
> qdma (already in drivers/amd/qdma as platform driver)
> mailbox and more which have not been upstreamed in linux kernel yet.
>
> The driver architecture is:
>
> userPF driver (as pci_driver)
> qdma (as platform_driver)
> ..
> mailbox (as platform_driver)
> /\
> ||
> \/
> mailbox (as platform_driver)
> mgmtPF driver (as pci_driver)
> /\
> ||
> \/
> Embedded firmware (re-programing done here)
>
> Right now, I am working on upstreaming the mgmtPF driver as pci_driver.
> In the future, I think the userPF driver should fit into
> drivers/fpga, given that it should manage all these platform_drivers and
No, I think the userPF driver should manage all these *platform_devices*.
Platform_drivers could be independent and put into the proper domain folders.
> utilize the fpga_region callbacks to online/offline services due to hardware
fpga_region should online/offline the platform devices, not the services, which
are the job of each platform_driver.
Thanks,
Yilun
> changes after re-programming.
>
> Thanks,
> David
>
> > Thanks,
> > Yilun
> >
> > > find source code easily.
* Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci
2025-03-03 7:57 ` Xu Yilun
@ 2025-03-03 17:00 ` Yidong Zhang
0 siblings, 0 replies; 28+ messages in thread
From: Yidong Zhang @ 2025-03-03 17:00 UTC (permalink / raw)
To: Xu Yilun
Cc: linux-kernel, linux-fpga, mdf, hao.wu, yilun.xu, lizhi.hou,
DMG Karthik, Nishad Saraf, Hayden Laccabue
On 3/2/25 23:57, Xu Yilun wrote:
>
>
> On Sat, Mar 01, 2025 at 11:03:29AM -0800, Yidong Zhang wrote:
>>
>>
>> On 3/1/25 00:20, Xu Yilun wrote:
>>>
>>>
>>>> My last question for this topic:
>>>> If we decide to upstream both the userPF and mgmtPF drivers together, could
>>>> they both be within drivers/fpga/amd as in-tree drivers? This will help users
>>>
>>> I haven't looked into your full driver stack. Generally, if your drivers are
>>> all about reprogramming, then yes. If they are also about all kinds of
>>> acceleration functions, you'd better split them out into different domains.
>>> I may not have enough knowledge to get this right.
>>>
>>
>> The driver has more features than just re-programming. The re-programming is
>> already done in the embedded firmware; that's why the mgmtPF driver is just a
>> utility driver.
>>
>> The userPF driver has features such as:
>> xdma (already in drivers/xilinx/xdma as platform driver)
>> qdma (already in drivers/amd/qdma as platform driver)
>> mailbox and more which have not been upstreamed in linux kernel yet.
>>
>> The driver architecture is:
>>
>> userPF driver (as pci_driver)
>> qdma (as platform_driver)
>> ..
>> mailbox (as platform_driver)
>> /\
>> ||
>> \/
>> mailbox (as platform_driver)
>> mgmtPF driver (as pci_driver)
>> /\
>> ||
>> \/
>> Embedded firmware (re-programing done here)
>>
>> Right now, I am working on upstreaming the mgmtPF driver as pci_driver.
>> In the future, I think the userPF driver should fit into
>> drivers/fpga, given that it should manage all these platform_drivers and
>
> No, I think the userPF driver should manage all these *platform_devices*.
Yes, the userPF driver manages all these "platform_devices" by leveraging the
fpga_region callback ops. That's why I am thinking the userPF driver should be
in drivers/fpga.
> Platform_drivers could be independent and put into proper domain folders.
Yes. All these "platform_devices" - their drivers - should be put into the
proper domains, i.e., not in drivers/fpga.
>
>> utilize the fpga_region callbacks to online/offline services due to hardware
>
> fpga_region should online/offline the platform devices, not the services, which
> are the job of each platform_driver.
Yes. I think we have the same concept in mind - online/offline the platform
devices, not the services like probe/remove. The online/offline are
separate ops designed just for taking the platform_devices online/offline.
Thanks,
David
>
> Thanks,
> Yilun
>
>> changes after re-programming.
>>
>> Thanks,
>> David
>>
>>> Thanks,
>>> Yilun
>>>
>>>> find source code easily.
end of thread, other threads:[~2025-03-03 17:00 UTC | newest]
Thread overview: 28+ messages
-- links below jump to the message on this page --
2024-12-10 18:37 [PATCH V2 0/4] Add versal-pci driver Yidong Zhang
2024-12-10 18:37 ` [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci Yidong Zhang
2025-01-26 9:24 ` Xu Yilun
2025-01-26 19:50 ` Yidong Zhang
2025-01-26 10:12 ` Christophe JAILLET
2025-01-26 19:56 ` Yidong Zhang
2025-01-26 10:16 ` Christophe JAILLET
2025-01-26 10:32 ` Xu Yilun
2025-01-26 19:46 ` Yidong Zhang
2025-02-06 4:15 ` Xu Yilun
2025-02-06 4:31 ` Yidong Zhang
2025-02-07 2:19 ` Xu Yilun
2025-02-07 3:16 ` Yidong Zhang
2025-02-07 4:40 ` Xu Yilun
2025-02-10 11:33 ` Yidong Zhang
2025-02-11 9:09 ` Xu Yilun
2025-02-11 11:31 ` Yidong Zhang
2025-03-01 8:20 ` Xu Yilun
2025-03-01 19:03 ` Yidong Zhang
2025-03-03 7:57 ` Xu Yilun
2025-03-03 17:00 ` Yidong Zhang
2024-12-10 18:37 ` [PATCH V2 2/4] drivers/fpga/amd: Add communication channel Yidong Zhang
2025-01-26 10:19 ` Christophe JAILLET
2025-01-26 19:57 ` Yidong Zhang
2024-12-10 18:37 ` [PATCH V2 3/4] drivers/fpga/amd: Add remote queue Yidong Zhang
2025-01-26 10:24 ` Christophe JAILLET
2024-12-10 18:37 ` [PATCH V2 4/4] drivers/fpga/amd: Add load xclbin and load firmware Yidong Zhang
2025-01-26 9:27 ` [PATCH V2 0/4] Add versal-pci driver Xu Yilun