netdev.vger.kernel.org archive mirror
* [PATCH V7 0/5] TPH and cache direct injection support
@ 2024-10-02 16:59 Wei Huang
  2024-10-02 16:59 ` [PATCH V7 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Wei Huang @ 2024-10-02 16:59 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

Hi All,

TPH (TLP Processing Hints) is a PCIe feature that allows endpoint
devices to provide optimization hints for requests that target memory
space. These hints, in a format called steering tag (ST), are provided
in the requester's TLP headers and allow the system hardware, including
the Root Complex, to optimize the utilization of platform resources
for the requests.

Upcoming AMD hardware implements a new Cache Injection feature that
leverages TPH. Cache Injection allows PCIe endpoints to inject I/O
Coherent DMA writes directly into an L2 within the CCX (core complex)
closest to the CPU core that will consume it. This technology is aimed
at applications requiring high performance and low latency, such as
networking and storage applications.

This series introduces generic TPH support in Linux, allowing STs to be
retrieved and used by PCIe endpoint drivers as needed. As a
demonstration, it includes an example usage in the Broadcom BNXT driver.
When running on Broadcom NICs with the appropriate firmware, it shows
substantial memory bandwidth savings and better network bandwidth using
real-world benchmarks. This solution is vendor-neutral and implemented
based on industry standards (PCIe Spec and PCI FW Spec).
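
For readers new to the API, the intended driver-side flow (enable TPH,
look up the ST for a CPU, program it into an ST entry) can be sketched in
userspace. Everything prefixed with mock_ or demo_ below is a hypothetical
stand-in written for illustration; only the call order and the argument
lists mirror the functions added by this series, and the tag values are
made up. In the real API, cpu_uid is the ACPI UID of the target CPU.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace stand-ins for illustration only; NOT the kernel code. */
enum tph_mem_type { TPH_MEM_TYPE_VM, TPH_MEM_TYPE_PM };
#define PCI_TPH_ST_IV_MODE	0x1	/* Interrupt Vector Mode */
#define MOCK_ST_ENTRIES		8	/* assumed table size for the mock */

struct mock_pci_dev {
	int tph_enabled;
	uint16_t st_table[MOCK_ST_ENTRIES];
};

/* Mirrors pcie_enable_tph(pdev, mode): a second enable reports -EBUSY. */
static int mock_pcie_enable_tph(struct mock_pci_dev *pdev, int mode)
{
	(void)mode;
	if (pdev->tph_enabled)
		return -EBUSY;
	pdev->tph_enabled = 1;
	return 0;
}

/* Mirrors pcie_tph_get_cpu_st(); the returned tag is invented here. */
static int mock_pcie_tph_get_cpu_st(struct mock_pci_dev *pdev,
				    enum tph_mem_type mem_type,
				    unsigned int cpu_uid, uint16_t *tag)
{
	(void)pdev;
	(void)mem_type;
	*tag = (uint16_t)(0x40 + cpu_uid);
	return 0;
}

/* Mirrors pcie_tph_set_st_entry(): requires TPH to be enabled first. */
static int mock_pcie_tph_set_st_entry(struct mock_pci_dev *pdev,
				      unsigned int index, uint16_t tag)
{
	if (!pdev->tph_enabled)
		return -EINVAL;
	if (index >= MOCK_ST_ENTRIES)
		return -ENXIO;
	pdev->st_table[index] = tag;
	return 0;
}

/* Hypothetical driver helper: steer one queue's DMA writes to a CPU. */
static int demo_steer_queue(struct mock_pci_dev *pdev, unsigned int queue,
			    unsigned int cpu_uid)
{
	uint16_t tag;
	int err;

	err = mock_pcie_tph_get_cpu_st(pdev, TPH_MEM_TYPE_VM, cpu_uid, &tag);
	if (err)
		return err;

	return mock_pcie_tph_set_st_entry(pdev, queue, tag);
}
```

A driver would typically call the enable step once at probe time and
repeat the lookup/program step whenever a queue's IRQ affinity changes.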

V6->V7:
 * Rebase on top of the latest pci/main (6.12-rc1)
 * Fix compilation warning/error on clang-18 with W=1 (test robot)
 * Revise commit messages for Patch #2, #4, and #5 (Bjorn)
 * Add more _DSM method description for reference in Patch #2 (Bjorn)
 * Remove "default n" in Kconfig (Lukas)

V5->V6:
 * Rebase on top of pci/main (tag: pci-v6.12-changes)
 * Fix spellings and FIELD_PREP/bnxt.c compilation errors (Simon)
 * Move tph.c to drivers/pci directory (Lukas)
 * Remove CONFIG_ACPI dependency (Lukas)
 * Slightly re-arrange save/restore sequence (Lukas)

V4->V5:
 * Rebase on top of net-next/main tree (Broadcom)
 * Remove TPH mode query and TPH enabled checking functions (Bjorn)
 * Remove "nostmode" kernel parameter (Bjorn)
 * Add "notph" kernel parameter support (Bjorn)
 * Add back TPH documentation (Bjorn)
 * Change TPH register namings (Bjorn)
 * Squash TPH enable/disable/save/restore funcs as a single patch (Bjorn)
 * Squash ST get_st/set_st funcs as a single patch (Bjorn)
 * Replace nic_open/close with netdev_rx_queue_restart() (Jakub, Broadcom)

V3->V4:
 * Rebase on top of the latest pci/next tree (tag: 6.11-rc1)
 * Add new API functions to query/enable/disable TPH support
 * Make pcie_tph_set_st() completely independent from pcie_tph_get_cpu_st()
 * Rewrite bnxt.c based on new APIs
 * Remove documentation for now due to constantly changing API
 * Remove pci=notph, but keep pci=nostmode with better flow (Bjorn)
 * Lots of code rewrite in tph.c & pci-tph.h with cleaner interface (Bjorn)
 * Add TPH save/restore support (Paul Luse and Lukas Wunner)

V2->V3:
 * Rebase on top of pci/next tree (tag: pci-v6.11-changes)
 * Redefine PCI TPH registers (pci_regs.h) without breaking uapi
 * Fix commit subjects/messages for kernel options (Jonathan and Bjorn)
 * Break API functions into three individual patches for easy review
 * Rewrite lots of code in tph.c/tph.h based on feedback (Jonathan and Bjorn)

V1->V2:
 * Rebase on top of pci.git/for-linus (6.10-rc1)
 * Address mismatched data types reported by Sparse (Sparse check passed)
 * Add pcie_tph_intr_vec_supported() for checking IRQ mode support
 * Skip bnxt affinity notifier registration if
   pcie_tph_intr_vec_supported()=false
 * Minor fixes in bnxt driver (e.g. warning messages)

Manoj Panicker (1):
  bnxt_en: Add TPH support in BNXT driver

Michael Chan (1):
  bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings

Wei Huang (3):
  PCI: Add TLP Processing Hints (TPH) support
  PCI/TPH: Add Steering Tag support
  PCI/TPH: Add TPH documentation

 Documentation/PCI/index.rst                   |   1 +
 Documentation/PCI/tph.rst                     | 132 +++++
 .../admin-guide/kernel-parameters.txt         |   4 +
 Documentation/driver-api/pci/pci.rst          |   3 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  91 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   7 +
 drivers/pci/Kconfig                           |   9 +
 drivers/pci/Makefile                          |   1 +
 drivers/pci/pci.c                             |   4 +
 drivers/pci/pci.h                             |  12 +
 drivers/pci/probe.c                           |   1 +
 drivers/pci/tph.c                             | 546 ++++++++++++++++++
 include/linux/pci-tph.h                       |  44 ++
 include/linux/pci.h                           |   7 +
 include/uapi/linux/pci_regs.h                 |  37 +-
 net/core/netdev_rx_queue.c                    |   1 +
 16 files changed, 890 insertions(+), 10 deletions(-)
 create mode 100644 Documentation/PCI/tph.rst
 create mode 100644 drivers/pci/tph.c
 create mode 100644 include/linux/pci-tph.h


base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
-- 
2.46.0



* [PATCH V7 1/5] PCI: Add TLP Processing Hints (TPH) support
  2024-10-02 16:59 [PATCH V7 0/5] TPH and cache direct injection support Wei Huang
@ 2024-10-02 16:59 ` Wei Huang
  2024-10-02 16:59 ` [PATCH V7 2/5] PCI/TPH: Add Steering Tag support Wei Huang
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Wei Huang @ 2024-10-02 16:59 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

Add support for PCIe TLP Processing Hints (TPH) (see PCIe r6.2,
sec 6.17).

Add missing TPH register definitions in pci_regs.h, including the TPH
Requester capability register, TPH Requester control register, TPH
Completer capability, and the ST fields of MSI-X entry.

Introduce pcie_enable_tph() and pcie_disable_tph(), enabling drivers to
toggle TPH support and configure specific ST mode as needed. Also add a
new kernel parameter, "pci=notph", allowing users to disable TPH support
across the entire system.
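
The two checks pcie_enable_tph() performs before touching the control
register can be illustrated standalone. This is a minimal sketch, not the
kernel code: the constants are copied from the pci_regs.h hunk in this
patch, while the helper names tph_mode_supported() and
tph_negotiate_req_type() are hypothetical.

```c
#include <stdint.h>

/* Values copied from the pci_regs.h changes in this patch. */
#define PCI_TPH_CAP_ST_NS		0x00000001u /* No ST Mode Supported */
#define PCI_TPH_CAP_ST_IV		0x00000002u /* Interrupt Vector Mode */
#define PCI_TPH_CAP_ST_DS		0x00000004u /* Device Specific Mode */
#define PCI_TPH_CTRL_MODE_SEL_MASK	0x00000007u
#define PCI_TPH_ST_NS_MODE		0x0
#define PCI_TPH_ST_IV_MODE		0x1
#define PCI_TPH_ST_DS_MODE		0x2
#define PCI_TPH_REQ_DISABLE		0x0
#define PCI_TPH_REQ_TPH_ONLY		0x1
#define PCI_TPH_REQ_EXT_TPH		0x3

/*
 * Hypothetical helper: mode N is usable when bit N of the capability's
 * "ST modes supported" bits is set.
 */
static int tph_mode_supported(uint8_t dev_modes, int mode)
{
	mode &= PCI_TPH_CTRL_MODE_SEL_MASK;
	return ((1u << mode) & dev_modes) != 0;
}

/*
 * Hypothetical helper: the request type is the smaller of what the device
 * can issue and what its Root Port can complete. A result of
 * PCI_TPH_REQ_DISABLE means TPH cannot be enabled.
 */
static uint8_t tph_negotiate_req_type(int dev_ext_tph, uint8_t rp_completer)
{
	uint8_t dev_type = dev_ext_tph ? PCI_TPH_REQ_EXT_TPH
				       : PCI_TPH_REQ_TPH_ONLY;

	return dev_type < rp_completer ? dev_type : rp_completer;
}
```

Taking the minimum works because the encodings are ordered: 0 (none),
1 (TPH only), 3 (TPH and Extended TPH).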

Co-developed-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com>
Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com>
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
---
 .../admin-guide/kernel-parameters.txt         |   4 +
 drivers/pci/Kconfig                           |   9 +
 drivers/pci/Makefile                          |   1 +
 drivers/pci/pci.c                             |   4 +
 drivers/pci/pci.h                             |  12 ++
 drivers/pci/probe.c                           |   1 +
 drivers/pci/tph.c                             | 197 ++++++++++++++++++
 include/linux/pci-tph.h                       |  21 ++
 include/linux/pci.h                           |   7 +
 include/uapi/linux/pci_regs.h                 |  37 +++-
 10 files changed, 285 insertions(+), 8 deletions(-)
 create mode 100644 drivers/pci/tph.c
 create mode 100644 include/linux/pci-tph.h

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1518343bbe22..178995b07451 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4678,6 +4678,10 @@
 		nomio		[S390] Do not use MIO instructions.
 		norid		[S390] ignore the RID field and force use of
 				one PCI domain per PCI function
+		notph		[PCIE] If the PCIE_TPH kernel config parameter
+				is enabled, this kernel boot option can be used
+				to disable PCIe TLP Processing Hints support
+				system-wide.
 
 	pcie_aspm=	[PCIE] Forcibly enable or ignore PCIe Active State Power
 			Management.
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 0d94e4a967d8..2f270e4414b3 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -173,6 +173,15 @@ config PCI_PASID
 
 	  If unsure, say N.
 
+config PCIE_TPH
+	bool "TLP Processing Hints"
+	help
+	  This option adds support for PCIe TLP Processing Hints (TPH).
+	  TPH allows endpoint devices to provide optimization hints, such as
+	  desired caching behavior, for requests that target memory space.
+	  These hints, called Steering Tags, can empower the system hardware
+	  to optimize the utilization of platform resources.
+
 config PCI_P2PDMA
 	bool "PCI peer-to-peer transfer support"
 	depends on ZONE_DEVICE
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 374c5c06d92f..b2a100f2e24a 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -36,6 +36,7 @@ obj-$(CONFIG_VGA_ARB)		+= vgaarb.o
 obj-$(CONFIG_PCI_DOE)		+= doe.o
 obj-$(CONFIG_PCI_DYNAMIC_OF_NODES) += of_property.o
 obj-$(CONFIG_PCI_NPEM)		+= npem.o
+obj-$(CONFIG_PCIE_TPH)		+= tph.o
 
 # Endpoint library must be initialized before its users
 obj-$(CONFIG_PCI_ENDPOINT)	+= endpoint/
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 7d85c04fbba2..89dafecc869b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1828,6 +1828,7 @@ int pci_save_state(struct pci_dev *dev)
 	pci_save_dpc_state(dev);
 	pci_save_aer_state(dev);
 	pci_save_ptm_state(dev);
+	pci_save_tph_state(dev);
 	return pci_save_vc_state(dev);
 }
 EXPORT_SYMBOL(pci_save_state);
@@ -1933,6 +1934,7 @@ void pci_restore_state(struct pci_dev *dev)
 	pci_restore_rebar_state(dev);
 	pci_restore_dpc_state(dev);
 	pci_restore_ptm_state(dev);
+	pci_restore_tph_state(dev);
 
 	pci_aer_clear_status(dev);
 	pci_restore_aer_state(dev);
@@ -6896,6 +6898,8 @@ static int __init pci_setup(char *str)
 				pci_no_domains();
 			} else if (!strncmp(str, "noari", 5)) {
 				pcie_ari_disabled = true;
+			} else if (!strncmp(str, "notph", 5)) {
+				pci_no_tph();
 			} else if (!strncmp(str, "cbiosize=", 9)) {
 				pci_cardbus_io_size = memparse(str + 9, &str);
 			} else if (!strncmp(str, "cbmemsize=", 10)) {
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 14d00ce45bfa..d89fdbf04f36 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -597,6 +597,18 @@ static inline int pci_iov_bus_range(struct pci_bus *bus)
 
 #endif /* CONFIG_PCI_IOV */
 
+#ifdef CONFIG_PCIE_TPH
+void pci_restore_tph_state(struct pci_dev *dev);
+void pci_save_tph_state(struct pci_dev *dev);
+void pci_no_tph(void);
+void pci_tph_init(struct pci_dev *dev);
+#else
+static inline void pci_restore_tph_state(struct pci_dev *dev) { }
+static inline void pci_save_tph_state(struct pci_dev *dev) { }
+static inline void pci_no_tph(void) { }
+static inline void pci_tph_init(struct pci_dev *dev) { }
+#endif
+
 #ifdef CONFIG_PCIE_PTM
 void pci_ptm_init(struct pci_dev *dev);
 void pci_save_ptm_state(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4f68414c3086..b086d53a9048 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2495,6 +2495,7 @@ static void pci_init_capabilities(struct pci_dev *dev)
 	pci_dpc_init(dev);		/* Downstream Port Containment */
 	pci_rcec_init(dev);		/* Root Complex Event Collector */
 	pci_doe_init(dev);		/* Data Object Exchange */
+	pci_tph_init(dev);		/* TLP Processing Hints */
 
 	pcie_report_downtraining(dev);
 	pci_init_reset_methods(dev);
diff --git a/drivers/pci/tph.c b/drivers/pci/tph.c
new file mode 100644
index 000000000000..6c6b500c2eaa
--- /dev/null
+++ b/drivers/pci/tph.c
@@ -0,0 +1,197 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * TPH (TLP Processing Hints) support
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc.
+ *     Eric Van Tassell <Eric.VanTassell@amd.com>
+ *     Wei Huang <wei.huang2@amd.com>
+ */
+#include <linux/pci.h>
+#include <linux/bitfield.h>
+#include <linux/pci-tph.h>
+
+#include "pci.h"
+
+/* System-wide TPH disabled */
+static bool pci_tph_disabled;
+
+static u8 get_st_modes(struct pci_dev *pdev)
+{
+	u32 reg;
+
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
+	reg &= PCI_TPH_CAP_ST_NS | PCI_TPH_CAP_ST_IV | PCI_TPH_CAP_ST_DS;
+
+	return reg;
+}
+
+/* Return device's Root Port completer capability */
+static u8 get_rp_completer_type(struct pci_dev *pdev)
+{
+	struct pci_dev *rp;
+	u32 reg;
+	int ret;
+
+	rp = pcie_find_root_port(pdev);
+	if (!rp)
+		return 0;
+
+	ret = pcie_capability_read_dword(rp, PCI_EXP_DEVCAP2, &reg);
+	if (ret)
+		return 0;
+
+	return FIELD_GET(PCI_EXP_DEVCAP2_TPH_COMP_MASK, reg);
+}
+
+/**
+ * pcie_disable_tph - Turn off TPH support for device
+ * @pdev: PCI device
+ *
+ * Return: none
+ */
+void pcie_disable_tph(struct pci_dev *pdev)
+{
+	if (!pdev->tph_cap)
+		return;
+
+	if (!pdev->tph_enabled)
+		return;
+
+	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, 0);
+
+	pdev->tph_mode = 0;
+	pdev->tph_req_type = 0;
+	pdev->tph_enabled = 0;
+}
+EXPORT_SYMBOL(pcie_disable_tph);
+
+/**
+ * pcie_enable_tph - Enable TPH support for device using a specific ST mode
+ * @pdev: PCI device
+ * @mode: ST mode to enable. Current supported modes include:
+ *
+ *   - PCI_TPH_ST_NS_MODE: No ST Mode
+ *   - PCI_TPH_ST_IV_MODE: Interrupt Vector Mode
+ *   - PCI_TPH_ST_DS_MODE: Device Specific Mode
+ *
+ * Checks whether the mode is actually supported by the device before enabling
+ * and returns an error if not. Additionally determines what types of requests,
+ * TPH or extended TPH, can be issued by the device based on its TPH requester
+ * capability and the Root Port's completer capability.
+ *
+ * Return: 0 on success, otherwise negative value (-errno)
+ */
+int pcie_enable_tph(struct pci_dev *pdev, int mode)
+{
+	u32 reg;
+	u8 dev_modes;
+	u8 rp_req_type;
+
+	/* Honor "notph" kernel parameter */
+	if (pci_tph_disabled)
+		return -EINVAL;
+
+	if (!pdev->tph_cap)
+		return -EINVAL;
+
+	if (pdev->tph_enabled)
+		return -EBUSY;
+
+	/* Sanitize and check ST mode compatibility */
+	mode &= PCI_TPH_CTRL_MODE_SEL_MASK;
+	dev_modes = get_st_modes(pdev);
+	if (!((1 << mode) & dev_modes))
+		return -EINVAL;
+
+	pdev->tph_mode = mode;
+
+	/* Get req_type supported by device and its Root Port */
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
+	if (FIELD_GET(PCI_TPH_CAP_EXT_TPH, reg))
+		pdev->tph_req_type = PCI_TPH_REQ_EXT_TPH;
+	else
+		pdev->tph_req_type = PCI_TPH_REQ_TPH_ONLY;
+
+	rp_req_type = get_rp_completer_type(pdev);
+
+	/* Final req_type is the smallest value of two */
+	pdev->tph_req_type = min(pdev->tph_req_type, rp_req_type);
+
+	if (pdev->tph_req_type == PCI_TPH_REQ_DISABLE)
+		return -EINVAL;
+
+	/* Write them into TPH control register */
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, &reg);
+
+	reg &= ~PCI_TPH_CTRL_MODE_SEL_MASK;
+	reg |= FIELD_PREP(PCI_TPH_CTRL_MODE_SEL_MASK, pdev->tph_mode);
+
+	reg &= ~PCI_TPH_CTRL_REQ_EN_MASK;
+	reg |= FIELD_PREP(PCI_TPH_CTRL_REQ_EN_MASK, pdev->tph_req_type);
+
+	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg);
+
+	pdev->tph_enabled = 1;
+
+	return 0;
+}
+EXPORT_SYMBOL(pcie_enable_tph);
+
+void pci_restore_tph_state(struct pci_dev *pdev)
+{
+	struct pci_cap_saved_state *save_state;
+	u32 *cap;
+
+	if (!pdev->tph_cap)
+		return;
+
+	if (!pdev->tph_enabled)
+		return;
+
+	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_TPH);
+	if (!save_state)
+		return;
+
+	/* Restore control register and all ST entries */
+	cap = &save_state->cap.data[0];
+	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, *cap++);
+}
+
+void pci_save_tph_state(struct pci_dev *pdev)
+{
+	struct pci_cap_saved_state *save_state;
+	u32 *cap;
+
+	if (!pdev->tph_cap)
+		return;
+
+	if (!pdev->tph_enabled)
+		return;
+
+	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_TPH);
+	if (!save_state)
+		return;
+
+	/* Save control register */
+	cap = &save_state->cap.data[0];
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, cap++);
+}
+
+void pci_no_tph(void)
+{
+	pci_tph_disabled = true;
+
+	pr_info("PCIe TPH is disabled\n");
+}
+
+void pci_tph_init(struct pci_dev *pdev)
+{
+	u32 save_size;
+
+	pdev->tph_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_TPH);
+	if (!pdev->tph_cap)
+		return;
+
+	save_size = sizeof(u32);
+	pci_add_ext_cap_save_buffer(pdev, PCI_EXT_CAP_ID_TPH, save_size);
+}
diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
new file mode 100644
index 000000000000..58654a334ffb
--- /dev/null
+++ b/include/linux/pci-tph.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * TPH (TLP Processing Hints)
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc.
+ *     Eric Van Tassell <Eric.VanTassell@amd.com>
+ *     Wei Huang <wei.huang2@amd.com>
+ */
+#ifndef LINUX_PCI_TPH_H
+#define LINUX_PCI_TPH_H
+
+#ifdef CONFIG_PCIE_TPH
+void pcie_disable_tph(struct pci_dev *pdev);
+int pcie_enable_tph(struct pci_dev *pdev, int mode);
+#else
+static inline void pcie_disable_tph(struct pci_dev *pdev) { }
+static inline int pcie_enable_tph(struct pci_dev *pdev, int mode)
+{ return -EINVAL; }
+#endif
+
+#endif /* LINUX_PCI_TPH_H */
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 573b4c4c2be6..8351d76b6e12 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -434,6 +434,7 @@ struct pci_dev {
 	unsigned int	ats_enabled:1;		/* Address Translation Svc */
 	unsigned int	pasid_enabled:1;	/* Process Address Space ID */
 	unsigned int	pri_enabled:1;		/* Page Request Interface */
+	unsigned int	tph_enabled:1;		/* TLP Processing Hints */
 	unsigned int	is_managed:1;		/* Managed via devres */
 	unsigned int	is_msi_managed:1;	/* MSI release via devres installed */
 	unsigned int	needs_freset:1;		/* Requires fundamental reset */
@@ -534,6 +535,12 @@ struct pci_dev {
 
 	/* These methods index pci_reset_fn_methods[] */
 	u8 reset_methods[PCI_NUM_RESET_METHODS]; /* In priority order */
+
+#ifdef CONFIG_PCIE_TPH
+	u16		tph_cap;	/* TPH capability offset */
+	u8		tph_mode;	/* TPH mode */
+	u8		tph_req_type;	/* TPH requester type */
+#endif
 };
 
 static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 12323b3334a9..155dea741615 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -340,7 +340,8 @@
 #define PCI_MSIX_ENTRY_UPPER_ADDR	0x4  /* Message Upper Address */
 #define PCI_MSIX_ENTRY_DATA		0x8  /* Message Data */
 #define PCI_MSIX_ENTRY_VECTOR_CTRL	0xc  /* Vector Control */
-#define  PCI_MSIX_ENTRY_CTRL_MASKBIT	0x00000001
+#define  PCI_MSIX_ENTRY_CTRL_MASKBIT	0x00000001  /* Mask Bit */
+#define  PCI_MSIX_ENTRY_CTRL_ST		0xffff0000  /* Steering Tag */
 
 /* CompactPCI Hotswap Register */
 
@@ -659,6 +660,7 @@
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100 /* 64b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP128	0x00000200 /* 128b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_LTR		0x00000800 /* Latency tolerance reporting */
+#define  PCI_EXP_DEVCAP2_TPH_COMP_MASK	0x00003000 /* TPH completer support */
 #define  PCI_EXP_DEVCAP2_OBFF_MASK	0x000c0000 /* OBFF support mechanism */
 #define  PCI_EXP_DEVCAP2_OBFF_MSG	0x00040000 /* New message signaling */
 #define  PCI_EXP_DEVCAP2_OBFF_WAKE	0x00080000 /* Re-use WAKE# for OBFF */
@@ -1023,15 +1025,34 @@
 #define  PCI_DPA_CAP_SUBSTATE_MASK	0x1F	/* # substates - 1 */
 #define PCI_DPA_BASE_SIZEOF	16	/* size with 0 substates */
 
+/* TPH Completer Support */
+#define PCI_EXP_DEVCAP2_TPH_COMP_NONE		0x0 /* None */
+#define PCI_EXP_DEVCAP2_TPH_COMP_TPH_ONLY	0x1 /* TPH only */
+#define PCI_EXP_DEVCAP2_TPH_COMP_EXT_TPH	0x3 /* TPH and Extended TPH */
+
 /* TPH Requester */
 #define PCI_TPH_CAP		4	/* capability register */
-#define  PCI_TPH_CAP_LOC_MASK	0x600	/* location mask */
-#define   PCI_TPH_LOC_NONE	0x000	/* no location */
-#define   PCI_TPH_LOC_CAP	0x200	/* in capability */
-#define   PCI_TPH_LOC_MSIX	0x400	/* in MSI-X */
-#define PCI_TPH_CAP_ST_MASK	0x07FF0000	/* ST table mask */
-#define PCI_TPH_CAP_ST_SHIFT	16	/* ST table shift */
-#define PCI_TPH_BASE_SIZEOF	0xc	/* size with no ST table */
+#define  PCI_TPH_CAP_ST_NS	0x00000001 /* No ST Mode Supported */
+#define  PCI_TPH_CAP_ST_IV	0x00000002 /* Interrupt Vector Mode Supported */
+#define  PCI_TPH_CAP_ST_DS	0x00000004 /* Device Specific Mode Supported */
+#define  PCI_TPH_CAP_EXT_TPH	0x00000100 /* Ext TPH Requester Supported */
+#define  PCI_TPH_CAP_LOC_MASK	0x00000600 /* ST Table Location */
+#define   PCI_TPH_LOC_NONE	0x00000000 /* Not present */
+#define   PCI_TPH_LOC_CAP	0x00000200 /* In capability */
+#define   PCI_TPH_LOC_MSIX	0x00000400 /* In MSI-X */
+#define  PCI_TPH_CAP_ST_MASK	0x07FF0000 /* ST Table Size */
+#define  PCI_TPH_CAP_ST_SHIFT	16	/* ST Table Size shift */
+#define PCI_TPH_BASE_SIZEOF	0xc	/* Size with no ST table */
+
+#define PCI_TPH_CTRL		8	/* control register */
+#define  PCI_TPH_CTRL_MODE_SEL_MASK	0x00000007 /* ST Mode Select */
+#define   PCI_TPH_ST_NS_MODE		0x0 /* No ST Mode */
+#define   PCI_TPH_ST_IV_MODE		0x1 /* Interrupt Vector Mode */
+#define   PCI_TPH_ST_DS_MODE		0x2 /* Device Specific Mode */
+#define  PCI_TPH_CTRL_REQ_EN_MASK	0x00000300 /* TPH Requester Enable */
+#define   PCI_TPH_REQ_DISABLE		0x0 /* No TPH requests allowed */
+#define   PCI_TPH_REQ_TPH_ONLY		0x1 /* TPH only requests allowed */
+#define   PCI_TPH_REQ_EXT_TPH		0x3 /* Extended TPH requests allowed */
 
 /* Downstream Port Containment */
 #define PCI_EXP_DPC_CAP			0x04	/* DPC Capability */
-- 
2.46.0



* [PATCH V7 2/5] PCI/TPH: Add Steering Tag support
  2024-10-02 16:59 [PATCH V7 0/5] TPH and cache direct injection support Wei Huang
  2024-10-02 16:59 ` [PATCH V7 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
@ 2024-10-02 16:59 ` Wei Huang
  2025-02-04 18:33   ` Robin Murphy
  2024-10-02 16:59 ` [PATCH V7 3/5] PCI/TPH: Add TPH documentation Wei Huang
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Wei Huang @ 2024-10-02 16:59 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

Add pcie_tph_get_cpu_st() to allow a caller to retrieve Steering Tags
for a target memory that is associated with a specific CPU. The ST is
retrieved by invoking the PCI ACPI _DSM method (rev=0x7, func=0xF) of
the device's Root Port device.
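
For reference, the 64-bit value returned by the _DSM can also be decoded
with plain shifts and masks. The sketch below follows the field layout of
the st_info union in this patch and assumes its bitfields are allocated
from the least significant bit (as on x86). The names st_info_extract()
and enum demo_mem_type are hypothetical and not part of the patch.

```c
#include <stdint.h>

enum demo_mem_type { DEMO_MEM_VM, DEMO_MEM_PM };

/*
 * Per 32-bit half (low half = volatile memory, high half = persistent):
 *   bit 0       8-bit ST valid
 *   bit 1       16-bit extended ST valid
 *   bit 2       PH ignore
 *   bits 8-15   8-bit ST
 *   bits 16-31  16-bit extended ST
 *
 * Returns the requested tag, or 0 when the firmware marked it invalid.
 */
static uint16_t st_info_extract(uint64_t value, enum demo_mem_type type,
				int want_ext)
{
	uint32_t half = (type == DEMO_MEM_PM) ? (uint32_t)(value >> 32)
					      : (uint32_t)value;

	if (want_ext)
		return (half & 0x2) ? (uint16_t)(half >> 16) : 0;

	return (half & 0x1) ? (uint8_t)(half >> 8) : 0;
}
```

Returning 0 for an invalid tag matches what tph_extract_tag() in this
patch does for the same case.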

Add pcie_tph_set_st_entry() to support updating the device's Steering
Tags. The tags will be written into the device's MSI-X table or the
ST table located in the TPH Extended Capability space.
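
The two tag locations differ only in address arithmetic, which can be
sketched standalone. This is an illustration, not the kernel code:
PCI_MSIX_ENTRY_CTRL_ST and PCI_TPH_BASE_SIZEOF come from this series,
PCI_MSIX_ENTRY_SIZE and PCI_MSIX_ENTRY_VECTOR_CTRL are the existing
pci_regs.h values, and the three helper names are hypothetical.

```c
#include <stdint.h>

#define PCI_MSIX_ENTRY_SIZE		16
#define PCI_MSIX_ENTRY_VECTOR_CTRL	0xc
#define PCI_MSIX_ENTRY_CTRL_ST		0xffff0000u /* Steering Tag */
#define PCI_TPH_BASE_SIZEOF		0xc	    /* size with no ST table */

/* Byte offset of vector idx's Vector Control dword within the MSI-X table. */
static uint32_t msix_vec_ctrl_offset(unsigned int idx)
{
	return idx * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL;
}

/* Place the tag in bits 31:16 and keep the low bits (e.g. the mask bit). */
static uint32_t msix_ctrl_with_tag(uint32_t vec_ctrl, uint16_t tag)
{
	vec_ctrl &= ~PCI_MSIX_ENTRY_CTRL_ST;
	vec_ctrl |= (uint32_t)tag << 16;
	return vec_ctrl;
}

/* Config-space offset of a 16-bit entry in the capability's ST table. */
static uint32_t st_table_entry_offset(uint16_t tph_cap, unsigned int index)
{
	return tph_cap + PCI_TPH_BASE_SIZEOF +
	       index * (uint32_t)sizeof(uint16_t);
}
```

In the MSI-X case the write is a read-modify-write so that the per-vector
mask bit is preserved.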

Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
 drivers/pci/tph.c       | 351 +++++++++++++++++++++++++++++++++++++++-
 include/linux/pci-tph.h |  23 +++
 2 files changed, 373 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/tph.c b/drivers/pci/tph.c
index 6c6b500c2eaa..9a268653866d 100644
--- a/drivers/pci/tph.c
+++ b/drivers/pci/tph.c
@@ -7,6 +7,8 @@
  *     Wei Huang <wei.huang2@amd.com>
  */
 #include <linux/pci.h>
+#include <linux/pci-acpi.h>
+#include <linux/msi.h>
 #include <linux/bitfield.h>
 #include <linux/pci-tph.h>
 
@@ -15,6 +17,134 @@
 /* System-wide TPH disabled */
 static bool pci_tph_disabled;
 
+#ifdef CONFIG_ACPI
+/*
+ * The st_info struct defines the Steering Tag (ST) info returned by the
+ * firmware PCI ACPI _DSM method (rev=0x7, func=0xF, "_DSM to Query Cache
+ * Locality TPH Features"), as specified in the approved ECN for PCI Firmware
+ * Spec and available at https://members.pcisig.com/wg/PCI-SIG/document/15470.
+ *
+ * @vm_st_valid:  8-bit ST for volatile memory is valid
+ * @vm_xst_valid: 16-bit extended ST for volatile memory is valid
+ * @vm_ph_ignore: 1 => PH was and will be ignored, 0 => PH should be supplied
+ * @vm_st:        8-bit ST for volatile mem
+ * @vm_xst:       16-bit extended ST for volatile mem
+ * @pm_st_valid:  8-bit ST for persistent memory is valid
+ * @pm_xst_valid: 16-bit extended ST for persistent memory is valid
+ * @pm_ph_ignore: 1 => PH was and will be ignored, 0 => PH should be supplied
+ * @pm_st:        8-bit ST for persistent mem
+ * @pm_xst:       16-bit extended ST for persistent mem
+ */
+union st_info {
+	struct {
+		u64 vm_st_valid : 1;
+		u64 vm_xst_valid : 1;
+		u64 vm_ph_ignore : 1;
+		u64 rsvd1 : 5;
+		u64 vm_st : 8;
+		u64 vm_xst : 16;
+		u64 pm_st_valid : 1;
+		u64 pm_xst_valid : 1;
+		u64 pm_ph_ignore : 1;
+		u64 rsvd2 : 5;
+		u64 pm_st : 8;
+		u64 pm_xst : 16;
+	};
+	u64 value;
+};
+
+static u16 tph_extract_tag(enum tph_mem_type mem_type, u8 req_type,
+			   union st_info *info)
+{
+	switch (req_type) {
+	case PCI_TPH_REQ_TPH_ONLY: /* 8-bit tag */
+		switch (mem_type) {
+		case TPH_MEM_TYPE_VM:
+			if (info->vm_st_valid)
+				return info->vm_st;
+			break;
+		case TPH_MEM_TYPE_PM:
+			if (info->pm_st_valid)
+				return info->pm_st;
+			break;
+		}
+		break;
+	case PCI_TPH_REQ_EXT_TPH: /* 16-bit tag */
+		switch (mem_type) {
+		case TPH_MEM_TYPE_VM:
+			if (info->vm_xst_valid)
+				return info->vm_xst;
+			break;
+		case TPH_MEM_TYPE_PM:
+			if (info->pm_xst_valid)
+				return info->pm_xst;
+			break;
+		}
+		break;
+	default:
+		return 0;
+	}
+
+	return 0;
+}
+
+#define TPH_ST_DSM_FUNC_INDEX	0xF
+static acpi_status tph_invoke_dsm(acpi_handle handle, u32 cpu_uid,
+				  union st_info *st_out)
+{
+	union acpi_object arg3[3], in_obj, *out_obj;
+
+	if (!acpi_check_dsm(handle, &pci_acpi_dsm_guid, 7,
+			    BIT(TPH_ST_DSM_FUNC_INDEX)))
+		return AE_ERROR;
+
+	/* DWORD: feature ID (0 for processor cache ST query) */
+	arg3[0].integer.type = ACPI_TYPE_INTEGER;
+	arg3[0].integer.value = 0;
+
+	/* DWORD: target UID */
+	arg3[1].integer.type = ACPI_TYPE_INTEGER;
+	arg3[1].integer.value = cpu_uid;
+
+	/* QWORD: properties, all 0's */
+	arg3[2].integer.type = ACPI_TYPE_INTEGER;
+	arg3[2].integer.value = 0;
+
+	in_obj.type = ACPI_TYPE_PACKAGE;
+	in_obj.package.count = ARRAY_SIZE(arg3);
+	in_obj.package.elements = arg3;
+
+	out_obj = acpi_evaluate_dsm(handle, &pci_acpi_dsm_guid, 7,
+				    TPH_ST_DSM_FUNC_INDEX, &in_obj);
+	if (!out_obj)
+		return AE_ERROR;
+
+	if (out_obj->type != ACPI_TYPE_BUFFER) {
+		ACPI_FREE(out_obj);
+		return AE_ERROR;
+	}
+
+	st_out->value = *((u64 *)(out_obj->buffer.pointer));
+
+	ACPI_FREE(out_obj);
+
+	return AE_OK;
+}
+#endif
+
+/* Update the TPH Requester Enable field of TPH Control Register */
+static void set_ctrl_reg_req_en(struct pci_dev *pdev, u8 req_type)
+{
+	u32 reg;
+
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, &reg);
+
+	reg &= ~PCI_TPH_CTRL_REQ_EN_MASK;
+	reg |= FIELD_PREP(PCI_TPH_CTRL_REQ_EN_MASK, req_type);
+
+	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg);
+}
+
 static u8 get_st_modes(struct pci_dev *pdev)
 {
 	u32 reg;
@@ -25,6 +155,37 @@ static u8 get_st_modes(struct pci_dev *pdev)
 	return reg;
 }
 
+static u32 get_st_table_loc(struct pci_dev *pdev)
+{
+	u32 reg;
+
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
+
+	return FIELD_GET(PCI_TPH_CAP_LOC_MASK, reg);
+}
+
+/*
+ * Return the size of ST table. If ST table is not in TPH Requester Extended
+ * Capability space, return 0. Otherwise return the ST Table Size + 1.
+ */
+static u16 get_st_table_size(struct pci_dev *pdev)
+{
+	u32 reg;
+	u32 loc;
+
+	/* Check ST table location first */
+	loc = get_st_table_loc(pdev);
+
+	/* Convert loc to match with PCI_TPH_LOC_* defined in pci_regs.h */
+	loc = FIELD_PREP(PCI_TPH_CAP_LOC_MASK, loc);
+	if (loc != PCI_TPH_LOC_CAP)
+		return 0;
+
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
+
+	return FIELD_GET(PCI_TPH_CAP_ST_MASK, reg) + 1;
+}
+
 /* Return device's Root Port completer capability */
 static u8 get_rp_completer_type(struct pci_dev *pdev)
 {
@@ -43,6 +204,170 @@ static u8 get_rp_completer_type(struct pci_dev *pdev)
 	return FIELD_GET(PCI_EXP_DEVCAP2_TPH_COMP_MASK, reg);
 }
 
+/* Write ST to MSI-X vector control reg - Return 0 if OK, otherwise -errno */
+static int write_tag_to_msix(struct pci_dev *pdev, int msix_idx, u16 tag)
+{
+#ifdef CONFIG_PCI_MSI
+	struct msi_desc *msi_desc = NULL;
+	void __iomem *vec_ctrl;
+	u32 val;
+	int err = 0;
+
+	msi_lock_descs(&pdev->dev);
+
+	/* Find the msi_desc entry with matching msix_idx */
+	msi_for_each_desc(msi_desc, &pdev->dev, MSI_DESC_ASSOCIATED) {
+		if (msi_desc->msi_index == msix_idx)
+			break;
+	}
+
+	if (!msi_desc) {
+		err = -ENXIO;
+		goto err_out;
+	}
+
+	/* Get the vector control register (offset 0xc) pointed by msix_idx */
+	vec_ctrl = pdev->msix_base + msix_idx * PCI_MSIX_ENTRY_SIZE;
+	vec_ctrl += PCI_MSIX_ENTRY_VECTOR_CTRL;
+
+	val = readl(vec_ctrl);
+	val &= ~PCI_MSIX_ENTRY_CTRL_ST;
+	val |= FIELD_PREP(PCI_MSIX_ENTRY_CTRL_ST, tag);
+	writel(val, vec_ctrl);
+
+	/* Read back to flush the update */
+	val = readl(vec_ctrl);
+
+err_out:
+	msi_unlock_descs(&pdev->dev);
+	return err;
+#else
+	return -ENODEV;
+#endif
+}
+
+/* Write tag to ST table - Return 0 if OK, otherwise -errno */
+static int write_tag_to_st_table(struct pci_dev *pdev, int index, u16 tag)
+{
+	int st_table_size;
+	int offset;
+
+	/* Check if index is out of bound */
+	st_table_size = get_st_table_size(pdev);
+	if (index >= st_table_size)
+		return -ENXIO;
+
+	offset = pdev->tph_cap + PCI_TPH_BASE_SIZEOF + index * sizeof(u16);
+
+	return pci_write_config_word(pdev, offset, tag);
+}
+
+/**
+ * pcie_tph_get_cpu_st() - Retrieve Steering Tag for a target memory associated
+ * with a specific CPU
+ * @pdev: PCI device
+ * @mem_type: target memory type (volatile or persistent RAM)
+ * @cpu_uid: associated CPU id
+ * @tag: Steering Tag to be returned
+ *
+ * This function returns the Steering Tag for a target memory that is
+ * associated with a specific CPU as indicated by cpu_uid.
+ *
+ * Returns: 0 if success, otherwise negative value (-errno)
+ */
+int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type mem_type,
+			unsigned int cpu_uid, u16 *tag)
+{
+#ifdef CONFIG_ACPI
+	struct pci_dev *rp;
+	acpi_handle rp_acpi_handle;
+	union st_info info;
+
+	rp = pcie_find_root_port(pdev);
+	if (!rp || !rp->bus || !rp->bus->bridge)
+		return -ENODEV;
+
+	rp_acpi_handle = ACPI_HANDLE(rp->bus->bridge);
+
+	if (tph_invoke_dsm(rp_acpi_handle, cpu_uid, &info) != AE_OK) {
+		*tag = 0;
+		return -EINVAL;
+	}
+
+	*tag = tph_extract_tag(mem_type, pdev->tph_req_type, &info);
+
+	pci_dbg(pdev, "get steering tag: mem_type=%s, cpu_uid=%u, tag=%#04x\n",
+		(mem_type == TPH_MEM_TYPE_VM) ? "volatile" : "persistent",
+		cpu_uid, *tag);
+
+	return 0;
+#else
+	return -ENODEV;
+#endif
+}
+EXPORT_SYMBOL(pcie_tph_get_cpu_st);
+
+/**
+ * pcie_tph_set_st_entry() - Set Steering Tag in the ST table entry
+ * @pdev: PCI device
+ * @index: ST table entry index
+ * @tag: Steering Tag to be written
+ *
+ * This function determines the location of the ST table, either in the
+ * MSI-X table or in the TPH Extended Capability space, and writes the Steering
+ * Tag into the ST entry selected by index.
+ *
+ * Returns: 0 on success, otherwise negative value (-errno)
+ */
+int pcie_tph_set_st_entry(struct pci_dev *pdev, unsigned int index, u16 tag)
+{
+	u32 loc;
+	int err = 0;
+
+	if (!pdev->tph_cap)
+		return -EINVAL;
+
+	if (!pdev->tph_enabled)
+		return -EINVAL;
+
+	/* No need to write tag if device is in "No ST Mode" */
+	if (pdev->tph_mode == PCI_TPH_ST_NS_MODE)
+		return 0;
+
+	/* Disable TPH before updating ST to avoid potential instability as
+	 * cautioned in PCIe r6.2, sec 6.17.3, "ST Modes of Operation"
+	 */
+	set_ctrl_reg_req_en(pdev, PCI_TPH_REQ_DISABLE);
+
+	loc = get_st_table_loc(pdev);
+	/* Convert loc to match with PCI_TPH_LOC_* defined in pci_regs.h */
+	loc = FIELD_PREP(PCI_TPH_CAP_LOC_MASK, loc);
+
+	switch (loc) {
+	case PCI_TPH_LOC_MSIX:
+		err = write_tag_to_msix(pdev, index, tag);
+		break;
+	case PCI_TPH_LOC_CAP:
+		err = write_tag_to_st_table(pdev, index, tag);
+		break;
+	default:
+		err = -EINVAL;
+	}
+
+	if (err) {
+		pcie_disable_tph(pdev);
+		return err;
+	}
+
+	set_ctrl_reg_req_en(pdev, pdev->tph_mode);
+
+	pci_dbg(pdev, "set steering tag: %s table, index=%u, tag=%#04x\n",
+		(loc == PCI_TPH_LOC_MSIX) ? "MSI-X" : "ST", index, tag);
+
+	return 0;
+}
+EXPORT_SYMBOL(pcie_tph_set_st_entry);
+
 /**
  * pcie_disable_tph - Turn off TPH support for device
  * @pdev: PCI device
@@ -140,6 +465,8 @@ EXPORT_SYMBOL(pcie_enable_tph);
 void pci_restore_tph_state(struct pci_dev *pdev)
 {
 	struct pci_cap_saved_state *save_state;
+	int num_entries, i, offset;
+	u16 *st_entry;
 	u32 *cap;
 
 	if (!pdev->tph_cap)
@@ -155,11 +482,21 @@ void pci_restore_tph_state(struct pci_dev *pdev)
 	/* Restore control register and all ST entries */
 	cap = &save_state->cap.data[0];
 	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, *cap++);
+	st_entry = (u16 *)cap;
+	offset = PCI_TPH_BASE_SIZEOF;
+	num_entries = get_st_table_size(pdev);
+	for (i = 0; i < num_entries; i++) {
+		pci_write_config_word(pdev, pdev->tph_cap + offset,
+				      *st_entry++);
+		offset += sizeof(u16);
+	}
 }
 
 void pci_save_tph_state(struct pci_dev *pdev)
 {
 	struct pci_cap_saved_state *save_state;
+	int num_entries, i, offset;
+	u16 *st_entry;
 	u32 *cap;
 
 	if (!pdev->tph_cap)
@@ -175,6 +512,16 @@ void pci_save_tph_state(struct pci_dev *pdev)
 	/* Save control register */
 	cap = &save_state->cap.data[0];
 	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, cap++);
+
+	/* Save all ST entries in extended capability structure */
+	st_entry = (u16 *)cap;
+	offset = PCI_TPH_BASE_SIZEOF;
+	num_entries = get_st_table_size(pdev);
+	for (i = 0; i < num_entries; i++) {
+		pci_read_config_word(pdev, pdev->tph_cap + offset,
+				     st_entry++);
+		offset += sizeof(u16);
+	}
 }
 
 void pci_no_tph(void)
@@ -186,12 +533,14 @@ void pci_no_tph(void)
 
 void pci_tph_init(struct pci_dev *pdev)
 {
+	int num_entries;
 	u32 save_size;
 
 	pdev->tph_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_TPH);
 	if (!pdev->tph_cap)
 		return;
 
-	save_size = sizeof(u32);
+	num_entries = get_st_table_size(pdev);
+	save_size = sizeof(u32) + num_entries * sizeof(u16);
 	pci_add_ext_cap_save_buffer(pdev, PCI_EXT_CAP_ID_TPH, save_size);
 }
diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
index 58654a334ffb..c3e806c13d64 100644
--- a/include/linux/pci-tph.h
+++ b/include/linux/pci-tph.h
@@ -9,10 +9,33 @@
 #ifndef LINUX_PCI_TPH_H
 #define LINUX_PCI_TPH_H
 
+/*
+ * According to the ECN for PCI Firmware Spec, Steering Tag can be different
+ * depending on the memory type: Volatile Memory or Persistent Memory. When a
+ * caller queries a target's Steering Tag, it must provide the target's
+ * tph_mem_type. ECN link: https://members.pcisig.com/wg/PCI-SIG/document/15470.
+ */
+enum tph_mem_type {
+	TPH_MEM_TYPE_VM,	/* volatile memory */
+	TPH_MEM_TYPE_PM		/* persistent memory */
+};
+
 #ifdef CONFIG_PCIE_TPH
+int pcie_tph_set_st_entry(struct pci_dev *pdev,
+			  unsigned int index, u16 tag);
+int pcie_tph_get_cpu_st(struct pci_dev *dev,
+			enum tph_mem_type mem_type,
+			unsigned int cpu_uid, u16 *tag);
 void pcie_disable_tph(struct pci_dev *pdev);
 int pcie_enable_tph(struct pci_dev *pdev, int mode);
 #else
+static inline int pcie_tph_set_st_entry(struct pci_dev *pdev,
+					unsigned int index, u16 tag)
+{ return -EINVAL; }
+static inline int pcie_tph_get_cpu_st(struct pci_dev *dev,
+				      enum tph_mem_type mem_type,
+				      unsigned int cpu_uid, u16 *tag)
+{ return -EINVAL; }
 static inline void pcie_disable_tph(struct pci_dev *pdev) { }
 static inline int pcie_enable_tph(struct pci_dev *pdev, int mode)
 { return -EINVAL; }
-- 
2.46.0
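
The offset arithmetic used by write_tag_to_msix(), write_tag_to_st_table()
and the save/restore helpers in this patch can be modelled outside the
kernel. The following is a minimal userspace sketch, not part of the patch.
The constant values are assumptions meant to mirror pci_regs.h (only the 0xc
vector control offset is stated in the patch itself), and the helper names
are hypothetical. Unlike the patch, the sketch returns -1 rather than -ENXIO
for an out-of-range index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; the kernel takes these from pci_regs.h. */
#define TPH_BASE_SIZEOF		0x0c		/* assumed: ST table offset in TPH cap */
#define MSIX_ENTRY_SIZE		16		/* assumed: bytes per MSI-X table entry */
#define MSIX_ENTRY_VECTOR_CTRL	12		/* 0xc, as stated in the patch comment */
#define MSIX_ENTRY_CTRL_ST	0xffff0000u	/* assumed: ST field in bits 31:16 */

/* Config-space offset of ST entry `index`, or -1 if it is out of range. */
int st_entry_offset(int tph_cap, int index, int table_size)
{
	assert(table_size >= 0);
	if (index < 0 || index >= table_size)
		return -1;
	return tph_cap + TPH_BASE_SIZEOF + index * (int)sizeof(uint16_t);
}

/* Byte offset of the vector control dword of MSI-X entry `msix_idx`. */
size_t msix_vec_ctrl_offset(unsigned int msix_idx)
{
	return (size_t)msix_idx * MSIX_ENTRY_SIZE + MSIX_ENTRY_VECTOR_CTRL;
}

/* Read-modify-write: replace only the ST field and keep the other bits. */
uint32_t msix_ctrl_with_tag(uint32_t val, uint16_t tag)
{
	val &= ~MSIX_ENTRY_CTRL_ST;
	val |= ((uint32_t)tag << 16) & MSIX_ENTRY_CTRL_ST;
	return val;
}

/* Save buffer size: one control dword plus one u16 per ST entry. */
size_t tph_save_size(int num_entries)
{
	return sizeof(uint32_t) + (size_t)num_entries * sizeof(uint16_t);
}
```

Under these assumptions, a TPH capability at 0x100 with an 8-entry table
places entry 3 at config offset 0x112 and needs a 20-byte save buffer.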


* [PATCH V7 3/5] PCI/TPH: Add TPH documentation
  2024-10-02 16:59 [PATCH V7 0/5] TPH and cache direct injection support Wei Huang
  2024-10-02 16:59 ` [PATCH V7 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
  2024-10-02 16:59 ` [PATCH V7 2/5] PCI/TPH: Add Steering Tag support Wei Huang
@ 2024-10-02 16:59 ` Wei Huang
  2024-10-02 16:59 ` [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Wei Huang @ 2024-10-02 16:59 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

Add documentation for the TPH feature, including a description of the
"notph" kernel parameter and the driver-facing API.

Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
 Documentation/PCI/index.rst          |   1 +
 Documentation/PCI/tph.rst            | 132 +++++++++++++++++++++++++++
 Documentation/driver-api/pci/pci.rst |   3 +
 3 files changed, 136 insertions(+)
 create mode 100644 Documentation/PCI/tph.rst

diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
index e73f84aebde3..5e7c4e6e726b 100644
--- a/Documentation/PCI/index.rst
+++ b/Documentation/PCI/index.rst
@@ -18,3 +18,4 @@ PCI Bus Subsystem
    pcieaer-howto
    endpoint/index
    boot-interrupts
+   tph
diff --git a/Documentation/PCI/tph.rst b/Documentation/PCI/tph.rst
new file mode 100644
index 000000000000..e8993be64fd6
--- /dev/null
+++ b/Documentation/PCI/tph.rst
@@ -0,0 +1,132 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+
+===========
+TPH Support
+===========
+
+:Copyright: 2024 Advanced Micro Devices, Inc.
+:Authors: - Eric van Tassell <eric.vantassell@amd.com>
+          - Wei Huang <wei.huang2@amd.com>
+
+
+Overview
+========
+
+TPH (TLP Processing Hints) is a PCIe feature that allows endpoint devices
+to provide optimization hints for requests that target memory space.
+These hints, in a format called Steering Tags (STs), are embedded in the
+requester's TLP headers, enabling the system hardware, such as the Root
+Complex, to better manage platform resources for these requests.
+
+For example, on platforms with TPH-based direct data cache injection
+support, an endpoint device can include appropriate STs in its DMA
+traffic to specify which cache the data should be written to. This allows
+the CPU core to have a higher probability of getting data from cache,
+potentially improving performance and reducing latency in data
+processing.
+
+
+How to Use TPH
+==============
+
+TPH is presented as an optional extended capability in PCIe. The Linux
+kernel handles TPH discovery during boot, but it is up to the device
+driver to request TPH enablement if it is to be utilized. Once enabled,
+the driver uses the provided API to obtain the Steering Tag for the
+target memory and to program the ST into the device's ST table.
+
+Enable TPH support in Linux
+---------------------------
+
+To support TPH, the kernel must be built with the CONFIG_PCIE_TPH option
+enabled.
+
+Manage TPH
+----------
+
+To enable TPH for a device, use the following function::
+
+  int pcie_enable_tph(struct pci_dev *pdev, int mode);
+
+This function enables TPH support for the device in a specific ST mode.
+Currently supported modes include:
+
+  * PCI_TPH_ST_NS_MODE - No ST Mode
+  * PCI_TPH_ST_IV_MODE - Interrupt Vector Mode
+  * PCI_TPH_ST_DS_MODE - Device Specific Mode
+
+`pcie_enable_tph()` checks whether the requested mode is actually
+supported by the device before enabling it. The device driver can
+determine whether a given TPH mode is supported and was enabled from the
+return value of `pcie_enable_tph()`.
+
+To disable TPH, use the following function::
+
+  void pcie_disable_tph(struct pci_dev *pdev);
+
+Manage ST
+---------
+
+Steering Tags are platform specific. The PCIe spec does not specify where
+STs come from. Instead, the PCI Firmware Specification defines an ACPI _DSM
+method (see the `Revised _DSM for Cache Locality TPH Features ECN
+<https://members.pcisig.com/wg/PCI-SIG/document/15470>`_) for retrieving
+STs for target memory with various properties. This is the method
+supported by this implementation.
+
+To retrieve a Steering Tag for a target memory associated with a specific
+CPU, use the following function::
+
+  int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type type,
+                          unsigned int cpu_uid, u16 *tag);
+
+The `type` argument specifies the memory type of the target memory,
+either volatile or persistent. The `cpu_uid` argument specifies the CPU
+with which the memory is associated.
+
+After the ST value is retrieved, the device driver can use the following
+function to write the ST into the device::
+
+  int pcie_tph_set_st_entry(struct pci_dev *pdev, unsigned int index,
+                            u16 tag);
+
+The `index` argument is the index of the ST table entry the Steering Tag
+will be written into. `pcie_tph_set_st_entry()` determines the location
+of the ST table, either in the MSI-X table or in the TPH Extended
+Capability space, and writes the Steering Tag into the ST entry selected
+by the `index` argument.
+
+It is completely up to the driver to decide how to use these TPH
+functions. For example, a network device driver can use the TPH APIs above
+to update the Steering Tag when the interrupt affinity of an RX/TX queue
+has changed. Here is sample code for an IRQ affinity notifier:
+
+.. code-block:: c
+
+    static void irq_affinity_notified(struct irq_affinity_notify *notify,
+                                      const cpumask_t *mask)
+    {
+         struct drv_irq *irq;
+         unsigned int cpu_id;
+         u16 tag;
+
+         irq = container_of(notify, struct drv_irq, affinity_notify);
+         cpumask_copy(irq->cpu_mask, mask);
+
+         /* Pick a suitable target CPU - this is just an example */
+         cpu_id = cpumask_first(irq->cpu_mask);
+
+         if (pcie_tph_get_cpu_st(irq->pdev, TPH_MEM_TYPE_VM, cpu_id,
+                                 &tag))
+             return;
+
+         if (pcie_tph_set_st_entry(irq->pdev, irq->msix_nr, tag))
+             return;
+    }
+
+Disable TPH system-wide
+-----------------------
+
+There is a kernel command line option to control the TPH feature:
+    * "notph": TPH will be disabled for all endpoint devices.
diff --git a/Documentation/driver-api/pci/pci.rst b/Documentation/driver-api/pci/pci.rst
index aa40b1cc243b..59d86e827198 100644
--- a/Documentation/driver-api/pci/pci.rst
+++ b/Documentation/driver-api/pci/pci.rst
@@ -46,6 +46,9 @@ PCI Support Library
 .. kernel-doc:: drivers/pci/pci-sysfs.c
    :internal:
 
+.. kernel-doc:: drivers/pci/tph.c
+   :export:
+
 PCI Hotplug Support Library
 ---------------------------
 
-- 
2.46.0


* [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver
  2024-10-02 16:59 [PATCH V7 0/5] TPH and cache direct injection support Wei Huang
                   ` (2 preceding siblings ...)
  2024-10-02 16:59 ` [PATCH V7 3/5] PCI/TPH: Add TPH documentation Wei Huang
@ 2024-10-02 16:59 ` Wei Huang
  2024-10-08 13:39   ` Jakub Kicinski
  2024-10-02 16:59 ` [PATCH V7 5/5] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings Wei Huang
  2024-10-02 21:35 ` [PATCH V7 0/5] TPH and cache direct injection support Bjorn Helgaas
  5 siblings, 1 reply; 17+ messages in thread
From: Wei Huang @ 2024-10-02 16:59 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

From: Manoj Panicker <manoj.panicker2@amd.com>

Add TPH support to the Broadcom BNXT device driver. This allows the
driver to utilize TPH functions for retrieving and configuring Steering
Tags when changing interrupt affinity. With compatible NIC firmware,
network traffic will be tagged correctly with Steering Tags, leading to
significant memory bandwidth savings and other benefits as demonstrated
by real network benchmarks on TPH-capable platforms.
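
The affinity-change flow described above can be sketched in userspace. This
is an illustrative model only, not driver code: struct fake_dev, the
fake_*() helpers and the CPU-to-tag mapping are all hypothetical stand-ins
for the bnxt structures and for pcie_tph_get_cpu_st() and
pcie_tph_set_st_entry(). A return value of 1 marks the point where the
driver would go on to restart the RX queue.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_ST_ENTRIES 4	/* hypothetical ST table size */

/* Hypothetical device state: one ST slot per MSI-X vector. */
struct fake_dev {
	uint16_t st_table[NUM_ST_ENTRIES];
};

/* Stand-in for pcie_tph_get_cpu_st(); the cpu -> tag mapping is made up. */
int fake_get_cpu_st(unsigned int cpu, uint16_t *tag)
{
	if (cpu >= 64)
		return -1;
	*tag = (uint16_t)(0x100 + cpu);
	return 0;
}

/* Stand-in for pcie_tph_set_st_entry(): bounds-check, then store. */
int fake_set_st_entry(struct fake_dev *dev, unsigned int index, uint16_t tag)
{
	assert(dev);
	if (index >= NUM_ST_ENTRIES)
		return -1;
	dev->st_table[index] = tag;
	return 0;
}

/* Lowest set bit of a 64-CPU mask, or 64 if the mask is empty. */
unsigned int first_cpu(uint64_t mask)
{
	unsigned int cpu = 0;

	if (!mask)
		return 64;
	while (!(mask & 1)) {
		mask >>= 1;
		cpu++;
	}
	return cpu;
}

/* Returns 1 if the ST entry was updated, 0 if either step failed. */
int on_affinity_change(struct fake_dev *dev, unsigned int msix_nr,
		       uint64_t new_mask)
{
	uint16_t tag;

	if (fake_get_cpu_st(first_cpu(new_mask), &tag))
		return 0;
	if (fake_set_st_entry(dev, msix_nr, tag))
		return 0;
	return 1;
}
```

As in the patch, a failure in either step leaves the previous ST entry
untouched and skips the queue restart.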

Co-developed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Manoj Panicker <manoj.panicker2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 83 +++++++++++++++++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  7 ++
 net/core/netdev_rx_queue.c                |  1 +
 3 files changed, 91 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 6e422e24750a..23ad2b6e70c7 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -55,6 +55,8 @@
 #include <net/page_pool/helpers.h>
 #include <linux/align.h>
 #include <net/netdev_queues.h>
+#include <net/netdev_rx_queue.h>
+#include <linux/pci-tph.h>
 
 #include "bnxt_hsi.h"
 #include "bnxt.h"
@@ -10865,6 +10867,61 @@ int bnxt_reserve_rings(struct bnxt *bp, bool irq_re_init)
 	return 0;
 }
 
+static void __bnxt_irq_affinity_notify(struct irq_affinity_notify *notify,
+				       const cpumask_t *mask)
+{
+	struct bnxt_irq *irq;
+	u16 tag;
+	int err;
+
+	irq = container_of(notify, struct bnxt_irq, affinity_notify);
+	cpumask_copy(irq->cpu_mask, mask);
+
+	if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
+				cpumask_first(irq->cpu_mask), &tag))
+		return;
+
+	if (pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag))
+		return;
+
+	if (netif_running(irq->bp->dev)) {
+		rtnl_lock();
+		err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr);
+		if (err)
+			netdev_err(irq->bp->dev,
+				   "rx queue restart failed: err=%d\n", err);
+		rtnl_unlock();
+	}
+}
+
+static void __bnxt_irq_affinity_release(struct kref __always_unused *ref)
+{
+}
+
+static void bnxt_release_irq_notifier(struct bnxt_irq *irq)
+{
+	irq_set_affinity_notifier(irq->vector, NULL);
+}
+
+static void bnxt_register_irq_notifier(struct bnxt *bp, struct bnxt_irq *irq)
+{
+	struct irq_affinity_notify *notify;
+
+	irq->bp = bp;
+
+	/* Nothing to do if TPH is not enabled */
+	if (!bp->tph_mode)
+		return;
+
+	/* Register IRQ affinity notifier */
+	notify = &irq->affinity_notify;
+	notify->irq = irq->vector;
+	notify->notify = __bnxt_irq_affinity_notify;
+	notify->release = __bnxt_irq_affinity_release;
+
+	irq_set_affinity_notifier(irq->vector, notify);
+}
+
 static void bnxt_free_irq(struct bnxt *bp)
 {
 	struct bnxt_irq *irq;
@@ -10887,11 +10944,18 @@ static void bnxt_free_irq(struct bnxt *bp)
 				free_cpumask_var(irq->cpu_mask);
 				irq->have_cpumask = 0;
 			}
+
+			bnxt_release_irq_notifier(irq);
+
 			free_irq(irq->vector, bp->bnapi[i]);
 		}
 
 		irq->requested = 0;
 	}
+
+	/* Disable TPH support */
+	pcie_disable_tph(bp->pdev);
+	bp->tph_mode = 0;
 }
 
 static int bnxt_request_irq(struct bnxt *bp)
@@ -10911,6 +10975,12 @@ static int bnxt_request_irq(struct bnxt *bp)
 #ifdef CONFIG_RFS_ACCEL
 	rmap = bp->dev->rx_cpu_rmap;
 #endif
+
+	/* Enable TPH support as part of IRQ request */
+	rc = pcie_enable_tph(bp->pdev, PCI_TPH_ST_IV_MODE);
+	if (!rc)
+		bp->tph_mode = PCI_TPH_ST_IV_MODE;
+
 	for (i = 0, j = 0; i < bp->cp_nr_rings; i++) {
 		int map_idx = bnxt_cp_num_to_irq_num(bp, i);
 		struct bnxt_irq *irq = &bp->irq_tbl[map_idx];
@@ -10934,8 +11004,11 @@ static int bnxt_request_irq(struct bnxt *bp)
 
 		if (zalloc_cpumask_var(&irq->cpu_mask, GFP_KERNEL)) {
 			int numa_node = dev_to_node(&bp->pdev->dev);
+			u16 tag;
 
 			irq->have_cpumask = 1;
+			irq->msix_nr = map_idx;
+			irq->ring_nr = i;
 			cpumask_set_cpu(cpumask_local_spread(i, numa_node),
 					irq->cpu_mask);
 			rc = irq_set_affinity_hint(irq->vector, irq->cpu_mask);
@@ -10945,6 +11018,16 @@ static int bnxt_request_irq(struct bnxt *bp)
 					    irq->vector);
 				break;
 			}
+
+			bnxt_register_irq_notifier(bp, irq);
+
+			/* Init ST table entry */
+			if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
+						cpumask_first(irq->cpu_mask),
+						&tag))
+				continue;
+
+			pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag);
 		}
 	}
 	return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 69231e85140b..641d25646367 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1227,6 +1227,11 @@ struct bnxt_irq {
 	u8		have_cpumask:1;
 	char		name[IFNAMSIZ + BNXT_IRQ_NAME_EXTRA];
 	cpumask_var_t	cpu_mask;
+
+	struct bnxt	*bp;
+	int		msix_nr;
+	int		ring_nr;
+	struct irq_affinity_notify affinity_notify;
 };
 
 #define HWRM_RING_ALLOC_TX	0x1
@@ -2183,6 +2188,8 @@ struct bnxt {
 	struct net_device	*dev;
 	struct pci_dev		*pdev;
 
+	u8			tph_mode;
+
 	atomic_t		intr_sem;
 
 	u32			flags;
diff --git a/net/core/netdev_rx_queue.c b/net/core/netdev_rx_queue.c
index e217a5838c87..10e95d7b6892 100644
--- a/net/core/netdev_rx_queue.c
+++ b/net/core/netdev_rx_queue.c
@@ -79,3 +79,4 @@ int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq_idx)
 
 	return err;
 }
+EXPORT_SYMBOL_GPL(netdev_rx_queue_restart);
-- 
2.46.0


* [PATCH V7 5/5] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings
  2024-10-02 16:59 [PATCH V7 0/5] TPH and cache direct injection support Wei Huang
                   ` (3 preceding siblings ...)
  2024-10-02 16:59 ` [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
@ 2024-10-02 16:59 ` Wei Huang
  2024-10-02 21:35 ` [PATCH V7 0/5] TPH and cache direct injection support Bjorn Helgaas
  5 siblings, 0 replies; 17+ messages in thread
From: Wei Huang @ 2024-10-02 16:59 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

From: Michael Chan <michael.chan@broadcom.com>

Newer firmware can use the NQ ring ID associated with each RX/RX AGG
ring to enable PCIe Steering Tags.  When allocating RX/RX AGG rings,
pass along the NQ ring ID for the firmware to use.  This information helps
optimize DMA writes by directing them to the cache closer to the CPU
consuming the data, potentially improving the processing speed.  This
change is backward-compatible with older firmware, which will simply
disregard the information.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 23ad2b6e70c7..a35207931d7d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6811,10 +6811,12 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
 
 			/* Association of rx ring with stats context */
 			grp_info = &bp->grp_info[ring->grp_idx];
+			req->nq_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
 			req->rx_buf_size = cpu_to_le16(bp->rx_buf_use_size);
 			req->stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
 			req->enables |= cpu_to_le32(
-				RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
+				RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID |
+				RING_ALLOC_REQ_ENABLES_NQ_RING_ID_VALID);
 			if (NET_IP_ALIGN == 2)
 				flags = RING_ALLOC_REQ_FLAGS_RX_SOP_PAD;
 			req->flags = cpu_to_le16(flags);
@@ -6826,11 +6828,13 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
 			/* Association of agg ring with rx ring */
 			grp_info = &bp->grp_info[ring->grp_idx];
 			req->rx_ring_id = cpu_to_le16(grp_info->rx_fw_ring_id);
+			req->nq_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
 			req->rx_buf_size = cpu_to_le16(BNXT_RX_PAGE_SIZE);
 			req->stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
 			req->enables |= cpu_to_le32(
 				RING_ALLOC_REQ_ENABLES_RX_RING_ID_VALID |
-				RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
+				RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID |
+				RING_ALLOC_REQ_ENABLES_NQ_RING_ID_VALID);
 		} else {
 			req->ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
 		}
-- 
2.46.0


* Re: [PATCH V7 0/5] TPH and cache direct injection support
  2024-10-02 16:59 [PATCH V7 0/5] TPH and cache direct injection support Wei Huang
                   ` (4 preceding siblings ...)
  2024-10-02 16:59 ` [PATCH V7 5/5] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings Wei Huang
@ 2024-10-02 21:35 ` Bjorn Helgaas
  2024-10-02 22:08   ` Michael Chan
  2024-10-16 21:31   ` Bjorn Helgaas
  5 siblings, 2 replies; 17+ messages in thread
From: Bjorn Helgaas @ 2024-10-02 21:35 UTC (permalink / raw)
  To: Wei Huang
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
	michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
	manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
	bagasdotme, bhelgaas, lukas, paul.e.luse, jing2.liu

On Wed, Oct 02, 2024 at 11:59:49AM -0500, Wei Huang wrote:
> Hi All,
> 
> TPH (TLP Processing Hints) is a PCIe feature that allows endpoint
> devices to provide optimization hints for requests that target memory
> space. These hints, in a format called steering tag (ST), are provided
> in the requester's TLP headers and allow the system hardware, including
> the Root Complex, to optimize the utilization of platform resources
> for the requests.
> 
> Upcoming AMD hardware implement a new Cache Injection feature that
> leverages TPH. Cache Injection allows PCIe endpoints to inject I/O
> Coherent DMA writes directly into an L2 within the CCX (core complex)
> closest to the CPU core that will consume it. This technology is aimed
> at applications requiring high performance and low latency, such as
> networking and storage applications.
> 
> This series introduces generic TPH support in Linux, allowing STs to be
> retrieved and used by PCIe endpoint drivers as needed. As a
> demonstration, it includes an example usage in the Broadcom BNXT driver.
> When running on Broadcom NICs with the appropriate firmware, it shows
> substantial memory bandwidth savings and better network bandwidth using
> real-world benchmarks. This solution is vendor-neutral and implemented
> based on industry standards (PCIe Spec and PCI FW Spec).
> 
> V6->V7:
>  * Rebase on top of the latest pci/main (6.12-rc1)
>  * Fix compilation warning/error on clang-18 with w=1 (test robot)
>  * Revise commit messages for Patch #2, #4, and #5 (Bjorn)
>  * Add more _DSM method description for reference in Patch #2 (Bjorn)
>  * Remove "default n" in Kconfig (Lukas)
> 
> V5->V6:
>  * Rebase on top of pci/main (tag: pci-v6.12-changes)
>  * Fix spellings and FIELD_PREP/bnxt.c compilation errors (Simon)
>  * Move tph.c to drivers/pci directory (Lukas)
>  * Remove CONFIG_ACPI dependency (Lukas)
>  * Slightly re-arrange save/restore sequence (Lukas)
> 
> V4->V5:
>  * Rebase on top of net-next/main tree (Broadcom)
>  * Remove TPH mode query and TPH enabled checking functions (Bjorn)
>  * Remove "nostmode" kernel parameter (Bjorn)
>  * Add "notph" kernel parameter support (Bjorn)
>  * Add back TPH documentation (Bjorn)
>  * Change TPH register namings (Bjorn)
>  * Squash TPH enable/disable/save/restore funcs as a single patch (Bjorn)
>  * Squash ST get_st/set_st funcs as a single patch (Bjorn)
>  * Replace nic_open/close with netdev_rx_queue_restart() (Jakub, Broadcom)
> 
> V3->V4:
>  * Rebase on top of the latest pci/next tree (tag: 6.11-rc1)
>  * Add new API functions to query/enable/disable TPH support
>  * Make pcie_tph_set_st() completely independent from pcie_tph_get_cpu_st()
>  * Rewrite bnxt.c based on new APIs
>  * Remove documentation for now due to constantly changing API
>  * Remove pci=notph, but keep pci=nostmode with better flow (Bjorn)
>  * Lots of code rewrite in tph.c & pci-tph.h with cleaner interface (Bjorn)
>  * Add TPH save/restore support (Paul Luse and Lukas Wunner)
> 
> V2->V3:
>  * Rebase on top of pci/next tree (tag: pci-v6.11-changes)
>  * Redefine PCI TPH registers (pci_regs.h) without breaking uapi
>  * Fix commit subjects/messages for kernel options (Jonathan and Bjorn)
>  * Break API functions into three individual patches for easy review
>  * Rewrite lots of code in tph.c/tph.h based (Jonathan and Bjorn)
> 
> V1->V2:
>  * Rebase on top of pci.git/for-linus (6.10-rc1)
>  * Address mismatched data types reported by Sparse (Sparse check passed)
>  * Add pcie_tph_intr_vec_supported() for checking IRQ mode support
>  * Skip bnxt affinity notifier registration if
>    pcie_tph_intr_vec_supported()=false
>  * Minor fixes in bnxt driver (i.e. warning messages)
> 
> Manoj Panicker (1):
>   bnxt_en: Add TPH support in BNXT driver
> 
> Michael Chan (1):
>   bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings
> 
> Wei Huang (3):
>   PCI: Add TLP Processing Hints (TPH) support
>   PCI/TPH: Add Steering Tag support
>   PCI/TPH: Add TPH documentation
> 
>  Documentation/PCI/index.rst                   |   1 +
>  Documentation/PCI/tph.rst                     | 132 +++++
>  .../admin-guide/kernel-parameters.txt         |   4 +
>  Documentation/driver-api/pci/pci.rst          |   3 +
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  91 ++-
>  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   7 +
>  drivers/pci/Kconfig                           |   9 +
>  drivers/pci/Makefile                          |   1 +
>  drivers/pci/pci.c                             |   4 +
>  drivers/pci/pci.h                             |  12 +
>  drivers/pci/probe.c                           |   1 +
>  drivers/pci/tph.c                             | 546 ++++++++++++++++++
>  include/linux/pci-tph.h                       |  44 ++
>  include/linux/pci.h                           |   7 +
>  include/uapi/linux/pci_regs.h                 |  37 +-
>  net/core/netdev_rx_queue.c                    |   1 +
>  16 files changed, 890 insertions(+), 10 deletions(-)
>  create mode 100644 Documentation/PCI/tph.rst
>  create mode 100644 drivers/pci/tph.c
>  create mode 100644 include/linux/pci-tph.h

I tentatively applied this on pci/tph for v6.13.

Not sure what you intend for the bnxt changes, since they depend on
the PCI core changes.  I'm happy to merge them via PCI, given acks
from Michael and an overall network maintainer.

Alternatively they could wait another cycle, or I could make an
immutable branch, although I prefer to preserve the option to update
or remove things until the merge window.

Thanks very much; this looks like nice work!

Bjorn

* Re: [PATCH V7 0/5] TPH and cache direct injection support
  2024-10-02 21:35 ` [PATCH V7 0/5] TPH and cache direct injection support Bjorn Helgaas
@ 2024-10-02 22:08   ` Michael Chan
  2024-10-08  7:32     ` Paolo Abeni
  2024-10-16 21:31   ` Bjorn Helgaas
  1 sibling, 1 reply; 17+ messages in thread
From: Michael Chan @ 2024-10-02 22:08 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev,
	Jonathan.Cameron, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, ajit.khaparde, somnath.kotur,
	andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
	vadim.fedorenko, horms, bagasdotme, bhelgaas, lukas, paul.e.luse,
	jing2.liu

On Wed, Oct 2, 2024 at 2:35 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> I tentatively applied this on pci/tph for v6.13.
>
> Not sure what you intend for the bnxt changes, since they depend on
> the PCI core changes.  I'm happy to merge them via PCI, given acks
> from Michael and an overall network maintainer.

The bnxt patch can go in through the PCI tree if Jakub agrees.  Thanks.

* Re: [PATCH V7 0/5] TPH and cache direct injection support
  2024-10-02 22:08   ` Michael Chan
@ 2024-10-08  7:32     ` Paolo Abeni
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Abeni @ 2024-10-08  7:32 UTC (permalink / raw)
  To: Michael Chan, Bjorn Helgaas
  Cc: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev,
	Jonathan.Cameron, corbet, davem, edumazet, kuba, alex.williamson,
	gospo, ajit.khaparde, somnath.kotur, andrew.gospodarek,
	manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
	bagasdotme, bhelgaas, lukas, paul.e.luse, jing2.liu

On 10/3/24 00:08, Michael Chan wrote:
> On Wed, Oct 2, 2024 at 2:35 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>> I tentatively applied this on pci/tph for v6.13.
>>
>> Not sure what you intend for the bnxt changes, since they depend on
>> the PCI core changes.  I'm happy to merge them via PCI, given acks
>> from Michael and an overall network maintainer.
> 
> The bnxt patch can go in through the PCI tree if Jakub agrees.  Thanks.

I guess the most critical point is to avoid complex conflicts at merge
window time. My understanding is that the conventional way to avoid such
issues would be to share a stable branch somewhere with this change on top,
which both the netdev and the PCI trees could pull from.

Cheers,

Paolo


* Re: [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver
  2024-10-02 16:59 ` [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
@ 2024-10-08 13:39   ` Jakub Kicinski
  2024-10-11 18:35     ` Panicker, Manoj
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2024-10-08 13:39 UTC (permalink / raw)
  To: Wei Huang
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	helgaas, corbet, davem, edumazet, pabeni, alex.williamson, gospo,
	michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
	manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
	bagasdotme, bhelgaas, lukas, paul.e.luse, jing2.liu

On Wed, 2 Oct 2024 11:59:53 -0500 Wei Huang wrote:
> +	if (netif_running(irq->bp->dev)) {
> +		rtnl_lock();
> +		err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr);
> +		if (err)
> +			netdev_err(irq->bp->dev,
> +				   "rx queue restart failed: err=%d\n", err);
> +		rtnl_unlock();
> +	}
> +}
> +
> +static void __bnxt_irq_affinity_release(struct kref __always_unused *ref)
> +{
> +}

An empty release function is always a red flag.
How is the reference counting used here?
Is irq_set_affinity_notifier() not synchronous?
Otherwise the rtnl_lock() should probably cover the running check.
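
The last point above can be illustrated with a small userspace model. This
is not the bnxt code: rtnl_lock(), netif_running() and
netdev_rx_queue_restart() are replaced by hypothetical stand-ins (a pthread
mutex, a flag and a counter, all prefixed model_) so that the sketch builds
on its own. It only shows the ordering being asked for: the running check
is made after the lock is taken, so the device cannot be closed between the
check and the restart.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel state; none of this is the real API. */
static pthread_mutex_t model_rtnl = PTHREAD_MUTEX_INITIALIZER;
static bool model_dev_running;	/* models netif_running(dev) */
static int model_restart_count;	/* number of modelled queue restarts */

static void model_rtnl_lock(void)   { pthread_mutex_lock(&model_rtnl); }
static void model_rtnl_unlock(void) { pthread_mutex_unlock(&model_rtnl); }

/* Models netdev_rx_queue_restart(); always succeeds in this sketch. */
static int model_rx_queue_restart(int ring_nr)
{
	assert(ring_nr >= 0);
	model_restart_count++;
	return 0;
}

/* Affinity-change handler with the running check inside the lock. */
static int model_affinity_notify(int ring_nr)
{
	int err = 0;

	model_rtnl_lock();
	if (model_dev_running) {
		err = model_rx_queue_restart(ring_nr);
		if (err)
			fprintf(stderr, "rx queue restart failed: err=%d\n", err);
	}
	model_rtnl_unlock();

	return err;
}

/* Open/close change the running flag only while holding the same lock. */
static void model_dev_open(void)
{
	model_rtnl_lock();
	model_dev_running = true;
	model_rtnl_unlock();
}

static void model_dev_close(void)
{
	model_rtnl_lock();
	model_dev_running = false;
	model_rtnl_unlock();
}
```

In this model, calling model_affinity_notify() after model_dev_close()
leaves model_restart_count unchanged, because the flag is read under the
same lock that close takes.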

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver
  2024-10-08 13:39   ` Jakub Kicinski
@ 2024-10-11 18:35     ` Panicker, Manoj
  2024-10-15 19:50       ` Wei Huang
  0 siblings, 1 reply; 17+ messages in thread
From: Panicker, Manoj @ 2024-10-11 18:35 UTC (permalink / raw)
  To: Jakub Kicinski, Huang2, Wei
  Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, netdev@vger.kernel.org,
	Jonathan.Cameron@Huawei.com, helgaas@kernel.org, corbet@lwn.net,
	davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
	alex.williamson@redhat.com, gospo@broadcom.com,
	michael.chan@broadcom.com, ajit.khaparde@broadcom.com,
	somnath.kotur@broadcom.com, andrew.gospodarek@broadcom.com,
	VanTassell, Eric, vadim.fedorenko@linux.dev, horms@kernel.org,
	bagasdotme@gmail.com, bhelgaas@google.com, lukas@wunner.de,
	paul.e.luse@intel.com, jing2.liu@intel.com

[AMD Official Use Only - AMD Internal Distribution Only]

Hello Jakub,

Thanks for the feedback. We'll update the patch so that the running
check is also covered by the rtnl_lock.

About the empty function: there are no actions to perform when the
driver's notify.release function is called. The IRQ notifier is
registered only once, and there are no older IRQ notifiers for the
driver that could get called back. We also followed the precedent of
other drivers in the kernel tree that use the same mechanism.

See code:
From drivers/net/ethernet/intel/i40e/i40e_main.c
static void i40e_irq_affinity_release(struct kref *ref) {}


From drivers/net/ethernet/intel/iavf/iavf_main.c
static void iavf_irq_affinity_release(struct kref *ref) {}


From drivers/net/ethernet/fungible/funeth/funeth_main.c
static void fun_irq_aff_release(struct kref __always_unused *ref)
{
}


Thanks
Manoj

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver
  2024-10-11 18:35     ` Panicker, Manoj
@ 2024-10-15 19:50       ` Wei Huang
  2024-10-15 23:45         ` Jakub Kicinski
  0 siblings, 1 reply; 17+ messages in thread
From: Wei Huang @ 2024-10-15 19:50 UTC (permalink / raw)
  To: Panicker, Manoj, Jakub Kicinski
  Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, netdev@vger.kernel.org,
	Jonathan.Cameron@Huawei.com, helgaas@kernel.org, corbet@lwn.net,
	davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
	alex.williamson@redhat.com, gospo@broadcom.com,
	michael.chan@broadcom.com, ajit.khaparde@broadcom.com,
	somnath.kotur@broadcom.com, andrew.gospodarek@broadcom.com,
	VanTassell, Eric, vadim.fedorenko@linux.dev, horms@kernel.org,
	bagasdotme@gmail.com, bhelgaas@google.com, lukas@wunner.de,
	paul.e.luse@intel.com, jing2.liu@intel.com

[These questions are for both Jakub and Bjorn]

Any suggestions on how to proceed? I can send out a V8 patchset if Jakub
is OK with Manoj's solution. Or is only a new patch #4 needed, since the
rest are intact?

Thanks,
-Wei

On 10/11/24 13:35, Panicker, Manoj wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
> 
> Hello Jakub,
> 
> Thanks for the feedback. We'll update the patch so that the running
> check is also covered by the rtnl_lock.
> 
> About the empty function: there are no actions to perform when the
> driver's notify.release function is called. The IRQ notifier is
> registered only once, and there are no older IRQ notifiers for the
> driver that could get called back. We also followed the precedent of
> other drivers in the kernel tree that use the same mechanism.
> 
> See code:
> From drivers/net/ethernet/intel/i40e/i40e_main.c
> static void i40e_irq_affinity_release(struct kref *ref) {}
> 
> 
> From drivers/net/ethernet/intel/iavf/iavf_main.c
> static void iavf_irq_affinity_release(struct kref *ref) {}
> 
> 
> From drivers/net/ethernet/fungible/funeth/funeth_main.c
> static void fun_irq_aff_release(struct kref __always_unused *ref)
> {
> }
> 
> 
> Thanks
> Manoj
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver
  2024-10-15 19:50       ` Wei Huang
@ 2024-10-15 23:45         ` Jakub Kicinski
  0 siblings, 0 replies; 17+ messages in thread
From: Jakub Kicinski @ 2024-10-15 23:45 UTC (permalink / raw)
  To: Wei Huang
  Cc: Panicker, Manoj, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	netdev@vger.kernel.org, Jonathan.Cameron@Huawei.com,
	helgaas@kernel.org, corbet@lwn.net, davem@davemloft.net,
	edumazet@google.com, pabeni@redhat.com,
	alex.williamson@redhat.com, gospo@broadcom.com,
	michael.chan@broadcom.com, ajit.khaparde@broadcom.com,
	somnath.kotur@broadcom.com, andrew.gospodarek@broadcom.com,
	VanTassell, Eric, vadim.fedorenko@linux.dev, horms@kernel.org,
	bagasdotme@gmail.com, bhelgaas@google.com, lukas@wunner.de,
	paul.e.luse@intel.com, jing2.liu@intel.com

On Tue, 15 Oct 2024 14:50:39 -0500 Wei Huang wrote:
> Any suggestions on how to proceed? I can send out a V8 patchset if Jakub
> is OK with Manoj's solution. Or is only a new patch #4 needed, since the
> rest are intact?

1) y'all need to stop top posting
2) Manoj's reply is AMD internal and I'm not an AMD employee
3) precedent in drivers means relatively little, existing code 
   can be buggy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V7 0/5] TPH and cache direct injection support
  2024-10-02 21:35 ` [PATCH V7 0/5] TPH and cache direct injection support Bjorn Helgaas
  2024-10-02 22:08   ` Michael Chan
@ 2024-10-16 21:31   ` Bjorn Helgaas
  1 sibling, 0 replies; 17+ messages in thread
From: Bjorn Helgaas @ 2024-10-16 21:31 UTC (permalink / raw)
  To: Wei Huang
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
	michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
	manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
	bagasdotme, bhelgaas, lukas, paul.e.luse, jing2.liu

On Wed, Oct 02, 2024 at 04:35:55PM -0500, Bjorn Helgaas wrote:
> On Wed, Oct 02, 2024 at 11:59:49AM -0500, Wei Huang wrote:
> > Hi All,
> > 
> > TPH (TLP Processing Hints) is a PCIe feature that allows endpoint
> > devices to provide optimization hints for requests that target memory
> > space. These hints, in a format called steering tag (ST), are provided
> > in the requester's TLP headers and allow the system hardware, including
> > the Root Complex, to optimize the utilization of platform resources
> > for the requests.
> > 
> > Upcoming AMD hardware implement a new Cache Injection feature that
> > leverages TPH. Cache Injection allows PCIe endpoints to inject I/O
> > Coherent DMA writes directly into an L2 within the CCX (core complex)
> > closest to the CPU core that will consume it. This technology is aimed
> > at applications requiring high performance and low latency, such as
> > networking and storage applications.
> > 
> > This series introduces generic TPH support in Linux, allowing STs to be
> > retrieved and used by PCIe endpoint drivers as needed. As a
> > demonstration, it includes an example usage in the Broadcom BNXT driver.
> > When running on Broadcom NICs with the appropriate firmware, it shows
> > substantial memory bandwidth savings and better network bandwidth using
> > real-world benchmarks. This solution is vendor-neutral and implemented
> > based on industry standards (PCIe Spec and PCI FW Spec).
> > 
> > V6->V7:
> >  * Rebase on top of the latest pci/main (6.12-rc1)
> >  * Fix compilation warning/error on clang-18 with w=1 (test robot)
> >  * Revise commit messages for Patch #2, #4, and #5 (Bjorn)
> >  * Add more _DSM method description for reference in Patch #2 (Bjorn)
> >  * Remove "default n" in Kconfig (Lukas)
> > 
> > V5->V6:
> >  * Rebase on top of pci/main (tag: pci-v6.12-changes)
> >  * Fix spellings and FIELD_PREP/bnxt.c compilation errors (Simon)
> >  * Move tph.c to drivers/pci directory (Lukas)
> >  * Remove CONFIG_ACPI dependency (Lukas)
> >  * Slightly re-arrange save/restore sequence (Lukas)
> > 
> > V4->V5:
> >  * Rebase on top of net-next/main tree (Broadcom)
> >  * Remove TPH mode query and TPH enabled checking functions (Bjorn)
> >  * Remove "nostmode" kernel parameter (Bjorn)
> >  * Add "notph" kernel parameter support (Bjorn)
> >  * Add back TPH documentation (Bjorn)
> >  * Change TPH register namings (Bjorn)
> >  * Squash TPH enable/disable/save/restore funcs as a single patch (Bjorn)
> >  * Squash ST get_st/set_st funcs as a single patch (Bjorn)
> >  * Replace nic_open/close with netdev_rx_queue_restart() (Jakub, Broadcom)
> > 
> > V3->V4:
> >  * Rebase on top of the latest pci/next tree (tag: 6.11-rc1)
> >  * Add new API functions to query/enable/disable TPH support
> >  * Make pcie_tph_set_st() completely independent from pcie_tph_get_cpu_st()
> >  * Rewrite bnxt.c based on new APIs
> >  * Remove documentation for now due to constantly changing API
> >  * Remove pci=notph, but keep pci=nostmode with better flow (Bjorn)
> >  * Lots of code rewrite in tph.c & pci-tph.h with cleaner interface (Bjorn)
> >  * Add TPH save/restore support (Paul Luse and Lukas Wunner)
> > 
> > V2->V3:
> >  * Rebase on top of pci/next tree (tag: pci-v6.11-changes)
> >  * Redefine PCI TPH registers (pci_regs.h) without breaking uapi
> >  * Fix commit subjects/messages for kernel options (Jonathan and Bjorn)
> >  * Break API functions into three individual patches for easy review
> >  * Rewrite lots of code in tph.c/tph.h based (Jonathan and Bjorn)
> > 
> > V1->V2:
> >  * Rebase on top of pci.git/for-linus (6.10-rc1)
> >  * Address mismatched data types reported by Sparse (Sparse check passed)
> >  * Add pcie_tph_intr_vec_supported() for checking IRQ mode support
> >  * Skip bnxt affinity notifier registration if
> >    pcie_tph_intr_vec_supported()=false
> >  * Minor fixes in bnxt driver (i.e. warning messages)
> > 
> > Manoj Panicker (1):
> >   bnxt_en: Add TPH support in BNXT driver
> > 
> > Michael Chan (1):
> >   bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings
> > 
> > Wei Huang (3):
> >   PCI: Add TLP Processing Hints (TPH) support
> >   PCI/TPH: Add Steering Tag support
> >   PCI/TPH: Add TPH documentation
> > 
> >  Documentation/PCI/index.rst                   |   1 +
> >  Documentation/PCI/tph.rst                     | 132 +++++
> >  .../admin-guide/kernel-parameters.txt         |   4 +
> >  Documentation/driver-api/pci/pci.rst          |   3 +
> >  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  91 ++-
> >  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   7 +
> >  drivers/pci/Kconfig                           |   9 +
> >  drivers/pci/Makefile                          |   1 +
> >  drivers/pci/pci.c                             |   4 +
> >  drivers/pci/pci.h                             |  12 +
> >  drivers/pci/probe.c                           |   1 +
> >  drivers/pci/tph.c                             | 546 ++++++++++++++++++
> >  include/linux/pci-tph.h                       |  44 ++
> >  include/linux/pci.h                           |   7 +
> >  include/uapi/linux/pci_regs.h                 |  37 +-
> >  net/core/netdev_rx_queue.c                    |   1 +
> >  16 files changed, 890 insertions(+), 10 deletions(-)
> >  create mode 100644 Documentation/PCI/tph.rst
> >  create mode 100644 drivers/pci/tph.c
> >  create mode 100644 include/linux/pci-tph.h
> 
> I tentatively applied this on pci/tph for v6.13.
> 
> Not sure what you intend for the bnxt changes, since they depend on
> the PCI core changes.  I'm happy to merge them via PCI, given acks
> from Michael and an overall network maintainer.

Given the ongoing discussion about the bnxt_en patches, I dropped
those, so the PCI tree pci/tph branch now contains only these:

  e045e5c1c706 ("PCI/TPH: Add TPH documentation")
  d2e8a34876ce ("PCI/TPH: Add Steering Tag support")
  f69767a1ada3 ("PCI: Add TLP Processing Hints (TPH) support")

This is headed for v6.13, but the branch should not be considered
immutable, and it may be merged during the merge window either before
or after the netdev tree.

Bjorn

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V7 2/5] PCI/TPH: Add Steering Tag support
  2024-10-02 16:59 ` [PATCH V7 2/5] PCI/TPH: Add Steering Tag support Wei Huang
@ 2025-02-04 18:33   ` Robin Murphy
  2025-02-04 20:18     ` Wei Huang
  0 siblings, 1 reply; 17+ messages in thread
From: Robin Murphy @ 2025-02-04 18:33 UTC (permalink / raw)
  To: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas,
	lukas, paul.e.luse, jing2.liu

On 2024-10-02 5:59 pm, Wei Huang wrote:
[...]
> +/**
> + * pcie_tph_set_st_entry() - Set Steering Tag in the ST table entry
> + * @pdev: PCI device
> + * @index: ST table entry index
> + * @tag: Steering Tag to be written
> + *
> + * This function will figure out the proper location of ST table, either in the
> + * MSI-X table or in the TPH Extended Capability space, and write the Steering
> + * Tag into the ST entry pointed by index.
> + *
> + * Returns: 0 if success, otherwise negative value (-errno)
> + */
> +int pcie_tph_set_st_entry(struct pci_dev *pdev, unsigned int index, u16 tag)
> +{
> +	u32 loc;
> +	int err = 0;
> +
> +	if (!pdev->tph_cap)
> +		return -EINVAL;
> +
> +	if (!pdev->tph_enabled)
> +		return -EINVAL;
> +
> +	/* No need to write tag if device is in "No ST Mode" */
> +	if (pdev->tph_mode == PCI_TPH_ST_NS_MODE)
> +		return 0;
> +
> +	/* Disable TPH before updating ST to avoid potential instability as
> +	 * cautioned in PCIe r6.2, sec 6.17.3, "ST Modes of Operation"
> +	 */
> +	set_ctrl_reg_req_en(pdev, PCI_TPH_REQ_DISABLE);
> +
> +	loc = get_st_table_loc(pdev);
> +	/* Convert loc to match with PCI_TPH_LOC_* defined in pci_regs.h */
> +	loc = FIELD_PREP(PCI_TPH_CAP_LOC_MASK, loc);
> +
> +	switch (loc) {
> +	case PCI_TPH_LOC_MSIX:
> +		err = write_tag_to_msix(pdev, index, tag);
> +		break;
> +	case PCI_TPH_LOC_CAP:
> +		err = write_tag_to_st_table(pdev, index, tag);
> +		break;
> +	default:
> +		err = -EINVAL;
> +	}
> +
> +	if (err) {
> +		pcie_disable_tph(pdev);
> +		return err;
> +	}
> +
> +	set_ctrl_reg_req_en(pdev, pdev->tph_mode);

Just looking at this code in mainline, and I don't trust my 
understanding quite enough to send a patch myself, but doesn't this want 
to be pdev->tph_req_type, rather than tph_mode?

Thanks,
Robin.

> +
> +	pci_dbg(pdev, "set steering tag: %s table, index=%d, tag=%#04x\n",
> +		(loc == PCI_TPH_LOC_MSIX) ? "MSI-X" : "ST", index, tag);
> +
> +	return 0;
> +}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V7 2/5] PCI/TPH: Add Steering Tag support
  2025-02-04 18:33   ` Robin Murphy
@ 2025-02-04 20:18     ` Wei Huang
  2025-02-05 12:57       ` Robin Murphy
  0 siblings, 1 reply; 17+ messages in thread
From: Wei Huang @ 2025-02-04 20:18 UTC (permalink / raw)
  To: Robin Murphy, linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas,
	lukas, paul.e.luse, jing2.liu



On 2/4/25 12:33 PM, Robin Murphy wrote:
> On 2024-10-02 5:59 pm, Wei Huang wrote:
> [...]
>> +
>> +	if (err) {
>> +		pcie_disable_tph(pdev);
>> +		return err;
>> +	}
>> +
>> +	set_ctrl_reg_req_en(pdev, pdev->tph_mode);
> 
> Just looking at this code in mainline, and I don't trust my
> understanding quite enough to send a patch myself, but doesn't this want
> to be pdev->tph_req_type, rather than tph_mode?

Yeah, you are right - this is supposed to be pdev->tph_req_type instead 
of tph_mode. We disable TPH first by clearing (zeroing) the "TPH Requester 
Enable" field and need to set it back using tph_req_type.
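
To make the difference concrete, here is a minimal userspace sketch of the
control register fields involved. It is an illustration only, not the code
in drivers/pci/tph.c: the MODEL_* names and helpers are hypothetical, and
the field positions and encodings (ST Mode Select in bits 2:0, TPH
Requester Enable in bits 9:8) reflect my reading of the PCIe spec and
should be checked against pci_regs.h before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the TPH Requester Control register (sketch only). */
#define MODEL_CTRL_MODE_SEL_MASK	0x00000007u	/* ST Mode Select */
#define MODEL_CTRL_REQ_EN_MASK		0x00000300u	/* TPH Requester Enable */
#define MODEL_CTRL_REQ_EN_SHIFT		8

/* Assumed ST Mode Select encodings (what tph_mode would hold). */
enum model_st_mode {
	MODEL_ST_NS = 0,	/* No ST Mode */
	MODEL_ST_IV = 1,	/* Interrupt Vector Mode */
	MODEL_ST_DS = 2,	/* Device Specific Mode */
};

/* Assumed Requester Enable encodings (what tph_req_type would hold). */
enum model_req_type {
	MODEL_REQ_DISABLE  = 0,
	MODEL_REQ_TPH_ONLY = 1,
	MODEL_REQ_EXT_TPH  = 3,
};

/* Return ctrl with only the Requester Enable field replaced by val. */
static uint32_t model_set_req_en(uint32_t ctrl, unsigned int val)
{
	assert(val <= 3);
	ctrl &= ~MODEL_CTRL_REQ_EN_MASK;
	ctrl |= ((uint32_t)val << MODEL_CTRL_REQ_EN_SHIFT) &
		MODEL_CTRL_REQ_EN_MASK;
	return ctrl;
}

/* Extract the Requester Enable field from ctrl. */
static unsigned int model_get_req_en(uint32_t ctrl)
{
	return (ctrl & MODEL_CTRL_REQ_EN_MASK) >> MODEL_CTRL_REQ_EN_SHIFT;
}
```

Under these assumptions, a device in Interrupt Vector Mode would be
re-enabled with the value 1, which happens to match "TPH only", while a
device in Device Specific Mode would get the value 2, which is not one of
the request types listed above. Writing tph_req_type back restores
whatever was enabled before the update.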

Do you want to send in a fix? I can ACK it. Thanks for spotting it.

-Wei

> 
> Thanks,
> Robin.
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V7 2/5] PCI/TPH: Add Steering Tag support
  2025-02-04 20:18     ` Wei Huang
@ 2025-02-05 12:57       ` Robin Murphy
  0 siblings, 0 replies; 17+ messages in thread
From: Robin Murphy @ 2025-02-05 12:57 UTC (permalink / raw)
  To: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas,
	lukas, paul.e.luse, jing2.liu

On 2025-02-04 8:18 pm, Wei Huang wrote:
> 
> 
> On 2/4/25 12:33 PM, Robin Murphy wrote:
>> On 2024-10-02 5:59 pm, Wei Huang wrote:
>> [...]
>>> +
>>> +    if (err) {
>>> +        pcie_disable_tph(pdev);
>>> +        return err;
>>> +    }
>>> +
>>> +    set_ctrl_reg_req_en(pdev, pdev->tph_mode);
>>
>> Just looking at this code in mainline, and I don't trust my
>> understanding quite enough to send a patch myself, but doesn't this want
>> to be pdev->tph_req_type, rather than tph_mode?
> 
> Yeah, you are right - this is supposed to be pdev->tph_req_type instead 
> of tph_mode. We disable TPH first by clearing (zeroing) the "TPH Requester 
> Enable" field and need to set it back using tph_req_type.
> 
> Do you want to send in a fix? I can ACK it. Thanks for spotting it.

Done[1] - cheers for confirming!

Robin.


[1] 
https://lore.kernel.org/linux-pci/13118098116d7bce07aa20b8c52e28c7d1847246.1738759933.git.robin.murphy@arm.com/

> 
> -Wei
> 
>>
>> Thanks,
>> Robin.
>>


^ permalink raw reply	[flat|nested] 17+ messages in thread

Thread overview: 17+ messages
2024-10-02 16:59 [PATCH V7 0/5] TPH and cache direct injection support Wei Huang
2024-10-02 16:59 ` [PATCH V7 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
2024-10-02 16:59 ` [PATCH V7 2/5] PCI/TPH: Add Steering Tag support Wei Huang
2025-02-04 18:33   ` Robin Murphy
2025-02-04 20:18     ` Wei Huang
2025-02-05 12:57       ` Robin Murphy
2024-10-02 16:59 ` [PATCH V7 3/5] PCI/TPH: Add TPH documentation Wei Huang
2024-10-02 16:59 ` [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
2024-10-08 13:39   ` Jakub Kicinski
2024-10-11 18:35     ` Panicker, Manoj
2024-10-15 19:50       ` Wei Huang
2024-10-15 23:45         ` Jakub Kicinski
2024-10-02 16:59 ` [PATCH V7 5/5] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings Wei Huang
2024-10-02 21:35 ` [PATCH V7 0/5] TPH and cache direct injection support Bjorn Helgaas
2024-10-02 22:08   ` Michael Chan
2024-10-08  7:32     ` Paolo Abeni
2024-10-16 21:31   ` Bjorn Helgaas
