linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V5 0/5] PCIe TPH and cache direct injection support
@ 2024-09-16 20:50 Wei Huang
  2024-09-16 20:50 ` [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Wei Huang @ 2024-09-16 20:50 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

Hi All,

TPH (TLP Processing Hints) is a PCIe feature that allows endpoint
devices to provide optimization hints for requests that target memory
space. These hints, in a format called steering tag (ST), are provided
in the requester's TLP headers and allow the system hardware, including
the Root Complex, to optimize the utilization of platform resources
for the requests.

Upcoming AMD hardware implement a new Cache Injection feature that
leverages TPH. Cache Injection allows PCIe endpoints to inject I/O
Coherent DMA writes directly into an L2 within the CCX (core complex)
closest to the CPU core that will consume it. This technology is aimed
at applications requiring high performance and low latency, such as
networking and storage applications.

This series introduces generic TPH support in Linux, allowing STs to be
retrieved and used by PCIe endpoint drivers as needed. As a
demonstration, it includes an example usage in the Broadcom BNXT driver.
When running on Broadcom NICs with the appropriate firmware, it shows
substantial memory bandwidth savings and better network bandwidth using
real-world benchmarks. This solution is vendor-neutral and implemented
based on industry standards (PCIe Spec and PCI FW Spec).

V4->V5:
 * Rebase on top of net-next/main tree (Broadcom)
 * Remove TPH mode query and TPH enabled checking functions (Bjorn)
 * Remove "nostmode" kernel parameter (Bjorn)
 * Add "notph" kernel parameter support (Bjorn)
 * Add back TPH documentation (Bjorn)
 * Change TPH register namings (Bjorn)
 * Squash TPH enable/disable/save/restore funcs as a single patch (Bjorn)
 * Squash ST get_st/set_st funcs as a signle patch (Bjorn)
 * Replace nic_open/close with netdev_rx_queue_restart() (Jakub, Broadcom)

V3->V4:
 * Rebase on top of the latest pci/next tree (tag: 6.11-rc1)
 * Add new API functioins to query/enable/disable TPH support
 * Make pcie_tph_set_st() completely independent from pcie_tph_get_cpu_st()
 * Rewrite bnxt.c based on new APIs
 * Remove documentation for now due to constantly changing API
 * Remove pci=notph, but keep pci=nostmode with better flow (Bjorn)
 * Lots of code rewrite in tph.c & pci-tph.h with cleaner interface (Bjorn)
 * Add TPH save/restore support (Paul Luse and Lukas Wunner)

V2->V3:
 * Rebase on top of pci/next tree (tag: pci-v6.11-changes)
 * Redefine PCI TPH registers (pci_regs.h) without breaking uapi
 * Fix commit subjects/messages for kernel options (Jonathan and Bjorn)
 * Break API functions into three individual patches for easy review
 * Rewrite lots of code in tph.c/tph.h based (Jonathan and Bjorn)

V1->V2:
 * Rebase on top of pci.git/for-linus (6.10-rc1)
 * Address mismatched data types reported by Sparse (Sparse check passed)
 * Add pcie_tph_intr_vec_supported() for checking IRQ mode support
 * Skip bnxt affinity notifier registration if
   pcie_tph_intr_vec_supported()=false
 * Minor fixes in bnxt driver (i.e. warning messages)

Manoj Panicker (1):
  bnxt_en: Add TPH support in BNXT driver

Michael Chan (1):
  bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings

Wei Huang (3):
  PCI: Add TLP Processing Hints (TPH) support
  PCI/TPH: Add Steering Tag support
  PCI/TPH: Add TPH documentation

Documentation/PCI/index.rst                   |   1 +
 Documentation/PCI/tph.rst                     | 132 +++++
 .../admin-guide/kernel-parameters.txt         |   4 +
 Documentation/driver-api/pci/pci.rst          |   3 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  93 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   7 +
 drivers/pci/pci.c                             |   4 +
 drivers/pci/pci.h                             |  12 +
 drivers/pci/pcie/Kconfig                      |  11 +
 drivers/pci/pcie/Makefile                     |   1 +
 drivers/pci/pcie/tph.c                        | 536 ++++++++++++++++++
 drivers/pci/probe.c                           |   1 +
 include/linux/pci-tph.h                       |  44 ++
 include/linux/pci.h                           |   7 +
 include/uapi/linux/pci_regs.h                 |  38 +-
 net/core/netdev_rx_queue.c                    |   1 +
 16 files changed, 885 insertions(+), 10 deletions(-)
 create mode 100644 Documentation/PCI/tph.rst
 create mode 100644 drivers/pci/pcie/tph.c
 create mode 100644 include/linux/pci-tph.h

-- 
2.45.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support
  2024-09-16 20:50 [PATCH V5 0/5] PCIe TPH and cache direct injection support Wei Huang
@ 2024-09-16 20:50 ` Wei Huang
  2024-09-17  7:38   ` Simon Horman
                     ` (2 more replies)
  2024-09-16 20:51 ` [PATCH V5 2/5] PCI/TPH: Add Steering Tag support Wei Huang
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 22+ messages in thread
From: Wei Huang @ 2024-09-16 20:50 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

Add support for PCIe TLP Processing Hints (TPH) support (see PCIe r6.2,
sec 6.17).

Add missing TPH register definitions in pci_regs.h, including the TPH
Requester capability register, TPH Requester control register, TPH
Completer capability, and the ST fields of MSI-X entry.

Introduce pcie_enable_tph() and pcie_disable_tph(), enabling drivers to
toggle TPH support and configure specific ST mode as needed. Also add a
new kernel parameter, "pci=notph", allowing users to disable TPH support
across the entire system.

Co-developed-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com>
Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com>
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
---
 .../admin-guide/kernel-parameters.txt         |   4 +
 drivers/pci/pci.c                             |   4 +
 drivers/pci/pci.h                             |  12 ++
 drivers/pci/pcie/Kconfig                      |  11 +
 drivers/pci/pcie/Makefile                     |   1 +
 drivers/pci/pcie/tph.c                        | 199 ++++++++++++++++++
 drivers/pci/probe.c                           |   1 +
 include/linux/pci-tph.h                       |  21 ++
 include/linux/pci.h                           |   7 +
 include/uapi/linux/pci_regs.h                 |  38 +++-
 10 files changed, 290 insertions(+), 8 deletions(-)
 create mode 100644 drivers/pci/pcie/tph.c
 create mode 100644 include/linux/pci-tph.h

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 09126bb8cc9f..8579d0fbcd33 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4617,6 +4617,10 @@
 		nomio		[S390] Do not use MIO instructions.
 		norid		[S390] ignore the RID field and force use of
 				one PCI domain per PCI function
+		notph		[PCIE] If the PCIE_TPH kernel config parameter
+				is enabled, this kernel boot option can be used
+				to disable PCIe TLP Processing Hints support
+				system-wide.
 
 	pcie_aspm=	[PCIE] Forcibly enable or ignore PCIe Active State Power
 			Management.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ffaaca0978cb..b6f60f7476cc 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1813,6 +1813,7 @@ int pci_save_state(struct pci_dev *dev)
 	pci_save_dpc_state(dev);
 	pci_save_aer_state(dev);
 	pci_save_ptm_state(dev);
+	pci_save_tph_state(dev);
 	return pci_save_vc_state(dev);
 }
 EXPORT_SYMBOL(pci_save_state);
@@ -1917,6 +1918,7 @@ void pci_restore_state(struct pci_dev *dev)
 	pci_restore_vc_state(dev);
 	pci_restore_rebar_state(dev);
 	pci_restore_dpc_state(dev);
+	pci_restore_tph_state(dev);
 	pci_restore_ptm_state(dev);
 
 	pci_aer_clear_status(dev);
@@ -6869,6 +6871,8 @@ static int __init pci_setup(char *str)
 				pci_no_domains();
 			} else if (!strncmp(str, "noari", 5)) {
 				pcie_ari_disabled = true;
+			} else if (!strncmp(str, "notph", 5)) {
+				pci_no_tph();
 			} else if (!strncmp(str, "cbiosize=", 9)) {
 				pci_cardbus_io_size = memparse(str + 9, &str);
 			} else if (!strncmp(str, "cbmemsize=", 10)) {
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 79c8398f3938..8eeabbbfa137 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -571,6 +571,18 @@ static inline int pci_iov_bus_range(struct pci_bus *bus)
 
 #endif /* CONFIG_PCI_IOV */
 
+#ifdef CONFIG_PCIE_TPH
+void pci_restore_tph_state(struct pci_dev *dev);
+void pci_save_tph_state(struct pci_dev *dev);
+void pci_no_tph(void);
+void pci_tph_init(struct pci_dev *dev);
+#else
+static inline void pci_restore_tph_state(struct pci_dev *dev) { }
+static inline void pci_save_tph_state(struct pci_dev *dev) { }
+static inline void pci_no_tph(void) { }
+static inline void pci_tph_init(struct pci_dev *dev) { }
+#endif
+
 #ifdef CONFIG_PCIE_PTM
 void pci_ptm_init(struct pci_dev *dev);
 void pci_save_ptm_state(struct pci_dev *dev);
diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
index 17919b99fa66..61e4bd16eaf1 100644
--- a/drivers/pci/pcie/Kconfig
+++ b/drivers/pci/pcie/Kconfig
@@ -155,3 +155,14 @@ config PCIE_EDR
 	  the PCI Firmware Specification r3.2.  Enable this if you want to
 	  support hybrid DPC model which uses both firmware and OS to
 	  implement DPC.
+
+config PCIE_TPH
+	bool "TLP Processing Hints"
+	depends on ACPI
+	default n
+	help
+	  This option adds support for PCIe TLP Processing Hints (TPH).
+	  TPH allows endpoint devices to provide optimization hints, such as
+	  desired caching behavior, for requests that target memory space.
+	  These hints, called Steering Tags, can empower the system hardware
+	  to optimize the utilization of platform resources.
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index 6461aa93fe76..3542b42ea0b9 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_PCIE_PME)		+= pme.o
 obj-$(CONFIG_PCIE_DPC)		+= dpc.o
 obj-$(CONFIG_PCIE_PTM)		+= ptm.o
 obj-$(CONFIG_PCIE_EDR)		+= edr.o
+obj-$(CONFIG_PCIE_TPH)		+= tph.o
diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
new file mode 100644
index 000000000000..1efd76c8dd30
--- /dev/null
+++ b/drivers/pci/pcie/tph.c
@@ -0,0 +1,199 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * TPH (TLP Processing Hints) support
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc.
+ *     Eric Van Tassell <Eric.VanTassell@amd.com>
+ *     Wei Huang <wei.huang2@amd.com>
+ */
+#include <linux/pci.h>
+#include <linux/pci-acpi.h>
+#include <linux/bitfield.h>
+#include <linux/msi.h>
+#include <linux/pci-tph.h>
+
+#include "../pci.h"
+
+/* System-wide TPH disabled */
+static bool pci_tph_disabled;
+
+static u8 get_st_modes(struct pci_dev *pdev)
+{
+	u32 reg;
+
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
+	reg &= PCI_TPH_CAP_ST_NS | PCI_TPH_CAP_ST_IV | PCI_TPH_CAP_ST_DS;
+
+	return reg;
+}
+
+/* Return device's Root Port completer capability */
+static u8 get_rp_completer_type(struct pci_dev *pdev)
+{
+	struct pci_dev *rp;
+	u32 reg;
+	int ret;
+
+	rp = pcie_find_root_port(pdev);
+	if (!rp)
+		return 0;
+
+	ret = pcie_capability_read_dword(rp, PCI_EXP_DEVCAP2, &reg);
+	if (ret)
+		return 0;
+
+	return FIELD_GET(PCI_EXP_DEVCAP2_TPH_COMP_MASK, reg);
+}
+
+/**
+ * pcie_disable_tph - Turn off TPH support for device
+ * @pdev: PCI device
+ *
+ * Return: none
+ */
+void pcie_disable_tph(struct pci_dev *pdev)
+{
+	if (!pdev->tph_cap)
+		return;
+
+	if (!pdev->tph_enabled)
+		return;
+
+	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, 0);
+
+	pdev->tph_mode = 0;
+	pdev->tph_req_type = 0;
+	pdev->tph_enabled = 0;
+}
+EXPORT_SYMBOL(pcie_disable_tph);
+
+/**
+ * pcie_enable_tph - Enable TPH support for device using a specific ST mode
+ * @pdev: PCI device
+ * @mode: ST mode to enable. Current supported modes include:
+ *
+ *   - PCI_TPH_ST_NS_MODE: NO ST Mode
+ *   - PCI_TPH_ST_IV_MODE: Interrupt Vector Mode
+ *   - PCI_TPH_ST_DS_MODE: Device Specific Mode
+ *
+ * Checks whether the mode is actually supported by the device before enabling
+ * and returns an error if not. Additionally determines what types of requests,
+ * TPH or extended TPH, can be issued by the device based on its TPH requester
+ * capability and the Root Port's completer capability.
+ *
+ * Return: 0 on success, otherwise negative value (-errno)
+ */
+int pcie_enable_tph(struct pci_dev *pdev, int mode)
+{
+	u32 reg;
+	u8 dev_modes;
+	u8 rp_req_type;
+
+	/* Honor "notph" kernel parameter */
+	if (pci_tph_disabled)
+		return -EINVAL;
+
+	if (!pdev->tph_cap)
+		return -EINVAL;
+
+	if (pdev->tph_enabled)
+		return -EBUSY;
+
+	/* Sanitize and check ST mode comptability */
+	mode &= PCI_TPH_CTRL_MODE_SEL_MASK;
+	dev_modes = get_st_modes(pdev);
+	if (!((1 << mode) & dev_modes))
+		return -EINVAL;
+
+	pdev->tph_mode = mode;
+
+	/* Get req_type supported by device and its Root Port */
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
+	if (FIELD_GET(PCI_TPH_CAP_EXT_TPH, reg))
+		pdev->tph_req_type = PCI_TPH_REQ_EXT_TPH;
+	else
+		pdev->tph_req_type = PCI_TPH_REQ_TPH_ONLY;
+
+	rp_req_type = get_rp_completer_type(pdev);
+
+	/* Final req_type is the smallest value of two */
+	pdev->tph_req_type = min(pdev->tph_req_type, rp_req_type);
+
+	if (pdev->tph_req_type == PCI_TPH_REQ_DISABLE)
+		return -EINVAL;
+
+	/* Write them into TPH control register */
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, &reg);
+
+	reg &= ~PCI_TPH_CTRL_MODE_SEL_MASK;
+	reg |= FIELD_PREP(PCI_TPH_CTRL_MODE_SEL_MASK, pdev->tph_mode);
+
+	reg &= ~PCI_TPH_CTRL_REQ_EN_MASK;
+	reg |= FIELD_PREP(PCI_TPH_CTRL_REQ_EN_MASK, pdev->tph_req_type);
+
+	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg);
+
+	pdev->tph_enabled = 1;
+
+	return 0;
+}
+EXPORT_SYMBOL(pcie_enable_tph);
+
+void pci_restore_tph_state(struct pci_dev *pdev)
+{
+	struct pci_cap_saved_state *save_state;
+	u32 *cap;
+
+	if (!pdev->tph_cap)
+		return;
+
+	if (!pdev->tph_enabled)
+		return;
+
+	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_TPH);
+	if (!save_state)
+		return;
+
+	/* Restore control register and all ST entries */
+	cap = &save_state->cap.data[0];
+	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, *cap++);
+}
+
+void pci_save_tph_state(struct pci_dev *pdev)
+{
+	struct pci_cap_saved_state *save_state;
+	u32 *cap;
+
+	if (!pdev->tph_cap)
+		return;
+
+	if (!pdev->tph_enabled)
+		return;
+
+	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_TPH);
+	if (!save_state)
+		return;
+
+	/* Save control register */
+	cap = &save_state->cap.data[0];
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, cap++);
+}
+
+void pci_no_tph(void)
+{
+	pci_tph_disabled = true;
+
+	pr_info("PCIe TPH is disabled\n");
+}
+
+void pci_tph_init(struct pci_dev *pdev)
+{
+	u32 save_size;
+
+	pdev->tph_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_TPH);
+	if (!pdev->tph_cap)
+		return;
+
+	save_size = sizeof(u32);
+	pci_add_ext_cap_save_buffer(pdev, PCI_EXT_CAP_ID_TPH, save_size);
+}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index b14b9876c030..c74adcdee52b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2498,6 +2498,7 @@ static void pci_init_capabilities(struct pci_dev *dev)
 	pci_dpc_init(dev);		/* Downstream Port Containment */
 	pci_rcec_init(dev);		/* Root Complex Event Collector */
 	pci_doe_init(dev);		/* Data Object Exchange */
+	pci_tph_init(dev);		/* TLP Processing Hints */
 
 	pcie_report_downtraining(dev);
 	pci_init_reset_methods(dev);
diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
new file mode 100644
index 000000000000..58654a334ffb
--- /dev/null
+++ b/include/linux/pci-tph.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * TPH (TLP Processing Hints)
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc.
+ *     Eric Van Tassell <Eric.VanTassell@amd.com>
+ *     Wei Huang <wei.huang2@amd.com>
+ */
+#ifndef LINUX_PCI_TPH_H
+#define LINUX_PCI_TPH_H
+
+#ifdef CONFIG_PCIE_TPH
+void pcie_disable_tph(struct pci_dev *pdev);
+int pcie_enable_tph(struct pci_dev *pdev, int mode);
+#else
+static inline void pcie_disable_tph(struct pci_dev *pdev) { }
+static inline int pcie_enable_tph(struct pci_dev *pdev, int mode)
+{ return -EINVAL; }
+#endif
+
+#endif /* LINUX_PCI_TPH_H */
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 4cf89a4b4cbc..6f05deb6a0bf 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -433,6 +433,7 @@ struct pci_dev {
 	unsigned int	ats_enabled:1;		/* Address Translation Svc */
 	unsigned int	pasid_enabled:1;	/* Process Address Space ID */
 	unsigned int	pri_enabled:1;		/* Page Request Interface */
+	unsigned int	tph_enabled:1;		/* TLP Processing Hints */
 	unsigned int	is_managed:1;		/* Managed via devres */
 	unsigned int	is_msi_managed:1;	/* MSI release via devres installed */
 	unsigned int	needs_freset:1;		/* Requires fundamental reset */
@@ -530,6 +531,12 @@ struct pci_dev {
 
 	/* These methods index pci_reset_fn_methods[] */
 	u8 reset_methods[PCI_NUM_RESET_METHODS]; /* In priority order */
+
+#ifdef CONFIG_PCIE_TPH
+	u16		tph_cap;	/* TPH capability offset */
+	u8		tph_mode;	/* TPH mode */
+	u8		tph_req_type;	/* TPH requester type */
+#endif
 };
 
 static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 94c00996e633..25af1976953c 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -340,7 +340,9 @@
 #define PCI_MSIX_ENTRY_UPPER_ADDR	0x4  /* Message Upper Address */
 #define PCI_MSIX_ENTRY_DATA		0x8  /* Message Data */
 #define PCI_MSIX_ENTRY_VECTOR_CTRL	0xc  /* Vector Control */
-#define  PCI_MSIX_ENTRY_CTRL_MASKBIT	0x00000001
+#define  PCI_MSIX_ENTRY_CTRL_MASKBIT	0x00000001  /* Mask Bit */
+#define  PCI_MSIX_ENTRY_CTRL_ST_LOWER	0x00ff0000  /* ST Lower */
+#define  PCI_MSIX_ENTRY_CTRL_ST_UPPER	0xff000000  /* ST Upper */
 
 /* CompactPCI Hotswap Register */
 
@@ -657,6 +659,7 @@
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100 /* 64b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP128	0x00000200 /* 128b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_LTR		0x00000800 /* Latency tolerance reporting */
+#define  PCI_EXP_DEVCAP2_TPH_COMP_MASK	0x00003000 /* TPH completer support */
 #define  PCI_EXP_DEVCAP2_OBFF_MASK	0x000c0000 /* OBFF support mechanism */
 #define  PCI_EXP_DEVCAP2_OBFF_MSG	0x00040000 /* New message signaling */
 #define  PCI_EXP_DEVCAP2_OBFF_WAKE	0x00080000 /* Re-use WAKE# for OBFF */
@@ -1020,15 +1023,34 @@
 #define  PCI_DPA_CAP_SUBSTATE_MASK	0x1F	/* # substates - 1 */
 #define PCI_DPA_BASE_SIZEOF	16	/* size with 0 substates */
 
+/* TPH Completer Support */
+#define PCI_EXP_DEVCAP2_TPH_COMP_NONE		0x0 /* None */
+#define PCI_EXP_DEVCAP2_TPH_COMP_TPH_ONLY	0x1 /* TPH only */
+#define PCI_EXP_DEVCAP2_TPH_COMP_EXT_TPH	0x3 /* TPH and Extended TPH */
+
 /* TPH Requester */
 #define PCI_TPH_CAP		4	/* capability register */
-#define  PCI_TPH_CAP_LOC_MASK	0x600	/* location mask */
-#define   PCI_TPH_LOC_NONE	0x000	/* no location */
-#define   PCI_TPH_LOC_CAP	0x200	/* in capability */
-#define   PCI_TPH_LOC_MSIX	0x400	/* in MSI-X */
-#define PCI_TPH_CAP_ST_MASK	0x07FF0000	/* ST table mask */
-#define PCI_TPH_CAP_ST_SHIFT	16	/* ST table shift */
-#define PCI_TPH_BASE_SIZEOF	0xc	/* size with no ST table */
+#define  PCI_TPH_CAP_ST_NS	0x00000001 /* No ST Mode Supported */
+#define  PCI_TPH_CAP_ST_IV	0x00000002 /* Interrupt Vector Mode Supported */
+#define  PCI_TPH_CAP_ST_DS	0x00000004 /* Device Specific Mode Supported */
+#define  PCI_TPH_CAP_EXT_TPH	0x00000100 /* Ext TPH Requester Supported */
+#define  PCI_TPH_CAP_LOC_MASK	0x00000600 /* ST Table Location */
+#define   PCI_TPH_LOC_NONE	0x00000000 /* Not present */
+#define   PCI_TPH_LOC_CAP	0x00000200 /* In capability */
+#define   PCI_TPH_LOC_MSIX	0x00000400 /* In MSI-X */
+#define  PCI_TPH_CAP_ST_MASK	0x07FF0000 /* ST Table Size */
+#define  PCI_TPH_CAP_ST_SHIFT	16	/* ST Table Size shift */
+#define PCI_TPH_BASE_SIZEOF	0xc	/* Size with no ST table */
+
+#define PCI_TPH_CTRL		8	/* control register */
+#define  PCI_TPH_CTRL_MODE_SEL_MASK	0x00000007 /* ST Mode Select */
+#define   PCI_TPH_ST_NS_MODE		0x0 /* No ST Mode */
+#define   PCI_TPH_ST_IV_MODE		0x1 /* Interrupt Vector Mode */
+#define   PCI_TPH_ST_DS_MODE		0x2 /* Device Specific Mode */
+#define  PCI_TPH_CTRL_REQ_EN_MASK	0x00000300 /* TPH Requester Enable */
+#define   PCI_TPH_REQ_DISABLE		0x0 /* No TPH requests allowed */
+#define   PCI_TPH_REQ_TPH_ONLY		0x1 /* TPH only requests allowed */
+#define   PCI_TPH_REQ_EXT_TPH		0x3 /* Extended TPH requests allowed */
 
 /* Downstream Port Containment */
 #define PCI_EXP_DPC_CAP			0x04	/* DPC Capability */
-- 
2.45.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH V5 2/5] PCI/TPH: Add Steering Tag support
  2024-09-16 20:50 [PATCH V5 0/5] PCIe TPH and cache direct injection support Wei Huang
  2024-09-16 20:50 ` [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
@ 2024-09-16 20:51 ` Wei Huang
  2024-09-17  7:32   ` Simon Horman
  2024-09-20 10:38   ` kernel test robot
  2024-09-16 20:51 ` [PATCH V5 3/5] PCI/TPH: Add TPH documentation Wei Huang
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 22+ messages in thread
From: Wei Huang @ 2024-09-16 20:51 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

pcie_tph_get_cpu_st() is added to allow a caller to retrieve Steering Tags
for a target memory that is associated with a specific CPU. The ST tag is
retrieved by invoking ACPI _DSM of the device's Root Port device.

pcie_tph_set_st_entry() is added to support updating the device's Steering
Tags. The tags will be written into the device's MSI-X table or the ST
table located in the TPH Extended Capability space.

Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
 drivers/pci/pcie/tph.c  | 339 +++++++++++++++++++++++++++++++++++++++-
 include/linux/pci-tph.h |  23 +++
 2 files changed, 361 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
index 1efd76c8dd30..4b386616fc28 100644
--- a/drivers/pci/pcie/tph.c
+++ b/drivers/pci/pcie/tph.c
@@ -14,9 +14,134 @@
 
 #include "../pci.h"
 
+/*
+ * The st_info struct defines the Steering Tag (ST) info returned by the
+ * firmware _DSM method defined in the approved ECN for PCI Firmware Spec,
+ * available at https://members.pcisig.com/wg/PCI-SIG/document/15470.
+ *
+ * @vm_st_valid:  8-bit ST for volatile memory is valid
+ * @vm_xst_valid: 16-bit extended ST for volatile memory is valid
+ * @vm_ph_ignore: 1 => PH was and will be ignored, 0 => PH should be supplied
+ * @vm_st:        8-bit ST for volatile mem
+ * @vm_xst:       16-bit extended ST for volatile mem
+ * @pm_st_valid:  8-bit ST for persistent memory is valid
+ * @pm_xst_valid: 16-bit extended ST for persistent memory is valid
+ * @pm_ph_ignore: 1 => PH was and will be ignored, 0 => PH should be supplied
+ * @pm_st:        8-bit ST for persistent mem
+ * @pm_xst:       16-bit extended ST for persistent mem
+ */
+union st_info {
+	struct {
+		u64 vm_st_valid : 1;
+		u64 vm_xst_valid : 1;
+		u64 vm_ph_ignore : 1;
+		u64 rsvd1 : 5;
+		u64 vm_st : 8;
+		u64 vm_xst : 16;
+		u64 pm_st_valid : 1;
+		u64 pm_xst_valid : 1;
+		u64 pm_ph_ignore : 1;
+		u64 rsvd2 : 5;
+		u64 pm_st : 8;
+		u64 pm_xst : 16;
+	};
+	u64 value;
+};
+
 /* System-wide TPH disabled */
 static bool pci_tph_disabled;
 
+static u16 tph_extract_tag(enum tph_mem_type mem_type, u8 req_type,
+			   union st_info *info)
+{
+	switch (req_type) {
+	case PCI_TPH_REQ_TPH_ONLY: /* 8-bit tag */
+		switch (mem_type) {
+		case TPH_MEM_TYPE_VM:
+			if (info->vm_st_valid)
+				return info->vm_st;
+			break;
+		case TPH_MEM_TYPE_PM:
+			if (info->pm_st_valid)
+				return info->pm_st;
+			break;
+		}
+		break;
+	case PCI_TPH_REQ_EXT_TPH: /* 16-bit tag */
+		switch (mem_type) {
+		case TPH_MEM_TYPE_VM:
+			if (info->vm_xst_valid)
+				return info->vm_xst;
+			break;
+		case TPH_MEM_TYPE_PM:
+			if (info->pm_xst_valid)
+				return info->pm_xst;
+			break;
+		}
+		break;
+	default:
+		return 0;
+	}
+
+	return 0;
+}
+
+#define TPH_ST_DSM_FUNC_INDEX	0xF
+static acpi_status tph_invoke_dsm(acpi_handle handle, u32 cpu_uid,
+				  union st_info *st_out)
+{
+	union acpi_object arg3[3], in_obj, *out_obj;
+
+	if (!acpi_check_dsm(handle, &pci_acpi_dsm_guid, 7,
+			    BIT(TPH_ST_DSM_FUNC_INDEX)))
+		return AE_ERROR;
+
+	/* DWORD: feature ID (0 for processor cache ST query) */
+	arg3[0].integer.type = ACPI_TYPE_INTEGER;
+	arg3[0].integer.value = 0;
+
+	/* DWORD: target UID */
+	arg3[1].integer.type = ACPI_TYPE_INTEGER;
+	arg3[1].integer.value = cpu_uid;
+
+	/* QWORD: properties, all 0's */
+	arg3[2].integer.type = ACPI_TYPE_INTEGER;
+	arg3[2].integer.value = 0;
+
+	in_obj.type = ACPI_TYPE_PACKAGE;
+	in_obj.package.count = ARRAY_SIZE(arg3);
+	in_obj.package.elements = arg3;
+
+	out_obj = acpi_evaluate_dsm(handle, &pci_acpi_dsm_guid, 7,
+				    TPH_ST_DSM_FUNC_INDEX, &in_obj);
+	if (!out_obj)
+		return AE_ERROR;
+
+	if (out_obj->type != ACPI_TYPE_BUFFER) {
+		ACPI_FREE(out_obj);
+		return AE_ERROR;
+	}
+
+	st_out->value = *((u64 *)(out_obj->buffer.pointer));
+
+	ACPI_FREE(out_obj);
+
+	return AE_OK;
+}
+
+/* Update the TPH Requester Enable field of TPH Control Register */
+static void set_ctrl_reg_req_en(struct pci_dev *pdev, u8 req_type)
+{
+	u32 reg;
+
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, &reg);
+
+	reg &= ~PCI_TPH_CTRL_REQ_EN_MASK;
+	reg |= FIELD_PREP(PCI_TPH_CTRL_REQ_EN_MASK, req_type);
+
+	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg);
+}
+
 static u8 get_st_modes(struct pci_dev *pdev)
 {
 	u32 reg;
@@ -27,6 +152,37 @@ static u8 get_st_modes(struct pci_dev *pdev)
 	return reg;
 }
 
+static u32 get_st_table_loc(struct pci_dev *pdev)
+{
+	u32 reg;
+
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
+
+	return FIELD_GET(PCI_TPH_CAP_LOC_MASK, reg);
+}
+
+/*
+ * Return the size of ST table. If ST table is not in TPH Requester Extended
+ * Capability space, return 0. Otherwise return the ST Table Size + 1.
+ */
+static u16 get_st_table_size(struct pci_dev *pdev)
+{
+	u32 reg;
+	u32 loc;
+
+	/* Check ST table location first */
+	loc = get_st_table_loc(pdev);
+
+	/* Convert loc to match with PCI_TPH_LOC_* defined in pci_regs.h */
+	loc = FIELD_PREP(PCI_TPH_CAP_LOC_MASK, loc);
+	if (loc != PCI_TPH_LOC_CAP)
+		return 0;
+
+	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
+
+	return FIELD_GET(PCI_TPH_CAP_ST_MASK, reg) + 1;
+}
+
 /* Return device's Root Port completer capability */
 static u8 get_rp_completer_type(struct pci_dev *pdev)
 {
@@ -45,6 +201,163 @@ static u8 get_rp_completer_type(struct pci_dev *pdev)
 	return FIELD_GET(PCI_EXP_DEVCAP2_TPH_COMP_MASK, reg);
 }
 
+/* Write ST to MSI-X vector control reg - Return 0 if OK, otherwise -errno */
+static int write_tag_to_msix(struct pci_dev *pdev, int msix_idx, u16 tag)
+{
+	struct msi_desc *msi_desc = NULL;
+	void __iomem *vec_ctrl;
+	u32 val, mask;
+	int err = 0;
+
+	msi_lock_descs(&pdev->dev);
+
+	/* Find the msi_desc entry with matching msix_idx */
+	msi_for_each_desc(msi_desc, &pdev->dev, MSI_DESC_ASSOCIATED) {
+		if (msi_desc->msi_index == msix_idx)
+			break;
+	}
+
+	if (!msi_desc) {
+		err = -ENXIO;
+		goto err_out;
+	}
+
+	/* Get the vector control register (offset 0xc) pointed by msix_idx */
+	vec_ctrl = pdev->msix_base + msix_idx * PCI_MSIX_ENTRY_SIZE;
+	vec_ctrl += PCI_MSIX_ENTRY_VECTOR_CTRL;
+
+	val = readl(vec_ctrl);
+	mask = PCI_MSIX_ENTRY_CTRL_ST_LOWER | PCI_MSIX_ENTRY_CTRL_ST_UPPER;
+	val &= ~mask;
+	val |= FIELD_PREP(mask, (u32)tag);
+	writel(val, vec_ctrl);
+
+	/* Read back to flush the update */
+	val = readl(vec_ctrl);
+
+err_out:
+	msi_unlock_descs(&pdev->dev);
+	return err;
+}
+
+/* Write tag to ST table - Return 0 if OK, otherwise -errno */
+static int write_tag_to_st_table(struct pci_dev *pdev, int index, u16 tag)
+{
+	int st_table_size;
+	int offset;
+
+	/* Check if index is out of bound */
+	st_table_size = get_st_table_size(pdev);
+	if (index >= st_table_size)
+		return -ENXIO;
+
+	offset = pdev->tph_cap + PCI_TPH_BASE_SIZEOF + index * sizeof(u16);
+
+	return pci_write_config_word(pdev, offset, tag);
+}
+
+/**
+ * pcie_tph_get_cpu_st() - Retrieve Steering Tag for a target memory associated
+ * with a specific CPU
+ * @pdev: PCI device
+ * @mem_type: target memory type (volatile or persistent RAM)
+ * @cpu_uid: associated CPU id
+ * @tag: Steering Tag to be returned
+ *
+ * This function returns the Steering Tag for a target memory that is
+ * associated with a specific CPU as indicated by cpu_uid.
+ *
+ * Returns: 0 if success, otherwise negative value (-errno)
+ */
+int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type mem_type,
+			unsigned int cpu_uid, u16 *tag)
+{
+	struct pci_dev *rp;
+	acpi_handle rp_acpi_handle;
+	union st_info info;
+
+	rp = pcie_find_root_port(pdev);
+	if (!rp || !rp->bus || !rp->bus->bridge)
+		return -ENODEV;
+
+	rp_acpi_handle = ACPI_HANDLE(rp->bus->bridge);
+
+	if (tph_invoke_dsm(rp_acpi_handle, cpu_uid, &info) != AE_OK) {
+		*tag = 0;
+		return -EINVAL;
+	}
+
+	*tag = tph_extract_tag(mem_type, pdev->tph_req_type, &info);
+
+	pci_dbg(pdev, "get steering tag: mem_type=%s, cpu_uid=%d, tag=%#04x\n",
+		(mem_type == TPH_MEM_TYPE_VM) ? "volatile" : "persistent",
+		cpu_uid, *tag);
+
+	return 0;
+}
+EXPORT_SYMBOL(pcie_tph_get_cpu_st);
+
+/**
+ * pcie_tph_set_st_entry() - Set Steering Tag in the ST table entry
+ * @pdev: PCI device
+ * @index: ST table entry index
+ * @tag: Steering Tag to be written
+ *
+ * This function will figure out the proper location of ST table, either in the
+ * MSI-X table or in the TPH Extended Capability space, and write the Steering
+ * Tag into the ST entry pointed by index.
+ *
+ * Returns: 0 if success, otherwise negative value (-errno)
+ */
+int pcie_tph_set_st_entry(struct pci_dev *pdev, unsigned int index, u16 tag)
+{
+	u32 loc;
+	int err = 0;
+
+	if (!pdev->tph_cap)
+		return -EINVAL;
+
+	if (!pdev->tph_enabled)
+		return -EINVAL;
+
+	/* No need to write tag if device is in "No ST Mode" */
+	if (pdev->tph_mode == PCI_TPH_ST_NS_MODE)
+		return 0;
+
+	/* Disable TPH before updating ST to avoid potential instability as
+	 * cautioned in PCIe r6.2, sec 6.17.3, "ST Modes of Operation"
+	 */
+	set_ctrl_reg_req_en(pdev, PCI_TPH_REQ_DISABLE);
+
+	loc = get_st_table_loc(pdev);
+	/* Convert loc to match with PCI_TPH_LOC_* defined in pci_regs.h */
+	loc = FIELD_PREP(PCI_TPH_CAP_LOC_MASK, loc);
+
+	switch (loc) {
+	case PCI_TPH_LOC_MSIX:
+		err = write_tag_to_msix(pdev, index, tag);
+		break;
+	case PCI_TPH_LOC_CAP:
+		err = write_tag_to_st_table(pdev, index, tag);
+		break;
+	default:
+		err = -EINVAL;
+	}
+
+	if (err) {
+		pcie_disable_tph(pdev);
+		return err;
+	}
+
+	set_ctrl_reg_req_en(pdev, pdev->tph_mode);
+
+	pci_dbg(pdev, "set steering tag: %s table, index=%d, tag=%#04x\n",
+		(loc == PCI_TPH_LOC_MSIX) ? "MSI-X" : "ST", index, tag);
+
+	return 0;
+}
+EXPORT_SYMBOL(pcie_tph_set_st_entry);
+
 /**
  * pcie_disable_tph - Turn off TPH support for device
  * @pdev: PCI device
@@ -142,6 +455,8 @@ EXPORT_SYMBOL(pcie_enable_tph);
 void pci_restore_tph_state(struct pci_dev *pdev)
 {
 	struct pci_cap_saved_state *save_state;
+	int num_entries, i, offset;
+	u16 *st_entry;
 	u32 *cap;
 
 	if (!pdev->tph_cap)
@@ -157,11 +472,21 @@ void pci_restore_tph_state(struct pci_dev *pdev)
 	/* Restore control register and all ST entries */
 	cap = &save_state->cap.data[0];
 	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, *cap++);
+	st_entry = (u16 *)cap;
+	offset = PCI_TPH_BASE_SIZEOF;
+	num_entries = get_st_table_size(pdev);
+	for (i = 0; i < num_entries; i++) {
+		pci_write_config_word(pdev, pdev->tph_cap + offset,
+				      *st_entry++);
+		offset += sizeof(u16);
+	}
 }
 
 void pci_save_tph_state(struct pci_dev *pdev)
 {
 	struct pci_cap_saved_state *save_state;
+	int num_entries, i, offset;
+	u16 *st_entry;
 	u32 *cap;
 
 	if (!pdev->tph_cap)
@@ -177,6 +502,16 @@ void pci_save_tph_state(struct pci_dev *pdev)
 	/* Save control register */
 	cap = &save_state->cap.data[0];
 	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, cap++);
+
+	/* Save all ST entries in extended capability structure */
+	st_entry = (u16 *)cap;
+	offset = PCI_TPH_BASE_SIZEOF;
+	num_entries = get_st_table_size(pdev);
+	for (i = 0; i < num_entries; i++) {
+		pci_read_config_word(pdev, pdev->tph_cap + offset,
+				     st_entry++);
+		offset += sizeof(u16);
+	}
 }
 
 void pci_no_tph(void)
@@ -188,12 +523,14 @@ void pci_no_tph(void)
 
 void pci_tph_init(struct pci_dev *pdev)
 {
+	int num_entries;
 	u32 save_size;
 
 	pdev->tph_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_TPH);
 	if (!pdev->tph_cap)
 		return;
 
-	save_size = sizeof(u32);
+	num_entries = get_st_table_size(pdev);
+	save_size = sizeof(u32) + num_entries * sizeof(u16);
 	pci_add_ext_cap_save_buffer(pdev, PCI_EXT_CAP_ID_TPH, save_size);
 }
diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
index 58654a334ffb..c3e806c13d64 100644
--- a/include/linux/pci-tph.h
+++ b/include/linux/pci-tph.h
@@ -9,10 +9,33 @@
 #ifndef LINUX_PCI_TPH_H
 #define LINUX_PCI_TPH_H
 
+/*
+ * According to the ECN for PCI Firmware Spec, Steering Tag can be different
+ * depending on the memory type: Volatile Memory or Persistent Memory. When a
+ * caller query about a target's Steering Tag, it must provide the target's
+ * tph_mem_type. ECN link: https://members.pcisig.com/wg/PCI-SIG/document/15470.
+ */
+enum tph_mem_type {
+	TPH_MEM_TYPE_VM,	/* volatile memory */
+	TPH_MEM_TYPE_PM		/* persistent memory */
+};
+
 #ifdef CONFIG_PCIE_TPH
+int pcie_tph_set_st_entry(struct pci_dev *pdev,
+			  unsigned int index, u16 tag);
+int pcie_tph_get_cpu_st(struct pci_dev *dev,
+			enum tph_mem_type mem_type,
+			unsigned int cpu_uid, u16 *tag);
 void pcie_disable_tph(struct pci_dev *pdev);
 int pcie_enable_tph(struct pci_dev *pdev, int mode);
 #else
+static inline int pcie_tph_set_st_entry(struct pci_dev *pdev,
+					unsigned int index, u16 tag)
+{ return -EINVAL; }
+static inline int pcie_tph_get_cpu_st(struct pci_dev *dev,
+				      enum tph_mem_type mem_type,
+				      unsigned int cpu_uid, u16 *tag)
+{ return -EINVAL; }
 static inline void pcie_disable_tph(struct pci_dev *pdev) { }
 static inline int pcie_enable_tph(struct pci_dev *pdev, int mode)
 { return -EINVAL; }
-- 
2.45.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH V5 3/5] PCI/TPH: Add TPH documentation
  2024-09-16 20:50 [PATCH V5 0/5] PCIe TPH and cache direct injection support Wei Huang
  2024-09-16 20:50 ` [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
  2024-09-16 20:51 ` [PATCH V5 2/5] PCI/TPH: Add Steering Tag support Wei Huang
@ 2024-09-16 20:51 ` Wei Huang
  2024-09-16 20:51 ` [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
  2024-09-16 20:51 ` [PATCH V5 5/5] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings Wei Huang
  4 siblings, 0 replies; 22+ messages in thread
From: Wei Huang @ 2024-09-16 20:51 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

Provide a document for TPH feature, including the description of "notph"
kernel parameter and the API interface.

Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
 Documentation/PCI/index.rst          |   1 +
 Documentation/PCI/tph.rst            | 132 +++++++++++++++++++++++++++
 Documentation/driver-api/pci/pci.rst |   3 +
 3 files changed, 136 insertions(+)
 create mode 100644 Documentation/PCI/tph.rst

diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
index e73f84aebde3..5e7c4e6e726b 100644
--- a/Documentation/PCI/index.rst
+++ b/Documentation/PCI/index.rst
@@ -18,3 +18,4 @@ PCI Bus Subsystem
    pcieaer-howto
    endpoint/index
    boot-interrupts
+   tph
diff --git a/Documentation/PCI/tph.rst b/Documentation/PCI/tph.rst
new file mode 100644
index 000000000000..7c4b27789f87
--- /dev/null
+++ b/Documentation/PCI/tph.rst
@@ -0,0 +1,132 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+
+===========
+TPH Support
+===========
+
+:Copyright: 2024 Advanced Micro Devices, Inc.
+:Authors: - Eric van Tassell <eric.vantassell@amd.com>
+          - Wei Huang <wei.huang2@amd.com>
+
+
+Overview
+========
+
+TPH (TLP Processing Hints) is a PCIe feature that allows endpoint devices
+to provide optimization hints for requests that target memory space.
+These hints, in a format called Steering Tags (STs), are embedded in the
+requester's TLP headers, enabling the system hardware, such as the Root
+Complex, to better manage platform resources for these requests.
+
+For example, on platforms with TPH-based direct data cache injection
+support, an endpoint device can include appropriate STs in its DMA
+traffic to specify which cache the data should be written to. This allows
+the CPU core to have a higher probablity of getting data from cache,
+potentially improving performance and reducing latency in data
+processing.
+
+
+How to Use TPH
+==============
+
+TPH is presented as an optional extended capability in PCIe. The Linux
+kernel handles TPH discovery during boot, but it is up to the device
+driver to request TPH enablement if it is to be utilized. Once enabled,
+the driver uses the provided API to obtain the Steering Tag for the
+target memory and to program the ST into the device's ST table.
+
+Enable TPH support in Linux
+---------------------------
+
+To support TPH, the kernel must be built with the CONFIG_PCIE_TPH option
+enabled.
+
+Manage TPH
+----------
+
+To enable TPH for a device, use the following function::
+
+  int pcie_enable_tph(struct pci_dev *pdev, int mode);
+
+This function enables TPH support for device with a specific ST mode.
+Current supported modes include:
+
+  * PCI_TPH_ST_NS_MODE - NO ST Mode
+  * PCI_TPH_ST_IV_MODE - Interrupt Vector Mode
+  * PCI_TPH_ST_DS_MODE - Device Specific Mode
+
+`pcie_enable_tph()` checks whether the requested mode is actually
+supported by the device before enabling. The device driver can figure out
+which TPH mode is supported and can be properly enabled based on the
+return value of `pcie_enable_tph()`.
+
+To disable TPH, use the following function::
+
+  void pcie_disable_tph(struct pci_dev *pdev);
+
+Manage ST
+---------
+
+Steering Tags are platform specific. PCIe spec does not specify where STs
+are from. Instead PCI Firmware Specification defines an ACPI _DSM method
+(see the `Revised _DSM for Cache Locality TPH Features ECN
+<https://members.pcisig.com/wg/PCI-SIG/document/15470>`_) for retrieving
+STs for a target memory of various properties. This method is what is
+supported in this implementation.
+
+To retrieve a Steering Tag for a target memory associated with a specific
+CPU, use the following function::
+
+  int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type type,
+                          unsigned int cpu_uid, u16 *tag);
+
+The `type` argument is used to specify the memory type, either volatile
+or persistent, of the target memory. The `cpu_uid` argument specifies the
+CPU where the memory is associated to.
+
+After the ST value is retrieved, the device driver can use the following
+function to write the ST into the device::
+
+  int pcie_tph_set_st_entry(struct pci_dev *pdev, unsigned int index,
+                            u16 tag);
+
+The `index` argument is the ST table entry index the ST tag will be
+written into. `pcie_tph_set_st_entry()` will figure out the proper
+location of ST table, either in the MSI-X table or in the TPH Extended
+Capability space, and write the Steering Tag into the ST entry pointed by
+the `index` argument.
+
+It is completely up to the driver to decide how to use these TPH
+functions. For example a network device driver can use the TPH APIs above
+to update the Steering Tag when interrupt affinity of a RX/TX queue has
+been changed. Here is a sample code for IRQ affinity notifier:
+
+.. code-block:: c
+
+    static void irq_affinity_notified(struct irq_affinity_notify *notify,
+                                      const cpumask_t *mask)
+    {
+         struct drv_irq *irq;
+         unsigned int cpu_id;
+         u16 tag;
+
+         irq = container_of(notify, struct drv_irq, affinity_notify);
+         cpumask_copy(irq->cpu_mask, mask);
+
+         /* Pick a right CPU as the target - here is just an example */
+         cpu_id = cpumask_first(irq->cpu_mask);
+
+         if (pcie_tph_get_cpu_st(irq->pdev, TPH_MEM_TYPE_VM, cpu_id,
+                                 &tag))
+             return;
+
+         if (pcie_tph_set_st_entry(irq->pdev, irq->msix_nr, tag))
+             return;
+    }
+
+Disable TPH system-wide
+-----------------------
+
+There is a kernel command line option available to control TPH feature:
+    * "notph": TPH will be disabled for all endpoint devices.
diff --git a/Documentation/driver-api/pci/pci.rst b/Documentation/driver-api/pci/pci.rst
index aa40b1cc243b..3d896b2cf16e 100644
--- a/Documentation/driver-api/pci/pci.rst
+++ b/Documentation/driver-api/pci/pci.rst
@@ -46,6 +46,9 @@ PCI Support Library
 .. kernel-doc:: drivers/pci/pci-sysfs.c
    :internal:
 
+.. kernel-doc:: drivers/pci/pcie/tph.c
+   :export:
+
 PCI Hotplug Support Library
 ---------------------------
 
-- 
2.45.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver
  2024-09-16 20:50 [PATCH V5 0/5] PCIe TPH and cache direct injection support Wei Huang
                   ` (2 preceding siblings ...)
  2024-09-16 20:51 ` [PATCH V5 3/5] PCI/TPH: Add TPH documentation Wei Huang
@ 2024-09-16 20:51 ` Wei Huang
  2024-09-16 21:25   ` Wei Huang
                     ` (2 more replies)
  2024-09-16 20:51 ` [PATCH V5 5/5] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings Wei Huang
  4 siblings, 3 replies; 22+ messages in thread
From: Wei Huang @ 2024-09-16 20:51 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

From: Manoj Panicker <manoj.panicker2@amd.com>

Implement TPH support in Broadcom BNXT device driver. The driver uses TPH
functions to retrieve and configure the device's Steering Tags when its
interrupt affinity is being changed. With appropriate firmware, we see
sustancial memory bandwidth savings and other benefits using real network
benchmarks.

Co-developed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Manoj Panicker <manoj.panicker2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 85 +++++++++++++++++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  7 ++
 net/core/netdev_rx_queue.c                |  1 +
 3 files changed, 93 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 6e422e24750a..ea0bd25d1efb 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -55,6 +55,8 @@
 #include <net/page_pool/helpers.h>
 #include <linux/align.h>
 #include <net/netdev_queues.h>
+#include <net/netdev_rx_queue.h>
+#include <linux/pci-tph.h>
 
 #include "bnxt_hsi.h"
 #include "bnxt.h"
@@ -10865,6 +10867,63 @@ int bnxt_reserve_rings(struct bnxt *bp, bool irq_re_init)
 	return 0;
 }
 
+static void __bnxt_irq_affinity_notify(struct irq_affinity_notify *notify,
+				       const cpumask_t *mask)
+{
+	struct bnxt_rx_ring_info *rxr;
+	struct bnxt_irq *irq;
+	u16 tag;
+	int err;
+
+	irq = container_of(notify, struct bnxt_irq, affinity_notify);
+	cpumask_copy(irq->cpu_mask, mask);
+
+	if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
+				cpumask_first(irq->cpu_mask), &tag))
+		return;
+
+	if (pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag))
+		return;
+
+	if (netif_running(irq->bp->dev)) {
+		rxr = &irq->bp->rx_ring[irq->ring_nr];
+		rtnl_lock();
+		err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr);
+		if (err)
+			netdev_err(irq->bp->dev,
+				   "rx queue restart failed: err=%d\n", err);
+		rtnl_unlock();
+	}
+}
+
+static void __bnxt_irq_affinity_release(struct kref __always_unused *ref)
+{
+}
+
+static void bnxt_release_irq_notifier(struct bnxt_irq *irq)
+{
+	irq_set_affinity_notifier(irq->vector, NULL);
+}
+
+static void bnxt_register_irq_notifier(struct bnxt *bp, struct bnxt_irq *irq)
+{
+	struct irq_affinity_notify *notify;
+
+	/* Nothing to do if TPH is not enabled */
+	if (!bp->tph_mode)
+		return;
+
+	irq->bp = bp;
+
+	/* Register IRQ affinity notifier */
+	notify = &irq->affinity_notify;
+	notify->irq = irq->vector;
+	notify->notify = __bnxt_irq_affinity_notify;
+	notify->release = __bnxt_irq_affinity_release;
+
+	irq_set_affinity_notifier(irq->vector, notify);
+}
+
 static void bnxt_free_irq(struct bnxt *bp)
 {
 	struct bnxt_irq *irq;
@@ -10887,11 +10946,18 @@ static void bnxt_free_irq(struct bnxt *bp)
 				free_cpumask_var(irq->cpu_mask);
 				irq->have_cpumask = 0;
 			}
+
+			bnxt_release_irq_notifier(irq);
+
 			free_irq(irq->vector, bp->bnapi[i]);
 		}
 
 		irq->requested = 0;
 	}
+
+	/* Disable TPH support */
+	pcie_disable_tph(bp->pdev);
+	bp->tph_mode = 0;
 }
 
 static int bnxt_request_irq(struct bnxt *bp)
@@ -10911,6 +10977,12 @@ static int bnxt_request_irq(struct bnxt *bp)
 #ifdef CONFIG_RFS_ACCEL
 	rmap = bp->dev->rx_cpu_rmap;
 #endif
+
+	/* Enable TPH support as part of IRQ request */
+	rc = pcie_enable_tph(bp->pdev, PCI_TPH_ST_IV_MODE);
+	if (!rc)
+		bp->tph_mode = PCI_TPH_ST_IV_MODE;
+
 	for (i = 0, j = 0; i < bp->cp_nr_rings; i++) {
 		int map_idx = bnxt_cp_num_to_irq_num(bp, i);
 		struct bnxt_irq *irq = &bp->irq_tbl[map_idx];
@@ -10934,8 +11006,11 @@ static int bnxt_request_irq(struct bnxt *bp)
 
 		if (zalloc_cpumask_var(&irq->cpu_mask, GFP_KERNEL)) {
 			int numa_node = dev_to_node(&bp->pdev->dev);
+			u16 tag;
 
 			irq->have_cpumask = 1;
+			irq->msix_nr = map_idx;
+			irq->ring_nr = i;
 			cpumask_set_cpu(cpumask_local_spread(i, numa_node),
 					irq->cpu_mask);
 			rc = irq_set_affinity_hint(irq->vector, irq->cpu_mask);
@@ -10945,6 +11020,16 @@ static int bnxt_request_irq(struct bnxt *bp)
 					    irq->vector);
 				break;
 			}
+
+			bnxt_register_irq_notifier(bp, irq);
+
+			/* Init ST table entry */
+			if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
+						cpumask_first(irq->cpu_mask),
+						&tag))
+				continue;
+
+			pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag);
 		}
 	}
 	return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 69231e85140b..641d25646367 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1227,6 +1227,11 @@ struct bnxt_irq {
 	u8		have_cpumask:1;
 	char		name[IFNAMSIZ + BNXT_IRQ_NAME_EXTRA];
 	cpumask_var_t	cpu_mask;
+
+	struct bnxt	*bp;
+	int		msix_nr;
+	int		ring_nr;
+	struct irq_affinity_notify affinity_notify;
 };
 
 #define HWRM_RING_ALLOC_TX	0x1
@@ -2183,6 +2188,8 @@ struct bnxt {
 	struct net_device	*dev;
 	struct pci_dev		*pdev;
 
+	u8			tph_mode;
+
 	atomic_t		intr_sem;
 
 	u32			flags;
diff --git a/net/core/netdev_rx_queue.c b/net/core/netdev_rx_queue.c
index e217a5838c87..10e95d7b6892 100644
--- a/net/core/netdev_rx_queue.c
+++ b/net/core/netdev_rx_queue.c
@@ -79,3 +79,4 @@ int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq_idx)
 
 	return err;
 }
+EXPORT_SYMBOL_GPL(netdev_rx_queue_restart);
-- 
2.45.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH V5 5/5] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings
  2024-09-16 20:50 [PATCH V5 0/5] PCIe TPH and cache direct injection support Wei Huang
                   ` (3 preceding siblings ...)
  2024-09-16 20:51 ` [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
@ 2024-09-16 20:51 ` Wei Huang
  4 siblings, 0 replies; 22+ messages in thread
From: Wei Huang @ 2024-09-16 20:51 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

From: Michael Chan <michael.chan@broadcom.com>

Newer firmware can use the NQ ring ID associated with each RX/RX AGG
ring to enable PCIe steering tag.  Older firmware will just ignore the
information.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index ea0bd25d1efb..48ca3095ef2e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6811,10 +6811,12 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
 
 			/* Association of rx ring with stats context */
 			grp_info = &bp->grp_info[ring->grp_idx];
+			req->nq_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
 			req->rx_buf_size = cpu_to_le16(bp->rx_buf_use_size);
 			req->stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
 			req->enables |= cpu_to_le32(
-				RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
+				RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID |
+				RING_ALLOC_REQ_ENABLES_NQ_RING_ID_VALID);
 			if (NET_IP_ALIGN == 2)
 				flags = RING_ALLOC_REQ_FLAGS_RX_SOP_PAD;
 			req->flags = cpu_to_le16(flags);
@@ -6826,11 +6828,13 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
 			/* Association of agg ring with rx ring */
 			grp_info = &bp->grp_info[ring->grp_idx];
 			req->rx_ring_id = cpu_to_le16(grp_info->rx_fw_ring_id);
+			req->nq_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
 			req->rx_buf_size = cpu_to_le16(BNXT_RX_PAGE_SIZE);
 			req->stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
 			req->enables |= cpu_to_le32(
 				RING_ALLOC_REQ_ENABLES_RX_RING_ID_VALID |
-				RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
+				RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID |
+				RING_ALLOC_REQ_ENABLES_NQ_RING_ID_VALID);
 		} else {
 			req->ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
 		}
-- 
2.45.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver
  2024-09-16 20:51 ` [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
@ 2024-09-16 21:25   ` Wei Huang
  2024-09-23  7:25     ` Lukas Wunner
  2024-09-17  7:35   ` Simon Horman
  2024-09-20 10:38   ` kernel test robot
  2 siblings, 1 reply; 22+ messages in thread
From: Wei Huang @ 2024-09-16 21:25 UTC (permalink / raw)
  To: linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas,
	lukas, paul.e.luse, jing2.liu

Hi Bjorn,

This patch can not be compiled directly on pci.git tree because it uses 
netdev_rx_queue_restart() per Broadcom's suggestion. This function was 
just merged to netdev last week.

How could we resolve this double depedency issue? Can you take the first 
three TPH patches after review and I will send the rest via netdev?

Thanks,
-Wei

On 9/16/24 3:51 PM, Wei Huang wrote:
> From: Manoj Panicker <manoj.panicker2@amd.com>
> 
> Implement TPH support in Broadcom BNXT device driver. The driver uses TPH
> functions to retrieve and configure the device's Steering Tags when its
> interrupt affinity is being changed. With appropriate firmware, we see
> sustancial memory bandwidth savings and other benefits using real network
> benchmarks.
> 
> Co-developed-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Co-developed-by: Wei Huang <wei.huang2@amd.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Signed-off-by: Manoj Panicker <manoj.panicker2@amd.com>
> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
> ---
>   drivers/net/ethernet/broadcom/bnxt/bnxt.c | 85 +++++++++++++++++++++++
>   drivers/net/ethernet/broadcom/bnxt/bnxt.h |  7 ++
>   net/core/netdev_rx_queue.c                |  1 +
>   3 files changed, 93 insertions(+)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 6e422e24750a..ea0bd25d1efb 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -55,6 +55,8 @@
>   #include <net/page_pool/helpers.h>
>   #include <linux/align.h>
>   #include <net/netdev_queues.h>
> +#include <net/netdev_rx_queue.h>
> +#include <linux/pci-tph.h>
>   
>   #include "bnxt_hsi.h"
>   #include "bnxt.h"
> @@ -10865,6 +10867,63 @@ int bnxt_reserve_rings(struct bnxt *bp, bool irq_re_init)
>   	return 0;
>   }
>   
> +static void __bnxt_irq_affinity_notify(struct irq_affinity_notify *notify,
> +				       const cpumask_t *mask)
> +{
> +	struct bnxt_rx_ring_info *rxr;
> +	struct bnxt_irq *irq;
> +	u16 tag;
> +	int err;
> +
> +	irq = container_of(notify, struct bnxt_irq, affinity_notify);
> +	cpumask_copy(irq->cpu_mask, mask);
> +
> +	if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
> +				cpumask_first(irq->cpu_mask), &tag))
> +		return;
> +
> +	if (pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag))
> +		return;
> +
> +	if (netif_running(irq->bp->dev)) {
> +		rxr = &irq->bp->rx_ring[irq->ring_nr];
> +		rtnl_lock();
> +		err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr);
> +		if (err)
> +			netdev_err(irq->bp->dev,
> +				   "rx queue restart failed: err=%d\n", err);
> +		rtnl_unlock();
> +	}
> +}
> +
> +static void __bnxt_irq_affinity_release(struct kref __always_unused *ref)
> +{
> +}
> +
> +static void bnxt_release_irq_notifier(struct bnxt_irq *irq)
> +{
> +	irq_set_affinity_notifier(irq->vector, NULL);
> +}
> +
> +static void bnxt_register_irq_notifier(struct bnxt *bp, struct bnxt_irq *irq)
> +{
> +	struct irq_affinity_notify *notify;
> +
> +	/* Nothing to do if TPH is not enabled */
> +	if (!bp->tph_mode)
> +		return;
> +
> +	irq->bp = bp;
> +
> +	/* Register IRQ affinity notifier */
> +	notify = &irq->affinity_notify;
> +	notify->irq = irq->vector;
> +	notify->notify = __bnxt_irq_affinity_notify;
> +	notify->release = __bnxt_irq_affinity_release;
> +
> +	irq_set_affinity_notifier(irq->vector, notify);
> +}
> +
>   static void bnxt_free_irq(struct bnxt *bp)
>   {
>   	struct bnxt_irq *irq;
> @@ -10887,11 +10946,18 @@ static void bnxt_free_irq(struct bnxt *bp)
>   				free_cpumask_var(irq->cpu_mask);
>   				irq->have_cpumask = 0;
>   			}
> +
> +			bnxt_release_irq_notifier(irq);
> +
>   			free_irq(irq->vector, bp->bnapi[i]);
>   		}
>   
>   		irq->requested = 0;
>   	}
> +
> +	/* Disable TPH support */
> +	pcie_disable_tph(bp->pdev);
> +	bp->tph_mode = 0;
>   }
>   
>   static int bnxt_request_irq(struct bnxt *bp)
> @@ -10911,6 +10977,12 @@ static int bnxt_request_irq(struct bnxt *bp)
>   #ifdef CONFIG_RFS_ACCEL
>   	rmap = bp->dev->rx_cpu_rmap;
>   #endif
> +
> +	/* Enable TPH support as part of IRQ request */
> +	rc = pcie_enable_tph(bp->pdev, PCI_TPH_ST_IV_MODE);
> +	if (!rc)
> +		bp->tph_mode = PCI_TPH_ST_IV_MODE;
> +
>   	for (i = 0, j = 0; i < bp->cp_nr_rings; i++) {
>   		int map_idx = bnxt_cp_num_to_irq_num(bp, i);
>   		struct bnxt_irq *irq = &bp->irq_tbl[map_idx];
> @@ -10934,8 +11006,11 @@ static int bnxt_request_irq(struct bnxt *bp)
>   
>   		if (zalloc_cpumask_var(&irq->cpu_mask, GFP_KERNEL)) {
>   			int numa_node = dev_to_node(&bp->pdev->dev);
> +			u16 tag;
>   
>   			irq->have_cpumask = 1;
> +			irq->msix_nr = map_idx;
> +			irq->ring_nr = i;
>   			cpumask_set_cpu(cpumask_local_spread(i, numa_node),
>   					irq->cpu_mask);
>   			rc = irq_set_affinity_hint(irq->vector, irq->cpu_mask);
> @@ -10945,6 +11020,16 @@ static int bnxt_request_irq(struct bnxt *bp)
>   					    irq->vector);
>   				break;
>   			}
> +
> +			bnxt_register_irq_notifier(bp, irq);
> +
> +			/* Init ST table entry */
> +			if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
> +						cpumask_first(irq->cpu_mask),
> +						&tag))
> +				continue;
> +
> +			pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag);
>   		}
>   	}
>   	return rc;
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> index 69231e85140b..641d25646367 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> @@ -1227,6 +1227,11 @@ struct bnxt_irq {
>   	u8		have_cpumask:1;
>   	char		name[IFNAMSIZ + BNXT_IRQ_NAME_EXTRA];
>   	cpumask_var_t	cpu_mask;
> +
> +	struct bnxt	*bp;
> +	int		msix_nr;
> +	int		ring_nr;
> +	struct irq_affinity_notify affinity_notify;
>   };
>   
>   #define HWRM_RING_ALLOC_TX	0x1
> @@ -2183,6 +2188,8 @@ struct bnxt {
>   	struct net_device	*dev;
>   	struct pci_dev		*pdev;
>   
> +	u8			tph_mode;
> +
>   	atomic_t		intr_sem;
>   
>   	u32			flags;
> diff --git a/net/core/netdev_rx_queue.c b/net/core/netdev_rx_queue.c
> index e217a5838c87..10e95d7b6892 100644
> --- a/net/core/netdev_rx_queue.c
> +++ b/net/core/netdev_rx_queue.c
> @@ -79,3 +79,4 @@ int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq_idx)
>   
>   	return err;
>   }
> +EXPORT_SYMBOL_GPL(netdev_rx_queue_restart);

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 2/5] PCI/TPH: Add Steering Tag support
  2024-09-16 20:51 ` [PATCH V5 2/5] PCI/TPH: Add Steering Tag support Wei Huang
@ 2024-09-17  7:32   ` Simon Horman
  2024-09-17 14:31     ` Wei Huang
  2024-09-20 10:38   ` kernel test robot
  1 sibling, 1 reply; 22+ messages in thread
From: Simon Horman @ 2024-09-17  7:32 UTC (permalink / raw)
  To: Wei Huang
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
	gospo, michael.chan, ajit.khaparde, somnath.kotur,
	andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
	vadim.fedorenko, bagasdotme, bhelgaas, lukas, paul.e.luse,
	jing2.liu

On Mon, Sep 16, 2024 at 03:51:00PM -0500, Wei Huang wrote:
> pcie_tph_get_cpu_st() is added to allow a caller to retrieve Steering Tags
> for a target memory that is associated with a specific CPU. The ST tag is
> retrieved by invoking ACPI _DSM of the device's Root Port device.
> 
> pcie_tph_set_st_entry() is added to support updating the device's Steering
> Tags. The tags will be written into the device's MSI-X table or the ST
> table located in the TPH Extended Capability space.
> 
> Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
> Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>

...

> @@ -45,6 +201,163 @@ static u8 get_rp_completer_type(struct pci_dev *pdev)
>  	return FIELD_GET(PCI_EXP_DEVCAP2_TPH_COMP_MASK, reg);
>  }
>  
> +/* Write ST to MSI-X vector control reg - Return 0 if OK, otherwise -errno */
> +static int write_tag_to_msix(struct pci_dev *pdev, int msix_idx, u16 tag)
> +{
> +	struct msi_desc *msi_desc = NULL;
> +	void __iomem *vec_ctrl;
> +	u32 val, mask;
> +	int err = 0;
> +
> +	msi_lock_descs(&pdev->dev);
> +
> +	/* Find the msi_desc entry with matching msix_idx */
> +	msi_for_each_desc(msi_desc, &pdev->dev, MSI_DESC_ASSOCIATED) {
> +		if (msi_desc->msi_index == msix_idx)
> +			break;
> +	}
> +
> +	if (!msi_desc) {
> +		err = -ENXIO;
> +		goto err_out;
> +	}
> +
> +	/* Get the vector control register (offset 0xc) pointed by msix_idx */
> +	vec_ctrl = pdev->msix_base + msix_idx * PCI_MSIX_ENTRY_SIZE;
> +	vec_ctrl += PCI_MSIX_ENTRY_VECTOR_CTRL;
> +
> +	val = readl(vec_ctrl);
> +	mask = PCI_MSIX_ENTRY_CTRL_ST_LOWER | PCI_MSIX_ENTRY_CTRL_ST_UPPER;
> +	val &= ~mask;
> +	val |= FIELD_PREP(mask, (u32)tag);

Hi Wei Huang,

Unfortunately clang-18 (x86_64, allmodconfig, W=1, when applied to net-next)
complains about this.  I think it is because it expects FIELD_PREP to be
used with a mask that is a built-in constant.

drivers/pci/pcie/tph.c:232:9: warning: result of comparison of constant 18446744073709551615 with expression of type 'typeof (_Generic((mask), char: (unsigned char)0, unsigned char: (unsigned char)0, signed char: (unsigned char)0, unsigned short: (unsigned short)0, short: (unsigned short)0, unsigned int: (unsigned int)0, int: (unsigned int)0, unsigned long: (unsigned long)0, long: (unsigned long)0, unsigned long long: (unsigned long long)0, long long: (unsigned long long)0, default: (mask)))' (aka 'unsigned int') is always false [-Wtautological-constant-out-of-range-compare]
  232 |         val |= FIELD_PREP(mask, (u32)tag);
      |                ^~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/bitfield.h:115:3: note: expanded from macro 'FIELD_PREP'
  115 |                 __BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: ");    \
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/bitfield.h:72:53: note: expanded from macro '__BF_FIELD_CHECK'
   72 |                 BUILD_BUG_ON_MSG(__bf_cast_unsigned(_mask, _mask) >     \
      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
   73 |                                  __bf_cast_unsigned(_reg, ~0ull),       \
      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   74 |                                  _pfx "type of reg too small for mask"); \
      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:39:58: note: expanded from macro 'BUILD_BUG_ON_MSG'
   39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
      |                                     ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~
././include/linux/compiler_types.h:510:22: note: expanded from macro 'compiletime_assert'
  510 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      |         ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
././include/linux/compiler_types.h:498:23: note: expanded from macro '_compiletime_assert'
  498 |         __compiletime_assert(condition, msg, prefix, suffix)
      |         ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
././include/linux/compiler_types.h:490:9: note: expanded from macro '__compiletime_assert'
  490 |                 if (!(condition))                                       \
      |                       ^~~~~~~~~
1 warning generated.

> +	writel(val, vec_ctrl);
> +
> +	/* Read back to flush the update */
> +	val = readl(vec_ctrl);
> +
> +err_out:
> +	msi_unlock_descs(&pdev->dev);
> +	return err;
> +}

...

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver
  2024-09-16 20:51 ` [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
  2024-09-16 21:25   ` Wei Huang
@ 2024-09-17  7:35   ` Simon Horman
  2024-09-20 10:38   ` kernel test robot
  2 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2024-09-17  7:35 UTC (permalink / raw)
  To: Wei Huang
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
	gospo, michael.chan, ajit.khaparde, somnath.kotur,
	andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
	vadim.fedorenko, bagasdotme, bhelgaas, lukas, paul.e.luse,
	jing2.liu

On Mon, Sep 16, 2024 at 03:51:02PM -0500, Wei Huang wrote:
> From: Manoj Panicker <manoj.panicker2@amd.com>
> 
> Implement TPH support in Broadcom BNXT device driver. The driver uses TPH
> functions to retrieve and configure the device's Steering Tags when its
> interrupt affinity is being changed. With appropriate firmware, we see
> sustancial memory bandwidth savings and other benefits using real network
> benchmarks.
> 
> Co-developed-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Co-developed-by: Wei Huang <wei.huang2@amd.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Signed-off-by: Manoj Panicker <manoj.panicker2@amd.com>
> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
> ---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c | 85 +++++++++++++++++++++++
>  drivers/net/ethernet/broadcom/bnxt/bnxt.h |  7 ++
>  net/core/netdev_rx_queue.c                |  1 +
>  3 files changed, 93 insertions(+)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c

...

> @@ -10865,6 +10867,63 @@ int bnxt_reserve_rings(struct bnxt *bp, bool irq_re_init)
>  	return 0;
>  }
>  
> +static void __bnxt_irq_affinity_notify(struct irq_affinity_notify *notify,
> +				       const cpumask_t *mask)
> +{
> +	struct bnxt_rx_ring_info *rxr;

Hi Wei Huang,

A minor nit from my side:

rxr is set but otherwise unused in this function.

Flagged by x86_64 W=1 builds with gcc-14 and clang-18.

> +	struct bnxt_irq *irq;
> +	u16 tag;
> +	int err;
> +
> +	irq = container_of(notify, struct bnxt_irq, affinity_notify);
> +	cpumask_copy(irq->cpu_mask, mask);
> +
> +	if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
> +				cpumask_first(irq->cpu_mask), &tag))
> +		return;
> +
> +	if (pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag))
> +		return;
> +
> +	if (netif_running(irq->bp->dev)) {
> +		rxr = &irq->bp->rx_ring[irq->ring_nr];
> +		rtnl_lock();
> +		err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr);
> +		if (err)
> +			netdev_err(irq->bp->dev,
> +				   "rx queue restart failed: err=%d\n", err);
> +		rtnl_unlock();
> +	}
> +}

...

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support
  2024-09-16 20:50 ` [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
@ 2024-09-17  7:38   ` Simon Horman
  2024-09-23  7:43   ` Lukas Wunner
  2024-09-23 12:07   ` Alejandro Lucero Palau
  2 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2024-09-17  7:38 UTC (permalink / raw)
  To: Wei Huang
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
	gospo, michael.chan, ajit.khaparde, somnath.kotur,
	andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
	vadim.fedorenko, bagasdotme, bhelgaas, lukas, paul.e.luse,
	jing2.liu

On Mon, Sep 16, 2024 at 03:50:59PM -0500, Wei Huang wrote:
> Add support for PCIe TLP Processing Hints (TPH) support (see PCIe r6.2,
> sec 6.17).
> 
> Add missing TPH register definitions in pci_regs.h, including the TPH
> Requester capability register, TPH Requester control register, TPH
> Completer capability, and the ST fields of MSI-X entry.
> 
> Introduce pcie_enable_tph() and pcie_disable_tph(), enabling drivers to
> toggle TPH support and configure specific ST mode as needed. Also add a
> new kernel parameter, "pci=notph", allowing users to disable TPH support
> across the entire system.
> 
> Co-developed-by: Jing Liu <jing2.liu@intel.com>
> Signed-off-by: Jing Liu <jing2.liu@intel.com>
> Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com>
> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com>
> Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
> Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Lukas Wunner <lukas@wunner.de>

...

> diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c

...

> +/**
> + * pcie_enable_tph - Enable TPH support for device using a specific ST mode
> + * @pdev: PCI device
> + * @mode: ST mode to enable. Current supported modes include:
> + *
> + *   - PCI_TPH_ST_NS_MODE: NO ST Mode
> + *   - PCI_TPH_ST_IV_MODE: Interrupt Vector Mode
> + *   - PCI_TPH_ST_DS_MODE: Device Specific Mode
> + *
> + * Checks whether the mode is actually supported by the device before enabling
> + * and returns an error if not. Additionally determines what types of requests,
> + * TPH or extended TPH, can be issued by the device based on its TPH requester
> + * capability and the Root Port's completer capability.
> + *
> + * Return: 0 on success, otherwise negative value (-errno)
> + */
> +int pcie_enable_tph(struct pci_dev *pdev, int mode)
> +{
> +	u32 reg;
> +	u8 dev_modes;
> +	u8 rp_req_type;
> +
> +	/* Honor "notph" kernel parameter */
> +	if (pci_tph_disabled)
> +		return -EINVAL;
> +
> +	if (!pdev->tph_cap)
> +		return -EINVAL;
> +
> +	if (pdev->tph_enabled)
> +		return -EBUSY;
> +
> +	/* Sanitize and check ST mode comptability */

Hi Wei Huang, all,

Another minor nit from my side (the last one, I think):

comptability -> compatibility

Flagged by checkpatch.pl --codespell

> +	mode &= PCI_TPH_CTRL_MODE_SEL_MASK;
> +	dev_modes = get_st_modes(pdev);
> +	if (!((1 << mode) & dev_modes))
> +		return -EINVAL;
> +
> +	pdev->tph_mode = mode;
> +
> +	/* Get req_type supported by device and its Root Port */
> +	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
> +	if (FIELD_GET(PCI_TPH_CAP_EXT_TPH, reg))
> +		pdev->tph_req_type = PCI_TPH_REQ_EXT_TPH;
> +	else
> +		pdev->tph_req_type = PCI_TPH_REQ_TPH_ONLY;
> +
> +	rp_req_type = get_rp_completer_type(pdev);
> +
> +	/* Final req_type is the smallest value of two */
> +	pdev->tph_req_type = min(pdev->tph_req_type, rp_req_type);
> +
> +	if (pdev->tph_req_type == PCI_TPH_REQ_DISABLE)
> +		return -EINVAL;
> +
> +	/* Write them into TPH control register */
> +	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, &reg);
> +
> +	reg &= ~PCI_TPH_CTRL_MODE_SEL_MASK;
> +	reg |= FIELD_PREP(PCI_TPH_CTRL_MODE_SEL_MASK, pdev->tph_mode);
> +
> +	reg &= ~PCI_TPH_CTRL_REQ_EN_MASK;
> +	reg |= FIELD_PREP(PCI_TPH_CTRL_REQ_EN_MASK, pdev->tph_req_type);
> +
> +	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg);
> +
> +	pdev->tph_enabled = 1;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(pcie_enable_tph);

...

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 2/5] PCI/TPH: Add Steering Tag support
  2024-09-17  7:32   ` Simon Horman
@ 2024-09-17 14:31     ` Wei Huang
  2024-09-17 16:14       ` Simon Horman
  0 siblings, 1 reply; 22+ messages in thread
From: Wei Huang @ 2024-09-17 14:31 UTC (permalink / raw)
  To: Simon Horman
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
	gospo, michael.chan, ajit.khaparde, somnath.kotur,
	andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
	vadim.fedorenko, bagasdotme, bhelgaas, lukas, paul.e.luse,
	jing2.liu



On 9/17/24 02:32, Simon Horman wrote:
> On Mon, Sep 16, 2024 at 03:51:00PM -0500, Wei Huang wrote:
...
>> +	val = readl(vec_ctrl);
>> +	mask = PCI_MSIX_ENTRY_CTRL_ST_LOWER | PCI_MSIX_ENTRY_CTRL_ST_UPPER;
>> +	val &= ~mask;
>> +	val |= FIELD_PREP(mask, (u32)tag);
> 
> Hi Wei Huang,
> 
> Unfortunately clang-18 (x86_64, allmodconfig, W=1, when applied to net-next)
> complains about this.  I think it is because it expects FIELD_PREP to be
> used with a mask that is a built-in constant.

I thought I fixed it, but apparently not enough for clang-18. I will
address this problem, along with other comments from you and Bjorn (if any).

BTW there is another code in drivers/gpu/drm/ using a similar approach.

> 
> drivers/pci/pcie/tph.c:232:9: warning: result of comparison of constant 18446744073709551615 with expression of type 'typeof (_Generic((mask), char: (unsigned char)0, unsigned char: (unsigned char)0, signed char: (unsigned char)0, unsigned short: (unsigned short)0, short: (unsigned short)0, unsigned int: (unsigned int)0, int: (unsigned int)0, unsigned long: (unsigned long)0, long: (unsigned long)0, unsigned long long: (unsigned long long)0, long long: (unsigned long long)0, default: (mask)))' (aka 'unsigned int') is always false [-Wtautological-constant-out-of-range-compare]
>   232 |         val |= FIELD_PREP(mask, (u32)tag);
>       |                ^~~~~~~~~~~~~~~~~~~~~~~~~~
> ./include/linux/bitfield.h:115:3: note: expanded from macro 'FIELD_PREP'
>   115 |                 __BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: ");    \
>       |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ./include/linux/bitfield.h:72:53: note: expanded from macro '__BF_FIELD_CHECK'
>    72 |                 BUILD_BUG_ON_MSG(__bf_cast_unsigned(_mask, _mask) >     \
>       |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
>    73 |                                  __bf_cast_unsigned(_reg, ~0ull),       \
>       |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>    74 |                                  _pfx "type of reg too small for mask"); \
>       |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ./include/linux/build_bug.h:39:58: note: expanded from macro 'BUILD_BUG_ON_MSG'
>    39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
>       |                                     ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~
> ././include/linux/compiler_types.h:510:22: note: expanded from macro 'compiletime_assert'
>   510 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
>       |         ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ././include/linux/compiler_types.h:498:23: note: expanded from macro '_compiletime_assert'
>   498 |         __compiletime_assert(condition, msg, prefix, suffix)
>       |         ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ././include/linux/compiler_types.h:490:9: note: expanded from macro '__compiletime_assert'
>   490 |                 if (!(condition))                                       \
>       |                       ^~~~~~~~~
> 1 warning generated.
> 
>> +	writel(val, vec_ctrl);
>> +
>> +	/* Read back to flush the update */
>> +	val = readl(vec_ctrl);
>> +
>> +err_out:
>> +	msi_unlock_descs(&pdev->dev);
>> +	return err;
>> +}
> 
> ...

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 2/5] PCI/TPH: Add Steering Tag support
  2024-09-17 14:31     ` Wei Huang
@ 2024-09-17 16:14       ` Simon Horman
  2024-09-17 16:24         ` Simon Horman
  0 siblings, 1 reply; 22+ messages in thread
From: Simon Horman @ 2024-09-17 16:14 UTC (permalink / raw)
  To: Wei Huang
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
	gospo, michael.chan, ajit.khaparde, somnath.kotur,
	andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
	vadim.fedorenko, bagasdotme, bhelgaas, lukas, paul.e.luse,
	jing2.liu

On Tue, Sep 17, 2024 at 09:31:00AM -0500, Wei Huang wrote:
> 
> 
> On 9/17/24 02:32, Simon Horman wrote:
> > On Mon, Sep 16, 2024 at 03:51:00PM -0500, Wei Huang wrote:
> ...
> >> +	val = readl(vec_ctrl);
> >> +	mask = PCI_MSIX_ENTRY_CTRL_ST_LOWER | PCI_MSIX_ENTRY_CTRL_ST_UPPER;
> >> +	val &= ~mask;
> >> +	val |= FIELD_PREP(mask, (u32)tag);
> > 
> > Hi Wei Huang,
> > 
> > Unfortunately clang-18 (x86_64, allmodconfig, W=1, when applied to net-next)
> > complains about this.  I think it is because it expects FIELD_PREP to be
> > used with a mask that is a built-in constant.
> 
> I thought I fixed it, but apparently not enough for clang-18. I will
> address this problem, along with other comments from you and Bjorn (if any).
> 
> BTW there is another code in drivers/gpu/drm/ using a similar approach.

Thanks,

I will run some checks over drivers/gpu/drm/ and let you know if they find
anything.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 2/5] PCI/TPH: Add Steering Tag support
  2024-09-17 16:14       ` Simon Horman
@ 2024-09-17 16:24         ` Simon Horman
  0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2024-09-17 16:24 UTC (permalink / raw)
  To: Wei Huang
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
	gospo, michael.chan, ajit.khaparde, somnath.kotur,
	andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
	vadim.fedorenko, bagasdotme, bhelgaas, lukas, paul.e.luse,
	jing2.liu

On Tue, Sep 17, 2024 at 05:14:10PM +0100, Simon Horman wrote:
> On Tue, Sep 17, 2024 at 09:31:00AM -0500, Wei Huang wrote:
> > 
> > 
> > On 9/17/24 02:32, Simon Horman wrote:
> > > On Mon, Sep 16, 2024 at 03:51:00PM -0500, Wei Huang wrote:
> > ...
> > >> +	val = readl(vec_ctrl);
> > >> +	mask = PCI_MSIX_ENTRY_CTRL_ST_LOWER | PCI_MSIX_ENTRY_CTRL_ST_UPPER;
> > >> +	val &= ~mask;
> > >> +	val |= FIELD_PREP(mask, (u32)tag);
> > > 
> > > Hi Wei Huang,
> > > 
> > > Unfortunately clang-18 (x86_64, allmodconfig, W=1, when applied to net-next)
> > > complains about this.  I think it is because it expects FIELD_PREP to be
> > > used with a mask that is a built-in constant.
> > 
> > I thought I fixed it, but apparently not enough for clang-18. I will
> > address this problem, along with other comments from you and Bjorn (if any).
> > 
> > BTW there is another code in drivers/gpu/drm/ using a similar approach.
> 
> Thanks,
> 
> I will run some checks over drivers/gpu/drm/ and let you know if they find
> anything.

FWIIW, I did try compiling all the .c files under drivers/gpu/drm/ with
clang-18, and it did not flag this problem.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver
  2024-09-16 20:51 ` [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
  2024-09-16 21:25   ` Wei Huang
  2024-09-17  7:35   ` Simon Horman
@ 2024-09-20 10:38   ` kernel test robot
  2 siblings, 0 replies; 22+ messages in thread
From: kernel test robot @ 2024-09-20 10:38 UTC (permalink / raw)
  To: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev
  Cc: oe-kbuild-all, Jonathan.Cameron, helgaas, corbet, davem, edumazet,
	kuba, pabeni, alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

Hi Wei,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on next-20240920]
[cannot apply to pci/next pci/for-linus v6.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Wei-Huang/PCI-Add-TLP-Processing-Hints-TPH-support/20240917-045345
base:   linus/master
patch link:    https://lore.kernel.org/r/20240916205103.3882081-5-wei.huang2%40amd.com
patch subject: [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver
config: i386-allmodconfig (https://download.01.org/0day-ci/archive/20240920/202409201831.ToruGbMs-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240920/202409201831.ToruGbMs-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409201831.ToruGbMs-lkp@intel.com/

All warnings (new ones prefixed by >>):

   drivers/net/ethernet/broadcom/bnxt/bnxt.c: In function '__bnxt_irq_affinity_notify':
>> drivers/net/ethernet/broadcom/bnxt/bnxt.c:10873:35: warning: variable 'rxr' set but not used [-Wunused-but-set-variable]
   10873 |         struct bnxt_rx_ring_info *rxr;
         |                                   ^~~


vim +/rxr +10873 drivers/net/ethernet/broadcom/bnxt/bnxt.c

 10869	
 10870	static void __bnxt_irq_affinity_notify(struct irq_affinity_notify *notify,
 10871					       const cpumask_t *mask)
 10872	{
 10873		struct bnxt_rx_ring_info *rxr;
 10874		struct bnxt_irq *irq;
 10875		u16 tag;
 10876		int err;
 10877	
 10878		irq = container_of(notify, struct bnxt_irq, affinity_notify);
 10879		cpumask_copy(irq->cpu_mask, mask);
 10880	
 10881		if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
 10882					cpumask_first(irq->cpu_mask), &tag))
 10883			return;
 10884	
 10885		if (pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag))
 10886			return;
 10887	
 10888		if (netif_running(irq->bp->dev)) {
 10889			rxr = &irq->bp->rx_ring[irq->ring_nr];
 10890			rtnl_lock();
 10891			err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr);
 10892			if (err)
 10893				netdev_err(irq->bp->dev,
 10894					   "rx queue restart failed: err=%d\n", err);
 10895			rtnl_unlock();
 10896		}
 10897	}
 10898	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 2/5] PCI/TPH: Add Steering Tag support
  2024-09-16 20:51 ` [PATCH V5 2/5] PCI/TPH: Add Steering Tag support Wei Huang
  2024-09-17  7:32   ` Simon Horman
@ 2024-09-20 10:38   ` kernel test robot
  1 sibling, 0 replies; 22+ messages in thread
From: kernel test robot @ 2024-09-20 10:38 UTC (permalink / raw)
  To: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev
  Cc: llvm, oe-kbuild-all, Jonathan.Cameron, helgaas, corbet, davem,
	edumazet, kuba, pabeni, alex.williamson, gospo, michael.chan,
	ajit.khaparde, somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
	bhelgaas, lukas, paul.e.luse, jing2.liu

Hi Wei,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.11 next-20240920]
[cannot apply to pci/next pci/for-linus]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Wei-Huang/PCI-Add-TLP-Processing-Hints-TPH-support/20240917-045345
base:   linus/master
patch link:    https://lore.kernel.org/r/20240916205103.3882081-3-wei.huang2%40amd.com
patch subject: [PATCH V5 2/5] PCI/TPH: Add Steering Tag support
config: x86_64-allyesconfig (https://download.01.org/0day-ci/archive/20240920/202409201840.tDMBEi4q-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240920/202409201840.tDMBEi4q-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409201840.tDMBEi4q-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/pci/pcie/tph.c:232:9: warning: result of comparison of constant 18446744073709551615 with expression of type 'typeof (_Generic((mask), char: (unsigned char)0, unsigned char: (unsigned char)0, signed char: (unsigned char)0, unsigned short: (unsigned short)0, short: (unsigned short)0, unsigned int: (unsigned int)0, int: (unsigned int)0, unsigned long: (unsigned long)0, long: (unsigned long)0, unsigned long long: (unsigned long long)0, long long: (unsigned long long)0, default: (mask)))' (aka 'unsigned int') is always false [-Wtautological-constant-out-of-range-compare]
     232 |         val |= FIELD_PREP(mask, (u32)tag);
         |                ^~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/bitfield.h:115:3: note: expanded from macro 'FIELD_PREP'
     115 |                 __BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: ");    \
         |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/bitfield.h:72:53: note: expanded from macro '__BF_FIELD_CHECK'
      72 |                 BUILD_BUG_ON_MSG(__bf_cast_unsigned(_mask, _mask) >     \
         |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
      73 |                                  __bf_cast_unsigned(_reg, ~0ull),       \
         |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      74 |                                  _pfx "type of reg too small for mask"); \
         |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:58: note: expanded from macro 'BUILD_BUG_ON_MSG'
      39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
         |                                     ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~
   include/linux/compiler_types.h:510:22: note: expanded from macro 'compiletime_assert'
     510 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
         |         ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler_types.h:498:23: note: expanded from macro '_compiletime_assert'
     498 |         __compiletime_assert(condition, msg, prefix, suffix)
         |         ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler_types.h:490:9: note: expanded from macro '__compiletime_assert'
     490 |                 if (!(condition))                                       \
         |                       ^~~~~~~~~
   1 warning generated.


vim +232 drivers/pci/pcie/tph.c

   203	
   204	/* Write ST to MSI-X vector control reg - Return 0 if OK, otherwise -errno */
   205	static int write_tag_to_msix(struct pci_dev *pdev, int msix_idx, u16 tag)
   206	{
   207		struct msi_desc *msi_desc = NULL;
   208		void __iomem *vec_ctrl;
   209		u32 val, mask;
   210		int err = 0;
   211	
   212		msi_lock_descs(&pdev->dev);
   213	
   214		/* Find the msi_desc entry with matching msix_idx */
   215		msi_for_each_desc(msi_desc, &pdev->dev, MSI_DESC_ASSOCIATED) {
   216			if (msi_desc->msi_index == msix_idx)
   217				break;
   218		}
   219	
   220		if (!msi_desc) {
   221			err = -ENXIO;
   222			goto err_out;
   223		}
   224	
   225		/* Get the vector control register (offset 0xc) pointed by msix_idx */
   226		vec_ctrl = pdev->msix_base + msix_idx * PCI_MSIX_ENTRY_SIZE;
   227		vec_ctrl += PCI_MSIX_ENTRY_VECTOR_CTRL;
   228	
   229		val = readl(vec_ctrl);
   230		mask = PCI_MSIX_ENTRY_CTRL_ST_LOWER | PCI_MSIX_ENTRY_CTRL_ST_UPPER;
   231		val &= ~mask;
 > 232		val |= FIELD_PREP(mask, (u32)tag);
   233		writel(val, vec_ctrl);
   234	
   235		/* Read back to flush the update */
   236		val = readl(vec_ctrl);
   237	
   238	err_out:
   239		msi_unlock_descs(&pdev->dev);
   240		return err;
   241	}
   242	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver
  2024-09-16 21:25   ` Wei Huang
@ 2024-09-23  7:25     ` Lukas Wunner
  2024-09-23 20:16       ` Wei Huang
  0 siblings, 1 reply; 22+ messages in thread
From: Lukas Wunner @ 2024-09-23  7:25 UTC (permalink / raw)
  To: Wei Huang
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
	gospo, michael.chan, ajit.khaparde, somnath.kotur,
	andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
	vadim.fedorenko, horms, bagasdotme, bhelgaas, paul.e.luse,
	jing2.liu

On Mon, Sep 16, 2024 at 04:25:03PM -0500, Wei Huang wrote:
> This patch can not be compiled directly on pci.git tree because it uses
> netdev_rx_queue_restart() per Broadcom's suggestion. This function was just
> merged to netdev last week.
> 
> How could we resolve this double depedency issue? Can you take the first
> three TPH patches after review and I will send the rest via netdev?

Just rebase this series on v6.12-rc1 when it is tagged this Sunday
and resubmit.  pci.git is usually based on the latest rc1 release.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support
  2024-09-16 20:50 ` [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
  2024-09-17  7:38   ` Simon Horman
@ 2024-09-23  7:43   ` Lukas Wunner
  2024-09-24 16:34     ` Wei Huang
  2024-09-23 12:07   ` Alejandro Lucero Palau
  2 siblings, 1 reply; 22+ messages in thread
From: Lukas Wunner @ 2024-09-23  7:43 UTC (permalink / raw)
  To: Wei Huang
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
	gospo, michael.chan, ajit.khaparde, somnath.kotur,
	andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
	vadim.fedorenko, horms, bagasdotme, bhelgaas, paul.e.luse,
	jing2.liu

On Mon, Sep 16, 2024 at 03:50:59PM -0500, Wei Huang wrote:
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1813,6 +1813,7 @@ int pci_save_state(struct pci_dev *dev)
>  	pci_save_dpc_state(dev);
>  	pci_save_aer_state(dev);
>  	pci_save_ptm_state(dev);
> +	pci_save_tph_state(dev);
>  	return pci_save_vc_state(dev);
>  }
>  EXPORT_SYMBOL(pci_save_state);
> @@ -1917,6 +1918,7 @@ void pci_restore_state(struct pci_dev *dev)
>  	pci_restore_vc_state(dev);
>  	pci_restore_rebar_state(dev);
>  	pci_restore_dpc_state(dev);
> +	pci_restore_tph_state(dev);
>  	pci_restore_ptm_state(dev);
>  
>  	pci_aer_clear_status(dev);

I'm wondering if there's a reason to use a different order on save versus
restore?  E.g. does PTM need to be restored last?


> --- a/drivers/pci/pcie/Kconfig
> +++ b/drivers/pci/pcie/Kconfig
> @@ -155,3 +155,14 @@ config PCIE_EDR
>  	  the PCI Firmware Specification r3.2.  Enable this if you want to
>  	  support hybrid DPC model which uses both firmware and OS to
>  	  implement DPC.
> +
> +config PCIE_TPH
> +	bool "TLP Processing Hints"
> +	depends on ACPI

TPH isn't really an ACPI-specific feature, it could exist on
devicetree-based platforms as well.  I think there could be valid
use cases for enabling TPH support on such platforms:

E.g. the platform firmware or bootloader might set up the TPH Extended
Capability in a specific way and the kernel would have to save/restore
it on system sleep.

So I'd recommend removing this dependency.

Note that there's a static inline for acpi_check_dsm() which returns
false if CONFIG_ACPI=n, so tph_invoke_dsm() returns AE_ERROR and
pcie_tph_get_cpu_st() returns -EINVAL.  It thus looks like you may not
even need an #ifdef.


> diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
> new file mode 100644

The PCIe features added most recently (such as DOE) have been placed
directly in drivers/pci/ instead of the pcie/ subdirectory.
The pcie/ subdirectory mostly deals with port drivers.
So perhaps tph.c should likewise be placed in drivers/pci/ ?


> --- /dev/null
> +++ b/drivers/pci/pcie/tph.c
> @@ -0,0 +1,199 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * TPH (TLP Processing Hints) support
> + *
> + * Copyright (C) 2024 Advanced Micro Devices, Inc.
> + *     Eric Van Tassell <Eric.VanTassell@amd.com>
> + *     Wei Huang <wei.huang2@amd.com>
> + */
> +#include <linux/pci.h>
> +#include <linux/pci-acpi.h>

This patch doesn't seem to use any of the symbols defined in pci-acpi.h,
or did I miss anything?  I'd move the inclusion of pci-acpi.h to the patch
that actually uses its symbols.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support
  2024-09-16 20:50 ` [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
  2024-09-17  7:38   ` Simon Horman
  2024-09-23  7:43   ` Lukas Wunner
@ 2024-09-23 12:07   ` Alejandro Lucero Palau
  2024-09-23 20:27     ` Wei Huang
  2 siblings, 1 reply; 22+ messages in thread
From: Alejandro Lucero Palau @ 2024-09-23 12:07 UTC (permalink / raw)
  To: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas,
	lukas, paul.e.luse, jing2.liu


On 9/16/24 21:50, Wei Huang wrote:
> Add support for PCIe TLP Processing Hints (TPH) support (see PCIe r6.2,
> sec 6.17).
>
> Add missing TPH register definitions in pci_regs.h, including the TPH
> Requester capability register, TPH Requester control register, TPH
> Completer capability, and the ST fields of MSI-X entry.
>
> Introduce pcie_enable_tph() and pcie_disable_tph(), enabling drivers to
> toggle TPH support and configure specific ST mode as needed. Also add a
> new kernel parameter, "pci=notph", allowing users to disable TPH support
> across the entire system.
>
> Co-developed-by: Jing Liu <jing2.liu@intel.com>
> Signed-off-by: Jing Liu <jing2.liu@intel.com>
> Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com>
> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com>
> Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
> Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Lukas Wunner <lukas@wunner.de>
> ---
>   .../admin-guide/kernel-parameters.txt         |   4 +
>   drivers/pci/pci.c                             |   4 +
>   drivers/pci/pci.h                             |  12 ++
>   drivers/pci/pcie/Kconfig                      |  11 +
>   drivers/pci/pcie/Makefile                     |   1 +
>   drivers/pci/pcie/tph.c                        | 199 ++++++++++++++++++
>   drivers/pci/probe.c                           |   1 +
>   include/linux/pci-tph.h                       |  21 ++
>   include/linux/pci.h                           |   7 +
>   include/uapi/linux/pci_regs.h                 |  38 +++-
>   10 files changed, 290 insertions(+), 8 deletions(-)
>   create mode 100644 drivers/pci/pcie/tph.c
>   create mode 100644 include/linux/pci-tph.h
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 09126bb8cc9f..8579d0fbcd33 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4617,6 +4617,10 @@
>   		nomio		[S390] Do not use MIO instructions.
>   		norid		[S390] ignore the RID field and force use of
>   				one PCI domain per PCI function
> +		notph		[PCIE] If the PCIE_TPH kernel config parameter
> +				is enabled, this kernel boot option can be used
> +				to disable PCIe TLP Processing Hints support
> +				system-wide.
>   
>   	pcie_aspm=	[PCIE] Forcibly enable or ignore PCIe Active State Power
>   			Management.
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index ffaaca0978cb..b6f60f7476cc 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1813,6 +1813,7 @@ int pci_save_state(struct pci_dev *dev)
>   	pci_save_dpc_state(dev);
>   	pci_save_aer_state(dev);
>   	pci_save_ptm_state(dev);
> +	pci_save_tph_state(dev);
>   	return pci_save_vc_state(dev);
>   }
>   EXPORT_SYMBOL(pci_save_state);
> @@ -1917,6 +1918,7 @@ void pci_restore_state(struct pci_dev *dev)
>   	pci_restore_vc_state(dev);
>   	pci_restore_rebar_state(dev);
>   	pci_restore_dpc_state(dev);
> +	pci_restore_tph_state(dev);
>   	pci_restore_ptm_state(dev);
>   
>   	pci_aer_clear_status(dev);
> @@ -6869,6 +6871,8 @@ static int __init pci_setup(char *str)
>   				pci_no_domains();
>   			} else if (!strncmp(str, "noari", 5)) {
>   				pcie_ari_disabled = true;
> +			} else if (!strncmp(str, "notph", 5)) {
> +				pci_no_tph();
>   			} else if (!strncmp(str, "cbiosize=", 9)) {
>   				pci_cardbus_io_size = memparse(str + 9, &str);
>   			} else if (!strncmp(str, "cbmemsize=", 10)) {
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 79c8398f3938..8eeabbbfa137 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -571,6 +571,18 @@ static inline int pci_iov_bus_range(struct pci_bus *bus)
>   
>   #endif /* CONFIG_PCI_IOV */
>   
> +#ifdef CONFIG_PCIE_TPH
> +void pci_restore_tph_state(struct pci_dev *dev);
> +void pci_save_tph_state(struct pci_dev *dev);
> +void pci_no_tph(void);
> +void pci_tph_init(struct pci_dev *dev);
> +#else
> +static inline void pci_restore_tph_state(struct pci_dev *dev) { }
> +static inline void pci_save_tph_state(struct pci_dev *dev) { }
> +static inline void pci_no_tph(void) { }
> +static inline void pci_tph_init(struct pci_dev *dev) { }
> +#endif
> +
>   #ifdef CONFIG_PCIE_PTM
>   void pci_ptm_init(struct pci_dev *dev);
>   void pci_save_ptm_state(struct pci_dev *dev);
> diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
> index 17919b99fa66..61e4bd16eaf1 100644
> --- a/drivers/pci/pcie/Kconfig
> +++ b/drivers/pci/pcie/Kconfig
> @@ -155,3 +155,14 @@ config PCIE_EDR
>   	  the PCI Firmware Specification r3.2.  Enable this if you want to
>   	  support hybrid DPC model which uses both firmware and OS to
>   	  implement DPC.
> +
> +config PCIE_TPH
> +	bool "TLP Processing Hints"
> +	depends on ACPI
> +	default n
> +	help
> +	  This option adds support for PCIe TLP Processing Hints (TPH).
> +	  TPH allows endpoint devices to provide optimization hints, such as
> +	  desired caching behavior, for requests that target memory space.
> +	  These hints, called Steering Tags, can empower the system hardware
> +	  to optimize the utilization of platform resources.
> diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
> index 6461aa93fe76..3542b42ea0b9 100644
> --- a/drivers/pci/pcie/Makefile
> +++ b/drivers/pci/pcie/Makefile
> @@ -13,3 +13,4 @@ obj-$(CONFIG_PCIE_PME)		+= pme.o
>   obj-$(CONFIG_PCIE_DPC)		+= dpc.o
>   obj-$(CONFIG_PCIE_PTM)		+= ptm.o
>   obj-$(CONFIG_PCIE_EDR)		+= edr.o
> +obj-$(CONFIG_PCIE_TPH)		+= tph.o
> diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
> new file mode 100644
> index 000000000000..1efd76c8dd30
> --- /dev/null
> +++ b/drivers/pci/pcie/tph.c
> @@ -0,0 +1,199 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * TPH (TLP Processing Hints) support
> + *
> + * Copyright (C) 2024 Advanced Micro Devices, Inc.
> + *     Eric Van Tassell <Eric.VanTassell@amd.com>
> + *     Wei Huang <wei.huang2@amd.com>
> + */
> +#include <linux/pci.h>
> +#include <linux/pci-acpi.h>
> +#include <linux/bitfield.h>
> +#include <linux/msi.h>
> +#include <linux/pci-tph.h>
> +
> +#include "../pci.h"
> +
> +/* System-wide TPH disabled */
> +static bool pci_tph_disabled;
> +
> +static u8 get_st_modes(struct pci_dev *pdev)
> +{
> +	u32 reg;
> +
> +	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
> +	reg &= PCI_TPH_CAP_ST_NS | PCI_TPH_CAP_ST_IV | PCI_TPH_CAP_ST_DS;
> +
> +	return reg;
> +}
> +
> +/* Return device's Root Port completer capability */
> +static u8 get_rp_completer_type(struct pci_dev *pdev)
> +{
> +	struct pci_dev *rp;
> +	u32 reg;
> +	int ret;
> +
> +	rp = pcie_find_root_port(pdev);
> +	if (!rp)
> +		return 0;
> +
> +	ret = pcie_capability_read_dword(rp, PCI_EXP_DEVCAP2, &reg);
> +	if (ret)
> +		return 0;
> +
> +	return FIELD_GET(PCI_EXP_DEVCAP2_TPH_COMP_MASK, reg);
> +}
> +
> +/**
> + * pcie_disable_tph - Turn off TPH support for device
> + * @pdev: PCI device
> + *
> + * Return: none
> + */
> +void pcie_disable_tph(struct pci_dev *pdev)
> +{
> +	if (!pdev->tph_cap)
> +		return;
> +
> +	if (!pdev->tph_enabled)
> +		return;
> +
> +	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, 0);
> +
> +	pdev->tph_mode = 0;
> +	pdev->tph_req_type = 0;
> +	pdev->tph_enabled = 0;
> +}
> +EXPORT_SYMBOL(pcie_disable_tph);
> +
> +/**
> + * pcie_enable_tph - Enable TPH support for device using a specific ST mode
> + * @pdev: PCI device
> + * @mode: ST mode to enable. Current supported modes include:
> + *
> + *   - PCI_TPH_ST_NS_MODE: NO ST Mode
> + *   - PCI_TPH_ST_IV_MODE: Interrupt Vector Mode
> + *   - PCI_TPH_ST_DS_MODE: Device Specific Mode
> + *
> + * Checks whether the mode is actually supported by the device before enabling
> + * and returns an error if not. Additionally determines what types of requests,
> + * TPH or extended TPH, can be issued by the device based on its TPH requester
> + * capability and the Root Port's completer capability.
> + *
> + * Return: 0 on success, otherwise negative value (-errno)
> + */
> +int pcie_enable_tph(struct pci_dev *pdev, int mode)
> +{
> +	u32 reg;
> +	u8 dev_modes;
> +	u8 rp_req_type;
> +
> +	/* Honor "notph" kernel parameter */
> +	if (pci_tph_disabled)
> +		return -EINVAL;
> +
> +	if (!pdev->tph_cap)
> +		return -EINVAL;
> +
> +	if (pdev->tph_enabled)
> +		return -EBUSY;
> +
> +	/* Sanitize and check ST mode comptability */
> +	mode &= PCI_TPH_CTRL_MODE_SEL_MASK;
> +	dev_modes = get_st_modes(pdev);
> +	if (!((1 << mode) & dev_modes))


This is wrong. The mode definition is about the bit on and not about bit 
position. You got this right in v4 ...


> +		return -EINVAL;
> +
> +	pdev->tph_mode = mode;
> +
> +	/* Get req_type supported by device and its Root Port */
> +	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
> +	if (FIELD_GET(PCI_TPH_CAP_EXT_TPH, reg))
> +		pdev->tph_req_type = PCI_TPH_REQ_EXT_TPH;
> +	else
> +		pdev->tph_req_type = PCI_TPH_REQ_TPH_ONLY;
> +
> +	rp_req_type = get_rp_completer_type(pdev);
> +
> +	/* Final req_type is the smallest value of two */
> +	pdev->tph_req_type = min(pdev->tph_req_type, rp_req_type);
> +
> +	if (pdev->tph_req_type == PCI_TPH_REQ_DISABLE)
> +		return -EINVAL;
> +
> +	/* Write them into TPH control register */
> +	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, &reg);
> +
> +	reg &= ~PCI_TPH_CTRL_MODE_SEL_MASK;
> +	reg |= FIELD_PREP(PCI_TPH_CTRL_MODE_SEL_MASK, pdev->tph_mode);
> +
> +	reg &= ~PCI_TPH_CTRL_REQ_EN_MASK;
> +	reg |= FIELD_PREP(PCI_TPH_CTRL_REQ_EN_MASK, pdev->tph_req_type);
> +
> +	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg);
> +
> +	pdev->tph_enabled = 1;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(pcie_enable_tph);
> +
> +void pci_restore_tph_state(struct pci_dev *pdev)
> +{
> +	struct pci_cap_saved_state *save_state;
> +	u32 *cap;
> +
> +	if (!pdev->tph_cap)
> +		return;
> +
> +	if (!pdev->tph_enabled)
> +		return;
> +
> +	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_TPH);
> +	if (!save_state)
> +		return;
> +
> +	/* Restore control register and all ST entries */
> +	cap = &save_state->cap.data[0];
> +	pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, *cap++);
> +}
> +
> +void pci_save_tph_state(struct pci_dev *pdev)
> +{
> +	struct pci_cap_saved_state *save_state;
> +	u32 *cap;
> +
> +	if (!pdev->tph_cap)
> +		return;
> +
> +	if (!pdev->tph_enabled)
> +		return;
> +
> +	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_TPH);
> +	if (!save_state)
> +		return;
> +
> +	/* Save control register */
> +	cap = &save_state->cap.data[0];
> +	pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, cap++);
> +}
> +
> +void pci_no_tph(void)
> +{
> +	pci_tph_disabled = true;
> +
> +	pr_info("PCIe TPH is disabled\n");
> +}
> +
> +void pci_tph_init(struct pci_dev *pdev)
> +{
> +	u32 save_size;
> +
> +	pdev->tph_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_TPH);
> +	if (!pdev->tph_cap)
> +		return;
> +
> +	save_size = sizeof(u32);
> +	pci_add_ext_cap_save_buffer(pdev, PCI_EXT_CAP_ID_TPH, save_size);
> +}
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index b14b9876c030..c74adcdee52b 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2498,6 +2498,7 @@ static void pci_init_capabilities(struct pci_dev *dev)
>   	pci_dpc_init(dev);		/* Downstream Port Containment */
>   	pci_rcec_init(dev);		/* Root Complex Event Collector */
>   	pci_doe_init(dev);		/* Data Object Exchange */
> +	pci_tph_init(dev);		/* TLP Processing Hints */
>   
>   	pcie_report_downtraining(dev);
>   	pci_init_reset_methods(dev);
> diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
> new file mode 100644
> index 000000000000..58654a334ffb
> --- /dev/null
> +++ b/include/linux/pci-tph.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * TPH (TLP Processing Hints)
> + *
> + * Copyright (C) 2024 Advanced Micro Devices, Inc.
> + *     Eric Van Tassell <Eric.VanTassell@amd.com>
> + *     Wei Huang <wei.huang2@amd.com>
> + */
> +#ifndef LINUX_PCI_TPH_H
> +#define LINUX_PCI_TPH_H
> +
> +#ifdef CONFIG_PCIE_TPH
> +void pcie_disable_tph(struct pci_dev *pdev);
> +int pcie_enable_tph(struct pci_dev *pdev, int mode);
> +#else
> +static inline void pcie_disable_tph(struct pci_dev *pdev) { }
> +static inline int pcie_enable_tph(struct pci_dev *pdev, int mode)
> +{ return -EINVAL; }
> +#endif
> +
> +#endif /* LINUX_PCI_TPH_H */
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 4cf89a4b4cbc..6f05deb6a0bf 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -433,6 +433,7 @@ struct pci_dev {
>   	unsigned int	ats_enabled:1;		/* Address Translation Svc */
>   	unsigned int	pasid_enabled:1;	/* Process Address Space ID */
>   	unsigned int	pri_enabled:1;		/* Page Request Interface */
> +	unsigned int	tph_enabled:1;		/* TLP Processing Hints */
>   	unsigned int	is_managed:1;		/* Managed via devres */
>   	unsigned int	is_msi_managed:1;	/* MSI release via devres installed */
>   	unsigned int	needs_freset:1;		/* Requires fundamental reset */
> @@ -530,6 +531,12 @@ struct pci_dev {
>   
>   	/* These methods index pci_reset_fn_methods[] */
>   	u8 reset_methods[PCI_NUM_RESET_METHODS]; /* In priority order */
> +
> +#ifdef CONFIG_PCIE_TPH
> +	u16		tph_cap;	/* TPH capability offset */
> +	u8		tph_mode;	/* TPH mode */
> +	u8		tph_req_type;	/* TPH requester type */
> +#endif
>   };
>   
>   static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index 94c00996e633..25af1976953c 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -340,7 +340,9 @@
>   #define PCI_MSIX_ENTRY_UPPER_ADDR	0x4  /* Message Upper Address */
>   #define PCI_MSIX_ENTRY_DATA		0x8  /* Message Data */
>   #define PCI_MSIX_ENTRY_VECTOR_CTRL	0xc  /* Vector Control */
> -#define  PCI_MSIX_ENTRY_CTRL_MASKBIT	0x00000001
> +#define  PCI_MSIX_ENTRY_CTRL_MASKBIT	0x00000001  /* Mask Bit */
> +#define  PCI_MSIX_ENTRY_CTRL_ST_LOWER	0x00ff0000  /* ST Lower */
> +#define  PCI_MSIX_ENTRY_CTRL_ST_UPPER	0xff000000  /* ST Upper */
>   
>   /* CompactPCI Hotswap Register */
>   
> @@ -657,6 +659,7 @@
>   #define  PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100 /* 64b AtomicOp completion */
>   #define  PCI_EXP_DEVCAP2_ATOMIC_COMP128	0x00000200 /* 128b AtomicOp completion */
>   #define  PCI_EXP_DEVCAP2_LTR		0x00000800 /* Latency tolerance reporting */
> +#define  PCI_EXP_DEVCAP2_TPH_COMP_MASK	0x00003000 /* TPH completer support */
>   #define  PCI_EXP_DEVCAP2_OBFF_MASK	0x000c0000 /* OBFF support mechanism */
>   #define  PCI_EXP_DEVCAP2_OBFF_MSG	0x00040000 /* New message signaling */
>   #define  PCI_EXP_DEVCAP2_OBFF_WAKE	0x00080000 /* Re-use WAKE# for OBFF */
> @@ -1020,15 +1023,34 @@
>   #define  PCI_DPA_CAP_SUBSTATE_MASK	0x1F	/* # substates - 1 */
>   #define PCI_DPA_BASE_SIZEOF	16	/* size with 0 substates */
>   
> +/* TPH Completer Support */
> +#define PCI_EXP_DEVCAP2_TPH_COMP_NONE		0x0 /* None */
> +#define PCI_EXP_DEVCAP2_TPH_COMP_TPH_ONLY	0x1 /* TPH only */
> +#define PCI_EXP_DEVCAP2_TPH_COMP_EXT_TPH	0x3 /* TPH and Extended TPH */
> +
>   /* TPH Requester */
>   #define PCI_TPH_CAP		4	/* capability register */
> -#define  PCI_TPH_CAP_LOC_MASK	0x600	/* location mask */
> -#define   PCI_TPH_LOC_NONE	0x000	/* no location */
> -#define   PCI_TPH_LOC_CAP	0x200	/* in capability */
> -#define   PCI_TPH_LOC_MSIX	0x400	/* in MSI-X */
> -#define PCI_TPH_CAP_ST_MASK	0x07FF0000	/* ST table mask */
> -#define PCI_TPH_CAP_ST_SHIFT	16	/* ST table shift */
> -#define PCI_TPH_BASE_SIZEOF	0xc	/* size with no ST table */
> +#define  PCI_TPH_CAP_ST_NS	0x00000001 /* No ST Mode Supported */
> +#define  PCI_TPH_CAP_ST_IV	0x00000002 /* Interrupt Vector Mode Supported */
> +#define  PCI_TPH_CAP_ST_DS	0x00000004 /* Device Specific Mode Supported */
> +#define  PCI_TPH_CAP_EXT_TPH	0x00000100 /* Ext TPH Requester Supported */
> +#define  PCI_TPH_CAP_LOC_MASK	0x00000600 /* ST Table Location */
> +#define   PCI_TPH_LOC_NONE	0x00000000 /* Not present */
> +#define   PCI_TPH_LOC_CAP	0x00000200 /* In capability */
> +#define   PCI_TPH_LOC_MSIX	0x00000400 /* In MSI-X */
> +#define  PCI_TPH_CAP_ST_MASK	0x07FF0000 /* ST Table Size */
> +#define  PCI_TPH_CAP_ST_SHIFT	16	/* ST Table Size shift */
> +#define PCI_TPH_BASE_SIZEOF	0xc	/* Size with no ST table */
> +
> +#define PCI_TPH_CTRL		8	/* control register */
> +#define  PCI_TPH_CTRL_MODE_SEL_MASK	0x00000007 /* ST Mode Select */
> +#define   PCI_TPH_ST_NS_MODE		0x0 /* No ST Mode */
> +#define   PCI_TPH_ST_IV_MODE		0x1 /* Interrupt Vector Mode */
> +#define   PCI_TPH_ST_DS_MODE		0x2 /* Device Specific Mode */
> +#define  PCI_TPH_CTRL_REQ_EN_MASK	0x00000300 /* TPH Requester Enable */
> +#define   PCI_TPH_REQ_DISABLE		0x0 /* No TPH requests allowed */
> +#define   PCI_TPH_REQ_TPH_ONLY		0x1 /* TPH only requests allowed */
> +#define   PCI_TPH_REQ_EXT_TPH		0x3 /* Extended TPH requests allowed */
>   
>   /* Downstream Port Containment */
>   #define PCI_EXP_DPC_CAP			0x04	/* DPC Capability */

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver
  2024-09-23  7:25     ` Lukas Wunner
@ 2024-09-23 20:16       ` Wei Huang
  0 siblings, 0 replies; 22+ messages in thread
From: Wei Huang @ 2024-09-23 20:16 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
	gospo, michael.chan, ajit.khaparde, somnath.kotur,
	andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
	vadim.fedorenko, horms, bagasdotme, bhelgaas, paul.e.luse,
	jing2.liu



On 9/23/24 2:25 AM, Lukas Wunner wrote:
> On Mon, Sep 16, 2024 at 04:25:03PM -0500, Wei Huang wrote:
>> This patch can not be compiled directly on pci.git tree because it uses
>> netdev_rx_queue_restart() per Broadcom's suggestion. This function was just
>> merged to netdev last week.
>>
>> How could we resolve this double depedency issue? Can you take the first
>> three TPH patches after review and I will send the rest via netdev?
> 
> Just rebase this series on v6.12-rc1 when it is tagged this Sunday
> and resubmit.  pci.git is usually based on the latest rc1 release.

Thanks for the advice - will do it as suggested.

> 
> Thanks,
> 
> Lukas


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support
  2024-09-23 12:07   ` Alejandro Lucero Palau
@ 2024-09-23 20:27     ` Wei Huang
  2024-09-24 14:33       ` Alejandro Lucero Palau
  0 siblings, 1 reply; 22+ messages in thread
From: Wei Huang @ 2024-09-23 20:27 UTC (permalink / raw)
  To: Alejandro Lucero Palau, linux-pci, linux-kernel, linux-doc,
	netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas,
	lukas, paul.e.luse, jing2.liu



On 9/23/24 7:07 AM, Alejandro Lucero Palau wrote:
> 

...

>> +/**
>> + * pcie_enable_tph - Enable TPH support for device using a specific ST mode
>> + * @pdev: PCI device
>> + * @mode: ST mode to enable. Current supported modes include:
>> + *
>> + *   - PCI_TPH_ST_NS_MODE: NO ST Mode
>> + *   - PCI_TPH_ST_IV_MODE: Interrupt Vector Mode
>> + *   - PCI_TPH_ST_DS_MODE: Device Specific Mode
>> + *
>> + * Checks whether the mode is actually supported by the device before enabling
>> + * and returns an error if not. Additionally determines what types of requests,
>> + * TPH or extended TPH, can be issued by the device based on its TPH requester
>> + * capability and the Root Port's completer capability.
>> + *
>> + * Return: 0 on success, otherwise negative value (-errno)
>> + */
>> +int pcie_enable_tph(struct pci_dev *pdev, int mode)
>> +{
>> +	u32 reg;
>> +	u8 dev_modes;
>> +	u8 rp_req_type;
>> +
>> +	/* Honor "notph" kernel parameter */
>> +	if (pci_tph_disabled)
>> +		return -EINVAL;
>> +
>> +	if (!pdev->tph_cap)
>> +		return -EINVAL;
>> +
>> +	if (pdev->tph_enabled)
>> +		return -EBUSY;
>> +
>> +	/* Sanitize and check ST mode comptability */
>> +	mode &= PCI_TPH_CTRL_MODE_SEL_MASK;
>> +	dev_modes = get_st_modes(pdev);
>> +	if (!((1 << mode) & dev_modes))
> 
> 
> This is wrong. The mode definition is about the bit on and not about bit
> position. You got this right in v4 ...

This code is correct. In V5, I changed the "mode" parameter to the 
following values, as defined in TPH Ctrl register. These values are 
defined as bit positions:

PCI_TPH_ST_NS_MODE: NO ST Mode
PCI_TPH_ST_IV_MODE: Interrupt Vector Mode
PCI_TPH_ST_DS_MODE: Device Specific Mode

In V4, "mode" is defined as masks of TPH Cap register. I felt that V5 
looks more straightforward:

V4: pcie_enable_tph(dev, PCI_TPH_CAP_ST_IV)
vs.
V5: pcie_enable_tph(dev, PCI_TPH_ST_IV_MODE)

> 
> 
>> +		return -EINVAL;


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support
  2024-09-23 20:27     ` Wei Huang
@ 2024-09-24 14:33       ` Alejandro Lucero Palau
  0 siblings, 0 replies; 22+ messages in thread
From: Alejandro Lucero Palau @ 2024-09-24 14:33 UTC (permalink / raw)
  To: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev
  Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
	alex.williamson, gospo, michael.chan, ajit.khaparde,
	somnath.kotur, andrew.gospodarek, manoj.panicker2,
	Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas,
	lukas, paul.e.luse, jing2.liu


On 9/23/24 21:27, Wei Huang wrote:
>
>
> On 9/23/24 7:07 AM, Alejandro Lucero Palau wrote:
>>
>
> ...
>
>>> +/**
>>> + * pcie_enable_tph - Enable TPH support for device using a specific 
>>> ST mode
>>> + * @pdev: PCI device
>>> + * @mode: ST mode to enable. Current supported modes include:
>>> + *
>>> + *   - PCI_TPH_ST_NS_MODE: NO ST Mode
>>> + *   - PCI_TPH_ST_IV_MODE: Interrupt Vector Mode
>>> + *   - PCI_TPH_ST_DS_MODE: Device Specific Mode
>>> + *
>>> + * Checks whether the mode is actually supported by the device 
>>> before enabling
>>> + * and returns an error if not. Additionally determines what types 
>>> of requests,
>>> + * TPH or extended TPH, can be issued by the device based on its 
>>> TPH requester
>>> + * capability and the Root Port's completer capability.
>>> + *
>>> + * Return: 0 on success, otherwise negative value (-errno)
>>> + */
>>> +int pcie_enable_tph(struct pci_dev *pdev, int mode)
>>> +{
>>> +    u32 reg;
>>> +    u8 dev_modes;
>>> +    u8 rp_req_type;
>>> +
>>> +    /* Honor "notph" kernel parameter */
>>> +    if (pci_tph_disabled)
>>> +        return -EINVAL;
>>> +
>>> +    if (!pdev->tph_cap)
>>> +        return -EINVAL;
>>> +
>>> +    if (pdev->tph_enabled)
>>> +        return -EBUSY;
>>> +
>>> +    /* Sanitize and check ST mode comptability */
>>> +    mode &= PCI_TPH_CTRL_MODE_SEL_MASK;
>>> +    dev_modes = get_st_modes(pdev);
>>> +    if (!((1 << mode) & dev_modes))
>>
>>
>> This is wrong. The mode definition is about the bit on and not about bit
>> position. You got this right in v4 ...
>
> This code is correct. In V5, I changed the "mode" parameter to the 
> following values, as defined in TPH Ctrl register. These values are 
> defined as bit positions:
>
> PCI_TPH_ST_NS_MODE: NO ST Mode
> PCI_TPH_ST_IV_MODE: Interrupt Vector Mode
> PCI_TPH_ST_DS_MODE: Device Specific Mode
>

OK. I found the issue. I was using PCI_TPH_CAP_ST_DS for the mode 
instead of PCI_TPH_ST_DS_MODE.

That change confused me.

Apologies.


> In V4, "mode" is defined as masks of TPH Cap register. I felt that V5 
> looks more straightforward:
>
> V4: pcie_enable_tph(dev, PCI_TPH_CAP_ST_IV)
> vs.
> V5: pcie_enable_tph(dev, PCI_TPH_ST_IV_MODE)
>
>>
>>
>>> +        return -EINVAL;
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support
  2024-09-23  7:43   ` Lukas Wunner
@ 2024-09-24 16:34     ` Wei Huang
  0 siblings, 0 replies; 22+ messages in thread
From: Wei Huang @ 2024-09-24 16:34 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
	helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
	gospo, michael.chan, ajit.khaparde, somnath.kotur,
	andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
	vadim.fedorenko, horms, bagasdotme, bhelgaas, paul.e.luse,
	jing2.liu



On 9/23/24 02:43, Lukas Wunner wrote:
> On Mon, Sep 16, 2024 at 03:50:59PM -0500, Wei Huang wrote:
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1813,6 +1813,7 @@ int pci_save_state(struct pci_dev *dev)
>>  	pci_save_dpc_state(dev);
>>  	pci_save_aer_state(dev);
>>  	pci_save_ptm_state(dev);
>> +	pci_save_tph_state(dev);
>>  	return pci_save_vc_state(dev);
>>  }
>>  EXPORT_SYMBOL(pci_save_state);
>> @@ -1917,6 +1918,7 @@ void pci_restore_state(struct pci_dev *dev)
>>  	pci_restore_vc_state(dev);
>>  	pci_restore_rebar_state(dev);
>>  	pci_restore_dpc_state(dev);
>> +	pci_restore_tph_state(dev);
>>  	pci_restore_ptm_state(dev);
>>  
>>  	pci_aer_clear_status(dev);
> 
> I'm wondering if there's a reason to use a different order on save versus
> restore?  E.g. does PTM need to be restored last?
> 
> 

Will fix

>> --- a/drivers/pci/pcie/Kconfig
>> +++ b/drivers/pci/pcie/Kconfig
>> @@ -155,3 +155,14 @@ config PCIE_EDR
>>  	  the PCI Firmware Specification r3.2.  Enable this if you want to
>>  	  support hybrid DPC model which uses both firmware and OS to
>>  	  implement DPC.
>> +
>> +config PCIE_TPH
>> +	bool "TLP Processing Hints"
>> +	depends on ACPI
> 
> TPH isn't really an ACPI-specific feature, it could exist on
> devicetree-based platforms as well.  I think there could be valid
> use cases for enabling TPH support on such platforms:
> 
> E.g. the platform firmware or bootloader might set up the TPH Extended
> Capability in a specific way and the kernel would have to save/restore
> it on system sleep.
> 
> So I'd recommend removing this dependency.

This is reasonable - I can remove this dependency.

> 
> Note that there's a static inline for acpi_check_dsm() which returns
> false if CONFIG_ACPI=n, so tph_invoke_dsm() returns AE_ERROR and
> pcie_tph_get_cpu_st() returns -EINVAL.  It thus looks like you may not
> even need an #ifdef.

We might have to add #ifdef around the ACPI related functions. The
reason is not because of acpi_evaluate_dsm() or acpi_evaluate_dsm().
Instead the compilation will fail due to "pci_acpi_dsm_guid". In TPH V2
series, somebody reported the following error:

"
This seems to break builds on ARM (32bit) with multi_v7_defconfig.

  .../tph.c:221:39: error: use of undeclared identifier 'pci_acpi_dsm_guid'
  221 |         out_obj = acpi_evaluate_dsm(handle, &pci_acpi_dsm_guid,
MIN_ST_DSM_REV,
      |
"

> 
> 
>> diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
>> new file mode 100644
> 
> The PCIe features added most recently (such as DOE) have been placed
> directly in drivers/pci/ instead of the pcie/ subdirectory.
> The pcie/ subdirectory mostly deals with port drivers.
> So perhaps tph.c should likewise be placed in drivers/pci/ ?

I am OK with it. Some extended features, such as ATS, are indeed
implemented in drivers/pci/.

Bjorn: Any comments on this idea?

> 
> 
>> --- /dev/null
>> +++ b/drivers/pci/pcie/tph.c
>> @@ -0,0 +1,199 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * TPH (TLP Processing Hints) support
>> + *
>> + * Copyright (C) 2024 Advanced Micro Devices, Inc.
>> + *     Eric Van Tassell <Eric.VanTassell@amd.com>
>> + *     Wei Huang <wei.huang2@amd.com>
>> + */
>> +#include <linux/pci.h>
>> +#include <linux/pci-acpi.h>
> 
> This patch doesn't seem to use any of the symbols defined in pci-acpi.h,
> or did I miss anything?  I'd move the inclusion of pci-acpi.h to the patch
> that actually uses its symbols.
> 
> Thanks,
> 
> Lukas

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2024-09-24 16:34 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-16 20:50 [PATCH V5 0/5] PCIe TPH and cache direct injection support Wei Huang
2024-09-16 20:50 ` [PATCH V5 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
2024-09-17  7:38   ` Simon Horman
2024-09-23  7:43   ` Lukas Wunner
2024-09-24 16:34     ` Wei Huang
2024-09-23 12:07   ` Alejandro Lucero Palau
2024-09-23 20:27     ` Wei Huang
2024-09-24 14:33       ` Alejandro Lucero Palau
2024-09-16 20:51 ` [PATCH V5 2/5] PCI/TPH: Add Steering Tag support Wei Huang
2024-09-17  7:32   ` Simon Horman
2024-09-17 14:31     ` Wei Huang
2024-09-17 16:14       ` Simon Horman
2024-09-17 16:24         ` Simon Horman
2024-09-20 10:38   ` kernel test robot
2024-09-16 20:51 ` [PATCH V5 3/5] PCI/TPH: Add TPH documentation Wei Huang
2024-09-16 20:51 ` [PATCH V5 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
2024-09-16 21:25   ` Wei Huang
2024-09-23  7:25     ` Lukas Wunner
2024-09-23 20:16       ` Wei Huang
2024-09-17  7:35   ` Simon Horman
2024-09-20 10:38   ` kernel test robot
2024-09-16 20:51 ` [PATCH V5 5/5] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings Wei Huang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).