* [PATCH V7 0/5] TPH and cache direct injection support
From: Wei Huang @ 2024-10-02 16:59 UTC
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas, lukas, paul.e.luse, jing2.liu
Hi All,
TPH (TLP Processing Hints) is a PCIe feature that allows endpoint
devices to provide optimization hints for requests that target memory
space. These hints, in a format called steering tag (ST), are provided
in the requester's TLP headers and allow the system hardware, including
the Root Complex, to optimize the utilization of platform resources
for the requests.
Upcoming AMD hardware implements a new Cache Injection feature that
leverages TPH. Cache Injection allows PCIe endpoints to inject I/O
Coherent DMA writes directly into an L2 within the CCX (core complex)
closest to the CPU core that will consume it. This technology is aimed
at applications requiring high performance and low latency, such as
networking and storage applications.
This series introduces generic TPH support in Linux, allowing STs to be
retrieved and used by PCIe endpoint drivers as needed. As a
demonstration, it includes an example usage in the Broadcom BNXT driver.
When running on Broadcom NICs with the appropriate firmware, it shows
substantial memory bandwidth savings and better network bandwidth using
real-world benchmarks. This solution is vendor-neutral and implemented
based on industry standards (PCIe Spec and PCI FW Spec).
V6->V7:
* Rebase on top of the latest pci/main (6.12-rc1)
* Fix compilation warning/error on clang-18 with W=1 (test robot)
* Revise commit messages for Patch #2, #4, and #5 (Bjorn)
* Add more _DSM method description for reference in Patch #2 (Bjorn)
* Remove "default n" in Kconfig (Lukas)
V5->V6:
* Rebase on top of pci/main (tag: pci-v6.12-changes)
* Fix spellings and FIELD_PREP/bnxt.c compilation errors (Simon)
* Move tph.c to drivers/pci directory (Lukas)
* Remove CONFIG_ACPI dependency (Lukas)
* Slightly re-arrange save/restore sequence (Lukas)
V4->V5:
* Rebase on top of net-next/main tree (Broadcom)
* Remove TPH mode query and TPH enabled checking functions (Bjorn)
* Remove "nostmode" kernel parameter (Bjorn)
* Add "notph" kernel parameter support (Bjorn)
* Add back TPH documentation (Bjorn)
* Change TPH register namings (Bjorn)
* Squash TPH enable/disable/save/restore funcs as a single patch (Bjorn)
* Squash ST get_st/set_st funcs as a single patch (Bjorn)
* Replace nic_open/close with netdev_rx_queue_restart() (Jakub, Broadcom)
V3->V4:
* Rebase on top of the latest pci/next tree (tag: 6.11-rc1)
* Add new API functions to query/enable/disable TPH support
* Make pcie_tph_set_st() completely independent from pcie_tph_get_cpu_st()
* Rewrite bnxt.c based on new APIs
* Remove documentation for now due to constantly changing API
* Remove pci=notph, but keep pci=nostmode with better flow (Bjorn)
* Lots of code rewrite in tph.c & pci-tph.h with cleaner interface (Bjorn)
* Add TPH save/restore support (Paul Luse and Lukas Wunner)
V2->V3:
* Rebase on top of pci/next tree (tag: pci-v6.11-changes)
* Redefine PCI TPH registers (pci_regs.h) without breaking uapi
* Fix commit subjects/messages for kernel options (Jonathan and Bjorn)
* Break API functions into three individual patches for easy review
* Rewrite lots of code in tph.c/tph.h based on review feedback (Jonathan and Bjorn)
V1->V2:
* Rebase on top of pci.git/for-linus (6.10-rc1)
* Address mismatched data types reported by Sparse (Sparse check passed)
* Add pcie_tph_intr_vec_supported() for checking IRQ mode support
* Skip bnxt affinity notifier registration if
pcie_tph_intr_vec_supported()=false
* Minor fixes in bnxt driver (i.e. warning messages)
Manoj Panicker (1):
bnxt_en: Add TPH support in BNXT driver
Michael Chan (1):
bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings
Wei Huang (3):
PCI: Add TLP Processing Hints (TPH) support
PCI/TPH: Add Steering Tag support
PCI/TPH: Add TPH documentation
Documentation/PCI/index.rst | 1 +
Documentation/PCI/tph.rst | 132 +++++
.../admin-guide/kernel-parameters.txt | 4 +
Documentation/driver-api/pci/pci.rst | 3 +
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 91 ++-
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 7 +
drivers/pci/Kconfig | 9 +
drivers/pci/Makefile | 1 +
drivers/pci/pci.c | 4 +
drivers/pci/pci.h | 12 +
drivers/pci/probe.c | 1 +
drivers/pci/tph.c | 546 ++++++++++++++++++
include/linux/pci-tph.h | 44 ++
include/linux/pci.h | 7 +
include/uapi/linux/pci_regs.h | 37 +-
net/core/netdev_rx_queue.c | 1 +
16 files changed, 890 insertions(+), 10 deletions(-)
create mode 100644 Documentation/PCI/tph.rst
create mode 100644 drivers/pci/tph.c
create mode 100644 include/linux/pci-tph.h
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
--
2.46.0
* [PATCH V7 1/5] PCI: Add TLP Processing Hints (TPH) support
From: Wei Huang @ 2024-10-02 16:59 UTC
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas, lukas, paul.e.luse, jing2.liu
Add support for PCIe TLP Processing Hints (TPH) (see PCIe r6.2,
sec 6.17).
Add missing TPH register definitions in pci_regs.h, including the TPH
Requester capability register, TPH Requester control register, TPH
Completer capability, and the ST fields of MSI-X entry.
Introduce pcie_enable_tph() and pcie_disable_tph(), enabling drivers to
toggle TPH support and configure a specific ST mode as needed. Also add a
new kernel parameter, "pci=notph", allowing users to disable TPH support
across the entire system.
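The enable path amounts to two checks followed by one control register update. The following is a minimal user-space sketch of that logic, not the kernel code itself: tph_model_ctrl() is a hypothetical helper, register contents are passed in as plain integers instead of being read from config space, and the mask values are copied from the pci_regs.h changes in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field masks copied from the pci_regs.h hunk of this patch */
#define TPH_CAP_ST_MODES   0x00000007u /* NS | IV | DS mode support bits */
#define TPH_CAP_EXT_TPH    0x00000100u /* Extended TPH Requester Supported */
#define TPH_CTRL_MODE_SEL  0x00000007u /* ST Mode Select */
#define TPH_CTRL_REQ_EN    0x00000300u /* TPH Requester Enable */
#define DEVCAP2_TPH_COMP   0x00003000u /* Root Port TPH completer support */

/*
 * Hypothetical model of the checks performed by pcie_enable_tph().
 *
 * cap:        value of the device's TPH Requester capability register
 * rp_devcap2: value of the Root Port's Device Capabilities 2 register
 * mode:       requested ST mode (0 = No ST, 1 = Interrupt Vector,
 *             2 = Device Specific)
 * ctrl:       in/out copy of the TPH Requester control register
 *
 * Returns 0 and updates *ctrl on success, -1 if the mode is not supported
 * by the device or if no TPH requests are allowed.
 */
int tph_model_ctrl(uint32_t cap, uint32_t rp_devcap2,
                   unsigned int mode, uint32_t *ctrl)
{
    uint32_t req_type, rp_type;

    assert(ctrl != NULL);

    /* The mode number doubles as the bit index of its "supported" bit */
    mode &= TPH_CTRL_MODE_SEL;
    if (!((1u << mode) & cap & TPH_CAP_ST_MODES))
        return -1;

    /* Requester side: 0x3 = TPH and Extended TPH, 0x1 = TPH only */
    req_type = (cap & TPH_CAP_EXT_TPH) ? 0x3u : 0x1u;

    /* Completer side: the final request type is the smaller of the two */
    rp_type = (rp_devcap2 & DEVCAP2_TPH_COMP) >> 12;
    if (rp_type < req_type)
        req_type = rp_type;
    if (req_type == 0)
        return -1;

    *ctrl &= ~(TPH_CTRL_MODE_SEL | TPH_CTRL_REQ_EN);
    *ctrl |= mode;
    *ctrl |= req_type << 8;

    return 0;
}
```

For example, a device that supports Interrupt Vector Mode and Extended TPH (cap = 0x102) behind a Root Port that completes TPH only (DEVCAP2 = 0x1000) ends up with a control value of 0x101 when mode 1 is requested: the request type is capped by what the Root Port can complete.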
Co-developed-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com>
Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com>
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
---
.../admin-guide/kernel-parameters.txt | 4 +
drivers/pci/Kconfig | 9 +
drivers/pci/Makefile | 1 +
drivers/pci/pci.c | 4 +
drivers/pci/pci.h | 12 ++
drivers/pci/probe.c | 1 +
drivers/pci/tph.c | 197 ++++++++++++++++++
include/linux/pci-tph.h | 21 ++
include/linux/pci.h | 7 +
include/uapi/linux/pci_regs.h | 37 +++-
10 files changed, 285 insertions(+), 8 deletions(-)
create mode 100644 drivers/pci/tph.c
create mode 100644 include/linux/pci-tph.h
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1518343bbe22..178995b07451 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4678,6 +4678,10 @@
nomio [S390] Do not use MIO instructions.
norid [S390] ignore the RID field and force use of
one PCI domain per PCI function
+ notph [PCIE] If the PCIE_TPH kernel config parameter
+ is enabled, this kernel boot option can be used
+ to disable PCIe TLP Processing Hints support
+ system-wide.
pcie_aspm= [PCIE] Forcibly enable or ignore PCIe Active State Power
Management.
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 0d94e4a967d8..2f270e4414b3 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -173,6 +173,15 @@ config PCI_PASID
If unsure, say N.
+config PCIE_TPH
+ bool "TLP Processing Hints"
+ help
+ This option adds support for PCIe TLP Processing Hints (TPH).
+ TPH allows endpoint devices to provide optimization hints, such as
+ desired caching behavior, for requests that target memory space.
+ These hints, called Steering Tags, can empower the system hardware
+ to optimize the utilization of platform resources.
+
config PCI_P2PDMA
bool "PCI peer-to-peer transfer support"
depends on ZONE_DEVICE
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 374c5c06d92f..b2a100f2e24a 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -36,6 +36,7 @@ obj-$(CONFIG_VGA_ARB) += vgaarb.o
obj-$(CONFIG_PCI_DOE) += doe.o
obj-$(CONFIG_PCI_DYNAMIC_OF_NODES) += of_property.o
obj-$(CONFIG_PCI_NPEM) += npem.o
+obj-$(CONFIG_PCIE_TPH) += tph.o
# Endpoint library must be initialized before its users
obj-$(CONFIG_PCI_ENDPOINT) += endpoint/
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 7d85c04fbba2..89dafecc869b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1828,6 +1828,7 @@ int pci_save_state(struct pci_dev *dev)
pci_save_dpc_state(dev);
pci_save_aer_state(dev);
pci_save_ptm_state(dev);
+ pci_save_tph_state(dev);
return pci_save_vc_state(dev);
}
EXPORT_SYMBOL(pci_save_state);
@@ -1933,6 +1934,7 @@ void pci_restore_state(struct pci_dev *dev)
pci_restore_rebar_state(dev);
pci_restore_dpc_state(dev);
pci_restore_ptm_state(dev);
+ pci_restore_tph_state(dev);
pci_aer_clear_status(dev);
pci_restore_aer_state(dev);
@@ -6896,6 +6898,8 @@ static int __init pci_setup(char *str)
pci_no_domains();
} else if (!strncmp(str, "noari", 5)) {
pcie_ari_disabled = true;
+ } else if (!strncmp(str, "notph", 5)) {
+ pci_no_tph();
} else if (!strncmp(str, "cbiosize=", 9)) {
pci_cardbus_io_size = memparse(str + 9, &str);
} else if (!strncmp(str, "cbmemsize=", 10)) {
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 14d00ce45bfa..d89fdbf04f36 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -597,6 +597,18 @@ static inline int pci_iov_bus_range(struct pci_bus *bus)
#endif /* CONFIG_PCI_IOV */
+#ifdef CONFIG_PCIE_TPH
+void pci_restore_tph_state(struct pci_dev *dev);
+void pci_save_tph_state(struct pci_dev *dev);
+void pci_no_tph(void);
+void pci_tph_init(struct pci_dev *dev);
+#else
+static inline void pci_restore_tph_state(struct pci_dev *dev) { }
+static inline void pci_save_tph_state(struct pci_dev *dev) { }
+static inline void pci_no_tph(void) { }
+static inline void pci_tph_init(struct pci_dev *dev) { }
+#endif
+
#ifdef CONFIG_PCIE_PTM
void pci_ptm_init(struct pci_dev *dev);
void pci_save_ptm_state(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4f68414c3086..b086d53a9048 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2495,6 +2495,7 @@ static void pci_init_capabilities(struct pci_dev *dev)
pci_dpc_init(dev); /* Downstream Port Containment */
pci_rcec_init(dev); /* Root Complex Event Collector */
pci_doe_init(dev); /* Data Object Exchange */
+ pci_tph_init(dev); /* TLP Processing Hints */
pcie_report_downtraining(dev);
pci_init_reset_methods(dev);
diff --git a/drivers/pci/tph.c b/drivers/pci/tph.c
new file mode 100644
index 000000000000..6c6b500c2eaa
--- /dev/null
+++ b/drivers/pci/tph.c
@@ -0,0 +1,197 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * TPH (TLP Processing Hints) support
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc.
+ * Eric Van Tassell <Eric.VanTassell@amd.com>
+ * Wei Huang <wei.huang2@amd.com>
+ */
+#include <linux/pci.h>
+#include <linux/bitfield.h>
+#include <linux/pci-tph.h>
+
+#include "pci.h"
+
+/* System-wide TPH disabled */
+static bool pci_tph_disabled;
+
+static u8 get_st_modes(struct pci_dev *pdev)
+{
+ u32 reg;
+
+ pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
+ reg &= PCI_TPH_CAP_ST_NS | PCI_TPH_CAP_ST_IV | PCI_TPH_CAP_ST_DS;
+
+ return reg;
+}
+
+/* Return device's Root Port completer capability */
+static u8 get_rp_completer_type(struct pci_dev *pdev)
+{
+ struct pci_dev *rp;
+ u32 reg;
+ int ret;
+
+ rp = pcie_find_root_port(pdev);
+ if (!rp)
+ return 0;
+
+ ret = pcie_capability_read_dword(rp, PCI_EXP_DEVCAP2, &reg);
+ if (ret)
+ return 0;
+
+ return FIELD_GET(PCI_EXP_DEVCAP2_TPH_COMP_MASK, reg);
+}
+
+/**
+ * pcie_disable_tph - Turn off TPH support for device
+ * @pdev: PCI device
+ *
+ * Return: none
+ */
+void pcie_disable_tph(struct pci_dev *pdev)
+{
+ if (!pdev->tph_cap)
+ return;
+
+ if (!pdev->tph_enabled)
+ return;
+
+ pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, 0);
+
+ pdev->tph_mode = 0;
+ pdev->tph_req_type = 0;
+ pdev->tph_enabled = 0;
+}
+EXPORT_SYMBOL(pcie_disable_tph);
+
+/**
+ * pcie_enable_tph - Enable TPH support for device using a specific ST mode
+ * @pdev: PCI device
+ * @mode: ST mode to enable. Current supported modes include:
+ *
+ * - PCI_TPH_ST_NS_MODE: NO ST Mode
+ * - PCI_TPH_ST_IV_MODE: Interrupt Vector Mode
+ * - PCI_TPH_ST_DS_MODE: Device Specific Mode
+ *
+ * Checks whether the mode is actually supported by the device before enabling
+ * and returns an error if not. Additionally determines what types of requests,
+ * TPH or extended TPH, can be issued by the device based on its TPH requester
+ * capability and the Root Port's completer capability.
+ *
+ * Return: 0 on success, otherwise negative value (-errno)
+ */
+int pcie_enable_tph(struct pci_dev *pdev, int mode)
+{
+ u32 reg;
+ u8 dev_modes;
+ u8 rp_req_type;
+
+ /* Honor "notph" kernel parameter */
+ if (pci_tph_disabled)
+ return -EINVAL;
+
+ if (!pdev->tph_cap)
+ return -EINVAL;
+
+ if (pdev->tph_enabled)
+ return -EBUSY;
+
+ /* Sanitize and check ST mode compatibility */
+ mode &= PCI_TPH_CTRL_MODE_SEL_MASK;
+ dev_modes = get_st_modes(pdev);
+ if (!((1 << mode) & dev_modes))
+ return -EINVAL;
+
+ pdev->tph_mode = mode;
+
+ /* Get req_type supported by device and its Root Port */
+ pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
+ if (FIELD_GET(PCI_TPH_CAP_EXT_TPH, reg))
+ pdev->tph_req_type = PCI_TPH_REQ_EXT_TPH;
+ else
+ pdev->tph_req_type = PCI_TPH_REQ_TPH_ONLY;
+
+ rp_req_type = get_rp_completer_type(pdev);
+
+ /* Final req_type is the smallest value of two */
+ pdev->tph_req_type = min(pdev->tph_req_type, rp_req_type);
+
+ if (pdev->tph_req_type == PCI_TPH_REQ_DISABLE)
+ return -EINVAL;
+
+ /* Write them into TPH control register */
+ pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, &reg);
+
+ reg &= ~PCI_TPH_CTRL_MODE_SEL_MASK;
+ reg |= FIELD_PREP(PCI_TPH_CTRL_MODE_SEL_MASK, pdev->tph_mode);
+
+ reg &= ~PCI_TPH_CTRL_REQ_EN_MASK;
+ reg |= FIELD_PREP(PCI_TPH_CTRL_REQ_EN_MASK, pdev->tph_req_type);
+
+ pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg);
+
+ pdev->tph_enabled = 1;
+
+ return 0;
+}
+EXPORT_SYMBOL(pcie_enable_tph);
+
+void pci_restore_tph_state(struct pci_dev *pdev)
+{
+ struct pci_cap_saved_state *save_state;
+ u32 *cap;
+
+ if (!pdev->tph_cap)
+ return;
+
+ if (!pdev->tph_enabled)
+ return;
+
+ save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_TPH);
+ if (!save_state)
+ return;
+
+ /* Restore control register and all ST entries */
+ cap = &save_state->cap.data[0];
+ pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, *cap++);
+}
+
+void pci_save_tph_state(struct pci_dev *pdev)
+{
+ struct pci_cap_saved_state *save_state;
+ u32 *cap;
+
+ if (!pdev->tph_cap)
+ return;
+
+ if (!pdev->tph_enabled)
+ return;
+
+ save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_TPH);
+ if (!save_state)
+ return;
+
+ /* Save control register */
+ cap = &save_state->cap.data[0];
+ pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, cap++);
+}
+
+void pci_no_tph(void)
+{
+ pci_tph_disabled = true;
+
+ pr_info("PCIe TPH is disabled\n");
+}
+
+void pci_tph_init(struct pci_dev *pdev)
+{
+ u32 save_size;
+
+ pdev->tph_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_TPH);
+ if (!pdev->tph_cap)
+ return;
+
+ save_size = sizeof(u32);
+ pci_add_ext_cap_save_buffer(pdev, PCI_EXT_CAP_ID_TPH, save_size);
+}
diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
new file mode 100644
index 000000000000..58654a334ffb
--- /dev/null
+++ b/include/linux/pci-tph.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * TPH (TLP Processing Hints)
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc.
+ * Eric Van Tassell <Eric.VanTassell@amd.com>
+ * Wei Huang <wei.huang2@amd.com>
+ */
+#ifndef LINUX_PCI_TPH_H
+#define LINUX_PCI_TPH_H
+
+#ifdef CONFIG_PCIE_TPH
+void pcie_disable_tph(struct pci_dev *pdev);
+int pcie_enable_tph(struct pci_dev *pdev, int mode);
+#else
+static inline void pcie_disable_tph(struct pci_dev *pdev) { }
+static inline int pcie_enable_tph(struct pci_dev *pdev, int mode)
+{ return -EINVAL; }
+#endif
+
+#endif /* LINUX_PCI_TPH_H */
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 573b4c4c2be6..8351d76b6e12 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -434,6 +434,7 @@ struct pci_dev {
unsigned int ats_enabled:1; /* Address Translation Svc */
unsigned int pasid_enabled:1; /* Process Address Space ID */
unsigned int pri_enabled:1; /* Page Request Interface */
+ unsigned int tph_enabled:1; /* TLP Processing Hints */
unsigned int is_managed:1; /* Managed via devres */
unsigned int is_msi_managed:1; /* MSI release via devres installed */
unsigned int needs_freset:1; /* Requires fundamental reset */
@@ -534,6 +535,12 @@ struct pci_dev {
/* These methods index pci_reset_fn_methods[] */
u8 reset_methods[PCI_NUM_RESET_METHODS]; /* In priority order */
+
+#ifdef CONFIG_PCIE_TPH
+ u16 tph_cap; /* TPH capability offset */
+ u8 tph_mode; /* TPH mode */
+ u8 tph_req_type; /* TPH requester type */
+#endif
};
static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 12323b3334a9..155dea741615 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -340,7 +340,8 @@
#define PCI_MSIX_ENTRY_UPPER_ADDR 0x4 /* Message Upper Address */
#define PCI_MSIX_ENTRY_DATA 0x8 /* Message Data */
#define PCI_MSIX_ENTRY_VECTOR_CTRL 0xc /* Vector Control */
-#define PCI_MSIX_ENTRY_CTRL_MASKBIT 0x00000001
+#define PCI_MSIX_ENTRY_CTRL_MASKBIT 0x00000001 /* Mask Bit */
+#define PCI_MSIX_ENTRY_CTRL_ST 0xffff0000 /* Steering Tag */
/* CompactPCI Hotswap Register */
@@ -659,6 +660,7 @@
#define PCI_EXP_DEVCAP2_ATOMIC_COMP64 0x00000100 /* 64b AtomicOp completion */
#define PCI_EXP_DEVCAP2_ATOMIC_COMP128 0x00000200 /* 128b AtomicOp completion */
#define PCI_EXP_DEVCAP2_LTR 0x00000800 /* Latency tolerance reporting */
+#define PCI_EXP_DEVCAP2_TPH_COMP_MASK 0x00003000 /* TPH completer support */
#define PCI_EXP_DEVCAP2_OBFF_MASK 0x000c0000 /* OBFF support mechanism */
#define PCI_EXP_DEVCAP2_OBFF_MSG 0x00040000 /* New message signaling */
#define PCI_EXP_DEVCAP2_OBFF_WAKE 0x00080000 /* Re-use WAKE# for OBFF */
@@ -1023,15 +1025,34 @@
#define PCI_DPA_CAP_SUBSTATE_MASK 0x1F /* # substates - 1 */
#define PCI_DPA_BASE_SIZEOF 16 /* size with 0 substates */
+/* TPH Completer Support */
+#define PCI_EXP_DEVCAP2_TPH_COMP_NONE 0x0 /* None */
+#define PCI_EXP_DEVCAP2_TPH_COMP_TPH_ONLY 0x1 /* TPH only */
+#define PCI_EXP_DEVCAP2_TPH_COMP_EXT_TPH 0x3 /* TPH and Extended TPH */
+
/* TPH Requester */
#define PCI_TPH_CAP 4 /* capability register */
-#define PCI_TPH_CAP_LOC_MASK 0x600 /* location mask */
-#define PCI_TPH_LOC_NONE 0x000 /* no location */
-#define PCI_TPH_LOC_CAP 0x200 /* in capability */
-#define PCI_TPH_LOC_MSIX 0x400 /* in MSI-X */
-#define PCI_TPH_CAP_ST_MASK 0x07FF0000 /* ST table mask */
-#define PCI_TPH_CAP_ST_SHIFT 16 /* ST table shift */
-#define PCI_TPH_BASE_SIZEOF 0xc /* size with no ST table */
+#define PCI_TPH_CAP_ST_NS 0x00000001 /* No ST Mode Supported */
+#define PCI_TPH_CAP_ST_IV 0x00000002 /* Interrupt Vector Mode Supported */
+#define PCI_TPH_CAP_ST_DS 0x00000004 /* Device Specific Mode Supported */
+#define PCI_TPH_CAP_EXT_TPH 0x00000100 /* Ext TPH Requester Supported */
+#define PCI_TPH_CAP_LOC_MASK 0x00000600 /* ST Table Location */
+#define PCI_TPH_LOC_NONE 0x00000000 /* Not present */
+#define PCI_TPH_LOC_CAP 0x00000200 /* In capability */
+#define PCI_TPH_LOC_MSIX 0x00000400 /* In MSI-X */
+#define PCI_TPH_CAP_ST_MASK 0x07FF0000 /* ST Table Size */
+#define PCI_TPH_CAP_ST_SHIFT 16 /* ST Table Size shift */
+#define PCI_TPH_BASE_SIZEOF 0xc /* Size with no ST table */
+
+#define PCI_TPH_CTRL 8 /* control register */
+#define PCI_TPH_CTRL_MODE_SEL_MASK 0x00000007 /* ST Mode Select */
+#define PCI_TPH_ST_NS_MODE 0x0 /* No ST Mode */
+#define PCI_TPH_ST_IV_MODE 0x1 /* Interrupt Vector Mode */
+#define PCI_TPH_ST_DS_MODE 0x2 /* Device Specific Mode */
+#define PCI_TPH_CTRL_REQ_EN_MASK 0x00000300 /* TPH Requester Enable */
+#define PCI_TPH_REQ_DISABLE 0x0 /* No TPH requests allowed */
+#define PCI_TPH_REQ_TPH_ONLY 0x1 /* TPH only requests allowed */
+#define PCI_TPH_REQ_EXT_TPH 0x3 /* Extended TPH requests allowed */
/* Downstream Port Containment */
#define PCI_EXP_DPC_CAP 0x04 /* DPC Capability */
--
2.46.0
* [PATCH V7 2/5] PCI/TPH: Add Steering Tag support
From: Wei Huang @ 2024-10-02 16:59 UTC
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas, lukas, paul.e.luse, jing2.liu
Add pcie_tph_get_cpu_st() to allow a caller to retrieve the Steering Tag
for a target memory that is associated with a specific CPU. The Steering
Tag is retrieved by invoking the PCI ACPI _DSM method (rev=0x7, func=0xF)
of the device's Root Port device.
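The _DSM method returns a single 64-bit value that packs the tags for volatile and persistent memory. As a rough illustration of how one tag is picked out of it, here is a user-space sketch that decodes the value with shifts. tph_model_extract_tag() and tph_model_extract_example() are hypothetical names, and the bit positions assume the union st_info bit-fields in this patch are allocated starting from the least significant bit.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions within each 32-bit half, following union st_info */
#define ST_VALID_BIT   0   /* 8-bit ST is valid */
#define XST_VALID_BIT  1   /* 16-bit extended ST is valid */
#define ST_SHIFT       8   /* 8-bit ST occupies bits 8..15 */
#define XST_SHIFT      16  /* 16-bit extended ST occupies bits 16..31 */
#define PM_SHIFT       32  /* persistent-memory fields live in the upper half */

/*
 * Hypothetical model of the tag selection: pick the 8-bit or 16-bit tag
 * for volatile (persistent == 0) or persistent (persistent != 0) memory.
 * Returns 0 when the matching valid bit is clear.
 */
uint16_t tph_model_extract_tag(uint64_t info, int persistent, int extended)
{
    uint32_t half = (uint32_t)(persistent ? info >> PM_SHIFT : info);

    if (extended) {
        if (!((half >> XST_VALID_BIT) & 1))
            return 0;
        return (uint16_t)(half >> XST_SHIFT);
    }

    if (!((half >> ST_VALID_BIT) & 1))
        return 0;
    return (uint8_t)(half >> ST_SHIFT);
}

/* Example: firmware reports only a valid 8-bit tag 0x5a for volatile memory */
void tph_model_extract_example(void)
{
    uint64_t info = 0x5a01;

    assert(tph_model_extract_tag(info, 0, 0) == 0x5a);
    assert(tph_model_extract_tag(info, 0, 1) == 0);
}
```

Decoding with explicit shifts rather than bit-fields is only a portability choice for this sketch, since bit-field layout is implementation-defined in C.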
Add pcie_tph_set_st_entry() to support updating the device's Steering
Tags. The tags will be written into the device's MSI-X table or the
ST table located in the TPH Extended Capability space.
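Where a tag lands depends on the ST Table Location field of the TPH Requester capability register. The sketch below models both destinations in user space under stated assumptions: the tph_model_*() helpers are hypothetical, register values are plain integers rather than config-space or MMIO accesses, and the masks are copied from pci_regs.h as modified by this series.

```c
#include <assert.h>
#include <stdint.h>

#define TPH_CAP_LOC_MASK    0x00000600u /* ST Table Location */
#define TPH_LOC_CAP         0x00000200u /* In TPH capability structure */
#define TPH_CAP_ST_MASK     0x07FF0000u /* ST Table Size (entries - 1) */
#define TPH_CAP_ST_SHIFT    16
#define TPH_BASE_SIZEOF     0xc         /* Capability size with no ST table */
#define MSIX_ENTRY_CTRL_ST  0xffff0000u /* ST field of MSI-X Vector Control */

/*
 * Config-space offset of ST entry 'index' when the table lives in the TPH
 * capability structure. Returns -1 if the table is located elsewhere or the
 * index is out of range.
 */
int tph_model_st_entry_offset(uint16_t tph_cap, uint32_t cap_reg,
                              unsigned int index)
{
    unsigned int size;

    assert(tph_cap != 0); /* caller must have found the capability */

    if ((cap_reg & TPH_CAP_LOC_MASK) != TPH_LOC_CAP)
        return -1;

    /* The size field holds the number of entries minus one */
    size = ((cap_reg & TPH_CAP_ST_MASK) >> TPH_CAP_ST_SHIFT) + 1;
    if (index >= size)
        return -1;

    /* Each ST entry is 16 bits wide */
    return tph_cap + TPH_BASE_SIZEOF + (int)(index * sizeof(uint16_t));
}

/*
 * New MSI-X Vector Control value when the table lives in the MSI-X table:
 * the tag replaces the upper 16 bits and the lower bits (including the
 * mask bit) are preserved.
 */
uint32_t tph_model_msix_vec_ctrl(uint32_t vec_ctrl, uint16_t tag)
{
    return (vec_ctrl & ~MSIX_ENTRY_CTRL_ST) | ((uint32_t)tag << 16);
}
```

With a capability at offset 0x100 and a four-entry table in the capability structure (cap_reg = 0x00030200), entry 2 sits at config offset 0x110, while index 4 is rejected as out of range.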
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
drivers/pci/tph.c | 351 +++++++++++++++++++++++++++++++++++++++-
include/linux/pci-tph.h | 23 +++
2 files changed, 373 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/tph.c b/drivers/pci/tph.c
index 6c6b500c2eaa..9a268653866d 100644
--- a/drivers/pci/tph.c
+++ b/drivers/pci/tph.c
@@ -7,6 +7,8 @@
* Wei Huang <wei.huang2@amd.com>
*/
#include <linux/pci.h>
+#include <linux/pci-acpi.h>
+#include <linux/msi.h>
#include <linux/bitfield.h>
#include <linux/pci-tph.h>
@@ -15,6 +17,134 @@
/* System-wide TPH disabled */
static bool pci_tph_disabled;
+#ifdef CONFIG_ACPI
+/*
+ * The st_info struct defines the Steering Tag (ST) info returned by the
+ * firmware PCI ACPI _DSM method (rev=0x7, func=0xF, "_DSM to Query Cache
+ * Locality TPH Features"), as specified in the approved ECN for PCI Firmware
+ * Spec and available at https://members.pcisig.com/wg/PCI-SIG/document/15470.
+ *
+ * @vm_st_valid: 8-bit ST for volatile memory is valid
+ * @vm_xst_valid: 16-bit extended ST for volatile memory is valid
+ * @vm_ph_ignore: 1 => PH was and will be ignored, 0 => PH should be supplied
+ * @vm_st: 8-bit ST for volatile mem
+ * @vm_xst: 16-bit extended ST for volatile mem
+ * @pm_st_valid: 8-bit ST for persistent memory is valid
+ * @pm_xst_valid: 16-bit extended ST for persistent memory is valid
+ * @pm_ph_ignore: 1 => PH was and will be ignored, 0 => PH should be supplied
+ * @pm_st: 8-bit ST for persistent mem
+ * @pm_xst: 16-bit extended ST for persistent mem
+ */
+union st_info {
+ struct {
+ u64 vm_st_valid : 1;
+ u64 vm_xst_valid : 1;
+ u64 vm_ph_ignore : 1;
+ u64 rsvd1 : 5;
+ u64 vm_st : 8;
+ u64 vm_xst : 16;
+ u64 pm_st_valid : 1;
+ u64 pm_xst_valid : 1;
+ u64 pm_ph_ignore : 1;
+ u64 rsvd2 : 5;
+ u64 pm_st : 8;
+ u64 pm_xst : 16;
+ };
+ u64 value;
+};
+
+static u16 tph_extract_tag(enum tph_mem_type mem_type, u8 req_type,
+ union st_info *info)
+{
+ switch (req_type) {
+ case PCI_TPH_REQ_TPH_ONLY: /* 8-bit tag */
+ switch (mem_type) {
+ case TPH_MEM_TYPE_VM:
+ if (info->vm_st_valid)
+ return info->vm_st;
+ break;
+ case TPH_MEM_TYPE_PM:
+ if (info->pm_st_valid)
+ return info->pm_st;
+ break;
+ }
+ break;
+ case PCI_TPH_REQ_EXT_TPH: /* 16-bit tag */
+ switch (mem_type) {
+ case TPH_MEM_TYPE_VM:
+ if (info->vm_xst_valid)
+ return info->vm_xst;
+ break;
+ case TPH_MEM_TYPE_PM:
+ if (info->pm_xst_valid)
+ return info->pm_xst;
+ break;
+ }
+ break;
+ default:
+ return 0;
+ }
+
+ return 0;
+}
+
+#define TPH_ST_DSM_FUNC_INDEX 0xF
+static acpi_status tph_invoke_dsm(acpi_handle handle, u32 cpu_uid,
+ union st_info *st_out)
+{
+ union acpi_object arg3[3], in_obj, *out_obj;
+
+ if (!acpi_check_dsm(handle, &pci_acpi_dsm_guid, 7,
+ BIT(TPH_ST_DSM_FUNC_INDEX)))
+ return AE_ERROR;
+
+ /* DWORD: feature ID (0 for processor cache ST query) */
+ arg3[0].integer.type = ACPI_TYPE_INTEGER;
+ arg3[0].integer.value = 0;
+
+ /* DWORD: target UID */
+ arg3[1].integer.type = ACPI_TYPE_INTEGER;
+ arg3[1].integer.value = cpu_uid;
+
+ /* QWORD: properties, all 0's */
+ arg3[2].integer.type = ACPI_TYPE_INTEGER;
+ arg3[2].integer.value = 0;
+
+ in_obj.type = ACPI_TYPE_PACKAGE;
+ in_obj.package.count = ARRAY_SIZE(arg3);
+ in_obj.package.elements = arg3;
+
+ out_obj = acpi_evaluate_dsm(handle, &pci_acpi_dsm_guid, 7,
+ TPH_ST_DSM_FUNC_INDEX, &in_obj);
+ if (!out_obj)
+ return AE_ERROR;
+
+ if (out_obj->type != ACPI_TYPE_BUFFER) {
+ ACPI_FREE(out_obj);
+ return AE_ERROR;
+ }
+
+ st_out->value = *((u64 *)(out_obj->buffer.pointer));
+
+ ACPI_FREE(out_obj);
+
+ return AE_OK;
+}
+#endif
+
+/* Update the TPH Requester Enable field of TPH Control Register */
+static void set_ctrl_reg_req_en(struct pci_dev *pdev, u8 req_type)
+{
+ u32 reg;
+
+ pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, &reg);
+
+ reg &= ~PCI_TPH_CTRL_REQ_EN_MASK;
+ reg |= FIELD_PREP(PCI_TPH_CTRL_REQ_EN_MASK, req_type);
+
+ pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg);
+}
+
static u8 get_st_modes(struct pci_dev *pdev)
{
u32 reg;
@@ -25,6 +155,37 @@ static u8 get_st_modes(struct pci_dev *pdev)
return reg;
}
+static u32 get_st_table_loc(struct pci_dev *pdev)
+{
+ u32 reg;
+
+ pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
+
+ return FIELD_GET(PCI_TPH_CAP_LOC_MASK, reg);
+}
+
+/*
+ * Return the size of ST table. If ST table is not in TPH Requester Extended
+ * Capability space, return 0. Otherwise return the ST Table Size + 1.
+ */
+static u16 get_st_table_size(struct pci_dev *pdev)
+{
+ u32 reg;
+ u32 loc;
+
+ /* Check ST table location first */
+ loc = get_st_table_loc(pdev);
+
+ /* Convert loc to match with PCI_TPH_LOC_* defined in pci_regs.h */
+ loc = FIELD_PREP(PCI_TPH_CAP_LOC_MASK, loc);
+ if (loc != PCI_TPH_LOC_CAP)
+ return 0;
+
+ pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, &reg);
+
+ return FIELD_GET(PCI_TPH_CAP_ST_MASK, reg) + 1;
+}
+
/* Return device's Root Port completer capability */
static u8 get_rp_completer_type(struct pci_dev *pdev)
{
@@ -43,6 +204,170 @@ static u8 get_rp_completer_type(struct pci_dev *pdev)
return FIELD_GET(PCI_EXP_DEVCAP2_TPH_COMP_MASK, reg);
}
+/* Write ST to MSI-X vector control reg - Return 0 if OK, otherwise -errno */
+static int write_tag_to_msix(struct pci_dev *pdev, int msix_idx, u16 tag)
+{
+#ifdef CONFIG_PCI_MSI
+ struct msi_desc *msi_desc = NULL;
+ void __iomem *vec_ctrl;
+ u32 val;
+ int err = 0;
+
+ msi_lock_descs(&pdev->dev);
+
+ /* Find the msi_desc entry with matching msix_idx */
+ msi_for_each_desc(msi_desc, &pdev->dev, MSI_DESC_ASSOCIATED) {
+ if (msi_desc->msi_index == msix_idx)
+ break;
+ }
+
+ if (!msi_desc) {
+ err = -ENXIO;
+ goto err_out;
+ }
+
+ /* Get the vector control register (offset 0xc) pointed by msix_idx */
+ vec_ctrl = pdev->msix_base + msix_idx * PCI_MSIX_ENTRY_SIZE;
+ vec_ctrl += PCI_MSIX_ENTRY_VECTOR_CTRL;
+
+ val = readl(vec_ctrl);
+ val &= ~PCI_MSIX_ENTRY_CTRL_ST;
+ val |= FIELD_PREP(PCI_MSIX_ENTRY_CTRL_ST, tag);
+ writel(val, vec_ctrl);
+
+ /* Read back to flush the update */
+ val = readl(vec_ctrl);
+
+err_out:
+ msi_unlock_descs(&pdev->dev);
+ return err;
+#else
+ return -ENODEV;
+#endif
+}
+
+/* Write tag to ST table - Return 0 if OK, otherwise -errno */
+static int write_tag_to_st_table(struct pci_dev *pdev, int index, u16 tag)
+{
+ int st_table_size;
+ int offset;
+
+ /* Check if index is out of bound */
+ st_table_size = get_st_table_size(pdev);
+ if (index >= st_table_size)
+ return -ENXIO;
+
+ offset = pdev->tph_cap + PCI_TPH_BASE_SIZEOF + index * sizeof(u16);
+
+ return pci_write_config_word(pdev, offset, tag);
+}
+
+/**
+ * pcie_tph_get_cpu_st() - Retrieve Steering Tag for a target memory associated
+ * with a specific CPU
+ * @pdev: PCI device
+ * @mem_type: target memory type (volatile or persistent RAM)
+ * @cpu_uid: associated CPU id
+ * @tag: Steering Tag to be returned
+ *
+ * This function returns the Steering Tag for a target memory that is
+ * associated with a specific CPU as indicated by cpu_uid.
+ *
+ * Returns: 0 if success, otherwise negative value (-errno)
+ */
+int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type mem_type,
+ unsigned int cpu_uid, u16 *tag)
+{
+#ifdef CONFIG_ACPI
+ struct pci_dev *rp;
+ acpi_handle rp_acpi_handle;
+ union st_info info;
+
+ rp = pcie_find_root_port(pdev);
+ if (!rp || !rp->bus || !rp->bus->bridge)
+ return -ENODEV;
+
+ rp_acpi_handle = ACPI_HANDLE(rp->bus->bridge);
+
+ if (tph_invoke_dsm(rp_acpi_handle, cpu_uid, &info) != AE_OK) {
+ *tag = 0;
+ return -EINVAL;
+ }
+
+ *tag = tph_extract_tag(mem_type, pdev->tph_req_type, &info);
+
+ pci_dbg(pdev, "get steering tag: mem_type=%s, cpu_uid=%d, tag=%#04x\n",
+ (mem_type == TPH_MEM_TYPE_VM) ? "volatile" : "persistent",
+ cpu_uid, *tag);
+
+ return 0;
+#else
+ return -ENODEV;
+#endif
+}
+EXPORT_SYMBOL(pcie_tph_get_cpu_st);
+
+/**
+ * pcie_tph_set_st_entry() - Set Steering Tag in the ST table entry
+ * @pdev: PCI device
+ * @index: ST table entry index
+ * @tag: Steering Tag to be written
+ *
+ * This function will figure out the proper location of ST table, either in the
+ * MSI-X table or in the TPH Extended Capability space, and write the Steering
+ * Tag into the ST entry pointed by index.
+ *
+ * Returns: 0 if success, otherwise negative value (-errno)
+ */
+int pcie_tph_set_st_entry(struct pci_dev *pdev, unsigned int index, u16 tag)
+{
+ u32 loc;
+ int err = 0;
+
+ if (!pdev->tph_cap)
+ return -EINVAL;
+
+ if (!pdev->tph_enabled)
+ return -EINVAL;
+
+ /* No need to write tag if device is in "No ST Mode" */
+ if (pdev->tph_mode == PCI_TPH_ST_NS_MODE)
+ return 0;
+
+ /* Disable TPH before updating ST to avoid potential instability as
+ * cautioned in PCIe r6.2, sec 6.17.3, "ST Modes of Operation"
+ */
+ set_ctrl_reg_req_en(pdev, PCI_TPH_REQ_DISABLE);
+
+ loc = get_st_table_loc(pdev);
+ /* Convert loc to match with PCI_TPH_LOC_* defined in pci_regs.h */
+ loc = FIELD_PREP(PCI_TPH_CAP_LOC_MASK, loc);
+
+ switch (loc) {
+ case PCI_TPH_LOC_MSIX:
+ err = write_tag_to_msix(pdev, index, tag);
+ break;
+ case PCI_TPH_LOC_CAP:
+ err = write_tag_to_st_table(pdev, index, tag);
+ break;
+ default:
+ err = -EINVAL;
+ }
+
+ if (err) {
+ pcie_disable_tph(pdev);
+ return err;
+ }
+
+ set_ctrl_reg_req_en(pdev, pdev->tph_mode);
+
+ pci_dbg(pdev, "set steering tag: %s table, index=%u, tag=%#04x\n",
+ (loc == PCI_TPH_LOC_MSIX) ? "MSI-X" : "ST", index, tag);
+
+ return 0;
+}
+EXPORT_SYMBOL(pcie_tph_set_st_entry);
+
/**
* pcie_disable_tph - Turn off TPH support for device
* @pdev: PCI device
@@ -140,6 +465,8 @@ EXPORT_SYMBOL(pcie_enable_tph);
void pci_restore_tph_state(struct pci_dev *pdev)
{
struct pci_cap_saved_state *save_state;
+ int num_entries, i, offset;
+ u16 *st_entry;
u32 *cap;
if (!pdev->tph_cap)
@@ -155,11 +482,21 @@ void pci_restore_tph_state(struct pci_dev *pdev)
/* Restore control register and all ST entries */
cap = &save_state->cap.data[0];
pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, *cap++);
+ st_entry = (u16 *)cap;
+ offset = PCI_TPH_BASE_SIZEOF;
+ num_entries = get_st_table_size(pdev);
+ for (i = 0; i < num_entries; i++) {
+ pci_write_config_word(pdev, pdev->tph_cap + offset,
+ *st_entry++);
+ offset += sizeof(u16);
+ }
}
void pci_save_tph_state(struct pci_dev *pdev)
{
struct pci_cap_saved_state *save_state;
+ int num_entries, i, offset;
+ u16 *st_entry;
u32 *cap;
if (!pdev->tph_cap)
@@ -175,6 +512,16 @@ void pci_save_tph_state(struct pci_dev *pdev)
/* Save control register */
cap = &save_state->cap.data[0];
pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, cap++);
+
+ /* Save all ST entries in extended capability structure */
+ st_entry = (u16 *)cap;
+ offset = PCI_TPH_BASE_SIZEOF;
+ num_entries = get_st_table_size(pdev);
+ for (i = 0; i < num_entries; i++) {
+ pci_read_config_word(pdev, pdev->tph_cap + offset,
+ st_entry++);
+ offset += sizeof(u16);
+ }
}
void pci_no_tph(void)
@@ -186,12 +533,14 @@ void pci_no_tph(void)
void pci_tph_init(struct pci_dev *pdev)
{
+ int num_entries;
u32 save_size;
pdev->tph_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_TPH);
if (!pdev->tph_cap)
return;
- save_size = sizeof(u32);
+ num_entries = get_st_table_size(pdev);
+ save_size = sizeof(u32) + num_entries * sizeof(u16);
pci_add_ext_cap_save_buffer(pdev, PCI_EXT_CAP_ID_TPH, save_size);
}
diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
index 58654a334ffb..c3e806c13d64 100644
--- a/include/linux/pci-tph.h
+++ b/include/linux/pci-tph.h
@@ -9,10 +9,33 @@
#ifndef LINUX_PCI_TPH_H
#define LINUX_PCI_TPH_H
+/*
+ * According to the ECN for the PCI Firmware Spec, the Steering Tag can differ
+ * depending on the memory type: Volatile Memory or Persistent Memory. When a
+ * caller queries a target's Steering Tag, it must provide the target's
+ * tph_mem_type. ECN link: https://members.pcisig.com/wg/PCI-SIG/document/15470.
+ */
+enum tph_mem_type {
+ TPH_MEM_TYPE_VM, /* volatile memory */
+ TPH_MEM_TYPE_PM /* persistent memory */
+};
+
#ifdef CONFIG_PCIE_TPH
+int pcie_tph_set_st_entry(struct pci_dev *pdev,
+ unsigned int index, u16 tag);
+int pcie_tph_get_cpu_st(struct pci_dev *dev,
+ enum tph_mem_type mem_type,
+ unsigned int cpu_uid, u16 *tag);
void pcie_disable_tph(struct pci_dev *pdev);
int pcie_enable_tph(struct pci_dev *pdev, int mode);
#else
+static inline int pcie_tph_set_st_entry(struct pci_dev *pdev,
+ unsigned int index, u16 tag)
+{ return -EINVAL; }
+static inline int pcie_tph_get_cpu_st(struct pci_dev *dev,
+ enum tph_mem_type mem_type,
+ unsigned int cpu_uid, u16 *tag)
+{ return -EINVAL; }
static inline void pcie_disable_tph(struct pci_dev *pdev) { }
static inline int pcie_enable_tph(struct pci_dev *pdev, int mode)
{ return -EINVAL; }
--
2.46.0
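
Editorial illustration (not part of the patch): the save/restore hunks above
store one 32-bit control register followed by one 16-bit word per ST entry.
The following is a minimal userspace sketch of that layout only. The names
tph_save_size(), tph_pack() and tph_unpack() are hypothetical, not kernel API,
and the sketch uses memcpy() on a byte buffer instead of config-space accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes needed: one u32 control register plus one u16 per ST entry. */
static size_t tph_save_size(unsigned int num_entries)
{
	return sizeof(uint32_t) + (size_t)num_entries * sizeof(uint16_t);
}

/* Model of "save": control register first, then the ST entries in order. */
static void tph_pack(uint8_t *buf, uint32_t ctrl,
		     const uint16_t *st, unsigned int num_entries)
{
	assert(buf && (st || num_entries == 0));
	memcpy(buf, &ctrl, sizeof(ctrl));
	if (num_entries)
		memcpy(buf + sizeof(ctrl), st, num_entries * sizeof(*st));
}

/* Model of "restore": read the fields back in the same order. */
static void tph_unpack(const uint8_t *buf, uint32_t *ctrl,
		       uint16_t *st, unsigned int num_entries)
{
	assert(buf && ctrl && (st || num_entries == 0));
	memcpy(ctrl, buf, sizeof(*ctrl));
	if (num_entries)
		memcpy(st, buf + sizeof(*ctrl), num_entries * sizeof(*st));
}
```

The point of the sketch is that the buffer size and the save and restore
loops must agree on the entry count; in the patch all three take it from
get_st_table_size().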
* [PATCH V7 3/5] PCI/TPH: Add TPH documentation
2024-10-02 16:59 [PATCH V7 0/5] TPH and cache direct injection support Wei Huang
2024-10-02 16:59 ` [PATCH V7 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
2024-10-02 16:59 ` [PATCH V7 2/5] PCI/TPH: Add Steering Tag support Wei Huang
@ 2024-10-02 16:59 ` Wei Huang
2024-10-02 16:59 ` [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
` (2 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: Wei Huang @ 2024-10-02 16:59 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas, lukas, paul.e.luse, jing2.liu
Provide documentation for the TPH feature, including a description of the
"notph" kernel parameter and the API.
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
Documentation/PCI/index.rst | 1 +
Documentation/PCI/tph.rst | 132 +++++++++++++++++++++++++++
Documentation/driver-api/pci/pci.rst | 3 +
3 files changed, 136 insertions(+)
create mode 100644 Documentation/PCI/tph.rst
diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
index e73f84aebde3..5e7c4e6e726b 100644
--- a/Documentation/PCI/index.rst
+++ b/Documentation/PCI/index.rst
@@ -18,3 +18,4 @@ PCI Bus Subsystem
pcieaer-howto
endpoint/index
boot-interrupts
+ tph
diff --git a/Documentation/PCI/tph.rst b/Documentation/PCI/tph.rst
new file mode 100644
index 000000000000..e8993be64fd6
--- /dev/null
+++ b/Documentation/PCI/tph.rst
@@ -0,0 +1,132 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+
+===========
+TPH Support
+===========
+
+:Copyright: 2024 Advanced Micro Devices, Inc.
+:Authors: - Eric van Tassell <eric.vantassell@amd.com>
+ - Wei Huang <wei.huang2@amd.com>
+
+
+Overview
+========
+
+TPH (TLP Processing Hints) is a PCIe feature that allows endpoint devices
+to provide optimization hints for requests that target memory space.
+These hints, in a format called Steering Tags (STs), are embedded in the
+requester's TLP headers, enabling the system hardware, such as the Root
+Complex, to better manage platform resources for these requests.
+
+For example, on platforms with TPH-based direct data cache injection
+support, an endpoint device can include appropriate STs in its DMA
+traffic to specify which cache the data should be written to. This allows
+the CPU core to have a higher probability of getting data from cache,
+potentially improving performance and reducing latency in data
+processing.
+
+
+How to Use TPH
+==============
+
+TPH is presented as an optional extended capability in PCIe. The Linux
+kernel handles TPH discovery during boot, but it is up to the device
+driver to request TPH enablement if it is to be utilized. Once enabled,
+the driver uses the provided API to obtain the Steering Tag for the
+target memory and to program the ST into the device's ST table.
+
+Enable TPH support in Linux
+---------------------------
+
+To support TPH, the kernel must be built with the CONFIG_PCIE_TPH option
+enabled.
+
+Manage TPH
+----------
+
+To enable TPH for a device, use the following function::
+
+ int pcie_enable_tph(struct pci_dev *pdev, int mode);
+
+This function enables TPH support for a device with a specific ST mode.
+Currently supported modes include:
+
+ * PCI_TPH_ST_NS_MODE - No ST Mode
+ * PCI_TPH_ST_IV_MODE - Interrupt Vector Mode
+ * PCI_TPH_ST_DS_MODE - Device Specific Mode
+
+`pcie_enable_tph()` checks whether the requested mode is actually
+supported by the device before enabling. The device driver can figure out
+which TPH mode is supported and can be properly enabled based on the
+return value of `pcie_enable_tph()`.
+
+To disable TPH, use the following function::
+
+ void pcie_disable_tph(struct pci_dev *pdev);
+
+Manage ST
+---------
+
+Steering Tags are platform specific. The PCIe spec does not specify where
+STs come from. Instead, the PCI Firmware Specification defines an ACPI _DSM
+method (see the `Revised _DSM for Cache Locality TPH Features ECN
+<https://members.pcisig.com/wg/PCI-SIG/document/15470>`_) for retrieving
+STs for target memory with various properties. This is the method
+supported by this implementation.
+
+To retrieve a Steering Tag for a target memory associated with a specific
+CPU, use the following function::
+
+ int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type type,
+ unsigned int cpu_uid, u16 *tag);
+
+The `type` argument specifies the memory type of the target memory, either
+volatile or persistent. The `cpu_uid` argument specifies the CPU with which
+the memory is associated.
+
+After the ST value is retrieved, the device driver can use the following
+function to write the ST into the device::
+
+ int pcie_tph_set_st_entry(struct pci_dev *pdev, unsigned int index,
+ u16 tag);
+
+The `index` argument is the index of the ST table entry that the Steering
+Tag will be written into. `pcie_tph_set_st_entry()` determines the location
+of the ST table, either in the MSI-X table or in the TPH Extended
+Capability space, and writes the Steering Tag into the ST entry selected by
+the `index` argument.
+
+It is completely up to the driver to decide how to use these TPH
+functions. For example, a network device driver can use the TPH APIs above
+to update the Steering Tag when the interrupt affinity of an RX/TX queue
+changes. Here is sample code for an IRQ affinity notifier:
+
+.. code-block:: c
+
+ static void irq_affinity_notified(struct irq_affinity_notify *notify,
+ const cpumask_t *mask)
+ {
+ struct drv_irq *irq;
+ unsigned int cpu_id;
+ u16 tag;
+
+ irq = container_of(notify, struct drv_irq, affinity_notify);
+ cpumask_copy(irq->cpu_mask, mask);
+
+ /* Pick a right CPU as the target - here is just an example */
+ cpu_id = cpumask_first(irq->cpu_mask);
+
+ if (pcie_tph_get_cpu_st(irq->pdev, TPH_MEM_TYPE_VM, cpu_id,
+ &tag))
+ return;
+
+ if (pcie_tph_set_st_entry(irq->pdev, irq->msix_nr, tag))
+ return;
+ }
+
+Disable TPH system-wide
+-----------------------
+
+There is a kernel command line option available to control the TPH feature:
+ * "notph": TPH will be disabled for all endpoint devices.
diff --git a/Documentation/driver-api/pci/pci.rst b/Documentation/driver-api/pci/pci.rst
index aa40b1cc243b..59d86e827198 100644
--- a/Documentation/driver-api/pci/pci.rst
+++ b/Documentation/driver-api/pci/pci.rst
@@ -46,6 +46,9 @@ PCI Support Library
.. kernel-doc:: drivers/pci/pci-sysfs.c
:internal:
+.. kernel-doc:: drivers/pci/tph.c
+ :export:
+
PCI Hotplug Support Library
---------------------------
--
2.46.0
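
Editorial illustration (not part of the patch): the documentation above says
a driver can learn which ST mode works from the return value of
pcie_enable_tph(). The following is a minimal userspace sketch of that
"try modes in order of preference" pattern. Everything prefixed model_ or
MODEL_ is hypothetical, and the numeric mode values are assumptions made for
the sketch, not taken from pci_regs.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the PCI_TPH_ST_*_MODE constants; the values are assumed. */
enum model_st_mode {
	MODEL_ST_NS_MODE = 0,
	MODEL_ST_IV_MODE = 1,
	MODEL_ST_DS_MODE = 2,
};

/* Hypothetical device model: a bitmask of the ST modes it supports. */
struct model_dev {
	unsigned int supported;	/* bit n set => mode n is supported */
	int enabled_mode;	/* -1 while TPH is disabled */
};

/* Model of an enable call: 0 on success, -EINVAL if the mode is unsupported. */
static int model_enable_tph(struct model_dev *dev, int mode)
{
	assert(dev != NULL);
	if (mode < 0 || mode > MODEL_ST_DS_MODE ||
	    !(dev->supported & (1u << mode)))
		return -EINVAL;
	dev->enabled_mode = mode;
	return 0;
}

/* Try the preferred modes in order; return the enabled mode or -EINVAL. */
static int model_enable_best_mode(struct model_dev *dev)
{
	static const int prefs[] = { MODEL_ST_IV_MODE, MODEL_ST_DS_MODE };
	size_t i;

	for (i = 0; i < sizeof(prefs) / sizeof(prefs[0]); i++)
		if (model_enable_tph(dev, prefs[i]) == 0)
			return prefs[i];
	return -EINVAL;
}
```

A real driver would call pcie_enable_tph() in place of model_enable_tph() and
remember the mode that succeeded, much as the bnxt patch in this series
records bp->tph_mode.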
* [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver
2024-10-02 16:59 [PATCH V7 0/5] TPH and cache direct injection support Wei Huang
` (2 preceding siblings ...)
2024-10-02 16:59 ` [PATCH V7 3/5] PCI/TPH: Add TPH documentation Wei Huang
@ 2024-10-02 16:59 ` Wei Huang
2024-10-08 13:39 ` Jakub Kicinski
2024-10-02 16:59 ` [PATCH V7 5/5] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings Wei Huang
2024-10-02 21:35 ` [PATCH V7 0/5] TPH and cache direct injection support Bjorn Helgaas
5 siblings, 1 reply; 17+ messages in thread
From: Wei Huang @ 2024-10-02 16:59 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas, lukas, paul.e.luse, jing2.liu
From: Manoj Panicker <manoj.panicker2@amd.com>
Add TPH support to the Broadcom BNXT device driver. This allows the
driver to utilize TPH functions for retrieving and configuring Steering
Tags when changing interrupt affinity. With compatible NIC firmware,
network traffic will be tagged correctly with Steering Tags, leading to
significant memory bandwidth savings and other benefits as demonstrated
by real network benchmarks on TPH-capable platforms.
Co-developed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Manoj Panicker <manoj.panicker2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 83 +++++++++++++++++++++++
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 7 ++
net/core/netdev_rx_queue.c | 1 +
3 files changed, 91 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 6e422e24750a..23ad2b6e70c7 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -55,6 +55,8 @@
#include <net/page_pool/helpers.h>
#include <linux/align.h>
#include <net/netdev_queues.h>
+#include <net/netdev_rx_queue.h>
+#include <linux/pci-tph.h>
#include "bnxt_hsi.h"
#include "bnxt.h"
@@ -10865,6 +10867,61 @@ int bnxt_reserve_rings(struct bnxt *bp, bool irq_re_init)
return 0;
}
+static void __bnxt_irq_affinity_notify(struct irq_affinity_notify *notify,
+ const cpumask_t *mask)
+{
+ struct bnxt_irq *irq;
+ u16 tag;
+ int err;
+
+ irq = container_of(notify, struct bnxt_irq, affinity_notify);
+ cpumask_copy(irq->cpu_mask, mask);
+
+ if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
+ cpumask_first(irq->cpu_mask), &tag))
+ return;
+
+ if (pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag))
+ return;
+
+ if (netif_running(irq->bp->dev)) {
+ rtnl_lock();
+ err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr);
+ if (err)
+ netdev_err(irq->bp->dev,
+ "rx queue restart failed: err=%d\n", err);
+ rtnl_unlock();
+ }
+}
+
+static void __bnxt_irq_affinity_release(struct kref __always_unused *ref)
+{
+}
+
+static void bnxt_release_irq_notifier(struct bnxt_irq *irq)
+{
+ irq_set_affinity_notifier(irq->vector, NULL);
+}
+
+static void bnxt_register_irq_notifier(struct bnxt *bp, struct bnxt_irq *irq)
+{
+ struct irq_affinity_notify *notify;
+
+ irq->bp = bp;
+
+ /* Nothing to do if TPH is not enabled */
+ if (!bp->tph_mode)
+ return;
+
+ /* Register IRQ affinity notifier */
+ notify = &irq->affinity_notify;
+ notify->irq = irq->vector;
+ notify->notify = __bnxt_irq_affinity_notify;
+ notify->release = __bnxt_irq_affinity_release;
+
+ irq_set_affinity_notifier(irq->vector, notify);
+}
+
static void bnxt_free_irq(struct bnxt *bp)
{
struct bnxt_irq *irq;
@@ -10887,11 +10944,18 @@ static void bnxt_free_irq(struct bnxt *bp)
free_cpumask_var(irq->cpu_mask);
irq->have_cpumask = 0;
}
+
+ bnxt_release_irq_notifier(irq);
+
free_irq(irq->vector, bp->bnapi[i]);
}
irq->requested = 0;
}
+
+ /* Disable TPH support */
+ pcie_disable_tph(bp->pdev);
+ bp->tph_mode = 0;
}
static int bnxt_request_irq(struct bnxt *bp)
@@ -10911,6 +10975,12 @@ static int bnxt_request_irq(struct bnxt *bp)
#ifdef CONFIG_RFS_ACCEL
rmap = bp->dev->rx_cpu_rmap;
#endif
+
+ /* Enable TPH support as part of IRQ request */
+ rc = pcie_enable_tph(bp->pdev, PCI_TPH_ST_IV_MODE);
+ if (!rc)
+ bp->tph_mode = PCI_TPH_ST_IV_MODE;
+
for (i = 0, j = 0; i < bp->cp_nr_rings; i++) {
int map_idx = bnxt_cp_num_to_irq_num(bp, i);
struct bnxt_irq *irq = &bp->irq_tbl[map_idx];
@@ -10934,8 +11004,11 @@ static int bnxt_request_irq(struct bnxt *bp)
if (zalloc_cpumask_var(&irq->cpu_mask, GFP_KERNEL)) {
int numa_node = dev_to_node(&bp->pdev->dev);
+ u16 tag;
irq->have_cpumask = 1;
+ irq->msix_nr = map_idx;
+ irq->ring_nr = i;
cpumask_set_cpu(cpumask_local_spread(i, numa_node),
irq->cpu_mask);
rc = irq_set_affinity_hint(irq->vector, irq->cpu_mask);
@@ -10945,6 +11018,16 @@ static int bnxt_request_irq(struct bnxt *bp)
irq->vector);
break;
}
+
+ bnxt_register_irq_notifier(bp, irq);
+
+ /* Init ST table entry */
+ if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
+ cpumask_first(irq->cpu_mask),
+ &tag))
+ continue;
+
+ pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag);
}
}
return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 69231e85140b..641d25646367 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1227,6 +1227,11 @@ struct bnxt_irq {
u8 have_cpumask:1;
char name[IFNAMSIZ + BNXT_IRQ_NAME_EXTRA];
cpumask_var_t cpu_mask;
+
+ struct bnxt *bp;
+ int msix_nr;
+ int ring_nr;
+ struct irq_affinity_notify affinity_notify;
};
#define HWRM_RING_ALLOC_TX 0x1
@@ -2183,6 +2188,8 @@ struct bnxt {
struct net_device *dev;
struct pci_dev *pdev;
+ u8 tph_mode;
+
atomic_t intr_sem;
u32 flags;
diff --git a/net/core/netdev_rx_queue.c b/net/core/netdev_rx_queue.c
index e217a5838c87..10e95d7b6892 100644
--- a/net/core/netdev_rx_queue.c
+++ b/net/core/netdev_rx_queue.c
@@ -79,3 +79,4 @@ int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq_idx)
return err;
}
+EXPORT_SYMBOL_GPL(netdev_rx_queue_restart);
--
2.46.0
* [PATCH V7 5/5] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings
2024-10-02 16:59 [PATCH V7 0/5] TPH and cache direct injection support Wei Huang
` (3 preceding siblings ...)
2024-10-02 16:59 ` [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
@ 2024-10-02 16:59 ` Wei Huang
2024-10-02 21:35 ` [PATCH V7 0/5] TPH and cache direct injection support Bjorn Helgaas
5 siblings, 0 replies; 17+ messages in thread
From: Wei Huang @ 2024-10-02 16:59 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas, lukas, paul.e.luse, jing2.liu
From: Michael Chan <michael.chan@broadcom.com>
Newer firmware can use the NQ ring ID associated with each RX/RX AGG
ring to enable PCIe Steering Tags. When allocating RX/RX AGG rings,
pass along the NQ ring ID for the firmware to use. This information helps
optimize DMA writes by directing them to the cache closer to the CPU
consuming the data, potentially improving the processing speed. This
change is backward-compatible with older firmware, which will simply
disregard the information.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 23ad2b6e70c7..a35207931d7d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6811,10 +6811,12 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
/* Association of rx ring with stats context */
grp_info = &bp->grp_info[ring->grp_idx];
+ req->nq_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
req->rx_buf_size = cpu_to_le16(bp->rx_buf_use_size);
req->stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
req->enables |= cpu_to_le32(
- RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
+ RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID |
+ RING_ALLOC_REQ_ENABLES_NQ_RING_ID_VALID);
if (NET_IP_ALIGN == 2)
flags = RING_ALLOC_REQ_FLAGS_RX_SOP_PAD;
req->flags = cpu_to_le16(flags);
@@ -6826,11 +6828,13 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
/* Association of agg ring with rx ring */
grp_info = &bp->grp_info[ring->grp_idx];
req->rx_ring_id = cpu_to_le16(grp_info->rx_fw_ring_id);
+ req->nq_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
req->rx_buf_size = cpu_to_le16(BNXT_RX_PAGE_SIZE);
req->stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
req->enables |= cpu_to_le32(
RING_ALLOC_REQ_ENABLES_RX_RING_ID_VALID |
- RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
+ RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID |
+ RING_ALLOC_REQ_ENABLES_NQ_RING_ID_VALID);
} else {
req->ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
}
--
2.46.0
* Re: [PATCH V7 0/5] TPH and cache direct injection support
2024-10-02 16:59 [PATCH V7 0/5] TPH and cache direct injection support Wei Huang
` (4 preceding siblings ...)
2024-10-02 16:59 ` [PATCH V7 5/5] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings Wei Huang
@ 2024-10-02 21:35 ` Bjorn Helgaas
2024-10-02 22:08 ` Michael Chan
2024-10-16 21:31 ` Bjorn Helgaas
5 siblings, 2 replies; 17+ messages in thread
From: Bjorn Helgaas @ 2024-10-02 21:35 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas, lukas, paul.e.luse, jing2.liu
On Wed, Oct 02, 2024 at 11:59:49AM -0500, Wei Huang wrote:
> Hi All,
>
> TPH (TLP Processing Hints) is a PCIe feature that allows endpoint
> devices to provide optimization hints for requests that target memory
> space. These hints, in a format called steering tag (ST), are provided
> in the requester's TLP headers and allow the system hardware, including
> the Root Complex, to optimize the utilization of platform resources
> for the requests.
>
> Upcoming AMD hardware implements a new Cache Injection feature that
> leverages TPH. Cache Injection allows PCIe endpoints to inject I/O
> Coherent DMA writes directly into an L2 within the CCX (core complex)
> closest to the CPU core that will consume it. This technology is aimed
> at applications requiring high performance and low latency, such as
> networking and storage applications.
>
> This series introduces generic TPH support in Linux, allowing STs to be
> retrieved and used by PCIe endpoint drivers as needed. As a
> demonstration, it includes an example usage in the Broadcom BNXT driver.
> When running on Broadcom NICs with the appropriate firmware, it shows
> substantial memory bandwidth savings and better network bandwidth using
> real-world benchmarks. This solution is vendor-neutral and implemented
> based on industry standards (PCIe Spec and PCI FW Spec).
>
> V6->V7:
> * Rebase on top of the latest pci/main (6.12-rc1)
> * Fix compilation warning/error on clang-18 with w=1 (test robot)
> * Revise commit messages for Patch #2, #4, and #5 (Bjorn)
> * Add more _DSM method description for reference in Patch #2 (Bjorn)
> * Remove "default n" in Kconfig (Lukas)
>
> V5->V6:
> * Rebase on top of pci/main (tag: pci-v6.12-changes)
> * Fix spellings and FIELD_PREP/bnxt.c compilation errors (Simon)
> * Move tph.c to drivers/pci directory (Lukas)
> * Remove CONFIG_ACPI dependency (Lukas)
> * Slightly re-arrange save/restore sequence (Lukas)
>
> V4->V5:
> * Rebase on top of net-next/main tree (Broadcom)
> * Remove TPH mode query and TPH enabled checking functions (Bjorn)
> * Remove "nostmode" kernel parameter (Bjorn)
> * Add "notph" kernel parameter support (Bjorn)
> * Add back TPH documentation (Bjorn)
> * Change TPH register namings (Bjorn)
> * Squash TPH enable/disable/save/restore funcs as a single patch (Bjorn)
> * Squash ST get_st/set_st funcs as a single patch (Bjorn)
> * Replace nic_open/close with netdev_rx_queue_restart() (Jakub, Broadcom)
>
> V3->V4:
> * Rebase on top of the latest pci/next tree (tag: 6.11-rc1)
> * Add new API functions to query/enable/disable TPH support
> * Make pcie_tph_set_st() completely independent from pcie_tph_get_cpu_st()
> * Rewrite bnxt.c based on new APIs
> * Remove documentation for now due to constantly changing API
> * Remove pci=notph, but keep pci=nostmode with better flow (Bjorn)
> * Lots of code rewrite in tph.c & pci-tph.h with cleaner interface (Bjorn)
> * Add TPH save/restore support (Paul Luse and Lukas Wunner)
>
> V2->V3:
> * Rebase on top of pci/next tree (tag: pci-v6.11-changes)
> * Redefine PCI TPH registers (pci_regs.h) without breaking uapi
> * Fix commit subjects/messages for kernel options (Jonathan and Bjorn)
> * Break API functions into three individual patches for easy review
> * Rewrite lots of code in tph.c/tph.h based (Jonathan and Bjorn)
>
> V1->V2:
> * Rebase on top of pci.git/for-linus (6.10-rc1)
> * Address mismatched data types reported by Sparse (Sparse check passed)
> * Add pcie_tph_intr_vec_supported() for checking IRQ mode support
> * Skip bnxt affinity notifier registration if
> pcie_tph_intr_vec_supported()=false
> * Minor fixes in bnxt driver (i.e. warning messages)
>
> Manoj Panicker (1):
> bnxt_en: Add TPH support in BNXT driver
>
> Michael Chan (1):
> bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings
>
> Wei Huang (3):
> PCI: Add TLP Processing Hints (TPH) support
> PCI/TPH: Add Steering Tag support
> PCI/TPH: Add TPH documentation
>
> Documentation/PCI/index.rst | 1 +
> Documentation/PCI/tph.rst | 132 +++++
> .../admin-guide/kernel-parameters.txt | 4 +
> Documentation/driver-api/pci/pci.rst | 3 +
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 91 ++-
> drivers/net/ethernet/broadcom/bnxt/bnxt.h | 7 +
> drivers/pci/Kconfig | 9 +
> drivers/pci/Makefile | 1 +
> drivers/pci/pci.c | 4 +
> drivers/pci/pci.h | 12 +
> drivers/pci/probe.c | 1 +
> drivers/pci/tph.c | 546 ++++++++++++++++++
> include/linux/pci-tph.h | 44 ++
> include/linux/pci.h | 7 +
> include/uapi/linux/pci_regs.h | 37 +-
> net/core/netdev_rx_queue.c | 1 +
> 16 files changed, 890 insertions(+), 10 deletions(-)
> create mode 100644 Documentation/PCI/tph.rst
> create mode 100644 drivers/pci/tph.c
> create mode 100644 include/linux/pci-tph.h
I tentatively applied this on pci/tph for v6.13.
Not sure what you intend for the bnxt changes, since they depend on
the PCI core changes. I'm happy to merge them via PCI, given acks
from Michael and an overall network maintainer.
Alternatively they could wait another cycle, or I could make an
immutable branch, although I prefer to preserve the option to update
or remove things until the merge window.
Thanks very much; this looks like nice work!
Bjorn
* Re: [PATCH V7 0/5] TPH and cache direct injection support
2024-10-02 21:35 ` [PATCH V7 0/5] TPH and cache direct injection support Bjorn Helgaas
@ 2024-10-02 22:08 ` Michael Chan
2024-10-08 7:32 ` Paolo Abeni
2024-10-16 21:31 ` Bjorn Helgaas
1 sibling, 1 reply; 17+ messages in thread
From: Michael Chan @ 2024-10-02 22:08 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev,
Jonathan.Cameron, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, ajit.khaparde, somnath.kotur,
andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
vadim.fedorenko, horms, bagasdotme, bhelgaas, lukas, paul.e.luse,
jing2.liu
On Wed, Oct 2, 2024 at 2:35 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> I tentatively applied this on pci/tph for v6.13.
>
> Not sure what you intend for the bnxt changes, since they depend on
> the PCI core changes. I'm happy to merge them via PCI, given acks
> from Michael and an overall network maintainer.
The bnxt patch can go in through the PCI tree if Jakub agrees. Thanks.
* Re: [PATCH V7 0/5] TPH and cache direct injection support
2024-10-02 22:08 ` Michael Chan
@ 2024-10-08 7:32 ` Paolo Abeni
0 siblings, 0 replies; 17+ messages in thread
From: Paolo Abeni @ 2024-10-08 7:32 UTC (permalink / raw)
To: Michael Chan, Bjorn Helgaas
Cc: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev,
Jonathan.Cameron, corbet, davem, edumazet, kuba, alex.williamson,
gospo, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas, lukas, paul.e.luse, jing2.liu
On 10/3/24 00:08, Michael Chan wrote:
> On Wed, Oct 2, 2024 at 2:35 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>> I tentatively applied this on pci/tph for v6.13.
>>
>> Not sure what you intend for the bnxt changes, since they depend on
>> the PCI core changes. I'm happy to merge them via PCI, given acks
>> from Michael and an overall network maintainer.
>
> The bnxt patch can go in through the PCI tree if Jakub agrees. Thanks.
I guess the most critical point is to avoid complex conflicts at merge
window time. My understanding is that the conventional way to avoid such
issues would be sharing a stable branch somewhere with this change on top,
which both the netdev and the PCI tree could pull from.
Cheers,
Paolo
* Re: [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver
2024-10-02 16:59 ` [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
@ 2024-10-08 13:39 ` Jakub Kicinski
2024-10-11 18:35 ` Panicker, Manoj
0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2024-10-08 13:39 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
helgaas, corbet, davem, edumazet, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas, lukas, paul.e.luse, jing2.liu
On Wed, 2 Oct 2024 11:59:53 -0500 Wei Huang wrote:
> + if (netif_running(irq->bp->dev)) {
> + rtnl_lock();
> + err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr);
> + if (err)
> + netdev_err(irq->bp->dev,
> + "rx queue restart failed: err=%d\n", err);
> + rtnl_unlock();
> + }
> +}
> +
> +static void __bnxt_irq_affinity_release(struct kref __always_unused *ref)
> +{
> +}
An empty release function is always a red flag.
How is the reference counting used here?
Is irq_set_affinity_notifier() not synchronous?
Otherwise the rtnl_lock() should probably cover the running check.
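
Editorial illustration (not part of the original message): the last point
above asks for the running check to be made under the lock rather than
before it. The following is a minimal single-threaded userspace sketch of
that ordering only. All model_ names are hypothetical; the lock is a plain
counter, so the sketch shows the ordering but does not demonstrate a real
race.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the netdev plus the RTNL lock. */
struct model_netdev {
	bool running;
	int lock_depth;	/* 1 while the model lock is held */
	int restarts;	/* number of queue restarts performed */
};

static void model_lock(struct model_netdev *dev)
{
	assert(dev->lock_depth == 0);
	dev->lock_depth++;
}

static void model_unlock(struct model_netdev *dev)
{
	assert(dev->lock_depth == 1);
	dev->lock_depth--;
}

/* The restart requires the lock to be held by the caller. */
static int model_queue_restart(struct model_netdev *dev)
{
	assert(dev->lock_depth == 1);
	dev->restarts++;
	return 0;
}

/* Take the lock first, then test the running state under it. */
static int model_affinity_changed(struct model_netdev *dev)
{
	int err = 0;

	model_lock(dev);
	if (dev->running)
		err = model_queue_restart(dev);
	model_unlock(dev);
	return err;
}
```

With this ordering the device cannot be observed as running and then stopped
before the restart begins, provided every path that changes the running
state takes the same lock.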
* RE: [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver
2024-10-08 13:39 ` Jakub Kicinski
@ 2024-10-11 18:35 ` Panicker, Manoj
2024-10-15 19:50 ` Wei Huang
0 siblings, 1 reply; 17+ messages in thread
From: Panicker, Manoj @ 2024-10-11 18:35 UTC (permalink / raw)
To: Jakub Kicinski, Huang2, Wei
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, netdev@vger.kernel.org,
Jonathan.Cameron@Huawei.com, helgaas@kernel.org, corbet@lwn.net,
davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
alex.williamson@redhat.com, gospo@broadcom.com,
michael.chan@broadcom.com, ajit.khaparde@broadcom.com,
somnath.kotur@broadcom.com, andrew.gospodarek@broadcom.com,
VanTassell, Eric, vadim.fedorenko@linux.dev, horms@kernel.org,
bagasdotme@gmail.com, bhelgaas@google.com, lukas@wunner.de,
paul.e.luse@intel.com, jing2.liu@intel.com
Hello Jakub,
Thanks for the feedback. We'll update the patch so that the netif_running() check is also covered by rtnl_lock().
About the empty function: there are no actions to perform when the driver's notify.release function is called. The IRQ notifier is registered only once, and there are no older IRQ notifiers for the driver that could get called back. We also followed the precedent of other drivers in the kernel tree that use the same mechanism.
See code:
From drivers/net/ethernet/intel/i40e/i40e_main.c
static void i40e_irq_affinity_release(struct kref *ref) {}
From drivers/net/ethernet/intel/iavf/iavf_main.c
static void iavf_irq_affinity_release(struct kref *ref) {}
From drivers/net/ethernet/fungible/funeth/funeth_main.c
static void fun_irq_aff_release(struct kref __always_unused *ref)
{
}
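
Editorial illustration (not part of the original message): the discussion
here turns on when a reference-counted object's release callback runs. The
following is a minimal userspace model of a refcount with a release
callback, embedded in a longer-lived per-IRQ structure. All model_ names are
hypothetical and this is not the kernel's kref implementation. The sketch
does not settle whether an empty release is safe in bnxt; that depends on
the enclosing structure outliving every reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal refcount. */
struct model_ref {
	int count;
};

static void model_ref_init(struct model_ref *r)
{
	r->count = 1;
}

static void model_ref_get(struct model_ref *r)
{
	r->count++;
}

/* Drop one reference; call release() on the last one. Returns 1 if released. */
static int model_ref_put(struct model_ref *r,
			 void (*release)(struct model_ref *))
{
	assert(r != NULL && release != NULL && r->count > 0);
	if (--r->count == 0) {
		release(r);
		return 1;
	}
	return 0;
}

/* The refcounted notifier is embedded in a longer-lived structure. */
struct model_irq {
	struct model_ref notify_ref;
	int release_calls;
};

/* Nothing to free here: the owner frees the enclosing struct. Count calls. */
static void model_irq_release(struct model_ref *r)
{
	struct model_irq *irq = (struct model_irq *)
		((char *)r - offsetof(struct model_irq, notify_ref));

	irq->release_calls++;
}
```

The release callback runs only when the last reference is dropped, which may
be later than the call that unregisters the notifier if another holder still
has a reference at that point.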
Thanks
Manoj
-----Original Message-----
From: Jakub Kicinski <kuba@kernel.org>
Sent: Tuesday, October 8, 2024 6:40 AM
To: Huang2, Wei <Wei.Huang2@amd.com>
Cc: linux-pci@vger.kernel.org; linux-kernel@vger.kernel.org; linux-doc@vger.kernel.org; netdev@vger.kernel.org; Jonathan.Cameron@Huawei.com; helgaas@kernel.org; corbet@lwn.net; davem@davemloft.net; edumazet@google.com; pabeni@redhat.com; alex.williamson@redhat.com; gospo@broadcom.com; michael.chan@broadcom.com; ajit.khaparde@broadcom.com; somnath.kotur@broadcom.com; andrew.gospodarek@broadcom.com; Panicker, Manoj <Manoj.Panicker2@amd.com>; VanTassell, Eric <Eric.VanTassell@amd.com>; vadim.fedorenko@linux.dev; horms@kernel.org; bagasdotme@gmail.com; bhelgaas@google.com; lukas@wunner.de; paul.e.luse@intel.com; jing2.liu@intel.com
Subject: Re: [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver
On Wed, 2 Oct 2024 11:59:53 -0500 Wei Huang wrote:
> + if (netif_running(irq->bp->dev)) {
> + rtnl_lock();
> + err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr);
> + if (err)
> + netdev_err(irq->bp->dev,
> + "rx queue restart failed: err=%d\n", err);
> + rtnl_unlock();
> + }
> +}
> +
> +static void __bnxt_irq_affinity_release(struct kref __always_unused
> +*ref) { }
An empty release function is always a red flag.
How is the reference counting used here?
Is irq_set_affinity_notifier() not synchronous?
Otherwise the rtnl_lock() should probably cover the running check.
* Re: [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver
2024-10-11 18:35 ` Panicker, Manoj
@ 2024-10-15 19:50 ` Wei Huang
2024-10-15 23:45 ` Jakub Kicinski
0 siblings, 1 reply; 17+ messages in thread
From: Wei Huang @ 2024-10-15 19:50 UTC (permalink / raw)
To: Panicker, Manoj, Jakub Kicinski
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, netdev@vger.kernel.org,
Jonathan.Cameron@Huawei.com, helgaas@kernel.org, corbet@lwn.net,
davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
alex.williamson@redhat.com, gospo@broadcom.com,
michael.chan@broadcom.com, ajit.khaparde@broadcom.com,
somnath.kotur@broadcom.com, andrew.gospodarek@broadcom.com,
VanTassell, Eric, vadim.fedorenko@linux.dev, horms@kernel.org,
bagasdotme@gmail.com, bhelgaas@google.com, lukas@wunner.de,
paul.e.luse@intel.com, jing2.liu@intel.com
[These questions are for both Jakub and Bjorn]
Any suggestions on how to proceed? I can send out a V8 patchset if Jakub
is OK with Manoj's solution. Or is only a new patch #4 needed, since the
rest are intact?
Thanks,
-Wei
On 10/11/24 13:35, Panicker, Manoj wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hello Jakub,
>
> Thanks for the feedback. We'll update the patch to cover the code under the rtnl_lock.
>
> About the empty function: there are no actions to perform when the driver's notify.release function is called. The IRQ notifier is registered only once, and there are no older IRQ notifiers for the driver that could get called back. We also followed the precedent of other drivers in the kernel tree that use the same mechanism.
>
> See code:
> From drivers/net/ethernet/intel/i40e/i40e_main.c
> static void i40e_irq_affinity_release(struct kref *ref) {}
>
>
> From drivers/net/ethernet/intel/iavf/iavf_main.c
> static void iavf_irq_affinity_release(struct kref *ref) {}
>
>
> From drivers/net/ethernet/fungible/funeth/funeth_main.c
> static void fun_irq_aff_release(struct kref __always_unused *ref)
> {
> }
>
>
> Thanks
> Manoj
>
> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, October 8, 2024 6:40 AM
> To: Huang2, Wei <Wei.Huang2@amd.com>
> Cc: linux-pci@vger.kernel.org; linux-kernel@vger.kernel.org; linux-doc@vger.kernel.org; netdev@vger.kernel.org; Jonathan.Cameron@Huawei.com; helgaas@kernel.org; corbet@lwn.net; davem@davemloft.net; edumazet@google.com; pabeni@redhat.com; alex.williamson@redhat.com; gospo@broadcom.com; michael.chan@broadcom.com; ajit.khaparde@broadcom.com; somnath.kotur@broadcom.com; andrew.gospodarek@broadcom.com; Panicker, Manoj <Manoj.Panicker2@amd.com>; VanTassell, Eric <Eric.VanTassell@amd.com>; vadim.fedorenko@linux.dev; horms@kernel.org; bagasdotme@gmail.com; bhelgaas@google.com; lukas@wunner.de; paul.e.luse@intel.com; jing2.liu@intel.com
> Subject: Re: [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver
>
> On Wed, 2 Oct 2024 11:59:53 -0500 Wei Huang wrote:
>> + if (netif_running(irq->bp->dev)) {
>> + rtnl_lock();
>> + err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr);
>> + if (err)
>> + netdev_err(irq->bp->dev,
>> + "rx queue restart failed: err=%d\n", err);
>> + rtnl_unlock();
>> + }
>> +}
>> +
>> +static void __bnxt_irq_affinity_release(struct kref __always_unused
>> +*ref) { }
>
> An empty release function is always a red flag.
> How is the reference counting used here?
> Is irq_set_affinity_notifier() not synchronous?
> Otherwise the rtnl_lock() should probably cover the running check.
* Re: [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver
2024-10-15 19:50 ` Wei Huang
@ 2024-10-15 23:45 ` Jakub Kicinski
0 siblings, 0 replies; 17+ messages in thread
From: Jakub Kicinski @ 2024-10-15 23:45 UTC (permalink / raw)
To: Wei Huang
Cc: Panicker, Manoj, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
netdev@vger.kernel.org, Jonathan.Cameron@Huawei.com,
helgaas@kernel.org, corbet@lwn.net, davem@davemloft.net,
edumazet@google.com, pabeni@redhat.com,
alex.williamson@redhat.com, gospo@broadcom.com,
michael.chan@broadcom.com, ajit.khaparde@broadcom.com,
somnath.kotur@broadcom.com, andrew.gospodarek@broadcom.com,
VanTassell, Eric, vadim.fedorenko@linux.dev, horms@kernel.org,
bagasdotme@gmail.com, bhelgaas@google.com, lukas@wunner.de,
paul.e.luse@intel.com, jing2.liu@intel.com
On Tue, 15 Oct 2024 14:50:39 -0500 Wei Huang wrote:
> Any suggestions on how to proceed? I can send out a V8 patchset if Jakub
> is OK with Manoj's solution. Or is only a new patch #4 needed, since the
> rest are intact?
1) y'all need to stop top posting
2) Manoj's reply is AMD internal and I'm not an AMD employee
3) precedent in drivers means relatively little, existing code
can be buggy
* Re: [PATCH V7 0/5] TPH and cache direct injection support
2024-10-02 21:35 ` [PATCH V7 0/5] TPH and cache direct injection support Bjorn Helgaas
2024-10-02 22:08 ` Michael Chan
@ 2024-10-16 21:31 ` Bjorn Helgaas
1 sibling, 0 replies; 17+ messages in thread
From: Bjorn Helgaas @ 2024-10-16 21:31 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas, lukas, paul.e.luse, jing2.liu
On Wed, Oct 02, 2024 at 04:35:55PM -0500, Bjorn Helgaas wrote:
> On Wed, Oct 02, 2024 at 11:59:49AM -0500, Wei Huang wrote:
> > Hi All,
> >
> > TPH (TLP Processing Hints) is a PCIe feature that allows endpoint
> > devices to provide optimization hints for requests that target memory
> > space. These hints, in a format called steering tag (ST), are provided
> > in the requester's TLP headers and allow the system hardware, including
> > the Root Complex, to optimize the utilization of platform resources
> > for the requests.
> >
> > Upcoming AMD hardware implement a new Cache Injection feature that
> > leverages TPH. Cache Injection allows PCIe endpoints to inject I/O
> > Coherent DMA writes directly into an L2 within the CCX (core complex)
> > closest to the CPU core that will consume it. This technology is aimed
> > at applications requiring high performance and low latency, such as
> > networking and storage applications.
> >
> > This series introduces generic TPH support in Linux, allowing STs to be
> > retrieved and used by PCIe endpoint drivers as needed. As a
> > demonstration, it includes an example usage in the Broadcom BNXT driver.
> > When running on Broadcom NICs with the appropriate firmware, it shows
> > substantial memory bandwidth savings and better network bandwidth using
> > real-world benchmarks. This solution is vendor-neutral and implemented
> > based on industry standards (PCIe Spec and PCI FW Spec).
> >
> > V6->V7:
> > * Rebase on top of the latest pci/main (6.12-rc1)
> > * Fix compilation warning/error on clang-18 with w=1 (test robot)
> > * Revise commit messages for Patch #2, #4, and #5 (Bjorn)
> > * Add more _DSM method description for reference in Patch #2 (Bjorn)
> > * Remove "default n" in Kconfig (Lukas)
> >
> > V5->V6:
> > * Rebase on top of pci/main (tag: pci-v6.12-changes)
> > * Fix spellings and FIELD_PREP/bnxt.c compilation errors (Simon)
> > * Move tph.c to drivers/pci directory (Lukas)
> > * Remove CONFIG_ACPI dependency (Lukas)
> > * Slightly re-arrange save/restore sequence (Lukas)
> >
> > V4->V5:
> > * Rebase on top of net-next/main tree (Broadcom)
> > * Remove TPH mode query and TPH enabled checking functions (Bjorn)
> > * Remove "nostmode" kernel parameter (Bjorn)
> > * Add "notph" kernel parameter support (Bjorn)
> > * Add back TPH documentation (Bjorn)
> > * Change TPH register namings (Bjorn)
> > * Squash TPH enable/disable/save/restore funcs as a single patch (Bjorn)
> > * Squash ST get_st/set_st funcs as a single patch (Bjorn)
> > * Replace nic_open/close with netdev_rx_queue_restart() (Jakub, Broadcom)
> >
> > V3->V4:
> > * Rebase on top of the latest pci/next tree (tag: 6.11-rc1)
> > * Add new API functions to query/enable/disable TPH support
> > * Make pcie_tph_set_st() completely independent from pcie_tph_get_cpu_st()
> > * Rewrite bnxt.c based on new APIs
> > * Remove documentation for now due to constantly changing API
> > * Remove pci=notph, but keep pci=nostmode with better flow (Bjorn)
> > * Lots of code rewrite in tph.c & pci-tph.h with cleaner interface (Bjorn)
> > * Add TPH save/restore support (Paul Luse and Lukas Wunner)
> >
> > V2->V3:
> > * Rebase on top of pci/next tree (tag: pci-v6.11-changes)
> > * Redefine PCI TPH registers (pci_regs.h) without breaking uapi
> > * Fix commit subjects/messages for kernel options (Jonathan and Bjorn)
> > * Break API functions into three individual patches for easy review
> > * Rewrite lots of code in tph.c/tph.h based (Jonathan and Bjorn)
> >
> > V1->V2:
> > * Rebase on top of pci.git/for-linus (6.10-rc1)
> > * Address mismatched data types reported by Sparse (Sparse check passed)
> > * Add pcie_tph_intr_vec_supported() for checking IRQ mode support
> > * Skip bnxt affinity notifier registration if
> > pcie_tph_intr_vec_supported()=false
> > * Minor fixes in bnxt driver (i.e. warning messages)
> >
> > Manoj Panicker (1):
> > bnxt_en: Add TPH support in BNXT driver
> >
> > Michael Chan (1):
> > bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings
> >
> > Wei Huang (3):
> > PCI: Add TLP Processing Hints (TPH) support
> > PCI/TPH: Add Steering Tag support
> > PCI/TPH: Add TPH documentation
> >
> > Documentation/PCI/index.rst | 1 +
> > Documentation/PCI/tph.rst | 132 +++++
> > .../admin-guide/kernel-parameters.txt | 4 +
> > Documentation/driver-api/pci/pci.rst | 3 +
> > drivers/net/ethernet/broadcom/bnxt/bnxt.c | 91 ++-
> > drivers/net/ethernet/broadcom/bnxt/bnxt.h | 7 +
> > drivers/pci/Kconfig | 9 +
> > drivers/pci/Makefile | 1 +
> > drivers/pci/pci.c | 4 +
> > drivers/pci/pci.h | 12 +
> > drivers/pci/probe.c | 1 +
> > drivers/pci/tph.c | 546 ++++++++++++++++++
> > include/linux/pci-tph.h | 44 ++
> > include/linux/pci.h | 7 +
> > include/uapi/linux/pci_regs.h | 37 +-
> > net/core/netdev_rx_queue.c | 1 +
> > 16 files changed, 890 insertions(+), 10 deletions(-)
> > create mode 100644 Documentation/PCI/tph.rst
> > create mode 100644 drivers/pci/tph.c
> > create mode 100644 include/linux/pci-tph.h
>
> I tentatively applied this on pci/tph for v6.13.
>
> Not sure what you intend for the bnxt changes, since they depend on
> the PCI core changes. I'm happy to merge them via PCI, given acks
> from Michael and an overall network maintainer.
Given the ongoing discussion about the bnxt_en patches, I dropped
those, so the PCI tree pci/tph branch now contains only these:
e045e5c1c706 ("PCI/TPH: Add TPH documentation")
d2e8a34876ce ("PCI/TPH: Add Steering Tag support")
f69767a1ada3 ("PCI: Add TLP Processing Hints (TPH) support")
This is headed for v6.13, but the branch should not be considered
immutable, and it may be merged during the merge window either before
or after the netdev tree.
Bjorn
* Re: [PATCH V7 2/5] PCI/TPH: Add Steering Tag support
2024-10-02 16:59 ` [PATCH V7 2/5] PCI/TPH: Add Steering Tag support Wei Huang
@ 2025-02-04 18:33 ` Robin Murphy
2025-02-04 20:18 ` Wei Huang
0 siblings, 1 reply; 17+ messages in thread
From: Robin Murphy @ 2025-02-04 18:33 UTC (permalink / raw)
To: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas,
lukas, paul.e.luse, jing2.liu
On 2024-10-02 5:59 pm, Wei Huang wrote:
[...]
> +/**
> + * pcie_tph_set_st_entry() - Set Steering Tag in the ST table entry
> + * @pdev: PCI device
> + * @index: ST table entry index
> + * @tag: Steering Tag to be written
> + *
> + * This function will figure out the proper location of ST table, either in the
> + * MSI-X table or in the TPH Extended Capability space, and write the Steering
> + * Tag into the ST entry pointed by index.
> + *
> + * Returns: 0 if success, otherwise negative value (-errno)
> + */
> +int pcie_tph_set_st_entry(struct pci_dev *pdev, unsigned int index, u16 tag)
> +{
> + u32 loc;
> + int err = 0;
> +
> + if (!pdev->tph_cap)
> + return -EINVAL;
> +
> + if (!pdev->tph_enabled)
> + return -EINVAL;
> +
> + /* No need to write tag if device is in "No ST Mode" */
> + if (pdev->tph_mode == PCI_TPH_ST_NS_MODE)
> + return 0;
> +
> + /* Disable TPH before updating ST to avoid potential instability as
> + * cautioned in PCIe r6.2, sec 6.17.3, "ST Modes of Operation"
> + */
> + set_ctrl_reg_req_en(pdev, PCI_TPH_REQ_DISABLE);
> +
> + loc = get_st_table_loc(pdev);
> + /* Convert loc to match with PCI_TPH_LOC_* defined in pci_regs.h */
> + loc = FIELD_PREP(PCI_TPH_CAP_LOC_MASK, loc);
> +
> + switch (loc) {
> + case PCI_TPH_LOC_MSIX:
> + err = write_tag_to_msix(pdev, index, tag);
> + break;
> + case PCI_TPH_LOC_CAP:
> + err = write_tag_to_st_table(pdev, index, tag);
> + break;
> + default:
> + err = -EINVAL;
> + }
> +
> + if (err) {
> + pcie_disable_tph(pdev);
> + return err;
> + }
> +
> + set_ctrl_reg_req_en(pdev, pdev->tph_mode);
Just looking at this code in mainline, and I don't trust my
understanding quite enough to send a patch myself, but doesn't this want
to be pdev->tph_req_type, rather than tph_mode?
Thanks,
Robin.
> +
> + pci_dbg(pdev, "set steering tag: %s table, index=%d, tag=%#04x\n",
> + (loc == PCI_TPH_LOC_MSIX) ? "MSI-X" : "ST", index, tag);
> +
> + return 0;
> +}
* Re: [PATCH V7 2/5] PCI/TPH: Add Steering Tag support
2025-02-04 18:33 ` Robin Murphy
@ 2025-02-04 20:18 ` Wei Huang
2025-02-05 12:57 ` Robin Murphy
0 siblings, 1 reply; 17+ messages in thread
From: Wei Huang @ 2025-02-04 20:18 UTC (permalink / raw)
To: Robin Murphy, linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas,
lukas, paul.e.luse, jing2.liu
On 2/4/25 12:33 PM, Robin Murphy wrote:
> On 2024-10-02 5:59 pm, Wei Huang wrote:
> [...]
>> +
>> + if (err) {
>> + pcie_disable_tph(pdev);
>> + return err;
>> + }
>> +
>> + set_ctrl_reg_req_en(pdev, pdev->tph_mode);
>
> Just looking at this code in mainline, and I don't trust my
> understanding quite enough to send a patch myself, but doesn't this want
> to be pdev->tph_req_type, rather than tph_mode?
Yeah, you are right - this is supposed to be pdev->tph_req_type instead
of tph_mode. We disable TPH first by clearing (zeroing) the "TPH Requester
Enable" field and need to set it back using tph_req_type.
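To make the difference concrete, here is a small user-space model of the disable/restore sequence. It is a sketch under assumptions, not the kernel code: `struct fake_tph` and the helper names are hypothetical, and the mask and field values are what I believe include/uapi/linux/pci_regs.h defines, so please check them against the header.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout and values assumed from include/uapi/linux/pci_regs.h. */
#define TPH_CTRL_MODE_SEL_MASK 0x00000007u /* ST Mode Select */
#define TPH_CTRL_REQ_EN_MASK   0x00000300u /* TPH Requester Enable */
#define TPH_CTRL_REQ_EN_SHIFT  8

#define TPH_ST_IV_MODE   0x1u /* Interrupt Vector Mode */
#define TPH_ST_DS_MODE   0x2u /* Device Specific Mode */
#define TPH_REQ_DISABLE  0x0u /* No TPH requests allowed */
#define TPH_REQ_TPH_ONLY 0x1u /* TPH only requests allowed */
#define TPH_REQ_EXT_TPH  0x3u /* Extended TPH requests allowed */

/* Hypothetical model of the TPH state kept per device. */
struct fake_tph {
	uint32_t ctrl;    /* modelled TPH control register */
	uint8_t mode;     /* selected ST mode */
	uint8_t req_type; /* requester type chosen at enable time */
};

/* Write only the "TPH Requester Enable" field of the control register. */
static void set_req_en(struct fake_tph *t, uint32_t req)
{
	assert(req <= 0x3u);
	t->ctrl &= ~TPH_CTRL_REQ_EN_MASK;
	t->ctrl |= (req << TPH_CTRL_REQ_EN_SHIFT) & TPH_CTRL_REQ_EN_MASK;
}

static uint32_t get_req_en(const struct fake_tph *t)
{
	return (t->ctrl & TPH_CTRL_REQ_EN_MASK) >> TPH_CTRL_REQ_EN_SHIFT;
}

static void fake_enable(struct fake_tph *t, uint8_t mode, uint8_t req_type)
{
	t->mode = mode;
	t->req_type = req_type;
	t->ctrl = (t->ctrl & ~TPH_CTRL_MODE_SEL_MASK) |
		  (mode & TPH_CTRL_MODE_SEL_MASK);
	set_req_en(t, req_type);
}

/* Restores the enable field from the ST mode: the reported problem. */
static uint32_t update_tag_buggy(struct fake_tph *t)
{
	set_req_en(t, TPH_REQ_DISABLE);
	/* ... tag write elided ... */
	set_req_en(t, t->mode);
	return get_req_en(t);
}

/* Restores the enable field from the saved requester type. */
static uint32_t update_tag_fixed(struct fake_tph *t)
{
	set_req_en(t, TPH_REQ_DISABLE);
	/* ... tag write elided ... */
	set_req_en(t, t->req_type);
	return get_req_en(t);
}
```

In this model, a device enabled with Interrupt Vector mode and Extended TPH comes back from update_tag_buggy() with the enable field set to 1 (TPH only), while update_tag_fixed() leaves it at 3 as configured. With Device Specific mode the buggy path writes 2, which matches neither requester type above.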
Do you want to send in a fix? I can ACK it. Thanks for spotting it.
-Wei
>
> Thanks,
> Robin.
>
* Re: [PATCH V7 2/5] PCI/TPH: Add Steering Tag support
2025-02-04 20:18 ` Wei Huang
@ 2025-02-05 12:57 ` Robin Murphy
0 siblings, 0 replies; 17+ messages in thread
From: Robin Murphy @ 2025-02-05 12:57 UTC (permalink / raw)
To: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas,
lukas, paul.e.luse, jing2.liu
On 2025-02-04 8:18 pm, Wei Huang wrote:
>
>
> On 2/4/25 12:33 PM, Robin Murphy wrote:
>> On 2024-10-02 5:59 pm, Wei Huang wrote:
>> [...]
>>> +
>>> + if (err) {
>>> + pcie_disable_tph(pdev);
>>> + return err;
>>> + }
>>> +
>>> + set_ctrl_reg_req_en(pdev, pdev->tph_mode);
>>
>> Just looking at this code in mainline, and I don't trust my
>> understanding quite enough to send a patch myself, but doesn't this want
>> to be pdev->tph_req_type, rather than tph_mode?
>
> Yeah, you are right - this is supposed to be pdev->tph_req_type instead
> of tph_mode. We disable TPH first by clearing (zeroing) the "TPH Requester
> Enable" field and need to set it back using tph_req_type.
>
> Do you want to send in a fix? I can ACK it. Thanks for spotting it.
Done[1] - cheers for confirming!
Robin.
[1]
https://lore.kernel.org/linux-pci/13118098116d7bce07aa20b8c52e28c7d1847246.1738759933.git.robin.murphy@arm.com/
>
> -Wei
>
>>
>> Thanks,
>> Robin.
>>
Thread overview: 17+ messages
2024-10-02 16:59 [PATCH V7 0/5] TPH and cache direct injection support Wei Huang
2024-10-02 16:59 ` [PATCH V7 1/5] PCI: Add TLP Processing Hints (TPH) support Wei Huang
2024-10-02 16:59 ` [PATCH V7 2/5] PCI/TPH: Add Steering Tag support Wei Huang
2025-02-04 18:33 ` Robin Murphy
2025-02-04 20:18 ` Wei Huang
2025-02-05 12:57 ` Robin Murphy
2024-10-02 16:59 ` [PATCH V7 3/5] PCI/TPH: Add TPH documentation Wei Huang
2024-10-02 16:59 ` [PATCH V7 4/5] bnxt_en: Add TPH support in BNXT driver Wei Huang
2024-10-08 13:39 ` Jakub Kicinski
2024-10-11 18:35 ` Panicker, Manoj
2024-10-15 19:50 ` Wei Huang
2024-10-15 23:45 ` Jakub Kicinski
2024-10-02 16:59 ` [PATCH V7 5/5] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings Wei Huang
2024-10-02 21:35 ` [PATCH V7 0/5] TPH and cache direct injection support Bjorn Helgaas
2024-10-02 22:08 ` Michael Chan
2024-10-08 7:32 ` Paolo Abeni
2024-10-16 21:31 ` Bjorn Helgaas