* [PATCH V3 01/10] PCI: Introduce PCIe TPH support framework
2024-07-17 20:55 [PATCH V3 00/10] PCIe TPH and cache direct injection support Wei Huang
@ 2024-07-17 20:55 ` Wei Huang
2024-07-17 20:55 ` [PATCH V3 02/10] PCI: Add TPH related register definition Wei Huang
` (10 subsequent siblings)
11 siblings, 0 replies; 36+ messages in thread
From: Wei Huang @ 2024-07-17 20:55 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas
Implement the framework for PCIe TPH support by introducing tph.c
source file, along with CONFIG_PCIE_TPH, to Linux PCIe subsystem.
Add tph_cap in pci_dev to cache TPH capability offset.
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
drivers/pci/pci.h | 6 ++++++
drivers/pci/pcie/Kconfig | 11 +++++++++++
drivers/pci/pcie/Makefile | 1 +
drivers/pci/pcie/tph.c | 15 +++++++++++++++
drivers/pci/probe.c | 1 +
include/linux/pci.h | 4 ++++
6 files changed, 38 insertions(+)
create mode 100644 drivers/pci/pcie/tph.c
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 79c8398f3938..b80342e6b3e8 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -571,6 +571,12 @@ static inline int pci_iov_bus_range(struct pci_bus *bus)
#endif /* CONFIG_PCI_IOV */
+#ifdef CONFIG_PCIE_TPH
+void pcie_tph_init(struct pci_dev *dev);
+#else
+static inline void pcie_tph_init(struct pci_dev *dev) {}
+#endif
+
#ifdef CONFIG_PCIE_PTM
void pci_ptm_init(struct pci_dev *dev);
void pci_save_ptm_state(struct pci_dev *dev);
diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
index 17919b99fa66..c765016a119a 100644
--- a/drivers/pci/pcie/Kconfig
+++ b/drivers/pci/pcie/Kconfig
@@ -155,3 +155,14 @@ config PCIE_EDR
the PCI Firmware Specification r3.2. Enable this if you want to
support hybrid DPC model which uses both firmware and OS to
implement DPC.
+
+config PCIE_TPH
+ bool "TLP Processing Hints"
+ depends on ACPI
+ default n
+ help
+ This option adds support for PCIe TLP Processing Hints (TPH).
+ TPH allows endpoint devices to provide optimization hints, such as
+ desired caching behavior, for requests that target memory space.
+ These hints, called steering tags, can empower the system hardware
+ to optimize the utilization of platform resources.
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index 6461aa93fe76..3542b42ea0b9 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_PCIE_PME) += pme.o
obj-$(CONFIG_PCIE_DPC) += dpc.o
obj-$(CONFIG_PCIE_PTM) += ptm.o
obj-$(CONFIG_PCIE_EDR) += edr.o
+obj-$(CONFIG_PCIE_TPH) += tph.o
diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
new file mode 100644
index 000000000000..e385b871333e
--- /dev/null
+++ b/drivers/pci/pcie/tph.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * TPH (TLP Processing Hints) support
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc.
+ * Eric Van Tassell <Eric.VanTassell@amd.com>
+ * Wei Huang <wei.huang2@amd.com>
+ */
+
+#include "../pci.h"
+
+void pcie_tph_init(struct pci_dev *pdev)
+{
+ pdev->tph_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_TPH);
+}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 20475ca30505..b6bf3559c204 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2498,6 +2498,7 @@ static void pci_init_capabilities(struct pci_dev *dev)
pci_dpc_init(dev); /* Downstream Port Containment */
pci_rcec_init(dev); /* Root Complex Event Collector */
pci_doe_init(dev); /* Data Object Exchange */
+ pcie_tph_init(dev); /* TLP Processing Hints */
pcie_report_downtraining(dev);
pci_init_reset_methods(dev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e83ac93a4dcb..6631ebe80ca9 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -532,6 +532,10 @@ struct pci_dev {
/* These methods index pci_reset_fn_methods[] */
u8 reset_methods[PCI_NUM_RESET_METHODS]; /* In priority order */
+
+#ifdef CONFIG_PCIE_TPH
+ u16 tph_cap; /* TPH capability offset */
+#endif
};
static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
--
2.45.1
^ permalink raw reply related [flat|nested] 36+ messages in thread* [PATCH V3 02/10] PCI: Add TPH related register definition
2024-07-17 20:55 [PATCH V3 00/10] PCIe TPH and cache direct injection support Wei Huang
2024-07-17 20:55 ` [PATCH V3 01/10] PCI: Introduce PCIe TPH support framework Wei Huang
@ 2024-07-17 20:55 ` Wei Huang
2024-07-23 22:33 ` Bjorn Helgaas
2024-07-17 20:55 ` [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH Wei Huang
` (9 subsequent siblings)
11 siblings, 1 reply; 36+ messages in thread
From: Wei Huang @ 2024-07-17 20:55 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas
Linux has some basic, but incomplete, definition for the TPH Requester
capability registers. Also the control registers of TPH Requester and
the TPH Completer are missing. Add all required definitions to support
TPH without changing the existing uapi.
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
include/uapi/linux/pci_regs.h | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 94c00996e633..0fb61af6097a 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -657,6 +657,7 @@
#define PCI_EXP_DEVCAP2_ATOMIC_COMP64 0x00000100 /* 64b AtomicOp completion */
#define PCI_EXP_DEVCAP2_ATOMIC_COMP128 0x00000200 /* 128b AtomicOp completion */
#define PCI_EXP_DEVCAP2_LTR 0x00000800 /* Latency tolerance reporting */
+#define PCI_EXP_DEVCAP2_TPH_COMP_MASK 0x00003000 /* TPH completer support */
#define PCI_EXP_DEVCAP2_OBFF_MASK 0x000c0000 /* OBFF support mechanism */
#define PCI_EXP_DEVCAP2_OBFF_MSG 0x00040000 /* New message signaling */
#define PCI_EXP_DEVCAP2_OBFF_WAKE 0x00080000 /* Re-use WAKE# for OBFF */
@@ -1020,16 +1021,35 @@
#define PCI_DPA_CAP_SUBSTATE_MASK 0x1F /* # substates - 1 */
#define PCI_DPA_BASE_SIZEOF 16 /* size with 0 substates */
+/* TPH Completer Support */
+#define PCI_EXP_DEVCAP2_TPH_COMP_NONE 0x0 /* None */
+#define PCI_EXP_DEVCAP2_TPH_COMP_TPH_ONLY 0x1 /* TPH only */
+#define PCI_EXP_DEVCAP2_TPH_COMP_TPH_AND_EXT 0x3 /* TPH and Extended TPH */
+
/* TPH Requester */
#define PCI_TPH_CAP 4 /* capability register */
-#define PCI_TPH_CAP_LOC_MASK 0x600 /* location mask */
-#define PCI_TPH_LOC_NONE 0x000 /* no location */
-#define PCI_TPH_LOC_CAP 0x200 /* in capability */
-#define PCI_TPH_LOC_MSIX 0x400 /* in MSI-X */
+#define PCI_TPH_CAP_NO_ST 0x00000001 /* no ST mode supported */
+#define PCI_TPH_CAP_INT_VEC 0x00000002 /* interrupt vector mode supported */
+#define PCI_TPH_CAP_DS 0x00000004 /* device specific mode supported */
+#define PCI_TPH_CAP_EXT_TPH 0x00000100 /* extended TPH requestor supported */
+#define PCI_TPH_CAP_LOC_MASK 0x00000600 /* location mask */
+#define PCI_TPH_LOC_NONE 0x00000000 /* no location */
+#define PCI_TPH_LOC_CAP 0x00000200 /* in capability */
+#define PCI_TPH_LOC_MSIX 0x00000400 /* in MSI-X */
#define PCI_TPH_CAP_ST_MASK 0x07FF0000 /* ST table mask */
#define PCI_TPH_CAP_ST_SHIFT 16 /* ST table shift */
#define PCI_TPH_BASE_SIZEOF 0xc /* size with no ST table */
+#define PCI_TPH_CTRL 8 /* control register */
+#define PCI_TPH_CTRL_MODE_SEL_MASK 0x00000007 /* ST mode select mask */
+#define PCI_TPH_NO_ST_MODE 0x0 /* no ST mode */
+#define PCI_TPH_INT_VEC_MODE 0x1 /* interrupt vector mode */
+#define PCI_TPH_DEV_SPEC_MODE 0x2 /* device specific mode */
+#define PCI_TPH_CTRL_REQ_EN_MASK 0x00000300 /* TPH requester mask */
+#define PCI_TPH_REQ_DISABLE 0x0 /* no TPH request allowed */
+#define PCI_TPH_REQ_TPH_ONLY 0x1 /* 8-bit TPH tags allowed */
+#define PCI_TPH_REQ_EXT_TPH 0x3 /* 16-bit TPH tags allowed */
+
/* Downstream Port Containment */
#define PCI_EXP_DPC_CAP 0x04 /* DPC Capability */
#define PCI_EXP_DPC_IRQ 0x001F /* Interrupt Message Number */
--
2.45.1
^ permalink raw reply related [flat|nested] 36+ messages in thread* Re: [PATCH V3 02/10] PCI: Add TPH related register definition
2024-07-17 20:55 ` [PATCH V3 02/10] PCI: Add TPH related register definition Wei Huang
@ 2024-07-23 22:33 ` Bjorn Helgaas
0 siblings, 0 replies; 36+ messages in thread
From: Bjorn Helgaas @ 2024-07-23 22:33 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On Wed, Jul 17, 2024 at 03:55:03PM -0500, Wei Huang wrote:
> Linux has some basic, but incomplete, definition for the TPH Requester
> capability registers. Also the control registers of TPH Requester and
> the TPH Completer are missing. Add all required definitions to support
> TPH without changing the existing uapi.
> +#define PCI_TPH_CAP_NO_ST 0x00000001 /* no ST mode supported */
> +#define PCI_TPH_CAP_INT_VEC 0x00000002 /* interrupt vector mode supported */
> +#define PCI_TPH_CAP_DS 0x00000004 /* device specific mode supported */
Capitalize to match spec usage. Also below.
> +#define PCI_TPH_CAP_EXT_TPH 0x00000100 /* extended TPH requestor supported */
s/requestor/requester/
> +#define PCI_TPH_CAP_LOC_MASK 0x00000600 /* location mask */
> +#define PCI_TPH_LOC_NONE 0x00000000 /* no location */
> +#define PCI_TPH_LOC_CAP 0x00000200 /* in capability */
> +#define PCI_TPH_LOC_MSIX 0x00000400 /* in MSI-X */
> #define PCI_TPH_CAP_ST_MASK 0x07FF0000 /* ST table mask */
> #define PCI_TPH_CAP_ST_SHIFT 16 /* ST table shift */
> #define PCI_TPH_BASE_SIZEOF 0xc /* size with no ST table */
>
> +#define PCI_TPH_CTRL 8 /* control register */
> +#define PCI_TPH_CTRL_MODE_SEL_MASK 0x00000007 /* ST mode select mask */
> +#define PCI_TPH_NO_ST_MODE 0x0 /* no ST mode */
> +#define PCI_TPH_INT_VEC_MODE 0x1 /* interrupt vector mode */
> +#define PCI_TPH_DEV_SPEC_MODE 0x2 /* device specific mode */
> +#define PCI_TPH_CTRL_REQ_EN_MASK 0x00000300 /* TPH requester mask */
> +#define PCI_TPH_REQ_DISABLE 0x0 /* no TPH request allowed */
> +#define PCI_TPH_REQ_TPH_ONLY 0x1 /* 8-bit TPH tags allowed */
> +#define PCI_TPH_REQ_EXT_TPH 0x3 /* 16-bit TPH tags allowed */
> +
> /* Downstream Port Containment */
> #define PCI_EXP_DPC_CAP 0x04 /* DPC Capability */
> #define PCI_EXP_DPC_IRQ 0x001F /* Interrupt Message Number */
> --
> 2.45.1
>
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH
2024-07-17 20:55 [PATCH V3 00/10] PCIe TPH and cache direct injection support Wei Huang
2024-07-17 20:55 ` [PATCH V3 01/10] PCI: Introduce PCIe TPH support framework Wei Huang
2024-07-17 20:55 ` [PATCH V3 02/10] PCI: Add TPH related register definition Wei Huang
@ 2024-07-17 20:55 ` Wei Huang
2024-07-23 22:41 ` Bjorn Helgaas
2024-07-24 14:45 ` Alejandro Lucero Palau
2024-07-17 20:55 ` [PATCH V3 04/10] PCI/TPH: Add pci=nostmode to force No ST Mode Wei Huang
` (8 subsequent siblings)
11 siblings, 2 replies; 36+ messages in thread
From: Wei Huang @ 2024-07-17 20:55 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas
TLP headers with incorrect steering tags (e.g. caused by buggy driver)
can potentially cause issues when the system hardware consumes the tags.
Provide a kernel option, with related helper functions, to completely
prevent TPH from being enabled.
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
.../admin-guide/kernel-parameters.txt | 1 +
drivers/pci/pci-driver.c | 7 +++++-
drivers/pci/pci.c | 12 +++++++++
drivers/pci/pcie/tph.c | 25 +++++++++++++++++++
include/linux/pci-tph.h | 18 +++++++++++++
include/linux/pci.h | 1 +
6 files changed, 63 insertions(+), 1 deletion(-)
create mode 100644 include/linux/pci-tph.h
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index b2057241ea6c..65581ebd9b50 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4655,6 +4655,7 @@
nomio [S390] Do not use MIO instructions.
norid [S390] ignore the RID field and force use of
one PCI domain per PCI function
+ notph [PCIE] Do not use PCIe TPH
pcie_aspm= [PCIE] Forcibly enable or ignore PCIe Active State Power
Management.
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index af2996d0d17f..9722d070c0ca 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -21,6 +21,7 @@
#include <linux/acpi.h>
#include <linux/dma-map-ops.h>
#include <linux/iommu.h>
+#include <linux/pci-tph.h>
#include "pci.h"
#include "pcie/portdrv.h"
@@ -322,8 +323,12 @@ static long local_pci_probe(void *_ddi)
pm_runtime_get_sync(dev);
pci_dev->driver = pci_drv;
rc = pci_drv->probe(pci_dev, ddi->id);
- if (!rc)
+ if (!rc) {
+ if (pci_tph_disabled())
+ pcie_tph_disable(pci_dev);
+
return rc;
+ }
if (rc < 0) {
pci_dev->driver = NULL;
pm_runtime_put_sync(dev);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 02b1d81b1419..4cbfd5b53be8 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -157,6 +157,9 @@ static bool pcie_ari_disabled;
/* If set, the PCIe ATS capability will not be used. */
static bool pcie_ats_disabled;
+/* If set, the PCIe TPH capability will not be used. */
+static bool pcie_tph_disabled;
+
/* If set, the PCI config space of each device is printed during boot. */
bool pci_early_dump;
@@ -166,6 +169,12 @@ bool pci_ats_disabled(void)
}
EXPORT_SYMBOL_GPL(pci_ats_disabled);
+bool pci_tph_disabled(void)
+{
+ return pcie_tph_disabled;
+}
+EXPORT_SYMBOL_GPL(pci_tph_disabled);
+
/* Disable bridge_d3 for all PCIe ports */
static bool pci_bridge_d3_disable;
/* Force bridge_d3 for all PCIe ports */
@@ -6869,6 +6878,9 @@ static int __init pci_setup(char *str)
pci_no_domains();
} else if (!strncmp(str, "noari", 5)) {
pcie_ari_disabled = true;
+ } else if (!strcmp(str, "notph")) {
+ pr_info("PCIe: TPH is disabled\n");
+ pcie_tph_disabled = true;
} else if (!strncmp(str, "cbiosize=", 9)) {
pci_cardbus_io_size = memparse(str + 9, &str);
} else if (!strncmp(str, "cbmemsize=", 10)) {
diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
index e385b871333e..ad58a892792c 100644
--- a/drivers/pci/pcie/tph.c
+++ b/drivers/pci/pcie/tph.c
@@ -7,8 +7,33 @@
* Wei Huang <wei.huang2@amd.com>
*/
+#include <linux/pci.h>
+#include <linux/bitfield.h>
+#include <linux/pci-tph.h>
+
#include "../pci.h"
+/* Update the TPH Requester Enable field of TPH Control Register */
+static void set_ctrl_reg_req_en(struct pci_dev *pdev, u8 req_type)
+{
+ u32 reg_val;
+
+ pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, ®_val);
+
+ reg_val &= ~PCI_TPH_CTRL_REQ_EN_MASK;
+ reg_val |= FIELD_PREP(PCI_TPH_CTRL_REQ_EN_MASK, req_type);
+
+ pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg_val);
+}
+
+void pcie_tph_disable(struct pci_dev *pdev)
+{
+ if (!pdev->tph_cap)
+ return;
+
+ set_ctrl_reg_req_en(pdev, PCI_TPH_REQ_DISABLE);
+}
+
void pcie_tph_init(struct pci_dev *pdev)
{
pdev->tph_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_TPH);
diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
new file mode 100644
index 000000000000..e0b782bda929
--- /dev/null
+++ b/include/linux/pci-tph.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * TPH (TLP Processing Hints)
+ *
+ * Copyright (C) 2024 Advanced Micro Devices, Inc.
+ * Eric Van Tassell <Eric.VanTassell@amd.com>
+ * Wei Huang <wei.huang2@amd.com>
+ */
+#ifndef LINUX_PCI_TPH_H
+#define LINUX_PCI_TPH_H
+
+#ifdef CONFIG_PCIE_TPH
+void pcie_tph_disable(struct pci_dev *dev);
+#else
+static inline void pcie_tph_disable(struct pci_dev *dev) {}
+#endif
+
+#endif /* LINUX_PCI_TPH_H */
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 6631ebe80ca9..05fbbd9ad6b4 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1840,6 +1840,7 @@ static inline bool pci_aer_available(void) { return false; }
#endif
bool pci_ats_disabled(void);
+bool pci_tph_disabled(void);
#ifdef CONFIG_PCIE_PTM
int pci_enable_ptm(struct pci_dev *dev, u8 *granularity);
--
2.45.1
^ permalink raw reply related [flat|nested] 36+ messages in thread* Re: [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH
2024-07-17 20:55 ` [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH Wei Huang
@ 2024-07-23 22:41 ` Bjorn Helgaas
2024-07-24 20:05 ` Wei Huang
2024-07-24 14:45 ` Alejandro Lucero Palau
1 sibling, 1 reply; 36+ messages in thread
From: Bjorn Helgaas @ 2024-07-23 22:41 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On Wed, Jul 17, 2024 at 03:55:04PM -0500, Wei Huang wrote:
> TLP headers with incorrect steering tags (e.g. caused by buggy driver)
> can potentially cause issues when the system hardware consumes the tags.
Hmm. What kind of issues? Crash? Data corruption? Poor
performance?
> Provide a kernel option, with related helper functions, to completely
> prevent TPH from being enabled.
Also would be nice to have a hint about the difference between "notph"
and "nostmode". Maybe that goes in the "nostmode" patch? I'm not
super clear on all the differences here.
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4655,6 +4655,7 @@
> nomio [S390] Do not use MIO instructions.
> norid [S390] ignore the RID field and force use of
> one PCI domain per PCI function
> + notph [PCIE] Do not use PCIe TPH
Expand acronym here since there's no helpful context. Can also
include "(TPH)" if that's useful.
> @@ -322,8 +323,12 @@ static long local_pci_probe(void *_ddi)
> pm_runtime_get_sync(dev);
> pci_dev->driver = pci_drv;
> rc = pci_drv->probe(pci_dev, ddi->id);
> - if (!rc)
> + if (!rc) {
> + if (pci_tph_disabled())
> + pcie_tph_disable(pci_dev);
I'm not really a fan of cluttering probe() like this. Can't we
disable it in pcie_tph_init() so all devices start off with TPH
disabled, and then check pci_tph_disabled() in whatever interface
drivers use to enable TPH?
> +bool pci_tph_disabled(void)
> +{
> + return pcie_tph_disabled;
> +}
> +EXPORT_SYMBOL_GPL(pci_tph_disabled);
Other related interfaces use "pcie" prefix; I think this should match.
Do drivers need this? Would be nice not to export it unless they do.
Bjorn
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH
2024-07-23 22:41 ` Bjorn Helgaas
@ 2024-07-24 20:05 ` Wei Huang
2024-07-25 21:29 ` Bjorn Helgaas
0 siblings, 1 reply; 36+ messages in thread
From: Wei Huang @ 2024-07-24 20:05 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On 7/23/24 17:41, Bjorn Helgaas wrote:
> On Wed, Jul 17, 2024 at 03:55:04PM -0500, Wei Huang wrote:
>> TLP headers with incorrect steering tags (e.g. caused by buggy driver)
>> can potentially cause issues when the system hardware consumes the tags.
>
> Hmm. What kind of issues? Crash? Data corruption? Poor
> performance?
Not crash or functionality errors. Usually it is QoS related because of
resource competition. AMD has
>
>> Provide a kernel option, with related helper functions, to completely
>> prevent TPH from being enabled.
>
> Also would be nice to have a hint about the difference between "notph"
> and "nostmode". Maybe that goes in the "nostmode" patch? I'm not
> super clear on all the differences here.
I can combine them. Here is the combination and it meaning based on TPH
Control Register values:
Requestor Enable | ST Mode | Meaning
---------------------------------------------------------------
00 | xx | TPH disabled (i.e. notph)
01 | 00 | TPH enabled, NO ST Mode (i.e. nostmode)
01 or 11 | 01 | Interrupt Vector mode
01 or 11 | 10 | Device specific mode
If you have any other thoughts on how to approach these modes, please
let me know.
>
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -4655,6 +4655,7 @@
>> nomio [S390] Do not use MIO instructions.
>> norid [S390] ignore the RID field and force use of
>> one PCI domain per PCI function
>> + notph [PCIE] Do not use PCIe TPH
>
> Expand acronym here since there's no helpful context. Can also
> include "(TPH)" if that's useful.
>
>> @@ -322,8 +323,12 @@ static long local_pci_probe(void *_ddi)
>> pm_runtime_get_sync(dev);
>> pci_dev->driver = pci_drv;
>> rc = pci_drv->probe(pci_dev, ddi->id);
>> - if (!rc)
>> + if (!rc) {
>> + if (pci_tph_disabled())
>> + pcie_tph_disable(pci_dev);
>
> I'm not really a fan of cluttering probe() like this. Can't we
> disable it in pcie_tph_init() so all devices start off with TPH
> disabled, and then check pci_tph_disabled() in whatever interface
> drivers use to enable TPH?
>
>> +bool pci_tph_disabled(void)
>> +{
>> + return pcie_tph_disabled;
>> +}
>> +EXPORT_SYMBOL_GPL(pci_tph_disabled);
>
> Other related interfaces use "pcie" prefix; I think this should match.
>
> Do drivers need this? Would be nice not to export it unless they do.
>
> Bjorn
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH
2024-07-24 20:05 ` Wei Huang
@ 2024-07-25 21:29 ` Bjorn Helgaas
2024-07-29 14:56 ` Wei Huang
0 siblings, 1 reply; 36+ messages in thread
From: Bjorn Helgaas @ 2024-07-25 21:29 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On Wed, Jul 24, 2024 at 03:05:59PM -0500, Wei Huang wrote:
>
>
> On 7/23/24 17:41, Bjorn Helgaas wrote:
> > On Wed, Jul 17, 2024 at 03:55:04PM -0500, Wei Huang wrote:
> >> TLP headers with incorrect steering tags (e.g. caused by buggy driver)
> >> can potentially cause issues when the system hardware consumes the tags.
> >
> > Hmm. What kind of issues? Crash? Data corruption? Poor
> > performance?
>
> Not crash or functionality errors. Usually it is QoS related because of
> resource competition. AMD has
Looks like you had more to say here?
I *assume* that both the PH hint and the Steering Tags are only
*hints* and there's no excuse for hardware to corrupt anything (e.g.,
by omitting cache maintenance) even if the hint turns out to be wrong.
If that's the case, I assume "can potentially cause issues" really
just means "might lead to lower performance". That's what I want to
clarify and confirm.
> >> Provide a kernel option, with related helper functions, to completely
> >> prevent TPH from being enabled.
> >
> > Also would be nice to have a hint about the difference between "notph"
> > and "nostmode". Maybe that goes in the "nostmode" patch? I'm not
> > super clear on all the differences here.
>
> I can combine them. Here is the combination and it meaning based on TPH
> Control Register values:
>
> Requestor Enable | ST Mode | Meaning
> ---------------------------------------------------------------
> 00 | xx | TPH disabled (i.e. notph)
> 01 | 00 | TPH enabled, NO ST Mode (i.e. nostmode)
> 01 or 11 | 01 | Interrupt Vector mode
> 01 or 11 | 10 | Device specific mode
>
> If you have any other thoughts on how to approach these modes, please
> let me know.
IIRC, there's no interface in this series that reall does anything
with TPH per se; drivers would only use the ST-related things.
If that's the case, maybe "pci=notph" isn't needed yet.
Bjorn
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH
2024-07-25 21:29 ` Bjorn Helgaas
@ 2024-07-29 14:56 ` Wei Huang
0 siblings, 0 replies; 36+ messages in thread
From: Wei Huang @ 2024-07-29 14:56 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On 7/25/24 16:29, Bjorn Helgaas wrote:
> On Wed, Jul 24, 2024 at 03:05:59PM -0500, Wei Huang wrote:
>>
>>
>> On 7/23/24 17:41, Bjorn Helgaas wrote:
>>> On Wed, Jul 17, 2024 at 03:55:04PM -0500, Wei Huang wrote:
>>>> TLP headers with incorrect steering tags (e.g. caused by buggy driver)
>>>> can potentially cause issues when the system hardware consumes the tags.
>>>
>>> Hmm. What kind of issues? Crash? Data corruption? Poor
>>> performance?
>>
>> Not crash or functionality errors. Usually it is QoS related because of
>> resource competition. AMD has
>
> Looks like you had more to say here?
I hit the send button too fast. What I wanted to say was there will be
AMD QoS patches to control TPH. Note that they will be hooked up under
x86/resctrl. Since they are AMD specific, it will be independent from
PCIe subsystem code.
>
> I *assume* that both the PH hint and the Steering Tags are only
> *hints* and there's no excuse for hardware to corrupt anything (e.g.,
> by omitting cache maintenance) even if the hint turns out to be wrong.
> If that's the case, I assume "can potentially cause issues" really
> just means "might lead to lower performance". That's what I want to
> clarify and confirm.
Corrrect, only QoS-related concerns. There won't be any correctness
concerns.
>
>>>> Provide a kernel option, with related helper functions, to completely
>>>> prevent TPH from being enabled.
>>>
>>> Also would be nice to have a hint about the difference between "notph"
>>> and "nostmode". Maybe that goes in the "nostmode" patch? I'm not
>>> super clear on all the differences here.
>>
>> I can combine them. Here is the combination and it meaning based on TPH
>> Control Register values:
>>
>> Requestor Enable | ST Mode | Meaning
>> ---------------------------------------------------------------
>> 00 | xx | TPH disabled (i.e. notph)
>> 01 | 00 | TPH enabled, NO ST Mode (i.e. nostmode)
>> 01 or 11 | 01 | Interrupt Vector mode
>> 01 or 11 | 10 | Device specific mode
>>
>> If you have any other thoughts on how to approach these modes, please
>> let me know.
>
> IIRC, there's no interface in this series that reall does anything
> with TPH per se; drivers would only use the ST-related things.
>
> If that's the case, maybe "pci=notph" isn't needed yet.
I can go with it. There will be a BIOS option to turn it off on AMD
platform. I would expect similar options on other vendors' platforms. So
I am not overly concerned about dropping pci=notph.
>
> Bjorn
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH
2024-07-17 20:55 ` [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH Wei Huang
2024-07-23 22:41 ` Bjorn Helgaas
@ 2024-07-24 14:45 ` Alejandro Lucero Palau
2024-07-24 15:36 ` Bjorn Helgaas
1 sibling, 1 reply; 36+ messages in thread
From: Alejandro Lucero Palau @ 2024-07-24 14:45 UTC (permalink / raw)
To: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas
On 7/17/24 21:55, Wei Huang wrote:
> TLP headers with incorrect steering tags (e.g. caused by buggy driver)
> can potentially cause issues when the system hardware consumes the tags.
> Provide a kernel option, with related helper functions, to completely
> prevent TPH from being enabled.
Maybe rephrase it for including a potential buggy device, including the cpu.
Also, what about handling this with a no-tph-allow device list instead
of a generic binary option for the whole system?
Foreseeing some buggy or poor-performance implementations, or specific
use cases where it could be counterproductive, maybe supporting both
options.
> Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
> Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
> ---
> .../admin-guide/kernel-parameters.txt | 1 +
> drivers/pci/pci-driver.c | 7 +++++-
> drivers/pci/pci.c | 12 +++++++++
> drivers/pci/pcie/tph.c | 25 +++++++++++++++++++
> include/linux/pci-tph.h | 18 +++++++++++++
> include/linux/pci.h | 1 +
> 6 files changed, 63 insertions(+), 1 deletion(-)
> create mode 100644 include/linux/pci-tph.h
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index b2057241ea6c..65581ebd9b50 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4655,6 +4655,7 @@
> nomio [S390] Do not use MIO instructions.
> norid [S390] ignore the RID field and force use of
> one PCI domain per PCI function
> + notph [PCIE] Do not use PCIe TPH
>
> pcie_aspm= [PCIE] Forcibly enable or ignore PCIe Active State Power
> Management.
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index af2996d0d17f..9722d070c0ca 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -21,6 +21,7 @@
> #include <linux/acpi.h>
> #include <linux/dma-map-ops.h>
> #include <linux/iommu.h>
> +#include <linux/pci-tph.h>
> #include "pci.h"
> #include "pcie/portdrv.h"
>
> @@ -322,8 +323,12 @@ static long local_pci_probe(void *_ddi)
> pm_runtime_get_sync(dev);
> pci_dev->driver = pci_drv;
> rc = pci_drv->probe(pci_dev, ddi->id);
> - if (!rc)
> + if (!rc) {
> + if (pci_tph_disabled())
> + pcie_tph_disable(pci_dev);
> +
> return rc;
> + }
> if (rc < 0) {
> pci_dev->driver = NULL;
> pm_runtime_put_sync(dev);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 02b1d81b1419..4cbfd5b53be8 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -157,6 +157,9 @@ static bool pcie_ari_disabled;
> /* If set, the PCIe ATS capability will not be used. */
> static bool pcie_ats_disabled;
>
> +/* If set, the PCIe TPH capability will not be used. */
> +static bool pcie_tph_disabled;
> +
> /* If set, the PCI config space of each device is printed during boot. */
> bool pci_early_dump;
>
> @@ -166,6 +169,12 @@ bool pci_ats_disabled(void)
> }
> EXPORT_SYMBOL_GPL(pci_ats_disabled);
>
> +bool pci_tph_disabled(void)
> +{
> + return pcie_tph_disabled;
> +}
> +EXPORT_SYMBOL_GPL(pci_tph_disabled);
> +
> /* Disable bridge_d3 for all PCIe ports */
> static bool pci_bridge_d3_disable;
> /* Force bridge_d3 for all PCIe ports */
> @@ -6869,6 +6878,9 @@ static int __init pci_setup(char *str)
> pci_no_domains();
> } else if (!strncmp(str, "noari", 5)) {
> pcie_ari_disabled = true;
> + } else if (!strcmp(str, "notph")) {
> + pr_info("PCIe: TPH is disabled\n");
> + pcie_tph_disabled = true;
> } else if (!strncmp(str, "cbiosize=", 9)) {
> pci_cardbus_io_size = memparse(str + 9, &str);
> } else if (!strncmp(str, "cbmemsize=", 10)) {
> diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
> index e385b871333e..ad58a892792c 100644
> --- a/drivers/pci/pcie/tph.c
> +++ b/drivers/pci/pcie/tph.c
> @@ -7,8 +7,33 @@
> * Wei Huang <wei.huang2@amd.com>
> */
>
> +#include <linux/pci.h>
> +#include <linux/bitfield.h>
> +#include <linux/pci-tph.h>
> +
> #include "../pci.h"
>
> +/* Update the TPH Requester Enable field of TPH Control Register */
> +static void set_ctrl_reg_req_en(struct pci_dev *pdev, u8 req_type)
> +{
> + u32 reg_val;
> +
> + pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, ®_val);
> +
> + reg_val &= ~PCI_TPH_CTRL_REQ_EN_MASK;
> + reg_val |= FIELD_PREP(PCI_TPH_CTRL_REQ_EN_MASK, req_type);
> +
> + pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg_val);
> +}
> +
> +void pcie_tph_disable(struct pci_dev *pdev)
> +{
> + if (!pdev->tph_cap)
> + return;
> +
> + set_ctrl_reg_req_en(pdev, PCI_TPH_REQ_DISABLE);
> +}
> +
> void pcie_tph_init(struct pci_dev *pdev)
> {
> pdev->tph_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_TPH);
> diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
> new file mode 100644
> index 000000000000..e0b782bda929
> --- /dev/null
> +++ b/include/linux/pci-tph.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * TPH (TLP Processing Hints)
> + *
> + * Copyright (C) 2024 Advanced Micro Devices, Inc.
> + * Eric Van Tassell <Eric.VanTassell@amd.com>
> + * Wei Huang <wei.huang2@amd.com>
> + */
> +#ifndef LINUX_PCI_TPH_H
> +#define LINUX_PCI_TPH_H
> +
> +#ifdef CONFIG_PCIE_TPH
> +void pcie_tph_disable(struct pci_dev *dev);
> +#else
> +static inline void pcie_tph_disable(struct pci_dev *dev) {}
> +#endif
> +
> +#endif /* LINUX_PCI_TPH_H */
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 6631ebe80ca9..05fbbd9ad6b4 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1840,6 +1840,7 @@ static inline bool pci_aer_available(void) { return false; }
> #endif
>
> bool pci_ats_disabled(void);
> +bool pci_tph_disabled(void);
>
> #ifdef CONFIG_PCIE_PTM
> int pci_enable_ptm(struct pci_dev *dev, u8 *granularity);
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH
2024-07-24 14:45 ` Alejandro Lucero Palau
@ 2024-07-24 15:36 ` Bjorn Helgaas
2024-07-24 20:08 ` Wei Huang
0 siblings, 1 reply; 36+ messages in thread
From: Bjorn Helgaas @ 2024-07-24 15:36 UTC (permalink / raw)
To: Alejandro Lucero Palau
Cc: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev,
Jonathan.Cameron, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas
On Wed, Jul 24, 2024 at 03:45:34PM +0100, Alejandro Lucero Palau wrote:
> On 7/17/24 21:55, Wei Huang wrote:
> > TLP headers with incorrect steering tags (e.g. caused by buggy driver)
> > can potentially cause issues when the system hardware consumes the tags.
> > Provide a kernel option, with related helper functions, to completely
> > prevent TPH from being enabled.
>
> Maybe rephrase it for including a potential buggy device, including the cpu.
>
> Also, what about handling this with a no-tph-allow device list instead of a
> generic binary option for the whole system?
>
> Foreseeing some buggy or poor-performance implementations, or specific use
> cases where it could be counterproductive, maybe supporting both options.
Makes sense if/when we need it. IMO no point in adding an empty list
of known-broken devices.
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH
2024-07-24 15:36 ` Bjorn Helgaas
@ 2024-07-24 20:08 ` Wei Huang
0 siblings, 0 replies; 36+ messages in thread
From: Wei Huang @ 2024-07-24 20:08 UTC (permalink / raw)
To: Bjorn Helgaas, Alejandro Lucero Palau
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On 7/24/24 10:36, Bjorn Helgaas wrote:
> On Wed, Jul 24, 2024 at 03:45:34PM +0100, Alejandro Lucero Palau wrote:
>> On 7/17/24 21:55, Wei Huang wrote:
>>> TLP headers with incorrect steering tags (e.g. caused by buggy driver)
>>> can potentially cause issues when the system hardware consumes the tags.
>>> Provide a kernel option, with related helper functions, to completely
>>> prevent TPH from being enabled.
>>
>> Maybe rephrase it for including a potential buggy device, including the cpu.
>>
>> Also, what about handling this with a no-tph-allow device list instead of a
>> generic binary option for the whole system?
>>
>> Foreseeing some buggy or poor-performance implementations, or specific use
>> cases where it could be counterproductive, maybe supporting both options.
>
> Makes sense if/when we need it. IMO no point in adding an empty list
> of known-broken devices.
We can add quirks.c, or something similar later, after TPH support is
enabled.
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH V3 04/10] PCI/TPH: Add pci=nostmode to force No ST Mode
2024-07-17 20:55 [PATCH V3 00/10] PCIe TPH and cache direct injection support Wei Huang
` (2 preceding siblings ...)
2024-07-17 20:55 ` [PATCH V3 03/10] PCI/TPH: Add pci=notph to prevent use of TPH Wei Huang
@ 2024-07-17 20:55 ` Wei Huang
2024-07-23 22:44 ` Bjorn Helgaas
2024-07-17 20:55 ` [PATCH V3 05/10] PCI/TPH: Introduce API to check interrupt vector mode support Wei Huang
` (7 subsequent siblings)
11 siblings, 1 reply; 36+ messages in thread
From: Wei Huang @ 2024-07-17 20:55 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas
When "No ST mode" is enabled, endpoint devices can generate TPH headers
but with all steering tags treated as zero. A steering tag of zero is
interpreted as "using the default policy" by the root complex. This is
essential to quantify the benefit of steering tags for some given
workloads.
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
.../admin-guide/kernel-parameters.txt | 1 +
drivers/pci/pci-driver.c | 7 +++++-
drivers/pci/pci.c | 12 ++++++++++
drivers/pci/pcie/tph.c | 22 +++++++++++++++++++
include/linux/pci-tph.h | 2 ++
include/linux/pci.h | 1 +
6 files changed, 44 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 65581ebd9b50..1b761f062969 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4656,6 +4656,7 @@
norid [S390] ignore the RID field and force use of
one PCI domain per PCI function
notph [PCIE] Do not use PCIe TPH
+ nostmode [PCIE] Force TPH to use No ST Mode
pcie_aspm= [PCIE] Forcibly enable or ignore PCIe Active State Power
Management.
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 9722d070c0ca..abe66541536e 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -324,8 +324,13 @@ static long local_pci_probe(void *_ddi)
pci_dev->driver = pci_drv;
rc = pci_drv->probe(pci_dev, ddi->id);
if (!rc) {
- if (pci_tph_disabled())
+ if (pci_tph_disabled()) {
pcie_tph_disable(pci_dev);
+ return rc;
+ }
+
+ if (pci_tph_nostmode())
+ pcie_tph_set_nostmode(pci_dev);
return rc;
}
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 4cbfd5b53be8..8745ce1c4a9a 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -160,6 +160,9 @@ static bool pcie_ats_disabled;
/* If set, the PCIe TPH capability will not be used. */
static bool pcie_tph_disabled;
+/* If TPH is enabled, "No ST Mode" will be enforced. */
+static bool pcie_tph_nostmode;
+
/* If set, the PCI config space of each device is printed during boot. */
bool pci_early_dump;
@@ -175,6 +178,12 @@ bool pci_tph_disabled(void)
}
EXPORT_SYMBOL_GPL(pci_tph_disabled);
+bool pci_tph_nostmode(void)
+{
+ return pcie_tph_nostmode;
+}
+EXPORT_SYMBOL_GPL(pci_tph_nostmode);
+
/* Disable bridge_d3 for all PCIe ports */
static bool pci_bridge_d3_disable;
/* Force bridge_d3 for all PCIe ports */
@@ -6881,6 +6890,9 @@ static int __init pci_setup(char *str)
} else if (!strcmp(str, "notph")) {
pr_info("PCIe: TPH is disabled\n");
pcie_tph_disabled = true;
+ } else if (!strcmp(str, "nostmode")) {
+ pr_info("PCIe: TPH No ST Mode is enabled\n");
+ pcie_tph_nostmode = true;
} else if (!strncmp(str, "cbiosize=", 9)) {
pci_cardbus_io_size = memparse(str + 9, &str);
} else if (!strncmp(str, "cbmemsize=", 10)) {
diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
index ad58a892792c..fb8e2f920712 100644
--- a/drivers/pci/pcie/tph.c
+++ b/drivers/pci/pcie/tph.c
@@ -13,6 +13,19 @@
#include "../pci.h"
+/* Update the ST Mode Select field of TPH Control Register */
+static void set_ctrl_reg_mode_sel(struct pci_dev *pdev, u8 st_mode)
+{
+ u32 reg_val;
+
+ pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, ®_val);
+
+ reg_val &= ~PCI_TPH_CTRL_MODE_SEL_MASK;
+ reg_val |= FIELD_PREP(PCI_TPH_CTRL_MODE_SEL_MASK, st_mode);
+
+ pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg_val);
+}
+
/* Update the TPH Requester Enable field of TPH Control Register */
static void set_ctrl_reg_req_en(struct pci_dev *pdev, u8 req_type)
{
@@ -26,6 +39,15 @@ static void set_ctrl_reg_req_en(struct pci_dev *pdev, u8 req_type)
pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg_val);
}
+void pcie_tph_set_nostmode(struct pci_dev *pdev)
+{
+ if (!pdev->tph_cap)
+ return;
+
+ set_ctrl_reg_mode_sel(pdev, PCI_TPH_NO_ST_MODE);
+ set_ctrl_reg_req_en(pdev, PCI_TPH_REQ_TPH_ONLY);
+}
+
void pcie_tph_disable(struct pci_dev *pdev)
{
if (!pdev->tph_cap)
diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
index e0b782bda929..8fce3969277c 100644
--- a/include/linux/pci-tph.h
+++ b/include/linux/pci-tph.h
@@ -11,8 +11,10 @@
#ifdef CONFIG_PCIE_TPH
void pcie_tph_disable(struct pci_dev *dev);
+void pcie_tph_set_nostmode(struct pci_dev *dev);
#else
static inline void pcie_tph_disable(struct pci_dev *dev) {}
+static inline void pcie_tph_set_nostmode(struct pci_dev *dev) {}
#endif
#endif /* LINUX_PCI_TPH_H */
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 05fbbd9ad6b4..ac58f3919993 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1841,6 +1841,7 @@ static inline bool pci_aer_available(void) { return false; }
bool pci_ats_disabled(void);
bool pci_tph_disabled(void);
+bool pci_tph_nostmode(void);
#ifdef CONFIG_PCIE_PTM
int pci_enable_ptm(struct pci_dev *dev, u8 *granularity);
--
2.45.1
^ permalink raw reply related [flat|nested] 36+ messages in thread* Re: [PATCH V3 04/10] PCI/TPH: Add pci=nostmode to force No ST Mode
2024-07-17 20:55 ` [PATCH V3 04/10] PCI/TPH: Add pci=nostmode to force No ST Mode Wei Huang
@ 2024-07-23 22:44 ` Bjorn Helgaas
2024-08-02 4:29 ` Wei Huang
0 siblings, 1 reply; 36+ messages in thread
From: Bjorn Helgaas @ 2024-07-23 22:44 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On Wed, Jul 17, 2024 at 03:55:05PM -0500, Wei Huang wrote:
> When "No ST mode" is enabled, endpoint devices can generate TPH headers
> but with all steering tags treated as zero. A steering tag of zero is
> interpreted as "using the default policy" by the root complex. This is
> essential to quantify the benefit of steering tags for some given
> workloads.
Capitalize technical terms defined by spec.
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4656,6 +4656,7 @@
> norid [S390] ignore the RID field and force use of
> one PCI domain per PCI function
> notph [PCIE] Do not use PCIe TPH
> + nostmode [PCIE] Force TPH to use No ST Mode
Needs a little more context here about what this means. Users won't
know where to even look for "No ST Mode" unless they have a copy of
the spec.
> +++ b/drivers/pci/pci-driver.c
> @@ -324,8 +324,13 @@ static long local_pci_probe(void *_ddi)
> pci_dev->driver = pci_drv;
> rc = pci_drv->probe(pci_dev, ddi->id);
> if (!rc) {
> - if (pci_tph_disabled())
> + if (pci_tph_disabled()) {
> pcie_tph_disable(pci_dev);
> + return rc;
> + }
> +
> + if (pci_tph_nostmode())
> + pcie_tph_set_nostmode(pci_dev);
Same comment here; can we do this outside the probe() path somehow?
> return rc;
> }
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 4cbfd5b53be8..8745ce1c4a9a 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -160,6 +160,9 @@ static bool pcie_ats_disabled;
> /* If set, the PCIe TPH capability will not be used. */
> static bool pcie_tph_disabled;
>
> +/* If TPH is enabled, "No ST Mode" will be enforced. */
> +static bool pcie_tph_nostmode;
> +
> /* If set, the PCI config space of each device is printed during boot. */
> bool pci_early_dump;
>
> @@ -175,6 +178,12 @@ bool pci_tph_disabled(void)
> }
> EXPORT_SYMBOL_GPL(pci_tph_disabled);
>
> +bool pci_tph_nostmode(void)
> +{
> + return pcie_tph_nostmode;
> +}
> +EXPORT_SYMBOL_GPL(pci_tph_nostmode);
s/pci/pcie/
Unexport unless it's useful for drivers.
Bjorn
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH V3 04/10] PCI/TPH: Add pci=nostmode to force No ST Mode
2024-07-23 22:44 ` Bjorn Helgaas
@ 2024-08-02 4:29 ` Wei Huang
2024-08-02 17:05 ` Bjorn Helgaas
0 siblings, 1 reply; 36+ messages in thread
From: Wei Huang @ 2024-08-02 4:29 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On 7/23/24 17:44, Bjorn Helgaas wrote:
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -4656,6 +4656,7 @@
>> norid [S390] ignore the RID field and force use of
>> one PCI domain per PCI function
>> notph [PCIE] Do not use PCIe TPH
>> + nostmode [PCIE] Force TPH to use No ST Mode
>
> Needs a little more context here about what this means. Users won't
> know where to even look for "No ST Mode" unless they have a copy of
> the spec.
>
I can certainly add more description to talk about "No ST Mode". Also,
will "tph_nostmode" be better than "nostmode" in your opinion?
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH V3 04/10] PCI/TPH: Add pci=nostmode to force No ST Mode
2024-08-02 4:29 ` Wei Huang
@ 2024-08-02 17:05 ` Bjorn Helgaas
0 siblings, 0 replies; 36+ messages in thread
From: Bjorn Helgaas @ 2024-08-02 17:05 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On Thu, Aug 01, 2024 at 11:29:22PM -0500, Wei Huang wrote:
> On 7/23/24 17:44, Bjorn Helgaas wrote:
> > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > @@ -4656,6 +4656,7 @@
> > > norid [S390] ignore the RID field and force use of
> > > one PCI domain per PCI function
> > > notph [PCIE] Do not use PCIe TPH
> > > + nostmode [PCIE] Force TPH to use No ST Mode
> >
> > Needs a little more context here about what this means. Users won't
> > know where to even look for "No ST Mode" unless they have a copy of
> > the spec.
>
> I can certainly add more description to talk about "No ST Mode". Also, will
> "tph_nostmode" be better than "nostmode" in your opinion?
I don't really care about the parameter name. I think just spelling
these out once, e.g., "PCIe TLP Processing Hints (TPH)" and "No ST
Mode (Steering Tags must be all zeros)" is probably enough context.
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH V3 05/10] PCI/TPH: Introduce API to check interrupt vector mode support
2024-07-17 20:55 [PATCH V3 00/10] PCIe TPH and cache direct injection support Wei Huang
` (3 preceding siblings ...)
2024-07-17 20:55 ` [PATCH V3 04/10] PCI/TPH: Add pci=nostmode to force No ST Mode Wei Huang
@ 2024-07-17 20:55 ` Wei Huang
2024-07-23 22:28 ` Bjorn Helgaas
2024-07-17 20:55 ` [PATCH V3 06/10] PCI/TPH: Introduce API to retrieve TPH steering tags from ACPI Wei Huang
` (6 subsequent siblings)
11 siblings, 1 reply; 36+ messages in thread
From: Wei Huang @ 2024-07-17 20:55 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas
Add an API function to allow endpoint device drivers to check
if the interrupt vector mode is allowed. If allowed, drivers
can proceed with updating ST tags.
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
drivers/pci/pcie/tph.c | 29 +++++++++++++++++++++++++++++
include/linux/pci-tph.h | 3 +++
2 files changed, 32 insertions(+)
diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
index fb8e2f920712..7183370b0977 100644
--- a/drivers/pci/pcie/tph.c
+++ b/drivers/pci/pcie/tph.c
@@ -39,6 +39,17 @@ static void set_ctrl_reg_req_en(struct pci_dev *pdev, u8 req_type)
pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg_val);
}
+static bool int_vec_mode_supported(struct pci_dev *pdev)
+{
+ u32 reg_val;
+ u8 mode;
+
+ pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, ®_val);
+ mode = FIELD_GET(PCI_TPH_CAP_INT_VEC, reg_val);
+
+ return !!mode;
+}
+
void pcie_tph_set_nostmode(struct pci_dev *pdev)
{
if (!pdev->tph_cap)
@@ -60,3 +71,21 @@ void pcie_tph_init(struct pci_dev *pdev)
{
pdev->tph_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_TPH);
}
+
+/**
+ * pcie_tph_intr_vec_supported() - Check if interrupt vector mode supported for dev
+ * @pdev: pci device
+ *
+ * Return:
+ * true : intr vector mode supported
+ * false: intr vector mode not supported
+ */
+bool pcie_tph_intr_vec_supported(struct pci_dev *pdev)
+{
+ if (!pdev->tph_cap || pci_tph_disabled() || !pdev->msix_enabled ||
+ !int_vec_mode_supported(pdev))
+ return false;
+
+ return true;
+}
+EXPORT_SYMBOL(pcie_tph_intr_vec_supported);
diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
index 8fce3969277c..854677651d81 100644
--- a/include/linux/pci-tph.h
+++ b/include/linux/pci-tph.h
@@ -12,9 +12,12 @@
#ifdef CONFIG_PCIE_TPH
void pcie_tph_disable(struct pci_dev *dev);
void pcie_tph_set_nostmode(struct pci_dev *dev);
+bool pcie_tph_intr_vec_supported(struct pci_dev *dev);
#else
static inline void pcie_tph_disable(struct pci_dev *dev) {}
static inline void pcie_tph_set_nostmode(struct pci_dev *dev) {}
+static inline bool pcie_tph_intr_vec_supported(struct pci_dev *dev)
+{ return false; }
#endif
#endif /* LINUX_PCI_TPH_H */
--
2.45.1
^ permalink raw reply related [flat|nested] 36+ messages in thread* Re: [PATCH V3 05/10] PCI/TPH: Introduce API to check interrupt vector mode support
2024-07-17 20:55 ` [PATCH V3 05/10] PCI/TPH: Introduce API to check interrupt vector mode support Wei Huang
@ 2024-07-23 22:28 ` Bjorn Helgaas
0 siblings, 0 replies; 36+ messages in thread
From: Bjorn Helgaas @ 2024-07-23 22:28 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On Wed, Jul 17, 2024 at 03:55:06PM -0500, Wei Huang wrote:
> Add an API function to allow endpoint device drivers to check
> if the interrupt vector mode is allowed. If allowed, drivers
> can proceed with updating ST tags.
Wrap commit log to fill 75 columns
Wrap code/comments to fit in 80.
Here and below, capitalize technical terms defined in spec ("Interrupt
Vector Mode", "Steering Tag").
> Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
> Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
> ---
> drivers/pci/pcie/tph.c | 29 +++++++++++++++++++++++++++++
> include/linux/pci-tph.h | 3 +++
> 2 files changed, 32 insertions(+)
>
> diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
> index fb8e2f920712..7183370b0977 100644
> --- a/drivers/pci/pcie/tph.c
> +++ b/drivers/pci/pcie/tph.c
> @@ -39,6 +39,17 @@ static void set_ctrl_reg_req_en(struct pci_dev *pdev, u8 req_type)
> pci_write_config_dword(pdev, pdev->tph_cap + PCI_TPH_CTRL, reg_val);
> }
>
> +static bool int_vec_mode_supported(struct pci_dev *pdev)
> +{
> + u32 reg_val;
> + u8 mode;
> +
> + pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, ®_val);
> + mode = FIELD_GET(PCI_TPH_CAP_INT_VEC, reg_val);
> +
> + return !!mode;
> +}
> +
> void pcie_tph_set_nostmode(struct pci_dev *pdev)
> {
> if (!pdev->tph_cap)
> @@ -60,3 +71,21 @@ void pcie_tph_init(struct pci_dev *pdev)
> {
> pdev->tph_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_TPH);
> }
> +
> +/**
> + * pcie_tph_intr_vec_supported() - Check if interrupt vector mode supported for dev
> + * @pdev: pci device
> + *
> + * Return:
> + * true : intr vector mode supported
> + * false: intr vector mode not supported
> + */
> +bool pcie_tph_intr_vec_supported(struct pci_dev *pdev)
> +{
> + if (!pdev->tph_cap || pci_tph_disabled() || !pdev->msix_enabled ||
> + !int_vec_mode_supported(pdev))
> + return false;
IMO the int_vec_mode_supported() helper is overkill and could be
inlined here. The other booleans can be checked first.
> +
> + return true;
> +}
> +EXPORT_SYMBOL(pcie_tph_intr_vec_supported);
> diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
> index 8fce3969277c..854677651d81 100644
> --- a/include/linux/pci-tph.h
> +++ b/include/linux/pci-tph.h
> @@ -12,9 +12,12 @@
> #ifdef CONFIG_PCIE_TPH
> void pcie_tph_disable(struct pci_dev *dev);
> void pcie_tph_set_nostmode(struct pci_dev *dev);
> +bool pcie_tph_intr_vec_supported(struct pci_dev *dev);
> #else
> static inline void pcie_tph_disable(struct pci_dev *dev) {}
> static inline void pcie_tph_set_nostmode(struct pci_dev *dev) {}
> +static inline bool pcie_tph_intr_vec_supported(struct pci_dev *dev)
> +{ return false; }
> #endif
>
> #endif /* LINUX_PCI_TPH_H */
> --
> 2.45.1
>
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH V3 06/10] PCI/TPH: Introduce API to retrieve TPH steering tags from ACPI
2024-07-17 20:55 [PATCH V3 00/10] PCIe TPH and cache direct injection support Wei Huang
` (4 preceding siblings ...)
2024-07-17 20:55 ` [PATCH V3 05/10] PCI/TPH: Introduce API to check interrupt vector mode support Wei Huang
@ 2024-07-17 20:55 ` Wei Huang
2024-07-23 22:22 ` Bjorn Helgaas
2024-07-17 20:55 ` [PATCH V3 07/10] PCI/TPH: Introduce API to update TPH steering tags in PCIe devices Wei Huang
` (5 subsequent siblings)
11 siblings, 1 reply; 36+ messages in thread
From: Wei Huang @ 2024-07-17 20:55 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas
Add an API function to allow endpoint device drivers to retrieve
steering tags for a specific cpu_uid. This is achieved by invoking
ACPI _DSM on device's root port.
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
drivers/pci/pcie/tph.c | 162 ++++++++++++++++++++++++++++++++++++++++
include/linux/pci-tph.h | 12 +++
2 files changed, 174 insertions(+)
diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
index 7183370b0977..c805c8b1a7d2 100644
--- a/drivers/pci/pcie/tph.c
+++ b/drivers/pci/pcie/tph.c
@@ -7,12 +7,133 @@
* Wei Huang <wei.huang2@amd.com>
*/
+#define pr_fmt(fmt) "TPH: " fmt
+#define dev_fmt pr_fmt
+
#include <linux/pci.h>
+#include <linux/pci-acpi.h>
#include <linux/bitfield.h>
#include <linux/pci-tph.h>
#include "../pci.h"
+/*
+ * The st_info struct defines the steering tag returned by the firmware _DSM
+ * method defined in PCI Firmware Spec r3.3, sect 4.6.15 "_DSM to Query Cache
+ * Locality TPH Features"
+ *
+ * @vm_st_valid: 8 bit tag for volatile memory is valid
+ * @vm_xst_valid: 16 bit tag for volatile memory is valid
+ * @vm_ignore: 1 => was and will be ignored, 0 => ph should be supplied
+ * @vm_st: 8 bit steering tag for volatile mem
+ * @vm_xst: 16 bit steering tag for volatile mem
+ * @pm_st_valid: 8 bit tag for persistent memory is valid
+ * @pm_xst_valid: 16 bit tag for persistent memory is valid
+ * @pm_ph_ignore: 1 => was and will be ignore, 0 => ph should be supplied
+ * @pm_st: 8 bit steering tag for persistent mem
+ * @pm_xst: 16 bit steering tag for persistent mem
+ */
+union st_info {
+ struct {
+ u64 vm_st_valid : 1;
+ u64 vm_xst_valid : 1;
+ u64 vm_ph_ignore : 1;
+ u64 rsvd1 : 5;
+ u64 vm_st : 8;
+ u64 vm_xst : 16;
+ u64 pm_st_valid : 1;
+ u64 pm_xst_valid : 1;
+ u64 pm_ph_ignore : 1;
+ u64 rsvd2 : 5;
+ u64 pm_st : 8;
+ u64 pm_xst : 16;
+ };
+ u64 value;
+};
+
+static u16 tph_extract_tag(enum tph_mem_type mem_type, u8 req_type,
+ union st_info *st_tag)
+{
+ switch (req_type) {
+ case PCI_TPH_REQ_TPH_ONLY: /* 8 bit tags */
+ switch (mem_type) {
+ case TPH_MEM_TYPE_VM:
+ if (st_tag->vm_st_valid)
+ return st_tag->vm_st;
+ break;
+ case TPH_MEM_TYPE_PM:
+ if (st_tag->pm_st_valid)
+ return st_tag->pm_st;
+ break;
+ }
+ break;
+ case PCI_TPH_REQ_EXT_TPH: /* 16 bit tags */
+ switch (mem_type) {
+ case TPH_MEM_TYPE_VM:
+ if (st_tag->vm_xst_valid)
+ return st_tag->vm_xst;
+ break;
+ case TPH_MEM_TYPE_PM:
+ if (st_tag->pm_xst_valid)
+ return st_tag->pm_xst;
+ break;
+ }
+ break;
+ default:
+ pr_err("invalid steering tag in ACPI _DSM\n");
+ return 0;
+ }
+
+ return 0;
+}
+
+#define TPH_ST_DSM_FUNC_INDEX 0xF
+static acpi_status tph_invoke_dsm(acpi_handle handle, u32 cpu_uid, u8 ph,
+ u8 target_type, bool cache_ref_valid,
+ u64 cache_ref, union st_info *st_out)
+{
+ union acpi_object arg3[3], in_obj, *out_obj;
+
+ if (!acpi_check_dsm(handle, &pci_acpi_dsm_guid, 7, BIT(TPH_ST_DSM_FUNC_INDEX)))
+ return AE_ERROR;
+
+ /* DWORD: feature ID (0 for processor cache ST query) */
+ arg3[0].integer.type = ACPI_TYPE_INTEGER;
+ arg3[0].integer.value = 0;
+
+ /* DWORD: target UID */
+ arg3[1].integer.type = ACPI_TYPE_INTEGER;
+ arg3[1].integer.value = cpu_uid;
+
+ /* QWORD: properties */
+ arg3[2].integer.type = ACPI_TYPE_INTEGER;
+ arg3[2].integer.value = ph & 3;
+ arg3[2].integer.value |= (target_type & 1) << 2;
+ arg3[2].integer.value |= (cache_ref_valid & 1) << 3;
+ arg3[2].integer.value |= (cache_ref << 32);
+
+ in_obj.type = ACPI_TYPE_PACKAGE;
+ in_obj.package.count = ARRAY_SIZE(arg3);
+ in_obj.package.elements = arg3;
+
+ out_obj = acpi_evaluate_dsm(handle, &pci_acpi_dsm_guid, 7,
+ TPH_ST_DSM_FUNC_INDEX, &in_obj);
+
+ if (!out_obj)
+ return AE_ERROR;
+
+ if (out_obj->type != ACPI_TYPE_BUFFER) {
+ ACPI_FREE(out_obj);
+ return AE_ERROR;
+ }
+
+ st_out->value = *((u64 *)(out_obj->buffer.pointer));
+
+ ACPI_FREE(out_obj);
+
+ return AE_OK;
+}
+
/* Update the ST Mode Select field of TPH Control Register */
static void set_ctrl_reg_mode_sel(struct pci_dev *pdev, u8 st_mode)
{
@@ -89,3 +210,44 @@ bool pcie_tph_intr_vec_supported(struct pci_dev *pdev)
return true;
}
EXPORT_SYMBOL(pcie_tph_intr_vec_supported);
+
+/**
+ * pcie_tph_get_st_from_acpi() - Retrieve steering tag for a specific CPU
+ * using platform ACPI _DSM
+ * @pdev: pci device
+ * @cpu_acpi_uid: the acpi cpu_uid.
+ * @mem_type: memory type (vram, nvram)
+ * @req_type: request type (disable, tph, extended tph)
+ * @tag: steering tag return value
+ *
+ * Return: 0 if success, otherwise errno
+ */
+int pcie_tph_get_st_from_acpi(struct pci_dev *pdev, unsigned int cpu_acpi_uid,
+ enum tph_mem_type mem_type, u8 req_type,
+ u16 *tag)
+{
+ struct pci_dev *rp;
+ acpi_handle rp_acpi_handle;
+ union st_info info;
+
+ if (!pdev->tph_cap)
+ return -ENODEV;
+
+ /* find ACPI handler for device's root port */
+ rp = pcie_find_root_port(pdev);
+ if (!rp || !rp->bus || !rp->bus->bridge)
+ return -ENODEV;
+ rp_acpi_handle = ACPI_HANDLE(rp->bus->bridge);
+
+ /* invoke _DSM to extract tag value */
+ if (tph_invoke_dsm(rp_acpi_handle, cpu_acpi_uid, 0, 0, false, 0, &info) != AE_OK) {
+ *tag = 0;
+ return -EINVAL;
+ }
+
+ *tag = tph_extract_tag(mem_type, req_type, &info);
+ pci_dbg(pdev, "%s: cpu=%d tag=%d\n", __func__, cpu_acpi_uid, *tag);
+
+ return 0;
+}
+EXPORT_SYMBOL(pcie_tph_get_st_from_acpi);
diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
index 854677651d81..b12a592f3d49 100644
--- a/include/linux/pci-tph.h
+++ b/include/linux/pci-tph.h
@@ -9,15 +9,27 @@
#ifndef LINUX_PCI_TPH_H
#define LINUX_PCI_TPH_H
+enum tph_mem_type {
+ TPH_MEM_TYPE_VM, /* volatile memory type */
+ TPH_MEM_TYPE_PM /* persistent memory type */
+};
+
#ifdef CONFIG_PCIE_TPH
void pcie_tph_disable(struct pci_dev *dev);
void pcie_tph_set_nostmode(struct pci_dev *dev);
bool pcie_tph_intr_vec_supported(struct pci_dev *dev);
+int pcie_tph_get_st_from_acpi(struct pci_dev *dev, unsigned int cpu_acpi_uid,
+ enum tph_mem_type tag_type, u8 req_enable,
+ u16 *tag);
#else
static inline void pcie_tph_disable(struct pci_dev *dev) {}
static inline void pcie_tph_set_nostmode(struct pci_dev *dev) {}
static inline bool pcie_tph_intr_vec_supported(struct pci_dev *dev)
{ return false; }
+static inline int pcie_tph_get_st_from_acpi(struct pci_dev *dev, unsigned int cpu_acpi_uid,
+ enum tph_mem_type tag_type, u8 req_enable,
+ u16 *tag)
+{ return false; }
#endif
#endif /* LINUX_PCI_TPH_H */
--
2.45.1
^ permalink raw reply related [flat|nested] 36+ messages in thread* Re: [PATCH V3 06/10] PCI/TPH: Introduce API to retrieve TPH steering tags from ACPI
2024-07-17 20:55 ` [PATCH V3 06/10] PCI/TPH: Introduce API to retrieve TPH steering tags from ACPI Wei Huang
@ 2024-07-23 22:22 ` Bjorn Helgaas
2024-08-02 4:58 ` Wei Huang
0 siblings, 1 reply; 36+ messages in thread
From: Bjorn Helgaas @ 2024-07-23 22:22 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On Wed, Jul 17, 2024 at 03:55:07PM -0500, Wei Huang wrote:
> Add an API function to allow endpoint device drivers to retrieve
> steering tags for a specific cpu_uid. This is achieved by invoking
> ACPI _DSM on device's root port.
s/an API function/<name of function>/
Also include function name in subject line.
> + * The st_info struct defines the steering tag returned by the firmware _DSM
> + * method defined in PCI Firmware Spec r3.3, sect 4.6.15 "_DSM to Query Cache
> + * Locality TPH Features"
I don't know what I'm missing, but my copy of the r3.3 spec, dated Jan
20, 2021, doesn't have sec 4.6.15.
> +static u16 tph_extract_tag(enum tph_mem_type mem_type, u8 req_type,
> + union st_info *st_tag)
> +{
> + switch (req_type) {
> + case PCI_TPH_REQ_TPH_ONLY: /* 8 bit tags */
> + switch (mem_type) {
> + case TPH_MEM_TYPE_VM:
> + if (st_tag->vm_st_valid)
> + return st_tag->vm_st;
> + break;
> + case TPH_MEM_TYPE_PM:
> + if (st_tag->pm_st_valid)
> + return st_tag->pm_st;
> + break;
> + }
> + break;
> + case PCI_TPH_REQ_EXT_TPH: /* 16 bit tags */
> + switch (mem_type) {
> + case TPH_MEM_TYPE_VM:
> + if (st_tag->vm_xst_valid)
> + return st_tag->vm_xst;
> + break;
> + case TPH_MEM_TYPE_PM:
> + if (st_tag->pm_xst_valid)
> + return st_tag->pm_xst;
> + break;
> + }
> + break;
> + default:
> + pr_err("invalid steering tag in ACPI _DSM\n");
Needs to mention a specific device.
> +static acpi_status tph_invoke_dsm(acpi_handle handle, u32 cpu_uid, u8 ph,
> + u8 target_type, bool cache_ref_valid,
> + u64 cache_ref, union st_info *st_out)
IMO you can omit ph, target_type, cache_ref_valid, etc until you have
a need for them. I don't see the point of parameters that always have
the same constant values.
> +{
> + union acpi_object arg3[3], in_obj, *out_obj;
> +
> + if (!acpi_check_dsm(handle, &pci_acpi_dsm_guid, 7, BIT(TPH_ST_DSM_FUNC_INDEX)))
Wrap the code and comments in this file to fit in 80 columns instead
of 85 or whatever you used. Lines longer than 80 are OK for printf
strings but pointless for comments, etc.
> + return AE_ERROR;
> +
> + /* DWORD: feature ID (0 for processor cache ST query) */
> + arg3[0].integer.type = ACPI_TYPE_INTEGER;
> + arg3[0].integer.value = 0;
> +
> + /* DWORD: target UID */
> + arg3[1].integer.type = ACPI_TYPE_INTEGER;
> + arg3[1].integer.value = cpu_uid;
> +
> + /* QWORD: properties */
> + arg3[2].integer.type = ACPI_TYPE_INTEGER;
> + arg3[2].integer.value = ph & 3;
> + arg3[2].integer.value |= (target_type & 1) << 2;
> + arg3[2].integer.value |= (cache_ref_valid & 1) << 3;
> + arg3[2].integer.value |= (cache_ref << 32);
> +
> + in_obj.type = ACPI_TYPE_PACKAGE;
> + in_obj.package.count = ARRAY_SIZE(arg3);
> + in_obj.package.elements = arg3;
> +
> + out_obj = acpi_evaluate_dsm(handle, &pci_acpi_dsm_guid, 7,
> + TPH_ST_DSM_FUNC_INDEX, &in_obj);
> +
> + if (!out_obj)
> + return AE_ERROR;
> +
> + if (out_obj->type != ACPI_TYPE_BUFFER) {
> + ACPI_FREE(out_obj);
> + return AE_ERROR;
> + }
> +
> + st_out->value = *((u64 *)(out_obj->buffer.pointer));
> +
> + ACPI_FREE(out_obj);
> +
> + return AE_OK;
> +}
> +
> /* Update the ST Mode Select field of TPH Control Register */
> static void set_ctrl_reg_mode_sel(struct pci_dev *pdev, u8 st_mode)
> {
> @@ -89,3 +210,44 @@ bool pcie_tph_intr_vec_supported(struct pci_dev *pdev)
> return true;
> }
> EXPORT_SYMBOL(pcie_tph_intr_vec_supported);
> +
> +/**
> + * pcie_tph_get_st_from_acpi() - Retrieve steering tag for a specific CPU
> + * using platform ACPI _DSM
1) TPH and Steering Tags are not ACPI-specific, even though the only
current mechanism to learn the tags happens to be an ACPI _DSM, so I
think we should omit "acpi" from the name drivers use.
2) The spec doesn't restrict Steering Tags to be for a CPU; it says
"processing resource such as a host processor or system cache
hierarchy ..." But obviously this interface only comprehends an ACPI
CPU ID. Maybe the function name should include "cpu".
3) Nobody outside the file calls this yet, so it should probably be
static (and removed from the doc) until a driver needs it.
> + * @pdev: pci device
> + * @cpu_acpi_uid: the acpi cpu_uid.
> + * @mem_type: memory type (vram, nvram)
> + * @req_type: request type (disable, tph, extended tph)
> + * @tag: steering tag return value
> + *
> + * Return: 0 if success, otherwise errno
> + */
> +int pcie_tph_get_st_from_acpi(struct pci_dev *pdev, unsigned int cpu_acpi_uid,
> + enum tph_mem_type mem_type, u8 req_type,
> + u16 *tag)
> +{
> + struct pci_dev *rp;
> + acpi_handle rp_acpi_handle;
> + union st_info info;
> +
> + if (!pdev->tph_cap)
> + return -ENODEV;
> +
> + /* find ACPI handler for device's root port */
Superfluous comments since the code is obvious. And this finds a
"handle", not a "handler" :)
> + rp = pcie_find_root_port(pdev);
> + if (!rp || !rp->bus || !rp->bus->bridge)
> + return -ENODEV;
> + rp_acpi_handle = ACPI_HANDLE(rp->bus->bridge);
> +
> + /* invoke _DSM to extract tag value */
> + if (tph_invoke_dsm(rp_acpi_handle, cpu_acpi_uid, 0, 0, false, 0, &info) != AE_OK) {
> + *tag = 0;
> + return -EINVAL;
> + }
> +
> + *tag = tph_extract_tag(mem_type, req_type, &info);
> + pci_dbg(pdev, "%s: cpu=%d tag=%d\n", __func__, cpu_acpi_uid, *tag);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(pcie_tph_get_st_from_acpi);
> diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
> index 854677651d81..b12a592f3d49 100644
> --- a/include/linux/pci-tph.h
> +++ b/include/linux/pci-tph.h
> @@ -9,15 +9,27 @@
> #ifndef LINUX_PCI_TPH_H
> #define LINUX_PCI_TPH_H
>
> +enum tph_mem_type {
> + TPH_MEM_TYPE_VM, /* volatile memory type */
> + TPH_MEM_TYPE_PM /* persistent memory type */
Where does this come from? I don't see "vram" or "volatile" used in
the PCIe spec in this context. Maybe this is from the PCI Firmware
spec?
> +static inline int pcie_tph_get_st_from_acpi(struct pci_dev *dev, unsigned int cpu_acpi_uid,
> + enum tph_mem_type tag_type, u8 req_enable,
> + u16 *tag)
> +{ return false; }
"false" is not "int".
Apparently you want to return "success" in this case, when
CONFIG_PCIE_TPH is not enabled? I suggested leaving this non-exported
for now, which would mean removing this altogether. But if/when we do
export it, I think maybe it should return error so a caller doesn't
assume it succeeded, since *tag will be garbage.
Bjorn
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH V3 06/10] PCI/TPH: Introduce API to retrieve TPH steering tags from ACPI
2024-07-23 22:22 ` Bjorn Helgaas
@ 2024-08-02 4:58 ` Wei Huang
2024-08-02 16:58 ` Bjorn Helgaas
0 siblings, 1 reply; 36+ messages in thread
From: Wei Huang @ 2024-08-02 4:58 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On 7/23/24 17:22, Bjorn Helgaas wrote:
>> + * The st_info struct defines the steering tag returned by the firmware _DSM
>> + * method defined in PCI Firmware Spec r3.3, sect 4.6.15 "_DSM to Query Cache
>> + * Locality TPH Features"
>
> I don't know what I'm missing, but my copy of the r3.3 spec, dated Jan
> 20, 2021, doesn't have sec 4.6.15.
According to https://members.pcisig.com/wg/PCI-SIG/document/15470, the
revision has "4.6.15. _DSM to Query Cache Locality TPH Features".
PCI-SIG approved this ECN, but haven't merged it into PCI Firmware
Specification 3.3 yet.
<snip>
>> +
>> +/**
>> + * pcie_tph_get_st_from_acpi() - Retrieve steering tag for a specific CPU
>> + * using platform ACPI _DSM
>
> 1) TPH and Steering Tags are not ACPI-specific, even though the only
> current mechanism to learn the tags happens to be an ACPI _DSM, so I
> think we should omit "acpi" from the name drivers use.
>
> 2) The spec doesn't restrict Steering Tags to be for a CPU; it says
> "processing resource such as a host processor or system cache
> hierarchy ..." But obviously this interface only comprehends an ACPI
> CPU ID. Maybe the function name should include "cpu".
How about pcie_tph_get_st_for_cpu() or pcie_tph_retreive_st_for_cpu()?
>> diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
>> index 854677651d81..b12a592f3d49 100644
>> --- a/include/linux/pci-tph.h
>> +++ b/include/linux/pci-tph.h
>> @@ -9,15 +9,27 @@
>> #ifndef LINUX_PCI_TPH_H
>> #define LINUX_PCI_TPH_H
>>
>> +enum tph_mem_type {
>> + TPH_MEM_TYPE_VM, /* volatile memory type */
>> + TPH_MEM_TYPE_PM /* persistent memory type */
>
> Where does this come from? I don't see "vram" or "volatile" used in
> the PCIe spec in this context. Maybe this is from the PCI Firmware
> spec?
>
Yes, this is defined in the ECN mentioned above. Do you have concerns
about defining them here? If we want to remove it, then
pcie_tph_get_st_from_acpi() function can only support one memory type
(e.g. vram). Any advice?
>> +static inline int pcie_tph_get_st_from_acpi(struct pci_dev *dev, unsigned int cpu_acpi_uid,
>> + enum tph_mem_type tag_type, u8 req_enable,
>> + u16 *tag)
>> +{ return false; }
>
> "false" is not "int".
>
> Apparently you want to return "success" in this case, when
> CONFIG_PCIE_TPH is not enabled? I suggested leaving this non-exported
> for now, which would mean removing this altogether. But if/when we do
> export it, I think maybe it should return error so a caller doesn't
> assume it succeeded, since *tag will be garbage.
>
> Bjorn
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH V3 06/10] PCI/TPH: Introduce API to retrieve TPH steering tags from ACPI
2024-08-02 4:58 ` Wei Huang
@ 2024-08-02 16:58 ` Bjorn Helgaas
0 siblings, 0 replies; 36+ messages in thread
From: Bjorn Helgaas @ 2024-08-02 16:58 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On Thu, Aug 01, 2024 at 11:58:46PM -0500, Wei Huang wrote:
> On 7/23/24 17:22, Bjorn Helgaas wrote:
> > > + * The st_info struct defines the steering tag returned by the firmware _DSM
> > > + * method defined in PCI Firmware Spec r3.3, sect 4.6.15 "_DSM to Query Cache
> > > + * Locality TPH Features"
> >
> > I don't know what I'm missing, but my copy of the r3.3 spec, dated Jan
> > 20, 2021, doesn't have sec 4.6.15.
>
> According to https://members.pcisig.com/wg/PCI-SIG/document/15470,
> the revision has "4.6.15. _DSM to Query Cache Locality TPH
> Features". PCI-SIG approved this ECN, but haven't merged it into PCI
> Firmware Specification 3.3 yet.
Thanks for the pointer. Please update to refer to this as an approved
ECN to r3.3 and include the URL. When it is eventually incorporated
into a PCI Firmware spec revision, it will not be r3.3. It will
likely be r3.4 or r4.0, so r3.3 will never be the correct citation.
> > > + * pcie_tph_get_st_from_acpi() - Retrieve steering tag for a specific CPU
> > > + * using platform ACPI _DSM
> >
> > 1) TPH and Steering Tags are not ACPI-specific, even though the only
> > current mechanism to learn the tags happens to be an ACPI _DSM, so I
> > think we should omit "acpi" from the name drivers use.
> >
> > 2) The spec doesn't restrict Steering Tags to be for a CPU; it says
> > "processing resource such as a host processor or system cache
> > hierarchy ..." But obviously this interface only comprehends an ACPI
> > CPU ID. Maybe the function name should include "cpu".
>
> How about pcie_tph_get_st_for_cpu() or pcie_tph_retreive_st_for_cpu()?
Sounds good. The former is nice because it's shorter.
"pcie_tph_cpu_st()" would even be fine with me. s/retreive/retrieve/
if you use that.
> > > diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
> > > index 854677651d81..b12a592f3d49 100644
> > > --- a/include/linux/pci-tph.h
> > > +++ b/include/linux/pci-tph.h
> > > @@ -9,15 +9,27 @@
> > > #ifndef LINUX_PCI_TPH_H
> > > #define LINUX_PCI_TPH_H
> > > +enum tph_mem_type {
> > > + TPH_MEM_TYPE_VM, /* volatile memory type */
> > > + TPH_MEM_TYPE_PM /* persistent memory type */
> >
> > Where does this come from? I don't see "vram" or "volatile" used in
> > the PCIe spec in this context. Maybe this is from the PCI Firmware
> > spec?
>
> Yes, this is defined in the ECN mentioned above. Do you have
> concerns about defining them here? If we want to remove it, then
> pcie_tph_get_st_from_acpi() function can only support one memory
> type (e.g. vram). Any advice?
No concerns if they're defined in the ECN, but we need a citation so
we know where to look for these values. Cite the ECN for now, and we
can update to the actual firmware spec citation when it becomes
available.
Bjorn
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH V3 07/10] PCI/TPH: Introduce API to update TPH steering tags in PCIe devices
2024-07-17 20:55 [PATCH V3 00/10] PCIe TPH and cache direct injection support Wei Huang
` (5 preceding siblings ...)
2024-07-17 20:55 ` [PATCH V3 06/10] PCI/TPH: Introduce API to retrieve TPH steering tags from ACPI Wei Huang
@ 2024-07-17 20:55 ` Wei Huang
2024-07-23 23:15 ` Bjorn Helgaas
2024-07-17 20:55 ` [PATCH V3 08/10] PCI/TPH: Add TPH documentation Wei Huang
` (4 subsequent siblings)
11 siblings, 1 reply; 36+ messages in thread
From: Wei Huang @ 2024-07-17 20:55 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas
Add an API function, pcie_tph_set_st(), to allow endpoint device driver
to update the steering tags. Depending on ST table location, the tags
will be written into device's MSI-X table or TPH Requester Extended
Capability structure.
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
drivers/pci/pcie/tph.c | 190 ++++++++++++++++++++++++++++++++++++++++
include/linux/pci-tph.h | 7 ++
2 files changed, 197 insertions(+)
diff --git a/drivers/pci/pcie/tph.c b/drivers/pci/pcie/tph.c
index c805c8b1a7d2..8a0e48c913cf 100644
--- a/drivers/pci/pcie/tph.c
+++ b/drivers/pci/pcie/tph.c
@@ -13,6 +13,7 @@
#include <linux/pci.h>
#include <linux/pci-acpi.h>
#include <linux/bitfield.h>
+#include <linux/msi.h>
#include <linux/pci-tph.h>
#include "../pci.h"
@@ -171,6 +172,160 @@ static bool int_vec_mode_supported(struct pci_dev *pdev)
return !!mode;
}
+static u32 get_st_table_loc(struct pci_dev *pdev)
+{
+ u32 reg_val;
+
+ pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, ®_val);
+
+ return FIELD_GET(PCI_TPH_CAP_LOC_MASK, reg_val);
+}
+
+static bool msix_index_in_bound(struct pci_dev *pdev, int msi_idx)
+{
+ u32 reg_val;
+ u16 st_tbl_sz;
+
+ pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, ®_val);
+ st_tbl_sz = FIELD_GET(PCI_TPH_CAP_ST_MASK, reg_val);
+
+ return msi_idx <= st_tbl_sz;
+}
+
+/* Write ST to MSI-X vector control reg - Return 0 if OK, otherwise errno */
+static int tph_write_tag_to_msix(struct pci_dev *pdev, int msi_idx, u16 tag)
+{
+ struct msi_desc *msi_desc = NULL;
+ void __iomem *vec_ctrl;
+ u32 val;
+ int err = 0;
+
+ if (!msix_index_in_bound(pdev, msi_idx))
+ return -EINVAL;
+
+ msi_lock_descs(&pdev->dev);
+
+ /* find the msi_desc entry with matching msi_idx */
+ msi_for_each_desc(msi_desc, &pdev->dev, MSI_DESC_ASSOCIATED) {
+ if (msi_desc->msi_index == msi_idx)
+ break;
+ }
+
+ if (!msi_desc) {
+ pci_err(pdev, "MSI-X descriptor for #%d not found\n", msi_idx);
+ err = -ENXIO;
+ goto err_out;
+ }
+
+ /* get the vector control register (offset 0xc) pointed by msi_idx */
+ vec_ctrl = pdev->msix_base + msi_idx * PCI_MSIX_ENTRY_SIZE;
+ vec_ctrl += PCI_MSIX_ENTRY_VECTOR_CTRL;
+
+ val = readl(vec_ctrl);
+ val &= 0xffff;
+ val |= (tag << 16);
+ writel(val, vec_ctrl);
+
+ /* read back to flush the update */
+ val = readl(vec_ctrl);
+
+err_out:
+ msi_unlock_descs(&pdev->dev);
+ return err;
+}
+
+/* Return root port TPH completer capability - 0 means none */
+static u8 get_rp_completer_support(struct pci_dev *pdev)
+{
+ struct pci_dev *rp;
+ u32 reg_val;
+ int ret;
+
+ rp = pcie_find_root_port(pdev);
+ if (!rp) {
+ pci_err(pdev, "cannot find root port of %s\n", dev_name(&pdev->dev));
+ return 0;
+ }
+
+ ret = pcie_capability_read_dword(rp, PCI_EXP_DEVCAP2, ®_val);
+ if (ret) {
+ pci_err(pdev, "cannot read device capabilities 2\n");
+ return 0;
+ }
+
+ return FIELD_GET(PCI_EXP_DEVCAP2_TPH_COMP_MASK, reg_val);
+}
+
+/*
+ * TPH device needs to be below a rootport with the TPH Completer and
+ * the completer must offer a compatible level of completer support to that
+ * requested by the device driver.
+ */
+static bool rp_completer_support_ok(struct pci_dev *pdev, u8 req_cap)
+{
+ u8 rp_cap;
+
+ rp_cap = get_rp_completer_support(pdev);
+
+ if (req_cap > rp_cap) {
+ pci_err(pdev, "root port lacks proper TPH completer capability\n");
+ return false;
+ }
+
+ return true;
+}
+
+/* Return 0 if OK, otherwise errno on failure */
+static int pcie_tph_write_st(struct pci_dev *pdev, unsigned int msix_idx,
+ u8 req_type, u16 tag)
+{
+ int offset;
+ u32 loc;
+ int err = 0;
+
+ /* setting ST isn't needed - not an error, just return OK */
+ if (!pdev->tph_cap || pci_tph_disabled() || pci_tph_nostmode() ||
+ !pdev->msix_enabled || !int_vec_mode_supported(pdev))
+ return 0;
+
+ /* setting ST is incorrect in the following cases - return error */
+ if (!msix_index_in_bound(pdev, msix_idx) || !rp_completer_support_ok(pdev, req_type))
+ return -EINVAL;
+
+ /*
+ * disable TPH before updating the tag to avoid potential instability
+ * as cautioned in PCIE Base Spec r6.2, sect 6.17.3 "ST Modes of Operation"
+ */
+ pcie_tph_disable(pdev);
+
+ loc = get_st_table_loc(pdev);
+ /* Note: use FIELD_PREP to match PCI_TPH_LOC_* definitions in header */
+ loc = FIELD_PREP(PCI_TPH_CAP_LOC_MASK, loc);
+
+ switch (loc) {
+ case PCI_TPH_LOC_MSIX:
+ err = tph_write_tag_to_msix(pdev, msix_idx, tag);
+ break;
+ case PCI_TPH_LOC_CAP:
+ offset = pdev->tph_cap + PCI_TPH_BASE_SIZEOF + msix_idx * sizeof(u16);
+ err = pci_write_config_word(pdev, offset, tag);
+ break;
+ default:
+ pci_err(pdev, "unable to write steering tag for device %s\n",
+ dev_name(&pdev->dev));
+ err = -EINVAL;
+ break;
+ }
+
+ if (!err) {
+ /* re-enable interrupt vector mode */
+ set_ctrl_reg_mode_sel(pdev, PCI_TPH_INT_VEC_MODE);
+ set_ctrl_reg_req_en(pdev, req_type);
+ }
+
+ return err;
+}
+
void pcie_tph_set_nostmode(struct pci_dev *pdev)
{
if (!pdev->tph_cap)
@@ -251,3 +406,38 @@ int pcie_tph_get_st_from_acpi(struct pci_dev *pdev, unsigned int cpu_acpi_uid,
return 0;
}
EXPORT_SYMBOL(pcie_tph_get_st_from_acpi);
+
+/**
+ * pcie_tph_set_st() - Set steering tag in ST table entry
+ * @pdev: pci device
+ * @msix_idx: ordinal number of msix interrupt.
+ * @cpu_acpi_uid: the acpi cpu_uid.
+ * @mem_type: memory type (vram, nvram)
+ * @req_type: request type (disable, tph, extended tph)
+ *
+ * Return: 0 if success, otherwise errno
+ */
+int pcie_tph_set_st(struct pci_dev *pdev, unsigned int msix_idx,
+ unsigned int cpu_acpi_uid, enum tph_mem_type mem_type,
+ u8 req_type)
+{
+ u16 tag;
+ int err = 0;
+
+ if (!pdev->tph_cap)
+ return -ENODEV;
+
+ err = pcie_tph_get_st_from_acpi(pdev, cpu_acpi_uid, mem_type,
+ req_type, &tag);
+
+ if (err)
+ return err;
+
+ pci_dbg(pdev, "%s: writing tag %d for msi-x intr %d (cpu: %d)\n",
+ __func__, tag, msix_idx, cpu_acpi_uid);
+
+ err = pcie_tph_write_st(pdev, msix_idx, req_type, tag);
+
+ return err;
+}
+EXPORT_SYMBOL(pcie_tph_set_st);
diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
index b12a592f3d49..1cc99cc528bd 100644
--- a/include/linux/pci-tph.h
+++ b/include/linux/pci-tph.h
@@ -21,6 +21,9 @@ bool pcie_tph_intr_vec_supported(struct pci_dev *dev);
int pcie_tph_get_st_from_acpi(struct pci_dev *dev, unsigned int cpu_acpi_uid,
enum tph_mem_type tag_type, u8 req_enable,
u16 *tag);
+int pcie_tph_set_st(struct pci_dev *dev, unsigned int msix_nr,
+ unsigned int cpu, enum tph_mem_type tag_type,
+ u8 req_enable);
#else
static inline void pcie_tph_disable(struct pci_dev *dev) {}
static inline void pcie_tph_set_nostmode(struct pci_dev *dev) {}
@@ -30,6 +33,10 @@ static inline int pcie_tph_get_st_from_acpi(struct pci_dev *dev, unsigned int cp
enum tph_mem_type tag_type, u8 req_enable,
u16 *tag)
{ return false; }
+static inline int pcie_tph_set_st(struct pci_dev *dev, unsigned int msix_nr,
+ unsigned int cpu, enum tph_mem_type tag_type,
+ u8 req_enable)
+{ return false; }
#endif
#endif /* LINUX_PCI_TPH_H */
--
2.45.1
^ permalink raw reply related [flat|nested] 36+ messages in thread* Re: [PATCH V3 07/10] PCI/TPH: Introduce API to update TPH steering tags in PCIe devices
2024-07-17 20:55 ` [PATCH V3 07/10] PCI/TPH: Introduce API to update TPH steering tags in PCIe devices Wei Huang
@ 2024-07-23 23:15 ` Bjorn Helgaas
0 siblings, 0 replies; 36+ messages in thread
From: Bjorn Helgaas @ 2024-07-23 23:15 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On Wed, Jul 17, 2024 at 03:55:08PM -0500, Wei Huang wrote:
> Add an API function, pcie_tph_set_st(), to allow endpoint device driver
> to update the steering tags. Depending on ST table location, the tags
> will be written into device's MSI-X table or TPH Requester Extended
> Capability structure.
> +static u32 get_st_table_loc(struct pci_dev *pdev)
> +{
> + u32 reg_val;
> +
> + pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, ®_val);
> +
> + return FIELD_GET(PCI_TPH_CAP_LOC_MASK, reg_val);
> +}
> +
> +static bool msix_index_in_bound(struct pci_dev *pdev, int msi_idx)
> +{
> + u32 reg_val;
> + u16 st_tbl_sz;
> +
> + pci_read_config_dword(pdev, pdev->tph_cap + PCI_TPH_CAP, ®_val);
> + st_tbl_sz = FIELD_GET(PCI_TPH_CAP_ST_MASK, reg_val);
Seems like a one-time enumeration thing, not a config read we need to
do for every Steering Tag update. Same for get_st_table_loc(), I
think.
> + return msi_idx <= st_tbl_sz;
> +}
> +
> +/* Write ST to MSI-X vector control reg - Return 0 if OK, otherwise errno */
> +static int tph_write_tag_to_msix(struct pci_dev *pdev, int msi_idx, u16 tag)
> +{
> + struct msi_desc *msi_desc = NULL;
> + void __iomem *vec_ctrl;
> + u32 val;
> + int err = 0;
> +
> + if (!msix_index_in_bound(pdev, msi_idx))
> + return -EINVAL;
> +
> + msi_lock_descs(&pdev->dev);
> +
> + /* find the msi_desc entry with matching msi_idx */
> + msi_for_each_desc(msi_desc, &pdev->dev, MSI_DESC_ASSOCIATED) {
> + if (msi_desc->msi_index == msi_idx)
> + break;
> + }
> +
> + if (!msi_desc) {
> + pci_err(pdev, "MSI-X descriptor for #%d not found\n", msi_idx);
> + err = -ENXIO;
> + goto err_out;
> + }
> +
> + /* get the vector control register (offset 0xc) pointed by msi_idx */
> + vec_ctrl = pdev->msix_base + msi_idx * PCI_MSIX_ENTRY_SIZE;
> + vec_ctrl += PCI_MSIX_ENTRY_VECTOR_CTRL;
> +
> + val = readl(vec_ctrl);
> + val &= 0xffff;
> + val |= (tag << 16);
Seems like there should be some kind of #defines here to connect this
to the MSI-X table structure. Maybe next to or connected somehow with
PCI_MSIX_ENTRY_VECTOR_CTRL.
> + writel(val, vec_ctrl);
> +
> + /* read back to flush the update */
> + val = readl(vec_ctrl);
> +
> +err_out:
> + msi_unlock_descs(&pdev->dev);
> + return err;
> +}
> +
> +/* Return root port TPH completer capability - 0 means none */
> +static u8 get_rp_completer_support(struct pci_dev *pdev)
> +{
> + struct pci_dev *rp;
> + u32 reg_val;
> + int ret;
> +
> + rp = pcie_find_root_port(pdev);
> + if (!rp) {
> + pci_err(pdev, "cannot find root port of %s\n", dev_name(&pdev->dev));
> + return 0;
> + }
> +
> + ret = pcie_capability_read_dword(rp, PCI_EXP_DEVCAP2, ®_val);
> + if (ret) {
> + pci_err(pdev, "cannot read device capabilities 2\n");
> + return 0;
> + }
> +
> + return FIELD_GET(PCI_EXP_DEVCAP2_TPH_COMP_MASK, reg_val);
> +}
> +
> +/*
> + * TPH device needs to be below a rootport with the TPH Completer and
> + * the completer must offer a compatible level of completer support to that
> + * requested by the device driver.
Use spec spelling of "Root Port" (not a mix of "rootport", "root
port", etc).
> + */
> +static bool rp_completer_support_ok(struct pci_dev *pdev, u8 req_cap)
> +{
> + u8 rp_cap;
> +
> + rp_cap = get_rp_completer_support(pdev);
> +
> + if (req_cap > rp_cap) {
> + pci_err(pdev, "root port lacks proper TPH completer capability\n");
Doesn't look like an error we should log to me. The *driver* might
need to know this, but the *user* can't do anything with this message,
so I don't think we should print it. There's nothing actually broken
in the hardware or software here.
> + return false;
> + }
> +
> + return true;
> +}
> +
> +/* Return 0 if OK, otherwise errno on failure */
> +static int pcie_tph_write_st(struct pci_dev *pdev, unsigned int msix_idx,
> + u8 req_type, u16 tag)
> +{
> + int offset;
> + u32 loc;
> + int err = 0;
> +
> + /* setting ST isn't needed - not an error, just return OK */
> + if (!pdev->tph_cap || pci_tph_disabled() || pci_tph_nostmode() ||
> + !pdev->msix_enabled || !int_vec_mode_supported(pdev))
I see now why you made int_vec_mode_supported() a separate helper.
Makes sense since you call it several places, so disregard my earlier
comment about inlining it.
> + return 0;
> +
> + /* setting ST is incorrect in the following cases - return error */
> + if (!msix_index_in_bound(pdev, msix_idx) || !rp_completer_support_ok(pdev, req_type))
> + return -EINVAL;
> +
> + /*
> + * disable TPH before updating the tag to avoid potential instability
> + * as cautioned in PCIE Base Spec r6.2, sect 6.17.3 "ST Modes of Operation"
Wrap to fit in 80 columns. Capitalize as normal for English
sentences.
s/PCIE Base Spec/PCIe/
s/sect/sec/ (as used in similar citations elsewhere)
Apply to all comments in this series.
> + */
> + pcie_tph_disable(pdev);
> +
> + loc = get_st_table_loc(pdev);
> + /* Note: use FIELD_PREP to match PCI_TPH_LOC_* definitions in header */
> + loc = FIELD_PREP(PCI_TPH_CAP_LOC_MASK, loc);
> +
> + switch (loc) {
> + case PCI_TPH_LOC_MSIX:
> + err = tph_write_tag_to_msix(pdev, msix_idx, tag);
> + break;
> + case PCI_TPH_LOC_CAP:
> + offset = pdev->tph_cap + PCI_TPH_BASE_SIZEOF + msix_idx * sizeof(u16);
> + err = pci_write_config_word(pdev, offset, tag);
> + break;
> + default:
> + pci_err(pdev, "unable to write steering tag for device %s\n",
> + dev_name(&pdev->dev));
I *guess* this message is really telling me that the ST Table Location
field is "Reserved"? I don't think we should emit an error here
because if it becomes defined in the future, we'll start warning about
all devices that use it, even though nothing is actually wrong except
that we don't know how to use the new value.
In any event, "unable to write steering tag" isn't actually the
problem here; it's only that "ST Table Location" contains something we
don't know about.
> + err = -EINVAL;
> + break;
> + }
> +
> + if (!err) {
> + /* re-enable interrupt vector mode */
> + set_ctrl_reg_mode_sel(pdev, PCI_TPH_INT_VEC_MODE);
> + set_ctrl_reg_req_en(pdev, req_type);
I wish this code *looked* parallel to the pcie_tph_disable() above,
since it *is* actually paralle.
> + }
if (err)
return err;
/* Re-enable ... */
return 0;
> +
> + return err;
> +}
> + * pcie_tph_set_st() - Set steering tag in ST table entry
> + * @pdev: pci device
> + * @msix_idx: ordinal number of msix interrupt.
> + * @cpu_acpi_uid: the acpi cpu_uid.
> + * @mem_type: memory type (vram, nvram)
> + * @req_type: request type (disable, tph, extended tph)
> + *
> + * Return: 0 if success, otherwise errno
> + */
> +int pcie_tph_set_st(struct pci_dev *pdev, unsigned int msix_idx,
> + unsigned int cpu_acpi_uid, enum tph_mem_type mem_type,
> + u8 req_type)
I think this function name should include something about "cpu".
Seems like this file uses "cpu_uid" and "cpu_acpi_uid"
interchangeably. I'm a little unclear on whether that's actually the
case or what the legal values are. Driver passes cpumask_first(),
which I think is a generic Linux CPU ID. Is that identical with an
ACPI CPU UID? It looks like we assume that since we pass this
unaltered to the _DSM. I didn't dig into this, but would like to be
reassured that all is well here.
In any case, please use a consistent name so I don't have to wonder
whether "cpu_uid" and "cpu_acpi_uid" are the same.
> +{
> + u16 tag;
> + int err = 0;
> +
> + if (!pdev->tph_cap)
> + return -ENODEV;
> +
> + err = pcie_tph_get_st_from_acpi(pdev, cpu_acpi_uid, mem_type,
> + req_type, &tag);
> +
> + if (err)
> + return err;
> +
> + pci_dbg(pdev, "%s: writing tag %d for msi-x intr %d (cpu: %d)\n",
> + __func__, tag, msix_idx, cpu_acpi_uid);
> +
> + err = pcie_tph_write_st(pdev, msix_idx, req_type, tag);
> +
> + return err;
return pcie_tph_write_st(...);
> +}
> +EXPORT_SYMBOL(pcie_tph_set_st);
> +static inline int pcie_tph_set_st(struct pci_dev *dev, unsigned int msix_nr,
> + unsigned int cpu, enum tph_mem_type tag_type,
> + u8 req_enable)
> +{ return false; }
"false" is not int. This looks like "success" to the caller, and I'm
not sure that's what you want.
Bjorn
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH V3 08/10] PCI/TPH: Add TPH documentation
2024-07-17 20:55 [PATCH V3 00/10] PCIe TPH and cache direct injection support Wei Huang
` (6 preceding siblings ...)
2024-07-17 20:55 ` [PATCH V3 07/10] PCI/TPH: Introduce API to update TPH steering tags in PCIe devices Wei Huang
@ 2024-07-17 20:55 ` Wei Huang
2024-07-23 21:50 ` Bjorn Helgaas
2024-07-17 20:55 ` [PATCH V3 09/10] bnxt_en: Add TPH support in BNXT driver Wei Huang
` (3 subsequent siblings)
11 siblings, 1 reply; 36+ messages in thread
From: Wei Huang @ 2024-07-17 20:55 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas
Provide a document for TPH feature, including the description of
kernel options and driver API interface.
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
Documentation/PCI/index.rst | 1 +
Documentation/PCI/tph.rst | 57 ++++++++++++++++++++++++++++
Documentation/driver-api/pci/pci.rst | 3 ++
3 files changed, 61 insertions(+)
create mode 100644 Documentation/PCI/tph.rst
diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
index e73f84aebde3..5e7c4e6e726b 100644
--- a/Documentation/PCI/index.rst
+++ b/Documentation/PCI/index.rst
@@ -18,3 +18,4 @@ PCI Bus Subsystem
pcieaer-howto
endpoint/index
boot-interrupts
+ tph
diff --git a/Documentation/PCI/tph.rst b/Documentation/PCI/tph.rst
new file mode 100644
index 000000000000..103f4c3251e2
--- /dev/null
+++ b/Documentation/PCI/tph.rst
@@ -0,0 +1,57 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========
+TPH Support
+===========
+
+
+:Copyright: 2024 Advanced Micro Devices, Inc.
+:Authors: - Eric van Tassell <eric.vantassell@amd.com>
+ - Wei Huang <wei.huang2@amd.com>
+
+Overview
+========
+TPH (TLP Processing Hints) is a PCIe feature that allows endpoint devices
+to provide optimization hints, such as desired caching behavior, for
+requests that target memory space. These hints, in a format called steering
+tags, are provided in the requester's TLP headers and can empower the system
+hardware, including the Root Complex, to optimize the utilization of platform
+resources for the requests.
+
+User Guide
+==========
+
+Kernel Options
+--------------
+There are two kernel command line options available to control TPH feature
+
+ * "notph": TPH will be disabled for all endpoint devices.
+ * "nostmode": TPH will be enabled but the ST Mode will be forced to "No ST Mode".
+
+Device Driver API
+-----------------
+In brief, an endpoint device driver using the TPH interface to configure
+Interrupt Vector Mode will call pcie_tph_set_st() when setting up MSI-X
+interrupts as shown below:
+
+.. code-block:: c
+
+ for (i = 0, j = 0; i < nr_rings; i++) {
+ ...
+ rc = request_irq(irq->vector, irq->handler, flags, irq->name, NULL);
+ ...
+ if (!pcie_tph_set_st(pdev, i, cpumask_first(irq->cpu_mask),
+ TPH_MEM_TYPE_VM, PCI_TPH_REQ_TPH_ONLY))
+ pr_err("Error in configuring steering tag\n");
+ ...
+ }
+
+The caller is suggested to check if interrupt vector mode is supported using
+pcie_tph_intr_vec_supported() before updating the steering tags. If a device only
+supports TPH vendor specific mode, its driver can call pcie_tph_get_st_from_acpi()
+to retrieve the steering tag for a specific CPU and uses the tag to control TPH
+behavior.
+
+.. kernel-doc:: drivers/pci/pcie/tph.c
+ :export:
+ :identifiers: pcie_tph_intr_vec_supported pcie_tph_get_st_from_acpi pcie_tph_set_st
diff --git a/Documentation/driver-api/pci/pci.rst b/Documentation/driver-api/pci/pci.rst
index aa40b1cc243b..3d896b2cf16e 100644
--- a/Documentation/driver-api/pci/pci.rst
+++ b/Documentation/driver-api/pci/pci.rst
@@ -46,6 +46,9 @@ PCI Support Library
.. kernel-doc:: drivers/pci/pci-sysfs.c
:internal:
+.. kernel-doc:: drivers/pci/pcie/tph.c
+ :export:
+
PCI Hotplug Support Library
---------------------------
--
2.45.1
^ permalink raw reply related [flat|nested] 36+ messages in thread* Re: [PATCH V3 08/10] PCI/TPH: Add TPH documentation
2024-07-17 20:55 ` [PATCH V3 08/10] PCI/TPH: Add TPH documentation Wei Huang
@ 2024-07-23 21:50 ` Bjorn Helgaas
0 siblings, 0 replies; 36+ messages in thread
From: Bjorn Helgaas @ 2024-07-23 21:50 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On Wed, Jul 17, 2024 at 03:55:09PM -0500, Wei Huang wrote:
> Provide a document for TPH feature, including the description of
> kernel options and driver API interface.
>
> Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
> Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> Documentation/PCI/index.rst | 1 +
> Documentation/PCI/tph.rst | 57 ++++++++++++++++++++++++++++
Wrap file to fit in 80 columns. Looks like about 86 now, which is
kind of a random size.
> Documentation/driver-api/pci/pci.rst | 3 ++
> 3 files changed, 61 insertions(+)
> create mode 100644 Documentation/PCI/tph.rst
>
> diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
> index e73f84aebde3..5e7c4e6e726b 100644
> --- a/Documentation/PCI/index.rst
> +++ b/Documentation/PCI/index.rst
> @@ -18,3 +18,4 @@ PCI Bus Subsystem
> pcieaer-howto
> endpoint/index
> boot-interrupts
> + tph
> diff --git a/Documentation/PCI/tph.rst b/Documentation/PCI/tph.rst
> new file mode 100644
> index 000000000000..103f4c3251e2
> --- /dev/null
> +++ b/Documentation/PCI/tph.rst
> @@ -0,0 +1,57 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===========
> +TPH Support
> +===========
> +
> +
> +:Copyright: 2024 Advanced Micro Devices, Inc.
> +:Authors: - Eric van Tassell <eric.vantassell@amd.com>
> + - Wei Huang <wei.huang2@amd.com>
> +
> +Overview
> +========
> +TPH (TLP Processing Hints) is a PCIe feature that allows endpoint devices
> +to provide optimization hints, such as desired caching behavior, for
> +requests that target memory space. These hints, in a format called steering
> +tags, are provided in the requester's TLP headers and can empower the system
> +hardware, including the Root Complex, to optimize the utilization of platform
> +resources for the requests.
I see that this is basically cribbed from sec 6.17, but it sort of
conflates Processing Hints (Bi-directional data structure, Requester,
Target, Target with Priority) with Steering Tags (hints about where to
target a TLP), and "optimize the utilization of platform resources for
the requests" is pretty vague marketing-speak.
I think it might be useful to at least mention the Processing Hints
because the distinction between PH and ST helps motivate the need for
both "pci=notph" and "pci=nostmode".
IIUC, we can enable TPH with the TPH Requester Enable bit, but I don't
see an architected mechanism to control the PH bits in the TLP header.
I assume there might be device-specific ways or it might be built into
the hardware.
Use "Steering Tags" to indicate that this is a term defined by the
spec.
> +User Guide
> +==========
> +
> +Kernel Options
> +--------------
> +There are two kernel command line options available to control TPH feature
> +
> + * "notph": TPH will be disabled for all endpoint devices.
> + * "nostmode": TPH will be enabled but the ST Mode will be forced to "No ST Mode".
> +
> +Device Driver API
> +-----------------
> +In brief, an endpoint device driver using the TPH interface to configure
> +Interrupt Vector Mode will call pcie_tph_set_st() when setting up MSI-X
> +interrupts as shown below:
I think we should include a spec reference for more details here,
e.g., PCIe r6.0, sec 6.17, 6.17.3.
> +.. code-block:: c
> +
> + for (i = 0, j = 0; i < nr_rings; i++) {
"j" is not relevant here and makes the example unnecessarily
complicated.
> + ...
> + rc = request_irq(irq->vector, irq->handler, flags, irq->name, NULL);
> + ...
> + if (!pcie_tph_set_st(pdev, i, cpumask_first(irq->cpu_mask),
> + TPH_MEM_TYPE_VM, PCI_TPH_REQ_TPH_ONLY))
> + pr_err("Error in configuring steering tag\n");
pci_err(). I don't want to encourage drivers to print messages that
have no connection to a specific device.
> + ...
> + }
> +
> +The caller is suggested to check if interrupt vector mode is supported using
> +pcie_tph_intr_vec_supported() before updating the steering tags.
I guess this refers to "Interrupt Vector Mode", an ST mode of
operation (PCIe r6.0, sec 6.17.3). Helpful to style this as
"Interrupt Vector Mode" to indicate that it is a technical term
defined in the spec.
> If a device only
> +supports TPH vendor specific mode, its driver can call pcie_tph_get_st_from_acpi()
> +to retrieve the steering tag for a specific CPU and uses the tag to control TPH
> +behavior.
I'm not sure what "vendor specific mode" refers to here. If it's the
"Device Specific" ST Mode, use the exact language from the spec to
help people find the details.
> +.. kernel-doc:: drivers/pci/pcie/tph.c
> + :export:
> + :identifiers: pcie_tph_intr_vec_supported pcie_tph_get_st_from_acpi pcie_tph_set_st
> diff --git a/Documentation/driver-api/pci/pci.rst b/Documentation/driver-api/pci/pci.rst
> index aa40b1cc243b..3d896b2cf16e 100644
> --- a/Documentation/driver-api/pci/pci.rst
> +++ b/Documentation/driver-api/pci/pci.rst
> @@ -46,6 +46,9 @@ PCI Support Library
> .. kernel-doc:: drivers/pci/pci-sysfs.c
> :internal:
>
> +.. kernel-doc:: drivers/pci/pcie/tph.c
> + :export:
> +
> PCI Hotplug Support Library
> ---------------------------
>
> --
> 2.45.1
>
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH V3 09/10] bnxt_en: Add TPH support in BNXT driver
2024-07-17 20:55 [PATCH V3 00/10] PCIe TPH and cache direct injection support Wei Huang
` (7 preceding siblings ...)
2024-07-17 20:55 ` [PATCH V3 08/10] PCI/TPH: Add TPH documentation Wei Huang
@ 2024-07-17 20:55 ` Wei Huang
2024-07-23 16:48 ` Bjorn Helgaas
2024-07-17 20:55 ` [PATCH V3 10/10] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings Wei Huang
` (2 subsequent siblings)
11 siblings, 1 reply; 36+ messages in thread
From: Wei Huang @ 2024-07-17 20:55 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas
From: Manoj Panicker <manoj.panicker2@amd.com>
Implement TPH support in Broadcom BNXT device driver by invoking
pcie_tph_set_st() function when interrupt affinity is changed.
Signed-off-by: Manoj Panicker <manoj.panicker2@amd.com>
Reviewed-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 54 +++++++++++++++++++++++
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 ++
2 files changed, 58 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index c437ca1c0fd3..2207dac8ce18 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -55,6 +55,7 @@
#include <net/page_pool/helpers.h>
#include <linux/align.h>
#include <net/netdev_queues.h>
+#include <linux/pci-tph.h>
#include "bnxt_hsi.h"
#include "bnxt.h"
@@ -10683,6 +10684,8 @@ static void bnxt_free_irq(struct bnxt *bp)
free_cpumask_var(irq->cpu_mask);
irq->have_cpumask = 0;
}
+ if (pcie_tph_intr_vec_supported(bp->pdev))
+ irq_set_affinity_notifier(irq->vector, NULL);
free_irq(irq->vector, bp->bnapi[i]);
}
@@ -10690,6 +10693,45 @@ static void bnxt_free_irq(struct bnxt *bp)
}
}
+static void bnxt_rtnl_lock_sp(struct bnxt *bp);
+static void bnxt_rtnl_unlock_sp(struct bnxt *bp);
+static void __bnxt_irq_affinity_notify(struct irq_affinity_notify *notify,
+ const cpumask_t *mask)
+{
+ struct bnxt_irq *irq;
+
+ irq = container_of(notify, struct bnxt_irq, affinity_notify);
+ cpumask_copy(irq->cpu_mask, mask);
+
+ if (!pcie_tph_set_st(irq->bp->pdev, irq->msix_nr,
+ cpumask_first(irq->cpu_mask),
+ TPH_MEM_TYPE_VM, PCI_TPH_REQ_TPH_ONLY))
+ netdev_dbg(irq->bp->dev, "error in setting steering tag\n");
+
+ if (netif_running(irq->bp->dev)) {
+ rtnl_lock();
+ bnxt_close_nic(irq->bp, false, false);
+ bnxt_open_nic(irq->bp, false, false);
+ rtnl_unlock();
+ }
+}
+
+static void __bnxt_irq_affinity_release(struct kref __always_unused *ref)
+{
+}
+
+static inline void bnxt_register_affinity_notifier(struct bnxt_irq *irq)
+{
+ struct irq_affinity_notify *notify;
+
+ notify = &irq->affinity_notify;
+ notify->irq = irq->vector;
+ notify->notify = __bnxt_irq_affinity_notify;
+ notify->release = __bnxt_irq_affinity_release;
+
+ irq_set_affinity_notifier(irq->vector, notify);
+}
+
static int bnxt_request_irq(struct bnxt *bp)
{
int i, j, rc = 0;
@@ -10735,6 +10777,7 @@ static int bnxt_request_irq(struct bnxt *bp)
int numa_node = dev_to_node(&bp->pdev->dev);
irq->have_cpumask = 1;
+ irq->msix_nr = map_idx;
cpumask_set_cpu(cpumask_local_spread(i, numa_node),
irq->cpu_mask);
rc = irq_set_affinity_hint(irq->vector, irq->cpu_mask);
@@ -10744,6 +10787,17 @@ static int bnxt_request_irq(struct bnxt *bp)
irq->vector);
break;
}
+
+ if (pcie_tph_intr_vec_supported(bp->pdev)) {
+ irq->bp = bp;
+ bnxt_register_affinity_notifier(irq);
+
+ /* first setup */
+ if (!pcie_tph_set_st(bp->pdev, i,
+ cpumask_first(irq->cpu_mask),
+ TPH_MEM_TYPE_VM, PCI_TPH_REQ_TPH_ONLY))
+ netdev_dbg(bp->dev, "error in setting steering tag\n");
+ }
}
}
return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 656ab81c0272..4a841e8ccfb7 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1195,6 +1195,10 @@ struct bnxt_irq {
u8 have_cpumask:1;
char name[IFNAMSIZ + 2];
cpumask_var_t cpu_mask;
+
+ int msix_nr;
+ struct bnxt *bp;
+ struct irq_affinity_notify affinity_notify;
};
#define HWRM_RING_ALLOC_TX 0x1
--
2.45.1
^ permalink raw reply related [flat|nested] 36+ messages in thread* Re: [PATCH V3 09/10] bnxt_en: Add TPH support in BNXT driver
2024-07-17 20:55 ` [PATCH V3 09/10] bnxt_en: Add TPH support in BNXT driver Wei Huang
@ 2024-07-23 16:48 ` Bjorn Helgaas
2024-08-02 5:44 ` Wei Huang
0 siblings, 1 reply; 36+ messages in thread
From: Bjorn Helgaas @ 2024-07-23 16:48 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On Wed, Jul 17, 2024 at 03:55:10PM -0500, Wei Huang wrote:
> From: Manoj Panicker <manoj.panicker2@amd.com>
>
> Implement TPH support in Broadcom BNXT device driver by invoking
> pcie_tph_set_st() function when interrupt affinity is changed.
*and* invoking pcie_tph_set_st() when setting up the IRQ in the first
place, I guess?
I guess this gives a significant performance benefit? The series
includes "pci=nostmode" so the benefit can be quantified, so now I'm
curious about what you measured :)
> +static void bnxt_rtnl_lock_sp(struct bnxt *bp);
> +static void bnxt_rtnl_unlock_sp(struct bnxt *bp);
These duplicate declarations can't be right, can they? OK for
work-in-progress, but it doesn't look like the final solution.
> +static void __bnxt_irq_affinity_notify(struct irq_affinity_notify *notify,
> + const cpumask_t *mask)
> +{
> + struct bnxt_irq *irq;
> +
> + irq = container_of(notify, struct bnxt_irq, affinity_notify);
> + cpumask_copy(irq->cpu_mask, mask);
> +
> + if (!pcie_tph_set_st(irq->bp->pdev, irq->msix_nr,
> + cpumask_first(irq->cpu_mask),
> + TPH_MEM_TYPE_VM, PCI_TPH_REQ_TPH_ONLY))
> + netdev_dbg(irq->bp->dev, "error in setting steering tag\n");
> +
> + if (netif_running(irq->bp->dev)) {
> + rtnl_lock();
> + bnxt_close_nic(irq->bp, false, false);
> + bnxt_open_nic(irq->bp, false, false);
> + rtnl_unlock();
> + }
> +}
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH V3 09/10] bnxt_en: Add TPH support in BNXT driver
2024-07-23 16:48 ` Bjorn Helgaas
@ 2024-08-02 5:44 ` Wei Huang
2024-08-02 17:00 ` Bjorn Helgaas
0 siblings, 1 reply; 36+ messages in thread
From: Wei Huang @ 2024-08-02 5:44 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On 7/23/24 11:48, Bjorn Helgaas wrote:
> On Wed, Jul 17, 2024 at 03:55:10PM -0500, Wei Huang wrote:
>> From: Manoj Panicker <manoj.panicker2@amd.com>
>>
>> Implement TPH support in Broadcom BNXT device driver by invoking
>> pcie_tph_set_st() function when interrupt affinity is changed.
>
> *and* invoking pcie_tph_set_st() when setting up the IRQ in the first
> place, I guess?
>
> I guess this gives a significant performance benefit? The series
> includes "pci=nostmode" so the benefit can be quantified, so now I'm
> curious about what you measured :)
Using network benchmarks, three main metrics were measured: network
latency, network bandwidth, and memory bandwidth saving.
>
>> +static void bnxt_rtnl_lock_sp(struct bnxt *bp);
>> +static void bnxt_rtnl_unlock_sp(struct bnxt *bp);
>
> These duplicate declarations can't be right, can they? OK for
> work-in-progress, but it doesn't look like the final solution.
Will fix.
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH V3 09/10] bnxt_en: Add TPH support in BNXT driver
2024-08-02 5:44 ` Wei Huang
@ 2024-08-02 17:00 ` Bjorn Helgaas
0 siblings, 0 replies; 36+ messages in thread
From: Bjorn Helgaas @ 2024-08-02 17:00 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
corbet, davem, edumazet, kuba, pabeni, alex.williamson, gospo,
michael.chan, ajit.khaparde, somnath.kotur, andrew.gospodarek,
manoj.panicker2, Eric.VanTassell, vadim.fedorenko, horms,
bagasdotme, bhelgaas
On Fri, Aug 02, 2024 at 12:44:07AM -0500, Wei Huang wrote:
>
>
> On 7/23/24 11:48, Bjorn Helgaas wrote:
> > On Wed, Jul 17, 2024 at 03:55:10PM -0500, Wei Huang wrote:
> > > From: Manoj Panicker <manoj.panicker2@amd.com>
> > >
> > > Implement TPH support in Broadcom BNXT device driver by invoking
> > > pcie_tph_set_st() function when interrupt affinity is changed.
> >
> > *and* invoking pcie_tph_set_st() when setting up the IRQ in the first
> > place, I guess?
> >
> > I guess this gives a significant performance benefit? The series
> > includes "pci=nostmode" so the benefit can be quantified, so now I'm
> > curious about what you measured :)
>
> Using network benchmarks, three main metrics were measured: network latency,
> network bandwidth, and memory bandwidth saving.
I was hoping you could share actual percentage improvements to justify
the new code. If there's no improvement, obviously there would be no
point in adding the code. If there's significant improvement, it will
encourage using this in other drivers, which will improve the code and
testing for everybody.
Bjorn
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH V3 10/10] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings
2024-07-17 20:55 [PATCH V3 00/10] PCIe TPH and cache direct injection support Wei Huang
` (8 preceding siblings ...)
2024-07-17 20:55 ` [PATCH V3 09/10] bnxt_en: Add TPH support in BNXT driver Wei Huang
@ 2024-07-17 20:55 ` Wei Huang
2024-07-20 8:08 ` [PATCH V3 00/10] PCIe TPH and cache direct injection support Lukas Wunner
2024-07-20 19:25 ` David Wei
11 siblings, 0 replies; 36+ messages in thread
From: Wei Huang @ 2024-07-17 20:55 UTC (permalink / raw)
To: linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, wei.huang2, vadim.fedorenko, horms, bagasdotme,
bhelgaas
From: Michael Chan <michael.chan@broadcom.com>
Newer firmware can use the NQ ring ID associated with each RX/RX AGG
ring to enable PCIe steering tag. Older firmware will just ignore the
information.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 2207dac8ce18..308b4747d041 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6699,10 +6699,12 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
/* Association of rx ring with stats context */
grp_info = &bp->grp_info[ring->grp_idx];
+ req->nq_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
req->rx_buf_size = cpu_to_le16(bp->rx_buf_use_size);
req->stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
req->enables |= cpu_to_le32(
- RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
+ RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID |
+ RING_ALLOC_REQ_ENABLES_NQ_RING_ID_VALID);
if (NET_IP_ALIGN == 2)
flags = RING_ALLOC_REQ_FLAGS_RX_SOP_PAD;
req->flags = cpu_to_le16(flags);
@@ -6714,11 +6716,13 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
/* Association of agg ring with rx ring */
grp_info = &bp->grp_info[ring->grp_idx];
req->rx_ring_id = cpu_to_le16(grp_info->rx_fw_ring_id);
+ req->nq_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
req->rx_buf_size = cpu_to_le16(BNXT_RX_PAGE_SIZE);
req->stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
req->enables |= cpu_to_le32(
RING_ALLOC_REQ_ENABLES_RX_RING_ID_VALID |
- RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
+ RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID |
+ RING_ALLOC_REQ_ENABLES_NQ_RING_ID_VALID);
} else {
req->ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
}
--
2.45.1
^ permalink raw reply related [flat|nested] 36+ messages in thread* Re: [PATCH V3 00/10] PCIe TPH and cache direct injection support
2024-07-17 20:55 [PATCH V3 00/10] PCIe TPH and cache direct injection support Wei Huang
` (9 preceding siblings ...)
2024-07-17 20:55 ` [PATCH V3 10/10] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings Wei Huang
@ 2024-07-20 8:08 ` Lukas Wunner
2024-07-22 14:44 ` Wei Huang
2024-07-20 19:25 ` David Wei
11 siblings, 1 reply; 36+ messages in thread
From: Lukas Wunner @ 2024-07-20 8:08 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
gospo, michael.chan, ajit.khaparde, somnath.kotur,
andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
vadim.fedorenko, horms, bagasdotme, bhelgaas, Paul Luse, Jing Liu
[cc += Paul Luse, Jing Liu]
On Wed, Jul 17, 2024 at 03:55:01PM -0500, Wei Huang wrote:
> TPH (TLP Processing Hints) is a PCIe feature that allows endpoint devices to
> provide optimization hints for requests that target memory space. These hints,
> in a format called steering tag (ST), are provided in the requester's TLP
> headers and allow the system hardware, including the Root Complex, to
> optimize the utilization of platform resources for the requests.
[...]
> This series introduces generic TPH support in Linux, allowing STs to be
> retrieved from ACPI _DSM (as defined by ACPI) and used by PCIe endpoint
> drivers as needed. As a demonstration, it includes an example usage in the
> Broadcom BNXT driver. When running on Broadcom NICs with the appropriate
> firmware, Cache Injection shows substantial memory bandwidth savings and
> better network bandwidth using real-world benchmarks. This solution is
> vendor-neutral, as both TPH and ACPI _DSM are industry standards.
I think you need to add support for saving and restoring TPH registers,
otherwise the changes you make to those registers may not survive
reset recovery or system sleep. Granted, system sleep may not be
relevant for servers (which I assume you're targeting with your patches),
but reset recovery very much is.
Paul Luse submitted a patch two years ago to save and restore
TPH registers, perhaps you can include it in your patch set?
https://lore.kernel.org/all/20220712123641.2319-1-paul.e.luse@intel.com/
Bjorn left some comments on Paul's patch:
https://lore.kernel.org/all/20220912214516.GA538566@bhelgaas/
In particular, Bjorn asked for shared infrastructure to access
TPH registers (which you're adding in your patch set) and spotted
several nits (which should be easy to address). So I think you may
be able to integrate Paul's patch into your series without too much
effort.
However note that when writing to TPH registers through the API you're
introducing, you also need to update the saved register state so that
those changes aren't lost upon a subsequent reset recovery.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH V3 00/10] PCIe TPH and cache direct injection support
2024-07-20 8:08 ` [PATCH V3 00/10] PCIe TPH and cache direct injection support Lukas Wunner
@ 2024-07-22 14:44 ` Wei Huang
2024-07-22 14:58 ` Lukas Wunner
0 siblings, 1 reply; 36+ messages in thread
From: Wei Huang @ 2024-07-22 14:44 UTC (permalink / raw)
To: Lukas Wunner
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
gospo, michael.chan, ajit.khaparde, somnath.kotur,
andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
vadim.fedorenko, horms, bagasdotme, bhelgaas, Paul Luse, Jing Liu
On 7/20/24 03:08, Lukas Wunner wrote:
> [cc += Paul Luse, Jing Liu]
>
> On Wed, Jul 17, 2024 at 03:55:01PM -0500, Wei Huang wrote:
>> TPH (TLP Processing Hints) is a PCIe feature that allows endpoint devices to
>> provide optimization hints for requests that target memory space. These hints,
>> in a format called steering tag (ST), are provided in the requester's TLP
>> headers and allow the system hardware, including the Root Complex, to
>> optimize the utilization of platform resources for the requests.
> [...]
>> This series introduces generic TPH support in Linux, allowing STs to be
>> retrieved from ACPI _DSM (as defined by ACPI) and used by PCIe endpoint
>> drivers as needed. As a demonstration, it includes an example usage in the
>> Broadcom BNXT driver. When running on Broadcom NICs with the appropriate
>> firmware, Cache Injection shows substantial memory bandwidth savings and
>> better network bandwidth using real-world benchmarks. This solution is
>> vendor-neutral, as both TPH and ACPI _DSM are industry standards.
>
> I think you need to add support for saving and restoring TPH registers,
> otherwise the changes you make to those registers may not survive
> reset recovery or system sleep. Granted, system sleep may not be
> relevant for servers (which I assume you're targeting with your patches),
> but reset recovery very much is.
>
> Paul Luse submitted a patch two years ago to save and restore
> TPH registers, perhaps you can include it in your patch set?
Thanks for pointing them out. I skimmed through Paul's patch and it is
straightforward to integrate.
Depending on Bjorn's preference, I can either integrate it into my
patchset with full credits to Paul and Jing, or Paul want to resubmit a
new version.
>
> https://lore.kernel.org/all/20220712123641.2319-1-paul.e.luse@intel.com/
>
> Bjorn left some comments on Paul's patch:
>
> https://lore.kernel.org/all/20220912214516.GA538566@bhelgaas/
>
> In particular, Bjorn asked for shared infrastructure to access
> TPH registers (which you're adding in your patch set) and spotted
> several nits (which should be easy to address). So I think you may
> be able to integrate Paul's patch into your series without too much
> effort.
I read Bjorn's comments, lots of them have been addressed in my patchset
(e.g. move under /pci/pcie, support _DSM and dev->tph). So, as I said,
integration is doable.
>
> However note that when writing to TPH registers through the API you're
> introducing, you also need to update the saved register state so that
> those changes aren't lost upon a subsequent reset recovery.
>
> Thanks,
>
> Lukas
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH V3 00/10] PCIe TPH and cache direct injection support
2024-07-22 14:44 ` Wei Huang
@ 2024-07-22 14:58 ` Lukas Wunner
0 siblings, 0 replies; 36+ messages in thread
From: Lukas Wunner @ 2024-07-22 14:58 UTC (permalink / raw)
To: Wei Huang
Cc: linux-pci, linux-kernel, linux-doc, netdev, Jonathan.Cameron,
helgaas, corbet, davem, edumazet, kuba, pabeni, alex.williamson,
gospo, michael.chan, ajit.khaparde, somnath.kotur,
andrew.gospodarek, manoj.panicker2, Eric.VanTassell,
vadim.fedorenko, horms, bagasdotme, bhelgaas, Paul Luse, Jing Liu
On Mon, Jul 22, 2024 at 09:44:32AM -0500, Wei Huang wrote:
> On 7/20/24 03:08, Lukas Wunner wrote:
> > Paul Luse submitted a patch two years ago to save and restore
> > TPH registers, perhaps you can include it in your patch set?
>
> Thanks for pointing them out. I skimmed through Paul's patch and it is
> straightforward to integrate.
>
> Depending on Bjorn's preference, I can either integrate it into my
> patchset with full credits to Paul and Jing, or Paul want to resubmit a
> new version.
The former would likely be better as I'm not sure Paul has the time
to respin the patch. My recollection is that TPH save/restore support
was dropped as a requirement for the Intel device this was originally
developed for, but it would be a shame to lose the time and effort
that already went into it and I think it might be useful for your
use case as well to support reset recovery.
> I read Bjorn's comments, lots of them have been addressed in my patchset
> (e.g. move under /pci/pcie, support _DSM and dev->tph).
Indeed, good job!
Thanks for taking a look!
Lukas
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH V3 00/10] PCIe TPH and cache direct injection support
2024-07-17 20:55 [PATCH V3 00/10] PCIe TPH and cache direct injection support Wei Huang
` (10 preceding siblings ...)
2024-07-20 8:08 ` [PATCH V3 00/10] PCIe TPH and cache direct injection support Lukas Wunner
@ 2024-07-20 19:25 ` David Wei
2024-07-22 15:38 ` Wei Huang
11 siblings, 1 reply; 36+ messages in thread
From: David Wei @ 2024-07-20 19:25 UTC (permalink / raw)
To: Wei Huang, linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas
On 2024-07-17 13:55, Wei Huang wrote:
> Hi All,
>
> TPH (TLP Processing Hints) is a PCIe feature that allows endpoint devices to
> provide optimization hints for requests that target memory space. These hints,
> in a format called steering tag (ST), are provided in the requester's TLP
> headers and allow the system hardware, including the Root Complex, to
> optimize the utilization of platform resources for the requests.
>
> Upcoming AMD hardware implement a new Cache Injection feature that leverages
> TPH. Cache Injection allows PCIe endpoints to inject I/O Coherent DMA writes
> directly into an L2 within the CCX (core complex) closest to the CPU core that
> will consume it. This technology is aimed at applications requiring high
> performance and low latency, such as networking and storage applications.
This sounds very exciting Wei and it's good to see bnxt support. When
you say 'upcoming AMD hardware' are you able to share exactly which? I
would like to try this out.
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH V3 00/10] PCIe TPH and cache direct injection support
2024-07-20 19:25 ` David Wei
@ 2024-07-22 15:38 ` Wei Huang
0 siblings, 0 replies; 36+ messages in thread
From: Wei Huang @ 2024-07-22 15:38 UTC (permalink / raw)
To: David Wei, linux-pci, linux-kernel, linux-doc, netdev
Cc: Jonathan.Cameron, helgaas, corbet, davem, edumazet, kuba, pabeni,
alex.williamson, gospo, michael.chan, ajit.khaparde,
somnath.kotur, andrew.gospodarek, manoj.panicker2,
Eric.VanTassell, vadim.fedorenko, horms, bagasdotme, bhelgaas
On 7/20/24 14:25, David Wei wrote:
> On 2024-07-17 13:55, Wei Huang wrote:
>> Hi All,
>>
>> TPH (TLP Processing Hints) is a PCIe feature that allows endpoint devices to
>> provide optimization hints for requests that target memory space. These hints,
>> in a format called steering tag (ST), are provided in the requester's TLP
>> headers and allow the system hardware, including the Root Complex, to
>> optimize the utilization of platform resources for the requests.
>>
>> Upcoming AMD hardware implement a new Cache Injection feature that leverages
>> TPH. Cache Injection allows PCIe endpoints to inject I/O Coherent DMA writes
>> directly into an L2 within the CCX (core complex) closest to the CPU core that
>> will consume it. This technology is aimed at applications requiring high
>> performance and low latency, such as networking and storage applications.
>
> This sounds very exciting Wei and it's good to see bnxt support. When
> you say 'upcoming AMD hardware' are you able to share exactly which? I
> would like to try this out.
I can't specify which server platforms yet. But you can find this
feature in either BIOS options or decode it from ACPI DSDT table (search
UUID e5c937d0-3553-4d7a-9117-ea4d19c3434d, Func 0x0F).
^ permalink raw reply [flat|nested] 36+ messages in thread