From: Marc Zyngier <maz@kernel.org>
To: Shanker Donthineni <sdonthineni@nvidia.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Jonathan Corbet <corbet@lwn.net>,
Mark Rutland <mark.rutland@arm.com>,
Lorenzo Pieralisi <lpieralisi@kernel.org>,
"Sudeep\ Holla" <sudeep.holla@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
<linux-arm-kernel@lists.infradead.org>,
<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
Vikram Sethi <vsethi@nvidia.com>,
Thierry Reding <treding@nvidia.com>
Subject: Re: [PATCH v2] irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4
Date: Wed, 15 Mar 2023 08:34:52 +0000 [thread overview]
Message-ID: <871qlqif9v.wl-maz@kernel.org> (raw)
In-Reply-To: <20230314135128.2930580-1-sdonthineni@nvidia.com>
On Tue, 14 Mar 2023 13:51:28 +0000,
Shanker Donthineni <sdonthineni@nvidia.com> wrote:
>
> The T241 platform suffers from the T241-FABRIC-4 erratum which causes
> unexpected behavior in the GIC when multiple transactions are received
> simultaneously from different sources. This hardware issue impacts
> NVIDIA server platforms that use more than two T241 chips
> interconnected. Each chip has support for 320 {E}SPIs.
>
> This issue occurs when multiple packets from different GICs are
> incorrectly interleaved at the target chip. The erratum text below
> specifies exactly what can cause multiple transfer packets susceptible
> to interleaving and GIC state corruption. GIC state corruption can
> lead to a range of problems, including kernel panics, and unexpected
> behavior.
>
> From the erratum text:
> "In some cases, inter-socket AXI4 Stream packets with multiple
> transfers, may be interleaved by the fabric when presented to ARM
> Generic Interrupt Controller. GIC expects all transfers of a packet
> to be delivered without any interleaving.
>
> The following GICv3 commands may result in multiple transfer packets
> over inter-socket AXI4 Stream interface:
> - Register reads from GICD_I* and GICD_N*
> - Register writes to 64-bit GICD registers other than GICD_IROUTERn*
> - ITS command MOVALL
>
> Multiple commands in GICv4+ utilize multiple transfer packets,
> including VMOVP, VMOVI and VMAPP.
>
> This issue impacts system configurations with more than 2 sockets,
> that require multi-transfer packets to be sent over inter-socket
> AXI4 Stream interface between GIC instances on different sockets.
> GICv4 cannot be supported. GICv3 SW model can only be supported
> with the workaround. Single and Dual socket configurations are not
> impacted by this issue and support GICv3 and GICv4."
>
> Link: https://developer.nvidia.com/docs/t241-fabric-4/nvidia-t241-fabric-4-errata.pdf
>
> Writing to the chip alias region of the GICD_In{E} registers except
> GICD_ICENABLERn has an equivalent effect as writing to the global
> distributor. The SPI interrupt deactivate path is not impacted by
> the erratum.
>
> To fix this problem, implement a workaround that ensures read accesses
> to the GICD_In{E} registers are directed to the chip that owns the
> SPI, and disables GICv4.x features for KVM. To simplify code changes,
> the gic_configure_irq() function uses the same alias region for both
> read and write operations to GICD_ICFGR.
>
> Co-developed-by: Vikram Sethi <vsethi@nvidia.com>
> Signed-off-by: Vikram Sethi <vsethi@nvidia.com>
> Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
> ---
> Changes since v1:
> - Use SMCCC SOC-ID API for detecting the T241 chip
> - Implement Marc's suggestions
> - Edit commit text
>
> Documentation/arm64/silicon-errata.rst | 2 +
> drivers/firmware/smccc/smccc.c | 13 +++
> drivers/irqchip/irq-gic-v3.c | 116 +++++++++++++++++++++++--
> include/linux/arm-smccc.h | 1 +
> 4 files changed, 123 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
> index ec5f889d76819..e31f6c0687041 100644
> --- a/Documentation/arm64/silicon-errata.rst
> +++ b/Documentation/arm64/silicon-errata.rst
> @@ -172,6 +172,8 @@ stable kernels.
> +----------------+-----------------+-----------------+-----------------------------+
> | NVIDIA | Carmel Core | N/A | NVIDIA_CARMEL_CNP_ERRATUM |
> +----------------+-----------------+-----------------+-----------------------------+
> +| NVIDIA | T241 GICv3/4.x | T241-FABRIC-4 | N/A |
> ++----------------+-----------------+-----------------+-----------------------------+
> +----------------+-----------------+-----------------+-----------------------------+
> | Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 |
> +----------------+-----------------+-----------------+-----------------------------+
> diff --git a/drivers/firmware/smccc/smccc.c b/drivers/firmware/smccc/smccc.c
> index 60ccf3e90d7de..cb688121270b8 100644
> --- a/drivers/firmware/smccc/smccc.c
> +++ b/drivers/firmware/smccc/smccc.c
> @@ -17,9 +17,12 @@ static enum arm_smccc_conduit smccc_conduit = SMCCC_CONDUIT_NONE;
>
> bool __ro_after_init smccc_trng_available = false;
> u64 __ro_after_init smccc_has_sve_hint = false;
> +s32 __ro_after_init smccc_soc_id_version = SMCCC_RET_NOT_SUPPORTED;
>
> void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit)
> {
> + struct arm_smccc_res res;
> +
> smccc_version = version;
> smccc_conduit = conduit;
>
> @@ -27,6 +30,16 @@ void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit)
> if (IS_ENABLED(CONFIG_ARM64_SVE) &&
> smccc_version >= ARM_SMCCC_VERSION_1_3)
> smccc_has_sve_hint = true;
> +
> + if ((smccc_version >= ARM_SMCCC_VERSION_1_2) &&
> + (smccc_conduit != SMCCC_CONDUIT_NONE)) {
> + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> + ARM_SMCCC_ARCH_SOC_ID, &res);
> + if ((s32)res.a0 >= 0) {
> + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 0, &res);
> + smccc_soc_id_version = (s32)res.a0;
> + }
> + }
Please don't duplicate existing code. There is already the required
infrastructure in drivers/firmware/smccc/soc_id.c. All you need to do
is:
- disassociate the SMCCC probing from the device registration
- probe the SOC_ID early
- add accessors for the relevant data
- select ARM_SMCCC_SOD_ID/ARM_SMCCC_DISCOVERY from the GICv3 Kconfig
> }
>
> enum arm_smccc_conduit arm_smccc_1_1_get_conduit(void)
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 997104d4338e7..02c699ce92b12 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -24,6 +24,7 @@
> #include <linux/irqchip/arm-gic-common.h>
> #include <linux/irqchip/arm-gic-v3.h>
> #include <linux/irqchip/irq-partition-percpu.h>
> +#include <linux/arm-smccc.h>
>
> #include <asm/cputype.h>
> #include <asm/exception.h>
> @@ -57,8 +58,14 @@ struct gic_chip_data {
> bool has_rss;
> unsigned int ppi_nr;
> struct partition_desc **ppi_descs;
> + phys_addr_t dist_phys_base;
Place this field next to its mapping.
> };
>
> +
Spurious blank line.
> +#define T241_CHIPS_MAX 4
> +static void __iomem *t241_dist_base_alias[T241_CHIPS_MAX];
Make this __read_mostly.
> +static DEFINE_STATIC_KEY_FALSE(gic_nvidia_t241_erratum);
> +
> static struct gic_chip_data gic_data __read_mostly;
> static DEFINE_STATIC_KEY_TRUE(supports_deactivate_key);
>
> @@ -188,6 +195,34 @@ static inline bool gic_irq_in_rdist(struct irq_data *d)
> }
> }
>
> +static inline void __iomem *gic_dist_base_alias(irq_hw_number_t intid)
Have this function take a struct irq_data * instead, and only perform
the conversion in the workaround code.
> +{
> + u32 chip;
Move this into the if ().
> +
> + if (static_branch_unlikely(&gic_nvidia_t241_erratum)) {
> + /**
> + * {E}SPI mappings for all 4 chips
> + * Chip0 = 32-351
> + * Chip1 = 352-671
> + * Chip2 = 672-991
> + * Chip3 = 4096-4415
> + */
> + switch (__get_intid_range(intid)) {
> + case SPI_RANGE:
> + chip = (intid - 32) / 320;
> + break;
> + case ESPI_RANGE:
> + chip = 3;
> + break;
> + default:
> + unreachable();
> + }
> + return t241_dist_base_alias[chip];
> + }
> +
> + return gic_data.dist_base;
> +}
> +
> static inline void __iomem *gic_dist_base(struct irq_data *d)
> {
> switch (get_intid_range(d)) {
> @@ -343,10 +378,14 @@ static int gic_peek_irq(struct irq_data *d, u32 offset)
> offset = convert_offset_index(d, offset, &index);
> mask = 1 << (index % 32);
>
> + /**
> + * For the erratum T241-FABRIC-4, read accesses to the GICD_In{E}
> + * registers are directed to the chip that owns the SPI.
> + */
Move this into the workaround code, and drop the ** at the beginning
of the comment,
> if (gic_irq_in_rdist(d))
> base = gic_data_rdist_sgi_base();
> else
> - base = gic_data.dist_base;
> + base = gic_dist_base_alias(irqd_to_hwirq(d));
>
> return !!(readl_relaxed(base + offset + (index / 32) * 4) & mask);
> }
> @@ -594,13 +633,17 @@ static int gic_set_type(struct irq_data *d, unsigned int type)
> type != IRQ_TYPE_LEVEL_HIGH && type != IRQ_TYPE_EDGE_RISING)
> return -EINVAL;
>
> + /**
> + * For the erratum T241-FABRIC-4, read accesses to the GICD_In{E}
> + * registers are directed to the chip that owns the SPI. Use the
> + * same alias region for GICD_ICFGR writes to simplify code.
> + */
Drop this and merge it with the above comment as required.
> if (gic_irq_in_rdist(d))
> base = gic_data_rdist_sgi_base();
> else
> - base = gic_data.dist_base;
> + base = gic_dist_base_alias(irqd_to_hwirq(d));
>
> offset = convert_offset_index(d, GICD_ICFGR, &index);
> -
Spurious blank line change.
> ret = gic_configure_irq(index, type, base + offset, NULL);
> if (ret && (range == PPI_RANGE || range == EPPI_RANGE)) {
> /* Misconfigured PPIs are usually not fatal */
> @@ -1719,6 +1762,45 @@ static bool gic_enable_quirk_hip06_07(void *data)
> return false;
> }
>
> +#define T241_CHIPN_MASK GENMASK_ULL(45, 44)
> +#define T241_CHIP_GICDA_OFFSET 0x1580000
> +#define SMCCC_SOC_ID_MASK 0x7f7fffff
> +#define SMCCC_SOC_ID_T241 0x036b0241
> +
> +static bool gic_enable_quirk_nvidia_t241(void *data)
> +{
> + unsigned long chip_bmask = 0;
> + phys_addr_t phys;
> + u32 i;
> +
> + /* Check JEP106 code for NVIDIA T241 chip (036b:0241) */
> + if ((smccc_soc_id_version < 0) ||
> + ((smccc_soc_id_version & SMCCC_SOC_ID_MASK) != SMCCC_SOC_ID_T241)) {
> + return false;
> + }
> +
> + /* Find the chips based on GICR regions PHYS addr */
> + for (i = 0; i < gic_data.nr_redist_regions; i++) {
> + chip_bmask |= BIT(FIELD_GET(T241_CHIPN_MASK,
> + gic_data.redist_regions[i].phys_base));
> + }
> +
> + if (hweight32(chip_bmask) < 3)
> + return false;
> +
> + /* Setup GICD alias regions */
> + for (i = 0; i < ARRAY_SIZE(t241_dist_base_alias); i++) {
> + if (chip_bmask & BIT(i)) {
> + phys = gic_data.dist_phys_base + T241_CHIP_GICDA_OFFSET;
> + phys |= FIELD_PREP(T241_CHIPN_MASK, i);
> + t241_dist_base_alias[i] = ioremap(phys, SZ_64K);
> + WARN_ON_ONCE(!t241_dist_base_alias[i]);
> + }
> + }
> + static_branch_enable(&gic_nvidia_t241_erratum);
> + return true;
> +}
> +
> static const struct gic_quirk gic_quirks[] = {
> {
> .desc = "GICv3: Qualcomm MSM8996 broken firmware",
> @@ -1750,6 +1832,12 @@ static const struct gic_quirk gic_quirks[] = {
> .mask = 0xe8f00fff,
> .init = gic_enable_quirk_cavium_38539,
> },
> + {
> + .desc = "GICv3: NVIDIA erratum T241-FABRIC-4",
> + .iidr = 0x0402043b,
> + .mask = 0xffffffff,
> + .init = gic_enable_quirk_nvidia_t241,
> + },
> {
> }
> };
> @@ -1821,7 +1909,8 @@ static int __init gic_init_bases(void __iomem *dist_base,
> struct redist_region *rdist_regs,
> u32 nr_redist_regions,
> u64 redist_stride,
> - struct fwnode_handle *handle)
> + struct fwnode_handle *handle,
> + phys_addr_t dist_phys_base)
Pass dist_phys_base as a the first parameter, not the last.
> {
> u32 typer;
> int err;
> @@ -1834,6 +1923,7 @@ static int __init gic_init_bases(void __iomem *dist_base,
>
> gic_data.fwnode = handle;
> gic_data.dist_base = dist_base;
> + gic_data.dist_phys_base = dist_phys_base;
> gic_data.redist_regions = rdist_regs;
> gic_data.nr_redist_regions = nr_redist_regions;
> gic_data.redist_stride = redist_stride;
> @@ -2072,11 +2162,12 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
> void __iomem *dist_base;
> struct redist_region *rdist_regs;
> struct resource res;
> + struct resource dist_res;
Surely you don't need a full struct resource to keep the distributor
base address around...
> u64 redist_stride;
> u32 nr_redist_regions;
> int err, i;
>
> - dist_base = gic_of_iomap(node, 0, "GICD", &res);
> + dist_base = gic_of_iomap(node, 0, "GICD", &dist_res);
> if (IS_ERR(dist_base)) {
> pr_err("%pOF: unable to map gic dist registers\n", node);
> return PTR_ERR(dist_base);
> @@ -2114,7 +2205,7 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
> gic_enable_of_quirks(node, gic_quirks, &gic_data);
>
> err = gic_init_bases(dist_base, rdist_regs, nr_redist_regions,
> - redist_stride, &node->fwnode);
> + redist_stride, &node->fwnode, dist_res.start);
> if (err)
> goto out_unmap_rdist;
>
> @@ -2377,8 +2468,14 @@ static void __init gic_acpi_setup_kvm_info(void)
> vcpu->end = vcpu->start + ACPI_GICV2_VCPU_MEM_SIZE - 1;
> }
>
> - gic_v3_kvm_info.has_v4 = gic_data.rdists.has_vlpis;
> - gic_v3_kvm_info.has_v4_1 = gic_data.rdists.has_rvpeid;
> + /* Disable GICv4.x features for the erratum T241-FABRIC-4 */
> + if (static_branch_unlikely(&gic_nvidia_t241_erratum)) {
> + gic_v3_kvm_info.has_v4 = false;
> + gic_v3_kvm_info.has_v4_1 = false;
> + } else {
> + gic_v3_kvm_info.has_v4 = gic_data.rdists.has_vlpis;
> + gic_v3_kvm_info.has_v4_1 = gic_data.rdists.has_rvpeid;
> + }
This is the wrong place to do this. We want to disable *all* the
GICv4.1 features, and not just KVM's view. Specially given that there
are a bunch of fields that are evaluated in the ITS driver.
Something like the change below seems more logical. Maybe you can
salvage direct_lpi from the disaster, but that's about it.
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index fd134e1f481a..a68d33b4523f 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1841,10 +1841,13 @@ static int __init gic_init_bases(void __iomem *dist_base,
gic_data.domain = irq_domain_create_tree(handle, &gic_irq_domain_ops,
&gic_data);
gic_data.rdists.rdist = alloc_percpu(typeof(*gic_data.rdists.rdist));
- gic_data.rdists.has_rvpeid = true;
- gic_data.rdists.has_vlpis = true;
- gic_data.rdists.has_direct_lpi = true;
- gic_data.rdists.has_vpend_valid_dirty = true;
+ if (!static_branch_unlikely(&gic_nvidia_t241_erratum)) {
+ /* Disable GICv4.x features for the erratum T241-FABRIC-4 */
+ gic_data.rdists.has_rvpeid = true;
+ gic_data.rdists.has_vlpis = true;
+ gic_data.rdists.has_direct_lpi = true;
+ gic_data.rdists.has_vpend_valid_dirty = true;
+ }
if (WARN_ON(!gic_data.domain) || WARN_ON(!gic_data.rdists.rdist)) {
err = -ENOMEM;
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2023-03-15 8:36 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-14 13:51 [PATCH v2] irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4 Shanker Donthineni
2023-03-14 13:51 ` Shanker Donthineni
2023-03-15 8:34 ` Marc Zyngier [this message]
2023-03-15 12:27 ` Shanker Donthineni
2023-03-15 12:27 ` Shanker Donthineni
2023-03-16 15:10 ` Sudeep Holla
2023-03-16 15:10 ` Sudeep Holla
2023-03-16 16:00 ` Marc Zyngier
2023-03-16 16:00 ` Marc Zyngier
2023-03-16 22:41 ` Shanker Donthineni
2023-03-16 22:41 ` Shanker Donthineni
2023-03-15 22:07 ` kernel test robot
2023-03-15 22:07 ` kernel test robot
2023-03-15 22:48 ` kernel test robot
2023-03-15 22:48 ` kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=871qlqif9v.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=catalin.marinas@arm.com \
--cc=corbet@lwn.net \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lpieralisi@kernel.org \
--cc=mark.rutland@arm.com \
--cc=sdonthineni@nvidia.com \
--cc=sudeep.holla@arm.com \
--cc=tglx@linutronix.de \
--cc=treding@nvidia.com \
--cc=vsethi@nvidia.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.