From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 01D0CC6FD1D for ; Wed, 15 Mar 2023 08:36:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Subject:Cc:To:From:Message-ID:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=Z8p3hbe+UZgGHgWOo4hw7k2zsLnBqu0K/O/tpPRJ01M=; b=GIXZ1c01se7VEO 644mtbrQfbNwccJ59l79SVAZuRU15r3fffenhGV9XUzqi0/NHQ38ddlCEn2RAVVuCXsWhpQUhc94U m6WBHapMBygwbRmRz890eNMxUllcSUHdmDnBHjUWwzeVOKuEdxjdlOnEZP+gv190Sx652f7zJNl3x 6p+jYydQ+VTu7hcfilo2QTiD4NoroTkxA5YIW9maVj7KxWXnpejwjvur9aj2PdIxuk3HpigOH0OqH IuYC/bvtYTn6iX+l2wpVJ2wogeKE0yjFeaGOHiyJGVeRhfS8sOjnEgVy1NCLEtEntmMUw9l/r05S5 Pr0IvWSd5k3Ny+b8BZhQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1pcMbE-00CnG3-1B; Wed, 15 Mar 2023 08:35:08 +0000 Received: from ams.source.kernel.org ([145.40.68.75]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1pcMbA-00CnFN-0n for linux-arm-kernel@lists.infradead.org; Wed, 15 Mar 2023 08:35:06 +0000 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 79DFCB81D7C; Wed, 15 Mar 2023 08:35:02 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E8463C433EF; Wed, 15 Mar 2023 08:35:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1678869301; bh=+BmXxfvZXwRDerzGHfN4EbB/C3wBHYToOMVmkqKkxN8=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=mLbDaS1WKzIAopO4iCsZKqoQ5qLxrOEj7tbLMgc1sewfplPD+C/w9E2DT5ABDp4xJ EmOOy+Zbf2YsKMycLOeNMUlCtuHkKyTJ419g8ht5tu+DbuhmQ0OhFpvpkaNL0HcShk CpaEtPiWb0dl+N9IfjoqoIsQ04yBWAJlWr/rNLpJysaDiag+CMdZAeAmbftPXHIUVC uak8IAOrj6RiGYHJJWaMkZ+II2ZzW7bWlIMxVLyZaSwcUVkE+IAZosNS2bl/1q4GW2 0SCVYhwSee273q9tgvzcWoR5qPpLqwkDFtnwaa2ChzlzI8oIEkdmop9pH4Fhq1C/1p 4dkzAGPqBYlYg== Received: from 82-132-239-126.dab.02.net ([82.132.239.126] helo=wait-a-minute.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pcMb1-000Cum-EX; Wed, 15 Mar 2023 08:34:58 +0000 Date: Wed, 15 Mar 2023 08:34:52 +0000 Message-ID: <871qlqif9v.wl-maz@kernel.org> From: Marc Zyngier To: Shanker Donthineni Cc: Catalin Marinas , Will Deacon , Jonathan Corbet , Mark Rutland , Lorenzo Pieralisi , "Sudeep\ Holla" , Thomas Gleixner , , , , Vikram Sethi , Thierry Reding Subject: Re: [PATCH v2] irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4 In-Reply-To: <20230314135128.2930580-1-sdonthineni@nvidia.com> References: <20230314135128.2930580-1-sdonthineni@nvidia.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") X-SA-Exim-Connect-IP: 82.132.239.126 X-SA-Exim-Rcpt-To: sdonthineni@nvidia.com, catalin.marinas@arm.com, will@kernel.org, corbet@lwn.net, mark.rutland@arm.com, lpieralisi@kernel.org, sudeep.holla@arm.com, tglx@linutronix.de, linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, vsethi@nvidia.com, treding@nvidia.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230315_013504_588493_660B2841 X-CRM114-Status: GOOD ( 53.75 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Tue, 14 Mar 2023 13:51:28 +0000, Shanker Donthineni wrote: > > The T241 platform suffers from the T241-FABRIC-4 erratum which causes > unexpected behavior in the GIC when multiple transactions are received > simultaneously from different sources. This hardware issue impacts > NVIDIA server platforms that use more than two T241 chips > interconnected. Each chip has support for 320 {E}SPIs. > > This issue occurs when multiple packets from different GICs are > incorrectly interleaved at the target chip. The erratum text below > specifies exactly what can cause multiple transfer packets susceptible > to interleaving and GIC state corruption. GIC state corruption can > lead to a range of problems, including kernel panics, and unexpected > behavior. > > From the erratum text: > "In some cases, inter-socket AXI4 Stream packets with multiple > transfers, may be interleaved by the fabric when presented to ARM > Generic Interrupt Controller. GIC expects all transfers of a packet > to be delivered without any interleaving. > > The following GICv3 commands may result in multiple transfer packets > over inter-socket AXI4 Stream interface: > - Register reads from GICD_I* and GICD_N* > - Register writes to 64-bit GICD registers other than GICD_IROUTERn* > - ITS command MOVALL > > Multiple commands in GICv4+ utilize multiple transfer packets, > including VMOVP, VMOVI and VMAPP. > > This issue impacts system configurations with more than 2 sockets, > that require multi-transfer packets to be sent over inter-socket > AXI4 Stream interface between GIC instances on different sockets. > GICv4 cannot be supported. GICv3 SW model can only be supported > with the workaround. Single and Dual socket configurations are not > impacted by this issue and support GICv3 and GICv4." > > Link: https://developer.nvidia.com/docs/t241-fabric-4/nvidia-t241-fabric-4-errata.pdf > > Writing to the chip alias region of the GICD_In{E} registers except > GICD_ICENABLERn has an equivalent effect as writing to the global > distributor. The SPI interrupt deactivate path is not impacted by > the erratum. > > To fix this problem, implement a workaround that ensures read accesses > to the GICD_In{E} registers are directed to the chip that owns the > SPI, and disables GICv4.x features for KVM. To simplify code changes, > the gic_configure_irq() function uses the same alias region for both > read and write operations to GICD_ICFGR. > > Co-developed-by: Vikram Sethi > Signed-off-by: Vikram Sethi > Signed-off-by: Shanker Donthineni > --- > Changes since v1: > - Use SMCCC SOC-ID API for detecting the T241 chip > - Implement Marc's suggestions > - Edit commit text > > Documentation/arm64/silicon-errata.rst | 2 + > drivers/firmware/smccc/smccc.c | 13 +++ > drivers/irqchip/irq-gic-v3.c | 116 +++++++++++++++++++++++-- > include/linux/arm-smccc.h | 1 + > 4 files changed, 123 insertions(+), 9 deletions(-) > > diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst > index ec5f889d76819..e31f6c0687041 100644 > --- a/Documentation/arm64/silicon-errata.rst > +++ b/Documentation/arm64/silicon-errata.rst > @@ -172,6 +172,8 @@ stable kernels. > +----------------+-----------------+-----------------+-----------------------------+ > | NVIDIA | Carmel Core | N/A | NVIDIA_CARMEL_CNP_ERRATUM | > +----------------+-----------------+-----------------+-----------------------------+ > +| NVIDIA | T241 GICv3/4.x | T241-FABRIC-4 | N/A | > ++----------------+-----------------+-----------------+-----------------------------+ > +----------------+-----------------+-----------------+-----------------------------+ > | Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 | > +----------------+-----------------+-----------------+-----------------------------+ > diff --git a/drivers/firmware/smccc/smccc.c b/drivers/firmware/smccc/smccc.c > index 60ccf3e90d7de..cb688121270b8 100644 > --- a/drivers/firmware/smccc/smccc.c > +++ b/drivers/firmware/smccc/smccc.c > @@ -17,9 +17,12 @@ static enum arm_smccc_conduit smccc_conduit = SMCCC_CONDUIT_NONE; > > bool __ro_after_init smccc_trng_available = false; > u64 __ro_after_init smccc_has_sve_hint = false; > +s32 __ro_after_init smccc_soc_id_version = SMCCC_RET_NOT_SUPPORTED; > > void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit) > { > + struct arm_smccc_res res; > + > smccc_version = version; > smccc_conduit = conduit; > > @@ -27,6 +30,16 @@ void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit) > if (IS_ENABLED(CONFIG_ARM64_SVE) && > smccc_version >= ARM_SMCCC_VERSION_1_3) > smccc_has_sve_hint = true; > + > + if ((smccc_version >= ARM_SMCCC_VERSION_1_2) && > + (smccc_conduit != SMCCC_CONDUIT_NONE)) { > + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, > + ARM_SMCCC_ARCH_SOC_ID, &res); > + if ((s32)res.a0 >= 0) { > + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 0, &res); > + smccc_soc_id_version = (s32)res.a0; > + } > + } Please don't duplicate existing code. There is already the required infrastructure in drivers/firmware/smccc/soc_id.c. All you need to do is: - disassociate the SMCCC probing from the device registration - probe the SOC_ID early - add accessors for the relevant data - select ARM_SMCCC_SOD_ID/ARM_SMCCC_DISCOVERY from the GICv3 Kconfig > } > > enum arm_smccc_conduit arm_smccc_1_1_get_conduit(void) > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c > index 997104d4338e7..02c699ce92b12 100644 > --- a/drivers/irqchip/irq-gic-v3.c > +++ b/drivers/irqchip/irq-gic-v3.c > @@ -24,6 +24,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -57,8 +58,14 @@ struct gic_chip_data { > bool has_rss; > unsigned int ppi_nr; > struct partition_desc **ppi_descs; > + phys_addr_t dist_phys_base; Place this field next to its mapping. > }; > > + Spurious blank line. > +#define T241_CHIPS_MAX 4 > +static void __iomem *t241_dist_base_alias[T241_CHIPS_MAX]; Make this __read_mostly. > +static DEFINE_STATIC_KEY_FALSE(gic_nvidia_t241_erratum); > + > static struct gic_chip_data gic_data __read_mostly; > static DEFINE_STATIC_KEY_TRUE(supports_deactivate_key); > > @@ -188,6 +195,34 @@ static inline bool gic_irq_in_rdist(struct irq_data *d) > } > } > > +static inline void __iomem *gic_dist_base_alias(irq_hw_number_t intid) Have this function take a struct irq_data * instead, and only perform the conversion in the workaround code. > +{ > + u32 chip; Move this into the if (). > + > + if (static_branch_unlikely(&gic_nvidia_t241_erratum)) { > + /** > + * {E}SPI mappings for all 4 chips > + * Chip0 = 32-351 > + * Chip1 = 352-671 > + * Chip2 = 672-991 > + * Chip3 = 4096-4415 > + */ > + switch (__get_intid_range(intid)) { > + case SPI_RANGE: > + chip = (intid - 32) / 320; > + break; > + case ESPI_RANGE: > + chip = 3; > + break; > + default: > + unreachable(); > + } > + return t241_dist_base_alias[chip]; > + } > + > + return gic_data.dist_base; > +} > + > static inline void __iomem *gic_dist_base(struct irq_data *d) > { > switch (get_intid_range(d)) { > @@ -343,10 +378,14 @@ static int gic_peek_irq(struct irq_data *d, u32 offset) > offset = convert_offset_index(d, offset, &index); > mask = 1 << (index % 32); > > + /** > + * For the erratum T241-FABRIC-4, read accesses to the GICD_In{E} > + * registers are directed to the chip that owns the SPI. > + */ Move this into the workaround code, and drop the ** at the beginning of the comment, > if (gic_irq_in_rdist(d)) > base = gic_data_rdist_sgi_base(); > else > - base = gic_data.dist_base; > + base = gic_dist_base_alias(irqd_to_hwirq(d)); > > return !!(readl_relaxed(base + offset + (index / 32) * 4) & mask); > } > @@ -594,13 +633,17 @@ static int gic_set_type(struct irq_data *d, unsigned int type) > type != IRQ_TYPE_LEVEL_HIGH && type != IRQ_TYPE_EDGE_RISING) > return -EINVAL; > > + /** > + * For the erratum T241-FABRIC-4, read accesses to the GICD_In{E} > + * registers are directed to the chip that owns the SPI. Use the > + * same alias region for GICD_ICFGR writes to simplify code. > + */ Drop this and merge it with the above comment as required. > if (gic_irq_in_rdist(d)) > base = gic_data_rdist_sgi_base(); > else > - base = gic_data.dist_base; > + base = gic_dist_base_alias(irqd_to_hwirq(d)); > > offset = convert_offset_index(d, GICD_ICFGR, &index); > - Spurious blank line change. > ret = gic_configure_irq(index, type, base + offset, NULL); > if (ret && (range == PPI_RANGE || range == EPPI_RANGE)) { > /* Misconfigured PPIs are usually not fatal */ > @@ -1719,6 +1762,45 @@ static bool gic_enable_quirk_hip06_07(void *data) > return false; > } > > +#define T241_CHIPN_MASK GENMASK_ULL(45, 44) > +#define T241_CHIP_GICDA_OFFSET 0x1580000 > +#define SMCCC_SOC_ID_MASK 0x7f7fffff > +#define SMCCC_SOC_ID_T241 0x036b0241 > + > +static bool gic_enable_quirk_nvidia_t241(void *data) > +{ > + unsigned long chip_bmask = 0; > + phys_addr_t phys; > + u32 i; > + > + /* Check JEP106 code for NVIDIA T241 chip (036b:0241) */ > + if ((smccc_soc_id_version < 0) || > + ((smccc_soc_id_version & SMCCC_SOC_ID_MASK) != SMCCC_SOC_ID_T241)) { > + return false; > + } > + > + /* Find the chips based on GICR regions PHYS addr */ > + for (i = 0; i < gic_data.nr_redist_regions; i++) { > + chip_bmask |= BIT(FIELD_GET(T241_CHIPN_MASK, > + gic_data.redist_regions[i].phys_base)); > + } > + > + if (hweight32(chip_bmask) < 3) > + return false; > + > + /* Setup GICD alias regions */ > + for (i = 0; i < ARRAY_SIZE(t241_dist_base_alias); i++) { > + if (chip_bmask & BIT(i)) { > + phys = gic_data.dist_phys_base + T241_CHIP_GICDA_OFFSET; > + phys |= FIELD_PREP(T241_CHIPN_MASK, i); > + t241_dist_base_alias[i] = ioremap(phys, SZ_64K); > + WARN_ON_ONCE(!t241_dist_base_alias[i]); > + } > + } > + static_branch_enable(&gic_nvidia_t241_erratum); > + return true; > +} > + > static const struct gic_quirk gic_quirks[] = { > { > .desc = "GICv3: Qualcomm MSM8996 broken firmware", > @@ -1750,6 +1832,12 @@ static const struct gic_quirk gic_quirks[] = { > .mask = 0xe8f00fff, > .init = gic_enable_quirk_cavium_38539, > }, > + { > + .desc = "GICv3: NVIDIA erratum T241-FABRIC-4", > + .iidr = 0x0402043b, > + .mask = 0xffffffff, > + .init = gic_enable_quirk_nvidia_t241, > + }, > { > } > }; > @@ -1821,7 +1909,8 @@ static int __init gic_init_bases(void __iomem *dist_base, > struct redist_region *rdist_regs, > u32 nr_redist_regions, > u64 redist_stride, > - struct fwnode_handle *handle) > + struct fwnode_handle *handle, > + phys_addr_t dist_phys_base) Pass dist_phys_base as a the first parameter, not the last. > { > u32 typer; > int err; > @@ -1834,6 +1923,7 @@ static int __init gic_init_bases(void __iomem *dist_base, > > gic_data.fwnode = handle; > gic_data.dist_base = dist_base; > + gic_data.dist_phys_base = dist_phys_base; > gic_data.redist_regions = rdist_regs; > gic_data.nr_redist_regions = nr_redist_regions; > gic_data.redist_stride = redist_stride; > @@ -2072,11 +2162,12 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare > void __iomem *dist_base; > struct redist_region *rdist_regs; > struct resource res; > + struct resource dist_res; Surely you don't need a full struct resource to keep the distributor base address around... > u64 redist_stride; > u32 nr_redist_regions; > int err, i; > > - dist_base = gic_of_iomap(node, 0, "GICD", &res); > + dist_base = gic_of_iomap(node, 0, "GICD", &dist_res); > if (IS_ERR(dist_base)) { > pr_err("%pOF: unable to map gic dist registers\n", node); > return PTR_ERR(dist_base); > @@ -2114,7 +2205,7 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare > gic_enable_of_quirks(node, gic_quirks, &gic_data); > > err = gic_init_bases(dist_base, rdist_regs, nr_redist_regions, > - redist_stride, &node->fwnode); > + redist_stride, &node->fwnode, dist_res.start); > if (err) > goto out_unmap_rdist; > > @@ -2377,8 +2468,14 @@ static void __init gic_acpi_setup_kvm_info(void) > vcpu->end = vcpu->start + ACPI_GICV2_VCPU_MEM_SIZE - 1; > } > > - gic_v3_kvm_info.has_v4 = gic_data.rdists.has_vlpis; > - gic_v3_kvm_info.has_v4_1 = gic_data.rdists.has_rvpeid; > + /* Disable GICv4.x features for the erratum T241-FABRIC-4 */ > + if (static_branch_unlikely(&gic_nvidia_t241_erratum)) { > + gic_v3_kvm_info.has_v4 = false; > + gic_v3_kvm_info.has_v4_1 = false; > + } else { > + gic_v3_kvm_info.has_v4 = gic_data.rdists.has_vlpis; > + gic_v3_kvm_info.has_v4_1 = gic_data.rdists.has_rvpeid; > + } This is the wrong place to do this. We want to disable *all* the GICv4.1 features, and not just KVM's view. Specially given that there are a bunch of fields that are evaluated in the ITS driver. Something like the change below seems more logical. Maybe you can salvage direct_lpi from the disaster, but that's about it. diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index fd134e1f481a..a68d33b4523f 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -1841,10 +1841,13 @@ static int __init gic_init_bases(void __iomem *dist_base, gic_data.domain = irq_domain_create_tree(handle, &gic_irq_domain_ops, &gic_data); gic_data.rdists.rdist = alloc_percpu(typeof(*gic_data.rdists.rdist)); - gic_data.rdists.has_rvpeid = true; - gic_data.rdists.has_vlpis = true; - gic_data.rdists.has_direct_lpi = true; - gic_data.rdists.has_vpend_valid_dirty = true; + if (!static_branch_unlikely(&gic_nvidia_t241_erratum)) { + /* Disable GICv4.x features for the erratum T241-FABRIC-4 */ + gic_data.rdists.has_rvpeid = true; + gic_data.rdists.has_vlpis = true; + gic_data.rdists.has_direct_lpi = true; + gic_data.rdists.has_vpend_valid_dirty = true; + } if (WARN_ON(!gic_data.domain) || WARN_ON(!gic_data.rdists.rdist)) { err = -ENOMEM; Thanks, M. -- Without deviation from the norm, progress is not possible. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel