From: Marc Zyngier <maz@kernel.org>
To: Shanker Donthineni <sdonthineni@nvidia.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Jonathan Corbet <corbet@lwn.net>,
linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, Vikram Sethi <vsethi@nvidia.com>,
Thierry Reding <treding@nvidia.com>
Subject: Re: [PATCH] irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4
Date: Sun, 12 Mar 2023 13:43:01 +0000 [thread overview]
Message-ID: <871qlu5bmi.wl-maz@kernel.org> (raw)
In-Reply-To: <ff5306c3-c72e-87a3-f21f-13cdf11e1df5@nvidia.com>
On Fri, 10 Mar 2023 14:16:34 +0000,
Shanker Donthineni <sdonthineni@nvidia.com> wrote:
>
> Hi Marc,
>
> On 3/7/23 02:32, Marc Zyngier wrote:
> > External email: Use caution opening links or attachments
> >
> >
> > On Mon, 06 Mar 2023 01:31:48 +0000,
> > Shanker Donthineni <sdonthineni@nvidia.com> wrote:
> >>
> >> The purpose of this patch is to address the T241 erratum T241-FABRIC-4,
> >> which causes unexpected behavior in the GIC when multiple transactions
> >
> > nit: "The purpose of this patch" is superfluous. Instead, write
> > something like:
> >
> > "The T241 platform suffers from the T241-FABRIC-4 erratum which
> > causes..."
> >
> I'll fix in v2 patch.
>
> >> are received simultaneously from different sources. This hardware issue
> >> impacts NVIDIA server platforms that use more than two T241 chips
> >> interconnected. Each chip has support for 320 {E}SPIs.
> >>
> >> This issue occurs when multiple packets from different GICs are
> >> incorrectly interleaved at the target chip. The erratum text below
> >> specifies exactly what can cause multiple transfer packets susceptible
> >> to interleaving and GIC state corruption. GIC state corruption can
> >> lead to a range of problems, including kernel panics, and unexpected
> >> behavior.
> >>
> >> From the erratum text:
> >> "In some cases, inter-socket AXI4 Stream packets with multiple
> >> transfers, may be interleaved by the fabric when presented to ARM
> >> Generic Interrupt Controller. GIC expects all transfers of a packet
> >> to be delivered without any interleaving.
> >>
> >> The following GICv3 commands may result in multiple transfer packets
> >> over inter-socket AXI4 Stream interface:
> >> - Register reads from GICD_I* and GICD_N*
> >> - Register writes to 64-bit GICD registers other than GICD_IROUTERn*
> >> - ITS command MOVALL
> >
> > Does is also affect cross-chip traffic such as SPI deactivation?
> >
> No, it is not impacted.
>
> >>
> >> Multiple commands in GICv4+ utilize multiple transfer packets,
> >> including VMOVP, VMOVI and VMAPP.
> >>
> >> This issue impacts system configurations with more than 2 sockets,
> >> that require multi-transfer packets to be sent over inter-socket
> >> AXI4 Stream interface between GIC instances on different sockets.
> >> GICv4 cannot be supported. GICv3 SW model can only be supported
> >> with the workaround. Single and Dual socket configurations are not
> >> impacted by this issue and support GICv3 and GICv4."
> >
> > Do you have a public link to this erratum? This is really something
> > that we should be go back to when changing things in the GIC code
> > (should we ever use MOVALL, for example).
> >
> https://developer.nvidia.com/docs/t241-fabric-4/nvidia-t241-fabric-4-errata.pdf
Great. Please add this to the commit message and a comment next to the
workaround code.
[...]
> >> +static inline void __iomem *gic_dist_base_read_alias(irq_hw_number_t intid)
> >> +{
> >> + struct dist_base_alias *base_alias;
> >> + int i;
> >> +
> >> + if (static_branch_unlikely(&gic_nvidia_t241_erratum)) {
> >> + base_alias = gic_data.base_read_aliases;
> >> + for (i = 0; i < gic_data.nr_dist_base_aliases; i++) {
> >> + if (base_alias->base &&
> >> + (intid >= base_alias->intid_start) &&
> >> + (intid <= base_alias->intid_end)) {
> >> + return base_alias->base;
> >> + }
> >> + base_alias++;
> >> + }
> >> + }
> >
> > Each distributor has the exact same number of SPIs. So why isn't this
> > just a division that gives you a distributor number?
> >
>
> I considered creating a generic function that could potentially be
> utilized in the future for other Workarounds (WARs).
>
> I'll change to this in v2.
>
> static inline void __iomem *gic_dist_base_alias(irq_hw_number_t intid)
> {
> u32 chip;
>
> if (static_branch_unlikely(&gic_nvidia_t241_erratum)) {
> /**
> * {E}SPI mappings for all 4 chips
> * Chip0 = 32-351
> * Chip1 = 52-671
s/52/352/, right?
> * Chip2 = 672-991
> * Chip3 = 4096-4415
> */
> switch (__get_intid_range(intid)) {
> case SPI_RANGE:
> chip = (intid - 32) / 320;
> break;
> case ESPI_RANGE:
> chip = 3;
> break;
> default:
> unreachable();
> }
> BUG_ON(!t241_dist_base_alias[chip]);
You can drop this BUG_ON(), and replace it with on at probe time.
> return t241_dist_base_alias[chip];
> }
>
> return gic_data.dist_base;
> }
Yup, that's much better.
>
> >> +
> >> + return gic_data.dist_base;
> >> +}
> >> +
> >> static inline void __iomem *gic_dist_base(struct irq_data *d)
> >> {
> >> switch (get_intid_range(d)) {
> >> @@ -346,7 +377,7 @@ static int gic_peek_irq(struct irq_data *d, u32 offset)
> >> if (gic_irq_in_rdist(d))
> >> base = gic_data_rdist_sgi_base();
> >> else
> >> - base = gic_data.dist_base;
> >> + base = gic_dist_base_read_alias(irqd_to_hwirq(d));
> >>
> >> return !!(readl_relaxed(base + offset + (index / 32) * 4) & mask);
> >> }
> >> @@ -580,6 +611,7 @@ static int gic_set_type(struct irq_data *d, unsigned int type)
> >> enum gic_intid_range range;
> >> unsigned int irq = gic_irq(d);
> >> void __iomem *base;
> >> + void __iomem *base_read_alias;
> >> u32 offset, index;
> >> int ret;
> >>
> >> @@ -594,14 +626,17 @@ static int gic_set_type(struct irq_data *d, unsigned int type)
> >> type != IRQ_TYPE_LEVEL_HIGH && type != IRQ_TYPE_EDGE_RISING)
> >> return -EINVAL;
> >>
> >> - if (gic_irq_in_rdist(d))
> >> + if (gic_irq_in_rdist(d)) {
> >> base = gic_data_rdist_sgi_base();
> >> - else
> >> + base_read_alias = base;
> >> + } else {
> >> base = gic_data.dist_base;
> >> + base_read_alias = gic_dist_base_read_alias(irqd_to_hwirq(d));
> >> + }
> >>
> >> offset = convert_offset_index(d, GICD_ICFGR, &index);
> >> -
> >> - ret = gic_configure_irq(index, type, base + offset, NULL);
> >> + ret = gic_configure_irq(index, type, base + offset, NULL,
> >> + base_read_alias + offset);
> >> if (ret && (range == PPI_RANGE || range == EPPI_RANGE)) {
> >> /* Misconfigured PPIs are usually not fatal */
> >> pr_warn("GIC: PPI INTID%d is secure or misconfigured\n", irq);
> >> @@ -1719,6 +1754,70 @@ static bool gic_enable_quirk_hip06_07(void *data)
> >> return false;
> >> }
> >>
> >> +static bool gic_enable_quirk_nvidia_t241(void *data)
> >> +{
> >> +#ifdef CONFIG_ACPI
> >> + struct dist_base_alias *base_alias;
> >> + struct acpi_table_header *madt;
> >> + int i, intid, nchips = 0;
> >> + acpi_status status;
> >> + phys_addr_t phys;
> >> +
> >> + status = acpi_get_table(ACPI_SIG_MADT, 0, &madt);
> >> + if (ACPI_FAILURE(status))
> >> + return false;
> >> +
> >> + /* Check NVIDIA OEM ID */
> >> + if (memcmp(madt->oem_id, "NVIDIA", 6)) {
> >
> > What guarantees do we have that this string will always be present?
> > "oem_id" is usually updated to reflect the integrator, not the
> > silicon vendor.
> >
>
> Our company provides UEFI firmware porting guidelines to OEMs that
> ensure the MADT table generation, along with the ACPI header, remains
> unaltered. Thanks to your input, we are now looking into alternative
> approaches for identifying platform types and removing our dependence
> on ACPI. Specifically, we are interested in utilizing the SMCCC API
> to detect the CHIP. Determine whether the individual chips are present
> by referring to the GICR regions described in MADT.
Seems like a reasonable alternative.
>
>
> >> + acpi_put_table(madt);
> >> + return false;
> >> + }
> >> +
> >> + /* Find the number of chips based on OEM_TABLE_ID */
> >> + if ((!memcmp(madt->oem_table_id, "T241x3", 6)) ||
> >> + (!memcmp(madt->oem_table_id, "T241c3", 6))) {
> >> + nchips = 3;
> >> + } else if ((!memcmp(madt->oem_table_id, "T241x4", 6)) ||
> >> + (!memcmp(madt->oem_table_id, "T241c4", 6))) {
> >> + nchips = 4;
> >> + }
> >
> > Same question for these. This seems pretty fragile.
> >
> This can be avoid for the SMCCC based platform detection.
>
> >> +
> >> + acpi_put_table(madt);
> >> + if (nchips < 3)
> >> + return false;
> >> +
> >> + base_alias = kmalloc_array(nchips, sizeof(*base_alias),
> >> + GFP_KERNEL | __GFP_ZERO);
> >
> > You are fully initialising the structures, right? So why the
> > __GFP_ZERO?
> Yes, not needed. will use the staic array since size is small after
> removing INTID_start/end feilds.
>
> >
> >> + if (!base_alias)
> >> + return false;
> >> +
> >> + gic_data.base_read_aliases = base_alias;
> >> + gic_data.nr_dist_base_aliases = nchips;
> >> +
> >> + /**
> >> + * Setup GICD alias and {E}SPIs range for each chip
> >> + * {E}SPI blocks mappings:
> >> + * Chip0 = 00-09
> >> + * Chip1 = 10-19
> >> + * Chip2 = 20-29
> >> + * Chip3 = 30-39
> >
> > What are these ranges? From the code below, I can (sort of) guess that
> > each chip has 10 registers in the SPI/ESPI range, with chips 0-1
> > dealing with SPIs, and chips 2-3 dealing with ESPIs.
> >
> > It would be a lot clearer if you indicated the actual INTID ranges.
> Agree.
>
> >
> >> + */
> >> + for (i = 0; i < nchips; i++, base_alias++) {
> >> + phys = ((1ULL << 44) * i) | 0x23580000;
> >
> > Where is this address coming from? Can it be inferred from the MADT?
> > It would also be a lot more readable if written as:
> >
> > #define CHIP_MASK GENMASK_ULL(45, 44)
> > #define CHIP_ALIAS_BASE 0x23580000
> >
> I'll define macros for constants. Use the offset, global GICD-PHYS,
> and CHIP number to get the alias addressses.
>
> #define T241_CHIPN_MASK GENMASK_ULL(45, 44)
> #define T241_CHIP_GICDA_OFFSET 0x1580000
>
> phys = gic_data.dist_phys_base + T241_CHIP_GICDA_OFFSET;
> phys |= FIELD_PREP(T241_CHIPN_MASK, i);
>
>
> > phys = CHIP_ALIAS_BASE;
> > phys |= FIELD_PREP(CHIP_MASK, i);
> >
> >> + base_alias->base = ioremap(phys, SZ_64K);
> >> + WARN_ON(!base_alias->base);
> >> +
> >> + intid = i < 3 ? 32 + i * 10 * 32 : ESPI_BASE_INTID;
> >> + base_alias->intid_start = intid;
> >> + base_alias->intid_end = intid + 10 * 32 - 1;
> >
> > This really is obfuscated. And it also shows that we really don't need
> > the INTID ranges in the data structure. You can easily get to the chip
> > number with something like:
> ACK
>
> >
> > switch (__get_intid_range(intid)) {
> > case SPI_RANGE:
> > chip = (intid - 32) / 320;
> > break;
> > case ESPI_RANGE:
> > chip = (intid - ESPI_BASE_INTID) / 320;
> > break;
> > }
> >
> > alias = base_alias[chip];
> >
> > Bonus point if you add a #define for the magic numbers.
> >
> ACK
>
> >> + }
> >> + static_branch_enable(&gic_nvidia_t241_erratum);
> >> + return true;
> >> +#else
> >> + return false;
> >> +#endif
> >> +}
> >
> > How about moving the whole function under #ifdef CONFIG_ACPI?
> >
>
> If you're not satisfied with SMCCC-based platform detection, I'll
> make the necessary changes. We value your input and would appreciate
> your opinion on whether we should use SMCCC or ACPI-OEM-ID based
> platform detection. Our preference is to go with SMC if that's
> agreeable to you.
If you can guarantee that this FW-based discovery will always be
available, then this is a more robust way of doing it.
>
>
> #define SMCCC_JEP106_BANK_ID(v) FIELD_GET(GENMASK(30, 24), (v))
> #define SMCCC_JEP106_ID_CODE(v) FIELD_GET(GENMASK(22, 16), (v))
> #define SMCCC_JEP106_SOC_ID(v) FIELD_GET(GENMASK(15, 0), (v))
>
> #define JEP106_NVIDIA_BANK_ID 0x3
> #define JEP106_NVIDIA_ID_CODE 0x6b
> #define T241_CHIPN_MASK GENMASK_ULL(45, 44)
> #define T241_CHIP_GICDA_OFFSET 0x1580000
> #define T241_CHIP_ID 0x241
>
> static bool gic_enable_quirk_nvidia_t241(void *data)
> {
> unsigned long chip_bmask = 0;
> struct arm_smccc_res res;
> phys_addr_t phys;
> u32 i;
>
> if ((arm_smccc_get_version() < ARM_SMCCC_VERSION_1_2) ||
> (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE)) {
> return false;
> }
>
> arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> ARM_SMCCC_ARCH_SOC_ID, &res);
> if ((s32)res.a0 < 0)
> return false;
>
> arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 0, &res);
> if ((s32)res.a0 < 0)
> return false;
Most of this should probably directly come from the soc_id
infrastructure. It would need to probe early and expose the low-level
data.
>
> /* Check JEP106 code for NVIDIA T241 chip (036b:0241) */
> if ((SMCCC_JEP106_BANK_ID(res.a0) != JEP106_NVIDIA_BANK_ID) ||
> (SMCCC_JEP106_ID_CODE(res.a0) != JEP106_NVIDIA_ID_CODE) ||
> (SMCCC_JEP106_SOC_ID(res.a0) != T241_CHIP_ID)) {
> return false;
> }
>
> /* Find the chips based on GICR regions PHYS addr */
> for (i = 0; i < gic_data.nr_redist_regions; i++) {
> chip_bmask |= BIT(FIELD_GET(T241_CHIPN_MASK,
> gic_data.redist_regions[i].phys_base));
> }
>
> if (hweight32(chip_bmask) < 3)
> return false;
>
> /* Setup GICD alias regions */
> for (i = 0; i < ARRAY_SIZE(t241_dist_base_alias); i++) {
> if (chip_bmask & BIT(i)) {
> phys = gic_data.dist_phys_base + T241_CHIP_GICDA_OFFSET;
> phys |= FIELD_PREP(T241_CHIPN_MASK, i);
> t241_dist_base_alias[i] = ioremap(phys, SZ_64K);
> WARN_ON(!t241_dist_base_alias[i]);
> }
> }
> static_branch_enable(&gic_nvidia_t241_erratum);
> return true;
> }
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
WARNING: multiple messages have this Message-ID (diff)
From: Marc Zyngier <maz@kernel.org>
To: Shanker Donthineni <sdonthineni@nvidia.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Jonathan Corbet <corbet@lwn.net>,
linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, Vikram Sethi <vsethi@nvidia.com>,
Thierry Reding <treding@nvidia.com>
Subject: Re: [PATCH] irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4
Date: Sun, 12 Mar 2023 13:43:01 +0000 [thread overview]
Message-ID: <871qlu5bmi.wl-maz@kernel.org> (raw)
In-Reply-To: <ff5306c3-c72e-87a3-f21f-13cdf11e1df5@nvidia.com>
On Fri, 10 Mar 2023 14:16:34 +0000,
Shanker Donthineni <sdonthineni@nvidia.com> wrote:
>
> Hi Marc,
>
> On 3/7/23 02:32, Marc Zyngier wrote:
> > External email: Use caution opening links or attachments
> >
> >
> > On Mon, 06 Mar 2023 01:31:48 +0000,
> > Shanker Donthineni <sdonthineni@nvidia.com> wrote:
> >>
> >> The purpose of this patch is to address the T241 erratum T241-FABRIC-4,
> >> which causes unexpected behavior in the GIC when multiple transactions
> >
> > nit: "The purpose of this patch" is superfluous. Instead, write
> > something like:
> >
> > "The T241 platform suffers from the T241-FABRIC-4 erratum which
> > causes..."
> >
> I'll fix in v2 patch.
>
> >> are received simultaneously from different sources. This hardware issue
> >> impacts NVIDIA server platforms that use more than two T241 chips
> >> interconnected. Each chip has support for 320 {E}SPIs.
> >>
> >> This issue occurs when multiple packets from different GICs are
> >> incorrectly interleaved at the target chip. The erratum text below
> >> specifies exactly what can cause multiple transfer packets susceptible
> >> to interleaving and GIC state corruption. GIC state corruption can
> >> lead to a range of problems, including kernel panics, and unexpected
> >> behavior.
> >>
> >> From the erratum text:
> >> "In some cases, inter-socket AXI4 Stream packets with multiple
> >> transfers, may be interleaved by the fabric when presented to ARM
> >> Generic Interrupt Controller. GIC expects all transfers of a packet
> >> to be delivered without any interleaving.
> >>
> >> The following GICv3 commands may result in multiple transfer packets
> >> over inter-socket AXI4 Stream interface:
> >> - Register reads from GICD_I* and GICD_N*
> >> - Register writes to 64-bit GICD registers other than GICD_IROUTERn*
> >> - ITS command MOVALL
> >
> > Does is also affect cross-chip traffic such as SPI deactivation?
> >
> No, it is not impacted.
>
> >>
> >> Multiple commands in GICv4+ utilize multiple transfer packets,
> >> including VMOVP, VMOVI and VMAPP.
> >>
> >> This issue impacts system configurations with more than 2 sockets,
> >> that require multi-transfer packets to be sent over inter-socket
> >> AXI4 Stream interface between GIC instances on different sockets.
> >> GICv4 cannot be supported. GICv3 SW model can only be supported
> >> with the workaround. Single and Dual socket configurations are not
> >> impacted by this issue and support GICv3 and GICv4."
> >
> > Do you have a public link to this erratum? This is really something
> > that we should be go back to when changing things in the GIC code
> > (should we ever use MOVALL, for example).
> >
> https://developer.nvidia.com/docs/t241-fabric-4/nvidia-t241-fabric-4-errata.pdf
Great. Please add this to the commit message and a comment next to the
workaround code.
[...]
> >> +static inline void __iomem *gic_dist_base_read_alias(irq_hw_number_t intid)
> >> +{
> >> + struct dist_base_alias *base_alias;
> >> + int i;
> >> +
> >> + if (static_branch_unlikely(&gic_nvidia_t241_erratum)) {
> >> + base_alias = gic_data.base_read_aliases;
> >> + for (i = 0; i < gic_data.nr_dist_base_aliases; i++) {
> >> + if (base_alias->base &&
> >> + (intid >= base_alias->intid_start) &&
> >> + (intid <= base_alias->intid_end)) {
> >> + return base_alias->base;
> >> + }
> >> + base_alias++;
> >> + }
> >> + }
> >
> > Each distributor has the exact same number of SPIs. So why isn't this
> > just a division that gives you a distributor number?
> >
>
> I considered creating a generic function that could potentially be
> utilized in the future for other Workarounds (WARs).
>
> I'll change to this in v2.
>
> static inline void __iomem *gic_dist_base_alias(irq_hw_number_t intid)
> {
> u32 chip;
>
> if (static_branch_unlikely(&gic_nvidia_t241_erratum)) {
> /**
> * {E}SPI mappings for all 4 chips
> * Chip0 = 32-351
> * Chip1 = 52-671
s/52/352/, right?
> * Chip2 = 672-991
> * Chip3 = 4096-4415
> */
> switch (__get_intid_range(intid)) {
> case SPI_RANGE:
> chip = (intid - 32) / 320;
> break;
> case ESPI_RANGE:
> chip = 3;
> break;
> default:
> unreachable();
> }
> BUG_ON(!t241_dist_base_alias[chip]);
You can drop this BUG_ON(), and replace it with on at probe time.
> return t241_dist_base_alias[chip];
> }
>
> return gic_data.dist_base;
> }
Yup, that's much better.
>
> >> +
> >> + return gic_data.dist_base;
> >> +}
> >> +
> >> static inline void __iomem *gic_dist_base(struct irq_data *d)
> >> {
> >> switch (get_intid_range(d)) {
> >> @@ -346,7 +377,7 @@ static int gic_peek_irq(struct irq_data *d, u32 offset)
> >> if (gic_irq_in_rdist(d))
> >> base = gic_data_rdist_sgi_base();
> >> else
> >> - base = gic_data.dist_base;
> >> + base = gic_dist_base_read_alias(irqd_to_hwirq(d));
> >>
> >> return !!(readl_relaxed(base + offset + (index / 32) * 4) & mask);
> >> }
> >> @@ -580,6 +611,7 @@ static int gic_set_type(struct irq_data *d, unsigned int type)
> >> enum gic_intid_range range;
> >> unsigned int irq = gic_irq(d);
> >> void __iomem *base;
> >> + void __iomem *base_read_alias;
> >> u32 offset, index;
> >> int ret;
> >>
> >> @@ -594,14 +626,17 @@ static int gic_set_type(struct irq_data *d, unsigned int type)
> >> type != IRQ_TYPE_LEVEL_HIGH && type != IRQ_TYPE_EDGE_RISING)
> >> return -EINVAL;
> >>
> >> - if (gic_irq_in_rdist(d))
> >> + if (gic_irq_in_rdist(d)) {
> >> base = gic_data_rdist_sgi_base();
> >> - else
> >> + base_read_alias = base;
> >> + } else {
> >> base = gic_data.dist_base;
> >> + base_read_alias = gic_dist_base_read_alias(irqd_to_hwirq(d));
> >> + }
> >>
> >> offset = convert_offset_index(d, GICD_ICFGR, &index);
> >> -
> >> - ret = gic_configure_irq(index, type, base + offset, NULL);
> >> + ret = gic_configure_irq(index, type, base + offset, NULL,
> >> + base_read_alias + offset);
> >> if (ret && (range == PPI_RANGE || range == EPPI_RANGE)) {
> >> /* Misconfigured PPIs are usually not fatal */
> >> pr_warn("GIC: PPI INTID%d is secure or misconfigured\n", irq);
> >> @@ -1719,6 +1754,70 @@ static bool gic_enable_quirk_hip06_07(void *data)
> >> return false;
> >> }
> >>
> >> +static bool gic_enable_quirk_nvidia_t241(void *data)
> >> +{
> >> +#ifdef CONFIG_ACPI
> >> + struct dist_base_alias *base_alias;
> >> + struct acpi_table_header *madt;
> >> + int i, intid, nchips = 0;
> >> + acpi_status status;
> >> + phys_addr_t phys;
> >> +
> >> + status = acpi_get_table(ACPI_SIG_MADT, 0, &madt);
> >> + if (ACPI_FAILURE(status))
> >> + return false;
> >> +
> >> + /* Check NVIDIA OEM ID */
> >> + if (memcmp(madt->oem_id, "NVIDIA", 6)) {
> >
> > What guarantees do we have that this string will always be present?
> > "oem_id" is usually updated to reflect the integrator, not the
> > silicon vendor.
> >
>
> Our company provides UEFI firmware porting guidelines to OEMs that
> ensure the MADT table generation, along with the ACPI header, remains
> unaltered. Thanks to your input, we are now looking into alternative
> approaches for identifying platform types and removing our dependence
> on ACPI. Specifically, we are interested in utilizing the SMCCC API
> to detect the CHIP. Determine whether the individual chips are present
> by referring to the GICR regions described in MADT.
Seems like a reasonable alternative.
>
>
> >> + acpi_put_table(madt);
> >> + return false;
> >> + }
> >> +
> >> + /* Find the number of chips based on OEM_TABLE_ID */
> >> + if ((!memcmp(madt->oem_table_id, "T241x3", 6)) ||
> >> + (!memcmp(madt->oem_table_id, "T241c3", 6))) {
> >> + nchips = 3;
> >> + } else if ((!memcmp(madt->oem_table_id, "T241x4", 6)) ||
> >> + (!memcmp(madt->oem_table_id, "T241c4", 6))) {
> >> + nchips = 4;
> >> + }
> >
> > Same question for these. This seems pretty fragile.
> >
> This can be avoid for the SMCCC based platform detection.
>
> >> +
> >> + acpi_put_table(madt);
> >> + if (nchips < 3)
> >> + return false;
> >> +
> >> + base_alias = kmalloc_array(nchips, sizeof(*base_alias),
> >> + GFP_KERNEL | __GFP_ZERO);
> >
> > You are fully initialising the structures, right? So why the
> > __GFP_ZERO?
> Yes, not needed. will use the staic array since size is small after
> removing INTID_start/end feilds.
>
> >
> >> + if (!base_alias)
> >> + return false;
> >> +
> >> + gic_data.base_read_aliases = base_alias;
> >> + gic_data.nr_dist_base_aliases = nchips;
> >> +
> >> + /**
> >> + * Setup GICD alias and {E}SPIs range for each chip
> >> + * {E}SPI blocks mappings:
> >> + * Chip0 = 00-09
> >> + * Chip1 = 10-19
> >> + * Chip2 = 20-29
> >> + * Chip3 = 30-39
> >
> > What are these ranges? From the code below, I can (sort of) guess that
> > each chip has 10 registers in the SPI/ESPI range, with chips 0-1
> > dealing with SPIs, and chips 2-3 dealing with ESPIs.
> >
> > It would be a lot clearer if you indicated the actual INTID ranges.
> Agree.
>
> >
> >> + */
> >> + for (i = 0; i < nchips; i++, base_alias++) {
> >> + phys = ((1ULL << 44) * i) | 0x23580000;
> >
> > Where is this address coming from? Can it be inferred from the MADT?
> > It would also be a lot more readable if written as:
> >
> > #define CHIP_MASK GENMASK_ULL(45, 44)
> > #define CHIP_ALIAS_BASE 0x23580000
> >
> I'll define macros for constants. Use the offset, global GICD-PHYS,
> and CHIP number to get the alias addressses.
>
> #define T241_CHIPN_MASK GENMASK_ULL(45, 44)
> #define T241_CHIP_GICDA_OFFSET 0x1580000
>
> phys = gic_data.dist_phys_base + T241_CHIP_GICDA_OFFSET;
> phys |= FIELD_PREP(T241_CHIPN_MASK, i);
>
>
> > phys = CHIP_ALIAS_BASE;
> > phys |= FIELD_PREP(CHIP_MASK, i);
> >
> >> + base_alias->base = ioremap(phys, SZ_64K);
> >> + WARN_ON(!base_alias->base);
> >> +
> >> + intid = i < 3 ? 32 + i * 10 * 32 : ESPI_BASE_INTID;
> >> + base_alias->intid_start = intid;
> >> + base_alias->intid_end = intid + 10 * 32 - 1;
> >
> > This really is obfuscated. And it also shows that we really don't need
> > the INTID ranges in the data structure. You can easily get to the chip
> > number with something like:
> ACK
>
> >
> > switch (__get_intid_range(intid)) {
> > case SPI_RANGE:
> > chip = (intid - 32) / 320;
> > break;
> > case ESPI_RANGE:
> > chip = (intid - ESPI_BASE_INTID) / 320;
> > break;
> > }
> >
> > alias = base_alias[chip];
> >
> > Bonus point if you add a #define for the magic numbers.
> >
> ACK
>
> >> + }
> >> + static_branch_enable(&gic_nvidia_t241_erratum);
> >> + return true;
> >> +#else
> >> + return false;
> >> +#endif
> >> +}
> >
> > How about moving the whole function under #ifdef CONFIG_ACPI?
> >
>
> If you're not satisfied with SMCCC-based platform detection, I'll
> make the necessary changes. We value your input and would appreciate
> your opinion on whether we should use SMCCC or ACPI-OEM-ID based
> platform detection. Our preference is to go with SMC if that's
> agreeable to you.
If you can guarantee that this FW-based discovery will always be
available, then this is a more robust way of doing it.
>
>
> #define SMCCC_JEP106_BANK_ID(v) FIELD_GET(GENMASK(30, 24), (v))
> #define SMCCC_JEP106_ID_CODE(v) FIELD_GET(GENMASK(22, 16), (v))
> #define SMCCC_JEP106_SOC_ID(v) FIELD_GET(GENMASK(15, 0), (v))
>
> #define JEP106_NVIDIA_BANK_ID 0x3
> #define JEP106_NVIDIA_ID_CODE 0x6b
> #define T241_CHIPN_MASK GENMASK_ULL(45, 44)
> #define T241_CHIP_GICDA_OFFSET 0x1580000
> #define T241_CHIP_ID 0x241
>
> static bool gic_enable_quirk_nvidia_t241(void *data)
> {
> unsigned long chip_bmask = 0;
> struct arm_smccc_res res;
> phys_addr_t phys;
> u32 i;
>
> if ((arm_smccc_get_version() < ARM_SMCCC_VERSION_1_2) ||
> (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE)) {
> return false;
> }
>
> arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> ARM_SMCCC_ARCH_SOC_ID, &res);
> if ((s32)res.a0 < 0)
> return false;
>
> arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_SOC_ID, 0, &res);
> if ((s32)res.a0 < 0)
> return false;
Most of this should probably directly come from the soc_id
infrastructure. It would need to probe early and expose the low-level
data.
>
> /* Check JEP106 code for NVIDIA T241 chip (036b:0241) */
> if ((SMCCC_JEP106_BANK_ID(res.a0) != JEP106_NVIDIA_BANK_ID) ||
> (SMCCC_JEP106_ID_CODE(res.a0) != JEP106_NVIDIA_ID_CODE) ||
> (SMCCC_JEP106_SOC_ID(res.a0) != T241_CHIP_ID)) {
> return false;
> }
>
> /* Find the chips based on GICR regions PHYS addr */
> for (i = 0; i < gic_data.nr_redist_regions; i++) {
> chip_bmask |= BIT(FIELD_GET(T241_CHIPN_MASK,
> gic_data.redist_regions[i].phys_base));
> }
>
> if (hweight32(chip_bmask) < 3)
> return false;
>
> /* Setup GICD alias regions */
> for (i = 0; i < ARRAY_SIZE(t241_dist_base_alias); i++) {
> if (chip_bmask & BIT(i)) {
> phys = gic_data.dist_phys_base + T241_CHIP_GICDA_OFFSET;
> phys |= FIELD_PREP(T241_CHIPN_MASK, i);
> t241_dist_base_alias[i] = ioremap(phys, SZ_64K);
> WARN_ON(!t241_dist_base_alias[i]);
> }
> }
> static_branch_enable(&gic_nvidia_t241_erratum);
> return true;
> }
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2023-03-12 13:43 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-06 1:31 [PATCH] irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4 Shanker Donthineni
2023-03-06 1:31 ` Shanker Donthineni
2023-03-07 8:32 ` Marc Zyngier
2023-03-07 8:32 ` Marc Zyngier
2023-03-10 14:16 ` Shanker Donthineni
2023-03-10 14:16 ` Shanker Donthineni
2023-03-12 13:43 ` Marc Zyngier [this message]
2023-03-12 13:43 ` Marc Zyngier
2023-03-13 3:47 ` Shanker Donthineni
2023-03-13 3:47 ` Shanker Donthineni
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=871qlu5bmi.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=catalin.marinas@arm.com \
--cc=corbet@lwn.net \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=sdonthineni@nvidia.com \
--cc=tglx@linutronix.de \
--cc=treding@nvidia.com \
--cc=vsethi@nvidia.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.