* [PATCH v3 4/5] i2c: mt7621: limit SCL_STRETCH only to Mediatek SoC
From: Christian Marangi @ 2026-05-19 22:32 UTC
To: Stefan Roese, Andi Shyti, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, AngeloGioacchino Del Regno,
linux-i2c, devicetree, linux-kernel, linux-arm-kernel,
linux-mediatek
Cc: Christian Marangi
In-Reply-To: <20260519223253.1093-1-ansuelsmth@gmail.com>
The same I2C driver is also used on Airoha SoCs, with the only difference
being that mtk_i2c_reset() must not enable SCL_STRETCH there.
Introduce a new compatible for Airoha and limit SCL_STRETCH to Mediatek
SoCs.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
---
drivers/i2c/busses/i2c-mt7621.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-mt7621.c b/drivers/i2c/busses/i2c-mt7621.c
index d8fa29e7e0fa..3cde43c57a2b 100644
--- a/drivers/i2c/busses/i2c-mt7621.c
+++ b/drivers/i2c/busses/i2c-mt7621.c
@@ -88,6 +88,7 @@ static int mtk_i2c_wait_idle(struct mtk_i2c *i2c, bool atomic)
 static void mtk_i2c_reset(struct mtk_i2c *i2c)
 {
+	u32 reg;
 	int ret;
 	ret = device_reset(i2c->adap.dev.parent);
@@ -98,8 +99,12 @@ static void mtk_i2c_reset(struct mtk_i2c *i2c)
 	 * Don't set SM0CTL0_ODRAIN as its bit meaning is inverted. To
 	 * configure open-drain mode, this bit needs to be cleared.
 	 */
-	iowrite32(((i2c->clk_div << 16) & SM0CTL0_CLK_DIV_MASK) | SM0CTL0_EN |
-		  SM0CTL0_SCL_STRETCH, i2c->base + REG_SM0CTL0_REG);
+	reg = ((i2c->clk_div << 16) & SM0CTL0_CLK_DIV_MASK) | SM0CTL0_EN;
+	/* Set SCL_STRETCH only for Mediatek SoC */
+	if (device_is_compatible(i2c->dev, "mediatek,mt7621-i2c"))
+		reg |= SM0CTL0_SCL_STRETCH;
+
+	iowrite32(reg, i2c->base + REG_SM0CTL0_REG);
 	iowrite32(0, i2c->base + REG_SM0CFG2_REG);
 	/* Clear any pending interrupt */
 	iowrite32(1, i2c->base + REG_PINTEN_REG);
@@ -271,6 +276,7 @@ static const struct i2c_algorithm mtk_i2c_algo = {
 static const struct of_device_id i2c_mtk_dt_ids[] = {
 	{ .compatible = "mediatek,mt7621-i2c" },
+	{ .compatible = "airoha,an7581-i2c" },
 	{ /* sentinel */ }
 };
--
2.53.0
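
For readers following the register change, here is a minimal user-space sketch of how the SM0CTL0 value is composed after this patch. The bit positions are placeholders picked for illustration and are not taken from the driver, and build_sm0ctl0() is a hypothetical helper; the patch itself makes the decision with device_is_compatible().

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit layout for illustration only; not the driver's values. */
#define SM0CTL0_SCL_STRETCH   (1u << 0)
#define SM0CTL0_EN            (1u << 1)
#define SM0CTL0_CLK_DIV_MASK  (0x7ffu << 16)

/* Hypothetical helper: compose the SM0CTL0 value the way the patch does. */
static uint32_t build_sm0ctl0(uint32_t clk_div, bool is_mediatek)
{
	uint32_t reg = ((clk_div << 16) & SM0CTL0_CLK_DIV_MASK) | SM0CTL0_EN;

	/* Only the Mediatek variant enables clock stretching. */
	if (is_mediatek)
		reg |= SM0CTL0_SCL_STRETCH;

	return reg;
}
```

Calling build_sm0ctl0(div, false) models the Airoha case: the divider and enable bits are set and the stretch bit stays clear.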
* [PATCH v3 3/5] dt-bindings: i2c: mt7621: Document an7581 compatible
From: Christian Marangi @ 2026-05-19 22:32 UTC
To: Stefan Roese, Andi Shyti, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, AngeloGioacchino Del Regno,
linux-i2c, devicetree, linux-kernel, linux-arm-kernel,
linux-mediatek
Cc: Christian Marangi
In-Reply-To: <20260519223253.1093-1-ansuelsmth@gmail.com>
Airoha SoCs implement the same Mediatek logic for the I2C bus, with the only
difference being that they have no dedicated reset line for it.
Add a dedicated compatible for the Airoha AN7581 SoC and reject the
unsupported properties.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
---
.../bindings/i2c/mediatek,mt7621-i2c.yaml | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/Documentation/devicetree/bindings/i2c/mediatek,mt7621-i2c.yaml b/Documentation/devicetree/bindings/i2c/mediatek,mt7621-i2c.yaml
index 118ec00fc190..8223fbc74f14 100644
--- a/Documentation/devicetree/bindings/i2c/mediatek,mt7621-i2c.yaml
+++ b/Documentation/devicetree/bindings/i2c/mediatek,mt7621-i2c.yaml
@@ -14,7 +14,9 @@ allOf:
 properties:
   compatible:
-    const: mediatek,mt7621-i2c
+    enum:
+      - airoha,an7581-i2c
+      - mediatek,mt7621-i2c
   reg:
     maxItems: 1
@@ -38,6 +40,16 @@ required:
   - "#address-cells"
   - "#size-cells"
+if:
+  properties:
+    compatible:
+      contains:
+        const: airoha,an7581-i2c
+then:
+  properties:
+    resets: false
+    reset-names: false
+
 unevaluatedProperties: false
 examples:
--
2.53.0
* [PATCH v3 5/5] i2c: mt7621: make device reset optional
From: Christian Marangi @ 2026-05-19 22:32 UTC
To: Stefan Roese, Andi Shyti, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Matthias Brugger, AngeloGioacchino Del Regno,
linux-i2c, devicetree, linux-kernel, linux-arm-kernel,
linux-mediatek
Cc: Christian Marangi
In-Reply-To: <20260519223253.1093-1-ansuelsmth@gmail.com>
Airoha SoCs that use the same Mediatek I2C driver/logic don't have a
reset line for I2C, so use the optional device_reset variant.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
---
drivers/i2c/busses/i2c-mt7621.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-mt7621.c b/drivers/i2c/busses/i2c-mt7621.c
index 3cde43c57a2b..fb9d9701bb10 100644
--- a/drivers/i2c/busses/i2c-mt7621.c
+++ b/drivers/i2c/busses/i2c-mt7621.c
@@ -91,7 +91,7 @@ static void mtk_i2c_reset(struct mtk_i2c *i2c)
 	u32 reg;
 	int ret;
-	ret = device_reset(i2c->adap.dev.parent);
+	ret = device_reset_optional(i2c->adap.dev.parent);
 	if (ret)
 		dev_err(i2c->dev, "I2C reset failed!\n");
--
2.53.0
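
The difference between the mandatory and optional reset variants can be pictured with a small user-space model. This is only a sketch under assumptions: struct reset_line and model_device_reset() are hypothetical names, and the missing-line error is modeled as -ENOENT for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reset line; NULL means "none described". */
struct reset_line {
	int pulses;
};

/* Model: a missing line is an error unless the caller said it is optional. */
static int model_device_reset(struct reset_line *line, bool optional)
{
	if (!line)
		return optional ? 0 : -ENOENT;

	line->pulses++;	/* stands in for the actual reset pulse */
	return 0;
}
```

With optional set, a platform without a reset line takes the success path, so the "I2C reset failed!" message in the hunk above would not be printed for it.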
* Re: [PATCH v5 1/5] PCI: host-common: Add helper to determine host bridge D3cold eligibility
From: Bjorn Helgaas @ 2026-05-19 22:39 UTC
To: Krishna Chaitanya Chundru
Cc: Jingoo Han, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas,
Will Deacon, linux-pci, linux-kernel, linux-arm-msm,
linux-arm-kernel, jonathanh, bjorn.andersson
In-Reply-To: <20260429-d3cold-v5-1-89e9735b9df6@oss.qualcomm.com>
On Wed, Apr 29, 2026 at 12:12:23PM +0530, Krishna Chaitanya Chundru wrote:
> Add a common helper, pci_host_common_d3cold_possible(), to determine
> whether PCIe devices under host bridge can safely transition to D3cold.
>
> This helper is intended to be used by PCI host controller drivers to
> decide whether they may safely put the host bridge into D3cold based on
> the power state and wakeup capabilities of downstream endpoints.
>
> The helper walks all devices on the all bridge buses and only allows
> the devices to enter D3cold if all PCIe endpoints are already in
> PCI_D3hot. This ensures that we do not power off the host bridge while
> any active endpoint still requires the link to remain powered.
>
> For devices that may wake the system, the helper additionally requires
> that the device supports PME wake from D3cold (via WAKE#). Devices that
> do not have wakeup enabled are not restricted by this check and do not
> block the devices under host bridge from entering D3cold.
>
> Devices without a bound driver and with PCI not enabled via sysfs are
> treated as inactive and therefore do not prevent the devices under host
> bridge from entering D3cold. This allows controllers to power down more
> aggressively when there are no actively managed endpoints.
>
> Some devices (e.g. M.2 without auxiliary power) lose PME detection when
> main power is removed. Even if such devices advertise PME-from-D3cold
> capability, entering D3cold may break wakeup. So, return PME-from-D3cold
> capability via an output parameter so PCIe controller drivers can apply
> platform-specific handling to preserve wakeup functionality.
>
> Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
> ---
> drivers/pci/controller/pci-host-common.c | 71 ++++++++++++++++++++++++++++++++
> drivers/pci/controller/pci-host-common.h | 2 +
> 2 files changed, 73 insertions(+)
>
> diff --git a/drivers/pci/controller/pci-host-common.c b/drivers/pci/controller/pci-host-common.c
> index d6258c1cffe5..09432d69175c 100644
> --- a/drivers/pci/controller/pci-host-common.c
> +++ b/drivers/pci/controller/pci-host-common.c
> @@ -17,6 +17,9 @@
>
> #include "pci-host-common.h"
>
> +#define PCI_HOST_D3COLD_ALLOWED BIT(0)
> +#define PCI_HOST_PME_D3COLD_CAPABLE BIT(1)
> +
> static void gen_pci_unmap_cfg(void *ptr)
> {
> pci_ecam_free((struct pci_config_window *)ptr);
> @@ -106,5 +109,73 @@ void pci_host_common_remove(struct platform_device *pdev)
> }
> EXPORT_SYMBOL_GPL(pci_host_common_remove);
>
> +static int __pci_host_common_d3cold_possible(struct pci_dev *pdev, void *userdata)
> +{
> + u32 *flags = userdata;
> + int type;
> +
> + /* Ignore conventional PCI devices */
> + if (!pci_is_pcie(pdev))
> + return 0;
> +
> + type = pci_pcie_type(pdev);
> + if (type != PCI_EXP_TYPE_ENDPOINT &&
> + type != PCI_EXP_TYPE_LEG_END &&
> + type != PCI_EXP_TYPE_RC_END)
> + return 0;

From https://sashiko.dev/#/patchset/20260429-d3cold-v5-0-89e9735b9df6%40oss.qualcomm.com:

  If the topology contains an active conventional PCI device or an
  intermediate PCIe switch in PCI_D0, returning 0 here allows
  pci_walk_bus() to continue without clearing the
  PCI_HOST_D3COLD_ALLOWED flag.

  Does this create a situation where the host bridge might
  aggressively power off the link, dropping power to these active
  components?

I guess this is intentional, since you have a comment about ignoring
conventional PCI devices. But this does seem like a potential
problem. Why should we ignore switches here? And I think it's still
fairly common to have a PCIe-to-PCI bridge leading to a conventional
PCI device, and I don't know why we should ignore them.

The commit log consistently refers to "PCIe" devices and endpoints, so
maybe there's some reason that I'm missing.

There are other sashiko comments on this series that I think should
also be looked at.

> +
> + if (!pdev->dev.driver && !pci_is_enabled(pdev))
> + return 0;
> +
> + if (pdev->current_state != PCI_D3hot)
> + goto exit;
> +
> + if (device_may_wakeup(&pdev->dev)) {
> + if (!pci_pme_capable(pdev, PCI_D3cold))
> + goto exit;
> + else
> + *flags |= PCI_HOST_PME_D3COLD_CAPABLE;
> + }
> +
> + return 0;
> +
> +exit:
> + *flags &= ~PCI_HOST_D3COLD_ALLOWED;
> +
> + return -EOPNOTSUPP;
> +}
> +
> +/**
> + * pci_host_common_d3cold_possible - Determine whether the host bridge can transition the
> + * devices into D3Cold.
> + *
> + * @bridge: PCI host bridge to check
> + * @pme_capable: Pointer to update if there is any device which is capable of generating
> + * PME from D3cold.
> + *
> + * Walk downstream PCIe endpoint devices and determine whether the host bridge
> + * is permitted to transition the devices into D3cold.
> + *
> + * Devices under host bridge can enter D3cold only if all active PCIe endpoints are in
> + * PCI_D3hot and any wakeup-enabled endpoint is capable of generating PME from D3cold.
> + * Inactive endpoints are ignored.
> + *
> + * The @pme_capable output allows PCIe controller drivers to apply
> + * platform-specific handling to preserve wakeup functionality.
> + *
> + * Return: %true if the host bridge may enter D3cold, otherwise %false.
> + */
> +bool pci_host_common_d3cold_possible(struct pci_host_bridge *bridge, bool *pme_capable)
> +{
> + u32 flags = PCI_HOST_D3COLD_ALLOWED;
> +
> + pci_walk_bus(bridge->bus, __pci_host_common_d3cold_possible, &flags);
> +
> + *pme_capable = !!(flags & PCI_HOST_PME_D3COLD_CAPABLE);
> +
> + return !!(flags & PCI_HOST_D3COLD_ALLOWED);
> +}
> +EXPORT_SYMBOL_GPL(pci_host_common_d3cold_possible);
> +
> MODULE_DESCRIPTION("Common library for PCI host controller drivers");
> MODULE_LICENSE("GPL v2");
> diff --git a/drivers/pci/controller/pci-host-common.h b/drivers/pci/controller/pci-host-common.h
> index b5075d4bd7eb..7eb5599b9ce4 100644
> --- a/drivers/pci/controller/pci-host-common.h
> +++ b/drivers/pci/controller/pci-host-common.h
> @@ -20,4 +20,6 @@ void pci_host_common_remove(struct platform_device *pdev);
>
> struct pci_config_window *pci_host_common_ecam_create(struct device *dev,
> struct pci_host_bridge *bridge, const struct pci_ecam_ops *ops);
> +
> +bool pci_host_common_d3cold_possible(struct pci_host_bridge *bridge, bool *pme_capable);
> #endif
>
> --
> 2.34.1
>
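
The concern raised in this review can be reproduced with a small user-space model of the walk. This is a sketch under assumptions: struct model_dev is a hypothetical, heavily simplified stand-in for struct pci_dev, and the flag names are shortened. The logic follows the quoted callback, where anything that is not an endpoint is skipped before its power state is looked at.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define D3COLD_ALLOWED      (1u << 0)
#define PME_D3COLD_CAPABLE  (1u << 1)

enum dev_kind { KIND_CONV_PCI, KIND_SWITCH_PORT, KIND_ENDPOINT };
enum dev_state { STATE_D0, STATE_D3HOT };

/* Hypothetical, simplified stand-in for struct pci_dev. */
struct model_dev {
	enum dev_kind kind;
	enum dev_state state;
	bool active;		/* driver bound or device enabled */
	bool may_wakeup;
	bool pme_from_d3cold;
};

/* Mirrors the quoted callback: non-endpoints are skipped entirely. */
static int check_one(const struct model_dev *d, uint32_t *flags)
{
	if (d->kind != KIND_ENDPOINT || !d->active)
		return 0;
	if (d->state != STATE_D3HOT)
		goto veto;
	if (d->may_wakeup) {
		if (!d->pme_from_d3cold)
			goto veto;
		*flags |= PME_D3COLD_CAPABLE;
	}
	return 0;
veto:
	*flags &= ~D3COLD_ALLOWED;
	return -1;	/* non-zero stops the walk */
}

static bool d3cold_possible(const struct model_dev *devs, size_t n,
			    bool *pme_capable)
{
	uint32_t flags = D3COLD_ALLOWED;

	for (size_t i = 0; i < n; i++)
		if (check_one(&devs[i], &flags))
			break;

	*pme_capable = flags & PME_D3COLD_CAPABLE;
	return flags & D3COLD_ALLOWED;
}
```

In this model a switch port left in STATE_D0 never clears D3COLD_ALLOWED, which is the situation the review asks about; only an active endpoint outside STATE_D3HOT vetoes the transition.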
* Re: [PATCH 2/3] usb: dwc3: xilinx: use reset_control_reset() in versal init
From: Thinh Nguyen @ 2026-05-19 22:39 UTC
To: Pandey, Radhey Shyam
Cc: Thinh Nguyen, Radhey Shyam Pandey, gregkh@linuxfoundation.org,
michal.simek@amd.com, p.zabel@pengutronix.de,
linux-usb@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, git@amd.com
In-Reply-To: <95361d3a-34b6-4f0f-935e-9e1b45698e81@amd.com>
On Mon, May 18, 2026, Pandey, Radhey Shyam wrote:
> On 5/14/2026 7:04 AM, Thinh Nguyen wrote:
> > On Mon, May 11, 2026, Radhey Shyam Pandey wrote:
> > > Replace separate reset_control_assert() and reset_control_deassert() calls
> > > with reset_control_reset(), which pulses the reset in one step. Report
> > > failures with dev_err_probe() and a single message. No functional change.
> > >
> >
> > The behavior of reset_control_reset() is a little different. I wouldn't
> > call this "No functional change". However, I assumed this was tested.
> > Please provide a proper reason for this change in the change log.
>
> In the dwc3-xilinx case, reset_control_reset() routes through the
> zynqmp reset driver and invokes PM_RESET_ACTION_PULSE. This triggers
> the Xilinx firmware reset implementation, which performs both assert
> and deassert. Effectively, reset() issues a single SMC call for a
> reset pulse instead of separate assert and deassert calls and moves
> IP out of reset.
>
> Yes this new reset sequence is validated on HW. I will include
> above description and respin v2..
>
Thanks!
Thinh
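
To make the behavioral difference discussed above concrete, here is a minimal user-space sketch contrasting an assert/deassert pair with a single pulse operation. All names (struct model_reset_ops, reset_by_toggle(), reset_by_pulse()) are hypothetical, and the error value for a provider without a pulse operation is an assumption of this model.

```c
#include <errno.h>
#include <stddef.h>

/* Counts how often each provider operation was invoked. */
struct call_log {
	int pulses;
	int asserts;
	int deasserts;
};

/* Hypothetical provider operations. */
struct model_reset_ops {
	int (*pulse)(struct call_log *log);
	int (*assert_line)(struct call_log *log);
	int (*deassert_line)(struct call_log *log);
};

static int log_pulse(struct call_log *log)    { log->pulses++; return 0; }
static int log_assert(struct call_log *log)   { log->asserts++; return 0; }
static int log_deassert(struct call_log *log) { log->deasserts++; return 0; }

/* Old sequence: two provider calls, one per edge. */
static int reset_by_toggle(const struct model_reset_ops *ops,
			   struct call_log *log)
{
	int ret = ops->assert_line(log);

	if (ret)
		return ret;
	return ops->deassert_line(log);
}

/* New sequence: one provider call; fails if no pulse operation exists. */
static int reset_by_pulse(const struct model_reset_ops *ops,
			  struct call_log *log)
{
	if (!ops->pulse)
		return -ENOTSUP;
	return ops->pulse(log);
}
```

The two paths only end in the same hardware state if the provider's pulse operation both asserts and deasserts, which is what the reply above says the firmware does on this platform.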
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: Oliver Upton @ 2026-05-19 22:57 UTC
To: David Woodhouse
Cc: Paolo Bonzini, Marc Zyngier, Will Deacon, Jonathan Corbet,
Shuah Khan, kvm, Linux Doc Mailing List,
Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson,
Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann,
Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest
In-Reply-To: <1243d375846c4f4e20c229a6f09300126188fc8b.camel@infradead.org>
On Tue, May 19, 2026 at 10:58:05PM +0100, David Woodhouse wrote:
> On Tue, 2026-05-19 at 14:10 -0700, Oliver Upton wrote:
> > And in the absence of clear evidence of a guest depending on the broken
> > IGROUPR behavior, I don't see how the guest-side changes of Christoffer's
> > series are any different from the multitude of bug fixes that we take
> > every single release cycle. It is an unfortunate bug and I concur with
> > Marc that it doesn't seem like the sort of thing a guest could rely
> > upon.
>
> I find this concerning, because I've already explained this.
>
> There is a very real possibility of guests simply not *noticing* that
> they had bugs in this area, as it didn't *matter* what they wrote to
> these registers since it never worked.
>
> There is an even larger possibility of guests having worked around the
> original issue by *detecting* whether the registers were actually
> writable before choosing to use the alternative groups. And if such a
> guest launches on a new kernel and then needs to be rolled back to an
> older kernel, that will also break.

The onus is on you to substantiate this claim. I would imagine after
carrying the revert for so long that there must be at least one example
of such a guest?

What ifs and maybes do not meet the bar, in my opinion, for preserving
bug emulation in KVM. Of course there could be a little flexibility with
that but we need to have some way of discriminating between bug fixes
and genuine guest expectations around the behavior of virtual hardware.

> > Wrong or not, this behavior is documented unambiguously. From the VGICv2
> > UAPI documentation:
> >
> > """
> > Userspace should set GICD_IIDR before setting any other registers (both
> > KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_CPU_REGS) to ensure
> > the expected behavior. Unless GICD_IIDR has been set from userspace, writes
> > to the interrupt group registers (GICD_IGROUPR) are ignored.
> > """
> >
> > I'm not inclined to change that.
>
> That'll all very well... but as far as I can tell, QEMU *doesn't* set
> GICD_IIDR, so it still gets the bizarre behaviour where the *guest* can
> write the registers, but userspace can't. So it looks like it'll work
> except migration will fail. Am I missing something?
That's exactly it, and why I said tying up UAPI opt-in with
guest-visible registers is a really bad idea.
> But honestly, I don't care one iota about GICv2; I was only trying to
> do the cleanup while I was there. Feel free to drop that part entirely.
>
> > As a way out of this whole mess, can we
> > instead:
> >
> > - Allow userspace to set IIDR.Revision to 1
> >
> > - Drop any bug emulation from the handling of IGROUPR registers
>
> It doesn't make sense to allow setting IIDR.Revision to 1 *without* the
> one-liner that actually implements the corresponding behaviour change
> in the IGROUPR registers.
As I described earlier, this whole IIDR crap inarguably broke UAPI and
obviously normal guest behavior (i.e. reading the register). At minimum
we need to permit previously-valid values for IIDR, even if they carry
no implied behaviors.
> And as explained at least twice now, it's the
> behaviour change that's *important* here.
>
> The fact that it's a long-standing bug in KVM which downstream has been
> working around for a long time doesn't matter. The unconditional
> behavioural change *is* a bug and we should fix it.

That is the nature of a bug fix. If you can provide some concrete
evidence of a guest depending on the RAZ/WI behavior then I agree we
need to preserve the old behavior.

Otherwise I see this as a matter of principle in how we do bug fixes to
KVM. Even if upstream took the strictest possible stance towards behavior
changes we will invariably fail to account for some minutia.

> > - Special-case the stupid GICv2 UAPI where IGROUPR are only writable if
> > the VMM has written to IIDR and the revision >= 2
>
> That already *is* a special case, right? And you'd rather leave it as it is?
Left as documented, yes. With the exception that revision == 1 writes
not be considered opt-in to restorable IGROUPR.
Thanks,
Oliver
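
One possible reading of the proposal in this message, written as a user-space model. This is a sketch under assumptions and not KVM code: struct model_vgic_v2 and the three helpers are hypothetical, and the model covers only the gating described in the bullet points above (guest writes always land, userspace writes need a prior IIDR write with revision >= 2, and revision 1 is accepted without acting as an opt-in).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the GICv2 distributor state discussed here. */
struct model_vgic_v2 {
	unsigned int revision;
	bool iidr_written;	/* userspace has written GICD_IIDR */
	uint32_t igroupr;
};

/* Userspace sets the implementation revision via GICD_IIDR. */
static void user_set_iidr(struct model_vgic_v2 *v, unsigned int revision)
{
	v->revision = revision;
	v->iidr_written = true;
}

/*
 * Userspace write to IGROUPR: honoured only after an IIDR write with
 * revision >= 2. Revision 1 is accepted above but is not an opt-in.
 */
static bool user_write_igroupr(struct model_vgic_v2 *v, uint32_t val)
{
	if (!v->iidr_written || v->revision < 2)
		return false;	/* write ignored */

	v->igroupr = val;
	return true;
}

/* Guest write to IGROUPR: no bug emulation, always takes effect. */
static void guest_write_igroupr(struct model_vgic_v2 *v, uint32_t val)
{
	v->igroupr = val;
}
```

The model also shows the migration hazard raised earlier in the thread: a guest can change igroupr on a VMM that never wrote IIDR, and the userspace restore path then refuses to write the same value back.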
* Re: [PATCH v4 11/24] iommu: Add iommu_report_device_broken() to quarantine a broken device
From: Jason Gunthorpe @ 2026-05-19 23:02 UTC
To: Nicolin Chen
Cc: Will Deacon, Robin Murphy, Joerg Roedel, Bjorn Helgaas,
Rafael J . Wysocki, Len Brown, Pranjal Shrivastava, Mostafa Saleh,
Lu Baolu, Kevin Tian, linux-arm-kernel, iommu, linux-kernel,
linux-acpi, linux-pci, vsethi, Shuai Xue
In-Reply-To: <agzkldmlG1cuDkj4@Asurada-Nvidia>
On Tue, May 19, 2026 at 03:30:45PM -0700, Nicolin Chen wrote:
> On Tue, May 19, 2026 at 04:16:26PM -0300, Jason Gunthorpe wrote:
> > On Tue, May 19, 2026 at 11:29:23AM -0700, Nicolin Chen wrote:
> > > On Tue, May 19, 2026 at 09:07:37AM -0300, Jason Gunthorpe wrote:
> > > > On Mon, May 18, 2026 at 08:38:54PM -0700, Nicolin Chen wrote:
> > > Then, the core needs to block the device using the similar routine
> > > to the reset prepare(). And that needs to hold group->mutex, so it
> > > needs an async worker.
> > >
> > > Do you see a much simpler way?
> >
> > Put the work on the dev_iommu and forget about rcu.
> >
> > But this is all probably better as some later series if at all. The
> > driver can block the ATS and the expectation is something will FLR the
> > device. The FLR will set the blocking and then restore the
> > domain. None of this async work seems functionally necessary, though
> > it would be a nice to have. Lets focus on the bare minimum here it, it
> > is already a difficult enough problem without tacking on these
> > extras..
>
> OK. So you are suggesting a quarantine at the driver-level only:
>
> 1. Driver detects ATC_INV timeout during an invalidation.
> 2. Driver retries the commands to identify the master.
I might argue to push even this out to a followup series given it is
complex and I suspect it becomes much simpler after the batch
removal...
> 3. Driver calls pci_disable_ats() and clears STE.EATS.
> 4. Driver marks domain->invs ATS entries as BROKEN.
> (optional since pci_disable_ats() is done?)
We need to stop sending invs otherwise there will be trouble making
forward progress.
> 5. Driver sets master->ats_broken to fence concurrent attach:
> arm_smmu_write_ste() and arm_smmu_ats_supported().
Not sure this is needed, if we race some attach then the attach will
re-set EATS, get another timeout and clear EATS. Doesn't seem worth
trying to optimize for.
> 6. Something external triggers an FLR (sysfs or AER).
> 7. FLR goes through pci_dev_reset_iommu_prepare()/done(). done()
> reverts 3+4 and calls the reset_device_done callback clearing
> master->ats_broken (5).
It should restore core/driver/hw synchronization of EATS and the
pci_enable_ats() by installing a blocking domain. Then it can go on to
re-attach a translating domain and everything is back to correct.
We do need to push a pci error event (didn't see that in this series)
so the driver can catch it and start the FLR process. I suppose that
will still need to bounce through a workqueue, and once you have that
it can also set the blocked domain prior to calling out to the driver.
Jason
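
The driver-level flow sketched in this reply can be summarized as a small state machine. This is one reading of the discussion, written as a user-space model; the state and event names are hypothetical and no SMMU or PCI call is involved.

```c
#include <stdbool.h>

/* Hypothetical states of ATS for one device, as seen by the driver. */
enum ats_state {
	ATS_ON,			/* EATS set, invalidations are sent */
	ATS_OFF_AFTER_TIMEOUT,	/* ATS disabled, EATS cleared */
	ATS_BLOCKED,		/* blocking domain installed by the FLR path */
};

enum ats_event {
	EV_INV_TIMEOUT,		/* ATC invalidation timed out */
	EV_FLR,			/* reset path installs the blocking domain */
	EV_ATTACH,		/* a translating domain is attached */
};

static enum ats_state ats_next(enum ats_state s, enum ats_event e)
{
	switch (s) {
	case ATS_ON:
		if (e == EV_INV_TIMEOUT)
			return ATS_OFF_AFTER_TIMEOUT;
		break;
	case ATS_OFF_AFTER_TIMEOUT:
		if (e == EV_FLR)
			return ATS_BLOCKED;
		/* A racing attach re-sets EATS; a new timeout clears it again. */
		if (e == EV_ATTACH)
			return ATS_ON;
		break;
	case ATS_BLOCKED:
		if (e == EV_ATTACH)
			return ATS_ON;
		break;
	}
	return s;
}

/* Invalidations must stop once ATS has been turned off. */
static bool ats_should_send_inv(enum ats_state s)
{
	return s == ATS_ON;
}
```

The model has no separate "broken" fence: as argued above, a racing attach simply lands back in ATS_ON and the next timeout moves the device to ATS_OFF_AFTER_TIMEOUT again.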
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: David Woodhouse @ 2026-05-19 23:33 UTC
To: Oliver Upton
Cc: Paolo Bonzini, Marc Zyngier, Will Deacon, Jonathan Corbet,
Shuah Khan, kvm, Linux Doc Mailing List,
Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson,
Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann,
Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest
In-Reply-To: <agzq5kwzuJvd7Mh5@kernel.org>
On Tue, 2026-05-19 at 15:57 -0700, Oliver Upton wrote:
> On Tue, May 19, 2026 at 10:58:05PM +0100, David Woodhouse wrote:
> > On Tue, 2026-05-19 at 14:10 -0700, Oliver Upton wrote:
> > > And in the absence of clear evidence of a guest depending on the broken
> > > IGROUPR behavior, I don't see how the guest-side changes of Christoffer's
> > > series are any different from the multitude of bug fixes that we take
> > > every single release cycle. It is an unfortunate bug and I concur with
> > > Marc that it doesn't seem like the sort of thing a guest could rely
> > > upon.
> >
> > I find this concerning, because I've already explained this.
> >
> > There is a very real possibility of guests simply not *noticing* that
> > they had bugs in this area, as it didn't *matter* what they wrote to
> > these registers since it never worked.
> >
> > There is an even larger possibility of guests having worked around the
> > original issue by *detecting* whether the registers were actually
> > writable before choosing to use the alternative groups. And if such a
> > guest launches on a new kernel and then needs to be rolled back to an
> > older kernel, that will also break.
>
> The onus is on you to substantiate this claim. I would imagine after
> carrying the revert for so long that there must be at least one example
> of such a guest?
What? No. We have *avoided* having the bug, specifically so that we do
not find out the consequences of the bug.
> What ifs and maybes do not meet the bar, in my opinion, for preserving
> bug emulation in KVM. Of course there could be a little flexibility with
> that but we need to have some way of discriminating between bug fixes
> and genuine guest expectations around the behavior of virtual hardware.

I believe you have this completely backwards.

The expectation of KVM is that we do not change guest-visible behaviour if
there's any reasonable chance that it might cause problems.

A stable and mature platform doesn't get to play in its ivory tower and
randomly inflict breakage on guests because they "deserve it".

I've literally explained the potential failure modes, including the one
on rollback if a guest *does* change the group configuration and then
needs to be rolled back to the older kernel that doesn't support it.

And yes, "ifs and maybes" absolutely *are* the quality bar expected by
KVM because, again, as already explained more than once, as we
accumulate a bunch of such "unlikely" breakages in a fleet upgrade
from, say, 6.1 to 6.12, the likelihood of *one* of them actually
turning out to afflict *one* of the zoo of guest operating systems
approaches 1.

We don't get to just YOLO it.

> > > Wrong or not, this behavior is documented unambiguously. From the VGICv2
> > > UAPI documentation:
> > >
> > > """
> > > Userspace should set GICD_IIDR before setting any other registers (both
> > > KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_CPU_REGS) to ensure
> > > the expected behavior. Unless GICD_IIDR has been set from userspace, writes
> > > to the interrupt group registers (GICD_IGROUPR) are ignored.
> > > """
> > >
> > > I'm not inclined to change that.
> >
> > That'll all very well... but as far as I can tell, QEMU *doesn't* set
> > GICD_IIDR, so it still gets the bizarre behaviour where the *guest* can
> > write the registers, but userspace can't. So it looks like it'll work
> > except migration will fail. Am I missing something?
>
> That's exactly it, and why I said tying up UAPI opt-in with
> guest-visible registers is a really bad idea.
>
> > But honestly, I don't care one iota about GICv2; I was only trying to
> > do the cleanup while I was there. Feel free to drop that part entirely.
> >
> > > As a way out of this whole mess, can we
> > > instead:
> > >
> > > - Allow userspace to set IIDR.Revision to 1
> > >
> > > - Drop any bug emulation from the handling of IGROUPR registers
> >
> > It doesn't make sense to allow setting IIDR.Revision to 1 *without* the
> > one-liner that actually implements the corresponding behaviour change
> > in the IGROUPR registers.
>
> As I described earlier, this whole IIDR crap inarguably broke UAPI and
> obviously normal guest behavior (i.e. reading the register). At minimum
> we need to permit previously-valid values for IIDR, even if they carry
> no implied behaviors.
But the whole *point* of IIDR is to preserve the behaviour. To set the
IIDR and *not* have the corresponding behaviour is insanity.
> > And as explained at least twice now, it's the
> > behaviour change that's *important* here.
> >
> > The fact that it's a long-standing bug in KVM which downstream has been
> > working around for a long time doesn't matter. The unconditional
> > behavioural change *is* a bug and we should fix it.
>
> That is the nature of a bug fix. If you can provide some concrete
> evidence of a guest depending on the RAZ/WI behavior then I agree we
> need to preserve the old behavior.
>
> Otherwise I see this as a matter of principle in how we do bug fixes to
> KVM. Even if upstream took the strictest possible stance towards behavior
> changes we will invariably fail to account for some minutia.

No. Don't pretend that this is hard. KVM on x86 has been quietly
getting this right for years.

Yes, there is sometimes *some* subjectivity around it, and it's
sometimes reasonable to just unilaterally change behaviours. This is
not, and was not, one of those cases.

> > > - Special-case the stupid GICv2 UAPI where IGROUPR are only writable if
> > > the VMM has written to IIDR and the revision >= 2
> >
> > That already *is* a special case, right? And you'd rather leave it as it is?
>
> Left as documented, yes. With the exception that revision == 1 writes
> not be considered opt-in to restorable IGROUPR.
Don't do that. Just leave it broken, with QEMU not even working. I'm
beyond caring about GICv2 now.
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
From: Bjorn Helgaas @ 2026-05-19 23:48 UTC
To: Jason Gunthorpe
Cc: Nicolin Chen, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
In-Reply-To: <20260519222335.GK3602937@nvidia.com>
On Tue, May 19, 2026 at 07:23:35PM -0300, Jason Gunthorpe wrote:
> On Tue, May 19, 2026 at 02:36:49PM -0500, Bjorn Helgaas wrote:
> > One motivation for putting this in the PCI core was to use the quirk
> > infrastructure, but this series doesn't use any of that. It doesn't
> > declare any fixups, e.g., DECLARE_PCI_FIXUP_FINAL, and it doesn't
> > update any state cached by the PCI core.
>
> It works like the acs quirks that are in the quirks file, which are
> also arguably only used by iommu too :)
True, although ACS has a lot more PCI-specific grunge in it, including
all the "pci=config_acs" and "pci=disable_acs_redir" stuff.
> I'm not keen on spreading lists of device ids for PCI quirks to iommu
> files, but it would be OK to move pci_ats_always_on() to
> iommu_ats_always_on() that calls the PCI quirk function.

Yeah, I guess it's fair to collect the device IDs in PCI since this is
about characteristics of the device.

If we leave stuff in drivers/pci/, I would prefer that part of it be
named to be purely informational, i.e., "CXL.cache_enabled" or
something similar that would also cover the NVIDIA devices.

"pci_ats_always_on()" doesn't sound quite right to me because it
presupposes the policy choice that IOMMU is going to make; that PCI
function doesn't actually turn ATS on, and it looks like the question
of enabling ATS depends on how the device is actually *used*. E.g.,
if Cache_Enable is not set, is ATS required?

That raises the question of whether this is the right test:

  +	if (pci_read_config_word(pdev, offset + PCI_DVSEC_CXL_CAP, &cap))
  +		return false;
  +
  +	return cap & PCI_DVSEC_CXL_CACHE_CAPABLE;

That just says the device is *capable* of CXL.cache; should it check
whether CXL.cache is *enabled* instead?

Bjorn
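
The capable-versus-enabled distinction asked about above, as a minimal user-space sketch. The structure and both bit positions are assumptions made for this model and should be checked against the CXL specification and the kernel headers; no config space access is performed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define MODEL_CXL_CACHE_CAPABLE (1u << 0)	/* DVSEC capability word */
#define MODEL_CXL_CACHE_ENABLE  (1u << 0)	/* DVSEC control word */

/* Hypothetical snapshot of the two DVSEC words involved. */
struct model_cxl_dvsec {
	uint16_t cap;
	uint16_t ctrl;
};

/* What the quoted test checks: the device could do CXL.cache. */
static bool cxl_cache_capable(const struct model_cxl_dvsec *d)
{
	return d->cap & MODEL_CXL_CACHE_CAPABLE;
}

/* The stricter alternative: CXL.cache is actually turned on. */
static bool cxl_cache_enabled(const struct model_cxl_dvsec *d)
{
	return cxl_cache_capable(d) && (d->ctrl & MODEL_CXL_CACHE_ENABLE);
}
```

A device with the capability bit set and the enable bit clear passes the first check and fails the second, which is exactly the case where the two policies would disagree.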
* Re: [PATCH net-next v2 2/2] net: ti: icssg: Add HSR and LRE PA statistics
From: Jakub Kicinski @ 2026-05-19 23:56 UTC
To: Luka Gejak
Cc: MD Danish Anwar, Felix Maurer, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, Jonathan Corbet, Shuah Khan,
Roger Quadros, Andrew Lunn, Meghana Malladi, Jacob Keller,
David Carlier, Vadim Fedorenko, Kevin Hao, netdev, linux-doc,
linux-kernel, linux-arm-kernel, Vladimir Oltean
In-Reply-To: <E30AAC96-01D2-4A23-B562-126087DEB7FA@linux.dev>
On Tue, 19 May 2026 07:55:55 +0200 Luka Gejak wrote:
> On May 19, 2026 3:45:06 AM GMT+02:00, Jakub Kicinski <kuba@kernel.org> wrote:
> >On Thu, 14 May 2026 13:26:05 +0530 MD Danish Anwar wrote:
> >> Add new firmware PA statistics counters for HSR and LRE to the ethtool
> >> statistics exposed by the ICSSG driver.
> >>
> >> New statistics added:
> >> - FW_HSR_FWD_CHECK_FAIL_DROP: Packets dropped on the HSR forwarding path
> >> - FW_HSR_HE_CHECK_FAIL_DROP: Packets dropped on the HSR host egress path
> >> - FW_HSR_SKIP_HOST_DUP_DISCARD_FRAMES: Frames with duplicate discard
> >> skipped
> >> - FW_LRE_CNT_UNIQUE/DUPLICATE/MULTIPLE_RX: LRE duplicate detection
> >> counters
> >> - FW_LRE_CNT_RX/TX: LRE per-port frame counters
> >> - FW_LRE_CNT_OWN_RX: Own HSR tagged frames received
> >> - FW_LRE_CNT_ERRWRONGLAN: Frames with wrong LAN identifier (PRP)
> >>
> >> Document the new HSR/LRE statistics in icssg_prueth.rst.
> >
> >To an untrained eye these stats look like stuff that could
> >be standardized across drivers.
> >
> >Luka, Felix, others on CC, do you think we should expose these
> >from HSR over netlink as "standard" offload stats different drivers
> >can plug into or not worth it?
>
> I think there is a case for standardizing part of this, but I would
> not standardize the whole set as-is.
>
> The LRE counters look generic enough to me, especially:
> - unique rx
> - duplicate rx
> - multiple rx
> - rx / tx
> - own rx
> - wrong LAN, PRP only
>
> Those are protocol/LRE concepts rather than TI firmware details, so
> exposing them from the HSR/PRP layer sounds useful. I would expect
> both the software implementation and offloaded implementations to be
> able to provide at least some of them, with unsupported counters
> omitted or reported as not available.
> I would not put the firmware check/drop counters in the same standard
> bucket, though:
> - FW_HSR_FWD_CHECK_FAIL_DROP
> - FW_HSR_HE_CHECK_FAIL_DROP
> - FW_HSR_SKIP_HOST_DUP_DISCARD_FRAMES
Thanks for the breakdown!
> Those sound more like implementation/debug counters for the ICSSG
> firmware pipeline. They are still useful in ethtool driver stats, but
> I would be hesitant to bake their exact semantics into HSR UAPI.
> So my preference would be:
> 1. Keep driver-private ethtool stats for the full firmware counter set.
> 2. Add a small HSR/PRP standard stats set separately, limited to
> well-defined LRE counters.
> 3. Make the HSR layer expose them, with offload drivers plugging in via
> an optional callback or offload stats op.
> 4. Define the counters carefully, including whether they are per-HSR
> device or per-port A/B, and what PRP-only counters mean for HSR.
>
> I do not think this patch should blindly become the UAPI definition,
Not at all, the unique / multiple stats gave me pause. We should
only put in the standard API what can be easily and unambiguously
defined given the protocol spec.
> but I do think it points at a useful follow-up. If we want to avoid
> adding driver-private names first and then standardizing different
> names later, then it may be worth asking Danish to split the
> protocol-level LRE counters out and route those through a common HSR
> stats interface.
As a general policy we ask for standard stats to be added first and
ethtool to only contain what didn't fit in the standard ones.
There are some technical reasons but it's mostly a mindset thing.
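
To make the "standard set with optional counters" idea concrete, here is a
minimal userspace sketch. It assumes a fixed struct limited to the
protocol-level LRE counters, pre-filled with a "not set" sentinel so that an
offload driver only overwrites what it can actually provide. Every name below
(lre_stats, LRE_STAT_NOT_SET, lre_collect_stats, example_offload_fill) is
hypothetical and not an existing kernel interface.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel: the implementation does not provide this counter. */
#define LRE_STAT_NOT_SET UINT64_MAX

/* Hypothetical standard set, limited to protocol-level LRE counters. */
struct lre_stats {
	uint64_t cnt_unique_rx;
	uint64_t cnt_duplicate_rx;
	uint64_t cnt_multiple_rx;
	uint64_t cnt_rx;
	uint64_t cnt_tx;
	uint64_t cnt_own_rx;
	uint64_t cnt_err_wrong_lan;	/* PRP only */
};

/* Hypothetical optional callback an offload driver could plug in. */
typedef void (*lre_get_stats_fn)(void *priv, struct lre_stats *stats);

static bool lre_stat_is_set(uint64_t v)
{
	return v != LRE_STAT_NOT_SET;
}

/* Mark every counter as "not set", then let the offload fill what it has. */
static void lre_collect_stats(lre_get_stats_fn offload, void *priv,
			      struct lre_stats *out)
{
	/* All members are uint64_t, so 0xff bytes yield UINT64_MAX everywhere. */
	memset(out, 0xff, sizeof(*out));
	if (offload)
		offload(priv, out);
}

/* Example offload that only knows per-port rx/tx, passed as a 2-entry array. */
static void example_offload_fill(void *priv, struct lre_stats *stats)
{
	const uint64_t *hw = priv;

	stats->cnt_rx = hw[0];
	stats->cnt_tx = hw[1];
}
```

A reporting layer built this way would skip any counter for which
lre_stat_is_set() returns false, which matches the "omitted or reported as
not available" behaviour suggested above.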
^ permalink raw reply
* Re: [PATCH v5 4/5] PCI: dwc: Use common D3cold eligibility helper in suspend path
From: Bjorn Helgaas @ 2026-05-20 0:01 UTC (permalink / raw)
To: Krishna Chaitanya Chundru
Cc: Jingoo Han, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas,
Will Deacon, linux-pci, linux-kernel, linux-arm-msm,
linux-arm-kernel, jonathanh, bjorn.andersson, Frank Li, linux-pm
In-Reply-To: <20260429-d3cold-v5-4-89e9735b9df6@oss.qualcomm.com>
[+cc Frank, linux-pm]
On Wed, Apr 29, 2026 at 12:12:26PM +0530, Krishna Chaitanya Chundru wrote:
> Previously, the driver skipped putting the link into L2/device state in
> D3cold whenever L1 ASPM was enabled, since some devices (e.g. NVMe) expect
> low resume latency and may not tolerate deeper power states.
I think "some devices expect low resume latency and may not tolerate
deeper power states" conveys the wrong message. It's not that NVMe
has a mysterious acceptable resume latency number that we have to meet
or that NVMe has some inherent aversion to D3cold or L1SS or whatever
"deeper power states" refers to.
It could be that ASPM L1 was configured incorrectly (e.g., an L1->L0
transition didn't happen within the advertised exit latency, leading
to some device access failure) or a device lost internal context when
the driver didn't expect it (e.g., the Qcom problem where L1SS exit
takes too long and results in a link-down and device reset [1]).
It sounds to me like the ASPM L1 check was a way to avoid problems
like that, but I don't think we ever really had a root cause.
[1] https://lore.kernel.org/linux-pci/20260519-l1ss-fix-v2-0-b2c3a4bdeb15@oss.qualcomm.com/
> However, such devices typically remain in D0 and are already covered
> by the new helper's requirement that all endpoints be in D3hot
> before the devices under host bridge may enter D3cold.
If we put the host bridge in D3cold, I assume the hierarchy below is
either put in D3cold as well, or at least every device in the
hierarchy will be reset as a consequence of the Root Port link going
down.
If the driver doesn't manage the device power state itself, I assume
we have the freedom to put the hierarchy in D3cold or reset it.
Do we have the same freedom if the driver *does* manage the power
state itself? What if the driver put the device in D3hot, expecting
it to *stay* in D3hot?
I think pci_host_common_d3cold_possible() will see the device in D3hot
and decide that D3cold is possible.
(I'm looking at https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/power/pci.rst?id=v7.0#n746)
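
The concern can be illustrated with a small userspace sketch. It is based
only on the commit log's description ("all endpoints be in D3hot") and is
not the actual pci_host_common_d3cold_possible() implementation; the types
and the driver_managed field are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_pstate { DEMO_D0, DEMO_D3HOT };

struct demo_ep {
	enum demo_pstate state;
	bool driver_managed;	/* hypothetical: driver chose the state itself */
};

/*
 * Eligibility as described in the commit log: every endpoint must be in
 * D3hot. Note that driver_managed is never consulted, so a device whose
 * driver expects it to stay in D3hot is treated like any other.
 */
static bool demo_d3cold_possible(const struct demo_ep *eps, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (eps[i].state != DEMO_D3HOT)
			return false;
	}
	return true;
}
```

With a check of this shape, one endpoint left in D0 blocks D3cold, but a
driver-managed endpoint parked in D3hot does not, which is the case being
asked about.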
> So, replace the local L1/L1SS-based check in dw_pcie_suspend_noirq() with
> the shared pci_host_common_d3cold_possible() helper to decide whether the
> devices under host bridge can safely transition to D3cold.
>
> In addition, propagate PME-from-D3cold capability information from the
> helper and record it in skip_pwrctrl_off. Some devices (e.g. M.2 cards
> without auxiliary power) may lose PME detection when main power is
> removed, even if they advertise PME-from-D3cold support. This allows
> controller power-off to be skipped when required to preserve wakeup
> functionality.
>
> Update the suspended flag in dw_pcie_resume_noirq() only after the PCIe
> link resumes successfully, to avoid marking the controller active when
> link resume fails.
>
> Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
> ---
> drivers/pci/controller/dwc/pcie-designware-host.c | 15 +++++++--------
> drivers/pci/controller/dwc/pcie-designware.h | 1 +
> 2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index c9517a348836..9e409a1909e6 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -16,9 +16,11 @@
> #include <linux/msi.h>
> #include <linux/of_address.h>
> #include <linux/of_pci.h>
> +#include <linux/pci.h>
> #include <linux/pci_regs.h>
> #include <linux/platform_device.h>
>
> +#include "../pci-host-common.h"
> #include "../../pci.h"
> #include "pcie-designware.h"
>
> @@ -1218,18 +1220,14 @@ static int dw_pcie_pme_turn_off(struct dw_pcie *pci)
>
> int dw_pcie_suspend_noirq(struct dw_pcie *pci)
> {
> - u8 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> + bool pme_capable = false;
> int ret = 0;
> u32 val;
>
> if (!dw_pcie_link_up(pci))
> goto stop_link;
>
> - /*
> - * If L1SS is supported, then do not put the link into L2 as some
> - * devices such as NVMe expect low resume latency.
> - */
> - if (dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKCTL) & PCI_EXP_LNKCTL_ASPM_L1)
> + if (!pci_host_common_d3cold_possible(pci->pp.bridge, &pme_capable))
> return 0;
>
> if (pci->pp.ops->pme_turn_off) {
> @@ -1273,6 +1271,7 @@ int dw_pcie_suspend_noirq(struct dw_pcie *pci)
> udelay(1);
>
> stop_link:
> + pci->pp.skip_pwrctrl_off = pme_capable;
> dw_pcie_stop_link(pci);
> if (pci->pp.ops->deinit)
> pci->pp.ops->deinit(&pci->pp);
> @@ -1290,8 +1289,6 @@ int dw_pcie_resume_noirq(struct dw_pcie *pci)
> if (!pci->suspended)
> return 0;
>
> - pci->suspended = false;
> -
> if (pci->pp.ops->init) {
> ret = pci->pp.ops->init(&pci->pp);
> if (ret) {
> @@ -1313,6 +1310,8 @@ int dw_pcie_resume_noirq(struct dw_pcie *pci)
> if (pci->pp.ops->post_init)
> pci->pp.ops->post_init(&pci->pp);
>
> + pci->suspended = false;
> +
> return 0;
>
> err_stop_link:
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> index 3e69ef60165b..e759c5c7257e 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -450,6 +450,7 @@ struct dw_pcie_rp {
> bool ecam_enabled;
> bool native_ecam;
> bool skip_l23_ready;
> + bool skip_pwrctrl_off;
> };
>
> struct dw_pcie_ep_ops {
>
> --
> 2.34.1
>
^ permalink raw reply
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
From: Jason Gunthorpe @ 2026-05-20 0:05 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Nicolin Chen, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
In-Reply-To: <20260519234801.GA21369@bhelgaas>
On Tue, May 19, 2026 at 06:48:01PM -0500, Bjorn Helgaas wrote:
> On Tue, May 19, 2026 at 07:23:35PM -0300, Jason Gunthorpe wrote:
> > On Tue, May 19, 2026 at 02:36:49PM -0500, Bjorn Helgaas wrote:
> > > One motivation for putting this in the PCI core was to use the quirk
> > > infrastructure, but this series doesn't use any of that. It doesn't
> > > declare any fixups, e.g., DECLARE_PCI_FIXUP_FINAL, and it doesn't
> > > update any state cached by the PCI core.
> >
> > It works like the acs quirks that are in the quirks file, which are
> > also arguably only used by iommu too :)
>
> True, although ACS has a lot more PCI-specific grunge in it, including
> all the "pci=config_acs" and "pci=disable_acs_redir" stuff.
>
> > I'm not keen on spreading lists of device ids for PCI quirks to iommu
> > files, but it would be OK to move pci_ats_always_on() to
> > iommu_ats_always_on() that calls the PCI quirk function.
>
> Yeah, I guess it's fair to collect the device IDs in PCI since this is
> about characteristics of the device.
>
> If we leave stuff in drivers/pci/, I would prefer that part of it be
> named to be purely informational, i.e., "CXL.cache_enabled" or
> something similar that would also cover the NVIDIA devices.
Yeah, that's fair, so let's rename it to
pci_translated_required()
ie the device requires translated requests to function. This is what
CXL.cache implies (IIRC I was told the spec specifically says this)
Requiring translated requests implies you have to enable ATS in the
system.
> function doesn't actually turn ATS on, and it looks like the question
> of enabling ATS depends on how the device is actually *used*. E.g.,
> if Cache_Enable is not set, is ATS required?
We have no way to know..
> That raises the question of whether this is the right test:
>
> + if (pci_read_config_word(pdev, offset + PCI_DVSEC_CXL_CAP, &cap))
> + return false;
> +
> + return cap & PCI_DVSEC_CXL_CACHE_CAPABLE;
>
> That just says the device is *capable* of CXL.cache; should it check
> whether CXL.cache is *enabled* instead?
No, we talked about this with Dan in one of the versions... it is
better to over-enable ATS than under-enable. over-enable at best is a
NOP, or maybe a tiny performance loss, under-enable is a functional
failure.
If the CXL.cache is not enabled right now it could become enabled
later, after the iommu has already called this and made its
choice..
Thus let's not try to be too narrow here..
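
As a sketch of why the capability bit is the safer thing to test: the
answer derived from "capable" does not change when the feature is switched
on later, while the answer derived from "enabled" does. The bit positions
and names below are illustrative only and are not taken from the CXL DVSEC
definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions, not the real register layout. */
#define DEMO_CXL_CACHE_CAPABLE	(1u << 0)
#define DEMO_CXL_CACHE_ENABLE	(1u << 0)

struct demo_dev {
	uint16_t cxl_cap;	/* static capability register */
	uint16_t cxl_ctrl;	/* control register, may change at runtime */
};

/* Stable for the lifetime of the device: safe to sample once. */
static bool demo_translated_required(const struct demo_dev *d)
{
	return (d->cxl_cap & DEMO_CXL_CACHE_CAPABLE) != 0;
}

/* Only reflects the current moment; may flip after the IOMMU has decided. */
static bool demo_cache_enabled_now(const struct demo_dev *d)
{
	return (d->cxl_ctrl & DEMO_CXL_CACHE_ENABLE) != 0;
}
```

If the IOMMU driver samples demo_cache_enabled_now() at attach time and the
feature is enabled afterwards, it under-enables ATS; sampling the capability
can only over-enable.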
Thanks,
Jason
^ permalink raw reply
* Re: [PATCH v4 11/24] iommu: Add iommu_report_device_broken() to quarantine a broken device
From: Nicolin Chen @ 2026-05-20 0:21 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Will Deacon, Robin Murphy, Joerg Roedel, Bjorn Helgaas,
Rafael J . Wysocki, Len Brown, Pranjal Shrivastava, Mostafa Saleh,
Lu Baolu, Kevin Tian, linux-arm-kernel, iommu, linux-kernel,
linux-acpi, linux-pci, vsethi, Shuai Xue
In-Reply-To: <20260519230204.GM3602937@nvidia.com>
On Tue, May 19, 2026 at 08:02:04PM -0300, Jason Gunthorpe wrote:
> > OK. So you are suggesting a quarantine at the driver-level only:
> >
> > 1. Driver detects ATC_INV timeout during an invalidation.
> > 2. Driver retries the commands to identify the master.
>
> I might argue to push even this out to a followup series given it is
> complex and I suspect it becomes much simpler after the batch
> removal...
I see you suggest to treat the entire batch as ATS-broken. Just to
confirm: without per-SID retry, that might falsely block a healthy
device in the ATC batch, right? The driver now batches all ATC_INV
commands via arm_smmu_invs_end_batch().
> > 3. Driver calls pci_disable_ats() and clears STE.EATS.
> > 4. Driver marks domain->invs ATS entries as BROKEN.
> > (optional since pci_disable_ats() is done?)
>
> We need to stop sending invs otherwise there will be trouble making
> forward progress.
OK. This needs a surgical invs mutation: maybe the INV_TYPE_ATS_BROKEN
that you suggested.
> > 5. Driver sets master->ats_broken to fence concurrent attach:
> > arm_smmu_write_ste() and arm_smmu_ats_supported().
>
> Not sure this is needed, if we race some attach then the attach will
> re-set EATS, get another timeout and clear EATS. Doesn't seem worth
> trying to optimize for.
I didn't see that coming. master->ats_enabled && state->ats_enabled
in the commit() for a concurrent attachment would issue an ATC that
may timeout again to re-start the step 1.
And since arm_smmu_atc_inv_master() doesn't use domain->invs, it is
not affected by INV_TYPE_ATS_BROKEN. So, ATC_INV can continue to be
issued in this case.
Ah, I feel that we are walking in the mine field where every single
step could be a kaboom. But your insight is clearly a safe pathway.
> > 6. Something external triggers an FLR (sysfs or AER).
> > 7. FLR goes through pci_dev_reset_iommu_prepare()/done(). done()
> > reverts 3+4 and calls the reset_device_done callback clearing
> > master->ats_broken (5).
>
> It should restore core/driver/hw synchronization of EATS and the
> pci_enable_ats() by installing a blocking domain. Then it can go on to
> re-attach a translating domain and everything is back to correct.
Yea. We probably could drop the master->ats_broken, as done() would
be seemingly sufficient. I'll do the rework first, and see if there
might be some corner case.
> We do need to push a pci error event (didn't see that in this series)
> so the driver can catch it and start the FLR process. I suppose that
> will still need to bounce through a workqueue, and once you have that
> it can also set the blocked domain prior to calling out to the driver.
In the specific case that I am trying to tackle with this series, I
do see AER error prints from the device already but there is no FLR
process. So, I assume that, even if we push a PCI error event, that
wouldn't necessarily trigger an FLR?
Thanks
Nicolin
^ permalink raw reply
* Re: [PATCH] arm64/entry: Don't disable preemption in debug_exception_enter() with RT kernel
From: Luis Claudio R. Goncalves @ 2026-05-20 0:23 UTC (permalink / raw)
To: Waiman Long, Ada Couprie Diaz
Cc: Catalin Marinas, Will Deacon, Mark Rutland,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
linux-arm-kernel, linux-kernel, linux-rt-devel
In-Reply-To: <20260519222524.886454-1-longman@redhat.com>
On Tue, May 19, 2026 at 06:25:24PM -0400, Waiman Long wrote:
> Commit d8bb6718c4db ("arm64: Make debug exception handlers visible from
> RCU") introduces debug_exception_enter() and debug_exception_exit()
> where preemption is explicitly disabled. With a PREEMPT_RT debug kernel,
> the following bug report can happen.
>
> BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 15255, name: gdb_app
> preempt_count: 1, expected: 0
> RCU nest depth: 0, expected: 0
> 1 lock held by gdb_app/15255:
> #0: ffff10007f41b7d8 (&sighand->siglock){..}-{3:3}, at: force_sig_info_to_task+0x34/0x130
> Preemption disabled at:
> [<ffff800080081ea8>] debug_exception_enter+0x18/0x70
> :
> Call trace:
> dump_backtrace+0xac/0x130
> show_stack+0x1c/0x24
> dump_stack_lvl+0xa0/0xe0
> dump_stack+0x14/0x2c
> __might_resched+0x178/0x230
> rt_spin_lock+0x58/0x120
> force_sig_info_to_task+0x34/0x130
> force_sig_fault+0x58/0x80
> arm64_force_sig_fault+0x44/0x70
> send_user_sigtrap+0x5c/0xa0
> brk_handler+0x38/0x5c
> do_debug_exception+0x78/0x110
> el0_dbg+0x50/0x1e0
> el0t_64_sync_handler+0x114/0x150
> el0t_64_sync+0x17c/0x180
>
> Fix that by blocking the preempt_disable()/preempt_enable_no_resched()
> calls when CONFIG_PREEMPT_RT is enabled.
Hi Waiman!
Last year Ada Couprie Diaz wrote a patch series that greatly enhanced the
ARM64 debug exception code. In the cover letter there is a discussion about
the effect of the patches on RT[0] (look for PREEMPT_RT), explaining that
there are a few remaining known bugs and briefly discussing the best way to
fix them. There is also a discussion[1] about the specific issue you reported.
I took the liberty of adding Ada to the thread.
Best regards,
Luis
[0] https://lore.kernel.org/all/20250707114109.35672-1-ada.coupriediaz@arm.com/
[1] https://lore.kernel.org/linux-arm-kernel/e86c5c3a-6666-46a7-b7ec-e803212a81a1@arm.com/
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> arch/arm64/kernel/entry-common.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> index c7a23f7c2212..191441b22b7c 100644
> --- a/arch/arm64/kernel/entry-common.c
> +++ b/arch/arm64/kernel/entry-common.c
> @@ -290,15 +290,17 @@ static __always_inline void fpsimd_syscall_exit(void)
> }
>
> /*
> - * In debug exception context, we explicitly disable preemption despite
> - * having interrupts disabled.
> + * In debug exception context, we explicitly disable preemption except for
> + * PREEMPT_RT kernel as rt_spin_lock() can be called.
> + *
> * This serves two purposes: it makes it much less likely that we would
> * accidentally schedule in exception context and it will force a warning
> * if we somehow manage to schedule by accident.
> */
> static void debug_exception_enter(struct pt_regs *regs)
> {
> - preempt_disable();
> + if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> + preempt_disable();
>
> /* This code is a bit fragile. Test it. */
> RCU_LOCKDEP_WARN(!rcu_is_watching(), "exception_enter didn't work");
> @@ -307,7 +309,8 @@ NOKPROBE_SYMBOL(debug_exception_enter);
>
> static void debug_exception_exit(struct pt_regs *regs)
> {
> - preempt_enable_no_resched();
> + if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> + preempt_enable_no_resched();
> }
> NOKPROBE_SYMBOL(debug_exception_exit);
>
> --
> 2.54.0
>
>
---end quoted text---
^ permalink raw reply
* Re: [PATCH v4 11/24] iommu: Add iommu_report_device_broken() to quarantine a broken device
From: Jason Gunthorpe @ 2026-05-20 0:30 UTC (permalink / raw)
To: Nicolin Chen
Cc: Will Deacon, Robin Murphy, Joerg Roedel, Bjorn Helgaas,
Rafael J . Wysocki, Len Brown, Pranjal Shrivastava, Mostafa Saleh,
Lu Baolu, Kevin Tian, linux-arm-kernel, iommu, linux-kernel,
linux-acpi, linux-pci, vsethi, Shuai Xue
In-Reply-To: <agz+kL2S8kcgHywG@Asurada-Nvidia>
On Tue, May 19, 2026 at 05:21:36PM -0700, Nicolin Chen wrote:
> On Tue, May 19, 2026 at 08:02:04PM -0300, Jason Gunthorpe wrote:
> > > OK. So you are suggesting a quarantine at the driver-level only:
> > >
> > > 1. Driver detects ATC_INV timeout during an invalidation.
> > > 2. Driver retries the commands to identify the master.
> >
> > I might argue to push even this out to a followup series given it is
> > complex and I suspect it becomes much simpler after the batch
> > removal...
>
> I see you suggest to treat the entire batch as ATS-broken. Just to
> confirm: without per-SID retry, that might falsely block a healthy
> device in the ATC batch, right? The driver now batches all ATC_INV
> commands via arm_smmu_invs_end_batch().
Yes, it is not good, but a giant complex series is not reviewable. So
I'd start with trashing all the devices, then come with a narrowing.
> > > 5. Driver sets master->ats_broken to fence concurrent attach:
> > > arm_smmu_write_ste() and arm_smmu_ats_supported().
> >
> > Not sure this is needed, if we race some attach then the attach will
> > re-set EATS, get another timeout and clear EATS. Doesn't seem worth
> > trying to optimize for.
>
> I didn't see that coming. master->ats_enabled && state->ats_enabled
> in the commit() for a concurrent attachment would issue an ATC that
> may timeout again to re-start the step 1.
>
> And since arm_smmu_atc_inv_master() doesn't use domain->invs, it is
> not affected by INV_TYPE_ATS_BROKEN. So, ATC_INV can continue to be
> issued in this case.
>
> Ah, I feel that we are walking in the mine field where every single
> step could be a kaboom. But your insight is clearly a safe pathway.
We cannot eliminate parallel ATS invalidation. Two threads could be
concurrently processing the invs list. So it has to handle it, the driver
is going to have to tolerate a number of redundant error events. It's
OK if the unlikely case of parallel attach also generates redundant
error events.
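
One way to tolerate redundant events, sketched in userspace C11: let every
thread report the timeout, but only the first reporter performs the
quarantine. This is an assumption about how it could be done, not what the
series implements; demo_master, its fields and the helpers are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct demo_master {
	atomic_bool quarantined;	/* hypothetical "already handled" flag */
	atomic_int quarantine_count;	/* how many times quarantine really ran */
};

static void demo_master_init(struct demo_master *m)
{
	atomic_init(&m->quarantined, false);
	atomic_init(&m->quarantine_count, 0);
}

/*
 * May be called from any number of threads that saw an ATC_INV timeout.
 * Returns true only for the caller that performed the transition; all
 * other callers see a redundant event and do nothing.
 */
static bool demo_report_ats_timeout(struct demo_master *m)
{
	if (atomic_exchange(&m->quarantined, true))
		return false;

	atomic_fetch_add(&m->quarantine_count, 1);
	return true;
}

/* Called once the reset has completed, so a later failure is handled again. */
static void demo_reset_done(struct demo_master *m)
{
	atomic_store(&m->quarantined, false);
}
```

The exchange makes the report idempotent without a lock, so a racing attach
or a second invalidation thread just produces a no-op.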
> > We do need to push a pci error event (didn't see that in this series)
> > so the driver can catch it and start the FLR process. I suppose that
> > will still need to bounce through a workqueue, and once you have that
> > it can also set the blocked domain prior to calling out to the driver.
>
> In the specific case that I am trying to tackle with this series, I
> do see AER error prints from the device already but there is no FLR
> process.
It depends on the driver, mlx5 has a FLR RAS flow for instance.
A driver with a device that can blow up ATS should implement the FLR
flow if it wants automatic RAS. It requires driver co-ordination.
But I wasn't thinking we can rely on existing AER events here, yes
probably there will be AERs associated with the device exploding so
badly it cannot do ATS, but also maybe not..
This is also a problem if we shoot healthy devices as the first stage,
there will not be an AER from the healthy ones..
So I guess we need to decide which is better to tackle, the dedicated
event or the single invalidation sequence..
Jason
^ permalink raw reply
* Re: [PATCH v2 5/6] gpio: remove machine hogs
From: Dmitry Torokhov @ 2026-05-20 0:46 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Linus Walleij, Bartosz Golaszewski, Geert Uytterhoeven,
Frank Rowand, Mika Westerberg, Andy Shevchenko, Aaro Koskinen,
Janusz Krzysztofik, Tony Lindgren, Russell King, Jonathan Corbet,
Shuah Khan, linux-gpio, linux-kernel, linux-acpi,
linux-arm-kernel, linux-omap, linux-doc
In-Reply-To: <20260309-gpio-hog-fwnode-v2-5-4e61f3dbf06a@oss.qualcomm.com>
On Mon, Mar 09, 2026 at 01:42:41PM +0100, Bartosz Golaszewski wrote:
> With no more users, remove legacy machine hog API from the kernel.
>
> Reviewed-by: Linus Walleij <linusw@kernel.org>
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Argh! What is the replacement for this? I have patches for rsk7203 to
use them to get rid of legacy gpio use, like this:
diff --git a/arch/sh/boards/mach-rsk/devices-rsk7203.c b/arch/sh/boards/mach-rsk/devices-rsk7203.c
index f8760a91e2f1..5bbd3b31cffb 100644
--- a/arch/sh/boards/mach-rsk/devices-rsk7203.c
+++ b/arch/sh/boards/mach-rsk/devices-rsk7203.c
@@ -12,7 +12,7 @@
#include <linux/smsc911x.h>
#include <linux/input.h>
#include <linux/io.h>
-#include <linux/gpio.h>
+#include <linux/gpio/consumer.h>
#include <linux/gpio/machine.h>
#include <linux/gpio/property.h>
#include <asm/machvec.h>
@@ -165,6 +165,19 @@ static const struct platform_device_info rsk7203_devices[] __initconst = {
},
};
+/* The base of the function GPIOs in the flat enum */
+#define SH7203_FN_BASE GPIO_FN_PINT7_PB
+
+static struct gpiod_hog rsk7203_gpio_hogs[] = {
+ GPIO_HOG("sh7203_pfc-fn", GPIO_FN_TXD0 - SH7203_FN_BASE,
+ "TXD0", GPIO_ACTIVE_HIGH, GPIOD_ASIS),
+ GPIO_HOG("sh7203_pfc-fn", GPIO_FN_RXD0 - SH7203_FN_BASE,
+ "RXD0", GPIO_ACTIVE_HIGH, GPIOD_ASIS),
+ GPIO_HOG("sh7203_pfc-fn", GPIO_FN_IRQ0_PB - SH7203_FN_BASE,
+ "IRQ0_PB", GPIO_ACTIVE_HIGH, GPIOD_ASIS),
+ { }
+};
+
static int __init rsk7203_devices_setup(void)
{
struct platform_device *pd;
@@ -172,12 +185,10 @@ static int __init rsk7203_devices_setup(void)
int i;
/* Select pins for SCIF0 */
- gpio_request(GPIO_FN_TXD0, NULL);
- gpio_request(GPIO_FN_RXD0, NULL);
+ gpiod_add_hogs(rsk7203_gpio_hogs);
/* Setup LAN9118: CS1 in 16-bit Big Endian Mode, IRQ0 at Port B */
__raw_writel(0x36db0400, 0xfffc0008); /* CS1BCR */
- gpio_request(GPIO_FN_IRQ0_PB, NULL);
error = software_node_register_node_group(rsk7203_swnodes);
if (error) {
If there is no replacement maybe we can resurrect this? Or should we
add swnode support for hogs?
Thanks.
--
Dmitry
^ permalink raw reply related
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
From: Nicolin Chen @ 2026-05-20 1:04 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Bjorn Helgaas, will, robin.murphy, bhelgaas, joro, praan,
baolu.lu, kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
In-Reply-To: <20260520000504.GQ3602937@nvidia.com>
On Tue, May 19, 2026 at 09:05:04PM -0300, Jason Gunthorpe wrote:
> On Tue, May 19, 2026 at 06:48:01PM -0500, Bjorn Helgaas wrote:
> > On Tue, May 19, 2026 at 07:23:35PM -0300, Jason Gunthorpe wrote:
> > > On Tue, May 19, 2026 at 02:36:49PM -0500, Bjorn Helgaas wrote:
> > > > One motivation for putting this in the PCI core was to use the quirk
> > > > infrastructure, but this series doesn't use any of that. It doesn't
> > > > declare any fixups, e.g., DECLARE_PCI_FIXUP_FINAL, and it doesn't
> > > > update any state cached by the PCI core.
> > >
> > > It works like the acs quirks that are in the quirks file, which are
> > > also arguably only used by iommu too :)
> >
> > True, although ACS has a lot more PCI-specific grunge in it, including
> > all the "pci=config_acs" and "pci=disable_acs_redir" stuff.
> >
> > > I'm not keen on spreading lists of device ids for PCI quirks to iommu
> > > files, but it would be OK to move pci_ats_always_on() to
> > > iommu_ats_always_on() that calls the PCI quirk function.
> >
> > Yeah, I guess it's fair to collect the device IDs in PCI since this is
> > about characteristics of the device.
> >
> > If we leave stuff in drivers/pci/, I would prefer that part of it be
> > named to be purely informational, i.e., "CXL.cache_enabled" or
> > something similar that would also cover the NVIDIA devices.
>
> Yeah, that's fair, so let's rename it to
>
> pci_translated_required()
>
> ie the device requires translated requests to function. This is what
> CXL.cache implies (IIRC I was told the spec specifically says this)
>
> Requiring translated requests implies you have to enable ATS in the
> system.
Perhaps we could let IOMMU drivers check:
pci_cxl_is_cache_capable() || pci_dev_specific_is_pre_cxl()
directly?
Thanks
Nicolin
^ permalink raw reply
* Re: [PATCH v15 0/9] Add Type-C DP support for RK3399 EVB IND board
From: Chaoyi Chen @ 2026-05-20 1:13 UTC (permalink / raw)
To: Heikki Krogerus
Cc: Chaoyi Chen, Greg Kroah-Hartman, Dmitry Baryshkov, Peter Chen,
Luca Ceresoli, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Vinod Koul, Kishon Vijay Abraham I, Heiko Stuebner, Sandy Huang,
Andy Yan, Yubing Zhang, Frank Wang, Andrzej Hajda, Neil Armstrong,
Robert Foss, Laurent Pinchart, Jonas Karlman, Jernej Skrabec,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Amit Sunil Dhamne, Dragan Simic, Johan Jonker,
Diederik de Haas, Peter Robinson, Hugh Cole-Baker, linux-usb,
devicetree, linux-kernel, linux-phy, linux-arm-kernel,
linux-rockchip, dri-devel
In-Reply-To: <agxo8ic94e81nQRx@kuha>
Hello Heikki,
On 5/19/2026 9:43 PM, Heikki Krogerus wrote:
> Hi,
>
> On Wed, Mar 04, 2026 at 05:41:43PM +0800, Chaoyi Chen wrote:
>> From: Chaoyi Chen <chaoyi.chen@rock-chips.com>
>>
>> This series focuses on adding Type-C DP support for USBDP PHY and DP
>> driver. The USBDP PHY and DP will perceive the changes in cable status
>> based on the USB PD and Type-C state machines provided by TCPM. Before
>> this, the USBDP PHY and DP controller of RK3399 sensed cable state
>> changes through extcon, and devices such as the RK3399 Gru-Chromebook
>> rely on them. This series should not break them.
>
> What's the status with this series?
> Are these intended to go via the DRM tree?
>
> thanks,
>
Thank you very much for your continued attention to this series.
The maintainers seem quite busy... Despite there being no further review
comments, this series has yet to be merged into the DRM tree.
And some of my other patches are in the same situation.
Do you happen to know what the next steps should be? Thank you.
--
Best,
Chaoyi
^ permalink raw reply
* Re: [PATCH v3] EDAC/altera: Guard SDRAM irq2 retrieval for Arria10 only
From: Borislav Petkov @ 2026-05-20 1:20 UTC (permalink / raw)
To: Dinh Nguyen
Cc: muhammad.nazim.amirul.nazle.asmade, tony.luck, linux-edac,
linux-arm-kernel, linux-kernel
In-Reply-To: <c3af86b8-7187-4299-973a-d89ca0d52e9a@kernel.org>
On Fri, May 15, 2026 at 06:41:05AM -0500, Dinh Nguyen wrote:
>
>
> On 5/15/26 00:04, muhammad.nazim.amirul.nazle.asmade@altera.com wrote:
> > From: Nazim Amirul <muhammad.nazim.amirul.nazle.asmade@altera.com>
> >
> > Guard the irq2 retrieval with an of_machine_is_compatible() check so
> > that platform_get_irq(pdev, 1) is only called on Arria10 platforms.
> >
> > Signed-off-by: Nazim Amirul <muhammad.nazim.amirul.nazle.asmade@altera.com>
> > ---
> > v3: Fix commit header formatting to follow EDAC/altera: prefix
> > convention as per maintainer feedback.
> > v2: Move irq2 = platform_get_irq(pdev, 1) inside the existing
> > of_machine_is_compatible("altr,socfpga-arria10") block instead of
> > adding a separate duplicate guard around it.
> > ---
> > drivers/edac/altera_edac.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/edac/altera_edac.c b/drivers/edac/altera_edac.c
> > index 4edd2088c2db..ee6ced033f2c 100644
> > --- a/drivers/edac/altera_edac.c
> > +++ b/drivers/edac/altera_edac.c
> > @@ -347,9 +347,6 @@ static int altr_sdram_probe(struct platform_device *pdev)
> > return irq;
> > }
> > - /* Arria10 has a 2nd IRQ */
> > - irq2 = platform_get_irq(pdev, 1);
> > -
> > layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
> > layers[0].size = 1;
> > layers[0].is_virt_csrow = true;
> > @@ -395,6 +392,9 @@ static int altr_sdram_probe(struct platform_device *pdev)
> > /* Only the Arria10 has separate IRQs */
> > if (of_machine_is_compatible("altr,socfpga-arria10")) {
> > + /* Arria10 has a 2nd IRQ */
> > + irq2 = platform_get_irq(pdev, 1);
> > +
> > /* Arria10 specific initialization */
> > res = a10_init(mc_vbase);
> > if (res < 0)
>
> Acked-by: Dinh Nguyen <dinguyen@kernel.org>
https://sashiko.dev/#/patchset/20260515050444.10380-1-muhammad.nazim.amirul.nazle.asmade%40altera.com
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply
* Re: [PATCH] coresight: fix resource leaks on path build failure
From: Jie Gan @ 2026-05-20 1:55 UTC (permalink / raw)
To: James Clark
Cc: coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose,
Mike Leach, Leo Yan, Alexander Shishkin, Mathieu Poirier,
Tingwei Zhang, Greg Kroah-Hartman
In-Reply-To: <f347bad3-4a11-4f65-b35f-f59c6360ac5b@linaro.org>
On 5/19/2026 9:57 PM, James Clark wrote:
>
>
> On 13/05/2026 2:32 am, Jie Gan wrote:
>> Two related leaks when _coresight_build_path() encounters an error after
>> coresight_grab_device() has already incremented the pm_runtime, module,
>> and device references for a node:
>>
>> 1. In _coresight_build_path(), if kzalloc_obj() for the path node fails
>> after coresight_grab_device() succeeds, coresight_drop_device() was
>> never called, permanently leaking all three references.
>>
>> 2. In coresight_build_path(), on failure the partial path was freed with
>> kfree(path) instead of coresight_release_path(path). kfree() only
>> frees the coresight_path struct itself; it does not iterate path_list
>> to call coresight_drop_device() and kfree() for each coresight_node
>> already added by deeper recursive calls, leaking both the pm_runtime,
>> module, and device references and the node memory for every element
>> on the partial path.
>>
>> Fix both by adding coresight_drop_device() in the OOM unwind of
>> _coresight_build_path(), and replacing kfree(path) with
>> coresight_release_path(path) in coresight_build_path().
>>
>> Fixes: 32b0707a4182 ("coresight: Add try_get_module() in coresight_grab_device()")
>> Fixes: b3e94405941e ("coresight: associating path with session rather than tracer")
>> Signed-off-by: Jie Gan <jie.gan@oss.qualcomm.com>
>> ---
>> drivers/hwtracing/coresight/coresight-core.c | 6 ++++--
>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
>> index 46f247f73cf6..c1354ea8e11d 100644
>> --- a/drivers/hwtracing/coresight/coresight-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>> @@ -825,8 +825,10 @@ static int _coresight_build_path(struct
>> coresight_device *csdev,
>> return ret;
>> node = kzalloc_obj(struct coresight_node);
>> - if (!node)
>> + if (!node) {
>> + coresight_drop_device(csdev);
>> return -ENOMEM;
>> + }
>> node->csdev = csdev;
>> list_add(&node->link, &path->path_list);
>> @@ -851,7 +853,7 @@ struct coresight_path *coresight_build_path(struct
>> coresight_device *source,
>> rc = _coresight_build_path(source, source, sink, path);
>> if (rc) {
>> - kfree(path);
>> + coresight_release_path(path);
>> return ERR_PTR(rc);
>> }
>>
>> ---
>> base-commit: e98d21c170b01ddef366f023bbfcf6b31509fa83
>> change-id: 20260513-fix-memory-leak-issue-034b4a45265e
>>
>> Best regards,
>
> Looks good to me, but sashiko is complaining:
> https://sashiko.dev/#/patchset/20260513-fix-memory-leak-issue-v1-1-49822d7bc7d4%40oss.qualcomm.com
>
> I'm trying to understand why it's saying that, but I think the scenario
> is that if there are multiple correct paths to a sink, when one path
> partially fails and a second path succeeds you could get a path_list
> with some garbage entries in it.
I think coresight_release_path() was added to address exactly this
situation. When a path fails partway through, we need to release all
nodes already added to the path.
>
> That's kind of a different and existing issue to the one you've fixed,
> and assumes that multiple paths to one sink are possible, which I'm not
> sure is supported?
Each path is unique. We only unwind the failing path, to keep the
reference counts balanced.
Thanks,
Jie
>
> It might be as easy as breaking the loop early for any return value
> other than -ENODEV, but I'll leave it to you to decide whether to do
> that here or not.
>
> Reviewed-by: James Clark <james.clark@linaro.org>
>
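A userspace sketch of the unwind rule discussed in this thread. It is not the kernel code: `struct dev`, `struct node`, `struct path`, `grab()`, `drop()`, `path_add()`, `release_path()` and `build_path()` are hypothetical stand-ins for the coresight structures and helpers, and the `simulate_oom` / `fail_at` parameters exist only to force the error path. It illustrates the two points of the patch: undo the grab when the node allocation fails, and walk the list on failure instead of freeing only the container.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a device that carries a reference count. */
struct dev { int refs; };

struct node {
	struct dev *dev;
	struct node *next;
};

struct path { struct node *head; };

static void grab(struct dev *d) { d->refs++; }
static void drop(struct dev *d) { d->refs--; }

/* Add one device to the path; on allocation failure undo the grab. */
static int path_add(struct path *p, struct dev *d, int simulate_oom)
{
	struct node *n;

	grab(d);
	n = simulate_oom ? NULL : malloc(sizeof(*n));
	if (!n) {
		drop(d);	/* without this the reference would leak */
		return -1;
	}
	n->dev = d;
	n->next = p->head;
	p->head = n;
	return 0;
}

/* Drop every reference and free every node, then the path itself. */
static void release_path(struct path *p)
{
	struct node *n = p->head;

	while (n) {
		struct node *next = n->next;

		drop(n->dev);
		free(n);
		n = next;
	}
	free(p);
}

/*
 * Build a path over devs[0..count). Allocation fails at index fail_at
 * (pass -1 for no failure). Returns NULL after a full unwind on error.
 */
static struct path *build_path(struct dev *devs, int count, int fail_at)
{
	struct path *p = calloc(1, sizeof(*p));
	int i;

	if (!p)
		return NULL;
	for (i = 0; i < count; i++) {
		if (path_add(p, &devs[i], i == fail_at)) {
			release_path(p);	/* free(p) alone would leak */
			return NULL;
		}
	}
	return p;
}
```

With this shape, every device's count returns to its starting value after a failed build, which is the invariant the patch restores.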
^ permalink raw reply
* Re: [PATCH v2] usb: gadget: aspeed_udc: avoid past-the-end iterator in dequeue
From: Andrew Jeffery @ 2026-05-20 2:01 UTC (permalink / raw)
To: Maoyi Xie, Neal Liu
Cc: Greg Kroah-Hartman, Benjamin Herrenschmidt, Joel Stanley,
Andrew Jeffery, Alan Stern, linux-aspeed, linux-arm-kernel,
linux-usb, linux-kernel
In-Reply-To: <20260519080213.1932516-1-maoyixie.tju@gmail.com>
On Tue, 2026-05-19 at 16:02 +0800, Maoyi Xie wrote:
> ast_udc_ep_dequeue() declares the loop cursor `req` outside the
> list_for_each_entry(). After the loop it tests `&req->req != _req`
> to decide whether the request was found. If the queue holds no
> match, `req` is past-the-end. It then aliases
> container_of(&ep->queue, struct ast_udc_request, queue) via offset
> cancellation. Whether that synthetic address equals `_req` depends
> on heap layout. The function can return 0 without dequeueing
> anything.
>
> Walk the list with a separate `iter`. Set `req` only when a
> request matches. After the loop, `req` is NULL if nothing
> matched.
>
> Suggested-by: Alan Stern <stern@rowland.harvard.edu>
> Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com>
> ---
> v2: Switch the loop body to Alan Stern's shape: test inside
> the if, assign `req`, break. Same behaviour as v1.
> v1: https://lore.kernel.org/linux-usb/20260518073403.1285339-1-maoyi.xie@ntu.edu.sg/
>
> drivers/usb/gadget/udc/aspeed_udc.c | 20 ++++++++++++--------
> 1 file changed, 12 insertions(+), 8 deletions(-)
>
> --- a/drivers/usb/gadget/udc/aspeed_udc.c 2026-05-19 15:29:28.690931576 +0800
> +++ b/drivers/usb/gadget/udc/aspeed_udc.c 2026-05-19 15:29:59.482953528 +0800
> @@ -692,26 +692,30 @@
> {
> struct ast_udc_ep *ep = to_ast_ep(_ep);
> struct ast_udc_dev *udc = ep->udc;
> - struct ast_udc_request *req;
> + struct ast_udc_request *req = NULL, *iter;
> unsigned long flags;
> int rc = 0;
>
> spin_lock_irqsave(&udc->lock, flags);
>
> /* make sure it's actually queued on this endpoint */
> - list_for_each_entry(req, &ep->queue, queue) {
> - if (&req->req == _req) {
> - list_del_init(&req->queue);
> - ast_udc_done(ep, req, -ESHUTDOWN);
> - _req->status = -ECONNRESET;
> + list_for_each_entry(iter, &ep->queue, queue) {
> + if (&iter->req == _req) {
> + req = iter;
> break;
> }
> }
>
> - /* dequeue request not found */
> - if (&req->req != _req)
> + if (!req) {
> rc = -EINVAL;
> + goto out;
> + }
> +
> + list_del_init(&req->queue);
> + ast_udc_done(ep, req, -ESHUTDOWN);
> + _req->status = -ECONNRESET;
>
> +out:
> spin_unlock_irqrestore(&udc->lock, flags);
>
> return rc;
This is a bit of a bikeshed comment and doesn't solve making the code
similar to other cases, however: Golfing the diff a bit, perhaps we can
start from the assumption that there isn't a match, and require the
search disprove that. Then we don't have to test whether we saw
something after-the-fact, and we avoid the goto as proposed above.
Untested:
diff --git a/drivers/usb/gadget/udc/aspeed_udc.c b/drivers/usb/gadget/udc/aspeed_udc.c
index 7fc6696b7694..75f9c831b21a 100644
--- a/drivers/usb/gadget/udc/aspeed_udc.c
+++ b/drivers/usb/gadget/udc/aspeed_udc.c
@@ -694,7 +694,7 @@ static int ast_udc_ep_dequeue(struct usb_ep *_ep, struct usb_request *_req)
struct ast_udc_dev *udc = ep->udc;
struct ast_udc_request *req;
unsigned long flags;
- int rc = 0;
+ int rc = -EINVAL;
spin_lock_irqsave(&udc->lock, flags);
@@ -704,14 +704,11 @@ static int ast_udc_ep_dequeue(struct usb_ep *_ep, struct usb_request *_req)
list_del_init(&req->queue);
ast_udc_done(ep, req, -ESHUTDOWN);
_req->status = -ECONNRESET;
+ rc = 0;
break;
}
}
- /* dequeue request not found */
- if (&req->req != _req)
- rc = -EINVAL;
-
spin_unlock_irqrestore(&udc->lock, flags);
return rc;
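For readers who want to try the pattern outside the kernel, here is a userspace sketch of the "separate iterator, result pointer stays NULL" search from the v2 patch. The list helpers, `struct request` and `find_request()` are simplified, hypothetical stand-ins; they are not the kernel's list.h or `struct ast_udc_request`.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled loosely on list_head. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct request {
	int id;
	struct list_head queue;
};

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Return the queued request with the given id, or NULL if none matches. */
static struct request *find_request(struct list_head *head, int id)
{
	struct request *found = NULL;
	struct list_head *pos;

	for (pos = head->next; pos != head; pos = pos->next) {
		struct request *iter = container_of(pos, struct request, queue);

		if (iter->id == id) {
			found = iter;	/* only set on a real match */
			break;
		}
	}
	/* The cursor is never used after the loop, so no past-the-end alias. */
	return found;
}
```

Because `found` is assigned only inside the match branch, the caller can test it against NULL without ever forming a pointer derived from the list head.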
^ permalink raw reply
* Re: [PATCH] arm64/mm: Rename ptdesc_t
From: Anshuman Khandual @ 2026-05-20 2:08 UTC (permalink / raw)
To: David Hildenbrand (Arm), Will Deacon, Mike Rapoport
Cc: linux-arm-kernel, Catalin Marinas, linux-efi, linux-kernel
In-Reply-To: <34c31f7a-e3e5-47c1-9e41-e6e9a90e89d3@kernel.org>
On 19/05/26 4:32 PM, David Hildenbrand (Arm) wrote:
> On 5/19/26 12:44, Will Deacon wrote:
>> On Thu, Apr 30, 2026 at 09:51:46AM +0200, Mike Rapoport wrote:
>>> Hi Anshuman,
>>>
>>> On Thu, Apr 30, 2026 at 04:49:33AM +0100, Anshuman Khandual wrote:
>>>> ptdesc_t sounds very similar to the core MM struct ptdesc which is actually
>>>> the memory descriptor for page table allocations. Hence rename this typedef
>>>> element as pxxval_t instead for better clarity and separation.
>>>
>>> Maybe we should keep the "pt" prefix and make it "ptval_t"?
>>
>> Yeah, the 'pxx' prefix really hurts my eyes. Please use something else!
>
> Fine with me as long as we don't call it something with "pte" in it.
>
It seems there are two choices here:
(A) pxdval_t
(B) ptval_t, which Mike suggested earlier
Unless anyone has a preference, I will probably go with pxdval_t.
^ permalink raw reply
* Re: [PATCH v2 2/3] clk: nuvoton: ma35d1: fix PLL_CTL1_FRAC bit field width and fractional calc
From: Joey Lu @ 2026-05-20 2:14 UTC (permalink / raw)
To: Brian Masney
Cc: mturquette, sboyd, ychuang3, schung, yclu4, linux-arm-kernel,
linux-clk, linux-kernel
In-Reply-To: <agx5gTfaByMNjkX4@redhat.com>
On 5/19/2026 10:53 PM, Brian Masney wrote:
> Hi Joey,
>
> On Wed, May 13, 2026 at 01:56:25PM +0800, Joey Lu wrote:
>> PLL_CTL1_FRAC was defined as GENMASK(31, 24), covering only 8 bits.
>> The hardware fractional field occupies bits [31:8] (24 bits), so the
>> mask must be GENMASK(31, 8).
>>
>> The previous fractional-mode calculation used FIELD_MAX(PLL_CTL1_FRAC)
>> as the denominator to obtain 2 decimal places. With the corrected 24-bit
>> mask the old divisor is wrong; replace the arithmetic with a proper
>> 24-bit fixed-point rounding to 3 decimal places:
>>
>> n_frac = n * 1000 + (x * 1000 + 500) >> 24
>>
>> The +500 term provides round-to-nearest before the right shift.
>>
>> Fixes: 691521a367cf ("clk: nuvoton: Add clock driver for ma35d1 clock controller")
>> Signed-off-by: Joey Lu <a0987203069@gmail.com>
>> ---
>> drivers/clk/nuvoton/clk-ma35d1-pll.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/clk/nuvoton/clk-ma35d1-pll.c b/drivers/clk/nuvoton/clk-ma35d1-pll.c
>> index bfedd45bd04b..7e6b30d20c01 100644
>> --- a/drivers/clk/nuvoton/clk-ma35d1-pll.c
>> +++ b/drivers/clk/nuvoton/clk-ma35d1-pll.c
>> @@ -48,7 +48,7 @@
>> #define PLL_CTL1_PD BIT(0)
>> #define PLL_CTL1_BP BIT(1)
>> #define PLL_CTL1_OUTDIV GENMASK(6, 4)
>> -#define PLL_CTL1_FRAC GENMASK(31, 24)
>> +#define PLL_CTL1_FRAC GENMASK(31, 8)
>> #define PLL_CTL2_SLOPE GENMASK(23, 0)
>>
>> #define INDIV_MIN 1
>> @@ -113,9 +113,9 @@ static unsigned long ma35d1_calc_pll_freq(u8 mode, u32 *reg_ctl, unsigned long p
>> pll_freq = div_u64(pll_freq, m * p);
>> } else {
>> x = FIELD_GET(PLL_CTL1_FRAC, reg_ctl[1]);
>> - /* 2 decimal places floating to integer (ex. 1.23 to 123) */
>> - n = n * 100 + ((x * 100) / FIELD_MAX(PLL_CTL1_FRAC));
>> - pll_freq = div_u64(parent_rate * n, 100 * m * p);
>> + /* x is 24-bit fractional part, convert to 3 decimal digits */
>> + n = n * 1000 + (u32)(((u64)x * 1000 + 500) >> 24);
> ^^^^^^^^^^^^^^^^^^^^^
> You should be able to use DIV_ROUND_CLOSEST_ULL() here.
>
> Brian
Thanks for the suggestion! I'll update this to use DIV_ROUND_CLOSEST_ULL
in the next version.
BR, Joey
>
>> + pll_freq = div_u64((u64)parent_rate * n, 1000 * m * p);
>> }
>> return pll_freq;
>> }
>> --
>> 2.43.0
>>
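A userspace sketch of the round-to-nearest conversion under discussion. `frac24_to_milli()` is a hypothetical name, and this is plain C rather than the kernel macro; it relies only on the general rule that rounding to nearest adds half the divisor before dividing, which is also how DIV_ROUND_CLOSEST_ULL() behaves for unsigned operands. For a divisor of 2^24 that half is 2^23, whereas a `+ 500` term would change the result by less than one part in 2^24.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a 24-bit binary fraction (x / 2^24) to thousandths,
 * rounded to the nearest integer.
 */
static uint32_t frac24_to_milli(uint32_t x)
{
	assert(x < (1u << 24));	/* the field is 24 bits wide */

	/* Add half the divisor (2^23) so the shift rounds to nearest. */
	return (uint32_t)(((uint64_t)x * 1000 + (1u << 23)) >> 24);
}
```

Note that the largest field value, 0xFFFFFF, rounds up to 1000, so the result can carry into the integer part once it is added to `n * 1000`.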
^ permalink raw reply
* [PATCH] Bluetooth: btmtk: remove extra copy in cmd array init
From: Jiajia Liu @ 2026-05-20 2:15 UTC (permalink / raw)
To: Marcel Holtmann, Luiz Augusto von Dentz, Matthias Brugger,
AngeloGioacchino Del Regno
Cc: linux-bluetooth, linux-kernel, linux-arm-kernel, linux-mediatek,
Jiajia Liu
In btmtk_setup_firmware_79xx(), the data length indicated by
wmt_params.dlen for the cmd buffer is MTK_SEC_MAP_NEED_SEND_SIZE + 1.
Excluding the first byte, the remaining length is
MTK_SEC_MAP_NEED_SEND_SIZE, but memcpy() copied one byte more than that
to cmd + 1. Align the length passed to memcpy() with the remaining length
to avoid reading past the current section map.
Signed-off-by: Jiajia Liu <liujiajia@kylinos.cn>
---
drivers/bluetooth/btmtk.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/bluetooth/btmtk.c b/drivers/bluetooth/btmtk.c
index ea7a031000cd..53cba71cb07f 100644
--- a/drivers/bluetooth/btmtk.c
+++ b/drivers/bluetooth/btmtk.c
@@ -188,7 +188,7 @@ int btmtk_setup_firmware_79xx(struct hci_dev *hdev, const char *fwname,
MTK_FW_ROM_PATCH_GD_SIZE +
MTK_FW_ROM_PATCH_SEC_MAP_SIZE * i +
MTK_SEC_MAP_COMMON_SIZE,
- MTK_SEC_MAP_NEED_SEND_SIZE + 1);
+ MTK_SEC_MAP_NEED_SEND_SIZE);
wmt_params.op = BTMTK_WMT_PATCH_DWNLD;
wmt_params.status = &status;
--
2.53.0
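A small userspace sketch of the length accounting described in the commit message. `NEED_SEND_SIZE`, its value, and `build_cmd()` are hypothetical and chosen only for illustration; they are not the driver's definitions. The point is that when the buffer holds one leading byte followed by the payload, the copy length is the payload size and only the total length carries the extra 1.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NEED_SEND_SIZE 4	/* hypothetical payload size */

/*
 * Fill cmd with one leading byte followed by exactly NEED_SEND_SIZE
 * payload bytes taken from section. Returns the total command length.
 */
static size_t build_cmd(uint8_t *cmd, uint8_t first, const uint8_t *section)
{
	cmd[0] = first;
	/* Copy only the payload; the "+ 1" belongs to the total length. */
	memcpy(cmd + 1, section, NEED_SEND_SIZE);
	return NEED_SEND_SIZE + 1;
}
```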
^ permalink raw reply related
* [PATCH v3] cpu/hotplug: Fix NULL kobject warning in cpuhp_smt_enable()
From: Jinjie Ruan @ 2026-05-20 2:20 UTC (permalink / raw)
To: catalin.marinas, will, corbet, skhan, punit.agrawal, jic23,
osama.abdelkader, chenl311, fengchengwen, suzuki.poulose, maz,
lpieralisi, timothy.hayes, sascha.bischoff, arnd,
mrigendra.chaubey, pierre.gondois, dietmar.eggemann, yangyicong,
sudeep.holla, linux-arm-kernel, linux-doc, linux-kernel
Cc: ruanjinjie
On arm64, when booting with `maxcpus` greater than the number of present
CPUs (e.g., QEMU -smp cpus=4,maxcpus=8), some CPUs are marked as 'present'
but have not yet been registered via register_cpu(). Consequently,
the per-cpu device objects for these CPUs are not yet initialized.
In cpuhp_smt_enable(), the code iterates over all present CPUs. Calling
_cpu_up() for these unregistered CPUs eventually leads to
sysfs_create_group() being called with a NULL kobject (or a kobject
without a directory), triggering the following warning in
fs/sysfs/group.c:
if (WARN_ON(!kobj || (!update && !kobj->sd)))
return -EINVAL;
When booting with ACPI, arm64 smp_prepare_cpus() currently sets all
enumerated CPUs as "present" regardless of their status in the MADT. This
causes issues with SMT hotplug control. For instance, with QEMU's
"-smp 4,maxcpus=8" configuration, the MADT GICC entries are populated as
follows: the first four CPUs are marked Enabled while the remaining four
are marked Online Capable to support potential hot-plugging.
Fix this by:
1. When booting with ACPI, check the ACPI_MADT_ENABLED flag in the GICC
entry before calling set_cpu_present() during SMP initialization.
2. Properly manage the present mask in acpi_map_cpu() and
acpi_unmap_cpu() to support actual CPU hotplug events. This aligns with
other architectures like x86 and LoongArch.
3. Update the arm64 CPU hotplug documentation to no longer state that all
online-capable vCPUs are marked as present by the kernel at boot time.
This ensures that only physically available or explicitly enabled CPUs
are in the present mask, keeping the SMT control logic consistent with
the actual hardware state.
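The boot-time decision can be modelled in userspace as follows. This is a sketch, not the kernel code: `present_at_boot()` and `count_present()` are hypothetical helpers, and the flag values are local definitions assumed to follow the ACPI 6.5 GICC flags layout (bit 0 Enabled, bit 3 Online Capable).

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the GICC flag bits (assumed per ACPI 6.5). */
#define MADT_ENABLED		(1u << 0)
#define MADT_ONLINE_CAPABLE	(1u << 3)

/* Without ACPI every enumerated CPU is present; with ACPI only Enabled ones. */
static bool present_at_boot(bool acpi_disabled, uint32_t gicc_flags)
{
	return acpi_disabled || (gicc_flags & MADT_ENABLED);
}

/* Count how many of n CPUs would be in the present mask after boot. */
static unsigned int count_present(bool acpi_disabled, const uint32_t *flags,
				  unsigned int n)
{
	unsigned int i, count = 0;

	for (i = 0; i < n; i++)
		if (present_at_boot(acpi_disabled, flags[i]))
			count++;
	return count;
}
```

For the "-smp 4,maxcpus=8" layout described above, four Enabled entries and four Online Capable entries give four present CPUs with ACPI, so an SMT enable pass over the present mask would no longer visit the unregistered ones.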
How to reproduce:
1. echo off > /sys/devices/system/cpu/smt/control
psci: CPU1 killed (polled 0 ms)
psci: CPU3 killed (polled 0 ms)
2. echo 2 > /sys/devices/system/cpu/smt/control
Detected PIPT I-cache on CPU1
GICv3: CPU1: found redistributor 1 region 0:0x00000000080c0000
CPU1: Booted secondary processor 0x0000000001 [0x410fd082]
Detected PIPT I-cache on CPU3
GICv3: CPU3: found redistributor 3 region 0:0x0000000008100000
CPU3: Booted secondary processor 0x0000000003 [0x410fd082]
------------[ cut here ]------------
WARNING: fs/sysfs/group.c:137 at internal_create_group+0x41c/0x4bc, CPU#2: sh/181
Modules linked in:
CPU: 2 UID: 0 PID: 181 Comm: sh Not tainted 7.0.0-rc1-00010-g8d13386c7624 #142 PREEMPT
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : internal_create_group+0x41c/0x4bc
lr : sysfs_create_group+0x18/0x24
sp : ffff80008078ba40
x29: ffff80008078ba40 x28: ffff296c980ad000 x27: ffff00007fb94128
x26: 0000000000000054 x25: ffffd693e845f3f0 x24: 0000000000000001
x23: 0000000000000001 x22: 0000000000000004 x21: 0000000000000000
x20: ffffd693e845fc10 x19: 0000000000000004 x18: 00000000ffffffff
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
x14: 0000000000000358 x13: 0000000000000007 x12: 0000000000000350
x11: 0000000000000008 x10: 0000000000000407 x9 : 0000000000000400
x8 : ffff00007fbf3b60 x7 : 0000000000000000 x6 : ffffd693e845f3f0
x5 : ffff00007fb94128 x4 : 0000000000000000 x3 : ffff000000f4eac0
x2 : ffffd693e7095a08 x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
internal_create_group+0x41c/0x4bc (P)
sysfs_create_group+0x18/0x24
topology_add_dev+0x1c/0x28
cpuhp_invoke_callback+0x104/0x20c
__cpuhp_invoke_callback_range+0x94/0x11c
_cpu_up+0x200/0x37c
cpuhp_smt_enable+0xbc/0x114
control_store+0xe8/0x1d4
dev_attr_store+0x18/0x2c
sysfs_kf_write+0x7c/0x94
kernfs_fop_write_iter+0x128/0x1b8
vfs_write+0x2b0/0x354
ksys_write+0x68/0xfc
__arm64_sys_write+0x1c/0x28
invoke_syscall+0x48/0x10c
el0_svc_common.constprop.0+0x40/0xe8
do_el0_svc+0x20/0x2c
el0_svc+0x34/0x124
el0t_64_sync_handler+0xa0/0xe4
el0t_64_sync+0x198/0x19c
---[ end trace 0000000000000000 ]---
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: stable@vger.kernel.org
Link: https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
Fixes: eed4583bcf9a6 ("arm64: Kconfig: Enable HOTPLUG_SMT")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
v3:
- Update the arm64 cpu-hotplug documentation as Catalin suggested.
- Update the commit message.
v2:
- Change the approach used for the fix.
---
Documentation/arch/arm64/cpu-hotplug.rst | 11 +++++++----
arch/arm64/kernel/acpi.c | 2 ++
arch/arm64/kernel/smp.c | 12 +++++++++++-
3 files changed, 20 insertions(+), 5 deletions(-)
diff --git a/Documentation/arch/arm64/cpu-hotplug.rst b/Documentation/arch/arm64/cpu-hotplug.rst
index 8fb438bf7781..60f7f51d7b96 100644
--- a/Documentation/arch/arm64/cpu-hotplug.rst
+++ b/Documentation/arch/arm64/cpu-hotplug.rst
@@ -47,8 +47,9 @@ ever have can be described at boot. There are no power-domain considerations
as such devices are emulated.
CPU Hotplug on virtual systems is supported. It is distinct from physical
-CPU Hotplug as all resources are described as ``present``, but CPUs may be
-marked as disabled by firmware. Only the CPU's online/offline behaviour is
+CPU Hotplug as all resources are described in the static configuration tables,
+but vCPUs that are not enabled at boot are not marked as ``present`` by the
+kernel until they are hotplugged. Only the CPU's online/offline behaviour is
influenced by firmware. An example is where a virtual machine boots with a
single CPU, and additional CPUs are added once a cloud orchestrator deploys
the workload.
@@ -68,8 +69,10 @@ redistributors.
CPUs described as ``online capable`` but not ``enabled`` can be set to enabled
by the DSDT's Processor object's _STA method. On virtual systems the _STA method
-must always report the CPU as ``present``. Changes to the firmware policy can
-be notified to the OS via device-check or eject-request.
+must report the CPU as ``present`` when it is activated by the firmware.
+The kernel will then set the vCPU as ``present`` dynamically during the hotplug
+configuration process. Changes can be notified to the OS via device-check or
+eject-request.
CPUs described as ``enabled`` in the static table, should not have their _STA
modified dynamically by firmware. Soft-restart features such as kexec will
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index 5891f92c2035..681aa2bbc399 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -448,12 +448,14 @@ int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 apci_id,
return *pcpu;
}
+ set_cpu_present(*pcpu, true);
return 0;
}
EXPORT_SYMBOL(acpi_map_cpu);
int acpi_unmap_cpu(int cpu)
{
+ set_cpu_present(cpu, false);
return 0;
}
EXPORT_SYMBOL(acpi_unmap_cpu);
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 1aa324104afb..5932e5b30b71 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -566,6 +566,11 @@ struct acpi_madt_generic_interrupt *acpi_cpu_get_madt_gicc(int cpu)
}
EXPORT_SYMBOL_GPL(acpi_cpu_get_madt_gicc);
+static bool acpi_cpu_is_present(int cpu)
+{
+ return acpi_cpu_get_madt_gicc(cpu)->flags & ACPI_MADT_ENABLED;
+}
+
/*
* acpi_map_gic_cpu_interface - parse processor MADT entry
*
@@ -670,6 +675,10 @@ static void __init acpi_parse_and_init_cpus(void)
early_map_cpu_to_node(i, acpi_numa_get_nid(i));
}
#else
+static bool acpi_cpu_is_present(int cpu)
+{
+ return false;
+}
#define acpi_parse_and_init_cpus(...) do { } while (0)
#endif
@@ -808,7 +817,8 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
if (err)
continue;
- set_cpu_present(cpu, true);
+ if (acpi_disabled || acpi_cpu_is_present(cpu))
+ set_cpu_present(cpu, true);
numa_store_cpu_info(cpu);
}
}
--
2.34.1
^ permalink raw reply related