* [PATCH v2 17/18] irqchip/gic-v5: Immediately exec priority drop following activate
From: Marc Zyngier @ 2026-05-20 9:19 UTC
To: kvmarm, linux-arm-kernel
Cc: Steffen Eiden, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Sascha Bischoff
In-Reply-To: <20260520091949.542365-1-maz@kernel.org>
From: Sascha Bischoff <sascha.bischoff@arm.com>
With GICv5 an interrupt of equal or lower priority cannot be signalled
until there has been a priority drop. This is done via the GIC CDEOI
system instruction. Once this has been executed, the hardware is able
to signal the next interrupt if there is one.
As all interrupts are programmed to have the same priority, no new
interrupts can be signalled until the priority drop has happened. This
can cause issues when, for example, an interrupt remains active while
a long running process takes place, such as when injecting a physical
interrupt into a guest VM in software.
The GICv5 driver has so far done the priority drop as part of
irq_eoi(), i.e., at the same time as deactivating the interrupt. This
means that any long running process (or VM) could block incoming
interrupts, effectively causing a denial of service for all other
interrupts.
Rather than doing the EOI as part of irq_eoi() (which, going by the name,
would seem a good place for it), move it to happen immediately
after acknowledging an interrupt in the main GICv5 interrupt
handler. The deactivation of interrupts (GIC CDDI) remains implemented
as part of irq_eoi(), which means that the same interrupt cannot be
signalled a second time until deactivated by software.
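To illustrate the split between priority drop and deactivation described
above, here is a minimal userspace sketch. It is a toy model, not the GICv5
programming interface: the toy_* names and the single "running priority
raised" flag are assumptions made for illustration, relying only on the
statement that all interrupts share one priority.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_IRQS 8

/* Hypothetical model state; this is not the real GICv5 register layout. */
struct toy_gic {
	bool pending[TOY_NR_IRQS];
	bool active[TOY_NR_IRQS];
	bool prio_raised;	/* true between acknowledge and priority drop */
};

/* Can interrupt 'irq' be signalled to the CPU right now? */
static bool toy_can_signal(const struct toy_gic *g, unsigned int irq)
{
	/* One shared priority: a raised running priority masks everything. */
	return g->pending[irq] && !g->active[irq] && !g->prio_raised;
}

/* Acknowledge: pending -> active, running priority raised. */
static void toy_ack(struct toy_gic *g, unsigned int irq)
{
	assert(toy_can_signal(g, irq));
	g->pending[irq] = false;
	g->active[irq] = true;
	g->prio_raised = true;
}

/* Stand-in for GIC CDEOI: other interrupts can be signalled again. */
static void toy_priority_drop(struct toy_gic *g)
{
	g->prio_raised = false;
}

/* Stand-in for GIC CDDI: the same interrupt can be signalled again. */
static void toy_deactivate(struct toy_gic *g, unsigned int irq)
{
	g->active[irq] = false;
}
```

In this model, calling toy_priority_drop() right after toy_ack() lets a
second pending interrupt through while the first one is still active, and
only toy_deactivate() re-arms the first one.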
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
drivers/irqchip/irq-gic-v5.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v5.c b/drivers/irqchip/irq-gic-v5.c
index 6b0903be8ebfd..58e457d4c1476 100644
--- a/drivers/irqchip/irq-gic-v5.c
+++ b/drivers/irqchip/irq-gic-v5.c
@@ -218,17 +218,13 @@ static void gicv5_hwirq_eoi(u32 hwirq_id, u8 hwirq_type)
FIELD_PREP(GICV5_GIC_CDDI_TYPE_MASK, hwirq_type);
gic_insn(cddi, CDDI);
-
- gic_insn(0, CDEOI);
}
static void gicv5_ppi_irq_eoi(struct irq_data *d)
{
/* Skip deactivate for forwarded PPI interrupts */
- if (irqd_is_forwarded_to_vcpu(d)) {
- gic_insn(0, CDEOI);
+ if (irqd_is_forwarded_to_vcpu(d))
return;
- }
gicv5_hwirq_eoi(d->hwirq, GICV5_HWIRQ_TYPE_PPI);
}
@@ -963,6 +959,13 @@ static void __exception_irq_entry gicv5_handle_irq(struct pt_regs *regs)
*/
isb();
+ /*
+ * Ensure that we can receive the next interrupts in the event that we
+ * have a long running handler or directly enter a guest by doing the
+ * priority drop immediately.
+ */
+ gic_insn(0, CDEOI);
+
hwirq = FIELD_GET(GICV5_HWIRQ_INTID, ia);
handle_irq_per_domain(hwirq);
--
2.47.3
* [PATCH v2 12/18] KVM: arm64: selftests: Add missing GIC CDEN to no-vgic-v5 selftest
From: Marc Zyngier @ 2026-05-20 9:19 UTC
To: kvmarm, linux-arm-kernel
Cc: Steffen Eiden, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Sascha Bischoff
In-Reply-To: <20260520091949.542365-1-maz@kernel.org>
From: Sascha Bischoff <sascha.bischoff@arm.com>
The selftest mistakenly omitted the GIC CDEN instruction from the
testing. Add it in.
Fixes: ce29261ec648 ("KVM: arm64: selftests: Add no-vgic-v5 selftest")
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
tools/testing/selftests/kvm/arm64/no-vgic.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/kvm/arm64/no-vgic.c b/tools/testing/selftests/kvm/arm64/no-vgic.c
index 25b2e3222f685..ab57902ce4293 100644
--- a/tools/testing/selftests/kvm/arm64/no-vgic.c
+++ b/tools/testing/selftests/kvm/arm64/no-vgic.c
@@ -159,6 +159,7 @@ static void guest_code_gicv5(void)
check_gicv5_gic_op(CDAFF);
check_gicv5_gic_op(CDDI);
check_gicv5_gic_op(CDDIS);
+ check_gicv5_gic_op(CDEN);
check_gicv5_gic_op(CDEOI);
check_gicv5_gic_op(CDHM);
check_gicv5_gic_op(CDPEND);
--
2.47.3
* [PATCH v2 07/18] KVM: arm64: vgic-v5: Drop defensive checks from vgic_v5_ppi_queue_irq_unlock()
From: Marc Zyngier @ 2026-05-20 9:19 UTC
To: kvmarm, linux-arm-kernel
Cc: Steffen Eiden, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Sascha Bischoff
In-Reply-To: <20260520091949.542365-1-maz@kernel.org>
vgic_v5_ppi_queue_irq_unlock() performs a bunch of sanity checks
that are pretty pointless as there is no code path that can
result in these invariants being violated. And if they are, a nice
crash is just as instructive as a warning.
Drop what is evidently debug code and simplify the whole thing.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/vgic/vgic-v5.c | 16 +++-------------
1 file changed, 3 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/kvm/vgic/vgic-v5.c b/arch/arm64/kvm/vgic/vgic-v5.c
index 757484d2493b2..7916bd8d564ef 100644
--- a/arch/arm64/kvm/vgic/vgic-v5.c
+++ b/arch/arm64/kvm/vgic/vgic-v5.c
@@ -238,9 +238,9 @@ static u32 vgic_v5_get_effective_priority_mask(struct kvm_vcpu *vcpu)
/*
* For GICv5, the PPIs are mostly directly managed by the hardware. We (the
- * hypervisor) handle the pending, active, enable state save/restore, but don't
- * need the PPIs to be queued on a per-VCPU AP list. Therefore, sanity check the
- * state, unlock, and return.
+ * hypervisor) handle the pending, active, enable state save/restore, but
+ * don't need the PPIs to be queued on a per-VCPU AP list. Therefore,
+ * unlock, kick the vcpu and return.
*/
bool vgic_v5_ppi_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq,
unsigned long flags)
@@ -250,12 +250,7 @@ bool vgic_v5_ppi_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq,
lockdep_assert_held(&irq->irq_lock);
- if (WARN_ON_ONCE(!__irq_is_ppi(KVM_DEV_TYPE_ARM_VGIC_V5, irq->intid)))
- goto out_unlock_fail;
-
vcpu = irq->target_vcpu;
- if (WARN_ON_ONCE(!vcpu))
- goto out_unlock_fail;
raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
@@ -264,11 +259,6 @@ bool vgic_v5_ppi_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq,
kvm_vcpu_kick(vcpu);
return true;
-
-out_unlock_fail:
- raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
-
- return false;
}
/*
--
2.47.3
* [PATCH v2 14/18] KVM: arm64: selftests: Improve error handling for GICv5 PPI selftest
From: Marc Zyngier @ 2026-05-20 9:19 UTC
To: kvmarm, linux-arm-kernel
Cc: Steffen Eiden, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Sascha Bischoff
In-Reply-To: <20260520091949.542365-1-maz@kernel.org>
From: Sascha Bischoff <sascha.bischoff@arm.com>
Cases where the KVM_RUN ioctl returned an error were wrongly reported
as incorrect ucalls. Furthermore, potential failures when calling
KVM_IRQ_LINE were being hidden.
Improve the error handling to correctly propagate the error in both
cases.
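The diff below swaps an unchecked helper for a checked one. A minimal
sketch of that convention follows, under the assumption that the checked
variant simply asserts on the result of the unchecked one. All toy_* names
are hypothetical stand-ins, not the selftest library's functions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for an ioctl that rejects a negative line number. */
static int toy_irq_line_ioctl(int irq, int level)
{
	(void)level;
	return irq < 0 ? -EINVAL : 0;
}

/* Unchecked variant: returns the raw result, the caller must look at it. */
static int toy_irq_line_unchecked(int irq, int level)
{
	return toy_irq_line_ioctl(irq, level);
}

/* Checked variant: any failure stops the test instead of being hidden. */
static void toy_irq_line(int irq, int level)
{
	int ret = toy_irq_line_unchecked(irq, level);

	assert(ret == 0);
}
```

Calling the unchecked variant and discarding its return value is what
hides a failure; the checked variant makes the failure visible at the call
site.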
Fixes: 0a9f38bf612b ("KVM: arm64: selftests: Introduce a minimal GICv5 PPI selftest")
Link: https://sashiko.dev/#/patchset/20260319154937.3619520-1-sascha.bischoff%40arm.com
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
tools/testing/selftests/kvm/arm64/vgic_v5.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/arm64/vgic_v5.c b/tools/testing/selftests/kvm/arm64/vgic_v5.c
index a8707120de0d8..96cfd6bb32f6f 100644
--- a/tools/testing/selftests/kvm/arm64/vgic_v5.c
+++ b/tools/testing/selftests/kvm/arm64/vgic_v5.c
@@ -129,6 +129,8 @@ static void test_vgic_v5_ppis(u32 gic_dev_type)
while (1) {
ret = run_vcpu(vcpus[0]);
+ if (ret)
+ break;
switch (get_ucall(vcpus[0], &uc)) {
case UCALL_SYNC:
@@ -144,7 +146,7 @@ static void test_vgic_v5_ppis(u32 gic_dev_type)
irq = FIELD_PREP(KVM_ARM_IRQ_NUM_MASK, 3);
irq |= KVM_ARM_IRQ_TYPE_PPI << KVM_ARM_IRQ_TYPE_SHIFT;
- _kvm_irq_line(v.vm, irq, level);
+ kvm_irq_line(v.vm, irq, level);
} else if (uc.args[1] == GUEST_CMD_IS_AWAKE) {
pr_info("Guest skipping WFI due to pending IRQ\n");
} else if (uc.args[1] == GUEST_CMD_IRQ_CDIA) {
--
2.47.3
* [PATCH v2 16/18] Documentation: KVM: Clarify PMU_V3_IRQ IntID requirements for GICv5
From: Marc Zyngier @ 2026-05-20 9:19 UTC
To: kvmarm, linux-arm-kernel
Cc: Steffen Eiden, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Sascha Bischoff
In-Reply-To: <20260520091949.542365-1-maz@kernel.org>
From: Sascha Bischoff <sascha.bischoff@arm.com>
When running a GICv5-based guest, the PMU must use PPI 23. This,
however, must be communicated via the
KVM_ARM_VCPU_PMU_V3_CTRL->KVM_ARM_VCPU_PMU_V3_IRQ ioctl as a full
GICv5-style Interrupt ID. That is, 0x20000017. Optionally, the whole
ioctl can be skipped for GICv5.
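As a worked example of where 0x20000017 comes from, the sketch below
composes the value from a type field and an ID field. The field layout
(type in bits [31:29], PPI encoded as 1, ID in bits [23:0]) is an
assumption that happens to reproduce the value quoted above; the TOY_*
names are hypothetical and not taken from any header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: type in bits [31:29], ID in bits [23:0]. */
#define TOY_INTID_TYPE_SHIFT	29
#define TOY_INTID_TYPE_PPI	UINT32_C(1)
#define TOY_INTID_ID_MASK	UINT32_C(0x00ffffff)

/* Build a full GICv5-style interrupt ID for a PPI number. */
static uint32_t toy_gicv5_ppi_intid(uint32_t ppi)
{
	return (TOY_INTID_TYPE_PPI << TOY_INTID_TYPE_SHIFT) |
	       (ppi & TOY_INTID_ID_MASK);
}
```

With these assumptions, PPI 23 (0x17) combined with the PPI type bit
(0x20000000) gives 0x20000017.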
This was previously not clearly documented, so bump the documentation
accordingly.
Fixes: 7c31c06e2d2d ("KVM: arm64: gic-v5: Mandate architected PPI for PMU emulation on GICv5")
Link: https://sashiko.dev/#/patchset/20260319154937.3619520-1-sascha.bischoff%40arm.com
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
Documentation/virt/kvm/devices/vcpu.rst | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index 5e38058200105..66e714f2fcfa7 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -37,8 +37,11 @@ Returns:
A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt
type must be same for each vcpu. As a PPI, the interrupt number is the same for
-all vcpus, while as an SPI it must be a separate number per vcpu. For
-GICv5-based guests, the architected PPI (23) must be used.
+all vcpus, while as an SPI it must be a separate number per vcpu.
+
+For GICv5-based guests, the architected PPI (23) must be used, and must be
+communicated as the full GICv5-style Interrupt ID, i.e., 0x20000017. This ioctl
+can be omitted altogether for a GICv5-based guest.
1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
---------------------------------------
--
2.47.3
* RE: [PATCH] coresight: fix resource leaks on path build failure
From: Mike Leach @ 2026-05-20 9:27 UTC
To: James Clark, Jie Gan
Cc: coresight@lists.linaro.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, Suzuki Poulose, Leo Yan,
Alexander Shishkin, Mathieu Poirier, Tingwei Zhang,
Greg Kroah-Hartman, nd
In-Reply-To: <dea28a56-a5d8-46ef-a782-a95073714377@linaro.org>
> -----Original Message-----
> From: James Clark <james.clark@linaro.org>
> Sent: Wednesday, May 20, 2026 9:38 AM
> To: Jie Gan <jie.gan@oss.qualcomm.com>
> Cc: coresight@lists.linaro.org; linux-arm-kernel@lists.infradead.org; linux-
> kernel@vger.kernel.org; Suzuki Poulose <Suzuki.Poulose@arm.com>; Mike
> Leach <Mike.Leach@arm.com>; Leo Yan <Leo.Yan@arm.com>; Alexander
> Shishkin <alexander.shishkin@linux.intel.com>; Mathieu Poirier
> <mathieu.poirier@linaro.org>; Tingwei Zhang
> <tingwei.zhang@oss.qualcomm.com>; Greg Kroah-Hartman
> <gregkh@linuxfoundation.org>
> Subject: Re: [PATCH] coresight: fix resource leaks on path build failure
>
>
>
> On 20/05/2026 2:55 am, Jie Gan wrote:
> >
> >
> > On 5/19/2026 9:57 PM, James Clark wrote:
> >>
> >>
> >> On 13/05/2026 2:32 am, Jie Gan wrote:
> >>> Two related leaks when _coresight_build_path() encounters an error after
> >>> coresight_grab_device() has already incremented the pm_runtime,
> module,
> >>> and device references for a node:
> >>>
> >>> 1. In _coresight_build_path(), if kzalloc_obj() for the path node fails
> >>> after coresight_grab_device() succeeds, coresight_drop_device() was
> >>> never called, permanently leaking all three references.
> >>>
> >>> 2. In coresight_build_path(), on failure the partial path was freed with
> >>> kfree(path) instead of coresight_release_path(path). kfree() only
> >>> frees the coresight_path struct itself; it does not iterate
> >>> path_list
> >>> to call coresight_drop_device() and kfree() for each coresight_node
> >>> already added by deeper recursive calls, leaking both the
> >>> pm_runtime,
> >>> module, and device references and the node memory for every element
> >>> on the partial path.
> >>>
> >>> Fix both by adding coresight_drop_device() in the OOM unwind of
> >>> _coresight_build_path(), and replacing kfree(path) with
> >>> coresight_release_path(path) in coresight_build_path().
> >>>
> >>> Fixes: 32b0707a4182 ("coresight: Add try_get_module() in
> >>> coresight_grab_device()")
> >>> Fixes: b3e94405941e ("coresight: associating path with session rather
> >>> than tracer")
> >>> Signed-off-by: Jie Gan <jie.gan@oss.qualcomm.com>
> >>> ---
> >>> drivers/hwtracing/coresight/coresight-core.c | 6 ++++--
> >>> 1 file changed, 4 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/
> >>> hwtracing/coresight/coresight-core.c
> >>> index 46f247f73cf6..c1354ea8e11d 100644
> >>> --- a/drivers/hwtracing/coresight/coresight-core.c
> >>> +++ b/drivers/hwtracing/coresight/coresight-core.c
> >>> @@ -825,8 +825,10 @@ static int _coresight_build_path(struct
> >>> coresight_device *csdev,
> >>> return ret;
> >>> node = kzalloc_obj(struct coresight_node);
> >>> - if (!node)
> >>> + if (!node) {
> >>> + coresight_drop_device(csdev);
> >>> return -ENOMEM;
> >>> + }
> >>> node->csdev = csdev;
> >>> list_add(&node->link, &path->path_list);
> >>> @@ -851,7 +853,7 @@ struct coresight_path
> >>> *coresight_build_path(struct coresight_device *source,
> >>> rc = _coresight_build_path(source, source, sink, path);
> >>> if (rc) {
> >>> - kfree(path);
> >>> + coresight_release_path(path);
> >>> return ERR_PTR(rc);
> >>> }
> >>>
> >>> ---
> >>> base-commit: e98d21c170b01ddef366f023bbfcf6b31509fa83
> >>> change-id: 20260513-fix-memory-leak-issue-034b4a45265e
> >>>
> >>> Best regards,
> >>
> >> Looks good to me, but sashiko is complaining: https://sashiko.dev/#/
> >> patchset/20260513-fix-memory-leak-issue-
> >> v1-1-49822d7bc7d4%40oss.qualcomm.com
> >>
> >> I'm trying to understand why it's saying that, but I think the
> >> scenario is that if there are multiple correct paths to a sink, when
> >> one path partially fails and a second path succeeds you could get a
> >> path_list with some garbage entries in it.
> >
> > I think the coresight_release_path is added to address this situation.
> > We suffered the path partially failure, and we need release all nodes
> > already added to the path.
> >
>
> It wouldn't call coresight_release_path() in this case though. If one
> path ends up building to success but a parallel path partially failed
> then _coresight_build_path() still returns success. During the search it
> would have still added the nodes from the partially failed path to the
> path_list. This is only an issue if there are multiple correct paths.
>
> >>
> >> That's kind of a different and existing issue to the one you've fixed,
> >> and assumes that multiple paths to one sink are possible, which I'm
> >> not sure is supported?
> >
> > Each path is unique. We only deal with the issue path for balancing the
> > reference count.
> >
> > Thanks,
> > Jie
> >
>
> I'm not exactly sure what you mean by unique, but the same source and
> sink could potentially be connected through two different sets of links.
>
Multiple paths between a source and sink are not permitted under the CoreSight spec.
If such a system were to be built, then a fix would need to be in the declaration of connections, e.g. by leaving one path out of the device tree. It is not up to the CoreSight drivers to handle out-of-specification hardware.
Mike
> >>
> >> It might be as easy as breaking the loop early for any return value
> >> other than -ENODEV, but I'll leave it to you to decide whether to do
> >> that here or not.
> >>
> >> Reviewed-by: James Clark <james.clark@linaro.org>
> >>
> >
* [PATCH v2] i2c: imx-lpi2c: fix resource leaks by switching to devm_dma_request_chan()
From: Carlos Song (OSS) @ 2026-05-20 9:33 UTC
To: aisheng.dong, andi.shyti, Frank.Li, s.hauer, kernel, festevam,
carlos.song
Cc: linux-i2c, imx, linux-arm-kernel, linux-kernel, stable
From: Carlos Song <carlos.song@nxp.com>
The LPI2C driver requests DMA channels using dma_request_chan(), but
never releases them in lpi2c_imx_remove(), resulting in DMA channel
leaks every time the driver is unloaded.
Additionally, when lpi2c_dma_init() successfully requests the TX DMA
channel but fails to request the RX DMA channel, the probe falls back
to PIO mode and completes successfully. Since probe succeeds, the devres
framework will not trigger any cleanup, leaving the TX DMA channel and
the memory allocated for the dma structure held for the lifetime of the
device even though DMA is never used.
Switch to devm_dma_request_chan() to let the device core manage DMA
channel lifetime automatically. Wrap all allocations within a devres
group so that devres_release_group() can release all partially acquired
resources when DMA init fails and probe continues in PIO mode.
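The group mechanism described above can be pictured with a small
userspace model. This is only a sketch of the idea (a marker on a stack of
managed resources), not the devres implementation; the toy_* names and the
fixed-size array are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_RES		16
#define TOY_GROUP_MARKER	(-1)

/* Hypothetical per-device stack of managed resource ids. */
struct toy_dev {
	int res[TOY_MAX_RES];
	size_t nr;
	int released;	/* number of resources released so far */
};

static void toy_push(struct toy_dev *d, int id)
{
	assert(d->nr < TOY_MAX_RES);
	d->res[d->nr++] = id;
}

/* Open a group: push a marker and return its index. */
static size_t toy_open_group(struct toy_dev *d)
{
	toy_push(d, TOY_GROUP_MARKER);
	return d->nr - 1;
}

/* Failure path: release everything acquired since the marker, then the marker. */
static void toy_release_group(struct toy_dev *d, size_t group)
{
	while (d->nr > group + 1) {
		d->nr--;
		d->released++;
	}
	d->nr = group;
}

/* Success path: drop only the marker, resources stay bound to the device. */
static void toy_remove_group(struct toy_dev *d, size_t group)
{
	for (size_t i = group; i + 1 < d->nr; i++)
		d->res[i] = d->res[i + 1];
	d->nr--;
}
```

In the model, a failed RX request after a successful TX request maps to
toy_release_group(), which gives back both the dma structure and the TX
channel while leaving earlier resources untouched.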
Fixes: a09c8b3f9047 ("i2c: imx-lpi2c: add eDMA mode support for LPI2C")
Cc: stable@vger.kernel.org
Signed-off-by: Carlos Song <carlos.song@nxp.com>
---
Changes for v2:
- Wrap all allocations in lpi2c_dma_init() within a devres group so
that devres_release_group() releases all partially acquired resources
(dma structure memory, TX DMA channel) when DMA init fails and probe
continues in PIO mode. Without this, a successful TX channel request
followed by a failed RX channel request would leave the TX channel
and dma structure held for the lifetime of the device.
---
drivers/i2c/busses/i2c-imx-lpi2c.c | 53 ++++++++++++++++++------------
1 file changed, 32 insertions(+), 21 deletions(-)
diff --git a/drivers/i2c/busses/i2c-imx-lpi2c.c b/drivers/i2c/busses/i2c-imx-lpi2c.c
index 6e298424de5e..dedcc24e63ec 100644
--- a/drivers/i2c/busses/i2c-imx-lpi2c.c
+++ b/drivers/i2c/busses/i2c-imx-lpi2c.c
@@ -1383,55 +1383,66 @@ static int lpi2c_imx_init_recovery_info(struct lpi2c_imx_struct *lpi2c_imx,
return 0;
}
-static void dma_exit(struct device *dev, struct lpi2c_imx_dma *dma)
-{
- if (dma->chan_rx)
- dma_release_channel(dma->chan_rx);
-
- if (dma->chan_tx)
- dma_release_channel(dma->chan_tx);
-
- devm_kfree(dev, dma);
-}
-
static int lpi2c_dma_init(struct device *dev, dma_addr_t phy_addr)
{
struct lpi2c_imx_struct *lpi2c_imx = dev_get_drvdata(dev);
struct lpi2c_imx_dma *dma;
+ void *group;
int ret;
- dma = devm_kzalloc(dev, sizeof(*dma), GFP_KERNEL);
- if (!dma)
+ /*
+ * Open a devres group so that all resources allocated within
+ * this function can be released together if DMA init fails but
+ * probe continues in PIO mode.
+ */
+ group = devres_open_group(dev, NULL, GFP_KERNEL);
+ if (!group)
return -ENOMEM;
+ dma = devm_kzalloc(dev, sizeof(*dma), GFP_KERNEL);
+ if (!dma) {
+ ret = -ENOMEM;
+ goto release_group;
+ }
+
dma->phy_addr = phy_addr;
/* Prepare for TX DMA: */
- dma->chan_tx = dma_request_chan(dev, "tx");
+ dma->chan_tx = devm_dma_request_chan(dev, "tx");
if (IS_ERR(dma->chan_tx)) {
ret = PTR_ERR(dma->chan_tx);
if (ret != -ENODEV && ret != -EPROBE_DEFER)
dev_err(dev, "can't request DMA tx channel (%d)\n", ret);
- dma->chan_tx = NULL;
- goto dma_exit;
+ goto release_group;
}
/* Prepare for RX DMA: */
- dma->chan_rx = dma_request_chan(dev, "rx");
+ dma->chan_rx = devm_dma_request_chan(dev, "rx");
if (IS_ERR(dma->chan_rx)) {
ret = PTR_ERR(dma->chan_rx);
if (ret != -ENODEV && ret != -EPROBE_DEFER)
dev_err(dev, "can't request DMA rx channel (%d)\n", ret);
- dma->chan_rx = NULL;
- goto dma_exit;
+ goto release_group;
}
+ /*
+ * DMA init succeeded. Remove the group marker but keep all resources
+ * bound to the device, they will be freed at device removal.
+ */
+ devres_remove_group(dev, group);
+
lpi2c_imx->can_use_dma = true;
lpi2c_imx->dma = dma;
return 0;
-dma_exit:
- dma_exit(dev, dma);
+release_group:
+ /*
+ * DMA init failed. Release ALL resources allocated inside this
+ * group (dma memory, TX channel if already acquired, etc.) so
+ * that a successful PIO-mode probe does not hold unused resources
+ * for the entire device lifetime.
+ */
+ devres_release_group(dev, group);
return ret;
}
--
2.43.0
* Re: [PATCH] drm/bridge: dw-hdmi-qp: compute audio CTS from N when not in TMDS table
From: Simon Wright @ 2026-05-20 9:33 UTC
To: Luca Ceresoli, Cristian Ciocaltea
Cc: Andrzej Hajda, Neil Armstrong, Robert Foss, Laurent Pinchart,
Jonas Karlman, Jernej Skrabec, David Airlie, Simona Vetter,
Heiko Stuebner, Andy Yan, Sebastian Reichel, Dmitry Baryshkov,
Algea Cao, dri-devel, linux-rockchip, linux-arm-kernel,
linux-kernel
In-Reply-To: <DILQTGH47XGL.1K6I6UI6IXP92@bootlin.com>
On 18/05/2026 10:50 pm, Luca Ceresoli wrote:
> Your patch got mangled by your mailer and cannot be applied.
> [...]
> Also your commit message is very detailed, and think it would be more
> readable if split in paragraphs.
Thanks Luca. v2 with paragraph breaks:
https://lore.kernel.org/linux-rockchip/00a34a82-213b-4b03-801c-3c10b163d643@symple.nz/
Simon
* [PATCH v2 1/2] Documentation: ABI: add sysfs interface for ZynqMP CSU registers
From: Ronak Jain @ 2026-05-20 9:36 UTC
To: michal.simek, senthilnathan.thangaraj
Cc: linux-kernel, linux-arm-kernel, ronak.jain
In-Reply-To: <20260520093654.3303917-1-ronak.jain@amd.com>
Document the new sysfs interface that exposes Configuration Security
Unit (CSU) registers through the zynqmp-firmware driver.
The interface is available under:
/sys/devices/platform/firmware:zynqmp-firmware/csu_registers/
The CSU registers are discovered at boot time using the PM_QUERY_DATA
firmware API. The following registers are currently supported:
- multiboot (CSU_MULTI_BOOT)
- idcode (CSU_IDCODE, read-only)
- pcap-status (CSU_PCAP_STATUS, read-only)
Read operations use the existing IOCTL_READ_REG firmware interface,
while write operations use IOCTL_MASK_WRITE_REG.
Access control is enforced by the firmware. Write attempts to
read-only registers are rejected by firmware even though the sysfs file
permissions allow writes.
Document the ABI entry accordingly.
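For readers unfamiliar with mask writes, the sketch below shows how a
"mask value" pair written to one of these files could be parsed and what a
masked update does to a register. The update rule (only bits set in the
mask are replaced) is an assumption about the firmware's behaviour, and
the toy_* helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Parse "mask value" (both hexadecimal); 0 on success, -1 if malformed. */
static int toy_parse_mask_value(const char *buf, uint32_t *mask, uint32_t *value)
{
	unsigned int m, v;

	if (sscanf(buf, "%x %x", &m, &v) != 2)
		return -1;

	*mask = m;
	*value = v;
	return 0;
}

/* Assumed mask-write semantics: only bits set in 'mask' are replaced. */
static uint32_t toy_mask_write(uint32_t old, uint32_t mask, uint32_t value)
{
	return (old & ~mask) | (value & mask);
}
```

Under that assumption, a mask of 0xFFFFFFFF replaces the whole register,
while a narrower mask leaves the unselected bits as they were.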
Signed-off-by: Ronak Jain <ronak.jain@amd.com>
---
.../ABI/stable/sysfs-driver-firmware-zynqmp | 33 +++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/Documentation/ABI/stable/sysfs-driver-firmware-zynqmp b/Documentation/ABI/stable/sysfs-driver-firmware-zynqmp
index c3fec3c835af..f9b004e33696 100644
--- a/Documentation/ABI/stable/sysfs-driver-firmware-zynqmp
+++ b/Documentation/ABI/stable/sysfs-driver-firmware-zynqmp
@@ -254,3 +254,36 @@ Description:
The expected result is 500.
Users: Xilinx
+
+What: /sys/devices/platform/firmware\:zynqmp-firmware/csu_registers/*
+Date: May 2026
+KernelVersion: 7.1
+Contact: "Ronak Jain" <ronak.jain@amd.com>
+Description:
+ Read/Write CSU (Configuration Security Unit) registers.
+
+ This interface provides dynamic access to CSU registers that are
+ discovered from the firmware at boot time using PM_QUERY_DATA API.
+
+ The supported registers are:
+
+ - multiboot: CSU_MULTI_BOOT register
+ - idcode: CSU_IDCODE register (read-only)
+ - pcap-status: CSU_PCAP_STATUS register (read-only)
+
+ Read operations use the existing IOCTL_READ_REG API.
+ Write operations use the existing IOCTL_MASK_WRITE_REG API.
+
+ The firmware enforces access control - read-only registers will reject
+ write attempts even though the sysfs permissions show write access.
+
+ Usage for reading::
+
+ # cat /sys/devices/platform/firmware\:zynqmp-firmware/csu_registers/multiboot
+ # cat /sys/devices/platform/firmware\:zynqmp-firmware/csu_registers/idcode
+
+ Usage for writing (mask and value are in hexadecimal)::
+
+ # echo 0xFFFFFFFF 0x0 > /sys/devices/platform/firmware\:zynqmp-firmware/csu_registers/multiboot
+
+Users: Xilinx/AMD
--
2.34.1
* [PATCH v2 2/2] firmware: zynqmp: Add dynamic CSU register discovery and sysfs interface
From: Ronak Jain @ 2026-05-20 9:36 UTC
To: michal.simek, senthilnathan.thangaraj
Cc: linux-kernel, linux-arm-kernel, ronak.jain
In-Reply-To: <20260520093654.3303917-1-ronak.jain@amd.com>
Add support for dynamically discovering and exposing Configuration
Security Unit (CSU) registers through sysfs. Leverage the existing
PM_QUERY_DATA API to discover available registers at runtime, making
the interface flexible and maintainable.
Key features:
- Dynamic register discovery using PM_QUERY_DATA API
* PM_QID_GET_NODE_COUNT: Query number of available registers
* PM_QID_GET_NODE_NAME: Query register names by index
- Automatic sysfs attribute creation under csu_registers/ group
- Read operations via existing IOCTL_READ_REG API
- Write operations via existing IOCTL_MASK_WRITE_REG API
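As an illustration of the name query, the sketch below unpacks a register
name that arrives packed into three 32-bit payload words after a status
word, and forces NUL termination. It mirrors the 12-byte name length used
in the patch; the function name and the four-word payload array are
assumptions for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_NAME_LEN 12	/* three 32-bit words, as in the patch */

/* Copy a name packed into payload words 1..3 and force NUL termination. */
static void toy_extract_name(const uint32_t *payload, char name[TOY_NAME_LEN])
{
	memcpy(name, &payload[1], TOY_NAME_LEN);
	name[TOY_NAME_LEN - 1] = '\0';
}
```

A consequence of forcing the terminator is that a name can carry at most
11 characters; a 12-character name without a terminator would lose its
last character.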
The sysfs interface is created at:
/sys/devices/platform/firmware:zynqmp-firmware/csu_registers/
Currently supported registers include:
- multiboot (CSU_MULTI_BOOT)
- idcode (CSU_IDCODE, read-only)
- pcap-status (CSU_PCAP_STATUS, read-only)
The dynamic discovery approach allows firmware to control which
registers are exposed without requiring kernel changes, improving
maintainability and security.
The firmware does not currently expose per-register access mode
information, so the kernel cannot distinguish read-only registers
from read-write ones at discovery time. All discovered registers are
therefore created with sysfs mode 0644, and the firmware is
responsible for rejecting writes to registers it treats as read-only
(for example idcode and pcap-status); that error is propagated back
to userspace from the store callback. If a per-register access-mode
query is added to the firmware in the future, sysfs permissions can
be tightened to match.
CSU register discovery is an optional feature: on firmware that lacks
support for PM_QID_GET_NODE_COUNT or PM_QID_GET_NODE_NAME, the probe
returns gracefully without exposing any sysfs entries. To keep the
memory footprint minimal on that path, partial devm allocations made
during discovery are explicitly released on failure so that no memory
lingers until device unbind when the feature is unavailable.
Signed-off-by: Ronak Jain <ronak.jain@amd.com>
---
MAINTAINERS | 10 +
drivers/firmware/xilinx/Makefile | 2 +-
drivers/firmware/xilinx/zynqmp-csu-reg.c | 258 +++++++++++++++++++++++
drivers/firmware/xilinx/zynqmp-csu-reg.h | 18 ++
drivers/firmware/xilinx/zynqmp.c | 6 +
include/linux/firmware/xlnx-zynqmp.h | 4 +-
6 files changed, 296 insertions(+), 2 deletions(-)
create mode 100644 drivers/firmware/xilinx/zynqmp-csu-reg.c
create mode 100644 drivers/firmware/xilinx/zynqmp-csu-reg.h
diff --git a/MAINTAINERS b/MAINTAINERS
index b3e05a3186aa..f1b42935b40d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -29490,6 +29490,16 @@ F: drivers/dma/xilinx/xdma.c
F: include/linux/dma/amd_xdma.h
F: include/linux/platform_data/amd_xdma.h
+XILINX ZYNQMP CSU REGISTER DRIVER
+M: Senthil Nathan Thangaraj <senthilnathan.thangaraj@amd.com>
+R: Michal Simek <michal.simek@amd.com>
+R: Ronak Jain <ronak.jain@amd.com>
+L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
+S: Maintained
+F: Documentation/ABI/stable/sysfs-driver-firmware-zynqmp
+F: drivers/firmware/xilinx/zynqmp-csu-reg.c
+F: drivers/firmware/xilinx/zynqmp-csu-reg.h
+
XILINX ZYNQMP DPDMA DRIVER
M: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
L: dmaengine@vger.kernel.org
diff --git a/drivers/firmware/xilinx/Makefile b/drivers/firmware/xilinx/Makefile
index 8db0e66b6b7e..6203f41daaa6 100644
--- a/drivers/firmware/xilinx/Makefile
+++ b/drivers/firmware/xilinx/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
# Makefile for Xilinx firmwares
-obj-$(CONFIG_ZYNQMP_FIRMWARE) += zynqmp.o zynqmp-ufs.o zynqmp-crypto.o
+obj-$(CONFIG_ZYNQMP_FIRMWARE) += zynqmp.o zynqmp-ufs.o zynqmp-crypto.o zynqmp-csu-reg.o
obj-$(CONFIG_ZYNQMP_FIRMWARE_DEBUG) += zynqmp-debug.o
diff --git a/drivers/firmware/xilinx/zynqmp-csu-reg.c b/drivers/firmware/xilinx/zynqmp-csu-reg.c
new file mode 100644
index 000000000000..6e11a9b019f7
--- /dev/null
+++ b/drivers/firmware/xilinx/zynqmp-csu-reg.c
@@ -0,0 +1,258 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Xilinx Zynq MPSoC CSU Register Access
+ *
+ * Copyright (C) 2026 Advanced Micro Devices, Inc.
+ *
+ * Michal Simek <michal.simek@amd.com>
+ * Ronak Jain <ronak.jain@amd.com>
+ */
+
+#include <linux/firmware/xlnx-zynqmp.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+
+#include "zynqmp-csu-reg.h"
+
+/* Node ID for CSU module in firmware */
+#define CSU_NODE_ID 0
+
+/* Maximum number of CSU registers supported */
+#define MAX_CSU_REGS 50
+
+/* Size of register name returned by firmware (3 u32 words = 12 bytes) */
+#define CSU_REG_NAME_LEN 12
+
+/**
+ * struct zynqmp_csu_reg - CSU register information
+ * @id: Register index from firmware
+ * @name: Register name
+ * @attr: Device attribute for sysfs
+ */
+struct zynqmp_csu_reg {
+ u32 id;
+ char name[CSU_REG_NAME_LEN];
+ struct device_attribute attr;
+};
+
+/**
+ * struct zynqmp_csu_data - Per-device CSU data
+ * @csu_regs: Array of CSU registers
+ * @csu_attr_group: Attribute group for sysfs
+ */
+struct zynqmp_csu_data {
+ struct zynqmp_csu_reg *csu_regs;
+ struct attribute_group csu_attr_group;
+};
+
+/**
+ * zynqmp_pm_get_node_count() - Get number of supported nodes via QUERY_DATA
+ *
+ * Return: Number of nodes on success, or negative error code
+ */
+static int zynqmp_pm_get_node_count(void)
+{
+ struct zynqmp_pm_query_data qdata = {0};
+ u32 ret_payload[PAYLOAD_ARG_CNT];
+ int ret;
+
+ qdata.qid = PM_QID_GET_NODE_COUNT;
+
+ ret = zynqmp_pm_query_data(qdata, ret_payload);
+ if (ret)
+ return ret;
+
+ return ret_payload[1];
+}
+
+/**
+ * zynqmp_pm_get_node_name() - Get node name via QUERY_DATA
+ * @index: Register index
+ * @name: Buffer to store register name
+ *
+ * Return: 0 on success, error code otherwise
+ */
+static int zynqmp_pm_get_node_name(u32 index, char *name)
+{
+ struct zynqmp_pm_query_data qdata = {0};
+ u32 ret_payload[PAYLOAD_ARG_CNT];
+ int ret;
+
+ qdata.qid = PM_QID_GET_NODE_NAME;
+ qdata.arg1 = index;
+
+ ret = zynqmp_pm_query_data(qdata, ret_payload);
+ if (ret)
+ return ret;
+
+ memcpy(name, &ret_payload[1], CSU_REG_NAME_LEN);
+ name[CSU_REG_NAME_LEN - 1] = '\0';
+
+ return 0;
+}
+
+/**
+ * zynqmp_csu_reg_show() - Generic show function for all registers
+ * @dev: Device pointer
+ * @attr: Device attribute
+ * @buf: Output buffer
+ *
+ * Return: Number of bytes written to buffer, or error code
+ */
+static ssize_t zynqmp_csu_reg_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct zynqmp_csu_reg *reg;
+ u32 value;
+ int ret;
+
+ /* Use container_of to get register directly */
+ reg = container_of(attr, struct zynqmp_csu_reg, attr);
+
+ ret = zynqmp_pm_sec_read_reg(CSU_NODE_ID, reg->id, &value);
+ if (ret)
+ return ret;
+
+ return sysfs_emit(buf, "0x%08x\n", value);
+}
+
+/**
+ * zynqmp_csu_reg_store() - Generic store function for writable registers
+ * @dev: Device pointer
+ * @attr: Device attribute
+ * @buf: Input buffer
+ * @count: Buffer size
+ *
+ * Format: "mask value" - both mask and value required
+ * Example: echo "0xFFFFFFFF 0x12345678" > register
+ *
+ * Return: count on success, error code otherwise
+ */
+static ssize_t zynqmp_csu_reg_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct zynqmp_csu_reg *reg;
+ u32 mask, value;
+ int ret;
+
+ reg = container_of(attr, struct zynqmp_csu_reg, attr);
+
+ if (sscanf(buf, "%x %x", &mask, &value) != 2)
+ return -EINVAL;
+
+ ret = zynqmp_pm_sec_mask_write_reg(CSU_NODE_ID, reg->id, mask, value);
+ if (ret)
+ return ret;
+
+ return count;
+}
+
+/**
+ * zynqmp_csu_discover_registers() - Discover CSU registers from firmware
+ * @pdev: Platform device pointer
+ *
+ * This function uses PM_QUERY_DATA to discover all available CSU registers
+ * and creates sysfs group under /sys/devices/platform/firmware:zynqmp-firmware/
+ *
+ * Return: 0 on success, error code otherwise
+ */
+int zynqmp_csu_discover_registers(struct platform_device *pdev)
+{
+ struct zynqmp_csu_data *csu_data;
+ struct attribute **attrs;
+ int count, ret, i;
+
+ ret = zynqmp_pm_is_function_supported(PM_QUERY_DATA, PM_QID_GET_NODE_COUNT);
+ if (ret) {
+ dev_dbg(&pdev->dev, "CSU register discovery not supported by current firmware\n");
+ return 0;
+ }
+
+ ret = zynqmp_pm_is_function_supported(PM_QUERY_DATA, PM_QID_GET_NODE_NAME);
+ if (ret) {
+ dev_dbg(&pdev->dev, "CSU register name query not supported by current firmware\n");
+ return 0;
+ }
+
+ count = zynqmp_pm_get_node_count();
+ if (count < 0)
+ return count;
+ if (count == 0) {
+ dev_dbg(&pdev->dev, "No nodes available from firmware\n");
+ return 0;
+ }
+
+ /* Validate count to prevent excessive memory allocation */
+ if (count > MAX_CSU_REGS) {
+ dev_err(&pdev->dev, "Register count %d exceeds maximum %d\n",
+ count, MAX_CSU_REGS);
+ return -EINVAL;
+ }
+
+ dev_dbg(&pdev->dev, "Discovered %d nodes from firmware\n", count);
+
+ csu_data = devm_kzalloc(&pdev->dev, sizeof(*csu_data), GFP_KERNEL);
+ if (!csu_data)
+ return -ENOMEM;
+
+ csu_data->csu_regs = devm_kcalloc(&pdev->dev, count, sizeof(*csu_data->csu_regs),
+ GFP_KERNEL);
+ if (!csu_data->csu_regs) {
+ devm_kfree(&pdev->dev, csu_data);
+ return -ENOMEM;
+ }
+
+ attrs = devm_kcalloc(&pdev->dev, count + 1, sizeof(*attrs), GFP_KERNEL);
+ if (!attrs) {
+ devm_kfree(&pdev->dev, csu_data->csu_regs);
+ devm_kfree(&pdev->dev, csu_data);
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < count; i++) {
+ struct zynqmp_csu_reg *reg = &csu_data->csu_regs[i];
+ struct device_attribute *dev_attr = &reg->attr;
+
+ reg->id = i;
+
+ ret = zynqmp_pm_get_node_name(i, reg->name);
+ if (ret) {
+ dev_warn(&pdev->dev, "Failed to get name for register %d\n", i);
+ snprintf(reg->name, sizeof(reg->name), "csu_reg_%d", i);
+ }
+
+ /*
+ * The firmware does not expose per-register access mode via
+ * PM_QUERY_DATA today, so the kernel cannot tell read-only
+ * registers from read-write ones at discovery time. Expose
+ * every register as 0644 and rely on the firmware to reject
+ * IOCTL_MASK_WRITE_REG on read-only registers; the error is
+ * propagated back to userspace from the store callback.
+ */
+ sysfs_attr_init(&dev_attr->attr);
+ dev_attr->attr.name = reg->name;
+ dev_attr->attr.mode = 0644;
+ dev_attr->show = zynqmp_csu_reg_show;
+ dev_attr->store = zynqmp_csu_reg_store;
+
+ attrs[i] = &dev_attr->attr;
+
+ dev_dbg(&pdev->dev, "Register %d: id=%d name=%s\n", i, reg->id, reg->name);
+ }
+
+ csu_data->csu_attr_group.name = "csu_registers";
+ csu_data->csu_attr_group.attrs = attrs;
+
+ ret = devm_device_add_group(&pdev->dev, &csu_data->csu_attr_group);
+ if (ret) {
+ devm_kfree(&pdev->dev, attrs);
+ devm_kfree(&pdev->dev, csu_data->csu_regs);
+ devm_kfree(&pdev->dev, csu_data);
+ }
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(zynqmp_csu_discover_registers);
diff --git a/drivers/firmware/xilinx/zynqmp-csu-reg.h b/drivers/firmware/xilinx/zynqmp-csu-reg.h
new file mode 100644
index 000000000000..b12415db3496
--- /dev/null
+++ b/drivers/firmware/xilinx/zynqmp-csu-reg.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Xilinx Zynq MPSoC CSU Register Access
+ *
+ * Copyright (C) 2026 Advanced Micro Devices, Inc.
+ *
+ * Michal Simek <michal.simek@amd.com>
+ * Ronak Jain <ronak.jain@amd.com>
+ */
+
+#ifndef __ZYNQMP_CSU_REG_H__
+#define __ZYNQMP_CSU_REG_H__
+
+#include <linux/platform_device.h>
+
+int zynqmp_csu_discover_registers(struct platform_device *pdev);
+
+#endif /* __ZYNQMP_CSU_REG_H__ */
diff --git a/drivers/firmware/xilinx/zynqmp.c b/drivers/firmware/xilinx/zynqmp.c
index af838b2dc327..155a7a9b3777 100644
--- a/drivers/firmware/xilinx/zynqmp.c
+++ b/drivers/firmware/xilinx/zynqmp.c
@@ -27,6 +27,7 @@
#include <linux/firmware/xlnx-zynqmp.h>
#include <linux/firmware/xlnx-event-manager.h>
+#include "zynqmp-csu-reg.h"
#include "zynqmp-debug.h"
/* Max HashMap Order for PM API feature check (1<<7 = 128) */
@@ -2148,6 +2149,11 @@ static int zynqmp_firmware_probe(struct platform_device *pdev)
dev_err_probe(&pdev->dev, PTR_ERR(em_dev), "EM register fail with error\n");
}
+ /* Discover CSU registers dynamically */
+ ret = zynqmp_csu_discover_registers(pdev);
+ if (ret)
+ dev_warn(&pdev->dev, "CSU register discovery failed: %d\n", ret);
+
return of_platform_populate(dev->of_node, NULL, NULL, dev);
}
diff --git a/include/linux/firmware/xlnx-zynqmp.h b/include/linux/firmware/xlnx-zynqmp.h
index 7e27b0f7bf7e..a956e315be82 100644
--- a/include/linux/firmware/xlnx-zynqmp.h
+++ b/include/linux/firmware/xlnx-zynqmp.h
@@ -3,7 +3,7 @@
* Xilinx Zynq MPSoC Firmware layer
*
* Copyright (C) 2014-2021 Xilinx
- * Copyright (C) 2022 - 2025 Advanced Micro Devices, Inc.
+ * Copyright (C) 2022 - 2026 Advanced Micro Devices, Inc.
*
* Michal Simek <michal.simek@amd.com>
* Davorin Mista <davorin.mista@aggios.com>
@@ -262,6 +262,8 @@ enum pm_query_id {
PM_QID_CLOCK_GET_NUM_CLOCKS = 12,
PM_QID_CLOCK_GET_MAX_DIVISOR = 13,
PM_QID_PINCTRL_GET_ATTRIBUTES = 15,
+ PM_QID_GET_NODE_NAME = 16,
+ PM_QID_GET_NODE_COUNT = 17,
};
enum rpu_oper_mode {
--
2.34.1
* [PATCH v2 0/2] Add dynamic CSU register sysfs interface
From: Ronak Jain @ 2026-05-20 9:36 UTC (permalink / raw)
To: michal.simek, senthilnathan.thangaraj
Cc: linux-kernel, linux-arm-kernel, ronak.jain
This patch series adds support for exposing CSU registers through a
sysfs interface. The implementation uses dynamic discovery via the
PM_QUERY_DATA firmware API to determine available registers at
runtime, making the interface flexible and maintainable without
requiring kernel changes when firmware capabilities evolve.
Background:
The ZynqMP platform has several CSU registers that are useful for
system configuration and debugging. Previously, accessing these
registers required direct memory access or custom tools. This series
provides a standardized sysfs interface that leverages existing
firmware APIs for secure access.
Key Features:
- Dynamic register discovery using PM_QUERY_DATA API
* PM_QID_GET_NODE_COUNT: Query number of available registers
* PM_QID_GET_NODE_NAME: Query register names by index
- Automatic sysfs attribute creation under csu_registers/ group
- Read operations via existing IOCTL_READ_REG firmware API
- Write operations via existing IOCTL_MASK_WRITE_REG firmware API
- Firmware-enforced access control for read-only registers
Currently Supported Registers:
- multiboot (CSU_MULTI_BOOT): Boot mode configuration
- idcode (CSU_IDCODE): Device identification (read-only)
- pcap-status (CSU_PCAP_STATUS): PCAP status (read-only)
The sysfs interface is available at:
/sys/devices/platform/firmware:zynqmp-firmware/csu_registers/
Usage Examples:
Reading a register:
# cat /sys/devices/platform/firmware:zynqmp-firmware/csu_registers/idcode
Writing a register (mask and value in hex):
# echo "0xFFFFFFFF 0x0" > /sys/devices/platform/firmware:zynqmp-firmware/csu_registers/multiboot
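The write format above matches the store callback in patch 2, which reads two hex words with sscanf("%x %x"). A minimal userspace sketch of that parsing, plus the usual read-modify-write reading of a mask write, might look as follows. Both function names are hypothetical, and the formula in apply_mask_write() is an assumption about how the firmware applies the mask; the series itself does not spell it out.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse "mask value", both in hex (a 0x prefix is
 * accepted by %x). Returns 0 on success, -1 if two fields were not found.
 */
int parse_mask_value(const char *buf, uint32_t *mask, uint32_t *value)
{
	unsigned int m, v;

	if (sscanf(buf, "%x %x", &m, &v) != 2)
		return -1;

	*mask = m;
	*value = v;
	return 0;
}

/*
 * Assumed mask-write semantics (not confirmed by the series): bits set
 * in mask are taken from value, all other bits keep their old contents.
 */
uint32_t apply_mask_write(uint32_t old, uint32_t mask, uint32_t value)
{
	return (old & ~mask) | (value & mask);
}
```

Under that assumption, a mask of 0xFFFFFFFF replaces the whole register, while a narrower mask such as 0x0000FF00 only touches the selected byte.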
Testing:
- Verified register read operations return correct values
- Verified write operations update registers correctly
- Verified read-only registers reject write attempts
- Verified dynamic discovery works with different firmware versions
Changes in v2:
Patch #1:
- Update date
Patch #2:
- Removed unused csu_reg_count field from struct zynqmp_csu_data and
its kernel-doc entry.
- Added explicit devm_kfree() on the csu_regs and attrs allocation
failure paths, plus on devm_device_add_group() failure — keeps the
footprint minimal when CSU is optional.
- Expanded the 0644 sysfs-mode inline comment into a block comment
explaining the firmware-enforced access-control limitation. Also
updated the commit message accordingly.
- Added zynqmp_pm_is_function_supported check for PM_QID_GET_NODE_NAME
ID to mirror the PM_QID_GET_NODE_COUNT verification.
Ronak Jain (2):
Documentation: ABI: add sysfs interface for ZynqMP CSU registers
firmware: zynqmp: Add dynamic CSU register discovery and sysfs
interface
.../ABI/stable/sysfs-driver-firmware-zynqmp | 33 +++
MAINTAINERS | 10 +
drivers/firmware/xilinx/Makefile | 2 +-
drivers/firmware/xilinx/zynqmp-csu-reg.c | 258 ++++++++++++++++++
drivers/firmware/xilinx/zynqmp-csu-reg.h | 18 ++
drivers/firmware/xilinx/zynqmp.c | 6 +
include/linux/firmware/xlnx-zynqmp.h | 4 +-
7 files changed, 329 insertions(+), 2 deletions(-)
create mode 100644 drivers/firmware/xilinx/zynqmp-csu-reg.c
create mode 100644 drivers/firmware/xilinx/zynqmp-csu-reg.h
--
2.34.1
* Re: [PATCH v7 04/23] drm: bridge: dw_hdmi: Hold bridge ref until connector cleanup
From: Jonas Karlman @ 2026-05-20 9:38 UTC (permalink / raw)
To: Luca Ceresoli
Cc: Andrzej Hajda, Neil Armstrong, Robert Foss, Heiko Stuebner,
Laurent Pinchart, Jernej Skrabec, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Cristian Ciocaltea, Louis Chauvet, Liu Ying, Sandy Huang,
Andy Yan, Chen-Yu Tsai, Christian Hewitt, Diederik de Haas,
Nicolas Frattaroli, Dmitry Baryshkov, dri-devel, linux-arm-kernel,
linux-rockchip, linux-amlogic, linux-sunxi, imx, linux-kernel
In-Reply-To: <177925954999.1337464.14745526881417951688.b4-reply@b4>
Hello Luca,
On 5/20/2026 8:45 AM, Luca Ceresoli wrote:
> Hello Jonas,
>
> On 2026-05-19 17:18 +0200, Jonas Karlman wrote:
>> Hello Luca,
>>
>> On 5/19/2026 2:06 PM, Luca Ceresoli wrote:
>>> On Mon, 18 May 2026 18:01:40 +0000, Jonas Karlman <jonas@kwiboo.se> wrote:
>>>
>>> Hello Jonas,
>>>
>>>>
>>>> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
>>>> index b7bfc0e9a6b2..9d795c550f8a 100644
>>>> --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
>>>> +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
>>>> @@ -2600,10 +2609,14 @@ static int dw_hdmi_connector_create(struct dw_hdmi *hdmi)
>>>>
>>>> drm_connector_helper_add(connector, &dw_hdmi_connector_helper_funcs);
>>>>
>>>> - drm_connector_init_with_ddc(hdmi->bridge.dev, connector,
>>>> - &dw_hdmi_connector_funcs,
>>>> - DRM_MODE_CONNECTOR_HDMIA,
>>>> - hdmi->ddc);
>>>> + ret = drm_connector_init_with_ddc(hdmi->bridge.dev, connector,
>>>> + &dw_hdmi_connector_funcs,
>>>> + DRM_MODE_CONNECTOR_HDMIA,
>>>> + hdmi->ddc);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + drm_bridge_get(&hdmi->bridge);
>>>
>>> I'm not fully following the code paths, but both the report and the fix
>>> make sense to me. Only I'd move the drm_bridge_get() before
>>> drm_connector_init_with_ddc(), to avoid a short window where no reference
>>> is held and the bridge might be destroyed before drm_bridge_get() is
>>> called. I'm not sure this can happen, but it's better to write the code in
>>> a way that clearly makes it impossible.
>>
>> dw_hdmi_connector_create() is only called from dw_hdmi_bridge_attach()
>> so the bridge should already have a ref for the lifetime of this call.
>
> Ah, that's true. So the patch is correct.
>
> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Thanks.
> In case you send a new iteration, please add this extra explanation to the
> commit message, similar to the above paragraph.
Sure, I will include a note about this if/when I need to re-spin.
>> I explicitly chose the placement after drm_connector_init_with_ddc()
>> to ensure ref count is correctly balanced without having to add a
>> drm_bridge_put() call in any error path. I.e. connector destroy() is
>> only called when drm_connector_init_with_ddc() succeeds.
>>
>> This code/call is also planned to be removed in a future series,
>
> In order to remove the !DRM_BRIDGE_ATTACH_NO_CONNECTOR case? That would be
> welcome!
That would be an end goal; however, the initial plan/step was to just change
to use drm_bridge_connector_init() inside this driver [1], or possibly
move that to the consuming drivers (imx6, rockchip and sun8i), in an
unpolished future series [2].
Fully changing to ATTACH_NO_CONNECTOR for the affected drivers may
possibly be pushed to a follow-up series.
Main end goal of my current effort is to enable support for Deep Color
and YCbCr output modes on Rockchip RK32xx/RK33xx/RK356x devices [3].
[1] https://github.com/Kwiboo/linux-rockchip/commit/813b55961e5a8fa864ea157e2793e76ca4967bac
[2] https://github.com/Kwiboo/linux-rockchip/compare/7e9084cc75011ce28b1ceafec804091438eed1ff...3b5507aa260eb8306554c34a0c362e514ea41c3b
[3] https://github.com/Kwiboo/linux-rockchip/commits/next-20260518-rk-hdmi-v5/
Regards,
Jonas
>
> Luca
>
* Re: [PATCH v3 5/5] i2c: mt7621: make device reset optional
From: Benjamin Larsson @ 2026-05-20 9:41 UTC (permalink / raw)
To: Christian Marangi, Stefan Roese, Andi Shyti, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Matthias Brugger,
AngeloGioacchino Del Regno, linux-i2c, devicetree, linux-kernel,
linux-arm-kernel, linux-mediatek
In-Reply-To: <20260519223253.1093-6-ansuelsmth@gmail.com>
Hi.
On 5/20/26 00:32, Christian Marangi wrote:
> Airoha SoC that makes use of the same Mediatek I2C driver/logic doesn't
> have reset line for I2C so use optional device_reset variant.
>
> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
> ---
> drivers/i2c/busses/i2c-mt7621.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/i2c/busses/i2c-mt7621.c b/drivers/i2c/busses/i2c-mt7621.c
> index 3cde43c57a2b..fb9d9701bb10 100644
> --- a/drivers/i2c/busses/i2c-mt7621.c
> +++ b/drivers/i2c/busses/i2c-mt7621.c
> @@ -91,7 +91,7 @@ static void mtk_i2c_reset(struct mtk_i2c *i2c)
> u32 reg;
> int ret;
>
> - ret = device_reset(i2c->adap.dev.parent);
> + ret = device_reset_optional(i2c->adap.dev.parent);
> if (ret)
> dev_err(i2c->dev, "I2C reset failed!\n");
>
Can you elaborate on this one? I get this:
root@XGX-B-00e092000160:~# devmem 0x1fbf8040
0x00C7800C
root@XGX-B-00e092000160:~# devmem 0x1FB00834 32 0x10000
root@XGX-B-00e092000160:~# devmem 0x1fbf8040
[ 396.658742] pbus timeout interrupt ERR ADDR=1fbf8040
[ 396.663845] CPU: 0 PID: 5622 Comm: sleep Tainted: P O 5.4.55 #0
[ 396.671117] Hardware name: XGX-B (DT)
[ 396.674884] Call trace:
[ 396.677394] dump_backtrace+0x0/0x120
[ 396.681111] show_stack+0x14/0x20
[ 396.684478] dump_stack+0xac/0xec
[ 396.687900] bus_timeout_interrupt+0x54/0x70
[ 396.692223] __handle_irq_event_percpu+0x3c/0x140
[ 396.696978] handle_irq_event+0x4c/0xec
[ 396.700920] handle_fasteoi_irq+0xbc/0x21c
[ 396.705069] __handle_domain_irq+0x6c/0xd0
[ 396.709218] gic_handle_irq+0x8c/0x190
[ 396.713019] el1_irq+0xf0/0x1c0
[ 396.716217] __do_softirq+0x98/0x264
[ 396.719847] irq_exit+0x98/0xe0
[ 396.723118] __handle_domain_irq+0x74/0xd0
[ 396.727268] gic_handle_irq+0x8c/0x190
[ 396.731070] el1_irq+0xf0/0x1c0
[ 396.734320] tlb_flush+0xf8/0x260
[ 396.737693] tlb_finish_mmu+0x48/0xe0
[ 396.741417] exit_mmap+0xc0/0x170
[ 396.744841] mmput+0x44/0x120
[ 396.747872] do_exit+0x2b4/0x8ec
[ 396.751161] do_group_exit+0x34/0x9c
[ 396.754794] __wake_up_parent+0x0/0x2c
[ 396.758651] el0_svc_handler+0x8c/0x150
[ 396.762545] el0_svc+0x8/0x208
0xDEADBEEF
root@XGX-B-00e092000160:~# devmem 0x1FB00834 32 0x00000
root@XGX-B-00e092000160:~# devmem 0x1fbf8040
0x0000800C
and
root@XGX-B-00e092000160:~# devmem 0x1fbf8140
0x00318013
root@XGX-B-00e092000160:~# devmem 0x1FB00830 32 0x00040
root@XGX-B-00e092000160:~# devmem 0x1fbf8140
[ 611.730070] pbus timeout interrupt ERR ADDR=1fbf8140
[ 611.735197] CPU: 0 PID: 2651 Comm: ux-manager Tainted: P O 5.4.55 #0
[ 611.742925] Hardware name: XGX-B (DT)
[ 611.746697] Call trace:
[ 611.749222] dump_backtrace+0x0/0x120
[ 611.752960] show_stack+0x14/0x20
[ 611.756424] dump_stack+0xac/0xec
[ 611.759801] bus_timeout_interrupt+0x54/0x70
[ 611.764145] __handle_irq_event_percpu+0x3c/0x140
[ 611.769001] handle_irq_event+0x4c/0xec
[ 611.772912] handle_fasteoi_irq+0xbc/0x21c
[ 611.777077] __handle_domain_irq+0x6c/0xd0
[ 611.781312] gic_handle_irq+0x8c/0x190
[ 611.785117] el1_irq+0xf0/0x1c0
[ 611.788335] __do_softirq+0x98/0x264
[ 611.792032] irq_exit+0x98/0xe0
[ 611.795249] __handle_domain_irq+0x74/0xd0
[ 611.799411] gic_handle_irq+0x8c/0x190
[ 611.803303] el1_irq+0xf0/0x1c0
[ 611.806518] bgpio_read32+0x4/0x20
[ 611.809995] gpiod_get_value_cansleep+0x44/0x100
[ 611.814742] value_show+0x2c/0x64
[ 611.818114] dev_attr_show+0x1c/0x54
[ 611.821760] sysfs_kf_read+0x54/0xc0
[ 611.825396] kernfs_fop_read+0xac/0x300
[ 611.829355] __vfs_read+0x18/0x3c
[ 611.832741] vfs_read+0xc8/0x150
[ 611.836028] ksys_read+0x58/0xd4
[ 611.839369] __arm64_sys_read+0x18/0x20
[ 611.843264] el0_svc_handler+0x8c/0x150
[ 611.847166] el0_svc+0x8/0x208
0xDEADBEEF
root@XGX-B-00e092000160:~# devmem 0x1FB00830 32 0x00000
root@XGX-B-00e092000160:~# devmem 0x1fbf8140
0x00008000
When I look at the current dts:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/airoha/en7581.dtsi?h=v7.1-rc4#n322
it looks like the resets are just crossed with regards to the nodes.
MvH
Benjamin Larsson
* Re: [PATCH v2 7/7] mm/vmalloc: Stop scanning for compound pages after encountering small pages in vmap
From: Uladzislau Rezki @ 2026-05-20 9:44 UTC (permalink / raw)
To: Wen Jiang
Cc: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki,
baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang
In-Reply-To: <20260514094108.2016201-8-jiangwen6@xiaomi.com>
On Thu, May 14, 2026 at 05:41:08PM +0800, Wen Jiang wrote:
> From: "Barry Song (Xiaomi)" <baohua@kernel.org>
>
> Users typically allocate memory in descending orders, e.g.
> 8 → 4 → 0. Once an order-0 page is encountered, subsequent
> pages are likely to also be order-0, so we stop scanning
> for compound pages at that point.
>
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
> Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
> ---
> mm/vmalloc.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index b3389c8f1..60579bfbf 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3576,6 +3576,12 @@ static int __vmap_huge(unsigned long addr, unsigned long end,
> map_addr = addr;
> idx = i;
> }
> + /*
> + * Once small pages are encountered, the remaining pages
> + * are likely small as well
> + */
> + if (shift == PAGE_SHIFT)
> + break;
>
> addr += 1UL << shift;
> i += 1U << (shift - PAGE_SHIFT);
> --
> 2.34.1
>
Can we squash this patch with
"mm/vmalloc: map contiguous pages in batches for vmap() if possible"?
--
Uladzislau Rezki
* Re: [PATCH v8 next 00/10] arm_mpam: Introduce Narrow-PARTID feature
From: Zeng Heng @ 2026-05-20 9:47 UTC (permalink / raw)
To: James Morse, ben.horgan, Dave.Martin, tan.shaopeng,
reinette.chatre, fenghuay, tglx, will, hpa, bp, babu.moger,
dave.hansen, mingo, tony.luck, gshan, catalin.marinas
Cc: linux-arm-kernel, x86, linux-kernel, wangkefeng.wang, zengheng4
In-Reply-To: <ec8bc617-9e74-4749-ab33-39d1079415cc@arm.com>
Hi James,
On 2026/5/15 1:06, James Morse wrote:
> Hi Zeng,
>
> (beware this is the first version I've seen - arm have been silently deleting your mail,
> it looks like a problem with DKIM signatures)
>
Thanks for letting me know. I will try sending community mail from a
huaweicloud address to avoid the DKIM signature issues.
Hope it works.
> On 13/04/2026 09:53, Zeng Heng wrote:
>> Background
>> ==========
>>
>> On x86, the resctrl allows creating up to num_rmids monitoring groups
>> under parent control group. However, ARM64 MPAM is currently limited by
>> the PMG (Performance Monitoring Group) count, which is typically much
>> smaller than the theoretical RMID limit.
>
> The MPAM PMG limit is 255. Is that not enough?
>
> I think the real problem is the CHI interconnect protocol is forcing people
> to only have 1 bit of PMG - regardless of what the architecture says. This
> isn't an MPAM problem as such - it's an implementation issue.
>
> (but we can try and work around it)
>
Yes, the architecture theoretically allows PMG to be up to 8 bits wide,
but many platforms I've worked with (not just Kunpeng) implement far
fewer bits in practice.
>
>> This creates a significant
>> scalability gap: users expecting fine-grained per-process or per-thread
>> monitoring quickly exhaust the PMG space, even when plenty of reqPARTIDs
>> remain available.
>
> This is more about MPAM's philosophical stance that PMG extends PARTID, whereas
> on x86 RMID is an independent number.
>
No value judgment here. ARM seeks to expand the number of monitoring
groups by combining PARTID and PMG within limited bit-width constraints,
which inherently introduces coupling between the two.
> Please don't muddle these - it results in muddled patches!
> If we want to try and attack both with narrowing, we should do them separately.
>
>
>> The Narrow-PARTID feature, defined in the ARM MPAM architecture,
>> addresses this by associating reqPARTIDs with intPARTIDs through a
>> programmable many-to-one mapping. This allows the kernel to present more
>> logical monitoring contexts.
>
> I'd put this as "can be abused to avoid this problem"! We still have a problem with
> controls that don't alias and need to be removed from MSC that don't support narrowing.
> This isn't what the feature was designed for - but it is a really cool trick, it works
> for some real platforms, and solves a problem seen in user-space.
>
> However - throughout this series you seem to be discarding all the control-group support
> for a monitoring-only setup that allocates intPARTID for everything. This might work for
> your use-case on your platform, but it doesn't generalise to platforms without narrowing
> or where multiple control-groups are needed.
>
Currently, for MSCs that have non-aliasing controls but do not support
the Narrow PARTID feature, this solution will directly disable itself,
rather than hiding the non-aliasing control capabilities (Patch 3:
https://lore.kernel.org/all/20260413085405.1166412-4-zengheng4@huawei.com/).
This does indeed affect the enablement of this solution on MSC systems
without narrowing capability.
On the contrary, the solution attempts to preserve as many intPARTIDs
(i.e., control groups) as there were originally. In principle, I hope that on
systems where narrow PARTID was not previously enabled, this patch set
can create as many monitoring groups as possible without changing any
other functionality.
It also allows users to limit the intpartid_max count via a boot
parameter (Patch 6:
https://lore.kernel.org/all/20260413085405.1166412-7-zengheng4@huawei.com/).
>
>> Design Overview
>> ===============
>>
>> The implementation extends the RMID encoding to carry reqPARTID
>> information:
>>
>> RMID = reqPARTID * NUM_PMG + PMG
>>
>> In this patchset, a monitoring group is uniquely identified by the
>> combination of reqPARTID and PMG. The closid is represented by intPARTID,
>> which is exactly the original PARTID.
>
> The way I think of this is 'RMID' bits being spilled into PARTID. This
> means each control group has a set of PARTID. For MSC using narrowing,
> CLOSID would be the intPARTID value. But as you note, we need to support
> mismatches:
>
>
Yes.
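As an illustration of the encoding quoted above (RMID = reqPARTID * NUM_PMG + PMG), here is a minimal sketch of packing and unpacking an RMID. The struct and function names are hypothetical and are not taken from the series; num_pmg stands for the number of PMG values the platform implements.

```c
#include <stdint.h>

/* Hypothetical container for the two fields carried in an RMID. */
struct rmid_fields {
	uint32_t reqpartid;
	uint32_t pmg;
};

/* RMID = reqPARTID * NUM_PMG + PMG, as given in the cover letter. */
uint32_t rmid_encode(uint32_t reqpartid, uint32_t pmg, uint32_t num_pmg)
{
	return reqpartid * num_pmg + pmg;
}

/* Inverse of rmid_encode(); num_pmg must be non-zero. */
struct rmid_fields rmid_decode(uint32_t rmid, uint32_t num_pmg)
{
	struct rmid_fields f = {
		.reqpartid = rmid / num_pmg,
		.pmg = rmid % num_pmg,
	};

	return f;
}
```

With a single PMG bit (num_pmg = 2), reqPARTID 5 and PMG 1 encode to RMID 11, and decoding 11 gives back the same pair.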
>> For systems with homogeneous MSCs (all supporting Narrow-PARTID), the
>> driver exposes the full reqPARTID range directly. For heterogeneous
>> systems where some MSCs lack Narrow-PARTID support, the driver utilizes
>> PARTIDs beyond the intPARTID range as reqPARTIDs to expand monitoring
>> capability. The sole exception is when any type of MSCs lack Narrow-PARTID
>> support, their percentage-based control mechanism prevents the use of
>> PARTIDs as reqPARTIDs.
>
> It'd be good to have some discussion about what the interface between the
> mpam_devices code and any other user (like resctrl) should be.
>
> As a hypothetical system to think about:
> 64 PARTID at the L3, which support CPOR and CCAP
> 64 PARTID and narrowing to 16 at the SLC, which supports CPOR
> 64 PARTID and narrowing to 32 at the memory-controller, which support MBWU_MAX
>
By the way, in this case the L3 does not support NP and has CCAP, so
the PARTID mapping extension (PME) is not enabled by default.
If we exclude the L3 CCAP, the solution would support 16 control groups
and (64 * PMG) monitoring groups.
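To make those numbers concrete, the formulas from the capability table in the cover letter can be sketched as plain arithmetic. This is only an illustration: the struct and function names are hypothetical, the arguments are counts (number of reqPARTIDs, intPARTIDs and PMG values), and the PMG count of 2 used in the note after the code is an assumption.

```c
#include <stdint.h>

/* Hypothetical result type: monitoring groups available in each scope. */
struct mon_capacity {
	uint32_t per_ctrl_group;	/* under one control group */
	uint32_t system_wide;		/* across the whole system */
};

/* Without reqPARTID: PMG per control group, intPARTID * PMG overall. */
struct mon_capacity capacity_baseline(uint32_t intpartid, uint32_t pmg)
{
	struct mon_capacity c = { pmg, intpartid * pmg };

	return c;
}

/* Static allocation: (reqPARTID / intPARTID) * PMG per control group. */
struct mon_capacity capacity_static(uint32_t reqpartid, uint32_t intpartid,
				    uint32_t pmg)
{
	struct mon_capacity c = {
		(reqpartid / intpartid) * pmg,
		reqpartid * pmg,
	};

	return c;
}

/* Dynamic allocation: (reqPARTID - intPARTID + 1) * PMG per control group. */
struct mon_capacity capacity_dynamic(uint32_t reqpartid, uint32_t intpartid,
				     uint32_t pmg)
{
	struct mon_capacity c = {
		(reqpartid - intpartid + 1) * pmg,
		reqpartid * pmg,
	};

	return c;
}
```

For 64 reqPARTIDs narrowed to 16 intPARTIDs and an assumed 2 PMG values, the baseline gives 2 per control group and 32 system-wide, static allocation gives 8 and 128, and dynamic allocation gives 98 and 128.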
> I think whether using intPARTID is a benefit needs to be user-space policy.
> You've likely got a platform where that choice is obvious - but it is a
> trade-off as you lose the non-aliasing controls. In the example above, using
> narrowing on this system means losing the CCAP controls on L3 as they don't alias [*].
> Where it's a policy, it's likely to be one policy for resctrl, and another for any other
> user.
> We can get the resctrl glue code to turn it on unconditionally if there is no trade off,
> I think that means: no non-aliasing controls in any class that doesn't support narrowing
> - including 'unknown'. (we couldn't add them to resctrl in the future if you already chose
> to enable this).
>
Currently, after MPAM initialization, the PARTID mapping extension (PME)
is enabled by default unless there exists an MSC that both lacks NP
support and has non-aliasing controls — this is purely beneficial with
no downsides. Going forward, we may consider adding a `force_reqpartid`
option to forcibly enable the feature and disable non-aliasing controls.
> As for the interface with mpam_devices:
> I think this means the resctrl glue code needs to be able to discover which
> classes support intPARTID, and how many controls they actually have. From there
> it can apply to policy to determine whether its better to support fewer features
> in resctrl to get more RMID. (the alternative is always to ignore the MSC with
> narrowing - narrowing lets hardware lie about the features it supports).
>
> Currently the resctrl glue code has to program a configuration for two PARTID
> when CDP is being hidden on the MB resource. This is ugly and fragile. I'd like
> to explore generalising it as this narrowing stuff will also need to apply a
> configuration to a set of PARTID when that MSC doesn't support narrowing.
> In the example above, we'd need to discard the CCAP controls and write the same
> CPOR bitmap to each PARTID that is mapped together by narrowing.
>
One option is to expand CDP compatibility by PME: L3DATA and L3CODE
would still be controlled separately, while MB control would be
consolidated via narrow mapping onto a single intPARTID.
Of course, this requires that the MB supports narrowing.
>
> I think this means the resctrl glue code will need to be able to write a configuration
> to controls using the full partid_max range as it does today. But also be able to set
> the narrowing mapping on classes that support it.
> For the monitors, the resctrl glue code will need to allocate and configure a set of
> monitors, and read and sum them. This will be regardless of whether narrowing is
> supported.
>
> I think this means allocating a table of CLOSID to PARTID(s). the intPARTID would
> always match the CLOSID. Monitors and non-narrowing MSC would need to walk the list.
> I'm hoping we can make CDP a subset of this problem.
> Some clever arithmetic may save allocating memory for a table - but if we change resctrl
> to do this dynamically, the numbers become arbitrary forcing it to be a table.
> It might also be possible to support moving monitor-groups between control groups with
> the table driven approach. (see what you think on how complex it ends up ...)
>
In the current patch series, static allocation employs a
straightforward intPARTID-to-reqPARTID translation, while dynamic
management tracks the mappings via `reqpartid_map` table.
> I'd like to keep that grouping static for now, the table needs creating at setup time,
> (+/- CDP), to avoid problems like you've found with CDP. This means the intpartid mappings
> can be written once at setup time.
>
> I'd like to avoid exposing user ABI to control this until we get it working, then we can
> talk about whether to try making the grouping dynamically managed by resctrl. (there were
> some proposals in that area - but I can't find them on lore).
> If there are platforms where it's certainly not a trade-off, we can enable it
> unconditionally - but I'm wary of this being "what we care about now", requiring user-abi
> to enable features that were detectable.
> e.g. we ignore an unknown MSC, and add a resctrl schema for it later - only we can't
> expose it if we were using narrowing. Now it's a trade-off.
>
>
>> Capability Improvements
>> =======================
>>
>> --------------------------------------------------------------------------
>> The maximum | Sub-monitoring groups | System-wide
>> number of | under a control group | monitoring groups
>> --------------------------------------------------------------------------
>> Without reqPARTID | PMG | intPARTID * PMG
>> --------------------------------------------------------------------------
>> reqPARTID | |
>> static allocation | (reqPARTID // intPARTID) * PMG | reqPARTID * PMG
>> --------------------------------------------------------------------------
>> reqPARTID | |
>> dynamic allocation | (reqPARTID - intPARTID + 1) * PMG | reqPARTID * PMG
>> --------------------------------------------------------------------------
>>
>> Note: The number of intPARTIDs can be capped via the boot parameter
>> mpam.intpartid_max. Under MPAM, reqPARTID count is always greater than
>> or equal to intPARTID count.
>>
>> Series Structure
>> ================
>>
>> Patch 1: Fix pre-existing out-of-range PARTID issue between mount sessions.
>> Patches 2-6: Implement static reqPARTID allocation.
>> Patches 7-10: Implement dynamic reqPARTID allocation.
>
> I've had a hard time following this series. You dive in with invasive changes, then
> unbreak things in later patches.
>
> Please add the needed infrastructure in mpam_devices.c first. This should be free of
> resctrl-isms, and 'only' needs reviewing against the architecture.
>
> Then add the resctrl glue code stuff. That needs to comply with what resctrl expects.
>
> I think the cleanest way to think about this is to break the mapping between CLOSID and
> PARTID. We're effectively moving bits of RMID out of PMG into PARTID. Adding helpers
> to explicitly do this early in those patches will make your changes clearer.
> Please avoid spraying the narrowing terms for things everywhere.
>
>
Sure, I'll reorder the series to introduce the core infrastructure in
mpam_devices.c first. Should I drop the dynamic allocation part from
this series for now?
>
>
> [*] It's terminology from discussing this with Dave, just in case a summary is needed:
> aliasing controls are like CPOR where two different PARTID with the same bitmap
> compete for the same resource. If you give them each the same 50% of the portions,
> they can't exceed that together.
> non-aliasing controls are like CCAP where two different PARTID with the same fraction
> compete for different resources. If you give them each 50% of the capacity, it adds
> up to 100%. You can't represent 'the same' 50% using these controls.
>
> Narrowing papers over this problem with its remapping table, which gives you a 'same'
> property. For MSC that have controls of that shape - and where more monitors are
> desired - we'd have to drop the controls.
>
> I think "more monitors are desired" is going to need to be user-space policy. But
> we can come back to how to do that later.
>
>
I'm not sure if anyone else has formalized these into terminology
before, but I fully agree with the terms "aliasing controls" and "non-
aliasing controls" — they're instantly intuitive for software
developers.
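The distinction can be shown with a toy model. This is a sketch of the idea only, not of how MPAM controls are programmed: the function names are hypothetical, a CPOR-like control is modelled as a bitmap of cache portions, and a CCAP-like control as a percentage of capacity.

```c
#include <stdint.h>

/*
 * Aliasing (CPOR-like): two PARTIDs given the same portion bitmap compete
 * for the same portions, so their combined footprint is the union of the
 * two bitmaps. Returns the number of portions in that union.
 */
unsigned int cpor_combined_portions(uint32_t bitmap_a, uint32_t bitmap_b)
{
	uint32_t u = bitmap_a | bitmap_b;
	unsigned int n = 0;

	while (u) {
		n += u & 1u;
		u >>= 1;
	}

	return n;
}

/*
 * Non-aliasing (CCAP-like): each PARTID is limited independently, so the
 * combined worst case is the sum of the two fractions, capped at 100%.
 */
unsigned int ccap_combined_percent(unsigned int pct_a, unsigned int pct_b)
{
	unsigned int sum = pct_a + pct_b;

	return sum > 100 ? 100 : sum;
}
```

With 8 portions, two PARTIDs that both hold bitmap 0x0F still occupy only 4 portions (50%) together, whereas two PARTIDs each capped at 50% by a CCAP-like control can add up to 100%.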
Best regards,
Zeng Heng
* Re: [PATCH v2 3/8] of: reserved_mem: add dumpable flag to opt-in vmcore
From: Marek Szyprowski @ 2026-05-20 9:53 UTC (permalink / raw)
To: Wandun Chen, linux-arm-kernel, linux-kernel, loongarch,
linux-riscv, devicetree, kexec, iommu, zhaomeijing
Cc: catalin.marinas, will, chenhuacai, kernel, pjw, palmer, aou, alex,
robh, saravanak, akpm, bhe, rppt, pasha.tatashin, pratyush,
ruirui.yang, robin.murphy, leitao, kees, coxu, tangyouling,
songshuaishuai
In-Reply-To: <20260520091844.592753-4-chenwandun@lixiang.com>
On 20.05.2026 11:18, Wandun Chen wrote:
> From: Wandun Chen <chenwandun1@gmail.com>
>
> From: Wandun Chen <chenwandun@lixiang.com>
>
> Add a 'dumpable' flag to struct reserved_mem so the kernel can decide
> whether a reserved area should be included in the kdump vmcore. Most
> reserved regions are owned by devices and do not contain data useful
> for kernel crash analysis, so excluding them by default is the right
> behaviour.
>
> Reusable CMA regions are different: pages in a CMA region are handed
> back to the buddy allocator and may contain key data for crash
> analysis, so set dumpable to true in rmem_cma_setup().
>
> Suggested-by: Rob Herring <robh@kernel.org>
> Signed-off-by: Wandun Chen <chenwandun@lixiang.com>
> Tested-by: Meijing Zhao <zhaomeijing@lixiang.com>
> Link: https://lore.kernel.org/all/20260506144542.GA2072596-robh@kernel.org/
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
> ---
> include/linux/of_reserved_mem.h | 1 +
> kernel/dma/contiguous.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/include/linux/of_reserved_mem.h b/include/linux/of_reserved_mem.h
> index e8b20b29fa68..55a67cee41ea 100644
> --- a/include/linux/of_reserved_mem.h
> +++ b/include/linux/of_reserved_mem.h
> @@ -15,6 +15,7 @@ struct reserved_mem {
> phys_addr_t base;
> phys_addr_t size;
> void *priv;
> + bool dumpable;
> };
>
> struct reserved_mem_ops {
> diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> index 03f52bd17120..eddec89eb414 100644
> --- a/kernel/dma/contiguous.c
> +++ b/kernel/dma/contiguous.c
> @@ -579,6 +579,7 @@ static int __init rmem_cma_setup(unsigned long node, struct reserved_mem *rmem)
> dma_contiguous_default_area = cma;
>
> rmem->priv = cma;
> + rmem->dumpable = true;
>
> pr_info("Reserved memory: created CMA memory pool at %pa, size %ld MiB\n",
> &rmem->base, (unsigned long)rmem->size / SZ_1M);
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
^ permalink raw reply
* Re: [PATCH v7 19/23] drm: bridge: dw_hdmi: Use delayed_work to debounce hotplug event
From: Neil Armstrong @ 2026-05-20 9:58 UTC (permalink / raw)
To: Jonas Karlman, Andrzej Hajda, Robert Foss, Heiko Stuebner,
Laurent Pinchart, Jernej Skrabec, Luca Ceresoli,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: Liu Ying, Sandy Huang, Andy Yan, Chen-Yu Tsai, Christian Hewitt,
Diederik de Haas, Nicolas Frattaroli, Dmitry Baryshkov, dri-devel,
linux-arm-kernel, linux-rockchip, linux-amlogic, linux-sunxi, imx,
linux-kernel
In-Reply-To: <20260518180206.2480119-20-jonas@kwiboo.se>
Hi,
On 5/18/26 20:01, Jonas Karlman wrote:
> HDMI Specification Version 1.4b chapter 8.5 mentions:
>
> An HDMI Sink shall not assert high voltage level on its Hot Plug
> Detect pin when the E-EDID is not available for reading.
>
> A Source may use a high voltage level Hot Plug Detect signal to
> initiate the reading of E-EDID data.
>
> An HDMI Sink shall indicate any change to the contents of the E-EDID
> by driving a low voltage level pulse on the Hot Plug Detect pin. This
> pulse shall be at least 100 msec.
>
> Use a delayed work to debounce reacting on HPD events to improve
> handling of a HPD low voltage level pulse when a sink changes the EDID.
>
> The delayed work is only enabled between enable_hpd()/hpd_enable() and
> disable_hpd()/hpd_disable() calls from core, i.e. enabled after
> attach/bind/resume and disabled before detach/unbind/suspend.
>
> The 1100 msec hotplug debounce timeout was arbitrarily picked to match
> other drivers using same const, and testing using a Raspberry Pi Monitor
> seem to use a 200-300 msec pulse when going from standby to power on
> state.
The logic looks ok, but I'm puzzled by the 1.1 sec debounce: after plugging
in a monitor, the irq event is only sent after 1.1s, which is very long.
Since the spec says 100ms and the real-world values are more like 200-300ms,
I would first reduce this to 500ms.
But as I understand the code right now, on the first HPD edge the irq work
is programmed to run after the debounce time, but if it's a pulse the irq would
also trigger on the second HPD edge and then delay the work again by the
debounce time.
My understanding of a debounce was that we "ignore" the pulse by only generating
a single irq event when the pulse is finished.
The current code does that: we will only have a single irq event and the HPD
will return to the connected state, good. But this delays the irq event to 1.1s
_after_ the end of the pulse, whereas I would expect the event to be sent at the
debounce time after the start of the pulse.
Like, program the work at the beginning of the pulse, if somehow the pulse ends before
the debounce time, send the irq event immediately, otherwise let the debounce
work run after the debounce time which will trigger a disconnect event.
But the delay is too high, 1.1s could be a manual unplug/plug or bad connector
with false contact on the hpd pin.
I would rather reduce this to something more realistic like 500ms or less and
try to better handle the pulse somehow. But I don't have any idea if the scheme
I described is doable.
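To illustrate the difference, here is a minimal userspace model of the two
schemes. It is not driver code: the helper names and the NO_END sentinel are
made up for this sketch, and it only computes when the first hotplug event
would be delivered for a single HPD low pulse.

```c
#define NO_END (-1)	/* HPD stays low: a real unplug, no second edge */

/*
 * Scheme as I read the current patch: every HPD edge re-arms the delayed
 * work, so a short pulse pushes the event to debounce_ms after its end.
 */
static int first_event_rearm(int start_ms, int end_ms, int debounce_ms)
{
	if (end_ms == NO_END || end_ms - start_ms >= debounce_ms)
		return start_ms + debounce_ms;	/* work ran before 2nd edge */
	return end_ms + debounce_ms;		/* 2nd edge re-armed the work */
}

/*
 * Scheme described above: arm the work at the start of the pulse; if the
 * pulse ends before the debounce time, send the event immediately,
 * otherwise let the work run and report the disconnect.
 */
static int first_event_proposed(int start_ms, int end_ms, int debounce_ms)
{
	if (end_ms != NO_END && end_ms - start_ms < debounce_ms)
		return end_ms;			/* pulse ended early */
	return start_ms + debounce_ms;		/* treated as a disconnect */
}
```

For a 250ms pulse starting at t=0 with the 1100ms debounce, the first helper
gives 1350ms and the second 250ms. For a real unplug both give 1100ms.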
Neil
>
> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> ---
> v7: Change to free irq before mute and clear using IH regs, also include
> clear of STAT0_RX_SENSE
> v6: Change back to disable_delayed_work_sync() in hpd disable ops,
> Ensure HPD interrupt is masked and IRQ handler is disabled early
> in dw_hdmi_remove() to prevent any irq re-arming of delayed work,
> Drop use of suspend helper
> v5: Change to none-sync disable_delayed_work() in hpd disable ops,
> Change to cancel_delayed_work_sync() in remove,
> Add cancel_delayed_work_sync() to new suspend helper
> v4: Disable/mask delayed_work until enable_hpd()/hpd_enable(),
> Read connector status directly from HW regs in hpd_work
> v3: New patch
> ---
> drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 80 +++++++++++++++++++++--
> 1 file changed, 75 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
> index 8afc9d240121..270db58a0e7c 100644
> --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
> +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
> @@ -50,6 +50,8 @@
>
> #define HDMI14_MAX_TMDSCLK 340000000
>
> +#define HOTPLUG_DEBOUNCE_MS 1100
> +
> static const u16 csc_coeff_default[3][4] = {
> { 0x2000, 0x0000, 0x0000, 0x0000 },
> { 0x0000, 0x2000, 0x0000, 0x0000 },
> @@ -185,6 +187,7 @@ struct dw_hdmi {
> hdmi_codec_plugged_cb plugged_cb;
> struct device *codec_dev;
> enum drm_connector_status last_connector_result;
> + struct delayed_work hpd_work;
> };
>
> const struct dw_hdmi_plat_data *dw_hdmi_to_plat_data(struct dw_hdmi *hdmi)
> @@ -2517,6 +2520,20 @@ static void dw_hdmi_connector_force(struct drm_connector *connector)
> dw_hdmi_connector_status_update(hdmi, connector, connector->status);
> }
>
> +static void dw_hdmi_connector_enable_hpd(struct drm_connector *connector)
> +{
> + struct dw_hdmi *hdmi = container_of(connector, struct dw_hdmi, connector);
> +
> + enable_delayed_work(&hdmi->hpd_work);
> +}
> +
> +static void dw_hdmi_connector_disable_hpd(struct drm_connector *connector)
> +{
> + struct dw_hdmi *hdmi = container_of(connector, struct dw_hdmi, connector);
> +
> + disable_delayed_work_sync(&hdmi->hpd_work);
> +}
> +
> static void dw_hdmi_connector_destroy(struct drm_connector *connector)
> {
> struct dw_hdmi *hdmi = container_of(connector, struct dw_hdmi, connector);
> @@ -2538,6 +2555,8 @@ static const struct drm_connector_funcs dw_hdmi_connector_funcs = {
> static const struct drm_connector_helper_funcs dw_hdmi_connector_helper_funcs = {
> .get_modes = dw_hdmi_connector_get_modes,
> .atomic_check = dw_hdmi_connector_atomic_check,
> + .enable_hpd = dw_hdmi_connector_enable_hpd,
> + .disable_hpd = dw_hdmi_connector_disable_hpd,
> };
>
> static int dw_hdmi_connector_create(struct dw_hdmi *hdmi)
> @@ -2968,6 +2987,20 @@ static const struct drm_edid *dw_hdmi_bridge_edid_read(struct drm_bridge *bridge
> return dw_hdmi_edid_read(hdmi, connector);
> }
>
> +static void dw_hdmi_bridge_hpd_enable(struct drm_bridge *bridge)
> +{
> + struct dw_hdmi *hdmi = bridge->driver_private;
> +
> + enable_delayed_work(&hdmi->hpd_work);
> +}
> +
> +static void dw_hdmi_bridge_hpd_disable(struct drm_bridge *bridge)
> +{
> + struct dw_hdmi *hdmi = bridge->driver_private;
> +
> + disable_delayed_work_sync(&hdmi->hpd_work);
> +}
> +
> static const struct drm_bridge_funcs dw_hdmi_bridge_funcs = {
> .atomic_duplicate_state = drm_atomic_helper_bridge_duplicate_state,
> .atomic_destroy_state = drm_atomic_helper_bridge_destroy_state,
> @@ -2981,6 +3014,8 @@ static const struct drm_bridge_funcs dw_hdmi_bridge_funcs = {
> .mode_valid = dw_hdmi_bridge_mode_valid,
> .detect = dw_hdmi_bridge_detect,
> .edid_read = dw_hdmi_bridge_edid_read,
> + .hpd_enable = dw_hdmi_bridge_hpd_enable,
> + .hpd_disable = dw_hdmi_bridge_hpd_disable,
> };
>
> /* -----------------------------------------------------------------------------
> @@ -3101,8 +3136,8 @@ static irqreturn_t dw_hdmi_irq(int irq, void *dev_id)
> status == connector_status_connected ?
> "plugin" : "plugout");
>
> - if (hdmi->bridge.dev)
> - drm_helper_hpd_irq_event(hdmi->bridge.dev);
> + mod_delayed_work(system_percpu_wq, &hdmi->hpd_work,
> + msecs_to_jiffies(HOTPLUG_DEBOUNCE_MS));
> }
>
> hdmi_writeb(hdmi, intr_stat, HDMI_IH_PHY_STAT0);
> @@ -3112,6 +3147,29 @@ static irqreturn_t dw_hdmi_irq(int irq, void *dev_id)
> return IRQ_HANDLED;
> }
>
> +static void dw_hdmi_hpd_work(struct work_struct *work)
> +{
> + struct dw_hdmi *hdmi = container_of(work, struct dw_hdmi, hpd_work.work);
> + struct drm_device *dev = hdmi->bridge.dev;
> +
> + if (WARN_ON(!dev))
> + return;
> +
> + /*
> + * Notify the DRM core of the HPD event using drm_helper_hpd_irq_event()
> + * instead of drm_bridge_hpd_notify(). This will cause the DRM function
> + * check_connector_changed() to be called, which in turn calls the
> + * connector detect()/force() funcs to detect any connection status or
> + * epoch changes. The bridge connector detect() func also ensures that
> + * any hpd_notify() funcs are called for all bridges in the chain.
> + *
> + * drm_bridge_hpd_notify() shares a mutex with drm_bridge_hpd_disable(),
> + * and can result in a deadlock due to the disable_delayed_work_sync()
> + * call to wait on work to complete in dw_hdmi_bridge_hpd_disable().
> + */
> + drm_helper_hpd_irq_event(dev);
> +}
> +
> static const struct dw_hdmi_phy_data dw_hdmi_phys[] = {
> {
> .type = DW_HDMI_PHY_DWC_HDMI_TX_PHY,
> @@ -3396,6 +3454,9 @@ struct dw_hdmi *dw_hdmi_probe(struct platform_device *pdev,
> goto err_res;
> }
>
> + INIT_DELAYED_WORK(&hdmi->hpd_work, dw_hdmi_hpd_work);
> + disable_delayed_work(&hdmi->hpd_work);
> +
> ret = devm_request_threaded_irq(dev, irq, dw_hdmi_hardirq,
> dw_hdmi_irq, IRQF_SHARED,
> dev_name(dev), hdmi);
> @@ -3532,6 +3593,18 @@ EXPORT_SYMBOL_GPL(dw_hdmi_probe);
>
> void dw_hdmi_remove(struct dw_hdmi *hdmi)
> {
> + struct platform_device *pdev = to_platform_device(hdmi->dev);
> + int irq = platform_get_irq(pdev, 0);
> +
> + /* Free, mute and clear phy interrupts */
> + devm_free_irq(hdmi->dev, irq, hdmi);
> + hdmi_writeb(hdmi, ~0, HDMI_IH_MUTE_PHY_STAT0);
> + hdmi_writeb(hdmi, HDMI_IH_PHY_STAT0_HPD | HDMI_IH_PHY_STAT0_RX_SENSE,
> + HDMI_IH_PHY_STAT0);
> +
> + /* Cancel any pending hot plug work */
> + cancel_delayed_work_sync(&hdmi->hpd_work);
> +
> drm_bridge_remove(&hdmi->bridge);
>
> if (hdmi->audio && !IS_ERR(hdmi->audio))
> @@ -3539,9 +3612,6 @@ void dw_hdmi_remove(struct dw_hdmi *hdmi)
> if (!IS_ERR(hdmi->cec))
> platform_device_unregister(hdmi->cec);
>
> - /* Disable all interrupts */
> - hdmi_writeb(hdmi, ~0, HDMI_IH_MUTE_PHY_STAT0);
> -
> if (hdmi->i2c)
> i2c_del_adapter(&hdmi->i2c->adap);
> else
^ permalink raw reply
* Re: [PATCH v7 20/23] drm: bridge: dw_hdmi: Rework HDP and RXSENSE interrupt handling
From: Neil Armstrong @ 2026-05-20 9:59 UTC (permalink / raw)
To: Jonas Karlman, Andrzej Hajda, Robert Foss, Heiko Stuebner,
Laurent Pinchart, Jernej Skrabec, Luca Ceresoli,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: Liu Ying, Sandy Huang, Andy Yan, Chen-Yu Tsai, Christian Hewitt,
Diederik de Haas, Nicolas Frattaroli, Dmitry Baryshkov, dri-devel,
linux-arm-kernel, linux-rockchip, linux-amlogic, linux-sunxi, imx,
linux-kernel
In-Reply-To: <20260518180206.2480119-21-jonas@kwiboo.se>
On 5/18/26 20:01, Jonas Karlman wrote:
> The commit aeac23bda87f ("drm: bridge/dw_hdmi: improve HDMI
> enable/disable handling") added use of PHY RXSENSE indications to avoid
> triggering a full enable/disable of the HDMI block when a sink use a HPD
> low voltage level pulse to indicate changes of the EDID.
>
> HDMI Specification Version 1.4b chapter 8.5 mentions:
>
> An HDMI Sink shall indicate any change to the contents of the E-EDID
> by driving a low voltage level pulse on the Hot Plug Detect pin. This
> pulse shall be at least 100 msec.
>
> A delayed work is now used to debounce reacting on a HPD low voltage
> level pulse when a sink changes the EDID. The delayed work triggers a
> hotplug uevent every time the connection status or EDID has changed.
>
> Remove RXSENSE handling to simplify the HPD interrupt handling and
> instead depend on the delayed work to detect any connection status or
> EDID changes.
>
> This also ensures the initial HPD interrupt polarity is based on current
> HPD status to avoid an unnecessary interrupt from being triggered
> immediately at probe or resume when a sink is connected.
I'm still puzzled by the complete removal of RX_SENSE, as in v1, and since the
rx_sense code is not easy to understand I don't have an opinion on that.
Can someone with more knowledge comment on that?
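For reference, this is how I read the HPD-only polarity handling that
remains, based on the "interrupt = (mask == 0) && (polarity == status)"
comment in the patch. It is a userspace model with hypothetical helper names,
not driver code, and it only covers the single HPD bit.

```c
#include <stdbool.h>

/* One bit of the PHY interrupt logic, as described in the patch comment. */
static bool hpd_irq_asserted(bool mask, bool polarity, bool status)
{
	return !mask && polarity == status;
}

/*
 * Polarity programmed for a given current HPD level: the opposite level,
 * so that the interrupt only asserts on the next change of HPD.
 */
static bool hpd_next_polarity(bool hpd_high)
{
	return !hpd_high;
}
```

With a sink connected (HPD high) the polarity is set low, so nothing fires
until HPD drops; this is what avoids the spurious interrupt at probe or
resume mentioned in the commit message.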
Neil
>
> Tested-by: Diederik de Haas <diederik@cknow-tech.com> # Rock64, RockPro64, Quartz64-B
> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> ---
> v7: Remove clear of STAT0_RX_SENSE in dw_hdmi_remove() added in prior
> patch
> v6: Update commit message,
> Collect t-b tag
> v5: Add comment about interrupt generation
> v4: New patch
> ---
> drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 147 ++++------------------
> 1 file changed, 22 insertions(+), 125 deletions(-)
>
> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
> index 270db58a0e7c..2e09bff5faf7 100644
> --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
> +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
> @@ -161,11 +161,7 @@ struct dw_hdmi {
> struct pinctrl_state *unwedge_state;
>
> struct mutex mutex; /* for state below */
> - enum drm_connector_force force; /* mutex-protected force state */
> struct drm_connector *curr_conn;/* current connector (only valid when !disabled) */
> - bool disabled; /* DRM has disabled our bridge */
> - bool rxsense; /* rxsense state */
> - u8 phy_mask; /* desired phy int mask settings */
> u8 mc_clkdis; /* clock disable register */
>
> spinlock_t audio_lock;
> @@ -196,14 +192,6 @@ const struct dw_hdmi_plat_data *dw_hdmi_to_plat_data(struct dw_hdmi *hdmi)
> }
> EXPORT_SYMBOL_GPL(dw_hdmi_to_plat_data);
>
> -#define HDMI_IH_PHY_STAT0_RX_SENSE \
> - (HDMI_IH_PHY_STAT0_RX_SENSE0 | HDMI_IH_PHY_STAT0_RX_SENSE1 | \
> - HDMI_IH_PHY_STAT0_RX_SENSE2 | HDMI_IH_PHY_STAT0_RX_SENSE3)
> -
> -#define HDMI_PHY_RX_SENSE \
> - (HDMI_PHY_RX_SENSE0 | HDMI_PHY_RX_SENSE1 | \
> - HDMI_PHY_RX_SENSE2 | HDMI_PHY_RX_SENSE3)
> -
> static inline void hdmi_writeb(struct dw_hdmi *hdmi, u8 val, int offset)
> {
> regmap_write(hdmi->regm, offset << hdmi->reg_shift, val);
> @@ -1702,36 +1690,25 @@ EXPORT_SYMBOL_GPL(dw_hdmi_phy_read_hpd);
> void dw_hdmi_phy_update_hpd(struct dw_hdmi *hdmi, void *data,
> bool force, bool disabled, bool rxsense)
> {
> - u8 old_mask = hdmi->phy_mask;
> -
> - if (force || disabled || !rxsense)
> - hdmi->phy_mask |= HDMI_PHY_RX_SENSE;
> - else
> - hdmi->phy_mask &= ~HDMI_PHY_RX_SENSE;
> -
> - if (old_mask != hdmi->phy_mask)
> - hdmi_writeb(hdmi, hdmi->phy_mask, HDMI_PHY_MASK0);
> }
> EXPORT_SYMBOL_GPL(dw_hdmi_phy_update_hpd);
>
> void dw_hdmi_phy_setup_hpd(struct dw_hdmi *hdmi, void *data)
> {
> /*
> - * Configure the PHY RX SENSE and HPD interrupts polarities and clear
> - * any pending interrupt.
> + * Configure the PHY HPD interrupt polarity based on current HPD status
> + * and clear any pending interrupt.
> */
> - hdmi_writeb(hdmi, HDMI_PHY_HPD | HDMI_PHY_RX_SENSE, HDMI_PHY_POL0);
> - hdmi_writeb(hdmi, HDMI_IH_PHY_STAT0_HPD | HDMI_IH_PHY_STAT0_RX_SENSE,
> - HDMI_IH_PHY_STAT0);
> + hdmi_modb(hdmi, hdmi_readb(hdmi, HDMI_PHY_STAT0) & HDMI_PHY_HPD ?
> + 0 : HDMI_PHY_HPD, HDMI_PHY_HPD, HDMI_PHY_POL0);
> + hdmi_writeb(hdmi, HDMI_IH_PHY_STAT0_HPD, HDMI_IH_PHY_STAT0);
>
> /* Enable cable hot plug irq. */
> - hdmi_writeb(hdmi, hdmi->phy_mask, HDMI_PHY_MASK0);
> + hdmi_writeb(hdmi, ~HDMI_PHY_HPD, HDMI_PHY_MASK0);
>
> /* Clear and unmute interrupts. */
> - hdmi_writeb(hdmi, HDMI_IH_PHY_STAT0_HPD | HDMI_IH_PHY_STAT0_RX_SENSE,
> - HDMI_IH_PHY_STAT0);
> - hdmi_writeb(hdmi, ~(HDMI_IH_PHY_STAT0_HPD | HDMI_IH_PHY_STAT0_RX_SENSE),
> - HDMI_IH_MUTE_PHY_STAT0);
> + hdmi_writeb(hdmi, HDMI_IH_PHY_STAT0_HPD, HDMI_IH_PHY_STAT0);
> + hdmi_writeb(hdmi, ~HDMI_IH_PHY_STAT0_HPD, HDMI_IH_MUTE_PHY_STAT0);
> }
> EXPORT_SYMBOL_GPL(dw_hdmi_phy_setup_hpd);
>
> @@ -2395,26 +2372,6 @@ static void dw_hdmi_poweroff(struct dw_hdmi *hdmi)
> }
> }
>
> -/*
> - * Adjust the detection of RXSENSE according to whether we have a forced
> - * connection mode enabled, or whether we have been disabled. There is
> - * no point processing RXSENSE interrupts if we have a forced connection
> - * state, or DRM has us disabled.
> - *
> - * We also disable rxsense interrupts when we think we're disconnected
> - * to avoid floating TDMS signals giving false rxsense interrupts.
> - *
> - * Note: we still need to listen for HPD interrupts even when DRM has us
> - * disabled so that we can detect a connect event.
> - */
> -static void dw_hdmi_update_phy_mask(struct dw_hdmi *hdmi)
> -{
> - if (hdmi->phy.ops->update_hpd)
> - hdmi->phy.ops->update_hpd(hdmi, hdmi->phy.data,
> - hdmi->force, hdmi->disabled,
> - hdmi->rxsense);
> -}
> -
> static enum drm_connector_status dw_hdmi_detect(struct dw_hdmi *hdmi)
> {
> enum drm_connector_status result;
> @@ -2512,9 +2469,7 @@ static void dw_hdmi_connector_force(struct drm_connector *connector)
> struct dw_hdmi *hdmi = container_of(connector, struct dw_hdmi, connector);
>
> mutex_lock(&hdmi->mutex);
> - hdmi->force = connector->force;
> hdmi->last_connector_result = connector->status;
> - dw_hdmi_update_phy_mask(hdmi);
> mutex_unlock(&hdmi->mutex);
>
> dw_hdmi_connector_status_update(hdmi, connector, connector->status);
> @@ -2932,10 +2887,8 @@ static void dw_hdmi_bridge_atomic_disable(struct drm_bridge *bridge,
> struct dw_hdmi *hdmi = bridge->driver_private;
>
> mutex_lock(&hdmi->mutex);
> - hdmi->disabled = true;
> hdmi->curr_conn = NULL;
> dw_hdmi_poweroff(hdmi);
> - dw_hdmi_update_phy_mask(hdmi);
> handle_plugged_change(hdmi, false);
> mutex_unlock(&hdmi->mutex);
> }
> @@ -2954,10 +2907,8 @@ static void dw_hdmi_bridge_atomic_enable(struct drm_bridge *bridge,
> mode = &drm_atomic_get_new_crtc_state(state, crtc)->adjusted_mode;
>
> mutex_lock(&hdmi->mutex);
> - hdmi->disabled = false;
> hdmi->curr_conn = connector;
> dw_hdmi_poweron(hdmi, connector, mode);
> - dw_hdmi_update_phy_mask(hdmi);
> handle_plugged_change(hdmi, true);
> mutex_unlock(&hdmi->mutex);
> }
> @@ -3060,78 +3011,29 @@ static irqreturn_t dw_hdmi_hardirq(int irq, void *dev_id)
>
> void dw_hdmi_setup_rx_sense(struct dw_hdmi *hdmi, bool hpd, bool rx_sense)
> {
> - mutex_lock(&hdmi->mutex);
> -
> - if (!hdmi->force) {
> - /*
> - * If the RX sense status indicates we're disconnected,
> - * clear the software rxsense status.
> - */
> - if (!rx_sense)
> - hdmi->rxsense = false;
> -
> - /*
> - * Only set the software rxsense status when both
> - * rxsense and hpd indicates we're connected.
> - * This avoids what seems to be bad behaviour in
> - * at least iMX6S versions of the phy.
> - */
> - if (hpd)
> - hdmi->rxsense = true;
> -
> - dw_hdmi_update_phy_mask(hdmi);
> - }
> - mutex_unlock(&hdmi->mutex);
> }
> EXPORT_SYMBOL_GPL(dw_hdmi_setup_rx_sense);
>
> static irqreturn_t dw_hdmi_irq(int irq, void *dev_id)
> {
> struct dw_hdmi *hdmi = dev_id;
> - u8 intr_stat, phy_int_pol, phy_pol_mask, phy_stat;
> - enum drm_connector_status status = connector_status_unknown;
> -
> - intr_stat = hdmi_readb(hdmi, HDMI_IH_PHY_STAT0);
> - phy_int_pol = hdmi_readb(hdmi, HDMI_PHY_POL0);
> - phy_stat = hdmi_readb(hdmi, HDMI_PHY_STAT0);
> -
> - phy_pol_mask = 0;
> - if (intr_stat & HDMI_IH_PHY_STAT0_HPD)
> - phy_pol_mask |= HDMI_PHY_HPD;
> - if (intr_stat & HDMI_IH_PHY_STAT0_RX_SENSE0)
> - phy_pol_mask |= HDMI_PHY_RX_SENSE0;
> - if (intr_stat & HDMI_IH_PHY_STAT0_RX_SENSE1)
> - phy_pol_mask |= HDMI_PHY_RX_SENSE1;
> - if (intr_stat & HDMI_IH_PHY_STAT0_RX_SENSE2)
> - phy_pol_mask |= HDMI_PHY_RX_SENSE2;
> - if (intr_stat & HDMI_IH_PHY_STAT0_RX_SENSE3)
> - phy_pol_mask |= HDMI_PHY_RX_SENSE3;
> -
> - if (phy_pol_mask)
> - hdmi_modb(hdmi, ~phy_int_pol, phy_pol_mask, HDMI_PHY_POL0);
> + u8 intr_stat;
>
> /*
> - * RX sense tells us whether the TDMS transmitters are detecting
> - * load - in other words, there's something listening on the
> - * other end of the link. Use this to decide whether we should
> - * power on the phy as HPD may be toggled by the sink to merely
> - * ask the source to re-read the EDID.
> + * Interrupt generation is accomplished in the following way:
> + * interrupt = (mask == 0) && (polarity == status)
> + * All interrupts are forwarded to the Interrupt Handler sticky bit
> + * register ih_phy_stat0 and muted using the register ih_mute_phy_stat0.
> */
> - if (intr_stat &
> - (HDMI_IH_PHY_STAT0_RX_SENSE | HDMI_IH_PHY_STAT0_HPD)) {
> - dw_hdmi_setup_rx_sense(hdmi,
> - phy_stat & HDMI_PHY_HPD,
> - phy_stat & HDMI_PHY_RX_SENSE);
> + intr_stat = hdmi_readb(hdmi, HDMI_IH_PHY_STAT0);
> + if (intr_stat & HDMI_IH_PHY_STAT0_HPD) {
> + enum drm_connector_status status;
>
> - if ((intr_stat & HDMI_IH_PHY_STAT0_HPD) &&
> - (phy_stat & HDMI_PHY_HPD))
> - status = connector_status_connected;
> + /* Set HPD interrupt polarity based on current HPD status. */
> + status = dw_hdmi_phy_read_hpd(hdmi, hdmi->phy.data);
> + hdmi_modb(hdmi, status == connector_status_connected ?
> + 0 : HDMI_PHY_HPD, HDMI_PHY_HPD, HDMI_PHY_POL0);
>
> - if (!(phy_stat & (HDMI_PHY_HPD | HDMI_PHY_RX_SENSE)))
> - status = connector_status_disconnected;
> - }
> -
> - if (status != connector_status_unknown) {
> dev_dbg(hdmi->dev, "EVENT=%s\n",
> status == connector_status_connected ?
> "plugin" : "plugout");
> @@ -3141,8 +3043,7 @@ static irqreturn_t dw_hdmi_irq(int irq, void *dev_id)
> }
>
> hdmi_writeb(hdmi, intr_stat, HDMI_IH_PHY_STAT0);
> - hdmi_writeb(hdmi, ~(HDMI_IH_PHY_STAT0_HPD | HDMI_IH_PHY_STAT0_RX_SENSE),
> - HDMI_IH_MUTE_PHY_STAT0);
> + hdmi_writeb(hdmi, ~HDMI_IH_PHY_STAT0_HPD, HDMI_IH_MUTE_PHY_STAT0);
>
> return IRQ_HANDLED;
> }
> @@ -3343,9 +3244,6 @@ struct dw_hdmi *dw_hdmi_probe(struct platform_device *pdev,
> hdmi->dev = dev;
> hdmi->sample_rate = 48000;
> hdmi->channels = 2;
> - hdmi->disabled = true;
> - hdmi->rxsense = true;
> - hdmi->phy_mask = (u8)~(HDMI_PHY_HPD | HDMI_PHY_RX_SENSE);
> hdmi->mc_clkdis = 0x7f;
> hdmi->last_connector_result = connector_status_disconnected;
>
> @@ -3599,8 +3497,7 @@ void dw_hdmi_remove(struct dw_hdmi *hdmi)
> /* Free, mute and clear phy interrupts */
> devm_free_irq(hdmi->dev, irq, hdmi);
> hdmi_writeb(hdmi, ~0, HDMI_IH_MUTE_PHY_STAT0);
> - hdmi_writeb(hdmi, HDMI_IH_PHY_STAT0_HPD | HDMI_IH_PHY_STAT0_RX_SENSE,
> - HDMI_IH_PHY_STAT0);
> + hdmi_writeb(hdmi, HDMI_IH_PHY_STAT0_HPD, HDMI_IH_PHY_STAT0);
>
> /* Cancel any pending hot plug work */
> cancel_delayed_work_sync(&hdmi->hpd_work);
^ permalink raw reply
* Re: [PATCH net-next v2 2/2] net: ti: icssg: Add HSR and LRE PA statistics
From: MD Danish Anwar @ 2026-05-20 10:00 UTC (permalink / raw)
To: Jakub Kicinski, Luka Gejak
Cc: Felix Maurer, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Jonathan Corbet, Shuah Khan, Roger Quadros,
Andrew Lunn, Meghana Malladi, Jacob Keller, David Carlier,
Vadim Fedorenko, Kevin Hao, netdev, linux-doc, linux-kernel,
linux-arm-kernel, Vladimir Oltean
In-Reply-To: <20260519165646.09b0783f@kernel.org>
Hi Jakub,
On 20/05/26 5:26 am, Jakub Kicinski wrote:
> On Tue, 19 May 2026 07:55:55 +0200 Luka Gejak wrote:
>> On May 19, 2026 3:45:06 AM GMT+02:00, Jakub Kicinski <kuba@kernel.org> wrote:
>>> On Thu, 14 May 2026 13:26:05 +0530 MD Danish Anwar wrote:
>>>> Add new firmware PA statistics counters for HSR and LRE to the ethtool
>>>> statistics exposed by the ICSSG driver.
>>>>
>>>> New statistics added:
>>>> - FW_HSR_FWD_CHECK_FAIL_DROP: Packets dropped on the HSR forwarding path
>>>> - FW_HSR_HE_CHECK_FAIL_DROP: Packets dropped on the HSR host egress path
>>>> - FW_HSR_SKIP_HOST_DUP_DISCARD_FRAMES: Frames with duplicate discard
>>>> skipped
>>>> - FW_LRE_CNT_UNIQUE/DUPLICATE/MULTIPLE_RX: LRE duplicate detection
>>>> counters
>>>> - FW_LRE_CNT_RX/TX: LRE per-port frame counters
>>>> - FW_LRE_CNT_OWN_RX: Own HSR tagged frames received
>>>> - FW_LRE_CNT_ERRWRONGLAN: Frames with wrong LAN identifier (PRP)
>>>>
>>>> Document the new HSR/LRE statistics in icssg_prueth.rst.
>>>
>>> To an untrained eye these stats look like stuff that could
>>> be standardized across drivers.
>>>
>>> Luka, Felix, others on CC, do you think we should expose these
>>> from HSR over netlink as "standard" offload stats different drivers
>>> can plug into or not worth it?
>>
>> I think there is a case for standardizing part of this, but I would
>> not standardize the whole set as-is.
>>
>> The LRE counters look generic enough to me, especially:
>> - unique rx
>> - duplicate rx
>> - multiple rx
>> - rx / tx
>> - own rx
>> - wrong LAN, PRP only
>>
>> Those are protocol/LRE concepts rather than TI firmware details, so
>> exposing them from the HSR/PRP layer sounds useful. I would expect
>> both the software implementation and offloaded implementations to be
>> able to provide at least some of them, with unsupported counters
>> omitted or reported as not available.
>> I would not put the firmware check/drop counters in the same standard
>> bucket, though:
>> - FW_HSR_FWD_CHECK_FAIL_DROP
>> - FW_HSR_HE_CHECK_FAIL_DROP
>> - FW_HSR_SKIP_HOST_DUP_DISCARD_FRAMES
>
> Thanks for the breakdown!
>
>> Those sound more like implementation/debug counters for the ICSSG
>> firmware pipeline. They are still useful in ethtool driver stats, but
>> I would be hesitant to bake their exact semantics into HSR UAPI.
>> So my preference would be:
>> 1. Keep driver-private ethtool stats for the full firmware counter set.
>> 2. Add a small HSR/PRP standard stats set separately, limited to
>> well-defined LRE counters.
>> 3. Make the HSR layer expose them, with offload drivers plugging in via
>> an optional callback or offload stats op.
>> 4. Define the counters carefully, including whether they are per-HSR
>> device or per-port A/B, and what PRP-only counters mean for HSR.
>>
>> I do not think this patch should blindly become the UAPI definition,
>
> Not at all, the unique / multiple stats gave me pause. We should
> only put in the standard API what can be easily and unambiguously
> defined given the protocol spec.
>
>> but I do think it points at a useful follow-up. If we want to avoid
>> adding driver-private names first and then standardizing different
>> names later, then it may be worth asking Danish to split the
>> protocol-level LRE counters out and route those through a common HSR
>> stats interface.
>
> As a general policy we ask for standard stats to be added first and
> ethtool to only contain what didn't fit in the standard ones.
> There are some technical reasons but it's mostly a mindset thing.
What should be the next steps here? Is there an existing, defined set of
stats that I could populate from the ICSSG firmware for HSR (similar to the
ndo_get_stats64 callback), or do we need to implement a new callback
that will do this for HSR?
I agree with Luka on the categorization.
The stats below can be generic:
- unique rx
- duplicate rx
- multiple rx
- rx / tx
- own rx
- wrong LAN, PRP only
The stats below can be driver specific and can be pulled using `ethtool -S`
on the child interfaces of HSR:
- FW_HSR_FWD_CHECK_FAIL_DROP
- FW_HSR_HE_CHECK_FAIL_DROP
- FW_HSR_SKIP_HOST_DUP_DISCARD_FRAMES
Let me know if I should go ahead and implement this.
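To make the question concrete, here is a rough sketch of the kind of callback
I have in mind. All names below are hypothetical and nothing here is an
existing kernel API or UAPI; it is plain C so that it can be compiled outside
the kernel. Counters a device cannot provide are reported as "not available".

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical "standard" LRE counter set; names are illustrative only. */
struct lre_std_stats {
	uint64_t cnt_unique_rx;
	uint64_t cnt_duplicate_rx;
	uint64_t cnt_multiple_rx;
	uint64_t cnt_rx;
	uint64_t cnt_tx;
	uint64_t cnt_own_rx;
	uint64_t cnt_err_wrong_lan;	/* PRP only */
};

#define LRE_STAT_NOT_AVAIL UINT64_MAX

/* Hypothetical optional op that an offload driver could implement. */
struct lre_offload_ops {
	void (*get_lre_stats)(void *priv, struct lre_std_stats *stats);
};

/*
 * Core-side helper: mark every counter as not available (all bits set),
 * then let the driver overwrite the ones it supports.
 */
static void lre_collect_stats(const struct lre_offload_ops *ops, void *priv,
			      struct lre_std_stats *stats)
{
	memset(stats, 0xff, sizeof(*stats));
	if (ops && ops->get_lre_stats)
		ops->get_lre_stats(priv, stats);
}

/*
 * Example driver callback: priv stands in for firmware counters, and the
 * PRP-only counter is left untouched by this HSR-only device.
 */
static void demo_get_lre_stats(void *priv, struct lre_std_stats *stats)
{
	const uint64_t *fw = priv;

	stats->cnt_unique_rx = fw[0];
	stats->cnt_duplicate_rx = fw[1];
}
```

The firmware check/drop counters would stay out of this structure and remain
in the driver's `ethtool -S` output.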
--
Thanks and Regards,
Danish
^ permalink raw reply
* [PATCH v3 3/6] KVM: arm64: timer: Kill the per-timer irq level cache
From: Marc Zyngier @ 2026-05-20 10:01 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: Deepanshu Kartikey, Steffen Eiden, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu
In-Reply-To: <20260520100200.543845-1-maz@kernel.org>
The timer code makes use of a per-timer irq level cache, which
looks like a very minor optimisation to avoid taking a lock upon
updating the GIC view of the interrupt when it is unchanged from
the previous state.
This is coming in the way of more important correctness issues,
so get rid of the cache, which simplifies a couple of minor things.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/arch_timer.c | 20 +++++++++-----------
include/kvm/arm_arch_timer.h | 5 -----
2 files changed, 9 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 7236dd6a99e67..c3b8257888e89 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -453,9 +453,8 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
{
kvm_timer_update_status(timer_ctx, new_level);
- timer_ctx->irq.level = new_level;
trace_kvm_timer_update_irq(vcpu->vcpu_id, timer_irq(timer_ctx),
- timer_ctx->irq.level);
+ new_level);
if (userspace_irqchip(vcpu->kvm))
return;
@@ -473,7 +472,7 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
kvm_vgic_inject_irq(vcpu->kvm, vcpu,
timer_irq(timer_ctx),
- timer_ctx->irq.level,
+ new_level,
timer_ctx);
}
@@ -484,10 +483,7 @@ static void timer_emulate(struct arch_timer_context *ctx)
trace_kvm_timer_emulate(ctx, pending);
- if (pending != ctx->irq.level)
- kvm_timer_update_irq(timer_context_to_vcpu(ctx), pending, ctx);
-
- kvm_timer_update_status(ctx, pending);
+ kvm_timer_update_irq(timer_context_to_vcpu(ctx), pending, ctx);
/*
* If the timer is pending, we don't need to have a soft timer
@@ -684,6 +680,7 @@ static inline void set_timer_irq_phys_active(struct arch_timer_context *ctx, boo
static void kvm_timer_vcpu_load_gic(struct arch_timer_context *ctx)
{
struct kvm_vcpu *vcpu = timer_context_to_vcpu(ctx);
+ bool pending = kvm_timer_pending(ctx);
bool phys_active = false;
/*
@@ -692,12 +689,12 @@ static void kvm_timer_vcpu_load_gic(struct arch_timer_context *ctx)
* this point and the register restoration, we'll take the
* interrupt anyway.
*/
- kvm_timer_update_irq(vcpu, kvm_timer_pending(ctx), ctx);
+ kvm_timer_update_irq(vcpu, pending, ctx);
if (irqchip_in_kernel(vcpu->kvm))
phys_active = kvm_vgic_map_is_active(vcpu, timer_irq(ctx));
- phys_active |= ctx->irq.level;
+ phys_active |= pending;
phys_active |= vgic_is_v5(vcpu->kvm);
set_timer_irq_phys_active(ctx, phys_active);
@@ -706,6 +703,7 @@ static void kvm_timer_vcpu_load_gic(struct arch_timer_context *ctx)
static void kvm_timer_vcpu_load_nogic(struct kvm_vcpu *vcpu)
{
struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
+ bool pending = kvm_timer_pending(vtimer);
/*
* Update the timer output so that it is likely to match the
@@ -713,7 +711,7 @@ static void kvm_timer_vcpu_load_nogic(struct kvm_vcpu *vcpu)
* this point and the register restoration, we'll take the
* interrupt anyway.
*/
- kvm_timer_update_irq(vcpu, kvm_timer_pending(vtimer), vtimer);
+ kvm_timer_update_irq(vcpu, pending, vtimer);
/*
* When using a userspace irqchip with the architected timers and a
@@ -725,7 +723,7 @@ static void kvm_timer_vcpu_load_nogic(struct kvm_vcpu *vcpu)
* being de-asserted, we unmask the interrupt again so that we exit
* from the guest when the timer fires.
*/
- if (vtimer->irq.level)
+ if (pending)
disable_percpu_irq(host_vtimer_irq);
else
enable_percpu_irq(host_vtimer_irq, host_vtimer_irq_flags);
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index 9e4076eebd29f..15a4f97f81051 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -66,11 +66,6 @@ struct arch_timer_context {
*/
bool loaded;
- /* Output level of the timer IRQ */
- struct {
- bool level;
- } irq;
-
/* Who am I? */
enum kvm_arch_timers timer_id;
--
2.47.3
* [PATCH v3 6/6] KVM: arm64: vgic-v2: Don't init the vgic on in-kernel interrupt injection
From: Marc Zyngier @ 2026-05-20 10:02 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: Deepanshu Kartikey, Steffen Eiden, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu
In-Reply-To: <20260520100200.543845-1-maz@kernel.org>
We now have the lazy init on three paths:
- on first run of a vcpu
- on first injection of an interrupt from userspace and irqfd
- on first injection of an interrupt from kernel space as
part of the device emulation (timers, PMU, vgic MI)
Given that we recompute the state of each in-kernel interrupt
every time we are about to enter the guest, we can drop the lazy
init from the kernel injection path.
This solves a bunch of issues related to vgic_lazy_init() being called
in non-preemptible context, such as vcpu reset.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
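Note (not part of the patch): a minimal userspace sketch of the behaviour
described above, assuming that a kernel-side injection arriving before the
vgic is initialised can be dropped because the device state is recomputed
on guest entry. All toy_* names are hypothetical stand-ins, not kernel API.

```c
#include <stdbool.h>

/* Hypothetical model of the vgic state; not the kernel structure. */
struct toy_vgic {
	bool initialized;
	bool line_level;	/* level latched by the toy vgic */
};

/* Kernel-side injection: never initialises, drops the update if not ready. */
static int toy_kernel_inject(struct toy_vgic *vgic, bool level)
{
	if (!vgic->initialized)
		return 0;

	vgic->line_level = level;
	return 0;
}

/* Guest entry: init happens here, then the device state is injected again. */
static void toy_enter_guest(struct toy_vgic *vgic, bool timer_pending)
{
	/* Models the lazy init performed on first run of a vcpu. */
	vgic->initialized = true;

	/* The in-kernel device level is recomputed on every entry. */
	toy_kernel_inject(vgic, timer_pending);
}
```

In this model, an early toy_kernel_inject() is a no-op that returns 0, and
the level still reaches the toy vgic on the next toy_enter_guest().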
arch/arm64/kvm/vgic/vgic.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 1e9fe8764584d..9e29f03d3463c 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -534,11 +534,9 @@ int kvm_vgic_inject_irq(struct kvm *kvm, struct kvm_vcpu *vcpu,
{
struct vgic_irq *irq;
unsigned long flags;
- int ret;
- ret = vgic_lazy_init(kvm);
- if (ret)
- return ret;
+ if (unlikely(!vgic_initialized(kvm)))
+ return 0;
if (!vcpu && irq_is_private(kvm, intid))
return -EINVAL;
--
2.47.3
* [PATCH v3 1/6] KVM: arm64: timer: Repaint kvm_timer_{should,irq_can}_fire() to kvm_timer_{pending,enabled}()
From: Marc Zyngier @ 2026-05-20 10:01 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: Deepanshu Kartikey, Steffen Eiden, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu
In-Reply-To: <20260520100200.543845-1-maz@kernel.org>
kvm_timer_should_fire() seems to date back to a time where the author
of the timer code didn't seem to have made the word "pending" part of
their vocabulary.
Having since slightly improved on that front, let's rename this predicate
to kvm_timer_pending(), which clearly indicates whether the timer
interrupt is pending or not.
Similarly, kvm_timer_irq_can_fire() is renamed to kvm_timer_enabled().
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
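Note (not part of the patch): a minimal userspace sketch of what the two
renamed predicates express, assuming the usual timer control layout (ENABLE
in bit 0, IMASK in bit 1) and covering only the emulated, not-loaded case.
The toy_* names and the structure are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed control register layout: bit 0 = ENABLE, bit 1 = IMASK. */
#define TOY_CTRL_ENABLE		(1u << 0)
#define TOY_CTRL_IT_MASK	(1u << 1)

/* Hypothetical, simplified timer context (not the kernel structure). */
struct toy_timer_ctx {
	uint32_t ctl;	/* control register value */
	uint64_t cval;	/* compare value */
};

/* "enabled": the timer is able to raise its interrupt at some point. */
static bool toy_timer_enabled(const struct toy_timer_ctx *ctx)
{
	return ctx &&
	       (ctx->ctl & (TOY_CTRL_ENABLE | TOY_CTRL_IT_MASK)) == TOY_CTRL_ENABLE;
}

/* "pending": enabled, and the counter has reached the compare value. */
static bool toy_timer_pending(const struct toy_timer_ctx *ctx, uint64_t now)
{
	if (!toy_timer_enabled(ctx))
		return false;

	return ctx->cval <= now;
}
```

A masked or disabled timer is never pending in this model, which is why the
soft timer is only needed when the timer is enabled but not yet pending.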
arch/arm64/kvm/arch_timer.c | 55 ++++++++++++++++++-------------------
1 file changed, 27 insertions(+), 28 deletions(-)
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index cbea4d9ee9552..d8add34717f07 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -39,10 +39,9 @@ static const u8 default_ppi[] = {
[TIMER_HVTIMER] = 28,
};
-static bool kvm_timer_irq_can_fire(struct arch_timer_context *timer_ctx);
static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
struct arch_timer_context *timer_ctx);
-static bool kvm_timer_should_fire(struct arch_timer_context *timer_ctx);
+static bool kvm_timer_pending(struct arch_timer_context *timer_ctx);
static void kvm_arm_timer_write(struct kvm_vcpu *vcpu,
struct arch_timer_context *timer,
enum kvm_arch_timer_regs treg,
@@ -224,7 +223,7 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
else
ctx = map.direct_ptimer;
- if (kvm_timer_should_fire(ctx))
+ if (kvm_timer_pending(ctx))
kvm_timer_update_irq(vcpu, true, ctx);
if (userspace_irqchip(vcpu->kvm) &&
@@ -257,7 +256,7 @@ static u64 kvm_timer_compute_delta(struct arch_timer_context *timer_ctx)
return kvm_counter_compute_delta(timer_ctx, timer_get_cval(timer_ctx));
}
-static bool kvm_timer_irq_can_fire(struct arch_timer_context *timer_ctx)
+static bool kvm_timer_enabled(struct arch_timer_context *timer_ctx)
{
WARN_ON(timer_ctx && timer_ctx->loaded);
return timer_ctx &&
@@ -294,7 +293,7 @@ static u64 kvm_timer_earliest_exp(struct kvm_vcpu *vcpu)
struct arch_timer_context *ctx = &vcpu->arch.timer_cpu.timers[i];
WARN(ctx->loaded, "timer %d loaded\n", i);
- if (kvm_timer_irq_can_fire(ctx))
+ if (kvm_timer_enabled(ctx))
min_delta = min(min_delta, kvm_timer_compute_delta(ctx));
}
@@ -358,7 +357,7 @@ static enum hrtimer_restart kvm_hrtimer_expire(struct hrtimer *hrt)
return HRTIMER_NORESTART;
}
-static bool kvm_timer_should_fire(struct arch_timer_context *timer_ctx)
+static bool kvm_timer_pending(struct arch_timer_context *timer_ctx)
{
enum kvm_arch_timers index;
u64 cval, now;
@@ -391,7 +390,7 @@ static bool kvm_timer_should_fire(struct arch_timer_context *timer_ctx)
!(cnt_ctl & ARCH_TIMER_CTRL_IT_MASK);
}
- if (!kvm_timer_irq_can_fire(timer_ctx))
+ if (!kvm_timer_enabled(timer_ctx))
return false;
cval = timer_get_cval(timer_ctx);
@@ -417,9 +416,9 @@ void kvm_timer_update_run(struct kvm_vcpu *vcpu)
/* Populate the device bitmap with the timer states */
regs->device_irq_level &= ~(KVM_ARM_DEV_EL1_VTIMER |
KVM_ARM_DEV_EL1_PTIMER);
- if (kvm_timer_should_fire(vtimer))
+ if (kvm_timer_pending(vtimer))
regs->device_irq_level |= KVM_ARM_DEV_EL1_VTIMER;
- if (kvm_timer_should_fire(ptimer))
+ if (kvm_timer_pending(ptimer))
regs->device_irq_level |= KVM_ARM_DEV_EL1_PTIMER;
}
@@ -473,21 +472,21 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
/* Only called for a fully emulated timer */
static void timer_emulate(struct arch_timer_context *ctx)
{
- bool should_fire = kvm_timer_should_fire(ctx);
+ bool pending = kvm_timer_pending(ctx);
- trace_kvm_timer_emulate(ctx, should_fire);
+ trace_kvm_timer_emulate(ctx, pending);
- if (should_fire != ctx->irq.level)
- kvm_timer_update_irq(timer_context_to_vcpu(ctx), should_fire, ctx);
+ if (pending != ctx->irq.level)
+ kvm_timer_update_irq(timer_context_to_vcpu(ctx), pending, ctx);
- kvm_timer_update_status(ctx, should_fire);
+ kvm_timer_update_status(ctx, pending);
/*
- * If the timer can fire now, we don't need to have a soft timer
- * scheduled for the future. If the timer cannot fire at all,
- * then we also don't need a soft timer.
+ * If the timer is pending, we don't need to have a soft timer
+ * scheduled for the future. If the timer is disabled, then
+ * we don't need a soft timer either.
*/
- if (should_fire || !kvm_timer_irq_can_fire(ctx))
+ if (pending || !kvm_timer_enabled(ctx))
return;
soft_timer_start(&ctx->hrtimer, kvm_timer_compute_delta(ctx));
@@ -594,10 +593,10 @@ static void kvm_timer_blocking(struct kvm_vcpu *vcpu)
* If no timers are capable of raising interrupts (disabled or
* masked), then there's no more work for us to do.
*/
- if (!kvm_timer_irq_can_fire(map.direct_vtimer) &&
- !kvm_timer_irq_can_fire(map.direct_ptimer) &&
- !kvm_timer_irq_can_fire(map.emul_vtimer) &&
- !kvm_timer_irq_can_fire(map.emul_ptimer) &&
+ if (!kvm_timer_enabled(map.direct_vtimer) &&
+ !kvm_timer_enabled(map.direct_ptimer) &&
+ !kvm_timer_enabled(map.emul_vtimer) &&
+ !kvm_timer_enabled(map.emul_ptimer) &&
!vcpu_has_wfit_active(vcpu))
return;
@@ -685,7 +684,7 @@ static void kvm_timer_vcpu_load_gic(struct arch_timer_context *ctx)
* this point and the register restoration, we'll take the
* interrupt anyway.
*/
- kvm_timer_update_irq(vcpu, kvm_timer_should_fire(ctx), ctx);
+ kvm_timer_update_irq(vcpu, kvm_timer_pending(ctx), ctx);
if (irqchip_in_kernel(vcpu->kvm))
phys_active = kvm_vgic_map_is_active(vcpu, timer_irq(ctx));
@@ -706,7 +705,7 @@ static void kvm_timer_vcpu_load_nogic(struct kvm_vcpu *vcpu)
* this point and the register restoration, we'll take the
* interrupt anyway.
*/
- kvm_timer_update_irq(vcpu, kvm_timer_should_fire(vtimer), vtimer);
+ kvm_timer_update_irq(vcpu, kvm_timer_pending(vtimer), vtimer);
/*
* When using a userspace irqchip with the architected timers and a
@@ -917,8 +916,8 @@ bool kvm_timer_should_notify_user(struct kvm_vcpu *vcpu)
vlevel = sregs->device_irq_level & KVM_ARM_DEV_EL1_VTIMER;
plevel = sregs->device_irq_level & KVM_ARM_DEV_EL1_PTIMER;
- return kvm_timer_should_fire(vtimer) != vlevel ||
- kvm_timer_should_fire(ptimer) != plevel;
+ return kvm_timer_pending(vtimer) != vlevel ||
+ kvm_timer_pending(ptimer) != plevel;
}
void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
@@ -1006,7 +1005,7 @@ static void unmask_vtimer_irq_user(struct kvm_vcpu *vcpu)
{
struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
- if (!kvm_timer_should_fire(vtimer)) {
+ if (!kvm_timer_pending(vtimer)) {
kvm_timer_update_irq(vcpu, false, vtimer);
if (static_branch_likely(&has_gic_active_state))
set_timer_irq_phys_active(vtimer, false);
@@ -1579,7 +1578,7 @@ static bool kvm_arch_timer_get_input_level(int vintid)
ctx = vcpu_get_timer(vcpu, i);
if (timer_irq(ctx) == vintid)
- return kvm_timer_should_fire(ctx);
+ return kvm_timer_pending(ctx);
}
/* A timer IRQ has fired, but no matching timer was found? */
--
2.47.3
* [PATCH v3 5/6] KVM: arm64: vgic-v2: Force vgic init on injection outside the run loop
From: Marc Zyngier @ 2026-05-20 10:01 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: Deepanshu Kartikey, Steffen Eiden, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu
In-Reply-To: <20260520100200.543845-1-maz@kernel.org>
Make sure that any attempt to inject an interrupt from userspace
or an irqfd results in the GICv2 lazy init taking place.
This is not currently necessary as the init is also performed on
*any* interrupt injection. But as we're about to remove that,
let's introduce it here.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
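Note (not part of the patch): a minimal userspace sketch of the ordering
this change enforces on the userspace and irqfd paths, i.e. run the lazy
init first, propagate its error, and only then inject. The toy_* names and
fields are hypothetical and only model the control flow.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical VM state; none of these names are kernel API. */
struct toy_vm {
	bool irqchip_in_kernel;
	bool vgic_ready;
	int init_error;		/* error the toy lazy init returns, or 0 */
	bool line_level;
};

/* Idempotent init: does nothing once the toy vgic is ready. */
static int toy_lazy_init(struct toy_vm *vm)
{
	if (vm->vgic_ready)
		return 0;
	if (vm->init_error)
		return vm->init_error;

	vm->vgic_ready = true;
	return 0;
}

/* Userspace-triggered injection: init first, then inject. */
static int toy_irq_line(struct toy_vm *vm, bool level)
{
	int ret;

	if (!vm->irqchip_in_kernel)
		return -ENXIO;

	ret = toy_lazy_init(vm);
	if (ret)
		return ret;

	vm->line_level = level;
	return 0;
}
```

Because the toy init is idempotent, calling it on every userspace injection
costs one flag check once the vgic is ready.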
arch/arm64/kvm/arm.c | 15 +++++++++++++--
arch/arm64/kvm/vgic/vgic-irqfd.c | 6 ++++++
2 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 6e6dc17f8b606..cfb7921fc7d75 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -51,6 +51,7 @@
#include <linux/irqchip/arm-gic-v5.h>
+#include "vgic/vgic.h"
#include "sys_regs.h"
static enum kvm_mode kvm_mode = KVM_MODE_DEFAULT;
@@ -1497,8 +1498,13 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
return vcpu_interrupt_line(vcpu, irq_num, level);
case KVM_ARM_IRQ_TYPE_PPI:
- if (!irqchip_in_kernel(kvm))
+ if (irqchip_in_kernel(kvm)) {
+ int ret = vgic_lazy_init(kvm);
+ if (ret)
+ return ret;
+ } else {
return -ENXIO;
+ }
vcpu = kvm_get_vcpu_by_id(kvm, vcpu_id);
if (!vcpu)
@@ -1525,8 +1531,13 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
return kvm_vgic_inject_irq(kvm, vcpu, irq_num, level, NULL);
case KVM_ARM_IRQ_TYPE_SPI:
- if (!irqchip_in_kernel(kvm))
+ if (irqchip_in_kernel(kvm)) {
+ int ret = vgic_lazy_init(kvm);
+ if (ret)
+ return ret;
+ } else {
return -ENXIO;
+ }
if (vgic_is_v5(kvm)) {
/* Build a GICv5-style IntID here */
diff --git a/arch/arm64/kvm/vgic/vgic-irqfd.c b/arch/arm64/kvm/vgic/vgic-irqfd.c
index b9b86e3a6c862..19a1094536e6a 100644
--- a/arch/arm64/kvm/vgic/vgic-irqfd.c
+++ b/arch/arm64/kvm/vgic/vgic-irqfd.c
@@ -20,9 +20,15 @@ static int vgic_irqfd_set_irq(struct kvm_kernel_irq_routing_entry *e,
int level, bool line_status)
{
unsigned int spi_id = e->irqchip.pin + VGIC_NR_PRIVATE_IRQS;
+ int ret;
if (!vgic_valid_spi(kvm, spi_id))
return -EINVAL;
+
+ ret = vgic_lazy_init(kvm);
+ if (ret)
+ return ret;
+
return kvm_vgic_inject_irq(kvm, NULL, spi_id, level, NULL);
}
--
2.47.3
* [PATCH v3 4/6] KVM: arm64: pmu: Kill the PMU interrupt level cache
From: Marc Zyngier @ 2026-05-20 10:01 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: Deepanshu Kartikey, Steffen Eiden, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu
In-Reply-To: <20260520100200.543845-1-maz@kernel.org>
Just like the timer, the PMU has an interrupt cache that serves little
purpose. Drop it.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
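Note (not part of the patch): a minimal userspace sketch of why a
caller-side level cache adds little, assuming the injection target already
ignores a level that has not changed. That assumption is modelled here by
toy_inject(); all toy_* names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of a level-triggered line; not the vgic. */
struct toy_line {
	bool level;
	unsigned int transitions;	/* number of observed level changes */
};

/* Injection that ignores an unchanged level (assumed behaviour). */
static int toy_inject(struct toy_line *line, bool level)
{
	if (line->level == level)
		return 0;

	line->level = level;
	line->transitions++;
	return 0;
}

/* Old shape: the caller keeps its own copy of the level. */
struct toy_pmu_cached {
	bool irq_level;
};

static void toy_update_cached(struct toy_pmu_cached *pmu,
			      struct toy_line *line, bool overflow)
{
	if (pmu->irq_level == overflow)
		return;

	pmu->irq_level = overflow;
	toy_inject(line, overflow);
}

/* New shape: recompute the level and inject unconditionally. */
static void toy_update_uncached(struct toy_line *line, bool overflow)
{
	toy_inject(line, overflow);
}
```

Under that assumption both shapes leave the line in the same state for any
sequence of overflow values, so the cached copy only duplicates state.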
arch/arm64/kvm/pmu-emul.c | 13 +++----------
include/kvm/arm_pmu.h | 1 -
2 files changed, 3 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 31a472a2c4881..edb21239478a9 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -396,19 +396,12 @@ static bool kvm_pmu_overflow_status(struct kvm_vcpu *vcpu)
static void kvm_pmu_update_state(struct kvm_vcpu *vcpu)
{
struct kvm_pmu *pmu = &vcpu->arch.pmu;
- bool overflow;
- overflow = kvm_pmu_overflow_status(vcpu);
- if (pmu->irq_level == overflow)
+ if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
return;
- pmu->irq_level = overflow;
-
- if (likely(irqchip_in_kernel(vcpu->kvm))) {
- int ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu,
- pmu->irq_num, overflow, pmu);
- WARN_ON(ret);
- }
+ WARN_ON(kvm_vgic_inject_irq(vcpu->kvm, vcpu, pmu->irq_num,
+ kvm_pmu_overflow_status(vcpu), pmu));
}
bool kvm_pmu_should_notify_user(struct kvm_vcpu *vcpu)
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 3e844c5ee9174..b5e5942204fc6 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -32,7 +32,6 @@ struct kvm_pmu {
struct kvm_pmc pmc[KVM_ARMV8_PMU_MAX_COUNTERS];
int irq_num;
bool created;
- bool irq_level;
};
struct arm_pmu_entry {
--
2.47.3
* [PATCH v3 0/6] KVM: arm64: Don't perform vgic-v2 lazy init on timer injection
From: Marc Zyngier @ 2026-05-20 10:01 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: Deepanshu Kartikey, Steffen Eiden, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu
This is the third version of this series aiming at fixing issues with
vgic-v2 being initialised from non-preemptible context.
* From v2 [2]:
- Remove the PMU's irq level cache which was hiding in plain sight
- Simplify the userspace notification of interrupt level update
- Additional comment clarification in patch #1
- Collected RB, with thanks
* From v1 [1]:
- Repaint kvm_timer_irq_can_fire() to kvm_timer_enabled()
- Drop duplicate kvm_timer_update_status() call
- Force lazy init on the irqfd slow-path for SPIs
[1] https://lore.kernel.org/r/20260417124612.2770268-1-maz@kernel.org
[2] https://lore.kernel.org/r/20260422100210.3008156-1-maz@kernel.org
Marc Zyngier (6):
KVM: arm64: timer: Repaint kvm_timer_{should,irq_can}_fire() to
kvm_timer_{pending,enabled}()
KVM: arm64: Simplify userspace notification of interrupt state
KVM: arm64: timer: Kill the per-timer irq level cache
KVM: arm64: pmu: Kill the PMU interrupt level cache
KVM: arm64: vgic-v2: Force vgic init on injection outside the run loop
KVM: arm64: vgic-v2: Don't init the vgic on in-kernel interrupt
injection
arch/arm64/kvm/arch_timer.c | 106 ++++++++++++++-----------------
arch/arm64/kvm/arm.c | 39 ++++++++----
arch/arm64/kvm/pmu-emul.c | 31 +++------
arch/arm64/kvm/vgic/vgic-irqfd.c | 6 ++
arch/arm64/kvm/vgic/vgic.c | 6 +-
include/kvm/arm_arch_timer.h | 7 +-
include/kvm/arm_pmu.h | 5 +-
7 files changed, 94 insertions(+), 106 deletions(-)
--
2.47.3