Linux-ARM-Kernel Archive on lore.kernel.org


* [PATCH v12 02/25] drm/display: hdmi-state-helper: Use default case for unsupported formats
From: Nicolas Frattaroli @ 2026-04-09 15:44 UTC (permalink / raw)
  To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
	Christian König, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Andrzej Hajda, Neil Armstrong, Robert Foss, Laurent Pinchart,
	Jonas Karlman, Jernej Skrabec, Sandy Huang, Heiko Stübner,
	Andy Yan, Jani Nikula, Rodrigo Vivi, Joonas Lahtinen,
	Tvrtko Ursulin, Dmitry Baryshkov, Sascha Hauer, Rob Herring,
	Jonathan Corbet, Shuah Khan
  Cc: kernel, amd-gfx, dri-devel, linux-kernel, linux-arm-kernel,
	linux-rockchip, intel-gfx, intel-xe, linux-doc,
	Nicolas Frattaroli, Cristian Ciocaltea
In-Reply-To: <20260409-color-format-v12-0-ce84e1817a27@collabora.com>

Switch statements that do not handle all possible values of an
enumeration will generate a warning during compilation. In preparation
for adding a COUNT value to the end of the enum, this needs to be dealt
with.

Add a default case to sink_supports_format_bpc's DRM_OUTPUT_COLOR_FORMAT
switch statement, and move the log-and-return unknown pixel format
handling into it.

No functional change.

Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
 drivers/gpu/drm/display/drm_hdmi_state_helper.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/display/drm_hdmi_state_helper.c b/drivers/gpu/drm/display/drm_hdmi_state_helper.c
index 9f3b696aceeb..a0d88701d236 100644
--- a/drivers/gpu/drm/display/drm_hdmi_state_helper.c
+++ b/drivers/gpu/drm/display/drm_hdmi_state_helper.c
@@ -541,10 +541,11 @@ sink_supports_format_bpc(const struct drm_connector *connector,
 		drm_dbg_kms(dev, "YUV444 format supported in that configuration.\n");
 
 		return true;
-	}
 
-	drm_dbg_kms(dev, "Unsupported pixel format.\n");
-	return false;
+	default:
+		drm_dbg_kms(dev, "Unsupported pixel format.\n");
+		return false;
+	}
 }
 
 static enum drm_mode_status

-- 
2.53.0
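
The pattern described in the commit message above can be sketched in plain C. This is a minimal illustration, not the DRM code: the enum, its member names, and both helper functions are hypothetical stand-ins. It shows how a default case covers a trailing COUNT sentinel, and how such a sentinel is typically used as a loop bound.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the DRM output color format enum. */
enum color_format {
	COLOR_FORMAT_RGB444,
	COLOR_FORMAT_YCBCR444,
	COLOR_FORMAT_YCBCR422,
	COLOR_FORMAT_YCBCR420,
	COLOR_FORMAT_COUNT,	/* sentinel: number of real formats, not a format */
};

/* Return true for every real format; the default case handles the rest. */
bool format_is_supported(enum color_format fmt)
{
	switch (fmt) {
	case COLOR_FORMAT_RGB444:
	case COLOR_FORMAT_YCBCR444:
	case COLOR_FORMAT_YCBCR422:
	case COLOR_FORMAT_YCBCR420:
		return true;
	default:
		/* Covers COLOR_FORMAT_COUNT and any out-of-range value. */
		return false;
	}
}

/* Typical use of the sentinel: iterate over all real formats. */
unsigned int count_supported_formats(void)
{
	unsigned int n = 0;

	for (int fmt = 0; fmt < COLOR_FORMAT_COUNT; fmt++)
		if (format_is_supported((enum color_format)fmt))
			n++;

	return n;
}
```

Without the default case, a compiler warning about unhandled enumeration values would be expected for COLOR_FORMAT_COUNT once the sentinel is added.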




* [PATCH v12 01/25] drm/amd/display: Remove unnecessary SIGNAL_TYPE_HDMI_TYPE_A check
From: Nicolas Frattaroli @ 2026-04-09 15:44 UTC (permalink / raw)
  To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
	Christian König, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Andrzej Hajda, Neil Armstrong, Robert Foss, Laurent Pinchart,
	Jonas Karlman, Jernej Skrabec, Sandy Huang, Heiko Stübner,
	Andy Yan, Jani Nikula, Rodrigo Vivi, Joonas Lahtinen,
	Tvrtko Ursulin, Dmitry Baryshkov, Sascha Hauer, Rob Herring,
	Jonathan Corbet, Shuah Khan
  Cc: kernel, amd-gfx, dri-devel, linux-kernel, linux-arm-kernel,
	linux-rockchip, intel-gfx, intel-xe, linux-doc,
	Nicolas Frattaroli, Werner Sembach, Andri Yngvason
In-Reply-To: <20260409-color-format-v12-0-ce84e1817a27@collabora.com>

From: Werner Sembach <wse@tuxedocomputers.com>

Remove unnecessary SIGNAL_TYPE_HDMI_TYPE_A check that was performed in the
drm_mode_is_420_only() case, but not in the drm_mode_is_420_also() &&
force_yuv420_output case.

Without further knowledge if YCbCr 4:2:0 is supported outside of HDMI,
there is no reason to use RGB when the display
reports drm_mode_is_420_only() even on a non HDMI connection.

This patch also moves both checks into the same if-case. This eliminates an
extra else-if-case.

Signed-off-by: Werner Sembach <wse@tuxedocomputers.com>
Signed-off-by: Andri Yngvason <andri@yngvason.is>
Tested-by: Andri Yngvason <andri@yngvason.is>
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index c2066319772b..ad9714382d5f 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6780,12 +6780,9 @@ static void fill_stream_properties_from_drm_display_mode(
 	timing_out->v_border_top = 0;
 	timing_out->v_border_bottom = 0;
 	/* TODO: un-hardcode */
-	if (drm_mode_is_420_only(info, mode_in)
-			&& stream->signal == SIGNAL_TYPE_HDMI_TYPE_A)
-		timing_out->pixel_encoding = PIXEL_ENCODING_YCBCR420;
-	else if (drm_mode_is_420_also(info, mode_in)
-			&& aconnector
-			&& aconnector->force_yuv420_output)
+	if (drm_mode_is_420_only(info, mode_in) ||
+	    (aconnector && aconnector->force_yuv420_output &&
+	     drm_mode_is_420_also(info, mode_in)))
 		timing_out->pixel_encoding = PIXEL_ENCODING_YCBCR420;
 	else if ((connector->display_info.color_formats & BIT(DRM_OUTPUT_COLOR_FORMAT_YCBCR422))
 			&& aconnector

-- 
2.53.0
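
The merged condition from the patch above can be modelled in a few lines of C. This is a simplified sketch under stated assumptions: the struct and function are hypothetical, the three booleans stand in for drm_mode_is_420_only(), drm_mode_is_420_also() and aconnector->force_yuv420_output, and the later YCbCr 4:2:2 and 4:4:4 branches of the real function are collapsed into a single RGB fallback.

```c
#include <stdbool.h>

enum pixel_encoding { ENC_RGB, ENC_YCBCR420 };

/* Hypothetical summary of the inputs the real code queries. */
struct mode_caps {
	bool is_420_only;	/* mode can only be driven as YCbCr 4:2:0 */
	bool is_420_also;	/* mode may optionally be driven as YCbCr 4:2:0 */
	bool force_yuv420;	/* user asked for 4:2:0 where it is optional */
};

/* Single check replacing the former if / else-if pair. */
enum pixel_encoding pick_encoding(const struct mode_caps *caps)
{
	if (caps->is_420_only ||
	    (caps->force_yuv420 && caps->is_420_also))
		return ENC_YCBCR420;

	return ENC_RGB;	/* stands in for the remaining else-if chain */
}
```

Note that the signal type no longer appears in the condition, so a 4:2:0-only mode selects YCbCr 4:2:0 regardless of the connection type.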




* Re: [PATCH v7 7/7] KVM: arm64: Normalize cache configuration
From: Marc Zyngier @ 2026-04-09 15:45 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Gutierrez Cantu, Bernardo, alexandru.elisei, alyssa, asahi,
	broonie, catalin.marinas, james.morse, kvmarm, linux-arm-kernel,
	linux-kernel, marcan, mathieu.poirier, oliver.upton,
	suzuki.poulose, sven, will
In-Reply-To: <254ca48a67779ccf9b9f60e2bb5796a305c03f95.camel@infradead.org>

On Thu, 09 Apr 2026 15:51:59 +0100,
David Woodhouse <dwmw2@infradead.org> wrote:
> 
> On Thu, 2026-04-09 at 14:36 +0100, Marc Zyngier wrote:
> > On Thu, 09 Apr 2026 13:25:24 +0100,
> > David Woodhouse <dwmw2@infradead.org> wrote:
> > > 
> > > On Thu, 12 Jan 2023 at 11:38:52 +0900, Akihiko Odaki wrote:
> > > > Before this change, the cache configuration of the physical CPU was
> > > > exposed to vcpus. This is problematic because the cache configuration a
> > > > vcpu sees varies when it migrates between vcpus with different cache
> > > > configurations.
> > > > 
> > > > Fabricate cache configuration from the sanitized value, which holds the
> > > > CTR_EL0 value the userspace sees regardless of which physical CPU it
> > > > resides on.
> > > > 
> > > > CLIDR_EL1 and CCSIDR_EL1 are now writable from the userspace so that
> > > > the VMM can restore the values saved with the old kernel.
> > > 
> > > (commit 7af0c2534f4c5)
> > > 
> > > How does the VMM set the values that the old kernel would have set?
> > 
> > By reading them at the source?
> 
> Yes, but the VMM in EL0 *can't* read the source, can it?

Again, this is designed for migration, and snapshotting the source
values gives you the expected result.

>
> > > Let's say we're deploying a kernel with this change for the first time,
> > > and we need to ensure that we provide a consistent environment to
> > > guests, which can be live migrated back to an older host.
> > 
> > We have never guaranteed host downgrade. It almost never works.
> 
> Huh? Host downgrade absolutely *does* work; KVM on Arm isn't a toy.
> We'd never be able to ship a new kernel if we didn't know we could roll
> it *back* if we needed to.

You have clearly been lucky so far, because I'd never make such a claim.

> 
> I *know* that you know perfectly well that we actively test host
> downgrades prior to every rollout of a new kernel. So I'm a little
> confused about what you're trying to say here, Marc.

I said exactly this: we make no effort to ensure that a guest started
on a version of the kernel can be migrated to an older version -- the
state space might be different.

> 
> > > So for new launches, we need to provide the values that the old kernel
> > > *would* have provided to the guest. A new launch isn't a migration;
> > > there are no "values saved with the old kernel".
> > 
> > And you can provide these values.
> 
> If I know them, sure. But I don't, because:
> 
> > > Userspace can't read the CLIDR_EL1 and CCSIDR_EL1 registers directly,
> > > and AFAICT not everything we need to reconstitute them is in sysfs. How
> > > is this supposed to work?
> > > 
> > > Shouldn't this change have been made as a capability that the VMM can
> > > explicitly opt in or out of? Environments that don't do cross-CPU
> > > migration absolutely don't care about, and actively don't *want*, the
> > > sanitisation that this commit inflicted on us, surely?
> > 
> > I don't think a capability buys you anything. You want to expose
> > something to the guest? Make it so. You are in the favourable
> > situation to completely own the HW and the VMM.
> 
> Now that the values are writable, but userspace can't easily see *what*
> they would have been in the previous kernel. So having the capability
> to ask KVM to set them to those values seems like it might be useful.

Well, it's only a patch away.

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [PATCH v10 16/20] coresight: Add PM callbacks for sink device
From: Leo Yan @ 2026-04-09 15:44 UTC (permalink / raw)
  To: James Clark
  Cc: coresight, linux-arm-kernel, Yeoreum Yun, Mark Rutland,
	Will Deacon, Yabin Cui, Keita Morisaki, Yuanfang Zhang,
	Greg Kroah-Hartman, Alexander Shishkin, Tamas Petz,
	Thomas Gleixner, Peter Zijlstra, Suzuki K Poulose, Mike Leach
In-Reply-To: <e7c6f497-f8f0-409b-b80a-a064b222c08d@linaro.org>

On Thu, Apr 09, 2026 at 11:52:00AM +0100, James Clark wrote:

[...]

> > @@ -1787,15 +1808,32 @@ static int coresight_pm_save(struct coresight_path *path)
> >   	to = list_prev_entry(coresight_path_last_node(path), link);
> >   	coresight_disable_path_from_to(path, from, to);
> > +	ret = coresight_pm_device_save(coresight_get_sink(path));
> > +	if (ret)
> > +		goto sink_failed;
> > +
> 
> The comment directly above this says "Up to the node before sink to avoid
> latency". But then this line goes and saves the sink anyway. So I'm not sure
> what's meant by the comment?

I will refine the comment to: "Control the path up to the node before
the _system_ sink to avoid latency caused by memory copying; afterwards,
saves context if the sink provides a callback".

> >   	return 0;
> > +
> > +sink_failed:
> > +	if (!coresight_enable_path_from_to(path, coresight_get_mode(source),
> > +					   from, to))
> > +		coresight_pm_device_restore(source);
> > +
> > +	pr_err("Failed in coresight PM save on CPU%d: %d\n",
> > +	       smp_processor_id(), ret);
> > +	this_cpu_write(percpu_pm_failed, true);
> 
> Why does only a failing sink set percpu_pm_failed, when failing to save the
> source exits early?

My purpose is to use "percpu_pm_failed == true" to prevent further
issues if we already know the link has failed.  To be clear, this is not
a recovery mechanism; it simply means "if link enabling or disabling
fails in CPU PM, do not continue, to avoid potential hardware lockups."

The source save failure is a bit special: it fails at the beginning of
the CoreSight CPU PM flow, and the returned error prevents the CPU idle
flow from proceeding.  As a result, there is nothing further to do.

> Sashiko has a similar comment that this could result in
> restoring uninitialised source save data later, but a comment in this
> function about why the flow is like this would be helpful.

The comment of "Restoring uninitialised source save data later" does not
make sense.

When the save fails, the notifier only rolls back the callbacks that ran
before the failing callback by invoking CPU_PM_ENTER_FAILED.  And there
is no CPU_PM_EXIT, as the idle state is never entered.

> We have coresight_disable_path_from_to() which always succeeds and doesn't
> return an error. TRBE is the only sink with a pm_save_disable()
> callback, but it always succeeds anyway.
> 
> Would it not be much simpler to require that sink save/restore callbacks
> always succeed and don't return anything? Seems like this percpu_pm_failed
> stuff is extra complexity for a scenario that doesn't exist? The only thing
> that can fail is saving the source but it doesn't goto sink_failed when that
> happens.

As you suggested, I can simplify the code by assuming sink ops
always succeed, but we might expect complaints from Sashiko.

> Ideally etm4_cpu_save() wouldn't have a return value either. It would be
> good if we could find a way to skip or ignore the timeouts in there somehow
> because that's the only reason it can fail.

It seems to me that having ETMv4 log a timeout and return an error is
not bad, as it helps to locate an issue cheaply (a hang caused by a
lockup can be time-consuming to debug).

Thanks,
Leo
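
The rollback behaviour described in the reply above (only the callbacks that ran before the failing one are notified of the failure) can be modelled in user space. This is a hedged sketch, not the kernel's notifier code: all names here (pm_event, pm_callback, pm_enter_robust, cb_ok, cb_fail, pm_trace) are hypothetical, and the callbacks just count how often they are invoked.

```c
#include <stddef.h>

enum pm_event { PM_ENTER, PM_ENTER_FAILED };

typedef int (*pm_callback)(enum pm_event ev, void *data);

/* Counts invocations so the rollback can be observed. */
struct pm_trace {
	int entered;
	int rolled_back;
};

/* Example callback that always succeeds. */
int cb_ok(enum pm_event ev, void *data)
{
	struct pm_trace *t = data;

	if (ev == PM_ENTER)
		t->entered++;
	else
		t->rolled_back++;
	return 0;
}

/* Example callback that fails on PM_ENTER. */
int cb_fail(enum pm_event ev, void *data)
{
	struct pm_trace *t = data;

	if (ev == PM_ENTER) {
		t->entered++;
		return -1;
	}
	t->rolled_back++;
	return 0;
}

/*
 * Call every callback with PM_ENTER. On the first failure, notify only
 * the callbacks that already ran (indices 0..i-1) with PM_ENTER_FAILED;
 * the failing callback itself and the ones after it are not notified.
 */
int pm_enter_robust(pm_callback *cbs, size_t n, void *data)
{
	for (size_t i = 0; i < n; i++) {
		int ret = cbs[i](PM_ENTER, data);

		if (ret) {
			for (size_t j = 0; j < i; j++)
				cbs[j](PM_ENTER_FAILED, data);
			return ret;
		}
	}
	return 0;
}
```

In this model a failing callback has to clean up after itself before returning the error, because it never receives the PM_ENTER_FAILED event.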


* Re: BUG: net-next (7.0-rc6 based and later) fails to boot on Jetson Xavier NX
From: Linus Torvalds @ 2026-04-09 15:37 UTC (permalink / raw)
  To: Will Deacon
  Cc: Russell King (Oracle), Robin Murphy, netdev, linux-arm-kernel,
	linux-kernel, iommu, linux-ext4, dmaengine, Marek Szyprowski,
	Theodore Ts'o, Andreas Dilger, Vinod Koul, Frank Li
In-Reply-To: <adeaiSAnkaggqPsA@willie-the-truck>

On Thu, 9 Apr 2026 at 05:24, Will Deacon <will@kernel.org> wrote:
>
> On Wed, Apr 08, 2026 at 08:52:32PM +0100, Russell King (Oracle) wrote:
> > What's the status on the iommu fix? Is it merged into mainline yet?
> > If it isn't already, that means net-next remains unbootable going
> > into the merge window without manually carrying the fix locally.
>
> I'll pick it up for 7.0 in the iommu tree.

... and now it's in my tree.

               Linus



* Re: [PATCH v2 1/3] dt-bindings: arm: aspeed: add Anacapa EVT1 EVT2 board
From: Conor Dooley @ 2026-04-09 15:36 UTC (permalink / raw)
  To: Colin Huang
  Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Joel Stanley,
	Andrew Jeffery, devicetree, linux-arm-kernel, linux-aspeed,
	linux-kernel, colin.huang2
In-Reply-To: <20260409-anacapa-devlop-phase-devicetree-v2-1-68f328671653@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 247 bytes --]

On Thu, Apr 09, 2026 at 07:40:26PM +0800, Colin Huang wrote:
> Document Anacapa BMC EVT1 and EVT2 compatibles.
> 
> Signed-off-by: Colin Huang <u8813345@gmail.com>

Acked-by: Conor Dooley <conor.dooley@microchip.com>
pw-bot: not-applicable

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]


* Re: [PATCH v2 1/3] dt-bindings: net: dsa: nxp,sja1105: make spi-cpol optional for sja1110
From: Conor Dooley @ 2026-04-09 15:36 UTC (permalink / raw)
  To: Josua Mayer
  Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Shawn Guo,
	Frank Li, Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam,
	Andrew Lunn, Vladimir Oltean, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Yazan Shhady, Mikhail Anikin,
	Alexander Dahl, devicetree, linux-kernel, imx, linux-arm-kernel,
	Vladimir Oltean, Conor Dooley, Krzysztof Kozlowski, netdev
In-Reply-To: <20260409-imx8dxl-sr-som-v2-1-83ff20629ba0@solid-run.com>

[-- Attachment #1: Type: text/plain, Size: 668 bytes --]

On Thu, Apr 09, 2026 at 02:34:33PM +0200, Josua Mayer wrote:
> Currently, the binding requires 'spi-cpha' for SJA1105 and 'spi-cpol'
> for SJA1110.
> 
> However, the SJA1110 supports both SPI modes 0 and 2. Mode 2
> (cpha=0, cpol=1) is used by the NXP LX2160 Bluebox 3.
> 
> On the SolidRun i.MX8DXL HummingBoard Telematics, mode 0 is stable,
> while forcing mode 2 introduces CRC errors especially during bursts.
> 
> Drop the requirement on spi-cpol for SJA1110.
> 
> Fixes: af2eab1a8243 ("dt-bindings: net: nxp,sja1105: document spi-cpol/cpha")
> Signed-off-by: Josua Mayer <josua@solid-run.com>

Acked-by: Conor Dooley <conor.dooley@microchip.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
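
For readers unfamiliar with the SPI mode numbers used in the quoted commit message, the conventional mapping from clock polarity and phase to a mode number can be sketched as follows. The macro and function names are illustrative only and are not taken from the binding or the driver; the bit layout follows the common convention in which CPHA is bit 0 and CPOL is bit 1.

```c
#include <stdbool.h>

#define CPHA_BIT 0x1u	/* clock phase: sample on the second edge when set */
#define CPOL_BIT 0x2u	/* clock polarity: clock idles high when set */

/* Combine polarity and phase into the conventional SPI mode number 0..3. */
unsigned int spi_mode_number(bool cpol, bool cpha)
{
	return (cpol ? CPOL_BIT : 0u) | (cpha ? CPHA_BIT : 0u);
}
```

With this mapping, omitting both properties yields mode 0, and setting only the polarity bit (cpha=0, cpol=1) yields mode 2, the two modes the commit message says the SJA1110 supports.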


* Re: [PATCH] media: cedrus: skip invalid H.264 reference list entries
From: Paul Kocialkowski @ 2026-04-09 15:31 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: Pengpeng Hou, mripard, mchehab, gregkh, wens, jernej.skrabec,
	samuel, linux-media, linux-staging, linux-arm-kernel, linux-sunxi,
	linux-kernel
In-Reply-To: <a5f8c14becdc926f14ea852c8c5a2d4665a2ef8a.camel@collabora.com>

[-- Attachment #1: Type: text/plain, Size: 2268 bytes --]

Hi,

On Thu 09 Apr 26, 10:39, Nicolas Dufresne wrote:
> Hi,
> 
> Le jeudi 09 avril 2026 à 16:31 +0200, Paul Kocialkowski a écrit :
> > I think it makes sense, yes, but it would be good to document it in the uAPI
> > document too.
> 
> Basically, extend in the M2M decoder spec(s) on the existing documentation:
>    
>    V4L2_BUF_FLAG_ERROR:
>    -
>    When this flag is set, the buffer has been dequeued successfully, although
>    the data might **have been corrupted**. This is recoverable, streaming may
>    continue as normal and the buffer may be reused normally. Drivers set this
>    flag when the VIDIOC_DQBUF ioctl is called.

Well this part is about v4l2 buffers in general, not just m2m/decoders.
But I guess this mechanism would make sense for more device classes than just
decoders, so we could indeed specify it there. Maybe with a sufficiently broad
wording.

But it would be good to also update the stateless decode document (and maybe
stateful too) where V4L2_BUF_FLAG_ERROR is already mentioned a few times.
We could indicate how this behavior relates to reference frames there.

If we agree I could make a series with the following:
- Introduce a V4L2_H264_REF_MISSING 0xff define (same for HEVC)
- Update the v4l2_h264_reference doc to mention it
- Update the cedrus driver to error out (zero-size payload) when the L0/L1 index
  is either V4L2_H264_REF_MISSING or an invalid index that doesn't exist in the
  DPB (same for HEVC)
- Update the v4l2 buffer and stateless(+stateful) documents to mention that
  buffers marked with V4L2_BUF_FLAG_ERROR may or may not contain usable (yet
  corrupted) data depending on the payload size and how it relates to reference
  frames.

Then we could later envision having a mechanism (hopefully common) to figure out
the best replacement to a given missing reference, which would allow cedrus
(and maybe other drivers too) to return a frame with incorrect data instead of
a zero-size payload error.

What do you think?

All the best,

Paul

-- 
Paul Kocialkowski,

Independent contractor - sys-base - https://www.sys-base.io/
Free software developer - https://www.paulk.fr/

Expert in multimedia, graphics and embedded hardware support with Linux.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
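
The reference-index check proposed in the message above could look roughly like the sketch below. It is an assumption-laden illustration, not cedrus code: REF_MISSING mirrors the V4L2_H264_REF_MISSING value (0xff) that the thread only proposes, DPB_ENTRIES mirrors the 16-entry H.264 DPB of the stateless uAPI, and struct dpb_entry with its single flag is a hypothetical simplification of the real DPB entry.

```c
#include <stdbool.h>
#include <stdint.h>

#define DPB_ENTRIES 16		/* size of the H.264 DPB array in this model */
#define REF_MISSING 0xff	/* hypothetical marker proposed in the thread */

/* Hypothetical, reduced DPB entry: only tracks whether the slot is in use. */
struct dpb_entry {
	bool active;
};

/*
 * Return true if an L0/L1 reference index points at a usable DPB entry.
 * Explicitly missing references, out-of-range indices and inactive
 * slots are all rejected.
 */
bool ref_index_usable(const struct dpb_entry *dpb, uint8_t index)
{
	if (index == REF_MISSING)
		return false;
	if (index >= DPB_ENTRIES)
		return false;

	return dpb[index].active;
}
```

A driver following the proposal would report an error (zero-size payload) whenever this kind of check fails for a reference the slice actually needs.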


* [PATCH] KVM: arm64: Add KVM_CAP_ARM_NATIVE_CACHE_CONFIG vcpu capability
From: David Woodhouse @ 2026-04-09 15:29 UTC (permalink / raw)
  To: akihiko.odaki, Gutierrez Cantu, Bernardo
  Cc: alexandru.elisei, alyssa, asahi, broonie, catalin.marinas,
	james.morse, kvmarm, kvmarm, linux-arm-kernel, linux-kernel,
	marcan, mathieu.poirier, maz, oliver.upton, suzuki.poulose, sven,
	will
In-Reply-To: <b71910e202ac4a56a87c0e34df13cce058d46a76.camel@infradead.org>

[-- Attachment #1: Type: text/plain, Size: 8489 bytes --]

From: David Woodhouse <dwmw@amazon.co.uk>

Commit 7af0c2534f4c5 ("KVM: arm64: Normalize cache configuration")
fabricates CLIDR_EL1 and CCSIDR_EL1 values instead of using the real
hardware values. While this provides consistent values across
heterogeneous CPUs, it does cause visible changes in the CPU model
exposed to guests.

The commit claims that userspace can restore the original values, but
there is no way for userspace to obtain the real CLIDR_EL1 register
value — it is not fully reconstructible from sysfs, which lacks the
LoC, LoUU, and LoUIS fields.

Add a per-vcpu KVM_CAP_ARM_NATIVE_CACHE_CONFIG capability that reads
the real CLIDR_EL1 and all CCSIDR_EL1 values from the current physical
CPU and sets them on the vcpu.

This allows hypervisors to present the real hardware cache configuration
to guests, which is important for consistency of the environment across
kernel versions and for migration compatibility with hosts running
older kernels that exposed the real values.

Fixes: 7af0c2534f4c ("KVM: arm64: Normalize cache configuration")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 Documentation/virt/kvm/api.rst                | 23 ++++++++
 arch/arm64/include/asm/kvm_host.h             |  1 +
 arch/arm64/kvm/arm.c                          | 17 ++++++
 arch/arm64/kvm/sys_regs.c                     | 26 ++++++++++
 include/uapi/linux/kvm.h                      |  1 +
 tools/testing/selftests/kvm/Makefile.kvm      |  1 +
 .../selftests/kvm/arm64/native_cache_config.c | 52 +++++++++++++++++++
 7 files changed, 121 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/arm64/native_cache_config.c

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index e3b3bd9edeec..ee47dc07ceac 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8930,6 +8930,29 @@ no-op.
 
 ``KVM_CHECK_EXTENSION`` returns the bitmask of exits that can be disabled.
 
+7.48 KVM_CAP_ARM_NATIVE_CACHE_CONFIG
+-------------------------------------
+
+:Architecture: arm64
+:Target: vcpu
+:Parameters: none
+:Returns: 0 on success, -ENOMEM on allocation failure, -EINVAL if
+          args[0] or flags are non-zero.
+
+This per-vcpu capability reads the real CLIDR_EL1 and CCSIDR_EL1 values
+from the physical CPU on which the ioctl is executed, and sets them on
+the vcpu. This replaces the fabricated cache configuration that KVM
+provides by default.
+
+The caller should ensure the vcpu thread is pinned to the desired
+physical CPU before invoking this capability, so that the correct cache
+topology is captured. On heterogeneous systems, different physical CPUs
+may have different cache configurations.
+
+After this capability is enabled, the vcpu's CLIDR_EL1 and CCSIDR_EL1
+values can still be overridden individually via ``KVM_SET_ONE_REG`` and
+the ``KVM_REG_ARM_DEMUX`` interface.
+
 8. Other capabilities.
 ======================
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index a1bb025c641f..c9713a472c47 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1296,6 +1296,7 @@ void kvm_sys_regs_create_debugfs(struct kvm *kvm);
 void kvm_reset_sys_regs(struct kvm_vcpu *vcpu);
 
 int __init kvm_sys_reg_table_init(void);
+int kvm_vcpu_set_native_cache_config(struct kvm_vcpu *vcpu);
 struct sys_reg_desc;
 int __init populate_sysreg_config(const struct sys_reg_desc *sr,
 				  unsigned int idx);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 326a99fea753..579583e8dc5c 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -393,6 +393,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ARM_DISABLE_EXITS:
 		r = KVM_ARM_DISABLE_VALID_EXITS;
 		break;
+	case KVM_CAP_ARM_NATIVE_CACHE_CONFIG:
+	case KVM_CAP_ENABLE_CAP:
+		r = 1;
+		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
 		return KVM_GUESTDBG_VALID_MASK;
 	case KVM_CAP_ARM_SET_DEVICE_ADDR:
@@ -1793,6 +1797,19 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		r = kvm_arch_vcpu_ioctl_vcpu_init(vcpu, &init);
 		break;
 	}
+	case KVM_ENABLE_CAP: {
+		struct kvm_enable_cap cap;
+
+		r = -EFAULT;
+		if (copy_from_user(&cap, argp, sizeof(cap)))
+			break;
+
+		r = -EINVAL;
+		if (cap.cap == KVM_CAP_ARM_NATIVE_CACHE_CONFIG &&
+		    !cap.args[0] && !cap.flags)
+			r = kvm_vcpu_set_native_cache_config(vcpu);
+		break;
+	}
 	case KVM_SET_ONE_REG:
 	case KVM_GET_ONE_REG: {
 		struct kvm_one_reg reg;
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 1b4cacb6e918..c19d84e48f8b 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -484,6 +484,32 @@ static int set_ccsidr(struct kvm_vcpu *vcpu, u32 csselr, u32 val)
 	return 0;
 }
 
+int kvm_vcpu_set_native_cache_config(struct kvm_vcpu *vcpu)
+{
+	u32 csselr;
+
+	if (!vcpu->arch.ccsidr) {
+		vcpu->arch.ccsidr = kmalloc_array(CSSELR_MAX, sizeof(u32),
+						  GFP_KERNEL_ACCOUNT);
+		if (!vcpu->arch.ccsidr)
+			return -ENOMEM;
+	}
+
+	local_irq_disable();
+
+	__vcpu_assign_sys_reg(vcpu, CLIDR_EL1, read_sysreg(clidr_el1));
+
+	for (csselr = 0; csselr < CSSELR_MAX; csselr++) {
+		write_sysreg(csselr, csselr_el1);
+		isb();
+		vcpu->arch.ccsidr[csselr] = read_sysreg(ccsidr_el1);
+	}
+
+	local_irq_enable();
+
+	return 0;
+}
+
 static bool access_rw(struct kvm_vcpu *vcpu,
 		      struct sys_reg_params *p,
 		      const struct sys_reg_desc *r)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 694cf699ed0a..2d8bbb4dd69b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -995,6 +995,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_S390_USER_OPEREXEC 246
 #define KVM_CAP_S390_KEYOP 247
 #define KVM_CAP_ARM_DISABLE_EXITS 248
+#define KVM_CAP_ARM_NATIVE_CACHE_CONFIG 249
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index ad9f9fa181a1..8e05cc8ec204 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -182,6 +182,7 @@ TEST_GEN_PROGS_arm64 += arm64/vgic_group_iidr
 TEST_GEN_PROGS_arm64 += arm64/vgic_group_v2
 TEST_GEN_PROGS_arm64 += arm64/vpmu_counter_access
 TEST_GEN_PROGS_arm64 += arm64/no-vgic-v3
+TEST_GEN_PROGS_arm64 += arm64/native_cache_config
 TEST_GEN_PROGS_arm64 += arm64/idreg-idst
 TEST_GEN_PROGS_arm64 += arm64/kvm-uuid
 TEST_GEN_PROGS_arm64 += access_tracking_perf_test
diff --git a/tools/testing/selftests/kvm/arm64/native_cache_config.c b/tools/testing/selftests/kvm/arm64/native_cache_config.c
new file mode 100644
index 000000000000..4afea32f2348
--- /dev/null
+++ b/tools/testing/selftests/kvm/arm64/native_cache_config.c
@@ -0,0 +1,52 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * native_cache_config.c - Test KVM_CAP_ARM_NATIVE_CACHE_CONFIG
+ *
+ * Verify that enabling the capability populates the vcpu's CLIDR_EL1
+ * with the real hardware value instead of the fabricated default.
+ */
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+
+#define CLIDR_EL1_REG_ID ARM64_SYS_REG(3, 1, 0, 0, 1)
+
+int main(int argc, char *argv[])
+{
+	struct kvm_enable_cap cap = {
+		.cap = KVM_CAP_ARM_NATIVE_CACHE_CONFIG,
+	};
+	struct kvm_vcpu *vcpu;
+	struct kvm_vm *vm;
+	uint64_t clidr_before, clidr_after;
+	int r;
+
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_ARM_NATIVE_CACHE_CONFIG));
+
+	vm = vm_create_with_one_vcpu(&vcpu, NULL);
+
+	/* Read the fabricated CLIDR_EL1 */
+	clidr_before = vcpu_get_reg(vcpu, CLIDR_EL1_REG_ID);
+
+	/* Enable native cache config */
+	r = __vcpu_ioctl(vcpu, KVM_ENABLE_CAP, &cap);
+	TEST_ASSERT(!r, "KVM_ENABLE_CAP failed: %d (errno %d)", r, errno);
+
+	/* Read CLIDR_EL1 again */
+	clidr_after = vcpu_get_reg(vcpu, CLIDR_EL1_REG_ID);
+
+	pr_info("CLIDR_EL1 before: 0x%016lx\n", clidr_before);
+	pr_info("CLIDR_EL1 after:  0x%016lx\n", clidr_after);
+
+	TEST_ASSERT(clidr_after != 0,
+		    "CLIDR_EL1 should not be zero after native config");
+
+	/* Invalid: non-zero args should fail */
+	cap.args[0] = 1;
+	r = __vcpu_ioctl(vcpu, KVM_ENABLE_CAP, &cap);
+	TEST_ASSERT(r == -1 && errno == EINVAL,
+		    "Non-zero args should fail: got %d errno %d", r, errno);
+
+	kvm_vm_free(vm);
+	return 0;
+}
-- 
2.43.0
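
The commit message above refers to the LoC, LoUU and LoUIS fields of CLIDR_EL1. As a reading aid, the following sketch decodes those fields and the per-level cache type from a raw register value, using the architectural bit positions (Ctype<n> in bits [3n-1:3n-3], LoUIS in [23:21], LoC in [26:24], LoUU in [29:27]). The helper names are hypothetical and are not part of the patch or of KVM.

```c
#include <assert.h>
#include <stdint.h>

/* Encodings of a CLIDR_EL1.Ctype<n> field. */
enum clidr_ctype {
	CTYPE_NONE = 0,		/* no cache at this level */
	CTYPE_INSN = 1,		/* instruction cache only */
	CTYPE_DATA = 2,		/* data cache only */
	CTYPE_SEPARATE = 3,	/* separate instruction and data caches */
	CTYPE_UNIFIED = 4,	/* unified cache */
};

/* Cache type at the given level (1..7). */
unsigned int clidr_cache_type(uint64_t clidr, unsigned int level)
{
	assert(level >= 1 && level <= 7);
	return (clidr >> (3 * (level - 1))) & 0x7;
}

/* Level of Unification Inner Shareable, bits [23:21]. */
unsigned int clidr_louis(uint64_t clidr)
{
	return (clidr >> 21) & 0x7;
}

/* Level of Coherence, bits [26:24]. */
unsigned int clidr_loc(uint64_t clidr)
{
	return (clidr >> 24) & 0x7;
}

/* Level of Unification Uniprocessor, bits [29:27]. */
unsigned int clidr_louu(uint64_t clidr)
{
	return (clidr >> 27) & 0x7;
}
```

A VMM could use helpers like these to compare the CLIDR_EL1 value read back through KVM_GET_ONE_REG on two hosts before deciding whether a guest can move between them.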



[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]


* Re: [PATCH 7/7] media: rkvdec: Add multicore IOMMU support
From: Nicolas Dufresne @ 2026-04-09 15:24 UTC (permalink / raw)
  To: Detlev Casanova, Mauro Carvalho Chehab, Ezequiel Garcia,
	Heiko Stuebner, Hans Verkuil, Jonas Karlman
  Cc: kernel, linux-media, linux-kernel, linux-rockchip,
	linux-arm-kernel
In-Reply-To: <040f65ea-2559-4186-b733-3dc59a1ecc0b@collabora.com>

[-- Attachment #1: Type: text/plain, Size: 12646 bytes --]

Le jeudi 09 avril 2026 à 10:49 -0400, Detlev Casanova a écrit :
> Hi Nicolas,
> 
> On 4/9/26 10:19, Nicolas Dufresne wrote:
> > Le jeudi 09 avril 2026 à 09:50 -0400, Detlev Casanova a écrit :
> > > As each core has its own IOMMU core, buffers must be mapped in each
> > > core's IOMMU so that any run() call can use any core without having to
> > > remap everything.
> > > 
> > > To do that, we use rockchip iommu domain's iommu devices list.
> > > With that, one IOMMU domain can be mapped on multiple devices, meaning
> > > that each call to iommu_map() will flush the new mapping on all devices
> > > in the list.
> > > 
> > > The IOMMU domain that will have all devices in its list is the first
> > > core's default domain.
> > > 
> > > Another domain cannot be used because VB2 allocates buffers through the
> > > DMA engine, which uses iommu_get_dma_domain() to find the domain to map
> > > buffers through.
> > > 
> > > The IOMMU restore function can still work as before, but needs to be more
> > > explicit in what domain to attach the device to.
> > > That is because detaching the empty domain will reattach the core's default
> > > domain, which is wrong (except for the first "main" core).
> > > 
> > > The RCB temporary buffers are allocated in a dedicated SRAM, each
> > > core has its own SRAM, so the mapping for each core's SRAM is added in the
> > > global domain.
> > > 
> > > Everything else is mapped through the first core's default domain, making
> > > the driver write the mappings on both IOMMU cores.
> > Just raising an issue with the patch ordering here. I'm worried in a git bisect,
> > the driver will be broken until we apply this last patch. Can we make sure that
> > the driver bisects ? (or tell me if I'm wrong)
> No, you're right. That's also why I mentioned it in the cover letter. 
> This commit is also not too big, so I'll merge the last 3 commits and 
> adapt the commit message to keep the information from the 3 (the commit 
> messages are why I wanted to keep them separate, but there is no good 
> way to keep them that way)


Unless we can implement everything, but only turn the multicore switch on in a
final patch? Basically, leave out the code that enables the other cores until
the end. But I don't want more code just to justify this. The split is nice for
review fwiw, but if squashing is the way, let it be.

Nicolas

> 
> Detlev.
> > 
> > > Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
> > > ---
> > >   .../media/platform/rockchip/rkvdec/rkvdec-rcb.c    | 21 ++++++-------
> > >   .../media/platform/rockchip/rkvdec/rkvdec-rcb.h    |  6 ++--
> > >   drivers/media/platform/rockchip/rkvdec/rkvdec.c    | 35 +++++++++++++++++-----
> > >   drivers/media/platform/rockchip/rkvdec/rkvdec.h    |  2 +-
> > >   4 files changed, 44 insertions(+), 20 deletions(-)
> > > 
> > > diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.c
> > > index 190fb7438e8c..977e37cf209b 100644
> > > --- a/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.c
> > > +++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.c
> > > @@ -57,7 +57,7 @@ bool rkvdec_rcb_buf_validate_size(struct rkvdec_ctx *ctx)
> > >   	return ret;
> > >   }
> > >   
> > > -void rkvdec_free_rcb(struct rkvdec_core *core)
> > > +void rkvdec_free_rcb(struct rkvdec_dev *rkvdec, struct rkvdec_core *core)
> > >   {
> > >   	struct rkvdec_rcb_config *cfg = core->rcb_config;
> > >   	unsigned long virt_addr;
> > > @@ -76,12 +76,12 @@ void rkvdec_free_rcb(struct rkvdec_core *core)
> > >   		case RKVDEC_ALLOC_SRAM:
> > >   			virt_addr = (unsigned long)cfg->rcb_bufs[i].cpu;
> > >   
> > > -			if (core->iommu_domain)
> > > -				iommu_unmap(core->iommu_domain, virt_addr, rcb_size);
> > > +			if (rkvdec->iommu_global_domain)
> > > +				iommu_unmap(rkvdec->iommu_global_domain, virt_addr, rcb_size);
> > >   			gen_pool_free(core->sram_pool, virt_addr, rcb_size);
> > >   			break;
> > >   		case RKVDEC_ALLOC_DMA:
> > > -			dma_free_coherent(core->dev,
> > > +			dma_free_coherent(rkvdec->main_core->dev,
> > >   					  rcb_size,
> > >   					  cfg->rcb_bufs[i].cpu,
> > >   					  cfg->rcb_bufs[i].dma);
> > > @@ -97,7 +97,8 @@ void rkvdec_free_rcb(struct rkvdec_core *core)
> > >   	core->rcb_config = NULL;
> > >   }
> > >   
> > > -int rkvdec_allocate_rcb(struct rkvdec_core *core, u32 width, u32 height,
> > > +int rkvdec_allocate_rcb(struct rkvdec_dev *rkvdec, struct rkvdec_core *core,
> > > +			u32 width, u32 height,
> > >   			const struct rcb_size_info *size_info,
> > >   			size_t rcb_count)
> > >   {
> > > @@ -132,7 +133,7 @@ int rkvdec_allocate_rcb(struct rkvdec_core *core, u32 width, u32 height,
> > >   
> > >   		/* Try allocating an SRAM buffer */
> > >   		if (core->sram_pool) {
> > > -			if (core->iommu_domain)
> > > +			if (rkvdec->iommu_global_domain)
> > >   				rcb_size = ALIGN(rcb_size, SZ_4K);
> > >   
> > >   			cpu = gen_pool_dma_zalloc_align(core->sram_pool,
> > > @@ -142,11 +143,11 @@ int rkvdec_allocate_rcb(struct rkvdec_core *core, u32 width, u32 height,
> > >   		}
> > >   
> > >   		/* If an IOMMU is used, map the SRAM address through it */
> > > -		if (cpu && core->iommu_domain) {
> > > +		if (cpu && rkvdec->iommu_global_domain) {
> > >   			unsigned long virt_addr = (unsigned long)cpu;
> > >   			phys_addr_t phys_addr = dma;
> > >   
> > > -			ret = iommu_map(core->iommu_domain, virt_addr, phys_addr,
> > > +			ret = iommu_map(rkvdec->iommu_global_domain, virt_addr, phys_addr,
> > >   					rcb_size, IOMMU_READ | IOMMU_WRITE, 0);
> > >   			if (ret) {
> > >   				gen_pool_free(core->sram_pool,
> > > @@ -166,7 +167,7 @@ int rkvdec_allocate_rcb(struct rkvdec_core *core, u32 width, u32 height,
> > >   ram_fallback:
> > >   		/* Fallback to RAM */
> > >   		if (!cpu) {
> > > -			cpu = dma_alloc_coherent(core->dev,
> > > +			cpu = dma_alloc_coherent(rkvdec->main_core->dev,
> > >   						 rcb_size,
> > >   						 &dma,
> > >   						 GFP_KERNEL);
> > > @@ -189,7 +190,7 @@ int rkvdec_allocate_rcb(struct rkvdec_core *core, u32 width, u32 height,
> > >   	return 0;
> > >   
> > >   err_alloc:
> > > -	rkvdec_free_rcb(core);
> > > +	rkvdec_free_rcb(rkvdec, core);
> > >   
> > >   	return ret;
> > >   }
> > > diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.h b/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.h
> > > index a12af9b7dc2b..d1149afe7fda 100644
> > > --- a/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.h
> > > +++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.h
> > > @@ -8,6 +8,7 @@
> > >   
> > >   #include <linux/types.h>
> > >   
> > > +struct rkvdec_dev;
> > >   struct rkvdec_ctx;
> > >   struct rkvdec_core;
> > >   
> > > @@ -21,11 +22,12 @@ struct rcb_size_info {
> > >   	enum rcb_axis axis;
> > >   };
> > >   
> > > -int rkvdec_allocate_rcb(struct rkvdec_core *core, u32 width, u32 height,
> > > +int rkvdec_allocate_rcb(struct rkvdec_dev *rkvdec, struct rkvdec_core *core,
> > > +			u32 width, u32 height,
> > >   			const struct rcb_size_info *size_info,
> > >   			size_t rcb_count);
> > >   dma_addr_t rkvdec_rcb_buf_dma_addr(struct rkvdec_ctx *ctx, int id);
> > >   size_t rkvdec_rcb_buf_size(struct rkvdec_ctx *ctx, int id);
> > >   int rkvdec_rcb_buf_count(struct rkvdec_ctx *ctx);
> > >   bool rkvdec_rcb_buf_validate_size(struct rkvdec_ctx *ctx);
> > > -void rkvdec_free_rcb(struct rkvdec_core *core);
> > > +void rkvdec_free_rcb(struct rkvdec_dev *rkvdec, struct rkvdec_core *core);
> > > diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec.c b/drivers/media/platform/rockchip/rkvdec/rkvdec.c
> > > index c2818f1575ef..2930e9b64906 100644
> > > --- a/drivers/media/platform/rockchip/rkvdec/rkvdec.c
> > > +++ b/drivers/media/platform/rockchip/rkvdec/rkvdec.c
> > > @@ -1204,9 +1204,9 @@ static void rkvdec_device_run(void *priv)
> > >   	}
> > >   
> > >   	if (!rkvdec_rcb_buf_validate_size(ctx)) {
> > > -		rkvdec_free_rcb(ctx->core);
> > > +		rkvdec_free_rcb(ctx->dev, ctx->core);
> > >   
> > > -		ret = rkvdec_allocate_rcb(ctx->core,
> > > +		ret = rkvdec_allocate_rcb(ctx->dev, ctx->core,
> > >   					  ctx->decoded_fmt.fmt.pix_mp.width,
> > >   					  ctx->decoded_fmt.fmt.pix_mp.height,
> > >   					  ctx->dev->variant->rcb_sizes,
> > > @@ -1486,6 +1486,7 @@ static void rkvdec_v4l2_cleanup(struct rkvdec_dev *rkvdec)
> > >   
> > >   static void rkvdec_iommu_restore(struct rkvdec_core *core)
> > >   {
> > > +	int ret;
> > >   	if (core->empty_domain) {
> > >   		/*
> > >   		 * To rewrite mapping into the attached IOMMU core, attach a new empty domain that
> > > @@ -1494,8 +1495,14 @@ static void rkvdec_iommu_restore(struct rkvdec_core *core)
> > >   		 * This is safely done in this interrupt handler to make sure no memory get mapped
> > >   		 * through the IOMMU while the empty domain is attached.
> > >   		 */
> > > -		iommu_attach_device(core->empty_domain, core->dev);
> > > +		iommu_detach_device(core->curr_ctx->dev->iommu_global_domain, core->dev);
> > > +		ret = iommu_attach_device(core->empty_domain, core->dev);
> > > +		if (ret)
> > > +			dev_warn(core->dev, "Cannot attach empty domain: %d\n", ret);
> > >   		iommu_detach_device(core->empty_domain, core->dev);
> > > +		ret = iommu_attach_device(core->curr_ctx->dev->iommu_global_domain, core->dev);
> > > +		if (ret)
> > > +			dev_warn(core->dev, "Cannot attach global domain: %d\n", ret);
> > >   	}
> > >   }
> > >   
> > > @@ -1858,6 +1865,8 @@ static int rkvdec_probe(struct platform_device *pdev)
> > >   
> > >   	core = &rkvdec->cores[rkvdec->core_count++];
> > >   
> > > +	core->id = rkvdec->core_count - 1;
> > > +
> > >   	platform_set_drvdata(pdev, rkvdec);
> > >   	core->dev = &pdev->dev;
> > >   	INIT_DELAYED_WORK(&core->watchdog_work, rkvdec_watchdog_func);
> > > @@ -1883,12 +1892,24 @@ static int rkvdec_probe(struct platform_device *pdev)
> > >   			return PTR_ERR(core->link);
> > >   	}
> > >   
> > > -	core->iommu_domain = iommu_get_domain_for_dev(&pdev->dev);
> > > -	if (core->iommu_domain) {
> > > +	if (iommu_get_domain_for_dev(&pdev->dev)) {
> > >   		core->empty_domain = iommu_paging_domain_alloc(core->dev);
> > >   
> > > -		if (!core->empty_domain)
> > > +		if (IS_ERR(core->empty_domain))
> > >   			dev_warn(core->dev, "cannot alloc new empty domain\n");
> > > +
> > > +		if (!rkvdec->iommu_global_domain) {
> > > +			rkvdec->iommu_global_domain = iommu_get_domain_for_dev(core->dev);
> > > +
> > > +			if (IS_ERR(rkvdec->iommu_global_domain)) {
> > > +				rkvdec->iommu_global_domain = NULL;
> > > +				dev_warn_once(core->dev, "cannot alloc new global domain\n");
> > > +			}
> > > +		}
> > > +
> > > +		ret = iommu_attach_device(rkvdec->iommu_global_domain, core->dev);
> > > +		if (ret)
> > > +			dev_warn(core->dev, "cannot attach global domain to core %d\n", core->id);
> > >   	}
> > >   
> > >   	ret = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
> > > @@ -1961,7 +1982,7 @@ static void rkvdec_remove(struct platform_device *pdev)
> > >   		if (rkvdec->cores[i].empty_domain)
> > >   			iommu_domain_free(rkvdec->cores[i].empty_domain);
> > >   
> > > -		rkvdec_free_rcb(&rkvdec->cores[i]);
> > > +		rkvdec_free_rcb(rkvdec, &rkvdec->cores[i]);
> > >   	}
> > >   }
> > >   
> > > diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec.h b/drivers/media/platform/rockchip/rkvdec/rkvdec.h
> > > index 4f042a367dc0..ccd766b220c7 100644
> > > --- a/drivers/media/platform/rockchip/rkvdec/rkvdec.h
> > > +++ b/drivers/media/platform/rockchip/rkvdec/rkvdec.h
> > > @@ -135,7 +135,6 @@ struct rkvdec_core {
> > >   	void __iomem *link;
> > >   	struct delayed_work watchdog_work;
> > >   	struct gen_pool *sram_pool;
> > > -	struct iommu_domain *iommu_domain;
> > >   	struct iommu_domain *empty_domain;
> > >   	struct rkvdec_rcb_config *rcb_config;
> > >   	struct rkvdec_ctx *curr_ctx;
> > > @@ -155,6 +154,7 @@ struct rkvdec_dev {
> > >   	unsigned int available_core_count;
> > >   	spinlock_t cores_lock; /* serializes core list access */
> > >   	struct rkvdec_core *main_core;
> > > +	struct iommu_domain *iommu_global_domain;
> > >   };
> > >   
> > >   struct rkvdec_ctx {


* Re: [PATCH 2/3] nvmem: Add the Raspberry Pi OTP driver
From: Stefan Wahren @ 2026-04-09 15:21 UTC (permalink / raw)
  To: Gregor Herburger
  Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Florian Fainelli,
	Ray Jui, Scott Branden, Broadcom internal kernel review list,
	Srinivas Kandagatla, devicetree, linux-rpi-kernel,
	linux-arm-kernel, linux-kernel
In-Reply-To: <addd5ZpuUdKBV7Bn@gregor-framework>

On 09.04.26 at 10:05, Gregor Herburger wrote:
> On Wed, Apr 08, 2026 at 10:03:47PM +0200, Stefan Wahren wrote:
>> On 08.04.26 at 21:47, Gregor Herburger wrote:
>>> Hi Stefan,
>>>
>>> thanks for the review.
>>>> Is there any reason, why we cannot register this driver in
>>>> rpi_firmware_probe() like hwmon and clk driver?
>>>>
>>>> I like to avoid the complete dt-binding from patch 1.
>>> The private OTP registers are not available on all Raspberries. Afaik
>>> only on 4 and 5. So I think these registers must be described through
>>> the device tree. Therefore the bindings are needed.
>> This binding doesn't represent some kind of hardware, it's just some
>> firmware interface. A proper DT binding would describe the MMIO address
>> range for OTP access.
> I think it does represent real hardware. Although it is hidden through the
> firmware. Not all hardware must be MMIO addresses.
>
> The only driver that does not have a DT node is the hwmon driver. All
> other drivers (clock, gpio, touchscreen, reset, pwm) do have a DT
> binding. Looking at the comment in rpi_register_clk_driver this
> seems to be some legacy behaviour for older DTs for the clock driver.
There is a long history of different approaches to implementing the
VideoCore firmware interface for the Raspberry Pi, and not all of them
are good from today's perspective.

One big problem with DT bindings is that the kernel must stay compatible
with all mainline DTS versions. This sounds trivial, but it's not, since
we cannot assume that the kernel and DTB are updated at the same time. So
we need to keep these bad solutions from the past.
>> If you need some distinction between the Raspberry Pi generations there are
>> firmware tags to do this.
> So what is your suggestion? What tags do you mean?
Your driver already uses firmware tags to access the OTPs via firmware.
You can ask the Raspberry Pi guys how to make the distinction in an
efficient/maintainable way.

My suggestion would be to look at 
https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface#get-board-model

and

https://github.com/u-boot/u-boot/blob/master/board/raspberrypi/rpi/rpi.c#L95

The compatible "raspberrypi,bcm2712-firmware" approach is more
straightforward, but requires a newer DTB. See above.
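
As a rough illustration of the firmware-based distinction, here is a
minimal userspace sketch. It assumes the new-style board revision code
layout published in the Raspberry Pi documentation (bit 23 marks the new
style, bits 12-15 encode the SoC); whether that is the right tag for this
driver is an assumption, and the helper names are hypothetical. That only
BCM2711 and BCM2712 carry the private OTP registers is Gregor's "afaik"
from above, not a confirmed fact.

```c
#include <stdbool.h>
#include <stdint.h>

/* SoC field values of the new-style revision code (bits 12-15). */
enum rpi_soc {
	RPI_SOC_UNKNOWN = -1,
	RPI_SOC_BCM2835 = 0,
	RPI_SOC_BCM2836 = 1,
	RPI_SOC_BCM2837 = 2,
	RPI_SOC_BCM2711 = 3,
	RPI_SOC_BCM2712 = 4,
};

/* Hypothetical helper: extract the SoC field, or UNKNOWN for old-style codes. */
static int rpi_revision_soc(uint32_t rev)
{
	if (!(rev & (1u << 23)))
		return RPI_SOC_UNKNOWN;

	return (rev >> 12) & 0xf;
}

/* Hypothetical policy: only Pi 4 / Pi 5 class SoCs get the OTP device. */
static bool rpi_maybe_has_private_otp(uint32_t rev)
{
	int soc = rpi_revision_soc(rev);

	return soc == RPI_SOC_BCM2711 || soc == RPI_SOC_BCM2712;
}
```

A check like this could gate registering the OTP device from
rpi_firmware_probe() without needing a new DT node.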

Best regards



* Re: [PATCH v2 1/3] arm64: mm: Fix rodata=full block mapping support for realm guests
From: Catalin Marinas @ 2026-04-09 15:20 UTC (permalink / raw)
  To: Kevin Brodsky
  Cc: Ryan Roberts, Will Deacon, David Hildenbrand (Arm), Dev Jain,
	Yang Shi, Suzuki K Poulose, Jinjiang Tu, linux-arm-kernel,
	linux-kernel, stable
In-Reply-To: <567dff89-9f0f-40a0-ab10-22e061b4faaf@arm.com>

On Thu, Apr 09, 2026 at 11:53:41AM +0200, Kevin Brodsky wrote:
> On 07/04/2026 12:52, Catalin Marinas wrote:
> >> if we have forced pte mapping then the value of
> >> can_set_direct_map() is irrelevant - we will never need to split because we are
> >> already pte-mapped.
> >
> > can_set_direct_map() is used in other places, so its value is
> > relevant, e.g. sys_memfd_secret() is rejected if this function returns
> > false.
> 
> Indeed, I have noticed this before: currently set_direct_map_*_noflush()
> and other functions will either fail or do nothing if none of the
> features (rodata=full, etc.) is enabled, even if we would be able to
> split the linear map using BBML2-noabort.

That's what I have been trying to say to Ryan ;), can_set_direct_map()
has different meanings depending on the caller: hint that it might split
or asking whether splitting is permitted. The latter is not captured.
Ignoring realms, if we have BBML2_NOABORT the kernel won't force pte
mappings under the assumption that split_kernel_leaf_mapping() is safe.
However set_direct_map_*_noflush() won't even reach the split function
because the "can" part says "no, you can't".

> What would make more sense to me is to enable the use of BBML2-noabort
> unconditionally if !force_pte_mapping(). We can then have
> can_set_direct_map() return true if we have BBML2-noabort, and we no
> longer need to check it in map_mem().

Indeed.

> This is a functional change that doesn't have anything to do with realms
> so it should probably be a separate series - happy to take care of it
> once the dust settles on the realm handling.

I think it can be done in parallel, it shouldn't interfere with realms.
The realm part should just affect force_pte_mapping() and
can_set_direct_map() should return just what's possible, not what may
need to set the direct map.
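
To make the two meanings concrete, a minimal userspace sketch follows.
The struct and both helper names are hypothetical, not kernel code, and
the conditions are simplified from this thread:

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the linear map configuration. */
struct linear_map_cfg {
	bool rodata_full;	/* a feature that wants per-page permission changes */
	bool force_pte;		/* linear map is already built from ptes */
	bool bbml2_noabort;	/* block mappings can be split safely at runtime */
};

/* "Hint" meaning: is some feature enabled that may change the direct map? */
static bool may_need_split(const struct linear_map_cfg *c)
{
	return c->rodata_full;
}

/* "Permission" meaning: is changing the direct map possible at all? */
static bool split_permitted(const struct linear_map_cfg *c)
{
	return c->force_pte || c->bbml2_noabort;
}
```

With only BBML2_NOABORT available, may_need_split() is false while
split_permitted() is true, which is the case that callers such as
set_direct_map_*_noflush() currently get wrong.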

-- 
Catalin


* Re: [PATCH] mm/arm: pgtable: remove young bit check for pte_valid_user
From: Brian Ruley @ 2026-04-09 15:17 UTC (permalink / raw)
  To: Will Deacon
  Cc: Russell King, Steve Capper, Russell King, linux-arm-kernel,
	linux-kernel
In-Reply-To: <adewJetUv6wMiz9o@willie-the-truck>

On Apr 09, Will Deacon wrote:
> 
> On Thu, Apr 09, 2026 at 03:54:45PM +0300, Brian Ruley wrote:
> > Fixes cache desync, which can cause undefined instruction,
> > translation and permission faults under heavy memory use.
> >
> > This is an old bug introduced in commit 1971188aa196 ("ARM: 7985/1: mm:
> > implement pte_accessible for faulting mappings"), which included a check
> > for the young bit of a PTE. The underlying assumption was that old pages
> > are not cached, therefore, `__sync_icache_dcache' could be skipped
> > entirely.
> >
> > However, under extreme memory pressure, page migrations happen
> > frequently and the assumption of uncached "old" pages does not hold.
> > Especially for systems that do not have swap, the migrated pages are
> > unequivocally marked old. This presents a problem, as it is possible
> > for the original page to be immediately mapped to another VA that
> > happens to share the same cache index in VIPT I-cache (we found this
> > bug on Cortex-A9). Without cache invalidation, the CPU will see the
> > old mapping whose physical page can now be used for a different
> > purpose, as illustrated below:
> >
> >                 Core                      Physical Memory
> >   +-------------------------------+     +------------------+
> >   | TLB                           |     |                  |
> >   |  VA_A 0xb6e6f -> pfn_q        |     | pfn_q: code      |
> >   +-------------------------------+     +------------------+
> >   | I-cache                       |
> >   |  set[VA_A bits] | tag=pfn_q   |
> >   +-------------------------------+
> >
> > migrate (kcompactd):
> >   1. copy pfn_q --> pfn_r
> >   2. free pfn_q
> >   3. pte: VA_A -> pfn_r
> >   4. pte_mkold(pte) --> !young
> >   5. ICIALLUIS skipped (because !young)
> >
> > pfn_q reused (OOM pressure):
> >   pte: VA_B -> pfn_q (different code)
> >
> > bug:
> >                 Core                      Physical Memory
> >   +-------------------------------+     +------------------+
> >   | TLB (empty)                   |     | pfn_r: old code  |
> >   +-------------------------------+     | pfn_q: new code  |
> >   | I-cache                       |     +------------------+
> >   |  set[VA_A bits] | tag=pfn_q   |<--- wrong instructions
> >   +-------------------------------+
> 
> (nit: Do you have pfn_r and pfn_q mixed up in the "Physical Memory" box?)

No, I don't think so. The intent was to show that whatever was copied
from pfn_q is now in pfn_r while the old page (pfn_q) is now mapped to
VA_B with new code/data. Maybe a classic case of poor naming on my part
here. :-)

> 
> > This was verified on ba16-based board (i.MX6Quad/Dual, Cortex-A9) by
> > instrumenting the migration code to track recently migrated pages in a
> > ring buffer and then dumping them in the undefined instruction fault
> > handler. The bug can be triggered with `stress-ng':
> >
> >   stress-ng --vm 4 --vm-bytes 2G --vm-method zero-one --verify
> >
> > Note that the system we tested on has only 2G of memory, so the test
> > triggered the OOM-killer in our case.
> >
> > Fixes: 1971188aa196 ("ARM: 7985/1: mm: implement pte_accessible for faulting mappings")
> > Signed-off-by: Brian Ruley <brian.ruley@gehealthcare.com>
> > ---
> >  arch/arm/include/asm/pgtable.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> > index 6fa9acd6a7f5..e3a5b4a9a65f 100644
> > --- a/arch/arm/include/asm/pgtable.h
> > +++ b/arch/arm/include/asm/pgtable.h
> > @@ -185,7 +185,7 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
> >  #define pte_exec(pte)                (pte_isclear((pte), L_PTE_XN))
> >
> >  #define pte_valid_user(pte)  \
> > -     (pte_valid(pte) && pte_isset((pte), L_PTE_USER) && pte_young(pte))
> > +     (pte_valid(pte) && pte_isset((pte), L_PTE_USER))
> 
> This patch is from twelve years ago, so please forgive me for having
> forgotten all of the details. However, my recollection is that when using
> the classic/!lpae format (as you will be on Cortex-A9), page aging is
> implemented by using invalid (translation faulting) ptes for 'old'
> mappings.
> 
> So in the case you describe, we may well elide the I-cache maintenance,
> but won't we also put down an invalid pte? If we later take a fault
> on that, we should then perform the cache maintenance when installing
> the young entry (via ptep_set_access_flags()). The more interesting part
> is probably when the mapping for 'VA_B' is installed to map 'pfn_q' but,
> again, I would've expected the cache maintenance to happen just prior to
> installing the valid (young) mapping.
> 
> Please can you help me to understand the problem better?
> 
> Will

Hi,

I am by no means a domain expert either, so I'll defer to your
judgement. That said, I believe what you said is correct, and the
expectation is that we will later fault and then flush the cache.

However, in the case I describe, if VA_B is mapped to pfn_q immediately
after it has been unmapped and freed for VA_A, then it's quite possible
that the page is still indexed in the cache. The hypothesis is that if
VA_A and VA_B land in the same I-cache set and VA_A's old cache entry
still exists (tagged with pfn_q), then the CPU can fetch stale
instructions because the tag will match. That's one reason why we need
to invalidate the cache, but that will be skipped in the path:

    migrate_pages
     migrate_pages_batch
      migrate_folio_move
       remove_migration_ptes
        remove_migration_pte
         set_pte_at
          set_ptes
           __sync_icache_dcache  (skipped if !young)
            set_pte_ext

And migrated pages are always marked old:

mm/migrate.c=static bool remove_migration_pte(struct folio *folio,
mm/migrate.c:           if (!softleaf_is_migration_young(entry))
mm/migrate.c:                   pte = pte_mkold(pte);

include/linux/leafops.h:
static inline bool softleaf_is_migration_young(softleaf_t entry)
{
        VM_WARN_ON_ONCE(!softleaf_is_migration(entry));

        if (migration_entry_supports_ad())
                return swp_offset(entry) & SWP_MIG_YOUNG;
        /* Keep the old behavior of aging page after migration */
        return false;
}

I might be misunderstanding something; this took us a while to figure
out. But the patch seems to work for us. I hope I've explained it a bit
better now.
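
To illustrate only the predicate change, here is a small userspace model.
The bit positions and all names are made up for the sketch (the real
L_PTE_* values differ), and it does not model the cache maintenance
itself:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; not the real L_PTE_* encoding. */
#define MODEL_PTE_VALID	(1u << 0)
#define MODEL_PTE_YOUNG	(1u << 1)
#define MODEL_PTE_USER	(1u << 2)

typedef uint32_t model_pte_t;

/* Model of pte_mkold(): clear the young bit. */
static model_pte_t model_pte_mkold(model_pte_t pte)
{
	return pte & ~MODEL_PTE_YOUNG;
}

/* Predicate before the patch: an old pte is not a "valid user" pte. */
static bool model_valid_user_before(model_pte_t pte)
{
	return (pte & MODEL_PTE_VALID) && (pte & MODEL_PTE_USER) &&
	       (pte & MODEL_PTE_YOUNG);
}

/* Predicate after the patch: the young bit no longer matters. */
static bool model_valid_user_after(model_pte_t pte)
{
	return (pte & MODEL_PTE_VALID) && (pte & MODEL_PTE_USER);
}
```

For a migrated pte that has gone through the mkold step, the "before"
predicate is false, so the __sync_icache_dcache() call guarded by it is
skipped, while the "after" predicate is true.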

Best regards,
Brian



* Re: [PATCH 0/4] arm64: mitigate CVE-2024-7881 in the absence of firmware mitigation
From: Will Deacon @ 2026-04-09 15:13 UTC (permalink / raw)
  To: David Woodhouse
  Cc: catalin.marinas, joey.gouly, kvmarm, linux-arm-kernel,
	mark.rutland, maz, oliver.upton, suzuki.poulose, yuzenghui
In-Reply-To: <3c9c567f7a6e926e8ec24913d564ec55e8f450d2.camel@infradead.org>

On Wed, Apr 08, 2026 at 10:26:10PM +0100, David Woodhouse wrote:
> On Mon, 17 Mar 2025 at 21:26:12 +0000, Will Deacon wrote:
> > I'm really not comfortable with this series and would prefer to see it
> > dropped while we continue the discussion, especially as it's causing
> > minor conflicts with the KVM/arm64 tree in -next.
> > 
> > ...
> > 
> > To be clear: I'm not at all against mitigating this problem and
> > advertising the status of that mitigation. I *am* against quietly
> > handling it like a CPU erratum whilst simultaneously telling userspace
> > that meltdown is not a problem regardless of the mitigation state.
> 
> Was there a conclusion? KVM still isn't even exposing
> SMCCC_ARCH_WORKAROUND_4 to guests...

I didn't see another version, so I think my comments (from a year ago!)
still stand.

Will



* [GIT PULL] pmdomain fixes for v7.0-rc8
From: Ulf Hansson @ 2026-04-09 15:09 UTC (permalink / raw)
  To: Linus, linux-pm, linux-kernel; +Cc: Ulf Hansson, linux-arm-kernel

Hi Linus,

Here's a pull-request with a couple of pmdomain/firmware fixes intended for
v7.0-rc8. I have also included a patch to update my email in MAINTAINERS and
mailmap. Details about the highlights are as usual found in the signed tag.

Please pull this in!

Kind regards
Ulf Hansson


The following changes since commit 7aaa8047eafd0bd628065b15757d9b48c5f9c07d:

  Linux 7.0-rc6 (2026-03-29 15:40:00 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm.git tags/pmdomain-v7.0-rc6

for you to fetch changes up to c2812c0cb909211a1d2e7cec862406e32833b9de:

  MAINTAINERS, mailmap: Change Ulf Hansson's email (2026-04-07 14:17:48 +0200)

----------------------------------------------------------------
pmdomain providers:
 - imx: Prevent hang at power down for imx8mp-blk-ctrl

firmware:
 - thead: Fix buffer overflow for TH1520 AON driver

MAINTAINERS, mailmap:
 - Change Ulf Hansson's email

----------------------------------------------------------------
Jacky Bai (1):
      pmdomain: imx8mp-blk-ctrl: Keep the NOC_HDCP clock enabled

Michal Wilczynski (1):
      firmware: thead: Fix buffer overflow and use standard endian macros

Ulf Hansson (1):
      MAINTAINERS, mailmap: Change Ulf Hansson's email

 .mailmap                                        |  2 +
 MAINTAINERS                                     | 14 ++---
 drivers/firmware/thead,th1520-aon.c             |  7 +--
 drivers/pmdomain/imx/imx8mp-blk-ctrl.c          |  8 +--
 include/linux/firmware/thead/thead,th1520-aon.h | 74 -------------------------
 5 files changed, 13 insertions(+), 92 deletions(-)



* Re: [EXTERNAL] [PATCH 2/3] KVM: arm64: vgic: Allow userspace to set IIDR revision 1
From: David Woodhouse @ 2026-04-09 15:01 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm@vger.kernel.org, shuah@kernel.org, yuzenghui@huawei.com,
	joey.gouly@arm.com, linux-kernel@vger.kernel.org,
	catalin.marinas@arm.com, nathan@kernel.org, pbonzini@redhat.com,
	kvmarm@lists.linux.dev, arnd@arndb.de, kees@kernel.org,
	will@kernel.org, suzuki.poulose@arm.com, oupton@kernel.org,
	rananta@google.com, linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com
In-Reply-To: <86v7e02qoq.wl-maz@kernel.org>

On Thu, 2026-04-09 at 14:45 +0100, Marc Zyngier wrote:
> On Wed, 08 Apr 2026 09:39:15 +0100,
> "Woodhouse, David" <dwmw@amazon.co.uk> wrote:
> > 
> > What if the guest boots under a new host kernel and finds the group
> > registers are writable, and then is live migrated to an old host kernel
> > on which they are not?
> 
> That's your problem. KVM/arm64 never supported downgrading.

Again, I don't know why you're saying this. It isn't true, and *can't*
be true if KVM/arm64 is going to be anything more than a toy.

> Not to mention that there is no valid GIC implementation that has RO
> group registers. All you are doing is to inflict a hypervisor bug on
> unsuspecting guests, for no good reason.

It's not about "inflicting a hypervisor bug". It's about preserving the
exact same environment that those millions of guests already *have*
instead of taking the risk of changing things underneath them. And
giving us a *path* to cleanly upgrading for new launches.

> > What about hibernation, if the *boot* kernel in the guest configures
> > the groups, but then transfers control back to the resumed guest kernel
> > which had not?
> 
> > A guest that doesn't configure the groups cannot expect anything to
> work. You'd have the exact same problem on bare-metal.
> 
> > 
> > > So what is this *really* fixing?
> > 
> > I look at that question the other way round.
> > 
> > KVM has an established procedure for allowing userspace to control
> > guest-visible changes, using the IIDR. First the host kernel which
> > *supports* the change is rolled out, and only then does the VMM start
> > to enable it for new launches.
> > 
> > Even if we can address the questions above, and even if we can convince
> > ourselves that those are the *only* questions to ask... why not follow
> > the normal, safe, procedure? Especially given that there is already an
> > IIDR value which corresponds to it.
> > 
> > We don't *have* to YOLO it... and I don't want to :)
> 
> That's hardly an argument, is it?

Er... yes, yes really it is an argument. I don't want to randomly
inflict a device model change on *running* guests, and when they resume
from hibernation, when I can preserve the existing behaviour.

It's weird enough that you claim that KVM doesn't support downgrading;
now you seem to be claiming that it doesn't support retaining
compatibility when *upgrading* either!

The ability to set IIDR like this was explicitly *designed* for this
purpose, wasn't it? Why on earth would you object to being able to set
it to KVM_VGIC_IMP_REV_1?





* Re: [PATCH v7 7/7] KVM: arm64: Normalize cache configuration
From: David Woodhouse @ 2026-04-09 14:51 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: akihiko.odaki, Gutierrez Cantu, Bernardo, alexandru.elisei,
	alyssa, asahi, broonie, catalin.marinas, james.morse, kvmarm,
	kvmarm, linux-arm-kernel, linux-kernel, marcan, mathieu.poirier,
	oliver.upton, suzuki.poulose, sven, will
In-Reply-To: <86wlyg2r3i.wl-maz@kernel.org>

On Thu, 2026-04-09 at 14:36 +0100, Marc Zyngier wrote:
> On Thu, 09 Apr 2026 13:25:24 +0100,
> David Woodhouse <dwmw2@infradead.org> wrote:
> > 
> > On Thu, 12 Jan 2023 at 11:38:52 +0900, Akihiko Odaki wrote:
> > > Before this change, the cache configuration of the physical CPU was
> > > exposed to vcpus. This is problematic because the cache configuration a
> > > vcpu sees varies when it migrates between vcpus with different cache
> > > configurations.
> > > 
> > > Fabricate cache configuration from the sanitized value, which holds the
> > > CTR_EL0 value the userspace sees regardless of which physical CPU it
> > > resides on.
> > > 
> > > CLIDR_EL1 and CCSIDR_EL1 are now writable from the userspace so that
> > > the VMM can restore the values saved with the old kernel.
> > 
> > (commit 7af0c2534f4c5)
> > 
> > How does the VMM set the values that the old kernel would have set?
> 
> By reading them at the source?

Yes, but the VMM in EL0 *can't* read the source, can it?

> > Let's say we're deploying a kernel with this change for the first time,
> > and we need to ensure that we provide a consistent environment to
> > guests, which can be live migrated back to an older host.
> 
> We have never guaranteed host downgrade. It almost never works.

Huh? Host downgrade absolutely *does* work; KVM on Arm isn't a toy.
We'd never be able to ship a new kernel if we didn't know we could roll
it *back* if we needed to.

I *know* that you know perfectly well that we actively test host
downgrades prior to every rollout of a new kernel. So I'm a little
confused about what you're trying to say here, Marc.

> > So for new launches, we need to provide the values that the old kernel
> > *would* have provided to the guest. A new launch isn't a migration;
> > there are no "values saved with the old kernel".
> 
> And you can provide these values.

If I know them, sure. But I don't, because:

> > Userspace can't read the CLIDR_EL1 and CCSIDR_EL1 registers directly,
> > and AFAICT not everything we need to reconstitute them is in sysfs. How
> > is this supposed to work?
> > 
> > Shouldn't this change have been made as a capability that the VMM can
> > explicitly opt in or out of? Environments that don't do cross-CPU
> > migration absolutely don't care about, and actively don't *want*, the
> > sanitisation that this commit inflicted on us, surely?
> 
> I don't think a capability buys you anything. You want to expose
> something to the guest? Make it so. You are in the favourable
> situation to completely own the HW and the VMM.

Now the values are writable, but userspace can't easily see *what*
they would have been in the previous kernel. So having the capability
to ask KVM to set them to those values seems like it might be useful.

> > Am I missing something?
> 
> That you had over 3 years to voice your concern, and did nothing?

I was looking more for technical input about how I might determine the
original values to retain compatibility... but yes, I have given
feedback that merely *reverting* the offending commit in previous
kernel upgrades without actually doing anything to fix it properly
wasn't the best choice.
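
For readers following along, the one input that *is* visible here is
CTR_EL0. A minimal sketch of decoding its line-size fields is below,
assuming the architectural layout (IminLine in bits [3:0], DminLine in
bits [19:16], each the log2 of the line size in 4-byte words). The helper
names are hypothetical, and this shows neither how KVM fabricates
CCSIDR_EL1 nor how to recover the values an older kernel exposed:

```c
#include <stdint.h>

/* Smallest I-cache line size in bytes, from CTR_EL0.IminLine (bits [3:0]). */
static unsigned int ctr_iminline_bytes(uint64_t ctr)
{
	return 4u << (ctr & 0xf);
}

/* Smallest D-cache line size in bytes, from CTR_EL0.DminLine (bits [19:16]). */
static unsigned int ctr_dminline_bytes(uint64_t ctr)
{
	return 4u << ((ctr >> 16) & 0xf);
}
```

For a CTR_EL0 value with both fields set to 4, each helper yields a
64-byte line.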


* Re: [PATCH v10 16/20] coresight: Add PM callbacks for sink device
From: Leo Yan @ 2026-04-09 14:49 UTC (permalink / raw)
  To: James Clark
  Cc: Suzuki K Poulose, coresight, linux-arm-kernel, Yeoreum Yun,
	Mark Rutland, Will Deacon, Yabin Cui, Keita Morisaki,
	Yuanfang Zhang, Greg Kroah-Hartman, Alexander Shishkin,
	Tamas Petz, Thomas Gleixner, Peter Zijlstra, Mike Leach
In-Reply-To: <ffe95710-cd52-484e-821e-a3d92ba17c33@linaro.org>

On Thu, Apr 09, 2026 at 03:30:56PM +0100, James Clark wrote:

[...]

> > > > @@ -1759,16 +1760,36 @@ static int coresight_pm_check(struct
> > > > coresight_path *path)
> > > >       if (source_has_cb)
> > > >           return 1;
> > > > +    sink_has_cb = coresight_ops(sink)->pm_save_disable &&
> > > > +              coresight_ops(sink)->pm_restore_enable;
> > > > +    /*
> > > > +     * It is not permitted that the source has no callbacks
> > > > while the sink
> > > > +     * does, as the sink cannot be disabled without disabling
> > > > the source,
> > > > +     * which may lead to lockups. Alternatively, the ETM driver should
> > > > +     * enable self-hosted PM mode at probe (see etm4_probe()).
> > > > +     */
> > > > +    if (sink_has_cb) {
> > > > +        pr_warn_once("coresight PM failed: source has no PM
> > > > callbacks; "
> > > > +                 "cannot safely control sink\n");
> > > 
> > > This prints out on my Orion board on a fresh boot because of how
> > > pm_save_enable is setup there. Do we really need the configuration
> > > of pm_save_enable for ETE/TRBE if we know that it always needs
> > > saving?

Yeah, I can remove this check and always bind CPU PM ops for ETE.

> > > It also stops warning if I rmmod and modprobe the module after
> > > booting. Seems like pm_save_enable is different depending on how the
> > > module is loaded which doesn't seem right.
> > 
> > Thats because the warning is pr_warn_*once*()
> 
> I don't think so, I tested it with a printf instead of a warn once and also
> tested modprobeing straight after a reboot.

I am a bit surprised that Orion6 hits the CPU idle flow, as I observed
that idle states are not enabled on my board:

  # ls /sys/devices/system/cpu/cpu*/cpuidle
  ls: cannot access '/sys/devices/system/cpu/cpu*/cpuidle': No such file or directory

If you hit the CPU idle notifier only once, it would be good to add a
dump_stack() in coresight_cpu_pm_notify and print the "cmd" argument,
so we can see where the call is coming from.  I suspect it might be a
glitch in the CPUIdle layer.
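
A rough userspace sketch of what such a debug hook could print is below.
The enum mirrors the names of the kernel's enum cpu_pm_event, but its
numeric values are an assumption here, and cpu_pm_cmd_name() and
format_pm_notify_debug() are hypothetical helpers. In the driver itself
this would be a pr_info() followed by dump_stack().

```c
#include <stdio.h>

/*
 * Userspace mirror of the kernel's enum cpu_pm_event. Only the names are
 * relied on; the numeric values are an assumption of this sketch.
 */
enum cpu_pm_event_model {
	CPU_PM_ENTER,
	CPU_PM_ENTER_FAILED,
	CPU_PM_EXIT,
	CPU_CLUSTER_PM_ENTER,
	CPU_CLUSTER_PM_ENTER_FAILED,
	CPU_CLUSTER_PM_EXIT,
};

/* Translate a notifier "cmd" into a readable name for the log line. */
const char *cpu_pm_cmd_name(unsigned long cmd)
{
	switch (cmd) {
	case CPU_PM_ENTER:
		return "CPU_PM_ENTER";
	case CPU_PM_ENTER_FAILED:
		return "CPU_PM_ENTER_FAILED";
	case CPU_PM_EXIT:
		return "CPU_PM_EXIT";
	case CPU_CLUSTER_PM_ENTER:
		return "CPU_CLUSTER_PM_ENTER";
	case CPU_CLUSTER_PM_ENTER_FAILED:
		return "CPU_CLUSTER_PM_ENTER_FAILED";
	case CPU_CLUSTER_PM_EXIT:
		return "CPU_CLUSTER_PM_EXIT";
	default:
		return "unknown";
	}
}

/*
 * Build the line a debug hook at the top of the notifier could log.
 * Returns what snprintf() returns.
 */
int format_pm_notify_debug(char *buf, size_t len, int cpu, unsigned long cmd)
{
	return snprintf(buf, len, "coresight_cpu_pm_notify: cpu=%d cmd=%lu (%s)",
			cpu, cmd, cpu_pm_cmd_name(cmd));
}
```

Logging the symbolic name next to the raw value makes it easy to tell a
CPU level event from a cluster level one when reading the backtrace.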

Thanks,
Leo



* Re: [PATCH 7/7] media: rkvdec: Add multicore IOMMU support
From: Detlev Casanova @ 2026-04-09 14:49 UTC (permalink / raw)
  To: Nicolas Dufresne, Mauro Carvalho Chehab, Ezequiel Garcia,
	Heiko Stuebner, Hans Verkuil, Jonas Karlman
  Cc: kernel, linux-media, linux-kernel, linux-rockchip,
	linux-arm-kernel
In-Reply-To: <517dc6e9de0e284b3dc22952d9479616e6c8ec62.camel@collabora.com>

Hi Nicolas,

On 4/9/26 10:19, Nicolas Dufresne wrote:
> Le jeudi 09 avril 2026 à 09:50 -0400, Detlev Casanova a écrit :
>> As each core has its own IOMMU core, buffers must be mapped in each
>> core's IOMMU so that any run() call can use any core without having to
>> remap everything.
>>
>> To do that, we use rockchip iommu domain's iommu devices list.
>> With that, one IOMMU domain can be mapped on multiple devices, meaning
>> that each call to iommu_map() will flush the new mapping on all devices
>> in the list.
>>
>> The IOMMU domain that will have all devices in its list is the first
>> core's default domain.
>>
>> Another domain cannot be used because VB2 allocates buffers through the
>> DMA engine, which uses iommu_get_dma_domain() to find the domain to map
>> buffers through.
>>
>> The IOMMU restore function can still work as before, but needs to be more
>> explicit in what domain to attach the device to.
>> That is because detaching the empty domain will reattach the core's default
>> domain, which is wrong (except for the first "main" core).
>>
>> The RCB temporary buffers are allocated in a dedicated SRAM, each
>> core has its own SRAM, so the mapping for each core's SRAM is added in the
>> global domain.
>>
>> Everything else is mapped through the first core's default domain, making
>> the driver write the mappings on both IOMMU cores.
> Just raising an issue with the patch ordering here. I'm worried in a git bisect,
> the driver will be broken until we apply this last patch. Can we make sure that
> the driver bisects ? (or tell me if I'm wrong)
No, you're right. That's also why I mentioned it in the cover letter. 
This commit is also not too big, so I'll merge the last 3 commits and 
adapt the commit message to keep the information from the 3 (the commit 
messages are why I wanted to keep them separate, but there is no good 
way to keep them that way)

Detlev.
>
>> Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
>> ---
>>   .../media/platform/rockchip/rkvdec/rkvdec-rcb.c    | 21 ++++++-------
>>   .../media/platform/rockchip/rkvdec/rkvdec-rcb.h    |  6 ++--
>>   drivers/media/platform/rockchip/rkvdec/rkvdec.c    | 35 +++++++++++++++++-----
>>   drivers/media/platform/rockchip/rkvdec/rkvdec.h    |  2 +-
>>   4 files changed, 44 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.c
>> index 190fb7438e8c..977e37cf209b 100644
>> --- a/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.c
>> +++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.c
>> @@ -57,7 +57,7 @@ bool rkvdec_rcb_buf_validate_size(struct rkvdec_ctx *ctx)
>>   	return ret;
>>   }
>>   
>> -void rkvdec_free_rcb(struct rkvdec_core *core)
>> +void rkvdec_free_rcb(struct rkvdec_dev *rkvdec, struct rkvdec_core *core)
>>   {
>>   	struct rkvdec_rcb_config *cfg = core->rcb_config;
>>   	unsigned long virt_addr;
>> @@ -76,12 +76,12 @@ void rkvdec_free_rcb(struct rkvdec_core *core)
>>   		case RKVDEC_ALLOC_SRAM:
>>   			virt_addr = (unsigned long)cfg->rcb_bufs[i].cpu;
>>   
>> -			if (core->iommu_domain)
>> -				iommu_unmap(core->iommu_domain, virt_addr, rcb_size);
>> +			if (rkvdec->iommu_global_domain)
>> +				iommu_unmap(rkvdec->iommu_global_domain, virt_addr, rcb_size);
>>   			gen_pool_free(core->sram_pool, virt_addr, rcb_size);
>>   			break;
>>   		case RKVDEC_ALLOC_DMA:
>> -			dma_free_coherent(core->dev,
>> +			dma_free_coherent(rkvdec->main_core->dev,
>>   					  rcb_size,
>>   					  cfg->rcb_bufs[i].cpu,
>>   					  cfg->rcb_bufs[i].dma);
>> @@ -97,7 +97,8 @@ void rkvdec_free_rcb(struct rkvdec_core *core)
>>   	core->rcb_config = NULL;
>>   }
>>   
>> -int rkvdec_allocate_rcb(struct rkvdec_core *core, u32 width, u32 height,
>> +int rkvdec_allocate_rcb(struct rkvdec_dev *rkvdec, struct rkvdec_core *core,
>> +			u32 width, u32 height,
>>   			const struct rcb_size_info *size_info,
>>   			size_t rcb_count)
>>   {
>> @@ -132,7 +133,7 @@ int rkvdec_allocate_rcb(struct rkvdec_core *core, u32 width, u32 height,
>>   
>>   		/* Try allocating an SRAM buffer */
>>   		if (core->sram_pool) {
>> -			if (core->iommu_domain)
>> +			if (rkvdec->iommu_global_domain)
>>   				rcb_size = ALIGN(rcb_size, SZ_4K);
>>   
>>   			cpu = gen_pool_dma_zalloc_align(core->sram_pool,
>> @@ -142,11 +143,11 @@ int rkvdec_allocate_rcb(struct rkvdec_core *core, u32 width, u32 height,
>>   		}
>>   
>>   		/* If an IOMMU is used, map the SRAM address through it */
>> -		if (cpu && core->iommu_domain) {
>> +		if (cpu && rkvdec->iommu_global_domain) {
>>   			unsigned long virt_addr = (unsigned long)cpu;
>>   			phys_addr_t phys_addr = dma;
>>   
>> -			ret = iommu_map(core->iommu_domain, virt_addr, phys_addr,
>> +			ret = iommu_map(rkvdec->iommu_global_domain, virt_addr, phys_addr,
>>   					rcb_size, IOMMU_READ | IOMMU_WRITE, 0);
>>   			if (ret) {
>>   				gen_pool_free(core->sram_pool,
>> @@ -166,7 +167,7 @@ int rkvdec_allocate_rcb(struct rkvdec_core *core, u32 width, u32 height,
>>   ram_fallback:
>>   		/* Fallback to RAM */
>>   		if (!cpu) {
>> -			cpu = dma_alloc_coherent(core->dev,
>> +			cpu = dma_alloc_coherent(rkvdec->main_core->dev,
>>   						 rcb_size,
>>   						 &dma,
>>   						 GFP_KERNEL);
>> @@ -189,7 +190,7 @@ int rkvdec_allocate_rcb(struct rkvdec_core *core, u32 width, u32 height,
>>   	return 0;
>>   
>>   err_alloc:
>> -	rkvdec_free_rcb(core);
>> +	rkvdec_free_rcb(rkvdec, core);
>>   
>>   	return ret;
>>   }
>> diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.h b/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.h
>> index a12af9b7dc2b..d1149afe7fda 100644
>> --- a/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.h
>> +++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-rcb.h
>> @@ -8,6 +8,7 @@
>>   
>>   #include <linux/types.h>
>>   
>> +struct rkvdec_dev;
>>   struct rkvdec_ctx;
>>   struct rkvdec_core;
>>   
>> @@ -21,11 +22,12 @@ struct rcb_size_info {
>>   	enum rcb_axis axis;
>>   };
>>   
>> -int rkvdec_allocate_rcb(struct rkvdec_core *core, u32 width, u32 height,
>> +int rkvdec_allocate_rcb(struct rkvdec_dev *rkvdec, struct rkvdec_core *core,
>> +			u32 width, u32 height,
>>   			const struct rcb_size_info *size_info,
>>   			size_t rcb_count);
>>   dma_addr_t rkvdec_rcb_buf_dma_addr(struct rkvdec_ctx *ctx, int id);
>>   size_t rkvdec_rcb_buf_size(struct rkvdec_ctx *ctx, int id);
>>   int rkvdec_rcb_buf_count(struct rkvdec_ctx *ctx);
>>   bool rkvdec_rcb_buf_validate_size(struct rkvdec_ctx *ctx);
>> -void rkvdec_free_rcb(struct rkvdec_core *core);
>> +void rkvdec_free_rcb(struct rkvdec_dev *rkvdec, struct rkvdec_core *core);
>> diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec.c b/drivers/media/platform/rockchip/rkvdec/rkvdec.c
>> index c2818f1575ef..2930e9b64906 100644
>> --- a/drivers/media/platform/rockchip/rkvdec/rkvdec.c
>> +++ b/drivers/media/platform/rockchip/rkvdec/rkvdec.c
>> @@ -1204,9 +1204,9 @@ static void rkvdec_device_run(void *priv)
>>   	}
>>   
>>   	if (!rkvdec_rcb_buf_validate_size(ctx)) {
>> -		rkvdec_free_rcb(ctx->core);
>> +		rkvdec_free_rcb(ctx->dev, ctx->core);
>>   
>> -		ret = rkvdec_allocate_rcb(ctx->core,
>> +		ret = rkvdec_allocate_rcb(ctx->dev, ctx->core,
>>   					  ctx->decoded_fmt.fmt.pix_mp.width,
>>   					  ctx->decoded_fmt.fmt.pix_mp.height,
>>   					  ctx->dev->variant->rcb_sizes,
>> @@ -1486,6 +1486,7 @@ static void rkvdec_v4l2_cleanup(struct rkvdec_dev *rkvdec)
>>   
>>   static void rkvdec_iommu_restore(struct rkvdec_core *core)
>>   {
>> +	int ret;
>>   	if (core->empty_domain) {
>>   		/*
>>   		 * To rewrite mapping into the attached IOMMU core, attach a new empty domain that
>> @@ -1494,8 +1495,14 @@ static void rkvdec_iommu_restore(struct rkvdec_core *core)
>>   		 * This is safely done in this interrupt handler to make sure no memory get mapped
>>   		 * through the IOMMU while the empty domain is attached.
>>   		 */
>> -		iommu_attach_device(core->empty_domain, core->dev);
>> +		iommu_detach_device(core->curr_ctx->dev->iommu_global_domain, core->dev);
>> +		ret = iommu_attach_device(core->empty_domain, core->dev);
>> +		if (ret)
>> +			dev_warn(core->dev, "Cannot attach empty domain: %d\n", ret);
>>   		iommu_detach_device(core->empty_domain, core->dev);
>> +		ret = iommu_attach_device(core->curr_ctx->dev->iommu_global_domain, core->dev);
>> +		if (ret)
>> +			dev_warn(core->dev, "Cannot attach global domain: %d\n", ret);
>>   	}
>>   }
>>   
>> @@ -1858,6 +1865,8 @@ static int rkvdec_probe(struct platform_device *pdev)
>>   
>>   	core = &rkvdec->cores[rkvdec->core_count++];
>>   
>> +	core->id = rkvdec->core_count - 1;
>> +
>>   	platform_set_drvdata(pdev, rkvdec);
>>   	core->dev = &pdev->dev;
>>   	INIT_DELAYED_WORK(&core->watchdog_work, rkvdec_watchdog_func);
>> @@ -1883,12 +1892,24 @@ static int rkvdec_probe(struct platform_device *pdev)
>>   			return PTR_ERR(core->link);
>>   	}
>>   
>> -	core->iommu_domain = iommu_get_domain_for_dev(&pdev->dev);
>> -	if (core->iommu_domain) {
>> +	if (iommu_get_domain_for_dev(&pdev->dev)) {
>>   		core->empty_domain = iommu_paging_domain_alloc(core->dev);
>>   
>> -		if (!core->empty_domain)
>> +		if (IS_ERR(core->empty_domain))
>>   			dev_warn(core->dev, "cannot alloc new empty domain\n");
>> +
>> +		if (!rkvdec->iommu_global_domain) {
>> +			rkvdec->iommu_global_domain = iommu_get_domain_for_dev(core->dev);
>> +
>> +			if (IS_ERR(rkvdec->iommu_global_domain)) {
>> +				rkvdec->iommu_global_domain = NULL;
>> +				dev_warn_once(core->dev, "cannot alloc new global domain\n");
>> +			}
>> +		}
>> +
>> +		ret = iommu_attach_device(rkvdec->iommu_global_domain, core->dev);
>> +		if (ret)
>> +			dev_warn(core->dev, "cannot attach global domain to core %d\n", core->id);
>>   	}
>>   
>>   	ret = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
>> @@ -1961,7 +1982,7 @@ static void rkvdec_remove(struct platform_device *pdev)
>>   		if (rkvdec->cores[i].empty_domain)
>>   			iommu_domain_free(rkvdec->cores[i].empty_domain);
>>   
>> -		rkvdec_free_rcb(&rkvdec->cores[i]);
>> +		rkvdec_free_rcb(rkvdec, &rkvdec->cores[i]);
>>   	}
>>   }
>>   
>> diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec.h b/drivers/media/platform/rockchip/rkvdec/rkvdec.h
>> index 4f042a367dc0..ccd766b220c7 100644
>> --- a/drivers/media/platform/rockchip/rkvdec/rkvdec.h
>> +++ b/drivers/media/platform/rockchip/rkvdec/rkvdec.h
>> @@ -135,7 +135,6 @@ struct rkvdec_core {
>>   	void __iomem *link;
>>   	struct delayed_work watchdog_work;
>>   	struct gen_pool *sram_pool;
>> -	struct iommu_domain *iommu_domain;
>>   	struct iommu_domain *empty_domain;
>>   	struct rkvdec_rcb_config *rcb_config;
>>   	struct rkvdec_ctx *curr_ctx;
>> @@ -155,6 +154,7 @@ struct rkvdec_dev {
>>   	unsigned int available_core_count;
>>   	spinlock_t cores_lock; /* serializes core list access */
>>   	struct rkvdec_core *main_core;
>> +	struct iommu_domain *iommu_global_domain;
>>   };
>>   
>>   struct rkvdec_ctx {




* Re: [PATCH v2] nvme-apple: drop invalid put of admin queue reference count
From: Keith Busch @ 2026-04-09 14:47 UTC (permalink / raw)
  To: Fedor Pchelkin
  Cc: Christoph Hellwig, Jens Axboe, Sven Peter, Janne Grunau,
	Neal Gompa, Sagi Grimberg, Hannes Reinecke, Ming Lei,
	Chaitanya Kulkarni, Heyne, Maximilian, asahi, linux-arm-kernel,
	linux-nvme, linux-kernel, lvc-project, stable
In-Reply-To: <20260408141815.375695-1-pchelkin@ispras.ru>

On Wed, Apr 08, 2026 at 05:18:14PM +0300, Fedor Pchelkin wrote:
> Commit 03b3bcd319b3 ("nvme: fix admin request_queue lifetime") moved the
> admin queue reference ->put call into nvme_free_ctrl() - a controller
> device release callback performed for every nvme driver doing
> nvme_init_ctrl().

Thanks, applied to nvme-7.1.



* Re: [PATCH] mm/arm: pgtable: remove young bit check for pte_valid_user
From: Russell King (Oracle) @ 2026-04-09 14:43 UTC (permalink / raw)
  To: Will Deacon; +Cc: Brian Ruley, Steve Capper, linux-arm-kernel, linux-kernel
In-Reply-To: <adewJetUv6wMiz9o@willie-the-truck>

On Thu, Apr 09, 2026 at 02:56:53PM +0100, Will Deacon wrote:
> On Thu, Apr 09, 2026 at 03:54:45PM +0300, Brian Ruley wrote:
> > Fixes cache desync, which can cause undefined instruction,
> > translation and permission faults under heavy memory use.
> > 
> > This is an old bug introduced in commit 1971188aa196 ("ARM: 7985/1: mm:
> > implement pte_accessible for faulting mappings"), which included a check
> > for the young bit of a PTE. The underlying assumption was that old pages
> > are not cached, therefore, `__sync_icache_dcache' could be skipped
> > entirely.
> > 
> > However, under extreme memory pressure, page migrations happen
> > frequently and the assumption of uncached "old" pages does not hold.
> > Especially for systems that do not have swap, the migrated pages are
> > unequivocally marked old. This presents a problem, as it is possible
> > for the original page to be immediately mapped to another VA that
> > happens to share the same cache index in VIPT I-cache (we found this
> > bug on Cortex-A9). Without cache invalidation, the CPU will see the
> > old mapping whose physical page can now be used for a different
> > purpose, as illustrated below:
> > 
> >                 Core                      Physical Memory
> >   +-------------------------------+     +------------------+
> >   | TLB                           |     |                  |
> >   |  VA_A 0xb6e6f -> pfn_q        |     | pfn_q: code      |
> >   +-------------------------------+     +------------------+
> >   | I-cache                       |
> >   |  set[VA_A bits] | tag=pfn_q   |
> >   +-------------------------------+
> > 
> > migrate (kcompactd):
> >   1. copy pfn_q --> pfn_r
> >   2. free pfn_q
> >   3. pte: VA_A -> pfn_r
> >   4. pte_mkold(pte) --> !young
> >   5. ICIALLUIS skipped (because !young)
> > 
> > pfn_q reused (OOM pressure):
> >   pte: VA_B -> pfn_q (different code)
> > 
> > bug:
> >                 Core                      Physical Memory
> >   +-------------------------------+     +------------------+
> >   | TLB (empty)                   |     | pfn_r: old code  |
> >   +-------------------------------+     | pfn_q: new code  |
> >   | I-cache                       |     +------------------+
> >   |  set[VA_A bits] | tag=pfn_q   |<--- wrong instructions
> >   +-------------------------------+
> 
> (nit: Do you have pfn_r and pfn_q mixed up in the "Physical Memory" box?)

I don't think so. pfn_r contains the code that _was_ in pfn_q before
the migration happened (the migration copied pfn_q to pfn_r).

Then, a short time later, the page for pfn_r is reallocated, with new
code placed into it, and then a new PTE is established for pfn_q -
however, this should be a young PTE (which will be necessary for the
mapping to be visible to userspace), and thus should cause
__sync_icache_dcache() to be called, resulting in __flush_icache_all()
if it is an executable mapping.

If it isn't an executable mapping (because pfn_q isn't code) then we
will skip the __flush_icache_all() for the new mapping.

However, the I-cache for the old PTE that was pfn_r and now is pfn_q
will not be present in the physical page tables. An attempt to execute
code from that mapping should fault, causing the MM to mark that PTE
young, and, as it's executable, it should result in
__flush_icache_all() being called.

Like you, I can't see any issue here.

I think we need to know exactly what happened to the old PTE entry
and the newer PTE entry that was subsequently established on the
lead-up to the undefined instruction exception.
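
To make the sequence above concrete, here is a small userspace model of
it (not kernel code). struct pte_model and all function names are
hypothetical, and the PTE is reduced to the four bits the argument
depends on. It assumes the I-cache flush happens only when the PTE
passes the pte_valid_user() style check and the mapping is executable.

```c
#include <stdbool.h>

/* Simplified PTE: only the bits the argument depends on. */
struct pte_model {
	bool valid;
	bool user;
	bool young;
	bool exec;
};

/* Check with the young bit test added by commit 1971188aa196. */
bool pte_valid_user_young(struct pte_model pte)
{
	return pte.valid && pte.user && pte.young;
}

/* Check as proposed by the patch: the young bit is ignored. */
bool pte_valid_user_any(struct pte_model pte)
{
	return pte.valid && pte.user;
}

/*
 * Model of installing a user PTE. Returns true if the I-cache would be
 * flushed, i.e. the PTE passes the check and the mapping is executable.
 */
bool install_flushes_icache(struct pte_model pte, bool check_young)
{
	bool sync = check_young ? pte_valid_user_young(pte)
				: pte_valid_user_any(pte);

	return sync && pte.exec;
}

/* Access fault on an old PTE: the MM marks it young and installs it again. */
struct pte_model fault_mkyoung(struct pte_model pte)
{
	pte.young = true;
	return pte;
}
```

With the young bit test in place, an old executable PTE is installed
without a flush, but the first access faults, fault_mkyoung() runs, and
the second install does flush. Without the test, the flush simply moves
to the first install.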

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!



* Re: [PATCH v0 06/15] mshv: Implement mshv bridge device for VFIO
From: Stanislav Kinsburskii @ 2026-04-09 14:41 UTC (permalink / raw)
  To: Mukesh R
  Cc: linux-kernel, linux-hyperv, linux-arm-kernel, iommu, linux-pci,
	linux-arch, kys, haiyangz, wei.liu, decui, longli,
	catalin.marinas, will, tglx, mingo, bp, dave.hansen, hpa, joro,
	lpieralisi, kwilczynski, mani, robh, bhelgaas, arnd, nunodasneves,
	mhklinux, romank
In-Reply-To: <c30ede65-46c4-02b1-756a-868f9a265cf1@linux.microsoft.com>

On Tue, Apr 07, 2026 at 10:41:12AM -0700, Mukesh R wrote:
> On 1/20/26 08:09, Stanislav Kinsburskii wrote:
> > On Mon, Jan 19, 2026 at 10:42:21PM -0800, Mukesh R wrote:
> > > From: Mukesh Rathor <mrathor@linux.microsoft.com>
> > > 
> > > Add a new file to implement VFIO-MSHV bridge pseudo device. These
> > > functions are called in the VFIO framework, and credits to kvm/vfio.c
> > > as this file was adapted from it.
> > > 
> > > Original author: Wei Liu <wei.liu@kernel.org>
> > > (Slightly modified from the original version).
> > > 
> > 
> > > There is a Linux standard for giving credit when code is adapted from
> > > another file.
> > This doesn't follow that standard. Please fix.
> > 
> > > Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
> > > ---
> > >   drivers/hv/Makefile    |   3 +-
> > >   drivers/hv/mshv_vfio.c | 210 +++++++++++++++++++++++++++++++++++++++++
> > >   2 files changed, 212 insertions(+), 1 deletion(-)
> > >   create mode 100644 drivers/hv/mshv_vfio.c
> > > 
> > > diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile
> > > index a49f93c2d245..eae003c4cb8f 100644
> > > --- a/drivers/hv/Makefile
> > > +++ b/drivers/hv/Makefile
> > > @@ -14,7 +14,8 @@ hv_vmbus-y := vmbus_drv.o \
> > >   hv_vmbus-$(CONFIG_HYPERV_TESTING)	+= hv_debugfs.o
> > >   hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_utils_transport.o
> > >   mshv_root-y := mshv_root_main.o mshv_synic.o mshv_eventfd.o mshv_irq.o \
> > > -	       mshv_root_hv_call.o mshv_portid_table.o mshv_regions.o
> > > +	       mshv_root_hv_call.o mshv_portid_table.o mshv_regions.o \
> > > +               mshv_vfio.o
> > >   mshv_vtl-y := mshv_vtl_main.o
> > >   # Code that must be built-in
> > > diff --git a/drivers/hv/mshv_vfio.c b/drivers/hv/mshv_vfio.c
> > > new file mode 100644
> > > index 000000000000..6ea4d99a3bd2
> > > --- /dev/null
> > > +++ b/drivers/hv/mshv_vfio.c
> > > @@ -0,0 +1,210 @@
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +/*
> > > + * VFIO-MSHV bridge pseudo device
> > > + *
> > > + * Heavily inspired by the VFIO-KVM bridge pseudo device.
> > > + */
> > > +#include <linux/errno.h>
> > > +#include <linux/file.h>
> > > +#include <linux/list.h>
> > > +#include <linux/module.h>
> > > +#include <linux/mutex.h>
> > > +#include <linux/slab.h>
> > > +#include <linux/vfio.h>
> > > +
> > > +#include "mshv.h"
> > > +#include "mshv_root.h"
> > > +
> > > +struct mshv_vfio_file {
> > > +	struct list_head node;
> > > +	struct file *file;	/* list of struct mshv_vfio_file */
> > > +};
> > > +
> > > +struct mshv_vfio {
> > > +	struct list_head file_list;
> > > +	struct mutex lock;
> > > +};
> > > +
> > > +static bool mshv_vfio_file_is_valid(struct file *file)
> > > +{
> > > +	bool (*fn)(struct file *file);
> > > +	bool ret;
> > > +
> > > +	fn = symbol_get(vfio_file_is_valid);
> > > +	if (!fn)
> > > +		return false;
> > > +
> > > +	ret = fn(file);
> > > +
> > > +	symbol_put(vfio_file_is_valid);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static long mshv_vfio_file_add(struct mshv_device *mshvdev, unsigned int fd)
> > > +{
> > > +	struct mshv_vfio *mshv_vfio = mshvdev->device_private;
> > > +	struct mshv_vfio_file *mvf;
> > > +	struct file *filp;
> > > +	long ret = 0;
> > > +
> > > +	filp = fget(fd);
> > > +	if (!filp)
> > > +		return -EBADF;
> > > +
> > > +	/* Ensure the FD is a vfio FD. */
> > > +	if (!mshv_vfio_file_is_valid(filp)) {
> > > +		ret = -EINVAL;
> > > +		goto out_fput;
> > > +	}
> > > +
> > > +	mutex_lock(&mshv_vfio->lock);
> > > +
> > > +	list_for_each_entry(mvf, &mshv_vfio->file_list, node) {
> > > +		if (mvf->file == filp) {
> > > +			ret = -EEXIST;
> > > +			goto out_unlock;
> > > +		}
> > > +	}
> > > +
> > > +	mvf = kzalloc(sizeof(*mvf), GFP_KERNEL_ACCOUNT);
> > > +	if (!mvf) {
> > > +		ret = -ENOMEM;
> > > +		goto out_unlock;
> > > +	}
> > > +
> > > +	mvf->file = get_file(filp);
> > > +	list_add_tail(&mvf->node, &mshv_vfio->file_list);
> > > +
> > > +out_unlock:
> > > +	mutex_unlock(&mshv_vfio->lock);
> > > +out_fput:
> > > +	fput(filp);
> > > +	return ret;
> > > +}
> > > +
> > > +static long mshv_vfio_file_del(struct mshv_device *mshvdev, unsigned int fd)
> > > +{
> > > +	struct mshv_vfio *mshv_vfio = mshvdev->device_private;
> > > +	struct mshv_vfio_file *mvf;
> > > +	long ret;
> > > +
> > > +	CLASS(fd, f)(fd);
> > > +
> > > +	if (fd_empty(f))
> > > +		return -EBADF;
> > > +
> > > +	ret = -ENOENT;
> > > +	mutex_lock(&mshv_vfio->lock);
> > > +
> > > +	list_for_each_entry(mvf, &mshv_vfio->file_list, node) {
> > > +		if (mvf->file != fd_file(f))
> > > +			continue;
> > > +
> > > +		list_del(&mvf->node);
> > > +		fput(mvf->file);
> > > +		kfree(mvf);
> > > +		ret = 0;
> > > +		break;
> > > +	}
> > > +
> > > +	mutex_unlock(&mshv_vfio->lock);
> > > +	return ret;
> > > +}
> > > +
> > > +static long mshv_vfio_set_file(struct mshv_device *mshvdev, long attr,
> > > +			      void __user *arg)
> > > +{
> > > +	int32_t __user *argp = arg;
> > > +	int32_t fd;
> > > +
> > > +	switch (attr) {
> > > +	case MSHV_DEV_VFIO_FILE_ADD:
> > > +		if (get_user(fd, argp))
> > > +			return -EFAULT;
> > > +		return mshv_vfio_file_add(mshvdev, fd);
> > > +
> > > +	case MSHV_DEV_VFIO_FILE_DEL:
> > > +		if (get_user(fd, argp))
> > > +			return -EFAULT;
> > > +		return mshv_vfio_file_del(mshvdev, fd);
> > > +	}
> > > +
> > > +	return -ENXIO;
> > > +}
> > > +
> > > +static long mshv_vfio_set_attr(struct mshv_device *mshvdev,
> > > +			      struct mshv_device_attr *attr)
> > > +{
> > > +	switch (attr->group) {
> > > +	case MSHV_DEV_VFIO_FILE:
> > > +		return mshv_vfio_set_file(mshvdev, attr->attr,
> > > +					  u64_to_user_ptr(attr->addr));
> > > +	}
> > > +
> > > +	return -ENXIO;
> > > +}
> > > +
> > > +static long mshv_vfio_has_attr(struct mshv_device *mshvdev,
> > > +			      struct mshv_device_attr *attr)
> > > +{
> > > +	switch (attr->group) {
> > > +	case MSHV_DEV_VFIO_FILE:
> > > +		switch (attr->attr) {
> > > +		case MSHV_DEV_VFIO_FILE_ADD:
> > > +		case MSHV_DEV_VFIO_FILE_DEL:
> > > +			return 0;
> > > +		}
> > > +
> > > +		break;
> > > +	}
> > > +
> > > +	return -ENXIO;
> > > +}
> > > +
> > > +static long mshv_vfio_create_device(struct mshv_device *mshvdev, u32 type)
> > > +{
> > > +	struct mshv_device *tmp;
> > > +	struct mshv_vfio *mshv_vfio;
> > > +
> > > +	/* Only one VFIO "device" per VM */
> > > +	hlist_for_each_entry(tmp, &mshvdev->device_pt->pt_devices,
> > > +			     device_ptnode)
> > > +		if (tmp->device_ops == &mshv_vfio_device_ops)
> > > +			return -EBUSY;
> > > +
> > > +	mshv_vfio = kzalloc(sizeof(*mshv_vfio), GFP_KERNEL_ACCOUNT);
> > > +	if (mshv_vfio == NULL)
> > > +		return -ENOMEM;
> > > +
> > > +	INIT_LIST_HEAD(&mshv_vfio->file_list);
> > > +	mutex_init(&mshv_vfio->lock);
> > > +
> > > +	mshvdev->device_private = mshv_vfio;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +/* This is called from mshv_device_fop_release() */
> > > +static void mshv_vfio_release_device(struct mshv_device *mshvdev)
> > > +{
> > > +	struct mshv_vfio *mv = mshvdev->device_private;
> > > +	struct mshv_vfio_file *mvf, *tmp;
> > > +
> > > +	list_for_each_entry_safe(mvf, tmp, &mv->file_list, node) {
> > > +		fput(mvf->file);
> > 
> > This put must be sync as device must be detached from domain before
> > attempting partition destruction.
> 
> Like I said in 6.6 PR, this does not attach or detach devices.
> 

You are mistaken. It absolutely does.

Thanks,
Stanislav

> > This was explicitly mentioned in the patch that originated this code.
> > Please fix, add a comment and credits to the commit message.
> 
> That was the ".destroy" hook, which is gone.
> 
> Thanks,
> -Mukesh
> 
> 
> > Thanks,
> > Stanislav
> > 
> > 
> > > +		list_del(&mvf->node);
> > > +		kfree(mvf);
> > > +	}
> > > +
> > > +	kfree(mv);
> > > +	kfree(mshvdev);
> > > +}
> > > +
> > > +struct mshv_device_ops mshv_vfio_device_ops = {
> > > +	.device_name = "mshv-vfio",
> > > +	.device_create = mshv_vfio_create_device,
> > > +	.device_release = mshv_vfio_release_device,
> > > +	.device_set_attr = mshv_vfio_set_attr,
> > > +	.device_has_attr = mshv_vfio_has_attr,
> > > +};
> > > -- 
> > > 2.51.2.vfs.0.1
> > > 
> 



* Re: [PATCH] media: cedrus: skip invalid H.264 reference list entries
From: Nicolas Dufresne @ 2026-04-09 14:39 UTC (permalink / raw)
  To: Paul Kocialkowski
  Cc: Pengpeng Hou, mripard, mchehab, gregkh, wens, jernej.skrabec,
	samuel, linux-media, linux-staging, linux-arm-kernel, linux-sunxi,
	linux-kernel
In-Reply-To: <ade4Qe4OS04au2Ba@shepard>

Hi,

Le jeudi 09 avril 2026 à 16:31 +0200, Paul Kocialkowski a écrit :
> I think it makes sense, yes, but it would be good to document it in the uAPI
> document too.

Basically, extend the existing documentation in the M2M decoder spec(s):

   V4L2_BUF_FLAG_ERROR:
   -
   When this flag is set, the buffer has been dequeued successfully, although
   the data might **have been corrupted**. This is recoverable, streaming may
   continue as normal and the buffer may be reused normally. Drivers set this
   flag when the VIDIOC_DQBUF ioctl is called.
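
From the application side, handling this could look roughly like the
sketch below. It is a userspace illustration only: BUF_FLAG_ERROR_MODEL
is a local copy of the V4L2_BUF_FLAG_ERROR value so that the example
builds without the uAPI headers, and enum frame_action and
classify_dequeued() are hypothetical names.

```c
#include <stdint.h>

/*
 * Local copy of V4L2_BUF_FLAG_ERROR from <linux/videodev2.h>, redefined
 * so the sketch builds without kernel headers.
 */
#define BUF_FLAG_ERROR_MODEL 0x00000040u

/* What a player might do with a buffer returned by VIDIOC_DQBUF. */
enum frame_action {
	FRAME_USE,	/* data is good, display as usual */
	FRAME_CONCEAL,	/* data may be corrupted, conceal or drop it */
};

/*
 * Decide how to treat a successfully dequeued buffer from its flags.
 * Streaming continues in both cases and the buffer can be queued again.
 */
enum frame_action classify_dequeued(uint32_t flags)
{
	if (flags & BUF_FLAG_ERROR_MODEL)
		return FRAME_CONCEAL;

	return FRAME_USE;
}
```

In a real client the flags argument would come from the flags field of
the struct v4l2_buffer filled in by VIDIOC_DQBUF.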

cheers,
Nicolas



* Re: [PATCH v10 16/20] coresight: Add PM callbacks for sink device
From: James Clark @ 2026-04-09 14:31 UTC (permalink / raw)
  To: Suzuki K Poulose, Leo Yan
  Cc: coresight, linux-arm-kernel, Yeoreum Yun, Mark Rutland,
	Will Deacon, Yabin Cui, Keita Morisaki, Yuanfang Zhang,
	Greg Kroah-Hartman, Alexander Shishkin, Tamas Petz,
	Thomas Gleixner, Peter Zijlstra, Mike Leach
In-Reply-To: <ffe95710-cd52-484e-821e-a3d92ba17c33@linaro.org>



On 09/04/2026 3:30 pm, James Clark wrote:
> 
> 
> On 09/04/2026 2:14 pm, Suzuki K Poulose wrote:
>> On 09/04/2026 11:52, James Clark wrote:
>>>
>>>
>>> On 05/04/2026 4:02 pm, Leo Yan wrote:
>>>> Unlike system level sinks, per-CPU sinks may lose power during CPU idle
>>>> states.  Currently, this applies specifically to TRBE.  This commit
>>>> invokes save and restore callbacks for the sink in the CPU PM notifier.
>>>>
>>>> If the sink provides PM callbacks but the source does not, this is
>>>> unsafe because the sink cannot be disabled safely unless the source
>>>> can also be controlled, so veto low power entry to avoid lockups.
>>>>
>>>> Tested-by: James Clark <james.clark@linaro.org>
>>>> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
>>>> Reviewed-by: James Clark <james.clark@linaro.org>
>>>> Signed-off-by: Leo Yan <leo.yan@arm.com>
>>>> ---
>>>>   drivers/hwtracing/coresight/coresight-core.c | 46 ++++++++++++++++++++++++++--
>>>>   1 file changed, 43 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
>>>> index c1e8debc76aba7eb5ecf7efe2a3b9b8b3e11b10c..a918bf6398a932de30fe9b4947020cc4c1cfb2f7 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-core.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>>>> @@ -1736,14 +1736,15 @@ static void coresight_release_device_list(void)
>>>>   /* Return: 1 if PM is required, 0 if skip, <0 on error */
>>>>   static int coresight_pm_check(struct coresight_path *path)
>>>>   {
>>>> -    struct coresight_device *source;
>>>> -    bool source_has_cb;
>>>> +    struct coresight_device *source, *sink;
>>>> +    bool source_has_cb, sink_has_cb;
>>>>       if (!path)
>>>>           return 0;
>>>>       source = coresight_get_source(path);
>>>> -    if (!source)
>>>> +    sink = coresight_get_sink(path);
>>>> +    if (!source || !sink)
>>>>           return 0;
>>>>       /* Don't save and restore if the source is inactive */
>>>> @@ -1759,16 +1760,36 @@ static int coresight_pm_check(struct 
>>>> coresight_path *path)
>>>>       if (source_has_cb)
>>>>           return 1;
>>>> +    sink_has_cb = coresight_ops(sink)->pm_save_disable &&
>>>> +              coresight_ops(sink)->pm_restore_enable;
>>>> +    /*
>>>> +     * It is not permitted that the source has no callbacks while 
>>>> the sink
>>>> +     * does, as the sink cannot be disabled without disabling the 
>>>> source,
>>>> +     * which may lead to lockups. Alternatively, the ETM driver should
>>>> +     * enable self-hosted PM mode at probe (see etm4_probe()).
>>>> +     */
>>>> +    if (sink_has_cb) {
>>>> +        pr_warn_once("coresight PM failed: source has no PM 
>>>> callbacks; "
>>>> +                 "cannot safely control sink\n");
>>>
>>> This prints out on my Orion board on a fresh boot because of how 
>>> pm_save_enable is setup there. Do we really need the configuration of 
>>> pm_save_enable for ETE/TRBE if we know that it always needs saving?
>>>
>>> It also stops warning if I rmmod and modprobe the module after 
>>> booting. Seems like pm_save_enable is different depending on how the 
>>> module is loaded which doesn't seem right.
>>
>> That's because the warning is pr_warn_*once*()
>>
>> Suzuki
>>
>>
> 
> I don't think so, I tested it with a printf instead of a warn once and 
> also tested modprobeing straight after a reboot.
> 

I also printed the pointers to the callbacks and they really do change.

>>>
>>>> +        return -EINVAL;
>>>> +    }
>>>> +
>>>>       return 0;
>>>>   }
>>>>   static int coresight_pm_device_save(struct coresight_device *csdev)
>>>>   {
>>>> +    if (!csdev || !coresight_ops(csdev)->pm_save_disable)
>>>> +        return 0;
>>>> +
>>>>       return coresight_ops(csdev)->pm_save_disable(csdev);
>>>>   }
>>>>   static void coresight_pm_device_restore(struct coresight_device 
>>>> *csdev)
>>>>   {
>>>> +    if (!csdev || !coresight_ops(csdev)->pm_restore_enable)
>>>> +        return;
>>>> +
>>>>       coresight_ops(csdev)->pm_restore_enable(csdev);
>>>>   }
>>>> @@ -1787,15 +1808,32 @@ static int coresight_pm_save(struct 
>>>> coresight_path *path)
>>>>       to = list_prev_entry(coresight_path_last_node(path), link);
>>>>       coresight_disable_path_from_to(path, from, to);
>>>> +    ret = coresight_pm_device_save(coresight_get_sink(path));
>>>> +    if (ret)
>>>> +        goto sink_failed;
>>>> +
>>>
>>> The comment directly above this says "Up to the node before sink to 
>>> avoid latency". But then this line goes and saves the sink anyway. So 
>>> I'm not sure what's meant by the comment?
>>>
>>>>       return 0;
>>>> +
>>>> +sink_failed:
>>>> +    if (!coresight_enable_path_from_to(path, 
>>>> coresight_get_mode(source),
>>>> +                       from, to))
>>>> +        coresight_pm_device_restore(source);
>>>> +
>>>> +    pr_err("Failed in coresight PM save on CPU%d: %d\n",
>>>> +           smp_processor_id(), ret);
>>>> +    this_cpu_write(percpu_pm_failed, true);
>>>
>>> Why does only a failing sink set percpu_pm_failed, while failing to 
>>> save the source exits early? Sashiko has a similar comment that this 
>>> could result in restoring uninitialised source save data later, but a 
>>> comment in this function about why the flow is like this would be 
>>> helpful.
>>>
>>> We have coresight_disable_path_from_to() which always succeeds and 
>>> doesn't return an error. TRBE is the only sink with a pm_save_disable()
>>> callback, but it always succeeds anyway.
>>>
>>> Would it not be much simpler to require that sink save/restore 
>>> callbacks always succeed and don't return anything? Seems like this 
>>> percpu_pm_failed stuff is extra complexity for a scenario that 
>>> doesn't exist? The only thing that can fail is saving the source but 
>>> it doesn't goto sink_failed when that happens.
>>>
>>> Ideally etm4_cpu_save() wouldn't have a return value either. It would 
>>> be good if we could find a way to skip or ignore the timeouts in there 
>>> somehow because that's the only reason it can fail.
>>>
>>>> +    return ret;
>>>>   }
>>>>   static void coresight_pm_restore(struct coresight_path *path)
>>>>   {
>>>>       struct coresight_device *source = coresight_get_source(path);
>>>> +    struct coresight_device *sink = coresight_get_sink(path);
>>>>       struct coresight_node *from, *to;
>>>>       int ret;
>>>> +    coresight_pm_device_restore(sink);
>>>> +
>>>>       from = coresight_path_first_node(path);
>>>>       /* Up to the node before sink to avoid latency */
>>>>       to = list_prev_entry(coresight_path_last_node(path), link);
>>>> @@ -1808,6 +1846,8 @@ static void coresight_pm_restore(struct 
>>>> coresight_path *path)
>>>>       return;
>>>>   path_failed:
>>>> +    coresight_pm_device_save(sink);
>>>> +
>>>>       pr_err("Failed in coresight PM restore on CPU%d: %d\n",
>>>>              smp_processor_id(), ret);
>>>>
>>>
>>
> 



^ permalink raw reply

* Re: [PATCH] media: cedrus: skip invalid H.264 reference list entries
From: Paul Kocialkowski @ 2026-04-09 14:31 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: Pengpeng Hou, mripard, mchehab, gregkh, wens, jernej.skrabec,
	samuel, linux-media, linux-staging, linux-arm-kernel, linux-sunxi,
	linux-kernel
In-Reply-To: <4bd5d70a6144f3e8d4356c182f314cf735f1921c.camel@collabora.com>

Hi Nicolas,

On Thu 09 Apr 26, 10:00, Nicolas Dufresne wrote:
> Le jeudi 09 avril 2026 à 15:33 +0200, Paul Kocialkowski a écrit :
> > Hi,
> > 
> > On Tue 24 Mar 26, 16:08, Pengpeng Hou wrote:
> > > Cedrus consumes H.264 ref_pic_list0/ref_pic_list1 entries from the
> > > stateless slice control and later uses their indices to look up
> > > decode->dpb[] in _cedrus_write_ref_list().
> > > 
> > > Rejecting such controls in cedrus_try_ctrl() would break existing
> > > userspace, since stateless H.264 reference lists may legitimately carry
> > > out-of-range indices for missing references. Instead, guard the actual
> > > DPB lookup in Cedrus and skip entries whose indices do not fit the fixed
> > > V4L2_H264_NUM_DPB_ENTRIES array.
> > 
> > Could you explain why it is legitimate that userspace would pass indices that
> > are not in the dpb list? As far as I remember from the H.264 spec, the L0/L1
> > lists are constructed from active references only and the number of items
> > there
> > should be given by num_ref_idx_l0_active_minus1/num_ref_idx_l1_active_minus1.
> > We can tolerate invalid data beyond these indices, but certainly not as part
> > of the indices that should be valid.
> > 
> > However I agree that cedrus_try_ctrl is maybe not the right place to check it
> > since I'm not sure we are guaranteed that the slice params control will be
> > checked before the new DPB (from the same request) is applied, so we might end
> > up checking against the dpb from the previous decode request.
> > 
> > But I think we should error out and not just skip the invalid reference.
> 
> It's been a long time since I looked into this. But what happens here is that
> once you lose a reference, the userspace DPB will hold a gap picture, which has
> no backing storage. Since it has no backing storage, there is no cookie
> (timestamp) associated with it. This gap picture will still make it to the
> reference lists, since the position of the reference in the list is important
> (you cannot just remove an item). It is an established practice in userspace to
> simply fill the void with an invalid index, typically 0xff, which is always
> invalid. Because that's what some userspace does, it became part of our ABI.

Right, we definitely need to keep the order of the L0/L1 lists even with missing
references, and the question is whether the hardware can deal with it or not.

Our uAPI specification currently doesn't say anything about handling missing
references. I'm generally not very keen on considering that undefined behavior
becomes de-facto uAPI that should never be broken, because there are cases where
it is obviously incorrect and the fact that it didn't fail previously is the
result of a bug in the implementation.

But in this situation I agree we do need a way to indicate that references are
missing and using 0xff sounds like a good plan to me, given that we provide a
uAPI header define with this value and that the doc mentions it.
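
A minimal sketch of what such a define and the matching driver-side check could
look like. The name H264_REF_IDX_MISSING and the helper classify_ref_idx() are
hypothetical (nothing like them exists in the uAPI header today), and
NUM_DPB_ENTRIES only stands in for V4L2_H264_NUM_DPB_ENTRIES:

```c
#include <stdint.h>

/* Stand-in for V4L2_H264_NUM_DPB_ENTRIES from the uAPI header. */
#define NUM_DPB_ENTRIES 16

/* Hypothetical define: marks a reference list slot with no backing picture. */
#define H264_REF_IDX_MISSING 0xff

enum ref_idx_kind {
	REF_IDX_VALID,   /* index points into the DPB array */
	REF_IDX_MISSING, /* userspace explicitly flagged a missing reference */
	REF_IDX_INVALID, /* out of range and not the documented sentinel */
};

/* Sort a reference list index into one of the three cases above. */
static enum ref_idx_kind classify_ref_idx(uint8_t idx)
{
	if (idx < NUM_DPB_ENTRIES)
		return REF_IDX_VALID;
	if (idx == H264_REF_IDX_MISSING)
		return REF_IDX_MISSING;
	return REF_IDX_INVALID;
}
```

Splitting "missing" from "invalid" would let a driver conceal the former and
still error out on the latter, if that distinction is wanted.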

> Decoders are expected to be fault tolerant, though the tolerance level is
> hardware specific, and so failing in the common code would be inappropriate
> (failing in Cedrus could be acceptable, assuming it can't work with missing
> references, which the implementation seems to be fine with).

Okay I agree that we should not fail in common code and tolerate a value to
indicate a missing reference.

What the current proposal is doing (skipping the reference) results in the
SRAM entry for the reference remaining untouched, which will keep its value
from the previous frame. This seems clearly incorrect.
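
To make the concern concrete, here is a toy model (not the cedrus code; the
flat table, the function name and its parameters are all made up for
illustration) contrasting "skip the entry" with "overwrite the entry with a
defined placeholder":

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_DPB_ENTRIES 16 /* stand-in for V4L2_H264_NUM_DPB_ENTRIES */

/*
 * Toy model of writing a reference list into a hardware table.
 * With skip_invalid set, an out-of-range index is skipped and the slot
 * keeps whatever the previous frame wrote. Otherwise the slot is
 * overwritten with an explicit placeholder value.
 */
static void write_ref_list(uint8_t *table, const uint8_t *ref_idx,
			   size_t len, int skip_invalid, uint8_t placeholder)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (ref_idx[i] >= NUM_DPB_ENTRIES) {
			if (skip_invalid)
				continue; /* slot i keeps its stale value */
			table[i] = placeholder;
			continue;
		}
		table[i] = ref_idx[i];
	}
}
```

With skip_invalid set, a slot that held index 7 for the previous frame still
reads 7 after a list containing 0xff at that position has been written, which
is the stale-entry behaviour described above.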

> Hantro G1 notably has a flag to report missing references to the HW, and it will
> manage concealment internally. G2/RKVDEC don't, and we try to pick the most
> recent frame as a replacement backing storage, which most of the time minimises
> the damage.

It sounds like an approach that could work for cedrus too.
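
A rough sketch of that kind of fallback selection, assuming the driver can
order DPB entries by decode time. The struct, the decode_order field and
pick_fallback_ref() are hypothetical and do not reflect the uAPI layout or
what rkvdec actually does:

```c
#include <stdint.h>

#define NUM_DPB_ENTRIES 16 /* stand-in for V4L2_H264_NUM_DPB_ENTRIES */

/* Simplified DPB entry; both fields are illustrative, not the uAPI layout. */
struct dpb_entry {
	int active;            /* non-zero if the entry has backing storage */
	uint32_t decode_order; /* hypothetical: larger means decoded later */
};

/*
 * Return the index of the most recently decoded active entry, or -1 if
 * the DPB holds no usable picture at all.
 */
static int pick_fallback_ref(const struct dpb_entry *dpb)
{
	int best = -1;
	int i;

	for (i = 0; i < NUM_DPB_ENTRIES; i++) {
		if (!dpb[i].active)
			continue;
		if (best < 0 || dpb[i].decode_order > dpb[best].decode_order)
			best = i;
	}

	return best;
}
```

The -1 case (no active entry at all) would still need a policy of its own,
for example failing the decode request.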

> As a future refinement, we need drivers in the long term to properly report the
> damage (perhaps through additional RO request controls). As discussed a few years
> ago in the error handling WIP for rkvdec, the V4L2 doc specifies that any sort of
> damage known to exist in a frame shall result in the ERROR flag being set. We
> can deduce that the ERROR flag with a payload of 0 tells userspace not to
> use the frame (which typically happens on hard errors, or errors at the entropy
> decode stage), and the ERROR flag with a correct payload signals some level of
> corruption, and it's left to the application to decide what to do.

I think it makes sense, yes, but it would be good to document it in the uAPI
document too.
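
For the documentation, the userspace reading of that convention could be
summarised roughly as below. This is only a sketch: BUF_FLAG_ERROR mirrors the
value of V4L2_BUF_FLAG_ERROR, while the enum and classify_frame() are
hypothetical names used for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Same value as V4L2_BUF_FLAG_ERROR in <linux/videodev2.h>. */
#define BUF_FLAG_ERROR 0x00000040

enum frame_status {
	FRAME_OK,       /* no known damage */
	FRAME_DAMAGED,  /* some corruption, the application decides */
	FRAME_UNUSABLE, /* hard error, do not display or reference */
};

/* Interpret a dequeued capture buffer from its flags and payload size. */
static enum frame_status classify_frame(uint32_t flags, size_t bytesused)
{
	if (!(flags & BUF_FLAG_ERROR))
		return FRAME_OK;
	return bytesused ? FRAME_DAMAGED : FRAME_UNUSABLE;
}
```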

All the best,

Paul

> Nicolas
> 
> > 
> > All the best,
> > 
> > Paul
> > 
> > > 
> > > This keeps the fix local to the driver use site and avoids out-of-bounds
> > > reads from malformed or unsupported reference list entries.
> > > 
> > > Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
> > > ---
> > >  drivers/staging/media/sunxi/cedrus/cedrus_h264.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
> > > b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
> > > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
> > > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
> > > @@ -210,6 +210,9 @@ static void _cedrus_write_ref_list(struct cedrus_ctx
> > > *ctx,
> > >  		u8 dpb_idx;
> > >  
> > >  		dpb_idx = ref_list[i].index;
> > > +		if (dpb_idx >= V4L2_H264_NUM_DPB_ENTRIES)
> > > +			continue;
> > > +
> > >  		dpb = &decode->dpb[dpb_idx];
> > >  
> > >  		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > > -- 
> > > 2.50.1
> > > 



-- 
Paul Kocialkowski,

Independent contractor - sys-base - https://www.sys-base.io/
Free software developer - https://www.paulk.fr/

Expert in multimedia, graphics and embedded hardware support with Linux.

^ permalink raw reply
