Linux-ARM-Kernel Archive on lore.kernel.org


* Re: [PATCH] arch/sh: Drop CONFIG_FIRMWARE_EDID from defconfig files
From: Geert Uytterhoeven @ 2026-04-01  8:55 UTC
  To: Thomas Zimmermann
  Cc: ysato, dalias, glaubitz, arnd, linux-sh, linux-kernel, dri-devel,
	Linux ARM, linuxppc-dev, linux-mips
In-Reply-To: <20260401083242.214492-1-tzimmermann@suse.de>

Hi Thomas,

CC arm/mips/ppc, as you sent similar patches for these arches.

On Wed, 1 Apr 2026 at 10:40, Thomas Zimmermann <tzimmermann@suse.de> wrote:
> CONFIG_FIRMWARE_EDID=y depends on X86 or EFI_GENERIC_STUB. Neither is
> true here, so drop the lines from the defconfig files.
>
> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>

Thanks for your patch!

Upon first look, your changes match the (current) dependencies
of FIRMWARE_EDID.  The dependency on X86 was added in commit
7e35fc7ab433683f ("video: Make CONFIG_FIRMWARE_EDID generally
available") in v6.17-rc1.
However, CONFIG_FIRMWARE_EDID also protects fb_firmware_edid(),
which seems to extract the EDID from the PCI ROM, and is thus not
x86-specific?  That function is only ever called by three fbdev drivers
(i810, nv, savagefb), though.

I assume none of these work on SuperH, so
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* RE: [EXT] Re: [PATCH 1/2] dt-bindings: gpu: mali-valhall-csf: Document i.MX952 support
From: Guangliu Ding @ 2026-04-01  8:48 UTC
  To: Liviu Dudau
  Cc: Daniel Almeida, Alice Ryhl, Boris Brezillon, Steven Price,
	David Airlie, Simona Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Frank Li, Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam,
	dri-devel@lists.freedesktop.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, imx@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org, Jiyu Yang
In-Reply-To: <acva1Xt8V4k9-uG8@e142607>

Hi Liviu

Thanks for your review. Please refer to my comments below:

> On Tue, Mar 31, 2026 at 06:12:38PM +0800, Guangliu Ding wrote:
> > Add compatible string of Mali G310 GPU on i.MX952 board.
> >
> > Signed-off-by: Guangliu Ding <guangliu.ding@nxp.com>
> > Reviewed-by: Jiyu Yang <jiyu.yang@nxp.com>
> > ---
> >  Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> > index 8eccd4338a2b..6a10843a26e2 100644
> > --- a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> > +++ b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> > @@ -20,6 +20,7 @@ properties:
> >            - enum:
> >                - mediatek,mt8196-mali
> >                - nxp,imx95-mali            # G310
> > +              - nxp,imx952-mali           # G310
> 
> Can you explain why this is needed? Can it not be covered by the existing
> compatible?

There are functional differences in the GPU module (GPUMIX) between i.MX95
and i.MX952, so they cannot be fully covered by a single existing compatible.
On i.MX952, the GPU clock is controlled by a hardware GPU auto clock-gating
mechanism, while on i.MX95 the GPU clock is managed explicitly by the driver.
Because of these behavioral differences, separate compatible strings
"nxp,imx95-mali" and "nxp,imx952-mali" are needed to allow the driver to handle
the two variants independently and to keep room for future divergence.
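
To illustrate the argument for separate compatible strings, here is a minimal
user-space sketch of per-compatible variant data. It is not code from the
driver: the struct, the field name hw_clock_gating and the lookup helper are
hypothetical, and only the two compatible strings and the clock-gating
difference come from the explanation above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-variant data; the field name is an assumption. */
struct gpu_variant {
	const char *compatible;
	bool hw_clock_gating;	/* true: hardware gates the GPU clock itself */
};

static const struct gpu_variant gpu_variants[] = {
	{ "nxp,imx95-mali",  false },	/* driver manages the GPU clock */
	{ "nxp,imx952-mali", true  },	/* hardware auto clock-gating */
};

/* Return the variant entry for a compatible string, or NULL if unknown. */
static const struct gpu_variant *gpu_variant_lookup(const char *compatible)
{
	assert(compatible != NULL);

	for (size_t i = 0; i < sizeof(gpu_variants) / sizeof(gpu_variants[0]); i++) {
		if (strcmp(gpu_variants[i].compatible, compatible) == 0)
			return &gpu_variants[i];
	}
	return NULL;
}
```

With a single shared compatible, a table like this would have no key to tell
the two SoCs apart, which is the point the reply makes.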

> 
> Best regards,
> Liviu
> 
> >                - rockchip,rk3588-mali
> >            - const: arm,mali-valhall-csf   # Mali Valhall GPU
> model/revision is fully discoverable
> >
> >
> > --
> > 2.34.1
> >
> 
> --
> ====================
> | I would like to |
> | fix the world,  |
> | but they're not |
> | giving me the   |
>  \ source code!  /
>   ---------------
>     ¯\_(ツ)_/¯


* Re: [PATCH v16 0/7] coresight: ctcu: Enable byte-cntr function for TMC ETR
From: Jie Gan @ 2026-04-01  8:47 UTC
  To: Suzuki K Poulose, Mike Leach, James Clark, Alexander Shishkin,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Tingwei Zhang,
	Bjorn Andersson, Konrad Dybcio
  Cc: coresight, linux-arm-kernel, linux-kernel, linux-arm-msm,
	devicetree, Konrad Dybcio, Krzysztof Kozlowski
In-Reply-To: <20260323-enable-byte-cntr-for-ctcu-v16-0-7a413d211b8d@oss.qualcomm.com>



On 3/23/2026 5:49 PM, Jie Gan wrote:
> The byte-cntr function provided by the CTCU device is used to count the
> trace data entering the ETR. An interrupt is triggered if the data size
> exceeds the threshold set in the BYTECNTRVAL register. The interrupt
> handler counts the number of triggered interruptions.
> 
> Based on this concept, the irq_cnt can be used to determine whether
> the etr_buf is full. The ETR device will be disabled when the active
> etr_buf is nearly full or a timeout occurs. The nearly full buffer will
> be switched to background after synced. A new buffer will be picked from
> the etr_buf_list, then restart the ETR device.
> 
> The byte-cntr reading functions can access data from the synced and
> deactivated buffer, transferring trace data from the etr_buf to userspace
> without stopping the ETR device.
> 
> The byte-cntr read operation has integrated with the file node tmc_etr,
> for example:
> /dev/tmc_etr0
> /dev/tmc_etr1
> 
> There are two scenarios for the tmc_etr file node with byte-cntr function:
> 1. BYTECNTRVAL register is configured and byte-cntr is enabled -> byte-cntr read
> 2. BYTECNTRVAL register is reset or byte-cntr is disabled -> original behavior
> 
> Shell commands to enable byte-cntr reading for etr0:
> echo 1 > /sys/bus/coresight/devices/ctcu0/irq_enabled0
> echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
> echo 1 > /sys/bus/coresight/devices/etm0/enable_source
> cat /dev/tmc_etr0
> 
> Reset the BYTECNTR register for etr0:
> echo 0 > /sys/bus/coresight/devices/ctcu0/irq_enabled0
> 
> ---
> Changes in v16:
> 1. Remove lock/unlock processes in patch "coresight: tmc: add create/clean
>     functions for etr_buf_list" because we are allocating/freeing memory.
> - Link to v15: https://lore.kernel.org/r/20260313-enable-byte-cntr-for-ctcu-v15-0-1777f14ed319@oss.qualcomm.com
> 


Gentle ping


> Changes in v15:
> 1. add lockdep_assert_held in patch "coresight: tmc: add create/clean
>     functions for etr_buf_list"
> 2. optimize tmc_clean_etr_buf_list function
> 3. optimize the patch "enable byte-cntr for TMC ETR devices" according
>     to Suzuki's comments
>     - call byte_cntr_sysfs_ops from etr_sysfs_ops
>     - optimize the lock usage in all functions
>     - remove the buf_node parameter in etr_drvdata, move it to
>       byte_cntr_data
>     - move the tmc_reset_sysfs_buf function to tmc-etr.c
>     - add a read flag to struct etr_buf_node to allow updating pos while
>       traversing etr_buf_list during data reads.
> Link to v14: https://lore.kernel.org/r/20260309-enable-byte-cntr-for-ctcu-v14-0-c08823e5a8e6@oss.qualcomm.com
> 
> Changes in V14:
> 1. Drop the patch: integrate byte-cntr's sysfs_ops with tmc sysfs file_ops
> 2. Replace tmc_sysfs_ops with byte_cntr_sysfs_ops in byte_cntr_start
>     function and restore etr_sysfs_ops in byte_cntr_unprepare function.
> 3. Remove redundant checks in byte‑cntr functions.
> Link to V13: https://lore.kernel.org/all/20260223-enable-byte-cntr-for-ctcu-v13-0-9cb44178b250@oss.qualcomm.com/
> 
> Changes in v13:
> 1. initialize the byte_cntr_data->raw_spin_lock before using.
> 2. replace kzalloc with kzalloc_obj.
> Link to V12: https://lore.kernel.org/all/20260203-enable-byte-cntr-for-ctcu-v12-0-7bf81b86b70e@oss.qualcomm.com/
> 
> Changes in v12:
> 1. Add a new function for retrieving the CTCU's coresight_dev instead of
>     refactor the existing function.
> Link to v11: https://lore.kernel.org/r/20260126-enable-byte-cntr-for-ctcu-v11-0-c0af66ba15cf@oss.qualcomm.com
> 
> Changes in v11:
> 1. Correct the description in patch1 for the function coresight_get_in_port.
> 2. Renaming the sysfs_ops to tmc_sysfs_ops per Suzuki's suggestion.
> Link to v10: https://lore.kernel.org/r/20260122-enable-byte-cntr-for-ctcu-v10-0-22978e3c169f@oss.qualcomm.com
> 
> Changes in v10:
> 1. fix a free memory issue that is reported by robot for patch 2.
> Link to v9: https://lore.kernel.org/r/20251224-enable-byte-cntr-for-ctcu-v9-0-886c4496fed4@oss.qualcomm.com
> 
> Changes in v9:
> 1. Drop the patch: add a new API to retrieve the helper device
> 2. Add a new patch to refactor the tmc_etr_get_catu_device function,
>     making it generic to support all types of helper devices associated with ETR.
> 3. Optimizing the code for creating irq_threshold sysfs node.
> 4. Remove interrupt-name property and obtain the IRQ based on the
>     in-port number.
> Link to v8: https://lore.kernel.org/r/20251211-enable-byte-cntr-for-ctcu-v8-0-3e12ff313191@oss.qualcomm.com
> 
> Changes in V8:
> 1. Optimizing the patch 1 and patch 2 according to Suzuki's comments.
> 2. Combine the patch 3 and patch 4 together.
> 3. Rename the interrupt-name to prevent confusion, for example:etr0->etrirq0.
> Link to V7 - https://lore.kernel.org/all/20251013-enable-byte-cntr-for-ctcu-v7-0-e1e8f41e15dd@oss.qualcomm.com/
> 
> Changes in V7:
> 1. rebased on tag next-20251010
> 2. updated info for sysfs node document
> Link to V6 - https://lore.kernel.org/all/20250908-enable-byte-cntr-for-tmc-v6-0-1db9e621441a@oss.qualcomm.com/
> 
> Changes in V6:
> 1. rebased on next-20250905.
> 2. fixed the issue that the dtsi file has re-named from sa8775p.dtsi to
>     lemans.dtsi.
> 3. fixed some minor issues about comments.
> Link to V5 - https://lore.kernel.org/all/20250812083731.549-1-jie.gan@oss.qualcomm.com/
> 
> Changes in V5:
> 1. Add Mike's reviewed-by tag for patchset 1,2,5.
> 2. Remove the function pointer added to helper_ops according to Mike's
>     comment, it also results the patchset has been removed.
> 3. Optimizing the paired create/clean functions for etr_buf_list.
> 4. Remove the unneeded parameter "reading" from the etr_buf_node.
> Link to V4 - https://lore.kernel.org/all/20250725100806.1157-1-jie.gan@oss.qualcomm.com/
> 
> Changes in V4:
> 1. Rename the function to coresight_get_in_port_dest regarding to Mike's
> comment (patch 1/10).
> 2. Add lock to protect the connections regarding to Mike's comment
> (patch 2/10).
> 3. Move all byte-cntr functions to coresight-ctcu-byte-cntr file.
> 4. Add tmc_read_ops to wrap all read operations for TMC device.
> 5. Add a function in helper_ops to check whether the byte-cntr is
> enabled.
> 6. Call byte-cntr's read_ops if byte-cntr is enabled when reading data
> from the sysfs node.
> Link to V3 resend - https://lore.kernel.org/all/20250714063109.591-1-jie.gan@oss.qualcomm.com/
> 
> Changes in V3 resend:
> 1. rebased on next-20250711.
> Link to V3 - https://lore.kernel.org/all/20250624060438.7469-1-jie.gan@oss.qualcomm.com/
> 
> Changes in V3:
> 1. The previous solution has been deprecated.
> 2. Add an etr_buf_list to manage allocated etr buffers.
> 3. Add a logic to switch buffer for ETR.
> 4. Add read functions to read trace data from synced etr buffer.
> Link to V2 - https://lore.kernel.org/all/20250410013330.3609482-1-jie.gan@oss.qualcomm.com/
> 
> Changes in V2:
> 1. Removed the independent file node /dev/byte_cntr.
> 2. Integrated the byte-cntr's file operations with current ETR file
>     node.
> 3. Optimized the driver code of the CTCU that associated with byte-cntr.
> 4. Add kernel document for the export API tmc_etr_get_rwp_offset.
> 5. Optimized the way to read the rwp_offset according to Mike's
>     suggestion.
> 6. Removed the dependency of the dts patch.
> Link to V1 - https://lore.kernel.org/all/20250310090407.2069489-1-quic_jiegan@quicinc.com/
> 
> To: Suzuki K Poulose <suzuki.poulose@arm.com>
> To: Mike Leach <mike.leach@arm.com>
> To: James Clark <james.clark@linaro.org>
> To: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> To: Rob Herring <robh@kernel.org>
> To: Krzysztof Kozlowski <krzk+dt@kernel.org>
> To: Conor Dooley <conor+dt@kernel.org>
> To: Tingwei Zhang <tingwei.zhang@oss.qualcomm.com>
> To: Bjorn Andersson <andersson@kernel.org>
> To: Konrad Dybcio <konradybcio@kernel.org>
> Cc: coresight@lists.linaro.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-arm-msm@vger.kernel.org
> Cc: devicetree@vger.kernel.org
> Signed-off-by: Jie Gan <jie.gan@oss.qualcomm.com>
> 
> ---
> Jie Gan (7):
>        coresight: core: refactor ctcu_get_active_port and make it generic
>        coresight: tmc: add create/clean functions for etr_buf_list
>        coresight: tmc: introduce tmc_sysfs_ops to wrap sysfs read operations
>        coresight: etr: add a new function to retrieve the CTCU device
>        dt-bindings: arm: add an interrupt property for Coresight CTCU
>        coresight: ctcu: enable byte-cntr for TMC ETR devices
>        arm64: dts: qcom: lemans: add interrupts to CTCU device
> 
>   .../ABI/testing/sysfs-bus-coresight-devices-ctcu   |   9 +
>   .../bindings/arm/qcom,coresight-ctcu.yaml          |  10 +
>   arch/arm64/boot/dts/qcom/lemans.dtsi               |   3 +
>   drivers/hwtracing/coresight/Makefile               |   2 +-
>   drivers/hwtracing/coresight/coresight-core.c       |  24 ++
>   .../hwtracing/coresight/coresight-ctcu-byte-cntr.c | 286 +++++++++++++++++++++
>   drivers/hwtracing/coresight/coresight-ctcu-core.c  | 123 +++++++--
>   drivers/hwtracing/coresight/coresight-ctcu.h       |  79 +++++-
>   drivers/hwtracing/coresight/coresight-priv.h       |   2 +
>   drivers/hwtracing/coresight/coresight-tmc-core.c   |  55 ++--
>   drivers/hwtracing/coresight/coresight-tmc-etr.c    | 226 +++++++++++++++-
>   drivers/hwtracing/coresight/coresight-tmc.h        |  42 +++
>   12 files changed, 789 insertions(+), 72 deletions(-)
> ---
> base-commit: a0ae2a256046c0c5d3778d1a194ff2e171f16e5f
> change-id: 20260309-enable-byte-cntr-for-ctcu-ff86e6198b7f
> 
> Best regards,




* Re: [PATCH 0/2] arm64: dts: imx8m-kontron: Revert reading SD_VSEL signal
From: Frieder Schrempf @ 2026-04-01  8:28 UTC
  To: Peng Fan (OSS), Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Frank Li, Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam
  Cc: devicetree, imx, linux-arm-kernel, linux-kernel, Peng Fan
In-Reply-To: <20260401-imx8m-ldo5-v1-0-1b1c1381babd@nxp.com>

On 01.04.26 04:05, Peng Fan (OSS) wrote:
> When MUX is configured as SDHC VSELECT, enabling SION is not able
> to read back the SD_VSEL value. SION is used for force input path,
> not to redirect the PAD value to GPIO(the other mux).
> 
> This has been confirmed by reading the i.MX8MP RTL. We have not checked
> the i.MX8MM RTL, but it should be the same.

It seems like you are right and I misinterpreted the documentation and
also misinterpreted my test results. So I was probably basing my work on
wrong assumptions.

> 
> Not sure whether need to add Fixes commit for the patches, just revert
> patches.

This was introduced in 6.15. I would like to add Fixes tags for the
reverts. Can we also add patches to this series that switch to GPIO
control, as done in [1], and tag them as fixes too? This should allow
reading back the correct voltage from the regulator.

> 
> For the U-Boot support, either drop vqmmc-supply or switch to use gpio
> control to replace vselect control.
> 
> And below patch should also be revisited.

I think we can revert this, too.

> commit 3ce6f4f943ddd9edc03e450a2a0d89cb025b165b
> Author: Frieder Schrempf <frieder.schrempf@kontron.de>
> Date:   Wed Dec 18 16:27:27 2024 +0100
> 
>     regulator: pca9450: Fix control register for LDO5
> 
> To supporting read back signal, need the MUX set as GPIO and support
> in/out, not set mux as VSELECT.
> 
> TBH: I have not test setting MUX as GPIO, anyway we need to fix DT.

If we mux as GPIO, then we don't need to read back. I think in this case
the best solution is the one used in [1].

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=5245dc5

> 
> Signed-off-by: Peng Fan <peng.fan@nxp.com>
> ---
> Peng Fan (2):
>       Revert "arm64: dts: imx8mm-kontron: Add support for reading SD_VSEL signal"
>       Revert "arm64: dts: imx8mp-kontron: Add support for reading SD_VSEL signal"
> 
>  arch/arm64/boot/dts/freescale/imx8mm-kontron-bl.dts     | 10 +++-------
>  arch/arm64/boot/dts/freescale/imx8mm-kontron-osm-s.dtsi |  7 +++----
>  arch/arm64/boot/dts/freescale/imx8mp-kontron-osm-s.dtsi |  7 +++----
>  3 files changed, 9 insertions(+), 15 deletions(-)
> ---
> base-commit: 3b058d1aeeeff27a7289529c4944291613b364e9
> change-id: 20260329-imx8m-ldo5-90e369066213
> 
> Best regards,




* Re: [PATCH] dmaengine: xilinx_dma: Fix CPU stall in xilinx_dma_poll_timeout
From: Geert Uytterhoeven @ 2026-04-01  8:40 UTC
  To: Alex Bereza
  Cc: Gupta, Suraj, Vinod Koul, Frank Li, Michal Simek, Ulf Hansson,
	Arnd Bergmann, Tony Lindgren, dmaengine, linux-arm-kernel,
	linux-kernel
In-Reply-To: <DHHOCNHDN27K.RIE745OFAACD@bereza.email>

Hi Alex,

On Wed, 1 Apr 2026 at 10:27, Alex Bereza <alex@bereza.email> wrote:
> On Wed Apr 1, 2026 at 7:23 AM CEST, Suraj Gupta wrote:
> >> Rename XILINX_DMA_LOOP_COUNT to XILINX_DMA_POLL_TIMEOUT_US because the
> >> former is incorrect. It is a timeout value for polling various register
> >> bits in microseconds. It is not a loop count. Add a constant
> >> XILINX_DMA_POLL_DELAY_US for delay_us value.
> >
> > Please split this change in a new patch.
>
> Ok, will send a v2.
>
> >> Fixes: 7349a69cf312 ("iopoll: Do not use timekeeping in read_poll_timeout_atomic()")
> >
> > This patch doesn't fixes anything in iopoll, please use correct fixes tag.

Fixes tags are also used as guidelines, to indicate which patches
are also needed when backporting something.  I.e. if 7349a69cf312 is
ever backported, any other commits that contain "Fixes: 7349a69cf312"
should be backported, too.  So having this Fixes tag, in addition to
another xilinx_dma-specific one, sounds fine to me.

> Ok, but I'm not sure what would be the correct fixes tag then? I thought I needed to reference
> 7349a69cf312 in fixes tag because this is the actual change that surfaced the CPU stall issue that I
> want to fix in this driver. I'm fixing the call sites of xilinx_dma_poll_timeout but they were added
> in different commits. Should I add all of them? That would be the following then:
>
> Fixes: 9495f2648287 ("dmaengine: xilinx_vdma: Use readl_poll_timeout instead of do while loop's")
> Fixes: 676f9c26c330 ("dmaengine: xilinx: fix device_terminate_all() callback for AXI CDMA")

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds



* Re: [PATCH v3 0/6] drm/sun4i: Support LVDS on D1s/T113 combo D-PHY
From: Parthiban @ 2026-04-01  8:39 UTC
  To: Kuba Szczodrzyński, Maxime Ripard, Samuel Holland,
	Chen-Yu Tsai, Jernej Skrabec, Maarten Lankhorst,
	Thomas Zimmermann, Rob Herring, Krzysztof Kozlowski, Conor Dooley
  Cc: parthiban, David Airlie, Simona Vetter, linux-arm-kernel,
	linux-sunxi, linux-kernel, linux-riscv, linux-phy, devicetree,
	dri-devel, paulk
In-Reply-To: <a5f6aeb1-b038-462e-8989-c4da65966134@linumiz.com>

Dear Kuba,

On 2/7/26 2:34 PM, Parthiban wrote:
> On 11/16/25 2:46 PM, Kuba Szczodrzyński wrote:
>> Some Allwinner chips (notably the D1s/T113 and the A100) have a "combo
>> MIPI DSI D-PHY" which is required when using single-link LVDS0. The same
>> PD0..PD9 pins are used for either DSI or LVDS.
>>
>> Other than having to use the combo D-PHY, LVDS output is configured in
>> the same way as on older chips.
>>
>> This series enables the sun6i MIPI D-PHY to also work in LVDS mode. It
>> is then configured by the LCD TCON, which allows connecting a
>> single-link LVDS display panel.

I now also have MIPI and LVDS working together on the A133. Can I pick up your
changes and post a combined series for display support on the A133? This will
also address the D1s/T113.

--
Thanks,
Parthiban
https://linumiz.com
https://www.linkedin.com/company/linumiz



* Re: [PATCH 12/15] KVM: arm64: Remove evaluation of timer state in kvm_cpu_has_pending_timer()
From: Sascha Bischoff @ 2026-04-01  8:21 UTC
  To: maz@kernel.org
  Cc: yuzenghui@huawei.com, broonie@kernel.org, Suzuki Poulose,
	kvmarm@lists.linux.dev, oupton@kernel.org,
	linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
	Joey Gouly, nd
In-Reply-To: <868qb83pa6.wl-maz@kernel.org>

On Tue, 2026-03-31 at 18:02 +0100, Marc Zyngier wrote:
> On Tue, 31 Mar 2026 16:44:04 +0100,
> Sascha Bischoff <Sascha.Bischoff@arm.com> wrote:
> > 
> > On Thu, 2026-03-26 at 15:35 +0000, Marc Zyngier wrote:
> > > > The vgic-v5 code added some evaluations of the timers in a helper
> > > > function (kvm_cpu_has_pending_timer()) that is called to determine
> > > > whether the vcpu can wake-up.
> > > > 
> > > But looking at the timer there is wrong:
> > > 
> > > - we want to see timers that are signalling an interrupt to the
> > >   vcpu, and not just that have a pending interrupt
> > > 
> > > - we already have kvm_arch_vcpu_runnable() that evaluates the
> > >   state of interrupts
> > > 
> > > - kvm_cpu_has_pending_timer() really is about WFIT, as the
> > > timeout
> > >   does not generate an interrupt, and is therefore distinct from
> > >   the point above
> > > 
> > > As a consequence, revert these changes.
> > > 
> > > > Fixes: 9491c63b6cd7b ("KVM: arm64: gic-v5: Enlighten arch timer for GICv5")
> > > > Link: https://sashiko.dev/#/patchset/20260319154937.3619520-1-sascha.bischoff%40arm.com
> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > ---
> > >  arch/arm64/kvm/arch_timer.c | 6 +-----
> > >  1 file changed, 1 insertion(+), 5 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kvm/arch_timer.c
> > > b/arch/arm64/kvm/arch_timer.c
> > > index 37279f8748695..6608c47d1f628 100644
> > > --- a/arch/arm64/kvm/arch_timer.c
> > > +++ b/arch/arm64/kvm/arch_timer.c
> > > @@ -402,11 +402,7 @@ static bool kvm_timer_should_fire(struct
> > > arch_timer_context *timer_ctx)
> > >  
> > >  int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
> > >  {
> > > -	struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
> > > -	struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
> > > -
> > > -	return kvm_timer_should_fire(vtimer) ||
> > > kvm_timer_should_fire(ptimer) ||
> > > -	       (vcpu_has_wfit_active(vcpu) &&
> > > wfit_delay_ns(vcpu) ==
> > > 0);
> > > +	return vcpu_has_wfit_active(vcpu) && wfit_delay_ns(vcpu)
> > > ==
> > > 0;
> > >  }
> > >  
> > >  /*
> > 
> > Hi Marc,
> > 
> > > It appears that I'd misunderstood the intent of this function when I
> > > originally wrote this bit of code. That is: I agree that these checks
> > > shouldn't be here.
> > > 
> > > However, said checks are needed somewhere. With GICv5, we directly
> > > inject the timer state (when possible, at least) which means that we
> > > never see the timer interrupt firing on the host, and don't track if it
> > > is pending or not in struct vgic_irq as the pending state is driven by
> > > the hardware itself. The result of this is that we explicitly need to
> > > check if the timer interrupt would be pending if the guest were running
> > > somewhere.
> > 
> > I've run with this complete series and have tested the following
> > change. It is sufficient to catch this case, and does it as part of
> > checking if there are pending interrupts, i.e., a more appropriate
> > place called via kvm_arch_vcpu_runnable(). It is yet-another GICv5
> > special case, however. I'd love to hear your thoughts.
> > 
> > diff --git a/arch/arm64/kvm/arch_timer.c
> > b/arch/arm64/kvm/arch_timer.c
> > index cbea4d9ee9552..f8b95721857c3 100644
> > --- a/arch/arm64/kvm/arch_timer.c
> > +++ b/arch/arm64/kvm/arch_timer.c
> > @@ -400,6 +400,14 @@ static bool kvm_timer_should_fire(struct
> > arch_timer_context *timer_ctx)
> >         return cval <= now;
> >  }
> >  
> > +int kvm_cpu_timer_should_fire(struct kvm_vcpu *vcpu)
> > +{
> > +       struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
> > +       struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
> > +
> > +       return kvm_timer_should_fire(vtimer) ||
> > kvm_timer_should_fire(ptimer);
> > +}
> > +
> >  int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
> >  {
> >         return vcpu_has_wfit_active(vcpu) && wfit_delay_ns(vcpu) ==
> > 0;
> > diff --git a/arch/arm64/kvm/vgic/vgic.c
> > b/arch/arm64/kvm/vgic/vgic.c
> > index 7680ced92f715..ffb91f535efe8 100644
> > --- a/arch/arm64/kvm/vgic/vgic.c
> > +++ b/arch/arm64/kvm/vgic/vgic.c
> > @@ -1238,6 +1238,9 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu
> > *vcpu)
> >                 if (READ_ONCE(vcpu-
> > >arch.vgic_cpu.vgic_v5.gicv5_vpe.db_fired))
> >                         return true;
> >  
> > +               if (kvm_cpu_timer_should_fire(vcpu))
> > +                       return true;
> > +
> 
> This unfortunately seems to suffer from the exact same problem: you
> are evaluating the output of the timer independently of the enable
> bits gating the timer interrupt at the GIC level.
> 
> With this, you can disable the timers at the GIC level, arm the timers
> so that they are in a firing position, and enter WFI: the vcpu will
> exit WFI immediately, which is not the expected result.
> 
> I'd suggest something like this instead (compile tested only):
> 
> diff --git a/arch/arm64/kvm/vgic/vgic-v5.c
> b/arch/arm64/kvm/vgic/vgic-v5.c
> index 75372bbfb6a6a..e7d23d0519e8b 100644
> --- a/arch/arm64/kvm/vgic/vgic-v5.c
> +++ b/arch/arm64/kvm/vgic/vgic-v5.c
> @@ -365,9 +365,13 @@ bool vgic_v5_has_pending_ppi(struct kvm_vcpu
> *vcpu)
>  
>  		irq = vgic_get_vcpu_irq(vcpu, intid);
>  
> -		scoped_guard(raw_spinlock_irqsave, &irq->irq_lock)
> -			has_pending = (irq->enabled && irq_is_pending(irq) &&
> +		scoped_guard(raw_spinlock_irqsave, &irq->irq_lock) {
> +			bool pending;
> +
> +			pending = irq->hw ? vgic_get_phys_line_level(irq) : irq_is_pending(irq);
> +			has_pending = (irq->enabled && pending &&
>  				       irq->priority < priority_mask);
> +		}
>  
>  		vgic_put_irq(vcpu->kvm, irq);
>  

I've just tested the above and can confirm that it does work as
expected. It indeed makes a lot more sense than what I'd suggested.

Thanks,
Sascha

> 
> Thanks,
> 
> 	M.
> 
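
The condition discussed in this exchange can be modelled in user space as a
rough sketch. The struct below is a simplified stand-in, not the kernel's
struct vgic_irq, and every name in it is hypothetical; only the shape of the
check (enabled, pending taken from the hardware line level for hw interrupts,
priority below the mask) follows the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the interrupt state; all names are hypothetical. */
struct model_irq {
	bool enabled;		/* enable bit at the GIC level */
	bool hw;		/* pending state is driven by hardware (e.g. a timer) */
	bool line_level;	/* what the physical line would report */
	bool sw_pending;	/* software-tracked pending state */
	unsigned int priority;	/* lower value means higher priority */
};

/* True if this interrupt should count as pending for wake-up purposes. */
static bool model_irq_can_wake(const struct model_irq *irq,
			       unsigned int priority_mask)
{
	assert(irq != NULL);

	/* For hardware-driven interrupts, sample the line instead of software state. */
	bool pending = irq->hw ? irq->line_level : irq->sw_pending;

	return irq->enabled && pending && irq->priority < priority_mask;
}
```

In this model a timer that is in a firing position but disabled at the GIC
level does not count as pending, which matches the WFI behaviour described
in the quoted review.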



* Re: [PATCH v11 03/22] drm: Add new general DRM property "color format"
From: Michel Dänzer @ 2026-04-01  8:27 UTC
  To: Nicolas Frattaroli, Ville Syrjälä, Dave Stevenson
  Cc: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
	Christian König, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Andrzej Hajda, Neil Armstrong, Robert Foss, Laurent Pinchart,
	Jonas Karlman, Jernej Skrabec, Sandy Huang, Heiko Stübner,
	Andy Yan, Jani Nikula, Rodrigo Vivi, Joonas Lahtinen,
	Tvrtko Ursulin, Dmitry Baryshkov, Sascha Hauer, Rob Herring,
	Jonathan Corbet, Shuah Khan, kernel, amd-gfx, dri-devel,
	linux-kernel, linux-arm-kernel, linux-rockchip, intel-gfx,
	intel-xe, linux-doc, Werner Sembach, Andri Yngvason, Marius Vlad
In-Reply-To: <7991520.DvuYhMxLoT@workhorse>

On 3/26/26 13:02, Nicolas Frattaroli wrote:
> On Thursday, 26 March 2026 12:16:12 Central European Standard Time Dave Stevenson wrote:
>> On Wed, 25 Mar 2026 at 13:43, Ville Syrjälä
>> <ville.syrjala@linux.intel.com> wrote:
>>> On Wed, Mar 25, 2026 at 12:49:19PM +0000, Dave Stevenson wrote:
>>>> On Tue, 24 Mar 2026 at 16:02, Nicolas Frattaroli
>>>> <nicolas.frattaroli@collabora.com> wrote:
>>>>>
>>>>> +/**
>>>>> + * enum drm_connector_color_format - Connector Color Format Request
>>>>> + *
>>>>> + * This enum, unlike &enum drm_output_color_format, is used to specify requests
>>>>> + * for a specific color format on a connector through the DRM "color format"
>>>>> + * property. The difference is that it has an "AUTO" value to specify that
>>>>> + * no specific choice has been made.
>>>>> + */
>>>>> +enum drm_connector_color_format {
>>>>> +       /**
>>>>> +        * @DRM_CONNECTOR_COLOR_FORMAT_AUTO: The driver or display protocol
>>>>> +        * helpers should pick a suitable color format. All implementations of a
>>>>> +        * specific display protocol must behave the same way with "AUTO", but
>>>>> +        * different display protocols do not necessarily have the same "AUTO"
>>>>> +        * semantics.
>>>>> +        *
>>>>> +        * For HDMI, "AUTO" picks RGB, but falls back to YCbCr 4:2:0 if the
>>>>> +        * bandwidth required for full-scale RGB is not available, or the mode
>>>>> +        * is YCbCr 4:2:0-only, as long as the mode and output both support
>>>>> +        * YCbCr 4:2:0.
>>>>
>>>> Is there a reason you propose dropping back to YCbCr 4:2:0 without
>>>> trying YCbCr 4:2:2 first? Minimising the subsampling is surely
>>>> beneficial, and vc4 for one can do 4:2:2 but not 4:2:0.
>>>
>>> On HDMI 4:2:2 is always 12bpc, so it doesn't save any bandwidth
>>> compared to 8bpc 4:4:4.
>>
>> It does save bandwidth against 10 or 12bpc RGB 4:4:4.
>>
>> Or is the implication that max_bpc = 12 and
>> DRM_CONNECTOR_COLOR_FORMAT_AUTO should drop bpc down to 8 and select
>> RGB in preference to selecting 4:2:2?
> 
> Yes. Some people consider max-bpc to not be a legitimate way of requesting
> an actual bpc, and don't think drivers will choose the highest bpc <= max-bpc,
> and instead may negotiate a fantasy number anywhere below or equal to max-bpc.

Ridiculing others like this for disagreeing with you is uncalled for.

Is there any evidence for your claim that the driver must always use the 
highest possible bpc <= max-bpc?


> Of course this logic could be done in userspace which knows whether the
> less chroma for more bit depth trade-off is worth it, but userspace does
> not know the negotiated link bpc, and my attempts at adding a property for
> it are being blocked.

Assuming you're referring to the concerns I raised there, I don't have the power or intent to block it.
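
For reference, the bandwidth comparison in the quoted discussion comes down to
simple per-pixel arithmetic. The sketch below counts only average payload bits
per pixel from the chroma subsampling ratio and the bits per component; it
ignores any link-level framing, and the enum and function names are made up
for illustration.

```c
#include <assert.h>

enum chroma { CHROMA_444, CHROMA_422, CHROMA_420 };

/* Average payload bits per pixel: samples per pixel times bits per component. */
static unsigned int avg_bits_per_pixel(enum chroma fmt, unsigned int bpc)
{
	assert(bpc > 0);

	switch (fmt) {
	case CHROMA_444:
		return 3 * bpc;		/* three components for every pixel */
	case CHROMA_422:
		return 2 * bpc;		/* Y per pixel, Cb/Cr shared by 2 pixels */
	case CHROMA_420:
		return 3 * bpc / 2;	/* Y per pixel, Cb/Cr shared by 4 pixels */
	}
	return 0;
}
```

By this count, 12bpc 4:2:2 and 8bpc 4:4:4 both come to 24 bits per pixel,
while 10bpc and 12bpc 4:4:4 come to 30 and 36, which is consistent with both
statements quoted above.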


-- 
Earthling Michel Dänzer       \        GNOME / Xwayland / Mesa developer
https://redhat.com             \               Libre software enthusiast



* Re: [PATCH] dmaengine: xilinx_dma: Fix CPU stall in xilinx_dma_poll_timeout
From: Alex Bereza @ 2026-04-01  8:27 UTC
  To: Gupta, Suraj, Alex Bereza, Vinod Koul, Frank Li, Michal Simek,
	Geert Uytterhoeven, Ulf Hansson, Arnd Bergmann, Tony Lindgren
  Cc: dmaengine, linux-arm-kernel, linux-kernel
In-Reply-To: <833bb42a-65b8-4c93-8109-d2959f8b807f@amd.com>

On Wed Apr 1, 2026 at 7:23 AM CEST, Suraj Gupta wrote:

>> Rename XILINX_DMA_LOOP_COUNT to XILINX_DMA_POLL_TIMEOUT_US because the
>> former is incorrect. It is a timeout value for polling various register
>> bits in microseconds. It is not a loop count. Add a constant
>> XILINX_DMA_POLL_DELAY_US for delay_us value.
>
> Please split this change in a new patch.

Ok, will send a v2.

>> Fixes: 7349a69cf312 ("iopoll: Do not use timekeeping in read_poll_timeout_atomic()")
>
> This patch doesn't fixes anything in iopoll, please use correct fixes tag.

Ok, but I'm not sure what would be the correct fixes tag then? I thought I needed to reference
7349a69cf312 in fixes tag because this is the actual change that surfaced the CPU stall issue that I
want to fix in this driver. I'm fixing the call sites of xilinx_dma_poll_timeout but they were added
in different commits. Should I add all of them? That would be the following then:

Fixes: 9495f2648287 ("dmaengine: xilinx_vdma: Use readl_poll_timeout instead of do while loop's")
Fixes: 676f9c26c330 ("dmaengine: xilinx: fix device_terminate_all() callback for AXI CDMA")

Three call sites with delay_us=0 were first introduced by 9495f2648287, then 676f9c26c330 added the
fourth call site when introducing xilinx_cdma_stop_transfer (probably copy-pasted from
xilinx_dma_stop_transfer). Would adding these two Fixes tags be correct?

>> Hi, in addition to this patch I also have a question: what is the point
>> of atomically polling for the HALTED or IDLE bit in the stop_transfer
>> functions? Does device_terminate_all really need to be callable from
>> atomic context? If not, one could switch to polling non-atomically and
>> avoid burning CPU cycles.
>>
>
> dmaengine_terminate_async(), which directly calls device_terminate_all
> can be called from atomic context.

Right, thanks! Just for my understanding: I still think there is potential for improvement, because
as far as I can tell it would be beneficial to wait for the bits in the status register and free the
descriptors in xilinx_dma_synchronize. Do I understand correctly that this is currently not possible
due to how the DMA engine API is structured? To make this possible, I think the deprecated
dmaengine_terminate_all would have to be removed and all users of this API would have to be adapted
accordingly, correct? So this would be a patch of much larger scope than the xilinx_dma driver
alone.
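
For illustration, the stall mechanism discussed in this thread can be modelled in plain C. This is
only a simplified userspace sketch, not the kernel's read_poll_timeout_atomic() implementation:
poll_budgeted(), never_ready() and struct poll_result are hypothetical names, and the model assumes
that the timeout is tracked by charging each iteration its delay plus a nominal 1 ns instead of
reading a clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for reading a device status register. */
typedef uint32_t (*read_fn)(void *ctx);

struct poll_result {
	bool ok;        /* true if the masked bit was seen before the budget ran out */
	uint64_t reads; /* number of register reads performed */
};

/* A "device" whose status bit never gets set, to exercise the timeout path. */
uint32_t never_ready(void *ctx)
{
	(void)ctx;
	return 0;
}

/*
 * Simplified model (an assumption, not kernel code): the time budget is not
 * measured with a clock. Each iteration is charged delay_us plus 1 ns.
 */
struct poll_result poll_budgeted(read_fn rd, void *ctx, uint32_t mask,
				 uint64_t delay_us, uint64_t timeout_us)
{
	struct poll_result res = { false, 0 };
	int64_t left_ns = (int64_t)timeout_us * 1000;

	assert(rd != NULL);

	for (;;) {
		res.reads++;
		if (rd(ctx) & mask) {
			res.ok = true;
			break;
		}
		if (left_ns <= 0)
			break;
		/* A real loop would busy-wait for delay_us here. */
		left_ns -= (int64_t)delay_us * 1000;
		/* Nominal charge for the relax step between reads. */
		left_ns -= 1;
	}
	return res;
}
```

In this model, a 10 us budget with delay_us = 0 allows 10001 register reads before giving up, while
delay_us = 1 gives up after 11 reads. If each read is a slow MMIO access, the first case can take
far longer than the nominal timeout, which is the kind of stall a non-zero delay avoids.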

^ permalink raw reply

* [PATCH] arch/arm: Drop CONFIG_FIRMWARE_EDID from defconfig files
From: Thomas Zimmermann @ 2026-04-01  8:25 UTC (permalink / raw)
  To: linux, aaro.koskinen, jmkrzyszt, tony, andreas, khilman, rogerq,
	arnd
  Cc: linux-arm-kernel, linux-kernel, linux-omap, soc, linux-fbdev,
	dri-devel, Thomas Zimmermann

CONFIG_FIRMWARE_EDID=y depends on X86 or EFI_GENERIC_STUB. Neither is
true here, so drop the lines from the defconfig files.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/configs/davinci_all_defconfig | 1 -
 arch/arm/configs/omap1_defconfig       | 1 -
 arch/arm/configs/omap2plus_defconfig   | 1 -
 arch/arm/configs/pxa_defconfig         | 1 -
 4 files changed, 4 deletions(-)

diff --git a/arch/arm/configs/davinci_all_defconfig b/arch/arm/configs/davinci_all_defconfig
index 673408a10888..72703ef0c51c 100644
--- a/arch/arm/configs/davinci_all_defconfig
+++ b/arch/arm/configs/davinci_all_defconfig
@@ -148,7 +148,6 @@ CONFIG_DRM_TINYDRM=m
 CONFIG_TINYDRM_ST7586=m
 CONFIG_FB=y
 CONFIG_FB_DA8XX=y
-CONFIG_FIRMWARE_EDID=y
 CONFIG_BACKLIGHT_PWM=m
 CONFIG_BACKLIGHT_GPIO=m
 CONFIG_FRAMEBUFFER_CONSOLE=y
diff --git a/arch/arm/configs/omap1_defconfig b/arch/arm/configs/omap1_defconfig
index df88763fc7c3..c6155f101fc9 100644
--- a/arch/arm/configs/omap1_defconfig
+++ b/arch/arm/configs/omap1_defconfig
@@ -136,7 +136,6 @@ CONFIG_FB_OMAP_LCDC_EXTERNAL=y
 CONFIG_FB_OMAP_LCDC_HWA742=y
 CONFIG_FB_OMAP_MANUAL_UPDATE=y
 CONFIG_FB_OMAP_LCD_MIPID=y
-CONFIG_FIRMWARE_EDID=y
 CONFIG_FB_MODE_HELPERS=y
 CONFIG_LCD_CLASS_DEVICE=y
 CONFIG_FRAMEBUFFER_CONSOLE=y
diff --git a/arch/arm/configs/omap2plus_defconfig b/arch/arm/configs/omap2plus_defconfig
index 0464f6552169..8e09e66ccc4d 100644
--- a/arch/arm/configs/omap2plus_defconfig
+++ b/arch/arm/configs/omap2plus_defconfig
@@ -505,7 +505,6 @@ CONFIG_DRM_SIMPLE_BRIDGE=m
 CONFIG_DRM_TI_TFP410=m
 CONFIG_DRM_TI_TPD12S015=m
 CONFIG_FB=y
-CONFIG_FIRMWARE_EDID=y
 CONFIG_FB_MODE_HELPERS=y
 CONFIG_FB_TILEBLITTING=y
 CONFIG_LCD_CLASS_DEVICE=y
diff --git a/arch/arm/configs/pxa_defconfig b/arch/arm/configs/pxa_defconfig
index eacd08fd87ad..c51ae373ca88 100644
--- a/arch/arm/configs/pxa_defconfig
+++ b/arch/arm/configs/pxa_defconfig
@@ -391,7 +391,6 @@ CONFIG_LCD_CORGI=m
 CONFIG_LCD_PLATFORM=m
 CONFIG_BACKLIGHT_PWM=m
 CONFIG_FRAMEBUFFER_CONSOLE=y
-CONFIG_FIRMWARE_EDID=y
 CONFIG_FB_TILEBLITTING=y
 CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
 CONFIG_LOGO=y
-- 
2.53.0



^ permalink raw reply related

* Re: [PATCH 09/15] KVM: arm64: vgic-v5: align priority comparison with other GICs
From: Sascha Bischoff @ 2026-04-01  8:18 UTC (permalink / raw)
  To: maz@kernel.org
  Cc: yuzenghui@huawei.com, broonie@kernel.org, Suzuki Poulose,
	kvmarm@lists.linux.dev, oupton@kernel.org,
	linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
	Joey Gouly, nd
In-Reply-To: <867bqr534v.wl-maz@kernel.org>

On Tue, 2026-03-31 at 18:18 +0100, Marc Zyngier wrote:
> On Tue, 31 Mar 2026 16:09:10 +0100,
> Sascha Bischoff <Sascha.Bischoff@arm.com> wrote:
> > 
> > On Thu, 2026-03-26 at 15:35 +0000, Marc Zyngier wrote:
> > > The way the effective priority mask is computed, and then
> > > compared
> > > to the priority of an interrupt to decide whether to wake-up or
> > > not,
> > > is slightly odd, and breaks at the limits.
> > > 
> > > This could result in spurious wake-ups that are undesirable.
> > > 
> > > Adopt the GICv[23] logic instead, which checks that the priority
> > > value
> > > is strictly lower than the mask.
> > > 
> > > Fixes: 933e5288fa971 ("KVM: arm64: gic-v5: Check for pending
> > > PPIs")
> > > Link:
> > > https://sashiko.dev/#/patchset/20260319154937.3619520-1-sascha.bischoff%40arm.com
> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > ---
> > >  arch/arm64/kvm/vgic/vgic-v5.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kvm/vgic/vgic-v5.c
> > > b/arch/arm64/kvm/vgic/vgic-v5.c
> > > index 0f269321ece4b..75372bbfb6a6a 100644
> > > --- a/arch/arm64/kvm/vgic/vgic-v5.c
> > > +++ b/arch/arm64/kvm/vgic/vgic-v5.c
> > > @@ -238,7 +238,7 @@ static u32
> > > vgic_v5_get_effective_priority_mask(struct kvm_vcpu *vcpu)
> > >  	 */
> > >  	priority_mask = FIELD_GET(FEAT_GCIE_ICH_VMCR_EL2_VPMR,
> > > cpu_if->vgic_vmcr);
> > >  
> > > -	return min(highest_ap, priority_mask + 1);
> > > +	return min(highest_ap, priority_mask);
> > 
> > Hi Marc,
> > 
> > This part of your change (dropping the `+ 1`) is not correct for
> > GICv5.
> > The GICv[23] PMR works differently to the GICv5 PCR.
> > 
> > For GICv[23] the mask is exclusive, i.e., only higher priority
> > (lower
> > numerical value) interrupts are of sufficient priority to be
> > signalled.
> > 
> > For GICv5, the priority of an interrupt can be equal to or higher
> > than
> > (numerically lower than) the mask. See DMSQKF in the GICv5 spec:
> > 
> > A physical interrupt has Sufficient priority to be signaled when
> > all of
> > the following are true:
> >    * The priority of the interrupt is higher than the physical
> > running
> >    priority for the Physical Interrupt Domain.
> >    * The priority of the interrupt is equal to or higher than the
> >    Physical Priority Mask for the Physical Interrupt Domain.
> >    
> > Therefore, we require this `+ 1` for the priority_mask in order to
> > allow
> > us to combine the active priority and priority mask. Else, they
> > operate on
> > different scales.
> > 
> > I'd tried to explain this in a comment that lies just outside the
> > diff,
> > but hadn't explicitly called out that GICv5 operates differently to
> > GICv[23] in this regard. Apologies.
> 
> Nothing to apologise about, this is me not being able to read.
> 
> >    
> > >  }
> > >  
> > >  /*
> > > @@ -367,7 +367,7 @@ bool vgic_v5_has_pending_ppi(struct kvm_vcpu
> > > *vcpu)
> > >  
> > >  		scoped_guard(raw_spinlock_irqsave, &irq-
> > > >irq_lock)
> > >  			has_pending = (irq->enabled &&
> > > irq_is_pending(irq) &&
> > > -				       irq->priority <=
> > > priority_mask);
> > > +				       irq->priority <
> > > priority_mask);
> > 
> > I agree that this was wrong and should never have included the
> > equality. This was definitely a bug!
> 
> Cool. I'll revert the revert of the first hunk and keep the second
> one.

Sounds good. The commit message will need some rejigging too.

Thanks,
Sascha
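
As an illustration of the point above, the following standalone C sketch checks that folding the
inclusive GICv5 mask into an exclusive bound via `+ 1` is equivalent to applying the two quoted
rules directly. It is not the vgic-v5 code: the function names are hypothetical, and the 32-level
range used in count_mismatches() is an assumption made only to keep the exhaustive check small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static inline uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Fold the running priority (exclusive bound) and the priority mask
 * (inclusive bound) into one exclusive bound. The "+ 1" converts the
 * inclusive mask to the exclusive scale.
 */
uint32_t effective_priority_mask(uint32_t highest_ap, uint32_t priority_mask)
{
	assert(priority_mask < UINT32_MAX); /* "+ 1" must not wrap */
	return min_u32(highest_ap, priority_mask + 1);
}

/* Single strict comparison against the combined bound. */
bool has_sufficient_priority(uint32_t irq_prio, uint32_t highest_ap,
			     uint32_t priority_mask)
{
	return irq_prio < effective_priority_mask(highest_ap, priority_mask);
}

/*
 * The two rules as quoted: strictly higher priority (numerically lower)
 * than the running priority, and equal to or higher than the mask.
 */
bool sufficient_by_rule(uint32_t irq_prio, uint32_t running, uint32_t mask)
{
	return irq_prio < running && irq_prio <= mask;
}

/* Count disagreements between the two formulations over a small range. */
unsigned int count_mismatches(uint32_t levels)
{
	unsigned int bad = 0;

	for (uint32_t prio = 0; prio < levels; prio++)
		for (uint32_t run = 0; run < levels; run++)
			for (uint32_t mask = 0; mask < levels; mask++)
				if (has_sufficient_priority(prio, run, mask) !=
				    sufficient_by_rule(prio, run, mask))
					bad++;
	return bad;
}
```

Without the `+ 1`, an interrupt whose priority equals the mask would be rejected by the strict
comparison, which would match the exclusive GICv[23] behaviour but not the inclusive GICv5 rule.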

> 
> Thanks!
> 
> 	M.
> 


^ permalink raw reply

* [PATCH v3 1/2] dt-bindings: perf: marvell: Document CN20K DDR PMU
From: Geetha sowjanya @ 2026-04-01  8:16 UTC (permalink / raw)
  To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
  Cc: mark.rutland, will, krzk+dt
In-Reply-To: <20260401081640.23740-1-gakula@marvell.com>

Add a devicetree binding for the Marvell CN20K DDR performance
monitor block, including the marvell,cn20k-ddr-pmu compatible
string and the required MMIO reg region.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
---
 .../bindings/perf/marvell-cn20k-ddr.yaml      | 39 +++++++++++++++++++
 1 file changed, 39 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/perf/marvell-cn20k-ddr.yaml

diff --git a/Documentation/devicetree/bindings/perf/marvell-cn20k-ddr.yaml b/Documentation/devicetree/bindings/perf/marvell-cn20k-ddr.yaml
new file mode 100644
index 000000000000..fa757017d66e
--- /dev/null
+++ b/Documentation/devicetree/bindings/perf/marvell-cn20k-ddr.yaml
@@ -0,0 +1,39 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/perf/marvell-cn20k-ddr.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Marvell CN20K DDR performance monitor
+
+description:
+  Performance Monitoring Unit (PMU) for the DDR controller
+  in Marvell CN20K SoCs.
+
+maintainers:
+  - Geetha sowjanya <gakula@marvell.com>
+
+properties:
+  compatible:
+    const: marvell,cn20k-ddr-pmu
+
+  reg:
+    maxItems: 1
+
+required:
+  - compatible
+  - reg
+
+additionalProperties: false
+
+examples:
+  - |
+    bus {
+        #address-cells = <2>;
+        #size-cells = <2>;
+
+        ddr-pmu@c200000000 {
+            compatible = "marvell,cn20k-ddr-pmu";
+            reg = <0xc200 0x00000000 0x0 0x100000>;
+        };
+    };
-- 
2.25.1



^ permalink raw reply related

* [PATCH v3 2/2] perf: marvell: Add CN20K DDR PMU support
From: Geetha sowjanya @ 2026-04-01  8:16 UTC (permalink / raw)
  To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
  Cc: mark.rutland, will, krzk+dt
In-Reply-To: <20260401081640.23740-1-gakula@marvell.com>

The CN20K DRAM Subsystem exposes eight programmable
performance counters and two fixed counters for DDR
read and write traffic.  Software selects events for
the programmable counters from traffic at the DDR PHY
interface, the CHI interconnect, or inside the DDR controller.

Add CN20K register offsets, event maps, and sysfs attributes;
match the device via OF (marvell,cn20k-ddr-pmu) and ACPI (MRVL000B).
Represent the SoC variant in platform data with bit flags so
CN20K can reuse the Odyssey PMU code path where appropriate.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
---
 drivers/perf/marvell_cn10k_ddr_pmu.c | 187 ++++++++++++++++++++++++---
 1 file changed, 171 insertions(+), 16 deletions(-)

diff --git a/drivers/perf/marvell_cn10k_ddr_pmu.c b/drivers/perf/marvell_cn10k_ddr_pmu.c
index 72ac17efd846..7e2e1823b009 100644
--- a/drivers/perf/marvell_cn10k_ddr_pmu.c
+++ b/drivers/perf/marvell_cn10k_ddr_pmu.c
@@ -13,31 +13,43 @@
 #include <linux/hrtimer.h>
 #include <linux/acpi.h>
 #include <linux/platform_device.h>
+#include <linux/bits.h>
+
+/* SoC variant flags for struct ddr_pmu_platform_data (mutually exclusive in pdata) */
+#define IS_CN10K	BIT(0)
+#define IS_ODY		BIT(1)
+#define IS_CN20K	BIT(2)
 
 /* Performance Counters Operating Mode Control Registers */
 #define CN10K_DDRC_PERF_CNT_OP_MODE_CTRL	0x8020
 #define ODY_DDRC_PERF_CNT_OP_MODE_CTRL		0x20020
+#define CN20K_DDRC_PERF_CNT_OP_MODE_CTRL	0x20000
 #define OP_MODE_CTRL_VAL_MANUAL	0x1
 
 /* Performance Counters Start Operation Control Registers */
 #define CN10K_DDRC_PERF_CNT_START_OP_CTRL	0x8028
 #define ODY_DDRC_PERF_CNT_START_OP_CTRL		0x200A0
+#define CN20K_DDRC_PERF_CNT_START_OP_CTRL	0x20080
 #define START_OP_CTRL_VAL_START		0x1ULL
 #define START_OP_CTRL_VAL_ACTIVE	0x2
 
 /* Performance Counters End Operation Control Registers */
 #define CN10K_DDRC_PERF_CNT_END_OP_CTRL	0x8030
 #define ODY_DDRC_PERF_CNT_END_OP_CTRL	0x200E0
+#define CN20K_DDRC_PERF_CNT_END_OP_CTRL	0x200C0
 #define END_OP_CTRL_VAL_END		0x1ULL
 
 /* Performance Counters End Status Registers */
 #define CN10K_DDRC_PERF_CNT_END_STATUS		0x8038
 #define ODY_DDRC_PERF_CNT_END_STATUS		0x20120
+#define CN20K_DDRC_PERF_CNT_END_STATUS		0x20100
 #define END_STATUS_VAL_END_TIMER_MODE_END	0x1
 
 /* Performance Counters Configuration Registers */
 #define CN10K_DDRC_PERF_CFG_BASE		0x8040
 #define ODY_DDRC_PERF_CFG_BASE			0x20160
+#define CN20K_DDRC_PERF_CFG_BASE		0x20140
+#define CN20K_DDRC_PERF_CFG1_BASE		0x20180
 
 /* 8 Generic event counter + 2 fixed event counters */
 #define DDRC_PERF_NUM_GEN_COUNTERS	8
@@ -61,6 +73,23 @@
  * DO NOT change these event-id numbers, they are used to
  * program event bitmap in h/w.
  */
+
+/* CN20K specific events */
+#define EVENT_PERF_OP_IS_RD16			61
+#define EVENT_PERF_OP_IS_RD32			60
+#define EVENT_PERF_OP_IS_WR16			59
+#define EVENT_PERF_OP_IS_WR32			58
+#define EVENT_OP_IS_ENTER_DSM			44
+#define EVENT_OP_IS_RFM				43
+
+#define EVENT_CN20K_OP_IS_TCR_MRR			50
+#define EVENT_CN20K_OP_IS_DQSOSC_MRR			49
+#define EVENT_CN20K_OP_IS_DQSOSC_MPC			48
+#define EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_WR	47
+#define EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_RD	46
+#define EVENT_CN20K_OP_IS_ZQLATCH			21
+#define EVENT_CN20K_OP_IS_ZQSTART			22
+
 #define EVENT_DFI_CMD_IS_RETRY			61
 #define EVENT_RD_UC_ECC_ERROR			60
 #define EVENT_RD_CRC_ERROR			59
@@ -87,6 +116,9 @@
 #define EVENT_OP_IS_SPEC_REF			41
 #define EVENT_OP_IS_CRIT_REF			40
 #define EVENT_OP_IS_REFRESH			39
+#define EVENT_OP_IS_CAS_WCK_SUS			38
+#define EVENT_OP_IS_CAS_WS_OFF			37
+#define EVENT_OP_IS_CAS_WS			36
 #define EVENT_OP_IS_ENTER_MPSM			35
 #define EVENT_OP_IS_ENTER_POWERDOWN		31
 #define EVENT_OP_IS_ENTER_SELFREF		27
@@ -183,8 +215,8 @@ struct ddr_pmu_platform_data {
 	u64 cnt_freerun_clr;
 	u64 cnt_value_wr_op;
 	u64 cnt_value_rd_op;
-	bool is_cn10k;
-	bool is_ody;
+	u64 cfg1_base;
+	unsigned int silicon_flags; /* IS_CN10K, IS_ODY, or IS_CN20K */
 };
 
 static ssize_t cn10k_ddr_pmu_event_show(struct device *dev,
@@ -336,6 +368,80 @@ static struct attribute *odyssey_ddr_perf_events_attrs[] = {
 	NULL
 };
 
+static struct attribute *cn20k_ddr_perf_events_attrs[] = {
+	/* Programmable */
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_rd_or_wr_access, EVENT_HIF_RD_OR_WR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_wr_access, EVENT_HIF_WR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_rd_access, EVENT_HIF_RD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_rmw_access, EVENT_HIF_RMW),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_pri_rdaccess, EVENT_HIF_HI_PRI_RD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_rd_bypass_access, EVENT_READ_BYPASS),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_act_bypass_access, EVENT_ACT_BYPASS),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_dfi_wr_data_access,
+				 EVENT_DFI_WR_DATA_CYCLES),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_dfi_rd_data_access,
+				 EVENT_DFI_RD_DATA_CYCLES),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_hpri_sched_rd_crit_access,
+				 EVENT_HPR_XACT_WHEN_CRITICAL),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_lpri_sched_rd_crit_access,
+				 EVENT_LPR_XACT_WHEN_CRITICAL),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_wr_trxn_crit_access,
+				 EVENT_WR_XACT_WHEN_CRITICAL),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_active_access, EVENT_OP_IS_ACTIVATE),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_rd_or_wr_access,
+				 EVENT_OP_IS_RD_OR_WR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_rd_active_access,
+				 EVENT_OP_IS_RD_ACTIVATE),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_read, EVENT_OP_IS_RD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_write, EVENT_OP_IS_WR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_mwr, EVENT_OP_IS_MWR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_precharge, EVENT_OP_IS_PRECHARGE),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_precharge_for_rdwr,
+				 EVENT_PRECHARGE_FOR_RDWR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_precharge_for_other,
+				 EVENT_PRECHARGE_FOR_OTHER),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_rdwr_transitions, EVENT_RDWR_TRANSITIONS),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_write_combine, EVENT_WRITE_COMBINE),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_war_hazard, EVENT_WAR_HAZARD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_raw_hazard, EVENT_RAW_HAZARD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_waw_hazard, EVENT_WAW_HAZARD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_enter_selfref, EVENT_OP_IS_ENTER_SELFREF),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_enter_powerdown,
+				 EVENT_OP_IS_ENTER_POWERDOWN),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cas_ws, EVENT_OP_IS_CAS_WS),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cas_ws_off, EVENT_OP_IS_CAS_WS_OFF),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_cas_wck_sus, EVENT_OP_IS_CAS_WCK_SUS),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_refresh, EVENT_OP_IS_REFRESH),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_crit_ref, EVENT_OP_IS_CRIT_REF),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_spec_ref, EVENT_OP_IS_SPEC_REF),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_load_mode, EVENT_OP_IS_LOAD_MODE),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_rfm, EVENT_OP_IS_RFM),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_enter_dsm, EVENT_OP_IS_ENTER_DSM),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_dfi_cycles, EVENT_DFI_CYCLES),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_win_limit_reached_rd,
+				 EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_RD),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_win_limit_reached_wr,
+				 EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_WR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_dqsosc_mpc, EVENT_CN20K_OP_IS_DQSOSC_MPC),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_dqsosc_mrr, EVENT_CN20K_OP_IS_DQSOSC_MRR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_tcr_mrr, EVENT_CN20K_OP_IS_TCR_MRR),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_zqstart, EVENT_CN20K_OP_IS_ZQSTART),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_zqlatch, EVENT_CN20K_OP_IS_ZQLATCH),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_read16, EVENT_PERF_OP_IS_RD16),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_read32, EVENT_PERF_OP_IS_RD32),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_write16, EVENT_PERF_OP_IS_WR16),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_write32, EVENT_PERF_OP_IS_WR32),
+	/* Free run event counters */
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_ddr_reads, EVENT_DDR_READS),
+	CN10K_DDR_PMU_EVENT_ATTR(ddr_ddr_writes, EVENT_DDR_WRITES),
+	NULL
+};
+
+static struct attribute_group cn20k_ddr_perf_events_attr_group = {
+	.name = "events",
+	.attrs = cn20k_ddr_perf_events_attrs,
+};
+
 static struct attribute_group odyssey_ddr_perf_events_attr_group = {
 	.name = "events",
 	.attrs = odyssey_ddr_perf_events_attrs,
@@ -393,6 +499,13 @@ static const struct attribute_group *odyssey_attr_groups[] = {
 	NULL
 };
 
+static const struct attribute_group *cn20k_attr_groups[] = {
+	&cn20k_ddr_perf_events_attr_group,
+	&cn10k_ddr_perf_format_attr_group,
+	&cn10k_ddr_perf_cpumask_attr_group,
+	NULL
+};
+
 /* Default poll timeout is 100 sec, which is very sufficient for
  * 48 bit counter incremented max at 5.6 GT/s, which may take many
  * hours to overflow.
@@ -412,7 +525,7 @@ static int ddr_perf_get_event_bitmap(int eventid, u64 *event_bitmap,
 
 	switch (eventid) {
 	case EVENT_DFI_PARITY_POISON ...EVENT_DFI_CMD_IS_RETRY:
-		if (!ddr_pmu->p_data->is_ody) {
+		if (!(ddr_pmu->p_data->silicon_flags & IS_ODY)) {
 			err = -EINVAL;
 			break;
 		}
@@ -524,9 +637,9 @@ static void cn10k_ddr_perf_counter_enable(struct cn10k_ddr_pmu *pmu,
 					  int counter, bool enable)
 {
 	const struct ddr_pmu_platform_data *p_data = pmu->p_data;
+	unsigned int silicon_flags = pmu->p_data->silicon_flags;
 	u64 ctrl_reg = pmu->p_data->cnt_op_mode_ctrl;
 	const struct ddr_pmu_ops *ops = pmu->ops;
-	bool is_ody = pmu->p_data->is_ody;
 	u32 reg;
 	u64 val;
 
@@ -546,7 +659,7 @@ static void cn10k_ddr_perf_counter_enable(struct cn10k_ddr_pmu *pmu,
 
 		writeq_relaxed(val, pmu->base + reg);
 
-		if (is_ody) {
+		if (silicon_flags & IS_ODY) {
 			if (enable) {
 				/*
 				 * Setup the PMU counter to work in
@@ -621,6 +734,7 @@ static int cn10k_ddr_perf_event_add(struct perf_event *event, int flags)
 {
 	struct cn10k_ddr_pmu *pmu = to_cn10k_ddr_pmu(event->pmu);
 	const struct ddr_pmu_platform_data *p_data = pmu->p_data;
+	unsigned int silicon_flags = pmu->p_data->silicon_flags;
 	const struct ddr_pmu_ops *ops = pmu->ops;
 	struct hw_perf_event *hwc = &event->hw;
 	u8 config = event->attr.config;
@@ -642,10 +756,17 @@ static int cn10k_ddr_perf_event_add(struct perf_event *event, int flags)
 	if (counter < DDRC_PERF_NUM_GEN_COUNTERS) {
 		/* Generic counters, configure event id */
 		reg_offset = DDRC_PERF_CFG(p_data->cfg_base, counter);
-		ret = ddr_perf_get_event_bitmap(config, &val, pmu);
-		if (ret)
-			return ret;
 
+		if (silicon_flags & IS_CN20K) {
+			val =  (1ULL << (config - 1));
+			if (config == EVENT_CN20K_OP_IS_ZQSTART ||
+			    config == EVENT_CN20K_OP_IS_ZQLATCH)
+				reg_offset = DDRC_PERF_CFG(p_data->cfg1_base, counter);
+		} else {
+			ret = ddr_perf_get_event_bitmap(config, &val, pmu);
+			if (ret)
+				return ret;
+		}
 		writeq_relaxed(val, pmu->base + reg_offset);
 	} else {
 		/* fixed event counter, clear counter value */
@@ -952,7 +1073,25 @@ static const struct ddr_pmu_platform_data cn10k_ddr_pmu_pdata = {
 	.cnt_freerun_clr = 0,
 	.cnt_value_wr_op = CN10K_DDRC_PERF_CNT_VALUE_WR_OP,
 	.cnt_value_rd_op = CN10K_DDRC_PERF_CNT_VALUE_RD_OP,
-	.is_cn10k = TRUE,
+	.silicon_flags = IS_CN10K,
+};
+
+static const struct ddr_pmu_platform_data cn20k_ddr_pmu_pdata = {
+	.counter_overflow_val = 0,
+	.counter_max_val = GENMASK_ULL(63, 0),
+	.cnt_base = ODY_DDRC_PERF_CNT_VALUE_BASE,
+	.cfg_base = CN20K_DDRC_PERF_CFG_BASE,
+	.cfg1_base = CN20K_DDRC_PERF_CFG1_BASE,
+	.cnt_op_mode_ctrl = CN20K_DDRC_PERF_CNT_OP_MODE_CTRL,
+	.cnt_start_op_ctrl = CN20K_DDRC_PERF_CNT_START_OP_CTRL,
+	.cnt_end_op_ctrl = CN20K_DDRC_PERF_CNT_END_OP_CTRL,
+	.cnt_end_status = CN20K_DDRC_PERF_CNT_END_STATUS,
+	.cnt_freerun_en = 0,
+	.cnt_freerun_ctrl = ODY_DDRC_PERF_CNT_FREERUN_CTRL,
+	.cnt_freerun_clr = ODY_DDRC_PERF_CNT_FREERUN_CLR,
+	.cnt_value_wr_op = ODY_DDRC_PERF_CNT_VALUE_WR_OP,
+	.cnt_value_rd_op = ODY_DDRC_PERF_CNT_VALUE_RD_OP,
+	.silicon_flags = IS_CN20K,
 };
 #endif
 
@@ -979,7 +1118,7 @@ static const struct ddr_pmu_platform_data odyssey_ddr_pmu_pdata = {
 	.cnt_freerun_clr = ODY_DDRC_PERF_CNT_FREERUN_CLR,
 	.cnt_value_wr_op = ODY_DDRC_PERF_CNT_VALUE_WR_OP,
 	.cnt_value_rd_op = ODY_DDRC_PERF_CNT_VALUE_RD_OP,
-	.is_ody = TRUE,
+	.silicon_flags = IS_ODY,
 };
 #endif
 
@@ -989,8 +1128,7 @@ static int cn10k_ddr_perf_probe(struct platform_device *pdev)
 	struct cn10k_ddr_pmu *ddr_pmu;
 	struct resource *res;
 	void __iomem *base;
-	bool is_cn10k;
-	bool is_ody;
+	unsigned int silicon_flags;
 	char *name;
 	int ret;
 
@@ -1014,10 +1152,9 @@ static int cn10k_ddr_perf_probe(struct platform_device *pdev)
 	ddr_pmu->base = base;
 
 	ddr_pmu->p_data = dev_data;
-	is_cn10k = ddr_pmu->p_data->is_cn10k;
-	is_ody = ddr_pmu->p_data->is_ody;
+	silicon_flags = ddr_pmu->p_data->silicon_flags;
 
-	if (is_cn10k) {
+	if (silicon_flags & IS_CN10K) {
 		ddr_pmu->ops = &ddr_pmu_ops;
 		/* Setup the PMU counter to work in manual mode */
 		writeq_relaxed(OP_MODE_CTRL_VAL_MANUAL, ddr_pmu->base +
@@ -1039,7 +1176,7 @@ static int cn10k_ddr_perf_probe(struct platform_device *pdev)
 		};
 	}
 
-	if (is_ody) {
+	if (silicon_flags & IS_ODY) {
 		ddr_pmu->ops = &ddr_pmu_ody_ops;
 
 		ddr_pmu->pmu = (struct pmu) {
@@ -1056,6 +1193,22 @@ static int cn10k_ddr_perf_probe(struct platform_device *pdev)
 		};
 	}
 
+	if (silicon_flags & IS_CN20K) {
+		ddr_pmu->ops = &ddr_pmu_ody_ops;
+
+		ddr_pmu->pmu = (struct pmu) {
+			.module       = THIS_MODULE,
+			.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
+			.task_ctx_nr = perf_invalid_context,
+			.attr_groups = cn20k_attr_groups,
+			.event_init  = cn10k_ddr_perf_event_init,
+			.add         = cn10k_ddr_perf_event_add,
+			.del         = cn10k_ddr_perf_event_del,
+			.start       = cn10k_ddr_perf_event_start,
+			.stop        = cn10k_ddr_perf_event_stop,
+			.read        = cn10k_ddr_perf_event_update,
+		};
+	}
 	/* Choose this cpu to collect perf data */
 	ddr_pmu->cpu = raw_smp_processor_id();
 
@@ -1098,6 +1251,7 @@ static void cn10k_ddr_perf_remove(struct platform_device *pdev)
 #ifdef CONFIG_OF
 static const struct of_device_id cn10k_ddr_pmu_of_match[] = {
 	{ .compatible = "marvell,cn10k-ddr-pmu", .data = &cn10k_ddr_pmu_pdata },
+	{ .compatible = "marvell,cn20k-ddr-pmu", .data = &cn20k_ddr_pmu_pdata },
 	{ },
 };
 MODULE_DEVICE_TABLE(of, cn10k_ddr_pmu_of_match);
@@ -1107,6 +1261,7 @@ MODULE_DEVICE_TABLE(of, cn10k_ddr_pmu_of_match);
 static const struct acpi_device_id cn10k_ddr_pmu_acpi_match[] = {
 	{"MRVL000A", (kernel_ulong_t)&cn10k_ddr_pmu_pdata },
 	{"MRVL000C", (kernel_ulong_t)&odyssey_ddr_pmu_pdata},
+	{"MRVL000B", (kernel_ulong_t)&cn20k_ddr_pmu_pdata},
 	{},
 };
 MODULE_DEVICE_TABLE(acpi, cn10k_ddr_pmu_acpi_match);
-- 
2.25.1
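
For readers following the CN20K path in cn10k_ddr_perf_event_add(), the event-to-register mapping
can be sketched in standalone C as below. This is an illustration under stated assumptions, not
driver code: cn20k_event_to_cfg() and struct cn20k_event_cfg are hypothetical names, the per-counter
offset computed by DDRC_PERF_CFG() is left out, and the range check on the event id is an addition
of this sketch that the patch itself does not perform. The constants are copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch. */
#define EVENT_CN20K_OP_IS_ZQLATCH	21
#define EVENT_CN20K_OP_IS_ZQSTART	22
#define CN20K_DDRC_PERF_CFG_BASE	0x20140
#define CN20K_DDRC_PERF_CFG1_BASE	0x20180

/* Hypothetical result type: which bit to write, and into which register bank. */
struct cn20k_event_cfg {
	uint64_t bitmap;
	uint64_t cfg_base;
};

/*
 * Map a 1-based event id to a one-hot bitmap and pick the config bank.
 * Returns 0 on success, -1 if the id cannot be expressed in 64 bits
 * (this guard is an assumption of the sketch, not part of the patch).
 */
int cn20k_event_to_cfg(unsigned int event_id, struct cn20k_event_cfg *out)
{
	if (event_id < 1 || event_id > 64)
		return -1;

	out->bitmap = 1ULL << (event_id - 1);

	/* The two ZQ events are programmed through the second bank. */
	if (event_id == EVENT_CN20K_OP_IS_ZQSTART ||
	    event_id == EVENT_CN20K_OP_IS_ZQLATCH)
		out->cfg_base = CN20K_DDRC_PERF_CFG1_BASE;
	else
		out->cfg_base = CN20K_DDRC_PERF_CFG_BASE;

	return 0;
}
```

For example, event id 22 (ZQSTART) maps to bit 21 in the bank at CN20K_DDRC_PERF_CFG1_BASE, while
event id 1 maps to bit 0 in the bank at CN20K_DDRC_PERF_CFG_BASE.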



^ permalink raw reply related

* [PATCH v3 0/2] perf: marvell: Add CN20K DDR PMU support
From: Geetha sowjanya @ 2026-04-01  8:16 UTC (permalink / raw)
  To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
  Cc: mark.rutland, will, krzk+dt

This series adds support for the Marvell CN20K DRAM Subsystem (DSS)
performance monitor in the existing marvell_cn10k_ddr_pmu driver, and
documents the device tree binding for the new compatible string.

The CN20K PMU provides eight programmable counters and two fixed
counters (DDR reads and writes).  Patch 1 adds the devicetree schema for
"marvell,cn20k-ddr-pmu".  Patch 2 wires OF and ACPI (MRVL000B) match
entries, adds CN20K register offsets and event maps, and refactors
platform data to use silicon variant flags.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>

Changes in v2:
- Fixed YAML syntax error triggered by a tab character in the examples
  section, which caused dt_binding_check to fail.

Changes in v1:
- Added a description field to the binding.
- Simplified the compatible property using 'const' instead of 'items/enum'.
- Updated the example node name to include a unit-address matching the reg base.

Geetha sowjanya (2):
  dt-bindings: perf: marvell: Document CN20K DDR PMU
  perf: marvell: Add CN20K DDR PMU support

 .../bindings/perf/marvell-cn20k-ddr.yaml      |  37 ++++
 drivers/perf/marvell_cn10k_ddr_pmu.c          | 186 ++++++++++++++++--
 2 files changed, 207 insertions(+), 16 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/perf/marvell-cn20k-ddr.yaml

-- 
2.25.1

^ permalink raw reply

* RE: [PATCH v3 3/3] iommu/arm-smmu-v3: Allow ATS to be always on
From: Tian, Kevin @ 2026-04-01  8:15 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nicolin Chen, will@kernel.org, robin.murphy@arm.com,
	bhelgaas@google.com, joro@8bytes.org, praan@google.com,
	baolu.lu@linux.intel.com, miko.lenczewski@arm.com,
	linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	Williams, Dan J, jonathan.cameron@huawei.com, Vikram Sethi,
	linux-cxl@vger.kernel.org
In-Reply-To: <20260331120816.GW310919@nvidia.com>

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, March 31, 2026 8:08 PM
> 
> On Tue, Mar 31, 2026 at 08:40:17AM +0000, Tian, Kevin wrote:
> > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > Sent: Saturday, March 7, 2026 7:41 AM
> > >
> > > +
> > > +	master->ats_always_on = true;
> > > +
> > > +	ret = arm_smmu_alloc_cd_tables(master);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +out_prepare:
> > > +	pci_prepare_ats(pdev, stu);
> > > +	return 0;
> >
> > Is there a problem with leaving ats_always_on set to true when
> > allocating the cd tables fails?
> 
> I would expect this error flow unwinds back up to failing device
> probe?
> 

Yeah, I thought the error was ignored.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>


^ permalink raw reply

* Re: [PATCH] arm64: dts: imx8x-colibri: Correct SODIMM PAD settings
From: Alexander Stein @ 2026-04-01  8:02 UTC (permalink / raw)
  To: Peng Fan (OSS), Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Frank Li, Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam,
	Philippe Schenker, Ernest Van Hoecke, linux-arm-kernel
  Cc: devicetree, imx, linux-arm-kernel, linux-kernel, Peng Fan,
	Daniel Baluta
In-Reply-To: <40dcbb9c-15ad-4765-9f7e-40a571f98fb5@oss.nxp.com>

Am Mittwoch, 1. April 2026, 09:26:03 CEST schrieb Daniel Baluta:
> On 4/1/26 09:40, Peng Fan (OSS) wrote:
> > From: Peng Fan <peng.fan@nxp.com>
> >
> > SION is BIT(30), not BIT(26). Correct it.
> >
> > Fixes: 7ece3cbc8b1ef ("arm64: dts: colibri-imx8x: Add atmel pinctrl groups")
> > Signed-off-by: Peng Fan <peng.fan@nxp.com>
> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com>
> 
> What is the general attitude towards using symbolic macros for pin config?
> Like here: https://www.spinics.net/lists/kernel/msg6072866.html
> 
> I think they are useful for avoiding this kind of bug.
> 
> If I get enough Acks, I can move forward and replace all magic numbers in the imx dts files.

Somehow I completely missed these defines :-/ That's a good improvement,
especially as the SION bit is "custom".
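
To illustrate why a symbolic name helps here, a minimal C sketch follows. The macro and function
names (PAD_BIT, PAD_SION, pad_config_with_sion, pad_config_has_sion) are hypothetical and are not
the defines from the linked posting; the only fact taken from this thread is that SION is bit 30,
not bit 26.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the bit position (30) comes from the thread. */
#define PAD_BIT(n)	(UINT32_C(1) << (n))
#define PAD_SION	PAD_BIT(30)

/* Compile-time check that the symbolic name expands to the intended value. */
static_assert(PAD_SION == UINT32_C(0x40000000), "SION must be bit 30");

/* Set SION on top of an existing pad configuration word. */
uint32_t pad_config_with_sion(uint32_t base_cfg)
{
	return base_cfg | PAD_SION;
}

/* Report whether a pad configuration word has SION set. */
bool pad_config_has_sion(uint32_t cfg)
{
	return (cfg & PAD_SION) != 0;
}
```

With a named constant, a value that sets bit 26 instead of bit 30 no longer looks plausible in
review, because the raw hexadecimal number never has to be written out by hand.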

Best regards,
Alexander
-- 
TQ-Systems GmbH | Mühlstraße 2, Gut Delling | 82229 Seefeld, Germany
Amtsgericht München, HRB 105018
Geschäftsführer: Detlef Schneider, Rüdiger Stahl, Stefan Schneider
http://www.tq-group.com/




^ permalink raw reply

* Re: [PATCH v2] iommu/rockchip: Drop global rk_ops in favor of per-device ops
From: Simon Xue @ 2026-04-01  7:59 UTC (permalink / raw)
  To: Shawn Lin
  Cc: iommu, linux-arm-kernel, linux-rockchip, linux-kernel,
	Joerg Roedel, Will Deacon, Robin Murphy, Heiko Stuebner
In-Reply-To: <a586dd60-5b9c-4044-3bb1-903df863b587@rock-chips.com>

Hi all,

A gentle ping on this patch.

On 2026/3/13 17:32, Shawn Lin wrote:
> On Tuesday, 2026/03/10 at 18:53, Simon Xue wrote:
>> The driver currently uses a global rk_ops pointer, forcing all IOMMU
>> instances to share the same operations. This restricts the driver from
>> supporting SoCs that might integrate different versions of IOMMU 
>> hardware.
>>
>> Since the IOMMU framework passes the master device information to
>> iommu_paging_domain_alloc(), the global variable is no longer needed.
>>
>> Fix this by moving rk_ops into struct rk_iommu and struct 
>> rk_iommu_domain.
>> Initialize it per-device during probe via of_device_get_match_data(),
>> and replace all global references with the instance-specific pointers.
>>
>
> Thanks for the patch, Simon. I've tested it on the RK3576 EVB1 with
> PCIe1 + IOMMU. NVMe works fine on it. I also verified the IOVAs
> allocated in the NVMe driver; they look correct given that I manually
> limited the memblock to under 2GB. For example:
>
> nvme 0001:21:00.0: cq_dma_addr: 0x00000000f7fc7000
>
> Tested-by: Shawn Lin <shawn.lin@rock-chips.com>
> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
>
>> Signed-off-by: Simon Xue <xxm@rock-chips.com>
>> ---
>> v2:
>>   - Remove the one-time-used 'ops' variable in rk_iommu_probe()
>>
>>   drivers/iommu/rockchip-iommu.c | 71 ++++++++++++++++------------------
>>   1 file changed, 33 insertions(+), 38 deletions(-)
>>
>> diff --git a/drivers/iommu/rockchip-iommu.c 
>> b/drivers/iommu/rockchip-iommu.c
>> index 0013cf196c57..4da80136933c 100644
>> --- a/drivers/iommu/rockchip-iommu.c
>> +++ b/drivers/iommu/rockchip-iommu.c
>> @@ -82,6 +82,14 @@
>>     */
>>   #define RK_IOMMU_PGSIZE_BITMAP 0x007ff000
>>   +struct rk_iommu_ops {
>> +    phys_addr_t (*pt_address)(u32 dte);
>> +    u32 (*mk_dtentries)(dma_addr_t pt_dma);
>> +    u32 (*mk_ptentries)(phys_addr_t page, int prot);
>> +    u64 dma_bit_mask;
>> +    gfp_t gfp_flags;
>> +};
>> +
>>   struct rk_iommu_domain {
>>       struct list_head iommus;
>>       u32 *dt; /* page directory table */
>> @@ -89,6 +97,7 @@ struct rk_iommu_domain {
>>       spinlock_t iommus_lock; /* lock for iommus list */
>>       spinlock_t dt_lock; /* lock for modifying page directory table */
>>       struct device *dma_dev;
>> +    const struct rk_iommu_ops *rk_ops;
>>         struct iommu_domain domain;
>>   };
>> @@ -98,14 +107,6 @@ static const char * const rk_iommu_clocks[] = {
>>       "aclk", "iface",
>>   };
>>   -struct rk_iommu_ops {
>> -    phys_addr_t (*pt_address)(u32 dte);
>> -    u32 (*mk_dtentries)(dma_addr_t pt_dma);
>> -    u32 (*mk_ptentries)(phys_addr_t page, int prot);
>> -    u64 dma_bit_mask;
>> -    gfp_t gfp_flags;
>> -};
>> -
>>   struct rk_iommu {
>>       struct device *dev;
>>       void __iomem **bases;
>> @@ -117,6 +118,7 @@ struct rk_iommu {
>>       struct iommu_device iommu;
>>       struct list_head node; /* entry in rk_iommu_domain.iommus */
>>       struct iommu_domain *domain; /* domain to which iommu is 
>> attached */
>> +    const struct rk_iommu_ops *rk_ops;
>>   };
>>     struct rk_iommudata {
>> @@ -124,7 +126,6 @@ struct rk_iommudata {
>>       struct rk_iommu *iommu;
>>   };
>>   -static const struct rk_iommu_ops *rk_ops;
>>   static struct iommu_domain rk_identity_domain;
>>     static inline void rk_table_flush(struct rk_iommu_domain *dom, 
>> dma_addr_t dma,
>> @@ -510,7 +511,7 @@ static int rk_iommu_force_reset(struct rk_iommu 
>> *iommu)
>>        * and verifying that upper 5 (v1) or 7 (v2) nybbles are read 
>> back.
>>        */
>>       for (i = 0; i < iommu->num_mmu; i++) {
>> -        dte_addr = rk_ops->pt_address(DTE_ADDR_DUMMY);
>> +        dte_addr = iommu->rk_ops->pt_address(DTE_ADDR_DUMMY);
>>           rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, dte_addr);
>>             if (dte_addr != rk_iommu_read(iommu->bases[i], 
>> RK_MMU_DTE_ADDR)) {
>> @@ -551,7 +552,7 @@ static void log_iova(struct rk_iommu *iommu, int 
>> index, dma_addr_t iova)
>>       page_offset = rk_iova_page_offset(iova);
>>         mmu_dte_addr = rk_iommu_read(base, RK_MMU_DTE_ADDR);
>> -    mmu_dte_addr_phys = rk_ops->pt_address(mmu_dte_addr);
>> +    mmu_dte_addr_phys = iommu->rk_ops->pt_address(mmu_dte_addr);
>>         dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index);
>>       dte_addr = phys_to_virt(dte_addr_phys);
>> @@ -560,14 +561,14 @@ static void log_iova(struct rk_iommu *iommu, 
>> int index, dma_addr_t iova)
>>       if (!rk_dte_is_pt_valid(dte))
>>           goto print_it;
>>   -    pte_addr_phys = rk_ops->pt_address(dte) + (pte_index * 4);
>> +    pte_addr_phys = iommu->rk_ops->pt_address(dte) + (pte_index * 4);
>>       pte_addr = phys_to_virt(pte_addr_phys);
>>       pte = *pte_addr;
>>         if (!rk_pte_is_page_valid(pte))
>>           goto print_it;
>>   -    page_addr_phys = rk_ops->pt_address(pte) + page_offset;
>> +    page_addr_phys = iommu->rk_ops->pt_address(pte) + page_offset;
>>       page_flags = pte & RK_PTE_PAGE_FLAGS_MASK;
>>     print_it:
>> @@ -663,13 +664,13 @@ static phys_addr_t rk_iommu_iova_to_phys(struct 
>> iommu_domain *domain,
>>       if (!rk_dte_is_pt_valid(dte))
>>           goto out;
>>   -    pt_phys = rk_ops->pt_address(dte);
>> +    pt_phys = rk_domain->rk_ops->pt_address(dte);
>>       page_table = (u32 *)phys_to_virt(pt_phys);
>>       pte = page_table[rk_iova_pte_index(iova)];
>>       if (!rk_pte_is_page_valid(pte))
>>           goto out;
>>   -    phys = rk_ops->pt_address(pte) + rk_iova_page_offset(iova);
>> +    phys = rk_domain->rk_ops->pt_address(pte) + 
>> rk_iova_page_offset(iova);
>>   out:
>>       spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
>>   @@ -730,7 +731,7 @@ static u32 *rk_dte_get_page_table(struct 
>> rk_iommu_domain *rk_domain,
>>       if (rk_dte_is_pt_valid(dte))
>>           goto done;
>>   -    page_table = iommu_alloc_pages_sz(GFP_ATOMIC | rk_ops->gfp_flags,
>> +    page_table = iommu_alloc_pages_sz(GFP_ATOMIC | 
>> rk_domain->rk_ops->gfp_flags,
>>                         SPAGE_SIZE);
>>       if (!page_table)
>>           return ERR_PTR(-ENOMEM);
>> @@ -742,13 +743,13 @@ static u32 *rk_dte_get_page_table(struct 
>> rk_iommu_domain *rk_domain,
>>           return ERR_PTR(-ENOMEM);
>>       }
>>   -    dte = rk_ops->mk_dtentries(pt_dma);
>> +    dte = rk_domain->rk_ops->mk_dtentries(pt_dma);
>>       *dte_addr = dte;
>>         rk_table_flush(rk_domain,
>>                  rk_domain->dt_dma + dte_index * sizeof(u32), 1);
>>   done:
>> -    pt_phys = rk_ops->pt_address(dte);
>> +    pt_phys = rk_domain->rk_ops->pt_address(dte);
>>       return (u32 *)phys_to_virt(pt_phys);
>>   }
>>   @@ -790,7 +791,7 @@ static int rk_iommu_map_iova(struct 
>> rk_iommu_domain *rk_domain, u32 *pte_addr,
>>           if (rk_pte_is_page_valid(pte))
>>               goto unwind;
>>   -        pte_addr[pte_count] = rk_ops->mk_ptentries(paddr, prot);
>> +        pte_addr[pte_count] = rk_domain->rk_ops->mk_ptentries(paddr, 
>> prot);
>>             paddr += SPAGE_SIZE;
>>       }
>> @@ -812,7 +813,7 @@ static int rk_iommu_map_iova(struct 
>> rk_iommu_domain *rk_domain, u32 *pte_addr,
>>                   pte_count * SPAGE_SIZE);
>>         iova += pte_count * SPAGE_SIZE;
>> -    page_phys = rk_ops->pt_address(pte_addr[pte_count]);
>> +    page_phys = rk_domain->rk_ops->pt_address(pte_addr[pte_count]);
>>       pr_err("iova: %pad already mapped to %pa cannot remap to phys: 
>> %pa prot: %#x\n",
>>              &iova, &page_phys, &paddr, prot);
>>   @@ -849,7 +850,7 @@ static int rk_iommu_map(struct iommu_domain 
>> *domain, unsigned long _iova,
>>       pte_index = rk_iova_pte_index(iova);
>>       pte_addr = &page_table[pte_index];
>>   -    pte_dma = rk_ops->pt_address(dte_index) + pte_index * 
>> sizeof(u32);
>> +    pte_dma = rk_domain->rk_ops->pt_address(dte_index) + pte_index * 
>> sizeof(u32);
>>       ret = rk_iommu_map_iova(rk_domain, pte_addr, pte_dma, iova,
>>                   paddr, size, prot);
>>   @@ -887,7 +888,7 @@ static size_t rk_iommu_unmap(struct 
>> iommu_domain *domain, unsigned long _iova,
>>           return 0;
>>       }
>>   -    pt_phys = rk_ops->pt_address(dte);
>> +    pt_phys = rk_domain->rk_ops->pt_address(dte);
>>       pte_addr = (u32 *)phys_to_virt(pt_phys) + rk_iova_pte_index(iova);
>>       pte_dma = pt_phys + rk_iova_pte_index(iova) * sizeof(u32);
>>       unmap_size = rk_iommu_unmap_iova(rk_domain, pte_addr, pte_dma, 
>> size);
>> @@ -945,7 +946,7 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
>>         for (i = 0; i < iommu->num_mmu; i++) {
>>           rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR,
>> - rk_ops->mk_dtentries(rk_domain->dt_dma));
>> + iommu->rk_ops->mk_dtentries(rk_domain->dt_dma));
>>           rk_iommu_base_command(iommu->bases[i], RK_MMU_CMD_ZAP_CACHE);
>>           rk_iommu_write(iommu->bases[i], RK_MMU_INT_MASK, 
>> RK_MMU_IRQ_MASK);
>>       }
>> @@ -1068,17 +1069,19 @@ static struct iommu_domain 
>> *rk_iommu_domain_alloc_paging(struct device *dev)
>>       if (!rk_domain)
>>           return NULL;
>>   +    iommu = rk_iommu_from_dev(dev);
>> +    rk_domain->rk_ops = iommu->rk_ops;
>> +
>>       /*
>>        * rk32xx iommus use a 2 level pagetable.
>>        * Each level1 (dt) and level2 (pt) table has 1024 4-byte entries.
>>        * Allocate one 4 KiB page for each table.
>>        */
>> -    rk_domain->dt = iommu_alloc_pages_sz(GFP_KERNEL | 
>> rk_ops->gfp_flags,
>> +    rk_domain->dt = iommu_alloc_pages_sz(GFP_KERNEL | 
>> rk_domain->rk_ops->gfp_flags,
>>                            SPAGE_SIZE);
>>       if (!rk_domain->dt)
>>           goto err_free_domain;
>>   -    iommu = rk_iommu_from_dev(dev);
>>       rk_domain->dma_dev = iommu->dev;
>>       rk_domain->dt_dma = dma_map_single(rk_domain->dma_dev, 
>> rk_domain->dt,
>>                          SPAGE_SIZE, DMA_TO_DEVICE);
>> @@ -1117,7 +1120,7 @@ static void rk_iommu_domain_free(struct 
>> iommu_domain *domain)
>>       for (i = 0; i < NUM_DT_ENTRIES; i++) {
>>           u32 dte = rk_domain->dt[i];
>>           if (rk_dte_is_pt_valid(dte)) {
>> -            phys_addr_t pt_phys = rk_ops->pt_address(dte);
>> +            phys_addr_t pt_phys = rk_domain->rk_ops->pt_address(dte);
>>               u32 *page_table = phys_to_virt(pt_phys);
>>               dma_unmap_single(rk_domain->dma_dev, pt_phys,
>>                        SPAGE_SIZE, DMA_TO_DEVICE);
>> @@ -1197,7 +1200,6 @@ static int rk_iommu_probe(struct 
>> platform_device *pdev)
>>       struct device *dev = &pdev->dev;
>>       struct rk_iommu *iommu;
>>       struct resource *res;
>> -    const struct rk_iommu_ops *ops;
>>       int num_res = pdev->num_resources;
>>       int err, i;
>>   @@ -1211,16 +1213,9 @@ static int rk_iommu_probe(struct 
>> platform_device *pdev)
>>       iommu->dev = dev;
>>       iommu->num_mmu = 0;
>>   -    ops = of_device_get_match_data(dev);
>> -    if (!rk_ops)
>> -        rk_ops = ops;
>> -
>> -    /*
>> -     * That should not happen unless different versions of the
>> -     * hardware block are embedded the same SoC
>> -     */
>> -    if (WARN_ON(rk_ops != ops))
>> -        return -EINVAL;
>> +    iommu->rk_ops = of_device_get_match_data(dev);
>> +    if (!iommu->rk_ops)
>> +        return -ENOENT;
>>         iommu->bases = devm_kcalloc(dev, num_res, sizeof(*iommu->bases),
>>                       GFP_KERNEL);
>> @@ -1286,7 +1281,7 @@ static int rk_iommu_probe(struct 
>> platform_device *pdev)
>>               goto err_pm_disable;
>>       }
>>   -    dma_set_mask_and_coherent(dev, rk_ops->dma_bit_mask);
>> +    dma_set_mask_and_coherent(dev, iommu->rk_ops->dma_bit_mask);
>>         err = iommu_device_sysfs_add(&iommu->iommu, dev, NULL, 
>> dev_name(dev));
>>       if (err)
>>
>
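
For readers skimming the diff, the shape of the change can be reduced to a
small userspace sketch. Everything in it (demo_ops, demo_iommu, demo_domain,
demo_probe() and the two address decoders) is hypothetical and only mirrors
the pattern of per-instance ops pointers; it is not the driver code, and the
v2 decoder does not claim to match the real entry encoding.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rk_iommu_ops: one table per HW version. */
struct demo_ops {
	uint64_t (*pt_address)(uint32_t dte);
};

static uint64_t pt_address_v1(uint32_t dte)
{
	return dte & 0xfffff000u;	/* plain 32-bit table address */
}

static uint64_t pt_address_v2(uint32_t dte)
{
	/* Illustrative only: low entry bits carry address bits above 32. */
	return ((uint64_t)(dte & 0xff0u) << 28) | (dte & 0xfffff000u);
}

static const struct demo_ops ops_v1 = { .pt_address = pt_address_v1 };
static const struct demo_ops ops_v2 = { .pt_address = pt_address_v2 };

/* Each instance carries its own ops pointer instead of sharing a global. */
struct demo_iommu { const struct demo_ops *ops; };
struct demo_domain { const struct demo_ops *ops; };

/* Mirrors the probe path: the match data becomes per-device state. */
static int demo_probe(struct demo_iommu *iommu, const struct demo_ops *match)
{
	if (!match)
		return -ENOENT;
	iommu->ops = match;
	return 0;
}

/* Mirrors domain allocation: the domain inherits the ops of its IOMMU. */
static void demo_domain_alloc(struct demo_domain *dom,
			      const struct demo_iommu *iommu)
{
	assert(iommu->ops != NULL);
	dom->ops = iommu->ops;
}

/* Returns 1 if two instances with different ops decode independently. */
static int demo_two_versions_coexist(void)
{
	struct demo_iommu a, b;
	struct demo_domain da, db;
	const uint32_t dte = 0x00001230u;

	if (demo_probe(&a, &ops_v1) || demo_probe(&b, &ops_v2))
		return 0;
	demo_domain_alloc(&da, &a);
	demo_domain_alloc(&db, &b);

	return da.ops->pt_address(dte) == 0x1000u &&
	       db.ops->pt_address(dte) == 0x2300001000ULL;
}
```

With a single global pointer, the second demo_probe() call would have had to
either overwrite the first ops table or be rejected, which is the situation
the removed WARN_ON() guarded against.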


^ permalink raw reply

* Re: [PATCH 2/3] arm64: dts: realtek: Add GPIO support for RTD1625
From: Krzysztof Kozlowski @ 2026-04-01  7:49 UTC (permalink / raw)
  To: Yu-Chun Lin
  Cc: linusw, brgl, robh, krzk+dt, conor+dt, afaerber, tychang,
	linux-gpio, devicetree, linux-kernel, linux-arm-kernel,
	linux-realtek-soc, cy.huang, stanley_chang, james.tai
In-Reply-To: <20260331113835.3510341-3-eleanor.lin@realtek.com>

On Tue, Mar 31, 2026 at 07:38:34PM +0800, Yu-Chun Lin wrote:
> Add the GPIO node for the Realtek RTD1625 SoC.
> 
> Signed-off-by: Yu-Chun Lin <eleanor.lin@realtek.com>
> ---
>  arch/arm64/boot/dts/realtek/kent.dtsi    | 43 ++++++++++++++++++++++++
>  arch/arm64/boot/dts/realtek/rtd1501.dtsi |  8 +++++
>  arch/arm64/boot/dts/realtek/rtd1861.dtsi |  8 +++++
>  arch/arm64/boot/dts/realtek/rtd1920.dtsi |  8 +++++
>  4 files changed, 67 insertions(+)
> 

Why is the DTS in the middle? Drivers cannot depend on it. Please read
submitting patches (both documents).

> diff --git a/arch/arm64/boot/dts/realtek/kent.dtsi b/arch/arm64/boot/dts/realtek/kent.dtsi
> index 8d4293cd4c03..746932c26724 100644
> --- a/arch/arm64/boot/dts/realtek/kent.dtsi
> +++ b/arch/arm64/boot/dts/realtek/kent.dtsi
> @@ -151,6 +151,39 @@ uart0: serial@7800 {
>  				status = "disabled";
>  			};
>  
> +			gpio: gpio@31100 {
> +				compatible = "realtek,rtd1625-iso-gpio";
> +				reg = <0x31100 0x398>,
> +				      <0x31000 0x100>;
> +				gpio-controller;
> +				gpio-ranges = <&isom_pinctrl 0 0 2>,
> +					      <&ve4_pinctrl 2 0 6>,
> +					      <&iso_pinctrl 8 0 4>,
> +					      <&ve4_pinctrl 12 6 2>,
> +					      <&main2_pinctrl 14 0 2>,
> +					      <&ve4_pinctrl 16 8 4>,
> +					      <&main2_pinctrl 20 2 3>,
> +					      <&ve4_pinctrl 23 12 3>,
> +					      <&iso_pinctrl 26 4 2>,
> +					      <&isom_pinctrl 28 2 2>,
> +					      <&ve4_pinctrl 30 15 6>,
> +					      <&main2_pinctrl 36 5 6>,
> +					      <&ve4_pinctrl 42 21 3>,
> +					      <&iso_pinctrl 45 6 6>,
> +					      <&ve4_pinctrl 51 24 1>,
> +					      <&iso_pinctrl 52 12 1>,
> +					      <&ve4_pinctrl 53 25 11>,
> +					      <&main2_pinctrl 64 11 28>,
> +					      <&ve4_pinctrl 92 36 2>,
> +					      <&iso_pinctrl 94 13 19>,
> +					      <&iso_pinctrl 128 32 4>,
> +					      <&ve4_pinctrl 132 38 13>,
> +					      <&iso_pinctrl 145 36 19>,
> +					      <&ve4_pinctrl 164 51 2>;
> +				#gpio-cells = <2>;
> +				status = "disabled";

Why is it disabled? What is missing in the SoC? Which resources are
missing?

> +			};
> +
>  			iso_pinctrl: pinctrl@4e000 {
>  				compatible = "realtek,rtd1625-iso-pinctrl";
>  				reg = <0x4e000 0x1a4>;
> @@ -161,6 +194,16 @@ main2_pinctrl: pinctrl@4f200 {
>  				reg = <0x4f200 0x50>;
>  			};
>  
> +			iso_m_gpio: gpio@89120 {
> +				compatible = "realtek,rtd1625-isom-gpio";
> +				reg = <0x89120 0x10>,
> +				      <0x89100 0x20>;
> +				gpio-controller;
> +				gpio-ranges = <&isom_pinctrl 0 0 4>;
> +				#gpio-cells = <2>;
> +				status = "disabled";
> +			};
> +
>  			isom_pinctrl: pinctrl@146200 {
>  				compatible = "realtek,rtd1625-isom-pinctrl";
>  				reg = <0x146200 0x34>;
> diff --git a/arch/arm64/boot/dts/realtek/rtd1501.dtsi b/arch/arm64/boot/dts/realtek/rtd1501.dtsi
> index 65f7ede3df73..ae246a01f126 100644
> --- a/arch/arm64/boot/dts/realtek/rtd1501.dtsi
> +++ b/arch/arm64/boot/dts/realtek/rtd1501.dtsi
> @@ -10,3 +10,11 @@
>  &uart0 {
>  	status = "okay";
>  };
> +
> +&gpio {

Why aren't you following DTS coding style? What style is applicable for
Realtek?

Best regards,
Krzysztof



^ permalink raw reply

* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
From: Simon @ 2026-04-01  7:48 UTC (permalink / raw)
  To: Midgy BALON, iommu
  Cc: joro, will, robin.murphy, heiko, jonas, linux-arm-kernel,
	linux-rockchip, linux-kernel, stable
In-Reply-To: <20260331075010.1463-1-midgy971@gmail.com>

Hi Midgy,

On 2026/3/31 15:50, Midgy BALON wrote:
> commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
> available memory for IOMMU v2") removed GFP_DMA32 from
> iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
> supports up to 40-bit physical addresses for page tables.  However, the
> RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
> physical addresses above 4 GB regardless of the address encoding range.
>
> On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
> GFP_DMA32 causes two distinct failure modes:
>
> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
>     memory above 0x100000000.  The hardware page-table walker issues a
>     bus error trying to dereference those addresses, causing an IOMMU
>     fault on the first DMA transaction.
Which IP block is hitting this? We'd like to take a look on our end.
> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
>     above the SWIOTLB window.  dma_map_single() with DMA_BIT_MASK(32)
>     then bounces them into a buffer below 4 GB.  rk_dte_get_page_table()
>     returns phys_to_virt() of the bounce buffer address; PTEs are written
>     there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
>     original (zero) data back over the bounce buffer, silently erasing the
>     freshly written PTEs.  The IOMMU faults because every PTE reads as zero.
This probably needs a separate patch. One way to fix it would be to track the
original L2 page table base addresses in struct rk_iommu_domain,
then have rk_dte_get_page_table() return the tracked address instead of
deriving it from the DTE.
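A rough userspace sketch of that tracking idea follows. It is not driver
code: demo_domain, pt_virt and demo_get_page_table() are hypothetical names,
and a bounce is modelled simply by letting the address stored in the DTE
differ from the address of the table the CPU actually writes to.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_NUM_DT_ENTRIES 1024
#define DEMO_NUM_PT_ENTRIES 1024

struct demo_domain {
	uint32_t dt[DEMO_NUM_DT_ENTRIES];	/* entries as seen by hardware */
	uint32_t *pt_virt[DEMO_NUM_DT_ENTRIES];	/* original L2 tables (CPU view) */
};

/*
 * Return the L2 table for dte_index, allocating it on first use.
 * dma_addr is whatever the mapping layer handed back; it may point at a
 * bounce buffer, so it is only stored in the DTE and never dereferenced.
 */
static uint32_t *demo_get_page_table(struct demo_domain *dom,
				     unsigned int dte_index, uint32_t dma_addr)
{
	assert(dom != NULL);
	if (dte_index >= DEMO_NUM_DT_ENTRIES)
		return NULL;

	if (!dom->pt_virt[dte_index]) {
		uint32_t *pt = calloc(DEMO_NUM_PT_ENTRIES, sizeof(*pt));

		if (!pt)
			return NULL;
		dom->pt_virt[dte_index] = pt;
		dom->dt[dte_index] = (dma_addr & 0xfffff000u) | 1u; /* valid */
	}
	/* Tracked pointer, not an address decoded from the DTE. */
	return dom->pt_virt[dte_index];
}

static void demo_domain_free(struct demo_domain *dom)
{
	for (unsigned int i = 0; i < DEMO_NUM_DT_ENTRIES; i++) {
		free(dom->pt_virt[i]);
		dom->pt_virt[i] = NULL;
		dom->dt[i] = 0;
	}
}

/* Returns 1 if a PTE written through the tracked pointer is seen again. */
static int demo_selfcheck(void)
{
	static struct demo_domain dom;
	uint32_t *first, *again;
	int ok;

	first = demo_get_page_table(&dom, 5, 0x7f000000u);
	if (!first)
		return 0;
	first[3] = 0xabcd0001u;

	/* Second lookup: the table already exists, dma_addr is ignored. */
	again = demo_get_page_table(&dom, 5, 0x12345000u);
	ok = again == first && again[3] == 0xabcd0001u &&
	     dom.dt[5] == 0x7f000001u;

	demo_domain_free(&dom);
	return ok;
}
```

The cost is one extra pointer per directory entry, in exchange for never
having to convert a device-visible address back into a CPU pointer.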
> Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
> currently only serves "rockchip,rk3568-iommu" in mainline.
>
> Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
>    - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
>    - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
>    - No IOMMU faults, correct inference results
>
> Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
> Cc: stable@vger.kernel.org
> Cc: Jonas Karlman <jonas@kwiboo.se>
> Signed-off-by: Midgy BALON <midgy971@gmail.com>
> ---
>   drivers/iommu/rockchip-iommu.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index 85f3667e797..8b45db29471 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
>   	.pt_address = &rk_dte_pt_address_v2,
>   	.mk_dtentries = &rk_mk_dte_v2,
>   	.mk_ptentries = &rk_mk_pte_v2,
> -	.dma_bit_mask = DMA_BIT_MASK(40),
> -	.gfp_flags = 0,
> +	.dma_bit_mask = DMA_BIT_MASK(32),
> +	.gfp_flags = GFP_DMA32,
>   };
>   
>   static const struct of_device_id rk_iommu_dt_ids[] = {


^ permalink raw reply

* Re: [PATCH 1/3] dt-bindings: gpio: realtek: Add realtek,rtd1625-gpio
From: Krzysztof Kozlowski @ 2026-04-01  7:47 UTC (permalink / raw)
  To: Yu-Chun Lin
  Cc: linusw, brgl, robh, krzk+dt, conor+dt, afaerber, tychang,
	linux-gpio, devicetree, linux-kernel, linux-arm-kernel,
	linux-realtek-soc, cy.huang, stanley_chang, james.tai
In-Reply-To: <20260331113835.3510341-2-eleanor.lin@realtek.com>

On Tue, Mar 31, 2026 at 07:38:33PM +0800, Yu-Chun Lin wrote:
> +  reg:
> +    items:
> +      - description: GPIO controller registers
> +      - description: GPIO interrupt registers
> +
> +  interrupts:
> +    items:
> +      - description: Interrupt number of the assert GPIO interrupt, which is
> +                     triggered when there is a rising edge.
> +      - description: Interrupt number of the deassert GPIO interrupt, which is
> +                     triggered when there is a falling edge.
> +      - description: Interrupt number of the level-sensitive GPIO interrupt,
> +                     triggered by a configured logic level.
> +
> +  interrupt-controller: true
> +
> +  "#interrupt-cells":
> +    const: 2
> +
> +  gpio-ranges: true
> +
> +  gpio-controller: true
> +
> +  "#gpio-cells":
> +    const: 2
> +
> +required:
> +  - compatible
> +  - reg
> +  - gpio-ranges
> +  - gpio-controller
> +  - "#gpio-cells"
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +    gpio@89120 {
> +      compatible = "realtek,rtd1625-isom-gpio";
> +      reg = <0x89120 0x10>,

0x10 feels very short range.

> +            <0x89100 0x20>;

And this means it's continuous. Are you sure these are two separate
address spaces?
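
The arithmetic behind that observation, as a tiny C sketch (the struct and
helpers are made up for illustration; the two ranges are the ones from the
quoted example): the second region ends exactly where the first one begins.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical representation of one <base size> pair from a reg property. */
struct demo_reg_range {
	uint32_t base;
	uint32_t size;
};

/* Returns 1 if range b starts exactly where range a ends. */
static int demo_ranges_adjacent(struct demo_reg_range a,
				struct demo_reg_range b)
{
	assert(a.size <= UINT32_MAX - a.base);	/* no wrap-around */
	return a.base + a.size == b.base;
}

/* The two entries from the binding example, in ascending address order. */
static int demo_example_is_contiguous(void)
{
	const struct demo_reg_range second = { 0x89100, 0x20 };
	const struct demo_reg_range first = { 0x89120, 0x10 };

	/* 0x89100 + 0x20 == 0x89120, so both form one 0x30-byte block. */
	return demo_ranges_adjacent(second, first);
}
```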

Best regards,
Krzysztof



^ permalink raw reply

* Re: [PATCH 1/2] dt-bindings: reset: imx8mq: Add _N suffix to IMX8MQ_RESET_MIPI_CSI*_RESET
From: Krzysztof Kozlowski @ 2026-04-01  7:41 UTC (permalink / raw)
  To: Robby Cai
  Cc: p.zabel, robh, krzk+dt, conor+dt, Frank.Li, s.hauer, festevam,
	devicetree, kernel, imx, linux-arm-kernel, linux-kernel,
	aisheng.dong
In-Reply-To: <20260331101331.1405588-2-robby.cai@nxp.com>

On Tue, Mar 31, 2026 at 06:13:30PM +0800, Robby Cai wrote:
> The assert logic of the MIPI CSI reset signals is active-low on i.MX8MQ,
> but the existing names do not indicate this explicitly. To improve
> consistency and clarity, append the _N suffix to all
> IMX8MQ_RESET_MIPI_CSI*_RESET definitions. The deprecated
> IMX8MQ_RESET_MIPI_CSI*_RESET versions remain temporarily for DT ABI
> compatibility and will be removed at an appropriate time in the future.
> 
> Signed-off-by: Robby Cai <robby.cai@nxp.com>
> ---
>  include/dt-bindings/reset/imx8mq-reset.h | 18 ++++++++++++------
>  1 file changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/include/dt-bindings/reset/imx8mq-reset.h b/include/dt-bindings/reset/imx8mq-reset.h
> index 705870693ec2..83a155dbbd4a 100644
> --- a/include/dt-bindings/reset/imx8mq-reset.h
> +++ b/include/dt-bindings/reset/imx8mq-reset.h
> @@ -46,12 +46,18 @@
>  #define IMX8MQ_RESET_PCIEPHY2_PERST		35	/* i.MX8MM/i.MX8MN does NOT support */
>  #define IMX8MQ_RESET_PCIE2_CTRL_APPS_EN		36	/* i.MX8MM/i.MX8MN does NOT support */
>  #define IMX8MQ_RESET_PCIE2_CTRL_APPS_TURNOFF	37	/* i.MX8MM/i.MX8MN does NOT support */
> -#define IMX8MQ_RESET_MIPI_CSI1_CORE_RESET	38	/* i.MX8MM/i.MX8MN does NOT support */
> -#define IMX8MQ_RESET_MIPI_CSI1_PHY_REF_RESET	39	/* i.MX8MM/i.MX8MN does NOT support */
> -#define IMX8MQ_RESET_MIPI_CSI1_ESC_RESET	40	/* i.MX8MM/i.MX8MN does NOT support */
> -#define IMX8MQ_RESET_MIPI_CSI2_CORE_RESET	41	/* i.MX8MM/i.MX8MN does NOT support */
> -#define IMX8MQ_RESET_MIPI_CSI2_PHY_REF_RESET	42	/* i.MX8MM/i.MX8MN does NOT support */
> -#define IMX8MQ_RESET_MIPI_CSI2_ESC_RESET	43	/* i.MX8MM/i.MX8MN does NOT support */
> +#define IMX8MQ_RESET_MIPI_CSI1_CORE_RESET	38	/* Deprecated. Use *_RESET_N instead */
> +#define IMX8MQ_RESET_MIPI_CSI1_CORE_RESET_N	38	/* i.MX8MM/i.MX8MN does NOT support */

That's quite a lot of churn for no need. The entire point of these values
being in the binding is that they describe the ABI for SW and DTS, not your
hardware registers.

Whether the signal is active low or high is kind of irrelevant. Linux uses
it exactly the same way.

Best regards,
Krzysztof



^ permalink raw reply

* Re: [PATCH 1/2] dt-bindings: gpu: mali-valhall-csf: Document i.MX952 support
From: Krzysztof Kozlowski @ 2026-04-01  7:40 UTC (permalink / raw)
  To: Guangliu Ding
  Cc: Daniel Almeida, Alice Ryhl, Boris Brezillon, Steven Price,
	Liviu Dudau, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, dri-devel, devicetree,
	linux-kernel, imx, linux-arm-kernel, Jiyu Yang
In-Reply-To: <20260331-master-v1-1-65c8e318d462@nxp.com>

On Tue, Mar 31, 2026 at 06:12:38PM +0800, Guangliu Ding wrote:
> Add compatible string of Mali G310 GPU on i.MX952 board.

We see this from the diff. Say something useful.

> 
> Signed-off-by: Guangliu Ding <guangliu.ding@nxp.com>
> Reviewed-by: Jiyu Yang <jiyu.yang@nxp.com>

And the review should tell you that. Did that review even happen? That's
a v1 and a one-line patch, so how could the basics be missed?

Best regards,
Krzysztof



^ permalink raw reply

* Re: [PATCH] lib/crypto: arm64: Assume a little-endian kernel
From: Ard Biesheuvel @ 2026-04-01  7:31 UTC (permalink / raw)
  To: Eric Biggers, linux-crypto
  Cc: linux-kernel, Jason A . Donenfeld, Herbert Xu, linux-arm-kernel
In-Reply-To: <20260401003331.144065-1-ebiggers@kernel.org>



On Wed, 1 Apr 2026, at 02:33, Eric Biggers wrote:
> Since support for big-endian arm64 kernels was removed, the CPU_LE()
> macro now unconditionally emits the code it is passed, and the CPU_BE()
> macro now unconditionally discards the code it is passed.
>
> Simplify the assembly code in lib/crypto/arm64/ accordingly.
>
> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
> ---
>
> This patch is targeting libcrypto-next
>
>  lib/crypto/arm64/aes-cipher-core.S  | 10 -------
>  lib/crypto/arm64/chacha-neon-core.S | 16 -----------
>  lib/crypto/arm64/ghash-neon-core.S  |  2 +-
>  lib/crypto/arm64/sha1-ce-core.S     |  8 +++---
>  lib/crypto/arm64/sha256-ce.S        | 41 +++++++++++++----------------
>  lib/crypto/arm64/sha512-ce-core.S   | 16 +++++------
>  lib/crypto/arm64/sm3-ce-core.S      |  8 +++---
>  7 files changed, 36 insertions(+), 65 deletions(-)
>

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>


^ permalink raw reply

* Re: [PATCH] lib/crc: arm64: Assume a little-endian kernel
From: Ard Biesheuvel @ 2026-04-01  7:31 UTC (permalink / raw)
  To: Eric Biggers, linux-kernel; +Cc: linux-crypto, linux-arm-kernel
In-Reply-To: <20260401004431.151432-1-ebiggers@kernel.org>



On Wed, 1 Apr 2026, at 02:44, Eric Biggers wrote:
> Since support for big-endian arm64 kernels was removed, the CPU_LE()
> macro now unconditionally emits the code it is passed, and the CPU_BE()
> macro now unconditionally discards the code it is passed.
>
> Simplify the assembly code in lib/crc/arm64/ accordingly.
>
> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
> ---
>
> This patch is targeting crc-next
>
>  lib/crc/arm64/crc-t10dif-core.S | 56 ++++++++++++++++-----------------
>  lib/crc/arm64/crc32-core.S      |  9 ++----
>  2 files changed, 30 insertions(+), 35 deletions(-)
>

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>


^ permalink raw reply

* [PATCH v2 4/4] Documentation: PCI: Add documentation for DOE endpoint support
From: Aksh Garg @ 2026-04-01  7:30 UTC (permalink / raw)
  To: linux-pci, linux-doc, mani, kwilczynski, bhelgaas, corbet, kishon,
	skhan, lukas, cassel, alistair
  Cc: linux-arm-kernel, linux-kernel, s-vadapalli, danishanwar, srk,
	a-garg7
In-Reply-To: <20260401073022.215805-1-a-garg7@ti.com>

Document the architecture and implementation details for the Data Object
Exchange (DOE) framework for PCIe Endpoint devices.

Co-developed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Aksh Garg <a-garg7@ti.com>
---

Changes since v1:
- Squashed the patches [1] and [2], and moved the documentation file
  to Documentation/PCI/endpoint/pci-endpoint-doe.rst to match the existing
  naming scheme, as suggested by Niklas Cassel
- Updated the documentation as per the design and implementation changes
  made to previous patches in this series:
  * Updated for static protocol array instead of dynamic registration
  * Documented asynchronous callback model
  * Updated request/response flow with new callback signature
  * Updated memory ownership: DOE core frees request, driver frees response
  * Updated initialization and cleanup sections for new APIs

v1: [1] https://lore.kernel.org/all/20260213123603.420941-2-a-garg7@ti.com/
    [2] https://lore.kernel.org/all/20260213123603.420941-5-a-garg7@ti.com/

 Documentation/PCI/endpoint/index.rst          |   1 +
 .../PCI/endpoint/pci-endpoint-doe.rst         | 318 ++++++++++++++++++
 2 files changed, 319 insertions(+)
 create mode 100644 Documentation/PCI/endpoint/pci-endpoint-doe.rst

diff --git a/Documentation/PCI/endpoint/index.rst b/Documentation/PCI/endpoint/index.rst
index dd1f62e731c9..7c03d5abd2ef 100644
--- a/Documentation/PCI/endpoint/index.rst
+++ b/Documentation/PCI/endpoint/index.rst
@@ -9,6 +9,7 @@ PCI Endpoint Framework
 
    pci-endpoint
    pci-endpoint-cfs
+   pci-endpoint-doe
    pci-test-function
    pci-test-howto
    pci-ntb-function
diff --git a/Documentation/PCI/endpoint/pci-endpoint-doe.rst b/Documentation/PCI/endpoint/pci-endpoint-doe.rst
new file mode 100644
index 000000000000..03b7a69516f3
--- /dev/null
+++ b/Documentation/PCI/endpoint/pci-endpoint-doe.rst
@@ -0,0 +1,318 @@
+.. SPDX-License-Identifier: GPL-2.0-only OR MIT
+
+.. include:: <isonum.txt>
+
+=============================================
+Data Object Exchange (DOE) for PCIe Endpoint
+=============================================
+
+:Copyright: |copy| 2026 Texas Instruments Incorporated
+:Author: Aksh Garg <a-garg7@ti.com>
+:Co-Author: Siddharth Vadapalli <s-vadapalli@ti.com>
+
+Overview
+========
+
+DOE (Data Object Exchange) is a standard PCIe extended capability feature
+introduced in the Data Object Exchange (DOE) ECN for PCIe r5.0. It is an optional
+mechanism for system firmware/software running on root complex (host) to perform
+:ref:`data object <data-object-term>` exchanges with an endpoint function. Each
+data object is uniquely identified by the Vendor ID of the vendor publishing the
+data object definition and a Data Object Type value assigned by that vendor.
+
+Think of DOE as a sophisticated mailbox system built into PCIe. The root complex
+can send structured requests to the endpoint device through DOE mailboxes, and
+the endpoint device responds with appropriate data. DOE mailboxes are implemented
+as PCIe Extended Capabilities in endpoint devices, allowing multiple mailboxes
+per function, each potentially supporting different data object protocols.
+
+The DOE support for root complex devices has already been implemented in
+``drivers/pci/doe.c``.
+
+How DOE Works
+=============
+
+The DOE mailbox operates through a simple request-response model:
+
+1. **Host sends request**: The root complex writes a data object (vendor ID, type,
+   and payload) to the DOE write mailbox register (one DWORD at a time) of the
+   endpoint function's config space and sets the GO bit in the DOE Status register
+   to indicate that a request is ready for processing.
+2. **Endpoint processes**: The endpoint function reads the request from DOE write
+   mailbox register, sets the BUSY bit in the DOE Status register, identifies the
+   protocol of the data object, and executes the appropriate handler.
+3. **Endpoint responds**: The endpoint function writes the response data object to the
+   DOE read mailbox register (one DWORD at a time), and sets the READY bit in the DOE
+   Status register to indicate that the response is ready. If an error occurs during
+   request processing (such as unsupported protocol or handler failure), the endpoint
+   sets the ERROR bit in the DOE Status register instead of the READY bit.
+4. **Host reads response**: The root complex retrieves the response data from the DOE read
+   mailbox register once the READY bit is set in the DOE Status register, and then writes
+   any value to this register to indicate a successful read. If the ERROR bit was set,
+   the root complex discards the response and performs error handling as needed.
+
+Each mailbox operates independently and can handle one transaction at a time. The
+DOE specification supports data objects of size up to 1MB (2\ :sup:`18` dwords).
+
+For complete DOE capability details, refer to `PCI Express Base Specification Revision 7.0,
+Section 6.30 - Data Object Exchange (DOE)`.
+
+Key Terminologies
+=================
+
+.. _data-object-term:
+
+**Data Object**
+  A structured, vendor-defined, or standard-defined message exchanged between
+  root complex and endpoint function via DOE capability registers in configuration
+  space of the function.
+
+**Mailbox**
+  A DOE capability on the endpoint device, where each physical function can have
+  multiple mailboxes.
+
+**Protocol**
+  A specific type of DOE communication data object identified by a Vendor ID and Type.
+
+**Handler**
+  A function that processes DOE requests of a specific protocol and generates responses.
+
+Architecture of DOE Implementation for Endpoint
+===============================================
+
+.. code-block:: text
+
+       +------------------+
+       |                  |
+       |   Root Complex   |
+       |                  |
+       +--------^---------+
+                |
+                | Config space access
+                |   over PCIe link
+                |
+     +----------v-----------+
+     |                      |
+     |    PCIe Controller   |
+     |      as Endpoint     |
+     |                      |
+     |  +-----------------+ |
+     |  |   DOE Mailbox   | |
+     |  +-------^---------+ |
+     +----------|-----------+
+    +-----------|---------------------------------------------------------------+
+    |           |                                       +--------------------+  |
+    | +---------v--------+           Allocate           |  +--------------+  |  |
+    | |                  |-------------------------------->|   Request    |  |  |
+    | |   EP Controller  |                            +--->|    Buffer    |  |  |
+    | |      Driver      |             Free           | |  +--------------+  |  |
+    | |                  |--------------------------+ | |                    |  |
+    | +--------^---------+                          | | |                    |  |
+    |          |                                    | | |                    |  |
+    |          |                                    | | |                    |  |
+    |          | pci_ep_doe_process_request()       | | |                    |  |
+    |          |                                    | | |                    |  |
+    | +--------v---------+             Free         | | |                    |  |
+    | |                  |----------------------------+ |         DDR        |  |
+    | |    DOE EP Core   |<----+                    |   |                    |  |
+    | |    (doe-ep.c)    |     |     Discovery      |   |                    |  |
+    | |                  |-----+  Protocol Handler  |   |                    |  |
+    | +--------^---------+                          |   |                    |  |
+    |          |                                    |   |                    |  |
+    |          | protocol_handler()                 |   |                    |  |
+    |          |                                    |   |                    |  |
+    | +--------v---------+                          |   |                    |  |
+    | |                  |                          |   |  +--------------+  |  |
+    | | Protocol Handler |                          +----->|   Response   |  |  |
+    | |      Module      |-------------------------------->|    Buffer    |  |  |
+    | | (CMA/SPDM/Other) |           Allocate           |  +--------------+  |  |
+    | |                  |                              |                    |  |
+    | +------------------+                              |                    |  |
+    |                                                   +--------------------+  |
+    +---------------------------------------------------------------------------+
+
+Initialization and Cleanup
+--------------------------
+
+**Framework Initialization and DOE Setup**
+
+The EPC core provides the ``pci_epc_doe_setup(epc)`` API for centralized DOE
+mailbox discovery and registration. The controller driver calls this API during
+its probe sequence if DOE is supported.
+
+This API performs the following steps:
+
+1. Calls ``pci_ep_doe_init(epc)``, which initializes ``doe_mbs``, an xarray
+   (a resizable array data structure provided by the Linux kernel) in
+   ``struct pci_epc`` that stores the metadata of the controller's DOE
+   mailboxes.
+2. Discovers all DOE capabilities in the configuration space of each endpoint
+   function. For each discovered DOE capability, calls
+   ``pci_ep_doe_add_mailbox(epc, func_no, cap_offset)`` to register the mailbox.
+
+Each DOE mailbox structure created by ``pci_ep_doe_add_mailbox()`` gets an
+ordered workqueue allocated for processing DOE requests sequentially for that
+mailbox, enabling concurrent request handling across different mailboxes. Each
+mailbox is uniquely identified by the combination of physical function number
+and capability offset for that controller.
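
A minimal sketch of how a mailbox could be keyed by the (function number,
capability offset) pair described above. The bit layout and the ``doe_mb_key*``
helper names are assumptions made for illustration; the patch text does not say
how the ``doe_mbs`` index is actually computed.

```c
#include <stdint.h>

/*
 * Hypothetical key layout (an assumption, not taken from the patch):
 * bits 23:16 hold the function number, bits 15:0 the capability offset.
 */
static unsigned long doe_mb_key(uint8_t func_no, uint16_t cap_offset)
{
	return ((unsigned long)func_no << 16) | cap_offset;
}

/* Recover the function number from a key built by doe_mb_key(). */
static uint8_t doe_mb_key_func(unsigned long key)
{
	return (uint8_t)(key >> 16);
}

/* Recover the capability offset from a key built by doe_mb_key(). */
static uint16_t doe_mb_key_cap(unsigned long key)
{
	return (uint16_t)(key & 0xffff);
}
```

Because the capability offset fits in 16 bits, two mailboxes at the same offset
in different functions always get distinct keys under this layout.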
+
+**Cleanup**
+
+The EPC core provides the ``pci_epc_doe_destroy(epc)`` API for centralized DOE
+cleanup. The controller driver calls this API during its remove sequence
+if DOE is supported.
+
+This API calls ``pci_ep_doe_destroy(epc)``, which destroys all registered
+mailboxes, cancels any pending tasks, flushes and destroys the workqueues,
+and frees all memory allocated to the mailboxes.
+
+Protocol Handler Support
+------------------------
+
+Protocol implementations (such as CMA, SPDM, or vendor-specific protocols) are
+supported through a static array of protocol handlers.
+
+When a new DOE protocol library is introduced, its handler function is added to
+the static ``pci_doe_protocols`` array in ``drivers/pci/endpoint/pci-ep-doe.c``.
+The discovery protocol (VID = 0x0001 (PCI-SIG vendor ID), Type = 0x00 (discovery
+protocol)) is included in this static array and handled internally by the
+DOE EP core.
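
The static-table lookup described above might look roughly like the following
userspace sketch. The handler signature mirrors the
``handler(request, request_sz, &response, &response_sz)`` call quoted later in
this document; the struct, the table name ``example_protocols``, and the stub
handler are hypothetical and are not the contents of ``pci_doe_protocols``.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*doe_handler_t)(const void *request, size_t request_sz,
			     void **response, size_t *response_sz);

struct doe_protocol {
	uint16_t vid;		/* Vendor ID of the data object */
	uint8_t type;		/* Data object type */
	doe_handler_t handler;
};

/* Stub standing in for the discovery handler; it produces no response. */
static int discovery_stub(const void *request, size_t request_sz,
			  void **response, size_t *response_sz)
{
	(void)request;
	(void)request_sz;
	*response = NULL;
	*response_sz = 0;
	return 0;
}

/* Hypothetical table; new protocols would append entries here. */
static const struct doe_protocol example_protocols[] = {
	{ 0x0001, 0x00, discovery_stub },	/* PCI-SIG, discovery */
};

/* Return the handler for (vid, type), or NULL if it is unsupported. */
static doe_handler_t doe_find_handler(uint16_t vid, uint8_t type)
{
	size_t n = sizeof(example_protocols) / sizeof(example_protocols[0]);

	for (size_t i = 0; i < n; i++) {
		if (example_protocols[i].vid == vid &&
		    example_protocols[i].type == type)
			return example_protocols[i].handler;
	}
	return NULL;
}
```

A NULL result corresponds to the "unsupported protocol" error case mentioned
under Error Handling.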
+
+Request Handling
+----------------
+
+A DOE request flows from the root complex to its response as follows:
+
+**Step 1: Root Complex → EP Controller Driver**
+
+The root complex writes a DOE request (Vendor ID, Type, and Payload) to the
+DOE write mailbox register in the endpoint function's configuration space and
+sets the GO bit in the DOE Control register, indicating that the request is
+ready for processing.
+
+**Step 2: EP Controller Driver → DOE EP Core**
+
+The controller driver reads the request header to determine the data object
+length. Based on this length field, it allocates a request buffer in memory
+(DDR) of the appropriate size. The driver then reads the complete request
+payload from the DOE write mailbox register and converts the data from
+little-endian format (the format followed in the PCIe transactions over the
+link) to CPU-native format using ``le32_to_cpu()``. The driver defines a
+completion callback function with signature ``void (*complete)(u8 func_no,
+u16 cap_offset, int status, u16 vendor, u8 type, void *response_pl,
+size_t response_pl_sz)`` to be invoked when the request processing completes.
+The driver then calls ``pci_ep_doe_process_request(epc, func_no, cap_offset,
+vendor, type, request, request_sz, complete)`` to hand off the request to the
+DOE EP core. This function returns immediately after queuing the work
+(without blocking), and the driver sets the BUSY bit in the DOE Status register.
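
As an illustration of the header parsing in Step 2, here is a userspace sketch
that decodes the two header DWORDs of a data object. The field positions
(Vendor ID in DW0 bits 15:0, Type in DW0 bits 23:16, Length in DW1 bits 17:0,
counted in DWORDs with 0 encoding 2^18) follow the DOE data object header
layout. The struct and function names are hypothetical, and ``get_le32()`` is
only a portable stand-in for ``le32_to_cpu()``.

```c
#include <stddef.h>
#include <stdint.h>

struct doe_header {
	uint16_t vendor;	/* DW0 bits 15:0 */
	uint8_t type;		/* DW0 bits 23:16 */
	uint32_t length_dw;	/* DW1 bits 17:0, in DWORDs, header included */
};

/* Portable stand-in for le32_to_cpu() applied to a byte stream. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the first 8 bytes of a data object as read from the mailbox. */
static struct doe_header doe_parse_header(const uint8_t *raw)
{
	uint32_t dw0 = get_le32(raw);
	uint32_t dw1 = get_le32(raw + 4);
	struct doe_header hdr;

	hdr.vendor = (uint16_t)(dw0 & 0xffff);
	hdr.type = (uint8_t)((dw0 >> 16) & 0xff);
	hdr.length_dw = dw1 & 0x3ffff;
	if (hdr.length_dw == 0)
		hdr.length_dw = 1u << 18;	/* 0 encodes the maximum length */
	return hdr;
}

/* Size in bytes of the buffer needed for the whole data object. */
static size_t doe_object_bytes(struct doe_header hdr)
{
	return (size_t)hdr.length_dw * 4;
}
```

A driver would size its request buffer from ``doe_object_bytes()`` before
reading the rest of the payload.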
+
+**Step 3: DOE EP Core Processing**
+
+The DOE EP core creates a task structure and submits it to the mailbox's ordered
+workqueue. This ensures that requests for each mailbox are processed
+sequentially, one at a time, as required by the DOE specification. When the
+task runs, the core looks up the protocol handler based on the Vendor ID and
+Type from the request header and executes the handler function.
+
+**Step 4: Protocol Handler Execution**
+
+The workqueue executes the task by calling the registered protocol handler:
+``handler(request, request_sz, &response, &response_sz)``. The handler processes
+the request, allocates a response buffer in memory (DDR), builds the response
+data, and returns the response pointer and size. For the discovery protocol,
+the DOE EP core handles this directly without invoking an external handler.
+
+**Step 5: DOE EP Core → EP Controller Driver**
+
+After the protocol handler completes, the DOE EP core frees the request buffer
+and asynchronously invokes the completion callback provided by the controller
+driver. The callback receives the function number and capability offset (to
+identify the mailbox), a status code indicating the result of request
+processing, the vendor ID and type of the data object, the response buffer,
+and its size.
+
+**Step 6: EP Controller Driver → Root Complex**
+
+The controller driver converts the response from CPU-native format to
+little-endian format using ``cpu_to_le32()``, writes the response to the DOE
+read mailbox register, and sets the READY bit in the DOE Status register. The
+root complex then reads the response from the read mailbox register. Finally,
+the controller driver frees the response buffer (which the handler allocated).
+
+Asynchronous Request Processing
+-------------------------------
+
+The DOE-EP framework implements asynchronous request processing because an
+endpoint function can have multiple instances of DOE mailboxes, and requests may
+be interleaved across these mailboxes. Request processing on one mailbox should
+not block request processing on other mailboxes. Hence, requests on different
+mailboxes need to be handled in parallel.
+
+For the EP controller driver to handle requests on multiple mailboxes in
+parallel, ``pci_ep_doe_process_request()`` must be asynchronous. The function
+returns immediately after submitting the request to the mailbox's workqueue,
+without waiting for the request to complete. A completion callback provided by
+the controller driver is invoked asynchronously when request processing
+finishes. This asynchronous design enables concurrent processing of requests
+across different mailboxes.
+
+Abort Handling
+--------------
+
+The DOE specification allows the root complex to abort ongoing DOE operations
+by setting the ABORT bit in the DOE Control register.
+
+**Trigger**
+
+When the root complex sets the ABORT bit, the EP controller driver detects this
+condition (typically in an interrupt handler or register polling routine). The
+action taken depends on the timing of the abort:
+
+- **ABORT during request transfer**: If the ABORT bit is set while the root
+  complex is still transferring the request to the mailbox registers, the
+  controller driver discards the request and no call to ``pci_ep_doe_abort()``
+  is needed.
+
+- **ABORT after request submission**: If the ABORT bit is set after the request
+  has been fully received and submitted to the DOE EP core via
+  ``pci_ep_doe_process_request()``, the controller driver must call
+  ``pci_ep_doe_abort(epc, func_no, cap_offset)`` for the affected mailbox to
+  perform the abort sequence in the DOE EP core.
+
+**Abort Sequence**
+
+The abort function performs the following actions:
+
+1. Sets the CANCEL flag on the mailbox to prevent queued requests from starting
+2. Flushes the workqueue to wait for any currently executing handler to complete
+   (handlers cannot be interrupted mid-execution)
+3. Clears the CANCEL flag to allow the mailbox to accept new requests
+
+Queued requests that have not started execution are aborted with an error
+status. The currently executing request completes normally, and the controller
+rejects its response if it arrives after the abort sequence has been triggered.
+
+.. note::
+   Independent of when the ABORT bit is triggered, the controller driver must
+   clear the ERROR, BUSY, and READY bits in the DOE Status register after
+   completing the abort operation to reset the mailbox to an idle state.
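
The three-step abort sequence can be modelled in a single-threaded userspace
sketch like the one below. Everything here is hypothetical: the fixed-size
array stands in for the ordered workqueue, "flushing" simply runs each queued
task, and ``-ECANCELED`` is an assumed choice of error status. The model covers
only queued tasks; a task that is already executing cannot be represented
without real concurrency.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 4

struct doe_task {
	int status;	/* 0 on success, negative errno otherwise */
	bool done;
};

struct doe_mailbox {
	bool cancel;				/* models the CANCEL flag */
	struct doe_task *queue[MAX_TASKS];	/* models the ordered workqueue */
	size_t nr_queued;
};

/* Queue a task; it does not run until the queue is flushed. */
static int doe_submit(struct doe_mailbox *mb, struct doe_task *t)
{
	if (mb->nr_queued == MAX_TASKS)
		return -ENOSPC;
	t->status = 0;
	t->done = false;
	mb->queue[mb->nr_queued++] = t;
	return 0;
}

/* Work function: a task that starts while CANCEL is set fails immediately. */
static void doe_run_task(struct doe_mailbox *mb, struct doe_task *t)
{
	t->status = mb->cancel ? -ECANCELED : 0;
	t->done = true;
}

/* Run every queued task in submission order, then empty the queue. */
static void doe_flush(struct doe_mailbox *mb)
{
	for (size_t i = 0; i < mb->nr_queued; i++)
		doe_run_task(mb, mb->queue[i]);
	mb->nr_queued = 0;
}

static void doe_abort(struct doe_mailbox *mb)
{
	mb->cancel = true;	/* 1. stop queued requests from starting */
	doe_flush(mb);		/* 2. wait for the queue to drain */
	mb->cancel = false;	/* 3. accept new requests again */
}
```

After ``doe_abort()`` returns, the mailbox accepts and processes new requests
normally.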
+
+Error Handling
+--------------
+
+Errors can occur during DOE request processing for various reasons, such as
+unsupported protocols, handler failures, or memory allocation failures.
+
+**Error Detection**
+
+When an error occurs during DOE request processing, the DOE EP core propagates
+this error back to the controller driver either through the
+``pci_ep_doe_process_request()`` return value or through the status code passed
+to the completion callback.
+
+**Error Response**
+
+When the controller driver receives an error code, it sets the ERROR bit in the
+DOE Status register instead of writing a response to the read mailbox register,
+and frees the buffers.
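
A sketch of the status-bit handling a controller driver might perform in its
completion callback and after an abort. The bit positions follow the DOE Status
register layout (BUSY bit 0, ERROR bit 2, Data Object Ready bit 31). The
function names are hypothetical, and clearing BUSY on completion is an
assumption, since the text above only says when BUSY is set.

```c
#include <stdint.h>

#define DOE_STATUS_BUSY		0x00000001u
#define DOE_STATUS_ERROR	0x00000004u
#define DOE_STATUS_READY	0x80000000u

/*
 * Compute the new DOE Status value once request processing has finished.
 * A non-zero status means the request failed, so ERROR is reported instead
 * of READY. Clearing BUSY here is an assumption of this sketch.
 */
static uint32_t doe_status_after_completion(uint32_t status_reg, int status)
{
	status_reg &= ~DOE_STATUS_BUSY;
	if (status)
		return status_reg | DOE_STATUS_ERROR;
	return status_reg | DOE_STATUS_READY;
}

/* After an abort, ERROR, BUSY and READY are all cleared (idle mailbox). */
static uint32_t doe_status_after_abort(uint32_t status_reg)
{
	return status_reg & ~(DOE_STATUS_ERROR | DOE_STATUS_BUSY |
			      DOE_STATUS_READY);
}
```

Keeping this logic in pure functions makes the READY/ERROR decision easy to
check independently of how a particular controller exposes the register.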
+
+API Reference
+=============
+
+.. kernel-doc:: drivers/pci/endpoint/pci-ep-doe.c
+   :export:
-- 
2.34.1


