* Re: [PATCH v4 04/13] dma: swiotlb: track pool encryption state and honor DMA_ATTR_CC_SHARED
From: Jason Gunthorpe @ 2026-05-19 16:11 UTC
To: Aneesh Kumar K.V
Cc: Mostafa Saleh, iommu, linux-arm-kernel, linux-kernel, linux-coco,
Robin Murphy, Marek Szyprowski, Will Deacon, Marc Zyngier,
Steven Price, Suzuki K Poulose, Catalin Marinas, Jiri Pirko,
Petr Tesarik, Alexey Kardashevskiy, Dan Williams, Xu Yilun,
linuxppc-dev, linux-s390, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy (CS GROUP), Alexander Gordeev,
Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Christian Borntraeger, Sven Schnelle, x86
In-Reply-To: <yq5a8q9fs7ud.fsf@kernel.org>
On Tue, May 19, 2026 at 09:35:30PM +0530, Aneesh Kumar K.V wrote:
> Yes, that also resulted in simpler and cleaner code.
>
> swiotlb_tbl_map_single
> /*
> * If the physical address is encrypted but the device requires
> * decrypted DMA, use a decrypted io_tlb_mem and update the
> * attributes so the caller knows that a decrypted io_tlb_mem
> * was used.
> */
> if (!(*attrs & DMA_ATTR_CC_SHARED) && force_dma_unencrypted(dev))
> *attrs |= DMA_ATTR_CC_SHARED;
>
> if (mem->unencrypted != !!(*attrs & DMA_ATTR_CC_SHARED))
> return (phys_addr_t)DMA_MAPPING_ERROR;
Yeah, exactly, that is so much clearer now that mem->unencrypted is
tied in directly.
That logic is reversed though. The incoming DMA_ATTR_CC_SHARED doesn't
matter for swiotlb; it only describes the source of the memcpy.
	/* swiotlb pool is incorrect for this device */
	if (mem->unencrypted != force_dma_unencrypted(dev))
		return (phys_addr_t)DMA_MAPPING_ERROR;
	/* Force attrs to match the kind of memory in the pool */
	if (mem->unencrypted)
		*attrs |= DMA_ATTR_CC_SHARED;
	else
		*attrs &= ~DMA_ATTR_CC_SHARED;
Attrs should be forced to whatever memory swiotlb selected.
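The ordering above can be tried out in user space with a rough model. This is only a sketch: struct sketch_dev, struct sketch_pool, SKETCH_ATTR_CC_SHARED and SKETCH_MAPPING_ERROR are stand-ins invented for illustration, not the kernel's definitions, and the bit value is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for illustration only; not the kernel's definitions. */
#define SKETCH_ATTR_CC_SHARED	(1UL << 0)	/* assumed bit value */
#define SKETCH_MAPPING_ERROR	(~(uint64_t)0)

/* Models the answer of force_dma_unencrypted(dev). */
struct sketch_dev {
	bool force_unencrypted;
};

/* Models the io_tlb_mem pool and its encryption state. */
struct sketch_pool {
	bool unencrypted;
	uint64_t base;
};

static uint64_t sketch_map_single(const struct sketch_dev *dev,
				  const struct sketch_pool *mem,
				  unsigned long *attrs)
{
	assert(dev && mem && attrs);

	/* The pool must match what the device needs; incoming attrs are ignored. */
	if (mem->unencrypted != dev->force_unencrypted)
		return SKETCH_MAPPING_ERROR;

	/* Report the kind of memory the bounce buffer actually lives in. */
	if (mem->unencrypted)
		*attrs |= SKETCH_ATTR_CC_SHARED;
	else
		*attrs &= ~SKETCH_ATTR_CC_SHARED;

	return mem->base;
}
```

In this model a caller that passes the shared bit for a device that does not force unencrypted DMA gets the bit cleared again, because the bounce buffer it receives is encrypted.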
Jason
* Re: [PATCH] arm64: dts: imx8mp-frdm: add support for SD-card
From: Frank Li @ 2026-05-19 16:19 UTC
To: Alexandru Ardelean
Cc: imx, linux-arm-kernel, linux-kernel, devicetree, festevam, kernel,
s.hauer, conor+dt, krzk+dt, robh, Xiaofeng Wei
In-Reply-To: <20260429135717.178982-1-aardelean@deviqon.com>
On Wed, Apr 29, 2026 at 04:57:17PM +0300, Alexandru Ardelean wrote:
Please rebase onto https://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux.git/log/?h=imx/dt64
> The i.MX8MP FRDM board also has an SD-card slot, which is useful during.
Redundant "."
> development.
> This change picks it up from NXP's BSP repo:
Avoid "This change"; just say:
Based on https://github.com/nxp-imx-support/meta-imx-frdm.
> https://github.com/nxp-imx-support/meta-imx-frdm
>
> Adding Xiaofeng Wei's as he is the original author of the DT.
I don't think this needs to be mentioned, since you keep the Signed-off-by tags.
Frank
>
> Signed-off-by: Xiaofeng Wei <xiaofeng.wei@nxp.com>
> Signed-off-by: Alexandru Ardelean <aardelean@deviqon.com>
> ---
> arch/arm64/boot/dts/freescale/imx8mp-frdm.dts | 72 +++++++++++++++++++
> 1 file changed, 72 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/freescale/imx8mp-frdm.dts b/arch/arm64/boot/dts/freescale/imx8mp-frdm.dts
> index 55690f5e53d7e..84034b0ccb12d 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mp-frdm.dts
> +++ b/arch/arm64/boot/dts/freescale/imx8mp-frdm.dts
> @@ -42,6 +42,17 @@ memory@40000000 {
> reg = <0x0 0x40000000 0 0xc0000000>,
> <0x1 0x00000000 0 0x40000000>;
> };
> +
> + reg_usdhc2_vmmc: regulator-usdhc2 {
> + compatible = "regulator-fixed";
> + pinctrl-names = "default";
> + pinctrl-0 = <&pinctrl_reg_usdhc2_vmmc>;
> + regulator-name = "VSD_3V3";
> + regulator-min-microvolt = <3300000>;
> + regulator-max-microvolt = <3300000>;
> + gpio = <&gpio2 19 GPIO_ACTIVE_HIGH>;
> + enable-active-high;
> + };
> };
>
> &A53_0 {
> @@ -237,6 +248,19 @@ &uart3 {
> status = "okay";
> };
>
> +&usdhc2 {
> + assigned-clocks = <&clk IMX8MP_CLK_USDHC2>;
> + assigned-clock-rates = <400000000>;
> + pinctrl-names = "default", "state_100mhz", "state_200mhz";
> + pinctrl-0 = <&pinctrl_usdhc2>, <&pinctrl_usdhc2_gpio>;
> + pinctrl-1 = <&pinctrl_usdhc2_100mhz>, <&pinctrl_usdhc2_gpio>;
> + pinctrl-2 = <&pinctrl_usdhc2_200mhz>, <&pinctrl_usdhc2_gpio>;
> + cd-gpios = <&gpio2 12 GPIO_ACTIVE_LOW>;
> + vmmc-supply = <&reg_usdhc2_vmmc>;
> + bus-width = <4>;
> + status = "okay";
> +};
> +
> &usdhc3 {
> assigned-clocks = <&clk IMX8MP_CLK_USDHC3>;
> assigned-clock-rates = <400000000>;
> @@ -289,6 +313,12 @@ MX8MP_IOMUXC_SD1_STROBE__GPIO2_IO11 0x146
> >;
> };
>
> + pinctrl_reg_usdhc2_vmmc: regusdhc2vmmcgrp {
> + fsl,pins = <
> + MX8MP_IOMUXC_SD2_RESET_B__GPIO2_IO19 0x40
> + >;
> + };
> +
> pinctrl_uart2: uart2grp {
> fsl,pins = <
> MX8MP_IOMUXC_UART2_RXD__UART2_DCE_RX 0x140
> @@ -305,6 +335,48 @@ MX8MP_IOMUXC_ECSPI1_MISO__UART3_DCE_CTS 0x140
> >;
> };
>
> + pinctrl_usdhc2: usdhc2grp {
> + fsl,pins = <
> + MX8MP_IOMUXC_SD2_CLK__USDHC2_CLK 0x190
> + MX8MP_IOMUXC_SD2_CMD__USDHC2_CMD 0x1d0
> + MX8MP_IOMUXC_SD2_DATA0__USDHC2_DATA0 0x1d0
> + MX8MP_IOMUXC_SD2_DATA1__USDHC2_DATA1 0x1d0
> + MX8MP_IOMUXC_SD2_DATA2__USDHC2_DATA2 0x1d0
> + MX8MP_IOMUXC_SD2_DATA3__USDHC2_DATA3 0x1d0
> + MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc0
> + >;
> + };
> +
> + pinctrl_usdhc2_100mhz: usdhc2-100mhzgrp {
> + fsl,pins = <
> + MX8MP_IOMUXC_SD2_CLK__USDHC2_CLK 0x194
> + MX8MP_IOMUXC_SD2_CMD__USDHC2_CMD 0x1d4
> + MX8MP_IOMUXC_SD2_DATA0__USDHC2_DATA0 0x1d4
> + MX8MP_IOMUXC_SD2_DATA1__USDHC2_DATA1 0x1d4
> + MX8MP_IOMUXC_SD2_DATA2__USDHC2_DATA2 0x1d4
> + MX8MP_IOMUXC_SD2_DATA3__USDHC2_DATA3 0x1d4
> + MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc0
> + >;
> + };
> +
> + pinctrl_usdhc2_200mhz: usdhc2-200mhzgrp {
> + fsl,pins = <
> + MX8MP_IOMUXC_SD2_CLK__USDHC2_CLK 0x196
> + MX8MP_IOMUXC_SD2_CMD__USDHC2_CMD 0x1d6
> + MX8MP_IOMUXC_SD2_DATA0__USDHC2_DATA0 0x1d6
> + MX8MP_IOMUXC_SD2_DATA1__USDHC2_DATA1 0x1d6
> + MX8MP_IOMUXC_SD2_DATA2__USDHC2_DATA2 0x1d6
> + MX8MP_IOMUXC_SD2_DATA3__USDHC2_DATA3 0x1d6
> + MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc0
> + >;
> + };
> +
> + pinctrl_usdhc2_gpio: usdhc2gpiogrp {
> + fsl,pins = <
> + MX8MP_IOMUXC_SD2_CD_B__GPIO2_IO12 0x1c4
> + >;
> + };
> +
> pinctrl_usdhc3: usdhc3grp {
> fsl,pins = <
> MX8MP_IOMUXC_NAND_WE_B__USDHC3_CLK 0x190
> --
> 2.43.0
>
* Re: [PATCH v02] mailbox: pcc: report errors for PCC clients
From: Sudeep Holla @ 2026-05-19 16:25 UTC
To: lihuisong (C)
Cc: Adam Young, Jassi Brar, linux-kernel, Sudeep Holla, linux-hwmon,
Rafael J . Wysocki, Len Brown, linux-acpi, Andi Shyti,
Guenter Roeck, MyungJoo Ham, Kyungmin Park, Chanwoo Choi,
linux-arm-kernel
In-Reply-To: <881ec4ba-44ce-498d-b0c4-8c1d51b13cc3@huawei.com>
On Tue, May 19, 2026 at 09:54:47PM +0800, lihuisong (C) wrote:
>
> On 5/19/2026 3:30 AM, Adam Young wrote:
> > The tx_done callback function has a return code (rc) parameter
> > that the tx_done callback can use to determine how to handle an error.
> > However the IRQ handler was not setting that value if there is an error.
> >
> > The following clients are affected:
> >
> > drivers/acpi/cppc_acpi.c
> > drivers/i2c/busses/i2c-xgene-slimpro.c
> > drivers/hwmon/xgene-hwmon.c
> > drivers/soc/hisilicon/kunpeng_hccs.c
> > drivers/devfreq/hisi_uncore_freq.c
> >
> > All of these only use the error code to report, so they
> > are expecting an error code to come thorugh, but they
> > do not modify behavior based on this code.
> >
> > In the case of an error code in the IRQ, the handler was returning
> > IRQ_NONE which is not correct: the IRQ handler was matched
> > to the IRQ. This mean that multiple error codes returned from
> > a PCC triggered interrupt would end up disabling the device.
> >
> > In addition, if the error code IRQ was coming from a Type4 Device that was
> > expecting an IRQ response, that device would then be hung.
> >
> > Fixes: c45ded7e1135 ("mailbox: pcc: Add support for PCCT extended PCC subspaces(type 3/4)")
> Not fix above commit.
> mbox_chan_txdone() was added in below patch.
> Fixes: 9c753f7c953c (mailbox: pcc: Mark Tx as complete in PCC IRQ handler)
> > Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
> >
> > ---
> > ---
> > drivers/mailbox/pcc.c | 9 +++++----
> > 1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
> > index 636879ae1db7..16b9ce087b9e 100644
> > --- a/drivers/mailbox/pcc.c
> > +++ b/drivers/mailbox/pcc.c
> > @@ -314,6 +314,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
> > {
> > struct pcc_chan_info *pchan;
> > struct mbox_chan *chan = p;
> > + int rc;
> > pchan = chan->con_priv;
> > @@ -327,8 +328,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
> > if (!pcc_mbox_cmd_complete_check(pchan))
> > return IRQ_NONE;
> > - if (pcc_mbox_error_check_and_clear(pchan))
> > - return IRQ_NONE;
> > + rc = pcc_mbox_error_check_and_clear(pchan);
>
> I think it is not necessary. This function just return -EIO on failure.
>
> > /*
> > * Clear this flag after updating interrupt ack register and just
> > @@ -337,8 +337,9 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
> > * required to avoid any possible race in updatation of this flag.
> > */
> > pchan->chan_in_use = false;
> > - mbox_chan_received_data(chan, NULL);
> > - mbox_chan_txdone(chan, 0);
> > + if (!rc)
> > + mbox_chan_received_data(chan, NULL);
> > + mbox_chan_txdone(chan, rc);
> @Sudeep, I have always had doubts about the addition of this line of code in
> the
> commit 9c753f7c953c (mailbox: pcc: Mark Tx as complete in PCC IRQ handler).
> The patch seems to avoid the timeouts in the mailbox core according to its
> commit log.
> Regardless of whether the command succeeds or fails, each mbox client
> driver, like cppc_acpi/acpi_pcc,kunpeng_hccs and so on, is responsible to
> call mbox_chan_txdone() to tell mailbox core.
A few controller drivers do call mbox_chan_txdone(). Tx completion is
detected by PCC, so I am not sure why you think this is not the right place
to do it. The IRQ indicates the completion, so I am confused as to why you
think otherwise.
It is defined in include/linux/mailbox_controller.h for that very reason.
Client drivers can use mbox_client_txdone() if they wish to, as defined
in include/linux/mailbox_client.h.
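To make the flow under discussion concrete, here is a simplified user-space model of the IRQ path as the quoted diff proposes it. The sketch_ names and the struct layout are invented for illustration, and the real handler performs further checks that are left out. Only the ordering is taken from the diff: error check, clear the in-use flag, notify the receive path on success only, then report Tx-done with rc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_irqreturn { SKETCH_IRQ_NONE, SKETCH_IRQ_HANDLED };

/* Hypothetical channel model; field names are not the kernel's. */
struct sketch_chan {
	bool cmd_complete;	/* models pcc_mbox_cmd_complete_check() */
	int error;		/* models pcc_mbox_error_check_and_clear(): 0 or -errno */
	bool in_use;		/* models pchan->chan_in_use */
	int rx_calls;		/* how often the receive path was notified */
	int txdone_calls;	/* how often Tx-done was reported */
	int last_txdone_rc;	/* rc passed with the last Tx-done */
};

static enum sketch_irqreturn sketch_pcc_irq(struct sketch_chan *chan)
{
	int rc;

	assert(chan != NULL);

	/* Command not complete: this interrupt is not for us. */
	if (!chan->cmd_complete)
		return SKETCH_IRQ_NONE;

	/* Keep the error instead of bailing out with IRQ_NONE. */
	rc = chan->error;
	chan->in_use = false;

	/* Stands in for mbox_chan_received_data(chan, NULL). */
	if (!rc)
		chan->rx_calls++;

	/* Stands in for mbox_chan_txdone(chan, rc). */
	chan->txdone_calls++;
	chan->last_txdone_rc = rc;

	return SKETCH_IRQ_HANDLED;
}
```

With an error set, the model still returns the handled value and reports the error through the Tx-done path, which is the behaviour the patch description asks for.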
> This is done after executing mbox_chan_received_data(). So I think this line
> in this function is redundant.
No, I think otherwise, see details above.
--
Regards,
Sudeep
* Re: [PATCH 2/3] arm64: dts: freescale: add Aquila iMX95 support
From: Frank Li @ 2026-05-19 16:33 UTC
To: Franz Schnyder
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Shawn Guo,
Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam, devicetree,
linux-kernel, imx, linux-arm-kernel, Franz Schnyder,
João Paulo Gonçalves, Emanuele Ghidoli,
Francesco Dolcini, Antoine Gouby, Ernest Van Hoecke
In-Reply-To: <20260506-add-aquila-imx95-v1-2-69c8ee1c5413@toradex.com>
On Wed, May 06, 2026 at 03:01:56PM +0200, Franz Schnyder wrote:
> From: João Paulo Gonçalves <joao.goncalves@toradex.com>
>
> Add support for the Toradex Aquila iMX95 and its development carrier
> board.
>
> The module consists of an NXP i.MX95 family SoC, up to 16GB LPDDR5 RAM,
> up to 128GB of storage, a USB 3.2 OTG and USB 2.0 Host, a Gigabit
> Ethernet PHY, a 10 Gigabit Ethernet interface, an I2C EEPROM and
> Temperature Sensor, an RX8130 RTC, one Quad lane CSI interface, one Quad
> lane DSI or CSI interface, one LVDS interface (one or two channels), and
> some optional addons: DisplayPort (through a DSI-DP bridge), TPM 2.0,
> and a WiFi/BT module.
>
...
> +
> +&scmi_iomuxc {
> + /* Aquila ETH_2_XGMII_MDIO */
> + pinctrl_emdio: emdiogrp {
> + fsl,pins = <IMX95_PAD_ENET2_MDC__NETCMIX_TOP_NETC_MDC 0x57e>, /* Aquila B90 */
> + <IMX95_PAD_ENET2_MDIO__NETCMIX_TOP_NETC_MDIO 0x97e>; /* Aquila B89 */
> + };
> +
> + /* Aquila ETH_1 */
> + pinctrl_enetc0: enetc0grp {
> + fsl,pins = <IMX95_PAD_ENET1_TX_CTL__NETCMIX_TOP_ETH0_RGMII_TX_CTL 0x57e>, /* ENET1_TX_CTL */
> + <IMX95_PAD_ENET1_TXC__NETCMIX_TOP_ETH0_RGMII_TX_CLK 0x58e>, /* ENET1_TXC */
> + <IMX95_PAD_ENET1_TD0__NETCMIX_TOP_ETH0_RGMII_TD0 0x50e>, /* ENET1_TDO */
> + <IMX95_PAD_ENET1_TD1__NETCMIX_TOP_ETH0_RGMII_TD1 0x50e>, /* ENET1_TD1 */
> + <IMX95_PAD_ENET1_TD2__NETCMIX_TOP_ETH0_RGMII_TD2 0x50e>, /* ENET1_TD2 */
> + <IMX95_PAD_ENET1_TD3__NETCMIX_TOP_ETH0_RGMII_TD3 0x50e>, /* ENET1_TD3 */
> + <IMX95_PAD_ENET1_RX_CTL__NETCMIX_TOP_ETH0_RGMII_RX_CTL 0x57e>, /* ENET1_RX_CTL */
> + <IMX95_PAD_ENET1_RXC__NETCMIX_TOP_ETH0_RGMII_RX_CLK 0x58e>, /* ENET1_RXC */
> + <IMX95_PAD_ENET1_RD0__NETCMIX_TOP_ETH0_RGMII_RD0 0x57e>, /* ENET1_RD0 */
> + <IMX95_PAD_ENET1_RD1__NETCMIX_TOP_ETH0_RGMII_RD1 0x57e>, /* ENET1_RD1 */
> + <IMX95_PAD_ENET1_RD2__NETCMIX_TOP_ETH0_RGMII_RD2 0x57e>, /* ENET1_RD2 */
> + <IMX95_PAD_ENET1_RD3__NETCMIX_TOP_ETH0_RGMII_RD3 0x57e>; /* ENET1_RD3 */
> + };
> +
> + pinctrl_ctrl_dp_clk_en: dpclkengrp {
> + fsl,pins = <IMX95_PAD_SAI1_TXFS__AONMIX_TOP_GPIO1_IO_BIT11 0x11e>; /* CTRL_DP_CLK_EN */
> + };
> +
> + pinctrl_ctrl_gpio_exp_int: gpioexpintgrp {
> + fsl,pins = <IMX95_PAD_SAI1_TXD0__AONMIX_TOP_GPIO1_IO_BIT13 0x31e>; /* CTRL_GPIO_EXP_INT# */
> + };
> +
> + /* Aquila CTRL_WAKE1_MICO# */
> + pinctrl_ctrl_wake1_mico: ctrlwake1micogrp {
> + fsl,pins = <IMX95_PAD_XSPI1_SS1_B__GPIO5_IO_BIT11 0x31e>; /* Aquila D6 */
> + };
This list is quite long, so it needs to be kept in alphabetical order by
node name. To avoid this kind of problem, I suggest running
https://github.com/lznuaa/dt-format on new dts files.
Frank
* [v7 PATCH] arm64: mm: show direct mapping use in /proc/meminfo
From: Yang Shi @ 2026-05-19 16:36 UTC
To: catalin.marinas, will, ryan.roberts, cl
Cc: yang, linux-arm-kernel, linux-kernel
Since commit a166563e7ec3 ("arm64: mm: support large block mapping when
rodata=full"), the direct mapping may be split on some machines instead
of staying static since boot. It therefore makes more sense than before to
show the direct mapping use in /proc/meminfo.
Make /proc/meminfo show the direct mapping use as below (4K base page
size):
DirectMap4k: 94792 kB
DirectMap64k: 134208 kB
DirectMap2M: 1173504 kB
DirectMap32M: 5636096 kB
DirectMap1G: 529530880 kB
Although only machines that support BBML2_NOABORT can split the direct
mapping, show it on all machines regardless of BBML2_NOABORT so that
users get a consistent view and confusion is avoided.
ptdump can also tell the direct map use, but it needs to dump the whole
kernel page table, which is costly and overkill. It also lives in
debugfs, which may not be enabled by all distros. So showing direct map
use in /proc/meminfo is more convenient and has less overhead.
Signed-off-by: Yang Shi <yang@os.amperecomputing.com>
---
arch/arm64/mm/mmu.c | 192 +++++++++++++++++++++++++++++++++++++++-----
1 file changed, 171 insertions(+), 21 deletions(-)
v7: * Rebased to v7.1-rc4
* Changed "dm" to "lm" to follow ARM convention per Will
* Used __is_lm_address() instead of reinventing a new helper per Will
v6: * Rebased to v7.0-rc3
* Rebased on top of Anshuman's v5 "arm64/mm: Enable batched TLB flush
in unmap_hotplug_range()"
* Used const for direct map type array per Will
* Defined PUD size for 16K/64K even though it is not used per Will
* Removed the misleading comment in init_pmd() per Will
v5: * Rebased to v6.19-rc4
* Fixed the build error for !CONFIG_PROC_FS
v4: * Used PAGE_END instead of _PAGE_END(VA_BITS_MIN) per Ryan
* Used shorter name for the helpers and variables per Ryan
* Fixed accounting for memory hotunplug
v3: * Fixed the over-accounting problems per Ryan
* Introduced helpers for add/sub direct map use and #ifdef them with
CONFIG_PROC_FS per Ryan
* v3 is a fix patch on top of v2
v2: * Counted in size instead of the number of entries per Ryan
* Removed shift array per Ryan
* Use lower case "k" per Ryan
* Fixed a couple of build warnings reported by kernel test robot
* Fixed a couple of potential miscounts
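As an illustration only (not part of the patch), the accounting rule behind the counters can be modelled in user space. The sk_ names and the fixed 4K-page sizes are assumptions made for this sketch; the patch itself uses lm_meminfo_add() and lm_meminfo_sub() with the real page-table sizes and only counts linear-map addresses.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the mapping granularities tracked by the patch. */
enum sk_lm_type {
	SK_PTE,
	SK_CONT_PTE,
	SK_PMD,
	SK_CONT_PMD,
	SK_PUD,
	SK_NR_LM_TYPE,
};

/* Sizes for a 4K base page configuration (assumed for this sketch). */
#define SK_CONT_PTE_SIZE	(64UL << 10)
#define SK_PMD_SIZE		(2UL << 20)

/* Bytes currently mapped at each granularity. */
static unsigned long sk_lm_bytes[SK_NR_LM_TYPE];

static void sk_add(enum sk_lm_type type, unsigned long size)
{
	sk_lm_bytes[type] += size;
}

static void sk_sub(enum sk_lm_type type, unsigned long size)
{
	sk_lm_bytes[type] -= size;
}

/* Splitting one PMD block moves its bytes to CONT_PTE or PTE. */
static void sk_split_pmd(bool to_cont)
{
	sk_sub(SK_PMD, SK_PMD_SIZE);
	sk_add(to_cont ? SK_CONT_PTE : SK_PTE, SK_PMD_SIZE);
}

/* Splitting one contiguous-PTE run moves its bytes to PTE. */
static void sk_split_contpte(void)
{
	sk_sub(SK_CONT_PTE, SK_CONT_PTE_SIZE);
	sk_add(SK_PTE, SK_CONT_PTE_SIZE);
}

/* Value as it would be reported in kB. */
static unsigned long sk_kb(enum sk_lm_type type)
{
	return sk_lm_bytes[type] >> 10;
}

/* Sum over all granularities; a split must never change it. */
static unsigned long sk_total(void)
{
	unsigned long sum = 0;
	int i;

	for (i = 0; i < SK_NR_LM_TYPE; i++)
		sum += sk_lm_bytes[i];
	return sum;
}
```

The point of the model is the invariant: a split only moves bytes between counters, so the total stays constant, while hot-unplug is the only path that subtracts without adding.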
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index dd85e093ffdb..e6c1d591a4ef 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -29,6 +29,7 @@
#include <linux/mm_inline.h>
#include <linux/pagewalk.h>
#include <linux/stop_machine.h>
+#include <linux/proc_fs.h>
#include <asm/barrier.h>
#include <asm/cputype.h>
@@ -164,6 +165,82 @@ static void init_clear_pgtable(void *table)
dsb(ishst);
}
+enum lm_type {
+ PTE,
+ CONT_PTE,
+ PMD,
+ CONT_PMD,
+ PUD,
+ NR_LM_TYPE,
+};
+
+#ifdef CONFIG_PROC_FS
+static unsigned long lm_meminfo[NR_LM_TYPE];
+
+void arch_report_meminfo(struct seq_file *m)
+{
+ const char *size[NR_LM_TYPE];
+
+#if defined(CONFIG_ARM64_4K_PAGES)
+ size[PTE] = "4k";
+ size[CONT_PTE] = "64k";
+ size[PMD] = "2M";
+ size[CONT_PMD] = "32M";
+ size[PUD] = "1G";
+#elif defined(CONFIG_ARM64_16K_PAGES)
+ size[PTE] = "16k";
+ size[CONT_PTE] = "2M";
+ size[PMD] = "32M";
+ size[CONT_PMD] = "1G";
+ size[PUD] = "64G";
+#elif defined(CONFIG_ARM64_64K_PAGES)
+ size[PTE] = "64k";
+ size[CONT_PTE] = "2M";
+ size[PMD] = "512M";
+ size[CONT_PMD] = "16G";
+ size[PUD] = "4T";
+#endif
+
+ seq_printf(m, "DirectMap%s: %8lu kB\n",
+ size[PTE], lm_meminfo[PTE] >> 10);
+ seq_printf(m, "DirectMap%s: %8lu kB\n",
+ size[CONT_PTE],
+ lm_meminfo[CONT_PTE] >> 10);
+ seq_printf(m, "DirectMap%s: %8lu kB\n",
+ size[PMD], lm_meminfo[PMD] >> 10);
+ seq_printf(m, "DirectMap%s: %8lu kB\n",
+ size[CONT_PMD],
+ lm_meminfo[CONT_PMD] >> 10);
+ if (pud_sect_supported())
+ seq_printf(m, "DirectMap%s: %8lu kB\n",
+ size[PUD], lm_meminfo[PUD] >> 10);
+}
+
+static inline void lm_meminfo_add(unsigned long addr, unsigned long size,
+ enum lm_type type)
+{
+ if (__is_lm_address(addr))
+ lm_meminfo[type] += size;
+}
+
+static inline void lm_meminfo_sub(unsigned long addr, unsigned long size,
+ enum lm_type type)
+{
+ if (__is_lm_address(addr))
+ lm_meminfo[type] -= size;
+}
+#else
+static inline void lm_meminfo_add(unsigned long addr, unsigned long size,
+ enum lm_type type)
+{
+}
+
+static inline void lm_meminfo_sub(unsigned long addr, unsigned long size,
+ enum lm_type type)
+{
+}
+#endif
+
static void init_pte(pte_t *ptep, unsigned long addr, unsigned long end,
phys_addr_t phys, pgprot_t prot)
{
@@ -229,6 +306,11 @@ static int alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
init_pte(ptep, addr, next, phys, __prot);
+ if (pgprot_val(__prot) & PTE_CONT)
+ lm_meminfo_add(addr, (next - addr), CONT_PTE);
+ else
+ lm_meminfo_add(addr, (next - addr), PTE);
+
ptep += pte_index(next) - pte_index(addr);
phys += next - addr;
} while (addr = next, addr != end);
@@ -259,6 +341,10 @@ static int init_pmd(pmd_t *pmdp, unsigned long addr, unsigned long end,
(flags & NO_BLOCK_MAPPINGS) == 0) {
pmd_set_huge(pmdp, phys, prot);
+ if (pgprot_val(prot) & PTE_CONT)
+ lm_meminfo_add(addr, (next - addr), CONT_PMD);
+ else
+ lm_meminfo_add(addr, (next - addr), PMD);
/*
* After the PMD entry has been populated once, we
* only allow updates to the permission attributes.
@@ -382,6 +468,7 @@ static int alloc_init_pud(p4d_t *p4dp, unsigned long addr, unsigned long end,
(flags & NO_BLOCK_MAPPINGS) == 0) {
pud_set_huge(pudp, phys, prot);
+ lm_meminfo_add(addr, (next - addr), PUD);
/*
* After the PUD entry has been populated once, we
* only allow updates to the permission attributes.
@@ -571,16 +658,21 @@ pgd_pgtable_alloc_special_mm(enum pgtable_level pgtable_level)
return __pgd_pgtable_alloc(NULL, GFP_PGTABLE_KERNEL, pgtable_level);
}
-static void split_contpte(pte_t *ptep)
+static void split_contpte(unsigned long addr, pte_t *ptep)
{
int i;
+ lm_meminfo_sub(addr, CONT_PTE_SIZE, CONT_PTE);
+
ptep = PTR_ALIGN_DOWN(ptep, sizeof(*ptep) * CONT_PTES);
for (i = 0; i < CONT_PTES; i++, ptep++)
__set_pte(ptep, pte_mknoncont(__ptep_get(ptep)));
+
+ lm_meminfo_add(addr, CONT_PTE_SIZE, PTE);
}
-static int split_pmd(pmd_t *pmdp, pmd_t pmd, gfp_t gfp, bool to_cont)
+static int split_pmd(unsigned long addr, pmd_t *pmdp, pmd_t pmd, gfp_t gfp,
+ bool to_cont)
{
pmdval_t tableprot = PMD_TYPE_TABLE | PMD_TABLE_UXN | PMD_TABLE_AF;
unsigned long pfn = pmd_pfn(pmd);
@@ -604,8 +696,13 @@ static int split_pmd(pmd_t *pmdp, pmd_t pmd, gfp_t gfp, bool to_cont)
if (to_cont)
prot = __pgprot(pgprot_val(prot) | PTE_CONT);
+ lm_meminfo_sub(addr, PMD_SIZE, PMD);
for (i = 0; i < PTRS_PER_PTE; i++, ptep++, pfn++)
__set_pte(ptep, pfn_pte(pfn, prot));
+ if (to_cont)
+ lm_meminfo_add(addr, PMD_SIZE, CONT_PTE);
+ else
+ lm_meminfo_add(addr, PMD_SIZE, PTE);
/*
* Ensure the pte entries are visible to the table walker by the time
@@ -617,16 +714,21 @@ static int split_pmd(pmd_t *pmdp, pmd_t pmd, gfp_t gfp, bool to_cont)
return 0;
}
-static void split_contpmd(pmd_t *pmdp)
+static void split_contpmd(unsigned long addr, pmd_t *pmdp)
{
int i;
+ lm_meminfo_sub(addr, CONT_PMD_SIZE, CONT_PMD);
+
pmdp = PTR_ALIGN_DOWN(pmdp, sizeof(*pmdp) * CONT_PMDS);
for (i = 0; i < CONT_PMDS; i++, pmdp++)
set_pmd(pmdp, pmd_mknoncont(pmdp_get(pmdp)));
+
+ lm_meminfo_add(addr, CONT_PMD_SIZE, PMD);
}
-static int split_pud(pud_t *pudp, pud_t pud, gfp_t gfp, bool to_cont)
+static int split_pud(unsigned long addr, pud_t *pudp, pud_t pud, gfp_t gfp,
+ bool to_cont)
{
pudval_t tableprot = PUD_TYPE_TABLE | PUD_TABLE_UXN | PUD_TABLE_AF;
unsigned int step = PMD_SIZE >> PAGE_SHIFT;
@@ -651,8 +753,13 @@ static int split_pud(pud_t *pudp, pud_t pud, gfp_t gfp, bool to_cont)
if (to_cont)
prot = __pgprot(pgprot_val(prot) | PTE_CONT);
+ lm_meminfo_sub(addr, PUD_SIZE, PUD);
for (i = 0; i < PTRS_PER_PMD; i++, pmdp++, pfn += step)
set_pmd(pmdp, pfn_pmd(pfn, prot));
+ if (to_cont)
+ lm_meminfo_add(addr, PUD_SIZE, CONT_PMD);
+ else
+ lm_meminfo_add(addr, PUD_SIZE, PMD);
/*
* Ensure the pmd entries are visible to the table walker by the time
@@ -707,7 +814,7 @@ static int split_kernel_leaf_mapping_locked(unsigned long addr)
if (!pud_present(pud))
goto out;
if (pud_leaf(pud)) {
- ret = split_pud(pudp, pud, GFP_PGTABLE_KERNEL, true);
+ ret = split_pud(addr, pudp, pud, GFP_PGTABLE_KERNEL, true);
if (ret)
goto out;
}
@@ -725,14 +832,14 @@ static int split_kernel_leaf_mapping_locked(unsigned long addr)
goto out;
if (pmd_leaf(pmd)) {
if (pmd_cont(pmd))
- split_contpmd(pmdp);
+ split_contpmd(addr, pmdp);
/*
* PMD: If addr is PMD aligned then addr already describes a
* leaf boundary. Otherwise, split to contpte.
*/
if (ALIGN_DOWN(addr, PMD_SIZE) == addr)
goto out;
- ret = split_pmd(pmdp, pmd, GFP_PGTABLE_KERNEL, true);
+ ret = split_pmd(addr, pmdp, pmd, GFP_PGTABLE_KERNEL, true);
if (ret)
goto out;
}
@@ -749,7 +856,7 @@ static int split_kernel_leaf_mapping_locked(unsigned long addr)
if (!pte_present(pte))
goto out;
if (pte_cont(pte))
- split_contpte(ptep);
+ split_contpte(addr, ptep);
out:
return ret;
@@ -856,7 +963,7 @@ static int split_to_ptes_pud_entry(pud_t *pudp, unsigned long addr,
int ret = 0;
if (pud_leaf(pud))
- ret = split_pud(pudp, pud, gfp, false);
+ ret = split_pud(addr, pudp, pud, gfp, false);
return ret;
}
@@ -870,8 +977,8 @@ static int split_to_ptes_pmd_entry(pmd_t *pmdp, unsigned long addr,
if (pmd_leaf(pmd)) {
if (pmd_cont(pmd))
- split_contpmd(pmdp);
- ret = split_pmd(pmdp, pmd, gfp, false);
+ split_contpmd(addr, pmdp);
+ ret = split_pmd(addr, pmdp, pmd, gfp, false);
/*
* We have split the pmd directly to ptes so there is no need to
@@ -889,7 +996,7 @@ static int split_to_ptes_pte_entry(pte_t *ptep, unsigned long addr,
pte_t pte = __ptep_get(ptep);
if (pte_cont(pte))
- split_contpte(ptep);
+ split_contpte(addr, ptep);
return 0;
}
@@ -1463,20 +1570,20 @@ static bool pgtable_range_aligned(unsigned long start, unsigned long end,
return true;
}
-static void unmap_hotplug_pte_range(pmd_t *pmdp, unsigned long addr,
+static void unmap_hotplug_pte_range(pte_t *ptep, unsigned long addr,
unsigned long end, bool free_mapped,
struct vmem_altmap *altmap)
{
- pte_t *ptep, pte;
+ pte_t pte;
do {
- ptep = pte_offset_kernel(pmdp, addr);
pte = __ptep_get(ptep);
if (pte_none(pte))
continue;
WARN_ON(!pte_present(pte));
__pte_clear(&init_mm, addr, ptep);
+ lm_meminfo_sub(addr, PAGE_SIZE, PTE);
if (free_mapped) {
/* CONT blocks are not supported in the vmemmap */
WARN_ON(pte_cont(pte));
@@ -1485,19 +1592,39 @@ static void unmap_hotplug_pte_range(pmd_t *pmdp, unsigned long addr,
PAGE_SIZE, altmap);
}
/* unmap_hotplug_range() flushes TLB for !free_mapped */
- } while (addr += PAGE_SIZE, addr < end);
+ } while (ptep++, addr += PAGE_SIZE, addr < end);
}
-static void unmap_hotplug_pmd_range(pud_t *pudp, unsigned long addr,
+static void unmap_hotplug_cont_pte_range(pmd_t *pmdp, unsigned long addr,
+ unsigned long end, bool free_mapped,
+ struct vmem_altmap *altmap)
+{
+ unsigned long next;
+ pte_t *ptep, pte;
+
+ do {
+ next = pte_cont_addr_end(addr, end);
+ ptep = pte_offset_kernel(pmdp, addr);
+ pte = __ptep_get(ptep);
+
+ if (pte_present(pte) && pte_cont(pte)) {
+ lm_meminfo_sub(addr, CONT_PTE_SIZE, CONT_PTE);
+ lm_meminfo_add(addr, CONT_PTE_SIZE, PTE);
+ }
+
+ unmap_hotplug_pte_range(ptep, addr, next, free_mapped, altmap);
+ } while (addr = next, addr < end);
+}
+
+static void unmap_hotplug_pmd_range(pmd_t *pmdp, unsigned long addr,
unsigned long end, bool free_mapped,
struct vmem_altmap *altmap)
{
unsigned long next;
- pmd_t *pmdp, pmd;
+ pmd_t pmd;
do {
next = pmd_addr_end(addr, end);
- pmdp = pmd_offset(pudp, addr);
pmd = READ_ONCE(*pmdp);
if (pmd_none(pmd))
continue;
@@ -1505,6 +1632,7 @@ static void unmap_hotplug_pmd_range(pud_t *pudp, unsigned long addr,
WARN_ON(!pmd_present(pmd));
if (pmd_leaf(pmd)) {
pmd_clear(pmdp);
+ lm_meminfo_sub(addr, PMD_SIZE, PMD);
if (free_mapped) {
/* CONT blocks are not supported in the vmemmap */
WARN_ON(pmd_cont(pmd));
@@ -1516,7 +1644,28 @@ static void unmap_hotplug_pmd_range(pud_t *pudp, unsigned long addr,
continue;
}
WARN_ON(!pmd_table(pmd));
- unmap_hotplug_pte_range(pmdp, addr, next, free_mapped, altmap);
+ unmap_hotplug_cont_pte_range(pmdp, addr, next, free_mapped, altmap);
+ } while (pmdp++, addr = next, addr < end);
+}
+
+static void unmap_hotplug_cont_pmd_range(pud_t *pudp, unsigned long addr,
+ unsigned long end, bool free_mapped,
+ struct vmem_altmap *altmap)
+{
+ unsigned long next;
+ pmd_t *pmdp, pmd;
+
+ do {
+ next = pmd_cont_addr_end(addr, end);
+ pmdp = pmd_offset(pudp, addr);
+ pmd = READ_ONCE(*pmdp);
+
+ if (pmd_leaf(pmd) && pmd_cont(pmd)) {
+ lm_meminfo_sub(addr, CONT_PMD_SIZE, CONT_PMD);
+ lm_meminfo_add(addr, CONT_PMD_SIZE, PMD);
+ }
+
+ unmap_hotplug_pmd_range(pmdp, addr, next, free_mapped, altmap);
} while (addr = next, addr < end);
}
@@ -1537,6 +1686,7 @@ static void unmap_hotplug_pud_range(p4d_t *p4dp, unsigned long addr,
WARN_ON(!pud_present(pud));
if (pud_leaf(pud)) {
pud_clear(pudp);
+ lm_meminfo_sub(addr, PUD_SIZE, PUD);
if (free_mapped) {
flush_tlb_kernel_range(addr, addr + PUD_SIZE);
free_hotplug_page_range(pud_page(pud),
@@ -1546,7 +1696,7 @@ static void unmap_hotplug_pud_range(p4d_t *p4dp, unsigned long addr,
continue;
}
WARN_ON(!pud_table(pud));
- unmap_hotplug_pmd_range(pudp, addr, next, free_mapped, altmap);
+ unmap_hotplug_cont_pmd_range(pudp, addr, next, free_mapped, altmap);
} while (addr = next, addr < end);
}
--
2.47.0
* Re: [PATCH v2 1/4] dt-bindings: display: verisilicon, dc: generalize for single-output variants
From: Conor Dooley @ 2026-05-19 16:47 UTC
To: Icenowy Zheng
Cc: Joey Lu, maarten.lankhorst, mripard, tzimmermann, airlied, simona,
robh, krzk+dt, conor+dt, ychuang3, schung, yclu4, dri-devel,
devicetree, linux-arm-kernel, linux-kernel
In-Reply-To: <a66cc60fe163167e30e42f0b4be996cae1170a5e.camel@iscas.ac.cn>
On Tue, May 19, 2026 at 03:26:58PM +0800, Icenowy Zheng wrote:
> On Tue, 2026-05-19 at 13:51 +0800, Joey Lu wrote:
> > The existing schema assumes a fixed clock/reset topology and dual-
> > output
> > port structure matching the DC8200 IP block. This prevents reuse for
> > single-output variants such as the Verisilicon DCU Lite used in the
> > Nuvoton MA35D1 SoC.
> >
> > Rework the schema so that variant-specific constraints are expressed
> > via allOf/if-then-else:
> >
> > - The thead,th1520-dc8200 compatible keeps its existing five-clock,
> > three-reset, dual-port requirements.
> >
> > - A standalone verisilicon,dc compatible covers IPs whose identity is
> > discovered entirely through hardware registers; these have flexible
> > clock and reset counts, a single 'port' property, and no 'ports'
> > requirement.
> >
> > Changes to the base schema:
> > - Replace the fixed clock/reset items lists with minItems/maxItems
> > ranges; variant sub-schemas tighten the constraints via if-then-
> > else.
> > - Add a 'port' property (graph.yaml single-port alias) alongside the
> > existing 'ports', for single-output variants.
> > - Drop the unconditional 'ports' requirement; each if-branch enforces
> > its own port topology.
> > - Tighten additionalProperties to unevaluatedProperties to allow
> > per-variant schemas to add their own constraints cleanly.
> > - Fix a stray space in the port@0 description.
> > - Add a DT example for the generic verisilicon,dc compatible
> > (Nuvoton MA35D1 DCU Lite).
> >
> > Signed-off-by: Joey Lu <a0987203069@gmail.com>
> > ---
> > .../bindings/display/verisilicon,dc.yaml | 135 ++++++++++++++--
> > --
> > 1 file changed, 108 insertions(+), 27 deletions(-)
> >
> > diff --git
> > a/Documentation/devicetree/bindings/display/verisilicon,dc.yaml
> > b/Documentation/devicetree/bindings/display/verisilicon,dc.yaml
> > index 9dc35ab973f2..3a814c2e083e 100644
> > --- a/Documentation/devicetree/bindings/display/verisilicon,dc.yaml
> > +++ b/Documentation/devicetree/bindings/display/verisilicon,dc.yaml
> > @@ -14,10 +14,12 @@ properties:
> > pattern: "^display@[0-9a-f]+$"
> >
> > compatible:
> > - items:
> > - - enum:
> > - - thead,th1520-dc8200
>
> You should add a fallback compatible here for your SoC, in case its
> integration gets something quirky; this compatible is usually not
> consumed by the driver (see how thead,th1520-dc8200 exists in the
> binding but not the driver).
s/fallback compatible/soc-specific compatible/, but yes.
NAK to what's been done here, especially after the discussions on
earlier versions of this verisilicon binding.
pw-bot: changes-requested
>
> > - - const: verisilicon,dc # DC IPs have discoverable ID/revision
> > registers
> > + oneOf:
> > + - items:
> > + - enum:
> > + - thead,th1520-dc8200
> > + - const: verisilicon,dc
> > + - const: verisilicon,dc # DC IPs have discoverable
> > ID/revision registers
> >
> > reg:
> > maxItems: 1
> > @@ -26,32 +28,24 @@ properties:
> > maxItems: 1
> >
> > clocks:
> > - items:
> > - - description: DC Core clock
> > - - description: DMA AXI bus clock
> > - - description: Configuration AHB bus clock
> > - - description: Pixel clock of output 0
> > - - description: Pixel clock of output 1
> > + minItems: 2
> > + maxItems: 5
> >
> > clock-names:
> > - items:
> > - - const: core
> > - - const: axi
> > - - const: ahb
> > - - const: pix0
> > - - const: pix1
> > + minItems: 2
> > + maxItems: 5
> >
> > resets:
> > - items:
> > - - description: DC Core reset
> > - - description: DMA AXI bus reset
> > - - description: Configuration AHB bus reset
> > + minItems: 1
> > + maxItems: 3
> >
> > reset-names:
> > - items:
> > - - const: core
> > - - const: axi
> > - - const: ahb
> > + minItems: 1
> > + maxItems: 3
> > +
> > + port:
> > + $ref: /schemas/graph.yaml#/properties/port
> > + description: Single video output port for single-output
> > variants.
>
> Maybe the endpoint numbering rule needs a move to here? (I am not very
> sure).
>
> >
> > ports:
> > $ref: /schemas/graph.yaml#/properties/ports
> > @@ -59,7 +53,7 @@ properties:
> > properties:
> > port@0:
> > $ref: /schemas/graph.yaml#/properties/port
> > - description: The first output channel , endpoint 0 should be
> > + description: The first output channel, endpoint 0 should be
> > used for DPI format output and endpoint 1 should be used
> > for DP format output.
> >
> > @@ -75,9 +69,75 @@ required:
> > - interrupts
> > - clocks
> > - clock-names
> > - - ports
> >
> > -additionalProperties: false
> > +allOf:
> > + - if:
> > + properties:
> > + compatible:
> > + contains:
> > + const: thead,th1520-dc8200
> > + then:
> > + properties:
> > + clocks:
> > + items:
> > + - description: DC Core clock
> > + - description: DMA AXI bus clock
> > + - description: Configuration AHB bus clock
> > + - description: Pixel clock of output 0
> > + - description: Pixel clock of output 1
> > +
> > + clock-names:
> > + items:
> > + - const: core
> > + - const: axi
> > + - const: ahb
> > + - const: pix0
> > + - const: pix1
> > +
> > + resets:
> > + items:
> > + - description: DC Core reset
> > + - description: DMA AXI bus reset
> > + - description: Configuration AHB bus reset
> > +
> > + reset-names:
> > + items:
> > + - const: core
> > + - const: axi
> > + - const: ahb
> > +
> > + required:
> > + - ports
> > +
> > + else:
> > + properties:
> > + clocks:
> > + items:
> > + - description: Bus clock that gates register access
> > + - description: Pixel clock divider for display timing
>
> Please don't make compatible-specific description strings for
> individual compatibles, and keep these descriptions outside of the if.
> The compatible-specific part should be used to specify what's required
> for the specific SoC, for dt validation purpose.
>
> BTW if the clock is both the working clock and bus clock for the
> controller, I suggest listing it twice, except if the IP core is
> provided without a dedicated core clock (in the case I suggest to use
> "bus" only).
I agree. If the same clock is provided to two+ ports on the IP, that
should still be two+ clocks in the devicetree.
>
> Here's an example for "listing it twice":
> ```
> clocks = <&clk DCU_GATE>, <&clk DCU_GATE>, <&clk DCUP_DIV>;
> clock-names = "core", "bus", "pix0";
> ```
>
> Well nonetheless the name "core" does not match the description "Bus
> clock that gates register access".
>
> Thanks,
> Icenowy
>
> > +
> > + clock-names:
> > + items:
> > + - const: core
> > + - const: pix0
> > +
> > + resets:
> > + maxItems: 1
> > + description:
> > + Reset line for the display controller.
> > +
> > + reset-names:
> > + items:
> > + - const: core
> > +
> > + required:
> > + - port
> > +
> > + not:
> > + required:
> > + - ports
> > +
> > +unevaluatedProperties: false
> >
> > examples:
> > - |
> > @@ -120,3 +180,24 @@ examples:
> > };
> > };
> > };
> > +
> > + - |
> > + #include <dt-bindings/interrupt-controller/arm-gic.h>
> > + #include <dt-bindings/clock/nuvoton,ma35d1-clk.h>
> > + #include <dt-bindings/reset/nuvoton,ma35d1-reset.h>
> > +
> > + display@40260000 {
> > + compatible = "verisilicon,dc";
> > + reg = <0x40260000 0x20000>;
> > + interrupts = <GIC_SPI 20 IRQ_TYPE_LEVEL_HIGH>;
> > + clocks = <&clk DCU_GATE>, <&clk DCUP_DIV>;
> > + clock-names = "core", "pix0";
> > + resets = <&sys MA35D1_RESET_DISP>;
> > + reset-names = "core";
> > +
> > + port {
> > + dpi_out: endpoint {
> > + remote-endpoint = <&panel_in>;
> > + };
> > + };
> > + };
^ permalink raw reply
* Re: [PATCH V2 00/11] soc: ti: keystone/k3 navigator queue/dma/ringacc cleanups
From: Hari Prasath G E @ 2026-05-19 16:47 UTC (permalink / raw)
To: Nishanth Menon, Justin Stitt, Bill Wendling, Nick Desaulniers,
Nathan Chancellor, Santosh Shilimkar
Cc: afd, llvm, linux-arm-kernel, linux-kernel
In-Reply-To: <20260512170623.3174416-1-nm@ti.com>
On 5/12/2026 10:36 PM, Nishanth Menon wrote:
> Fix W=2 (clang/gcc), sparse, smatch and coccinelle warnings.
> No functional changes.
>
> Tested: NFS boot (via nav subsystem) on k2l-evm, k2hk-evm and k2g-evm
> based on next-20260507:
> https://gist.github.com/nmenon/cff02a5f2a72fde5fcb49664fcc834d2
>
> Changes since V1:
> - update for review comments.
> - Picked up Randy's and Andrew's tags in appropriate patches.
>
> V1: https://lore.kernel.org/all/20260508153211.3688277-1-nm@ti.com/
For the whole series,
Reviewed-by: Hari Prasath Gujulan Elango <gehariprasath@ti.com>
(I did see that a few patches have already picked up Reviewed-by tags)
Regards,
Hari
>
> Nishanth Menon (11):
> soc: ti: knav_qmss: Remove remaining redundant ENOMEM printks
> soc: ti: knav_qmss: Rename global kdev to knav_qdev to fix -Wshadow
> soc: ti: knav_qmss: Inline lockdep condition in for_each_handle_rcu
> soc: ti: knav_qmss: Fix kernel-doc Return: tags
> soc: ti: knav_qmss: Use %pe to print PTR_ERR()
> soc: ti: knav_qmss: Fix __iomem annotations and __be32 type
> soc: ti: knav_qmss_acc: Fix kernel-doc Return: tag
> soc: ti: knav_dma: Remove unused DMA_PRIO_MASK macro
> soc: ti: knav_dma: Remove dead check on unsigned args.args[0]
> soc: ti: knav_dma: Use IOMEM_ERR_PTR() in pktdma_get_regs()
> soc: ti: k3-ringacc: Use str_enabled_disabled() helper
>
> drivers/soc/ti/k3-ringacc.c | 3 +-
> drivers/soc/ti/knav_dma.c | 8 +-
> drivers/soc/ti/knav_qmss.h | 2 +-
> drivers/soc/ti/knav_qmss_acc.c | 2 +-
> drivers/soc/ti/knav_qmss_queue.c | 148 +++++++++++++++----------------
> 5 files changed, 75 insertions(+), 88 deletions(-)
>
^ permalink raw reply
* Re: [PATCH] arm64: probes: Handle probes on hinted conditional branch instructions
From: Catalin Marinas @ 2026-05-19 17:05 UTC (permalink / raw)
To: linux-arm-kernel, Vladimir Murzin; +Cc: Will Deacon
In-Reply-To: <20260515133729.112196-1-vladimir.murzin@arm.com>
On Fri, 15 May 2026 14:37:29 +0100, Vladimir Murzin wrote:
> BC.cond instructions introduced by FEAT_HBC cannot be executed
> out-of-line, like other branch instructions. However, they can be
> simulated in the same way as B.cond instructions.
>
> Extend the B.cond decoder mask to match BC.cond instructions as well,
> and handle them using the existing B.cond simulation path.
>
> [...]
Applied to arm64 (for-next/fixes), thanks!
[1/1] arm64: probes: Handle probes on hinted conditional branch instructions
https://git.kernel.org/arm64/c/2ccd8ff980b5
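As a rough illustration of what "extending the decoder mask" means (the diff
itself is elided above, so this is not the applied patch), here is a
userspace sketch. It assumes the A64 encodings B.cond = 01010100:imm19:0:cond
and BC.cond = 01010100:imm19:1:cond, i.e. that bit 4 is the only difference.
The macro and function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encodings: bits [31:24] = 0x54 for both, bit 4 = 0 for B.cond
 * and 1 for BC.cond. All names below are hypothetical. */
#define BCOND_VAL	0x54000000u
#define BCOND_MASK_OLD	0xff000010u	/* also requires bit 4 == 0 */
#define BCOND_MASK_NEW	0xff000000u	/* ignores bit 4 */

/* Matches B.cond only */
static bool matches_old(uint32_t insn)
{
	return (insn & BCOND_MASK_OLD) == BCOND_VAL;
}

/* Matches both B.cond and BC.cond */
static bool matches_new(uint32_t insn)
{
	return (insn & BCOND_MASK_NEW) == BCOND_VAL;
}

/* The condition field sits in bits [3:0] for both forms, which is why a
 * single simulation path can serve both. */
static unsigned int branch_cond(uint32_t insn)
{
	assert(matches_new(insn));
	return insn & 0xf;
}
```

With the wider mask, a word with bit 4 set is routed to the same handler as
a plain B.cond, while unrelated branches such as an unconditional B still
fail the match.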
^ permalink raw reply
* Re: [PATCH v6 00/19] dmaengine: ti: Add support for BCDMA v2 and PKTDMA v2
From: Vinod Koul @ 2026-05-19 17:06 UTC (permalink / raw)
To: Sai Sree Kartheek Adivi
Cc: peter.ujfalusi, robh, krzk+dt, conor+dt, nm, ssantosh, dmaengine,
devicetree, linux-kernel, linux-arm-kernel, vigneshr, Frank.li,
r-sharma3, gehariprasath
In-Reply-To: <20260428085202.1724548-1-s-adivi@ti.com>
On 28-04-26, 14:21, Sai Sree Kartheek Adivi wrote:
> This series adds support for the BCDMA_V2 and PKTDMA_V2 which is
> introduced in AM62L.
>
> The key differences between the existing DMA and DMA V2 are:
> - Absence of TISCI: Instead of configuring via TISCI calls, direct
> register writes are required.
> - Autopair: There is no longer a need for PSIL pair and instead AUTOPAIR
> bit needs to set in the RT_CTL register.
> - Static channel mapping: Each channel is mapped to a single peripheral.
> - Direct IRQs: There is no INT-A and interrupt lines from DMA are
> directly connected to GIC.
> - Remote side configuration handled by DMA. So no need to write to PEER
> registers to START / STOP / PAUSE / TEARDOWN.
> - Unified Channel Space: Tx and Rx channels share a single register
> space. Each channel index is specifically fixed in hardware as either
> Tx or Rx in an interleaved manner.
Please check the comments from Sashiko https://sashiko.dev/#/patchset/20260428085202.1724548-1-s-adivi%40ti.com
--
~Vinod
^ permalink raw reply
* Re: [PATCH 4/8] drm/panthor: Add support for protected memory allocation in panthor
From: Chia-I Wu @ 2026-05-19 17:07 UTC (permalink / raw)
To: Ketil Johnsen
Cc: Boris Brezillon, Liviu Dudau, Marcin Ślusarz, David Airlie,
Simona Vetter, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, Jonathan Corbet, Shuah Khan, Sumit Semwal,
Benjamin Gaignard, Brian Starkey, John Stultz, T.J. Mercier,
Christian König, Steven Price, Daniel Almeida, Alice Ryhl,
Matthias Brugger, AngeloGioacchino Del Regno, dri-devel,
linux-doc, linux-kernel, linux-media, linaro-mm-sig,
linux-arm-kernel, linux-mediatek, Florent Tomasin, nd
In-Reply-To: <8f0b1750-a853-4895-9672-73a75f6dbd84@arm.com>
On Tue, May 19, 2026 at 1:49 AM Ketil Johnsen <ketil.johnsen@arm.com> wrote:
>
> On 19/05/2026 09:39, Boris Brezillon wrote:
> > On Mon, 18 May 2026 17:36:40 -0700
> > Chia-I Wu <olvaffe@gmail.com> wrote:
> >
> >> On Mon, May 18, 2026 at 12:16 AM Boris Brezillon
> >> <boris.brezillon@collabora.com> wrote:
> >>>
> >>> On Wed, 13 May 2026 12:31:32 -0700
> >>> Chia-I Wu <olvaffe@gmail.com> wrote:
> >>>
> >>>> On Tue, May 12, 2026 at 8:39 AM Liviu Dudau <liviu.dudau@arm.com> wrote:
> >>>>>
> >>>>> On Tue, May 12, 2026 at 04:11:11PM +0200, Boris Brezillon wrote:
> >>>>>> On Tue, 12 May 2026 14:47:27 +0100
> >>>>>> Liviu Dudau <liviu.dudau@arm.com> wrote:
> >>>>>>
> >>>>>>> On Thu, May 07, 2026 at 01:53:56PM +0200, Boris Brezillon wrote:
> >>>>>>>> On Thu, 7 May 2026 11:02:26 +0200
> >>>>>>>> Marcin Ślusarz <marcin.slusarz@arm.com> wrote:
> >>>>>>>>
> >>>>>>>>> On Tue, May 05, 2026 at 06:15:23PM +0200, Boris Brezillon wrote:
> >>>>>>>>>>> @@ -277,9 +286,21 @@ int panthor_device_init(struct panthor_device *ptdev)
> >>>>>>>>>>> return ret;
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> + /* If a protected heap name is specified but not found, defer the probe until created */
> >>>>>>>>>>> + if (protected_heap_name && strlen(protected_heap_name)) {
> >>>>>>>>>>
> >>>>>>>>>> Do we really need this strlen() > 0? Won't dma_heap_find() fail is the
> >>>>>>>>>> name is "" already?
> >>>>>>>>>
> >>>>>>>>> If dma_heap_find() will fail, then the whole probe with fail too.
> >>>>>>>>> This check prevents that.
> >>>>>>>>
> >>>>>>>> Yeah, that's also a questionable design choice. I mean, we can
> >>>>>>>> currently probe and boot the FW even though we never setup the
> >>>>>>>> protected FW sections, so why should we defer the probe here? Can't we
> >>>>>>>> just retry the next time a group with the protected bit is created and
> >>>>>>>> fail if we can find a protected heap?
> >>>>>>>
> >>>>>>> The problem we have with the current firmware is that it does a number of setup steps at "boot"
> >>>>>>> time only. One of the steps is preparing its internal structures for when it enters protected
> >>>>>>> mode and it stores them in the buffer passed in at firmware loading. We cannot later run the
> >>>>>>> process when we have a group with protected mode set.
> >>>>>>
> >>>>>> No, but we can force a full/slow reset and have that thing
> >>>>>> re-initialized, can't we? I mean, that's basically what we do when a
> >>>>>> fast reset fails: we re-initialize all the sections and reset again, at
> >>>>>> which point the FW should start from a fresh state, and be able to
> >>>>>> properly initialize the protected-related stuff if protected sections
> >>>>>> are populated. Am I missing something?
> >>>>>
> >>>>> Right, we can do that. For some reason I keep associating the reset with the
> >>>>> error handling and not with "normal" operations.
> >>>> I kind of hope we end up with either
> >>>>
> >>>> - panthor knows the exact heap to use and fails with EPROBE_DEFER if
> >>>> the heap is missing, or
> >>>> - panthor gets a dma-buf from userspace and does the full reset
> >>>> - userspace also needs to provide a dma-buf for each protected
> >>>> group for the suspend buffer
> >>>>
> >>>> than something in-between. The latter is more ad-hoc and basically
> >>>> kicks the issue to the userspace.
> >>>
> >>> Indeed, the second option is more ad-hoc, but when you think about it,
> >>> userspace has to have this knowledge, because it needs to know the
> >>> dma-heap to use for buffer allocation that cross a device boundary
> >>> anyway. Think about frames produced by a video decoder, and composited
> >>> by the GPU into a protected scanout buffer that's passed to the KMS
> >>> device. Why would the GPU driver be source of truth when it comes to
> >>> choosing the heap to use to allocate protected buffers for the video
> >>> decoder or those used for the display?
> >> I don't think the GPU driver is ever the source of truth. If the
> >> system integrator wants to specify the source of truth (SoT) from
> >> kernel space, they should use the device tree (or module params /
> >> config options). If they want to specify the SoT in userspace, then we
> >> don't really care how it is done other than providing an ioctl.
> >> Panthor is always on the receiving end.
> >
> > Okay, we're on the same page then.
> >
> >>
> >> If we don't want to delay this functionality, but it takes time to
> >> converge on SoT, maybe a solution that is not a long-term promise can
> >> work? Of the options on the table (dt, module params, kconfig options,
> >> ioctls), a kconfig option, potentially marked as experimental, seems
> >> like a good candidate.
> >
> > If Panthor is only a consumer, I actually think it'd be easier to just
> > let userspace pass the protected FW section as an imported buffer
> > through an ioctl for now. It means we don't need any of the
> > modifications to the dma_heap API in this series, and userspace is free
> > to choose its SoT (efuse, DT, ...) and pass the info back to mesa/GBM
> > somehow (envvar, driconf, ...). The only thing we need to ensure is if
> > lazy protected FW section allocation is going to work, but given the
> > current code purely and simply ignores those sections, and the FW is
> > still able to boot and act properly (at least on v10-v13), I'm pretty
> > confident this is okay, unless there's some trick the MCU can do to
> > detect that the protected section isn't mapped (which I doubt, because
> > the MCU doesn't know it lives behind an MMU).
I set up the MMU to map non-protected memory to the protected section the
other day. The FW still booted fine. I didn't get an access violation
until the FW executed PROT_REGION and panthor requested
GLB_PROTM_ENTER in response.
This was on v13, but I also doubt it will become an issue. Can ARM help clarify?
> >
> > Of course, once we have a consensus on how to describe this in the DT,
> > we can switch Panthor over to "protected dma_heap selection through DT",
> > and reflect that through the ioctl that exposes whether protected
> > support is ready or not (would be a DEV_QUERY), such that userspace can
> > skip this "PROTM initialization" step.
> >
> > We're talking about an extra ioctl to set those buffers, and a
> > DEV_QUERY to query the state (ready or not), the size of the global
> > protected buffer (protected FW section) and the size of the protected
> > suspend buffer. The protected suspend buffer would be allocated and
> > passed at group creation time (extra arg passed to the existing
> > GROUP_CREATE ioctl). So, overall, I don't consider it a huge liability
> > in term of maintenance cost.
>
> If we can avoid the dma-heap changes, then that would surely help!
> I can try to implement this in the next version unless someone finds a
> reason why it is a bad idea.
Yeah, that sounds good to me too.
Will the extra ioctl require root? On a system with true protected
memory, the FW cannot write to non-protected memory. It seems ok to
allow any client to make the ioctl call. But on systems without true
protected memory, it can be problematic.
>
> >>>> For the former, expressing the relation in DT seems to be the best,
> >>>> but only if possible :-). Otherwise, a kconfig option (instead of
> >>>> module param) should be easier to work with.
> >>>>
> >>>> Looking at the userspace implementation, can we also have an panthor
> >>>> ioctl to return the heap to userspace?
> >>>
> >>> Yes, it's something we can add, but again, I'm questioning the
> >>> usefulness of this: how can we ensure the heap used by panthor to
> >>> allocate its protected FW buffers is suitable for scanout buffers
> >>> (buffers that can be used by display drivers). There needs to be a glue
> >>> leaving in usersland and taking the decision, and I'm not too sure
> >>> trusting any of the component in the chain (vdec, gpu, display) is the
> >>> right thing to do.
> >> The heap returned by panthor is only for panfrost/panvk. It says
> >> nothing about compatibility with other components on the system.
> >
> > Okay, if it's used only for internal buffers, I guess that's fine.
>
> --
> Ketil
^ permalink raw reply
* Re: [PATCH v5 1/6] iommu/arm-smmu-v3: Add arm_smmu_kdump_adopt_strtab() for kdump
From: Jason Gunthorpe @ 2026-05-19 17:10 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, kevin.tian, joro, praan, kees, baolu.lu,
miko.lenczewski, smostafa, linux-arm-kernel, iommu, linux-kernel,
stable, jamien
In-Reply-To: <0582326eeadd4ae2b16fd4914e9bd46da5a251d3.1778416609.git.nicolinc@nvidia.com>
On Sun, May 10, 2026 at 02:23:00PM -0700, Nicolin Chen wrote:
> +#include <linux/dma-direct.h>
Nope, never do this, it is an internal header.
> +/*
> + * Adopting the crashed kernel's stream table has risks: the physical addresses
> + * read from ARM_SMMU_STRTAB_BASE / L1 descriptors may be corrupted. Reject any
> + * range that overlaps the kdump kernel's critical regions.
> + */
> +static bool arm_smmu_kdump_phys_is_corrupted(phys_addr_t base, size_t size)
> +{
> + /*
> + * On arm64 kdump, iomem_resource entries are typically:
> + * ------------------------------------------------------------
> + * | Entry | IORESOURCE_ Flags | IORES_DESC_ Desc |
> + * ------------------------------------------------------------
> + * | System RAM | MEM + BUSY + SYSRAM | NONE |
> + * | MMIO regions | MEM + BUSY | NONE |
> + * | Reserved memory | MEM | NONE |
> + * ------------------------------------------------------------
> + *
> + * Test and reject any overlap with MEM + BUSY, covering/excluding:
> + * + System RAM: silent corruption of kdump kernel's own memory
> + * + MMIO regions: fatal SError on cacheable speculative access
> + * - Reserved memory: crashed kernel's stream table might reside
> + */
> + if (region_intersects(base, size, IORESOURCE_MEM | IORESOURCE_BUSY,
> + IORES_DESC_NONE) != REGION_DISJOINT)
> + return true;
> +
> + /*
> + * Note: physical holes are absent from iomem_resource, so a corrupted
> + * address pointing into one will not be caught here. Closing that gap
> + * requires a firmware memory map and is left as a future improvement.
> + */
> + return false;
> +}
Something like this should not be in the smmu driver, this is some
core kdump code. I'd drop it, I don't see other drivers doing this?
> +static int arm_smmu_kdump_adopt_l2_strtab(struct arm_smmu_device *smmu, u32 sid,
> + u32 l1_idx, u64 l2_dma, u32 span,
> + struct arm_smmu_strtab_l2 **l2table)
> +{
> + phys_addr_t base = dma_to_phys(smmu->dev, l2_dma);
The thing stored in the L2PTR is a *phys*, the HW doesn't support any
kind of translation. When using dma_alloc_coherent we never get a phys
so it uses the dma_addr_t and assumes it is == phys.
But on this flow this is *phys* and should remain phys. Never touch
dma_addr_t.
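A minimal userspace sketch of that idea, under stated assumptions:
phys_addr_t is a stand-in typedef, the mask assumes the L2Ptr field occupies
bits [51:6] of the L1 descriptor, and l1_desc_to_l2_phys() is a hypothetical
helper, not something from the patch. The endianness conversion done by
le64_to_cpu() in the driver is left out.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* userspace stand-in for the kernel type */

/* Assumption: L2Ptr lives in bits [51:6] of the L1 descriptor */
#define L1_DESC_L2PTR_MASK	(((UINT64_C(1) << 46) - 1) << 6)

/*
 * Extract the L2 table address from an L1 descriptor. The result is kept
 * as a physical address from start to finish; there is no dma_addr_t and
 * no dma_to_phys() step anywhere in the path.
 */
static phys_addr_t l1_desc_to_l2_phys(uint64_t l1_desc)
{
	phys_addr_t phys = (phys_addr_t)(l1_desc & L1_DESC_L2PTR_MASK);

	/* Masking off bits [5:0] leaves a 64-byte aligned address */
	assert((phys & 0x3f) == 0);
	return phys;
}
```

The value can then be handed directly to whatever maps the table, with the
type telling the reader which address space it belongs to.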
> + struct arm_smmu_strtab_l2 *table;
> + size_t size;
> +
> + /*
> + * Only a coherent SMMU is supported at this moment. For a non-coherent
> + * SMMU that wants to support ARM_SMMU_OPT_KDUMP_ADOPT, try MEMREMAP_WC.
> + */
> + if (WARN_ON(!(smmu->features & ARM_SMMU_FEAT_COHERENCY)))
> + return -EOPNOTSUPP;
> +
> + /*
> + * Retest the memremap inputs in case the L1 descriptor was overwritten
> + * since adopt. Reject this master's insert; panic or SMMU-disable would
> + * either lose the vmcore or cascade aborts. Do not try to fix it, as it
> + * would break all other SIDs in the same bus (PCI case). The corruption
> + * blast radius is already bounded to that bus range.
> + */
> + if (span != STRTAB_SPLIT + 1) {
> + dev_err(smmu->dev,
> + "kdump: L1[%u] span %u changed since adopt (was %u)\n",
> + l1_idx, span, STRTAB_SPLIT + 1);
> + return -EINVAL;
> + }
> static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
> {
> dma_addr_t l2ptr_dma;
> struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> struct arm_smmu_strtab_l2 **l2table;
> + u32 l1_idx = arm_smmu_strtab_l1_idx(sid);
>
> - l2table = &cfg->l2.l2ptrs[arm_smmu_strtab_l1_idx(sid)];
> + l2table = &cfg->l2.l2ptrs[l1_idx];
> if (*l2table)
> return 0;
>
> + /* Deferred adoption of the crashed kernel's L2 table */
> + if (smmu->options & ARM_SMMU_OPT_KDUMP_ADOPT) {
> + u64 l2ptr = le64_to_cpu(cfg->l2.l1tab[l1_idx].l2ptr);
> + dma_addr_t l2_dma = l2ptr & STRTAB_L1_DESC_L2PTR_MASK;
Like here, this should be phys_addr_t
> +static int arm_smmu_kdump_adopt_strtab_2lvl(struct arm_smmu_device *smmu,
> + u32 cfg_reg, dma_addr_t dma)
Same issues with dma_addr_t
> +static int arm_smmu_kdump_adopt_strtab_linear(struct arm_smmu_device *smmu,
> + u32 cfg_reg, dma_addr_t dma)
> +{
Same issues with dma_addr_t
> +static void arm_smmu_kdump_adopt_cleanup(struct arm_smmu_device *smmu, u32 fmt)
> +{
> + struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> +
> + if (fmt == STRTAB_BASE_CFG_FMT_2LVL) {
> + if (cfg->l2.l2ptrs)
> + devm_kfree(smmu->dev, cfg->l2.l2ptrs);
> + if (!IS_ERR_OR_NULL(cfg->l2.l1tab))
> + devm_memunmap(smmu->dev, cfg->l2.l1tab);
> + } else if (fmt == STRTAB_BASE_CFG_FMT_LINEAR) {
> + if (!IS_ERR_OR_NULL(cfg->linear.table))
> + devm_memunmap(smmu->dev, cfg->linear.table);
> + }
> +}
If we have a cleanup function why is it using devm? Call the cleanup
function during remove too?
Jason
^ permalink raw reply
* Re: [PATCH 1/8] mm: Add ptep_try_install() for lockless empty-slot installs
From: Tejun Heo @ 2026-05-19 17:11 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: David Vernet, Andrea Righi, Changwoo Min, Alexei Starovoitov,
Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau,
Kumar Kartikeya Dwivedi, Catalin Marinas, Will Deacon,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Andrew Morton, Mike Rapoport, Emil Tsalapatis, sched-ext, bpf,
x86, linux-arm-kernel, linux-mm, linux-kernel
In-Reply-To: <5590fd3d-dae1-4070-b52f-bc40982ac678@kernel.org>
On Tue, May 19, 2026 at 11:40:48AM +0200, David Hildenbrand (Arm) wrote:
> On 5/19/26 11:05, David Hildenbrand (Arm) wrote:
...
> >> The only requirements are that the kernel doesn't oops and the
> >> violation gets caught. Beyond that, behavior at the address is
> >> unspecified, and which installer wins the race doesn't matter as
> >> long as kernel integrity holds.
> >
> > You'll have inconsistent TLB state.
Wouldn't it still be either/or, with both cases being okay?
> > I really don't like that approach.
> >
> > We should really try to just take the lock, and remove any code under the lock
> > that could trigger such unpleasant deadlocks.
> >
> > Is that feasible?
>
> ... or can we run into similar problems with kprobes? (I am obviously no bpf
> expert ...)
Yeah, I mean, that was just the first TP I found scanning the code. Any
kprobes or other TPs in the path would behave the same.
When this fault triggers, the BPF program has already malfunctioned, so it's
not going to be a high frequency path and performance isn't a primary
consideration. So, anything that can ensure that the kernel doesn't crash or
lock up would be fine. Any better ideas?
Thanks.
--
tejun
^ permalink raw reply
* Re: [PATCH] spi: aspeed: Replace VLA parameter with flat pointer in calibration helper
From: David Laight @ 2026-05-19 17:13 UTC (permalink / raw)
To: Mark Brown
Cc: Chin-Ting Kuo, clg, joel, andrew, linux-aspeed, openbmc,
linux-spi, linux-arm-kernel, linux-kernel, BMC-SW,
kernel test robot
In-Reply-To: <659a6593-0223-4a26-830b-1390326b84e5@sirena.org.uk>
On Tue, 19 May 2026 12:03:51 +0100
Mark Brown <broonie@kernel.org> wrote:
> On Mon, May 18, 2026 at 05:57:08PM +0800, Chin-Ting Kuo wrote:
>
> > - while (k < cols && buf[i][k])
> > + while (k < cols && buf[i * cols + k])
>
> This really needs () to make it clear what's going on; the precedence is
> well defined but not everyone is going to know that off the top of their
> head.
Come on, it's multiply and add - everyone is going to get that right.
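For anyone following along, a small standalone sketch of the flat indexing
being discussed. The loop condition is the one from the patch; run_length(),
its parameters and demo_buf are made up for the example and are not from the
driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 2x4 table stored row-major in a flat array (hypothetical test data) */
static const uint8_t demo_buf[2 * 4] = {
	1, 1, 0, 1,
	0, 1, 1, 1,
};

/*
 * Count consecutive non-zero entries in row i, starting at column k.
 * buf[i * cols + k] addresses the same element that buf[i][k] would for a
 * rows x cols array, since '*' binds tighter than '+'.
 */
static size_t run_length(const uint8_t *buf, size_t rows, size_t cols,
			 size_t i, size_t k)
{
	size_t start = k;

	assert(i < rows);
	while (k < cols && buf[i * cols + k])
		k++;
	return k - start;
}
```

The k < cols bound is what keeps the scan from running into the next row
once the array is flattened.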
-- David
^ permalink raw reply
* Re: [PATCH v7 0/4] PCI: Add support for resetting the Root Ports in a platform specific way
From: Niklas Cassel @ 2026-05-19 17:19 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: manivannan.sadhasivam, Bjorn Helgaas, Mahesh J Salgaonkar,
Oliver O'Halloran, Will Deacon, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Heiko Stuebner,
Philipp Zabel, linux-pci, linux-kernel, linuxppc-dev,
linux-arm-kernel, linux-arm-msm, linux-rockchip, Wilfred Mallawa,
Krishna Chaitanya Chundru, Lukas Wunner, Richard Zhu,
Brian Norris, Wilson Ding, Frank Li
In-Reply-To: <tgsh3cum6qxrqjzbdeqjsp6bf7cqedj7il77hww3oxecadndin@idjnwib7cz4z>
Hello Mani,
On Mon, May 18, 2026 at 11:51:56AM +0530, Manivannan Sadhasivam wrote:
> >
> > With the patch above. There is zero difference before/after reset, and all
> > the BAR tests pass. However, MSI/MSI-X tests still fail with:
> >
> > # pci_endpoint_test.c:143:MSI_TEST:Expected 0 (0) == ret (-110)
> > # pci_endpoint_test.c:143:MSI_TEST:Test failed for MSI1
> >
> > ETIMEDOUT.
> >
> > This suggests that pci_endpoint_test on the host side did not receive an
> > interrupt.
> >
> > I don't know why, but considering that lspci output is now (with the
> > save+restore) identical, I assume that the problem is not related to
> > the host. Unless somehow the host will use a new/different MSI address
> > after the root port has been reset, and we restore the old MSI address,
> > but looking at the code, dw_pcie_msi_init() is called by
> > dw_pcie_setup_rc(), so I would expect the MSI address to be the same.
> >
>
> Hi Niklas,
>
> When I rebased this series on top of v7.1-rc1, I ended up seeing the issue what
> you described here (not sure why I didn't see it earlier). So after the Root
> Port reset, MSI tests fail, but BAR tests succeed. Also, I got IOMMU faults on
> the host after endpoint triggers MSI.
>
> I investigated it and found that the MSI iATU mapping gets cleared in hw after
> LDn happens. But the host continues to use the same address/size for the
> endpoint MSI even after reset. Due to this, the existing checks in
> dw_pcie_ep_raise_msi_irq() don't pass and the stale MSI iATU mapping gets
> reused.
>
> The fix would be to clear the mapping in dw_pcie_ep_cleanup(), which gets called
> as part of the PERST# assert/deassert sequence post LDn and also set
> msi_iatu_mapped flag to 'false'. This will force dw_pcie_ep_raise_msi_irq() to
> use fresh iATU mapping when it gets called for the first time:
>
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index d4dc3b24da60..4ae0e1b55f39 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -1035,6 +1035,11 @@ void dw_pcie_ep_cleanup(struct dw_pcie_ep *ep)
> {
> struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>
> + if (ep->msi_iatu_mapped) {
> + dw_pcie_ep_unmap_addr(ep->epc, 0, 0, ep->msi_mem_phys);
> + ep->msi_iatu_mapped = false;
> + }
> +
> dwc_pcie_debugfs_deinit(pci);
> dw_pcie_edma_remove(pci);
> }
>
> With this change, MSI works after Root Port reset without any issues on our Qcom
> endpoint/host setup.
>
> Please test this change on your rockchip setup as well. You have to make sure
> that dw_pcie_ep_cleanup() is called during PERST# assert/deassert.
>
> I'm going to respin the series with this fix. If you confirm it works for you,
> then we can merge your Rockchip Root Port change.
I am happy to hear that you managed to find the root cause!
Hopefully your series can finally move forward :)
RK3588, for example, does have a PERST# input GPIO, so it could theoretically
add a perst_deassert()/assert() function. However, when the EPC support was
added, you did not want that; as I remember it, you said that you only
wanted that for drivers that required an external refclock.
Thus, for drivers that do not require an external refclock, should we
perhaps add your suggested code in dw_pcie_ep_linkdown()?
E.g. pcie-tegra194.c does not call dw_pcie_ep_linkdown(), so I'm not
sure if we can simply move it from dw_pcie_ep_cleanup() to
dw_pcie_ep_linkdown() either...
Perhaps we need the code in both functions?
(pcie-qcom-ep.c seems to be the only driver that will call both
dw_pcie_ep_linkdown() and dw_pcie_ep_cleanup().)
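To make the "both functions" idea concrete, here is a toy userspace model.
It is only a sketch: struct ep_model and every function below are
hypothetical, with a counter standing in for the real iATU unmap call. The
point is that a helper guarded by the msi_iatu_mapped flag is safe to call
from either path, or from both.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the endpoint state; not the real struct dw_pcie_ep */
struct ep_model {
	bool msi_iatu_mapped;
	unsigned int unmap_calls;	/* counts simulated iATU unmaps */
};

/* Drop the MSI iATU mapping at most once, however often this is called */
static void ep_model_clear_msi_mapping(struct ep_model *ep)
{
	if (!ep->msi_iatu_mapped)
		return;
	ep->unmap_calls++;
	ep->msi_iatu_mapped = false;
}

static void ep_model_linkdown(struct ep_model *ep)
{
	ep_model_clear_msi_mapping(ep);
}

static void ep_model_cleanup(struct ep_model *ep)
{
	ep_model_clear_msi_mapping(ep);
}

/* Run the chosen paths on a mapped endpoint and report the unmap count */
static unsigned int ep_model_demo(bool linkdown, bool cleanup)
{
	struct ep_model ep = { .msi_iatu_mapped = true, .unmap_calls = 0 };

	if (linkdown)
		ep_model_linkdown(&ep);
	if (cleanup)
		ep_model_cleanup(&ep);

	/* The mapping is gone as soon as either path has run */
	assert(ep.msi_iatu_mapped == !(linkdown || cleanup));
	return ep.unmap_calls;
}
```

In this model a driver that calls only one of the two functions still gets
the stale mapping dropped, and a driver that calls both does not unmap twice.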
Kind regards,
Niklas
^ permalink raw reply
* Re: [PATCH 4/8] drm/panthor: Add support for protected memory allocation in panthor
From: Boris Brezillon @ 2026-05-19 17:29 UTC (permalink / raw)
To: Chia-I Wu
Cc: Ketil Johnsen, Liviu Dudau, Marcin Ślusarz, David Airlie,
Simona Vetter, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, Jonathan Corbet, Shuah Khan, Sumit Semwal,
Benjamin Gaignard, Brian Starkey, John Stultz, T.J. Mercier,
Christian König, Steven Price, Daniel Almeida, Alice Ryhl,
Matthias Brugger, AngeloGioacchino Del Regno, dri-devel,
linux-doc, linux-kernel, linux-media, linaro-mm-sig,
linux-arm-kernel, linux-mediatek, Florent Tomasin, nd
In-Reply-To: <CAPaKu7T7JZRmsS+D_3zFZtyhJk9mNXjL=xpAQ-UNGbm0vztyRg@mail.gmail.com>
On Tue, 19 May 2026 10:07:02 -0700
Chia-I Wu <olvaffe@gmail.com> wrote:
> On Tue, May 19, 2026 at 1:49 AM Ketil Johnsen <ketil.johnsen@arm.com> wrote:
> >
> > On 19/05/2026 09:39, Boris Brezillon wrote:
> > > On Mon, 18 May 2026 17:36:40 -0700
> > > Chia-I Wu <olvaffe@gmail.com> wrote:
> > >
> > >> On Mon, May 18, 2026 at 12:16 AM Boris Brezillon
> > >> <boris.brezillon@collabora.com> wrote:
> > >>>
> > >>> On Wed, 13 May 2026 12:31:32 -0700
> > >>> Chia-I Wu <olvaffe@gmail.com> wrote:
> > >>>
> > >>>> On Tue, May 12, 2026 at 8:39 AM Liviu Dudau <liviu.dudau@arm.com> wrote:
> > >>>>>
> > >>>>> On Tue, May 12, 2026 at 04:11:11PM +0200, Boris Brezillon wrote:
> > >>>>>> On Tue, 12 May 2026 14:47:27 +0100
> > >>>>>> Liviu Dudau <liviu.dudau@arm.com> wrote:
> > >>>>>>
> > >>>>>>> On Thu, May 07, 2026 at 01:53:56PM +0200, Boris Brezillon wrote:
> > >>>>>>>> On Thu, 7 May 2026 11:02:26 +0200
> > >>>>>>>> Marcin Ślusarz <marcin.slusarz@arm.com> wrote:
> > >>>>>>>>
> > >>>>>>>>> On Tue, May 05, 2026 at 06:15:23PM +0200, Boris Brezillon wrote:
> > >>>>>>>>>>> @@ -277,9 +286,21 @@ int panthor_device_init(struct panthor_device *ptdev)
> > >>>>>>>>>>> return ret;
> > >>>>>>>>>>> }
> > >>>>>>>>>>>
> > >>>>>>>>>>> + /* If a protected heap name is specified but not found, defer the probe until created */
> > >>>>>>>>>>> + if (protected_heap_name && strlen(protected_heap_name)) {
> > >>>>>>>>>>
> > >>>>>>>>>> Do we really need this strlen() > 0? Won't dma_heap_find() fail is the
> > >>>>>>>>>> name is "" already?
> > >>>>>>>>>
> > >>>>>>>>> If dma_heap_find() will fail, then the whole probe with fail too.
> > >>>>>>>>> This check prevents that.
> > >>>>>>>>
> > >>>>>>>> Yeah, that's also a questionable design choice. I mean, we can
> > >>>>>>>> currently probe and boot the FW even though we never setup the
> > >>>>>>>> protected FW sections, so why should we defer the probe here? Can't we
> > >>>>>>>> just retry the next time a group with the protected bit is created and
> > >>>>>>>> fail if we can find a protected heap?
> > >>>>>>>
> > >>>>>>> The problem we have with the current firmware is that it does a number of setup steps at "boot"
> > >>>>>>> time only. One of the steps is preparing its internal structures for when it enters protected
> > >>>>>>> mode and it stores them in the buffer passed in at firmware loading. We cannot later run the
> > >>>>>>> process when we have a group with protected mode set.
> > >>>>>>
> > >>>>>> No, but we can force a full/slow reset and have that thing
> > >>>>>> re-initialized, can't we? I mean, that's basically what we do when a
> > >>>>>> fast reset fails: we re-initialize all the sections and reset again, at
> > >>>>>> which point the FW should start from a fresh state, and be able to
> > >>>>>> properly initialize the protected-related stuff if protected sections
> > >>>>>> are populated. Am I missing something?
> > >>>>>
> > >>>>> Right, we can do that. For some reason I keep associating the reset with the
> > >>>>> error handling and not with "normal" operations.
> > >>>> I kind of hope we end up with either
> > >>>>
> > >>>> - panthor knows the exact heap to use and fails with EPROBE_DEFER if
> > >>>> the heap is missing, or
> > >>>> - panthor gets a dma-buf from userspace and does the full reset
> > >>>> - userspace also needs to provide a dma-buf for each protected
> > >>>> group for the suspend buffer
> > >>>>
> > >>>> than something in-between. The latter is more ad-hoc and basically
> > >>>> kicks the issue to the userspace.
> > >>>
> > >>> Indeed, the second option is more ad-hoc, but when you think about it,
> > >>> userspace has to have this knowledge, because it needs to know the
> > >>> dma-heap to use for buffer allocations that cross a device boundary
> > >>> anyway. Think about frames produced by a video decoder, and composited
> > >>> by the GPU into a protected scanout buffer that's passed to the KMS
> > >>> device. Why would the GPU driver be source of truth when it comes to
> > >>> choosing the heap to use to allocate protected buffers for the video
> > >>> decoder or those used for the display?
> > >> I don't think the GPU driver is ever the source of truth. If the
> > >> system integrator wants to specify the source of truth (SoT) from
> > >> kernel space, they should use the device tree (or module params /
> > >> config options). If they want to specify the SoT in userspace, then we
> > >> don't really care how it is done other than providing an ioctl.
> > >> Panthor is always on the receiving end.
> > >
> > > Okay, we're on the same page then.
> > >
> > >>
> > >> If we don't want to delay this functionality, but it takes time to
> > >> converge on SoT, maybe a solution that is not a long-term promise can
> > >> work? Of the options on the table (dt, module params, kconfig options,
> > >> ioctls), a kconfig option, potentially marked as experimental, seems
> > >> like a good candidate.
> > >
> > > If Panthor is only a consumer, I actually think it'd be easier to just
> > > let userspace pass the protected FW section as an imported buffer
> > > through an ioctl for now. It means we don't need any of the
> > > modifications to the dma_heap API in this series, and userspace is free
> > > to choose its SoT (efuse, DT, ...) and pass the info back to mesa/GBM
> > > somehow (envvar, driconf, ...). The only thing we need to ensure is if
> > > lazy protected FW section allocation is going to work, but given the
> > > current code purely and simply ignores those sections, and the FW is
> > > still able to boot and act properly (at least on v10-v13), I'm pretty
> > > confident this is okay, unless there's some trick the MCU can do to
> > > detect that the protected section isn't mapped (which I doubt, because
> > > the MCU doesn't know it lives behind an MMU).
> I set up MMU to map non-protected memory to the protected section the
> other day. The FW still booted fine. I didn't get access violation
> until the FW executed PROT_REGION and panthor requested
> GLB_PROTM_ENTER in response.
Ah, thanks for testing! We still don't have a setup with proper
protected heap, but that was on my list of things to test.
>
> This was on v13, but I also doubt it will become an issue. Can ARM help clarify?
>
> > >
> > > Of course, once we have a consensus on how to describe this in the DT,
> > > we can switch Panthor over to "protected dma_heap selection through DT",
> > > and reflect that through the ioctl that exposes whether protected
> > > support is ready or not (would be a DEV_QUERY), such that userspace can
> > > skip this "PROTM initialization" step.
> > >
> > > We're talking about an extra ioctl to set those buffers, and a
> > > DEV_QUERY to query the state (ready or not), the size of the global
> > > protected buffer (protected FW section) and the size of the protected
> > > suspend buffer. The protected suspend buffer would be allocated and
> > > passed at group creation time (extra arg passed to the existing
> > > GROUP_CREATE ioctl). So, overall, I don't consider it a huge liability
> > > in terms of maintenance cost.
> >
> > If we can avoid the dma-heap changes, then that would surely help!
> > I can try to implement this in the next version unless someone finds a
> > reason why it is a bad idea.
> Yeah, that sounds good to me too.
>
> Will the extra ioctl require root?
The PROTM_INIT ioctl will certainly require high privilege
CAP_SYS_<something>, dunno yet what that <something> would be though.
> On a system with true protected
> memory, the FW cannot write to non-protected memory. It seems ok to
> allow any client to make the ioctl call. But on systems without true
> protected memory, it can be problematic.
Yep, I agree we shouldn't let random users pretend they initialized
protected mode if the system as a whole doesn't have the proper
bits hooked up to set that up.
^ permalink raw reply
* Re: [PATCH v5 2/6] iommu/arm-smmu-v3: Implement is_attach_deferred() for kdump
From: Jason Gunthorpe @ 2026-05-19 17:43 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, kevin.tian, joro, praan, kees, baolu.lu,
miko.lenczewski, smostafa, linux-arm-kernel, iommu, linux-kernel,
stable, jamien
In-Reply-To: <43fd9986b085cf5bfba2c9bc06c0411693a361e5.1778416609.git.nicolinc@nvidia.com>
On Sun, May 10, 2026 at 02:23:01PM -0700, Nicolin Chen wrote:
> Though the kdump kernel adopts the crashed kernel's stream table, the iommu
> core will still try to attach each probed device to a default domain, which
> overwrites the adopted STE and breaks in-flight DMA from that device.
>
> Implement an is_attach_deferred() callback to prevent this. For each device
> that has STE.V=1 and STE.Cfg!=Abort in the adopted table, defer the default
> domain attachment, until the device driver explicitly requests it.
>
> Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
> Cc: stable@vger.kernel.org # v6.12+
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 +++++++++++++++++++++
> 1 file changed, 24 insertions(+)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
^ permalink raw reply
* Re: [PATCH v5 3/6] iommu/arm-smmu-v3: Suppress EVTQ/PRIQ events in kdump kernel
From: Jason Gunthorpe @ 2026-05-19 17:44 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, kevin.tian, joro, praan, kees, baolu.lu,
miko.lenczewski, smostafa, linux-arm-kernel, iommu, linux-kernel,
stable, jamien
In-Reply-To: <6e5828f3288aed6f9e9f4e0ca54e7fbd9f439274.1778416609.git.nicolinc@nvidia.com>
On Sun, May 10, 2026 at 02:23:02PM -0700, Nicolin Chen wrote:
> In kdump cases, the crashed kernel's CDs and page tables can be corrupted,
> which could trigger event spamming. Also, we cannot serve page requests.
>
> Skip the IRQ setup for EVTQ/PRIQ in arm_smmu_setup_irqs(), and guard the
> thread functions against being entered via a combined-IRQ delivery while
> the queue is disabled.
>
> Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
> Cc: stable@vger.kernel.org # v6.12+
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 23 +++++++++++++++++++--
> 1 file changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 579c8af82d6b6..ebb0826d74541 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2364,6 +2364,14 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
> static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
> DEFAULT_RATELIMIT_BURST);
>
> + /*
> + * A combined IRQ might call into this function with the queue disabled.
> + * E.g. kdump, where stale HW PROD vs SW CONS would drive a bogus drain
> + * and a CONS write to a disabled queue.
> + */
> + if (!(readl_relaxed(smmu->base + ARM_SMMU_CR0) & CR0_EVTQEN))
> + return IRQ_NONE;
I don't think we should be doing register reads on these paths.
Why not load a different irq function instead?
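For illustration only, a minimal userspace sketch of that idea: pick the
handler once at setup time so the hot path has no enable check. All names
here (fake_smmu, evtq_drain, evtq_ignore, setup_evtq_irq, the is_kdump flag)
are hypothetical stand-ins, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins, not the driver's real types. */
enum irq_result { IRQ_RESULT_NONE, IRQ_RESULT_HANDLED };

struct fake_smmu {
	bool is_kdump;              /* assumed flag: event queue is left disabled */
	unsigned int evtq_pending;  /* events waiting to be drained */
	unsigned int evtq_drained;  /* events consumed so far */
	enum irq_result (*evtq_handler)(struct fake_smmu *smmu);
};

/* Normal handler: drains the queue without checking any enable bit. */
static enum irq_result evtq_drain(struct fake_smmu *smmu)
{
	smmu->evtq_drained += smmu->evtq_pending;
	smmu->evtq_pending = 0;
	return IRQ_RESULT_HANDLED;
}

/* Handler installed when the queue is never enabled: touches nothing. */
static enum irq_result evtq_ignore(struct fake_smmu *smmu)
{
	(void)smmu;
	return IRQ_RESULT_NONE;
}

/* Decide once, at setup time, which function the IRQ will call. */
static void setup_evtq_irq(struct fake_smmu *smmu)
{
	smmu->evtq_handler = smmu->is_kdump ? evtq_ignore : evtq_drain;
}
```

The decision moves out of the interrupt path: the kdump case never reaches
the drain loop, and the normal case pays no extra register read.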
Jason
^ permalink raw reply
* Re: [PATCH v5 4/6] iommu/arm-smmu-v3: Skip EVTQ/PRIQ setup in kdump kernel
From: Jason Gunthorpe @ 2026-05-19 17:45 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, kevin.tian, joro, praan, kees, baolu.lu,
miko.lenczewski, smostafa, linux-arm-kernel, iommu, linux-kernel,
stable, jamien
In-Reply-To: <8de5639630e5723d6f371093cef93733f0ca534d.1778416609.git.nicolinc@nvidia.com>
On Sun, May 10, 2026 at 02:23:03PM -0700, Nicolin Chen wrote:
> In kdump cases, the crashed kernel's CDs and page tables can be corrupted,
> which could trigger event spamming. Also, we cannot serve page requests.
>
> Skip the EVTQ/PRIQ setup entirely rather than enabling then disabling them.
>
> Also add some inline comments explaining that.
>
> Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
> Cc: stable@vger.kernel.org # v6.12+
> Suggested-by: Kevin Tian <kevin.tian@intel.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 43 +++++++++++++--------
> 1 file changed, 27 insertions(+), 16 deletions(-)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
^ permalink raw reply
* Re: [PATCH v5 6/6] iommu/arm-smmu-v3: Detect ARM_SMMU_OPT_KDUMP_ADOPT in probe()
From: Jason Gunthorpe @ 2026-05-19 17:58 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, kevin.tian, joro, praan, kees, baolu.lu,
miko.lenczewski, smostafa, linux-arm-kernel, iommu, linux-kernel,
stable, jamien
In-Reply-To: <69abcccc388952b2ba0ab4b50c31fcbdac59184a.1778416609.git.nicolinc@nvidia.com>
On Sun, May 10, 2026 at 02:23:05PM -0700, Nicolin Chen wrote:
> arm_smmu_device_hw_probe() runs before arm_smmu_init_structures(), so it's
> natural to decide whether the kdump kernel must adopt the crashed kernel's
> stream table.
>
> Given that memremap is used to adopt the old stream table, set this option
> only on a coherent SMMU.
>
> And make sure SMMU isn't in Service Failure Mode.
>
> Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
> Cc: stable@vger.kernel.org # v6.12+
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 31 +++++++++++++++++++++
> 1 file changed, 31 insertions(+)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
^ permalink raw reply
* Re: [PATCH v5 1/6] iommu/arm-smmu-v3: Add arm_smmu_kdump_adopt_strtab() for kdump
From: Nicolin Chen @ 2026-05-19 18:11 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: will, robin.murphy, kevin.tian, joro, praan, kees, baolu.lu,
miko.lenczewski, smostafa, linux-arm-kernel, iommu, linux-kernel,
stable, jamien
In-Reply-To: <20260519171003.GD3602937@nvidia.com>
On Tue, May 19, 2026 at 02:10:03PM -0300, Jason Gunthorpe wrote:
> On Sun, May 10, 2026 at 02:23:00PM -0700, Nicolin Chen wrote:
>
> > +#include <linux/dma-direct.h>
>
> Nope, never do this, it is an internal header.
Hmm, I included it for the wrong reason, yet its header comment does
mention "IOMMU drivers":
/*
* Internals of the DMA direct mapping implementation. Only for use by the
* DMA mapping code and IOMMU drivers.
*/
> > +/*
> > + * Adopting the crashed kernel's stream table has risks: the physical addresses
> > + * read from ARM_SMMU_STRTAB_BASE / L1 descriptors may be corrupted. Reject any
> > + * range that overlaps the kdump kernel's critical regions.
> > + */
> > +static bool arm_smmu_kdump_phys_is_corrupted(phys_addr_t base, size_t size)
[..]
> Something like this should not be in the smmu driver, this is some
> core kdump code. I'd drop it, I don't see other drivers doing this?
OK.
> > +static int arm_smmu_kdump_adopt_l2_strtab(struct arm_smmu_device *smmu, u32 sid,
> > + u32 l1_idx, u64 l2_dma, u32 span,
> > + struct arm_smmu_strtab_l2 **l2table)
> > +{
> > + phys_addr_t base = dma_to_phys(smmu->dev, l2_dma);
>
> The thing stored in the L2PTR is a *phys*, the HW doesn't support any
> kind of translation. When using dma_alloc_coherent we never get a phys
> so it uses the dma_addr_t and assumes it is == phys.
>
> But on this flow this is *phys* and should remain phys. Never touch
> dma_addr_t.
Fixing that and other places too.
> > +static void arm_smmu_kdump_adopt_cleanup(struct arm_smmu_device *smmu, u32 fmt)
> > +{
> > + struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> > +
> > + if (fmt == STRTAB_BASE_CFG_FMT_2LVL) {
> > + if (cfg->l2.l2ptrs)
> > + devm_kfree(smmu->dev, cfg->l2.l2ptrs);
> > + if (!IS_ERR_OR_NULL(cfg->l2.l1tab))
> > + devm_memunmap(smmu->dev, cfg->l2.l1tab);
> > + } else if (fmt == STRTAB_BASE_CFG_FMT_LINEAR) {
> > + if (!IS_ERR_OR_NULL(cfg->linear.table))
> > + devm_memunmap(smmu->dev, cfg->linear.table);
> > + }
> > +}
>
> If we have a cleanup function why is it using devm? Call the cleanup
> function during remove too?
Dropping "devm_"s.
Thanks
Nicolin
^ permalink raw reply
* Re: [PATCH 3/3] arm64: dts: imx95: Add iommus property and enable SMMU
From: Frank Li @ 2026-05-19 18:16 UTC (permalink / raw)
To: Peng Fan (OSS)
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Sascha Hauer,
Pengutronix Kernel Team, Fabio Estevam, devicetree, imx,
linux-arm-kernel, linux-kernel, Peng Fan
In-Reply-To: <20260409-imx95-s-dts-v1-3-858e83ae1a37@nxp.com>
On Thu, Apr 09, 2026 at 08:00:03PM +0800, Peng Fan (OSS) wrote:
> From: Peng Fan <peng.fan@nxp.com>
>
> Add iommus property for SDHC and EDMA
> Enable SMMU by default.
>
> Signed-off-by: Peng Fan <peng.fan@nxp.com>
> ---
Peng:
I have to drop this patch because it causes the CHECK_DTBS warning below:
arch/arm64/boot/dts/freescale/imx95-verdin-wifi-dev.dtb: dma-controller@42210000 (fsl,imx95-edma5): Unevaluated properties are not allowed ('iommus' was unexpected)
from schema $id: http://devicetree.org/schemas/dma/fsl,edma.yaml
Frank
> arch/arm64/boot/dts/freescale/imx95.dtsi | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/boot/dts/freescale/imx95.dtsi b/arch/arm64/boot/dts/freescale/imx95.dtsi
> index 3e35c956a4d7af88310b3dfaef7e3d064f530e07..adcc0e1d3696b93250ab97fcac7c181b187d3d10 100644
> --- a/arch/arm64/boot/dts/freescale/imx95.dtsi
> +++ b/arch/arm64/boot/dts/freescale/imx95.dtsi
> @@ -777,6 +777,7 @@ edma3: dma-controller@42210000 {
> <GIC_SPI 287 IRQ_TYPE_LEVEL_HIGH>;
> clocks = <&scmi_clk IMX95_CLK_BUSWAKEUP>;
> clock-names = "dma";
> + iommus = <&smmu 0x0>;
> };
>
> mu7: mailbox@42430000 {
> @@ -1242,6 +1243,7 @@ usdhc1: mmc@42850000 {
> bus-width = <8>;
> fsl,tuning-start-tap = <1>;
> fsl,tuning-step = <2>;
> + iommus = <&smmu 0x1>;
> status = "disabled";
> };
>
> @@ -1259,6 +1261,7 @@ usdhc2: mmc@42860000 {
> bus-width = <4>;
> fsl,tuning-start-tap = <1>;
> fsl,tuning-step = <2>;
> + iommus = <&smmu 0x2>;
> status = "disabled";
> };
>
> @@ -1276,6 +1279,7 @@ usdhc3: mmc@428b0000 {
> bus-width = <4>;
> fsl,tuning-start-tap = <1>;
> fsl,tuning-step = <2>;
> + iommus = <&smmu 0x3>;
> status = "disabled";
> };
> };
> @@ -1768,7 +1772,6 @@ smmu: iommu@490d0000 {
> <GIC_SPI 326 IRQ_TYPE_EDGE_RISING>;
> interrupt-names = "eventq", "gerror", "priq", "cmdq-sync";
> #iommu-cells = <1>;
> - status = "disabled";
> };
>
> pmu@490d2000 {
>
> --
> 2.37.1
>
^ permalink raw reply
* Re: [PATCH v4 11/24] iommu: Add iommu_report_device_broken() to quarantine a broken device
From: Nicolin Chen @ 2026-05-19 18:29 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Will Deacon, Robin Murphy, Joerg Roedel, Bjorn Helgaas,
Rafael J . Wysocki, Len Brown, Pranjal Shrivastava, Mostafa Saleh,
Lu Baolu, Kevin Tian, linux-arm-kernel, iommu, linux-kernel,
linux-acpi, linux-pci, vsethi, Shuai Xue
In-Reply-To: <20260519120737.GQ787748@nvidia.com>
On Tue, May 19, 2026 at 09:07:37AM -0300, Jason Gunthorpe wrote:
> On Mon, May 18, 2026 at 08:38:54PM -0700, Nicolin Chen wrote:
> > +void iommu_report_device_broken(struct device *dev)
> > +{
> > + struct group_device *gdev;
> > +
> > + /*
> > + * We cannot hold group->mutex here. Rely on iommu_group_broken_worker()
> > + * to validate dev_has_iommu(). The iommu_group memory is RCU-protected
> > + * via kfree_rcu() in iommu_group_release(), and group->devices is an
> > + * RCU-protected list, so the lookup runs entirely under rcu_read_lock.
> > + *
> > + * Note the device might have been concurrently removed from the group
> > + * (list_del_rcu) before iommu_deinit_device() cleared the dev->iommu.
> > + */
> > + rcu_read_lock();
> > + gdev = __dev_to_gdev_rcu(dev);
> > + if (gdev) {
>
> If this is why the RCU is being added it seems like overkill.
>
> Just add the worker to struct dev_iommu and push it there so it can
> use a mutex but I'm confused why are we even adding this function?
>
> The entire design of this series was supposed to have the IOMMU driver
> > itself adjust its "STE" to inhibit translated TLPs synchronously
> within its fully locked invalidation loop.
Yes. The surgical STE update is done in the driver. But the core-level
attach state then no longer reflects what the hardware is doing. So the
driver calls this function to notify the core (this is in an invalidation
context, where it cannot take a mutex).
> Whats the async worker for?
Then, the core needs to block the device using a routine similar to
the reset prepare() one. That needs to hold group->mutex, so it
needs an async worker.
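Roughly, as a userspace sketch with made-up names (fake_dev, fake_group,
report_device_broken, broken_worker; a bool stands in for group->mutex),
not the code in this series:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct fake_group {
	bool mutex_held;  /* stands in for group->mutex */
	bool blocked;     /* core-level attach state */
};

struct fake_dev {
	struct fake_group *group;
	atomic_bool broken_pending;  /* set from the non-sleeping path */
};

/*
 * May run in atomic context, so it never touches the mutex.
 * Returns true when the caller should schedule the worker,
 * i.e. only for the first report.
 */
static bool report_device_broken(struct fake_dev *dev)
{
	return !atomic_exchange(&dev->broken_pending, true);
}

/* Worker: runs later in a sleepable context, so it may take the lock. */
static void broken_worker(struct fake_dev *dev)
{
	if (!atomic_exchange(&dev->broken_pending, false))
		return;  /* nothing was reported */

	dev->group->mutex_held = true;   /* lock */
	dev->group->blocked = true;      /* update core state under the lock */
	dev->group->mutex_held = false;  /* unlock */
}
```

The report path only flips a flag; everything that needs the lock happens
in the worker.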
Do you see a much simpler way?
Thanks
Nicolin
^ permalink raw reply
* Re: [PATCH v2 1/2] i2c: imx: Don't recover bus when arbitration lost
From: Dan Scally @ 2026-05-19 18:32 UTC (permalink / raw)
To: Carlos Song (OSS)
Cc: linux-i2c@vger.kernel.org, imx@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, Andi Shyti, Frank Li,
Sascha Hauer, Fabio Estevam, Gao Pan, Fugang Duan, Wolfram Sang,
Oleksij Rempel, Pengutronix Kernel Team
In-Reply-To: <AM0PR04MB680225F01902AD7990E4B6E5E8002@AM0PR04MB6802.eurprd04.prod.outlook.com>
Hi Carlos
On 19/05/2026 11:29, Carlos Song (OSS) wrote:
>
>
>> -----Original Message-----
>> From: Dan Scally <dan.scally@ideasonboard.com>
>> Sent: Tuesday, May 19, 2026 4:42 PM
>> To: Oleksij Rempel <o.rempel@pengutronix.de>; Pengutronix Kernel Team
>> <kernel@pengutronix.de>
>> Cc: linux-i2c@vger.kernel.org; imx@lists.linux.dev;
>> linux-arm-kernel@lists.infradead.org; Andi Shyti <andi.shyti@kernel.org>; Frank
>> Li <frank.li@nxp.com>; Sascha Hauer <s.hauer@pengutronix.de>; Fabio
>> Estevam <festevam@gmail.com>; Gao Pan <b54642@freescale.com>; Fugang
>> Duan <B38611@freescale.com>; Wolfram Sang <wsa@kernel.org>
>> Subject: Re: [PATCH v2 1/2] i2c: imx: Don't recover bus when arbitration lost
>>
>> [You don't often get email from dan.scally@ideasonboard.com. Learn why this is
>> important at https://aka.ms/LearnAboutSenderIdentification ]
>>
>> Hello Oleksij / all
>>
>> On 24/04/2026 13:36, Daniel Scally wrote:
>>> In i2c_imx_xfer_common(), the driver attempts bus recovery whenever
>>> i2c_imx_start() fails. One of the failure modes for i2c_imx_start() is
>>> an arbitration-lost signal which results when a second I2C master on
>>> the bus tries to control the bus simultaneously, which is a normal and
>>> expected behaviour.
>>>
>>> Bus recovery is not the right response for this case. Add a check for
>>> the -EAGAIN return code to avoid running the bus recovery.
>>>
>>> Fixes: 1c4b6c3bcf30d ("i2c: imx: implement bus recovery")
>>> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com>
>>> ---
>>
>> I raised this patch after we had issues with one of the i2c controllers on imx8mp.
>> In that case, the bus had multiple masters that were causing the SoC's i2c
>> controller to lose arbitration. The result was that the framework attempted to
>> run i2c_generic_scl_recovery() and regularly hit the "SCL is stuck low, exit
>> recovery" message [1] because the bus was busy rather than stuck.
>>
>> I'm now experiencing a different issue with the imx8mp in which a different
>> controller - which isn't on a multiple-masters bus - starts transacting fine early in
>> boot, but then seems to get stuck - any attempt to start a transaction by either
>> a driver or i2ctransfer results in the IAL bit in I2C_I2SR being set and so the
>> driver reports that it's lost arbitration [2]. In this case, the bus recovery is
>> needed to fix the problem, and so this commit hurts things rather than helps
>> them. This problem isn't consistent - I get it on maybe 10% of boots.
>>
> Hi Dan,
>
> This is what the RM shows:
>
> Arbitration lost. Set by hardware in the following circumstances (IAL must be cleared by software by
> writing a "0" to it at the start of the interrupt service routine):
> * I2Cn_SDA input samples low when the master drives high during an address or data-transmit cycle.
> * I2Cn_SDA input samples low when the master drives high during the acknowledge bit of a data-receive
> cycle.
> For the above two cases, the bit is set at the falling edge of the ninth I2Cn_SCL clock during the ACK
> cycle.
> * A Start cycle is attempted when the bus is busy.
> * A Repeated Start cycle is requested in Slave mode.
> * A Stop condition is detected when the master did not request it.
> NOTE: Software cannot set the bit.
> 0 No arbitration lost.
> 1 Arbitration is lost.
>
> From my understanding:
> The IAL (Arbitration Lost) bit is set not only when true arbitration is lost, but also in several other conditions:
>
> - SDA is sampled low when the master drives it high (during address/data or ACK phase)
> - A START is attempted while the bus is busy
> - A STOP condition is detected unexpectedly
> - A repeated START occurs in slave mode
>
> So in practice, IAL can be asserted not only by real arbitration loss, but also when the controller detects abnormal bus conditions.
>
> Since your system is single-master, this is unlikely to be a true arbitration scenario. Instead, it is more likely caused by signal integrity or timing-related issues, such as:
> - weak pull-up / slow rising edges
> - noise or glitches on SDA
> - timing violations from the slave device
> - others
>
> As a workaround, you can enable the 'single-master' property to disable arbitration checks in single-master systems, for example:
>
> &i2c1 {
> clock-frequency = <400000>;
> pinctrl-names = "default", "gpio";
> pinctrl-0 = <&pinctrl_i2c1>;
> pinctrl-1 = <&pinctrl_i2c1_gpio>;
> scl-gpios = <&gpio5 14 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
> sda-gpios = <&gpio5 15 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
> single-master;
> status = "okay";
> };
Thanks! I'm away from the hardware at the moment but I'll give it a try next week and see if that
fixes the issue.
Dan
>
> Hope it will help some.
>
> Carlos
^ permalink raw reply
* Re: [PATCH v2 0/5] mm: reduce mmap_lock contention and improve page fault performance
From: Yang Shi @ 2026-05-19 18:41 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Barry Song, Matthew Wilcox, surenb, akpm, linux-mm, david, liam,
vbabka, rppt, mhocko, jack, pfalcato, wanglian, chentao,
lianux.mm, kunwu.chan, liyangouwen1, chrisl, kasong, shikemeng,
nphamcs, bhe, youngjun.park, linux-arm-kernel, linux-kernel,
loongarch, linuxppc-dev, linux-riscv, linux-s390, Nanzhe Zhao
In-Reply-To: <agxnJ8R-G3CRjeTR@lucifer>
On Tue, May 19, 2026 at 6:39 AM Lorenzo Stoakes <ljs@kernel.org> wrote:
>
> On Tue, May 19, 2026 at 02:12:10PM +0100, Lorenzo Stoakes wrote:
> > On Mon, May 18, 2026 at 02:21:14PM -0700, Yang Shi wrote:
> > > Maybe a little bit off topic. This is an interesting idea. It seems
> > > possible we don't have to take vma write lock unconditionally. IIUC
> > > the write lock is mainly used to serialize against page fault and
> > > madvise, right? I got a crazy idea off the top of my head. We may be
> >
> > Err no, it serialises against literally any modification or read of any
> > characteristic of VMAs.
If I remember correctly, you are not supposed to change VMA
flags/size/mm pointer/vm_file/pgoff/prot, etc, under read vma lock or
read mmap_lock.
> >
> > > able to just take vma write lock iff vma->anon_vma is not NULL.
> >
> > Except if we don't take it and vma->anon_vma is NULL, then somebody can
> > anon_vma_prepare() and change vma->anon_vma midway through a fork and completely
> > screw up the anon_vma fork hierarchy.
>
> correction: this won't happen as per Barry (see - I managed to confuse myself
> here :), since for vma->anon_vma install we take the mmap read lock.
>
> BUT we also have to consider other cases.
>
> >
> > So no.
> >
> > >
> > > First of all, write mmap_lock is held, so the vma can't go or be
> > > changed under us.
> >
> > vma->anon_vma can be changed.
>
> Correction: no it can't :)
Yes, vma->anon_vma change should require taking read mmap_lock.
>
> >
> > >
> > > Secondly, if vma->anon_vma is NULL, it basically means either no page
> > > fault happened or no cow happened, so there is no page table to copy,
> > > this is also what copy_page_range() does currently. So we can shrink
> > > the critical section to:
> >
> > Firstly, with no VMA write lock, !vma->anon_vma means a fault can race and
> > secondly copy_page_range() checks vma_needs_copy(), there are other cases - PFN
> > maps, mixed maps, UFFD W/P (ugh), guard regions.
> >
> > So yeah this isn't sufficient.
>
> However this is true...
Yes, fault can race with fork. Basically this is actually the purpose
of this idea. We can have improved page fault scalability. In my
proposal (take write vma lock if vma->anon_vma is not NULL), the race
just happens on the VMAs which page fault has not happened on before.
vma_needs_copy() also skips the VMAs which don't have vma->anon_vma.
So there is basically no difference in semantics other than more page
fault races IIUC. It should be safe as long as we can guarantee there
is no writable PTE pointing to a shared page after fork.
For guard regions, it can be serialized by vma write lock if
vma->anon_vma exists. If vma->anon_vma is NULL, it will prepare
anon_vma, which will take read mmap_lock if I read the code correctly.
I have not investigated UFFD yet.
>
> >
> > >
> > > if (vma->anon_vma) {
> > > vma_start_write_killable(src_vma);
> > > anon_vma_fork(dst_vma, src_vma);
> > > copy_page_range(dst_vma, src_vma);
> > > }
> >
> > Yeah that's totally broken for reasons above as I said :)
> >
> > >
> > > But page fault can happen before write mmap_lock is taken, when we
> > > check vma->anon_vma, it is possible it has not been set up yet. But it
> > > seems to be equivalent to page fault after fork and won't break the
> > > semantic.
> >
> > It will totally break how the anon_vma hierarchy works :) See the links at the
> > top of https://ljs.io/talks for a link to various slides on anon_vma behaviour
> > (it's really a pain to think about because it's a super broken abstraction).
> >
> > You could end up with a CoW mapping that's unreachable from rmap and you could
> > get some nasty issues with page table entries pointing at freed folios :)
>
> Correction: actually we should be safe given mmap read lock on anon_vma install.
>
> >
> > >
> > > Anyway, just a crazy idea, I may miss some corner cases.
> >
> > Yeah sorry to push back here but this is just not a viable approach.
No worries. Thanks for all the feedback. Just tried to explore whether
such an idea is feasible or not.
> >
> > And this is forgetting that we have relied on page faults being blocked by fork
> > _forever_, who knows what else has baked in assumptions about that
> > serialisation.
> >
> > Forking is one of the nastiest parts of mm and has had multiple, subtle, corner
> > case breakages that have been a nightmare to deal with.
Yes, this might be the biggest concern. The page fault can race with
fork. If some applications rely on such subtle behavior, it may break,
but such applications are fragile too.
> >
> > So I'm very much against changing this behaviour to try to fix something in the
> > fault path.
> >
> > We should address the fault path issues in the fault path :)
Yeah, this idea was inspired by Barry's "not take vma read lock
unconditionally" idea. Maybe irrelevant to Barry's priority inversion
problem, just an idea for further optimization on page fault
scalability. This probably should be a separate topic.
Thanks,
Yang
>
> Above still all true though.
>
> >
> > >
> > > Thanks,
> > > Yang
> > >
> > > }
> > >
> > > >
> > > > Based on the above, we may want to re-check whether fork()
> > > > can be blocked by page faults. At the same time, if Suren,
> > > > you, or anyone else has any comments, please feel free to
> > > > share them.
> > > >
> > > > Best Regards
> > > > Barry
> > > >
> >
> > Cheers, Lorenzo
>
> So still a nope :)
>
> Cheers, Lorenzo
^ permalink raw reply
* Re: [PATCH 2/2] arm64: dts: freescale: add initial device tree for TQMa8MPQS with i.MX8MP
From: Frank Li @ 2026-05-19 18:44 UTC (permalink / raw)
To: Alexander Stein
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Sascha Hauer,
Pengutronix Kernel Team, Fabio Estevam, Geert Uytterhoeven,
Magnus Damm, Shawn Guo, Paul Gerber, devicetree, linux-kernel,
imx, linux-arm-kernel, linux, linux-renesas-soc
In-Reply-To: <20260505063346.1799500-2-alexander.stein@ew.tq-group.com>
On Tue, May 05, 2026 at 08:33:44AM +0200, Alexander Stein wrote:
> From: Paul Gerber <paul.gerber@tq-group.com>
>
> This adds support for TQMa8MPQS module on MB-SMARC-2 board.
>
> Signed-off-by: Paul Gerber <paul.gerber@tq-group.com>
> Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
> ---
...
> +
> +&usb3_0 {
> + pinctrl-names = "default";
> + pinctrl-0 = <&pinctrl_usb0>;
> + fsl,over-current-active-low;
> + maximum-speed = "high-speed";
arch/arm64/boot/dts/freescale/imx8mp-tqma8mpqs-mb-smarc-2.dtb: usb@32f10100 (fsl,imx8mp-dwc3): 'maximum-speed' does not match any of the regexes: '^pinctrl-[0-9]+$', '^usb@[0-9a-f]+$'
from schema $id: http://devicetree.org/schemas/usb/fsl,imx8mp-dwc3.yaml
It will reduce review time if you run CHECK_DTBS locally before posting.
Frank
^ permalink raw reply