Linux-ARM-Kernel Archive on lore.kernel.org


* [PATCH V12 00/12] pci-imx6: Add support for parsing the reset property in new Root Port binding
From: Sherry Sun @ 2026-04-10  2:30 UTC (permalink / raw)
  To: robh, krzk+dt, conor+dt, Frank.Li, s.hauer, kernel, festevam,
	lpieralisi, kwilczynski, mani, bhelgaas, hongxing.zhu, l.stach
  Cc: imx, linux-pci, linux-arm-kernel, devicetree, linux-kernel

This patch set adds support for parsing the reset property in new Root Port
binding in pci-imx6 driver, similar to the implementation in the qcom pcie
driver[1].

Also introduce generic helper functions to parse Root Port device tree
nodes and extract common properties like reset GPIOs. This allows multiple
PCI host controller drivers to share the same parsing logic.

Define struct pci_host_port to hold common Root Port properties
(currently only reset GPIO descriptor) and add
pci_host_common_parse_ports() to parse Root Port nodes from device tree.
Also add the 'ports' list to struct pci_host_bridge to better maintain
the parsed Root Port information.

The plan is to add the wake-gpio property to the root port in subsequent
patches. Also, the vpcie-supply property will be moved to the root port
node later based on the refactoring patch set for the PCI pwrctrl
framework[2]. 

The initial idea is to adopt Manivannan's recent PCIe M.2 KeyE
connector support patch set[3] and PCI power control framework patches[2],
and extend them to the pcie-imx6 driver. Since the new M.2/pwrctrl model is
implemented based on Root Ports and requires the pwrctrl driver to bind to
a Root Port device, we need to introduce a Root Port child node on i.MX
boards that provide an M.2 connector.

To follow a more standardized DT structure, it also makes sense to move
the reset-gpios and wake-gpios properties into the Root Port node. These
signals logically belong to the Root Port rather than the host bridge,
and placing them there aligns with the new M.2/pwrctrl model.

Regarding backward compatibility, as Frank suggested, I will not remove
the old reset-gpio property from existing DTS files, to avoid breaking
existing functionality.

New i.MX platforms, such as the upcoming i.MX952-evk, will add
vpcie-supply, reset-gpios, and wake-gpios directly under the Root Port
node. Therefore, driver updates are needed to support both the legacy
properties and the new standardized Root Port based layout.

[1] https://lore.kernel.org/linux-pci/20250702-perst-v5-0-920b3d1f6ee1@qti.qualcomm.com/
[2] https://lore.kernel.org/linux-pci/20260115-pci-pwrctrl-rework-v5-0-9d26da3ce903@oss.qualcomm.com/
[3] https://lore.kernel.org/linux-pci/20260112-pci-m2-e-v4-0-eff84d2c6d26@oss.qualcomm.com/

Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
---
Changes in V12:
1. Improve the pci_host_common_parse_port() to correctly handle three scenarios:
   PERST# found in Root Port node & PERST# not in Root Port but found in RC node
   & PERST# not found in either node.
2. Add documentation notes for pci_host_common_parse_port().
3. Add an err_cleanup handling path for pci_host_common_parse_ports() to clean
   up any partially parsed Root Port resources.
4. Optimize imx_pcie_assert_perst() to avoid the linearly increasing deassertion
   delay if controller has multiple Root Ports.
5. Use mdelay instead of msleep in imx_pcie_assert_perst() for noirq context
   safety.
6. Remove early return in imx_pcie_parse_legacy_binding() when reset is NULL to
   align with pci_host_common_parse_port(), allowing port creation even without
   PERST# GPIO.
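
For illustration only, the three PERST# scenarios in item 1 can be modelled
in userspace C. This is a minimal sketch, not the code from the patch: struct
node, enum perst_source and find_perst() are hypothetical names, and a plain
string stands in for the GPIO descriptor.

```c
#include <stddef.h>

/* Hypothetical stand-in for a device tree node; not a kernel type. */
struct node {
	const char *reset_gpio;	/* NULL when the node has no reset-gpios */
};

enum perst_source { PERST_NONE, PERST_FROM_ROOT_PORT, PERST_FROM_RC };

/*
 * Look in the Root Port node first, then fall back to the RC (host bridge)
 * node. A missing PERST# is not treated as an error here, so the caller can
 * still create the port without a reset GPIO.
 */
static enum perst_source find_perst(const struct node *root_port,
				    const struct node *rc,
				    const char **gpio)
{
	if (root_port && root_port->reset_gpio) {
		*gpio = root_port->reset_gpio;
		return PERST_FROM_ROOT_PORT;
	}
	if (rc && rc->reset_gpio) {
		*gpio = rc->reset_gpio;
		return PERST_FROM_RC;
	}
	*gpio = NULL;
	return PERST_NONE;
}
```

Returning a distinct value for the "not found" case, rather than an error
code, mirrors item 6: a port without PERST# is still a valid port.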

Changes in V11:
1. Call pci_host_common_parse_ports() API from pci-imx6 driver instead of dwc
   common layer as Mani suggested.
2. Improve the commit message of patch#3 to avoid confusion as Mani suggested.

Changes in V10:
1. Use gpiod_direction_output() instead of gpiod_set_value_cansleep() to
   ensure the reset GPIO is properly configured as output before setting
   its value in patch#5 as now the reset GPIO is obtained with
   GPIOD_ASIS flag.

Changes in V9:
1. Improve the error handling in pci_host_common_parse_ports() as Mani suggested. 
2. Move the list_empty check and the comment to imx_pcie_host_init() to make it
   clear that imx_pcie_parse_legacy_binding() is a fallback as Mani suggested.
3. Export pci_host_common_delete_ports() so that it can be called by
   imx_pcie_parse_legacy_binding().

Changes in V8:
1. Add back the cleanup function pci_host_common_delete_ports() to properly
   handle the ports list instead of simply using pci_free_resource_list().
2. Improve the patch#4 commit message.
3. Remove the irrelevant code change in patch#4.

Changes in V7:
1. Change to use GPIOD_ASIS when requesting perst gpio as Mani suggested.
2. Add a separate patch to move vpcie3v3aux regulator enable from probe to
   imx_pcie_host_init() and move imx_pcie_assert_perst() before regulator and
   clock enable for pci-imx6.
3. Add device pointer parameter for pci_host_common_parse_port() instead of
   using bridge->dev.

Changes in V6:
1. Drop the pre-allocate pci_host_bridge struct changes in dw_pcie_host_init()
   and imx_pcie_probe().
2. Parse Root Port nodes in dw_pcie_host_init() as Frank and Mani suggested.
3. Move the imx_pcie_parse_legacy_binding() from imx_pcie_probe() to
   imx_pcie_host_init(), so that dw_pcie_host_init() parses Root Ports first; if
   no Root Port nodes were parsed (indicated by an empty ports list), then parse
   the legacy binding.
4. Add device pointer parameter for pci_host_common_parse_ports().
5. Add NULL pointer check for reset gpio in imx_pcie_parse_legacy_binding().

Changes in V5:
1. Add the Root Port list (pci_host_port) to struct pci_host_bridge to better
   maintain the parsed Root Port information.
2. Delete the pci_host_common_delete_ports() as now the Root Port list in
   pci_host_bridge can be cleared by pci_release_host_bridge_dev().
3. Change the common API pci_host_common_parse_ports() to pass down struct
   pci_host_bridge *.
4. Modify dw_pcie_host_init() to allow drivers to pre-allocate pci_host_bridge
   struct when needed.
5. Allocate bridge early in imx_pcie_probe() to parse Root Ports.

Changes in V4:
1. Add common helpers for parsing Root Port properties in pci-host-common.c in
   patch#2.
2. Call common pci_host_common_parse_ports() and pci_host_common_delete_ports()
   in pci-imx6 driver.
3. Use PCIE_T_PVPERL_MS and PCIE_RESET_CONFIG_WAIT_MS instead of magic number
   100 in patch#3 as Manivannan suggested.
4. Use "PERST#" instead of "PCIe reset" for the reset gpio label in patch#3.

Changes in V3:
1. Improve the patch#2 commit message as Frank suggested.
2. Add Reviewed-by tag for patch#1.

Changes in V2:
1. Improve the patch#1 commit message as Frank suggested.
2. Also mark the reset-gpio-active-high property as deprecated in
   imx6q-pcie DT binding as Rob suggested.
3. The imx_pcie_delete_ports() has been moved up so that the
   imx_pcie_parse_ports() can call this helper function in error handling.
4. Keep the old reset-gpio property in the host bridge node for the
   existing dts files and add comments to avoid confusion.
---

Sherry Sun (12):
  dt-bindings: PCI: fsl,imx6q-pcie: Add reset GPIO in Root Port node
  PCI: host-generic: Add common helpers for parsing Root Port properties
  PCI: imx6: Assert PERST# before enabling regulators
  PCI: imx6: Add support for parsing the reset property in new Root Port
    binding
  arm: dts: imx6qdl: Add Root Port node and PERST property
  arm: dts: imx6sx: Add Root Port node and PERST property
  arm: dts: imx7d: Add Root Port node and PERST property
  arm64: dts: imx8mm: Add Root Port node and PERST property
  arm64: dts: imx8mp: Add Root Port node and PERST property
  arm64: dts: imx8mq: Add Root Port node and PERST property
  arm64: dts: imx8dxl/qm/qxp: Add Root Port node and PERST property
  arm64: dts: imx95: Add Root Port node and PERST property

 .../bindings/pci/fsl,imx6q-pcie.yaml          |  32 +++++
 .../arm/boot/dts/nxp/imx/imx6qdl-sabresd.dtsi |   5 +
 arch/arm/boot/dts/nxp/imx/imx6qdl.dtsi        |  11 ++
 .../arm/boot/dts/nxp/imx/imx6qp-sabreauto.dts |   5 +
 arch/arm/boot/dts/nxp/imx/imx6sx-sdb.dtsi     |   5 +
 arch/arm/boot/dts/nxp/imx/imx6sx.dtsi         |  11 ++
 arch/arm/boot/dts/nxp/imx/imx7d-sdb.dts       |   5 +
 arch/arm/boot/dts/nxp/imx/imx7d.dtsi          |  11 ++
 .../boot/dts/freescale/imx8-ss-hsio.dtsi      |  11 ++
 arch/arm64/boot/dts/freescale/imx8dxl-evk.dts |   5 +
 arch/arm64/boot/dts/freescale/imx8mm-evk.dtsi |   5 +
 arch/arm64/boot/dts/freescale/imx8mm.dtsi     |  11 ++
 arch/arm64/boot/dts/freescale/imx8mp-evk.dts  |   5 +
 arch/arm64/boot/dts/freescale/imx8mp.dtsi     |  11 ++
 arch/arm64/boot/dts/freescale/imx8mq-evk.dts  |  10 ++
 arch/arm64/boot/dts/freescale/imx8mq.dtsi     |  22 ++++
 arch/arm64/boot/dts/freescale/imx8qm-mek.dts  |  10 ++
 .../boot/dts/freescale/imx8qm-ss-hsio.dtsi    |  22 ++++
 arch/arm64/boot/dts/freescale/imx8qxp-mek.dts |   5 +
 .../boot/dts/freescale/imx95-15x15-evk.dts    |   5 +
 .../boot/dts/freescale/imx95-19x19-evk.dts    |  10 ++
 arch/arm64/boot/dts/freescale/imx95.dtsi      |  22 ++++
 drivers/pci/controller/dwc/pci-imx6.c         | 117 ++++++++++++++----
 drivers/pci/controller/pci-host-common.c      | 104 ++++++++++++++++
 drivers/pci/controller/pci-host-common.h      |  16 +++
 drivers/pci/probe.c                           |   1 +
 include/linux/pci.h                           |   1 +
 27 files changed, 454 insertions(+), 24 deletions(-)

-- 
2.37.1


* Re: [PATCH v14 00/10] arm64: entry: Convert to Generic Entry
From: Jinjie Ruan @ 2026-04-10  2:16 UTC (permalink / raw)
  To: Mark Rutland
  Cc: peterz, catalin.marinas, ldv, song, edumazet, will, mingo, kees,
	thuth, ryan.roberts, arnd, anshuman.khandual, kevin.brodsky,
	pengcan, broonie, mathieu.desnoyers, luto, linux-arm-kernel, wad,
	linusw, oleg, linux-kernel, tglx, liqiang01, yeoreum.yun
In-Reply-To: <addW4dkDgul4UrBY@J2N7QTR9R3>



On 2026/4/9 15:36, Mark Rutland wrote:
> Hi Jinjie,
> 
> On Thu, Apr 09, 2026 at 02:29:04PM +0800, Jinjie Ruan wrote:
>> On 2026/3/20 18:26, Jinjie Ruan wrote:
>>> Currently, x86, Riscv, Loongarch use the Generic Entry which makes
>>> maintainers' work easier and codes more elegant. arm64 has already
>>> successfully switched to the Generic IRQ Entry in commit
>>> b3cf07851b6c ("arm64: entry: Switch to generic IRQ entry"), it is
>>> time to completely convert arm64 to Generic Entry.
>>>
>>> The goal is to bring arm64 in line with other architectures that already
>>> use the generic entry infrastructure, reducing duplicated code and
>>> making it easier to share future changes in entry/exit paths, such as
>>> "Syscall User Dispatch" and RSEQ optimizations.
>>
>> Hi,
>>
>> Just a quick ping to see if this series is good to go. Do I need to
>> provide a new version rebased on the latest arm64 for-next/generic-entry
>> branches, or is the current version acceptable?
> 
> Sorry, this is on my list to review, but I've been busy fixing other
> issues (e.g. the preemption issues from moving to generic irqentry), and
> haven't yet had the time to go over this in detail.
> 
> This series might be fine as-is, but given that it's incredibly easy to
> introduce subtle and hard-to-fix regressions to user ABI, I don't think
> we can assume that it's safe until we've gone over it thoroughly.
> 
> For the moment, no need to resend. If you haven't received comments by
> v7.1-rc1, please rebase and resend then.

Hi Mark,

Thank you for the update. I completely understand the sensitivity of the
entry code and the need for a thorough review to avoid any ABI regressions.

I will wait for your feedback and will only rebase/resend if I haven't
heard back by v7.1-rc1, as suggested.

Best regards,
Jinjie

> 
> Thanks,
> Mark.
> 
>>> This patch set is rebased on arm64 for-next/core. This series contains
>>> foundational updates for arm64. As suggested by Linus Walleij, these 10
>>> patches are being submitted separately for inclusion in the arm64 tree.
>>>
>>> And the performance benchmarks results on qemu-kvm are below:
>>>
>>> perf bench syscall usec/op (-ve is improvement)
>>>
>>> | Syscall | Base        | Generic Entry | change % |
>>> | ------- | ----------- | ------------- | -------- |
>>> | basic   | 0.123997    | 0.120872      | -2.57    |
>>> | execve  | 512.1173    | 504.9966      | -1.52    |
>>> | fork    | 114.1144    | 113.2301      | -1.06    |
>>> | getpgid | 0.120182    | 0.121245      | +0.9     |
>>>
>>> perf bench syscall ops/sec (+ve is improvement)
>>>
>>> | Syscall | Base     | Generic Entry| change % |
>>> | ------- | -------- | ------------ | -------- |
>>> | basic   | 8064712  | 8273212      | +2.48    |
>>> | execve  | 1952     | 1980         | +1.52    |
>>> | fork    | 8763     | 8832         | +1.06    |
>>> | getpgid | 8320704  | 8247810      | -0.9     |
>>>
>>> Therefore, the syscall performance variation ranges from a 1% regression
>>> to a 2.5% improvement.
>>>
>>> It was tested ok with following test cases on QEMU virt platform:
>>>  - Stress-ng CPU stress test.
>>>  - Hackbench stress test.
>>>  - "sud" selftest testcase.
>>>  - get_set_sud, get_syscall_info, set_syscall_info, peeksiginfo
>>>    in tools/testing/selftests/ptrace.
>>>  - breakpoint_test_arm64 in selftests/breakpoints.
>>>  - syscall-abi and ptrace in tools/testing/selftests/arm64/abi
>>>  - fp-ptrace, sve-ptrace, za-ptrace in selftests/arm64/fp.
>>>  - vdso_test_getrandom in tools/testing/selftests/vDSO
>>>  - Strace tests.
>>>  - slice_test for rseq optimizations.
>>>
>>> The test QEMU configuration is as follows:
>>>
>>> 	qemu-system-aarch64 \
>>> 		-M virt \
>>> 		-enable-kvm \
>>> 		-cpu host \
>>> 		-kernel Image \
>>> 		-smp 8 \
>>> 		-m 512m \
>>> 		-nographic \
>>> 		-no-reboot \
>>> 		-device virtio-rng-pci \
>>> 		-append "root=/dev/vda rw console=ttyAMA0 kgdboc=ttyAMA0,115200 \
>>> 			earlycon preempt=voluntary irqchip.gicv3_pseudo_nmi=1 audit=1" \
>>> 		-drive if=none,file=images/rootfs.ext4,format=raw,id=hd0 \
>>> 		-device virtio-blk-device,drive=hd0 \
>>>
>>> Changes in v14:
>>> - Initialize ret = 0 in syscall_trace_enter().
>>> - Split into two patch sets as Linus Walleij suggested, so this patch set
>>>   can be applied separately to the arm64 tree.
>>> - Rebased on arm64 for-next/core branch.
>>> - Collect Reviewed-by and Acked-by.
>>> - Link to v13 resend: https://lore.kernel.org/all/20260317082020.737779-15-ruanjinjie@huawei.com/
>>>
>>> Changes in v13 resend:
>>> - Fix exit_to_user_mode_prepare_legacy() issues.
>>> - Also move TIF_SINGLESTEP to generic TIF infrastructure for loongarch.
>>> - Use generic TIF bits for arm64 and moving TIF_SINGLESTEP to
>>>   generic TIF for related architectures separately.
>>> - Refactor syscall_trace_enter/exit() to accept flags and Use
>>>   syscall_get_nr() helper separately.
>>> - Tested with slice_test for rseq optimizations.
>>> - Add acked-by.
>>> - Link to v13: https://lore.kernel.org/all/20260313094738.3985794-1-ruanjinjie@huawei.com/
>>>
>>> Changes in v13:
>>> - Rebased on v7.0-rc3, so drop the first applied arm64 patch.
>>> - Use generic TIF bits to enables RSEQ optimization.
>>> - Update most of the commit message to make it more clear.
>>> - Link to v12: https://lore.kernel.org/all/20260203133728.848283-1-ruanjinjie@huawei.com/
>>>
>>> Changes in v12:
>>> - Rebased on "sched/core", so remove the four generic entry patches.
>>> - Move "Expand secure_computing() in place" and
>>>   "Use syscall_get_arguments() helper" patch forward, which will group all
>>>   non-functional cleanups at the front.
>>> - Adjust the explanation for moving rseq_syscall() before
>>>   audit_syscall_exit().
>>> - Link to v11: https://lore.kernel.org/all/20260128031934.3906955-1-ruanjinjie@huawei.com/
>>>
>>> Changes in v11:
>>> - Remove unused syscall in syscall_trace_enter().
>>> - Update and provide a detailed explanation of the differences after
>>>   moving rseq_syscall() before audit_syscall_exit().
>>> - Rebased on arm64 (for-next/entry), and remove the first applied 3 patches.
>>> - syscall_exit_to_user_mode_work() for arch reuse instead of adding
>>>   new syscall_exit_to_user_mode_work_prepare() helper.
>>> - Link to v10: https://lore.kernel.org/all/20251222114737.1334364-1-ruanjinjie@huawei.com/
>>>
>>> Changes in v10:
>>> - Rebased on v6.19-rc1, rename syscall_exit_to_user_mode_prepare() to
>>>   syscall_exit_to_user_mode_work_prepare() to avoid conflict.
>>> - Also inline syscall_trace_enter().
>>> - Support aarch64 for sud_benchmark.
>>> - Update and correct the commit message.
>>> - Add Reviewed-by.
>>> - Link to v9: https://lore.kernel.org/all/20251204082123.2792067-1-ruanjinjie@huawei.com/
>>>
>>> Changes in v9:
>>> - Move "Return early for ptrace_report_syscall_entry() error" patch ahead
>>>   to make it not introduce a regression.
>>> - Not check _TIF_SECCOMP/SYSCALL_EMU for syscall_exit_work() in
>>>   a separate patch.
>>> - Do not report_syscall_exit() for PTRACE_SYSEMU_SINGLESTEP in a separate
>>>   patch.
>>> - Add two performance patches to improve the arm64 performance.
>>> - Add Reviewed-by.
>>> - Link to v8: https://lore.kernel.org/all/20251126071446.3234218-1-ruanjinjie@huawei.com/
>>>
>>> Changes in v8:
>>> - Rename "report_syscall_enter()" to "report_syscall_entry()".
>>> - Add ptrace_save_reg() to avoid duplication.
>>> - Remove unused _TIF_WORK_MASK in a standalone patch.
>>> - Align syscall_trace_enter() return value with the generic version.
>>> - Use "scno" instead of regs->syscallno in el0_svc_common().
>>> - Move rseq_syscall() ahead in a standalone patch to clarify it clearly.
>>> - Rename "syscall_trace_exit()" to "syscall_exit_work()".
>>> - Keep the goto in el0_svc_common().
>>> - No argument was passed to __secure_computing() and check -1 not -1L.
>>> - Remove "Add has_syscall_work() helper" patch.
>>> - Move "Add syscall_exit_to_user_mode_prepare() helper" patch later.
>>> - Add miss header for asm/entry-common.h.
>>> - Update the implementation of arch_syscall_is_vdso_sigreturn().
>>> - Add "ARCH_SYSCALL_WORK_EXIT" to be defined as "SECCOMP | SYSCALL_EMU"
>>>   to keep the behaviour unchanged.
>>> - Add more testcases test.
>>> - Add Reviewed-by.
>>> - Update the commit message.
>>> - Link to v7: https://lore.kernel.org/all/20251117133048.53182-1-ruanjinjie@huawei.com/
>>>
>>> Jinjie Ruan (10):
>>>   arm64/ptrace: Refactor syscall_trace_enter/exit() to accept flags
>>>     parameter
>>>   arm64/ptrace: Use syscall_get_nr() helper for syscall_trace_enter()
>>>   arm64/ptrace: Expand secure_computing() in place
>>>   arm64/ptrace: Use syscall_get_arguments() helper for audit
>>>   arm64: ptrace: Move rseq_syscall() before audit_syscall_exit()
>>>   arm64: syscall: Introduce syscall_exit_to_user_mode_work()
>>>   arm64/ptrace: Define and use _TIF_SYSCALL_EXIT_WORK
>>>   arm64/ptrace: Skip syscall exit reporting for PTRACE_SYSEMU_SINGLESTEP
>>>   arm64: entry: Convert to generic entry
>>>   arm64: Inline el0_svc_common()
>>>
>>>  arch/arm64/Kconfig                    |   2 +-
>>>  arch/arm64/include/asm/entry-common.h |  76 +++++++++++++++++
>>>  arch/arm64/include/asm/syscall.h      |  19 ++++-
>>>  arch/arm64/include/asm/thread_info.h  |  16 +---
>>>  arch/arm64/kernel/debug-monitors.c    |   7 ++
>>>  arch/arm64/kernel/entry-common.c      |  25 ++++--
>>>  arch/arm64/kernel/ptrace.c            | 115 --------------------------
>>>  arch/arm64/kernel/signal.c            |   2 +-
>>>  arch/arm64/kernel/syscall.c           |  29 ++-----
>>>  include/linux/irq-entry-common.h      |   8 --
>>>  include/linux/rseq_entry.h            |  18 ----
>>>  11 files changed, 130 insertions(+), 187 deletions(-)
>>>
> 



* Re: [PATCH v14 00/10] arm64: entry: Convert to Generic Entry
From: Jinjie Ruan @ 2026-04-10  2:09 UTC (permalink / raw)
  To: Kees Cook
  Cc: mark.rutland, peterz, catalin.marinas, ldv, edumazet, will, mingo,
	thuth, ryan.roberts, arnd, anshuman.khandual, kevin.brodsky,
	pengcan, broonie, mathieu.desnoyers, luto, linux-arm-kernel, wad,
	song, linusw, oleg, linux-kernel, tglx, liqiang01, yeoreum.yun
In-Reply-To: <202604090906.28EA71F63@keescook>



On 2026/4/10 0:14, Kees Cook wrote:
> On Thu, Apr 09, 2026 at 02:29:04PM +0800, Jinjie Ruan wrote:
>> On 2026/3/20 18:26, Jinjie Ruan wrote:
>>> Currently, x86, Riscv, Loongarch use the Generic Entry which makes
>>> maintainers' work easier and codes more elegant. arm64 has already
>>> successfully switched to the Generic IRQ Entry in commit
>>> b3cf07851b6c ("arm64: entry: Switch to generic IRQ entry"), it is
>>> time to completely convert arm64 to Generic Entry.
>>>
>>> The goal is to bring arm64 in line with other architectures that already
>>> use the generic entry infrastructure, reducing duplicated code and
>>> making it easier to share future changes in entry/exit paths, such as
>>> "Syscall User Dispatch" and RSEQ optimizations.
>>
>> Just a quick ping to see if this series is good to go. Do I need to
>> provide a new version rebased on the latest arm64 for-next/generic-entry
>> branches, or is the current version acceptable?
> 
> One thing I see is Sashiko's comments on seccomp:
> https://sashiko.dev/#/patchset/20260320102620.1336796-1-ruanjinjie%40huawei.com
> where "ret", when not 0 or -1, will override the syscall number. While
> that's not currently possible, it'd be better to catch that, or rather,
> avoid the "ret ? : syscall" logic which isn't useful here. "ret" should
> probably be local to the "if (flags & _TIF_SECCOMP)" scope.

Might it be better to fix the identical logic in the generic entry
first, and then align arm64? Doing otherwise would cause the "arm64: entry:
Convert to generic entry" patch to create an unnecessary discrepancy.
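
For illustration only, the concern about "ret ? : syscall" can be reproduced
in a small userspace model. This is a sketch under stated assumptions, not the
arm64 or generic entry code: FLAG_SECCOMP stands in for _TIF_SECCOMP, both
function names are hypothetical, and the seccomp result is passed in as a
plain argument.

```c
#define FLAG_SECCOMP 0x1UL	/* hypothetical stand-in for _TIF_SECCOMP */

/*
 * Pattern under discussion: any non-zero ret replaces the syscall number.
 * "ret ? ret : syscall" is the portable spelling of GNU C "ret ? : syscall".
 */
static long trace_enter_elvis(unsigned long flags, long seccomp_ret,
			      long syscall)
{
	long ret = 0;

	if (flags & FLAG_SECCOMP)
		ret = seccomp_ret;

	return ret ? ret : syscall;
}

/* Alternative: keep ret local to the seccomp branch and only honour -1. */
static long trace_enter_scoped(unsigned long flags, long seccomp_ret,
			       long syscall)
{
	if (flags & FLAG_SECCOMP) {
		long ret = seccomp_ret;

		if (ret == -1)
			return -1;
	}

	return syscall;
}
```

With a seccomp result of 7 and syscall number 42, the first variant returns 7
while the second returns 42; both return -1 when the seccomp result is -1.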

> 



* Re: [PATCH v5 2/3] dt-bindings: mfd: aspeed,ast2x00-scu: Describe AST2700 SCU0
From: Billy Tsai @ 2026-04-10  2:00 UTC (permalink / raw)
  To: Rob Herring
  Cc: Krzysztof Kozlowski, Lee Jones, Krzysztof Kozlowski, Conor Dooley,
	Joel Stanley, Andrew Jeffery, Linus Walleij, Bartosz Golaszewski,
	Ryan Chen, Andrew Jeffery, devicetree@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-aspeed@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	openbmc@lists.ozlabs.org, linux-gpio@vger.kernel.org,
	linux-clk@vger.kernel.org
In-Reply-To: <20260408133114.GA1938858-robh@kernel.org>

> > > > AST2700 consists of two interconnected SoC instances, each with its own
> > > > System Control Unit (SCU). The SCU0 provides pin control, interrupt
> > > > controllers, clocks, resets, and address-space mappings for the
> > > > Secondary and Tertiary Service Processors (SSP and TSP).
> > > >
> > > > Describe the SSP/TSP address mappings using the standard
> > > > memory-region and memory-region-names properties.
> > > >
> > > > Disallow legacy child nodes that are not present on AST2700, including
> > > > p2a-control and smp-memram. The latter is unnecessary as software can
> > > > access the scratch registers via the SCU syscon.
> > > >
> > > > Also allow the AST2700 SoC0 pin controller to be described as a child
> > > > node of the SCU0, and add an example illustrating the SCU0 layout,
> > > > including reserved-memory, interrupt controllers, and pinctrl.
> > > >
> > > > Signed-off-by: Billy Tsai <billy_tsai@aspeedtech.com>
> > > > ---
> > > >  .../bindings/mfd/aspeed,ast2x00-scu.yaml           | 117 +++++++++++++++++++++
> > > >  1 file changed, 117 insertions(+)
> > > >
> > > > diff --git a/Documentation/devicetree/bindings/mfd/aspeed,ast2x00-scu.yaml b/Documentation/devicetree/bindings/mfd/aspeed,ast2x00-scu.yaml
> > > > index a87f31fce019..86d51389689c 100644
> > > > --- a/Documentation/devicetree/bindings/mfd/aspeed,ast2x00-scu.yaml
> > > > +++ b/Documentation/devicetree/bindings/mfd/aspeed,ast2x00-scu.yaml
> > > > @@ -46,6 +46,9 @@ properties:
> > > >    '#reset-cells':
> > > >      const: 1
> > > >
> > > > +  memory-region: true
> > > > +  memory-region-names: true
> >
> > > Missing constraints. From where did you take such syntax (so I can fix
> > > it)?
> >
> > The intention was to constrain these properties conditionally for
> > AST2700 SCU0 as done further down in the patch.
> >
> > I can update the binding so that memory-region and memory-region-names
> > have baseline constraints (e.g. minItems and maxItems), and then refine them in the
> > conditional branches for AST2700SCU0, AST2700SCU1 and others
> >
> >   memory-region:
> >     minItems: 2
> >     maxItems: 3
> >   memory-region-names:
> >     minItems: 2
> >     maxItems: 3

> As of this patch, you don't need that. You can just define the regions
> and names at the top-level. And the conditional schema only needs to
> disallow them for the appropriate case.

Based on your suggestion, I will simplify the schema and define
memory-region and memory-region-names at the top-level without item
constraints, and only disallow them for the non-AST2700 cases.

The updated structure would look like:

    memory-region:
      description:
        Reserved memory regions used by AST2700 SCU to configure
        coprocessor address mapping windows.

    memory-region-names:
      description:
        Names corresponding to the AST2700 coprocessor mapping windows
        listed in memory-region.

    ...

    - if:
        properties:
          compatible:
            contains:
              anyOf:
                - const: aspeed,ast2700-scu0
                - const: aspeed,ast2700-scu1
      then:
        patternProperties:
          '^p2a-control@[0-9a-f]+$': false
          '^smp-memram@[0-9a-f]+$': false
      else:
        properties:
          memory-region: false
          memory-region-names: false

Does this match what you had in mind?

Thanks

Billy Tsai


* Re: [PATCH v4 10/10] iommu/arm-smmu-v3: Allow sharing domain across SMMUs
From: Nicolin Chen @ 2026-04-10  1:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: will, robin.murphy, joro, jpb, praan, smostafa, linux-arm-kernel,
	iommu, linux-kernel, linux-tegra, jonathan.cameron
In-Reply-To: <20260410003624.GE3357077@nvidia.com>

On Thu, Apr 09, 2026 at 09:36:24PM -0300, Jason Gunthorpe wrote:
> On Thu, Apr 09, 2026 at 09:32:23PM -0300, Jason Gunthorpe wrote:

> Though something else is missing here, I expected this to be removed too:
> 
> struct arm_smmu_domain {
> 	struct arm_smmu_device		*smmu;

Ah, I didn't look into that very closely, as I vaguely recall that
we planned another series to clean this up.

> What is left using it?
> 
> static int arm_smmu_s1_set_dev_pasid(struct iommu_domain *domain,
> 				     struct device *dev, ioasid_t id,
> 				     struct iommu_domain *old)
> 
> int arm_smmu_set_pasid(struct arm_smmu_master *master,
> 		       struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
> 		       struct arm_smmu_cd *cd, struct iommu_domain *old,
> 		       arm_smmu_make_cd_fn make_cd_fn)
> 
> Those should use the new helper right? It should work for a S1 domain
> too.

Yes.

> static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
> {
> 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> 
> 	if (smmu_domain->smmu)
> 		arm_smmu_tlb_inv_context(smmu_domain);
> }
> 
> I suspect that is just dead code now, it is from before finalize was
> part of alloc?

It seems so. The invalidation doesn't need smmu_domain->smmu
anyway.

I will clean this up in v5.

Thanks
Nicolin



* Re: [RFC PATCH 5/8] mm/vmalloc: map contiguous pages in batches for vmap() if possible
From: Barry Song @ 2026-04-10  1:02 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Dev Jain, linux-mm, linux-arm-kernel, catalin.marinas, will, akpm,
	linux-kernel, anshuman.khandual, ryan.roberts, ajd, rppt, david,
	Xueyuan.chen21
In-Reply-To: <add9XGTYL9sYilqO@milan>

On Thu, Apr 9, 2026 at 6:20 PM Uladzislau Rezki <urezki@gmail.com> wrote:
[...]
> >
> It would be good if you could combine the work together with Jain.
>

Sure, thanks! After discussing with Dev Jain, we’ll also
support non-compound pages in the next version.

> --
> Uladzislau Rezki



* Re: [PATCH 3/3] pmdomain: arm_scmi: add support for domain hierarchies
From: Kevin Hilman @ 2026-04-10  1:01 UTC (permalink / raw)
  To: Dhruva Gole
  Cc: Ulf Hansson, Rob Herring, Geert Uytterhoeven, linux-pm,
	devicetree, linux-kernel, arm-scmi, linux-arm-kernel
In-Reply-To: <20260313120707.jhkyd772wzuwmlhd@lcpd911>

Dhruva Gole <d-gole@ti.com> writes:

> On Mar 10, 2026 at 17:19:25 -0700, Kevin Hilman (TI) wrote:
>> After primary SCMI pmdomain is created, use new of_genpd helper which
>> checks for child domain mappings defined in power-domains-child-ids.
>> 
>> Also remove any child domain mappings when SCMI domain is removed.
>> 
>> Signed-off-by: Kevin Hilman (TI) <khilman@baylibre.com>
>> ---
>
> Again, since it worked fine on my AM62L,
> Tested-by: Dhruva Gole <d-gole@ti.com>

Thanks for testing & reviewing!

> But I had some thoughts further down...
>
>>  drivers/pmdomain/arm/scmi_pm_domain.c | 14 +++++++++++++-
>>  1 file changed, 13 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/pmdomain/arm/scmi_pm_domain.c b/drivers/pmdomain/arm/scmi_pm_domain.c
>> index b5e2ffd5ea64..9d8faef44aa9 100644
>> --- a/drivers/pmdomain/arm/scmi_pm_domain.c
>> +++ b/drivers/pmdomain/arm/scmi_pm_domain.c
>> @@ -114,6 +114,14 @@ static int scmi_pm_domain_probe(struct scmi_device *sdev)
>>  
>>  	dev_set_drvdata(dev, scmi_pd_data);
>>  
>> +	/*
>> +	 * Parse (optional) power-domains-child-ids property to
>> +	 * establish parent-child relationships
>> +	 */
>> +	ret = of_genpd_add_child_ids(np, scmi_pd_data);
>> +	if (ret < 0 && ret != -ENOENT)
>> +		pr_err("Failed to parse power-domains-child-ids for %pOF: %d\n", np, ret);
>
> Nit: I think the style of this driver is to use dev_err than pr_err

Agreed.

> Also, maybe a dev_warn makes more sense since we're not even returning
> the error or doing anything different if we get certain error path.

OK.

> I am wondering if it makes sense to just abort the whole idea of
> creating power-domain child ids if anything goes wrong?
>
> Basically just of_genpd_remove_child_ids if we face a condition where we
> have different number of parents/ children or id > num etc...
>
> All are error cases where the system behaviour can go on to become very
> unpredictable if we end up making a false/ incomplete parent-child ID
> map.
>
> Thoughts?

I agree.  After thinking through some of Ulf's suggestions on the
different error handling ideas, I think this should really be "all or
nothing".  If we cannot parse & add all the children in the list, we
should add none of them.  I think partial additions will become
unwieldy to manage rather quickly, and require the pmdomain core to keep
state.
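
For illustration only, an "all or nothing" add with rollback could look like
the userspace sketch below. It is not the genpd implementation: struct
parent_domain, add_child() and add_children_all_or_nothing() are hypothetical
names, and child domains are modelled as plain integer IDs.

```c
#include <stddef.h>

#define MAX_CHILDREN 8

/* Hypothetical model of a parent domain; not the genpd data structure. */
struct parent_domain {
	unsigned int num_domains;	/* valid child IDs are 0..num_domains-1 */
	unsigned int children[MAX_CHILDREN];
	size_t num_children;
};

/* Add one child, rejecting out-of-range IDs and a full table. */
static int add_child(struct parent_domain *pd, unsigned int id)
{
	if (id >= pd->num_domains || pd->num_children >= MAX_CHILDREN)
		return -1;

	pd->children[pd->num_children++] = id;
	return 0;
}

/* Add every ID in the list, or leave the parent exactly as it was. */
static int add_children_all_or_nothing(struct parent_domain *pd,
				       const unsigned int *ids, size_t n)
{
	size_t saved = pd->num_children;
	size_t i;

	for (i = 0; i < n; i++) {
		if (add_child(pd, ids[i]) < 0) {
			pd->num_children = saved;	/* roll back partial additions */
			return -1;
		}
	}

	return 0;
}
```

Because the rollback only restores a saved count, the caller never observes a
partially populated list and no extra state has to be kept between calls.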

Kevin



* Re: [PATCH v1] phy: rockchip-snps-pcie3:phy: Configure clkreq_n and PowerDown for all lanes
From: Shawn Lin @ 2026-04-10  0:46 UTC (permalink / raw)
  To: Anand Moon, Vinod Koul, Neil Armstrong, Heiko Stuebner,
	open list:GENERIC PHY FRAMEWORK,
	moderated list:ARM/Rockchip SoC support,
	open list:ARM/Rockchip SoC support, open list
  Cc: shawn.lin, Niklas Cassel
In-Reply-To: <20260409044939.7647-1-linux.amoon@gmail.com>

Hi Anand

On Thursday, 2026/04/09 12:49, Anand Moon wrote:
> During the rk3588_p3phy_init sequence, the driver now explicitly
> configures each lane's CON0 register to ensure
> - PIPE 4.3 Compliance: clkreq_n (bit 6) is forced low (asserted) to meet
>    sideband signal requirements.

clkreq_n is now forcibly asserted by the controller driver if
supports_clkreq is not set.

> - Active Power State: PowerDown[3:0] (bits 11:8) is set to P0
>    (Normal Operational State) to ensure the PHY is fully powered and ready
>    for link training.
> 

P0 is the natural state when the link comes up. I don't see why it should be
P0 before we even know whether a device is present.
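
For illustration only, the register value described in the commit message can
be worked out in userspace C. This is a sketch under assumptions, not the
patch itself: it assumes the usual GRF convention that the upper 16 bits are
a write-enable mask for the lower 16 (consistent with the "<< 16" enable
macros in the quoted diff), and that P0 is encoded as 0. All names below are
hypothetical.

```c
#include <stdint.h>

#define CLKREQ_N_BIT	(1u << 6)	/* clkreq_n, bit 6 per the commit message */
#define POWERDOWN_MASK	(0xfu << 8)	/* PowerDown[3:0], bits 11:8 */
#define POWERDOWN_P0	0x0u		/* assumed encoding of P0 */

/* Build a write-masked value: the upper 16 bits select which bits to update. */
static uint32_t hiword_update(uint32_t mask, uint32_t val)
{
	return (mask << 16) | (val & mask);
}

/* Value that would drive clkreq_n low and set PowerDown to P0 in one write. */
static uint32_t lane_con0_value(void)
{
	uint32_t mask = CLKREQ_N_BIT | POWERDOWN_MASK;
	uint32_t val = POWERDOWN_P0 << 8;	/* clkreq_n bit left at 0 (low) */

	return hiword_update(mask, val);
}
```

Under these assumptions only bits 6 and 11:8 are write-enabled and every data
bit is 0, so the other fields in the register are left untouched.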

> These changes ensure that all lanes are consistently transitioned from
> reset into a known-good operational state, preventing undefined behavior
> and ensuring the PHY is ready for high-speed data transmission.
> 
> Cc: Niklas Cassel <cassel@kernel.org>
> Signed-off-by: Anand Moon <linux.amoon@gmail.com>
> ---
>   .../phy/rockchip/phy-rockchip-snps-pcie3.c    | 28 +++++++++++++++++--
>   1 file changed, 26 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/phy/rockchip/phy-rockchip-snps-pcie3.c b/drivers/phy/rockchip/phy-rockchip-snps-pcie3.c
> index 4e8ffd173096..f46e13e79a0e 100644
> --- a/drivers/phy/rockchip/phy-rockchip-snps-pcie3.c
> +++ b/drivers/phy/rockchip/phy-rockchip-snps-pcie3.c
> @@ -7,6 +7,7 @@
>   
>   #include <linux/clk.h>
>   #include <linux/delay.h>
> +#include <linux/hw_bitfield.h>
>   #include <linux/io.h>
>   #include <linux/iopoll.h>
>   #include <linux/kernel.h>
> @@ -35,10 +36,14 @@
>   #define RK3588_PCIE3PHY_GRF_CMN_CON0		0x0
>   #define RK3588_PCIE3PHY_GRF_PHY0_STATUS1	0x904
>   #define RK3588_PCIE3PHY_GRF_PHY1_STATUS1	0xa04
> +#define RK3588_PCIE3PHY_GRF_PHY0_LN0_CON0	0x1000
>   #define RK3588_PCIE3PHY_GRF_PHY0_LN0_CON1	0x1004
>   #define RK3588_PCIE3PHY_GRF_PHY0_LN1_CON1	0x1104
> +#define RK3588_PCIE3PHY_GRF_PHY0_LN1_CON0	0x1100
> +#define RK3588_PCIE3PHY_GRF_PHY1_LN0_CON0	0x2000
>   #define RK3588_PCIE3PHY_GRF_PHY1_LN0_CON1	0x2004
>   #define RK3588_PCIE3PHY_GRF_PHY1_LN1_CON1	0x2104
> +#define RK3588_PCIE3PHY_GRF_PHY1_LN1_CON0	0x2100
>   #define RK3588_SRAM_INIT_DONE(reg)		(reg & BIT(0))
>   
>   #define RK3588_BIFURCATION_LANE_0_1		BIT(0)
> @@ -49,6 +54,13 @@
>   #define RK3588_PCIE1LN_SEL_EN			(GENMASK(1, 0) << 16)
>   #define RK3588_PCIE30_PHY_MODE_EN		(GENMASK(2, 0) << 16)
>   
> +static const u32 rk3588_lane_con0[] = {
> +	RK3588_PCIE3PHY_GRF_PHY0_LN0_CON0,
> +	RK3588_PCIE3PHY_GRF_PHY0_LN1_CON0,
> +	RK3588_PCIE3PHY_GRF_PHY1_LN0_CON0,
> +	RK3588_PCIE3PHY_GRF_PHY1_LN1_CON0,
> +};
> +
>   struct rockchip_p3phy_ops;
>   
>   struct rockchip_p3phy_priv {
> @@ -142,7 +154,7 @@ static int rockchip_p3phy_rk3588_init(struct rockchip_p3phy_priv *priv)
>   {
>   	u32 reg = 0;
>   	u8 mode = RK3588_LANE_AGGREGATION; /* default */
> -	int ret;
> +	int ret, i;
>   
>   	regmap_write(priv->phy_grf, RK3588_PCIE3PHY_GRF_PHY0_LN0_CON1,
>   		     priv->rx_cmn_refclk_mode[0] ? RK3588_RX_CMN_REFCLK_MODE_EN :
> @@ -161,7 +173,7 @@ static int rockchip_p3phy_rk3588_init(struct rockchip_p3phy_priv *priv)
>   	regmap_write(priv->phy_grf, RK3588_PCIE3PHY_GRF_CMN_CON0, BIT(8) | BIT(24));
>   
>   	/* Set bifurcation if needed */
> -	for (int i = 0; i < priv->num_lanes; i++) {
> +	for (i = 0; i < priv->num_lanes; i++) {
>   		if (priv->lanes[i] > 1)
>   			mode &= ~RK3588_LANE_AGGREGATION;
>   		if (priv->lanes[i] == 3)
> @@ -174,6 +186,18 @@ static int rockchip_p3phy_rk3588_init(struct rockchip_p3phy_priv *priv)
>   	regmap_write(priv->phy_grf, RK3588_PCIE3PHY_GRF_CMN_CON0,
>   		     RK3588_PCIE30_PHY_MODE_EN | reg);
>   
> +	for (i = 0; i < priv->num_lanes && i < ARRAY_SIZE(rk3588_lane_con0); i++) {
> +		u32 base = rk3588_lane_con0[i];
> +
> +		/* clkreq_n = 0 (asserted low for PIPE 4.3) */
> +		regmap_write(priv->phy_grf, base,
> +			     FIELD_PREP_WM16(BIT(6), 0));
> +
> +		/* PowerDown = P0 (0x0, fully active) */
> +		regmap_write(priv->phy_grf, base,
> +			     FIELD_PREP_WM16(GENMASK(11, 8), 0x0));
> +	}
> +
>   	/* Set pcie1ln_sel in PHP_GRF_PCIESEL_CON */
>   	if (!IS_ERR(priv->pipe_grf)) {
>   		reg = mode & (RK3588_BIFURCATION_LANE_0_1 | RK3588_BIFURCATION_LANE_2_3);
> 
> base-commit: 7f87a5ea75f011d2c9bc8ac0167e5e2d1adb1594
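
For readers following the hunk above, here is a minimal userspace sketch of
the write-mask convention that FIELD_PREP_WM16() is understood to target. It
assumes the usual Rockchip GRF layout, in which the upper 16 bits of a write
select which of the lower 16 bits are updated. wm16_prep() and wm16_apply()
are hypothetical stand-ins written for illustration, not kernel functions.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for FIELD_PREP_WM16(): place val into the field
 * described by mask (low half) and copy mask into the high half so the
 * hardware is told which bits to update.
 */
static uint32_t wm16_prep(uint16_t mask, uint16_t val)
{
	unsigned int shift = 0;
	uint32_t field;

	/* Find the position of the lowest set bit in mask. */
	while (shift < 16 && !((mask >> shift) & 1))
		shift++;

	field = ((uint32_t)val << shift) & mask;
	return ((uint32_t)mask << 16) | field;
}

/* Model of how such a register is assumed to apply a 32-bit write. */
static uint16_t wm16_apply(uint16_t old, uint32_t write)
{
	uint16_t en = (uint16_t)(write >> 16);
	uint16_t data = (uint16_t)write;

	/* Only bits enabled in the high half take the new value. */
	return (uint16_t)((old & ~en) | (data & en));
}
```

Under this model the two enable masks in the hunk (bit 6 and bits 11:8) do
not overlap, so the two writes per lane could in principle be ORed into a
single write with the same effect.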


^ permalink raw reply

* Re: [PATCH 2/3] pmdomain: core: add support for power-domains-child-ids
From: Kevin Hilman @ 2026-04-10  0:45 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rob Herring, Geert Uytterhoeven, linux-pm, devicetree,
	linux-kernel, arm-scmi, linux-arm-kernel
In-Reply-To: <CAPDyKFquJ7K4NcWuKMr1sjrnFVVPGAeLCiSF_FhvJf9Frbn1uA@mail.gmail.com>

Ulf Hansson <ulf.hansson@linaro.org> writes:

> On Wed, 11 Mar 2026 at 01:19, Kevin Hilman (TI) <khilman@baylibre.com> wrote:
>>
>> Currently, PM domains can only support hierarchy for simple
>> providers (e.g. ones with #power-domain-cells = 0).
>>
>> Add support for onecell providers as well by adding a new property
>> `power-domains-child-ids` to describe the parent/child relationship.
>>
>> For example, an SCMI PM domain provider has multiple domains, each of
>> which might be a child of different parent domains. In this example,
>> the parent domains are MAIN_PD and WKUP_PD:
>>
>>     scmi_pds: protocol@11 {
>>         reg = <0x11>;
>>         #power-domain-cells = <1>;
>>         power-domains = <&MAIN_PD>, <&WKUP_PD>;
>>         power-domains-child-ids = <15>, <19>;
>>     };
>>
>> With this example using the new property, SCMI PM domain 15 becomes a
>> child domain of MAIN_PD, and SCMI domain 19 becomes a child domain of
>> WKUP_PD.
>>
>> To support this feature, add two new core functions
>>
>> - of_genpd_add_child_ids()
>> - of_genpd_remove_child_ids()
>>
>> which can be called by pmdomain providers to add/remove child domains
>> if they support the new property power-domains-child-ids.
>>
>> Signed-off-by: Kevin Hilman (TI) <khilman@baylibre.com>
>
> Thanks for working on this! It certainly is a missing feature!

You're welcome, thanks for the detailed review.

>> ---
>>  drivers/pmdomain/core.c   | 169 +++++++++++++++++++++++++++++++++++++++++++++
>>  include/linux/pm_domain.h |  16 ++++++++++++++++
>>  2 files changed, 185 insertions(+)
>>
>> diff --git a/drivers/pmdomain/core.c b/drivers/pmdomain/core.c
>> index 61c2277c9ce3..acb45dd540b7 100644
>> --- a/drivers/pmdomain/core.c
>> +++ b/drivers/pmdomain/core.c
>> @@ -2909,6 +2909,175 @@ static struct generic_pm_domain *genpd_get_from_provider(
>>         return genpd;
>>  }
>>
>> +/**
>> + * of_genpd_add_child_ids() - Parse power-domains-child-ids property
>> + * @np: Device node pointer associated with the PM domain provider.
>> + * @data: Pointer to the onecell data associated with the PM domain provider.
>> + *
>> + * Parse the power-domains and power-domains-child-ids properties to establish
>> + * parent-child relationships for PM domains. The power-domains property lists
>> + * parent domains, and power-domains-child-ids lists which child domain IDs
>> + * should be associated with each parent.
>> + *
>> + * Returns 0 on success, -ENOENT if properties don't exist, or negative error code.
>
> I think we should avoid returning specific error codes for specific
> errors, simply because it usually becomes messy.
>
> If I understand correctly the intent here is to allow the caller to
> check for -ENOENT and potentially avoid bailing out as it may not
> really be an error, right?

Right, -ENOENT is not an error of parsing, it's to indicate that there
are no child-ids to be parsed.

> Perhaps a better option is to return the number of children for whom
> we successfully assigned parents. Hence 0 or a positive value allows
> the caller to understand what happened. More importantly, a negative
> error code then really becomes an error for the caller to consider.

I explored this a bit, but it gets messy quick.  It means we have to
track cases where only some of the children were added as well as when
all children were added.   Personally, I think this should be an "all or
nothing" thing.  If all the children cannot be parsed/added, then none
of them should be added.

This also allows the remove to not have to care about how many were
added, and just remove them all, with the additional benefit of not
having to track the state of how many children were successfully added.

>> + */
>> +int of_genpd_add_child_ids(struct device_node *np,
>> +                          struct genpd_onecell_data *data)
>> +{
>> +       struct of_phandle_args parent_args;
>> +       struct generic_pm_domain *parent_genpd, *child_genpd;
>> +       struct of_phandle_iterator it;
>> +       const struct property *prop;
>> +       const __be32 *item;
>> +       u32 child_id;
>> +       int ret;
>> +
>> +       /* Check if both properties exist */
>> +       if (of_count_phandle_with_args(np, "power-domains", "#power-domain-cells") <= 0)
>> +               return -ENOENT;
>> +
>> +       prop = of_find_property(np, "power-domains-child-ids", NULL);
>> +       if (!prop)
>> +               return -ENOENT;
>> +
>> +       item = of_prop_next_u32(prop, NULL, &child_id);
>
> Perhaps it's easier to check if of_property_count_u32_elems() returns
> the same number as of_count_phandle_with_args() above? If it doesn't,
> something is wrong, and there is no need to continue.

Agreed. Will add.

> This way you also know the number of loops upfront that must iterate
> through all indexes. This should allow us to use a simpler for-loop
> below, I think. In this case you can also use
> of_property_read_u32_index() instead.

OK.
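
A minimal userspace sketch of the check suggested above: compare the two
counts up front, then walk the child IDs with a plain indexed loop. The
struct pd_props type and check_child_ids() are hypothetical stand-ins for
the device-tree properties and are not part of the patch; in the kernel the
counts would come from of_count_phandle_with_args() and
of_property_count_u32_elems().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two device-tree properties. */
struct pd_props {
	const int *parents;		/* models the "power-domains" entries */
	int num_parents;		/* models the phandle count */
	const uint32_t *child_ids;	/* models "power-domains-child-ids" */
	int num_child_ids;		/* models the u32 element count */
};

/*
 * Returns 0 if the lists pair up and every ID is in range, -ENOENT if
 * either property is absent, -EINVAL on any mismatch.
 */
static int check_child_ids(const struct pd_props *p, uint32_t num_domains)
{
	int i;

	if (p->num_parents <= 0 || p->num_child_ids <= 0)
		return -ENOENT;

	/* Different lengths mean the map is malformed; stop before adding. */
	if (p->num_parents != p->num_child_ids)
		return -EINVAL;

	/* The loop bound is known up front, so a simple for-loop suffices. */
	for (i = 0; i < p->num_child_ids; i++) {
		if (p->child_ids[i] >= num_domains)
			return -EINVAL;
	}

	return 0;
}
```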

>> +
>> +       /* Iterate over power-domains phandles and power-domains-child-ids in lockstep */
>> +       of_for_each_phandle(&it, ret, np, "power-domains", "#power-domain-cells", 0) {
>> +               if (!item) {
>> +                       pr_err("power-domains-child-ids shorter than power-domains for %pOF\n", np);
>> +                       ret = -EINVAL;
>> +                       goto err_put_node;
>> +               }
>> +
>> +               /*
>> +                * Fill parent_args from the iterator. it.node is released by
>> +                * the next of_phandle_iterator_next() call at the top of the
>> +                * loop, or by the of_node_put() on the error path below.
>> +                */
>> +               parent_args.np = it.node;
>> +               parent_args.args_count = of_phandle_iterator_args(&it, parent_args.args,
>> +                                                                 MAX_PHANDLE_ARGS);
>> +
>> +               /* Get the parent domain */
>> +               parent_genpd = genpd_get_from_provider(&parent_args);
>
> Before getting the parent_genpd like this, we need to take the
> gpd_list_lock. The lock must be held when genpd_add_subdomain() is
> being called.

Good catch, thanks.

>> +               if (IS_ERR(parent_genpd)) {
>> +                       pr_err("Failed to get parent domain for %pOF: %ld\n",
>> +                              np, PTR_ERR(parent_genpd));
>> +                       ret = PTR_ERR(parent_genpd);
>> +                       goto err_put_node;
>> +               }
>> +
>> +               /* Validate child ID is within bounds */
>> +               if (child_id >= data->num_domains) {
>> +                       pr_err("Child ID %u out of bounds (max %u) for %pOF\n",
>> +                              child_id, data->num_domains - 1, np);
>> +                       ret = -EINVAL;
>> +                       goto err_put_node;
>> +               }
>> +
>> +               /* Get the child domain */
>> +               child_genpd = data->domains[child_id];
>> +               if (!child_genpd) {
>> +                       pr_err("Child domain %u is NULL for %pOF\n", child_id, np);
>> +                       ret = -EINVAL;
>> +                       goto err_put_node;
>> +               }
>> +
>> +               /* Establish parent-child relationship */
>> +               ret = genpd_add_subdomain(parent_genpd, child_genpd);
>> +               if (ret) {
>> +                       pr_err("Failed to add child domain %u to parent in %pOF: %d\n",
>> +                              child_id, np, ret);
>> +                       goto err_put_node;
>> +               }
>> +
>> +               pr_debug("Added child domain %u (%s) to parent %s for %pOF\n",
>> +                        child_id, child_genpd->name, parent_genpd->name, np);
>> +
>> +               item = of_prop_next_u32(prop, item, &child_id);
>> +       }
>> +
>> +       /* of_for_each_phandle returns -ENOENT at natural end-of-list */
>> +       if (ret && ret != -ENOENT)
>> +               return ret;
>> +
>> +       /* All power-domains phandles were consumed; check for trailing child IDs */
>> +       if (item) {
>> +               pr_err("power-domains-child-ids longer than power-domains for %pOF\n", np);
>> +               return -EINVAL;
>> +       }
>> +
>> +       return 0;
>> +
>> +err_put_node:
>
> This isn't sufficient error handling.
>
> If we successfully added child domains using genpd_add_subdomain(), we
> must remove them here, by calling pm_genpd_remove_subdomain() in the
> reverse order as we just added them.

OK, I was relying on the remove function to cleanup, but you're right,
if there's a failure during the add, it should be unwound before
returning.
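
A rough sketch of the unwind being discussed, in plain userspace C. The
link_table type and the add_link()/remove_link() helpers are hypothetical
stand-ins for genpd_add_subdomain() and pm_genpd_remove_subdomain(); the
point is only the shape of the error path, which removes already-added
links in reverse order so the caller sees all or nothing.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_LINKS 8

/* Hypothetical stand-in for the parent/child links being tracked. */
struct link_table {
	bool linked[MAX_LINKS];
	int fail_at;	/* index at which add_link() fails, or -1 for never */
};

static int add_link(struct link_table *t, int i)
{
	if (i == t->fail_at)
		return -EINVAL;
	t->linked[i] = true;
	return 0;
}

static void remove_link(struct link_table *t, int i)
{
	t->linked[i] = false;
}

/* Add links 0..count-1; on failure, undo the ones already added. */
static int add_all_links(struct link_table *t, int count)
{
	int i, ret;

	if (count > MAX_LINKS)
		return -EINVAL;

	for (i = 0; i < count; i++) {
		ret = add_link(t, i);
		if (ret)
			goto err_unwind;
	}
	return 0;

err_unwind:
	/* i is the index that failed; walk back over the successful ones. */
	while (--i >= 0)
		remove_link(t, i);
	return ret;
}

static int count_links(const struct link_table *t)
{
	int i, n = 0;

	for (i = 0; i < MAX_LINKS; i++)
		n += t->linked[i] ? 1 : 0;
	return n;
}
```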

>> +       of_node_put(it.node);
>> +       return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(of_genpd_add_child_ids);
>> +
>> +/**
>> + * of_genpd_remove_child_ids() - Remove parent-child PM domain relationships
>> + * @np: Device node pointer associated with the PM domain provider.
>> + * @data: Pointer to the onecell data associated with the PM domain provider.
>> + *
>> + * Reverses the effect of of_genpd_add_child_ids() by parsing the same
>> + * power-domains and power-domains-child-ids properties and calling
>> + * pm_genpd_remove_subdomain() for each established relationship.
>> + *
>> + * Returns 0 on success, -ENOENT if properties don't exist, or negative error
>> + * code on failure.
>> + */
>> +int of_genpd_remove_child_ids(struct device_node *np,
>> +                          struct genpd_onecell_data *data)
>> +{
>> +       struct of_phandle_args parent_args;
>> +       struct generic_pm_domain *parent_genpd, *child_genpd;
>> +       struct of_phandle_iterator it;
>> +       const struct property *prop;
>> +       const __be32 *item;
>> +       u32 child_id;
>> +       int ret;
>> +
>> +       /* Check if both properties exist */
>> +       if (of_count_phandle_with_args(np, "power-domains", "#power-domain-cells") <= 0)
>> +               return -ENOENT;
>> +
>> +       prop = of_find_property(np, "power-domains-child-ids", NULL);
>> +       if (!prop)
>> +               return -ENOENT;
>> +
>> +       item = of_prop_next_u32(prop, NULL, &child_id);
>
> Similar comments as for of_genpd_add_child_ids().
>
> Moreover, I think we should remove the children in the reverse order
> of how we added them.

I'm curious: why does the order matter?  The children are all siblings
(no hierarchy), so why would the order be important?

I'm not aware of a phandle iterator/helper to parse in the reverse, so
that would mean iterating once to create a list, and then walking it in
reverse.  Seems unnecessary.

Thanks again for the detailed review,

Kevin


^ permalink raw reply

* Re: [PATCH v4 10/10] iommu/arm-smmu-v3: Allow sharing domain across SMMUs
From: Jason Gunthorpe @ 2026-04-10  0:36 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: will, robin.murphy, joro, jpb, praan, smostafa, linux-arm-kernel,
	iommu, linux-kernel, linux-tegra, jonathan.cameron
In-Reply-To: <20260410003223.GD3357077@nvidia.com>

On Thu, Apr 09, 2026 at 09:32:23PM -0300, Jason Gunthorpe wrote:
> On Thu, Mar 19, 2026 at 12:51:56PM -0700, Nicolin Chen wrote:
> > @@ -987,6 +988,32 @@ struct arm_smmu_nested_domain {
> >  	__le64 ste[2];
> >  };
> >  
> > +static inline bool
> > +arm_smmu_domain_can_share(struct arm_smmu_domain *smmu_domain,
> > +			  struct arm_smmu_device *new_smmu)
> > +{
> 
> Probably a bit big for an inline
> 
> > +	struct io_pgtable *pgtbl =
> > +		io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
> > +
> > +	if (pgtbl->fmt == ARM_64_LPAE_S1 &&
> > +	    !(new_smmu->features & ARM_SMMU_FEAT_TRANS_S1))
> > +		return false;
> > +	if (pgtbl->fmt == ARM_64_LPAE_S2 &&
> > +	    !(new_smmu->features & ARM_SMMU_FEAT_TRANS_S2))
> > +		return false;
> > +	if (pgtbl->cfg.pgsize_bitmap & ~new_smmu->pgsize_bitmap)
> > +		return false;
> 
> I think this should check the lowest set bit in the
> domain->pgsize_bitmap is set in new_smmu->pgsize_bitmap
> 
> ie that the selected tg is supported.
> 
> The cfg.pgsize_bitmap is something a little different IIRC
> 
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Though something else is missing here, I expected this to be removed too:

struct arm_smmu_domain {
	struct arm_smmu_device		*smmu;

What is left using it?

static int arm_smmu_s1_set_dev_pasid(struct iommu_domain *domain,
				     struct device *dev, ioasid_t id,
				     struct iommu_domain *old)

int arm_smmu_set_pasid(struct arm_smmu_master *master,
		       struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
		       struct arm_smmu_cd *cd, struct iommu_domain *old,
		       arm_smmu_make_cd_fn make_cd_fn)

Those should use the new helper, right? It should work for an S1 domain
too.

static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
{
	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);

	if (smmu_domain->smmu)
		arm_smmu_tlb_inv_context(smmu_domain);
}

I suspect that is just dead code now, it is from before finalize was
part of alloc?

Jason


^ permalink raw reply

* Re: [PATCH v4 10/10] iommu/arm-smmu-v3: Allow sharing domain across SMMUs
From: Jason Gunthorpe @ 2026-04-10  0:32 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: will, robin.murphy, joro, jpb, praan, smostafa, linux-arm-kernel,
	iommu, linux-kernel, linux-tegra, jonathan.cameron
In-Reply-To: <5293b61417f96dd58f25fe797e7d0c20dbe30da8.1773949042.git.nicolinc@nvidia.com>

On Thu, Mar 19, 2026 at 12:51:56PM -0700, Nicolin Chen wrote:
> @@ -987,6 +988,32 @@ struct arm_smmu_nested_domain {
>  	__le64 ste[2];
>  };
>  
> +static inline bool
> +arm_smmu_domain_can_share(struct arm_smmu_domain *smmu_domain,
> +			  struct arm_smmu_device *new_smmu)
> +{

Probably a bit big for an inline

> +	struct io_pgtable *pgtbl =
> +		io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
> +
> +	if (pgtbl->fmt == ARM_64_LPAE_S1 &&
> +	    !(new_smmu->features & ARM_SMMU_FEAT_TRANS_S1))
> +		return false;
> +	if (pgtbl->fmt == ARM_64_LPAE_S2 &&
> +	    !(new_smmu->features & ARM_SMMU_FEAT_TRANS_S2))
> +		return false;
> +	if (pgtbl->cfg.pgsize_bitmap & ~new_smmu->pgsize_bitmap)
> +		return false;

I think this should check the lowest set bit in the
domain->pgsize_bitmap is set in new_smmu->pgsize_bitmap

ie that the selected tg is supported.

The cfg.pgsize_bitmap is something a little different IIRC
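
A small sketch of the check described above, assuming the lowest set bit of
the domain's pgsize_bitmap identifies the selected translation granule.
granule_supported() is a hypothetical helper name used for illustration, not
something from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the smallest page size in the domain's bitmap (assumed
 * to be the translation granule) is also supported by the new SMMU.
 */
static bool granule_supported(uint64_t domain_pgsizes, uint64_t smmu_pgsizes)
{
	/* Isolate the lowest set bit: x & -x, written without negation. */
	uint64_t lowest = domain_pgsizes & (~domain_pgsizes + 1);

	return lowest && (smmu_pgsizes & lowest);
}
```

With a 4K/2M/1G domain bitmap and an SMMU that reports only 4K and 2M, this
check passes, whereas the subset test in the quoted hunk would refuse to
share the domain.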

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


^ permalink raw reply

* Re: [PATCH v4 09/10] iommu/arm-smmu-v3: Remove ASID/VMID from arm_smmu_domain
From: Jason Gunthorpe @ 2026-04-10  0:27 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: will, robin.murphy, joro, jpb, praan, smostafa, linux-arm-kernel,
	iommu, linux-kernel, linux-tegra, jonathan.cameron
In-Reply-To: <b6d87a722635d29e896b277cb60f0208859073d6.1773949042.git.nicolinc@nvidia.com>

On Thu, Mar 19, 2026 at 12:51:55PM -0700, Nicolin Chen wrote:
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index 846a278fa5469..0e48264ccd01b 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -300,14 +300,6 @@ static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
>  	 */
>  	arm_smmu_domain_inv(smmu_domain);
>  
> -	/*
> -	 * Notice that the arm_smmu_mm_arch_invalidate_secondary_tlbs op can
> -	 * still be called/running at this point. We allow the ASID to be
> -	 * reused, and if there is a race then it just suffers harmless
> -	 * unnecessary invalidation.
> -	 */
> -	xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
> -

I don't think this artifact has disappeared so the comment should
probably remain too. It has become slightly different because it is
now running under RCU protections so it will clear a lot faster.

Otherwise

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


^ permalink raw reply

* Re: [PATCH v4 08/10] iommu/arm-smmu-v3: Allocate INV_TYPE_S2_VMID_VSMMU in arm_vsmmu_init
From: Jason Gunthorpe @ 2026-04-10  0:19 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: will, robin.murphy, joro, jpb, praan, smostafa, linux-arm-kernel,
	iommu, linux-kernel, linux-tegra, jonathan.cameron
In-Reply-To: <05dd00dcb2f0d077f59bcbccac1820534ad7b5cf.1773949042.git.nicolinc@nvidia.com>

On Thu, Mar 19, 2026 at 12:51:54PM -0700, Nicolin Chen wrote:
> VMID owned by a vSMMU should be allocated in the viommu_init callback for
>  - a straightforward lifecycle for a VMID used by a vSMMU
>  - HW like tegra241-cmdqv needs to setup VINTF with the VMID
> 
> Allocate/free a VMID in arm_vsmmu_init/destroy(). This decouples the VMID
> owned by vSMMU from the VMID living in the S2 parent domain (s2_cfg.vmid).
> 
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  1 +
>  .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c     | 26 ++++++++++++++++---
>  .../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c    |  1 +
>  3 files changed, 25 insertions(+), 3 deletions(-)

Yeah, this is exactly right now.. The vmid exists for the duration of
viommu and gets installed in the invs list when the s2 is actually
attached to a device.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


^ permalink raw reply

* Re: [PATCH v3 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
From: Nicolin Chen @ 2026-04-10  0:04 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Tian, Kevin, will@kernel.org, robin.murphy@arm.com,
	bhelgaas@google.com, joro@8bytes.org, praan@google.com,
	baolu.lu@linux.intel.com, miko.lenczewski@arm.com,
	linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	Williams, Dan J, jonathan.cameron@huawei.com, Vikram Sethi,
	linux-cxl@vger.kernel.org
In-Reply-To: <20260409225252.GU3357077@nvidia.com>

On Thu, Apr 09, 2026 at 07:52:52PM -0300, Jason Gunthorpe wrote:
> On Thu, Apr 09, 2026 at 03:45:26PM -0700, Nicolin Chen wrote:
> 
> > One question regarding VM case: if a device is ats_always_on, while
> > VM somehow doesn't set nested_domain->enable_ats. Should the kernel
> > at least spit a warning, given that it would surely fail the device?
> 
> No, just let it break, the resulting failure has to be contained to the
> VM or the platform is broken..
> 
> The HV can't turn on ATS because it can't know what invalidations
> to push so not much other choice.

I see. Thanks

Nicolin


^ permalink raw reply

* Re: [PATCH v4 07/10] iommu/arm-smmu-v3: Allocate IOTLB cache tag if no id to reuse
From: Jason Gunthorpe @ 2026-04-10  0:04 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: will, robin.murphy, joro, jpb, praan, smostafa, linux-arm-kernel,
	iommu, linux-kernel, linux-tegra, jonathan.cameron
In-Reply-To: <aea51cbf226d90436918dc09df5cf8f5c64ef8bb.1773949042.git.nicolinc@nvidia.com>

On Thu, Mar 19, 2026 at 12:51:53PM -0700, Nicolin Chen wrote:
> An IOTLB tag now is forwarded from arm_smmu_domain_get_iotlb_tag() to its
> final destination (a CD or STE entry).
> 
> Thus, arm_smmu_domain_get_iotlb_tag() can safely delink its references to
> the cd->asid and s2_cfg->vmid in the smmu_domain. Instead, allocate a new
> IOTLB cache tag from the xarray/ida.
> 
> The old ASID and VMID in the smmu_domain will be deprecated, once VMID is
> decoupled in the vSMMU use case too.
> 
> Since invst->new_invs->inv[0] and invst->tag are basically the same thing,
> merge arm_smmu_inv_flush_iotlb_tag() into arm_smmu_iotlb_tag_free().
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 63 +++++++++++++--------
>  1 file changed, 38 insertions(+), 25 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


^ permalink raw reply

* Re: [PATCH v4 06/10] iommu/arm-smmu-v3: Introduce INV_TYPE_S2_VMID_VSMMU
From: Jason Gunthorpe @ 2026-04-09 23:59 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: will, robin.murphy, joro, jpb, praan, smostafa, linux-arm-kernel,
	iommu, linux-kernel, linux-tegra, jonathan.cameron
In-Reply-To: <c514aa533257ce67bf28645863abf5eaab437996.1773949042.git.nicolinc@nvidia.com>

On Thu, Mar 19, 2026 at 12:51:52PM -0700, Nicolin Chen wrote:
> @@ -655,6 +655,7 @@ struct arm_smmu_cmdq_batch {
>  enum arm_smmu_inv_type {
>  	INV_TYPE_S1_ASID,
>  	INV_TYPE_S2_VMID,
> +	INV_TYPE_S2_VMID_VSMMU,
>  	INV_TYPE_S2_VMID_S1_CLEAR,
>  	INV_TYPE_ATS,
>  	INV_TYPE_ATS_FULL,

> @@ -3246,7 +3248,10 @@ int arm_smmu_find_iotlb_tag(struct iommu_domain *domain,
>  		tag->type = INV_TYPE_S1_ASID;
>  		break;
>  	case ARM_SMMU_DOMAIN_S2:
> -		tag->type = INV_TYPE_S2_VMID;
> +		if (to_vsmmu(domain))
> +			tag->type = INV_TYPE_S2_VMID_VSMMU;
> +		else
> +			tag->type = INV_TYPE_S2_VMID;
>  		break;

This shouldn't search, the vmid always comes from the vsmmu struct.

arm_smmu_alloc_iotlb_tag() fixes it after, but the call in
arm_smmu_attach_prepare_invs() should also only be using the
vsmmu->vmid so this is a bug.

Just set tag->id here and return. Move the tag->smmu up so that is
safe.

> @@ -3357,7 +3369,7 @@ arm_smmu_master_build_invs(struct arm_smmu_master *master, bool ats_enabled,
>  		return NULL;
>  
>  	/* All the nested S1 ASIDs have to be flushed when S2 parent changes */
> -	if (nesting) {
> +	if (tag->type == INV_TYPE_S2_VMID_VSMMU) {
>  		if (!arm_smmu_master_build_inv(master,
>  					       INV_TYPE_S2_VMID_S1_CLEAR,
>  					       tag->id, IOMMU_NO_PASID, 0))

I think this function should not mix nesting and type at the same
time..

If INV_TYPE_S2_VMID_VSMMU means the tag is used as a nesting child
then that should also drive the atc decision:

	if (!arm_smmu_master_build_inv(
			    master, nesting ? INV_TYPE_ATS_FULL : INV_TYPE_ATS,
			    master->streams[i].id, ssid, 0))

Because it is exactly the same reasoning for the IOTLB full
invalidation.

This is the only place reading domain->nest_parent so we can get rid
of it too, instead it effectively becomes driven by tag which derives
the S2_VMID from domain->type == IOMMU_DOMAIN_NESTED

Jason


^ permalink raw reply

* Re: [PATCH v5 0/4] firmware: ti_sci: Introduce BOARDCFG_MANAGED mode for Jacinto family
From: Kendall Willis @ 2026-04-09 23:50 UTC (permalink / raw)
  To: Thomas Richard (TI), Nishanth Menon, Tero Kristo,
	Santosh Shilimkar, Michael Turquette, Stephen Boyd
  Cc: Gregory CLEMENT, richard.genoud, Udit Kumar, Prasanth Mantena,
	Abhash Kumar, Thomas Petazzoni, linux-arm-kernel, linux-kernel,
	linux-clk, Dhruva Gole
In-Reply-To: <20260407-ti-sci-jacinto-s2r-restore-irq-v5-0-97b28f2d93f9@bootlin.com>

On 4/7/26 09:25, Thomas Richard (TI) wrote:
> This is the 5th iteration of this series. Nothing new, I just rebased on
> v7.0-rc7, added Dhruva's RB tags, and use kzalloc_obj() in Patch 2.
> 
> Best Regards,
> Thomas
> 
> Signed-off-by: Thomas Richard (TI) <thomas.richard@bootlin.com>

For the series,

Reviewed-by: Kendall Willis <k-willis@ti.com>

> ---
> Changes in v5:
> - rebase on v7.0-rc7.
> - add Dhruva's RB tag.
> - use kzalloc_obj() in ti_sci driver.
> - Link to v4: https://lore.kernel.org/r/20260204-ti-sci-jacinto-s2r-restore-irq-v4-0-67820af39eac@bootlin.com
> 
> Changes in v4:
> - rebase on linux-next next-20260202.
> - fix BOARDCFG_MANAGED value.
> - add MSG_FLAG_CAPS_LPM_IRQ_CONTEXT_LOST firmware capability.
> - add MSG_FLAG_CAPS_LPM_CLK_CONTEXT_LOST firmware capability.
> - Link to v3: https://lore.kernel.org/r/20251205-ti-sci-jacinto-s2r-restore-irq-v3-0-d06963974ad4@bootlin.com
> 
> Changes in v3:
> - rebased on linux-next
> - sci-clk: context_restore() operation restores also rate.
> - Link to v2: https://lore.kernel.org/r/20251127-ti-sci-jacinto-s2r-restore-irq-v2-0-a487fa3ff221@bootlin.com
> 
> Changes in v2:
> - ti_sci: use hlist to store IRQs.
> - sci-clk: add context_restore operation
> - ti_sci: restore clock parents during resume
> - Link to v1: https://lore.kernel.org/r/20251017-ti-sci-jacinto-s2r-restore-irq-v1-0-34d4339d247a@bootlin.com
> 
> ---
> Thomas Richard (TI) (4):
>        firmware: ti_sci: add BOARDCFG_MANAGED mode support
>        firmware: ti_sci: add support for restoring IRQs during resume
>        clk: keystone: sci-clk: add restore_context() operation
>        firmware: ti_sci: add support for restoring clock context during resume
> 
>   drivers/clk/keystone/sci-clk.c |  42 +++++++++--
>   drivers/firmware/ti_sci.c      | 164 ++++++++++++++++++++++++++++++++++++++---
>   drivers/firmware/ti_sci.h      |   6 ++
>   3 files changed, 192 insertions(+), 20 deletions(-)
> ---
> base-commit: d843b67129e266054d8fa2e41e270a9f779381bd
> change-id: 20251010-ti-sci-jacinto-s2r-restore-irq-428e008fd10c
> 
> Best regards,



^ permalink raw reply

* Re: [PATCH v4 04/10] iommu/arm-smmu-v3: Pass in IOTLB cache tag to arm_smmu_master_build_invs()
From: Jason Gunthorpe @ 2026-04-09 23:43 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: will, robin.murphy, joro, jpb, praan, smostafa, linux-arm-kernel,
	iommu, linux-kernel, linux-tegra, jonathan.cameron
In-Reply-To: <87419a1f7371643959a037f1ee7119ffa054a9a1.1773949042.git.nicolinc@nvidia.com>

On Thu, Mar 19, 2026 at 12:51:50PM -0700, Nicolin Chen wrote:
> Now struct arm_smmu_attach_state carries an IOTLB cache tag in invst->tag.
> 
> Instead of getting the tag from smmu_domain again, pass in the invst->tag
> to arm_smmu_master_build_invs(). This could simplify the function.

/This could simplify/This does simplify/

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


^ permalink raw reply

* Re: [PATCH v4 03/10] iommu/arm-smmu-v3: Store IOTLB cache tags in struct arm_smmu_attach_state
From: Jason Gunthorpe @ 2026-04-09 23:42 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: will, robin.murphy, joro, jpb, praan, smostafa, linux-arm-kernel,
	iommu, linux-kernel, linux-tegra, jonathan.cameron
In-Reply-To: <ceb8150f229ee7bd355ec42d23e422ae2185492e.1773949042.git.nicolinc@nvidia.com>

On Thu, Mar 19, 2026 at 12:51:49PM -0700, Nicolin Chen wrote:
> So far, an IOTLB tag (ASID or VMID) has been stored in the arm_smmu_domain
> +static int __arm_smmu_domain_find_iotlb_tag(struct arm_smmu_domain *smmu_domain,
> +					    struct arm_smmu_inv *tag)
> +{
> +	struct arm_smmu_invs *invs = rcu_dereference_protected(
> +		smmu_domain->invs, lockdep_is_held(&arm_smmu_asid_lock));
> +	size_t i;
> +
> +	arm_smmu_inv_assert_iotlb_tag(tag);
> +
> +	for (i = 0; i != invs->num_invs; i++) {
> +		if (invs->inv[i].type == tag->type &&
> +		    invs->inv[i].smmu == tag->smmu &&
> +		    READ_ONCE(invs->inv[i].users)) {
> +			*tag = invs->inv[i];

This users thing has become too hard to understand and it isn't how it
should be.

All writers *with the possibility of concurrent access* need to use
WRITE_ONCE since there is a RCU reader. IIRC that is just
arm_smmu_invs_unref()

The one in arm_smmu_invs_merge() is just writing to newly allocated
memory so it shouldn't be marked.

Only readers *with the possibility of concurrent access* should be
marked with READ_ONCE. IIRC this is just the invalidation walker.

Places like this have to be protected by a lock or the whole thing is
wrong, so it should have a lockdep annotation.

Now what is the locking supposed to be? It looks wrong, it probably
wants to be arm_smmu_asid_lock, but arm_smmu_mm_release() doesn't grab
it.

But why does arm_smmu_mm_release() need a tag in the first place? ASID
isn't going to be used when EPD0|EPD1 is set, so the tag can just be
0. Probably make a patch with that change early on..

All the locking is important because this:

> +/* Find an existing IOTLB cache tag in smmu_domain->invs (users counter != 0) */

Must be held as an invariant into the caller, meaning the caller must
hold arm_smmu_asid_lock while it has an active tag on the stack, and
that should be documented here. As well as a lockdep of course.

From what I can tell the final result is correct (aside from
arm_smmu_mm_release), just under-documented.

> +int arm_smmu_find_iotlb_tag(struct iommu_domain *domain,
> +			    struct arm_smmu_device *smmu,
> +			    struct arm_smmu_inv *tag)
> +{
> +	struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
> +
> +	if (WARN_ON(!smmu_domain))
> +		return -EINVAL;
> +
> +	/* Decide the type of the iotlb cache tag */
> +	switch (smmu_domain->stage) {
> +	case ARM_SMMU_DOMAIN_SVA:
> +	case ARM_SMMU_DOMAIN_S1:
> +		tag->type = INV_TYPE_S1_ASID;
> +		break;
> +	case ARM_SMMU_DOMAIN_S2:
> +		tag->type = INV_TYPE_S2_VMID;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	tag->smmu = smmu;
> +
> +	return __arm_smmu_domain_find_iotlb_tag(smmu_domain, tag);

This is the only caller, so it probably doesn't need a special __
function..

> +/* Allocate a new IOTLB cache tag (users counter == 0) */
> +static int arm_smmu_alloc_iotlb_tag(struct iommu_domain *domain,
> +				    struct arm_smmu_device *smmu,
> +				    struct arm_smmu_inv *tag)
> +{
> +	struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
> +	int ret;
> +
> +	/* Only allocate if there is no IOTLB cache tag to re-use */
> +	ret = arm_smmu_find_iotlb_tag(domain, smmu, tag);
> +	if (!ret || ret != -ENOENT)
> +		return ret;

Let's not call the function 'alloc_iotlb_tag' if it doesn't always
allocate.. 'get_iotlb_tag' better implies the find-or-allocate behavior.

Again the locking is important and the caller must ensure it holds the
asid_lock while the tag is alive on the stack. Mention it in the kdoc.
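
A minimal userspace sketch of the find-or-allocate shape being asked for,
under stated assumptions: struct tag, find_tag(), get_tag() and the
asid_lock_held flag are hypothetical stand-ins (the flag plays the role of
lockdep_assert_held() on arm_smmu_asid_lock), and fresh_id stands in for
whatever the real allocator would hand out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tag type for illustration only. */
struct tag {
	int type;
	unsigned int id;
	unsigned int users;
};

/* Stand-in for the lock state; the kernel would use lockdep instead. */
static bool asid_lock_held;

/* Find a live tag (users != 0) of the requested type. Caller holds the lock. */
static int find_tag(const struct tag *tags, size_t n, struct tag *tag)
{
	size_t i;

	assert(asid_lock_held);
	for (i = 0; i != n; i++) {
		if (tags[i].type == tag->type && tags[i].users) {
			*tag = tags[i];
			return 0;
		}
	}
	return -ENOENT;
}

/*
 * Find or allocate. The caller must keep holding the lock for as long as
 * the returned tag lives on its stack.
 */
static int get_tag(const struct tag *tags, size_t n, unsigned int fresh_id,
		   struct tag *tag)
{
	int ret = find_tag(tags, n, tag);

	/* Success and hard errors both return here; only -ENOENT allocates. */
	if (ret != -ENOENT)
		return ret;

	tag->id = fresh_id;
	tag->users = 0;
	return 0;
}
```

Note that the single test ret != -ENOENT covers the same cases as the
quoted "!ret || ret != -ENOENT".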

> +
> +	/* FIXME replace with an actual allocation from the bitmap */
> +	if (tag->type == INV_TYPE_S1_ASID)
> +		tag->id = smmu_domain->cd.asid;
> +	else
> +		tag->id = smmu_domain->s2_cfg.vmid;

I don't usually put FIXMEs that will be fixed in the next patches.

Jason


* Re: [PATCH v4 02/10] iommu/arm-smmu-v3: Pass in arm_smmu_make_cd_fn to arm_smmu_set_pasid()
From: Jason Gunthorpe @ 2026-04-09 23:17 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: will, robin.murphy, joro, jpb, praan, smostafa, linux-arm-kernel,
	iommu, linux-kernel, linux-tegra, jonathan.cameron
In-Reply-To: <d629f81f2a30bb6fa06ec00b35134cb6bab80a48.1773949042.git.nicolinc@nvidia.com>

On Thu, Mar 19, 2026 at 12:51:48PM -0700, Nicolin Chen wrote:
> To install a domain (CD) to a substream, the common flow in the driver is:
>  - Make an S1 or SVA CD outside arm_smmu_asid_lock
>  - Invoke arm_smmu_set_pasid() where it takes arm_smmu_asid_lock, and fix
>    the ASID in the CD.
> 
> The reason for such a flow is for the timing of arm_smmu_asid_lock, since
> it was too early to take the mutex outside the function.
> 
> Tidy it up by passing in a function pointer for CD making, which supports
> both existing functions: arm_smmu_make_s1_cd() and arm_smmu_make_sva_cd().
> 
> Then arm_smmu_set_pasid() can make a CD inside the lock where ASID is safe
> to access.
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  7 ++++++-
>  .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  4 ++--
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 19 ++++---------------
>  3 files changed, 12 insertions(+), 18 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason
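
For readers following along, the flow described in the commit message can
be sketched in userspace roughly as below. This is only an illustration
under assumptions: struct cd, struct domain, make_cd_fn, make_s1_cd() and
set_pasid() are simplified hypothetical names, and the boolean flag stands
in for the arm_smmu_asid_lock mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced types for illustration. */
struct cd {
	unsigned int asid;
};

struct domain {
	unsigned int asid;
};

/* Common signature shared by the CD-making functions. */
typedef void (*make_cd_fn)(struct cd *out, const struct domain *d);

/* Stand-in for the mutex: a flag that the helpers can assert on. */
static bool asid_lock_held;

static void asid_lock(void)
{
	assert(!asid_lock_held);
	asid_lock_held = true;
}

static void asid_unlock(void)
{
	assert(asid_lock_held);
	asid_lock_held = false;
}

/* One possible CD maker; it reads the ASID, so it wants the lock held. */
static void make_s1_cd(struct cd *out, const struct domain *d)
{
	assert(asid_lock_held);
	out->asid = d->asid;
}

/* The CD is now built inside the locked region via the callback. */
static int set_pasid(const struct domain *d, struct cd *installed,
		     make_cd_fn fn)
{
	struct cd cd;

	asid_lock();
	fn(&cd, d);		/* ASID cannot change under us here */
	*installed = cd;
	asid_unlock();
	return 0;
}
```

Compared with building the CD first and fixing up the ASID afterwards, the
callback keeps every ASID read inside the lock.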



* Re: [PATCH v4 01/10] iommu/arm-smmu-v3: Add a wrapper for arm_smmu_make_sva_cd()
From: Jason Gunthorpe @ 2026-04-09 23:14 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: will, robin.murphy, joro, jpb, praan, smostafa, linux-arm-kernel,
	iommu, linux-kernel, linux-tegra, jonathan.cameron
In-Reply-To: <7889322d41b1d8fa83bb318df2bd705a6241f6b1.1773949042.git.nicolinc@nvidia.com>

On Thu, Mar 19, 2026 at 12:51:47PM -0700, Nicolin Chen wrote:
> Rename the existing arm_smmu_make_sva_cd() to __arm_smmu_make_sva_cd().
> 
> Add a higher-level wrapper arm_smmu_make_sva_cd() receiving smmu_domain
> and master pointers, aligning with arm_smmu_make_s1_cd(). Then, the two
> functions can share a common typedef function pointer.
> 
> No functional changes.
> 
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  6 ++---
>  .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 22 +++++++++++++------
>  .../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c  |  4 ++--
>  3 files changed, 20 insertions(+), 12 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason



* ✅ PASS (MISSED 51 of 56): Test report for for-kernelci (7.0.0-rc7, upstream-arm-next, 222ce592)
From: cki-project @ 2026-04-09 23:13 UTC (permalink / raw)
  To: will, catalin.marinas, linux-arm-kernel

Hi, we tested your kernel and here are the results:

    Overall result: PASSED
             Merge: OK
           Compile: OK
              Test: OK

Tested-by: CKI Project <cki-project@redhat.com>

Kernel information:
    Commit message: Merge remote-tracking branch 'origin/nocache-cleanup' into for-kernelci

You can find all the details about the test run at
    https://datawarehouse.cki-project.org/kcidb/checkouts/redhat:2442210584

Tests that were not run because of internal issues:
    /distribution/check-install [aarch64]
    /distribution/command [aarch64]
    /test/misc/machineinfo [aarch64]
    Boot test [aarch64]
    CKI/restraint [aarch64]
    Hardware - Firmware test suite [aarch64]
    Reboot test [aarch64]
    SELinux Custom Module Setup [aarch64]
    selinux-policy: serge-testsuite [aarch64]
    Storage - blktests - throtl [aarch64]
    Storage - blktests - ublk [aarch64]
    stress: stress-ng - cpu-cache [aarch64]
    stress: stress-ng - memory [aarch64]
    xfstests - btrfs [aarch64]
    xfstests - ext4 [aarch64]
    xfstests - xfs [aarch64]


If you find a failure unrelated to your changes, please ask the test maintainer to review it.
This will prevent the failures from being incorrectly reported in the future.

Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.

        ,-.   ,-.
       ( C ) ( K )  Continuous
        `-',-.`-'   Kernel
          ( I )     Integration
           `-'
______________________________________________________________________________




* Re: [PATCH v2 1/3] arm64: mm: Fix rodata=full block mapping support for realm guests
From: Yang Shi @ 2026-04-09 23:08 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Kevin Brodsky, Ryan Roberts, Will Deacon, David Hildenbrand (Arm),
	Dev Jain, Suzuki K Poulose, Jinjiang Tu, linux-arm-kernel,
	linux-kernel, stable
In-Reply-To: <adfw_hNDsIWwSAIv@arm.com>



On 4/9/26 11:33 AM, Catalin Marinas wrote:
> On Thu, Apr 09, 2026 at 09:48:58AM -0700, Yang Shi wrote:
>> On 4/9/26 8:20 AM, Catalin Marinas wrote:
>>> On Thu, Apr 09, 2026 at 11:53:41AM +0200, Kevin Brodsky wrote:
>>>> What would make more sense to me is to enable the use of BBML2-noabort
>>>> unconditionally if !force_pte_mapping(). We can then have
>>>> can_set_direct_map() return true if we have BBML2-noabort, and we no
>>>> longer need to check it in map_mem().
>>> Indeed.
>> I'm trying to wrap my head around this discussion. IIUC, if none of the
>> features is enabled, it means we don't need to do anything because the direct
>> map is not changed. For example, if vmalloc doesn't change direct map
>> permission when rodata != full, there is no need to call
>> set_direct_map_*_noflush(). So unconditionally checking BBML2_NOABORT will
>> change the behavior unnecessarily. Did I miss something?
>>
>> I think the only exception is secretmem if I don't miss something.
>> Currently, secretmem is actually not supported if none of the features is
>> enabled. But BBML2_NOABORT allows to lift the restriction.
> Yes, it's secretmem only AFAICT. I think execmem will only change the
> linear map if rodata_full anyway.

Yes, execmem calls set_memory_rox(), which won't change linear map 
permission if rodata_full is not enabled.

Thanks,
Yang





* Re: [PATCH RFC net-next 02/10] net: stmmac: rename dev_id to userver
From: Jitendra Vegiraju @ 2026-04-09 23:07 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Andrew Lunn, Alexandre Torgue, Andrew Lunn, Chen-Yu Tsai,
	David S. Miller, Eric Dumazet, Jakub Kicinski, linux-arm-kernel,
	linux-stm32, linux-sunxi, netdev, Paolo Abeni, Samuel Holland
In-Reply-To: <E1wAPBR-0000000F7ju-1fD9@rmk-PC.armlinux.org.uk>


Hi Russell,

On Wed, Apr 8, 2026 at 2:27 AM Russell King (Oracle)
<rmk+kernel@armlinux.org.uk> wrote:
>
> The Synopsys Databook and several implementation TRMs identify bits
> 15:8 of the version register in dwmac v3.xx and v4.xx as "userver".
> We even print its value with "User ID". Rather than using "dev_id",
> use "userver" instead.
>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
>  drivers/net/ethernet/stmicro/stmmac/hwif.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.c b/drivers/net/ethernet/stmicro/stmmac/hwif.c
> index 3774af66db48..830ff816ab4f 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/hwif.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/hwif.c
> @@ -15,7 +15,7 @@
>
>  struct stmmac_version {
>         u8 snpsver;
> -       u8 dev_id;
> +       u8 userver;
>  };
From the XGMAC databook that I have access to, bits(15:8) identify the
DEVID field of the MAC_version register.
The userver field is from bits(23:16) of the same register. This is a
customer defined field (configured with coreConsultant).
Currently stmmac doesn't care about bits(23:16).

I think the confusion is coming from the macro name in common.h
#define DWMAC_USERVER   GENMASK_U32(15, 8)
This should be named
#define DWMAC_DEVID   GENMASK_U32(15, 8)
Hope someone with access to another databook can confirm this.
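
To make the proposed layout concrete, here is a small standalone sketch of
the field split as I read it from the XGMAC databook. Everything in it is
an assumption for illustration: MASK_U32() and field_get() are userspace
stand-ins for GENMASK_U32() and FIELD_GET(), the VER_* names are
hypothetical, and SNPSVER in bits(7:0) is assumed.

```c
#include <stdint.h>

/* Bits h..l inclusive, a userspace stand-in for GENMASK_U32(h, l). */
#define MASK_U32(h, l) ((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/* Hypothetical names for the MAC_version fields discussed above. */
#define VER_SNPSVER	MASK_U32(7, 0)		/* assumed position */
#define VER_DEVID	MASK_U32(15, 8)		/* per the XGMAC databook */
#define VER_USERVER	MASK_U32(23, 16)	/* customer-defined field */

/* Extract a field: isolate it, then shift it down by the mask's low bit. */
static uint32_t field_get(uint32_t mask, uint32_t reg)
{
	/* (mask & -mask) is the lowest set bit of the mask. */
	return (reg & mask) / (mask & -mask);
}
```

With an arbitrary register value of 0x00ab7621, this splits into
SNPSVER 0x21, DEVID 0x76 and USERVER 0xab.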

>
>  static void stmmac_get_version(struct stmmac_priv *priv,
> @@ -26,7 +26,7 @@ static void stmmac_get_version(struct stmmac_priv *priv,
>         u32 version;
>
>         ver->snpsver = 0;
> -       ver->dev_id = 0;
> +       ver->userver = 0;
>
>         if (core_type == DWMAC_CORE_MAC100)
>                 return;
> @@ -48,7 +48,7 @@ static void stmmac_get_version(struct stmmac_priv *priv,
>
>         ver->snpsver = FIELD_GET(DWMAC_SNPSVER, version);
>         if (core_type == DWMAC_CORE_XGMAC)
> -               ver->dev_id = FIELD_GET(DWMAC_USERVER, version);
> +               ver->userver = FIELD_GET(DWMAC_USERVER, version);
>  }
>
>  static void stmmac_dwmac_mode_quirk(struct stmmac_priv *priv)
> @@ -111,7 +111,7 @@ int stmmac_reset(struct stmmac_priv *priv)
>  static const struct stmmac_hwif_entry {
>         enum dwmac_core_type core_type;
>         u32 min_snpsver;
> -       u32 dev_id;
> +       u32 userver;
>         const struct stmmac_regs_off regs;
>         const void *desc;
>         const void *dma;
> @@ -247,7 +247,7 @@ static const struct stmmac_hwif_entry {
>         }, {
>                 .core_type = DWMAC_CORE_XGMAC,
>                 .min_snpsver = DWXGMAC_CORE_2_10,
> -               .dev_id = DWXGMAC_ID,
> +               .userver = DWXGMAC_ID,
>                 .regs = {
>                         .ptp_off = PTP_XGMAC_OFFSET,
>                         .mmc_off = MMC_XGMAC_OFFSET,
> @@ -269,7 +269,7 @@ static const struct stmmac_hwif_entry {
>         }, {
>                 .core_type = DWMAC_CORE_XGMAC,
>                 .min_snpsver = DWXLGMAC_CORE_2_00,
> -               .dev_id = DWXLGMAC_ID,
> +               .userver = DWXLGMAC_ID,
>                 .regs = {
>                         .ptp_off = PTP_XGMAC_OFFSET,
>                         .mmc_off = MMC_XGMAC_OFFSET,
> @@ -291,7 +291,7 @@ static const struct stmmac_hwif_entry {
>  };
>
>  static const struct stmmac_hwif_entry *
> -stmmac_hwif_find(enum dwmac_core_type core_type, u8 snpsver, u8 dev_id)
> +stmmac_hwif_find(enum dwmac_core_type core_type, u8 snpsver, u8 userver)
>  {
>         const struct stmmac_hwif_entry *entry;
>         int i;
> @@ -305,7 +305,7 @@ stmmac_hwif_find(enum dwmac_core_type core_type, u8 snpsver, u8 dev_id)
>                 if (snpsver < entry->min_snpsver)
>                         continue;
>                 if (core_type == DWMAC_CORE_XGMAC &&
> -                   dev_id != entry->dev_id)
> +                   userver != entry->userver)
>                         continue;
>
>                 return entry;
> @@ -358,7 +358,7 @@ int stmmac_hwif_init(struct stmmac_priv *priv)
>         /* Fallback to generic HW */
>
>         /* Use synopsys_id var because some setups can override this */
> -       entry = stmmac_hwif_find(core_type, priv->synopsys_id, version.dev_id);
> +       entry = stmmac_hwif_find(core_type, priv->synopsys_id, version.userver);
>         if (!entry) {
>                 dev_err(priv->device,
>                         "Failed to find HW IF (id=0x%x, gmac=%d/%d)\n",
> --
> 2.47.3
>



* Re: [PATCH v3 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
From: Jason Gunthorpe @ 2026-04-09 22:52 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: Tian, Kevin, will@kernel.org, robin.murphy@arm.com,
	bhelgaas@google.com, joro@8bytes.org, praan@google.com,
	baolu.lu@linux.intel.com, miko.lenczewski@arm.com,
	linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	Williams, Dan J, jonathan.cameron@huawei.com, Vikram Sethi,
	linux-cxl@vger.kernel.org
In-Reply-To: <adgsBgXYy68GmxAf@Asurada-Nvidia>

On Thu, Apr 09, 2026 at 03:45:26PM -0700, Nicolin Chen wrote:

> One question regarding VM case: if a device is ats_always_on, while
> VM somehow doesn't set nested_domain->enable_ats. Should the kernel
> at least spit a warning, given that it would surely fail the device?

No, just let it break, the resulting failure has to be contained to the
VM or the platform is broken..

The HV can't turn on ATS because it can't know what invalidations
to push, so there is not much other choice.

Jason


