Linux-ARM-Kernel Archive on lore.kernel.org


* Re: [PATCH v3 0/6] KVM: arm64: Don't perform vgic-v2 lazy init on timer injection
From: Marc Zyngier @ 2026-05-21  7:23 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel, Marc Zyngier
  Cc: Deepanshu Kartikey, Steffen Eiden, Joey Gouly, Suzuki K Poulose,
	Oliver Upton, Zenghui Yu
In-Reply-To: <20260520100200.543845-1-maz@kernel.org>

On Wed, 20 May 2026 11:01:54 +0100, Marc Zyngier wrote:
> This is the third version of this series aiming at fixing issues with
> vgic-v2 being initialised from non-preemptible context.
> 
> * From v2 [2]:
> 
>   - Remove the PMU's irq level cache which was hiding in plain sight
> 
> [...]

Applied to next, thanks!

[1/6] KVM: arm64: timer: Repaint kvm_timer_{should,irq_can}_fire() to kvm_timer_{pending,enabled}()
      commit: 68a612d4dbc7f2b9dac731c79676a21fce573d29
[2/6] KVM: arm64: Simplify userspace notification of interrupt state
      commit: 0d27b4b351493cb2fe1f87cd152856704d4e141d
[3/6] KVM: arm64: timer: Kill the per-timer irq level cache
      commit: ac7002031852ab8f75b3debb1a4c4b2d1ff5a26c
[4/6] KVM: arm64: pmu: Kill the PMU interrupt level cache
      commit: 2772383afc5c65d6242f62947b5c184ffb049359
[5/6] KVM: arm64: vgic-v2: Force vgic init on injection outside the run loop
      commit: 1a8685ed8cd1ded20d0c81070a49b1cddf70481d
[6/6] KVM: arm64: vgic-v2: Don't init the vgic on in-kernel interrupt injection
      commit: 958023d269e0312d10da85a6a49438d2e107dead

Cheers,

	M.
-- 
Without deviation from the norm, progress is not possible.





* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
From: Yi Liu @ 2026-05-21  7:31 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nicolin Chen, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
	kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
	linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
	linux-cxl, nirmoyd
In-Reply-To: <20260520143410.GV3602937@nvidia.com>

On 5/20/26 22:34, Jason Gunthorpe wrote:
> On Wed, May 20, 2026 at 09:12:31PM +0800, Yi Liu wrote:
>> On 4/27/26 13:54, Nicolin Chen wrote:
>>> Controlled by the IOMMU driver, ATS is usually enabled "on demand" when a
>>> given PASID on a device is attached to an I/O page table. This is working
>>> even when a device has no translation on its RID (i.e., the RID is IOMMU
>>> bypassed).
>>
>> nit: this description does not seem accurate. The Intel IOMMU driver
>> enables ATS in the probe_device() phase. Mind tweaking it a bit to avoid
>> a misleading message? :)
> 
> It probably shouldn't do this, it should follow ARM and have it
> dynamic during domain attach.

Agreed that making it dynamic during domain attach is a better
direction. However, even framing it that way, the description tying ATS
enablement to PASID attachment is still architecturally specific to ARM
SMMUv3, and doesn't hold as a general statement. :)

> For security we need ATS disabled for blocking domains at a minimum.

Agreed on the security model.

One more data point worth discussing: today Intel's IOMMU driver enables
ATS at probe time, which has two effects — enabling the PCI ATS
capability on the device, and setting the DTE bit in the scalable-mode
PASID-table entry. When a RID or PASID is subsequently attached to a
blocking domain, the corresponding PASID-table entry has its Present (P)
bit cleared.

Per the VT-d spec (condition SPT.2), with P=0:

- Translation Requests (with or without PASID) complete successfully,
   but return R=W=U=S=0 to the device — effectively a no-access result.
- Untranslated Requests receive UR.
- Translated Requests are N/A.

So while neither the PCI ATS capability nor the DTE bit is explicitly
cleared when a blocking domain is attached, ATS-related transactions
don't produce any usable result from the device's perspective.
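
As a rough illustration only, the three P=0 outcomes listed above can be written as a small lookup. The enum and function names below are hypothetical and are not part of the VT-d spec or the Intel IOMMU driver; the mapping just restates the list.

```c
#include <assert.h>

/* Hypothetical names for the three request types discussed above. */
enum ats_request {
	ATS_TRANSLATION_REQUEST,	/* with or without PASID */
	ATS_UNTRANSLATED_REQUEST,
	ATS_TRANSLATED_REQUEST,
};

/* Hypothetical names for the outcomes when the entry has P=0. */
enum ats_result {
	ATS_SUCCESS_NO_PERMISSIONS,	/* completes, but R=W=U=S=0 */
	ATS_UNSUPPORTED_REQUEST,	/* UR */
	ATS_NOT_APPLICABLE,		/* N/A */
};

/* Outcome of a request when the PASID-table entry has P=0. */
static enum ats_result result_with_p_clear(enum ats_request req)
{
	switch (req) {
	case ATS_TRANSLATION_REQUEST:
		return ATS_SUCCESS_NO_PERMISSIONS;
	case ATS_UNTRANSLATED_REQUEST:
		return ATS_UNSUPPORTED_REQUEST;
	case ATS_TRANSLATED_REQUEST:
		return ATS_NOT_APPLICABLE;
	}
	return ATS_NOT_APPLICABLE;
}
```

None of the three outcomes hands the device a translation with any permission bit set.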

Does this hardware behavior satisfy the security expectation you have in
mind? Or do you still require that both the DTE bit and the PCI ATS
capability be explicitly disabled when a blocking domain is in effect?

Regards,
Yi Liu


* Re: [PATCH v5 7/8] dt-bindings: raspberrypi,bcm2835-firmware: Drop unnecessary select
From: Krzysztof Kozlowski @ 2026-05-21  7:11 UTC (permalink / raw)
  To: Gregor Herburger, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Florian Fainelli, Ray Jui, Scott Branden,
	Broadcom internal kernel review list, Eric Anholt, Stefan Wahren,
	Srinivas Kandagatla, Kees Cook, Gustavo A. R. Silva,
	Thomas Weißschuh
  Cc: devicetree, linux-rpi-kernel, linux-arm-kernel, linux-kernel,
	linux-hardening, Conor Dooley
In-Reply-To: <20260520-rpi-otp-driver-v5-7-b26e5908eeac@linutronix.de>

On 20/05/2026 16:27, Gregor Herburger wrote:
> The select schema is not necessary because the
> raspberrypi,bcm2835-firmware compatible is already matched by the
> compatible string values. 

This is wrong. The select was not because of that. Select was needed
because of simple-mfd, but dtschema was changed, so please rephrase:

The "select" in schema is not necessary anymore since dtschema drops
simple-mfd when constructing the select/filter query for schemas with
compatibles.

Best regards,
Krzysztof



* Re: [PATCH v5 8/8] arm64: defconfig: Enable the raspberrypi otp driver as module
From: Krzysztof Kozlowski @ 2026-05-21  7:09 UTC (permalink / raw)
  To: Gregor Herburger, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Florian Fainelli, Ray Jui, Scott Branden,
	Broadcom internal kernel review list, Eric Anholt, Stefan Wahren,
	Srinivas Kandagatla, Kees Cook, Gustavo A. R. Silva,
	Thomas Weißschuh
  Cc: devicetree, linux-rpi-kernel, linux-arm-kernel, linux-kernel,
	linux-hardening
In-Reply-To: <20260520-rpi-otp-driver-v5-8-b26e5908eeac@linutronix.de>

On 20/05/2026 16:28, Gregor Herburger wrote:
> Enable the newly added Raspberry Pi OTP driver as module to allow access
> to the otp registers.

... on foo bar board?

Otherwise, why do we want it in upstream?

Best regards,
Krzysztof



* Re: [PATCH v2 0/2] KVM: arm64: nv: Reduce FP/SVE overhead on exception/exception return
From: Marc Zyngier @ 2026-05-21  7:07 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel, kvm, Marc Zyngier
  Cc: Steffen Eiden, Joey Gouly, Suzuki K Poulose, Oliver Upton,
	Zenghui Yu, Mark Rutland, Will Deacon, Fuad Tabba
In-Reply-To: <20260520085036.541666-1-maz@kernel.org>

On Wed, 20 May 2026 09:50:34 +0100, Marc Zyngier wrote:
> This is the second version of this short series optimising away a lot
> of unnecessary FPSIMD/SVE context switch with NV.
> 
> * From v1 [1]:
> 
>   - New commit message on patch #2 (Mark)
> 
> [...]

Applied to next, thanks!

[1/2] KVM: arm64: nv: Track L2 to L1 exception emulation
      commit: 27ae400e6e888153ded1ad807a94a94e506dd2df
[2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception
      commit: 435c466196148ae116f616e6cda97c33281defc2

Cheers,

	M.
-- 
Without deviation from the norm, progress is not possible.





* Re: [PATCH 1/8] mm: Add ptep_try_set() for lockless empty-slot installs
From: Andrea Righi @ 2026-05-21  7:00 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David Vernet, Changwoo Min, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kumar Kartikeya Dwivedi,
	Peter Zijlstra, Catalin Marinas, Will Deacon, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Andrew Morton,
	David Hildenbrand, Mike Rapoport, Emil Tsalapatis, sched-ext, bpf,
	x86, linux-arm-kernel, linux-mm, linux-kernel
In-Reply-To: <20260520235052.4180316-2-tj@kernel.org>

Hi Tejun,

On Wed, May 20, 2026 at 01:50:45PM -1000, Tejun Heo wrote:
> Add ptep_try_set(ptep, new_pte): atomically set *ptep to new_pte iff it is
> currently pte_none(). Returns true on success, false if the slot was already
> populated or the arch has no implementation.
> 
> The intended caller is the upcoming bpf_arena kernel-side fault recovery
> path. The install runs from a page fault that can be nested under locks
> held by the faulting kernel caller (e.g. a BPF program holding
> raw_res_spin_lock_irqsave on its arena's spinlock), so trylock-and-retry
> would A-A deadlock. Lock-free cmpxchg is the only viable option, which
> constrains this helper to special kernel page tables where concurrent
> writers cooperate via atomic accessors.
> 
> The generic version in <linux/pgtable.h> returns false. x86 and arm64
> override with try_cmpxchg-based implementations on the underlying pteval.
> Other architectures get the false stub - the callers there already fall
> through to oops.
> 
> v2: Rename to ptep_try_set(). Tighten kerneldoc for kernel-PTE use.
>     (David, Alexei)
> 
> Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: David Hildenbrand <david@kernel.org>
> ---
>  arch/arm64/include/asm/pgtable.h |  8 ++++++++
>  arch/x86/include/asm/pgtable.h   |  8 ++++++++
>  include/linux/pgtable.h          | 26 ++++++++++++++++++++++++++
>  3 files changed, 42 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 9029b81ccbe8..a129be91ef2c 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -1830,6 +1830,14 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>  	return __ptep_get_and_clear(mm, addr, ptep);
>  }
>  
> +static inline bool ptep_try_set(pte_t *ptep, pte_t new_pte)
> +{
> +	pteval_t old = 0;
> +
> +	return try_cmpxchg(&pte_val(*ptep), &old, pte_val(new_pte));
> +}
> +#define ptep_try_set ptep_try_set
> +
>  #define test_and_clear_young_ptes test_and_clear_young_ptes
>  static inline bool test_and_clear_young_ptes(struct vm_area_struct *vma,
>  		unsigned long addr, pte_t *ptep, unsigned int nr)
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 13e3e9a054cb..047e273a4eab 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -1284,6 +1284,14 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm,
>  	} while (!try_cmpxchg((long *)&ptep->pte, (long *)&old_pte, *(long *)&new_pte));
>  }
>  
> +static inline bool ptep_try_set(pte_t *ptep, pte_t new_pte)
> +{
> +	pte_t old_pte = __pte(0);
> +
> +	return try_cmpxchg((long *)&ptep->pte, (long *)&old_pte, *(long *)&new_pte);
> +}

Minor nit (feel free to ignore), on x86 pte_none() is defined as:

static inline int pte_none(pte_t pte)
{
	return !(pte.pte & ~(_PAGE_KNL_ERRATUM_MASK));
}

With:

#if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
#define _PAGE_KNL_ERRATUM_MASK (_PAGE_DIRTY | _PAGE_ACCESSED)
#else
#define _PAGE_KNL_ERRATUM_MASK 0
#endif

If that mask has the D/A bits set, try_cmpxchg(..., &old=0, ...) will reject a
PTE that has only those bits set, even though pte_none() would return true. I
think this is fine for the bpf_arena use case, since hardware shouldn't set A/D
for fresh pages that the BPF prog hasn't touched.

Maybe it's worth adding a comment (something along these lines)?

 /*
  * Note: strictly-zero compare is narrower than pte_none() (see pte_none() and
  * _PAGE_KNL_ERRATUM_MASK), but the gap is harmless in practice: HW shouldn't
  * set _PAGE_DIRTY | _PAGE_ACCESSED bits on entries the caller never touched.
  */
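
To make the gap concrete, here is a small user-space model (a sketch, not kernel code). The MODEL_* macros and model_* functions are hypothetical stand-ins; the bit positions are chosen to mirror the x86 Accessed and Dirty bits, and C11 atomics stand in for try_cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for _PAGE_ACCESSED / _PAGE_DIRTY (bits 5 and 6). */
#define MODEL_PAGE_ACCESSED	(UINT64_C(1) << 5)
#define MODEL_PAGE_DIRTY	(UINT64_C(1) << 6)
#define MODEL_ERRATUM_MASK	(MODEL_PAGE_DIRTY | MODEL_PAGE_ACCESSED)

/* Models pte_none(): the entry counts as empty even if only A/D are set. */
static bool model_pte_none(uint64_t pte)
{
	return !(pte & ~MODEL_ERRATUM_MASK);
}

/* Models the strictly-zero compare: succeeds only if the slot is exactly 0. */
static bool model_ptep_try_set(_Atomic uint64_t *ptep, uint64_t new_pte)
{
	uint64_t old = 0;

	return atomic_compare_exchange_strong(ptep, &old, new_pte);
}
```

With a slot holding only MODEL_PAGE_ACCESSED, model_pte_none() reports it as empty while model_ptep_try_set() refuses to install, which is the narrow case the comment would document.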

Other than that, looks good to me.

Reviewed-by: Andrea Righi <arighi@nvidia.com>

Thanks,
-Andrea

> +#define ptep_try_set ptep_try_set
> +
>  #define flush_tlb_fix_spurious_fault(vma, address, ptep) do { } while (0)
>  
>  #define  __HAVE_ARCH_PMDP_SET_ACCESS_FLAGS
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index cdd68ed3ae1a..d68374f404c1 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -1036,6 +1036,32 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
>  }
>  #endif
>  
> +#ifndef ptep_try_set
> +/**
> + * ptep_try_set - atomically set an empty kernel PTE
> + * @ptep: page table entry
> + * @new_pte: value to install
> + *
> + * Atomically set *@ptep to @new_pte iff *@ptep is pte_none(). Return
> + * true on success, false if the slot was already populated or the
> + * arch has no implementation.
> + *
> + * For special kernel page tables only - never user page tables. The
> + * caller must prevent concurrent teardown of @ptep and must accept
> + * that other writers may race. Concurrent clearers must use
> + * ptep_get_and_clear() so racing accesses agree on the outcome.
> + *
> + * Architectures opt in by providing a cmpxchg-based override and
> + * defining ptep_try_set as an identity macro. The generic stub
> + * returns false, which is correct for callers that fall through to
> + * oops on failure.
> + */
> +static inline bool ptep_try_set(pte_t *ptep, pte_t new_pte)
> +{
> +	return false;
> +}
> +#endif
> +
>  #ifndef wrprotect_ptes
>  /**
>   * wrprotect_ptes - Write-protect PTEs that map consecutive pages of the same
> -- 
> 2.54.0
> 



* [PATCH v3] usb: gadget: aspeed_udc: avoid past-the-end iterator in dequeue
From: Maoyi Xie @ 2026-05-21  6:54 UTC (permalink / raw)
  To: Andrew Jeffery, Neal Liu
  Cc: Greg Kroah-Hartman, Benjamin Herrenschmidt, Joel Stanley,
	Andrew Lunn, Alan Stern, linux-aspeed, linux-arm-kernel,
	linux-usb, linux-kernel
In-Reply-To: <20260519080213.1932516-1-maoyixie.tju@gmail.com>

ast_udc_ep_dequeue() declares the loop cursor `req` outside the
list_for_each_entry(). After the loop it tests `&req->req != _req`
to decide whether the request was found. If the queue holds no
match, `req` is past-the-end. It then aliases
container_of(&ep->queue, struct ast_udc_request, queue) via offset
cancellation. Whether that synthetic address equals `_req` depends
on heap layout. The function can return 0 without dequeueing
anything.

Default `rc` to -EINVAL and set it to 0 only inside the match
branch. `req` is no longer read after the loop, so the past-the-end
dereference goes away. No extra cursor variable or post-loop test
is needed.
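
For illustration, the resulting shape can be modelled in user space as below. This is a sketch only: struct node, struct request and the queue_* helpers are hypothetical stand-ins for the kernel's list_head machinery, not the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Minimal circular doubly-linked list, standing in for struct list_head. */
struct node {
	struct node *next, *prev;
};

struct request {
	int id;
	struct node queue;
};

/* Stand-in for container_of(): recover the request from its list node. */
#define node_to_request(n) \
	((struct request *)((char *)(n) - offsetof(struct request, queue)))

static void queue_init(struct node *head)
{
	head->next = head->prev = head;
}

static void queue_add_tail(struct node *head, struct node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void queue_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = n;
}

/* Default to -EINVAL; report success only from inside the match branch. */
static int dequeue(struct node *head, struct request *target)
{
	int rc = -EINVAL;

	for (struct node *pos = head->next; pos != head; pos = pos->next) {
		if (node_to_request(pos) == target) {
			queue_del(pos);
			rc = 0;
			break;
		}
	}

	/* The cursor is never inspected here, so no past-the-end read. */
	return rc;
}
```

The return value depends only on whether the match branch ran, never on where the cursor ended up.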

Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Suggested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com>
---
v3: Switch to Andrew Jeffery's shape: default rc to -EINVAL, set
    rc=0 inside the match branch, drop the post-loop check. Smaller
    diff, no extra cursor variable, no goto. Same semantic fix as v2.
v2: https://lore.kernel.org/linux-usb/20260519080213.1932516-1-maoyixie.tju@gmail.com/
v1: https://lore.kernel.org/linux-usb/20260518073403.1285339-1-maoyi.xie@ntu.edu.sg/

 drivers/usb/gadget/udc/aspeed_udc.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/gadget/udc/aspeed_udc.c b/drivers/usb/gadget/udc/aspeed_udc.c
index 7fc6696b7694..75f9c831b21a 100644
--- a/drivers/usb/gadget/udc/aspeed_udc.c
+++ b/drivers/usb/gadget/udc/aspeed_udc.c
@@ -694,7 +694,7 @@ static int ast_udc_ep_dequeue(struct usb_ep *_ep, struct usb_request *_req)
 	struct ast_udc_dev *udc = ep->udc;
 	struct ast_udc_request *req;
 	unsigned long flags;
-	int rc = 0;
+	int rc = -EINVAL;
 
 	spin_lock_irqsave(&udc->lock, flags);
 
@@ -704,14 +704,11 @@ static int ast_udc_ep_dequeue(struct usb_ep *_ep, struct usb_request *_req)
 			list_del_init(&req->queue);
 			ast_udc_done(ep, req, -ESHUTDOWN);
 			_req->status = -ECONNRESET;
+			rc = 0;
 			break;
 		}
 	}
 
-	/* dequeue request not found */
-	if (&req->req != _req)
-		rc = -EINVAL;
-
 	spin_unlock_irqrestore(&udc->lock, flags);
 
 	return rc;
-- 
2.34.1




* Re: [PATCH 04/10] [v2] sh: select legacy gpiolib interface
From: John Paul Adrian Glaubitz @ 2026-05-21  6:49 UTC (permalink / raw)
  To: Arnd Bergmann, linux-gpio
  Cc: linux-kernel, Arnd Bergmann, Christian Lamparter, Johannes Berg,
	Aaro Koskinen, Andreas Kemnade, Kevin Hilman, Roger Quadros,
	Tony Lindgren, Thomas Bogendoerfer, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Linus Walleij,
	Bartosz Golaszewski, Dmitry Torokhov, Lee Jones, Pavel Machek,
	Matti Vaittinen, Florian Fainelli, Jonas Gorski, Andrew Lunn,
	Vladimir Oltean, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-wireless, linux-omap, linux-arm-kernel,
	linux-mips, linux-sh, linux-input, linux-leds, netdev
In-Reply-To: <20260520183815.2510387-5-arnd@kernel.org>

Hi Arnd,

On Wed, 2026-05-20 at 20:38 +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> Many board files on sh reference the legacy gpiolib interfaces that
> are becoming optional. To ensure the boards can keep building, select
> CONFIG_GPIOLIB_LEGACY on each of the boards that have one of the
> hardcoded calls.
> 
> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> v2: no changes. Adrian said he'll pick it up for 7.2, but so
>     far the patch is not in linux-next yet, so I'm including it
>     for completeness here.

Sorry, I haven't gotten around to picking up the changes for v7.2 yet. I can
pick it up this weekend, as I was planning to review and merge some
patches then.

I have received quite a lot of patches for SH recently, so it will take
some time to dig myself through the queue.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



* [PATCH v2] i2c: imx: fix clock and pinctrl state inconsistency in runtime PM
From: Carlos Song (OSS) @ 2026-05-21  6:50 UTC (permalink / raw)
  To: o.rempel, kernel, andi.shyti, Frank.Li, s.hauer, festevam,
	carlos.song
  Cc: linux-i2c, linux-arm-kernel, linux-kernel, stable

From: Carlos Song <carlos.song@nxp.com>

In i2c_imx_runtime_suspend(), the clock is disabled before switching
the pinctrl state to sleep. If pinctrl_pm_select_sleep_state() fails,
the runtime suspend is aborted but the clock remains disabled, causing
a system crash when the hardware is subsequently accessed.

Fix this by switching the pinctrl state before disabling the clock so
that a pinctrl failure leaves the clock enabled and the hardware
accessible.

In i2c_imx_runtime_resume(), restore the pinctrl state back to sleep
if clk_enable() fails, to keep the clock and pinctrl state consistent.
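
The ordering can be modelled in user space as follows. This is a sketch under stated assumptions: struct model_dev, its injected error fields and the model_* functions are hypothetical, and resume is assumed to select the default pin state before enabling the clock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device model with injectable failures. */
struct model_dev {
	bool clk_on;
	bool pins_sleep;
	int pinctrl_sleep_err;	/* non-zero: selecting the sleep state fails */
	int clk_enable_err;	/* non-zero: enabling the clock fails */
};

static int model_select_sleep(struct model_dev *d)
{
	if (d->pinctrl_sleep_err)
		return d->pinctrl_sleep_err;
	d->pins_sleep = true;
	return 0;
}

static int model_select_default(struct model_dev *d)
{
	d->pins_sleep = false;
	return 0;
}

/* Switch pins first: a pinctrl failure leaves the clock running. */
static int model_runtime_suspend(struct model_dev *d)
{
	int ret = model_select_sleep(d);

	if (ret)
		return ret;

	d->clk_on = false;
	return 0;
}

/* If the clock cannot be enabled, put the pins back to sleep. */
static int model_runtime_resume(struct model_dev *d)
{
	int ret = model_select_default(d);

	if (ret)
		return ret;

	if (d->clk_enable_err) {
		model_select_sleep(d);
		return d->clk_enable_err;
	}

	d->clk_on = true;
	return 0;
}
```

In both failure paths the model ends up in a state that matches what the runtime PM core believes: still active after a failed suspend, still suspended after a failed resume.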

Fixes: 576eba03c994 ("i2c: imx: switch different pinctrl state in different system power status")
Cc: stable@vger.kernel.org
Signed-off-by: Carlos Song <carlos.song@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
Change for v2:
  - Fix commit log to "keep the consistent" according to Frank's
    suggestion.
---
 drivers/i2c/busses/i2c-imx.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
index a208fefd3c3b..28313d0fad37 100644
--- a/drivers/i2c/busses/i2c-imx.c
+++ b/drivers/i2c/busses/i2c-imx.c
@@ -1892,9 +1892,15 @@ static void i2c_imx_remove(struct platform_device *pdev)
 static int i2c_imx_runtime_suspend(struct device *dev)
 {
 	struct imx_i2c_struct *i2c_imx = dev_get_drvdata(dev);
+	int ret;
+
+	ret = pinctrl_pm_select_sleep_state(dev);
+	if (ret)
+		return ret;
 
 	clk_disable(i2c_imx->clk);
-	return pinctrl_pm_select_sleep_state(dev);
+
+	return 0;
 }
 
 static int i2c_imx_runtime_resume(struct device *dev)
@@ -1907,10 +1913,13 @@ static int i2c_imx_runtime_resume(struct device *dev)
 		return ret;
 
 	ret = clk_enable(i2c_imx->clk);
-	if (ret)
+	if (ret) {
 		dev_err(dev, "can't enable I2C clock, ret=%d\n", ret);
+		pinctrl_pm_select_sleep_state(dev);
+		return ret;
+	}
 
-	return ret;
+	return 0;
 }
 
 static int i2c_imx_suspend(struct device *dev)
-- 
2.43.0




* Re: [PATCH v2 2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception
From: Marc Zyngier @ 2026-05-21  6:35 UTC (permalink / raw)
  To: Mark Rutland
  Cc: kvmarm, linux-arm-kernel, kvm, Steffen Eiden, Joey Gouly,
	Suzuki K Poulose, Oliver Upton, Zenghui Yu, Will Deacon,
	Fuad Tabba
In-Reply-To: <ag2w0G34NycT2456@J2N7QTR9R3.cambridge.arm.com>

On Wed, 20 May 2026 14:02:08 +0100,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> On Wed, May 20, 2026 at 09:50:36AM +0100, Marc Zyngier wrote:
> > When switching between L1 and L2, we save the old state using
> > kvm_arch_vcpu_put(), mutate the state in memory, then load the new
> > state using kvm_arch_vcpu_load(). Any live FPSIMD/SVE state is saved
> > and unbound, such that it can be lazily restored on a subsequent trap.
> > 
> > The FPSIMD/SVE state is shared by exception levels, and only a handful
> > of related control registers need to be changed when transitioning
> > between L1 and L2. The save/restore of the common state is needless
> > overhead, especially as trapping becomes exponentially more expensive
> > with nesting.
> > 
> > Avoid this overhead by leaving the common FPSIMD/SVE state live on the
> > CPU, and only switching the state that is distinct for L1 and L2:
> > 
> > - the trap controls: the effective values are recomputed on each entry
> >   into the guest to take the EL into account and merge the L0 and L1
> >   configuration if in a nested context, or directly use the L0 configuration
> >   in non-nested context (see __activate_traps()).
> > 
> > - the VL settings: the effective values are also recomputed on each
> >   entry into the guest (see fpsimd_lazy_switch_to_guest()).
> >
> > Since we appear to cover all bases, use the vcpu flags indicating the
> > handling of a nested ERET or exception delivery to avoid the whole FP
> > save/restore shenanigans. SME will have to be similarly dealt with when
> > it eventually gets supported.
> > 
> > For an EL1 L3 guest where L1 and L2 have this optimisation, this
> > results in at least a 10% wall clock reduction when running an I/O
> > heavy workload, generating a high rate of nested exceptions.
> 
> There's one additional thing that's important, but I forgot to mention
> last time: in the window between kvm_arch_vcpu_put() and
> kvm_arch_vcpu_load(), it's possible to take an interrupt, and for a
> softirq handler to try to use kernel mode NEON.
> 
> Due to that, kvm_arch_vcpu_put() must leave the L1 guest's maximum VL
> configured in the host's ZCR_ELx, such that the guest's state can be
> saved.
> 
> That value is configured by fpsimd_lazy_switch_to_host(), so we just
> need to make sure that kvm_arch_vcpu_put() doesn't clobber it. I *think*
> that's fine today, but maybe that warrants a comment somewhere.

I have slapped this onto this patch:

diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index aca98752a6e42..3f6b1e29cd6b9 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -117,7 +117,10 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
 	unsigned long flags;
 
 	/*
-	 * See comment in kvm_arch_vcpu_load_fp().
+	 * See comment in kvm_arch_vcpu_load_fp(). Note that we also rely on
+	 * the guest's max VL to have been set by fpsimd_lazy_switch_to_host()
+	 * so that any intervening kernel-mode SIMD (NEON or otherwise)
+	 * operation sees the full guest state that needs saving.
 	 */
 	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
 	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION)) {

> Other than that, this all looks good to me:
> 
> Acked-by: Mark Rutland <mark.rutland@arm.com>

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [PATCH v2 2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception
From: Marc Zyngier @ 2026-05-21  6:21 UTC (permalink / raw)
  To: Joey Gouly
  Cc: kvmarm, linux-arm-kernel, kvm, Steffen Eiden, Suzuki K Poulose,
	Oliver Upton, Zenghui Yu, Mark Rutland, Will Deacon, Fuad Tabba
In-Reply-To: <20260520110231.GA4005903@e124191.cambridge.arm.com>

On Wed, 20 May 2026 12:02:31 +0100,
Joey Gouly <joey.gouly@arm.com> wrote:
> 
> Hi Marc,
> 
> On Wed, May 20, 2026 at 09:50:36AM +0100, Marc Zyngier wrote:
> > When switching between L1 and L2, we save the old state using
> > kvm_arch_vcpu_put(), mutate the state in memory, then load the new
> > state using kvm_arch_vcpu_load(). Any live FPSIMD/SVE state is saved
> > and unbound, such that it can be lazily restored on a subsequent trap.
> > 
> > The FPSIMD/SVE state is shared by exception levels, and only a handful
> > of related control registers need to be changed when transitioning
> > between L1 and L2. The save/restore of the common state is needless
> > overhead, especially as trapping becomes exponentially more expensive
> > with nesting.
> > 
> > Avoid this overhead by leaving the common FPSIMD/SVE state live on the
> > CPU, and only switching the state that is distinct for L1 and L2:
> 
> To make sure I understand this part:
> 	
> 	L1 sets up L2's FP state live on the CPU 
> 	L1 erets
> 	eret traps to L0/host
> 	preemption disabled
> 	kvm_arch_vcpu_put()
> 	    kvm_arch_vcpu_put_fp() <-- actually saves the state of the live registers
> 	.. set elr etc ..
> 	kvm_arch_vcpu_load()
> 	    kvm_arch_vcpu_load_fp() <-- doesn't actually restore state, but ensures
>                                         the CPTR trap will be set
>         .. returns to L2 (traps on first use of FP and state will be restored)
> 	
> So this patch is (effectively) removing the put_fp()/load_fp(), because the FP
> state is common/shared between L1 and L2, so whatever L1 put into that state
> before the eret, L2 was going to see.

Yes, you got it right. The other path is on L1 to L2 exception, which
also requires L0 mediation and has a similar shape.

The most horrible thing is that because all these traps can happen at
an arbitrary depth, each individual trap usually results in the
combination of all of the above.

> If my understanding is correct:
> Reviewed-by: Joey Gouly <joey.gouly@arm.com>

Thanks!

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [PATCH v3] dt-bindings: mfd: st,stmpe: fix PWM schema and drop legacy binding
From: Manish Baing @ 2026-05-21  5:47 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: lee, linusw, robh, krzk+dt, conor+dt, mcoquelin.stm32,
	alexandre.torgue, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-pwm
In-Reply-To: <agnY16I4sYAdRd9T@monoceros>

Hi Uwe,
> If the patch was split into two, each touching just one of the files,
> there would be no need for merge coordination. Also logically it's two
> patches. Would you mind splitting?

That makes perfect sense. I will split this into a two-patch series
(one for the MFD YAML fix and one for the PWM TXT deletion) and submit
it shortly as v4.
Thanks for the feedback!

Thanks and Regards,
Manish


On Sun, May 17, 2026 at 8:35 PM Uwe Kleine-König <ukleinek@kernel.org> wrote:
>
> Hello,
>
> On Sat, May 09, 2026 at 07:39:28PM +0000, Manish Baing wrote:
> > The st,stmpe-pwm binding is already covered by the MFD schema in
> > Documentation/devicetree/bindings/mfd/st,stmpe.yaml. However, the
> > PWM subnode was missing a 'required' properties block. This allowed
> > Device Tree nodes to pass validation even if the 'compatible'
> > string was omitted. This omission could lead to probe failures
> > at runtime.
> >
> > Fix the schema by adding the missing 'required' block and
> > remove the obsolete and redundant text binding file.
> >
> > Signed-off-by: Manish Baing <manishbaing2789@gmail.com>
> > ---
> > Changes in v3:
> > - Added 'required' properties to the pwm subnode in st,stmpe.yaml
> >   to close a validation gap identified by the Sashiko.
> > - Updated commit message and description to reflect MFD subsystem changes.
> >
> > Changes in v2:
> >  - Dropped the TXT file instead of converting to YAML, as the
> >    functionality is already covered by st,stmpe.yaml.
> >
> >  .../devicetree/bindings/mfd/st,stmpe.yaml      |  4 ++++
> >  .../devicetree/bindings/pwm/st,stmpe-pwm.txt   | 18 ------------------
>
> If the patch was split into two, each touching just one of the files,
> there would be no need for merge coordination. Also logically it's two
> patches. Would you mind splitting?
>
> Best regards
> Uwe



* Re: [PATCH v2 1/4] dt-bindings: display: verisilicon, dc: generalize for single-output variants
From: Joey Lu @ 2026-05-21  5:41 UTC (permalink / raw)
  To: Icenowy Zheng, Conor Dooley
  Cc: maarten.lankhorst, mripard, tzimmermann, airlied, simona, robh,
	krzk+dt, conor+dt, ychuang3, schung, yclu4, dri-devel, devicetree,
	linux-arm-kernel, linux-kernel
In-Reply-To: <47a06094541da642cabcb6b7d2f92d5125d365ea.camel@iscas.ac.cn>

On 5/20/2026 12:07 PM, Icenowy Zheng wrote:
> On Wed, 2026-05-20 at 11:06 +0800, Joey Lu wrote:
>> On 5/20/2026 12:47 AM, Conor Dooley wrote:
>>> On Tue, May 19, 2026 at 03:26:58PM +0800, Icenowy Zheng wrote:
>>>> On Tue, 2026-05-19 at 13:51 +0800, Joey Lu wrote:
>>>>> The existing schema assumes a fixed clock/reset topology and
>>>>> dual-
>>>>> output
>>>>> port structure matching the DC8200 IP block.  This prevents
>>>>> reuse for
>>>>> single-output variants such as the Verisilicon DCU Lite used in
>>>>> the
>>>>> Nuvoton MA35D1 SoC.
>>>>>
>>>>> Rework the schema so that variant-specific constraints are
>>>>> expressed
>>>>> via allOf/if-then-else:
>>>>>
>>>>> - The thead,th1520-dc8200 compatible keeps its existing five-
>>>>> clock,
>>>>>     three-reset, dual-port requirements.
>>>>>
>>>>> - A standalone verisilicon,dc compatible covers IPs whose
>>>>> identity is
>>>>>     discovered entirely through hardware registers; these have
>>>>> flexible
>>>>>     clock and reset counts, a single 'port' property, and no
>>>>> 'ports'
>>>>>     requirement.
>>>>>
>>>>> Changes to the base schema:
>>>>> - Replace the fixed clock/reset items lists with
>>>>> minItems/maxItems
>>>>>     ranges; variant sub-schemas tighten the constraints via if-
>>>>> then-
>>>>> else.
>>>>> - Add a 'port' property (graph.yaml single-port alias)
>>>>> alongside the
>>>>>     existing 'ports', for single-output variants.
>>>>> - Drop the unconditional 'ports' requirement; each if-branch
>>>>> enforces
>>>>>     its own port topology.
>>>>> - Tighten additionalProperties to unevaluatedProperties to
>>>>> allow
>>>>>     per-variant schemas to add their own constraints cleanly.
>>>>> - Fix a stray space in the port@0 description.
>>>>> - Add a DT example for the generic verisilicon,dc compatible
>>>>>     (Nuvoton MA35D1 DCU Lite).
>>>>>
>>>>> Signed-off-by: Joey Lu <a0987203069@gmail.com>
>>>>> ---
>>>>>    .../bindings/display/verisilicon,dc.yaml      | 135
>>>>> ++++++++++++++--
>>>>> --
>>>>>    1 file changed, 108 insertions(+), 27 deletions(-)
>>>>>
>>>>> diff --git
>>>>> a/Documentation/devicetree/bindings/display/verisilicon,dc.yaml
>>>>> b/Documentation/devicetree/bindings/display/verisilicon,dc.yaml
>>>>> index 9dc35ab973f2..3a814c2e083e 100644
>>>>> ---
>>>>> a/Documentation/devicetree/bindings/display/verisilicon,dc.yaml
>>>>> +++
>>>>> b/Documentation/devicetree/bindings/display/verisilicon,dc.yaml
>>>>> @@ -14,10 +14,12 @@ properties:
>>>>>        pattern: "^display@[0-9a-f]+$"
>>>>>    
>>>>>      compatible:
>>>>> -    items:
>>>>> -      - enum:
>>>>> -          - thead,th1520-dc8200
>>>> You should add a fallback compatible here for your SoC, in case
>>>> its
>>>> integration gets something quirky; this compatible is usually not
>>>> consumed by the driver (see how thead,th1520-dc8200 exists in the
>>>> binding but not the driver).
>>> s/fallback compatible/soc-specific compatible/, but yes.
>>> NAK to what's been done here, especially after the discussions on
>>> earlier versions of this verisilicon binding.
>>> pw-bot: changes-requested
>> Understood. I will add `nuvoton,ma35d1-dcu` as the SoC-specific
>> compatible string paired with `verisilicon,dc` as the generic
>> fallback,
>> matching the pattern used for `thead,th1520-dc8200`. The standalone
>> `verisilicon,dc` compatible will be removed from the binding. The
>> driver
> No, please don't remove compatible strings from existing binding, and
> the generic compatible is still used for driver binding.
>
> The SoC-specific compatible is informative here, it needs to exist, but
> it doesn't supersede "verisilicon,dc".
>
> In addition, the SoC-specific compatible is also used for verification
> of the SoC device tree, which is the reason if clauses exist with
> compatible match and additional constraints (e.g. for the nuvoton DCU
> it's invalid to have a 2nd output port).
Sorry for the misunderstanding. I now see that a standalone generic 
fallback compatible is not preferred here, and that the SoC-specific 
compatible is strictly required for DT validation. I will add 
`nuvoton,ma35d1-dcu` as the SoC-specific compatible string in the 
existing compatible items list, without adding or removing anything else.
>> match table is not changed since hardware detection is done via ID
>> registers.
>>>>> -      - const: verisilicon,dc # DC IPs have discoverable
>>>>> ID/revision
>>>>> registers
>>>>> +    oneOf:
>>>>> +      - items:
>>>>> +          - enum:
>>>>> +              - thead,th1520-dc8200
>>>>> +          - const: verisilicon,dc
>>>>> +      - const: verisilicon,dc  # DC IPs have discoverable
>>>>> ID/revision registers
>>>>>    
>>>>>      reg:
>>>>>        maxItems: 1
>>>>> @@ -26,32 +28,24 @@ properties:
>>>>>        maxItems: 1
>>>>>    
>>>>>      clocks:
>>>>> -    items:
>>>>> -      - description: DC Core clock
>>>>> -      - description: DMA AXI bus clock
>>>>> -      - description: Configuration AHB bus clock
>>>>> -      - description: Pixel clock of output 0
>>>>> -      - description: Pixel clock of output 1
>>>>> +    minItems: 2
>>>>> +    maxItems: 5
>>>>>    
>>>>>      clock-names:
>>>>> -    items:
>>>>> -      - const: core
>>>>> -      - const: axi
>>>>> -      - const: ahb
>>>>> -      - const: pix0
>>>>> -      - const: pix1
>>>>> +    minItems: 2
>>>>> +    maxItems: 5
>>>>>    
>>>>>      resets:
>>>>> -    items:
>>>>> -      - description: DC Core reset
>>>>> -      - description: DMA AXI bus reset
>>>>> -      - description: Configuration AHB bus reset
>>>>> +    minItems: 1
>>>>> +    maxItems: 3
>>>>>    
>>>>>      reset-names:
>>>>> -    items:
>>>>> -      - const: core
>>>>> -      - const: axi
>>>>> -      - const: ahb
>>>>> +    minItems: 1
>>>>> +    maxItems: 3
>>>>> +
>>>>> +  port:
>>>>> +    $ref: /schemas/graph.yaml#/properties/port
>>>>> +    description: Single video output port for single-output
>>>>> variants.
>>>> Maybe the endpoint numbering rule needs a move to here? (I am not
>>>> very
>>>> sure).
>> I will add a description to the `port` property noting that endpoint
>> 0
>> is used for DPI output, which is the only output type for
>> DCUltraLite.
> Please note that DC8000 exists, which is single-port but supports both
> DPI and DP.
To make it simple, the `port` property will not be added. `ports` 
remains the sole port property and is kept in the global `required:` 
list as in the original. The MA35D1 example will use `ports { port@0 { 
... } }`, consistent with how other single-output DT nodes are written 
in the kernel.
>>>>>    
>>>>>      ports:
>>>>>        $ref: /schemas/graph.yaml#/properties/ports
>>>>> @@ -59,7 +53,7 @@ properties:
>>>>>        properties:
>>>>>          port@0:
>>>>>            $ref: /schemas/graph.yaml#/properties/port
>>>>> -        description: The first output channel , endpoint 0
>>>>> should be
>>>>> +        description: The first output channel, endpoint 0
>>>>> should be
>>>>>              used for DPI format output and endpoint 1 should be
>>>>> used
>>>>>              for DP format output.
>>>>>    
>>>>> @@ -75,9 +69,75 @@ required:
>>>>>      - interrupts
>>>>>      - clocks
>>>>>      - clock-names
>>>>> -  - ports
>>>>>    
>>>>> -additionalProperties: false
>>>>> +allOf:
>>>>> +  - if:
>>>>> +      properties:
>>>>> +        compatible:
>>>>> +          contains:
>>>>> +            const: thead,th1520-dc8200
>>>>> +    then:
>>>>> +      properties:
>>>>> +        clocks:
>>>>> +          items:
>>>>> +            - description: DC Core clock
>>>>> +            - description: DMA AXI bus clock
>>>>> +            - description: Configuration AHB bus clock
>>>>> +            - description: Pixel clock of output 0
>>>>> +            - description: Pixel clock of output 1
>>>>> +
>>>>> +        clock-names:
>>>>> +          items:
>>>>> +            - const: core
>>>>> +            - const: axi
>>>>> +            - const: ahb
>>>>> +            - const: pix0
>>>>> +            - const: pix1
>>>>> +
>>>>> +        resets:
>>>>> +          items:
>>>>> +            - description: DC Core reset
>>>>> +            - description: DMA AXI bus reset
>>>>> +            - description: Configuration AHB bus reset
>>>>> +
>>>>> +        reset-names:
>>>>> +          items:
>>>>> +            - const: core
>>>>> +            - const: axi
>>>>> +            - const: ahb
>>>>> +
>>>>> +      required:
>>>>> +        - ports
>>>>> +
>>>>> +    else:
>>>>> +      properties:
>>>>> +        clocks:
>>>>> +          items:
>>>>> +            - description: Bus clock that gates register
>>>>> access
>>>>> +            - description: Pixel clock divider for display
>>>>> timing
>>>> Please don't make compatible-specific description strings for
>>>> individual compatibles, and keep these descriptions outside of
>>>> the if.
>>>> The compatible-specific part should be used to specify what's
>>>> required
>>>> for the specific SoC, for dt validation purpose.
>>>>
>>>> BTW if the clock is both the working clock and bus clock for the
>>>> controller, I suggest listing it twice, except if the IP core is
>>>> provided without a dedicated core clock (in the case I suggest to
>>>> use
>>>> "bus" only).
>>> I agree. If the same clock is provided to two+ ports on the IP,
>>> that
>>> should still be two+ clocks in the devicetree.
>>>
>>>> Here's an example for "listing it twice":
>>>> ```
>>>> clocks = <&clk DCU_GATE>, <&clk DCU_GATE>, <&clk DCUP_DIV>;
>>>> clock-names = "core", "bus", "pix0";
>>>> ```
>>>>
>>>> Well nonetheless the name "core" does not match the description
>>>> "Bus
>>>> clock that gates register access".
>>>>
>>>> Thanks,
>>>> Icenowy
>> Understood. I will remove all description strings from the if/else
>> branches; the if/then clauses will only constrain clock-names and
>> reset-names items (name values only, no descriptions). Regarding
>> clock
> Well, I think a required properties list is also needed in the if/then
> clause, to prevent DTs from lacking properties.
Since `ports` is kept in the global `required:` list, neither if/then 
block needs a `required:` entry for port topology. Each if/then only 
constrains clock-names and reset-names for DT validation. The `else` 
branch has been eliminated; each variant has its own independent 
`if/then` in the `allOf` array.
>> naming: DCU_GATE on MA35D1 is a peripheral gate clock without a
>> separate
>> dedicated core working clock, so I will keep "core" as the name and
> Do you mean there's no separate dedicated bus clock? I find that in the
> clock driver dcu_gate has no bus clock as its parent -- its parent is
> dcu_mux, and dcu_mux's 2 parents are both pll ("epll_div2" and
> "syspll").
>
> Thanks,
> Icenowy
You are right — DCU_GATE has no parent as a bus clock. For this case, I 
prefer to keep "core" as the sole gate clock name alongside "pix0".

Thanks.

Here is what the v3 yaml would look like:

```yaml
compatible:
   items:
     - enum: [nuvoton,ma35d1-dcu, thead,th1520-dc8200]
     - const: verisilicon,dc

properties:
   clocks: minItems: 2, items with descriptions
   resets: minItems: 1, items with descriptions

required:
   [compatible, reg, interrupts, clocks, clock-names, ports]

allOf:
   - if: compatible contains thead,th1520-dc8200
     then:
       clock-names: [core, axi, ahb, pix0, pix1]
       reset-names: [core, axi, ahb]
   - if: compatible contains nuvoton,ma35d1-dcu
     then:
       clock-names: [core, pix0]
       reset-names: [core]
```
>> drop
>> the misleading description "Bus clock that gates register access".
>> The
>> description mismatch was entirely in the if/else strings which are
>> now
>> removed.
>>
>> Thanks.
>>
>>>>> +
>>>>> +        clock-names:
>>>>> +          items:
>>>>> +            - const: core
>>>>> +            - const: pix0
>>>>> +
>>>>> +        resets:
>>>>> +          maxItems: 1
>>>>> +          description:
>>>>> +            Reset line for the display controller.
>>>>> +
>>>>> +        reset-names:
>>>>> +          items:
>>>>> +            - const: core
>>>>> +
>>>>> +      required:
>>>>> +        - port
>>>>> +
>>>>> +      not:
>>>>> +        required:
>>>>> +          - ports
>>>>> +
>>>>> +unevaluatedProperties: false
>>>>>    
>>>>>    examples:
>>>>>      - |
>>>>> @@ -120,3 +180,24 @@ examples:
>>>>>            };
>>>>>          };
>>>>>        };
>>>>> +
>>>>> +  - |
>>>>> +    #include <dt-bindings/interrupt-controller/arm-gic.h>
>>>>> +    #include <dt-bindings/clock/nuvoton,ma35d1-clk.h>
>>>>> +    #include <dt-bindings/reset/nuvoton,ma35d1-reset.h>
>>>>> +
>>>>> +    display@40260000 {
>>>>> +        compatible = "verisilicon,dc";
>>>>> +        reg = <0x40260000 0x20000>;
>>>>> +        interrupts = <GIC_SPI 20 IRQ_TYPE_LEVEL_HIGH>;
>>>>> +        clocks = <&clk DCU_GATE>, <&clk DCUP_DIV>;
>>>>> +        clock-names = "core", "pix0";
>>>>> +        resets = <&sys MA35D1_RESET_DISP>;
>>>>> +        reset-names = "core";
>>>>> +
>>>>> +        port {
>>>>> +            dpi_out: endpoint {
>>>>> +                remote-endpoint = <&panel_in>;
>>>>> +            };
>>>>> +        };
>>>>> +    };


^ permalink raw reply

* RE: [PATCH V3 0/8] PCI: imx6: Integrate pwrctrl API and update device trees
From: Sherry Sun @ 2026-05-21  4:40 UTC (permalink / raw)
  To: Hongxing Zhu (OSS), Sherry Sun (OSS), robh@kernel.org,
	krzk+dt@kernel.org, conor+dt@kernel.org, Frank Li,
	s.hauer@pengutronix.de, kernel@pengutronix.de, festevam@gmail.com,
	lpieralisi@kernel.org, kwilczynski@kernel.org, mani@kernel.org,
	bhelgaas@google.com, l.stach@pengutronix.de
  Cc: imx@lists.linux.dev, linux-pci@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <GV2PR04MB120194C8BDA9B49C79DCE72168C0E2@GV2PR04MB12019.eurprd04.prod.outlook.com>


> > -----Original Message-----
> > From: Sherry Sun (OSS) <sherry.sun@oss.nxp.com>
> > Sent: Wednesday, May 20, 2026 4:49 PM
> > To: robh@kernel.org; krzk+dt@kernel.org; conor+dt@kernel.org; Frank Li
> > <frank.li@nxp.com>; s.hauer@pengutronix.de; kernel@pengutronix.de;
> > festevam@gmail.com; lpieralisi@kernel.org; kwilczynski@kernel.org;
> > mani@kernel.org; bhelgaas@google.com; Hongxing Zhu
> > <hongxing.zhu@nxp.com>; l.stach@pengutronix.de
> > Cc: imx@lists.linux.dev; linux-pci@vger.kernel.org; linux-arm-
> > kernel@lists.infradead.org; devicetree@vger.kernel.org; linux-
> > kernel@vger.kernel.org; Sherry Sun <sherry.sun@nxp.com>
> > Subject: [PATCH V3 0/8] PCI: imx6: Integrate pwrctrl API and update
> > device trees
> >
> > From: Sherry Sun <sherry.sun@nxp.com>
> >
> > This series integrates the PCI pwrctrl framework into the pci-imx6
> > driver and updates i.MX EVK board device trees to support it.
> >
> > Patches 2-8 update device trees for i.MX EVK boards which are maintained
> > by NXP to move power supply properties from the PCIe controller node
> > to the Root Port child node, which is required for the pwrctrl framework.
> > Affected boards:
> > - i.MX6Q/DL SABRESD
> > - i.MX6SX SDB
> > - i.MX8MM EVK
> > - i.MX8MP EVK
> > - i.MX8MQ EVK
> > - i.MX8DXL/QM/QXP EVK
> > - i.MX95 15x15/19x19 EVK
> >
> > The driver maintains legacy regulator handling for device trees that
> > haven't been updated yet. Both old and new device tree structures are
> > supported.
> >
> > Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
> Hi Sherry:
> Since the vpcie3v3aux is used to power up the WAKE#, it is always on in this
> pwrctrl framework whether the system is in suspend or not, right?
> 

Hi Richard,
Currently the new pwrctrl framework doesn't support vpcie3v3aux; it handles all
regulators with of_regulator_bulk_get_all() and regulator_bulk_enable/disable().
For now, vpcie3v3aux only works with the pci-imx6 driver.

Best Regards
Sherry

> Best Regards
> Richard Zhu
> > ---
> > Changes in V3:
> > 1. Rebased on top of latest 7.1.0-rc4
> >
> > Changes in V2:
> > 1. After commit 2d8c5098b847 ("PCI/pwrctrl: Do not power off on pwrctrl
> >    device removal"), the pwrctrl drivers no longer power off devices
> >    during removal. Update pci-imx6 driver's shutdown callback in patch#1
> >    to explicitly call pci_pwrctrl_power_off_devices() before
> >    pci_pwrctrl_destroy_devices() to ensure devices are properly powered
> >    off.
> > ---
> >
> > Sherry Sun (8):
> >   PCI: imx6: Integrate new pwrctrl API for pci-imx6
> >   arm: dts: imx6qdl-sabresd: Move power supply property to Root Port
> >     node
> >   arm: dts: imx6sx-sdb: Move power supply property to Root Port node
> >   arm64: dts: imx8mm-evk: Move power supply property to Root Port node
> >   arm64: dts: imx8mp-evk: Move power supply properties to Root Port node
> >   arm64: dts: imx8mq-evk: Move power supply properties to Root Port node
> >   arm64: dts: imx8dxl/qm/qxp: Move power supply properties to Root Port
> >     node
> >   arm64: dts: imx95: Move power supply properties to Root Port node
> >
> >  .../arm/boot/dts/nxp/imx/imx6qdl-sabresd.dtsi |  2 +-
> >  arch/arm/boot/dts/nxp/imx/imx6sx-sdb.dtsi     |  2 +-
> >  arch/arm64/boot/dts/freescale/imx8dxl-evk.dts |  4 ++--
> >  arch/arm64/boot/dts/freescale/imx8mm-evk.dtsi |  2 +-
> >  arch/arm64/boot/dts/freescale/imx8mp-evk.dts  |  4 ++--
> >  arch/arm64/boot/dts/freescale/imx8mq-evk.dts  |  4 ++--
> >  arch/arm64/boot/dts/freescale/imx8qm-mek.dts  |  4 ++--
> >  arch/arm64/boot/dts/freescale/imx8qxp-mek.dts |  4 ++--
> >  .../boot/dts/freescale/imx95-15x15-evk.dts    |  4 ++--
> >  .../boot/dts/freescale/imx95-19x19-evk.dts    |  8 +++----
> >  drivers/pci/controller/dwc/Kconfig            |  1 +
> >  drivers/pci/controller/dwc/pci-imx6.c         | 24 ++++++++++++++++++-
> >  12 files changed, 43 insertions(+), 20 deletions(-)
> >
> > --
> > 2.37.1



^ permalink raw reply

* Re: [PATCH v14 10/44] arm64: RMI: Add support for SRO
From: Gavin Shan @ 2026-05-21  4:38 UTC (permalink / raw)
  To: Steven Price, kvm, kvmarm
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse,
	Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco, Ganapatrao Kulkarni, Shanker Donthineni,
	Alper Gun, Aneesh Kumar K . V, Emi Kisanuki, Vishal Annapurve,
	WeiLin.Chang, Lorenzo.Pieralisi2
In-Reply-To: <20260513131757.116630-11-steven.price@arm.com>

Hi Steven,

On 5/13/26 11:17 PM, Steven Price wrote:
> RMM v2.0 introduces the concept of "Stateful RMI Operations" (SRO). This
> means that an SMC can return with an operation still in progress. The
> host is expected to continue the operation until it reaches a conclusion
> (either success or failure). During this process the RMM can request
> additional memory ('donate') or hand memory back to the host
> ('reclaim'). The host can request that an in-progress operation be
> cancelled, but must still continue the operation until it has completed
> (otherwise the incomplete operation may cause future RMM operations to
> fail).
> 
> The SRO is tracked using a struct rmi_sro_state object which keeps track
> of any memory which has been allocated but not yet consumed by the RMM
> or reclaimed from the RMM. This allows the memory to be reused in a
> future request within the same operation. It will also permit an
> operation to be done in a context where memory allocation may be
> difficult (e.g. atomic context) with the option to abort the operation
> and retry the memory allocation outside of the atomic context. The
> memory stored in the struct rmi_sro_state object can then be reused on
> the subsequent attempt.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
> v14:
>   * SRO support has improved although is still not fully complete. The
>     infrastructure has been moved out of KVM.
> ---
>   arch/arm64/include/asm/rmi_cmds.h |   1 +
>   arch/arm64/kernel/rmi.c           | 359 ++++++++++++++++++++++++++++++
>   2 files changed, 360 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/rmi_cmds.h b/arch/arm64/include/asm/rmi_cmds.h
> index eb213c8e6f26..1a7b0c8f1e38 100644
> --- a/arch/arm64/include/asm/rmi_cmds.h
> +++ b/arch/arm64/include/asm/rmi_cmds.h
> @@ -35,6 +35,7 @@ struct rmi_sro_state {
>   
>   int rmi_delegate_range(phys_addr_t phys, unsigned long size);
>   int rmi_undelegate_range(phys_addr_t phys, unsigned long size);
> +int free_delegated_page(phys_addr_t phys);
>   
>   static inline int rmi_delegate_page(phys_addr_t phys)
>   {
> diff --git a/arch/arm64/kernel/rmi.c b/arch/arm64/kernel/rmi.c
> index 08cef54acadb..a8107ca9bb6d 100644
> --- a/arch/arm64/kernel/rmi.c
> +++ b/arch/arm64/kernel/rmi.c
> @@ -48,6 +48,365 @@ int rmi_undelegate_range(phys_addr_t phys, unsigned long size)
>   	return ret;
>   }
>   
> +static unsigned long donate_req_to_size(unsigned long donatereq)
> +{
> +	unsigned long unit_size = RMI_DONATE_SIZE(donatereq);
> +
> +	switch (unit_size) {
> +	case 0:
> +		return PAGE_SIZE;
> +	case 1:
> +		return PMD_SIZE;
> +	case 2:
> +		return PUD_SIZE;
> +	case 3:
> +		return P4D_SIZE;
> +	}
> +	unreachable();
> +}
> +

It's worth marking this 'inline'. {P4D, PUD, PMD}_SIZE can be equal when the
P4D and PUD levels are folded, depending on CONFIG_PGTABLE_LEVELS. In that
case, can 'unit_size' be translated to the wrong value?

> +static void rmi_smccc_invoke(struct arm_smccc_1_2_regs *regs_in,
> +			     struct arm_smccc_1_2_regs *regs_out)
> +{
> +	struct arm_smccc_1_2_regs regs = *regs_in;
> +	unsigned long status;
> +
> +	do {
> +		arm_smccc_1_2_invoke(&regs, regs_out);
> +		status = RMI_RETURN_STATUS(regs_out->a0);
> +	} while (status == RMI_BUSY || status == RMI_BLOCKED);
> +}
> +
> +int free_delegated_page(phys_addr_t phys)
> +{
> +	if (WARN_ON(rmi_undelegate_page(phys))) {
> +		/* Undelegate failed: leak the page */
> +		return -EBUSY;
> +	}
> +
> +	free_page((unsigned long)phys_to_virt(phys));
> +
> +	return 0;
> +}
> +
> +static int rmi_sro_ensure_capacity(struct rmi_sro_state *sro,
> +				   unsigned long count)
> +{
> +	if (WARN_ON_ONCE(sro->addr_count > RMI_MAX_ADDR_LIST))
> +		return -EOVERFLOW;
> +
> +	if (count > RMI_MAX_ADDR_LIST - sro->addr_count)
> +		return -ENOSPC;
> +
> +	return 0;
> +}
> +
> +static int rmi_sro_donate_contig(struct rmi_sro_state *sro,
> +				 unsigned long sro_handle,
> +				 unsigned long donatereq,
> +				 struct arm_smccc_1_2_regs *out_regs,
> +				 gfp_t gfp)
> +{
> +	unsigned long unit_size = RMI_DONATE_SIZE(donatereq);
> +	unsigned long unit_size_bytes = donate_req_to_size(donatereq);
> +	unsigned long count = RMI_DONATE_COUNT(donatereq);
> +	unsigned long state = RMI_DONATE_STATE(donatereq);
> +	unsigned long size = unit_size_bytes * count;
> +	unsigned long addr_range;
> +	int ret;
> +	void *virt;
> +	phys_addr_t phys;
> +	struct arm_smccc_1_2_regs regs = {
> +		SMC_RMI_OP_MEM_DONATE,
> +		sro_handle
> +	};
> +
> +	for (int i = 0; i < sro->addr_count; i++) {
> +		unsigned long entry = sro->addr_list[i];
> +
> +		if (RMI_ADDR_RANGE_SIZE(entry) == unit_size &&
> +		    RMI_ADDR_RANGE_COUNT(entry) == count &&
> +		    RMI_ADDR_RANGE_STATE(entry) == state) {
> +			sro->addr_count--;
> +			swap(sro->addr_list[sro->addr_count],
> +			     sro->addr_list[i]);
> +
> +			goto out;
> +		}
> +	}
> +
> +	ret = rmi_sro_ensure_capacity(sro, 1);
> +	if (ret)
> +		return ret;
> +
> +	virt = alloc_pages_exact(size, gfp);
> +	if (!virt)
> +		return -ENOMEM;
> +	phys = virt_to_phys(virt);
> +

alloc_pages_exact() will fail if the requested size exceeds the maximum
allocation size (PAGE_SIZE << MAX_PAGE_ORDER). That maximum is usually smaller
than PUD_SIZE, but the RMM is allowed to request PUD_SIZE.
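
To put rough numbers on that, here is a small userspace sketch of the
arithmetic. It assumes 4K pages and a MAX_PAGE_ORDER of 10, which are
assumptions about the configuration rather than anything this patch fixes.
The SKETCH_ macros and sketch_ helpers are illustrative names, not kernel
symbols:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed configuration: 4K pages, MAX_PAGE_ORDER of 10. */
#define SKETCH_PAGE_SHIFT      12
#define SKETCH_MAX_PAGE_ORDER  10
#define SKETCH_PMD_SHIFT       21   /* 2 MiB block with 4K pages */
#define SKETCH_PUD_SHIFT       30   /* 1 GiB block with 4K pages */

/* Largest physically contiguous size a single page allocation can cover. */
static uint64_t sketch_max_contig_bytes(void)
{
	return (uint64_t)1 << (SKETCH_PAGE_SHIFT + SKETCH_MAX_PAGE_ORDER);
}

/* Would a contiguous request of unit_bytes * count fit in one allocation? */
static bool sketch_donate_fits(uint64_t unit_bytes, uint64_t count)
{
	uint64_t max = sketch_max_contig_bytes();

	if (count == 0 || unit_bytes == 0 || unit_bytes > max)
		return false;

	/* Compare by division so unit_bytes * count cannot overflow. */
	return count <= max / unit_bytes;
}
```

Under these assumptions the limit is 4 MiB, so two PMD_SIZE units still fit
but a single PUD_SIZE unit (1 GiB) does not. The contiguous path would need
either a different allocator or an early error for such requests.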

> +	if (state == RMI_OP_MEM_DELEGATED) {
> +		if (rmi_delegate_range(phys, size)) {
> +			free_pages_exact(virt, size);
> +			return -ENXIO;
> +		}
> +	}
> +
> +	addr_range = phys & RMI_ADDR_RANGE_ADDR_MASK;
> +	FIELD_MODIFY(RMI_ADDR_RANGE_SIZE_MASK, &addr_range, unit_size);
> +	FIELD_MODIFY(RMI_ADDR_RANGE_COUNT_MASK, &addr_range, count);
> +	FIELD_MODIFY(RMI_ADDR_RANGE_STATE_MASK, &addr_range, state);
> +
> +	sro->addr_list[sro->addr_count] = addr_range;
> +
> +out:
> +	regs.a2 = virt_to_phys(&sro->addr_list[sro->addr_count]);
> +	regs.a3 = 1;
> +	rmi_smccc_invoke(&regs, out_regs);
> +
> +	unsigned long donated_granules = out_regs->a1;
> +	unsigned long donated_size = donated_granules << PAGE_SHIFT;
> +
> +	if (donated_granules == 0) {
> +		/* No pages used by the RMM */
> +		sro->addr_count++;
> +	} else if (donated_size < size) {
> +		phys = sro->addr_list[sro->addr_count] & RMI_ADDR_RANGE_ADDR_MASK;
> +
> +		/* Not all granules used by the RMM, free the remaining pages */
> +		for (long i = donated_size; i < size; i += PAGE_SIZE) {
> +			if (state == RMI_OP_MEM_DELEGATED)
> +				free_delegated_page(phys + i);
> +			else
> +				__free_page(phys_to_page(phys + i));
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int rmi_sro_donate_noncontig(struct rmi_sro_state *sro,
> +				    unsigned long sro_handle,
> +				    unsigned long donatereq,
> +				    struct arm_smccc_1_2_regs *out_regs,
> +				    gfp_t gfp)
> +{
> +	unsigned long unit_size = RMI_DONATE_SIZE(donatereq);
> +	unsigned long unit_size_bytes = donate_req_to_size(donatereq);
> +	unsigned long count = RMI_DONATE_COUNT(donatereq);
> +	unsigned long state = RMI_DONATE_STATE(donatereq);
> +	unsigned long found = 0;
> +	unsigned long addr_list_start = sro->addr_count;
> +	int ret;
> +	struct arm_smccc_1_2_regs regs = {
> +		SMC_RMI_OP_MEM_DONATE,
> +		sro_handle
> +	};
> +
> +	for (int i = 0; i < addr_list_start && found < count; i++) {
> +		unsigned long entry = sro->addr_list[i];
> +
> +		if (RMI_ADDR_RANGE_SIZE(entry) == unit_size &&
> +		    RMI_ADDR_RANGE_COUNT(entry) == 1 &&
> +		    RMI_ADDR_RANGE_STATE(entry) == state) {
> +			addr_list_start--;
> +			swap(sro->addr_list[addr_list_start],
> +			     sro->addr_list[i]);
> +			found++;
> +			i--;
> +		}
> +	}
> +
> +	ret = rmi_sro_ensure_capacity(sro, count - found);
> +	if (ret)
> +		return ret;
> +
> +	while (found < count) {
> +		unsigned long addr_range;
> +		void *virt = alloc_pages_exact(unit_size_bytes, gfp);
> +		phys_addr_t phys;
> +
> +		if (!virt)
> +			return -ENOMEM;
> +
> +		phys = virt_to_phys(virt);
> +
> +		if (state == RMI_OP_MEM_DELEGATED) {
> +			if (rmi_delegate_range(phys, unit_size_bytes)) {
> +				free_pages_exact(virt, unit_size_bytes);
> +				return -ENXIO;
> +			}
> +		}
> +
> +		addr_range = phys & RMI_ADDR_RANGE_ADDR_MASK;
> +		FIELD_MODIFY(RMI_ADDR_RANGE_SIZE_MASK, &addr_range, unit_size);
> +		FIELD_MODIFY(RMI_ADDR_RANGE_COUNT_MASK, &addr_range, 1);
> +		FIELD_MODIFY(RMI_ADDR_RANGE_STATE_MASK, &addr_range, state);
> +
> +		sro->addr_list[sro->addr_count++] = addr_range;
> +		found++;
> +	}
> +
> +	regs.a2 = virt_to_phys(&sro->addr_list[addr_list_start]);
> +	regs.a3 = found;
> +	rmi_smccc_invoke(&regs, out_regs);
> +
> +	unsigned long donated_granules = out_regs->a1;
> +
> +	if (WARN_ON(donated_granules & ((unit_size_bytes >> PAGE_SHIFT) - 1))) {
> +		/*
> +		 * FIXME: RMM has only consumed part of a huge page, this leaks
> +		 * the rest of the huge page
> +		 */
> +		donated_granules = ALIGN(donated_granules,
> +					 (unit_size_bytes >> PAGE_SHIFT));
> +	}
> +	unsigned long donated_blocks = donated_granules / (unit_size_bytes >> PAGE_SHIFT);
> +
> +	if (WARN_ON(donated_blocks > found))
> +		donated_blocks = found;
> +
> +	unsigned long undonated_blocks = found - donated_blocks;
> +
> +	while (donated_blocks && undonated_blocks) {
> +		sro->addr_count--;
> +		swap(sro->addr_list[addr_list_start],
> +		     sro->addr_list[sro->addr_count]);
> +		addr_list_start++;
> +
> +		donated_blocks--;
> +		undonated_blocks--;
> +	}
> +	sro->addr_count -= donated_blocks;
> +
> +	return 0;
> +}
> +
> +static int rmi_sro_donate(struct rmi_sro_state *sro,
> +			  unsigned long sro_handle,
> +			  unsigned long donatereq,
> +			  struct arm_smccc_1_2_regs *regs,
> +			  gfp_t gfp)
> +{
> +	unsigned long count = RMI_DONATE_COUNT(donatereq);
> +
> +	if (WARN_ON(!count))
> +		return 0;
> +
> +	if (RMI_DONATE_CONTIG(donatereq)) {
> +		return rmi_sro_donate_contig(sro, sro_handle, donatereq,
> +					     regs, gfp);
> +	} else {
> +		return rmi_sro_donate_noncontig(sro, sro_handle, donatereq,
> +						regs, gfp);
> +	}
> +}
> +
> +static int rmi_sro_reclaim(struct rmi_sro_state *sro,
> +			   unsigned long sro_handle,
> +			   struct arm_smccc_1_2_regs *out_regs)
> +{
> +	unsigned long capacity;
> +	struct arm_smccc_1_2_regs regs;
> +	int ret;
> +
> +	ret = rmi_sro_ensure_capacity(sro, 1);
> +	if (ret)
> +		rmi_sro_free(sro);
> +
> +	capacity = RMI_MAX_ADDR_LIST - sro->addr_count;
> +
> +	regs = (struct arm_smccc_1_2_regs){
> +		SMC_RMI_OP_MEM_RECLAIM,
> +		sro_handle,
> +		virt_to_phys(&sro->addr_list[sro->addr_count]),
> +		capacity
> +	};
> +	rmi_smccc_invoke(&regs, out_regs);
> +
> +	if (WARN_ON_ONCE(out_regs->a1 > capacity))
> +		out_regs->a1 = capacity;
> +
> +	sro->addr_count += out_regs->a1;
> +
> +	return 0;
> +}
> +
> +void rmi_sro_free(struct rmi_sro_state *sro)
> +{
> +	for (int i = 0; i < sro->addr_count; i++) {
> +		unsigned long entry = sro->addr_list[i];
> +		unsigned long addr = RMI_ADDR_RANGE_ADDR(entry);
> +		unsigned long unit_size = RMI_ADDR_RANGE_SIZE(entry);
> +		unsigned long count = RMI_ADDR_RANGE_COUNT(entry);
> +		unsigned long state = RMI_ADDR_RANGE_STATE(entry);
> +		unsigned long size = donate_req_to_size(unit_size) * count;
> +
> +		if (state == RMI_OP_MEM_DELEGATED) {
> +			if (WARN_ON(rmi_undelegate_range(addr, size))) {
> +				/* Leak the pages */
> +				continue;
> +			}
> +		}
> +		free_pages_exact(phys_to_virt(addr), size);
> +	}
> +
> +	sro->addr_count = 0;
> +}
> +
> +unsigned long rmi_sro_execute(struct rmi_sro_state *sro, gfp_t gfp)
> +{
> +	unsigned long sro_handle;
> +	struct arm_smccc_1_2_regs regs;
> +	struct arm_smccc_1_2_regs *regs_in = &sro->regs;
> +
> +	rmi_smccc_invoke(regs_in, &regs);
> +
> +	sro_handle = regs.a1;
> +
> +	while (RMI_RETURN_STATUS(regs.a0) == RMI_INCOMPLETE) {
> +		bool can_cancel = RMI_RETURN_CAN_CANCEL(regs.a0);
> +		int ret;
> +
> +		switch (RMI_RETURN_MEMREQ(regs.a0)) {
> +		case RMI_OP_MEM_REQ_NONE:
> +			regs = (struct arm_smccc_1_2_regs){
> +				SMC_RMI_OP_CONTINUE, sro_handle, 0
> +			};
> +			rmi_smccc_invoke(&regs, &regs);
> +			break;

'ret' isn't initialized for case RMI_OP_MEM_REQ_NONE.
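
A stripped-down userspace model of the loop may make this easier to see. The
simplest fix is probably to give 'ret' an initial value on each iteration.
The enum values and sketch_ function names below are made up for
illustration; only the control flow is meant to mirror the patch:

```c
#include <errno.h>

enum sketch_memreq {
	SKETCH_REQ_NONE,
	SKETCH_REQ_DONATE,
	SKETCH_REQ_RECLAIM,
};

/* Stand-in for the donate/reclaim helpers: returns 0 or a negative errno. */
static int sketch_handle_mem(enum sketch_memreq req, int fail)
{
	(void)req;
	return fail ? -ENOMEM : 0;
}

/*
 * Process 'n' requests and return 0 on success or the first error.
 * 'ret' starts at 0 so that the NONE case, which has no error to report,
 * does not leave an indeterminate value for the check after the switch.
 */
static int sketch_execute(const enum sketch_memreq *reqs, int n, int fail_at)
{
	for (int i = 0; i < n; i++) {
		int ret = 0;

		switch (reqs[i]) {
		case SKETCH_REQ_NONE:
			/* Nothing to allocate, just continue the operation. */
			break;
		case SKETCH_REQ_DONATE:
		case SKETCH_REQ_RECLAIM:
			ret = sketch_handle_mem(reqs[i], i == fail_at);
			break;
		default:
			ret = -EINVAL;
			break;
		}

		if (ret)
			return ret;
	}

	return 0;
}
```

Setting 'ret = 0' explicitly inside the RMI_OP_MEM_REQ_NONE case would work
equally well; initializing at the declaration just keeps every path covered.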

> +		case RMI_OP_MEM_REQ_DONATE:
> +			ret = rmi_sro_donate(sro, sro_handle, regs.a2, &regs,
> +					     gfp);
> +			break;
> +		case RMI_OP_MEM_REQ_RECLAIM:
> +			ret = rmi_sro_reclaim(sro, sro_handle, &regs);
> +			break;
> +		default:
> +			ret = WARN_ON(1);
> +			break;
> +		}
> +
> +		if (ret) {
> +			if (can_cancel) {
> +				/*
> +				 * FIXME: Handle cancelling properly!
> +				 *
> +				 * If the operation has failed due to memory
> +				 * allocation failure then the information on
> +				 * the memory allocation should be saved, so
> +				 * that the allocation can be repeated outside
> +				 * of any context which prevented the
> +				 * allocation.
> +				 */
> +			}
> +			if (WARN_ON(ret))
> +				return ret;
> +		}
> +	}
> +
> +	return regs.a0;
> +}
> +
>   static int rmi_check_version(void)
>   {
>   	struct arm_smccc_res res;

Thanks,
Gavin



^ permalink raw reply

* [PATCH] [RFC] arm64: mmu: use range based TLB flushing when hot unplugging memory
From: Alistair Popple @ 2026-05-21  4:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, linux-mm, catalin.marinas, will, david,
	anshuman.khandual, ryan.roberts, dev.jain, balbirs, jhubbard,
	Alistair Popple

Hot unplugging memory on ARM64 requires a TLB invalidate after unmapping
the page to be hot unplugged from the direct map. Currently that happens
one page at a time, meaning range based invalidates cannot be used. The
result of this is that removing large amounts of memory takes a long
time and in some cases can trigger an RCU stall warning.

For example on one system hot unplugging 480GB of memory takes ~1
minute. With this change the same operation took ~1 second, a 60x
improvement.

Signed-off-by: Alistair Popple <apopple@nvidia.com>

---

This is an RFC, because I'm not sure the change is correct as it frees
the PTE page before flushing the TLB. I'm not familiar enough with ARM64
architecture to be sure this is safe, for example I don't know if HW
can update PTE bits such as access/dirty in the page through a stale
TLB entry.

If so this would open a window during which the page is free but could
still be written to. Likely the safe option would be to collect all the
pages to be freed on a list and free them after doing the range based TLB
flush, but I wanted to get feedback on the approach before implementing
it, which is the goal of this RFC.
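
For discussion, the "collect, flush once, then free" ordering can be modelled
in userspace roughly as follows. This is only a sketch of the ordering, not
kernel code: the table of pointers stands in for the PTEs, free() stands in
for free_hotplug_page_range(), and all sketch_ names are hypothetical:

```c
#include <stddef.h>
#include <stdlib.h>

/* Counts non-empty range flushes so the ordering can be observed. */
static unsigned long sketch_flush_calls;

/* Stand-in for a range based TLB flush; an empty range does nothing. */
static void sketch_flush_range(unsigned long start, unsigned long end)
{
	if (start < end)
		sketch_flush_calls++;
}

/*
 * Unmap [addr, end): clear each entry and remember its page in 'scratch',
 * issue a single range flush, and only then free the pages. The start
 * address is saved up front, since the loop variable has reached 'end' by
 * the time the loop exits. 'scratch' must have room for one pointer per
 * page in the range. Returns the number of pages freed.
 */
static size_t sketch_unmap_range(void **table, void **scratch,
				 unsigned long addr, unsigned long end,
				 unsigned long page_size)
{
	unsigned long start = addr;
	size_t n = 0;

	for (size_t i = 0; addr < end; addr += page_size, i++) {
		if (table[i]) {
			scratch[n++] = table[i];	/* defer the free */
			table[i] = NULL;		/* models the PTE clear */
		}
	}

	/* One flush for the whole range, before any page is released. */
	sketch_flush_range(start, end);

	for (size_t i = 0; i < n; i++)
		free(scratch[i]);

	return n;
}
```

With this ordering no page is released while a stale translation could
still exist, at the cost of temporary storage for the deferred pages.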
---
 arch/arm64/mm/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 0c24fe650e95..75c773232c14 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1459,11 +1459,12 @@ static void unmap_hotplug_pte_range(pmd_t *pmdp, unsigned long addr,

 		WARN_ON(!pte_present(pte));
 		__pte_clear(&init_mm, addr, ptep);
-		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
 		if (free_mapped)
 			free_hotplug_page_range(pte_page(pte),
 						PAGE_SIZE, altmap);
 	} while (addr += PAGE_SIZE, addr < end);
+
+	flush_tlb_kernel_range(addr, end);
 }

 static void unmap_hotplug_pmd_range(pud_t *pudp, unsigned long addr,
-- 
2.54.0

^ permalink raw reply related

* [PATCH] clocksource/drivers/owl: fix refcount leak
From: Alexander A. Klimov @ 2026-05-21  4:19 UTC (permalink / raw)
  To: Daniel Lezcano, Thomas Gleixner, Andreas Färber,
	Manivannan Sadhasivam, open list:CLOCKSOURCE, CLOCKEVENT DRIVERS,
	moderated list:ARM/ACTIONS SEMI ARCHITECTURE,
	moderated list:ARM/ACTIONS SEMI ARCHITECTURE
  Cc: Alexander A. Klimov

Every value returned from of_clk_get() is supposed to be cleaned up
via clk_put() once not needed anymore.

Fixes: 4be78a86c506 ("clocksource: Add Owl timer")
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
---
 drivers/clocksource/timer-owl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/clocksource/timer-owl.c b/drivers/clocksource/timer-owl.c
index ac97420bfa7c..fa347f430563 100644
--- a/drivers/clocksource/timer-owl.c
+++ b/drivers/clocksource/timer-owl.c
@@ -142,6 +142,7 @@ static int __init owl_timer_init(struct device_node *node)
 	}
 
 	rate = clk_get_rate(clk);
+	clk_put(clk);
 
 	owl_timer_reset(owl_clksrc_base);
 	owl_timer_set_enabled(owl_clksrc_base, true);
-- 
2.54.0



^ permalink raw reply related

* Re: [PATCH 8/8] sched_ext: Convert ops.set_cmask() to arena-resident cmask
From: Emil Tsalapatis @ 2026-05-21  4:19 UTC (permalink / raw)
  To: Tejun Heo, David Vernet, Andrea Righi, Changwoo Min,
	Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kumar Kartikeya Dwivedi
  Cc: Peter Zijlstra, Catalin Marinas, Will Deacon, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Andrew Morton,
	David Hildenbrand, Mike Rapoport, Emil Tsalapatis, sched-ext, bpf,
	x86, linux-arm-kernel, linux-mm, linux-kernel
In-Reply-To: <20260520235052.4180316-9-tj@kernel.org>

On Wed May 20, 2026 at 7:50 PM EDT, Tejun Heo wrote:
> ops_cid.set_cmask() expects a cmask. The kernel couldn't write into the
> arena, so it translated cpumask -> cmask in kernel memory and passed the
> result as a trusted pointer. The BPF cmask helpers all operate on arena
> cmasks though, so the BPF side had to word-by-word probe-read the kernel
> cmask into an arena cmask via cmask_copy_from_kernel() before any helper
> could touch it. It works, but is clumsy.
>
> With direct kernel-side arena access now in place, build the cmask in the
> arena. The kernel writes to it through the kern_va side of the dual mapping;
> BPF directly dereferences it via an __arena pointer like any other arena
> struct.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

> ---
>  kernel/sched/ext.c                    | 68 +++++++++++++++++++++++++--
>  kernel/sched/ext_cid.c                | 20 +-------
>  kernel/sched/ext_internal.h           | 10 +++-
>  tools/sched_ext/include/scx/cid.bpf.h | 52 --------------------
>  tools/sched_ext/scx_qmap.bpf.c        |  5 +-
>  5 files changed, 75 insertions(+), 80 deletions(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index fb91079c1244..94562e3350c6 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -621,11 +621,16 @@ static inline void scx_call_op_set_cpumask(struct scx_sched *sch, struct rq *rq,
>  		update_locked_rq(rq);
>  
>  	if (scx_is_cid_type()) {
> -		struct scx_cmask *cmask = this_cpu_ptr(scx_set_cmask_scratch);
> -
> -		lockdep_assert_irqs_disabled();
> -		scx_cpumask_to_cmask(cpumask, cmask);
> -		sch->ops_cid.set_cmask(task, cmask);
> +		struct scx_cmask *kern_va = *this_cpu_ptr(sch->set_cmask_scratch);
> +		unsigned long uaddr = (unsigned long)kern_va -
> +			bpf_arena_map_kern_vm_start(sch->arena_map);
> +		/*
> +		 * Build the per-CPU arena cmask and hand BPF the uaddr. Caller
> +		 * holds the rq lock with IRQs disabled, which makes us the sole
> +		 * user of the scratch area.
> +		 */
> +		scx_cpumask_to_cmask(cpumask, kern_va);
> +		sch->ops_cid.set_cmask(task, (struct scx_cmask *)uaddr);
>  	} else {
>  		sch->ops.set_cpumask(task, cpumask);
>  	}
> @@ -4949,6 +4954,48 @@ static const struct attribute_group scx_global_attr_group = {
>  static void free_pnode(struct scx_sched_pnode *pnode);
>  static void free_exit_info(struct scx_exit_info *ei);
>  
> +static s32 scx_set_cmask_scratch_alloc(struct scx_sched *sch)
> +{
> +	size_t size = struct_size_t(struct scx_cmask, bits,
> +				    SCX_CMASK_NR_WORDS(num_possible_cpus()));
> +	int cpu;
> +
> +	if (!sch->is_cid_type || !sch->arena_pool)
> +		return 0;
> +
> +	sch->set_cmask_scratch = alloc_percpu(struct scx_cmask *);
> +	if (!sch->set_cmask_scratch)
> +		return -ENOMEM;
> +
> +	for_each_possible_cpu(cpu) {
> +		struct scx_cmask **slot = per_cpu_ptr(sch->set_cmask_scratch, cpu);
> +
> +		*slot = scx_arena_alloc(sch, size);
> +		if (!*slot)
> +			return -ENOMEM;
> +		scx_cmask_init(*slot, 0, num_possible_cpus());
> +	}
> +	return 0;
> +}
> +
> +static void scx_set_cmask_scratch_free(struct scx_sched *sch)
> +{
> +	size_t size = struct_size_t(struct scx_cmask, bits,
> +				    SCX_CMASK_NR_WORDS(num_possible_cpus()));
> +	int cpu;
> +
> +	if (!sch->set_cmask_scratch)
> +		return;
> +
> +	for_each_possible_cpu(cpu) {
> +		struct scx_cmask **slot = per_cpu_ptr(sch->set_cmask_scratch, cpu);
> +
> +		scx_arena_free(sch, *slot, size);
> +	}
> +	free_percpu(sch->set_cmask_scratch);
> +	sch->set_cmask_scratch = NULL;
> +}
> +
>  static void scx_sched_free_rcu_work(struct work_struct *work)
>  {
>  	struct rcu_work *rcu_work = to_rcu_work(work);
> @@ -5003,6 +5050,7 @@ static void scx_sched_free_rcu_work(struct work_struct *work)
>  
>  	rhashtable_free_and_destroy(&sch->dsq_hash, NULL, NULL);
>  	free_exit_info(sch->exit_info);
> +	scx_set_cmask_scratch_free(sch);
>  	scx_arena_pool_destroy(sch);
>  	if (sch->arena_map)
>  		bpf_map_put(sch->arena_map);
> @@ -7162,6 +7210,12 @@ static void scx_root_enable_workfn(struct kthread_work *work)
>  		goto err_disable;
>  	}
>  
> +	ret = scx_set_cmask_scratch_alloc(sch);
> +	if (ret) {
> +		cpus_read_unlock();
> +		goto err_disable;
> +	}
> +
>  	for (i = SCX_OPI_CPU_HOTPLUG_BEGIN; i < SCX_OPI_CPU_HOTPLUG_END; i++)
>  		if (((void (**)(void))ops)[i])
>  			set_bit(i, sch->has_op);
> @@ -7484,6 +7538,10 @@ static void scx_sub_enable_workfn(struct kthread_work *work)
>  	if (ret)
>  		goto err_disable;
>  
> +	ret = scx_set_cmask_scratch_alloc(sch);
> +	if (ret)
> +		goto err_disable;
> +
>  	if (validate_ops(sch, ops))
>  		goto err_disable;
>  
> diff --git a/kernel/sched/ext_cid.c b/kernel/sched/ext_cid.c
> index 0c91b951fd33..808c6390da5a 100644
> --- a/kernel/sched/ext_cid.c
> +++ b/kernel/sched/ext_cid.c
> @@ -7,14 +7,6 @@
>   */
>  #include <linux/cacheinfo.h>
>  
> -/*
> - * Per-cpu scratch cmask used by scx_call_op_set_cpumask() to synthesize a
> - * cmask from a cpumask. Allocated alongside the cid arrays on first enable
> - * and never freed. Sized to the full cid space. Caller holds rq lock so
> - * this_cpu_ptr is safe.
> - */
> -struct scx_cmask __percpu *scx_set_cmask_scratch;
> -
>  /*
>   * cid tables.
>   *
> @@ -54,8 +46,6 @@ static s32 scx_cid_arrays_alloc(void)
>  	u32 npossible = num_possible_cpus();
>  	s16 *cid_to_cpu, *cpu_to_cid;
>  	struct scx_cid_topo *cid_topo;
> -	struct scx_cmask __percpu *set_cmask_scratch;
> -	s32 cpu;
>  
>  	if (scx_cid_to_cpu_tbl)
>  		return 0;
> @@ -63,25 +53,17 @@ static s32 scx_cid_arrays_alloc(void)
>  	cid_to_cpu = kzalloc_objs(*scx_cid_to_cpu_tbl, npossible, GFP_KERNEL);
>  	cpu_to_cid = kzalloc_objs(*scx_cpu_to_cid_tbl, nr_cpu_ids, GFP_KERNEL);
>  	cid_topo = kmalloc_objs(*scx_cid_topo, npossible, GFP_KERNEL);
> -	set_cmask_scratch = __alloc_percpu(struct_size(set_cmask_scratch, bits,
> -						       SCX_CMASK_NR_WORDS(npossible)),
> -					   sizeof(u64));
>  
> -	if (!cid_to_cpu || !cpu_to_cid || !cid_topo || !set_cmask_scratch) {
> +	if (!cid_to_cpu || !cpu_to_cid || !cid_topo) {
>  		kfree(cid_to_cpu);
>  		kfree(cpu_to_cid);
>  		kfree(cid_topo);
> -		free_percpu(set_cmask_scratch);
>  		return -ENOMEM;
>  	}
>  
>  	WRITE_ONCE(scx_cid_to_cpu_tbl, cid_to_cpu);
>  	WRITE_ONCE(scx_cpu_to_cid_tbl, cpu_to_cid);
>  	WRITE_ONCE(scx_cid_topo, cid_topo);
> -	for_each_possible_cpu(cpu)
> -		scx_cmask_init(per_cpu_ptr(set_cmask_scratch, cpu),
> -			       0, npossible);
> -	WRITE_ONCE(scx_set_cmask_scratch, set_cmask_scratch);
>  	return 0;
>  }
>  
> diff --git a/kernel/sched/ext_internal.h b/kernel/sched/ext_internal.h
> index ff7e882bd67a..9bb65367f510 100644
> --- a/kernel/sched/ext_internal.h
> +++ b/kernel/sched/ext_internal.h
> @@ -1124,6 +1124,14 @@ struct scx_sched {
>  	struct bpf_map		*arena_map;
>  	struct gen_pool		*arena_pool;
>  
> +	/*
> +	 * Per-CPU arena cmask used by scx_call_op_set_cpumask() to hand a cmask
> +	 * to ops_cid.set_cmask(). The kernel writes through the stored kern_va;
> +	 * the BPF-arena uaddr handed to BPF is recovered by subtracting the
> +	 * arena's kern_vm_start.
> +	 */
> +	struct scx_cmask * __percpu *set_cmask_scratch;
> +
>  	DECLARE_BITMAP(has_op, SCX_OPI_END);
>  
>  	/*
> @@ -1480,8 +1488,6 @@ enum scx_ops_state {
>  extern struct scx_sched __rcu *scx_root;
>  DECLARE_PER_CPU(struct rq *, scx_locked_rq_state);
>  
> -extern struct scx_cmask __percpu *scx_set_cmask_scratch;
> -
>  /*
>   * True when the currently loaded scheduler hierarchy is cid-form. All scheds
>   * in a hierarchy share one form, so this single key tells callsites which
> diff --git a/tools/sched_ext/include/scx/cid.bpf.h b/tools/sched_ext/include/scx/cid.bpf.h
> index e281c88fa824..70f2a3829af4 100644
> --- a/tools/sched_ext/include/scx/cid.bpf.h
> +++ b/tools/sched_ext/include/scx/cid.bpf.h
> @@ -675,56 +675,4 @@ static __always_inline void cmask_from_cpumask(struct scx_cmask __arena *m,
>  	}
>  }
>  
> -/**
> - * cmask_copy_from_kernel - probe-read a kernel cmask into an arena cmask
> - * @dst: arena cmask to fill; must have @dst->base == 0 and be sized for @src.
> - * @src: kernel-memory cmask (e.g. ops.set_cmask() arg); @src->base must be 0.
> - *
> - * Word-for-word copy; @src and @dst must share base 0 alignment. Triggers
> - * scx_bpf_error() on probe failure or precondition violation.
> - */
> -static __always_inline void cmask_copy_from_kernel(struct scx_cmask __arena *dst,
> -						   const struct scx_cmask *src)
> -{
> -	u32 base = 0, nr_cids = 0, nr_words, wi;
> -
> -	if (dst->base != 0) {
> -		scx_bpf_error("cmask_copy_from_kernel requires dst->base == 0");
> -		return;
> -	}
> -
> -	if (bpf_probe_read_kernel(&base, sizeof(base), &src->base)) {
> -		scx_bpf_error("probe-read cmask->base failed");
> -		return;
> -	}
> -	if (base != 0) {
> -		scx_bpf_error("cmask_copy_from_kernel requires src->base == 0");
> -		return;
> -	}
> -
> -	if (bpf_probe_read_kernel(&nr_cids, sizeof(nr_cids), &src->nr_cids)) {
> -		scx_bpf_error("probe-read cmask->nr_cids failed");
> -		return;
> -	}
> -
> -	if (nr_cids > dst->nr_cids) {
> -		scx_bpf_error("src cmask nr_cids=%u exceeds dst nr_cids=%u",
> -			      nr_cids, dst->nr_cids);
> -		return;
> -	}
> -
> -	nr_words = CMASK_NR_WORDS(nr_cids);
> -	cmask_zero(dst);
> -	bpf_for(wi, 0, CMASK_MAX_WORDS) {
> -		u64 word = 0;
> -		if (wi >= nr_words)
> -			break;
> -		if (bpf_probe_read_kernel(&word, sizeof(u64), &src->bits[wi])) {
> -			scx_bpf_error("probe-read cmask->bits[%u] failed", wi);
> -			return;
> -		}
> -		dst->bits[wi] = word;
> -	}
> -}
> -
>  #endif /* __SCX_CID_BPF_H */
> diff --git a/tools/sched_ext/scx_qmap.bpf.c b/tools/sched_ext/scx_qmap.bpf.c
> index 7e77f22674ea..8a2d6a8ebd8e 100644
> --- a/tools/sched_ext/scx_qmap.bpf.c
> +++ b/tools/sched_ext/scx_qmap.bpf.c
> @@ -919,14 +919,15 @@ void BPF_STRUCT_OPS(qmap_update_idle, s32 cid, bool idle)
>  }
>  
>  void BPF_STRUCT_OPS(qmap_set_cmask, struct task_struct *p,
> -		    const struct scx_cmask *cmask)
> +		    const struct scx_cmask *cmask_in)
>  {
> +	struct scx_cmask __arena *cmask = (struct scx_cmask __arena *)(long)cmask_in;
>  	task_ctx_t *taskc;
>  
>  	taskc = lookup_task_ctx(p);
>  	if (!taskc)
>  		return;
> -	cmask_copy_from_kernel(&taskc->cpus_allowed, cmask);
> +	cmask_copy(&taskc->cpus_allowed, cmask);
>  }
>  
>  struct monitor_timer {
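
The kern_va to uaddr step in scx_call_op_set_cpumask() above is a plain
subtraction of the arena's kern_vm_start. A minimal userspace sketch of that
arithmetic follows, assuming nothing else about how arena pointers are
interpreted on the BPF side; struct mock_arena and both helpers are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an arena: only the kernel-side mapping base. */
struct mock_arena {
	uintptr_t kern_vm_start;
};

/* Kernel-side address -> value handed to the BPF program. */
static uintptr_t mock_kern_va_to_uaddr(const struct mock_arena *arena,
				       const void *kern_va)
{
	assert((uintptr_t)kern_va >= arena->kern_vm_start);
	return (uintptr_t)kern_va - arena->kern_vm_start;
}

/* Inverse direction: add the base back to reach the kernel-side address. */
static void *mock_uaddr_to_kern_va(const struct mock_arena *arena,
				   uintptr_t uaddr)
{
	return (void *)(arena->kern_vm_start + uaddr);
}
```

The two helpers are inverses of each other, so the kernel can keep storing the
kern_va and derive the other view only at the call site.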



^ permalink raw reply

* Re: [PATCH] clk: moxart: fix refcount leak
From: Alexander A. Klimov @ 2026-05-21  4:16 UTC (permalink / raw)
  To: Brian Masney
  Cc: Krzysztof Kozlowski, Michael Turquette, Stephen Boyd,
	Jonas Jensen, Mike Turquette, moderated list:ARM/MOXA ART SOC,
	open list:COMMON CLK FRAMEWORK, open list
In-Reply-To: <ag41wJBhdK7-Zynb@redhat.com>



On 5/21/26 00:29, Brian Masney wrote:
> Hi Alexander,
> 
> On Wed, May 20, 2026 at 07:55:50PM +0200, Alexander A. Klimov wrote:
>> Every value returned from of_clk_get() is supposed to be cleaned up
>> via clk_put() once not needed anymore.
>> The values here are used only for error checking,
>> but weren't cleaned up until now.
>>
>> Fixes: c7bb4fc16ead ("clk: add MOXA ART SoCs clock driver")
>> Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
>> ---
>>   drivers/clk/clk-moxart.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/clk/clk-moxart.c b/drivers/clk/clk-moxart.c
>> index 3786a0153ad1..7e191b1481bb 100644
>> --- a/drivers/clk/clk-moxart.c
>> +++ b/drivers/clk/clk-moxart.c
>> @@ -39,6 +39,7 @@ static void __init moxart_of_pll_clk_init(struct device_node *node)
>>   		pr_err("%pOF: of_clk_get failed\n", node);
>>   		return;
>>   	}
>> +	clk_put(ref_clk);
>>   
>>   	hw = clk_hw_register_fixed_factor(NULL, name, parent_name, 0, mul, 1);
>>   	if (IS_ERR(hw)) {
>> @@ -83,6 +84,7 @@ static void __init moxart_of_apb_clk_init(struct device_node *node)
>>   		pr_err("%pOF: of_clk_get failed\n", node);
>>   		return;
>>   	}
>> +	clk_put(pll_clk);
> 
> So this immediately drops the reference to the clk after of_clk_get() is
> called. Can we just remove these two of_clk_get() calls since they don't
> appear to be used?
Not if their purpose is to... idk...
check whether device_node is a clock at all, maybe?


^ permalink raw reply

* Re: [PATCH 6/8] sched_ext: Require an arena for cid-form schedulers
From: Emil Tsalapatis @ 2026-05-21  4:15 UTC (permalink / raw)
  To: Tejun Heo, David Vernet, Andrea Righi, Changwoo Min,
	Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kumar Kartikeya Dwivedi
  Cc: Peter Zijlstra, Catalin Marinas, Will Deacon, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Andrew Morton,
	David Hildenbrand, Mike Rapoport, Emil Tsalapatis, sched-ext, bpf,
	x86, linux-arm-kernel, linux-mm, linux-kernel
In-Reply-To: <20260520235052.4180316-7-tj@kernel.org>

On Wed May 20, 2026 at 7:50 PM EDT, Tejun Heo wrote:
> Upcoming patches will let the kernel place arena-resident scratch shared
> with the BPF program (e.g. per-CPU set_cmask cmask) so the BPF side can
> dereference it directly via __arena pointers, replacing the current
> cmask_copy_from_kernel() probe-read loop. That requires each cid-form
> scheduler to expose its arena to the kernel. Kernel-side accesses are
> recovered by the per-arena scratch-page mechanism.
>
> bpf_scx_reg_cid() walks the struct_ops member progs via
> bpf_struct_ops_for_each_prog() and reads each prog's arena via
> bpf_prog_arena(). The verifier enforces one arena per program, so each
> member prog contributes at most one arena. All non-NULL contributions must
> match and at least one member prog must use an arena. The map ref is held on
> scx_sched and dropped on sched destroy. cpu-form schedulers (bpf_scx_reg)
> are unchanged - no arena requirement.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
>  kernel/sched/ext.c          | 56 ++++++++++++++++++++++++++++++++++++-
>  kernel/sched/ext_internal.h |  8 ++++++
>  2 files changed, 63 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 9c458552d14f..56f94ac32ba0 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -5003,6 +5003,8 @@ static void scx_sched_free_rcu_work(struct work_struct *work)
>  
>  	rhashtable_free_and_destroy(&sch->dsq_hash, NULL, NULL);
>  	free_exit_info(sch->exit_info);
> +	if (sch->arena_map)
> +		bpf_map_put(sch->arena_map);
>  	kfree(sch);
>  }
>  
> @@ -6746,6 +6748,7 @@ struct scx_enable_cmd {
>  		struct sched_ext_ops_cid	*ops_cid;
>  	};
>  	bool			is_cid_type;
> +	struct bpf_map		*arena_map;	/* arena ref to transfer to sch */
>  	int			ret;
>  };
>  
> @@ -6913,6 +6916,15 @@ static struct scx_sched *scx_alloc_and_add_sched(struct scx_enable_cmd *cmd,
>  		return ERR_PTR(ret);
>  	}
>  #endif	/* CONFIG_EXT_SUB_SCHED */
> +
> +	/*
> +	 * Consume the arena_map ref bpf_scx_reg_cid() took. Defer to here so
> +	 * earlier failure paths leave cmd->arena_map set and bpf_scx_reg_cid
> +	 * drops the ref. After this point, sch owns the ref and any cleanup
> +	 * runs through scx_sched_free_rcu_work() which puts it.
> +	 */
> +	sch->arena_map = cmd->arena_map;
> +	cmd->arena_map = NULL;
>  	return sch;
>  
>  #ifdef CONFIG_EXT_SUB_SCHED
> @@ -7898,11 +7910,53 @@ static int bpf_scx_reg(void *kdata, struct bpf_link *link)
>  	return scx_enable(&cmd, link);
>  }
>  
> +struct scx_arena_scan {
> +	struct bpf_map	*arena;
> +	int		err;

Can we skip the int err here...

> +};
> +
> +/*
> + * The verifier enforces one arena per BPF program, so each struct_ops
> + * member prog contributes at most one arena via bpf_prog_arena().
> + * Require all non-NULL contributions to match.
> + */
> +static int scx_arena_scan_prog(struct bpf_prog *prog, void *data)
> +{
> +	struct scx_arena_scan *s = data;
> +	struct bpf_map *arena = bpf_prog_arena(prog);
> +
> +	if (!arena)
> +		return 0;
> +	if (s->arena && s->arena != arena) {
> +		s->err = -EINVAL;

...and just directly return -EINVAL here? bpf_struct_ops_for_each_prog
breaks when we return non-zero so do we need the extra scx_arena_scan
struct?

> +		return 1;
> +	}
> +	s->arena = arena;
> +	return 0;
> +}
> +
>  static int bpf_scx_reg_cid(void *kdata, struct bpf_link *link)
>  {
>  	struct scx_enable_cmd cmd = { .ops_cid = kdata, .is_cid_type = true };
> +	struct scx_arena_scan scan = {};
> +	int ret;
>  
> -	return scx_enable(&cmd, link);
> +	bpf_struct_ops_for_each_prog(kdata, scx_arena_scan_prog, &scan);
> +	if (scan.err) {
> +		pr_err("sched_ext: cid-form scheduler uses multiple arena maps\n");
> +		return scan.err;
> +	}
> +	if (!scan.arena) {
> +		pr_err("sched_ext: cid-form scheduler must use a BPF arena map\n");
> +		return -EINVAL;
> +	}
> +
> +	bpf_map_inc(scan.arena);
> +	cmd.arena_map = scan.arena;
> +	ret = scx_enable(&cmd, link);
> +	if (cmd.arena_map)		/* not consumed by scx_alloc_and_add_sched() */
> +		bpf_map_put(cmd.arena_map);
> +	return ret;
>  }
>  
>  static void bpf_scx_unreg(void *kdata, struct bpf_link *link)
> diff --git a/kernel/sched/ext_internal.h b/kernel/sched/ext_internal.h
> index 7258aea94b9f..d40cfd29ddaa 100644
> --- a/kernel/sched/ext_internal.h
> +++ b/kernel/sched/ext_internal.h
> @@ -1111,6 +1111,14 @@ struct scx_sched {
>  		struct sched_ext_ops_cid	ops_cid;
>  	};
>  	bool			is_cid_type;	/* true if registered via bpf_sched_ext_ops_cid */
> +
> +	/*
> +	 * Arena map auto-discovered from member progs at struct_ops attach.
> +	 * cid-form schedulers must use exactly one arena across all member
> +	 * progs. NULL on cpu-form.
> +	 */
> +	struct bpf_map		*arena_map;
> +
>  	DECLARE_BITMAP(has_op, SCX_OPI_END);
>  
>  	/*
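
The review question above (return -EINVAL straight from the callback and let
the iterator propagate it) can be tried out in a userspace model. This is a
sketch, not the sched_ext code: struct mock_map, struct mock_prog,
mock_scan_prog() and mock_for_each_prog() are hypothetical, and the iterator
only assumes the "stop on first non-zero return" behaviour described in the
bpf_struct_ops_for_each_prog() patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mock_map { int id; };			/* stand-in for struct bpf_map */
struct mock_prog { struct mock_map *arena; };	/* NULL: prog uses no arena */

/* Callback: @data points at a single "arena seen so far" slot. */
static int mock_scan_prog(struct mock_prog *prog, void *data)
{
	struct mock_map **seen = data;

	if (!prog->arena)
		return 0;
	if (*seen && *seen != prog->arena)
		return -EINVAL;		/* mismatch reported directly */
	*seen = prog->arena;
	return 0;
}

/* Stops at the first non-zero callback return and propagates it. */
static int mock_for_each_prog(struct mock_prog *progs, size_t nr,
			      int (*cb)(struct mock_prog *prog, void *data),
			      void *data)
{
	size_t i;

	assert(cb != NULL);
	for (i = 0; i < nr; i++) {
		int ret = cb(&progs[i], data);

		if (ret)
			return ret;
	}
	return 0;
}
```

In this model the only state the callback needs is the pointer slot, so the
error field becomes redundant once the iterator's return value is checked.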



^ permalink raw reply

* Re: [PATCH 5/8] bpf/arena: Add bpf_arena_map_kern_vm_start() and bpf_prog_arena()
From: Emil Tsalapatis @ 2026-05-21  4:08 UTC (permalink / raw)
  To: Tejun Heo, David Vernet, Andrea Righi, Changwoo Min,
	Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kumar Kartikeya Dwivedi
  Cc: Peter Zijlstra, Catalin Marinas, Will Deacon, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Andrew Morton,
	David Hildenbrand, Mike Rapoport, Emil Tsalapatis, sched-ext, bpf,
	x86, linux-arm-kernel, linux-mm, linux-kernel
In-Reply-To: <20260520235052.4180316-6-tj@kernel.org>

On Wed May 20, 2026 at 7:50 PM EDT, Tejun Heo wrote:
> struct bpf_arena is opaque to callers outside arena.c. Add two helpers
> for struct_ops subsystems that need to reach into an arena:
>
>   bpf_arena_map_kern_vm_start(struct bpf_map *map)
>     returns @map's kern_vm_start. A sched_ext follow-up needs this
>     to translate kern_va <-> uaddr.
>
>   bpf_prog_arena(struct bpf_prog *prog)
>     returns the bpf_map of the arena referenced by @prog (NULL if
>     @prog references no arena). The verifier enforces at most one
>     arena per program. Used by struct_ops callers that auto-discover
>     an arena from a member prog and need to take a map reference.
>
> Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> Signed-off-by: Tejun Heo <tj@kernel.org>

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

> ---
>  include/linux/bpf.h |  2 ++
>  kernel/bpf/arena.c  | 26 ++++++++++++++++++++++++++
>  2 files changed, 28 insertions(+)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 5b99d786e98c..e1ba57c10aaa 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -618,6 +618,8 @@ void bpf_rb_root_free(const struct btf_field *field, void *rb_root,
>  		      struct bpf_spin_lock *spin_lock);
>  u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena);
>  u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena);
> +u64 bpf_arena_map_kern_vm_start(struct bpf_map *map);
> +struct bpf_map *bpf_prog_arena(struct bpf_prog *prog);
>  int bpf_obj_name_cpy(char *dst, const char *src, unsigned int size);
>  
>  struct bpf_offload_dev;
> diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c
> index a811cf6170fa..51b9ae36feb6 100644
> --- a/kernel/bpf/arena.c
> +++ b/kernel/bpf/arena.c
> @@ -84,6 +84,32 @@ u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena)
>  	return arena ? arena->user_vm_start : 0;
>  }
>  
> +/**
> + * bpf_arena_map_kern_vm_start - kern_vm_start lookup by struct bpf_map *
> + * @map: a BPF_MAP_TYPE_ARENA map
> + *
> + * Return @map's kern_vm_start.
> + */
> +u64 bpf_arena_map_kern_vm_start(struct bpf_map *map)
> +{
> +	return bpf_arena_get_kern_vm_start(container_of(map, struct bpf_arena, map));
> +}
> +
> +/**
> + * bpf_prog_arena - return the bpf_map of the arena referenced by @prog
> + * @prog: a loaded BPF program
> + *
> + * The verifier enforces at most one arena per program and stores it in
> + * prog->aux->arena. Return that arena's underlying bpf_map, or NULL if
> + * @prog does not reference an arena.
> + */
> +struct bpf_map *bpf_prog_arena(struct bpf_prog *prog)
> +{
> +	struct bpf_arena *arena = prog->aux->arena;
> +
> +	return arena ? &arena->map : NULL;
> +}
> +
>  static long arena_map_peek_elem(struct bpf_map *map, void *value)
>  {
>  	return -EOPNOTSUPP;



^ permalink raw reply

* Re: [PATCH 4/8] bpf: Add bpf_struct_ops_for_each_prog()
From: Emil Tsalapatis @ 2026-05-21  4:07 UTC (permalink / raw)
  To: Tejun Heo, David Vernet, Andrea Righi, Changwoo Min,
	Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kumar Kartikeya Dwivedi
  Cc: Peter Zijlstra, Catalin Marinas, Will Deacon, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Andrew Morton,
	David Hildenbrand, Mike Rapoport, Emil Tsalapatis, sched-ext, bpf,
	x86, linux-arm-kernel, linux-mm, linux-kernel
In-Reply-To: <20260520235052.4180316-5-tj@kernel.org>

On Wed May 20, 2026 at 7:50 PM EDT, Tejun Heo wrote:
> Add a helper that walks the member progs of the struct_ops map
> containing a given @kdata vmtable. struct_ops ->reg() callbacks (and
> similar) sometimes need to inspect the loaded BPF programs, e.g. to
> discover maps they reference via prog->aux->used_maps.
>
> The implementation mirrors bpf_struct_ops_id(): container_of @kdata
> to recover the bpf_struct_ops_map, then iterate st_map->links[i]->prog
> for i in [0, funcs_cnt). Same access pattern, no new locking - by the
> time ->reg() fires st_map is fully populated and stable.
>
> A sched_ext follow-up walks the member progs of a cid-form scheduler's
> struct_ops map, reads prog->aux->arena directly, and requires all member
> progs to reference exactly one arena, without requiring the BPF program
> to call a registration kfunc.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

> ---
>  include/linux/bpf.h         |  3 +++
>  kernel/bpf/bpf_struct_ops.c | 36 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 39 insertions(+)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 64968ca6db51..5b99d786e98c 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -2129,6 +2129,9 @@ int bpf_prog_assoc_struct_ops(struct bpf_prog *prog, struct bpf_map *map);
>  void bpf_prog_disassoc_struct_ops(struct bpf_prog *prog);
>  void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux);
>  u32 bpf_struct_ops_id(const void *kdata);
> +int bpf_struct_ops_for_each_prog(const void *kdata,
> +				 int (*cb)(struct bpf_prog *prog, void *data),
> +				 void *data);
>  
>  #ifdef CONFIG_NET
>  /* Define it here to avoid the use of forward declaration */
> diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> index 05b366b821c3..16aec18ed31b 100644
> --- a/kernel/bpf/bpf_struct_ops.c
> +++ b/kernel/bpf/bpf_struct_ops.c
> @@ -1203,6 +1203,42 @@ u32 bpf_struct_ops_id(const void *kdata)
>  }
>  EXPORT_SYMBOL_GPL(bpf_struct_ops_id);
>  
> +/**
> + * bpf_struct_ops_for_each_prog - Invoke @cb for each member prog
> + * @kdata: kernel-side struct_ops vmtable (the @kdata arg to ->reg/->update/->unreg)
> + * @cb: callback invoked once per member prog; non-zero return stops iteration
> + * @data: opaque argument passed to @cb
> + *
> + * Walks the struct_ops member progs registered on the map containing @kdata.
> + * Intended for use from struct_ops ->reg() callbacks (and similar) that need to
> + * inspect the loaded BPF programs (for example to discover maps they reference
> + * via @prog->aux->used_maps).
> + *
> + * Return 0 if iteration completed, otherwise the first non-zero @cb return.
> + */
> +int bpf_struct_ops_for_each_prog(const void *kdata,
> +				 int (*cb)(struct bpf_prog *prog, void *data),
> +				 void *data)
> +{
> +	struct bpf_struct_ops_value *kvalue;
> +	struct bpf_struct_ops_map *st_map;
> +	u32 i;
> +	int ret;
> +
> +	kvalue = container_of(kdata, struct bpf_struct_ops_value, data);
> +	st_map = container_of(kvalue, struct bpf_struct_ops_map, kvalue);
> +
> +	for (i = 0; i < st_map->funcs_cnt; i++) {
> +		if (!st_map->links[i])
> +			continue;
> +		ret = cb(st_map->links[i]->prog, data);
> +		if (ret)
> +			return ret;
> +	}
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(bpf_struct_ops_for_each_prog);
> +
>  static bool bpf_struct_ops_valid_to_reg(struct bpf_map *map)
>  {
>  	struct bpf_struct_ops_map *st_map = (struct bpf_struct_ops_map *)map;
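
The two-step container_of() recovery used by bpf_struct_ops_for_each_prog()
can be illustrated outside the kernel. The sketch below is an assumption-laden
model: mock_container_of() is a simplified userspace macro without the
kernel's type checking, and struct mock_value / struct mock_st_map are
hypothetical layouts, not the real bpf_struct_ops structures.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace equivalent of container_of(), no type checking. */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mock_value {
	int refcnt;		/* stand-in for the common header fields */
	long data[4];		/* what callers receive as @kdata */
};

struct mock_st_map {
	unsigned int funcs_cnt;
	struct mock_value kvalue;
};

/* Recover the enclosing map from a pointer to the embedded data area. */
static struct mock_st_map *mock_map_from_kdata(const void *kdata)
{
	struct mock_value *kvalue;

	assert(kdata != NULL);
	kvalue = mock_container_of(kdata, struct mock_value, data);
	return mock_container_of(kvalue, struct mock_st_map, kvalue);
}
```

The first step walks from the data area back to its value wrapper, the second
from the wrapper back to the map that embeds it.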



^ permalink raw reply

* Re: [RFC V2 01/14] mm: Abstract printing of pxd_val()
From: Anshuman Khandual @ 2026-05-21  3:43 UTC (permalink / raw)
  To: David Hildenbrand (Arm), Dave Hansen, linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Ryan Roberts, Mark Rutland,
	Lorenzo Stoakes, Andrew Morton, Mike Rapoport, Linu Cherian,
	Usama Arif, linux-kernel, linux-mm
In-Reply-To: <b976ec4e-d5cb-4856-885d-dd301b3ddd83@kernel.org>

On 20/05/26 4:11 PM, David Hildenbrand (Arm) wrote:
> On 5/19/26 16:28, Dave Hansen wrote:
>> On 5/12/26 21:45, Anshuman Khandual wrote:
>>>  	if (!p4d_present(p4d) || p4d_leaf(p4d)) {
>>> -		pr_alert("pgd:%08llx p4d:%08llx\n", pgdv, p4dv);
>>> +		pr_alert("pgd:%" __PRIpxx " p4d:%" __PRIpxx "\n",
>>> +			 __PRIpxx_args(pgdv), __PRIpxx_args(p4dv));
>>>  		return;
>>>  	}
>>
>> That's not the most readable result. Could a printk() format specifier
>> make this nicer? Maybe use "%pT"?
>>
>> 	pr_alert("pgd:%pT p4d:%pT\n", &pgd, &p4d);
>>
>> I _think_ it could even get rid of the p??v variables.
> 
> That would be nicer indeed, if that works.

I had attempted something similar earlier.

https://lore.kernel.org/all/20250618041235.1716143-1-anshuman.khandual@arm.com/

But the current proposal was meant to solve the problem with the minimum
possible churn in generic MM, so that the values of 128-bit page table
entries can be printed.

Please find an example WIP patch in this regard (tested very lightly). Please do
let me know if this is in the right direction and should be followed up instead,
although special_hex_number() might have to support 128-bit values.

=====================================

diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst
index c0b1b6089307..e69f91a9dd9d 100644
--- a/Documentation/core-api/printk-formats.rst
+++ b/Documentation/core-api/printk-formats.rst
@@ -696,6 +696,25 @@ Rust
 Only intended to be used from Rust code to format ``core::fmt::Arguments``.
 Do *not* use it from C.
 
+Page Table Entry
+----------------
+
+::
+
+        %p[pgd|p4d|pud|pmd|pte]
+
+Print page table entry at any level.
+
+Passed by reference.
+
+Examples for a 64 bit page table entry, given &(u64)0xc0ffee::
+
+        %ppte   0x0000000000c0ffee
+        %ppmd   0x0000000000c0ffee
+        %ppud   0x0000000000c0ffee
+        %pp4d   0x0000000000c0ffee
+        %ppgd   0x0000000000c0ffee
+
 Thanks
 ======
 
diff --git a/lib/tests/printf_kunit.c b/lib/tests/printf_kunit.c
index bb70b9cddadd..ab7f55499eb7 100644
--- a/lib/tests/printf_kunit.c
+++ b/lib/tests/printf_kunit.c
@@ -791,6 +791,73 @@ errptr(struct kunit *kunittest)
 #endif
 }
 
+struct pxd_test {
+	u64 val;
+	const char *name;
+};
+
+static struct pxd_test pxd_test_cases[] = {
+	{ .val = 0xc0ffee,		.name = "0x0000000000c0ffee"},
+	{ .val = 0xdeadbeef,		.name = "0x00000000deadbeef"},
+	{ .val = 0xaabbcc,		.name = "0x0000000000aabbcc"},
+	{ .val = 0xcc,			.name = "0x00000000000000cc"},
+	{ .val = 0x1,			.name = "0x0000000000000001"},
+	{ .val = 0x11,			.name = "0x0000000000000011"},
+	{ .val = 0x111,			.name = "0x0000000000000111"},
+	{ .val = 0x10000010001,		.name = "0x0000010000010001"},
+	{ .val = 0xc0ffeec0ffee,	.name = "0x0000c0ffeec0ffee"},
+	{ .val = 0x10000000000,		.name = "0x0000010000000000"},
+	{ .val = 0x11000000000,		.name = "0x0000011000000000"},
+	{ .val = 0x1000000000000000,	.name = "0x1000000000000000"},
+	{ .val = 0x1100000000000000,	.name = "0x1100000000000000"},
+	{ .val = 0x1110000000000000,	.name = "0x1110000000000000"},
+};
+
+static void
+pxd(struct kunit *kunittest)
+{
+	char buf[64];
+	int i;
+
+	if (sizeof(pte_t) != 8)
+		kunit_skip(kunittest, "pte_t size is not 64 bits");
+
+	for (i = 0; i < ARRAY_SIZE(pxd_test_cases); i++) {
+		pte_t pte = __pte(pxd_test_cases[i].val);
+
+		snprintf(buf, sizeof(buf), "%ppte", &pte);
+		KUNIT_EXPECT_STREQ(kunittest, buf, pxd_test_cases[i].name);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(pxd_test_cases); i++) {
+		pmd_t pmd = __pmd(pxd_test_cases[i].val);
+
+		snprintf(buf, sizeof(buf), "%ppmd", &pmd);
+		KUNIT_EXPECT_STREQ(kunittest, buf, pxd_test_cases[i].name);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(pxd_test_cases); i++) {
+		pud_t pud = __pud(pxd_test_cases[i].val);
+
+		snprintf(buf, sizeof(buf), "%ppud", &pud);
+		KUNIT_EXPECT_STREQ(kunittest, buf, pxd_test_cases[i].name);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(pxd_test_cases); i++) {
+		p4d_t p4d = __p4d(pxd_test_cases[i].val);
+
+		snprintf(buf, sizeof(buf), "%pp4d", &p4d);
+		KUNIT_EXPECT_STREQ(kunittest, buf, pxd_test_cases[i].name);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(pxd_test_cases); i++) {
+		pgd_t pgd = __pgd(pxd_test_cases[i].val);
+
+		snprintf(buf, sizeof(buf), "%ppgd", &pgd);
+		KUNIT_EXPECT_STREQ(kunittest, buf, pxd_test_cases[i].name);
+	}
+}
+
 static int printf_suite_init(struct kunit_suite *suite)
 {
 	total_tests = 0;
@@ -839,6 +906,7 @@ static struct kunit_case printf_test_cases[] = {
 	KUNIT_CASE(errptr),
 	KUNIT_CASE(fwnode_pointer),
 	KUNIT_CASE(fourcc_pointer),
+	KUNIT_CASE(pxd),
 	{}
 };
 
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 9f359b31c8d1..937499c51ecd 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -856,6 +856,51 @@ static char *default_pointer(char *buf, char *end, const void *ptr,
 	return ptr_to_id(buf, end, ptr, spec);
 }
 
+static char *pxd_pointer(char *buf, char *end, const void *ptr,
+			 struct printf_spec spec, const char *fmt)
+{
+
+	if (check_pointer(&buf, end, ptr, spec))
+		return buf;
+
+	static_assert((sizeof(pte_t) == 4) || (sizeof(pte_t) == 8));
+	static_assert((sizeof(pmd_t) == 4) || (sizeof(pmd_t) == 8));
+	static_assert((sizeof(pud_t) == 4) || (sizeof(pud_t) == 8));
+	static_assert((sizeof(p4d_t) == 4) || (sizeof(p4d_t) == 8));
+	static_assert((sizeof(pgd_t) == 4) || (sizeof(pgd_t) == 8));
+
+	if (fmt[1] == 't' && fmt[2] == 'e') {
+		pte_t *pte = (pte_t *)ptr;
+
+		return special_hex_number(buf, end, pte_val(*pte), sizeof(pte_t));
+	}
+
+	if (fmt[1] == 'm' && fmt[2] == 'd') {
+		pmd_t *pmd = (pmd_t *)ptr;
+
+		return special_hex_number(buf, end, pmd_val(*pmd), sizeof(pmd_t));
+	}
+
+	if (fmt[1] == 'u' && fmt[2] == 'd') {
+		pud_t *pud = (pud_t *)ptr;
+
+		return special_hex_number(buf, end, pud_val(*pud), sizeof(pud_t));
+	}
+
+	if (fmt[1] == '4' && fmt[2] == 'd') {
+		p4d_t *p4d = (p4d_t *)ptr;
+
+		return special_hex_number(buf, end, p4d_val(*p4d), sizeof(p4d_t));
+	}
+
+	if (fmt[1] == 'g' && fmt[2] == 'd') {
+		pgd_t *pgd = (pgd_t *)ptr;
+
+		return special_hex_number(buf, end, pgd_val(*pgd), sizeof(pgd_t));
+	}
+	return default_pointer(buf, end, ptr, spec);
+}
+
 int kptr_restrict __read_mostly;
 
 static noinline_for_stack
@@ -2506,6 +2551,9 @@ early_param("no_hash_pointers", no_hash_pointers_enable);
  *		Without an option prints the full name of the node
  *		f full name
  *		P node name, including a possible unit address
+ * - 'p[g|4|u|m|t][d|e]' For a page table entry, this prints its
+ *			  contents in a hexadecimal format
+ *
  * - 'x' For printing the address unmodified. Equivalent to "%lx".
  *       Please read the documentation (path below) before using!
  * - '[ku]s' For a BPF/tracing related format specifier, e.g. used out of
@@ -2615,6 +2663,8 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
 		default:
 			return error_string(buf, end, "(einval)", spec);
 		}
+	case 'p':
+		return pxd_pointer(buf, end, ptr, spec, fmt);
 	default:
 		return default_pointer(buf, end, ptr, spec);
 	}
diff --git a/mm/memory.c b/mm/memory.c
index ea6568571131..838e06cc377d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -521,7 +521,6 @@ static bool is_bad_page_map_ratelimited(void)
 
 static void __print_bad_page_map_pgtable(struct mm_struct *mm, unsigned long addr)
 {
-	unsigned long long pgdv, p4dv, pudv, pmdv;
 	p4d_t p4d, *p4dp;
 	pud_t pud, *pudp;
 	pmd_t pmd, *pmdp;
@@ -532,34 +531,30 @@ static void __print_bad_page_map_pgtable(struct mm_struct *mm, unsigned long add
 	 * see locking requirements for print_bad_page_map().
 	 */
 	pgdp = pgd_offset(mm, addr);
-	pgdv = pgd_val(*pgdp);
 
 	if (!pgd_present(*pgdp) || pgd_leaf(*pgdp)) {
-		pr_alert("pgd:%08llx\n", pgdv);
+		pr_alert("pgd:%ppgd\n", pgdp);
 		return;
 	}
 
 	p4dp = p4d_offset(pgdp, addr);
 	p4d = p4dp_get(p4dp);
-	p4dv = p4d_val(p4d);
 
 	if (!p4d_present(p4d) || p4d_leaf(p4d)) {
-		pr_alert("pgd:%08llx p4d:%08llx\n", pgdv, p4dv);
+		pr_alert("pgd:%ppgd p4d:%pp4d\n", pgdp, p4dp);
 		return;
 	}
 
 	pudp = pud_offset(p4dp, addr);
 	pud = pudp_get(pudp);
-	pudv = pud_val(pud);
 
 	if (!pud_present(pud) || pud_leaf(pud)) {
-		pr_alert("pgd:%08llx p4d:%08llx pud:%08llx\n", pgdv, p4dv, pudv);
+		pr_alert("pgd:%ppgd p4d:%pp4d pud:%ppud\n", pgdp, p4dp, pudp);
 		return;
 	}
 
 	pmdp = pmd_offset(pudp, addr);
 	pmd = pmdp_get(pmdp);
-	pmdv = pmd_val(pmd);
 
 	/*
 	 * Dumping the PTE would be nice, but it's tricky with CONFIG_HIGHPTE,
@@ -567,8 +562,8 @@ static void __print_bad_page_map_pgtable(struct mm_struct *mm, unsigned long add
 	 * doing another map would be bad. print_bad_page_map() should
 	 * already take care of printing the PTE.
 	 */
-	pr_alert("pgd:%08llx p4d:%08llx pud:%08llx pmd:%08llx\n", pgdv,
-		 p4dv, pudv, pmdv);
+	pr_alert("pgd:%ppgd p4d:%pp4d pud:%ppud pmd:%ppmd\n", pgdp,
+		 p4dp, pudp, pmdp);
 }
 
 /*
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 0492d6afc9a1..9dd17e501bfa 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -6975,7 +6975,7 @@ sub process {
 				my $fmt = get_quoted_string($lines[$count - 1], raw_line($count, 0));
 				$fmt =~ s/%%//g;
 
-				while ($fmt =~ /(\%[\*\d\.]*p(\w)(\w*))/g) {
+				while ($fmt =~ /(\%[\*\d\.]*p(\w)(\w*)(\te)(\md)(\ud)(\4d)(\gd))/g) {
 					$specifier = $1;
 					$extension = $2;
 					$qualifier = $3;
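
For readers who want to try the suffix dispatch outside the kernel, here is a
minimal userspace sketch of what pxd_pointer() above does. It is not the
kernel implementation: pxd_format() and its size parameter are hypothetical
names introduced for illustration, and the output layout ("0x" followed by
two zero-padded lowercase hex digits per byte) is an assumption about how
special_hex_number() formats its argument. In the kernel the size comes from
sizeof(pte_t) and friends, which the patch asserts to be 4 or 8.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for pxd_pointer().
 *
 * @suffix: the two characters that follow "%pp" ("te", "md", "ud", "4d", "gd")
 * @entry:  pointer to the page table entry value
 * @size:   assumed entry size in bytes, 4 or 8
 *
 * Returns what snprintf() returns, or -1 for an unknown suffix.
 */
static int pxd_format(char *buf, size_t len, const char *suffix,
		      const void *entry, size_t size)
{
	static const char *const known[] = { "te", "md", "ud", "4d", "gd" };
	const size_t nr_known = sizeof(known) / sizeof(known[0]);
	uint64_t val = 0;
	size_t i;

	assert(buf && suffix && entry);
	/* Mirrors the static_assert()s in the patch. */
	assert(size == 4 || size == 8);

	for (i = 0; i < nr_known; i++)
		if (strncmp(suffix, known[i], 2) == 0)
			break;
	if (i == nr_known)
		return -1;

	/* Read the entry through memcpy() to avoid alignment assumptions. */
	if (size == 4) {
		uint32_t v32;

		memcpy(&v32, entry, sizeof(v32));
		val = v32;
	} else {
		memcpy(&val, entry, sizeof(val));
	}

	/* "0x" plus two hex digits per byte, zero padded. */
	return snprintf(buf, len, "0x%0*" PRIx64, (int)(2 * size), val);
}
```

Under these assumptions an 8-byte entry holding 0x1234 formats as
0x0000000000001234, so the width tracks the entry size instead of the fixed
%08llx used by the old pr_alert() calls in mm/memory.c.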


^ permalink raw reply related

* RE: [PATCH V3 0/8] PCI: imx6: Integrate pwrctrl API and update device trees
From: Hongxing Zhu (OSS) @ 2026-05-21  3:37 UTC (permalink / raw)
  To: Sherry Sun (OSS), robh@kernel.org, krzk+dt@kernel.org,
	conor+dt@kernel.org, Frank Li, s.hauer@pengutronix.de,
	kernel@pengutronix.de, festevam@gmail.com, lpieralisi@kernel.org,
	kwilczynski@kernel.org, mani@kernel.org, bhelgaas@google.com,
	l.stach@pengutronix.de
  Cc: imx@lists.linux.dev, linux-pci@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, Sherry Sun
In-Reply-To: <20260520084904.2424253-1-sherry.sun@oss.nxp.com>

> -----Original Message-----
> From: Sherry Sun (OSS) <sherry.sun@oss.nxp.com>
> Sent: Wednesday, May 20, 2026 4:49 PM
> To: robh@kernel.org; krzk+dt@kernel.org; conor+dt@kernel.org; Frank Li
> <frank.li@nxp.com>; s.hauer@pengutronix.de; kernel@pengutronix.de;
> festevam@gmail.com; lpieralisi@kernel.org; kwilczynski@kernel.org;
> mani@kernel.org; bhelgaas@google.com; Hongxing Zhu
> <hongxing.zhu@nxp.com>; l.stach@pengutronix.de
> Cc: imx@lists.linux.dev; linux-pci@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org; devicetree@vger.kernel.org; linux-
> kernel@vger.kernel.org; Sherry Sun <sherry.sun@nxp.com>
> Subject: [PATCH V3 0/8] PCI: imx6: Integrate pwrctrl API and update device trees
> 
> From: Sherry Sun <sherry.sun@nxp.com>
> 
> This series integrates the PCI pwrctrl framework into the pci-imx6 driver and
> updates i.MX EVK board device trees to support it.
> 
> Patches 2-8 update the device trees for the i.MX EVK boards maintained by NXP,
> moving the power supply properties from the PCIe controller node to the Root
> Port child node, which is required by the pwrctrl framework.
> Affected boards:
> - i.MX6Q/DL SABRESD
> - i.MX6SX SDB
> - i.MX8MM EVK
> - i.MX8MP EVK
> - i.MX8MQ EVK
> - i.MX8DXL/QM/QXP EVK
> - i.MX95 15x15/19x19 EVK
> 
> The driver maintains legacy regulator handling for device trees that haven't been
> updated yet. Both old and new device tree structures are supported.
> 
> Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Hi Sherry,
Since vpcie3v3aux is used to power WAKE#, it stays always on in this
pwrctrl framework regardless of whether the system is suspended, right?

Best Regards
Richard Zhu
> ---
> Changes in V3:
> 1. Rebased on top of latest 7.1.0-rc4
> 
> Changes in V2:
> 1. After commit 2d8c5098b847 ("PCI/pwrctrl: Do not power off on pwrctrl
>    device removal"), the pwrctrl drivers no longer power off devices
>    during removal. Update pci-imx6 driver's shutdown callback in patch#1
>    to explicitly call pci_pwrctrl_power_off_devices() before
>    pci_pwrctrl_destroy_devices() to ensure devices are properly powered
>    off.
> ---
> 
> Sherry Sun (8):
>   PCI: imx6: Integrate new pwrctrl API for pci-imx6
>   arm: dts: imx6qdl-sabresd: Move power supply property to Root Port
>     node
>   arm: dts: imx6sx-sdb: Move power supply property to Root Port node
>   arm64: dts: imx8mm-evk: Move power supply property to Root Port node
>   arm64: dts: imx8mp-evk: Move power supply properties to Root Port node
>   arm64: dts: imx8mq-evk: Move power supply properties to Root Port node
>   arm64: dts: imx8dxl/qm/qxp: Move power supply properties to Root Port
>     node
>   arm64: dts: imx95: Move power supply properties to Root Port node
> 
>  .../arm/boot/dts/nxp/imx/imx6qdl-sabresd.dtsi |  2 +-
>  arch/arm/boot/dts/nxp/imx/imx6sx-sdb.dtsi     |  2 +-
>  arch/arm64/boot/dts/freescale/imx8dxl-evk.dts |  4 ++--
>  arch/arm64/boot/dts/freescale/imx8mm-evk.dtsi |  2 +-
>  arch/arm64/boot/dts/freescale/imx8mp-evk.dts  |  4 ++--
>  arch/arm64/boot/dts/freescale/imx8mq-evk.dts  |  4 ++--
>  arch/arm64/boot/dts/freescale/imx8qm-mek.dts  |  4 ++--
>  arch/arm64/boot/dts/freescale/imx8qxp-mek.dts |  4 ++--
>  .../boot/dts/freescale/imx95-15x15-evk.dts    |  4 ++--
>  .../boot/dts/freescale/imx95-19x19-evk.dts    |  8 +++----
>  drivers/pci/controller/dwc/Kconfig            |  1 +
>  drivers/pci/controller/dwc/pci-imx6.c         | 24 ++++++++++++++++++-
>  12 files changed, 43 insertions(+), 20 deletions(-)
> 
> --
> 2.37.1



^ permalink raw reply

* [soc:soc/drivers] BUILD SUCCESS b1700f8d6c8031948e2b898d2c839dfabe0ba68e
From: kernel test robot @ 2026-05-21  3:22 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-arm-kernel, arm

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git soc/drivers
branch HEAD: b1700f8d6c8031948e2b898d2c839dfabe0ba68e  Merge tag 'renesas-drivers-for-v7.2-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into soc/drivers

elapsed time: 759m

configs tested: 227
configs skipped: 3

The following configs have been built successfully.
More configs may be tested in the coming days.

tested configs:
alpha                             allnoconfig    gcc-15.2.0
alpha                            allyesconfig    gcc-15.2.0
alpha                               defconfig    gcc-15.2.0
arc                              allmodconfig    clang-16
arc                              allmodconfig    gcc-15.2.0
arc                               allnoconfig    gcc-15.2.0
arc                              allyesconfig    clang-23
arc                              allyesconfig    gcc-15.2.0
arc                                 defconfig    gcc-15.2.0
arc                   randconfig-001-20260521    gcc-8.5.0
arc                   randconfig-002-20260521    gcc-8.5.0
arm                               allnoconfig    clang-23
arm                               allnoconfig    gcc-15.2.0
arm                              allyesconfig    clang-16
arm                              allyesconfig    gcc-15.2.0
arm                                 defconfig    gcc-15.2.0
arm                   randconfig-001-20260521    gcc-8.5.0
arm                   randconfig-002-20260521    gcc-8.5.0
arm                   randconfig-003-20260521    gcc-8.5.0
arm                   randconfig-004-20260521    gcc-8.5.0
arm64                            allmodconfig    clang-19
arm64                            allmodconfig    clang-23
arm64                             allnoconfig    gcc-15.2.0
arm64                               defconfig    gcc-15.2.0
arm64                 randconfig-001-20260521    gcc-8.5.0
arm64                 randconfig-002-20260521    gcc-8.5.0
arm64                 randconfig-003-20260521    gcc-8.5.0
arm64                 randconfig-004-20260521    gcc-8.5.0
csky                             allmodconfig    gcc-15.2.0
csky                              allnoconfig    gcc-15.2.0
csky                                defconfig    gcc-15.2.0
csky                  randconfig-001-20260521    gcc-8.5.0
csky                  randconfig-002-20260521    gcc-8.5.0
hexagon                          allmodconfig    clang-17
hexagon                          allmodconfig    gcc-15.2.0
hexagon                           allnoconfig    clang-23
hexagon                           allnoconfig    gcc-15.2.0
hexagon                             defconfig    gcc-15.2.0
hexagon               randconfig-001-20260521    gcc-11.5.0
hexagon               randconfig-002-20260521    gcc-11.5.0
i386                             allmodconfig    clang-20
i386                             allmodconfig    gcc-14
i386                              allnoconfig    gcc-14
i386                              allnoconfig    gcc-15.2.0
i386                             allyesconfig    clang-20
i386        buildonly-randconfig-001-20260521    clang-20
i386        buildonly-randconfig-002-20260521    clang-20
i386        buildonly-randconfig-003-20260521    clang-20
i386        buildonly-randconfig-004-20260521    clang-20
i386        buildonly-randconfig-005-20260521    clang-20
i386        buildonly-randconfig-006-20260521    clang-20
i386                                defconfig    gcc-15.2.0
i386                  randconfig-001-20260521    clang-20
i386                  randconfig-002-20260521    clang-20
i386                  randconfig-002-20260521    gcc-14
i386                  randconfig-003-20260521    clang-20
i386                  randconfig-004-20260521    clang-20
i386                  randconfig-004-20260521    gcc-14
i386                  randconfig-005-20260521    clang-20
i386                  randconfig-006-20260521    clang-20
i386                  randconfig-007-20260521    clang-20
i386                  randconfig-011-20260521    gcc-14
i386                  randconfig-012-20260521    gcc-14
i386                  randconfig-013-20260521    gcc-14
i386                  randconfig-014-20260521    gcc-14
i386                  randconfig-015-20260521    gcc-14
i386                  randconfig-016-20260521    gcc-14
i386                  randconfig-017-20260521    gcc-14
loongarch                        allmodconfig    clang-19
loongarch                        allmodconfig    clang-23
loongarch                         allnoconfig    clang-23
loongarch                         allnoconfig    gcc-15.2.0
loongarch                           defconfig    clang-19
loongarch             randconfig-001-20260521    gcc-11.5.0
loongarch             randconfig-002-20260521    gcc-11.5.0
m68k                             allmodconfig    gcc-15.2.0
m68k                              allnoconfig    gcc-15.2.0
m68k                             allyesconfig    clang-16
m68k                             allyesconfig    gcc-15.2.0
m68k                                defconfig    clang-19
microblaze                        allnoconfig    gcc-15.2.0
microblaze                       allyesconfig    gcc-15.2.0
microblaze                          defconfig    clang-19
mips                             allmodconfig    gcc-15.2.0
mips                              allnoconfig    gcc-15.2.0
mips                             allyesconfig    gcc-15.2.0
mips                  cavium_octeon_defconfig    gcc-15.2.0
mips                         rt305x_defconfig    clang-23
nios2                            allmodconfig    clang-23
nios2                            allmodconfig    gcc-11.5.0
nios2                             allnoconfig    clang-23
nios2                             allnoconfig    gcc-11.5.0
nios2                               defconfig    clang-19
nios2                 randconfig-001-20260521    gcc-11.5.0
nios2                 randconfig-002-20260521    gcc-11.5.0
openrisc                         allmodconfig    clang-23
openrisc                         allmodconfig    gcc-15.2.0
openrisc                          allnoconfig    clang-23
openrisc                          allnoconfig    gcc-15.2.0
openrisc                            defconfig    gcc-15.2.0
parisc                           allmodconfig    gcc-15.2.0
parisc                            allnoconfig    clang-23
parisc                            allnoconfig    gcc-15.2.0
parisc                           allyesconfig    clang-19
parisc                           allyesconfig    gcc-15.2.0
parisc                              defconfig    gcc-15.2.0
parisc                randconfig-001-20260521    gcc-12.5.0
parisc                randconfig-002-20260521    gcc-12.5.0
parisc64                            defconfig    clang-19
powerpc                          allmodconfig    gcc-15.2.0
powerpc                           allnoconfig    clang-23
powerpc                           allnoconfig    gcc-15.2.0
powerpc                      mgcoge_defconfig    clang-23
powerpc               randconfig-001-20260521    gcc-12.5.0
powerpc               randconfig-002-20260521    gcc-12.5.0
powerpc                     stx_gp3_defconfig    gcc-15.2.0
powerpc64             randconfig-001-20260521    gcc-12.5.0
powerpc64             randconfig-002-20260521    gcc-12.5.0
riscv                            allmodconfig    clang-23
riscv                             allnoconfig    clang-23
riscv                             allnoconfig    gcc-15.2.0
riscv                            allyesconfig    clang-16
riscv                               defconfig    gcc-15.2.0
riscv                          randconfig-001    gcc-15.2.0
riscv                 randconfig-001-20260521    gcc-15.2.0
riscv                          randconfig-002    gcc-15.2.0
riscv                 randconfig-002-20260521    gcc-15.2.0
s390                             allmodconfig    clang-18
s390                             allmodconfig    clang-19
s390                              allnoconfig    clang-23
s390                             allyesconfig    gcc-15.2.0
s390                          debug_defconfig    gcc-15.2.0
s390                                defconfig    gcc-15.2.0
s390                           randconfig-001    gcc-15.2.0
s390                  randconfig-001-20260521    gcc-15.2.0
s390                           randconfig-002    gcc-15.2.0
s390                  randconfig-002-20260521    gcc-15.2.0
sh                               allmodconfig    gcc-15.2.0
sh                                allnoconfig    clang-23
sh                                allnoconfig    gcc-15.2.0
sh                               allyesconfig    clang-19
sh                               allyesconfig    gcc-15.2.0
sh                                  defconfig    gcc-14
sh                             randconfig-001    gcc-15.2.0
sh                    randconfig-001-20260521    gcc-15.2.0
sh                             randconfig-002    gcc-15.2.0
sh                    randconfig-002-20260521    gcc-15.2.0
sparc                             allnoconfig    clang-23
sparc                             allnoconfig    gcc-15.2.0
sparc                               defconfig    gcc-15.2.0
sparc                 randconfig-001-20260520    gcc-12.5.0
sparc                 randconfig-001-20260521    gcc-8.5.0
sparc                 randconfig-002-20260520    gcc-14.3.0
sparc                 randconfig-002-20260521    gcc-8.5.0
sparc64                          allmodconfig    clang-23
sparc64                             defconfig    gcc-14
sparc64               randconfig-001-20260520    gcc-13.4.0
sparc64               randconfig-001-20260521    gcc-8.5.0
sparc64               randconfig-002-20260520    gcc-13.4.0
sparc64               randconfig-002-20260521    gcc-8.5.0
um                               allmodconfig    clang-19
um                                allnoconfig    clang-23
um                               allyesconfig    gcc-14
um                               allyesconfig    gcc-15.2.0
um                                  defconfig    gcc-14
um                             i386_defconfig    gcc-14
um                    randconfig-001-20260520    clang-23
um                    randconfig-001-20260521    gcc-8.5.0
um                    randconfig-002-20260520    gcc-14
um                    randconfig-002-20260521    gcc-8.5.0
um                           x86_64_defconfig    gcc-14
x86_64                           allmodconfig    clang-20
x86_64                            allnoconfig    clang-20
x86_64                            allnoconfig    clang-23
x86_64                           allyesconfig    clang-20
x86_64               buildonly-randconfig-001    gcc-12
x86_64      buildonly-randconfig-001-20260520    clang-20
x86_64      buildonly-randconfig-001-20260521    clang-20
x86_64               buildonly-randconfig-002    clang-20
x86_64      buildonly-randconfig-002-20260520    clang-20
x86_64      buildonly-randconfig-002-20260521    clang-20
x86_64               buildonly-randconfig-003    gcc-14
x86_64      buildonly-randconfig-003-20260520    gcc-14
x86_64      buildonly-randconfig-003-20260521    clang-20
x86_64               buildonly-randconfig-004    gcc-14
x86_64      buildonly-randconfig-004-20260520    gcc-14
x86_64      buildonly-randconfig-004-20260521    clang-20
x86_64               buildonly-randconfig-005    gcc-14
x86_64      buildonly-randconfig-005-20260520    gcc-14
x86_64      buildonly-randconfig-005-20260521    clang-20
x86_64               buildonly-randconfig-006    clang-20
x86_64      buildonly-randconfig-006-20260520    gcc-14
x86_64      buildonly-randconfig-006-20260521    clang-20
x86_64                              defconfig    gcc-14
x86_64                                  kexec    clang-20
x86_64                randconfig-001-20260521    clang-20
x86_64                randconfig-002-20260521    clang-20
x86_64                randconfig-003-20260521    clang-20
x86_64                randconfig-004-20260521    clang-20
x86_64                randconfig-005-20260521    clang-20
x86_64                randconfig-006-20260521    clang-20
x86_64                randconfig-011-20260521    gcc-14
x86_64                randconfig-012-20260521    gcc-14
x86_64                randconfig-013-20260521    gcc-14
x86_64                randconfig-014-20260521    gcc-14
x86_64                randconfig-015-20260521    gcc-14
x86_64                randconfig-016-20260521    gcc-14
x86_64                randconfig-071-20260521    clang-20
x86_64                randconfig-072-20260521    clang-20
x86_64                randconfig-073-20260521    clang-20
x86_64                randconfig-074-20260521    clang-20
x86_64                randconfig-075-20260521    clang-20
x86_64                randconfig-076-20260521    clang-20
x86_64                               rhel-9.4    clang-20
x86_64                           rhel-9.4-bpf    gcc-14
x86_64                          rhel-9.4-func    clang-20
x86_64                    rhel-9.4-kselftests    clang-20
x86_64                         rhel-9.4-kunit    gcc-14
x86_64                           rhel-9.4-ltp    gcc-14
x86_64                          rhel-9.4-rust    clang-20
xtensa                            allnoconfig    clang-23
xtensa                            allnoconfig    gcc-15.2.0
xtensa                           allyesconfig    clang-23
xtensa                randconfig-001-20260520    gcc-8.5.0
xtensa                randconfig-001-20260521    gcc-8.5.0
xtensa                randconfig-002-20260520    gcc-11.5.0
xtensa                randconfig-002-20260521    gcc-8.5.0

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply
