Linux-ARM-Kernel Archive on lore.kernel.org
* Re: [PATCH] srcu: Optimize SRCU-fast per-CPU counter increments on arm64
From: Will Deacon @ 2026-03-26 10:58 UTC (permalink / raw)
  To: Puranjay Mohan
  Cc: Lai Jiangshan, Mark Rutland, Catalin Marinas, Paul E. McKenney,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, rcu,
	linux-arm-kernel, linux-kernel
In-Reply-To: <20260326102608.1855088-1-puranjay@kernel.org>

On Thu, Mar 26, 2026 at 03:26:07AM -0700, Puranjay Mohan wrote:
> On architectures like arm64, this_cpu_inc() wraps the underlying atomic
> instruction (ldadd) with preempt_disable/enable to prevent migration
> between the per-CPU address calculation and the atomic operation.
> However, SRCU does not need this protection because it sums counters
> across all CPUs for grace-period detection, so operating on a "stale"
> CPU's counter after migration is harmless.
> 
> This commit therefore introduces srcu_percpu_counter_inc(), which
> consolidates the SRCU-fast reader counter updates into a single helper,
> replacing the if/else dispatch between this_cpu_inc() and
> atomic_long_inc(raw_cpu_ptr(...)) that was previously open-coded at
> each call site.
> 
> On arm64, this helper uses atomic_long_fetch_add_relaxed(), which
> compiles to the value-returning ldadd instruction.  This is preferred
> over atomic_long_inc()'s non-value-returning stadd because ldadd is
> resolved in L1 cache whereas stadd may be resolved further out in the
> memory hierarchy [1].
> 
> On x86, where this_cpu_inc() compiles to a single "incl %gs:offset"
> instruction with no preempt wrappers, the helper falls through to
> this_cpu_inc(), so there is no change.  Architectures with
> NEED_SRCU_NMI_SAFE continue to use atomic_long_inc(raw_cpu_ptr(...)),
> again with no change.  All remaining architectures also use the
> this_cpu_inc() path, again with no change.
> 
> refscale measurements on a 72-CPU arm64 Neoverse-V2 system show ~11%
> improvement in SRCU-fast reader duration:
> 
>   Unpatched: median 9.273 ns, avg 9.319 ns (min 9.219, max 9.853)
>     Patched: median 8.275 ns, avg 8.411 ns (min 8.186, max 9.183)
> 
>   Command: kvm.sh --torture refscale --duration 1 --cpus 72 \
>            --configs NOPREEMPT --trust-make --bootargs \
>            "refscale.scale_type=srcu-fast refscale.nreaders=72 \
>            refscale.nruns=100"
> 
> [1] https://lore.kernel.org/r/e7d539ed-ced0-4b96-8ecd-048a5b803b85@paulmck-laptop
> 
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
> ---
>  include/linux/srcutree.h | 51 +++++++++++++++++++++++++++-------------
>  1 file changed, 35 insertions(+), 16 deletions(-)
> 
> diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> index fd1a9270cb9a..4ff18de3edfd 100644
> --- a/include/linux/srcutree.h
> +++ b/include/linux/srcutree.h
> @@ -286,15 +286,43 @@ static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ss
>   * on architectures that support NMIs but do not supply NMI-safe
>   * implementations of this_cpu_inc().
>   */
> +
> +/*
> + * Atomically increment a per-CPU SRCU counter.
> + *
> + * On most architectures, this_cpu_inc() is optimal (e.g., on x86 it is
> + * a single "incl %gs:offset" instruction).  However, on architectures
> + * like arm64, s390, and loongarch, this_cpu_inc() wraps the underlying
> + * atomic instruction with preempt_disable/enable to prevent migration
> + * between the per-CPU address calculation and the atomic operation.
> + * SRCU does not need this protection because it sums counters across
> + * all CPUs for grace-period detection, so operating on a "stale" CPU's
> + * counter after migration is harmless.
> + *
> + * On arm64, use atomic_long_fetch_add_relaxed() which compiles to the
> + * value-returning ldadd instruction instead of atomic_long_inc()'s
> + * non-value-returning stadd, because ldadd is resolved in L1 cache
> + * whereas stadd may be resolved further out in the memory hierarchy.
> + * https://lore.kernel.org/r/e7d539ed-ced0-4b96-8ecd-048a5b803b85@paulmck-laptop
> + */
> +static __always_inline void
> +srcu_percpu_counter_inc(atomic_long_t __percpu *v)
> +{
> +#ifdef CONFIG_ARM64
> +	(void)atomic_long_fetch_add_relaxed(1, raw_cpu_ptr(v));
> +#elif IS_ENABLED(CONFIG_NEED_SRCU_NMI_SAFE)
> +	atomic_long_inc(raw_cpu_ptr(v));
> +#else
> +	this_cpu_inc(v->counter);
> +#endif
> +}

No, this is a hack. arm64 shouldn't be treated specially here.

The ldadd issue was already fixed properly in
git.kernel.org/linus/535fdfc5a2285. If you want to improve our preempt
disable/enable code or add helpers that don't require that, then patches
are welcome, but bodging random callers with arch-specific code for a
micro-benchmark is completely the wrong approach.

Will



* Re: [PATCH v4 21/21] mm: on remap assert that input range within the proposed VMA
From: Vlastimil Babka (SUSE) @ 2026-03-26 10:46 UTC (permalink / raw)
  To: Lorenzo Stoakes (Oracle), Andrew Morton
  Cc: Jonathan Corbet, Clemens Ladisch, Arnd Bergmann,
	Greg Kroah-Hartman, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li, Alexander Shishkin, Maxime Coquelin,
	Alexandre Torgue, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Bodo Stroesser, Martin K . Petersen,
	David Howells, Marc Dionne, Alexander Viro, Christian Brauner,
	Jan Kara, David Hildenbrand, Liam R . Howlett, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Jann Horn, Pedro Falcato,
	linux-kernel, linux-doc, linux-hyperv, linux-stm32,
	linux-arm-kernel, linux-mtd, linux-staging, linux-scsi,
	target-devel, linux-afs, linux-fsdevel, linux-mm, Ryan Roberts
In-Reply-To: <0fc1092f4b74f3f673a58e4e3942dc83f336dd85.1774045440.git.ljs@kernel.org>

On 3/20/26 23:39, Lorenzo Stoakes (Oracle) wrote:
> Now we have range_in_vma_desc(), update remap_pfn_range_prepare() to check
> whether the input range is contained within the specified VMA, so we can
> fail at prepare time if an invalid range is specified.
> 
> This also covers the I/O remap mmap actions, which ultimately call into
> this function; other mmap action types either already span the full
> VMA or check this already.
> 
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>

Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>

> ---
>  mm/memory.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 53ef8ef3d04a..68cc592ff0ba 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3142,6 +3142,9 @@ int remap_pfn_range_prepare(struct vm_area_desc *desc)
>  	const bool is_cow = vma_desc_is_cow_mapping(desc);
>  	int err;
>  
> +	if (!range_in_vma_desc(desc, start, end))
> +		return -EFAULT;
> +
>  	err = get_remap_pgoff(is_cow, start, end, desc->start, desc->end, pfn,
>  			      &desc->pgoff);
>  	if (err)




* Re: [PATCH v4 20/21] mm: add mmap_action_map_kernel_pages[_full]()
From: Vlastimil Babka (SUSE) @ 2026-03-26 10:44 UTC (permalink / raw)
  To: Lorenzo Stoakes (Oracle), Andrew Morton
  Cc: Jonathan Corbet, Clemens Ladisch, Arnd Bergmann,
	Greg Kroah-Hartman, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li, Alexander Shishkin, Maxime Coquelin,
	Alexandre Torgue, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Bodo Stroesser, Martin K . Petersen,
	David Howells, Marc Dionne, Alexander Viro, Christian Brauner,
	Jan Kara, David Hildenbrand, Liam R . Howlett, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Jann Horn, Pedro Falcato,
	linux-kernel, linux-doc, linux-hyperv, linux-stm32,
	linux-arm-kernel, linux-mtd, linux-staging, linux-scsi,
	target-devel, linux-afs, linux-fsdevel, linux-mm, Ryan Roberts
In-Reply-To: <926ac961690d856e67ec847bee2370ab3c6b9046.1774045440.git.ljs@kernel.org>

On 3/20/26 23:39, Lorenzo Stoakes (Oracle) wrote:
> A user can invoke mmap_action_map_kernel_pages() to specify that the
> mapping should map a given number of kernel pages, specified in an array,
> starting from desc->start.
> 
> In order to implement this, adjust mmap_action_prepare() to be able to
> return an error code, as it makes sense to assert that the specified
> parameters are valid as quickly as possible as well as updating the VMA
> flags to include VMA_MIXEDMAP_BIT as necessary.
> 
> This provides an mmap_prepare equivalent of vm_insert_pages().  We
> additionally update the existing vm_insert_pages() code to use
> range_in_vma() and add a new range_in_vma_desc() helper function for the
> mmap_prepare case, sharing the code between the two in range_is_subset().
> 
> We add both mmap_action_map_kernel_pages() and
> mmap_action_map_kernel_pages_full() to allow for both partial and full VMA
> mappings.
> 
> We update the documentation to reflect the new features.
> 
> Finally, we update the VMA tests accordingly to reflect the changes.
> 
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>

Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
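
For readers following the range-check discussion, here is a minimal userspace sketch of a "range is a subset" test of the kind the quoted commit message attributes to range_is_subset(). The struct and function names prefixed with demo_ are hypothetical and are not the kernel's definitions; the half-open [start, end) convention and the rejection of empty or inverted ranges are assumptions of this sketch, and the -EFAULT return simply mirrors the remap_pfn_range_prepare() hunk in patch 21/21.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the range carried by a VMA descriptor: [start, end). */
struct demo_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Return true if [start, end) lies entirely within outer.
 * Assumption of this sketch: empty or inverted ranges are rejected.
 */
static inline bool demo_range_is_subset(const struct demo_range *outer,
					unsigned long start, unsigned long end)
{
	return start < end && outer->start <= start && end <= outer->end;
}

/* Fail early, at "prepare" time, if the requested range is not contained. */
static inline int demo_remap_prepare(const struct demo_range *desc,
				     unsigned long start, unsigned long end)
{
	if (!demo_range_is_subset(desc, start, end))
		return -EFAULT;
	return 0;
}
```

With a descriptor covering 0x1000..0x3000, a request for 0x2000..0x4000 would be rejected before any mapping work is attempted, which is the point of checking at prepare time.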




* Re: [PATCH] perf/arm-cmn: Fix resource_size_t printk specifier in arm_cmn_init_dtc()
From: Robin Murphy @ 2026-03-26 10:40 UTC (permalink / raw)
  To: Nathan Chancellor, Will Deacon, Mark Rutland, Ilkka Koskinen
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel
In-Reply-To: <20260325-perf-arm-cmn-fix-resource_size_t-format-v1-1-e84d52ee3e81@kernel.org>

On 2026-03-26 2:19 am, Nathan Chancellor wrote:
> When building for 32-bit ARM, there is a warning when using the %llx
> specifier to print a resource_size_t variable:
> 
>    drivers/perf/arm-cmn.c: In function 'arm_cmn_init_dtc':
>    drivers/perf/arm-cmn.c:2149:73: error: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t' {aka 'unsigned int'} [-Werror=format=]
>     2149 |                                      "Failed to request DTC region 0x%llx\n", base);
>          |                                                                      ~~~^     ~~~~
>          |                                                                         |     |
>          |                                                                         |     resource_size_t {aka unsigned int}
>          |                                                                         long long unsigned int
>          |                                                                      %x
> 
> Use the %pa specifier to handle the possible sizes of phys_addr_t
> properly. This requires passing the variable by reference.

Cheers Nathan! I had seen the kbuild robot reports last night, and was 
going to get to this today, but I'm more than happy to be beaten to it!

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Fixes: 5394396ff548 ("perf/arm-cmn: Stop claiming entire iomem region")
> Signed-off-by: Nathan Chancellor <nathan@kernel.org>
> ---
>   drivers/perf/arm-cmn.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
> index 1ac91cda6780..5c727c2abaf0 100644
> --- a/drivers/perf/arm-cmn.c
> +++ b/drivers/perf/arm-cmn.c
> @@ -2146,7 +2146,7 @@ static int arm_cmn_init_dtc(struct arm_cmn *cmn, struct arm_cmn_node *dn, int id
>   	size = cmn->part == PART_CMN600 ? SZ_16K : SZ_64K;
>   	if (!devm_request_mem_region(cmn->dev, base, size, dev_name(cmn->dev)))
>   		return dev_err_probe(cmn->dev, -EBUSY,
> -				     "Failed to request DTC region 0x%llx\n", base);
> +				     "Failed to request DTC region 0x%pa\n", &base);
>   
>   	writel_relaxed(CMN_DT_DTC_CTL_DT_EN, dtc->base + CMN_DT_DTC_CTL);
>   	writel_relaxed(CMN_DT_PMCR_PMU_EN | CMN_DT_PMCR_OVFL_INTR_EN, CMN_DT_PMCR(dtc));
> 
> ---
> base-commit: 2f89b7f78c50ca973ca035ceb30426f78d9e0996
> change-id: 20260325-perf-arm-cmn-fix-resource_size_t-format-b01795e36e60
> 
> Best regards,
> --
> Nathan Chancellor <nathan@kernel.org>
> 
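
The warning quoted above comes from a format specifier whose expected width does not match a type that changes size between builds. The kernel's %pa specifier is not available outside the kernel, so the following is only a userspace analogue using a different technique: the value is passed by reference, as %pa requires, and then widened to a fixed 64-bit type before printing. The names demo_resource_size_t and demo_format_region are hypothetical, and the 32-bit typedef is an assumption standing in for a 32-bit ARM build.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: models a build in which the resource size type is 32 bits wide. */
typedef uint32_t demo_resource_size_t;

/*
 * Format the message independently of the width of demo_resource_size_t.
 * The value is taken by pointer and widened to uint64_t, so the same format
 * string stays correct if the typedef above becomes 64 bits wide.
 */
static int demo_format_region(char *buf, size_t len,
			      const demo_resource_size_t *base)
{
	return snprintf(buf, len, "Failed to request DTC region 0x%" PRIx64,
			(uint64_t)*base);
}
```

Changing the typedef to uint64_t leaves demo_format_region() untouched, which is the property the patch is after on the kernel side.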




* RE: [EXT] Re: [PATCH v4 3/4] clocksource/drivers/timer-mediatek: Convert timer-mediatek to a loadable module
From: Zhipeng Wang @ 2026-03-26 10:34 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: daniel.lezcano@linaro.org, tglx@kernel.org, shawnguo@kernel.org,
	s.hauer@pengutronix.de, kernel@pengutronix.de, festevam@gmail.com,
	matthias.bgg@gmail.com, angelogioacchino.delregno@collabora.com,
	linux-kernel@vger.kernel.org, imx@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org, chun-hung.wu@mediatek.com,
	walter.chang@mediatek.com, jstultz@google.com,
	amergnat@baylibre.com, Aisheng Dong, Jindong Yue, Xuegang Liu,
	Greg Kroah-Hartman
In-Reply-To: <28a09389-9fdf-49d8-84a6-4e68c40b5224@oss.qualcomm.com>



> -----Original Message-----
> From: Daniel Lezcano <daniel.lezcano@oss.qualcomm.com>
> Sent: March 25, 2026 22:43
> To: Zhipeng Wang <zhipeng.wang_1@nxp.com>
> Cc: daniel.lezcano@linaro.org; tglx@kernel.org; shawnguo@kernel.org;
> s.hauer@pengutronix.de; kernel@pengutronix.de; festevam@gmail.com;
> matthias.bgg@gmail.com; angelogioacchino.delregno@collabora.com;
> linux-kernel@vger.kernel.org; imx@lists.linux.dev;
> linux-arm-kernel@lists.infradead.org; linux-mediatek@lists.infradead.org;
> chun-hung.wu@mediatek.com; walter.chang@mediatek.com;
> jstultz@google.com; amergnat@baylibre.com; Aisheng Dong
> <aisheng.dong@nxp.com>; Jindong Yue <jindong.yue@nxp.com>; Xuegang Liu
> <xuegang.liu@nxp.com>; Greg Kroah-Hartman <gregkh@google.com>
> Subject: Re: [EXT] Re: [PATCH v4 3/4] clocksource/drivers/timer-mediatek:
> Convert timer-mediatek to a loadable module
> 
> Hi Zhipeng,
> 
> On 3/10/26 09:41, Zhipeng Wang wrote:
> >>
> >>
> >> Hi Zhipeng,
> >>
> >> On 3/9/26 06:31, Zhipeng Wang wrote:
> >>> Hello Daniel,
> >>>
> >>> I'd be very happy to collaborate on this!
> >>
> >> Great, let me see if I can cook a patch in the next days
> >>
> >>> My availability: I can dedicate time to work on this over the next few
> weeks.
> >> I'm happy to help with:
> >>>      - Testing the new macros with IMX timer drivers
> >>>      - Converting existing drivers as examples
> >>>      - Reviewing and testing patches
> >>>      - Documentation
> >>
> >> That's awesome, thanks
> >>
> >>> My understanding is that, based on your RFC, we should use two
> >>> macros —
> >> TIMER_OF_DECLARE_PDEV and TIMER_OF_DECLARE_PLATFORM_DRIVER.
> >>
> >> Yes, but also sort out the existing TIMER_OF_DECLARE macro vs MODULE
> >> in order to prevent #ifdef MODULE in the drivers
> >>
> > Hi Daniel,
> >
> > Yes, that's our goal.
> >
> > I'll test the new macros (TIMER_OF_DECLARE_PLATFORM_DRIVER and
> > TIMER_OF_DECLARE_EARLY_PLATFORM_DRIVER) with the IMX timer drivers
> > once the patches are available.
> 
> I think I have an idea on how to achieve that. It will result in the removal of
> TIMER_OF_DECLARE() once all drivers have been changed to use the new macro.
> 
> The MODULE macro is defined when the driver is compiled as a module.
> 
> So we can do something like:
> 
> #ifdef MODULE
> 
I think there might be a typo here - should this be "#ifndef MODULE" instead?

> #define TIMER_OF_DECLARE_PDEV(name, compat, data, fn) \
>          OF_DECLARE_1_RET(timer_pdev, name, compat, data, fn)
> 
> 
> #else
> 
> #define TIMER_OF_DECLARE_PDEV(__name, compat, data, fn) \
>          OF_DECLARE_1_RET(of_pdev_timer_match_table,
>                         __name, compat, data, fn)
> 
> static struct platform_driver __##__name##_timer_driver = {
>          .probe = __##__name##_timer_probe,
>          .driver = {
>                  .name = name,
>                  .of_match_table = of_pdev_timer_match_table,
>          },
> };
> module_platform_driver(__##__name##_timer_driver);
> 
> #endif
> 
> So we deal with two tables, one for non-module platform devices and one
> for modules.
> 
> The first one is called by the timer-of init routine. The other one is called by the
> probe function.
> 
> The drawback is that the match table will be common to all timer drivers, so
> probe will be a bit slower. Maybe there is room for optimization here.

This doesn't support EPROBE_DEFER for built-in drivers, correct?


BRs,
Zhipeng



* Re: [PATCH 1/2] dt-bindings: perf: marvell: Document CN20K DDR PMU
From: Rob Herring (Arm) @ 2026-03-26 10:33 UTC (permalink / raw)
  To: Geetha sowjanya
  Cc: mark.rutland, will, linux-arm-kernel, linux-perf-users,
	devicetree, krzk+dt, linux-kernel
In-Reply-To: <20260326090645.22590-2-gakula@marvell.com>


On Thu, 26 Mar 2026 14:36:44 +0530, Geetha sowjanya wrote:
> Add a devicetree binding for the Marvell CN20K DDR performance
> monitor block, including the marvell,cn20k-ddr-pmu compatible
> string and the required MMIO reg region.
> 
> Signed-off-by: Geetha sowjanya <gakula@marvell.com>
> ---
>  .../bindings/perf/marvell-cn20k-ddr.yaml      | 37 +++++++++++++++++++
>  1 file changed, 37 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/perf/marvell-cn20k-ddr.yaml
> 

My bot found errors running 'make dt_binding_check' on your patch:

yamllint warnings/errors:

dtschema/dtc warnings/errors:
Documentation/devicetree/bindings/perf/marvell-cn20k-ddr.example.dts:22.21-25.15: Warning (unit_address_vs_reg): /example-0/bus/ddrcpmu: node has a reg or ranges property, but no unit name

doc reference errors (make refcheckdocs):

See https://patchwork.kernel.org/project/devicetree/patch/20260326090645.22590-2-gakula@marvell.com

The base for the series is generally the latest rc1. A different dependency
should be noted in *this* patch.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit after running the above command yourself. Note
that DT_SCHEMA_FILES can be set to your schema file to speed up checking
your schema. However, it must be unset to test all examples with your schema.




* [PATCH] ARM: dts: aspeed: g6: Add PWM/Tach controller node
From: Billy Tsai @ 2026-03-26 10:29 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Joel Stanley,
	Andrew Jeffery
  Cc: devicetree, linux-arm-kernel, linux-aspeed, linux-kernel,
	Billy Tsai

Introduce a device tree node for the AST2600 PWM/Tach controller.
Describe register range, clock, reset, and cell configuration.
Set status to "disabled" by default.

Prepares for enabling PWM and tachometer support on platforms
utilizing this SoC.

Signed-off-by: Billy Tsai <billy_tsai@aspeedtech.com>
---
 arch/arm/boot/dts/aspeed/aspeed-g6.dtsi | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/aspeed/aspeed-g6.dtsi b/arch/arm/boot/dts/aspeed/aspeed-g6.dtsi
index 189bc3bbb47c..818d486b94ac 100644
--- a/arch/arm/boot/dts/aspeed/aspeed-g6.dtsi
+++ b/arch/arm/boot/dts/aspeed/aspeed-g6.dtsi
@@ -102,6 +102,15 @@ ahbc: bus@1e600000 {
 			reg = <0x1e600000 0x100>;
 		};
 
+		pwm_tach: pwm-tach-controller@1e610000 {
+			compatible = "aspeed,ast2600-pwm-tach";
+			reg = <0x1e610000 0x100>;
+			clocks = <&syscon ASPEED_CLK_AHB>;
+			resets = <&syscon ASPEED_RESET_PWM>;
+			#pwm-cells = <3>;
+			status = "disabled";
+		};
+
 		fmc: spi@1e620000 {
 			reg = <0x1e620000 0xc4>, <0x20000000 0x10000000>;
 			#address-cells = <1>;

---
base-commit: 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f
change-id: 20260326-g6-dtsi-9ee3d920bc0c

Best regards,
-- 
Billy Tsai <billy_tsai@aspeedtech.com>




* [PATCH] srcu: Optimize SRCU-fast per-CPU counter increments on arm64
From: Puranjay Mohan @ 2026-03-26 10:26 UTC (permalink / raw)
  To: Lai Jiangshan, Will Deacon, Mark Rutland, Catalin Marinas,
	Paul E. McKenney, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, rcu, linux-arm-kernel, linux-kernel
  Cc: Puranjay Mohan

On architectures like arm64, this_cpu_inc() wraps the underlying atomic
instruction (ldadd) with preempt_disable/enable to prevent migration
between the per-CPU address calculation and the atomic operation.
However, SRCU does not need this protection because it sums counters
across all CPUs for grace-period detection, so operating on a "stale"
CPU's counter after migration is harmless.

This commit therefore introduces srcu_percpu_counter_inc(), which
consolidates the SRCU-fast reader counter updates into a single helper,
replacing the if/else dispatch between this_cpu_inc() and
atomic_long_inc(raw_cpu_ptr(...)) that was previously open-coded at
each call site.

On arm64, this helper uses atomic_long_fetch_add_relaxed(), which
compiles to the value-returning ldadd instruction.  This is preferred
over atomic_long_inc()'s non-value-returning stadd because ldadd is
resolved in L1 cache whereas stadd may be resolved further out in the
memory hierarchy [1].

On x86, where this_cpu_inc() compiles to a single "incl %gs:offset"
instruction with no preempt wrappers, the helper falls through to
this_cpu_inc(), so there is no change.  Architectures with
NEED_SRCU_NMI_SAFE continue to use atomic_long_inc(raw_cpu_ptr(...)),
again with no change.  All remaining architectures also use the
this_cpu_inc() path, again with no change.

refscale measurements on a 72-CPU arm64 Neoverse-V2 system show ~11%
improvement in SRCU-fast reader duration:

  Unpatched: median 9.273 ns, avg 9.319 ns (min 9.219, max 9.853)
    Patched: median 8.275 ns, avg 8.411 ns (min 8.186, max 9.183)

  Command: kvm.sh --torture refscale --duration 1 --cpus 72 \
           --configs NOPREEMPT --trust-make --bootargs \
           "refscale.scale_type=srcu-fast refscale.nreaders=72 \
           refscale.nruns=100"

[1] https://lore.kernel.org/r/e7d539ed-ced0-4b96-8ecd-048a5b803b85@paulmck-laptop

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
---
 include/linux/srcutree.h | 51 +++++++++++++++++++++++++++-------------
 1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index fd1a9270cb9a..4ff18de3edfd 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -286,15 +286,43 @@ static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ss
  * on architectures that support NMIs but do not supply NMI-safe
  * implementations of this_cpu_inc().
  */
+
+/*
+ * Atomically increment a per-CPU SRCU counter.
+ *
+ * On most architectures, this_cpu_inc() is optimal (e.g., on x86 it is
+ * a single "incl %gs:offset" instruction).  However, on architectures
+ * like arm64, s390, and loongarch, this_cpu_inc() wraps the underlying
+ * atomic instruction with preempt_disable/enable to prevent migration
+ * between the per-CPU address calculation and the atomic operation.
+ * SRCU does not need this protection because it sums counters across
+ * all CPUs for grace-period detection, so operating on a "stale" CPU's
+ * counter after migration is harmless.
+ *
+ * On arm64, use atomic_long_fetch_add_relaxed() which compiles to the
+ * value-returning ldadd instruction instead of atomic_long_inc()'s
+ * non-value-returning stadd, because ldadd is resolved in L1 cache
+ * whereas stadd may be resolved further out in the memory hierarchy.
+ * https://lore.kernel.org/r/e7d539ed-ced0-4b96-8ecd-048a5b803b85@paulmck-laptop
+ */
+static __always_inline void
+srcu_percpu_counter_inc(atomic_long_t __percpu *v)
+{
+#ifdef CONFIG_ARM64
+	(void)atomic_long_fetch_add_relaxed(1, raw_cpu_ptr(v));
+#elif IS_ENABLED(CONFIG_NEED_SRCU_NMI_SAFE)
+	atomic_long_inc(raw_cpu_ptr(v));
+#else
+	this_cpu_inc(v->counter);
+#endif
+}
+
 static inline struct srcu_ctr __percpu notrace *__srcu_read_lock_fast(struct srcu_struct *ssp)
 	__acquires_shared(ssp)
 {
 	struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
 
-	if (!IS_ENABLED(CONFIG_NEED_SRCU_NMI_SAFE))
-		this_cpu_inc(scp->srcu_locks.counter); // Y, and implicit RCU reader.
-	else
-		atomic_long_inc(raw_cpu_ptr(&scp->srcu_locks));  // Y, and implicit RCU reader.
+	srcu_percpu_counter_inc(&scp->srcu_locks); // Y, and implicit RCU reader.
 	barrier(); /* Avoid leaking the critical section. */
 	__acquire_shared(ssp);
 	return scp;
@@ -315,10 +343,7 @@ __srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
 {
 	__release_shared(ssp);
 	barrier();  /* Avoid leaking the critical section. */
-	if (!IS_ENABLED(CONFIG_NEED_SRCU_NMI_SAFE))
-		this_cpu_inc(scp->srcu_unlocks.counter);  // Z, and implicit RCU reader.
-	else
-		atomic_long_inc(raw_cpu_ptr(&scp->srcu_unlocks));  // Z, and implicit RCU reader.
+	srcu_percpu_counter_inc(&scp->srcu_unlocks);  // Z, and implicit RCU reader.
 }
 
 /*
@@ -335,10 +360,7 @@ struct srcu_ctr __percpu notrace *__srcu_read_lock_fast_updown(struct srcu_struc
 {
 	struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
 
-	if (!IS_ENABLED(CONFIG_NEED_SRCU_NMI_SAFE))
-		this_cpu_inc(scp->srcu_locks.counter); // Y, and implicit RCU reader.
-	else
-		atomic_long_inc(raw_cpu_ptr(&scp->srcu_locks));  // Y, and implicit RCU reader.
+	srcu_percpu_counter_inc(&scp->srcu_locks); // Y, and implicit RCU reader.
 	barrier(); /* Avoid leaking the critical section. */
 	__acquire_shared(ssp);
 	return scp;
@@ -359,10 +381,7 @@ __srcu_read_unlock_fast_updown(struct srcu_struct *ssp, struct srcu_ctr __percpu
 {
 	__release_shared(ssp);
 	barrier();  /* Avoid leaking the critical section. */
-	if (!IS_ENABLED(CONFIG_NEED_SRCU_NMI_SAFE))
-		this_cpu_inc(scp->srcu_unlocks.counter);  // Z, and implicit RCU reader.
-	else
-		atomic_long_inc(raw_cpu_ptr(&scp->srcu_unlocks));  // Z, and implicit RCU reader.
+	srcu_percpu_counter_inc(&scp->srcu_unlocks);  // Z, and implicit RCU reader.
 }
 
 void __srcu_check_read_flavor(struct srcu_struct *ssp, int read_flavor);

base-commit: 16ad40d1089c5f212d7d87babc2376284f3bf244
-- 
2.52.0
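
The commit message's argument that incrementing a "stale" CPU's counter is harmless can be illustrated with a small userspace sketch in C11 atomics. This is not kernel code and does not reproduce SRCU's actual grace-period logic, which also relies on memory ordering that is omitted here; every demo_ name and the NR_SLOTS value are hypothetical. The increment discards the result of a relaxed fetch-add, mirroring the shape of the (void)atomic_long_fetch_add_relaxed() call in the patch.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_SLOTS 4	/* hypothetical number of per-CPU slots */

/* Hypothetical stand-in for a per-CPU pair of lock/unlock counters. */
struct demo_ctr {
	atomic_long locks;
	atomic_long unlocks;
};

static struct demo_ctr demo_ctrs[NR_SLOTS];

/* Relaxed fetch-add whose return value is deliberately discarded. */
static inline void demo_counter_inc(atomic_long *v)
{
	(void)atomic_fetch_add_explicit(v, 1, memory_order_relaxed);
}

/* A reader may enter on one slot ... */
static inline void demo_read_lock(size_t slot)
{
	demo_counter_inc(&demo_ctrs[slot % NR_SLOTS].locks);
}

/* ... and leave on another, for example after being migrated. */
static inline void demo_read_unlock(size_t slot)
{
	demo_counter_inc(&demo_ctrs[slot % NR_SLOTS].unlocks);
}

/*
 * Sum locks minus unlocks over all slots.  Only the totals matter, so it
 * makes no difference which slot a given increment landed on.
 */
static long demo_readers_in_flight(void)
{
	long sum = 0;

	for (size_t i = 0; i < NR_SLOTS; i++) {
		sum += atomic_load_explicit(&demo_ctrs[i].locks,
					    memory_order_relaxed);
		sum -= atomic_load_explicit(&demo_ctrs[i].unlocks,
					    memory_order_relaxed);
	}
	return sum;
}
```

Calling demo_read_lock(0) followed by demo_read_unlock(3) leaves the summed count at zero even though the two increments hit different slots, which is the property the commit message relies on.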




* Re: [PATCH v2] ARM: tegra: paz00: configure WiFi rfkill switch through device tree
From: Bartosz Golaszewski @ 2026-03-26 10:16 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Marc Dietrich, Krzysztof Kozlowski, Rob Herring, Conor Dooley,
	Jonathan Hunter, Bartosz Golaszewski, devicetree, linux-tegra,
	linux-kernel, linux-arm-kernel, Thierry Reding
In-Reply-To: <acRtWZohqfDLbMKE@google.com>

On Thu, 26 Mar 2026 00:29:54 +0100, Dmitry Torokhov
<dmitry.torokhov@gmail.com> said:
> As of d64c732dfc9e ("net: rfkill: gpio: add DT support") rfkill-gpio
> device can be instantiated via device tree.
>
> Add the declaration there and drop board-paz00.c file and relevant
> Makefile fragments.
>
> Tested-by: Marc Dietrich <marvin24@gmx.de>
> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> ---

Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

But now I need to find another victim of my auto secondary fwnode experiments
for OF systems. :)



* Re: [PATCH v2 08/13] firmware: arm_scmi: Harden clock protocol initialization
From: Sudeep Holla @ 2026-03-26 10:16 UTC (permalink / raw)
  To: Alexander Stein
  Cc: Marek Szyprowski, Cristian Marussi, Sudeep Holla, linux-kernel,
	linux-arm-kernel, arm-scmi, linux-clk, linux-renesas-soc,
	philip.radford, james.quinlan, f.fainelli, vincent.guittot,
	etienne.carriere, peng.fan, michal.simek, dan.carpenter,
	geert+renesas, kuninori.morimoto.gx, marek.vasut+renesas
In-Reply-To: <5980695.DvuYhMxLoT@steina-w>

On Thu, Mar 26, 2026 at 09:55:18AM +0100, Alexander Stein wrote:
> Hi,
> 
> On Wednesday, 25 March 2026, 13:27:48 CET, Cristian Marussi wrote:
> > On Wed, Mar 25, 2026 at 12:02:41PM +0100, Marek Szyprowski wrote:
> > > On 10.03.2026 19:40, Cristian Marussi wrote:
> > > > Add proper error handling on failure to enumerate clocks features or
> > > > rates.
> > > >
> > > > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > > 
> > 
> > Hi Marek,
> > 
> > > This patch landed yesterday in linux-next as commit 0d8b0c8068a8 
> > > ("firmware: arm_scmi: Harden clock protocol initialization"). In my 
> > > tests I found that it causes a regression on RK3568 Odroid-M1 board 
> > > (arch/arm64/boot/dts/rockchip/rk3568-odroid-m1.dts), cpufreq and GPU 
> > > device are not probed properly:
> > > 
> > > # dmesg | grep scmi
> > > scmi_core: SCMI protocol bus registered
> > > arm-scmi arm-scmi.0.auto: Using scmi_smc_transport
> > > arm-scmi arm-scmi.0.auto: SCMI max-rx-timeout: 30ms / max-msg-size: 
> > > 104bytes / max-msg: 20
> > > scmi_protocol scmi_dev.1: Enabled polling mode TX channel - prot_id:16
> > > arm-scmi arm-scmi.0.auto: SCMI Notifications - Core Enabled.
> > > arm-scmi arm-scmi.0.auto: Malformed reply - real_sz:8 calc_sz:4  
> > > (loop_num_ret:1)
> > > arm-scmi arm-scmi.0.auto: SCMI Protocol v2.0 'rockchip:' Firmware 
> > > version 0x0
> > > arm-scmi arm-scmi.0.auto: Enabling SCMI Quirk 
> > > [quirk_clock_rates_triplet_out_of_spec]
> > > scmi-clocks scmi_dev.3: probe with driver scmi-clocks failed with error -22
> > > 
> > 
> > Yes there are multiple reports of issues on this hardening, the series
> > is on hold and won't go into v7.1 as of now...it needs some basic fixes
> > and various quirks probably to address non-compliant firmwares...
> > 
> > It will be pushed to next again with a few more fixes in the coming
> > days and then we'll need to figure out how many quirks will be needed on
> > top of that and if it is acceptable at all...
> 
> Just for the records: imx95 (maybe imx94 as well) is also affected by this.
> My board doesn't boot at all, because all the clocks are provided by SCMI.
> 
> With this diff I can see it's the 'ext' clock
> -->8---
> --- a/drivers/firmware/arm_scmi/clock.c
> +++ b/drivers/firmware/arm_scmi/clock.c
> @@ -1253,8 +1253,11 @@ static int scmi_clock_protocol_init(const struct scmi_protocol_handle *ph)
>         for (clkid = 0; clkid < cinfo->num_clocks; clkid++) {
>                 cinfo->clkds[clkid].id = clkid;
>                 ret = scmi_clock_attributes_get(ph, clkid, cinfo);
> -               if (ret)
> +               if (ret) {
> +                       dev_warn(ph->dev, "scmi_clock_attributes_get failed for '%s': %d\n",
> +                                cinfo->clkds->info.name, ret);
>                         return ret;
> +               }
>  
>                 ret = scmi_clock_describe_rates_get(ph, clkid, cinfo);
>                 if (ret)
> -->8---
> > arm-scmi arm-scmi.0.auto: scmi_clock_attributes_get failed for 'ext': -2
> > scmi-clocks scmi_dev.6: probe with driver scmi-clocks failed with error -2
> 
> What's the idea of how to proceed, as apparently several platforms are
> affected?
> 

Not exactly an answer to the above question, but there is more discussion here:

https://lore.kernel.org/all/20260324-scmi-clock-fix-v1-v1-1-65c21935824b@nxp.com

-- 
Regards,
Sudeep



* Re: [RFC PATCH 0/8] xilinx: tsn: Add TSN Endpoint Ethernet MAC driver support
From: Neeli, Srinivas @ 2026-03-26 10:11 UTC (permalink / raw)
  To: Andrew Lunn, Neeli, Srinivas
  Cc: andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, Simek, Michal,
	robh@kernel.org, krzk+dt@kernel.org, conor+dt@kernel.org,
	richardcochran@gmail.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, devicetree@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, git (AMD-Xilinx)
In-Reply-To: <ca273ea9-2d6f-4c01-b243-803835d08248@amd.com>


On 3/5/2026 5:16 PM, Neeli, Srinivas wrote:
> Hi Andrew,
>
> On 2/20/2026 7:06 PM, Andrew Lunn wrote:
>> On Fri, Feb 20, 2026 at 12:59:16PM +0000, Neeli, Srinivas wrote:
>>> [AMD Official Use Only - AMD Internal Distribution Only]
>> Sorry, i'm not part of AMD...
>>
>>>> So how does the host send a frame out Port 2? Is there an extra header
>>>> on the frame sent by EndPoint, which the switch interprets?
>>>>
>>> In this RFC, I configured all switch ports in forward mode. As a
>>> result, when a frame is sent from the internal endpoint, it is
>>> flooded to both external ports.  To forward packets to a specific
>>> port instead of flooding, either static switch CAM entries need to
>>> be configured or address learning should be enabled so the switch
>>> can learn CAM entries dynamically.
>> Despite not being part of AMD, this part is important.
>>
>> I don't care about how the RFC works, i want to know how the hardware
>> works, to ensure you have the correct choice of DSA vs pure switchdev.
>>
>> Take the example of running Spanning Tree Protocol. The bridge needs
>> to send the BPDU out a specific port. What mechanism is used to do
>> that? It also needs to know which port a BPDU ingressed.
>>
>>     Andrew
>
>
> Hi Andrew,
>
> I would like to briefly share an overview of our TSN switch
> capabilities and seek your guidance on the most appropriate Linux
> framework for the driver implementation, specifically whether switchdev
> or DSA would be the better fit.
>
> TSN Switch Capabilities
> -----------------------
> Our TSN subsystem supports the following IEEE TSN clauses:
>
> IEEE 802.1Qbv – Time-Aware Shaper (scheduled traffic using gate control)
> IEEE 802.1Qbu / IEEE 802.3br – Frame preemption
> IEEE 802.1Qci – Per-Stream Filtering and Policing (PSFP), including: 
> SDU-based filtering and Meter-based policing
> IEEE 802.1CB – Frame Replication and Elimination for Reliability (FRER)
> IEEE 802.1AS / IEEE 1588 – Time synchronization (PTP / gPTP)
>
> Hardware Architecture Overview
> ------------------------------
> The switch consists of three ports:
>
> Port 0: Connected to the CPU (control/endpoint port)
> Port 1: Connected to MAC1
> Port 2: Connected to MAC2
>
> MAC1 and MAC2 are capable of transmitting and receiving PTP packets,
> with received packets stored in internal BRAM. PTP frames are not
> forwarded by the switch to the internal endpoint (EP); the MAC network
> drivers transmit and receive the PTP frames.
> The switch forwards frames based on VLAN port membership and the CAM
> entries, and it supports TSN features such as CBS, Qci (PSFP) and
> 802.1CB (FRER) through hardware configuration.
> The CPU is intended to operate purely in the control plane and is not 
> part of the forwarding data path.
>
> Thank you very much for your time and guidance. Please let us know if 
> any additional details would be helpful.
>
>
> Best regards,
> Neeli Srinivas
>
Hi Andrew,

Based on the feedback so far, I am planning to proceed with a switchdev 
based implementation for the next RFC series, as this appears to be a 
better fit with the Linux networking model.
Please let me know if you have any concerns with this approach. If this 
direction is acceptable, I will share the next version of the RFC series 
accordingly.
Thank you for your guidance.

Best regards,
Neeli Srinivas


* Re: Re: [PATCH v2 0/3] Inline helpers into Rust without full LTO
From: Alice Ryhl @ 2026-03-26 10:10 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Miguel Ojeda, a.hindborg, acourbot, akpm, anton.ivanov, bjorn3_gh,
	boqun.feng, dakr, david, gary, johannes, justinstitt,
	linux-arm-kernel, linux-kbuild, linux-kernel, linux-mm, linux-um,
	llvm, lossin, mark.rutland, mmaurer, morbo, nathan,
	nick.desaulniers+lkml, nicolas.schier, nsc, peterz, richard,
	rust-for-linux, tmgross, urezki, will
In-Reply-To: <acEP7tl8pqFA3tK8@shell.armlinux.org.uk>

On Mon, Mar 23, 2026 at 10:03:26AM +0000, Russell King (Oracle) wrote:
> On Mon, Mar 23, 2026 at 01:03:27AM +0100, Miguel Ojeda wrote:
> > On Sun, 22 Mar 2026 20:21:59 +0100 Miguel Ojeda <ojeda@kernel.org> wrote:
> > >
> > >     On the other hand, regardless of whether we fix this (and another
> > >     issue in a separate email found thanks to the UML build), we could
> > >     instead add `depends on` listing explicitly the architectures where
> > >     this is going to be actually tested. That way maintainers can decide
> > >     whether they want to support it when they are ready. Thoughts?
> > 
> > Another one for arm 32-bit:
> > 
> >       LD      .tmp_vmlinux1
> >     ld.lld: error: undefined symbol: __aeabi_read_tp
> >     >>> referenced by uaccess.rs:349 (rust/kernel/uaccess.rs:349)
> >     >>>               samples/rust/rust_misc_device.o:(<rust_misc_device::RustMiscDevice as kernel::miscdevice::MiscDevice>::ioctl) in archive vmlinux.a
> >     >>> referenced by uaccess.rs:543 (rust/kernel/uaccess.rs:543)
> >     >>>               samples/rust/rust_misc_device.o:(<rust_misc_device::RustMiscDevice as kernel::miscdevice::MiscDevice>::ioctl) in archive vmlinux.a
> >     >>> referenced by uaccess.rs:543 (rust/kernel/uaccess.rs:543)
> >     >>>               drivers/android/binder/rust_binder_main.o:(rust_binder_main::rust_binder_ioctl) in archive vmlinux.a
> >     >>> referenced 36 more times
> 
> Why is Rust generating code for userspace thread accessors for kernel
> space, where userspace threads are meaningless? This is totally wrong.
> The kernel must not reference __aeabi_read_tp().
> 
> Note: I know nothing about Rust, but I know enough to say the above is
> pointing to a fundamental issue in Rust for 32-bit ARM.

I noticed that the Makefile currently uses the arm-unknown-linux-gnueabi
target. It should probably not be a -linux target, to avoid this? Probably
it should just be armv7a-none-eabi, right? We gate HAVE_RUST on
CPU_32v7, so we should not need to consider the other variants.

Alice


* Re: [PATCH 0/4] arm64: dts: renesas: Fix missing cells and reg
From: Geert Uytterhoeven @ 2026-03-26 10:07 UTC (permalink / raw)
  To: Marek Vasut
  Cc: linux-arm-kernel, Conor Dooley, Geert Uytterhoeven,
	Krzysztof Kozlowski, Magnus Damm, Rob Herring, devicetree,
	linux-kernel, linux-renesas-soc
In-Reply-To: <20260326042411.215241-1-marek.vasut+renesas@mailbox.org>

Hi Marek,

Thanks for your series!

On Thu, 26 Mar 2026 at 05:24, Marek Vasut
<marek.vasut+renesas@mailbox.org> wrote:
> Add missing cells and reg DT property into DTOs to fix warnings like this:
>
> "
> arch/arm64/boot/dts/renesas/draak-ebisu-panel-aa104xd12.dtso:30.10-34.5: Warning (unit_address_vs_reg): /fragment@2/__overlay__/ports/port@1: node has a unit name, but no reg or ranges property
> "

All of these are dtc W=1 warnings, right?

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* [PATCH v3 3/3] drm/gem-dma: Support DRM_MODE_DUMB_KERNEL_MAP flag
From: Chen-Yu Tsai @ 2026-03-26 10:01 UTC (permalink / raw)
  To: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter
  Cc: Rob Herring, dri-devel, linux-kernel, linux-arm-kernel,
	Chen-Yu Tsai, Sasha Finkelstein, Janne Grunau, Liviu Dudau,
	Paul Kocialkowski, Neil Armstrong, Laurent Pinchart,
	Tomi Valkeinen, Kieran Bingham, Biju Das, Yannick Fertre,
	Raphael Gallais-Pou, Philippe Cornu, Jernej Skrabec,
	Dave Stevenson, Maíra Canal, Raspberry Pi Kernel Maintenance,
	Icenowy Zheng, Laurent Pinchart, Tomi Valkeinen
In-Reply-To: <20260326100248.1171828-1-wenst@chromium.org>

From: Rob Herring <robh@kernel.org>

Add support in DMA helpers to handle callers specifying
DRM_MODE_DUMB_KERNEL_MAP flag. Existing behavior is maintained with this
change. drm_gem_dma_dumb_create() always creates a kernel mapping as
before. drm_gem_dma_dumb_create_internal() lets the caller set the flags
as desired.

drm_gem_dma_dumb_create_internal() users have DRM_MODE_DUMB_KERNEL_MAP
added to preserve existing behavior.

A dumb buffer allocated from these devices can be shared (exported) to
another device. The consuming device may require the kernel mapping to
scan out the buffer to its own display. Such devices include DisplayLink
and various MIPI DBI based displays. Therefore altering the behavior
should be given much consideration.

Signed-off-by: Rob Herring <robh@kernel.org>
[wenst@chromium.org: Rebase onto renamed GEM DMA helpers]
[wenst@chromium.org: show "vaddr=(no mapping)" in drm_gem_dma_print_info()]
[wenst@chromium.org: Add DRM_MODE_DUMB_KERNEL_MAP to new drivers]
[wenst@chromium.org: Add flags field to drm_gem_dma_create_with_handle()
		     kerneldoc]
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
---
Changes since v2:
- Added back DRM_MODE_DUMB_KERNEL_MAP flag to all drivers calling
  drm_gem_dma_dumb_create_internal()
- Expanded commit message to cover display drivers needing the kernel
  mapping to do scan out

Changes since v1:
- Rebased onto renamed GEM DMA helpers
- Added check in drm_fb_dma_get_scanout_buffer() and drm_gem_dma_vmap().
- Made drm_gem_dma_print_info() show "vaddr=(no mapping)" for objects
  allocated without kernel mapping
- Dropped existing DRM_MODE_DUMB_KERNEL_MAP flag addition in various
  drivers
- Added DRM_MODE_DUMB_KERNEL_MAP flag to adp_drm_gem_dumb_create()
- Added flags field kerneldoc for drm_gem_dma_create_with_handle()

Cc: Sasha Finkelstein <fnkl.kernel@gmail.com>
Cc: Janne Grunau <j@jannau.net>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Paul Kocialkowski <paulk@sys-base.io>
Cc: Neil Armstrong <neil.armstrong@linaro.org>
Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Cc: Tomi Valkeinen <tomi.valkeinen+renesas@ideasonboard.com>
Cc: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Yannick Fertre <yannick.fertre@foss.st.com>
Cc: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com>
Cc: Philippe Cornu <philippe.cornu@foss.st.com>
Cc: Jernej Skrabec <jernej.skrabec@gmail.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Dave Stevenson <dave.stevenson@raspberrypi.com>
Cc: "Maíra Canal" <mcanal@igalia.com>
Cc: Raspberry Pi Kernel Maintenance <kernel-list@raspberrypi.com>
Cc: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
---
 drivers/gpu/drm/adp/adp_drv.c                 |  1 +
 .../gpu/drm/arm/display/komeda/komeda_kms.c   |  1 +
 drivers/gpu/drm/arm/malidp_drv.c              |  1 +
 drivers/gpu/drm/drm_fb_dma_helper.c           |  4 ++
 drivers/gpu/drm/drm_gem_dma_helper.c          | 67 ++++++++++++-------
 drivers/gpu/drm/logicvc/logicvc_drm.c         |  1 +
 drivers/gpu/drm/meson/meson_drv.c             |  1 +
 drivers/gpu/drm/renesas/rcar-du/rcar_du_kms.c |  2 +
 drivers/gpu/drm/renesas/rz-du/rzg2l_du_kms.c  |  1 +
 drivers/gpu/drm/stm/drv.c                     |  3 +-
 drivers/gpu/drm/sun4i/sun4i_drv.c             |  1 +
 drivers/gpu/drm/vc4/vc4_drv.c                 |  2 +
 drivers/gpu/drm/verisilicon/vs_drm.c          |  2 +
 drivers/gpu/drm/xlnx/zynqmp_kms.c             |  2 +
 14 files changed, 63 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/adp/adp_drv.c b/drivers/gpu/drm/adp/adp_drv.c
index 4554cf75565e..c549b44b3814 100644
--- a/drivers/gpu/drm/adp/adp_drv.c
+++ b/drivers/gpu/drm/adp/adp_drv.c
@@ -95,6 +95,7 @@ static int adp_drm_gem_dumb_create(struct drm_file *file_priv,
 {
 	args->height = ALIGN(args->height, 64);
 	args->size = args->pitch * args->height;
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
 
 	return drm_gem_dma_dumb_create_internal(file_priv, drm, args);
 }
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
index 6ed504099188..2c096ebaea33 100644
--- a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
@@ -29,6 +29,7 @@ static int komeda_gem_dma_dumb_create(struct drm_file *file,
 	struct komeda_dev *mdev = dev->dev_private;
 	u32 pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
 
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
 	args->pitch = ALIGN(pitch, mdev->chip.bus_width);
 
 	return drm_gem_dma_dumb_create_internal(file, dev, args);
diff --git a/drivers/gpu/drm/arm/malidp_drv.c b/drivers/gpu/drm/arm/malidp_drv.c
index b765f6c9eea4..5519f48a27c0 100644
--- a/drivers/gpu/drm/arm/malidp_drv.c
+++ b/drivers/gpu/drm/arm/malidp_drv.c
@@ -464,6 +464,7 @@ static int malidp_dumb_create(struct drm_file *file_priv,
 	/* allocate for the worst case scenario, i.e. rotated buffers */
 	u8 alignment = malidp_hw_get_pitch_align(malidp->dev, 1);
 
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
 	args->pitch = ALIGN(DIV_ROUND_UP(args->width * args->bpp, 8), alignment);
 
 	return drm_gem_dma_dumb_create_internal(file_priv, drm, args);
diff --git a/drivers/gpu/drm/drm_fb_dma_helper.c b/drivers/gpu/drm/drm_fb_dma_helper.c
index fd71969d2fb1..12a44accc48c 100644
--- a/drivers/gpu/drm/drm_fb_dma_helper.c
+++ b/drivers/gpu/drm/drm_fb_dma_helper.c
@@ -187,6 +187,10 @@ int drm_fb_dma_get_scanout_buffer(struct drm_plane *plane,
 	if (!dma_obj->vaddr)
 		return -ENODEV;
 
+	/* Buffer was allocated with NO_KERNEL_MAPPING */
+	if (dma_obj->dma_attrs & DMA_ATTR_NO_KERNEL_MAPPING)
+		return -ENODEV;
+
 	iosys_map_set_vaddr(&sb->map[0], dma_obj->vaddr);
 	sb->format = fb->format;
 	sb->height = fb->height;
diff --git a/drivers/gpu/drm/drm_gem_dma_helper.c b/drivers/gpu/drm/drm_gem_dma_helper.c
index 9722c9fc86f3..281fb563f061 100644
--- a/drivers/gpu/drm/drm_gem_dma_helper.c
+++ b/drivers/gpu/drm/drm_gem_dma_helper.c
@@ -116,26 +116,8 @@ __drm_gem_dma_create(struct drm_device *drm, size_t size, bool private)
 	return ERR_PTR(ret);
 }
 
-/**
- * drm_gem_dma_create - allocate an object with the given size
- * @drm: DRM device
- * @size: size of the object to allocate
- *
- * This function creates a DMA GEM object and allocates memory as backing store.
- * The allocated memory will occupy a contiguous chunk of bus address space.
- *
- * For devices that are directly connected to the memory bus then the allocated
- * memory will be physically contiguous. For devices that access through an
- * IOMMU, then the allocated memory is not expected to be physically contiguous
- * because having contiguous IOVAs is sufficient to meet a devices DMA
- * requirements.
- *
- * Returns:
- * A struct drm_gem_dma_object * on success or an ERR_PTR()-encoded negative
- * error code on failure.
- */
-struct drm_gem_dma_object *drm_gem_dma_create(struct drm_device *drm,
-					      size_t size)
+static struct drm_gem_dma_object *
+drm_gem_dma_create_flags(struct drm_device *drm, size_t size, u32 flags)
 {
 	struct drm_gem_dma_object *dma_obj;
 	int ret;
@@ -146,6 +128,9 @@ struct drm_gem_dma_object *drm_gem_dma_create(struct drm_device *drm,
 	if (IS_ERR(dma_obj))
 		return dma_obj;
 
+	if (!(flags & DRM_MODE_DUMB_KERNEL_MAP))
+		dma_obj->dma_attrs |= DMA_ATTR_NO_KERNEL_MAPPING;
+
 	if (dma_obj->map_noncoherent) {
 		dma_obj->vaddr = dma_alloc_noncoherent(drm_dev_dma_dev(drm),
 						       size,
@@ -171,6 +156,30 @@ struct drm_gem_dma_object *drm_gem_dma_create(struct drm_device *drm,
 	drm_gem_object_put(&dma_obj->base);
 	return ERR_PTR(ret);
 }
+
+/**
+ * drm_gem_dma_create - allocate an object with the given size
+ * @drm: DRM device
+ * @size: size of the object to allocate
+ *
+ * This function creates a DMA GEM object and allocates memory as backing store.
+ * The allocated memory will occupy a contiguous chunk of bus address space.
+ *
+ * For devices that are directly connected to the memory bus then the allocated
+ * memory will be physically contiguous. For devices that access through an
+ * IOMMU, then the allocated memory is not expected to be physically contiguous
+ * because having contiguous IOVAs is sufficient to meet a devices DMA
+ * requirements.
+ *
+ * Returns:
+ * A struct drm_gem_dma_object * on success or an ERR_PTR()-encoded negative
+ * error code on failure.
+ */
+struct drm_gem_dma_object *drm_gem_dma_create(struct drm_device *drm,
+					      size_t size)
+{
+	return drm_gem_dma_create_flags(drm, size, DRM_MODE_DUMB_KERNEL_MAP);
+}
 EXPORT_SYMBOL_GPL(drm_gem_dma_create);
 
 /**
@@ -179,6 +188,7 @@ EXPORT_SYMBOL_GPL(drm_gem_dma_create);
  * @file_priv: DRM file-private structure to register the handle for
  * @drm: DRM device
  * @size: size of the object to allocate
+ * @flags: DRM_MODE_DUMB_* flags if any
  * @handle: return location for the GEM handle
  *
  * This function creates a DMA GEM object, allocating a chunk of memory as
@@ -194,14 +204,14 @@ EXPORT_SYMBOL_GPL(drm_gem_dma_create);
  */
 static struct drm_gem_dma_object *
 drm_gem_dma_create_with_handle(struct drm_file *file_priv,
-			       struct drm_device *drm, size_t size,
+			       struct drm_device *drm, size_t size, u32 flags,
 			       uint32_t *handle)
 {
 	struct drm_gem_dma_object *dma_obj;
 	struct drm_gem_object *gem_obj;
 	int ret;
 
-	dma_obj = drm_gem_dma_create(drm, size);
+	dma_obj = drm_gem_dma_create_flags(drm, size, flags);
 	if (IS_ERR(dma_obj))
 		return dma_obj;
 
@@ -283,7 +293,7 @@ int drm_gem_dma_dumb_create_internal(struct drm_file *file_priv,
 		args->size = args->pitch * args->height;
 
 	dma_obj = drm_gem_dma_create_with_handle(file_priv, drm, args->size,
-						 &args->handle);
+						 args->flags, &args->handle);
 	return PTR_ERR_OR_ZERO(dma_obj);
 }
 EXPORT_SYMBOL_GPL(drm_gem_dma_dumb_create_internal);
@@ -313,12 +323,13 @@ int drm_gem_dma_dumb_create(struct drm_file *file_priv,
 	struct drm_gem_dma_object *dma_obj;
 	int ret;
 
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
 	ret = drm_mode_size_dumb(drm, args, 0, 0);
 	if (ret)
 		return ret;
 
 	dma_obj = drm_gem_dma_create_with_handle(file_priv, drm, args->size,
-						 &args->handle);
+						 args->flags, &args->handle);
 	return PTR_ERR_OR_ZERO(dma_obj);
 }
 EXPORT_SYMBOL_GPL(drm_gem_dma_dumb_create);
@@ -412,7 +423,10 @@ void drm_gem_dma_print_info(const struct drm_gem_dma_object *dma_obj,
 			    struct drm_printer *p, unsigned int indent)
 {
 	drm_printf_indent(p, indent, "dma_addr=%pad\n", &dma_obj->dma_addr);
-	drm_printf_indent(p, indent, "vaddr=%p\n", dma_obj->vaddr);
+	if (dma_obj->dma_attrs & DMA_ATTR_NO_KERNEL_MAPPING)
+		drm_printf_indent(p, indent, "vaddr=(no mapping)\n");
+	else
+		drm_printf_indent(p, indent, "vaddr=%p\n", dma_obj->vaddr);
 }
 EXPORT_SYMBOL(drm_gem_dma_print_info);
 
@@ -511,6 +525,9 @@ EXPORT_SYMBOL_GPL(drm_gem_dma_prime_import_sg_table);
 int drm_gem_dma_vmap(struct drm_gem_dma_object *dma_obj,
 		     struct iosys_map *map)
 {
+	if (dma_obj->dma_attrs & DMA_ATTR_NO_KERNEL_MAPPING)
+		return -ENOMEM;
+
 	iosys_map_set_vaddr(map, dma_obj->vaddr);
 
 	return 0;
diff --git a/drivers/gpu/drm/logicvc/logicvc_drm.c b/drivers/gpu/drm/logicvc/logicvc_drm.c
index bbebf4fc7f51..595a71163cb5 100644
--- a/drivers/gpu/drm/logicvc/logicvc_drm.c
+++ b/drivers/gpu/drm/logicvc/logicvc_drm.c
@@ -39,6 +39,7 @@ static int logicvc_drm_gem_dma_dumb_create(struct drm_file *file_priv,
 {
 	struct logicvc_drm *logicvc = logicvc_drm(drm_dev);
 
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
 	/* Stride is always fixed to its configuration value. */
 	args->pitch = logicvc->config.row_stride * DIV_ROUND_UP(args->bpp, 8);
 
diff --git a/drivers/gpu/drm/meson/meson_drv.c b/drivers/gpu/drm/meson/meson_drv.c
index 49ff9f1f16d3..9fa339d6e273 100644
--- a/drivers/gpu/drm/meson/meson_drv.c
+++ b/drivers/gpu/drm/meson/meson_drv.c
@@ -83,6 +83,7 @@ static irqreturn_t meson_irq(int irq, void *arg)
 static int meson_dumb_create(struct drm_file *file, struct drm_device *dev,
 			     struct drm_mode_create_dumb *args)
 {
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
 	/*
 	 * We need 64bytes aligned stride, and PAGE aligned size
 	 */
diff --git a/drivers/gpu/drm/renesas/rcar-du/rcar_du_kms.c b/drivers/gpu/drm/renesas/rcar-du/rcar_du_kms.c
index 60e6f43b8ab2..611fe3d4a4d8 100644
--- a/drivers/gpu/drm/renesas/rcar-du/rcar_du_kms.c
+++ b/drivers/gpu/drm/renesas/rcar-du/rcar_du_kms.c
@@ -424,6 +424,8 @@ int rcar_du_dumb_create(struct drm_file *file, struct drm_device *dev,
 	if (ret)
 		return ret;
 
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
+
 	return drm_gem_dma_dumb_create_internal(file, dev, args);
 }
 
diff --git a/drivers/gpu/drm/renesas/rz-du/rzg2l_du_kms.c b/drivers/gpu/drm/renesas/rz-du/rzg2l_du_kms.c
index 87f171145a23..359f85bd84eb 100644
--- a/drivers/gpu/drm/renesas/rz-du/rzg2l_du_kms.c
+++ b/drivers/gpu/drm/renesas/rz-du/rzg2l_du_kms.c
@@ -184,6 +184,7 @@ int rzg2l_du_dumb_create(struct drm_file *file, struct drm_device *dev,
 	unsigned int min_pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
 	unsigned int align = 16 * args->bpp / 8;
 
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
 	args->pitch = roundup(min_pitch, align);
 
 	return drm_gem_dma_dumb_create_internal(file, dev, args);
diff --git a/drivers/gpu/drm/stm/drv.c b/drivers/gpu/drm/stm/drv.c
index 56d53ac3082d..a0bc2e215adb 100644
--- a/drivers/gpu/drm/stm/drv.c
+++ b/drivers/gpu/drm/stm/drv.c
@@ -51,8 +51,9 @@ static int stm_gem_dma_dumb_create(struct drm_file *file,
 	 * in order to optimize data transfer, pitch is aligned on
 	 * 128 bytes, height is aligned on 4 bytes
 	 */
-	args->pitch = roundup(min_pitch, 128);
 	args->height = roundup(args->height, 4);
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
+	args->pitch = roundup(min_pitch, 128);
 
 	return drm_gem_dma_dumb_create_internal(file, dev, args);
 }
diff --git a/drivers/gpu/drm/sun4i/sun4i_drv.c b/drivers/gpu/drm/sun4i/sun4i_drv.c
index 8a409eee1dca..d3ff53ce2450 100644
--- a/drivers/gpu/drm/sun4i/sun4i_drv.c
+++ b/drivers/gpu/drm/sun4i/sun4i_drv.c
@@ -36,6 +36,7 @@ static int drm_sun4i_gem_dumb_create(struct drm_file *file_priv,
 				     struct drm_device *drm,
 				     struct drm_mode_create_dumb *args)
 {
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
 	/* The hardware only allows even pitches for YUV buffers. */
 	args->pitch = ALIGN(DIV_ROUND_UP(args->width * args->bpp, 8), 2);
 
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index a14ecb769461..7a04cf52f511 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -87,6 +87,8 @@ static int vc5_dumb_create(struct drm_file *file_priv,
 	if (ret)
 		return ret;
 
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
+
 	return drm_gem_dma_dumb_create_internal(file_priv, dev, args);
 }
 
diff --git a/drivers/gpu/drm/verisilicon/vs_drm.c b/drivers/gpu/drm/verisilicon/vs_drm.c
index fd259d53f49f..fe3591244d02 100644
--- a/drivers/gpu/drm/verisilicon/vs_drm.c
+++ b/drivers/gpu/drm/verisilicon/vs_drm.c
@@ -44,6 +44,8 @@ static int vs_gem_dumb_create(struct drm_file *file_priv,
 	if (ret)
 		return ret;
 
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
+
 	return drm_gem_dma_dumb_create_internal(file_priv, drm, args);
 }
 
diff --git a/drivers/gpu/drm/xlnx/zynqmp_kms.c b/drivers/gpu/drm/xlnx/zynqmp_kms.c
index 02f3a7d78cf8..aa3822b3cb08 100644
--- a/drivers/gpu/drm/xlnx/zynqmp_kms.c
+++ b/drivers/gpu/drm/xlnx/zynqmp_kms.c
@@ -371,6 +371,8 @@ static int zynqmp_dpsub_dumb_create(struct drm_file *file_priv,
 	if (ret)
 		return ret;
 
+	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
+
 	return drm_gem_dma_dumb_create_internal(file_priv, drm, args);
 }
 
-- 
2.53.0.1018.g2bb0e51243-goog



* [PATCH v3 2/3] drm/gem-dma: Use the dma_*_attr API variant
From: Chen-Yu Tsai @ 2026-03-26 10:01 UTC (permalink / raw)
  To: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter
  Cc: Rob Herring, dri-devel, linux-kernel, linux-arm-kernel,
	Chen-Yu Tsai
In-Reply-To: <20260326100248.1171828-1-wenst@chromium.org>

From: Rob Herring <robh@kernel.org>

In preparation to allow DRM DMA users to adjust the DMA_ATTR_* flags,
convert the DMA helper code to use the dma_*_attr APIs instead of the
dma_*_wc variants.

Only the DMA_ATTR_WRITE_COMBINE and DMA_ATTR_NO_WARN attributes are set
in this commit, so there's no functional change.

Update rcar_du_vsp_map_fb() to use dma_get_sgtable_attrs() instead of
dma_get_sgtable().

Also change the dma_free_wc() call in vc4_bo_purge() in the vc4 driver
to use dma_free_attrs() to match. vc4_bo is a sub-class of
drm_gem_dma_object.

Sub-classes of |struct drm_gem_dma_object| can also set additional
DMA_ATTR_* flags if they choose so. For example a sub-class could
set DMA_ATTR_FORCE_CONTIGUOUS in certain cases.

Signed-off-by: Rob Herring <robh@kernel.org>
[wenst@chromium.org: Rebase onto renamed DMA helpers]
[wenst@chromium.org: Make vc4_bo_purge() use matching dma_free_attrs()]
[wenst@chromium.org: Make rcar_du_vsp_map_fb() use dma_get_sgtable_attrs()]
[wenst@chromium.org: Expand commit message to mention that sub-classes
                     can set extra DMA_ATTR_* flags]
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>

---
Changes since v2:
- rcar-du: Change dma_get_sgtable() to dma_get_sgtable_attrs()

Changes since v1:
- Rebased onto renamed DMA helpers
- Made vc4_bo_purge() use matching dma_free_attrs()
- Expanded commit message to mention that sub-classes can set extra
  DMA_ATTR_* flags
---
 drivers/gpu/drm/drm_gem_dma_helper.c          | 26 +++++++++++--------
 drivers/gpu/drm/renesas/rcar-du/rcar_du_vsp.c |  5 ++--
 drivers/gpu/drm/vc4/vc4_bo.c                  |  2 +-
 include/drm/drm_gem_dma_helper.h              |  3 +++
 4 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_dma_helper.c b/drivers/gpu/drm/drm_gem_dma_helper.c
index 1c00a71ab3c9..9722c9fc86f3 100644
--- a/drivers/gpu/drm/drm_gem_dma_helper.c
+++ b/drivers/gpu/drm/drm_gem_dma_helper.c
@@ -108,6 +108,7 @@ __drm_gem_dma_create(struct drm_device *drm, size_t size, bool private)
 		goto error;
 	}
 
+	dma_obj->dma_attrs |= DMA_ATTR_NO_WARN | DMA_ATTR_WRITE_COMBINE;
 	return dma_obj;
 
 error:
@@ -152,9 +153,10 @@ struct drm_gem_dma_object *drm_gem_dma_create(struct drm_device *drm,
 						       DMA_TO_DEVICE,
 						       GFP_KERNEL | __GFP_NOWARN);
 	} else {
-		dma_obj->vaddr = dma_alloc_wc(drm_dev_dma_dev(drm), size,
-					      &dma_obj->dma_addr,
-					      GFP_KERNEL | __GFP_NOWARN);
+		dma_obj->vaddr = dma_alloc_attrs(drm_dev_dma_dev(drm), size,
+						 &dma_obj->dma_addr,
+						 GFP_KERNEL | __GFP_NOWARN,
+						 dma_obj->dma_attrs);
 	}
 	if (!dma_obj->vaddr) {
 		drm_dbg(drm, "failed to allocate buffer with size %zu\n",
@@ -242,9 +244,9 @@ void drm_gem_dma_free(struct drm_gem_dma_object *dma_obj)
 					     dma_obj->vaddr, dma_obj->dma_addr,
 					     DMA_TO_DEVICE);
 		else
-			dma_free_wc(drm_dev_dma_dev(gem_obj->dev),
-				    dma_obj->base.size, dma_obj->vaddr,
-				    dma_obj->dma_addr);
+			dma_free_attrs(drm_dev_dma_dev(gem_obj->dev),
+				       dma_obj->base.size, dma_obj->vaddr,
+				       dma_obj->dma_addr, dma_obj->dma_attrs);
 	}
 
 	drm_gem_object_release(gem_obj);
@@ -435,8 +437,9 @@ struct sg_table *drm_gem_dma_get_sg_table(struct drm_gem_dma_object *dma_obj)
 	if (!sgt)
 		return ERR_PTR(-ENOMEM);
 
-	ret = dma_get_sgtable(drm_dev_dma_dev(obj->dev), sgt, dma_obj->vaddr,
-			      dma_obj->dma_addr, obj->size);
+	ret = dma_get_sgtable_attrs(drm_dev_dma_dev(obj->dev), sgt,
+				    dma_obj->vaddr, dma_obj->dma_addr,
+				    obj->size, dma_obj->dma_attrs);
 	if (ret < 0)
 		goto out;
 
@@ -546,9 +549,10 @@ int drm_gem_dma_mmap(struct drm_gem_dma_object *dma_obj, struct vm_area_struct *
 				     vma, vma->vm_end - vma->vm_start,
 				     virt_to_page(dma_obj->vaddr));
 	} else {
-		ret = dma_mmap_wc(drm_dev_dma_dev(dma_obj->base.dev), vma,
-				  dma_obj->vaddr, dma_obj->dma_addr,
-				  vma->vm_end - vma->vm_start);
+		ret = dma_mmap_attrs(drm_dev_dma_dev(dma_obj->base.dev), vma,
+				     dma_obj->vaddr, dma_obj->dma_addr,
+				     vma->vm_end - vma->vm_start,
+				     dma_obj->dma_attrs);
 	}
 	if (ret)
 		drm_gem_vm_close(vma);
diff --git a/drivers/gpu/drm/renesas/rcar-du/rcar_du_vsp.c b/drivers/gpu/drm/renesas/rcar-du/rcar_du_vsp.c
index 94c22d2db197..a4896096e3bc 100644
--- a/drivers/gpu/drm/renesas/rcar-du/rcar_du_vsp.c
+++ b/drivers/gpu/drm/renesas/rcar-du/rcar_du_vsp.c
@@ -291,8 +291,9 @@ int rcar_du_vsp_map_fb(struct rcar_du_vsp *vsp, struct drm_framebuffer *fb,
 				dst = sg_next(dst);
 			}
 		} else {
-			ret = dma_get_sgtable(rcdu->dev, sgt, gem->vaddr,
-					      gem->dma_addr, gem->base.size);
+			ret = dma_get_sgtable_attrs(rcdu->dev, sgt, gem->vaddr,
+						    gem->dma_addr, gem->base.size,
+						    gem->dma_attrs);
 			if (ret)
 				goto fail;
 		}
diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
index f45ba47b4ba8..981593739a0f 100644
--- a/drivers/gpu/drm/vc4/vc4_bo.c
+++ b/drivers/gpu/drm/vc4/vc4_bo.c
@@ -304,7 +304,7 @@ static void vc4_bo_purge(struct drm_gem_object *obj)
 
 	drm_vma_node_unmap(&obj->vma_node, dev->anon_inode->i_mapping);
 
-	dma_free_wc(dev->dev, obj->size, bo->base.vaddr, bo->base.dma_addr);
+	dma_free_attrs(dev->dev, obj->size, bo->base.vaddr, bo->base.dma_addr, bo->base.dma_attrs);
 	bo->base.vaddr = NULL;
 	bo->madv = __VC4_MADV_PURGED;
 }
diff --git a/include/drm/drm_gem_dma_helper.h b/include/drm/drm_gem_dma_helper.h
index f2678e7ecb98..e815ff121e23 100644
--- a/include/drm/drm_gem_dma_helper.h
+++ b/include/drm/drm_gem_dma_helper.h
@@ -16,6 +16,8 @@ struct drm_mode_create_dumb;
  *       more than one entry but they are guaranteed to have contiguous
  *       DMA addresses.
  * @vaddr: kernel virtual address of the backing memory
+ * @dma_attrs: DMA attributes used when allocating backing memory.
+ *             Only applies to coherent memory.
  * @map_noncoherent: if true, the GEM object is backed by non-coherent memory
  */
 struct drm_gem_dma_object {
@@ -25,6 +27,7 @@ struct drm_gem_dma_object {
 
 	/* For objects with DMA memory allocated by GEM DMA */
 	void *vaddr;
+	unsigned long dma_attrs;
 
 	bool map_noncoherent;
 };
-- 
2.53.0.1018.g2bb0e51243-goog



* [PATCH v3 0/3] drm: Support DMA per allocation kernel mappings
From: Chen-Yu Tsai @ 2026-03-26 10:01 UTC (permalink / raw)
  To: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter
  Cc: Chen-Yu Tsai, dri-devel, linux-kernel, linux-arm-kernel

Hi everyone,

This is v3 of the "DMA per allocation kernel mappings" series.

Changes since v2:
- Dropped rcar-du patch in favor of just using dma_get_sgtable_attrs()
  in patch 2
- Patch 1
  - Switched to drm_warn_once()
  - Moved flag definition from include/uapi/ to include/drm/drm_dumb_buffers.h
  - Reworded commit message
- Patch 2
  - Added change for rcar-du: s/dma_get_sgtable()/dma_get_sgtable_attrs()/
- Patch 3
  - Added back DRM_MODE_DUMB_KERNEL_MAP flag to all drivers calling
    drm_gem_dma_dumb_create_internal()
  - Expanded commit message to cover display drivers needing the kernel
    mapping to do scan out


This is an attempt to revive Rob Herring's "DMA per allocation kernel
mappings" [1] series from 2019. This series stacks on top of my recent
"drm/gem-dma: Support dedicated DMA device for allocation" series [2].
Many of the allocation paths are touched by both.

The 3 driver conversions from the original series are not included, as
the changes have landed in some other form.

Patch 1 adds the kernel internal DRM_MODE_DUMB_KERNEL_MAP flag.

Patch 2 adds the dma_attrs field to drm_gem_dma_object, and converts
the GEM DMA helpers to use the dma_*_attrs() variant of the DMA API.

Patch 3 adds support for DRM_MODE_DUMB_KERNEL_MAP to the GEM DMA
helpers by setting the DMA_ATTR_NO_KERNEL_MAPPING attribute for requests
without the DRM_MODE_DUMB_KERNEL_MAP flag.


All existing callers of drm_gem_dma_dumb_create_internal() will have
DRM_MODE_DUMB_KERNEL_MAP set to maintain existing behavior.
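
To make the flag handling in patches 2 and 3 concrete, here is a minimal
user-space sketch of the translation from dumb-buffer flags to DMA
attributes. It is an illustration, not the helper code itself: every
SKETCH_/sketch_ name is hypothetical, and the numeric values are stand-ins
rather than values taken from this series.

```c
#include <assert.h>

/*
 * Stand-in constants for this sketch only. The real definitions live in
 * the kernel headers; these values are assumptions for illustration.
 */
#define SKETCH_DUMB_KERNEL_MAP        (1u << 0)
#define SKETCH_ATTR_WRITE_COMBINE     (1ul << 2)
#define SKETCH_ATTR_NO_KERNEL_MAPPING (1ul << 4)
#define SKETCH_ATTR_NO_WARN           (1ul << 8)

/* Translate dumb-buffer creation flags into DMA allocation attributes. */
static unsigned long sketch_flags_to_attrs(unsigned int flags)
{
	/* Patch 2: attributes every GEM DMA object starts out with. */
	unsigned long attrs = SKETCH_ATTR_NO_WARN | SKETCH_ATTR_WRITE_COMBINE;

	/* Only one flag is modeled in this sketch. */
	assert((flags & ~SKETCH_DUMB_KERNEL_MAP) == 0);

	/* Patch 3: a request without the flag skips the kernel mapping. */
	if (!(flags & SKETCH_DUMB_KERNEL_MAP))
		attrs |= SKETCH_ATTR_NO_KERNEL_MAPPING;

	return attrs;
}
```

In this model, a driver that always passes the flag gets exactly the
attributes it had before the series, which is the "no change in behavior"
property claimed above.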

I have also started to convert the exynos driver to use the GEM DMA
helpers. I also plan on looking into the rockchip driver, but that one
has a separate IOMMU path that needs to be handled. I might copy the
approach used in the exynos driver to deal with it.


Changes compared to the original version from Rob (v1):
- Link to original v1:
  https://lore.kernel.org/dri-devel/20191021214550.1461-1-robh@kernel.org/
- Rebased onto renamed GEM DMA helpers
- New patch for rcar-du
- Patch 2
  - Make drm_mode_create_dumb_ioctl() emit warning if args->flags is not zero
- Patch 3
  - Made vc4_bo_purge() use dma_free_attrs(); this is the other location
    of DMA API used with drm_gem_dma_object outside the helpers
  - Expanded commit message
- Patch 4
  - Added kernel mapping check in drm_fb_dma_get_scanout_buffer() and
    drm_gem_dma_vmap().
  - Made drm_gem_dma_print_info() show "vaddr=(no mapping)" for objects
    allocated without kernel mapping
  - Dropped existing DRM_MODE_DUMB_KERNEL_MAP flag addition in various
    drivers
  - Added DRM_MODE_DUMB_KERNEL_MAP flag to adp_drm_gem_dumb_create()
  - Added flags field kerneldoc for drm_gem_dma_create_with_handle()

I dropped all the original Reviewed-by tags, as it's been 5 years since
the changes were first posted, and also because the code has changed a
lot.

Please have a look.


Thanks
ChenYu

Original cover letter from Rob:

This series adds support for CMA/DMA users to skip kernel mappings for
GEM allocations. The DMA API only guarantees a kernel mapping at
allocation time. Creating mappings with vmap() after allocation may or
may not work as not all allocations have a struct page. As virtual
memory space is limited on 32-bit systems some drivers will skip kernel
mappings when possible. This prevents those drivers from using CMA
helpers and the generic fbdev emulation which results in a lot of
duplicated code.

In order to distinguish between kernel and userspace allocations,
a new flag, DRM_MODE_DUMB_KERNEL_MAP, for drm_mode_create_dumb() is
introduced. This allows drivers to override the default behavior for
CMA helpers of always creating a kernel mapping.

Mediatek is converted to CMA helpers and Rockchip is converted to generic
fbdev support. I also have patches to convert Rockchip to CMA and shmem
helpers, but they need a bit more work. Exynos can also probably be
converted to use CMA helpers.

Compile tested only. I did test fbdev on Rockchip, but the h/w I have
has an IOMMU, so the CMA code path doesn't get tested.

- end quote -

[1] https://lore.kernel.org/dri-devel/20191021214550.1461-1-robh@kernel.org/
[2] https://lore.kernel.org/all/20260311094929.3393338-1-wenst@chromium.org/

Rob Herring (3):
  drm: Introduce DRM_MODE_DUMB_KERNEL_MAP flag
  drm/gem-dma: Use the dma_*_attr API variant
  drm/gem-dma: Support DRM_MODE_DUMB_KERNEL_MAP flag

 drivers/gpu/drm/adp/adp_drv.c                 |  1 +
 .../gpu/drm/arm/display/komeda/komeda_kms.c   |  1 +
 drivers/gpu/drm/arm/malidp_drv.c              |  1 +
 drivers/gpu/drm/drm_client.c                  |  2 +
 drivers/gpu/drm/drm_dumb_buffers.c            |  4 +
 drivers/gpu/drm/drm_fb_dma_helper.c           |  4 +
 drivers/gpu/drm/drm_gem_dma_helper.c          | 93 ++++++++++++-------
 drivers/gpu/drm/logicvc/logicvc_drm.c         |  1 +
 drivers/gpu/drm/meson/meson_drv.c             |  1 +
 drivers/gpu/drm/renesas/rcar-du/rcar_du_kms.c |  2 +
 drivers/gpu/drm/renesas/rcar-du/rcar_du_vsp.c |  5 +-
 drivers/gpu/drm/renesas/rz-du/rzg2l_du_kms.c  |  1 +
 drivers/gpu/drm/stm/drv.c                     |  3 +-
 drivers/gpu/drm/sun4i/sun4i_drv.c             |  1 +
 drivers/gpu/drm/vc4/vc4_bo.c                  |  2 +-
 drivers/gpu/drm/vc4/vc4_drv.c                 |  2 +
 drivers/gpu/drm/verisilicon/vs_drm.c          |  2 +
 drivers/gpu/drm/xlnx/zynqmp_kms.c             |  2 +
 include/drm/drm_dumb_buffers.h                |  3 +
 include/drm/drm_gem_dma_helper.h              |  3 +
 20 files changed, 94 insertions(+), 40 deletions(-)

-- 
2.53.0.1018.g2bb0e51243-goog




* [PATCH v3 1/3] drm: Introduce DRM_MODE_DUMB_KERNEL_MAP flag
From: Chen-Yu Tsai @ 2026-03-26 10:01 UTC (permalink / raw)
  To: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter
  Cc: Rob Herring, dri-devel, linux-kernel, linux-arm-kernel,
	Chen-Yu Tsai
In-Reply-To: <20260326100248.1171828-1-wenst@chromium.org>

From: Rob Herring <robh@kernel.org>

Introduce a new flag, DRM_MODE_DUMB_KERNEL_MAP, for struct
drm_mode_create_dumb. This flag is for internal kernel use to indicate
if dumb buffer allocation needs a kernel mapping. This is needed only for
GEM DMA where creating a kernel mapping or not has to be decided at
allocation time because creating a mapping on demand (with vmap()) is not
guaranteed to work.

Several drivers are reimplementing the GEM DMA helpers because
they distinguish between kernel and userspace allocations to create a
kernel mapping or not. Adding a flag allows migrating these drivers
to the helpers while preserving their existing behavior. These include
exynos, rockchip, and previously mediatek.

Update the callers of drm_mode_create_dumb() to set
drm_mode_create_dumb.flags to appropriate defaults. Currently, flags can
be set to anything by userspace, but is unused within the kernel. Let's
force flags to zero (no kernel mapping) for userspace callers by default.
For in-kernel clients, set DRM_MODE_DUMB_KERNEL_MAP by default. Drivers
can override this as needed.
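
As a rough model of the two entry points described above, the sketch below
shows the userspace path discarding whatever flags were passed and the
in-kernel client path requesting a mapping. It is a standalone illustration:
struct dumb_args is a reduced stand-in for struct drm_mode_create_dumb, and
the function names sanitize_user_flags() and init_client_flags() are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define DRM_MODE_DUMB_KERNEL_MAP	(1u << 0)	/* stand-in for the in-kernel flag */

/* Reduced stand-in for struct drm_mode_create_dumb; only flags matters here. */
struct dumb_args {
	uint32_t flags;
};

/*
 * Userspace entry point (hypothetical model of the ioctl path): whatever
 * userspace passed is discarded, so the allocation gets no kernel mapping.
 * Returns true if a nonzero value had to be cleared, which is the case in
 * which the real code warns once.
 */
bool sanitize_user_flags(struct dumb_args *args)
{
	bool was_set = args->flags != 0;

	args->flags = 0;
	return was_set;
}

/* In-kernel client path (hypothetical model): request a kernel mapping. */
void init_client_flags(struct dumb_args *args)
{
	args->flags = DRM_MODE_DUMB_KERNEL_MAP;
}
```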

Signed-off-by: Rob Herring <robh@kernel.org>
[wenst@chromium.org: Emit warning (once) if args->flags is not zero]
[wenst@chromium.org: Moved flag def. to include/drm/drm_dumb_buffers.h]
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
---
Changes since v2:
- Switched to drm_warn_once()
- Moved flag definition from include/uapi/ to include/drm/drm_dumb_buffers.h
- Reworded commit message

Changes since v1:
- Emit warning if args->flags is not zero
---
 drivers/gpu/drm/drm_client.c       | 2 ++
 drivers/gpu/drm/drm_dumb_buffers.c | 4 ++++
 include/drm/drm_dumb_buffers.h     | 3 +++
 3 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
index 46c465bce98c..3d3e61823cc1 100644
--- a/drivers/gpu/drm/drm_client.c
+++ b/drivers/gpu/drm/drm_client.c
@@ -14,6 +14,7 @@
 #include <drm/drm_client_event.h>
 #include <drm/drm_device.h>
 #include <drm/drm_drv.h>
+#include <drm/drm_dumb_buffers.h>
 #include <drm/drm_file.h>
 #include <drm/drm_fourcc.h>
 #include <drm/drm_framebuffer.h>
@@ -404,6 +405,7 @@ drm_client_buffer_create_dumb(struct drm_client_dev *client, u32 width, u32 heig
 	dumb_args.width = width;
 	dumb_args.height = height;
 	dumb_args.bpp = drm_format_info_bpp(info, 0);
+	dumb_args.flags = DRM_MODE_DUMB_KERNEL_MAP;
 	ret = drm_mode_create_dumb(dev, &dumb_args, client->file);
 	if (ret)
 		return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/drm_dumb_buffers.c b/drivers/gpu/drm/drm_dumb_buffers.c
index e2b62e5fb891..60f4c2d08641 100644
--- a/drivers/gpu/drm/drm_dumb_buffers.c
+++ b/drivers/gpu/drm/drm_dumb_buffers.c
@@ -233,6 +233,10 @@ int drm_mode_create_dumb_ioctl(struct drm_device *dev,
 	struct drm_mode_create_dumb *args = data;
 	int err;
 
+	if (args->flags)
+		drm_warn_once(dev, "drm_mode_create_dumb.flags is not zero.\n");
+	args->flags = 0;
+
 	err = drm_mode_create_dumb(dev, args, file_priv);
 	if (err) {
 		args->handle = 0;
diff --git a/include/drm/drm_dumb_buffers.h b/include/drm/drm_dumb_buffers.h
index 1f3a8236fb3d..4657e44533f4 100644
--- a/include/drm/drm_dumb_buffers.h
+++ b/include/drm/drm_dumb_buffers.h
@@ -6,6 +6,9 @@
 struct drm_device;
 struct drm_mode_create_dumb;
 
+/* drm_mode_create_dumb flags for internal use */
+#define DRM_MODE_DUMB_KERNEL_MAP	(1<<0)
+
 int drm_mode_size_dumb(struct drm_device *dev,
 		       struct drm_mode_create_dumb *args,
 		       unsigned long hw_pitch_align,
-- 
2.53.0.1018.g2bb0e51243-goog




* [PATCH v5 8/9] dt-bindings: net: wireless: brcm: Add compatible for bcm43752
From: Ronald Claveau @ 2026-03-26  9:59 UTC (permalink / raw)
  To: Neil Armstrong, Kevin Hilman, Jerome Brunet, Martin Blumenstingl,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Ulf Hansson,
	Johannes Berg, van Spriel
  Cc: linux-arm-kernel, linux-amlogic, devicetree, linux-kernel,
	linux-mmc, linux-wireless, Ronald Claveau, Conor Dooley
In-Reply-To: <20260326-add-emmc-t7-vim4-v5-0-d3f182b48e9d@aliel.fr>

Add bcm43752 compatible with its bcm4329 compatible fallback.

Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
---
 Documentation/devicetree/bindings/net/wireless/brcm,bcm4329-fmac.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/net/wireless/brcm,bcm4329-fmac.yaml b/Documentation/devicetree/bindings/net/wireless/brcm,bcm4329-fmac.yaml
index 3be7576787644..81fd3e37452a6 100644
--- a/Documentation/devicetree/bindings/net/wireless/brcm,bcm4329-fmac.yaml
+++ b/Documentation/devicetree/bindings/net/wireless/brcm,bcm4329-fmac.yaml
@@ -42,6 +42,7 @@ properties:
               - brcm,bcm4356-fmac
               - brcm,bcm4359-fmac
               - brcm,bcm4366-fmac
+              - brcm,bcm43752-fmac
               - cypress,cyw4373-fmac
               - cypress,cyw43012-fmac
               - infineon,cyw43439-fmac

-- 
2.49.0




* [PATCH v5 7/9] arm64: dts: amlogic: t7: khadas-vim4: Add SDIO power sequence and WiFi clock
From: Ronald Claveau @ 2026-03-26  9:59 UTC (permalink / raw)
  To: Neil Armstrong, Kevin Hilman, Jerome Brunet, Martin Blumenstingl,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Ulf Hansson,
	Johannes Berg, van Spriel
  Cc: linux-arm-kernel, linux-amlogic, devicetree, linux-kernel,
	linux-mmc, linux-wireless, Ronald Claveau
In-Reply-To: <20260326-add-emmc-t7-vim4-v5-0-d3f182b48e9d@aliel.fr>

Add the SDIO power sequence node using mmc-pwrseq-simple and a
32.768kHz PWM-based clock required by the Wi-Fi module.
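
For reference, the period value used for a 32.768 kHz clock can be derived as
sketched below. This assumes the period cell of the pwms specifier is given
in nanoseconds, as in the common PWM binding; the helper name
pwm_period_ns() is hypothetical and not part of any kernel API.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Period in nanoseconds for a given frequency in Hz, rounded to the
 * nearest integer. Hypothetical helper for illustration only.
 */
uint64_t pwm_period_ns(uint64_t freq_hz)
{
	return (NSEC_PER_SEC + freq_hz / 2) / freq_hz;
}
```

For 32768 Hz the exact period is about 30517.58 ns, which rounds to the
30518 used in the wifi32k node.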

Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
---
 .../dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts  | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm64/boot/dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts b/arch/arm64/boot/dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts
index 2450084d37642..770f06b0b16c7 100644
--- a/arch/arm64/boot/dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts
+++ b/arch/arm64/boot/dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts
@@ -67,6 +67,15 @@ sd_3v3: regulator-sdcard-3v3 {
 		regulator-always-on;
 	};
 
+	sdio_pwrseq: sdio-pwrseq {
+		compatible = "mmc-pwrseq-simple";
+		reset-gpios = <&gpio GPIOX_6 GPIO_ACTIVE_LOW>;
+		post-power-on-delay-ms = <500>;
+		power-off-delay-us = <200000>;
+		clocks = <&wifi32k>;
+		clock-names = "ext_clock";
+	};
+
 	vcc5v: regulator-vcc-5v {
 		compatible = "regulator-fixed";
 		regulator-name = "VCC5V";
@@ -135,6 +144,19 @@ vddio_c: regulator-gpio-c {
 		states = <1800000 1
 			  3300000 0>;
 	};
+
+	wifi32k: wifi32k {
+		compatible = "pwm-clock";
+		#clock-cells = <0>;
+		clock-frequency = <32768>;
+		pwms = <&pwm_ab 0 30518 0>;
+	};
+};
+
+&pwm_ab {
+	status = "okay";
+	pinctrl-0 = <&pwm_a_pins>;
+	pinctrl-names = "default";
 };
 
 &uart_a {

-- 
2.49.0




* [PATCH v5 9/9] arm64: dts: amlogic: t7: khadas-vim4: Add MMC nodes
From: Ronald Claveau @ 2026-03-26  9:59 UTC (permalink / raw)
  To: Neil Armstrong, Kevin Hilman, Jerome Brunet, Martin Blumenstingl,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Ulf Hansson,
	Johannes Berg, van Spriel
  Cc: linux-arm-kernel, linux-amlogic, devicetree, linux-kernel,
	linux-mmc, linux-wireless, Ronald Claveau
In-Reply-To: <20260326-add-emmc-t7-vim4-v5-0-d3f182b48e9d@aliel.fr>

Enable and configure the three MMC controllers for the Khadas VIM4 board:
- sd_emmc_a: SDIO interface for the BCM43752 Wi-Fi module
- sd_emmc_b: SD card slot
- sd_emmc_c: eMMC storage

Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
---
 .../dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts  | 88 ++++++++++++++++++++++
 1 file changed, 88 insertions(+)

diff --git a/arch/arm64/boot/dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts b/arch/arm64/boot/dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts
index 770f06b0b16c7..78d02370553cd 100644
--- a/arch/arm64/boot/dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts
+++ b/arch/arm64/boot/dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts
@@ -15,6 +15,9 @@ / {
 
 	aliases {
 		serial0 = &uart_a;
+		mmc0 = &sd_emmc_c;
+		mmc1 = &sd_emmc_b;
+		mmc2 = &sd_emmc_a;
 	};
 
 	memory@0 {
@@ -159,6 +162,91 @@ &pwm_ab {
 	pinctrl-names = "default";
 };
 
+/* SDIO */
+&sd_emmc_a {
+	status = "okay";
+	pinctrl-0 = <&sdio_pins>;
+	pinctrl-1 = <&sdio_clk_gate_pins>;
+	pinctrl-names = "default", "clk-gate";
+	#address-cells = <1>;
+	#size-cells = <0>;
+
+	bus-width = <4>;
+	cap-sd-highspeed;
+	sd-uhs-sdr12;
+	sd-uhs-sdr25;
+	sd-uhs-sdr50;
+	sd-uhs-sdr104;
+	cap-sdio-irq;
+	max-frequency = <200000000>;
+	non-removable;
+	disable-wp;
+	no-mmc;
+	no-sd;
+
+	power-domains = <&pwrc PWRC_T7_SDIO_A_ID>;
+
+	keep-power-in-suspend;
+
+	mmc-pwrseq = <&sdio_pwrseq>;
+
+	vmmc-supply = <&vddao_3v3>;
+	vqmmc-supply = <&vddao_1v8>;
+
+	brcmf: wifi@1 {
+		reg = <1>;
+		compatible = "brcm,bcm43752-fmac", "brcm,bcm4329-fmac";
+	};
+};
+
+/* SD card */
+&sd_emmc_b {
+	status = "okay";
+	pinctrl-0 = <&sdcard_pins>;
+	pinctrl-1 = <&sdcard_clk_gate_pins>;
+	pinctrl-names = "default", "clk-gate";
+
+	bus-width = <4>;
+	cap-sd-highspeed;
+	sd-uhs-sdr12;
+	sd-uhs-sdr25;
+	sd-uhs-sdr50;
+	sd-uhs-sdr104;
+	max-frequency = <200000000>;
+	disable-wp;
+	no-sdio;
+	no-mmc;
+
+	power-domains = <&pwrc PWRC_T7_SDIO_B_ID>;
+
+	cd-gpios = <&gpio GPIOC_6 GPIO_ACTIVE_LOW>;
+	vmmc-supply = <&sd_3v3>;
+	vqmmc-supply = <&vddio_c>;
+};
+
+/* eMMC */
+&sd_emmc_c {
+	status = "okay";
+	pinctrl-0 = <&emmc_ctrl_pins>, <&emmc_data_8b_pins>, <&emmc_ds_pins>;
+	pinctrl-1 = <&emmc_clk_gate_pins>;
+	pinctrl-names = "default", "clk-gate";
+
+	bus-width = <8>;
+	cap-mmc-highspeed;
+	mmc-ddr-1_8v;
+	mmc-hs200-1_8v;
+	max-frequency = <200000000>;
+	disable-wp;
+	non-removable;
+	no-sdio;
+	no-sd;
+
+	power-domains = <&pwrc PWRC_T7_EMMC_ID>;
+
+	vmmc-supply = <&vddio_3v3>;
+	vqmmc-supply = <&vddio_1v8>;
+};
+
 &uart_a {
 	status = "okay";
 	clocks = <&xtal>, <&xtal>, <&xtal>;

-- 
2.49.0




* [PATCH v5 6/9] arm64: dts: amlogic: t7: khadas-vim4: Add power regulators
From: Ronald Claveau @ 2026-03-26  9:59 UTC (permalink / raw)
  To: Neil Armstrong, Kevin Hilman, Jerome Brunet, Martin Blumenstingl,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Ulf Hansson,
	Johannes Berg, van Spriel
  Cc: linux-arm-kernel, linux-amlogic, devicetree, linux-kernel,
	linux-mmc, linux-wireless, Ronald Claveau
In-Reply-To: <20260326-add-emmc-t7-vim4-v5-0-d3f182b48e9d@aliel.fr>

Add voltage regulator nodes describing the VIM4 power tree,
required by peripheral nodes such as the SD card controller.

Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
---
 .../dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts  | 90 ++++++++++++++++++++++
 1 file changed, 90 insertions(+)

diff --git a/arch/arm64/boot/dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts b/arch/arm64/boot/dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts
index fffdab96b12eb..2450084d37642 100644
--- a/arch/arm64/boot/dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts
+++ b/arch/arm64/boot/dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts
@@ -6,6 +6,8 @@
 /dts-v1/;
 
 #include "amlogic-t7.dtsi"
+#include <dt-bindings/gpio/amlogic,t7-periphs-pinctrl.h>
+#include <dt-bindings/gpio/gpio.h>
 
 / {
 	model = "Khadas vim4";
@@ -45,6 +47,94 @@ xtal: xtal-clk {
 		#clock-cells = <0>;
 	};
 
+	dc_in: regulator-dc-in {
+		compatible = "regulator-fixed";
+		regulator-name = "DC_IN";
+		regulator-min-microvolt = <5000000>;
+		regulator-max-microvolt = <5000000>;
+		regulator-always-on;
+	};
+
+	sd_3v3: regulator-sdcard-3v3 {
+		compatible = "regulator-fixed";
+		regulator-name = "SD_3V3";
+		regulator-min-microvolt = <3300000>;
+		regulator-max-microvolt = <3300000>;
+		vin-supply = <&vddao_3v3>;
+		gpio = <&gpio GPIOD_11 GPIO_ACTIVE_LOW>;
+		regulator-boot-on;
+		enable-active-low;
+		regulator-always-on;
+	};
+
+	vcc5v: regulator-vcc-5v {
+		compatible = "regulator-fixed";
+		regulator-name = "VCC5V";
+		regulator-min-microvolt = <5000000>;
+		regulator-max-microvolt = <5000000>;
+		vin-supply = <&dc_in>;
+
+		gpio = <&gpio GPIOH_4 GPIO_ACTIVE_HIGH>;
+		enable-active-high;
+	};
+
+	vcc5v0_usb: regulator-vcc-usb {
+		compatible = "regulator-fixed";
+		regulator-name = "VCC5V0_USB";
+		regulator-min-microvolt = <5000000>;
+		regulator-max-microvolt = <5000000>;
+		vin-supply = <&vcc5v>;
+
+		gpio = <&gpio GPIOY_5 GPIO_ACTIVE_HIGH>;
+		enable-active-high;
+	};
+
+	vddao_1v8: regulator-vddao-1v8 {
+		compatible = "regulator-fixed";
+		regulator-name = "VDDAO_1V8";
+		regulator-min-microvolt = <1800000>;
+		regulator-max-microvolt = <1800000>;
+		vin-supply = <&vddao_3v3>;
+		regulator-always-on;
+	};
+
+	vddao_3v3: regulator-vddao-3v3 {
+		compatible = "regulator-fixed";
+		regulator-name = "VDDAO_3V3";
+		regulator-min-microvolt = <3300000>;
+		regulator-max-microvolt = <3300000>;
+		vin-supply = <&dc_in>;
+		regulator-always-on;
+	};
+
+	vddio_1v8: regulator-vddio-1v8 {
+		compatible = "regulator-fixed";
+		regulator-name = "VDDIO_1V8";
+		regulator-min-microvolt = <1800000>;
+		regulator-max-microvolt = <1800000>;
+		vin-supply = <&vddio_3v3>;
+		regulator-always-on;
+	};
+
+	vddio_3v3: regulator-vddio-3v3 {
+		compatible = "regulator-fixed";
+		regulator-name = "VDDIO_3V3";
+		regulator-min-microvolt = <3300000>;
+		regulator-max-microvolt = <3300000>;
+		vin-supply = <&vddao_3v3>;
+		regulator-always-on;
+	};
+
+	vddio_c: regulator-gpio-c {
+		compatible = "regulator-gpio";
+		regulator-name = "VDDIO_C";
+		regulator-min-microvolt = <1800000>;
+		regulator-max-microvolt = <3300000>;
+		vin-supply = <&vddio_3v3>;
+		gpios = <&gpio GPIOD_9 GPIO_ACTIVE_HIGH>;
+		states = <1800000 1
+			  3300000 0>;
+	};
 };
 
 &uart_a {

-- 
2.49.0




* [PATCH v5 5/9] arm64: dts: amlogic: t7: Add PWM controller nodes
From: Ronald Claveau @ 2026-03-26  9:59 UTC (permalink / raw)
  To: Neil Armstrong, Kevin Hilman, Jerome Brunet, Martin Blumenstingl,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Ulf Hansson,
	Johannes Berg, van Spriel
  Cc: linux-arm-kernel, linux-amlogic, devicetree, linux-kernel,
	linux-mmc, linux-wireless, Ronald Claveau, Nick Xie
In-Reply-To: <20260326-add-emmc-t7-vim4-v5-0-d3f182b48e9d@aliel.fr>

Add device tree nodes for the seven PWM controllers available
on the Amlogic T7 SoC, using amlogic,meson-s4-pwm as fallback compatible.
All nodes are disabled by default and should be
enabled in the board-specific DTS file.

Co-developed-by: Nick Xie <nick@khadas.com>
Signed-off-by: Nick Xie <nick@khadas.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
---
 arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi | 63 +++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi b/arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi
index b66b3d10288f6..02a303d4ec39d 100644
--- a/arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi
+++ b/arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi
@@ -511,6 +511,69 @@ sec_ao: ao-secure@10220 {
 				amlogic,has-chip-id;
 			};
 
+			pwm_ao_ef: pwm@30000 {
+				compatible = "amlogic,t7-pwm", "amlogic,meson-s4-pwm";
+				reg = <0x0 0x30000 0x0 0x24>;
+				clocks = <&clkc_periphs CLKID_PWM_AO_E>,
+					 <&clkc_periphs CLKID_PWM_AO_F>;
+				#pwm-cells = <3>;
+				status = "disabled";
+			};
+
+			pwm_ao_gh: pwm@32000 {
+				compatible = "amlogic,t7-pwm", "amlogic,meson-s4-pwm";
+				reg = <0x0 0x32000 0x0 0x24>;
+				clocks = <&clkc_periphs CLKID_PWM_AO_G>,
+					 <&clkc_periphs CLKID_PWM_AO_H>;
+				#pwm-cells = <3>;
+				status = "disabled";
+			};
+
+			pwm_ab: pwm@58000 {
+				compatible = "amlogic,t7-pwm", "amlogic,meson-s4-pwm";
+				reg = <0x0 0x58000 0x0 0x24>;
+				clocks = <&clkc_periphs CLKID_PWM_A>,
+					 <&clkc_periphs CLKID_PWM_B>;
+				#pwm-cells = <3>;
+				status = "disabled";
+			};
+
+			pwm_cd: pwm@5a000 {
+				compatible = "amlogic,t7-pwm", "amlogic,meson-s4-pwm";
+				reg = <0x0 0x5a000 0x0 0x24>;
+				clocks = <&clkc_periphs CLKID_PWM_C>,
+					 <&clkc_periphs CLKID_PWM_D>;
+				#pwm-cells = <3>;
+				status = "disabled";
+			};
+
+			pwm_ef: pwm@5c000 {
+				compatible = "amlogic,t7-pwm", "amlogic,meson-s4-pwm";
+				reg = <0x0 0x5c000 0x0 0x24>;
+				clocks = <&clkc_periphs CLKID_PWM_E>,
+					 <&clkc_periphs CLKID_PWM_F>;
+				#pwm-cells = <3>;
+				status = "disabled";
+			};
+
+			pwm_ao_ab: pwm@5e000 {
+				compatible = "amlogic,t7-pwm", "amlogic,meson-s4-pwm";
+				reg = <0x0 0x5e000 0x0 0x24>;
+				clocks = <&clkc_periphs CLKID_PWM_AO_A>,
+					 <&clkc_periphs CLKID_PWM_AO_B>;
+				#pwm-cells = <3>;
+				status = "disabled";
+			};
+
+			pwm_ao_cd: pwm@60000 {
+				compatible = "amlogic,t7-pwm", "amlogic,meson-s4-pwm";
+				reg = <0x0 0x60000 0x0 0x24>;
+				clocks = <&clkc_periphs CLKID_PWM_AO_C>,
+					 <&clkc_periphs CLKID_PWM_AO_D>;
+				#pwm-cells = <3>;
+				status = "disabled";
+			};
+
 			sd_emmc_a: mmc@88000 {
 				compatible = "amlogic,t7-mmc", "amlogic,meson-axg-mmc";
 				reg = <0x0 0x88000 0x0 0x800>;

-- 
2.49.0




* [PATCH v5 0/9] arm64: dts: amlogic: Add MMC/SD/SDIO support for Khadas VIM4 (Amlogic T7)
From: Ronald Claveau @ 2026-03-26  9:59 UTC (permalink / raw)
  To: Neil Armstrong, Kevin Hilman, Jerome Brunet, Martin Blumenstingl,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Ulf Hansson,
	Johannes Berg, van Spriel
  Cc: linux-arm-kernel, linux-amlogic, devicetree, linux-kernel,
	linux-mmc, linux-wireless, Ronald Claveau, Conor Dooley,
	Xianwei Zhao, Nick Xie

This patch series depends on Jian's SCMI clock patches, which have not been merged yet:
https://lore.kernel.org/all/20260313070022.700437-1-jian.hu@amlogic.com/

This series adds device tree support for the MMC, SD card and SDIO
interfaces on the Amlogic T7 SoC and the Khadas VIM4 board.

The first patches add the necessary building blocks in the T7 SoC
DTSI: pinctrl nodes for pin muxing, PWM controller nodes, and MMC
controller nodes. The amlogic,t7-mmc and amlogic,t7-pwm compatible
strings are introduced with fallbacks to existing drivers, avoiding
the need for new driver code.
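
Why a fallback compatible avoids new driver code can be illustrated with the
simplified matcher below. It is only a sketch: the function
first_compatible_match() is hypothetical, and the kernel's real OF matching
is more involved than a plain string comparison. The idea it shows is that a
node lists its strings from most specific to most generic, and a driver that
only knows the generic string still binds.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the first entry in the node's compatible list that
 * the driver's match table also contains, or -1 if none matches. Entries
 * are ordered most specific first, so a fallback is only picked when no
 * earlier string is known to the driver.
 */
int first_compatible_match(const char *const *node_compat, size_t n_node,
			   const char *const *drv_table, size_t n_drv)
{
	for (size_t i = 0; i < n_node; i++)
		for (size_t j = 0; j < n_drv; j++)
			if (strcmp(node_compat[i], drv_table[j]) == 0)
				return (int)i;
	return -1;
}
```

With a node listing "amlogic,t7-mmc" followed by "amlogic,meson-axg-mmc" and
a driver table that contains only the latter, the match lands on the second
entry, that is, on the fallback.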

The remaining patches enable these interfaces on the Khadas VIM4
board, including the power regulators, the SDIO power sequence and
32.768kHz PWM clock required by the BCM43752 Wi-Fi module, and the
board-specific MMC controller configurations.

Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
---
Changes in v5:
- Add missing trailers according to Rob's feedback.
- Change mux-0 to mux in pinctrl nodes for single mux. Neil's feedback.
- Move disabled status at the end of node properties. Neil's feedback.
- Restore space instead of tab in VIM4 DTS file according to Neil's feedback.
- Link to v4: https://lore.kernel.org/r/20260325-add-emmc-t7-vim4-v4-0-44c7b4a5e459@aliel.fr

Changes in v4:
- Address potential DT binding API break from Xianwei's feedback.
- Change underscore to dash in pinctrl nodes names from Xianwei's feedback.
- Link to v3: https://lore.kernel.org/r/20260323-add-emmc-t7-vim4-v3-0-5159d90a984c@aliel.fr

Changes in v3:
- Remove all changes about fixed pll clock from analog controller.
- Use clocks retrieved through SCMI.
- Add other MMC controllers
- Manage Wi-Fi module enablement.
- Link to v2: https://lore.kernel.org/r/20260218101709.35450-1-linux-kernel-dev@aliel.fr

Changes in v2:
- Resend v1 patches as attached to the first patch.
- Link to v1: https://lore.kernel.org/r/20260218101709.35450-1-linux-kernel-dev@aliel.fr

---
Ronald Claveau (9):
      arm64: dts: amlogic: t7: Add eMMC, SD card and SDIO pinctrl nodes
      dt-bindings: mmc: amlogic: Add compatible for T7 mmc
      arm64: dts: amlogic: t7: Add MMC controller nodes
      arm64: dts: amlogic: t7: Add PWM pinctrl nodes
      arm64: dts: amlogic: t7: Add PWM controller nodes
      arm64: dts: amlogic: t7: khadas-vim4: Add power regulators
      arm64: dts: amlogic: t7: khadas-vim4: Add SDIO power sequence and WiFi clock
      dt-bindings: net: wireless: brcm: Add compatible for bcm43752
      arm64: dts: amlogic: t7: khadas-vim4: Add MMC nodes

 .../bindings/mmc/amlogic,meson-gx-mmc.yaml         |   4 +
 .../bindings/net/wireless/brcm,bcm4329-fmac.yaml   |   1 +
 .../dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts  | 200 ++++++++++++
 arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi        | 336 +++++++++++++++++++++
 4 files changed, 541 insertions(+)
---
base-commit: f6eb9ae8b9fc13c3971e4a6d1e8442f253001f36
change-id: 20260320-add-emmc-t7-vim4-6ad16e94614f
prerequisite-message-id: <20260313070022.700437-1-jian.hu@amlogic.com>
prerequisite-patch-id: f03a086b4137158412b2d47b3de793b858de8dde
prerequisite-patch-id: 123970c9b29c2090440f2fd71c85d3c6fd8e36de
prerequisite-patch-id: 3e2e56b0926ba327b520f935df4ced5089bbe503

Best regards,
-- 
Ronald Claveau <linux-kernel-dev@aliel.fr>




* [PATCH v5 4/9] arm64: dts: amlogic: t7: Add PWM pinctrl nodes
From: Ronald Claveau @ 2026-03-26  9:59 UTC (permalink / raw)
  To: Neil Armstrong, Kevin Hilman, Jerome Brunet, Martin Blumenstingl,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Ulf Hansson,
	Johannes Berg, van Spriel
  Cc: linux-arm-kernel, linux-amlogic, devicetree, linux-kernel,
	linux-mmc, linux-wireless, Ronald Claveau
In-Reply-To: <20260326-add-emmc-t7-vim4-v5-0-d3f182b48e9d@aliel.fr>

These pinctrl nodes are required by the PWM drivers to configure
pin muxing at runtime.

Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
---
 arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi | 136 ++++++++++++++++++++++++++++
 1 file changed, 136 insertions(+)

diff --git a/arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi b/arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi
index fe1ced0a58967..b66b3d10288f6 100644
--- a/arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi
+++ b/arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi
@@ -307,6 +307,142 @@ mux {
 					};
 				};
 
+				pwm_a_pins: pwm-a {
+					mux {
+						groups = "pwm_a";
+						function = "pwm_a";
+						bias-disable;
+					};
+				};
+
+				pwm_ao_a_pins: pwm-ao-a {
+					mux {
+						groups = "pwm_ao_a";
+						function = "pwm_ao_a";
+						bias-disable;
+					};
+				};
+
+				pwm_ao_b_pins: pwm-ao-b {
+					mux {
+						groups = "pwm_ao_b";
+						function = "pwm_ao_b";
+						bias-disable;
+					};
+				};
+
+				pwm_ao_c_pins: pwm-ao-c {
+					mux {
+						groups = "pwm_ao_c";
+						function = "pwm_ao_c";
+						bias-disable;
+					};
+				};
+
+				pwm_ao_c_hiz_pins: pwm-ao-c-hiz {
+					mux {
+						groups = "pwm_ao_c_hiz";
+						function = "pwm_ao_c_hiz";
+						bias-disable;
+					};
+				};
+
+				pwm_ao_d_pins: pwm-ao-d {
+					mux {
+						groups = "pwm_ao_d";
+						function = "pwm_ao_d";
+						bias-disable;
+					};
+				};
+
+				pwm_ao_e_pins: pwm-ao-e {
+					mux {
+						groups = "pwm_ao_e";
+						function = "pwm_ao_e";
+						bias-disable;
+					};
+				};
+
+				pwm_ao_f_pins: pwm-ao-f {
+					mux {
+						groups = "pwm_ao_f";
+						function = "pwm_ao_f";
+						bias-disable;
+					};
+				};
+
+				pwm_ao_g_pins: pwm-ao-g {
+					mux {
+						groups = "pwm_ao_g";
+						function = "pwm_ao_g";
+						bias-disable;
+					};
+				};
+
+				pwm_ao_g_hiz_pins: pwm-ao-g-hiz {
+					mux {
+						groups = "pwm_ao_g_hiz";
+						function = "pwm_ao_g_hiz";
+						bias-disable;
+					};
+				};
+
+				pwm_ao_h_pins: pwm-ao-h {
+					mux {
+						groups = "pwm_ao_h";
+						function = "pwm_ao_h";
+						bias-disable;
+					};
+				};
+
+				pwm_b_pins: pwm-b {
+					mux {
+						groups = "pwm_b";
+						function = "pwm_b";
+						bias-disable;
+					};
+				};
+
+				pwm_c_pins: pwm-c {
+					mux {
+						groups = "pwm_c";
+						function = "pwm_c";
+						bias-disable;
+					};
+				};
+
+				pwm_d_pins: pwm-d {
+					mux {
+						groups = "pwm_d";
+						function = "pwm_d";
+						bias-disable;
+					};
+				};
+
+				pwm_e_pins: pwm-e {
+					mux {
+						groups = "pwm_e";
+						function = "pwm_e";
+						bias-disable;
+					};
+				};
+
+				pwm_f_pins: pwm-f {
+					mux {
+						groups = "pwm_f";
+						function = "pwm_f";
+						bias-disable;
+					};
+				};
+
+				pwm_vs_pins: pwm-vs {
+					mux {
+						groups = "pwm_vs";
+						function = "pwm_vs";
+						bias-disable;
+					};
+				};
+
 				sdcard_pins: sdcard {
 					mux {
 						groups = "sdcard_d0",

-- 
2.49.0




* [PATCH v5 2/9] dt-bindings: mmc: amlogic: Add compatible for T7 mmc
From: Ronald Claveau @ 2026-03-26  9:59 UTC (permalink / raw)
  To: Neil Armstrong, Kevin Hilman, Jerome Brunet, Martin Blumenstingl,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Ulf Hansson,
	Johannes Berg, van Spriel
  Cc: linux-arm-kernel, linux-amlogic, devicetree, linux-kernel,
	linux-mmc, linux-wireless, Ronald Claveau, Conor Dooley,
	Xianwei Zhao
In-Reply-To: <20260326-add-emmc-t7-vim4-v5-0-d3f182b48e9d@aliel.fr>

Add amlogic,t7-mmc compatible string, falling back to amlogic,meson-axg-mmc
as the T7 MMC controller is compatible with the AXG implementation.

Acked-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Xianwei Zhao <xianwei.zhao@amlogic.com>
Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
---
 Documentation/devicetree/bindings/mmc/amlogic,meson-gx-mmc.yaml | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/mmc/amlogic,meson-gx-mmc.yaml b/Documentation/devicetree/bindings/mmc/amlogic,meson-gx-mmc.yaml
index 57646575a13f8..976f36de2091c 100644
--- a/Documentation/devicetree/bindings/mmc/amlogic,meson-gx-mmc.yaml
+++ b/Documentation/devicetree/bindings/mmc/amlogic,meson-gx-mmc.yaml
@@ -19,6 +19,10 @@ allOf:
 properties:
   compatible:
     oneOf:
+      - items:
+          - enum:
+              - amlogic,t7-mmc
+          - const: amlogic,meson-axg-mmc
       - const: amlogic,meson-axg-mmc
       - items:
           - const: amlogic,meson-gx-mmc

-- 
2.49.0




