* Re: [PATCH 0/3] mm: split the file's i_mmap tree for NUMA
From: Mateusz Guzik @ 2026-04-13 15:33 UTC (permalink / raw)
To: Huang Shijie
Cc: akpm, viro, brauner, linux-mm, linux-kernel, linux-arm-kernel,
linux-fsdevel, muchun.song, osalvador, linux-trace-kernel,
linux-perf-users, linux-parisc, nvdimm, zhongyuan, fangbaoshun,
yingzhiwei
In-Reply-To: <20260413062042.804-1-huangsj@hygon.cn>
On Mon, Apr 13, 2026 at 02:20:39PM +0800, Huang Shijie wrote:
> In NUMA, there are maybe many NUMA nodes and many CPUs.
> For example, a Hygon's server has 12 NUMA nodes, and 384 CPUs.
> In the UnixBench tests, there is a test "execl" which tests
> the execve system call.
>
> When we test our server with "./Run -c 384 execl",
> the test result is not good enough. The i_mmap locks contended heavily on
> "libc.so" and "ld.so". For example, the i_mmap tree for "libc.so" can have
> over 6000 VMAs, all the VMAs can be in different NUMA mode.
> The insert/remove operations do not run quickly enough.
>
> patch 1 & patch 2 are try to hide the direct access of i_mmap.
> patch 3 splits the i_mmap into sibling trees, and we can get better
> performance with this patch set:
> we can get 77% performance improvement(10 times average)
>
To my reading you kept the lock as-is and only distributed the protected
state.
While I don't doubt the improvement, I'm confident should you take a
look at the profile you are going to find this still does not scale with
rwsem being one of the problems (there are other global locks, for some
of which experimental patches exist).
Apart from that this does nothing to help high core systems which are
all one node, which imo puts another question mark on this specific
proposal.
Of course one may question whether an RB tree is the right choice here;
it may be that the lock-protected cost can go way down with merely a
better data structure.
Regardless of that, for actual scalability, there will be no way around
decentralizing locking around this and partitioning per some core count
(not just by NUMA awareness).
Decentralizing locking is definitely possible, but I have not looked
into specifics of how problematic it is. Best case scenario it will
merely work with separate locks. Worst case scenario something needs a fully
stabilized state for traversal, in that case another rw lock can be
slapped around this, creating locking order read lock -> per-subset
write lock -- this will suffer scalability due to the read locking, but
it will still scale drastically better as apart from that there will be
no serialization. In this setting the problematic consumer will write
lock the new thing to stabilize the state.
So my non-maintainer opinion is that the patchset is not worth it as it
fails to address anything for significantly more common and already
affected setups.
Have you looked into splitting the lock?
* Re: [PATCH v2 RESEND 1/2] dt-bindings: phy: mediatek,xsphy: add property to set disconnect threshold
From: Conor Dooley @ 2026-04-13 15:40 UTC (permalink / raw)
To: Chunfeng Yun
Cc: Vinod Koul, AngeloGioacchino Del Regno, Neil Armstrong,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Matthias Brugger,
linux-arm-kernel, linux-mediatek, linux-phy, devicetree,
linux-kernel
In-Reply-To: <20260413122836.4848-1-chunfeng.yun@mediatek.com>
On Mon, Apr 13, 2026 at 08:28:35PM +0800, Chunfeng Yun wrote:
> Add a property to tune usb2 phy's disconnect threshold.
> And add a compatible for mt8196.
>
> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
> ---
> v2: change property name
> ---
> Documentation/devicetree/bindings/phy/mediatek,xsphy.yaml | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/phy/mediatek,xsphy.yaml b/Documentation/devicetree/bindings/phy/mediatek,xsphy.yaml
> index 0bed847bb4ad..9017a9c93eb9 100644
> --- a/Documentation/devicetree/bindings/phy/mediatek,xsphy.yaml
> +++ b/Documentation/devicetree/bindings/phy/mediatek,xsphy.yaml
> @@ -50,6 +50,7 @@ properties:
> - mediatek,mt3611-xsphy
> - mediatek,mt3612-xsphy
> - mediatek,mt7988-xsphy
> + - mediatek,mt8196-xsphy
> - const: mediatek,xsphy
>
> reg:
> @@ -130,6 +131,13 @@ patternProperties:
> minimum: 1
> maximum: 7
>
> + mediatek,disconnect-threshold:
> + description:
> + The selection of disconnect threshold (U2 phy)
Why is this unitless? What does the threshold represent? Time? Voltage?
Something else?
> + $ref: /schemas/types.yaml#/definitions/uint32
> + minimum: 1
> + maximum: 15
> +
> mediatek,efuse-intr:
> description:
> The selection of Internal Resistor (U2/U3 phy)
> --
> 2.45.2
>
* Re: [PATCH v2 1/8] dt-bindings: thermal: amlogic: Add support for T7
From: Conor Dooley @ 2026-04-13 15:42 UTC (permalink / raw)
To: Ronald Claveau
Cc: Guillaume La Roque, Rafael J. Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Neil Armstrong, Kevin Hilman, Jerome Brunet, Martin Blumenstingl,
linux-pm, linux-amlogic, devicetree, linux-kernel,
linux-arm-kernel
In-Reply-To: <20260413-add-thermal-t7-vim4-v2-1-1002d90a0602@aliel.fr>
On Mon, Apr 13, 2026 at 12:52:42PM +0200, Ronald Claveau wrote:
> Add the amlogic,t7-thermal compatible for the Amlogic T7 thermal sensor.
>
> Unlike existing variants which use a phandle to the ao-secure syscon,
> the T7 relies on a secure monitor interface described by a phandle and
> a sensor index argument.
>
> The T7 integrates multiple thermal sensors, all accessed through the
> same SMC call. The sensor index argument is required to identify which
> sensor's calibration data the secure monitor should return, as a single
> SM_THERMAL_CALIB_READ command serves all of them.
>
> Introduce the amlogic,secure-monitor property as a phandle-array and
> make amlogic,ao-secure or amlogic,secure-monitor conditionally required
> depending on the compatible.
>
> Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
> ---
> .../bindings/thermal/amlogic,thermal.yaml | 42 ++++++++++++++++++++--
> 1 file changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/thermal/amlogic,thermal.yaml b/Documentation/devicetree/bindings/thermal/amlogic,thermal.yaml
> index 70b273271754b..1c096116b2dda 100644
> --- a/Documentation/devicetree/bindings/thermal/amlogic,thermal.yaml
> +++ b/Documentation/devicetree/bindings/thermal/amlogic,thermal.yaml
> @@ -21,7 +21,9 @@ properties:
> - amlogic,g12a-cpu-thermal
> - amlogic,g12a-ddr-thermal
> - const: amlogic,g12a-thermal
> - - const: amlogic,a1-cpu-thermal
> + - enum:
> + - amlogic,a1-cpu-thermal
> + - amlogic,t7-thermal
>
> reg:
> maxItems: 1
> @@ -42,12 +44,39 @@ properties:
> '#thermal-sensor-cells':
> const: 0
>
> + amlogic,secure-monitor:
> + description: phandle to the secure monitor
> + $ref: /schemas/types.yaml#/definitions/phandle-array
> + items:
> + - items:
> + - description: phandle to the secure monitor
> + - description: sensor index to get specific calibration data
> +
> required:
> - compatible
> - reg
> - interrupts
> - clocks
> - - amlogic,ao-secure
> +
> +allOf:
> + - if:
> + properties:
> + compatible:
> + contains:
> + enum:
> + - amlogic,a1-cpu-thermal
> + - amlogic,g12a-thermal
> + then:
> + required:
> + - amlogic,ao-secure
> + - if:
> + properties:
> + compatible:
> + contains:
> + const: amlogic,t7-thermal
This can just be replaced by an else, I think.
> + then:
> + required:
> + - amlogic,secure-monitor
>
> unevaluatedProperties: false
>
> @@ -62,4 +91,13 @@ examples:
> #thermal-sensor-cells = <0>;
> amlogic,ao-secure = <&sec_AO>;
> };
> + - |
> + a73_tsensor: temperature-sensor@20000 {
Can drop the label here, it has no users.
Otherwise, seems fine.
Cheers,
Conor.
pw-bot: changes-requested
> + compatible = "amlogic,t7-thermal";
> + reg = <0x0 0x20000 0x0 0x50>;
> + interrupts = <GIC_SPI 31 IRQ_TYPE_LEVEL_HIGH>;
> + clocks = <&clkc_periphs CLKID_TS>;
> + #thermal-sensor-cells = <0>;
> + amlogic,secure-monitor = <&sm 1>;
> + };
> ...
>
> --
> 2.49.0
>
* [PATCH] arm64/hwcap: Include kernel-hwcap.h in list of generated files
From: Mark Brown @ 2026-04-13 15:44 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon
Cc: Marek Vasut, linux-arm-kernel, linux-kernel, Mark Brown
When adding generation for the kernel internal constants for hwcaps the
generated file was not explicitly flagged as such in the build system,
causing it to be regenerated on each build. This wasn't obvious when the
series the change was included in was developed since it was all about
changes that trigger rebuilds anyway.
Fixes: abed23c3c44f5 (arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps)
Reported-by: Marek Vasut <marex@nabladev.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
arch/arm64/include/asm/Kbuild | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild
index d2ff8f6c3231..31441790b808 100644
--- a/arch/arm64/include/asm/Kbuild
+++ b/arch/arm64/include/asm/Kbuild
@@ -17,4 +17,5 @@ generic-y += parport.h
generic-y += user.h
generated-y += cpucap-defs.h
+generated-y += kernel-hwcap.h
generated-y += sysreg-defs.h
---
base-commit: abed23c3c44f565dc812563ac015be70dd61e97b
change-id: 20260413-arm64-hwcap-gen-fix-ecb4bb6dbb91
Best regards,
--
Mark Brown <broonie@kernel.org>
* [PATCH] watchdog: ixp4xx: fix reference leak on platform_device_register() failure
From: Guangshuo Li @ 2026-04-13 15:47 UTC (permalink / raw)
To: Linus Walleij, Imre Kaloz, Daniel Lezcano, Thomas Gleixner,
Guenter Roeck, linux-arm-kernel, linux-kernel
Cc: Guangshuo Li, stable
ixp4xx_timer_probe() directly returns the result of
platform_device_register(&ixp4xx_watchdog_device). When registration
fails, the embedded struct device in ixp4xx_watchdog_device has already
been initialized by device_initialize(), but the failure path does not
drop the device reference, leading to a reference leak.
The issue was identified by a static analysis tool I developed and
confirmed by manual review. Fix this by calling platform_device_put()
when platform_device_register() fails.
Fixes: 21a0a29d16c67 ("watchdog: ixp4xx: Rewrite driver to use core")
Cc: stable@vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
---
drivers/clocksource/timer-ixp4xx.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/clocksource/timer-ixp4xx.c b/drivers/clocksource/timer-ixp4xx.c
index 720ed70a2964..924dbd58c4da 100644
--- a/drivers/clocksource/timer-ixp4xx.c
+++ b/drivers/clocksource/timer-ixp4xx.c
@@ -239,11 +239,16 @@ static struct platform_device ixp4xx_watchdog_device = {
static int ixp4xx_timer_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
+ int ret;
/* Pass the base address as platform data and nothing else */
ixp4xx_watchdog_device.dev.platform_data = local_ixp4xx_timer->base;
ixp4xx_watchdog_device.dev.parent = dev;
- return platform_device_register(&ixp4xx_watchdog_device);
+ ret = platform_device_register(&ixp4xx_watchdog_device);
+ if (ret)
+ platform_device_put(&ixp4xx_watchdog_device);
+
+ return ret;
}
static const struct of_device_id ixp4xx_timer_dt_id[] = {
--
2.43.0
* RE: [PATCH 06/11] Drivers: hv: Make sint vector architecture neutral in MSHV_VTL
From: Michael Kelley @ 2026-04-13 15:49 UTC (permalink / raw)
To: Naman Jain, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li, Catalin Marinas, Will Deacon,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
x86@kernel.org, H . Peter Anvin, Arnd Bergmann, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Alexandre Ghiti
Cc: Marc Zyngier, Timothy Hayes, Lorenzo Pieralisi, mrigendrachaubey,
ssengar@linux.microsoft.com, linux-hyperv@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
linux-riscv@lists.infradead.org
In-Reply-To: <b5125f61-173f-45d0-a6dc-d795ba0f8693@linux.microsoft.com>
From: Naman Jain <namjain@linux.microsoft.com> Sent: Monday, April 13, 2026 4:48 AM
>
> On 4/1/2026 10:27 PM, Michael Kelley wrote:
> > From: Naman Jain <namjain@linux.microsoft.com> Sent: Monday, March 16, 2026 5:13 AM
> >>
> >> Generalize Synthetic interrupt source vector (sint) to use
> >> vmbus_interrupt variable instead, which automatically takes care of
> >> architectures where HYPERVISOR_CALLBACK_VECTOR is not present (arm64).
> >
> > Sashiko AI raised an interesting question about the startup timing --
> > whether the vmbus_platform_driver_probe() is guaranteed to have
> > set vmbus_interrupt before the VTL functions below run and use it.
> > What causes the mshv_vtl.ko module to be loaded, and hence run
> > mshv_vtl_init()?
>
> There is no race condition here. The init ordering guarantees that
> vmbus_interrupt is always set before mshv_vtl_synic_enable_regs()
> reads it.
>
> The call chain for setting vmbus_interrupt:
>
> subsys_initcall(hv_acpi_init) [level 4]
> -> platform_driver_register(&vmbus_platform_driver) and so on.
>
>
> The call chain for reading vmbus_interrupt:
>
> module_init(mshv_vtl_init) [level 6]
> -> hv_vtl_setup_synic()
> -> cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, ..., mshv_vtl_alloc_context, ...)
> -> mshv_vtl_alloc_context()
> -> mshv_vtl_synic_enable_regs()
> -> sint.vector = vmbus_interrupt
>
> do_initcalls() processes sections in order 0 through 7, so
> hv_acpi_init() (level 4) is guaranteed to complete before
> mshv_vtl_init() (level 6) runs.
>
I think the situation is more complex than what you describe, depending
on whether the VMBus driver and/or MSHV_VTL are built as modules vs.
being built-in to the kernel image. In include/linux/module.h, see the
comment for module_init() and how subsys_initcall() is mapped
to module_init() when built as a module.
If both are built-in, then what you describe is correct. But if either or
both are modules, then the respective init functions (hv_acpi_init
and mshv_vtl_init) get called at the time the module is loaded, and
not by do_initcalls(). I think hv_vmbus.ko gets loaded when an attempt
is first made to access a disk, but I would need to look more closely to
be sure. I don't have any understanding of what causes mshv_vtl.ko
to be loaded. And what is the ordering if MSHV_VTL is built-in while
VMBus is built as a module, or vice versa?
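To make the concern concrete, here is a small userspace model, not kernel code. It assumes two init functions with the levels quoted above (4 and 6) and compares running them sorted by level, as do_initcalls() does for built-in code, with running them in plain load order, as happens for modules. The `_model` names and the value 18 are made up for illustration, and the model ignores module dependency handling, so it only shows why the level alone stops being a guarantee.

```c
static int vmbus_interrupt = -1;	/* -1 means "not set yet" */
static int sint_vector = -1;

/* Stand-in for the VMBus init path that sets vmbus_interrupt. */
static void hv_acpi_init_model(void)
{
	vmbus_interrupt = 18;	/* arbitrary illustrative value */
}

/* Stand-in for the VTL init path that reads vmbus_interrupt. */
static void mshv_vtl_init_model(void)
{
	sint_vector = vmbus_interrupt;
}

struct initcall {
	int level;
	void (*fn)(void);
};

/* Built-in case: calls run in ascending level order, whatever the array order. */
static void run_builtin(const struct initcall *calls, int n)
{
	for (int level = 0; level <= 7; level++)
		for (int i = 0; i < n; i++)
			if (calls[i].level == level)
				calls[i].fn();
}

/* Module case: each init runs when its module loads; the level is ignored. */
static void run_in_load_order(const struct initcall *calls, int n)
{
	for (int i = 0; i < n; i++)
		calls[i].fn();
}
```

With the reader listed first, run_builtin() still yields a valid vector, while run_in_load_order() leaves sint_vector at -1.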
Michael
* Re: [PATCH v4 2/4] pwm: sun50i: Add H616 PWM support
From: Paul Kocialkowski @ 2026-04-13 15:58 UTC (permalink / raw)
To: bigunclemax
Cc: richard.genoud, conor+dt, devicetree, jernej.skrabec, joao,
jstultz, krzk+dt, linux-arm-kernel, linux-kernel, linux-pwm,
linux-sunxi, p.zabel, robh, samuel, thomas.petazzoni,
u.kleine-koenig, wens
In-Reply-To: <20260413123920.2459916-1-bigunclemax@gmail.com>
Hi Maksim,
On Mon 13 Apr 26, 15:39, bigunclemax@gmail.com wrote:
> > +
> > +/* PWM Capture Fall Lock Register */
> > +#define H616_PWM_CFLR(x) (0x74 + (x) * 0x20)
> > +
> > +#define H616_PWM_PAIR_IDX(chan) ((chan) >> 2)
> > +
>
> It looks like there's a typo or a mistake in the PAIR_IDX calculation.
> It should be like ((chan) >> 1).
> For example, for the 5th channel the result will be 1, but it should be 2.
Had a quick look at it too and I agree with you. A right shift by 2 essentially
groups pwms by 4, not 2. So pwms 2 and 3 would also be reported as part of
index 0.
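The arithmetic can be checked with a tiny standalone sketch. The helper names below are hypothetical; only the two shift amounts come from the discussion.

```c
/* Shift used in the patch under review: groups channels by four. */
static unsigned int pair_idx_shift2(unsigned int chan)
{
	return chan >> 2;
}

/* Shift suggested in the review: groups channels by two. */
static unsigned int pair_idx_shift1(unsigned int chan)
{
	return chan >> 1;
}

/*
 * chan:        0 1 2 3 4 5
 * chan >> 2:   0 0 0 0 1 1
 * chan >> 1:   0 0 1 1 2 2
 */
```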
All the best,
Paul
--
Paul Kocialkowski,
Independent contractor - sys-base - https://www.sys-base.io/
Free software developer - https://www.paulk.fr/
Expert in multimedia, graphics and embedded hardware support with Linux.
* Re: [PATCH 1/7] x86/vdso: Respect COMPAT_32BIT_TIME
From: Arnd Bergmann @ 2026-04-13 15:59 UTC (permalink / raw)
To: Thomas Weißschuh
Cc: H. Peter Anvin, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, Russell King, Catalin Marinas,
Will Deacon, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Thomas Bogendoerfer,
Vincenzo Frascino, linux-kernel, linux-arm-kernel, linuxppc-dev,
linux-mips
In-Reply-To: <20260410091131-46b6354c-9d06-4e47-9345-ee224d8528f7@linutronix.de>
On Fri, Apr 10, 2026, at 09:24, Thomas Weißschuh wrote:
> On Tue, Mar 03, 2026 at 09:50:33PM +0100, Arnd Bergmann wrote:
>> On Tue, Mar 3, 2026, at 19:11, H. Peter Anvin wrote:
>> >
>> > The time zone in the kernel has never worked anyway, as it would require the
>> > kernel to contain at least the forward portion of the zoneinfo/tzdata table in
>> > order to actually work correctly. The only plausible use of it would be for
>> > local time-based filesystems like FAT, but I don't think we bother.
>> >
>> > A bigger question is whether or not we should omit these from the vDSO
>> > completely (potentially causing link failures) or replace them with stubs
>> > returning -ENOSYS.
>>
>> I see no harm in keeping gettimeofday() in the vdso when
>> COMPAT_32BIT_TIME is turned on, as existing code will call it
>> no matter whether it's in the vdso or the syscall.
>
> We would still always keep them for 64-bit ABIs, right?
Yes, I think we can't easily change that now. It was probably
a mistake to keep them in the generic syscall table after
we dropped them for 32-bit non-time32 targets, so riscv64
and loongarch should not have had these in the first place.
>> Equally, I see no point in having either version of
>> gettimeofday() or settimeofday() when COMPAT_32BIT_TIME is
>> disabled, as clearly anything calling it would pass incorrect
>> data for times past 2038.
>
> Should we also drop the syscalls in these cases?
> We will need to keep settimeofday() in some form to support the
> timewarping call done by init.
>
> Recap/Proposal:
>
> * Keep the gettimeofday()/time() syscalls when they are y2038 safe or
> CONFIG_COMPAT_32BIT_TIME is set.
> * The vDSO functions always mirror the systemcall availability.
These sound good.
> * Always provide settimeofday(). If CONFIG_COMPAT_32BIT_TIME is *not*
> set, reject passing any 'tv' argument where it may not be y2038 safe.
This sounds wrong to me now: the case I'm worried about is a 32-bit
system calling settimeofday() based on the value of an RTC or NTP.
The idea of CONFIG_COMPAT_32BIT_TIME=n is to catch this by causing
an intentional ENOSYS error even for valid times, so it doesn't
suddenly start breaking in 2038.
Arnd
* Re: [PATCH v14 0/8] Apply drm_bridge_connector and panel_bridge helper for the Analogix DP driver
From: Luca Ceresoli @ 2026-04-13 16:05 UTC (permalink / raw)
To: andrzej.hajda, neil.armstrong, rfoss, maarten.lankhorst, mripard,
tzimmermann, airlied, simona, inki.dae, sw0312.kim, kyungmin.park,
krzk, jingoohan1, hjc, heiko, andy.yan, Damon Ding
Cc: Laurent.pinchart, jonas, jernej.skrabec, alim.akhtar,
dmitry.baryshkov, nicolas.frattaroli, dianders, m.szyprowski,
linux-kernel, dri-devel, linux-arm-kernel, linux-samsung-soc,
linux-rockchip
In-Reply-To: <20260413132551.1049307-1-damon.ding@rock-chips.com>
On Mon, 13 Apr 2026 21:25:43 +0800, Damon Ding wrote:
> Picked from:
> https://lore.kernel.org/all/20260409065301.446670-1-damon.ding@rock-chips.com/
>
> PATCH 1 is the preparation for apply drm_bridge_connector helper.
> PATCH 2 is to apply the drm_bridge_connector helper.
> PATCH 3-5 are to move the panel/bridge parsing to the Analogix side.
> PATCH 6 is to attach the next bridge on Analogix side uniformly.
> PATCH 7-8 are to apply the panel_bridge helper.
>
> [...]
Applied, thanks!
[1/8] drm/bridge: analogix_dp: Pass struct drm_atomic_state* for analogix_dp_bridge_mode_set()
commit: 3be024d26a576519ce75fabfbdf6972731c19900
[2/8] drm/bridge: analogix_dp: Apply drm_bridge_connector helper
commit: 99a49ff5ef7a5f01e28e724b888d94b6735a88c1
[3/8] drm/bridge: analogix_dp: Add new API analogix_dp_finish_probe()
commit: e9c897c898f9ff32d93380ee4ebc778a6b787aaa
[4/8] drm/rockchip: analogix_dp: Apply analogix_dp_finish_probe()
commit: d35ac0973463f67162d9ee73bf0c828ad5d4d2f9
[5/8] drm/exynos: exynos_dp: Apply analogix_dp_finish_probe()
commit: 02b8a4f240abdc4e99efd6cf95c47378a1015903
[6/8] drm/bridge: analogix_dp: Attach the next bridge in analogix_dp_bridge_attach()
commit: 3076510af7cd01a7fb0ae5103116a39ad35eb5cb
[7/8] drm/bridge: analogix_dp: Remove bridge disabing and panel unpreparing in analogix_dp_unbind()
commit: 2bfc4e192f04260c2eead9b79274d39984f5a143
[8/8] drm/bridge: analogix_dp: Apply panel_bridge helper
commit: 1b86a69b61df411354da70d9528f022833bee4d7
Best regards,
--
Luca Ceresoli <luca.ceresoli@bootlin.com>
* Re: [PATCH] PCI: host-common: Request bus reassignment when not probe-only
From: Manivannan Sadhasivam @ 2026-04-13 16:05 UTC (permalink / raw)
To: Ratheesh Kannoth
Cc: linux-pci, linux-arm-kernel, linux-kernel, bhelgaas, will,
lpieralisi, kwilczynski, robh, vidyas, Bjorn Helgaas
In-Reply-To: <20260410142124.2673056-1-rkannoth@marvell.com>
On Fri, Apr 10, 2026 at 07:51:24PM +0530, Ratheesh Kannoth wrote:
> pci_host_common_init() is used by several generic ECAM host drivers.
> After PCI core changes around pci_flags and preserve_config, these hosts
> no longer opted into full bus number reassignment the way they did
> before.
>
So pci_assign_unassigned_root_bus_resources() only assigns the unclaimed
resources, and does not reassign the bus numbers. Probably the offending commit
confused PCI_REASSIGN_ALL_BUS with PCI_REASSIGN_ALL_RSRC.
> When PCI_PROBE_ONLY is not set, add PCI_REASSIGN_ALL_BUS so
> pci_scan_bridge_extend() takes the reassignment path: bus numbers can be
> assigned from firmware EA data (e.g. pci_ea_fixed_busnrs()). Skip the
> flag in probe-only mode so existing assignments are not overridden.
>
> CC: Bjorn Helgaas <helgaas@kernel.org>
> CC: Vidya Sagar <vidyas@nvidia.com>
> Fixes: 7246a4520b4b ("PCI: Use preserve_config in place of pci_flags")
> Link: https://lore.kernel.org/netdev/adcXzcz2wWJFw4d7@rkannoth-OptiPlex-7090/
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
> ---
> drivers/pci/controller/pci-host-common.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/pci/controller/pci-host-common.c b/drivers/pci/controller/pci-host-common.c
> index d6258c1cffe5..860783553cca 100644
> --- a/drivers/pci/controller/pci-host-common.c
> +++ b/drivers/pci/controller/pci-host-common.c
> @@ -68,6 +68,10 @@ int pci_host_common_init(struct platform_device *pdev,
> if (IS_ERR(cfg))
> return PTR_ERR(cfg);
>
> + /* Do not reassign resources if probe only */
This comment is still wrong and could be the source of confusion that led to
the bug. PCI_REASSIGN_ALL_BUS only reassigns the bus number, not resources. So
please reword it.
> + if (!pci_has_flag(PCI_PROBE_ONLY))
> + pci_add_flags(PCI_REASSIGN_ALL_BUS);
This forces bus number reassignment for all platforms whose DT doesn't provide
'linux,probe-only' property. Maybe that's OK given that this was the behavior
before 7246a4520b4b.
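For illustration, here is a userspace model of the hunk with the comment reworded along those lines. The flag names mirror the kernel's, but the enum values, the global `pci_flags` variable and `host_common_init_model()` are local stand-ins assumed for this sketch, and the comment wording is just one possibility.

```c
/* Local stand-ins for the kernel's PCI flag bits; values are illustrative. */
enum {
	PCI_REASSIGN_ALL_RSRC = 0x1,	/* resource reassignment */
	PCI_REASSIGN_ALL_BUS  = 0x2,	/* bus number reassignment only */
	PCI_PROBE_ONLY        = 0x4,	/* keep existing assignments */
};

static unsigned int pci_flags;

static void pci_add_flags(unsigned int flags)
{
	pci_flags |= flags;
}

static int pci_has_flag(unsigned int flag)
{
	return (pci_flags & flag) != 0;
}

static void host_common_init_model(void)
{
	/* Do not reassign bus numbers if probe only */
	if (!pci_has_flag(PCI_PROBE_ONLY))
		pci_add_flags(PCI_REASSIGN_ALL_BUS);
}
```

Note that PCI_REASSIGN_ALL_RSRC is never touched on either path, which is the distinction the comment should reflect.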
- Mani
--
மணிவண்ணன் சதாசிவம்
* Re: [PATCH v14 3/8] drm/bridge: analogix_dp: Add new API analogix_dp_finish_probe()
From: Luca Ceresoli @ 2026-04-13 16:07 UTC (permalink / raw)
To: Damon Ding, andrzej.hajda, neil.armstrong, rfoss,
maarten.lankhorst, mripard, tzimmermann, airlied, simona,
inki.dae, sw0312.kim, kyungmin.park, krzk, jingoohan1, hjc, heiko,
andy.yan
Cc: Laurent.pinchart, jonas, jernej.skrabec, alim.akhtar,
dmitry.baryshkov, nicolas.frattaroli, dianders, m.szyprowski,
linux-kernel, dri-devel, linux-arm-kernel, linux-samsung-soc,
linux-rockchip
In-Reply-To: <20260413132551.1049307-4-damon.ding@rock-chips.com>
Hello Damon,
On Mon Apr 13, 2026 at 3:25 PM CEST, Damon Ding wrote:
> Since the panel/bridge should logically be positioned behind the
> Analogix bridge in the display pipeline, it makes sense to handle
> the panel/bridge parsing on the Analogix side. Therefore, we add
> a new API analogix_dp_finish_probe(), which combines the panel/bridge
> parsing with component addition, to do it.
>
> In order to process component binding right after the probe completes,
> the &analogix_dp_plat_data.ops is newly added to pass &component_ops,
> for which the &dp_aux_ep_device_with_data.done_probing() of DP AUX bus
> only supports passing &drm_dp_aux.
>
> Signed-off-by: Damon Ding <damon.ding@rock-chips.com>
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Tested-by: Heiko Stuebner <heiko@sntech.de> # rk3588
>
> ---
>
> Changes in v4:
> - Rename the &analogix_dp_plat_data.bridge to
> &analogix_dp_plat_data.next_bridge.
> - Remame API analogix_dp_find_panel_or_bridge() to
> analogix_dp_finish_probe().
>
> Changes in v5:
> - Select DRM_DISPLAY_DP_AUX_BUS for DRM_ANALOGIX_DP.
>
> Changes in v9:
> - Add Tested-by tag.
>
> Changes in v10:
> - Fix to use dev_err_probe() in analogix_dp_finish_probe().
> - Expand the commit message.
>
> Changes in v13:
> - Modify '(on rk3588)' to '# rk3588' for Tested-by tag.
> ---
> drivers/gpu/drm/bridge/analogix/Kconfig | 1 +
> .../drm/bridge/analogix/analogix_dp_core.c | 46 +++++++++++++++++++
> include/drm/bridge/analogix_dp.h | 2 +
> 3 files changed, 49 insertions(+)
>
> diff --git a/drivers/gpu/drm/bridge/analogix/Kconfig b/drivers/gpu/drm/bridge/analogix/Kconfig
> index 03dc7ffe824a..8a6136cd675f 100644
> --- a/drivers/gpu/drm/bridge/analogix/Kconfig
> +++ b/drivers/gpu/drm/bridge/analogix/Kconfig
> @@ -29,6 +29,7 @@ config DRM_ANALOGIX_ANX78XX
> config DRM_ANALOGIX_DP
> tristate
> depends on DRM
> + select DRM_DISPLAY_DP_AUX_BUS
While applying, sparse noticed an issue here: DRM_DISPLAY_DP_AUX_BUS
depends on OF, so you need to propagate the 'depends on OF' to
DRM_ANALOGIX_DP and its reverse dependencies.
I fixed it while applying the patch.
Luca
--
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
* Re: [RFC PATCH 3/8] mm/vmalloc: Extend vmap_small_pages_range_noflush() to support larger page_shift sizes
From: Mike Rapoport @ 2026-04-13 16:08 UTC (permalink / raw)
To: Barry Song (Xiaomi)
Cc: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki,
linux-kernel, anshuman.khandual, ryan.roberts, ajd, david,
Xueyuan.chen21
In-Reply-To: <20260408025115.27368-4-baohua@kernel.org>
Hi Barry,
On Wed, Apr 08, 2026 at 10:51:10AM +0800, Barry Song (Xiaomi) wrote:
> vmap_small_pages_range_noflush() provides a clean interface by taking
> struct page **pages and mapping them via direct PTE iteration. This
> avoids the page table zigzag seen when using
> vmap_range_noflush() for page_shift values other than PAGE_SHIFT.
>
> Extend it to support larger page_shift values, and add PMD- and
> contiguous-PTE mappings as well.
>
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> ---
> mm/vmalloc.c | 54 ++++++++++++++++++++++++++++++++++++++++------------
> 1 file changed, 42 insertions(+), 12 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 57eae99d9909..5bf072297536 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -524,8 +524,9 @@ void vunmap_range(unsigned long addr, unsigned long end)
>
> static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
> unsigned long end, pgprot_t prot, struct page **pages, int *nr,
> - pgtbl_mod_mask *mask)
> + pgtbl_mod_mask *mask, unsigned int shift)
> {
> + unsigned int steps = 1;
> int err = 0;
> pte_t *pte;
>
> @@ -543,6 +544,7 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
> do {
> struct page *page = pages[*nr];
>
> + steps = 1;
> if (WARN_ON(!pte_none(ptep_get(pte)))) {
> err = -EBUSY;
> break;
> @@ -556,9 +558,24 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
> break;
> }
>
> +#ifdef CONFIG_HUGETLB_PAGE
Why is this related to HUGETLB_PAGE?
> + if (shift != PAGE_SHIFT) {
> + unsigned long pfn = page_to_pfn(page), size;
> +
> + size = arch_vmap_pte_range_map_size(addr, end, pfn, shift);
> + if (size != PAGE_SIZE) {
> + steps = size >> PAGE_SHIFT;
> + pte_t entry = pfn_pte(pfn, prot);
> +
> + entry = arch_make_huge_pte(entry, ilog2(size), 0);
> + set_huge_pte_at(&init_mm, addr, pte, entry, size);
> + continue;
> + }
> + }
> +#endif
> +
> set_pte_at(&init_mm, addr, pte, mk_pte(page, prot));
> - (*nr)++;
> - } while (pte++, addr += PAGE_SIZE, addr != end);
> + } while (pte += steps, *nr += steps, addr += PAGE_SIZE * steps, addr != end);
>
> lazy_mmu_mode_disable();
> *mask |= PGTBL_PTE_MODIFIED;
> @@ -568,7 +585,7 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
>
> static int vmap_pages_pmd_range(pud_t *pud, unsigned long addr,
> unsigned long end, pgprot_t prot, struct page **pages, int *nr,
> - pgtbl_mod_mask *mask)
> + pgtbl_mod_mask *mask, unsigned int shift)
> {
> pmd_t *pmd;
> unsigned long next;
> @@ -578,7 +595,20 @@ static int vmap_pages_pmd_range(pud_t *pud, unsigned long addr,
> return -ENOMEM;
> do {
> next = pmd_addr_end(addr, end);
> - if (vmap_pages_pte_range(pmd, addr, next, prot, pages, nr, mask))
> +
> + if (shift == PMD_SHIFT) {
> + struct page *page = pages[*nr];
> + phys_addr_t phys_addr = page_to_phys(page);
> +
> + if (vmap_try_huge_pmd(pmd, addr, next, phys_addr, prot,
> + shift)) {
> + *mask |= PGTBL_PMD_MODIFIED;
> + *nr += 1 << (shift - PAGE_SHIFT);
> + continue;
> + }
With this vmap_pages_pmd_range() looks quite similar to vmap_pmd_range().
Any chance we can consolidate the two?
> + }
> +
> + if (vmap_pages_pte_range(pmd, addr, next, prot, pages, nr, mask, shift))
> return -ENOMEM;
> } while (pmd++, addr = next, addr != end);
> return 0;
--
Sincerely yours,
Mike.
* Re: [PATCH 1/7] x86/vdso: Respect COMPAT_32BIT_TIME
From: Thomas Weißschuh @ 2026-04-13 16:13 UTC (permalink / raw)
To: Arnd Bergmann
Cc: H. Peter Anvin, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, Russell King, Catalin Marinas,
Will Deacon, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Thomas Bogendoerfer,
Vincenzo Frascino, linux-kernel, linux-arm-kernel, linuxppc-dev,
linux-mips
In-Reply-To: <15925544-1ae5-406a-b9cc-af5935cc9f02@app.fastmail.com>
On Mon, Apr 13, 2026 at 05:59:52PM +0200, Arnd Bergmann wrote:
> On Fri, Apr 10, 2026, at 09:24, Thomas Weißschuh wrote:
(...)
> > Recap/Proposal:
(...)
> > * Always provide settimeofday(). If CONFIG_COMPAT_32BIT_TIME is *not*
> > set, reject passing any 'tv' argument where it may not be y2038 safe.
>
> This sounds wrong to me now: the case I'm worried about is a 32-bit
> system calling settimeofday() based on the value of an RTC or NTP.
> The idea of CONFIG_COMPAT_32BIT_TIME=n is to catch this by causing
> an intentional ENOSYS error even for valid times, so it doesn't
> suddenly start breaking in 2038.
This is what I meant with "where it *may*" be not y2038 safe.
Even if the value fits, the call would be rejected.
My wording was crappy indeed, though.
In code:
	if (tv && !IS_ENABLED(CONFIG_COMPAT_32BIT_TIME) && sizeof(tv->tv_sec) < 8) {
		pr_warn_once(...);
		return -EINVAL;
	}
Or maybe drop the EINVAL but still emit a warning. That warning would be
useful for gettimeofday(), too.
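As a standalone sketch of that check, assuming the config option and the width of tv_sec are passed in as plain parameters (the struct and function names here are hypothetical, and the warning is left out):

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts: a 32-bit and a 64-bit tv_sec. */
struct tv32 { int32_t tv_sec; int32_t tv_usec; };
struct tv64 { int64_t tv_sec; int64_t tv_usec; };

/*
 * Model of the proposed rejection: any 'tv' whose tv_sec is narrower than
 * 8 bytes is refused when compat 32-bit time is disabled, regardless of
 * the value it carries.
 */
static int settimeofday_check(int has_tv, size_t tv_sec_size,
			      int compat_32bit_time)
{
	if (has_tv && !compat_32bit_time && tv_sec_size < 8)
		return -EINVAL;
	return 0;
}
```

The value in tv_sec never enters the decision, which matches "even if the value fits, the call would be rejected".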
Thomas
* Re: [RFC PATCH 4/8] mm/vmalloc: Eliminate page table zigzag for huge vmalloc mappings
From: Mike Rapoport @ 2026-04-13 16:16 UTC (permalink / raw)
To: Barry Song (Xiaomi)
Cc: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki,
linux-kernel, anshuman.khandual, ryan.roberts, ajd, david,
Xueyuan.chen21
In-Reply-To: <20260408025115.27368-5-baohua@kernel.org>
On Wed, Apr 08, 2026 at 10:51:11AM +0800, Barry Song (Xiaomi) wrote:
> For vmalloc() allocations with VM_ALLOW_HUGE_VMAP, we no longer
> need to iterate over pages one by one, which would otherwise lead to
> zigzag page table mappings.
>
> The code is now unified with the PAGE_SHIFT case by simply
> calling vmap_small_pages_range_noflush().
>
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> ---
> mm/vmalloc.c | 22 ++++------------------
> 1 file changed, 4 insertions(+), 18 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 5bf072297536..eba436386929 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -689,27 +689,13 @@ static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end,
> int __vmap_pages_range_noflush(unsigned long addr, unsigned long end,
> pgprot_t prot, struct page **pages, unsigned int page_shift)
> {
> - unsigned int i, nr = (end - addr) >> PAGE_SHIFT;
> -
> WARN_ON(page_shift < PAGE_SHIFT);
>
> - if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMALLOC) ||
> - page_shift == PAGE_SHIFT)
> - return vmap_small_pages_range_noflush(addr, end, prot, pages, PAGE_SHIFT);
> -
> - for (i = 0; i < nr; i += 1U << (page_shift - PAGE_SHIFT)) {
> - int err;
> -
> - err = vmap_range_noflush(addr, addr + (1UL << page_shift),
> - page_to_phys(pages[i]), prot,
> - page_shift);
> - if (err)
> - return err;
> + if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMALLOC))
> + page_shift = PAGE_SHIFT;
>
> - addr += 1UL << page_shift;
> - }
> -
> - return 0;
> + return vmap_small_pages_range_noflush(addr, end, prot, pages,
> + min(page_shift, PMD_SHIFT));
Wouldn't vmap_range_noflush() already "do the right thing" even without
changes to vmap_small_pages_range_noflush()?
> }
>
> int vmap_pages_range_noflush(unsigned long addr, unsigned long end,
> --
> 2.39.3 (Apple Git-146)
>
--
Sincerely yours,
Mike.
* Re: [PATCH v5 4/4] Input: charlieplex_keypad: add GPIO charlieplex keypad
From: Hugo Villeneuve @ 2026-04-13 16:20 UTC (permalink / raw)
To: Andy Shevchenko
Cc: robin, andy, geert, robh, krzk+dt, conor+dt, dmitry.torokhov,
hvilleneuve, mkorpershoek, matthias.bgg,
angelogioacchino.delregno, lee, alexander.sverdlin, marek.vasut,
akurz, devicetree, linux-kernel, linux-input, linux-arm-kernel,
linux-mediatek
In-Reply-To: <abPXX1eWoq7C7J1R@ashevche-desk.local>
Hi Dmitry,
On Fri, 13 Mar 2026 11:22:39 +0200
Andy Shevchenko <andriy.shevchenko@intel.com> wrote:
> On Thu, Mar 12, 2026 at 02:00:58PM -0400, Hugo Villeneuve wrote:
> >
> > Add support for GPIO-based charlieplex keypad, allowing to control
> > N^2-N keys using N GPIO lines.
> >
> > Reuse matrix keypad keymap to simplify, even if there is no concept
> > of rows and columns in this type of keyboard.
>
> LGTM,
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
I was just wondering if this will go into v7.1, as I am not seeing the
patch series in your input/next branch at the moment. Let me know
if you need me to rebase it on v7.0.
--
Hugo Villeneuve
* Re: [PATCH v2] dt-bindings: ARM: arm,vexpress-scc: convert to DT schema
From: Liviu Dudau @ 2026-04-13 16:24 UTC (permalink / raw)
To: Khushal Chitturi
Cc: robh, krzk+dt, conor+dt, sudeep.holla, lpieralisi, pawel.moll,
devicetree, linux-arm-kernel, linux-kernel
In-Reply-To: <20260411183355.8847-1-khushalchitturi@gmail.com>
On Sun, Apr 12, 2026 at 12:03:55AM +0530, Khushal Chitturi wrote:
> Convert the ARM Versatile Express Serial Configuration Controller
> bindings to DT schema.
>
> Signed-off-by: Khushal Chitturi <khushalchitturi@gmail.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Best regards,
Liviu
> ---
> Changelog:
> v1 -> v2:
> - Modified compatible string to use an enum instead of a generic pattern.
> - Updated maintainers list.
>
> .../bindings/arm/arm,vexpress-scc.yaml | 53 +++++++++++++++++++
> .../devicetree/bindings/arm/vexpress-scc.txt | 33 ------------
> 2 files changed, 53 insertions(+), 33 deletions(-)
> create mode 100644 Documentation/devicetree/bindings/arm/arm,vexpress-scc.yaml
> delete mode 100644 Documentation/devicetree/bindings/arm/vexpress-scc.txt
>
> diff --git a/Documentation/devicetree/bindings/arm/arm,vexpress-scc.yaml b/Documentation/devicetree/bindings/arm/arm,vexpress-scc.yaml
> new file mode 100644
> index 000000000000..9b8f7e0c4ea0
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/arm,vexpress-scc.yaml
> @@ -0,0 +1,53 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/arm/arm,vexpress-scc.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: ARM Versatile Express Serial Configuration Controller
> +
> +maintainers:
> + - Liviu Dudau <liviu.dudau@arm.com>
> + - Sudeep Holla <sudeep.holla@arm.com>
> +
> +description: |
> + Test chips for ARM Versatile Express platform implement SCC (Serial
> + Configuration Controller) interface, used to set initial conditions
> + for the test chip.
> +
> + In some cases its registers are also mapped in normal address space
> + and can be used to obtain runtime information about the chip internals
> + (like silicon temperature sensors) and as interface to other subsystems
> + like platform configuration control and power management.
> +
> +properties:
> + compatible:
> + items:
> + - enum:
> + - arm,vexpress-scc,v2p-ca15_a7
> + - const: arm,vexpress-scc
> +
> + reg:
> + maxItems: 1
> +
> + interrupts:
> + maxItems: 1
> +
> +required:
> + - compatible
> +
> +additionalProperties: false
> +
> +examples:
> + - |
> + bus {
> + #address-cells = <2>;
> + #size-cells = <2>;
> +
> + scc@7fff0000 {
> + compatible = "arm,vexpress-scc,v2p-ca15_a7", "arm,vexpress-scc";
> + reg = <0 0x7fff0000 0 0x1000>;
> + interrupts = <0 95 4>;
> + };
> + };
> +...
> diff --git a/Documentation/devicetree/bindings/arm/vexpress-scc.txt b/Documentation/devicetree/bindings/arm/vexpress-scc.txt
> deleted file mode 100644
> index ae5043e42e5d..000000000000
> --- a/Documentation/devicetree/bindings/arm/vexpress-scc.txt
> +++ /dev/null
> @@ -1,33 +0,0 @@
> -ARM Versatile Express Serial Configuration Controller
> ------------------------------------------------------
> -
> -Test chips for ARM Versatile Express platform implement SCC (Serial
> -Configuration Controller) interface, used to set initial conditions
> -for the test chip.
> -
> -In some cases its registers are also mapped in normal address space
> -and can be used to obtain runtime information about the chip internals
> -(like silicon temperature sensors) and as interface to other subsystems
> -like platform configuration control and power management.
> -
> -Required properties:
> -
> -- compatible value: "arm,vexpress-scc,<model>", "arm,vexpress-scc";
> - where <model> is the full tile model name (as used
> - in the tile's Technical Reference Manual),
> - eg. for Coretile Express A15x2 A7x3 (V2P-CA15_A7):
> - compatible = "arm,vexpress-scc,v2p-ca15_a7", "arm,vexpress-scc";
> -
> -Optional properties:
> -
> -- reg: when the SCC is memory mapped, physical address and size of the
> - registers window
> -- interrupts: when the SCC can generate a system-level interrupt
> -
> -Example:
> -
> - scc@7fff0000 {
> - compatible = "arm,vexpress-scc,v2p-ca15_a7", "arm,vexpress-scc";
> - reg = <0 0x7fff0000 0 0x1000>;
> - interrupts = <0 95 4>;
> - };
> --
> 2.53.0
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
* Re: [PATCH v10 00/20] CoreSight: Refactor power management for CoreSight path
From: Leo Yan @ 2026-04-13 16:31 UTC (permalink / raw)
To: Jie Gan
Cc: Suzuki K Poulose, Mike Leach, James Clark, Yeoreum Yun,
Mark Rutland, Will Deacon, Yabin Cui, Keita Morisaki,
Yuanfang Zhang, Greg Kroah-Hartman, Alexander Shishkin,
Tamas Petz, Thomas Gleixner, Peter Zijlstra, coresight,
linux-arm-kernel
In-Reply-To: <0dd1c432-884a-46c1-828b-f3b22769d000@oss.qualcomm.com>
On Mon, Apr 13, 2026 at 06:30:18PM +0800, Jie Gan wrote:
[...]
> tested on QCOM sa8775-ride:
>
> === 1. Sysfs mode: basic enable/disable ===
> PASS: Sink tmc_etr0 enabled
> PASS: Source etm0 enabled
> PASS: Source etm0 disabled cleanly
> PASS: Sink tmc_etr0 disabled cleanly
>
> === 2. Sysfs mode: repeated enable/disable cycles (10x) ===
> PASS: 10 enable/disable cycles completed without error
>
> === 3. Sysfs mode: enable source with no active sink ===
> PASS: Enable without sink returned error (expected)
>
> === 4. Sysfs mode: enable/disable all per-CPU sources ===
> etm0 (cpu0): enabled OK
> etm1 (cpu1): enabled OK
> etm2 (cpu2): enabled OK
> etm3 (cpu3): enabled OK
> etm4 (cpu4): enabled OK
> etm5 (cpu5): enabled OK
> etm6 (cpu6): enabled OK
> etm7 (cpu7): enabled OK
> PASS: All online per-CPU sources enabled/disabled successfully
>
> === 5. CPU hotplug: offline CPU while sysfs tracing active ===
> Using source etm1 on cpu1
> Tracing active on cpu1, offlining CPU...
> [ 82.805359] psci: CPU1 killed (polled 0 ms)
> PASS: Source auto-disabled on CPU offline
> [ 83.346033] Detected PIPT I-cache on CPU1
> [ 83.346114] GICv3: CPU1: found redistributor 100 region
> 0:0x0000000017a80000
> [ 83.346283] CPU1: Booted secondary processor 0x0000000100 [0x410fd4b2]
> PASS: Source re-enabled after CPU re-online
>
> === 6. Sysfs: enable source on offline CPU (expect ENODEV) ===
> [ 84.013788] psci: CPU1 killed (polled 0 ms)
> PASS: Enable on offline cpu1 rejected (enable_source=0)
> [ 84.349558] Detected PIPT I-cache on CPU1
> [ 84.349640] GICv3: CPU1: found redistributor 100 region
> 0:0x0000000017a80000
> [ 84.349811] CPU1: Booted secondary processor 0x0000000100 [0x410fd4b2]
>
> === 7. CPU PM: trace survives CPU idle entry/exit ===
> Sleeping 3s to allow CPU idle entry...
> Idle entries on cpu0 during test: 35
> PASS: Source still enabled after idle (PM save/restore working)
>
> === 8. Perf mode: basic cs_etm recording ===
> SKIP: perf not found in PATH
>
> === 11. TRBE: check save/restore sysfs nodes (if present) ===
> SKIP: No TRBE devices found
>
> Tested-by: Jie Gan <jie.gan@oss.qualcomm.com>
Just a heads up: Sashiko [1] pointed out a corner case where an SMP call
may fail when disabling the source device, leaving the per-CPU path
pointer uncleared. If the ETMv4 device is then removed (e.g. if the
user unloads the ETMv4 module), the CPU PM notifier might access the stale
path pointer. Though this is a rare case, we should handle it safely.
This is why the series was not picked for the v7.1 merge window.
Thanks a lot for the testing, Jie! It's very helpful, and I will add
your test tags in the next spin.
Anyway, please expect more iterations.
Thanks,
Leo
[1] https://sashiko.dev/#/patchset/20260405-arm_coresight_path_power_management_improvement-v10-0-13e94754a8be%40arm.com?part=5
* Re: [PATCH v7 0/4] PCI: Add support for resetting the Root Ports in a platform specific way
From: Manivannan Sadhasivam @ 2026-04-13 16:35 UTC (permalink / raw)
To: Brian Norris, Hongxing Zhu
Cc: Hongxing Zhu, manivannan.sadhasivam@oss.qualcomm.com,
Bjorn Helgaas, Mahesh J Salgaonkar, Oliver O'Halloran,
Will Deacon, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, Heiko Stuebner, Philipp Zabel,
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org,
linux-arm-kernel@lists.infradead.org,
linux-arm-msm@vger.kernel.org, linux-rockchip@lists.infradead.org,
Niklas Cassel, Wilfred Mallawa, Krishna Chaitanya Chundru,
Lukas Wunner, Wilson Ding, Miles Chen
In-Reply-To: <adcHylFjFjhHT-tP@google.com>
Hi Brian,
On Wed, Apr 08, 2026 at 06:58:34PM -0700, Brian Norris wrote:
> Hi Richard and Mani,
>
> For the record, I've been using a form of an earlier version of this
> patchset in my environment for some time now, and I've run across
> problems that *might* relate to what Richard is reporting, but I'm not
> quite sure at the moment. Details below.
>
> On Wed, Mar 25, 2026 at 07:06:49AM +0000, Hongxing Zhu wrote:
> > Hi Mani:
> > I've accidentally encountered a new issue based on the reset root port patch-set.
> > After performing a few hot-reset operations, the PCIe link enters a continuous up/down cycling pattern.
> >
> > I found that calling pci_reset_secondary_bus() first in pcibios_reset_secondary_bus() appears to resolve this issue.
> > Have you experienced a similar problem?
> >
> > "
> > ...
> > [ 141.897701] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
> > [ 142.086341] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
> > [ 142.092038] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000c00) link down detected
> > ...
> > "
> >
> > Platform: i.MX95 EVK board plus local Root Ports reset supports based on the #1 and #2 patches of v7 patch-set.
> > Notes of the logs:
> > - One Gen3 NVME device is connected.
> > - "./memtool 4c341058=0;./memtool 4c341058=1;" is used to toggle the LTSSM_EN bit to trigger the link down.
> > - Toggle BIT6 of Bridge Control Register to trigger hot reset by "./memtool 4c30003c=004001ff; ./memtool 4c30003c=000001ff;"
> > - The Root Port reset patches works correctly at first.
> > However, after several hot-reset triggers, the link enters a repeated down/up cycling state.
> >
> > Logs:
> > [ 3.553188] imx6q-pcie 4c300000.pcie: host bridge /soc/pcie@4c300000 ranges:
> > [ 3.560308] imx6q-pcie 4c300000.pcie: IO 0x006ff00000..0x006fffffff -> 0x0000000000
> > [ 3.568525] imx6q-pcie 4c300000.pcie: MEM 0x0910000000..0x091fffffff -> 0x0010000000
> > [ 3.577314] imx6q-pcie 4c300000.pcie: config reg[1] 0x60100000 == cpu 0x60100000
> > [ 3.796029] imx6q-pcie 4c300000.pcie: iATU: unroll T, 128 ob, 128 ib, align 4K, limit 1024G
> > [ 4.003746] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
> > [ 4.009553] imx6q-pcie 4c300000.pcie: PCI host bridge to bus 0000:00
> > root@imx95evk:~#
> > root@imx95evk:~#
> > root@imx95evk:~# ./memtool 4c341058=0;./memtool 4c341058=1; Writing 32-bit value 0x0 to address 0x4C341058
> > Writing 32-bit v
> > [ 87.265348] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000d01) link down detected
> > alue 0x1 to adder
> > [ 87.273106] imx6q-pcie 4c300000.pcie: Stop root bus and handle link down
> > ss 0x4C341058
> > [ 87.281264] pcieport 0000:00:00.0: Recovering Root Port due to Link Down
> > [ 87.289245] pci 0000:01:00.0: AER: can't recover (no error_detected callback)
> > root@imx95evk:~# [ 87.514216] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
> > [ 87.702968] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
> > [ 87.834983] pcieport 0000:00:00.0: Root Port has been reset
> > [ 87.840714] pcieport 0000:00:00.0: AER: device recovery failed
> > [ 87.846592] imx6q-pcie 4c300000.pcie: Rescan bus after link up is detected
> > [ 87.855947] pcieport 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
>
> I've seen this same line ("bridge configuration invalid") before, and I
> believe that's because the saved state (pci_save_state(); more about
> this below) is invalid -- it contains 0 values in places where they
> should be non-zero. So when those values are restored
> (pci_restore_state()), we get confused.
>
> I believe we've pinned down one reason this invalid state occurs -- it's
> because of an automatic (mis)feature in the DesignWare PCIe hardware.
> Specifically, it's because of what the controller does during a surprise
> link-down error.
>
> From the Designware docs:
>
> "[...] during normal operation, the link might fail and go down. After
> this link-down event, the controller requests the DWC_pcie_clkrst.v
> module to hot-reset the controller. There is no difference in the
> handling of a link-down reset or a hot reset; the controller asserts
> the link_req_rst_not output requesting the DWC_pcie_clkrst.v module to
> reset the controller."
>
> In some of the adjacent documentation (and confirmed in local testing),
> it suggests that this automatic reset will also reset various DBI (i.e.,
> PCIe config space) registers. It also seems as if there's not really a
> good way to completely stop this automatic reset -- the docs mention
> some SW methods prevent the reset, but they all seem racy or incomplete.
>
> Anyway, I think this implies that patch 1 is somewhat wrong [1]. It
> includes some code like this:
>
> pci_save_state(dev);
> ret = host->reset_root_port(host, dev);
> if (ret)
> pci_err(dev, "Failed to reset Root Port: %d\n", ret);
> else
> /* Now restore it on success */
> pci_restore_state(dev);
>
> That first line (pci_save_state()) is prone to saving invalid state,
> depending on whether the link-down event has finished flushing and
> resetting the controller yet or not. The resulting impact is a bit hard
> to judge, depending on what (mis)configuration you end up with.
>
Thanks a lot for your investigation. I think your observation makes sense, and
saving an already corrupted state could well be the culprit. Even on non-DWC
controllers, there is no guarantee that the Root Port config register state
will be preserved after a Link Down (LDn), before the Root Port reset.
> I also noticed commit a2f1e22390ac ("PCI/ERR: Ensure error
> recoverability at all times") was merged recently. With that change, I
> believe it is now safe to perform pci_restore_state() even without
> pci_save_state() here.
>
> So ... can we remove pci_save_state() from
> pcibios_reset_secondary_bus()? Might that help?
I think so. I will also test it locally and report back soon.
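To make the suspected failure mode concrete, here is a small user-space
model. Everything in it is hypothetical (the fake_* names are not the PCI
core API). It assumes a known-good state was saved at enumeration time and
that the link-down event zeroes the live registers before recovery runs,
as Brian describes for the DWC hardware.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of bridge config registers. */
struct fake_bridge_cfg {
	uint8_t primary_bus;
	uint8_t secondary_bus;
	uint8_t subordinate_bus;
};

struct fake_root_port {
	struct fake_bridge_cfg hw;	/* live "hardware" registers */
	struct fake_bridge_cfg saved;	/* last saved copy */
};

static void fake_save_state(struct fake_root_port *rp)
{
	rp->saved = rp->hw;
}

static void fake_restore_state(struct fake_root_port *rp)
{
	rp->hw = rp->saved;
}

/* Models a controller reset that clears the config registers. */
static void fake_hw_reset(struct fake_root_port *rp)
{
	memset(&rp->hw, 0, sizeof(rp->hw));
}

/* Save immediately before the reset: may capture zeroed registers. */
static void recover_with_late_save(struct fake_root_port *rp)
{
	fake_save_state(rp);
	fake_hw_reset(rp);
	fake_restore_state(rp);
}

/* Rely on the earlier known-good save instead. */
static void recover_without_save(struct fake_root_port *rp)
{
	fake_hw_reset(rp);
	fake_restore_state(rp);
}
```

In this model, if the registers are already zero when recovery starts, the
late save overwrites the good copy and the restore writes zeros back, which
would be consistent with the "[bus 00-00]" message in the log. Skipping the
save restores the enumeration-time values.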
> It sounds like my above
> observations *may* match Richard's reports, but I'm not sure. And
> anyway, the documented hardware behavior is racy, so it's hard to
> propose a foolproof solution.
>
@Richard: Can you confirm if removing 'pci_save_state(dev);' from
pcibios_reset_secondary_bus() fixes your issue?
- Mani
--
மணிவண்ணன் சதாசிவம்
* Re: [PATCH 06/11] Drivers: hv: Make sint vector architecture neutral in MSHV_VTL
From: Naman Jain @ 2026-04-13 16:51 UTC (permalink / raw)
To: Michael Kelley, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li, Catalin Marinas, Will Deacon,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
x86@kernel.org, H . Peter Anvin, Arnd Bergmann, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Alexandre Ghiti
Cc: Marc Zyngier, Timothy Hayes, Lorenzo Pieralisi, mrigendrachaubey,
ssengar@linux.microsoft.com, linux-hyperv@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
linux-riscv@lists.infradead.org
In-Reply-To: <SN6PR02MB4157DA00A31F0BA8585B9B69D4242@SN6PR02MB4157.namprd02.prod.outlook.com>
On 4/13/2026 9:19 PM, Michael Kelley wrote:
> From: Naman Jain <namjain@linux.microsoft.com> Sent: Monday, April 13, 2026 4:48 AM
>>
>> On 4/1/2026 10:27 PM, Michael Kelley wrote:
>>> From: Naman Jain <namjain@linux.microsoft.com> Sent: Monday, March 16, 2026 5:13 AM
>>>>
>>>> Generalize Synthetic interrupt source vector (sint) to use
>>>> vmbus_interrupt variable instead, which automatically takes care of
>>>> architectures where HYPERVISOR_CALLBACK_VECTOR is not present (arm64).
>>>
>>> Sashiko AI raised an interesting question about the startup timing --
>>> whether the vmbus_platform_driver_probe() is guaranteed to have
>>> set vmbus_interrupt before the VTL functions below run and use it.
>>> What causes the mshv_vtl.ko module to be loaded, and hence run
>>> mshv_vtl_init()?
>>
>> There is no race condition here. The init ordering guarantees that
>> vmbus_interrupt is always set before mshv_vtl_synic_enable_regs()
>> reads it.
>>
>> The call chain for setting vmbus_interrupt:
>>
>> subsys_initcall(hv_acpi_init) [level 4]
>> -> platform_driver_register(&vmbus_platform_driver) and so on.
>>
>>
>> The call chain for reading vmbus_interrupt:
>>
>> module_init(mshv_vtl_init) [level 6]
>> -> hv_vtl_setup_synic()
>> -> cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, ..., mshv_vtl_alloc_context, ...)
>> -> mshv_vtl_alloc_context()
>> -> mshv_vtl_synic_enable_regs()
>> -> sint.vector = vmbus_interrupt
>>
>> do_initcalls() processes sections in order 0 through 7, so
>> hv_acpi_init() (level 4) is guaranteed to complete before
>> mshv_vtl_init() (level 6) runs.
>>
>
> I think the situation is more complex than what you describe, depending
> on whether the VMBus driver and/or MSHV_VTL are built as modules vs.
> being built-in to the kernel image. In include/linux/module.h, see the
> comment for module_init() and how subsys_initcall() is mapped
> to module_init() when built as a module.
>
> If both are built-in, then what you describe is correct. But if either or
> both are modules, then the respective init functions (hv_acpi_init
> and mshv_vtl_init) get called at the time the module is loaded, and
> not by do_initcalls(). I think hv_vmbus.ko gets loaded when an attempt
> is first made to access a disk, but I would need to look more closely to
> be sure. I don't have any understanding of what causes mshv_vtl.ko
> to be loaded. And what is the ordering if MSHV_VTL is built-in while
> VMBus is built as a module, or vice versa?
>
> Michael
>
Based on this, I still feel that this race is not possible.
hv_vmbus  mshv_vtl
   y         y      -> different initcall levels, no issues
   y         m      -> use without initialization is not possible
   m         y      -> config dependencies take care of this, and mshv_vtl
                       is forced to compile as a module in this case.
   m         m      -> config and symbol dependencies should take care of
                       it. mshv_vtl has symbol and config dependencies on
                       hv_vmbus, and it won't allow loading mshv_vtl if
                       hv_vmbus module is not loaded.

Relevant code here: kernel/module/main.c
Regards,
Naman
* Re: [PATCH 07/11] arch: arm64: Add support for mshv_vtl_return_call
From: Naman Jain @ 2026-04-13 16:52 UTC (permalink / raw)
To: Michael Kelley, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li, Catalin Marinas, Will Deacon,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
x86@kernel.org, H . Peter Anvin, Arnd Bergmann, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Alexandre Ghiti
Cc: Marc Zyngier, Timothy Hayes, Lorenzo Pieralisi, mrigendrachaubey,
ssengar@linux.microsoft.com, linux-hyperv@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
linux-riscv@lists.infradead.org
In-Reply-To: <SN6PR02MB4157D3C4F6F376C8D6C3D234D450A@SN6PR02MB4157.namprd02.prod.outlook.com>
On 4/1/2026 10:27 PM, Michael Kelley wrote:
> From: Naman Jain <namjain@linux.microsoft.com> Sent: Monday, March 16, 2026 5:13 AM
>>
>
> Nit: For historical consistency, use "arm64: hyperv:" as the prefix in the patch Subject.
Acked.
>
>> Add support for arm64 specific variant of mshv_vtl_return_call function
>> to be able to add support for arm64 in MSHV_VTL driver. This would
>> help enable the transition between Virtual Trust Levels (VTL) in
>> MSHV_VTL when the kernel acts as a paravisor.
>
> This commit message has a fair number of "filler" words. Suggest simplifying to:
>
> Add the arm64 variant of mshv_vtl_return_call() to support the MSHV_VTL
> driver on arm64. This function enables the transition between Virtual Trust
> Levels (VTLs) in MSHV_VTL when the kernel acts as a paravisor.
>
I can see the difference clearly :-) Will use this in commit message.
>>
>> Signed-off-by: Roman Kisel <romank@linux.microsoft.com>
>> Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
>> ---
>> arch/arm64/hyperv/Makefile | 1 +
>> arch/arm64/hyperv/hv_vtl.c | 144 ++++++++++++++++++++++++++++++
>> arch/arm64/include/asm/mshyperv.h | 13 +++
>> 3 files changed, 158 insertions(+)
>> create mode 100644 arch/arm64/hyperv/hv_vtl.c
>>
>> diff --git a/arch/arm64/hyperv/Makefile b/arch/arm64/hyperv/Makefile
>> index 87c31c001da9..9701a837a6e1 100644
>> --- a/arch/arm64/hyperv/Makefile
>> +++ b/arch/arm64/hyperv/Makefile
>> @@ -1,2 +1,3 @@
>> # SPDX-License-Identifier: GPL-2.0
>> obj-y := hv_core.o mshyperv.o
>> +obj-$(CONFIG_HYPERV_VTL_MODE) += hv_vtl.o
>> diff --git a/arch/arm64/hyperv/hv_vtl.c b/arch/arm64/hyperv/hv_vtl.c
>> new file mode 100644
>> index 000000000000..66318672c242
>> --- /dev/null
>> +++ b/arch/arm64/hyperv/hv_vtl.c
>> @@ -0,0 +1,144 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (C) 2026, Microsoft, Inc.
>> + *
>> + * Authors:
>> + * Roman Kisel <romank@linux.microsoft.com>
>> + * Naman Jain <namjain@linux.microsoft.com>
>> + */
>> +
>> +#include <asm/boot.h>
>> +#include <asm/mshyperv.h>
>> +#include <asm/cpu_ops.h>
>> +
>> +void mshv_vtl_return_call(struct mshv_vtl_cpu_context *vtl0)
>> +{
>> + u64 base_ptr = (u64)vtl0->x;
>> +
>> + /*
>> + * VTL switch for ARM64 platform - managing VTL0's CPU context.
>> + * We explicitly use the stack to save the base pointer, and use x16
>> + * as our working register for accessing the context structure.
>> + *
>> + * Register Handling:
>> + * - X0-X17: Saved/restored (general-purpose, shared for VTL communication)
>> + * - X18: NOT touched - hypervisor-managed per-VTL (platform register)
>> + * - X19-X30: Saved/restored (part of VTL0's execution context)
>> + * - Q0-Q31: Saved/restored (128-bit NEON/floating-point registers, shared)
>> + * - SP: Not in structure, hypervisor-managed per-VTL
>> + *
>> + * Note: X29 (FP) and X30 (LR) are in the structure and must be saved/restored
>> + * as part of VTL0's complete execution state.
>
> Could drop "Note:". That's what comments are. :-)
Acked.
>
>
>> + */
>> + asm __volatile__ (
>> + /* Save base pointer to stack explicitly, then load into x16 */
>> + "str %0, [sp, #-16]!\n\t" /* Push base pointer onto stack */
>> + "mov x16, %0\n\t" /* Load base pointer into x16 */
>> + /* Volatile registers (Windows ARM64 ABI: x0-x15) */
>> + "ldp x0, x1, [x16]\n\t"
>> + "ldp x2, x3, [x16, #(2*8)]\n\t"
>
> On the x86 side, there's machinery to generate a series of constants that are
> the offsets of the individual fields in struct mshv_vtl_cpu_context. The x86
> asm code then uses these constants. Here on the arm64 side, you've calculated
> the offsets directly in the asm code. Any reason for the difference? I can see
> it's fairly easy to eyeball the offsets here and compare against the registers
> that are being loaded, where it's not so easy the way registers are named
> on x86. So maybe the additional machinery that's helpful on the x86 side
> is less necessary here. Just wondering ....
There were complexities around static calls etc. on x86 which led to that
redesign. Here things are much simpler and the offsets we see map 1:1
to registers. I still tried to prototype it, but it looked
more complex than it has to be. I think we can keep it in its current form.
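One lightweight middle ground, sketched here in user space and not part of
the patch, would be to pin the hand-written offsets with compile-time
checks. The struct mirrors the layout from the quoted patch; the X_OFF and
Q_OFF macros are hypothetical names for the expressions used in the asm
operands, and __uint128_t assumes a GCC/Clang 64-bit target.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space mirror of the struct layout from the quoted patch. */
struct mshv_vtl_cpu_context {
	uint64_t x[31];
	uint64_t rsvd;
	__uint128_t q[32];
};

/* Hypothetical helpers matching the offset expressions in the asm. */
#define X_OFF(n)	((n) * 8)
#define Q_OFF(n)	(32 * 8 + (n) * 16)

/* Spot-check a few of the offsets used by the ldp/stp instructions. */
_Static_assert(offsetof(struct mshv_vtl_cpu_context, x[0]) == X_OFF(0),
	       "x0 offset");
_Static_assert(offsetof(struct mshv_vtl_cpu_context, x[17]) == X_OFF(17),
	       "x17 offset");
_Static_assert(offsetof(struct mshv_vtl_cpu_context, x[29]) == X_OFF(29),
	       "x29 offset");
_Static_assert(offsetof(struct mshv_vtl_cpu_context, q[0]) == Q_OFF(0),
	       "q0 offset");
_Static_assert(offsetof(struct mshv_vtl_cpu_context, q[30]) == Q_OFF(30),
	       "q30 offset");
```

If a field were ever inserted before q[], the build would fail instead of
the asm silently loading from the wrong slots.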
>
>> + "ldp x4, x5, [x16, #(4*8)]\n\t"
>> + "ldp x6, x7, [x16, #(6*8)]\n\t"
>> + "ldp x8, x9, [x16, #(8*8)]\n\t"
>> + "ldp x10, x11, [x16, #(10*8)]\n\t"
>> + "ldp x12, x13, [x16, #(12*8)]\n\t"
>> + "ldp x14, x15, [x16, #(14*8)]\n\t"
>> + /* x16 will be loaded last, after saving base pointer */
>> + "ldr x17, [x16, #(17*8)]\n\t"
>> + /* x18 is hypervisor-managed per-VTL - DO NOT LOAD */
>> +
>> + /* General-purpose registers: x19-x30 */
>> + "ldp x19, x20, [x16, #(19*8)]\n\t"
>> + "ldp x21, x22, [x16, #(21*8)]\n\t"
>> + "ldp x23, x24, [x16, #(23*8)]\n\t"
>> + "ldp x25, x26, [x16, #(25*8)]\n\t"
>> + "ldp x27, x28, [x16, #(27*8)]\n\t"
>> +
>> + /* Frame pointer and link register */
>> + "ldp x29, x30, [x16, #(29*8)]\n\t"
>> +
>> + /* Shared NEON/FP registers: Q0-Q31 (128-bit) */
>> + "ldp q0, q1, [x16, #(32*8)]\n\t"
>> + "ldp q2, q3, [x16, #(32*8 + 2*16)]\n\t"
>> + "ldp q4, q5, [x16, #(32*8 + 4*16)]\n\t"
>> + "ldp q6, q7, [x16, #(32*8 + 6*16)]\n\t"
>> + "ldp q8, q9, [x16, #(32*8 + 8*16)]\n\t"
>> + "ldp q10, q11, [x16, #(32*8 + 10*16)]\n\t"
>> + "ldp q12, q13, [x16, #(32*8 + 12*16)]\n\t"
>> + "ldp q14, q15, [x16, #(32*8 + 14*16)]\n\t"
>> + "ldp q16, q17, [x16, #(32*8 + 16*16)]\n\t"
>> + "ldp q18, q19, [x16, #(32*8 + 18*16)]\n\t"
>> + "ldp q20, q21, [x16, #(32*8 + 20*16)]\n\t"
>> + "ldp q22, q23, [x16, #(32*8 + 22*16)]\n\t"
>> + "ldp q24, q25, [x16, #(32*8 + 24*16)]\n\t"
>> + "ldp q26, q27, [x16, #(32*8 + 26*16)]\n\t"
>> + "ldp q28, q29, [x16, #(32*8 + 28*16)]\n\t"
>> + "ldp q30, q31, [x16, #(32*8 + 30*16)]\n\t"
>> +
>> + /* Now load x16 itself */
>> + "ldr x16, [x16, #(16*8)]\n\t"
>> +
>> + /* Return to the lower VTL */
>> + "hvc #3\n\t"
>> +
>> + /* Save context after return - reload base pointer from stack */
>> + "stp x16, x17, [sp, #-16]!\n\t" /* Save x16, x17 temporarily */
>> + "ldr x16, [sp, #16]\n\t" /* Reload base pointer (skip saved x16,x17) */
>> +
>> + /* Volatile registers */
>> + "stp x0, x1, [x16]\n\t"
>> + "stp x2, x3, [x16, #(2*8)]\n\t"
>> + "stp x4, x5, [x16, #(4*8)]\n\t"
>> + "stp x6, x7, [x16, #(6*8)]\n\t"
>> + "stp x8, x9, [x16, #(8*8)]\n\t"
>> + "stp x10, x11, [x16, #(10*8)]\n\t"
>> + "stp x12, x13, [x16, #(12*8)]\n\t"
>> + "stp x14, x15, [x16, #(14*8)]\n\t"
>> + "ldp x0, x1, [sp], #16\n\t" /* Recover saved x16, x17 */
>> + "stp x0, x1, [x16, #(16*8)]\n\t"
>> + /* x18 is hypervisor-managed - DO NOT SAVE */
>> +
>> + /* General-purpose registers: x19-x30 */
>> + "stp x19, x20, [x16, #(19*8)]\n\t"
>> + "stp x21, x22, [x16, #(21*8)]\n\t"
>> + "stp x23, x24, [x16, #(23*8)]\n\t"
>> + "stp x25, x26, [x16, #(25*8)]\n\t"
>> + "stp x27, x28, [x16, #(27*8)]\n\t"
>> + "stp x29, x30, [x16, #(29*8)]\n\t" /* Frame pointer and link register */
>> +
>> + /* Shared NEON/FP registers: Q0-Q31 (128-bit) */
>> + "stp q0, q1, [x16, #(32*8)]\n\t"
>> + "stp q2, q3, [x16, #(32*8 + 2*16)]\n\t"
>> + "stp q4, q5, [x16, #(32*8 + 4*16)]\n\t"
>> + "stp q6, q7, [x16, #(32*8 + 6*16)]\n\t"
>> + "stp q8, q9, [x16, #(32*8 + 8*16)]\n\t"
>> + "stp q10, q11, [x16, #(32*8 + 10*16)]\n\t"
>> + "stp q12, q13, [x16, #(32*8 + 12*16)]\n\t"
>> + "stp q14, q15, [x16, #(32*8 + 14*16)]\n\t"
>> + "stp q16, q17, [x16, #(32*8 + 16*16)]\n\t"
>> + "stp q18, q19, [x16, #(32*8 + 18*16)]\n\t"
>> + "stp q20, q21, [x16, #(32*8 + 20*16)]\n\t"
>> + "stp q22, q23, [x16, #(32*8 + 22*16)]\n\t"
>> + "stp q24, q25, [x16, #(32*8 + 24*16)]\n\t"
>> + "stp q26, q27, [x16, #(32*8 + 26*16)]\n\t"
>> + "stp q28, q29, [x16, #(32*8 + 28*16)]\n\t"
>> + "stp q30, q31, [x16, #(32*8 + 30*16)]\n\t"
>> +
>> + /* Clean up stack - pop base pointer */
>> + "add sp, sp, #16\n\t"
>> +
>> + : /* No outputs */
>> + : /* Input */ "r"(base_ptr)
>> + : /* Clobber list - x16 used as base, x18 is hypervisor-managed (not touched) */
>> + "memory", "cc",
>> + "x0", "x1", "x2", "x3", "x4", "x5",
>> + "x6", "x7", "x8", "x9", "x10", "x11", "x12", "x13",
>> + "x14", "x15", "x16", "x17", "x19", "x20", "x21",
>> + "x22", "x23", "x24", "x25", "x26", "x27", "x28",
>> + "x29", "x30",
>> + "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7",
>> + "v8", "v9", "v10", "v11", "v12", "v13", "v14", "v15",
>> + "v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23",
>> + "v24", "v25", "v26", "v27", "v28", "v29", "v30", "v31");
>> +}
>> +EXPORT_SYMBOL(mshv_vtl_return_call);
>> diff --git a/arch/arm64/include/asm/mshyperv.h
>> b/arch/arm64/include/asm/mshyperv.h
>> index 804068e0941b..de7f3a41a8ea 100644
>> --- a/arch/arm64/include/asm/mshyperv.h
>> +++ b/arch/arm64/include/asm/mshyperv.h
>> @@ -60,6 +60,17 @@ static inline u64 hv_get_non_nested_msr(unsigned int reg)
>> ARM_SMCCC_SMC_64, \
>> ARM_SMCCC_OWNER_VENDOR_HYP, \
>> HV_SMCCC_FUNC_NUMBER)
>> +
>> +struct mshv_vtl_cpu_context {
>> +/*
>> + * NOTE: x18 is managed by the hypervisor. It won't be reloaded from this array.
>> + * It is included here for convenience in the common case.
>
> I'm not getting your point in this last sentence. What is the "common case"?
>
This was really odd :-) I should have spotted it. I'll change it to:
It is included here for convenience in array indexing.
> You could also drop the "NOTE: " prefix.
Acked.
>
>> + */
>> + __u64 x[31];
>> + __u64 rsvd;
>> + __uint128_t q[32];
>> +};
>
> struct mshv_vtl_run reserves 1024 bytes for cpu_context. It would be nice to
> have a compile-time check that the size of struct mshv_vtl_cpu_context fits in
> that 1024 bytes. That check might be better added where struct mshv_vtl_run
> is defined so that it works for both x86 and arm64.
Acked, will add it.
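A minimal user-space sketch of what such a check could look like. The
1024-byte figure comes from the review comment above; the macro name is
made up here, and in the kernel the check would more likely be spelled
with static_assert() or BUILD_BUG_ON() next to struct mshv_vtl_run.
__uint128_t assumes a GCC/Clang 64-bit target.

```c
#include <stdint.h>

/* Hypothetical name for the space struct mshv_vtl_run reserves. */
#define MSHV_VTL_CPU_CONTEXT_RESERVED	1024

/* User-space mirror of the arm64 context struct from the quoted patch. */
struct mshv_vtl_cpu_context {
	uint64_t x[31];
	uint64_t rsvd;
	__uint128_t q[32];
};

/* Fails the build if the context outgrows the reserved area. */
_Static_assert(sizeof(struct mshv_vtl_cpu_context) <=
	       MSHV_VTL_CPU_CONTEXT_RESERVED,
	       "mshv_vtl_cpu_context does not fit in the reserved space");
```

With this layout the arm64 context is 32 * 8 + 32 * 16 = 768 bytes, so it
fits with room to spare.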
>
>> +
>> #ifdef CONFIG_HYPERV_VTL_MODE
>> /*
>> * Get/Set the register. If the function returns `1`, that must be done via
>> @@ -69,6 +80,8 @@ static inline int hv_vtl_get_set_reg(struct hv_register_assoc *regs, bool set, u
>> {
>> return 1;
>> }
>> +
>> +void mshv_vtl_return_call(struct mshv_vtl_cpu_context *vtl0);
>
> This declaration is now duplicated in mshyperv.h under arch/arm64 and under
> arch/x86. Instead, it should be added to asm-generic/mshyperv.h, and
> removed from the arch/x86 mshyperv.h, so that there's only a single
> instance of the declaration.
Acked.
>
>> #endif
>>
>> #include <asm-generic/mshyperv.h>
>> --
>> 2.43.0
>>
Regards,
Naman
^ permalink raw reply
* [PATCH RFC] ACPI: processor: idle: Do not propagate acpi_processor_ffh_lpi_probe() -ENODEV
From: Breno Leitao @ 2026-04-13 16:54 UTC (permalink / raw)
To: Rafael J. Wysocki, Len Brown, Huisong Li, lpieralisi,
catalin.marinas, will
Cc: Rafael J. Wysocki, linux-acpi, linux-kernel, pjaroszynski,
guohanjun, sudeep.holla, linux-arm-kernel, rmikey, kernel-team,
Breno Leitao
Commit cac173bea57d ("ACPI: processor: idle: Rework the handling of
acpi_processor_ffh_lpi_probe()") moved the acpi_processor_ffh_lpi_probe()
call from acpi_processor_setup_cpuidle_dev(), where its return value was
ignored, to acpi_processor_get_power_info(), where it is treated as a
hard failure. This causes cpuidle setup to fail entirely for all CPUs on
platforms where the FFH LPI probe returns an error.
On NVIDIA Grace (aarch64) systems with PSCIv1.1, the probe fails for all
72 CPUs with -ENODEV because psci_acpi_cpu_init_idle() finds
power.count - 1 <= 0 (power.count=1). This results in no cpuidle states
registered for any CPU, forcing them to busy-poll when idle instead of
entering low-power states.
The -ENODEV error simply means no deep PSCI idle states are available
beyond WFI, which is a normal condition. Do not propagate -ENODEV and
downgrade its message to pr_debug, while still propagating other errors
that may indicate real problems.
Fixes: cac173bea57d ("ACPI: processor: idle: Rework the handling of acpi_processor_ffh_lpi_probe()")
Signed-off-by: Breno Leitao <leitao@debian.org>
---
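The return-value policy described above can be modeled in isolation. This is
a minimal userspace sketch, not kernel code: filter_ffh_probe_result() is a
hypothetical helper that only mirrors how the hunk below treats the probe
result, without the logging.

```c
#include <errno.h>

/*
 * Hypothetical helper (not a kernel function): maps the FFH LPI probe
 * result to the value acpi_processor_get_power_info() would return.
 */
static int filter_ffh_probe_result(int ret)
{
	if (ret == -ENODEV)
		return 0;	/* no deep idle states beyond WFI: not a failure */

	return ret;		/* success (0) and all other errors pass through */
}
```

Only -ENODEV is swallowed; an error such as -EINVAL still fails the cpuidle
setup, as it does today.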
drivers/acpi/processor_idle.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index ee5facccbe10c..7b6f7730ec63d 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -1258,8 +1258,13 @@ static int acpi_processor_get_power_info(struct acpi_processor *pr)
if (pr->flags.has_lpi) {
ret = acpi_processor_ffh_lpi_probe(pr->id);
- if (ret)
+ if (ret == -ENODEV) {
+ pr_debug("CPU%u: FFH LPI probe failed, err = %d, power.count = %d\n",
+ pr->id, ret, pr->power.count);
+ ret = 0;
+ } else if (ret) {
pr_err("CPU%u: Invalid FFH LPI data\n", pr->id);
+ }
}
return ret;
---
base-commit: 66672af7a095d89f082c5327f3b15bc2f93d558e
change-id: 20260413-ffh-93f68b2f46a3
Best regards,
--
Breno Leitao <leitao@debian.org>
^ permalink raw reply related
* [PATCH v5 0/2] perf: marvell: Add CN20K DDR PMU support
From: Geetha sowjanya @ 2026-04-13 16:56 UTC (permalink / raw)
To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
Cc: mark.rutland, will, krzk+dt
This series adds support for the DDR Performance Monitoring Unit (PMU)
present in Marvell CN20K SoCs.
The DDR PMU is part of the DRAM Subsystem (DSS) and provides hardware
counters to monitor DDR traffic and performance events. The block
implements eight programmable counters and two fixed-function counters
tracking DDR read and write activity, and is accessed via a dedicated
MMIO region.
CN20K is the successor to CN10K, and the DDR PMU hardware is functionally
equivalent to the CN10K implementation, with only minor differences in
register offsets and event mappings. To allow software to distinguish
between the two silicon variants, this series introduces a specific
"marvell,cn20k-ddr-pmu" compatible and extends the existing
marvell_cn10k_ddr_pmu driver to handle CN20K via variant-specific data.
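As a rough illustration of what "variant-specific data" means here, the
userspace sketch below maps a compatible string to per-variant data. The
flag values, compatible strings and cfg_base offsets are taken from patch 2;
struct variant_data, match_table and lookup_variant() are hypothetical names
used only for this example, and the real driver relies on the OF/ACPI match
tables instead.

```c
#include <stddef.h>
#include <string.h>

#define BIT(n)		(1U << (n))
#define IS_CN10K	BIT(0)
#define IS_ODY		BIT(1)
#define IS_CN20K	BIT(2)

/* Hypothetical, cut-down stand-in for struct ddr_pmu_platform_data. */
struct variant_data {
	unsigned int silicon_flags;
	unsigned long cfg_base;
};

static const struct variant_data cn10k_data = { IS_CN10K, 0x8040 };
static const struct variant_data cn20k_data = { IS_CN20K, 0x20140 };

struct match_entry {
	const char *compatible;
	const struct variant_data *data;
};

static const struct match_entry match_table[] = {
	{ "marvell,cn10k-ddr-pmu", &cn10k_data },
	{ "marvell,cn20k-ddr-pmu", &cn20k_data },
};

/* Return the data for a compatible string, or NULL if it is unknown. */
static const struct variant_data *lookup_variant(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(match_table) / sizeof(match_table[0]); i++) {
		if (strcmp(match_table[i].compatible, compatible) == 0)
			return match_table[i].data;
	}

	return NULL;
}
```

Code paths that differ between variants then test silicon_flags, while
everything else is shared.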
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Changes in v4:
- Fixed document file name.
Changes in v3:
- Expanded cover letter and commit message to better describe the DDR PMU
hardware and its relationship to CN10K
- Fixed the file name.
Changes in v2:
- Fixed YAML syntax error triggered by a tab character in the examples
section, which caused dt_binding_check to fail.
Changes in v1:
- Added a description field to the binding.
- Simplified the compatible property using 'const' instead of 'items/enum'.
- Updated the example node name to include a unit-address matching the reg base.
Geetha sowjanya (2):
dt-bindings: perf: marvell: Document CN20K DDR PMU
perf: marvell: Add CN20K DDR PMU support
.../bindings/perf/marvell,cn20k-ddr-pmu.yaml | 39 ++++
drivers/perf/marvell_cn10k_ddr_pmu.c | 187 ++++++++++++++++--
2 files changed, 210 insertions(+), 16 deletions(-)
create mode 100644 Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml
--
2.25.1
^ permalink raw reply
* [PATCH v5 2/2] perf: marvell: Add CN20K DDR PMU support
From: Geetha sowjanya @ 2026-04-13 16:56 UTC (permalink / raw)
To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
Cc: mark.rutland, will, krzk+dt
In-Reply-To: <20260413165621.10921-1-gakula@marvell.com>
The CN20K DRAM Subsystem exposes eight programmable
performance counters and two fixed counters for DDR
read and write traffic. Software selects events for
the programmable counters from traffic at the DDR PHY
interface, the CHI interconnect, or inside the DDR controller.
Add CN20K register offsets, event maps, and sysfs attributes;
match the device via OF (marvell,cn20k-ddr-pmu) and ACPI (MRVL000B).
Represent the SoC variant in platform data with bit flags so
CN20K can reuse the CN10K PMU code path where appropriate.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
---
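For reviewers, the CN20K event programming in cn10k_ddr_perf_event_add() can
be modeled in userspace roughly as follows. The constants are copied from
this patch; struct cn20k_event_sel and cn20k_select_event() are hypothetical
names for this sketch, and the range check on the event id is an addition of
the sketch (the hunk below shifts by config - 1 directly, so validation, if
any, would have to happen elsewhere).

```c
#include <stdint.h>

/* Values copied from the patch. */
#define CN20K_DDRC_PERF_CFG_BASE	0x20140
#define CN20K_DDRC_PERF_CFG1_BASE	0x20180
#define EVENT_CN20K_OP_IS_ZQLATCH	21
#define EVENT_CN20K_OP_IS_ZQSTART	22

/* Hypothetical result type: which config bank to use and what to write. */
struct cn20k_event_sel {
	uint64_t cfg_base;
	uint64_t bitmap;
};

/* Returns 0 on success, -1 if the event id cannot be encoded in 64 bits. */
static int cn20k_select_event(unsigned int config, struct cn20k_event_sel *sel)
{
	if (config < 1 || config > 64)
		return -1;

	/* Event id N selects bit N - 1 of the config register. */
	sel->bitmap = 1ULL << (config - 1);

	/* ZQSTART and ZQLATCH are programmed through the second bank. */
	if (config == EVENT_CN20K_OP_IS_ZQSTART ||
	    config == EVENT_CN20K_OP_IS_ZQLATCH)
		sel->cfg_base = CN20K_DDRC_PERF_CFG1_BASE;
	else
		sel->cfg_base = CN20K_DDRC_PERF_CFG_BASE;

	return 0;
}
```

For example, event id 61 (ddr_read16) yields bit 60 in the first bank, while
event id 22 (ddr_zqstart) yields bit 21 in the second bank.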
drivers/perf/marvell_cn10k_ddr_pmu.c | 187 ++++++++++++++++++++++++---
1 file changed, 171 insertions(+), 16 deletions(-)
diff --git a/drivers/perf/marvell_cn10k_ddr_pmu.c b/drivers/perf/marvell_cn10k_ddr_pmu.c
index 72ac17efd846..7e2e1823b009 100644
--- a/drivers/perf/marvell_cn10k_ddr_pmu.c
+++ b/drivers/perf/marvell_cn10k_ddr_pmu.c
@@ -13,31 +13,43 @@
#include <linux/hrtimer.h>
#include <linux/acpi.h>
#include <linux/platform_device.h>
+#include <linux/bits.h>
+
+/* SoC variant flags for struct ddr_pmu_platform_data (mutually exclusive in pdata) */
+#define IS_CN10K BIT(0)
+#define IS_ODY BIT(1)
+#define IS_CN20K BIT(2)
/* Performance Counters Operating Mode Control Registers */
#define CN10K_DDRC_PERF_CNT_OP_MODE_CTRL 0x8020
#define ODY_DDRC_PERF_CNT_OP_MODE_CTRL 0x20020
+#define CN20K_DDRC_PERF_CNT_OP_MODE_CTRL 0x20000
#define OP_MODE_CTRL_VAL_MANUAL 0x1
/* Performance Counters Start Operation Control Registers */
#define CN10K_DDRC_PERF_CNT_START_OP_CTRL 0x8028
#define ODY_DDRC_PERF_CNT_START_OP_CTRL 0x200A0
+#define CN20K_DDRC_PERF_CNT_START_OP_CTRL 0x20080
#define START_OP_CTRL_VAL_START 0x1ULL
#define START_OP_CTRL_VAL_ACTIVE 0x2
/* Performance Counters End Operation Control Registers */
#define CN10K_DDRC_PERF_CNT_END_OP_CTRL 0x8030
#define ODY_DDRC_PERF_CNT_END_OP_CTRL 0x200E0
+#define CN20K_DDRC_PERF_CNT_END_OP_CTRL 0x200C0
#define END_OP_CTRL_VAL_END 0x1ULL
/* Performance Counters End Status Registers */
#define CN10K_DDRC_PERF_CNT_END_STATUS 0x8038
#define ODY_DDRC_PERF_CNT_END_STATUS 0x20120
+#define CN20K_DDRC_PERF_CNT_END_STATUS 0x20100
#define END_STATUS_VAL_END_TIMER_MODE_END 0x1
/* Performance Counters Configuration Registers */
#define CN10K_DDRC_PERF_CFG_BASE 0x8040
#define ODY_DDRC_PERF_CFG_BASE 0x20160
+#define CN20K_DDRC_PERF_CFG_BASE 0x20140
+#define CN20K_DDRC_PERF_CFG1_BASE 0x20180
/* 8 Generic event counter + 2 fixed event counters */
#define DDRC_PERF_NUM_GEN_COUNTERS 8
@@ -61,6 +73,23 @@
* DO NOT change these event-id numbers, they are used to
* program event bitmap in h/w.
*/
+
+/* CN20K specific events */
+#define EVENT_PERF_OP_IS_RD16 61
+#define EVENT_PERF_OP_IS_RD32 60
+#define EVENT_PERF_OP_IS_WR16 59
+#define EVENT_PERF_OP_IS_WR32 58
+#define EVENT_OP_IS_ENTER_DSM 44
+#define EVENT_OP_IS_RFM 43
+
+#define EVENT_CN20K_OP_IS_TCR_MRR 50
+#define EVENT_CN20K_OP_IS_DQSOSC_MRR 49
+#define EVENT_CN20K_OP_IS_DQSOSC_MPC 48
+#define EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_WR 47
+#define EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_RD 46
+#define EVENT_CN20K_OP_IS_ZQLATCH 21
+#define EVENT_CN20K_OP_IS_ZQSTART 22
+
#define EVENT_DFI_CMD_IS_RETRY 61
#define EVENT_RD_UC_ECC_ERROR 60
#define EVENT_RD_CRC_ERROR 59
@@ -87,6 +116,9 @@
#define EVENT_OP_IS_SPEC_REF 41
#define EVENT_OP_IS_CRIT_REF 40
#define EVENT_OP_IS_REFRESH 39
+#define EVENT_OP_IS_CAS_WCK_SUS 38
+#define EVENT_OP_IS_CAS_WS_OFF 37
+#define EVENT_OP_IS_CAS_WS 36
#define EVENT_OP_IS_ENTER_MPSM 35
#define EVENT_OP_IS_ENTER_POWERDOWN 31
#define EVENT_OP_IS_ENTER_SELFREF 27
@@ -183,8 +215,8 @@ struct ddr_pmu_platform_data {
u64 cnt_freerun_clr;
u64 cnt_value_wr_op;
u64 cnt_value_rd_op;
- bool is_cn10k;
- bool is_ody;
+ u64 cfg1_base;
+ unsigned int silicon_flags; /* IS_CN10K, IS_ODY, or IS_CN20K */
};
static ssize_t cn10k_ddr_pmu_event_show(struct device *dev,
@@ -336,6 +368,80 @@ static struct attribute *odyssey_ddr_perf_events_attrs[] = {
NULL
};
+static struct attribute *cn20k_ddr_perf_events_attrs[] = {
+ /* Programmable */
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_rd_or_wr_access, EVENT_HIF_RD_OR_WR),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_wr_access, EVENT_HIF_WR),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_rd_access, EVENT_HIF_RD),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_rmw_access, EVENT_HIF_RMW),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_hif_pri_rdaccess, EVENT_HIF_HI_PRI_RD),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_rd_bypass_access, EVENT_READ_BYPASS),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_act_bypass_access, EVENT_ACT_BYPASS),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_dfi_wr_data_access,
+ EVENT_DFI_WR_DATA_CYCLES),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_dfi_rd_data_access,
+ EVENT_DFI_RD_DATA_CYCLES),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_hpri_sched_rd_crit_access,
+ EVENT_HPR_XACT_WHEN_CRITICAL),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_lpri_sched_rd_crit_access,
+ EVENT_LPR_XACT_WHEN_CRITICAL),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_wr_trxn_crit_access,
+ EVENT_WR_XACT_WHEN_CRITICAL),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_active_access, EVENT_OP_IS_ACTIVATE),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_rd_or_wr_access,
+ EVENT_OP_IS_RD_OR_WR),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_rd_active_access,
+ EVENT_OP_IS_RD_ACTIVATE),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_read, EVENT_OP_IS_RD),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_write, EVENT_OP_IS_WR),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_cam_mwr, EVENT_OP_IS_MWR),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_precharge, EVENT_OP_IS_PRECHARGE),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_precharge_for_rdwr,
+ EVENT_PRECHARGE_FOR_RDWR),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_precharge_for_other,
+ EVENT_PRECHARGE_FOR_OTHER),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_rdwr_transitions, EVENT_RDWR_TRANSITIONS),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_write_combine, EVENT_WRITE_COMBINE),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_war_hazard, EVENT_WAR_HAZARD),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_raw_hazard, EVENT_RAW_HAZARD),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_waw_hazard, EVENT_WAW_HAZARD),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_enter_selfref, EVENT_OP_IS_ENTER_SELFREF),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_enter_powerdown,
+ EVENT_OP_IS_ENTER_POWERDOWN),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_cas_ws, EVENT_OP_IS_CAS_WS),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_cas_ws_off, EVENT_OP_IS_CAS_WS_OFF),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_cas_wck_sus, EVENT_OP_IS_CAS_WCK_SUS),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_refresh, EVENT_OP_IS_REFRESH),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_crit_ref, EVENT_OP_IS_CRIT_REF),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_spec_ref, EVENT_OP_IS_SPEC_REF),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_load_mode, EVENT_OP_IS_LOAD_MODE),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_rfm, EVENT_OP_IS_RFM),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_enter_dsm, EVENT_OP_IS_ENTER_DSM),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_dfi_cycles, EVENT_DFI_CYCLES),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_win_limit_reached_rd,
+ EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_RD),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_win_limit_reached_wr,
+ EVENT_CN20K_VISIBLE_WIN_LIMIT_REACHED_WR),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_dqsosc_mpc, EVENT_CN20K_OP_IS_DQSOSC_MPC),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_dqsosc_mrr, EVENT_CN20K_OP_IS_DQSOSC_MRR),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_tcr_mrr, EVENT_CN20K_OP_IS_TCR_MRR),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_zqstart, EVENT_CN20K_OP_IS_ZQSTART),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_zqlatch, EVENT_CN20K_OP_IS_ZQLATCH),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_read16, EVENT_PERF_OP_IS_RD16),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_read32, EVENT_PERF_OP_IS_RD32),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_write16, EVENT_PERF_OP_IS_WR16),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_write32, EVENT_PERF_OP_IS_WR32),
+ /* Free run event counters */
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_ddr_reads, EVENT_DDR_READS),
+ CN10K_DDR_PMU_EVENT_ATTR(ddr_ddr_writes, EVENT_DDR_WRITES),
+ NULL
+};
+
+static struct attribute_group cn20k_ddr_perf_events_attr_group = {
+ .name = "events",
+ .attrs = cn20k_ddr_perf_events_attrs,
+};
+
static struct attribute_group odyssey_ddr_perf_events_attr_group = {
.name = "events",
.attrs = odyssey_ddr_perf_events_attrs,
@@ -393,6 +499,13 @@ static const struct attribute_group *odyssey_attr_groups[] = {
NULL
};
+static const struct attribute_group *cn20k_attr_groups[] = {
+ &cn20k_ddr_perf_events_attr_group,
+ &cn10k_ddr_perf_format_attr_group,
+ &cn10k_ddr_perf_cpumask_attr_group,
+ NULL
+};
+
/* Default poll timeout is 100 sec, which is very sufficient for
* 48 bit counter incremented max at 5.6 GT/s, which may take many
* hours to overflow.
@@ -412,7 +525,7 @@ static int ddr_perf_get_event_bitmap(int eventid, u64 *event_bitmap,
switch (eventid) {
case EVENT_DFI_PARITY_POISON ...EVENT_DFI_CMD_IS_RETRY:
- if (!ddr_pmu->p_data->is_ody) {
+ if (!(ddr_pmu->p_data->silicon_flags & IS_ODY)) {
err = -EINVAL;
break;
}
@@ -524,9 +637,9 @@ static void cn10k_ddr_perf_counter_enable(struct cn10k_ddr_pmu *pmu,
int counter, bool enable)
{
const struct ddr_pmu_platform_data *p_data = pmu->p_data;
+ unsigned int silicon_flags = pmu->p_data->silicon_flags;
u64 ctrl_reg = pmu->p_data->cnt_op_mode_ctrl;
const struct ddr_pmu_ops *ops = pmu->ops;
- bool is_ody = pmu->p_data->is_ody;
u32 reg;
u64 val;
@@ -546,7 +659,7 @@ static void cn10k_ddr_perf_counter_enable(struct cn10k_ddr_pmu *pmu,
writeq_relaxed(val, pmu->base + reg);
- if (is_ody) {
+ if (silicon_flags & IS_ODY) {
if (enable) {
/*
* Setup the PMU counter to work in
@@ -621,6 +734,7 @@ static int cn10k_ddr_perf_event_add(struct perf_event *event, int flags)
{
struct cn10k_ddr_pmu *pmu = to_cn10k_ddr_pmu(event->pmu);
const struct ddr_pmu_platform_data *p_data = pmu->p_data;
+ unsigned int silicon_flags = pmu->p_data->silicon_flags;
const struct ddr_pmu_ops *ops = pmu->ops;
struct hw_perf_event *hwc = &event->hw;
u8 config = event->attr.config;
@@ -642,10 +756,17 @@ static int cn10k_ddr_perf_event_add(struct perf_event *event, int flags)
if (counter < DDRC_PERF_NUM_GEN_COUNTERS) {
/* Generic counters, configure event id */
reg_offset = DDRC_PERF_CFG(p_data->cfg_base, counter);
- ret = ddr_perf_get_event_bitmap(config, &val, pmu);
- if (ret)
- return ret;
+ if (silicon_flags & IS_CN20K) {
+ val = (1ULL << (config - 1));
+ if (config == EVENT_CN20K_OP_IS_ZQSTART ||
+ config == EVENT_CN20K_OP_IS_ZQLATCH)
+ reg_offset = DDRC_PERF_CFG(p_data->cfg1_base, counter);
+ } else {
+ ret = ddr_perf_get_event_bitmap(config, &val, pmu);
+ if (ret)
+ return ret;
+ }
writeq_relaxed(val, pmu->base + reg_offset);
} else {
/* fixed event counter, clear counter value */
@@ -952,7 +1073,25 @@ static const struct ddr_pmu_platform_data cn10k_ddr_pmu_pdata = {
.cnt_freerun_clr = 0,
.cnt_value_wr_op = CN10K_DDRC_PERF_CNT_VALUE_WR_OP,
.cnt_value_rd_op = CN10K_DDRC_PERF_CNT_VALUE_RD_OP,
- .is_cn10k = TRUE,
+ .silicon_flags = IS_CN10K,
+};
+
+static const struct ddr_pmu_platform_data cn20k_ddr_pmu_pdata = {
+ .counter_overflow_val = 0,
+ .counter_max_val = GENMASK_ULL(63, 0),
+ .cnt_base = ODY_DDRC_PERF_CNT_VALUE_BASE,
+ .cfg_base = CN20K_DDRC_PERF_CFG_BASE,
+ .cfg1_base = CN20K_DDRC_PERF_CFG1_BASE,
+ .cnt_op_mode_ctrl = CN20K_DDRC_PERF_CNT_OP_MODE_CTRL,
+ .cnt_start_op_ctrl = CN20K_DDRC_PERF_CNT_START_OP_CTRL,
+ .cnt_end_op_ctrl = CN20K_DDRC_PERF_CNT_END_OP_CTRL,
+ .cnt_end_status = CN20K_DDRC_PERF_CNT_END_STATUS,
+ .cnt_freerun_en = 0,
+ .cnt_freerun_ctrl = ODY_DDRC_PERF_CNT_FREERUN_CTRL,
+ .cnt_freerun_clr = ODY_DDRC_PERF_CNT_FREERUN_CLR,
+ .cnt_value_wr_op = ODY_DDRC_PERF_CNT_VALUE_WR_OP,
+ .cnt_value_rd_op = ODY_DDRC_PERF_CNT_VALUE_RD_OP,
+ .silicon_flags = IS_CN20K,
};
#endif
@@ -979,7 +1118,7 @@ static const struct ddr_pmu_platform_data odyssey_ddr_pmu_pdata = {
.cnt_freerun_clr = ODY_DDRC_PERF_CNT_FREERUN_CLR,
.cnt_value_wr_op = ODY_DDRC_PERF_CNT_VALUE_WR_OP,
.cnt_value_rd_op = ODY_DDRC_PERF_CNT_VALUE_RD_OP,
- .is_ody = TRUE,
+ .silicon_flags = IS_ODY,
};
#endif
@@ -989,8 +1128,7 @@ static int cn10k_ddr_perf_probe(struct platform_device *pdev)
struct cn10k_ddr_pmu *ddr_pmu;
struct resource *res;
void __iomem *base;
- bool is_cn10k;
- bool is_ody;
+ unsigned int silicon_flags;
char *name;
int ret;
@@ -1014,10 +1152,9 @@ static int cn10k_ddr_perf_probe(struct platform_device *pdev)
ddr_pmu->base = base;
ddr_pmu->p_data = dev_data;
- is_cn10k = ddr_pmu->p_data->is_cn10k;
- is_ody = ddr_pmu->p_data->is_ody;
+ silicon_flags = ddr_pmu->p_data->silicon_flags;
- if (is_cn10k) {
+ if (silicon_flags & IS_CN10K) {
ddr_pmu->ops = &ddr_pmu_ops;
/* Setup the PMU counter to work in manual mode */
writeq_relaxed(OP_MODE_CTRL_VAL_MANUAL, ddr_pmu->base +
@@ -1039,7 +1176,7 @@ static int cn10k_ddr_perf_probe(struct platform_device *pdev)
};
}
- if (is_ody) {
+ if (silicon_flags & IS_ODY) {
ddr_pmu->ops = &ddr_pmu_ody_ops;
ddr_pmu->pmu = (struct pmu) {
@@ -1056,6 +1193,22 @@ static int cn10k_ddr_perf_probe(struct platform_device *pdev)
};
}
+ if (silicon_flags & IS_CN20K) {
+ ddr_pmu->ops = &ddr_pmu_ody_ops;
+
+ ddr_pmu->pmu = (struct pmu) {
+ .module = THIS_MODULE,
+ .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
+ .task_ctx_nr = perf_invalid_context,
+ .attr_groups = cn20k_attr_groups,
+ .event_init = cn10k_ddr_perf_event_init,
+ .add = cn10k_ddr_perf_event_add,
+ .del = cn10k_ddr_perf_event_del,
+ .start = cn10k_ddr_perf_event_start,
+ .stop = cn10k_ddr_perf_event_stop,
+ .read = cn10k_ddr_perf_event_update,
+ };
+ }
/* Choose this cpu to collect perf data */
ddr_pmu->cpu = raw_smp_processor_id();
@@ -1098,6 +1251,7 @@ static void cn10k_ddr_perf_remove(struct platform_device *pdev)
#ifdef CONFIG_OF
static const struct of_device_id cn10k_ddr_pmu_of_match[] = {
{ .compatible = "marvell,cn10k-ddr-pmu", .data = &cn10k_ddr_pmu_pdata },
+ { .compatible = "marvell,cn20k-ddr-pmu", .data = &cn20k_ddr_pmu_pdata },
{ },
};
MODULE_DEVICE_TABLE(of, cn10k_ddr_pmu_of_match);
@@ -1107,6 +1261,7 @@ MODULE_DEVICE_TABLE(of, cn10k_ddr_pmu_of_match);
static const struct acpi_device_id cn10k_ddr_pmu_acpi_match[] = {
{"MRVL000A", (kernel_ulong_t)&cn10k_ddr_pmu_pdata },
{"MRVL000C", (kernel_ulong_t)&odyssey_ddr_pmu_pdata},
+ {"MRVL000B", (kernel_ulong_t)&cn20k_ddr_pmu_pdata},
{},
};
MODULE_DEVICE_TABLE(acpi, cn10k_ddr_pmu_acpi_match);
--
2.25.1
^ permalink raw reply related
* [PATCH v5 1/2] dt-bindings: perf: marvell: Add CN20K DDR PMU binding
From: Geetha sowjanya @ 2026-04-13 16:56 UTC (permalink / raw)
To: linux-perf-users, linux-kernel, linux-arm-kernel, devicetree
Cc: mark.rutland, will, krzk+dt
In-Reply-To: <20260413165621.10921-1-gakula@marvell.com>
Marvell CN20K SoCs integrate a DDR Performance Monitoring Unit (PMU)
associated with the DDR controller. The block provides hardware counters
to monitor DDR traffic and performance events and is accessed via a
dedicated MMIO region.
The CN20K DDR PMU is functionally equivalent to the CN10K DDR PMU, with
minor register offset differences.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
---
.../bindings/perf/marvell,cn20k-ddr-pmu.yaml | 39 +++++++++++++++++++
1 file changed, 39 insertions(+)
create mode 100644 Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml
diff --git a/Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml b/Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml
new file mode 100644
index 000000000000..cc6aa760de49
--- /dev/null
+++ b/Documentation/devicetree/bindings/perf/marvell,cn20k-ddr-pmu.yaml
@@ -0,0 +1,39 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/perf/marvell,cn20k-ddr-pmu.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Marvell CN20K DDR performance monitor
+
+description:
+ Performance Monitoring Unit (PMU) for the DDR controller
+ in Marvell CN20K SoCs.
+
+maintainers:
+ - Geetha sowjanya <gakula@marvell.com>
+
+properties:
+ compatible:
+ const: marvell,cn20k-ddr-pmu
+
+ reg:
+ maxItems: 1
+
+required:
+ - compatible
+ - reg
+
+additionalProperties: false
+
+examples:
+ - |
+ bus {
+ #address-cells = <2>;
+ #size-cells = <2>;
+
+ ddr-pmu@c20000000000 {
+ compatible = "marvell,cn20k-ddr-pmu";
+ reg = <0xc200 0x00000000 0x0 0x100000>;
+ };
+ };
--
2.25.1
^ permalink raw reply related
* Re: [PATCH bpf-next 1/2] bpf, arm64: Remove redundant bpf_flush_icache() after pack allocator finalize
From: Song Liu @ 2026-04-13 16:56 UTC (permalink / raw)
To: Puranjay Mohan
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Yonghong Song, Jiri Olsa, Xu Kuohai, Catalin Marinas, Will Deacon,
Luke Nelson, Xi Wang, Björn Töpel, Pu Lehui,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
linux-arm-kernel, linux-riscv, linux-kernel
In-Reply-To: <20260413123256.3296452-2-puranjay@kernel.org>
On Mon, Apr 13, 2026 at 5:33 AM Puranjay Mohan <puranjay@kernel.org> wrote:
>
> bpf_flush_icache() calls flush_icache_range() to clean the data cache
> and invalidate the instruction cache for the JITed code region. However,
> since commit 1dad391daef1 ("bpf, arm64: use bpf_prog_pack for memory
> management"), this flush is redundant.
>
> bpf_jit_binary_pack_finalize() copies the JITed instructions to the ROX
> region via bpf_arch_text_copy() -> aarch64_insn_copy() -> __text_poke(),
> and __text_poke() already calls flush_icache_range() on the written
> range. The subsequent bpf_flush_icache() repeats the same cache
> maintenance on an overlapping range, including an unnecessary second
> synchronous IPI to all CPUs via kick_all_cpus_sync().
>
> Remove the redundant bpf_flush_icache() call and its now-unused
> definition.
>
> Fixes: 1dad391daef1 ("bpf, arm64: use bpf_prog_pack for memory management")
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
We can now remove "#include <asm/cacheflush.h>".
Other than that,
Acked-by: Song Liu <song@kernel.org>
^ permalink raw reply