* Re: [PATCH v4 1/8] sframe: Allow kernelspace sframe sections
From: Jens Remus @ 2026-04-22 14:08 UTC (permalink / raw)
To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
Catalin Marinas, Jiri Kosina
Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
joe.lawrence, linux-toolchains, linux-kernel, live-patching,
linux-arm-kernel, Randy Dunlap, Heiko Carstens
In-Reply-To: <20260421225200.1198447-2-dylanbhatch@google.com>
On 4/22/2026 12:51 AM, Dylan Hatch wrote:
> Generalize the sframe lookup code to support kernelspace sections. This
> is done by defining a SFRAME_LOOKUP option that can be activated
> separate from HAVE_UNWIND_USER_SFRAME, as there will be other client to
> this library than just userspace unwind.
>
> Sframe section location is now tracked in a separate sec_type field to
> determine whether user-access functions are necessary to read the sframe
> data. Relevant type delarations are moved and renamed to reflect the
> non-user sframe support.
>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
With return -EFAULT changed to goto label in DATA_COPY() and DATA_GET():
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
> ---
> MAINTAINERS | 2 +-
> arch/Kconfig | 4 +
> .../{unwind_user_sframe.h => unwind_sframe.h} | 6 +-
> arch/x86/include/asm/unwind_user.h | 12 +-
> include/linux/sframe.h | 48 ++--
> include/linux/unwind_types.h | 46 +++
> include/linux/unwind_user_types.h | 41 ---
> kernel/unwind/Makefile | 2 +-
> kernel/unwind/sframe.c | 270 ++++++++++++------
> kernel/unwind/user.c | 41 +--
> 10 files changed, 293 insertions(+), 179 deletions(-)
> rename arch/x86/include/asm/{unwind_user_sframe.h => unwind_sframe.h} (50%)
> create mode 100644 include/linux/unwind_types.h
> diff --git a/include/linux/sframe.h b/include/linux/sframe.h
> +enum sframe_sec_type {
> + SFRAME_KERNEL,
> + SFRAME_USER,
> +};
> struct sframe_section {
> - struct rcu_head rcu;
> + struct rcu_head rcu;
> #ifdef CONFIG_DYNAMIC_DEBUG
> - const char *filename;
> + const char *filename;
> #endif
> - unsigned long sframe_start;
> - unsigned long sframe_end;
> - unsigned long text_start;
> - unsigned long text_end;
> -
> - unsigned long fdes_start;
> - unsigned long fres_start;
> - unsigned long fres_end;
> - unsigned int num_fdes;
> -
> - signed char ra_off;
> - signed char fp_off;
> + enum sframe_sec_type sec_type;
> + unsigned long sframe_start;
> + unsigned long sframe_end;
> + unsigned long text_start;
> + unsigned long text_end;
> +
> + unsigned long fdes_start;
> + unsigned long fres_start;
> + unsigned long fres_end;
> + unsigned int num_fdes;
> +
> + signed char ra_off;
> + signed char fp_off;
> };
> diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c
> +#define DATA_COPY(sec, to, from, size, label) \
> +({ \
> + switch (sec->sec_type) { \
> + case SFRAME_KERNEL: \
> + KERNEL_COPY(to, from, size, label); \
> + break; \
> + case SFRAME_USER: \
> + UNSAFE_USER_COPY(to, from, size, label); \
> + break; \
I wonder whether it would be worthwhile to come up with an approach
where this would get evaluated at compile time instead at run time?
Or is this overengineering? Of course such improvement could be
made later on, so no need to solve that now.
Options that came into my mind:
A) Introduce and pass through a "bool user" parameter, whose value is
specified in sframe_find_user() and sframe_find_kernel(). Due to
inlining I would expect that to get any conditions based on that
to get evaluated at compile time. See below. Downside is the
ugly additional parameter.
B) Introduce lightweight .c wrappers, e.g. sframe_kernel.c and
sframe_user.c, that define DATA_GET() and DATA_COPY() and include
sframe.c. All HAVE_UNWIND_KERNEL_SFRAME code would be moved into
sframe_kernel.c and likewise all HAVE_UNWIND_USER_SFRAME code into
sframe_user.c.
> + default: \
> + return -EFAULT; \
goto label; \
Users of DATA_COPY() do expect the macro to branch to the label in case
of an error and therefore do not evaluate any return value. The
wrapping then needs also be changed from "({ .. })" to
"do { ... } while (0)".
> + } \
> +})
> +
> +#define DATA_GET(sec, to, from, type, label) \
> +({ \
> + switch (sec->sec_type) { \
> + case SFRAME_KERNEL: \
> + KERNEL_GET(to, from, type, label); \
> + break; \
> + case SFRAME_USER: \
> + UNSAFE_USER_GET(to, from, type, label); \
> + break; \
> + default: \
> + return -EFAULT; \
Likewise.
> + } \
> +})
> +#ifdef CONFIG_HAVE_UNWIND_USER_SFRAME
> +
> +int sframe_find_user(unsigned long ip, struct unwind_frame *frame)
> +{
> + struct mm_struct *mm = current->mm;
> + struct sframe_section *sec;
> + int ret;
> +
> + if (!mm)
> + return -EINVAL;
> +
> + guard(srcu)(&sframe_srcu);
> +
> + sec = mtree_load(&mm->sframe_mt, ip);
> + if (!sec)
> + return -EINVAL;
> +
> + if (!user_read_access_begin((void __user *)sec->sframe_start,
> + sec->sframe_end - sec->sframe_start))
> + return -EFAULT;
> +
> + ret = __sframe_find(sec, ip, frame);
In sframe_find_user() sec->sec_type must be SFRAME_USER. Likewise in
sframe_find_kernel() it must be SFRAME_KERNEL. So instead of
introducing sec_type, we could add a parameter
__sframe_find(..., bool user) and do:
ret = __sframe_find(sec, ip, frame, true);
The downside is that this then requires to pass that flag through
everywhere... (see below).
> +
> + user_read_access_end();
> +
> + if (ret == -EFAULT) {
> + dbg_sec("removing bad .sframe section\n");
> + WARN_ON_ONCE(sframe_remove_section(sec->sframe_start));
> + }
> +
> + return ret;
> +}
Regards,
Jens
--
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com
IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/
^ permalink raw reply
* Re: [PATCH v4 7/8] sframe: Introduce in-kernel SFRAME_VALIDATION
From: Jens Remus @ 2026-04-22 14:11 UTC (permalink / raw)
To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
Catalin Marinas, Jiri Kosina
Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
joe.lawrence, linux-toolchains, linux-kernel, live-patching,
linux-arm-kernel, Randy Dunlap, Heiko Carstens
In-Reply-To: <20260421225200.1198447-8-dylanbhatch@google.com>
On 4/22/2026 12:51 AM, Dylan Hatch wrote:
> Generalize the __safe* helpers to support a non-user-access code path.
>
> This requires arch-specific function address validation. This is because
> arm64 vmlinux has an .rodata.text section which lies outside the bounds
> of the normal .text. It contains code that is never executed by the
> kernel mapping, but for which the toolchain nonetheless generates sframe
> data, and needs to be considered valid for a PC lookup.
>
> This arch-specific address validation logic is only necessary to support
> SFRAME_VALIDATION for the vmlinux .sframe, since these .rodata.text
> functions would never be encountered during normal unwinding.
>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> Suggested-by: Jens Remus <jremus@linux.ibm.com>
With the minor nit below fixed:
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
> ---
> arch/Kconfig | 2 +-
> arch/arm64/include/asm/sections.h | 1 +
> arch/arm64/include/asm/unwind_sframe.h | 21 +++++++++++++++++++++
> arch/arm64/kernel/vmlinux.lds.S | 2 ++
> include/linux/sframe.h | 2 ++
> kernel/unwind/sframe.c | 25 +++++++++++++++++++++++--
> 6 files changed, 50 insertions(+), 3 deletions(-)
> diff --git a/arch/Kconfig b/arch/Kconfig
> @@ -503,7 +503,7 @@ config HAVE_UNWIND_USER_SFRAME
>
> config SFRAME_VALIDATION
> bool "Enable .sframe section debugging"
> - depends on HAVE_UNWIND_USER_SFRAME
> + depends on SFRAME_LOOKUP
depends on UNWIND_SFRAME__LOOKUP
> depends on DYNAMIC_DEBUG
> help
> When adding an .sframe section for a task, validate the entire
Regards,
Jens
--
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com
IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/
^ permalink raw reply
* Re: [PATCH] net/stmmac: Fix typos: 'tx_undeflow_irq' -> 'tx_underflow_irq'
From: Jakub Raczynski @ 2026-04-22 14:15 UTC (permalink / raw)
To: Andrew Lunn
Cc: netdev, linux-kernel, kuba, davem, andrew+netdev, kernel-janitors,
linux-arm-kernel, linux-stm32
In-Reply-To: <f1d51362-ca8f-481a-b9c1-400ab6422686@lunn.ch>
[-- Attachment #1: Type: text/plain, Size: 1403 bytes --]
On Wed, Apr 22, 2026 at 02:47:38PM +0200, Andrew Lunn wrote:
> > I don't see anything wrong with it?
> > - naming is correct, same as stmmac_extra_stats from common.h, as it
> > wouldn't compile otherwise
> > - string length is ok, as max name length is ETH_GSTRING_LEN=32 and it is
> > not close
> > - ethtool just polls data from driver and in my tests it is ok
> > - all instances of 'undeflow' are changed
> > - 'underflow' semantic is ok, 'undeflow' is just not correct
> >
> > Please correct me if I am wrong, but imo no issues with this patch.
>
> ABI
>
> This name is published as part of the kAPI. You are changing its
> name. User space could be looking for this name, even thought it has a
> typo in it.
>
> Andrew
>
I don't think it is? This part of extra stats (struct stmmac_extra_stats) and
is not part of standard ABI from
Documentation/ABI/testing/sysfs-class-net-statistics
nor is mentioned in
Documentation/networking/device_drivers/ethernet/stmicro/stmmac.rst
These extra stats are specific to stmmac driver and most of these are more
than standard
https://www.kernel.org/doc/html/v7.0/networking/statistics.html#c.rtnl_link_stats64
This name does not exist outside stmmac driver, so while some application may
expect this (stmmac specific app), question is should this typo stick?
This type of typo is even mentioned in scripts/spelling.txt.
Regards
Jakub Raczynski
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply
* Re: [PATCH v4 2/8] arm64, unwind: build kernel with sframe V3 info
From: Jens Remus @ 2026-04-22 14:15 UTC (permalink / raw)
To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
Catalin Marinas, Jiri Kosina
Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
joe.lawrence, linux-toolchains, linux-kernel, live-patching,
linux-arm-kernel, Randy Dunlap, Heiko Carstens
In-Reply-To: <20260421225200.1198447-3-dylanbhatch@google.com>
On 4/22/2026 12:51 AM, Dylan Hatch wrote:
> Build with -Wa,--gsframe-3 flags to generate a .sframe section. This
> will be used for in-kernel reliable stacktrace in cases where the frame
> pointer alone is insufficient.
>
> Currently, the sframe format only supports arm64, x86_64 and s390x
> architectures.
>
> Signed-off-by: Weinan Liu <wnliu@google.com>
> Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
> ---
> MAINTAINERS | 1 +
> Makefile | 8 ++++++++
> arch/Kconfig | 21 +++++++++++++++++++++
> arch/arm64/Kconfig | 1 +
> arch/arm64/include/asm/unwind_sframe.h | 8 ++++++++
> arch/arm64/kernel/vdso/Makefile | 2 +-
> include/asm-generic/sections.h | 4 ++++
> include/asm-generic/vmlinux.lds.h | 15 +++++++++++++++
> 8 files changed, 59 insertions(+), 1 deletion(-)
> create mode 100644 arch/arm64/include/asm/unwind_sframe.h
Regards,
Jens
--
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com
IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/
^ permalink raw reply
* Re: [PATCH v4 3/8] arm64: entry: add unwind info for various kernel entries
From: Jens Remus @ 2026-04-22 14:18 UTC (permalink / raw)
To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
Catalin Marinas, Jiri Kosina
Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
joe.lawrence, linux-toolchains, linux-kernel, live-patching,
linux-arm-kernel, Randy Dunlap
In-Reply-To: <20260421225200.1198447-4-dylanbhatch@google.com>
On 4/22/2026 12:51 AM, Dylan Hatch wrote:
> From: Weinan Liu <wnliu@google.com>
>
> DWARF CFI (Call Frame Information) specifies how to recover the return
> address and callee-saved registers at each PC in a given function.
> Compilers are able to generate the CFI annotations when they compile
> the code to assembly language. For handcrafted assembly, we need to
> annotate them by hand.
>
> Annotate minimal CFI to enable stacktracing using SFrame for kernel
> exception entries through el1*_64_*() paths and irq entries through
> call_on_irq_stack()
>
> Signed-off-by: Weinan Liu <wnliu@google.com>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> Suggested-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
> ---
> arch/arm64/kernel/entry.S | 23 +++++++++++++++++++++++
> 1 file changed, 23 insertions(+)
Regards,
Jens
--
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com
IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/
^ permalink raw reply
* Re: [PATCH v4 8/8] unwind: arm64: Use sframe to unwind interrupt frames
From: Jens Remus @ 2026-04-22 14:25 UTC (permalink / raw)
To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
Catalin Marinas, Jiri Kosina
Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
joe.lawrence, linux-toolchains, linux-kernel, live-patching,
linux-arm-kernel, Randy Dunlap, Heiko Carstens
In-Reply-To: <20260421225200.1198447-9-dylanbhatch@google.com>
On 4/22/2026 12:52 AM, Dylan Hatch wrote:
> Add unwind_next_frame_sframe() function to unwind by sframe info if
> present. Use this method at exception boundaries, falling back to
> frame-pointer unwind only on failure. In such failure cases, the
> stacktrace is considered unreliable.
>
> During normal unwind, prefer frame pointer unwind (for better
> performance) with sframe as a backup.
>
> This change restores the LR behavior originally introduced in commit
> c2c6b27b5aa14fa2 ("arm64: stacktrace: unwind exception boundaries"),
> But later removed in commit 32ed1205682e ("arm64: stacktrace: Skip
> reporting LR at exception boundaries")
>
> This can be done because the sframe data can be used to determine
> whether the LR is current for the PC value recovered from pt_regs at the
> exception boundary.
>
> Signed-off-by: Weinan Liu <wnliu@google.com>
> Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
> ---
> arch/arm64/include/asm/stacktrace/common.h | 6 +
> arch/arm64/kernel/stacktrace.c | 246 +++++++++++++++++++--
> 2 files changed, 232 insertions(+), 20 deletions(-)
Regards,
Jens
--
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com
IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/
^ permalink raw reply
* Re: [PATCH] dt-bindings: arm-smmu: qcom:: Fix Hawi compatible placement
From: Mukesh Ojha @ 2026-04-22 14:27 UTC (permalink / raw)
To: Krzysztof Kozlowski
Cc: Will Deacon, Joerg Roedel, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Robin Murphy, linux-arm-kernel, iommu, devicetree,
linux-kernel
In-Reply-To: <d1d24380-06e8-4833-b893-631c063d77ff@oss.qualcomm.com>
On Wed, Apr 22, 2026 at 11:04:09AM +0200, Krzysztof Kozlowski wrote:
> On 22/04/2026 10:33, Mukesh Ojha wrote:
> > qcom,hawi-smmu-500 was placed in the wrong enum block of GPU. Move it to
> > the correct location alongside other Qualcomm SMMU-500 compatibles for
> > CPU.
>
> How could it pass dtbs_check?
It was bad on my part.
> >
> > Fixes: 5e8323c3d528 ("dt-bindings: arm-smmu: qcom: Add compatible for Hawi SoC")
> > Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
> > ---
> > Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> > index 06fb5c8e7547..ba9ad1f5a8ff 100644
> > --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> > +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> > @@ -37,6 +37,7 @@ properties:
> > - enum:
> > - qcom,eliza-smmu-500
> > - qcom,glymur-smmu-500
> > + - qcom,hawi-smmu-500
> > - qcom,kaanapali-smmu-500
> > - qcom,milos-smmu-500
> > - qcom,qcm2290-smmu-500
> > @@ -93,7 +94,6 @@ properties:
> > items:
> > - enum:
> > - qcom,glymur-smmu-500
> > - - qcom,hawi-smmu-500
>
> No, why? That's for GPU. Why are you moving GPU compatible to non-GPU place?
Earlier commit was meant for CPUs and not for GPUs, hence moving it Or,
Are you saying do not do the movement and add one for CPUs ?
>
>
> Best regards,
> Krzysztof
--
-Mukesh Ojha
^ permalink raw reply
* Re: [PATCH] dt-bindings: arm-smmu: qcom:: Fix Hawi compatible placement
From: Krzysztof Kozlowski @ 2026-04-22 14:37 UTC (permalink / raw)
To: Mukesh Ojha, Krzysztof Kozlowski
Cc: Will Deacon, Joerg Roedel, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Robin Murphy, linux-arm-kernel, iommu, devicetree,
linux-kernel
In-Reply-To: <20260422142710.2f6f2oijbsjojbmi@hu-mojha-hyd.qualcomm.com>
On 22/04/2026 16:27, Mukesh Ojha wrote:
>>> - enum:
>>> - qcom,eliza-smmu-500
>>> - qcom,glymur-smmu-500
>>> + - qcom,hawi-smmu-500
>>> - qcom,kaanapali-smmu-500
>>> - qcom,milos-smmu-500
>>> - qcom,qcm2290-smmu-500
>>> @@ -93,7 +94,6 @@ properties:
>>> items:
>>> - enum:
>>> - qcom,glymur-smmu-500
>>> - - qcom,hawi-smmu-500
>>
>> No, why? That's for GPU. Why are you moving GPU compatible to non-GPU place?
>
> Earlier commit was meant for CPUs and not for GPUs, hence moving it Or,
OK, but:
> Are you saying do not do the movement and add one for CPUs ?
yeah, because you might need it in this place for the GPU. Therefore I
would propose to check what compatible you expect for the GPU part and
just add it.
Best regards,
Krzysztof
^ permalink raw reply
* Re: [PATCH] arm64: traps: Add a macro to simplify the condition codes check
From: Vladimir Murzin @ 2026-04-22 14:43 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, mark.rutland, kees, maz,
ada.coupriediaz, smostafa, leitao, mrigendra.chaubey,
linux-arm-kernel, linux-kernel
In-Reply-To: <dd2f076b-976c-480d-8414-e6685cad40dd@huawei.com>
Hi Jinjie,
On 4/22/26 04:06, Jinjie Ruan wrote:
>
> On 3/20/2026 4:28 PM, Jinjie Ruan wrote:
>> Add DEFINE_COND_CHECK macro to define the simple __check_* functions
>> to simplify the condition codes check.
>>
>> No functional changes.
> Gentle ping.
>
>> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>> ---
>> arch/arm64/kernel/traps.c | 59 ++++++++++-----------------------------
>> 1 file changed, 15 insertions(+), 44 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
>> index 914282016069..6216fe9e8e42 100644
>> --- a/arch/arm64/kernel/traps.c
>> +++ b/arch/arm64/kernel/traps.c
>> @@ -49,45 +49,21 @@
>> #include <asm/system_misc.h>
>> #include <asm/sysreg.h>
>>
>> -static bool __kprobes __check_eq(unsigned long pstate)
>> -{
>> - return (pstate & PSR_Z_BIT) != 0;
>> -}
>> -
>> -static bool __kprobes __check_ne(unsigned long pstate)
>> -{
>> - return (pstate & PSR_Z_BIT) == 0;
>> -}
>> -
>> -static bool __kprobes __check_cs(unsigned long pstate)
>> -{
>> - return (pstate & PSR_C_BIT) != 0;
>> -}
>> -
>> -static bool __kprobes __check_cc(unsigned long pstate)
>> -{
>> - return (pstate & PSR_C_BIT) == 0;
>> -}
>> -
>> -static bool __kprobes __check_mi(unsigned long pstate)
>> -{
>> - return (pstate & PSR_N_BIT) != 0;
>> -}
>> -
>> -static bool __kprobes __check_pl(unsigned long pstate)
>> -{
>> - return (pstate & PSR_N_BIT) == 0;
>> -}
>> -
>> -static bool __kprobes __check_vs(unsigned long pstate)
>> -{
>> - return (pstate & PSR_V_BIT) != 0;
>> -}
>> -
>> -static bool __kprobes __check_vc(unsigned long pstate)
>> -{
>> - return (pstate & PSR_V_BIT) == 0;
>> -}
>> +#define DEFINE_COND_CHECK(name, flag, expected) \
>> +static bool __kprobes __check_##name(unsigned long pstate) \
>> +{ \
>> + return ((pstate & (flag)) != 0) == (expected); \
>> +}
>> +
>> +DEFINE_COND_CHECK(eq, PSR_Z_BIT, true)
>> +DEFINE_COND_CHECK(ne, PSR_Z_BIT, false)
>> +DEFINE_COND_CHECK(cs, PSR_C_BIT, true)
>> +DEFINE_COND_CHECK(cc, PSR_C_BIT, false)
>> +DEFINE_COND_CHECK(mi, PSR_N_BIT, true)
>> +DEFINE_COND_CHECK(pl, PSR_N_BIT, false)
>> +DEFINE_COND_CHECK(vs, PSR_V_BIT, true)
>> +DEFINE_COND_CHECK(vc, PSR_V_BIT, false)
>> +DEFINE_COND_CHECK(al, 0, false) /* Always true */
>>
>> static bool __kprobes __check_hi(unsigned long pstate)
>> {
>> @@ -131,11 +107,6 @@ static bool __kprobes __check_le(unsigned long pstate)
>> return (temp & PSR_N_BIT) != 0;
>> }
>>
>> -static bool __kprobes __check_al(unsigned long pstate)
>> -{
>> - return true;
>> -}
>> -
>> /*
>> * Note that the ARMv8 ARM calls condition code 0b1111 "nv", but states that
>> * it behaves identically to 0b1110 ("al").
>
It looks like we now have a mix of checks implemented via macros
and others written out explicitly. The existing approach has the
advantage of being consistent and easy to follow, whereas
introducing macros here, even if it reduces some duplication,
adds a bit of cognitive overhead when reading the code.
This may come down to preference, but I think sticking to a
single, consistent style would make the code easier to scan and
maintain.
Just my 2p.
Cheers
Vladimir
^ permalink raw reply
* Re: [PATCH 1/4] arm64: signal: Preserve POR_EL0 if poe_context is missing
From: Kevin Brodsky @ 2026-04-22 14:55 UTC (permalink / raw)
To: Will Deacon
Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Joey Gouly,
Mark Brown, Shuah Khan, linux-kselftest
In-Reply-To: <aei87MmXaEdg8YOf@willie-the-truck>
On 22/04/2026 14:19, Will Deacon wrote:
>> @@ -74,8 +76,12 @@ struct rt_sigframe_user_layout {
>> * This state needs to be carefully managed to ensure that it doesn't cause
>> * uaccess to fail when setting up the signal frame, and the signal handler
>> * itself also expects a well-defined state when entered.
>> + *
>> + * The valid_fields member is a bitfield (see UA_STATE_HAS_*), specifying which
>> + * of the remaining fields is valid (has been set to a value).
>> */
>> struct user_access_state {
>> + unsigned int valid_fields;
>> u64 por_el0;
>> };
> Do you think it would be worth adding some accessors to make it easier
> to keep the flags in sync? For example:
>
> /* Stores por_el0 into uas->por_el0 and sets UA_STATE_HAS_POR_EL0 */
> void set_ua_state_por_el0(struct user_access_state *uas, u64 por_el0);
>
> /*
> * If UA_STATE_HAS_POR_EL0, *por_el0 = uas->por_el0 and return 0.
> * Otherwise, return -ENOENT.
> */
> int get_ua_state_por_el0(struct user_access_state *uas, u64 *por_el0);
>
> WDYT?
I did get a feeling having helpers would be a good idea. I wonder if
getters/setters aren't a bit overkill though, as they make accesses to
the struct more cumbersome and we'd need a pair for every member (unless
we use some macro magic). Maybe it would be sufficient to have say
ua_state_has_field(POR_EL0) to check if the bit is set, and
ua_state_set_field_valid(POR_EL0) to set the bit?
>> @@ -1095,7 +1104,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
>> {
>> struct pt_regs *regs = current_pt_regs();
>> struct rt_sigframe __user *frame;
>> - struct user_access_state ua_state;
>> + struct user_access_state ua_state = {0};
> nit: {} should do (no need for the '0'). Same in setup_rt_frame().
Will change it, I have some vague recollection that GCC and Clang
disagreed about the meaning of {}... but that's probably fixed nowadays.
Thanks for the review!
- Kevin
^ permalink raw reply
* Re: [PATCH] spi: sun6i: Set SPI mode in prepare_message
From: Mark Brown @ 2026-04-22 14:57 UTC (permalink / raw)
To: Kevin Mehall
Cc: Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Mirko Vogt,
Ralf Schlatterbeck, linux-spi, linux-arm-kernel, linux-sunxi,
linux-kernel
In-Reply-To: <20260420164755.1131645-1-km@kevinmehall.net>
[-- Attachment #1: Type: text/plain, Size: 821 bytes --]
On Mon, Apr 20, 2026 at 10:46:04AM -0600, Kevin Mehall wrote:
> With a GPIO chip select, CS is asserted before entering transfer_one.
> The spi-sun6i driver previously configured the SPI mode (including clock
> polarity) and enabled the bus in transfer_one, which can cause an
> extraneous SCK transition with CS asserted, corrupting the transferred
> data.
> This patch moves the SPI mode configuration and bus enable to the
> spi_prepare_message callback, ensuring that SCK is driven to the correct
> level prior to asserting CS.
> + /* We want to control the chip select manually */
> + reg |= SUN6I_TFR_CTL_CS_MANUAL;
> +
> + sun6i_spi_write(sspi, SUN6I_TFR_CTL_REG, reg);
Might this cause the native chip select to get asserted, we didn't set
up values so it'll have defaults if it wasn't previously configured?
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* Re: [PATCH v4 35/49] KVM: arm64: GICv3: nv: Plug L1 LR sync into deactivation primitive
From: Vishnu Pajjuri @ 2026-04-22 14:57 UTC (permalink / raw)
To: Marc Zyngier
Cc: Fuad Tabba, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Christoffer Dall, Mark Brown, kvm, linux-arm-kernel,
kvmarm, Darren Hart
In-Reply-To: <86eck71o1v.wl-maz@kernel.org>
Hi Marc,
On 22-04-2026 12:25, Marc Zyngier wrote:
> [+ Darren]
>
> On Tue, 31 Mar 2026 10:42:57 +0100,
> Marc Zyngier <maz@kernel.org> wrote:
>>
>> On Tue, 31 Mar 2026 07:31:54 +0100,
>> Vishnu Pajjuri <vishnu@os.amperecomputing.com> wrote:
>>>
>>>>> LOG:
>>>>> [ 164.647367] Call trace:
>>>>> [ 164.647368] smp_call_function_many_cond+0x334/0x7a0 (P)
>>>>> [ 164.647372] smp_call_function_many+0x20/0x40
>>>>> [ 164.647374] kvm_make_all_cpus_request+0xec/0x1b8
>>>>> [ 164.647377] vgic_queue_irq_unlock+0x1c8/0x2c8
>>>>> [ 164.647380] kvm_vgic_inject_irq+0x194/0x1e0
>>>>> [ 164.647381] kvm_vm_ioctl_irq_line+0x170/0x400
>>>>> [ 164.647386] kvm_vm_ioctl+0x7b8/0xc88
>>>>> [ 164.647389] __arm64_sys_ioctl+0xb4/0x118
>>>>> [ 164.647393] invoke_syscall+0x6c/0x100
>>>>> [ 164.647397] el0_svc_common.constprop.0+0x48/0xf0
>>>>> [ 164.647398] do_el0_svc+0x24/0x38
>>>>> [ 164.647400] el0_svc+0x3c/0x170
>>>>> [ 164.647403] el0t_64_sync_handler+0xa0/0xe8
>>>>> [ 164.647405] el0t_64_sync+0x1b0/0x1b8
>>>>
>>>> This trace is about interrupt injection from userspace, not
>>>> deactivation of a HW interrupt.
>>>> None of that makes much sense.
>>>
>>> Although this behavior is puzzling, it matches the trace I typically
>>> observe on L0. After reverting the patch, I was able to boot L2 guests
>>> successfully.
>>
>> Well, this patch fixes real bugs, so it isn't going anywhere.
>>
>> The patch you are reverting addresses the deactivation of a HW
>> interrupt, which is likely to be a timer (that's the only one we
>> support). The stacktrace points to the userspace injection of an SPI.
>>
>> If we need to broadcast IPI, that's because there is no other SPI
>> currently in flight. But if a CPU is not responding to the IPI, what
>> is it doing? How does this interact with the patch you are reverting?
>>
>> Given that I don't know what you're running, how you are running it,
>> that I don't have access to whatever HW you are using, and that you
>> are providing no useful information that'd help me debug this, I will
>> leave it up to you to debug it and come back with a detailed analysis
>> of the problem.
>
> Have you made progress on this? I can't reproduce it at all despite my
> best effort. I'm perfectly happy to help, but you need to give me
> *something* to go on.
Thanks for your support!!
The issue is triggered as soon as the timer interrupt (IRQ 27) is
deactivated. Preventing the deactivation of IRQ 27 during nested VGIC
state transitions prevents the failure from reproducing.
I am currently tracing execution paths and inspecting VGIC state to
determine how disabling this interrupt leads to the observed behavior.
Regards,
-Vishnu.
> Thanks,
>
> M.
>
^ permalink raw reply
* Re: [PATCH v7 1/5] clk: bcm: rpi: Manage clock rate in prepare/unprepare callbacks
From: Mark Brown @ 2026-04-22 15:12 UTC (permalink / raw)
To: Linus Torvalds, Stephen Boyd
Cc: Maíra Canal, Michael Turquette, Stephen Boyd,
Nicolas Saenz Julienne, Florian Fainelli, Stefan Wahren,
Maxime Ripard, Melissa Wen, Iago Toral Quiroga, Chema Casanova,
Dave Stevenson, Philipp Zabel, linux-clk, dri-devel,
linux-rpi-kernel, linux-arm-kernel,
Broadcom internal kernel review list, kernel-dev
In-Reply-To: <5f0bec08-f458-4fba-8bf3-06817a100c4c@sirena.org.uk>
[-- Attachment #1: Type: text/plain, Size: 982 bytes --]
On Tue, Mar 31, 2026 at 01:49:24PM +0100, Mark Brown wrote:
> On Thu, Mar 12, 2026 at 06:34:23PM -0300, Maíra Canal wrote:
> > On current firmware versions, RPI_FIRMWARE_SET_CLOCK_STATE doesn't
> > actually power off the clock. To achieve meaningful power savings, the
> > clock rate must be set to the minimum before disabling. This might be
> > fixed in future firmware releases.
>
> I'm seeing boot regressions in -next with NFS root on Raspberry Pi 3B+
> which bisect to this commit. We get a likely unrelated oops from the
> firmware interface and the boot grinds to a halt some time later since
> the ethernet never comes up:
This boot regression, which was originally reported against -next
several weeks ago, is now present in mainline. Maíra quickly provided a
fix:
https://lore.kernel.org/r/20260401111416.562279-2-mcanal@igalia.com
which I have confirmed works but I've not seen any other response to
either that fix or the original thread.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* Re: [PATCH v4 1/8] sframe: Allow kernelspace sframe sections
From: Dylan Hatch @ 2026-04-22 15:15 UTC (permalink / raw)
To: Jens Remus
Cc: Roman Gushchin, Weinan Liu, Will Deacon, Josh Poimboeuf,
Indu Bhagat, Peter Zijlstra, Steven Rostedt, Catalin Marinas,
Jiri Kosina, Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan,
Song Liu, joe.lawrence, linux-toolchains, linux-kernel,
live-patching, linux-arm-kernel, Randy Dunlap, Heiko Carstens
In-Reply-To: <38c9d976-6205-409c-8874-4c9757b25fd6@linux.ibm.com>
On Wed, Apr 22, 2026 at 7:08 AM Jens Remus <jremus@linux.ibm.com> wrote:
>
> On 4/22/2026 12:51 AM, Dylan Hatch wrote:
> > Generalize the sframe lookup code to support kernelspace sections. This
> > is done by defining a SFRAME_LOOKUP option that can be activated
> > separate from HAVE_UNWIND_USER_SFRAME, as there will be other client to
> > this library than just userspace unwind.
> >
> > Sframe section location is now tracked in a separate sec_type field to
> > determine whether user-access functions are necessary to read the sframe
> > data. Relevant type delarations are moved and renamed to reflect the
> > non-user sframe support.
> >
> > Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
>
> With return -EFAULT changed to goto label in DATA_COPY() and DATA_GET():
>
> Reviewed-by: Jens Remus <jremus@linux.ibm.com>
>
> > ---
> > MAINTAINERS | 2 +-
> > arch/Kconfig | 4 +
> > .../{unwind_user_sframe.h => unwind_sframe.h} | 6 +-
> > arch/x86/include/asm/unwind_user.h | 12 +-
> > include/linux/sframe.h | 48 ++--
> > include/linux/unwind_types.h | 46 +++
> > include/linux/unwind_user_types.h | 41 ---
> > kernel/unwind/Makefile | 2 +-
> > kernel/unwind/sframe.c | 270 ++++++++++++------
> > kernel/unwind/user.c | 41 +--
> > 10 files changed, 293 insertions(+), 179 deletions(-)
> > rename arch/x86/include/asm/{unwind_user_sframe.h => unwind_sframe.h} (50%)
> > create mode 100644 include/linux/unwind_types.h
>
> > diff --git a/include/linux/sframe.h b/include/linux/sframe.h
>
> > +enum sframe_sec_type {
> > + SFRAME_KERNEL,
> > + SFRAME_USER,
> > +};
>
> > struct sframe_section {
> > - struct rcu_head rcu;
> > + struct rcu_head rcu;
> > #ifdef CONFIG_DYNAMIC_DEBUG
> > - const char *filename;
> > + const char *filename;
> > #endif
> > - unsigned long sframe_start;
> > - unsigned long sframe_end;
> > - unsigned long text_start;
> > - unsigned long text_end;
> > -
> > - unsigned long fdes_start;
> > - unsigned long fres_start;
> > - unsigned long fres_end;
> > - unsigned int num_fdes;
> > -
> > - signed char ra_off;
> > - signed char fp_off;
> > + enum sframe_sec_type sec_type;
> > + unsigned long sframe_start;
> > + unsigned long sframe_end;
> > + unsigned long text_start;
> > + unsigned long text_end;
> > +
> > + unsigned long fdes_start;
> > + unsigned long fres_start;
> > + unsigned long fres_end;
> > + unsigned int num_fdes;
> > +
> > + signed char ra_off;
> > + signed char fp_off;
> > };
>
> > diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c
>
> > +#define DATA_COPY(sec, to, from, size, label) \
> > +({ \
> > + switch (sec->sec_type) { \
> > + case SFRAME_KERNEL: \
> > + KERNEL_COPY(to, from, size, label); \
> > + break; \
> > + case SFRAME_USER: \
> > + UNSAFE_USER_COPY(to, from, size, label); \
> > + break; \
>
> I wonder whether it would be worthwhile to come up with an approach
> where this would get evaluated at compile time instead at run time?
> Or is this overengineering? Of course such improvement could be
> made later on, so no need to solve that now.
I had a similar thought when I was writing this patch, but I ended up
deciding to avoid premature optimization before getting feedback. I'd
definitely be interested in improving upon this later on.
>
> Options that came into my mind:
> A) Introduce and pass through a "bool user" parameter, whose value is
> specified in sframe_find_user() and sframe_find_kernel(). Due to
> inlining I would expect that to get any conditions based on that
> to get evaluated at compile time. See below. Downside is the
> ugly additional parameter.
>
> B) Introduce lightweight .c wrappers, e.g. sframe_kernel.c and
> sframe_user.c, that define DATA_GET() and DATA_COPY() and include
> sframe.c. All HAVE_UNWIND_KERNEL_SFRAME code would be moved into
> sframe_kernel.c and likewise all HAVE_UNWIND_USER_SFRAME code into
> sframe_user.c.
(A) definitely sounds simpler to implement. For (B) it seems uncommon
for .c files to include one another. Style-wise, is this something
that is typically allowed (e.g. by checkpatch.pl)?
>
> > + default: \
> > + return -EFAULT; \
>
> goto label; \
>
> Users of DATA_COPY() do expect the macro to branch to the label in case
> of an error and therefore do not evaluate any return value. The
> wrapping then needs also be changed from "({ .. })" to
> "do { ... } while (0)".
>
> > + } \
> > +})
> > +
> > +#define DATA_GET(sec, to, from, type, label) \
> > +({ \
> > + switch (sec->sec_type) { \
> > + case SFRAME_KERNEL: \
> > + KERNEL_GET(to, from, type, label); \
> > + break; \
> > + case SFRAME_USER: \
> > + UNSAFE_USER_GET(to, from, type, label); \
> > + break; \
> > + default: \
> > + return -EFAULT; \
>
> Likewise.
>
> > + } \
> > +})
>
> > +#ifdef CONFIG_HAVE_UNWIND_USER_SFRAME
> > +
> > +int sframe_find_user(unsigned long ip, struct unwind_frame *frame)
> > +{
> > + struct mm_struct *mm = current->mm;
> > + struct sframe_section *sec;
> > + int ret;
> > +
> > + if (!mm)
> > + return -EINVAL;
> > +
> > + guard(srcu)(&sframe_srcu);
> > +
> > + sec = mtree_load(&mm->sframe_mt, ip);
> > + if (!sec)
> > + return -EINVAL;
> > +
> > + if (!user_read_access_begin((void __user *)sec->sframe_start,
> > + sec->sframe_end - sec->sframe_start))
> > + return -EFAULT;
> > +
> > + ret = __sframe_find(sec, ip, frame);
>
> In sframe_find_user() sec->sec_type must be SFRAME_USER. Likewise in
> sframe_find_kernel() it must be SFRAME_KERNEL. So instead of
> introducing sec_type, we could add a parameter
> __sframe_find(..., bool user) and do:
>
> ret = __sframe_find(sec, ip, frame, true);
>
> The downside is that this then requires to pass that flag through
> everywhere... (see below).
>
> > +
> > + user_read_access_end();
> > +
> > + if (ret == -EFAULT) {
> > + dbg_sec("removing bad .sframe section\n");
> > + WARN_ON_ONCE(sframe_remove_section(sec->sframe_start));
> > + }
> > +
> > + return ret;
> > +}
> Regards,
> Jens
> --
> Jens Remus
> Linux on Z Development (D3303)
> jremus@de.ibm.com / jremus@linux.ibm.com
>
> IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
> IBM Data Privacy Statement: https://www.ibm.com/privacy/
>
^ permalink raw reply
* Re: [PATCH] arm64: traps: Add a macro to simplify the condition codes check
From: Mark Rutland @ 2026-04-22 15:16 UTC (permalink / raw)
To: Vladimir Murzin
Cc: smostafa, kees, catalin.marinas, Jinjie Ruan, linux-kernel,
mrigendra.chaubey, maz, leitao, will, linux-arm-kernel
In-Reply-To: <fbd9268f-ace7-42d8-b920-8b6f9b97cf6d@arm.com>
On Wed, Apr 22, 2026 at 03:43:39PM +0100, Vladimir Murzin wrote:
> Hi Jinjie,
>
> On 4/22/26 04:06, Jinjie Ruan wrote:
> >
> > On 3/20/2026 4:28 PM, Jinjie Ruan wrote:
> >> Add DEFINE_COND_CHECK macro to define the simple __check_* functions
> >> to simplify the condition codes check.
> >>
> >> No functional changes.
> > Gentle ping.
> >
> >> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> >> ---
> >> arch/arm64/kernel/traps.c | 59 ++++++++++-----------------------------
> >> 1 file changed, 15 insertions(+), 44 deletions(-)
> >>
> >> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> >> index 914282016069..6216fe9e8e42 100644
> >> --- a/arch/arm64/kernel/traps.c
> >> +++ b/arch/arm64/kernel/traps.c
> >> @@ -49,45 +49,21 @@
> >> #include <asm/system_misc.h>
> >> #include <asm/sysreg.h>
> >>
> >> -static bool __kprobes __check_eq(unsigned long pstate)
> >> -{
> >> - return (pstate & PSR_Z_BIT) != 0;
> >> -}
> >> -
> >> -static bool __kprobes __check_ne(unsigned long pstate)
> >> -{
> >> - return (pstate & PSR_Z_BIT) == 0;
> >> -}
> >> -
> >> -static bool __kprobes __check_cs(unsigned long pstate)
> >> -{
> >> - return (pstate & PSR_C_BIT) != 0;
> >> -}
> >> -
> >> -static bool __kprobes __check_cc(unsigned long pstate)
> >> -{
> >> - return (pstate & PSR_C_BIT) == 0;
> >> -}
> >> -
> >> -static bool __kprobes __check_mi(unsigned long pstate)
> >> -{
> >> - return (pstate & PSR_N_BIT) != 0;
> >> -}
> >> -
> >> -static bool __kprobes __check_pl(unsigned long pstate)
> >> -{
> >> - return (pstate & PSR_N_BIT) == 0;
> >> -}
> >> -
> >> -static bool __kprobes __check_vs(unsigned long pstate)
> >> -{
> >> - return (pstate & PSR_V_BIT) != 0;
> >> -}
> >> -
> >> -static bool __kprobes __check_vc(unsigned long pstate)
> >> -{
> >> - return (pstate & PSR_V_BIT) == 0;
> >> -}
> >> +#define DEFINE_COND_CHECK(name, flag, expected) \
> >> +static bool __kprobes __check_##name(unsigned long pstate) \
> >> +{ \
> >> + return ((pstate & (flag)) != 0) == (expected); \
> >> +}
> >> +
> >> +DEFINE_COND_CHECK(eq, PSR_Z_BIT, true)
> >> +DEFINE_COND_CHECK(ne, PSR_Z_BIT, false)
> >> +DEFINE_COND_CHECK(cs, PSR_C_BIT, true)
> >> +DEFINE_COND_CHECK(cc, PSR_C_BIT, false)
> >> +DEFINE_COND_CHECK(mi, PSR_N_BIT, true)
> >> +DEFINE_COND_CHECK(pl, PSR_N_BIT, false)
> >> +DEFINE_COND_CHECK(vs, PSR_V_BIT, true)
> >> +DEFINE_COND_CHECK(vc, PSR_V_BIT, false)
> >> +DEFINE_COND_CHECK(al, 0, false) /* Always true */
> >>
> >> static bool __kprobes __check_hi(unsigned long pstate)
> >> {
> >> @@ -131,11 +107,6 @@ static bool __kprobes __check_le(unsigned long pstate)
> >> return (temp & PSR_N_BIT) != 0;
> >> }
> >>
> >> -static bool __kprobes __check_al(unsigned long pstate)
> >> -{
> >> - return true;
> >> -}
> >> -
> >> /*
> >> * Note that the ARMv8 ARM calls condition code 0b1111 "nv", but states that
> >> * it behaves identically to 0b1110 ("al").
> >
>
> It looks like we now have a mix of checks implemented via macros
> and others written out explicitly. The existing approach has the
> advantage of being consistent and easy to follow, whereas
> introducing macros here, even if it reduces some duplication,
> adds a bit of cognitive overhead when reading the code.
>
> This may come down to preference, but I think sticking to a
> single, consistent style would make the code easier to scan and
> maintain.
FWIW, I agree. I think it'd be better to leave this as-is for now.
Mark.
^ permalink raw reply
* Re: [PATCH v13 00/48] arm64: Support for Arm CCA in KVM
From: Steven Price @ 2026-04-22 15:38 UTC (permalink / raw)
To: Jiahao zheng
Cc: alexandru.elisei, alpergun, aneesh.kumar, catalin.marinas,
christoffer.dall, fj0570is, gankulkarni, gshan, james.morse,
joey.gouly, kvm, kvmarm, linux-arm-kernel, linux-coco,
linux-kernel, maz, oliver.upton, sdonthineni, suzuki.poulose,
tabba, vannapurve, will, yuzenghui
In-Reply-To: <20260421135145.14789-1-jahao.zheng@gmail.com_quarantine>
On 21/04/2026 14:51, Jiahao zheng wrote:
> Hi Steven,
>
> I've been testing CCA patch series and noticed Realm VM cannot boot successfully when the host is forced to run in nVHE mode (e.g., via `kvm-arm.mode=nvhe`). The kvmtool debug information will be truncated in set_guest_bank_private_gpa.
>
> Currently, in `kvm_ioctl_vcpu_run()`, running a Realm VM (REC) bypasses the standard nVHE EL2 stub. `kvm_rec_enter()` directly executes the SMC instruction to transition to the RMM. Upon returning to the EL1 host, the code falls back to `kvm_vgic_sync_hwstate()`, where the VGIC save operation is explicitly skipped for nVHE. Since the EL2 stub was bypassed, `__vgic_v3_save_state()` is never executed, and `ICH_*_EL2` states are lost.
>
> To resolve this, I have a couple of thoughts:
> 1. If Host nVHE mode is not intended to be supported for Realms:
> Since RME implies ARMv9 which mandates VHE, running a Realm with an nVHE host might just be an unsupported edge case. If so, we should explicitly reject RME initialization or REC creation when `!is_kernel_in_hyp_mode()`. This would cleanly prevent the undefined behavior.
> 2. If Host nVHE mode is intended to be supported:
> Since RMM should remain agnostic to the Non-Secure VGIC states, the burden of saving these states falls strictly on KVM. However, the EL1 host cannot access `ICH_*_EL2`. Therefore, KVM needs to add specific logic for this scenario. We would likely need to route the REC exit through a dedicated nVHE EL2 stub to invoke `__vgic_v3_save_state()` before dropping back to EL1, rather than jumping straight back to `kvm_ioctl_vcpu_run()`.
>
> I might have missed some documentation or comments regarding nVHE restrictions for CCA. If this is an oversight, it would be great to see a check added in the next iteration of the series.
Thanks for the testing. Yes indeed this is an oversight. For now option
1 is what I'm going to go for. There's nothing stopping nVHE mode being
supported but as you note any platform with RME will have VHE so it's
not an immediate priority to support.
One interesting case of nVHE is of course pKVM and for that there needs
to be some significant work to ensure that the EL2 hypervisor
understands the RMM communication and prevent any confused-deputy style
attacks. E.g. the host must not be able to map a pVM's private memory
into a realm guest.
I don't have any immediate plans to work on nVHE - my focus is getting
the basic support merged. But I know there was some interest to ensure
that pKVM and CCA would be able to co-exist on a platform so I expect it
will come in some form or another.
Thanks,
Steve
^ permalink raw reply
* Re: [PATCH] iommu/arm-smmu-v3: Allow disabling Stage 1 translation
From: Pranjal Shrivastava @ 2026-04-22 15:44 UTC (permalink / raw)
To: Evangelos Petrongonas
Cc: Jason Gunthorpe, Will Deacon, Robin Murphy, Joerg Roedel,
Nicolin Chen, Lu Baolu, linux-arm-kernel, iommu, linux-kernel,
nh-open-source, Zeev Zilberman
In-Reply-To: <20260422064431.GA49867@dev-dsk-epetron-1c-1d4d9719.eu-west-1.amazon.com>
On Wed, Apr 22, 2026 at 06:44:31AM +0000, Evangelos Petrongonas wrote:
> On Mon, Apr 20, 2026 at 09:40:32AM -0300 Jason Gunthorpe wrote:
> > On Mon, Apr 20, 2026 at 12:32:01PM +0000, Evangelos Petrongonas wrote:
> > > When the hardware advertises both Stage 1 and Stage 2 translation, the
> > > driver prefers Stage 1 for DMA domain allocation and only falls back to
> > > Stage 2 if Stage 1 is not supported.
> > >
> > > Some configurations may want to force Stage 2 translation even when the
> > > hardware supports Stage 1.
> >
> > Why? You really need to explain why for a patch like this.
> >
> > If there really is some HW issue I think it is more appropriate to get
> > an IORT flag or IDR detection that the HW has a problem.
>
> It's not a hardware bug there's no IORT or IDR bit that would make sense
> here.
>
> The motivation is live update of the hypervisor: we want to kexec into a
> new kernel while keeping DMA from passthrough devices flowing, which
> means the SMMU's translation state has to survive the handover. The Live
> Update Orchestrator work [1] and the in-progress "iommu: Add live
> update state preservation" series [2] are building exactly this plumbing
> on top of KHO; [2]'s cover letter calls out Arm SMMUv3 support as future
> work, and an earlier RFC from Amazon [3] sketched the same idea for
> iommufd.
>
> For this use case, Stage 2 is materially easier to persist than Stage 1,
> for structural rather than performance reasons: An S2 STE carries the
> whole translation configuration inline. To hand over an S2 domain, the
> pre-kexec kernel only needs to preserve the stream table pages and the
> S2 pgtable pages. An S1 STE points at a Context Descriptor table and as
> a result Persisting S1 therefore requires preserving the CD table pages
> too, and because the CD is keyed by ASID coordinating ASID identity
> across the handover.
>
> In the long term the plan should be to persist both stages.
> However, until a patch series that properly introduces SMMU support for
> is developed/posted we would like to experiment with S1+S2-capable
> hardware with an easier to implement handover machinery, that relies on
> S2 translations.
>
Hi Evangelos,
We (Google) currently have a series in the works specifically for
arm-smmu-v3 state preservation. Our plan is to post it in phases (S2
preservation first then the S1 + CD series) once the iommu: liveupdate
persistence series has stabilized.
Since the iommu core liveupdate framework itself is still in flux,
it’s a bit premature to accept/merge this patch before both the series.
Furthermore, it must be noted that even if the iommu liveupdate series
is merged, until the framework is fully integrated with the SMMU driver,
liveupdate shall remain essentially non-functional or 'broken' for
drivers that haven't yet implemented the necessary support hooks.
We’d prefer to wait until the core infrastructure is solid so we can
ensure the SMMUv3 implementation aligns perfectly with the final
requirements of the iommu liveupdate persistence series.
That said, we don't mind posting our arm-smmu-v3 series with S2-only
preservation early as an early RFC if that helps align on the design
and implementation details.
Thanks,
Praan
> [1] https://lwn.net/Articles/1021442/ — Live Update Orchestrator
> [2] https://lore.kernel.org/all/20260203220948.2176157-1-skhawaja@google.com/ —
> [PATCH 00/14] iommu: Add live update state preservation
> [3] https://lore.kernel.org/all/20240916113102.710522-1-jgowans@amazon.com/ — [RFC
> PATCH 00/13] Support iommu(fd) persistence for live update
>
> > Jason
>
> Kind Regards,
> Evangelos
>
>
>
> Amazon Web Services Development Center Germany GmbH
> Tamara-Danz-Str. 13
> 10243 Berlin
> Geschaeftsfuehrung: Christof Hellmis, Andreas Stieger
> Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
> Sitz: Berlin
> Ust-ID: DE 365 538 597
^ permalink raw reply
* Re: [PATCH v2 1/6] selftests/resctrl: Introduced linked list management for IMC counters
From: Reinette Chatre @ 2026-04-22 16:02 UTC (permalink / raw)
To: Yifan Wu, tony.luck, Dave.Martin, james.morse, babu.moger, shuah,
tan.shaopeng, fenghuay, ben.horgan, jonathan.cameron, zengheng4,
linux-kernel, linux-arm-kernel, linux-kselftest, linuxarm
Cc: xiaqinxin, prime.zeng, wangyushan12, xuwei5, fanghao11, wangzhou1
In-Reply-To: <20260410093352.3988125-2-wuyifan50@huawei.com>
Hi Yifan,
On 4/10/26 2:33 AM, Yifan Wu wrote:
> Added linked list based management for IMC counter configurations,
> allowing the system to dynamically allocate and clean up resources based on
> actual hardware capabilities.
Above provides a motivation for this work but it does not help reviewer understand
what the patch does. Could you please expand with more detail about what the patch
does and since it is incomplete, provide insight into the context of this work to
help review it? For example, above just has "Added linked list based management
for IMC counter configurations" to describe what the patch does (rest is motivation)
but the patch does not actually do this ... it just adds a new and unused data structure
with empty elements in parallel to existing data structures.
Nit: Could you please write all changelogs with an imperative tone? For example,
"Introduced" -> "Introduce" in the subject and "Added" -> "Add" above?
>
> Signed-off-by: Yifan Wu <wuyifan50@huawei.com>
> ---
> tools/testing/selftests/resctrl/mba_test.c | 1 +
> tools/testing/selftests/resctrl/mbm_test.c | 1 +
> tools/testing/selftests/resctrl/resctrl.h | 2 ++
> tools/testing/selftests/resctrl/resctrl_val.c | 20 +++++++++++++++++++
> 4 files changed, 24 insertions(+)
>
> diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
> index 39cee9898359..4bb1a82eb195 100644
> --- a/tools/testing/selftests/resctrl/mba_test.c
> +++ b/tools/testing/selftests/resctrl/mba_test.c
> @@ -166,6 +166,7 @@ static int check_results(void)
>
> static void mba_test_cleanup(void)
> {
> + cleanup_read_mem_bw_imc();
> remove(RESULT_FILE_NAME);
> }
>
> diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
> index 6dbbc3b76003..68c89f50a34a 100644
> --- a/tools/testing/selftests/resctrl/mbm_test.c
> +++ b/tools/testing/selftests/resctrl/mbm_test.c
> @@ -125,6 +125,7 @@ static int mbm_measure(const struct user_params *uparams,
>
> static void mbm_test_cleanup(void)
> {
> + cleanup_read_mem_bw_imc();
> remove(RESULT_FILE_NAME);
> }
>
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index 175101022bf3..a7556cdae0de 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -24,6 +24,7 @@
> #include <linux/perf_event.h>
> #include <linux/compiler.h>
> #include <linux/bits.h>
> +#include <linux/list.h>
> #include "kselftest.h"
>
> #define MB (1024 * 1024)
> @@ -183,6 +184,7 @@ void mem_flush(unsigned char *buf, size_t buf_size);
> void fill_cache_read(unsigned char *buf, size_t buf_size, bool once);
> ssize_t get_fill_buf_size(int cpu_no, const char *cache_type);
> int initialize_read_mem_bw_imc(void);
> +void cleanup_read_mem_bw_imc(void);
> int measure_read_mem_bw(const struct user_params *uparams,
> struct resctrl_val_param *param, pid_t bm_pid);
> void initialize_mem_bw_resctrl(const struct resctrl_val_param *param,
> diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
> index f20d2194c35f..d9ae24e9d971 100644
> --- a/tools/testing/selftests/resctrl/resctrl_val.c
> +++ b/tools/testing/selftests/resctrl/resctrl_val.c
> @@ -28,6 +28,7 @@ struct membw_read_format {
> };
>
> struct imc_counter_config {
> + struct list_head entry;
> __u32 type;
> __u64 event;
> __u64 umask;
> @@ -38,6 +39,7 @@ struct imc_counter_config {
> static char mbm_total_path[1024];
> static int imcs;
> static struct imc_counter_config imc_counters_config[MAX_IMCS];
> +LIST_HEAD(imc_counters_list);
> static const struct resctrl_test *current_test;
>
> static void read_mem_bw_initialize_perf_event_attr(int i)
> @@ -113,6 +115,7 @@ static int parse_imc_read_bw_events(char *imc_dir, unsigned int type,
> unsigned int *count)
> {
> char imc_events_dir[PATH_MAX], imc_counter_cfg[PATH_MAX];
> + struct imc_counter_config *imc_counter;
> unsigned int orig_count = *count;
> char cas_count_cfg[1024];
> struct dirent *ep;
> @@ -167,11 +170,17 @@ static int parse_imc_read_bw_events(char *imc_dir, unsigned int type,
> ksft_print_msg("Maximum iMC count exceeded\n");
> goto out_close;
> }
> + imc_counter = calloc(1, sizeof(*imc_counter));
> + if (!imc_counter) {
> + ksft_perror("Unable to allocate memory for iMC counters\n");
> + goto out_close;
> + }
>
> imc_counters_config[*count].type = type;
> get_read_event_and_umask(cas_count_cfg, *count);
> /* Do not fail after incrementing *count. */
> *count += 1;
> + list_add(&imc_counter->entry, &imc_counters_list);
> }
> if (*count == orig_count) {
> ksft_print_msg("Unable to find events in %s\n", imc_events_dir);
Should cleanup_read_mem_bw_imc() be called on error exit path?
> @@ -303,6 +312,17 @@ int initialize_read_mem_bw_imc(void)
> return 0;
> }
>
> +void cleanup_read_mem_bw_imc(void)
> +{
> + struct imc_counter_config *imc_counter, *tmp;
> +
> + list_for_each_entry_safe(imc_counter, tmp,
> + &imc_counters_list, entry) {
Looks like above can fit on one line.
> + list_del(&imc_counter->entry);
> + free(imc_counter);
> + }
> +}
> +
> static void perf_close_imc_read_mem_bw(void)
> {
> int mc;
Reinette
^ permalink raw reply
* Re: [PATCH v2 2/6] selftests/resctrl: Refactor the discovery of IMC counters using linked list
From: Reinette Chatre @ 2026-04-22 16:04 UTC (permalink / raw)
To: Yifan Wu, tony.luck, Dave.Martin, james.morse, babu.moger, shuah,
tan.shaopeng, fenghuay, ben.horgan, jonathan.cameron, zengheng4,
linux-kernel, linux-arm-kernel, linux-kselftest, linuxarm
Cc: xiaqinxin, prime.zeng, wangyushan12, xuwei5, fanghao11, wangzhou1
In-Reply-To: <20260410093352.3988125-3-wuyifan50@huawei.com>
Hi Yifan,
On 4/10/26 2:33 AM, Yifan Wu wrote:
> Use linked list to refactor the discovery of IMC counters. The counting
> during the discovery and the check on the upper limit of the number
> of IMC counters are removed.
Please apply the changelog comment of patch #1 to all patches in this
series.
>
> Signed-off-by: Yifan Wu <wuyifan50@huawei.com>
> ---
> tools/testing/selftests/resctrl/resctrl_val.c | 35 ++++++++-----------
> 1 file changed, 14 insertions(+), 21 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
> index d9ae24e9d971..60cda2214c13 100644
> --- a/tools/testing/selftests/resctrl/resctrl_val.c
> +++ b/tools/testing/selftests/resctrl/resctrl_val.c
> @@ -75,7 +75,7 @@ static void read_mem_bw_ioctl_perf_event_ioc_disable(int i)
> * @cas_count_cfg: Config
> * @count: iMC number
Note function description above containing description of @count parameter
> */
> -static void get_read_event_and_umask(char *cas_count_cfg, unsigned int count)
> +static void get_read_event_and_umask(char *cas_count_cfg, struct imc_counter_config *imc_counter)
Since above replaces @count with @imc_counter, please update function description to match.
> {
> char *token[MAX_TOKENS];
> int i = 0;
> @@ -89,9 +89,9 @@ static void get_read_event_and_umask(char *cas_count_cfg, unsigned int count)
> if (!token[i])
> break;
> if (strcmp(token[i], "event") == 0)
> - imc_counters_config[count].event = strtol(token[i + 1], NULL, 16);
> + imc_counter->event = strtol(token[i + 1], NULL, 16);
> if (strcmp(token[i], "umask") == 0)
> - imc_counters_config[count].umask = strtol(token[i + 1], NULL, 16);
> + imc_counter->umask = strtol(token[i + 1], NULL, 16);
> }
> }
>
> @@ -111,12 +111,11 @@ static int open_perf_read_event(int i, int cpu_no)
> return 0;
> }
>
> -static int parse_imc_read_bw_events(char *imc_dir, unsigned int type,
> - unsigned int *count)
> +static int parse_imc_read_bw_events(char *imc_dir, unsigned int type)
> {
> char imc_events_dir[PATH_MAX], imc_counter_cfg[PATH_MAX];
> struct imc_counter_config *imc_counter;
> - unsigned int orig_count = *count;
> + bool found_event = false;
> char cas_count_cfg[1024];
> struct dirent *ep;
> int path_len;
> @@ -166,23 +165,18 @@ static int parse_imc_read_bw_events(char *imc_dir, unsigned int type,
> ksft_perror("Could not get iMC cas count read");
> goto out_close;
> }
> - if (*count >= MAX_IMCS) {
> - ksft_print_msg("Maximum iMC count exceeded\n");
> - goto out_close;
> - }
> imc_counter = calloc(1, sizeof(*imc_counter));
> if (!imc_counter) {
> ksft_perror("Unable to allocate memory for iMC counters\n");
> goto out_close;
> }
>
> - imc_counters_config[*count].type = type;
> - get_read_event_and_umask(cas_count_cfg, *count);
> - /* Do not fail after incrementing *count. */
> - *count += 1;
> + imc_counter->type = type;
> + get_read_event_and_umask(cas_count_cfg, imc_counter);
> list_add(&imc_counter->entry, &imc_counters_list);
> + found_event = true;
> }
> - if (*count == orig_count) {
> + if (!found_event) {
> ksft_print_msg("Unable to find events in %s\n", imc_events_dir);
> goto out_close;
> }
> @@ -193,7 +187,7 @@ static int parse_imc_read_bw_events(char *imc_dir, unsigned int type,
> }
>
> /* Get type and config of an iMC counter's read event. */
> -static int read_from_imc_dir(char *imc_dir, unsigned int *count)
> +static int read_from_imc_dir(char *imc_dir)
> {
> char imc_counter_type[PATH_MAX];
> unsigned int type;
> @@ -221,7 +215,7 @@ static int read_from_imc_dir(char *imc_dir, unsigned int *count)
> ksft_perror("Could not get iMC type");
> return -1;
> }
> - ret = parse_imc_read_bw_events(imc_dir, type, count);
> + ret = parse_imc_read_bw_events(imc_dir, type);
> if (ret) {
> ksft_print_msg("Unable to parse bandwidth event and umask\n");
> return ret;
> @@ -245,7 +239,6 @@ static int read_from_imc_dir(char *imc_dir, unsigned int *count)
> static int num_of_imcs(void)
> {
> char imc_dir[512], *temp;
> - unsigned int count = 0;
> struct dirent *ep;
> int ret;
> DIR *dp;
> @@ -274,7 +267,7 @@ static int num_of_imcs(void)
> if (temp[0] >= '0' && temp[0] <= '9') {
> sprintf(imc_dir, "%s/%s/", DYN_PMU_PATH,
> ep->d_name);
> - ret = read_from_imc_dir(imc_dir, &count);
> + ret = read_from_imc_dir(imc_dir);
> if (ret) {
> closedir(dp);
>
> @@ -283,7 +276,7 @@ static int num_of_imcs(void)
> }
> }
> closedir(dp);
> - if (count == 0) {
> + if (list_empty(&imc_counters_list)) {
> ksft_print_msg("Unable to find iMC counters\n");
>
> return -1;
> @@ -294,7 +287,7 @@ static int num_of_imcs(void)
> return -1;
> }
>
> - return count;
> + return 0;
Since num_of_imcs() now returns a code instead of the number of iMCs found it seems
appropriate to change its name to match, something like "enumerate_imcs()"? Also
please note that num_of_imcs() still has function comments that needs to be updated
to match the new return code. Please check all patches to ensure when a function
signature is changed its description is considered also.
> }
>
> int initialize_read_mem_bw_imc(void)
hmm ... this patch changes how iMCs are enumerated, now placing the data in the new
linked list, but the rest of the code still refers to the (now uninitialized) array.
After this patch the tests are thus broken and if somebody ever needs to do a bisect
and happens to land on this patch is will cause inconvenience.
Please split patches to be incremental changes where tests continue working after
every patch. You can do so with one patch where utilities receive pointer to
array element instead of index as parameter and another patch that switches the
code to use a list. (If this sounds familiar I just copied&pasted the previous
sentence from the v1 review.)
To provide more detail on what this would look like, consider the changes to
get_read_event_and_umask() in this patch:
@@ -75,7 +75,7 @@ static void read_mem_bw_ioctl_perf_event_ioc_disable(int i)
* @cas_count_cfg: Config
* @count: iMC number
*/
-static void get_read_event_and_umask(char *cas_count_cfg, unsigned int count)
+static void get_read_event_and_umask(char *cas_count_cfg, struct imc_counter_config *imc_counter)
{
char *token[MAX_TOKENS];
int i = 0;
@@ -89,9 +89,9 @@ static void get_read_event_and_umask(char *cas_count_cfg, unsigned int count)
if (!token[i])
break;
if (strcmp(token[i], "event") == 0)
- imc_counters_config[count].event = strtol(token[i + 1], NULL, 16);
+ imc_counter->event = strtol(token[i + 1], NULL, 16);
if (strcmp(token[i], "umask") == 0)
- imc_counters_config[count].umask = strtol(token[i + 1], NULL, 16);
+ imc_counter->umask = strtol(token[i + 1], NULL, 16);
}
}
A change like above can be made as first patch in the series in the code that still supports the
array with the callers providing a pointer to array element instead, for example:
get_read_event_and_umask(cas_count_cfg, &imc_counters_config[*count]);
All the changes to utilities similar to above that simply replaces the index with a pointer to
struct imc_counter_config can be grouped into that first patch of the series. For example, the changes to
read_mem_bw_initialize_perf_event_attr(), open_perf_read_event(), read_mem_bw_ioctl_perf_event_ioc_reset_enable(), etc.
Grouping such changes into a preparatory patch is simple to review and reduces the churn when
switching to the list.
Reinette
^ permalink raw reply
* Re: [PATCH v2 3/6] selftests/resctrl: Refactor the initialization of IMC's perf_event_attr using linked list
From: Reinette Chatre @ 2026-04-22 16:05 UTC (permalink / raw)
To: Yifan Wu, tony.luck, Dave.Martin, james.morse, babu.moger, shuah,
tan.shaopeng, fenghuay, ben.horgan, jonathan.cameron, zengheng4,
linux-kernel, linux-arm-kernel, linux-kselftest, linuxarm
Cc: xiaqinxin, prime.zeng, wangyushan12, xuwei5, fanghao11, wangzhou1
In-Reply-To: <20260410093352.3988125-4-wuyifan50@huawei.com>
Hi Yifan,
On 4/10/26 2:33 AM, Yifan Wu wrote:
> @@ -292,15 +290,17 @@ static int num_of_imcs(void)
>
> int initialize_read_mem_bw_imc(void)
> {
> - int imc;
> + struct imc_counter_config *imc_counter;
> + int ret;
>
> - imcs = num_of_imcs();
> - if (imcs <= 0)
> - return imcs;
> + ret = num_of_imcs();
> + if (ret < 0)
> + return ret;
I see this change from "imcs" to "ret" as a consequence of the semantic change
to num_of_imcs() done in previous patch. Please move this change to be
located with the semantic change that will make that switch easier to understand.
Reinette
^ permalink raw reply
* Re: [PATCH v2 4/6] selftests/resctrl: Refactor perf event open/close using linked list
From: Reinette Chatre @ 2026-04-22 16:05 UTC (permalink / raw)
To: Yifan Wu, tony.luck, Dave.Martin, james.morse, babu.moger, shuah,
tan.shaopeng, fenghuay, ben.horgan, jonathan.cameron, zengheng4,
linux-kernel, linux-arm-kernel, linux-kselftest, linuxarm
Cc: xiaqinxin, prime.zeng, wangyushan12, xuwei5, fanghao11, wangzhou1
In-Reply-To: <20260410093352.3988125-5-wuyifan50@huawei.com>
Hi Yifan,
On 4/10/26 2:33 AM, Yifan Wu wrote:
> Using linked list when open/close perf event.
>
To make this easier to review and avoid breaking the tests, please split this
patch with changes to open_perf_read_event() located in preparatory patch and
the rest located with all the other changes to switch tests to the list instead of the
array.
Please similarly split the similar patches that follow.
Reinette
^ permalink raw reply
* Re: [PATCH v2 6/6] selftests/resctrl: Remove the definition of the IMC counter config array and imcs.
From: Reinette Chatre @ 2026-04-22 16:05 UTC (permalink / raw)
To: Yifan Wu, tony.luck, Dave.Martin, james.morse, babu.moger, shuah,
tan.shaopeng, fenghuay, ben.horgan, jonathan.cameron, zengheng4,
linux-kernel, linux-arm-kernel, linux-kselftest, linuxarm
Cc: xiaqinxin, prime.zeng, wangyushan12, xuwei5, fanghao11, wangzhou1
In-Reply-To: <20260410093352.3988125-7-wuyifan50@huawei.com>
Hi Yifan,
On 4/10/26 2:33 AM, Yifan Wu wrote:
> The definitions of the imc counter configuration array, imcs, and MAX_IMCS
> are removed.
Please squash these removals with the patch that removes the last
usage.
Reinette
^ permalink raw reply
* Re: [PATCH net v2 1/2] net: airoha: Move ndesc initialization at end of airoha_qdma_init_rx_queue()
From: Lorenzo Bianconi @ 2026-04-22 16:09 UTC (permalink / raw)
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: Simon Horman, linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260420-airoha_qdma_init_rx_queue-fix-v2-1-d99347e5c18d@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 2817 bytes --]
> If queue entry or DMA descriptor list allocation fails in
> airoha_qdma_init_rx_queue routine, airoha_qdma_cleanup() will trigger a
> NULL pointer dereference running netif_napi_del() for RX queue NAPIs
> since netif_napi_add() has never been executed to this particular RX NAPI.
> The issue is due to the early ndesc initialization in
> airoha_qdma_init_rx_queue() since airoha_qdma_cleanup() relies on ndesc
> value to check if the queue is properly initialized. Fix the issue moving
> ndesc initialization at end of airoha_qdma_init_tx routine.
> Move page_pool allocation after descriptor list allocation in order to
> avoid memory leaks if desc allocation fails.
>
> Fixes: 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
> drivers/net/ethernet/airoha/airoha_eth.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index e1ab15f1ee7d..fc79c456743c 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> @@ -745,14 +745,18 @@ static int airoha_qdma_init_rx_queue(struct airoha_queue *q,
> dma_addr_t dma_addr;
>
> q->buf_size = PAGE_SIZE / 2;
> - q->ndesc = ndesc;
> q->qdma = qdma;
>
> - q->entry = devm_kzalloc(eth->dev, q->ndesc * sizeof(*q->entry),
> + q->entry = devm_kzalloc(eth->dev, ndesc * sizeof(*q->entry),
> GFP_KERNEL);
> if (!q->entry)
> return -ENOMEM;
>
> + q->desc = dmam_alloc_coherent(eth->dev, ndesc * sizeof(*q->desc),
> + &dma_addr, GFP_KERNEL);
> + if (!q->desc)
> + return -ENOMEM;
> +
> q->page_pool = page_pool_create(&pp_params);
> if (IS_ERR(q->page_pool)) {
> int err = PTR_ERR(q->page_pool);
> @@ -761,11 +765,7 @@ static int airoha_qdma_init_rx_queue(struct airoha_queue *q,
> return err;
> }
>
> - q->desc = dmam_alloc_coherent(eth->dev, q->ndesc * sizeof(*q->desc),
> - &dma_addr, GFP_KERNEL);
> - if (!q->desc)
> - return -ENOMEM;
> -
> + q->ndesc = ndesc;
> netif_napi_add(eth->napi_dev, &q->napi, airoha_qdma_rx_napi_poll);
>
> airoha_qdma_wr(qdma, REG_RX_RING_BASE(qid), dma_addr);
>
> --
> 2.53.0
>
As requested, I am commenting the issue reported by Sashiko on this patch:
https://sashiko.dev/#/patchset/20260420-airoha_qdma_init_rx_queue-fix-v2-0-d99347e5c18d%40kernel.org
- Does this code leave a regression in the TX path by omitting the equivalent fix?
This issue is not related to this patch and already fixed here:
https://patchwork.kernel.org/project/netdevbpf/patch/20260417-airoha_qdma_cleanup_tx_queue-fix-net-v4-1-e04bcc2c9642@kernel.org/
Regards,
Lorenzo
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply
* Re: [PATCH net v2 2/2] net: airoha: Add size check for TX NAPIs in airoha_qdma_cleanup()
From: Lorenzo Bianconi @ 2026-04-22 16:12 UTC (permalink / raw)
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: Simon Horman, linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260420-airoha_qdma_init_rx_queue-fix-v2-2-d99347e5c18d@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 2645 bytes --]
> If airoha_qdma_init routine fails before airoha_qdma_tx_irq_init() runs
> successfully for all TX NAPIs, airoha_qdma_cleanup() will
> unconditionally runs netif_napi_del() on TX NAPIs, triggering a NULL
> pointer dereference. Fix the issue relying on q_tx_irq size value to
> check if the TX NAPIs is properly initialized in airoha_qdma_cleanup().
> Moreover, run netif_napi_add_tx() just if irq_q queue is properly
> allocated.
>
> Fixes: 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
> drivers/net/ethernet/airoha/airoha_eth.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index fc79c456743c..fd8c4f817d85 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> @@ -996,8 +996,6 @@ static int airoha_qdma_tx_irq_init(struct airoha_tx_irq_queue *irq_q,
> struct airoha_eth *eth = qdma->eth;
> dma_addr_t dma_addr;
>
> - netif_napi_add_tx(eth->napi_dev, &irq_q->napi,
> - airoha_qdma_tx_napi_poll);
> irq_q->q = dmam_alloc_coherent(eth->dev, size * sizeof(u32),
> &dma_addr, GFP_KERNEL);
> if (!irq_q->q)
> @@ -1007,6 +1005,9 @@ static int airoha_qdma_tx_irq_init(struct airoha_tx_irq_queue *irq_q,
> irq_q->size = size;
> irq_q->qdma = qdma;
>
> + netif_napi_add_tx(eth->napi_dev, &irq_q->napi,
> + airoha_qdma_tx_napi_poll);
> +
> airoha_qdma_wr(qdma, REG_TX_IRQ_BASE(id), dma_addr);
> airoha_qdma_rmw(qdma, REG_TX_IRQ_CFG(id), TX_IRQ_DEPTH_MASK,
> FIELD_PREP(TX_IRQ_DEPTH_MASK, size));
> @@ -1398,8 +1399,12 @@ static void airoha_qdma_cleanup(struct airoha_qdma *qdma)
> }
> }
>
> - for (i = 0; i < ARRAY_SIZE(qdma->q_tx_irq); i++)
> + for (i = 0; i < ARRAY_SIZE(qdma->q_tx_irq); i++) {
> + if (!qdma->q_tx_irq[i].size)
> + continue;
> +
> netif_napi_del(&qdma->q_tx_irq[i].napi);
> + }
>
> for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++) {
> if (!qdma->q_tx[i].ndesc)
>
> --
> 2.53.0
>
Commenting the issue reported by Sashiko here:
https://sashiko.dev/#/patchset/20260420-airoha_qdma_init_rx_queue-fix-v2-0-d99347e5c18d%40kernel.org
- Could a similar vulnerability still exist in the TX queue initialization and cleanup path?
This issue is not related to this patch and already fixed here:
https://patchwork.kernel.org/project/netdevbpf/patch/20260417-airoha_qdma_cleanup_tx_queue-fix-net-v4-1-e04bcc2c9642@kernel.org/
Regards,
Lorenzo
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply
* Re: [PATCH] net/stmmac: Fix typos: 'tx_undeflow_irq' -> 'tx_underflow_irq'
From: Andrew Lunn @ 2026-04-22 16:15 UTC (permalink / raw)
To: Jakub Raczynski
Cc: netdev, linux-kernel, kuba, davem, andrew+netdev, kernel-janitors,
linux-arm-kernel, linux-stm32
In-Reply-To: <aejYCYObZyFPpLat@AMDC4622.eu.corp.samsungelectronics.net>
On Wed, Apr 22, 2026 at 04:15:37PM +0200, Jakub Raczynski wrote:
> On Wed, Apr 22, 2026 at 02:47:38PM +0200, Andrew Lunn wrote:
> > > I don't see anything wrong with it?
> > > - naming is correct, same as stmmac_extra_stats from common.h, as it
> > > wouldn't compile otherwise
> > > - string length is ok, as max name length is ETH_GSTRING_LEN=32 and it is
> > > not close
> > > - ethtool just polls data from driver and in my tests it is ok
> > > - all instances of 'undeflow' are changed
> > > - 'underflow' semantic is ok, 'undeflow' is just not correct
> > >
> > > Please correct me if I am wrong, but imo no issues with this patch.
> >
> > ABI
> >
> > This name is published as part of the kAPI. You are changing its
> > name. User space could be looking for this name, even thought it has a
> > typo in it.
> >
> > Andrew
> >
> I don't think it is? This part of extra stats (struct stmmac_extra_stats) and
> is not part of standard ABI from
> Documentation/ABI/testing/sysfs-class-net-statistics
> nor is mentioned in
> Documentation/networking/device_drivers/ethernet/stmicro/stmmac.rst
>
> These extra stats are specific to stmmac driver and most of these are more
> than standard
> https://www.kernel.org/doc/html/v7.0/networking/statistics.html#c.rtnl_link_stats64
> This name does not exist outside stmmac driver, so while some application may
> expect this (stmmac specific app), question is should this typo stick?
47dd7a540b8a0 drivers/net/stmmac/stmmac_ethtool.c (Giuseppe Cavallaro 2009-10-14 15:13:45 -0700 81) STMMAC_STAT(tx_undeflow_irq),
It has been exposed to user space for 17 years. In that time, there
could well be stmmac specific apps using it.
Just because it is not documented as ABI does not make it not ABI.
Andrew
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox