Live Patching

Live Patching
 help / color / mirror / Atom feed

* Re: [PATCH v2 0/6] kselftests: livepatch: Adapt tests to be executed on 4.12 kernels
From: Josh Poimboeuf @ 2026-04-16 17:07 UTC (permalink / raw)
  To: Marcos Paulo de Souza
  Cc: Jiri Kosina, Miroslav Benes, Petr Mladek, Joe Lawrence,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <20260413-lp-tests-old-fixes-v2-0-367c7cb5006f@suse.com>

On Mon, Apr 13, 2026 at 02:26:11PM -0300, Marcos Paulo de Souza wrote:
> A new version of the patchset, with fewer patches now. Please take a look!
> 
> Original cover-letter:
> These patches don't really change how the patches are run, just skip
> some tests on kernels that don't support a feature (like kprobe and
> livepatched living together) or when a livepatch sysfs attribute is
> missing.
> 
> The last patch slightly adjusts check_result function to skip dmesg
> messages on SLE kernels when a livepatch is removed.

Why are we adding complexity to support Linux 4.12 in mainline?  Isn't
that what enterprise distros are for?

-- 
Josh

^ permalink raw reply

* Re: [PATCH v2 0/6] kselftests: livepatch: Adapt tests to be executed on 4.12 kernels
From: Marcos Paulo de Souza @ 2026-04-16 18:18 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Kosina, Miroslav Benes, Petr Mladek, Joe Lawrence,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <wrecfrmldslvr4dvtb7hrmi3w6joby4qmray3fd3f4dfc2k2tv@ficeojpjxjop>

On Thu, 2026-04-16 at 10:07 -0700, Josh Poimboeuf wrote:
> On Mon, Apr 13, 2026 at 02:26:11PM -0300, Marcos Paulo de Souza
> wrote:
> > A new version of the patchset, with fewer patches now. Please take
> > a look!
> > 
> > Original cover-letter:
> > These patches don't really change how the patches are run, just
> > skip
> > some tests on kernels that don't support a feature (like kprobe and
> > livepatched living together) or when a livepatch sysfs attribute is
> > missing.
> > 
> > The last patch slightly adjusts check_result function to skip dmesg
> > messages on SLE kernels when a livepatch is removed.
> 
> Why are we adding complexity to support Linux 4.12 in mainline? 
> Isn't
> that what enterprise distros are for?

These changes do not add any new complex code, just checks to enable
the tests to run on older kernels. I believe that it would be good for
all enterprises distros if they could run more tests in maintenance
updates of their kernels using the upstream tests.

The changes are not really that big. Some patches were removed from v1
because there were adding checks for out-of-tree messages (like the
last paragraph of the v2 erroneously shows), and another one was to
check if kprobes could live alongside livepatches, which fails for 4.12
kernels.

The patches for this versions introduce only checks to avoid testing
sysfs attributes for kernels that don't supports them.

^ permalink raw reply

* Re: [RFC PATCH 1/2] kernel/notifier: replace single-linked list with double-linked list for reverse traversal
From: David Laight @ 2026-04-16 19:15 UTC (permalink / raw)
  To: Petr Mladek
  Cc: chensong_2000, rafael, lenb, mturquette, sboyd, viresh.kumar, agk,
	snitzer, mpatocka, bmarzins, song, yukuai, linan122, jason.wessel,
	danielt, dianders, horms, davem, edumazet, kuba, pabeni, paulmck,
	frederic, mcgrof, petr.pavlu, da.gomez, samitolvanen, atomlin,
	jpoimboe, jikos, mbenes, joe.lawrence, rostedt, mhiramat,
	mark.rutland, mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <aeD4H8P1DiPQoM8V@pathway.suse.cz>

On Thu, 16 Apr 2026 16:54:23 +0200
Petr Mladek <pmladek@suse.com> wrote:

> On Thu 2026-04-16 13:30:04, David Laight wrote:
> > On Wed, 15 Apr 2026 15:01:37 +0800
> > chensong_2000@189.cn wrote:
> >   
> > > From: Song Chen <chensong_2000@189.cn>
> > > 
> > > The current notifier chain implementation uses a single-linked list
> > > (struct notifier_block *next), which only supports forward traversal
> > > in priority order. This makes it difficult to handle cleanup/teardown
> > > scenarios that require notifiers to be called in reverse priority order.  
> > 
> > If it is only cleanup/teardown then the list can be order-reversed
> > as part of that process at the same time as the list is deleted.  
> 
> Interesting idea. But it won't work in all situations.

It is useful for things like locklessy queuing a request to be processed later.
Items can be added with a cmpxchg and the list grabbed by xchg of NULL.
The only downside is that reversing a list isn't cache friendly.
Thinks... although that may not be any worse than accessing the current 'tail'
to add to the end of a doubly linked (or singly linked with a tail ptr) list.

	David

> 
> Note that the motivation for this update are the module loader
> notifiers which are called several times for each loaded/removed module.
> 
> Best Regards,
> Petr
> 


^ permalink raw reply

* Re: [PATCH] samples/livepatch: Add BPF struct_ops integration sample
From: Yafang Shao @ 2026-04-17  7:45 UTC (permalink / raw)
  To: Song Liu
  Cc: live-patching, linux-kernel, jpoimboe, jikos, mbenes, pmladek,
	joe.lawrence, kernel-team
In-Reply-To: <CAPhsuW5=oXwQQyOU7Hf6Qf5=tK=-J75Xr+p+dcGiPs2vVEeMFw@mail.gmail.com>

On Fri, Apr 17, 2026 at 12:33 AM Song Liu <song@kernel.org> wrote:
>
> On Thu, Apr 16, 2026 at 12:46 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> [...]
> > > +
> > > +static struct klp_patch patch = {
> > > +       .mod = THIS_MODULE,
> > > +       .objs = objs,
> >
> >   Nit: I suggest enabling the replace flag for this patch to align
> > with the recommended implementation.
> >
> >     .replace = true,
>
> This is an interesting topic. To fully take advantage of the replace
> feature, we need more work on the BPF side.

Right.
Replacement seems to break struct_ops registration and BPF
re-attachment. On the livepatch side, we should add support for the
'livepatch tag' to prevent these types of livepatches from being
replaced ;)

>
> For this sample, I guess we are OK either way.

OK

-- 
Regards
Yafang

^ permalink raw reply

* Re: [PATCH] samples/livepatch: Add BPF struct_ops integration sample
From: Miroslav Benes @ 2026-04-17  8:13 UTC (permalink / raw)
  To: Song Liu
  Cc: live-patching, linux-kernel, jpoimboe, jikos, pmladek,
	joe.lawrence, laoar.shao, kernel-team
In-Reply-To: <20260416001628.2062468-1-song@kernel.org>

Hi,

On Wed, 15 Apr 2026, Song Liu wrote:

> Add a sample module that demonstrates how BPF struct_ops can work
> together with kernel livepatch. The module livepatches
> cmdline_proc_show() and delegates the output to a BPF struct_ops
> callback. When no BPF program is attached, a fallback message is
> shown; when a BPF struct_ops program is attached, it controls the
> /proc/cmdline output via the bpf_klp_seq_write kfunc.
> 
> This builds on the existing livepatch-sample.c pattern but shows how
> livepatch and BPF struct_ops can be combined to make livepatched
> behavior programmable from userspace.
> 
> The module is built when both CONFIG_SAMPLE_LIVEPATCH and
> CONFIG_BPF_JIT are enabled.
> 
> Signed-off-by: Song Liu <song@kernel.org>

Interesting. It does not make me comfortable to be honest. Is this 
something we want to advertise through samples?

Sashiko has comments... 
https://sashiko.dev/#/patchset/20260416001628.2062468-1-song%40kernel.org

Miroslav

^ permalink raw reply

* Re: [PATCH] samples/livepatch: Add BPF struct_ops integration sample
From: Petr Mladek @ 2026-04-17 13:20 UTC (permalink / raw)
  To: Song Liu
  Cc: Yafang Shao, live-patching, linux-kernel, jpoimboe, jikos, mbenes,
	joe.lawrence, kernel-team
In-Reply-To: <CAPhsuW5=oXwQQyOU7Hf6Qf5=tK=-J75Xr+p+dcGiPs2vVEeMFw@mail.gmail.com>

On Thu 2026-04-16 09:32:46, Song Liu wrote:
> On Thu, Apr 16, 2026 at 12:46 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> [...]
> > > +
> > > +static struct klp_patch patch = {
> > > +       .mod = THIS_MODULE,
> > > +       .objs = objs,
> >
> >   Nit: I suggest enabling the replace flag for this patch to align
> > with the recommended implementation.
> >
> >     .replace = true,
> 
> This is an interesting topic. To fully take advantage of the replace
> feature, we need more work on the BPF side.

IMHO, this brings a synchronization problem. I do not have practical
experience with BPF but I expect that:

   + The BPF program could be loaded only when the related bpf_struct_ops
     is registered.

   + The bpf_struct_ops is registered when the livepatch module is being
     loaded.

   + The new livepatch module would replace the older one when it
     is being loaded.

Now, we would need to load the BPF problem after the bpf_struct_ops
is registered but before the livepatch gets enabled. Otherwise,
the new livepatch would not continue working as the previous one.

Let' use the code from this patch:

static int __init livepatch_bpf_init(void)
{
	int ret;

	ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
					&klp_bpf_kfunc_set);
	ret = ret ?: register_bpf_struct_ops(&bpf_klp_bpf_cmdline_ops,
					     klp_bpf_cmdline_ops);
	if (ret)
		return ret;

--->	/*
--->	 * We would need to wait here until the BPF program gets loaded.
--->	 * for the new bpf_struct_ops in this new livepatch.
--->     */
	return klp_enable_patch(&patch);
}

Or maybe, the bpf_struct_ops can be _allocated dynamically_ and
the pointer might be _passed via shadow variables_.

One problem is that shadow variables would add another overhead
and need not be suitable for hot paths.

Anyway, I think that I have similar feelings as Miroslav.
The combination of livepatches and BPF programs increases
the complexity for all involved parties: core kernel maintainers,
livepatch and BPF program authors, and system maintainers.

Do we really want to propagate it?
Is there any significant advantage in combining these two, please?
Is it significantly easier to write BPF program then a livepatch?
Is it significantly easier to update BPF programs then livepatches?

IMHO, the livepatches should allow enough flexibility. And it might
be easier to update the livepatch when needed.

Or do you install more independent livepatches as well?

Would the support of different replace tags help?
They would allow to replace only livepatches with the same tag.

Best Regards,
Petr

^ permalink raw reply

* Re: [PATCH v3 6/8] arm64/module, sframe: Add sframe support for modules.
From: Jens Remus @ 2026-04-17 14:07 UTC (permalink / raw)
  To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
	Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
	Catalin Marinas, Jiri Kosina
  Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
	joe.lawrence, linux-toolchains, linux-kernel, live-patching,
	linux-arm-kernel
In-Reply-To: <20260406185000.1378082-7-dylanbhatch@google.com>

On 4/6/2026 8:49 PM, Dylan Hatch wrote:
> Add sframe table to mod_arch_specific and support sframe PC lookups when
> an .sframe section can be found on incoming modules.
> 
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> Signed-off-by: Weinan Liu <wnliu@google.com>

Reviewed-by: Jens Remus <jremus@linux.ibm.com>

> ---
>  arch/arm64/include/asm/module.h |  6 +++++
>  arch/arm64/kernel/module.c      |  8 +++++++
>  include/linux/sframe.h          |  2 ++
>  kernel/unwind/sframe.c          | 39 +++++++++++++++++++++++++++++++--
>  4 files changed, 53 insertions(+), 2 deletions(-)
Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply

* Re: [PATCH v2 1/6] selftests: livepatch: Check for ARCH_HAS_SYSCALL_WRAPPER config
From: Petr Mladek @ 2026-04-17 15:11 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Marcos Paulo de Souza, Josh Poimboeuf, Jiri Kosina, Joe Lawrence,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <alpine.LSU.2.21.2604151154330.1967@pobox.suse.cz>

On Wed 2026-04-15 11:58:50, Miroslav Benes wrote:
> On Mon, 13 Apr 2026, Marcos Paulo de Souza wrote:
> 
> > Older kernels that lack CONFIG_ARCH_HAS_SYSCALL_WRAPPER config don't
> > have any prefixes for their syscalls. The same applies to current
> > powerpc and loongarch, covering all currently supported architectures
> > that support livepatch.
> > 
> > The other supported architectures have specific prefixes, so error out
> > when a new architecture adds livepatch support with wrappes but didn't
> > update the test to include it.
> > 
> > --- a/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> > +++ b/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> > @@ -12,15 +12,26 @@
> >  #include <linux/slab.h>
> >  #include <linux/livepatch.h>
> >  
> > -#if defined(__x86_64__)
> > +/*
> > + * Before CONFIG_ARCH_HAS_SYSCALL_WRAPPER was introduced there were no
> > + * prefixes for system calls.
> > + * Both ppc and loongarch does not set prefixes for their system calls either.
> > + */
> > +#if !defined(CONFIG_ARCH_HAS_SYSCALL_WRAPPER) ||  defined(__powerpc__) || \
> > +	defined(__loongarch__)
> > +#define FN_PREFIX
> > +#elif defined(__x86_64__)
> >  #define FN_PREFIX __x64_
> >  #elif defined(__s390x__)
> >  #define FN_PREFIX __s390x_
> >  #elif defined(__aarch64__)
> >  #define FN_PREFIX __arm64_
> > -#else
> > -/* powerpc does not select ARCH_HAS_SYSCALL_WRAPPER */
> > +#elif defined(__powerpc__)
> > +#define FN_PREFIX
> > +#elif defined(__loongarch__)
> >  #define FN_PREFIX
> > +#else
> > +#error "Missing syscall wrapper for the given architecture."
> >  #endif
> 
> I know that Sashiko commented on that already but even with that I wonder 
> if it was cleaner to structure it differently...
> 
> #if defined(CONFIG_ARCH_HAS_SYSCALL_WRAPPER)
>   #if define(__x86_64__)
>   ...
>   #elif define(__powerpc__)
>     #define FN_PREFIX
>   #else
>     #error
>   #endif
> #elif
>   #define FN_PREFIX
> #endif

Yeah, this looks better.

Best Regards,
Petr

^ permalink raw reply

* Re: [PATCH v2 4/6] selftests: livepatch: Check if patched sysfs attribute exists
From: Petr Mladek @ 2026-04-17 15:21 UTC (permalink / raw)
  To: Marcos Paulo de Souza
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Joe Lawrence,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <20260413-lp-tests-old-fixes-v2-4-367c7cb5006f@suse.com>

On Mon 2026-04-13 14:26:15, Marcos Paulo de Souza wrote:
> In order to run the selftests on older kernels, check if given kernel
> has support for the attribute. If the attribute is not supported, skip
> the checks.
> 
> Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
> ---
>  tools/testing/selftests/livepatch/test-sysfs.sh | 38 +++++++++++++++----------
>  1 file changed, 23 insertions(+), 15 deletions(-)
> 
> diff --git a/tools/testing/selftests/livepatch/test-sysfs.sh b/tools/testing/selftests/livepatch/test-sysfs.sh
> index 58fe1d96997c..a2d649404a63 100755
> --- a/tools/testing/selftests/livepatch/test-sysfs.sh
> +++ b/tools/testing/selftests/livepatch/test-sysfs.sh
> @@ -8,6 +8,8 @@ MOD_LIVEPATCH=test_klp_livepatch
>  MOD_LIVEPATCH2=test_klp_callbacks_demo
>  MOD_LIVEPATCH3=test_klp_syscall
>  

It would be nice to add a comment in which upstream kernel version
the attribute was introduced. It might help to decide when it is
time to remove the check, see below.

> +HAS_PATCH_ATTR=0
> +
>  setup_config
>  
>  # - load a livepatch and verifies the sysfs entries work as expected
> @@ -25,8 +27,12 @@ check_sysfs_rights "$MOD_LIVEPATCH" "stack_order" "-r--r--r--"
>  check_sysfs_value  "$MOD_LIVEPATCH" "stack_order" "1"
>  check_sysfs_rights "$MOD_LIVEPATCH" "transition" "-r--r--r--"
>  check_sysfs_value  "$MOD_LIVEPATCH" "transition" "0"
> -check_sysfs_rights "$MOD_LIVEPATCH" "vmlinux/patched" "-r--r--r--"
> -check_sysfs_value  "$MOD_LIVEPATCH" "vmlinux/patched" "1"
> +
> +if does_sysfs_exists "$MOD_LIVEPATCH/vmlinux" "patched"; then
> +	check_sysfs_rights "$MOD_LIVEPATCH" "vmlinux/patched" "-r--r--r--"
> +	check_sysfs_value  "$MOD_LIVEPATCH" "vmlinux/patched" "1"
> +	HAS_PATCH_ATTR=1
> +fi
>  
>  disable_lp $MOD_LIVEPATCH
>  
> @@ -45,23 +51,24 @@ livepatch: '$MOD_LIVEPATCH': completing unpatching transition
>  livepatch: '$MOD_LIVEPATCH': unpatching complete
>  % rmmod $MOD_LIVEPATCH"
>  
> -start_test "sysfs test object/patched"
> +if [[ "$HAS_PATCH_ATTR" == "1" ]]; then
> +	start_test "sysfs test object/patched"
>  
> -MOD_LIVEPATCH=test_klp_callbacks_demo
> -MOD_TARGET=test_klp_callbacks_mod
> -load_lp $MOD_LIVEPATCH
> +	MOD_LIVEPATCH=test_klp_callbacks_demo
> +	MOD_TARGET=test_klp_callbacks_mod
> +	load_lp $MOD_LIVEPATCH
>  
> -# check the "patch" file changes as target module loads/unloads
> -check_sysfs_value  "$MOD_LIVEPATCH" "$MOD_TARGET/patched" "0"
> -load_mod $MOD_TARGET
> -check_sysfs_value  "$MOD_LIVEPATCH" "$MOD_TARGET/patched" "1"
> -unload_mod $MOD_TARGET
> -check_sysfs_value  "$MOD_LIVEPATCH" "$MOD_TARGET/patched" "0"
> +	# check the "patch" file changes as target module loads/unloads
> +	check_sysfs_value  "$MOD_LIVEPATCH" "$MOD_TARGET/patched" "0"
> +	load_mod $MOD_TARGET
> +	check_sysfs_value  "$MOD_LIVEPATCH" "$MOD_TARGET/patched" "1"
> +	unload_mod $MOD_TARGET
> +	check_sysfs_value  "$MOD_LIVEPATCH" "$MOD_TARGET/patched" "0"
>  
> -disable_lp $MOD_LIVEPATCH
> -unload_lp $MOD_LIVEPATCH
> +	disable_lp $MOD_LIVEPATCH
> +	unload_lp $MOD_LIVEPATCH
>  
> -check_result "% insmod test_modules/test_klp_callbacks_demo.ko
> +	check_result "% insmod test_modules/test_klp_callbacks_demo.ko
>  livepatch: enabling patch 'test_klp_callbacks_demo'
>  livepatch: 'test_klp_callbacks_demo': initializing patching transition
>  test_klp_callbacks_demo: pre_patch_callback: vmlinux
> @@ -87,6 +94,7 @@ livepatch: 'test_klp_callbacks_demo': completing unpatching transition
>  test_klp_callbacks_demo: post_unpatch_callback: vmlinux
>  livepatch: 'test_klp_callbacks_demo': unpatching complete
>  % rmmod test_klp_callbacks_demo"
> +fi

The indentation mismatch (code vs check_results) looks a bit ugly.
Well, it is rather common thing in bash scripts. And it might
even help to find where the next text starts.

I could live with it. But I am a bit biased.

I would like to see an ack from a non-SUSE person before
taking this ;-)

Best Regards,
Petr

PS: Otherwise, the patchset look good to me, modulo
    the Sashiko AI comments.

^ permalink raw reply

* Re: [PATCH] samples/livepatch: Add BPF struct_ops integration sample
From: Song Liu @ 2026-04-17 15:45 UTC (permalink / raw)
  To: Yafang Shao
  Cc: live-patching, linux-kernel, jpoimboe, jikos, mbenes, pmladek,
	joe.lawrence, kernel-team
In-Reply-To: <CALOAHbAwmPnEFgzzrjQWdp=tfR4rZPoyn-at4EYFO0TX6rCLHA@mail.gmail.com>

On Fri, Apr 17, 2026 at 12:45 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Fri, Apr 17, 2026 at 12:33 AM Song Liu <song@kernel.org> wrote:
> >
> > On Thu, Apr 16, 2026 at 12:46 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > [...]
> > > > +
> > > > +static struct klp_patch patch = {
> > > > +       .mod = THIS_MODULE,
> > > > +       .objs = objs,
> > >
> > >   Nit: I suggest enabling the replace flag for this patch to align
> > > with the recommended implementation.
> > >
> > >     .replace = true,
> >
> > This is an interesting topic. To fully take advantage of the replace
> > feature, we need more work on the BPF side.
>
> Right.
> Replacement seems to break struct_ops registration and BPF
> re-attachment. On the livepatch side, we should add support for the
> 'livepatch tag' to prevent these types of livepatches from being
> replaced ;)

I somehow knew you were gonna say this. :) But I don't think this
is a strong enough argument (and will probably scare folks more).

Thanks,
Song

^ permalink raw reply

* Re: [PATCH v3 8/8] unwind: arm64: Use sframe to unwind interrupt frames.
From: Jens Remus @ 2026-04-17 15:45 UTC (permalink / raw)
  To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
	Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
	Catalin Marinas, Jiri Kosina
  Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
	joe.lawrence, linux-toolchains, linux-kernel, live-patching,
	linux-arm-kernel, Heiko Carstens
In-Reply-To: <20260406185000.1378082-9-dylanbhatch@google.com>

Hello Dylan and Weinan!

On 4/6/2026 8:50 PM, Dylan Hatch wrote:
> Add unwind_next_frame_sframe() function to unwind by sframe info if
> present. Use this method at exception boundaries, falling back to
> frame-pointer unwind only on failure. In such failure cases, the
> stacktrace is considered unreliable.
> 
> During normal unwind, prefer frame pointer unwind (for better
> performance) with sframe as a backup.
> 
> This change restores the LR behavior originally introduced in commit
> c2c6b27b5aa14fa2 ("arm64: stacktrace: unwind exception boundaries"),
> But later removed in commit 32ed1205682e ("arm64: stacktrace: Skip
> reporting LR at exception boundaries")
> 
> This can be done because the sframe data can be used to determine
> whether the LR is current for the PC value recovered from pt_regs at the
> exception boundary.
> 
> Signed-off-by: Weinan Liu <wnliu@google.com>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>

> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c

> +/*
> + * Unwind to the next frame according to sframe.
> + */
> +static __always_inline int
> +unwind_next_frame_sframe(struct kunwind_state *state)
> +{
> +	struct unwind_frame frame;
> +	unsigned long cfa, fp, ra;
> +	enum kunwind_source source = KUNWIND_SOURCE_FRAME;
> +	struct pt_regs *regs = state->regs;
> +
> +	int err;
> +
> +	/* FP/SP alignment 8 bytes */
> +	if (state->common.fp & 0x7 || state->common.sp & 0x7)
> +		return -EINVAL;
> +
> +	/*
> +	 * Most/all outermost functions are not visible to sframe. So, check for
> +	 * a meta frame record if the sframe lookup fails.
> +	 */
> +	err = sframe_find_kernel(state->common.pc, &frame);
> +	if (err)
> +		return kunwind_next_frame_record_meta(state);
> +
> +	if (frame.outermost)
> +		return -ENOENT;
> +
> +	/* Get the Canonical Frame Address (CFA) */
> +	switch (frame.cfa.rule) {
> +	case UNWIND_CFA_RULE_SP_OFFSET:
> +		cfa = state->common.sp;
> +		break;
> +	case UNWIND_CFA_RULE_FP_OFFSET:
> +		if (state->common.fp < state->common.sp)
> +			return -EINVAL;

I wonder whether that check is valid in kernel?  Looking at
call_on_irq_stack() saving SP in FP and then loading SP with the IRQ SP.
Is that condition always true then?

> +		cfa = state->common.fp;
> +		break;
> +	case UNWIND_CFA_RULE_REG_OFFSET:
> +	case UNWIND_CFA_RULE_REG_OFFSET_DEREF:
> +		if (!regs)

		if (!regs || frame.cfa.regnum > 30)

> +			return -EINVAL;
> +		cfa = regs->regs[frame.cfa.regnum];

In unwind user this is guarded by a topmost frame check, as arbitrary
registers are otherwise not available.  Isn't this necessary in the
kernel case?

> +		break;
> +	default:
> +		WARN_ON_ONCE(1);
> +		return -EINVAL;
> +	}
> +	cfa += frame.cfa.offset;
> +
> +	/*
> +	 * CFA typically points to a higher address than RA or FP, so don't
> +	 * consume from the stack when we read it.
> +	 */
> +	if (frame.cfa.rule & UNWIND_RULE_DEREF &&
> +	    !get_word(&state->common, &cfa))
> +		return -EINVAL;
> +
> +	/* CFA alignment 8 bytes */
> +	if (cfa & 0x7)
> +		return -EINVAL;
> +
> +	/* Get the Return Address (RA) */
> +	switch (frame.ra.rule) {
> +	case UNWIND_RULE_RETAIN:
> +		if (!regs)
> +			return -EINVAL;
> +		ra = regs->regs[30];

Likewise: Topmost frame check not required to access arbitrary registers
(including RA/LR)?  Furthermore, provided don't have a thinko, LR may
only be in LR in the topmost frame.  In any other frame it must have
been saved.  Otherwise there would be an endless return loop.

> +		source = KUNWIND_SOURCE_REGS_LR;
> +		break;
> +	/* UNWIND_USER_RULE_CFA_OFFSET not implemented on purpose */
> +	case UNWIND_RULE_CFA_OFFSET_DEREF:
> +		ra = cfa + frame.ra.offset;
> +		break;
> +	case UNWIND_RULE_REG_OFFSET:
> +	case UNWIND_RULE_REG_OFFSET_DEREF:
> +		if (!regs)

		if (!regs || frame.cfa.regnum > 30)

> +			return -EINVAL;
> +		ra = regs->regs[frame.cfa.regnum];

Likewise: Topmost frame check not required to access arbitrary registers?

> +		ra += frame.ra.offset;
> +		break;
> +	default:
> +		WARN_ON_ONCE(1);
> +		return -EINVAL;
> +	}
> +
> +	/* Get the Frame Pointer (FP) */
> +	switch (frame.fp.rule) {
> +	case UNWIND_RULE_RETAIN:
> +		fp = state->common.fp;
> +		break;
> +	/* UNWIND_USER_RULE_CFA_OFFSET not implemented on purpose */
> +	case UNWIND_RULE_CFA_OFFSET_DEREF:
> +		fp = cfa + frame.fp.offset;
> +		break;
> +	case UNWIND_RULE_REG_OFFSET:
> +	case UNWIND_RULE_REG_OFFSET_DEREF:
> +		if (!regs)

		if (!regs || frame.cfa.regnum > 30)

> +			return -EINVAL;
> +		fp = regs->regs[frame.fp.regnum];

Likewise: Topmost frame check not required to access arbitrary registers?

> +		fp += frame.fp.offset;
> +		break;
> +	default:
> +		WARN_ON_ONCE(1);
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * Consume RA and FP from the stack. The frame record puts FP at a lower
> +	 * address than RA, so we always read FP first.
> +	 */
> +	if (frame.fp.rule & UNWIND_RULE_DEREF &&
> +	    !get_word(&state->common, &fp))
> +		return -EINVAL;
> +
> +	if (frame.ra.rule & UNWIND_RULE_DEREF &&
> +	    get_consume_word(&state->common, &ra))
> +		return -EINVAL;
> +
> +	state->common.pc = ra;
> +	state->common.sp = cfa;
> +	state->common.fp = fp;
> +
> +	state->source = source;
> +
> +	return 0;
> +}
Thanks and regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply

* Re: [PATCH] samples/livepatch: Add BPF struct_ops integration sample
From: Song Liu @ 2026-04-17 15:46 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: live-patching, linux-kernel, jpoimboe, jikos, pmladek,
	joe.lawrence, laoar.shao, kernel-team
In-Reply-To: <alpine.LSU.2.21.2604171011570.24300@pobox.suse.cz>

On Fri, Apr 17, 2026 at 1:13 AM Miroslav Benes <mbenes@suse.cz> wrote:
>
> Hi,
>
> On Wed, 15 Apr 2026, Song Liu wrote:
>
> > Add a sample module that demonstrates how BPF struct_ops can work
> > together with kernel livepatch. The module livepatches
> > cmdline_proc_show() and delegates the output to a BPF struct_ops
> > callback. When no BPF program is attached, a fallback message is
> > shown; when a BPF struct_ops program is attached, it controls the
> > /proc/cmdline output via the bpf_klp_seq_write kfunc.
> >
> > This builds on the existing livepatch-sample.c pattern but shows how
> > livepatch and BPF struct_ops can be combined to make livepatched
> > behavior programmable from userspace.
> >
> > The module is built when both CONFIG_SAMPLE_LIVEPATCH and
> > CONFIG_BPF_JIT are enabled.
> >
> > Signed-off-by: Song Liu <song@kernel.org>
>
> Interesting. It does not make me comfortable to be honest. Is this
> something we want to advertise through samples?

I can understand your concern. We don't have to add these to the
samples.

> Sashiko has comments...
> https://sashiko.dev/#/patchset/20260416001628.2062468-1-song%40kernel.org

These are valid points.

Thanks,
Song

^ permalink raw reply

* Re: [PATCH] samples/livepatch: Add BPF struct_ops integration sample
From: Song Liu @ 2026-04-17 15:52 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Yafang Shao, live-patching, linux-kernel, jpoimboe, jikos, mbenes,
	joe.lawrence, kernel-team
In-Reply-To: <aeIzhNyvYaR2Krrq@pathway.suse.cz>

On Fri, Apr 17, 2026 at 6:20 AM Petr Mladek <pmladek@suse.com> wrote:
>
> On Thu 2026-04-16 09:32:46, Song Liu wrote:
[...]
> Let' use the code from this patch:
>
> static int __init livepatch_bpf_init(void)
> {
>         int ret;
>
>         ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
>                                         &klp_bpf_kfunc_set);
>         ret = ret ?: register_bpf_struct_ops(&bpf_klp_bpf_cmdline_ops,
>                                              klp_bpf_cmdline_ops);
>         if (ret)
>                 return ret;
>
> --->    /*
> --->     * We would need to wait here until the BPF program gets loaded.
> --->     * for the new bpf_struct_ops in this new livepatch.
> --->     */
>         return klp_enable_patch(&patch);
> }

Yes, something in this direction is needed to make atomic replace work.
We have no plan to use this in production. I will let Yafang figure out
his plan.

> Or maybe, the bpf_struct_ops can be _allocated dynamically_ and
> the pointer might be _passed via shadow variables_.
>
> One problem is that shadow variables would add another overhead
> and need not be suitable for hot paths.
>
>
> Anyway, I think that I have similar feelings as Miroslav.
> The combination of livepatches and BPF programs increases
> the complexity for all involved parties: core kernel maintainers,
> livepatch and BPF program authors, and system maintainers.
>
> Do we really want to propagate it?
> Is there any significant advantage in combining these two, please?
> Is it significantly easier to write BPF program then a livepatch?
> Is it significantly easier to update BPF programs then livepatches?

Some combination like this will probably make sense for Yafang's use
cases. But I agree maybe we don't want this in the samples, because
it is indeed complicated and could be more dangerous.

Thanks,
Song

> IMHO, the livepatches should allow enough flexibility. And it might
> be easier to update the livepatch when needed.
>
> Or do you install more independent livepatches as well?
>
> Would the support of different replace tags help?
> They would allow to replace only livepatches with the same tag.

^ permalink raw reply

* Re: [PATCH v2 1/6] selftests: livepatch: Check for ARCH_HAS_SYSCALL_WRAPPER config
From: Joe Lawrence @ 2026-04-17 18:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Miroslav Benes, Marcos Paulo de Souza, Josh Poimboeuf,
	Jiri Kosina, Shuah Khan, live-patching, linux-kselftest,
	linux-kernel
In-Reply-To: <aeJNnFqLEdMgnHKh@pathway.suse.cz>

On Fri, Apr 17, 2026 at 05:11:24PM +0200, Petr Mladek wrote:
> On Wed 2026-04-15 11:58:50, Miroslav Benes wrote:
> > On Mon, 13 Apr 2026, Marcos Paulo de Souza wrote:
> > 
> > > Older kernels that lack CONFIG_ARCH_HAS_SYSCALL_WRAPPER config don't
> > > have any prefixes for their syscalls. The same applies to current
> > > powerpc and loongarch, covering all currently supported architectures
> > > that support livepatch.
> > > 
> > > The other supported architectures have specific prefixes, so error out
> > > when a new architecture adds livepatch support with wrappes but didn't
> > > update the test to include it.
> > > 
> > > --- a/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> > > +++ b/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> > > @@ -12,15 +12,26 @@
> > >  #include <linux/slab.h>
> > >  #include <linux/livepatch.h>
> > >  
> > > -#if defined(__x86_64__)
> > > +/*
> > > + * Before CONFIG_ARCH_HAS_SYSCALL_WRAPPER was introduced there were no
> > > + * prefixes for system calls.
> > > + * Both ppc and loongarch does not set prefixes for their system calls either.
> > > + */
> > > +#if !defined(CONFIG_ARCH_HAS_SYSCALL_WRAPPER) ||  defined(__powerpc__) || \
> > > +	defined(__loongarch__)
> > > +#define FN_PREFIX
> > > +#elif defined(__x86_64__)
> > >  #define FN_PREFIX __x64_
> > >  #elif defined(__s390x__)
> > >  #define FN_PREFIX __s390x_
> > >  #elif defined(__aarch64__)
> > >  #define FN_PREFIX __arm64_
> > > -#else
> > > -/* powerpc does not select ARCH_HAS_SYSCALL_WRAPPER */
> > > +#elif defined(__powerpc__)
> > > +#define FN_PREFIX
> > > +#elif defined(__loongarch__)
> > >  #define FN_PREFIX
> > > +#else
> > > +#error "Missing syscall wrapper for the given architecture."
> > >  #endif
> > 
> > I know that Sashiko commented on that already but even with that I wonder 
> > if it was cleaner to structure it differently...
> > 
> > #if defined(CONFIG_ARCH_HAS_SYSCALL_WRAPPER)
> >   #if define(__x86_64__)
> >   ...
> >   #elif define(__powerpc__)
> >     #define FN_PREFIX
> >   #else
> >     #error
> >   #endif
> > #elif
> >   #define FN_PREFIX
> > #endif
> 
> Yeah, this looks better.
> 

Agreed, if there is v3, this would definitely be clearer.

--
Joe


^ permalink raw reply

* Re: [PATCH v2 1/6] selftests: livepatch: Check for ARCH_HAS_SYSCALL_WRAPPER config
From: Joe Lawrence @ 2026-04-17 18:14 UTC (permalink / raw)
  To: Marcos Paulo de Souza
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <20260413-lp-tests-old-fixes-v2-1-367c7cb5006f@suse.com>

On Mon, Apr 13, 2026 at 02:26:12PM -0300, Marcos Paulo de Souza wrote:
> Older kernels that lack CONFIG_ARCH_HAS_SYSCALL_WRAPPER config don't
> have any prefixes for their syscalls. The same applies to current
> powerpc and loongarch, covering all currently supported architectures
> that support livepatch.
> 
> The other supported architectures have specific prefixes, so error out
> when a new architecture adds livepatch support with wrappes but didn't

nit: s/wrappes/wrappers

> update the test to include it.
> 
> Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
> ---
>  .../selftests/livepatch/test_modules/test_klp_syscall.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c b/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> index dd802783ea84..b5527a288a7c 100644
> --- a/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> +++ b/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> @@ -12,15 +12,26 @@
>  #include <linux/slab.h>
>  #include <linux/livepatch.h>
>  
> -#if defined(__x86_64__)
> +/*
> + * Before CONFIG_ARCH_HAS_SYSCALL_WRAPPER was introduced there were no
> + * prefixes for system calls.
> + * Both ppc and loongarch does not set prefixes for their system calls either.

nit: s/does not/do not

--
Joe


^ permalink raw reply

* Re: [PATCH v2 3/6] selftests: livepatch: Introduce does_sysfs_exists function
From: Joe Lawrence @ 2026-04-17 18:15 UTC (permalink / raw)
  To: Marcos Paulo de Souza
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <20260413-lp-tests-old-fixes-v2-3-367c7cb5006f@suse.com>

On Mon, Apr 13, 2026 at 02:26:14PM -0300, Marcos Paulo de Souza wrote:
> Return 1 if the livepatch sysfs attribute exists, and 0 otherwise. This
> new function will be used in the next patches.
> 

Not true in v2 since the return code is now `[[ -f
"$SYSFS_KLP_DIR/$mod/$attr" ]]`

--
Joe


^ permalink raw reply

* Re: [PATCH v2 3/6] selftests: livepatch: Introduce does_sysfs_exists function
From: Joe Lawrence @ 2026-04-17 18:18 UTC (permalink / raw)
  To: Marcos Paulo de Souza
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <20260413-lp-tests-old-fixes-v2-3-367c7cb5006f@suse.com>

On Mon, Apr 13, 2026 at 02:26:14PM -0300, Marcos Paulo de Souza wrote:
> Return 1 if the livepatch sysfs attribute exists, and 0 otherwise. This
> new function will be used in the next patches.
> 
> Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
> ---
>  tools/testing/selftests/livepatch/functions.sh | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/tools/testing/selftests/livepatch/functions.sh b/tools/testing/selftests/livepatch/functions.sh
> index 8ec0cb64ad94..382596eaaf01 100644
> --- a/tools/testing/selftests/livepatch/functions.sh
> +++ b/tools/testing/selftests/livepatch/functions.sh
> @@ -339,6 +339,16 @@ function check_result {
>  	fi
>  }
>  
> +# does_sysfs_exists(modname, attr) - check sysfs attribute existence
> +#	modname - livepatch module creating the sysfs interface
> +#	attr - attribute name to be checked
> +function does_sysfs_exists() {

Super small nit, but s/does_sysfs_exists/does_sysfs_exist ?

--
Joe


^ permalink raw reply

* Re: [PATCH v2 0/6] kselftests: livepatch: Adapt tests to be executed on 4.12 kernels
From: Joe Lawrence @ 2026-04-17 18:36 UTC (permalink / raw)
  To: Marcos Paulo de Souza
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <5fb3ecf5a13bdf459019f6f011f3507593498875.camel@suse.com>

On Thu, Apr 16, 2026 at 03:18:33PM -0300, Marcos Paulo de Souza wrote:
> On Thu, 2026-04-16 at 10:07 -0700, Josh Poimboeuf wrote:
> > On Mon, Apr 13, 2026 at 02:26:11PM -0300, Marcos Paulo de Souza
> > wrote:
> > > A new version of the patchset, with fewer patches now. Please take
> > > a look!
> > > 
> > > Original cover-letter:
> > > These patches don't really change how the patches are run, just
> > > skip
> > > some tests on kernels that don't support a feature (like kprobe and
> > > livepatched living together) or when a livepatch sysfs attribute is
> > > missing.
> > > 
> > > The last patch slightly adjusts check_result function to skip dmesg
> > > messages on SLE kernels when a livepatch is removed.
> > 
> > Why are we adding complexity to support Linux 4.12 in mainline? 
> > Isn't
> > that what enterprise distros are for?
> 
> These changes do not add any new complex code, just checks to enable
> the tests to run on older kernels. I believe that it would be good for
> all enterprises distros if they could run more tests in maintenance
> updates of their kernels using the upstream tests.
> 
> The changes are not really that big. Some patches were removed from v1
> because there were adding checks for out-of-tree messages (like the
> last paragraph of the v2 erroneously shows), and another one was to
> check if kprobes could live alongside livepatches, which fails for 4.12
> kernels.
> 
> The patches for this versions introduce only checks to avoid testing
> sysfs attributes for kernels that don't supports them.
> 

IMHO when the changes are reasonably small, I think we should consider
accomodating older kernels for the selftest suite.  If we reach the
point of having to introduce version #ifdef-erry, that opinion would
flip pretty quickly.  It's pretty amazing that modern tests still run on
older kernels (with this patchset) -- not an explicit kselftest goal
AFAIK, but nice to have.

If we do merge this patchset, it should update the doc
tools/testing/selftests/livepatch/README to note the oldest
expected/tested upstream kernel.  (So new selftest authors may have some
idea of what API / sysfs features to use.)  And that this compatibility
was only an incidental "feature" that came for nearly free.  It's not a
promise to never add backwards-incompatible tests in the future.

--
Joe


^ permalink raw reply

* Re: [PATCH v3 2/8] arm64, unwind: build kernel with sframe V3 info
From: Dylan Hatch @ 2026-04-18  0:20 UTC (permalink / raw)
  To: Jens Remus
  Cc: Roman Gushchin, Weinan Liu, Will Deacon, Josh Poimboeuf,
	Indu Bhagat, Peter Zijlstra, Steven Rostedt, Catalin Marinas,
	Jiri Kosina, Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan,
	Song Liu, joe.lawrence, linux-toolchains, linux-kernel,
	live-patching, linux-arm-kernel, Heiko Carstens, Vasily Gorbik,
	Ilya Leoshkevich
In-Reply-To: <cc7f741c-41a0-4620-b5d5-3428aaa7648f@linux.ibm.com>

On Tue, Apr 14, 2026 at 5:43 AM Jens Remus <jremus@linux.ibm.com> wrote:
>
>
> Good catch!  Should we rather add the following in the series you are
> basing on, as there are already arch-specific unwind_user.h and
> unwind_user_sframe.h?
>
> F:      arch/*/include/asm/unwind*.h

Thanks for the suggestion, this works for me.

>
> On the other hand I wonder whether the arch-specific headers should
> remain maintained by the respective arch maintainers?  How is that
> handled in general?

I had the same question. My scan of MAINTAINERS shows both patterns
are present, so I defer to those who know more about this kind of
maintainership configuration.

>
> > diff --git a/arch/Kconfig b/arch/Kconfig
>
> > @@ -520,6 +520,13 @@ config SFRAME_VALIDATION
> >
> >         If unsure, say N.
> >
> > +config ARCH_SUPPORTS_SFRAME_UNWINDER
> > +     bool
> > +     help
> > +       An architecture can select this if it  enables the sframe (Simple
> > +       Frame) unwinder for unwinding kernel stack traces. It uses unwind
> > +       table that is directly generatedby toolchain based on DWARF CFI information.
>
> Nit: s/sframe/SFrame/
>
> > +
> >  config HAVE_PERF_REGS
> >       bool
> >       help
>
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>
> > @@ -112,6 +112,7 @@ config ARM64
> >       select ARCH_SUPPORTS_SCHED_SMT
> >       select ARCH_SUPPORTS_SCHED_CLUSTER
> >       select ARCH_SUPPORTS_SCHED_MC
> > +     select ARCH_SUPPORTS_SFRAME_UNWINDER
> >       select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
> >       select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT
> >       select ARCH_WANT_DEFAULT_BPF_JIT
>
> > diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
>
> > @@ -20,4 +20,17 @@ config ARM64_RELOC_TEST
> >       depends on m
> >       tristate "Relocation testing module"
> >
> > +config SFRAME_UNWINDER
>
> Why do you introduce this for arm64 (and debug) only?  If s390 were to
> add support (as replacement for s390 backchain), this would have to be
> moved or duplicated.  It would not suffice to enable
> ARCH_SUPPORTS_SFRAME_UNWINDER to also enable SFRAME_UNWINDER.

Makes sense, I'll look into introducing this as arch-generic.

>
> As mentioned in my feedback on the previous patch in this series:
> Would it make sense to align the naming to the existing
> HAVE_UNWIND_USER_SFRAME, for instance:
>
>   HAVE_UNWIND_KERNEL_SFRAME
>
> > +     bool "Sframe unwinder"
> > +     depends on AS_SFRAME3
> > +     depends on 64BIT
> > +     depends on ARCH_SUPPORTS_SFRAME_UNWINDER
> > +     select SFRAME_LOOKUP
> > +     help
> > +       This option enables the sframe (Simple Frame) unwinder for unwinding
> > +       kernel stack traces. It uses unwind table that is directly generated
> > +       by toolchain based on DWARF CFI information. In, practice this can
> > +       provide more reliable stacktrace results than unwinding with frame
> > +       pointers alone.
>
> Nit: s/sframe/SFrame/
>
> > +
> >  source "drivers/hwtracing/coresight/Kconfig"
>
> You are introducing two new Kconfig options (SFRAME_UNWINDER and
> ARCH_SUPPORTS_SFRAME_UNWINDER).  I wonder whether they could somehow be
> combined into a single new option.  Although I am not sure how an option
> can be both selectable and depending at the same time, so that the ARM64
> config could select it, but it would also depend on the above.

I don't think this is recommended, since the behavior of 'select'
appears to override a 'depends' requirement.

From Documentation/kbuild/kconfig-language.rst: "select should be used
with care. select will force a symbol to a value without visiting the
dependencies. By abusing select you are able to select a symbol FOO
even if FOO depends on BAR that is not set. In general use select only
for non-visible symbols (no prompts anywhere) and for symbols with no
dependencies. That will limit the usefulness but on the other hand
avoid the illegal configurations all over."

>
> > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
>
> > @@ -491,6 +491,8 @@
> >               *(.rodata1)                                             \
> >       }                                                               \
> >                                                                       \
> > +     SFRAME                                                          \
> > +                                                                     \
> >       /* PCI quirks */                                                \
> >       .pci_fixup        : AT(ADDR(.pci_fixup) - LOAD_OFFSET) {        \
> >               BOUNDED_SECTION_PRE_LABEL(.pci_fixup_early,  _pci_fixups_early,  __start, __end) \
> > @@ -911,6 +913,19 @@
> >  #define TRACEDATA
> >  #endif
> >
> > +#ifdef CONFIG_SFRAME_UNWINDER
> > +#define SFRAME                                                       \
> > +     /* sframe */                                            \
> > +     .sframe : AT(ADDR(.sframe) - LOAD_OFFSET) {             \
> > +             __start_sframe_header = .;                      \
>
>                 __start_sframe[_section] = .;
>
> > +             KEEP(*(.sframe))                                \
> > +             KEEP(*(.init.sframe))                           \
> > +             __stop_sframe_header = .;                       \
>
>                 __stop_sframe[_section] = .;
>
> Unless I am missing something both are not the start/stop of the .sframe
> header (in the .sframe section) but the .sframe section itself (see also
> your subsequent "[PATCH v3 4/8] sframe: Provide PC lookup for vmlinux
> .sframe section." where you assign both to kernel_sfsec.sframe_start
> and kernel_sfsec.sframe_end.
>
> > +     }
> > +#else
> > +#define SFRAME
> > +#endif
> > +
> >  #ifdef CONFIG_PRINTK_INDEX
> >  #define PRINTK_INDEX                                                 \
> >       .printk_index : AT(ADDR(.printk_index) - LOAD_OFFSET) {         \
>
> Regards,
> Jens
> --
> Jens Remus
> Linux on Z Development (D3303)
> jremus@de.ibm.com / jremus@linux.ibm.com
>
> IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
> IBM Data Privacy Statement: https://www.ibm.com/privacy/
>

I'm working on a new version that I'm hoping to be able to send
sometime next week, which should address your other comments.

Thanks,
Dylan

^ permalink raw reply

* Re: [RFC PATCH 1/2] kernel/notifier: replace single-linked list with double-linked list for reverse traversal
From: Song Chen @ 2026-04-19  0:07 UTC (permalink / raw)
  To: Petr Mladek
  Cc: rafael, lenb, mturquette, sboyd, viresh.kumar, agk, snitzer,
	mpatocka, bmarzins, song, yukuai, linan122, jason.wessel, danielt,
	dianders, horms, davem, edumazet, kuba, pabeni, paulmck, frederic,
	mcgrof, petr.pavlu, da.gomez, samitolvanen, atomlin, jpoimboe,
	jikos, mbenes, joe.lawrence, rostedt, mhiramat, mark.rutland,
	mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <aeC7ByGA5MHBcGQR@pathway.suse.cz>

Hi,


On 4/16/26 18:33, Petr Mladek wrote:
> On Wed 2026-04-15 15:01:37, chensong_2000@189.cn wrote:
>> From: Song Chen <chensong_2000@189.cn>
>>
>> The current notifier chain implementation uses a single-linked list
>> (struct notifier_block *next), which only supports forward traversal
>> in priority order. This makes it difficult to handle cleanup/teardown
>> scenarios that require notifiers to be called in reverse priority order.
>>
>> A concrete example is the ordering dependency between ftrace and
>> livepatch during module load/unload. see the detail here [1].
>>
>> This patch replaces the single-linked list in struct notifier_block
>> with a struct list_head, converting the notifier chain into a
>> doubly-linked list sorted in descending priority order. Based on
>> this, a new function notifier_call_chain_reverse() is introduced,
>> which traverses the chain in reverse (ascending priority order).
>> The corresponding blocking_notifier_call_chain_reverse() is also
>> added as the locking wrapper for blocking notifier chains.
>>
>> The internal notifier_call_chain_robust() is updated to use
>> notifier_call_chain_reverse() for rollback: on error, it records
>> the failing notifier (last_nb) and the count of successfully called
>> notifiers (nr), then rolls back exactly those nr-1 notifiers in
>> reverse order starting from last_nb's predecessor, without needing
>> to know the total length of the chain.
>>
>> With this change, subsystems with symmetric setup/teardown ordering
>> requirements can register a single notifier_block with one priority
>> value, and rely on blocking_notifier_call_chain() for forward
>> traversal and blocking_notifier_call_chain_reverse() for reverse
>> traversal, without needing hard-coded call sequences or separate
>> notifier registrations for each direction.
>>
>> [1]:https://lore.kernel.org/all
>> 	/alpine.LNX.2.00.1602172216491.22700@cbobk.fhfr.pm/
>>
>> --- a/include/linux/notifier.h
>> +++ b/include/linux/notifier.h
>> @@ -53,41 +53,41 @@ typedef	int (*notifier_fn_t)(struct notifier_block *nb,
> [...]
>>   struct notifier_block {
>>   	notifier_fn_t notifier_call;
>> -	struct notifier_block __rcu *next;
>> +	struct list_head __rcu entry;
>>   	int priority;
>>   };
> [...]
>>   #define ATOMIC_INIT_NOTIFIER_HEAD(name) do {	\
>>   		spin_lock_init(&(name)->lock);	\
>> -		(name)->head = NULL;		\
>> +		INIT_LIST_HEAD(&(name)->head);		\
> 
> I would expect the RCU variant here, aka INIT_LIST_HEAD_RCU().

I'm not familiar with list rcu, but i will look into it and give it a try.
> 
>> --- a/kernel/notifier.c
>> +++ b/kernel/notifier.c
>> @@ -14,39 +14,47 @@
>>    *	are layered on top of these, with appropriate locking added.
>>    */
>>   
>> -static int notifier_chain_register(struct notifier_block **nl,
>> +static int notifier_chain_register(struct list_head *nl,
>>   				   struct notifier_block *n,
>>   				   bool unique_priority)
>>   {
>> -	while ((*nl) != NULL) {
>> -		if (unlikely((*nl) == n)) {
>> +	struct notifier_block *cur;
>> +
>> +	list_for_each_entry(cur, nl, entry) {
>> +		if (unlikely(cur == n)) {
>>   			WARN(1, "notifier callback %ps already registered",
>>   			     n->notifier_call);
>>   			return -EEXIST;
>>   		}
>> -		if (n->priority > (*nl)->priority)
>> -			break;
>> -		if (n->priority == (*nl)->priority && unique_priority)
>> +
>> +		if (n->priority == cur->priority && unique_priority)
>>   			return -EBUSY;
>> -		nl = &((*nl)->next);
>> +
>> +		if (n->priority > cur->priority) {
>> +			list_add_tail(&n->entry, &cur->entry);
>> +			goto out;
>> +		}
>>   	}
>> -	n->next = *nl;
>> -	rcu_assign_pointer(*nl, n);
>> +
>> +	list_add_tail(&n->entry, nl);
> 
> I would expect list_add_tail_rcu() here.
> 
>> @@ -59,25 +67,25 @@ static int notifier_chain_unregister(struct notifier_block **nl,
>>    *			value of this parameter is -1.
>>    *	@nr_calls:	Records the number of notifications sent. Don't care
>>    *			value of this field is NULL.
>> + *	@last_nb:  Records the last called notifier block for rolling back
>>    *	Return:		notifier_call_chain returns the value returned by the
>>    *			last notifier function called.
>>    */
>> -static int notifier_call_chain(struct notifier_block **nl,
>> +static int notifier_call_chain(struct list_head *nl,
>>   			       unsigned long val, void *v,
>> -			       int nr_to_call, int *nr_calls)
>> +			       int nr_to_call, int *nr_calls,
>> +				   struct notifier_block **last_nb)
>>   {
>>   	int ret = NOTIFY_DONE;
>> -	struct notifier_block *nb, *next_nb;
>> -
>> -	nb = rcu_dereference_raw(*nl);
>> +	struct notifier_block *nb;
>>   
>> -	while (nb && nr_to_call) {
>> -		next_nb = rcu_dereference_raw(nb->next);
>> +	if (!nr_to_call)
>> +		return ret;
>>   
>> +	list_for_each_entry(nb, nl, entry) {
> 
> I would expect the RCU variant here, aka list_for_each_rcu()
> 
> These are just two random examples which I found by a quick look.
> 
> I guess that the notifier API is very old and it does not use all
> the RCU API features which allow to track safety when
> CONFIG_PROVE_RCU and CONFIG_PROVE_RCU_LIST are enabled.
> 
> It actually might be worth to audit the code and make it right.
> 
>>   #ifdef CONFIG_DEBUG_NOTIFIERS
>>   		if (unlikely(!func_ptr_is_kernel_text(nb->notifier_call))) {
>>   			WARN(1, "Invalid notifier called!");
>> -			nb = next_nb;
>>   			continue;
>>   		}
>>   #endif
> 
> That said, I am not sure if the ftrace/livepatching handlers are
> the right motivation for this. Especially when I see the
> complexity of the 2nd patch [*]
> 
> After thinking more about it. I am not even sure that the ftrace and
> livepatching callbacks are good candidates for generic notifiers.
> They are too special. It is not only about ordering them against
> each other. But it is also about ordering them against other
> notifiers. The ftrace/livepatching callbacks must be the first/last
> during module load/release.
> 
> [*] The 2nd patch is not archived by lore for some reason.
>      I have found only a review but it gives a good picture, see
>      https://lore.kernel.org/all/1191caf5-6a61-4622-a15e-854d3701f4fc@suse.com/
> 
> Best Regards,
> Petr
> 

Thanks.

BR

Song

^ permalink raw reply

* Re: [RFC PATCH 1/2] kernel/notifier: replace single-linked list with double-linked list for reverse traversal
From: Song Chen @ 2026-04-19  0:21 UTC (permalink / raw)
  To: David Laight
  Cc: rafael, lenb, mturquette, sboyd, viresh.kumar, agk, snitzer,
	mpatocka, bmarzins, song, yukuai, linan122, jason.wessel, danielt,
	dianders, horms, davem, edumazet, kuba, pabeni, paulmck, frederic,
	mcgrof, petr.pavlu, da.gomez, samitolvanen, atomlin, jpoimboe,
	jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
	mark.rutland, mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <20260416133004.07bd2886@pumpkin>

Hi,

On 4/16/26 20:30, David Laight wrote:
> On Wed, 15 Apr 2026 15:01:37 +0800
> chensong_2000@189.cn wrote:
> 
>> From: Song Chen <chensong_2000@189.cn>
>>
>> The current notifier chain implementation uses a single-linked list
>> (struct notifier_block *next), which only supports forward traversal
>> in priority order. This makes it difficult to handle cleanup/teardown
>> scenarios that require notifiers to be called in reverse priority order.
> 
> If it is only cleanup/teardown then the list can be order-reversed
> as part of that process at the same time as the list is deleted.
> 
> 	David
> 
> 
> 

Sorry, i don't follow, the notifiers in the list are deleted when 
calling notifier_chain_unregister, other than that, they are traversed 
forward and backward.

Song


^ permalink raw reply

* Re: [PATCH] samples/livepatch: Add BPF struct_ops integration sample
From: Yafang Shao @ 2026-04-19  3:19 UTC (permalink / raw)
  To: Song Liu, Petr Mladek
  Cc: live-patching, linux-kernel, jpoimboe, jikos, mbenes,
	joe.lawrence, kernel-team
In-Reply-To: <CAPhsuW6Z91qAM7G=iA_APX4Jfa8a0pshnGSTYn0+JXKsUfEVqQ@mail.gmail.com>

On Fri, Apr 17, 2026 at 11:52 PM Song Liu <song@kernel.org> wrote:
>
> On Fri, Apr 17, 2026 at 6:20 AM Petr Mladek <pmladek@suse.com> wrote:
> >
> > On Thu 2026-04-16 09:32:46, Song Liu wrote:
> [...]
> > Let' use the code from this patch:
> >
> > static int __init livepatch_bpf_init(void)
> > {
> >         int ret;
> >
> >         ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
> >                                         &klp_bpf_kfunc_set);
> >         ret = ret ?: register_bpf_struct_ops(&bpf_klp_bpf_cmdline_ops,
> >                                              klp_bpf_cmdline_ops);
> >         if (ret)
> >                 return ret;
> >
> > --->    /*
> > --->     * We would need to wait here until the BPF program gets loaded.
> > --->     * for the new bpf_struct_ops in this new livepatch.
> > --->     */

No waiting is necessary. If the BPF program is not attached, the
default logic can be executed instead. Consider Song's test case: we
can handle it as follows.

static int livepatch_cmdline_proc_show(struct seq_file *m, void *v)
{
    struct klp_bpf_cmdline_ops *ops = READ_ONCE(active_ops);

    if (ops && ops->set_cmdline)
        return ops->set_cmdline(m);

    // If no BPF program is attached, the default kernel function runs.
    return cmdline_proc_show(m, v);
}

However, as Song explained below, if we want atomic replace to work,
we may need to wait for the new BPF program here. But that would make
the combination of livepatch and BPF more complex.

Currently, on our production servers, we handle this through a user
script, such as:

  stop_traffic_relying_on_livepatch_bpf
  kpatch load new-livepatch-bpf-module.ko
  reattach_the_bpf_program
  start_the_traffic_again

Although this approach requires restarting the affected traffic, other
services running on the same server remain unaffected.

> >         return klp_enable_patch(&patch);
> > }
>
> Yes, something in this direction is needed to make atomic replace work.
> We have no plan to use this in production. I will let Yafang figure out
> his plan.
>
> > Or maybe, the bpf_struct_ops can be _allocated dynamically_ and
> > the pointer might be _passed via shadow variables_.
> >
> > One problem is that shadow variables would add another overhead
> > and need not be suitable for hot paths.
> >
> >
> > Anyway, I think that I have similar feelings as Miroslav.
> > The combination of livepatches and BPF programs increases
> > the complexity for all involved parties: core kernel maintainers,
> > livepatch and BPF program authors, and system maintainers.
> >
> > Do we really want to propagate it?
> > Is there any significant advantage in combining these two, please?
> > Is it significantly easier to write BPF program then a livepatch?
> > Is it significantly easier to update BPF programs then livepatches?

This is an important feature for avoiding server restarts,
particularly in a VM host environment. Since only one VM on the host
may be affected by this feature, we can deploy it rapidly without
impacting other VMs on the same host.

>
> Some combination like this will probably make sense for Yafang's use
> cases. But I agree maybe we don't want this in the samples, because
> it is indeed complicated and could be more dangerous.
>
> Thanks,
> Song
>
> > IMHO, the livepatches should allow enough flexibility. And it might
> > be easier to update the livepatch when needed.
> >
> > Or do you install more independent livepatches as well?

If we want certain livepatches to be persistent on the server, we will
need to support independent livepatches — and we do have such use
cases.

> >
> > Would the support of different replace tags help?
> > They would allow to replace only livepatches with the same tag.

Right, it will help.

-- 
Regards
Yafang

^ permalink raw reply

* Re: [RFC PATCH 2/2] kernel/module: Decouple klp and ftrace from load_module
From: Masami Hiramatsu @ 2026-04-20  2:27 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Petr Pavlu, Song Chen, rafael, lenb, mturquette, sboyd,
	viresh.kumar, agk, snitzer, mpatocka, bmarzins, song, yukuai,
	linan122, jason.wessel, danielt, dianders, horms, davem, edumazet,
	kuba, pabeni, paulmck, frederic, mcgrof, da.gomez, samitolvanen,
	atomlin, jpoimboe, jikos, mbenes, joe.lawrence, rostedt, mhiramat,
	mark.rutland, mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <aeD2_FrFL6E3dbAC@pathway.suse.cz>

On Thu, 16 Apr 2026 16:49:32 +0200
Petr Mladek <pmladek@suse.com> wrote:

> On Thu 2026-04-16 13:18:30, Petr Pavlu wrote:
> > On 4/15/26 8:43 AM, Song Chen wrote:
> > > On 4/14/26 22:33, Petr Pavlu wrote:
> > >> On 4/13/26 10:07 AM, chensong_2000@189.cn wrote:
> > >>> diff --git a/include/linux/module.h b/include/linux/module.h
> > >>> index 14f391b186c6..0bdd56f9defd 100644
> > >>> --- a/include/linux/module.h
> > >>> +++ b/include/linux/module.h
> > >>> @@ -308,6 +308,14 @@ enum module_state {
> > >>>       MODULE_STATE_COMING,    /* Full formed, running module_init. */
> > >>>       MODULE_STATE_GOING,    /* Going away. */
> > >>>       MODULE_STATE_UNFORMED,    /* Still setting it up. */
> > >>> +    MODULE_STATE_FORMED,
> > >>
> > >> I don't see a reason to add a new module state. Why is it necessary and
> > >> how does it fit with the existing states?
> > >>
> > > because once notifier fails in state MODULE_STATE_UNFORMED (now only ftrace has someting to do in this state), notifier chain will roll back by calling blocking_notifier_call_chain_robust, i'm afraid MODULE_STATE_GOING is going to jeopardise the notifers which don't handle it appropriately, like:
> > > 
> > > case MODULE_STATE_COMING:
> > >      kmalloc();
> > > case MODULE_STATE_GOING:
> > >      kfree();
> > 
> > My understanding is that the current module "state machine" operates as
> > follows. Transitions marked with an asterisk (*) are announced via the
> > module notifier.
> > 
> > ---> UNFORMED --*> COMING --*> LIVE --*> GOING -.
> >         ^            |                     ^    |
> >         |            '---------------------*    |
> >         '---------------------------------------'
> > 
> > The new code aims to replace the current ftrace_module_init() call in
> > load_module(). To achieve this, it adds a notification for the UNFORMED
> > state (only when loading a module) and introduces a new FORMED state for
> > rollback. FORMED is purely a fake state because it never appears in
> > module::state. The new structure is as follows:
> > 
> >         ,--*> (FORMED)
> >         |
> > --*> UNFORMED --*> COMING --*> LIVE --*> GOING -.
> >         ^            |                     ^    |
> >         |            '---------------------*    |
> >         '---------------------------------------'
> > 
> > I'm afraid this is quite complex and inconsistent. Unless it can be kept
> > simple, we would be just replacing one special handling with a different
> > complexity, which is not worth it.
> 
> > >>
> > >>> +    if (err)
> > >>> +        goto ddebug_cleanup;
> > >>>         /* Finally it's fully formed, ready to start executing. */
> > >>>       err = complete_formation(mod, info);
> > >>> -    if (err)
> > >>> +    if (err) {
> > >>> +        blocking_notifier_call_chain_reverse(&module_notify_list,
> > >>> +                MODULE_STATE_FORMED, mod);
> > >>>           goto ddebug_cleanup;
> > >>> +    }
> > >>>   -    err = prepare_coming_module(mod);
> > >>> +    err = prepare_module_state_transaction(mod,
> > >>> +                MODULE_STATE_COMING, MODULE_STATE_GOING);
> > >>>       if (err)
> > >>>           goto bug_cleanup;
> > >>>   @@ -3522,7 +3519,6 @@ static int load_module(struct load_info *info, const char __user *uargs,
> > >>>       destroy_params(mod->kp, mod->num_kp);
> > >>>       blocking_notifier_call_chain(&module_notify_list,
> > >>>                        MODULE_STATE_GOING, mod);
> > >>
> > >> My understanding is that all notifier chains for MODULE_STATE_GOING
> > >> should be reversed.
> > > yes, all, from lowest priority notifier to highest.
> > > I will resend patch 1 which was failed due to my proxy setting.
> > 
> > What I meant here is that the call:
> > 
> > blocking_notifier_call_chain(&module_notify_list, MODULE_STATE_GOING, mod);
> > 
> > should be replaced with:
> > 
> > blocking_notifier_call_chain_reverse(&module_notify_list, MODULE_STATE_GOING, mod);
> > 
> > > 
> > >>
> > >>> -    klp_module_going(mod);
> > >>>    bug_cleanup:
> > >>>       mod->state = MODULE_STATE_GOING;
> > >>>       /* module_bug_cleanup needs module_mutex protection */
> > >>
> > >> The patch removes the klp_module_going() cleanup call in load_module().
> > >> Similarly, the ftrace_release_mod() call under the ddebug_cleanup label
> > >> should be removed and appropriately replaced with a cleanup via
> > >> a notifier.
> > >>
> > >     err = prepare_module_state_transaction(mod,
> > >                 MODULE_STATE_UNFORMED, MODULE_STATE_FORMED);
> > >     if (err)
> > >         goto ddebug_cleanup;
> > > 
> > > ftrace will be cleanup in blocking_notifier_call_chain_robust rolling back.
> > > 
> > >     err = prepare_module_state_transaction(mod,
> > >                 MODULE_STATE_COMING, MODULE_STATE_GOING);
> > > 
> > > each notifier including ftrace and klp will be cleanup in blocking_notifier_call_chain_robust rolling back.
> > > 
> > > if all notifiers are successful in MODULE_STATE_COMING, they all will be clean up in
> > >  coming_cleanup:
> > >     mod->state = MODULE_STATE_GOING;
> > >     destroy_params(mod->kp, mod->num_kp);
> > >     blocking_notifier_call_chain(&module_notify_list,
> > >                      MODULE_STATE_GOING, mod);
> > > 
> > > if  something wrong underneath.
> > 
> > My point is that the patch leaves a call to ftrace_release_mod() in
> > load_module(), which I expected to be handled via a notifier.
> 
> I think that I have got it. The ftrace code needs two notifiers when
> the module is being loaded and two when it is going.
> 
> This is why Sond added the new state. But I think that we would
> need two new states to call:
> 
>     + ftrace_module_init() in MODULE_STATE_UNFORMED
>     + ftrace_module_enable() in MODULE_STATE_FORMED
> 
> and
> 
>     + ftrace_free_mem() in MODULE_STATE_PRE_GOING
>     + ftrace_free_mem() in MODULE_STATE_GOING
> 
> 
> By using the ascii art:
> 
>  -*> UNFORMED -*> FORMED -> COMING -*> LIVE -*> PRE_GOING -*> GOING -.
>               |          |         |                ^           ^    ^
>               |          |         '----------------'           |    |
>               |          '--------------------------------------'    |
>               '------------------------------------------------------'
> 
> 
> But I think that this is not worth it.

Agree.

If this needs to be ordered so strictly, why we will use a "single"
module notifier chain for this complex situation?

I think the notifier call chain is just for notice a single signal,
instead of sending several different signals, especially if there is
any dependency among the callbacks.

If notification callbacks need to be ordered, they are currently
sorted by representing priority numerically, but this is quite
fragile for updating. It has to look up other registered priorities
and adjust the order among dependencies each time. For this reason,
this mechanism is not suitable for global ordering. (It's like line
numbers in BASIC.)
It is probably only useful for representing dependencies between
two components maintained by the same maintainer.

I'm against a general-purpose system that makes everything modular.
It unnecessarily complicates things. If there are processes that
require strict ordering, especially processes that must be performed
before each stage as part of the framework, they should be called
directly from the framework, not via notification callbacks.

This makes it simpler and more robust to maintain.

Only the framework's end users should utilize notification callbacks.

Thank you,


> 
> Best Regards,
> Petr
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

* Re: [PATCH v3 7/8] sframe: Introduce in-kernel SFRAME_VALIDATION.
From: Dylan Hatch @ 2026-04-20  5:02 UTC (permalink / raw)
  To: Jens Remus
  Cc: Roman Gushchin, Weinan Liu, Will Deacon, Josh Poimboeuf,
	Indu Bhagat, Peter Zijlstra, Steven Rostedt, Catalin Marinas,
	Jiri Kosina, Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan,
	Song Liu, joe.lawrence, linux-toolchains, linux-kernel,
	live-patching, linux-arm-kernel, Heiko Carstens
In-Reply-To: <de7bd273-3650-4378-8fd8-a51217e7284b@linux.ibm.com>

On Thu, Apr 16, 2026 at 8:04 AM Jens Remus <jremus@linux.ibm.com> wrote:
>
> Hello Dylan!
>
> On 4/6/2026 8:49 PM, Dylan Hatch wrote:
> > Generalize the __safe* helpers to support a non-user-access code path.
> > Allow for kernel FDE read failures due to the presence of .rodata.text.
> > This section contains code that can't be executed by the kernel
> > direclty, and thus lies ouside the normal kernel-text bounds.
>
> Nits: s/direclty/directly/ s/ouside/outside/
>
> Could you please explain the issue?  How/why does .sframe for
> .rodata.text pose an issue for .sframe verification?

__read_fde checks that the fde_addr it extracts is within the bounds
of sec->text_start and sec->text_end. In the case of the vmlinux
.sframe section, this is _stext and _etext. However on arm64, there is
an .rodata.text section that lies outside this range. From
arch/arm64/kernel/vmlinux.lds.S:

        /* code sections that are never executed via the kernel mapping */
        .rodata.text : {
                TRAMP_TEXT
                HIBERNATE_TEXT
                KEXEC_TEXT
                IDMAP_TEXT
                . = ALIGN(PAGE_SIZE);
        }

So __read_fde fails for functions in this section. Under normal SFrame
usage for unwinding, we should never need to look up a PC value in
these functions because they will never be executed by the kernel.
However, we still hit this error when validating all FDEs.

I think ideally we might prevent sframe data from being generated for
this code (maybe from the linker script somehow?), but I don't know of
a simple way to do this. Alternatively, we can check for FDEs located
in .rodata.text during validation, but this seems to only be present
in arm64, so maybe we would need an arch-specific hook to do this? I'm
open to suggestions.

>
> > Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
>
> > diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c
>
> > @@ -690,6 +699,13 @@ static int sframe_validate_section(struct sframe_section *sec)
> >               int ret;
> >
> >               ret = safe_read_fde(sec, i, &fde);
> > +             /*
> > +              * Code in .rodata.text is not considered part of normal kernel
> > +              * text, but there is no easy way to prevent sframe data from
> > +              * being generated for it.
> > +              */
> > +             if (ret && sec->sec_type == SFRAME_KERNEL)
> > +                     continue;
> >               if (ret)
> >                       return ret;
> >
> Thanks and regards,
> Jens
> --
> Jens Remus
> Linux on Z Development (D3303)
> jremus@de.ibm.com / jremus@linux.ibm.com
>
> IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
> IBM Data Privacy Statement: https://www.ibm.com/privacy/
>

Thanks,
Dylan

^ permalink raw reply

* Re: [RFC PATCH 1/2] kernel/notifier: replace single-linked list with double-linked list for reverse traversal
From: Masami Hiramatsu @ 2026-04-20  5:44 UTC (permalink / raw)
  To: chensong_2000
  Cc: rafael, lenb, mturquette, sboyd, viresh.kumar, agk, snitzer,
	mpatocka, bmarzins, song, yukuai, linan122, jason.wessel, danielt,
	dianders, horms, davem, edumazet, kuba, pabeni, paulmck, frederic,
	mcgrof, petr.pavlu, da.gomez, samitolvanen, atomlin, jpoimboe,
	jikos, mbenes, pmladek, joe.lawrence, rostedt, mark.rutland,
	mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <20260415070137.17860-1-chensong_2000@189.cn>

Hi Song,

On Wed, 15 Apr 2026 15:01:37 +0800
chensong_2000@189.cn wrote:

> From: Song Chen <chensong_2000@189.cn>
> 
> The current notifier chain implementation uses a single-linked list
> (struct notifier_block *next), which only supports forward traversal
> in priority order. This makes it difficult to handle cleanup/teardown
> scenarios that require notifiers to be called in reverse priority order.

What about introducing a new notification callback API that allows you
to describe dependencies between callback functions?

For example, when registering a callback, you could register a string
as an ID and specify whether to call it before or after that ID,
or you could register a comparison function that is called when adding
to a list. (I prefer @name and @depends fields so that it can be easily
maintained.)

This would allow for better dependency building when adding to the list.

> 
> A concrete example is the ordering dependency between ftrace and
> livepatch during module load/unload. see the detail here [1].

If this only concerns notification callback issues with the ftrace
and livepatch modules, it's far more robust to simply call the
necessary processing directly when the modules load and unload,
rather than registering notification callbacks externally.

There are fprobe, kprobe and its trace-events, all of them are using
ftrace as its fundation layer. In this case, I always needs to
consider callback order when a module is unloaded.

If ftrace is working as a part of module callbacks, it will conflict
with fprobe/kprobe module callback. Of course we can reorder it with
modifying its priority. But this is ugly, because when we introduce
a new other feature which depends on another layer, we need to
reorder the callback's priority number on the list.

Based on the above, I don't think this can be resolved simply by
changing the list of notification callbacks to a bidirectional list.

Thank you,

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox