Live Patching
 help / color / mirror / Atom feed
* Re: [PATCH v2 0/6] kselftests: livepatch: Adapt tests to be executed on 4.12 kernels
From: Josh Poimboeuf @ 2026-04-16 17:07 UTC (permalink / raw)
  To: Marcos Paulo de Souza
  Cc: Jiri Kosina, Miroslav Benes, Petr Mladek, Joe Lawrence,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <20260413-lp-tests-old-fixes-v2-0-367c7cb5006f@suse.com>

On Mon, Apr 13, 2026 at 02:26:11PM -0300, Marcos Paulo de Souza wrote:
> A new version of the patchset, with fewer patches now. Please take a look!
> 
> Original cover-letter:
> These patches don't really change how the patches are run, just skip
> some tests on kernels that don't support a feature (like kprobe and
> livepatched living together) or when a livepatch sysfs attribute is
> missing.
> 
> The last patch slightly adjusts check_result function to skip dmesg
> messages on SLE kernels when a livepatch is removed.

Why are we adding complexity to support Linux 4.12 in mainline?  Isn't
that what enterprise distros are for?

-- 
Josh

^ permalink raw reply

* Re: [PATCH v3 3/8] arm64: entry: add unwind info for various kernel entries
From: Jens Remus @ 2026-04-16 16:49 UTC (permalink / raw)
  To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
	Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
	Catalin Marinas, Jiri Kosina
  Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
	joe.lawrence, linux-toolchains, linux-kernel, live-patching,
	linux-arm-kernel, Heiko Carstens
In-Reply-To: <20260406185000.1378082-4-dylanbhatch@google.com>

On 4/6/2026 8:49 PM, Dylan Hatch wrote:
> From: Weinan Liu <wnliu@google.com>
> 
> DWARF CFI (Call Frame Information) specifies how to recover the return
> address and callee-saved registers at each PC in a given function.
> Compilers are able to generate the CFI annotations when they compile
> the code to assembly language. For handcrafted assembly, we need to
> annotate them by hand.
> 
> Annotate CFI unwind info for assembly for interrupt and exception
> handlers.
> 
> Signed-off-by: Weinan Liu <wnliu@google.com>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> ---
>  arch/arm64/kernel/entry.S | 10 ++++++++++
>  1 file changed, 10 insertions(+)

The added CFI directives somehow cause .eh_frame (instead of
.debug_frame) to be generated in addition to .sframe.  This causes the
following warning when linking vmlinux (.tmp_vmlinux1, .tmp_vmlinux2,
and vmlinux.unstripped):

  LD      vmlinux.unstripped
aarch64-linux-gnu-ld: warning: orphan section `.eh_frame' from `arch/arm64/kernel/entry.o' being placed in section `.eh_frame'

I don't think this can be controlled using compiler options
-fno-asynchronous-unwind-tables -fno-unwind-tables, as entry.S is
only preprocessed and then fed into the assembler.

The following at the top of entry.S would resolve the issue:

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
@@ -30,6 +30,12 @@
 #include <asm/asm-uaccess.h>
 #include <asm/unistd.h>

+/*
+ * Do not generate .eh_frame.  Only generate .debug_frame and optionally
+ * .sframe (via assembler option --gsframe[-N]).
+ */
+	.cfi_sections .debug_frame
+
	.macro	clear_gp_regs
	.irp	n,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
	mov	x\n, xzr

Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply

* Re: [PATCH] samples/livepatch: Add BPF struct_ops integration sample
From: Song Liu @ 2026-04-16 16:32 UTC (permalink / raw)
  To: Yafang Shao
  Cc: live-patching, linux-kernel, jpoimboe, jikos, mbenes, pmladek,
	joe.lawrence, kernel-team
In-Reply-To: <CALOAHbDSpofLCQ-LCU2uVtkc9w+zib0PPgBr+6sEv5FD5+Hd=g@mail.gmail.com>

On Thu, Apr 16, 2026 at 12:46 AM Yafang Shao <laoar.shao@gmail.com> wrote:
[...]
> > +
> > +static struct klp_patch patch = {
> > +       .mod = THIS_MODULE,
> > +       .objs = objs,
>
>   Nit: I suggest enabling the replace flag for this patch to align
> with the recommended implementation.
>
>     .replace = true,

This is an interesting topic. To fully take advantage of the replace
feature, we need more work on the BPF side.

For this sample, I guess we are OK either way.

>
> Other than that, it looks good to me:
>
>   Tested-and-acked-by: Yafang Shao <laoar.shao@gmail.com>

Thanks,
Song

^ permalink raw reply

* Re: [GIT PULL] livepatching for 7.1
From: pr-tracker-bot @ 2026-04-16 15:19 UTC (permalink / raw)
  To: Petr Mladek; +Cc: Linus Torvalds, linux-kernel, live-patching
In-Reply-To: <aeCoZ4GkkSzOLA68@pathway.suse.cz>

The pull request you sent on Thu, 16 Apr 2026 11:14:15 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching.git tags/livepatching-for-7.1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/d3d9443f8bac799340bb04db51ef4ababc4f7267

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply

* Re: [PATCH v3 4/8] sframe: Provide PC lookup for vmlinux .sframe section.
From: Jens Remus @ 2026-04-16 15:10 UTC (permalink / raw)
  To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
	Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
	Catalin Marinas, Jiri Kosina
  Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
	joe.lawrence, linux-toolchains, linux-kernel, live-patching,
	linux-arm-kernel, Heiko Carstens
In-Reply-To: <20260406185000.1378082-5-dylanbhatch@google.com>

On 4/6/2026 8:49 PM, Dylan Hatch wrote:
> With SFRAME_UNWINDER, read in the .sframe section at boot. This provides
> unwind data as an alternative/supplement to frame pointer-based
> unwinding.
> 
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>

Reviewed-by: Jens Remus <jremus@linux.ibm.com>

> ---
>  arch/arm64/kernel/setup.c |  2 ++
>  include/linux/sframe.h    | 14 ++++++++++++++
>  kernel/unwind/sframe.c    | 39 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 55 insertions(+)
Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply

* Re: [PATCH v3 7/8] sframe: Introduce in-kernel SFRAME_VALIDATION.
From: Jens Remus @ 2026-04-16 15:04 UTC (permalink / raw)
  To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
	Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
	Catalin Marinas, Jiri Kosina
  Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
	joe.lawrence, linux-toolchains, linux-kernel, live-patching,
	linux-arm-kernel, Heiko Carstens
In-Reply-To: <20260406185000.1378082-8-dylanbhatch@google.com>

Hello Dylan!

On 4/6/2026 8:49 PM, Dylan Hatch wrote:
> Generalize the __safe* helpers to support a non-user-access code path.
> Allow for kernel FDE read failures due to the presence of .rodata.text.
> This section contains code that can't be executed by the kernel
> direclty, and thus lies ouside the normal kernel-text bounds.

Nits: s/direclty/directly/ s/ouside/outside/

Could you please explain the issue?  How/why does .sframe for
.rodata.text pose an issue for .sframe verification?

> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>

> diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c

> @@ -690,6 +699,13 @@ static int sframe_validate_section(struct sframe_section *sec)
>  		int ret;
>  
>  		ret = safe_read_fde(sec, i, &fde);
> +		/*
> +		 * Code in .rodata.text is not considered part of normal kernel
> +		 * text, but there is no easy way to prevent sframe data from
> +		 * being generated for it.
> +		 */
> +		if (ret && sec->sec_type == SFRAME_KERNEL)
> +			continue;
>  		if (ret)
>  			return ret;
>  
Thanks and regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply

* Re: [PATCH v3 5/8] sframe: Allow unsorted FDEs.
From: Jens Remus @ 2026-04-16 14:57 UTC (permalink / raw)
  To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
	Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
	Catalin Marinas, Jiri Kosina
  Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
	joe.lawrence, linux-toolchains, linux-kernel, live-patching,
	linux-arm-kernel, Heiko Carstens
In-Reply-To: <20260406185000.1378082-6-dylanbhatch@google.com>

On 4/6/2026 8:49 PM, Dylan Hatch wrote:
> The .sframe in kernel modules is built without SFRAME_F_FDE_SORTED set.
> In order to allow sframe PC lookup in modules, add a code path to handle
> unsorted FDE tables by doing a simple linear search.
> 
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>

With my below two minor comments considered:

Reviewed-by: Jens Remus <jremus@linux.ibm.com>

> diff --git a/include/linux/sframe.h b/include/linux/sframe.h

> @@ -64,6 +64,7 @@ struct sframe_section {
>  	unsigned long		text_start;
>  	unsigned long		text_end;
>  
> +	bool			fdes_sorted;
>  	unsigned long		fdes_start;
>  	unsigned long		fres_start;
>  	unsigned long		fres_end;

The struct would be smaller if the bool fdes_sorted flag would be
inserted after the unsigned int num_fdes field:

$ pahole -C sframe_section kernel/unwind/sframe.o

Yours:
size: 96

With bool fdes_sorted moved after unsigned int num_fdes:
size: 88

> diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c

> @@ -179,9 +179,34 @@ static __always_inline int __read_fde(struct sframe_section *sec,
>  	return -EFAULT;
>  }
>  
> -static __always_inline int __find_fde(struct sframe_section *sec,
> -				      unsigned long ip,
> -				      struct sframe_fde_internal *fde)
> +static __always_inline int __find_fde_unsorted(struct sframe_section *sec,
> +					       unsigned long ip,
> +					       struct sframe_fde_internal *fde)
> +{
> +	struct sframe_fde_v3 *cur, *start, *end;
> +
> +	start = (struct sframe_fde_v3 *)sec->fdes_start;
> +	end = start + sec->num_fdes;
> +
> +	for (cur = start; cur < end; cur++) {
> +		s64 func_off;
> +		u32 func_size;
> +		unsigned long func_addr;
> +
> +		DATA_GET(sec, func_off, &cur->func_start_off, s64, Efault);
> +		DATA_GET(sec, func_size, &cur->func_size, u32, Efault);
> +		func_addr = (unsigned long)cur + func_off;
> +
> +		if (ip >= func_addr && ip < func_addr + func_size)
> +			return __read_fde(sec, cur - start, fde);
> +	}

__find_fde() (now __find_fde_sorted()) returns -EINVAL, if no FDE is found:

	return -EINVAL;

> +Efault:
> +	return -EFAULT;
> +}
Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply

* Re: [RFC PATCH 1/2] kernel/notifier: replace single-linked list with double-linked list for reverse traversal
From: Petr Mladek @ 2026-04-16 14:54 UTC (permalink / raw)
  To: David Laight
  Cc: chensong_2000, rafael, lenb, mturquette, sboyd, viresh.kumar, agk,
	snitzer, mpatocka, bmarzins, song, yukuai, linan122, jason.wessel,
	danielt, dianders, horms, davem, edumazet, kuba, pabeni, paulmck,
	frederic, mcgrof, petr.pavlu, da.gomez, samitolvanen, atomlin,
	jpoimboe, jikos, mbenes, joe.lawrence, rostedt, mhiramat,
	mark.rutland, mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <20260416133004.07bd2886@pumpkin>

On Thu 2026-04-16 13:30:04, David Laight wrote:
> On Wed, 15 Apr 2026 15:01:37 +0800
> chensong_2000@189.cn wrote:
> 
> > From: Song Chen <chensong_2000@189.cn>
> > 
> > The current notifier chain implementation uses a single-linked list
> > (struct notifier_block *next), which only supports forward traversal
> > in priority order. This makes it difficult to handle cleanup/teardown
> > scenarios that require notifiers to be called in reverse priority order.
> 
> If it is only cleanup/teardown then the list can be order-reversed
> as part of that process at the same time as the list is deleted.

Interesting idea. But it won't work in all situations.

Note that the motivation for this update are the module loader
notifiers which are called several times for each loaded/removed module.

Best Regards,
Petr

^ permalink raw reply

* Re: [RFC PATCH 2/2] kernel/module: Decouple klp and ftrace from load_module
From: Petr Mladek @ 2026-04-16 14:49 UTC (permalink / raw)
  To: Petr Pavlu
  Cc: Song Chen, rafael, lenb, mturquette, sboyd, viresh.kumar, agk,
	snitzer, mpatocka, bmarzins, song, yukuai, linan122, jason.wessel,
	danielt, dianders, horms, davem, edumazet, kuba, pabeni, paulmck,
	frederic, mcgrof, da.gomez, samitolvanen, atomlin, jpoimboe,
	jikos, mbenes, joe.lawrence, rostedt, mhiramat, mark.rutland,
	mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <1db425bf-58a9-4768-8c38-3ae25d7662a5@suse.com>

On Thu 2026-04-16 13:18:30, Petr Pavlu wrote:
> On 4/15/26 8:43 AM, Song Chen wrote:
> > On 4/14/26 22:33, Petr Pavlu wrote:
> >> On 4/13/26 10:07 AM, chensong_2000@189.cn wrote:
> >>> diff --git a/include/linux/module.h b/include/linux/module.h
> >>> index 14f391b186c6..0bdd56f9defd 100644
> >>> --- a/include/linux/module.h
> >>> +++ b/include/linux/module.h
> >>> @@ -308,6 +308,14 @@ enum module_state {
> >>>       MODULE_STATE_COMING,    /* Full formed, running module_init. */
> >>>       MODULE_STATE_GOING,    /* Going away. */
> >>>       MODULE_STATE_UNFORMED,    /* Still setting it up. */
> >>> +    MODULE_STATE_FORMED,
> >>
> >> I don't see a reason to add a new module state. Why is it necessary and
> >> how does it fit with the existing states?
> >>
> > because once notifier fails in state MODULE_STATE_UNFORMED (now only ftrace has someting to do in this state), notifier chain will roll back by calling blocking_notifier_call_chain_robust, i'm afraid MODULE_STATE_GOING is going to jeopardise the notifers which don't handle it appropriately, like:
> > 
> > case MODULE_STATE_COMING:
> >      kmalloc();
> > case MODULE_STATE_GOING:
> >      kfree();
> 
> My understanding is that the current module "state machine" operates as
> follows. Transitions marked with an asterisk (*) are announced via the
> module notifier.
> 
> ---> UNFORMED --*> COMING --*> LIVE --*> GOING -.
>         ^            |                     ^    |
>         |            '---------------------*    |
>         '---------------------------------------'
> 
> The new code aims to replace the current ftrace_module_init() call in
> load_module(). To achieve this, it adds a notification for the UNFORMED
> state (only when loading a module) and introduces a new FORMED state for
> rollback. FORMED is purely a fake state because it never appears in
> module::state. The new structure is as follows:
> 
>         ,--*> (FORMED)
>         |
> --*> UNFORMED --*> COMING --*> LIVE --*> GOING -.
>         ^            |                     ^    |
>         |            '---------------------*    |
>         '---------------------------------------'
> 
> I'm afraid this is quite complex and inconsistent. Unless it can be kept
> simple, we would be just replacing one special handling with a different
> complexity, which is not worth it.

> >>
> >>> +    if (err)
> >>> +        goto ddebug_cleanup;
> >>>         /* Finally it's fully formed, ready to start executing. */
> >>>       err = complete_formation(mod, info);
> >>> -    if (err)
> >>> +    if (err) {
> >>> +        blocking_notifier_call_chain_reverse(&module_notify_list,
> >>> +                MODULE_STATE_FORMED, mod);
> >>>           goto ddebug_cleanup;
> >>> +    }
> >>>   -    err = prepare_coming_module(mod);
> >>> +    err = prepare_module_state_transaction(mod,
> >>> +                MODULE_STATE_COMING, MODULE_STATE_GOING);
> >>>       if (err)
> >>>           goto bug_cleanup;
> >>>   @@ -3522,7 +3519,6 @@ static int load_module(struct load_info *info, const char __user *uargs,
> >>>       destroy_params(mod->kp, mod->num_kp);
> >>>       blocking_notifier_call_chain(&module_notify_list,
> >>>                        MODULE_STATE_GOING, mod);
> >>
> >> My understanding is that all notifier chains for MODULE_STATE_GOING
> >> should be reversed.
> > yes, all, from lowest priority notifier to highest.
> > I will resend patch 1 which was failed due to my proxy setting.
> 
> What I meant here is that the call:
> 
> blocking_notifier_call_chain(&module_notify_list, MODULE_STATE_GOING, mod);
> 
> should be replaced with:
> 
> blocking_notifier_call_chain_reverse(&module_notify_list, MODULE_STATE_GOING, mod);
> 
> > 
> >>
> >>> -    klp_module_going(mod);
> >>>    bug_cleanup:
> >>>       mod->state = MODULE_STATE_GOING;
> >>>       /* module_bug_cleanup needs module_mutex protection */
> >>
> >> The patch removes the klp_module_going() cleanup call in load_module().
> >> Similarly, the ftrace_release_mod() call under the ddebug_cleanup label
> >> should be removed and appropriately replaced with a cleanup via
> >> a notifier.
> >>
> >     err = prepare_module_state_transaction(mod,
> >                 MODULE_STATE_UNFORMED, MODULE_STATE_FORMED);
> >     if (err)
> >         goto ddebug_cleanup;
> > 
> > ftrace will be cleanup in blocking_notifier_call_chain_robust rolling back.
> > 
> >     err = prepare_module_state_transaction(mod,
> >                 MODULE_STATE_COMING, MODULE_STATE_GOING);
> > 
> > each notifier including ftrace and klp will be cleanup in blocking_notifier_call_chain_robust rolling back.
> > 
> > if all notifiers are successful in MODULE_STATE_COMING, they all will be clean up in
> >  coming_cleanup:
> >     mod->state = MODULE_STATE_GOING;
> >     destroy_params(mod->kp, mod->num_kp);
> >     blocking_notifier_call_chain(&module_notify_list,
> >                      MODULE_STATE_GOING, mod);
> > 
> > if  something wrong underneath.
> 
> My point is that the patch leaves a call to ftrace_release_mod() in
> load_module(), which I expected to be handled via a notifier.

I think that I have got it. The ftrace code needs two notifiers when
the module is being loaded and two when it is going.

This is why Sond added the new state. But I think that we would
need two new states to call:

    + ftrace_module_init() in MODULE_STATE_UNFORMED
    + ftrace_module_enable() in MODULE_STATE_FORMED

and

    + ftrace_free_mem() in MODULE_STATE_PRE_GOING
    + ftrace_free_mem() in MODULE_STATE_GOING


By using the ascii art:

 -*> UNFORMED -*> FORMED -> COMING -*> LIVE -*> PRE_GOING -*> GOING -.
              |          |         |                ^           ^    ^
              |          |         '----------------'           |    |
              |          '--------------------------------------'    |
              '------------------------------------------------------'


But I think that this is not worth it.

Best Regards,
Petr

^ permalink raw reply

* Re: [PATCH v3 3/8] arm64: entry: add unwind info for various kernel entries
From: Jens Remus @ 2026-04-16 14:09 UTC (permalink / raw)
  To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
	Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
	Catalin Marinas, Jiri Kosina
  Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
	joe.lawrence, linux-toolchains, linux-kernel, live-patching,
	linux-arm-kernel, Heiko Carstens, Sven Schnelle
In-Reply-To: <20260406185000.1378082-4-dylanbhatch@google.com>

Hello Dylan and Weinan!

On 4/6/2026 8:49 PM, Dylan Hatch wrote:
> From: Weinan Liu <wnliu@google.com>
> 
> DWARF CFI (Call Frame Information) specifies how to recover the return
> address and callee-saved registers at each PC in a given function.
> Compilers are able to generate the CFI annotations when they compile
> the code to assembly language. For handcrafted assembly, we need to
> annotate them by hand.
> 
> Annotate CFI unwind info for assembly for interrupt and exception
> handlers.

It took me a while to figure, why CFI annotations are uncommonly only
added to selected instruction (ranges) and not the whole functions.  I
guess you only want to enable stacktracing using SFrame through
el1*_64_*() (from el1*_64_*_handler()) and call_on_irq_stack(), that is
why the added CFI annotations start after the bl/blr instructions, so
that whenever an unwound return address points after those bl/blr
SFrame can recover the stack pointer, frame pointer, and return address.

Wouldn't that be worth to be documented in the commit message?

> Signed-off-by: Weinan Liu <wnliu@google.com>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>

> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S

> @@ -575,7 +575,12 @@ SYM_CODE_START_LOCAL(el\el\ht\()_\regsize\()_\label)
>  	.if \el == 0
>  	b	ret_to_user
>  	.else

	/* Minimal DWARF CFI for unwinding across call above. */

> +	.cfi_startproc
> +	.cfi_def_cfa_offset PT_REGS_SIZE
> +	.cfi_offset 29, S_FP - PT_REGS_SIZE
> +	.cfi_offset 30, S_LR - PT_REGS_SIZE
>  	b	ret_to_kernel
> +	.cfi_endproc
>  	.endif
>  SYM_CODE_END(el\el\ht\()_\regsize\()_\label)
>  	.endm

> @@ -889,6 +894,10 @@ SYM_FUNC_START(call_on_irq_stack)
>  	add	sp, x16, #IRQ_STACK_SIZE
>  	restore_irq x9
>  	blr	x1

	/* Minimal DWARF CFI for unwinding across indirect call above. */

> +	.cfi_startproc
> +	.cfi_def_cfa 29, 16
> +	.cfi_offset 29, -16
> +	.cfi_offset 30, -8
>  
>  	save_and_disable_daif x9
>  	/*
> @@ -900,6 +909,7 @@ SYM_FUNC_START(call_on_irq_stack)
>  	scs_load_current
>  	restore_irq x9
>  	ret
> +	.cfi_endproc
>  SYM_FUNC_END(call_on_irq_stack)
>  NOKPROBE(call_on_irq_stack)

While above minimal DWARF CFI works for your use case, the following
minor issue should probably be better corrected (excerpt from
call_on_irq_stack with your patch applied):

	blr	x1
	.cfi_startproc
	.cfi_def_cfa 29, 16   <-- CFA is defined as FP + 16
	.cfi_offset 29, -16
	.cfi_offset 30, -8

	save_and_disable_daif x9
	/*
	 * Restore the SP from the FP, and restore the FP and LR from the frame
	 * record.
	 */
	mov	sp, x29
	ldp	x29, x30, [sp], #16   <-- FP is restored, so that the CFA definition is no longer valid
[CORRECTION]
	.cfi_restore 29
	.cfi_restore 30
	.cfi_def_cfa 31, 0
[/CORRECTION]
	scs_load_current
	restore_irq x9
	ret
	.cfi_endproc
SYM_FUNC_END(call_on_irq_stack)


Would it alternatively make sense to add complete DWARF CFI annotations
to call_on_irq_stack()?  I think the following would do:

SYM_FUNC_START(call_on_irq_stack)
	.cfi_startproc
...
	/* Create a frame record to save our LR and SP (implicit in FP) */
	stp	x29, x30, [sp, #-16]!
	mov	x29, sp
	.cfi_def_cfa 29, 16
	.cfi_offset 29, -16
	.cfi_offset 30, -8
...
	/*
	 * Restore the SP from the FP, and restore the FP and LR from the frame
	 * record.
	 */
	mov	sp, x29
	ldp	x29, x30, [sp], #16
	.cfi_restore 29
	.cfi_restore 30
	.cfi_def_cfa 31, 0
...
	ret
	.cfi_endproc
SYM_FUNC_END(call_on_irq_stack)

Thanks and regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply

* Re: [PATCH v2 0/6] kselftests: livepatch: Adapt tests to be executed on 4.12 kernels
From: Miroslav Benes @ 2026-04-16 13:24 UTC (permalink / raw)
  To: Marcos Paulo de Souza
  Cc: Josh Poimboeuf, Jiri Kosina, Petr Mladek, Joe Lawrence,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <847de72350a1fe8bd765f2e5493b13e8bf7b2966.camel@suse.com>

On Wed, 15 Apr 2026, Marcos Paulo de Souza wrote:

> On Wed, 2026-04-15 at 14:01 +0200, Miroslav Benes wrote:
> > On Mon, 13 Apr 2026, Marcos Paulo de Souza wrote:
> > 
> > > A new version of the patchset, with fewer patches now. Please take
> > > a look!
> > > 
> > > Original cover-letter:
> > > These patches don't really change how the patches are run, just
> > > skip
> > > some tests on kernels that don't support a feature (like kprobe and
> > > livepatched living together) or when a livepatch sysfs attribute is
> > > missing.
> > > 
> > > The last patch slightly adjusts check_result function to skip dmesg
> > > messages on SLE kernels when a livepatch is removed.
> > > 
> > > These patches are based on printk/for-next branch.
> > > 
> > > Please review! Thanks!
> > > 
> > > Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
> > 
> > Besides my comment for 1/6 and what Sashiko discovered, it looks good
> > to 
> > me.
> > 
> > However, please also take a look at brand new 
> > test_modules/test_klp_mod_target.c. It does not build on old kernels
> > since 
> > they lack proc_create_single(). I think it should be covered in this
> > patch 
> > set too.
> 
> I saw that yesterday as well, but I wanted to merge this series first.
> I have plans to create a way to unregister the livepatches if something
> fails, so we can continue to run the other tests. I was planning to fix
> the test_klp_mod_target.c in the same patchset.

Fair enough.

Miroslav

^ permalink raw reply

* Re: [RFC PATCH 2/2] kernel/module: Decouple klp and ftrace from load_module
From: Petr Mladek @ 2026-04-16 13:09 UTC (permalink / raw)
  To: Song Chen
  Cc: Petr Pavlu, rafael, lenb, mturquette, sboyd, viresh.kumar, agk,
	snitzer, mpatocka, bmarzins, song, yukuai, linan122, jason.wessel,
	danielt, dianders, horms, davem, edumazet, kuba, pabeni, paulmck,
	frederic, mcgrof, da.gomez, samitolvanen, atomlin, jpoimboe,
	jikos, mbenes, joe.lawrence, rostedt, mhiramat, mark.rutland,
	mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <a35f5f94-7d5a-4347-974b-b270c89ef241@189.cn>

On Wed 2026-04-15 14:43:53, Song Chen wrote:
> Hi,
> 
> On 4/14/26 22:33, Petr Pavlu wrote:
> > On 4/13/26 10:07 AM, chensong_2000@189.cn wrote:
> > > From: Song Chen <chensong_2000@189.cn>
> > > 
> > > ftrace and livepatch currently have their module load/unload callbacks
> > > hard-coded in the module loader as direct function calls to
> > > ftrace_module_enable(), klp_module_coming(), klp_module_going()
> > > and ftrace_release_mod(). This tight coupling was originally introduced
> > > to enforce strict call ordering that could not be guaranteed by the
> > > module notifier chain, which only supported forward traversal. Their
> > > notifiers were moved in and out back and forth. see [1] and [2].
> > 
> > I'm unclear about what is meant by the notifiers being moved back and
> > forth. The links point to patches that converted ftrace+klp from using
> > module notifiers to explicit callbacks due to ordering issues, but this
> > switch occurred only once. Have there been other attempts to use
> > notifiers again?
> > 
> > > diff --git a/include/linux/module.h b/include/linux/module.h
> > > index 14f391b186c6..0bdd56f9defd 100644
> > > --- a/include/linux/module.h
> > > +++ b/include/linux/module.h
> > > @@ -308,6 +308,14 @@ enum module_state {
> > >   	MODULE_STATE_COMING,	/* Full formed, running module_init. */
> > >   	MODULE_STATE_GOING,	/* Going away. */
> > >   	MODULE_STATE_UNFORMED,	/* Still setting it up. */
> > > +	MODULE_STATE_FORMED,
> > 
> > I don't see a reason to add a new module state. Why is it necessary and
> > how does it fit with the existing states?
> > 
> because once notifier fails in state MODULE_STATE_UNFORMED (now only ftrace
> has someting to do in this state), notifier chain will roll back by calling
> blocking_notifier_call_chain_robust, i'm afraid MODULE_STATE_GOING is going
> to jeopardise the notifers which don't handle it appropriately, like:
> 
> case MODULE_STATE_COMING:
>      kmalloc();
> case MODULE_STATE_GOING:
>      kfree();
> 
> 
> > > +};
> > > +
> > > +enum module_notifier_prio {
> > > +	MODULE_NOTIFIER_PRIO_LOW = INT_MIN,	/* Low prioroty, coming last, going first */
> > > +	MODULE_NOTIFIER_PRIO_MID = 0,	/* Normal priority. */
> > > +	MODULE_NOTIFIER_PRIO_SECOND_HIGH = INT_MAX - 1,	/* Second high priorigy, coming second*/
> > > +	MODULE_NOTIFIER_PRIO_HIGH = INT_MAX,	/* High priorigy, coming first, going late. */
> > 
> > I suggest being explicit about how the notifiers are ordered. For
> > example:
> > 
> > enum module_notifier_prio {
> > 	MODULE_NOTIFIER_PRIO_NORMAL,	/* Normal priority, coming last, going first. */
> > 	MODULE_NOTIFIER_PRIO_LIVEPATCH,
> > 	MODULE_NOTIFIER_PRIO_FTRACE,	/* High priority, coming first, going late. */
> > };
> > 

I like the explicit PRIO_LIVEPATCH/FTRACE names.

But I would keep the INT_MAX - 1 and INT_MAX priorities. I believe
that ftrace/livepatching will always be the first/last to call.
And INT_MAX would help to preserve kABI when PRIO_NORMAL is not
enough for the rest of notifiers.

That said, I am not sure whether this is worth the effort.
This patch tries to move the explicit callbacks in a generic
notifiers API. But it will still need to use some explictly
defined (reserved) priorities. And it will
not guarantee a misuse. Also it requires the double linked
list which complicates the notifiers code.


> > >   };
> > >   struct mod_tree_node {
> > > --- a/kernel/module/main.c
> > > +++ b/kernel/module/main.c
> > > @@ -3281,20 +3277,14 @@ static int complete_formation(struct module *mod, struct load_info *info)
> > >   	return err;
> > >   }
> > > -static int prepare_coming_module(struct module *mod)
> > > +static int prepare_module_state_transaction(struct module *mod,
> > > +			unsigned long val_up, unsigned long val_down)
> > >   {
> > >   	int err;
> > > -	ftrace_module_enable(mod);
> > > -	err = klp_module_coming(mod);
> > > -	if (err)
> > > -		return err;
> > > -
> > >   	err = blocking_notifier_call_chain_robust(&module_notify_list,
> > > -			MODULE_STATE_COMING, MODULE_STATE_GOING, mod);
> > > +			val_up, val_down, mod);
> > >   	err = notifier_to_errno(err);
> > > -	if (err)
> > > -		klp_module_going(mod);
> > >   	return err;
> > >   }

I personally find the name "prepare_module_state_transaction"
misleading. What is the "transaction" here? If this was a "preparation"
step then where is the transaction done/finished?

It might be better to just opencode the
blocking_notifier_call_chain_robust() instead.

> > > @@ -3468,14 +3458,21 @@ static int load_module(struct load_info *info, const char __user *uargs,
> > >   	init_build_id(mod, info);
> > >   	/* Ftrace init must be called in the MODULE_STATE_UNFORMED state */
> > > -	ftrace_module_init(mod);
> > > +	err = prepare_module_state_transaction(mod,
> > > +				MODULE_STATE_UNFORMED, MODULE_STATE_FORMED);
> > 
> > I believe val_down should be MODULE_STATE_GOING to reverse the
> > operation. Why is the new state MODULE_STATE_FORMED needed here?
> to avoid this:
> 
> case MODULE_STATE_COMING:
>      kmalloc();
> case MODULE_STATE_GOING:
>      kfree();

Hmm, the module is in "FORMED" state here.

> > > +	if (err)
> > > +		goto ddebug_cleanup;
> > >   	/* Finally it's fully formed, ready to start executing. */
> > >   	err = complete_formation(mod, info);

And we call "complete_formation()" function. This sounds like
it was not really "FORMED" before. => It is confusing and nono.

Please, try to avoid the new state if possible. My experience
with reading the module loader code is that any new state
brings a lot of complexity. You need to take it into account
when checking correctness of other changes, features, ...

Something tells me that if the state was not needed before
then we could avoid it.

> > > -	if (err)
> > > +	if (err) {
> > > +		blocking_notifier_call_chain_reverse(&module_notify_list,
> > > +				MODULE_STATE_FORMED, mod);
> > >   		goto ddebug_cleanup;
> > > +	}
> > > -	err = prepare_coming_module(mod);
> > > +	err = prepare_module_state_transaction(mod,
> > > +				MODULE_STATE_COMING, MODULE_STATE_GOING);
> > >   	if (err)
> > >   		goto bug_cleanup;
> > > --- a/kernel/trace/ftrace.c
> > > +++ b/kernel/trace/ftrace.c
> > > @@ -5241,6 +5241,44 @@ static int __init ftrace_mod_cmd_init(void)
> > >   }
> > >   core_initcall(ftrace_mod_cmd_init);
> > > +static int ftrace_module_callback(struct notifier_block *nb, unsigned long op,
> > > +			void *module)
> > > +{
> > > +	struct module *mod = module;
> > > +
> > > +	switch (op) {
> > > +	case MODULE_STATE_UNFORMED:
> > > +		ftrace_module_init(mod);
> > > +		break;
> > > +	case MODULE_STATE_COMING:
> > > +		ftrace_module_enable(mod);
> > > +		break;
> > > +	case MODULE_STATE_LIVE:
> > > +		ftrace_free_mem(mod, mod->mem[MOD_INIT_TEXT].base,
> > > +				mod->mem[MOD_INIT_TEXT].base + mod->mem[MOD_INIT_TEXT].size);
> > > +		break;
> > > +	case MODULE_STATE_GOING:
> > > +	case MODULE_STATE_FORMED:
> > > +		ftrace_release_mod(mod);

This calls "release" in a "FORMED" state. It does not make any
sense. Something looks fishy, either the code or the naming.

> > > +		break;
> > > +	default:
> > > +		break;
> > > +	}
> > 

I am sorry for being so picky about names. I believe that good names
help to prevent bugs and reduce headaches.

Best Regards,
Petr

^ permalink raw reply

* Re: [RFC PATCH 1/2] kernel/notifier: replace single-linked list with double-linked list for reverse traversal
From: David Laight @ 2026-04-16 12:30 UTC (permalink / raw)
  To: chensong_2000
  Cc: rafael, lenb, mturquette, sboyd, viresh.kumar, agk, snitzer,
	mpatocka, bmarzins, song, yukuai, linan122, jason.wessel, danielt,
	dianders, horms, davem, edumazet, kuba, pabeni, paulmck, frederic,
	mcgrof, petr.pavlu, da.gomez, samitolvanen, atomlin, jpoimboe,
	jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
	mark.rutland, mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <20260415070137.17860-1-chensong_2000@189.cn>

On Wed, 15 Apr 2026 15:01:37 +0800
chensong_2000@189.cn wrote:

> From: Song Chen <chensong_2000@189.cn>
> 
> The current notifier chain implementation uses a single-linked list
> (struct notifier_block *next), which only supports forward traversal
> in priority order. This makes it difficult to handle cleanup/teardown
> scenarios that require notifiers to be called in reverse priority order.

If it is only cleanup/teardown then the list can be order-reversed
as part of that process at the same time as the list is deleted.

	David



^ permalink raw reply

* Re: [RFC PATCH 2/2] kernel/module: Decouple klp and ftrace from load_module
From: Petr Pavlu @ 2026-04-16 11:18 UTC (permalink / raw)
  To: Song Chen
  Cc: rafael, lenb, mturquette, sboyd, viresh.kumar, agk, snitzer,
	mpatocka, bmarzins, song, yukuai, linan122, jason.wessel, danielt,
	dianders, horms, davem, edumazet, kuba, pabeni, paulmck, frederic,
	mcgrof, da.gomez, samitolvanen, atomlin, jpoimboe, jikos, mbenes,
	pmladek, joe.lawrence, rostedt, mhiramat, mark.rutland,
	mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <a35f5f94-7d5a-4347-974b-b270c89ef241@189.cn>

On 4/15/26 8:43 AM, Song Chen wrote:
> On 4/14/26 22:33, Petr Pavlu wrote:
>> On 4/13/26 10:07 AM, chensong_2000@189.cn wrote:
>>> diff --git a/include/linux/module.h b/include/linux/module.h
>>> index 14f391b186c6..0bdd56f9defd 100644
>>> --- a/include/linux/module.h
>>> +++ b/include/linux/module.h
>>> @@ -308,6 +308,14 @@ enum module_state {
>>>       MODULE_STATE_COMING,    /* Full formed, running module_init. */
>>>       MODULE_STATE_GOING,    /* Going away. */
>>>       MODULE_STATE_UNFORMED,    /* Still setting it up. */
>>> +    MODULE_STATE_FORMED,
>>
>> I don't see a reason to add a new module state. Why is it necessary and
>> how does it fit with the existing states?
>>
> because once notifier fails in state MODULE_STATE_UNFORMED (now only ftrace has someting to do in this state), notifier chain will roll back by calling blocking_notifier_call_chain_robust, i'm afraid MODULE_STATE_GOING is going to jeopardise the notifers which don't handle it appropriately, like:
> 
> case MODULE_STATE_COMING:
>      kmalloc();
> case MODULE_STATE_GOING:
>      kfree();

My understanding is that the current module "state machine" operates as
follows. Transitions marked with an asterisk (*) are announced via the
module notifier.

---> UNFORMED --*> COMING --*> LIVE --*> GOING -.
        ^            |                     ^    |
        |            '---------------------*    |
        '---------------------------------------'

The new code aims to replace the current ftrace_module_init() call in
load_module(). To achieve this, it adds a notification for the UNFORMED
state (only when loading a module) and introduces a new FORMED state for
rollback. FORMED is purely a fake state because it never appears in
module::state. The new structure is as follows:

        ,--*> (FORMED)
        |
--*> UNFORMED --*> COMING --*> LIVE --*> GOING -.
        ^            |                     ^    |
        |            '---------------------*    |
        '---------------------------------------'

I'm afraid this is quite complex and inconsistent. Unless it can be kept
simple, we would be just replacing one special handling with a different
complexity, which is not worth it.

>>
>>> +    if (err)
>>> +        goto ddebug_cleanup;
>>>         /* Finally it's fully formed, ready to start executing. */
>>>       err = complete_formation(mod, info);
>>> -    if (err)
>>> +    if (err) {
>>> +        blocking_notifier_call_chain_reverse(&module_notify_list,
>>> +                MODULE_STATE_FORMED, mod);
>>>           goto ddebug_cleanup;
>>> +    }
>>>   -    err = prepare_coming_module(mod);
>>> +    err = prepare_module_state_transaction(mod,
>>> +                MODULE_STATE_COMING, MODULE_STATE_GOING);
>>>       if (err)
>>>           goto bug_cleanup;
>>>   @@ -3522,7 +3519,6 @@ static int load_module(struct load_info *info, const char __user *uargs,
>>>       destroy_params(mod->kp, mod->num_kp);
>>>       blocking_notifier_call_chain(&module_notify_list,
>>>                        MODULE_STATE_GOING, mod);
>>
>> My understanding is that all notifier chains for MODULE_STATE_GOING
>> should be reversed.
> yes, all, from lowest priority notifier to highest.
> I will resend patch 1 which was failed due to my proxy setting.

What I meant here is that the call:

blocking_notifier_call_chain(&module_notify_list, MODULE_STATE_GOING, mod);

should be replaced with:

blocking_notifier_call_chain_reverse(&module_notify_list, MODULE_STATE_GOING, mod);

> 
>>
>>> -    klp_module_going(mod);
>>>    bug_cleanup:
>>>       mod->state = MODULE_STATE_GOING;
>>>       /* module_bug_cleanup needs module_mutex protection */
>>
>> The patch removes the klp_module_going() cleanup call in load_module().
>> Similarly, the ftrace_release_mod() call under the ddebug_cleanup label
>> should be removed and appropriately replaced with a cleanup via
>> a notifier.
>>
>     err = prepare_module_state_transaction(mod,
>                 MODULE_STATE_UNFORMED, MODULE_STATE_FORMED);
>     if (err)
>         goto ddebug_cleanup;
> 
> ftrace will be cleanup in blocking_notifier_call_chain_robust rolling back.
> 
>     err = prepare_module_state_transaction(mod,
>                 MODULE_STATE_COMING, MODULE_STATE_GOING);
> 
> each notifier including ftrace and klp will be cleanup in blocking_notifier_call_chain_robust rolling back.
> 
> if all notifiers are successful in MODULE_STATE_COMING, they all will be clean up in
>  coming_cleanup:
>     mod->state = MODULE_STATE_GOING;
>     destroy_params(mod->kp, mod->num_kp);
>     blocking_notifier_call_chain(&module_notify_list,
>                      MODULE_STATE_GOING, mod);
> 
> if  something wrong underneath.

My point is that the patch leaves a call to ftrace_release_mod() in
load_module(), which I expected to be handled via a notifier.

-- 
Thanks,
Petr

^ permalink raw reply

* Re: [RFC PATCH 1/2] kernel/notifier: replace single-linked list with double-linked list for reverse traversal
From: Petr Mladek @ 2026-04-16 10:33 UTC (permalink / raw)
  To: chensong_2000
  Cc: rafael, lenb, mturquette, sboyd, viresh.kumar, agk, snitzer,
	mpatocka, bmarzins, song, yukuai, linan122, jason.wessel, danielt,
	dianders, horms, davem, edumazet, kuba, pabeni, paulmck, frederic,
	mcgrof, petr.pavlu, da.gomez, samitolvanen, atomlin, jpoimboe,
	jikos, mbenes, joe.lawrence, rostedt, mhiramat, mark.rutland,
	mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <20260415070137.17860-1-chensong_2000@189.cn>

On Wed 2026-04-15 15:01:37, chensong_2000@189.cn wrote:
> From: Song Chen <chensong_2000@189.cn>
> 
> The current notifier chain implementation uses a single-linked list
> (struct notifier_block *next), which only supports forward traversal
> in priority order. This makes it difficult to handle cleanup/teardown
> scenarios that require notifiers to be called in reverse priority order.
> 
> A concrete example is the ordering dependency between ftrace and
> livepatch during module load/unload. see the detail here [1].
> 
> This patch replaces the single-linked list in struct notifier_block
> with a struct list_head, converting the notifier chain into a
> doubly-linked list sorted in descending priority order. Based on
> this, a new function notifier_call_chain_reverse() is introduced,
> which traverses the chain in reverse (ascending priority order).
> The corresponding blocking_notifier_call_chain_reverse() is also
> added as the locking wrapper for blocking notifier chains.
> 
> The internal notifier_call_chain_robust() is updated to use
> notifier_call_chain_reverse() for rollback: on error, it records
> the failing notifier (last_nb) and the count of successfully called
> notifiers (nr), then rolls back exactly those nr-1 notifiers in
> reverse order starting from last_nb's predecessor, without needing
> to know the total length of the chain.
> 
> With this change, subsystems with symmetric setup/teardown ordering
> requirements can register a single notifier_block with one priority
> value, and rely on blocking_notifier_call_chain() for forward
> traversal and blocking_notifier_call_chain_reverse() for reverse
> traversal, without needing hard-coded call sequences or separate
> notifier registrations for each direction.
> 
> [1]:https://lore.kernel.org/all
> 	/alpine.LNX.2.00.1602172216491.22700@cbobk.fhfr.pm/
> 
> --- a/include/linux/notifier.h
> +++ b/include/linux/notifier.h
> @@ -53,41 +53,41 @@ typedef	int (*notifier_fn_t)(struct notifier_block *nb,
[...]
>  struct notifier_block {
>  	notifier_fn_t notifier_call;
> -	struct notifier_block __rcu *next;
> +	struct list_head __rcu entry;
>  	int priority;
>  };
[...]
>  #define ATOMIC_INIT_NOTIFIER_HEAD(name) do {	\
>  		spin_lock_init(&(name)->lock);	\
> -		(name)->head = NULL;		\
> +		INIT_LIST_HEAD(&(name)->head);		\

I would expect the RCU variant here, aka INIT_LIST_HEAD_RCU().

> --- a/kernel/notifier.c
> +++ b/kernel/notifier.c
> @@ -14,39 +14,47 @@
>   *	are layered on top of these, with appropriate locking added.
>   */
>  
> -static int notifier_chain_register(struct notifier_block **nl,
> +static int notifier_chain_register(struct list_head *nl,
>  				   struct notifier_block *n,
>  				   bool unique_priority)
>  {
> -	while ((*nl) != NULL) {
> -		if (unlikely((*nl) == n)) {
> +	struct notifier_block *cur;
> +
> +	list_for_each_entry(cur, nl, entry) {
> +		if (unlikely(cur == n)) {
>  			WARN(1, "notifier callback %ps already registered",
>  			     n->notifier_call);
>  			return -EEXIST;
>  		}
> -		if (n->priority > (*nl)->priority)
> -			break;
> -		if (n->priority == (*nl)->priority && unique_priority)
> +
> +		if (n->priority == cur->priority && unique_priority)
>  			return -EBUSY;
> -		nl = &((*nl)->next);
> +
> +		if (n->priority > cur->priority) {
> +			list_add_tail(&n->entry, &cur->entry);
> +			goto out;
> +		}
>  	}
> -	n->next = *nl;
> -	rcu_assign_pointer(*nl, n);
> +
> +	list_add_tail(&n->entry, nl);

I would expect list_add_tail_rcu() here.

> @@ -59,25 +67,25 @@ static int notifier_chain_unregister(struct notifier_block **nl,
>   *			value of this parameter is -1.
>   *	@nr_calls:	Records the number of notifications sent. Don't care
>   *			value of this field is NULL.
> + *	@last_nb:  Records the last called notifier block for rolling back
>   *	Return:		notifier_call_chain returns the value returned by the
>   *			last notifier function called.
>   */
> -static int notifier_call_chain(struct notifier_block **nl,
> +static int notifier_call_chain(struct list_head *nl,
>  			       unsigned long val, void *v,
> -			       int nr_to_call, int *nr_calls)
> +			       int nr_to_call, int *nr_calls,
> +				   struct notifier_block **last_nb)
>  {
>  	int ret = NOTIFY_DONE;
> -	struct notifier_block *nb, *next_nb;
> -
> -	nb = rcu_dereference_raw(*nl);
> +	struct notifier_block *nb;
>  
> -	while (nb && nr_to_call) {
> -		next_nb = rcu_dereference_raw(nb->next);
> +	if (!nr_to_call)
> +		return ret;
>  
> +	list_for_each_entry(nb, nl, entry) {

I would expect the RCU variant here, aka list_for_each_rcu()

These are just two random examples which I found by a quick look.

I guess that the notifier API is very old and it does not use all
the RCU API features which allow to track safety when
CONFIG_PROVE_RCU and CONFIG_PROVE_RCU_LIST are enabled.

It actually might be worth to audit the code and make it right.

>  #ifdef CONFIG_DEBUG_NOTIFIERS
>  		if (unlikely(!func_ptr_is_kernel_text(nb->notifier_call))) {
>  			WARN(1, "Invalid notifier called!");
> -			nb = next_nb;
>  			continue;
>  		}
>  #endif

That said, I am not sure if the ftrace/livepatching handlers are
the right motivation for this. Especially when I see the
complexity of the 2nd patch [*]

After thinking more about it. I am not even sure that the ftrace and
livepatching callbacks are good candidates for generic notifiers.
They are too special. It is not only about ordering them against
each other. But it is also about ordering them against other
notifiers. The ftrace/livepatching callbacks must be the first/last
during module load/release.

[*] The 2nd patch is not archived by lore for some reason.
    I have found only a review but it gives a good picture, see
    https://lore.kernel.org/all/1191caf5-6a61-4622-a15e-854d3701f4fc@suse.com/

Best Regards,
Petr

^ permalink raw reply

* [GIT PULL] livepatching for 7.1
From: Petr Mladek @ 2026-04-16  9:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, live-patching

Hi Linus,

please pull the latest livepatching changes from

  git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching.git tags/livepatching-for-7.1

======================================

- Add two new selftests.

----------------------------------------------------------------
Marcos Paulo de Souza (1):
      selftests: livepatch: test-ftrace: livepatch a traced function

Pablo Alessandro Santos Hugen (1):
      selftests/livepatch: add test for module function patching

Petr Mladek (1):
      Merge branch 'for-7.1/module-function-test' into for-linus

 tools/testing/selftests/livepatch/test-ftrace.sh   |  36 ++++++++
 .../testing/selftests/livepatch/test-livepatch.sh  | 100 +++++++++++++++++++++
 .../selftests/livepatch/test_modules/Makefile      |   2 +
 .../livepatch/test_modules/test_klp_mod_patch.c    |  53 +++++++++++
 .../livepatch/test_modules/test_klp_mod_target.c   |  39 ++++++++
 5 files changed, 230 insertions(+)
 create mode 100644 tools/testing/selftests/livepatch/test_modules/test_klp_mod_patch.c
 create mode 100644 tools/testing/selftests/livepatch/test_modules/test_klp_mod_target.c

^ permalink raw reply

* Re: [PATCH] samples/livepatch: Add BPF struct_ops integration sample
From: Yafang Shao @ 2026-04-16  7:45 UTC (permalink / raw)
  To: Song Liu
  Cc: live-patching, linux-kernel, jpoimboe, jikos, mbenes, pmladek,
	joe.lawrence, kernel-team
In-Reply-To: <20260416001628.2062468-1-song@kernel.org>

On Thu, Apr 16, 2026 at 8:16 AM Song Liu <song@kernel.org> wrote:
>
> Add a sample module that demonstrates how BPF struct_ops can work
> together with kernel livepatch. The module livepatches
> cmdline_proc_show() and delegates the output to a BPF struct_ops
> callback. When no BPF program is attached, a fallback message is
> shown; when a BPF struct_ops program is attached, it controls the
> /proc/cmdline output via the bpf_klp_seq_write kfunc.
>
> This builds on the existing livepatch-sample.c pattern but shows how
> livepatch and BPF struct_ops can be combined to make livepatched
> behavior programmable from userspace.
>
> The module is built when both CONFIG_SAMPLE_LIVEPATCH and
> CONFIG_BPF_JIT are enabled.
>
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>  samples/livepatch/Makefile        |   3 +
>  samples/livepatch/livepatch-bpf.c | 202 ++++++++++++++++++++++++++++++
>  2 files changed, 205 insertions(+)
>  create mode 100644 samples/livepatch/livepatch-bpf.c
>
> diff --git a/samples/livepatch/Makefile b/samples/livepatch/Makefile
> index 9f853eeb6140..1ab4ecbf1f0f 100644
> --- a/samples/livepatch/Makefile
> +++ b/samples/livepatch/Makefile
> @@ -6,3 +6,6 @@ obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch-shadow-fix2.o
>  obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch-callbacks-demo.o
>  obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch-callbacks-mod.o
>  obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch-callbacks-busymod.o
> +ifdef CONFIG_BPF_JIT
> +obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch-bpf.o
> +endif
> diff --git a/samples/livepatch/livepatch-bpf.c b/samples/livepatch/livepatch-bpf.c
> new file mode 100644
> index 000000000000..4a702a3b4726
> --- /dev/null
> +++ b/samples/livepatch/livepatch-bpf.c
> @@ -0,0 +1,202 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * livepatch-bpf.c - BPF struct_ops + Kernel Live Patching Sample Module
> + *
> + * Copyright (c) 2026 Meta Platforms, Inc. and affiliates.
> + *
> + * This sample demonstrates how BPF struct_ops can control kernel
> + * behavior through livepatch. The module livepatches cmdline_proc_show()
> + * and delegates output to a BPF struct_ops callback. A BPF program can
> + * then attach to override /proc/cmdline output via the bpf_klp_seq_write
> + * kfunc.
> + *
> + * Example:
> + *
> + * $ insmod livepatch-bpf.ko
> + * $ cat /proc/cmdline
> + * livepatch_bpf: no struct_ops attached
> + *
> + * (attach a BPF struct_ops program implementing set_cmdline, e.g.)
> + *
> + * SEC("struct_ops/set_cmdline")
> + * int BPF_PROG(set_cmdline, struct seq_file *m)
> + * {
> + *     char custom[] = "klp_bpf: custom cmdline\n";
> + *     bpf_klp_seq_write(m, custom, sizeof(custom) - 1);
> + *     return 0;
> + * }
> + *
> + * $ cat /proc/cmdline
> + * klp_bpf: custom cmdline
> + *
> + * $ echo 0 > /sys/kernel/livepatch/livepatch_bpf/enabled
> + * $ cat /proc/cmdline
> + * <your cmdline>
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/bpf.h>
> +#include <linux/btf.h>
> +#include <linux/btf_ids.h>
> +#include <linux/init.h>
> +#include <linux/module.h>
> +#include <linux/livepatch.h>
> +#include <linux/seq_file.h>
> +#include <linux/bpf_verifier.h>
> +
> +struct klp_bpf_cmdline_ops {
> +       int (*set_cmdline)(struct seq_file *m);
> +};
> +
> +static struct klp_bpf_cmdline_ops *active_ops;
> +
> +/* --- kfunc: allow BPF struct_ops programs to write to seq_file --- */
> +
> +__bpf_kfunc_start_defs();
> +
> +__bpf_kfunc void bpf_klp_seq_write(struct seq_file *m,
> +                                   const char *data, u32 data__sz)
> +{
> +       seq_write(m, data, data__sz);
> +}
> +
> +__bpf_kfunc_end_defs();
> +
> +BTF_KFUNCS_START(klp_bpf_kfunc_ids)
> +BTF_ID_FLAGS(func, bpf_klp_seq_write)
> +BTF_KFUNCS_END(klp_bpf_kfunc_ids)
> +
> +static const struct btf_kfunc_id_set klp_bpf_kfunc_set = {
> +       .owner = THIS_MODULE,
> +       .set   = &klp_bpf_kfunc_ids,
> +};
> +
> +/* --- Livepatch replacement for cmdline_proc_show --- */
> +
> +static int livepatch_cmdline_proc_show(struct seq_file *m, void *v)
> +{
> +       struct klp_bpf_cmdline_ops *ops = READ_ONCE(active_ops);
> +
> +       if (ops && ops->set_cmdline)
> +               return ops->set_cmdline(m);
> +
> +       seq_printf(m, "%s: no struct_ops attached\n", THIS_MODULE->name);
> +       return 0;
> +}
> +
> +static struct klp_func funcs[] = {
> +       {
> +               .old_name = "cmdline_proc_show",
> +               .new_func = livepatch_cmdline_proc_show,
> +       }, { }
> +};
> +
> +static struct klp_object objs[] = {
> +       {
> +               /* name being NULL means vmlinux */
> +               .funcs = funcs,
> +       }, { }
> +};
> +
> +static struct klp_patch patch = {
> +       .mod = THIS_MODULE,
> +       .objs = objs,

  Nit: I suggest enabling the replace flag for this patch to align
with the recommended implementation.

    .replace = true,

Other than that, it looks good to me:

  Tested-and-acked-by: Yafang Shao <laoar.shao@gmail.com>

[...]

-- 
Regards
Yafang

^ permalink raw reply

* Re: [PATCH] samples/livepatch: Add BPF struct_ops integration sample
From: Song Liu @ 2026-04-16  0:18 UTC (permalink / raw)
  To: live-patching, linux-kernel
  Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, laoar.shao,
	kernel-team
In-Reply-To: <20260416001628.2062468-1-song@kernel.org>

On Wed, Apr 15, 2026 at 5:16 PM Song Liu <song@kernel.org> wrote:
>
> Add a sample module that demonstrates how BPF struct_ops can work
> together with kernel livepatch. The module livepatches
> cmdline_proc_show() and delegates the output to a BPF struct_ops
> callback. When no BPF program is attached, a fallback message is
> shown; when a BPF struct_ops program is attached, it controls the
> /proc/cmdline output via the bpf_klp_seq_write kfunc.
>
> This builds on the existing livepatch-sample.c pattern but shows how
> livepatch and BPF struct_ops can be combined to make livepatched
> behavior programmable from userspace.
>
> The module is built when both CONFIG_SAMPLE_LIVEPATCH and
> CONFIG_BPF_JIT are enabled.
>
> Signed-off-by: Song Liu <song@kernel.org>

Forgot to add

Assisted-by: Claude:claude-opus-4-6

^ permalink raw reply

* [PATCH] samples/livepatch: Add BPF struct_ops integration sample
From: Song Liu @ 2026-04-16  0:16 UTC (permalink / raw)
  To: live-patching, linux-kernel
  Cc: jpoimboe, jikos, mbenes, pmladek, joe.lawrence, laoar.shao,
	kernel-team, Song Liu

Add a sample module that demonstrates how BPF struct_ops can work
together with kernel livepatch. The module livepatches
cmdline_proc_show() and delegates the output to a BPF struct_ops
callback. When no BPF program is attached, a fallback message is
shown; when a BPF struct_ops program is attached, it controls the
/proc/cmdline output via the bpf_klp_seq_write kfunc.

This builds on the existing livepatch-sample.c pattern but shows how
livepatch and BPF struct_ops can be combined to make livepatched
behavior programmable from userspace.

The module is built when both CONFIG_SAMPLE_LIVEPATCH and
CONFIG_BPF_JIT are enabled.

Signed-off-by: Song Liu <song@kernel.org>
---
 samples/livepatch/Makefile        |   3 +
 samples/livepatch/livepatch-bpf.c | 202 ++++++++++++++++++++++++++++++
 2 files changed, 205 insertions(+)
 create mode 100644 samples/livepatch/livepatch-bpf.c

diff --git a/samples/livepatch/Makefile b/samples/livepatch/Makefile
index 9f853eeb6140..1ab4ecbf1f0f 100644
--- a/samples/livepatch/Makefile
+++ b/samples/livepatch/Makefile
@@ -6,3 +6,6 @@ obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch-shadow-fix2.o
 obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch-callbacks-demo.o
 obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch-callbacks-mod.o
 obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch-callbacks-busymod.o
+ifdef CONFIG_BPF_JIT
+obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch-bpf.o
+endif
diff --git a/samples/livepatch/livepatch-bpf.c b/samples/livepatch/livepatch-bpf.c
new file mode 100644
index 000000000000..4a702a3b4726
--- /dev/null
+++ b/samples/livepatch/livepatch-bpf.c
@@ -0,0 +1,202 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * livepatch-bpf.c - BPF struct_ops + Kernel Live Patching Sample Module
+ *
+ * Copyright (c) 2026 Meta Platforms, Inc. and affiliates.
+ *
+ * This sample demonstrates how BPF struct_ops can control kernel
+ * behavior through livepatch. The module livepatches cmdline_proc_show()
+ * and delegates output to a BPF struct_ops callback. A BPF program can
+ * then attach to override /proc/cmdline output via the bpf_klp_seq_write
+ * kfunc.
+ *
+ * Example:
+ *
+ * $ insmod livepatch-bpf.ko
+ * $ cat /proc/cmdline
+ * livepatch_bpf: no struct_ops attached
+ *
+ * (attach a BPF struct_ops program implementing set_cmdline, e.g.)
+ *
+ * SEC("struct_ops/set_cmdline")
+ * int BPF_PROG(set_cmdline, struct seq_file *m)
+ * {
+ *     char custom[] = "klp_bpf: custom cmdline\n";
+ *     bpf_klp_seq_write(m, custom, sizeof(custom) - 1);
+ *     return 0;
+ * }
+ *
+ * $ cat /proc/cmdline
+ * klp_bpf: custom cmdline
+ *
+ * $ echo 0 > /sys/kernel/livepatch/livepatch_bpf/enabled
+ * $ cat /proc/cmdline
+ * <your cmdline>
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/bpf.h>
+#include <linux/btf.h>
+#include <linux/btf_ids.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/livepatch.h>
+#include <linux/seq_file.h>
+#include <linux/bpf_verifier.h>
+
+struct klp_bpf_cmdline_ops {
+	int (*set_cmdline)(struct seq_file *m);
+};
+
+static struct klp_bpf_cmdline_ops *active_ops;
+
+/* --- kfunc: allow BPF struct_ops programs to write to seq_file --- */
+
+__bpf_kfunc_start_defs();
+
+__bpf_kfunc void bpf_klp_seq_write(struct seq_file *m,
+				    const char *data, u32 data__sz)
+{
+	seq_write(m, data, data__sz);
+}
+
+__bpf_kfunc_end_defs();
+
+BTF_KFUNCS_START(klp_bpf_kfunc_ids)
+BTF_ID_FLAGS(func, bpf_klp_seq_write)
+BTF_KFUNCS_END(klp_bpf_kfunc_ids)
+
+static const struct btf_kfunc_id_set klp_bpf_kfunc_set = {
+	.owner = THIS_MODULE,
+	.set   = &klp_bpf_kfunc_ids,
+};
+
+/* --- Livepatch replacement for cmdline_proc_show --- */
+
+static int livepatch_cmdline_proc_show(struct seq_file *m, void *v)
+{
+	struct klp_bpf_cmdline_ops *ops = READ_ONCE(active_ops);
+
+	if (ops && ops->set_cmdline)
+		return ops->set_cmdline(m);
+
+	seq_printf(m, "%s: no struct_ops attached\n", THIS_MODULE->name);
+	return 0;
+}
+
+static struct klp_func funcs[] = {
+	{
+		.old_name = "cmdline_proc_show",
+		.new_func = livepatch_cmdline_proc_show,
+	}, { }
+};
+
+static struct klp_object objs[] = {
+	{
+		/* name being NULL means vmlinux */
+		.funcs = funcs,
+	}, { }
+};
+
+static struct klp_patch patch = {
+	.mod = THIS_MODULE,
+	.objs = objs,
+};
+
+/* --- struct_ops registration --- */
+
+static int klp_bpf_cmdline_reg(void *kdata, struct bpf_link *link)
+{
+	struct klp_bpf_cmdline_ops *ops = kdata;
+
+	if (cmpxchg(&active_ops, NULL, ops))
+		return -EBUSY;
+
+	return 0;
+}
+
+static void klp_bpf_cmdline_unreg(void *kdata, struct bpf_link *link)
+{
+	WRITE_ONCE(active_ops, NULL);
+}
+
+static int klp_bpf_cmdline_init(struct btf *btf)
+{
+	return 0;
+}
+
+static int klp_bpf_cmdline_init_member(const struct btf_type *t,
+				       const struct btf_member *member,
+				       void *kdata, const void *udata)
+{
+	return 0;
+}
+
+static bool klp_bpf_cmdline_is_valid_access(int off, int size,
+					    enum bpf_access_type type,
+					    const struct bpf_prog *prog,
+					    struct bpf_insn_access_aux *info)
+{
+	return bpf_tracing_btf_ctx_access(off, size, type, prog, info);
+}
+
+static int klp_bpf_cmdline_btf_struct_access(struct bpf_verifier_log *log,
+					     const struct bpf_reg_state *reg,
+					     int off, int size)
+{
+	return -EACCES;
+}
+
+static const struct bpf_verifier_ops klp_bpf_cmdline_verifier_ops = {
+	.is_valid_access = klp_bpf_cmdline_is_valid_access,
+	.btf_struct_access = klp_bpf_cmdline_btf_struct_access,
+};
+
+/* CFI stubs */
+static int klp_bpf_cmdline__set_cmdline(struct seq_file *m)
+{
+	return 0;
+}
+
+static struct klp_bpf_cmdline_ops __bpf_klp_bpf_cmdline_ops = {
+	.set_cmdline = klp_bpf_cmdline__set_cmdline,
+};
+
+static struct bpf_struct_ops bpf_klp_bpf_cmdline_ops = {
+	.verifier_ops = &klp_bpf_cmdline_verifier_ops,
+	.init = klp_bpf_cmdline_init,
+	.init_member = klp_bpf_cmdline_init_member,
+	.reg = klp_bpf_cmdline_reg,
+	.unreg = klp_bpf_cmdline_unreg,
+	.cfi_stubs = &__bpf_klp_bpf_cmdline_ops,
+	.name = "klp_bpf_cmdline_ops",
+	.owner = THIS_MODULE,
+};
+
+/* --- Module init/exit --- */
+
+static int __init livepatch_bpf_init(void)
+{
+	int ret;
+
+	ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
+					&klp_bpf_kfunc_set);
+	ret = ret ?: register_bpf_struct_ops(&bpf_klp_bpf_cmdline_ops,
+					     klp_bpf_cmdline_ops);
+	if (ret)
+		return ret;
+
+	return klp_enable_patch(&patch);
+}
+
+static void __exit livepatch_bpf_exit(void)
+{
+}
+
+module_init(livepatch_bpf_init);
+module_exit(livepatch_bpf_exit);
+MODULE_LICENSE("GPL");
+MODULE_INFO(livepatch, "Y");
+MODULE_AUTHOR("Song Liu");
+MODULE_DESCRIPTION("Sample: BPF struct_ops + livepatch integration");
-- 
2.52.0


^ permalink raw reply related

* Re: [PATCH v2 0/6] kselftests: livepatch: Adapt tests to be executed on 4.12 kernels
From: Marcos Paulo de Souza @ 2026-04-15 12:37 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Josh Poimboeuf, Jiri Kosina, Petr Mladek, Joe Lawrence,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <alpine.LSU.2.21.2604151357350.1967@pobox.suse.cz>

On Wed, 2026-04-15 at 14:01 +0200, Miroslav Benes wrote:
> On Mon, 13 Apr 2026, Marcos Paulo de Souza wrote:
> 
> > A new version of the patchset, with fewer patches now. Please take
> > a look!
> > 
> > Original cover-letter:
> > These patches don't really change how the patches are run, just
> > skip
> > some tests on kernels that don't support a feature (like kprobe and
> > livepatched living together) or when a livepatch sysfs attribute is
> > missing.
> > 
> > The last patch slightly adjusts check_result function to skip dmesg
> > messages on SLE kernels when a livepatch is removed.
> > 
> > These patches are based on printk/for-next branch.
> > 
> > Please review! Thanks!
> > 
> > Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
> 
> Besides my comment for 1/6 and what Sashiko discovered, it looks good
> to 
> me.
> 
> However, please also take a look at brand new 
> test_modules/test_klp_mod_target.c. It does not build on old kernels
> since 
> they lack proc_create_single(). I think it should be covered in this
> patch 
> set too.

I saw that yesterday as well, but I wanted to merge this series first.
I have plans to create a way to unregister the livepatches if something
fails, so we can continue to run the other tests. I was planning to fix
the test_klp_mod_target.c in the same patchset.

> 
> Regards
> Miroslav

^ permalink raw reply

* Re: [PATCH v2 0/6] kselftests: livepatch: Adapt tests to be executed on 4.12 kernels
From: Miroslav Benes @ 2026-04-15 12:01 UTC (permalink / raw)
  To: Marcos Paulo de Souza
  Cc: Josh Poimboeuf, Jiri Kosina, Petr Mladek, Joe Lawrence,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <20260413-lp-tests-old-fixes-v2-0-367c7cb5006f@suse.com>

On Mon, 13 Apr 2026, Marcos Paulo de Souza wrote:

> A new version of the patchset, with fewer patches now. Please take a look!
> 
> Original cover-letter:
> These patches don't really change how the patches are run, just skip
> some tests on kernels that don't support a feature (like kprobe and
> livepatched living together) or when a livepatch sysfs attribute is
> missing.
> 
> The last patch slightly adjusts check_result function to skip dmesg
> messages on SLE kernels when a livepatch is removed.
> 
> These patches are based on printk/for-next branch.
> 
> Please review! Thanks!
> 
> Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>

Besides my comment for 1/6 and what Sashiko discovered, it looks good to 
me.

However, please also take a look at brand new 
test_modules/test_klp_mod_target.c. It does not build on old kernels since 
they lack proc_create_single(). I think it should be covered in this patch 
set too.

Regards
Miroslav

^ permalink raw reply

* Re: [PATCH v2 1/6] selftests: livepatch: Check for ARCH_HAS_SYSCALL_WRAPPER config
From: Miroslav Benes @ 2026-04-15  9:58 UTC (permalink / raw)
  To: Marcos Paulo de Souza
  Cc: Josh Poimboeuf, Jiri Kosina, Petr Mladek, Joe Lawrence,
	Shuah Khan, live-patching, linux-kselftest, linux-kernel
In-Reply-To: <20260413-lp-tests-old-fixes-v2-1-367c7cb5006f@suse.com>

On Mon, 13 Apr 2026, Marcos Paulo de Souza wrote:

> Older kernels that lack CONFIG_ARCH_HAS_SYSCALL_WRAPPER config don't
> have any prefixes for their syscalls. The same applies to current
> powerpc and loongarch, covering all currently supported architectures
> that support livepatch.
> 
> The other supported architectures have specific prefixes, so error out
> when a new architecture adds livepatch support with wrappes but didn't
> update the test to include it.
> 
> Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
> ---
>  .../selftests/livepatch/test_modules/test_klp_syscall.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c b/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> index dd802783ea84..b5527a288a7c 100644
> --- a/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> +++ b/tools/testing/selftests/livepatch/test_modules/test_klp_syscall.c
> @@ -12,15 +12,26 @@
>  #include <linux/slab.h>
>  #include <linux/livepatch.h>
>  
> -#if defined(__x86_64__)
> +/*
> + * Before CONFIG_ARCH_HAS_SYSCALL_WRAPPER was introduced there were no
> + * prefixes for system calls.
> + * Both ppc and loongarch does not set prefixes for their system calls either.
> + */
> +#if !defined(CONFIG_ARCH_HAS_SYSCALL_WRAPPER) ||  defined(__powerpc__) || \
> +	defined(__loongarch__)
> +#define FN_PREFIX
> +#elif defined(__x86_64__)
>  #define FN_PREFIX __x64_
>  #elif defined(__s390x__)
>  #define FN_PREFIX __s390x_
>  #elif defined(__aarch64__)
>  #define FN_PREFIX __arm64_
> -#else
> -/* powerpc does not select ARCH_HAS_SYSCALL_WRAPPER */
> +#elif defined(__powerpc__)
> +#define FN_PREFIX
> +#elif defined(__loongarch__)
>  #define FN_PREFIX
> +#else
> +#error "Missing syscall wrapper for the given architecture."
>  #endif

I know that Sashiko commented on that already but even with that I wonder 
if it was cleaner to structure it differently...

#if defined(CONFIG_ARCH_HAS_SYSCALL_WRAPPER)
  #if define(__x86_64__)
  ...
  #elif define(__powerpc__)
    #define FN_PREFIX
  #else
    #error
  #endif
#elif
  #define FN_PREFIX
#endif

?

I still hope that it will be sufficient and we do not have to introduce 
KERNEL_VERSION checks since wrappers changed/will change.

Miroslav

^ permalink raw reply

* Re: [RFC PATCH 1/2] kernel/notifier: replace single-linked list with double-linked list for reverse traversal
From: Christoph Hellwig @ 2026-04-15  7:40 UTC (permalink / raw)
  To: chensong_2000
  Cc: rafael, lenb, mturquette, sboyd, viresh.kumar, agk, snitzer,
	mpatocka, bmarzins, song, yukuai, linan122, jason.wessel, danielt,
	dianders, horms, davem, edumazet, kuba, pabeni, paulmck, frederic,
	mcgrof, petr.pavlu, da.gomez, samitolvanen, atomlin, jpoimboe,
	jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
	mark.rutland, mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <20260415070137.17860-1-chensong_2000@189.cn>

On Wed, Apr 15, 2026 at 03:01:37PM +0800, chensong_2000@189.cn wrote:
> diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
> index 132a9df98471..b776dbd5a382 100644
> --- a/drivers/acpi/sleep.c
> +++ b/drivers/acpi/sleep.c
> @@ -56,7 +56,6 @@ static int tts_notify_reboot(struct notifier_block *this,
>  
>  static struct notifier_block tts_notifier = {
>  	.notifier_call	= tts_notify_reboot,
> -	.next		= NULL,
>  	.priority	= 0,

IFF this becomes important for some reason (and right now I don't see
it), please start by using proper wrappers for notifiers so that the
implementation details don't leak into the users.  That would actually
be useful on it's own even.


^ permalink raw reply

* Re: [RFC PATCH 0/2] Decouple ftrace/livepatch from module loader via notifier priority and reverse traversal
From: Christoph Hellwig @ 2026-04-15  7:38 UTC (permalink / raw)
  To: chensong_2000
  Cc: rafael, lenb, mturquette, sboyd, viresh.kumar, agk, snitzer,
	mpatocka, bmarzins, song, yukuai, linan122, jason.wessel, danielt,
	dianders, horms, davem, edumazet, kuba, pabeni, paulmck, frederic,
	mcgrof, petr.pavlu, da.gomez, samitolvanen, atomlin, jpoimboe,
	jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
	mark.rutland, mathieu.desnoyers, linux-modules, linux-kernel,
	linux-trace-kernel, linux-acpi, linux-clk, linux-pm,
	live-patching, dm-devel, linux-raid, kgdb-bugreport, netdev
In-Reply-To: <20260413080140.180616-1-chensong_2000@189.cn>

On Mon, Apr 13, 2026 at 04:01:40PM +0800, chensong_2000@189.cn wrote:
> From: Song Chen <chensong_2000@189.cn>
> 
> This patchset addresses a long-standing tight coupling between the
> module loader and two of its key consumers: ftrace and livepatch.
> 
> Background:
> 
> The module loader currently hard-codes direct calls to
> ftrace_module_enable(), klp_module_coming(), klp_module_going() and
> ftrace_release_mod() inside prepare_coming_module() and the module
> unload path.

And that is bad why?

>  13 files changed, 290 insertions(+), 74 deletions(-)

This is a lot of new complex code touching a lot of places for no obvious
gain.  What is the reason for this series?  Does it prepare for something
else?


^ permalink raw reply

* [RFC PATCH 1/2] kernel/notifier: replace single-linked list with double-linked list for reverse traversal
From: chensong_2000 @ 2026-04-15  7:01 UTC (permalink / raw)
  To: rafael, lenb, mturquette, sboyd, viresh.kumar, agk, snitzer,
	mpatocka, bmarzins, song, yukuai, linan122, jason.wessel, danielt,
	dianders, horms, davem, edumazet, kuba, pabeni, paulmck, frederic,
	mcgrof, petr.pavlu, da.gomez, samitolvanen, atomlin, jpoimboe,
	jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
	mark.rutland, mathieu.desnoyers
  Cc: linux-modules, linux-kernel, linux-trace-kernel, linux-acpi,
	linux-clk, linux-pm, live-patching, dm-devel, linux-raid,
	kgdb-bugreport, netdev, Song Chen

From: Song Chen <chensong_2000@189.cn>

The current notifier chain implementation uses a single-linked list
(struct notifier_block *next), which only supports forward traversal
in priority order. This makes it difficult to handle cleanup/teardown
scenarios that require notifiers to be called in reverse priority order.

A concrete example is the ordering dependency between ftrace and
livepatch during module load/unload. see the detail here [1].

This patch replaces the single-linked list in struct notifier_block
with a struct list_head, converting the notifier chain into a
doubly-linked list sorted in descending priority order. Based on
this, a new function notifier_call_chain_reverse() is introduced,
which traverses the chain in reverse (ascending priority order).
The corresponding blocking_notifier_call_chain_reverse() is also
added as the locking wrapper for blocking notifier chains.

The internal notifier_call_chain_robust() is updated to use
notifier_call_chain_reverse() for rollback: on error, it records
the failing notifier (last_nb) and the count of successfully called
notifiers (nr), then rolls back exactly those nr-1 notifiers in
reverse order starting from last_nb's predecessor, without needing
to know the total length of the chain.

With this change, subsystems with symmetric setup/teardown ordering
requirements can register a single notifier_block with one priority
value, and rely on blocking_notifier_call_chain() for forward
traversal and blocking_notifier_call_chain_reverse() for reverse
traversal, without needing hard-coded call sequences or separate
notifier registrations for each direction.

[1]:https://lore.kernel.org/all
	/alpine.LNX.2.00.1602172216491.22700@cbobk.fhfr.pm/

Signed-off-by: Song Chen <chensong_2000@189.cn>
---
 drivers/acpi/sleep.c      |   1 -
 drivers/clk/clk.c         |   2 +-
 drivers/cpufreq/cpufreq.c |   2 +-
 drivers/md/dm-integrity.c |   1 -
 drivers/md/md.c           |   1 -
 include/linux/notifier.h  |  26 ++---
 kernel/debug/debug_core.c |   1 -
 kernel/notifier.c         | 219 ++++++++++++++++++++++++++++++++------
 net/ipv4/nexthop.c        |   2 +-
 9 files changed, 201 insertions(+), 54 deletions(-)

diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
index 132a9df98471..b776dbd5a382 100644
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -56,7 +56,6 @@ static int tts_notify_reboot(struct notifier_block *this,
 
 static struct notifier_block tts_notifier = {
 	.notifier_call	= tts_notify_reboot,
-	.next		= NULL,
 	.priority	= 0,
 };
 
diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 47093cda9df3..b6fe380d0468 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -4862,7 +4862,7 @@ int clk_notifier_unregister(struct clk *clk, struct notifier_block *nb)
 			clk->core->notifier_count--;
 
 			/* XXX the notifier code should handle this better */
-			if (!cn->notifier_head.head) {
+			if (list_empty(&cn->notifier_head.head)) {
 				srcu_cleanup_notifier_head(&cn->notifier_head);
 				list_del(&cn->node);
 				kfree(cn);
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 277884d91913..12637e742ffa 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -445,7 +445,7 @@ static void cpufreq_list_transition_notifiers(void)
 
 	mutex_lock(&cpufreq_transition_notifier_list.mutex);
 
-	for (nb = cpufreq_transition_notifier_list.head; nb; nb = nb->next)
+	list_for_each_entry(nb, &cpufreq_transition_notifier_list.head, entry)
 		pr_info("%pS\n", nb->notifier_call);
 
 	mutex_unlock(&cpufreq_transition_notifier_list.mutex);
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 06e805902151..ccdf75c40b62 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -3909,7 +3909,6 @@ static void dm_integrity_resume(struct dm_target *ti)
 	}
 
 	ic->reboot_notifier.notifier_call = dm_integrity_reboot;
-	ic->reboot_notifier.next = NULL;
 	ic->reboot_notifier.priority = INT_MAX - 1;	/* be notified after md and before hardware drivers */
 	WARN_ON(register_reboot_notifier(&ic->reboot_notifier));
 
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 3ce6f9e9d38e..8249e78636ab 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -10480,7 +10480,6 @@ static int md_notify_reboot(struct notifier_block *this,
 
 static struct notifier_block md_notifier = {
 	.notifier_call	= md_notify_reboot,
-	.next		= NULL,
 	.priority	= INT_MAX, /* before any real devices */
 };
 
diff --git a/include/linux/notifier.h b/include/linux/notifier.h
index 01b6c9d9956f..b2abbdfcaadd 100644
--- a/include/linux/notifier.h
+++ b/include/linux/notifier.h
@@ -53,41 +53,41 @@ typedef	int (*notifier_fn_t)(struct notifier_block *nb,
 
 struct notifier_block {
 	notifier_fn_t notifier_call;
-	struct notifier_block __rcu *next;
+	struct list_head __rcu entry;
 	int priority;
 };
 
 struct atomic_notifier_head {
 	spinlock_t lock;
-	struct notifier_block __rcu *head;
+	struct list_head __rcu head;
 };
 
 struct blocking_notifier_head {
 	struct rw_semaphore rwsem;
-	struct notifier_block __rcu *head;
+	struct list_head __rcu head;
 };
 
 struct raw_notifier_head {
-	struct notifier_block __rcu *head;
+	struct list_head __rcu head;
 };
 
 struct srcu_notifier_head {
 	struct mutex mutex;
 	struct srcu_usage srcuu;
 	struct srcu_struct srcu;
-	struct notifier_block __rcu *head;
+	struct list_head __rcu head;
 };
 
 #define ATOMIC_INIT_NOTIFIER_HEAD(name) do {	\
 		spin_lock_init(&(name)->lock);	\
-		(name)->head = NULL;		\
+		INIT_LIST_HEAD(&(name)->head);		\
 	} while (0)
 #define BLOCKING_INIT_NOTIFIER_HEAD(name) do {	\
 		init_rwsem(&(name)->rwsem);	\
-		(name)->head = NULL;		\
+		INIT_LIST_HEAD(&(name)->head);		\
 	} while (0)
 #define RAW_INIT_NOTIFIER_HEAD(name) do {	\
-		(name)->head = NULL;		\
+		INIT_LIST_HEAD(&(name)->head);		\
 	} while (0)
 
 /* srcu_notifier_heads must be cleaned up dynamically */
@@ -97,17 +97,17 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
 
 #define ATOMIC_NOTIFIER_INIT(name) {				\
 		.lock = __SPIN_LOCK_UNLOCKED(name.lock),	\
-		.head = NULL }
+		.head = LIST_HEAD_INIT((name).head) }
 #define BLOCKING_NOTIFIER_INIT(name) {				\
 		.rwsem = __RWSEM_INITIALIZER((name).rwsem),	\
-		.head = NULL }
+		.head = LIST_HEAD_INIT((name).head) }
 #define RAW_NOTIFIER_INIT(name)	{				\
-		.head = NULL }
+		.head = LIST_HEAD_INIT((name).head) }
 
 #define SRCU_NOTIFIER_INIT(name, pcpu)				\
 	{							\
 		.mutex = __MUTEX_INITIALIZER(name.mutex),	\
-		.head = NULL,					\
+		.head = LIST_HEAD_INIT((name).head),					\
 		.srcuu = __SRCU_USAGE_INIT(name.srcuu),		\
 		.srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu, 0), \
 	}
@@ -170,6 +170,8 @@ extern int atomic_notifier_call_chain(struct atomic_notifier_head *nh,
 		unsigned long val, void *v);
 extern int blocking_notifier_call_chain(struct blocking_notifier_head *nh,
 		unsigned long val, void *v);
+extern int blocking_notifier_call_chain_reverse(struct blocking_notifier_head *nh,
+		unsigned long val, void *v);
 extern int raw_notifier_call_chain(struct raw_notifier_head *nh,
 		unsigned long val, void *v);
 extern int srcu_notifier_call_chain(struct srcu_notifier_head *nh,
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 0b9495187fba..a26a7683d142 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -1054,7 +1054,6 @@ dbg_notify_reboot(struct notifier_block *this, unsigned long code, void *x)
 
 static struct notifier_block dbg_reboot_notifier = {
 	.notifier_call		= dbg_notify_reboot,
-	.next			= NULL,
 	.priority		= INT_MAX,
 };
 
diff --git a/kernel/notifier.c b/kernel/notifier.c
index 2f9fe7c30287..6f4d887771c4 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -14,39 +14,47 @@
  *	are layered on top of these, with appropriate locking added.
  */
 
-static int notifier_chain_register(struct notifier_block **nl,
+static int notifier_chain_register(struct list_head *nl,
 				   struct notifier_block *n,
 				   bool unique_priority)
 {
-	while ((*nl) != NULL) {
-		if (unlikely((*nl) == n)) {
+	struct notifier_block *cur;
+
+	list_for_each_entry(cur, nl, entry) {
+		if (unlikely(cur == n)) {
 			WARN(1, "notifier callback %ps already registered",
 			     n->notifier_call);
 			return -EEXIST;
 		}
-		if (n->priority > (*nl)->priority)
-			break;
-		if (n->priority == (*nl)->priority && unique_priority)
+
+		if (n->priority == cur->priority && unique_priority)
 			return -EBUSY;
-		nl = &((*nl)->next);
+
+		if (n->priority > cur->priority) {
+			list_add_tail(&n->entry, &cur->entry);
+			goto out;
+		}
 	}
-	n->next = *nl;
-	rcu_assign_pointer(*nl, n);
+
+	list_add_tail(&n->entry, nl);
+out:
 	trace_notifier_register((void *)n->notifier_call);
 	return 0;
 }
 
-static int notifier_chain_unregister(struct notifier_block **nl,
+static int notifier_chain_unregister(struct list_head *nl,
 		struct notifier_block *n)
 {
-	while ((*nl) != NULL) {
-		if ((*nl) == n) {
-			rcu_assign_pointer(*nl, n->next);
+	struct notifier_block *cur;
+
+	list_for_each_entry(cur, nl, entry) {
+		if (cur == n) {
+			list_del(&n->entry);
 			trace_notifier_unregister((void *)n->notifier_call);
 			return 0;
 		}
-		nl = &((*nl)->next);
 	}
+
 	return -ENOENT;
 }
 
@@ -59,25 +67,25 @@ static int notifier_chain_unregister(struct notifier_block **nl,
  *			value of this parameter is -1.
  *	@nr_calls:	Records the number of notifications sent. Don't care
  *			value of this field is NULL.
+ *	@last_nb:  Records the last called notifier block for rolling back
  *	Return:		notifier_call_chain returns the value returned by the
  *			last notifier function called.
  */
-static int notifier_call_chain(struct notifier_block **nl,
+static int notifier_call_chain(struct list_head *nl,
 			       unsigned long val, void *v,
-			       int nr_to_call, int *nr_calls)
+			       int nr_to_call, int *nr_calls,
+				   struct notifier_block **last_nb)
 {
 	int ret = NOTIFY_DONE;
-	struct notifier_block *nb, *next_nb;
-
-	nb = rcu_dereference_raw(*nl);
+	struct notifier_block *nb;
 
-	while (nb && nr_to_call) {
-		next_nb = rcu_dereference_raw(nb->next);
+	if (!nr_to_call)
+		return ret;
 
+	list_for_each_entry(nb, nl, entry) {
 #ifdef CONFIG_DEBUG_NOTIFIERS
 		if (unlikely(!func_ptr_is_kernel_text(nb->notifier_call))) {
 			WARN(1, "Invalid notifier called!");
-			nb = next_nb;
 			continue;
 		}
 #endif
@@ -87,15 +95,118 @@ static int notifier_call_chain(struct notifier_block **nl,
 		if (nr_calls)
 			(*nr_calls)++;
 
+		if (last_nb)
+			*last_nb = nb;
+
 		if (ret & NOTIFY_STOP_MASK)
 			break;
-		nb = next_nb;
-		nr_to_call--;
+
+		if (nr_to_call-- == 0)
+			break;
 	}
 	return ret;
 }
 NOKPROBE_SYMBOL(notifier_call_chain);
 
+/**
+ * notifier_call_chain_reverse - Informs the registered notifiers
+ *			about an event reversely.
+ *	@nl:		Pointer to head of the blocking notifier chain
+ *	@val:		Value passed unmodified to notifier function
+ *	@v:		Pointer passed unmodified to notifier function
+ *	@nr_to_call:	Number of notifier functions to be called. Don't care
+ *			value of this parameter is -1.
+ *	@nr_calls:	Records the number of notifications sent. Don't care
+ *			value of this field is NULL.
+ *	Return:		notifier_call_chain returns the value returned by the
+ *			last notifier function called.
+ */
+static int notifier_call_chain_reverse(struct list_head *nl,
+					struct notifier_block *start,
+					unsigned long val, void *v,
+					int nr_to_call, int *nr_calls)
+{
+	int ret = NOTIFY_DONE;
+	struct notifier_block *nb;
+	bool do_call = (start == NULL);
+
+	if (!nr_to_call)
+		return ret;
+
+	list_for_each_entry_reverse(nb, nl, entry) {
+		if (!do_call) {
+			if (nb == start)
+				do_call = true;
+			continue;
+		}
+#ifdef CONFIG_DEBUG_NOTIFIERS
+		if (unlikely(!func_ptr_is_kernel_text(nb->notifier_call))) {
+			WARN(1, "Invalid notifier called!");
+			continue;
+		}
+#endif
+		trace_notifier_run((void *)nb->notifier_call);
+		ret = nb->notifier_call(nb, val, v);
+
+		if (nr_calls)
+			(*nr_calls)++;
+
+		if (ret & NOTIFY_STOP_MASK)
+			break;
+
+		if (nr_to_call-- == 0)
+			break;
+	}
+	return ret;
+}
+NOKPROBE_SYMBOL(notifier_call_chain_reverse);
+
+/**
+ * notifier_call_chain_rcu - Informs the registered notifiers
+ *			about an event for srcu notifier chain.
+ *	@nl:		Pointer to head of the blocking notifier chain
+ *	@val:		Value passed unmodified to notifier function
+ *	@v:		Pointer passed unmodified to notifier function
+ *	@nr_to_call:	Number of notifier functions to be called. Don't care
+ *			value of this parameter is -1.
+ *	@nr_calls:	Records the number of notifications sent. Don't care
+ *			value of this field is NULL.
+ *	Return:		notifier_call_chain returns the value returned by the
+ *			last notifier function called.
+ */
+static int notifier_call_chain_rcu(struct list_head *nl,
+			       unsigned long val, void *v,
+			       int nr_to_call, int *nr_calls)
+{
+	int ret = NOTIFY_DONE;
+	struct notifier_block *nb;
+
+	if (!nr_to_call)
+		return ret;
+
+	list_for_each_entry_rcu(nb, nl, entry) {
+#ifdef CONFIG_DEBUG_NOTIFIERS
+		if (unlikely(!func_ptr_is_kernel_text(nb->notifier_call))) {
+			WARN(1, "Invalid notifier called!");
+			continue;
+		}
+#endif
+		trace_notifier_run((void *)nb->notifier_call);
+		ret = nb->notifier_call(nb, val, v);
+
+		if (nr_calls)
+			(*nr_calls)++;
+
+		if (ret & NOTIFY_STOP_MASK)
+			break;
+
+		if (nr_to_call-- == 0)
+			break;
+	}
+	return ret;
+}
+NOKPROBE_SYMBOL(notifier_call_chain_rcu);
+
 /**
  * notifier_call_chain_robust - Inform the registered notifiers about an event
  *                              and rollback on error.
@@ -111,15 +222,16 @@ NOKPROBE_SYMBOL(notifier_call_chain);
  *
  * Return:	the return value of the @val_up call.
  */
-static int notifier_call_chain_robust(struct notifier_block **nl,
+static int notifier_call_chain_robust(struct list_head *nl,
 				     unsigned long val_up, unsigned long val_down,
 				     void *v)
 {
 	int ret, nr = 0;
+	struct notifier_block *last_nb = NULL;
 
-	ret = notifier_call_chain(nl, val_up, v, -1, &nr);
+	ret = notifier_call_chain(nl, val_up, v, -1, &nr, &last_nb);
 	if (ret & NOTIFY_STOP_MASK)
-		notifier_call_chain(nl, val_down, v, nr-1, NULL);
+		notifier_call_chain_reverse(nl, last_nb, val_down, v, nr-1, NULL);
 
 	return ret;
 }
@@ -220,7 +332,7 @@ int atomic_notifier_call_chain(struct atomic_notifier_head *nh,
 	int ret;
 
 	rcu_read_lock();
-	ret = notifier_call_chain(&nh->head, val, v, -1, NULL);
+	ret = notifier_call_chain(&nh->head, val, v, -1, NULL, NULL);
 	rcu_read_unlock();
 
 	return ret;
@@ -238,7 +350,7 @@ NOKPROBE_SYMBOL(atomic_notifier_call_chain);
  */
 bool atomic_notifier_call_chain_is_empty(struct atomic_notifier_head *nh)
 {
-	return !rcu_access_pointer(nh->head);
+	return list_empty(&nh->head);
 }
 
 /*
@@ -340,7 +452,7 @@ int blocking_notifier_call_chain_robust(struct blocking_notifier_head *nh,
 	 * racy then it does not matter what the result of the test
 	 * is, we re-check the list after having taken the lock anyway:
 	 */
-	if (rcu_access_pointer(nh->head)) {
+	if (!list_empty(&nh->head)) {
 		down_read(&nh->rwsem);
 		ret = notifier_call_chain_robust(&nh->head, val_up, val_down, v);
 		up_read(&nh->rwsem);
@@ -375,15 +487,52 @@ int blocking_notifier_call_chain(struct blocking_notifier_head *nh,
 	 * racy then it does not matter what the result of the test
 	 * is, we re-check the list after having taken the lock anyway:
 	 */
-	if (rcu_access_pointer(nh->head)) {
+	if (!list_empty(&nh->head)) {
 		down_read(&nh->rwsem);
-		ret = notifier_call_chain(&nh->head, val, v, -1, NULL);
+		ret = notifier_call_chain(&nh->head, val, v, -1, NULL, NULL);
 		up_read(&nh->rwsem);
 	}
 	return ret;
 }
 EXPORT_SYMBOL_GPL(blocking_notifier_call_chain);
 
+/**
+ *	blocking_notifier_call_chain_reverse - Call functions reversely in
+ *				a blocking notifier chain
+ *	@nh: Pointer to head of the blocking notifier chain
+ *	@val: Value passed unmodified to notifier function
+ *	@v: Pointer passed unmodified to notifier function
+ *
+ *	Calls each function in a notifier chain in turn.  The functions
+ *	run in a process context, so they are allowed to block.
+ *
+ *	If the return value of the notifier can be and'ed
+ *	with %NOTIFY_STOP_MASK then blocking_notifier_call_chain()
+ *	will return immediately, with the return value of
+ *	the notifier function which halted execution.
+ *	Otherwise the return value is the return value
+ *	of the last notifier function called.
+ */
+
+int blocking_notifier_call_chain_reverse(struct blocking_notifier_head *nh,
+		unsigned long val, void *v)
+{
+	int ret = NOTIFY_DONE;
+
+	/*
+	 * We check the head outside the lock, but if this access is
+	 * racy then it does not matter what the result of the test
+	 * is, we re-check the list after having taken the lock anyway:
+	 */
+	if (!list_empty(&nh->head)) {
+		down_read(&nh->rwsem);
+		ret = notifier_call_chain_reverse(&nh->head, NULL, val, v, -1, NULL);
+		up_read(&nh->rwsem);
+	}
+	return ret;
+}
+EXPORT_SYMBOL_GPL(blocking_notifier_call_chain_reverse);
+
 /*
  *	Raw notifier chain routines.  There is no protection;
  *	the caller must provide it.  Use at your own risk!
@@ -450,7 +599,7 @@ EXPORT_SYMBOL_GPL(raw_notifier_call_chain_robust);
 int raw_notifier_call_chain(struct raw_notifier_head *nh,
 		unsigned long val, void *v)
 {
-	return notifier_call_chain(&nh->head, val, v, -1, NULL);
+	return notifier_call_chain(&nh->head, val, v, -1, NULL, NULL);
 }
 EXPORT_SYMBOL_GPL(raw_notifier_call_chain);
 
@@ -543,7 +692,7 @@ int srcu_notifier_call_chain(struct srcu_notifier_head *nh,
 	int idx;
 
 	idx = srcu_read_lock(&nh->srcu);
-	ret = notifier_call_chain(&nh->head, val, v, -1, NULL);
+	ret = notifier_call_chain_rcu(&nh->head, val, v, -1, NULL);
 	srcu_read_unlock(&nh->srcu, idx);
 	return ret;
 }
@@ -566,7 +715,7 @@ void srcu_init_notifier_head(struct srcu_notifier_head *nh)
 	mutex_init(&nh->mutex);
 	if (init_srcu_struct(&nh->srcu) < 0)
 		BUG();
-	nh->head = NULL;
+	INIT_LIST_HEAD(&nh->head);
 }
 EXPORT_SYMBOL_GPL(srcu_init_notifier_head);
 
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index c942f1282236..0afcba2967c7 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -90,7 +90,7 @@ static const struct nla_policy rtm_nh_res_bucket_policy_get[] = {
 
 static bool nexthop_notifiers_is_empty(struct net *net)
 {
-	return !net->nexthop.notifier_chain.head;
+	return list_empty(&net->nexthop.notifier_chain.head);
 }
 
 static void
-- 
2.43.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox