Live Patching
 help / color / mirror / Atom feed
* Re: [PATCH v5 8/8] unwind: arm64: Use sframe to unwind interrupt frames
From: Jens Remus @ 2026-05-05 15:52 UTC (permalink / raw)
  To: Mark Rutland, Indu Bhagat
  Cc: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
	Josh Poimboeuf, Peter Zijlstra, Steven Rostedt, Catalin Marinas,
	Jiri Kosina, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
	joe.lawrence, linux-toolchains, linux-kernel, live-patching,
	linux-arm-kernel, Randy Dunlap, Heiko Carstens
In-Reply-To: <afnGhsCYEUG0LXR5@J2N7QTR9R3>

On 5/5/2026 12:29 PM, Mark Rutland wrote:
> On Mon, May 04, 2026 at 10:47:26AM +0200, Jens Remus wrote:
>> On 5/1/2026 6:46 PM, Mark Rutland wrote:

>>> Thanks for putting this together. I think this is looking pretty good.
>>> However, there are some things that aren't quite right and need some
>>> work, which I've commented on below.
>>
>>> (2) To make unwinding generally possible, we'll need to annotate some
>>>     assembly functions as unwindable. We'll need to do that for string
>>>     routines under lib/, and probably some crypto code, but we don't
>>>     need to do that for most code in head.S, entry.S, etc.
>>>
>>>     The vast majority of relevant assembly functions are leaf functions
>>>     (where the return address is never moved out of the LR), so we can
>>>     probably get away with a simple annotation for those that avoids the
>>>     need for open-coded CFI directives everywhere.
>>
>> Wrapping them in .cfi_startproc ... .cfi_endproc should do.  For instance
>> by extending SYM_FUNC_START() and SYM_FUNC_END() or introducing flavors
>> that do.  Or where you thinking of something else?
> 
> I was expecting we'd do something like that, either with distinct
> versions, or some entirely separate annotation.
> 
> We can't override SYM_FUNC_START() or SYM_FUNC_END() since those are
> also used for non-leaf functions. The bulk of the work is going to be
> making sure we only annotate leaf functions specifically, which will
> require some human analysis.

Makes sense.

>>> On Tue, Apr 28, 2026 at 06:36:43PM +0000, Dylan Hatch wrote:
>>>> diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
>>
>>>> @@ -21,6 +21,8 @@ struct stack_info {
>>>>   *
>>>>   * @fp:          The fp value in the frame record (or the real fp)
>>>>   * @pc:          The lr value in the frame record (or the real lr)
>>>> + * @sp:          The sp value at the call site of the current function.
>>>> + * @unreliable:  Stacktrace is unreliable.
>>>>   *
>>>>   * @stack:       The stack currently being unwound.
>>>>   * @stacks:      An array of stacks which can be unwound.
>>>> @@ -29,7 +31,11 @@ struct stack_info {
>>>>  struct unwind_state {
>>>>  	unsigned long fp;
>>>>  	unsigned long pc;
>>>> +#ifdef CONFIG_HAVE_UNWIND_KERNEL_SFRAME
>>>> +	unsigned long sp;
>>>> +#endif
>>>
>>> As this is only used by the kernel unwinder (and not the hyp unwinder),
>>> this should live in struct kunwind_state in stacktrace.c.
>>>
>>> That said, for unwinding across exception boundaries we should not need
>>> this, as the SP value will be in the pt_regs. If we only use SFrame for
>>> the exception boundary case, we can remove this entirely. I would
>>> strongly prefer that we do that.
> 
>>>> +	/* Get the Canonical Frame Address (CFA) */
>>>> +	switch (frame.cfa.rule) {
>>>> +	case UNWIND_CFA_RULE_SP_OFFSET:
>>>> +		cfa = state->common.sp;
>>
>> IIUC you suggest this to be changed as follows?
>>
>> 		cfa = regs->regs[31];
> 
> I was suggesting:
> 
> 		cfa = regs->sp;
> 
> Note: arm64's struct pt_regs has:
> 
> 	union {
> 		struct user_pt_regs user_regs;
> 		struct {
> 			u64 regs[31];
> 			u64 sp;
> 			u64 pc;
> 			u64 pstate;
> 		};
> 	};	
> 
> ... so regs->regs[31] would be an out-of-bounds array access.

Aww, my bad!  Of course!

> 
> [...]
> 
>>>> +	case UNWIND_CFA_RULE_REG_OFFSET:
>>>> +	case UNWIND_CFA_RULE_REG_OFFSET_DEREF:
>>>> +		/* regs only available in topmost/interrupt frame */
>>>> +		if (!regs || frame.cfa.regnum > 30)
>>>> +			return -EINVAL;
>>>> +		cfa = regs->regs[frame.cfa.regnum];
>>>> +		break;
>>>
>>> Do we ever expect to see UNWIND_CFA_RULE_REG_OFFSET or
>>> UNWIND_CFA_RULE_REG_OFFSET_DEREF in practice for kernel code?
>>
>> No.  Those can only occur with SFrame V3 flexible FDE, which are
>> currently not generated by GNU assembler for arm64/aarch64, and thus
>> could be omitted in the arm64-specific kernel sframe unwinder:
>>
>> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-aarch64.h;hb=binutils-2_46#l342
> 
> Ok.
> 
> Do we know whether there are currently cases on aarch64 that cannot be
> encoded in SFrame (without flexible FDE), or whether SFrame without
> flexible FDE is sufficient for arm64 as-is? ... or do we have
> counter-examples today?

Not that I am aware of.  IIUC this is why Indu, the SFrame maintainer,
did not enable SFrame V3 flexible FDE for arm64/aarch64 in the GNU
assembler.

> Looking at:
> 
>   https://sourceware.org/binutils/docs/sframe-spec.html#Flexible-FDE-Type-Interpretation-1
> 
> For arm64 I'm not sure whether we'd encounter the DRAP or stack
> realignment cases within the kernel (perhaps with SVE?), nor whether the
> Register-based RA/FP Locations cases would apply if we assume that we
> continue to use frame records.

@Indu:  Can you provide more insight?

> [... ]
> 
>> I must admit that while reviewing I thought it would be future-proof to
>> have support for rules that can only be represented with SFrame V3
>> flexible FDE, even if they are currently not used on arm64.  Ideally
>> kunwind_next_sframe() could be made common, if another architecture
>> would implement kernel unwinding using sframe.
> 
> While I understand that principle, I think that for now it would be
> better to keep this arch-specific and minimal:
> 
> * We have arch-specific concerns (e.g. the FRAME_META_TYPE_FINAL
>   frames), and factoring that into generic code is going to be painful
>   to adapt (which we're likely to need to do in the near future), and to
>   maintain going forwards. Keeping that arch-specific for now will make
>   it easier/quicker to get to a stable state.
> 
> * Code which isn't used is liable to be wrong or made wrong by accident.
>   For example, with all the SP cases I mentioned in my initial reply.
> 
> We can certainly look at making that more generic in future, but for now
> I'd prefer to omit the code that cannot be used (and have some sort of
> build or boot/module-load time check that SFrame only has elements that
> we expect), and make sure that we thoroughly test the cases that do
> exist in practice.

Makes sense.

> Do we expect SFrame V3 flexible FDE to be generated by toolchains in the
> near future?

There is work in progress to implement SFrame V3 generation in LLVM.
Their implementation could be less restrictive regarding to which
SFrame V3 feature to enable on arm64/aarch64.  But even then for
SFrame V3 flexible FDE to be generated there must be DWARF CFI patterns
that can only be represented using those.  So if those do not exist
(see your previous question above), then they should not be generated.

@Indu:  What are your thoughts on this?

> [...]
> 
>>>> +	/* CFA alignment 8 bytes */
>>>> +	if (cfa & 0x7)
>>>> +		return -EINVAL;
>>>
>>> If the CFA is the SP upon entry to the function, then per AAPCS64 rules
>>> it should be aligned to 16 bytes. Otherwise, where has this 8 byte
>>> alignment requirement come from? Does SFrame mandate that?
>>
>> That originates from the common unwind user logic (see
>> kernel/unwind/user.c, unwind_user_next_common()), which currently
>> assumes 8-byte/4-byte SP alignment for all 64-bit/32-bit architectures.
>>
>> So checking for 16-byte alignment here would make sense.
> 
> Just to confirm, am I correct to understand that the SFrame definition
> of CFA is intended to be the same as the DWARF definition of CFA, and so

Correct.

> for arm64 the CFA is the SP when the function is called?

Correct.

> 
> That's the case for DWARF on arm64:
> 
>   https://github.com/ARM-software/abi-aa/releases/download/2025Q4/aadwarf64.pdf
>   https://github.com/ARM-software/abi-aa/blob/daa7a94ca55973736c0e434a67a6e4bbcd35d7fa/aadwarf64/aadwarf64.rst
> 
> | The CFA is the value of the stack pointer (sp) at the call site in the
> | previous frame.
> 
> I couldn't find an explciit statement to that effect in:
> 
>   https://sourceware.org/binutils/docs/sframe-spec.html
> 
> ... but I guess that is implied, given the other bits inherited from
> DWARF.

I assume so.

> 
> I see that the documented behaviour for CFA on AMD64 and s390x are
> consistent with their DWARF behaviour.

Yes.

> 
>>>> +
>>>> +	/* Get the Return Address (RA) */
>>>> +	switch (frame.ra.rule) {
>>>> +	case UNWIND_RULE_RETAIN:
>>>> +		/* regs only available in topmost/interrupt frame */
>>>> +		if (!regs)
>>>> +			return -EINVAL;
>>>> +		ra = regs->regs[30];
>>>> +		source = KUNWIND_SOURCE_REGS_LR;
>>>> +		break;
>>>> +	/* UNWIND_USER_RULE_CFA_OFFSET not implemented on purpose */
>>
>> Nit: s/UNWIND_USER_RULE_CFA_OFFSET/UNWIND_RULE_CFA_OFFSET/
>>
>>>
>>> It would be better for the comment to say *why* that's not implemented.
>>>
>>> I assume that's because UNWIND_USER_RULE_CFA_OFFSET would mean that the return
>>> address is a stack address, and that's obviously not legitimate.
>>
>> That and SFrame V3 currently cannot represent FP/RA as CFA + offset
>> (i.e. UNWIND_RULE_CFA_OFFSET; .cfi_val_offset FP/RA).
>>
>> The comment originates from the common unwind user logic (see
>> kernel/unwind/user.c).  I am open to improve that.  What about:
>>
>> 	/*
>> 	 * UNWIND_RULE_CFA_OFFSET not implemented on purpose, as a stack
>> 	 * address cannot be a legitimate return address value.  It is
>> 	 * also not used (e.g. not represented in sframe).
>> 	 */
> 
> I'd go with something simpler, e.g.
> 
> 	/*
> 	 * UNWIND_RULE_CFA_OFFSET doesn't make sense for RA.
> 	 * The return address cannot legitimately be a stack addres.
> 	 */

Thanks!  I have updated the comments accordingly in my latest unwind
user sframe patch series sent today:

[PATCH v14 14/19] unwind_user: Flexible FP/RA recovery rules
https://lore.kernel.org/all/20260505121718.3572346-15-jremus@linux.ibm.com/

[...]

>>>> +	default:
>>>> +		WARN_ON_ONCE(1);
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	/* Get the Frame Pointer (FP) */
>>>> +	switch (frame.fp.rule) {
>>>> +	case UNWIND_RULE_RETAIN:
>>>> +		fp = state->common.fp;
>>>> +		break;
>>>> +	/* UNWIND_USER_RULE_CFA_OFFSET not implemented on purpose */
>>>
>>> As for RA, the comment should explain why that's not implemented.
>>
>> I am open to improve the comment in the the common unwind user logic.
>> What about:
>>
>> 	/*
>> 	 * UNWIND_RULE_CFA_OFFSET not implemented on purpose, as it is
>> 	 * not used (e.g. not represented in sframe).
>> 	 */
> 
> For me, this wording raises more questions, e.g.
> 
> * Does 'not used' mean that toolchains don't use that, or that the spec
>   doesn't permit that?

unwind user currently only supports frame pointer, with SFrame to be
hopefully added soon.  Out of these only SFrame requires ("uses") these
elaborated rules.

> 
> * Does 'not represented' mean that this is not represntable, or that
>   toolchains currently don't generate SFrame with the appropriate
>   elements.

In DWARF CFI it is representable using .cfi_val_offset <FP>, <offset>.
But SFrame V3 currently cannot represent this:

"Note that, using a value of 0 as padding data word, does mean that
currently, e.g., for RA [JR: FP likewise], the rule RA = CFA + 0 cannot
be encoded.  NB: RA = CFA + 0 is distinct from RA = *(CFA + 0)."

https://sourceware.org/binutils/docs/sframe-spec.html#Flexible-FDE-Type-Interpretation-1

> 
> IIUC you're saying that this *is* representable with flexible FDE, but
> current toolchains don't generate that.

Thanks for the feedback!  I changed it as follows to clarify:

	/*
	 * UNWIND_USER_RULE_CFA_OFFSET is currently not used for FP
	 * (e.g. SFrame cannot represent this rule).
	 */

Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply

* Re: [PATCH v2 2/2] module/kallsyms: sort function symbols and use binary search
From: Petr Pavlu @ 2026-05-05 14:37 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Stanislaw Gruszka, linux-modules, Sami Tolvanen, Luis Chamberlain,
	linux-kernel, linux-trace-kernel, live-patching, Daniel Gomez,
	Aaron Tomlin, Steven Rostedt, Masami Hiramatsu, Jordan Rome,
	Viktor Malik, Miroslav Benes, Josh Poimboeuf, Joe Lawrence
In-Reply-To: <afnhidn7K7dZ_cPh@pathway.suse.cz>

On 5/5/26 2:24 PM, Petr Mladek wrote:
> On Fri 2026-03-27 12:00:05, Stanislaw Gruszka wrote:
>> Module symbol lookup via find_kallsyms_symbol() performs a linear scan
>> over the entire symtab when resolving an address. The number of symbols
>> in module symtabs has grown over the years, largely due to additional
>> metadata in non-standard sections, making this lookup very slow.
>>
>> Improve this by separating function symbols during module load, placing
>> them at the beginning of the symtab, sorting them by address, and using
>> binary search when resolving addresses in module text.
>>
>> This also should improve times for linear symbol name lookups, as valid
>> function symbols are now located at the beginning of the symtab.
>>
>> The cost of sorting is small relative to module load time. In repeated
>> module load tests [1], depending on .config options, this change
>> increases load time between 2% and 4%. With cold caches, the difference
>> is not measurable, as memory access latency dominates.
>>
>> The sorting theoretically could be done in compile time, but much more
>> complicated as we would have to simulate kernel addresses resolution
>> for symbols, and then correct relocation entries. That would be risky
>> if get out of sync.
>>
>> The improvement can be observed when listing ftrace filter functions.
>>
>> Before:
>>
>> root@nano:~# time cat /sys/kernel/tracing/available_filter_functions | wc -l
>> 74908
>>
>> real	0m1.315s
>> user	0m0.000s
>> sys	0m1.312s
>>
>> After:
>>
>> root@nano:~# time cat /sys/kernel/tracing/available_filter_functions | wc -l
>> 74911
>>
>> real	0m0.167s
>> user	0m0.004s
>> sys	0m0.175s
>>
>> (there are three more symbols introduced by the patch)
>>
>> For livepatch modules, the symtab layout is preserved and the existing
>> linear search is used. For this case, it should be possible to keep
>> the original ELF symtab instead of copying it 1:1, but that is outside
>> the scope of this patch.
> 
> What is the exact motivation for the special handling of livepatch modules,
> please?
> 
> Honestly, I am always a bit lost in the various symbol tables. It is
> possile that I have got something wrong.
> 
> Anyway, my understanding is that there are two aspects which are important
> for livepatches:
> 
> 1. Livepatches need to preserve special symbols which are used to
>    relocate symbols which were local in the original code, see
>    Documentation/livepatch/module-elf-format.rst
> 
>    IMHO, this is why layout_symtab() computes space for all core
>    symbols in livepatch modules and copies them in add_kallsyms().
> 
>    The symtab is normally released when the module is loaded.
>    But livepatch modules make its own copy of the important
>    parts, see copy_module_elf().
> 
>    IMHO, the sorting of function symbols vs other symbols does
>    not matter here. I believe that the special relocation
>    symbols are not affected by this.

I'm not sure if I fully follow the conclusion in this point. My
understanding is that .klp.rela sections still refer to their special
symbols in the symbol table via Elf_Rela::r_info. If the symbol table is
filtered or reordered, these references will end up pointing to
incorrect symbols.

This is also described in Documentation/livepatch/module-elf-format.rst,
section "4.1 A livepatch module's symbol table".

> 
> 
> 2. Livepatches _rely on the sorting_ of symbols in the module.
>    The special relocation symbols might define a symbol position,
>    see
> 
> 	.klp.sym.objname.symbol_name,sympos
> 
>    in the documentation. It defines the position of the symbol
>    when there are more symbols of the same name, see
>    klp_match_callback() in kernel/livepatch/core.c.
> 
>    I am afraid that _this patch might break_ this when it moves
>    function symbols before non-funciton ones. I guess that
>    the symbols of the same name will not longer be groupped.

I see. So if the module loader sorts the symbol table in a regular
module and a livepatch module exists for that module, the livepatch may
no longer function correctly. This means that the loader cannot even
reorder the symbol table in regular modules.

-- 
Thanks,
Petr

^ permalink raw reply

* Re: [PATCH v2 53/53] objtool/klp: Cache dont_correlate() result
From: Miroslav Benes @ 2026-05-05 13:40 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <b13cf9c9e942563b4a9b19494a83f4abf073b0c5.1777575753.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:41 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Cache the dont_correlate() result once per symbol at the start of
> correlate_symbols().  This reduces klp diff time on an arm64 LTO
> vmlinux.o from 2m51s to 35s.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 47/53] objtool/klp: Add correlation debugging output
From: Miroslav Benes @ 2026-05-05 13:28 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <492432266706503947483b831a17a1c118dc5007.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:35 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Add debugging messages to show how duplicate symbols get correlated, and
> split the --debug feature into --debug-correlate and --debug-clone.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 46/53] objtool/klp: Rewrite symbol correlation algorithm
From: Miroslav Benes @ 2026-05-05 13:07 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <27fcb5a17cc7b6821d8b1c4b9812ebb5b4ee6a5c.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:34 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> [...]
>   same-named candidates, similar to livepatch sympos.  Used for data
>   objects like __quirk variables where no deterministic filter can
>   distinguish the candidates.
> 
> Overall this works much better than the existing algorithm, particularly
> with LTO kernels.

Nice improvement which makes it much clearer.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 2/2] module/kallsyms: sort function symbols and use binary search
From: Petr Mladek @ 2026-05-05 12:24 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: linux-modules, Sami Tolvanen, Luis Chamberlain, Petr Pavlu,
	linux-kernel, linux-trace-kernel, live-patching, Daniel Gomez,
	Aaron Tomlin, Steven Rostedt, Masami Hiramatsu, Jordan Rome,
	Viktor Malik, Miroslav Benes, Josh Poimboeuf, Joe Lawrence
In-Reply-To: <20260327110005.16499-2-stf_xl@wp.pl>

On Fri 2026-03-27 12:00:05, Stanislaw Gruszka wrote:
> Module symbol lookup via find_kallsyms_symbol() performs a linear scan
> over the entire symtab when resolving an address. The number of symbols
> in module symtabs has grown over the years, largely due to additional
> metadata in non-standard sections, making this lookup very slow.
> 
> Improve this by separating function symbols during module load, placing
> them at the beginning of the symtab, sorting them by address, and using
> binary search when resolving addresses in module text.
> 
> This also should improve times for linear symbol name lookups, as valid
> function symbols are now located at the beginning of the symtab.
> 
> The cost of sorting is small relative to module load time. In repeated
> module load tests [1], depending on .config options, this change
> increases load time between 2% and 4%. With cold caches, the difference
> is not measurable, as memory access latency dominates.
> 
> The sorting theoretically could be done in compile time, but much more
> complicated as we would have to simulate kernel addresses resolution
> for symbols, and then correct relocation entries. That would be risky
> if get out of sync.
> 
> The improvement can be observed when listing ftrace filter functions.
> 
> Before:
> 
> root@nano:~# time cat /sys/kernel/tracing/available_filter_functions | wc -l
> 74908
> 
> real	0m1.315s
> user	0m0.000s
> sys	0m1.312s
> 
> After:
> 
> root@nano:~# time cat /sys/kernel/tracing/available_filter_functions | wc -l
> 74911
> 
> real	0m0.167s
> user	0m0.004s
> sys	0m0.175s
> 
> (there are three more symbols introduced by the patch)
> 
> For livepatch modules, the symtab layout is preserved and the existing
> linear search is used. For this case, it should be possible to keep
> the original ELF symtab instead of copying it 1:1, but that is outside
> the scope of this patch.

What is the exact motivation for the special handling of livepatch modules,
please?

Honestly, I am always a bit lost in the various symbol tables. It is
possile that I have got something wrong.

Anyway, my understanding is that there are two aspects which are important
for livepatches:

1. Livepatches need to preserve special symbols which are used to
   relocate symbols which were local in the original code, see
   Documentation/livepatch/module-elf-format.rst

   IMHO, this is why layout_symtab() computes space for all core
   symbols in livepatch modules and copies them in add_kallsyms().

   The symtab is normally released when the module is loaded.
   But livepatch modules make its own copy of the important
   parts, see copy_module_elf().

   IMHO, the sorting of function symbols vs other symbols does
   not matter here. I believe that the special relocation
   symbols are not affected by this.


2. Livepatches _rely on the sorting_ of symbols in the module.
   The special relocation symbols might define a symbol position,
   see

	.klp.sym.objname.symbol_name,sympos

   in the documentation. It defines the position of the symbol
   when there are more symbols of the same name, see
   klp_match_callback() in kernel/livepatch/core.c.

   I am afraid that _this patch might break_ this when it moves
   function symbols before non-funciton ones. I guess that
   the symbols of the same name will not longer be groupped.

Idea: Is the shufling really important for the performance, please?

   I would expect that binary search would have a good performance
   even without the shuffling. It puts aside half of the symbols in
   one cycle.


Note that the binary search in find_kallsyms_symbol() is perfectly
fine. The livepatch code explicitly iterates over all symbols using
module_kallsyms_on_each_symbol(), see klp_find_object_symbol().

Best Regards,
Petr

^ permalink raw reply

* Re: [PATCH v2 45/53] objtool/klp: Calculate object checksums
From: Miroslav Benes @ 2026-05-05 12:07 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <88ebcc903a3c534f2c7d35b95f62d845105d40af.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:33 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Start checksumming data objects in preparation for revamping the
> correlation algorithm.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 44/53] klp-build: Validate short-circuit prerequisites
From: Miroslav Benes @ 2026-05-05 12:00 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <ec40bb551a583c7a7c329199e41d21a0f086775c.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:32 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> The --short-circuit option implicitly requires that certain directories
> are already in klp-tmp.  Enforce that to prevent confusing errors.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 43/53] objtool/klp: Remove "objtool --checksum"
From: Miroslav Benes @ 2026-05-05 11:59 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <9bf05c10e47f711dc2f3b00d728b725618cec5d0.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:31 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> The checksum functionality has been moved to "objtool klp checksum"
> which is now used by klp-build.  Remove the now-dead --checksum and
> --debug-checksum options from the default objtool command.

With the Sashiko fixup you sent elsewhere

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 42/53] klp-build: Use "objtool klp checksum" subcommand
From: Miroslav Benes @ 2026-05-05 11:58 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <1c4f6e2a4e0a3490947111970f2d8e884afa2588.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:30 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Use the new "objtool klp checksum" subcommand instead of injecting
> --checksum into every objtool invocation via OBJTOOL_ARGS during the
> kernel build.
> 
> This decouples checksum generation from the build, running it in
> separate post-build passes, making the code (and the patch generation
> pipeline itself) more modular.
> 
> [...]

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 41/53] objtool/klp: Add "objtool klp checksum" subcommand
From: Miroslav Benes @ 2026-05-05 11:43 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <ce954af546c49aab9a02de2fa85fb6565ebd3aef.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:29 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Move the checksum functionality out of the main objtool command into a
> new "objtool klp checksum" subcommand.
> 
> This has the benefit of making the code (and the patch generation
> process itself) more modular.
> 
> For bisectability, both "objtool --checksum" and "objtool klp checksum"
> work for now.  The former will be removed after klp-build has been
> converted to use the new subcommand.
> 
> [...]

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 40/53] objtool: Consolidate file decoding into decode_file()
From: Miroslav Benes @ 2026-05-05 11:43 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <36478f25bee19c8acbff3992e27436b544c4e26b.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:28 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> decode_sections() relies on CFI and cfi_hash initialization done
> separately in check(), making it unusable outside of check().
> 
> Consolidate the initialization into decode_sections() and rename it to
> decode_file(), and make it global along with free_insns() and
> insn_reloc() for use by other objtool components -- namely, the checksum
> code which will be moving to another file.
> 
> [...]

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 39/53] objtool/klp: Extricate checksum calculation from validate_branch()
From: Miroslav Benes @ 2026-05-05 11:43 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <a455b47ef57dcc3506cd97e3c2027ac744941cf1.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:27 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> In preparation for porting the checksum code to other arches, make its
> functionality independent from the CFG reverse engineering code.
> 
> Move it into a standalone calculate_checksums() function which iterates
> all functions and instructions directly, rather than being called inline
> from do_validate_branch().
> 
> [...]

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 37/53] objtool: Add is_alias_sym() helper
From: Miroslav Benes @ 2026-05-05 11:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <87cfee570bfffb35961d9b6e5abfbfeae6d903dc.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:25 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Improve readability with a new is_alias_sym() helper.
> 
> No functional changes intended.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 35/53] objtool/klp: Create empty checksum sections for function-less object files
From: Miroslav Benes @ 2026-05-05 11:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <94267cbd67da8e158153743b80f4d2bfb3f62b4a.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:23 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> If an object file has no functions, objtool has nothing to checksum, so
> it doesn't create the .discard.sym_checksum symbol.
> 
> Then when 'objtool klp diff' reads symbol checksums, it errors out due
> to the missing .discard.sym_checksum section.
> 
> Instead, just create an empty checksum section to signal to
> read_sym_checksums() that the file has been processed.
> 
> [...]

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 34/53] objtool: Include libsubcmd headers directly from source tree
From: Miroslav Benes @ 2026-05-05 11:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <9582e7b77cb498a490bab6b93c1b330525f1a5c8.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:22 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Instead of installing libsubcmd headers to a build output directory and
> including from there, include directly from tools/lib/ where they
> already exist.  This fixes clangd indexing which otherwise can't find
> libsubcmd headers.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 33/53] objtool/klp: Don't set sym->file for section symbols
From: Miroslav Benes @ 2026-05-05 11:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <d44d6a8f8256c9d3896bf531050c969cb15f2661.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:21 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Section symbols aren't grouped after their corresponding FILE symbols.
> Their sym->file should really be NULL rather than whatever random FILE
> happened to be last.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 31/53] klp-build: Print "objtool klp diff" command in verbose mode
From: Miroslav Benes @ 2026-05-05 11:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <949a2ff797f2c7e366dedc760aea726e7cae1b11.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:19 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Print the full objtool command line when '--verbose' is given to help
> with debugging.

With the Sashiko fixup you sent elsewhere

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 30/53] klp-build: Reject patches to realmode
From: Miroslav Benes @ 2026-05-05 11:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <aeab7294c81cafe98ff363220fc2815f44c4032b.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:18 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Realmode code is compiled as a separate 16-bit binary and embedded into
> the kernel image via rmpiggy.S.  It can't be livepatched.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 29/53] klp-build: Reject patches to vDSO
From: Miroslav Benes @ 2026-05-05 11:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <a22e56b997cb56f2db1e154a847b354cf9b5b074.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:17 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> vDSO code runs in userspace and can't be livepatched.  Such patches also
> cause spurious "new function" errors due to generated files like
> vdso*-image.c having unstable line numbers across builds.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 28/53] klp-build: Fix patch cleanup on interrupt
From: Miroslav Benes @ 2026-05-05 11:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <cfc3c1d5096764d0cd79ef85e833b8ce098f0838.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:16 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> If a build error occurs and the user hits Ctrl-C while a large patch is
> being reverted during cleanup, the cleanup EXIT trap gets re-triggered
> and tries to re-revert the already partially-reverted patch.  That
> causes 'patch -R' to repeatedly prompt
> 
>   "Unreversed patch detected!  Ignore -R? [n]"
> 
> [...]

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 38/53] objtool: Add is_cold_func() helper
From: Miroslav Benes @ 2026-05-05 11:26 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <a84513224f38c7c7ca2cf2a4930f87d43a76908b.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:26 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Add an is_cold_func() helper.  No functional changes intended.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 36/53] objtool/klp: Handle Clang .data..Lanon anonymous data sections
From: Miroslav Benes @ 2026-05-05 11:26 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <a51923df411316860fd18a0db85fdf465d36ddb0.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:24 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> Clang generates anonymous data sections named .data..Lanon.<hash>.
> These need section-symbol references in the same way as .data..Lubsan
> (GCC) and .data..L__unnamed_ (Clang UBSAN) sections.  Without this,
> convert_reloc_sym() fails when processing relocations that reference
> these sections.
> 
> 
> [...]

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v2 32/53] klp-build: Remove redundant SRC and OBJ variables
From: Miroslav Benes @ 2026-05-05 11:26 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, live-patching, Peter Zijlstra, Joe Lawrence,
	Song Liu, Miroslav Benes, Petr Mladek
In-Reply-To: <63b0d7848597ad6011e1f56c8fdd53593d09a992.1777575752.git.jpoimboe@kernel.org>

On Thu, 30 Apr 2026 21:08:20 -0700, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> SRC and OBJ are both set to $(pwd) and are always identical.  The script
> already enforces that klp-build runs from the kernel root directory, and
> builds are done in-place, making these variables unnecessary.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

-- 
Miroslav


^ permalink raw reply

* Re: [PATCH v5 8/8] unwind: arm64: Use sframe to unwind interrupt frames
From: Mark Rutland @ 2026-05-05 10:29 UTC (permalink / raw)
  To: Jens Remus
  Cc: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
	Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
	Catalin Marinas, Jiri Kosina, Prasanna Kumar T S M,
	Puranjay Mohan, Song Liu, joe.lawrence, linux-toolchains,
	linux-kernel, live-patching, linux-arm-kernel, Randy Dunlap,
	Heiko Carstens
In-Reply-To: <bc3fb59b-9d80-4957-af51-20db38e3487e@linux.ibm.com>

On Mon, May 04, 2026 at 10:47:26AM +0200, Jens Remus wrote:
> Hello Mark,
> 
> I mostly have comments regarding your the SFrame related remarks.

Thanks for this; I have a few more questions and comments below.

> On 5/1/2026 6:46 PM, Mark Rutland wrote:
> > Thanks for putting this together. I think this is looking pretty good.
> > However, there are some things that aren't quite right and need some
> > work, which I've commented on below.
> 
> > (2) To make unwinding generally possible, we'll need to annotate some
> >     assembly functions as unwindable. We'll need to do that for string
> >     routines under lib/, and probably some crypto code, but we don't
> >     need to do that for most code in head.S, entry.S, etc.
> > 
> >     The vast majority of relevant assembly functions are leaf functions
> >     (where the return address is never moved out of the LR), so we can
> >     probably get away with a simple annotation for those that avoids the
> >     need for open-coded CFI directives everywhere.
> 
> Wrapping them in .cfi_startproc ... .cfi_endproc should do.  For instance
> by extending SYM_FUNC_START() and SYM_FUNC_END() or introducing flavors
> that do.  Or where you thinking of something else?

I was expecting we'd do something like that, either with distinct
versions, or some entirely separate annotation.

We can't override SYM_FUNC_START() or SYM_FUNC_END() since those are
also used for non-leaf functions. The bulk of the work is going to be
making sure we only annotate leaf functions specifically, which will
require some human analysis.

> > On Tue, Apr 28, 2026 at 06:36:43PM +0000, Dylan Hatch wrote:
> >> diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
> 
> >> @@ -21,6 +21,8 @@ struct stack_info {
> >>   *
> >>   * @fp:          The fp value in the frame record (or the real fp)
> >>   * @pc:          The lr value in the frame record (or the real lr)
> >> + * @sp:          The sp value at the call site of the current function.
> >> + * @unreliable:  Stacktrace is unreliable.
> >>   *
> >>   * @stack:       The stack currently being unwound.
> >>   * @stacks:      An array of stacks which can be unwound.
> >> @@ -29,7 +31,11 @@ struct stack_info {
> >>  struct unwind_state {
> >>  	unsigned long fp;
> >>  	unsigned long pc;
> >> +#ifdef CONFIG_HAVE_UNWIND_KERNEL_SFRAME
> >> +	unsigned long sp;
> >> +#endif
> > 
> > As this is only used by the kernel unwinder (and not the hyp unwinder),
> > this should live in struct kunwind_state in stacktrace.c.
> > 
> > That said, for unwinding across exception boundaries we should not need
> > this, as the SP value will be in the pt_regs. If we only use SFrame for
> > the exception boundary case, we can remove this entirely. I would
> > strongly prefer that we do that.

> >> +	/* Get the Canonical Frame Address (CFA) */
> >> +	switch (frame.cfa.rule) {
> >> +	case UNWIND_CFA_RULE_SP_OFFSET:
> >> +		cfa = state->common.sp;
> 
> IIUC you suggest this to be changed as follows?
> 
> 		cfa = regs->regs[31];

I was suggesting:

		cfa = regs->sp;

Note: arm64's struct pt_regs has:

	union {
		struct user_pt_regs user_regs;
		struct {
			u64 regs[31];
			u64 sp;
			u64 pc;
			u64 pstate;
		};
	};	

... so regs->regs[31] would be an out-of-bounds array access.

[...]

> >> +	case UNWIND_CFA_RULE_REG_OFFSET:
> >> +	case UNWIND_CFA_RULE_REG_OFFSET_DEREF:
> >> +		/* regs only available in topmost/interrupt frame */
> >> +		if (!regs || frame.cfa.regnum > 30)
> >> +			return -EINVAL;
> >> +		cfa = regs->regs[frame.cfa.regnum];
> >> +		break;
> > 
> > Do we ever expect to see UNWIND_CFA_RULE_REG_OFFSET or
> > UNWIND_CFA_RULE_REG_OFFSET_DEREF in practice for kernel code?
> 
> No.  Those can only occur with SFrame V3 flexible FDE, which are
> currently not generated by GNU assembler for arm64/aarch64, and thus
> could be omitted in the arm64-specific kernel sframe unwinder:
> 
> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-aarch64.h;hb=binutils-2_46#l342

Ok.

Do we know whether there are currently cases on aarch64 that cannot be
encoded in SFrame (without flexible FDE), or whether SFrame without
flexible FDE is sufficient for arm64 as-is? ... or do we have
counter-examples today?

Looking at:

  https://sourceware.org/binutils/docs/sframe-spec.html#Flexible-FDE-Type-Interpretation-1

For arm64 I'm not sure whether we'd encounter the DRAP or stack
realignment cases within the kernel (perhaps with SVE?), nor whether the
Register-based RA/FP Locations cases would apply if we assume that we
continue to use frame records.

[... ]

> I must admit that while reviewing I thought it would be future-proof to
> have support for rules that can only be represented with SFrame V3
> flexible FDE, even if they are currently not used on arm64.  Ideally
> kunwind_next_sframe() could be made common, if another architecture
> would implement kernel unwinding using sframe.

While I understand that principle, I think that for now it would be
better to keep this arch-specific and minimal:

* We have arch-specific concerns (e.g. the FRAME_META_TYPE_FINAL
  frames), and factoring that into generic code is going to be painful
  to adapt (which we're likely to need to do in the near future), and to
  maintain going forwards. Keeping that arch-specific for now will make
  it easier/quicker to get to a stable state.

* Code which isn't used is liable to be wrong or made wrong by accident.
  For example, with all the SP cases I mentioned in my initial reply.

We can certainly look at making that more generic in future, but for now
I'd prefer to omit the code that cannot be used (and have some sort of
build or boot/module-load time check that SFrame only has elements that
we expect), and make sure that we thoroughly test the cases that do
exist in practice.

Do we expect SFrame V3 flexible FDE to be generated by toolchains in the
near future?

[...]

> >> +	/* CFA alignment 8 bytes */
> >> +	if (cfa & 0x7)
> >> +		return -EINVAL;
> > 
> > If the CFA is the SP upon entry to the function, then per AAPCS64 rules
> > it should be aligned to 16 bytes. Otherwise, where has this 8 byte
> > alignment requirement come from? Does SFrame mandate that?
> 
> That originates from the common unwind user logic (see
> kernel/unwind/user.c, unwind_user_next_common()), which currently
> assumes 8-byte/4-byte SP alignment for all 64-bit/32-bit architectures.
> 
> So checking for 16-byte alignment here would make sense.

Just to confirm, am I correct to understand that the SFrame definition
of CFA is intended to be the same as the DWARF definition of CFA, and so
for arm64 the CFA is the SP when the function is called?

That's the case for DWARF on arm64:

  https://github.com/ARM-software/abi-aa/releases/download/2025Q4/aadwarf64.pdf
  https://github.com/ARM-software/abi-aa/blob/daa7a94ca55973736c0e434a67a6e4bbcd35d7fa/aadwarf64/aadwarf64.rst

| The CFA is the value of the stack pointer (sp) at the call site in the
| previous frame.

I couldn't find an explciit statement to that effect in:

  https://sourceware.org/binutils/docs/sframe-spec.html

... but I guess that is implied, given the other bits inherited from
DWARF.

I see that the documented behaviour for CFA on AMD64 and s390x are
consistent with their DWARF behaviour.

> >> +
> >> +	/* Get the Return Address (RA) */
> >> +	switch (frame.ra.rule) {
> >> +	case UNWIND_RULE_RETAIN:
> >> +		/* regs only available in topmost/interrupt frame */
> >> +		if (!regs)
> >> +			return -EINVAL;
> >> +		ra = regs->regs[30];
> >> +		source = KUNWIND_SOURCE_REGS_LR;
> >> +		break;
> >> +	/* UNWIND_USER_RULE_CFA_OFFSET not implemented on purpose */
> 
> Nit: s/UNWIND_USER_RULE_CFA_OFFSET/UNWIND_RULE_CFA_OFFSET/
> 
> > 
> > It would be better for the comment to say *why* that's not implemented.
> > 
> > I assume that's because UNWIND_USER_RULE_CFA_OFFSET would mean that the return
> > address is a stack address, and that's obviously not legitimate.
> 
> That and SFrame V3 currently cannot represent FP/RA as CFA + offset
> (i.e. UNWIND_RULE_CFA_OFFSET; .cfi_val_offset FP/RA).
> 
> The comment originates from the common unwind user logic (see
> kernel/unwind/user.c).  I am open to improve that.  What about:
> 
> 	/*
> 	 * UNWIND_RULE_CFA_OFFSET not implemented on purpose, as a stack
> 	 * address cannot be a legitimate return address value.  It is
> 	 * also not used (e.g. not represented in sframe).
> 	 */

I'd go with something simpler, e.g.

	/*
	 * UNWIND_RULE_CFA_OFFSET doesn't make sense for RA.
	 * The return address cannot legitimately be a stack addres.
	 */

[...]

> > 
> > I don't think we expect UNWIND_RULE_REG_OFFSET unless that's sometimes used
> > instead of UNWIND_RULE_RETAIN to express that the return address is in x30
> > (with zero offset).
> 
> No.  Unless there would be nonsense .cfi_register 30, 30, which would
> require SFrame V3 flexible FDE to be represented.

Ok.

> @Indu:  We may consider to treat .cfi_register <reg>, <reg> (for FP/RA)
> like .cfi_restore <reg> in the GNU assembler?
> 
> > Similarly, if the address is on the stack it should be in a frame
> > record. Would we ever expect UNWIND_RULE_REG_OFFSET_DEREF rather than
> > UNWIND_RULE_CFA_OFFSET_DEREF?
> 
> No.  See above (SFrame V3 flexible FDE).

Ok.

> >> +	default:
> >> +		WARN_ON_ONCE(1);
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	/* Get the Frame Pointer (FP) */
> >> +	switch (frame.fp.rule) {
> >> +	case UNWIND_RULE_RETAIN:
> >> +		fp = state->common.fp;
> >> +		break;
> >> +	/* UNWIND_USER_RULE_CFA_OFFSET not implemented on purpose */
> > 
> > As for RA, the comment should explain why that's not implemented.
> 
> I am open to improve the comment in the the common unwind user logic.
> What about:
> 
> 	/*
> 	 * UNWIND_RULE_CFA_OFFSET not implemented on purpose, as it is
> 	 * not used (e.g. not represented in sframe).
> 	 */

For me, this wording raises more questions, e.g.

* Does 'not used' mean that toolchains don't use that, or that the spec
  doesn't permit that?

* Does 'not represented' mean that this is not represntable, or that
  toolchains currently don't generate SFrame with the appropriate
  elements.

IIUC you're saying that this *is* representable with flexible FDE, but
current toolchains don't generate that.

Mark.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox