All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Remus <jremus@linux.ibm.com>
To: Steven Rostedt <rostedt@kernel.org>,
	Josh Poimboeuf <jpoimboe@kernel.org>
Cc: bpf@vger.kernel.org, sashiko@lists.linux.dev,
	Indu Bhagat <ibhagatgnu@gmail.com>
Subject: Re: [PATCH v14 03/19] unwind_user/sframe: Store .sframe section data in per-mm maple tree
Date: Wed, 6 May 2026 15:50:45 +0200	[thread overview]
Message-ID: <49318ed5-8668-43fe-880d-b91bd7c3a7a9@linux.ibm.com> (raw)
In-Reply-To: <20260505185158.39F35C2BCB4@smtp.kernel.org>

On 5/5/2026 8:51 PM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:
> - [Critical] Concurrency Use-After-Free and Double Free in sframe_remove_section
> - [High] Unbounded memory leak of sframe_section allocations leading to Denial of Service (OOM)
> - [High] Missing MT_FLAGS_USE_RCU initialization breaks lockless NMI unwinding
> - [Medium] Child processes silently lose .sframe unwind data on fork()
> - [Medium] Inclusive bound semantics in maple tree cause 1-byte overlap rejections for contiguous sections
> - [Low] Zero-address validation falsely rejects valid memory mappings
> --
> 
> commit 3b41045a6b38a262cb32acf9ed118e4d667448dd
> Author: Josh Poimboeuf <jpoimboe@kernel.org>
> 
> unwind_user/sframe: Store .sframe section data in per-mm maple tree
> 
> Associate an .sframe section with its mm by adding it to a per-mm maple
> tree which is indexed by the corresponding text address range. A single
> .sframe section can be associated with multiple text ranges.
> 
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index f1ad69c6dc2d4..0d92fb1c61a99 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
> [ ... ]
>> @@ -1072,6 +1074,13 @@ static void mmap_init_lock(struct mm_struct *mm)
>>  #endif
>>  }
>>  
>> +static void mm_init_sframe(struct mm_struct *mm)
>> +{
>> +#ifdef CONFIG_HAVE_UNWIND_USER_SFRAME
>> +	mt_init(&mm->sframe_mt);
>> +#endif
>> +}
>> +
>>  static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
> 
> mt_init() initializes the maple tree with flags set to 0. Since stack
> unwinding typically occurs in profiling interrupts (NMIs) which must
> traverse the maple tree locklessly via RCU, does this missing
> MT_FLAGS_USE_RCU flag allow concurrent modifications to perform in-place
> updates instead of RCU-safe copy-on-write allocations? This could result in
> the NMI unwinder observing corrupted tree states.

Please advise.

> [ ... ]
>> @@ -1100,6 +1109,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
>>  	mm->pmd_huge_pte = NULL;
>>  #endif
>>  	mm_init_uprobes_state(mm);
>> +	mm_init_sframe(mm);
>>  	hugetlb_count_init(mm);
>>  
>>  	mm_flags_clear_all(mm);
> 
> When a process forks, this initializes an empty sframe_mt for the child's
> mm_struct. While dup_mmap() duplicates the parent's VMAs, it doesn't appear
> that the registered .sframe entries are propagated to the child's maple tree.
> Will child processes silently lose their .sframe unwind data for shared
> libraries registered prior to the fork?

Does this ring any bell, Steven?  I ran into it with my signal frame
stacktracing test on s390.  I think we discussed this during one of the
SFrame calls and it got lost.  

>> diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c
>> index c128c1eeaca20..fdb2a49085638 100644
>> --- a/kernel/unwind/sframe.c
>> +++ b/kernel/unwind/sframe.c
> [ ... ]
>> @@ -81,6 +81,7 @@ static int sframe_read_header(struct sframe_section *sec)
>>  int sframe_add_section(unsigned long sframe_start, unsigned long sframe_end,
>>  		       unsigned long text_start, unsigned long text_end)
>>  {
>> +	struct maple_tree *sframe_mt = &current->mm->sframe_mt;
>>  	struct vm_area_struct *sframe_vma, *text_vma;
>>  	struct mm_struct *mm = current->mm;
>>  	struct sframe_section *sec;
>> @@ -88,10 +89,12 @@ int sframe_add_section(unsigned long sframe_start, unsigned long sframe_end,
>>  	if (!sframe_start || !sframe_end || !text_start || !text_end) {
> 
> Virtual addresses can legitimately be zero, such as when mmap_min_addr
> permits it. Does checking !text_start falsely reject valid zero-address
> VMAs? Checking the range bounds explicitly might be safer here.

Addressed in previous patch.

> [ ... ]
>>  		}
>>  	}
>>  
>>  	sec = kzalloc(sizeof(*sec), GFP_KERNEL);
> 
> This allocates the section with GFP_KERNEL instead of GFP_KERNEL_ACCOUNT,
> which bypasses memory cgroup limits.
> 
> Furthermore, is there a mechanism to clean up these allocated sections when
> the corresponding VMA is torn down (e.g., via munmap)? Without cleanup hooks
> in the VMA teardown paths, it looks like an application could repeatedly mmap
> an executable region, register it here, and munmap it, leaking kernel memory
> and maple tree nodes indefinitely.

Isn't that taken care of by the change in __mmdrop() to call
sframe_free_mm(mm), which is part of this patch?

> [ ... ]
>> -	/* TODO nowhere to store it yet - just free it and return an error */
>> -	ret = -ENOSYS;
>> +	ret = mtree_insert_range(sframe_mt, sec->text_start, sec->text_end, sec, GFP_KERNEL);

	ret = mtree_insert_range(sframe_mt, sec->text_start, sec->text_end - 1,
				 sec, GFP_KERNEL);

> The Maple Tree API stores ranges inclusively as [first, last]. Since VMA
> boundaries are exclusive, does passing sec->text_end directly cause the
> maple tree range to incorrectly cover one extra byte?
> 
> If an application registers two perfectly contiguous .sframe sections, the
> insertion of the second section could fail with -EEXIST because the first
> section erroneously claims the first byte of the second section. Using
> sec->text_end - 1 might prevent this overlap.
> 
>> +	if (ret) {
>> +		dbg("mtree_insert_range failed: text=%lx-%lx\n",
>> +		    sec->text_start, sec->text_end);
>> +		goto err_free;
>> +	}
>> +
>> +	return 0;
>>  
>>  err_free:
>>  	free_section(sec);
>>  	return ret;
>>  }
>>  
>> +static int __sframe_remove_section(struct mm_struct *mm,
>> +				   struct sframe_section *sec)
>> +{
>> +	if (!mtree_erase(&mm->sframe_mt, sec->text_start)) {
>> +		dbg("mtree_erase failed: text=%lx\n", sec->text_start);
>> +		return -EINVAL;
>> +	}
>> +
>> +	free_section(sec);
>> +
>> +	return 0;
>> +}
>> +
>>  int sframe_remove_section(unsigned long sframe_start)
>>  {
>> -	return -ENOSYS;
>> +	struct mm_struct *mm = current->mm;
>> +	struct sframe_section *sec;
>> +	unsigned long index = 0;
>> +	bool found = false;
>> +	int ret = 0;
>> +
>> +	mt_for_each(&mm->sframe_mt, sec, index, ULONG_MAX) {
>> +		if (sec->sframe_start == sframe_start) {
> 
> Can concurrent calls to sframe_remove_section() cause a use-after-free and
> subsequent double free?
> 
> mt_for_each() locklessly iterates mm->sframe_mt, and internally acquires and
> drops the RCU read lock, meaning the returned sec pointer has no lifetime
> protection in the loop body.
> 
> If two threads concurrently invoke sframe_remove_section(), both could
> receive the exact same sec pointer from the tree. Thread A could then call
> __sframe_remove_section(), erasing the entry and freeing sec via
> free_section(). Thread B would then evaluate the if statement using the
> freed sec pointer, causing a use-after-free read, and potentially proceeding
> to free it again.

Please advise.

> Additionally, does free_section() need to use kfree_rcu() instead of
> kfree() to prevent lockless NMI unwinders from hitting a use-after-free
> when reading payloads?

Please advise.

Thanks and regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Ehningen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


  reply	other threads:[~2026-05-06 13:50 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-05 12:16 [PATCH v14 00/19] unwind_deferred: Implement sframe handling Jens Remus
2026-05-05 12:17 ` [PATCH v14 01/19] unwind_user: Add generic and arch-specific headers to MAINTAINERS Jens Remus
2026-05-05 12:17 ` [PATCH v14 02/19] unwind_user/sframe: Add support for reading .sframe headers Jens Remus
2026-05-05 12:49   ` sashiko-bot
2026-05-06 13:42     ` Jens Remus
2026-05-07 14:55       ` Jens Remus
2026-05-08 23:02       ` Indu Bhagat
2026-05-11 10:05         ` Jens Remus
2026-05-05 12:17 ` [PATCH v14 03/19] unwind_user/sframe: Store .sframe section data in per-mm maple tree Jens Remus
2026-05-05 18:51   ` sashiko-bot
2026-05-06 13:50     ` Jens Remus [this message]
2026-05-06 15:21       ` Steven Rostedt
2026-05-12 15:52         ` Jens Remus
2026-05-05 12:17 ` [PATCH v14 04/19] x86/uaccess: Add unsafe_copy_from_user() implementation Jens Remus
2026-05-05 18:22   ` sashiko-bot
2026-05-06 14:13     ` Jens Remus
2026-05-06 15:05       ` Steven Rostedt
2026-05-06 14:09   ` Jens Remus
2026-05-06 15:03     ` Steven Rostedt
2026-05-06 21:13     ` David Laight
2026-05-06 21:17       ` David Laight
2026-05-05 12:17 ` [PATCH v14 05/19] unwind_user/sframe: Add support for reading .sframe contents Jens Remus
2026-05-05 18:59   ` sashiko-bot
2026-05-06 14:34     ` Jens Remus
2026-05-06 15:01       ` Steven Rostedt
2026-05-06 15:29         ` Jens Remus
2026-05-08  9:49         ` Jens Remus
2026-05-08 23:04           ` Indu Bhagat
2026-05-12 13:35         ` Jens Remus
2026-05-13 12:22           ` Steven Rostedt
2026-05-08 23:03       ` Indu Bhagat
2026-05-08 10:50   ` Jens Remus
2026-05-11 16:16   ` Jens Remus
2026-05-05 12:17 ` [PATCH v14 06/19] unwind_user/sframe: Detect .sframe sections in executables Jens Remus
2026-05-05 12:53   ` sashiko-bot
2026-05-06 14:56     ` Jens Remus
2026-05-06 15:36       ` Steven Rostedt
2026-05-08 23:05         ` Indu Bhagat
2026-05-05 12:17 ` [PATCH v14 07/19] unwind_user/sframe: Wire up unwind_user to sframe Jens Remus
2026-05-05 18:55   ` sashiko-bot
2026-05-07 16:18     ` Jens Remus
2026-05-08 23:07       ` Indu Bhagat
2026-05-11 16:46         ` Steven Rostedt
2026-05-05 12:17 ` [PATCH v14 08/19] unwind_user: Stop when reaching an outermost frame Jens Remus
2026-05-05 12:40   ` sashiko-bot
2026-05-06 15:01     ` Jens Remus
2026-05-06 15:40       ` Steven Rostedt
2026-05-05 12:17 ` [PATCH v14 09/19] unwind_user/sframe: Add support for outermost frame indication Jens Remus
2026-05-05 12:17 ` [PATCH v14 10/19] unwind_user/sframe: Remove .sframe section on detected corruption Jens Remus
2026-05-05 20:39   ` sashiko-bot
2026-05-07 16:23     ` Jens Remus
2026-05-05 12:17 ` [PATCH v14 11/19] unwind_user/sframe: Show file name in debug output Jens Remus
2026-05-05 18:46   ` sashiko-bot
2026-05-12 14:52     ` Jens Remus
2026-05-13  9:20       ` Jens Remus
2026-05-05 12:17 ` [PATCH v14 12/19] unwind_user/sframe: Add .sframe validation option Jens Remus
2026-05-05 18:32   ` sashiko-bot
2026-05-12 14:23     ` Jens Remus
2026-05-13 12:30       ` Steven Rostedt
2026-05-08 10:51   ` Jens Remus
2026-05-05 12:17 ` [PATCH v14 13/19] unwind_user: Enable archs that pass RA in a register Jens Remus
2026-05-05 18:35   ` sashiko-bot
2026-05-05 12:17 ` [PATCH v14 14/19] unwind_user: Flexible FP/RA recovery rules Jens Remus
2026-05-05 18:34   ` sashiko-bot
2026-05-05 12:17 ` [PATCH v14 15/19] unwind_user: Flexible CFA " Jens Remus
2026-05-05 12:17 ` [PATCH v14 16/19] unwind_user/sframe: Add support for SFrame V3 flexible FDEs Jens Remus
2026-05-05 18:55   ` sashiko-bot
2026-05-07 15:30     ` Jens Remus
2026-05-13  6:26       ` Indu Bhagat
2026-05-13 13:50         ` Jens Remus
2026-05-13 15:16           ` Steven Rostedt
2026-05-05 12:17 ` [PATCH v14 17/19] unwind_user/sframe: Separate reading of FRE from reading of FRE data words Jens Remus
2026-05-05 19:05   ` sashiko-bot
2026-05-07 16:01     ` Jens Remus
2026-05-05 12:17 ` [PATCH v14 18/19] unwind_user/sframe/x86: Enable sframe unwinding on x86 Jens Remus
2026-05-05 19:07   ` sashiko-bot
2026-05-05 12:17 ` [PATCH v14 19/19] unwind_user/sframe: Add prctl() interface for registering .sframe sections Jens Remus
2026-05-05 18:45   ` sashiko-bot
2026-05-07 14:14     ` Jens Remus
2026-05-05 12:25 ` [PATCH v14 00/19] unwind_deferred: Implement sframe handling Jens Remus

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49318ed5-8668-43fe-880d-b91bd7c3a7a9@linux.ibm.com \
    --to=jremus@linux.ibm.com \
    --cc=bpf@vger.kernel.org \
    --cc=ibhagatgnu@gmail.com \
    --cc=jpoimboe@kernel.org \
    --cc=rostedt@kernel.org \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.