From: "David Hildenbrand (Arm)" <david@kernel.org>
To: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>,
	Qi Tang <tpluszz77@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	Oleg Nesterov <oleg@redhat.com>,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] prctl: require checkpoint_restore_ns_capable for PR_SET_MM_MAP
Date: Thu, 2 Apr 2026 15:55:27 +0200
Message-ID: <389887c2-ddae-4456-b9d2-417aaaa2b340@kernel.org>
In-Reply-To: <686134c9-c2e3-444f-b83a-dd229c7b0102@lucifer.local>

On 4/2/26 15:06, Lorenzo Stoakes (Oracle) wrote:
> On Thu, Apr 02, 2026 at 07:13:32PM +0800, Qi Tang wrote:
>> prctl_set_mm_map() allows modifying all mm_struct boundaries and
>> the saved auxv vector.  The individual field path (PR_SET_MM_START_CODE
>> etc.) correctly requires CAP_SYS_RESOURCE, but the PR_SET_MM_MAP path
>> dispatches before this check and has no capability requirement of its
>> own when exe_fd is -1.
>>
>> This means any unprivileged user on a CONFIG_CHECKPOINT_RESTORE kernel
>> (nearly all distros) can rewrite mm boundaries including start_brk, brk,
>> arg_start/end, env_start/end and saved_auxv.  Consequences include:
>>
>>   - SELinux PROCESS__EXECHEAP bypass via start_brk manipulation
>>   - procfs info disclosure by pointing arg/env ranges at other memory
>>   - auxv poisoning (AT_SYSINFO_EHDR, AT_BASE, AT_ENTRY)
>>
>> The original commit f606b77f1a9e ("prctl: PR_SET_MM -- introduce
>> PR_SET_MM_MAP operation") states "we require the caller to be at least
>> user-namespace root user", but this was never enforced in the code.
>>
>> Add a checkpoint_restore_ns_capable() check at the top of
>> prctl_set_mm_map(), after the PR_SET_MM_MAP_SIZE early return.  This
>> requires CAP_CHECKPOINT_RESTORE or CAP_SYS_ADMIN in the caller's
>> user namespace, matching the stated design intent and the existing
>> check for exe_fd changes.
>>
>> Fixes: f606b77f1a9e ("prctl: PR_SET_MM -- introduce PR_SET_MM_MAP operation")
> 
> We've had a gaping security hole since 2014 and nobody noticed? I find it
> hard to believe.
> 
>> Cc: stable@vger.kernel.org
>> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
>> Signed-off-by: Qi Tang <tpluszz77@gmail.com>
>> ---
>>  kernel/sys.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/kernel/sys.c b/kernel/sys.c
>> index c86eba9aa7e9..2b8c57f23a35 100644
>> --- a/kernel/sys.c
>> +++ b/kernel/sys.c
>> @@ -2071,6 +2071,9 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
>>  		return put_user((unsigned int)sizeof(prctl_map),
>>  				(unsigned int __user *)addr);
>>
>> +	if (!checkpoint_restore_ns_capable(current_user_ns()))
>> +		return -EPERM;
> 
> Hmm there is already:
> 
> 	if (prctl_map.exe_fd != (u32)-1) {
> 		/*
> 		 * Check if the current user is checkpoint/restore capable.
> 		 * At the time of this writing, it checks for CAP_SYS_ADMIN
> 		 * or CAP_CHECKPOINT_RESTORE.
> 		 * Note that a user with access to ptrace can masquerade an
> 		 * arbitrary program as any executable, even setuid ones.
> 		 * This may have implications in the tomoyo subsystem.
> 		 */
> 		if (!checkpoint_restore_ns_capable(current_user_ns()))
> 			return -EPERM;
> 
> And you're proposing _adding_ this check on top of that? Seems super
> redundant.

Yes, should be moved.
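
For reference, a minimal userspace model of what moving the check would
mean: the capability test runs once, before the exe_fd branch, instead of
only inside it. Everything here is a stand-in for illustration (struct
map_req, set_mm_map_model(), and the boolean cr_capable that replaces
checkpoint_restore_ns_capable(current_user_ns())); it is not the kernel
code and not a tested patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the one prctl_mm_map field that matters here. */
struct map_req {
	uint32_t exe_fd;	/* (uint32_t)-1 means "leave the exe link alone" */
};

/*
 * Model of the control flow after hoisting the check.
 * cr_capable stands in for checkpoint_restore_ns_capable(current_user_ns()).
 * Returns 0 on success or a negative errno, like the kernel function.
 */
static int set_mm_map_model(const struct map_req *req, bool cr_capable,
			    bool *exe_changed)
{
	*exe_changed = false;

	/* Hoisted: every PR_SET_MM_MAP request is gated, not only exe_fd ones. */
	if (!cr_capable)
		return -EPERM;

	if (req->exe_fd != (uint32_t)-1) {
		/* No second capability check is needed inside the branch. */
		*exe_changed = true;
	}

	return 0;
}
```

In this model a request with exe_fd == -1 and no capability fails with
-EPERM, which is the case the patch is trying to close.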

> 
> but also, this seems super-specific buuut... Then again #ifdef
> CONFIG_CHECKPOINT_RESTORE around this. Ugh.
> 
> I _hate_ this interface. HATE HATE HATE it.
> 
> Anyway, does updating _your own_ auxv really require elevated permissions
> like this?
> 
> I don't think so? Couldn't you go and manipulate that anyway without
> elevated anything?

Hard to believe ...

I was wondering whether this could break some users. At least the CRIU doc
states:

    This option tells *criu* to accept the limitations when running
    as non-root. Running as non-root requires *criu* at least to have
    *CAP_SYS_ADMIN* or *CAP_CHECKPOINT_RESTORE*. For details about
    running *criu* as non-root please consult the *NON-ROOT* section.

I mean, the check makes sense given that prctl_set_mm() rejects all
these operations without CAP_SYS_RESOURCE.


CAP_CHECKPOINT_RESTORE was only introduced in 2020, by

commit 124ea650d3072b005457faed69909221c2905a1f
Author: Adrian Reber <areber@redhat.com>
Date:   Sun Jul 19 12:04:11 2020 +0200

    capabilities: Introduce CAP_CHECKPOINT_RESTORE

So at the time PR_SET_MM_MAP was added there simply was no such capability.

Likely, now that we have it, we should indeed use it.
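
As a rough illustration of the predicate being discussed, here is a
userspace sketch that mirrors the "CAP_CHECKPOINT_RESTORE or CAP_SYS_ADMIN"
rule on an effective-capability bitmask, such as the CapEff value in
/proc/self/status. The helper names are made up for this example; the
capability numbers are the ones from include/uapi/linux/capability.h. The
kernel helper additionally evaluates the capability against a user
namespace, which a plain bitmask cannot express.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/capability.h. */
#define CAP_SYS_ADMIN_NR		21
#define CAP_CHECKPOINT_RESTORE_NR	40

/* Hypothetical helper: is capability number nr set in the mask? */
static bool cap_mask_has(uint64_t mask, unsigned int nr)
{
	return nr < 64 && ((mask >> nr) & 1);
}

/* Hypothetical userspace analogue of the "C/R capable" rule. */
static bool cr_capable_from_mask(uint64_t cap_eff)
{
	return cap_mask_has(cap_eff, CAP_CHECKPOINT_RESTORE_NR) ||
	       cap_mask_has(cap_eff, CAP_SYS_ADMIN_NR);
}
```

Note that a mask holding only CAP_SYS_RESOURCE (bit 24) does not satisfy
this predicate, so the two PR_SET_MM paths would still be gated by
different capabilities.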

-- 
Cheers,

David

