From: Oleg Nesterov <oleg@redhat.com>
To: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: Christian Brauner <brauner@kernel.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Kees Cook <kees@kernel.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Will Drewry <wad@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>, Serge Hallyn <serge@hallyn.com>,
	James Morris <jamorris@linux.microsoft.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Yafang Shao <laoar.shao@gmail.com>, Helge Deller <deller@gmx.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Adrian Reber <areber@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Jens Axboe <axboe@kernel.dk>, Alexei Starovoitov <ast@kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-kselftest@vger.kernel.org, linux-mm@kvack.org,
	linux-security-module@vger.kernel.org,
	tiozhang <tiozhang@didiglobal.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	"Paulo Alcantara (SUSE)" <pc@manguebit.com>,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	YueHaibing <yuehaibing@huawei.com>,
	Paul Moore <paul@paul-moore.com>,
	Aleksa Sarai <cyphar@cyphar.com>,
	Stefan Roesch <shr@devkernel.io>, Chao Yu <chao@kernel.org>,
	xu xin <xu.xin16@zte.com.cn>, Jeff Layton <jlayton@kernel.org>,
	Jan Kara <jack@suse.cz>, David Hildenbrand <david@redhat.com>,
	Dave Chinner <dchinner@redhat.com>, Shuah Khan <shuah@kernel.org>,
	Elena Reshetova <elena.reshetova@intel.com>,
	David Windsor <dwindsor@gmail.com>,
	Mateusz Guzik <mjguzik@gmail.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	"Joel Fernandes (Google)" <joel@joelfernandes.org>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Hans Liljestrand <ishkamiel@gmail.com>,
	Penglei Jiang <superman.xpt@gmail.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Adrian Ratiu <adrian.ratiu@collabora.com>,
	Ingo Molnar <mingo@kernel.org>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Cyrill Gorcunov <gorcunov@gmail.com>,
	Eric Dumazet <edumazet@google.com>
Subject: Re: [PATCH v17] exec: Fix dead-lock in de_thread with ptrace_attach
Date: Mon, 17 Nov 2025 16:01:33 +0100
Message-ID: <aRs4zYDhddBQFiXZ@redhat.com>
In-Reply-To: <GV2PPF74270EBEE4FE6E639B899D01D8870E4C9A@GV2PPF74270EBEE.EURP195.PROD.OUTLOOK.COM>

On 11/17, Bernd Edlinger wrote:
>
> On 11/11/25 10:21, Christian Brauner wrote:
> > On Wed, Nov 05, 2025 at 03:32:10PM +0100, Oleg Nesterov wrote:
>
> >> But this is minor. Why do we need "bool unsafe_execve_in_progress" ?
> >> If this patch is correct, de_thread() can drop/reacquire cred_guard_mutex
> >> unconditionally.
> >>
>
> I would not like to drop the mutex when not absolutely necessary for performance reasons.

OK, I won't insist... But I don't really understand how this can help to
improve the performance. If nothing else, this adds another for_other_threads()
loop.

And again, the unsafe_execve_in_progress == T case is unlikely. I'm afraid this
case (de_thread() without cred_guard_mutex) won't have enough testing.

In any case, why do you dislike the suggestion to add this
unsafe_execve_in_progress logic in a separate patch?

> >>> +	if (unlikely(unsafe_execve_in_progress)) {
> >>> +		spin_unlock_irq(lock);
> >>> +		sig->exec_bprm = bprm;
> >>> +		mutex_unlock(&sig->cred_guard_mutex);
> >>> +		spin_lock_irq(lock);
> >>
> >> I don't think spin_unlock_irq() + spin_lock_irq() makes any sense...
> >>
>
> Since the spin lock was acquired while holding the mutex, both should be
> unlocked in reverse sequence and the spin lock re-acquired after releasing
> the mutex.

Why?

> I'd expect the scheduler to do a task switch after the cred_guard_mutex is
> unlocked, at least in the RT-linux variant, while the spin lock is not yet
> unlocked.

I must have missed something, but I still don't understand why this would
be wrong...

> >>> @@ -1114,13 +1139,31 @@ int begin_new_exec(struct linux_binprm * bprm)
> >>>  	 */
> >>>  	trace_sched_prepare_exec(current, bprm);
> >>>
> >>> +	/* If the binary is not readable then enforce mm->dumpable=0 */
> >>> +	would_dump(bprm, bprm->file);
> >>> +	if (bprm->have_execfd)
> >>> +		would_dump(bprm, bprm->executable);
> >>> +
> >>> +	/*
> >>> +	 * Figure out dumpability. Note that this checking only of current
> >>> +	 * is wrong, but userspace depends on it. This should be testing
> >>> +	 * bprm->secureexec instead.
> >>> +	 */
> >>> +	if (bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP ||
> >>> +	    is_dumpability_changed(current_cred(), bprm->cred) ||
> >>> +	    !(uid_eq(current_euid(), current_uid()) &&
> >>> +	      gid_eq(current_egid(), current_gid())))
> >>> +		set_dumpable(bprm->mm, suid_dumpable);
> >>> +	else
> >>> +		set_dumpable(bprm->mm, SUID_DUMP_USER);
> >>> +
> >>
> >> OK, we need to do this before de_thread() drops cred_guard_mutex.
> >> But imo this too should be done in a separate patch, the changelog should
> >> explain this change.
> >>
>
> The dumpability needs to be determined before de_thread, because ptrace_may_access
> needs this information to determine if the tracer is allowed to ptrace. That is
> part of the core of the patch, it would not work without that.

Yes,

> I will add more comments to make that easier to understand.

But again, why can't this change come in a separate patch, before the patch
which drops cred_guard_mutex in de_thread()?
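
For readers following the dumpability hunk quoted above, the decision it makes
can be modeled in plain userspace C. This is only a sketch under stated
assumptions: struct dump_inputs and pick_dumpable() are hypothetical names, the
credential comparison done by is_dumpability_changed() is reduced to a single
boolean input, and only the SUID_DUMP_* values mirror the kernel's constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same numeric values as the kernel's SUID_DUMP_* constants. */
enum { SUID_DUMP_DISABLE = 0, SUID_DUMP_USER = 1, SUID_DUMP_ROOT = 2 };

/*
 * Hypothetical flattened inputs. In the kernel these come from bprm and
 * from the credentials of current; here they are plain values.
 */
struct dump_inputs {
	bool enforce_nondump;	/* models BINPRM_FLAGS_ENFORCE_NONDUMP being set */
	bool creds_changed;	/* models the result of is_dumpability_changed() */
	unsigned int uid, euid, gid, egid;
};

/*
 * Mirrors the if/else in the quoted hunk: fall back to the suid_dumpable
 * setting when anything looks privileged, otherwise allow user dumps.
 */
static int pick_dumpable(const struct dump_inputs *in, int suid_dumpable)
{
	assert(in != NULL);

	if (in->enforce_nondump || in->creds_changed ||
	    !(in->euid == in->uid && in->egid == in->gid))
		return suid_dumpable;

	return SUID_DUMP_USER;
}
```

The point of the model is that the result depends only on inputs that are
already known before de_thread() runs, which is why the computation can be
moved earlier at all.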

> >> 	int lock_current_cgm(void)
> >> 	{
> >> 		if (mutex_lock_interruptible(&current->signal->cred_guard_mutex))
> >> 			return -ERESTARTNOINTR;
> >>
> >> 		if (!current->signal->group_exec_task)
> >> 			return 0;
> >>
> >> 		WARN_ON(!fatal_signal_pending(current));
> >> 		mutex_unlock(&current->signal->cred_guard_mutex);
> >> 		return -ERESTARTNOINTR;
> >> 	}
> >>
> >> ?
> >>
>
> Some use mutex_lock_interruptible and some use mutex_lock_killable here,
> so it won't work for all of them.  I would not consider this a new kind
> of dead-lock free mutex, but just an open-coded state machine, handling
> the state that the tasks have while de_thread is running.

OK, and we don't have mutex_lock_state(). I think that all users could
use mutex_lock_killable(), but you are right anyway, and this is minor.

> >> Note that it checks ->group_exec_task, not ->exec_bprm. So this change can
> >> come in a separate patch too, but I won't insist.

Yes. Although this is minor too ;)

> >> This is the most problematic change which I can't review...
> >>
> >> Firstly, it changes task->mm/real_cred for __ptrace_may_access() and this
> >> looks dangerous to me.
> >
> > Yeah, that is not ok. This is effectively override_creds for real_cred
> > and that is not a pattern I want to see us establish at all! Temporary
> > credential overrides for the subjective credentials are already terrible
> > but at least we have the explicit split between real_cred and cred
> > expressly for that. So no, that's not an acceptable solution.
> >
>
> Okay I understand your point.
> I did this originally just to avoid having to change the interface to all
> the security engines, but instead I could add a flag PTRACE_MODE_BPRMCREDS to
> the ptrace_may_access which must be handled in all security engines, to use
> child->signal->exec_bprm->creds instead of __task_cred(child).

Can't comment... I don't understand your idea, but this is my fault. I guess
this needs more changes, in particular __ptrace_may_access_mm_cred(), but
most probably I misunderstood your idea.

>
> >> Or. check_unsafe_exec() sets LSM_UNSAFE_PTRACE if ptrace. Is it safe to
> >> ptrace the execing task after that? I have no idea what the security hooks
> >> can do...
>
> That means the tracee is already ptraced before the execve, and SUID-bits
> do not work as usual, and are more or less ignored.  But in this patch
> the tracee is not yet ptraced.

Well. I meant that if LSM_UNSAFE_PTRACE is not set, then currently (say)
security_bprm_committing_creds() has all rights to assume that the execing
task is not ptraced. Yes, I don't see any potential problem right now, but
still.

And just in case... Let's look at this code

	+                               rcu_assign_pointer(task->real_cred, bprm->cred);
	+                               task->mm = bprm->mm;
	+                               retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS);
	+                               rcu_assign_pointer(task->real_cred, old_cred);
	+                               task->mm = old_mm;

again.

This is mostly theoretical, but what if begin_new_exec() fails after de_thread()
and before exec_mmap() and/or commit_creds(bprm->cred) ? In this case the execing
thread will report SIGSEGV to debugger which can (say) read old_mm.

No?

I am starting to think that ptrace_attach() should simply fail with -EWOULDBLOCK
if it detects "unsafe_execve_in_progress" ... And perhaps this is what you already
tried to do in the past, I can't recall :/
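
To make the "simply fail" idea concrete, here is a minimal userspace model, not
kernel code. Everything in it is an assumption made for illustration: struct
sig_model, attach_model() and set_exec_state() are hypothetical names, a pthread
mutex stands in for cred_guard_mutex, and the exec_in_progress flag stands in
for whatever state the attach path would actually observe.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of signal_struct. */
struct sig_model {
	pthread_mutex_t cred_guard_mutex;
	bool exec_in_progress;	/* only read or written under the mutex */
};

/* Models the exec side publishing or clearing its state under the mutex. */
static void set_exec_state(struct sig_model *sig, bool in_progress)
{
	assert(sig != NULL);

	pthread_mutex_lock(&sig->cred_guard_mutex);
	sig->exec_in_progress = in_progress;
	pthread_mutex_unlock(&sig->cred_guard_mutex);
}

/*
 * Models an attach path that refuses instead of waiting: take the mutex,
 * look at the state, and return -EWOULDBLOCK if an exec is under way.
 */
static int attach_model(struct sig_model *sig)
{
	int ret = 0;

	assert(sig != NULL);

	pthread_mutex_lock(&sig->cred_guard_mutex);
	if (sig->exec_in_progress)
		ret = -EWOULDBLOCK;
	pthread_mutex_unlock(&sig->cred_guard_mutex);

	return ret;
}
```

In this model the attacher never sleeps waiting for the exec to complete, so it
cannot take part in a wait cycle with the execing thread; the cost is that the
caller has to cope with the error and retry.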

Oleg.

