linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Kees Cook <kees@kernel.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Will Drewry <wad@chromium.org>,
	Christian Brauner <brauner@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>, Serge Hallyn <serge@hallyn.com>,
	James Morris <jamorris@linux.microsoft.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Yafang Shao <laoar.shao@gmail.com>, Helge Deller <deller@gmx.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Adrian Reber <areber@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Jens Axboe <axboe@kernel.dk>, Alexei Starovoitov <ast@kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-kselftest@vger.kernel.org, linux-mm@kvack.org,
	linux-security-module@vger.kernel.org,
	tiozhang <tiozhang@didiglobal.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	"Paulo Alcantara (SUSE)" <pc@manguebit.com>,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	YueHaibing <yuehaibing@huawei.com>,
	Paul Moore <paul@paul-moore.com>,
	Aleksa Sarai <cyphar@cyphar.com>,
	Stefan Roesch <shr@devkernel.io>, Chao Yu <chao@kernel.org>,
	xu xin <xu.xin16@zte.com.cn>, Jeff Layton <jlayton@kernel.org>,
	Jan Kara <jack@suse.cz>, David Hildenbrand <david@redhat.com>,
	Dave Chinner <dchinner@redhat.com>, Shuah Khan <shuah@kernel.org>,
	Elena Reshetova <elena.reshetova@intel.com>,
	David Windsor <dwindsor@gmail.com>,
	Mateusz Guzik <mjguzik@gmail.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	"Joel Fernandes (Google)" <joel@joelfernandes.org>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Hans Liljestrand <ishkamiel@gmail.com>,
	Penglei Jiang <superman.xpt@gmail.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Adrian Ratiu <adrian.ratiu@collabora.com>,
	Ingo Molnar <mingo@kernel.org>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Cyrill Gorcunov <gorcunov@gmail.com>,
	Eric Dumazet <edumazet@google.com>
Subject: Re: [PATCH v17] exec: Fix dead-lock in de_thread with ptrace_attach
Date: Wed, 5 Nov 2025 15:32:10 +0100	[thread overview]
Message-ID: <20251105143210.GA25535@redhat.com> (raw)
In-Reply-To: <GV2PPF74270EBEE9EF78827D73D3D7212F7E432A@GV2PPF74270EBEE.EURP195.PROD.OUTLOOK.COM>

I am still thinking about another approach, will write another email.
But let me take a closer look at your patch.

First of all, can you split it? See below.

On 08/21, Bernd Edlinger wrote:
>
> -static int de_thread(struct task_struct *tsk)
> +static int de_thread(struct task_struct *tsk, struct linux_binprm *bprm)
>  {
>  	struct signal_struct *sig = tsk->signal;
>  	struct sighand_struct *oldsighand = tsk->sighand;
>  	spinlock_t *lock = &oldsighand->siglock;
> +	struct task_struct *t;
> +	bool unsafe_execve_in_progress = false;
>
>  	if (thread_group_empty(tsk))
>  		goto no_thread_group;
> @@ -932,6 +934,19 @@ static int de_thread(struct task_struct *tsk)
>  	if (!thread_group_leader(tsk))
>  		sig->notify_count--;
>
> +	for_other_threads(tsk, t) {
> +		if (unlikely(t->ptrace)
> +		    && (t != tsk->group_leader || !t->exit_state))
> +			unsafe_execve_in_progress = true;

you can add "break" into the "if ()" block...

But this is minor. Why do we need "bool unsafe_execve_in_progress" ?
If this patch is correct, de_thread() can drop/reacquire cred_guard_mutex
unconditionally.

If you really think it makes sense, please make another patch with the
changelog.

I'd certainly prefer to avoid this boolean at least for the start. If nothing
else to catch the potential problems earlier.

> +	if (unlikely(unsafe_execve_in_progress)) {
> +		spin_unlock_irq(lock);
> +		sig->exec_bprm = bprm;
> +		mutex_unlock(&sig->cred_guard_mutex);
> +		spin_lock_irq(lock);

I don't think spin_unlock_irq() + spin_lock_irq() makes any sense...

> @@ -1114,13 +1139,31 @@ int begin_new_exec(struct linux_binprm * bprm)
>  	 */
>  	trace_sched_prepare_exec(current, bprm);
>
> +	/* If the binary is not readable then enforce mm->dumpable=0 */
> +	would_dump(bprm, bprm->file);
> +	if (bprm->have_execfd)
> +		would_dump(bprm, bprm->executable);
> +
> +	/*
> +	 * Figure out dumpability. Note that this checking only of current
> +	 * is wrong, but userspace depends on it. This should be testing
> +	 * bprm->secureexec instead.
> +	 */
> +	if (bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP ||
> +	    is_dumpability_changed(current_cred(), bprm->cred) ||
> +	    !(uid_eq(current_euid(), current_uid()) &&
> +	      gid_eq(current_egid(), current_gid())))
> +		set_dumpable(bprm->mm, suid_dumpable);
> +	else
> +		set_dumpable(bprm->mm, SUID_DUMP_USER);
> +

OK, we need to do this before de_thread() drops cred_guard_mutex.
But imo this too should be done in a separate patch, the changelog should
explain this change.

> @@ -1361,6 +1387,11 @@ static int prepare_bprm_creds(struct linux_binprm *bprm)
>  	if (mutex_lock_interruptible(&current->signal->cred_guard_mutex))
>  		return -ERESTARTNOINTR;
>
> +	if (unlikely(current->signal->exec_bprm)) {
> +		mutex_unlock(&current->signal->cred_guard_mutex);
> +		return -ERESTARTNOINTR;
> +	}

OK, if signal->exec_bprm != NULL, then current is already killed. But
proc_pid_attr_write() and ptrace_traceme() do the same. So how about
something like

	int lock_current_cgm(void)
	{
		if (mutex_lock_interruptible(&current->signal->cred_guard_mutex))
			return -ERESTARTNOINTR;

		if (!current->signal->group_exec_task)
			return 0;

		WARN_ON(!fatal_signal_pending(current));
		mutex_unlock(&current->signal->cred_guard_mutex);
		return -ERESTARTNOINTR;
	}

?

Note that it checks ->group_exec_task, not ->exec_bprm. So this change can
come in a separate patch too, but I won't insist.

> @@ -453,6 +454,28 @@ static int ptrace_attach(struct task_struct *task, long request,
>  				return retval;
>  		}
>
> +		if (unlikely(task == task->signal->group_exec_task)) {
> +			retval = down_write_killable(&task->signal->exec_update_lock);
> +			if (retval)
> +				return retval;
> +
> +			scoped_guard (task_lock, task) {
> +				struct linux_binprm *bprm = task->signal->exec_bprm;
> +				const struct cred __rcu *old_cred = task->real_cred;
> +				struct mm_struct *old_mm = task->mm;
> +
> +				rcu_assign_pointer(task->real_cred, bprm->cred);
> +				task->mm = bprm->mm;
> +				retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS);
> +				rcu_assign_pointer(task->real_cred, old_cred);
> +				task->mm = old_mm;
> +			}

This is the most problematic change which I can't review...

Firstly, it changes task->mm/real_cred for __ptrace_may_access() and this
looks dangerous to me.

Say, current_is_single_threaded() called by another CLONE_VM process can
miss group_exec_task and falsely return true. Probably not that bad, in
this case old_mm should go away soon, but still...

And I don't know if this can fool the users of task_cred_xxx/__task_cred
somehow.

Or. check_unsafe_exec() sets LSM_UNSAFE_PTRACE if ptrace. Is it safe to
ptrace the execing task after that? I have no idea what the security hooks
can do...

Again, can't review this part.

Oleg.


  parent reply	other threads:[~2025-11-05 14:34 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-17 12:23 [PATCH v10] exec: Fix dead-lock in de_thread with ptrace_attach Bernd Edlinger
2021-07-11 17:43 ` [PATCH v11] " Bernd Edlinger
2023-10-30  5:20   ` [PATCH v12] " Bernd Edlinger
2023-10-30  9:00     ` kernel test robot
2023-11-06  6:41     ` [PATCH v13] " Bernd Edlinger
2024-01-15 19:22       ` [PATCH v14] " Bernd Edlinger
2024-01-15 19:37         ` Matthew Wilcox
2024-01-17  9:51           ` Bernd Edlinger
2024-01-16 15:22         ` Oleg Nesterov
2024-01-17 15:07           ` Bernd Edlinger
2024-01-17 16:38             ` Oleg Nesterov
2024-01-22 13:24               ` Bernd Edlinger
2024-01-22 13:44                 ` Oleg Nesterov
2024-01-22 21:30                 ` Kees Cook
2024-01-23 18:30                   ` Bernd Edlinger
2024-01-24  0:09                     ` Kees Cook
2024-01-22 18:31         ` [PATCH v15] " Bernd Edlinger
2025-08-18  6:04           ` Jain, Ayush
2025-08-18 20:53           ` [PATCH v16] " Bernd Edlinger
2025-08-19  4:36             ` Kees Cook
2025-08-19 18:53               ` Bernd Edlinger
2025-08-21 17:34             ` [PATCH v17] " Bernd Edlinger
2025-10-27  6:26               ` Bernd Edlinger
2025-10-27 12:06               ` Peter Zijlstra
2025-11-02 16:17               ` Oleg Nesterov
2025-11-05 14:32               ` Oleg Nesterov [this message]
2025-11-11  9:21                 ` Christian Brauner
2025-11-11 11:07                   ` Bernd Edlinger
2025-11-11 13:12                     ` Oleg Nesterov
2025-11-11 13:45                       ` Bernd Edlinger
2025-11-12  9:52                         ` Oleg Nesterov
2025-11-17  6:31                   ` Bernd Edlinger
2025-11-17 15:01                     ` Oleg Nesterov
2025-11-17 20:08                       ` Bernd Edlinger
2025-11-09 17:14               ` [RFC PATCH 0/3] mt-exec: fix deadlock with ptrace_attach() Oleg Nesterov
2025-11-09 17:14                 ` [RFC PATCH 1/3] exec: make setup_new_exec() return int Oleg Nesterov
2025-11-09 17:15                 ` [RFC PATCH 2/3] exec: don't wait for zombie threads with cred_guard_mutex held Oleg Nesterov
2025-11-10 10:58                   ` Cyrill Gorcunov
2025-11-10 15:09                     ` Oleg Nesterov
2025-11-10 21:49                       ` Cyrill Gorcunov
2025-11-11 14:09                         ` Oleg Nesterov
2025-11-09 17:16                 ` [RFC PATCH 3/3] ptrace: ensure PTRACE_EVENT_EXIT won't stop if the tracee is killed by exec Oleg Nesterov
2025-11-10  5:28                 ` [RFC PATCH 0/3] mt-exec: fix deadlock with ptrace_attach() Bernd Edlinger
2025-11-10 14:47                   ` Oleg Nesterov
2025-11-18 18:13               ` [PATCH v18] exec: Fix dead-lock in de_thread with ptrace_attach Bernd Edlinger
2025-11-20 15:15                 ` Eric W. Biederman
2025-11-20 17:29                   ` Eric W. Biederman
2025-11-20 20:57                     ` [RFC][PATCH] exec: Move cred computation under exec_update_lock Eric W. Biederman
2025-11-20 23:50                       ` Eric W. Biederman
2025-11-21  2:59                         ` Bernd Edlinger
2025-11-21  7:18                           ` Eric W. Biederman
2025-11-21  9:35                             ` Bernd Edlinger
2025-11-21 11:26                               ` Bernd Edlinger
2025-11-21 19:19                                 ` Eric W. Biederman
2025-11-21 23:06                                   ` Ryan Lee
2025-11-22 17:10                     ` [PATCH v18] exec: Fix dead-lock in de_thread with ptrace_attach Bernd Edlinger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251105143210.GA25535@redhat.com \
    --to=oleg@redhat.com \
    --cc=adobriyan@gmail.com \
    --cc=adrian.ratiu@collabora.com \
    --cc=akpm@linux-foundation.org \
    --cc=ardb@kernel.org \
    --cc=areber@redhat.com \
    --cc=ast@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=bernd.edlinger@hotmail.de \
    --cc=brauner@kernel.org \
    --cc=chao@kernel.org \
    --cc=cyphar@cyphar.com \
    --cc=david@redhat.com \
    --cc=dchinner@redhat.com \
    --cc=deller@gmx.de \
    --cc=dwindsor@gmail.com \
    --cc=ebiederm@xmission.com \
    --cc=edumazet@google.com \
    --cc=elena.reshetova@intel.com \
    --cc=frederic@kernel.org \
    --cc=gorcunov@gmail.com \
    --cc=ishkamiel@gmail.com \
    --cc=jack@suse.cz \
    --cc=jamorris@linux.microsoft.com \
    --cc=jlayton@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=kees@kernel.org \
    --cc=laoar.shao@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=luto@amacapital.net \
    --cc=mcgrof@kernel.org \
    --cc=mhocko@suse.com \
    --cc=mingo@kernel.org \
    --cc=mjguzik@gmail.com \
    --cc=paul@paul-moore.com \
    --cc=pc@manguebit.com \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=senozhatsky@chromium.org \
    --cc=serge@hallyn.com \
    --cc=shr@devkernel.io \
    --cc=shuah@kernel.org \
    --cc=superman.xpt@gmail.com \
    --cc=surenb@google.com \
    --cc=tglx@linutronix.de \
    --cc=tiozhang@didiglobal.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=wad@chromium.org \
    --cc=willy@infradead.org \
    --cc=xu.xin16@zte.com.cn \
    --cc=yuehaibing@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).