Re: [syzbot] [fs?] [mm?] KCSAN: data-race in bprm_execve / copy_fs (4)

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Kees Cook <kees@kernel.org>
To: Oleg Nesterov <oleg@redhat.com>, brauner@kernel.org
Cc: jack@suse.cz, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	syzkaller-bugs@googlegroups.com, viro@zeniv.linux.org.uk,
	syzbot <syzbot+1c486d0b62032c82a968@syzkaller.appspotmail.com>
Subject: Re: [syzbot] [fs?] [mm?] KCSAN: data-race in bprm_execve / copy_fs (4)
Date: Thu, 20 Mar 2025 13:09:38 -0700	[thread overview]
Message-ID: <202503201225.92C5F5FB1@keescook> (raw)
In-Reply-To: <67dc67f0.050a0220.25ae54.001f.GAE@google.com>

Hey look another threaded exec bug. :|

On Thu, Mar 20, 2025 at 12:09:36PM -0700, syzbot wrote:
> ==================================================================
> BUG: KCSAN: data-race in bprm_execve / copy_fs
> 
> write to 0xffff8881044f8250 of 4 bytes by task 13692 on cpu 0:
>  bprm_execve+0x748/0x9c0 fs/exec.c:1884

This is:

        current->fs->in_exec = 0;

And is part of the execve failure path:

out:
	...
        if (bprm->point_of_no_return && !fatal_signal_pending(current))
                force_fatal_sig(SIGSEGV);

        sched_mm_cid_after_execve(current);
        current->fs->in_exec = 0;
        current->in_execve = 0;

        return retval;
}

>  do_execveat_common+0x769/0x7e0 fs/exec.c:1966
>  do_execveat fs/exec.c:2051 [inline]
>  __do_sys_execveat fs/exec.c:2125 [inline]
>  __se_sys_execveat fs/exec.c:2119 [inline]
>  __x64_sys_execveat+0x75/0x90 fs/exec.c:2119
>  x64_sys_call+0x291e/0x2dc0 arch/x86/include/generated/asm/syscalls_64.h:323
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> read to 0xffff8881044f8250 of 4 bytes by task 13686 on cpu 1:
>  copy_fs+0x95/0xf0 kernel/fork.c:1770

This is:

                if (fs->in_exec) {

Which is under lock:

        struct fs_struct *fs = current->fs;
        if (clone_flags & CLONE_FS) {
                /* tsk->fs is already what we want */
                spin_lock(&fs->lock);
                /* "users" and "in_exec" locked for check_unsafe_exec() * */
                if (fs->in_exec) {
                        spin_unlock(&fs->lock);
                        return -EAGAIN;
                }
                fs->users++;
                spin_unlock(&fs->lock);


Does execve need to be taking this lock? The other thing touching it is
check_unsafe_exec(), which takes the lock. It looks like the bprm_execve()
lock was removed in commit 8c652f96d385 ("do_execve() must not clear
fs->in_exec if it was set by another thread") which used the return
value from check_unsafe_exec():

    When do_execve() succeeds, it is safe to clear ->in_exec unconditionally.
    It can be set only if we don't share ->fs with another process, and since
    we already killed all sub-threads either ->in_exec == 0 or we are the
    only user of this ->fs.

    Also, we do not need fs->lock to clear fs->in_exec.

This logic was updated in commit 9e00cdb091b0 ("exec:check_unsafe_exec:
kill the dead -EAGAIN and clear_in_exec logic"), which includes this
rationale:

            2. "out_unmark:" in do_execve_common() is either called
               under ->cred_guard_mutex, or after de_thread() which
               kills other threads, so we can't race with sub-thread
               which could set ->in_exec. And if ->fs is shared with
               another process ->in_exec should be false anyway.

The de_thread() is part of the "point of no return" in exec_binprm(),
called via exec_binprm(). But the bprm_execve() error path is reachable
from many paths prior to the point of no return.

What I can imagine here is two failing execs racing a fork:

	A start execve
	B fork with CLONE_FS
	C start execve, reach check_unsafe_exec(), set fs->in_exec
	A bprm_execve() failure, clear fs->in_exec
	B copy_fs() increment fs->users.
	C bprm_execve() failure, clear fs->in_exec

But I don't think this is a "real" flaw, though, since the locking is to
protect a _successful_ execve from a fork (i.e. getting the user count
right). A successful execve will de_thread, and I don't see any wrong
counting of fs->users with regard to thread lifetime.

Did I miss something in the analysis? Should we perform locking anyway,
or add data race annotations, or something else?

-- 
Kees Cook

next prev parent reply	other threads:[~2025-03-20 20:09 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-20 19:09 [syzbot] [fs?] [mm?] KCSAN: data-race in bprm_execve / copy_fs (4) syzbot
2025-03-20 20:09 ` Kees Cook [this message]
2025-03-21  1:44   ` Al Viro
2025-03-21  8:10     ` Kees Cook
2025-03-21  8:49       ` Christian Brauner
2025-03-21  8:45   ` Christian Brauner
2025-03-22  1:00     ` Al Viro
2025-03-22  6:26       ` Kees Cook
2025-03-22 10:15         ` Mateusz Guzik
2025-03-22 10:28           ` Christian Brauner
2025-03-22 10:23       ` Christian Brauner
2025-03-22 15:55       ` Oleg Nesterov
2025-03-22 18:50         ` Al Viro
2025-03-23 18:14           ` Oleg Nesterov
2025-03-23 20:57             ` Christian Brauner
2025-03-24 16:00 ` [PATCH] exec: fix the racy usage of fs_struct->in_exec Oleg Nesterov
2025-03-24 17:01   ` Mateusz Guzik
2025-03-24 18:27     ` Oleg Nesterov
2025-03-24 18:37       ` Oleg Nesterov
2025-03-24 22:24       ` Mateusz Guzik
2025-03-25 10:09         ` Oleg Nesterov
2025-03-25 11:01           ` Mateusz Guzik
2025-03-25 13:21             ` Oleg Nesterov
2025-03-25 13:30               ` Christian Brauner
2025-03-25 14:15                 ` Mateusz Guzik
2025-03-25 14:46                   ` Christian Brauner
2025-03-25 18:40                     ` Kees Cook
2025-04-29 15:49   ` Oleg Nesterov
2025-04-29 16:57     ` Kees Cook
2025-04-29 17:12     ` Mateusz Guzik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=202503201225.92C5F5FB1@keescook \
    --to=kees@kernel.org \
    --cc=brauner@kernel.org \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=oleg@redhat.com \
    --cc=syzbot+1c486d0b62032c82a968@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.