From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3CCDD212FAC; Mon, 31 Mar 2025 14:32:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743431533; cv=none; b=kw/bE1hOz7b7HDy7H7fn/azk2coufxSRpDWuEEwIYKnOsEuINA5h2mP6ymlyq/c3qRpvdr8Gz2KjDiSzg44H40LIQlZ9AtWc0NuR+lnf1OEpvulK7M6IvdwWIQvU/wrU/yeI6r3rxsCXN1Ob2XLdBdjjH4dLA+PkOJOnmFLVEy4= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743431533; c=relaxed/simple; bh=xII5+k3OVXFVpbqehCP/A9IPPiwXAMAiVAqTpWqf7wk=; h=From:To:Cc:Subject:Date:Message-Id:MIME-Version; b=bzw1UZwzRrp1ViIwmqn45FQZW9i3k1KSvtg9OwIoVjBfHzBOrd+TkUox+c6HuabjCfuK/x8HVrbVcvFrCf/RLUA6/BG/L1+6Mhm8datOrH0M3l6VjPRxiukcZisESRR4rCO2US3aPPhUG8PKmqTsViMPOmKujA5V3hiAtc0/34I= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=EwVJx76i; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="EwVJx76i" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D1319C4CEE3; Mon, 31 Mar 2025 14:32:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1743431532; bh=xII5+k3OVXFVpbqehCP/A9IPPiwXAMAiVAqTpWqf7wk=; h=From:To:Cc:Subject:Date:From; b=EwVJx76iLa0RF/pNyUcZPhP2orNZ+VTuIuTxKMq6kFm4QOGqEIB8AUVtN8X6eVuBf /eDeNruNFQ3YZWg6H8FSKCii5vgql4XcCWwNBXWOKGe5WumElazNttYeSSTYwNqMkz NOMBbPj+7a7gvVeKQZQlgFbEJA9Lyg/4aHJm1F3LXk5/EAai9nNCwBUDRm5dyd08CC KNktTucpK3F8KXkUThh2k6SW4OQmlgpP++/iHpuypFSse7wjQ9ar800ilFhkAD1CSs 9dd5bMEsfIL/iceMb+Mxt3ZpqgIM6VPo1HyaSzz4QDEEkQjVIQBp1oOWM2Ni2VuHLL s/PDbMzFTmB7Q== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Mateusz Guzik , Christian Brauner , Sasha Levin , viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org Subject: [PATCH AUTOSEL 6.14 1/2] fs: consistently deref the files table with rcu_dereference_raw() Date: Mon, 31 Mar 2025 10:32:09 -0400 Message-Id: <20250331143210.1667697-1-sashal@kernel.org> X-Mailer: git-send-email 2.39.5 Precedence: bulk X-Mailing-List: stable@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-stable-base: Linux 6.14 Content-Transfer-Encoding: 8bit From: Mateusz Guzik [ Upstream commit f381640e1bd4f2de7ccafbfe8703d33c3718aad9 ] ... except when the table is known to be only used by one thread. A file pointer can get installed at any moment despite the ->file_lock being held since the following: 8a81252b774b53e6 ("fs/file.c: don't acquire files->file_lock in fd_install()") Accesses subject to such a race can in principle suffer load tearing. While here redo the comment in dup_fd -- it only covered a race against files showing up, still assuming fd_install() takes the lock. Signed-off-by: Mateusz Guzik Link: https://lore.kernel.org/r/20250313135725.1320914-1-mjguzik@gmail.com Signed-off-by: Christian Brauner Signed-off-by: Sasha Levin --- fs/file.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-) diff --git a/fs/file.c b/fs/file.c index d868cdb95d1e7..1ba03662ae66f 100644 --- a/fs/file.c +++ b/fs/file.c @@ -418,17 +418,25 @@ struct files_struct *dup_fd(struct files_struct *oldf, struct fd_range *punch_ho old_fds = old_fdt->fd; new_fds = new_fdt->fd; + /* + * We may be racing against fd allocation from other threads using this + * files_struct, despite holding ->file_lock. + * + * alloc_fd() might have already claimed a slot, while fd_install() + * did not populate it yet. Note the latter operates locklessly, so + * the file can show up as we are walking the array below. + * + * At the same time we know no files will disappear as all other + * operations take the lock. + * + * Instead of trying to placate userspace racing with itself, we + * ref the file if we see it and mark the fd slot as unused otherwise. + */ for (i = open_files; i != 0; i--) { - struct file *f = *old_fds++; + struct file *f = rcu_dereference_raw(*old_fds++); if (f) { get_file(f); } else { - /* - * The fd may be claimed in the fd bitmap but not yet - * instantiated in the files array if a sibling thread - * is partway through open(). So make sure that this - * fd is available to the new process. - */ __clear_open_fd(open_files - i, new_fdt); } rcu_assign_pointer(*new_fds++, f); @@ -679,7 +687,7 @@ struct file *file_close_fd_locked(struct files_struct *files, unsigned fd) return NULL; fd = array_index_nospec(fd, fdt->max_fds); - file = fdt->fd[fd]; + file = rcu_dereference_raw(fdt->fd[fd]); if (file) { rcu_assign_pointer(fdt->fd[fd], NULL); __put_unused_fd(files, fd); @@ -1237,7 +1245,7 @@ __releases(&files->file_lock) */ fdt = files_fdtable(files); fd = array_index_nospec(fd, fdt->max_fds); - tofree = fdt->fd[fd]; + tofree = rcu_dereference_raw(fdt->fd[fd]); if (!tofree && fd_is_open(fd, fdt)) goto Ebusy; get_file(file); -- 2.39.5