From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>,
Alex Markuze <amarkuze@redhat.com>,
Ilya Dryomov <idryomov@gmail.com>,
Sasha Levin <sashal@kernel.org>,
xiubli@redhat.com, ceph-devel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.17-6.12] ceph: fix potential race condition in ceph_ioctl_lazyio()
Date: Sun, 26 Oct 2025 10:49:08 -0400 [thread overview]
Message-ID: <20251026144958.26750-30-sashal@kernel.org> (raw)
In-Reply-To: <20251026144958.26750-1-sashal@kernel.org>
From: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
[ Upstream commit 5824ccba9a39a3ad914fc9b2972a2c1119abaac9 ]
The Coverity Scan service has detected potential
race condition in ceph_ioctl_lazyio() [1].
The CID 1591046 contains explanation: "Check of thread-shared
field evades lock acquisition (LOCK_EVASION). Thread1 sets
fmode to a new value. Now the two threads have an inconsistent
view of fmode and updates to fields correlated with fmode
may be lost. The data guarded by this critical section may
be read while in an inconsistent state or modified by multiple
racing threads. In ceph_ioctl_lazyio: Checking the value of
a thread-shared field outside of a locked region to determine
if a locked operation involving that thread shared field
has completed. (CWE-543)".
The patch places fi->fmode field access under ci->i_ceph_lock
protection. Also, it introduces the is_file_already_lazy
variable that is set under the lock and it is checked later
out of scope of critical section.
[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1591046
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- The race arises because the pre-patch code reads `fi->fmode` without
holding `ci->i_ceph_lock`, so two threads can both see the lazy bit
clear, then each increment
`ci->i_nr_by_mode[ffs(CEPH_FILE_MODE_LAZY)]++` before either releases
the lock. That leaves the counter permanently elevated,
desynchronising the per-mode counts that `ceph_put_fmode()` relies on
to drop capability refs (`fs/ceph/ioctl.c` before line 212, contrasted
with `fs/ceph/caps.c:4744-4789`).
- The patch now performs the test and update while the lock is held
(`fs/ceph/ioctl.c:212-220`), eliminating the window where concurrent
callers can both act on stale state; the new `is_file_already_lazy`
flag preserves the existing logging/`ceph_check_caps()` calls after
the lock is released (`fs/ceph/ioctl.c:221-228`) so behaviour remains
unchanged aside from closing the race.
- Keeping `i_nr_by_mode` accurate is important beyond metrics: it feeds
`__ceph_caps_file_wanted()` when deciding what caps to request or drop
(`fs/ceph/caps.c:1006-1061`). With the race, a leaked lazy count
prevents the last close path from seeing the inode as idle, delaying
capability release and defeating the lazyio semantics the ioctl is
supposed to provide.
- The change is tightly scoped (one function, no API or struct changes,
same code paths still call `__ceph_touch_fmode()` and
`ceph_check_caps()`), so regression risk is minimal while the fix
hardens a locking invariant already respected by other fmode
transitions such as `ceph_get_fmode()` (`fs/ceph/caps.c:4727-4754`).
- No newer infrastructure is required—the fields, lock, and helpers
touched here have existed in long-term stable kernels—so this bug fix
is suitable for stable backporting despite the likely need to adjust
the `doutc` helper name on older branches.
fs/ceph/ioctl.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/fs/ceph/ioctl.c b/fs/ceph/ioctl.c
index e861de3c79b9e..15cde055f3da1 100644
--- a/fs/ceph/ioctl.c
+++ b/fs/ceph/ioctl.c
@@ -246,21 +246,28 @@ static long ceph_ioctl_lazyio(struct file *file)
struct ceph_inode_info *ci = ceph_inode(inode);
struct ceph_mds_client *mdsc = ceph_inode_to_fs_client(inode)->mdsc;
struct ceph_client *cl = mdsc->fsc->client;
+ bool is_file_already_lazy = false;
+ spin_lock(&ci->i_ceph_lock);
if ((fi->fmode & CEPH_FILE_MODE_LAZY) == 0) {
- spin_lock(&ci->i_ceph_lock);
fi->fmode |= CEPH_FILE_MODE_LAZY;
ci->i_nr_by_mode[ffs(CEPH_FILE_MODE_LAZY)]++;
__ceph_touch_fmode(ci, mdsc, fi->fmode);
- spin_unlock(&ci->i_ceph_lock);
+ } else {
+ is_file_already_lazy = true;
+ }
+ spin_unlock(&ci->i_ceph_lock);
+
+ if (is_file_already_lazy) {
+ doutc(cl, "file %p %p %llx.%llx already lazy\n", file, inode,
+ ceph_vinop(inode));
+ } else {
doutc(cl, "file %p %p %llx.%llx marked lazy\n", file, inode,
ceph_vinop(inode));
ceph_check_caps(ci, 0);
- } else {
- doutc(cl, "file %p %p %llx.%llx already lazy\n", file, inode,
- ceph_vinop(inode));
}
+
return 0;
}
--
2.51.0
next prev parent reply other threads:[~2025-10-26 14:51 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-26 14:48 [PATCH AUTOSEL 6.17-5.4] ACPI: property: Return present device nodes only on fwnode interface Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-5.4] ceph: add checking of wait_for_completion_killable() return value Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-5.4] 9p: sysfs_init: don't hardcode error to ENOMEM Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-5.10] um: Fix help message for ssl-non-raw Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17] clk: thead: th1520-ap: set all AXI clocks to CLK_IS_CRITICAL Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-6.1] NTB: epf: Allow arbitrary BAR mapping Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17] rtc: zynqmp: Restore alarm functionality after kexec transition Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17] hyperv: Add missing field to hv_output_map_device_interrupt Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-5.4] fbdev: Add bounds checking in bit_putcs to fix vmalloc-out-of-bounds Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17] fbdev: core: Fix ubsan warning in pixel_to_pat Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-5.10] ASoC: meson: aiu-encoder-i2s: fix bit clock polarity Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-5.4] fs/hpfs: Fix error code for new_inode() failure in mkdir/create/mknod/symlink Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17] drm/amdgpu: Report individual reset error Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-5.15] clk: ti: am33xx: keep WKUP_DEBUGSS_CLKCTRL enabled Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17] clk: at91: add ACR in all PLL settings Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-6.12] clk: scmi: Add duty cycle ops only when duty cycle is supported Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-5.10] ARM: at91: pm: save and restore ACR during PLL disable/enable Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-6.6] rtc: pcf2127: fix watchdog interrupt mask on pcf2131 Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-5.15] clk: at91: clk-master: Add check for divide by 3 Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-5.15] rtc: pcf2127: clear minute/second interrupt Sasha Levin
2025-10-26 14:48 ` [PATCH AUTOSEL 6.17-6.12] clk: at91: sam9x7: Add peripheral clock id for pmecc Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-5.4] 9p: fix /sys/fs/9p/caches overwriting itself Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.12] 9p/trans_fd: p9_fd_request: kick rx thread if EPOLLIN Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17] clk: samsung: exynos990: Add missing USB clock registers to HSI0 Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17] clocksource: hyper-v: Skip unnecessary checks for the root partition Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.12] ceph: fix multifs mds auth caps issue Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.12] LoongArch: Handle new atomic instructions for probes Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.6] ceph: refactor wake_up_bit() pattern of calling Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.12] drm/amdkfd: Fix mmap write lock not release Sasha Levin
2025-10-26 14:49 ` Sasha Levin [this message]
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.12] clk: qcom: gcc-ipq6018: rework nss_port5 clock to multiple conf Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.1] clk: at91: clk-sam9x60-pll: force write to PLL_UPDT register Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-5.4] tools bitmap: Add missing asm-generic/bitsperlong.h include Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17] ALSA: hda/realtek: Add quirk for ASUS ROG Zephyrus Duo Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.1] kbuild: uapi: Strip comments before size type check Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.1] tools: lib: thermal: don't preserve owner in install Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.1] scsi: ufs: core: Include UTP error in INT_FATAL_ERRORS Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.1] clk: sunxi-ng: sun6i-rtc: Add A523 specifics Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.12] clk: scmi: migrate round_rate() to determine_rate() Sasha Levin
2025-10-26 23:16 ` Brian Masney
2025-10-28 17:47 ` Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.12] clk: clocking-wizard: Fix output clock register offset for Versal platforms Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17] ASoC: rt722: add settings for rt722VB Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-5.15] cpufreq: tegra186: Initialize all cores to max frequencies Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.1] tools: lib: thermal: use pkg-config to locate libnl3 Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17] clk: renesas: rzv2h: Re-assert reset on deassert timeout Sasha Levin
2025-10-26 14:49 ` [PATCH AUTOSEL 6.17-6.12] net: wwan: t7xx: add support for HP DRMR-H01 Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251026144958.26750-30-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=Slava.Dubeyko@ibm.com \
--cc=amarkuze@redhat.com \
--cc=ceph-devel@vger.kernel.org \
--cc=idryomov@gmail.com \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
--cc=xiubli@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).