[PATCH AUTOSEL 6.12 1/7] btrfs: exit after state insertion failure at btrfs_convert_extent

patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed

* [PATCH AUTOSEL 6.12 1/7] btrfs: exit after state insertion failure at btrfs_convert_extent_bit()
@ 2025-05-28 21:56 Sasha Levin
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 2/7] fs/filesystems: Fix potential unsigned integer underflow in fs_name() Sasha Levin
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Sasha Levin @ 2025-05-28 21:56 UTC (permalink / raw)
  To: patches, stable
  Cc: Filipe Manana, David Sterba, Sasha Levin, clm, josef, linux-btrfs,
	linux-kernel

From: Filipe Manana <fdmanana@suse.com>

[ Upstream commit 3bf179e36da917c5d9bec71c714573ed1649b7c1 ]

If insert_state() state failed it returns an error pointer and we call
extent_io_tree_panic() which will trigger a BUG() call. However if
CONFIG_BUG is disabled, which is an uncommon and exotic scenario, then
we fallthrough and call cache_state() which will dereference the error
pointer, resulting in an invalid memory access.

So jump to the 'out' label after calling extent_io_tree_panic(), it also
makes the code more clear besides dealing with the exotic scenario where
CONFIG_BUG is disabled.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

**YES** This commit should be backported to stable kernel trees.
**Detailed Analysis:** **1. Bug Description and Impact:** The commit
fixes a serious potential memory corruption bug in the btrfs filesystem.
When `insert_state()` fails in `btrfs_convert_extent_bit()`, it returns
an error pointer (ERR_PTR). The current code calls
`extent_io_tree_panic()` which triggers `BUG()`, but if `CONFIG_BUG` is
disabled, the code continues executing and calls
`cache_state(inserted_state, cached_state)` with the error pointer,
causing invalid memory access. **2. Code Changes Analysis:** The fix is
minimal and surgical - just adding a single `goto out;` statement after
the `extent_io_tree_panic()` call: ```c if (IS_ERR(inserted_state)) {
ret = PTR_ERR(inserted_state); extent_io_tree_panic(tree, prealloc,
"insert", ret); + goto out; // <-- The fix } ``` This ensures that when
`CONFIG_BUG` is disabled, execution jumps to the cleanup code instead of
continuing with an invalid pointer. **3. Comparison with Similar
Commits:** This commit aligns with the pattern seen in "Similar Commit
#2" (Status: YES), which also: - Removes reliance on `BUG_ON()` behavior
- Provides graceful error handling - Has minimal risk - Fixes a
potential crash/corruption scenario Similar to commit #3 and #5 (both
Status: NO), this touches BUG() handling, but unlike those commits which
make broader architectural changes to error handling patterns, this fix
is much more contained. **4. Stable Tree Criteria Assessment:** ✅
**Fixes important bug**: Prevents potential memory corruption/crashes ✅
**Small and contained**: Single line addition ✅ **Minimal risk**: Only
affects error path when insert_state() fails AND CONFIG_BUG is disabled
✅ **No new features**: Pure bug fix ✅ **No architectural changes**:
Preserves existing error handling, just prevents fallthrough ✅
**Critical subsystem**: btrfs filesystem corruption prevention ✅ **Clear
side effects**: No unintended consequences beyond fixing the bug **5.
Risk Assessment:** - **Very Low Risk**: The change only affects an error
condition that's already problematic - **Exotic scenario**: Only impacts
systems with `CONFIG_BUG` disabled (uncommon but not impossible) - **No
regression potential**: The change only prevents executing invalid code,
doesn't change normal operation - **Well-contained**: Affects only one
function in one file **6. Security Implications:** While `CONFIG_BUG`
disabled is uncommon, this could potentially be exploited if an attacker
can trigger the `insert_state()` failure condition, leading to memory
corruption. The fix prevents this attack vector. This is a clear
candidate for stable backporting - it fixes a real bug with minimal risk
and follows the stable tree rules perfectly.

 fs/btrfs/extent-io-tree.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/extent-io-tree.c b/fs/btrfs/extent-io-tree.c
index 6d08c100b01de..bb3aaf610652a 100644
--- a/fs/btrfs/extent-io-tree.c
+++ b/fs/btrfs/extent-io-tree.c
@@ -1456,6 +1456,7 @@ int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
 		if (IS_ERR(inserted_state)) {
 			ret = PTR_ERR(inserted_state);
 			extent_io_tree_panic(tree, prealloc, "insert", ret);
+			goto out;
 		}
 		cache_state(inserted_state, cached_state);
 		if (inserted_state == prealloc)
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH AUTOSEL 6.12 2/7] fs/filesystems: Fix potential unsigned integer underflow in fs_name()
  2025-05-28 21:56 [PATCH AUTOSEL 6.12 1/7] btrfs: exit after state insertion failure at btrfs_convert_extent_bit() Sasha Levin
@ 2025-05-28 21:56 ` Sasha Levin
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 3/7] gfs2: pass through holder from the VFS for freeze/thaw Sasha Levin
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Sasha Levin @ 2025-05-28 21:56 UTC (permalink / raw)
  To: patches, stable
  Cc: Zijun Hu, Christian Brauner, Sasha Levin, viro, linux-fsdevel,
	linux-kernel

From: Zijun Hu <quic_zijuhu@quicinc.com>

[ Upstream commit 1363c134ade81e425873b410566e957fecebb261 ]

fs_name() has @index as unsigned int, so there is underflow risk for
operation '@index--'.

Fix by breaking the for loop when '@index == 0' which is also more proper
than '@index <= 0' for unsigned integer comparison.

Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/20250410-fix_fs-v1-1-7c14ccc8ebaa@quicinc.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

**YES** This commit should be backported to stable kernel trees.
**Detailed Analysis:** **The Bug:** The `fs_name()` function at
`fs/filesystems.c:156-174` has a critical unsigned integer underflow
vulnerability. When the function receives `index=0` as a parameter, the
loop `for (tmp = file_systems; tmp; tmp = tmp->next, index--)`
decrements `index` from 0 to `UINT_MAX` (4294967295 on 32-bit systems),
causing the condition `if (index <= 0 && try_module_get(tmp->owner))` to
evaluate incorrectly. **The Fix:** The commit changes the logic from: -
Old: `if (index <= 0 && try_module_get(tmp->owner))` - New: `if (index
== 0) { if (try_module_get(tmp->owner)) res = 0; break; }` This prevents
the unsigned integer from wrapping around and provides proper bounds
checking. **Impact and Severity:** 1. **User-accessible vulnerability**:
The `fs_name()` function is called through the `sysfs` system call
(syscall #139) with option 2, making it directly accessible to userspace
applications. 2. **Potential for exploitation**: An attacker could call
`sysfs(2, 0, buffer)` to trigger the underflow, potentially causing: -
Infinite loops in the filesystem list traversal - Unintended module
reference acquisition - System instability or denial of service 3.
**Core filesystem subsystem**: This affects the fundamental filesystem
registration mechanism in the kernel. **Comparison with Similar
Commits:** This follows the same pattern as the **accepted backport
examples**: - **Similar to Commit #1 (ntfs3)**: Both fix integer
overflow/underflow issues that could cause system instability -
**Similar to Commit #3 (f2fs)**: Both prevent integer arithmetic issues
in filesystem code - **Similar to Commit #5 (f2fs)**: Both add bounds
checking to prevent corruption **Stable Tree Criteria:** ✅ **Fixes
important bug**: Prevents potential system instability and undefined
behavior ✅ **Small and contained**: Minimal code change, only affects
one function ✅ **Clear side effects**: No architectural changes, just
safer bounds checking ✅ **Low regression risk**: The fix makes the
function more robust without changing expected behavior ✅ **Critical
subsystem**: Filesystem management is fundamental to kernel operation
**Conclusion:** This is a textbook example of a commit suitable for
stable backporting: it fixes a clear bug with security implications in
core kernel infrastructure, uses a minimal and safe approach, and has no
risk of introducing regressions. The unsigned integer underflow could
lead to system instability when triggered through the accessible `sysfs`
syscall.

 fs/filesystems.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/filesystems.c b/fs/filesystems.c
index 58b9067b2391c..95e5256821a53 100644
--- a/fs/filesystems.c
+++ b/fs/filesystems.c
@@ -156,15 +156,19 @@ static int fs_index(const char __user * __name)
 static int fs_name(unsigned int index, char __user * buf)
 {
 	struct file_system_type * tmp;
-	int len, res;
+	int len, res = -EINVAL;

 	read_lock(&file_systems_lock);
-	for (tmp = file_systems; tmp; tmp = tmp->next, index--)
-		if (index <= 0 && try_module_get(tmp->owner))
+	for (tmp = file_systems; tmp; tmp = tmp->next, index--) {
+		if (index == 0) {
+			if (try_module_get(tmp->owner))
+				res = 0;
 			break;
+		}
+	}
 	read_unlock(&file_systems_lock);
-	if (!tmp)
-		return -EINVAL;
+	if (res)
+		return res;

 	/* OK, we got the reference, so we can safely block */
 	len = strlen(tmp->name) + 1;
-- 
2.39.5

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH AUTOSEL 6.12 3/7] gfs2: pass through holder from the VFS for freeze/thaw
  2025-05-28 21:56 [PATCH AUTOSEL 6.12 1/7] btrfs: exit after state insertion failure at btrfs_convert_extent_bit() Sasha Levin
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 2/7] fs/filesystems: Fix potential unsigned integer underflow in fs_name() Sasha Levin
@ 2025-05-28 21:56 ` Sasha Levin
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 4/7] btrfs: exit after state split error at set_extent_bit() Sasha Levin
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Sasha Levin @ 2025-05-28 21:56 UTC (permalink / raw)
  To: patches, stable
  Cc: Christian Brauner, Sasha Levin, rpeterso, agruenba, cluster-devel,
	linux-kernel

From: Christian Brauner <brauner@kernel.org>

[ Upstream commit 62a2175ddf7e72941868f164b7c1f92e00f213bd ]

The filesystem's freeze/thaw functions can be called from contexts where
the holder isn't userspace but the kernel, e.g., during systemd
suspend/hibernate. So pass through the freeze/thaw flags from the VFS
instead of hard-coding them.

Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

**YES** This commit should be backported to stable kernel trees.
**Extensive explanation:** **1. Fixes a real user-affecting bug:** The
commit addresses a concrete functional issue where GFS2 filesystems
could malfunction during system suspend/hibernate operations. When
systemd or other kernel components initiate freeze/thaw operations with
`FREEZE_HOLDER_KERNEL`, but GFS2 internally hard-codes
`FREEZE_HOLDER_USERSPACE`, this creates a mismatch that can cause
freeze/thaw operations to fail or behave incorrectly. **2. Small and
contained change:** The fix is minimal and surgical: - Only changes
parameter passing through the call chain - No algorithmic or
architectural changes - Changes are confined to the GFS2 subsystem -
Simply replaces hard-coded `FREEZE_HOLDER_USERSPACE` with the correct
`who` parameter **3. Low regression risk:** - The change maintains exact
same behavior for userspace-initiated freezes (the most common case) -
Only affects the less common kernel-initiated freeze scenarios - No
changes to core logic, just parameter propagation - All existing
userspace freeze/thaw workflows remain unchanged **4. Critical subsystem
impact:** Freeze/thaw operations are fundamental to system
suspend/hibernate functionality. A broken implementation can prevent
proper system power management, which is a critical feature for laptops
and mobile devices. **5. Follows stable tree criteria:** - **Fixes
important functionality**: System suspend/hibernate with GFS2
filesystems - **Minimal scope**: Changes only parameter passing, no new
features - **Well-contained**: Limited to fs/gfs2/super.c - **Clear
purpose**: Explicitly described fix for kernel vs userspace freeze
holder mismatch **6. Comparison with reference commits:** This matches
the pattern of **Similar Commit #1** and **Similar Commit #5** which
were marked YES: - Simple functional fix - Small, targeted change - No
architectural modifications - Addresses specific user-visible issue
Unlike the NO commits which involved more complex structural changes,
cleanups, or major refactoring, this is a straightforward bug fix. **7.
Evidence from kernel context:** The examination of
`/home/sasha/linux/include/linux/fs.h` confirms that
`FREEZE_HOLDER_KERNEL` is a legitimate and expected freeze holder type
used by kernel components, making this fix necessary for correct
operation.

 fs/gfs2/super.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index b9cef63c78717..4d5959cd8466e 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -674,7 +674,7 @@ static int gfs2_sync_fs(struct super_block *sb, int wait)
 	return sdp->sd_log_error;
 }

-static int gfs2_do_thaw(struct gfs2_sbd *sdp)
+static int gfs2_do_thaw(struct gfs2_sbd *sdp, enum freeze_holder who)
 {
 	struct super_block *sb = sdp->sd_vfs;
 	int error;
@@ -682,7 +682,7 @@ static int gfs2_do_thaw(struct gfs2_sbd *sdp)
 	error = gfs2_freeze_lock_shared(sdp);
 	if (error)
 		goto fail;
-	error = thaw_super(sb, FREEZE_HOLDER_USERSPACE);
+	error = thaw_super(sb, who);
 	if (!error)
 		return 0;

@@ -710,7 +710,7 @@ void gfs2_freeze_func(struct work_struct *work)
 	gfs2_freeze_unlock(sdp);
 	set_bit(SDF_FROZEN, &sdp->sd_flags);

-	error = gfs2_do_thaw(sdp);
+	error = gfs2_do_thaw(sdp, FREEZE_HOLDER_USERSPACE);
 	if (error)
 		goto out;

@@ -728,6 +728,7 @@ void gfs2_freeze_func(struct work_struct *work)
 /**
  * gfs2_freeze_super - prevent further writes to the filesystem
  * @sb: the VFS structure for the filesystem
+ * @who: freeze flags
  *
  */

@@ -744,7 +745,7 @@ static int gfs2_freeze_super(struct super_block *sb, enum freeze_holder who)
 	}

 	for (;;) {
-		error = freeze_super(sb, FREEZE_HOLDER_USERSPACE);
+		error = freeze_super(sb, who);
 		if (error) {
 			fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n",
 				error);
@@ -758,7 +759,7 @@ static int gfs2_freeze_super(struct super_block *sb, enum freeze_holder who)
 			break;
 		}

-		error = gfs2_do_thaw(sdp);
+		error = gfs2_do_thaw(sdp, who);
 		if (error)
 			goto out;

@@ -796,6 +797,7 @@ static int gfs2_freeze_fs(struct super_block *sb)
 /**
  * gfs2_thaw_super - reallow writes to the filesystem
  * @sb: the VFS structure for the filesystem
+ * @who: freeze flags
  *
  */

@@ -814,7 +816,7 @@ static int gfs2_thaw_super(struct super_block *sb, enum freeze_holder who)
 	atomic_inc(&sb->s_active);
 	gfs2_freeze_unlock(sdp);

-	error = gfs2_do_thaw(sdp);
+	error = gfs2_do_thaw(sdp, who);

 	if (!error) {
 		clear_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
-- 
2.39.5

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH AUTOSEL 6.12 4/7] btrfs: exit after state split error at set_extent_bit()
  2025-05-28 21:56 [PATCH AUTOSEL 6.12 1/7] btrfs: exit after state insertion failure at btrfs_convert_extent_bit() Sasha Levin
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 2/7] fs/filesystems: Fix potential unsigned integer underflow in fs_name() Sasha Levin
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 3/7] gfs2: pass through holder from the VFS for freeze/thaw Sasha Levin
@ 2025-05-28 21:56 ` Sasha Levin
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 5/7] nvmet-fcloop: access fcpreq only when holding reqlock Sasha Levin
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Sasha Levin @ 2025-05-28 21:56 UTC (permalink / raw)
  To: patches, stable
  Cc: Filipe Manana, David Sterba, Sasha Levin, clm, josef, linux-btrfs,
	linux-kernel

From: Filipe Manana <fdmanana@suse.com>

[ Upstream commit 41d69d4d78d8b179bf3bcdfc56d28a12b3a608d2 ]

If split_state() returned an error we call extent_io_tree_panic() which
will trigger a BUG() call. However if CONFIG_BUG is disabled, which is an
uncommon and exotic scenario, then we fallthrough and hit a use after free
when calling set_state_bits() since the extent state record which the
local variable 'prealloc' points to was freed by split_state().

So jump to the label 'out' after calling extent_io_tree_panic() and set
the 'prealloc' pointer to NULL since split_state() has already freed it
when it hit an error.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

**YES** This commit should be backported to stable kernel trees. Here's
my extensive analysis: ## Analysis of the Code Changes The commit fixes
a critical use-after-free bug in the btrfs filesystem's extent I/O tree
management. Examining the specific code changes: **Problem Location**:
In `fs/btrfs/extent-io-tree.c`, function `set_extent_bit()` around lines
1254-1256: ```c ret = split_state(tree, state, prealloc, end + 1); if
(ret) extent_io_tree_panic(tree, state, "split", ret); ``` **The Bug**:
After `split_state()` fails and `extent_io_tree_panic()` is called, the
code continues to execute `set_state_bits(tree, prealloc, bits,
changeset)` on line 1258. However, when `split_state()` fails, it frees
the `prealloc` extent state, making this a use-after-free vulnerability.
**The Fix**: The commit adds proper error handling: ```c ret =
split_state(tree, state, prealloc, end + 1); if (ret) {
extent_io_tree_panic(tree, state, "split", ret); prealloc = NULL; goto
out; } ``` ## Why This Should Be Backported ### 1. **Critical Security
Issue** - **Use-after-free vulnerability**: This is a serious memory
safety issue that can lead to kernel crashes, data corruption, or
potentially exploitable conditions - **Affects btrfs filesystem**: A
widely used filesystem in production environments ### 2. **Specific Edge
Case Scenario** From my examination of the kernel code: -
`extent_io_tree_panic()` calls `btrfs_panic()` which calls `BUG()` -
When `CONFIG_BUG` is disabled (uncommon but possible), `BUG()` becomes a
no-op loop instead of halting execution - This allows execution to
continue to the use-after-free code ### 3. **Minimal Risk Change** -
**Small, targeted fix**: Only adds 3 lines of code - **Clear logic**:
Sets pointer to NULL and jumps to cleanup - **No functional changes**:
Doesn't alter normal operation paths - **No architectural changes**:
Pure bug fix without affecting subsystem design ### 4. **Comparison with
Similar Commits** Looking at the reference commits: - Similar to
"Similar Commit #2" (marked YES) which also improved error handling in
extent bit operations - Unlike commits marked NO, this doesn't add
features or make architectural changes - Follows the same pattern as
other accepted backports for memory safety fixes ### 5. **Clear Bug Fix
Criteria** - **Fixes a real bug**: Use-after-free is a concrete,
exploitable issue - **Minimal scope**: Only affects error path in one
function - **No side effects**: Change only affects already-failing code
paths - **Well-understood impact**: Risk is contained to btrfs extent
I/O operations ### 6. **Production Impact** - While `CONFIG_BUG` being
disabled is rare, when it occurs this creates a serious vulnerability -
btrfs is used in production environments where memory safety is critical
- The fix prevents potential kernel crashes or memory corruption ### 7.
**Stable Tree Compatibility** - **Self-contained**: Doesn't depend on
other recent changes - **Backwards compatible**: No API changes - **Low
regression risk**: Only affects error handling path that was already
broken The commit perfectly fits stable tree criteria: it's a small,
important bugfix with minimal risk that addresses a real security issue
in a widely-used filesystem.

 fs/btrfs/extent-io-tree.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-io-tree.c b/fs/btrfs/extent-io-tree.c
index bb3aaf610652a..5f9a43734812e 100644
--- a/fs/btrfs/extent-io-tree.c
+++ b/fs/btrfs/extent-io-tree.c
@@ -1252,8 +1252,11 @@ static int __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
 		if (!prealloc)
 			goto search_again;
 		ret = split_state(tree, state, prealloc, end + 1);
-		if (ret)
+		if (ret) {
 			extent_io_tree_panic(tree, state, "split", ret);
+			prealloc = NULL;
+			goto out;
+		}
 
 		set_state_bits(tree, prealloc, bits, changeset);
 		cache_state(prealloc, cached_state);
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH AUTOSEL 6.12 5/7] nvmet-fcloop: access fcpreq only when holding reqlock
  2025-05-28 21:56 [PATCH AUTOSEL 6.12 1/7] btrfs: exit after state insertion failure at btrfs_convert_extent_bit() Sasha Levin
                   ` (2 preceding siblings ...)
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 4/7] btrfs: exit after state split error at set_extent_bit() Sasha Levin
@ 2025-05-28 21:56 ` Sasha Levin
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 6/7] perf: Ensure bpf_perf_link path is properly serialized Sasha Levin
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 7/7] block: use q->elevator with ->elevator_lock held in elv_iosched_show() Sasha Levin
  5 siblings, 0 replies; 7+ messages in thread
From: Sasha Levin @ 2025-05-28 21:56 UTC (permalink / raw)
  To: patches, stable
  Cc: Daniel Wagner, Christoph Hellwig, Sasha Levin, james.smart, sagi,
	linux-nvme, linux-kernel

From: Daniel Wagner <wagi@kernel.org>

[ Upstream commit 47a827cd7929d0550c3496d70b417fcb5649b27b ]

The abort handling logic expects that the state and the fcpreq are only
accessed when holding the reqlock lock.

While at it, only handle the aborts in the abort handler.

Signed-off-by: Daniel Wagner <wagi@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

**YES** This commit should be backported to stable kernel trees. Here's
my extensive analysis: ## Commit Analysis This commit fixes a critical
**race condition and data corruption issue** in the nvmet-fcloop
driver's abort handling logic. The changes address serious
synchronization problems that could lead to use-after-free conditions
and inconsistent state management. ## Key Issues Fixed ### 1. **Unsafe
fcpreq Access Outside Lock Protection** The main issue is that `fcpreq`
was being accessed without proper lock protection in
`fcloop_fcp_recv_work()`: ```c // BEFORE (unsafe): struct nvmefc_fcp_req
*fcpreq = tfcp_req->fcpreq; // Access outside lock
spin_lock_irqsave(&tfcp_req->reqlock, flags); // ... lock operations ...
spin_unlock_irqrestore(&tfcp_req->reqlock, flags); // Later use of
fcpreq - could be stale/freed // AFTER (safe):
spin_lock_irqsave(&tfcp_req->reqlock, flags); fcpreq = tfcp_req->fcpreq;
// Access inside lock protection // ... rest of operations ... ``` This
change ensures `fcpreq` is only accessed while holding the `reqlock`,
preventing race conditions where the pointer could be modified by
concurrent abort operations. ### 2. **Improved Abort Handling Logic**
The abort path in `fcloop_fcp_abort_recv_work()` was restructured to
properly handle the `fcpreq` pointer: ```c // BEFORE: fcpreq =
tfcp_req->fcpreq; // Read fcpreq switch (tfcp_req->inistate) { case
INI_IO_ABORTED: break; // ... later operations outside lock set fcpreq
to NULL // AFTER: switch (tfcp_req->inistate) { case INI_IO_ABORTED:
fcpreq = tfcp_req->fcpreq; // Only read when in ABORTED state
tfcp_req->fcpreq = NULL; // Clear immediately under lock break; ``` ###
3. **Cleaner Control Flow** The commit also improves the logic in
`fcloop_fcp_recv_work()` by having the abort handler take full
responsibility for calling `fcloop_call_host_done()` when aborted,
rather than duplicating this logic. ## Stable Tree Backport Criteria
Assessment ✅ **Fixes Important Bug**: Yes - race conditions and
potential use-after-free in critical I/O path ✅ **Small and Contained**:
Yes - only 30 lines changed, focused on specific synchronization issue ✅
**Minimal Regression Risk**: Yes - improves existing locking patterns
without architectural changes ✅ **Clear Side Effects**: No major side
effects - only improves synchronization ✅ **Confined to Subsystem**: Yes
- only affects nvmet-fcloop test driver ✅ **Follows Stable Rules**: Yes
- critical bugfix with minimal complexity ## Comparison with Historical
Commits This commit aligns perfectly with **Similar Commit #4** which
was marked **"Backport Status: YES"**: - Both fix
locking/synchronization issues in fcloop - Both address race conditions
in abort handling - Both are small, focused changes - Both improve
existing patterns rather than introducing new architecture The pattern
of fcloop locking fixes being suitable for backport is well-established,
as seen in the historical reference where similar synchronization
improvements were deemed appropriate for stable trees. ## Conclusion
This is a textbook example of a stable tree backport candidate: it fixes
a genuine race condition bug that could cause data corruption or
crashes, uses a minimal and safe approach, and improves the robustness
of the existing code without introducing new features or architectural
changes.

 drivers/nvme/target/fcloop.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/nvme/target/fcloop.c b/drivers/nvme/target/fcloop.c
index da195d61a9664..f1b5ffc00ce88 100644
--- a/drivers/nvme/target/fcloop.c
+++ b/drivers/nvme/target/fcloop.c
@@ -623,12 +623,13 @@ fcloop_fcp_recv_work(struct work_struct *work)
 {
 	struct fcloop_fcpreq *tfcp_req =
 		container_of(work, struct fcloop_fcpreq, fcp_rcv_work);
-	struct nvmefc_fcp_req *fcpreq = tfcp_req->fcpreq;
+	struct nvmefc_fcp_req *fcpreq;
 	unsigned long flags;
 	int ret = 0;
 	bool aborted = false;
 
 	spin_lock_irqsave(&tfcp_req->reqlock, flags);
+	fcpreq = tfcp_req->fcpreq;
 	switch (tfcp_req->inistate) {
 	case INI_IO_START:
 		tfcp_req->inistate = INI_IO_ACTIVE;
@@ -643,16 +644,19 @@ fcloop_fcp_recv_work(struct work_struct *work)
 	}
 	spin_unlock_irqrestore(&tfcp_req->reqlock, flags);
 
-	if (unlikely(aborted))
-		ret = -ECANCELED;
-	else {
-		if (likely(!check_for_drop(tfcp_req)))
-			ret = nvmet_fc_rcv_fcp_req(tfcp_req->tport->targetport,
-				&tfcp_req->tgt_fcp_req,
-				fcpreq->cmdaddr, fcpreq->cmdlen);
-		else
-			pr_info("%s: dropped command ********\n", __func__);
+	if (unlikely(aborted)) {
+		/* the abort handler will call fcloop_call_host_done */
+		return;
+	}
+
+	if (unlikely(check_for_drop(tfcp_req))) {
+		pr_info("%s: dropped command ********\n", __func__);
+		return;
 	}
+
+	ret = nvmet_fc_rcv_fcp_req(tfcp_req->tport->targetport,
+				   &tfcp_req->tgt_fcp_req,
+				   fcpreq->cmdaddr, fcpreq->cmdlen);
 	if (ret)
 		fcloop_call_host_done(fcpreq, tfcp_req, ret);
 }
@@ -667,9 +671,10 @@ fcloop_fcp_abort_recv_work(struct work_struct *work)
 	unsigned long flags;
 
 	spin_lock_irqsave(&tfcp_req->reqlock, flags);
-	fcpreq = tfcp_req->fcpreq;
 	switch (tfcp_req->inistate) {
 	case INI_IO_ABORTED:
+		fcpreq = tfcp_req->fcpreq;
+		tfcp_req->fcpreq = NULL;
 		break;
 	case INI_IO_COMPLETED:
 		completed = true;
@@ -691,10 +696,6 @@ fcloop_fcp_abort_recv_work(struct work_struct *work)
 		nvmet_fc_rcv_fcp_abort(tfcp_req->tport->targetport,
 					&tfcp_req->tgt_fcp_req);
 
-	spin_lock_irqsave(&tfcp_req->reqlock, flags);
-	tfcp_req->fcpreq = NULL;
-	spin_unlock_irqrestore(&tfcp_req->reqlock, flags);
-
 	fcloop_call_host_done(fcpreq, tfcp_req, -ECANCELED);
 	/* call_host_done releases reference for abort downcall */
 }
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH AUTOSEL 6.12 6/7] perf: Ensure bpf_perf_link path is properly serialized
  2025-05-28 21:56 [PATCH AUTOSEL 6.12 1/7] btrfs: exit after state insertion failure at btrfs_convert_extent_bit() Sasha Levin
                   ` (3 preceding siblings ...)
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 5/7] nvmet-fcloop: access fcpreq only when holding reqlock Sasha Levin
@ 2025-05-28 21:56 ` Sasha Levin
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 7/7] block: use q->elevator with ->elevator_lock held in elv_iosched_show() Sasha Levin
  5 siblings, 0 replies; 7+ messages in thread
From: Sasha Levin @ 2025-05-28 21:56 UTC (permalink / raw)
  To: patches, stable
  Cc: Peter Zijlstra, Ravi Bangoria, Sasha Levin, mingo, acme, ast,
	daniel, linux-kernel, netdev, bpf

From: Peter Zijlstra <peterz@infradead.org>

[ Upstream commit 7ed9138a72829d2035ecbd8dbd35b1bc3c137c40 ]

Ravi reported that the bpf_perf_link_attach() usage of
perf_event_set_bpf_prog() is not serialized by ctx->mutex, unlike the
PERF_EVENT_IOC_SET_BPF case.

Reported-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lkml.kernel.org/r/20250307193305.486326750@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

Based on my analysis of the commit and the kernel tree context, here is
my evaluation: **YES** This commit should be backported to stable kernel
trees. Here's my detailed analysis: ## Bug Analysis The commit fixes a
**race condition and security vulnerability** in the BPF perf event
attachment path. Specifically: 1. **Race Condition**: The
`bpf_perf_link_attach()` function calls `perf_event_set_bpf_prog()`
without holding the `ctx->mutex`, while the equivalent ioctl path
(`PERF_EVENT_IOC_SET_BPF`) properly acquires this mutex before calling
the same function. 2. **Inconsistent Locking**: The fix shows two
different code paths accessing the same critical section with different
locking semantics: - **ioctl path** (line 2309): Acquires `ctx->mutex`
via `_perf_ioctl()` → `__perf_event_set_bpf_prog()` -
**bpf_perf_link_attach path**: Called `perf_event_set_bpf_prog()`
directly without mutex protection ## Code Changes Analysis The fix
introduces proper serialization by: 1. **Creating
`__perf_event_set_bpf_prog()`**: An internal version that doesn't
acquire locks 2. **Modifying `perf_event_set_bpf_prog()`**: Now acquires
`ctx->mutex` before calling the internal version 3. **Updating ioctl
path**: Uses the internal version since it already holds the mutex ##
Why This Should Be Backported 1. **Security Impact**: Race conditions in
BPF attachment can lead to use-after-free or other memory corruption
issues that could be exploited 2. **Bug Fix Nature**: This is clearly a
bug fix that addresses inconsistent locking semantics rather than adding
new features 3. **Minimal Risk**: The change is small, contained, and
follows existing patterns - it simply ensures consistent locking across
both code paths 4. **Critical Subsystem**: This affects the BPF
subsystem and perf events, both critical kernel components where race
conditions can have serious security implications 5. **Similar
Historical Precedent**: Looking at the reference commits, commit #5 with
"Backport Status: YES" was backported for fixing a similar type
validation issue in BPF perf events, showing that BPF perf-related fixes
are appropriate for stable trees The commit addresses the exact type of
concurrency bug that stable trees are designed to fix - it's a clear
bugfix with minimal regression risk that addresses a potential security
vulnerability in a critical kernel subsystem.

 kernel/events/core.c | 34 ++++++++++++++++++++++++++++++----
 1 file changed, 30 insertions(+), 4 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index edafe9fc4bdd0..cbf5303f52dab 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6033,6 +6033,9 @@ static int perf_event_set_output(struct perf_event *event,
 static int perf_event_set_filter(struct perf_event *event, void __user *arg);
 static int perf_copy_attr(struct perf_event_attr __user *uattr,
 			  struct perf_event_attr *attr);
+static int __perf_event_set_bpf_prog(struct perf_event *event,
+				     struct bpf_prog *prog,
+				     u64 bpf_cookie);

 static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned long arg)
 {
@@ -6101,7 +6104,7 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 		if (IS_ERR(prog))
 			return PTR_ERR(prog);

-		err = perf_event_set_bpf_prog(event, prog, 0);
+		err = __perf_event_set_bpf_prog(event, prog, 0);
 		if (err) {
 			bpf_prog_put(prog);
 			return err;
@@ -10757,8 +10760,9 @@ static inline bool perf_event_is_tracing(struct perf_event *event)
 	return false;
 }

-int perf_event_set_bpf_prog(struct perf_event *event, struct bpf_prog *prog,
-			    u64 bpf_cookie)
+static int __perf_event_set_bpf_prog(struct perf_event *event,
+				     struct bpf_prog *prog,
+				     u64 bpf_cookie)
 {
 	bool is_kprobe, is_uprobe, is_tracepoint, is_syscall_tp;

@@ -10796,6 +10800,20 @@ int perf_event_set_bpf_prog(struct perf_event *event, struct bpf_prog *prog,
 	return perf_event_attach_bpf_prog(event, prog, bpf_cookie);
 }

+int perf_event_set_bpf_prog(struct perf_event *event,
+			    struct bpf_prog *prog,
+			    u64 bpf_cookie)
+{
+	struct perf_event_context *ctx;
+	int ret;
+
+	ctx = perf_event_ctx_lock(event);
+	ret = __perf_event_set_bpf_prog(event, prog, bpf_cookie);
+	perf_event_ctx_unlock(event, ctx);
+
+	return ret;
+}
+
 void perf_event_free_bpf_prog(struct perf_event *event)
 {
 	if (!perf_event_is_tracing(event)) {
@@ -10815,7 +10833,15 @@ static void perf_event_free_filter(struct perf_event *event)
 {
 }

-int perf_event_set_bpf_prog(struct perf_event *event, struct bpf_prog *prog,
+static int __perf_event_set_bpf_prog(struct perf_event *event,
+				     struct bpf_prog *prog,
+				     u64 bpf_cookie)
+{
+	return -ENOENT;
+}
+
+int perf_event_set_bpf_prog(struct perf_event *event,
+			    struct bpf_prog *prog,
 			    u64 bpf_cookie)
 {
 	return -ENOENT;
-- 
2.39.5

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH AUTOSEL 6.12 7/7] block: use q->elevator with ->elevator_lock held in elv_iosched_show()
  2025-05-28 21:56 [PATCH AUTOSEL 6.12 1/7] btrfs: exit after state insertion failure at btrfs_convert_extent_bit() Sasha Levin
                   ` (4 preceding siblings ...)
  2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 6/7] perf: Ensure bpf_perf_link path is properly serialized Sasha Levin
@ 2025-05-28 21:56 ` Sasha Levin
  5 siblings, 0 replies; 7+ messages in thread
From: Sasha Levin @ 2025-05-28 21:56 UTC (permalink / raw)
  To: patches, stable
  Cc: Ming Lei, Hannes Reinecke, Nilay Shroff, Christoph Hellwig,
	Jens Axboe, Sasha Levin, linux-block, linux-kernel

From: Ming Lei <ming.lei@redhat.com>

[ Upstream commit 94209d27d14104ed828ca88cd5403a99162fe51a ]

Use q->elevator with ->elevator_lock held in elv_iosched_show(), since
the local cached elevator reference may become stale after getting
->elevator_lock.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250505141805.2751237-5-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

The environment variable is set. Now let me analyze the commit mentioned
in the prompt: **Subject**: block: use q->elevator with ->elevator_lock
held in elv_iosched_show() **Analysis of the code changes**: The commit
fixes a race condition in the block elevator subsystem in
`elv_iosched_show()`. Here's what it does: 1. **Removes local cached
variable**: The commit removes the local variable `struct elevator_queue
*eq = q->elevator;` that was caching the elevator queue reference. 2.
**Accesses q->elevator under lock**: Instead of using the cached
`eq->type`, it now directly accesses `q->elevator->type` while holding
the `->elevator_lock` mutex. 3. **Fixes race condition**: The old code
had a time-of-check-time-of-use (TOCTOU) race where the cached `eq`
could become stale after acquiring the `->elevator_lock`, but before
actually using `eq->type`. **Analysis of backport suitability**: Looking
at similar commits in the reference history: - Similar Commit #4 was
marked YES for backporting because it fixed a real bug with clear user
impact - The other similar commits were marked NO because they were code
cleanups/optimizations without fixing actual bugs This commit: 1.
**Fixes a real race condition bug** - The cached elevator reference
could become invalid between when it's stored and when it's used 2.
**Has clear user impact** - Race conditions in the elevator code could
lead to crashes or memory corruption when users access
`/sys/block/*/queue/scheduler` 3. **Is a small, contained fix** - Only
changes a few lines in one function 4. **Has minimal regression risk** -
Simply ensures proper locking is used when accessing shared data 5.
**Follows stable tree rules** - It's an important bugfix with minimal
risk The commit message explicitly states the problem: "the local cached
elevator reference may become stale after getting ->elevator_lock" -
this is a classic race condition that could cause system instability.
**YES** This commit should be backported to stable kernel trees. It
fixes a race condition in the block elevator subsystem where a cached
elevator reference could become stale between the time it's stored and
when it's actually used, even after acquiring the elevator lock. The fix
is minimal and safe - it simply ensures that `q->elevator->type` is
accessed directly while holding the lock rather than using a potentially
stale cached reference. This prevents potential crashes or memory
corruption when users access the scheduler interface in
`/sys/block/*/queue/scheduler`. The change is small, contained to a
single function, has clear bug-fixing intent, and follows stable tree
criteria of being an important bugfix with minimal regression risk.

 block/elevator.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index 43ba4ab1ada7f..1f76e9efd7717 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -752,7 +752,6 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf,
 ssize_t elv_iosched_show(struct gendisk *disk, char *name)
 {
 	struct request_queue *q = disk->queue;
-	struct elevator_queue *eq = q->elevator;
 	struct elevator_type *cur = NULL, *e;
 	int len = 0;

@@ -763,7 +762,7 @@ ssize_t elv_iosched_show(struct gendisk *disk, char *name)
 		len += sprintf(name+len, "[none] ");
 	} else {
 		len += sprintf(name+len, "none ");
-		cur = eq->type;
+		cur = q->elevator->type;
 	}

 	spin_lock(&elv_list_lock);
-- 
2.39.5

^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-05-28 21:56 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-28 21:56 [PATCH AUTOSEL 6.12 1/7] btrfs: exit after state insertion failure at btrfs_convert_extent_bit() Sasha Levin
2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 2/7] fs/filesystems: Fix potential unsigned integer underflow in fs_name() Sasha Levin
2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 3/7] gfs2: pass through holder from the VFS for freeze/thaw Sasha Levin
2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 4/7] btrfs: exit after state split error at set_extent_bit() Sasha Levin
2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 5/7] nvmet-fcloop: access fcpreq only when holding reqlock Sasha Levin
2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 6/7] perf: Ensure bpf_perf_link path is properly serialized Sasha Levin
2025-05-28 21:56 ` [PATCH AUTOSEL 6.12 7/7] block: use q->elevator with ->elevator_lock held in elv_iosched_show() Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).