public inbox for gfs2@lists.linux.dev
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 6.18-6.1] gfs2: Fix use of bio_chain
       [not found] <20251206140252.645973-1-sashal@kernel.org>
@ 2025-12-06 14:02 ` Sasha Levin
  0 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2025-12-06 14:02 UTC (permalink / raw)
  To: patches, stable; +Cc: Andreas Gruenbacher, Stephen Zhang, Sasha Levin, gfs2

From: Andreas Gruenbacher <agruenba@redhat.com>

[ Upstream commit 8a157e0a0aa5143b5d94201508c0ca1bb8cfb941 ]

In gfs2_chain_bio(), the call to bio_chain() has its arguments swapped.
The result is leaked bios and incorrect synchronization (only the last
bio will actually be waited for).  This code is only used during mount
and filesystem thaw, so the bug normally won't be noticeable.

Reported-by: Stephen Zhang <starzhangzsd@gmail.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:



 fs/gfs2/lops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 9c8c305a75c46..914d03f6c4e82 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -487,7 +487,7 @@ static struct bio *gfs2_chain_bio(struct bio *prev, unsigned int nr_iovecs)
 	new = bio_alloc(prev->bi_bdev, nr_iovecs, prev->bi_opf, GFP_NOIO);
 	bio_clone_blkg_association(new, prev);
 	new->bi_iter.bi_sector = bio_end_sector(prev);
-	bio_chain(new, prev);
+	bio_chain(prev, new);
 	submit_bio(prev);
 	return new;
 }
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH AUTOSEL 6.18-6.1] gfs2: Fix use of bio_chain
       [not found] <20251209001610.611575-1-sashal@kernel.org>
@ 2025-12-09  0:15 ` Sasha Levin
  2025-12-09  0:15 ` [PATCH AUTOSEL 6.18-6.6] gfs2: fix remote evict for read-only filesystems Sasha Levin
  2025-12-09  0:15 ` [PATCH AUTOSEL 6.18-6.12] gfs2: Fix "gfs2: Switch to wait_event in gfs2_quotad" Sasha Levin
  2 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2025-12-09  0:15 UTC (permalink / raw)
  To: patches, stable; +Cc: Andreas Gruenbacher, Stephen Zhang, Sasha Levin, gfs2

From: Andreas Gruenbacher <agruenba@redhat.com>

[ Upstream commit 8a157e0a0aa5143b5d94201508c0ca1bb8cfb941 ]

In gfs2_chain_bio(), the call to bio_chain() has its arguments swapped.
The result is leaked bios and incorrect synchronization (only the last
bio will actually be waited for).  This code is only used during mount
and filesystem thaw, so the bug normally won't be noticeable.

Reported-by: Stephen Zhang <starzhangzsd@gmail.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I understand the bug clearly. The `bio_chain()` signature is:
```c
void bio_chain(struct bio *bio, struct bio *parent)
```

Where:
- `bio` - the target bio being submitted
- `parent` - the bio whose completion will be delayed until both
  complete

**The Bug:**
In `gfs2_chain_bio()`:
```c
new->bi_iter.bi_sector = bio_end_sector(prev);
bio_chain(new, prev);  // WRONG: arguments swapped
submit_bio(prev);
return new;
```

The function allocates `new`, chains with `prev`, submits `prev`, and
returns `new`. The intention is that `new` accumulates all completions -
callers wait on `new`.

With the **wrong order** `bio_chain(new, prev)`:
- `new`'s completion increments `prev`'s remaining count
- But `prev` is submitted immediately and freed after completion
- Result: **leaked bios** (new never freed properly) and **incorrect
  synchronization** (waiting on `new` doesn't work correctly)

With the **correct order** `bio_chain(prev, new)`:
- `prev`'s completion increments `new`'s remaining count
- Waiting on `new` correctly waits for all chained bios

## 3. CLASSIFICATION

- **Type:** Bug fix - logic error (swapped arguments)
- **Impact:** Memory leaks and incorrect I/O synchronization in GFS2
- **Not:** Feature addition, refactoring, or cleanup

## 4. SCOPE AND RISK ASSESSMENT

- **Lines changed:** 1 line
- **Files touched:** 1 file
- **Complexity:** Trivial - argument swap
- **Risk:** Extremely low - this is the obviously correct fix

The fix is minimal, self-contained, and the correct argument order is
unambiguous from the API documentation.

## 5. USER IMPACT

- **Affected users:** GFS2 users (enterprise/cluster environments)
- **Trigger conditions:** Mount and filesystem thaw operations
- **Severity:**
  - Bio leaks (memory exhaustion over time)
  - Incorrect synchronization could lead to data consistency issues
    during mount/thaw

While the maintainer notes "normally won't be noticeable," in production
cluster environments, bio leaks during repeated mount/umount cycles
could be problematic, and synchronization bugs in filesystem code are
inherently risky.

## 6. STABILITY INDICATORS

- Reported-by tag (issue was identified)
- Signed by GFS2 maintainer (Andreas Gruenbacher)
- Trivially correct fix - no ambiguity about the right solution

## 7. DEPENDENCY CHECK

This is a standalone fix with no dependencies. The `bio_chain()` API is
stable and has existed for years.

## CONCLUSION

**Meets stable criteria:**
- ✅ Obviously correct - clear API misuse with documented correct usage
- ✅ Fixes real bug - bio leaks and incorrect synchronization
- ✅ Small and contained - single one-line change
- ✅ No new features
- ✅ Risk is minimal

**Minor concerns:**
- No explicit `Cc: stable@vger.kernel.org` tag (maintainer didn't
  request backport)
- No `Fixes:` tag identifying affected versions
- Impact described as "normally won't be noticeable"

Despite the lack of explicit stable tags, this is a clear bug fix to
filesystem code. Bio leaks and synchronization issues in filesystem
operations are exactly the type of bugs stable trees should fix. The
change is trivial, obviously correct, and carries essentially zero risk
of regression.

**YES**

 fs/gfs2/lops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 9c8c305a75c46..914d03f6c4e82 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -487,7 +487,7 @@ static struct bio *gfs2_chain_bio(struct bio *prev, unsigned int nr_iovecs)
 	new = bio_alloc(prev->bi_bdev, nr_iovecs, prev->bi_opf, GFP_NOIO);
 	bio_clone_blkg_association(new, prev);
 	new->bi_iter.bi_sector = bio_end_sector(prev);
-	bio_chain(new, prev);
+	bio_chain(prev, new);
 	submit_bio(prev);
 	return new;
 }
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH AUTOSEL 6.18-6.6] gfs2: fix remote evict for read-only filesystems
       [not found] <20251209001610.611575-1-sashal@kernel.org>
  2025-12-09  0:15 ` [PATCH AUTOSEL 6.18-6.1] gfs2: Fix use of bio_chain Sasha Levin
@ 2025-12-09  0:15 ` Sasha Levin
  2025-12-09  0:15 ` [PATCH AUTOSEL 6.18-6.12] gfs2: Fix "gfs2: Switch to wait_event in gfs2_quotad" Sasha Levin
  2 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2025-12-09  0:15 UTC (permalink / raw)
  To: patches, stable; +Cc: Andreas Gruenbacher, Sasha Levin, gfs2

From: Andreas Gruenbacher <agruenba@redhat.com>

[ Upstream commit 64c10ed9274bc46416f502afea48b4ae11279669 ]

When a node tries to delete an inode, it first requests exclusive access
to the iopen glock.  This triggers demote requests on all remote nodes
currently holding the iopen glock.  To satisfy those requests, the
remote nodes evict the inode in question, or they poke the corresponding
inode glock to signal that the inode is still in active use.

This behavior doesn't depend on whether or not a filesystem is
read-only, so remove the incorrect read-only check.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

## Analysis: gfs2: fix remote evict for read-only filesystems

### 1. COMMIT MESSAGE ANALYSIS

**Subject:** Uses "fix" keyword, indicating a bug fix
**Description:** Explains a cluster coordination bug in GFS2 where the
read-only check was incorrectly preventing remote inode eviction
**Tags:**
- Signed-off-by from Andreas Gruenbacher (GFS2 maintainer)
- No explicit "Cc: stable@vger.kernel.org" tag
- No "Fixes:" tag

### 2. CODE CHANGE ANALYSIS

The change is extremely minimal - removing a single condition from
`iopen_go_callback()`:

**Before:**
```c
if (!remote || sb_rdonly(sdp->sd_vfs) ||
    test_bit(SDF_KILL, &sdp->sd_flags))
    return;
```

**After:**
```c
if (!remote || test_bit(SDF_KILL, &sdp->sd_flags))
    return;
```

**Technical mechanism of the bug:**
- GFS2 is a clustered filesystem where multiple nodes access shared
  storage
- When Node A wants to delete an inode, it requests exclusive access to
  the iopen glock
- This triggers demote requests to all remote nodes (Node B, C, etc.)
  holding that glock
- Remote nodes must respond by either evicting the inode or signaling
  it's still in use
- The bug: The `sb_rdonly()` check caused read-only mounted nodes to
  skip this coordination entirely
- This breaks cluster protocol because Node A waits for Node B to
  release the glock, but Node B ignores the request

**Why the fix is correct:**
Cluster coordination for glock demotes must work regardless of mount
mode. A read-only node still participates in the cluster and must
properly respond to glock callbacks. The read-only check was logically
incorrect and could cause:
- Stale inode issues across the cluster
- Potential hangs where nodes wait indefinitely for glock release
- Cluster coordination failures

### 3. CLASSIFICATION

- **Bug type:** Logic error - incorrect early return preventing required
  cluster coordination
- **Not a feature:** Removing an incorrect check doesn't add
  functionality
- **Security impact:** Not directly security-related, but could cause
  availability issues

### 4. SCOPE AND RISK ASSESSMENT

| Metric | Assessment |
|--------|------------|
| Lines changed | 2 lines (trivial) |
| Files touched | 1 file |
| Complexity | Very low |
| Subsystem | GFS2 (clustered filesystem) |
| Regression risk | Very low |

The change is extremely surgical - it only removes an erroneous
condition. The remaining code path already exists and has been tested;
this fix just ensures it executes when it should.

### 5. USER IMPACT

- **Affected users:** GFS2 users in clustered environments with read-
  only mounts
- **Use case:** Enterprise clusters (Red Hat, etc.) that use GFS2
- **Severity:** Could cause cluster coordination failures, stale inodes,
  or hangs
- **Frequency:** Affects all read-only GFS2 mounts in cluster mode

### 6. STABILITY INDICATORS

- Written by Andreas Gruenbacher, the GFS2 maintainer who deeply
  understands the subsystem
- The fix logic is obviously correct based on the documented behavior
- Small change means easy to verify correctness

### 7. DEPENDENCY CHECK

- No dependencies on other commits
- The `iopen_go_callback()` function and `sb_rdonly()` exist in all
  stable kernels with GFS2 support
- Clean backport expected (simple line removal)

### Risk vs Benefit Analysis

**Benefits:**
- Fixes a real bug affecting cluster coordination
- Prevents potential hangs and stale inode issues
- Restores correct cluster protocol behavior

**Risks:**
- Extremely low - the change simply allows existing, tested code to run
- Only affects GFS2, a specialized filesystem
- The maintainer clearly understands the impact

### Conclusion

Despite the lack of explicit "Cc: stable" tag, this commit:
1. ✅ Is obviously correct (removing incorrect logic)
2. ✅ Fixes a real bug affecting cluster coordination
3. ✅ Is small and contained (2 lines)
4. ✅ Does not introduce new features
5. ✅ Will apply cleanly to stable trees
6. ✅ Has minimal regression risk

The fix addresses a genuine bug that can cause cluster coordination
failures in GFS2. The change is trivially small, written by the
subsystem maintainer, and the logic is clearly correct - read-only
status should not prevent a node from properly responding to cluster
glock demote requests.

**YES**

 fs/gfs2/glops.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 0c0a80b3bacab..0c68ab4432b08 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -630,8 +630,7 @@ static void iopen_go_callback(struct gfs2_glock *gl, bool remote)
 	struct gfs2_inode *ip = gl->gl_object;
 	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
 
-	if (!remote || sb_rdonly(sdp->sd_vfs) ||
-	    test_bit(SDF_KILL, &sdp->sd_flags))
+	if (!remote || test_bit(SDF_KILL, &sdp->sd_flags))
 		return;
 
 	if (gl->gl_demote_state == LM_ST_UNLOCKED &&
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH AUTOSEL 6.18-6.12] gfs2: Fix "gfs2: Switch to wait_event in gfs2_quotad"
       [not found] <20251209001610.611575-1-sashal@kernel.org>
  2025-12-09  0:15 ` [PATCH AUTOSEL 6.18-6.1] gfs2: Fix use of bio_chain Sasha Levin
  2025-12-09  0:15 ` [PATCH AUTOSEL 6.18-6.6] gfs2: fix remote evict for read-only filesystems Sasha Levin
@ 2025-12-09  0:15 ` Sasha Levin
  2 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2025-12-09  0:15 UTC (permalink / raw)
  To: patches, stable; +Cc: Andreas Gruenbacher, Sasha Levin, gfs2

From: Andreas Gruenbacher <agruenba@redhat.com>

[ Upstream commit dff1fb6d8b7abe5b1119fa060f5d6b3370bf10ac ]

Commit e4a8b5481c59a ("gfs2: Switch to wait_event in gfs2_quotad") broke
cyclic statfs syncing, so the numbers reported by "df" could easily get
completely out of sync with reality.  Fix this by reverting part of
commit e4a8b5481c59a for now.

A follow-up commit will clean this code up later.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

### 7. STABILITY INDICATORS

- **Author**: Andreas Gruenbacher (GFS2 maintainer) - same person who
  introduced the bug
- **Fix type**: Partial revert - restores known-working behavior
- **Commit message**: Clear acknowledgment of the problem and solution
- **Testing**: Maintainer would have verified the fix given the severity

### Summary Assessment

| Criteria | Assessment |
|----------|------------|
| Fixes real bug | ✅ Yes - broken statfs sync breaks "df" reporting |
| Obviously correct | ✅ Yes - single character change restoring original
logic |
| Small and contained | ✅ Yes - 1 character change in 1 file |
| No new features | ✅ Yes - pure regression fix |
| User impact | HIGH - affects all GFS2 users |
| Risk | VERY LOW - restores proven behavior |
| Dependencies | Needs commit e4a8b5481c59a (in 6.6+) |

### Conclusion

This commit is an excellent candidate for stable backport:

1. **Fixes a real, user-visible bug**: Disk usage reporting being
   "completely out of sync with reality" is a serious issue for any
   filesystem.

2. **Minimal, surgical fix**: A single character change (`=` → `-=`)
   with zero risk of collateral damage.

3. **Restores original behavior**: This is a partial revert, returning
   to the proven timing logic that worked before the buggy conversion.

4. **Critical subsystem**: GFS2 is a clustered filesystem used in
   enterprise environments where correct disk space reporting is
   essential.

5. **Clear provenance**: Written by the GFS2 maintainer who introduced
   the original bug, so he understands exactly what went wrong.

The only consideration is that this fix requires the buggy commit
e4a8b5481c59a to be present (v6.6+). For stable trees 6.6.y and later,
this fix should be backported.

**YES**

 fs/gfs2/quota.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 2298e06797ac3..f2df01f801b81 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -1616,7 +1616,7 @@ int gfs2_quotad(void *data)
 
 		t = min(quotad_timeo, statfs_timeo);
 
-		t = wait_event_freezable_timeout(sdp->sd_quota_wait,
+		t -= wait_event_freezable_timeout(sdp->sd_quota_wait,
 				sdp->sd_statfs_force_sync ||
 				gfs2_withdrawing_or_withdrawn(sdp) ||
 				kthread_should_stop(),
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-12-09  0:18 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20251209001610.611575-1-sashal@kernel.org>
2025-12-09  0:15 ` [PATCH AUTOSEL 6.18-6.1] gfs2: Fix use of bio_chain Sasha Levin
2025-12-09  0:15 ` [PATCH AUTOSEL 6.18-6.6] gfs2: fix remote evict for read-only filesystems Sasha Levin
2025-12-09  0:15 ` [PATCH AUTOSEL 6.18-6.12] gfs2: Fix "gfs2: Switch to wait_event in gfs2_quotad" Sasha Levin
     [not found] <20251206140252.645973-1-sashal@kernel.org>
2025-12-06 14:02 ` [PATCH AUTOSEL 6.18-6.1] gfs2: Fix use of bio_chain Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox