[PATCH AUTOSEL 6.18] btrfs: sync read disk super and set block size

public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed

From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Edward Adam Davis <eadavis@qq.com>,
	syzbot+b4a2af3000eaa84d95d5@syzkaller.appspotmail.com,
	Filipe Manana <fdmanana@suse.com>,
	David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>,
	clm@fb.com, linux-btrfs@vger.kernel.org
Subject: [PATCH AUTOSEL 6.18] btrfs: sync read disk super and set block size
Date: Mon,  2 Feb 2026 16:46:10 -0500	[thread overview]
Message-ID: <20260202214643.212290-15-sashal@kernel.org> (raw)
In-Reply-To: <20260202214643.212290-1-sashal@kernel.org>

From: Edward Adam Davis <eadavis@qq.com>

[ Upstream commit 3f29d661e5686f3aa14e6f11537ff5c49846f2e2 ]

When the user performs a btrfs mount, the block device is not set
correctly. The user sets the block size of the block device to 0x4000
by executing the BLKBSZSET command.
Since the block size change also changes the mapping->flags value, this
further affects the result of the mapping_min_folio_order() calculation.

Let's analyze the following two scenarios:

Scenario 1: Without executing the BLKBSZSET command, the block size is
0x1000, and mapping_min_folio_order() returns 0;

Scenario 2: After executing the BLKBSZSET command, the block size is
0x4000, and mapping_min_folio_order() returns 2.

do_read_cache_folio() allocates a folio before the BLKBSZSET command
is executed. This results in the allocated folio having an order value
of 0. Later, after BLKBSZSET is executed, the block size increases to
0x4000, and the mapping_min_folio_order() calculation result becomes 2.

This leads to two undesirable consequences:

1. filemap_add_folio() triggers a VM_BUG_ON_FOLIO(folio_order(folio) <
mapping_min_folio_order(mapping)) assertion.

2. The syzbot report [1] shows a null pointer dereference in
create_empty_buffers() due to a buffer head allocation failure.

Synchronization should be established based on the inode between the
BLKBSZSET command and read cache page to prevent inconsistencies in
block size or mapping flags before and after folio allocation.

[1]
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
RIP: 0010:create_empty_buffers+0x4d/0x480 fs/buffer.c:1694
Call Trace:
 folio_create_buffers+0x109/0x150 fs/buffer.c:1802
 block_read_full_folio+0x14c/0x850 fs/buffer.c:2403
 filemap_read_folio+0xc8/0x2a0 mm/filemap.c:2496
 do_read_cache_folio+0x266/0x5c0 mm/filemap.c:4096
 do_read_cache_page mm/filemap.c:4162 [inline]
 read_cache_page_gfp+0x29/0x120 mm/filemap.c:4195
 btrfs_read_disk_super+0x192/0x500 fs/btrfs/volumes.c:1367

Reported-by: syzbot+b4a2af3000eaa84d95d5@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b4a2af3000eaa84d95d5
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

So this commit is going into v6.19. Let me now analyze the complete
picture.

## Analysis Summary

### 1. COMMIT MESSAGE ANALYSIS

**Subject:** "btrfs: sync read disk super and set block size"

**Key indicators:**
- **Reported-by: syzbot**: Indicates a real bug found by fuzzing
- **Closes: syzkaller bug link**: Confirms this is fixing a reported
  issue
- **Reviewed-by: Filipe Manana**: Core btrfs maintainer reviewed and
  approved
- **Signed-off-by: David Sterba**: Btrfs maintainer signed off

The commit message describes a race condition that leads to:
1. `VM_BUG_ON_FOLIO` assertion failure
2. Null pointer dereference in `create_empty_buffers()`

### 2. CODE CHANGE ANALYSIS

The fix is **extremely simple** - just 2 lines added:
```c
+       filemap_invalidate_lock(mapping);
        page = read_cache_page_gfp(mapping, bytenr >> PAGE_SHIFT,
GFP_NOFS);
+       filemap_invalidate_unlock(mapping);
```

**Root cause:** A race between:
1. `btrfs_read_disk_super()` allocating a folio via
   `read_cache_page_gfp()`
2. User space calling `BLKBSZSET` ioctl to change block size

When the block size changes mid-operation, the folio order requirements
change, causing either a BUG_ON or null pointer dereference.

**Why the fix works:** The `filemap_invalidate_lock()` provides
synchronization between the page cache reader and the block size setter
(in `set_blocksize()`), preventing the race.

### 3. CLASSIFICATION

- **Bug type:** Race condition leading to kernel crash (null pointer
  deref) or BUG_ON
- **Severity:** High - causes kernel panic/crash
- **Category:** Synchronization fix

### 4. SCOPE AND RISK ASSESSMENT

- **Lines changed:** 2 additions only
- **Files touched:** 1 (fs/btrfs/volumes.c)
- **Complexity:** Very low - just lock/unlock pair around existing call
- **Risk:** LOW - the lock is well-understood and used elsewhere for
  exactly this purpose

### 5. USER IMPACT

- **Who is affected:** Any btrfs user where concurrent block device
  operations occur
- **Triggering scenario:** User space modifying block device size while
  btrfs is mounting/reading superblock
- **Severity of impact:** Kernel crash (HIGH)

### 6. STABILITY INDICATORS

- **Reviewed-by:** Filipe Manana (senior btrfs developer at SUSE)
- **Multiple sign-offs:** Through the btrfs maintainer chain
- **Syzbot reported:** Reproducible issue

### 7. CRITICAL DEPENDENCY CHECK

**THIS IS THE KEY ISSUE:** The fix relies on `set_blocksize()` also
taking `filemap_invalidate_lock()`. This was added by commit
`c0e473a0d2264` ("block: fix race between set_blocksize and read paths")
which is **only in kernel 6.15+**.

Without that block layer fix:
- On kernels < 6.15: `set_blocksize()` does NOT take
  `filemap_invalidate_lock`, so this btrfs fix would NOT provide proper
  synchronization
- The btrfs fix by itself would be ineffective in preventing the race

### Risk vs Benefit Analysis

**Benefits:**
- Fixes real crash reported by syzbot
- Very small, surgical fix (2 lines)
- Well-reviewed by btrfs maintainers
- Follows established synchronization pattern

**Risks/Concerns:**
- **CRITICAL DEPENDENCY**: Requires `c0e473a0d2264` to be present for
  the synchronization to work
- Without the block layer fix, this btrfs fix is essentially a no-op
  (won't actually synchronize)
- The block layer fix is only in 6.15+, so backporting this btrfs fix to
  6.14 or earlier would be pointless without also backporting the block
  layer changes

### Conclusion

This commit fixes a real kernel crash (null pointer dereference) in
btrfs that was reported by syzbot. The fix itself is minimal and correct
- it adds proper locking around the `read_cache_page_gfp()` call to
synchronize with block size changes.

However, there is a **critical dependency issue**: This fix only
provides effective synchronization when paired with commit
`c0e473a0d2264` ("block: fix race between set_blocksize and read paths")
which adds the corresponding lock acquisition in `set_blocksize()`. That
commit is only in kernel 6.15+.

For stable trees **6.15.y and later**: This is a clear YES - fixes a
real crash with minimal risk.

For stable trees **6.14 and earlier**: The fix would be ineffective
without also backporting the block layer changes, which are more
invasive.

Given that the question is about backporting in general and this commit
would only be effective in 6.15+ (where the block layer infrastructure
exists), and the fix is correct, small, and addresses a real crash, the
answer depends on the target kernel. For the current stable (6.15+),
this should be backported.

**YES**

 fs/btrfs/volumes.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 48e717c105c35..8e7dcb12af4c4 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1365,7 +1365,9 @@ struct btrfs_super_block *btrfs_read_disk_super(struct block_device *bdev,
 				      (bytenr + BTRFS_SUPER_INFO_SIZE) >> PAGE_SHIFT);
 	}

+	filemap_invalidate_lock(mapping);
 	page = read_cache_page_gfp(mapping, bytenr >> PAGE_SHIFT, GFP_NOFS);
+	filemap_invalidate_unlock(mapping);
 	if (IS_ERR(page))
 		return ERR_CAST(page);

-- 
2.51.0

next prev parent reply	other threads:[~2026-02-02 21:47 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20260202214643.212290-1-sashal@kernel.org>
2026-02-02 21:46 ` [PATCH AUTOSEL 6.18] btrfs: fix Wmaybe-uninitialized warning in replay_one_buffer() Sasha Levin
2026-02-02 21:46 ` Sasha Levin [this message]
2026-02-02 21:46 ` [PATCH AUTOSEL 6.18-6.12] btrfs: reject new transactions if the fs is fully read-only Sasha Levin
     [not found] <20260128223332.2806589-1-sashal@kernel.org>
2026-01-28 22:33 ` [PATCH AUTOSEL 6.18] btrfs: sync read disk super and set block size Sasha Levin

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:48e717c105c3 dfblob:8e7dcb12af4c )
 OR (
bs:"[PATCH AUTOSEL 6.18] btrfs: sync read disk super and set block size" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260202214643.212290-15-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=clm@fb.com \
    --cc=dsterba@suse.com \
    --cc=eadavis@qq.com \
    --cc=fdmanana@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    --cc=syzbot+b4a2af3000eaa84d95d5@syzkaller.appspotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox