* [PATCH 0/2] btrfs: do not trust direct IO page at all
@ 2025-10-21  7:21 Qu Wenruo
  2025-10-21  7:21 ` [PATCH 1/2] btrfs: force stable writes for all inodes Qu Wenruo
  2025-10-21  7:21 ` [PATCH 2/2] btrfs: allocate bounce pages for direct IO Qu Wenruo
From: Qu Wenruo @ 2025-10-21  7:21 UTC
  To: linux-btrfs

[CHANGELOG]
RFC->v1
- Fix a BUG() triggered by btrfs/261
  The dio bio can be backed by large folios, thus the whole bio can be
  larger than PAGE_SIZE * BIO_MAX_VECS (4K * 256 for x86_64).
  In that case we are not guaranteed to be able to allocate a single bio
  covering the whole range.
  Add infrastructure to track multiple btrfs bios for the same dio bio.

- Add the patch to force STABLE_WRITE flags for all inodes
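
To make the size limit in the first changelog item concrete, here is a
minimal userspace sketch of the arithmetic. It is not code from the
patch: the sketch_* names are hypothetical, the constants are the x86_64
values quoted above, and it assumes every bio vec covers exactly one
page.

```c
#include <assert.h>
#include <stddef.h>

/* x86_64 values quoted in the changelog; other configs may differ. */
#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_BIO_MAX_VECS	256u

/* Largest range one bio can cover if each vec holds a single page. */
size_t sketch_max_bio_bytes(void)
{
	return (size_t)SKETCH_PAGE_SIZE * SKETCH_BIO_MAX_VECS;
}

/* Number of such bios needed to cover a dio range of @len bytes. */
size_t sketch_nr_bios(size_t len)
{
	size_t max = sketch_max_bio_bytes();

	assert(len > 0);
	return (len + max - 1) / max;	/* round up */
}
```

Under these assumptions one bio covers at most 1 MiB, so a dio bio backed
by large folios that spans more than that needs more than one bio to
cover it.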

There is a kernel bugzilla report mentioning that direct IO (and certain
buffered IO, which can modify the page cache during writeback when the
device has no STABLE_WRITE flag) can easily lead to RAID1 mirror content
mismatch.

That report doesn't mention btrfs, as our commit 968f19c5b1b7
("btrfs: always fallback to buffered write if the inode requires
checksum") makes inodes with data checksum (the default) fall back to
buffered IO, thus avoiding modification during writeback.

However, the report still exposes that our nodatasum inodes are affected
by the same direct IO buffer modification bug. Even worse, since such an
inode has nodatasum, it doesn't even set the STABLE_WRITE flag, thus the
page cache is allowed to be modified even while it's still under
writeback.

This series addresses the problem by:

- Force STABLE_WRITE flags for all btrfs inodes
  So even writes to nodatasum inodes will wait for writeback before
  modifying the page cache, so that at least the contents of different
  mirrors match.

- Use bounce pages for direct IO
  Instead of using the pages from dio bio, always allocate our own pages
  (so no one else can modify) to do the real IO.
  This will ensure even direct IO on nodatasum inodes will result in
  stable contents on different mirrors.
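
The effect of the bounce pages can be modelled in userspace with a small
sketch. This is an illustration only, not the implementation in
fs/btrfs/direct-io.c: the sketch_* names are hypothetical, two arrays
stand in for the RAID1 mirrors, and flipping one byte between the two
copies stands in for the caller modifying its buffer while the IO is in
flight.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_LEN 8

/*
 * Write the same buffer to two "mirrors" while the caller modifies it
 * in between. Returns 1 if both mirrors end up identical, 0 otherwise.
 */
int sketch_mirrors_match(int use_bounce)
{
	unsigned char user_buf[SKETCH_LEN] = "dio-buf";
	unsigned char mirror0[SKETCH_LEN], mirror1[SKETCH_LEN];
	unsigned char *src = user_buf;
	unsigned char *bounce = NULL;

	if (use_bounce) {
		/* Private copy that the caller can never touch. */
		bounce = malloc(SKETCH_LEN);
		assert(bounce != NULL);
		memcpy(bounce, user_buf, SKETCH_LEN);
		src = bounce;
	}

	memcpy(mirror0, src, SKETCH_LEN);	/* first mirror write */
	user_buf[0] ^= 0xff;			/* caller scribbles mid-IO */
	memcpy(mirror1, src, SKETCH_LEN);	/* second mirror write */

	free(bounce);
	return memcmp(mirror0, mirror1, SKETCH_LEN) == 0;
}
```

In this model sketch_mirrors_match(0) reports a mismatch, because the
second mirror sees the modified buffer, while sketch_mirrors_match(1)
reports matching mirrors at the cost of one extra copy per write.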

Qu Wenruo (2):
  btrfs: force stable writes for all inodes
  btrfs: allocate bounce pages for direct IO

 fs/btrfs/btrfs_inode.h |   5 +-
 fs/btrfs/direct-io.c   | 293 ++++++++++++++++++++++++++++++++---------
 2 files changed, 231 insertions(+), 67 deletions(-)

-- 
2.51.0


Thread overview: 5+ messages
2025-10-21  7:21 [PATCH 0/2] btrfs: do not trust direct IO page at all Qu Wenruo
2025-10-21  7:21 ` [PATCH 1/2] btrfs: force stable writes for all inodes Qu Wenruo
2025-10-21  7:21 ` [PATCH 2/2] btrfs: allocate bounce pages for direct IO Qu Wenruo
2025-10-21 10:44   ` Johannes Thumshirn
2025-10-21 20:41     ` Qu Wenruo
