public inbox for linux-btrfs@vger.kernel.org
From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v2 0/5] btrfs: bs > ps support preparation
Date: Tue,  2 Sep 2025 18:32:11 +0930	[thread overview]
Message-ID: <cover.1756803640.git.wqu@suse.com> (raw)

One of the blockers for bs > ps (block size larger than page size)
support is the conflict between all the single-page bvec
iterators/helpers (like memzero_bvec(), bio_for_each_segment() etc) and
large folios (with highmem support).

For bs > ps support, all folios will have a minimal order, so that
each folio covers at least one block. This spares the fs from having
to handle sub-block contents.

However, all those single-page bvec iterators/helpers can only handle
a bvec that is no larger than a page.

To address the conflicting features, take a completely different
approach to handling an fs block:

- Use phys_addr_t to represent a block inside a bio
  So we won't need to bother with the single-page bvec helpers; just
  pass a single paddr around.

- Do proper highmem handling for checksum generation/verification
  Now we grab the folio using the paddr, and make sure the folio
  covers at least one block starting at the paddr.

  If the folio is in highmem, do proper per-page
  kmap_local_folio()/kunmap() handling.
  Otherwise do a full block csum calculation in one go.

  This should bring no extra overhead except the paddr->folio
  conversion (which should be really tiny), as on systems without
  HIGHMEM, folio_test_partial_kmap() always returns false, and the
  HIGHMEM path is optimized out by the compiler completely.

  Unfortunately I don't have a 32bit VM at hand to test.

- Introduce extra macros to iterate over the blocks inside a bio
  Two macros: btrfs_bio_for_each_block(), which starts at the
  specified bio_iter, and btrfs_bio_for_each_block_all(), which goes
  through all blocks in the bio.

  Both yield a @paddr representing a block. Callers either use
  paddr-based helpers like
  btrfs_calculate_block_csum()/btrfs_check_block_csum(), or are
  RAID56, which already uses paddr.

  For now it's only utilized by btrfs; bcachefs has a similar helper,
  which was my inspiration.

  I hope one day it can be escalated to bio.h.

With all this preparation done, btrfs can now support basic file
operations with bs > ps, but still with quite a few limitations:

- No compression support
  The compressed folios must be allocated using the minimal folio
  order, as btrfs_calculate_block_csum() requires the minimal folio
  size.

- No RAID56 support
- No scrub support
  The same as compression: currently we're allocating the folios at
  page size.
  Although the raid56 code now uses the btrfs_bio_for_each_block*()
  helpers, the underlying folio sizes still need updating.

[Changelog]
v2:
- Use paddr to represent a block inside a bio
  This makes a lot of the data checksum handling much easier to read,
  and makes csum verification/generation properly follow all the
  highmem helpers, by doing kmap inside the helper instead of
  requiring the callers to do it.

- Fix a compiler warning when caching the max and min folio orders
  Remove the fs_info local variable, as it is unused in
  non-experimental builds.

Qu Wenruo (5):
  btrfs: support all block sizes which is no larger than page size
  btrfs: concentrate highmem handling for data verification
  btrfs: introduce btrfs_bio_for_each_block() helper
  btrfs: introduce btrfs_bio_for_each_block_all() helper
  btrfs: cache max and min order inside btrfs_fs_info

 fs/btrfs/bio.c         | 22 +++++++-------
 fs/btrfs/btrfs_inode.h | 14 +++++----
 fs/btrfs/disk-io.c     |  2 ++
 fs/btrfs/file-item.c   | 26 ++++-------------
 fs/btrfs/fs.c          |  4 +++
 fs/btrfs/fs.h          |  2 ++
 fs/btrfs/inode.c       | 59 ++++++++++++++++++++++++++------------
 fs/btrfs/misc.h        | 49 +++++++++++++++++++++++++++++++
 fs/btrfs/raid56.c      | 65 ++++++++++++++++--------------------------
 fs/btrfs/scrub.c       | 18 ++++++++++--
 10 files changed, 162 insertions(+), 99 deletions(-)

-- 
2.50.1



Thread overview: 12+ messages
2025-09-02  9:02 Qu Wenruo [this message]
2025-09-02  9:02 ` [PATCH v2 1/5] btrfs: support all block sizes which is no larger than page size Qu Wenruo
2025-09-02  9:02 ` [PATCH v2 2/5] btrfs: concentrate highmem handling for data verification Qu Wenruo
2025-09-02  9:02 ` [PATCH v2 3/5] btrfs: introduce btrfs_bio_for_each_block() helper Qu Wenruo
2025-09-02  9:02 ` [PATCH v2 4/5] btrfs: introduce btrfs_bio_for_each_block_all() helper Qu Wenruo
2025-09-05 17:33   ` David Sterba
2025-09-05 22:08     ` Qu Wenruo
2025-09-08 17:44       ` David Sterba
2025-09-02  9:02 ` [PATCH v2 5/5] btrfs: cache max and min order inside btrfs_fs_info Qu Wenruo
2025-09-05 17:36   ` David Sterba
2025-09-06  8:47     ` Qu Wenruo
2025-09-05 17:47 ` [PATCH v2 0/5] btrfs: bs > ps support preparation David Sterba
