* [PATCH 0/3] btrfs: removal of on-stack paddrs[], part 1
@ 2026-07-02 7:36 Qu Wenruo
From: Qu Wenruo @ 2026-07-02 7:36 UTC (permalink / raw)
To: linux-btrfs
Since the introduction of experimental bs > ps support, several fixed-size
on-stack paddrs[] arrays have been added to assemble multi-page sized fs
blocks.
However that on-stack memory usage is always there on 4K page sized
systems, no matter what the block size of the filesystem is.
This series is part 1 of the cleanup of such on-stack paddrs[] arrays.
The core idea is to use the bio interface for page iteration, with a
const bvec_iter pointer marking where the block starts.
Then we make a local copy of that bvec_iter, and use the local iter to
walk the next few pages until we have covered a full block.
Furthermore, with the help of bvec_iter, we can remove a lot of
parameters:
- file_offset
- logical
- bio_offset
All of them can be derived from the @iter passed in and
bbio->saved_iter:
@bio_offset is (iter.bi_sector - saved_iter.bi_sector) << SECTOR_SHIFT,
because when a bvec_iter is advanced, its bi_sector is increased as well.
@logical is simpler, just iter.bi_sector << SECTOR_SHIFT.
@file_offset is bbio->file_offset + bio_offset.
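The three derivations above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct sketch_iter, struct sketch_offsets and sketch_derive_offsets() are hypothetical names that model just the bi_sector/bi_size fields of the kernel's struct bvec_iter; they are not part of the series.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, same value the kernel uses */

/* Hypothetical, simplified stand-in for the kernel's struct bvec_iter. */
struct sketch_iter {
	uint64_t bi_sector;	/* current position, in 512-byte sectors */
	uint32_t bi_size;	/* bytes remaining in the bio */
};

/* Hypothetical container for the three values that used to be parameters. */
struct sketch_offsets {
	uint32_t bio_offset;	/* bytes from the start of the bio */
	uint64_t logical;	/* logical bytenr of the block */
	uint64_t file_offset;	/* file offset of the block */
};

static struct sketch_offsets
sketch_derive_offsets(const struct sketch_iter *iter,
		      const struct sketch_iter *saved_iter,
		      uint64_t bbio_file_offset)
{
	struct sketch_offsets out;

	/* The current iter must not be before the saved one. */
	assert(iter->bi_sector >= saved_iter->bi_sector);

	out.bio_offset = (uint32_t)((iter->bi_sector - saved_iter->bi_sector)
				    << SECTOR_SHIFT);
	out.logical = iter->bi_sector << SECTOR_SHIFT;
	out.file_offset = bbio_file_offset + out.bio_offset;
	return out;
}
```

For example, an iter that has advanced 16 sectors past saved_iter gives a bio_offset of 8192 bytes, and the file offset is the bbio's file offset plus that same 8192.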
This means we no longer need on-stack paddrs[] for csum
generation.
With bio interfaces, the iteration of an fs block is as simple as the
following: (I tried to change the page/pg_off/cur_len into a macro just
like btrfs_bio_for_each_block(), but failed)
	u32 cur = 0;

	btrfs_csum_init(&cctx, fs_info->csum_type);
	while (cur < blocksize) {
		struct page *page = bio_iter_page(&bbio->bio, iter);
		const u32 pg_off = bio_iter_offset(&bbio->bio, iter);
		const u32 cur_len = min(bio_iter_len(&bbio->bio, iter), blocksize - cur);
		void *kaddr;

		kaddr = kmap_local_page(page) + pg_off;
		btrfs_csum_update(&cctx, kaddr, cur_len);
		kunmap_local(kaddr);

		bio_advance_iter_single(&bbio->bio, &iter, cur_len);
		cur += cur_len;
	}
	btrfs_csum_final(&cctx, csum);
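The same clamp-and-advance pattern can be tried outside the kernel with a small model. The sketch below is an assumption-laden illustration, not the kernel code: struct sketch_seg and struct sketch_iter are hypothetical stand-ins for a bio_vec and a bvec_iter, and the byte sum is only a placeholder for a real checksum algorithm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bio_vec: one contiguous chunk of memory. */
struct sketch_seg {
	const uint8_t *addr;
	uint32_t len;
};

/* Hypothetical stand-in for a bvec_iter. */
struct sketch_iter {
	size_t idx;		/* current segment */
	uint32_t seg_done;	/* bytes already consumed in that segment */
};

/* Placeholder checksum (plain byte sum), not a real btrfs csum algorithm. */
static uint32_t sketch_sum_update(uint32_t sum, const uint8_t *buf, uint32_t len)
{
	for (uint32_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/* Checksum one block starting at @orig_iter, crossing segments as needed. */
static uint32_t sketch_csum_block(const struct sketch_seg *segs, size_t nr_segs,
				  const struct sketch_iter *orig_iter,
				  uint32_t blocksize)
{
	struct sketch_iter iter = *orig_iter;	/* local copy, caller's iter is untouched */
	uint32_t sum = 0;
	uint32_t cur = 0;

	while (cur < blocksize) {
		assert(iter.idx < nr_segs);

		const struct sketch_seg *seg = &segs[iter.idx];
		const uint32_t seg_left = seg->len - iter.seg_done;
		/* Never read past the current segment or past the block. */
		const uint32_t cur_len = seg_left < blocksize - cur ?
					 seg_left : blocksize - cur;

		sum = sketch_sum_update(sum, seg->addr + iter.seg_done, cur_len);

		/* Advance inside the segment, move on once it is exhausted. */
		iter.seg_done += cur_len;
		if (iter.seg_done == seg->len) {
			iter.idx++;
			iter.seg_done = 0;
		}
		cur += cur_len;
	}
	return sum;
}
```

Because the function only advances its local copy, the caller can reuse the same iterator afterwards, which is the property the const bvec_iter pointer provides in the series.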
However there are still some callers left:
- Scrub
  That's already addressed in another series accidentally
  (https://lore.kernel.org/linux-btrfs/cover.1782795330.git.wqu@suse.com/)
  That will be the last user of btrfs_check_block_csum().
- RAID56
  That will be the only location left without a bio.
  In that case we can easily craft a local helper to do csum generation
  without using on-stack paddrs[].
Qu Wenruo (3):
btrfs: replace btrfs_repair_io_failure() to use bio for page iteration
btrfs: enhance btrfs_data_csum_ok() to use bio for page iteration
btrfs: use a shared helper to calculate data checksum for a bio
fs/btrfs/bio.c | 140 +++++++++++++++++++----------------------
fs/btrfs/bio.h | 5 +-
fs/btrfs/btrfs_inode.h | 6 +-
fs/btrfs/disk-io.c | 24 ++++---
fs/btrfs/file-item.c | 19 ++----
fs/btrfs/inode.c | 59 ++++++++++++++---
6 files changed, 140 insertions(+), 113 deletions(-)
--
2.54.0
* [PATCH 1/3] btrfs: replace btrfs_repair_io_failure() to use bio for page iteration
From: Qu Wenruo @ 2026-07-02 7:36 UTC (permalink / raw)
To: linux-btrfs
Currently btrfs_repair_io_failure() uses a @paddrs[] array to iterate
pages.
Such a parameter is required for bs > ps cases, where one fs block may
span several pages.
However there is a much simpler, already existing way to iterate pages:
bio and bvec_iter.
This changes btrfs_repair_io_failure() by:
- Use a const @bvec_iter pointer to locate where the pages are
- Extract file offset/logical from the @bbio
- Require no @step parameter
  The above changes allow us to shorten the parameter list.
- Rename the function to btrfs_repair_bbio_failure()
- Change the caller in btrfs_repair_eb_io_failure() to allocate a bbio
  Unlike the data read path, we do not have a handy bbio in that case.
  So we need to allocate one just for btrfs_repair_bbio_failure().
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/bio.c | 61 ++++++++++++++++++++++++++--------------------
fs/btrfs/bio.h | 5 ++--
fs/btrfs/disk-io.c | 24 ++++++++++++------
3 files changed, 54 insertions(+), 36 deletions(-)
diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index cc0bd03048ba..9d0f72cc37ca 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -186,7 +186,6 @@ static void btrfs_end_repair_bio(struct btrfs_bio *repair_bbio,
*/
struct bvec_iter saved_iter = repair_bbio->saved_iter;
const u32 step = min(fs_info->sectorsize, PAGE_SIZE);
- const u64 logical = repair_bbio->saved_iter.bi_sector << SECTOR_SHIFT;
const u32 nr_steps = repair_bbio->saved_iter.bi_size / step;
int mirror = repair_bbio->mirror_num;
phys_addr_t paddrs[BTRFS_MAX_BLOCKSIZE / PAGE_SIZE];
@@ -220,9 +219,8 @@ static void btrfs_end_repair_bio(struct btrfs_bio *repair_bbio,
do {
mirror = prev_repair_mirror(fbio, mirror);
- btrfs_repair_io_failure(fs_info, btrfs_ino(inode),
- repair_bbio->file_offset, fs_info->sectorsize,
- logical, paddrs, step, mirror);
+ btrfs_repair_bbio_failure(repair_bbio, &repair_bbio->saved_iter,
+ fs_info->sectorsize, mirror);
} while (mirror != fbio->bbio->mirror_num);
done:
@@ -925,21 +923,23 @@ void btrfs_submit_bbio(struct btrfs_bio *bbio, int mirror_num)
* The I/O is issued synchronously to block the repair read completion from
* freeing the bio.
*
- * @ino: Offending inode number
- * @fileoff: File offset inside the inode
+ * @bbio: Original bbio where the repair is needed
+ * @orig_iter: Points to where the repair start is
* @length: Length of the repair write
- * @logical: Logical address of the range
- * @paddrs: Physical address array of the content
- * @step: Length of for each paddrs
* @mirror_num: Mirror number to write to. Must not be zero
*/
-int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 fileoff,
- u32 length, u64 logical, const phys_addr_t paddrs[],
- unsigned int step, int mirror_num)
+int btrfs_repair_bbio_failure(struct btrfs_bio *bbio, const struct bvec_iter *orig_iter,
+ u32 length, int mirror_num)
{
- const u32 nr_steps = DIV_ROUND_UP_POW2(length, step);
+ struct btrfs_inode *inode = bbio->inode;
+ struct btrfs_fs_info *fs_info = inode->root->fs_info;
struct btrfs_io_stripe smap = { 0 };
- struct bio *bio = NULL;
+ struct bvec_iter iter = *orig_iter;
+ struct bio *repair_bio = NULL;
+ const u64 logical = iter.bi_sector << SECTOR_SHIFT;
+ const u64 fileoff = bbio->file_offset +
+ ((iter.bi_sector - bbio->saved_iter.bi_sector) << SECTOR_SHIFT);
+ u32 cur = 0;
int ret = 0;
BUG_ON(!mirror_num);
@@ -950,8 +950,9 @@ int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 fileoff,
ASSERT(IS_ALIGNED(fileoff, fs_info->sectorsize));
/* Either it's a single data or metadata block. */
ASSERT(length <= BTRFS_MAX_BLOCKSIZE);
- ASSERT(step <= length);
- ASSERT(is_power_of_2(step));
+
+ /* Our current iter should not be before the original bbio saved_iter. */
+ ASSERT(iter.bi_sector >= bbio->saved_iter.bi_sector);
/*
* The fs either mounted RO or hit critical errors, no need
@@ -979,15 +980,22 @@ int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 fileoff,
goto out_counter_dec;
}
- bio = bio_alloc(smap.dev->bdev, nr_steps, REQ_OP_WRITE | REQ_SYNC, GFP_NOFS);
- bio->bi_iter.bi_sector = smap.physical >> SECTOR_SHIFT;
- for (int i = 0; i < nr_steps; i++) {
- ret = bio_add_page(bio, phys_to_page(paddrs[i]), step, offset_in_page(paddrs[i]));
- /* We should have allocated enough slots to contain all the different pages. */
- ASSERT(ret == step);
+ repair_bio = bio_alloc(smap.dev->bdev, max(1, length >> PAGE_SHIFT),
+ REQ_OP_WRITE | REQ_SYNC, GFP_NOFS);
+ repair_bio->bi_iter.bi_sector = smap.physical >> SECTOR_SHIFT;
+ while (cur < length) {
+ struct page *page = bio_iter_page(&bbio->bio, iter);
+ const u32 pg_off = bio_iter_offset(&bbio->bio, iter);
+ const u32 cur_len = min(bio_iter_len(&bbio->bio, iter), length - cur);
+
+ ret = bio_add_page(repair_bio, page, cur_len, pg_off);
+ ASSERT(ret == cur_len);
+ bio_advance_iter_single(&bbio->bio, &iter, cur_len);
+ cur += cur_len;
}
- ret = submit_bio_wait(bio);
- bio_put(bio);
+
+ ret = submit_bio_wait(repair_bio);
+ bio_put(repair_bio);
if (ret) {
/* try to remap that extent elsewhere? */
btrfs_dev_stat_inc_and_print(smap.dev, BTRFS_DEV_STAT_WRITE_ERRS);
@@ -995,8 +1003,9 @@ int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 fileoff,
}
btrfs_info_rl(fs_info,
- "read error corrected: ino %llu off %llu (dev %s sector %llu)",
- ino, fileoff, btrfs_dev_name(smap.dev),
+ "read error corrected: root %llu ino %llu off %llu (dev %s sector %llu)",
+ btrfs_root_id(inode->root), btrfs_ino(inode), fileoff,
+ btrfs_dev_name(smap.dev),
smap.physical >> SECTOR_SHIFT);
ret = 0;
diff --git a/fs/btrfs/bio.h b/fs/btrfs/bio.h
index 303ed6c7103d..b7bd377a0162 100644
--- a/fs/btrfs/bio.h
+++ b/fs/btrfs/bio.h
@@ -126,8 +126,7 @@ void btrfs_bio_end_io(struct btrfs_bio *bbio, blk_status_t status);
void btrfs_submit_bbio(struct btrfs_bio *bbio, int mirror_num);
void btrfs_submit_repair_write(struct btrfs_bio *bbio, int mirror_num, bool dev_replace);
-int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 fileoff,
- u32 length, u64 logical, const phys_addr_t paddrs[],
- unsigned int step, int mirror_num);
+int btrfs_repair_bbio_failure(struct btrfs_bio *bbio, const struct bvec_iter *orig_iter,
+ u32 length, int mirror_num);
#endif
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 52593423bdc3..f539497e1272 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -176,19 +176,23 @@ static int btrfs_repair_eb_io_failure(const struct extent_buffer *eb,
int mirror_num)
{
struct btrfs_fs_info *fs_info = eb->fs_info;
- const u32 step = min(fs_info->nodesize, PAGE_SIZE);
- const u32 nr_steps = eb->len / step;
- phys_addr_t paddrs[BTRFS_MAX_BLOCKSIZE / PAGE_SIZE];
+ struct btrfs_bio *bbio;
+ int ret;
if (sb_rdonly(fs_info->sb))
return -EROFS;
+ /*
+ * This bbio is only to queue all pages for btrfs_repair_bbio_failure().
+ * Thus it will never get its endio called.
+ */
+ bbio = btrfs_bio_alloc(max(1, fs_info->nodesize >> PAGE_SHIFT), REQ_OP_READ,
+ BTRFS_I(fs_info->btree_inode), eb->start, NULL, NULL);
for (int i = 0; i < num_extent_pages(eb); i++) {
struct folio *folio = eb->folios[i];
/* No large folio support yet. */
ASSERT(folio_order(folio) == 0);
- ASSERT(i < nr_steps);
/*
* For nodesize < page size, there is just one paddr, with some
@@ -197,11 +201,17 @@ static int btrfs_repair_eb_io_failure(const struct extent_buffer *eb,
* For nodesize >= page size, it's one or more paddrs, and eb->start
* must be aligned to page boundary.
*/
- paddrs[i] = page_to_phys(&folio->page) + offset_in_page(eb->start);
+ ret = bio_add_page(&bbio->bio, &folio->page, min(PAGE_SIZE, fs_info->nodesize),
+ offset_in_page(eb->start));
+ ASSERT(ret == min(PAGE_SIZE, fs_info->nodesize));
}
+ /* Since the bbio is never submitted, we have to save the iter manually. */
+ bbio->saved_iter = bbio->bio.bi_iter;
- return btrfs_repair_io_failure(fs_info, 0, eb->start, eb->len,
- eb->start, paddrs, step, mirror_num);
+ ret = btrfs_repair_bbio_failure(bbio, &bbio->saved_iter,
+ fs_info->nodesize, mirror_num);
+ bio_put(&bbio->bio);
+ return ret;
}
/*
--
2.54.0
* [PATCH 2/3] btrfs: enhance btrfs_data_csum_ok() to use bio for page iteration
From: Qu Wenruo @ 2026-07-02 7:36 UTC (permalink / raw)
To: linux-btrfs
Currently btrfs_data_csum_ok() requires a @paddrs[] array to iterate all
possible pages for bs > ps cases.
However for all btrfs_data_csum_ok() call sites, we already have a
btrfs_bio, and the bio infrastructure has many flexible ways to iterate
multiple pages already.
Change btrfs_data_csum_ok() to make full use of btrfs_bio by:
- Change the parameter list to require a @bvec_iter pointer
And remove @bio_offset, which can be calculated through @bvec_iter and
bbio->saved_iter.
Also remove paddrs[], we will iterate all the pages using bio
interfaces.
- Make the same parameter changes to repair_one_sector()
- Use bio interfaces to iterate pages from a bio
- Rename the function to btrfs_bio_data_csum_ok()
- Remove on-stack paddrs[] array usage
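As a rough illustration of how a per-block walk can replace the @bio_offset parameter, the userspace sketch below steps an iterator one block at a time and derives the position of each block's expected checksum from the iterator alone. It is a minimal model under assumptions: struct sketch_iter and all sketch_*() helpers are hypothetical, and sketch_advance() only mirrors how advancing an iterator moves bi_sector forward and shrinks bi_size.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors */

/* Hypothetical, simplified stand-in for the kernel's struct bvec_iter. */
struct sketch_iter {
	uint64_t bi_sector;
	uint32_t bi_size;
};

/* Advance by @bytes: the sector moves forward, the remaining size shrinks. */
static void sketch_advance(struct sketch_iter *iter, uint32_t bytes)
{
	assert(bytes <= iter->bi_size);
	assert((bytes & ((1U << SECTOR_SHIFT) - 1)) == 0);
	iter->bi_sector += bytes >> SECTOR_SHIFT;
	iter->bi_size -= bytes;
}

/* Byte offset of the block's expected csum inside the csum buffer. */
static uint32_t sketch_csum_slot(const struct sketch_iter *iter,
				 const struct sketch_iter *saved,
				 uint32_t sectorsize_bits, uint32_t csum_size)
{
	const uint32_t bio_offset =
		(uint32_t)((iter->bi_sector - saved->bi_sector) << SECTOR_SHIFT);

	return (bio_offset >> sectorsize_bits) * csum_size;
}

/* Visit every block of the bio; return the block count and the last slot. */
static uint32_t sketch_walk_blocks(const struct sketch_iter *saved,
				   uint32_t sectorsize_bits, uint32_t csum_size,
				   uint32_t *last_slot)
{
	struct sketch_iter iter;
	uint32_t nr = 0;

	for (iter = *saved; iter.bi_size;
	     sketch_advance(&iter, 1U << sectorsize_bits)) {
		*last_slot = sketch_csum_slot(&iter, saved, sectorsize_bits,
					      csum_size);
		nr++;
	}
	return nr;
}
```

With a 16K bio, 4K blocks and 4-byte checksums, the walk visits four blocks and the last expected checksum sits at byte offset 12 of the csum buffer.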
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/bio.c | 79 +++++++++++++++---------------------------
fs/btrfs/btrfs_inode.h | 4 +--
fs/btrfs/inode.c | 49 ++++++++++++++++++++------
3 files changed, 69 insertions(+), 63 deletions(-)
diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 9d0f72cc37ca..de021a9640df 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -180,29 +180,13 @@ static void btrfs_end_repair_bio(struct btrfs_bio *repair_bbio,
struct btrfs_failed_bio *fbio = repair_bbio->private;
struct btrfs_inode *inode = repair_bbio->inode;
struct btrfs_fs_info *fs_info = inode->root->fs_info;
- /*
- * We can not move forward the saved_iter, as it will be later
- * utilized by repair_bbio again.
- */
- struct bvec_iter saved_iter = repair_bbio->saved_iter;
- const u32 step = min(fs_info->sectorsize, PAGE_SIZE);
- const u32 nr_steps = repair_bbio->saved_iter.bi_size / step;
int mirror = repair_bbio->mirror_num;
- phys_addr_t paddrs[BTRFS_MAX_BLOCKSIZE / PAGE_SIZE];
- phys_addr_t paddr;
- unsigned int slot = 0;
- /* Repair bbio should be eaxctly one block sized. */
+ /* Repair bbio should be exactly one block sized. */
ASSERT(repair_bbio->saved_iter.bi_size == fs_info->sectorsize);
- btrfs_bio_for_each_block(paddr, &repair_bbio->bio, &saved_iter, step) {
- ASSERT(slot < nr_steps);
- paddrs[slot] = paddr;
- slot++;
- }
-
if (repair_bbio->bio.bi_status ||
- !btrfs_data_csum_ok(repair_bbio, dev, 0, paddrs)) {
+ !btrfs_bio_data_csum_ok(repair_bbio, &repair_bbio->saved_iter, dev)) {
bio_reset(&repair_bbio->bio, NULL, REQ_OP_READ);
repair_bbio->bio.bi_iter = repair_bbio->saved_iter;
@@ -236,25 +220,21 @@ static void btrfs_end_repair_bio(struct btrfs_bio *repair_bbio,
* read succeeded to restore the redundancy.
*/
static struct btrfs_failed_bio *repair_one_sector(struct btrfs_bio *failed_bbio,
- u32 bio_offset,
- phys_addr_t paddrs[],
+ const struct bvec_iter *orig_iter,
struct btrfs_failed_bio *fbio)
{
struct btrfs_inode *inode = failed_bbio->inode;
struct btrfs_fs_info *fs_info = inode->root->fs_info;
- const u32 sectorsize = fs_info->sectorsize;
- const u32 step = min(fs_info->sectorsize, PAGE_SIZE);
- const u32 nr_steps = sectorsize / step;
- /*
- * For bs > ps cases, the saved_iter can be partially moved forward.
- * In that case we should round it down to the block boundary.
- */
- const u64 logical = round_down(failed_bbio->saved_iter.bi_sector << SECTOR_SHIFT,
- sectorsize);
struct btrfs_bio *repair_bbio;
struct bio *repair_bio;
+ struct bvec_iter iter = *orig_iter;
+ const u32 sectorsize = fs_info->sectorsize;
+ const u32 bio_offset = ((iter.bi_sector - failed_bbio->saved_iter.bi_sector) <<
+ SECTOR_SHIFT);
+ const u64 logical = iter.bi_sector << SECTOR_SHIFT;
int num_copies;
int mirror;
+ u32 cur = 0;
btrfs_debug(fs_info, "repair read error: read error at %llu",
failed_bbio->file_offset + bio_offset);
@@ -275,17 +255,22 @@ static struct btrfs_failed_bio *repair_one_sector(struct btrfs_bio *failed_bbio,
atomic_inc(&fbio->repair_count);
- repair_bio = bio_alloc_bioset(NULL, nr_steps, REQ_OP_READ, GFP_NOFS,
+ repair_bio = bio_alloc_bioset(NULL, max(1, sectorsize >> PAGE_SHIFT),
+ REQ_OP_READ, GFP_NOFS,
&btrfs_repair_bioset);
repair_bio->bi_iter.bi_sector = logical >> SECTOR_SHIFT;
- for (int i = 0; i < nr_steps; i++) {
+ while (cur < sectorsize) {
+ struct page *page = bio_iter_page(&failed_bbio->bio, iter);
+ const u32 pg_off = bio_iter_offset(&failed_bbio->bio, iter);
+ const u32 cur_len = min(bio_iter_len(&failed_bbio->bio, iter),
+ sectorsize - cur);
int ret;
- ASSERT(offset_in_page(paddrs[i]) + step <= PAGE_SIZE);
+ ret = bio_add_page(repair_bio, page, cur_len, pg_off);
+ ASSERT(ret == cur_len);
- ret = bio_add_page(repair_bio, phys_to_page(paddrs[i]), step,
- offset_in_page(paddrs[i]));
- ASSERT(ret == step);
+ bio_advance_iter_single(&failed_bbio->bio, &iter, cur_len);
+ cur += cur_len;
}
repair_bbio = btrfs_bio(repair_bio);
@@ -303,18 +288,16 @@ static void btrfs_check_read_bio(struct btrfs_bio *bbio, struct btrfs_device *de
struct btrfs_inode *inode = bbio->inode;
struct btrfs_fs_info *fs_info = inode->root->fs_info;
const u32 sectorsize = fs_info->sectorsize;
- const u32 step = min(sectorsize, PAGE_SIZE);
- const u32 nr_steps = sectorsize / step;
- struct bvec_iter *iter = &bbio->saved_iter;
+ struct bvec_iter iter;
blk_status_t status = bbio->bio.bi_status;
struct btrfs_failed_bio *fbio = NULL;
- phys_addr_t paddrs[BTRFS_MAX_BLOCKSIZE / PAGE_SIZE];
- phys_addr_t paddr;
- u32 offset = 0;
/* Read-repair requires the inode field to be set by the submitter. */
ASSERT(inode);
+ /* The original bbio should be sectorsize aligned. */
+ ASSERT(IS_ALIGNED(bbio->saved_iter.bi_size, sectorsize));
+
/*
* Hand off repair bios to the repair code as there is no upper level
* submitter for them.
@@ -327,16 +310,10 @@ static void btrfs_check_read_bio(struct btrfs_bio *bbio, struct btrfs_device *de
/* Clear the I/O error. A failed repair will reset it. */
bbio->bio.bi_status = BLK_STS_OK;
- btrfs_bio_for_each_block(paddr, &bbio->bio, iter, step) {
- paddrs[(offset / step) % nr_steps] = paddr;
- offset += step;
-
- if (IS_ALIGNED(offset, sectorsize)) {
- if (status ||
- !btrfs_data_csum_ok(bbio, dev, offset - sectorsize, paddrs))
- fbio = repair_one_sector(bbio, offset - sectorsize,
- paddrs, fbio);
- }
+ for (iter = bbio->saved_iter; iter.bi_size;
+ bio_advance_iter(&bbio->bio, &iter, sectorsize)) {
+ if (status || !btrfs_bio_data_csum_ok(bbio, &iter, dev))
+ fbio = repair_one_sector(bbio, &iter, fbio);
}
if (bbio->csum != bbio->csum_inline)
kvfree(bbio->csum);
diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 7fdc6c3fd066..940cc1b24d1b 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -513,8 +513,8 @@ void btrfs_calculate_block_csum_pages(struct btrfs_fs_info *fs_info,
const phys_addr_t paddrs[], u8 *dest);
int btrfs_check_block_csum(struct btrfs_fs_info *fs_info, phys_addr_t paddr, u8 *csum,
const u8 * const csum_expected);
-bool btrfs_data_csum_ok(struct btrfs_bio *bbio, struct btrfs_device *dev,
- u32 bio_offset, const phys_addr_t paddrs[]);
+bool btrfs_bio_data_csum_ok(struct btrfs_bio *bbio, const struct bvec_iter *orig_iter,
+ struct btrfs_device *dev);
noinline int can_nocow_extent(struct btrfs_inode *inode, u64 offset, u64 *len,
struct btrfs_file_extent *file_extent,
bool nowait);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b47e2aa5071d..9c960cf7ccbe 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3347,27 +3347,31 @@ int btrfs_check_block_csum(struct btrfs_fs_info *fs_info, phys_addr_t paddr, u8
* different noncontiguous pages.
*
* @bbio: btrfs_io_bio which contains the csum
- * @dev: device the sector is on
- * @bio_offset: offset to the beginning of the bio (in bytes)
- * @paddrs: physical addresses which back the fs block
+ * @orig_iter: bvec iter pointing to the start of the block
+ * @dev: device the sector is on (optional)
*
* Check if the checksum on a data block is valid. When a checksum mismatch is
* detected, report the error and fill the corrupted range with zero.
*
* Return %true if the sector is ok or had no checksum to start with, else %false.
*/
-bool btrfs_data_csum_ok(struct btrfs_bio *bbio, struct btrfs_device *dev,
- u32 bio_offset, const phys_addr_t paddrs[])
+bool btrfs_bio_data_csum_ok(struct btrfs_bio *bbio,
+ const struct bvec_iter *orig_iter,
+ struct btrfs_device *dev)
{
struct btrfs_inode *inode = bbio->inode;
struct btrfs_fs_info *fs_info = inode->root->fs_info;
+ struct bvec_iter iter = *orig_iter;
+ struct btrfs_csum_ctx cctx;
const u32 blocksize = fs_info->sectorsize;
- const u32 step = min(blocksize, PAGE_SIZE);
- const u32 nr_steps = blocksize / step;
+ const u32 bio_offset = (iter.bi_sector - bbio->saved_iter.bi_sector) << SECTOR_SHIFT;
u64 file_offset = bbio->file_offset + bio_offset;
u64 end = file_offset + blocksize - 1;
u8 *csum_expected;
u8 csum[BTRFS_CSUM_SIZE];
+ u32 cur = 0;
+
+ ASSERT(iter.bi_sector >= bbio->saved_iter.bi_sector);
if (!bbio->csum)
return true;
@@ -3383,7 +3387,22 @@ bool btrfs_data_csum_ok(struct btrfs_bio *bbio, struct btrfs_device *dev,
csum_expected = bbio->csum + (bio_offset >> fs_info->sectorsize_bits) *
fs_info->csum_size;
- btrfs_calculate_block_csum_pages(fs_info, paddrs, csum);
+ btrfs_csum_init(&cctx, fs_info->csum_type);
+ while (cur < blocksize) {
+ struct page *page = bio_iter_page(&bbio->bio, iter);
+ const u32 pg_off = bio_iter_offset(&bbio->bio, iter);
+ const u32 cur_len = min(bio_iter_len(&bbio->bio, iter), blocksize - cur);
+ void *kaddr;
+
+ kaddr = kmap_local_page(page) + pg_off;
+ btrfs_csum_update(&cctx, kaddr, cur_len);
+ kunmap_local(kaddr);
+
+ bio_advance_iter_single(&bbio->bio, &iter, cur_len);
+ cur += cur_len;
+ }
+ btrfs_csum_final(&cctx, csum);
+
if (unlikely(memcmp(csum, csum_expected, fs_info->csum_size) != 0))
goto zeroit;
return true;
@@ -3393,8 +3412,18 @@ bool btrfs_data_csum_ok(struct btrfs_bio *bbio, struct btrfs_device *dev,
bbio->mirror_num);
if (dev)
btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_CORRUPTION_ERRS);
- for (int i = 0; i < nr_steps; i++)
- memzero_page(phys_to_page(paddrs[i]), offset_in_page(paddrs[i]), step);
+ cur = 0;
+ iter = *orig_iter;
+ while (cur < blocksize) {
+ struct page *page = bio_iter_page(&bbio->bio, iter);
+ const u32 pg_off = bio_iter_offset(&bbio->bio, iter);
+ const u32 cur_len = min(bio_iter_len(&bbio->bio, iter), blocksize - cur);
+
+ memzero_page(page, pg_off, cur_len);
+
+ bio_advance_iter_single(&bbio->bio, &iter, cur_len);
+ cur += cur_len;
+ }
return false;
}
--
2.54.0
* [PATCH 3/3] btrfs: use a shared helper to calculate data checksum for a bio
From: Qu Wenruo @ 2026-07-02 7:36 UTC (permalink / raw)
To: linux-btrfs
Since we are already calculating the data checksum using the bio interface,
extract the generation part into btrfs_bio_gen_data_csum(), and use that
to replace the paddrs[] array based solution in csum_one_bio().
This reduces on-stack memory usage by 128 bytes.
While we're here, also slightly change the error message to mention the
root id, and for metadata use the btree inode for the root/ino
output.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/btrfs_inode.h | 2 ++
fs/btrfs/file-item.c | 19 +++++------------
fs/btrfs/inode.c | 48 +++++++++++++++++++++++++-----------------
3 files changed, 36 insertions(+), 33 deletions(-)
diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 940cc1b24d1b..00fdbae44d85 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -515,6 +515,8 @@ int btrfs_check_block_csum(struct btrfs_fs_info *fs_info, phys_addr_t paddr, u8
const u8 * const csum_expected);
bool btrfs_bio_data_csum_ok(struct btrfs_bio *bbio, const struct bvec_iter *orig_iter,
struct btrfs_device *dev);
+void btrfs_bio_gen_data_csum(struct btrfs_bio *bbio, const struct bvec_iter *orig_iter,
+ u8 *csum);
noinline int can_nocow_extent(struct btrfs_inode *inode, u64 offset, u64 *len,
struct btrfs_file_extent *file_extent,
bool nowait);
diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index cf50fd623f41..a1629c13935c 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -801,25 +801,16 @@ static void csum_one_bio(struct btrfs_bio *bbio, struct bvec_iter *src)
{
struct btrfs_inode *inode = bbio->inode;
struct btrfs_fs_info *fs_info = inode->root->fs_info;
- struct bio *bio = &bbio->bio;
struct btrfs_ordered_sum *sums = bbio->sums;
- struct bvec_iter iter = *src;
- phys_addr_t paddr;
+ struct bvec_iter iter;
const u32 blocksize = fs_info->sectorsize;
- const u32 step = min(blocksize, PAGE_SIZE);
- const u32 nr_steps = blocksize / step;
- phys_addr_t paddrs[BTRFS_MAX_BLOCKSIZE / PAGE_SIZE];
- u32 offset = 0;
int index = 0;
- btrfs_bio_for_each_block(paddr, bio, &iter, step) {
- paddrs[(offset / step) % nr_steps] = paddr;
- offset += step;
+ for (iter = *src; iter.bi_size;
+ bio_advance_iter(&bbio->bio, &iter, blocksize)) {
+ btrfs_bio_gen_data_csum(bbio, &iter, sums->sums + index);
- if (IS_ALIGNED(offset, blocksize)) {
- btrfs_calculate_block_csum_pages(fs_info, paddrs, sums->sums + index);
- index += fs_info->csum_size;
- }
+ index += fs_info->csum_size;
}
}
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9c960cf7ccbe..23afdd18c59b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3342,6 +3342,34 @@ int btrfs_check_block_csum(struct btrfs_fs_info *fs_info, phys_addr_t paddr, u8
return 0;
}
+/* Generate data checksum for a single fs block, pointed by @orig_iter. */
+void btrfs_bio_gen_data_csum(struct btrfs_bio *bbio, const struct bvec_iter *orig_iter,
+ u8 *csum)
+{
+ struct btrfs_inode *inode = bbio->inode;
+ struct btrfs_fs_info *fs_info = inode->root->fs_info;
+ struct btrfs_csum_ctx cctx;
+ struct bvec_iter iter = *orig_iter;
+ const u32 blocksize = fs_info->sectorsize;
+ u32 cur = 0;
+
+ btrfs_csum_init(&cctx, fs_info->csum_type);
+ while (cur < blocksize) {
+ struct page *page = bio_iter_page(&bbio->bio, iter);
+ const u32 pg_off = bio_iter_offset(&bbio->bio, iter);
+ const u32 cur_len = min(bio_iter_len(&bbio->bio, iter), blocksize - cur);
+ void *kaddr;
+
+ kaddr = kmap_local_page(page) + pg_off;
+ btrfs_csum_update(&cctx, kaddr, cur_len);
+ kunmap_local(kaddr);
+
+ bio_advance_iter_single(&bbio->bio, &iter, cur_len);
+ cur += cur_len;
+ }
+ btrfs_csum_final(&cctx, csum);
+}
+
/*
* Verify the checksum of a single data sector, which can be scattered at
* different noncontiguous pages.
@@ -3362,7 +3390,6 @@ bool btrfs_bio_data_csum_ok(struct btrfs_bio *bbio,
struct btrfs_inode *inode = bbio->inode;
struct btrfs_fs_info *fs_info = inode->root->fs_info;
struct bvec_iter iter = *orig_iter;
- struct btrfs_csum_ctx cctx;
const u32 blocksize = fs_info->sectorsize;
const u32 bio_offset = (iter.bi_sector - bbio->saved_iter.bi_sector) << SECTOR_SHIFT;
u64 file_offset = bbio->file_offset + bio_offset;
@@ -3387,22 +3414,7 @@ bool btrfs_bio_data_csum_ok(struct btrfs_bio *bbio,
csum_expected = bbio->csum + (bio_offset >> fs_info->sectorsize_bits) *
fs_info->csum_size;
- btrfs_csum_init(&cctx, fs_info->csum_type);
- while (cur < blocksize) {
- struct page *page = bio_iter_page(&bbio->bio, iter);
- const u32 pg_off = bio_iter_offset(&bbio->bio, iter);
- const u32 cur_len = min(bio_iter_len(&bbio->bio, iter), blocksize - cur);
- void *kaddr;
-
- kaddr = kmap_local_page(page) + pg_off;
- btrfs_csum_update(&cctx, kaddr, cur_len);
- kunmap_local(kaddr);
-
- bio_advance_iter_single(&bbio->bio, &iter, cur_len);
- cur += cur_len;
- }
- btrfs_csum_final(&cctx, csum);
-
+ btrfs_bio_gen_data_csum(bbio, orig_iter, csum);
if (unlikely(memcmp(csum, csum_expected, fs_info->csum_size) != 0))
goto zeroit;
return true;
@@ -3412,8 +3424,6 @@ bool btrfs_bio_data_csum_ok(struct btrfs_bio *bbio,
bbio->mirror_num);
if (dev)
btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_CORRUPTION_ERRS);
- cur = 0;
- iter = *orig_iter;
while (cur < blocksize) {
struct page *page = bio_iter_page(&bbio->bio, iter);
const u32 pg_off = bio_iter_offset(&bbio->bio, iter);
--
2.54.0