From: Qu Wenruo <wqu@suse.com>
To: Jan Kara <jack@suse.cz>, David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: Limit size of bios submitted from writeback
Date: Wed, 22 Apr 2026 19:59:38 +0930 [thread overview]
Message-ID: <2f467264-0f15-4ced-858e-bbfdad4dcefa@suse.com> (raw)
In-Reply-To: <20260422094255.12672-2-jack@suse.cz>
On 2026/4/22 19:12, Jan Kara wrote:
[...]
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index ca3e4b99aec2..9c603d59a09b 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2555,6 +2555,16 @@ static int extent_write_cache_pages(struct address_space *mapping,
> break;
> }
>
> + /*
> + * If we have accumulated decent amount of IO, send it
> + * to the block layer so that IO can run while we are
> + * accumulating more folios to write.
> + */
> + if (bio_ctrl->bbio &&
> + bio_ctrl->bbio->bio.bi_iter.bi_size >=
> + inode_to_fs_info(inode)->writeback_bio_size)
> + submit_write_bio(bio_ctrl, 0);
I'd prefer to move the check a little earlier, ideally inside
submit_extent_folio() where we already have a similar check for ordered
extent boundaries.

One reason is that we have recently been considering huge folio support,
and with huge folios on arm64 a single folio can be as large as 32MiB.
In the worst case we could therefore accumulate a bio as large as 96MiB
before submitting it.
[...]
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index a88e68f90564..cb654e990333 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -8179,6 +8179,60 @@ int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info)
> return ret;
> }
>
> +/*
> + * At maximum we submit writeback bios 64MB in size to avoid too large
> + * submission latencies
> + */
> +#define BTRFS_MAX_WB_BIO_SIZE (64 << 20)
> +
> +int btrfs_init_writeback_bio_size(struct btrfs_fs_info *fs_info)
> +{
> + struct rb_node *node;
> + u32 writeback_bio_sectors = 1;
> +
> + read_lock(&fs_info->mapping_tree_lock);
> + /*
> + * For each data chunk compute the size of bio large enough to submit
> + * optimum size request for each of chunk's disk and take maximum
> + * over all data chunks.
> + */
> + for (node = rb_first_cached(&fs_info->mapping_tree); node;
> + node = rb_next(node)) {
Iterating through all chunk maps may take quite some time on huge
filesystems. Meanwhile the device list is much smaller than the set of
chunk maps, so what about iterating over all devices instead?

Not to mention that by going through the chunk maps we would hit the same
devices again and again.

This may not handle all corner cases, e.g. a filesystem with new disks
added later, but it should handle the most common cases pretty well.
> + struct btrfs_chunk_map *map;
> + unsigned int data_stripes, opt_rq_size = fs_info->sectorsize;
> + int i;
> +
> + map = rb_entry(node, struct btrfs_chunk_map, rb_node);
> + if (!(map->type & BTRFS_BLOCK_GROUP_DATA))
> + continue;
> + data_stripes = calc_data_stripes(map->type, map->num_stripes);
> + for (i = 0; i < map->num_stripes; i++) {
> + struct request_queue *queue;
> + unsigned int io_opt;
> +
> + if (!map->stripes[i].dev)
> + continue;
> + queue = bdev_get_queue(map->stripes[i].dev->bdev);
> + io_opt = queue_io_opt(queue) ? :
> + queue_max_sectors(queue) << SECTOR_SHIFT;
> + opt_rq_size = max(opt_rq_size, io_opt);
I'm wondering whether we should use the minimum or the maximum size here.

If the optimal IO sizes differ wildly, e.g. 512K vs 128M, the final
result will be clamped to the 64M cap; wouldn't a 64M IO submission
stall the pipeline for the 512K device?
Thanks a lot for finding the root cause!
Qu