linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Carlos Maiolino <cem@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	willy@infradead.org, dlemoal@kernel.org, hans.holmberg@wdc.com,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/3] writeback: allow the file system to override MIN_WRITEBACK_PAGES
Date: Wed, 15 Oct 2025 08:57:35 -0700	[thread overview]
Message-ID: <20251015155735.GC6178@frogsfrogsfrogs> (raw)
In-Reply-To: <20251015062728.60104-3-hch@lst.de>

On Wed, Oct 15, 2025 at 03:27:15PM +0900, Christoph Hellwig wrote:
> The relatively low minimal writeback size of 4MiB leads means that
> written back inodes on rotational media are switched a lot.  Besides
> introducing additional seeks, this also can lead to extreme file
> fragmentation on zoned devices when a lot of files are cached relative
> to the available writeback bandwidth.
> 
> Add a superblock field that allows the file system to override the
> default size.

I havea a few side-questy questions about this patch:

Should this be some sort of BDI field?  Maybe there are other workloads
that create a lot of dirty pages and the sysadmin would like to be able
to tell the fs to schedule larger chunks of writeback before switching
to another inode?

XFS can have two volumes, should we be using the rtdev's bdi for
realtime files and the data dev's bdi for non-rt files?  That looks like
a mess to sort out though, since there's a fair number of places where
we just dereference super_block::s_bdi.

Also I have no idea what we'd do for filesystem raid -- synthesize a bdi
for that?  And then how would you advertise that such-and-such fd maps
to a particular bdi?

(Except for the first question, I don't view the other Qs as blocking
issues; the mechanical code change looks ok to me aside from
s_min_writeback_pages should be long like Ted said)

--D

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/fs-writeback.c         | 14 +++++---------
>  fs/super.c                |  1 +
>  include/linux/fs.h        |  1 +
>  include/linux/writeback.h |  5 +++++
>  4 files changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 11fd08a0efb8..6d50b02cdab6 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -31,11 +31,6 @@
>  #include <linux/memcontrol.h>
>  #include "internal.h"
>  
> -/*
> - * 4MB minimal write chunk size
> - */
> -#define MIN_WRITEBACK_PAGES	(4096UL >> (PAGE_SHIFT - 10))
> -
>  /*
>   * Passed into wb_writeback(), essentially a subset of writeback_control
>   */
> @@ -1874,8 +1869,8 @@ static int writeback_single_inode(struct inode *inode,
>  	return ret;
>  }
>  
> -static long writeback_chunk_size(struct bdi_writeback *wb,
> -				 struct wb_writeback_work *work)
> +static long writeback_chunk_size(struct super_block *sb,
> +		struct bdi_writeback *wb, struct wb_writeback_work *work)
>  {
>  	long pages;
>  
> @@ -1898,7 +1893,8 @@ static long writeback_chunk_size(struct bdi_writeback *wb,
>  	pages = min(wb->avg_write_bandwidth / 2,
>  		    global_wb_domain.dirty_limit / DIRTY_SCOPE);
>  	pages = min(pages, work->nr_pages);
> -	return round_down(pages + MIN_WRITEBACK_PAGES, MIN_WRITEBACK_PAGES);
> +	return round_down(pages + sb->s_min_writeback_pages,
> +			sb->s_min_writeback_pages);
>  }
>  
>  /*
> @@ -2000,7 +1996,7 @@ static long writeback_sb_inodes(struct super_block *sb,
>  		inode->i_state |= I_SYNC;
>  		wbc_attach_and_unlock_inode(&wbc, inode);
>  
> -		write_chunk = writeback_chunk_size(wb, work);
> +		write_chunk = writeback_chunk_size(inode->i_sb, wb, work);
>  		wbc.nr_to_write = write_chunk;
>  		wbc.pages_skipped = 0;
>  
> diff --git a/fs/super.c b/fs/super.c
> index 5bab94fb7e03..599c1d2641fe 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -389,6 +389,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
>  		goto fail;
>  	if (list_lru_init_memcg(&s->s_inode_lru, s->s_shrink))
>  		goto fail;
> +	s->s_min_writeback_pages = MIN_WRITEBACK_PAGES;
>  	return s;
>  
>  fail:
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index c895146c1444..23f1f10646b7 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1583,6 +1583,7 @@ struct super_block {
>  
>  	spinlock_t		s_inode_wblist_lock;
>  	struct list_head	s_inodes_wb;	/* writeback inodes */
> +	unsigned int		s_min_writeback_pages;
>  } __randomize_layout;
>  
>  static inline struct user_namespace *i_user_ns(const struct inode *inode)
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index 22dd4adc5667..49e1dd96f43e 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -374,4 +374,9 @@ bool redirty_page_for_writepage(struct writeback_control *, struct page *);
>  void sb_mark_inode_writeback(struct inode *inode);
>  void sb_clear_inode_writeback(struct inode *inode);
>  
> +/*
> + * 4MB minimal write chunk size
> + */
> +#define MIN_WRITEBACK_PAGES	(4096UL >> (PAGE_SHIFT - 10))
> +
>  #endif		/* WRITEBACK_H */
> -- 
> 2.47.3
> 
> 

  parent reply	other threads:[~2025-10-15 15:57 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-15  6:27 allow file systems to increase the minimum writeback chunk size Christoph Hellwig
2025-10-15  6:27 ` [PATCH 1/3] writeback: cleanup writeback_chunk_size Christoph Hellwig
2025-10-15  7:05   ` Damien Le Moal
2025-10-15 15:48   ` Darrick J. Wong
2025-10-20  9:34   ` Jan Kara
2025-10-15  6:27 ` [PATCH 2/3] writeback: allow the file system to override MIN_WRITEBACK_PAGES Christoph Hellwig
2025-10-15  7:09   ` Damien Le Moal
2025-10-15  7:27     ` Christoph Hellwig
2025-10-15 15:13   ` Theodore Ts'o
2025-10-16  4:33     ` Christoph Hellwig
2025-10-15 15:57   ` Darrick J. Wong [this message]
2025-10-16  4:37     ` Christoph Hellwig
2025-10-15 20:49   ` Dave Chinner
2025-10-16  4:39     ` Christoph Hellwig
2025-10-16  8:23       ` Dave Chinner
2025-10-15  6:27 ` [PATCH 3/3] xfs: set s_min_writeback_pages for zoned file systems Christoph Hellwig
2025-10-15  7:10   ` Damien Le Moal
2025-10-15 16:01   ` Darrick J. Wong
2025-10-15  7:11 ` allow file systems to increase the minimum writeback chunk size Damien Le Moal
  -- strict thread matches above, loose matches on Subject: below --
2025-10-17  3:45 allow file systems to increase the minimum writeback chunk size v2 Christoph Hellwig
2025-10-17  3:45 ` [PATCH 2/3] writeback: allow the file system to override MIN_WRITEBACK_PAGES Christoph Hellwig
2025-10-17 12:32   ` Jan Kara
2025-10-17 15:45   ` Darrick J. Wong
2025-10-20  9:35   ` Jan Kara
2025-10-24 14:33   ` Nirjhar Roy (IBM)
2025-10-24 15:12     ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251015155735.GC6178@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=brauner@kernel.org \
    --cc=cem@kernel.org \
    --cc=dlemoal@kernel.org \
    --cc=hans.holmberg@wdc.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).