Re: [PATCH 2/4] mm: add atomic flush guard for IOCB_DONTCACHE writeback

public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed

From: Christoph Hellwig <hch@infradead.org>
To: Jeff Layton <jlayton@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <ljs@kernel.org>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@kernel.org>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>, Mike Snitzer <snitzer@kernel.org>,
	Chuck Lever <chuck.lever@oracle.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nfs@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 2/4] mm: add atomic flush guard for IOCB_DONTCACHE writeback
Date: Sun, 5 Apr 2026 22:49:24 -0700	[thread overview]
Message-ID: <adNJZBXeJomWmhdf@infradead.org> (raw)
In-Reply-To: <629f21c6591903512eb2f3f3c4d6b14a9ac7b91a.camel@kernel.org>

On Thu, Apr 02, 2026 at 08:49:45AM -0400, Jeff Layton wrote:
> > Have you considered stopping to do in-caller writeback for
> > IOCB_DONTCACHE vs just leaving it to the writeback daeon?
> > 
> > Either by totally disabling the writeback and just leaving the
> > dropbehind bit, or by queuing up wb_writeback_work instances for
> > the ranges, or by just increasing the pressure for the writeback
> > daemon.  Note that with all schemes including the one in this patch
> > we might eventually run into writeback scalability limits, which
> > will require multiple writeback workers.
> 
> I did test a "dropbehind" mode that just set the dropbehind bit without
> doing the flush at the end of the write. It was better than stock
> dontcache but the tail latencies were still pretty bad.
> 
> I think having each writer do some writeback submission work makes a
> lot of sense. It helps keep the dirty pages below the dirty thresholds
> and doesn't seem to tax each writing task _too_ much. The trick is
> avoiding lock contention while doing it.

Well, an any time you hit a shared resources from multiple threads you
create that lock contention.   Which is why in file system and writeback
land we've moved away from random user processes hitting both data and
metadata (e.g. XFS AIL) writeback as it leads to these scalability
issues.  At some point we might run out of steam in a single thread,
although so far that's mostly been because it does stupid things
(e.g. writeback on file systems doing complex allocator stuff).

> I think what would be ideal would be to have some (lockless) mechanism
> to say "there is enough data touched by the range just written to kick
> off a write that's a suitable size for the backing store". Each writer
> could check that and then kick off writeback for an approprite range.

And that is called the writeback thread.  So what we should do there
is to make sure we queue up writeback on it for each dontcache write.
Initially queuing up a wb_writeback_work for each range might be first
approximation, although we should probably find a way to just increase
a threshold if going down that road.

> I think this even could be beneficial in the normal buffered write
> codepath too.

Yes, we've had lots of observation that the current 30s timeout is
actively harmful.  Especially on SSDs, but even on HDD just keeping
the active might make sense.

next prev parent reply	other threads:[~2026-04-06  5:49 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-01 19:10 [PATCH 0/4] mm: improve write performance with RWF_DONTCACHE Jeff Layton
2026-04-01 19:10 ` [PATCH 1/4] mm: fix IOCB_DONTCACHE write performance with rate-limited writeback Jeff Layton
2026-04-02  4:43   ` Ritesh Harjani
2026-04-02 11:59     ` Jeff Layton
2026-04-02 12:40       ` Ritesh Harjani
2026-04-02  5:21   ` Christoph Hellwig
2026-04-02 12:28     ` Jeff Layton
2026-04-06  5:44       ` Christoph Hellwig
2026-04-01 19:10 ` [PATCH 2/4] mm: add atomic flush guard for IOCB_DONTCACHE writeback Jeff Layton
2026-04-02  5:27   ` Christoph Hellwig
2026-04-02 12:49     ` Jeff Layton
2026-04-06  5:49       ` Christoph Hellwig [this message]
2026-04-06 13:32         ` Jeff Layton
2026-04-01 19:11 ` [PATCH 3/4] testing: add nfsd-io-bench NFS server benchmark suite Jeff Layton
2026-04-01 19:11 ` [PATCH 4/4] testing: add dontcache-bench local filesystem " Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=adNJZBXeJomWmhdf@infradead.org \
    --to=hch@infradead.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=brauner@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=david@kernel.org \
    --cc=jack@suse.cz \
    --cc=jlayton@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=ljs@kernel.org \
    --cc=mhocko@suse.com \
    --cc=rppt@kernel.org \
    --cc=snitzer@kernel.org \
    --cc=surenb@google.com \
    --cc=vbabka@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox