From: Christian Brauner <brauner@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [GIT PULL 10/16 for v7.2] vfs writeback
Date: Fri, 12 Jun 2026 17:14:17 +0200
Message-ID: <20260612-vfs-writeback-v72-d7ca37da4512@brauner>
In-Reply-To: <20260612-vfs-v72-20facee87e19@brauner>
Hey Linus,
/* Summary */
This contains the writeback changes for this cycle:
* Fix a race between cgroup_writeback_umount() and inode_switch_wbs()

  When a container exits, a race between cgroup_writeback_umount() and
  inode_switch_wbs()/cleanup_offline_cgwb() can trigger "VFS: Busy
  inodes after unmount" followed by a use-after-free on percpu
  counters. There is a window between inode_prepare_wbs_switch()
  returning true (having passed the SB_ACTIVE check and grabbed the
  inode) and the subsequent wb_queue_isw() call: if
  cgroup_writeback_umount() observes the global isw_nr_in_flight
  counter as non-zero but flush_workqueue() finds nothing queued yet,
  it returns early - leaving a held inode reference that blocks
  evict_inodes() and a later iput() that hits freed percpu counters.

  The race is closed by covering the window from
  inode_prepare_wbs_switch() through wb_queue_isw() with an RCU
  read-side critical section and synchronizing in the umount path. On
  top of that the now-dead rcu_barrier() left over from the
  queue_rcu_work() era is removed, and the global
  synchronize_rcu()/flush_workqueue() pair is replaced with a per-sb
  in-flight counter plus pin/unpin/drain helpers so umount no longer
  serializes against switch activity on unrelated superblocks.

  Under cgroup writeback churn on a 16 vCPU guest this takes umount
  latency from ~92-138ms p50 down to ~5-8ms p50 and the cumulative
  cost of cgroup_writeback_umount() from ~62ms to ~4us per call. The
  initial race fix is kept separate and minimal so it backports
  cleanly to stable trees that still queue switches via
  queue_rcu_work().
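
To make the per-sb pin/unpin/drain idea above concrete, here is a rough
userspace model using C11 atomics. It is only a sketch under stated
assumptions: struct toy_sb and the toy_isw_*() helpers are hypothetical
names invented for illustration, the busy-wait stands in for whatever
waiting primitive the kernel code uses, and none of this is the code in
fs/fs-writeback.c.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a superblock; NOT the kernel's struct. */
struct toy_sb {
	atomic_bool active;        /* models the SB_ACTIVE check */
	atomic_int  isw_in_flight; /* per-sb count of pinned switches */
};

static void toy_sb_init(struct toy_sb *sb)
{
	atomic_init(&sb->active, true);
	atomic_init(&sb->isw_in_flight, 0);
}

/*
 * Pin first, then check "active". With sequentially consistent atomics,
 * a pin that sees active == true has already published its increment,
 * so a drain that runs after clearing "active" cannot miss it.
 */
static bool toy_isw_pin(struct toy_sb *sb)
{
	atomic_fetch_add(&sb->isw_in_flight, 1);
	if (!atomic_load(&sb->active)) {
		/* Teardown already started: back out and refuse. */
		atomic_fetch_sub(&sb->isw_in_flight, 1);
		return false;
	}
	return true;
}

/* Drop the pin once the switch has finished (or was abandoned). */
static void toy_isw_unpin(struct toy_sb *sb)
{
	atomic_fetch_sub(&sb->isw_in_flight, 1);
}

/*
 * Teardown path: stop new pins, then wait only for THIS sb's pins.
 * Pins held on other toy_sb instances are never looked at.
 */
static void toy_isw_drain(struct toy_sb *sb)
{
	atomic_store(&sb->active, false);
	while (atomic_load(&sb->isw_in_flight) > 0)
		sched_yield(); /* userspace placeholder for a real wait */
}
```

The point of the model is the last helper: draining one toy_sb returns
as soon as its own counter reaches zero, even if another instance still
has pins outstanding, which is the property the per-sb counter is meant
to provide at umount.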
* Improve write performance with RWF_DONTCACHE

  Dirty DONTCACHE pages are now tracked per bdi_writeback so that the
  writeback flusher can be kicked in a targeted fashion for
  IOCB_DONTCACHE writes instead of relying on global writeback, and
  the PG_dropbehind flag is preserved when a folio is split.
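
For context on how userspace reaches the IOCB_DONTCACHE path, a minimal
sketch of a single RWF_DONTCACHE write via pwritev2(2) could look like
the following. The helper name write_dontcache() and the fallback to
pwrite(2) are illustrative assumptions and not part of this series; the
fallback #define only covers libc headers that do not yet carry the
flag.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Older libc headers may lack the flag; value taken from the uapi header. */
#ifndef RWF_DONTCACHE
#define RWF_DONTCACHE 0x00000080
#endif

/*
 * Hypothetical helper: write one buffer at offset "off", asking the
 * kernel to drop the pages from the page cache once writeback is done.
 * If the kernel or filesystem rejects the flag, fall back to a plain
 * buffered pwrite(). Returns bytes written or -1 with errno set; short
 * writes are not retried in this sketch.
 */
static ssize_t write_dontcache(int fd, const void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	ssize_t n = pwritev2(fd, &iov, 1, off, RWF_DONTCACHE);

	if (n < 0 && (errno == EOPNOTSUPP || errno == EINVAL || errno == ENOSYS))
		n = pwrite(fd, buf, len, off);
	return n;
}
```

A caller would open the file normally (no O_DIRECT and no alignment
requirements are assumed here) and use write_dontcache() in place of
pwrite() for data it does not expect to read back soon.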
/* Testing */
gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)
No build failures or warnings were observed.
/* Conflicts */
Merge conflicts with mainline
=============================
No known conflicts.
Merge conflicts with other trees
================================
The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:
Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)
are available in the Git repository at:
git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.writeback
for you to fetch changes up to 0275dc184aa007b260374af6d46fb15741c062a8:
Merge patch series "mm: improve write performance with RWF_DONTCACHE" (2026-06-04 10:18:25 +0200)
----------------------------------------------------------------
vfs-7.2-rc1.writeback
Please consider pulling these changes from the signed vfs-7.2-rc1.writeback tag.
Thanks!
Christian
----------------------------------------------------------------
Baokun Li (3):
      writeback: fix race between cgroup_writeback_umount() and inode_switch_wbs()
      writeback: drop now-unnecessary rcu_barrier() in cgroup_writeback_umount()
      writeback: use a per-sb counter to drain inode wb switches at umount

Christian Brauner (2):
      Merge patch series "writeback: fix race between cgroup_writeback_umount() and inode_switch_wbs()"
      Merge patch series "mm: improve write performance with RWF_DONTCACHE"

Jeff Layton (3):
      mm: preserve PG_dropbehind flag during folio split
      mm: track DONTCACHE dirty pages per bdi_writeback
      mm: kick writeback flusher for IOCB_DONTCACHE with targeted dirty tracking

 fs/fs-writeback.c                | 138 +++++++++++++++++++++++++++++++--------
 include/linux/backing-dev-defs.h |   3 +
 include/linux/fs.h               |   6 +-
 include/linux/fs/super_types.h   |   8 +++
 include/trace/events/writeback.h |   3 +-
 mm/filemap.c                     |  15 ++++-
 mm/huge_memory.c                 |   1 +
 mm/page-writeback.c              |   6 ++
 8 files changed, 147 insertions(+), 33 deletions(-)
Thread overview: 17+ messages
2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
2026-06-12 15:11 ` [GIT PULL 01/16 for v7.2] vfs kfunc Christian Brauner
2026-06-12 15:11 ` [GIT PULL 02/16 for v7.2] vfs exportfs Christian Brauner
2026-06-12 15:12 ` [GIT PULL 03/16 for v7.2] vfs inode Christian Brauner
2026-06-12 15:12 ` [GIT PULL 04/16 for v7.2] vfs directory delegations Christian Brauner
2026-06-12 15:12 ` [GIT PULL 05/16 for v7.2] vfs casefold Christian Brauner
2026-06-12 15:13 ` [GIT PULL 06/16 for v7.2] kernel task_exec_state Christian Brauner
2026-06-12 15:13 ` [GIT PULL 07/16 for v7.2] kernel misc Christian Brauner
2026-06-12 15:13 ` [GIT PULL 08/16 for v7.2] vfs openat2 Christian Brauner
2026-06-12 15:14 ` [GIT PULL 09/16 for v7.2] vfs super Christian Brauner
2026-06-12 15:14 ` Christian Brauner [this message]
2026-06-12 15:14 ` [GIT PULL 11/16 for v7.2] vfs bh Christian Brauner
2026-06-12 15:15 ` [GIT PULL 12/16 for v7.2] vfs eventpoll Christian Brauner
2026-06-12 15:15 ` [GIT PULL 13/16 for v7.2] vfs iomap Christian Brauner
2026-06-12 15:15 ` [GIT PULL 14/16 for v7.2] vfs xattr Christian Brauner
2026-06-12 15:16 ` [GIT PULL 15/16 for v7.2] vfs misc Christian Brauner
2026-06-12 15:16 ` [GIT PULL 16/16 for v7.2] vfs procfs Christian Brauner