From: Christian Brauner <brauner@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [GIT PULL 11/12 for v6.18] writeback
Date: Fri, 26 Sep 2025 16:19:05 +0200
Message-ID: <20250926-vfs-writeback-dc8e63496609@brauner>
In-Reply-To: <20250926-vfs-618-e880cf3b910f@brauner>
Hey Linus,
/* Summary */
This contains work addressing lockups reported by users when a systemd
unit that reads lots of files from a filesystem mounted with the
lazytime mount option exits.
With the lazytime mount option enabled we can be switching many dirty
inodes on cgroup exit to the parent cgroup. The numbers observed in
practice when the systemd slice of a large cron job exits can easily
reach hundreds of thousands or millions.
The logic in inode_do_switch_wbs() which sorts the inode into the
appropriate place in the b_dirty list of the target wb, however, has
linear complexity in the number of dirty inodes. The overall time
complexity of switching all the inodes is therefore quadratic, leading
to workers being pegged for hours consuming 100% of the CPU while
switching inodes to the parent wb.
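To make the quadratic cost concrete, here is a minimal userspace model in Python. It is not the kernel code. The function names, the use of plain integers as timestamps, and the newest-first ordering are assumptions made for illustration. It compares searching for each inode's position against a constant-time insert that keeps the list ordered by re-stamping the entry, which is one possible way to maintain ordering without sorting.

```python
from collections import deque


def is_newest_first(entries):
    """Return True if timestamps never increase from head to tail."""
    items = list(entries)
    return all(a >= b for a, b in zip(items, items[1:]))


def switch_sorted(target, incoming):
    """Insert each timestamp by scanning from the head of the list.

    Returns (new_list, steps), where steps counts visited list entries.
    Only the scan is counted; Python's list.insert() cost is ignored.
    """
    target = list(target)
    steps = 0
    for ts in incoming:
        pos = 0
        # Walk past every entry that is newer than the one being inserted.
        while pos < len(target) and target[pos] > ts:
            pos += 1
            steps += 1
        target.insert(pos, ts)
    return target, steps


def switch_restamp(target, incoming, now):
    """Hypothetical constant-time variant: stamp each entry with `now`
    and put it at the head, so the order holds without any search."""
    target = deque(target)
    if target and now < target[0]:
        raise ValueError("now must not be older than the newest entry")
    steps = 0
    for _ in incoming:
        target.appendleft(now)
        steps += 1
    return list(target), steps


if __name__ == "__main__":
    n = 1000
    # Worst case for the scan: every new entry is older than all others.
    incoming = list(range(n, 0, -1))
    _, sorted_steps = switch_sorted([], incoming)
    _, restamp_steps = switch_restamp([], incoming, now=2 * n)
    print("sorted insert steps:", sorted_steps)
    print("re-stamp steps:", restamp_steps)
```

In this model the scan visits n(n-1)/2 entries for n inodes in the worst case, while the re-stamping variant does one step per inode. The price of the second variant is that the original timestamp is lost.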
Simple reproducer of the issue:
  FILES=10000
  # Filesystem mounted with lazytime mount option
  MNT=/mnt/
  echo "Creating files and switching timestamps"
  for (( j = 0; j < 50; j ++ )); do
      mkdir $MNT/dir$j
      for (( i = 0; i < $FILES; i++ )); do
          echo "foo" >$MNT/dir$j/file$i
      done
      touch -a -t 202501010000 $MNT/dir$j/file*
  done
  wait
  echo "Syncing and flushing"
  sync
  echo 3 >/proc/sys/vm/drop_caches
  echo "Reading all files from a cgroup"
  mkdir /sys/fs/cgroup/unified/mycg1 || exit
  echo $$ >/sys/fs/cgroup/unified/mycg1/cgroup.procs || exit
  for (( j = 0; j < 50; j ++ )); do
      cat $MNT/dir$j/file* >/dev/null &
  done
  wait
  echo "Switching wbs"
  # Now rmdir the cgroup after the script exits
This can be solved by:
* Avoiding contention on the wb->list_lock when switching inodes by
  running a single work item per wb and managing a queue of items
  switching to the wb.
* Allowing rescheduling when switching inodes over to a different
  cgroup to avoid softlockups.
* Maintaining b_dirty list ordering instead of sorting it.
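The first two points can be sketched with a small single-threaded Python model. This is an illustration under stated assumptions, not the kernel implementation: the class `TargetWb`, its methods, the `resched` callback, and the `resched_every` threshold are all hypothetical names chosen here. The idea shown is that callers only queue batches, exactly one worker per target drains them, and the worker offers to yield the CPU at regular intervals.

```python
from collections import deque


class TargetWb:
    """Toy stand-in for a target wb that receives switched inodes."""

    def __init__(self):
        self.pending = deque()       # queued batches waiting to be switched
        self.work_scheduled = False  # True while a worker is due or running
        self.switched = []           # inodes already moved to this wb

    def queue_switch(self, inodes):
        """Queue one batch of inodes.

        Returns True only when the caller has to schedule the worker,
        so at most one worker per target is ever outstanding.
        """
        self.pending.append(list(inodes))
        if self.work_scheduled:
            return False
        self.work_scheduled = True
        return True

    def run_work(self, resched_every=1024, resched=lambda: None):
        """Drain all queued batches from a single worker.

        `resched` is called after every `resched_every` inodes, which
        models giving other tasks a chance to run during a long drain.
        """
        since_resched = 0
        while self.pending:
            batch = self.pending.popleft()
            for inode in batch:
                self.switched.append(inode)
                since_resched += 1
                if since_resched >= resched_every:
                    resched()
                    since_resched = 0
        self.work_scheduled = False


if __name__ == "__main__":
    wb = TargetWb()
    print("schedule worker:", wb.queue_switch(range(3)))
    print("schedule worker:", wb.queue_switch(range(3, 5)))
    wb.run_work(resched_every=2, resched=lambda: print("yield"))
    print("switched:", wb.switched)
```

Because only the single worker touches the target's lists in this model, batches queued by many exiting cgroups are processed one after another instead of competing for the same lock.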
/* Testing */
gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)
No build failures or warnings were observed.
/* Conflicts */
Merge conflicts with mainline
=============================
No known conflicts.
Merge conflicts with other trees
================================
No known conflicts.
The following changes since commit 8f5ae30d69d7543eee0d70083daf4de8fe15d585:
Linux 6.17-rc1 (2025-08-10 19:41:16 +0300)
are available in the Git repository at:
git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.18-rc1.writeback
for you to fetch changes up to 9426414f0d42f824892ecd4dccfebf8987084a41:
Merge patch series "writeback: Avoid lockups when switching inodes" (2025-09-19 13:11:06 +0200)
Please consider pulling these changes from the signed vfs-6.18-rc1.writeback tag.
Thanks!
Christian
----------------------------------------------------------------
vfs-6.18-rc1.writeback
----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "writeback: Avoid lockups when switching inodes"
Jan Kara (4):
      writeback: Avoid contention on wb->list_lock when switching inodes
      writeback: Avoid softlockup when switching many inodes
      writeback: Avoid excessively long inode switching times
      writeback: Add tracepoint to track pending inode switches
 fs/fs-writeback.c                | 133 +++++++++++++++++++++++++--------------
 include/linux/backing-dev-defs.h |   4 ++
 include/linux/writeback.h        |   2 +
 include/trace/events/writeback.h |  29 +++++++++
 mm/backing-dev.c                 |   5 ++
 5 files changed, 126 insertions(+), 47 deletions(-)