From: "Darrick J. Wong" <djwong@kernel.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: Theodore Ts'o <tytso@mit.edu>,
"Artem S. Tashkinov" <aros@gmx.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: Spooling large metadata updates / Proposal for a new API/feature in the Linux Kernel (VFS/Filesystems):
Date: Sun, 12 Jan 2025 10:12:01 -0800
Message-ID: <20250112181201.GL6156@frogsfrogsfrogs>
In-Reply-To: <Z4OufXVYupmI8yuN@casper.infradead.org>
On Sun, Jan 12, 2025 at 11:58:53AM +0000, Matthew Wilcox wrote:
> On Sun, Jan 12, 2025 at 12:27:43AM -0500, Theodore Ts'o wrote:
> > So yes, it basically exists, although in practice, it doesn't work as
> > well as you might think, because of the need to read potentially a
> > large number of the metadata blocks. But for example, if you make sure
> > that all of the inode information is already cached, e.g.:
> >
> > ls -lR /path/to/large/tree > /dev/null
> >
> > Then the operation to do a bulk update will be fast:
> >
> > time chown -R root:root /path/to/large/tree
> >
> > This demonstrates that the bottleneck tends to be *reading* the
> > metadata blocks, not *writing* the metadata blocks.
>
> So if we presented more of the operations to the kernel at once, it
> could pipeline the reading of the metadata, providing a user-visible
> win.
>
> However, I don't know that we need a new user API to do it. This is
> something that could be done in the "rm" tool; it has the information
> it needs, and it's better to put heuristics like "how far to read ahead"
> in userspace than the kernel.

nr_cpus=$(getconf _NPROCESSORS_ONLN)
find "$path" -print0 | xargs -P "$nr_cpus" -0 chown root:root
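A slightly fuller sketch of the same idea, in case it helps. The function name parallel_chown, the batch size of 64, and the scratch tree are all made up for illustration. It assumes an xargs that supports -0 and -P (GNU findutils or BSD). Without -n, xargs packs as many names as fit onto one command line, so a small tree may end up in a single chown process; -h is there so that symlinks are changed rather than followed.

```shell
#!/bin/sh
# Sketch only: parallel_chown is a hypothetical helper, not an existing tool.
# Assumes xargs supports -0 and -P (GNU findutils or BSD xargs).
parallel_chown() {
    owner=$1
    path=$2
    nr_cpus=$(getconf _NPROCESSORS_ONLN)
    # -n 64: at most 64 names per chown invocation (assumed batch size),
    #        so the names are spread over several processes
    # -h:    change a symlink itself instead of the file it points to
    find "$path" -print0 |
        xargs -0 -n 64 -P "$nr_cpus" chown -h "$owner"
}

# Demo on a scratch tree, chowning to the current user so no root is needed.
demo=$(mktemp -d)
mkdir -p "$demo/a/b" "$demo/c"
touch "$demo/a/f1" "$demo/a/b/f2" "$demo/c/f3"
parallel_chown "$(id -u):$(id -g)" "$demo"
chown_status=$?
```

For the real thing you would call it as root with something like parallel_chown root:root /path/to/large/tree. Whether this beats a plain chown -R depends on how much of the metadata is already cached.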

deltree is probably harder, because while you can easily parallelize
deleting the leaves, find isn't so good at telling you which entries are
the leaves. I suppose you could do:

find "$path" ! -type d -print0 | xargs -P "$nr_cpus" -0 rm -f
rm -r -f "$path"

which would serialize on all the directories, but hopefully there aren't
that many of those?
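The two-pass delete above could be wrapped up roughly as follows. This is only a sketch: parallel_deltree is a hypothetical name, the batch size of 64 is an arbitrary choice, and it again assumes an xargs with -0 and -P. The first pass unlinks every non-directory in parallel; the second pass is the serial rm -r -f, which by then should only have empty directories left to remove.

```shell
#!/bin/sh
# Sketch only: parallel_deltree is a hypothetical helper name.
# Assumes xargs supports -0 and -P (GNU findutils or BSD xargs).
parallel_deltree() {
    path=$1
    nr_cpus=$(getconf _NPROCESSORS_ONLN)
    # Pass 1: unlink everything that is not a directory, in parallel.
    # "! -type d" also matches symlinks, which rm removes without following.
    find "$path" ! -type d -print0 |
        xargs -0 -n 64 -P "$nr_cpus" rm -f
    # Pass 2: only directories should be left; remove them serially.
    rm -r -f "$path"
}

# Demo on a scratch tree containing files, nested directories and a symlink.
demo=$(mktemp -d)
mkdir -p "$demo/tree/a/b" "$demo/tree/c"
touch "$demo/tree/a/f1" "$demo/tree/a/b/f2" "$demo/tree/c/f3"
ln -s f1 "$demo/tree/a/link"
parallel_deltree "$demo/tree"
deltree_status=$?
```

Files created under the tree while pass 1 is running are simply picked up by the rm -r -f in pass 2, so the result should match a plain rm -r -f; only the ordering of the unlinks differs.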

FWIW, as Amir said, xfs truncates and frees inodes in the background now,
so most of the upfront overhead of rm -r -f is reading in metadata,
deleting directory entries, and putting the files on the unlinked list.

--D