From: Tingmao Wang <m@maowtm.org>
To: "Mickaël Salaün" <mic@digikod.net>, "Song Liu" <song@kernel.org>,
"Al Viro" <viro@zeniv.linux.org.uk>
Cc: "Tingmao Wang" <m@maowtm.org>,
"Günther Noack" <gnoack@google.com>, "Jan Kara" <jack@suse.cz>,
"Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
"Christian Brauner" <brauner@kernel.org>,
linux-security-module@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: [RFC PATCH 0/3] landlock: walk parent dir with RCU, without taking references
Date: Wed, 4 Jun 2025 01:45:42 +0100
Message-ID: <cover.1748997840.git.m@maowtm.org>

Hi Mickaël, Song, Al and other reviewers,
In this series I present some experiments and investigations I did on a
potential optimization for Landlock, which is to avoid taking references
during the walk from the target being accessed toward the global root. I
initially made this change and tested it, but did not send it for review
as I was slightly uncertain about its correctness, even though I did see
significant performance improvements. However, earlier I saw Al Viro
mention in another discussion [1]:
> Note, BTW, that it might be better off by doing that similar to
> d_path.c - without arseloads of dget_parent/dput et.al.; not sure
> how feasible it is, but if everything in it can be done under
> rcu_read_lock(), that's something to look into.
which prompted me to pick this up again (even though I realize that in the
reply to [1], Song wasn't sure this would work). After some thinking, I
can see one particular case where a simple chase of `d_parent`s would be
incorrect - namely, when unlinks are involved and a dentry can turn
negative. I _think_ that if we check the global rename seqcount and
restart the pathwalk with the old approach if it changes, we might avoid
this sort of issue. However, I'm new to this and would very much welcome
being proven wrong here :)
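To make the idea concrete, here is a minimal sketch of such a walk (not
the actual patch: the function name is made up, mount crossing and the
actual Landlock rule checks are elided, and the fallback convention is
just one possibility):

static bool landlock_walk_rcu(const struct path *const path)
{
        const struct dentry *dentry = path->dentry;
        unsigned int seq;
        bool valid;

        rcu_read_lock();
        seq = read_seqbegin(&rename_lock);
        while (!IS_ROOT(dentry)) {
                /* ... check the rules tied to dentry->d_inode ... */
                dentry = READ_ONCE(dentry->d_parent);
        }
        /*
         * A concurrent rename (d_move()) may have made the chain of
         * d_parent pointers we just followed inconsistent; in that case
         * ask the caller to redo the walk with dget_parent()/dput().
         */
        valid = !read_seqretry(&rename_lock, seq);
        rcu_read_unlock();
        return valid;
}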
I realize that Song has already submitted a patch [2] to extract the
pathwalk logic here and improve the mountpoint handling, and I don't want
this to block that in any way, but I think this is worth looking into
separately from that, based on the benchmarking results below:
Earlier, when testing other Landlock performance improvement ideas, I came
up with what I think is a "typical" workload [3] involving two layers and
a few rules for places like $HOME, /tmp, /proc etc. (Note that this is
just something I came up with, and Mickaël might not agree on the
"typicalness" here. A rough illustration of such a setup is sketched after
the results below.) With that we have:
Comparing:                    original  pathwalk-noref

// this is the % of time spent in Landlock within an openat syscall
landlock_overhead:  avg    =        33        27  (-6)      (%)
                    median =        34        28  (-6)      (%)

// absolute time spent in Landlock
landlock_hook:      avg    =       852       627  (-26.4%)  (ns)
                    median =       827       611  (-26.1%)  (ns)

// duration of the entire openat syscall
open_syscall:       avg    =      2507      2260  (-9.9%)   (ns)
                    median =      2451      2222  (-9.3%)   (ns)
I also ran the benchmarks Mickaël created [4], which test openat
performance at various depths in the file hierarchy, all with 2 rules, and
the results are basically:
depth   % change in median        % change in median   % in Landlock,
        time spent in Landlock    openat duration      old -> new

 1      -21.9%                    +3.4%                 9 -> 7
10      -45.3%                    -2.4%                16 -> 9
20      -57.2%                    -7.7%                21 -> 10
30      -61.1%                    -75.9%               22 -> 11
// ^^ On the unchanged kernel, the depth-30 case seems to follow a bimodal
//    distribution when Landlock is involved.  Outliers happen much less
//    often on the kernel with this patch applied.
Raw results including histograms at https://fileshare.maowtm.org/landlock/20250603/index.html
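(As an aside, for anyone who hasn't used the Landlock userspace API: a
"two layers, a few rules" setup along the lines of [3] can be approximated
with something like the following. This is only an illustrative sketch,
not the actual benchmark code; error handling is omitted and the exact
access flags and paths are made up.)

#include <fcntl.h>
#include <linux/landlock.h>
#include <stdlib.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Create one ruleset allowing a few paths and apply it as one layer. */
static void apply_one_layer(void)
{
        const struct landlock_ruleset_attr ruleset_attr = {
                .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE |
                                     LANDLOCK_ACCESS_FS_READ_DIR |
                                     LANDLOCK_ACCESS_FS_WRITE_FILE,
        };
        const char *const allowed[] = { getenv("HOME"), "/tmp", "/proc" };
        const int ruleset_fd = syscall(SYS_landlock_create_ruleset,
                                       &ruleset_attr, sizeof(ruleset_attr), 0);
        size_t i;

        for (i = 0; i < sizeof(allowed) / sizeof(allowed[0]); i++) {
                struct landlock_path_beneath_attr path_beneath = {
                        .allowed_access = ruleset_attr.handled_access_fs,
                        .parent_fd = open(allowed[i], O_PATH | O_CLOEXEC),
                };

                syscall(SYS_landlock_add_rule, ruleset_fd,
                        LANDLOCK_RULE_PATH_BENEATH, &path_beneath, 0);
                close(path_beneath.parent_fd);
        }
        prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
        syscall(SYS_landlock_restrict_self, ruleset_fd, 0);
        close(ruleset_fd);
}

int main(void)
{
        /* Two apply calls stack two Landlock layers on the task. */
        apply_one_layer();
        apply_one_layer();
        /* ... then time openat() on paths of various depths ... */
        return 0;
}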
[1]: https://lore.kernel.org/all/20250529231018.GP2023217@ZenIV/
[2]: https://lore.kernel.org/all/20250603065920.3404510-1-song@kernel.org/
[3]: https://github.com/torvalds/linux/commit/f1865ce970af97ac3b6f4edf580529b8cdc66371
[4]: https://github.com/landlock-lsm/landlock-test-tools/pull/16
Tingmao Wang (3):
landlock: walk parent dir without taking references
selftests/landlock: Add fs_race_test
Restart pathwalk on rename seqcount change
security/landlock/fs.c | 133 ++++-
.../testing/selftests/landlock/fs_race_test.c | 505 ++++++++++++++++++
2 files changed, 617 insertions(+), 21 deletions(-)
create mode 100644 tools/testing/selftests/landlock/fs_race_test.c
base-commit: a5806cd506af5a7c19bcd596e4708b5c464bfd21
--
2.49.0