From: Miao Wang via B4 Relay <devnull+shankerwangmiao.gmail.com@kernel.org>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, stable@vger.kernel.org
Cc: Xi Ruoyao <xry111@xry111.site>,
Christian Brauner <brauner@kernel.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org, stable@vger.kernel.org,
Miao Wang <shankerwangmiao@gmail.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Mateusz Guzik <mjguzik@gmail.com>
Subject: [PATCH v5.4-v4.19 3/6] vfs: mostly undo glibc turning 'fstat()' into 'fstatat(AT_EMPTY_PATH)'
Date: Wed, 18 Sep 2024 22:12:22 +0800 [thread overview]
Message-ID: <20240918-statx-stable-linux-5-4-y-v1-3-8a771c9bbe5f@gmail.com> (raw)
In-Reply-To: <20240918-statx-stable-linux-5-4-y-v1-0-8a771c9bbe5f@gmail.com>
From: Linus Torvalds <torvalds@linux-foundation.org>
commit 9013c51 upstream.
Mateusz reports that glibc turns 'fstat()' calls into 'fstatat()', and
that seems to have been going on for quite a long time due to glibc
having tried to simplify its stat logic into just one point.
This turns out to cause completely unnecessary overhead, where we then
go off and allocate the kernel side pathname, and actually look up the
empty path. Sure, our path lookup is quite optimized, but it still
causes a fair bit of allocation overhead and a couple of completely
unnecessary rounds of lockref accesses etc.
This is all hopefully getting fixed in user space, and there is a patch
floating around for just having glibc use the native fstat() system
call. But even with the current situation we can at least improve on
things by catching the situation and short-circuiting it.
Note that this is still measurably slower than just a plain 'fstat()',
since just checking that the filename is actually empty is somewhat
expensive due to inevitable user space access overhead from the kernel
(ie verifying pointers, and SMAP on x86). But it's still quite a bit
faster than actually looking up the path for real.
To quote numers from Mateusz:
"Sapphire Rapids, will-it-scale, ops/s
stock fstat 5088199
patched fstat 7625244 (+49%)
real fstat 8540383 (+67% / +12%)"
where that 'stock fstat' is the glibc translation of fstat into
fstatat() with an empty path, the 'patched fstat' is with this short
circuiting of the path lookup, and the 'real fstat' is the actual native
fstat() system call with none of this overhead.
Link: https://lore.kernel.org/lkml/20230903204858.lv7i3kqvw6eamhgz@f/
Reported-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org> # 4.19.x-5.4.x
Signed-off-by: Miao Wang <shankerwangmiao@gmail.com>
Tested-by: Xi Ruoyao <xry111@xry111.site>
---
fs/stat.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/fs/stat.c b/fs/stat.c
index b09a0e2a6681..526fa0801cad 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -201,6 +201,22 @@ static int vfs_statx(int dfd, const char __user *filename, int flags,
int vfs_fstatat(int dfd, const char __user *filename,
struct kstat *stat, int flags)
{
+ /*
+ * Work around glibc turning fstat() into fstatat(AT_EMPTY_PATH)
+ *
+ * If AT_EMPTY_PATH is set, we expect the common case to be that
+ * empty path, and avoid doing all the extra pathname work.
+ */
+ if (dfd >= 0 && flags == AT_EMPTY_PATH) {
+ char c;
+
+ ret = get_user(c, filename);
+ if (unlikely(ret))
+ return ret;
+
+ if (likely(!c))
+ return vfs_fstat(dfd, stat);
+ }
return vfs_statx(dfd, filename, flags | AT_NO_AUTOMOUNT,
stat, STATX_BASIC_STATS);
}
--
2.43.0
next prev parent reply other threads:[~2024-09-18 14:12 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-18 14:12 [PATCH 0/6] Backport statx(..., NULL, AT_EMPTY_PATH, ...) Miao Wang via B4 Relay
2024-09-18 14:12 ` [PATCH v5.4-v4.19 1/6] fs: implement vfs_stat and vfs_lstat in terms of vfs_fstatat Miao Wang via B4 Relay
2024-09-18 14:12 ` [PATCH v5.4-v4.19 2/6] fs: move vfs_fstatat out of line Miao Wang via B4 Relay
2024-09-18 14:12 ` Miao Wang via B4 Relay [this message]
2024-09-18 14:12 ` [PATCH v5.4-v4.19 4/6] fs: new helper vfs_empty_path() Miao Wang via B4 Relay
2024-09-18 14:12 ` [PATCH v5.4-v4.19 5/6] stat: use vfs_empty_path() helper Miao Wang via B4 Relay
2024-09-18 14:12 ` [PATCH v5.4-v4.19 6/6] vfs: support statx(..., NULL, AT_EMPTY_PATH, ...) Miao Wang via B4 Relay
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240918-statx-stable-linux-5-4-y-v1-3-8a771c9bbe5f@gmail.com \
--to=devnull+shankerwangmiao.gmail.com@kernel.org \
--cc=brauner@kernel.org \
--cc=gregkh@linuxfoundation.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=mjguzik@gmail.com \
--cc=shankerwangmiao@gmail.com \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
--cc=xry111@xry111.site \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).