* [PATCH 1/3] xfs_scrub: fix buffer overflow in string_escape
2025-02-24 19:07 [PATCHSET] xfsprogs: handle emoji filenames better Darrick J. Wong
@ 2025-02-24 19:07 ` Darrick J. Wong
2025-02-24 22:28 ` Christoph Hellwig
2025-02-24 19:07 ` [PATCH 2/3] xfs_scrub: don't warn about zero width joiner control characters Darrick J. Wong
2025-02-24 19:07 ` [PATCH 3/3] xfs_scrub: use the display mountpoint for reporting file corruptions Darrick J. Wong
2 siblings, 1 reply; 7+ messages in thread
From: Darrick J. Wong @ 2025-02-24 19:07 UTC (permalink / raw)
To: djwong, aalbersh; +Cc: linux-xfs, hch, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Need to allocate one more byte for the null terminator, just in case the
/entire/ input string consists of non-printable bytes e.g. emoji.
Cc: <linux-xfs@vger.kernel.org> # v4.15.0
Fixes: 396cd0223598bb ("xfs_scrub: warn about suspicious characters in directory/xattr names")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
---
scrub/common.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/scrub/common.c b/scrub/common.c
index 6eb3c026dc5ac9..2b2d4a67bc47a2 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -320,7 +320,11 @@ string_escape(
char *q;
int x;
- str = malloc(strlen(in) * 4);
+ /*
+ * Each non-printing byte renders as a four-byte escape sequence, so
+ * allocate 4x the input length, plus a byte for the null terminator.
+ */
+ str = malloc(strlen(in) * 4 + 1);
if (!str)
return NULL;
for (p = in, q = str; *p != '\0'; p++) {
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH 2/3] xfs_scrub: don't warn about zero width joiner control characters
2025-02-24 19:07 [PATCHSET] xfsprogs: handle emoji filenames better Darrick J. Wong
2025-02-24 19:07 ` [PATCH 1/3] xfs_scrub: fix buffer overflow in string_escape Darrick J. Wong
@ 2025-02-24 19:07 ` Darrick J. Wong
2025-02-24 22:28 ` Christoph Hellwig
2025-02-24 19:07 ` [PATCH 3/3] xfs_scrub: use the display mountpoint for reporting file corruptions Darrick J. Wong
2 siblings, 1 reply; 7+ messages in thread
From: Darrick J. Wong @ 2025-02-24 19:07 UTC (permalink / raw)
To: djwong, aalbersh; +Cc: linux-xfs, hch, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
The Unicode code point for "zero width joiners" (aka 0x200D) is used to
hint to renderers that a sequence of simple code points should be
combined into a more complex rendering. This is how compound emoji such
as "wounded heart" are composed out of "heart" and "bandaid"; and how
complex glyphs are rendered in Malayam.
Emoji in filenames are a supported usecase, so stop warning about the
mere existence of ZWJ. We already warn about ZWJ that are used to
produce confusingly rendered names in a single namespace, so we're not
losing any robustness here.
Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: d43362c78e3e37 ("xfs_scrub: store bad flags with the name entry")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/unicrash.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index 143060b569f27c..b83bef644b6dce 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -508,8 +508,14 @@ name_entry_examine(
if (is_nonrendering(uchr))
ret |= UNICRASH_INVISIBLE;
- /* control characters */
- if (u_iscntrl(uchr))
+ /*
+ * Warn about control characters in filenames except for zero
+ * width joiners because those are used to construct compound
+ * emoji and glyphs in various languages. ZWJ is already
+ * covered by UNICRASH_INVISIBLE, so we can detect its use in
+ * confusing names.
+ */
+ if (uchr != 0x200D && u_iscntrl(uchr))
ret |= UNICRASH_CONTROL_CHAR;
switch (u_charDirection(uchr)) {
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH 3/3] xfs_scrub: use the display mountpoint for reporting file corruptions
2025-02-24 19:07 [PATCHSET] xfsprogs: handle emoji filenames better Darrick J. Wong
2025-02-24 19:07 ` [PATCH 1/3] xfs_scrub: fix buffer overflow in string_escape Darrick J. Wong
2025-02-24 19:07 ` [PATCH 2/3] xfs_scrub: don't warn about zero width joiner control characters Darrick J. Wong
@ 2025-02-24 19:07 ` Darrick J. Wong
2025-02-24 22:29 ` Christoph Hellwig
2 siblings, 1 reply; 7+ messages in thread
From: Darrick J. Wong @ 2025-02-24 19:07 UTC (permalink / raw)
To: djwong, aalbersh; +Cc: linux-xfs, hch, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
In systemd service mode, we make systemd bind-mount the target
mountpoint onto /tmp/scrub (/tmp is private to the service) so that
updates to the global mountpoint in the shared mount namespace don't
propagate into our service container and vice versa, and pass the path
to the bind mount to xfs_scrub via -M. This solves races such as
unmounting of the target mount point after service container creation
but before process invocation that result in the wrong filesystem being
scanned.
IOWs, to scrub /usr, systemd runs "xfs_scrub -M /tmp/scrub /usr".
Pretend that /usr is a separate filesystem.
However, when xfs_scrub snapshots the handle of /tmp/scrub, libhandle
remembers that /tmp/scrub the beginning of the path, not the pathname
that we want to use for reporting (/usr). This means that
handle_to_path returns /tmp/scrub and not /usr as well, with the
unfortunate result that file corrupts are reported with the pathnames in
the xfs_scrub@ service container, not the global ones.
Put another way, xfs_scrub should complain that /usr/bin/X is corrupt,
not /tmp/scrub/bin/X.
Therefore, modify scrub_render_ino_descr to manipulate the path buffer
during error reporting so that the user always gets the mountpoint
passed in, even if someone tells us to use another path for the actual
open() call in phase 1.
Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: 9a8b09762f9a52 ("xfs_scrub: use parent pointers when possible to report file operations")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/common.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 45 insertions(+), 1 deletion(-)
diff --git a/scrub/common.c b/scrub/common.c
index 2b2d4a67bc47a2..14cd677ba5e9c4 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -418,15 +418,59 @@ scrub_render_ino_descr(
if (ctx->mnt.fsgeom.flags & XFS_FSOP_GEOM_FLAGS_PARENT) {
struct xfs_handle handle;
+ char *pathbuf = buf;
+ size_t used = 0;
handle_from_fshandle(&handle, ctx->fshandle, ctx->fshandle_len);
handle_from_inogen(&handle, ino, gen);
+ /*
+ * @actual_mntpoint is the path we used to open the filesystem,
+ * and @mntpoint is the path we use for display purposes. If
+ * these aren't the same string, then for reporting purposes
+ * we must fix the start of the path string. Start by copying
+ * the display mountpoint into buf, except for trailing
+ * slashes. At this point buf will not be null-terminated.
+ */
+ if (ctx->actual_mntpoint != ctx->mntpoint) {
+ used = strlen(ctx->mntpoint);
+ while (used && ctx->mntpoint[used - 1] == '/')
+ used--;
+
+ /* If it doesn't fit, report the handle instead. */
+ if (used >= buflen) {
+ used = 0;
+ goto report_inum;
+ }
+
+ memcpy(buf, ctx->mntpoint, used);
+ pathbuf += used;
+ }
+
ret = handle_to_path(&handle, sizeof(struct xfs_handle), 4096,
- buf, buflen);
+ pathbuf, buflen - used);
if (ret)
goto report_inum;
+ /*
+ * Now that handle_to_path formatted the full path (including
+ * the actual mount point, stripped of any trailing slashes)
+ * into the rest of pathbuf, slide down the contents by the
+ * length of the actual mount point. Don't count any trailing
+ * slashes because handle_to_path uses libhandle, which strips
+ * trailing slashes. Copy one more byte to ensure we get the
+ * terminating null.
+ */
+ if (ctx->actual_mntpoint != ctx->mntpoint) {
+ size_t len = strlen(ctx->actual_mntpoint);
+
+ while (len && ctx->actual_mntpoint[len - 1] == '/')
+ len--;
+
+ pathlen = strlen(pathbuf);
+ memmove(pathbuf, pathbuf + len, pathlen - len + 1);
+ }
+
/*
* Leave at least 16 bytes for the description of what went
* wrong. If we can't do that, we'll use the inode number.
^ permalink raw reply related [flat|nested] 7+ messages in thread