From: "Darrick J. Wong" <djwong@kernel.org>
To: torvalds@linux-foundation.org
Cc: david@fromorbit.com, xfs <linux-xfs@vger.kernel.org>
Subject: Re: [PATCHSET 0/3] xfs: fix ascii-ci problems with userspace
Date: Tue, 4 Apr 2023 10:17:15 -0700
Message-ID: <20230404171715.GE109974@frogsfrogsfrogs>
In-Reply-To: <168062802052.174368.10967543545284986225.stgit@frogsfrogsfrogs>
Hi Linus,
My finger slipped and I accidentally added you to the To: list on this
new series. This series needs to go through review on linux-xfs; when
this is ready to go I (or Dave) will send you a pull request.
Sorry about the noise.
--D
On Tue, Apr 04, 2023 at 10:07:00AM -0700, Darrick J. Wong wrote:
> Hi all,
>
> Last week, I was fiddling around with the metadump name obfuscation code
> while writing a debugger command to generate directories full of names
> that all hash to the same value. I had a few questions about how well
> all that worked with ascii-ci mode, and discovered a nasty discrepancy
> between the kernel's and glibc's implementations of the tolower()
> function.
>
> I discovered that I could create a directory that is large enough to
> require separate leaf index blocks. The hashes stored in the dabtree
> use the ascii-ci specific hash function, which uses a library function
> to convert the name to lowercase before hashing. If the kernel's and
> the C library's versions of tolower() do not behave identically,
> xfs_ascii_ci_hashname will not produce the same results for the same
> inputs. xfs_repair, which is built against the C library, will then
> deem the leaf information corrupt and rebuild the directory. After
> that, lookups in the kernel will fail because the hash index no longer
> matches the hashes the kernel computes.
>
> The kernel's tolower function will convert extended ascii uppercase
> letters (e.g. A-with-umlaut) to extended ascii lowercase letters (e.g.
> a-with-umlaut), whereas glibc's will only do that if you force LANG to
> ascii. Tiny embedded libc implementations just plain won't do it at
> all, and the result is a mess. Stabilize the behavior of the hash
> function by encoding the kernel's tolower function in libxfs, add it to
> the selftest, and fix xfs_scrub not handling this correctly.
>
> If you're going to start using this mess, you probably ought to just
> pull from my git trees, which are linked below.
>
> This is an extraordinary way to destroy everything. Enjoy!
> Comments and questions are, as always, welcome.
>
> --D
>
> kernel git tree:
> https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fix-asciici-tolower-6.3
> ---
>  fs/xfs/libxfs/xfs_dir2.c |    4 -
>  fs/xfs/libxfs/xfs_dir2.h |   20 ++++
>  fs/xfs/scrub/dir.c       |    7 +-
>  fs/xfs/xfs_dahash_test.c |  211 ++++++++++++++++++++++++----------------------
>  4 files changed, 139 insertions(+), 103 deletions(-)
>
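For readers following the quoted cover letter, the hash discrepancy it describes can be sketched in a few lines of C. This is an illustration only, not the code from the patches: fold_latin1(), fold_ascii() and ci_hash() are hypothetical names, the rotate-and-xor loop is a toy stand-in for the real directory hash, and the Latin-1 uppercase ranges (0xC0-0xD6 and 0xD8-0xDE) are an assumption about which "extended ascii" bytes the kernel's tolower() folds.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 < n < 32). */
static uint32_t rol32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/*
 * Hypothetical locale-independent fold. Assumes the uppercase bytes
 * are ASCII A-Z plus Latin-1 0xC0-0xD6 and 0xD8-0xDE (0xD7 is the
 * multiplication sign and is left alone).
 */
static unsigned char fold_latin1(unsigned char c)
{
	if ((c >= 'A' && c <= 'Z') ||
	    (c >= 0xC0 && c <= 0xD6) ||
	    (c >= 0xD8 && c <= 0xDE))
		c += 0x20;
	return c;
}

/* ASCII-only fold, standing in for a libc tolower() in the "C" locale. */
static unsigned char fold_ascii(unsigned char c)
{
	if (c >= 'A' && c <= 'Z')
		c += 0x20;
	return c;
}

typedef unsigned char (*fold_fn)(unsigned char);

/* Toy rotate-and-xor hash over the folded bytes of a name. */
static uint32_t ci_hash(const unsigned char *name, size_t len, fold_fn fold)
{
	uint32_t hash = 0;
	size_t i;

	for (i = 0; i < len; i++)
		hash = fold(name[i]) ^ rol32(hash, 7);
	return hash;
}
```

With these definitions, a pure-ASCII name such as "ABC" hashes identically under both folds, while a name containing the byte 0xC4 (A-with-umlaut in Latin-1) does not. That second case is the kind of disagreement that would make a userspace tool reject an index the kernel built, and it is why pinning one fold function in shared code removes the dependency on the C library and locale.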
Thread overview: 26+ messages
2023-04-04 17:07 [PATCHSET 0/3] xfs: fix ascii-ci problems with userspace Darrick J. Wong
2023-04-04 17:07 ` [PATCH 1/3] xfs: stabilize the tolower function used for ascii-ci dir hash computation Darrick J. Wong
2023-04-04 17:54 ` Linus Torvalds
2023-04-04 18:32 ` Darrick J. Wong
2023-04-04 18:58 ` Linus Torvalds
2023-04-04 23:30 ` Dave Chinner
2023-04-05 0:17 ` Linus Torvalds
2023-04-05 6:12 ` Christoph Hellwig
2023-04-05 15:40 ` Darrick J. Wong
2023-04-05 15:42 ` Christoph Hellwig
2023-04-05 17:10 ` Darrick J. Wong
2023-04-05 10:48 ` Christoph Hellwig
2023-04-05 15:30 ` Darrick J. Wong
2023-04-05 15:45 ` Linus Torvalds
2023-04-04 17:07 ` [PATCH 2/3] xfs: test the ascii case-insensitive hash Darrick J. Wong
2023-04-04 18:06 ` Linus Torvalds
2023-04-04 20:51 ` Darrick J. Wong
2023-04-04 21:21 ` Linus Torvalds
2023-04-05 6:15 ` Christoph Hellwig
2023-04-04 17:07 ` [PATCH 3/3] xfs: use the directory name hash function for dir scrubbing Darrick J. Wong
2023-04-04 17:17 ` Darrick J. Wong [this message]
2023-04-04 18:19 ` [PATCHSET 0/3] xfs: fix ascii-ci problems with userspace Linus Torvalds
2023-04-04 20:21 ` Linus Torvalds
2023-04-04 21:00 ` Darrick J. Wong
2023-04-04 21:50 ` Linus Torvalds
2023-04-04 21:09 ` [PATCH] xfstests: add a couple more tests for ascii-ci problems Darrick J. Wong