public inbox for linux-xfs@vger.kernel.org
* [PATCHSET 0/3] xfs: fix ascii-ci problems with userspace
@ 2023-04-04 17:07 Darrick J. Wong
From: Darrick J. Wong @ 2023-04-04 17:07 UTC (permalink / raw)
  To: torvalds, djwong; +Cc: linux-xfs, david

Hi all,

Last week, I was fiddling around with the metadump name obfuscation code
while writing a debugger command to generate directories full of names
that all have the same name hash.  I had a few questions about how well
all that worked with ascii-ci mode, and discovered a nasty discrepancy
between the kernel's and glibc's implementations of the tolower()
function.

The problem shows up once a directory is large enough to require
separate leaf index blocks.  The hashes stored in the dabtree use the
ascii-ci specific hash function, which uses a library function to
convert the name to lowercase before hashing.  If the kernel's and the C
library's versions of tolower do not behave identically,
xfs_ascii_ci_hashname will not produce the same results for the same
inputs in the kernel and in userspace.  xfs_repair will deem the leaf
information corrupt and rebuild the directory using its own hashes.
After that, lookups in the kernel will fail because the hash index no
longer matches what the kernel computes.

The kernel's tolower function will convert extended ascii uppercase
letters (e.g. A-with-umlaut) to extended ascii lowercase letters (e.g.
a-with-umlaut), whereas glibc's will only do that if you force LANG to
ascii.  Tiny embedded libc implementations just plain won't do it at
all, and the result is a mess.  Stabilize the behavior of the hash
function by encoding the kernel's tolower function in libxfs, add it to
the selftest, and fix xfs_scrub not handling this correctly.
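
To make the failure mode concrete, here is a minimal sketch of a
locale-independent case fold feeding a rotate-and-xor name hash.  It is
not the code from the patches: every sketch_* name is hypothetical, the
Latin-1 ranges are my assumption about what the kernel's tolower covers,
and the hash loop is only assumed to resemble the real one.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 < n < 32). */
static inline uint32_t sketch_rol32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/*
 * Hypothetical locale-independent fold.  Assumption: uppercase means
 * ASCII A-Z plus Latin-1 0xC0-0xDE, excluding 0xD7 (multiplication sign).
 */
static inline unsigned char sketch_ci_xfrm(unsigned char c)
{
	if (c >= 0x41 && c <= 0x5a)
		c += 0x20;
	else if (c >= 0xc0 && c <= 0xde && c != 0xd7)
		c += 0x20;
	return c;
}

/* Hypothetical fold that only knows ASCII, as a tiny libc might. */
static inline unsigned char sketch_ascii_only_xfrm(unsigned char c)
{
	if (c >= 0x41 && c <= 0x5a)
		c += 0x20;
	return c;
}

typedef unsigned char (*sketch_xfrm_fn)(unsigned char);

/* Fold each byte with the given function, then mix it into the hash. */
static uint32_t sketch_ci_hash(const unsigned char *name, size_t len,
			       sketch_xfrm_fn xfrm)
{
	uint32_t hash = 0;
	size_t i;

	for (i = 0; i < len; i++)
		hash = xfrm(name[i]) ^ sketch_rol32(hash, 7);
	return hash;
}
```

With these definitions, a pure-ASCII name hashes the same under either
fold, but a name containing byte 0xC4 does not.  That mismatch is what
leaves the dabtree index unusable after one side rebuilds it.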

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fix-asciici-tolower-6.3
---
 fs/xfs/libxfs/xfs_dir2.c |    4 -
 fs/xfs/libxfs/xfs_dir2.h |   20 ++++
 fs/xfs/scrub/dir.c       |    7 +-
 fs/xfs/xfs_dahash_test.c |  211 ++++++++++++++++++++++++----------------------
 4 files changed, 139 insertions(+), 103 deletions(-)



Thread overview: 26+ messages
2023-04-04 17:07 [PATCHSET 0/3] xfs: fix ascii-ci problems with userspace Darrick J. Wong
2023-04-04 17:07 ` [PATCH 1/3] xfs: stabilize the tolower function used for ascii-ci dir hash computation Darrick J. Wong
2023-04-04 17:54   ` Linus Torvalds
2023-04-04 18:32     ` Darrick J. Wong
2023-04-04 18:58       ` Linus Torvalds
2023-04-04 23:30       ` Dave Chinner
2023-04-05  0:17         ` Linus Torvalds
2023-04-05  6:12       ` Christoph Hellwig
2023-04-05 15:40         ` Darrick J. Wong
2023-04-05 15:42           ` Christoph Hellwig
2023-04-05 17:10             ` Darrick J. Wong
2023-04-05 10:48   ` Christoph Hellwig
2023-04-05 15:30     ` Darrick J. Wong
2023-04-05 15:45       ` Linus Torvalds
2023-04-04 17:07 ` [PATCH 2/3] xfs: test the ascii case-insensitive hash Darrick J. Wong
2023-04-04 18:06   ` Linus Torvalds
2023-04-04 20:51     ` Darrick J. Wong
2023-04-04 21:21       ` Linus Torvalds
2023-04-05  6:15         ` Christoph Hellwig
2023-04-04 17:07 ` [PATCH 3/3] xfs: use the directory name hash function for dir scrubbing Darrick J. Wong
2023-04-04 17:17 ` [PATCHSET 0/3] xfs: fix ascii-ci problems with userspace Darrick J. Wong
2023-04-04 18:19   ` Linus Torvalds
2023-04-04 20:21     ` Linus Torvalds
2023-04-04 21:00       ` Darrick J. Wong
2023-04-04 21:50         ` Linus Torvalds
2023-04-04 21:09 ` [PATCH] xfstests: add a couple more tests for ascii-ci problems Darrick J. Wong
