public inbox for linux-xfs@vger.kernel.org
From: "Darrick J. Wong" <djwong@kernel.org>
To: torvalds@linux-foundation.org
Cc: david@fromorbit.com, xfs <linux-xfs@vger.kernel.org>
Subject: Re: [PATCHSET 0/3] xfs: fix ascii-ci problems with userspace
Date: Tue, 4 Apr 2023 10:17:15 -0700
Message-ID: <20230404171715.GE109974@frogsfrogsfrogs>
In-Reply-To: <168062802052.174368.10967543545284986225.stgit@frogsfrogsfrogs>

Hi Linus,

My finger slipped and I accidentally added you to the To: list on this
new series.  This series needs to go through review on linux-xfs; when
this is ready to go I (or Dave) will send you a pull request.

Sorry about the noise.

--D

On Tue, Apr 04, 2023 at 10:07:00AM -0700, Darrick J. Wong wrote:
> Hi all,
> 
> Last week, I was fiddling around with the metadump name obfuscation code
> while writing a debugger command to generate directories full of names
> that all hash to the same value.  I had a few questions about how well
> all that worked with ascii-ci mode, and discovered a nasty discrepancy
> between the kernel and glibc's implementations of the tolower()
> function.
> 
> I discovered that I could create a directory that is large enough to
> require separate leaf index blocks.  The hashes stored in the dabtree
> use the ascii-ci specific hash function, which uses a library function
> to convert the name to lowercase before hashing.  If the kernel and C
> library's versions of tolower do not behave exactly identically,
> xfs_ascii_ci_hashname will not produce the same results for the same
> inputs.  xfs_repair will deem the leaf information corrupt and rebuild
> the directory.  After that, lookups in the kernel will fail because the
> hash index doesn't work.
> 
> The kernel's tolower function will convert extended ascii uppercase
> letters (e.g. A-with-umlaut) to extended ascii lowercase letters (e.g.
> a-with-umlaut), whereas glibc's will only do that if you force LANG to
> ascii.  Tiny embedded libc implementations just plain won't do it at
> all, and the result is a mess.  Stabilize the behavior of the hash
> function by encoding the kernel's tolower function in libxfs, add it to
> the selftest, and fix xfs_scrub not handling this correctly.
> 
> If you're going to start using this mess, you probably ought to just
> pull from my git trees, which are linked below.
> 
> This is an extraordinary way to destroy everything.  Enjoy!
> Comments and questions are, as always, welcome.
> 
> --D
> 
> kernel git tree:
> https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fix-asciici-tolower-6.3
> ---
>  fs/xfs/libxfs/xfs_dir2.c |    4 -
>  fs/xfs/libxfs/xfs_dir2.h |   20 ++++
>  fs/xfs/scrub/dir.c       |    7 +-
>  fs/xfs/xfs_dahash_test.c |  211 ++++++++++++++++++++++++----------------------
>  4 files changed, 139 insertions(+), 103 deletions(-)
> 
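The hash divergence described in the quoted cover letter can be sketched
in a few lines of C.  This is a minimal illustration, not the code from
the patch.  The helper names (fold_latin1, fold_ascii, ci_hash) are
hypothetical, the rotate-and-xor loop is only assumed to resemble the
shape of the ascii-ci hash, and the Latin-1 uppercase range (0xC0-0xDE
except 0xD7) is an assumption about what the kernel's tolower folds.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: fold like a Latin-1 aware tolower (assumed kernel behavior). */
static unsigned char fold_latin1(unsigned char c)
{
	if (c >= 0x41 && c <= 0x5a)			/* ASCII A-Z */
		c += 0x20;
	else if (c >= 0xc0 && c <= 0xde && c != 0xd7)	/* Latin-1 uppercase, skipping 0xD7 */
		c += 0x20;
	return c;
}

/* Hypothetical: fold like a libc that only knows 7-bit ASCII. */
static unsigned char fold_ascii(unsigned char c)
{
	if (c >= 0x41 && c <= 0x5a)
		c += 0x20;
	return c;
}

/* Rotate a 32-bit word left by s bits (0 < s < 32). */
static uint32_t rol32(uint32_t w, unsigned int s)
{
	return (w << s) | (w >> (32 - s));
}

typedef unsigned char (*fold_fn)(unsigned char);

/* Illustrative case-folding name hash; the fold function is pluggable. */
static uint32_t ci_hash(const unsigned char *name, size_t len, fold_fn fold)
{
	uint32_t hash = 0;
	size_t i;

	for (i = 0; i < len; i++)
		hash = fold(name[i]) ^ rol32(hash, 7);
	return hash;
}
```

Hashing a name that contains byte 0xC4 (A-with-umlaut in Latin-1) with
fold_latin1 and then with fold_ascii gives two different values, while a
pure ASCII name hashes identically under both.  A directory index built
with one fold cannot be searched correctly with the other, which is the
failure described above.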
