From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [Question] Unicode weirdness breaking tests on ZFS?
Date: Wed, 17 Nov 2021 16:41:47 +0100
Message-ID: <211117.8635nu7pm3.gmgdl@evledraar.gmail.com>
In-Reply-To: <9393e572-0666-6485-df29-abad5e0d32a1@gmail.com>

On Wed, Nov 17 2021, Derrick Stolee wrote:
> I recently had to pave my Linux machine, so I updated it to Ubuntu
> 21.10 and had the choice to start using the ZFS filesystem. I thought,
> "Why not?" but now I maybe see why.
>
> Running the Git test suite at the v2.34.0 tag on my machine results in
> these failures:
>
> t0050-filesystem.sh (Wstat: 0 Tests: 11 Failed: 0)
>   TODO passed: 9-10
> t0021-conversion.sh (Wstat: 256 Tests: 41 Failed: 1)
>   Failed test: 31
>   Non-zero exit status: 1
> t3910-mac-os-precompose.sh (Wstat: 256 Tests: 25 Failed: 10)
>   Failed tests: 1, 4, 6, 8, 11-16
>   TODO passed: 23
>   Non-zero exit status: 1
>
> These are all related to the UTF8_NFD_TO_NFC prereq.
>
> Zooming in on t0050, these tests are marked as "test_expect_failure" due
> to an assignment of $test_unicode using the UTF8_NFD_TO_NFC prereq:
>
> $test_unicode 'rename (silent unicode normalization)' '
> 	git mv "$aumlcdiar" "$auml" &&
> 	git commit -m rename
> '
>
> $test_unicode 'merge (silent unicode normalization)' '
> 	git reset --hard initial &&
> 	git merge topic
> '
>
> The prereq creates two files using unicode characters that could
> collapse to equivalent meanings:
>
> test_lazy_prereq UTF8_NFD_TO_NFC '
> 	# check whether FS converts nfd unicode to nfc
> 	auml=$(printf "\303\244")
> 	aumlcdiar=$(printf "\141\314\210")
> 	>"$auml" &&
> 	test -f "$aumlcdiar"
> '
>
> What I see in that first test, the 'git mv' does change the
> index, but the filesystem thinks the files are the same. This
> may mean that our 'git add "$aumlcdiar"' from an earlier test
> is providing a non-equivalence in the index, and the 'git mv'
> changes the index without causing any issues in the filesystem.
>
> It reminds me of using 'git mv README readme' on a case-insensitive
> filesystem. Is this not a similar situation?
>
> What I'm trying to gather is that maybe this test is flawed?
> Or maybe something broke (or never worked?) in how we use
> 'git add' to not get the canonical unicode from the filesystem?
>
> The other tests all have similar interactions with 'git add'.
> I'm hoping that these are just test bugs, and not actually a
> functionality issue in Git. Yes, it is confusing that we can
> change the unicode of a file in the index without the filesystem
> understanding the difference, but that is very similar to how
> case-insensitive filesystems work and I don't know what else we
> would do here.
>
> These filesystem/unicode things are out of my expertise, so
> hopefully someone else has a clearer idea of what is going on.
> I'm happy to be a test bed, or even attempt producing patches
> to fix the issue once we have that clarity.

I haven't used ZFS, but this points to non-POSIX behavior on the FS
itself. It looks like tweaking the "normalization" property might change
it, see: https://manpages.ubuntu.com/manpages/eoan/man8/zfs.8.html

There's also "casesensitivity" and "utf8only".

We probably don't want to invoke some ZFS command on every test to
interrogate this, but if we can pass it down from GIT-BUILD-OPTIONS or
similar then we could have a test prereq check this.

Or perhaps it's as simple as changing the "UTF8_NFD_TO_NFC" prereq from
doing a "test -f" to e.g. "echo *" and seeing what it gets back. Perhaps
ZFS says "yes" to "it exists?" but when doing a readdir() it will
canonicalize?
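
A rough sketch of what such a probe could look like, assuming a POSIX
shell and mktemp(1). The function name probe_unicode_normalization and
its output format are made up for illustration; this is not an existing
helper in Git's test library. It creates the precomposed name, as the
current prereq does, and then reports both whether the decomposed name
resolves and which form the directory listing hands back:

```shell
# Hypothetical helper (not part of Git's test suite): report how the
# filesystem holding a scratch directory treats NFC vs. NFD names.
probe_unicode_normalization () {
	dir=$(mktemp -d) || return 1
	(
		cd "$dir" &&
		auml=$(printf '\303\244') &&          # U+00E4, precomposed (NFC)
		aumlcdiar=$(printf '\141\314\210') && # "a" + U+0308, decomposed (NFD)
		>"$auml" &&
		# Same check as the existing prereq: does the other form resolve?
		if test -f "$aumlcdiar"
		then
			lookup=aliased
		else
			lookup=distinct
		fi &&
		# The scratch directory holds one entry, so the glob expands to
		# whatever name readdir() reports for it.
		listed=$(echo *) &&
		if test "$listed" = "$auml"
		then
			form=nfc
		elif test "$listed" = "$aumlcdiar"
		then
			form=nfd
		else
			form=other
		fi &&
		echo "lookup=$lookup listed=$form"
	)
	ret=$?
	rm -rf "$dir"
	return $ret
}
```

On a filesystem that does no normalization at all this should print
"lookup=distinct listed=nfc". Whether ZFS reports "aliased" for the
lookup, and which form it lists, is the open question; a prereq could
then key off either field instead of "test -f" alone.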