git.vger.kernel.org archive mirror
* [Question] Unicode weirdness breaking tests on ZFS?
From: Derrick Stolee @ 2021-11-17 15:17 UTC (permalink / raw)
  To: Git Mailing List

I recently had to pave my Linux machine, so I updated it to Ubuntu
21.10 and had the choice to start using the ZFS filesystem. I thought,
"Why not?" but now I maybe see why.

Running the Git test suite at the v2.34.0 tag on my machine results in
these failures:

t0050-filesystem.sh                   (Wstat: 0 Tests: 11 Failed: 0)
  TODO passed:   9-10
t0021-conversion.sh                   (Wstat: 256 Tests: 41 Failed: 1)
  Failed test:  31
  Non-zero exit status: 1
t3910-mac-os-precompose.sh            (Wstat: 256 Tests: 25 Failed: 10)
  Failed tests:  1, 4, 6, 8, 11-16
  TODO passed:   23
  Non-zero exit status: 1

These are all related to the UTF8_NFD_TO_NFC prereq.

Zooming in on t0050, these tests are marked as "test_expect_failure" due
to an assignment of $test_unicode using the UTF8_NFD_TO_NFC prereq:


$test_unicode 'rename (silent unicode normalization)' '
	git mv "$aumlcdiar" "$auml" &&
	git commit -m rename
'

$test_unicode 'merge (silent unicode normalization)' '
	git reset --hard initial &&
	git merge topic
'


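The prereq creates a file under the precomposed (NFC) name and then
checks whether it is also reachable under the decomposed (NFD) name,
i.e. whether the filesystem treats the two spellings as equivalent: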


test_lazy_prereq UTF8_NFD_TO_NFC '
	# check whether FS converts nfd unicode to nfc
	auml=$(printf "\303\244")
	aumlcdiar=$(printf "\141\314\210")
	>"$auml" &&
	test -f "$aumlcdiar"
'
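
The same check can be run by hand outside the test suite. The following is only a sketch: the function name `fs_treats_nfd_as_nfc` and the scratch directory are made up for illustration, and the directory to probe (first argument, defaulting to `$TMPDIR` or `/tmp`) has to live on the filesystem being tested for the answer to mean anything.

```shell
#!/bin/sh
# Hand-run version of the UTF8_NFD_TO_NFC check (illustrative sketch).

auml=$(printf '\303\244')           # U+00E4, precomposed (NFC): bytes c3 a4
aumlcdiar=$(printf '\141\314\210')  # "a" + U+0308, decomposed (NFD): bytes 61 cc 88

# Hypothetical helper: create the NFC name in directory $1, then ask
# whether the NFD spelling resolves to that same file.
fs_treats_nfd_as_nfc () {
	(
		cd "$1" &&
		>"$auml" &&
		test -f "$aumlcdiar"
	)
}

# Probe below the directory given as $1 (assumption: it is on the
# filesystem of interest); fall back to the usual temp location.
probe_root=${1:-${TMPDIR:-/tmp}}
scratch=$(mktemp -d "$probe_root/nfd-probe.XXXXXX") || exit 1

if fs_treats_nfd_as_nfc "$scratch"
then
	result=yes
else
	result=no
fi
echo "NFD name resolves to the NFC file: $result"

# Show the raw bytes of the name the filesystem actually reports.
ls "$scratch" | od -An -tx1

rm -rf "$scratch"
```

A "yes" here corresponds to the prereq being satisfied, which is what switches `$test_unicode` in t0050 to `test_expect_failure`.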


What I see in that first test is that the 'git mv' does change the
index, but the filesystem thinks the files are the same. This
may mean that our 'git add "$aumlcdiar"' from an earlier test
is providing a non-equivalence in the index, and the 'git mv'
changes the index without causing any issues in the filesystem.
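
One way to look at this directly is to compare the name bytes recorded in the index with the bytes the filesystem reports for the same file. The snippet below is a rough diagnostic sketch, not part of the test suite: it uses a throwaway repository under `$TMPDIR` (an assumption; move it onto the filesystem in question to get a meaningful result) and only standard commands (`git init`, `git add`, `git ls-files -z`, `od`). Which bytes show up depends on the filesystem.

```shell
#!/bin/sh
# Compare the path bytes stored in the index with those reported by the
# filesystem (illustrative sketch; the repository location is arbitrary).

command -v git >/dev/null 2>&1 || { echo "git not found" >&2; exit 0; }

aumlcdiar=$(printf '\141\314\210')  # decomposed (NFD) name: bytes 61 cc 88

repo=$(mktemp -d "${TMPDIR:-/tmp}/nfd-index.XXXXXX") || exit 1
(
	cd "$repo" &&
	git init -q . &&
	>"$aumlcdiar" &&
	git add -- "$aumlcdiar"
) || exit 1

# -z avoids core.quotepath escaping; NUL is turned into a newline so both
# dumps end the same way.
index_bytes=$(git -C "$repo" ls-files -z | tr '\000' '\n' | od -An -tx1)
disk_bytes=$(ls "$repo" | od -An -tx1)

echo "index:$index_bytes"
echo "disk: $disk_bytes"

if test "$index_bytes" = "$disk_bytes"
then
	echo "index and filesystem agree on the name bytes"
else
	echo "index and filesystem disagree on the name bytes"
fi

rm -rf "$repo"
```

If the two dumps differ, the index holds a spelling that the filesystem does not report back, which is the situation described above.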

It reminds me of running 'git mv README readme' on a
case-insensitive filesystem. Is this not a similar situation?

What I'm trying to work out is whether this test is flawed.
Or maybe something broke (or never worked?) in how we use
'git add' to not get the canonical unicode from the filesystem?

The other tests all have similar interactions with 'git add'.
I'm hoping that these are just test bugs, and not actually a
functionality issue in Git. Yes, it is confusing that we can
change the unicode of a file in the index without the filesystem
understanding the difference, but that is very similar to how
case-insensitive filesystems work and I don't know what else we
would do here.

These filesystem/unicode things are outside my expertise, so
hopefully someone else has a clearer idea of what is going on.
I'm happy to be a test bed, or even attempt producing patches
to fix the issue once we have that clarity.

Thanks,
-Stolee

Thread overview: 10+ messages
2021-11-17 15:17 [Question] Unicode weirdness breaking tests on ZFS? Derrick Stolee
2021-11-17 15:41 ` Ævar Arnfjörð Bjarmason
2021-11-17 16:12 ` Torsten Bögershausen
2021-11-17 17:06   ` Torsten Bögershausen
2021-11-17 17:39     ` Torsten Bögershausen
2021-11-17 18:29       ` Derrick Stolee
2021-11-17 18:35         ` Derrick Stolee
2021-11-19 15:44           ` Torsten Bögershausen
2021-11-19 17:03             ` Junio C Hamano
2021-11-19 18:30               ` Derrick Stolee