From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "Đoàn Trần Công Danh" <congdanhqx@gmail.com>
Cc: Matheus Tavares <matheus.bernardino@usp.br>,
gitster@pobox.com, git@vger.kernel.org,
"brian m . carlson" <sandals@crustytoothpaste.net>
Subject: Re: [PATCH] t2080: fix cp invocation to copy symlinks instead of following them
Date: Wed, 02 Jun 2021 12:50:53 +0200
Message-ID: <87pmx47cs9.fsf@evledraar.gmail.com>
In-Reply-To: <YLbgi0jQn8BJ1ue2@danh.dev>
On Wed, Jun 02 2021, Đoàn Trần Công Danh wrote:
> On 2021-05-31 16:01:01+0200, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>
>> On Thu, May 27 2021, Ævar Arnfjörð Bjarmason wrote:
>>
>> > On Wed, May 26 2021, Matheus Tavares wrote:
>> >
>> >> t2080 makes a few copies of a test repository and later performs a
>> >> branch switch on each one of the copies to verify that parallel checkout
>> >> and sequential checkout produce the same results. However, the
>> >> repository is copied with `cp -R` which, on some systems, defaults to
>> >> following symlinks on the directory hierarchy and copying their target
>> >> files instead of copying the symlinks themselves. AIX is one example of
>> >> system where this happens. Because the symlinks are not preserved, the
>> >> copied repositories have paths that do not match what is in the index,
>> >> causing git to abort the checkout operation that we want to test. This
>> >> makes the test fail on these systems.
>> >>
>> >> Fix this by copying the repository with the POSIX flag '-P', which
>> >> forces cp to copy the symlinks instead of following them. Note that we
>> >> already use this flag for other cp invocations in our test suite (see
>> >> t7001). With this change, t2080 now passes on AIX.
>> >>
>> >> Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> >> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
>> >> ---
>> >> t/t2080-parallel-checkout-basics.sh | 2 +-
>> >> 1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/t/t2080-parallel-checkout-basics.sh b/t/t2080-parallel-checkout-basics.sh
>> >> index 7087818550..3e0f8c675f 100755
>> >> --- a/t/t2080-parallel-checkout-basics.sh
>> >> +++ b/t/t2080-parallel-checkout-basics.sh
>> >> @@ -114,7 +114,7 @@ do
>> >>
>> >> test_expect_success "$mode checkout" '
>> >> repo=various_$mode &&
>> >> - cp -R various $repo &&
>> >> + cp -R -P various $repo &&
>> >>
>> >> # The just copied files have more recent timestamps than their
>> >> # associated index entries. So refresh the cached timestamps
>> >
>> > Thanks for the quick fix, I can confirm that this makes the test pass on
>> > AIX 7.2.
>>
>> There's still a failure[1] in t2082-parallel-checkout-attributes.sh
>> though, which is new in 2.32.0-rc*. The difference is in an unexpected
>> BOM:
>>
>> avar@gcc119:[/scratch/avar/git/t]perl -nle 'print unpack "H*"' trash\ directory.t2082-parallel-checkout-attributes/encoding/A.internal
>> efbbbf74657874
>> avar@gcc119:[/scratch/avar/git/t]perl -nle 'print unpack "H*"' trash\ directory.t2082-parallel-checkout-attributes/encoding/utf8-text
>> 74657874
>>
>> I.e. the A.internal starts with 0xefbbbf. The 2nd test of t0028*.sh also
>> fails similarly[2], so perhaps it's some old/iconv/whatever issue not
>> per-se related to any change of yours.
>
> The 0xefbbbf looks interesting, it's the BOM for UTF-8.
>
>> I tried compiling with both NO_ICONV=Y and ICONV_OMITS_BOM=Y, both have
>> the same failure.
>
> I didn't check the code-path for NO_ICONV=Y but ICONV_OMITS_BOM=Y only
> affects output of converting *to* utf-16 and utf-32.
>
> So, I think AIX's iconv implementation automatically adds a BOM to UTF-8?
>
> Perhaps we need to call skip_utf8_bom somewhere?

I debugged this a bit more. It's probably *also* an issue in our use of
libiconv, but it already goes wrong in our test setup, which uses
iconv(1). I.e. on my boring Linux box:

    echo x | iconv -f UTF-8 -t UTF-16 | perl -0777 -MData::Dumper -ne 'my @a = map { sprintf "0x%x", $_ } unpack "C*"; print Dumper \@a'
    $VAR1 = [
              '0xff',
              '0xfe',
              '0x78',
              '0x0',
              '0xa',
              '0x0'
            ];

On the AIX box, to get the same output I need to run that as:

    (printf '\377\376'; echo x | iconv -f UTF-8 -t UTF-16LE) | [...]

I.e. AIX's iconv omits the BOM *and* what we get from "UTF-16" on Linux
is what AIX calls UTF-16LE; a plain UTF-16 there gives you the
big-endian version. To make things worse, the same is true of UTF-32,
except that "iconv -l" lists no UTF-32LE version. So it seems we can't
get the same result at all for that one.
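
For reference, the two UTF-16 BOMs in printf(1) octal notation are
\377\376 (0xff 0xfe, little-endian) and \376\377 (0xfe 0xff,
big-endian). A minimal sketch of pairing an explicit-endianness
conversion with the matching BOM follows. prepend_utf16_bom is a
hypothetical helper, not something in git's test suite, and the usage
line assumes the local iconv accepts the name UTF-16LE:

```shell
# Hypothetical helper: emit the UTF-16 BOM for the requested byte
# order, then pass stdin through unchanged.
prepend_utf16_bom () {
	case "$1" in
	le) printf '\377\376' ;;	# 0xff 0xfe
	be) printf '\376\377' ;;	# 0xfe 0xff
	*) echo "usage: prepend_utf16_bom le|be" >&2; return 1 ;;
	esac &&
	cat
}

# Usage sketch (assumes iconv knows UTF-16LE on this platform):
#   printf 'x\n' | iconv -f UTF-8 -t UTF-16LE | prepend_utf16_bom le
```

Naming the byte order on both sides avoids depending on what a bare
"UTF-16" means to a given iconv.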

So the code added around 79444c92943 (utf8: handle systems that don't
write BOM for UTF-16, 2019-02-12) needs to be more careful from the
outset (although this looked broken even before that), i.e. we should
test exact known-good bytes and check whether UTF-16 is really what we
think it is, etc. This is likely broken on any big-endian non-GNUish
iconv implementation.
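
A rough sketch of what "test exact known-good bytes" could look like
using only POSIX od(1), so that it does not depend on perl or on the
local iconv. The helper names hex_bytes and bom_kind are made up for
illustration and are not existing test-lib functions:

```shell
# Hypothetical helper: print stdin as space-separated hex bytes.
hex_bytes () {
	# -v keeps od from collapsing repeated lines into "*"; word
	# splitting normalizes od's platform-specific spacing.
	set -- $(od -An -v -tx1)
	echo "$*"
}

# Hypothetical helper: classify the BOM (if any) at the start of stdin.
bom_kind () {
	# od -N4 reads at most the first four bytes.
	set -- $(od -An -tx1 -N4)
	case "$*" in
	"ff fe 00 00") echo UTF-32LE ;;	# caveat: also UTF-16LE BOM + U+0000
	"00 00 fe ff") echo UTF-32BE ;;
	"ef bb bf"*) echo UTF-8 ;;
	"ff fe"*) echo UTF-16LE ;;
	"fe ff"*) echo UTF-16BE ;;
	*) echo none ;;
	esac
}
```

With these, the A.internal contents quoted above (ef bb bf 74 65 78 74)
would be reported as carrying a UTF-8 BOM, while utf8-text
(74 65 78 74) would be reported as having none.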