* Testsuite failure on s390x and sparc64 after 6840fe9ee2
@ 2025-03-26 20:42 John Paul Adrian Glaubitz
2025-03-26 22:27 ` Todd Zullinger
0 siblings, 1 reply; 14+ messages in thread
From: John Paul Adrian Glaubitz @ 2025-03-26 20:42 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee
Hi,
the following commit:
commit 6840fe9ee29ab51ffd7d924c624dc62da22c50bf
Author: Derrick Stolee <derrickstolee@github.com>
Date: Mon Feb 3 17:11:05 2025 +0000
backfill: add --min-batch-size=<n> option
Users may want to specify a minimum batch size for their needs. This is only
a minimum: the path-walk API provides a list of OIDs that correspond to the
same path, and thus it is optimal to allow delta compression across those
objects in a single server request.
We could consider limiting the request to have a maximum batch size in the
future. For now, we let the path-walk API batches determine the
boundaries.
(...)
broke the testsuite on s390x [1] and sparc64 [2]. The following test fails:
not ok 4 - do partial clone 2, backfill min batch size
CC'ing the author which is Derrick Stolee.
Thanks,
Adrian
> [1] https://buildd.debian.org/status/fetch.php?pkg=git&arch=s390x&ver=1%3A2.49.0-1&stamp=1742165887&raw=0
> [2] https://buildd.debian.org/status/fetch.php?pkg=git&arch=sparc64&ver=1%3A2.49.0-1&stamp=1742674659&raw=0
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-03-26 20:42 Testsuite failure on s390x and sparc64 after 6840fe9ee2 John Paul Adrian Glaubitz
@ 2025-03-26 22:27 ` Todd Zullinger
2025-03-28 9:29 ` Patrick Steinhardt
0 siblings, 1 reply; 14+ messages in thread
From: Todd Zullinger @ 2025-03-26 22:27 UTC (permalink / raw)
To: John Paul Adrian Glaubitz; +Cc: git, Derrick Stolee
John Paul Adrian Glaubitz wrote:
> the following commit:
>
> commit 6840fe9ee29ab51ffd7d924c624dc62da22c50bf
> Author: Derrick Stolee <derrickstolee@github.com>
> Date: Mon Feb 3 17:11:05 2025 +0000
>
> backfill: add --min-batch-size=<n> option
>
> Users may want to specify a minimum batch size for their needs. This is only
> a minimum: the path-walk API provides a list of OIDs that correspond to the
> same path, and thus it is optimal to allow delta compression across those
> objects in a single server request.
>
> We could consider limiting the request to have a maximum batch size in the
> future. For now, we let the path-walk API batches determine the
> boundaries.
> (...)
>
> broke the testsuite on s390x [1] and sparc64 [2]. The following test fails:
>
> not ok 4 - do partial clone 2, backfill min batch size
>
> CC'ing the author which is Derrick Stolee.
I reported this during the rc period. I didn't hear back on
it, but hopefully your message will arrive at a more
convenient time. :)
https://lore.kernel.org/git/Z8HW6petWuMRWSXf@teonanacatl.net/
--
Todd
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-03-26 22:27 ` Todd Zullinger
@ 2025-03-28 9:29 ` Patrick Steinhardt
2025-03-28 9:30 ` Patrick Steinhardt
2025-03-28 9:38 ` John Paul Adrian Glaubitz
0 siblings, 2 replies; 14+ messages in thread
From: Patrick Steinhardt @ 2025-03-28 9:29 UTC (permalink / raw)
To: Todd Zullinger; +Cc: John Paul Adrian Glaubitz, git, Derrick Stolee
On Wed, Mar 26, 2025 at 06:27:50PM -0400, Todd Zullinger wrote:
> John Paul Adrian Glaubitz wrote:
> > the following commit:
> >
> > commit 6840fe9ee29ab51ffd7d924c624dc62da22c50bf
> > Author: Derrick Stolee <derrickstolee@github.com>
> > Date: Mon Feb 3 17:11:05 2025 +0000
> >
> > backfill: add --min-batch-size=<n> option
> >
> > Users may want to specify a minimum batch size for their needs. This is only
> > a minimum: the path-walk API provides a list of OIDs that correspond to the
> > same path, and thus it is optimal to allow delta compression across those
> > objects in a single server request.
> >
> > We could consider limiting the request to have a maximum batch size in the
> > future. For now, we let the path-walk API batches determine the
> > boundaries.
> > (...)
> >
> > broke the testsuite on s390x [1] and sparc64 [2]. The following test fails:
> >
> > not ok 4 - do partial clone 2, backfill min batch size
> >
> > CC'ing the author which is Derrick Stolee.
>
> I reported this during the rc period. I didn't hear back on
> it, but hopefully your message will arrive at a more
> convenient time. :)
>
> https://lore.kernel.org/git/Z8HW6petWuMRWSXf@teonanacatl.net/
Copy-pasting the test logs from that mail:
expecting success of 5620.4 'do partial clone 2, backfill min batch size':
git clone --no-checkout --filter=blob:none \
--single-branch --branch=main \
"file://$(pwd)/srv.bare" backfill2 &&
GIT_TRACE2_EVENT="$(pwd)/batch-trace" git \
-C backfill2 backfill --min-batch-size=20 &&
# Batches were used
test_trace2_data promisor fetch_count 20 <batch-trace >matches &&
test_line_count = 2 matches &&
test_trace2_data promisor fetch_count 8 <batch-trace &&
# No more missing objects!
git -C backfill2 rev-list --quiet --objects --missing=print HEAD >revs2 &&
test_line_count = 0 revs2
+++ pwd
++ git clone --no-checkout --filter=blob:none --single-branch --branch=main 'file:///tmp/git-t.sYdo/trash directory.t5620-backfill/srv.bare' backfill2
Cloning into 'backfill2'...
+++ pwd
++ GIT_TRACE2_EVENT='/tmp/git-t.sYdo/trash directory.t5620-backfill/batch-trace'
++ git -C backfill2 backfill --min-batch-size=20
++ test_trace2_data promisor fetch_count 20
++ grep -e '"category":"promisor","key":"fetch_count","value":"20"'
error: last command exited with $?=1
not ok 4 - do partial clone 2, backfill min batch size
It would be nice to learn what the file contains instead of the expected
string, which might give us a bit more of a hint what's wrong. You can
for example apply the following patch:
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 79377bc0fc2..197494cd28c 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1975,7 +1975,7 @@ test_region () {
# GIT_TRACE2_EVENT="$(pwd)/trace.txt" git pack-objects ... &&
# test_trace2_data pack-objects reused N <trace2.txt
test_trace2_data () {
- grep -e '"category":"'"$1"'","key":"'"$2"'","value":"'"$3"'"'
+ test_grep -e '"category":"'"$1"'","key":"'"$2"'","value":"'"$3"'"'
}
# Given a GIT_TRACE2_EVENT log over stdin, writes to stdout a list of URLs
If you then re-run the test with `-ix` we should end up printing the
contents of that non-matching file.
Thanks!
Patrick
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-03-28 9:29 ` Patrick Steinhardt
@ 2025-03-28 9:30 ` Patrick Steinhardt
2025-03-28 9:38 ` John Paul Adrian Glaubitz
1 sibling, 0 replies; 14+ messages in thread
From: Patrick Steinhardt @ 2025-03-28 9:30 UTC (permalink / raw)
To: Todd Zullinger; +Cc: John Paul Adrian Glaubitz, git, Derrick Stolee
Also Cc'ing Stolee's current mail address instead of the GitHub one.
On Fri, Mar 28, 2025 at 10:29:16AM +0100, Patrick Steinhardt wrote:
> On Wed, Mar 26, 2025 at 06:27:50PM -0400, Todd Zullinger wrote:
> > John Paul Adrian Glaubitz wrote:
> > > the following commit:
> > >
> > > commit 6840fe9ee29ab51ffd7d924c624dc62da22c50bf
> > > Author: Derrick Stolee <derrickstolee@github.com>
> > > Date: Mon Feb 3 17:11:05 2025 +0000
> > >
> > > backfill: add --min-batch-size=<n> option
> > >
> > > Users may want to specify a minimum batch size for their needs. This is only
> > > a minimum: the path-walk API provides a list of OIDs that correspond to the
> > > same path, and thus it is optimal to allow delta compression across those
> > > objects in a single server request.
> > >
> > > We could consider limiting the request to have a maximum batch size in the
> > > future. For now, we let the path-walk API batches determine the
> > > boundaries.
> > > (...)
> > >
> > > broke the testsuite on s390x [1] and sparc64 [2]. The following test fails:
> > >
> > > not ok 4 - do partial clone 2, backfill min batch size
> > >
> > > CC'ing the author which is Derrick Stolee.
> >
> > I reported this during the rc period. I didn't hear back on
> > it, but hopefully your message will arrive at a more
> > convenient time. :)
> >
> > https://lore.kernel.org/git/Z8HW6petWuMRWSXf@teonanacatl.net/
>
> Copy-pasting the test logs from that mail:
>
> expecting success of 5620.4 'do partial clone 2, backfill min batch size':
> git clone --no-checkout --filter=blob:none \
> --single-branch --branch=main \
> "file://$(pwd)/srv.bare" backfill2 &&
> GIT_TRACE2_EVENT="$(pwd)/batch-trace" git \
> -C backfill2 backfill --min-batch-size=20 &&
> # Batches were used
> test_trace2_data promisor fetch_count 20 <batch-trace >matches &&
> test_line_count = 2 matches &&
> test_trace2_data promisor fetch_count 8 <batch-trace &&
> # No more missing objects!
> git -C backfill2 rev-list --quiet --objects --missing=print HEAD >revs2 &&
> test_line_count = 0 revs2
> +++ pwd
> ++ git clone --no-checkout --filter=blob:none --single-branch --branch=main 'file:///tmp/git-t.sYdo/trash directory.t5620-backfill/srv.bare' backfill2
> Cloning into 'backfill2'...
> +++ pwd
> ++ GIT_TRACE2_EVENT='/tmp/git-t.sYdo/trash directory.t5620-backfill/batch-trace'
> ++ git -C backfill2 backfill --min-batch-size=20
> ++ test_trace2_data promisor fetch_count 20
> ++ grep -e '"category":"promisor","key":"fetch_count","value":"20"'
> error: last command exited with $?=1
> not ok 4 - do partial clone 2, backfill min batch size
>
> It would be nice to learn what the file contains instead of the expected
> string, which might give us a bit more of a hint what's wrong. You can
> for example apply the following patch:
>
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index 79377bc0fc2..197494cd28c 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1975,7 +1975,7 @@ test_region () {
> # GIT_TRACE2_EVENT="$(pwd)/trace.txt" git pack-objects ... &&
> # test_trace2_data pack-objects reused N <trace2.txt
> test_trace2_data () {
> - grep -e '"category":"'"$1"'","key":"'"$2"'","value":"'"$3"'"'
> + test_grep -e '"category":"'"$1"'","key":"'"$2"'","value":"'"$3"'"'
> }
>
> # Given a GIT_TRACE2_EVENT log over stdin, writes to stdout a list of URLs
>
> If you then re-run the test with `-ix` we should end up printing the
> contents of that non-matching file.
>
> Thanks!
>
> Patrick
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-03-28 9:29 ` Patrick Steinhardt
2025-03-28 9:30 ` Patrick Steinhardt
@ 2025-03-28 9:38 ` John Paul Adrian Glaubitz
2025-03-28 14:08 ` Todd Zullinger
1 sibling, 1 reply; 14+ messages in thread
From: John Paul Adrian Glaubitz @ 2025-03-28 9:38 UTC (permalink / raw)
To: Patrick Steinhardt, Todd Zullinger; +Cc: git, Derrick Stolee
Hi Patrick,
On Fri, 2025-03-28 at 10:29 +0100, Patrick Steinhardt wrote:
> > I reported this during the rc period. I didn't hear back on
> > it, but hopefully your message will arrive at a more
> > convenient time. :)
> >
> > https://lore.kernel.org/git/Z8HW6petWuMRWSXf@teonanacatl.net/
>
> Copy-pasting the test logs from that mail:
>
> expecting success of 5620.4 'do partial clone 2, backfill min batch size':
> git clone --no-checkout --filter=blob:none \
> --single-branch --branch=main \
> "file://$(pwd)/srv.bare" backfill2 &&
> GIT_TRACE2_EVENT="$(pwd)/batch-trace" git \
> -C backfill2 backfill --min-batch-size=20 &&
> # Batches were used
> test_trace2_data promisor fetch_count 20 <batch-trace >matches &&
> test_line_count = 2 matches &&
> test_trace2_data promisor fetch_count 8 <batch-trace &&
> # No more missing objects!
> git -C backfill2 rev-list --quiet --objects --missing=print HEAD >revs2 &&
> test_line_count = 0 revs2
> +++ pwd
> ++ git clone --no-checkout --filter=blob:none --single-branch --branch=main 'file:///tmp/git-t.sYdo/trash directory.t5620-backfill/srv.bare' backfill2
> Cloning into 'backfill2'...
> +++ pwd
> ++ GIT_TRACE2_EVENT='/tmp/git-t.sYdo/trash directory.t5620-backfill/batch-trace'
> ++ git -C backfill2 backfill --min-batch-size=20
> ++ test_trace2_data promisor fetch_count 20
> ++ grep -e '"category":"promisor","key":"fetch_count","value":"20"'
> error: last command exited with $?=1
> not ok 4 - do partial clone 2, backfill min batch size
>
> It would be nice to learn what the file contains instead of the expected
> string, which might give us a bit more of a hint what's wrong. You can
> for example apply the following patch:
>
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index 79377bc0fc2..197494cd28c 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1975,7 +1975,7 @@ test_region () {
> # GIT_TRACE2_EVENT="$(pwd)/trace.txt" git pack-objects ... &&
> # test_trace2_data pack-objects reused N <trace2.txt
> test_trace2_data () {
> - grep -e '"category":"'"$1"'","key":"'"$2"'","value":"'"$3"'"'
> + test_grep -e '"category":"'"$1"'","key":"'"$2"'","value":"'"$3"'"'
> }
>
> # Given a GIT_TRACE2_EVENT log over stdin, writes to stdout a list of URLs
>
> If you then re-run the test with `-ix` we should end up printing the
> contents of that non-matching file.
Could you please post the complete command line? I have no clue where to pass "-ix".
I was previously running the tests with "make test".
Thanks,
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-03-28 9:38 ` John Paul Adrian Glaubitz
@ 2025-03-28 14:08 ` Todd Zullinger
2025-03-28 15:37 ` Todd Zullinger
0 siblings, 1 reply; 14+ messages in thread
From: Todd Zullinger @ 2025-03-28 14:08 UTC (permalink / raw)
To: John Paul Adrian Glaubitz; +Cc: Patrick Steinhardt, git, Derrick Stolee
John Paul Adrian Glaubitz wrote:
> Hi Patrick,
>
> On Fri, 2025-03-28 at 10:29 +0100, Patrick Steinhardt wrote:
>>> I reported this during the rc period. I didn't hear back on
>>> it, but hopefully your message will arrive at a more
>>> convenient time. :)
>>>
>>> https://lore.kernel.org/git/Z8HW6petWuMRWSXf@teonanacatl.net/
>>
>> Copy-pasting the test logs from that mail:
>>
>> expecting success of 5620.4 'do partial clone 2, backfill min batch size':
>> git clone --no-checkout --filter=blob:none \
>> --single-branch --branch=main \
>> "file://$(pwd)/srv.bare" backfill2 &&
>> GIT_TRACE2_EVENT="$(pwd)/batch-trace" git \
>> -C backfill2 backfill --min-batch-size=20 &&
>> # Batches were used
>> test_trace2_data promisor fetch_count 20 <batch-trace >matches &&
>> test_line_count = 2 matches &&
>> test_trace2_data promisor fetch_count 8 <batch-trace &&
>> # No more missing objects!
>> git -C backfill2 rev-list --quiet --objects --missing=print HEAD >revs2 &&
>> test_line_count = 0 revs2
>> +++ pwd
>> ++ git clone --no-checkout --filter=blob:none --single-branch --branch=main 'file:///tmp/git-t.sYdo/trash directory.t5620-backfill/srv.bare' backfill2
>> Cloning into 'backfill2'...
>> +++ pwd
>> ++ GIT_TRACE2_EVENT='/tmp/git-t.sYdo/trash directory.t5620-backfill/batch-trace'
>> ++ git -C backfill2 backfill --min-batch-size=20
>> ++ test_trace2_data promisor fetch_count 20
>> ++ grep -e '"category":"promisor","key":"fetch_count","value":"20"'
>> error: last command exited with $?=1
>> not ok 4 - do partial clone 2, backfill min batch size
>>
>> It would be nice to learn what the file contains instead of the expected
>> string, which might give us a bit more of a hint what's wrong. You can
>> for example apply the following patch:
>>
>> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
>> index 79377bc0fc2..197494cd28c 100644
>> --- a/t/test-lib-functions.sh
>> +++ b/t/test-lib-functions.sh
>> @@ -1975,7 +1975,7 @@ test_region () {
>> # GIT_TRACE2_EVENT="$(pwd)/trace.txt" git pack-objects ... &&
>> # test_trace2_data pack-objects reused N <trace2.txt
>> test_trace2_data () {
>> - grep -e '"category":"'"$1"'","key":"'"$2"'","value":"'"$3"'"'
>> + test_grep -e '"category":"'"$1"'","key":"'"$2"'","value":"'"$3"'"'
>> }
>>
>> # Given a GIT_TRACE2_EVENT log over stdin, writes to stdout a list of URLs
>>
>> If you then re-run the test with `-ix` we should end up printing the
>> contents of that non-matching file.
>
> Could you please post the complete command line? I have no clue where to pass "-ix".
>
> I was previously running the tests with "make test".
You'd do something like:
cd t && ./t5620-backfill.sh -ix
Though the patch to change grep to test_grep is incomplete,
I believe. Using that, you get an error:
error: bug in the test script: test_grep requires a file
to read as the last parameter
I don't have a lot of time to poke at this today, but I'll
make another test run on an s390x build host without that
patch, but where I can save the output and post it
somewhere.
For the Fedora packaging, it will be something like this:
make -C t all || {
(cd t && ./t5620-backfill.sh -ix);
./print-failed-test-output;
}
Where print-failed-test-output is a script¹ which snarfs up
the output files in t/test-results and the test directory,
since there is not direct shell access to the build host(s).
¹ https://src.fedoraproject.org/rpms/git/raw/0af3adf/f/print-failed-test-output
--
Todd
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-03-28 14:08 ` Todd Zullinger
@ 2025-03-28 15:37 ` Todd Zullinger
2025-03-31 12:27 ` Patrick Steinhardt
0 siblings, 1 reply; 14+ messages in thread
From: Todd Zullinger @ 2025-03-28 15:37 UTC (permalink / raw)
To: John Paul Adrian Glaubitz; +Cc: Patrick Steinhardt, git, Derrick Stolee
I wrote:
> I don't have a lot of time to poke at this today, but I'll
> make another test run on an s390x build host without that
> patch, but where I can save the output and post it
> somewhere.
>
> For the Fedora packaging, it will be something like this:
>
> make -C t all || {
> (cd t && ./t5620-backfill.sh -ix);
> ./print-failed-test-output;
> }
The matches file is empty.
$ ls -lhn batch-trace matches
-rw-r--r--. 1 1000 1000 31K Mar 28 11:09 batch-trace
-rw-r--r--. 1 1000 1000 0 Mar 28 11:09 matches
The only match in batch-trace for promisor fetch_count is
from the previous test:
$ grep -e '"category":"promisor","key":"fetch_count","value":' batch-trace
{"event":"data","sid":"20250328T150939.623820Z-H9aa15b67-P0008f613","thread":"main","time":"2025-03-28T15:09:39.625484Z","file":"promisor-remote.c","line":55,"repo":1,"t_abs":0.001777,"t_rel":0.001777,"nesting":1,"category":"promisor","key":"fetch_count","value":"48"}
The trash directory for the test run is here, in case anyone
wants to poke at it:
https://tmz.fedorapeople.org/t5620-backfill-trash-dir.tar.gz
The full build log is available as well:
https://tmz.fedorapeople.org/git-2.49.0-s390x-build.log
If you search for 'BEGIN BASE64 MESSAGE' in that, it
provides a command which can be used to extract the full
test-results directory. That's used to get the output from
the build hosts where shell access isn't available. I don't
know that it's got anything which isn't in the trash
directory tarball which I already extracted, but it's there
just in case.
--
Todd
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-03-28 15:37 ` Todd Zullinger
@ 2025-03-31 12:27 ` Patrick Steinhardt
2025-03-31 15:48 ` Todd Zullinger
2025-03-31 18:17 ` SZEDER Gábor
0 siblings, 2 replies; 14+ messages in thread
From: Patrick Steinhardt @ 2025-03-31 12:27 UTC (permalink / raw)
To: Todd Zullinger; +Cc: John Paul Adrian Glaubitz, git, Derrick Stolee
On Fri, Mar 28, 2025 at 11:37:25AM -0400, Todd Zullinger wrote:
> I wrote:
> > I don't have a lot of time to poke at this today, but I'll
> > make another test run on an s390x build host without that
> > patch, but where I can save the output and post it
> > somewhere.
> >
> > For the Fedora packaging, it will be something like this:
> >
> > make -C t all || {
> > (cd t && ./t5620-backfill.sh -ix);
> > ./print-failed-test-output;
> > }
>
> The matches file is empty.
>
> $ ls -lhn batch-trace matches
> -rw-r--r--. 1 1000 1000 31K Mar 28 11:09 batch-trace
> -rw-r--r--. 1 1000 1000 0 Mar 28 11:09 matches
>
> The only match in batch-trace for promisor fetch_count is
> from the previous test:
>
> $ grep -e '"category":"promisor","key":"fetch_count","value":' batch-trace
> {"event":"data","sid":"20250328T150939.623820Z-H9aa15b67-P0008f613","thread":"main","time":"2025-03-28T15:09:39.625484Z","file":"promisor-remote.c","line":55,"repo":1,"t_abs":0.001777,"t_rel":0.001777,"nesting":1,"category":"promisor","key":"fetch_count","value":"48"}
>
> The trash directory for the test run is here, in case anyone
> wants to poke at it:
>
> https://tmz.fedorapeople.org/t5620-backfill-trash-dir.tar.gz
>
> The full build log is available as well:
>
> https://tmz.fedorapeople.org/git-2.49.0-s390x-build.log
>
> If you search for 'BEGIN BASE64 MESSAGE' in that, it
> provides a command which can be used to extract the full
> test-results directory. That's used to get the output from
> the build hosts where shell access isn't available. I don't
> know that it's got anything which isn't in the trash
> directory tarball which I already extracted, but it's there
> just in case.
Thanks for the additional information!
One thing I stumbled over: the `--min-batch-size` parameter is parsed
using `OPT_INTEGER()`, which expects the value pointer to point to an
integer. But we pass `struct backfill_context::min_batch_size`, which is
of type `size_t`. Maybe that's causing us to end up with an invalid
value?
Could you please check whether the below diff fixes the issue for you?
If so I can turn it into a proper patch.
Patrick
-- >8 --
diff --git a/builtin/backfill.c b/builtin/backfill.c
index 33e1ea2f84f..1dd0d746538 100644
--- a/builtin/backfill.c
+++ b/builtin/backfill.c
@@ -119,11 +119,11 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit
struct backfill_context ctx = {
.repo = repo,
.current_batch = OID_ARRAY_INIT,
- .min_batch_size = 50000,
.sparse = 0,
};
+ unsigned long min_batch_size = 50000;
struct option options[] = {
- OPT_INTEGER(0, "min-batch-size", &ctx.min_batch_size,
+ OPT_MAGNITUDE(0, "min-batch-size", &min_batch_size,
N_("Minimum number of objects to request at a time")),
OPT_BOOL(0, "sparse", &ctx.sparse,
N_("Restrict the missing objects to the current sparse-checkout")),
@@ -140,6 +140,7 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit
if (ctx.sparse < 0)
ctx.sparse = core_apply_sparse_checkout;
+ ctx.min_batch_size = min_batch_size;
result = do_backfill(&ctx);
backfill_context_clear(&ctx);
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-03-31 12:27 ` Patrick Steinhardt
@ 2025-03-31 15:48 ` Todd Zullinger
2025-03-31 18:17 ` SZEDER Gábor
1 sibling, 0 replies; 14+ messages in thread
From: Todd Zullinger @ 2025-03-31 15:48 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: John Paul Adrian Glaubitz, git, Derrick Stolee
[cc: fixed Derrick's address]
Hi Patrick,
Patrick Steinhardt wrote:
> Thanks for the additional information!
Thank you for looking into it and providing a patch!
> One thing I stumbled over: the `--min-batch-size` parameter is parsed
> using `OPT_INTEGER()`, which expects the value pointer to point to an
> integer. But we pass `struct backfill_context::min_batch_size`, which is
> of type `size_t`. Maybe that's causing us to end up with an invalid
> value?
>
> Could you please check whether the below diff fixes the issue for you?
> If so I can turn it into a proper patch.
It does indeed lead to a successful test run:
t5620-backfill.sh ..................................
ok 1 - setup repo for object creation
ok 2 - setup bare clone for server
ok 3 - do partial clone 1, backfill gets all objects
ok 4 - do partial clone 2, backfill min batch size
ok 5 - backfill --sparse without sparse-checkout fails
ok 6 - backfill --sparse
ok 7 - backfill --sparse without cone mode (positive)
ok 8 - backfill --sparse without cone mode (negative)
ok 9 - create a partial clone over HTTP
ok 10 - backfilling over HTTP succeeds
# passed all 10 test(s)
1..10
Source: https://kojipkgs.fedoraproject.org//work/tasks/3947/130943947/build.log
I tested it against the other common architectures the
Fedora build system provides as well, to be sure no others
regressed, although there aren't as many as the Debian build
system :).
--
Todd
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-03-31 12:27 ` Patrick Steinhardt
2025-03-31 15:48 ` Todd Zullinger
@ 2025-03-31 18:17 ` SZEDER Gábor
2025-04-01 2:33 ` Jeff King
1 sibling, 1 reply; 14+ messages in thread
From: SZEDER Gábor @ 2025-03-31 18:17 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: Todd Zullinger, John Paul Adrian Glaubitz, git, Derrick Stolee
On Mon, Mar 31, 2025 at 02:27:06PM +0200, Patrick Steinhardt wrote:
> One thing I stumbled over: the `--min-batch-size` parameter is parsed
> using `OPT_INTEGER()`, which expects the value pointer to point to an
> integer. But we pass `struct backfill_context::min_batch_size`, which is
> of type `size_t`. Maybe that's causing us to end up with an invalid
> value?
We could teach parse-options to verify at compile time that it got a
'value' pointer to an appropriately sized variable with a simple
trick:
diff --git a/parse-options.h b/parse-options.h
index 997ffbee80..ac63f9548a 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -213,7 +213,7 @@ struct option {
.type = OPTION_INTEGER, \
.short_name = (s), \
.long_name = (l), \
- .value = (v), \
+ .value = (v) + 0/(sizeof(*(v)) == sizeof(int)), \
.argh = N_("n"), \
.help = (h), \
.flags = (f), \
This bug would then cause a compiler error like this:
CC builtin/backfill.o
In file included from builtin/backfill.c:7:
builtin/backfill.c: In function ‘cmd_backfill’:
./parse-options.h:216:25: error: division by zero [-Werror=div-by-zero]
216 | .value = (v) + 0/(sizeof(*v) == sizeof(int)), \
| ^
./parse-options.h:272:37: note: in expansion of macro ‘OPT_INTEGER_F’
272 | #define OPT_INTEGER(s, l, v, h) OPT_INTEGER_F(s, l, v, h, 0)
| ^~~~~~~~~~~~~
builtin/backfill.c:126:17: note: in expansion of macro ‘OPT_INTEGER’
126 | OPT_INTEGER(0, "min-batch-size", &ctx.min_batch_size,
| ^~~~~~~~~~~
cc1: all warnings being treated as errors
make: *** [Makefile:2811: builtin/backfill.o] Error 1
Alas, the change is ugly (and we should do the same for many other
OPT_* macros as well) and the error message is far from
to-the-point... Turning this into something usable would require a
more clever trick, and that's more than I can devote to this issue.
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-03-31 18:17 ` SZEDER Gábor
@ 2025-04-01 2:33 ` Jeff King
2025-04-01 3:10 ` Jeff King
0 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2025-04-01 2:33 UTC (permalink / raw)
To: SZEDER Gábor
Cc: René Scharfe, Patrick Steinhardt, Todd Zullinger,
John Paul Adrian Glaubitz, git, Derrick Stolee
On Mon, Mar 31, 2025 at 08:17:57PM +0200, SZEDER Gábor wrote:
> On Mon, Mar 31, 2025 at 02:27:06PM +0200, Patrick Steinhardt wrote:
> > One thing I stumbled over: the `--min-batch-size` parameter is parsed
> > using `OPT_INTEGER()`, which expects the value pointer to point to an
> > integer. But we pass `struct backfill_context::min_batch_size`, which is
> > of type `size_t`. Maybe that's causing us to end up with an invalid
> > value?
>
> We could teach parse-options to verify at compile time that it got a
> 'value' pointer to an appropriately sized variable with a simple
> trick:
That would be nice. I think we've discussed type safety for
parse-options before, but IIRC none of the solutions were very
satisfying. But this sounds like a relatively low-effort approach that
buys us something, at least. I wonder if it could even be extended to
use __builtin_types_compatible() on platforms that support it.
+cc René as our resident expert on gross C hacks. ;)
> This bug would then cause a compiler error like this:
>
> CC builtin/backfill.o
> In file included from builtin/backfill.c:7:
> builtin/backfill.c: In function ‘cmd_backfill’:
> ./parse-options.h:216:25: error: division by zero [-Werror=div-by-zero]
> 216 | .value = (v) + 0/(sizeof(*v) == sizeof(int)), \
> | ^
> ./parse-options.h:272:37: note: in expansion of macro ‘OPT_INTEGER_F’
> 272 | #define OPT_INTEGER(s, l, v, h) OPT_INTEGER_F(s, l, v, h, 0)
> | ^~~~~~~~~~~~~
> builtin/backfill.c:126:17: note: in expansion of macro ‘OPT_INTEGER’
> 126 | OPT_INTEGER(0, "min-batch-size", &ctx.min_batch_size,
> | ^~~~~~~~~~~
> cc1: all warnings being treated as errors
> make: *** [Makefile:2811: builtin/backfill.o] Error 1
>
> Alas, the change is ugly (and we should do the same for many other
> OPT_* macros as well) and the error message is far from
> to-the-point... Turning this into something usable would require a
> more clever trick, and that's more than I can devote to this issue.
We do have BUILD_ASSERT_OR_ZERO(). It produces similarly arcane errors,
but at least the presence of the macro name helps a bit. E.g., doing
this:
diff --git a/parse-options.h b/parse-options.h
index 997ffbee80..5303ad6bcf 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -213,7 +213,7 @@ struct option {
.type = OPTION_INTEGER, \
.short_name = (s), \
.long_name = (l), \
- .value = (v), \
+ .value = (v) + BUILD_ASSERT_OR_ZERO(sizeof(*v) == sizeof(int)), \
.argh = N_("n"), \
.help = (h), \
.flags = (f), \
yields:
CC builtin/backfill.o
In file included from ./builtin.h:4,
from builtin/backfill.c:4:
builtin/backfill.c: In function ‘cmd_backfill’:
./git-compat-util.h:103:22: error: size of unnamed array is negative
103 | (sizeof(char [1 - 2*!(cond)]) - 1)
| ^
./parse-options.h:216:24: note: in expansion of macro ‘BUILD_ASSERT_OR_ZERO’
216 | .value = (v) + BUILD_ASSERT_OR_ZERO(sizeof(*v) == sizeof(int)), \
| ^~~~~~~~~~~~~~~~~~~~
./parse-options.h:272:37: note: in expansion of macro ‘OPT_INTEGER_F’
272 | #define OPT_INTEGER(s, l, v, h) OPT_INTEGER_F(s, l, v, h, 0)
| ^~~~~~~~~~~~~
builtin/backfill.c:126:17: note: in expansion of macro ‘OPT_INTEGER’
126 | OPT_INTEGER(0, "min-batch-size", &ctx.min_batch_size,
| ^~~~~~~~~~~
make: *** [Makefile:2810: builtin/backfill.o] Error 1
-Peff
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-04-01 2:33 ` Jeff King
@ 2025-04-01 3:10 ` Jeff King
2025-04-01 11:43 ` Patrick Steinhardt
0 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2025-04-01 3:10 UTC (permalink / raw)
To: SZEDER Gábor
Cc: René Scharfe, Patrick Steinhardt, Todd Zullinger,
John Paul Adrian Glaubitz, git, Derrick Stolee
On Mon, Mar 31, 2025 at 10:33:58PM -0400, Jeff King wrote:
> That would be nice. I think we've discussed type safety for
> parse-options before, but IIRC none of the solutions were very
> satisfying. But this sounds like a relatively low-effort approach that
> buys us something, at least. I wonder if it could even be extended to
> use __builtin_types_compatible() on platforms that support it.
So here's a slightly fancier version that uses the gcc builtin when it's
available:
diff --git a/git-compat-util.h b/git-compat-util.h
index 8560c89374..7bcbe0b4ac 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -110,11 +110,17 @@ DISABLE_WARNING(-Wsign-compare)
# define BARF_UNLESS_COPYABLE(dst, src) \
BUILD_ASSERT_OR_ZERO(__builtin_types_compatible_p(__typeof__(*(dst)), \
__typeof__(*(src))))
+
+# define BARF_UNLESS_TYPE_MATCH(var, type) \
+ BUILD_ASSERT_OR_ZERO(__builtin_types_compatible_p(__typeof__(*(var)), type))
+
#else
# define BARF_UNLESS_AN_ARRAY(arr) 0
# define BARF_UNLESS_COPYABLE(dst, src) \
BUILD_ASSERT_OR_ZERO(0 ? ((*(dst) = *(src)), 0) : \
sizeof(*(dst)) == sizeof(*(src)))
+# define BARF_UNLESS_TYPE_MATCH(var, type) \
+ BUILD_ASSERT_OR_ZERO(sizeof(*(var)) == sizeof(type))
#endif
/*
* ARRAY_SIZE - get the number of elements in a visible array
diff --git a/parse-options.h b/parse-options.h
index 997ffbee80..b38a852a8b 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -213,7 +213,7 @@ struct option {
.type = OPTION_INTEGER, \
.short_name = (s), \
.long_name = (l), \
- .value = (v), \
+ .value = (v) + BARF_UNLESS_TYPE_MATCH((v), int), \
.argh = N_("n"), \
.help = (h), \
.flags = (f), \
That turns up several more hits, which all seem to be related to
signed-ness (mostly passing a pointer to unsigned). E.g.:
git grep --after-context=-1 foo -- builtin/checkout.c
ends up assigning "-1" to an "unsigned" via pointer casting. I think
that's probably technically undefined behavior, but works OK in practice
to give you UINT_MAX. I'd have thought that would give you infinite
context, so it might even be doing something useful (or at least
something that users might rely upon), but strangely it doesn't seem to
(I'd guess very large context values aren't handled in the grep code
somehow).
So it might be possible to clean these up without hurting anything else.
-Peff
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-04-01 3:10 ` Jeff King
@ 2025-04-01 11:43 ` Patrick Steinhardt
2025-04-01 15:04 ` Patrick Steinhardt
0 siblings, 1 reply; 14+ messages in thread
From: Patrick Steinhardt @ 2025-04-01 11:43 UTC (permalink / raw)
To: Jeff King
Cc: SZEDER Gábor, René Scharfe, Todd Zullinger,
John Paul Adrian Glaubitz, git, Derrick Stolee
On Mon, Mar 31, 2025 at 11:10:30PM -0400, Jeff King wrote:
> On Mon, Mar 31, 2025 at 10:33:58PM -0400, Jeff King wrote:
>
> > That would be nice. I think we've discussed type safety for
> > parse-options before, but IIRC none of the solutions were very
> > satisfying. But this sounds like a relatively low-effort approach that
> > buys us something, at least. I wonder if it could even be extended to
> > use __builtin_types_compatible() on platforms that support it.
>
> So here's a slightly fancier version that uses the gcc builtin when it's
> available:
Thanks for these! I'd also like to spin this even further: right now we
don't really care about the precision of the underlying integer types at
all. While we could force all users to the same type via your mechanism,
I think that'd ultimately be quite awkward. Another way would be to use
one macro per underlying integer type, but that would quickly explode in
scope.
I'll instead try to extend the parse-options interface so that we track
the precision of the underlying integer and then produce an error when
the parsed integer exceeds that precision.
I'll send a patch series later this week.
Patrick
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Testsuite failure on s390x and sparc64 after 6840fe9ee2
2025-04-01 11:43 ` Patrick Steinhardt
@ 2025-04-01 15:04 ` Patrick Steinhardt
0 siblings, 0 replies; 14+ messages in thread
From: Patrick Steinhardt @ 2025-04-01 15:04 UTC (permalink / raw)
To: Jeff King
Cc: SZEDER Gábor, René Scharfe, Todd Zullinger,
John Paul Adrian Glaubitz, git, Derrick Stolee
On Tue, Apr 01, 2025 at 01:43:37PM +0200, Patrick Steinhardt wrote:
> I'll send a patch series later this week.
Sent now via [1].
Patrick
[1]: <20250401-b4-pks-parse-options-integers-v1-0-a628ad40c3b4@pks.im>
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2025-04-01 15:04 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-26 20:42 Testsuite failure on s390x and sparc64 after 6840fe9ee2 John Paul Adrian Glaubitz
2025-03-26 22:27 ` Todd Zullinger
2025-03-28 9:29 ` Patrick Steinhardt
2025-03-28 9:30 ` Patrick Steinhardt
2025-03-28 9:38 ` John Paul Adrian Glaubitz
2025-03-28 14:08 ` Todd Zullinger
2025-03-28 15:37 ` Todd Zullinger
2025-03-31 12:27 ` Patrick Steinhardt
2025-03-31 15:48 ` Todd Zullinger
2025-03-31 18:17 ` SZEDER Gábor
2025-04-01 2:33 ` Jeff King
2025-04-01 3:10 ` Jeff King
2025-04-01 11:43 ` Patrick Steinhardt
2025-04-01 15:04 ` Patrick Steinhardt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).