Git development
 help / color / mirror / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Derrick Stolee <stolee@gmail.com>
Cc: Jeff King <peff@peff.net>,
	 Johannes Schindelin via GitGitGadget <gitgitgadget@gmail.com>,
	 git@vger.kernel.org
Subject: Re: [PATCH 6/6] t5608: add regression test for >4GB object clone
Date: Mon, 4 May 2026 19:07:10 +0200 (CEST)	[thread overview]
Message-ID: <7ab69861-1508-106e-5d55-5ce5508653f8@gmx.de> (raw)
In-Reply-To: <ed226571-a095-456b-9d9e-bcc545d7ddfb@gmail.com>

Hi Stolee & Jeff,

On Sun, 3 May 2026, Derrick Stolee wrote:

> On 5/1/2026 2:38 AM, Jeff King wrote:
> > On Wed, Apr 29, 2026 at 09:34:21AM -0400, Derrick Stolee wrote:
> 
> >>> +test_expect_success SIZE_T_IS_64BIT 'set up repo with >4GB object' '
> >>
> >> Your prereq here prevents it from running on 32-bit builds, which is
> >> good. However, I wonder if it would be worth also specifying these
> >> tests as expensive. It's less likely that these layers will be touched
> >> often, so it should be enough to run these on major occasions, such as
> >> testing a release candidate.
> > 
> > I think it is already skipped in most cases, because t5608 requires the
> > GIT_TEST_CLONE_2GB environment variable be set. Arguably it should just
> > be using EXPENSIVE, too, as I do not think there is much value in having
> > individual flags for all of the expensive tests. I think that test just
> > predates the modern prereq system entirely.
> 
> Thanks for the extra details here! That helps avoid the issues that I
> was thinking about, but maybe doubling-down and adding EXPENSIVE is
> still worth it. 

Indeed. The `GIT_TEST_CLONE_2GB` flag is set for all CI jobs:
https://gitlab.com/git-scm/git/-/blob/v2.54.0/ci/lib.sh#L314

So I've tried to accelerate it.

First, by using the "unsafe" SHA-1 (because we're in no danger here, we
generate the data ourselves). That helped some, but unfortunately the most
efficient implementation (OpenSSL) is out of bounds for us because we
cannot enable its use in the default configuration for licensing reasons:
OpenSSL's license and Git's GPLv2 are fundamentally incompatible with each
other, and even though Linus' intention with
https://gitlab.com/git-scm/git/-/blob/e83c5163316f89bfbde7d9ab23ca2e25604af290/Makefile#L11
was clearly to allow linking to OpenSSL (actually, requiring it), there is
one Git contributor I spoke to who flatly stated that they'd block every
attempt to add an exception to Git's license retroactively to allow
linking to OpenSSL (at least for distribution, which is when the GPLv2
would kick in). So that's that.

Second, I added shortcuts for the 4GiB+1 size exercised in the added test
cases. That did help! But only the generation time of those packfiles is
helpd by that. The `git clone` that is tested is still awfully slow, and
that _cannot_ be worked around in the same ways.

Therefore, I ended up marking the test cases as `EXPENSIVE`.

To allow them to be run regularly anyway, I added a final patch to the
series that lets the CI runs on the integration branches (other than
`seen`) run all `EXPENSIVE` test cases. One could argue that this patch
should be split out, and I'm open to it, but in the interest of keeping
the time I am working on this patch series _somewhat_ closer to what could
be called reasonable, I'll just keep it in the patch series for now,
hedging for the possibility that maybe nobody objects?

Ciao,
Johannes

> >> I suppose this also is a question for Junio and our process for
> >> validating releases. Do we have a certain cadence where we run the
> >> expensive tests? What has been our threshold for hiding a test case
> >> behind the expensive label?
> > 
> > AFAIK the labeling of expensive things is mostly ad-hoc, and nobody is
> > systematically running them. Likewise for the t/perf tests, which are
> > super expensive but do (very occasionally) turn up interesting
> > regressions.
> I used to be more diligent about running the performance tests myself
> around release windows. The EXPENSIVE tests would also be good to do
> on rc0. I will contemplate how to put this into my routine.
> 
> Thanks,
> -Stolee
> 
> 
> 

  reply	other threads:[~2026-05-04 17:07 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-28 16:26 [PATCH 0/6] Handle cloning of objects larger than 4GB on Windows Johannes Schindelin via GitGitGadget
2026-04-28 16:26 ` [PATCH 1/6] index-pack, unpack-objects: use size_t for object size Johannes Schindelin via GitGitGadget
2026-04-30 14:13   ` Torsten Bögershausen
2026-05-03 14:46     ` Johannes Schindelin
2026-04-28 16:26 ` [PATCH 2/6] git-zlib: handle data streams larger than 4GB Johannes Schindelin via GitGitGadget
2026-04-28 16:26 ` [PATCH 3/6] odb, packfile: use size_t for streaming object sizes Johannes Schindelin via GitGitGadget
2026-04-28 16:26 ` [PATCH 4/6] delta, packfile: use size_t for delta header sizes Johannes Schindelin via GitGitGadget
2026-04-29 13:28   ` Derrick Stolee
2026-05-03 14:49     ` Johannes Schindelin
2026-04-28 16:26 ` [PATCH 5/6] test-tool: add a helper to synthesize large packfiles Johannes Schindelin via GitGitGadget
2026-04-28 16:26 ` [PATCH 6/6] t5608: add regression test for >4GB object clone Johannes Schindelin via GitGitGadget
2026-04-29 13:34   ` Derrick Stolee
2026-05-01  6:38     ` Jeff King
2026-05-01 13:19       ` Derrick Stolee
2026-05-04 17:07         ` Johannes Schindelin [this message]
2026-04-29 13:35 ` [PATCH 0/6] Handle cloning of objects larger than 4GB on Windows Derrick Stolee
2026-05-04 17:08 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 01/11] index-pack, unpack-objects: use size_t for object size Johannes Schindelin via GitGitGadget
2026-05-05 19:11     ` Torsten Bögershausen
2026-05-08  7:36       ` Johannes Schindelin
2026-05-08 19:09         ` Torsten Bögershausen
2026-05-10  2:41           ` Junio C Hamano
2026-05-10  9:14             ` Torsten Bögershausen
2026-05-04 17:08   ` [PATCH v2 02/11] git-zlib: handle data streams larger than 4GB Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 03/11] odb, packfile: use size_t for streaming object sizes Johannes Schindelin via GitGitGadget
2026-05-05 19:27     ` Torsten Bögershausen
2026-05-08  7:38       ` Johannes Schindelin
2026-05-04 17:08   ` [PATCH v2 04/11] delta, packfile: use size_t for delta header sizes Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 05/11] test-tool: add a helper to synthesize large packfiles Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 06/11] t5608: add regression test for >4GB object clone Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 07/11] test-tool synthesize: use the unsafe hash for speed Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 08/11] test-tool synthesize: precompute pack for 4 GiB + 1 Johannes Schindelin via GitGitGadget
2026-05-04 18:27     ` Derrick Stolee
2026-05-05 20:54       ` Johannes Schindelin
2026-05-04 17:08   ` [PATCH v2 09/11] test-tool synthesize: add precomputed SHA-256 " Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 10/11] t5608: mark >4GB tests as EXPENSIVE Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 11/11] ci: run expensive tests on push builds to integration branches Johannes Schindelin via GitGitGadget
2026-05-04 18:35     ` Derrick Stolee
2026-05-05 12:56       ` Junio C Hamano
2026-05-05 23:07         ` Junio C Hamano
2026-05-06  8:33           ` Johannes Schindelin
2026-05-07  9:18             ` Junio C Hamano
2026-05-07 10:24               ` Patrick Steinhardt
2026-05-08  2:50         ` Junio C Hamano
2026-05-08  8:16   ` [PATCH v3 00/11] Handle cloning of objects larger than 4GB on Windows Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 01/11] index-pack, unpack-objects: use size_t for object size Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 02/11] git-zlib: handle data streams larger than 4GB Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 03/11] odb, packfile: use size_t for streaming object sizes Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 04/11] delta, packfile: use size_t for delta header sizes Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 05/11] test-tool: add a helper to synthesize large packfiles Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 06/11] t5608: add regression test for >4GB object clone Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 07/11] test-tool synthesize: use the unsafe hash for speed Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 08/11] test-tool synthesize: precompute pack for 4 GiB + 1 Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 09/11] test-tool synthesize: add precomputed SHA-256 " Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 10/11] t5608: mark >4GB tests as EXPENSIVE Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 11/11] ci: run expensive tests on push builds to integration branches Johannes Schindelin via GitGitGadget
2026-05-10 23:51       ` [PATCH] ci: enable EXPENSIVE for contributor builds Junio C Hamano
2026-05-11  7:05         ` Patrick Steinhardt
2026-05-11  8:29           ` Junio C Hamano
2026-05-11 10:02             ` Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7ab69861-1508-106e-5d55-5ce5508653f8@gmx.de \
    --to=johannes.schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox