Git development
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: Michael Montalbo <mmontalbo@gmail.com>,
	git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [RFH] Why do osx CI jobs so unreliable?
Date: Mon, 22 Jun 2026 06:42:24 +0200
Message-ID: <aji9MOE-NTHKXYqn@pks.im>
In-Reply-To: <20260621213407.GC2297179@coredump.intra.peff.net>

On Sun, Jun 21, 2026 at 05:34:07PM -0400, Jeff King wrote:
> On Sat, Jun 20, 2026 at 08:33:13AM -0700, Michael Montalbo wrote:
> 
> > Patrick Steinhardt <ps@pks.im> writes:
> > > So I strongly suspect that it must be one of the t555* tests.
> > > [...]
> > > Maybe this is something that's specific to GitHub's environment...
> > 
> > I think you're right it's t5551/t5559. The runs Junio linked:
> > 
> >   osx-clang     cancelled  360min
> >   osx-gcc       cancelled  360min
> >   osx-reftable  success     35min
> >   osx-meson     success     61min
> > 
> > All four run the same t5551/t5559 under EXPENSIVE. The two that
> > finished differ in just two ways, which look like the levers:
> > osx-reftable generates the 100k-ref advertisement in ~24ms vs ~1.2s
> > for loose refs on macOS (so much less time mid-response), and
> > osx-meson runs tests at nproc while the prove jobs hardcode --jobs=10
> > on a 3-core runner (over recent master/next the prove jobs hang ~40%,
> > meson ~10%).
> 
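The --jobs=10 versus 3 cores mismatch quoted above could be avoided by
deriving the job count from the machine. This is only a sketch: the
helper names online_cpus and capped_jobs are made up for illustration
and are not part of our CI scripts.

```shell
# Print the number of online CPU cores, falling back to 1 if neither
# getconf nor sysctl can tell us.
online_cpus () {
	n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) ||
	n=$(sysctl -n hw.ncpu 2>/dev/null) ||
	n=1
	echo "$n"
}

# Cap a requested job count ($1) at the number of available cores ($2).
capped_jobs () {
	if test "$1" -gt "$2"
	then
		echo "$2"
	else
		echo "$1"
	fi
}
```

With this, `capped_jobs 10 "$(online_cpus)"` would yield 3 on a 3-core
runner. Whether lower parallelism fixes the hang or only makes it less
likely is a separate question.
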
> If the problem is a racy deadlock, there is a reasonable chance that
> some jobs may simply be lucky. Even if things like packing refs help, I
> suspect the problem may still be lurking. Maybe I'm just a pessimist,
> though. ;)

I had the same thought.
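
To put rough numbers on "lucky": if we take the quoted hang rates at
face value (~40% for the prove jobs, ~10% for meson) and assume that
runs are independent of each other, which is an assumption and not
something anybody has measured, then the chance of a job passing n
times in a row is (1 - p)^n. The helper name below is made up for
illustration.

```shell
# Probability that a job completes n runs in a row, given a per-run
# hang probability p ($1) and a run count n ($2), assuming the runs
# are independent.
survival () {
	LC_ALL=C awk -v p="$1" -v n="$2" \
		'BEGIN { printf "%.3f\n", (1 - p) ^ n }'
}
```

Here `survival 0.4 3` prints 0.216 and `survival 0.1 3` prints 0.729, so
under these assumptions a handful of green runs does not tell us much
about whether the underlying problem is gone.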

> > When it is wedged the whole chain sits at 0% CPU. upload-pack is
> > blocked in write() on the ls-refs advertisement, curl blocked in
> > select(). So it looks like an HTTP/2 flow-control stall on the
> > response side. The same stall resets itself after ~60-85s on my Linux
> > box and on a bare-metal Mac, but not on the GitHub runner; I haven't
> > pinned down why yet.
> 
> We had some HTTP/2 stalls/deadlocks in the past, and they were dependent
> on libcurl and apache (actually h2_mod) versions. IIRC some of the
> non-TLS code paths for HTTP/2 were not well tested, which led to
> 8f2146dbf1 (t5559: make SSL/TLS the default, 2023-02-23). Of course
> after that commit those cleartext code paths should not be a problem, so
> that is probably not exactly the issue now.
> 
> But it might be worth checking the versions you're running locally
> versus what's in the GitHub runner.

I didn't observe any similar hangs in GitLab's CI systems, so I wonder
whether this is because of different versions of curl. And indeed we use
different versions:

  - On GitHub we use 8.6.0.

  - On GitLab we use 8.7.1.
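
For anyone who wants to compare their own environment, a minimal sketch
for printing the relevant versions follows. The function names are made
up for illustration. LIB_HTTPD_PATH is the variable our test suite uses
to locate Apache, and I am assuming here that it falls back to `httpd`
in PATH when unset.

```shell
# Extract the version from the first line of `curl --version`, which
# looks like "curl 8.6.0 (x86_64-apple-darwin23.0) libcurl/8.6.0 ...".
curl_version_of () {
	sed -n '1s/^curl \([0-9][0-9.]*\).*/\1/p'
}

# Extract the version from `httpd -v`, whose first line looks like
# "Server version: Apache/2.4.62 (Unix)".
apache_version_of () {
	sed -n 's|^Server version: Apache/\([0-9][0-9.]*\).*|\1|p'
}

# Print the versions of the curl binary and of Apache. Either line
# stays empty if the tool is not installed.
report_http_stack () {
	echo "curl:   $(curl --version 2>/dev/null | curl_version_of)"
	echo "apache: $("${LIB_HTTPD_PATH:-httpd}" -v 2>/dev/null | apache_version_of)"
}
```

Calling `report_http_stack` in both CI systems would give us comparable
output. Note that the `curl` binary in PATH is not necessarily the
libcurl that Git was linked against, so this is only a first
approximation.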

Now this of course doesn't mean that updating the curl version is the
fix to this whole issue, as there are a ton of other factors that could
play a role in whether or not the test hangs. So while we could just
upgrade parts of the stack and cross our fingers, that feels rather
unsatisfactory. Still, one place to start could be to update our build
images to macOS 15.
But the big question to me is whether the hang is because of a bug in
Git with how we drive curl, a bug in curl itself, or a bug in Apache.

Patrick
