public inbox for git@vger.kernel.org
From: Jeff King <peff@peff.net>
To: Eric Sunshine <sunshine@sunshineco.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/2] test-lib: teach test_seq the -f option
Date: Tue, 24 Jun 2025 06:36:52 -0400
Message-ID: <20250624103652.GD636332@coredump.intra.peff.net>
In-Reply-To: <CAPig+cQNWVd7M5pe0te9os3NRrjfBSSaUZjUXKX8RUdTk50SFw@mail.gmail.com>

On Tue, Jun 24, 2025 at 02:22:21AM -0400, Eric Sunshine wrote:

> > diff --git a/t/t0612-reftable-jgit-compatibility.sh b/t/t0612-reftable-jgit-compatibility.sh
> > @@ -112,14 +112,11 @@ test_expect_success 'JGit can read multi-level index' '
> > -               awk "
> > -                   BEGIN {
> > -                       print \"start\";
> > -                       for (i = 0; i < 10000; i++)
> > -                           printf \"create refs/heads/branch-%d HEAD\n\", i;
> > -                       print \"commit\";
> > -                   }
> > -               " >input &&
> > +               {
> > +                       echo start &&
> > +                       test_seq -f "create refs/heads/branch-%d HEAD" 10000 &&
> > +                       echo commit
> > +               } >input &&
> 
> I had suggested[1] an effectively equivalent change to Patrick for a
> couple tests in the nearby t0610, but he rejected[2] the idea due to
> the pure-shell version being significantly slower than the `awk`
> version.
> 
> Pondering his response today, I wondered if it would make sense to
> replace our pure-shell `test_seq` with an implementation via `awk`.
> However, if most of our sequences vend only a small set of numbers,
> then the startup cost of `awk` would probably swamp any savings,
> especially on Windows where process startup is extremely slow. Taking
> that into account, I further wondered if we could see an overall win
> by taking a hybrid approach in which we employ the pure-shell version
> if vending a small set of numbers, but fall back to an `awk` version
> if vending a lot of numbers, especially as in the test above or the
> tests in t0610. Anyhow, food for thought, or not, if you're not hungry
> for thought food.

Ah, interesting. I didn't time it at all, as my general intuition for
shell performance is that counting process spawns overrides everything
else (though admittedly it is usually O(n) processes vs O(1), and here
we are going from one extra process to zero).

I did a few timings, and it looks like the shell wins at 10,000 on my
system, but awk wins at 50,000 (though there is a lot of run-to-run
noise; I think awk might even win at 10,000 on a loaded system, as this
is such a light load that CPU frequency throttling comes into play).
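
For anyone who wants to repeat the comparison, here is a rough sketch of
the two generators being compared. The names gen_shell and gen_awk are
made up for illustration, and this is not necessarily how the timings
above were taken; it only mirrors the shell loop and the awk loop from
the quoted diff (counting from 1, as test_seq does).

```shell
# Hypothetical helpers for illustration only; not part of test-lib.

# Pure-shell loop: no extra process, one printf builtin call per line.
gen_shell () {
	gen_n=$1
	gen_i=1
	while test "$gen_i" -le "$gen_n"
	do
		printf 'create refs/heads/branch-%d HEAD\n' "$gen_i"
		gen_i=$((gen_i + 1))
	done
}

# awk loop: one extra process, with the loop running inside awk.
gen_awk () {
	awk -v n="$1" 'BEGIN {
		for (i = 1; i <= n; i++)
			printf "create refs/heads/branch-%d HEAD\n", i
	}'
}
```

In a shell where "time" is a keyword (bash, for example), something like
"time gen_shell 10000 >/dev/null" versus "time gen_awk 10000 >/dev/null"
should show the difference, though repeated runs are needed given the
noise mentioned above.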

I assumed that the culprit was a lack of buffering, but I don't think
so. awk seems to issue 10,000 write() calls. I guess it is just internal
shell overhead in issuing commands. Where is a JIT byte-code shell
interpreter when we need one? ;)


My inclination is not to worry about it too much. At 10,000 I think we
are talking about a few milliseconds. There's so much more low-hanging
fruit if somebody wants to optimize the test suite. IMHO readability is
more important here (and if we really want to optimize, doing it inside
test_seq would be better).
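
To make the "inside test_seq" idea concrete, here is a minimal sketch of
the hybrid approach Eric describes. It is not the actual test_seq from
test-lib-functions.sh: the name hybrid_seq and the HYBRID_SEQ_THRESHOLD
knob are invented for illustration, and the default of 20000 is only a
guess sitting between the 10,000 and 50,000 data points above. The right
cutoff will vary by system.

```shell
# Hypothetical sketch; hybrid_seq and HYBRID_SEQ_THRESHOLD are not real
# test-lib names. Usage: hybrid_seq [-f <fmt>] [<first>] <last>
HYBRID_SEQ_THRESHOLD=${HYBRID_SEQ_THRESHOLD:-20000}

hybrid_seq () {
	hybrid_seq_fmt=%d
	if test "$1" = "-f"
	then
		hybrid_seq_fmt=$2
		shift 2
	fi
	case $# in
	1) hybrid_seq_i=1 hybrid_seq_last=$1 ;;
	2) hybrid_seq_i=$1 hybrid_seq_last=$2 ;;
	*) echo >&2 "usage: hybrid_seq [-f <fmt>] [<first>] <last>"
	   return 1 ;;
	esac

	if test $((hybrid_seq_last - hybrid_seq_i)) -ge "$HYBRID_SEQ_THRESHOLD"
	then
		# Large range: pay for one awk process and loop inside it.
		# awk turns the "\n" in the -v value into a newline.
		awk -v first="$hybrid_seq_i" -v last="$hybrid_seq_last" \
		    -v fmt="$hybrid_seq_fmt\n" 'BEGIN {
			for (i = first; i <= last; i++)
				printf fmt, i
		}'
	else
		# Small range: stay in the shell and avoid process startup.
		while test "$hybrid_seq_i" -le "$hybrid_seq_last"
		do
			printf "$hybrid_seq_fmt\n" "$hybrid_seq_i"
			hybrid_seq_i=$((hybrid_seq_i + 1))
		done
	fi
}
```

Callers would not change: "hybrid_seq -f 'create refs/heads/branch-%d
HEAD' 10000" reads the same either way, which keeps the readability
benefit while leaving the cutoff as a tuning detail in one place.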

-Peff
