[PATCH 0/2] test_seq format option

public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/2] test_seq format option
@ 2025-06-23 10:55 Jeff King
  2025-06-23 10:55 ` [PATCH 1/2] t7422: replace confusing printf with echo Jeff King
  2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King
  0 siblings, 2 replies; 14+ messages in thread
From: Jeff King @ 2025-06-23 10:55 UTC (permalink / raw)
  To: git

This is a small quality of life improvement for our test suite, stemming
from this discussion:

  https://lore.kernel.org/git/20240408172638.GB1629595@coredump.intra.peff.net/

But somehow it took me over a year to get back to it. :-/

The first patch is a small cleanup in nearby code; the second one is the
interesting part.

  [1/2]: t7422: replace confusing printf with echo
  [2/2]: test-lib: teach test_seq the -f option

 t/t0021-conversion.sh                  |  4 ++--
 t/t0610-reftable-basics.sh             |  6 +-----
 t/t0612-reftable-jgit-compatibility.sh | 13 +++++--------
 t/t0613-reftable-write-options.sh      | 24 ++++--------------------
 t/t1400-update-ref.sh                  | 10 ++--------
 t/t5004-archive-corner-cases.sh        |  5 +----
 t/t6422-merge-rename-corner-cases.sh   | 10 ++--------
 t/t7422-submodule-output.sh            |  9 +++------
 t/test-lib-functions.sh                |  9 ++++++++-
 9 files changed, 28 insertions(+), 62 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] t7422: replace confusing printf with echo
  2025-06-23 10:55 [PATCH 0/2] test_seq format option Jeff King
@ 2025-06-23 10:55 ` Jeff King
  2025-06-23 17:59   ` Eric Sunshine
  2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King
  1 sibling, 1 reply; 14+ messages in thread
From: Jeff King @ 2025-06-23 10:55 UTC (permalink / raw)
  To: git

While looping over a counter "i", we do:

  printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i"

So we are passing "$i" as an argument to be filled in, but there is no
"%" placeholder in the format string, which is a bit confusing to read.

We could switch both instances of "$i" to "%d" (and pass $i twice). But
that makes the line even longer. Let's just keep interpolating the value
in the string, and drop the confusing extra "$i" argument.

And since we are not using any printf specifiers at all, it becomes
clear that we can swap it out for echo. We do use a "\n" in the middle
of the string, but breaking this into two separate echo statements
actually makes it easier to read.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t7422-submodule-output.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index 023a5cbdc4..ea67057f1a 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -180,7 +180,8 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE'
 		COMMIT=$(git rev-parse HEAD) &&
 		for i in $(test_seq 2000)
 		do
-			printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i" ||
+			echo "[submodule \"sm-$i\"]" &&
+			echo "path = recursive-submodule-path-$i" ||
 			return 1
 		done >gitmodules &&
 		BLOB=$(git hash-object -w --stdin <gitmodules) &&
-- 
2.50.0.385.g2a828bf5b7

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] t7422: replace confusing printf with echo
  2025-06-23 10:55 ` [PATCH 1/2] t7422: replace confusing printf with echo Jeff King
@ 2025-06-23 17:59   ` Eric Sunshine
  2025-06-24 10:05     ` Jeff King
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Sunshine @ 2025-06-23 17:59 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Mon, Jun 23, 2025 at 6:57 AM Jeff King <peff@peff.net> wrote:
> While looping over a counter "i", we do:
>
>   printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i"
>
> So we are passing "$i" as an argument to be filled in, but there is no
> "%" placeholder in the format string, which is a bit confusing to read.
>
> We could switch both instances of "$i" to "%d" (and pass $i twice). But
> that makes the line even longer. Let's just keep interpolating the value
> in the string, and drop the confusing extra "$i" argument.
>
> And since we are not using any printf specifiers at all, it becomes
> clear that we can swap it out for echo. We do use a "\n" in the middle
> of the string, but breaking this into two separate echo statements
> actually makes it easier to read.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
> @@ -180,7 +180,8 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE'
> -                       printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i" ||
> +                       echo "[submodule \"sm-$i\"]" &&
> +                       echo "path = recursive-submodule-path-$i" ||

This looks obviously correct and, as the commit message says, is
almost certainly easier to read, but I was more than a little
surprised to see the patch since I thought this code had been fixed
previously[*] and had some discussion around it.

[*] https://lore.kernel.org/git/20250403144852.19153-1-sn03.general@gmail.com/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] t7422: replace confusing printf with echo
  2025-06-23 17:59   ` Eric Sunshine
@ 2025-06-24 10:05     ` Jeff King
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2025-06-24 10:05 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git

On Mon, Jun 23, 2025 at 01:59:48PM -0400, Eric Sunshine wrote:

> > diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
> > @@ -180,7 +180,8 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE'
> > -                       printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i" ||
> > +                       echo "[submodule \"sm-$i\"]" &&
> > +                       echo "path = recursive-submodule-path-$i" ||
> 
> This looks obviously correct and, as the commit message says, is
> almost certainly easier to read, but I was more than a little
> surprised to see the patch since I thought this code had been fixed
> previously[*] and had some discussion around it.
> 
> [*] https://lore.kernel.org/git/20250403144852.19153-1-sn03.general@gmail.com/

Ah, interesting, I hadn't seen that one. I am happy with either
solution, but IMHO what I posted with "echo" is a bit more readable.

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/2] test-lib: teach test_seq the -f option
  2025-06-23 10:55 [PATCH 0/2] test_seq format option Jeff King
  2025-06-23 10:55 ` [PATCH 1/2] t7422: replace confusing printf with echo Jeff King
@ 2025-06-23 10:56 ` Jeff King
  2025-06-23 16:10   ` Junio C Hamano
                     ` (3 more replies)
  1 sibling, 4 replies; 14+ messages in thread
From: Jeff King @ 2025-06-23 10:56 UTC (permalink / raw)
  To: git

The "seq" tool has a "-f" option to produce printf-style formatted
lines. Let's teach our test_seq helper the same trick. This lets us get
rid of some shell loops in test snippets (which are particularly verbose
in our test suite because we have to "|| return 1" to keep the &&-chain
going).

This converts a few call-sites I found by grepping around the test
suite. A few notes on these:

  - In "seq", the format specifier is a "%g" float. Since test_seq only
    supports integers, I've kept the more natural "%d" (which is what
    these call sites were using already).

  - Like "seq", test_seq automatically adds a newline to the specified
    format. This is what all callers are doing already except for t0021,
    but there we do not care about the exact format. We are just trying
    to printf a large number of bytes to a file. It's not worth
    complicating other callers or adding an option to avoid the newline
    in that caller.

  - Most conversions are just replacing a shell loop (which does get rid
    of an extra fork, since $() requires a subshell). In t0612 we can
    replace an awk invocation, which I think makes the end result more
    readable, as there's less quoting.

  - In t7422 we can replace one loop, but sadly we have to leave the
    loop directly above it. This is because that earlier loop wants to
    include the seq value twice in the output, which test_seq does not
    support (nor does regular seq). If you run:

      test_seq -f "foo-%d %d" 10

    the second "%d" will always be the empty string. You might naively
    think that test_seq could add some extra arguments, like:

      # 3 ought to be enough for anyone...
      printf "$fmt\n" "$i "$i" $i"

    but that just triggers printf to format multiple lines, one per
    extra set of arguments.

    So we'd have to actually parse the format string, figure out how
    many "%" placeholders are there, and then feed it that many
    instances of the sequence number. The complexity isn't worth it.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t0021-conversion.sh                  |  4 ++--
 t/t0610-reftable-basics.sh             |  6 +-----
 t/t0612-reftable-jgit-compatibility.sh | 13 +++++--------
 t/t0613-reftable-write-options.sh      | 24 ++++--------------------
 t/t1400-update-ref.sh                  | 10 ++--------
 t/t5004-archive-corner-cases.sh        |  5 +----
 t/t6422-merge-rename-corner-cases.sh   | 10 ++--------
 t/t7422-submodule-output.sh            |  6 +-----
 t/test-lib-functions.sh                |  9 ++++++++-
 9 files changed, 26 insertions(+), 61 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index bf10d253ec..f0d50d769e 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -281,7 +281,7 @@ test_expect_success 'required filter with absent smudge field' '
 test_expect_success 'filtering large input to small output should use little memory' '
 	test_config filter.devnull.clean "cat >/dev/null" &&
 	test_config filter.devnull.required true &&
-	for i in $(test_seq 1 30); do printf "%1048576d" 1 || return 1; done >30MB &&
+	test_seq -f "%1048576d" 1 30 >30MB &&
 	echo "30MB filter=devnull" >.gitattributes &&
 	GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB
 '
@@ -299,7 +299,7 @@ test_expect_success 'filter that does not read is fine' '
 test_expect_success EXPENSIVE 'filter large file' '
 	test_config filter.largefile.smudge cat &&
 	test_config filter.largefile.clean cat &&
-	for i in $(test_seq 1 2048); do printf "%1048576d" 1 || return 1; done >2GB &&
+	test_seq -f "%1048576d" 1 2048 >2GB &&
 	echo "2GB filter=largefile" >.gitattributes &&
 	git add 2GB 2>err &&
 	test_must_be_empty err &&
diff --git a/t/t0610-reftable-basics.sh b/t/t0610-reftable-basics.sh
index 1be534a895..3ea5d51532 100755
--- a/t/t0610-reftable-basics.sh
+++ b/t/t0610-reftable-basics.sh
@@ -477,11 +477,7 @@ test_expect_success !CYGWIN 'ref transaction: many concurrent writers' '
 		test_commit --no-tag initial &&
 
 		head=$(git rev-parse HEAD) &&
-		for i in $(test_seq 100)
-		do
-			printf "%s commit\trefs/heads/branch-%s\n" "$head" "$i" ||
-			return 1
-		done >expect &&
+		test_seq -f "$head commit\trefs/heads/branch-%d" 100 >expect &&
 		printf "%s commit\trefs/heads/main\n" "$head" >>expect &&
 
 		for i in $(test_seq 100)
diff --git a/t/t0612-reftable-jgit-compatibility.sh b/t/t0612-reftable-jgit-compatibility.sh
index d0d7e80b49..7df2ad5817 100755
--- a/t/t0612-reftable-jgit-compatibility.sh
+++ b/t/t0612-reftable-jgit-compatibility.sh
@@ -112,14 +112,11 @@ test_expect_success 'JGit can read multi-level index' '
 		cd repo &&
 
 		test_commit A &&
-		awk "
-		    BEGIN {
-			print \"start\";
-			for (i = 0; i < 10000; i++)
-			    printf \"create refs/heads/branch-%d HEAD\n\", i;
-			print \"commit\";
-		    }
-		" >input &&
+		{
+			echo start &&
+			test_seq -f "create refs/heads/branch-%d HEAD" 10000 &&
+			echo commit
+		} >input &&
 		git update-ref --stdin <input &&
 
 		test_same_refs &&
diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh
index 6447920c9b..d77e601111 100755
--- a/t/t0613-reftable-write-options.sh
+++ b/t/t0613-reftable-write-options.sh
@@ -66,11 +66,7 @@ test_expect_success 'many refs results in multiple blocks' '
 	(
 		cd repo &&
 		test_commit initial &&
-		for i in $(test_seq 200)
-		do
-			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
-			return 1
-		done >input &&
+		test_seq -f "update refs/heads/branch-%d HEAD" 200 >input &&
 		git update-ref --stdin <input &&
 		git pack-refs &&
 
@@ -180,11 +176,7 @@ test_expect_success 'restart interval at every single record' '
 	(
 		cd repo &&
 		test_commit initial &&
-		for i in $(test_seq 10)
-		do
-			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
-			return 1
-		done >input &&
+		test_seq -f "update refs/heads/branch-%d HEAD" 10 >input &&
 		git update-ref --stdin <input &&
 		git -c reftable.restartInterval=1 pack-refs &&
 
@@ -224,11 +216,7 @@ test_expect_success 'object index gets written by default with ref index' '
 	(
 		cd repo &&
 		test_commit initial &&
-		for i in $(test_seq 5)
-		do
-			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
-			return 1
-		done >input &&
+		test_seq -f "update refs/heads/branch-%d HEAD" 5 >input &&
 		git update-ref --stdin <input &&
 		git -c reftable.blockSize=100 pack-refs &&
 
@@ -263,11 +251,7 @@ test_expect_success 'object index can be disabled' '
 	(
 		cd repo &&
 		test_commit initial &&
-		for i in $(test_seq 5)
-		do
-			printf "update refs/heads/branch-%d HEAD\n" "$i" ||
-			return 1
-		done >input &&
+		test_seq -f "update refs/heads/branch-%d HEAD" 5 >input &&
 		git update-ref --stdin <input &&
 		git -c reftable.blockSize=100 -c reftable.indexObjects=false pack-refs &&
 
diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
index d29d23cb89..e373d9108b 100755
--- a/t/t1400-update-ref.sh
+++ b/t/t1400-update-ref.sh
@@ -1380,21 +1380,15 @@ test_expect_success 'fails with duplicate ref update via symref' '
 
 test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction creating branches does not burst open file limit' '
 (
-	for i in $(test_seq 33)
-	do
-		echo "create refs/heads/$i HEAD" || exit 1
-	done >large_input &&
+	test_seq -f "create refs/heads/%d HEAD" 33 >large_input &&
 	run_with_limited_open_files git update-ref --stdin <large_input &&
 	git rev-parse --verify -q refs/heads/33
 )
 '
 
 test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction deleting branches does not burst open file limit' '
 (
-	for i in $(test_seq 33)
-	do
-		echo "delete refs/heads/$i HEAD" || exit 1
-	done >large_input &&
+	test_seq -f "delete refs/heads/%d HEAD" 33 >large_input &&
 	run_with_limited_open_files git update-ref --stdin <large_input &&
 	test_must_fail git rev-parse --verify -q refs/heads/33
 )
diff --git a/t/t5004-archive-corner-cases.sh b/t/t5004-archive-corner-cases.sh
index 5174995191..027dedd976 100755
--- a/t/t5004-archive-corner-cases.sh
+++ b/t/t5004-archive-corner-cases.sh
@@ -176,10 +176,7 @@ test_expect_success EXPENSIVE,UNZIP,UNZIP_ZIP64_SUPPORT \
 	blob=$(echo $s | git hash-object -w --stdin) &&
 
 	# create tree containing 65500 entries of that blob
-	for i in $(test_seq 1 65500)
-	do
-		echo "100644 blob $blob	$i" || return 1
-	done >tree &&
+	test_seq -f "100644 blob $blob\t%d" 1 65500 >tree &&
 	tree=$(git mktree <tree) &&
 
 	# zip it, creating an archive a bit bigger than 4GB
diff --git a/t/t6422-merge-rename-corner-cases.sh b/t/t6422-merge-rename-corner-cases.sh
index 9cbe7ca782..f14c0fb30e 100755
--- a/t/t6422-merge-rename-corner-cases.sh
+++ b/t/t6422-merge-rename-corner-cases.sh
@@ -1146,10 +1146,7 @@ test_conflicts_with_adds_and_renames() {
 			cd simple_${sideL}_${sideR} &&
 
 			# Create some related files now
-			for i in $(test_seq 1 10)
-			do
-				echo Random base content line $i
-			done >file_v1 &&
+			test_seq -f "Random base content line %d" 1 10 >file_v1 &&
 			cp file_v1 file_v2 &&
 			echo modification >>file_v2 &&
 
@@ -1293,10 +1290,7 @@ test_setup_nested_conflicts_from_rename_rename () {
 		cd nested_conflicts_from_rename_rename &&
 
 		# Create some related files now
-		for i in $(test_seq 1 10)
-		do
-			echo Random base content line $i
-		done >file_v1 &&
+		test_seq -f "Random base content line %d" 1 10 >file_v1 &&
 
 		cp file_v1 file_v2 &&
 		cp file_v1 file_v3 &&
diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index ea67057f1a..aea1ddf117 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -187,11 +187,7 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE'
 		BLOB=$(git hash-object -w --stdin <gitmodules) &&
 
 		printf "100644 blob $BLOB\t.gitmodules\n" >tree &&
-		for i in $(test_seq 2000)
-		do
-			printf "160000 commit $COMMIT\trecursive-submodule-path-%d\n" "$i" ||
-			return 1
-		done >>tree &&
+		test_seq -f "160000 commit $COMMIT\trecursive-submodule-path-%d" 2000 >>tree &&
 		TREE=$(git mktree <tree) &&
 
 		COMMIT=$(git commit-tree "$TREE") &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index bee4a2ca34..8c176f4efc 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1454,6 +1454,13 @@ test_cmp_fspath () {
 # from 1.
 
 test_seq () {
+	local fmt="%d"
+	case "$1" in
+	-f)
+		fmt="$2"
+		shift 2
+		;;
+	esac
 	case $# in
 	1)	set 1 "$@" ;;
 	2)	;;
@@ -1462,7 +1469,7 @@ test_seq () {
 	test_seq_counter__=$1
 	while test "$test_seq_counter__" -le "$2"
 	do
-		echo "$test_seq_counter__"
+		printf "$fmt\n" "$test_seq_counter__"
 		test_seq_counter__=$(( $test_seq_counter__ + 1 ))
 	done
 }
-- 
2.50.0.385.g2a828bf5b7

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] test-lib: teach test_seq the -f option
  2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King
@ 2025-06-23 16:10   ` Junio C Hamano
  2025-06-23 16:25   ` Justin Tobler
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2025-06-23 16:10 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> The "seq" tool has a "-f" option to produce printf-style formatted
> lines. Let's teach our test_seq helper the same trick. This lets us get
> rid of some shell loops in test snippets (which are particularly verbose
> in our test suite because we have to "|| return 1" to keep the &&-chain
> going).
>
> This converts a few call-sites I found by grepping around the test
> suite. A few notes on these:
>
>   - In "seq", the format specifier is a "%g" float. Since test_seq only
>     supports integers, I've kept the more natural "%d" (which is what
>     these call sites were using already).

Yeah, that was the first thing I started wondering about after
learning that "seq --format" is a thing.

>   - Like "seq", test_seq automatically adds a newline to the specified
>     format. This is what all callers are doing already except for t0021,
>     but there we do not care about the exact format. We are just trying
>     to printf a large number of bytes to a file. It's not worth
>     complicating other callers or adding an option to avoid the newline
>     in that caller.

OK.  Other than the newline, the update to t0021 changes the actual
payload used in the test, but it does not affect the characteristic
of the data (like how compressible the result is) all that much, I
think.

>   - Most conversions are just replacing a shell loop (which does get rid
>     of an extra fork, since $() requires a subshell). In t0612 we can
>     replace an awk invocation, which I think makes the end result more
>     readable, as there's less quoting.

;-)

>   - In t7422 we can replace one loop, but sadly we have to leave the
>     loop directly above it. This is because that earlier loop wants to
>     include the seq value twice in the output, which test_seq does not
>     support (nor does regular seq). If you run:
>
>       test_seq -f "foo-%d %d" 10
>
>     the second "%d" will always be the empty string. You might naively
>     think that test_seq could add some extra arguments, like:
>
>       # 3 ought to be enough for anyone...
>       printf "$fmt\n" "$i "$i" $i"
>
>     but that just triggers printf to format multiple lines, one per
>     extra set of arguments.
>
>     So we'd have to actually parse the format string, figure out how
>     many "%" placeholders are there, and then feed it that many
>     instances of the sequence number. The complexity isn't worth it.

And it would deviate from the normal "seq", which would contaminate
our developers' mind, which is not a direction worth going.

The changes look all good.

Thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] test-lib: teach test_seq the -f option
  2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King
  2025-06-23 16:10   ` Junio C Hamano
@ 2025-06-23 16:25   ` Justin Tobler
  2025-06-24 10:11     ` [PATCH 3/2] test-lib: document test_seq's "-f" option Jeff King
  2025-06-23 17:27   ` [PATCH 2/2] test-lib: teach test_seq the -f option Todd Zullinger
  2025-06-24  6:22   ` Eric Sunshine
  3 siblings, 1 reply; 14+ messages in thread
From: Justin Tobler @ 2025-06-23 16:25 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On 25/06/23 06:56AM, Jeff King wrote:
> The "seq" tool has a "-f" option to produce printf-style formatted
> lines. Let's teach our test_seq helper the same trick. This lets us get
> rid of some shell loops in test snippets (which are particularly verbose
> in our test suite because we have to "|| return 1" to keep the &&-chain
> going).
> 
> This converts a few call-sites I found by grepping around the test
> suite. A few notes on these:
> 
>   - In "seq", the format specifier is a "%g" float. Since test_seq only
>     supports integers, I've kept the more natural "%d" (which is what
>     these call sites were using already).

Sticking with "%d" definately feels more natural.

>   - Like "seq", test_seq automatically adds a newline to the specified
>     format. This is what all callers are doing already except for t0021,
>     but there we do not care about the exact format. We are just trying
>     to printf a large number of bytes to a file. It's not worth
>     complicating other callers or adding an option to avoid the newline
>     in that caller.
> 
>   - Most conversions are just replacing a shell loop (which does get rid
>     of an extra fork, since $() requires a subshell). In t0612 we can
>     replace an awk invocation, which I think makes the end result more
>     readable, as there's less quoting.
> 
>   - In t7422 we can replace one loop, but sadly we have to leave the
>     loop directly above it. This is because that earlier loop wants to
>     include the seq value twice in the output, which test_seq does not
>     support (nor does regular seq). If you run:
> 
>       test_seq -f "foo-%d %d" 10
> 
>     the second "%d" will always be the empty string. You might naively
>     think that test_seq could add some extra arguments, like:
> 
>       # 3 ought to be enough for anyone...
>       printf "$fmt\n" "$i "$i" $i"
> 
>     but that just triggers printf to format multiple lines, one per
>     extra set of arguments.
> 
>     So we'd have to actually parse the format string, figure out how
>     many "%" placeholders are there, and then feed it that many
>     instances of the sequence number. The complexity isn't worth it.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  t/t0021-conversion.sh                  |  4 ++--
>  t/t0610-reftable-basics.sh             |  6 +-----
>  t/t0612-reftable-jgit-compatibility.sh | 13 +++++--------
>  t/t0613-reftable-write-options.sh      | 24 ++++--------------------
>  t/t1400-update-ref.sh                  | 10 ++--------
>  t/t5004-archive-corner-cases.sh        |  5 +----
>  t/t6422-merge-rename-corner-cases.sh   | 10 ++--------
>  t/t7422-submodule-output.sh            |  6 +-----
>  t/test-lib-functions.sh                |  9 ++++++++-
>  9 files changed, 26 insertions(+), 61 deletions(-)
> 
> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> index bf10d253ec..f0d50d769e 100755
> --- a/t/t0021-conversion.sh
> +++ b/t/t0021-conversion.sh
> @@ -281,7 +281,7 @@ test_expect_success 'required filter with absent smudge field' '
>  test_expect_success 'filtering large input to small output should use little memory' '
>  	test_config filter.devnull.clean "cat >/dev/null" &&
>  	test_config filter.devnull.required true &&
> -	for i in $(test_seq 1 30); do printf "%1048576d" 1 || return 1; done >30MB &&
> +	test_seq -f "%1048576d" 1 30 >30MB &&

Very nice quality of life improvement indeed. :)

>  	echo "30MB filter=devnull" >.gitattributes &&
>  	GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB
>  '
[snip]
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index bee4a2ca34..8c176f4efc 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1454,6 +1454,13 @@ test_cmp_fspath () {
>  # from 1.
>  
>  test_seq () {
> +	local fmt="%d"
> +	case "$1" in
> +	-f)
> +		fmt="$2"

With the `-f` option, the default format string gets overwritten to what
is provided by the user. Makes sense.

If we want, we could update the comment above this function to mention
this new option.

> +		shift 2
> +		;;
> +	esac
>  	case $# in
>  	1)	set 1 "$@" ;;
>  	2)	;;
> @@ -1462,7 +1469,7 @@ test_seq () {
>  	test_seq_counter__=$1
>  	while test "$test_seq_counter__" -le "$2"
>  	do
> -		echo "$test_seq_counter__"
> +		printf "$fmt\n" "$test_seq_counter__"

Nice and simple! Each of the updated callsites also look good to me.

-Justin

>  		test_seq_counter__=$(( $test_seq_counter__ + 1 ))
>  	done
>  }
> -- 
> 2.50.0.385.g2a828bf5b7
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/2] test-lib: document test_seq's "-f" option
  2025-06-23 16:25   ` Justin Tobler
@ 2025-06-24 10:11     ` Jeff King
  2025-06-25  0:02       ` Justin Tobler
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2025-06-24 10:11 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git, Junio C Hamano

On Mon, Jun 23, 2025 at 11:25:20AM -0500, Justin Tobler wrote:

> > diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> > index bee4a2ca34..8c176f4efc 100644
> > --- a/t/test-lib-functions.sh
> > +++ b/t/test-lib-functions.sh
> > @@ -1454,6 +1454,13 @@ test_cmp_fspath () {
> >  # from 1.
> >  
> >  test_seq () {
> > +	local fmt="%d"
> > +	case "$1" in
> > +	-f)
> > +		fmt="$2"
> 
> With the `-f` option, the default format string gets overwritten to what
> is provided by the user. Makes sense.
> 
> If we want, we could update the comment above this function to mention
> this new option.

Good point. I didn't even notice that comment!

Perhaps we should squash this in? I don't think there's any need to keep
it as a separate commit.

-- >8 --
Subject: [PATCH] test-lib: document test_seq's "-f" option

The previous commit added the "-f" option, but didn't mention it in the
function's documentation.

Suggested-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 t/test-lib-functions.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 8c176f4efc..6230746cc4 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1451,7 +1451,12 @@ test_cmp_fspath () {
 #     test_seq 1 5 -- outputs 1 2 3 4 5 one line at a time
 #
 # or with one argument (end), in which case it starts counting
-# from 1.
+# from 1. In addition to the start/end arguments, you can pass an optional
+# printf format. For example:
+#
+#     test_seq -f "line %d" 1 5
+#
+# would print 5 lines, "line 1" through "line 5".
 
 test_seq () {
 	local fmt="%d"
-- 
2.50.0.399.g566d3d7b27


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/2] test-lib: document test_seq's "-f" option
  2025-06-24 10:11     ` [PATCH 3/2] test-lib: document test_seq's "-f" option Jeff King
@ 2025-06-25  0:02       ` Justin Tobler
  0 siblings, 0 replies; 14+ messages in thread
From: Justin Tobler @ 2025-06-25  0:02 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano

On 25/06/24 06:11AM, Jeff King wrote:
> On Mon, Jun 23, 2025 at 11:25:20AM -0500, Justin Tobler wrote:
> 
> > > diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> > > index bee4a2ca34..8c176f4efc 100644
> > > --- a/t/test-lib-functions.sh
> > > +++ b/t/test-lib-functions.sh
> > > @@ -1454,6 +1454,13 @@ test_cmp_fspath () {
> > >  # from 1.
> > >  
> > >  test_seq () {
> > > +	local fmt="%d"
> > > +	case "$1" in
> > > +	-f)
> > > +		fmt="$2"
> > 
> > With the `-f` option, the default format string gets overwritten to what
> > is provided by the user. Makes sense.
> > 
> > If we want, we could update the comment above this function to mention
> > this new option.
> 
> Good point. I didn't even notice that comment!
> 
> Perhaps we should squash this in? I don't think there's any need to keep
> it as a separate commit.

Ya, that seems reasonable to me.

> -- >8 --
> Subject: [PATCH] test-lib: document test_seq's "-f" option
> 
> The previous commit added the "-f" option, but didn't mention it in the
> function's documentation.
> 
> Suggested-by: Justin Tobler <jltobler@gmail.com>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  t/test-lib-functions.sh | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index 8c176f4efc..6230746cc4 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1451,7 +1451,12 @@ test_cmp_fspath () {
>  #     test_seq 1 5 -- outputs 1 2 3 4 5 one line at a time
>  #
>  # or with one argument (end), in which case it starts counting
> -# from 1.
> +# from 1. In addition to the start/end arguments, you can pass an optional
> +# printf format. For example:
> +#
> +#     test_seq -f "line %d" 1 5
> +#
> +# would print 5 lines, "line 1" through "line 5".

At first I thought it might be nice to mention that only format strings with
a single specifier are supported, but I think this can also be implied
since the comment mentions the format string follows the printf format.

This looks good to me :)

-Justin

>  
>  test_seq () {
>  	local fmt="%d"
> -- 
> 2.50.0.399.g566d3d7b27
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] test-lib: teach test_seq the -f option
  2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King
  2025-06-23 16:10   ` Junio C Hamano
  2025-06-23 16:25   ` Justin Tobler
@ 2025-06-23 17:27   ` Todd Zullinger
  2025-06-24 10:16     ` Jeff King
  2025-06-24  6:22   ` Eric Sunshine
  3 siblings, 1 reply; 14+ messages in thread
From: Todd Zullinger @ 2025-06-23 17:27 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Justin Tobler, git

Jeff King wrote:
> The "seq" tool has a "-f" option to produce printf-style formatted
> lines. Let's teach our test_seq helper the same trick. This lets us get
> rid of some shell loops in test snippets (which are particularly verbose
> in our test suite because we have to "|| return 1" to keep the &&-chain
> going).

This is a nice improvement.

> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index bee4a2ca34..8c176f4efc 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1454,6 +1454,13 @@ test_cmp_fspath () {
>  # from 1.
>  
>  test_seq () {
> +	local fmt="%d"
> +	case "$1" in
> +	-f)
> +		fmt="$2"
> +		shift 2
> +		;;
> +	esac
>  	case $# in
>  	1)	set 1 "$@" ;;
>  	2)	;;
> @@ -1462,7 +1469,7 @@ test_seq () {
>  	test_seq_counter__=$1
>  	while test "$test_seq_counter__" -le "$2"
>  	do
> -		echo "$test_seq_counter__"
> +		printf "$fmt\n" "$test_seq_counter__"
>  		test_seq_counter__=$(( $test_seq_counter__ + 1 ))
>  	done
>  }

Is it a sharp edge worth caring about that someone might
write `test_seq -f 1 5` where we'd pass 1 as the format
string?

If so, perhaps a check like this might be sufficient to
catch it early?

	diff --git i/t/test-lib-functions.sh w/t/test-lib-functions.sh
	index 8c176f4efc..87b59d5895 100644
	--- i/t/test-lib-functions.sh
	+++ w/t/test-lib-functions.sh
	@@ -1458,6 +1458,10 @@ test_seq () {
		case "$1" in
		-f)
			fmt="$2"
	+		case "$fmt" in
	+			*%*)	: ;;
	+			*)	BUG "no % in -f argument" ;;
	+		esac
			shift 2
			;;
		esac

I don't know whether it's worth the extra code or not.  I
just wondered about how it would fail in the face of a minor
typo.  It certainly should cause any test to fail if it were
to output 1 instead of the intended format string, so it's
arguably fine as-is.

Adding -f to the usage note above, as Justin suggested might
help folks avoid making the mistake of cuddling the format
string against -f, e.g.: -f%d.  That is caught by the
parameter count check (though perhaps not everyone would
notice why, thinking they did pass an argument to -f).

-- 
Todd

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] test-lib: teach test_seq the -f option
  2025-06-23 17:27   ` [PATCH 2/2] test-lib: teach test_seq the -f option Todd Zullinger
@ 2025-06-24 10:16     ` Jeff King
  2025-06-24 13:42       ` Todd Zullinger
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2025-06-24 10:16 UTC (permalink / raw)
  To: Todd Zullinger; +Cc: Junio C Hamano, Justin Tobler, git

On Mon, Jun 23, 2025 at 01:27:07PM -0400, Todd Zullinger wrote:

> Is it a sharp edge worth caring about that someone might
> write `test_seq -f 1 5` where we'd pass 1 as the format
> string?
> 
> If so, perhaps a check like this might be sufficient to
> catch it early?
> 
> 	diff --git i/t/test-lib-functions.sh w/t/test-lib-functions.sh
> 	index 8c176f4efc..87b59d5895 100644
> 	--- i/t/test-lib-functions.sh
> 	+++ w/t/test-lib-functions.sh
> 	@@ -1458,6 +1458,10 @@ test_seq () {
> 		case "$1" in
> 		-f)
> 			fmt="$2"
> 	+		case "$fmt" in
> 	+			*%*)	: ;;
> 	+			*)	BUG "no % in -f argument" ;;
> 	+		esac
> 			shift 2
> 			;;
> 		esac
> 
> I don't know whether it's worth the extra code or not.  I
> just wondered about how it would fail in the face of a minor
> typo.  It certainly should cause any test to fail if it were
> to output 1 instead of the intended format string, so it's
> arguably fine as-is.

Hmm, maybe. I notice that "seq" itself does this (though it did surprise
me). I think there it is actually doing the "%" interpolation itself (to
avoid memory errors by feeding arbitrary strings to printf functions),
so it's easy to do.

In our case, we can rely on the shell printf to do something sensible if
fed garbage. And because we're not parsing ourselves, a pattern like you
have above isn't totally accurate (e.g., consider what it would with
"%%d"). But it probably would be enough to catch typos.

It would also disallow:

  test_seq -f "same line" 50

to produce repeated lines, though I don't know how valuable that would
be. So I dunno.

> Adding -f to the usage note above, as Justin suggested might
> help folks avoid making the mistake of cuddling the format
> string against -f, e.g.: -f%d.  That is caught by the
> parameter count check (though perhaps not everyone would
> notice why, thinking they did pass an argument to -f).

If people are going to use "-f%d", I think we'd be better off making it
work than trying to complain about it. But I was hoping we could just
keep things simple and stupid, given the limited audience.

So my inclination is to leave the sharp edges and see if anybody gets
cut, but it's possible that I'm just being lazy.

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] test-lib: teach test_seq the -f option
  2025-06-24 10:16     ` Jeff King
@ 2025-06-24 13:42       ` Todd Zullinger
  0 siblings, 0 replies; 14+ messages in thread
From: Todd Zullinger @ 2025-06-24 13:42 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Justin Tobler, git

Jeff King wrote:
> On Mon, Jun 23, 2025 at 01:27:07PM -0400, Todd Zullinger wrote:
>> I don't know whether it's worth the extra code or not.  I
>> just wondered about how it would fail in the face of a minor
>> typo.  It certainly should cause any test to fail if it were
>> to output 1 instead of the intended format string, so it's
>> arguably fine as-is.
> 
> Hmm, maybe. I notice that "seq" itself does this (though it did surprise
> me). I think there it is actually doing the "%" interpolation itself (to
> avoid memory errors by feeding arbitrary strings to printf functions),
> so it's easy to do.
> 
> In our case, we can rely on the shell printf to do something sensible if
> fed garbage. And because we're not parsing ourselves, a pattern like you
> have above isn't totally accurate (e.g., consider what it would with
> "%%d"). But it probably would be enough to catch typos.

Parsing it fully felt like overkill, and a good bit more
code to do it well.  So that was me being lazy.  :)

> It would also disallow:
> 
>   test_seq -f "same line" 50
> 
> to produce repeated lines, though I don't know how valuable that would
> be. So I dunno.

I can see someone using it that way, like a cheap version of
the yes command.

> If people are going to use "-f%d", I think we'd be better off making it
> work than trying to complain about it. But I was hoping we could just
> keep things simple and stupid, given the limited audience.
> 
> So my inclination is to leave the sharp edges and see if anybody gets
> cut, but it's possible that I'm just being lazy.

Sounds good to me.  As you say, it's not going to be exposed
to "normal" users.

The worst thing it can do is allow a test to pass which
shouldn't -- and that's by far the least likely way for the
(not very) sharp edges to cut us.

-- 
Todd

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] test-lib: teach test_seq the -f option
  2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King
                     ` (2 preceding siblings ...)
  2025-06-23 17:27   ` [PATCH 2/2] test-lib: teach test_seq the -f option Todd Zullinger
@ 2025-06-24  6:22   ` Eric Sunshine
  2025-06-24 10:36     ` Jeff King
  3 siblings, 1 reply; 14+ messages in thread
From: Eric Sunshine @ 2025-06-24  6:22 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Mon, Jun 23, 2025 at 6:56 AM Jeff King <peff@peff.net> wrote:
> The "seq" tool has a "-f" option to produce printf-style formatted
> lines. Let's teach our test_seq helper the same trick. This lets us get
> rid of some shell loops in test snippets (which are particularly verbose
> in our test suite because we have to "|| return 1" to keep the &&-chain
> going).
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/t/t0612-reftable-jgit-compatibility.sh b/t/t0612-reftable-jgit-compatibility.sh
> @@ -112,14 +112,11 @@ test_expect_success 'JGit can read multi-level index' '
> -               awk "
> -                   BEGIN {
> -                       print \"start\";
> -                       for (i = 0; i < 10000; i++)
> -                           printf \"create refs/heads/branch-%d HEAD\n\", i;
> -                       print \"commit\";
> -                   }
> -               " >input &&
> +               {
> +                       echo start &&
> +                       test_seq -f "create refs/heads/branch-%d HEAD" 10000 &&
> +                       echo commit
> +               } >input &&

I had suggested[1] an effectively equivalent change to Patrick for a
couple tests in the nearby t0610, but he rejected[2] the idea due to
the pure-shell version being significantly slower than the `awk`
version.

Pondering his response today, I wondered if it would make sense to
replace our pure-shell `test_seq` with an implementation via `awk`,
however, if most of our sequence vend only a small set of numbers,
then the startup cost of `awk` would probably swamp any savings,
especially on Windows where process startup is extremely slow. Taking
that into account, I further wondered if we could see an overall win
by taking a hybrid approach in which we employ the pure-shell version
if vending a small set of numbers, but fall over to an `awk` version
if vending a lot of numbers, especially as in the test above or the
tests in t0610. Anyhow, food for thought, or not, if you're not hungry
for thought food.

[1]: https://lore.kernel.org/git/CAPig+cSC3zdur1fCsa7RMNZDcgUK4pUGKb22tpgdANxR6OxNMA@mail.gmail.com/
[2]: https://lore.kernel.org/git/Z-FUNgmY9hTsnzds@pks.im/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] test-lib: teach test_seq the -f option
  2025-06-24  6:22   ` Eric Sunshine
@ 2025-06-24 10:36     ` Jeff King
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2025-06-24 10:36 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git

On Tue, Jun 24, 2025 at 02:22:21AM -0400, Eric Sunshine wrote:

> > diff --git a/t/t0612-reftable-jgit-compatibility.sh b/t/t0612-reftable-jgit-compatibility.sh
> > @@ -112,14 +112,11 @@ test_expect_success 'JGit can read multi-level index' '
> > -               awk "
> > -                   BEGIN {
> > -                       print \"start\";
> > -                       for (i = 0; i < 10000; i++)
> > -                           printf \"create refs/heads/branch-%d HEAD\n\", i;
> > -                       print \"commit\";
> > -                   }
> > -               " >input &&
> > +               {
> > +                       echo start &&
> > +                       test_seq -f "create refs/heads/branch-%d HEAD" 10000 &&
> > +                       echo commit
> > +               } >input &&
> 
> I had suggested[1] an effectively equivalent change to Patrick for a
> couple tests in the nearby t0610, but he rejected[2] the idea due to
> the pure-shell version being significantly slower than the `awk`
> version.
> 
> Pondering his response today, I wondered if it would make sense to
> replace our pure-shell `test_seq` with an implementation via `awk`,
> however, if most of our sequence vend only a small set of numbers,
> then the startup cost of `awk` would probably swamp any savings,
> especially on Windows where process startup is extremely slow. Taking
> that into account, I further wondered if we could see an overall win
> by taking a hybrid approach in which we employ the pure-shell version
> if vending a small set of numbers, but fall over to an `awk` version
> if vending a lot of numbers, especially as in the test above or the
> tests in t0610. Anyhow, food for thought, or not, if you're not hungry
> for thought food.

Ah, interesting. I didn't time it at all, as my general intuition for
shell performance is that counting process spawns overrides everything
else (though admittedly it is usually O(n) processes vs O(1), and here
we are going from one extra process to zero).

I did a few timings, and it looks like the shell wins at 10,000 on my
system, but awk wins at 50,000 (though there is a lot of run-to-run
noise; I think awk might even win at 10,000 on a loaded system, as this
is such a light load that CPU frequency throttling comes into play).

I assumed that the culprit was a lack of buffering, but I don't think
so. awk seems to issue 10,000 write() calls. I guess it is just internal
shell overhead in issuing commands. Where is a JIT byte-code shell
interpreter when we need one? ;)

My inclination is not to worry about it too much. At 10,000 I think we
are talking about a few milliseconds. There's so much more low-hanging
fruit if somebody wants to optimize the test suite. IMHO readability is
more important here (and if we really want to optimize, doing it inside
test_seq would be better).

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-06-25  0:07 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-23 10:55 [PATCH 0/2] test_seq format option Jeff King
2025-06-23 10:55 ` [PATCH 1/2] t7422: replace confusing printf with echo Jeff King
2025-06-23 17:59   ` Eric Sunshine
2025-06-24 10:05     ` Jeff King
2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King
2025-06-23 16:10   ` Junio C Hamano
2025-06-23 16:25   ` Justin Tobler
2025-06-24 10:11     ` [PATCH 3/2] test-lib: document test_seq's "-f" option Jeff King
2025-06-25  0:02       ` Justin Tobler
2025-06-23 17:27   ` [PATCH 2/2] test-lib: teach test_seq the -f option Todd Zullinger
2025-06-24 10:16     ` Jeff King
2025-06-24 13:42       ` Todd Zullinger
2025-06-24  6:22   ` Eric Sunshine
2025-06-24 10:36     ` Jeff King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox