* [PATCH 0/2] test_seq format option @ 2025-06-23 10:55 Jeff King 2025-06-23 10:55 ` [PATCH 1/2] t7422: replace confusing printf with echo Jeff King 2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King 0 siblings, 2 replies; 14+ messages in thread From: Jeff King @ 2025-06-23 10:55 UTC (permalink / raw) To: git This is a small quality of life improvement for our test suite, stemming from this discussion: https://lore.kernel.org/git/20240408172638.GB1629595@coredump.intra.peff.net/ But somehow it took me over a year to get back to it. :-/ The first patch is a small cleanup in nearby code; the second one is the interesting part. [1/2]: t7422: replace confusing printf with echo [2/2]: test-lib: teach test_seq the -f option t/t0021-conversion.sh | 4 ++-- t/t0610-reftable-basics.sh | 6 +----- t/t0612-reftable-jgit-compatibility.sh | 13 +++++-------- t/t0613-reftable-write-options.sh | 24 ++++-------------------- t/t1400-update-ref.sh | 10 ++-------- t/t5004-archive-corner-cases.sh | 5 +---- t/t6422-merge-rename-corner-cases.sh | 10 ++-------- t/t7422-submodule-output.sh | 9 +++------ t/test-lib-functions.sh | 9 ++++++++- 9 files changed, 28 insertions(+), 62 deletions(-) -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/2] t7422: replace confusing printf with echo 2025-06-23 10:55 [PATCH 0/2] test_seq format option Jeff King @ 2025-06-23 10:55 ` Jeff King 2025-06-23 17:59 ` Eric Sunshine 2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King 1 sibling, 1 reply; 14+ messages in thread From: Jeff King @ 2025-06-23 10:55 UTC (permalink / raw) To: git While looping over a counter "i", we do: printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i" So we are passing "$i" as an argument to be filled in, but there is no "%" placeholder in the format string, which is a bit confusing to read. We could switch both instances of "$i" to "%d" (and pass $i twice). But that makes the line even longer. Let's just keep interpolating the value in the string, and drop the confusing extra "$i" argument. And since we are not using any printf specifiers at all, it becomes clear that we can swap it out for echo. We do use a "\n" in the middle of the string, but breaking this into two separate echo statements actually makes it easier to read. Signed-off-by: Jeff King <peff@peff.net> --- t/t7422-submodule-output.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh index 023a5cbdc4..ea67057f1a 100755 --- a/t/t7422-submodule-output.sh +++ b/t/t7422-submodule-output.sh @@ -180,7 +180,8 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' COMMIT=$(git rev-parse HEAD) && for i in $(test_seq 2000) do - printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i" || + echo "[submodule \"sm-$i\"]" && + echo "path = recursive-submodule-path-$i" || return 1 done >gitmodules && BLOB=$(git hash-object -w --stdin <gitmodules) && -- 2.50.0.385.g2a828bf5b7 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] t7422: replace confusing printf with echo 2025-06-23 10:55 ` [PATCH 1/2] t7422: replace confusing printf with echo Jeff King @ 2025-06-23 17:59 ` Eric Sunshine 2025-06-24 10:05 ` Jeff King 0 siblings, 1 reply; 14+ messages in thread From: Eric Sunshine @ 2025-06-23 17:59 UTC (permalink / raw) To: Jeff King; +Cc: git On Mon, Jun 23, 2025 at 6:57 AM Jeff King <peff@peff.net> wrote: > While looping over a counter "i", we do: > > printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i" > > So we are passing "$i" as an argument to be filled in, but there is no > "%" placeholder in the format string, which is a bit confusing to read. > > We could switch both instances of "$i" to "%d" (and pass $i twice). But > that makes the line even longer. Let's just keep interpolating the value > in the string, and drop the confusing extra "$i" argument. > > And since we are not using any printf specifiers at all, it becomes > clear that we can swap it out for echo. We do use a "\n" in the middle > of the string, but breaking this into two separate echo statements > actually makes it easier to read. > > Signed-off-by: Jeff King <peff@peff.net> > --- > diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh > @@ -180,7 +180,8 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' > - printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i" || > + echo "[submodule \"sm-$i\"]" && > + echo "path = recursive-submodule-path-$i" || This looks obviously correct and, as the commit message says, is almost certainly easier to read, but I was more than a little surprised to see the patch since I thought this code had been fixed previously[*] and had some discussion around it. [*] https://lore.kernel.org/git/20250403144852.19153-1-sn03.general@gmail.com/ ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] t7422: replace confusing printf with echo 2025-06-23 17:59 ` Eric Sunshine @ 2025-06-24 10:05 ` Jeff King 0 siblings, 0 replies; 14+ messages in thread From: Jeff King @ 2025-06-24 10:05 UTC (permalink / raw) To: Eric Sunshine; +Cc: git On Mon, Jun 23, 2025 at 01:59:48PM -0400, Eric Sunshine wrote: > > diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh > > @@ -180,7 +180,8 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' > > - printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i" || > > + echo "[submodule \"sm-$i\"]" && > > + echo "path = recursive-submodule-path-$i" || > > This looks obviously correct and, as the commit message says, is > almost certainly easier to read, but I was more than a little > surprised to see the patch since I thought this code had been fixed > previously[*] and had some discussion around it. > > [*] https://lore.kernel.org/git/20250403144852.19153-1-sn03.general@gmail.com/ Ah, interesting, I hadn't seen that one. I am happy with either solution, but IMHO what I posted with "echo" is a bit more readable. -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/2] test-lib: teach test_seq the -f option 2025-06-23 10:55 [PATCH 0/2] test_seq format option Jeff King 2025-06-23 10:55 ` [PATCH 1/2] t7422: replace confusing printf with echo Jeff King @ 2025-06-23 10:56 ` Jeff King 2025-06-23 16:10 ` Junio C Hamano ` (3 more replies) 1 sibling, 4 replies; 14+ messages in thread From: Jeff King @ 2025-06-23 10:56 UTC (permalink / raw) To: git The "seq" tool has a "-f" option to produce printf-style formatted lines. Let's teach our test_seq helper the same trick. This lets us get rid of some shell loops in test snippets (which are particularly verbose in our test suite because we have to "|| return 1" to keep the &&-chain going). This converts a few call-sites I found by grepping around the test suite. A few notes on these: - In "seq", the format specifier is a "%g" float. Since test_seq only supports integers, I've kept the more natural "%d" (which is what these call sites were using already). - Like "seq", test_seq automatically adds a newline to the specified format. This is what all callers are doing already except for t0021, but there we do not care about the exact format. We are just trying to printf a large number of bytes to a file. It's not worth complicating other callers or adding an option to avoid the newline in that caller. - Most conversions are just replacing a shell loop (which does get rid of an extra fork, since $() requires a subshell). In t0612 we can replace an awk invocation, which I think makes the end result more readable, as there's less quoting. - In t7422 we can replace one loop, but sadly we have to leave the loop directly above it. This is because that earlier loop wants to include the seq value twice in the output, which test_seq does not support (nor does regular seq). If you run: test_seq -f "foo-%d %d" 10 the second "%d" will always be the empty string. You might naively think that test_seq could add some extra arguments, like: # 3 ought to be enough for anyone... printf "$fmt\n" "$i "$i" $i" but that just triggers printf to format multiple lines, one per extra set of arguments. So we'd have to actually parse the format string, figure out how many "%" placeholders are there, and then feed it that many instances of the sequence number. The complexity isn't worth it. Signed-off-by: Jeff King <peff@peff.net> --- t/t0021-conversion.sh | 4 ++-- t/t0610-reftable-basics.sh | 6 +----- t/t0612-reftable-jgit-compatibility.sh | 13 +++++-------- t/t0613-reftable-write-options.sh | 24 ++++-------------------- t/t1400-update-ref.sh | 10 ++-------- t/t5004-archive-corner-cases.sh | 5 +---- t/t6422-merge-rename-corner-cases.sh | 10 ++-------- t/t7422-submodule-output.sh | 6 +----- t/test-lib-functions.sh | 9 ++++++++- 9 files changed, 26 insertions(+), 61 deletions(-) diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh index bf10d253ec..f0d50d769e 100755 --- a/t/t0021-conversion.sh +++ b/t/t0021-conversion.sh @@ -281,7 +281,7 @@ test_expect_success 'required filter with absent smudge field' ' test_expect_success 'filtering large input to small output should use little memory' ' test_config filter.devnull.clean "cat >/dev/null" && test_config filter.devnull.required true && - for i in $(test_seq 1 30); do printf "%1048576d" 1 || return 1; done >30MB && + test_seq -f "%1048576d" 1 30 >30MB && echo "30MB filter=devnull" >.gitattributes && GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB ' @@ -299,7 +299,7 @@ test_expect_success 'filter that does not read is fine' ' test_expect_success EXPENSIVE 'filter large file' ' test_config filter.largefile.smudge cat && test_config filter.largefile.clean cat && - for i in $(test_seq 1 2048); do printf "%1048576d" 1 || return 1; done >2GB && + test_seq -f "%1048576d" 1 2048 >2GB && echo "2GB filter=largefile" >.gitattributes && git add 2GB 2>err && test_must_be_empty err && diff --git a/t/t0610-reftable-basics.sh b/t/t0610-reftable-basics.sh index 1be534a895..3ea5d51532 100755 --- a/t/t0610-reftable-basics.sh +++ b/t/t0610-reftable-basics.sh @@ -477,11 +477,7 @@ test_expect_success !CYGWIN 'ref transaction: many concurrent writers' ' test_commit --no-tag initial && head=$(git rev-parse HEAD) && - for i in $(test_seq 100) - do - printf "%s commit\trefs/heads/branch-%s\n" "$head" "$i" || - return 1 - done >expect && + test_seq -f "$head commit\trefs/heads/branch-%d" 100 >expect && printf "%s commit\trefs/heads/main\n" "$head" >>expect && for i in $(test_seq 100) diff --git a/t/t0612-reftable-jgit-compatibility.sh b/t/t0612-reftable-jgit-compatibility.sh index d0d7e80b49..7df2ad5817 100755 --- a/t/t0612-reftable-jgit-compatibility.sh +++ b/t/t0612-reftable-jgit-compatibility.sh @@ -112,14 +112,11 @@ test_expect_success 'JGit can read multi-level index' ' cd repo && test_commit A && - awk " - BEGIN { - print \"start\"; - for (i = 0; i < 10000; i++) - printf \"create refs/heads/branch-%d HEAD\n\", i; - print \"commit\"; - } - " >input && + { + echo start && + test_seq -f "create refs/heads/branch-%d HEAD" 10000 && + echo commit + } >input && git update-ref --stdin <input && test_same_refs && diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh index 6447920c9b..d77e601111 100755 --- a/t/t0613-reftable-write-options.sh +++ b/t/t0613-reftable-write-options.sh @@ -66,11 +66,7 @@ test_expect_success 'many refs results in multiple blocks' ' ( cd repo && test_commit initial && - for i in $(test_seq 200) - do - printf "update refs/heads/branch-%d HEAD\n" "$i" || - return 1 - done >input && + test_seq -f "update refs/heads/branch-%d HEAD" 200 >input && git update-ref --stdin <input && git pack-refs && @@ -180,11 +176,7 @@ test_expect_success 'restart interval at every single record' ' ( cd repo && test_commit initial && - for i in $(test_seq 10) - do - printf "update refs/heads/branch-%d HEAD\n" "$i" || - return 1 - done >input && + test_seq -f "update refs/heads/branch-%d HEAD" 10 >input && git update-ref --stdin <input && git -c reftable.restartInterval=1 pack-refs && @@ -224,11 +216,7 @@ test_expect_success 'object index gets written by default with ref index' ' ( cd repo && test_commit initial && - for i in $(test_seq 5) - do - printf "update refs/heads/branch-%d HEAD\n" "$i" || - return 1 - done >input && + test_seq -f "update refs/heads/branch-%d HEAD" 5 >input && git update-ref --stdin <input && git -c reftable.blockSize=100 pack-refs && @@ -263,11 +251,7 @@ test_expect_success 'object index can be disabled' ' ( cd repo && test_commit initial && - for i in $(test_seq 5) - do - printf "update refs/heads/branch-%d HEAD\n" "$i" || - return 1 - done >input && + test_seq -f "update refs/heads/branch-%d HEAD" 5 >input && git update-ref --stdin <input && git -c reftable.blockSize=100 -c reftable.indexObjects=false pack-refs && diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh index d29d23cb89..e373d9108b 100755 --- a/t/t1400-update-ref.sh +++ b/t/t1400-update-ref.sh @@ -1380,21 +1380,15 @@ test_expect_success 'fails with duplicate ref update via symref' ' test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction creating branches does not burst open file limit' ' ( - for i in $(test_seq 33) - do - echo "create refs/heads/$i HEAD" || exit 1 - done >large_input && + test_seq -f "create refs/heads/%d HEAD" 33 >large_input && run_with_limited_open_files git update-ref --stdin <large_input && git rev-parse --verify -q refs/heads/33 ) ' test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction deleting branches does not burst open file limit' ' ( - for i in $(test_seq 33) - do - echo "delete refs/heads/$i HEAD" || exit 1 - done >large_input && + test_seq -f "delete refs/heads/%d HEAD" 33 >large_input && run_with_limited_open_files git update-ref --stdin <large_input && test_must_fail git rev-parse --verify -q refs/heads/33 ) diff --git a/t/t5004-archive-corner-cases.sh b/t/t5004-archive-corner-cases.sh index 5174995191..027dedd976 100755 --- a/t/t5004-archive-corner-cases.sh +++ b/t/t5004-archive-corner-cases.sh @@ -176,10 +176,7 @@ test_expect_success EXPENSIVE,UNZIP,UNZIP_ZIP64_SUPPORT \ blob=$(echo $s | git hash-object -w --stdin) && # create tree containing 65500 entries of that blob - for i in $(test_seq 1 65500) - do - echo "100644 blob $blob $i" || return 1 - done >tree && + test_seq -f "100644 blob $blob\t%d" 1 65500 >tree && tree=$(git mktree <tree) && # zip it, creating an archive a bit bigger than 4GB diff --git a/t/t6422-merge-rename-corner-cases.sh b/t/t6422-merge-rename-corner-cases.sh index 9cbe7ca782..f14c0fb30e 100755 --- a/t/t6422-merge-rename-corner-cases.sh +++ b/t/t6422-merge-rename-corner-cases.sh @@ -1146,10 +1146,7 @@ test_conflicts_with_adds_and_renames() { cd simple_${sideL}_${sideR} && # Create some related files now - for i in $(test_seq 1 10) - do - echo Random base content line $i - done >file_v1 && + test_seq -f "Random base content line %d" 1 10 >file_v1 && cp file_v1 file_v2 && echo modification >>file_v2 && @@ -1293,10 +1290,7 @@ test_setup_nested_conflicts_from_rename_rename () { cd nested_conflicts_from_rename_rename && # Create some related files now - for i in $(test_seq 1 10) - do - echo Random base content line $i - done >file_v1 && + test_seq -f "Random base content line %d" 1 10 >file_v1 && cp file_v1 file_v2 && cp file_v1 file_v3 && diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh index ea67057f1a..aea1ddf117 100755 --- a/t/t7422-submodule-output.sh +++ b/t/t7422-submodule-output.sh @@ -187,11 +187,7 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' BLOB=$(git hash-object -w --stdin <gitmodules) && printf "100644 blob $BLOB\t.gitmodules\n" >tree && - for i in $(test_seq 2000) - do - printf "160000 commit $COMMIT\trecursive-submodule-path-%d\n" "$i" || - return 1 - done >>tree && + test_seq -f "160000 commit $COMMIT\trecursive-submodule-path-%d" 2000 >>tree && TREE=$(git mktree <tree) && COMMIT=$(git commit-tree "$TREE") && diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index bee4a2ca34..8c176f4efc 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -1454,6 +1454,13 @@ test_cmp_fspath () { # from 1. test_seq () { + local fmt="%d" + case "$1" in + -f) + fmt="$2" + shift 2 + ;; + esac case $# in 1) set 1 "$@" ;; 2) ;; @@ -1462,7 +1469,7 @@ test_seq () { test_seq_counter__=$1 while test "$test_seq_counter__" -le "$2" do - echo "$test_seq_counter__" + printf "$fmt\n" "$test_seq_counter__" test_seq_counter__=$(( $test_seq_counter__ + 1 )) done } -- 2.50.0.385.g2a828bf5b7 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] test-lib: teach test_seq the -f option 2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King @ 2025-06-23 16:10 ` Junio C Hamano 2025-06-23 16:25 ` Justin Tobler ` (2 subsequent siblings) 3 siblings, 0 replies; 14+ messages in thread From: Junio C Hamano @ 2025-06-23 16:10 UTC (permalink / raw) To: Jeff King; +Cc: git Jeff King <peff@peff.net> writes: > The "seq" tool has a "-f" option to produce printf-style formatted > lines. Let's teach our test_seq helper the same trick. This lets us get > rid of some shell loops in test snippets (which are particularly verbose > in our test suite because we have to "|| return 1" to keep the &&-chain > going). > > This converts a few call-sites I found by grepping around the test > suite. A few notes on these: > > - In "seq", the format specifier is a "%g" float. Since test_seq only > supports integers, I've kept the more natural "%d" (which is what > these call sites were using already). Yeah, that was the first thing I started wondering about after learning that "seq --format" is a thing. > - Like "seq", test_seq automatically adds a newline to the specified > format. This is what all callers are doing already except for t0021, > but there we do not care about the exact format. We are just trying > to printf a large number of bytes to a file. It's not worth > complicating other callers or adding an option to avoid the newline > in that caller. OK. Other than the newline, the update to t0021 changes the actual payload used in the test, but it does not affect the characteristic of the data (like how compressible the result is) all that much, I think. > - Most conversions are just replacing a shell loop (which does get rid > of an extra fork, since $() requires a subshell). In t0612 we can > replace an awk invocation, which I think makes the end result more > readable, as there's less quoting. ;-) > - In t7422 we can replace one loop, but sadly we have to leave the > loop directly above it. This is because that earlier loop wants to > include the seq value twice in the output, which test_seq does not > support (nor does regular seq). If you run: > > test_seq -f "foo-%d %d" 10 > > the second "%d" will always be the empty string. You might naively > think that test_seq could add some extra arguments, like: > > # 3 ought to be enough for anyone... > printf "$fmt\n" "$i "$i" $i" > > but that just triggers printf to format multiple lines, one per > extra set of arguments. > > So we'd have to actually parse the format string, figure out how > many "%" placeholders are there, and then feed it that many > instances of the sequence number. The complexity isn't worth it. And it would deviate from the normal "seq", which would contaminate our developers' mind, which is not a direction worth going. The changes look all good. Thanks. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] test-lib: teach test_seq the -f option 2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King 2025-06-23 16:10 ` Junio C Hamano @ 2025-06-23 16:25 ` Justin Tobler 2025-06-24 10:11 ` [PATCH 3/2] test-lib: document test_seq's "-f" option Jeff King 2025-06-23 17:27 ` [PATCH 2/2] test-lib: teach test_seq the -f option Todd Zullinger 2025-06-24 6:22 ` Eric Sunshine 3 siblings, 1 reply; 14+ messages in thread From: Justin Tobler @ 2025-06-23 16:25 UTC (permalink / raw) To: Jeff King; +Cc: git On 25/06/23 06:56AM, Jeff King wrote: > The "seq" tool has a "-f" option to produce printf-style formatted > lines. Let's teach our test_seq helper the same trick. This lets us get > rid of some shell loops in test snippets (which are particularly verbose > in our test suite because we have to "|| return 1" to keep the &&-chain > going). > > This converts a few call-sites I found by grepping around the test > suite. A few notes on these: > > - In "seq", the format specifier is a "%g" float. Since test_seq only > supports integers, I've kept the more natural "%d" (which is what > these call sites were using already). Sticking with "%d" definately feels more natural. > - Like "seq", test_seq automatically adds a newline to the specified > format. This is what all callers are doing already except for t0021, > but there we do not care about the exact format. We are just trying > to printf a large number of bytes to a file. It's not worth > complicating other callers or adding an option to avoid the newline > in that caller. > > - Most conversions are just replacing a shell loop (which does get rid > of an extra fork, since $() requires a subshell). In t0612 we can > replace an awk invocation, which I think makes the end result more > readable, as there's less quoting. > > - In t7422 we can replace one loop, but sadly we have to leave the > loop directly above it. This is because that earlier loop wants to > include the seq value twice in the output, which test_seq does not > support (nor does regular seq). If you run: > > test_seq -f "foo-%d %d" 10 > > the second "%d" will always be the empty string. You might naively > think that test_seq could add some extra arguments, like: > > # 3 ought to be enough for anyone... > printf "$fmt\n" "$i "$i" $i" > > but that just triggers printf to format multiple lines, one per > extra set of arguments. > > So we'd have to actually parse the format string, figure out how > many "%" placeholders are there, and then feed it that many > instances of the sequence number. The complexity isn't worth it. > > Signed-off-by: Jeff King <peff@peff.net> > --- > t/t0021-conversion.sh | 4 ++-- > t/t0610-reftable-basics.sh | 6 +----- > t/t0612-reftable-jgit-compatibility.sh | 13 +++++-------- > t/t0613-reftable-write-options.sh | 24 ++++-------------------- > t/t1400-update-ref.sh | 10 ++-------- > t/t5004-archive-corner-cases.sh | 5 +---- > t/t6422-merge-rename-corner-cases.sh | 10 ++-------- > t/t7422-submodule-output.sh | 6 +----- > t/test-lib-functions.sh | 9 ++++++++- > 9 files changed, 26 insertions(+), 61 deletions(-) > > diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh > index bf10d253ec..f0d50d769e 100755 > --- a/t/t0021-conversion.sh > +++ b/t/t0021-conversion.sh > @@ -281,7 +281,7 @@ test_expect_success 'required filter with absent smudge field' ' > test_expect_success 'filtering large input to small output should use little memory' ' > test_config filter.devnull.clean "cat >/dev/null" && > test_config filter.devnull.required true && > - for i in $(test_seq 1 30); do printf "%1048576d" 1 || return 1; done >30MB && > + test_seq -f "%1048576d" 1 30 >30MB && Very nice quality of life improvement indeed. :) > echo "30MB filter=devnull" >.gitattributes && > GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB > ' [snip] > diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh > index bee4a2ca34..8c176f4efc 100644 > --- a/t/test-lib-functions.sh > +++ b/t/test-lib-functions.sh > @@ -1454,6 +1454,13 @@ test_cmp_fspath () { > # from 1. > > test_seq () { > + local fmt="%d" > + case "$1" in > + -f) > + fmt="$2" With the `-f` option, the default format string gets overwritten to what is provided by the user. Makes sense. If we want, we could update the comment above this function to mention this new option. > + shift 2 > + ;; > + esac > case $# in > 1) set 1 "$@" ;; > 2) ;; > @@ -1462,7 +1469,7 @@ test_seq () { > test_seq_counter__=$1 > while test "$test_seq_counter__" -le "$2" > do > - echo "$test_seq_counter__" > + printf "$fmt\n" "$test_seq_counter__" Nice and simple! Each of the updated callsites also look good to me. -Justin > test_seq_counter__=$(( $test_seq_counter__ + 1 )) > done > } > -- > 2.50.0.385.g2a828bf5b7 > ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 3/2] test-lib: document test_seq's "-f" option 2025-06-23 16:25 ` Justin Tobler @ 2025-06-24 10:11 ` Jeff King 2025-06-25 0:02 ` Justin Tobler 0 siblings, 1 reply; 14+ messages in thread From: Jeff King @ 2025-06-24 10:11 UTC (permalink / raw) To: Justin Tobler; +Cc: git, Junio C Hamano On Mon, Jun 23, 2025 at 11:25:20AM -0500, Justin Tobler wrote: > > diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh > > index bee4a2ca34..8c176f4efc 100644 > > --- a/t/test-lib-functions.sh > > +++ b/t/test-lib-functions.sh > > @@ -1454,6 +1454,13 @@ test_cmp_fspath () { > > # from 1. > > > > test_seq () { > > + local fmt="%d" > > + case "$1" in > > + -f) > > + fmt="$2" > > With the `-f` option, the default format string gets overwritten to what > is provided by the user. Makes sense. > > If we want, we could update the comment above this function to mention > this new option. Good point. I didn't even notice that comment! Perhaps we should squash this in? I don't think there's any need to keep it as a separate commit. -- >8 -- Subject: [PATCH] test-lib: document test_seq's "-f" option The previous commit added the "-f" option, but didn't mention it in the function's documentation. Suggested-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Jeff King <peff@peff.net> --- t/test-lib-functions.sh | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 8c176f4efc..6230746cc4 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -1451,7 +1451,12 @@ test_cmp_fspath () { # test_seq 1 5 -- outputs 1 2 3 4 5 one line at a time # # or with one argument (end), in which case it starts counting -# from 1. +# from 1. In addition to the start/end arguments, you can pass an optional +# printf format. For example: +# +# test_seq -f "line %d" 1 5 +# +# would print 5 lines, "line 1" through "line 5". test_seq () { local fmt="%d" -- 2.50.0.399.g566d3d7b27 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 3/2] test-lib: document test_seq's "-f" option 2025-06-24 10:11 ` [PATCH 3/2] test-lib: document test_seq's "-f" option Jeff King @ 2025-06-25 0:02 ` Justin Tobler 0 siblings, 0 replies; 14+ messages in thread From: Justin Tobler @ 2025-06-25 0:02 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano On 25/06/24 06:11AM, Jeff King wrote: > On Mon, Jun 23, 2025 at 11:25:20AM -0500, Justin Tobler wrote: > > > > diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh > > > index bee4a2ca34..8c176f4efc 100644 > > > --- a/t/test-lib-functions.sh > > > +++ b/t/test-lib-functions.sh > > > @@ -1454,6 +1454,13 @@ test_cmp_fspath () { > > > # from 1. > > > > > > test_seq () { > > > + local fmt="%d" > > > + case "$1" in > > > + -f) > > > + fmt="$2" > > > > With the `-f` option, the default format string gets overwritten to what > > is provided by the user. Makes sense. > > > > If we want, we could update the comment above this function to mention > > this new option. > > Good point. I didn't even notice that comment! > > Perhaps we should squash this in? I don't think there's any need to keep > it as a separate commit. Ya, that seems reasonable to me. > -- >8 -- > Subject: [PATCH] test-lib: document test_seq's "-f" option > > The previous commit added the "-f" option, but didn't mention it in the > function's documentation. > > Suggested-by: Justin Tobler <jltobler@gmail.com> > Signed-off-by: Jeff King <peff@peff.net> > --- > t/test-lib-functions.sh | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh > index 8c176f4efc..6230746cc4 100644 > --- a/t/test-lib-functions.sh > +++ b/t/test-lib-functions.sh > @@ -1451,7 +1451,12 @@ test_cmp_fspath () { > # test_seq 1 5 -- outputs 1 2 3 4 5 one line at a time > # > # or with one argument (end), in which case it starts counting > -# from 1. > +# from 1. In addition to the start/end arguments, you can pass an optional > +# printf format. For example: > +# > +# test_seq -f "line %d" 1 5 > +# > +# would print 5 lines, "line 1" through "line 5". At first I thought it might be nice to mention that only format strings with a single specifier are supported, but I think this can also be implied since the comment mentions the format string follows the printf format. This looks good to me :) -Justin > > test_seq () { > local fmt="%d" > -- > 2.50.0.399.g566d3d7b27 > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] test-lib: teach test_seq the -f option 2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King 2025-06-23 16:10 ` Junio C Hamano 2025-06-23 16:25 ` Justin Tobler @ 2025-06-23 17:27 ` Todd Zullinger 2025-06-24 10:16 ` Jeff King 2025-06-24 6:22 ` Eric Sunshine 3 siblings, 1 reply; 14+ messages in thread From: Todd Zullinger @ 2025-06-23 17:27 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Justin Tobler, git Jeff King wrote: > The "seq" tool has a "-f" option to produce printf-style formatted > lines. Let's teach our test_seq helper the same trick. This lets us get > rid of some shell loops in test snippets (which are particularly verbose > in our test suite because we have to "|| return 1" to keep the &&-chain > going). This is a nice improvement. > diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh > index bee4a2ca34..8c176f4efc 100644 > --- a/t/test-lib-functions.sh > +++ b/t/test-lib-functions.sh > @@ -1454,6 +1454,13 @@ test_cmp_fspath () { > # from 1. > > test_seq () { > + local fmt="%d" > + case "$1" in > + -f) > + fmt="$2" > + shift 2 > + ;; > + esac > case $# in > 1) set 1 "$@" ;; > 2) ;; > @@ -1462,7 +1469,7 @@ test_seq () { > test_seq_counter__=$1 > while test "$test_seq_counter__" -le "$2" > do > - echo "$test_seq_counter__" > + printf "$fmt\n" "$test_seq_counter__" > test_seq_counter__=$(( $test_seq_counter__ + 1 )) > done > } Is it a sharp edge worth caring about that someone might write `test_seq -f 1 5` where we'd pass 1 as the format string? If so, perhaps a check like this might be sufficient to catch it early? diff --git i/t/test-lib-functions.sh w/t/test-lib-functions.sh index 8c176f4efc..87b59d5895 100644 --- i/t/test-lib-functions.sh +++ w/t/test-lib-functions.sh @@ -1458,6 +1458,10 @@ test_seq () { case "$1" in -f) fmt="$2" + case "$fmt" in + *%*) : ;; + *) BUG "no % in -f argument" ;; + esac shift 2 ;; esac I don't know whether it's worth the extra code or not. I just wondered about how it would fail in the face of a minor typo. It certainly should cause any test to fail if it were to output 1 instead of the intended format string, so it's arguably fine as-is. Adding -f to the usage note above, as Justin suggested might help folks avoid making the mistake of cuddling the format string against -f, e.g.: -f%d. That is caught by the parameter count check (though perhaps not everyone would notice why, thinking they did pass an argument to -f). -- Todd ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] test-lib: teach test_seq the -f option 2025-06-23 17:27 ` [PATCH 2/2] test-lib: teach test_seq the -f option Todd Zullinger @ 2025-06-24 10:16 ` Jeff King 2025-06-24 13:42 ` Todd Zullinger 0 siblings, 1 reply; 14+ messages in thread From: Jeff King @ 2025-06-24 10:16 UTC (permalink / raw) To: Todd Zullinger; +Cc: Junio C Hamano, Justin Tobler, git On Mon, Jun 23, 2025 at 01:27:07PM -0400, Todd Zullinger wrote: > Is it a sharp edge worth caring about that someone might > write `test_seq -f 1 5` where we'd pass 1 as the format > string? > > If so, perhaps a check like this might be sufficient to > catch it early? > > diff --git i/t/test-lib-functions.sh w/t/test-lib-functions.sh > index 8c176f4efc..87b59d5895 100644 > --- i/t/test-lib-functions.sh > +++ w/t/test-lib-functions.sh > @@ -1458,6 +1458,10 @@ test_seq () { > case "$1" in > -f) > fmt="$2" > + case "$fmt" in > + *%*) : ;; > + *) BUG "no % in -f argument" ;; > + esac > shift 2 > ;; > esac > > I don't know whether it's worth the extra code or not. I > just wondered about how it would fail in the face of a minor > typo. It certainly should cause any test to fail if it were > to output 1 instead of the intended format string, so it's > arguably fine as-is. Hmm, maybe. I notice that "seq" itself does this (though it did surprise me). I think there it is actually doing the "%" interpolation itself (to avoid memory errors by feeding arbitrary strings to printf functions), so it's easy to do. In our case, we can rely on the shell printf to do something sensible if fed garbage. And because we're not parsing ourselves, a pattern like you have above isn't totally accurate (e.g., consider what it would with "%%d"). But it probably would be enough to catch typos. It would also disallow: test_seq -f "same line" 50 to produce repeated lines, though I don't know how valuable that would be. So I dunno. > Adding -f to the usage note above, as Justin suggested might > help folks avoid making the mistake of cuddling the format > string against -f, e.g.: -f%d. That is caught by the > parameter count check (though perhaps not everyone would > notice why, thinking they did pass an argument to -f). If people are going to use "-f%d", I think we'd be better off making it work than trying to complain about it. But I was hoping we could just keep things simple and stupid, given the limited audience. So my inclination is to leave the sharp edges and see if anybody gets cut, but it's possible that I'm just being lazy. -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] test-lib: teach test_seq the -f option 2025-06-24 10:16 ` Jeff King @ 2025-06-24 13:42 ` Todd Zullinger 0 siblings, 0 replies; 14+ messages in thread From: Todd Zullinger @ 2025-06-24 13:42 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Justin Tobler, git Jeff King wrote: > On Mon, Jun 23, 2025 at 01:27:07PM -0400, Todd Zullinger wrote: >> I don't know whether it's worth the extra code or not. I >> just wondered about how it would fail in the face of a minor >> typo. It certainly should cause any test to fail if it were >> to output 1 instead of the intended format string, so it's >> arguably fine as-is. > > Hmm, maybe. I notice that "seq" itself does this (though it did surprise > me). I think there it is actually doing the "%" interpolation itself (to > avoid memory errors by feeding arbitrary strings to printf functions), > so it's easy to do. > > In our case, we can rely on the shell printf to do something sensible if > fed garbage. And because we're not parsing ourselves, a pattern like you > have above isn't totally accurate (e.g., consider what it would with > "%%d"). But it probably would be enough to catch typos. Parsing it fully felt like overkill, and a good bit more code to do it well. So that was me being lazy. :) > It would also disallow: > > test_seq -f "same line" 50 > > to produce repeated lines, though I don't know how valuable that would > be. So I dunno. I can see someone using it that way, like a cheap version of the yes command. > If people are going to use "-f%d", I think we'd be better off making it > work than trying to complain about it. But I was hoping we could just > keep things simple and stupid, given the limited audience. > > So my inclination is to leave the sharp edges and see if anybody gets > cut, but it's possible that I'm just being lazy. Sounds good to me. As you say, it's not going to be exposed to "normal" users. The worst thing it can do is allow a test to pass which shouldn't -- and that's by far the least likely way for the (not very) sharp edges to cut us. -- Todd ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] test-lib: teach test_seq the -f option 2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King ` (2 preceding siblings ...) 2025-06-23 17:27 ` [PATCH 2/2] test-lib: teach test_seq the -f option Todd Zullinger @ 2025-06-24 6:22 ` Eric Sunshine 2025-06-24 10:36 ` Jeff King 3 siblings, 1 reply; 14+ messages in thread From: Eric Sunshine @ 2025-06-24 6:22 UTC (permalink / raw) To: Jeff King; +Cc: git On Mon, Jun 23, 2025 at 6:56 AM Jeff King <peff@peff.net> wrote: > The "seq" tool has a "-f" option to produce printf-style formatted > lines. Let's teach our test_seq helper the same trick. This lets us get > rid of some shell loops in test snippets (which are particularly verbose > in our test suite because we have to "|| return 1" to keep the &&-chain > going). > > Signed-off-by: Jeff King <peff@peff.net> > --- > diff --git a/t/t0612-reftable-jgit-compatibility.sh b/t/t0612-reftable-jgit-compatibility.sh > @@ -112,14 +112,11 @@ test_expect_success 'JGit can read multi-level index' ' > - awk " > - BEGIN { > - print \"start\"; > - for (i = 0; i < 10000; i++) > - printf \"create refs/heads/branch-%d HEAD\n\", i; > - print \"commit\"; > - } > - " >input && > + { > + echo start && > + test_seq -f "create refs/heads/branch-%d HEAD" 10000 && > + echo commit > + } >input && I had suggested[1] an effectively equivalent change to Patrick for a couple tests in the nearby t0610, but he rejected[2] the idea due to the pure-shell version being significantly slower than the `awk` version. Pondering his response today, I wondered if it would make sense to replace our pure-shell `test_seq` with an implementation via `awk`, however, if most of our sequence vend only a small set of numbers, then the startup cost of `awk` would probably swamp any savings, especially on Windows where process startup is extremely slow. Taking that into account, I further wondered if we could see an overall win by taking a hybrid approach in which we employ the pure-shell version if vending a small set of numbers, but fall over to an `awk` version if vending a lot of numbers, especially as in the test above or the tests in t0610. Anyhow, food for thought, or not, if you're not hungry for thought food. [1]: https://lore.kernel.org/git/CAPig+cSC3zdur1fCsa7RMNZDcgUK4pUGKb22tpgdANxR6OxNMA@mail.gmail.com/ [2]: https://lore.kernel.org/git/Z-FUNgmY9hTsnzds@pks.im/ ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] test-lib: teach test_seq the -f option 2025-06-24 6:22 ` Eric Sunshine @ 2025-06-24 10:36 ` Jeff King 0 siblings, 0 replies; 14+ messages in thread From: Jeff King @ 2025-06-24 10:36 UTC (permalink / raw) To: Eric Sunshine; +Cc: git On Tue, Jun 24, 2025 at 02:22:21AM -0400, Eric Sunshine wrote: > > diff --git a/t/t0612-reftable-jgit-compatibility.sh b/t/t0612-reftable-jgit-compatibility.sh > > @@ -112,14 +112,11 @@ test_expect_success 'JGit can read multi-level index' ' > > - awk " > > - BEGIN { > > - print \"start\"; > > - for (i = 0; i < 10000; i++) > > - printf \"create refs/heads/branch-%d HEAD\n\", i; > > - print \"commit\"; > > - } > > - " >input && > > + { > > + echo start && > > + test_seq -f "create refs/heads/branch-%d HEAD" 10000 && > > + echo commit > > + } >input && > > I had suggested[1] an effectively equivalent change to Patrick for a > couple tests in the nearby t0610, but he rejected[2] the idea due to > the pure-shell version being significantly slower than the `awk` > version. > > Pondering his response today, I wondered if it would make sense to > replace our pure-shell `test_seq` with an implementation via `awk`, > however, if most of our sequence vend only a small set of numbers, > then the startup cost of `awk` would probably swamp any savings, > especially on Windows where process startup is extremely slow. Taking > that into account, I further wondered if we could see an overall win > by taking a hybrid approach in which we employ the pure-shell version > if vending a small set of numbers, but fall over to an `awk` version > if vending a lot of numbers, especially as in the test above or the > tests in t0610. Anyhow, food for thought, or not, if you're not hungry > for thought food. Ah, interesting. I didn't time it at all, as my general intuition for shell performance is that counting process spawns overrides everything else (though admittedly it is usually O(n) processes vs O(1), and here we are going from one extra process to zero). I did a few timings, and it looks like the shell wins at 10,000 on my system, but awk wins at 50,000 (though there is a lot of run-to-run noise; I think awk might even win at 10,000 on a loaded system, as this is such a light load that CPU frequency throttling comes into play). I assumed that the culprit was a lack of buffering, but I don't think so. awk seems to issue 10,000 write() calls. I guess it is just internal shell overhead in issuing commands. Where is a JIT byte-code shell interpreter when we need one? ;) My inclination is not to worry about it too much. At 10,000 I think we are talking about a few milliseconds. There's so much more low-hanging fruit if somebody wants to optimize the test suite. IMHO readability is more important here (and if we really want to optimize, doing it inside test_seq would be better). -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2025-06-25 0:07 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-06-23 10:55 [PATCH 0/2] test_seq format option Jeff King 2025-06-23 10:55 ` [PATCH 1/2] t7422: replace confusing printf with echo Jeff King 2025-06-23 17:59 ` Eric Sunshine 2025-06-24 10:05 ` Jeff King 2025-06-23 10:56 ` [PATCH 2/2] test-lib: teach test_seq the -f option Jeff King 2025-06-23 16:10 ` Junio C Hamano 2025-06-23 16:25 ` Justin Tobler 2025-06-24 10:11 ` [PATCH 3/2] test-lib: document test_seq's "-f" option Jeff King 2025-06-25 0:02 ` Justin Tobler 2025-06-23 17:27 ` [PATCH 2/2] test-lib: teach test_seq the -f option Todd Zullinger 2025-06-24 10:16 ` Jeff King 2025-06-24 13:42 ` Todd Zullinger 2025-06-24 6:22 ` Eric Sunshine 2025-06-24 10:36 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox