Re: [PATCH v2 2/9] t5526: use grep to assert on fetches

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Glen Choo <chooglen@google.com>
Cc: git@vger.kernel.org, Jonathan Tan <jonathantanmy@google.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v2 2/9] t5526: use grep to assert on fetches
Date: Thu, 17 Feb 2022 10:25:52 +0100	[thread overview]
Message-ID: <220217.86iltdj12w.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <kl6l7d9u5eay.fsf@chooglen-macbookpro.roam.corp.google.com>


On Thu, Feb 17 2022, Glen Choo wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Wed, Feb 16 2022, Glen Choo wrote:
>>
>>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>>
>>>> On Wed, Feb 16 2022, Glen Choo wrote:
>>>>
>>>>> In a previous commit, we replaced test_cmp invocations with
>>>>> verify_fetch_result(). Finish the process of removing test_cmp by using
>>>>> grep in verify_fetch_result() instead.
>>>>>
>>>>> This makes the tests less sensitive to changes because, instead of
>>>>> checking the whole stderr, we only grep for the lines of the form
>>>>>
>>>>> * "<old-head>..<new-head>\s+branch\s+-> origin/branch"
>>>>> * "Fetching submodule <submodule-path>" (if fetching a submodule)
>>>>>
>>>>> when we expect the repo to have fetched. If we expect the repo to not
>>>>> have fetched, grep to make sure the lines are absent. Also, simplify the
>>>>> assertions by using grep patterns that match only the relevant pieces of
>>>>> information, e.g. <old-head> is irrelevant because we only want to know
>>>>> if the fetch was performed, so we don't need to know where the branch
>>>>> was before the fetch.
>>>>
>>>> I tried ejecting 1/9 and 2/9 out of this series locally, and it passes
>>>> all tests until the new tests you add in 7/9.
>>>>
>>>> As ugly as some of the pre-image is, I wonder if dropping these first
>>>> two and biting the bullet and just continuing with the test_cmp is
>>>> better.
>>>>
>>>> The test_cmp is going to catch issues that even the cleverest grep
>>>> combinations won't, e.g. in the submodule-in-C series I discovered a bug
>>>> where all of our testing & the existing series hadn't spotted that we
>>>> were dropping a \n at the end in one of the messages.
>>>
>>> I think there are two schools of thought on how to test informational
>>> messages:
>>>
>>> - assert an exact match on the exact output that we generate
>>> - assert that the output contains the pieces of information we care
>>>   about
>>>
>>> These two approaches are virtually opposites on two axes - the former
>>> will catch unintentional changes (like you've noted)[...]
>>
>> Yes, and to be fair I'm thoroughly in the "assert an exact match" camp,
>> i.e. "let's just use test_cmp", and not everyone would agree with that.
>>
>> I mean, I don't think we should test_cmp every single instance of a
>> command, but for things that are *the tests* concerning themselves with
>> what the output should be, yes we should do that.
>
> That's a good point I hadn't considered, which is that if we want any
> hope of catching unintentional changes in our test suite, we'd want
> _some_ test to check the output. For "git fetch --recurse-submodules",
> it makes the most sense for that test to live in this file.
>
> By eliminating all instances of test_cmp in this file in particular, we
> lose assurances that we don't introduce accidental changes. It makes
> sense to at least have some tests explicitly for output.
>
>>
>>> [...] and the latter saves on maintenance effort and tends to be less noisy in tests.
>>
>> I also don't think you're right about the other approach "sav[ing] on
>> [future] maintenance effort" in this case.
>>
>> If I was needing to adjust some of this output I'd spend way longer on
>> trying to carefully reason that some series of "grep" invocations were
>> really doing the right thing, and probably end up doing the equivalent
>> of a "test_cmp" for myself out of general paranoia, whereas adjusting
>> the output.
>
> That's fair. I've optimized the tests for readability by putting
> complicated logic in the test helper. But any diligent test reader would
> need to read the test helper to convince themselves of its correctness.
> In this case, I agree that the helper is too complex.
>
>>> Personally, I'm a bit torn between both approaches in general because I
>>> want tests to be maintainable (testing the exact output is a bit of an
>>> antipattern at Google), but I'm not very comfortable with the fact that
>>> unintended changes can sneak through.
>>
>> Yes, anyway whatever one thinks in general what I meant to point out
>> here with "biting the bullet" is that whatever one thinks in general
>> about the right approch for new tests, this series in particular seems
>> to be creating more work for itself than it needs by refactoring the
>> test_cmp in existing tests just to add a few new ones.
>>
>> I.e. even if you'd like to not use test_cmp-alike for the new tests,
>> wouldn't it be simpler to just leave the old ones in place and use your
>> new helper for your new tests?
>
> I'm not sure about this - avoiding changing old tests leads to
> fragmentation in the test suite and even the same file. I find it very
> challenging to read/modify files like this, because there is no longer a
> consistent style for the file, and I have to figure out which is the
> "good" way to write tests.
>
> This suggestion makes sense if there's some qualitative difference
> between the new tests and old ones besides just 'being new'. This isn't
> true for this series, so I'd prefer to keep things consistent.
>
>>> So I don't think there's a correct answer in general. Maybe an
>>> acceptable rule of thumb is that test_cmp is good until it starts
>>> getting in the way of reading and writing understandable tests.
>>>
>>> If we agree on that rule, then for this patch, I think replacing
>>> test_cmp is the way to go, primarily because it lets us ignore the 'old
>>> head' of the branch before the fetch, e.g. in the quoted example..
>>
>> [...]
>>
>>>>>  test_expect_success setup '
>>>>> @@ -274,13 +277,10 @@ test_expect_success "Recursion doesn't happen when no new commits are fetched in
>>>>>  '
>>>>>  
>>>>>  test_expect_success "Recursion stops when no new submodule commits are fetched" '
>>>>> -	head1=$(git rev-parse --short HEAD) &&
>>>>>  	git add submodule &&
>>>>>  	git commit -m "new submodule" &&
>>>>> -	head2=$(git rev-parse --short HEAD) &&
>>>>> -	echo "From $pwd/." > expect.err.super &&
>>>>> -	echo "   $head1..$head2  super      -> origin/super" >>expect.err.super &&
>>>>
>>>> ...as opposed to if we just rolled the generation of this into some
>>>> utility printf function.
>>>
>>> we'd still have to deal with $head1 if we use test_cmp. That's fine for
>>> this test, because it's pretty simple, but it gets pretty janky later
>>> on:
>>>
>>>   @@ -345,20 +339,13 @@ test_expect_success "Recursion picks up all submodules when necessary" '
>>>         git fetch &&
>>>         git checkout -q FETCH_HEAD
>>>       ) &&
>>>   -		head1=$(git rev-parse --short HEAD^) &&
>>>       git add subdir/deepsubmodule &&
>>>       git commit -m "new deepsubmodule" &&
>>>   -		head2=$(git rev-parse --short HEAD) &&
>>>   -		echo "Fetching submodule submodule" > ../expect.err.sub &&
>>>   -		echo "From $pwd/submodule" >> ../expect.err.sub &&
>>>   -		echo "   $head1..$head2  sub        -> origin/sub" >> ../expect.err.sub
>>>   +		git rev-parse --short HEAD >../subhead
>>>     ) &&
>>>   -	head1=$(git rev-parse --short HEAD) &&
>>>     git add submodule &&
>>>     git commit -m "new submodule" &&
>>>   -	head2=$(git rev-parse --short HEAD) &&
>>>   -	echo "From $pwd/." > expect.err.super &&
>>>   -	echo "   $head1..$head2  super      -> origin/super" >> expect.err.super &&
>>>   +	git rev-parse --short HEAD >superhead &&
>>>     (
>>>       cd downstream &&
>>>       git fetch >../actual.out 2>../actual.err
>>>
>>> In this example, we have two $head1 variables in different subshells,
>>> one of which is HEAD, but the other is HEAD^. The reason why we want
>>> HEAD^ isn't obvious (IIRC it's because the submodule upstream is 2
>>> commits ahead because we add the deepsubmodule in a separate commit), in
>>> my opinion, and I got tripped up quite a few times trying to read and
>>> understand the test. That's a lot of effort to spend on irrelevant
>>> information - the test actually cares about what it fetched, not where
>>> the ref used to be.
>>>
>>> So for that reason, I'd prefer to remove test_cmp for this test.
>>
>> I agree that it's pretty irrelevant, but I also think we'd be throwing
>> the baby out with the bath water by entirely doing away with test_cmp
>> here, there's an easier way to do this.
> [...]
>> Instead let's just test once somewhere that when we run submodule
>> fetching that submodules are indeed updated appropriately. Surely other
>> submodule tests will break if the "update" code is made to NOOP, or
>> update to the wrong HEAD>
>>
>> Then for all these test_cmp tests we can just sed-away the
>> $head1..$head2 with something like (untested):
>>
>>     sed -n -e 's/[^.]*\.\.[^.]*/OLD..NEW/g'
>>
>> I.e. let's just skip this entire ceremony with asserting the old/new
>> HEAD unless it's really needed (and then we can probably do it once
>> outside a test_cmp).
>>
>> If you grep through the test suite for "sed" adjacent to "test_cmp"
>> you'll find a lot of such examples of munging the output before
>> test_cmp-ing it.
>
> Makes sense, I hadn't considered this (though I have seen the pattern in
> the test suite, oops). The most compelling argument in favor of this is
> that this could remove a lot of the complexity of verify_fetch_result(),
> which is impeding test readability.
>
>> I.e. none of these tests surely need to test that we updated from
>> $head1..$head2 again and again with the corresponding verbosity in test
>> setup and shelling out to "git rev-parse --short HEAD" or whatever.
>
> I find the converse (we are testing the formatting over and over again)
> less convincing. In these tests, we really are checking for $head2 in
> the stderr to verify that the correct thing was fetched. I'm not
> convinced that we should be relying on _other_ submodule tests to tell
> us that submodule fetch is broken. Which brings me back to the original
> motivation of this patch..
>
>>
>> That's perfectly fine here, since the actual point of the test_cmp is to
>> check the formatting/order etc. of the output itself, not to continually
>> re-assert that submodule updating still works, and that we get the right
>> OIDs.
>
> which is that these tests actually are continually re-asserting the
> submodule updating works correctly in the different circumstances, and
> since we use the stderr to check this, test_cmp adds unwarranted noise.
>
> But you are correct in that the point of test_cmp is to check
> formatting/order etc. There is value in using test_cmp for this purpose,
> and getting rid of it entirely creates a hole in our test coverage.
> (This wouldn't mean that we'd need to use test_cmp _everywhere_ though,
> only that we need to use test_cmp _somewhere_.)
>
> As it stands:
>
> +   test_cmp can improve the readability of the test helpers and
>     debuggability of tests vs grep
> +/- test_cmp can catch formatting changes that are hard to catch
>     otherwise, though at the cost of being sensitive to _any_ formatting
>     changes
> -   test_cmp needs some munging to eliminate unnecessary information
>
> so on the whole, I think it's worth trying to use test_cmp in the test
> helper. We may not _need_ it everywhere, but if it would be nice to use
> it in as many places as possible.

I think whatever you opt to go for here makes sense. I just wanted to
provide the feedback in case it was helpful, i.e. when reading it I
found these conversions a bit odd, wondered if they were strictly needed
etc.

Thanks!

next prev parent reply	other threads:[~2022-02-17  9:26 UTC|newest]

Thread overview: 149+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-10  4:41 [PATCH 0/8] fetch --recurse-submodules: fetch unpopulated submodules Glen Choo
2022-02-10  4:41 ` [PATCH 1/8] submodule: inline submodule_commits() into caller Glen Choo
2022-02-10  4:41 ` [PATCH 2/8] submodule: store new submodule commits oid_array in a struct Glen Choo
2022-02-10 19:00   ` Jonathan Tan
2022-02-10 22:05     ` Junio C Hamano
2022-02-10  4:41 ` [PATCH 3/8] submodule: make static functions read submodules from commits Glen Choo
2022-02-10 19:15   ` Jonathan Tan
2022-02-11 10:07     ` Glen Choo
2022-02-11 10:09     ` Glen Choo
2022-02-10  4:41 ` [PATCH 4/8] t5526: introduce test helper to assert on fetches Glen Choo
2022-02-10  4:41 ` [PATCH 5/8] t5526: use grep " Glen Choo
2022-02-10 19:17   ` Jonathan Tan
2022-02-10  4:41 ` [PATCH 6/8] submodule: extract get_fetch_task() Glen Choo
2022-02-10 19:33   ` Jonathan Tan
2022-02-10  4:41 ` [PATCH 7/8] fetch: fetch unpopulated, changed submodules Glen Choo
2022-02-10 22:49   ` Junio C Hamano
2022-02-11  7:15     ` Glen Choo
2022-02-11 17:07       ` Junio C Hamano
2022-02-10 22:51   ` Jonathan Tan
2022-02-14  4:24     ` Glen Choo
2022-02-14 18:04     ` Glen Choo
2022-02-14 10:17   ` Glen Choo
2022-02-10  4:41 ` [PATCH 8/8] submodule: fix bug and remove add_submodule_odb() Glen Choo
2022-02-10 22:54   ` Junio C Hamano
2022-02-11  3:13     ` Glen Choo
2022-02-10 23:04   ` Jonathan Tan
2022-02-11  3:18     ` Glen Choo
2022-02-11 17:19     ` Junio C Hamano
2022-02-14  2:52       ` Glen Choo
2022-02-10  7:07 ` [PATCH 0/8] fetch --recurse-submodules: fetch unpopulated submodules Junio C Hamano
2022-02-10  8:51   ` Glen Choo
2022-02-10 17:40     ` Junio C Hamano
2022-02-11  2:39       ` Glen Choo
2022-02-15 17:23 ` [PATCH v2 0/9] " Glen Choo
2022-02-15 17:23   ` [PATCH v2 1/9] t5526: introduce test helper to assert on fetches Glen Choo
2022-02-15 21:37     ` Ævar Arnfjörð Bjarmason
2022-02-15 17:23   ` [PATCH v2 2/9] t5526: use grep " Glen Choo
2022-02-15 21:53     ` Ævar Arnfjörð Bjarmason
2022-02-16  3:09       ` Glen Choo
2022-02-16 10:02         ` Ævar Arnfjörð Bjarmason
2022-02-17  4:04           ` Glen Choo
2022-02-17  9:25             ` Ævar Arnfjörð Bjarmason [this message]
2022-02-17 16:16               ` Glen Choo
2022-02-15 17:23   ` [PATCH v2 3/9] submodule: make static functions read submodules from commits Glen Choo
2022-02-15 21:18     ` Jonathan Tan
2022-02-16  6:59       ` Glen Choo
2022-02-15 22:00     ` Ævar Arnfjörð Bjarmason
2022-02-16  7:08       ` Glen Choo
2022-02-15 17:23   ` [PATCH v2 4/9] submodule: inline submodule_commits() into caller Glen Choo
2022-02-15 22:02     ` Ævar Arnfjörð Bjarmason
2022-02-15 17:23   ` [PATCH v2 5/9] submodule: store new submodule commits oid_array in a struct Glen Choo
2022-02-15 21:33     ` Ævar Arnfjörð Bjarmason
2022-02-15 17:23   ` [PATCH v2 6/9] submodule: extract get_fetch_task() Glen Choo
2022-02-15 17:23   ` [PATCH v2 7/9] fetch: fetch unpopulated, changed submodules Glen Choo
2022-02-15 22:02     ` Jonathan Tan
2022-02-16  5:46       ` Glen Choo
2022-02-16  9:11         ` Glen Choo
2022-02-16  9:39           ` Ævar Arnfjörð Bjarmason
2022-02-16 17:33             ` Glen Choo
2022-02-15 22:06     ` Ævar Arnfjörð Bjarmason
2022-02-15 17:23   ` [PATCH v2 8/9] submodule: read shallows when finding " Glen Choo
2022-02-15 22:03     ` Jonathan Tan
2022-02-15 17:23   ` [PATCH v2 9/9] submodule: fix latent check_has_commit() bug Glen Choo
2022-02-15 22:04     ` Jonathan Tan
2022-02-24 10:08   ` [PATCH v3 00/10] fetch --recurse-submodules: fetch unpopulated submodules Glen Choo
2022-02-24 10:08     ` [PATCH v3 01/10] t5526: introduce test helper to assert on fetches Glen Choo
2022-02-25  0:34       ` Junio C Hamano
2022-02-24 10:08     ` [PATCH v3 02/10] t5526: stop asserting on stderr literally Glen Choo
2022-02-24 11:52       ` Ævar Arnfjörð Bjarmason
2022-02-24 16:15         ` Glen Choo
2022-02-24 18:13           ` Eric Sunshine
2022-02-24 23:05       ` Jonathan Tan
2022-02-25  2:26         ` Glen Choo
2022-02-24 10:08     ` [PATCH v3 03/10] t5526: create superproject commits with test helper Glen Choo
2022-02-24 23:14       ` Jonathan Tan
2022-02-25  2:52         ` Glen Choo
2022-02-25 11:42           ` Ævar Arnfjörð Bjarmason
2022-02-28 18:11             ` Glen Choo
2022-02-24 10:08     ` [PATCH v3 04/10] submodule: make static functions read submodules from commits Glen Choo
2022-02-24 10:08     ` [PATCH v3 05/10] submodule: inline submodule_commits() into caller Glen Choo
2022-02-24 10:08     ` [PATCH v3 06/10] submodule: store new submodule commits oid_array in a struct Glen Choo
2022-02-24 10:08     ` [PATCH v3 07/10] submodule: extract get_fetch_task() Glen Choo
2022-02-24 23:26       ` Jonathan Tan
2022-02-24 10:08     ` [PATCH v3 08/10] submodule: move logic into fetch_task_create() Glen Choo
2022-02-24 23:36       ` Jonathan Tan
2022-02-24 10:08     ` [PATCH v3 09/10] fetch: fetch unpopulated, changed submodules Glen Choo
2022-02-24 21:30       ` Junio C Hamano
2022-02-25  3:04         ` Glen Choo
2022-02-25  0:33       ` Junio C Hamano
2022-02-25  3:07         ` Glen Choo
2022-02-25  0:39       ` Jonathan Tan
2022-02-25  3:46         ` Glen Choo
2022-03-04 23:46           ` Jonathan Tan
2022-03-05  0:22             ` Glen Choo
2022-03-04 23:53           ` Jonathan Tan
2022-02-26 18:53       ` Junio C Hamano
2022-03-01 20:24         ` Johannes Schindelin
2022-03-01 20:33           ` Junio C Hamano
2022-03-02 23:25             ` Glen Choo
2022-03-01 20:32         ` Junio C Hamano
2022-02-24 10:08     ` [PATCH v3 10/10] submodule: fix latent check_has_commit() bug Glen Choo
2022-03-04  0:57     ` [PATCH v4 00/10] fetch --recurse-submodules: fetch unpopulated submodules Glen Choo
2022-03-04  0:57       ` [PATCH v4 01/10] t5526: introduce test helper to assert on fetches Glen Choo
2022-03-04  2:06         ` Junio C Hamano
2022-03-04 22:11           ` Glen Choo
2022-03-04  0:57       ` [PATCH v4 02/10] t5526: stop asserting on stderr literally Glen Choo
2022-03-04  2:12         ` Junio C Hamano
2022-03-04 22:41         ` Jonathan Tan
2022-03-04 23:48           ` Junio C Hamano
2022-03-05  0:25             ` Glen Choo
2022-03-04  0:57       ` [PATCH v4 03/10] t5526: create superproject commits with test helper Glen Choo
2022-03-04 22:59         ` Jonathan Tan
2022-03-04  0:57       ` [PATCH v4 04/10] submodule: make static functions read submodules from commits Glen Choo
2022-03-04  0:57       ` [PATCH v4 05/10] submodule: inline submodule_commits() into caller Glen Choo
2022-03-04  0:57       ` [PATCH v4 06/10] submodule: store new submodule commits oid_array in a struct Glen Choo
2022-03-04  0:57       ` [PATCH v4 07/10] submodule: extract get_fetch_task() Glen Choo
2022-03-04  0:57       ` [PATCH v4 08/10] submodule: move logic into fetch_task_create() Glen Choo
2022-03-04  0:57       ` [PATCH v4 09/10] fetch: fetch unpopulated, changed submodules Glen Choo
2022-03-04  2:37         ` Junio C Hamano
2022-03-04 22:59           ` Glen Choo
2022-03-05  0:13             ` Junio C Hamano
2022-03-05  0:37               ` Glen Choo
2022-03-08  0:11                 ` Junio C Hamano
2022-03-04 23:56         ` Jonathan Tan
2022-03-04  0:57       ` [PATCH v4 10/10] submodule: fix latent check_has_commit() bug Glen Choo
2022-03-04  2:17         ` Junio C Hamano
2022-03-04  2:22       ` [PATCH v4 00/10] fetch --recurse-submodules: fetch unpopulated submodules Junio C Hamano
2022-03-08  0:14       ` [PATCH v5 " Glen Choo
2022-03-08  0:14         ` [PATCH v5 01/10] t5526: introduce test helper to assert on fetches Glen Choo
2022-03-08  0:14         ` [PATCH v5 02/10] t5526: stop asserting on stderr literally Glen Choo
2022-03-08  0:14         ` [PATCH v5 03/10] t5526: create superproject commits with test helper Glen Choo
2022-03-08  0:14         ` [PATCH v5 04/10] submodule: make static functions read submodules from commits Glen Choo
2022-03-08  0:14         ` [PATCH v5 05/10] submodule: inline submodule_commits() into caller Glen Choo
2022-03-08  0:14         ` [PATCH v5 06/10] submodule: store new submodule commits oid_array in a struct Glen Choo
2022-03-08  0:14         ` [PATCH v5 07/10] submodule: extract get_fetch_task() Glen Choo
2022-03-08  0:14         ` [PATCH v5 08/10] submodule: move logic into fetch_task_create() Glen Choo
2022-03-08  0:14         ` [PATCH v5 09/10] fetch: fetch unpopulated, changed submodules Glen Choo
2022-03-08  0:14         ` [PATCH v5 10/10] submodule: fix latent check_has_commit() bug Glen Choo
2022-03-08  0:50         ` [PATCH v5 00/10] fetch --recurse-submodules: fetch unpopulated submodules Junio C Hamano
2022-03-08 18:24           ` Glen Choo
2022-03-09 19:13             ` Junio C Hamano
2022-03-09 19:49               ` Glen Choo
2022-03-09 20:22                 ` Junio C Hamano
2022-03-09 22:11                   ` Glen Choo
2022-03-16 21:58                     ` Glen Choo
2022-03-16 22:06                       ` Junio C Hamano
2022-03-16 22:37                         ` Glen Choo
2022-03-16 23:08                           ` Junio C Hamano
2022-03-08 21:42         ` Jonathan Tan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=220217.86iltdj12w.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=chooglen@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).