Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH v2 1/4] wt-status: read HEAD and ORIG_HEAD via the refdb
From: Junio C Hamano @ 2023-12-13 15:15 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Ramsay Jones, git, Taylor Blau, Phillip Wood
In-Reply-To: <ZXlfeWtDgr1GQFCL@tanuki>

Patrick Steinhardt <ps@pks.im> writes:

>> Me neither, but once you start thinking about getting rid of the
>> need to use one-file-per-ref filesystem, being able to maintain all
>> refs, including the pseudo refs, in one r/w store backend, becomes a
>> very tempting goal.  From that point of view, I do not have problem
>> with the idea to move _all_ pseudorefs to reftable.
>
> Yes, we're in agreement.
>
>> But I do have reservations on what Patrick, and the code he
>> inherited from Han-Wen, calls "special refs" (which is not defined
>> in the glossary at all), namely, refs.c:is_special_ref() and its
>> callers.
>
> I do not want to add "special refs" to the glossary because ultimately
> they should go away, with two exceptions: FETCH_HEAD and MERGE_HEAD.
> Once we're there we can of course discuss whether we want to explicitly
> point them out in the glossary and give them a special name.

OK, I somehow got a (wrong) impression that you are very close to
the finish line.  If the intention is to leave many others still in
the "special" category (for only the reasons of inertia), with a
vision that some selected few must remain "special" with their own
good reasons, then I am perfectly fine.

Thanks.

^ permalink raw reply

* Re: [PATCH] Use ^=1 to toggle between 0 and 1
From: Junio C Hamano @ 2023-12-13 15:17 UTC (permalink / raw)
  To: Jeff King
  Cc: René Scharfe, AtariDreams via GitGitGadget, git, Seija Kijin
In-Reply-To: <20231213080143.GA1684525@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> Without digging into the code, I had just assumed that flipped_block was
> used as an array index. But it really is a boolean, so I actually think
> "flipped_block = !flipped_block" would probably be the most clear (but
> IMHO not really worth the churn).

;-)

> I don't even know that we'd need much of a weather-balloon patch. I
> think it would be valid to do:
>
>   #ifndef bool
>   #define bool int
>
> to handle pre-C99 compilers (if there even are any these days). Of
> course we probably need some conditional magic to try to "#include
> <stdbool.h>" for the actual C99. I guess we could assume C99 by default
> and then add NO_STDBOOL as an escape hatch if anybody complains.

Sounds good.


^ permalink raw reply

* Re: [PATCH v6] subtree: fix split processing with multiple subtrees present
From: Zach FettersMoore @ 2023-12-13 15:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Christian Couder, Zach FettersMoore via GitGitGadget, git
In-Reply-To: <xmqqzfyfoy2w.fsf@gitster.g>

Christian Couder <christian.couder@gmail.com> writes:

>>> > $ git subtree split --prefix=apollo-ios-codegen --squash --rejoin
>>> > Merge made by the 'ort' strategy.
>>> > e274aed3ba6d0659fb4cc014587cf31c1e8df7f4
>>>
>>> Looking into this some it looks like it could be a bash config
>>> difference? My machine always runs it all the way through vs
>>> failing for recursion depth. Although that would also be an issue
>>> which is solved by this fix.
>>
>> I use Ubuntu where /bin/sh is dash so my current guess is that dash
>> might have a smaller recursion limit than bash.
>
> That sounds quite bad. Does it have to be recursive (iow, if we can
> rewrite the logic to be iterative instead, that would be a much better
> way to fix the issue)?

I don't think an iterative vs recursive approach fixes this
particular issue, the root of the issue this patch is fixing
is that lots of commits from the history of subtrees not
being acted upon are being processed when they don't need to
be. So the iterative approach would likely resolve the
recursion limit issue for some shells, but in my instance
I don't see a recursion limit error, it just takes an
extraordinary amount of time to run the split command
because of all the unnecessary processing which needs to be
avoided which this patch fixes.

^ permalink raw reply

* Re: Communication channel interruption
From: Junio C Hamano @ 2023-12-13 15:32 UTC (permalink / raw)
  To: Alexander Zhirov; +Cc: git
In-Reply-To: <71f2b28b-3e27-4773-8408-2f4c13757b73@gmail.com>

Alexander Zhirov <azhirov1991@gmail.com> writes:

> When cloning a repository from GitHub, I have a channel break,
> although the connection is stable. What could be the problem?
>
> # git clone https://github.com/Thinstation/thinstation.git
> Cloning into 'thinstation'...
> remote: Enumerating objects: 79839, done.
> remote: Counting objects: 100% (535/535), done.
> remote: Compressing objects: 100% (319/319), done.
> error: RPC failed; curl 92 HTTP/2 stream 5 was not closed cleanly:
> CANCEL (err 8)
> error: 7473 bytes of body are still expected
> fetch-pack: unexpected disconnect while reading sideband packet
> fatal: early EOF
> fatal: fetch-pack: invalid index-pack output

A guess in the dark without knowing Git version or platform on which
you are seeing this symptom, but it would help to read "What does
http.postBuffer really do?" in 

    $ git help faq

and then to ask Google about "RPC failed; curl 92", which would show
mainly two suggestions from various places.  Perhaps

    $ git -c http.version=HTTP/1.1 clone https://github.com/Thinstation/thinstation.git

may help?


^ permalink raw reply

* Re: Allow adding files from git submodule to parent git repo if they are in .gitignore in the submodule
From: Junio C Hamano @ 2023-12-13 15:38 UTC (permalink / raw)
  To: Alexey Murz Korepov; +Cc: git
In-Reply-To: <CAL5pyKseyxEfLbgEkDHSqWQVLGkrrcZU=BzRx2zZOaCdQbMaoA@mail.gmail.com>

Alexey Murz Korepov <murznn@gmail.com> writes:

> ```
> $ git add -f mysubmodule/config.yaml
> fatal: Pathspec 'mysubmodule/config.yaml' is in submodule 'mysubmodule'
> ```
> But I see no problems with doing this, if the file is in .gitignore of
> the mysubmodule submodule.
>
> So, let's discuss the approach of how we can allow this in git?

What should happen when mysubmodule/.gitignore changes?  Who
guarantees that such a file stolen by the superproject will stay in
the .gitignore file in the submodule?

I do not think you want to go there, and I do not think I want to
see such a behaviour in Git, as we would be the ones who will be
blamed for the resulting mess.

The right solution probably is to keep such a file in a directory
that is separate from the directory where you bind your submodule.
Track the file in the superproject, and have a build procedure to
copy it into the submodule directory (as an untracked file, of
course).

^ permalink raw reply

* Assistance Needed
From: micah veintetres @ 2023-12-13 15:52 UTC (permalink / raw)
  To: git@vger.kernel.org

I am in the middle of the worst tragedy I think man will ever see.

The only safety I have found online is with github, who has been hit with several NSLs regarding a repository I setup documenting what the massacre of my church during the height of Sunday morning worship services in April.

I'm asking that everyone please clone this repository: https://github.com/micahonamission/mcbcm.git

The government has worked tirelessly to compromise my iPhone and finally stole it at gunpoint, it is the second time they have stolen one.  My iPhones/iCloud held the only passkey to login.  So I've created a new one account and cloned the repository knowing they will be trying to modify it.  

I need clones so they cannot keep this hidden.  The existing coverup operation has turned into a genocide in Kentucky.

Please clone and fetch it often.  I'm going to pull in all of the untouched wiki content into this repository later from the existing forks.

Thank you,
-micah

p.s. Before you consume any information about my church, the Creator asks you to pray for guidance as there is alot of information that is circulating intended to confuse you or cover the matter up.  The Creator will let you know, He's just not intrusive, so you need to ask Him for some guidance first.  Thanks!

^ permalink raw reply

* End-of-line comments are prompted with "is not a valid attribute name"
From: D无数 @ 2023-12-13 17:32 UTC (permalink / raw)
  To: git

> macos 13.5.2
> git version 2.37.0 (Apple Git-136)

I use end-of-line comments in my .gitattributes file, and it is often
(not always present, but often) prompted with '# is not a valid
attribute name: .gitattributes:1' when performing many operations.

This is my .gitattributes:
```
res/csv/*.txt eol=lf # 保证csv为lf，以匹配解析格式

# Custom for Visual Studio
*.cs     diff=csharp
```

When performing certain operations:
···shell
> git status
# is not a valid attribute name: .gitattributes:1
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean
···

Other repositories that have .gitattributes files and include
end-of-line comments do the same thing. I'm sure there are no special
characters in my files.

^ permalink raw reply

* Re: [PATCH 05/24] midx: implement `DISP` chunk
From: Taylor Blau @ 2023-12-13 18:28 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Patrick Steinhardt
In-Reply-To: <20231212080332.GC1117953@coredump.intra.peff.net>

Hi Peff,

On Tue, Dec 12, 2023 at 03:03:32AM -0500, Jeff King wrote:
> [...]

Well this took me longer to respond to than I thought it would ;-).

> So it kind of seems to me that this would "just work" if
> try_partial_reuse() did something like for the midx case:
>
>   - convert offset of base object into an object id hash using the pack
>     revindex (similar to offset_to_pack_pos)
>
>   - look up the object id in the midx to get a pack/offset combo
>
>   - use the midx revindex to convert that into a bit position
>
>   - check if that bit is marked for verbatim reuse
>
> The assumption there is that in the second step, the midx has resolved
> duplicates by putting all of pack A before pack B, and so on, as above.
> It also assumes that we are trying verbatim reuse in the same order
> (though a different order would not produce wrong results, it would only
> result in less verbatim reuse).

After much thinking, I agree with your conclusion here. Which is great
news indeed, since it allows us to implement multi-pack reuse without
requiring that the candidate packs be disjoint with respect to their
objects.

Even though we have some protections in place for ensuring these packs'
disjointed-ness, I agree with Junio upthread that this is likely the
most fragile part of this series. That is, even though we check in
midx.c::get_sorted_entries() that the marked packs are disjoint, if we
miss something, the results would be rather bad. At best it would result
in sending a corrupt packfile to the client, and at worst it could
result in permanent data corruption if repacking a repository with
multi-pack reuse enabled.

> If we assume that any cross-pack deltas are a problem, we could always
> just skip them for verbatim reuse. That is, once we look up the object
> id in the midx, we can see if it's in the same pack we're currently
> processing. If not, we could punt and let the non-verbatim code paths
> handle it as usual.

Let me think aloud for a second, since it took me quite a lot of
thinking to fully wrap my head around this. Suppose we have two packs,
P1 and P2 where P1 has object A delta'd against B, and P2 has its own
copy of object B. If the MIDX chose the copy of B from P2, but also
decided to send P1 (which it chose from A), then if P1 is stored as an
OFS delta, we would be sending a corrupt packfile to the client.

I think there are a few things that we could reasonably do here:

  - Reject cross-pack deltas entirely (as you suggest). This is the
    easiest implementation choice in this already-complex series, and it
    doesn't paint us into a corner in the sense that we could relax
    these requirements in the future.

  - Turn any cross-pack deltas (which are stored as OFS_DELTAs) into
    REF_DELTAs. We already do this today when reusing an OFS_DELTA
    without `--delta-base-offset` enabled, so it's not a huge stretch to
    do the same for cross-pack deltas even when `--delta-base-offset` is
    enabled.

    This would work, but would obviously result in larger-than-necessary
    packs, as we in theory *could* represent these cross-pack deltas by
    patching an existing OFS_DELTA. But it's not clear how much that
    would matter in practice. I suspect it would have a lot to do with
    how you pack your repository in the first place.

  - Finally, we could patch OFS_DELTAs across packs in a similar fashion
    as we do today for OFS_DELTAs within a single pack on either side of
    a gap. This would result in the smallest packs of the three options
    here, but implementing this would be more involved.

    At minimum, you'd have to keep the reusable chunks list for all
    reused packs, not just the one we're currently processing. And you'd
    have to ensure that any bases which are a part of cross-pack deltas
    appear before the delta. I think this is possible to do, but would
    require assembling the reusable chunks list potentially in a
    different order than they appear in the source packs.

> And of course there's a bunch of hard work teaching all of those
> functions about midx's and multiple packs in the first place, but you
> already had to do all that in the latter part of your series, and it
> would still be applicable.

Yep, all of that is still a requirement here, and lives on in the
revised version of this series.

The naive implementation where we call try_partial_reuse() on every
object which is a candidate for reuse and check for cross-pack deltas
would work, but have poor performance. The reason is that we would be
doing a significant amount of cache-inefficient work to determine
whether or not the base for some delta/base-pair resides in the same
pack:

  - If you see a delta in some pack while processing a MIDX bitmap for
    reuse, you first have to figure out the location of its base in that
    same pack. (Note: this may or may not be the copy of the base object
    chosen by the MIDX).

  - To figure out whether or not the MIDX chose that copy, you would
    naively have to do something like:

      - Convert the base object's offset into a packfile position using
        the pack revindex.

      - Convert the base object's packfile position into an index
        position, which would then be used to obtain its OID.

      - Then binary search through the MIDX for that OID found in the
        previous step, filling out the MIDX entry for that object.

      - Toss out the cross-pack delta/base pair if the MIDX entry in the
        previous step indicates that the MIDX chose a copy of the base
        from a different pack than the one we're currently processing
        (i.e. where the delta resides).

That's rather inefficient. But, we can do better. Instead of going back
and forth through both the pack and MIDX's reverse index, we can simply
binary search in the MIDX's reverse index for the (pack_id, offset) pair
corresponding to the base.

If we find a match, then we know that the MIDX chose its copy of the
base object from the same pack, and we can reuse the delta/base-pair. If
we don't, then we know that the MIDX chose its copy of the base object
from a different pack, and we have to throw out the delta/base-pair.

This is a bit more involved than the naive implementation, but it's (a)
efficient, and (b) most of the code for it already exists in the form of
midx_to_pack_pos(), which implements a binary search over the MIDX
bitmap's pseudo-pack order.

With some light refactoring, we can repurpose this code to perform a
binary search for a given (pack, offset) pair instead of starting with a
MIDX lex position and converting it into the (pack, offset) pair. So
that works, and is what I've done in the revised version of this series.

There is one other snag, which is that we can no longer blindly reuse
whole-words from the reuse bitmap if we have non-disjoint packs. That
is, we can't do something like:

    while (pos < result->word_alloc && result->words[pos] == (eword_t)~0)
        pos++;

when processing anything but the first pack.

The reason is that we know the first pack has all duplicate object ties
broken in its favor, but we don't have the same guarantee for subsequent
packs. So we have to be more careful about which bits we reuse from
those subsequent packs, since we may inadvertently pick up a cross-pack
delta/base pair without inspecting it more closely.

As I mentioned, you can still perform this optimization over the first
pack, and I think that will be sufficient for most repositories. It's
not clear to me exactly how much this optimization is helping us in
contrast to all of the other work that pack-objects is doing, but that
is probably something worth measuring.

Thanks for the terrific suggestion. I'll clean up the results of trying
to implement it, and share it with the list soon (ideally before the end
of this week, after which I'm on vacation until the new year).

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 05/24] midx: implement `DISP` chunk
From: Junio C Hamano @ 2023-12-13 19:20 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jeff King, git, Patrick Steinhardt
In-Reply-To: <ZXn33HVBFqidAThn@nand.local>

Taylor Blau <me@ttaylorr.com> writes:

>   - Turn any cross-pack deltas (which are stored as OFS_DELTAs) into
>     REF_DELTAs. We already do this today when reusing an OFS_DELTA
>     without `--delta-base-offset` enabled, so it's not a huge stretch to
>     do the same for cross-pack deltas even when `--delta-base-offset` is
>     enabled.
>
>     This would work, but would obviously result in larger-than-necessary
>     packs, as we in theory *could* represent these cross-pack deltas by
>     patching an existing OFS_DELTA. But it's not clear how much that
>     would matter in practice. I suspect it would have a lot to do with
>     how you pack your repository in the first place.

I have to wonder if the cost of additional computation to see when
is safe to allow cross-pack delta can be kept reasonably low, as
you'd need to prove that you are not introducing a cycle by doing
so.  Compared to that, spending a dozen or so bytes for the offset
for rare cases of having to express such a cross-pack delta pair
would be much less worrisome to me.

> Thanks for the terrific suggestion. I'll clean up the results of trying
> to implement it, and share it with the list soon (ideally before the end
> of this week, after which I'm on vacation until the new year).

Good to hear that you'll be recharging soon.  Enjoy.

^ permalink raw reply

* Re: End-of-line comments are prompted with "is not a valid attribute name"
From: Junio C Hamano @ 2023-12-13 19:22 UTC (permalink / raw)
  To: D无数; +Cc: git
In-Reply-To: <CAOQ=bxz8txyOt6p5L0qfx5DFKfxUhWvHW0pJP+YTbWVfBpvYxg@mail.gmail.com>

D无数 <wushuripple@gmail.com> writes:

> This is my .gitattributes:
> ```
> res/csv/*.txt eol=lf # 保证csv为lf，以匹配解析格式

I do not think of any version of Git ignoring what you wrote after #
(including #) in the middle of the line.

> # Custom for Visual Studio

I know I designed the parser to allow this as a comment, though.

So

> When performing certain operations:
> ···shell
>> git status
> # is not a valid attribute name: .gitattributes:1

This is totally expected; nothing to see here.

^ permalink raw reply

* Re: Communication channel interruption
From: brian m. carlson @ 2023-12-14  2:27 UTC (permalink / raw)
  To: Alexander Zhirov; +Cc: git
In-Reply-To: <71f2b28b-3e27-4773-8408-2f4c13757b73@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1726 bytes --]

On 2023-12-13 at 09:54:30, Alexander Zhirov wrote:
> When cloning a repository from GitHub, I have a channel break, although the
> connection is stable. What could be the problem?
> 
> # git clone https://github.com/Thinstation/thinstation.git
> Cloning into 'thinstation'...
> remote: Enumerating objects: 79839, done.
> remote: Counting objects: 100% (535/535), done.
> remote: Compressing objects: 100% (319/319), done.
> error: RPC failed; curl 92 HTTP/2 stream 5 was not closed cleanly: CANCEL
> (err 8)
> error: 7473 bytes of body are still expected
> fetch-pack: unexpected disconnect while reading sideband packet
> fatal: early EOF
> fatal: fetch-pack: invalid index-pack output

It would be helpful to know what version of Git, libcurl, and operating
system you're using.  This type of error is often caused by proxies,
including non-default antivirus or firewall programs (very especially on
Windows), TLS middleboxes, and monitoring software.  If you have such
software, it should be completely uninstalled and you should reboot, and
usually that will fix the problem.  (Simply disabling it often does not
fix things.)

I should point out that while I'm aware that this does occur for many
people, I have never, ever seen it when using GitHub, nor do I believe
it is actually sent by GitHub servers.  Thus, my experience leads me to
believe the problem lies elsewhere.  It is theoretically possible that
libcurl or one of its dependencies has a bug in your version, but
without more details, it's hard to tell.

As Junio suggested, you could try forcing HTTP/1.1 and see if that fixes
the problem for you as well.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply

* [PATCH] tests: drop dependency on `git diff` in check-chainlint
From: Eric Sunshine @ 2023-12-14  3:22 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano, Patrick Steinhardt, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

The "check-chainlint" target runs automatically when running tests and
performs self-checks to verify that the chainlinter itself produces the
expected output. Originally, the chainlinter was implemented via sed,
but the infrastructure has been rewritten in fb41727b7e (t: retire
unused chainlint.sed, 2022-09-01) to use a Perl script instead.

The rewrite caused some slight whitespace changes in the output that are
ultimately not of much importance. In order to be able to assert that
the actual chainlinter errors match our expectations we thus have to
ignore whitespace characters when diffing them. As the `-w` flag is not
in POSIX we try to use `git diff -w --no-index` before we fall back to
non-standard `diff -w -u`.

To accommodate for cases where the host system has no Git installation
we use the locally-compiled version of Git. This can result in problems
though when the Git project's repository is using extensions that the
locally-compiled version of Git doesn't understand, in which case `git`
may refuse to run and thus cause the checks to fail.

Work around this issue by normalizing whitespace via sed before invoking
diff, which allows any platform diff implementation to be used, thus
eliminating the dependency upon `git diff` and the non-POSIX `-w` flag.

Reported-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---

This is an alternative solution to the issue Patrick's patch[1]
addresses. Hopefully, this approach should avoid the sort of push-back
Patrick's patch received[2].

I shamelessly stole most of Patrick's commit message.

The sed expressions for normalizing whitespace prior to `diff` may look
a bit hairy, but they are simple enough in concept:

* collapse runs of whitespace to a single SP
* drop blank lines (this step is not new)
* fold out possible SP at beginning and end of each line
* fold out SP surrounding common punctuation characters used in shell
  scripts, such as `>`, `|`, `;`, etc.

By the way, I'm somewhat surprised that this issue crops up at all
considering that --no-index is being used with git-diff. As such, I
would have thought that the local repository's format would not have
been interrogated at all. If that's a bug in `git diff --no-index`, then
fixing that could be considered yet another alternative solution to the
issue raised here.

[1]: https://lore.kernel.org/git/4112adbe467c14a8f22a87ea41aa4705f8760cf6.1702380646.git.ps@pks.im/
[2]: https://lore.kernel.org/git/xmqqr0jqnnmn.fsf@gitster.g/

 t/Makefile | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/t/Makefile b/t/Makefile
index 225aaf78ed..656ff10afa 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -103,20 +103,12 @@ check-chainlint:
 		echo "# chainlint: $(CHAINLINTTMP_SQ)/tests" && \
 		for i in $(CHAINLINTTESTS); do \
 			echo "# chainlint: $$i" && \
-			sed -e '/^[ 	]*$$/d' chainlint/$$i.expect; \
+			sed -e 's/[ 	][ 	]*/ /g;/^ *$$/d;s/^ //;s/ $$//;s/\([<>|();&]\) /\1/g;s/ \([<>|();&]\)/\1/g' chainlint/$$i.expect; \
 		done \
 	} >'$(CHAINLINTTMP_SQ)'/expect && \
 	$(CHAINLINT) --emit-all '$(CHAINLINTTMP_SQ)'/tests | \
-		sed -e 's/^[1-9][0-9]* //;/^[ 	]*$$/d' >'$(CHAINLINTTMP_SQ)'/actual && \
-	if test -f ../GIT-BUILD-OPTIONS; then \
-		. ../GIT-BUILD-OPTIONS; \
-	fi && \
-	if test -x ../git$$X; then \
-		DIFFW="../git$$X --no-pager diff -w --no-index"; \
-	else \
-		DIFFW="diff -w -u"; \
-	fi && \
-	$$DIFFW '$(CHAINLINTTMP_SQ)'/expect '$(CHAINLINTTMP_SQ)'/actual
+		sed -e 's/^[1-9][0-9]* //;s/[ 	][ 	]*/ /g;/^ *$$/d;s/^ //;s/ $$//;s/\([<>|();&]\) /\1/g;s/ \([<>|();&]\)/\1/g' >'$(CHAINLINTTMP_SQ)'/actual && \
+	diff -u '$(CHAINLINTTMP_SQ)'/expect '$(CHAINLINTTMP_SQ)'/actual

 test-lint: test-lint-duplicates test-lint-executable test-lint-shell-syntax \
 	test-lint-filenames
-- 
2.43.0

^ permalink raw reply related

* Re: [PATCH] tests: prefer host Git to verify chainlint self-checks
From: Eric Sunshine @ 2023-12-14  3:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Patrick Steinhardt, git
In-Reply-To: <xmqqr0jqnnmn.fsf@gitster.g>

On Wed, Dec 13, 2023 at 10:11 AM Junio C Hamano <gitster@pobox.com> wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> >> I do not think "prefer host Git" is necessarily a good idea; falling
> >> back to use host Git is perfectly fine, of course.
> >
> > Why is that, though?
>
> Mostly because your "differences in features supported by just-built
> one and what happens to be on $PATH can cause problems" cuts both
> ways [...]

I sent an alternative solution[1] which should sidestep this objection.

As usual, I forgot to use --in-reply-to=<this-thread> when sending the
patch, even though I reminded myself to use it only a minute or so
earlier. Sorry.

[1]: https://lore.kernel.org/git/20231214032248.1615-1-ericsunshine@charter.net/

^ permalink raw reply

* Re: [PATCH] mailinfo: fix out-of-bounds memory reads in unquote_quoted_pair()
From: Patrick Steinhardt @ 2023-12-14  7:41 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Taylor Blau, Carlos Andrés Ramírez Cataño
In-Reply-To: <20231213082027.GB1684525@coredump.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 2643 bytes --]

On Wed, Dec 13, 2023 at 03:20:27AM -0500, Jeff King wrote:
> On Wed, Dec 13, 2023 at 08:07:21AM +0100, Patrick Steinhardt wrote:
[snip]
> > Another thing I was wondering about is the recursive nature of these
> > functions, and whether it can lead to similar issues like we recently
> > had when parsing revisions with stack exhaustion. I _think_ this should
> > not be much of a problem in this case though, as we're talking about
> > mail headers here. While the length of header values isn't restricted
> > per se (the line length is restricted to 1000, but with Comment Folding
> > Whitespace values can span multiple lines), but mail provdiers will sure
> > clamp down on mails with a "From:" header that is megabytes in size.
> 
> It's just unquote_comment() that is recursive, but yeah, there is
> nothing to stop it from recursing forever on "((((((...". The stack
> requirements are pretty small, though. I needed between 2^17 and 2^18
> bytes to segfault on my machine using:
> 
>   perl -e 'print "From: ", "(" x 2**18;' |
>   git mailinfo /dev/null /dev/null
> 
> Absurdly big for an email, but maybe within the realm of possibility? I
> think it might be possible to drop the recursion and just use a depth
> counter, like this:

It's definitely not as large as I'd have expected it to be, we're only
talking about kilobytes of data. Feels like it might be in the realm of
possibility to get transferred by a mail provider.

> diff --git a/mailinfo.c b/mailinfo.c
> index 737b9e5e13..db236f9f9f 100644
> --- a/mailinfo.c
> +++ b/mailinfo.c
> @@ -59,6 +59,7 @@ static void parse_bogus_from(struct mailinfo *mi, const struct strbuf *line)
>  static const char *unquote_comment(struct strbuf *outbuf, const char *in)
>  {
>  	int take_next_literally = 0;
> +	int depth = 1;
>  
>  	strbuf_addch(outbuf, '(');
>  
> @@ -72,11 +73,14 @@ static const char *unquote_comment(struct strbuf *outbuf, const char *in)
>  				take_next_literally = 1;
>  				continue;
>  			case '(':
> -				in = unquote_comment(outbuf, in);
> +				strbuf_addch(outbuf, '(');
> +				depth++;
>  				continue;
>  			case ')':
>  				strbuf_addch(outbuf, ')');
> -				return in;
> +				if (!--depth)
> +					return in;
> +				continue;
>  			}
>  		}
>  
> I doubt it's a big deal either way, but if it's that easy to do it might
> be worth it.

Isn't this only protecting against unbalanced braces? That might be a
sensible check to do regardless, but does it protect against recursion
blowing the stack if you just happen to have many opening braces?

Might be I'm missing something.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH] tests: drop dependency on `git diff` in check-chainlint
From: Patrick Steinhardt @ 2023-12-14  8:05 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Eric Sunshine, Junio C Hamano, Eric Sunshine
In-Reply-To: <20231214032248.1615-1-ericsunshine@charter.net>

[-- Attachment #1: Type: text/plain, Size: 4485 bytes --]

On Wed, Dec 13, 2023 at 10:22:48PM -0500, Eric Sunshine wrote:
> From: Eric Sunshine <sunshine@sunshineco.com>
> 
> The "check-chainlint" target runs automatically when running tests and
> performs self-checks to verify that the chainlinter itself produces the
> expected output. Originally, the chainlinter was implemented via sed,
> but the infrastructure has been rewritten in fb41727b7e (t: retire
> unused chainlint.sed, 2022-09-01) to use a Perl script instead.
> 
> The rewrite caused some slight whitespace changes in the output that are
> ultimately not of much importance. In order to be able to assert that
> the actual chainlinter errors match our expectations we thus have to
> ignore whitespace characters when diffing them. As the `-w` flag is not
> in POSIX we try to use `git diff -w --no-index` before we fall back to
> non-standard `diff -w -u`.
> 
> To accommodate for cases where the host system has no Git installation
> we use the locally-compiled version of Git. This can result in problems
> though when the Git project's repository is using extensions that the
> locally-compiled version of Git doesn't understand, in which case `git`
> may refuse to run and thus cause the checks to fail.
> 
> Work around this issue by normalizing whitespace via sed before invoking
> diff, which allows any platform diff implementation to be used, thus
> eliminating the dependency upon `git diff` and the non-POSIX `-w` flag.
> 
> Reported-by: Patrick Steinhardt <ps@pks.im>
> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
> ---
> 
> This is an alternative solution to the issue Patrick's patch[1]
> addresses. Hopefully, this approach should avoid the sort of push-back
> Patrick's patch received[2].

Thanks for chiming in!

> I shamelessly stole most of Patrick's commit message.
> 
> The sed expressions for normalizing whitespace prior to `diff` may look
> a bit hairy, but they are simple enough in concept:
> 
> * collapse runs of whitespace to a single SP
> * drop blank lines (this step is not new)
> * fold out possible SP at beginning and end of each line
> * fold out SP surrounding common punctuation characters used in shell
>   scripts, such as `>`, `|`, `;`, etc.
> 
> By the way, I'm somewhat surprised that this issue crops up at all
> considering that --no-index is being used with git-diff. As such, I
> would have thought that the local repository's format would not have
> been interrogated at all. If that's a bug in `git diff --no-index`, then
> fixing that could be considered yet another alternative solution to the
> issue raised here.

This strongly reminds me of the thread at [1], where a similar issue was
discussed for git-grep(1). Quoting Junio: 

> I actually do not think these "we are allowing Git tools to be used
> on random garbage" is a good idea to begin with X-<.  If we invented
> something nice for our variant in "git grep" and wish we can use it
> outside the repository, contributing the feature to implementations
> of "grep" would have been the right way to move forward, instead of
> contaminating the codebase with things that are not related to Git.

So this might not be the best way to go.

> [1]: https://lore.kernel.org/git/4112adbe467c14a8f22a87ea41aa4705f8760cf6.1702380646.git.ps@pks.im/
> [2]: https://lore.kernel.org/git/xmqqr0jqnnmn.fsf@gitster.g/
> 
>  t/Makefile | 14 +++-----------
>  1 file changed, 3 insertions(+), 11 deletions(-)
> 
> diff --git a/t/Makefile b/t/Makefile
> index 225aaf78ed..656ff10afa 100644
> --- a/t/Makefile
> +++ b/t/Makefile
> @@ -103,20 +103,12 @@ check-chainlint:
>  		echo "# chainlint: $(CHAINLINTTMP_SQ)/tests" && \
>  		for i in $(CHAINLINTTESTS); do \
>  			echo "# chainlint: $$i" && \
> -			sed -e '/^[ 	]*$$/d' chainlint/$$i.expect; \
> +			sed -e 's/[ 	][ 	]*/ /g;/^ *$$/d;s/^ //;s/ $$//;s/\([<>|();&]\) /\1/g;s/ \([<>|();&]\)/\1/g' chainlint/$$i.expect; \

These sed expressions do look hairy indeed. I have to wonder: all that
we're doing here is to munge the expected files we already have in our
tree. Can't we fix those to look exactly like the actual results instead
and then avoid any kind of post processing altogether? If I understand
correctly the only reason we do this post processing is because the
original implementation of the chainlinter produced slightly different
whitespace.

Patrick

[1]: https://lore.kernel.org/git/xmqq7cnnpy3z.fsf@gitster.g/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH] tests: prefer host Git to verify chainlint self-checks
From: Patrick Steinhardt @ 2023-12-14  8:13 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Junio C Hamano, git
In-Reply-To: <CAPig+cRc2hW_xhJRPJmEVYik71zWLDQ_EFjBFw095OgPGYrWGg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1642 bytes --]

On Wed, Dec 13, 2023 at 10:33:00PM -0500, Eric Sunshine wrote:
> On Wed, Dec 13, 2023 at 10:11 AM Junio C Hamano <gitster@pobox.com> wrote:
> > Patrick Steinhardt <ps@pks.im> writes:
> > >> I do not think "prefer host Git" is necessarily a good idea; falling
> > >> back to use host Git is perfectly fine, of course.
> > >
> > > Why is that, though?
> >
> > Mostly because your "differences in features supported by just-built
> > one and what happens to be on $PATH can cause problems" cuts both
> > ways [...]
> 
> I sent an alternative solution[1] which should sidestep this objection.
> 
> As usual, I forgot to use --in-reply-to=<this-thread> when sending the
> patch, even though I reminded myself to use it only a minute or so
> earlier. Sorry.
> 
> [1]: https://lore.kernel.org/git/20231214032248.1615-1-ericsunshine@charter.net/

Thanks, I've replied to the thread. I think by now there are three
different ideas:

  - Improve the logic to pick some kind of diff implementation, which is
    my patch series. It would need to be improved so that we also probe
    whether the respective Git executables actually understand the repo
    format so that we can fall back from the just-built Git to system's
    Git.

  - Munge the whitespace of the expected results with some regexes.
    I like that idea better because we can avoid the git-diff(1)
    problem, but find that the result is somewhat hard to read.

  - Fix the ".expect" files so that we can avoid all of these games.

I actually like the last option most. I'll have a go at it and send this
third version out in a bit.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH v2] tests: adjust whitespace in chainlint expectations
From: Patrick Steinhardt @ 2023-12-14  8:30 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Eric Sunshine
In-Reply-To: <4112adbe467c14a8f22a87ea41aa4705f8760cf6.1702380646.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 14057 bytes --]

The "check-chainlint" target runs automatically when running tests and
performs self-checks to verify that the chainlinter itself produces the
expected output. Originally, the chainlinter was implemented via sed,
but the infrastructure has been rewritten in fb41727b7e (t: retire
unused chainlint.sed, 2022-09-01) to use a Perl script instead.

The rewrite caused some slight whitespace changes in the output that are
ultimately not of much importance. In order to be able to assert that
the actual chainlinter errors match our expectations we thus have to
ignore whitespace characters when diffing them. As the `-w` flag is not
in POSIX we try to use `git diff -w --no-index` before we fall back to
`diff -w -u`.

To accomodate for cases where the host system has no Git installation we
use the locally-compiled version of Git. This can result in problems
though when the Git project's repository is using extensions that the
locally-compiled version of Git doesn't understand. It will refuse to
run and thus cause the checks to fail.

Instead of improving the detection logic, fix our ".expect" files so
that we do not need any post-processing anymore. This allows us to drop
the `-w` flag when diffing so that we can always use diff(1) now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/Makefile                                    | 10 +-------
 t/chainlint/blank-line-before-esac.expect     |  8 +++----
 t/chainlint/block.expect                      |  4 ++--
 t/chainlint/chain-break-background.expect     |  4 ++--
 t/chainlint/chain-break-return-exit.expect    | 14 +++++------
 t/chainlint/chain-break-status.expect         |  4 ++--
 t/chainlint/chained-subshell.expect           |  4 ++--
 .../command-substitution-subsubshell.expect   |  2 +-
 t/chainlint/dqstring-line-splice.expect       |  4 ++--
 t/chainlint/dqstring-no-interpolate.expect    |  4 ++--
 t/chainlint/empty-here-doc.expect             |  4 ++--
 t/chainlint/exclamation.expect                |  2 +-
 t/chainlint/for-loop-abbreviated.expect       |  2 +-
 t/chainlint/function.expect                   |  4 ++--
 t/chainlint/here-doc.expect                   |  4 ++--
 t/chainlint/loop-detect-status.expect         | 20 ++++++++--------
 t/chainlint/nested-loop-detect-failure.expect | 24 +++++++++----------
 t/chainlint/subshell-here-doc.expect          |  4 ++--
 t/chainlint/token-pasting.expect              | 14 +++++------
 19 files changed, 64 insertions(+), 72 deletions(-)

diff --git a/t/Makefile b/t/Makefile
index 225aaf78ed..3a5d81b7fe 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -108,15 +108,7 @@ check-chainlint:
 	} >'$(CHAINLINTTMP_SQ)'/expect && \
 	$(CHAINLINT) --emit-all '$(CHAINLINTTMP_SQ)'/tests | \
 		sed -e 's/^[1-9][0-9]* //;/^[ 	]*$$/d' >'$(CHAINLINTTMP_SQ)'/actual && \
-	if test -f ../GIT-BUILD-OPTIONS; then \
-		. ../GIT-BUILD-OPTIONS; \
-	fi && \
-	if test -x ../git$$X; then \
-		DIFFW="../git$$X --no-pager diff -w --no-index"; \
-	else \
-		DIFFW="diff -w -u"; \
-	fi && \
-	$$DIFFW '$(CHAINLINTTMP_SQ)'/expect '$(CHAINLINTTMP_SQ)'/actual
+	diff -u '$(CHAINLINTTMP_SQ)'/expect '$(CHAINLINTTMP_SQ)'/actual
 
 test-lint: test-lint-duplicates test-lint-executable test-lint-shell-syntax \
 	test-lint-filenames
diff --git a/t/chainlint/blank-line-before-esac.expect b/t/chainlint/blank-line-before-esac.expect
index 48ed4eb124..056e03003d 100644
--- a/t/chainlint/blank-line-before-esac.expect
+++ b/t/chainlint/blank-line-before-esac.expect
@@ -1,11 +1,11 @@
-test_done ( ) {
+test_done () {
 	case "$test_failure" in
-	0 )
+	0)
 		test_at_end_hook_
 
 		exit 0 ;;
 
-	* )
+	*)
 		if test $test_external_has_tap -eq 0
 		then
 			say_color error "# failed $test_failure among $msg"
@@ -14,5 +14,5 @@ test_done ( ) {
 
 		exit 1 ;;
 
-		esac
+	esac
 }
diff --git a/t/chainlint/block.expect b/t/chainlint/block.expect
index a3bcea492a..1c87326364 100644
--- a/t/chainlint/block.expect
+++ b/t/chainlint/block.expect
@@ -12,9 +12,9 @@
 ) &&
 
 {
-	echo a ; ?!AMP?! echo b
+	echo a; ?!AMP?! echo b
 } &&
-{ echo a ; ?!AMP?! echo b ; } &&
+{ echo a; ?!AMP?! echo b; } &&
 
 {
 	echo "${var}9" &&
diff --git a/t/chainlint/chain-break-background.expect b/t/chainlint/chain-break-background.expect
index 28f9114f42..20d0bb5333 100644
--- a/t/chainlint/chain-break-background.expect
+++ b/t/chainlint/chain-break-background.expect
@@ -1,9 +1,9 @@
 JGIT_DAEMON_PID= &&
 git init --bare empty.git &&
-> empty.git/git-daemon-export-ok &&
+>empty.git/git-daemon-export-ok &&
 mkfifo jgit_daemon_output &&
 {
-	jgit daemon --port="$JGIT_DAEMON_PORT" . > jgit_daemon_output &
+	jgit daemon --port="$JGIT_DAEMON_PORT" . >jgit_daemon_output &
 	JGIT_DAEMON_PID=$!
 } &&
 test_expect_code 2 git ls-remote --exit-code git://localhost:$JGIT_DAEMON_PORT/empty.git
diff --git a/t/chainlint/chain-break-return-exit.expect b/t/chainlint/chain-break-return-exit.expect
index 1732d221c3..4cd18e2edf 100644
--- a/t/chainlint/chain-break-return-exit.expect
+++ b/t/chainlint/chain-break-return-exit.expect
@@ -1,16 +1,16 @@
 case "$(git ls-files)" in
-one ) echo pass one ;;
-* ) echo bad one ; return 1 ;;
+one) echo pass one ;;
+*) echo bad one; return 1 ;;
 esac &&
 (
 	case "$(git ls-files)" in
-	two ) echo pass two ;;
-	* ) echo bad two ; exit 1 ;;
-esac
+	two) echo pass two ;;
+	*) echo bad two; exit 1 ;;
+	esac
 ) &&
 case "$(git ls-files)" in
-dir/two"$LF"one ) echo pass both ;;
-* ) echo bad ; return 1 ;;
+dir/two"$LF"one) echo pass both ;;
+*) echo bad; return 1 ;;
 esac &&
 
 for i in 1 2 3 4 ; do
diff --git a/t/chainlint/chain-break-status.expect b/t/chainlint/chain-break-status.expect
index f4bada9463..e6b3b2193e 100644
--- a/t/chainlint/chain-break-status.expect
+++ b/t/chainlint/chain-break-status.expect
@@ -1,7 +1,7 @@
-OUT=$(( ( large_git ; echo $? 1 >& 3 ) | : ) 3 >& 1) &&
+OUT=$( ((large_git; echo $? 1>&3) | :) 3>&1 ) &&
 test_match_signal 13 "$OUT" &&
 
-{ test-tool sigchain > actual ; ret=$? ; } &&
+{ test-tool sigchain >actual; ret=$?; } &&
 {
 	test_match_signal 15 "$ret" ||
 	test "$ret" = 3
diff --git a/t/chainlint/chained-subshell.expect b/t/chainlint/chained-subshell.expect
index af0369d328..83810ea7ec 100644
--- a/t/chainlint/chained-subshell.expect
+++ b/t/chainlint/chained-subshell.expect
@@ -4,7 +4,7 @@ mkdir sub && (
 	nuff said
 ) &&
 
-cut "-d " -f actual | ( read s1 s2 s3 &&
+cut "-d " -f actual | (read s1 s2 s3 &&
 test -f $s1 ?!AMP?!
 test $(cat $s2) = tree2path1 &&
-test $(cat $s3) = tree3path1 )
+test $(cat $s3) = tree3path1)
diff --git a/t/chainlint/command-substitution-subsubshell.expect b/t/chainlint/command-substitution-subsubshell.expect
index ab2f79e845..ec42f2c30c 100644
--- a/t/chainlint/command-substitution-subsubshell.expect
+++ b/t/chainlint/command-substitution-subsubshell.expect
@@ -1,2 +1,2 @@
-OUT=$(( ( large_git 1 >& 3 ) | : ) 3 >& 1) &&
+OUT=$( ((large_git 1>&3) | :) 3>&1 ) &&
 test_match_signal 13 "$OUT"
diff --git a/t/chainlint/dqstring-line-splice.expect b/t/chainlint/dqstring-line-splice.expect
index bf9ced60d4..f8fa3bc969 100644
--- a/t/chainlint/dqstring-line-splice.expect
+++ b/t/chainlint/dqstring-line-splice.expect
@@ -1,3 +1,3 @@
-echo 'fatal: reword option of --fixup is mutually exclusive with' '--patch/--interactive/--all/--include/--only' > expect &&
-test_must_fail git commit --fixup=reword:HEAD~ $1 2 > actual &&
+echo 'fatal: reword option of --fixup is mutually exclusive with'	'--patch/--interactive/--all/--include/--only' >expect &&
+test_must_fail git commit --fixup=reword:HEAD~ $1 2>actual &&
 test_cmp expect actual
diff --git a/t/chainlint/dqstring-no-interpolate.expect b/t/chainlint/dqstring-no-interpolate.expect
index 10724987a5..2c1851a3c3 100644
--- a/t/chainlint/dqstring-no-interpolate.expect
+++ b/t/chainlint/dqstring-no-interpolate.expect
@@ -6,6 +6,6 @@ grep "^\.git$" output.txt &&
 (
 	cd client$version &&
 	GIT_TEST_PROTOCOL_VERSION=$version git fetch-pack --no-progress .. $(cat ../input)
-) > output &&
-	cut -d ' ' -f 2 < output | sort > actual &&
+) >output &&
+	cut -d ' ' -f 2 <output | sort >actual &&
 	test_cmp expect actual
diff --git a/t/chainlint/empty-here-doc.expect b/t/chainlint/empty-here-doc.expect
index e8733c97c6..8507721192 100644
--- a/t/chainlint/empty-here-doc.expect
+++ b/t/chainlint/empty-here-doc.expect
@@ -1,4 +1,4 @@
-git ls-tree $tree path > current &&
-cat > expected <<\EOF &&
+git ls-tree $tree path >current &&
+cat >expected <<\EOF &&
 EOF
 test_output
diff --git a/t/chainlint/exclamation.expect b/t/chainlint/exclamation.expect
index 2d961a58c6..765a35bb4c 100644
--- a/t/chainlint/exclamation.expect
+++ b/t/chainlint/exclamation.expect
@@ -1,4 +1,4 @@
-if ! condition ; then echo nope ; else yep ; fi &&
+if ! condition; then echo nope; else yep; fi &&
 test_prerequisite !MINGW &&
 mail uucp!address &&
 echo !whatever!
diff --git a/t/chainlint/for-loop-abbreviated.expect b/t/chainlint/for-loop-abbreviated.expect
index a21007a63f..02c0d15cca 100644
--- a/t/chainlint/for-loop-abbreviated.expect
+++ b/t/chainlint/for-loop-abbreviated.expect
@@ -1,5 +1,5 @@
 for it
 do
-	path=$(expr "$it" : ( [^:]*) ) &&
+	path=$(expr "$it" : ([^:]*)) &&
 	git update-index --add "$path" || exit
 done
diff --git a/t/chainlint/function.expect b/t/chainlint/function.expect
index a14388e6b9..dd7c997a3c 100644
--- a/t/chainlint/function.expect
+++ b/t/chainlint/function.expect
@@ -1,8 +1,8 @@
-sha1_file ( ) {
+sha1_file() {
 	echo "$*" | sed "s#..#.git/objects/&/#"
 } &&
 
-remove_object ( ) {
+remove_object() {
 	file=$(sha1_file "$*") &&
 	test -e "$file" ?!AMP?!
 	rm -f "$file"
diff --git a/t/chainlint/here-doc.expect b/t/chainlint/here-doc.expect
index 1df3f78282..91b961242a 100644
--- a/t/chainlint/here-doc.expect
+++ b/t/chainlint/here-doc.expect
@@ -1,6 +1,6 @@
 boodle wobba \
-	gorgo snoot \
-	wafta snurb <<EOF &&
+       gorgo snoot \
+       wafta snurb <<EOF &&
 quoth the raven,
 nevermore...
 EOF
diff --git a/t/chainlint/loop-detect-status.expect b/t/chainlint/loop-detect-status.expect
index 24da9e86d5..7ce3a34806 100644
--- a/t/chainlint/loop-detect-status.expect
+++ b/t/chainlint/loop-detect-status.expect
@@ -1,18 +1,18 @@
-( while test $i -le $blobcount
-do
-	printf "Generating blob $i/$blobcount\r" >& 2 &&
+(while test $i -le $blobcount
+ do
+	printf "Generating blob $i/$blobcount\r" >&2 &&
 	printf "blob\nmark :$i\ndata $blobsize\n" &&
 	#test-tool genrandom $i $blobsize &&
 	printf "%-${blobsize}s" $i &&
 	echo "M 100644 :$i $i" >> commit &&
 	i=$(($i+1)) ||
 	echo $? > exit-status
-done &&
-echo "commit refs/heads/main" &&
-echo "author A U Thor <author@email.com> 123456789 +0000" &&
-echo "committer C O Mitter <committer@email.com> 123456789 +0000" &&
-echo "data 5" &&
-echo ">2gb" &&
-cat commit ) |
+ done &&
+ echo "commit refs/heads/main" &&
+ echo "author A U Thor <author@email.com> 123456789 +0000" &&
+ echo "committer C O Mitter <committer@email.com> 123456789 +0000" &&
+ echo "data 5" &&
+ echo ">2gb" &&
+ cat commit) |
 git fast-import --big-file-threshold=2 &&
 test ! -f exit-status
diff --git a/t/chainlint/nested-loop-detect-failure.expect b/t/chainlint/nested-loop-detect-failure.expect
index 4793a0e8e1..3461df40e5 100644
--- a/t/chainlint/nested-loop-detect-failure.expect
+++ b/t/chainlint/nested-loop-detect-failure.expect
@@ -1,31 +1,31 @@
-for i in 0 1 2 3 4 5 6 7 8 9 ;
+for i in 0 1 2 3 4 5 6 7 8 9;
 do
-	for j in 0 1 2 3 4 5 6 7 8 9 ;
+	for j in 0 1 2 3 4 5 6 7 8 9;
 	do
-		echo "$i$j" > "path$i$j" ?!LOOP?!
+		echo "$i$j" >"path$i$j" ?!LOOP?!
 	done ?!LOOP?!
 done &&
 
-for i in 0 1 2 3 4 5 6 7 8 9 ;
+for i in 0 1 2 3 4 5 6 7 8 9;
 do
-	for j in 0 1 2 3 4 5 6 7 8 9 ;
+	for j in 0 1 2 3 4 5 6 7 8 9;
 	do
-		echo "$i$j" > "path$i$j" || return 1
+		echo "$i$j" >"path$i$j" || return 1
 	done
 done &&
 
-for i in 0 1 2 3 4 5 6 7 8 9 ;
+for i in 0 1 2 3 4 5 6 7 8 9;
 do
-	for j in 0 1 2 3 4 5 6 7 8 9 ;
+	for j in 0 1 2 3 4 5 6 7 8 9;
 	do
-		echo "$i$j" > "path$i$j" ?!LOOP?!
+		echo "$i$j" >"path$i$j" ?!LOOP?!
 	done || return 1
 done &&
 
-for i in 0 1 2 3 4 5 6 7 8 9 ;
+for i in 0 1 2 3 4 5 6 7 8 9;
 do
-	for j in 0 1 2 3 4 5 6 7 8 9 ;
+	for j in 0 1 2 3 4 5 6 7 8 9;
 	do
-		echo "$i$j" > "path$i$j" || return 1
+		echo "$i$j" >"path$i$j" || return 1
 	done || return 1
 done
diff --git a/t/chainlint/subshell-here-doc.expect b/t/chainlint/subshell-here-doc.expect
index 52789278d1..75d6f607e2 100644
--- a/t/chainlint/subshell-here-doc.expect
+++ b/t/chainlint/subshell-here-doc.expect
@@ -1,7 +1,7 @@
 (
 	echo wobba \
-		gorgo snoot \
-		wafta snurb <<-EOF &&
+	       gorgo snoot \
+	       wafta snurb <<-EOF &&
 	quoth the raven,
 	nevermore...
 	EOF
diff --git a/t/chainlint/token-pasting.expect b/t/chainlint/token-pasting.expect
index 342360bcd0..6a387917a7 100644
--- a/t/chainlint/token-pasting.expect
+++ b/t/chainlint/token-pasting.expect
@@ -4,22 +4,22 @@ git config filter.rot13.clean ./rot13.sh &&
 {
     echo "*.t filter=rot13" ?!AMP?!
     echo "*.i ident"
-} > .gitattributes &&
+} >.gitattributes &&
 
 {
     echo a b c d e f g h i j k l m ?!AMP?!
     echo n o p q r s t u v w x y z ?!AMP?!
     echo '$Id$'
-} > test &&
-cat test > test.t &&
-cat test > test.o &&
-cat test > test.i &&
+} >test &&
+cat test >test.t &&
+cat test >test.o &&
+cat test >test.i &&
 git add test test.t test.i &&
 rm -f test test.t test.i &&
 git checkout -- test test.t test.i &&
 
-echo "content-test2" > test2.o &&
-echo "content-test3 - filename with special characters" > "test3 'sq',$x=.o" ?!AMP?!
+echo "content-test2" >test2.o &&
+echo "content-test3 - filename with special characters" >"test3 'sq',$x=.o" ?!AMP?!
 
 downstream_url_for_sed=$(
 	printf "%sn" "$downstream_url" |
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* Re: [PATCH] tests: drop dependency on `git diff` in check-chainlint
From: Eric Sunshine @ 2023-12-14  8:31 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Eric Sunshine, git, Junio C Hamano
In-Reply-To: <ZXq3YdK2RSKF3npE@tanuki>

On Thu, Dec 14, 2023 at 3:05 AM Patrick Steinhardt <ps@pks.im> wrote:
> On Wed, Dec 13, 2023 at 10:22:48PM -0500, Eric Sunshine wrote:
> > This is an alternative solution to the issue Patrick's patch[1]
> > addresses. Hopefully, this approach should avoid the sort of push-back
> > Patrick's patch received[2].
> >
> > By the way, I'm somewhat surprised that this issue crops up at all
> > considering that --no-index is being used with git-diff. As such, I
> > would have thought that the local repository's format would not have
> > been interrogated at all. If that's a bug in `git diff --no-index`, then
> > fixing that could be considered yet another alternative solution to the
> > issue raised here.
>
> This strongly reminds me of the thread at [1], where a similar issue was
> discussed for git-grep(1). Quoting Junio:
>
> > I actually do not think these "we are allowing Git tools to be used
> > on random garbage" is a good idea to begin with X-<.  If we invented
> > something nice for our variant in "git grep" and wish we can use it
> > outside the repository, contributing the feature to implementations
> > of "grep" would have been the right way to move forward, instead of
> > contaminating the codebase with things that are not related to Git.
>
> So this might not be the best way to go.

I recall Junio mentioning that, and I'm fine with the conclusion that
"fixing" --no-index is counter to the project's goals.

> > -                     sed -e '/^[     ]*$$/d' chainlint/$$i.expect; \
> > +                     sed -e 's/[     ][      ]*/ /g;/^ *$$/d;s/^ //;s/ $$//;s/\([<>|();&]\) /\1/g;s/ \([<>|();&]\)/\1/g' chainlint/$$i.expect; \
> >
> > The sed expressions for normalizing whitespace prior to `diff` may look
> > a bit hairy, but they are simple enough in concept:
> >
> > * collapse runs of whitespace to a single SP
> > * drop blank lines (this step is not new)
> > * fold out possible SP at beginning and end of each line
> > * fold out SP surrounding common punctuation characters used in shell
> >   scripts, such as `>`, `|`, `;`, etc.
>
> These sed expressions do look hairy indeed. I have to wonder: all that
> we're doing here is to munge the expected files we already have in our
> tree. Can't we fix those to look exactly like the actual results instead
> and then avoid any kind of post processing altogether? If I understand
> correctly the only reason we do this post processing is because the
> original implementation of the chainlinter produced slightly different
> whitespace.

Yes and no. It's not just whitespace.

I did strongly consider submitting patches to fix all the whitespace
differences in the "expect" files when chainlint.pl replaced
chainlint.sed, but I particularly didn't want to plague the mailing
list with such noise. It's really just unnecessary churn since it's so
easy to work around it with minor sed magic.

And time tells me that that was probably the correct decision since
the output of chainlint.pl has changed multiple times. Even the output
of chainlint.sed wasn't necessarily stable[1]. Then, of course
chainlint.pl replaced[2] chainlint.sed. The original implementation or
chainlint.pl just dumped out the parsed token stream, but [3] improved
it to preserve the original formatting of the test snippet, and [4]
annotated the output with line numbers of the original test snippet.
Had those changes been accompanied by extra patches to "fix" the
"expect" files to suit, it would have been just that much more noise
both in terms of patches to review and in terms of churn in the actual
history.

And, who knows, the output of chainlint.pl might change/improve again
some day. So, I still favor using sed to smooth over these minor
differences rather than "fixing" the "expect" file repeatedly to
adjust them for changes which are not significant to what is actually
being tested.

[1]: d73f5cfa89 (chainlint.sed: stop splitting "(..." into separate
lines "(" and "...", 2021-12-13)
[2]: d00113ec34 (t/Makefile: apply chainlint.pl to existing
self-tests, 2022-09-01)
[3]: 73c768dae9 (chainlint: annotate original test definition rather
than token stream, 2022-11-08)
[4]: 48d69d8f2f (chainlint: prefix annotated test definition with line
numbers, 2022-11-11)

^ permalink raw reply

* Re: [PATCH] tests: prefer host Git to verify chainlint self-checks
From: Eric Sunshine @ 2023-12-14  8:39 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Junio C Hamano, git
In-Reply-To: <ZXq5GL723v4E3_IH@tanuki>

On Thu, Dec 14, 2023 at 3:13 AM Patrick Steinhardt <ps@pks.im> wrote:
> On Wed, Dec 13, 2023 at 10:33:00PM -0500, Eric Sunshine wrote:
> > On Wed, Dec 13, 2023 at 10:11 AM Junio C Hamano <gitster@pobox.com> wrote:
> > > Mostly because your "differences in features supported by just-built
> > > one and what happens to be on $PATH can cause problems" cuts both
> > > ways [...]
> >
> > I sent an alternative solution[1] which should sidestep this objection.
>
> Thanks, I've replied to the thread. I think by now there are three
> different ideas:
>
>   - Improve the logic to pick some kind of diff implementation, which is
>     my patch series. It would need to be improved so that we also probe
>     whether the respective Git executables actually understand the repo
>     format so that we can fall back from the just-built Git to system's
>     Git.
>
>   - Munge the whitespace of the expected results with some regexes.
>     I like that idea better because we can avoid the git-diff(1)
>     problem, but find that the result is somewhat hard to read.
>
>   - Fix the ".expect" files so that we can avoid all of these games.
>
> I actually like the last option most. I'll have a go at it and send this
> third version out in a bit.

I sent a reply[1] in the other thread explaining why I'm still leaning
toward `sed` to smooth over these minor differences rather than
churning the "expect" files, especially since the minor differences
are not significant to what is actually being tested. That said, I
won't stand in the way of such a patch to "fix" the "expect" files,
but it feels unnecessary.

[1]: https://lore.kernel.org/git/CAPig+cSkuRfkR2D3JqYcbaJqj485nfD9Nq6pM=vXWB5DJenWpA@mail.gmail.com/

^ permalink raw reply

* Re: [PATCH] tests: prefer host Git to verify chainlint self-checks
From: Patrick Steinhardt @ 2023-12-14  8:40 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Junio C Hamano, git
In-Reply-To: <CAPig+cQ2-PB24n0xfcoSy_1UT-VbEZUXXJ9QbA8FBA8Vfyd6Ng@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2038 bytes --]

On Thu, Dec 14, 2023 at 03:39:17AM -0500, Eric Sunshine wrote:
> On Thu, Dec 14, 2023 at 3:13 AM Patrick Steinhardt <ps@pks.im> wrote:
> > On Wed, Dec 13, 2023 at 10:33:00PM -0500, Eric Sunshine wrote:
> > > On Wed, Dec 13, 2023 at 10:11 AM Junio C Hamano <gitster@pobox.com> wrote:
> > > > Mostly because your "differences in features supported by just-built
> > > > one and what happens to be on $PATH can cause problems" cuts both
> > > > ways [...]
> > >
> > > I sent an alternative solution[1] which should sidestep this objection.
> >
> > Thanks, I've replied to the thread. I think by now there are three
> > different ideas:
> >
> >   - Improve the logic to pick some kind of diff implementation, which is
> >     my patch series. It would need to be improved so that we also probe
> >     whether the respective Git executables actually understand the repo
> >     format so that we can fall back from the just-built Git to system's
> >     Git.
> >
> >   - Munge the whitespace of the expected results with some regexes.
> >     I like that idea better because we can avoid the git-diff(1)
> >     problem, but find that the result is somewhat hard to read.
> >
> >   - Fix the ".expect" files so that we can avoid all of these games.
> >
> > I actually like the last option most. I'll have a go at it and send this
> > third version out in a bit.
> 
> I sent a reply[1] in the other thread explaining why I'm still leaning
> toward `sed` to smooth over these minor differences rather than
> churning the "expect" files, especially since the minor differences
> are not significant to what is actually being tested. That said, I
> won't stand in the way of such a patch to "fix" the "expect" files,
> but it feels unnecessary.
> 
> [1]: https://lore.kernel.org/git/CAPig+cSkuRfkR2D3JqYcbaJqj485nfD9Nq6pM=vXWB5DJenWpA@mail.gmail.com/

Yeah, our mails indeed crossed. I personally do not mind much which of
our patches land upstream and would be happy with either.

Thanks!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH v2] tests: adjust whitespace in chainlint expectations
From: Eric Sunshine @ 2023-12-14  8:44 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Junio C Hamano
In-Reply-To: <aec86a15c69aa276eee4875fad208ee2fc57635a.1702542564.git.ps@pks.im>

On Thu, Dec 14, 2023 at 3:30 AM Patrick Steinhardt <ps@pks.im> wrote:
> [...]
> Instead of improving the detection logic, fix our ".expect" files so
> that we do not need any post-processing anymore. This allows us to drop
> the `-w` flag when diffing so that we can always use diff(1) now.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> diff --git a/t/Makefile b/t/Makefile
> @@ -108,15 +108,7 @@ check-chainlint:
>         } >'$(CHAINLINTTMP_SQ)'/expect && \
>         $(CHAINLINT) --emit-all '$(CHAINLINTTMP_SQ)'/tests | \
>                 sed -e 's/^[1-9][0-9]* //;/^[   ]*$$/d' >'$(CHAINLINTTMP_SQ)'/actual && \
> -       if test -f ../GIT-BUILD-OPTIONS; then \
> -               . ../GIT-BUILD-OPTIONS; \
> -       fi && \
> -       if test -x ../git$$X; then \
> -               DIFFW="../git$$X --no-pager diff -w --no-index"; \
> -       else \
> -               DIFFW="diff -w -u"; \
> -       fi && \
> -       $$DIFFW '$(CHAINLINTTMP_SQ)'/expect '$(CHAINLINTTMP_SQ)'/actual
> +       diff -u '$(CHAINLINTTMP_SQ)'/expect '$(CHAINLINTTMP_SQ)'/actual

This change feels incomplete. Yes, it "corrects" the whitespace, but
check-chainlint still applies sed to remove blank lines and strip the
line numbers from the beginning of each line. Based upon the commit
message, I had expected that all post-processing of the output would
have been eliminated and that the "expect" files would exactly mirror
the output of the linter.

As mentioned elsewhere[1], I'm not particularly in favor of this
approach, though I won't stand in the way of it either.

[1]: https://lore.kernel.org/git/CAPig+cSkuRfkR2D3JqYcbaJqj485nfD9Nq6pM=vXWB5DJenWpA@mail.gmail.com/

^ permalink raw reply

* Re: [PATCH v2 1/4] wt-status: read HEAD and ORIG_HEAD via the refdb
From: Patrick Steinhardt @ 2023-12-14  9:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ramsay Jones, git, Taylor Blau, Phillip Wood
In-Reply-To: <xmqqmsuennfu.fsf@gitster.g>

[-- Attachment #1: Type: text/plain, Size: 2853 bytes --]

On Wed, Dec 13, 2023 at 07:15:33AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> >> Me neither, but once you start thinking about getting rid of the
> >> need to use one-file-per-ref filesystem, being able to maintain all
> >> refs, including the pseudo refs, in one r/w store backend, becomes a
> >> very tempting goal.  From that point of view, I do not have problem
> >> with the idea to move _all_ pseudorefs to reftable.
> >
> > Yes, we're in agreement.
> >
> >> But I do have reservations on what Patrick, and the code he
> >> inherited from Han-Wen, calls "special refs" (which is not defined
> >> in the glossary at all), namely, refs.c:is_special_ref() and its
> >> callers.
> >
> > I do not want to add "special refs" to the glossary because ultimately
> > they should go away, with two exceptions: FETCH_HEAD and MERGE_HEAD.
> > Once we're there we can of course discuss whether we want to explicitly
> > point them out in the glossary and give them a special name.
> 
> OK, I somehow got a (wrong) impression that you are very close to
> the finish line.

You mean with the reftable backend? I indeed am quite close, I've just
finished the last prerequisite ("extensions.refFormat" and related
tooling) today. I will send that patch series upstream for review once
my patches that fix repo initialization with git-clone(1) land in the
"next" branch. The current state at [1] passes CI, even though there
will of course still be bugs which aren't covered by the test suite.

So once all prerequisites that are currently in flight plus the pending
"extensions.refFormat" series have landed I will send the reftable
backend implementation in for review. If things continue to go smoothly
I expect that this may happen at the end of January/start of February.

Anyway. This patch series here is in fact already sufficient to make
reftables work with those special refs. The only thing that we require
in this context is that refs are either exclusively routed through the
filesystem, or exclusively routed through the ref API. If that property
holds then things work just fine.

But still, I do want to clean up the remaining special refs regardless
of that, even though it is not a mandatory prerequisite. I find that the
current state is just plain confusing with all these special cases, and
I'd really love for it to be simplified. Also, I think there is benefit
in having those refs in reftables because it does allow for proper
atomic updates.

> If the intention is to leave many others still in
> the "special" category (for only the reasons of inertia), with a
> vision that some selected few must remain "special" with their own
> good reasons, then I am perfectly fine.

Okay.

Patrick

[1]: https://gitlab.com/gitlab-org/git/-/merge_requests/58

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH v2 0/2] jx/fetch-atomic-error-message-fix
From: Jiang Xin @ 2023-12-14 12:33 UTC (permalink / raw)
  To: Git List, Junio C Hamano, Patrick Steinhardt; +Cc: Jiang Xin
In-Reply-To: <38b0b22038399265407f7fc5f126f471dcc6f1a3.1697725898.git.zhiyou.jx@alibaba-inc.com>

From: Jiang Xin <zhiyou.jx@alibaba-inc.com>

# Changes since v1:

1. Add a "test_commit ..." command in test case of t5574, so we can run
   test cases 4-6 individually.

2. Improve commit logs.


# range-diff v1...v2

1:  8c85f83e66 ! 1:  210191917b t5574: test porcelain output of atomic fetch
    @@ Commit message
     
             test_must_be_empty stderr
     
    -    Refactor this test case to run it twice. The first time will be run
    -    using non-atomic fetch and the other time will be run using atomic
    -    fetch. We can see that the above assertion fails for atomic get, as
    -    shown below:
    +    But this assertion fails if using atomic fetch. Refactor this test case
    +    to use different fetch options by splitting it into three test cases.
     
    +      1. "setup for fetch porcelain output".
    +
    +      2. "fetch porcelain output": for non-atomic fetch.
    +
    +      3. "fetch porcelain output (atomic)": for atomic fetch.
    +
    +    Add new command "test_commit ..." in the first test case, so that if we
    +    run these test cases individually (--run=4-6), "git rev-parse HEAD~"
    +    command will work properly. Run the above test cases, we can find that
    +    one test case has a known breakage, as shown below:
    +
    +        ok 4 - setup for fetch porcelain output
             ok 5 - fetch porcelain output  # TODO known breakage vanished
             not ok 6 - fetch porcelain output (atomic) # TODO known breakage
     
    -    The failed test case had an error message with only the error prompt but
    +    The failed test case has an error message with only the error prompt but
         no message body, as follows:
     
             'stderr' is not empty, it contains:
    @@ Commit message
         In a later commit, we will fix this issue.
     
         Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
     
      ## t/t5574-fetch-output.sh ##
     @@ t/t5574-fetch-output.sh: test_expect_success 'fetch compact output' '
    @@ t/t5574-fetch-output.sh: test_expect_success 'fetch compact output' '
     +test_expect_success 'setup for fetch porcelain output' '
      	# Set up a bunch of references that we can use to demonstrate different
      	# kinds of flag symbols in the output format.
    ++	test_commit commit-for-porcelain-output &&
      	MAIN_OLD=$(git rev-parse HEAD) &&
    + 	git branch "fast-forward" &&
    + 	git branch "deleted-branch" &&
     @@ t/t5574-fetch-output.sh: test_expect_success 'fetch porcelain output' '
      	FORCE_UPDATED_OLD=$(git rev-parse HEAD) &&
      	git checkout main &&
2:  d3184a9d0f ! 2:  6fb83a0000 fetch: no redundant error message for atomic fetch
    @@ Commit message
         will appear at the end of do_fetch(). It was introduced in b3a804663c
         (fetch: make `--atomic` flag cover backfilling of tags, 2022-02-17).
     
    -    Instead of displaying the error message unconditionally, the final error
    -    output should follow the pattern in update-ref.c and files-backend.c as
    -    follows:
    +    In function do_fetch(), a failure message is already shown before the
    +    retcode is set, so we should not call additional error() at the end of
    +    this function.
    +
    +    We can remove the redundant error() function, because we know that
    +    the function ref_transaction_abort() never fails. While we can find a
    +    common pattern for calling ref_transaction_abort() by running command
    +    "git grep -A1 ref_transaction_abort", e.g.:
     
             if (ref_transaction_abort(transaction, &error))
                 error("abort: %s", error.buf);
     
    -    This will fix the test case "fetch porcelain output (atomic)" in t5574.
    +    We can fix this issue follow this pattern, and the test case "fetch
    +    porcelain output (atomic)" in t5574 will also be fixed. If in the future
    +    we decide that we don't need to check the return value of the function
    +    ref_transaction_abort(), this change can be fixed along with it.
     
         Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
     
      ## builtin/fetch.c ##
     @@ builtin/fetch.c: static int do_fetch(struct transport *transport,

Jiang Xin (2):
  t5574: test porcelain output of atomic fetch
  fetch: no redundant error message for atomic fetch

 builtin/fetch.c         |  4 +-
 t/t5574-fetch-output.sh | 97 ++++++++++++++++++++++++-----------------
 2 files changed, 59 insertions(+), 42 deletions(-)

-- 
2.41.0.232.g2f6f0bca4f.agit.8.0.4.dev


^ permalink raw reply

* [PATCH v2 1/2] t5574: test porcelain output of atomic fetch
From: Jiang Xin @ 2023-12-14 12:33 UTC (permalink / raw)
  To: Git List, Junio C Hamano, Patrick Steinhardt; +Cc: Jiang Xin
In-Reply-To: <cover.1702556642.git.zhiyou.jx@alibaba-inc.com>

From: Jiang Xin <zhiyou.jx@alibaba-inc.com>

The test case "fetch porcelain output" checks output of the fetch
command. The error output must be empty with the follow assertion:

    test_must_be_empty stderr

But this assertion fails if using atomic fetch. Refactor this test case
to use different fetch options by splitting it into three test cases.

  1. "setup for fetch porcelain output".

  2. "fetch porcelain output": for non-atomic fetch.

  3. "fetch porcelain output (atomic)": for atomic fetch.

Add new command "test_commit ..." in the first test case, so that if we
run these test cases individually (--run=4-6), "git rev-parse HEAD~"
command will work properly. Run the above test cases, we can find that
one test case has a known breakage, as shown below:

    ok 4 - setup for fetch porcelain output
    ok 5 - fetch porcelain output  # TODO known breakage vanished
    not ok 6 - fetch porcelain output (atomic) # TODO known breakage

The failed test case has an error message with only the error prompt but
no message body, as follows:

    'stderr' is not empty, it contains:
    error:

In a later commit, we will fix this issue.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
---
 t/t5574-fetch-output.sh | 97 ++++++++++++++++++++++++-----------------
 1 file changed, 58 insertions(+), 39 deletions(-)

diff --git a/t/t5574-fetch-output.sh b/t/t5574-fetch-output.sh
index 90e6dcb9a7..bc747efefc 100755
--- a/t/t5574-fetch-output.sh
+++ b/t/t5574-fetch-output.sh
@@ -61,11 +61,10 @@ test_expect_success 'fetch compact output' '
 	test_cmp expect actual
 '
 
-test_expect_success 'fetch porcelain output' '
-	test_when_finished "rm -rf porcelain" &&
-
+test_expect_success 'setup for fetch porcelain output' '
 	# Set up a bunch of references that we can use to demonstrate different
 	# kinds of flag symbols in the output format.
+	test_commit commit-for-porcelain-output &&
 	MAIN_OLD=$(git rev-parse HEAD) &&
 	git branch "fast-forward" &&
 	git branch "deleted-branch" &&
@@ -74,15 +73,10 @@ test_expect_success 'fetch porcelain output' '
 	FORCE_UPDATED_OLD=$(git rev-parse HEAD) &&
 	git checkout main &&
 
-	# Clone and pre-seed the repositories. We fetch references into two
-	# namespaces so that we can test that rejected and force-updated
-	# references are reported properly.
-	refspecs="refs/heads/*:refs/unforced/* +refs/heads/*:refs/forced/*" &&
-	git clone . porcelain &&
-	git -C porcelain fetch origin $refspecs &&
+	# Backup to preseed.git
+	git clone --mirror . preseed.git &&
 
-	# Now that we have set up the client repositories we can change our
-	# local references.
+	# Continue changing our local references.
 	git branch new-branch &&
 	git branch -d deleted-branch &&
 	git checkout fast-forward &&
@@ -91,36 +85,61 @@ test_expect_success 'fetch porcelain output' '
 	git checkout force-updated &&
 	git reset --hard HEAD~ &&
 	test_commit --no-tag force-update-new &&
-	FORCE_UPDATED_NEW=$(git rev-parse HEAD) &&
-
-	cat >expect <<-EOF &&
-	- $MAIN_OLD $ZERO_OID refs/forced/deleted-branch
-	- $MAIN_OLD $ZERO_OID refs/unforced/deleted-branch
-	  $MAIN_OLD $FAST_FORWARD_NEW refs/unforced/fast-forward
-	! $FORCE_UPDATED_OLD $FORCE_UPDATED_NEW refs/unforced/force-updated
-	* $ZERO_OID $MAIN_OLD refs/unforced/new-branch
-	  $MAIN_OLD $FAST_FORWARD_NEW refs/forced/fast-forward
-	+ $FORCE_UPDATED_OLD $FORCE_UPDATED_NEW refs/forced/force-updated
-	* $ZERO_OID $MAIN_OLD refs/forced/new-branch
-	  $MAIN_OLD $FAST_FORWARD_NEW refs/remotes/origin/fast-forward
-	+ $FORCE_UPDATED_OLD $FORCE_UPDATED_NEW refs/remotes/origin/force-updated
-	* $ZERO_OID $MAIN_OLD refs/remotes/origin/new-branch
-	EOF
-
-	# Execute a dry-run fetch first. We do this to assert that the dry-run
-	# and non-dry-run fetches produces the same output. Execution of the
-	# fetch is expected to fail as we have a rejected reference update.
-	test_must_fail git -C porcelain fetch \
-		--porcelain --dry-run --prune origin $refspecs >actual &&
-	test_cmp expect actual &&
-
-	# And now we perform a non-dry-run fetch.
-	test_must_fail git -C porcelain fetch \
-		--porcelain --prune origin $refspecs >actual 2>stderr &&
-	test_cmp expect actual &&
-	test_must_be_empty stderr
+	FORCE_UPDATED_NEW=$(git rev-parse HEAD)
 '
 
+for opt in off on
+do
+	case $opt in
+	on)
+		opt=--atomic
+		;;
+	off)
+		opt=
+		;;
+	esac
+	test_expect_failure "fetch porcelain output ${opt:+(atomic)}" '
+		test_when_finished "rm -rf porcelain" &&
+
+		# Clone and pre-seed the repositories. We fetch references into two
+		# namespaces so that we can test that rejected and force-updated
+		# references are reported properly.
+		refspecs="refs/heads/*:refs/unforced/* +refs/heads/*:refs/forced/*" &&
+		git clone preseed.git porcelain &&
+		git -C porcelain fetch origin $opt $refspecs &&
+
+		cat >expect <<-EOF &&
+		- $MAIN_OLD $ZERO_OID refs/forced/deleted-branch
+		- $MAIN_OLD $ZERO_OID refs/unforced/deleted-branch
+		  $MAIN_OLD $FAST_FORWARD_NEW refs/unforced/fast-forward
+		! $FORCE_UPDATED_OLD $FORCE_UPDATED_NEW refs/unforced/force-updated
+		* $ZERO_OID $MAIN_OLD refs/unforced/new-branch
+		  $MAIN_OLD $FAST_FORWARD_NEW refs/forced/fast-forward
+		+ $FORCE_UPDATED_OLD $FORCE_UPDATED_NEW refs/forced/force-updated
+		* $ZERO_OID $MAIN_OLD refs/forced/new-branch
+		  $MAIN_OLD $FAST_FORWARD_NEW refs/remotes/origin/fast-forward
+		+ $FORCE_UPDATED_OLD $FORCE_UPDATED_NEW refs/remotes/origin/force-updated
+		* $ZERO_OID $MAIN_OLD refs/remotes/origin/new-branch
+		EOF
+
+		# Change the URL of the repository to fetch different references.
+		git -C porcelain remote set-url origin .. &&
+
+		# Execute a dry-run fetch first. We do this to assert that the dry-run
+		# and non-dry-run fetches produces the same output. Execution of the
+		# fetch is expected to fail as we have a rejected reference update.
+		test_must_fail git -C porcelain fetch $opt \
+			--porcelain --dry-run --prune origin $refspecs >actual &&
+		test_cmp expect actual &&
+
+		# And now we perform a non-dry-run fetch.
+		test_must_fail git -C porcelain fetch $opt \
+			--porcelain --prune origin $refspecs >actual 2>stderr &&
+		test_cmp expect actual &&
+		test_must_be_empty stderr
+	'
+done
+
 test_expect_success 'fetch porcelain with multiple remotes' '
 	test_when_finished "rm -rf porcelain" &&
 
-- 
2.41.0.232.g2f6f0bca4f.agit.8.0.4.dev


^ permalink raw reply related

* [PATCH v2 2/2] fetch: no redundant error message for atomic fetch
From: Jiang Xin @ 2023-12-14 12:33 UTC (permalink / raw)
  To: Git List, Junio C Hamano, Patrick Steinhardt; +Cc: Jiang Xin
In-Reply-To: <cover.1702556642.git.zhiyou.jx@alibaba-inc.com>

From: Jiang Xin <zhiyou.jx@alibaba-inc.com>

If an error occurs during an atomic fetch, a redundant error message
will appear at the end of do_fetch(). It was introduced in b3a804663c
(fetch: make `--atomic` flag cover backfilling of tags, 2022-02-17).

In function do_fetch(), a failure message is already shown before the
retcode is set, so we should not call additional error() at the end of
this function.

We can remove the redundant error() function, because we know that
the function ref_transaction_abort() never fails. While we can find a
common pattern for calling ref_transaction_abort() by running command
"git grep -A1 ref_transaction_abort", e.g.:

    if (ref_transaction_abort(transaction, &error))
        error("abort: %s", error.buf);

We can fix this issue follow this pattern, and the test case "fetch
porcelain output (atomic)" in t5574 will also be fixed. If in the future
we decide that we don't need to check the return value of the function
ref_transaction_abort(), this change can be fixed along with it.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
---
 builtin/fetch.c         | 4 +---
 t/t5574-fetch-output.sh | 2 +-
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index fd134ba74d..01a573cf8d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1775,10 +1775,8 @@ static int do_fetch(struct transport *transport,
 	}

 cleanup:
-	if (retcode && transaction) {
-		ref_transaction_abort(transaction, &err);
+	if (retcode && transaction && ref_transaction_abort(transaction, &err))
 		error("%s", err.buf);
-	}

 	display_state_release(&display_state);
 	close_fetch_head(&fetch_head);
diff --git a/t/t5574-fetch-output.sh b/t/t5574-fetch-output.sh
index bc747efefc..8d01e36b3d 100755
--- a/t/t5574-fetch-output.sh
+++ b/t/t5574-fetch-output.sh
@@ -98,7 +98,7 @@ do
 		opt=
 		;;
 	esac
-	test_expect_failure "fetch porcelain output ${opt:+(atomic)}" '
+	test_expect_success "fetch porcelain output ${opt:+(atomic)}" '
 		test_when_finished "rm -rf porcelain" &&

 		# Clone and pre-seed the repositories. We fetch references into two
-- 
2.41.0.232.g2f6f0bca4f.agit.8.0.4.dev

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox