Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH v2 1/1] branch: advise the user to checkout a different branch before deleting
From: Heba Waly @ 2020-01-08 18:06 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Heba Waly via GitGitGadget, Git List, Junio C Hamano,
	Emily Shaffer
In-Reply-To: <CAPig+cTDayF0hHn7wSPGNS8h2qPUYhhg9Z8fY_rLQnWmAg-NKQ@mail.gmail.com>

On Wed, Jan 8, 2020 at 10:28 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> advice seems simple on the surface, but every new piece of advice
> means having to add yet another configuration variable, writing more
> code, more tests, and more documentation

This raises a question though, do we really need a new configuration
for every new advice?
So a user who's not interested in receiving advice will have to
disable every single advice config? It doesn't seem scalable to me.
I imagine a user will either want to enable or disable the advice
feature all together. Why don't we have only one `enable_advice`
configuration that controls all the advice messages?

Thanks,
Heba

^ permalink raw reply

* Re: [PATCH v4 0/3] t: rework tests for --pathspec-from-file
From: Alexandr Miloslavskiy @ 2020-01-08 17:42 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Alexandr Miloslavskiy via GitGitGadget, git, Jonathan Nieder,
	Eric Sunshine
In-Reply-To: <xmqqimll3lmn.fsf@gitster-ct.c.googlers.com>

On 08.01.2020 18:26, Junio C Hamano wrote:
>> I will implement the next --pathspec-from-file patches as if this
>> third patch was accepted (that is, without copy&pasted tests).
> 
> I am not sure if that is a good idea.  I'd rather see the planned
> new changes not to be taken hostage of the third step.

In my understanding, the new patches will not be taken hostage, they 
will simply adopt the new approach. Everything will work just fine 
whether or not third step is present.

> Besides, with the third step, your preference is not to test the
> behaviour of end-user facing commands that would learn the option at
> all and only test the underlying machinery with test-tool tests, no?

That's not exactly correct. Third step removes duplicate tests that give 
no real benefit. With test-tool tests in place and succceeding, these 
duplicate tests are super unlikely to fail.

I will still provide a few tests for every new command to make sure that 
said command works as intended. I will only skip indirectly testing 
global API again and again.

> If you are not adding tests for the higher-level end-user facing
> commands as part of these new series, would it make a difference if
> the codebase has the third step applied (i.e. missing tests for the
> end-user facing commands that have already learned the option) or
> not (i.e. the commands that have already learned the option are
> still tested end-to-end)?

I will be adding good tests and skip useless tests. For new commands, it 
doesn't really matter if "third step" patch is applied or not.

^ permalink raw reply

* Re: [PATCH v2 1/2] graph: fix case that hit assert()
From: Junio C Hamano @ 2020-01-08 17:34 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, peff, brad, sunshine, James Coglan, Derrick Stolee
In-Reply-To: <xmqqd0bv3pin.fsf@gitster-ct.c.googlers.com>

Junio C Hamano <gitster@pobox.com> writes:

> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Derrick Stolee <dstolee@microsoft.com>
>> Subject: Re: [PATCH v2 1/2] graph: fix case that hit assert()
>>
>> A failure was reported in "git log --graph --all" with the new
>> graph-rendering logic. The code fails an assert() statement when
>> collapsing multiple edges from a merge.
>>
>> The assert was introduced by eaf158f8 (graph API: Use horizontal
>> lines for more compact graphs, 2009-04-21), which is quite old.
>> This assert is trying to say that when we complete a horizontal
>> line with a single slash, it is because we have reached our target.
>>
>> This assertion is hit when we have two collapsing lines from the
>> same merge commit, as follows:
>>
>>     | | | | *
>>     | |_|_|/|
>>     |/| | |/
>>     | | |/|
>>     | |/| |
>>     | * | |
>>     * | | |
>
> I was sort-of expecting to see
> ...
> near the beginning of this commit, though.

Will do this locally before using them in rc2.

Thanks.

^ permalink raw reply

* Re: [RFC PATCH] unpack-trees: watch for out-of-range index position
From: Junio C Hamano @ 2020-01-08 17:30 UTC (permalink / raw)
  To: Jeff King; +Cc: Emily Shaffer, git
In-Reply-To: <20200108071525.GB1675456@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Tue, Jan 07, 2020 at 06:31:27PM -0800, Emily Shaffer wrote:
>
>> This issue came in via a bugreport from a user who had done some nasty
>> things like deleting various files in .git/ (and then couldn't remember
>> how they had done it). The concern was primarily that a segfault is ugly
>> and scary, and possibly dangerous; I didn't see much problem with
>> checking for index-out-of-range if the result is a fatal error
>> regardless.
>>
>> [...]
>>  	if (pos >= 0)
>>  		BUG("This is a directory and should not exist in index");
>>  	pos = -pos - 1;
>> -	if (!starts_with(o->src_index->cache[pos]->name, name.buf) ||
>> +	if (pos >= o->src_index->cache_nr ||
>> +	    !starts_with(o->src_index->cache[pos]->name, name.buf) ||
>>  	    (pos > 0 && starts_with(o->src_index->cache[pos-1]->name, name.buf)))
>> -		BUG("pos must point at the first entry in this directory");
>> +		BUG("pos %d doesn't point to the first entry of %s in index",
>> +		    pos, name.buf);
>
> The new condition you added looks correct to me. I suspect this BUG()
> should not be a BUG() at all, though. It's not necessarily a logic error
> inside Git, but as you showed it could indicate corrupt data we read
> from disk. The true is probably same of the "pos >= 0" condition checked
> above.

It does not sound like a BUG to me, either, but the new condition
does look correct to me, too.  We can turn it into die() later if
somebody truly cares ;-)

Thanks, both.  Will queue.


> It's mostly an academic distinction, though, as I think it would be
> pretty reasonable for now to just die() here (eventually, though, we
> might want to turn it into an error return).
>
> -Peff

^ permalink raw reply

* Re: [PATCH v4 0/3] t: rework tests for --pathspec-from-file
From: Junio C Hamano @ 2020-01-08 17:26 UTC (permalink / raw)
  To: Alexandr Miloslavskiy
  Cc: Alexandr Miloslavskiy via GitGitGadget, git, Jonathan Nieder,
	Eric Sunshine
In-Reply-To: <12861b02-386c-3ae8-cd2f-ffe07c6aabc7@syntevo.com>

Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com> writes:

> On 07.01.2020 22:13, Junio C Hamano wrote:
>> With the third step the series won't merge cleanly with other topic
>> you have in 'next' (t7107 gets somewhat heavy merge conflicts).
>>
>> I'll queue the first two for now but let's clean them up post 2.25
>> release.
>
> OK, I will re-submit the remaining patch after 2.25.
>
> I will implement the next --pathspec-from-file patches as if this
> third patch was accepted (that is, without copy&pasted tests).

I am not sure if that is a good idea.  I'd rather see the planned
new changes not to be taken hostage of the third step.

Besides, with the third step, your preference is not to test the
behaviour of end-user facing commands that would learn the option at
all and only test the underlying machinery with test-tool tests, no?
If you are not adding tests for the higher-level end-user facing
commands as part of these new series, would it make a difference if
the codebase has the third step applied (i.e. missing tests for the
end-user facing commands that have already learned the option) or
not (i.e. the commands that have already learned the option are
still tested end-to-end)?

^ permalink raw reply

* Re: [PATCH v3 10/15] rebase: add an --am option
From: Junio C Hamano @ 2020-01-08 17:18 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Elijah Newren, Phillip Wood, Elijah Newren via GitGitGadget,
	Git Mailing List, Johannes Schindelin, Denton Liu, Pavel Roskin,
	Alban Gruin, SZEDER Gábor
In-Reply-To: <9ac52ef1-c1cb-45aa-178a-ec5a282bd761@gmail.com>

Phillip Wood <phillip.wood123@gmail.com> writes:

> I view this change in the default backend as similar to the rewrite in
> C in that it is an implementation detail we're changing that should be
> transparent (and beneficial in terms of performance) to the
> user. There we provided a configuration variable but not a command
> line option to control if it was used or not.

Do you mean things like GIT_TEST_ADD_I_USE_BUILTIN?  I think it is
OK to have such an escape mechanism to allow people to opt out of
a new reimplementation until it matures, but I do not offhand recall
doing so with config.

^ permalink raw reply

* Re: [PATCH 0/1] Preserve topological ordering after merges
From: Sergey Rudyshin @ 2020-01-08 17:04 UTC (permalink / raw)
  To: Derrick Stolee, Johannes Schindelin,
	Sergey Rudyshin via GitGitGadget; +Cc: git
In-Reply-To: <222602fc-68a9-0671-e6b9-1aa727d830ea@gmail.com>

Hi Derrick,

Thank you for your comments!

08.01.2020 16:37, Derrick Stolee wrote:
> On 1/8/2020 7:50 AM, Sergey Rudyshin wrote:
>> let me explain in more detail why I thought it would make sense to stop using "--topo-order".
>> in case if a user specifies a single commit, like this
>> git rev-list E
>> with a new algorithm the options "--date-order", "--author-date-order", "--topo-order" do not change the ordering. Because there is only one way to sort any git graph with a single tip.
> 
> This is false, unless your history is always linear (no merge commits).

Would it be possible to provide some test script which would prove you 
point? We could run it against the binary created based from 
https://github.com/gitgitgadget/git/pull/514/commits/542a02020c8578d0e004cb9c929bced8719b902a


> 
>> in case if a user specifies multiple tips, like this
>> git rev-list --topo-order C B ^A
>> current version of git displays commits ordered by commit timestamp
>> which does not seem like a topological ordering.
> 
> Are you talking about which of C and B are shown first? They are shown in an order based on your input. If C and B are independent (neither can reach the other), then they will swap order if you write "git rev-list --topo-order B C ^A".
> 

Please find a script showing that commits are always sorted by commit 
timestamp no matter how they appear in the input.

$ cat ./test.sh
#!/bin/bash
export LANGUAGE=en
GIT=/usr/bin/git
$GIT --version
rm -rf .git
$GIT init
export GIT_COMMITTER_DATE="2000-01-01T00:00:00 +0000"
$GIT commit -m "0" --allow-empty
$GIT tag 0
export GIT_COMMITTER_DATE="2001-01-01T00:00:00 +0000"
$GIT checkout --orphan b1
$GIT commit -m "1" --allow-empty
$GIT tag 1
export GIT_COMMITTER_DATE="2002-01-01T00:00:00 +0000"
$GIT checkout --orphan b2
$GIT commit -m "2" --allow-empty
$GIT tag 2

fnc () {
	echo "$@"
	$GIT log --graph --pretty=tformat:"%D %ci" "$@" | sed 's/-01-01 
00:00:00 +0000//'
}
fnc --topo-order 0 1
fnc --topo-order 1 0

fnc --date-order 0 1
fnc --date-order 1 0

fnc --topo-order 1 2
fnc --topo-order 2 1

fnc --date-order 1 2
fnc --date-order 2 1

fnc --topo-order 0 1 2
fnc --topo-order 2 1 0

$ ./test.sh
git version 2.7.4
Initialized empty Git repository in /tmp/git_test_case_ordering/.git/
[master (root-commit) 0a449fc] 0
Switched to a new branch 'b1'
[b1 (root-commit) 070d07f] 1
Switched to a new branch 'b2'
[b2 (root-commit) c751327] 2
--topo-order 0 1
* tag: 1, b1 2001
* tag: 0, master 2000
--topo-order 1 0
* tag: 1, b1 2001
* tag: 0, master 2000
--date-order 0 1
* tag: 1, b1 2001
* tag: 0, master 2000
--date-order 1 0
* tag: 1, b1 2001
* tag: 0, master 2000
--topo-order 1 2
* HEAD -> b2, tag: 2 2002
* tag: 1, b1 2001
--topo-order 2 1
* HEAD -> b2, tag: 2 2002
* tag: 1, b1 2001
--date-order 1 2
* HEAD -> b2, tag: 2 2002
* tag: 1, b1 2001
--date-order 2 1
* HEAD -> b2, tag: 2 2002
* tag: 1, b1 2001
--topo-order 0 1 2
* HEAD -> b2, tag: 2 2002
* tag: 1, b1 2001
* tag: 0, master 2000
--topo-order 2 1 0
* HEAD -> b2, tag: 2 2002
* tag: 1, b1 2001
* tag: 0, master 2000



>> so I decided to change the documentation so that "--topo-order" and "--date-order" be the same. And since "--topo-order" does not add anything new decided to deprecate it.
> 
> Based on this sentence, it is clear that you do not understand the difference between --topo-order and --date-order. Please take time to examine the output difference for the Git repo with the following commands:
> 
> 	git log --oneline --graph --topo-order
> 	git log --oneline --graph --date-order
> 

Hope that the script above will clarify what i meant.

>> Regarding the failed test
>> I'll try to find the reason but what puzzles me is why those tests (t4202, t4215 and t6012) succeeded on all other platforms (linux32, osx-clang, windows, ...) and only failed on linux-gcc.
>> In my machine those tests do not fail either (gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609)
> 
> Try running the tests with GIT_TEST_COMMIT_GRAPH=1.
> 

Yes it helped. With that option the tests starts to fail.
I'll try to find out how COMMIT_GRAPH works and fix it.

> -Stolee
>   
> 

PS
excluded Junio from the loop per his request.

^ permalink raw reply

* Re: [PATCH v2 1/1] doc/gitcore-tutorial: fix prose to match example command
From: Junio C Hamano @ 2020-01-08 16:57 UTC (permalink / raw)
  To: Heba Waly via GitGitGadget; +Cc: git, Heba Waly
In-Reply-To: <0c75cd8f9727b10af6f6a804177e551ba0217abf.1578443496.git.gitgitgadget@gmail.com>

"Heba Waly via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Heba Waly <heba.waly@gmail.com>
>
> In 328c6cb853 (doc: promote "git switch", 2019-03-29), an example
> was changed to use "git switch" rather than "git checkout" but an
> instance of "git checkout" in the explanatory text preceding the
> example was overlooked. Fix this oversight.
>
> Signed-off-by: Heba Waly <heba.waly@gmail.com>
> ---
>  Documentation/gitcore-tutorial.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/gitcore-tutorial.txt b/Documentation/gitcore-tutorial.txt
> index f880d21dfb..c0b95256cc 100644
> --- a/Documentation/gitcore-tutorial.txt
> +++ b/Documentation/gitcore-tutorial.txt
> @@ -751,7 +751,7 @@ to it.
>  ================================================
>  If you make the decision to start your new branch at some
>  other point in the history than the current `HEAD`, you can do so by
> -just telling 'git checkout' what the base of the checkout would be.
> +just telling 'git switch' what the base of the checkout would be.
>  In other words, if you have an earlier tag or branch, you'd just do
>  
>  ------------

Thanks.  The part before this paragraph does illustrate the use of
"switch", and we should talk about that command here, too.


^ permalink raw reply

* Re: [PATCH 0/1] sequencer: comment out the 'squash!' line
From: Junio C Hamano @ 2020-01-08 16:53 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Mike Rappazzo, Michael Rappazzo via GitGitGadget, Git List
In-Reply-To: <20200108025509.GM6570@camp.crustytoothpaste.net>

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> On 2020-01-07 at 16:15:16, Junio C Hamano wrote:
>> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
>> 
>> > I can see the argument that this makes it a little harder for mechanical
>> > processing across versions, but I suspect most of that looks something
>> > like "sed -i -e '/^squash! /d' COMMIT_EDITMSG" and it probably won't be
>> > affected.
>> 
>> With the left-anchoring, the search pattern will no longer find that
>> line if "squash!" is commented out, but people tend to be sloppy and
>> do not anchor the pattern would not notice the difference.  Perhaps
>> the downside may not be too severe?  I dunno.
>
> Sorry, I was perhaps bad at explaining this.  In my example, it would no
> longer remove that line, but the user wouldn't care, because it would be
> commented out and removed automatically.  So while the code wouldn't
> work, what the user wanted would be done by Git automatically.

I didn't realize that you only care about 'd' there; you're right of
course if you limit the scope of the discussion that way.

I was talking in a more general terms where "^squash!" is used as a
marker that signals the boundary between two original commits and
the processing is done possibly differently on each part.

^ permalink raw reply

* Re: [PATCH 0/1] Update imap-send.c, fix incompatibilities with OpenSSL 1.1.x
From: Junio C Hamano @ 2020-01-08 16:09 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Liam Huang via GitGitGadget, Liam Huang, git
In-Reply-To: <nycvar.QRO.7.76.6.2001081148300.46@tvgsbejvaqbjf.bet>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> I will change GitGitGadget to no longer Cc: you automatically.

Thanks.

> Please register my suspicion that this will make GitGitGadget a lot less
> useful: the stated mission of GitGitGadget is to make contributing patches
> to the Git project _easier_ so that the contributor can focus on the
> changes they want to make, rather than on the rather involved process.

I am not sure where that "a lot" comes from.  FWIW I do not expect
my response rate to change at all [*1*], but perhaps you have
something else, perhaps effect on reviewers other than me, in mind?

In any case, a large part of focusing on changes they want to make
is to ask for help from the right people who know the part of the
system they want to touch, and that is ...

>> Besides, when they send out patches they would also add area experts and
>> those who participated in the review of the earlier round to Cc: so GGG
>> needs to have a mechanism to allow the end user to do so.
>
> So GitGitGadget should now also learn to determine who the current area
> experts are???

... done by CC'ing the right folks, right?

Whether they run "shortlog --since=18.months $pathspec" locally to
find them, or GGG does so for them before turning the patch into a
piece of e-mail and offers "perhaps some of these people can help
you?", after the contributor decides from whom to ask help, there
would be some way for the contributor to tell GGG "ok I'll ask this
person to help by placing the addresss on the CC", no?  That is what
I meant by the mention of CC: in the part of my response you quoted.

> I must have misread your request.

No, it wasn't even a request (unless GGG does not offer any way to
say "I want this to be CC'ed to these folks", that is).  It was
merely "the contributor must have a way to choose to (or not to) cc
me (or anybody), I presume".

The request part was "let them do so themselves, instead of always
cc'ing me, because the latter does not add any bit of useful
information."

After all, software development is a human interaction process.  I
wouldn't mind if the automated CC is done to address some 'bot
(e.g. patch tracker) at all, but it simply is rude to treat other
people as a convenient review bot and it is even more so to do so
blindly and automatically, which is what automated CC added by GGG
is.  At least, when the contributor chooses to ask a reviewer X,
even if the choice were wreong and the patch were in an area the
reviewer X were not familiar with at all, it means something that
the contributor decided to ask for help from X by CC'ing.

[Footnote]

*1* I do not read patch e-mails out of my mbox and instead read via
the nntp interface to lore or public-inbox archive.  The list of
messages presented to me to choose which ones to read and respond to
would only show me who the author is and what the title is, so "is
it CC'ed to me?" does not affect my response rate at all.

^ permalink raw reply

* Re: [PATCH] restore: invalidate cache-tree when removing entries with --staged
From: Junio C Hamano @ 2020-01-08 15:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Torsten Krah, Emily Shaffer, git@vger.kernel.org
In-Reply-To: <20200108114344.GA3380580@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index b52c490c8f..18ef5fb975 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -524,6 +524,8 @@ static int checkout_paths(const struct checkout_opts *opts,
>  	/* Now we are committed to check them out */
>  	if (opts->checkout_worktree)
>  		errs |= checkout_worktree(opts);
> +	else
> +		remove_marked_cache_entries(&the_index, 1);

Ah, I was wondering why we were seeing breakages on cache-tree,
which is fairly old and stable part of the system---even though it
had caused us quite a lot of pain while it was growing---all of a
sudden.  No wonder---this codepath is a fairly new one, introduced
when "restore" was added X-<.

And the fix looks right, of course.  Thanks for extracting a
reproducible recipe out of the original reporter and quickly
diagnosing it.  Very much appreciated.

^ permalink raw reply

* Re: [PATCH v4 0/3] t: rework tests for --pathspec-from-file
From: Alexandr Miloslavskiy @ 2020-01-08 15:32 UTC (permalink / raw)
  To: Junio C Hamano, Alexandr Miloslavskiy via GitGitGadget
  Cc: git, Jonathan Nieder, Eric Sunshine
In-Reply-To: <xmqqh8173r8e.fsf@gitster-ct.c.googlers.com>

On 07.01.2020 22:13, Junio C Hamano wrote:
> With the third step the series won't merge cleanly with other topic
> you have in 'next' (t7107 gets somewhat heavy merge conflicts).
> 
> I'll queue the first two for now but let's clean them up post 2.25
> release.

OK, I will re-submit the remaining patch after 2.25.

I will implement the next --pathspec-from-file patches as if this third 
patch was accepted (that is, without copy&pasted tests).

Thanks for accepting this and other polishing branches, I was already 
quite pessimistic about them.

^ permalink raw reply

* Re: [PATCH 1/1] Preserve topological ordering after merges This modifies the algorithm of topopological ordering. The problem with the current algoritm is that it displays the graph below as follows
From: Sergey Rudyshin @ 2020-01-08 15:11 UTC (permalink / raw)
  To: Derrick Stolee, Sergey Rudyshin via GitGitGadget, git; +Cc: Junio C Hamano
In-Reply-To: <6c86f1e9-b70b-10b1-f2c5-589312f73a4c@gmail.com>

Hi Derrick,
thank you for you feedback!

07.01.2020 15:14, Derrick Stolee wrote:
> On 1/7/2020 5:30 AM, Sergey Rudyshin via GitGitGadget wrote:
>> From: Sergey Rudyshin <540555@gmail.com>
>>
>> * E
>> |\
>> | * D
>> | |\
>> | |/
>> |/|
>> * | C
>> | * B
>> |/
>> * A
>>
>> commit B is placed between A and C, which is wrong
>> because E stays that D and B comes after C
>>
>> the only correct solution here is
>>
>> * E
>> |\
>> | * D
>> | |\
>> | |/
>> |/|
>> | * B
>> * | C
>> |/
>> * A
>>
>> while it seems that it contradicts to
>> D stating that C should be between D and B
>> The new algorithm solves this issue
> 
> This ordering concern makes sense _somewhat_, because D is the
> second parent of D and that wants to say "Show everything in C..D
> before showing C". The issues is that since C is the second parent
> of D, the topo-ordering says "Show everything in B..C before showing
> things reachable from B". It is unfortunate that these constraints
> collide.
>  > Perhaps your description could do a better job clarifying this
> issue and how your algorithm change fixes the problem.
> 

The proposed algorithm allows to solve collisions you mentioned by 
sticking to the rule which can be summarized as "new commits do not 
change history".
And with that rule in mind choosing between C and B becomes obvious. C 
comes before B because there was a moment in the history when C existed 
and B did not.

Let's imagine the following scenario.
Some person maintains some branch.
At some point in time the branch contains only two commits A and C.
"git rev-list --topo-order" produces "A,C"
then the maintainer merges some branch and
"git rev-list --topo-order" starts to produces "A,B,C,..."
which is confusing.

The algorithm itself is similar to the wall follower used in maze solving.
If you imagine git graph like a maze where edges corresponds to 
corridors and nodes to junctions then using "right-hand rule" you would 
traverse the maze. When leaving a junctions if all corridors are visited 
  assigning the next number to the junctions you are effectively 
ordering them.

Let me repeat the example from my first letter
* walk to to the root via "first" parents;
* go E -> C -> A;
* print A because it has no parents;
* step back to C;
* print C because it has no other parents;
* then step back to E;
* go D -> B -> A;
* do not print A because A is already printed;
* step back to B;
* print B;
* so on.

> However, I'm not sure we even want to make the change, as this
> is still a valid topological order (parents appear after children).
> When we have an ambiguous pair (like B and C) the order can differ.
> The --topo-order option tries to group commits by when they were
> introduced, and that's the reason for presenting the commits reachable
> from the later parents before presenting the commits from earlier
> parents.
> 
> The only documentation we have is from [1]:
> 
> "Show no parents before all of its children are shown, and avoid
> showing commits on multiple lines of history intermixed."
> 
> The first part of the sentence is still true, and the second part
> is ambiguous of how to do that.
> 
> [1] https://git-scm.com/docs/git-log#Documentation/git-log.txt---topo-order
> 

Agreed that it is a a valid topological order but rather for a directed 
acyclic graph. Git has an additional property: edges (parents) are 
ordered. Which makes only one way to order it. Ignoring information that 
parents were ordered we had to invent three similar orderings, one of 
them was ambiguous. For some reason two options do not "avoid showing 
commits on multiple lines of history intermixed". A little confusing.

I think we have an opportunity here to remove options for sorting 
eventually thus simplifying leaning curve for users and give them some 
new features.

Let me step aside and tell why I am proposing this patch in the first 
place. I am a database developer. And me and my team, we have so called 
"migrations": a set of scripts which are to be applied to a database. 
Those scripts are to have a numbers in its filenames so that a tool 
could install them in a particular order (here is example 
https://flywaydb.org/getstarted/how). In our scenario multiple 
developers create those scripts on theirs branches. Those branches get 
merged into a single integration branch. If Git could preserve commit 
ordering after merges it would be possible to generate those filenames 
automatically. Now it can't. So here am i.

>> This change makes option "--topo-order" obsolete, because
>> there is only one way to order parents of a single commit.
>> "--date-order" and "--author-date-order" are preserved and make sense
>> only for the case when multiple commits are given
>> to be able to sort those commits.
> 
> This part of the change needs to be removed. The default sort does
> not preserve topological orderings (like --date-order does), and
> so is much faster to output, especially without a commit-graph file.
> 

Yes indeed. Will fix it.

>>   void sort_in_topological_order(struct commit_list **list, enum rev_sort_order sort_order)
> 
> Since you are only editing this code, and not the incremental topo-order code, your
> test changes will likely break when run with GIT_TEST_COMMIT_GRAPH=1. When the commit-graph
> exists and generation numbers are calculated, we use a different algorithm for topo-order.
> 

Yes this needs to be reconciled. But unfortunately have no experience in 
the code for commit graph.

> I've been meaning to clean up this "duplicated" logic by deleting this method in favor of
> the incremental algorithm in all cases. It needs some perf testing to make sure that
> refactor doesn't have too large of a perf hit in the case of no commit-graph.
>
Given that the new algorithm is pretty simple we could duplicate it 
there once again.

>>   	/* update the indegree */
>> @@ -832,51 +886,56 @@ void sort_in_topological_order(struct commit_list **list, enum rev_sort_order so
>>   	for (next = orig; next; next = next->next) {
>>   		struct commit *commit = next->item;
>>   
>> -		if (*(indegree_slab_at(&indegree, commit)) == 1)
>> -			prio_queue_put(&queue, commit);
>> +		if (*(indegree_slab_at(&indegree, commit)) == 1) {
>> +			/* also record the author dates, if needed */
>> +			if (sort_order == REV_SORT_BY_AUTHOR_DATE)
>> +				record_author_date(&author_date, commit);
>> +			prio_queue_put(&queue_tips, commit);
>> +		}
> 
> Your code change looks rather large for what I expected to be a much simpler change.
> Likely the only thing we need is to avoid adding to the priority queue if we already
> have the commit in the queue (maybe using a hashset storing the commits that we've
> added to the queue). I believe the reason C appears before B in your example is that
> it was added to the queue a second time, and the queue behaves like a stack in the
> topo-order case.
> 
Probably the new code itself could be simplified a bit.
Thanks for suggestion I'll try to think this way.

> Thanks,
> -Stolee
> 

^ permalink raw reply

* Re: [PATCH v3 10/15] rebase: add an --am option
From: Phillip Wood @ 2020-01-08 14:32 UTC (permalink / raw)
  To: Junio C Hamano, Elijah Newren
  Cc: Phillip Wood, Elijah Newren via GitGitGadget, Git Mailing List,
	Johannes Schindelin, Denton Liu, Pavel Roskin, Alban Gruin,
	SZEDER Gábor
In-Reply-To: <xmqqy2uj3u3q.fsf@gitster-ct.c.googlers.com>

On 07/01/2020 20:11, Junio C Hamano wrote:
> Elijah Newren <newren@gmail.com> writes:
> 
>> However, I thought of this option before Junio suggested a
>> rebase.backend config setting, so we could just rely on that instead.
>> Thus, getting rid of the "--am" flag in detail would mean:
>>    * I need to redo the test changes in this series to use "-c
>> rebase.backend=am" instead of "--am"
>>    * It will be slightly harder for users to use the escape hatch in
>> one-off cases during the transition
>>    * We need to figure out the right way to reword the documentation
>>
>> The first two are pretty minor, so that probably means I just need to
>> come up with some good wording for the documentation (suggestions
>> welcome...)
> 
> It probably is a good idea to keep --am (and possibly --[no-]am) as
> long as rebase.backend option exists.  A configuration variable that
> changes behaviour without allowing it to be overridden from the
> command line is not a good idea.

I view this change in the default backend as similar to the rewrite in C 
in that it is an implementation detail we're changing that should be 
transparent (and beneficial in terms of performance) to the user. There 
we provided a configuration variable but not a command line option to 
control if it was used or not. I don't see it as something the user will 
want to change from rebase to rebase but perhaps I'm missing something.

Best Wishes

Phillip

^ permalink raw reply

* Re: [PATCH v3 01/15] rebase: extend the options for handling of empty commits
From: Phillip Wood @ 2020-01-08 14:27 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List,
	Johannes Schindelin, Phillip Wood, Denton Liu, Junio C Hamano,
	Pavel Roskin, Alban Gruin, SZEDER Gábor
In-Reply-To: <CABPp-BEH=9qejeqysHYE+AJ+JPaBympZizq-bx_OjArYFa4xUQ@mail.gmail.com>

Hi Elijah

Thanks for such a detailed reply, I'll try and do it justice.

TLDR I like your new proposal

On 07/01/2020 19:15, Elijah Newren wrote:
> Hi Phillip,
> 
> On Tue, Jan 7, 2020 at 6:37 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>
>> Hi Elijah
>>
>> Thanks for working on this series, I think making the sequencer the
>> default backend is a good idea. I have a few reservations about this
>> path though...
> 
> Thanks for the feedback.  You've provided some good food for thought.
> While responding, I came up with an alternate proposal below which I
> think may be more in line with what you want.  Let me know...
> 
>>
>> On 24/12/2019 19:54, Elijah Newren via GitGitGadget wrote:
>>> From: Elijah Newren <newren@gmail.com>
>>>
>>> Extend the interactive machinery with the ability to handle the full
>>> spread of options for how to handle commits that either start or become
>>> empty (by "become empty" I mean the changes in a commit are a subset of
>>> changes that exist upstream, so the net effect of applying the commit is
>>> no changes).  Introduce a new command line flag for selecting the
>>> desired behavior:
>>>     --empty={drop,keep,ask}
>>> with the definitions:
>>>     drop: drop empty commits
>>>     keep: keep empty commits
>>>     ask:  provide the user a chance to interact and pick what to do with
>>>           empty commits on a case-by-case basis
>>
>> I think we want to distinguish between commits that are empty before
>> rebasing and those that become empty when they are rebased. --keep-empty
>> explicily only applies to commits that are already empty. Cherry-pick
> 
> I am open to changing how the empty handling works; I figured it was
> the most likely area people might have feedback about.
> 
> However, I do strongly disagree with the "explicit" claim here.  It is
> possible that --keep-empty was designed to only apply to commits that
> begin empty, but that is not at all explicit.

Maybe explicit wasn't the best choice of words. I was going off the
implementation which takes pains to only allow and empty commit when the
original was empty. This has been the case since the feature was
introduced in 90e1818f9a ("git-rebase: add keep_empty flag",
2012-04-20). I did some digging in the mail archive this morning and
found [1] which seems to confirm the intention was to preserve existing
empty commits, not keep those that become empty. This fits with the
discussion [2] which lead to this feature. The same patch series that
introduced --keep-empty for rebase introduced the two cherry-pick flags
--allow-empty and --keep-redundant-commits to distinguish between the
two cases and used `cherry-pick --allow-empty` to implement `rebase
--keep-empty` (and failed to provide a way to continue after stopping 
for a conflict resolution)

[1] https://lore.kernel.org/git/5006614E.8090601@viscovery.net/
[2]
https://lore.kernel.org/git/20120323185205.GA11916@hmsreliant.think-freely.org/

>  The only place in the
> documentation I could find that mentions the distinction at all was in
> the "Behavioral Differences" section, which was written by me as part
> of commit  0661e49aeb84 ("git-rebase.txt: document behavioral
> differences between modes", 2015-06-27).  Quoting from that commit
> message:
> 
> "(In fact, it's not clear -- to me at least -- that these differences
> were even desirable or intentional.)  Document these differences."
> 
> I cannot find anywhere else making a distinction in either the
> documentation or the commit messages or even any code comments, which
> has left me wondering whether the distinction was intentional at all.
> Things like commit b00bf1c9a8dd ("git-rebase: make
> --allow-empty-message the default", 2018-06-27) give heavy precedence
> to the assumption that git-rebase is often more happenstance than
> design when it comes to edge cases like these.
> 
>> has distinct options for those two cases. If I've explicitly created an
>> empty commit then I want to keep it but I don't want to keep commits
>> that become empty because the changes they contain are already upstream.
>>
>> If we want an option that keeps commits that become empty (Off hand I
>> don't know why we would though) we should consider if that option should
>> disable --cherry-mark when we create the todo list so that it keeps all
>> commits that become empty when they're rebased.
> 
> To answer the particular reason I have seen why folks might want to
> keep both commits that start empty and become empty, but still use
> --cherry-mark to discard commits that are already upstream:
>   * If cherry-mark can determine the commit was "already upstream",
> then because of how cherry-mark works this means the upstream commit
> message was about the *exact* same set of changes.  Thus, the commit
> messages can be assumed to be fully interchangeable (and are in fact
> likely to be completely identical).
>   * If a commit does not start empty, but becomes so because its
> changes were a subset of those upstream, then while there are clearly
> no more changes that need to be applied, the commit messages can be
> expected to differ -- either there will be an upstream commit that
> included a bunch of other changes too, or the commit we are rebasing
> will be spread across multiple upstream commits.  If the project has
> bad commit message hygiene (because e.g. they are using atrocious code
> review tools like GitHub that don't allow you to review them), there's
> a reasonable likelihood that the upstream version only talks about the
> other changes in the commit instead of the fix we are rebasing.  If
> so, there may be useful information in this commit message that we
> want to copy somewhere.

That sounds reasonable, (I'm not sure what one would do with the commit
message though as creating an empty commit wont be any use for blame or
someone looking at the commit that introduced the change upstream. One
could use add a note to the upstream commit but then there's the problem 
of sharing notes)

> It's a somewhat rare usecase involving projects that generally aren't
> careful with commit messages but which has some people working on it
> who are careful.  But even for these projects, the odds that a commit
> message will just get tossed anyway seem fairly high, so I will not
> strongly defend this choice; it just seemed like "--keep-empty" or
> "--empty=keep" should mean "keep empty", not "sometimes keep empty".
> So, if it'd be nice if we could come up with clear wording if we are
> going to implement other possibilities.
> 
> Anyway, am I correctly understanding that you would like to see the
> following matrix of options?
>   * Drop commits that either start or become empty [the current --empty=drop]

Yes please

>   * Pause and ask users whether they want commits that either start or
> become empty [the current --empty=ask]

I'd prefer an option to keep the commits that start empty and ask about
the ones that become empty (just like --keep-empty does now)

>   * Keep commits that start empty, but drop ones that become empty [a
> new option]

Yes please

> and possibly drop the current --empty=keep behavior?

Yes as I'm not sure it's that useful, if users want to keep the
information from the message of a commit that becomes empty I'm not sure
just creating an empty commit is going to be much use in practice.

>>> Note that traditionally, am-based rebases have always dropped commits
>>> that either started or became empty, while interactive-based rebases
>>> have defaulted to ask (and provided an option to keep commits that
>>> started empty).  This difference made sense since users of an am-based
>>> rebase just wanted to quickly batch apply a sequence of commits, while
>>> users editing a todo list will likely want the chance to interact and
>>> handle unusual cases on a case-by-case basis.
>>
>> I don't see why it makes sense to drop an empty commit that I've made
>> just because it is being rebased.
> 
> Are you arguing that keeping commits that started empty should
> actually be the default, i.e. that all three rebase backends have
> traditionally had the wrong default?

Yes, I think rebase should preserve the commits that are being rebased
as far as possible and the current default is probably a historical accident

>> I'm pretty sure the behavor of the
>> am-based rebase is a function of `git am` not being able to create empty
>> commits.
> 
> That may be, but it's not quite clear.  Junio kind of commented on
> this in https://lore.kernel.org/git/xmqqfu1fswdh.fsf@gitster-ct.c.googlers.com/.
> His comments that "lack of --keep-empty on the 'am' side is probably a
> bug that wants to be fixed" might imply that he only sees it as a
> useful option.  If the behavior of the am-based rebase dropping
> commits which started empty was solely a function of git-am not being
> able to create empty commits, then the correct fix would be to make
> keeping the commits which started empty as the default or maybe only
> behavior rather than just making it an option.

Yeah I'm not sure how Junio feels defaulting to keeping commits that
start empty

> I'd actually be open to changing the default here.

I'd be keen to do that but we should try and find out what others think.

>  We have a
> precedent in doing so with commit b00bf1c9a8dd ("git-rebase: make
> --allow-empty-message the default", 2018-06-27) where we stated that
> really rare edge cases (which I think both empty commit messages and
> commits which start empty qualify as) probably have the wrong default
> in rebase.  And changing the default would yield a much easier to
> understand matrix of options:
> 
>   * Drop commits that become empty (modification to --empty=drop in
> that it keeps commits which started empty)
>   * Pause and ask users if commits become empty (modification to
> --empty=ask in that it keeps commits which started empty)
>   * Keep commits that become empty, in addition to those that started
> empty (--empty=keep).

That matrix is easier to understand and covers all the cases I'm 
interested in.

>>> However, not all rebases
>>> using the interactive machinery are explicitly interactive anymore.  In
>>> particular --merge was always meant to behave more like --am: just
>>> rebase a batch of commits without popping up a todo list.
>>>
>>> If the --empty flag is not specified, pick defaults as follows:
>>>     explicitly interactive: ask
>>>     --exec: keep (exec is about checking existing commits, and often
>>>                   used without actually changing the base.  Thus the
>>>                   expectation is that the user doesn't necessarily want
>>>                   anything to change; they just want to test).
>>>     otherwise: drop
>>
>> I'm not sure I like changing the behavior based on --exec, I see what
>> you're getting at but it has the potential to be confusing. What if I
>> want to rearrange the commits without changing the base - why must I
>> specify --empty=keep there but not if I add --exec to the command line?
> 
> This was probably poorly explained and also poorly implemented.  Let
> me try from a slightly different angle, and come to a slightly
> different result than I did previously:
> 
> Quoting from Junio in b00bf1c9a8dd ("git-rebase: make
> --allow-empty-message the default", 2018-06-27),
>    '"am" based rebase is solely to transplant an existing history and
> want to stop much less than "interactive" one whose purpose is to
> polish a series before making it publishable, and asking for
> confirmation ("this has become empty--do you want to drop it?") is
> more appropriate from the workflow point of view'
> 
> I would modify Junio's wording to expand "am based rebase" to say that
> if the user hasn't requested an interactive rebase or otherwise
> explicitly signalled that they want some measure of interactivity
> (e.g. specified --empty=ask), then we should not stop and ask how to
> handle commits which started OR became empty.  In other worse, it's
> not just the am-based rebase that is about transplanting commits but
> any rebase where interactivity isn't requested.  I think the
> traditional --keep-empty is quite broken in this regard since it does
> stop and ask with commits that become empty.  rebase -m, rebase
> --exec, etc. should not stop when a commit becomes empty, there should
> be some default it just uses unless the user also adds
> --interactive/-i/--empty=ask.
> 
> But, if we take your above suggestion to make keeping the commits
> which started empty not only the default but remove it from the
> options, then this becomes much easier.  The choices become (kind of
> repeated from above):
> 
> [Keep the commits that started empty in all three cases, otherwise:]
>   * Drop commits that become empty (--empty=drop)
>   * Pause and ask users if commits become empty (--empty=ask)
>   * Keep commits that become empty (--empty=keep)

That's nice - I like it

> and the defaults are simply:
>    --interactive/-i: --empty=ask
>    everything else: --empty=drop

While I see the point about the difference between -i and the other 
rebase modes any flavor of rebase can stop at any time due to conflicts 
so it can never be fully automatic. Given that commits becoming empty is 
a bit of a corner case I'm not sure the complexity of different defaults 
is justified but I don't feel that strongly about it.

> (And as a bonus, in the thread I've referenced twice above Junio
> mentioned -- in respect to commits that become empty -- that he thinks
> these are the right defaults.  He didn't comment on the behavior of
> commits that start empty, which seems to be your primary concern.)

>>> Also, this commit makes --keep-empty just imply --empty=keep, and hides
>>> it from help so that we aren't confusing users with different ways to do
>>> the same thing.  (I could have added a --drop-empty flag, but then that
>>> invites users to specify both --keep-empty and --drop-empty and we have
>>> to add sanity checking around that; it seems cleaner to have a single
>>> multi-valued option.)  This actually fixes --keep-empty too; previously,
>>> it only meant to sometimes keep empty commits, in particular commits
>>> which started empty would be kept.  But it would still error out and ask
>>> the user what to do with commits that became empty.  Now it keeps empty
>>> commits, as instructed.
>>
>> It certainly changes the behavior of --keep-empty but I'm not sure it
>> "fixes" it. If I have some empty commits I want to keep as placeholders
>> then that's different from wanting to keep commits that become empty
>> because their changes are upstream but --cherry-mark didn't detect them.
>>
>> In summary I'm in favor of making it easier to drop commits that become
>> empty but not tying that to the handling of commits that are empty
>> before they are rebased.
> 
> Sounds like my modified proposal above may be in line with your tastes?

Yes. Keep commits that start empty and have an option to control what 
happens to those that become empty (defaulting to --empty=ask for 
interactive rebases at least).

>> I'm also not happy that the deprecation of --keep-empty suddenly makes
>> --no-keep-empty an error.
> 
> We could translate that to --empty=drop; does that sound reasonable?

translating it to --empty=ask would keep the behavior the same even if 
it sounds a bit odd. I don't think we want to change the behavior of 
--[no-]keep-empty while it is deprecated.

> 
> [...]
>>> @@ -466,7 +494,10 @@ int cmd_rebase__interactive(int argc, const char **argv, const char *prefix)
>>>       struct option options[] = {
>>>               OPT_NEGBIT(0, "ff", &opts.flags, N_("allow fast-forward"),
>>>                          REBASE_FORCE),
>>> -             OPT_BOOL(0, "keep-empty", &opts.keep_empty, N_("keep empty commits")),
>>> +             { OPTION_CALLBACK, 'k', "keep-empty", &options, NULL,
>>> +                     N_("(DEPRECATED) keep empty commits"),
>>> +                     PARSE_OPT_NOARG | PARSE_OPT_NONEG | PARSE_OPT_HIDDEN,
>>
>> It is all very well deprecating --keep-empty but suddenly making
>> '--no-keep-empty' an error goes beyond deprecation. Also I'm not sure
>> it's worth changing these options as I think the only user is
>> git-rebase--preserve-merges.sh
> 
> Side track: Since git-rebase--preserve-merges.sh is deprecated and we
> want to get rid of it, and rebase-merges exists and is a better
> implementation of the original idea, can we just translate rebase -p
> into rebase -r and delete git-rebase--preserve-merges.sh?  (With a few
> wrinkles, such as telling users in the middle of an existing
> preserve-merges-rebase that they either need to use an old version of
> git to continue their rebase or else abort the rebase?)

dscho posted some patches in November or December moving along this 
path, I'm not sure what the outcome was. It's a bit complicated by the 
different todo list formats between -p and -r but I think the end game 
should be to treat -p as -r as you suggest.

> 
>> Best Wishes
>>
>> Phillip
>>
> [...]
>> This function is used by cherry-pick/revert not rebase do we need to
>> change it?
> 
> Possibly not; I can take a look.  I had quite a difficult time trying
> to figure out how to make sure that --empty options were saved and
> restored across rebase --continue; it's not at all unlikely that I
> modified more than I needed to "for consistency".
> 
> [...]
>>> @@ -3033,9 +3064,15 @@ static int save_opts(struct replay_opts *opts)
>>
>> again this is for cherry-pick/revert
> 
> Okay, will double check this too.
> 
> [...]
> 


^ permalink raw reply

* Re: [PATCH 2/2] graph: fix collapse of multiple edges
From: Jeff King @ 2020-01-08 13:49 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, jcoglan, Derrick Stolee,
	Junio C Hamano
In-Reply-To: <7a0e4637-85de-589f-df31-915b4603e45a@gmail.com>

On Wed, Jan 08, 2020 at 08:40:26AM -0500, Derrick Stolee wrote:

> > Hmm. Your description and your diagrams make sense to me. But one
> > curious thing is that the earlier test you added for 6_* does not need
> > modified. Because it continues to show:
> > 
> >           | | | | * 6_F
> >           | |_|_|/|
> >           |/| | |/
> >           | | |/|
> >           | |/| |
> >           | * | | 6_D
> > 
> > rather than adding a horizontal component to the second-parent line.
> > That seems inconsistent.
> 
> The issue here is that there is not enough room for a second horizontal
> line. The horizontal line can only start after the previous has completely
> terminated, that is
> 
>            | | | | * 6_F
>            | |_|_|/|
>            |/| | |/
> 
> at this point, the first horizontal line has terminated.

Ahhh, OK, that makes perfect sense. I didn't quite realize how the rules
for "going horizontal" worked.

> If there was one more row to the example, then we would have:
> 
>            | | | | | * 6_F
>            | |_|_|_|/|
>            |/| | | |/
>            | | |_|/|
>            | |/| | |
>            | * | | | 6_D

Right, that makes sense. Thanks for explaining. And the patch otherwise
looked good to me; your explanation convinced me that this is the right
thing to do.

-Peff

^ permalink raw reply

* Re: [PATCH] transport: don't flush when disconnecting stateless-rpc helper
From: Derrick Stolee @ 2020-01-08 13:43 UTC (permalink / raw)
  To: Jeff King, brian m. carlson; +Cc: git, Jonathan Tan
In-Reply-To: <20200108071009.GA1675456@coredump.intra.peff.net>

On 1/8/2020 2:10 AM, Jeff King wrote:
> On Wed, Jan 08, 2020 at 03:34:51AM +0000, brian m. carlson wrote:
> 
>> A colleague (Jon Simons) today pointed out an interesting behavior of
>> git ls-remote with protocol v2: it makes a second POST request and sends
>> only a flush packet.  This can be demonstrated with the following:
>>
>>   GIT_CURL_VERBOSE=1 git -c protocol.version=2 ls-remote origin
>>
>> The Content-Length header on the second request will be exactly 4 bytes.

Good find!
 
> But when we've initiated a v2 stateless-connect session over a transport
> helper, there's no point in sending this flush packet. Each operation
> we've performed is self-contained, and the other side is fine with us
> hanging up between operations.

Makes sense to me.

> But much worse, by sending the flush packet we may cause the helper to
> issue an entirely new request _just_ to send the flush packet. So we can
> incur an extra network request just to say "by the way, we have nothing
> more to send".
> 
> Let's drop this extra flush packet. As the test shows, this reduces the
> number of POSTs required for a v2 ls-remote over http from 2 to 1.

> +test_expect_success 'ls-remote with v2 http sends only one POST' '
> +	test_when_finished "rm -f log" &&
> +
> +	git ls-remote "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" >expect &&
> +	GIT_TRACE_CURL="$(pwd)/log" git -c protocol.version=2 \
> +		ls-remote "$HTTPD_URL/smart/http_parent" >actual &&
> +	test_cmp expect actual &&
> +
> +	grep "Send header: POST" log >posts &&
> +	test_line_count = 1 posts
> +'
> +

Nice test!

> diff --git a/transport.c b/transport.c
> index 83379a037d..1fdc7dac1a 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -737,7 +737,7 @@ static int disconnect_git(struct transport *transport)
>  {
>  	struct git_transport_data *data = transport->data;
>  	if (data->conn) {
> -		if (data->got_remote_heads)
> +		if (data->got_remote_heads && !transport->stateless_rpc)
>  			packet_flush(data->fd[1]);
>  		close(data->fd[0]);
>  		close(data->fd[1]);

Looks good to me. Thanks!

-Stolee


^ permalink raw reply

* Re: [PATCH 2/2] graph: fix collapse of multiple edges
From: Derrick Stolee @ 2020-01-08 13:40 UTC (permalink / raw)
  To: Jeff King, Derrick Stolee via GitGitGadget
  Cc: git, jcoglan, Derrick Stolee, Junio C Hamano
In-Reply-To: <20200108072541.GC1675456@coredump.intra.peff.net>

On 1/8/2020 2:25 AM, Jeff King wrote:
> On Wed, Jan 08, 2020 at 04:27:55AM +0000, Derrick Stolee via GitGitGadget wrote:
> 
>> Before this change:
>>
>> 	| | | | | | *
>> 	| |_|_|_|_|/|\
>> 	|/| | | | |/ /
>> 	| | | | |/| /
>> 	| | | |/| |/
>> 	| | |/| |/|
>> 	| |/| |/| |
>> 	| | |/| | |
>>
>> By adding the logic for "filling" a horizontal edge between the
>> target column and the current column, we are able to resolve the
>> issue.
>>
>> After this change:
>>
>> 	| | | | | | *
>> 	| |_|_|_|_|/|\
>> 	|/| | | | |/ /
>> 	| | |_|_|/| /
>> 	| |/| | | |/
>> 	| | | |_|/|
>> 	| | |/| | |
> 
> Hmm. Your description and your diagrams make sense to me. But one
> curious thing is that the earlier test you added for 6_* does not need
> modified. Because it continues to show:
> 
>           | | | | * 6_F
>           | |_|_|/|
>           |/| | |/
>           | | |/|
>           | |/| |
>           | * | | 6_D
> 
> rather than adding a horizontal component to the second-parent line.
> That seems inconsistent.

The issue here is that there is not enough room for a second horizontal
line. The horizontal line can only start after the previous has completely
terminated, that is

           | | | | * 6_F
           | |_|_|/|
           |/| | |/

at this point, the first horizontal line has terminated.

           | | |/|
           | |/| |

The remaining movement for the second line has no room for a horizontal
edge. The logic that I added is hit, but the for loop (over j) terminates
immediately without a single execution of the loop body.

If there was one more row to the example, then we would have:

           | | | | | * 6_F
           | |_|_|_|/|
           |/| | | |/
           | | |_|/|
           | |/| | |
           | * | | | 6_D

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH 0/1] Preserve topological ordering after merges
From: Derrick Stolee @ 2020-01-08 13:37 UTC (permalink / raw)
  To: Sergey Rudyshin, Johannes Schindelin,
	Sergey Rudyshin via GitGitGadget
  Cc: git, Junio C Hamano
In-Reply-To: <fa570dcf-b19b-f658-456e-b4174d767ba1@gmail.com>

On 1/8/2020 7:50 AM, Sergey Rudyshin wrote:
> let me explain in more detail why I thought it would make sense to stop using "--topo-order".
> in case if a user specifies a single commit, like this
> git rev-list E
> with a new algorithm the options "--date-order", "--author-date-order", "--topo-order" do not change the ordering. Because there is only one way to sort any git graph with a single tip.

This is false, unless your history is always linear (no merge commits).

> in case if a user specifies multiple tips, like this
> git rev-list --topo-order C B ^A
> current version of git displays commits ordered by commit timestamp
> which does not seem like a topological ordering.

Are you talking about which of C and B are shown first? They are shown in an order based on your input. If C and B are independent (neither can reach the other), then they will swap order if you write "git rev-list --topo-order B C ^A".

> so I decided to change the documentation so that "--topo-order" and "--date-order" be the same. And since "--topo-order" does not add anything new decided to deprecate it.

Based on this sentence, it is clear that you do not understand the difference between --topo-order and --date-order. Please take time to examine the output difference for the Git repo with the following commands:

	git log --oneline --graph --topo-order
	git log --oneline --graph --date-order

> Regarding the failed test
> I'll try to find the reason but what puzzles me is why those tests (t4202, t4215 and t6012) succeeded on all other platforms (linux32, osx-clang, windows, ...) and only failed on linux-gcc.
> In my machine those tests do not fail either (gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609)

Try running the tests with GIT_TEST_COMMIT_GRAPH=1.

-Stolee
 


^ permalink raw reply

* Re: [PATCH 0/1] sequencer: comment out the 'squash!' line
From: Johannes Schindelin @ 2020-01-08 13:23 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Junio C Hamano, Mike Rappazzo, Michael Rappazzo via GitGitGadget,
	Git List
In-Reply-To: <20200108025509.GM6570@camp.crustytoothpaste.net>

Hi brian,

On Wed, 8 Jan 2020, brian m. carlson wrote:

> On 2020-01-07 at 16:15:16, Junio C Hamano wrote:
> > "brian m. carlson" <sandals@crustytoothpaste.net> writes:
> >
> > > I can see the argument that this makes it a little harder for mechanical
> > > processing across versions, but I suspect most of that looks something
> > > like "sed -i -e '/^squash! /d' COMMIT_EDITMSG" and it probably won't be
> > > affected.
> >
> > With the left-anchoring, the search pattern will no longer find that
> > line if "squash!" is commented out, but people tend to be sloppy and
> > do not anchor the pattern would not notice the difference.  Perhaps
> > the downside may not be too severe?  I dunno.
>
> Sorry, I was perhaps bad at explaining this.  In my example, it would no
> longer remove that line, but the user wouldn't care, because it would be
> commented out and removed automatically.  So while the code wouldn't
> work, what the user wanted would be done by Git automatically.

Sounds reasonable to me.

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH 0/1] Preserve topological ordering after merges
From: Sergey Rudyshin @ 2020-01-08 12:50 UTC (permalink / raw)
  To: Johannes Schindelin, Sergey Rudyshin via GitGitGadget; +Cc: git, Junio C Hamano
In-Reply-To: <nycvar.QRO.7.76.6.2001071308290.46@tvgsbejvaqbjf.bet>

Hi Johannes,
thanks for you comments!
please find my replies below

07.01.2020 15:11, Johannes Schindelin wrote:
> Hi Sergey,
> 
> On Tue, 7 Jan 2020, Sergey Rudyshin via GitGitGadget wrote:
> 
>> This modifies the algoritm of topopological ordering. The problem with the
>> current algoritm is that it displays the graph below as following
>>
>> * E
>> |\
>> | * D
>> | |\
>> | |/
>> |/|
>> * | C
>> | * B
>> |/
>> * A
>>
>> commit B is placed between A and C, which is wrong because E stays that D
>> and B comes after C
>>
>> the only correct solution here is
>>
>> * E
>> |\
>> | * D
>> | |\
>> | |/
>> |/|
>> | * B
>> * | C
>> |/
>> * A
>>
>> while it seems that it contradicts to D staiting that C shoud be between D
>> and B The new algorithm solves this issue
>>
>> Here is an example: walk to to the root via "first" parents go E -> C -> A
>> print A because it has no parents step back to C print C because it has no
>> other parents then step back to E go D -> B -> A do not print A becase A is
>> already printed step back to B print B so on
>>
>> This change makes option "--topo-order" obsolete, because for a single
>> commit there is only one way to order parents. "--date-order" and
>> "--author-date-order" are preserved and make sence only for the case when
>> multiple commits are given to be able to sort those commits.
> 
> I have to admit that I am not a fan of this change, as I find that there
> are legitimate use cases where I want to order the commits by commit
> topology, and other use cases where I want to order them by date.
> 
> Maybe other reviewers agree with your reasoning, though, in that case you
> still will need to address the test failures in t4202, t4215 and t6012:
> https://dev.azure.com/gitgitgadget/git/_build/results?buildId=25867&view=ms.vss-test-web.build-test-results-tab
> 
> Ciao,
> Johannes

let me explain in more detail why I thought it would make sense to stop 
using "--topo-order".
in case if a user specifies a single commit, like this
git rev-list E
with a new algorithm the options "--date-order", "--author-date-order", 
"--topo-order" do not change the ordering. Because there is only one way 
to sort any git graph with a single tip.

in case if a user specifies multiple tips, like this
git rev-list --topo-order C B ^A
current version of git displays commits ordered by commit timestamp
which does not seem like a topological ordering.

so I decided to change the documentation so that "--topo-order" and 
"--date-order" be the same. And since "--topo-order" does not add 
anything new decided to deprecate it.

Form my point of view there should be no options to sort at all.
The topology should be derived from the order provided by the user
git rev-list --topo-order C B ^A
should result in C, B
git rev-list --topo-order B C ^A
should result in B, C

Regarding the failed test
I'll try to find the reason but what puzzles me is why those tests 
(t4202, t4215 and t6012) succeeded on all other platforms (linux32, 
osx-clang, windows, ...) and only failed on linux-gcc.
In my machine those tests do not fail either (gcc (Ubuntu 
5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609)

^ permalink raw reply

* Re: [PATCH] transport: don't flush when disconnecting stateless-rpc helper
From: brian m. carlson @ 2020-01-08 12:44 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Jonathan Tan
In-Reply-To: <20200108071009.GA1675456@coredump.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 694 bytes --]

On 2020-01-08 at 07:10:09, Jeff King wrote:
> diff --git a/transport.c b/transport.c
> index 83379a037d..1fdc7dac1a 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -737,7 +737,7 @@ static int disconnect_git(struct transport *transport)
>  {
>  	struct git_transport_data *data = transport->data;
>  	if (data->conn) {
> -		if (data->got_remote_heads)
> +		if (data->got_remote_heads && !transport->stateless_rpc)
>  			packet_flush(data->fd[1]);
>  		close(data->fd[0]);
>  		close(data->fd[1]);

This is as simple and elegant as I'd hoped for, and as usual, it's well
explained.  Looks great.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply

* Re: Broken branch after git commit - tracked files in staging area can't be removed with restore --staged, or commit or stash
From: Torsten Krah @ 2020-01-08 12:42 UTC (permalink / raw)
  To: Jeff King; +Cc: git@vger.kernel.org
In-Reply-To: <20200108104008.GA2207365@coredump.intra.peff.net>

Am Mittwoch, den 08.01.2020, 05:40 -0500 schrieb Jeff King:
> So there seem to be at least two bugs:
> 
>  - git-restore doesn't properly invalidate the cache-tree
> 
>  - the index-reading code is not careful enough about bogus cache-
> trees,
>    and may segfault
> 
> -Peff
-- 

Hi Peff,

thanks for confirming those bugs and taking a look at my report itself
even before I've put the small reproducer recipe.
I don't need to take any other actions or create tickets somewhere else
now to get those fixed, you're driving this from here, right?

Another question - how can I fix my broken original branch where this
bug did hit me which is in that "bogus" state now to get back to the
"original" commit made so I can remove my unwanted files with "git
reset HEAD <files>" instead of git-restore in the meantime (tried your
rm .git/index variant but after that I had files which were already in
the branch shown as unversioned after that, so that did not work very
well for me)?

kind regards

Torsten

^ permalink raw reply

* Re: bogus config file vs. 'git config --edit'
From: Jeff King @ 2020-01-08 11:57 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git
In-Reply-To: <20191227110431.GC32750@szeder.dev>

On Fri, Dec 27, 2019 at 12:04:31PM +0100, SZEDER Gábor wrote:

> I think bith 'git config --edit' and 'git help ...' should just work,
> no matter what nonsense might be in the config file, even if they then
> launch a different editor or pager than what are set in the
> configuration.

Agreed. This has come up before (e.g., [1]) but I think doing it right
would need several changes:

  - the config code is too eager to die on a syntax error; it should
    return an error up the stack

  - the stack looks like this when we first die():

      #0  die (err=0x55555582436f "%s") at usage.c:165
      #1  0x0000555555678ab6 in git_parse_source (fn=0x5555557719c4 <check_repo_format>, data=0x7fffffffe1e0, opts=0x0)
          at config.c:849
      #2  0x000055555567a9c5 in do_config_from (top=0x7fffffffdfc0, fn=0x5555557719c4 <check_repo_format>, 
          data=0x7fffffffe1e0, opts=0x0) at config.c:1546
      #3  0x000055555567aab4 in do_config_from_file (fn=0x5555557719c4 <check_repo_format>, origin_type=CONFIG_ORIGIN_FILE, 
          name=0x555555913220 ".git/config", path=0x555555913220 ".git/config", f=0x5555559117c0, data=0x7fffffffe1e0, 
          opts=0x0) at config.c:1574
      #4  0x000055555567ab80 in git_config_from_file_with_options (fn=0x5555557719c4 <check_repo_format>, 
          filename=0x555555913220 ".git/config", data=0x7fffffffe1e0, opts=0x0) at config.c:1594
      #5  0x000055555567abc5 in git_config_from_file (fn=0x5555557719c4 <check_repo_format>, 
          filename=0x555555913220 ".git/config", data=0x7fffffffe1e0) at config.c:1603
      #6  0x0000555555771e0b in read_repository_format (format=0x7fffffffe1e0, path=0x555555913220 ".git/config")
          at setup.c:523
      #7  0x0000555555771bb5 in check_repository_format_gently (gitdir=0x555555911750 ".git", candidate=0x7fffffffe1e0, 
          nongit_ok=0x7fffffffe2cc) at setup.c:460
      #8  0x000055555577272f in setup_discovered_git_dir (gitdir=0x555555911750 ".git", cwd=0x5555558c90b0 <cwd>, 
          offset=19, repo_fmt=0x7fffffffe1e0, nongit_ok=0x7fffffffe2cc) at setup.c:770
      #9  0x0000555555773490 in setup_git_directory_gently (nongit_ok=0x7fffffffe2cc) at setup.c:1100
      #10 0x00005555555706c8 in run_builtin (p=0x5555558bb980 <commands+576>, argc=2, argv=0x7fffffffe538) at git.c:416
      #11 0x0000555555570baf in handle_builtin (argc=2, argv=0x7fffffffe538) at git.c:674
      #12 0x00005555555711b2 in cmd_main (argc=2, argv=0x7fffffffe538) at git.c:842
      #13 0x000055555563412e in main (argc=2, argv=0x7fffffffe538) at common-main.c:52

     So we'd need to teach read_repository_format() to handle the error
     and just assume we're not in a Git repo. But then how would "git
     config --edit" realize it needs to edit the repo config file?

  - assuming we get around the above problem, I suspect we'd run into
    other things that want to read the config (e.g., loading default
    config like the editor). We could be more lenient everywhere, but I
    suspect that most of the other commands _do_ want to die on bogus
    config (rather than do something unexpected). I wouldn't want to
    change every git_config() call to handle errors, but maybe we could
    have a lenient form and use it in just enough call-sites. Or maybe
    we could detect early in git-config that we are in "--edit" mode,
    and set a global for "be lenient when reading the config". I dunno.

So I think it's definitely do-able and would be much better, but it's
probably not entirely trivial.

-Peff

[1] https://lore.kernel.org/git/A5CDBB91-E889-4849-953A-2C1DB4A04513@gmail.com/

^ permalink raw reply

* [PATCH] restore: invalidate cache-tree when removing entries with --staged
From: Jeff King @ 2020-01-08 11:43 UTC (permalink / raw)
  To: Torsten Krah; +Cc: Junio C Hamano, Emily Shaffer, git@vger.kernel.org
In-Reply-To: <20200108104008.GA2207365@coredump.intra.peff.net>

On Wed, Jan 08, 2020 at 05:40:08AM -0500, Jeff King wrote:

> So there seem to be at least two bugs:
> 
>  - git-restore doesn't properly invalidate the cache-tree
> 
>  - the index-reading code is not careful enough about bogus cache-trees,
>    and may segfault

Here's a fix for the first one. I'm adding Junio to the cc as an expert
in index and cache-tree issues. I'm pretty sure this is the correct fix,
but I have some lingering questions below.

I'm not planning on working on the second one immediately. Between this
and Emily's patch from yesterday, I have a feeling that the index code
could use an audit to be a bit more careful about handling bogus on-disk
data.

-- >8 --
Subject: restore: invalidate cache-tree when removing entries with --staged

When "git restore --staged <path>" removes a path that's in the index,
it marks the entry with CE_REMOVE, but we don't do anything to
invalidate the cache-tree. In the non-staged case, we end up in
checkout_worktree(), which calls remove_marked_cache_entries(). That
actually drops the entries from the index, as well as invalidating the
cache-tree and untracked-cache.

But with --staged, we never call checkout_worktree(), and the CE_REMOVE
entries remain. Interestingly, they are dropped when we write out the
index, but that means the resulting index is inconsistent: its
cache-tree will not match the actual entries, and running "git commit"
immediately after will create the wrong tree.

We can solve this by calling remove_marked_cache_entries() ourselves
before writing out the index. Note that we can't just hoist it out of
checkout_worktree(); that function needs to iterate over the CE_REMOVE
entries (to drop their matching worktree files) before removing them.

One curiosity about the test: without this patch, it actually triggers a
BUG() when running git-restore:

  BUG: cache-tree.c:810: new1 with flags 0x4420000 should not be in cache-tree

But in the original problem report, which used a similar recipe,
git-restore actually creates the bogus index (and the commit is created
with the wrong tree). I'm not sure why the test here behaves differently
than my out-of-suite reproduction, but what's here should catch either
symptom (and the fix corrects both cases).

Reported-by: Torsten Krah <krah.tm@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
So I'm mildly puzzled about the BUG thing above. But also: it seems
really weird that do_write_index() will drop CE_REMOVE entries as it
writes, but not invalidate the cache-tree (or the untracked-cache, which
AFAICT is probably similarly affected). Should it be doing that
invalidation? Or should it be a BUG() to write out an index without
having done remove_marked_cache_entries()?

 builtin/checkout.c |  2 ++
 t/t2070-restore.sh | 18 ++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index b52c490c8f..18ef5fb975 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -524,6 +524,8 @@ static int checkout_paths(const struct checkout_opts *opts,
 	/* Now we are committed to check them out */
 	if (opts->checkout_worktree)
 		errs |= checkout_worktree(opts);
+	else
+		remove_marked_cache_entries(&the_index, 1);

 	/*
 	 * Allow updating the index when checking out from the index.
diff --git a/t/t2070-restore.sh b/t/t2070-restore.sh
index 21c3f84459..06a5976143 100755
--- a/t/t2070-restore.sh
+++ b/t/t2070-restore.sh
@@ -96,6 +96,7 @@ test_expect_success 'restore --ignore-unmerged ignores unmerged entries' '
 '

 test_expect_success 'restore --staged adds deleted intent-to-add file back to index' '
+	test_when_finished git reset --hard &&
 	echo "nonempty" >nonempty &&
 	>empty &&
 	git add nonempty empty &&
@@ -106,4 +107,21 @@ test_expect_success 'restore --staged adds deleted intent-to-add file back to in
 	git diff --cached --exit-code
 '

+test_expect_success 'restore --staged invalidates cache tree for deletions' '
+	test_when_finished git reset --hard &&
+	>new1 &&
+	>new2 &&
+	git add new1 new2 &&
+
+	# It is important to commit and then reset here, so that the index
+	# contains a valid cache-tree for the "both" tree.
+	git commit -m both &&
+	git reset --soft HEAD^ &&
+
+	git restore --staged new1 &&
+	git commit -m "just new2" &&
+	git rev-parse HEAD:new2 &&
+	test_must_fail git rev-parse HEAD:new1
+'
+
 test_done
-- 
2.25.0.rc1.622.g8cfac75bdd

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox