git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Michał Kiedrowicz" <michal.kiedrowicz@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] builtin-apply: keep information about files to be deleted
Date: Sat, 18 Apr 2009 15:05:38 -0700	[thread overview]
Message-ID: <7viql1ty6l.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: <20090418225847.54862bdf@gmail.com> (Michał Kiedrowicz's message of "Sat, 18 Apr 2009 22:58:47 +0200")

Michał Kiedrowicz <michal.kiedrowicz@gmail.com> writes:

> Junio C Hamano <gitster@pobox.com> wrote:
>
>> Michał Kiedrowicz <michal.kiedrowicz@gmail.com> writes:
>>   
>>> ... However, there are some cases when these two
>>> rules may cause problems:
>>>
>>> patch #1: rename A to B
>>> patch #2: rename C to A
>>> patch #3: modify A
>>>
>>> Should patch #3 modify B (which was A) or A (which was C)?
>>>
>>> patch #1: rename A to B
>>> patch #2: rename B to A
>>> patch #3: modify A
>>> patch #4: modify B
>>>
>>> Which files should be patched by #3 and #4?
>>>
>>> In my opinion both #3 and #4 should fail (or both should succeed) --
>>> with my patch only #3 will work and #4 will be rejected, because in
>>> #2 B was marked as deleted.  
>> 
>> Both of the examples above cannot be emitted as a single commit by
>> format-patch; the user is feeding a combined patch.  Perhaps renames
>> in each example sequence were came from one git commit but
>> modifications are from separate commit or handcrafted "follow-up"
>> patch.
>
> Yes, that's true. In "normal" case, renames and modifications should be
> handled properly and (generally) aren't subject of this discussion.
>
>>
>> There are two stances we can take:
>> 
>>  (1) The user knows what he is doing.
>> 
>>      In the first example, if he wanted the change in #3 to end up in
>> B, he would have arranged the patches in a different order, namely, 3
>> 1 2, but he didn't.  We should modify A (that came from C).
>> 
>>  (2) In situations like these when it is unusual and there is no
>> clear and unambiguous answer, the rule has always been "fail and ask
>> the user to clean up", because silently doing a wrong thing in an
>> unusual situation that happens only once in a while is far worse than
>>      interrupting the user and forcing a manual intervention.
>> 
>>      In the first example, there is no clear answer.  Perhaps all
>> three patches were independent patches (the first two obviously came
>> from git because only we can do renames, but they may have been
>> separate commits), and the user may have reordered them (or just
>> picked a random order because he was linearizing a history with a
>> merge).
>> 
>> The second one is even iffier.  If we _know_ that originally patch #1
>> and #2 came from the same commit, then they represent swapping
>> between A and B, but if they came from different git commits, and if
>> the user picked patches in a random order, it may mean something
>> completely different.
>
> The problem here is that there are at least two patches which touch the
> same file(s) and it is impossible to say which patches should be handled
> atomically. However, there is no easy way to specify renames as a
> single patch. A diff containing swapping of three files looks like this:
>
> 	diff --git a/file2 b/file1
> 	similarity index 100%
> 	rename from file2
> 	rename to file1
> 	diff --git a/file3 b/file2
> 	similarity index 100%
> 	rename from file3
> 	rename to file2
> 	diff --git a/file1 b/file3
> 	similarity index 100%
> 	rename from file1
> 	rename to file3
>
> BTW: it applies correctly :).
>
>> 
>> I am somewhat tempted to say that we should fail all of them,
>> including the original "single patch swapping files" brought up by
>> Linus.
>
> I may agree that difficult scenarios should be rejected, but I will
> also say that git-apply should always accept git-diff output.
>
>> 
>> BUT
>> 
>> Can we make use simple rule to detect problematic cases?
>> 
>>  - An input to git-apply can contain more than one patch that affects
>> a path; however
>> 
>>    - you cannot create a path that still exists, except for a path
>> that _will_ be renamed away or removed (your patch fixes this by
>> adding this "except for..." part to loosen the existing rule);
>> 
>>    - you cannot modify a path in a separate patch if it is involved
>> in an either side of a rename (this will catch the ambiguity of patch
>> #3 in your first example and #3 and #4 in your second example);
>
> What should happen in following situation:
>
> patch #1: modify A
> patch #2: rename A to B
>
> #2 should fail? Now it creates new B which is a copy of A before
> applying any patches and modifies A according to #1.

Yes.  It is obviously a handcrafted sequence, and it could even have been
mechanically created.

Imagine a merge of two branches like this:

               2----HEAD
              /    /
	common----1

and somebody fed "common..HEAD" to his script that internally runs
format-patch and squashes the patch output into one, perhaps:

	#!/bin/sh
	# Create a single patch e-mail, squashed.
        tmp=/var/tmp/my-squash$$
	rm -rf "$tmp" && mkdir -p "$tmp/out" || exit
        trap 'rm -rf "$tmp"' 0 1 2 3 15
	git format-patch -o "$tmp/out" "$@"
	>"$tmp/all.messages"
        >"$tmp/all.patches"
        for mail in "$tmp"/out/0*
        do
        	git mailinfo "$tmp/msg" "$tmp/patch" >"$tmp/info" <"$mail"
                echo "$mail" >>.messages
                cat "$tmp/msg" >>"$tmp/all.messages"
                cat "$tmp/patch" >>"$tmp/all.patches"
	done
	(
	        cat "$tmp/info"; echo
                cat "$tmp/all.messages"; echo
	        cat "$tmp/all.patches"
	)

Depending on the sort order between #1 and #2, you cannot tell "modify A
and then rename it to B" is the order we will see such a patch at all.  I
think it is safer to reject such a patch with the "when in doubt, do not
act too clever and risk making a silent mistake" principle.

In this particular case, the reverse order of "renaming A to B" and then
"modifing A" would fail anyway, but if you have another patch that renames
C to A in the mix of patches whose order cannot be determined, I think you
can come up with a sequence that results in an "applicable in order but is
the order really what the author intended?" situation.

> Do you mean that patches which break above rules should be
> skipped when "--reject" is set, as other failures? Or that
> whole git-apply should fail regardless of "--reject"?

I meant the latter.

      parent reply	other threads:[~2009-04-18 22:08 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-04-11 19:31 [PATCH] builtin-apply: keep information about files to be deleted Michał Kiedrowicz
2009-04-13 13:51 ` Michał Kiedrowicz
2009-04-13 18:51 ` Junio C Hamano
2009-04-13 21:03   ` Michał Kiedrowicz
2009-04-13 21:30     ` Junio C Hamano
2009-04-17 17:23       ` Michał Kiedrowicz
2009-04-18  4:59         ` Junio C Hamano
2009-04-18 11:27           ` Andreas Ericsson
2009-04-18 19:56             ` Junio C Hamano
2009-04-18 20:58           ` Michał Kiedrowicz
2009-04-18 21:03             ` [PATCH] tests: make test-apply-criss-cross-rename more robust Michał Kiedrowicz
2009-04-18 22:05             ` Junio C Hamano [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7viql1ty6l.fsf@gitster.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=michal.kiedrowicz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).