From: Junio C Hamano <gitster@pobox.com>
To: "Michał Kiedrowicz" <michal.kiedrowicz@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] builtin-apply: keep information about files to be deleted
Date: Sat, 18 Apr 2009 15:05:38 -0700 [thread overview]
Message-ID: <7viql1ty6l.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: <20090418225847.54862bdf@gmail.com> (Michał Kiedrowicz's message of "Sat, 18 Apr 2009 22:58:47 +0200")
Michał Kiedrowicz <michal.kiedrowicz@gmail.com> writes:
> Junio C Hamano <gitster@pobox.com> wrote:
>
>> Michał Kiedrowicz <michal.kiedrowicz@gmail.com> writes:
>>
>>> ... However, there are some cases when these two
>>> rules may cause problems:
>>>
>>> patch #1: rename A to B
>>> patch #2: rename C to A
>>> patch #3: modify A
>>>
>>> Should patch #3 modify B (which was A) or A (which was C)?
>>>
>>> patch #1: rename A to B
>>> patch #2: rename B to A
>>> patch #3: modify A
>>> patch #4: modify B
>>>
>>> Which files should be patched by #3 and #4?
>>>
>>> In my opinion both #3 and #4 should fail (or both should succeed) --
>>> with my patch only #3 will work and #4 will be rejected, because in
>>> #2 B was marked as deleted.
>>
>> Both of the examples above cannot be emitted as a single commit by
>> format-patch; the user is feeding a combined patch. Perhaps renames
>> in each example sequence were came from one git commit but
>> modifications are from separate commit or handcrafted "follow-up"
>> patch.
>
> Yes, that's true. In "normal" case, renames and modifications should be
> handled properly and (generally) aren't subject of this discussion.
>
>>
>> There are two stances we can take:
>>
>> (1) The user knows what he is doing.
>>
>> In the first example, if he wanted the change in #3 to end up in
>> B, he would have arranged the patches in a different order, namely, 3
>> 1 2, but he didn't. We should modify A (that came from C).
>>
>> (2) In situations like these when it is unusual and there is no
>> clear and unambiguous answer, the rule has always been "fail and ask
>> the user to clean up", because silently doing a wrong thing in an
>> unusual situation that happens only once in a while is far worse than
>> interrupting the user and forcing a manual intervention.
>>
>> In the first example, there is no clear answer. Perhaps all
>> three patches were independent patches (the first two obviously came
>> from git because only we can do renames, but they may have been
>> separate commits), and the user may have reordered them (or just
>> picked a random order because he was linearizing a history with a
>> merge).
>>
>> The second one is even iffier. If we _know_ that originally patch #1
>> and #2 came from the same commit, then they represent swapping
>> between A and B, but if they came from different git commits, and if
>> the user picked patches in a random order, it may mean something
>> completely different.
>
> The problem here is that there are at least two patches which touch the
> same file(s) and it is impossible to say which patches should be handled
> atomically. However, there is no easy way to specify renames as a
> single patch. A diff containing swapping of three files looks like this:
>
> diff --git a/file2 b/file1
> similarity index 100%
> rename from file2
> rename to file1
> diff --git a/file3 b/file2
> similarity index 100%
> rename from file3
> rename to file2
> diff --git a/file1 b/file3
> similarity index 100%
> rename from file1
> rename to file3
>
> BTW: it applies correctly :).
>
>>
>> I am somewhat tempted to say that we should fail all of them,
>> including the original "single patch swapping files" brought up by
>> Linus.
>
> I may agree that difficult scenarios should be rejected, but I will
> also say that git-apply should always accept git-diff output.
>
>>
>> BUT
>>
>> Can we make use simple rule to detect problematic cases?
>>
>> - An input to git-apply can contain more than one patch that affects
>> a path; however
>>
>> - you cannot create a path that still exists, except for a path
>> that _will_ be renamed away or removed (your patch fixes this by
>> adding this "except for..." part to loosen the existing rule);
>>
>> - you cannot modify a path in a separate patch if it is involved
>> in an either side of a rename (this will catch the ambiguity of patch
>> #3 in your first example and #3 and #4 in your second example);
>
> What should happen in following situation:
>
> patch #1: modify A
> patch #2: rename A to B
>
> #2 should fail? Now it creates new B which is a copy of A before
> applying any patches and modifies A according to #1.
Yes. It is obviously a handcrafted sequence, and it could even have been
mechanically created.
Imagine a merge of two branches like this:
2----HEAD
/ /
common----1
and somebody fed "common..HEAD" to his script that internally runs
format-patch and squashes the patch output into one, perhaps:
#!/bin/sh
# Create a single patch e-mail, squashed.
tmp=/var/tmp/my-squash$$
rm -rf "$tmp" && mkdir -p "$tmp/out" || exit
trap 'rm -rf "$tmp"' 0 1 2 3 15
git format-patch -o "$tmp/out" "$@"
>"$tmp/all.messages"
>"$tmp/all.patches"
for mail in "$tmp"/out/0*
do
git mailinfo "$tmp/msg" "$tmp/patch" >"$tmp/info" <"$mail"
echo "$mail" >>.messages
cat "$tmp/msg" >>"$tmp/all.messages"
cat "$tmp/patch" >>"$tmp/all.patches"
done
(
cat "$tmp/info"; echo
cat "$tmp/all.messages"; echo
cat "$tmp/all.patches"
)
Depending on the sort order between #1 and #2, you cannot tell "modify A
and then rename it to B" is the order we will see such a patch at all. I
think it is safer to reject such a patch with the "when in doubt, do not
act too clever and risk making a silent mistake" principle.
In this particular case, the reverse order of "renaming A to B" and then
"modifing A" would fail anyway, but if you have another patch that renames
C to A in the mix of patches whose order cannot be determined, I think you
can come up with a sequence that results in an "applicable in order but is
the order really what the author intended?" situation.
> Do you mean that patches which break above rules should be
> skipped when "--reject" is set, as other failures? Or that
> whole git-apply should fail regardless of "--reject"?
I meant the latter.
prev parent reply other threads:[~2009-04-18 22:08 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-04-11 19:31 [PATCH] builtin-apply: keep information about files to be deleted Michał Kiedrowicz
2009-04-13 13:51 ` Michał Kiedrowicz
2009-04-13 18:51 ` Junio C Hamano
2009-04-13 21:03 ` Michał Kiedrowicz
2009-04-13 21:30 ` Junio C Hamano
2009-04-17 17:23 ` Michał Kiedrowicz
2009-04-18 4:59 ` Junio C Hamano
2009-04-18 11:27 ` Andreas Ericsson
2009-04-18 19:56 ` Junio C Hamano
2009-04-18 20:58 ` Michał Kiedrowicz
2009-04-18 21:03 ` [PATCH] tests: make test-apply-criss-cross-rename more robust Michał Kiedrowicz
2009-04-18 22:05 ` Junio C Hamano [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7viql1ty6l.fsf@gitster.siamese.dyndns.org \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=michal.kiedrowicz@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).