From: Junio C Hamano <gitster@pobox.com>
To: "Michał Kiedrowicz" <michal.kiedrowicz@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] builtin-apply: keep information about files to be deleted
Date: Sat, 18 Apr 2009 15:05:38 -0700 [thread overview]
Message-ID: <7viql1ty6l.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: <20090418225847.54862bdf@gmail.com> (Michał Kiedrowicz's message of "Sat, 18 Apr 2009 22:58:47 +0200")
Michał Kiedrowicz <michal.kiedrowicz@gmail.com> writes:
> Junio C Hamano <gitster@pobox.com> wrote:
>
>> Michał Kiedrowicz <michal.kiedrowicz@gmail.com> writes:
>>
>>> ... However, there are some cases when these two
>>> rules may cause problems:
>>>
>>> patch #1: rename A to B
>>> patch #2: rename C to A
>>> patch #3: modify A
>>>
>>> Should patch #3 modify B (which was A) or A (which was C)?
>>>
>>> patch #1: rename A to B
>>> patch #2: rename B to A
>>> patch #3: modify A
>>> patch #4: modify B
>>>
>>> Which files should be patched by #3 and #4?
>>>
>>> In my opinion both #3 and #4 should fail (or both should succeed) --
>>> with my patch only #3 will work and #4 will be rejected, because in
>>> #2 B was marked as deleted.
>>
>> Both of the examples above cannot be emitted as a single commit by
>> format-patch; the user is feeding a combined patch. Perhaps renames
>> in each example sequence were came from one git commit but
>> modifications are from separate commit or handcrafted "follow-up"
>> patch.
>
> Yes, that's true. In "normal" case, renames and modifications should be
> handled properly and (generally) aren't subject of this discussion.
>
>>
>> There are two stances we can take:
>>
>> (1) The user knows what he is doing.
>>
>> In the first example, if he wanted the change in #3 to end up in
>> B, he would have arranged the patches in a different order, namely, 3
>> 1 2, but he didn't. We should modify A (that came from C).
>>
>> (2) In situations like these when it is unusual and there is no
>> clear and unambiguous answer, the rule has always been "fail and ask
>> the user to clean up", because silently doing a wrong thing in an
>> unusual situation that happens only once in a while is far worse than
>> interrupting the user and forcing a manual intervention.
>>
>> In the first example, there is no clear answer. Perhaps all
>> three patches were independent patches (the first two obviously came
>> from git because only we can do renames, but they may have been
>> separate commits), and the user may have reordered them (or just
>> picked a random order because he was linearizing a history with a
>> merge).
>>
>> The second one is even iffier. If we _know_ that originally patch #1
>> and #2 came from the same commit, then they represent swapping
>> between A and B, but if they came from different git commits, and if
>> the user picked patches in a random order, it may mean something
>> completely different.
>
> The problem here is that there are at least two patches which touch the
> same file(s) and it is impossible to say which patches should be handled
> atomically. However, there is no easy way to specify renames as a
> single patch. A diff containing swapping of three files looks like this:
>
> diff --git a/file2 b/file1
> similarity index 100%
> rename from file2
> rename to file1
> diff --git a/file3 b/file2
> similarity index 100%
> rename from file3
> rename to file2
> diff --git a/file1 b/file3
> similarity index 100%
> rename from file1
> rename to file3
>
> BTW: it applies correctly :).
>
>>
>> I am somewhat tempted to say that we should fail all of them,
>> including the original "single patch swapping files" brought up by
>> Linus.
>
> I may agree that difficult scenarios should be rejected, but I will
> also say that git-apply should always accept git-diff output.
>
>>
>> BUT
>>
>> Can we make use simple rule to detect problematic cases?
>>
>> - An input to git-apply can contain more than one patch that affects
>> a path; however
>>
>> - you cannot create a path that still exists, except for a path
>> that _will_ be renamed away or removed (your patch fixes this by
>> adding this "except for..." part to loosen the existing rule);
>>
>> - you cannot modify a path in a separate patch if it is involved
>> in an either side of a rename (this will catch the ambiguity of patch
>> #3 in your first example and #3 and #4 in your second example);
>
> What should happen in following situation:
>
> patch #1: modify A
> patch #2: rename A to B
>
> #2 should fail? Now it creates new B which is a copy of A before
> applying any patches and modifies A according to #1.
Yes. It is obviously a handcrafted sequence, and it could even have been
mechanically created.
Imagine a merge of two branches like this:
2----HEAD
/ /
common----1
and somebody fed "common..HEAD" to his script that internally runs
format-patch and squashes the patch output into one, perhaps:
#!/bin/sh
# Create a single patch e-mail, squashed.
tmp=/var/tmp/my-squash$$
rm -rf "$tmp" && mkdir -p "$tmp/out" || exit
trap 'rm -rf "$tmp"' 0 1 2 3 15
git format-patch -o "$tmp/out" "$@"
>"$tmp/all.messages"
>"$tmp/all.patches"
for mail in "$tmp"/out/0*
do
git mailinfo "$tmp/msg" "$tmp/patch" >"$tmp/info" <"$mail"
echo "$mail" >>.messages
cat "$tmp/msg" >>"$tmp/all.messages"
cat "$tmp/patch" >>"$tmp/all.patches"
done
(
cat "$tmp/info"; echo
cat "$tmp/all.messages"; echo
cat "$tmp/all.patches"
)
Depending on the sort order between #1 and #2, you cannot tell "modify A
and then rename it to B" is the order we will see such a patch at all. I
think it is safer to reject such a patch with the "when in doubt, do not
act too clever and risk making a silent mistake" principle.
In this particular case, the reverse order of "renaming A to B" and then
"modifing A" would fail anyway, but if you have another patch that renames
C to A in the mix of patches whose order cannot be determined, I think you
can come up with a sequence that results in an "applicable in order but is
the order really what the author intended?" situation.
> Do you mean that patches which break above rules should be
> skipped when "--reject" is set, as other failures? Or that
> whole git-apply should fail regardless of "--reject"?
I meant the latter.
prev parent reply other threads:[~2009-04-18 22:08 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-04-11 19:31 [PATCH] builtin-apply: keep information about files to be deleted Michał Kiedrowicz
2009-04-13 13:51 ` Michał Kiedrowicz
2009-04-13 18:51 ` Junio C Hamano
2009-04-13 21:03 ` Michał Kiedrowicz
2009-04-13 21:30 ` Junio C Hamano
2009-04-17 17:23 ` Michał Kiedrowicz
2009-04-18 4:59 ` Junio C Hamano
2009-04-18 11:27 ` Andreas Ericsson
2009-04-18 19:56 ` Junio C Hamano
2009-04-18 20:58 ` Michał Kiedrowicz
2009-04-18 21:03 ` [PATCH] tests: make test-apply-criss-cross-rename more robust Michał Kiedrowicz
2009-04-18 22:05 ` Junio C Hamano [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7viql1ty6l.fsf@gitster.siamese.dyndns.org \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=michal.kiedrowicz@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.