* git-format-patch should include a checksum
@ 2010-01-26 22:34 Juliusz Chroboczek
2010-01-26 23:15 ` Sverre Rabbelier
2010-01-26 23:21 ` Junio C Hamano
0 siblings, 2 replies; 11+ messages in thread
From: Juliusz Chroboczek @ 2010-01-26 22:34 UTC (permalink / raw)
To: git
Hi,
I'm seeing Git patches being corrupted by mailers and still apply
correctly. It would be great if git-format-patch could include a hash
of the patch body (and commit message); git-am should check the hash,
and refuse to commit if the patch was corrupted (--force should override
that, of course).
Juliusz
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum
2010-01-26 22:34 git-format-patch should include a checksum Juliusz Chroboczek
@ 2010-01-26 23:15 ` Sverre Rabbelier
2010-01-26 23:21 ` Junio C Hamano
1 sibling, 0 replies; 11+ messages in thread
From: Sverre Rabbelier @ 2010-01-26 23:15 UTC (permalink / raw)
To: Juliusz Chroboczek; +Cc: git
Heya,
On Tue, Jan 26, 2010 at 23:34, Juliusz Chroboczek <jch@pps.jussieu.fr> wrote:
> I'm seeing Git patches being corrupted by mailers and still apply
> correctly. It would be great if git-format-patch could include a hash
> of the patch body (and commit message); git-am should check the hash,
> and refuse to commit if the patch was corrupted (--force should override
> that, of course).
Sounds like a good idea, have a look at cmd_format_patch in
builtin-log.c. Assuming you want to add the hash as a X- mail header
(e.g. X-Git-Message-Hash), you should probably add it before line 1019
(where rev.extra_headers is set to the current contents of 'buf'). For
the hash itself have a look at git_SHA1_{Init,Update,Final} in various
files, csum-file.c would seem like an excellent candidate, as it
writes a sha1-summed file, which is sortof what you want to do.
For the git-am part look at 'git-am.sh', you should have checking
logic warn only at first, and add a configuration option that allows
enforcing it and warn when the option isn't set, instructing the user
how to configure the option, see warn_unconfigured_deny_msg in
builtin-receive-pack.c. A few releases from now we can set it from
warn to abort (or deny, whatever the enforcing option is called).
Good luck!
--
Cheers,
Sverre Rabbelier
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum
2010-01-26 22:34 git-format-patch should include a checksum Juliusz Chroboczek
2010-01-26 23:15 ` Sverre Rabbelier
@ 2010-01-26 23:21 ` Junio C Hamano
2010-01-26 23:26 ` Sverre Rabbelier
2010-01-27 1:25 ` Juliusz Chroboczek
1 sibling, 2 replies; 11+ messages in thread
From: Junio C Hamano @ 2010-01-26 23:21 UTC (permalink / raw)
To: Juliusz Chroboczek; +Cc: git
Juliusz Chroboczek <jch@pps.jussieu.fr> writes:
> I'm seeing Git patches being corrupted by mailers and still apply
> correctly. It would be great if git-format-patch could include a hash
> of the patch body (and commit message); git-am should check the hash,
> and refuse to commit if the patch was corrupted (--force should override
> that, of course).
Do you have an example of such corrupted and incorrectly applied patches?
What kind of corruption are you talking about?
format-patch/am pair is designed to be lenient, allowing people to write
additional messages after the three-dash lines after the output is made
but before it is given to the MUA for sending the result out, for example,
so adding a checksum over the entire output and forcing a check upon
application is really a bad idea, even though, provided if the patch is
done cleanly, it might be acceptable as an optional feature.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum
2010-01-26 23:21 ` Junio C Hamano
@ 2010-01-26 23:26 ` Sverre Rabbelier
2010-01-27 0:45 ` Linus Torvalds
2010-01-27 1:25 ` Juliusz Chroboczek
1 sibling, 1 reply; 11+ messages in thread
From: Sverre Rabbelier @ 2010-01-26 23:26 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Juliusz Chroboczek, git
Heya,
On Wed, Jan 27, 2010 at 00:21, Junio C Hamano <gitster@pobox.com> wrote:
> format-patch/am pair is designed to be lenient, allowing people to write
> additional messages after the three-dash lines after the output is made
> but before it is given to the MUA for sending the result out, for example,
> so adding a checksum over the entire output and forcing a check upon
> application is really a bad idea, even though, provided if the patch is
> done cleanly, it might be acceptable as an optional feature.
I would imagine that the checksum is taken over just the actual commit
message, perhaps author information, and use the patch-id for the
patch itself, that way any comments after triple dash would be ignored, right?
--
Cheers,
Sverre Rabbelier
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum
2010-01-26 23:26 ` Sverre Rabbelier
@ 2010-01-27 0:45 ` Linus Torvalds
2010-01-27 0:50 ` Sverre Rabbelier
0 siblings, 1 reply; 11+ messages in thread
From: Linus Torvalds @ 2010-01-27 0:45 UTC (permalink / raw)
To: Sverre Rabbelier; +Cc: Junio C Hamano, Juliusz Chroboczek, git
On Wed, 27 Jan 2010, Sverre Rabbelier wrote:
>
> I would imagine that the checksum is taken over just the actual commit
> message, perhaps author information, and use the patch-id for the
> patch itself, that way any comments after triple dash would be ignored, right?
That wouldn't work either. People can, should, and do add extra things to
the message before applying it.
Examples of things I tend to add/change in the commit message:
- add ack's from people in the same thread
- add "Cc: stable@kernel.org"
- re-flow paragraphs when somebody uses a mailer that makes a mess of it.
- occasionally fix spelling and grammar
so if there is some checksum that screws that up and requires me to then
use a "--force" flag to apply it, that would be a bad thing.
I also do edit patches manually too. Having lived with people sending me
patches for the last almost twenty years, I can edit patches in my sleep.
Doing things like renaming new variables etc by search-and-replace on the
patch may not be something I do _often_, but it happens.
In short, it might make sense to have some anti-corruption logic, but I
suspect it needs a lot of thought.
Linus
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum
2010-01-27 0:45 ` Linus Torvalds
@ 2010-01-27 0:50 ` Sverre Rabbelier
2010-01-27 1:13 ` Nicolas Pitre
0 siblings, 1 reply; 11+ messages in thread
From: Sverre Rabbelier @ 2010-01-27 0:50 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, Juliusz Chroboczek, git
Heya,
On Wed, Jan 27, 2010 at 01:45, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> That wouldn't work either. People can, should, and do add extra things to
> the message before applying it.
Ah, that's a fair point.
> In short, it might make sense to have some anti-corruption logic, but I
> suspect it needs a lot of thought.
Perhaps it makes sense to make it a separate mode to git am, such that
it only checks that the patch is not corrupted, but does not apply it.
That way it would be possible to download the patch, check that it
arrived unscathed, and then do your usual patch handling. Those who do
not edit patches before applying it would be convenient to set a
configuration option that automatically does it when applying the
patch, either warning about it or aborting (as Juliusz suggested).
--
Cheers,
Sverre Rabbelier
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum
2010-01-27 0:50 ` Sverre Rabbelier
@ 2010-01-27 1:13 ` Nicolas Pitre
2010-01-27 1:25 ` Junio C Hamano
0 siblings, 1 reply; 11+ messages in thread
From: Nicolas Pitre @ 2010-01-27 1:13 UTC (permalink / raw)
To: Sverre Rabbelier; +Cc: Linus Torvalds, Junio C Hamano, Juliusz Chroboczek, git
On Wed, 27 Jan 2010, Sverre Rabbelier wrote:
> Heya,
>
> On Wed, Jan 27, 2010 at 01:45, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > That wouldn't work either. People can, should, and do add extra things to
> > the message before applying it.
>
> Ah, that's a fair point.
FWIW, I do manually edit both incoming and outgoing patches from time to
time as well.
> > In short, it might make sense to have some anti-corruption logic, but I
> > suspect it needs a lot of thought.
>
> Perhaps it makes sense to make it a separate mode to git am, such that
> it only checks that the patch is not corrupted, but does not apply it.
> That way it would be possible to download the patch, check that it
> arrived unscathed, and then do your usual patch handling. Those who do
> not edit patches before applying it would be convenient to set a
> configuration option that automatically does it when applying the
> patch, either warning about it or aborting (as Juliusz suggested).
I think what would be even more useful at first is to find out why
corrupted patches still apply.
And yet without any changes in the patch format, it should be possible
to test the validity of a patch whenever the blob for the preimage SHA1
from the index line in the patch header is available locally. Just
applying the patch to that blob and confirming it matches the postimage
SHA1 should cover many cases already.
Nicolas
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum
2010-01-26 23:21 ` Junio C Hamano
2010-01-26 23:26 ` Sverre Rabbelier
@ 2010-01-27 1:25 ` Juliusz Chroboczek
2010-01-27 1:35 ` Junio C Hamano
2010-01-27 3:01 ` Junio C Hamano
1 sibling, 2 replies; 11+ messages in thread
From: Juliusz Chroboczek @ 2010-01-27 1:25 UTC (permalink / raw)
To: git
> Do you have an example of such corrupted and incorrectly applied patches?
> What kind of corruption are you talking about?
The commit message getting rewrapped. For some reason, the patch itself
was not corrupted.
Another case is that of the commit message having its non-ASCII
characters corrupted.
> adding a checksum over the entire output and forcing a check upon
> application is really a bad idea, even though, provided if the patch
> is done cleanly, it might be acceptable as an optional feature.
The part I really care about is that git-format-patch should include
a checksum by default.
I'd be quite happy if git-am only warned about a checksum mismatch.
Linus:
> That wouldn't work either. People can, should, and do add extra things to
> the message before applying it.
Shouldn't they remove the checksum line at the same time as they edit
a patch?
Juliusz
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum
2010-01-27 1:13 ` Nicolas Pitre
@ 2010-01-27 1:25 ` Junio C Hamano
0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2010-01-27 1:25 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Sverre Rabbelier, Linus Torvalds, Junio C Hamano,
Juliusz Chroboczek, git
Nicolas Pitre <nico@fluxnic.net> writes:
> FWIW, I do manually edit both incoming and outgoing patches from time to
> time as well.
Me three.
> I think what would be even more useful at first is to find out why
> corrupted patches still apply.
Exactly. That is why I asked that question at the very beginning.
> ... Just
> applying the patch to that blob and confirming it matches the postimage
> SHA1 should cover many cases already.
That would work when you are/have the sole authority (so contributors
won't send patches based on some other trees), you push out often (to keep
the length of the patch queue contributors keep short).
That would make it more likely that others base their work on what you
published, and send patches from their base version all the way (not
skipping "this is what I sent earlier but haven't been accepted nor pushed
out"). Otherwise it will be unlikely that you have the object recorded as
the preimage. Often when I receive follow-up patches from people, some
are based on what was committed by me (possibly with tweaks), and some
others are based on what was seen on the list (lacking the tweaks), yet
some others are based on random other versions. I'll have preimages only
in the first case.
Usually while editing incoming patch text (not log message), you mostly
touch postimage, but when fixing up a diff that was based on a bit stale
version, you need to touch preimage as well. In these cases, the blob
object names recorded in the patch wouldn't be very useful.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum
2010-01-27 1:25 ` Juliusz Chroboczek
@ 2010-01-27 1:35 ` Junio C Hamano
2010-01-27 3:01 ` Junio C Hamano
1 sibling, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2010-01-27 1:35 UTC (permalink / raw)
To: Juliusz Chroboczek; +Cc: git
Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr> writes:
>> That wouldn't work either. People can, should, and do add extra things to
>> the message before applying it.
>
> Shouldn't they remove the checksum line at the same time as they edit
> a patch?
Why force extra work on people, especially the ones who _receive_ patches,
when they say they _don't_ want to have such a noise by default?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum
2010-01-27 1:25 ` Juliusz Chroboczek
2010-01-27 1:35 ` Junio C Hamano
@ 2010-01-27 3:01 ` Junio C Hamano
1 sibling, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2010-01-27 3:01 UTC (permalink / raw)
To: Juliusz Chroboczek; +Cc: git
Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr> writes:
>> Do you have an example of such corrupted and incorrectly applied patches?
>> What kind of corruption are you talking about?
>
> The commit message getting rewrapped. For some reason, the patch itself
> was not corrupted.
You hopefully check the message (and of course the patch) before applying,
so "automatically reject and require force" wouldn't help you much in this
case. Either you reject and tell the sender to resend, or you reflow it
yourself to save the sender (and yourself) the hassle of round-trip.
And if you choose to do the latter, having to force it would actively
inconvenience you.
> Another case is that of the commit message having its non-ASCII
> characters corrupted.
I've seen this one. It often is that the commit object records UTF-8, but
somehow the MUA didn't mark it as such (or incorrectly marked it as
ISO-8859-1), and mailinfo ended up doing unnecessary conversion. When
this happens, not just the message but also the patch text is affected.
But to use "checksumming" as a solution for this, you need to think about
what you checksum. Output from format-patch is "text/plain;charset=UTF-8"
and requires 8-bit clean transport, but by cutting and pasting that into
MUA you often end up with whatever MIME B/Q-quoting your MUA gives you,
and mailinfo is actively unwraps them, so the right place to add an
optional check _might_ be immediately after we run mailinfo. At that
point, however, the author name and the subject line is in separate
records from the body of the commit log message.
More realistically, the kind of MUA corruption I personally see most often
is "text/plain; format-flowed"; I have a pre-applypatch hook to reject
them altogether, but they tend to corrupt leading whitespaces so badly
that the patch seldome applies. This is not something "checksumming" is
the right tool to solve.
One thing that we _might_ consider doing is to reduce the default length
of the function name we place on the hunk header line. For some reason, I
see they get wrapped in messages I see, even when the proposed commit log
message has overlong lines.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-01-27 3:01 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-26 22:34 git-format-patch should include a checksum Juliusz Chroboczek
2010-01-26 23:15 ` Sverre Rabbelier
2010-01-26 23:21 ` Junio C Hamano
2010-01-26 23:26 ` Sverre Rabbelier
2010-01-27 0:45 ` Linus Torvalds
2010-01-27 0:50 ` Sverre Rabbelier
2010-01-27 1:13 ` Nicolas Pitre
2010-01-27 1:25 ` Junio C Hamano
2010-01-27 1:25 ` Juliusz Chroboczek
2010-01-27 1:35 ` Junio C Hamano
2010-01-27 3:01 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).