* git-format-patch should include a checksum @ 2010-01-26 22:34 Juliusz Chroboczek 2010-01-26 23:15 ` Sverre Rabbelier 2010-01-26 23:21 ` Junio C Hamano 0 siblings, 2 replies; 11+ messages in thread From: Juliusz Chroboczek @ 2010-01-26 22:34 UTC (permalink / raw) To: git Hi, I'm seeing Git patches being corrupted by mailers and still apply correctly. It would be great if git-format-patch could include a hash of the patch body (and commit message); git-am should check the hash, and refuse to commit if the patch was corrupted (--force should override that, of course). Juliusz ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum 2010-01-26 22:34 git-format-patch should include a checksum Juliusz Chroboczek @ 2010-01-26 23:15 ` Sverre Rabbelier 2010-01-26 23:21 ` Junio C Hamano 1 sibling, 0 replies; 11+ messages in thread From: Sverre Rabbelier @ 2010-01-26 23:15 UTC (permalink / raw) To: Juliusz Chroboczek; +Cc: git Heya, On Tue, Jan 26, 2010 at 23:34, Juliusz Chroboczek <jch@pps.jussieu.fr> wrote: > I'm seeing Git patches being corrupted by mailers and still apply > correctly. It would be great if git-format-patch could include a hash > of the patch body (and commit message); git-am should check the hash, > and refuse to commit if the patch was corrupted (--force should override > that, of course). Sounds like a good idea, have a look at cmd_format_patch in builtin-log.c. Assuming you want to add the hash as a X- mail header (e.g. X-Git-Message-Hash), you should probably add it before line 1019 (where rev.extra_headers is set to the current contents of 'buf'). For the hash itself have a look at git_SHA1_{Init,Update,Final} in various files, csum-file.c would seem like an excellent candidate, as it writes a sha1-summed file, which is sortof what you want to do. For the git-am part look at 'git-am.sh', you should have checking logic warn only at first, and add a configuration option that allows enforcing it and warn when the option isn't set, instructing the user how to configure the option, see warn_unconfigured_deny_msg in builtin-receive-pack.c. A few releases from now we can set it from warn to abort (or deny, whatever the enforcing option is called). Good luck! -- Cheers, Sverre Rabbelier ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum 2010-01-26 22:34 git-format-patch should include a checksum Juliusz Chroboczek 2010-01-26 23:15 ` Sverre Rabbelier @ 2010-01-26 23:21 ` Junio C Hamano 2010-01-26 23:26 ` Sverre Rabbelier 2010-01-27 1:25 ` Juliusz Chroboczek 1 sibling, 2 replies; 11+ messages in thread From: Junio C Hamano @ 2010-01-26 23:21 UTC (permalink / raw) To: Juliusz Chroboczek; +Cc: git Juliusz Chroboczek <jch@pps.jussieu.fr> writes: > I'm seeing Git patches being corrupted by mailers and still apply > correctly. It would be great if git-format-patch could include a hash > of the patch body (and commit message); git-am should check the hash, > and refuse to commit if the patch was corrupted (--force should override > that, of course). Do you have an example of such corrupted and incorrectly applied patches? What kind of corruption are you talking about? format-patch/am pair is designed to be lenient, allowing people to write additional messages after the three-dash lines after the output is made but before it is given to the MUA for sending the result out, for example, so adding a checksum over the entire output and forcing a check upon application is really a bad idea, even though, provided if the patch is done cleanly, it might be acceptable as an optional feature. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum 2010-01-26 23:21 ` Junio C Hamano @ 2010-01-26 23:26 ` Sverre Rabbelier 2010-01-27 0:45 ` Linus Torvalds 2010-01-27 1:25 ` Juliusz Chroboczek 1 sibling, 1 reply; 11+ messages in thread From: Sverre Rabbelier @ 2010-01-26 23:26 UTC (permalink / raw) To: Junio C Hamano; +Cc: Juliusz Chroboczek, git Heya, On Wed, Jan 27, 2010 at 00:21, Junio C Hamano <gitster@pobox.com> wrote: > format-patch/am pair is designed to be lenient, allowing people to write > additional messages after the three-dash lines after the output is made > but before it is given to the MUA for sending the result out, for example, > so adding a checksum over the entire output and forcing a check upon > application is really a bad idea, even though, provided if the patch is > done cleanly, it might be acceptable as an optional feature. I would imagine that the checksum is taken over just the actual commit message, perhaps author information, and use the patch-id for the patch itself, that way any comments after triple dash would be ignored, right? -- Cheers, Sverre Rabbelier ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum 2010-01-26 23:26 ` Sverre Rabbelier @ 2010-01-27 0:45 ` Linus Torvalds 2010-01-27 0:50 ` Sverre Rabbelier 0 siblings, 1 reply; 11+ messages in thread From: Linus Torvalds @ 2010-01-27 0:45 UTC (permalink / raw) To: Sverre Rabbelier; +Cc: Junio C Hamano, Juliusz Chroboczek, git On Wed, 27 Jan 2010, Sverre Rabbelier wrote: > > I would imagine that the checksum is taken over just the actual commit > message, perhaps author information, and use the patch-id for the > patch itself, that way any comments after triple dash would be ignored, right? That wouldn't work either. People can, should, and do add extra things to the message before applying it. Examples of things I tend to add/change in the commit message: - add ack's from people in the same thread - add "Cc: stable@kernel.org" - re-flow paragraphs when somebody uses a mailer that makes a mess of it. - occasionally fix spelling and grammar so if there is some checksum that screws that up and requires me to then use a "--force" flag to apply it, that would be a bad thing. I also do edit patches manually too. Having lived with people sending me patches for the last almost twenty years, I can edit patches in my sleep. Doing things like renaming new variables etc by search-and-replace on the patch may not be something I do _often_, but it happens. In short, it might make sense to have some anti-corruption logic, but I suspect it needs a lot of thought. Linus ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum 2010-01-27 0:45 ` Linus Torvalds @ 2010-01-27 0:50 ` Sverre Rabbelier 2010-01-27 1:13 ` Nicolas Pitre 0 siblings, 1 reply; 11+ messages in thread From: Sverre Rabbelier @ 2010-01-27 0:50 UTC (permalink / raw) To: Linus Torvalds; +Cc: Junio C Hamano, Juliusz Chroboczek, git Heya, On Wed, Jan 27, 2010 at 01:45, Linus Torvalds <torvalds@linux-foundation.org> wrote: > That wouldn't work either. People can, should, and do add extra things to > the message before applying it. Ah, that's a fair point. > In short, it might make sense to have some anti-corruption logic, but I > suspect it needs a lot of thought. Perhaps it makes sense to make it a separate mode to git am, such that it only checks that the patch is not corrupted, but does not apply it. That way it would be possible to download the patch, check that it arrived unscathed, and then do your usual patch handling. Those who do not edit patches before applying it would be convenient to set a configuration option that automatically does it when applying the patch, either warning about it or aborting (as Juliusz suggested). -- Cheers, Sverre Rabbelier ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum 2010-01-27 0:50 ` Sverre Rabbelier @ 2010-01-27 1:13 ` Nicolas Pitre 2010-01-27 1:25 ` Junio C Hamano 0 siblings, 1 reply; 11+ messages in thread From: Nicolas Pitre @ 2010-01-27 1:13 UTC (permalink / raw) To: Sverre Rabbelier; +Cc: Linus Torvalds, Junio C Hamano, Juliusz Chroboczek, git On Wed, 27 Jan 2010, Sverre Rabbelier wrote: > Heya, > > On Wed, Jan 27, 2010 at 01:45, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > That wouldn't work either. People can, should, and do add extra things to > > the message before applying it. > > Ah, that's a fair point. FWIW, I do manually edit both incoming and outgoing patches from time to time as well. > > In short, it might make sense to have some anti-corruption logic, but I > > suspect it needs a lot of thought. > > Perhaps it makes sense to make it a separate mode to git am, such that > it only checks that the patch is not corrupted, but does not apply it. > That way it would be possible to download the patch, check that it > arrived unscathed, and then do your usual patch handling. Those who do > not edit patches before applying it would be convenient to set a > configuration option that automatically does it when applying the > patch, either warning about it or aborting (as Juliusz suggested). I think what would be even more useful at first is to find out why corrupted patches still apply. And yet without any changes in the patch format, it should be possible to test the validity of a patch whenever the blob for the preimage SHA1 from the index line in the patch header is available locally. Just applying the patch to that blob and confirming it matches the postimage SHA1 should cover many cases already. Nicolas ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum 2010-01-27 1:13 ` Nicolas Pitre @ 2010-01-27 1:25 ` Junio C Hamano 0 siblings, 0 replies; 11+ messages in thread From: Junio C Hamano @ 2010-01-27 1:25 UTC (permalink / raw) To: Nicolas Pitre Cc: Sverre Rabbelier, Linus Torvalds, Junio C Hamano, Juliusz Chroboczek, git Nicolas Pitre <nico@fluxnic.net> writes: > FWIW, I do manually edit both incoming and outgoing patches from time to > time as well. Me three. > I think what would be even more useful at first is to find out why > corrupted patches still apply. Exactly. That is why I asked that question at the very beginning. > ... Just > applying the patch to that blob and confirming it matches the postimage > SHA1 should cover many cases already. That would work when you are/have the sole authority (so contributors won't send patches based on some other trees), you push out often (to keep the length of the patch queue contributors keep short). That would make it more likely that others base their work on what you published, and send patches from their base version all the way (not skipping "this is what I sent earlier but haven't been accepted nor pushed out"). Otherwise it will be unlikely that you have the object recorded as the preimage. Often when I receive follow-up patches from people, some are based on what was committed by me (possibly with tweaks), and some others are based on what was seen on the list (lacking the tweaks), yet some others are based on random other versions. I'll have preimages only in the first case. Usually while editing incoming patch text (not log message), you mostly touch postimage, but when fixing up a diff that was based on a bit stale version, you need to touch preimage as well. In these cases, the blob object names recorded in the patch wouldn't be very useful. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum 2010-01-26 23:21 ` Junio C Hamano 2010-01-26 23:26 ` Sverre Rabbelier @ 2010-01-27 1:25 ` Juliusz Chroboczek 2010-01-27 1:35 ` Junio C Hamano 2010-01-27 3:01 ` Junio C Hamano 1 sibling, 2 replies; 11+ messages in thread From: Juliusz Chroboczek @ 2010-01-27 1:25 UTC (permalink / raw) To: git > Do you have an example of such corrupted and incorrectly applied patches? > What kind of corruption are you talking about? The commit message getting rewrapped. For some reason, the patch itself was not corrupted. Another case is that of the commit message having its non-ASCII characters corrupted. > adding a checksum over the entire output and forcing a check upon > application is really a bad idea, even though, provided if the patch > is done cleanly, it might be acceptable as an optional feature. The part I really care about is that git-format-patch should include a checksum by default. I'd be quite happy if git-am only warned about a checksum mismatch. Linus: > That wouldn't work either. People can, should, and do add extra things to > the message before applying it. Shouldn't they remove the checksum line at the same time as they edit a patch? Juliusz ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum 2010-01-27 1:25 ` Juliusz Chroboczek @ 2010-01-27 1:35 ` Junio C Hamano 2010-01-27 3:01 ` Junio C Hamano 1 sibling, 0 replies; 11+ messages in thread From: Junio C Hamano @ 2010-01-27 1:35 UTC (permalink / raw) To: Juliusz Chroboczek; +Cc: git Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr> writes: >> That wouldn't work either. People can, should, and do add extra things to >> the message before applying it. > > Shouldn't they remove the checksum line at the same time as they edit > a patch? Why force extra work on people, especially the ones who _receive_ patches, when they say they _don't_ want to have such a noise by default? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git-format-patch should include a checksum 2010-01-27 1:25 ` Juliusz Chroboczek 2010-01-27 1:35 ` Junio C Hamano @ 2010-01-27 3:01 ` Junio C Hamano 1 sibling, 0 replies; 11+ messages in thread From: Junio C Hamano @ 2010-01-27 3:01 UTC (permalink / raw) To: Juliusz Chroboczek; +Cc: git Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr> writes: >> Do you have an example of such corrupted and incorrectly applied patches? >> What kind of corruption are you talking about? > > The commit message getting rewrapped. For some reason, the patch itself > was not corrupted. You hopefully check the message (and of course the patch) before applying, so "automatically reject and require force" wouldn't help you much in this case. Either you reject and tell the sender to resend, or you reflow it yourself to save the sender (and yourself) the hassle of round-trip. And if you choose to do the latter, having to force it would actively inconvenience you. > Another case is that of the commit message having its non-ASCII > characters corrupted. I've seen this one. It often is that the commit object records UTF-8, but somehow the MUA didn't mark it as such (or incorrectly marked it as ISO-8859-1), and mailinfo ended up doing unnecessary conversion. When this happens, not just the message but also the patch text is affected. But to use "checksumming" as a solution for this, you need to think about what you checksum. Output from format-patch is "text/plain;charset=UTF-8" and requires 8-bit clean transport, but by cutting and pasting that into MUA you often end up with whatever MIME B/Q-quoting your MUA gives you, and mailinfo is actively unwraps them, so the right place to add an optional check _might_ be immediately after we run mailinfo. At that point, however, the author name and the subject line is in separate records from the body of the commit log message. More realistically, the kind of MUA corruption I personally see most often is "text/plain; format-flowed"; I have a pre-applypatch hook to reject them altogether, but they tend to corrupt leading whitespaces so badly that the patch seldome applies. This is not something "checksumming" is the right tool to solve. One thing that we _might_ consider doing is to reduce the default length of the function name we place on the hunk header line. For some reason, I see they get wrapped in messages I see, even when the proposed commit log message has overlong lines. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-01-27 3:01 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-01-26 22:34 git-format-patch should include a checksum Juliusz Chroboczek 2010-01-26 23:15 ` Sverre Rabbelier 2010-01-26 23:21 ` Junio C Hamano 2010-01-26 23:26 ` Sverre Rabbelier 2010-01-27 0:45 ` Linus Torvalds 2010-01-27 0:50 ` Sverre Rabbelier 2010-01-27 1:13 ` Nicolas Pitre 2010-01-27 1:25 ` Junio C Hamano 2010-01-27 1:25 ` Juliusz Chroboczek 2010-01-27 1:35 ` Junio C Hamano 2010-01-27 3:01 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).