* More builtin git-am issues.. @ 2015-09-04 23:47 Linus Torvalds 2015-09-04 23:52 ` Linus Torvalds ` (2 more replies) 0 siblings, 3 replies; 30+ messages in thread From: Linus Torvalds @ 2015-09-04 23:47 UTC (permalink / raw) To: Junio C Hamano, Paul Tan; +Cc: Git Mailing List Ok, this may not be new either, but I'm trying to be careful when using "git am" these days, because I know it got rewritten. And I _think_ the whitespace handling for adding sign-offs got scrogged. I just applied the usual patch-bomb from Andrew, and several of the commits (but not all) end up looking like this: Cc: <stable@vger.kernel.org> [3.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> note the extraneous whitespace line between Andrew's sign-off and mine. What's odd is that the emails I'm applying literally don't have that extra empty line, so it's git that somehow decides to add it. Only for a few cases, though. The pattern *seems* to be that git now looks at the *first* line of the sign-off block and decides that "this is a sign-off block if that first line has a sign-ff on it, ie this is fine: Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> but the failing cases have a comment by Andrew: [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> [4.2.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> ie that "[akpm@linux-foundation.org: coding-style fixes]" makes git am now decide that the previous block of text was not a sign-off block, so it adds an empty line before adding my sign-off. But very obviously it *was* a sign-off block. Maybe this isn't new at all, and it's just that I notice because I'm looking for "git am" oddities. Something is clearly wrong in "has_conforming_footer()". I *think* it's this part: if (!(found_rfc2822 || is_cherry_picked_from_line(buf + i, k - i - 1))) return 0; which basically returns 0 for _any_ line in the footer that doesn't match the found_rfc2822 format. I really think that if we find any "Signed-off-by:" in that last chunk, we should not add a whitespace. Comments? Linus ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-04 23:47 More builtin git-am issues Linus Torvalds @ 2015-09-04 23:52 ` Linus Torvalds 2015-09-05 0:07 ` Jeff King 2015-09-05 0:54 ` Junio C Hamano 2015-09-05 1:06 ` Junio C Hamano 2015-09-06 4:56 ` [PATCH] am: match --signoff to the original scripted version Junio C Hamano 2 siblings, 2 replies; 30+ messages in thread From: Linus Torvalds @ 2015-09-04 23:52 UTC (permalink / raw) To: Junio C Hamano, Paul Tan; +Cc: Git Mailing List On Fri, Sep 4, 2015 at 4:47 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > I *think* it's this part: > > if (!(found_rfc2822 || > is_cherry_picked_from_line(buf + i, k - i - 1))) > return 0; > > which basically returns 0 for _any_ line in the footer that doesn't > match the found_rfc2822 format. Confirmed. I hacked up a version that just doesn't do that check at all, and it works fine (but obviously only on well-formatted emails that really do have a sign-off). So I think that logic should basically be extended to saying - if any line in the last chunk has a "Signed-off-by:", set a flag. - at the end of the loop, if that flag wasn't set, return 0. Instead of that thing that basically returns zero immediately when it sees a line it doesn't like. I'm in the middle of my merge window, so I'm not going to get around to writing a patch until that's over. Hopefully somebody will step up in the meantime. Hint, hint. Linus ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-04 23:52 ` Linus Torvalds @ 2015-09-05 0:07 ` Jeff King 2015-09-05 0:36 ` Linus Torvalds 2015-09-05 0:54 ` Junio C Hamano 1 sibling, 1 reply; 30+ messages in thread From: Jeff King @ 2015-09-05 0:07 UTC (permalink / raw) To: Linus Torvalds; +Cc: Junio C Hamano, Paul Tan, Git Mailing List On Fri, Sep 04, 2015 at 04:52:42PM -0700, Linus Torvalds wrote: > On Fri, Sep 4, 2015 at 4:47 PM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > I *think* it's this part: > > > > if (!(found_rfc2822 || > > is_cherry_picked_from_line(buf + i, k - i - 1))) > > return 0; > > > > which basically returns 0 for _any_ line in the footer that doesn't > > match the found_rfc2822 format. > > Confirmed. I hacked up a version that just doesn't do that check at > all, and it works fine (but obviously only on well-formatted emails > that really do have a sign-off). > > So I think that logic should basically be extended to saying > > - if any line in the last chunk has a "Signed-off-by:", set a flag. > > - at the end of the loop, if that flag wasn't set, return 0. Do we want to make Signed-off-by a special token here? What about "Cc:", and other project-specific trailers? You wouldn't want to end up with: [some comment] Cc: somebody Signed-off-by: somebody else It's not a problem for git-am, which should be taking in patches that are already signed-off (and after all, if your project does not use signoffs, why are you using "am -s"?). But won't "git commit -s" run into the same problem? We could look for an arbitrary rfc2822-ish string, but I'd be worried slightly about false positives, like: This is a paragraph with arbitrary text. But one of the lines will use a colon, and that a causes a problem: because of wrapping, this line kind of looks like a trailer. Maybe only the final line needs to look rfc2822-ish? Or maybe we can constrain the content that _doesn't_ look like a trailer line. Is it sufficient to allow a block encased in "[]", rather than arbitrary text? All that being said, I think we can punt on the harder issues for now, and just restore the original git-am.sh behavior (which it looks like was fairly naive) for the upcoming release, so that there is no regression. -Peff ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 0:07 ` Jeff King @ 2015-09-05 0:36 ` Linus Torvalds 0 siblings, 0 replies; 30+ messages in thread From: Linus Torvalds @ 2015-09-05 0:36 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Paul Tan, Git Mailing List On Fri, Sep 4, 2015 at 5:07 PM, Jeff King <peff@peff.net> wrote: > > Do we want to make Signed-off-by a special token here? What about "Cc:", > and other project-specific trailers? You wouldn't want to end up with: > > [some comment] > Cc: somebody > > Signed-off-by: somebody else > > It's not a problem for git-am, which should be taking in patches that > are already signed-off (and after all, if your project does not use > signoffs, why are you using "am -s"?). But won't "git commit -s" run > into the same problem? So I'm a *bit* worried about taking anything else than Signed-off-by: exactly because of the problem you mention: > We could look for an arbitrary rfc2822-ish string, but I'd be worried > slightly about false positives, like: > > This is a paragraph with arbitrary text. But one > of the lines will use a colon, and that a causes a > problem: because of wrapping, this line kind of looks > like a trailer. which clearly needs an empty line before the sign-off. And even if we limit it to just the last line, like you suggest: > Maybe only the final line needs to look rfc2822-ish? I'd still worry, for the same exact reason. The "rfc2822-ish" check is _so_ weak that it can be trivially triggered by normal text. So maybe it doesn't require a strict "Signed-off-by:" match, but I think it needs something stricter than the found_rfc2822 format (like maybe "looks like a name/email"). We just don't want to require that *all* the lines are that way, because at least in the kernel we often do end up adding small comments in that section. The git project sign-off rules seem to be stricter, and it looks like it's almost universally of the form "Signed-off-by:" with a few "Helped-by:" and "Reviewed-by:". In the kernel, we really do tend to be messier in that area, with the sign-off chunk not just having the sign-offs and cc's etc, but also tends to have those occasional small notes left by the people forwarding the emails. And we also often don't have _strictly_ just an email. We tend to have things like Cc: <stable@vger.kernel.org> # v3.13+ etc, and sometimes we don't have have any email at all, but things like Reported-by: coverity Tested-by: MysterX on #openelec Generated-by: Coccinelle SmPL so it's hard to give a really strict rule. It's fairly free-format. That said, I would argue that when you apply a patch with sign-off, pretty much by definition that patch _should_ have a sign-off from the originator. So I suspect that the "there is an existing sign-off in that last chunk" is a fairly good rule. Even if there are lots of other lines too. If you're using "git commit -s" to just commit your own work, you presumably would normally want the extra sign-off. Or at least you can do a "git commit --amend" fairly easily to fix things up. Doing the amending later for "git am" is rather impractical, when it might have been a series of 100+ emails.. Linus ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-04 23:52 ` Linus Torvalds 2015-09-05 0:07 ` Jeff King @ 2015-09-05 0:54 ` Junio C Hamano 2015-09-05 1:06 ` Linus Torvalds 2015-09-05 7:30 ` Johannes Sixt 1 sibling, 2 replies; 30+ messages in thread From: Junio C Hamano @ 2015-09-05 0:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: Paul Tan, Git Mailing List Linus Torvalds <torvalds@linux-foundation.org> writes: > So I think that logic should basically be extended to saying > > - if any line in the last chunk has a "Signed-off-by:", set a flag. > > - at the end of the loop, if that flag wasn't set, return 0. I am reluctant to special case S-o-b: too much, even though this is about "am -s" and by definition S-o-b: is special, as that is what we are adding after all. How about a bit looser rule like this? A block of text at the end of the message, each and every line in which must match "^[^: ]+:[ ]" (that is, a "keyword" that does not contain a whitespace nor a colon, followed by a colon and whitespace, and arbitrary value thru the end of line) is a signature block. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 0:54 ` Junio C Hamano @ 2015-09-05 1:06 ` Linus Torvalds 2015-09-05 1:23 ` Junio C Hamano 2015-09-05 7:30 ` Johannes Sixt 1 sibling, 1 reply; 30+ messages in thread From: Linus Torvalds @ 2015-09-05 1:06 UTC (permalink / raw) To: Junio C Hamano; +Cc: Paul Tan, Git Mailing List On Fri, Sep 4, 2015 at 5:54 PM, Junio C Hamano <gitster@pobox.com> wrote: > > How about a bit looser rule like this? > > A block of text at the end of the message, each and every > line in which must match "^[^: ]+:[ ]" (that is, > a "keyword" that does not contain a whitespace nor a colon, > followed by a colon and whitespace, and arbitrary value thru > the end of line) is a signature block. No. That's still broken. The thing is, and that was what the report was all about, not every line _is_ of that format. We have commetns from the sign-off people. Things like this: Signed-off-by: Noam Camus <noamc@ezchip.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> [ Also removed pointless cast from "void *". - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> or Signed-off-by: Andi Kleen <ak@linux.intel.com> [ Updated comments and changelog a bit. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1424225886-18652-3-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org> so no, it is simply not true that "every line must match". I'm not even seeing why you argue for that, since clearly having a sign-off-line is actually a safer choice too. The "every line must match" rule is bad, not just because it's not true like above, but also because it can be true without it being a sign-off block. For example, it's not at all unlikely that you have perfectly normal comments that just list some subsystem and their changes. Which could easily look like Trivial fixes all over the tree drm: fix whitespace mm: speeling errors kernel: indentation and codign style The above looks like a perfectly sane commit log to me. Do you seriously think that it makes for a better "sign-off block test" than one that actually checks for "is there a sign-off line"? I'd much rather have special cases like testing for specific keywords or looking for things that look like emails, than make it about being "every line has this very generic format". Linus ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 1:06 ` Linus Torvalds @ 2015-09-05 1:23 ` Junio C Hamano 2015-09-05 1:35 ` Linus Torvalds 0 siblings, 1 reply; 30+ messages in thread From: Junio C Hamano @ 2015-09-05 1:23 UTC (permalink / raw) To: Linus Torvalds; +Cc: Paul Tan, Git Mailing List Linus Torvalds <torvalds@linux-foundation.org> writes: > The thing is, and that was what the report was all about, not every > line _is_ of that format. We have commetns from the sign-off people. > Things like this: > > Signed-off-by: Noam Camus <noamc@ezchip.com> > Acked-by: Vineet Gupta <vgupta@synopsys.com> > [ Also removed pointless cast from "void *". - Linus ] > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> OK, I didn't know that was acceptable in the kernel community to have random comments like that inside the block. Can these comments span multiple paragraphs? What I am wondering is what you want to see in a case like this: Signed-off-by: Noam Camus <noamc@ezchip.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> [ Also removed pointless cast from "void *". - Linus] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [ Ahh, we have to tell at least these people - stakeholder class 1 - staeholder class 2 ] Cc: foo Cc: bar ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 1:23 ` Junio C Hamano @ 2015-09-05 1:35 ` Linus Torvalds 2015-09-05 16:01 ` Junio C Hamano 0 siblings, 1 reply; 30+ messages in thread From: Linus Torvalds @ 2015-09-05 1:35 UTC (permalink / raw) To: Junio C Hamano; +Cc: Paul Tan, Git Mailing List On Fri, Sep 4, 2015 at 6:23 PM, Junio C Hamano <gitster@pobox.com> wrote: > > OK, I didn't know that was acceptable in the kernel community > to have random comments like that inside the block. Can these > comments span multiple paragraphs? What I am wondering is what > you want to see in a case like this: > > Signed-off-by: Noam Camus <noamc@ezchip.com> > Acked-by: Vineet Gupta <vgupta@synopsys.com> > [ Also removed pointless cast from "void *". - Linus] > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > [ Ahh, we have to tell at least these people > > - stakeholder class 1 > - staeholder class 2 > ] > Cc: foo > Cc: bar So quite frankly, at that point if git doesn't recognize it as a sign-off block, I don't think it's a big deal. That said, the original "git am" rules actually seem to be rather straightforward: it's never an issue about "last block of text", and it's simply an issue of "is there a sign-ff _anywhere_ in the text". That simplicity has a certain appeal to me. I don't think it was necessarily written that way because it was "well designed" - I suspect it is more an issue of "easy to implement in a shell-script". And it's possible that I'm mis-reading the scripts too. It's not like I _remember_ what the exact behavior was, I just think it used to work really well for us (ie I don't recall seeing lots of those empty lines in the middle of the sign-off block before, and this current merge window it happened for four of the emails I applied from Andrew's 119-patch series.. Four out of 119 emails may not be a big percentage, but it does mean that it's not horribly unusual either.. Linus ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 1:35 ` Linus Torvalds @ 2015-09-05 16:01 ` Junio C Hamano 0 siblings, 0 replies; 30+ messages in thread From: Junio C Hamano @ 2015-09-05 16:01 UTC (permalink / raw) To: Linus Torvalds; +Cc: Paul Tan, Git Mailing List Linus Torvalds <torvalds@linux-foundation.org> writes: > That said, the original "git am" rules actually seem to be rather > straightforward: it's never an issue about "last block of text", and > it's simply an issue of "is there a sign-ff _anywhere_ in the text". > > That simplicity has a certain appeal to me. I don't think it was > necessarily written that way because it was "well designed" - I > suspect it is more an issue of "easy to implement in a shell-script". Guilty as pointed out. > Four out of 119 emails may not be a big percentage, but it does mean > that it's not horribly unusual either.. Sure. Thanks for these examples. I was aware that people do strange things with the footer, but with the first example of [akpm] comment at the very beginning alone, I wouldn't have guessed why intermixing one-liner comments directly in the chain of Signed-of-by: lines made any sense. Call it lack of imagination, but each sign-off optionally prefixed by a single-liner summary of what change was done makes sense why people do want to use these lines that way. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 0:54 ` Junio C Hamano 2015-09-05 1:06 ` Linus Torvalds @ 2015-09-05 7:30 ` Johannes Sixt 2015-09-05 8:03 ` Jeff King 2015-09-05 15:24 ` Junio C Hamano 1 sibling, 2 replies; 30+ messages in thread From: Johannes Sixt @ 2015-09-05 7:30 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Paul Tan, Git Mailing List Am 05.09.2015 um 02:54 schrieb Junio C Hamano: > Linus Torvalds <torvalds@linux-foundation.org> writes: > >> So I think that logic should basically be extended to saying >> >> - if any line in the last chunk has a "Signed-off-by:", set a flag. >> >> - at the end of the loop, if that flag wasn't set, return 0. > > I am reluctant to special case S-o-b: too much, even though this is > about "am -s" and by definition S-o-b: is special, as that is what > we are adding after all. > > How about a bit looser rule like this? > > A block of text at the end of the message, each and every > line in which must match "^[^: ]+:[ ]" (that is, > a "keyword" that does not contain a whitespace nor a colon, > followed by a colon and whitespace, and arbitrary value thru > the end of line) is a signature block. Why do we need a new rule? The old git-am had a logic that pleased everyone, and it must have been implemented somewhere. Shouldn't it be sufficient to just re-implement or re-use that logic? -- Hannes ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 7:30 ` Johannes Sixt @ 2015-09-05 8:03 ` Jeff King 2015-09-05 16:10 ` Junio C Hamano 2015-09-05 15:24 ` Junio C Hamano 1 sibling, 1 reply; 30+ messages in thread From: Jeff King @ 2015-09-05 8:03 UTC (permalink / raw) To: Johannes Sixt; +Cc: Junio C Hamano, Linus Torvalds, Paul Tan, Git Mailing List On Sat, Sep 05, 2015 at 09:30:27AM +0200, Johannes Sixt wrote: > >How about a bit looser rule like this? > > > > A block of text at the end of the message, each and every > > line in which must match "^[^: ]+:[ ]" (that is, > > a "keyword" that does not contain a whitespace nor a colon, > > followed by a colon and whitespace, and arbitrary value thru > > the end of line) is a signature block. > > Why do we need a new rule? The old git-am had a logic that pleased everyone, > and it must have been implemented somewhere. Shouldn't it be sufficient to > just re-implement or re-use that logic? That was my thought, too; if there is a regression, let's start by fixing that for the upcoming 2.6.0, and then we can worry about doing something fancier[1] later. And I think the original behavior really is what Linus is asking for: we consider the final block, even with a "[]" comment, as a S-o-b block if it has a Signed-off-by in it. So if we have the final block: [some comment] Signed-off-by: Somebody Else <them@example.com> Running "am -s" with the current master yields: [some comment] Signed-off-by: Somebody Else <them@example.com> Signed-off-by: Jeff King <peff@peff.net> which is wrong. Running the same with v2.5.0 gets: [some comment] Signed-off-by: Somebody Else <them@example.com> Signed-off-by: Jeff King <peff@peff.net> So far so good. Now let's change our input to: [some comment] Reviewed-by: Somebody Else <them@example.com> and run "am -s". Current "master" continues to stick the extra line in there. No surprise. But now so does v2.5.0! So I don't think the old behavior covered all cases well, either, and there's room for improvement. But likewise, I don't recall seeing a lot of complaints about it in practice. It seems like a sane thing to restore it for the upcoming release, and then build from there. -Peff [1] I think part of the reason people are interested in "fancy" here is that this topic extends beyond "git am". There's "commit -s", of course, but there's also the generic "interpret-trailers" command, which is supposed to be a generalization of the "--signoff" option. It would be nice if the rules remained consistent for when we add a trailer to an existing block, rather than special-casing signoffs. But again, I think that's something to shoot for in the long run. It's more important in the short term not to regress "am". ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 8:03 ` Jeff King @ 2015-09-05 16:10 ` Junio C Hamano 2015-09-05 16:32 ` Junio C Hamano 2015-09-05 19:39 ` Junio C Hamano 0 siblings, 2 replies; 30+ messages in thread From: Junio C Hamano @ 2015-09-05 16:10 UTC (permalink / raw) To: Jeff King; +Cc: Johannes Sixt, Linus Torvalds, Paul Tan, Git Mailing List Jeff King <peff@peff.net> writes: > [1] I think part of the reason people are interested in "fancy" here is > that this topic extends beyond "git am". There's "commit -s", of > course, but there's also the generic "interpret-trailers" command, > which is supposed to be a generalization of the "--signoff" option. > It would be nice if the rules remained consistent for when we add a > trailer to an existing block, rather than special-casing signoffs. > > But again, I think that's something to shoot for in the long run. > It's more important in the short term not to regress "am". Yes. I personally think the global append_signoff() everybody else uses can and should implement the same logic (whichever the exact logic is) as whatever is used by "am" because the caller that makes a call to that function knows it is adding a "Signed-off-by:" line, so existing "Signed-off-by" lines are already special in that context. But for the purpose of 2.6-rc period, I think we should start from doing a separate implementation inside builtin/am.c without touching append_signoff(). We can do a more thoughtful refactoring for append_signoff() in the next cycle. Another thing that needs consideration is the other user of the has_conforming_footer() helper append_signoff() uses, which is the codepath to add "cherry-picked-from"; it is less obvious that "find Signed-off-by: anywhere in the text to decide if the new footer needs its own paragraph" is a good logic in that context. To salvage "interpret-trailers" needs a lot more, as we are realizing that the definition that led to its external design does not match the way users use footers in the real world. This affects the internal data representation and the whole thing needs to be rethought. Here is a quick attempt to do the "just fix am regression without changing anything else". builtin/am.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/builtin/am.c b/builtin/am.c index 83b3d86..c63d238 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -1213,9 +1213,24 @@ static void NORETURN die_user_resolve(const struct am_state *state) static void am_append_signoff(struct am_state *state) { struct strbuf sb = STRBUF_INIT; + char *cp; strbuf_attach(&sb, state->msg, state->msg_len, state->msg_len); - append_signoff(&sb, 0, 0); + strbuf_complete_line(&sb); + + for (cp = sb.buf; + cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL; + cp = strchr(cp, '\n')) { + if (sb.buf < cp && cp[-1] == '\n') + break; + } + if (!cp) + strbuf_addch(&sb, '\n'); + strbuf_addf(&sb, "%s%s\n", + sign_off_header, + fmt_name(getenv("GIT_COMMITTER_NAME"), + getenv("GIT_COMMITTER_EMAIL"))); + strbuf_addch(&sb, '\n'); state->msg = strbuf_detach(&sb, &state->msg_len); } ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 16:10 ` Junio C Hamano @ 2015-09-05 16:32 ` Junio C Hamano 2015-09-05 16:57 ` Junio C Hamano 2015-09-05 19:39 ` Junio C Hamano 1 sibling, 1 reply; 30+ messages in thread From: Junio C Hamano @ 2015-09-05 16:32 UTC (permalink / raw) To: Jeff King; +Cc: Johannes Sixt, Linus Torvalds, Paul Tan, Git Mailing List Junio C Hamano <gitster@pobox.com> writes: > But for the purpose of 2.6-rc period, I think we should start from > doing a separate implementation inside builtin/am.c without touching > append_signoff(). > ... > Here is a quick attempt to do the "just fix am regression without > changing anything else". And a quick attempt without even compilation testing has flaws as expected X-<. Second attempt. builtin/am.c | 25 +++++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-) diff --git a/builtin/am.c b/builtin/am.c index 83b3d86..3a65d09 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -1207,6 +1207,27 @@ static void NORETURN die_user_resolve(const struct am_state *state) exit(128); } +static void am_signoff(struct strbuf *sb) +{ + char *cp; + + strbuf_complete_line(sb); + + for (cp = sb->buf; + cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL; + cp = strchr(cp, '\n')) { + if (sb->buf == cp || cp[-1] == '\n') + break; + } + if (!cp) + strbuf_addch(sb, '\n'); + strbuf_addf(sb, "%s%s\n", + sign_off_header, + fmt_name(getenv("GIT_COMMITTER_NAME"), + getenv("GIT_COMMITTER_EMAIL"))); + strbuf_addch(sb, '\n'); +} + /** * Appends signoff to the "msg" field of the am_state. */ @@ -1215,7 +1236,7 @@ static void am_append_signoff(struct am_state *state) struct strbuf sb = STRBUF_INIT; strbuf_attach(&sb, state->msg, state->msg_len, state->msg_len); - append_signoff(&sb, 0, 0); + am_signoff(&sb); state->msg = strbuf_detach(&sb, &state->msg_len); } @@ -1319,7 +1340,7 @@ static int parse_mail(struct am_state *state, const char *mail) stripspace(&msg, 0); if (state->signoff) - append_signoff(&msg, 0, 0); + am_signoff(&msg); assert(!state->author_name); state->author_name = strbuf_detach(&author_name, NULL); ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 16:32 ` Junio C Hamano @ 2015-09-05 16:57 ` Junio C Hamano 0 siblings, 0 replies; 30+ messages in thread From: Junio C Hamano @ 2015-09-05 16:57 UTC (permalink / raw) To: Jeff King; +Cc: Johannes Sixt, Linus Torvalds, Paul Tan, Git Mailing List Junio C Hamano <gitster@pobox.com> writes: > And a quick attempt without even compilation testing has flaws as > expected X-<. > > Second attempt. ... and I forget the de-dup logic. The third attempt. builtin/am.c | 32 ++++++++++++++++++++++++++++++-- 1 file changed, 30 insertions(+), 2 deletions(-) diff --git a/builtin/am.c b/builtin/am.c index 83b3d86..30ffdde 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -1207,6 +1207,34 @@ static void NORETURN die_user_resolve(const struct am_state *state) exit(128); } +static void am_signoff(struct strbuf *sb) +{ + char *cp; + struct strbuf mine = STRBUF_INIT; + + /* Does it end with our own sign-off? */ + strbuf_addf(&mine, "%s%s\n", + sign_off_header, + fmt_name(getenv("GIT_COMMITTER_NAME"), + getenv("GIT_COMMITTER_EMAIL"))); + if (mine.len < sb->len && + !strcmp(mine.buf, sb->buf + sb->len - mine.len)) + goto exit; /* no need to duplicate */ + + /* Does it have any Signed-off-by: in the text */ + for (cp = sb->buf; + cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL; + cp = strchr(cp, '\n')) { + if (sb->buf == cp || cp[-1] == '\n') + break; + } + if (!cp) + strbuf_addch(sb, '\n'); + strbuf_addbuf(sb, &mine); +exit: + strbuf_release(&mine); +} + /** * Appends signoff to the "msg" field of the am_state. */ @@ -1215,7 +1243,7 @@ static void am_append_signoff(struct am_state *state) struct strbuf sb = STRBUF_INIT; strbuf_attach(&sb, state->msg, state->msg_len, state->msg_len); - append_signoff(&sb, 0, 0); + am_signoff(&sb); state->msg = strbuf_detach(&sb, &state->msg_len); } @@ -1319,7 +1347,7 @@ static int parse_mail(struct am_state *state, const char *mail) stripspace(&msg, 0); if (state->signoff) - append_signoff(&msg, 0, 0); + am_signoff(&msg); assert(!state->author_name); state->author_name = strbuf_detach(&author_name, NULL); ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 16:10 ` Junio C Hamano 2015-09-05 16:32 ` Junio C Hamano @ 2015-09-05 19:39 ` Junio C Hamano 2015-09-07 19:27 ` Christian Couder 1 sibling, 1 reply; 30+ messages in thread From: Junio C Hamano @ 2015-09-05 19:39 UTC (permalink / raw) To: Christian Couder Cc: Johannes Sixt, Linus Torvalds, Paul Tan, Git Mailing List, Jeff King Junio C Hamano <gitster@pobox.com> writes: > To salvage "interpret-trailers" needs a lot more, as we are > realizing that the definition that led to its external design does > not match the way users use footers in the real world. This affects > the internal data representation and the whole thing needs to be > rethought. Note that I am not saying that you personally did any bad job while working on the interpret-trailers topic. We collectively designed its feature based on a much narrower definition of what the trailer block is than what is used in the real world in practice, so we do not have a good way to locate an existing entry that is not a (key, value), no matter what the syntax to denote which is key and which is value is, insert a new entry that is not a (key, value), or remove an existing entry that is not a (key, value), all of which will be necessary to mutate trailer blocks people use in the real life. I think a good way forward would be to go like this: * a helper function that starts from a flat <buf, len> (or a strbuf) and identifies the end of the body of the message, i.e. find "^---$", "^Conflicts:$", etc. and skip blank lines backwards. That is what ignore_non_trailer() in commit.c does, and that can be shared across everybody that mutates a log message. * a helper function that takes the result of the above as a flat <buf, len> (or a strbuf) and identifies the beginning of a trailer block. That may be just the matter of scanning backwards from the end of the trailer block ignore_non_trailer() identified for the first blank line, as I agree with Linus's "So quite frankly, at that point if git doesn't recognize it as a sign-off block, I don't think it's a big deal" in the thread. Not having that and not calling that function can reintroduce the recent "interpret-trailers corner case" bug Matthieu brought up. With these, everybody except interpret-trailers that mutates a log message can add a new signoff consistently. And then, building on these, "interpret-trailers" can be written like this: (1) starting from a flat <buf, len> (or a strbuf), using the above helpers, identify the parts of the log message that is the trailer block (and you will know what is before and what is after the trailer block). (2) keep the part before the trailer block and the part after the trailer block (this could be empty) in one strbuf each; we do not want to mutate these parts, and it is pointless to split them further into individual lines. (3) express the lines in the trailer block in a richer data structure that is easier to manipulate (i.e. reorder the lines, insert, delete, etc.) and work on it. (4) when manipulation of the trailer block is finished, reconstruct the resulting message by concatenating the "before trailer" part, "trailer" part, and "after trailer" part. As to the exact design of "a richer data structure" and the manipulation we may want on the trailer, I currently do not have a strong "it should be this way" opinion yet, but after looking at various examples Linus gave us in the discussion, my gut feelig is that it would be best to keep the operation simple and generic, e.g. "find a line that matches this regexp and replace it with this line", "insert this line at the end", "delete all lines that match this regexp", etc. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 19:39 ` Junio C Hamano @ 2015-09-07 19:27 ` Christian Couder 0 siblings, 0 replies; 30+ messages in thread From: Christian Couder @ 2015-09-07 19:27 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Sixt, Linus Torvalds, Paul Tan, Git Mailing List, Jeff King On Sat, Sep 5, 2015 at 9:39 PM, Junio C Hamano <gitster@pobox.com> wrote: > Junio C Hamano <gitster@pobox.com> writes: > >> To salvage "interpret-trailers" needs a lot more, as we are >> realizing that the definition that led to its external design does >> not match the way users use footers in the real world. This affects >> the internal data representation and the whole thing needs to be >> rethought. > > Note that I am not saying that you personally did any bad job while > working on the interpret-trailers topic. We collectively designed > its feature based on a much narrower definition of what the trailer > block is than what is used in the real world in practice, so we do > not have a good way to locate an existing entry that is not a (key, > value), no matter what the syntax to denote which is key and which > is value is, insert a new entry that is not a (key, value), or > remove an existing entry that is not a (key, value), all of which > will be necessary to mutate trailer blocks people use in the real > life. > > I think a good way forward would be to go like this: > > * a helper function that starts from a flat <buf, len> (or a > strbuf) and identifies the end of the body of the message, > i.e. find "^---$", "^Conflicts:$", etc. and skip blank lines > backwards. That is what ignore_non_trailer() in commit.c does, > and that can be shared across everybody that mutates a log > message. > > * a helper function that takes the result of the above as a flat > <buf, len> (or a strbuf) and identifies the beginning of a > trailer block. That may be just the matter of scanning backwards > from the end of the trailer block ignore_non_trailer() identified > for the first blank line, as I agree with Linus's "So quite > frankly, at that point if git doesn't recognize it as a sign-off > block, I don't think it's a big deal" in the thread. > > Not having that and not calling that function can reintroduce the > recent "interpret-trailers corner case" bug Matthieu brought up. > > With these, everybody except interpret-trailers that mutates a log > message can add a new signoff consistently. And then, building on > these, "interpret-trailers" can be written like this: > > (1) starting from a flat <buf, len> (or a strbuf), using the above > helpers, identify the parts of the log message that is the > trailer block (and you will know what is before and what is > after the trailer block). > > (2) keep the part before the trailer block and the part after the > trailer block (this could be empty) in one strbuf each; we do > not want to mutate these parts, and it is pointless to split > them further into individual lines. > > (3) express the lines in the trailer block in a richer data > structure that is easier to manipulate (i.e. reorder the lines, > insert, delete, etc.) and work on it. > > (4) when manipulation of the trailer block is finished, reconstruct > the resulting message by concatenating the "before trailer" > part, "trailer" part, and "after trailer" part. > > As to the exact design of "a richer data structure" and the > manipulation we may want on the trailer, I currently do not have a > strong "it should be this way" opinion yet, but after looking at > various examples Linus gave us in the discussion, my gut feelig is > that it would be best to keep the operation simple and generic, > e.g. "find a line that matches this regexp and replace it with this > line", "insert this line at the end", "delete all lines that match > this regexp", etc. I will see what I can do about this. Thanks, Christian. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 7:30 ` Johannes Sixt 2015-09-05 8:03 ` Jeff King @ 2015-09-05 15:24 ` Junio C Hamano 2015-09-05 15:34 ` Junio C Hamano 1 sibling, 1 reply; 30+ messages in thread From: Junio C Hamano @ 2015-09-05 15:24 UTC (permalink / raw) To: Johannes Sixt; +Cc: Linus Torvalds, Paul Tan, Git Mailing List Johannes Sixt <j6t@kdbg.org> writes: > Why do we need a new rule? The old git-am had a logic that pleased > everyone, and it must have been implemented somewhere. Shouldn't it be > sufficient to just re-implement or re-use that logic? If you look at the helper the rewritten "am" calls, you will notice that it is used by other things that wants to know whether a log message has the final "trailer" block, and the reason why the callers want to know is not limited to "I want to add a sign-off". What your "just re-implement" means is "change the world order drastically to other callers who do not want such a special casing for sign-off". That was why I tried to see if a slight tweak to the rule shared by all callers that would be much less likely to break these other callers can satisfy the need of "am". Perhaps we would need to tell has_conforming_footer() function who the caller is, and use a different logic (i.e. "just re-implement") when the caller is append_signoff(). ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 15:24 ` Junio C Hamano @ 2015-09-05 15:34 ` Junio C Hamano 0 siblings, 0 replies; 30+ messages in thread From: Junio C Hamano @ 2015-09-05 15:34 UTC (permalink / raw) To: Johannes Sixt; +Cc: Linus Torvalds, Paul Tan, Git Mailing List Junio C Hamano <gitster@pobox.com> writes: > Johannes Sixt <j6t@kdbg.org> writes: > >> Why do we need a new rule? The old git-am had a logic that pleased >> everyone, and it must have been implemented somewhere. Shouldn't it be >> sufficient to just re-implement or re-use that logic? > ... > Perhaps we would need to tell has_conforming_footer() function who > the caller is, and use a different logic (i.e. "just re-implement") > when the caller is append_signoff(). Ah, I think I misunderstood you. "just re-implement" could be "not to use the existing helper used by other callers and instead do it manually all inside builtin/am.c", and I agree that it would be the safest way to go. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-04 23:47 More builtin git-am issues Linus Torvalds 2015-09-04 23:52 ` Linus Torvalds @ 2015-09-05 1:06 ` Junio C Hamano 2015-09-05 1:20 ` Linus Torvalds 2015-09-06 4:56 ` [PATCH] am: match --signoff to the original scripted version Junio C Hamano 2 siblings, 1 reply; 30+ messages in thread From: Junio C Hamano @ 2015-09-05 1:06 UTC (permalink / raw) To: Linus Torvalds; +Cc: Paul Tan, Git Mailing List Linus Torvalds <torvalds@linux-foundation.org> writes: > but the failing cases have a comment by Andrew: > > [akpm@linux-foundation.org: coding-style fixes] > Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> > Cc: Xishi Qiu <qiuxishi@huawei.com> > Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> > Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> > Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> > Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Mel Gorman <mgorman@techsingularity.net> > Cc: David Rientjes <rientjes@google.com> > Cc: <stable@vger.kernel.org> [4.2.x] > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > > ie that "[akpm@linux-foundation.org: coding-style fixes]" makes git am > now decide that the previous block of text was not a sign-off block, > so it adds an empty line before adding my sign-off. But very obviously > it *was* a sign-off block. Ahh, OK, scratch what I said earlier. The user intended this to be sign-off block, but the convention append_signoff() was taught from very earlier days is that the sign-off block must consist of block of text all of which look like rfc2822 "keyword: value" header lines, and the comment thing makes it a non-conforming header. Perhaps A block of text at the end of the existing text could be a signature block. If all its lines that are rfc2822-like are at its end, then it is a sign-off block. Otherwise it is not. would allow the leading non-signature lines in the above example. If the comment line (which I would say should have been separated by a blank line from the signature block if only to make it easier to read the whole thing) were in the middle, e.g. > Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> > Cc: Xishi Qiu <qiuxishi@huawei.com> > Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Mel Gorman <mgorman@techsingularity.net> > [akpm@linux-foundation.org: coding-style fixes] > Cc: David Rientjes <rientjes@google.com> > Cc: <stable@vger.kernel.org> [4.2.x] > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> that rule would still not think this is a signature block, but at that point, do we really want to consider such a block of text a signature block? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 1:06 ` Junio C Hamano @ 2015-09-05 1:20 ` Linus Torvalds 2015-09-05 1:24 ` Junio C Hamano 0 siblings, 1 reply; 30+ messages in thread From: Linus Torvalds @ 2015-09-05 1:20 UTC (permalink / raw) To: Junio C Hamano; +Cc: Paul Tan, Git Mailing List On Fri, Sep 4, 2015 at 6:06 PM, Junio C Hamano <gitster@pobox.com> wrote: > > that rule would still not think this is a signature block, but at > that point, do we really want to consider such a block of text a > signature block? So exactly why are you arguing for these rules that are known to break in real life, that I gave actual examples for existing, and that I also gave an actual example for not just giving a false negative, but also a false positive? I'm also pretty sure that what you are arguing for is a regression. Now, as mentioned, it may well be true that we've had this odd behavior before, and it's not a real regression - I may just have picked up on this problem because I've been more careful. Maybe I didn't notice these problems before. But looking at the old git-am.sh script, it does simply seem to look for that '^Signed-off-by:' pattern. It did ADD_SIGNOFF=$( test "$LAST_SIGNED_OFF_BY" = "$SIGNOFF" || { test '' = "$LAST_SIGNED_OFF_BY" && echo echo "$SIGNOFF" }) which seems to literally just check the last sign-off line it found. If it matches the new sign-off, it doesn't do anything (and doesn't add the new one either), and if it doesn't exist at all (so it's empty) it adds teh empty line. Quite frankly, that not only worked for a long time, it's simply less ambiguous than your made up rule. It's very simple. "if you find a sign-off in the commit message, don't add an new empty line before the new signoff". Much better than "every line in the last set of lines must match some weak format that isn't even true, and is too non-specific anyway". Linus ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: More builtin git-am issues.. 2015-09-05 1:20 ` Linus Torvalds @ 2015-09-05 1:24 ` Junio C Hamano 0 siblings, 0 replies; 30+ messages in thread From: Junio C Hamano @ 2015-09-05 1:24 UTC (permalink / raw) To: Linus Torvalds; +Cc: Paul Tan, Git Mailing List Linus Torvalds <torvalds@linux-foundation.org> writes: > On Fri, Sep 4, 2015 at 6:06 PM, Junio C Hamano <gitster@pobox.com> wrote: >> >> that rule would still not think this is a signature block, but at >> that point, do we really want to consider such a block of text a >> signature block? > > So exactly why are you arguing for these rules that are known to break > in real life, that I gave actual examples for existing,... Our mails crossed. I was working from the [akpm] comment at the very beginning while you were giving more examples that inserts comments in the middle which I didn't see. ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH] am: match --signoff to the original scripted version 2015-09-04 23:47 More builtin git-am issues Linus Torvalds 2015-09-04 23:52 ` Linus Torvalds 2015-09-05 1:06 ` Junio C Hamano @ 2015-09-06 4:56 ` Junio C Hamano 2015-09-06 9:04 ` Paul Tan ` (3 more replies) 2 siblings, 4 replies; 30+ messages in thread From: Junio C Hamano @ 2015-09-06 4:56 UTC (permalink / raw) To: Linus Torvalds; +Cc: Paul Tan, Git Mailing List Linus noticed that the recently reimplementated "git am -s" defines the trailer block too rigidly, resulting an unnecessary blank line between the existing sign-offs and his new sign-off. An e-mail submission sent to Linus in real life ends with mixture of sign-offs and commentaries, e.g. title here message here Signed-off-by: Original Author <original@auth.or> [rv: tweaked frotz and nitfol] Signed-off-by: Re Viewer <rv@ew.er> Signed-off-by: Other Reviewer <other@rev.ewer> --- patch here Because the reimplementation reused append_signoff() helper that is used by other codepaths, which is unaware that people intermix such comments with their sign-offs in the trailer block, such a message was judged to end with a non-trailer, resulting in an extra blank before adding a new sign-off. The original scripted version of "git am" used a lot looser definition, i.e. "if and only if there is no line that begins with Signed-off-by:, add a blank line before adding a new sign-off". For the upcoming release, stop using the append_signoff() in "git am" and reimplement the looser definition used by the scripted version to use only in "git am" to fix this regression in "am" while avoiding new regressions to other users of append_signoff(). In the longer term, we should look into loosening append_signoff() so that other codepaths that add a new sign-off behave the same way as "git am -s", but that is a task for post-release. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- * So this is essentially that "third try", but with a bit of test to cast in stone that we expect things that are not "key: value" in the trailer block. Let's have this (or a fix along this line) as a regression fix for the upcoming release. builtin/am.c | 31 +++++++++++++++++++++++++++++-- t/t4150-am.sh | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 77 insertions(+), 2 deletions(-) diff --git a/builtin/am.c b/builtin/am.c index 634f7a7..e7828e5 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -1191,6 +1191,33 @@ static void NORETURN die_user_resolve(const struct am_state *state) exit(128); } +static void am_signoff(struct strbuf *sb) +{ + char *cp; + struct strbuf mine = STRBUF_INIT; + + /* Does it end with our own sign-off? */ + strbuf_addf(&mine, "\n%s%s\n", + sign_off_header, + fmt_name(getenv("GIT_COMMITTER_NAME"), + getenv("GIT_COMMITTER_EMAIL"))); + if (mine.len < sb->len && + !strcmp(mine.buf, sb->buf + sb->len - mine.len)) + goto exit; /* no need to duplicate */ + + /* Does it have any Signed-off-by: in the text */ + for (cp = sb->buf; + cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL; + cp = strchr(cp, '\n')) { + if (sb->buf == cp || cp[-1] == '\n') + break; + } + + strbuf_addstr(sb, mine.buf + !!cp); +exit: + strbuf_release(&mine); +} + /** * Appends signoff to the "msg" field of the am_state. */ @@ -1199,7 +1226,7 @@ static void am_append_signoff(struct am_state *state) struct strbuf sb = STRBUF_INIT; strbuf_attach(&sb, state->msg, state->msg_len, state->msg_len); - append_signoff(&sb, 0, 0); + am_signoff(&sb); state->msg = strbuf_detach(&sb, &state->msg_len); } @@ -1303,7 +1330,7 @@ static int parse_mail(struct am_state *state, const char *mail) stripspace(&msg, 0); if (state->signoff) - append_signoff(&msg, 0, 0); + am_signoff(&msg); assert(!state->author_name); state->author_name = strbuf_detach(&author_name, NULL); diff --git a/t/t4150-am.sh b/t/t4150-am.sh index dd627c4..5e48555 100755 --- a/t/t4150-am.sh +++ b/t/t4150-am.sh @@ -873,4 +873,52 @@ test_expect_success 'am --message-id -s signs off after the message id' ' test_cmp expected actual ' +test_expect_success 'am -s unexpected trailer block' ' + rm -fr .git/rebase-apply && + git reset --hard && + echo signed >file && + git add file && + cat >msg <<-EOF && + subject here + + Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> + [jc: tweaked log message] + Signed-off-by: J C H <j@c.h> + EOF + git commit -F msg && + git cat-file commit HEAD | sed -e '1,/^$/d' >original && + git format-patch --stdout -1 >patch && + + git reset --hard HEAD^ && + git am -s patch && + ( + cat original && + echo "Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>" + ) >expect && + git cat-file commit HEAD | sed -e '1,/^$/d' >actual && + test_cmp expect actual && + + cat >msg <<-\EOF && + subject here + + We make sure that there is a blank line between the log + message proper and Signed-off-by: line added. + EOF + git reset HEAD^ && + git commit -F msg file && + git cat-file commit HEAD | sed -e '1,/^$/d' >original && + git format-patch --stdout -1 >patch && + + git reset --hard HEAD^ && + git am -s patch && + + ( + cat original && + echo && + echo "Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>" + ) >expect && + git cat-file commit HEAD | sed -e '1,/^$/d' >actual && + test_cmp expect actual +' + test_done -- 2.6.0-rc0-130-gfaaaedc ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH] am: match --signoff to the original scripted version 2015-09-06 4:56 ` [PATCH] am: match --signoff to the original scripted version Junio C Hamano @ 2015-09-06 9:04 ` Paul Tan 2015-09-06 17:24 ` Junio C Hamano 2015-09-06 14:21 ` Paul Tan ` (2 subsequent siblings) 3 siblings, 1 reply; 30+ messages in thread From: Paul Tan @ 2015-09-06 9:04 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List Hi, Thanks for handling this. On Sun, Sep 6, 2015 at 12:56 PM, Junio C Hamano <gitster@pobox.com> wrote: > Linus noticed that the recently reimplementated "git am -s" defines s/reimplementated/reimplemented/ ? > the trailer block too rigidly, resulting an unnecessary blank line s/resulting an/resulting in an/ ? > between the existing sign-offs and his new sign-off. An e-mail > submission sent to Linus in real life ends with mixture of sign-offs > and commentaries, e.g. > > title here > > message here > > Signed-off-by: Original Author <original@auth.or> > [rv: tweaked frotz and nitfol] > Signed-off-by: Re Viewer <rv@ew.er> > Signed-off-by: Other Reviewer <other@rev.ewer> > --- > patch here > > Because the reimplementation reused append_signoff() helper that is > used by other codepaths, which is unaware that people intermix such > comments with their sign-offs in the trailer block, such a message > was judged to end with a non-trailer, resulting in an extra blank s/extra blank/extra blank line/ ? > before adding a new sign-off. > > The original scripted version of "git am" used a lot looser > definition, i.e. "if and only if there is no line that begins with > Signed-off-by:, add a blank line before adding a new sign-off". For > the upcoming release, stop using the append_signoff() in "git am" > and reimplement the looser definition used by the scripted version > to use only in "git am" to fix this regression in "am" while > avoiding new regressions to other users of append_signoff(). > > In the longer term, we should look into loosening append_signoff() > so that other codepaths that add a new sign-off behave the same way > as "git am -s", but that is a task for post-release. > > Reported-by: Linus Torvalds <torvalds@linux-foundation.org> > Signed-off-by: Junio C Hamano <gitster@pobox.com> > --- > builtin/am.c | 31 +++++++++++++++++++++++++++++-- > t/t4150-am.sh | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 77 insertions(+), 2 deletions(-) > > diff --git a/builtin/am.c b/builtin/am.c > index 634f7a7..e7828e5 100644 > --- a/builtin/am.c > +++ b/builtin/am.c > @@ -1191,6 +1191,33 @@ static void NORETURN die_user_resolve(const struct am_state *state) > exit(128); > } > > +static void am_signoff(struct strbuf *sb) > +{ Hmm, okay, but now we have two similarly named functions am_signoff() and am_append_signoff() which both do nearly similar things, the only difference being am_signoff() operates on a strbuf while am_append_signoff() operates on the "msg" char* field in the am_state, which seems a bit iffy to me. I wonder if the logic could be implemented in am_append_signoff() instead so we have only one function? > + char *cp; > + struct strbuf mine = STRBUF_INIT; > + > + /* Does it end with our own sign-off? */ > + strbuf_addf(&mine, "\n%s%s\n", > + sign_off_header, > + fmt_name(getenv("GIT_COMMITTER_NAME"), > + getenv("GIT_COMMITTER_EMAIL"))); Maybe use git_committer_info() here? > + if (mine.len < sb->len && > + !strcmp(mine.buf, sb->buf + sb->len - mine.len)) Perhaps use ends_with()? > + goto exit; /* no need to duplicate */ > + > + /* Does it have any Signed-off-by: in the text */ > + for (cp = sb->buf; > + cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL; > + cp = strchr(cp, '\n')) { > + if (sb->buf == cp || cp[-1] == '\n') > + break; > + } > + > + strbuf_addstr(sb, mine.buf + !!cp); > +exit: > + strbuf_release(&mine); > +} > + > /** > * Appends signoff to the "msg" field of the am_state. > */ > @@ -1199,7 +1226,7 @@ static void am_append_signoff(struct am_state *state) > struct strbuf sb = STRBUF_INIT; > > strbuf_attach(&sb, state->msg, state->msg_len, state->msg_len); > - append_signoff(&sb, 0, 0); > + am_signoff(&sb); > state->msg = strbuf_detach(&sb, &state->msg_len); > } > > @@ -1303,7 +1330,7 @@ static int parse_mail(struct am_state *state, const char *mail) > stripspace(&msg, 0); > > if (state->signoff) > - append_signoff(&msg, 0, 0); > + am_signoff(&msg); > > assert(!state->author_name); > state->author_name = strbuf_detach(&author_name, NULL); Thanks again, Paul ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] am: match --signoff to the original scripted version 2015-09-06 9:04 ` Paul Tan @ 2015-09-06 17:24 ` Junio C Hamano 2015-09-08 6:18 ` Jeff King 0 siblings, 1 reply; 30+ messages in thread From: Junio C Hamano @ 2015-09-06 17:24 UTC (permalink / raw) To: Paul Tan; +Cc: Linus Torvalds, Git Mailing List Paul Tan <pyokagan@gmail.com> writes: > s/reimplementated/reimplemented/ ? > s/resulting an/resulting in an/ ? > s/extra blank/extra blank line/ ? Thanks. >> +static void am_signoff(struct strbuf *sb) >> +{ > > Hmm, okay, but now we have two similarly named functions am_signoff() > and am_append_signoff() which both do nearly similar things, the only > difference being am_signoff() operates on a strbuf while > am_append_signoff() operates on the "msg" char* field in the am_state, > which seems a bit iffy to me. I do recall advising you not to overuse strbuf in am_state, and specifically not to use strbuf for a field like author_name, and the criteria to tell why a field should not be a strbuf in am_state is if the code used its strbuf-ness only because it is initially read with strbuf_read_file(), but afterwards it is necessary to use the field as a strbuf because the field is not modified later and is not passed to a helper function that takes a strbuf. I think the final code ended up following that piece of advice a bit too aggressively. The <msg, msg_len> pair in am_state did need to be modified with sign-off, and it was done by passing it as a strbuf to append_signoff() in the code we are fixing here; keeping msg field that you wrote as a strbuf in the original would have made am_append_signoff() unnecessary. But this patch you are commenting on is purely for regression fix, and I didn't want to change other things like data representation at the same time. >> + /* Does it end with our own sign-off? */ >> + strbuf_addf(&mine, "\n%s%s\n", >> + sign_off_header, >> + fmt_name(getenv("GIT_COMMITTER_NAME"), >> + getenv("GIT_COMMITTER_EMAIL"))); > > Maybe use git_committer_info() here? Perhaps, but I wanted to make sure I am doing the same thing as the codepath of sequencer.c::append_signoff(), which the original ended up calling. git_committer_info() does way more than that, no? >> + if (mine.len < sb->len && >> + !strcmp(mine.buf, sb->buf + sb->len - mine.len)) > > Perhaps use ends_with()? This caller already _knows_ how long the sb->buf string is; it is pointless to have ends_with() run strlen() on it. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] am: match --signoff to the original scripted version 2015-09-06 17:24 ` Junio C Hamano @ 2015-09-08 6:18 ` Jeff King 0 siblings, 0 replies; 30+ messages in thread From: Jeff King @ 2015-09-08 6:18 UTC (permalink / raw) To: Junio C Hamano; +Cc: Paul Tan, Linus Torvalds, Git Mailing List On Sun, Sep 06, 2015 at 10:24:12AM -0700, Junio C Hamano wrote: > >> + /* Does it end with our own sign-off? */ > >> + strbuf_addf(&mine, "\n%s%s\n", > >> + sign_off_header, > >> + fmt_name(getenv("GIT_COMMITTER_NAME"), > >> + getenv("GIT_COMMITTER_EMAIL"))); > > > > Maybe use git_committer_info() here? > > Perhaps, but I wanted to make sure I am doing the same thing as the > codepath of sequencer.c::append_signoff(), which the original ended > up calling. git_committer_info() does way more than that, no? Not really. I think git_committer_info(IDENT_STRICT | IDENT_NO_DATE) runs the exact same code, with one exception: we would also set the ident_explicitly_given variables. But nobody in builtin/am.c calls committer_ident_sufficiently_given(). And if they did, I think the change would be an improvement. > >> + if (mine.len < sb->len && > >> + !strcmp(mine.buf, sb->buf + sb->len - mine.len)) > > > > Perhaps use ends_with()? > > This caller already _knows_ how long the sb->buf string is; it is > pointless to have ends_with() run strlen() on it. That actually goes double. We know sb.len. The ends_with() function is built around strip_suffix(), which both operate on strings. But we do not have ends_with_mem() built around strip_suffix_mem(). But we also know mine.len. Even strip_suffix_mem() assumes the suffix itself is a string. So what you really want is: int strip_suffix_mem_mem(const char *buf, size_t *len, const char *suffix, size_t suffix_len); and then you can trivially build the existing strip_suffix_mem() around it, build strip_suffix() around that, and then build ends_with(), ends_with_mem() and ends_with_mem_mem() around those. And don't forget strbuf_ends_with(), strbuf_ends_with_mem(), and strbuf_ends_with_strbuf() :) I am only half tongue in cheek. The proliferation of names is tedious (and not appropriate for an -rc regression fix), but I do think the resulting code is a lot more obvious as: if (strbuf_ends_with_strbuf(&sb, &mine)) ... or even: if (ends_with_mem_mem(sb->buf, sb->len, mine.buf, mine.len)) ... Of course given that this is run only once per program, and that these _are_ in fact strings, we can probably not bother to optimize it and just accept: if (ends_with(sb->buf, mine.buf)) ... But if you want to go all-out on optimization, I think you can replace your strcmp with memcmp: if (mine.len < sb->len && !memcmp(mine.buf, sb->buf + sb->len - mine.len, mine.len)) (assuming that memcmp is in fact faster than strcmp). -Peff ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] am: match --signoff to the original scripted version 2015-09-06 4:56 ` [PATCH] am: match --signoff to the original scripted version Junio C Hamano 2015-09-06 9:04 ` Paul Tan @ 2015-09-06 14:21 ` Paul Tan 2015-09-06 17:39 ` Junio C Hamano 2015-09-06 16:16 ` Linus Torvalds 2015-09-08 6:25 ` Jeff King 3 siblings, 1 reply; 30+ messages in thread From: Paul Tan @ 2015-09-06 14:21 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List On Sun, Sep 6, 2015 at 12:56 PM, Junio C Hamano <gitster@pobox.com> wrote: > diff --git a/builtin/am.c b/builtin/am.c > index 634f7a7..e7828e5 100644 > --- a/builtin/am.c > +++ b/builtin/am.c > @@ -1191,6 +1191,33 @@ static void NORETURN die_user_resolve(const struct am_state *state) > exit(128); > } > > +static void am_signoff(struct strbuf *sb) > +{ > + char *cp; > + struct strbuf mine = STRBUF_INIT; > + > + /* Does it end with our own sign-off? */ > + strbuf_addf(&mine, "\n%s%s\n", > + sign_off_header, > + fmt_name(getenv("GIT_COMMITTER_NAME"), > + getenv("GIT_COMMITTER_EMAIL"))); > + if (mine.len < sb->len && > + !strcmp(mine.buf, sb->buf + sb->len - mine.len)) > + goto exit; /* no need to duplicate */ > + > + /* Does it have any Signed-off-by: in the text */ > + for (cp = sb->buf; > + cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL; > + cp = strchr(cp, '\n')) { > + if (sb->buf == cp || cp[-1] == '\n') > + break; > + } > + > + strbuf_addstr(sb, mine.buf + !!cp); To add on, I wonder if the above "add a blank line if there is no Signed-off-by: at the beginning of a line" logic could be expressed more succinctly like this: if (!starts_with(sb->buf, "Signed-off-by: ") && !strstr(sb->buf, "\nSigned-off-by: ")) strbuf_addch(sb, '\n'); strbuf_addstr(sb, mine.buf + 1); > +exit: > + strbuf_release(&mine); > +} > + > /** > * Appends signoff to the "msg" field of the am_state. > */ Regards, Paul ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] am: match --signoff to the original scripted version 2015-09-06 14:21 ` Paul Tan @ 2015-09-06 17:39 ` Junio C Hamano 0 siblings, 0 replies; 30+ messages in thread From: Junio C Hamano @ 2015-09-06 17:39 UTC (permalink / raw) To: Paul Tan; +Cc: Linus Torvalds, Git Mailing List Paul Tan <pyokagan@gmail.com> writes: >> + /* Does it have any Signed-off-by: in the text */ >> + for (cp = sb->buf; >> + cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL; >> + cp = strchr(cp, '\n')) { >> + if (sb->buf == cp || cp[-1] == '\n') >> + break; >> + } >> + >> + strbuf_addstr(sb, mine.buf + !!cp); > > To add on, I wonder if the above "add a blank line if there is no > Signed-off-by: at the beginning of a line" logic could be expressed > more succinctly like this: > > if (!starts_with(sb->buf, "Signed-off-by: ") && > !strstr(sb->buf, "\nSigned-off-by: ")) > strbuf_addch(sb, '\n'); > > strbuf_addstr(sb, mine.buf + 1); That may be more obvious, but I wanted to reuse the existing sign_off_header[]; there is no variant that has a leading "end of previous line". I actually think it is not an uncommon thing to ask "Give me the first occurrence of the string NEEDLE that appears at the beginning of lines in BUF", so the right way to address the "is it more readable" question would probably be to have a helper function to do just that, and use it here and other places that answer that question by themselves due to lack of such a helper. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] am: match --signoff to the original scripted version 2015-09-06 4:56 ` [PATCH] am: match --signoff to the original scripted version Junio C Hamano 2015-09-06 9:04 ` Paul Tan 2015-09-06 14:21 ` Paul Tan @ 2015-09-06 16:16 ` Linus Torvalds 2015-09-08 6:25 ` Jeff King 3 siblings, 0 replies; 30+ messages in thread From: Linus Torvalds @ 2015-09-06 16:16 UTC (permalink / raw) To: Junio C Hamano; +Cc: Paul Tan, Git Mailing List On Sat, Sep 5, 2015 at 9:56 PM, Junio C Hamano <gitster@pobox.com> wrote: > > For > the upcoming release, stop using the append_signoff() in "git am" > and reimplement the looser definition used by the scripted version > to use only in "git am" to fix this regression in "am" while > avoiding new regressions to other users of append_signoff(). I've tested this by re-doing the same patch-bomb from Andrew that I had problems with originally, and it WorksForMe(tm). Thanks, Linus ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] am: match --signoff to the original scripted version 2015-09-06 4:56 ` [PATCH] am: match --signoff to the original scripted version Junio C Hamano ` (2 preceding siblings ...) 2015-09-06 16:16 ` Linus Torvalds @ 2015-09-08 6:25 ` Jeff King 2015-09-08 18:14 ` Junio C Hamano 3 siblings, 1 reply; 30+ messages in thread From: Jeff King @ 2015-09-08 6:25 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Paul Tan, Git Mailing List On Sat, Sep 05, 2015 at 09:56:27PM -0700, Junio C Hamano wrote: > +static void am_signoff(struct strbuf *sb) > +{ > + char *cp; > + struct strbuf mine = STRBUF_INIT; > + > + /* Does it end with our own sign-off? */ > + strbuf_addf(&mine, "\n%s%s\n", > + sign_off_header, > + fmt_name(getenv("GIT_COMMITTER_NAME"), > + getenv("GIT_COMMITTER_EMAIL"))); > + if (mine.len < sb->len && > + !strcmp(mine.buf, sb->buf + sb->len - mine.len)) > + goto exit; /* no need to duplicate */ Here you insert the "\n" directly at the start of "mine", so the test "does it contain S-o-b at the beginning of a line" does not count the first line. Probably fine, as somebody putting a S-o-b in their subject deserves whatever they get. But... > + /* Does it have any Signed-off-by: in the text */ > + for (cp = sb->buf; > + cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL; > + cp = strchr(cp, '\n')) { > + if (sb->buf == cp || cp[-1] == '\n') > + break; > + } Here you are more careful about finding S-o-b at sb->buf. I don't think it really matters in practice, but it's an interesting inconsistency. Other than that (and I do not think it is worth re-rolling for this; just an interesting observation), the patch looks OK to me. -Peff ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] am: match --signoff to the original scripted version 2015-09-08 6:25 ` Jeff King @ 2015-09-08 18:14 ` Junio C Hamano 0 siblings, 0 replies; 30+ messages in thread From: Junio C Hamano @ 2015-09-08 18:14 UTC (permalink / raw) To: Jeff King; +Cc: Linus Torvalds, Paul Tan, Git Mailing List Jeff King <peff@peff.net> writes: > On Sat, Sep 05, 2015 at 09:56:27PM -0700, Junio C Hamano wrote: > >> +static void am_signoff(struct strbuf *sb) >> +{ >> + char *cp; >> + struct strbuf mine = STRBUF_INIT; >> + >> + /* Does it end with our own sign-off? */ >> + strbuf_addf(&mine, "\n%s%s\n", >> + sign_off_header, >> + fmt_name(getenv("GIT_COMMITTER_NAME"), >> + getenv("GIT_COMMITTER_EMAIL"))); >> + if (mine.len < sb->len && >> + !strcmp(mine.buf, sb->buf + sb->len - mine.len)) >> + goto exit; /* no need to duplicate */ > > Here you insert the "\n" directly at the start of "mine", so the test > "does it contain S-o-b at the beginning of a line" does not count the > first line. Probably fine, as somebody putting a S-o-b in their subject > deserves whatever they get. Yup. >> + /* Does it have any Signed-off-by: in the text */ >> + for (cp = sb->buf; >> + cp && *cp && (cp = strstr(cp, sign_off_header)) != NULL; >> + cp = strchr(cp, '\n')) { >> + if (sb->buf == cp || cp[-1] == '\n') >> + break; >> + } > > Here you are more careful about finding S-o-b at sb->buf. It is not being careful for that, actually. It just is avoiding to access sb->buf[-1], which would be a realproblem. "The beginning of buffer is also at the beginning of a line" is merely a happy side effect ;-). ^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2015-09-08 18:14 UTC | newest] Thread overview: 30+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-09-04 23:47 More builtin git-am issues Linus Torvalds 2015-09-04 23:52 ` Linus Torvalds 2015-09-05 0:07 ` Jeff King 2015-09-05 0:36 ` Linus Torvalds 2015-09-05 0:54 ` Junio C Hamano 2015-09-05 1:06 ` Linus Torvalds 2015-09-05 1:23 ` Junio C Hamano 2015-09-05 1:35 ` Linus Torvalds 2015-09-05 16:01 ` Junio C Hamano 2015-09-05 7:30 ` Johannes Sixt 2015-09-05 8:03 ` Jeff King 2015-09-05 16:10 ` Junio C Hamano 2015-09-05 16:32 ` Junio C Hamano 2015-09-05 16:57 ` Junio C Hamano 2015-09-05 19:39 ` Junio C Hamano 2015-09-07 19:27 ` Christian Couder 2015-09-05 15:24 ` Junio C Hamano 2015-09-05 15:34 ` Junio C Hamano 2015-09-05 1:06 ` Junio C Hamano 2015-09-05 1:20 ` Linus Torvalds 2015-09-05 1:24 ` Junio C Hamano 2015-09-06 4:56 ` [PATCH] am: match --signoff to the original scripted version Junio C Hamano 2015-09-06 9:04 ` Paul Tan 2015-09-06 17:24 ` Junio C Hamano 2015-09-08 6:18 ` Jeff King 2015-09-06 14:21 ` Paul Tan 2015-09-06 17:39 ` Junio C Hamano 2015-09-06 16:16 ` Linus Torvalds 2015-09-08 6:25 ` Jeff King 2015-09-08 18:14 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).