* Re: git mailinfo strips important context from patch subjects
2009-06-28 23:04 ` Junio C Hamano
@ 2009-06-29 9:53 ` Andreas Ericsson
2009-06-29 9:55 ` [PATCH] mailinfo: Remove only one set of square brackets Andreas Ericsson
2009-06-29 21:17 ` [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject Roger Leigh
` (2 subsequent siblings)
3 siblings, 1 reply; 21+ messages in thread
From: Andreas Ericsson @ 2009-06-29 9:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, Roger Leigh, git
Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
>> On Sun, Jun 28, 2009 at 08:38:58PM +0100, Roger Leigh wrote:
>>
>>> In most of the projects I work on, the git commit message has
>>> the affected subsystem or component in square brackets, such as
>>>
>>> [foo] change bar to baz
>>>
>>> [...]
>>>
>>> The [sbuild] prefix has been dropped from the Subject, so an
>>> important bit of context about the patch has been lost.
>>>
>>> It's a bit of a bug that you can't round trip from a git-format-patch
>>> to import with git-am and then not be able to produce the exact same
>>> patch set with git-format-patch again (assuming preparing and applying
>>> to the same point, of course).
>> As an immediate solution, you probably want to use "-k" when generating
>> the patch (not to add the [PATCH] munging) and "-k" when reading the
>> patch via "git am" (which will avoid trying to strip any munging).
>>
>> However:
>>
>>> Would it be possible to change the git-mailinfo logic to use a less
>>> greedy pattern match so it leaves everything after
>>> ([PATCH( [0-9/])+])+ in the subject? AFAICT this is cleanup_subject in
>>> builtin-mailinfo.c? Could this rather complex function not just do a
>>> simple regex match which can also take care of stripping ([Rr]e:) ?
>> Yes, I think in the long run it makes sense to strip just the _first_
>> set of brackets. I don't think we want to be more specific than that in
>> the match, because we allow arbitrary cruft inside the brackets (like
>> "[RFC/PATCH]", etc). But if format-patch always puts exactly one set of
>> brackets, and am strips exactly one set, then that should retain your
>> subject in practice, even if it starts with [foo].
>
> I think it may still make sense to insist that PATCH appears somewhere in
> the first set of brackets, but I have stop and wonder if it is even
> necessary.
>
> Because git removes [sbuild] at the beginning, Roger is unhappy.
>
[ and a lot more ]
>
> _An_ established (note that I did not say _the_ nor _best current_)
> practice supported well by git to note the area being affected in a
> project of nontrivial size is to prefix the single line summary with the
> name of the area followed by a colon. There is no difference between
> "[sbuild] foo" and "sbuild: foo" at the information content point-of-view,
> but the latter has an advantage of being one letter shorter and less
> distracting in MUA. He does not have a very strong reason to choose
> something different only to make his life harder, does he?
>
True, but it seems wrong to have am remove more of the subject than
format-patch prepends. Imagine a commit subject looking like this:
"Allow [ and ] in the blurble.foostuff table".
Should am strip the subject all the way up to the last ']'? I think
not, and I'd be very vexed if it did.
> Users can take advantage of this established practice when running
> shortlog with "--grep=^area:" to limit the birds-eye-view to a specific
> area. If this turns out to be useful, we could even add an option to "git
> log --area=name" that limits this kind of match to the first paragraph of
> the commit log message, for example.
>
> Supporting a slightly different convention may seem to be accomodating and
> nice, but if there is no real technical difference between the two (and
> again, "area:" is one letter shorter ;-), letting people run with
> different convention longer, when they can switch easily to another
> convention that is already well supported, may actually hurt them in the
> long run. "[sbuild]" will not match "--area=sbuild" that will internally
> become "--grep-only-first-line=sbuild:" so either he will miss out
> benefiting from the new feature, or the implementation of the new feature
> unnecessarily needs more code.
>
> It is not about discouraging a wrong workflow or practice, because there
> is nothing _wrong_ per-se in [sbuild] prefix. It is just that it makes
> things harder in the long run. In this particular case, it is only very
> slightly harder, but these things tend to add up from different fronts.
Agreed, but there are valid use-cases orthogonal to subsystem naming to
place [] in the patch subject. I still feel that since format-patch only
adds one set, am (mailinfo) should really only remove one set, too. It's
what makes sense, really.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH] mailinfo: Remove only one set of square brackets
2009-06-29 9:53 ` Andreas Ericsson
@ 2009-06-29 9:55 ` Andreas Ericsson
2009-06-29 16:09 ` Junio C Hamano
2009-06-30 5:33 ` Jeff King
0 siblings, 2 replies; 21+ messages in thread
From: Andreas Ericsson @ 2009-06-29 9:55 UTC (permalink / raw)
To: gitster; +Cc: git, Andreas Ericsson
git-format-patch prepends patches with a [PATCH x/n] prefix, but
mailinfo used to remove any number of square-bracket pairs and
the content between them. This prevents one from using a commit
subject like this:
[ and ] must be allowed as input
Removing the square bracket pair from this rather clumsily
constructed subject line loses important information, so we must
take care not to.
This patch causes the subject stripping to stop after it has
encountered one pair of square brackets.
One possible downside of this patch is that the patch-handling
programs will now fail at removing author-added square-brackets
to be removed, such as
[RFC][PATCH x/n]
However, since format-patch only adds one set of square brackets,
this behaviour is quite easily undesrstood and defended while the
previous behaviour is not.
Signed-off-by: Andreas Ericsson <ae@op5.se>
---
builtin-mailinfo.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index 92637ac..fb5ad70 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -221,6 +221,8 @@ static void cleanup_subject(struct strbuf *subject)
{
char *pos;
size_t remove;
+ int brackets_removed = 0;
+
while (subject->len) {
switch (*subject->buf) {
case 'r': case 'R':
@@ -235,10 +237,15 @@ static void cleanup_subject(struct strbuf *subject)
strbuf_remove(subject, 0, 1);
continue;
case '[':
+ /* remove only one set of square brackets */
+ if (brackets_removed)
+ break;
+
if ((pos = strchr(subject->buf, ']'))) {
remove = pos - subject->buf;
if (remove <= (subject->len - remove) * 2) {
strbuf_remove(subject, 0, remove + 1);
+ brackets_removed = 1;
continue;
}
} else
--
1.6.3.3.354.gfb24
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] mailinfo: Remove only one set of square brackets
2009-06-29 9:55 ` [PATCH] mailinfo: Remove only one set of square brackets Andreas Ericsson
@ 2009-06-29 16:09 ` Junio C Hamano
2009-06-30 5:33 ` Jeff King
1 sibling, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2009-06-29 16:09 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: git
Andreas Ericsson <ae@op5.se> writes:
> git-format-patch prepends patches with a [PATCH x/n] prefix, but
> mailinfo used to remove any number of square-bracket pairs and
> the content between them. This prevents one from using a commit
> subject like this:
>
> [ and ] must be allowed as input
>
> Removing the square bracket pair from this rather clumsily
> constructed subject line loses important information, so we must
> take care not to.
>
> This patch causes the subject stripping to stop after it has
> encountered one pair of square brackets.
>
> One possible downside of this patch is that the patch-handling
> programs will now fail at removing author-added square-brackets
> to be removed, such as
>
> [RFC][PATCH x/n]
>
> However, since format-patch only adds one set of square brackets,
> this behaviour is quite easily undesrstood and defended while the
> previous behaviour is not.
>
> Signed-off-by: Andreas Ericsson <ae@op5.se>
> ---
All good points, and I like this one, including its Subject: line.
> builtin-mailinfo.c | 7 +++++++
> 1 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
> index 92637ac..fb5ad70 100644
> --- a/builtin-mailinfo.c
> +++ b/builtin-mailinfo.c
> @@ -221,6 +221,8 @@ static void cleanup_subject(struct strbuf *subject)
> {
> char *pos;
> size_t remove;
> + int brackets_removed = 0;
> +
> while (subject->len) {
> switch (*subject->buf) {
> case 'r': case 'R':
> @@ -235,10 +237,15 @@ static void cleanup_subject(struct strbuf *subject)
> strbuf_remove(subject, 0, 1);
> continue;
> case '[':
> + /* remove only one set of square brackets */
> + if (brackets_removed)
> + break;
> +
> if ((pos = strchr(subject->buf, ']'))) {
> remove = pos - subject->buf;
> if (remove <= (subject->len - remove) * 2) {
> strbuf_remove(subject, 0, remove + 1);
> + brackets_removed = 1;
> continue;
> }
> } else
> --
> 1.6.3.3.354.gfb24
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] mailinfo: Remove only one set of square brackets
2009-06-29 9:55 ` [PATCH] mailinfo: Remove only one set of square brackets Andreas Ericsson
2009-06-29 16:09 ` Junio C Hamano
@ 2009-06-30 5:33 ` Jeff King
1 sibling, 0 replies; 21+ messages in thread
From: Jeff King @ 2009-06-30 5:33 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: gitster, git
On Mon, Jun 29, 2009 at 11:55:51AM +0200, Andreas Ericsson wrote:
> git-format-patch prepends patches with a [PATCH x/n] prefix, but
> mailinfo used to remove any number of square-bracket pairs and
> the content between them. This prevents one from using a commit
> subject like this:
>
> [ and ] must be allowed as input
>
> Removing the square bracket pair from this rather clumsily
> constructed subject line loses important information, so we must
> take care not to.
>
> This patch causes the subject stripping to stop after it has
> encountered one pair of square brackets.
I think this is a definite improvement, though I would be much more
convinced that the does the right thing if there were some tests. :)
> One possible downside of this patch is that the patch-handling
> programs will now fail at removing author-added square-brackets
> to be removed, such as
>
> [RFC][PATCH x/n]
>
> However, since format-patch only adds one set of square brackets,
> this behaviour is quite easily undesrstood and defended while the
> previous behaviour is not.
Agreed. And I think Junio raised a good point elsewhere: there are
certain formatting conventions that are part of format-patch output. So
I think we do need to address "this subject munging is totally idiot
proof and will always reproduce the input patch text exactly". But
rather "is this a sane and useful way to do the munging?". And I think
it is a useful convention.
This is a user-visible change that might impact people's workflows (if
only slightly), though, so it should probably get a good mention in the
release notes.
-Peff
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject
2009-06-28 23:04 ` Junio C Hamano
2009-06-29 9:53 ` Andreas Ericsson
@ 2009-06-29 21:17 ` Roger Leigh
2009-06-29 21:26 ` Jakub Narebski
2009-09-22 10:39 ` Neil Roberts
2009-06-29 21:34 ` [PATCH 2/2] builtin-mailinfo.c: Free regular expression after use Roger Leigh
2009-06-29 21:36 ` git mailinfo strips important context from patch subjects Roger Leigh
3 siblings, 2 replies; 21+ messages in thread
From: Roger Leigh @ 2009-06-29 21:17 UTC (permalink / raw)
To: git; +Cc: Roger Leigh
Use a regular expression to match text after "Re:" or any text in the
first pair of square brackets such as "[PATCH n/m]". This replaces
the complex hairy string munging with a simple single pattern match.
Signed-off-by: Roger Leigh <rleigh@debian.org>
---
builtin-mailinfo.c | 61 +++++++++++++++++++++++++++++-----------------------
1 files changed, 34 insertions(+), 27 deletions(-)
diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index 92637ac..6d19046 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -219,35 +219,42 @@ static int is_multipart_boundary(const struct strbuf *line)
static void cleanup_subject(struct strbuf *subject)
{
- char *pos;
- size_t remove;
- while (subject->len) {
- switch (*subject->buf) {
- case 'r': case 'R':
- if (subject->len <= 3)
- break;
- if (!memcmp(subject->buf + 1, "e:", 2)) {
- strbuf_remove(subject, 0, 3);
- continue;
- }
- break;
- case ' ': case '\t': case ':':
- strbuf_remove(subject, 0, 1);
- continue;
- case '[':
- if ((pos = strchr(subject->buf, ']'))) {
- remove = pos - subject->buf;
- if (remove <= (subject->len - remove) * 2) {
- strbuf_remove(subject, 0, remove + 1);
- continue;
- }
- } else
- strbuf_remove(subject, 0, 1);
- break;
- }
+ int status;
+ regex_t regex;
+ regmatch_t match[4];
+
+ /* Strip off 'Re:' and/or the first text in square brackets, such as
+ '[PATCH]' at the start of the mail Subject. */
+ status = regcomp(®ex,
+ "^([Rr]e:)?([^]]*\\[[^]]+\\])(.*)$",
+ REG_EXTENDED);
+
+ if (status) {
+ /* Compiling the regex failed. Find out why and tell
+ the user. This is always a bug in the code. */
+ int esize = regerror(status, ®ex, NULL, 0);
+ struct strbuf etext = STRBUF_INIT;
+
+ strbuf_grow(&etext, esize);
+ regerror(status, ®ex, etext.buf, esize);
+ fprintf (stderr,
+ "Error compiling regular expression: %s\n",
+ etext.buf);
+ strbuf_release(&etext);
+ exit(1);
+ }
+
+ /* Store any matches in match. */
+ status = regexec(®ex, subject->buf, 4, match, 0);
+
+ /* If there was a match for \3 in the regex, trim the subject
+ to this match. */
+ if (!status && match[3].rm_so > 0) {
+ strbuf_remove(subject, 0, match[3].rm_so);
strbuf_trim(subject);
- return;
}
+
+ return;
}
static void cleanup_space(struct strbuf *sb)
--
1.6.3.3
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject
2009-06-29 21:17 ` [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject Roger Leigh
@ 2009-06-29 21:26 ` Jakub Narebski
2009-06-29 21:49 ` Roger Leigh
2009-09-22 10:39 ` Neil Roberts
1 sibling, 1 reply; 21+ messages in thread
From: Jakub Narebski @ 2009-06-29 21:26 UTC (permalink / raw)
To: Roger Leigh; +Cc: git
Roger Leigh <rleigh@debian.org> writes:
> Use a regular expression to match text after "Re:" or any text in the
> first pair of square brackets such as "[PATCH n/m]". This replaces
> the complex hairy string munging with a simple single pattern match.
[...]
> + /* Strip off 'Re:' and/or the first text in square brackets, such as
> + '[PATCH]' at the start of the mail Subject. */
> + status = regcomp(®ex,
> + "^([Rr]e:)?([^]]*\\[[^]]+\\])(.*)$",
> + REG_EXTENDED);
Sidenote: it probably didn't worked before either, but there are some
broken mail readers in the wold (*cough* MS Outlook *cough*), that
misinterpret RFCs and use translated form of "Re:" e.g. "Odp:" (Polish),
or not strip "Re:" when replying resulting in string of "Re: Re: Re: ...",
or use capitalized form of "Re:", i.e. "RE:", or use yet another form
e.g. compact form of repeated "Re: Re: Re: ..." in form of "Re(3):".
But I guess it didn't worked before either.
--
Jakub Narebski
Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject
2009-06-29 21:26 ` Jakub Narebski
@ 2009-06-29 21:49 ` Roger Leigh
0 siblings, 0 replies; 21+ messages in thread
From: Roger Leigh @ 2009-06-29 21:49 UTC (permalink / raw)
To: Jakub Narebski; +Cc: Roger Leigh, git
[-- Attachment #1: Type: text/plain, Size: 1628 bytes --]
On Mon, Jun 29, 2009 at 02:26:45PM -0700, Jakub Narebski wrote:
> Roger Leigh <rleigh@debian.org> writes:
>
> > Use a regular expression to match text after "Re:" or any text in the
> > first pair of square brackets such as "[PATCH n/m]". This replaces
> > the complex hairy string munging with a simple single pattern match.
>
> [...]
> > + /* Strip off 'Re:' and/or the first text in square brackets, such as
> > + '[PATCH]' at the start of the mail Subject. */
> > + status = regcomp(®ex,
> > + "^([Rr]e:)?([^]]*\\[[^]]+\\])(.*)$",
> > + REG_EXTENDED);
>
> Sidenote: it probably didn't worked before either, but there are some
> broken mail readers in the wold (*cough* MS Outlook *cough*), that
> misinterpret RFCs and use translated form of "Re:" e.g. "Odp:" (Polish),
> or not strip "Re:" when replying resulting in string of "Re: Re: Re: ...",
> or use capitalized form of "Re:", i.e. "RE:", or use yet another form
> e.g. compact form of repeated "Re: Re: Re: ..." in form of "Re(3):".
>
> But I guess it didn't worked before either.
One could update the regex to cope with that easily enough such as
"^([Rr]e:[[:space:]]*)*([^]]*\\[[^]]+\\])(.*)$"
for the "Re: Re: Re:" case, though I can't say I've seen anything
except "Re:" for years. Maybe I just don't get mail and patches
from Outlook users ;-)
Regards,
Roger
--
.''`. Roger Leigh
: :' : Debian GNU/Linux http://people.debian.org/~rleigh/
`. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/
`- GPG Public Key: 0x25BFB848 Please GPG sign your mail.
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject
2009-06-29 21:17 ` [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject Roger Leigh
2009-06-29 21:26 ` Jakub Narebski
@ 2009-09-22 10:39 ` Neil Roberts
2009-09-22 12:56 ` [PATCH] builtin-mailinfo.c: Improve the regexp for cleaning up the subject Neil Roberts
2009-09-22 16:15 ` [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject Junio C Hamano
1 sibling, 2 replies; 21+ messages in thread
From: Neil Roberts @ 2009-09-22 10:39 UTC (permalink / raw)
To: git
Roger Leigh <rleigh@debian.org> writes:
> Use a regular expression to match text after "Re:" or any text in the
> first pair of square brackets such as "[PATCH n/m]". This replaces
> the complex hairy string munging with a simple single pattern match.
Is this patch going to get applied? We like to use the '[topic]' format
in Clutter¹ because it looks so much nicer than 'topic:' and it would be
really nice not to have to manually fix the commit when a contributor
sends a patch in the same format.
- Neil
[1] http://git.clutter-project.org/cgit.cgi?url=clutter/log/
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] builtin-mailinfo.c: Improve the regexp for cleaning up the subject
2009-09-22 10:39 ` Neil Roberts
@ 2009-09-22 12:56 ` Neil Roberts
2009-09-22 16:15 ` [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject Junio C Hamano
1 sibling, 0 replies; 21+ messages in thread
From: Neil Roberts @ 2009-09-22 12:56 UTC (permalink / raw)
To: git
Previously the regular expression would remove the first set of square
brackets regardless of what came before it. If a patch with a summary
such as 'Added a[0] to a line' was passed through git-format-patch
with the -k option then the summary would be cropped to 'to a line'
when applied with git-am.
The new regular expression also matches any number of 're:' prefixes
which apparently can be generated by some old mail clients.
The old regexp required that there be at least one set of square
brackets before it would remove the 're:' and this is now fixed.
---
builtin-mailinfo.c | 9 ++++-----
1 files changed, 4 insertions(+), 5 deletions(-)
This patch is meant to apply on top of the two previous patches by
Roger Leigh which are available here:
http://marc.info/?l=git&m=124839483217718&w=2
http://marc.info/?l=git&m=124839483317722&w=2
It fixes some small problems as described above.
diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index 7098c90..f5799f1 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -227,7 +227,7 @@ static void cleanup_subject(struct strbuf *subject)
/* Strip off 'Re:' and/or the first text in square brackets, such as
'[PATCH]' at the start of the mail Subject. */
status = regcomp(®ex,
- "^([Rr]e:)?([^]]*\\[[^]]+\\])(.*)$",
+ "^([Rr]e:[ \t]*)*(\\[[^]]+\\][ \t]*)?",
REG_EXTENDED);
if (status) {
@@ -248,10 +248,9 @@ static void cleanup_subject(struct strbuf *subject)
/* Store any matches in match. */
status = regexec(®ex, subject->buf, 4, match, 0);
- /* If there was a match for \3 in the regex, trim the subject
- to this match. */
- if (!status && match[3].rm_so > 0) {
- strbuf_remove(subject, 0, match[3].rm_so);
+ /* If there was a match, remove it */
+ if (!status && match[0].rm_so >= 0) {
+ strbuf_remove(subject, 0, match[0].rm_eo);
strbuf_trim(subject);
}
--
1.6.0.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject
2009-09-22 10:39 ` Neil Roberts
2009-09-22 12:56 ` [PATCH] builtin-mailinfo.c: Improve the regexp for cleaning up the subject Neil Roberts
@ 2009-09-22 16:15 ` Junio C Hamano
2009-09-22 16:51 ` Neil Roberts
2009-09-23 0:26 ` Jason Holden
1 sibling, 2 replies; 21+ messages in thread
From: Junio C Hamano @ 2009-09-22 16:15 UTC (permalink / raw)
To: Neil Roberts; +Cc: git
Neil Roberts <bpeeluk@yahoo.co.uk> writes:
> Roger Leigh <rleigh@debian.org> writes:
>
>> Use a regular expression to match text after "Re:" or any text in the
>> first pair of square brackets such as "[PATCH n/m]". This replaces
>> the complex hairy string munging with a simple single pattern match.
>
> Is this patch going to get applied?
I do not think it is likely to happen for a patch without much comments
nor progress after this long blank period, without a refresher discussion.
It definitely won't be applied silently in its original form, especially
because the final comment in the old discussion on the patch in question
began with "One could _update_ ..." from the author of the patch, and then
nothing happened.
http://thread.gmane.org/gmane.comp.version-control.git/122418/focus=122466
I actually liked the much simpler one by Andreas in the original thread,
but if you really want to use a regexp (which we didn't have to) we should
make it configurable. See the neighbouring discussion here as well.
http://thread.gmane.org/gmane.comp.version-control.git/123322
I think we all agree that the behaviour should be improved, but I think
neither Roger's patch nor Andreas's one was the solution.. People who
care need to carry discussions and proposed patches forward to help us
agree on an acceptable solution.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject
2009-09-22 16:15 ` [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject Junio C Hamano
@ 2009-09-22 16:51 ` Neil Roberts
2009-09-23 0:26 ` Jason Holden
1 sibling, 0 replies; 21+ messages in thread
From: Neil Roberts @ 2009-09-22 16:51 UTC (permalink / raw)
To: git
> Neil Roberts <bpeeluk@yahoo.co.uk> writes:
>
>> Is this patch going to get applied?
Junio C Hamano <gitster@pobox.com> writes:
> I do not think it is likely to happen for a patch without much
> comments nor progress after this long blank period, without a
> refresher discussion.
>
> It definitely won't be applied silently in its original form,
> especially because the final comment in the old discussion on the
> patch in question began with "One could _update_ ..." from the author
> of the patch, and then nothing happened.
>
> http://thread.gmane.org/gmane.comp.version-control.git/122418/focus=122466
Ok, fair enough. I submitted another patch to mailing list earlier which
at least addresses the issue mentioned by the original author when he
says "One could _update_ ...".
> I actually liked the much simpler one by Andreas in the original
> thread, but if you really want to use a regexp (which we didn't have
> to) we should make it configurable. See the neighbouring discussion
> here as well.
>
> http://thread.gmane.org/gmane.comp.version-control.git/123322
Oh I didn't see that thread, sorry. It's quite tricky to track the issue
when it is spread across multiple threads in a mailing list.
I'm not particularly set on the idea of it being a regular expression so
I'd be happy with an improved version of the existing loop. I'd
certainly be happy with it being an option as in your patch here:
http://article.gmane.org/gmane.comp.version-control.git/123340
If it is an option as in that patch surely it's quite safe as it can't
affect anyone's existing workflow? It might also be nice if it was
possible to change it in .git/config so you could enable it by default
for projects that use the '[topic]' syntax (such as Cairo and Clutter).
> I think we all agree that the behaviour should be improved, but I
> think neither Roger's patch nor Andreas's one was the solution..
> People who care need to carry discussions and proposed patches forward
> to help us agree on an acceptable solution.
Ok, well I do care about this issue and it annoys me regularly so I
would love to reopen the discussion. What are the issues with the last
patch mentioned above?
- Neil
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject
2009-09-22 16:15 ` [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject Junio C Hamano
2009-09-22 16:51 ` Neil Roberts
@ 2009-09-23 0:26 ` Jason Holden
1 sibling, 0 replies; 21+ messages in thread
From: Jason Holden @ 2009-09-23 0:26 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Neil Roberts, git
Junio C Hamano wrote:
>
> I think we all agree that the behaviour should be improved, but I think
> neither Roger's patch nor Andreas's one was the solution.. People who
> care need to carry discussions and proposed patches forward to help us
> agree on an acceptable solution.
An additional use case for this is that at $dayjob, we use GForge
Advanced Server. With GForge, commits are tied to the bug-tracker
by including the bug-id in the commit message with the syntax
[#NNN], where NNN is a unique id for each submitted bug.
So the typical first line of a commit message looks something like:
[#100] Fix bug in foo.c
sent using git-send-email, this becomes
[PATCH] [#100] Fix bug in foo.c
But of course, both [PATCH] and [#100] get stripped off when applied
with git-am, forcing a manual edit.
The reg-expression stuff isn't necessary for my particular use-case.
Stripping off brackets that have any variant of "PATCH" in them, or
just stripping off the first set of brackets would work for me.
--
Regards,
Jason Holden
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 2/2] builtin-mailinfo.c: Free regular expression after use
2009-06-28 23:04 ` Junio C Hamano
2009-06-29 9:53 ` Andreas Ericsson
2009-06-29 21:17 ` [PATCH] builtin-mailinfo.c: Trim only first pair of square brackets in subject Roger Leigh
@ 2009-06-29 21:34 ` Roger Leigh
2009-06-29 21:36 ` git mailinfo strips important context from patch subjects Roger Leigh
3 siblings, 0 replies; 21+ messages in thread
From: Roger Leigh @ 2009-06-29 21:34 UTC (permalink / raw)
To: git; +Cc: Roger Leigh
Signed-off-by: Roger Leigh <rleigh@debian.org>
---
builtin-mailinfo.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index 6d19046..6559c37 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -254,6 +254,8 @@ static void cleanup_subject(struct strbuf *subject)
strbuf_trim(subject);
}
+ regfree(®ex);
+
return;
}
--
1.6.3.3
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: git mailinfo strips important context from patch subjects
2009-06-28 23:04 ` Junio C Hamano
` (2 preceding siblings ...)
2009-06-29 21:34 ` [PATCH 2/2] builtin-mailinfo.c: Free regular expression after use Roger Leigh
@ 2009-06-29 21:36 ` Roger Leigh
3 siblings, 0 replies; 21+ messages in thread
From: Roger Leigh @ 2009-06-29 21:36 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, git
[-- Attachment #1: Type: text/plain, Size: 6831 bytes --]
On Sun, Jun 28, 2009 at 04:04:37PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > On Sun, Jun 28, 2009 at 08:38:58PM +0100, Roger Leigh wrote:
> >
> >> In most of the projects I work on, the git commit message has
> >> the affected subsystem or component in square brackets, such as
> >>
> >> [foo] change bar to baz
> >>
> >> [...]
> >>
> >> The [sbuild] prefix has been dropped from the Subject, so an
> >> important bit of context about the patch has been lost.
> >>
> >> It's a bit of a bug that you can't round trip from a git-format-patch
> >> to import with git-am and then not be able to produce the exact same
> >> patch set with git-format-patch again (assuming preparing and applying
> >> to the same point, of course).
> >
> > As an immediate solution, you probably want to use "-k" when generating
> > the patch (not to add the [PATCH] munging) and "-k" when reading the
> > patch via "git am" (which will avoid trying to strip any munging).
> >
> > However:
> >
> >> Would it be possible to change the git-mailinfo logic to use a less
> >> greedy pattern match so it leaves everything after
> >> ([PATCH( [0-9/])+])+ in the subject? AFAICT this is cleanup_subject in
> >> builtin-mailinfo.c? Could this rather complex function not just do a
> >> simple regex match which can also take care of stripping ([Rr]e:) ?
> >
> > Yes, I think in the long run it makes sense to strip just the _first_
> > set of brackets. I don't think we want to be more specific than that in
> > the match, because we allow arbitrary cruft inside the brackets (like
> > "[RFC/PATCH]", etc). But if format-patch always puts exactly one set of
> > brackets, and am strips exactly one set, then that should retain your
> > subject in practice, even if it starts with [foo].
>
> I think it may still make sense to insist that PATCH appears somewhere in
> the first set of brackets, but I have stop and wonder if it is even
> necessary.
I imagine not. I've submitted a patch separately which implements
this behaviour (more on that below).
> Because git removes [sbuild] at the beginning, Roger is unhappy.
>
> * Is he happy that git removes [PATCH]? In E-mail based workflow it is
> a good practice to mark messages that are patches clearly so that they
> can be quickly found among the discussions that lead to them, and it is
> plausible that his project accepted that as an established practice
> supported well by git.
I'm perfectly happy that [PATCH] is removed. My requirement is that
the commit created by "git am" is identical to the commit represented
in the patch created with "git format-patch". The removal of this
is IMO correct, and I agree that it's presence is useful in an email-
based workflow.
> * Is he happy that git treats the first paragraph of the commit message
> specially from the rest of the message? In a project with many
> commits, it is essential that people write good commit summaries that
> fits on a single line so that tools like shortlog and gitweb can be
> used to get a bird-eye view of what happened recently. Perhaps his
> project picked it up as the best current practice supported well by
> git.
I'm also happy with this, and make use of it. As for the previous
paragraph, I would like the commit message to be preserved correctly
so that the message committed by "git am" matches the original
commit message exactly.
> * Is he happy that git takes "---" as the end of message marker, so that
> any other commentary can be added to the message to facilitate the
> communication without adding noise to the commits? Perhaps he is and
> his project picked it up as a good practice supported well by git.
This sounds just fine, though I have not yet had the need to use it.
> _An_ established (note that I did not say _the_ nor _best current_)
> practice supported well by git to note the area being affected in a
> project of nontrivial size is to prefix the single line summary with the
> name of the area followed by a colon. There is no difference between
> "[sbuild] foo" and "sbuild: foo" at the information content point-of-view,
> but the latter has an advantage of being one letter shorter and less
> distracting in MUA. He does not have a very strong reason to choose
> something different only to make his life harder, does he?
Well, I sometimes use the format
[foo] bar: baz
but my more general point was not my specific usage but that the
existing behaviour was causing loss of information. I think it
would be preferable to guarantee that data from the original
commit is not lost and is preserved exactly if at all possible.
> Supporting a slightly different convention may seem to be accomodating and
> nice, but if there is no real technical difference between the two (and
> again, "area:" is one letter shorter ;-), letting people run with
> different convention longer, when they can switch easily to another
> convention that is already well supported, may actually hurt them in the
> long run. "[sbuild]" will not match "--area=sbuild" that will internally
> become "--grep-only-first-line=sbuild:" so either he will miss out
> benefiting from the new feature, or the implementation of the new feature
> unnecessarily needs more code.
This is a nice feature I wasn't aware of, so thanks for pointing it
out. It might be useful to alter my workflow to allow it to be used,
or alternatively customisation to allow a custom regex stored e.g.
in .git/config would allow me to match both forms?
The patch I sent to the list separately replaces the existing
cleanup_subject string munging (which is rather complex and
hairy), with a single regular expression to match the bits of
the string we don't want such as '^Re:' and the first set of
square brackets. We then just keep the remainder. I initally
went with the following extended regex:
^([Rr]e: )?(.*PATCH[^]]*\\] )(.*)$
but as per your comments above about removing the first set of
brackets whatever the contents, chose the following more
general expression:
^([Rr]e:)?([^]]*\\[[^]]+\\])(.*)$
This should be rather more maintainable and flexible than the
existing code, because one can just tweak the regex rather than
fiddling with hairy string offsets. This preserves the
existing behaviour with the exception of matching the first []
pair only rather than being "greedy" and removing everything up
to the last "]".
Regards,
Roger
--
.''`. Roger Leigh
: :' : Debian GNU/Linux http://people.debian.org/~rleigh/
`. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/
`- GPG Public Key: 0x25BFB848 Please GPG sign your mail.
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread