git.vger.kernel.org archive mirror
From: Roger Leigh <rleigh@codelibre.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, git@vger.kernel.org
Subject: Re: git mailinfo strips important context from patch subjects
Date: Mon, 29 Jun 2009 22:36:26 +0100
Message-ID: <20090629213625.GA5397@codelibre.net>
In-Reply-To: <7vfxdkez96.fsf@alter.siamese.dyndns.org>

On Sun, Jun 28, 2009 at 04:04:37PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
> > On Sun, Jun 28, 2009 at 08:38:58PM +0100, Roger Leigh wrote:
> >
> >> In most of the projects I work on, the git commit message has
> >> the affected subsystem or component in square brackets, such as
> >> 
> >>   [foo] change bar to baz
> >>
> >> [...]
> >>
> >> The [sbuild] prefix has been dropped from the Subject, so an
> >> important bit of context about the patch has been lost.
> >> 
> >> It's a bit of a bug that you can't round trip from a git-format-patch
> >> to import with git-am and then not be able to produce the exact same
> >> patch set with git-format-patch again (assuming preparing and applying
> >> to the same point, of course).
> >
> > As an immediate solution, you probably want to use "-k" when generating
> > the patch (not to add the [PATCH] munging) and "-k" when reading the
> > patch via "git am" (which will avoid trying to strip any munging).
> >
> > However:
> >
> >> Would it be possible to change the git-mailinfo logic to use a less
> >> greedy pattern match so it leaves everything after
> >> ([PATCH( [0-9/])+])+ in the subject?  AFAICT this is cleanup_subject in
> >> builtin-mailinfo.c?  Could this rather complex function not just do a
> >> simple regex match which can also take care of stripping ([Rr]e:) ?
> >
> > Yes, I think in the long run it makes sense to strip just the _first_
> > set of brackets. I don't think we want to be more specific than that in
> > the match, because we allow arbitrary cruft inside the brackets (like
> > "[RFC/PATCH]", etc). But if format-patch always puts exactly one set of
> > brackets, and am strips exactly one set, then that should retain your
> > subject in practice, even if it starts with [foo].
> 
> I think it may still make sense to insist that PATCH appears somewhere in
> the first set of brackets, but I have to stop and wonder if it is even
> necessary.

I imagine not.  I've submitted a patch separately which implements
this behaviour (more on that below).

> Because git removes [sbuild] at the beginning, Roger is unhappy.
> 
>  * Is he happy that git removes [PATCH]?  In E-mail based workflow it is
>    a good practice to mark messages that are patches clearly so that they
>    can be quickly found among the discussions that lead to them, and it is
>    plausible that his project accepted that as an established practice
>    supported well by git.

I'm perfectly happy that [PATCH] is removed.  My requirement is that
the commit created by "git am" is identical to the commit represented
in the patch created with "git format-patch".  Removing [PATCH] is IMO
correct, and I agree that its presence is useful in an email-based
workflow.

>  * Is he happy that git treats the first paragraph of the commit message
>    specially from the rest of the message?  In a project with many
>    commits, it is essential that people write good commit summaries that
>    fit on a single line so that tools like shortlog and gitweb can be
>    used to get a bird's-eye view of what happened recently.  Perhaps his
>    project picked it up as the best current practice supported well by
>    git.

I'm also happy with this, and make use of it.  As with the previous
point, I would like the commit message to be preserved correctly
so that the message committed by "git am" matches the original
commit message exactly.

>  * Is he happy that git takes "---" as the end of message marker, so that
>    any other commentary can be added to the message to facilitate the
>    communication without adding noise to the commits?  Perhaps he is and
>    his project picked it up as a good practice supported well by git.

This sounds just fine, though I have not yet had the need to use it.
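
As a rough illustration of the "---" convention described above, here is
a minimal Python sketch.  It assumes the marker is a line consisting only
of "---", which is a simplification and not a description of what git
mailinfo actually checks; the function name is made up for the example.

```python
def split_at_marker(body):
    """Split a mail body into (commit message, trailing commentary)."""
    lines = body.split("\n")
    for i, line in enumerate(lines):
        # Simplifying assumption: only a bare "---" line ends the message.
        if line.rstrip() == "---":
            message = "\n".join(lines[:i]).rstrip("\n")
            commentary = "\n".join(lines[i + 1:])
            return message, commentary
    # No marker found: everything belongs to the commit message.
    return body, ""


message, commentary = split_at_marker(
    "Fix baz\n\nLonger text.\n---\nNot for the log.\n"
)
print(message)     # the part that would end up in the commit
print(commentary)  # the part that would be left out
```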

> _An_ established (note that I did not say _the_ nor _best current_)
> practice supported well by git to note the area being affected in a
> project of nontrivial size is to prefix the single line summary with the
> name of the area followed by a colon.  There is no difference between
> "[sbuild] foo" and "sbuild: foo" at the information content point-of-view,
> but the latter has an advantage of being one letter shorter and less
> distracting in MUA.  He does not have a very strong reason to choose
> something different only to make his life harder, does he?

Well, I sometimes use the format

  [foo] bar: baz

but my more general point was not my specific usage but that the
existing behaviour was causing loss of information.  I think it
would be preferable to guarantee that data from the original
commit is not lost and is preserved exactly if at all possible.
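
To make the loss concrete, here is a minimal Python sketch of the two
behaviours discussed in this thread: dropping everything up to the last
"]" versus dropping only the first bracket pair.  The function names are
invented for illustration, and this is not the code in
builtin-mailinfo.c.

```python
def strip_greedy(subject: str) -> str:
    """Drop everything up to and including the last ']' (the lossy case)."""
    end = subject.rfind("]")
    return subject[end + 1:].lstrip() if end != -1 else subject


def strip_first_pair(subject: str) -> str:
    """Drop only a leading '[...]' group and keep any later brackets."""
    if subject.startswith("["):
        end = subject.find("]")
        if end != -1:
            return subject[end + 1:].lstrip()
    return subject


mailed = "[PATCH] [foo] bar: baz"
print(strip_greedy(mailed))      # bar: baz
print(strip_first_pair(mailed))  # [foo] bar: baz
```

Only the second result matches the original commit summary, so only the
second approach round-trips.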

> Supporting a slightly different convention may seem to be accommodating and
> nice, but if there is no real technical difference between the two (and
> again, "area:" is one letter shorter ;-), letting people run with
> different convention longer, when they can switch easily to another
> convention that is already well supported, may actually hurt them in the
> long run.  "[sbuild]" will not match "--area=sbuild" that will internally
> become "--grep-only-first-line=sbuild:" so either he will miss out
> benefiting from the new feature, or the implementation of the new feature
> unnecessarily needs more code.

This is a nice feature I wasn't aware of, so thanks for pointing it
out.  It might be useful to alter my workflow to allow it to be used.
Alternatively, would a customisation option, e.g. a custom regex
stored in .git/config, allow me to match both forms?
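
A minimal sketch of what matching both forms could look like, written in
Python.  Everything here is hypothetical: the function names are made up,
and no such git option or config key is implied.  It only tests the first
line of a message against "[area]" or "area:".

```python
import re


def area_pattern(area: str) -> re.Pattern:
    """Build a pattern for a summary starting with '[area]' or 'area:'."""
    name = re.escape(area)
    return re.compile(rf"^(?:\[{name}\]|{name}:)")


def in_area(message: str, area: str) -> bool:
    """Check only the first line of the commit message."""
    first_line = message.split("\n", 1)[0]
    return area_pattern(area).match(first_line) is not None


print(in_area("[sbuild] change bar to baz", "sbuild"))  # True
print(in_area("sbuild: change bar to baz", "sbuild"))   # True
print(in_area("schroot: change bar to baz", "sbuild"))  # False
```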

The patch I sent to the list separately replaces the existing
cleanup_subject string munging (which is rather complex and
hairy) with a single regular expression to match the bits of
the string we don't want, such as '^Re:' and the first set of
square brackets.  We then just keep the remainder.  I initially
went with the following extended regex:

  ^([Rr]e: )?(.*PATCH[^]]*\\] )(.*)$

but as per your comments above about removing the first set of
brackets whatever the contents, chose the following more
general expression:

  ^([Rr]e:)?([^]]*\\[[^]]+\\])(.*)$

This should be rather more maintainable and flexible than the
existing code, because one can just tweak the regex rather than
fiddling with hairy string offsets.  This preserves the
existing behaviour with the exception of matching the first []
pair only rather than being "greedy" and removing everything up
to the last "]".
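
For anyone who wants to try the two expressions, here is a small Python
sketch.  It uses Python's re module as a stand-in for a POSIX extended
regex matcher, and it assumes the doubled backslashes above are C string
escaping, so the patterns are written with single backslashes.  The
helper name and the whitespace trimming are choices of this sketch, not
part of the patch.

```python
import re

# Both patterns are transcribed from the message with single backslashes.
PATCH_ONLY = re.compile(r"^([Rr]e: )?(.*PATCH[^]]*\] )(.*)$")
FIRST_PAIR = re.compile(r"^([Rr]e:)?([^]]*\[[^]]+\])(.*)$")


def strip_prefix(subject: str, pattern: re.Pattern = FIRST_PAIR) -> str:
    """Return the text after the matched prefix, or the subject unchanged."""
    match = pattern.match(subject)
    if match is None:
        return subject
    # Group 3 is the remainder; trimming the separator space is this
    # sketch's choice, not something the expression does by itself.
    return match.group(3).lstrip()


print(strip_prefix("[PATCH 1/2] [sbuild] change bar to baz"))
# [sbuild] change bar to baz
print(strip_prefix("Re: [RFC/PATCH] [foo] bar: baz"))
# [foo] bar: baz
print(strip_prefix("[sbuild] change bar to baz", PATCH_ONLY))
# [sbuild] change bar to baz
```

Note that with the more general expression a subject that arrives without
any [PATCH] marker, such as "[sbuild] change bar to baz", still loses its
first bracket pair, so the round trip relies on format-patch always
adding exactly one set of brackets.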


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.
