From: Emma Brooks <me@pluvano.com>
To: Eric Sunshine <sunshine@sunshineco.com>
Cc: "Git List" <git@vger.kernel.org>, "Jakub Narębski" <jnareb@gmail.com>
Subject: Re: [PATCH v2] gitweb: map names/emails with mailmap
Date: Mon, 10 Aug 2020 03:12:00 +0000 [thread overview]
Message-ID: <20200810031123.GA3565@pluvano.com> (raw)
In-Reply-To: <CAPig+cT3aMUQapU7i0C3jZaLd2XJwP9SbxFEm_tG_1wj8w1PKw@mail.gmail.com>
On 2020-08-09 20:49:59-0400, Eric Sunshine wrote:
> On Sun, Aug 9, 2020 at 7:06 PM Emma Brooks <me@pluvano.com> wrote:
> > Add an option to map names and emails to their canonical forms via a
> > .mailmap file. This is enabled by default, consistent with the behavior
> > of Git itself.
> >
> > Signed-off-by: Emma Brooks <me@pluvano.com>
> > ---
> > diff --git a/Documentation/gitweb.conf.txt b/Documentation/gitweb.conf.txt
> > @@ -751,6 +751,11 @@ default font sizes or lineheights are changed (e.g. via adding extra
> > +mailmap::
> > + Use mailmap to find the canonical name/email for
> > + committers/authors (see linkgit:git-shortlog[1]). Enabled by
> > + default.
>
> Is this setting global or per-repository? (I ask because documentation
> for other options in this section document whether they can be set
> per-repository.)
Global. I'll add a note that it cannot be set per-project, or I could
add support for setting it per-project if that's wanted.
> Should there be any sort of support for functionality similar to the
> "mailmap.file" and "mailmap.blob" configuration options in Git itself?
> (Genuine question, not a demand for you to implement such support.)
Yes, that would be useful and should probably be supported.
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > +# Contents of mailmap stored as a referance to a hash with keys in the format
>
> s/referance/reference/
OK.
> > +# of "name <email>" or "<email>", and values that are hashes containing a
> > +# replacement "name" and/or "email". If set (even if empty) the mailmap has
> > +# already been read.
> > +my $mailmap;
> > +
> > +sub read_mailmap {
> > + my %mailmap = ();
> > + open my $fd, '-|', quote_command(
> > + git_cmd(), 'cat-file', 'blob', 'HEAD:.mailmap') . ' 2> /dev/null'
> > + or die_error(500, 'Failed to read mailmap');
>
> Am I reading this correctly that this will die if the project does not
> have a .mailmap file? If so, that seems like harsh behavior since
> there are many projects in the wild lacking a .mailmap file.
No, this error message is misleading. The die_error is called if there
is a problem executing git cat-file, but not if cat-file returns an
error. I'll revise this message to be more accurate.
> > + return \%mailmap if eof $fd;
> > + foreach (split '\n', <$fd>) {
>
> If the .mailmap has no content, then the 'foreach' loop won't be
> entered, which means the early 'return' above it is unneeded, correct?
> (Not necessarily asking for the early 'return' to be removed, but more
> a case of checking that I'm understanding the logic.)
The early return is intended to catch when there is no mailmap, so $fd
does not get initialized. Without it, you would get an error when you
try to split $fd's content:
Use of uninitialized value $fd in split at [the foreach]
> > + next if (/^#/);
> > + if (/(.*)\s+ <([^<>]+)>\s+ ((?:.*\s+)? <[^<>]+>) (?:\s+\#)/x ||
> > + /(.*)\s+ <([^<>]+)>\s+ ((?:.*\s+)? <[^<>]+>)/x) {
> > + # New Name <new@email> <old@email>
> > + # New Name <new@email> Old Name <old@email>
>
> The first regex is intended to handle a trailing "# comment", whereas
> the second regex is for lines lacking a comment, correct? However,
> because neither of these expressions are anchored, the second regex
> will match both types of lines, thus the first regex is redundant. I'm
> guessing, therefore, that your intent was actually to anchor the
> expressions, perhaps like this:
>
> if (/^\s* (.*)\s+ <([^<>]+)>\s+ ((?:.*\s+)? <[^<>]+>) (?:\s+\#)/x ||
> /^\s* (.*)\s+ <([^<>]+)>\s+ ((?:.*\s+)? <[^<>]+>) \s*$/x) {
>
> Also, if you're matching lines of the form:
>
> name1 <email1> [optional-name] <email2>
>
> in which you expect to see "name1", then is the loose "(.*)\s+"
> desirable? Shouldn't it be tighter "(.+)\s+"? For instance:
>
> if (/^\s* (.+)\s+ <([^<>]+)>\s+ ((?:.*\s+)? <[^<>]+>) (?:\s+\#)/x ||
> /^\s* (.+)\s+ <([^<>]+)>\s+ ((?:.*\s+)? <[^<>]+>) \s*$/x) {
Yes and yes. I'll update those.
> > + $mailmap{$3} = ();
>
> I wonder if you should be doing some sort of whitespace normalization
> on $3 before using it as a hash key. For instance, if someone has a
> .mailmap that looks like this (where I've used "." to represent
> space):
>
> name1.<email1>.name2...<email2>
>
> then $3 will have three spaces between 'name2' and '<email2>' when
> used as a key, and that won't match later when you construct a "name
> <email>" key later in map_author() with a single space.
Yes, I hadn't considered that case.
Thanks.
next prev parent reply other threads:[~2020-08-10 3:12 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-30 4:12 [RFC PATCH] gitweb: Map names/emails with mailmap Emma Brooks
2020-07-30 16:20 ` Junio C Hamano
2020-07-31 1:01 ` Jeff King
2020-07-31 2:10 ` Junio C Hamano
2020-08-08 21:34 ` [PATCH] " Emma Brooks
2020-08-09 23:04 ` [PATCH v2] gitweb: map " Emma Brooks
2020-08-10 0:49 ` Eric Sunshine
2020-08-10 3:12 ` Emma Brooks [this message]
2020-08-10 5:41 ` Eric Sunshine
2020-08-10 10:02 ` Jeff King
2020-08-11 4:17 ` Emma Brooks
2020-08-11 4:48 ` Eric Sunshine
2020-08-11 4:55 ` Jeff King
2020-09-05 2:55 ` Emma Brooks
2020-09-05 3:26 ` Junio C Hamano
2020-09-07 22:10 ` Emma Brooks
2020-08-11 6:17 ` Eric Wong
2020-08-11 6:33 ` Joe Perches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200810031123.GA3565@pluvano.com \
--to=me@pluvano.com \
--cc=git@vger.kernel.org \
--cc=jnareb@gmail.com \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.