From: "Alex Bennée" <alex.bennee@linaro.org>
To: Matthieu Moy <git@matthieu-moy.fr>
Cc: gitster@pobox.com, git@vger.kernel.org,
"Eric Sunshine" <sunshine@sunshineco.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Thomas Adam" <thomas@xteddy.org>
Subject: Re: [PATCH v3 1/3] send-email: add and use a local copy of Mail::Address
Date: Mon, 08 Jan 2018 11:56:15 +0000 [thread overview]
Message-ID: <87incco97k.fsf@linaro.org> (raw)
In-Reply-To: <1515407674-5233-1-git-send-email-git@matthieu-moy.fr>
Matthieu Moy <git@matthieu-moy.fr> writes:
> We used to have two versions of the email parsing code. Our
> parse_mailboxes (in Git.pm), and Mail::Address which we used if
> installed. Unfortunately, both versions have different sets of bugs, and
> changing the behavior of git depending on whether Mail::Address is
> installed was a bad idea.
>
> A first attempt to solve this was cc90750 (send-email: don't use
> Mail::Address, even if available, 2017-08-23), but it turns out our
> parse_mailboxes is too buggy for some uses. For example the lack of
> nested comments support breaks get_maintainer.pl in the Linux kernel
> tree:
>
> https://public-inbox.org/git/20171116154814.23785-1-alex.bennee@linaro.org/
>
> This patch goes the other way: use Mail::Address anyway, but have a
> local copy from CPAN as a fallback, when the system one is not
> available.
>
> The duplicated script is small (276 lines of code) and stable in time.
> Maintaining the local copy should not be an issue, and will certainly be
> less burden than maintaining our own parse_mailboxes.
>
> Another option would be to consider Mail::Address as a hard dependency,
> but it's easy enough to save the trouble of extra-dependency to the end
> user or packager.
>
> Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> ---
> No change since v2.
>
> git-send-email.perl | 3 +-
> perl/Git/FromCPAN/Mail/Address.pm | 276 ++++++++++++++++++++++++++++++++++++++
> perl/Git/Mail/Address.pm | 24 ++++
> 3 files changed, 302 insertions(+), 1 deletion(-)
> create mode 100644 perl/Git/FromCPAN/Mail/Address.pm
> create mode 100755 perl/Git/Mail/Address.pm
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index edcc6d3..340b5c8 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -30,6 +30,7 @@ use Error qw(:try);
> use Cwd qw(abs_path cwd);
> use Git;
> use Git::I18N;
> +use Git::Mail::Address;
>
> Getopt::Long::Configure qw/ pass_through /;
>
> @@ -489,7 +490,7 @@ my ($repoauthor, $repocommitter);
> ($repocommitter) = Git::ident_person(@repo, 'committer');
>
> sub parse_address_line {
> - return Git::parse_mailboxes($_[0]);
> + return map { $_->format } Mail::Address->parse($_[0]);
> }
>
> sub split_addrs {
> diff --git a/perl/Git/FromCPAN/Mail/Address.pm b/perl/Git/FromCPAN/Mail/Address.pm
> new file mode 100644
> index 0000000..13b2ff7
> --- /dev/null
> +++ b/perl/Git/FromCPAN/Mail/Address.pm
> @@ -0,0 +1,276 @@
> +# Copyrights 1995-2017 by [Mark Overmeer <perl@overmeer.net>].
> +# For other contributors see ChangeLog.
> +# See the manual pages for details on the licensing terms.
> +# Pod stripped from pm file by OODoc 2.02.
> +package Mail::Address;
> +use vars '$VERSION';
> +$VERSION = '2.19';
> +
> +use strict;
> +
> +use Carp;
> +
> +# use locale; removed in version 1.78, because it causes taint problems
> +
> +sub Version { our $VERSION }
> +
> +
> +
> +# given a comment, attempt to extract a person's name
> +sub _extract_name
> +{ # This function can be called as method as well
> + my $self = @_ && ref $_[0] ? shift : undef;
> +
> + local $_ = shift
> + or return '';
> +
> + # Using encodings, too hard. See Mail::Message::Field::Full.
> + return '' if m/\=\?.*?\?\=/;
> +
> + # trim whitespace
> + s/^\s+//;
> + s/\s+$//;
> + s/\s+/ /;
> +
> + # Disregard numeric names (e.g. 123456.1234@compuserve.com)
> + return "" if /^[\d ]+$/;
> +
> + s/^\((.*)\)$/$1/; # remove outermost parenthesis
> + s/^"(.*)"$/$1/; # remove outer quotation marks
> + s/\(.*?\)//g; # remove minimal embedded comments
> + s/\\//g; # remove all escapes
> + s/^"(.*)"$/$1/; # remove internal quotation marks
> + s/^([^\s]+) ?, ?(.*)$/$2 $1/; # reverse "Last, First M." if applicable
> + s/,.*//;
> +
> + # Change casing only when the name contains only upper or only
> + # lower cased characters.
> + unless( m/[A-Z]/ && m/[a-z]/ )
> + { # Set the case of the name to first char upper rest lower
> + s/\b(\w+)/\L\u$1/igo; # Upcase first letter on name
> + s/\bMc(\w)/Mc\u$1/igo; # Scottish names such as 'McLeod'
> + s/\bo'(\w)/O'\u$1/igo; # Irish names such as 'O'Malley, O'Reilly'
> + s/\b(x*(ix)?v*(iv)?i*)\b/\U$1/igo; # Roman numerals, eg 'Level III Support'
> + }
> +
> + # some cleanup
> + s/\[[^\]]*\]//g;
> + s/(^[\s'"]+|[\s'"]+$)//g;
> + s/\s{2,}/ /g;
> +
> + $_;
> +}
> +
> +sub _tokenise
> +{ local $_ = join ',', @_;
> + my (@words,$snippet,$field);
> +
> + s/\A\s+//;
> + s/[\r\n]+/ /g;
> +
> + while ($_ ne '')
> + { $field = '';
> + if(s/^\s*\(/(/ ) # (...)
> + { my $depth = 0;
> +
> + PAREN: while(s/^(\(([^\(\)\\]|\\.)*)//)
> + { $field .= $1;
> + $depth++;
> + while(s/^(([^\(\)\\]|\\.)*\)\s*)//)
> + { $field .= $1;
> + last PAREN unless --$depth;
> + $field .= $1 if s/^(([^\(\)\\]|\\.)+)//;
> + }
> + }
> +
> + carp "Unmatched () '$field' '$_'"
> + if $depth;
> +
> + $field =~ s/\s+\Z//;
> + push @words, $field;
> +
> + next;
> + }
> +
> + if( s/^("(?:[^"\\]+|\\.)*")\s*// # "..."
> + || s/^(\[(?:[^\]\\]+|\\.)*\])\s*// # [...]
> + || s/^([^\s()<>\@,;:\\".[\]]+)\s*//
> + || s/^([()<>\@,;:\\".[\]])\s*//
> + )
> + { push @words, $1;
> + next;
> + }
> +
> + croak "Unrecognised line: $_";
> + }
> +
> + push @words, ",";
> + \@words;
> +}
> +
> +sub _find_next
> +{ my ($idx, $tokens, $len) = @_;
> +
> + while($idx < $len)
> + { my $c = $tokens->[$idx];
> + return $c if $c eq ',' || $c eq ';' || $c eq '<';
> + $idx++;
> + }
> +
> + "";
> +}
> +
> +sub _complete
> +{ my ($class, $phrase, $address, $comment) = @_;
> +
> + @$phrase || @$comment || @$address
> + or return undef;
> +
> + my $o = $class->new(join(" ",@$phrase), join("",@$address), join(" ",@$comment));
> + @$phrase = @$address = @$comment = ();
> + $o;
> +}
> +
> +#------------
> +
> +sub new(@)
> +{ my $class = shift;
> + bless [@_], $class;
> +}
> +
> +
> +sub parse(@)
> +{ my $class = shift;
> + my @line = grep {defined} @_;
> + my $line = join '', @line;
> +
> + my (@phrase, @comment, @address, @objs);
> + my ($depth, $idx) = (0, 0);
> +
> + my $tokens = _tokenise @line;
> + my $len = @$tokens;
> + my $next = _find_next $idx, $tokens, $len;
> +
> + local $_;
> + for(my $idx = 0; $idx < $len; $idx++)
> + { $_ = $tokens->[$idx];
> +
> + if(substr($_,0,1) eq '(') { push @comment, $_ }
> + elsif($_ eq '<') { $depth++ }
> + elsif($_ eq '>') { $depth-- if $depth }
> + elsif($_ eq ',' || $_ eq ';')
> + { warn "Unmatched '<>' in $line" if $depth;
> + my $o = $class->_complete(\@phrase, \@address, \@comment);
> + push @objs, $o if defined $o;
> + $depth = 0;
> + $next = _find_next $idx+1, $tokens, $len;
> + }
> + elsif($depth) { push @address, $_ }
> + elsif($next eq '<') { push @phrase, $_ }
> + elsif( /^[.\@:;]$/ || !@address || $address[-1] =~ /^[.\@:;]$/ )
> + { push @address, $_ }
> + else
> + { warn "Unmatched '<>' in $line" if $depth;
> + my $o = $class->_complete(\@phrase, \@address, \@comment);
> + push @objs, $o if defined $o;
> + $depth = 0;
> + push @address, $_;
> + }
> + }
> + @objs;
> +}
> +
> +#------------
> +
> +sub phrase { shift->set_or_get(0, @_) }
> +sub address { shift->set_or_get(1, @_) }
> +sub comment { shift->set_or_get(2, @_) }
> +
> +sub set_or_get($)
> +{ my ($self, $i) = (shift, shift);
> + @_ or return $self->[$i];
> +
> + my $val = $self->[$i];
> + $self->[$i] = shift if @_;
> + $val;
> +}
> +
> +
> +my $atext = '[\-\w !#$%&\'*+/=?^`{|}~]';
> +sub format
> +{ my @addrs;
> +
> + foreach (@_)
> + { my ($phrase, $email, $comment) = @$_;
> + my @addr;
> +
> + if(defined $phrase && length $phrase)
> + { push @addr
> + , $phrase =~ /^(?:\s*$atext\s*)+$/o ? $phrase
> + : $phrase =~ /(?<!\\)"/ ? $phrase
> + : qq("$phrase");
> +
> + push @addr, "<$email>"
> + if defined $email && length $email;
> + }
> + elsif(defined $email && length $email)
> + { push @addr, $email;
> + }
> +
> + if(defined $comment && $comment =~ /\S/)
> + { $comment =~ s/^\s*\(?/(/;
> + $comment =~ s/\)?\s*$/)/;
> + }
> +
> + push @addr, $comment
> + if defined $comment && length $comment;
> +
> + push @addrs, join(" ", @addr)
> + if @addr;
> + }
> +
> + join ", ", @addrs;
> +}
> +
> +#------------
> +
> +sub name
> +{ my $self = shift;
> + my $phrase = $self->phrase;
> + my $addr = $self->address;
> +
> + $phrase = $self->comment
> + unless defined $phrase && length $phrase;
> +
> + my $name = $self->_extract_name($phrase);
> +
> + # first.last@domain address
> + if($name eq '' && $addr =~ /([^\%\.\@_]+([\._][^\%\.\@_]+)+)[\@\%]/)
> + { ($name = $1) =~ s/[\._]+/ /g;
> + $name = _extract_name $name;
> + }
> +
> + if($name eq '' && $addr =~ m#/g=#i) # X400 style address
> + { my ($f) = $addr =~ m#g=([^/]*)#i;
> + my ($l) = $addr =~ m#s=([^/]*)#i;
> + $name = _extract_name "$f $l";
> + }
> +
> + length $name ? $name : undef;
> +}
> +
> +
> +sub host
> +{ my $addr = shift->address || '';
> + my $i = rindex $addr, '@';
> + $i >= 0 ? substr($addr, $i+1) : undef;
> +}
> +
> +
> +sub user
> +{ my $addr = shift->address || '';
> + my $i = rindex $addr, '@';
> + $i >= 0 ? substr($addr,0,$i) : $addr;
> +}
> +
> +1;
> diff --git a/perl/Git/Mail/Address.pm b/perl/Git/Mail/Address.pm
> new file mode 100755
> index 0000000..2ce3e84
> --- /dev/null
> +++ b/perl/Git/Mail/Address.pm
> @@ -0,0 +1,24 @@
> +package Git::Mail::Address;
> +use 5.008;
> +use strict;
> +use warnings;
> +
> +=head1 NAME
> +
> +Git::Mail::Address - Wrapper for the L<Mail::Address> module, in case it's not installed
> +
> +=head1 DESCRIPTION
> +
> +This module is only intended to be used for code shipping in the
> +C<git.git> repository. Use it for anything else at your peril!
> +
> +=cut
> +
> +eval {
> + require Mail::Address;
> + 1;
> +} or do {
> + require Git::FromCPAN::Mail::Address;
> +};
> +
> +1;
--
Alex Bennée
next prev parent reply other threads:[~2018-01-08 11:56 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-04 18:55 [RFC PATCH 1/2] add a local copy of Mail::Address from CPAN Matthieu Moy
2018-01-04 18:55 ` [RFC PATCH 2/2] Remove now useless email-address parsing code Matthieu Moy
2018-01-04 22:11 ` Alex Bennée
2018-01-05 9:39 ` Matthieu Moy
2018-01-05 10:11 ` [PATCH] send-email: add test for Linux's get_maintainer.pl Matthieu Moy
2018-01-05 11:15 ` Alex Bennée
2018-01-05 11:36 ` Matthieu Moy
2018-01-05 20:16 ` Junio C Hamano
2018-01-04 21:02 ` [RFC PATCH 1/2] add a local copy of Mail::Address from CPAN Eric Sunshine
2018-01-05 14:19 ` Ævar Arnfjörð Bjarmason
2018-01-05 18:36 ` [PATCH v2 1/3] send-email: add and use a local copy of Mail::Address Matthieu Moy
2018-01-05 18:36 ` [PATCH v2 2/3] Remove now useless email-address parsing code Matthieu Moy
2018-01-05 18:36 ` [PATCH v2 3/3] send-email: add test for Linux's get_maintainer.pl Matthieu Moy
2018-01-05 18:59 ` Eric Sunshine
2018-01-08 10:30 ` Matthieu Moy
2018-01-08 10:34 ` [PATCH v3 1/3] send-email: add and use a local copy of Mail::Address Matthieu Moy
2018-01-08 10:34 ` [PATCH v3 2/3] Remove now useless email-address parsing code Matthieu Moy
2018-01-08 11:57 ` Alex Bennée
2018-01-08 10:34 ` [PATCH v3 3/3] send-email: add test for Linux's get_maintainer.pl Matthieu Moy
2018-01-08 18:45 ` Junio C Hamano
2018-01-08 11:56 ` Alex Bennée [this message]
2018-02-14 14:59 ` [PATCH v2 1/3] send-email: add and use a local copy of Mail::Address Ævar Arnfjörð Bjarmason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87incco97k.fsf@linaro.org \
--to=alex.bennee@linaro.org \
--cc=avarab@gmail.com \
--cc=git@matthieu-moy.fr \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=sunshine@sunshineco.com \
--cc=thomas@xteddy.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.