* [RFC PATCH 2/2] send-email: don't use Mail::Address, even if available
2017-08-23 10:21 ` [RFC PATCH 1/2] send-email: fix garbage removal after address Matthieu Moy
@ 2017-08-23 10:21 ` Matthieu Moy
2017-08-23 21:59 ` [RFC PATCH 1/2] send-email: fix garbage removal after address Jacob Keller
2017-08-24 20:32 ` Junio C Hamano
2 siblings, 0 replies; 13+ messages in thread
From: Matthieu Moy @ 2017-08-23 10:21 UTC (permalink / raw)
To: git; +Cc: Matthieu Moy
Using Mail::Address made sense when we didn't have a proper parser. We
now have a reasonable address parser, and using Mail::Address
_if available_ causes much more trouble than it gives benefits:
* Developers typically test one version, not both.
* Users may not be aware that installing Mail::Address will change the
behavior. They may complain about the behavior in one case without
knowing that Mail::Address is involved.
* Having this optional Mail::Address makes it tempting to anwser "please
install Mail::Address" to users instead of fixing our own code. We've
reached the stage where bugs in our parser should be fixed, not worked
around.
Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
---
git-send-email.perl | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index 33a69ffe5d..2208dcc213 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -155,7 +155,6 @@ sub format_2822_time {
}
my $have_email_valid = eval { require Email::Valid; 1 };
-my $have_mail_address = eval { require Mail::Address; 1 };
my $smtp;
my $auth;
my $num_sent = 0;
@@ -490,11 +489,7 @@ my ($repoauthor, $repocommitter);
($repocommitter) = Git::ident_person(@repo, 'committer');
sub parse_address_line {
- if ($have_mail_address) {
- return map { $_->format } Mail::Address->parse($_[0]);
- } else {
- return Git::parse_mailboxes($_[0]);
- }
+ return Git::parse_mailboxes($_[0]);
}
sub split_addrs {
--
2.14.0.rc0.dirty
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 1/2] send-email: fix garbage removal after address
2017-08-23 10:21 ` [RFC PATCH 1/2] send-email: fix garbage removal after address Matthieu Moy
2017-08-23 10:21 ` [RFC PATCH 2/2] send-email: don't use Mail::Address, even if available Matthieu Moy
@ 2017-08-23 21:59 ` Jacob Keller
2017-08-24 20:32 ` Junio C Hamano
2 siblings, 0 replies; 13+ messages in thread
From: Jacob Keller @ 2017-08-23 21:59 UTC (permalink / raw)
To: Matthieu Moy; +Cc: Git mailing list
On Wed, Aug 23, 2017 at 3:21 AM, Matthieu Moy <git@matthieu-moy.fr> wrote:
> This is a followup over 9d33439 (send-email: only allow one address
> per body tag, 2017-02-20). The first iteration did allow writting
>
> Cc: <foo@example.com> # garbage
>
> but did so by matching the regex ([^>]*>?), i.e. stop after the first
> instance of '>'. However, it did not properly deal with
>
> Cc: foo@example.com # garbage
>
> Fix this using a new function strip_garbage_one_address, which does
> essentially what the old ([^>]*>?) was doing, but dealing with more
> corner-cases. Since we've allowed
>
> Cc: "Foo # Bar" <foobar@example.com>
>
> in previous versions, it makes sense to continue allowing it (but we
> still remove any garbage after it). OTOH, when an address is given
> without quoting, we just take the first word and ignore everything
> after.
>
> Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
> ---
I pulled this and tested it for my issue, and it fixes the problem for
me. I think the approach in the code was solid too, extracting out the
logic helps make the code more clear.
Thanks,
Jake
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 1/2] send-email: fix garbage removal after address
2017-08-23 10:21 ` [RFC PATCH 1/2] send-email: fix garbage removal after address Matthieu Moy
2017-08-23 10:21 ` [RFC PATCH 2/2] send-email: don't use Mail::Address, even if available Matthieu Moy
2017-08-23 21:59 ` [RFC PATCH 1/2] send-email: fix garbage removal after address Jacob Keller
@ 2017-08-24 20:32 ` Junio C Hamano
2017-08-25 9:11 ` Matthieu Moy
2 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2017-08-24 20:32 UTC (permalink / raw)
To: Matthieu Moy; +Cc: git
Matthieu Moy <git@matthieu-moy.fr> writes:
> This is a followup over 9d33439 (send-email: only allow one address
> per body tag, 2017-02-20). The first iteration did allow writting
>
> Cc: <foo@example.com> # garbage
>
> but did so by matching the regex ([^>]*>?), i.e. stop after the first
> instance of '>'. However, it did not properly deal with
>
> Cc: foo@example.com # garbage
>
> Fix this using a new function strip_garbage_one_address, which does
> essentially what the old ([^>]*>?) was doing, but dealing with more
> corner-cases. Since we've allowed
>
> Cc: "Foo # Bar" <foobar@example.com>
>
> in previous versions, it makes sense to continue allowing it (but we
> still remove any garbage after it). OTOH, when an address is given
> without quoting, we just take the first word and ignore everything
> after.
>
> Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
> ---
> Also available as: https://github.com/git/git/pull/398
>
> git-send-email.perl | 26 ++++++++++++++++++++++++--
> t/t9001-send-email.sh | 4 ++++
> 2 files changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index fa6526986e..33a69ffe5d 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -1089,6 +1089,26 @@ sub sanitize_address {
>
> }
>
> +sub strip_garbage_one_address {
> + my ($addr) = @_;
> + chomp $addr;
> + if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
> + # "Foo Bar" <foobar@example.com> [possibly garbage here]
> + # Foo Bar <foobar@example.com> [possibly garbage here]
> + return $1;
> + }
> + if ($addr =~ /^(<[^>]*>).*/) {
> + # <foo@example.com> [possibly garbage here]
> + # if garbage contains other addresses, they are ignored.
> + return $1;
> + }
Isn't this already covered by the first one, which allows an
optional "something", followed by an optional run of SPs, in front
of this exact pattern, so the case where the optional "something"
does not appear and the number of optional SP is zero would exactly
match the one this pattern is meant to cover.
> + if ($addr =~ /^([^"#,\s]*)/) {
> + # address without quoting: remove anything after the address
> + return $1;
> + }
> + return $addr;
> +}
By the way, these three regexps smell like they were written
specifically to cover three cases you care about (perhaps the ones
in your proposed log message), but what will be our response when
somebody else comes next time to us and says that their favourite
formatting of "Cc:" line is not covered by these rules?
Will we add yet another pattern? Where will it end? There will be
a point where we instead start telling them to update the convention
of their project so that it will be covered by one of the patterns
we have already developed, I would imagine.
So, from that point of view, I, with devil's advocate hat on, wonder
why we are not saying
"Cc: s@k.org # cruft"? Use "Cc: <s@k.org> # cruft" instead
and you'd be fine.
right now, without this patch.
I do not _mind_ us trying to be extra nice for a while, and I
certainly do not mind _this_ particular patch that gives us a single
helper function that future "here is another way to spell cruft"
rules can go, but I feel that there should be some line that lets us
say that we've done enough.
Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 1/2] send-email: fix garbage removal after address
2017-08-24 20:32 ` Junio C Hamano
@ 2017-08-25 9:11 ` Matthieu Moy
2017-08-25 9:11 ` [PATCH v2 " Matthieu Moy
0 siblings, 1 reply; 13+ messages in thread
From: Matthieu Moy @ 2017-08-25 9:11 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Junio C Hamano <gitster@pobox.com> writes:
> Matthieu Moy <git@matthieu-moy.fr> writes:
>
>> +sub strip_garbage_one_address {
>> + my ($addr) = @_;
>> + chomp $addr;
>> + if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
>> + # "Foo Bar" <foobar@example.com> [possibly garbage here]
>> + # Foo Bar <foobar@example.com> [possibly garbage here]
>> + return $1;
>> + }
>> + if ($addr =~ /^(<[^>]*>).*/) {
>> + # <foo@example.com> [possibly garbage here]
>> + # if garbage contains other addresses, they are ignored.
>> + return $1;
>> + }
>
> Isn't this already covered by the first one,
Oops, indeed. I just removed the second "if" (and added the appropriate
comment to the first):
+ if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
+ # "Foo Bar" <foobar@example.com> [possibly garbage here]
+ # Foo Bar <foobar@example.com> [possibly garbage here]
+ # <foo@example.com> [possibly garbage here]
+ return $1;
+ }
> By the way, these three regexps smell like they were written
> specifically to cover three cases you care about (perhaps the ones
> in your proposed log message), but what will be our response when
> somebody else comes next time to us and says that their favourite
> formatting of "Cc:" line is not covered by these rules?
Well, actually the last one covers essentially everything. Just stop at
the first space, #, ',' or '"'. The first case is here to allow putting
a name in front of the address, which is something we've already allowed
and sounds reasonable from the user point of view.
OTOH, I didn't bother with real corner-cases like
Cc: "Foo \"bar\"" <foobar@example.com>
> So, from that point of view, I, with devil's advocate hat on, wonder
> why we are not saying
>
> "Cc: s@k.org # cruft"? Use "Cc: <s@k.org> # cruft" instead
> and you'd be fine.
>
> right now, without this patch.
I would agree if the broken case were an exotic one. But a plain adress
is really the simplest use-case I can think of, so it's hard to say
"don't do that" when we should say "sorry, we should obviously have
thought about this use-case".
--
Matthieu Moy
https://matthieu-moy.fr/
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 1/2] send-email: fix garbage removal after address
2017-08-25 9:11 ` Matthieu Moy
@ 2017-08-25 9:11 ` Matthieu Moy
2017-08-25 9:12 ` [PATCH v2 2/2] send-email: don't use Mail::Address, even if available Matthieu Moy
0 siblings, 1 reply; 13+ messages in thread
From: Matthieu Moy @ 2017-08-25 9:11 UTC (permalink / raw)
To: gitster; +Cc: git, Matthieu Moy
This is a followup over 9d33439 (send-email: only allow one address
per body tag, 2017-02-20). The first iteration did allow writting
Cc: <foo@example.com> # garbage
but did so by matching the regex ([^>]*>?), i.e. stop after the first
instance of '>'. However, it did not properly deal with
Cc: foo@example.com # garbage
Fix this using a new function strip_garbage_one_address, which does
essentially what the old ([^>]*>?) was doing, but dealing with more
corner-cases. Since we've allowed
Cc: "Foo # Bar" <foobar@example.com>
in previous versions, it makes sense to continue allowing it (but we
still remove any garbage after it). OTOH, when an address is given
without quoting, we just take the first word and ignore everything
after.
Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
---
Change since v1: removed dead code as suggested by Junio.
git-send-email.perl | 22 ++++++++++++++++++++--
t/t9001-send-email.sh | 4 ++++
2 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index fa6526986e..dfd646ac5b 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1089,6 +1089,22 @@ sub sanitize_address {
}
+sub strip_garbage_one_address {
+ my ($addr) = @_;
+ chomp $addr;
+ if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
+ # "Foo Bar" <foobar@example.com> [possibly garbage here]
+ # Foo Bar <foobar@example.com> [possibly garbage here]
+ # <foo@example.com> [possibly garbage here]
+ return $1;
+ }
+ if ($addr =~ /^([^"#,\s]*)/) {
+ # address without quoting: remove anything after the address
+ return $1;
+ }
+ return $addr;
+}
+
sub sanitize_address_list {
return (map { sanitize_address($_) } @_);
}
@@ -1590,10 +1606,12 @@ foreach my $t (@files) {
# Now parse the message body
while(<$fh>) {
$message .= $_;
- if (/^(Signed-off-by|Cc): ([^>]*>?)/i) {
+ if (/^(Signed-off-by|Cc): (.*)/i) {
chomp;
my ($what, $c) = ($1, $2);
- chomp $c;
+ # strip garbage for the address we'll use:
+ $c = strip_garbage_one_address($c);
+ # sanitize a bit more to decide whether to suppress the address:
my $sc = sanitize_address($c);
if ($sc eq $sender) {
next if ($suppress_cc{'self'});
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index d1e4e8ad19..f30980895c 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -148,6 +148,8 @@ cat >expected-cc <<\EOF
!two@example.com!
!three@example.com!
!four@example.com!
+!five@example.com!
+!six@example.com!
EOF
"
@@ -161,6 +163,8 @@ test_expect_success $PREREQ 'cc trailer with various syntax' '
Cc: <two@example.com> # trailing comments are ignored
Cc: <three@example.com>, <not.four@example.com> one address per line
Cc: "Some # Body" <four@example.com> [ <also.a.comment> ]
+ Cc: five@example.com # not.six@example.com
+ Cc: six@example.com, not.seven@example.com
EOF
clean_fake_sendmail &&
git send-email -1 --to=recipient@example.com \
--
2.14.0.rc0.dirty
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 2/2] send-email: don't use Mail::Address, even if available
2017-08-25 9:11 ` [PATCH v2 " Matthieu Moy
@ 2017-08-25 9:12 ` Matthieu Moy
0 siblings, 0 replies; 13+ messages in thread
From: Matthieu Moy @ 2017-08-25 9:12 UTC (permalink / raw)
To: gitster; +Cc: git, Matthieu Moy
Using Mail::Address made sense when we didn't have a proper parser. We
now have a reasonable address parser, and using Mail::Address
_if available_ causes much more trouble than it gives benefits:
* Developers typically test one version, not both.
* Users may not be aware that installing Mail::Address will change the
behavior. They may complain about the behavior in one case without
knowing that Mail::Address is involved.
* Having this optional Mail::Address makes it tempting to anwser "please
install Mail::Address" to users instead of fixing our own code. We've
reached the stage where bugs in our parser should be fixed, not worked
around.
Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
---
git-send-email.perl | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index dfd646ac5b..0061dbfab9 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -155,7 +155,6 @@ sub format_2822_time {
}
my $have_email_valid = eval { require Email::Valid; 1 };
-my $have_mail_address = eval { require Mail::Address; 1 };
my $smtp;
my $auth;
my $num_sent = 0;
@@ -490,11 +489,7 @@ my ($repoauthor, $repocommitter);
($repocommitter) = Git::ident_person(@repo, 'committer');
sub parse_address_line {
- if ($have_mail_address) {
- return map { $_->format } Mail::Address->parse($_[0]);
- } else {
- return Git::parse_mailboxes($_[0]);
- }
+ return Git::parse_mailboxes($_[0]);
}
sub split_addrs {
--
2.14.0.rc0.dirty
^ permalink raw reply related [flat|nested] 13+ messages in thread