* [PATCH] Cleanup git-send-email.perl:extract_valid_email
@ 2006-06-03 17:11 Horst H. von Brand
2006-06-03 22:49 ` Eric Wong
0 siblings, 1 reply; 11+ messages in thread
From: Horst H. von Brand @ 2006-06-03 17:11 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Horst H. von Brand
- Fix the regular expressions for local addresses
- Fix the fallback regexp for non-local addresses, simplify the logic
Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>
---
git-send-email.perl | 9 +++------
1 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index ed1d89b..a7a7797 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -314,18 +314,15 @@ sub extract_valid_address {
my $address = shift;
# check for a local address:
- return $address if ($address =~ /^([\w\-]+)$/);
+ return $address if ($address =~ /^([\w\-.]+)$/);
if ($have_email_valid) {
return Email::Valid->address($address);
} else {
# less robust/correct than the monster regexp in Email::Valid,
# but still does a 99% job, and one less dependency
- my $cleaned_address;
- if ($address =~ /([^\"<>\s]+@[^<>\s]+)/) {
- $cleaned_address = $1;
- }
- return $cleaned_address;
+ $address =~ /([\w\-.]+@[\w\-.]+)/;
+ return $1;
}
}
--
1.3.3.g86f7
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] Cleanup git-send-email.perl:extract_valid_email
2006-06-03 17:11 [PATCH] Cleanup git-send-email.perl:extract_valid_email Horst H. von Brand
@ 2006-06-03 22:49 ` Eric Wong
2006-06-04 0:10 ` Horst von Brand
0 siblings, 1 reply; 11+ messages in thread
From: Eric Wong @ 2006-06-03 22:49 UTC (permalink / raw)
To: Horst H. von Brand; +Cc: git, Junio C Hamano, Ryan Anderson
"Horst H. von Brand" <vonbrand@inf.utfsm.cl> wrote:
> - Fix the regular expressions for local addresses
> - Fix the fallback regexp for non-local addresses, simplify the logic
>
> Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>
> ---
> git-send-email.perl | 9 +++------
> 1 files changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index ed1d89b..a7a7797 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -314,18 +314,15 @@ sub extract_valid_address {
> my $address = shift;
>
> # check for a local address:
> - return $address if ($address =~ /^([\w\-]+)$/);
> + return $address if ($address =~ /^([\w\-.]+)$/);
I keep forgetting this, '+' is a valid (and useful) setup, too.
>
> if ($have_email_valid) {
> return Email::Valid->address($address);
> } else {
> # less robust/correct than the monster regexp in Email::Valid,
> # but still does a 99% job, and one less dependency
> - my $cleaned_address;
> - if ($address =~ /([^\"<>\s]+@[^<>\s]+)/) {
> - $cleaned_address = $1;
> - }
> - return $cleaned_address;
> + $address =~ /([\w\-.]+@[\w\-.]+)/;
> + return $1;
Actually, I'm retracting my earlier ack on this. This is way too
restrictive. I'd rather allow an occasional invalid email address than
to reject valid ones. I generally trust git users to know what they're
doing when entering email addresses[1].
*, $, ^, +, = are all valid characters in the username portion (not sure
about local accounts, though), and I'm sure there are more that I don't
know about.
[1] - of course, without address book support in git-send-email, I think
I would've left the 'k' out of Junio's email address by now. Of course
there's also the issue of where I work and having several people using
variations of rob/robert in their email address. I'm likely to make
errors like these far more than entering addresses that email systems
consider invalid, and I suspect I'm not the only one.
--
Eric Wong
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Cleanup git-send-email.perl:extract_valid_email
2006-06-03 22:49 ` Eric Wong
@ 2006-06-04 0:10 ` Horst von Brand
2006-06-06 6:59 ` Junio C Hamano
0 siblings, 1 reply; 11+ messages in thread
From: Horst von Brand @ 2006-06-04 0:10 UTC (permalink / raw)
To: Eric Wong; +Cc: Horst H. von Brand, git, Junio C Hamano, Ryan Anderson
Eric Wong <normalperson@yhbt.net> wrote:
> "Horst H. von Brand" <vonbrand@inf.utfsm.cl> wrote:
> > - Fix the regular expressions for local addresses
> > - Fix the fallback regexp for non-local addresses, simplify the logic
> >
> > Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>
> > ---
> > git-send-email.perl | 9 +++------
> > 1 files changed, 3 insertions(+), 6 deletions(-)
> >
> > diff --git a/git-send-email.perl b/git-send-email.perl
> > index ed1d89b..a7a7797 100755
> > --- a/git-send-email.perl
> > +++ b/git-send-email.perl
> > @@ -314,18 +314,15 @@ sub extract_valid_address {
> > my $address = shift;
> >
> > # check for a local address:
> > - return $address if ($address =~ /^([\w\-]+)$/);
> > + return $address if ($address =~ /^([\w\-.]+)$/);
>
> I keep forgetting this, '+' is a valid (and useful) setup, too.
Oops...
> > if ($have_email_valid) {
> > return Email::Valid->address($address);
> > } else {
> > # less robust/correct than the monster regexp in Email::Valid,
> > # but still does a 99% job, and one less dependency
> > - my $cleaned_address;
> > - if ($address =~ /([^\"<>\s]+@[^<>\s]+)/) {
> > - $cleaned_address = $1;
> > - }
> > - return $cleaned_address;
> > + $address =~ /([\w\-.]+@[\w\-.]+)/;
> > + return $1;
>
> Actually, I'm retracting my earlier ack on this. This is way too
> restrictive. I'd rather allow an occasional invalid email address than
> to reject valid ones. I generally trust git users to know what they're
> doing when entering email addresses[1].
>
> *, $, ^, +, = are all valid characters in the username portion (not sure
> about local accounts, though), and I'm sure there are more that I don't
> know about.
As a general principle, I prefer to check what is legal instead of trying
to filter out what isn't.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Cleanup git-send-email.perl:extract_valid_email
2006-06-04 0:10 ` Horst von Brand
@ 2006-06-06 6:59 ` Junio C Hamano
2006-06-06 15:42 ` Horst von Brand
0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2006-06-06 6:59 UTC (permalink / raw)
To: Horst von Brand, Ryan Anderson, Eric Wong; +Cc: git
Horst von Brand <vonbrand@inf.utfsm.cl> writes:
>> > # check for a local address:
>> > - return $address if ($address =~ /^([\w\-]+)$/);
>> > + return $address if ($address =~ /^([\w\-.]+)$/);
>>
>> I keep forgetting this, '+' is a valid (and useful) setup, too.
>
> Oops...
>>
>> Actually, I'm retracting my earlier ack on this. This is way too
>> restrictive. I'd rather allow an occasional invalid email address than
>> to reject valid ones. I generally trust git users to know what they're
>> doing when entering email addresses[1].
>>
>> *, $, ^, +, = are all valid characters in the username portion (not sure
>> about local accounts, though), and I'm sure there are more that I don't
>> know about.
>
> As a general principle, I prefer to check what is legal instead of trying
> to filter out what isn't.
If we start doing addr-spec in RFC2822 (page 17) ourselves, we
should rather be using Email::Valid. A permissive sanity check
to catch obvious mistakes would be more appropriate here than
being RFC police.
I think something like the attached, on top of your patch, would
be appropriate for upcoming 1.4.0.
-- >8 --
send-email: be more lenient and just catch obvious mistakes.
This cleans up the pattern matching subroutine by introducing
two variables to hold regexp to approximately match local-part
and domain in the e-mail address. It is meant to catch obvious
mistakes with a cheap check.
The patch also moves "scalar" to force Email::Valid->address()
to work in !wantarray environment to extract_valid_address;
earlier it was in the caller of the subroutine, which was way
too error prone.
---
diff --git a/git-send-email.perl b/git-send-email.perl
index a7a7797..700d0c3 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -312,16 +312,18 @@ our ($message_id, $cc, %mail, $subject,
sub extract_valid_address {
my $address = shift;
+ my $local_part_regexp = '[^<>"\s@]+';
+ my $domain_regexp = '[^.<>"\s@]+\.[^<>"\s@]+';
# check for a local address:
- return $address if ($address =~ /^([\w\-.]+)$/);
+ return $address if ($address =~ /^($local_part_regexp)$/);
if ($have_email_valid) {
- return Email::Valid->address($address);
+ return scalar Email::Valid->address($address);
} else {
# less robust/correct than the monster regexp in Email::Valid,
# but still does a 99% job, and one less dependency
- $address =~ /([\w\-.]+@[\w\-.]+)/;
+ $address =~ /($local_part_regexp\@$domain_regexp)/;
return $1;
}
}
@@ -384,7 +386,7 @@ X-Mailer: git-send-email $gitversion
defined $pid or die $!;
if (!$pid) {
exec($smtp_server,'-i',
- map { scalar extract_valid_address($_) }
+ map { extract_valid_address($_) }
@recipients) or die $!;
}
print $sm "$header\n$message";
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] Cleanup git-send-email.perl:extract_valid_email
2006-06-06 6:59 ` Junio C Hamano
@ 2006-06-06 15:42 ` Horst von Brand
2006-06-06 15:54 ` Junio C Hamano
0 siblings, 1 reply; 11+ messages in thread
From: Horst von Brand @ 2006-06-06 15:42 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Horst von Brand, Ryan Anderson, Eric Wong, git
Junio C Hamano <junkio@cox.net> wrote:
> Horst von Brand <vonbrand@inf.utfsm.cl> writes:
>
> >> > # check for a local address:
> >> > - return $address if ($address =~ /^([\w\-]+)$/);
> >> > + return $address if ($address =~ /^([\w\-.]+)$/);
> >>
> >> I keep forgetting this, '+' is a valid (and useful) setup, too.
> >
> > Oops...
> >>
> >> Actually, I'm retracting my earlier ack on this. This is way too
> >> restrictive. I'd rather allow an occasional invalid email address than
> >> to reject valid ones. I generally trust git users to know what they're
> >> doing when entering email addresses[1].
> >>
> >> *, $, ^, +, = are all valid characters in the username portion (not sure
> >> about local accounts, though), and I'm sure there are more that I don't
> >> know about.
> >
> > As a general principle, I prefer to check what is legal instead of trying
> > to filter out what isn't.
> If we start doing addr-spec in RFC2822 (page 17) ourselves, we
> should rather be using Email::Valid. A permissive sanity check
> to catch obvious mistakes would be more appropriate here than
> being RFC police.
OK.
> I think something like the attached, on top of your patch, would
> be appropriate for upcoming 1.4.0.
>
> -- >8 --
> send-email: be more lenient and just catch obvious mistakes.
>
> This cleans up the pattern matching subroutine by introducing
> two variables to hold regexp to approximately match local-part
> and domain in the e-mail address. It is meant to catch obvious
> mistakes with a cheap check.
>
> The patch also moves "scalar" to force Email::Valid->address()
> to work in !wantarray environment to extract_valid_address;
> earlier it was in the caller of the subroutine, which was way
> too error prone.
Right.
> ---
> diff --git a/git-send-email.perl b/git-send-email.perl
> index a7a7797..700d0c3 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -312,16 +312,18 @@ our ($message_id, $cc, %mail, $subject,
>
> sub extract_valid_address {
> my $address = shift;
> + my $local_part_regexp = '[^<>"\s@]+';
> + my $domain_regexp = '[^.<>"\s@]+\.[^<>"\s@]+';
This forces a '.' in the domain, while vonbrand@localhost is perfectly
reasonable. Plus it doesn't disallow adyacent '.'s. What about:
my $domain_regexp = '[^.<>"\s@]+(\.[^<>"\s@]+)*';
(but this is probably nitpicking...)
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Cleanup git-send-email.perl:extract_valid_email
2006-06-06 15:42 ` Horst von Brand
@ 2006-06-06 15:54 ` Junio C Hamano
2006-06-06 16:05 ` Horst von Brand
0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2006-06-06 15:54 UTC (permalink / raw)
To: Horst von Brand; +Cc: git
Horst von Brand <vonbrand@inf.utfsm.cl> writes:
>> diff --git a/git-send-email.perl b/git-send-email.perl
>> index a7a7797..700d0c3 100755
>> --- a/git-send-email.perl
>> +++ b/git-send-email.perl
>> @@ -312,16 +312,18 @@ our ($message_id, $cc, %mail, $subject,
>>
>> sub extract_valid_address {
>> my $address = shift;
>> + my $local_part_regexp = '[^<>"\s@]+';
>> + my $domain_regexp = '[^.<>"\s@]+\.[^<>"\s@]+';
>
> This forces a '.' in the domain, while vonbrand@localhost is perfectly
> reasonable. Plus it doesn't disallow adyacent '.'s. What about:
>
> my $domain_regexp = '[^.<>"\s@]+(\.[^<>"\s@]+)*';
>
> (but this is probably nitpicking...)
I do not have preference either way about allowing an address
like tld-administrator@net myself, but Email::Valid->address
does not seem to allow it, and I just copied that behaviour for
consistency between two alternative implementations.
I think you meant to say:
> my $domain_regexp = '[^.<>"\s@]+(\.[^.<>"\s@]+)*';
(i.e. exclude dot from the latter character class), but I am
inclined to do this instead:
my $domain_regexp = '[^.<>"\s@]+(?:\.[^.<>"\s@]+)+';
(i.e. still require at least two levels).
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Cleanup git-send-email.perl:extract_valid_email
2006-06-06 15:54 ` Junio C Hamano
@ 2006-06-06 16:05 ` Horst von Brand
2006-06-06 16:58 ` Junio C Hamano
0 siblings, 1 reply; 11+ messages in thread
From: Horst von Brand @ 2006-06-06 16:05 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Horst von Brand, git
Junio C Hamano <junkio@cox.net> wrote:
> Horst von Brand <vonbrand@inf.utfsm.cl> writes:
>
> >> diff --git a/git-send-email.perl b/git-send-email.perl
> >> index a7a7797..700d0c3 100755
> >> --- a/git-send-email.perl
> >> +++ b/git-send-email.perl
> >> @@ -312,16 +312,18 @@ our ($message_id, $cc, %mail, $subject,
> >>
> >> sub extract_valid_address {
> >> my $address = shift;
> >> + my $local_part_regexp = '[^<>"\s@]+';
> >> + my $domain_regexp = '[^.<>"\s@]+\.[^<>"\s@]+';
> >
> > This forces a '.' in the domain, while vonbrand@localhost is perfectly
> > reasonable. Plus it doesn't disallow adyacent '.'s. What about:
> >
> > my $domain_regexp = '[^.<>"\s@]+(\.[^<>"\s@]+)*';
> >
> > (but this is probably nitpicking...)
>
> I do not have preference either way about allowing an address
> like tld-administrator@net myself, but Email::Valid->address
> does not seem to allow it, and I just copied that behaviour for
> consistency between two alternative implementations.
Reasonable.
> I think you meant to say:
>
> > my $domain_regexp = '[^.<>"\s@]+(\.[^.<>"\s@]+)*';
>
> (i.e. exclude dot from the latter character class),
Right, my bad.
> but I am
> inclined to do this instead:
>
> my $domain_regexp = '[^.<>"\s@]+(?:\.[^.<>"\s@]+)+';
>
> (i.e. still require at least two levels).
OK, but be careful as this (?:...) is an extended regexp (needs /x on
match). I'd just leave it plain (the performance impact shouldn't be
noticeable). I don't see any use except for $1, so the extra parenthesis
should be safe.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Cleanup git-send-email.perl:extract_valid_email
2006-06-06 16:05 ` Horst von Brand
@ 2006-06-06 16:58 ` Junio C Hamano
2006-06-06 21:24 ` Horst von Brand
0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2006-06-06 16:58 UTC (permalink / raw)
To: Horst von Brand; +Cc: git
Horst von Brand <vonbrand@inf.utfsm.cl> writes:
> Junio C Hamano <junkio@cox.net> wrote:
>> but I am
>> inclined to do this instead:
>>
>> my $domain_regexp = '[^.<>"\s@]+(?:\.[^.<>"\s@]+)+';
>>
>> (i.e. still require at least two levels).
>
> OK, but be careful as this (?:...) is an extended regexp (needs /x on
> match).
Are you sure about /x?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Cleanup git-send-email.perl:extract_valid_email
2006-06-06 16:58 ` Junio C Hamano
@ 2006-06-06 21:24 ` Horst von Brand
2006-06-06 21:39 ` Junio C Hamano
0 siblings, 1 reply; 11+ messages in thread
From: Horst von Brand @ 2006-06-06 21:24 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Horst von Brand, git
Junio C Hamano <junkio@cox.net> wrote:
> Horst von Brand <vonbrand@inf.utfsm.cl> writes:
> > Junio C Hamano <junkio@cox.net> wrote:
> >> but I am
> >> inclined to do this instead:
> >>
> >> my $domain_regexp = '[^.<>"\s@]+(?:\.[^.<>"\s@]+)+';
> >>
> >> (i.e. still require at least two levels).
> >
> > OK, but be careful as this (?:...) is an extended regexp (needs /x on
> > match).
> Are you sure about /x?
The manual (perlop(1)) says you need /x to match extended regexps, and
(?...) is the marker for such (perlre(1)). But my perl here (5.5.8-6 on
Fedora rawhide) doesn't care...
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Cleanup git-send-email.perl:extract_valid_email
2006-06-06 21:24 ` Horst von Brand
@ 2006-06-06 21:39 ` Junio C Hamano
2006-06-06 22:48 ` Horst von Brand
0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2006-06-06 21:39 UTC (permalink / raw)
To: Horst von Brand; +Cc: git
Horst von Brand <vonbrand@inf.utfsm.cl> writes:
>> > OK, but be careful as this (?:...) is an extended regexp (needs /x on
>> > match).
>
>> Are you sure about /x?
>
> The manual (perlop(1)) says you need /x to match extended regexps, and
> (?...) is the marker for such (perlre(1)).
I always had the impression that eXtended in the context to talk
about /x was about ignoring whitespaces and forcing people to
write \s (or perhaps \040) when they mean a whitespace and had
nothing to do with (?...) stuff. Let me look up the fine
manual.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Cleanup git-send-email.perl:extract_valid_email
2006-06-06 21:39 ` Junio C Hamano
@ 2006-06-06 22:48 ` Horst von Brand
0 siblings, 0 replies; 11+ messages in thread
From: Horst von Brand @ 2006-06-06 22:48 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Horst von Brand, git
Junio C Hamano <junkio@cox.net> wrote:
> Horst von Brand <vonbrand@inf.utfsm.cl> writes:
> >> > OK, but be careful as this (?:...) is an extended regexp (needs /x on
> >> > match).
> >
> >> Are you sure about /x?
> >
> > The manual (perlop(1)) says you need /x to match extended regexps, and
> > (?...) is the marker for such (perlre(1)).
> I always had the impression that eXtended in the context to talk
> about /x was about ignoring whitespaces and forcing people to
> write \s (or perhaps \040) when they mean a whitespace and had
> nothing to do with (?...) stuff. Let me look up the fine
> manual.
You might be right... and it even sounds sensible; but both (?...) stuff
and the ignoring of space is described as extended here.
Note that \s is a space character (' ', '\t', ...), which is not the same
as \040 (and that one assumes ASCII...).
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2006-06-06 22:48 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-06-03 17:11 [PATCH] Cleanup git-send-email.perl:extract_valid_email Horst H. von Brand
2006-06-03 22:49 ` Eric Wong
2006-06-04 0:10 ` Horst von Brand
2006-06-06 6:59 ` Junio C Hamano
2006-06-06 15:42 ` Horst von Brand
2006-06-06 15:54 ` Junio C Hamano
2006-06-06 16:05 ` Horst von Brand
2006-06-06 16:58 ` Junio C Hamano
2006-06-06 21:24 ` Horst von Brand
2006-06-06 21:39 ` Junio C Hamano
2006-06-06 22:48 ` Horst von Brand
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).