* [PATCH] send-email: Fix Pine address book parsing
@ 2008-11-26 2:55 Trent Piepho
2008-11-26 4:58 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Trent Piepho @ 2008-11-26 2:55 UTC (permalink / raw)
To: git; +Cc: Trent Piepho, gitster
See: http://www.washington.edu/pine/tech-notes/low-level.html
Entries with a fcc or comment field after the address weren't parsed
correctly.
Continuation lines, identified by leading spaces, were also not handled.
Distribution lists which had ( ) around a list of addresses did not have
the parenthesis removed.
Signed-off-by: Trent Piepho <tpiepho@freescale.com>
---
git-send-email.perl | 9 ++++++---
1 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index 94ca5c8..007e2c6 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -345,10 +345,13 @@ my %parse_alias = (
# spaces delimit multiple addresses
$aliases{$1} = [ split(/\s+/, $2) ];
}}},
- pine => sub { my $fh = shift; while (<$fh>) {
- if (/^(\S+)\t.*\t(.*)$/) {
+ pine => sub { my $fh = shift; my $f='\t[^\t]*';
+ for (my $x = ''; defined($x); $x = $_) {
+ chomp $x;
+ $x .= $1 while(defined($_ = <$fh>) && /^ +(.*)$/);
+ $x =~ /^(\S+)$f\t\(?([^\t]+?)\)?(:?$f){0,2}$/ or next;
$aliases{$1} = [ split(/\s*,\s*/, $2) ];
- }}},
+ }},
gnus => sub { my $fh = shift; while (<$fh>) {
if (/\(define-mail-alias\s+"(\S+?)"\s+"(\S+?)"\)/) {
$aliases{$1} = [ $2 ];
--
1.5.4.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] send-email: Fix Pine address book parsing
2008-11-26 2:55 [PATCH] send-email: Fix Pine address book parsing Trent Piepho
@ 2008-11-26 4:58 ` Junio C Hamano
2008-11-26 5:59 ` Trent Piepho
0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2008-11-26 4:58 UTC (permalink / raw)
To: Trent Piepho; +Cc: git, gitster
Trent Piepho <tpiepho@freescale.com> writes:
> See: http://www.washington.edu/pine/tech-notes/low-level.html
>
> Entries with a fcc or comment field after the address weren't parsed
> correctly.
>
> Continuation lines, identified by leading spaces, were also not handled.
>
> Distribution lists which had ( ) around a list of addresses did not have
> the parenthesis removed.
> + pine => sub { my $fh = shift; my $f='\t[^\t]*';
> + for (my $x = ''; defined($x); $x = $_) {
> + chomp $x;
> + $x .= $1 while(defined($_ = <$fh>) && /^ +(.*)$/);
> + $x =~ /^(\S+)$f\t\(?([^\t]+?)\)?(:?$f){0,2}$/ or next;
Hmm, so you chomp each continuation line with /^ +(.*)$/ and concatenate
that to the hold buffer ($x) as long as you see continuation lines,
a non-continuation line that you read ahead is given to the next round
(the third part of for(;;) control), checked if you hit an EOF and then
chomped. Which means the complicated regexp about the parentheses is
applied to a logical single line in $x that does not have any newline in
it, right?
I wonder what this does:
$x .= $1 while (defined($_ = <$fh>) && /^ +(.*)$/);
when you have "a b" in $x and feed " c\n d\ne\n" to it. When it leaves
the loop, you would have "e\n" in $_ for the next round, and "a bcd" (note
that "bcd" becomes one word) in $x, which I suspect may not be what you
want.
But I do not use Pine nor its aliases file.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] send-email: Fix Pine address book parsing
2008-11-26 4:58 ` Junio C Hamano
@ 2008-11-26 5:59 ` Trent Piepho
2008-11-26 6:44 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Trent Piepho @ 2008-11-26 5:59 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Tue, 25 Nov 2008, Junio C Hamano wrote:
> Trent Piepho <tpiepho@freescale.com> writes:
>> See: http://www.washington.edu/pine/tech-notes/low-level.html
>>
>> Entries with a fcc or comment field after the address weren't parsed
>> correctly.
>>
>> Continuation lines, identified by leading spaces, were also not handled.
>>
>> Distribution lists which had ( ) around a list of addresses did not have
>> the parenthesis removed.
>
>> + pine => sub { my $fh = shift; my $f='\t[^\t]*';
>> + for (my $x = ''; defined($x); $x = $_) {
>> + chomp $x;
>> + $x .= $1 while(defined($_ = <$fh>) && /^ +(.*)$/);
>> + $x =~ /^(\S+)$f\t\(?([^\t]+?)\)?(:?$f){0,2}$/ or next;
>
> Hmm, so you chomp each continuation line with /^ +(.*)$/ and concatenate
> that to the hold buffer ($x) as long as you see continuation lines,
> a non-continuation line that you read ahead is given to the next round
> (the third part of for(;;) control), checked if you hit an EOF and then
> chomped. Which means the complicated regexp about the parentheses is
> applied to a logical single line in $x that does not have any newline in
> it, right?
Yes. The previous regex would just grab the email address with (\S+)$, but
that's not right. There can be email address with spaces in them, like
"John Doe <jdoe@anon.org>". And the email address isn't always the last
field. So each field has to be put in the regex and \S+ and \s* have to
become [^\t]* and \t to count fields properly. That's why the regex got so
complex.
> I wonder what this does:
>
> $x .= $1 while (defined($_ = <$fh>) && /^ +(.*)$/);
>
> when you have "a b" in $x and feed " c\n d\ne\n" to it. When it leaves
> the loop, you would have "e\n" in $_ for the next round, and "a bcd" (note
> that "bcd" becomes one word) in $x, which I suspect may not be what you
> want.
The tech docs I linked to just say pine continues lines with leading space,
but not how many spaces exactly. From what I can see it appears to usually
use three spaces, but sometimes it uses one space when wrapping a very long
comment field. It also appears to only split lines between whitespace and
non-whitespace. So if "a b c d\n" were to be wrapped, it would be something
like "a b \n c \n d\n". If I didn't eat the leading spaces in the
continuations, it would be re-assembled as "a b c d". This might cause
an address to become "John Doe <jdoe@anon.org>"
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] send-email: Fix Pine address book parsing
2008-11-26 5:59 ` Trent Piepho
@ 2008-11-26 6:44 ` Junio C Hamano
2008-11-26 11:13 ` Trent Piepho
0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2008-11-26 6:44 UTC (permalink / raw)
To: Trent Piepho; +Cc: git
Trent Piepho <tpiepho@freescale.com> writes:
> The tech docs I linked to just say pine continues lines with leading space,
> but not how many spaces exactly.
My reading of the wording "spaces" it uses is that any number. I agree it
is underspecified what would happen to them.
> It also appears to only split lines between whitespace and
> non-whitespace. ...
> ... like "a b \n c \n d\n". If I didn't eat the leading spaces in the
> continuations, it would be re-assembled as "a b c d". This might cause
> an address to become "John Doe <jdoe@anon.org>"
Which would still work. If you had two addresses a and b and smashed them
together into ab on the other hand it wouldn't. That is why I asked.
If you know for sure (e.g. by reading the Pine source) that it only splits
a line at a whitespace to non-whitespace transition, that it keeps the
whitespace at the end of the first line, and that the non-whitespace and
everything after that on the second line (prefixed by extra unspecified
number of spaces as the continuation sign), then I think what you had in
your patch is exactly what we want. I just wanted to make sure you know
what you are doing, as I do not use Pine nor its address book myself.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] send-email: Fix Pine address book parsing
2008-11-26 6:44 ` Junio C Hamano
@ 2008-11-26 11:13 ` Trent Piepho
2008-11-26 19:46 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Trent Piepho @ 2008-11-26 11:13 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Tue, 25 Nov 2008, Junio C Hamano wrote:
> Trent Piepho <tpiepho@freescale.com> writes:
>
>> The tech docs I linked to just say pine continues lines with leading space,
>> but not how many spaces exactly.
>
> My reading of the wording "spaces" it uses is that any number. I agree it
> is underspecified what would happen to them.
>
>> It also appears to only split lines between whitespace and
>> non-whitespace. ...
>> ... like "a b \n c \n d\n". If I didn't eat the leading spaces in the
>> continuations, it would be re-assembled as "a b c d". This might cause
>> an address to become "John Doe <jdoe@anon.org>"
>
> Which would still work. If you had two addresses a and b and smashed them
> together into ab on the other hand it wouldn't. That is why I asked.
Though if Pine decided to wrap something mid-word, e.g. "a@b.com" to
"a@b.\n com\n" then no eating the leading space would be wrong.
> If you know for sure (e.g. by reading the Pine source) that it only splits
> a line at a whitespace to non-whitespace transition, that it keeps the
> whitespace at the end of the first line, and that the non-whitespace and
> everything after that on the second line (prefixed by extra unspecified
> number of spaces as the continuation sign), then I think what you had in
> your patch is exactly what we want. I just wanted to make sure you know
> what you are doing, as I do not use Pine nor its address book myself.
I tried all kinds of extreme address book entries and it would never put
spaces at the beginning of a line unless they had been inserted to make a
continuation.
But just for you I downloaded the pine code, which is 20 years old and not
a lot of fun to read. There isn't any code that just does line wrapping.
The code that prints out a address book entry will at various points check
if it should output a newline. And it whatever it prints after those
points isn't allowed to start with space. The code that reads the entries
eats all the spaces before and after each field. If long comments are
wrapped, it will change "a b" into "a\n : b\n" and effectively does
s/^ +: / /; when reading it back. But email address don't get wrapped
that way so we don't need to care about the ':'.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] send-email: Fix Pine address book parsing
2008-11-26 11:13 ` Trent Piepho
@ 2008-11-26 19:46 ` Junio C Hamano
0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2008-11-26 19:46 UTC (permalink / raw)
To: Trent Piepho; +Cc: git
Trent Piepho <tpiepho@freescale.com> writes:
> ... There isn't any code that just does line wrapping.
> The code that prints out a address book entry will at various points check
> if it should output a newline. And it whatever it prints after those
> points isn't allowed to start with space. The code that reads the entries
> eats all the spaces before and after each field. If long comments are
> wrapped, it will change "a b" into "a\n : b\n" and effectively does
> s/^ +: / /; when reading it back. But email address don't get wrapped
> that way so we don't need to care about the ':'.
I never was worried about eating and losing the leading space that was
used only to signal the continuation line, but was wondering if we need to
insert an artificial space in place of the newline we are eating, but it
appears that we can say with confidence that worry is unfounded.
Will queue. Thanks.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2008-11-26 19:47 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-26 2:55 [PATCH] send-email: Fix Pine address book parsing Trent Piepho
2008-11-26 4:58 ` Junio C Hamano
2008-11-26 5:59 ` Trent Piepho
2008-11-26 6:44 ` Junio C Hamano
2008-11-26 11:13 ` Trent Piepho
2008-11-26 19:46 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).