From: Kirill Smelkov <kirr@landau.phys.spbu.ru>
To: Junio C Hamano <gitster@pobox.com>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>, git@vger.kernel.org
Subject: Re: John (zzz) Doe <john.doe@xz> (Comment)
Date: Tue, 20 Jan 2009 22:14:46 +0300
Message-ID: <20090120191446.GB5721@roro3.zxlink>
In-Reply-To: <7vmydoxxcr.fsf_-_@gitster.siamese.dyndns.org>

On Sun, Jan 18, 2009 at 10:50:12AM -0800, Junio C Hamano wrote:
> So we can separate "John (zzz) Doe <john.doe@xz> (Comment)" into:
> 
> 	AUTHOR_EMAIL=john.doe@xz
>         AUTHOR_NAME="John (zzz) Doe (Comment)"
> 
> and leave it like so, I think.

Ok, here you are:

Subject: [PATCH 1/3] mailinfo: cleanup extra spaces for complex 'From'

As described in RFC 822 (3.4.3 COMMENTS and A.1.4), comments, as in e.g.

    John (zzz) Doe <john.doe@xz> (Comment)

should "NOT [be] included in the destination mailbox"

On the other hand, quoting Junio:

> The above quote from the RFC is irrelevant.  Note that it is only about
> how you extract the e-mail address, discarding everything else.
>
> What mailinfo wants to do is to separate the human-readable name and the
> e-mail address, and we want to use _both_ results from it.
>
> We separate a few example From: lines like this:
>
> 	Kirill Smelkov <kirr@smelkov.xz>
> ==>	AUTHOR_EMAIL="kirr@smelkov.xz" AUTHOR_NAME="Kirill Smelkov"
>
> 	kirr@smelkov.xz (Kirill Smelkov)
> ==>	AUTHOR_EMAIL="kirr@smelkov.xz" AUTHOR_NAME="Kirill Smelkov"
>
> Traditionally, the way people spelled their name on From: line has been
> either one of the above form.  Typically comment form (i.e. the second
> one) adds the name at the end, while "Name <addr>" form has the name at
> the front.  But I do not think RFC requires that, primarily because it is
> all about discarding non-address part to find the e-mail address aka
> "destination mailbox".  It does not specify how humans should interpret
> the human readable name and the comment.
>
> Now, why is the name not AUTHOR_NAME="(Kirill Smelkov)" in the latter
> form?
>
> It is just common sense transformation.  Otherwise it looks simply ugly,
> and it is obvious that the parentheses is not part of the name of the
> person who used "kirr@smelkov.xz (Kirill Smelkov)" on his From: line.
>
> So we can separate "John (zzz) Doe <john.doe@xz> (Comment)" into:
>
> 	AUTHOR_EMAIL=john.doe@xz
>         AUTHOR_NAME="John (zzz) Doe (Comment)"
>
> and leave it like so, I think.

So let's just correctly remove the extra spaces that could be left inside
the name.

We need this functionality to pass all RFC2047 based tests in the next commit.

Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
---
 builtin-mailinfo.c  |   21 +++++++++++++++++----
 t/t5100/info0001    |    2 +-
 t/t5100/sample.mbox |    4 ++--
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index dacc8ac..8030823 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -29,6 +29,9 @@ static struct strbuf **p_hdr_data, **s_hdr_data;
 #define MAX_HDR_PARSED 10
 #define MAX_BOUNDARIES 5
 
+static void cleanup_space(struct strbuf *sb);
+
+
 static void get_sane_name(struct strbuf *out, struct strbuf *name, struct strbuf *email)
 {
 	struct strbuf *src = name;
@@ -109,10 +112,14 @@ static void handle_from(const struct strbuf *from)
 	strbuf_add(&email, at, el);
 	strbuf_remove(&f, at - f.buf, el + (at[el] ? 1 : 0));
 
-	/* The remainder is name.  It could be "John Doe <john.doe@xz>"
-	 * or "john.doe@xz (John Doe)", but we have removed the
-	 * email part, so trim from both ends, possibly removing
-	 * the () pair at the end.
+	/* The remainder is name.  It could be
+	 *
+	 * - "John Doe <john.doe@xz>"			(a), or
+	 * - "john.doe@xz (John Doe)"			(b), or
+	 * - "John (zzz) Doe <john.doe@xz> (Comment)"	(c)
+	 *
+	 * but we have removed the email part, so trim from both ends, possibly
+	 * removing the () pair at the end for case 'b'.
 	 */
 	strbuf_trim(&f);
 	if (f.buf[0] == '(' && f.len && f.buf[f.len - 1] == ')') {
@@ -120,6 +127,12 @@ static void handle_from(const struct strbuf *from)
 		strbuf_setlen(&f, f.len - 1);
 	}
 
+	/* Otherwise we want comments to stay. It's just time to cleanup extra
+	 * spaces
+	 */
+	cleanup_space(&f);
+	strbuf_trim(&f);
+
 	get_sane_name(&name, &f, &email);
 	strbuf_release(&f);
 }
diff --git a/t/t5100/info0001 b/t/t5100/info0001
index 8c05277..f951538 100644
--- a/t/t5100/info0001
+++ b/t/t5100/info0001
@@ -1,4 +1,4 @@
-Author: A U Thor
+Author: A (zzz) U Thor (Comment)
 Email: a.u.thor@example.com
 Subject: a commit.
 Date: Fri, 9 Jun 2006 00:44:16 -0700
diff --git a/t/t5100/sample.mbox b/t/t5100/sample.mbox
index 38725f3..4f80b82 100644
--- a/t/t5100/sample.mbox
+++ b/t/t5100/sample.mbox
@@ -2,10 +2,10 @@
 	
     
 From nobody Mon Sep 17 00:00:00 2001
-From: A
+From: A (zzz)
       U
       Thor
-      <a.u.thor@example.com>
+      <a.u.thor@example.com> (Comment)
 Date: Fri, 9 Jun 2006 00:44:16 -0700
 Subject: [PATCH] a commit.
 
-- 
1.6.1.79.g92b9.dirty
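
To illustrate the intended result outside of git, here is a minimal
standalone sketch of the split described in the commit message above. It
is not the code from builtin-mailinfo.c: it works on plain char buffers
instead of strbuf, and both split_from() and collapse_spaces() are
hypothetical names made up for this example. It assumes the first '@' in
the line belongs to the address, which is good enough for forms (a), (b)
and (c).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not git's cleanup_space()): collapse every run of
 * whitespace in s into one ' ' and drop leading/trailing whitespace,
 * in place.
 */
static void collapse_spaces(char *s)
{
	char *in = s, *out = s;
	int pending = 0;	/* whitespace seen after some output */

	for (; *in; in++) {
		if (isspace((unsigned char)*in)) {
			pending = (out != s);
			continue;
		}
		if (pending)
			*out++ = ' ';
		pending = 0;
		*out++ = *in;
	}
	*out = '\0';
}

/*
 * Hypothetical helper: split a From: value into email and name.
 * Comments stay in the name; only a () pair wrapping the whole
 * remainder (form b) is removed. Returns 0 on success, -1 if no '@'
 * is found or a buffer is too small.
 */
static int split_from(const char *from, char *name, size_t name_sz,
		      char *email, size_t email_sz)
{
	const char *at = strchr(from, '@');
	const char *b, *e;
	size_t elen, pre, post, len;

	if (!at)
		return -1;

	/* Grow the address left and right until whitespace or <, >. */
	for (b = at; b > from && !isspace((unsigned char)b[-1]) && b[-1] != '<'; b--)
		;
	for (e = at; *e && !isspace((unsigned char)*e) && *e != '>'; e++)
		;
	elen = e - b;
	if (elen + 1 > email_sz)
		return -1;
	memcpy(email, b, elen);
	email[elen] = '\0';

	/* The remainder is the name: text before and after the address. */
	pre = b - from;
	if (pre && from[pre - 1] == '<')
		pre--;
	if (*e == '>')
		e++;
	post = strlen(e);
	if (pre + 1 + post + 1 > name_sz)
		return -1;
	memcpy(name, from, pre);
	name[pre] = ' ';
	memcpy(name + pre + 1, e, post + 1);

	collapse_spaces(name);
	len = strlen(name);
	if (len >= 2 && name[0] == '(' && name[len - 1] == ')') {
		/* form (b): "john.doe@xz (John Doe)" */
		memmove(name, name + 1, len - 2);
		name[len - 2] = '\0';
		collapse_spaces(name);
	}
	return 0;
}
```

Feeding it the folded From: header from t/t5100/sample.mbox should give
the name "A (zzz) U Thor (Comment)" and the email "a.u.thor@example.com",
which is what the updated info0001 expects. Like the check in
handle_from(), the () test only looks at the first and last character of
the remainder.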


Is it ok?

And by the way, please pull the whole updated series from

    git://repo.or.cz/git/kirr.git   for-junio-maint

Kirill Smelkov (3):
      mailinfo: cleanup extra spaces for complex 'From'
      mailinfo: add explicit test for mails like '<a.u.thor@example.com> (A U Thor)'
      mailinfo: tests for RFC2047 examples


Thanks,
Kirill

