git.vger.kernel.org archive mirror
* `git-send-email' doesn't specify `Content-Type'
@ 2007-11-10  0:14 Ludovic Courtès
  2007-11-10  0:52 ` Johannes Schindelin
  0 siblings, 1 reply; 13+ messages in thread
From: Ludovic Courtès @ 2007-11-10  0:14 UTC (permalink / raw)
  To: git

Hi,

Apparently, `git-send-email' doesn't specify the email's `Content-Type',
notably its charset, while it should really add something like:

  Content-Type: text/plain; charset=UTF-8

Or did I miss an option or something?

Thanks,
Ludovic.


* Re: `git-send-email' doesn't specify `Content-Type'
  2007-11-10  0:14 `git-send-email' doesn't specify `Content-Type' Ludovic Courtès
@ 2007-11-10  0:52 ` Johannes Schindelin
  2007-11-10 10:14   ` Brian Swetland
  0 siblings, 1 reply; 13+ messages in thread
From: Johannes Schindelin @ 2007-11-10  0:52 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: git

Hi,

On Sat, 10 Nov 2007, Ludovic Courtès wrote:

> Apparently, `git-send-email' doesn't specify the email's `Content-Type',
> notably its charset, while it should really add something like:
> 
>   Content-Type: text/plain; charset=UTF-8
> 
> Or did I miss an option or something?

Apparently.  There was a thread some days ago, about that very issue.  
Please find and read it.

Ciao,
Dscho


* Re: `git-send-email' doesn't specify `Content-Type'
  2007-11-10  0:52 ` Johannes Schindelin
@ 2007-11-10 10:14   ` Brian Swetland
  2007-11-10 12:25     ` Björn Steinbrink
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Swetland @ 2007-11-10 10:14 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Ludovic Courtès, git

[Johannes Schindelin <Johannes.Schindelin@gmx.de>]
> Hi,
> 
> On Sat, 10 Nov 2007, Ludovic Courtès wrote:
> 
> > Apparently, `git-send-email' doesn't specify the email's `Content-Type',
> > notably its charset, while it should really add something like:
> > 
> >   Content-Type: text/plain; charset=UTF-8
> > 
> > Or did I miss an option or something?
> 
> Apparently.  There was a thread some days ago, about that very issue.  
> Please find and read it.

The thread I found says that git-send-email should do the right thing if
there are non-ascii characters, but this does not seem to be the case
for me.

The example I have involves a coworker's name which needs non-ascii
characters.  They are properly escaped in the From: line generated by
git-format-patch.  git-send-email puts the generated From: line at the
top of the body of the email, unescapes it (to utf-8), and proceeds to
send the email with no Content-Type specified.

This behaviour is observed in 1.5.3.5.  A sample output from
git-format-patch follows, which demonstrates the problem:


From 3440baaed3b21138f6fc8b80e03769e3903f9c11 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= <arve@android.com>
Date: Wed, 7 Nov 2007 22:51:44 -0800
Subject: [PATCH] hrtimer: Add timer back to pending list if it was reactivated and has already expired again.

This avoids problems with timer hardware that does not respond to timers set in the past.

Signed-off-by: Brian Swetland <swetland@android.com>
---
 kernel/hrtimer.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 22a2514..7c60769 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1149,8 +1149,14 @@ static void run_hrtimer_softirq(struct softirq_action *h)
 			 * If the timer was rearmed on another CPU, reprogram
 			 * the event device.
 			 */
-			if (timer->base->first == &timer->node)
-				hrtimer_reprogram(timer, timer->base);
+			if (timer->base->first == &timer->node) {
+				if(hrtimer_reprogram(timer, timer->base)) {
+					__remove_hrtimer(timer, timer->base,
+							 HRTIMER_STATE_PENDING, 0);
+					list_add_tail(&timer->cb_entry,
+						      &cpu_base->cb_pending);
+				}
+			}
 		}
 	}
 	spin_unlock_irq(&cpu_base->lock);
-- 
1.5.3.5


* Re: `git-send-email' doesn't specify `Content-Type'
  2007-11-10 10:14   ` Brian Swetland
@ 2007-11-10 12:25     ` Björn Steinbrink
  2007-11-10 12:35       ` Brian Swetland
  0 siblings, 1 reply; 13+ messages in thread
From: Björn Steinbrink @ 2007-11-10 12:25 UTC (permalink / raw)
  To: Brian Swetland; +Cc: Johannes Schindelin, Ludovic Courtès, git

On 2007.11.10 02:14:20 -0800, Brian Swetland wrote:
> [Johannes Schindelin <Johannes.Schindelin@gmx.de>]
> > Hi,
> > 
> > On Sat, 10 Nov 2007, Ludovic Courtès wrote:
> > 
> > > Apparently, `git-send-email' doesn't specify the email's `Content-Type',
> > > notably its charset, while it should really add something like:
> > > 
> > >   Content-Type: text/plain; charset=UTF-8
> > > 
> > > Or did I miss an option or something?
> > 
> > Apparently.  There was a thread some days ago, about that very issue.  
> > Please find and read it.
> 
> The thread I found says that git-send-email should do the right thing if
> there are non-ascii characters, but this does not seem to be the case
> for me.
> 
> The example I have involves a coworker's name which needs non-ascii
> characters.  They are properly escaped in the From: line generated by
> git-format-patch.  git-send-email puts the generated From: line at the
> top of the body of the email, unescapes it (to utf-8), and proceeds to
> send the email with no Content-Type specified.

You mean that it converts the header field to utf-8? It doesn't do that
here (neither master nor 1.5.3.5) and IIRC that would be invalid anyway,
because Content-Type applies to exactly that, content, not headers. Your
sample has no non-ASCII characters (or at least I didn't see any), so
git-send-email doesn't add a header to specify a charset.

Björn

> This behaviour is observed in 1.5.3.5.  A sample output from
> git-format-patch follows, which demonstrates the problem:
> 
> 
> From 3440baaed3b21138f6fc8b80e03769e3903f9c11 Mon Sep 17 00:00:00 2001
> From: =?utf-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= <arve@android.com>
> Date: Wed, 7 Nov 2007 22:51:44 -0800
> Subject: [PATCH] hrtimer: Add timer back to pending list if it was reactivated and has already expired again.
> 
> This avoids problems with timer hardware that does not respond to timers set in the past.
> 
> Signed-off-by: Brian Swetland <swetland@android.com>
> ---
>  kernel/hrtimer.c |   10 ++++++++--
>  1 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index 22a2514..7c60769 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1149,8 +1149,14 @@ static void run_hrtimer_softirq(struct softirq_action *h)
>  			 * If the timer was rearmed on another CPU, reprogram
>  			 * the event device.
>  			 */
> -			if (timer->base->first == &timer->node)
> -				hrtimer_reprogram(timer, timer->base);
> +			if (timer->base->first == &timer->node) {
> +				if(hrtimer_reprogram(timer, timer->base)) {
> +					__remove_hrtimer(timer, timer->base,
> +							 HRTIMER_STATE_PENDING, 0);
> +					list_add_tail(&timer->cb_entry,
> +						      &cpu_base->cb_pending);
> +				}
> +			}
>  		}
>  	}
>  	spin_unlock_irq(&cpu_base->lock);


* Re: `git-send-email' doesn't specify `Content-Type'
  2007-11-10 12:25     ` Björn Steinbrink
@ 2007-11-10 12:35       ` Brian Swetland
  2007-11-10 12:51         ` Björn Steinbrink
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Swetland @ 2007-11-10 12:35 UTC (permalink / raw)
  To: Björn Steinbrink; +Cc: Johannes Schindelin, Ludovic Courtès, git

[-- Attachment #1: Type: text/plain, Size: 1367 bytes --]

[Björn Steinbrink <B.Steinbrink@gmx.de>]
> On 2007.11.10 02:14:20 -0800, Brian Swetland wrote:
> > 
> > The example I have involves a coworker's name which needs non-ascii
> > characters.  They are properly escaped in the From: line generated by
> > git-format-patch.  git-send-email puts the generated From: line at the
> > top of the body of the email, unescapes it (to utf-8), and proceeds to
> > send the email with no Content-Type specified.
> 
> You mean that it converts the header field to utf-8? It doesn't do that
> here (neither master nor 1.5.3.5) and IIRC that would be invalid anyway,
> because Content-Type applies to exactly that, content, not headers. Your
> sample has no non-ASCII characters (or at least I didn't see any), so
> git-send-email doesn't add a header to specify a charset.

The first line of the patch is a From: field with Arve's name, in
an (rfc822?) encoded format:
From: =?utf-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= <arve@android.com>

The mail generated by git-send-email makes this From: line the first
line of the *body* of the generated email.  This line in the body
is no longer escaped, the utf8 characters are visible, but the header
of the message does not have a Content-Type indicating a non-ascii
encoding.
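
For illustration only, the encoded form can be decoded with Python's
standard email.header.decode_header to see what ends up in the body.
This is a sketch in Python, not the Perl that git-send-email uses, and
the helper name decode_rfc2047 is made up for this example.

```python
from email.header import decode_header

# The encoded From: value quoted above (an RFC 2047 "encoded-word").
raw = "=?utf-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= <arve@android.com>"


def decode_rfc2047(value):
    """Return (decoded_text, charsets) for a header value.

    Hypothetical helper: it keeps the charset names instead of
    throwing them away, which is the information lost in this report.
    """
    parts = []
    charsets = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # Unencoded fragments come back with charset None.
            chunk = chunk.decode(charset or "ascii")
        if charset:
            charsets.append(charset)
        parts.append(chunk)
    return "".join(parts), charsets


text, charsets = decode_rfc2047(raw)
# True when the decoded text cannot be sent as plain US-ASCII.
has_non_ascii = any(ord(c) > 127 for c in text)
```

Once decoded, the name contains characters outside US-ASCII, so a body
that carries it needs a declared charset.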

Attached are the result of git-format-patch and the actual email
received from git-send-email (mbox format).

Brian

[-- Attachment #2: 0001-hrtimer-Add-timer-back-to-pending-list-if-it-was-re.patch --]
[-- Type: text/x-diff, Size: 1224 bytes --]

From 3440baaed3b21138f6fc8b80e03769e3903f9c11 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= <arve@android.com>
Date: Wed, 7 Nov 2007 22:51:44 -0800
Subject: [PATCH] hrtimer: Add timer back to pending list if it was reactivated and has already expired again.

This avoids problems with timer hardware that does not respond to timers set in the past.

Signed-off-by: Brian Swetland <swetland@android.com>
---
 kernel/hrtimer.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 22a2514..7c60769 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1149,8 +1149,14 @@ static void run_hrtimer_softirq(struct softirq_action *h)
 			 * If the timer was rearmed on another CPU, reprogram
 			 * the event device.
 			 */
-			if (timer->base->first == &timer->node)
-				hrtimer_reprogram(timer, timer->base);
+			if (timer->base->first == &timer->node) {
+				if(hrtimer_reprogram(timer, timer->base)) {
+					__remove_hrtimer(timer, timer->base,
+							 HRTIMER_STATE_PENDING, 0);
+					list_add_tail(&timer->cb_entry,
+						      &cpu_base->cb_pending);
+				}
+			}
 		}
 	}
 	spin_unlock_irq(&cpu_base->lock);
-- 
1.5.3.5


[-- Attachment #3: mbox --]
[-- Type: text/plain, Size: 2456 bytes --]

From swetland@google.com Sat Nov 10 02:04:08 2007
Return-Path: <swetland@google.com>
X-Original-To: swetland@frotz.net
Delivered-To: swetland@frotz.net
Received: from smtp-out.google.com (smtp-out.google.com [216.239.45.13])
	by mumble.frotz.net (Postfix) with ESMTP id 1BC002500D
	for <swetland@frotz.net>; Sat, 10 Nov 2007 02:04:08 -0800 (PST)
Received: from zps35.corp.google.com (zps35.corp.google.com [172.25.146.35])
	by smtp-out.google.com with ESMTP id lAAA5hUj030761;
	Sat, 10 Nov 2007 02:05:43 -0800
DomainKey-Signature: a=rsa-sha1; s=beta; d=google.com; c=nofws; q=dns;
	h=received:from:to:cc:subject:date:message-id:x-mailer;
	b=g2B628wRsJJahlIpNw3mgNDqOQKNMcUCPOurvqj+3fO6qLH+vpBS0ZwN1lLv6BnC7
	w4QLOotDo7t+nI2KgZDVQ==
Received: from bulgaria (bulgaria.corp.google.com [172.18.102.38])
	by zps35.corp.google.com with ESMTP id lAAA5e22002202;
	Sat, 10 Nov 2007 02:05:42 -0800
Received: by bulgaria (Postfix, from userid 1000)
	id 613018F45E; Sat, 10 Nov 2007 02:05:25 -0800 (PST)
From: swetland@google.com
To: swetland@frotz.net
Cc: =?utf-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= <arve@android.com>
Subject: [PATCH] hrtimer: Add timer back to pending list if it was reactivated and has already expired again.
Date: Sat, 10 Nov 2007 02:05:25 -0800
Message-Id: <1194689125-21319-1-git-send-email-swetland@google.com>
X-Mailer: git-send-email 1.5.3.5
X-SpamProbe: GOOD 0.0000099 b892c7c5c469d044f28ab48846487cf5
X-SpamCheck: OKAY
Status: RO
Content-Length: 983
Lines: 33

From: Arve Hjønnevåg <arve@android.com>

This avoids problems with timer hardware that does not respond to timers set in the past.

Signed-off-by: Brian Swetland <swetland@android.com>
---
 kernel/hrtimer.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 22a2514..7c60769 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1149,8 +1149,14 @@ static void run_hrtimer_softirq(struct softirq_action *h)
 			 * If the timer was rearmed on another CPU, reprogram
 			 * the event device.
 			 */
-			if (timer->base->first == &timer->node)
-				hrtimer_reprogram(timer, timer->base);
+			if (timer->base->first == &timer->node) {
+				if(hrtimer_reprogram(timer, timer->base)) {
+					__remove_hrtimer(timer, timer->base,
+							 HRTIMER_STATE_PENDING, 0);
+					list_add_tail(&timer->cb_entry,
+						      &cpu_base->cb_pending);
+				}
+			}
 		}
 	}
 	spin_unlock_irq(&cpu_base->lock);
-- 
1.5.3.5




* Re: `git-send-email' doesn't specify `Content-Type'
  2007-11-10 12:35       ` Brian Swetland
@ 2007-11-10 12:51         ` Björn Steinbrink
  2007-11-11  8:32           ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Björn Steinbrink @ 2007-11-10 12:51 UTC (permalink / raw)
  To: Brian Swetland; +Cc: Johannes Schindelin, Ludovic Courtès, git

On 2007.11.10 04:35:05 -0800, Brian Swetland wrote:
> [Björn Steinbrink <B.Steinbrink@gmx.de>]
> > On 2007.11.10 02:14:20 -0800, Brian Swetland wrote:
> > > 
> > > The example I have involves a coworker's name which needs non-ascii
> > > characters.  They are properly escaped in the From: line generated by
> > > git-format-patch.  git-send-email puts the generated From: line at the
> > > top of the body of the email, unescapes it (to utf-8), and proceeds to
> > > send the email with no Content-Type specified.
> > 
> > You mean that it converts the header field to utf-8? It doesn't do that
> > here (neither master nor 1.5.3.5) and IIRC that would be invalid anyway,
> > because Content-Type applies to exactly that, content, not headers. Your
> > sample has no non-ASCII characters (or at least I didn't see any), so
> > git-send-email doesn't add a header to specify a charset.
> 
> The first line of the patch is a From: field with Arve's name, in
> an (rfc822?) encoded format:
> From: =?utf-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= <arve@android.com>
> 
> The mail generated by git-send-email makes this From: line the first
> line of the *body* of the generated email.  This line in the body
> is no longer escaped, the utf8 characters are visible, but the header
> of the message does not have a Content-Type indicating a non-ascii
> encoding.

Ah! Commit author differs from mail sender, didn't think of that. That's
probably the same problem as with the -s option, i.e. that git-send-email
only looks at the existing text, and not at anything it adds itself, when
checking the encoding. Sorry for the noise.

Björn


* Re: `git-send-email' doesn't specify `Content-Type'
  2007-11-10 12:51         ` Björn Steinbrink
@ 2007-11-11  8:32           ` Jeff King
  2007-11-11  8:35             ` Jeff King
                               ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Jeff King @ 2007-11-11  8:32 UTC (permalink / raw)
  To: Björn Steinbrink
  Cc: Brian Swetland, Johannes Schindelin, Ludovic Courtès, git

On Sat, Nov 10, 2007 at 01:51:26PM +0100, Björn Steinbrink wrote:

> On 2007.11.10 04:35:05 -0800, Brian Swetland wrote:
> > The first line of the patch is a From: field with Arve's name, in
> > an (rfc822?) encoded format:
> > From: =?utf-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= <arve@android.com>

It's rfc2047 (and you can grep for that in git-send-email).

> Ah! Commit author differs from mail sender, didn't think of that. That's
> probably the same problem as with the -s option, i.e. that git-send-email
> only looks at the existing text, and not at anything it adds itself, when
> checking the encoding. Sorry for the noise.

It's not the same problem; the '-s' problem was git-format-patch, and
this is git-send-email. In fact, git-format-patch correctly notes the
encoding in the header. It is git-send-email in this case that takes the
encoded and properly marked header, deciphers it, throws away the
original encoding, and sticks it into the message body without
considering the encoding of the body.

So I think you would want to:
  1. Remember the encoding pulled from the rfc2047 header.
  2. When prepending the author line to the message, consider the
     body encoding.
  2a. If no encoding, then the body is US-ASCII and we can presumably
      just add
         MIME-Version: 1.0
         Content-Type: text/plain; charset=$enc
  2b. If there is an encoding, we need to iconv from the name
      encoding to the body encoding.

However, as it stands now, our rfc2047 unquoting _always_ assumes that
we are in utf-8 for the name (which is probably true if the messages
came out of git-format-patch with default-ish settings). So the easy,
hackish way is probably to just add the MIME-Version and 'Content-type:
text/plain; charset=utf-8' headers if we unquoted the author field.
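
The steps listed above can be sketched roughly as follows. This is an
illustration in Python, not the Perl of git-send-email, and every name
in it (plan_author_line, its parameters, the return shape) is
hypothetical. It assumes the author name has already been decoded to
text and that its original charset was remembered (step 1).

```python
def plan_author_line(author, author_charset, body_charset=None):
    """Decide how to prepend the author name to a message body.

    Returns (author_bytes, extra_headers). Hypothetical helper that
    mirrors steps 2, 2a and 2b; it is not code from git-send-email.
    """
    if author_charset is None:
        # Nothing was unquoted, so the name is plain US-ASCII.
        return author.encode("ascii"), []
    if body_charset is None:
        # Step 2a: the body declared no charset, so declare the
        # author's charset for the whole body.
        headers = [
            "MIME-Version: 1.0",
            "Content-Type: text/plain; charset=%s" % author_charset,
        ]
        return author.encode(author_charset), headers
    # Step 2b: the body already has a charset; convert the name to it.
    # encode() raises UnicodeEncodeError if the name does not fit.
    return author.encode(body_charset), []


name = "Arve Hj\u00f8nnev\u00e5g <arve@android.com>"
easy_bytes, easy_headers = plan_author_line(name, "utf-8")
conv_bytes, conv_headers = plan_author_line(name, "utf-8", "iso-8859-1")
```

The "easy, hackish way" corresponds to only ever taking the second
branch with author_charset fixed to utf-8.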

If we want to accept arbitrary messages, below is a patch to at least
have unquote_rfc2047 return the right information (and then on
git-send-email.perl:758, where we prepend $author, the encoding would
need to be taken into account as I described above).

Given that git-send-email is already pretty dependent on
git-format-patch output (and nobody has been complaining about its
rfc2047 handling so far!) the easy, hackish way is probably the best.

-Peff

---
diff --git a/git-send-email.perl b/git-send-email.perl
index f9bd2e5..4f8297f 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -514,11 +514,13 @@ $time = time - scalar $#files;
 
 sub unquote_rfc2047 {
 	local ($_) = @_;
-	if (s/=\?utf-8\?q\?(.*)\?=/$1/g) {
+	my $encoding;
+	if (s/=\?([^?]+)\?q\?(.*)\?=/$2/g) {
+		$encoding = $1;
 		s/_/ /g;
 		s/=([0-9A-F]{2})/chr(hex($1))/eg;
 	}
-	return "$_";
+	return "$_", $encoding;
 }
 
 # use the simplest quoting being able to handle the recipient
@@ -667,6 +669,7 @@ foreach my $t (@files) {
 	open(F,"<",$t) or die "can't open file $t";
 
 	my $author = undef;
+	my $author_encoding;
 	@cc = @initial_cc;
 	@xh = ();
 	my $input_format = undef;
@@ -692,7 +695,8 @@ foreach my $t (@files) {
 						next if ($suppress_from);
 					}
 					elsif ($1 eq 'From') {
-						$author = unquote_rfc2047($2);
+						($author, $author_encoding)
+						  = unquote_rfc2047($2);
 					}
 					printf("(mbox) Adding cc: %s from line '%s'\n",
 						$2, $_) unless $quiet;


* Re: `git-send-email' doesn't specify `Content-Type'
  2007-11-11  8:32           ` Jeff King
@ 2007-11-11  8:35             ` Jeff King
  2007-11-11  8:39             ` Brian Swetland
  2007-11-11  8:56             ` Jeff King
  2 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2007-11-11  8:35 UTC (permalink / raw)
  To: Björn Steinbrink
  Cc: Brian Swetland, Johannes Schindelin, Ludovic Courtès, git

On Sun, Nov 11, 2007 at 03:32:24AM -0500, Jeff King wrote:

> -	return "$_";
> +	return "$_", $encoding;

This actually breaks other calls to unquote_rfc2047 which use a scalar
context. So that would have to be fixed if this were to start a real
patch.

-Peff


* Re: `git-send-email' doesn't specify `Content-Type'
  2007-11-11  8:32           ` Jeff King
  2007-11-11  8:35             ` Jeff King
@ 2007-11-11  8:39             ` Brian Swetland
  2007-11-11  8:41               ` Jeff King
  2007-11-11  8:56             ` Jeff King
  2 siblings, 1 reply; 13+ messages in thread
From: Brian Swetland @ 2007-11-11  8:39 UTC (permalink / raw)
  To: Jeff King
  Cc: Björn Steinbrink, Johannes Schindelin, Ludovic Courtès,
	git


This issue with the encoding of the author got me thinking...

What happens if the metadata has utf8 content and the patch itself has
some *other* non-ascii encoding (some iso-latin variant perhaps)?

Is there any way to deal with that situation sanely other than indicate
that it's 8bit content and not specify an encoding?  Is that what
happens currently?

Brian


* Re: `git-send-email' doesn't specify `Content-Type'
  2007-11-11  8:39             ` Brian Swetland
@ 2007-11-11  8:41               ` Jeff King
  2007-11-11  8:45                 ` Brian Swetland
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2007-11-11  8:41 UTC (permalink / raw)
  To: Brian Swetland
  Cc: Björn Steinbrink, Johannes Schindelin, Ludovic Courtès,
	git

On Sun, Nov 11, 2007 at 12:39:15AM -0800, Brian Swetland wrote:

> This issue with the encoding of the author got me thinking...
> 
> What happens if the metadata has utf8 content and the patch itself has
> some *other* non-ascii encoding (some iso-latin variant perhaps)?
> 
> Is there any way to deal with that situation sanely other than indicate
> that it's 8bit content and not specify an encoding?  Is that what
> happens currently?

The body has to be in one encoding, so at the time that you know both
encodings, you have to pick one and convert the data from the discarded
encoding into the used encoding.
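
As a small illustration of that conversion (a sketch in Python, not
git-send-email code; the function name reencode is made up here),
moving the author name from utf-8 into an iso-8859-1 body looks like
this:

```python
def reencode(data, src, dst):
    """Convert bytes from charset src to charset dst, like iconv.

    Raises UnicodeEncodeError when a character has no representation
    in the target charset.
    """
    return data.decode(src).encode(dst)


name_utf8 = "Arve Hj\u00f8nnev\u00e5g".encode("utf-8")
name_latin1 = reencode(name_utf8, "utf-8", "iso-8859-1")
```

The conversion only works when every character exists in the chosen
encoding, which is one reason to prefer utf-8 as the one to keep.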

-Peff


* Re: `git-send-email' doesn't specify `Content-Type'
  2007-11-11  8:41               ` Jeff King
@ 2007-11-11  8:45                 ` Brian Swetland
  2007-11-11  8:51                   ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Swetland @ 2007-11-11  8:45 UTC (permalink / raw)
  To: Jeff King
  Cc: Björn Steinbrink, Johannes Schindelin, Ludovic Courtès,
	git

[Jeff King <peff@peff.net>]
> On Sun, Nov 11, 2007 at 12:39:15AM -0800, Brian Swetland wrote:
> 
> > This issue with the encoding of the author got me thinking...
> > 
> > What happens if the metadata has utf8 content and the patch itself has
> > some *other* non-ascii encoding (some iso-latin variant perhaps)?
> > 
> > Is there any way to deal with that situation sanely other than indicate
> > that it's 8bit content and not specify an encoding?  Is that what
> > happens currently?
> 
> The body has to be in one encoding, so at the time that you know both
> encodings, you have to pick one and convert the data from the discarded
> encoding into the used encoding.

That seems potentially bad in that the transport (mailed patches) could
be altering the contents of the patch.  Or is this process reversed when 
the patch is finally applied?

Brian


* Re: `git-send-email' doesn't specify `Content-Type'
  2007-11-11  8:45                 ` Brian Swetland
@ 2007-11-11  8:51                   ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2007-11-11  8:51 UTC (permalink / raw)
  To: Brian Swetland
  Cc: Björn Steinbrink, Johannes Schindelin, Ludovic Courtès,
	git

On Sun, Nov 11, 2007 at 12:45:15AM -0800, Brian Swetland wrote:

> > > What happens if the metadata has utf8 content and the patch itself has
> > > some *other* non-ascii encoding (some iso-latin variant perhaps)?
[...]
> > The body has to be in one encoding, so at the time that you know both
> > encodings, you have to pick one and convert the data from the discarded
> > encoding into the used encoding.
> 
> That seems potentially bad in that the transport (mailed patches) could
> be altering the contents of the patch.  Or is this process reversed when 
> the patch is finally applied?

My answer was for "how do you stick two things with different encoding
in the same mail" (which applies to the name + commit message
situation). However, we don't actually _have_ an encoding for the patch
data. We just assume that it matches the metadata.

-Peff


* Re: `git-send-email' doesn't specify `Content-Type'
  2007-11-11  8:32           ` Jeff King
  2007-11-11  8:35             ` Jeff King
  2007-11-11  8:39             ` Brian Swetland
@ 2007-11-11  8:56             ` Jeff King
  2 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2007-11-11  8:56 UTC (permalink / raw)
  To: Björn Steinbrink
  Cc: Brian Swetland, Johannes Schindelin, Ludovic Courtès, git

On Sun, Nov 11, 2007 at 03:32:24AM -0500, Jeff King wrote:

> came out of git-format-patch with default-ish settings). So the easy,
> hackish way is probably to just add the MIME-Version and 'Content-type:
> text/plain; charset=utf-8' headers if we unquoted the author field.

Here is the quick and dirty patch. It is totally untested (as in, I
didn't even run git-send-email once), but maybe it can get somebody
started (I left some comments about how to make it less quick and
dirty).  My head is going to explode if I read any more of the ad-hoc
header parsing in git-send-email.perl.

-Peff

---
diff --git a/git-send-email.perl b/git-send-email.perl
index f9bd2e5..4a071f2 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -514,11 +514,13 @@ $time = time - scalar $#files;
 
 sub unquote_rfc2047 {
 	local ($_) = @_;
-	if (s/=\?utf-8\?q\?(.*)\?=/$1/g) {
+	my $encoding;
+	if (s/=\?([^?]+)\?q\?(.*)\?=/$2/g) {
+		$encoding = $1;
 		s/_/ /g;
 		s/=([0-9A-F]{2})/chr(hex($1))/eg;
 	}
-	return "$_";
+	return wantarray ? ($_, $encoding) : $_;
 }
 
 # use the simplest quoting being able to handle the recipient
@@ -667,6 +669,9 @@ foreach my $t (@files) {
 	open(F,"<",$t) or die "can't open file $t";
 
 	my $author = undef;
+	my $author_encoding;
+	my $has_content_type;
+	my $body_encoding;
 	@cc = @initial_cc;
 	@xh = ();
 	my $input_format = undef;
@@ -692,12 +697,20 @@ foreach my $t (@files) {
 						next if ($suppress_from);
 					}
 					elsif ($1 eq 'From') {
-						$author = unquote_rfc2047($2);
+						($author, $author_encoding)
+						  = unquote_rfc2047($2);
 					}
 					printf("(mbox) Adding cc: %s from line '%s'\n",
 						$2, $_) unless $quiet;
 					push @cc, $2;
 				}
+				elsif (/^Content-type:/i) {
+					$has_content_type = 1;
+					if (/charset="?([^ "]+)/) {
+						$body_encoding = $1;
+					}
+					push @xh, $_;
+				}
 				elsif (!/^Date:\s/ && /^[-A-Za-z]+:\s+\S/) {
 					push @xh, $_;
 				}
@@ -756,6 +769,21 @@ foreach my $t (@files) {
 
 	if (defined $author) {
 		$message = "From: $author\n\n$message";
+		if (defined $author_encoding) {
+			if ($has_content_type) {
+				if ($body_encoding eq $author_encoding) {
+					# ok, we already have the right encoding
+				}
+				else {
+					# uh oh, we should re-encode
+				}
+			}
+			else {
+				push @xh,
+				  'MIME-Version: 1.0',
+				  "Content-Type: text/plain; charset=$author_encoding";
+			}
+		}
 	}
 
 	send_message();


end of thread, other threads:[~2007-11-11  8:56 UTC | newest]

Thread overview: 13+ messages
2007-11-10  0:14 `git-send-email' doesn't specify `Content-Type' Ludovic Courtès
2007-11-10  0:52 ` Johannes Schindelin
2007-11-10 10:14   ` Brian Swetland
2007-11-10 12:25     ` Björn Steinbrink
2007-11-10 12:35       ` Brian Swetland
2007-11-10 12:51         ` Björn Steinbrink
2007-11-11  8:32           ` Jeff King
2007-11-11  8:35             ` Jeff King
2007-11-11  8:39             ` Brian Swetland
2007-11-11  8:41               ` Jeff King
2007-11-11  8:45                 ` Brian Swetland
2007-11-11  8:51                   ` Jeff King
2007-11-11  8:56             ` Jeff King
