* Content-Type handling @ 2012-09-17 16:34 Tomas Cohen Arazi 2012-09-17 17:18 ` Re*: " Junio C Hamano 0 siblings, 1 reply; 3+ messages in thread From: Tomas Cohen Arazi @ 2012-09-17 16:34 UTC (permalink / raw) To: git Hi, I'm not sure it is a bug, but we used: git config --global format.headers "Content-Type: text/plain; charset=\"utf-8\"" and recently (perhaps an Ubuntu default setup issue) the content-type is being automatically set, the result is that patches contain this: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset="utf-8" This might not be the problem, but when I apply the patch I get this: fatal: cannot convert from UTF-8utf-8 to UTF-8 which looks like a bug. Not sure it hasn't been reported before, but I think it should take one of the content-type specifications and not append both. I couldn't find a place to looks for previous bug reports to check, so forgive me if this is not the right place to report it. Regards To+ ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re*: Content-Type handling 2012-09-17 16:34 Content-Type handling Tomas Cohen Arazi @ 2012-09-17 17:18 ` Junio C Hamano 2012-09-17 22:28 ` [PATCH] mailinfo: do not concatenate charset= attribute values from mime headers Junio C Hamano 0 siblings, 1 reply; 3+ messages in thread From: Junio C Hamano @ 2012-09-17 17:18 UTC (permalink / raw) To: Tomas Cohen Arazi; +Cc: git Tomas Cohen Arazi <tomascohen@gmail.com> writes: > Hi, I'm not sure it is a bug, but we used: > > git config --global format.headers "Content-Type: text/plain; charset=\"utf-8\"" > > and recently (perhaps an Ubuntu default setup issue) the content-type > is being automatically set, the result is that patches contain this: > > Content-Type: text/plain; charset=UTF-8 > Content-Transfer-Encoding: 8bit > Content-Type: text/plain; charset="utf-8" > > This might not be the problem, but when I apply the patch I get this: > > fatal: cannot convert from UTF-8utf-8 to UTF-8 > > which looks like a bug. Not sure it hasn't been reported before, but I > think it should take one of the content-type specifications and not > append both. I couldn't find a place to looks for previous bug reports > to check, so forgive me if this is not the right place to report it. What I think is happening is: * ancient "git format-patch" did not add appropriate content-type headers, and that bug was fixed in more recent versions. * while waiting for the bug to get fixed, the user worked around by having her own format.headers to work it around. * the user upgraded to a newer version of git without the bug. * Git trusts that the user knows better than git itself, and it adds format.headers to whatever it needs to have in the message. As a message with more than one content-type: is invalid, the handcrafted format.headers used to be used as a workaround should be removed, but it hasn't been. * Then the receiving end (mailinfo) gets confused by the invalid input. At least, a patch line this may alleviate the symptom. It seems that we assumed we won't see multiple charset="..." so the parser, starting from an empty charset, simply appended whatever it found in the message, instead of overwriting the old value it had. The other user of the function this patch touches always makes sure the "attr" strbuf is empty before calling the function, and wouldn't have been confused by multiple occurrences of the same header, and this patch should not affect it, either. builtin/mailinfo.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git i/builtin/mailinfo.c w/builtin/mailinfo.c index b691b77..2b3f4d9 100644 --- i/builtin/mailinfo.c +++ w/builtin/mailinfo.c @@ -160,10 +160,9 @@ static int slurp_attr(const char *line, const char *name, struct strbuf *attr) const char *ends, *ap = strcasestr(line, name); size_t sz; - if (!ap) { - strbuf_setlen(attr, 0); + strbuf_setlen(attr, 0); + if (!ap) return 0; - } ap += strlen(name); if (*ap == '"') { ap++; ^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH] mailinfo: do not concatenate charset= attribute values from mime headers 2012-09-17 17:18 ` Re*: " Junio C Hamano @ 2012-09-17 22:28 ` Junio C Hamano 0 siblings, 0 replies; 3+ messages in thread From: Junio C Hamano @ 2012-09-17 22:28 UTC (permalink / raw) To: git; +Cc: Tomas Cohen Arazi "Content-type: text/plain; charset=UTF-8" header should not appear twice in the input, but it is always better to gracefully deal with such a case. The current code concatenates the value to the values we have seen previously, producing nonsense such as "utf8UTF-8". Instead of concatenating, forget the previous value and use the last value we see. Reported-by: Tomas Cohen Arazi Signed-off-by: Junio C Hamano <gitster@pobox.com> --- * I think this is the right thing to do after all. Thanks for bringing the issue up, Tomas. builtin/mailinfo.c | 5 ++--- t/t5100-mailinfo.sh | 2 +- t/t5100/info0017 | 5 +++++ t/t5100/msg0017 | 2 ++ t/t5100/patch0017 | 6 ++++++ t/t5100/sample.mbox | 16 ++++++++++++++++ 6 files changed, 32 insertions(+), 4 deletions(-) create mode 100644 t/t5100/info0017 create mode 100644 t/t5100/msg0017 create mode 100644 t/t5100/patch0017 diff --git a/builtin/mailinfo.c b/builtin/mailinfo.c index eaf9e15..dd8b67c 100644 --- a/builtin/mailinfo.c +++ b/builtin/mailinfo.c @@ -160,10 +160,9 @@ static int slurp_attr(const char *line, const char *name, struct strbuf *attr) const char *ends, *ap = strcasestr(line, name); size_t sz; - if (!ap) { - strbuf_setlen(attr, 0); + strbuf_setlen(attr, 0); + if (!ap) return 0; - } ap += strlen(name); if (*ap == '"') { ap++; diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh index 81904d9..3e64a7a 100755 --- a/t/t5100-mailinfo.sh +++ b/t/t5100-mailinfo.sh @@ -11,7 +11,7 @@ test_expect_success 'split sample box' \ 'git mailsplit -o. "$TEST_DIRECTORY"/t5100/sample.mbox >last && last=`cat last` && echo total is $last && - test `cat last` = 16' + test `cat last` = 17' check_mailinfo () { mail=$1 opt=$2 diff --git a/t/t5100/info0017 b/t/t5100/info0017 new file mode 100644 index 0000000..d2bc89f --- /dev/null +++ b/t/t5100/info0017 @@ -0,0 +1,5 @@ +Author: A U Thor +Email: a.u.thor@example.com +Subject: A E I O U +Date: Mon, 17 Sep 2012 14:23:44 -0700 + diff --git a/t/t5100/msg0017 b/t/t5100/msg0017 new file mode 100644 index 0000000..2ee0900 --- /dev/null +++ b/t/t5100/msg0017 @@ -0,0 +1,2 @@ +New content here + diff --git a/t/t5100/patch0017 b/t/t5100/patch0017 new file mode 100644 index 0000000..35cf84c --- /dev/null +++ b/t/t5100/patch0017 @@ -0,0 +1,6 @@ +diff --git a/foo b/foo +index e69de29..d95f3ad 100644 +--- a/foo ++++ b/foo +@@ -0,0 +1 @@ ++New content diff --git a/t/t5100/sample.mbox b/t/t5100/sample.mbox index de10312..fcc0e0d 100644 --- a/t/t5100/sample.mbox +++ b/t/t5100/sample.mbox @@ -683,3 +683,19 @@ index e69de29..d95f3ad 100644 @@ -0,0 +1 @@ +content +From nobody Mon Sep 17 00:00:00 2001 +From: A U Thor <a.u.thor@example.com> +Subject: A E I O U +Date: Mon, 17 Sep 2012 14:23:44 -0700 +MIME-Version: 1.0 +Content-Type: text/plain; charset="iso-2022-jp" +Content-type: text/plain; charset="UTF-8" + +New content here + +diff --git a/foo b/foo +index e69de29..d95f3ad 100644 +--- a/foo ++++ b/foo +@@ -0,0 +1 @@ ++New content -- 1.7.12.556.gb90bb19 ^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2012-09-17 22:45 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-09-17 16:34 Content-Type handling Tomas Cohen Arazi 2012-09-17 17:18 ` Re*: " Junio C Hamano 2012-09-17 22:28 ` [PATCH] mailinfo: do not concatenate charset= attribute values from mime headers Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).