* MIME problem when using git format-patch / git am
From: Øyvind A. Holm @ 2010-04-25 23:35 UTC (permalink / raw)
To: git; +Cc: sunny
I have a problem when using `git format-patch` and `git am` when there
are 8-bit (i.e. UTF-8) characters in the log message from line three and
below. I have attached a script (`runme.sh`) for reproducing this. It is
also available from <http://gist.github.com/378785> in case the
attachment is mangled.
My output is this:
$ ./runme.sh
LANG=en_GB.UTF-8
LANGUAGE=en_GB.UTF-8
LC_COLLATE=C
LC_CTYPE=en_GB.utf8
Initialized empty Git repository in /home/sunny/src/git/til_git-lista/dir/.git/
[master (root-commit) 2755112] Initial commit
======== Create commits
[master 9ab743b] First commit without 8-bit chars
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 foo.txt
[master dd5bdf2] Second commit with © in first line of logmsg
1 files changed, 1 insertions(+), 0 deletions(-)
[master 82a445a] Third commit with no 8-bit in first line but €uro further down
1 files changed, 1 insertions(+), 0 deletions(-)
[master 42881f1] Fourth commit with © in first line again
1 files changed, 1 insertions(+), 0 deletions(-)
======== git format-patch firstrev
0001-First-commit-without-8-bit-chars.patch
0002-Second-commit-with-in-first-line-of-logmsg.patch
0003-Third-commit-with-no-8-bit-in-first-line.patch
0004-Fourth-commit-with-in-first-line-again.patch
======== Create new, empty branch and apply patches
Switched to a new branch 'patches'
Applying: First commit without 8-bit chars
Applying: Second commit with © in first line of logmsg
Applying: =?UTF-8?q?Third=20commit=20with=20no=208-bit=20in=20first=20line
Applying: Fourth commit with © in first line again
======== git log
commit 58bcf14aee4b17152ae1f8bd40a24141e93897ec
Author: Øyvind A. Holm <sunny@sunbase.org>
Date: 0 seconds ago
Fourth commit with © in first line again
commit 56e98bd6b161510687abd658728f92b3bd85baf1
Author: Øyvind A. Holm <sunny@sunbase.org>
Date: 1 seconds ago
=?UTF-8?q?Third=20commit=20with=20no=208-bit=20in=20first=20line
=20but=20=E2=82=ACuro=20further=20down?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
commit 55de5c9008c53a388439f0a0cdb2043d9a36ba6b
Author: Øyvind A. Holm <sunny@sunbase.org>
Date: 1 seconds ago
Second commit with © in first line of logmsg
commit af6bf42a07208bc31d537a50562a5997f02e58cb
Author: Øyvind A. Holm <sunny@sunbase.org>
Date: 1 seconds ago
First commit without 8-bit chars
commit 2755112f613687510803e45868751e7b85e3cd1e
Author: Øyvind A. Holm <sunny@sunbase.org>
Date: 1 seconds ago
Initial commit
$
If it’s messed up (wordwrap and such), it’s also available from
<http://gist.github.com/raw/378785/output.txt>.
In this case the log message of the third commit (56e98) is unreadable.
This only happens when the log message contains non-ASCII characters
(U+0080 and above) but the first line does not; the first line is what
ends up as the subject. Yes, I’ve also tried removing the "Ø" from the
author name, with similar results.
I’m using `git format-patch`/`git am` as an easy way to import/export
commits between repositories and directories, and I suspect that these
commands are not really intended for that kind of use. Is there some
other method, or some magic command-line option, that makes this
possible, or is this a bug?
+-| Øyvind A. Holm <sunny@sunbase.org> - N 60.39548° E 5.31735° |-+
| OpenPGP: 0xFB0CBEE894A506E5 - http://www.sunbase.org/pubkey.asc |
| Fingerprint: A006 05D6 E676 B319 55E2 E77E FB0C BEE8 94A5 06E5 |
+------------| 1f818370-50c1-11df-ae95-90e6ba3022ac |-------------+
[-- Attachment #1.2: runme.sh --]
[-- Type: application/x-sh, Size: 941 bytes --]
* Re: MIME problem when using git format-patch / git am
From: Jonathan Nieder @ 2010-04-26 1:49 UTC (permalink / raw)
To: Øyvind A. Holm, git
Hi,
Øyvind A. Holm wrote:
| git commit -m "Initial commit" --allow-empty
| git tag firstrev
| echo First line >foo.txt
| git add foo.txt
| git commit -m "First commit without 8-bit chars"
| echo Second line >>foo.txt
| git commit -m "Second commit with © in first line of logmsg" -a
| echo Third line >>foo.txt
| git commit -m "Third commit with no 8-bit in first line`echo; echo but €uro further down`" -a
| echo Fourth line >>foo.txt
| git commit -m "Fourth commit with © in first line again" -a
| git format-patch firstrev
| git checkout -b patches firstrev
| git am 0*
| git log
[...]
| Applying: =?UTF-8?q?Third=20commit=20with=20no=208-bit=20in=20first=20line
| Applying: Fourth commit with © in first line again
On this machine, it’s even worse:
=?UTF-8?q?Third=20commit=20with=20no=208-bit=20in=20first=20line
=20but=20=E2=82=ACuro=20further=20down?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
An encoded-word [1] is defined to be at most 75 characters long and not
to contain whitespace. On the other hand, multiple encoded-words
within a field are required to be separated by whitespace.
[1] http://tools.ietf.org/html/rfc2047
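To make the two constraints concrete, here is a minimal sketch in C of
a check for them. looks_like_encoded_word() is a hypothetical helper
written for this mail, not something that exists in git, and it only
tests the overall shape (length, no whitespace, four '?' delimiters),
not the validity of the charset or the encoded text.

```c
#include <stdbool.h>
#include <string.h>

/* RFC 2047 limit on the length of one encoded-word, delimiters included. */
#define MAX_ENCODED_WORD_LEN 75

/*
 * Hypothetical helper (not part of git): does `s` have the outer shape
 * "=?charset?encoding?encoded-text?=" and respect the two constraints
 * discussed above (at most 75 characters, no whitespace)?
 */
static bool looks_like_encoded_word(const char *s)
{
	size_t len = strlen(s);
	size_t i;
	int question_marks = 0;

	if (len < 4 || len > MAX_ENCODED_WORD_LEN)
		return false;
	if (strncmp(s, "=?", 2) != 0 || strcmp(s + len - 2, "?=") != 0)
		return false;
	for (i = 0; i < len; i++) {
		if (s[i] == ' ' || s[i] == '\t' || s[i] == '\r' || s[i] == '\n')
			return false;
		if (s[i] == '?')
			question_marks++;
	}
	/* "=?charset?encoding?text?=" contains exactly four '?'. */
	return question_marks == 4;
}
```

The subject shown above fails this check twice over: it contains a
newline, and it is longer than 75 characters.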
This leads to a question: what if one wants to include a word that
quotes to more than 75 characters? How about more than 997 ASCII
characters without whitespace? No can do.
Ideally, I would like to see git quoting single words, though I admit
I have not seen how well various user agents cope with this.
Maybe this patch would give you some joy until then. Feel free to pick
it up and take it somewhere more useful.
Thanks,
Jonathan
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
diff --git a/pretty.c b/pretty.c
index d493cad..b822e24 100644
--- a/pretty.c
+++ b/pretty.c
@@ -131,7 +131,7 @@ needquote:
 		 * many programs do not understand this and just
 		 * leave the underscore in place.
 		 */
-		if (is_rfc2047_special(ch) || ch == ' ') {
+		if (is_rfc2047_special(ch) || ch == ' ' || ch == '\n') {
 			strbuf_add(sb, line + last, i - last);
 			strbuf_addf(sb, "=%02X", ch);
 			last = i + 1;
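
For anyone who wants to see the effect of the change outside of git,
here is a rough standalone sketch of the same idea. needs_q_escape()
and q_encode() are hypothetical names, and the set of escaped bytes is
my assumption of a reasonable one rather than a copy of
is_rfc2047_special(). Like the patch, it puts everything into a single
encoded-word and does nothing about the 75-character limit.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the condition in the patch: escape
 * non-ASCII bytes, the characters with a special meaning in the
 * Q encoding, spaces, and (the new part) newlines.
 */
static int needs_q_escape(unsigned char ch)
{
	return ch >= 0x80 || ch == '=' || ch == '?' || ch == '_' ||
	       ch == ' ' || ch == '\n';
}

/*
 * Q-encode `in` as one encoded-word labelled with `charset`.
 * Returns the number of bytes written (excluding the NUL),
 * or -1 if `out` is too small.
 */
static int q_encode(const char *charset, const char *in,
		    char *out, size_t outsz)
{
	size_t pos;
	int n = snprintf(out, outsz, "=?%s?q?", charset);

	if (n < 0 || (size_t)n >= outsz)
		return -1;
	pos = (size_t)n;
	for (; *in; in++) {
		unsigned char ch = (unsigned char)*in;

		if (needs_q_escape(ch)) {
			if (pos + 4 > outsz)	/* "=XX" plus NUL */
				return -1;
			snprintf(out + pos, outsz - pos, "=%02X", (unsigned)ch);
			pos += 3;
		} else {
			if (pos + 2 > outsz)	/* one byte plus NUL */
				return -1;
			out[pos++] = (char)ch;
		}
	}
	if (pos + 3 > outsz)			/* "?=" plus NUL */
		return -1;
	memcpy(out + pos, "?=", 3);
	return (int)(pos + 2);
}
```

With the newline escaped as =0A, a two-line subject stays within one
encoded-word instead of being broken in the middle.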
* Re: MIME problem when using git format-patch / git am
From: Peter Krefting @ 2010-04-26 8:50 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: Øyvind A. Holm, git
Jonathan Nieder:
> This leads to a question: what if one wants to include a word that quotes
> to more than 75 characters?
You just split it into two encoded words. The whitespace between the
encoded words is there for syntactic reasons only; it is not included in the
final result if both parts are encoded words (look at the "Examples" section
of the RFC for some examples of how it works).
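
As a rough illustration of that rule, here is a small decoder sketch in
C. The names q_payload() and decode_header() are made up for this
example, it only understands the Q encoding, it ignores the charset
label and copies the decoded bytes through unchanged, and it collapses
any other run of whitespace to a single space, so treat it as a sketch
under those assumptions and not as a full RFC 2047 decoder.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* If tok[0..len) looks like "=?charset?q?text?=", report the text span. */
static bool q_payload(const char *tok, size_t len,
		      const char **text, size_t *text_len)
{
	const char *p, *end = tok + len;

	if (len < 8 || memcmp(tok, "=?", 2) || memcmp(end - 2, "?=", 2))
		return false;
	p = memchr(tok + 2, '?', len - 4);	/* end of the charset */
	if (!p || p + 3 > end - 2 || (p[1] != 'q' && p[1] != 'Q') || p[2] != '?')
		return false;
	*text = p + 3;
	*text_len = (size_t)(end - 2 - (p + 3));
	return true;
}

/* Decode a header value into `out`; returns false if `out` is too small. */
static bool decode_header(const char *in, char *out, size_t outsz)
{
	size_t pos = 0;
	bool prev_encoded = false, first = true;

	if (outsz == 0)
		return false;
	while (*in) {
		const char *tok, *text;
		size_t len, text_len, i;
		bool encoded;

		while (*in && isspace((unsigned char)*in))
			in++;
		if (!*in)
			break;
		tok = in;
		while (*in && !isspace((unsigned char)*in))
			in++;
		len = (size_t)(in - tok);
		encoded = q_payload(tok, len, &text, &text_len);

		/* Whitespace between two encoded-words is dropped. */
		if (!first && !(prev_encoded && encoded)) {
			if (pos + 1 >= outsz)
				return false;
			out[pos++] = ' ';
		}
		if (!encoded) {
			text = tok;
			text_len = len;
		}
		for (i = 0; i < text_len; i++) {
			char ch = text[i];

			if (encoded && ch == '=' && i + 2 < text_len &&
			    isxdigit((unsigned char)text[i + 1]) &&
			    isxdigit((unsigned char)text[i + 2])) {
				char hex[3] = { text[i + 1], text[i + 2], '\0' };

				ch = (char)strtol(hex, NULL, 16);
				i += 2;
			} else if (encoded && ch == '_') {
				ch = ' ';	/* '_' stands for a space */
			}
			if (pos + 1 >= outsz)
				return false;
			out[pos++] = ch;
		}
		prev_encoded = encoded;
		first = false;
	}
	out[pos] = '\0';
	return true;
}
```

Fed "=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=" this produces "ab", while
"=?ISO-8859-1?Q?a?= b" produces "a b", matching the examples in the
RFC.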
--
\\// Peter - http://www.softwolves.pp.se/
* Re: MIME problem when using git format-patch / git am
From: Jonathan Nieder @ 2010-04-27 15:32 UTC (permalink / raw)
To: Peter Krefting; +Cc: Øyvind A. Holm, git
Peter Krefting wrote:
> Jonathan Nieder:
>> This leads to a question: what if one wants to include a word that
>> quotes to more than 75 characters?
>
> You just split it into two encoded words. The whitespace between the
> encoded words is there for syntactic reasons only; it is not
> included in the final result if both parts are encoded words (look
> at the "Examples" section of the RFC for some examples of how it
> works).
Thank you. The RFC explains it before that section, too; I should be
reading more carefully.
I’ve been working through a full implementation of that RFC, but it
seems that to support breaking words, Git would need to learn where the
character boundaries are, which means encodings would no longer be just
opaque strings as far as Git is concerned.
On the bright side, the only encoding I care about is UTF-8, and that
one’s easy.
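
To show what I mean by easy, here is a minimal sketch. It relies on the
fact that every continuation byte in UTF-8 has the bit pattern
10xxxxxx, so one can back up from a proposed split point until the
next chunk no longer starts with a continuation byte. The function
names are hypothetical, and the input is assumed to be valid UTF-8.

```c
#include <stddef.h>

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
static int is_utf8_continuation(unsigned char ch)
{
	return (ch & 0xC0) == 0x80;
}

/*
 * Return the largest n <= max_bytes such that cutting s[0..len) after
 * n bytes does not split a multi-byte character, i.e. the byte that
 * would start the next chunk is not a continuation byte.
 */
static size_t utf8_split_point(const char *s, size_t len, size_t max_bytes)
{
	size_t n = len < max_bytes ? len : max_bytes;

	while (n > 0 && n < len && is_utf8_continuation((unsigned char)s[n]))
		n--;
	return n;
}
```

An encoder that has to break a long word into several encoded-words
could call something like this on the raw bytes before quoting each
chunk.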
Jonathan