git.vger.kernel.org archive mirror
* Problems with format-patch UTF-8 and a missing second empty line
@ 2011-09-15  9:45 Ingo Ruhnke
  2011-09-15 15:17 ` Jeff King
  2011-09-15 19:01 ` Jeff King
  0 siblings, 2 replies; 6+ messages in thread
From: Ingo Ruhnke @ 2011-09-15  9:45 UTC
  To: git

Creating a patch of a commit including UTF-8 and no empty second line,
like this:

mkdir foobar
cd foobar
git init
echo "Hello World" > file
git add file
git commit -m "ÄÖÜ
ÄÖÜ" file
git format-patch --root HEAD --stdout

Results in this:

From f4f889bad560c479a70fbf5f70a4239576001262 Mon Sep 17 00:00:00 2001
From: Ingo Ruhnke <grumbel@gmail.com>
Date: Thu, 15 Sep 2011 11:25:11 +0200
Subject: [PATCH] =?UTF-8?q?=C3=84=C3=96=C3=9C
=20=C3=84=C3=96=C3=9C?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
...

Trying to apply the patch then results in this:

$ git am /tmp/foobar/0001-.patch
Applying: =?UTF-8?q?=C3=84=C3=96=C3=9C
applying to an empty history
$ git log
commit 6f27fc51a8c52fdb595131a934d8d56c9df3b5c0
Author: Ingo Ruhnke <grumbel@gmail.com>
Date:   Thu Sep 15 11:27:47 2011 +0200

    =?UTF-8?q?=C3=84=C3=96=C3=9C

    =20=C3=84=C3=96=C3=9C?=
    MIME-Version: 1.0
    Content-Type: text/plain; charset=UTF-8
    Content-Transfer-Encoding: 8bit

The UTF-8 stuff doesn't get decoded and the log message ends up broken.

The problem seems to already start with just the lack of an empty second line:

mkdir foobar
cd foobar
git init
echo "Hello World" > file
git add file
git commit -m "ABC
ABC" file
git format-patch --root HEAD --stdout

From 3abc0e59abc4c9343d22e79575e02910073d1013 Mon Sep 17 00:00:00 2001
From: Ingo Ruhnke <grumbel@gmail.com>
Date: Thu, 15 Sep 2011 11:31:03 +0200
Subject: [PATCH] ABC
 ABC

$ git am /tmp/foobar/0001-ABC.patch
Applying: ABC ABC
applying to an empty history
ingo@duo:/tmp/5/bar$ git log | cat
commit eb8a9e9a1421ae6d930d99bfb8f2eab47349c387
Author: Ingo Ruhnke <grumbel@gmail.com>
Date:   Thu Sep 15 11:31:03 2011 +0200

    ABC ABC

Here the newline between ABC\nABC gets stripped out and replaced with
a space when transferring the commit with format-patch from one
repository to another.

Inserting an empty second line in the commit message makes both
problems go away.

Another small issue is that the filename of the patch will strip out
any UTF-8 characters. Thus a commit message of "123Äöü456" will result
in "0001-123-456.patch".

The problems happen with git version 1.7.4.1 (4b5eac7f0) on Ubuntu 11.04.

-- 
Blog:     http://grumbel.blogspot.com/
JabberID: xmpp:grumbel@jabber.org
ICQ:      59461927


* Re: Problems with format-patch UTF-8 and a missing second empty line
  2011-09-15  9:45 Problems with format-patch UTF-8 and a missing second empty line Ingo Ruhnke
@ 2011-09-15 15:17 ` Jeff King
       [not found]   ` <20110915224456.14410ed8@zappedws>
  2011-09-15 19:01 ` Jeff King
  1 sibling, 1 reply; 6+ messages in thread
From: Jeff King @ 2011-09-15 15:17 UTC
  To: Ingo Ruhnke; +Cc: git

On Thu, Sep 15, 2011 at 11:45:15AM +0200, Ingo Ruhnke wrote:

> Creating a patch of a commit including UTF-8 and no empty second line,
> like this:
> [...]
> Results in this:
> [...]
> Subject: [PATCH] =?UTF-8?q?=C3=84=C3=96=C3=9C
> =20=C3=84=C3=96=C3=9C?=
>[....]
> The problems happen with git version 1.7.4.1 (4b5eac7f0) on Ubuntu 11.04.

I'm pretty sure I fixed this in a1f6baa, which is in v1.7.4.4 and later.

-Peff


* Re: Problems with format-patch UTF-8 and a missing second empty line
       [not found]   ` <20110915224456.14410ed8@zappedws>
@ 2011-09-15 18:50     ` Jeff King
  2011-09-15 20:05       ` Alexey Shumkin
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2011-09-15 18:50 UTC
  To: Alexey Shumkin; +Cc: Ingo Ruhnke, git

[resending with git@vger cc'd; please keep discussion on list]

On Thu, Sep 15, 2011 at 10:44:56PM +0400, Alexey Shumkin wrote:

> > On Thu, Sep 15, 2011 at 11:45:15AM +0200, Ingo Ruhnke wrote:
> > 
> > > Creating a patch of a commit including UTF-8 and no empty second
> > > line, like this:
> > > [...]
> > > Results in this:
> > > [...]
> > > Subject: [PATCH] =?UTF-8?q?=C3=84=C3=96=C3=9C
> > > =20=C3=84=C3=96=C3=9C?=
> > >[....]
> > > The problems happen with git version 1.7.4.1 (4b5eac7f0) on Ubuntu
> > > 11.04.
> > 
> > I'm pretty sure I fixed this in a1f6baa, which is in v1.7.4.4 and
> > later.
> 
> I reproduced this bug with the latest git (v1.7.6.3).
> It seems to me this is not a "git format-patch" bug
> but a "git am" one. (But that is only a guess.)

Can you be more specific about what you tested? Running Ingo's snippet
with a more recent git produces:

  Subject: [PATCH] =?UTF-8?q?=C3=84=C3=96=C3=9C=20=C3=84=C3=96=C3=9C?=

which is right (and "git am", new or old, will apply it just fine).
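To make the difference concrete, here is a minimal sketch of how an
RFC 2047 decoder sees the two forms. It uses Python's standard
email.header module rather than git's own mailinfo code, so it only
illustrates the encoding; decode_subject is a name made up for this
example. The complete encoded-word decodes, while the first line of
the split form has no closing "?=" and comes back untouched.

```python
from email.header import decode_header, make_header


def decode_subject(raw: str) -> str:
    """Decode RFC 2047 encoded-words in a header value (hypothetical helper)."""
    return str(make_header(decode_header(raw)))


# Subject as emitted by a fixed format-patch: one complete encoded-word.
good = "[PATCH] =?UTF-8?q?=C3=84=C3=96=C3=9C=20=C3=84=C3=96=C3=9C?="

# First line of the split subject from the original report: the
# encoded-word is never terminated on this line.
broken_first_line = "[PATCH] =?UTF-8?q?=C3=84=C3=96=C3=9C"

print(decode_subject(good))
print(decode_subject(broken_first_line))
```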

But there may be a different, related bug lurking somewhere.

-Peff


* Re: Problems with format-patch UTF-8 and a missing second empty line
  2011-09-15  9:45 Problems with format-patch UTF-8 and a missing second empty line Ingo Ruhnke
  2011-09-15 15:17 ` Jeff King
@ 2011-09-15 19:01 ` Jeff King
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff King @ 2011-09-15 19:01 UTC
  To: Ingo Ruhnke; +Cc: git

On Thu, Sep 15, 2011 at 11:45:15AM +0200, Ingo Ruhnke wrote:

> Creating a patch of a commit including UTF-8 and no empty second line,
> like this:

I already responded about the bug with utf8-encoded subjects, but let me
address the second half of your mail, too:

> Here the newline between ABC\nABC gets stripped out and replaced with
> a space when transferring the commit with format-patch from one
> repository to another.

This is by design. Git commit messages are intended to have a
single-line subject, followed by a blank line, followed by more
elaboration. A multi-line subject is treated as a single line that has
been line-broken, and is subject to being reflowed onto a single line.
This is done to help with commits imported from other version control
systems which don't follow this pattern (the other option is truncating
the subject and putting the other lines into the "body", but that often
ends up quite unreadable).
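The reflow rule can be sketched roughly as follows. This is a
simplified model in Python, not git's actual implementation, and
split_subject_body is a hypothetical name: every line up to the first
blank line is treated as part of the subject and joined with single
spaces, and everything after the blank line is the body.

```python
from typing import Tuple


def split_subject_body(message: str) -> Tuple[str, str]:
    """Split a commit message into (subject, body), reflowing the subject.

    Simplified model: the subject is the first paragraph with its line
    breaks replaced by spaces; the body is whatever follows the blank
    line(s) after it.
    """
    lines = message.strip("\n").split("\n")

    # Collect the first paragraph (everything before the first blank line).
    subject_lines = []
    i = 0
    while i < len(lines) and lines[i].strip():
        subject_lines.append(lines[i].strip())
        i += 1

    # Skip the blank separator line(s).
    while i < len(lines) and not lines[i].strip():
        i += 1

    return " ".join(subject_lines), "\n".join(lines[i:])


print(split_subject_body("ABC\nABC"))    # no blank second line
print(split_subject_body("ABC\n\nABC"))  # blank second line present
```

With no blank second line both lines end up in the subject, which is
the "ABC ABC" seen in the report; with the blank line the second "ABC"
stays in the body.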

If you really want to retain the newlines across "format-patch | am",
use the "-k" option of both to preserve the subject (I don't recall the
details, but I think you need a more recent version of git for
format-patch to correctly encode this, but "am" can be from any
version).

> Another small issue is that the filename of the patch will strip out
> any UTF-8 characters, Thus a commit message of "123Äöü456" will result
> in "0001-123-456.patch".

Yes, it's an attempt to strip out characters that some filesystems might
not support well. We could probably enable high-bit characters with a
config option (maybe even just using core.quotepath).
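As an illustration of that stripping, here is a rough Python
approximation. It is not git's real sanitizing code, and the exact set
of characters kept (ASCII letters, digits, "." and "_") is an
assumption made for this sketch; patch_filename is a hypothetical
name. Runs of any other characters collapse into a single "-".

```python
import string

# Assumption for this sketch: only these ASCII characters survive.
SAFE_CHARS = set(string.ascii_letters + string.digits + "._")


def patch_filename(number: int, subject: str) -> str:
    """Build a format-patch style filename from a subject (approximation)."""
    out = []
    for ch in subject:
        if ch in SAFE_CHARS:
            out.append(ch)
        elif out and out[-1] != "-":
            # Collapse a run of unsafe characters into one dash, and
            # never start the name with a dash.
            out.append("-")
    name = "".join(out).rstrip("-")
    return f"{number:04d}-{name}.patch"


print(patch_filename(1, "123Äöü456"))
print(patch_filename(1, "ÄÖÜ"))
print(patch_filename(1, "ABC"))
```

Under these assumptions the three subjects from the report give
0001-123-456.patch, 0001-.patch and 0001-ABC.patch.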

-Peff


* Re: Problems with format-patch UTF-8 and a missing second empty line
  2011-09-15 18:50     ` Jeff King
@ 2011-09-15 20:05       ` Alexey Shumkin
  2011-09-15 20:33         ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Alexey Shumkin @ 2011-09-15 20:05 UTC
  To: Jeff King; +Cc: Ingo Ruhnke, git

> > 
> > I reproduced this bug with the latest git (v1.7.6.3).
> > It seems to me this is not a "git format-patch" bug
> > but a "git am" one. (But that is only a guess.)
> 
> Can you be more specific about what you tested? Running Ingo's snippet
> with a more recent git produces:
> 
>   Subject: [PATCH] =?UTF-8?q?=C3=84=C3=96=C3=9C=20=C3=84=C3=96=C3=9C?=
> 
> which is right (and "git am", new or old, will apply it just fine).
> 
> But there may be a different, related bug lurking somewhere.
> 
> -Peff
> 
These are my steps (log from the terminal):

$ mkdir git-format-patch
Initialized empty Git repository in /home/Alex/tmp/git-format-patch/.git/

$ cd git-format-patch

$ echo file content > file

$ git add -vf file
add 'file'

$ git commit -a -m 'коммит: строка-1
> коммит: строка-2'

[master (root-commit) 7ede929] коммит: строка-1 коммит: строка-2
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 file

$ git log
commit 7ede9291cf2d160721bcd8362d8d0f6c6e28cf29
Author: Alexey Shumkin <zapped@mail.ru>
Date:   Thu Sep 15 23:18:26 2011 +0400

    коммит: строка-1
    коммит: строка-2

$ git format-patch --root HEAD
0001-1.patch

$ cat 0001-1.patch 
From 7ede9291cf2d160721bcd8362d8d0f6c6e28cf29 Mon Sep 17 00:00:00 2001
From: Alexey Shumkin <zapped@mail.ru>
Date: Thu, 15 Sep 2011 23:18:26 +0400
Subject: [PATCH]
=?UTF-8?q?=D0=BA=D0=BE=D0=BC=D0=BC=D0=B8=D1=82:=20=D1=81=D1?=
=?UTF-8?q?=82=D1=80=D0=BE=D0=BA=D0=B0-1=20=D0=BA=D0=BE=D0=BC=D0=BC=D0=B8=D1?=
=?UTF-8?q?=82:=20=D1=81=D1=82=D1=80=D0=BE=D0=BA=D0=B0-2?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 file |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 file

diff --git a/file b/file
new file mode 100644
index 0000000..dd59d09
--- /dev/null
+++ b/file
@@ -0,0 +1 @@
+file content
-- 
1.7.6.3.4.gf71f

$ git init ../git-format-patch-am
Initialized empty Git repository in /home/Alex/tmp/git-format-patch-am/.git/

$ cd ../git-format-patch-am

$ git am < ../git-format-patch/0001-1.patch
Applying: коммит: строка-1 коммит: строка-2
applying to an empty history

$ git log
commit 9856238e06d4ca8faeefc48e5c80e8ef7bd34195
Author: Alexey Shumkin <zapped@mail.ru>
Date:   Thu Sep 15 23:18:26 2011 +0400

    коммит: строка-1 коммит: строка-2

$ git --version
git version 1.7.6.3.4.gf71f



But as you said:
>>This is by design. Git commit messages are intended to have a
>>single-line subject, followed by a blank line, followed by more
>>elaboration

and it is solved by using "-k" with both the "format-patch" and "am" commands.


* Re: Problems with format-patch UTF-8 and a missing second empty line
  2011-09-15 20:05       ` Alexey Shumkin
@ 2011-09-15 20:33         ` Jeff King
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff King @ 2011-09-15 20:33 UTC
  To: Alexey Shumkin; +Cc: Ingo Ruhnke, git

On Fri, Sep 16, 2011 at 12:05:15AM +0400, Alexey Shumkin wrote:

> But as you said
> >>This is by design. Git commit messages are intended to have a
> >>single-line subject, followed by a blank line, followed by more
> >>elaboration
> 
> and solved with "-k" for both "format-patch" and "am" commands

OK, that makes sense to me, then. I didn't read Ingo's first message
carefully enough, but your response made me scratch my head and read it
again. Thanks for the sanity check.

-Peff
