git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* encoding bug in git.el
@ 2008-05-20 22:09 Karl Hasselström
       [not found] ` <87mymkbo9x.fsf@lysator.liu.se>
  0 siblings, 1 reply; 10+ messages in thread
From: Karl Hasselström @ 2008-05-20 22:09 UTC (permalink / raw)
  To: Clifford Caoile; +Cc: Junio C Hamano, David Kågedal, git

Recently, some commits started misrecording the "ö" in my name. (In
emacs, for example, it looks like this in a utf8 buffer:
Hasselstr\201\366m.) I'm guessing there's an extra latin1->utf8
conversion in there somewhere.

It turns out that the breakage occurs when I commit with the
git-status mode from git.el, and it was introduced by this commit:

  commit dbe48256b41c1e94d81f2458d7e84b1fdcb47026
  Author: Clifford Caoile <piyo@users.sourceforge.net>

      git.el: Set process-environment instead of invoking env

It's in master, but not yet in maint. (In fact, it's the _only_ change
to contrib/emacs that's in master but not in maint.)

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: encoding bug in git.el
       [not found] ` <87mymkbo9x.fsf@lysator.liu.se>
@ 2008-05-21 14:08   ` Clifford Caoile
  2008-05-21 14:54     ` Karl Hasselström
  2008-05-25 13:42     ` Karl Hasselström
  0 siblings, 2 replies; 10+ messages in thread
From: Clifford Caoile @ 2008-05-21 14:08 UTC (permalink / raw)
  To: Karl Hasselström, David Kågedal; +Cc: git, Junio C. Hamano

Hi:

On Wed, May 21, 2008 at 7:31 AM, David Kågedal <davidk@lysator.liu.se> wrote:
> Karl Hasselström <kha@treskal.com> writes:
>
>> Recently, some commits started misrecording the "ö" in my name. (In
>> emacs, for example, it looks like this in a utf8 buffer:
>> Hasselstr\201\366m.) I'm guessing there's an extra latin1->utf8
>> conversion in there somewhere.
>
> The \201 looks more like Emacs' internal mule encoding, where
> everything that isn't ASCII is prefixed with \201 or something
> similar.

Thanks for reporting this.

I concur. This is not UTF-8 translation, but an emacs MULE encoding. I
suspect the U+F6 character is read in to the *git-commit* buffer in
latin-1 mode because git.el displays the Author line, then Emacs
writes that out as 0x81F6, because that is the emacs buffer code of
U+F6.

This is because git.el, upon git-commit-tree, always redefines the
environment variables like GIT_AUTHOR_NAME. However the difference is
that prior to commit dbe482, "env" handle the encoding while commit
dbe482 lets emacs process-environment handle it. Unfortunately the
string is passed without the proper recoding in the latter case.

Here is a proposed fix. I suggest that process-environment should be
given these envvars already encoded as shown in this code sample:

------------------ git.el ------------------
[not a proper git-diff]
@@ -216,6 +216,11 @@ and `git-diff-setup-hook'."
   "Build a list of NAME=VALUE strings from a list of environment strings."
   (mapcar (lambda (entry) (concat (car entry) "=" (cdr entry))) env))

+(defun git-get-env-strings-encoded (env encoding)
+  "Build a list of NAME=VALUE strings from a list of environment strings,
+converting from mule-encoding to ENCODING (e.g. mule-utf-8, latin-1, etc)."
+  (mapcar (lambda (entry) (concat (car entry) "="
(encode-coding-string (cdr entry) encoding))) env))
+
 (defun git-call-process-env (buffer env &rest args)
   "Wrapper for call-process that sets environment strings."
   (let ((process-environment (append (git-get-env-strings env)
@@ -265,7 +270,7 @@ and returns the process output as a string, or nil
if the git failed."

 (defun git-run-command-region (buffer start end env &rest args)
   "Run a git command with specified buffer region as input."
-  (unless (eq 0 (let ((process-environment (append (git-get-env-strings env)
+  (unless (eq 0 (let ((process-environment (append
(git-get-env-strings-encoded env coding-system-for-write)
                                                    process-environment)))
                   (git-run-process-region
                    buffer start end "git" args)))

The buffer text is saved with the encoding coding-system-for-write,
while the GIT_* envvars were not encoded, so when appending to
process-environment variable, use the same encoding.

(Reminder: the *git-commit* buffer's encoding is based on the git
config i18n.commitencoding, which in turn sets
buffer-file-coding-system, which in turn sets coding-system-for-write)

I tested this with U+F6 in the GIT_AUTHOR_NAME, git config user.name,
and the commit text, and it seems to work better (I think it's fixed).
Please review it. Also, I am not sure if this fix needs to be
propagated to the other areas where process-environment is redefined,
so YMMV.

(Lastly, while testing this for Japanese, I'm having some encoding
problem with meadow (Emacs on Windows), msysgit (git on Windows),
set-language-mode Japanese, utf-8, and M-x git-commit-file but I don't
think its related to this exact problem. Hopefully.)

>> It turns out that the breakage occurs when I commit with the
>> git-status mode from git.el, and it was introduced by this commit:
>>
>>   commit dbe48256b41c1e94d81f2458d7e84b1fdcb47026
>>   Author: Clifford Caoile <piyo@users.sourceforge.net>
>>
>>       git.el: Set process-environment instead of invoking env

:-)

This must be the reason why process-environment wasn't used in all places.

>> It's in master, but not yet in maint. (In fact, it's the _only_ change
>> to contrib/emacs that's in master but not in maint.)

Please forgive my ignorance, but what does this mean?

Best regards,
Clifford Caoile

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: encoding bug in git.el
  2008-05-21 14:08   ` Clifford Caoile
@ 2008-05-21 14:54     ` Karl Hasselström
  2008-05-21 21:31       ` Clifford Caoile
  2008-05-25 13:42     ` Karl Hasselström
  1 sibling, 1 reply; 10+ messages in thread
From: Karl Hasselström @ 2008-05-21 14:54 UTC (permalink / raw)
  To: Clifford Caoile; +Cc: David Kågedal, git, Junio C. Hamano

On 2008-05-21 23:08:09 +0900, Clifford Caoile wrote:

> > > It's in master, but not yet in maint. (In fact, it's the _only_
> > > change to contrib/emacs that's in master but not in maint.)
>
> Please forgive my ignorance, but what does this mean?

That the change was committed to the "master" branch, and not the
"maint" branch. So folks who run stable releases haven't seen the bug
yet.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: encoding bug in git.el
  2008-05-21 14:54     ` Karl Hasselström
@ 2008-05-21 21:31       ` Clifford Caoile
  2008-05-23  7:09         ` Karl Hasselström
  0 siblings, 1 reply; 10+ messages in thread
From: Clifford Caoile @ 2008-05-21 21:31 UTC (permalink / raw)
  To: Karl Hasselström; +Cc: David Kågedal, git, Junio C. Hamano

On Wed, May 21, 2008 at 11:54 PM, Karl Hasselström <kha@treskal.com> wrote:
> On 2008-05-21 23:08:09 +0900, Clifford Caoile wrote:
>
>> > > It's in master, but not yet in maint. (In fact, it's the _only_
>> > > change to contrib/emacs that's in master but not in maint.)
>>
>> Please forgive my ignorance, but what does this mean?
>
> That the change was committed to the "master" branch, and not the
> "maint" branch. So folks who run stable releases haven't seen the bug
> yet.

Ok I understand.

Did you test the proposed fix I sent? I would like to know your feedback.

Best regards,
Clifford Caoile

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: encoding bug in git.el
  2008-05-21 21:31       ` Clifford Caoile
@ 2008-05-23  7:09         ` Karl Hasselström
  0 siblings, 0 replies; 10+ messages in thread
From: Karl Hasselström @ 2008-05-23  7:09 UTC (permalink / raw)
  To: Clifford Caoile; +Cc: David Kågedal, git, Junio C. Hamano

On 2008-05-22 06:31:03 +0900, Clifford Caoile wrote:

> Did you test the proposed fix I sent? I would like to know your
> feedback.

No, sorry, I haven't taken the time to do so yet. Will try to do so
this weekend.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: encoding bug in git.el
  2008-05-21 14:08   ` Clifford Caoile
  2008-05-21 14:54     ` Karl Hasselström
@ 2008-05-25 13:42     ` Karl Hasselström
  2008-05-30 12:28       ` Karl Hasselström
  1 sibling, 1 reply; 10+ messages in thread
From: Karl Hasselström @ 2008-05-25 13:42 UTC (permalink / raw)
  To: Clifford Caoile; +Cc: David Kågedal, git, Junio C. Hamano

On 2008-05-21 23:08:09 +0900, Clifford Caoile wrote:

> Here is a proposed fix. I suggest that process-environment should be
> given these envvars already encoded as shown in this code sample:
>
> ------------------ git.el ------------------
> [not a proper git-diff]
> @@ -216,6 +216,11 @@ and `git-diff-setup-hook'."
>    "Build a list of NAME=VALUE strings from a list of environment strings."
>    (mapcar (lambda (entry) (concat (car entry) "=" (cdr entry))) env))
>
> +(defun git-get-env-strings-encoded (env encoding)
> +  "Build a list of NAME=VALUE strings from a list of environment strings,
> +converting from mule-encoding to ENCODING (e.g. mule-utf-8, latin-1, etc)."
> +  (mapcar (lambda (entry) (concat (car entry) "="
> (encode-coding-string (cdr entry) encoding))) env))
> +
>  (defun git-call-process-env (buffer env &rest args)
>    "Wrapper for call-process that sets environment strings."
>    (let ((process-environment (append (git-get-env-strings env)
> @@ -265,7 +270,7 @@ and returns the process output as a string, or nil
> if the git failed."
>
>  (defun git-run-command-region (buffer start end env &rest args)
>    "Run a git command with specified buffer region as input."
> -  (unless (eq 0 (let ((process-environment (append (git-get-env-strings env)
> +  (unless (eq 0 (let ((process-environment (append
> (git-get-env-strings-encoded env coding-system-for-write)
>                                                     process-environment)))
>                    (git-run-process-region
>                     buffer start end "git" args)))
>
> The buffer text is saved with the encoding coding-system-for-write,
> while the GIT_* envvars were not encoded, so when appending to
> process-environment variable, use the same encoding.

I don't claim to understand any of the design issues around this, but
your patch certainly fixes my problem (once I managed to apply it,
which involved working around the lack of headers, non-matching
offsets, and whitespace damage -- luckily it was just two hunks). So:

Tested-by: Karl Hasselström <kha@treskal.com>

Thanks for taking the time.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: encoding bug in git.el
  2008-05-25 13:42     ` Karl Hasselström
@ 2008-05-30 12:28       ` Karl Hasselström
  2008-05-30 20:27         ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Karl Hasselström @ 2008-05-30 12:28 UTC (permalink / raw)
  To: Clifford Caoile; +Cc: David Kågedal, git, Junio C. Hamano

On 2008-05-25 15:42:00 +0200, Karl Hasselström wrote:

> On 2008-05-21 23:08:09 +0900, Clifford Caoile wrote:
>
> > Here is a proposed fix.
>
> I don't claim to understand any of the design issues around this,
> but your patch certainly fixes my problem (once I managed to apply
> it, which involved working around the lack of headers, non-matching
> offsets, and whitespace damage -- luckily it was just two hunks).
> So:
>
> Tested-by: Karl Hasselström <kha@treskal.com>
>
> Thanks for taking the time.

How are things going with this fix? Junio, I expect you're waiting for
a properly cleaned-up patch, possibly with acks from relevant people?

I think it would be a mistake to release 1.5.6 with this bug still in
it; if not this bugfix, then a revert of the offending commit.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: encoding bug in git.el
  2008-05-30 12:28       ` Karl Hasselström
@ 2008-05-30 20:27         ` Junio C Hamano
  2008-06-02 22:41           ` [PATCH] Revert "git.el: Set process-environment instead of invoking env" Karl Hasselström
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2008-05-30 20:27 UTC (permalink / raw)
  To: Karl Hasselström
  Cc: Clifford Caoile, David Kågedal, git, Junio C. Hamano

Karl Hasselström <kha@treskal.com> writes:

> On 2008-05-25 15:42:00 +0200, Karl Hasselström wrote:
>
>> On 2008-05-21 23:08:09 +0900, Clifford Caoile wrote:
>>
>> > Here is a proposed fix.
>>
>> I don't claim to understand any of the design issues around this,
>> but your patch certainly fixes my problem (once I managed to apply
>> it, which involved working around the lack of headers, non-matching
>> offsets, and whitespace damage -- luckily it was just two hunks).
>> So:
>>
>> Tested-by: Karl Hasselström <kha@treskal.com>
>>
>> Thanks for taking the time.
>
> How are things going with this fix? Junio, I expect you're waiting for
> a properly cleaned-up patch, possibly with acks from relevant people?

You expected correctly.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] Revert "git.el: Set process-environment instead of invoking env"
  2008-05-30 20:27         ` Junio C Hamano
@ 2008-06-02 22:41           ` Karl Hasselström
  2008-06-03 15:54             ` David Christensen
  0 siblings, 1 reply; 10+ messages in thread
From: Karl Hasselström @ 2008-06-02 22:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Clifford Caoile, David Kågedal

This reverts commit dbe48256b41c1e94d81f2458d7e84b1fdcb47026, which
caused mis-encoding of non-ASCII author/committer names when the
git-status mode is used to create commits.

Signed-off-by: Karl Hasselström <kha@treskal.com>

---

On 2008-05-30 13:27:43 -0700, Junio C Hamano wrote:

> Karl Hasselström <kha@treskal.com> writes:
> 
> > How are things going with this fix? Junio, I expect you're waiting
> > for a properly cleaned-up patch, possibly with acks from relevant
> > people?
> 
> You expected correctly.

In case no one who understands how, why, and whether the fix works
comes forward, here's a revert of the commit that introduced the
problem.

 contrib/emacs/git.el |   11 +++++++----
 1 files changed, 7 insertions(+), 4 deletions(-)


diff --git a/contrib/emacs/git.el b/contrib/emacs/git.el
index 2557a76..4fa853f 100644
--- a/contrib/emacs/git.el
+++ b/contrib/emacs/git.el
@@ -232,8 +232,10 @@ and returns the process output as a string, or nil if the git failed."
 
 (defun git-run-command-region (buffer start end env &rest args)
   "Run a git command with specified buffer region as input."
-  (unless (eq 0 (let ((process-environment (append (git-get-env-strings env)
-                                                   process-environment)))
+  (unless (eq 0 (if env
+                    (git-run-process-region
+                     buffer start end "env"
+                     (append (git-get-env-strings env) (list "git") args))
                   (git-run-process-region
                    buffer start end "git" args)))
     (error "Failed to run \"git %s\":\n%s" (mapconcat (lambda (x) x) args " ") (buffer-string))))
@@ -248,8 +250,9 @@ and returns the process output as a string, or nil if the git failed."
             (erase-buffer)
             (cd dir)
             (setq status
-                  (let ((process-environment (append (git-get-env-strings env)
-                                                     process-environment)))
+                  (if env
+                      (apply #'call-process "env" nil (list buffer t) nil
+                             (append (git-get-env-strings env) (list hook-name) args))
                     (apply #'call-process hook-name nil (list buffer t) nil args))))
           (display-message-or-buffer buffer)
           (eq 0 status)))))

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] Revert "git.el: Set process-environment instead of invoking env"
  2008-06-02 22:41           ` [PATCH] Revert "git.el: Set process-environment instead of invoking env" Karl Hasselström
@ 2008-06-03 15:54             ` David Christensen
  0 siblings, 0 replies; 10+ messages in thread
From: David Christensen @ 2008-06-03 15:54 UTC (permalink / raw)
  To: Karl Hasselström
  Cc: Junio C Hamano, git, Clifford Caoile, David Kågedal

> This reverts commit dbe48256b41c1e94d81f2458d7e84b1fdcb47026, which
> caused mis-encoding of non-ASCII author/committer names when the
> git-status mode is used to create commits.
>
> Signed-off-by: Karl Hasselström <kha@treskal.com>
>
> ---
>
> On 2008-05-30 13:27:43 -0700, Junio C Hamano wrote:
>
>> Karl Hasselström <kha@treskal.com> writes:
>>
>>> How are things going with this fix? Junio, I expect you're waiting
>>> for a properly cleaned-up patch, possibly with acks from relevant
>>> people?
>>
>> You expected correctly.
>
> In case no one who understands how, why, and whether the fix works
> comes forward, here's a revert of the commit that introduced the
> problem.

This likely is due to the process-coding-system selected by emacs;  
the correct functioning of this command will rely on both the current  
buffer's coding system and the coding system of the data returned by  
the invocation of git-status.  In order for this to function  
properly, these should match.  Both of these are variables which can  
be customized local to the buffer as part of the routine, so this  
could be fixed if we are able to determine at invocation time what  
coding system the git-status command will return in (presumably some  
form of utf-8, but I believe this is configurable per repo).

I'd be glad to take a more in-depth look at this, but I'm not up on  
the code at this point.

Regards,

David
--
David Christensen
End Point Corporation
david@endpoint.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2008-06-03 15:55 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-20 22:09 encoding bug in git.el Karl Hasselström
     [not found] ` <87mymkbo9x.fsf@lysator.liu.se>
2008-05-21 14:08   ` Clifford Caoile
2008-05-21 14:54     ` Karl Hasselström
2008-05-21 21:31       ` Clifford Caoile
2008-05-23  7:09         ` Karl Hasselström
2008-05-25 13:42     ` Karl Hasselström
2008-05-30 12:28       ` Karl Hasselström
2008-05-30 20:27         ` Junio C Hamano
2008-06-02 22:41           ` [PATCH] Revert "git.el: Set process-environment instead of invoking env" Karl Hasselström
2008-06-03 15:54             ` David Christensen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).