* Encoding problem on OSX?
[not found] <AANLkTikh12guRxCK2Vf=WvshzX8P-fYTyu3qxYWNJ2px@mail.gmail.com>
@ 2010-08-09 13:58 ` İsmail Dönmez
2010-08-09 23:46 ` Jonathan Nieder
0 siblings, 1 reply; 14+ messages in thread
From: İsmail Dönmez @ 2010-08-09 13:58 UTC (permalink / raw)
To: git
Hi all;
On master & maint branch, t4201-shortlog.sh test 2 fails with:
expecting success:
git shortlog HEAD >log &&
fuzz log >log.predictable &&
test_cmp expect.template log.predictable
--- expect.template 2010-08-09 13:45:46.000000000 +0000
+++ log.predictable 2010-08-09 13:45:46.000000000 +0000
@@ -1,8 +1,8 @@
A U Thor (5):
SUBJECT
SUBJECT
- SUBJECT
- SUBJECT
+ SUBJECT𝄞s 𝄞s a very, very long f𝄞rst l𝄞ne for the comm𝄞t
message to see 𝄞f 𝄞t 𝄞s wrapped correctly
+ SUBJECT????s ????s a very, very long f????rst l????ne for the
comm????t message to see ????f ????t ????s wrapped correctly
SUBJECT
I am not sure if this is a known problem so I am reporting it.
Regards,
ismail
* Re: Encoding problem on OSX?
2010-08-09 13:58 ` Encoding problem on OSX? İsmail Dönmez
@ 2010-08-09 23:46 ` Jonathan Nieder
2010-08-10 5:52 ` İsmail Dönmez
0 siblings, 1 reply; 14+ messages in thread
From: Jonathan Nieder @ 2010-08-09 23:46 UTC (permalink / raw)
To: İsmail Dönmez; +Cc: git
İsmail Dönmez wrote:
> git shortlog HEAD >log &&
> fuzz log >log.predictable &&
> test_cmp expect.template log.predictable
>
> --- expect.template 2010-08-09 13:45:46.000000000 +0000
> +++ log.predictable 2010-08-09 13:45:46.000000000 +0000
> @@ -1,8 +1,8 @@
> A U Thor (5):
> SUBJECT
> SUBJECT
> - SUBJECT
> - SUBJECT
> + SUBJECT𝄞s 𝄞s a very, very long f𝄞rst l𝄞ne for the comm𝄞t
> message to see 𝄞f 𝄞t 𝄞s wrapped correctly
> + SUBJECT????s ????s a very, very long f????rst l????ne for the
> comm????t message to see ????f ????t ????s wrapped correctly
> SUBJECT
Very interesting; thanks for the report.
From the definition of fuzz(), it looks like
sed "
s/$_x40/OBJECT_NAME/g
s/$_x05/OBJID/g
s/^ \{6\}[CTa].*/ SUBJECT/g
s/^ \{8\}[^ ].*/ CONTINUATION/g
" <log >log.fuzzy
failed to completely match the fourth and fifth lines of the shortlog:
A U Thor (5):
Test
This is a very, very long first[etc]
Th𝄞s 𝄞s a very, very long f𝄞rst[etc]
Th<malformed treble clef>s <malformed treble clef>s a...
Could you confirm this? What does
locale
printf 'Th\360\235\204\236s\n' | sed 's/.*//g'
print?
Jonathan
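The subject-line substitution from fuzz() can also be tried in isolation. What follows is only a minimal sketch, not the test suite's code: the sample text and the variable names `sample` and `fuzzed` are invented for illustration, the six-space indentation is an assumption read off the `\{6\}` in the pattern, and LC_ALL=C is forced so that every byte counts as one character.

```shell
# Hypothetical sample: an author line plus two subject lines indented by
# six spaces; the second subject contains the raw UTF-8 bytes of U+1D11E.
sample=$(printf 'A U Thor (2):\n      Test\n      Th\360\235\204\236s is long\n')

# In the C locale each byte is a character, so ".*" consumes the rest of
# every matching line, whatever bytes it holds.
fuzzed=$(printf '%s\n' "$sample" | LC_ALL=C sed 's/^ \{6\}[CTa].*/      SUBJECT/')

printf '%s\n' "$fuzzed"
```

With a sed that behaves this way, both indented lines should collapse to a bare SUBJECT and the author line should pass through unchanged. A line that keeps its tail after SUBJECT, as in the report, suggests that `.*` stopped at a byte the regex engine refused to match.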
* Re: Encoding problem on OSX?
2010-08-09 23:46 ` Jonathan Nieder
@ 2010-08-10 5:52 ` İsmail Dönmez
2010-08-11 7:55 ` Jonathan Nieder
0 siblings, 1 reply; 14+ messages in thread
From: İsmail Dönmez @ 2010-08-10 5:52 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git
Hi;
On Tue, Aug 10, 2010 at 2:46 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>
> locale
> printf 'Th\360\235\204\236s\n' | sed 's/.*//g'
[ismail@havana][08:50:45]
[~]> locale
LANG=
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"
[ismail@havana][08:51:00]
[~]> printf 'Th\360\235\204\236s\n' | sed 's/.*//g'
[ismail@havana][08:51:06]
[~]>
* Re: Encoding problem on OSX?
2010-08-10 5:52 ` İsmail Dönmez
@ 2010-08-11 7:55 ` Jonathan Nieder
2010-08-11 8:20 ` İsmail Dönmez
0 siblings, 1 reply; 14+ messages in thread
From: Jonathan Nieder @ 2010-08-11 7:55 UTC (permalink / raw)
To: İsmail Dönmez; +Cc: git
İsmail Dönmez wrote:
> [~]> printf 'Th\360\235\204\236s\n' | sed 's/.*//g'
>
> [ismail@havana][08:51:06]
> [~]>
Thanks for checking. So sed is not completely broken. Could you try
sh t4201-shortlog.sh
cd "trash directory.t4201-shortlog"
git log
cat "trash directory.t4201-shortlog/log"
?
* Re: Encoding problem on OSX?
2010-08-11 7:55 ` Jonathan Nieder
@ 2010-08-11 8:20 ` İsmail Dönmez
2010-08-11 8:29 ` Jonathan Nieder
0 siblings, 1 reply; 14+ messages in thread
From: İsmail Dönmez @ 2010-08-11 8:20 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git
Hi;
On Wed, Aug 11, 2010 at 10:55 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> İsmail Dönmez wrote:
>
>> [~]> printf 'Th\360\235\204\236s\n' | sed 's/.*//g'
>>
>> [ismail@havana][08:51:06]
>> [~]>
>
> Thanks for checking. So sed is not completely broken. Could you try
>
> sh t4201-shortlog.sh
> cd "trash directory.t4201-shortlog"
> git log
> cat "trash directory.t4201-shortlog/log"
First of all, note that this is not the Mac's default sed but GNU sed:
GNU sed version 4.2.1
Copyright (C) 2009 Free Software Foundation, Inc.
Now the output you requested:
[~/Sources/git/t]> sh t4201-shortlog.sh
ok 1 - setup
not ok - 2 default output format
#
# git shortlog HEAD >log &&
# fuzz log >log.predictable &&
# test_cmp expect.template log.predictable
#
ok 3 - pretty format
ok 4 - --abbrev
ok 5 - output from user-defined format is re-wrapped
ok 6 - shortlog wrapping
ok 7 - shortlog from non-git directory
ok 8 - shortlog encoding
# failed 1 among 8 test(s)
1..8
[ismail@havana][11:18:24]
[~/Sources/git/t]> cd "trash directory.t4201-shortlog"
[ismail@havana][11:18:33]
[~/Sources/git/t/trash directory.t4201-shortlog]> git log
commit ef6c19b4846d6a3e41f9a3ce746a3bffae653c17
Author: Jöhännës "Dschö" Schindëlin <Johannes.Schindelin@gmx.de>
Date: Wed Aug 11 08:18:24 2010 +0000
set a1 to 3 and some non-ASCII chars: áæï
commit d7c0787d081716755e2863f612d171846f503d4f
Author: Jöhännës "Dschö" Schindëlin <Johannes.Schindelin@gmx.de>
Date: Wed Aug 11 08:18:24 2010 +0000
set a1 to 2 and some non-ASCII chars: Äßø
commit 7e9687adfe33f5d2050f0fc4ab5004f324d3559f
Author: A U Thor <author@example.com>
Date: Wed Aug 11 08:18:24 2010 +0000
Test
[~/Sources/git/t/trash directory.t4201-shortlog]> cat log
commit 5fc75f5794d1cd8575fc3e2e86f9c0e1aa31723e
Author: Someone else <not!me>
Date: Wed Aug 11 08:18:24 2010 +0000
Commit by someone else
commit 0f5955f471a9d882b0e869752614b5123af19da3
Author: A U Thor <author@example.com>
Date: Wed Aug 11 08:18:24 2010 +0000
a 12 34 56 78
commit 0bb7d083233c266d9051b283913bd83000c9001f
Author: A U Thor <author@example.com>
Date: Wed Aug 11 08:18:24 2010 +0000
Th????s ????s a very, very long f????rst l????ne for the comm????t
message to see ????f ????t ????s wrapped correctly
commit 03a5a848c658751c51925127820491bf2a94a752
Author: A U Thor <author@example.com>
Date: Wed Aug 11 08:18:24 2010 +0000
Th𝄞s 𝄞s a very, very long f𝄞rst l𝄞ne for the comm𝄞t message
to see 𝄞f 𝄞t 𝄞s wrapped correctly
commit fdfc106190118f705dee70b56930764007353922
Author: A U Thor <author@example.com>
Date: Wed Aug 11 08:18:24 2010 +0000
This is a very, very long first line for the commit message to see
if it is wrapped correctly
commit 7e9687adfe33f5d2050f0fc4ab5004f324d3559f
Author: A U Thor <author@example.com>
Date: Wed Aug 11 08:18:24 2010 +0000
Test
* Re: Encoding problem on OSX?
2010-08-11 8:20 ` İsmail Dönmez
@ 2010-08-11 8:29 ` Jonathan Nieder
2010-08-11 8:33 ` İsmail Dönmez
0 siblings, 1 reply; 14+ messages in thread
From: Jonathan Nieder @ 2010-08-11 8:29 UTC (permalink / raw)
To: İsmail Dönmez; +Cc: git
İsmail Dönmez wrote:
> [~/Sources/git/t]> sh t4201-shortlog.sh
> ok 1 - setup
> not ok - 2 default output format
> #
> # git shortlog HEAD >log &&
> # fuzz log >log.predictable &&
> # test_cmp expect.template log.predictable
> #
> ok 3 - pretty format
Oops, my bad.
sh t4201-shortlog.sh --immediate
cat "trash directory.t4201-shortlog/log"
is what I meant. The idea is to get the log that log.predictable
is based on, by fetching the log from immediately after the failing test.
Sorry for the trouble,
Jonathan
* Re: Encoding problem on OSX?
2010-08-11 8:29 ` Jonathan Nieder
@ 2010-08-11 8:33 ` İsmail Dönmez
2010-08-11 8:44 ` Jonathan Nieder
0 siblings, 1 reply; 14+ messages in thread
From: İsmail Dönmez @ 2010-08-11 8:33 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git
On Wed, Aug 11, 2010 at 11:29 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> İsmail Dönmez wrote:
>
>> [~/Sources/git/t]> sh t4201-shortlog.sh
>> ok 1 - setup
>> not ok - 2 default output format
>> #
>> # git shortlog HEAD >log &&
>> # fuzz log >log.predictable &&
>> # test_cmp expect.template log.predictable
>> #
>> ok 3 - pretty format
>
> Oops, my bad.
>
> sh t4201-shortlog.sh --immediate
> cat "trash directory.t4201-shortlog/log"
>
> is what I meant. The idea is to get the log that log.predictable
> is based on, by fetching the log from immediately after the failing test.
Ok here we go;
[~/Sources/git/t]> sh t4201-shortlog.sh --immediate
ok 1 - setup
not ok - 2 default output format
#
# git shortlog HEAD >log &&
# fuzz log >log.predictable &&
# test_cmp expect.template log.predictable
#
[ismail@havana][11:32:29]
[~/Sources/git/t]> cat "trash directory.t4201-shortlog/log"
A U Thor (5):
Test
This is a very, very long first line for the commit message to
see if it is wrapped correctly
Th𝄞s 𝄞s a very, very long f𝄞rst l𝄞ne for the comm𝄞t message
to see 𝄞f 𝄞t 𝄞s wrapped correctly
Th????s ????s a very, very long f????rst l????ne for the
comm????t message to see ????f ????t ????s wrapped correctly
a 12 34 56 78
Someone else (1):
Commit by someone else
* Re: Encoding problem on OSX?
2010-08-11 8:33 ` İsmail Dönmez
@ 2010-08-11 8:44 ` Jonathan Nieder
2010-08-11 8:47 ` İsmail Dönmez
0 siblings, 1 reply; 14+ messages in thread
From: Jonathan Nieder @ 2010-08-11 8:44 UTC (permalink / raw)
To: İsmail Dönmez; +Cc: git
İsmail Dönmez wrote:
> On Wed, Aug 11, 2010 at 11:29 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>> sh t4201-shortlog.sh --immediate
>> cat "trash directory.t4201-shortlog/log"
>>
>> is what I meant. The idea is to get the log that log.predictable
>> is based on, by fetching the log from immediately after the failing test.
>
> Ok here we go;
Okay, I’m stymied. It *looks* like a sed bug even if a quick
test did not catch it in the act.
I guess the last thing to try is
sed "s/^ \{6\}[CTa].*/ SUBJECT/g" <"trash directory.t4201-shortlog/log"
because then you would have a test case to report to your sed
supplier.
Hopefully someone else with Mac OS X can reproduce this.
Thanks again.
* Re: Encoding problem on OSX?
2010-08-11 8:44 ` Jonathan Nieder
@ 2010-08-11 8:47 ` İsmail Dönmez
2010-08-11 9:01 ` İsmail Dönmez
0 siblings, 1 reply; 14+ messages in thread
From: İsmail Dönmez @ 2010-08-11 8:47 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git
On Wed, Aug 11, 2010 at 11:44 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> sed "s/^ \{6\}[CTa].*/ SUBJECT/g" <"trash directory.t4201-shortlog/log"
>
A U Thor (5):
SUBJECT
SUBJECT
SUBJECT
SUBJECT????s ????s a very, very long f????rst l????ne for the
comm????t message to see ????f ????t ????s wrapped correctly
SUBJECT
Someone else (1):
SUBJECT
I will try updating my sed, thanks!
Regards,
ismail
* Re: Encoding problem on OSX?
2010-08-11 8:47 ` İsmail Dönmez
@ 2010-08-11 9:01 ` İsmail Dönmez
2010-08-11 9:23 ` Jonathan Nieder
0 siblings, 1 reply; 14+ messages in thread
From: İsmail Dönmez @ 2010-08-11 9:01 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git
Hi again;
On Wed, Aug 11, 2010 at 11:47 AM, İsmail Dönmez <ismail@namtrac.org> wrote:
> On Wed, Aug 11, 2010 at 11:44 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>> sed "s/^ \{6\}[CTa].*/ SUBJECT/g" <"trash directory.t4201-shortlog/log"
>>
>
> A U Thor (5):
> SUBJECT
> SUBJECT
> SUBJECT
> SUBJECT????s ????s a very, very long f????rst l????ne for the
> comm????t message to see ????f ????t ????s wrapped correctly
> SUBJECT
>
> Someone else (1):
> SUBJECT
>
> I will try updating my sed, thanks!
Downgrading my sed to v 4.1.5 fixed the issue. Thanks for your help!
Regards,
ismail
* Re: Encoding problem on OSX?
2010-08-11 9:01 ` İsmail Dönmez
@ 2010-08-11 9:23 ` Jonathan Nieder
2010-09-27 2:31 ` Jonathan Nieder
0 siblings, 1 reply; 14+ messages in thread
From: Jonathan Nieder @ 2010-08-11 9:23 UTC (permalink / raw)
To: İsmail Dönmez; +Cc: git
İsmail Dönmez wrote:
> Downgrading my sed to v 4.1.5 fixed the issue. Thanks for your help!
I just read BUGS in the sed distribution. Strangely enough the above seems to
be correct behavior:
Another common localization-related problem happens if your input stream
includes invalid multibyte sequences. POSIX mandates that such
sequences are _not_ matched by `.', so that `s/.*//' will not clear
pattern space as you would expect. In fact, there is no way to clear
sed's buffers in the middle of the script in most multibyte locales
(including UTF-8 locales). For this reason, GNU sed provides a `z'
command (for `zap') as an extension.
However there is still a sed bug as far as I can tell, since in the
test suite, LC_ALL is set to C, and using the C locale is the
suggested workaround in the GNU sed docs. This explains where my
first suggested diagnostic messed up: presumably
printf 'Th\360\235\204\236s\n' | LC_ALL=C sed "s/.*//"
would print
<treble clef>s
and
printf 'Th\370\235\204\236s\n' | sed "s/.*//"
would print
????s
with your copy of sed 4.2.1.
Well, I learned something new today. Still thinking over how to fix
this in the test suite. Thanks again.
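The two byte sequences discussed above differ only in their lead byte, which a hex dump makes visible. The sketch below is illustrative rather than part of the test suite: the variable names are made up, and the only sed behavior it relies on is that in the C locale every byte is treated as a character. What sed does in a UTF-8 locale is left out deliberately, since that depends on the sed version and the libc.

```shell
# U+1D11E (treble clef) encodes in UTF-8 as the four bytes F0 9D 84 9E.
valid=$(printf 'Th\360\235\204\236s\n' | od -An -tx1 | tr -d ' \n')

# Changing the lead byte to F8 gives a sequence that is not valid UTF-8,
# because F8 is never a legal lead byte.
invalid=$(printf 'Th\370\235\204\236s\n' | od -An -tx1 | tr -d ' \n')

# In the C locale each byte is one character, so ".*" clears both lines.
cleared_valid=$(printf 'Th\360\235\204\236s\n' | LC_ALL=C sed 's/.*//')
cleared_invalid=$(printf 'Th\370\235\204\236s\n' | LC_ALL=C sed 's/.*//')

printf '%s\n%s\n' "$valid" "$invalid"
```

The dumps should read 5468f09d849e730a and 5468f89d849e730a, and both `cleared_` variables should come back empty. Running the second pipeline without LC_ALL=C in a UTF-8 locale is the case where, per the BUGS excerpt, the invalid bytes may be left unmatched.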
* Re: Encoding problem on OSX?
2010-08-11 9:23 ` Jonathan Nieder
@ 2010-09-27 2:31 ` Jonathan Nieder
2010-09-27 5:15 ` Kevin Ballard
0 siblings, 1 reply; 14+ messages in thread
From: Jonathan Nieder @ 2010-09-27 2:31 UTC (permalink / raw)
To: İsmail Dönmez, Richard MICHAEL; +Cc: git
Hi again,
İsmail Dönmez wrote:
> Downgrading my sed to v 4.1.5 fixed the issue.
This is nicely explained here:
https://www.opengroup.org/sophocles/show_mail.tpl?source=L&listname=austin-group-l&id=14595
It looks to be a Mac OS libc misfeature. Could you two lobby Apple to
get this fixed? :)
Thanks again for the reports.
* Re: Encoding problem on OSX?
2010-09-27 2:31 ` Jonathan Nieder
@ 2010-09-27 5:15 ` Kevin Ballard
2010-09-27 5:18 ` İsmail Dönmez
0 siblings, 1 reply; 14+ messages in thread
From: Kevin Ballard @ 2010-09-27 5:15 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: İsmail Dönmez, Richard MICHAEL, git
On Sep 26, 2010, at 7:31 PM, Jonathan Nieder wrote:
> Hi again,
>
> İsmail Dönmez wrote:
>
>> Downgrading my sed to v 4.1.5 fixed the issue.
>
> This is nicely explained here:
>
> https://www.opengroup.org/sophocles/show_mail.tpl?source=L&listname=austin-group-l&id=14595
>
> It looks to be a Mac OS libc misfeature. Could you two lobby Apple to
> get this fixed? :)
FWIW, /usr/bin/sed on Mac OS X 10.6 doesn't seem to be having a problem. t4201-shortlog.sh passes all tests on my machine.
-Kevin Ballard
* Re: Encoding problem on OSX?
2010-09-27 5:15 ` Kevin Ballard
@ 2010-09-27 5:18 ` İsmail Dönmez
0 siblings, 0 replies; 14+ messages in thread
From: İsmail Dönmez @ 2010-09-27 5:18 UTC (permalink / raw)
To: Kevin Ballard; +Cc: Jonathan Nieder, Richard MICHAEL, git
On Mon, Sep 27, 2010 at 8:15 AM, Kevin Ballard <kevin@sb.org> wrote:
> On Sep 26, 2010, at 7:31 PM, Jonathan Nieder wrote:
>
>> Hi again,
>>
>> İsmail Dönmez wrote:
>>
>>> Downgrading my sed to v 4.1.5 fixed the issue.
>>
>> This is nicely explained here:
>>
>> https://www.opengroup.org/sophocles/show_mail.tpl?source=L&listname=austin-group-l&id=14595
>>
>> It looks to be a Mac OS libc misfeature. Could you two lobby Apple to
>> get this fixed? :)
>
> FWIW, /usr/bin/sed on Mac OS X 10.6 doesn't seem to be having a problem. t4201-shortlog.sh passes all tests on my machine.
Yes, the problem was with GNU sed on OS X 10.6.
Regards,
ismail
Thread overview: 14+ messages
[not found] <AANLkTikh12guRxCK2Vf=WvshzX8P-fYTyu3qxYWNJ2px@mail.gmail.com>
2010-08-09 13:58 ` Encoding problem on OSX? İsmail Dönmez
2010-08-09 23:46 ` Jonathan Nieder
2010-08-10 5:52 ` İsmail Dönmez
2010-08-11 7:55 ` Jonathan Nieder
2010-08-11 8:20 ` İsmail Dönmez
2010-08-11 8:29 ` Jonathan Nieder
2010-08-11 8:33 ` İsmail Dönmez
2010-08-11 8:44 ` Jonathan Nieder
2010-08-11 8:47 ` İsmail Dönmez
2010-08-11 9:01 ` İsmail Dönmez
2010-08-11 9:23 ` Jonathan Nieder
2010-09-27 2:31 ` Jonathan Nieder
2010-09-27 5:15 ` Kevin Ballard
2010-09-27 5:18 ` İsmail Dönmez