* Encoding problem on OSX?
From: İsmail Dönmez @ 2010-08-09 13:58 UTC
To: git

Hi all;

On master & maint branch, t4201-shortlog.sh test 2 fails with:

expecting success:
git shortlog HEAD >log &&
fuzz log >log.predictable &&
test_cmp expect.template log.predictable

--- expect.template 2010-08-09 13:45:46.000000000 +0000
+++ log.predictable 2010-08-09 13:45:46.000000000 +0000
@@ -1,8 +1,8 @@
 A U Thor (5):
       SUBJECT
       SUBJECT
-      SUBJECT
-      SUBJECT
+      SUBJECT𝄞s 𝄞s a very, very long f𝄞rst l𝄞ne for the comm𝄞t message to see 𝄞f 𝄞t 𝄞s wrapped correctly
+      SUBJECT????s ????s a very, very long f????rst l????ne for the comm????t message to see ????f ????t ????s wrapped correctly
       SUBJECT

I am not sure if this is a known problem so I am reporting it.

Regards,
ismail

* Re: Encoding problem on OSX?
From: Jonathan Nieder @ 2010-08-09 23:46 UTC
To: İsmail Dönmez; +Cc: git

İsmail Dönmez wrote:

> git shortlog HEAD >log &&
> fuzz log >log.predictable &&
> test_cmp expect.template log.predictable
>
> --- expect.template 2010-08-09 13:45:46.000000000 +0000
> +++ log.predictable 2010-08-09 13:45:46.000000000 +0000
> @@ -1,8 +1,8 @@
>  A U Thor (5):
>        SUBJECT
>        SUBJECT
> -      SUBJECT
> -      SUBJECT
> +      SUBJECT𝄞s 𝄞s a very, very long f𝄞rst l𝄞ne for the comm𝄞t
> message to see 𝄞f 𝄞t 𝄞s wrapped correctly
> +      SUBJECT????s ????s a very, very long f????rst l????ne for the
> comm????t message to see ????f ????t ????s wrapped correctly
>        SUBJECT

Very interesting; thanks for the report.

From the definition of fuzz(), it looks like

    sed "
        s/$_x40/OBJECT_NAME/g
        s/$_x05/OBJID/g
        s/^ \{6\}[CTa].*/      SUBJECT/g
        s/^ \{8\}[^ ].*/        CONTINUATION/g
    " <log >log.fuzzy

failed to completely match the fourth and fifth lines of the shortlog:

    A U Thor (5):
          Test
          This is a very, very long first[etc]
          Th𝄞s 𝄞s a very, very long f𝄞rst[etc]
          Th<malformed treble clef>s <malformed treble clef>s a...

Could you confirm this? What does

    locale
    printf 'Th\360\235\204\236s\n' | sed 's/.*//g'

print?

Jonathan
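
Note: the following is a minimal sketch of the SUBJECT substitution that fuzz() relies on, run on plain ASCII input where it behaves as intended. The sample text, the variable names sample and fuzzed, and the six-space indentation are assumptions made for illustration; LC_ALL=C is set here only to keep the demonstration independent of the caller's locale.

```shell
# Assumed sample: shortlog-style output with subject lines indented by six spaces.
sample=$(printf 'A U Thor (2):\n      Test\n      This is a long line\n')

# Replace every indented subject line that starts with C, T or a by a
# fixed placeholder, as the third expression in fuzz() does.
fuzzed=$(printf '%s\n' "$sample" | LC_ALL=C sed 's/^ \{6\}[CTa].*/      SUBJECT/')

printf '%s\n' "$fuzzed"
```

The unindented author line is left alone and both subject lines collapse to the placeholder. The failure reported in this thread arises only when `.*` cannot consume the rest of a line, so part of the original subject survives after SUBJECT.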

* Re: Encoding problem on OSX?
From: İsmail Dönmez @ 2010-08-10 5:52 UTC
To: Jonathan Nieder; +Cc: git

Hi;

On Tue, Aug 10, 2010 at 2:46 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>
> locale
> printf 'Th\360\235\204\236s\n' | sed 's/.*//g'

[ismail@havana][08:50:45]
[~]> locale
LANG=
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

[ismail@havana][08:51:00]
[~]> printf 'Th\360\235\204\236s\n' | sed 's/.*//g'

[ismail@havana][08:51:06]
[~]>

* Re: Encoding problem on OSX?
From: Jonathan Nieder @ 2010-08-11 7:55 UTC
To: İsmail Dönmez; +Cc: git

İsmail Dönmez wrote:

> [~]> printf 'Th\360\235\204\236s\n' | sed 's/.*//g'
>
> [ismail@havana][08:51:06]
> [~]>

Thanks for checking. So sed is not completely broken. Could you try

    sh t4201-shortlog.sh
    cd "trash directory.t4201-shortlog"
    git log
    cat "trash directory.t4201-shortlog/log"

?

* Re: Encoding problem on OSX?
From: İsmail Dönmez @ 2010-08-11 8:20 UTC
To: Jonathan Nieder; +Cc: git

Hi;

On Wed, Aug 11, 2010 at 10:55 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> İsmail Dönmez wrote:
>
>> [~]> printf 'Th\360\235\204\236s\n' | sed 's/.*//g'
>>
>> [ismail@havana][08:51:06]
>> [~]>
>
> Thanks for checking. So sed is not completely broken. Could you try
>
> sh t4201-shortlog.sh
> cd "trash directory.t4201-shortlog"
> git log
> cat "trash directory.t4201-shortlog/log"

First of all note that this is not Mac's default sed but instead GNU sed:

GNU sed version 4.2.1
Copyright (C) 2009 Free Software Foundation, Inc.

Now the output of what you requested;

[~/Sources/git/t]> sh t4201-shortlog.sh
ok 1 - setup
not ok - 2 default output format
#
# git shortlog HEAD >log &&
# fuzz log >log.predictable &&
# test_cmp expect.template log.predictable
#
ok 3 - pretty format
ok 4 - --abbrev
ok 5 - output from user-defined format is re-wrapped
ok 6 - shortlog wrapping
ok 7 - shortlog from non-git directory
ok 8 - shortlog encoding
# failed 1 among 8 test(s)
1..8

[ismail@havana][11:18:24]
[~/Sources/git/t]> cd "trash directory.t4201-shortlog"

[ismail@havana][11:18:33]
[~/Sources/git/t/trash directory.t4201-shortlog]> git log
commit ef6c19b4846d6a3e41f9a3ce746a3bffae653c17
Author: Jöhännës "Dschö" Schindëlin <Johannes.Schindelin@gmx.de>
Date:   Wed Aug 11 08:18:24 2010 +0000

    set a1 to 3 and some non-ASCII chars: áæï

commit d7c0787d081716755e2863f612d171846f503d4f
Author: Jöhännës "Dschö" Schindëlin <Johannes.Schindelin@gmx.de>
Date:   Wed Aug 11 08:18:24 2010 +0000

    set a1 to 2 and some non-ASCII chars: Äßø

commit 7e9687adfe33f5d2050f0fc4ab5004f324d3559f
Author: A U Thor <author@example.com>
Date:   Wed Aug 11 08:18:24 2010 +0000

    Test

[~/Sources/git/t/trash directory.t4201-shortlog]> cat log
commit 5fc75f5794d1cd8575fc3e2e86f9c0e1aa31723e
Author: Someone else <not!me>
Date:   Wed Aug 11 08:18:24 2010 +0000

    Commit by someone else

commit 0f5955f471a9d882b0e869752614b5123af19da3
Author: A U Thor <author@example.com>
Date:   Wed Aug 11 08:18:24 2010 +0000

    a 12 34 56 78

commit 0bb7d083233c266d9051b283913bd83000c9001f
Author: A U Thor <author@example.com>
Date:   Wed Aug 11 08:18:24 2010 +0000

    Th????s ????s a very, very long f????rst l????ne for the comm????t message to see ????f ????t ????s wrapped correctly

commit 03a5a848c658751c51925127820491bf2a94a752
Author: A U Thor <author@example.com>
Date:   Wed Aug 11 08:18:24 2010 +0000

    Th𝄞s 𝄞s a very, very long f𝄞rst l𝄞ne for the comm𝄞t message to see 𝄞f 𝄞t 𝄞s wrapped correctly

commit fdfc106190118f705dee70b56930764007353922
Author: A U Thor <author@example.com>
Date:   Wed Aug 11 08:18:24 2010 +0000

    This is a very, very long first line for the commit message to see if it is wrapped correctly

commit 7e9687adfe33f5d2050f0fc4ab5004f324d3559f
Author: A U Thor <author@example.com>
Date:   Wed Aug 11 08:18:24 2010 +0000

    Test

* Re: Encoding problem on OSX?
From: Jonathan Nieder @ 2010-08-11 8:29 UTC
To: İsmail Dönmez; +Cc: git

İsmail Dönmez wrote:

> [~/Sources/git/t]> sh t4201-shortlog.sh
> ok 1 - setup
> not ok - 2 default output format
> #
> # git shortlog HEAD >log &&
> # fuzz log >log.predictable &&
> # test_cmp expect.template log.predictable
> #
> ok 3 - pretty format

Oops, my bad.

    sh t4201-shortlog.sh --immediate
    cat "trash directory.t4201-shortlog/log"

is what I meant. The idea is to get the log that log.predictable
is based on, by fetching the log from immediately after the failing test.

Sorry for the trouble,
Jonathan

* Re: Encoding problem on OSX?
From: İsmail Dönmez @ 2010-08-11 8:33 UTC
To: Jonathan Nieder; +Cc: git

On Wed, Aug 11, 2010 at 11:29 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> İsmail Dönmez wrote:
>
>> [~/Sources/git/t]> sh t4201-shortlog.sh
>> ok 1 - setup
>> not ok - 2 default output format
>> #
>> # git shortlog HEAD >log &&
>> # fuzz log >log.predictable &&
>> # test_cmp expect.template log.predictable
>> #
>> ok 3 - pretty format
>
> Oops, my bad.
>
> sh t4201-shortlog.sh --immediate
> cat "trash directory.t4201-shortlog/log"
>
> is what I meant. The idea is to get the log that log.predictable
> is based on, by fetching the log from immediately after the failing test.

Ok here we go;

[~/Sources/git/t]> sh t4201-shortlog.sh --immediate
ok 1 - setup
not ok - 2 default output format
#
# git shortlog HEAD >log &&
# fuzz log >log.predictable &&
# test_cmp expect.template log.predictable
#

[ismail@havana][11:32:29]
[~/Sources/git/t]> cat "trash directory.t4201-shortlog/log"
A U Thor (5):
      Test
      This is a very, very long first line for the commit message to see if it is wrapped correctly
      Th𝄞s 𝄞s a very, very long f𝄞rst l𝄞ne for the comm𝄞t message to see 𝄞f 𝄞t 𝄞s wrapped correctly
      Th????s ????s a very, very long f????rst l????ne for the comm????t message to see ????f ????t ????s wrapped correctly
      a 12 34 56 78

Someone else (1):
      Commit by someone else

* Re: Encoding problem on OSX?
From: Jonathan Nieder @ 2010-08-11 8:44 UTC
To: İsmail Dönmez; +Cc: git

İsmail Dönmez wrote:
> On Wed, Aug 11, 2010 at 11:29 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:

>> sh t4201-shortlog.sh --immediate
>> cat "trash directory.t4201-shortlog/log"
>>
>> is what I meant. The idea is to get the log that log.predictable
>> is based on, by fetching the log from immediately after the failing test.
>
> Ok here we go;

Okay, I’m stymied. It *looks* like a sed bug even if a quick test did not
catch it in the act. I guess the last thing to try is

    sed "s/^ \{6\}[CTa].*/      SUBJECT/g" <"trash directory.t4201-shortlog/log"

because then you would have a test case to report to your sed supplier.

Hopefully someone else with Mac OS X can reproduce this. Thanks again.

* Re: Encoding problem on OSX?
From: İsmail Dönmez @ 2010-08-11 8:47 UTC
To: Jonathan Nieder; +Cc: git

On Wed, Aug 11, 2010 at 11:44 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> sed "s/^ \{6\}[CTa].*/      SUBJECT/g" <"trash directory.t4201-shortlog/log"
>

A U Thor (5):
      SUBJECT
      SUBJECT
      SUBJECT
      SUBJECT????s ????s a very, very long f????rst l????ne for the comm????t message to see ????f ????t ????s wrapped correctly
      SUBJECT

Someone else (1):
      SUBJECT

I will try updating my sed, thanks!

Regards,
ismail

* Re: Encoding problem on OSX?
From: İsmail Dönmez @ 2010-08-11 9:01 UTC
To: Jonathan Nieder; +Cc: git

Hi again;

On Wed, Aug 11, 2010 at 11:47 AM, İsmail Dönmez <ismail@namtrac.org> wrote:
> On Wed, Aug 11, 2010 at 11:44 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>> sed "s/^ \{6\}[CTa].*/      SUBJECT/g" <"trash directory.t4201-shortlog/log"
>>
>
> A U Thor (5):
>       SUBJECT
>       SUBJECT
>       SUBJECT
>       SUBJECT????s ????s a very, very long f????rst l????ne for the
> comm????t message to see ????f ????t ????s wrapped correctly
>       SUBJECT
>
> Someone else (1):
>       SUBJECT
>
> I will try updating my sed, thanks!

Downgrading my sed to v 4.1.5 fixed the issue. Thanks for your help!

Regards,
ismail

* Re: Encoding problem on OSX?
From: Jonathan Nieder @ 2010-08-11 9:23 UTC
To: İsmail Dönmez; +Cc: git

İsmail Dönmez wrote:

> Downgrading my sed to v 4.1.5 fixed the issue. Thanks for your help!

I just read BUGS in the sed distribution. Strangely enough the above
seems to be correct behavior:

    Another common localization-related problem happens if your input
    stream includes invalid multibyte sequences. POSIX mandates that such
    sequences are _not_ matched by `.', so that `s/.*//' will not clear
    pattern space as you would expect. In fact, there is no way to clear
    sed's buffers in the middle of the script in most multibyte locales
    (including UTF-8 locales). For this reason, GNU sed provides a `z'
    command (for `zap') as an extension.

However there is still a sed bug as far as I can tell, since in the
test suite, LC_ALL is set to C, and using the C locale is the suggested
workaround in the GNU sed docs.

This explains where my first suggested diagnostic messed up: presumably

    printf 'Th\360\235\204\236s\n' | LC_ALL=C sed "s/.*//"

would print <treble clef>s and

    printf 'Th\370\235\204\236s\n' | sed "s/.*//"

would print ????s with your copy of sed 4.2.1.

Well, I learned something new today. Still thinking over how to fix this
in the test suite. Thanks again.
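
Note: the following is a minimal sketch of the C-locale workaround quoted from the sed BUGS file, assuming a sed and libc that honor LC_ALL=C (which, according to this thread, the reporter's sed 4.2.1 setup apparently did not). The helper name blank_bytewise and the variable names are hypothetical.

```shell
# Hypothetical helper: empty every input line, matching byte by byte.
# In the C locale each byte counts as one character, so "." can also
# match bytes that do not form a valid UTF-8 sequence.
blank_bytewise() {
    LC_ALL=C sed 's/.*//'
}

# Well-formed treble clef (octal bytes 360 235 204 236).
well_formed=$(printf 'Th\360\235\204\236s\n' | blank_bytewise)

# Malformed variant from the thread (lead byte 370 is never valid UTF-8).
malformed=$(printf 'Th\370\235\204\236s\n' | blank_bytewise)

# On a conforming setup both results are empty; anything left over
# points at the behavior discussed in this thread.
printf 'well-formed leftover: [%s]\n' "$well_formed"
printf 'malformed leftover:   [%s]\n' "$malformed"
```

Running the same two pipelines without LC_ALL=C in a UTF-8 locale is a quick way to check whether a given sed declines to match the malformed bytes with `.`. Where GNU sed's `z` extension is available, the quoted BUGS text names it as the way to clear pattern space regardless of locale.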

* Re: Encoding problem on OSX?
From: Jonathan Nieder @ 2010-09-27 2:31 UTC
To: İsmail Dönmez, Richard MICHAEL; +Cc: git

Hi again,

İsmail Dönmez wrote:

> Downgrading my sed to v 4.1.5 fixed the issue.

This is nicely explained here:

https://www.opengroup.org/sophocles/show_mail.tpl?source=L&listname=austin-group-l&id=14595

It looks to be a Mac OS libc misfeature. Could you two lobby Apple to
get this fixed? :)

Thanks again for the reports.

* Re: Encoding problem on OSX?
From: Kevin Ballard @ 2010-09-27 5:15 UTC
To: Jonathan Nieder; +Cc: İsmail Dönmez, Richard MICHAEL, git

On Sep 26, 2010, at 7:31 PM, Jonathan Nieder wrote:

> Hi again,
>
> İsmail Dönmez wrote:
>
>> Downgrading my sed to v 4.1.5 fixed the issue.
>
> This is nicely explained here:
>
> https://www.opengroup.org/sophocles/show_mail.tpl?source=L&listname=austin-group-l&id=14595
>
> It looks to be a Mac OS libc misfeature. Could you two lobby Apple to
> get this fixed? :)

FWIW, /usr/bin/sed on Mac OS X 10.6 doesn't seem to be having a problem. t4201-shortlog.sh passes all tests on my machine.

-Kevin Ballard

* Re: Encoding problem on OSX?
From: İsmail Dönmez @ 2010-09-27 5:18 UTC
To: Kevin Ballard; +Cc: Jonathan Nieder, Richard MICHAEL, git

On Mon, Sep 27, 2010 at 8:15 AM, Kevin Ballard <kevin@sb.org> wrote:
> On Sep 26, 2010, at 7:31 PM, Jonathan Nieder wrote:
>
>> Hi again,
>>
>> İsmail Dönmez wrote:
>>
>>> Downgrading my sed to v 4.1.5 fixed the issue.
>>
>> This is nicely explained here:
>>
>> https://www.opengroup.org/sophocles/show_mail.tpl?source=L&listname=austin-group-l&id=14595
>>
>> It looks to be a Mac OS libc misfeature. Could you two lobby Apple to
>> get this fixed? :)
>
> FWIW, /usr/bin/sed on Mac OS X 10.6 doesn't seem to be having a problem. t4201-shortlog.sh passes all tests on my machine.

Yes the problem was with GNU sed on OSX 10.6

Regards,
ismail