* Bug: pull --rebase with é in name
@ 2012-03-05 9:59 René Haber
  2012-03-05 10:26 ` Jeff King
  0 siblings, 1 reply; 17+ messages in thread
From: René Haber @ 2012-03-05 9:59 UTC (permalink / raw)
  To: git

Hello,

I'm having trouble with the following scenario:
My name contains an é with accent. Having set
  git config --global user.name "René Haber"
and several commits with that name in a project.
Now I wanted to pull with --rebase, which fails with:

  git pull --rebase
  remote: Counting objects: 9, done.
  remote: Compressing objects: 100% (5/5), done.
  remote: Total 5 (delta 4), reused 0 (delta 0)
  Unpacking objects: 100% (5/5), done.
  From ____.de:repositories/kapa
  173c610..18987db master -> origin/master
  First, rewinding head to replay your work on top of it...
  /sw/lib/git-core/git-am: line 675: Haber: command not found
  Patch does not have a valid e-mail address.

The problem lies in .git/rebase-apply/author-script :

  GIT_AUTHOR_NAME='Rene'́ Haber
  GIT_AUTHOR_EMAIL='rene@habr.de'
  GIT_AUTHOR_DATE='@1330931169 +0100'

where the accent ´ is on top of the apostrophe and an apostrophe is
missing from the end of the GIT_AUTHOR_NAME line. This leads to the
"Haber: command not found".

As the author name is taken from the rebased commits changing the
user.name in the .gitconfig is useless. The only way I found around this
is changing my name to "Rene Haber" and first rewriting my local history
up to the point of the rebase with that name.

Thanks for your help.

René Haber

^ permalink raw reply [flat|nested] 17+ messages in thread
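The "Haber: command not found" message follows from how a shell reads such a line: everything up to the space is a variable assignment, and the next word is taken as a command name. The sketch below reproduces only that effect. The temporary file stands in for author-script, the variable names are made up for the illustration, and sourcing the file in a subshell is an assumption about how it is evaluated, not a copy of what git-am does.

```shell
#!/bin/sh
# Hypothetical stand-in for .git/rebase-apply/author-script.
script=$(mktemp)

# \303\251 is the UTF-8 encoding of "é". It sits outside the quotes, and
# the closing apostrophe after "Haber" is missing.
printf "GIT_AUTHOR_NAME='Ren'\303\251 Haber\n" >"$script"

# The shell parses the line as the assignment GIT_AUTHOR_NAME=Ren<é>
# followed by a command called "Haber", which does not exist.
status=0
( . "$script" ) 2>/dev/null || status=$?
rm -f "$script"

echo "exit status: $status"   # 127 is the shell's "command not found" status
```

Exit status 127 is what a POSIX shell reports when a command name cannot be found; the exact wording of the error message differs between shells.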
* Re: Bug: pull --rebase with é in name 2012-03-05 9:59 Bug: pull --rebase with é in name René Haber @ 2012-03-05 10:26 ` Jeff King 2012-03-05 10:37 ` Thomas Rast 0 siblings, 1 reply; 17+ messages in thread From: Jeff King @ 2012-03-05 10:26 UTC (permalink / raw) To: René Haber; +Cc: git On Mon, Mar 05, 2012 at 10:59:16AM +0100, René Haber wrote: > I'm having trouble with the following scenario: > My name contains an é with accent. Having set > git config --global user.name "René Haber" > and several commits with that name in a project. That should work in general, but... > git pull --rebase > [...] > /sw/lib/git-core/git-am: line 675: Haber: command not found > > The problem lies in .git/rebase-apply/author-script : > > GIT_AUTHOR_NAME='Rene'́ Haber > GIT_AUTHOR_EMAIL='rene@habr.de' > GIT_AUTHOR_DATE='@1330931169 +0100' That's definitely not right. I can't seem to reproduce it here with a simple test (neither with "René" in the author name, nor with an author name containing single-quote). What version of git are you using (it looks like a recent one, as it has the magic @-date syntax). Have you set i18n.commitencoding, or are otherwise using an encoding besides utf8? Is it possible to share the commits that trigger this bug? -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Bug: pull --rebase with é in name 2012-03-05 10:26 ` Jeff King @ 2012-03-05 10:37 ` Thomas Rast 2012-03-05 11:42 ` René Haber 0 siblings, 1 reply; 17+ messages in thread From: Thomas Rast @ 2012-03-05 10:37 UTC (permalink / raw) To: Jeff King; +Cc: René Haber, git Jeff King <peff@peff.net> writes: > On Mon, Mar 05, 2012 at 10:59:16AM +0100, René Haber wrote: > >> I'm having trouble with the following scenario: >> My name contains an é with accent. Having set >> git config --global user.name "René Haber" >> and several commits with that name in a project. > > That should work in general, but... > >> git pull --rebase >> [...] >> /sw/lib/git-core/git-am: line 675: Haber: command not found >> >> The problem lies in .git/rebase-apply/author-script : >> >> GIT_AUTHOR_NAME='Rene'́ Haber >> GIT_AUTHOR_EMAIL='rene@habr.de' >> GIT_AUTHOR_DATE='@1330931169 +0100' > > That's definitely not right. > > I can't seem to reproduce it here with a simple test (neither with > "René" in the author name, nor with an author name containing > single-quote). What version of git are you using (it looks like a recent > one, as it has the magic @-date syntax). Have you set > i18n.commitencoding, or are otherwise using an encoding besides utf8? Is > it possible to share the commits that trigger this bug? Also, can you post a hex dump of the config that defines user.name (try 'xxd ~/.gitconfig'), so we can see the encoding of René? I find it pretty odd that Git manages to split the ´ from the e, so I'm wondering if perhaps you are using UTF-8 in NFD or similar. -- Thomas Rast trast@{inf,student}.ethz.ch ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Bug: pull --rebase with é in name 2012-03-05 10:37 ` Thomas Rast @ 2012-03-05 11:42 ` René Haber 2012-03-05 11:58 ` Jeff King 0 siblings, 1 reply; 17+ messages in thread From: René Haber @ 2012-03-05 11:42 UTC (permalink / raw) To: Thomas Rast; +Cc: Jeff King, git [-- Attachment #1: Type: text/plain, Size: 181 bytes --] I'm running git 1.7.9.2 from Fink Project on MacOS X 10.6. The gitconfig in hex is attached. I'm not using i18n.commitencoding or a charset different from utf8. Thanks. René [-- Attachment #2: gitconfig.xxd --] [-- Type: application/octet-stream, Size: 3680 bytes --] 0000000: 5b75 7365 725d 0a09 656d 6169 6c20 3d20 [user]..email = 0000010: 7265 6e65 4068 6162 722e 6465 0a09 6e61 rene@habr.de..na 0000020: 6d65 203d 2052 656e c3a9 2048 6162 6572 me = Ren.. Haber 0000030: 0a5b 636f 6c6f 725d 0a09 6469 6666 203d .[color]..diff = 0000040: 2061 7574 6f0a 0973 7461 7475 7320 3d20 auto..status = 0000050: 6175 746f 0a09 6272 616e 6368 203d 2061 auto..branch = a 0000060: 7574 6f0a 0969 6e74 6572 6163 7469 7665 uto..interactive 0000070: 203d 2061 7574 6f0a 0975 6920 3d20 7472 = auto..ui = tr 0000080: 7565 0a09 7061 6765 7220 3d20 7472 7565 ue..pager = true 0000090: 0a5b 636f 6c6f 7220 2262 7261 6e63 6822 .[color "branch" 00000a0: 5d0a 0963 7572 7265 6e74 203d 2079 656c ]..current = yel 00000b0: 6c6f 7720 7265 7665 7273 650a 096c 6f63 low reverse..loc 00000c0: 616c 203d 2079 656c 6c6f 770a 0972 656d al = yellow..rem 00000d0: 6f74 6520 3d20 6772 6565 6e0a 5b63 6f6c ote = green.[col 00000e0: 6f72 2022 6469 6666 225d 0a09 6d65 7461 or "diff"]..meta 00000f0: 203d 2079 656c 6c6f 7720 626f 6c64 0a09 = yellow bold.. 0000100: 6672 6167 203d 206d 6167 656e 7461 2062 frag = magenta b 0000110: 6f6c 640a 096f 6c64 203d 2072 6564 2062 old..old = red b 0000120: 6f6c 640a 096e 6577 203d 2067 7265 656e old..new = green 0000130: 2062 6f6c 640a 0977 6869 7465 7370 6163 bold..whitespac 0000140: 6520 3d20 7265 6420 7265 7665 7273 650a e = red reverse. 
0000150: 5b63 6f6c 6f72 2022 7374 6174 7573 225d [color "status"] 0000160: 0a09 6164 6465 6420 3d20 7965 6c6c 6f77 ..added = yellow 0000170: 0a09 6368 616e 6765 6420 3d20 6772 6565 ..changed = gree 0000180: 6e0a 0975 6e74 7261 636b 6564 203d 2063 n..untracked = c 0000190: 7961 6e0a 5b70 6163 6b5d 0a09 7468 7265 yan.[pack]..thre 00001a0: 6164 7320 3d20 300a 5b61 6c69 6173 5d0a ads = 0.[alias]. 00001b0: 0973 7420 3d20 7374 6174 7573 0a09 6369 .st = status..ci 00001c0: 203d 2063 6f6d 6d69 740a 0962 7220 3d20 = commit..br = 00001d0: 6272 616e 6368 0a09 636f 203d 2063 6865 branch..co = che 00001e0: 636b 6f75 740a 0964 6620 3d20 6469 6666 ckout..df = diff 00001f0: 0a09 6c70 203d 206c 6f67 202d 700a 096c ..lp = log -p..l 0000200: 6720 3d20 6c6f 6720 2d2d 6772 6170 6820 g = log --graph 0000210: 2d2d 7072 6574 7479 3d66 6f72 6d61 743a --pretty=format: 0000220: 2725 4372 6564 2568 2543 7265 7365 7420 '%Cred%h%Creset 0000230: 2d25 4328 7965 6c6c 6f77 2925 6425 4372 -%C(yellow)%d%Cr 0000240: 6573 6574 2025 7320 2543 6772 6565 6e28 eset %s %Cgreen( 0000250: 2563 7229 2025 4328 626f 6c64 2062 6c75 %cr) %C(bold blu 0000260: 6529 3c25 616e 3e25 4372 6573 6574 2720 e)<%an>%Creset' 0000270: 2d2d 6162 6272 6576 2d63 6f6d 6d69 7420 --abbrev-commit 0000280: 2d2d 6461 7465 3d72 656c 6174 6976 650a --date=relative. 
0000290: 0964 6320 3d20 6469 6666 202d 2d63 6163 .dc = diff --cac 00002a0: 6865 6420 2d2d 6e6f 2d65 7874 2d64 6966 hed --no-ext-dif 00002b0: 660a 0977 7466 203d 2021 6769 742d 7774 f..wtf = !git-wt 00002c0: 660a 5b70 7573 685d 0a09 6465 6661 756c f.[push]..defaul 00002d0: 7420 3d20 6d61 7463 6869 6e67 0a5b 636f t = matching.[co 00002e0: 7265 5d0a 0977 6869 7465 7370 6163 653d re]..whitespace= 00002f0: 6669 782c 7472 6169 6c69 6e67 2d73 7061 fix,trailing-spa 0000300: 6365 2c63 722d 6174 2d65 6f6c 0a5b 7265 ce,cr-at-eol.[re 0000310: 7265 7265 5d0a 0965 6e61 626c 6564 203d rere]..enabled = 0000320: 2074 7275 650a 5b6d 6572 6765 5d0a 0973 true.[merge]..s 0000330: 7461 7420 3d20 7472 7565 0a5b 6469 6666 tat = true.[diff 0000340: 5d0a 096d 6e65 6d6f 6e69 6370 7265 6669 ]..mnemonicprefi 0000350: 7820 3d20 7472 7565 0a09 7265 6e61 6d65 x = true..rename 0000360: 7320 3d20 636f 7069 6573 0a s = copies. [-- Attachment #3: Type: text/plain, Size: 1499 bytes --] On 05.03.2012 at 11:37, Thomas Rast wrote: > Jeff King <peff@peff.net> writes: > >> On Mon, Mar 05, 2012 at 10:59:16AM +0100, René Haber wrote: >> >>> I'm having trouble with the following scenario: >>> My name contains an é with accent. Having set >>> git config --global user.name "René Haber" >>> and several commits with that name in a project. >> >> That should work in general, but... >> >>> git pull --rebase >>> [...] >>> /sw/lib/git-core/git-am: line 675: Haber: command not found >>> >>> The problem lies in .git/rebase-apply/author-script : >>> >>> GIT_AUTHOR_NAME='Rene'́ Haber >>> GIT_AUTHOR_EMAIL='rene@habr.de' >>> GIT_AUTHOR_DATE='@1330931169 +0100' >> >> That's definitely not right. >> >> I can't seem to reproduce it here with a simple test (neither with >> "René" in the author name, nor with an author name containing >> single-quote). What version of git are you using (it looks like a recent >> one, as it has the magic @-date syntax). 
Have you set >> i18n.commitencoding, or are otherwise using an encoding besides utf8? Is >> it possible to share the commits that trigger this bug? > > Also, can you post a hex dump of the config that defines user.name (try > 'xxd ~/.gitconfig'), so we can see the encoding of René? > > I find it pretty odd that Git manages to split the ´ from the e, so I'm > wondering if perhaps you are using UTF-8 in NFD or similar. > > -- > Thomas Rast > trast@{inf,student}.ethz.ch ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Bug: pull --rebase with é in name
  2012-03-05 11:42 ` René Haber
@ 2012-03-05 11:58 ` Jeff King
  2012-03-05 12:36 ` Jakub Narebski
  2012-03-05 12:46 ` René Haber
  0 siblings, 2 replies; 17+ messages in thread
From: Jeff King @ 2012-03-05 11:58 UTC (permalink / raw)
  To: René Haber; +Cc: Thomas Rast, git

On Mon, Mar 05, 2012 at 12:42:14PM +0100, René Haber wrote:

> I'm running git 1.7.9.2 from Fink Project on MacOS X 10.6.
> The gitconfig in hex is attached.

Hmm, looks like pretty standard utf8:

  0000020: 6d65 203d 2052 656e c3a9 2048 6162 6572 me = Ren.. Haber

and the same thing I used in my tests. I tried repeating the test with
v1.7.9.2 on OS X (although my test box is 10.7), and couldn't replicate
it.

Can you show us the commit that causes the problem, as printed by "git
cat-file commit $commit | xxd"? I just want to double-check that there
are no odd bytes there.

Also, what happens if you do:

  sh -c '
    . /sw/lib/git-core/git-sh-setup
    get_author_ident_from_commit $commit
  '

(my theory is that this is the underlying problem in the rebase, and
should show the bug; by narrowing it down, it should make testing a lot
simpler).

-Peff

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Bug: pull --rebase with é in name
  2012-03-05 11:58 ` Jeff King
@ 2012-03-05 12:36 ` Jakub Narebski
  2012-03-05 12:46 ` René Haber
  1 sibling, 0 replies; 17+ messages in thread
From: Jakub Narebski @ 2012-03-05 12:36 UTC (permalink / raw)
  To: Jeff King; +Cc: René Haber, Thomas Rast, git

Jeff King <peff@peff.net> writes:

> On Mon, Mar 05, 2012 at 12:42:14PM +0100, René Haber wrote:
>
> > I'm running git 1.7.9.2 from Fink Project on MacOS X 10.6.
> > The gitconfig in hex is attached.
>
> Hmm, looks like pretty standard utf8:
>
> 0000020: 6d65 203d 2052 656e c3a9 2048 6162 6572 me = Ren.. Haber
>
> and the same thing I used in my tests. I tried repeating the test with
> v1.7.9.2 on OS X (although my test box is 10.7), and couldn't replicate
> it.
>
> Can you show us the commit that causes the problem, as printed by "git
> cat-file commit $commit | xxd"? I just want to double-check that there
> are no odd bytes there.
>
> Also, what happens if you do:
>
> sh -c '
> . /sw/lib/git-core/git-sh-setup
> get_author_ident_from_commit $commit
> '
>
> (my theory is that this is the underlying problem in the rebase, and
> should show the bug; by narrowing it down, it should make testing a lot
> simpler).

Hmmm... one place where I have read about this strange "René" -> "Rene'"
conversion is when the terminal (console) cannot display Unicode and
tries to show it using ASCII:

  http://stackoverflow.com/a/9430419/46058

But that should not matter if we are writing to a file, should it?

--
Jakub Narębski

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Bug: pull --rebase with é in name 2012-03-05 11:58 ` Jeff King 2012-03-05 12:36 ` Jakub Narebski @ 2012-03-05 12:46 ` René Haber 2012-03-05 13:04 ` Thomas Rast 1 sibling, 1 reply; 17+ messages in thread From: René Haber @ 2012-03-05 12:46 UTC (permalink / raw) To: Jeff King; +Cc: Thomas Rast, git [-- Attachment #1: Type: text/plain, Size: 395 bytes --] sh -c ' . /sw/lib/git-core/git-sh-setup get_author_ident_from_commit 16b94413cbce12531e8f946286851598449d3913 ' GIT_AUTHOR_NAME='Ren'é Haber GIT_AUTHOR_EMAIL='rene@habr.de' GIT_AUTHOR_DATE='@1329212923 +0100' Commit attached. The thing is, that this only happens when I do git pull --rebase. Doing a git rebase -i HEAD~5 or so works. [-- Attachment #2: 16b9441.commit --] [-- Type: application/octet-stream, Size: 1068 bytes --] 0000000: 7472 6565 2032 3338 3339 6430 6161 6635 tree 23839d0aaf5 0000010: 6130 3536 3932 3366 3735 3839 6433 6335 a056923f7589d3c5 0000020: 3063 6661 6337 3830 3632 6661 350a 7061 0cfac78062fa5.pa 0000030: 7265 6e74 2030 6530 6264 3264 6236 3232 rent 0e0bd2db622 0000040: 3565 3433 6463 3565 3239 6139 6161 3034 5e43dc5e29a9aa04 0000050: 3732 3730 3466 3430 6237 3066 380a 6175 72704f40b70f8.au 0000060: 7468 6f72 2052 656e c3a9 2048 6162 6572 thor Ren.. Haber 0000070: 203c 7265 6e65 4068 6162 722e 6465 3e20 <rene@habr.de> 0000080: 3133 3239 3231 3239 3233 202b 3031 3030 1329212923 +0100 0000090: 0a63 6f6d 6d69 7474 6572 2052 656e c3a9 .committer Ren.. 00000a0: 2048 6162 6572 203c 7265 6e65 4068 6162 Haber <rene@hab 00000b0: 722e 6465 3e20 3133 3239 3231 3239 3233 r.de> 1329212923 00000c0: 202b 3031 3030 0a0a 486f 7065 6675 6c6c +0100..Hopefull 00000d0: 7920 6669 7865 6420 6469 7370 6c61 7920 y fixed display 00000e0: 6275 6720 696e 2076 6572 616e 7374 616c bug in veranstal 00000f0: 7475 6e67 656e 2f65 6469 742e tungen/edit. 
[-- Attachment #3: Type: text/plain, Size: 994 bytes --]

On 05.03.2012 at 12:58, Jeff King wrote:

> On Mon, Mar 05, 2012 at 12:42:14PM +0100, René Haber wrote:
>
>> I'm running git 1.7.9.2 from Fink Project on MacOS X 10.6.
>> The gitconfig in hex is attached.
>
> Hmm, looks like pretty standard utf8:
>
> 0000020: 6d65 203d 2052 656e c3a9 2048 6162 6572 me = Ren.. Haber
>
> and the same thing I used in my tests. I tried repeating the test with
> v1.7.9.2 on OS X (although my test box is 10.7), and couldn't replicate
> it.
>
> Can you show us the commit that causes the problem, as printed by "git
> cat-file commit $commit | xxd"? I just want to double-check that there
> are no odd bytes there.
>
> Also, what happens if you do:
>
> sh -c '
> . /sw/lib/git-core/git-sh-setup
> get_author_ident_from_commit $commit
> '
>
> (my theory is that this is the underlying problem in the rebase, and
> should show the bug; by narrowing it down, it should make testing a lot
> simpler).
>
> -Peff

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Bug: pull --rebase with é in name
  2012-03-05 12:46 ` René Haber
@ 2012-03-05 13:04 ` Thomas Rast
  2012-03-05 13:19 ` René Haber
  2012-03-05 13:29 ` Jeff King
  0 siblings, 2 replies; 17+ messages in thread
From: Thomas Rast @ 2012-03-05 13:04 UTC (permalink / raw)
  To: René Haber; +Cc: Jeff King, git, Will Palmer

René Haber <rene@habr.de> writes:

> sh -c '
> . /sw/lib/git-core/git-sh-setup
> get_author_ident_from_commit 16b94413cbce12531e8f946286851598449d3913
> '
> GIT_AUTHOR_NAME='Ren'é Haber
> GIT_AUTHOR_EMAIL='rene@habr.de'
> GIT_AUTHOR_DATE='@1329212923 +0100'

I think this is the same issue that we recently discussed on #git-devel,
where some broken versions of sed will fail to match "any character"
with '.' even under LC_ALL=C. Will "shruggar" Palmer (cc) had this
issue under OS X with a build of GNU sed that ignored LC_*.

You can verify that this is the problem by looking at

  printf "\370\235\204\236\n" | LC_CTYPE=C sed 's/./x/g' | xxd

It should say

  0000000: 7878 7878 0a xxxx.

That is, the garbage (if you try to read it as UTF-8) in the printf
string was matched and replaced byte-by-byte with 'x'. However,
Will was getting the unreplaced results

  0000000: f89d 849e 0a .....

I'm not sure he has followed up on that problem; the only hope may be to
get a better 'sed'.

--
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply [flat|nested] 17+ messages in thread
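The same check can be wrapped in a small script that does not need xxd. This is only a sketch: the function name sed_matches_bytes is made up here, LC_ALL=C is used in place of LC_CTYPE=C, and it assumes that the sed found first in PATH is the one git's shell scripts would pick up.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if "." in sed matches every byte under the
# C locale, even when the bytes do not form valid UTF-8.
sed_matches_bytes () {
	out=$(printf "\370\235\204\236\n" | LC_ALL=C sed 's/./x/g')
	test "$out" = xxxx
}

if sed_matches_bytes
then
	result=ok
else
	result=broken
fi
echo "sed byte matching: $result"
```

A result of "broken" corresponds to the unreplaced f89d 849e output described above.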
* Re: Bug: pull --rebase with é in name
  2012-03-05 13:04 ` Thomas Rast
@ 2012-03-05 13:19 ` René Haber
  2012-03-05 13:29 ` Jeff King
  1 sibling, 0 replies; 17+ messages in thread
From: René Haber @ 2012-03-05 13:19 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Jeff King, git, Will Palmer

On 05.03.2012 at 14:04, Thomas Rast wrote:

> René Haber <rene@habr.de> writes:
>
>> sh -c '
>> . /sw/lib/git-core/git-sh-setup
>> get_author_ident_from_commit 16b94413cbce12531e8f946286851598449d3913
>> '
>> GIT_AUTHOR_NAME='Ren'é Haber
>> GIT_AUTHOR_EMAIL='rene@habr.de'
>> GIT_AUTHOR_DATE='@1329212923 +0100'
>
> I think this is the same issue that we recently discussed on #git-devel,
> where some broken versions of sed will fail to match "any character"
> with '.' even under LC_ALL=C. Will "shruggar" Palmer (cc) had this
> issue under OS X with a build of GNU sed that ignored LC_*.
>
> You can verify that this is the problem by looking at
>
> printf "\370\235\204\236\n" | LC_CTYPE=C sed 's/./x/g' | xxd
>
> It should say
>
> 0000000: 7878 7878 0a xxxx.
>
> That is, the garbage (if you try to read it as UTF-8) in the printf
> string was matched and replaced byte-by-byte with 'x'. However,
> Will was getting the unreplaced results
>
> 0000000: f89d 849e 0a .....
>
> I'm not sure he has followed up on that problem; the only hope may be to
> get a better 'sed'.

I can confirm this. I get .....
Using the sed from Apple results in xxxx.

Thanks.

René

> --
> Thomas Rast
> trast@{inf,student}.ethz.ch

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Bug: pull --rebase with é in name 2012-03-05 13:04 ` Thomas Rast 2012-03-05 13:19 ` René Haber @ 2012-03-05 13:29 ` Jeff King 2012-03-05 13:40 ` Thomas Rast 2012-03-05 17:23 ` Junio C Hamano 1 sibling, 2 replies; 17+ messages in thread From: Jeff King @ 2012-03-05 13:29 UTC (permalink / raw) To: Thomas Rast; +Cc: René Haber, git, Will Palmer On Mon, Mar 05, 2012 at 02:04:37PM +0100, Thomas Rast wrote: > René Haber <rene@habr.de> writes: > > > sh -c ' > > . /sw/lib/git-core/git-sh-setup > > get_author_ident_from_commit 16b94413cbce12531e8f946286851598449d3913 > > ' > > GIT_AUTHOR_NAME='Ren'é Haber > > GIT_AUTHOR_EMAIL='rene@habr.de' > > GIT_AUTHOR_DATE='@1329212923 +0100' > [...] > That is, the garbage (if you try to read it as UTF-8) in the printf > string was matched and replaced byte-by-byte with 'x'. However, > Will was getting the unreplaced results > > 0000000: f89d 849e 0a ..... > > I'm not sure he has followed up on that problem; the only hope may be to > get a better 'sed'. Long ago, 47c9739e replaced the shell quoting in git-am with "git rev-parse --sq-quote" (instead of sed). Maybe we can do the same for get_author_ident_from_commit (though it is a little trickier there, as we are also parsing values directly out of --pretty=raw). It would be nice if the --pretty format placeholders had a "shell-quote" modifier, and we could just do: git show --format='GIT_AUTHOR_NAME=%(an:shell)' or something similar. for-each-ref knows about shell-quoting, but we can't use it here, because we are looking at arbitrary commits, not just ones pointed to by refs. -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
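For comparison, this is roughly what the existing shell quoting in for-each-ref looks like. It is a sketch run against a throwaway repository; the directory, the author identity and the example.com addresses are invented for the demonstration. As said above, it only reaches commits that a ref points to, and note that %(authoremail) keeps the angle brackets around the address.

```shell
#!/bin/sh
# Throwaway repository with one commit whose author name needs quoting.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
GIT_AUTHOR_NAME="Thom'as Ràst" GIT_AUTHOR_EMAIL=author@example.com \
GIT_COMMITTER_NAME=Committer GIT_COMMITTER_EMAIL=committer@example.com \
git commit -q --allow-empty -m "test commit"

# --shell makes for-each-ref emit each %(...) value as a shell string literal.
ident=$(git for-each-ref --shell --format='GIT_AUTHOR_NAME=%(authorname)
GIT_AUTHOR_EMAIL=%(authoremail)' refs/heads/)
printf '%s\n' "$ident"

# The values arrive already quoted, so the output can be eval'ed directly.
eval "$ident"
printf '%s\n' "$GIT_AUTHOR_NAME"

cd / && rm -rf "$repo"
```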
* Re: Bug: pull --rebase with é in name 2012-03-05 13:29 ` Jeff King @ 2012-03-05 13:40 ` Thomas Rast 2012-03-05 13:50 ` Jeff King 2012-03-05 17:23 ` Junio C Hamano 1 sibling, 1 reply; 17+ messages in thread From: Thomas Rast @ 2012-03-05 13:40 UTC (permalink / raw) To: Jeff King; +Cc: René Haber, git, Will Palmer Jeff King <peff@peff.net> writes: > On Mon, Mar 05, 2012 at 02:04:37PM +0100, Thomas Rast wrote: > >> René Haber <rene@habr.de> writes: >> >> > sh -c ' >> > . /sw/lib/git-core/git-sh-setup >> > get_author_ident_from_commit 16b94413cbce12531e8f946286851598449d3913 >> > ' >> > GIT_AUTHOR_NAME='Ren'é Haber >> > GIT_AUTHOR_EMAIL='rene@habr.de' >> > GIT_AUTHOR_DATE='@1329212923 +0100' >> [...] >> That is, the garbage (if you try to read it as UTF-8) in the printf >> string was matched and replaced byte-by-byte with 'x'. However, >> Will was getting the unreplaced results >> >> 0000000: f89d 849e 0a ..... >> >> I'm not sure he has followed up on that problem; the only hope may be to >> get a better 'sed'. [...] > It would be nice if the --pretty format placeholders had a "shell-quote" > modifier, and we could just do: > > git show --format='GIT_AUTHOR_NAME=%(an:shell)' > > or something similar. for-each-ref knows about shell-quoting, but we > can't use it here, because we are looking at arbitrary commits, not just > ones pointed to by refs. 
Perhaps by using %an etc., line numbers and --sq-quote:

  $ git rev-list --no-walk --date=raw --format="%an%n%ae%n%ad" --encoding=UTF-8 HEAD |
    while read -r s; do git rev-parse --sq-quote "$s"; done |
    sed -n -e '2s/^ /GIT_AUTHOR_NAME=/p' -e '3s/^ /GIT_AUTHOR_EMAIL=/p' -e '4s/^ /GIT_AUTHOR_DATE=/p'
  GIT_AUTHOR_NAME='Thom'\''as Ràst'
  GIT_AUTHOR_EMAIL='trast@inf.ethz.ch'
  GIT_AUTHOR_DATE='1330935546 +0100'

This is for a commit where I deliberately mangled my author line to make
an interesting example, as in

  $ git cat-file commit HEAD | grep ^author
  author Thom'as Ràst <trast@inf.ethz.ch> 1330935546 +0100

I tried doing the quoting inside sed instead of the while...rev-parse
--sq-quote, but it made my head hurt.

--
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Bug: pull --rebase with é in name 2012-03-05 13:40 ` Thomas Rast @ 2012-03-05 13:50 ` Jeff King 0 siblings, 0 replies; 17+ messages in thread From: Jeff King @ 2012-03-05 13:50 UTC (permalink / raw) To: Thomas Rast; +Cc: René Haber, git, Will Palmer On Mon, Mar 05, 2012 at 02:40:34PM +0100, Thomas Rast wrote: > > It would be nice if the --pretty format placeholders had a "shell-quote" > > modifier, and we could just do: > > > > git show --format='GIT_AUTHOR_NAME=%(an:shell)' > > > > or something similar. for-each-ref knows about shell-quoting, but we > > can't use it here, because we are looking at arbitrary commits, not just > > ones pointed to by refs. > > Perhaps by using %an etc., line numbers and --sq-quote: > > $ git rev-list --no-walk --date=raw --format="%an%n%ae%n%ad" --encoding=UTF-8 HEAD | > while read -r s; do git rev-parse --sq-quote "$s"; done | > sed -n -e '2s/^ /GIT_AUTHOR_NAME=/p' -e '3s/^ /GIT_AUTHOR_EMAIL=/p' -e '4s/^ /GIT_AUTHOR_DATE=/p' > GIT_AUTHOR_NAME='Thom'\''as Ràst' > GIT_AUTHOR_EMAIL='trast@inf.ethz.ch' > GIT_AUTHOR_DATE='1330935546 +0100' Yeah, that works. It's a little harder to read than would be ideal, but should produce the right results (I was initially hesitant to use "read" because I was worried about newlines in the input. But of course, that's a non-issue since author ident by definition cannot have newlines in it). I think this is a good direction regardless of the sed issue. We end up parsing ident lines like this in a lot of different places, and I would not be surprised if they do not all behave exactly the same. Eliminating one such parser in favor of the standard one in pretty.c seems like a good thing. -Peff PS If you are going to turn that into a real patch, note that your date field accidentally drops the "@" specifier that unambiguously marks the number as an epoch timestamp. ^ permalink raw reply [flat|nested] 17+ messages in thread
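With the "@" from the PS put back, the pipeline quoted above might look like the sketch below. It is not a patch from the thread: the function name author_ident_sq, the throwaway repository and the example.com addresses are made up for the illustration, and the only intended change to the pipeline is the "@" in front of %ad.

```shell
#!/bin/sh
# Hypothetical wrapper around the pipeline above. The "@" in front of the
# raw date marks the number unambiguously as an epoch timestamp.
author_ident_sq () {
	git rev-list --no-walk --date=raw --format="%an%n%ae%n@%ad" \
		--encoding=UTF-8 "$1" |
	while read -r s
	do
		git rev-parse --sq-quote "$s"
	done |
	sed -n -e '2s/^ /GIT_AUTHOR_NAME=/p' \
		-e '3s/^ /GIT_AUTHOR_EMAIL=/p' \
		-e '4s/^ /GIT_AUTHOR_DATE=/p'
}

# Throwaway repository with a deliberately awkward author name.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
GIT_AUTHOR_NAME="Thom'as Ràst" GIT_AUTHOR_EMAIL=author@example.com \
GIT_AUTHOR_DATE="1330935546 +0100" \
GIT_COMMITTER_NAME=Committer GIT_COMMITTER_EMAIL=committer@example.com \
git commit -q --allow-empty -m "test commit"

ident=$(author_ident_sq HEAD)
printf '%s\n' "$ident"

# The quoted output can be eval'ed to set the three variables.
eval "$ident"
cd / && rm -rf "$repo"
```

Line 1 of the rev-list output is the "commit <sha>" header, which is why the sed expressions start at line 2.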
* Re: Bug: pull --rebase with é in name 2012-03-05 13:29 ` Jeff King 2012-03-05 13:40 ` Thomas Rast @ 2012-03-05 17:23 ` Junio C Hamano 2012-03-06 8:23 ` Jeff King 2012-03-06 8:36 ` Thomas Rast 1 sibling, 2 replies; 17+ messages in thread From: Junio C Hamano @ 2012-03-05 17:23 UTC (permalink / raw) To: Jeff King; +Cc: Thomas Rast, René Haber, git, Will Palmer Jeff King <peff@peff.net> writes: > It would be nice if the --pretty format placeholders had a "shell-quote" > modifier, and we could just do: > > git show --format='GIT_AUTHOR_NAME=%(an:shell)' > > or something similar. for-each-ref knows about shell-quoting, but we > can't use it here, because we are looking at arbitrary commits, not just > ones pointed to by refs. You guys seem to have been having a lot of fun overnight. Perhaps I should live on European time? I think there were talks about cross pollinating and eventually unifying the placeholder languages of pretty and for-each-ref, and if we were to do so, I agree that --pretty definitely should learn to do --sq. But I do not think we want to teach everything :shell; following the style of %w(), something more generic that would apply to any payload would be preferred, perhaps giving an end result like this: git show -s --format=' GIT_AUTHOR_NAME=%(sq-begin)%an%(sq-end) GIT_AUTHOR_EMAIL=%(sq-begin)%ae%(sq-end) ' which would be immediately `eval`-able. Also I wonder if it is time for "git-am" to make more use of direct knowledge of the $rebasing and the original commit. Perhaps by teaching commit-tree to take the -c option from commit, we may not even have to worry about this. In any case, my reading of the conclusion you guys have already reached in this thread is that the issue is not even a bug in Git, but is a broken build/installation of sed by a third-party. I am inclined to suggest any change to get_author_ident_from_commit helper backburnered before we teach --sq to --pretty machinery. 
If the broken sed was the apple one that came with the platform, my conclusion might be different, but it seems to me that this is not something we would urgently have to worry about and patch our code up with an ugly band-aid workaround. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Bug: pull --rebase with é in name 2012-03-05 17:23 ` Junio C Hamano @ 2012-03-06 8:23 ` Jeff King 2012-03-06 8:36 ` Thomas Rast 1 sibling, 0 replies; 17+ messages in thread From: Jeff King @ 2012-03-06 8:23 UTC (permalink / raw) To: Junio C Hamano; +Cc: Thomas Rast, René Haber, git, Will Palmer On Mon, Mar 05, 2012 at 09:23:19AM -0800, Junio C Hamano wrote: > I think there were talks about cross pollinating and eventually > unifying the placeholder languages of pretty and for-each-ref, and > if we were to do so, I agree that --pretty definitely should learn > to do --sq. But I do not think we want to teach everything :shell; > following the style of %w(), something more generic that would apply > to any payload would be preferred, perhaps giving an end result like > this: > > git show -s --format=' > GIT_AUTHOR_NAME=%(sq-begin)%an%(sq-end) > GIT_AUTHOR_EMAIL=%(sq-begin)%ae%(sq-end) > ' > > which would be immediately `eval`-able. Yeah, that could work. I didn't want to teach everything :shell individually. I was hoping eventually for a world where "%(foo:one:two=bar)" was internally parsed into "the foo item, with attribute one set, and attribute two set to bar". And then the "shell" attribute would have a particular meaning for everything, whereas in "%(authordate:format=short)", the "format" attribute would be specific to that item. I think that makes for a more readable syntax. However, your proposal does allow quoting multiple entities at a time, like: IDENT=%(sq-begin)%an <%ae>%(sq-end) which could be useful. Anyway, there is not much point in discussing hypothetical syntaxes. I think we agree that some form of this feature would be an ideal way forward in the long term, but specifics can wait until somebody shows up with patches. > In any case, my reading of the conclusion you guys have already > reached in this thread is that the issue is not even a bug in Git, > but is a broken build/installation of sed by a third-party. 
I am > inclined to suggest any change to get_author_ident_from_commit > helper backburnered before we teach --sq to --pretty machinery. I think that is true. It could be considered a bug in git if we were relying on an unportable sed construct. But it works everywhere else, and we already go to the effort to set LANG and LC_ALL, so I am inclined to say that it is not a portability issue in git, but a crappy sed implementation, and the right solution is to use a better one. We could switch the use of sed to perl (even just using 5.005-ish features, which are pretty portable), but until now, users of git-sh-setup don't need to rely on having perl at all. So I'm fine with leaving it for now and telling people to fix their sed. -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Bug: pull --rebase with é in name 2012-03-05 17:23 ` Junio C Hamano 2012-03-06 8:23 ` Jeff King @ 2012-03-06 8:36 ` Thomas Rast 2012-03-06 9:02 ` Jeff King 2012-03-06 18:31 ` Junio C Hamano 1 sibling, 2 replies; 17+ messages in thread From: Thomas Rast @ 2012-03-06 8:36 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, René Haber, git, Will Palmer Junio C Hamano <gitster@pobox.com> writes: > Jeff King <peff@peff.net> writes: > >> It would be nice if the --pretty format placeholders had a "shell-quote" >> modifier, and we could just do: >> >> git show --format='GIT_AUTHOR_NAME=%(an:shell)' >> >> or something similar. for-each-ref knows about shell-quoting, but we >> can't use it here, because we are looking at arbitrary commits, not just >> ones pointed to by refs. > > You guys seem to have been having a lot of fun overnight. Perhaps I > should live on European time? IIUC Peff just got up at an unreasonably early time to have fun with us Europeans? > I think there were talks about cross pollinating and eventually > unifying the placeholder languages of pretty and for-each-ref, and > if we were to do so, I agree that --pretty definitely should learn > to do --sq. But I do not think we want to teach everything :shell; > following the style of %w(), something more generic that would apply > to any payload would be preferred, perhaps giving an end result like > this: > > git show -s --format=' > GIT_AUTHOR_NAME=%(sq-begin)%an%(sq-end) > GIT_AUTHOR_EMAIL=%(sq-begin)%ae%(sq-end) > ' How about something along the lines of %Q(%an) instead? Though at least implementation-wise, it should be possible to make %'%an%' work, too, which would be rather cute. > In any case, my reading of the conclusion you guys have already > reached in this thread is that the issue is not even a bug in Git, > but is a broken build/installation of sed by a third-party. 
I am > inclined to suggest any change to get_author_ident_from_commit > helper backburnered before we teach --sq to --pretty machinery. Ok. This is the second "victim" of this broken install of sed, however. I wonder where René and Will got it from? Perhaps this is "the" common way of getting GNU sed on OS X, and thus more widespread than we might think. -- Thomas Rast trast@{inf,student}.ethz.ch ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Bug: pull --rebase with é in name 2012-03-06 8:36 ` Thomas Rast @ 2012-03-06 9:02 ` Jeff King 2012-03-06 18:31 ` Junio C Hamano 1 sibling, 0 replies; 17+ messages in thread From: Jeff King @ 2012-03-06 9:02 UTC (permalink / raw) To: Thomas Rast; +Cc: Junio C Hamano, René Haber, git, Will Palmer On Tue, Mar 06, 2012 at 09:36:31AM +0100, Thomas Rast wrote: > > You guys seem to have been having a lot of fun overnight. Perhaps I > > should live on European time? > > IIUC Peff just got up at an unreasonably early time to have fun with us > Europeans? Er...got up? Yeeeeeah, that's what happened. I would never stay up until 6am local time hacking on git. ;) > This is the second "victim" of this broken install of sed, however. I > wonder where René and Will got it from? Perhaps this is "the" common > way of getting GNU sed on OS X, and thus more widespread than we might > think. That's worth looking into, but the answer may still be "this common sed is broken, and we should tell the people who are packaging it to unbreak it". I'm worried that there really isn't a workaround (we are already trying LC_ALL and LANG; is there something else we can do short of not using sed at all?). -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
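One conceivable answer to that last question, offered purely as an illustration and not as something anybody proposed in the thread: the single-quote escaping itself can be done with POSIX parameter expansion, so that the quoting step does not go through sed. The function name sq_quote is made up here, and the sketch covers only the quoting, not the parsing of the author line out of the commit.

```shell
#!/bin/sh
# Hypothetical helper: print $1 as a single-quoted shell word, turning each
# embedded ' into '\'' without calling sed.
sq_quote () {
	rest=$1
	quoted=
	while :
	do
		case $rest in
		*\'*)
			head=${rest%%\'*}	# text before the first apostrophe
			rest=${rest#*\'}	# text after the first apostrophe
			quoted="$quoted$head'\\''"
			;;
		*)
			quoted="$quoted$rest"
			break
			;;
		esac
	done
	printf "'%s'\n" "$quoted"
}

line="GIT_AUTHOR_NAME=$(sq_quote "Thom'as Ràst")"
printf '%s\n' "$line"

# Round trip: evaluating the quoted line gives back the original name.
eval "$line"
printf '%s\n' "$GIT_AUTHOR_NAME"
```

The two expansions are done in plain assignments rather than inside double quotes, because shells differ in how they treat a backslash-escaped apostrophe in a pattern inside a double-quoted expansion.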
* Re: Bug: pull --rebase with é in name 2012-03-06 8:36 ` Thomas Rast 2012-03-06 9:02 ` Jeff King @ 2012-03-06 18:31 ` Junio C Hamano 1 sibling, 0 replies; 17+ messages in thread From: Junio C Hamano @ 2012-03-06 18:31 UTC (permalink / raw) To: Thomas Rast; +Cc: Jeff King, René Haber, git, Will Palmer Thomas Rast <trast@inf.ethz.ch> writes: >> git show -s --format=' >> GIT_AUTHOR_NAME=%(sq-begin)%an%(sq-end) >> GIT_AUTHOR_EMAIL=%(sq-begin)%ae%(sq-end) >> ' > > How about something along the lines of %Q(%an) instead? Though at least > implementation-wise, it should be possible to make %'%an%' work, too, > which would be rather cute. It would be also less error prone from end user's point of view if your closing token is not ")" (as in %Q(%an)) but percent-something, e.g. %<%an%>, %`%an%', or %'%an%'. The way to quote a string that happens to be the same as closing token you want to put in the quoted string would be more obvious (e.g. ID=%Q(%ae (%an%29) is a bit hard to read for ID='gitster@pobox.com (J C H)'). ID=%'%ae (%an)%' As I do not expect these things to nest (do we want to be able to formulate a string that can be eval'ed twice???), using the same string for both opening and closing token is fine by me. ^ permalink raw reply [flat|nested] 17+ messages in thread