* git-diff-tree rename detection bug @ 2005-09-14 16:47 Wayne Scott 2005-09-14 18:09 ` Junio C Hamano 2005-09-14 18:51 ` Junio C Hamano 0 siblings, 2 replies; 23+ messages in thread From: Wayne Scott @ 2005-09-14 16:47 UTC (permalink / raw) To: git Look at the diffs between ad6571a78ac74e9fa27e581834709067dba459af and it's parent with and without rename detection enabled. (In linux-2.6 git tree) (formated for narrow screens) $ REV=ad6571a78ac74e9fa27e581834709067dba459af $ git-diff-tree -r $REV^1 $REV | grep termios.h :000000 100644 0000000000000000000000000000000000000000 237533bb0e9f1a3e640c4906d8b350deafd315b9 A include/asm-powerpc/termios.h :100644 000000 97c6287a6cbaa5903ee1a5934a5553e9e485d8e7 0000000000000000000000000000000000000000 D include/asm-ppc/termios.h :100644 000000 02c3d283aa62bc1b4d7c5d1b22ce03ee4b8771eb 0000000000000000000000000000000000000000 D include/asm-ppc64/termios.h $ git-diff-tree -r -M $REV^1 $REV | grep termios.h :000000 100644 0000000000000000000000000000000000000000 237533bb0e9f1a3e640c4906d8b350deafd315b9 A include/asm-powerpc/termios.h :100644 000000 97c6287a6cbaa5903ee1a5934a5553e9e485d8e7 0000000000000000000000000000000000000000 D include/asm-ppc/termios.h Notice how the the fact that include/asm-ppc64/termios.h is deleted gets lost? Looks broken to me. -Wayne ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-14 16:47 git-diff-tree rename detection bug Wayne Scott @ 2005-09-14 18:09 ` Junio C Hamano 2005-09-14 18:40 ` Wayne Scott 2005-09-14 18:51 ` Junio C Hamano 1 sibling, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2005-09-14 18:09 UTC (permalink / raw) To: wsc9tt; +Cc: git Wayne Scott <wsc9tt@gmail.com> writes: > Look at the diffs between ad6571a78ac74e9fa27e581834709067dba459af and > it's parent with and without rename detection enabled. (In linux-2.6 > git tree) > Notice how the the fact that include/asm-ppc64/termios.h is > deleted gets lost? Looks broken to me. I suspect that what is deleted is not asm-ppc64/termios.h but asm-ppc/termios.h. The below is an output without grep which seems to confuse things. $ git-diff-tree -r -M ad6571a78ac74e9fa27e581834709067dba459af | sed -ne 's/^.* \([AMCRDU]\)/\1/p' R092 include/asm-ppc64/mman.h include/asm-powerpc/mman.h R097 include/asm-ppc64/termbits.h include/asm-powerpc/termbits.h R098 include/asm-ppc64/termios.h include/asm-powerpc/termios.h D include/asm-ppc/mman.h D include/asm-ppc/termbits.h D include/asm-ppc/termios.h $ git-diff-tree -r ad6571a78ac74e9fa27e581834709067dba459af | sed -ne 's/^.* \([AMCRDU]\)/\1/p' A include/asm-powerpc/mman.h A include/asm-powerpc/termbits.h A include/asm-powerpc/termios.h D include/asm-ppc/mman.h D include/asm-ppc/termbits.h D include/asm-ppc/termios.h D include/asm-ppc64/mman.h D include/asm-ppc64/termbits.h D include/asm-ppc64/termios.h The first 3 A and last 3 D are accounted for in the -M output as renames from asm-ppc64 to asm-powerpc. Middle 3 D from asm-ppc are shown in the -M output. So I do not think we are losing anything. Am I missing something? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-14 18:09 ` Junio C Hamano @ 2005-09-14 18:40 ` Wayne Scott 2005-09-14 20:24 ` Linus Torvalds 0 siblings, 1 reply; 23+ messages in thread From: Wayne Scott @ 2005-09-14 18:40 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On 9/14/05, Junio C Hamano <junkio@cox.net> wrote: > Wayne Scott <wsc9tt@gmail.com> writes: > > > Look at the diffs between ad6571a78ac74e9fa27e581834709067dba459af and > > it's parent with and without rename detection enabled. (In linux-2.6 > > git tree) > > > Notice how the the fact that include/asm-ppc64/termios.h is > > deleted gets lost? Looks broken to me. > > I suspect that what is deleted is not asm-ppc64/termios.h but > asm-ppc/termios.h. The below is an output without grep which > seems to confuse things. > > $ git-diff-tree -r -M ad6571a78ac74e9fa27e581834709067dba459af | > sed -ne 's/^.* \([AMCRDU]\)/\1/p' > R092 include/asm-ppc64/mman.h include/asm-powerpc/mman.h > R097 include/asm-ppc64/termbits.h include/asm-powerpc/termbits.h > R098 include/asm-ppc64/termios.h include/asm-powerpc/termios.h > D include/asm-ppc/mman.h > D include/asm-ppc/termbits.h > D include/asm-ppc/termios.h > > $ git-diff-tree -r ad6571a78ac74e9fa27e581834709067dba459af | > sed -ne 's/^.* \([AMCRDU]\)/\1/p' > A include/asm-powerpc/mman.h > A include/asm-powerpc/termbits.h > A include/asm-powerpc/termios.h > D include/asm-ppc/mman.h > D include/asm-ppc/termbits.h > D include/asm-ppc/termios.h > D include/asm-ppc64/mman.h > D include/asm-ppc64/termbits.h > D include/asm-ppc64/termios.h > > The first 3 A and last 3 D are accounted for in the -M output as > renames from asm-ppc64 to asm-powerpc. Middle 3 D from asm-ppc > are shown in the -M output. So I do not think we are losing > anything. Am I missing something? Odd. I get the same answer on my x86 box: $ git-diff-tree -r -M ad6571a78ac74e9fa27e581834709067dba459af | sed -ne 's/^.* \([AMCRDU]\)/\1/p' R092 include/asm-ppc64/mman.h include/asm-powerpc/mman.h R097 include/asm-ppc64/termbits.h include/asm-powerpc/termbits.h R098 include/asm-ppc64/termios.h include/asm-powerpc/termios.h D include/asm-ppc/mman.h D include/asm-ppc/termbits.h D include/asm-ppc/termios.h But here is the output on my quad xeon running in 64-bit mode: (fedora core 2) $ git-diff-tree -r -M ad6571a78ac74e9fa27e581834709067dba459af | sed -ne 's/^.* \([AMCRDU]\)/\1/p' R092 include/asm-ppc64/mman.h include/asm-powerpc/mman.h R097 include/asm-ppc64/termbits.h include/asm-powerpc/termbits.h A include/asm-powerpc/termios.h D include/asm-ppc/mman.h D include/asm-ppc/termbits.h D include/asm-ppc/termios.h This is the same version of git both rebuilt just for this test. However, I noticed a whole collection of errors from valgrind when I run this command line: ==13457== Invalid read of size 4 ==13457== at 0x805402C: locate_rename_dst (diffcore-rename.c:28) ==13457== by 0x805464B: diffcore_rename (diffcore-rename.c:356) ==13457== by 0x805249D: diffcore_std (diff.c:1093) ==13457== by 0x8049B48: call_diff_flush (diff-tree.c:273) ==13457== by 0x804A225: diff_tree_sha1_top (diff-tree.c:298) ==13457== by 0x804A30B: diff_tree_commit (diff-tree.c:363) ==13457== by 0x804A884: main (diff-tree.c:551) ==13457== Address 0x1BBD781C is 20 bytes inside a block of size 71 free'd ==13457== at 0x1B9003C3: free (vg_replace_malloc.c:235) ==13457== by 0x805205B: diff_free_filepair (diff.c:775) ==13457== by 0x805422A: diffcore_rename (diffcore-rename.c:415) ==13457== by 0x805249D: diffcore_std (diff.c:1093) ==13457== by 0x8049B48: call_diff_flush (diff-tree.c:273) ==13457== by 0x804A225: diff_tree_sha1_top (diff-tree.c:298) ==13457== by 0x804A30B: diff_tree_commit (diff-tree.c:363) ==13457== by 0x804A884: main (diff-tree.c:551) ==13457== ==13457== Invalid read of size 1 ==13457== at 0x1B90140D: strcmp (mac_replace_strmem.c:332) ==13457== by 0x8054038: locate_rename_dst (diffcore-rename.c:28) ==13457== by 0x805464B: diffcore_rename (diffcore-rename.c:356) ==13457== by 0x805249D: diffcore_std (diff.c:1093) ==13457== by 0x8049B48: call_diff_flush (diff-tree.c:273) ==13457== by 0x804A225: diff_tree_sha1_top (diff-tree.c:298) ==13457== by 0x804A30B: diff_tree_commit (diff-tree.c:363) ==13457== by 0x804A884: main (diff-tree.c:551) ==13457== Address 0x1BBD7830 is 40 bytes inside a block of size 71 free'd ==13457== at 0x1B9003C3: free (vg_replace_malloc.c:235) ==13457== by 0x805205B: diff_free_filepair (diff.c:775) ==13457== by 0x805422A: diffcore_rename (diffcore-rename.c:415) ==13457== by 0x805249D: diffcore_std (diff.c:1093) ==13457== by 0x8049B48: call_diff_flush (diff-tree.c:273) ==13457== by 0x804A225: diff_tree_sha1_top (diff-tree.c:298) ==13457== by 0x804A30B: diff_tree_commit (diff-tree.c:363) ==13457== by 0x804A884: main (diff-tree.c:551) ==13457== ==13457== Invalid read of size 1 ==13457== at 0x1B901423: strcmp (mac_replace_strmem.c:332) ==13457== by 0x8054038: locate_rename_dst (diffcore-rename.c:28) ==13457== by 0x805464B: diffcore_rename (diffcore-rename.c:356) ==13457== by 0x805249D: diffcore_std (diff.c:1093) ==13457== by 0x8049B48: call_diff_flush (diff-tree.c:273) ==13457== by 0x804A225: diff_tree_sha1_top (diff-tree.c:298) ==13457== by 0x804A30B: diff_tree_commit (diff-tree.c:363) ==13457== by 0x804A884: main (diff-tree.c:551) ==13504== Invalid read of size 4 ==13504== at 0x805402C: locate_rename_dst (diffcore-rename.c:28) ==13504== by 0x805464B: diffcore_rename (diffcore-rename.c:356) ==13504== by 0x805249D: diffcore_std (diff.c:1093) ==13504== by 0x8049B48: call_diff_flush (diff-tree.c:273) ==13504== by 0x804A225: diff_tree_sha1_top (diff-tree.c:298) ==13504== by 0x804A30B: diff_tree_commit (diff-tree.c:363) ==13504== by 0x804A884: main (diff-tree.c:551) ==13504== Address 0x1BBD781C is 20 bytes inside a block of size 71 free'd ==13504== at 0x1B9003C3: free (vg_replace_malloc.c:235) ==13504== by 0x805205B: diff_free_filepair (diff.c:775) ==13504== by 0x805422A: diffcore_rename (diffcore-rename.c:415) ==13504== by 0x805249D: diffcore_std (diff.c:1093) ==13504== by 0x8049B48: call_diff_flush (diff-tree.c:273) ==13504== by 0x804A225: diff_tree_sha1_top (diff-tree.c:298) ==13504== by 0x804A30B: diff_tree_commit (diff-tree.c:363) ==13504== by 0x804A884: main (diff-tree.c:551) ==13504== ==13504== Invalid read of size 1 ==13504== at 0x1B90140D: strcmp (mac_replace_strmem.c:332) ==13504== by 0x8054038: locate_rename_dst (diffcore-rename.c:28) ==13504== by 0x805464B: diffcore_rename (diffcore-rename.c:356) ==13504== by 0x805249D: diffcore_std (diff.c:1093) ==13504== by 0x8049B48: call_diff_flush (diff-tree.c:273) ==13504== by 0x804A225: diff_tree_sha1_top (diff-tree.c:298) ==13504== by 0x804A30B: diff_tree_commit (diff-tree.c:363) ==13504== by 0x804A884: main (diff-tree.c:551) ==13504== Address 0x1BBD7830 is 40 bytes inside a block of size 71 free'd ==13504== at 0x1B9003C3: free (vg_replace_malloc.c:235) ==13504== by 0x805205B: diff_free_filepair (diff.c:775) ==13504== by 0x805422A: diffcore_rename (diffcore-rename.c:415) ==13504== by 0x805249D: diffcore_std (diff.c:1093) ==13504== by 0x8049B48: call_diff_flush (diff-tree.c:273) ==13504== by 0x804A225: diff_tree_sha1_top (diff-tree.c:298) ==13504== by 0x804A30B: diff_tree_commit (diff-tree.c:363) ==13504== by 0x804A884: main (diff-tree.c:551) ==13504== ==13504== Invalid read of size 1 ==13504== at 0x1B901423: strcmp (mac_replace_strmem.c:332) ==13504== by 0x8054038: locate_rename_dst (diffcore-rename.c:28) ==13504== by 0x805464B: diffcore_rename (diffcore-rename.c:356) ==13504== by 0x805249D: diffcore_std (diff.c:1093) ==13504== by 0x8049B48: call_diff_flush (diff-tree.c:273) ==13504== by 0x804A225: diff_tree_sha1_top (diff-tree.c:298) ==13504== by 0x804A30B: diff_tree_commit (diff-tree.c:363) ==13504== by 0x804A884: main (diff-tree.c:551) ==13504== Address 0x1BBD7831 is 41 bytes inside a block of size 71 free'd ==13504== at 0x1B9003C3: free (vg_replace_malloc.c:235) ==13504== by 0x805205B: diff_free_filepair (diff.c:775) ==13504== by 0x805422A: diffcore_rename (diffcore-rename.c:415) ==13504== by 0x805249D: diffcore_std (diff.c:1093) ==13504== by 0x8049B48: call_diff_flush (diff-tree.c:273) ==13504== by 0x804A225: diff_tree_sha1_top (diff-tree.c:298) ==13504== by 0x804A30B: diff_tree_commit (diff-tree.c:363) ==13504== by 0x804A884: main (diff-tree.c:551) ==13504== Perhaps that explains the difference. -Wayne ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-14 18:40 ` Wayne Scott @ 2005-09-14 20:24 ` Linus Torvalds 2005-09-14 20:41 ` Fix alloc_filespec() initialization Linus Torvalds 2005-09-15 2:23 ` git-diff-tree rename detection bug Paul Mackerras 0 siblings, 2 replies; 23+ messages in thread From: Linus Torvalds @ 2005-09-14 20:24 UTC (permalink / raw) To: Wayne Scott; +Cc: Junio C Hamano, git On Wed, 14 Sep 2005, Wayne Scott wrote: > > However, I noticed a whole collection of errors from valgrind when I > run this command line: I get even more, including: ==3234== Use of uninitialised value of size 4 ==3234== at 0x80507C1: alloc_filespec (diff.c:224) ==3234== by 0x8052387: diff_addremove (diff.c:1144) ==3234== by 0x8049B74: show_file (diff-tree.c:97) ==3234== by 0x8049E17: diff_tree (diff-tree.c:118) Damn, too bad valgrind doesn't work on ppc64, so I can't use it on my main machine. It seems to be in development on ppc32, so maybe some day. I'll look at it on my other machines instead, Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Fix alloc_filespec() initialization 2005-09-14 20:24 ` Linus Torvalds @ 2005-09-14 20:41 ` Linus Torvalds 2005-09-15 2:23 ` git-diff-tree rename detection bug Paul Mackerras 1 sibling, 0 replies; 23+ messages in thread From: Linus Torvalds @ 2005-09-14 20:41 UTC (permalink / raw) To: Wayne Scott; +Cc: Junio C Hamano, git This simplifies and fixes the initialization of a "diff_filespec" when allocated. The old code would not initialize "sha1_valid". Noticed by valgrind. Signed-off-by: Linus Torvalds <torvalds@osdl.org> --- This does not fix the issue Wayne saw, but I'm not going to look at the later valgrind errors before I've fixed the first ones. On Wed, 14 Sep 2005, Linus Torvalds wrote: > > I get even more, including: > > ==3234== Use of uninitialised value of size 4 > ==3234== at 0x80507C1: alloc_filespec (diff.c:224) > ==3234== by 0x8052387: diff_addremove (diff.c:1144) > ==3234== by 0x8049B74: show_file (diff-tree.c:97) > ==3234== by 0x8049E17: diff_tree (diff-tree.c:118) diff --git a/diff.c b/diff.c --- a/diff.c +++ b/diff.c @@ -214,14 +214,10 @@ struct diff_filespec *alloc_filespec(con { int namelen = strlen(path); struct diff_filespec *spec = xmalloc(sizeof(*spec) + namelen + 1); + + memset(spec, 0, sizeof(*spec)); spec->path = (char *)(spec + 1); - strcpy(spec->path, path); - spec->should_free = spec->should_munmap = 0; - spec->xfrm_flags = 0; - spec->size = 0; - spec->data = NULL; - spec->mode = 0; - memset(spec->sha1, 0, 20); + memcpy(spec->path, path, namelen+1); return spec; } ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-14 20:24 ` Linus Torvalds 2005-09-14 20:41 ` Fix alloc_filespec() initialization Linus Torvalds @ 2005-09-15 2:23 ` Paul Mackerras 2005-09-15 2:26 ` Paul Mackerras ` (2 more replies) 1 sibling, 3 replies; 23+ messages in thread From: Paul Mackerras @ 2005-09-15 2:23 UTC (permalink / raw) To: Linus Torvalds; +Cc: Wayne Scott, Junio C Hamano, git Linus Torvalds writes: > Damn, too bad valgrind doesn't work on ppc64, so I can't use it on my main > machine. It seems to be in development on ppc32, so maybe some day. How about... today? :-) My port of Valgrind-2.4.1 to ppc32 works pretty well. You can get it from: http://www.valgrind.org/downloads/variants.html?pmk I assume you're compiling git as 32-bit executables on your G5. I don't see any reason why the git binaries would need to be 64-bit. Regards, Paul. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-15 2:23 ` git-diff-tree rename detection bug Paul Mackerras @ 2005-09-15 2:26 ` Paul Mackerras 2005-09-15 3:36 ` Linus Torvalds 2005-09-15 3:29 ` Linus Torvalds 2005-09-15 5:58 ` H. Peter Anvin 2 siblings, 1 reply; 23+ messages in thread From: Paul Mackerras @ 2005-09-15 2:26 UTC (permalink / raw) To: Linus Torvalds, Wayne Scott, Junio C Hamano, git I wrote: > How about... today? :-) My port of Valgrind-2.4.1 to ppc32 works > pretty well. You can get it from: I meant to say explicitly that it runs quite happily under a ppc64 kernel, but only does 32-bit executables at present. Paul. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-15 2:26 ` Paul Mackerras @ 2005-09-15 3:36 ` Linus Torvalds 2005-09-15 4:52 ` Paul Mackerras 0 siblings, 1 reply; 23+ messages in thread From: Linus Torvalds @ 2005-09-15 3:36 UTC (permalink / raw) To: Paul Mackerras; +Cc: Wayne Scott, Junio C Hamano, git On Thu, 15 Sep 2005, Paul Mackerras wrote: > > I meant to say explicitly that it runs quite happily under a ppc64 > kernel, but only does 32-bit executables at present. It works, but it does end up complaining about things like ==23756== Invalid read of size 4 ==23756== at 0x25A38990: strlen (in /lib/libc-2.3.5.so) .. ==23756== Address 0x25B86754 is 3 bytes after a block of size 17 alloc'd which seems to be just strlen prefetching the next word or something like that. But it's still nicer than not having it at all, even if it appears I'll have to do some filtering of my own. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-15 3:36 ` Linus Torvalds @ 2005-09-15 4:52 ` Paul Mackerras 2005-09-15 8:17 ` Junio C Hamano 2005-09-15 14:49 ` Josef Weidendorfer 0 siblings, 2 replies; 23+ messages in thread From: Paul Mackerras @ 2005-09-15 4:52 UTC (permalink / raw) To: Linus Torvalds; +Cc: Wayne Scott, Junio C Hamano, git Linus Torvalds writes: > It works, but it does end up complaining about things like > > ==23756== Invalid read of size 4 > ==23756== at 0x25A38990: strlen (in /lib/libc-2.3.5.so) > .. > ==23756== Address 0x25B86754 is 3 bytes after a block of size 17 alloc'd > > which seems to be just strlen prefetching the next word or something like > that. The strlen() in glibc for ppc is unbearably clever hand-coded assembly, which loads up 8 bytes at a time (once it has the address 8-byte aligned), and does various ANDs and ORs and ADDs and conditional branches. If some of the 8 bytes aren't defined, it will in many cases branch one way or the other based on the undefined bytes, but end up computing the same result on either branch. Valgrind is right in that strlen is loading up some bytes that are past the end of a malloc'd block. In fact those bytes don't end up affecting the result, and in fact the load couldn't cause a segfault, but it's not surprising that Valgrind can't see that, since the value of the extra bytes can actually affect whether a conditional branch is taken or not, but we end up with the same result either way. Valgrind sets LD_PRELOAD so that you get a simple Valgrind-supplied set of string functions, including strlen, from vgpreload_memcheck.so rather than the fancy glibc ones. However, that doesn't seem to catch the calls to strlen from inside glibc - the call from vfprintf is a direct branch rather than going through the PLT, for instance. I could add a suppression to suppress all errors in strlen, but that would mean you would miss real errors, where the string is not null-terminated within the malloc'd block, and strlen runs off the end. I wish I had a good answer for this problem, but I don't. Maybe we need a debugging version of glibc that doesn't use the fancy bit-fiddling algorithms in the string functions. (Just for interest: here are the comments from strlen.S: 1) Given a word 'x', we can test to see if it contains any 0 bytes by subtracting 0x01010101, and seeing if any of the high bits of each byte changed from 0 to 1. This works because the least significant 0 byte must have had no incoming carry (otherwise it's not the least significant), so it is 0x00 - 0x01 == 0xff. For all other byte values, either they have the high bit set initially, or when 1 is subtracted you get a value in the range 0x00-0x7f, none of which have their high bit set. The expression here is (x + 0xfefefeff) & ~(x | 0x7f7f7f7f), which gives 0x00000000 when there were no 0x00 bytes in the word. 2) Given a word 'x', we can test to see _which_ byte was zero by calculating ~(((x & 0x7f7f7f7f) + 0x7f7f7f7f) | x | 0x7f7f7f7f). This produces 0x80 in each byte that was zero, and 0x00 in all the other bytes. The '| 0x7f7f7f7f' clears the low 7 bits in each byte, and the '| x' part ensures that bytes with the high bit set produce 0x00. The addition will carry into the high bit of each byte iff that byte had one of its low 7 bits set. We can then just see which was the most significant bit set and divide by 8 to find how many to add to the index. This is from the book 'The PowerPC Compiler Writer's Guide', by Steve Hoxey, Faraydon Karim, Bill Hay and Hank Warren. ) Paul. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-15 4:52 ` Paul Mackerras @ 2005-09-15 8:17 ` Junio C Hamano 2005-09-15 14:49 ` Josef Weidendorfer 1 sibling, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2005-09-15 8:17 UTC (permalink / raw) To: Paul Mackerras; +Cc: Linus Torvalds, Wayne Scott, git Paul Mackerras <paulus@samba.org> writes: > The strlen() in glibc for ppc is unbearably clever hand-coded > assembly, which loads up 8 bytes at a time (once it has the address > 8-byte aligned), and does various ANDs and ORs and ADDs and > conditional branches. If some of the 8 bytes aren't defined, it will > in many cases branch one way or the other based on the undefined > bytes, but end up computing the same result on either branch. This reminds me of what I did in my previous life, writing a memory allocation checker -- this was before Valgrind -- and found out that strcpy in the C library that came with Solaris had a similar clever trick. What was interesting was that copying a string starting at the (PAGESIZE-3)th byte on a page and NUL terminated at the end of the same page ended up prefetching the first word from the next page (please do not ask me about the details -- I do not remember the disassembly of that part of the code anymore). It was not an inconvenience for our memory checker but was a real bug -- the next page could very well be unaccessible. The bug was fixed in the next version of the C library when we updated our Solaris box. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-15 4:52 ` Paul Mackerras 2005-09-15 8:17 ` Junio C Hamano @ 2005-09-15 14:49 ` Josef Weidendorfer 2005-09-20 10:50 ` [Valgrind-developers] " Paul Mackerras 1 sibling, 1 reply; 23+ messages in thread From: Josef Weidendorfer @ 2005-09-15 14:49 UTC (permalink / raw) To: git; +Cc: valgrind-developers On Thursday 15 September 2005 06:52, Paul Mackerras wrote: > Valgrind sets LD_PRELOAD so that you get a simple Valgrind-supplied > set of string functions, including strlen, from vgpreload_memcheck.so > rather than the fancy glibc ones. However, that doesn't seem to catch > the calls to strlen from inside glibc - the call from vfprintf is a > direct branch rather than going through the PLT, for instance. Just curious: Why is it using LD_PRELOAD and not the VGs symbol redirection mechanism, which should catch strlen even if used inside of glibc? Josef ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Valgrind-developers] Re: git-diff-tree rename detection bug 2005-09-15 14:49 ` Josef Weidendorfer @ 2005-09-20 10:50 ` Paul Mackerras 0 siblings, 0 replies; 23+ messages in thread From: Paul Mackerras @ 2005-09-20 10:50 UTC (permalink / raw) To: Josef Weidendorfer; +Cc: git, valgrind-developers Josef Weidendorfer writes: > Just curious: Why is it using LD_PRELOAD and not the VGs symbol redirection > mechanism, which should catch strlen even if used inside of glibc? Valgrind-2.4.1 does symbol redirects for stpcpy and strnlen in libc.so.6, and stpcpy and strchr in ld-linux.so.2. I could extend that list on ppc to include strlen et al., but I don't know if that would solve the problem for ld.so, since strlen doesn't appear in the symbol table for ld.so. Paul. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-15 2:23 ` git-diff-tree rename detection bug Paul Mackerras 2005-09-15 2:26 ` Paul Mackerras @ 2005-09-15 3:29 ` Linus Torvalds 2005-09-16 5:59 ` Junio C Hamano 2005-09-15 5:58 ` H. Peter Anvin 2 siblings, 1 reply; 23+ messages in thread From: Linus Torvalds @ 2005-09-15 3:29 UTC (permalink / raw) To: Paul Mackerras; +Cc: Wayne Scott, Junio C Hamano, git On Thu, 15 Sep 2005, Paul Mackerras wrote: > > Linus Torvalds writes: > > > Damn, too bad valgrind doesn't work on ppc64, so I can't use it on my main > > machine. It seems to be in development on ppc32, so maybe some day. > > How about... today? :-) My port of Valgrind-2.4.1 to ppc32 works > pretty well. You can get it from: > > http://www.valgrind.org/downloads/variants.html?pmk Ahh. Compiling right now. > I assume you're compiling git as 32-bit executables on your G5. I > don't see any reason why the git binaries would need to be 64-bit. I use whatever the defaults are. And yes, it seems to me -m32. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-15 3:29 ` Linus Torvalds @ 2005-09-16 5:59 ` Junio C Hamano 0 siblings, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2005-09-16 5:59 UTC (permalink / raw) To: Linus Torvalds; +Cc: Paul Mackerras, Wayne Scott, git Linus Torvalds <torvalds@osdl.org> writes: > On Thu, 15 Sep 2005, Paul Mackerras wrote: >> >> Linus Torvalds writes: >> >> > Damn, too bad valgrind doesn't work on ppc64, so I can't use it on my main >> > machine. It seems to be in development on ppc32, so maybe some day. FWIW, I think I fixed the leak that patch we reverted was attempting to fix, and on my i386 box valgrind seems much happier. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-15 2:23 ` git-diff-tree rename detection bug Paul Mackerras 2005-09-15 2:26 ` Paul Mackerras 2005-09-15 3:29 ` Linus Torvalds @ 2005-09-15 5:58 ` H. Peter Anvin 2005-09-15 7:41 ` Junio C Hamano ` (2 more replies) 2 siblings, 3 replies; 23+ messages in thread From: H. Peter Anvin @ 2005-09-15 5:58 UTC (permalink / raw) To: Paul Mackerras; +Cc: Linus Torvalds, Wayne Scott, Junio C Hamano, git Paul Mackerras wrote: > > I assume you're compiling git as 32-bit executables on your G5. I > don't see any reason why the git binaries would need to be 64-bit. > Well, git seems to assume it can mmap() the entirety of any file under its control, so a 64-bit git could handle larger files. Still, I'm using 32-bit git on ppc64. -hpa ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-15 5:58 ` H. Peter Anvin @ 2005-09-15 7:41 ` Junio C Hamano 2005-09-15 14:55 ` Linus Torvalds 2005-09-15 20:57 ` Nicolas Pitre 2 siblings, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2005-09-15 7:41 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Paul Mackerras, Linus Torvalds, Wayne Scott, git "H. Peter Anvin" <hpa@zytor.com> writes: > Paul Mackerras wrote: >> I assume you're compiling git as 32-bit executables on your G5. I >> don't see any reason why the git binaries would need to be 64-bit. >> > > Well, git seems to assume it can mmap() the entirety of any file under > its control, so a 64-bit git could handle larger files. > > Still, I'm using 32-bit git on ppc64. Which is a valid thing to do because git also assumes that a file offset fits within 32-bit as far as I know. Each object in a packfile should also fit in 32-bit offset and the size of one packfile also needs to be within 32-bit offset. I was unsure if the last limitation would cause problems in the real life but we will see in a couple of years ;-). ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-15 5:58 ` H. Peter Anvin 2005-09-15 7:41 ` Junio C Hamano @ 2005-09-15 14:55 ` Linus Torvalds 2005-09-16 12:53 ` Matthias Urlichs 2005-09-15 20:57 ` Nicolas Pitre 2 siblings, 1 reply; 23+ messages in thread From: Linus Torvalds @ 2005-09-15 14:55 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Paul Mackerras, Wayne Scott, Junio C Hamano, git On Wed, 14 Sep 2005, H. Peter Anvin wrote: > > Well, git seems to assume it can mmap() the entirety of any file under > its control, so a 64-bit git could handle larger files. Right now git isn't 64-bit clean anyway (well, it should _work_ fine on 64-bit architectures, but it just can't handle files >32 bits). I think the core object format should be all perfectly ok (since all the sizes etc are in ascii), so this is not a huge deal. If you have an existing git archive and suddenly notice that you need 32+ bit file formats, you don't have to throw your archive away and re-generate it from scratch in some new format. In fact, I think even the streaming pack format is 64-bit-safe: it has binary sizes, but they are all length-encoded, and I think the data structure is safe. Also, the index file - in order to not be unnecessarily big, and to be able to use standard htonl/ntohl helpers etc - uses 32-bit lengths for files. Even that _that_ should be 64-bit safe, because the length isn't actually _used_ for anything but to verify the curret state, and as with the timestamps etc, it's ok to "only" check the low 32 bits. So the on-disk format should be capable of handling 64-bit entities. HOWEVER. I might be wrong. I tried to think about it, but I didn't care too much, because I think git would suck at truly huge files. If only because compressing them will take forever. And the _implementation_ uses "unsigned int" and has never been tested with anything else, so it would need a lot of testing. Also, the "pack index" file can only handle 32-bit offsets - you can make a pack-file that is bigger than that, and it should be fine from a _streaming_ standpoint (ie in the way we use them for network transport), but you can't index them in .git/objects/packs. Which isn't a disaster: you might choose to say that you never pack huge files. That might be ok for some cases (maybe the huge file is a one-off satellite picture). It would suck if the huge file is a incrementally created log-file, where packing really would be nice. IF we ever hit this, and IF people decide that git actually makes sense for those kinds of files, we CAN change the pack index format. It has a version number and everything, so we can even do it gently. Hopefully that would be the only actual on-disk format that would need to change. But regardless, the git code itself would need a lot of verification to make sure that it handles big files correctly. I'm not seeing that as a high priority. Maybe in five years ;) Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-15 14:55 ` Linus Torvalds @ 2005-09-16 12:53 ` Matthias Urlichs 0 siblings, 0 replies; 23+ messages in thread From: Matthias Urlichs @ 2005-09-16 12:53 UTC (permalink / raw) To: git Hi, Linus Torvalds wrote: > I'm not seeing that as a high priority. Maybe in five years ;) Hmm. Just for comparison, how long was the time between "this will never run on anything but i386" and "mkdir arch/alpha"? ;-) -- Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de - - ...He who laughs does not believe in what he laughs at, but neither does he hate it. Therefore, laughing at evil means not preparing oneself to combat it, and laughing at good means denying the power through which good is self-propagating. -- Umberto Eco, "The Name of the Rose" ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-15 5:58 ` H. Peter Anvin 2005-09-15 7:41 ` Junio C Hamano 2005-09-15 14:55 ` Linus Torvalds @ 2005-09-15 20:57 ` Nicolas Pitre 2 siblings, 0 replies; 23+ messages in thread From: Nicolas Pitre @ 2005-09-15 20:57 UTC (permalink / raw) To: H. Peter Anvin Cc: Paul Mackerras, Linus Torvalds, Wayne Scott, Junio C Hamano, git On Wed, 14 Sep 2005, H. Peter Anvin wrote: > Paul Mackerras wrote: > > > > I assume you're compiling git as 32-bit executables on your G5. I > > don't see any reason why the git binaries would need to be 64-bit. > > > > Well, git seems to assume it can mmap() the entirety of any file under its > control, so a 64-bit git could handle larger files. But beware that the delta code would break since it currently won't cope with any file offset larger than 32 bits. I reserved the zero byte to prefix any extended encoding but felt it wasn't really needed yet and I just didn't bother coding it. Nicolas ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-14 16:47 git-diff-tree rename detection bug Wayne Scott 2005-09-14 18:09 ` Junio C Hamano @ 2005-09-14 18:51 ` Junio C Hamano [not found] ` <59a6e5830509141208282166c8@mail.gmail.com> 1 sibling, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2005-09-14 18:51 UTC (permalink / raw) To: wsc9tt; +Cc: git Wayne Scott <wsc9tt@gmail.com> writes: > Look at the diffs between ad6571a78ac74e9fa27e581834709067dba459af and > it's parent with and without rename detection enabled. (In linux-2.6 > git tree) > $ git-diff-tree -r -M $REV^1 $REV | grep termios.h > :000000 100644 0000000000000000000000000000000000000000 > 237533bb0e9f1a3e640c4906d8b350deafd315b9 A include/asm-powerpc/termios.h > :100644 000000 97c6287a6cbaa5903ee1a5934a5553e9e485d8e7 > 0000000000000000000000000000000000000000 D include/asm-ppc/termios.h > > Notice how the the fact that include/asm-ppc64/termios.h is deleted gets lost? > Looks broken to me. It looks broken to me, too. I rebuilt from a reasonably ancient source (v0.99) and re-run the test but I could not get it to produce 'A' for include/asm-powerpc/termios.h. So I rewound it further to 4d235c8044a638108b67e22f94b2876657130fc8 commit, which is really ancient version, but it still says it is renamed from asm-ppc64 directory. FWIW, all the v0.99* tagged versions seem to detect that rename correctly and not lose anything in my tests. Which version of git do you run and on what platform? It might be that something in the diffcore chain is broken in non-i386 and/or non-GNU/Linux and/or non-GCC environment. Shoot, I thought it would be a good practice-case for me to use 'git bisect' in reverse to find the commit that fixed a bug ;-). My copy of linux-2.6 repository for testing is fully packed so I could not try the commit that introduced diffcore-rename.c, but that is what I really wanted to try. ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <59a6e5830509141208282166c8@mail.gmail.com>]
* Re: git-diff-tree rename detection bug [not found] ` <59a6e5830509141208282166c8@mail.gmail.com> @ 2005-09-14 19:19 ` Junio C Hamano 2005-09-14 20:53 ` Linus Torvalds 0 siblings, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2005-09-14 19:19 UTC (permalink / raw) To: wsc9tt; +Cc: git Could you revert one or both of these commits and try the same test on the 64-bit box you saw problems with please? Commit: 90a734dc7f37a7bd1f3beec4d33acad559360f6c Author: Yasushi SHOJI <yashi@atmark-techno.com> Date: Sun Aug 21 16:14:16 2005 +0900 [PATCH] possible memory leak in diff.c::diff_free_filepair() Commit: 068eac91ce04b9aca163acb1927c3878c45d1a07 Author: Yasushi SHOJI <yashi@atmark-techno.com> Date: Sat Aug 13 19:58:56 2005 +0900 [PATCH] plug memory leak in diff.c::diff_free_filepair() ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-14 19:19 ` Junio C Hamano @ 2005-09-14 20:53 ` Linus Torvalds 2005-09-14 21:55 ` Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Linus Torvalds @ 2005-09-14 20:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: wsc9tt, git On Wed, 14 Sep 2005, Junio C Hamano wrote: > > Could you revert one or both of these commits and try the same > test on the 64-bit box you saw problems with please? > > Commit: 90a734dc7f37a7bd1f3beec4d33acad559360f6c > Author: Yasushi SHOJI <yashi@atmark-techno.com> > Date: Sun Aug 21 16:14:16 2005 +0900 > > [PATCH] possible memory leak in diff.c::diff_free_filepair() > > Commit: 068eac91ce04b9aca163acb1927c3878c45d1a07 > Author: Yasushi SHOJI <yashi@atmark-techno.com> > Date: Sat Aug 13 19:58:56 2005 +0900 > > [PATCH] plug memory leak in diff.c::diff_free_filepair() Undoing that second one (068eac91ce04b9aca163acb1927c3878c45d1a07) fixes the valgrind errors. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: git-diff-tree rename detection bug 2005-09-14 20:53 ` Linus Torvalds @ 2005-09-14 21:55 ` Junio C Hamano 0 siblings, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2005-09-14 21:55 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Linus Torvalds <torvalds@osdl.org> writes: > Undoing that second one (068eac91ce04b9aca163acb1927c3878c45d1a07) fixes > the valgrind errors. Thanks; reverted and applied your other patch and pushed them out. I think the filepairs are sometimes shared so to be able to free them properly, we need to reference count. Ugh. ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2005-09-20 10:50 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-09-14 16:47 git-diff-tree rename detection bug Wayne Scott
2005-09-14 18:09 ` Junio C Hamano
2005-09-14 18:40 ` Wayne Scott
2005-09-14 20:24 ` Linus Torvalds
2005-09-14 20:41 ` Fix alloc_filespec() initialization Linus Torvalds
2005-09-15 2:23 ` git-diff-tree rename detection bug Paul Mackerras
2005-09-15 2:26 ` Paul Mackerras
2005-09-15 3:36 ` Linus Torvalds
2005-09-15 4:52 ` Paul Mackerras
2005-09-15 8:17 ` Junio C Hamano
2005-09-15 14:49 ` Josef Weidendorfer
2005-09-20 10:50 ` [Valgrind-developers] " Paul Mackerras
2005-09-15 3:29 ` Linus Torvalds
2005-09-16 5:59 ` Junio C Hamano
2005-09-15 5:58 ` H. Peter Anvin
2005-09-15 7:41 ` Junio C Hamano
2005-09-15 14:55 ` Linus Torvalds
2005-09-16 12:53 ` Matthias Urlichs
2005-09-15 20:57 ` Nicolas Pitre
2005-09-14 18:51 ` Junio C Hamano
[not found] ` <59a6e5830509141208282166c8@mail.gmail.com>
2005-09-14 19:19 ` Junio C Hamano
2005-09-14 20:53 ` Linus Torvalds
2005-09-14 21:55 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).