* [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking @ 2009-08-06 9:55 Nick Edelen 2009-08-06 14:48 ` Johannes Schindelin 0 siblings, 1 reply; 34+ messages in thread From: Nick Edelen @ 2009-08-06 9:55 UTC (permalink / raw) To: Junio C Hamano, Johannes Schindelin, Jeff King, Sam Vilain, Shawn O. Pearce SUGGESTED FOR 'PU': Traversing objects is currently very costly, as every commit and tree must be loaded and parsed. Much time and energy could be saved by caching metadata and topological info in an efficient, easily accessible manner. Furthermore, this could improve git's interfacing potential, by providing a condensed summary of a repository's commit tree. This is a series to implement such a revision caching mechanism, aptly named rev-cache. The series will provide: - a core API to manipulate and traverse caches - an integration into the internal revision walker - a porcelain front-end providing access to users and (shell) applications - a series of tests to verify/demonstrate correctness - documentation of the API, porcelain and core concepts In cold starts rev-cache has sped up packing and walking by a factor of 4, and over twice that on warm starts. Some times on slax for the linux repository: rev-list --all --objects >/dev/null default cold 1:13 warm 0:43 rev-cache'd cold 0:19 warm 0:02 pack-objects --revs --all --stdout >/dev/null default cold 2:44 warm 1:21 rev-cache'd cold 0:44 warm 0:10 The mechanism is minimally intrusive: most of the changes take place in seperate files, and only a handful of git's existing functions are modified. Hope you find this useful. - Nick Documentation/rev-cache.txt | 51 + Documentation/technical/rev-cache.txt | 336 ++++++ Makefile | 2 + blob.c | 1 + blob.h | 1 + builtin-rev-cache.c | 284 +++++ builtin.h | 1 + commit.c | 3 + commit.h | 2 + git.c | 1 + list-objects.c | 49 +- rev-cache.c | 1832 +++++++++++++++++++++++++++++++++ revision.c | 89 ++- revision.h | 46 +- t/t6015-rev-cache-list.sh | 228 ++++ t/t6015-sha1-dump-diff.py | 36 + 16 files changed, 2937 insertions(+), 25 deletions(-) create mode 100755 Documentation/rev-cache.txt create mode 100755 Documentation/technical/rev-cache.txt create mode 100755 builtin-rev-cache.c create mode 100755 rev-cache.c create mode 100755 t/t6015-rev-cache-list.sh create mode 100755 t/t6015-sha1-dump-diff.py ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 9:55 [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking Nick Edelen @ 2009-08-06 14:48 ` Johannes Schindelin 2009-08-06 14:58 ` Michael J Gruber 0 siblings, 1 reply; 34+ messages in thread From: Johannes Schindelin @ 2009-08-06 14:48 UTC (permalink / raw) To: Nick Edelen Cc: Junio C Hamano, Jeff King, Sam Vilain, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Hi, On Thu, 6 Aug 2009, Nick Edelen wrote: > SUGGESTED FOR 'PU': > > Traversing objects is currently very costly, as every commit and tree must be > loaded and parsed. Much time and energy could be saved by caching metadata and > topological info in an efficient, easily accessible manner. Furthermore, this > could improve git's interfacing potential, by providing a condensed summary of > a repository's commit tree. > > This is a series to implement such a revision caching mechanism, aptly named > rev-cache. The series will provide: > - a core API to manipulate and traverse caches > - an integration into the internal revision walker > - a porcelain front-end providing access to users and (shell) applications > - a series of tests to verify/demonstrate correctness > - documentation of the API, porcelain and core concepts > > In cold starts rev-cache has sped up packing and walking by a factor of 4, and > over twice that on warm starts. Some times on slax for the linux repository: > > rev-list --all --objects >/dev/null > default > cold 1:13 > warm 0:43 > rev-cache'd > cold 0:19 > warm 0:02 > > pack-objects --revs --all --stdout >/dev/null > default > cold 2:44 > warm 1:21 > rev-cache'd > cold 0:44 > warm 0:10 Nice! > The mechanism is minimally intrusive: most of the changes take place in > seperate files, and only a handful of git's existing functions are > modified. Sorry, I forgot the details, could you quickly remind me why these caches are not in the pack index files? > Documentation/rev-cache.txt | 51 + > Documentation/technical/rev-cache.txt | 336 ++++++ > Makefile | 2 + > blob.c | 1 + > blob.h | 1 + > builtin-rev-cache.c | 284 +++++ > builtin.h | 1 + > commit.c | 3 + > commit.h | 2 + > git.c | 1 + > list-objects.c | 49 +- > rev-cache.c | 1832 +++++++++++++++++++++++++++++++++ > revision.c | 89 ++- > revision.h | 46 +- > t/t6015-rev-cache-list.sh | 228 ++++ > t/t6015-sha1-dump-diff.py | 36 + Hmpf. We got rid of the last Python script in Git a long time ago, but now two different patch series try to sneak that dependency (at least for testing) back in. That's all the worse because we cannot use Python in msysGit, and Windows should be a platform benefitting dramatically from your work. Ciao, Dscho ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 14:48 ` Johannes Schindelin @ 2009-08-06 14:58 ` Michael J Gruber 2009-08-06 17:39 ` Nick Edelen 0 siblings, 1 reply; 34+ messages in thread From: Michael J Gruber @ 2009-08-06 14:58 UTC (permalink / raw) To: Johannes Schindelin Cc: Nick Edelen, Junio C Hamano, Jeff King, Sam Vilain, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Johannes Schindelin venit, vidit, dixit 06.08.2009 16:48: > Hi, > > On Thu, 6 Aug 2009, Nick Edelen wrote: > >> SUGGESTED FOR 'PU': >> >> Traversing objects is currently very costly, as every commit and tree must be >> loaded and parsed. Much time and energy could be saved by caching metadata and >> topological info in an efficient, easily accessible manner. Furthermore, this >> could improve git's interfacing potential, by providing a condensed summary of >> a repository's commit tree. >> >> This is a series to implement such a revision caching mechanism, aptly named >> rev-cache. The series will provide: >> - a core API to manipulate and traverse caches >> - an integration into the internal revision walker >> - a porcelain front-end providing access to users and (shell) applications >> - a series of tests to verify/demonstrate correctness >> - documentation of the API, porcelain and core concepts >> >> In cold starts rev-cache has sped up packing and walking by a factor of 4, and >> over twice that on warm starts. Some times on slax for the linux repository: >> >> rev-list --all --objects >/dev/null >> default >> cold 1:13 >> warm 0:43 >> rev-cache'd >> cold 0:19 >> warm 0:02 >> >> pack-objects --revs --all --stdout >/dev/null >> default >> cold 2:44 >> warm 1:21 >> rev-cache'd >> cold 0:44 >> warm 0:10 > > Nice! > >> The mechanism is minimally intrusive: most of the changes take place in >> seperate files, and only a handful of git's existing functions are >> modified. > > Sorry, I forgot the details, could you quickly remind me why these caches > are not in the pack index files? > >> Documentation/rev-cache.txt | 51 + >> Documentation/technical/rev-cache.txt | 336 ++++++ >> Makefile | 2 + >> blob.c | 1 + >> blob.h | 1 + >> builtin-rev-cache.c | 284 +++++ >> builtin.h | 1 + >> commit.c | 3 + >> commit.h | 2 + >> git.c | 1 + >> list-objects.c | 49 +- >> rev-cache.c | 1832 +++++++++++++++++++++++++++++++++ >> revision.c | 89 ++- >> revision.h | 46 +- >> t/t6015-rev-cache-list.sh | 228 ++++ >> t/t6015-sha1-dump-diff.py | 36 + > > Hmpf. > > We got rid of the last Python script in Git a long time ago, but now two > different patch series try to sneak that dependency (at least for testing) > back in. > > That's all the worse because we cannot use Python in msysGit, and Windows > should be a platform benefitting dramatically from your work. In fact, the test the script performs could be easily rephrased with "sort", "uniq" and "comm". OTOH: If the walker is supposed to return the exact same orderd list of commits you can just use test_cmp. Michael ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 14:58 ` Michael J Gruber @ 2009-08-06 17:39 ` Nick Edelen 2009-08-06 19:06 ` Johannes Schindelin 0 siblings, 1 reply; 34+ messages in thread From: Nick Edelen @ 2009-08-06 17:39 UTC (permalink / raw) To: Michael J Gruber Cc: Johannes Schindelin, Junio C Hamano, Jeff King, Sam Vilain, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Hi there, > Sorry, I forgot the details, could you quickly remind me why these caches > are not in the pack index files? Er, I'm not sure what you mean. Are you asking why these revision caches are required if we have a pack index, or why they aren't in the pack index, or something different? I'm thinking probably the second: the short answer is that cache slices are totally independant of pack files. It might be possible to somehow merge revision cache slices with pack indexes, but I don't think it'd be a very suitable modification. The rev-cache slices are meant to act almost like topo-relation pack files: new slices are simply new files, seperate slice files can be fused ("repacked") into a larger one, the index is a (recreatable) single file associating file (positions) with objects. The format was geared to reducing potential cache/data loss and preventing overly large cache slices. >> Hmpf. >> >> We got rid of the last Python script in Git a long time ago, but now two >> different patch series try to sneak that dependency (at least for testing) >> back in. >> >> That's all the worse because we cannot use Python in msysGit, and Windows >> should be a platform benefitting dramatically from your work. > > In fact, the test the script performs could be easily rephrased with > "sort", "uniq" and "comm". > OTOH: If the walker is supposed to return the exact same orderd list of > commits you can just use test_cmp. The language that script is written in isn't important. I originally wrote it in python because I wanted something quick and wasn't much of a sh guru (sorry :-/ ). As Micheal said I've no doubt it can easily be converted to shell script -- in fact, I'll try to get a shell version working today. - Nick ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 17:39 ` Nick Edelen @ 2009-08-06 19:06 ` Johannes Schindelin 2009-08-06 20:01 ` Nick Edelen ` (2 more replies) 0 siblings, 3 replies; 34+ messages in thread From: Johannes Schindelin @ 2009-08-06 19:06 UTC (permalink / raw) To: Nick Edelen Cc: Michael J Gruber, Junio C Hamano, Jeff King, Sam Vilain, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Hi, On Thu, 6 Aug 2009, Nick Edelen wrote: > > Sorry, I forgot the details, could you quickly remind me why these > > caches are not in the pack index files? > > Er, I'm not sure what you mean. Are you asking why these revision > caches are required if we have a pack index, or why they aren't in the > pack index, or something different? I'm thinking probably the second: Yep. > the short answer is that cache slices are totally independant of pack > files. My idea with that was that you already have a SHA-1 map in the pack index, and if all you want to be able to accelerate the revision walker, you'd probably need something that adds yet another mapping, from commit to parents and tree, and from tree to sub-tree and blob (so you can avoid unpacking commit and tree objects). I just thought that it could be more efficient to do it at the time the pack index is written _anyway_, as nothing will change in the pack after that anyway. But I guess I can answer my question easily myself: the boundary commits will not be handled that way. Still, there is some redundancy between the pack index and your cache, as you have to write out the whole list of SHA-1s all over again. I guess it is time to look at the code instead of asking stupid questions. > It might be possible to somehow merge revision cache slices with pack > indexes, but I don't think it'd be a very suitable modification. The > rev-cache slices are meant to act almost like topo-relation pack files: > new slices are simply new files, seperate slice files can be fused > ("repacked") into a larger one, the index is a (recreatable) single file > associating file (positions) with objects. The format was geared to > reducing potential cache/data loss and preventing overly large cache > slices. > > >> Hmpf. > >> > >> We got rid of the last Python script in Git a long time ago, but now > >> two different patch series try to sneak that dependency (at least for > >> testing) back in. > >> > >> That's all the worse because we cannot use Python in msysGit, and > >> Windows should be a platform benefitting dramatically from your work. > > > > In fact, the test the script performs could be easily rephrased with > > "sort", "uniq" and "comm". OTOH: If the walker is supposed to return > > the exact same orderd list of commits you can just use test_cmp. > > The language that script is written in isn't important. I originally > wrote it in python because I wanted something quick and wasn't much of > a sh guru (sorry :-/ ). As Micheal said I've no doubt it can easily > be converted to shell script That is not what I wanted to hear. > -- in fact, I'll try to get a shell version working today. That is. Thanks, Dscho ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 19:06 ` Johannes Schindelin @ 2009-08-06 20:01 ` Nick Edelen 2009-08-06 20:30 ` Nick Edelen 2009-08-07 2:47 ` Sam Vilain 2009-08-08 18:57 ` Junio C Hamano 2 siblings, 1 reply; 34+ messages in thread From: Nick Edelen @ 2009-08-06 20:01 UTC (permalink / raw) To: Johannes Schindelin Cc: Michael J Gruber, Junio C Hamano, Jeff King, Sam Vilain, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Hi, >My idea with that was that you already have a SHA-1 map in the pack index, >and if all you want to be able to accelerate the revision walker, you'd >probably need something that adds yet another mapping, from commit to >parents and tree, and from tree to sub-tree and blob (so you can avoid >unpacking commit and tree objects). As I mention in one of the patch descriptions, along with each commit a list of objects introduced per commit is cached, so no extra I/O is necessary for tree recursion, etc. during traversal. >I just thought that it could be more efficient to do it at the time the >pack index is written _anyway_, as nothing will change in the pack after >that anyway. Nothing might change in the pack, but the slices were made to allow for continual addition and refinement of the cache. In a typical usage slices will be added and fused on a regular basis, which would require tinkering in pack indexes if they were combined. >But I guess I can answer my question easily myself: the boundary commits >will not be handled that way. > >Still, there is some redundancy between the pack index and your cache, as >you have to write out the whole list of SHA-1s all over again. I guess it >is time to look at the code instead of asking stupid questions. The whole revision cache is redundant, technically speaking: nothing in it can't be found by rummaging through packs or objects. The point of it was to distill out important information for fast, easy access of the commit tree. On another note, I've eliminated the python dependancy. Shall I resend the patchset now or should I wait until it has been further reviewed? (don't want to flood the list with resubmits) - Nick On Thu, Aug 6, 2009 at 9:06 PM, Johannes Schindelin<Johannes.Schindelin@gmx.de> wrote: > Hi, > > On Thu, 6 Aug 2009, Nick Edelen wrote: > >> > Sorry, I forgot the details, could you quickly remind me why these >> > caches are not in the pack index files? >> >> Er, I'm not sure what you mean. Are you asking why these revision >> caches are required if we have a pack index, or why they aren't in the >> pack index, or something different? I'm thinking probably the second: > > Yep. > >> the short answer is that cache slices are totally independant of pack >> files. > > My idea with that was that you already have a SHA-1 map in the pack index, > and if all you want to be able to accelerate the revision walker, you'd > probably need something that adds yet another mapping, from commit to > parents and tree, and from tree to sub-tree and blob (so you can avoid > unpacking commit and tree objects). > > I just thought that it could be more efficient to do it at the time the > pack index is written _anyway_, as nothing will change in the pack after > that anyway. > > But I guess I can answer my question easily myself: the boundary commits > will not be handled that way. > > Still, there is some redundancy between the pack index and your cache, as > you have to write out the whole list of SHA-1s all over again. I guess it > is time to look at the code instead of asking stupid questions. > >> It might be possible to somehow merge revision cache slices with pack >> indexes, but I don't think it'd be a very suitable modification. The >> rev-cache slices are meant to act almost like topo-relation pack files: >> new slices are simply new files, seperate slice files can be fused >> ("repacked") into a larger one, the index is a (recreatable) single file >> associating file (positions) with objects. The format was geared to >> reducing potential cache/data loss and preventing overly large cache >> slices. >> >> >> Hmpf. >> >> >> >> We got rid of the last Python script in Git a long time ago, but now >> >> two different patch series try to sneak that dependency (at least for >> >> testing) back in. >> >> >> >> That's all the worse because we cannot use Python in msysGit, and >> >> Windows should be a platform benefitting dramatically from your work. >> > >> > In fact, the test the script performs could be easily rephrased with >> > "sort", "uniq" and "comm". OTOH: If the walker is supposed to return >> > the exact same orderd list of commits you can just use test_cmp. >> >> The language that script is written in isn't important. I originally >> wrote it in python because I wanted something quick and wasn't much of >> a sh guru (sorry :-/ ). As Micheal said I've no doubt it can easily >> be converted to shell script > > That is not what I wanted to hear. > >> -- in fact, I'll try to get a shell version working today. > > That is. > > Thanks, > Dscho > ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 20:01 ` Nick Edelen @ 2009-08-06 20:30 ` Nick Edelen 2009-08-06 20:32 ` Shawn O. Pearce 2009-08-07 4:42 ` Nicolas Pitre 0 siblings, 2 replies; 34+ messages in thread From: Nick Edelen @ 2009-08-06 20:30 UTC (permalink / raw) To: Johannes Schindelin Cc: Michael J Gruber, Junio C Hamano, Jeff King, Sam Vilain, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Hrmm, I just realized that it dosn't actually cache paths/names... This obviously has no bearing on its use in packing, but I should either add that in or restrict usage in non-packing-related walks. Weird how things like that escape you. I think I may go ahead and add support for this tomorrow. It should have no effect on performance and very little impact on cache slice size. On Thu, Aug 6, 2009 at 10:01 PM, Nick Edelen<sirnot@gmail.com> wrote: > Hi, > >>My idea with that was that you already have a SHA-1 map in the pack index, >>and if all you want to be able to accelerate the revision walker, you'd >>probably need something that adds yet another mapping, from commit to >>parents and tree, and from tree to sub-tree and blob (so you can avoid >>unpacking commit and tree objects). > > As I mention in one of the patch descriptions, along with each commit > a list of objects introduced per commit is cached, so no extra I/O is > necessary for tree recursion, etc. during traversal. > >>I just thought that it could be more efficient to do it at the time the >>pack index is written _anyway_, as nothing will change in the pack after >>that anyway. > > Nothing might change in the pack, but the slices were made to allow > for continual addition and refinement of the cache. In a typical > usage slices will be added and fused on a regular basis, which would > require tinkering in pack indexes if they were combined. > >>But I guess I can answer my question easily myself: the boundary commits >>will not be handled that way. >> >>Still, there is some redundancy between the pack index and your cache, as >>you have to write out the whole list of SHA-1s all over again. I guess it >>is time to look at the code instead of asking stupid questions. > > The whole revision cache is redundant, technically speaking: nothing > in it can't be found by rummaging through packs or objects. The point > of it was to distill out important information for fast, easy access > of the commit tree. > > On another note, I've eliminated the python dependancy. Shall I > resend the patchset now or should I wait until it has been further > reviewed? (don't want to flood the list with resubmits) > > - Nick > > On Thu, Aug 6, 2009 at 9:06 PM, Johannes > Schindelin<Johannes.Schindelin@gmx.de> wrote: >> Hi, >> >> On Thu, 6 Aug 2009, Nick Edelen wrote: >> >>> > Sorry, I forgot the details, could you quickly remind me why these >>> > caches are not in the pack index files? >>> >>> Er, I'm not sure what you mean. Are you asking why these revision >>> caches are required if we have a pack index, or why they aren't in the >>> pack index, or something different? I'm thinking probably the second: >> >> Yep. >> >>> the short answer is that cache slices are totally independant of pack >>> files. >> >> My idea with that was that you already have a SHA-1 map in the pack index, >> and if all you want to be able to accelerate the revision walker, you'd >> probably need something that adds yet another mapping, from commit to >> parents and tree, and from tree to sub-tree and blob (so you can avoid >> unpacking commit and tree objects). >> >> I just thought that it could be more efficient to do it at the time the >> pack index is written _anyway_, as nothing will change in the pack after >> that anyway. >> >> But I guess I can answer my question easily myself: the boundary commits >> will not be handled that way. >> >> Still, there is some redundancy between the pack index and your cache, as >> you have to write out the whole list of SHA-1s all over again. I guess it >> is time to look at the code instead of asking stupid questions. >> >>> It might be possible to somehow merge revision cache slices with pack >>> indexes, but I don't think it'd be a very suitable modification. The >>> rev-cache slices are meant to act almost like topo-relation pack files: >>> new slices are simply new files, seperate slice files can be fused >>> ("repacked") into a larger one, the index is a (recreatable) single file >>> associating file (positions) with objects. The format was geared to >>> reducing potential cache/data loss and preventing overly large cache >>> slices. >>> >>> >> Hmpf. >>> >> >>> >> We got rid of the last Python script in Git a long time ago, but now >>> >> two different patch series try to sneak that dependency (at least for >>> >> testing) back in. >>> >> >>> >> That's all the worse because we cannot use Python in msysGit, and >>> >> Windows should be a platform benefitting dramatically from your work. >>> > >>> > In fact, the test the script performs could be easily rephrased with >>> > "sort", "uniq" and "comm". OTOH: If the walker is supposed to return >>> > the exact same orderd list of commits you can just use test_cmp. >>> >>> The language that script is written in isn't important. I originally >>> wrote it in python because I wanted something quick and wasn't much of >>> a sh guru (sorry :-/ ). As Micheal said I've no doubt it can easily >>> be converted to shell script >> >> That is not what I wanted to hear. >> >>> -- in fact, I'll try to get a shell version working today. >> >> That is. >> >> Thanks, >> Dscho >> > ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 20:30 ` Nick Edelen @ 2009-08-06 20:32 ` Shawn O. Pearce 2009-08-06 23:35 ` A Large Angry SCM 2009-08-07 4:42 ` Nicolas Pitre 1 sibling, 1 reply; 34+ messages in thread From: Shawn O. Pearce @ 2009-08-06 20:32 UTC (permalink / raw) To: Nick Edelen Cc: Johannes Schindelin, Michael J Gruber, Junio C Hamano, Jeff King, Sam Vilain, Andreas Ericsson, Christian Couder, git@vger.kernel.org Nick Edelen <sirnot@gmail.com> wrote: > Hrmm, I just realized that it dosn't actually cache paths/names... > This obviously has no bearing on its use in packing, but I should > either add that in or restrict usage in non-packing-related walks. > Weird how things like that escape you. > > I think I may go ahead and add support for this tomorrow. It should > have no effect on performance and very little impact on cache slice > size. You may not need the path name, but instead the hash value that pack-objects computes from the path name. All that matters is the hash, so pack-objects can schedule the objects into the right buckets when its doing delta computation for objects which are not yet delta compressed, or whose delta cannot be suitably reused. -- Shawn. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 20:32 ` Shawn O. Pearce @ 2009-08-06 23:35 ` A Large Angry SCM 2009-08-06 23:37 ` Shawn O. Pearce 0 siblings, 1 reply; 34+ messages in thread From: A Large Angry SCM @ 2009-08-06 23:35 UTC (permalink / raw) To: Shawn O. Pearce Cc: Nick Edelen, Johannes Schindelin, Michael J Gruber, Junio C Hamano, Jeff King, Sam Vilain, Andreas Ericsson, Christian Couder, git@vger.kernel.org Shawn O. Pearce wrote: > Nick Edelen <sirnot@gmail.com> wrote: >> Hrmm, I just realized that it dosn't actually cache paths/names... >> This obviously has no bearing on its use in packing, but I should >> either add that in or restrict usage in non-packing-related walks. >> Weird how things like that escape you. >> >> I think I may go ahead and add support for this tomorrow. It should >> have no effect on performance and very little impact on cache slice >> size. > > You may not need the path name, but instead the hash value that > pack-objects computes from the path name. All that matters is > the hash, so pack-objects can schedule the objects into the right > buckets when its doing delta computation for objects which are not > yet delta compressed, or whose delta cannot be suitably reused. > Please do NOT expose the hash values. The hash used by pack-objects is an implementation detail of the heuristics used by the _current_ object packing code. It would be a real shame to have to maintain backward compatibility with it at some future date after the packing machinery has changed. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 23:35 ` A Large Angry SCM @ 2009-08-06 23:37 ` Shawn O. Pearce 2009-08-06 23:43 ` A Large Angry SCM 2009-08-07 6:05 ` Johannes Schindelin 0 siblings, 2 replies; 34+ messages in thread From: Shawn O. Pearce @ 2009-08-06 23:37 UTC (permalink / raw) To: A Large Angry SCM Cc: Nick Edelen, Johannes Schindelin, Michael J Gruber, Junio C Hamano, Jeff King, Sam Vilain, Andreas Ericsson, Christian Couder, git@vger.kernel.org A Large Angry SCM <gitzilla@gmail.com> wrote: > Shawn O. Pearce wrote: >> Nick Edelen <sirnot@gmail.com> wrote: >>> Hrmm, I just realized that it dosn't actually cache paths/names... >> >> You may not need the path name, but instead the hash value that >> pack-objects computes from the path name. > > Please do NOT expose the hash values. The hash used by pack-objects is > an implementation detail of the heuristics used by the _current_ object > packing code. It would be a real shame to have to maintain backward > compatibility with it at some future date after the packing machinery > has changed. This is a local cache. If there was a version number in the header, and the hash function changes, we could just bump the version number and invalidate all of the caches. No sense in storing (and doing IO of) huge duplicate string values for something where we really only need 32 bits, and where a recompute from scratch only costs a minute. -- Shawn. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 23:37 ` Shawn O. Pearce @ 2009-08-06 23:43 ` A Large Angry SCM 2009-08-07 0:15 ` Nick Edelen 2009-08-07 6:05 ` Johannes Schindelin 1 sibling, 1 reply; 34+ messages in thread From: A Large Angry SCM @ 2009-08-06 23:43 UTC (permalink / raw) To: Shawn O. Pearce Cc: Nick Edelen, Johannes Schindelin, Michael J Gruber, Junio C Hamano, Jeff King, Sam Vilain, Andreas Ericsson, Christian Couder, git@vger.kernel.org Shawn O. Pearce wrote: > A Large Angry SCM <gitzilla@gmail.com> wrote: >> Shawn O. Pearce wrote: >>> Nick Edelen <sirnot@gmail.com> wrote: >>>> Hrmm, I just realized that it dosn't actually cache paths/names... >>> You may not need the path name, but instead the hash value that >>> pack-objects computes from the path name. >> Please do NOT expose the hash values. The hash used by pack-objects is >> an implementation detail of the heuristics used by the _current_ object >> packing code. It would be a real shame to have to maintain backward >> compatibility with it at some future date after the packing machinery >> has changed. > > This is a local cache. If there was a version number in the header, > and the hash function changes, we could just bump the version number > and invalidate all of the caches. > > No sense in storing (and doing IO of) huge duplicate string values > for something where we really only need 32 bits, and where a > recompute from scratch only costs a minute. > That will work for me if the cache gets a version number and iff the pack-objects hash code gets big warning comments about the cache code dependency. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 23:43 ` A Large Angry SCM @ 2009-08-07 0:15 ` Nick Edelen 0 siblings, 0 replies; 34+ messages in thread From: Nick Edelen @ 2009-08-07 0:15 UTC (permalink / raw) To: gitzilla Cc: Shawn O. Pearce, Johannes Schindelin, Michael J Gruber, Junio C Hamano, Jeff King, Sam Vilain, Andreas Ericsson, Christian Couder, git@vger.kernel.org That would work, but I sorta like the idea of caching the actual names. I'm thinking of having a block of slice-unique, null-seperated names at the end of each slice (ie. not in the mapping) which is loaded into memory (it wouldn't be very big). Then each blob/tree object would have an variable length index referencing a particular name. Using the actual names would give us greater flexbility, and allow rev-cache to output proper rev-list type output (with the names after the hashes). On Fri, Aug 7, 2009 at 1:43 AM, A Large Angry SCM<gitzilla@gmail.com> wrote: > Shawn O. Pearce wrote: >> >> A Large Angry SCM <gitzilla@gmail.com> wrote: >>> >>> Shawn O. Pearce wrote: >>>> >>>> Nick Edelen <sirnot@gmail.com> wrote: >>>>> >>>>> Hrmm, I just realized that it dosn't actually cache paths/names... >>>> >>>> You may not need the path name, but instead the hash value that >>>> pack-objects computes from the path name. >>> >>> Please do NOT expose the hash values. The hash used by pack-objects is >>> an implementation detail of the heuristics used by the _current_ object >>> packing code. It would be a real shame to have to maintain backward >>> compatibility with it at some future date after the packing machinery has >>> changed. >> >> This is a local cache. If there was a version number in the header, >> and the hash function changes, we could just bump the version number >> and invalidate all of the caches. >> >> No sense in storing (and doing IO of) huge duplicate string values >> for something where we really only need 32 bits, and where a >> recompute from scratch only costs a minute. >> > > That will work for me if the cache gets a version number and iff the > pack-objects hash code gets big warning comments about the cache code > dependency. > ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 23:37 ` Shawn O. Pearce 2009-08-06 23:43 ` A Large Angry SCM @ 2009-08-07 6:05 ` Johannes Schindelin 1 sibling, 0 replies; 34+ messages in thread From: Johannes Schindelin @ 2009-08-07 6:05 UTC (permalink / raw) To: Shawn O. Pearce Cc: A Large Angry SCM, Nick Edelen, Michael J Gruber, Junio C Hamano, Jeff King, Sam Vilain, Andreas Ericsson, Christian Couder, git@vger.kernel.org On Thu, 6 Aug 2009, Shawn O. Pearce wrote: > A Large Angry SCM <gitzilla@gmail.com> wrote: > > Shawn O. Pearce wrote: > >> Nick Edelen <sirnot@gmail.com> wrote: > >>> Hrmm, I just realized that it dosn't actually cache paths/names... > >> > >> You may not need the path name, but instead the hash value that > >> pack-objects computes from the path name. > > > > Please do NOT expose the hash values. The hash used by pack-objects is > > an implementation detail of the heuristics used by the _current_ > > object packing code. It would be a real shame to have to maintain > > backward compatibility with it at some future date after the packing > > machinery has changed. > > This is a local cache. If there was a version number in the header, and > the hash function changes, we could just bump the version number and > invalidate all of the caches. > > No sense in storing (and doing IO of) huge duplicate string values for > something where we really only need 32 bits, and where a recompute from > scratch only costs a minute. FWIW it was this redundancy in duplicate (unpacked) string redundancy I meant, but I did a poor job at expressing myself, and consequently Nick did not understand what I want (and I'm on a slow connection, so I deleted the mail halfway through looking if there are some real answers hidden in the huge quoted part instead of replying). And the fragility of the dependency to the implementation detail of the pack index suggests to me that my intuition that this whole thing should be more tightly integrated with the pack index was not totally off the mark. Ciao, Dscho ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 20:30 ` Nick Edelen 2009-08-06 20:32 ` Shawn O. Pearce @ 2009-08-07 4:42 ` Nicolas Pitre 1 sibling, 0 replies; 34+ messages in thread From: Nicolas Pitre @ 2009-08-07 4:42 UTC (permalink / raw) To: Nick Edelen Cc: Johannes Schindelin, Michael J Gruber, Junio C Hamano, Jeff King, Sam Vilain, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org On Thu, 6 Aug 2009, Nick Edelen wrote: > Hrmm, I just realized that it dosn't actually cache paths/names... > This obviously has no bearing on its use in packing, but I should > either add that in or restrict usage in non-packing-related walks. > Weird how things like that escape you. Actually it is really the packing related walk that would benefit the most from this work. The "counting objects" phase of a clone may take quite a while with some repositories. Most other operations don't care as much because the rev walk is done incrementally whereas the packing operation needs to perform it all up front. Nicolas ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 19:06 ` Johannes Schindelin 2009-08-06 20:01 ` Nick Edelen @ 2009-08-07 2:47 ` Sam Vilain 2009-08-07 4:35 ` Nicolas Pitre 2009-08-07 6:12 ` Johannes Schindelin 2009-08-08 18:57 ` Junio C Hamano 2 siblings, 2 replies; 34+ messages in thread From: Sam Vilain @ 2009-08-07 2:47 UTC (permalink / raw) To: Johannes Schindelin Cc: Nick Edelen, Michael J Gruber, Junio C Hamano, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Johannes Schindelin wrote: >> the short answer is that cache slices are totally independant of pack >> files. >> > > My idea with that was that you already have a SHA-1 map in the pack index, > and if all you want to be able to accelerate the revision walker, you'd > probably need something that adds yet another mapping, from commit to > parents and tree, and from tree to sub-tree and blob (so you can avoid > unpacking commit and tree objects). > Tying indexes together like that is not a good idea in the database world. Especially as in this case as Nick mentions, the domain is subtly different (ie pack vs dag). Unfortunately you just can't try to pretend that they will always be the same; you can't force a full repack on every ref change! > Still, there is some redundancy between the pack index and your cache, as > you have to write out the whole list of SHA-1s all over again. I guess it > is time to look at the code instead of asking stupid questions. > "Disk is cheap" :-) It should be a welcome trade-off; perhaps it's worth including numbers about how big the indexes are with the time statistics. It sounds though like it should be a significant win as a single index can be used to accelerate a wide range of rev-list operations, and store indexes for many different questions that can be asked. Sam ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-07 2:47 ` Sam Vilain @ 2009-08-07 4:35 ` Nicolas Pitre 2009-08-07 6:08 ` Johannes Schindelin 2009-08-07 6:12 ` Johannes Schindelin 1 sibling, 1 reply; 34+ messages in thread From: Nicolas Pitre @ 2009-08-07 4:35 UTC (permalink / raw) To: Sam Vilain Cc: Johannes Schindelin, Nick Edelen, Michael J Gruber, Junio C Hamano, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org On Fri, 7 Aug 2009, Sam Vilain wrote: > Johannes Schindelin wrote: > >> the short answer is that cache slices are totally independant of pack > >> files. > >> > > > > My idea with that was that you already have a SHA-1 map in the pack index, > > and if all you want to be able to accelerate the revision walker, you'd > > probably need something that adds yet another mapping, from commit to > > parents and tree, and from tree to sub-tree and blob (so you can avoid > > unpacking commit and tree objects). > > > > Tying indexes together like that is not a good idea in the database > world. Especially as in this case as Nick mentions, the domain is subtly > different (ie pack vs dag). Unfortunately you just can't try to pretend > that they will always be the same; you can't force a full repack on > every ref change! Right. And the rev cache must work even if the repository is not packed. So pack index and rev caching are orthogonal things and are best kept separate on disk. How big this cache might get would be interesting indeed. Nicolas ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-07 4:35 ` Nicolas Pitre @ 2009-08-07 6:08 ` Johannes Schindelin 2009-08-07 14:18 ` Nicolas Pitre 0 siblings, 1 reply; 34+ messages in thread From: Johannes Schindelin @ 2009-08-07 6:08 UTC (permalink / raw) To: Nicolas Pitre Cc: Sam Vilain, Nick Edelen, Michael J Gruber, Junio C Hamano, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Hi, On Fri, 7 Aug 2009, Nicolas Pitre wrote: > On Fri, 7 Aug 2009, Sam Vilain wrote: > > > Johannes Schindelin wrote: > > >> the short answer is that cache slices are totally independant of > > >> pack files. > > >> > > > > > > My idea with that was that you already have a SHA-1 map in the pack > > > index, and if all you want to be able to accelerate the revision > > > walker, you'd probably need something that adds yet another mapping, > > > from commit to parents and tree, and from tree to sub-tree and blob > > > (so you can avoid unpacking commit and tree objects). > > > > > > > Tying indexes together like that is not a good idea in the database > > world. Especially as in this case as Nick mentions, the domain is > > subtly different (ie pack vs dag). Unfortunately you just can't try to > > pretend that they will always be the same; you can't force a full > > repack on every ref change! > > Right. And the rev cache must work even if the repository is not > packed. Umm, why? AFAICT the principal purpose of the rev cache is to help work loads on, say, www.kernel.org. I am unlikely to notice the improvements in my regular "git log" calls that only show a couple of pages before I quit the pager. Ciao, Dscho ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-07 6:08 ` Johannes Schindelin @ 2009-08-07 14:18 ` Nicolas Pitre 2009-08-08 15:18 ` Johannes Schindelin 0 siblings, 1 reply; 34+ messages in thread From: Nicolas Pitre @ 2009-08-07 14:18 UTC (permalink / raw) To: Johannes Schindelin Cc: Sam Vilain, Nick Edelen, Michael J Gruber, Junio C Hamano, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org On Fri, 7 Aug 2009, Johannes Schindelin wrote: > Hi, > > On Fri, 7 Aug 2009, Nicolas Pitre wrote: > > > On Fri, 7 Aug 2009, Sam Vilain wrote: > > > > > Johannes Schindelin wrote: > > > >> the short answer is that cache slices are totally independant of > > > >> pack files. > > > >> > > > > > > > > My idea with that was that you already have a SHA-1 map in the pack > > > > index, and if all you want to be able to accelerate the revision > > > > walker, you'd probably need something that adds yet another mapping, > > > > from commit to parents and tree, and from tree to sub-tree and blob > > > > (so you can avoid unpacking commit and tree objects). > > > > > > > > > > Tying indexes together like that is not a good idea in the database > > > world. Especially as in this case as Nick mentions, the domain is > > > subtly different (ie pack vs dag). Unfortunately you just can't try to > > > pretend that they will always be the same; you can't force a full > > > repack on every ref change! > > > > Right. And the rev cache must work even if the repository is not > > packed. > > Umm, why? AFAICT the principal purpose of the rev cache is to help work > loads on, say, www.kernel.org. So what? Speeding up rev-list with a rev cache is completely orthogonal to whether the repository is packed or not. It is like having a "git diff" result cache: no one would think of stuffing that in the pack index. If we want to improve on the repository packing format, that must be doable without bothering with an independent concept such as a rev cache. > I am unlikely to notice the improvements in my regular "git log" calls > that only show a couple of pages before I quit the pager. Indeed. But what is your point again? Nicolas ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-07 14:18 ` Nicolas Pitre @ 2009-08-08 15:18 ` Johannes Schindelin 2009-08-08 16:07 ` Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 34+ messages in thread From: Johannes Schindelin @ 2009-08-08 15:18 UTC (permalink / raw) To: Nicolas Pitre Cc: Sam Vilain, Nick Edelen, Michael J Gruber, Junio C Hamano, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Hi, On Fri, 7 Aug 2009, Nicolas Pitre wrote: > On Fri, 7 Aug 2009, Johannes Schindelin wrote: > > > Hi, > > > > On Fri, 7 Aug 2009, Nicolas Pitre wrote: > > > > > On Fri, 7 Aug 2009, Sam Vilain wrote: > > > > > > > Johannes Schindelin wrote: > > > > >> the short answer is that cache slices are totally independant of > > > > >> pack files. > > > > >> > > > > > > > > > > My idea with that was that you already have a SHA-1 map in the pack > > > > > index, and if all you want to be able to accelerate the revision > > > > > walker, you'd probably need something that adds yet another mapping, > > > > > from commit to parents and tree, and from tree to sub-tree and blob > > > > > (so you can avoid unpacking commit and tree objects). > > > > > > > > > > > > > Tying indexes together like that is not a good idea in the database > > > > world. Especially as in this case as Nick mentions, the domain is > > > > subtly different (ie pack vs dag). Unfortunately you just can't try to > > > > pretend that they will always be the same; you can't force a full > > > > repack on every ref change! > > > > > > Right. And the rev cache must work even if the repository is not > > > packed. > > > > Umm, why? AFAICT the principal purpose of the rev cache is to help work > > loads on, say, www.kernel.org. > > So what? > > Speeding up rev-list with a rev cache is completely orthogonal to > whether the repository is packed or not. No, it is not. For both technical and practical reasons, caching revision walker data is very closely related to packing. You are _very_ unlikely helped by speeding up revision walking in the general case, _especially_ when you do stuff like blame or -S that needs to unpack tons of objects _anyway_. The one big kicker argument for speeding up revision walking _is_ to relieve the loads on big ass servers, and they _should_ be as packed as possible (as I will patiently explain over and over again). > It is like having a "git diff" result cache: no one would think of > stuffing that in the pack index. Do you want to try to kid me? You'll have to try harder. Caching "git diff" results... no, really! > If we want to improve on the repository packing format, that must be > doable without bothering with an independent concept such as a rev > cache. I would love to tell you that you're right, but the single fact that pack v4 is startig to compete with Duke Nukem Forever just prevents me from doing that. > > I am unlikely to notice the improvements in my regular "git log" calls > > that only show a couple of pages before I quit the pager. > > Indeed. But what is your point again? Oh? My point? Being that the rev cache has a certain target audience, and that the regular user is not part of that audience, and that it just so happens that the _technical_ similarities with the pack index can be exploited in those scenarios? IOW we can be pretty certain that a heavy-load server has a fully (or next-to-fully) packed object database. The pack indices already contain a SHA-1 table that we can simply reuse. And it should not be hard (or fragile) at all to put the "cached" information about parents, referenced tree and blob objects into that file, into a different section. After all, the parents, referenced tree and blob objects to change as often as the objects in the pack: never. Ciao, Dscho ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-08 15:18 ` Johannes Schindelin @ 2009-08-08 16:07 ` Junio C Hamano 2009-08-08 23:54 ` Sam Vilain 2009-08-09 2:37 ` Nicolas Pitre 2 siblings, 0 replies; 34+ messages in thread From: Junio C Hamano @ 2009-08-08 16:07 UTC (permalink / raw) To: Johannes Schindelin Cc: Nicolas Pitre, Sam Vilain, Nick Edelen, Michael J Gruber, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> It is like having a "git diff" result cache: no one would think of >> stuffing that in the pack index. > > Do you want to try to kid me? You'll have to try harder. Caching "git > diff" results... no, really! I thought Nico meant caching the rename similarity matrix when he said "git diff" cache. In a narrow context of "log --follow" it may make sense. It also might help "blame" without -C. But I do not see how we would gain anything if we had that cache tied to the pack index. > After all, the parents, referenced tree and blob objects to change as > often as the objects in the pack: never. But I notice that the aspects of objects you listed: the parents, referenced tree and blob objects. The frequency of them changing does not depend on where the actual object is, either packed or loose. These aspects of an object (more specifically, you are talking about a commit object) never change either way. So I am somewhat puzzled. It does change if a particular commit and its associated objects are relevant to the traversal as refs change (especially when they rewind). Just like an old "kept" pack can suddenly have tons of irrelevant objects after refs are pruned (e.g. a branch is dropped), cached reachability data, even though they may stay correct, would become irrelevant when nobody starts traversing from an object that is no longer reachable. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-08 15:18 ` Johannes Schindelin 2009-08-08 16:07 ` Junio C Hamano @ 2009-08-08 23:54 ` Sam Vilain 2009-08-09 2:37 ` Nicolas Pitre 2 siblings, 0 replies; 34+ messages in thread From: Sam Vilain @ 2009-08-08 23:54 UTC (permalink / raw) To: Johannes Schindelin Cc: Nicolas Pitre, Nick Edelen, Michael J Gruber, Junio C Hamano, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org On Sat, 2009-08-08 at 17:18 +0200, Johannes Schindelin wrote: > > Speeding up rev-list with a rev cache is completely orthogonal to > > whether the repository is packed or not. > > No, it is not. > > For both technical and practical reasons, caching revision walker data is > very closely related to packing. > [...] > ... the rev cache has a certain target audience, > and that the regular user is not part of that audience, and that it just > so happens that the _technical_ similarities with the pack index can be > exploited in those scenarios? > > IOW we can be pretty certain that a heavy-load server has a fully (or > next-to-fully) packed object database. The pack indices already contain a > SHA-1 table that we can simply reuse. And it should not be hard (or > fragile) at all to put the "cached" information about parents, > referenced tree and blob objects into that file, into a different section. I think your argument would work better if packs and bundles were the same thing, and we always stored bundles in the objects/packs directory, but they're not and we don't. You can't assume that a pack has any particular properties, such as representing the objects returned from a single rev-cache walk. And I will say that *especially* on a busy git server, serving active projects you can't expect people to repack their repository for every single update. Repacking daily or so by a batch job, sure. Expecting the repository to always be fully packed? No. Too much churn, or inefficient packing. You can't just pretend that the mixed packed/loose case doesn't exist. The 10% size seems a very good bang for your buck to me and a good start. Sam ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-08 15:18 ` Johannes Schindelin 2009-08-08 16:07 ` Junio C Hamano 2009-08-08 23:54 ` Sam Vilain @ 2009-08-09 2:37 ` Nicolas Pitre 2009-08-09 13:42 ` Nick Edelen 2 siblings, 1 reply; 34+ messages in thread From: Nicolas Pitre @ 2009-08-09 2:37 UTC (permalink / raw) To: Johannes Schindelin Cc: Sam Vilain, Nick Edelen, Michael J Gruber, Junio C Hamano, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org On Sat, 8 Aug 2009, Johannes Schindelin wrote: > Hi, > > On Fri, 7 Aug 2009, Nicolas Pitre wrote: > > > On Fri, 7 Aug 2009, Johannes Schindelin wrote: > > > > > Hi, > > > > > > On Fri, 7 Aug 2009, Nicolas Pitre wrote: > > > > > > > On Fri, 7 Aug 2009, Sam Vilain wrote: > > > > > > > > > Johannes Schindelin wrote: > > > > > >> the short answer is that cache slices are totally independant of > > > > > >> pack files. > > > > > >> > > > > > > > > > > > > My idea with that was that you already have a SHA-1 map in the pack > > > > > > index, and if all you want to be able to accelerate the revision > > > > > > walker, you'd probably need something that adds yet another mapping, > > > > > > from commit to parents and tree, and from tree to sub-tree and blob > > > > > > (so you can avoid unpacking commit and tree objects). > > > > > > > > > > > > > > > > Tying indexes together like that is not a good idea in the database > > > > > world. Especially as in this case as Nick mentions, the domain is > > > > > subtly different (ie pack vs dag). Unfortunately you just can't try to > > > > > pretend that they will always be the same; you can't force a full > > > > > repack on every ref change! > > > > > > > > Right. And the rev cache must work even if the repository is not > > > > packed. > > > > > > Umm, why? AFAICT the principal purpose of the rev cache is to help work > > > loads on, say, www.kernel.org. > > > > So what? > > > > Speeding up rev-list with a rev cache is completely orthogonal to > > whether the repository is packed or not. > > No, it is not. > > For both technical and practical reasons, caching revision walker data is > very closely related to packing. No it is not. > You are _very_ unlikely helped by speeding up revision walking in the > general case, _especially_ when you do stuff like blame or -S that needs > to unpack tons of objects _anyway_. I completely agree. And this is why I wish for the rev cache to be enabled with a config variable. Why? Because "client" repositories are unlikely to benefit as much as "server" repositories from this cache. > The one big kicker argument for speeding up revision walking _is_ to > relieve the loads on big ass servers, and they _should_ be as packed as > possible (as I will patiently explain over and over again). Please do a shortlog on builtin-pack-objects.c and realize who spent a lot of energy making git repository packing what it is now. Then re-read what I said on this list for more than a year now about the remaining latency on large git clone operations. After explaining it a couple times already I will need some of your patience to make people understand that fully packing your repo is _not_ going to help it as we're always talking about fully packed repos to start with. > > It is like having a "git diff" result cache: no one would think of > > stuffing that in the pack index. > > Do you want to try to kid me? You'll have to try harder. Caching "git > diff" results... no, really! Why not? You worked on git diff yourself, so you certainly are well positionned to appreciate my suggestion, no? > > If we want to improve on the repository packing format, that must be > > doable without bothering with an independent concept such as a rev > > cache. > > I would love to tell you that you're right, but the single fact that > pack v4 is startig to compete with Duke Nukem Forever just prevents me > from doing that. Because of the current economy, I was waiting to be laid off so to have the time to make pack v4 a reality. Unfortunately for git they didn't fire me yet. Oh well... > > > I am unlikely to notice the improvements in my regular "git log" calls > > > that only show a couple of pages before I quit the pager. > > > > Indeed. But what is your point again? > > Oh? My point? Being that the rev cache has a certain target audience, > and that the regular user is not part of that audience, and that it just > so happens that the _technical_ similarities with the pack index can be > exploited in those scenarios? I don't see a similarity with the pack index at all, certainly not a technical one. > IOW we can be pretty certain that a heavy-load server has a fully (or > next-to-fully) packed object database. The pack indices already contain a > SHA-1 table that we can simply reuse. And it should not be hard (or > fragile) at all to put the "cached" information about parents, > referenced tree and blob objects into that file, into a different section. And then someone does a few pushes. So most of the time your repository contains a few packs and not only a single one. So in which pack index files should you put the rev cache? What do yo do with that cache if it happens to be split across multiple pack index files when a repack is performed? Can't you see all the disadvantages to tie rev cache data which happens to share no issue with repacking into the same file? So what do you do to keep the code simple and maintainable? Yes, you abstract things and use separate files, thank you very much. After all, the packed refs are not stored in the pack index file either even if the packed-refs file contains a list of SHA1's. Nicolas ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-09 2:37 ` Nicolas Pitre @ 2009-08-09 13:42 ` Nick Edelen 0 siblings, 0 replies; 34+ messages in thread From: Nick Edelen @ 2009-08-09 13:42 UTC (permalink / raw) To: Nicolas Pitre Cc: Johannes Schindelin, Sam Vilain, Michael J Gruber, Junio C Hamano, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org I can see the logic behind Johannes's ideas, but I'm still not sure it'd be a great modification. If you wanted to associate revision caching much more strongly with packs, then packs and slices could be merged reasonably well. Say you just attached the actual slice data at the end of the pack, then stored offsets of the slice payload in the pack index. Since you'd (presumably) have to search the index for the object anyway, you wouldn't have to deal with searching a rev-cache index on top of that (although it's not exactly unoptimized now). However, that would sorta be preemptively limiting rev-cache to pack-related optimizations. I mean at the moment that's the main target, but it could be improved in the future to be more relavant to other operations as well. Leaving the rev-cache as a seperate system would keep both it and packing much more flexible, and open to longer-term developments. >I haven't read the side of the patch that _uses_ the information stored in >the rev-cache to figure out what it optimizes and what its limitations are >(e.g. how it interacts with pathspecs). Perhaps the rev-cache may turn >out to be _only_ useful for pack-objects and nothing else, in which case >we may not care about standalone version of rev-cache generator after all. rev-cache's cache slice traversal basically emulates git's revision walker, on a smaller scale. At the moment it only really handles date limiting (and obviously slop stuff) so it's not used for any pruning. That's not to say it couldn't be updated in the future though. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-07 2:47 ` Sam Vilain 2009-08-07 4:35 ` Nicolas Pitre @ 2009-08-07 6:12 ` Johannes Schindelin 2009-08-07 15:00 ` Nicolas Pitre 1 sibling, 1 reply; 34+ messages in thread From: Johannes Schindelin @ 2009-08-07 6:12 UTC (permalink / raw) To: Sam Vilain Cc: Nick Edelen, Michael J Gruber, Junio C Hamano, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Hi, On Fri, 7 Aug 2009, Sam Vilain wrote: > Johannes Schindelin wrote: > >> the short answer is that cache slices are totally independant of pack > >> files. > >> > > > > My idea with that was that you already have a SHA-1 map in the pack > > index, and if all you want to be able to accelerate the revision > > walker, you'd probably need something that adds yet another mapping, > > from commit to parents and tree, and from tree to sub-tree and blob > > (so you can avoid unpacking commit and tree objects). > > > > Tying indexes together like that is not a good idea in the database > world. This is not the same index as in the database world. It is more comparable with a cached view. And there, it is generally a good idea to keep related things in the same cached view (with an outer join), rather than having two primary keys for almost every record. > Especially as in this case as Nick mentions, the domain is subtly > different (ie pack vs dag). Unfortunately you just can't try to pretend > that they will always be the same; you can't force a full repack on > every ref change! No, but you do not need that, either. In the setting that is most likely the most thankful one, i.e. a git:// server, you _want_ to keep the repository "as packed as possible", otherwise the rev cache improvements will be lost in the bad packing performance anyway. > > Still, there is some redundancy between the pack index and your cache, > > as you have to write out the whole list of SHA-1s all over again. I > > guess it is time to look at the code instead of asking stupid > > questions. > > > > "Disk is cheap" :-) Disk I/O ain't. (Size of the I/O caches, yaddayadda, I'm sure you get my point). Ciao, Dscho ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-07 6:12 ` Johannes Schindelin @ 2009-08-07 15:00 ` Nicolas Pitre 2009-08-07 22:02 ` Nick Edelen 0 siblings, 1 reply; 34+ messages in thread From: Nicolas Pitre @ 2009-08-07 15:00 UTC (permalink / raw) To: Johannes Schindelin Cc: Sam Vilain, Nick Edelen, Michael J Gruber, Junio C Hamano, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org On Fri, 7 Aug 2009, Johannes Schindelin wrote: > On Fri, 7 Aug 2009, Sam Vilain wrote: > > > Especially as in this case as Nick mentions, the domain is subtly > > different (ie pack vs dag). Unfortunately you just can't try to pretend > > that they will always be the same; you can't force a full repack on > > every ref change! > > No, but you do not need that, either. In the setting that is most likely > the most thankful one, i.e. a git:// server, you _want_ to keep the > repository "as packed as possible", otherwise the rev cache improvements > will be lost in the bad packing performance anyway. Yes and no. Currently, the number #1 latency in any initial git clone is the famous "counting objects" phase, even if the repo is perfectly packed. And that's all this rev cache can and will improve. The packing does play its performance role of course, but for a totally different reason. Hence the repository needs no be perfectly packed for a rev cache to speed up its own part of the game. > > > Still, there is some redundancy between the pack index and your cache, > > > as you have to write out the whole list of SHA-1s all over again. I > > > guess it is time to look at the code instead of asking stupid > > > questions. > > > > > > > "Disk is cheap" :-) > > Disk I/O ain't. > > (Size of the I/O caches, yaddayadda, I'm sure you get my point). I don't know about the size of the rev cache on disk yet (I asked Nick about that) nor do I really know how this cache is implemented. But I know damn well about git packs and associated index and I for sure don't want to see a revision cache coupled with it. And for a clone the disk IO will certainly be a magnitude larger than for the cache (or so I hope). Maybe the IO for the rev cache might be a significant overhead for operations performed on a client (aka developer) repository, in which case it would be a good idea to have a config variable to control the cache size, or even to turn it off entirely. We do it for delta depth and many other things already. Nicolas ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-07 15:00 ` Nicolas Pitre @ 2009-08-07 22:02 ` Nick Edelen 2009-08-07 22:48 ` Junio C Hamano 0 siblings, 1 reply; 34+ messages in thread From: Nick Edelen @ 2009-08-07 22:02 UTC (permalink / raw) To: Nicolas Pitre Cc: Johannes Schindelin, Sam Vilain, Michael J Gruber, Junio C Hamano, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org > I don't know about the size of the rev cache on disk yet (I asked Nick > about that) nor do I really know how this cache is implemented. The cache file for all of the linux repository (as of a few weeks ago) is around 42MB, without names. The names would probably add 2 or 3 MB on top of that. That's probably about as big as I'd want to get, as the whole slice (minus the name list) is mapped to memory (then again bigger might be ok; I'm not an expert on mem mapping). The rev-cache command fuse provides functionality to ignore certain cache sizes, which was geared towards preventing overly large slices. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-07 22:02 ` Nick Edelen @ 2009-08-07 22:48 ` Junio C Hamano 2009-08-07 22:53 ` Nick Edelen 2009-08-08 2:50 ` Jeff King 0 siblings, 2 replies; 34+ messages in thread From: Junio C Hamano @ 2009-08-07 22:48 UTC (permalink / raw) To: Nick Edelen Cc: Nicolas Pitre, Johannes Schindelin, Sam Vilain, Michael J Gruber, Junio C Hamano, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Nick Edelen <sirnot@gmail.com> writes: > The cache file for all of the linux repository (as of a few weeks ago) > is around 42MB, without names. The names would probably add 2 or 3 MB > on top of that. That's probably about as big as I'd want to get, Hmm. .git/objects/ as of today is about 482M here, so we are talking about roughly 10% overhead? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-07 22:48 ` Junio C Hamano @ 2009-08-07 22:53 ` Nick Edelen 2009-08-08 3:11 ` Junio C Hamano 2009-08-08 2:50 ` Jeff King 1 sibling, 1 reply; 34+ messages in thread From: Nick Edelen @ 2009-08-07 22:53 UTC (permalink / raw) To: Junio C Hamano Cc: Nicolas Pitre, Johannes Schindelin, Sam Vilain, Michael J Gruber, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org > Hmm. .git/objects/ as of today is about 482M here, so we are talking > about roughly 10% overhead? Yes, that sounds about right. The cache file for git's repository is 3MB, and my repo (partly packed) is ~35MB. By the way, what would be the best way of posting a revised patchset? Should I just reply to my older posts, or make new ones? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-07 22:53 ` Nick Edelen @ 2009-08-08 3:11 ` Junio C Hamano 2009-08-08 7:27 ` Nick Edelen 0 siblings, 1 reply; 34+ messages in thread From: Junio C Hamano @ 2009-08-08 3:11 UTC (permalink / raw) To: Nick Edelen Cc: Nicolas Pitre, Johannes Schindelin, Sam Vilain, Michael J Gruber, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Nick Edelen <sirnot@gmail.com> writes: > By the way, what would be the best way of posting a revised patchset? > Should I just reply to my older posts, or make new ones? That depends primarily on how heavily the patches needed to change in response to review comments, but until the series lands in 'next', you would typically send updated series as a replacement, not incremental. Many people seemed to be interested in the series and had a volume of comments on it. I suspect the updated series would be quite different from the original, so for the next round I would suspect it would be best to start anew, marking them as [PATCH N/M (v2)], in a fresh thread. It would help reviewers if you said "this corresponds to [PATCH 3/5] in the original series, with the following improvements based on X and Y's comments" after the three-dash line. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-08 3:11 ` Junio C Hamano @ 2009-08-08 7:27 ` Nick Edelen 2009-08-08 7:30 ` Jeff King 0 siblings, 1 reply; 34+ messages in thread From: Nick Edelen @ 2009-08-08 7:27 UTC (permalink / raw) To: Junio C Hamano Cc: Nicolas Pitre, Johannes Schindelin, Sam Vilain, Michael J Gruber, Jeff King, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org > IIRC from previous discussions, kernel.org's main performance problem is > I/O, not CPU. Are there any provisions for sharing rev-caches between > similar repositories, as we already do for objects? I haven't implemented a transmission protocol or anything, but it would be perfectly possible to copy cache slices from one repo to another. Generating the revision cache from scratch on large repos can take several minutes, so this wouldn't be a bad idea. > That depends primarily on how heavily the patches needed to change in > response to review comments, but until the series lands in 'next', you > would typically send updated series as a replacement, not incremental. > > Many people seemed to be interested in the series and had a volume of > comments on it. I suspect the updated series would be quite different > from the original, so for the next round I would suspect it would be best > to start anew, marking them as [PATCH N/M (v2)], in a fresh thread. It > would help reviewers if you said "this corresponds to [PATCH 3/5] in the > original series, with the following improvements based on X and Y's > comments" after the three-dash line. Ok, that sounds good. I've added a new patch as well so the numbering changes. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-08 7:27 ` Nick Edelen @ 2009-08-08 7:30 ` Jeff King 2009-08-08 7:40 ` Nick Edelen 0 siblings, 1 reply; 34+ messages in thread From: Jeff King @ 2009-08-08 7:30 UTC (permalink / raw) To: Nick Edelen Cc: Junio C Hamano, Nicolas Pitre, Johannes Schindelin, Sam Vilain, Michael J Gruber, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org On Sat, Aug 08, 2009 at 09:27:55AM +0200, Nick Edelen wrote: > > IIRC from previous discussions, kernel.org's main performance problem is > > I/O, not CPU. Are there any provisions for sharing rev-caches between > > similar repositories, as we already do for objects? > > I haven't implemented a transmission protocol or anything, but it > would be perfectly possible to copy cache slices from one repo to > another. Generating the revision cache from scratch on large repos > can take several minutes, so this wouldn't be a bad idea. That might be useful, but I was thinking more of an "alternates"-like mechanism between repos. So that the data is stored only once on disk and in the disk cache, which is helpful for sites like kernel.org which serve many similar repositories. -Peff ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-08 7:30 ` Jeff King @ 2009-08-08 7:40 ` Nick Edelen 0 siblings, 0 replies; 34+ messages in thread From: Nick Edelen @ 2009-08-08 7:40 UTC (permalink / raw) To: Jeff King Cc: Junio C Hamano, Nicolas Pitre, Johannes Schindelin, Sam Vilain, Michael J Gruber, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org > That might be useful, but I was thinking more of an "alternates"-like > mechanism between repos. So that the data is stored only once on disk > and in the disk cache, which is helpful for sites like kernel.org which > serve many similar repositories. Oh, right. Yes, that seems like it could work. We'd have to be careful that a shared cache slice wouldn' change (like in a fuse or something), but other than that we could have something as simple as a link. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-07 22:48 ` Junio C Hamano 2009-08-07 22:53 ` Nick Edelen @ 2009-08-08 2:50 ` Jeff King 1 sibling, 0 replies; 34+ messages in thread From: Jeff King @ 2009-08-08 2:50 UTC (permalink / raw) To: Nick Edelen Cc: Junio C Hamano, Nicolas Pitre, Johannes Schindelin, Sam Vilain, Michael J Gruber, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org On Fri, Aug 07, 2009 at 03:48:51PM -0700, Junio C Hamano wrote: > > The cache file for all of the linux repository (as of a few weeks ago) > > is around 42MB, without names. The names would probably add 2 or 3 MB > > on top of that. That's probably about as big as I'd want to get, > > Hmm. .git/objects/ as of today is about 482M here, so we are talking > about roughly 10% overhead? IIRC from previous discussions, kernel.org's main performance problem is I/O, not CPU. Are there any provisions for sharing rev-caches between similar repositories, as we already do for objects? -Peff ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking 2009-08-06 19:06 ` Johannes Schindelin 2009-08-06 20:01 ` Nick Edelen 2009-08-07 2:47 ` Sam Vilain @ 2009-08-08 18:57 ` Junio C Hamano 2 siblings, 0 replies; 34+ messages in thread From: Junio C Hamano @ 2009-08-08 18:57 UTC (permalink / raw) To: Johannes Schindelin Cc: Nick Edelen, Michael J Gruber, Junio C Hamano, Jeff King, Sam Vilain, Shawn O. Pearce, Andreas Ericsson, Christian Couder, git@vger.kernel.org Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > My idea with that was that you already have a SHA-1 map in the pack index, > and if all you want to be able to accelerate the revision walker, you'd > probably need something that adds yet another mapping, from commit to > parents and tree, and from tree to sub-tree and blob (so you can avoid > unpacking commit and tree objects). > > I just thought that it could be more efficient to do it at the time the > pack index is written _anyway_, as nothing will change in the pack after > that anyway. After reading the version 2 of the "documentation" patch and commenting heavily on it, I partly share the same feeling with you. The codepath to pack objects is _one of the places_ you can generate rev-cache and slice information without redoing a lot of work that has already been done anyway. But - You can write that information separately out to a different file. Logically it does not have to be _in_ the same pack idx file; and - You may want to generate rev-cache information even if you do not pack the repository. They may practically go hand-in-hand, but logically they are orthogonal. And I am not sure if it is easy to retrofit "rev-list | pack-objects" code to additionally produce this information, while keeping the standalone version of rev-cache generation. Having said all that. I haven't read the side of the patch that _uses_ the information stored in the rev-cache to figure out what it optimizes and what its limitations are (e.g. how it interacts with pathspecs). Perhaps the rev-cache may turn out to be _only_ useful for pack-objects and nothing else, in which case we may not care about standalone version of rev-cache generator after all. If that is the case, I think it is also a reasonable implementation if the rev-cache is generated only by "rev-list | pack-objects" codepath as a side effect of traversal it already does, and it might even make sense to introduce the version 3 of pack idx format that let you record additional information, like you suggest. I am not ready to make that judgement as I haven't read the rest, but my gut feeling tells me that you might be right. ^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2009-08-09 13:42 UTC | newest] Thread overview: 34+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-08-06 9:55 [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking Nick Edelen 2009-08-06 14:48 ` Johannes Schindelin 2009-08-06 14:58 ` Michael J Gruber 2009-08-06 17:39 ` Nick Edelen 2009-08-06 19:06 ` Johannes Schindelin 2009-08-06 20:01 ` Nick Edelen 2009-08-06 20:30 ` Nick Edelen 2009-08-06 20:32 ` Shawn O. Pearce 2009-08-06 23:35 ` A Large Angry SCM 2009-08-06 23:37 ` Shawn O. Pearce 2009-08-06 23:43 ` A Large Angry SCM 2009-08-07 0:15 ` Nick Edelen 2009-08-07 6:05 ` Johannes Schindelin 2009-08-07 4:42 ` Nicolas Pitre 2009-08-07 2:47 ` Sam Vilain 2009-08-07 4:35 ` Nicolas Pitre 2009-08-07 6:08 ` Johannes Schindelin 2009-08-07 14:18 ` Nicolas Pitre 2009-08-08 15:18 ` Johannes Schindelin 2009-08-08 16:07 ` Junio C Hamano 2009-08-08 23:54 ` Sam Vilain 2009-08-09 2:37 ` Nicolas Pitre 2009-08-09 13:42 ` Nick Edelen 2009-08-07 6:12 ` Johannes Schindelin 2009-08-07 15:00 ` Nicolas Pitre 2009-08-07 22:02 ` Nick Edelen 2009-08-07 22:48 ` Junio C Hamano 2009-08-07 22:53 ` Nick Edelen 2009-08-08 3:11 ` Junio C Hamano 2009-08-08 7:27 ` Nick Edelen 2009-08-08 7:30 ` Jeff King 2009-08-08 7:40 ` Nick Edelen 2009-08-08 2:50 ` Jeff King 2009-08-08 18:57 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).