* Summer of Code - Cached Packs/Object Lists
From: thisnukes4u @ 2009-03-23 1:53 UTC (permalink / raw)
To: git
Hey list,
My name is Thomas Coppi, and I am a junior studying Computer Science at New Mexico Tech. I've been using git for a few years now for managing coursework and personal projects, but have never looked much into the git source code or internals, so I figure that Summer of Code is a good way to get started.
I am particularly interested in the packfile caching project mentioned on the wiki, but I have a couple of questions:
1. Would it be possible to implement both the packfile and object list caching mechanisms, or would one interfere with the other in some way?
2. With just a quick perusal of the daemon source, I noticed that it shells out to the upload-pack command. Where would it be appropriate to implement such a caching mechanism, in the daemon proper, the upload-pack code, or would both need to be updated?
Thanks in advance, and sorry if my questions seem a little bit newbish; as I mentioned, this is my first time diving into git internals.
--
Thomas Coppi
* Re: Summer of Code - Cached Packs/Object Lists
From: Shawn O. Pearce @ 2009-03-23 1:59 UTC (permalink / raw)
To: thisnukes4u; +Cc: git
Please line wrap your email at something useful to others when
quoting, like 70-72 characters per line.
thisnukes4u@gmail.com wrote:
> I am particularly interested in the packfile caching project
> mentioned on the wiki, but I have a couple of questions:
>
> 1. Would it be possible to implement both the packfile and
> object list caching mechanisms, or would one interfere with
> the other in some way?
You could do both. But I think most people on the list will argue
that doing both is overkill and only one is necessary, and further,
that only the one that offers the "biggest bang for the buck"
should be implemented.
Whole pack file caching has been discussed on list a few times as a
nice feature to have, but it raises some issues of cache management,
not to mention the issue I posed about it being relatively useless
on frequently changing repositories.
> 2. With just a quick perusal of the daemon source, I noticed
> that it shells out to the upload-pack command. Where would it be
> appropriate to implement such a caching mechanism, in the daemon
> proper, the upload-pack code, or would both need to be updated?
The daemon doesn't get enough data from the client in order to
perform any sort of caching.
So the caching has to happen in upload-pack, and/or pack-objects.
(upload-pack forks out to pack-objects to create the pack file to
send to the client)
--
Shawn.
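
A rough illustration of the point above: the daemon sees little more than
the requested service and repository path, while the object ids the client
wants and already has are only exchanged once upload-pack is running, so
that is the earliest place a cache lookup could be keyed. The sketch below
is a hypothetical scheme, not anything in git itself: the names
object_list_cache_key and ObjectListCache, and the choice of hashing the
sorted want/have ids, are assumptions made for illustration only.

```python
import hashlib


def object_list_cache_key(wants, haves):
    """Derive a cache key from the ids a client wants and already has.

    Hypothetical scheme: ids are de-duplicated and sorted so the order in
    which the client sent them does not change the key.
    """
    digest = hashlib.sha1()
    for label, ids in (("want", wants), ("have", haves)):
        for oid in sorted(set(ids)):
            digest.update(f"{label} {oid}\n".encode("ascii"))
    return digest.hexdigest()


class ObjectListCache:
    """In-memory stand-in for a cache of computed object lists."""

    def __init__(self):
        self._entries = {}

    def get_or_compute(self, wants, haves, compute):
        """Return the cached list for this request, computing it on a miss."""
        key = object_list_cache_key(wants, haves)
        if key not in self._entries:
            # compute() stands in for the expensive object enumeration.
            self._entries[key] = list(compute(wants, haves))
        return self._entries[key]
```

A key built this way changes whenever the wanted tips change, so a cache
like this would see few hits on a repository whose refs move often, which
is the same concern raised above about whole pack caching.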
* Re: Summer of Code - Cached Packs/Object Lists
From: Thomas Coppi @ 2009-03-23 2:49 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git
On Sun, Mar 22, 2009 at 7:59 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
> Please line wrap your email at something useful to others when
> quoting, like 70-72 characters per line.
Sorry about that.
> You could do both. But I think most people on the list will argue
> that doing both is overkill and only one is necessary, and further,
> that only the one that offers the "biggest bang for the buck"
> should be implemented.
Alright, that seems reasonable. Given that, I think I would lean
towards implementing an object list caching mechanism, since that seems
to be more generally applicable. The logic for this would then need to
be in the rev-list code (as mentioned in the JGit discussion), correct?
Oh, and I forgot to mention in my previous email that I am also
available on freenode IRC under the nick tcoppi.
Thanks again,
--
Thomas Coppi
* Re: Summer of Code - Cached Packs/Object Lists
From: Shawn O. Pearce @ 2009-03-23 2:52 UTC (permalink / raw)
To: Thomas Coppi; +Cc: git
Thomas Coppi <thisnukes4u@gmail.com> wrote:
> On Sun, Mar 22, 2009 at 7:59 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
>> You could do both. But I think most people on the list will argue
>> that doing both is overkill and only one is necessary, and further,
>> that only the one that offers the "biggest bang for the buck"
>> should be implemented.
>
> Alright, that seems reasonable. Given that, I think I would lean
> towards implementing an object list caching mechanism, since that seems
> to be more generally applicable. The logic for this would then need to
> be in the rev-list code (as mentioned in the JGit discussion), correct?
Probably. IIRC upload-pack forks a rev-list to produce the
object list, and pipes that into the forked pack-objects' stdin.
Thus rev-list is probably what would need to know how to include
the cached list in its output.
--
Shawn.
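
The arrangement described above, with a rev-list process producing the
object list and feeding it to pack-objects on stdin, can be sketched
roughly as below. This is an illustration, not the actual upload-pack
code: the helper names (rev_list_argv, pack_objects_argv, stream_pack)
are made up for the example, and the exact arguments upload-pack passes
are not taken from the thread. It only assumes the documented behaviour
that `git rev-list --objects` prints object ids and that
`git pack-objects --stdout` reads object ids from stdin.

```python
import subprocess


def rev_list_argv(wants, haves):
    """argv for the listing half: everything reachable from the wanted
    tips, excluding what is reachable from the client's haves."""
    return ["git", "rev-list", "--objects", *wants, *("^" + h for h in haves)]


def pack_objects_argv():
    """argv for the packing half: read object ids on stdin, write the
    pack to stdout."""
    return ["git", "pack-objects", "--stdout"]


def stream_pack(repo, wants, haves, out):
    """Run rev-list | pack-objects in repo, writing the pack to out.

    Hypothetical helper; returns True if both processes exited cleanly.
    """
    lister = subprocess.Popen(
        rev_list_argv(wants, haves), cwd=repo, stdout=subprocess.PIPE
    )
    packer = subprocess.Popen(
        pack_objects_argv(), cwd=repo, stdin=lister.stdout, stdout=out
    )
    # Drop our copy of the pipe so rev-list sees SIGPIPE if the packer dies.
    lister.stdout.close()
    packer_rc = packer.wait()
    lister_rc = lister.wait()
    return lister_rc == 0 and packer_rc == 0
```

In this picture, a cached object list would be spliced into the stream
that pack-objects reads, which is why the listing side is the natural
place for that logic.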
* Re: Summer of Code - Cached Packs/Object Lists
From: Nicolas Pitre @ 2009-03-23 14:41 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Thomas Coppi, git
On Sun, 22 Mar 2009, Shawn O. Pearce wrote:
> Thomas Coppi <thisnukes4u@gmail.com> wrote:
> > On Sun, Mar 22, 2009 at 7:59 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
> >> You could do both. But I think most people on the list will argue
> >> that doing both is overkill and only one is necessary, and further,
> >> that only the one that offers the "biggest bang for the buck"
> >> should be implemented.
> >
> > Alright, that seems reasonable. Given that, I think I would lean
> > towards implementing an object list caching mechanism, since that seems
> > to be more generally applicable. The logic for this would then need to
> > be in the rev-list code (as mentioned in the JGit discussion), correct?
>
> Probably. IIRC upload-pack forks a rev-list to produce the
> object list, and pipes that into the forked pack-objects' stdin.
> Thus rev-list is probably what would need to know how to include
> the cached list in its output.
Related to this, the first optimization is probably to avoid the fork
altogether. The pack-objects code knows how to list objects by itself
already, and that is used by git-repack. At the moment, packed tree
objects during a fetch are probably accessed one extra time needlessly.
Nicolas
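
One way to picture the suggestion above: pack-objects has a `--revs`
option that reads revision arguments on stdin and walks the history
itself, so no separate rev-list process is needed. The sketch below
assumes that option; the helper names (revs_stdin,
pack_without_rev_list) are invented for the example, and this is not a
claim about how upload-pack or git-repack invoke it.

```python
import subprocess


def revs_stdin(wants, haves):
    """Build the stdin text for `git pack-objects --revs`.

    One revision per line: wanted tips as-is, and each id the client
    already has prefixed with ^ to exclude what is reachable from it.
    """
    lines = list(wants) + ["^" + h for h in haves]
    return "".join(line + "\n" for line in lines)


def pack_without_rev_list(repo, wants, haves, out):
    """Let pack-objects enumerate objects itself and write the pack to out.

    Hypothetical helper; returns True if pack-objects exited cleanly.
    """
    result = subprocess.run(
        ["git", "pack-objects", "--revs", "--stdout"],
        cwd=repo,
        input=revs_stdin(wants, haves).encode("ascii"),
        stdout=out,
    )
    return result.returncode == 0
```

With the listing done inside pack-objects, an object list cache would
have to hook into its internal traversal rather than into rev-list's
output.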