* Pushing vs. alternates
From: Petr Baudis @ 2006-10-24 3:53 UTC
To: git
Hi,
I don't have time to code that myself right now, so I'm just tossing
an idea around - pushing to a directory with alternates set up should
avoid sending objects that are already in the alternate object database.
This is obviously very hard to achieve in general, but I think it should
be possible to do something like check whether $alternate/../refs/ exists
and, in that case, send haves for those refs - that could give good
results. Or is that a bad idea for some reason?
That would be quite useful for the repo.or.cz's forked objects.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
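
The idea in the message above can be sketched outside of git itself. The following is a minimal Python illustration, not git's implementation: it reads objects/info/alternates, looks for a refs/ directory next to each alternate object directory ($alternate/../refs/), and collects the object names found there as candidate haves. The function names are hypothetical, only loose ref files are read (a packed-refs file is ignored), and relative alternate entries are assumed to be relative to the objects/ directory.

```python
import os
from pathlib import Path

HEX = set("0123456789abcdef")


def alternate_object_dirs(git_dir):
    """Return the object directories listed in objects/info/alternates."""
    objects = Path(git_dir) / "objects"
    alt_file = objects / "info" / "alternates"
    if not alt_file.is_file():
        return []
    dirs = []
    for line in alt_file.read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue
        # Assumption: relative entries are relative to objects/.
        # os.path.join keeps an absolute entry unchanged.
        path = os.path.normpath(os.path.join(objects, entry))
        dirs.append(Path(path))
    return dirs


def candidate_haves(git_dir):
    """Collect object names from loose refs next to each alternate."""
    haves = set()
    for alt_objects in alternate_object_dirs(git_dir):
        refs = alt_objects.parent / "refs"  # i.e. $alternate/../refs
        if not refs.is_dir():
            continue
        for ref in refs.rglob("*"):
            if not ref.is_file():
                continue
            value = ref.read_text().strip()
            # Keep only values that look like a 40-hex object name;
            # this skips symbolic refs such as "ref: refs/heads/x".
            if len(value) == 40 and set(value) <= HEX:
                haves.add(value)
    return sorted(haves)
```

Whether the names returned here are safe to advertise is a separate question: nothing in this sketch checks that the refs/ directory actually describes the objects stored in the alternate.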

* Re: Pushing vs. alternates
From: Junio C Hamano @ 2006-10-24 5:29 UTC
To: Petr Baudis; +Cc: git

Petr Baudis <pasky@ucw.cz> writes:

> I don't have time to code that myself right now, so I'm just tossing
> an idea around - pushing to a directory with alternates set up should
> avoid sending objects that are already in the alternate object database.

That is probably only relevant for the first time, since
subsequent pushes have refs from its own repository that track
the tips of branches that were pushed the last time.

And for first-time usage, when you are initializing the repository
with alternates, you have direct access to that repository
(that's how you can set up alternates), so you can just as easily
do the initial fetch/clone at that time.

So it might be a nice addition but I suspect it would not matter
much in practice.

* Re: Pushing vs. alternates
From: Shawn Pearce @ 2006-10-24 5:46 UTC
To: Junio C Hamano; +Cc: Petr Baudis, git

Junio C Hamano <junkio@cox.net> wrote:
> Petr Baudis <pasky@ucw.cz> writes:
>
> > I don't have time to code that myself right now, so I'm just tossing
> > an idea around - pushing to a directory with alternates set up should
> > avoid sending objects that are already in the alternate object database.
>
> That is probably only relevant for the first time, since
> subsequent pushes have refs from its own repository that track
> the tips of branches that were pushed the last time.
>
> And for first-time usage, when you are initializing the repository
> with alternates, you have direct access to that repository
> (that's how you can set up alternates), so you can just as easily
> do the initial fetch/clone at that time.
>
> So it might be a nice addition but I suspect it would not matter
> much in practice.

What would be useful in practice is not unpacking the first pack
pushed to an empty repository, or better yet just converting thin
packs to standalone packs rather than unpacking to loose objects
when the number of objects in the incoming pack exceeds some
configured threshold.

Which Linus and Nico already took stabs at doing but haven't
finished...

--
Shawn.

* Re: Pushing vs. alternates
From: Petr Baudis @ 2006-10-24 11:20 UTC
To: Junio C Hamano; +Cc: git

Dear diary, on Tue, Oct 24, 2006 at 07:29:45AM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> Petr Baudis <pasky@ucw.cz> writes:
>
> > I don't have time to code that myself right now, so I'm just tossing
> > an idea around - pushing to a directory with alternates set up should
> > avoid sending objects that are already in the alternate object database.
>
> That is probably only relevant for the first time, since
> subsequent pushes have refs from its own repository that track
> the tips of branches that were pushed the last time.

Well, I would send haves for the alternate repository anyway, since:
you push your kernel branch, half a year passes, you merge with new
development and want to push again; you really do not want to push
everything that happened over the last half a year. And sending the
extra haves shouldn't hurt, right?

> And for first-time usage, when you are initializing the repository
> with alternates, you have direct access to that repository
> (that's how you can set up alternates), so you can just as easily
> do the initial fetch/clone at that time.

I don't understand this paragraph. This mail is about pushing, not
fetch/clone. You can only push if your login access is reduced to
git-shell, and something external could've set up your alternates.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/

* Re: Pushing vs. alternates
From: Junio C Hamano @ 2006-10-24 17:12 UTC
To: Petr Baudis; +Cc: git

Petr Baudis <pasky@suse.cz> writes:

> Dear diary, on Tue, Oct 24, 2006 at 07:29:45AM CEST, I got a letter
> where Junio C Hamano <junkio@cox.net> said that...
>> Petr Baudis <pasky@ucw.cz> writes:
>>
>> > I don't have time to code that myself right now, so I'm just tossing
>> > an idea around - pushing to a directory with alternates set up should
>> > avoid sending objects that are already in the alternate object database.
>>
>> That is probably only relevant for the first time, since
>> subsequent pushes have refs from its own repository that track
>> the tips of branches that were pushed the last time.
>
> Well, I would send haves for the alternate repository anyway,...

While I agree it would be an optimization if it worked, there is
one conceptual problem here though, coming from old warts. It's
not an alternate "repository" but an alternate object store.
There is no guarantee that the refs/ directory next to the
objects/ directory the alternate points at is related to that
object store, for historical reasons (i.e. we have separate
GIT_DIR and GIT_OBJECT_DIRECTORY). So unless we declare that
objects that are reachable from the refs/ *must* be fully
connected in objects/ when objects/ has refs/ next to it, sending
HAVEs from that refs/ can break the push, since the refs/ you are
looking at may not be related to the alternate objects/ at all.
I do not think it is a big restriction at all, but it is a new
restriction you are adding to the repository layout.

> ... You can only push if your login access is reduced to
> git-shell, and something external could've set up your alternates.

Ok, I was not thinking about "something external".
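
The concern raised above is that a refs/ directory sitting next to an alternate objects/ directory may name objects which that object store does not contain. As a rough illustration only, the Python sketch below filters candidate tips down to those that exist as loose objects in a given object directory. This is a necessary but not sufficient check: it does not look inside packs, and it does not verify that everything reachable from a tip is present, which is what the "fully connected" requirement actually demands. The function names are hypothetical.

```python
from pathlib import Path


def loose_object_path(objects_dir, name):
    """Path where a loose object with this 40-hex name would be stored."""
    # Loose objects live at objects/<first two hex digits>/<remaining 38>.
    return Path(objects_dir) / name[:2] / name[2:]


def tips_present(objects_dir, tips):
    """Split tips into those present as loose objects and those missing."""
    present, missing = [], []
    for name in tips:
        if loose_object_path(objects_dir, name).is_file():
            present.append(name)
        else:
            missing.append(name)
    return present, missing
```

A tip reported as missing here may still exist in a pack, so a real implementation would need a proper object lookup and a reachability walk rather than a file-existence test.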

* Re: Pushing vs. alternates
From: Petr Baudis @ 2006-10-24 17:23 UTC
To: Junio C Hamano; +Cc: git

On Tue, Oct 24, 2006 at 07:12:17PM CEST, Junio C Hamano wrote:
> Petr Baudis <pasky@suse.cz> writes:
>
> > Dear diary, on Tue, Oct 24, 2006 at 07:29:45AM CEST, I got a letter
> > where Junio C Hamano <junkio@cox.net> said that...
> >> Petr Baudis <pasky@ucw.cz> writes:
> >>
> >> > I don't have time to code that myself right now, so I'm just tossing
> >> > an idea around - pushing to a directory with alternates set up should
> >> > avoid sending objects that are already in the alternate object database.
> >>
> >> That is probably only relevant for the first time, since
> >> subsequent pushes have refs from its own repository that track
> >> the tips of branches that were pushed the last time.
> >
> > Well, I would send haves for the alternate repository anyway,...
>
> While I agree it would be an optimization if it worked, there is
> one conceptual problem here though, coming from old warts. It's
> not an alternate "repository" but an alternate object store.

Yes. Which is ugly but it may make sense in case of really having
things like a "portable objects database" on your USB flash drive or
whatever else insane. ;-) Still,

> There is no guarantee that the refs/ directory next to the
> objects/ directory the alternate points at is related to that
> object store, for historical reasons (i.e. we have separate
> GIT_DIR and GIT_OBJECT_DIRECTORY). So unless we declare that
> objects that are reachable from the refs/ *must* be fully
> connected in objects/ when objects/ has refs/ next to it, sending
> HAVEs from that refs/ can break the push, since the refs/ you are
> looking at may not be related to the alternate objects/ at all.
> I do not think it is a big restriction at all, but it is a new
> restriction you are adding to the repository layout.

I think this situation (having something that looks like a Git
repository with objects/ inside that does *not* belong to this
repository) _is_ totally insane and such a restriction is fine. Who
thinks otherwise?

If this really bothers anyone (I can't see why), we could have
something like [ -e objects/info/standalone ] to prohibit receive-pack
from ever thinking of checking if the object database belongs to
a repository. We could of course keep the behaviour as is and make the
new one optional, but I believe that the new one is more sensible.

> > ... You can only push if your login access is reduced to
> > git-shell, and something external could've set up your alternates.
>
> Ok, I was not thinking about "something external".

Also, if you can push, that does not imply at all that you can fetch
as well. In plenty of situations you can't; most UNIX machines do have
ssh running, but that's not very useful when they're behind a NAT or
just a restrictive firewall. And with my notebook, I'm almost always
behind a NAT.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
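
The opt-out suggested in the message above can be pictured as a simple marker-file test. The sketch below is a Python illustration of that proposal only: objects/info/standalone is the hypothetical marker named in the mail, not a file git is known to honour, and the function name is made up for this example.

```python
from pathlib import Path


def may_use_sibling_refs(alt_objects_dir):
    """Return True if refs/ next to this object directory may be consulted."""
    alt_objects = Path(alt_objects_dir)
    # Hypothetical opt-out marker from the proposal: if it exists, the
    # object store declares that it does not belong to a repository.
    if (alt_objects / "info" / "standalone").exists():
        return False
    # Otherwise, only use refs/ if it is actually there.
    return (alt_objects.parent / "refs").is_dir()
```

With this shape the new behaviour is the default and the marker switches it off, which matches the preference stated in the mail; making it opt-in instead would only invert the marker test.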

* Re: Pushing vs. alternates
From: Junio C Hamano @ 2006-10-24 17:33 UTC
To: Petr Baudis; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

>> Well, I would send haves for the alternate repository anyway,...
>
> While I agree it would be an optimization if it worked, there is
> one conceptual problem here though, coming from old warts. It's
> not an alternate "repository" but an alternate object store.
> There is no guarantee that the refs/ directory next to the
> objects/ directory the alternate points at is related to that
> object store, for historical reasons (i.e. we have separate
> GIT_DIR and GIT_OBJECT_DIRECTORY).

Having said that, I am not opposed to the idea of using the refs/
next to the objects/ your alternate points at. Certainly I would not
have any objection (heck, I would even volunteer to code it myself if
only to see how much we can save) if we did not have
GIT_OBJECT_DIRECTORY in the system (i.e. if we had a guarantee from
the beginning that the objects/ directory that is next to refs/
*must* be related).

So I am Ok with this change, but I would feel better if we add a few
sentences to repository-layout.txt that warn about the restriction
(technically new, although it is very likely that violating it would
not have been useful at all).

I suspect we could do the same for fetching in principle, e.g. when
you track Linus's and a subsystem maintainer's trees and these two
repositories are linked with alternates at your end. Fetching into
your copy of Linus's and then fetching into your copy of the
subsystem tree would be optimized the same way if you send refs/ from
the alternates as HAVEs, right?