* Pushing vs. alternates
@ 2006-10-24 3:53 Petr Baudis
2006-10-24 5:29 ` Junio C Hamano
0 siblings, 1 reply; 7+ messages in thread
From: Petr Baudis @ 2006-10-24 3:53 UTC (permalink / raw)
To: git
Hi,
I don't have time to code that myself right now, so I'm just tossing
an idea around - pushing to a directory with alternates set up should
avoid sending objects that are already in the alternate object database.
This is obviously very hard to achieve, but I think it should be
possible to do something like look if $alternate/../refs/ exists and in
that case send haves for those refs - that could give good results. Or
is that a bad idea for some reason?
That would be quite useful for the repo.or.cz's forked objects.
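A rough sketch of the lookup described above, assuming nothing about how receive-pack is actually structured: read objects/info/alternates, check whether a refs/ directory sits next to each alternate object directory, and collect the object names found in its loose ref files as extra haves. The function name `alternate_haves` is hypothetical, packed refs and symbolic refs are ignored, and relative alternate paths are resolved against the objects/ directory.

```python
import re
from pathlib import Path

# A loose ref file normally holds one 40-hex-digit object name.
SHA1_RE = re.compile(r"^[0-9a-f]{40}$")


def alternate_haves(git_dir):
    """Collect object names from refs/ directories next to alternate object stores.

    Hypothetical helper for illustration only; not an existing Git function.
    """
    objects_dir = Path(git_dir) / "objects"
    alternates_file = objects_dir / "info" / "alternates"
    haves = set()
    if not alternates_file.is_file():
        return haves
    for line in alternates_file.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        alt_objects = Path(line)
        if not alt_objects.is_absolute():
            # Relative entries are taken relative to our own objects/ directory.
            alt_objects = objects_dir / alt_objects
        refs_dir = alt_objects.parent / "refs"  # i.e. $alternate/../refs
        if not refs_dir.is_dir():
            continue
        for ref_file in refs_dir.rglob("*"):
            if ref_file.is_file():
                value = ref_file.read_text().strip()
                if SHA1_RE.match(value):  # skips symbolic refs and junk
                    haves.add(value)
    return haves
```

The returned set would simply be advertised in addition to the repository's own refs.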
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
* Re: Pushing vs. alternates
2006-10-24 3:53 Pushing vs. alternates Petr Baudis
@ 2006-10-24 5:29 ` Junio C Hamano
2006-10-24 5:46 ` Shawn Pearce
2006-10-24 11:20 ` Petr Baudis
0 siblings, 2 replies; 7+ messages in thread
From: Junio C Hamano @ 2006-10-24 5:29 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
Petr Baudis <pasky@ucw.cz> writes:
> I don't have time to code that myself right now, so I'm just tossing
> an idea around - pushing to a directory with alternates set up should
> avoid sending objects that are already in the alternate object database.
That is probably only relevant for the first push, since on
subsequent pushes the receiving repository has its own refs that
track the tips of the branches that were pushed last time.
And first time usage when you are initializing the repository
with alternates, you have direct access to that repository
(that's how you can set up alternates), you can as easily do the
initial fetch/clone as well at that time.
So it might be a nice addition but I suspect it would not matter
much in practice.
* Re: Pushing vs. alternates
2006-10-24 5:29 ` Junio C Hamano
@ 2006-10-24 5:46 ` Shawn Pearce
2006-10-24 11:20 ` Petr Baudis
1 sibling, 0 replies; 7+ messages in thread
From: Shawn Pearce @ 2006-10-24 5:46 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Petr Baudis, git
Junio C Hamano <junkio@cox.net> wrote:
> Petr Baudis <pasky@ucw.cz> writes:
>
> > I don't have time to code that myself right now, so I'm just tossing
> > an idea around - pushing to a directory with alternates set up should
> > avoid sending objects that are already in the alternate object database.
>
> That is probably only relevant for the first push, since on
> subsequent pushes the receiving repository has its own refs that
> track the tips of the branches that were pushed last time.
>
> And first time usage when you are initializing the repository
> with alternates, you have direct access to that repository
> (that's how you can set up alternates), you can as easily do the
> initial fetch/clone as well at that time.
>
> So it might be a nice addition but I suspect it would not matter
> much in practice.
What would be useful in practice is not unpacking the first pack
pushed to an empty repository, or better yet converting thin
packs to standalone packs rather than unpacking them to loose
objects when the number of objects in the incoming pack exceeds
some configured threshold.
Which Linus and Nico already took stabs at doing but haven't finished...
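A small sketch of the threshold decision, under stated assumptions: a pack stream starts with a 12-byte header (the signature "PACK", a version number, and the object count, all in network byte order), so the receiver can decide before reading any object data. The names `should_keep_pack` and `unpack_limit` and the default of 100 are made up for illustration, and completing a thin pack into a standalone one is not shown.

```python
import struct


def pack_object_count(header):
    """Return the object count from the 12-byte header of a pack stream."""
    if len(header) < 12:
        raise ValueError("a pack header is 12 bytes long")
    signature, _version, count = struct.unpack(">4sII", header[:12])
    if signature != b"PACK":
        raise ValueError("not a pack stream")
    return count


def should_keep_pack(header, unpack_limit=100):
    """Hypothetical policy: keep the pack on disk if it exceeds unpack_limit objects."""
    return pack_object_count(header) > unpack_limit
```

With this policy a first push of a whole history stays a pack, while a push of a handful of commits is still exploded into loose objects.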
--
Shawn.
* Re: Pushing vs. alternates
2006-10-24 5:29 ` Junio C Hamano
2006-10-24 5:46 ` Shawn Pearce
@ 2006-10-24 11:20 ` Petr Baudis
2006-10-24 17:12 ` Junio C Hamano
1 sibling, 1 reply; 7+ messages in thread
From: Petr Baudis @ 2006-10-24 11:20 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Dear diary, on Tue, Oct 24, 2006 at 07:29:45AM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> Petr Baudis <pasky@ucw.cz> writes:
>
> > I don't have time to code that myself right now, so I'm just tossing
> > an idea around - pushing to a directory with alternates set up should
> > avoid sending objects that are already in the alternate object database.
>
> That is probably only relevant for the first push, since on
> subsequent pushes the receiving repository has its own refs that
> track the tips of the branches that were pushed last time.
Well, I would send haves for the alternate repository anyway, since: you
push your kernel branch, half a year passes, you merge with new
development and want to push again; you really do not want to push
everything that happened over the last half a year. And sending the
extra haves shouldn't hurt, right?
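To make the half-a-year scenario concrete, here is a toy model, not the real negotiation code: the history is a plain dict from commit name to parent names, and the sender transmits whatever is reachable from the pushed tip but not from any have. The commit names and the function `commits_to_send` are invented for the example.

```python
def commits_to_send(parents, want, haves):
    """Commits reachable from want but not from any of the haves (toy model)."""

    def reachable(tips):
        seen = set()
        stack = list(tips)
        while stack:
            commit = stack.pop()
            if commit in seen or commit not in parents:
                continue
            seen.add(commit)
            stack.extend(parents[commit])
        return seen

    return reachable([want]) - reachable(haves)


# Upstream history u1..u4; my branch forked at u1 (m1) and later merged u4 (m2).
history = {
    "u1": [], "u2": ["u1"], "u3": ["u2"], "u4": ["u3"],
    "m1": ["u1"], "m2": ["m1", "u4"],
}

# The receiver's own ref still points at m1; the alternate's refs/ has u4.
own_only = commits_to_send(history, "m2", {"m1"})
with_alternate = commits_to_send(history, "m2", {"m1", "u4"})
```

In this toy history the push without the alternate's ref resends u2, u3 and u4 along with the merge, while the extra have reduces it to the merge commit alone.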
> And first time usage when you are initializing the repository
> with alternates, you have direct access to that repository
> (that's how you can set up alternates), you can as easily do the
> initial fetch/clone as well at that time.
I don't understand this paragraph. This mail is about pushing, not
fetch/clone. You can only push if your login access is reduced to
git-shell, and something external could've set up your alternates.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
* Re: Pushing vs. alternates
2006-10-24 11:20 ` Petr Baudis
@ 2006-10-24 17:12 ` Junio C Hamano
2006-10-24 17:23 ` Petr Baudis
2006-10-24 17:33 ` Junio C Hamano
0 siblings, 2 replies; 7+ messages in thread
From: Junio C Hamano @ 2006-10-24 17:12 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
Petr Baudis <pasky@suse.cz> writes:
> Dear diary, on Tue, Oct 24, 2006 at 07:29:45AM CEST, I got a letter
> where Junio C Hamano <junkio@cox.net> said that...
>> Petr Baudis <pasky@ucw.cz> writes:
>>
>> > I don't have time to code that myself right now, so I'm just tossing
>> > an idea around - pushing to a directory with alternates set up should
>> > avoid sending objects that are already in the alternate object database.
>>
>> That is probably only relevant for the first push, since on
>> subsequent pushes the receiving repository has its own refs that
>> track the tips of the branches that were pushed last time.
>
> Well, I would send haves for the alternate repository anyway,...
While I agree it would be an optimization if it worked, there is
one conceptual problem here though, coming from old warts. It is
not an alternate "repository" but an alternate object store.
There is no guarantee that the refs/ directory next to the
objects/ directory the alternate points at is related to that
object store, for historical reasons (i.e. we have separate GIT_DIR
and GIT_OBJECT_DIRECTORY). So unless we declare that objects that
are reachable from the refs/ *must* be fully connected in
objects/ when objects/ has refs/ next to it, sending HAVEs from
that refs/ can break the push, since that refs/ you are looking
at may not be related to the alternate objects/ at all. I do
not think it is a big restriction at all, but it is a new
restriction you are adding to the repository layout.
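As an illustration of what "fully connected" means here, and not something proposed in this thread, the sketch below models an object store as a dict from object name to the names it references, and only accepts a ref tip as a have if the tip and everything reachable from it is present. The names `fully_connected` and `safe_haves` are hypothetical; a real check would have to read loose objects and packs.

```python
def fully_connected(store, tip):
    """True if tip and every object reachable from it is present in store."""
    stack, seen = [tip], set()
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        if name not in store:
            return False  # a ref (or an object) points at something missing
        seen.add(name)
        stack.extend(store[name])
    return True


def safe_haves(store, ref_tips):
    """Keep only the ref tips that are safe to advertise as haves."""
    return {tip for tip in ref_tips if fully_connected(store, tip)}
```

A refs/ directory that belongs to some other object store would typically fail this check at the tip itself, which is the case the new layout restriction is meant to rule out.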
> ... You can only push if your login access is reduced to
> git-shell, and something external could've set up your alternates.
Ok, I was not thinking about "something external".
* Re: Pushing vs. alternates
2006-10-24 17:12 ` Junio C Hamano
@ 2006-10-24 17:23 ` Petr Baudis
2006-10-24 17:33 ` Junio C Hamano
1 sibling, 0 replies; 7+ messages in thread
From: Petr Baudis @ 2006-10-24 17:23 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Tue, Oct 24, 2006 at 07:12:17PM CEST, Junio C Hamano wrote:
> Petr Baudis <pasky@suse.cz> writes:
>
> > Dear diary, on Tue, Oct 24, 2006 at 07:29:45AM CEST, I got a letter
> > where Junio C Hamano <junkio@cox.net> said that...
> >> Petr Baudis <pasky@ucw.cz> writes:
> >>
> >> > I don't have time to code that myself right now, so I'm just tossing
> >> > an idea around - pushing to a directory with alternates set up should
> >> > avoid sending objects that are already in the alternate object database.
> >>
> >> That is probably only relevant for the first push, since on
> >> subsequent pushes the receiving repository has its own refs that
> >> track the tips of the branches that were pushed last time.
> >
> > Well, I would send haves for the alternate repository anyway,...
>
> While I agree it would be an optimization if it worked, there is
> one conceptual problem here though, coming from old warts. It is
> not an alternate "repository" but an alternate object store.
Yes. That is ugly, but it may make sense if you really have something
like a "portable object database" on your USB flash drive, or whatever
else insane. ;-) Still,
> There is no guarantee that the refs/ directory next to the
> objects/ directory the alternate points at is related to that
> object store, for historical reasons (i.e. we have separate GIT_DIR
> and GIT_OBJECT_DIRECTORY). So unless we declare that objects that
> are reachable from the refs/ *must* be fully connected in
> objects/ when objects/ has refs/ next to it, sending HAVEs from
> that refs/ can break the push, since that refs/ you are looking
> at may not be related to the alternate objects/ at all. I do
> not think it is a big restriction at all, but it is a new
> restriction you are adding to the repository layout.
I think this situation (having something that looks like a Git
repository with objects/ inside that does *not* belong to this
repository) _is_ totally insane and such a restriction is fine. Who
thinks otherwise?
If this really bothers anyone (I can't see why), we could have something
like [ -e objects/info/standalone ] to prohibit receive-pack from ever
thinking of checking if the object database belongs to a repository. We
could of course keep the behaviour as is and make the new one optional,
but I believe that the new one is more sensible.
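A minimal sketch of that opt-out, assuming the marker file is named objects/info/standalone as suggested above; the marker is only a proposal in this mail and the function name `may_use_sibling_refs` is hypothetical.

```python
from pathlib import Path


def may_use_sibling_refs(objects_dir):
    """Decide whether the refs/ next to an object directory may be consulted.

    Returns False if the (proposed) objects/info/standalone marker exists
    or if there is no refs/ directory next to objects/ at all.
    """
    objects_dir = Path(objects_dir)
    if (objects_dir / "info" / "standalone").exists():
        return False
    return (objects_dir.parent / "refs").is_dir()
```

This keeps the new behaviour as the default and lets the rare standalone object database say so explicitly.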
> > ... You can only push if your login access is reduced to
> > git-shell, and something external could've set up your alternates.
>
> Ok, I was not thinking about "something external".
Also, if you can push, that does not imply at all that you can fetch as
well. In plenty of situations you can't; most UNIX machines do have ssh
running, but that's not very useful when they're behind a NAT or just a
restrictive firewall. And with my notebook, I'm almost always behind a
NAT.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
* Re: Pushing vs. alternates
2006-10-24 17:12 ` Junio C Hamano
2006-10-24 17:23 ` Petr Baudis
@ 2006-10-24 17:33 ` Junio C Hamano
1 sibling, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2006-10-24 17:33 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
Junio C Hamano <junkio@cox.net> writes:
>> Well, I would send haves for the alternate repository anyway,...
>
> While I agree it would be an optimization if it worked, there is
> one conceptual problem here though, coming from old warts. It is
> not an alternate "repository" but an alternate object store.
> There is no guarantee that the refs/ directory next to the
> objects/ directory the alternate points at is related to that
> object store, for historical reasons (i.e. we have separate GIT_DIR
> and GIT_OBJECT_DIRECTORY).
Having said that, I am not opposed to the idea of using refs/
next to objects/ your alternate points at. Certainly I would
not have any objection (heck I would even volunteer to code it
myself if only to see how much we can save) if we did not have
GIT_OBJECT_DIRECTORY in the system (i.e. if we had a guarantee
from the beginning that objects/ directory that is next to refs/
*must* be related). So I am Ok with this change, but I would
feel better if we add a few sentences to repository-layout.txt
that warns about the (technically new although it is very likely
that violating it would not have been useful at all) restriction.
I suspect we could do the same for fetching in principle,
e.g. when you track Linus's and a subsystem maintainer's trees
and these two repositories are linked with alternates at your
end. Fetching into your copy of Linus's and then fetching into
your copy of subsystem would be optimized the same way if you
send refs/ from the alternates as HAVEs, right?