* git-pull from git.git - no remote ref for pu or next?
@ 2006-12-12 16:44 Randal L. Schwartz
2006-12-12 16:47 ` Randal L. Schwartz
0 siblings, 1 reply; 22+ messages in thread
From: Randal L. Schwartz @ 2006-12-12 16:44 UTC (permalink / raw)
To: git
I just got this on this morning's git-fetch:
error: no such remote ref refs/heads/pu
error: no such remote ref refs/heads/next
Fetch failure: git://git.kernel.org/pub/scm/git/git.git
Here's my remotes/origin:
URL: git://git.kernel.org/pub/scm/git/git.git
Pull: master:origin
Pull: man:man
Pull: html:html
Pull: +pu:pu
Pull: +next:next
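For readers unfamiliar with this file format: each "Pull:" line is a refspec of the form src:dst, and a leading "+" asks git to update the local ref even when the update is not a fast-forward. The following is only a rough sketch of how such a file could be read; the function and type names are made up for illustration and are not git's own code.

```python
from typing import NamedTuple

class Refspec(NamedTuple):
    src: str     # branch name on the remote side
    dst: str     # local branch that tracks it
    force: bool  # True when the refspec starts with '+'

def parse_remotes_file(text):
    """Return (url, refspecs) from the text of a remotes/<name> file."""
    url = None
    refspecs = []
    for line in text.splitlines():
        # Split only on the first colon, so the URL keeps its own colons.
        key, _, value = line.partition(":")
        value = value.strip()
        if key == "URL":
            url = value
        elif key == "Pull":
            force = value.startswith("+")
            src, _, dst = value.lstrip("+").partition(":")
            refspecs.append(Refspec(src, dst, force))
    return url, refspecs

REMOTES_ORIGIN = """\
URL: git://git.kernel.org/pub/scm/git/git.git
Pull: master:origin
Pull: +pu:pu
Pull: +next:next
"""
url, specs = parse_remotes_file(REMOTES_ORIGIN)
```

With the file above, "pu" and "next" come out as forced refspecs, while "master" is only updated on a fast-forward.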
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 16:44 git-pull from git.git - no remote ref for pu or next? Randal L. Schwartz
@ 2006-12-12 16:47 ` Randal L. Schwartz
2006-12-12 17:40 ` Johannes Schindelin
2006-12-12 17:56 ` Linus Torvalds
0 siblings, 2 replies; 22+ messages in thread
From: Randal L. Schwartz @ 2006-12-12 16:47 UTC (permalink / raw)
To: git
>>>>> "Randal" == Randal L Schwartz <merlyn@stonehenge.com> writes:
Randal> I just got this on this morning's git-fetch:
Randal> error: no such remote ref refs/heads/pu
Randal> error: no such remote ref refs/heads/next
Randal> Fetch failure: git://git.kernel.org/pub/scm/git/git.git
Randal> Here's my remotes/origin:
Randal> URL: git://git.kernel.org/pub/scm/git/git.git
Randal> Pull: master:origin
Randal> Pull: man:man
Randal> Pull: html:html
Randal> Pull: +pu:pu
Randal> Pull: +next:next
And then it mysteriously fixed itself a few minutes later.
Is there some sort of publishing failure, or intermittent race condition?
Or is this something unique to git.git?
Or just bad electron spin or something?
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 16:47 ` Randal L. Schwartz
@ 2006-12-12 17:40 ` Johannes Schindelin
2006-12-12 17:42 ` Randal L. Schwartz
2006-12-12 17:56 ` Linus Torvalds
1 sibling, 1 reply; 22+ messages in thread
From: Johannes Schindelin @ 2006-12-12 17:40 UTC (permalink / raw)
To: Randal L. Schwartz; +Cc: git
Hi,
On Tue, 12 Dec 2006, Randal L. Schwartz wrote:
> >>>>> "Randal" == Randal L Schwartz <merlyn@stonehenge.com> writes:
>
> Randal> I just got this on this morning's git-fetch:
>
> Randal> error: no such remote ref refs/heads/pu
> Randal> error: no such remote ref refs/heads/next
> Randal> Fetch failure: git://git.kernel.org/pub/scm/git/git.git
Congratulations!
You are experiencing the Thundering Herd Phenomenon we talked a lot about
lately (the kernel.org mirroring thread).
Ciao,
Dscho
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 17:40 ` Johannes Schindelin
@ 2006-12-12 17:42 ` Randal L. Schwartz
0 siblings, 0 replies; 22+ messages in thread
From: Randal L. Schwartz @ 2006-12-12 17:42 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
>>>>> "Johannes" == Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
Johannes> Congratulations!
Johannes> You are experiencing the Thundering Herd Phenomenon we talked a lot about
Johannes> lately (the kernel.org mirroring thread).
As long as others are sharing my pain, I'll be OK. :)
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 16:47 ` Randal L. Schwartz
2006-12-12 17:40 ` Johannes Schindelin
@ 2006-12-12 17:56 ` Linus Torvalds
2006-12-12 18:09 ` Johannes Schindelin
` (2 more replies)
1 sibling, 3 replies; 22+ messages in thread
From: Linus Torvalds @ 2006-12-12 17:56 UTC (permalink / raw)
To: Randal L. Schwartz; +Cc: git
On Tue, 12 Dec 2006, Randal L. Schwartz wrote:
> >>>>> "Randal" == Randal L Schwartz <merlyn@stonehenge.com> writes:
>
> Randal> I just got this on this morning's git-fetch:
>
> Randal> error: no such remote ref refs/heads/pu
> Randal> error: no such remote ref refs/heads/next
> Randal> Fetch failure: git://git.kernel.org/pub/scm/git/git.git
>
> Randal> Here's my remotes/origin:
>
> Randal> URL: git://git.kernel.org/pub/scm/git/git.git
> Randal> Pull: master:origin
> Randal> Pull: man:man
> Randal> Pull: html:html
> Randal> Pull: +pu:pu
> Randal> Pull: +next:next
>
> And then it mysteriously fixed itself a few minutes later.
> Is there some sort of publishing failure, or intermittent race condition?
It's mirroring.
The way that kernel.org works is that there is one master site, which is
not actually running any public services at all. That's the one that
people who have write access can ssh into, and rather than run any public
services, it runs the security-conscious things, like the secure logins
and the automated signing scripts.
The actual _public_ sites are just mirrors, with just rsync between the
things. All to keep the services on the master site minimal.
But because the public sites just mirror using rsync, and aren't really
aware of git repositories etc at that stage, what can happen is that a
mirroring is on-going when Junio does a push, and then the changes to the
"refs/" directory might get rsync'ed before the "objects/" directory does,
and you end up with the public sites having references to objects that
don't even _exist_ on those public sites any more.
When they then run git-daemon, git-daemon will basically see a corrupt git
archive, and not expose those "broken" refs at all. Which explains what
you see.
And once the mirroring completes, the issue just goes away, which explains
why it just magically works five minutes later.
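As a toy model of the effect described above (not git-daemon's actual code), one can think of the server as advertising only those refs whose target object is present in the object store. The ref names come from the error messages in this thread; the object ids and the function name are invented for the example.

```python
def advertised_refs(refs, objects):
    """Return only the refs whose target object is present locally."""
    return {name: sha for name, sha in refs.items() if sha in objects}

# Mirror state in the middle of an rsync run: the refs have arrived,
# but the objects that "pu" and "next" point at have not.
refs = {
    "refs/heads/master": "aaa1",
    "refs/heads/next": "bbb2",
    "refs/heads/pu": "ccc3",
}
objects = {"aaa1"}

visible = advertised_refs(refs, objects)
missing = sorted(set(refs) - set(visible))
```

A client that asks for the refs in "missing" gets "no such remote ref", which matches the errors reported at the top of the thread.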
If the public sites used git itself to synchronize git repositories,
they'd never see anything like this (because git itself will only write
the new refs after it has actually updated the data). But since the thing
needs mirroring for non-git uses too, and since rsync generally _works_
apart from the slight race-condition issue, that's what it just uses.
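The ordering argument can be sketched with a small simulation. This is only an illustration under simplified assumptions: the store is a dict, and an update is two steps (write the objects, write the ref). It records whether a reader would see a consistent repository after each step.

```python
def is_consistent(store):
    """True when every ref points at an object that exists."""
    return all(sha in store["objects"] for sha in store["refs"].values())

def update(store, new_objects, ref, tip, refs_first):
    """Apply an update in two steps; record consistency after each step."""
    def write_objects():
        store["objects"].update(new_objects)

    def write_ref():
        store["refs"][ref] = tip

    steps = [write_ref, write_objects] if refs_first else [write_objects, write_ref]
    observed = []
    for step in steps:
        step()
        observed.append(is_consistent(store))
    return observed

def fresh():
    return {"objects": {"aaa1"}, "refs": {"refs/heads/next": "aaa1"}}

# A file-level mirror may happen to copy refs/ before the objects.
rsync_like = update(fresh(), {"bbb2"}, "refs/heads/next", "bbb2", refs_first=True)
# Git writes the data first and the ref last.
git_like = update(fresh(), {"bbb2"}, "refs/heads/next", "bbb2", refs_first=False)
```

Only the refs-first ordering has an intermediate state in which a ref points at a missing object.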
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 17:56 ` Linus Torvalds
@ 2006-12-12 18:09 ` Johannes Schindelin
2006-12-12 18:23 ` Linus Torvalds
2006-12-12 18:54 ` Nicolas Pitre
2006-12-12 18:51 ` Nicolas Pitre
2006-12-12 21:26 ` Eric Wong
2 siblings, 2 replies; 22+ messages in thread
From: Johannes Schindelin @ 2006-12-12 18:09 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Randal L. Schwartz, git
Hi,
On Tue, 12 Dec 2006, Linus Torvalds wrote:
> But since the thing needs mirroring for non-git uses too, and since
> rsync generally _works_ apart from the slight race-condition issue,
... and git would probably change the pack structure (i.e. which objects
are in which packs, or even loose) which would be too bad for all those
HTTP leechers ...
> that's what it just uses.
Ciao,
Dscho
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 18:09 ` Johannes Schindelin
@ 2006-12-12 18:23 ` Linus Torvalds
2006-12-12 19:07 ` Nicolas Pitre
2006-12-12 19:18 ` Jakub Narebski
2006-12-12 18:54 ` Nicolas Pitre
1 sibling, 2 replies; 22+ messages in thread
From: Linus Torvalds @ 2006-12-12 18:23 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Randal L. Schwartz, git
On Tue, 12 Dec 2006, Johannes Schindelin wrote:
> > rsync generally _works_ apart from the slight race-condition issue,
>
> ... and git would probably change the pack structure (i.e. which objects
> are in which packs, or even loose) which would be too bad for all those
> HTTP leechers ...
Well, as it is, I end up repacking my git archives on kernel.org every two
weeks or so anyway, so anybody who uses stupid protocols (rsync or http)
will end up downloading everything anew anyway.
And kernel.org will probably start doing automatic repacking, since the
current situation just means that some people don't repack on their own,
and have tens of thousands of loose objects.
You really don't want to use the non-native protocols unless you have to,
or for projects that don't change.
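A rough sketch of why a repack hurts file-level transports: such a client can only fetch whole pack files, and it recognizes the ones it already has by name. The pack names and sizes below are invented for the example; the point is only that a repack replaces the old packs with a new, differently named one.

```python
def dumb_fetch_cost(server_packs, client_packs):
    """Bytes a file-level transport must download: every pack not held by name."""
    return sum(size for name, size in server_packs.items()
               if name not in client_packs)

# Hypothetical server state at the time of the client's last fetch.
before = {"pack-1111": 150_000_000, "pack-2222": 2_000_000}
client = set(before)

# A push adds a small pack: only that pack has to be fetched.
after_push = {**before, "pack-3333": 500_000}
incremental = dumb_fetch_cost(after_push, client)

# A full repack folds everything into one new pack: nothing can be reused.
after_repack = {"pack-9999": 152_400_000}
full = dumb_fetch_cost(after_repack, client)
```

The native protocol avoids this because the two sides negotiate which objects are missing instead of comparing file names.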
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 17:56 ` Linus Torvalds
2006-12-12 18:09 ` Johannes Schindelin
@ 2006-12-12 18:51 ` Nicolas Pitre
2006-12-12 19:03 ` Linus Torvalds
2006-12-12 21:26 ` Eric Wong
2 siblings, 1 reply; 22+ messages in thread
From: Nicolas Pitre @ 2006-12-12 18:51 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Randal L. Schwartz, git
On Tue, 12 Dec 2006, Linus Torvalds wrote:
> If the public sites used git itself to synchronize git repositories,
> they'd never see anything like this (because git itself will only write
> the new refs after it has actually updated the data). But since the thing
> needs mirroring for non-git uses too, and since rsync generally _works_
> apart from the slight race-condition issue, that's what it just uses.
Wouldn't it be a worthy goal to exclude git repos from the rsync
mirroring and use git instead? The current arrangement doesn't put git's
reliability in a good light for the general public who don't read this
mailing list, even if we know it is just a minor and temporary
annoyance.
A failure always makes you look bad regardless of its severity.
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 18:09 ` Johannes Schindelin
2006-12-12 18:23 ` Linus Torvalds
@ 2006-12-12 18:54 ` Nicolas Pitre
2006-12-12 19:04 ` Johannes Schindelin
1 sibling, 1 reply; 22+ messages in thread
From: Nicolas Pitre @ 2006-12-12 18:54 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, Randal L. Schwartz, git
On Tue, 12 Dec 2006, Johannes Schindelin wrote:
> Hi,
>
> On Tue, 12 Dec 2006, Linus Torvalds wrote:
>
> > But since the thing needs mirroring for non-git uses too, and since
> > rsync generally _works_ apart from the slight race-condition issue,
>
> ... and git would probably change the pack structure (i.e. which objects
> are in which packs, or even loose) which would be too bad for all those
> HTTP leechers ...
I don't see how that would be more of a concern than the current
situation with occasional repacks.
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 18:51 ` Nicolas Pitre
@ 2006-12-12 19:03 ` Linus Torvalds
0 siblings, 0 replies; 22+ messages in thread
From: Linus Torvalds @ 2006-12-12 19:03 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Randal L. Schwartz, git
On Tue, 12 Dec 2006, Nicolas Pitre wrote:
>
> Wouldn't it be a worthy goal to exclude git repos from the rsync
> mirroring and use git instead?
Well, one of the problems is simply maintenance of kernel.org.
It's just _simpler_ to use rsync for everything.
Look at the current gitweb caching discussion. Did anybody actually step
up to be a gitweb maintainer on kernel.org?
Same deal. Simplicity and lack of maintenance is sometimes not just a good
idea, it's a requirement.
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 18:54 ` Nicolas Pitre
@ 2006-12-12 19:04 ` Johannes Schindelin
2006-12-12 19:15 ` Jakub Narebski
2006-12-12 19:26 ` Nicolas Pitre
0 siblings, 2 replies; 22+ messages in thread
From: Johannes Schindelin @ 2006-12-12 19:04 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Linus Torvalds, Randal L. Schwartz, git
Hi,
On Tue, 12 Dec 2006, Nicolas Pitre wrote:
> On Tue, 12 Dec 2006, Johannes Schindelin wrote:
>
> > On Tue, 12 Dec 2006, Linus Torvalds wrote:
> >
> > > But since the thing needs mirroring for non-git uses too, and since
> > > rsync generally _works_ apart from the slight race-condition issue,
> >
> > ... and git would probably change the pack structure (i.e. which objects
> > are in which packs, or even loose) which would be too bad for all those
> > HTTP leechers ...
>
> I don't see how that would be more of a concern than the current
> situation with occasional repacks.
Oh well. I did not want to get bashed for something which is probably no
problem, but I suspected that the two mirror machines could get out of
sync, which could well mean that the new packs would have to be downloaded
_twice_. As I said, probably no problem.
But it would become a non-problem when the HTTP transport would learn to
read and interpret the .idx files, basically constructing thin packs from
parts of the .pack files ("Content-Range:" comes to mind)...
Ciao,
Dscho
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 18:23 ` Linus Torvalds
@ 2006-12-12 19:07 ` Nicolas Pitre
2006-12-12 19:13 ` Linus Torvalds
2006-12-12 19:18 ` Jakub Narebski
1 sibling, 1 reply; 22+ messages in thread
From: Nicolas Pitre @ 2006-12-12 19:07 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Johannes Schindelin, Randal L. Schwartz, git
On Tue, 12 Dec 2006, Linus Torvalds wrote:
> And kernel.org will probably start doing automatic repacking, since the
> current situation just means that some people don't repack on their own,
> and have tens of thousands of loose objects.
Maybe object sharing between repos could be a good idea too? All kernel
repos are likely to contain a large subset of your own so they could
have a reference on it by default. That would certainly allow for
better caching and less IO on the server.
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 19:07 ` Nicolas Pitre
@ 2006-12-12 19:13 ` Linus Torvalds
0 siblings, 0 replies; 22+ messages in thread
From: Linus Torvalds @ 2006-12-12 19:13 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Johannes Schindelin, Randal L. Schwartz, git
On Tue, 12 Dec 2006, Nicolas Pitre wrote:
>
> Maybe object sharing between repos could be a good idea too?
We often do. Many people have used "git clone -s -l", and as long as you
repack with the "-l" flag too, the object sharing actually increases over
time as the base repo gets repacked and the cloned repo keeps using the
growing base pack.
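For background: a clone made with "-s" records the base repository's object directory in its own objects/info/alternates file, one path per line, and object lookups fall through to those directories. The sketch below only builds the search path and is not git's implementation; it ignores relative paths and chained alternates, and the repository names are invented.

```python
import os
import tempfile

def object_dirs(git_dir):
    """Return the object directories to search: the repo's own, then alternates."""
    own = os.path.join(git_dir, "objects")
    dirs = [own]
    alternates = os.path.join(own, "info", "alternates")
    if os.path.exists(alternates):
        with open(alternates) as fh:
            for line in fh:
                line = line.strip()
                if line:
                    dirs.append(line)
    return dirs

with tempfile.TemporaryDirectory() as root:
    base = os.path.join(root, "base.git")
    clone = os.path.join(root, "clone.git")
    os.makedirs(os.path.join(base, "objects"))
    os.makedirs(os.path.join(clone, "objects", "info"))
    # Simulate what a shared clone records about its base repository.
    with open(os.path.join(clone, "objects", "info", "alternates"), "w") as fh:
        fh.write(os.path.join(base, "objects") + "\n")
    search_path = object_dirs(clone)
    expected = [os.path.join(clone, "objects"), os.path.join(base, "objects")]
```

Repacking the clone with "-l" keeps objects that are reachable through the alternate out of the clone's own pack, so the sharing is preserved.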
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 19:04 ` Johannes Schindelin
@ 2006-12-12 19:15 ` Jakub Narebski
2006-12-12 19:26 ` Nicolas Pitre
1 sibling, 0 replies; 22+ messages in thread
From: Jakub Narebski @ 2006-12-12 19:15 UTC (permalink / raw)
To: git
Johannes Schindelin wrote:
> But it would become a non-problem when the HTTP transport would learn to
> read and interpret the .idx files, basically constructing thin packs from
> parts of the .pack files ("Content-Range:" comes to mind)...
cURL the CLI can do this with -r/--range option, so I think that curl
the library can do this too. Mind you, this is HTTP/1.1 extension
(hmmm... I wonder if many sites run HTTP/1.0 only).
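As a small illustration of a single range request, here is a sketch using Python's standard urllib. It only builds the request and does not send it; the URL is a placeholder, not a real repository. A server that honours the header answers with "206 Partial Content" and a Content-Range header, while one that does not may simply send the whole file with "200 OK".

```python
from urllib.request import Request

def range_request(url, first_byte, last_byte):
    """Build (but do not send) a GET request for an inclusive byte range."""
    req = Request(url)
    req.add_header("Range", f"bytes={first_byte}-{last_byte}")
    return req

# Placeholder URL; bytes 0-11 would be the 12-byte header of a pack file.
req = range_request(
    "http://example.org/repo.git/objects/pack/pack-EXAMPLE.pack", 0, 11
)
range_value = req.get_header("Range")
```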
See also: http://www.linux.com/article.pl?sid=06/10/10/1824245
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 18:23 ` Linus Torvalds
2006-12-12 19:07 ` Nicolas Pitre
@ 2006-12-12 19:18 ` Jakub Narebski
1 sibling, 0 replies; 22+ messages in thread
From: Jakub Narebski @ 2006-12-12 19:18 UTC (permalink / raw)
To: git
Linus Torvalds wrote:
> On Tue, 12 Dec 2006, Johannes Schindelin wrote:
>> > rsync generally _works_ apart from the slight race-condition issue,
>>
>> ... and git would probably change the pack structure (i.e. which objects
>> are in which packs, or even loose) which would be too bad for all those
>> HTTP leechers ...
>
> Well, as it is, I end up repacking my git archives on kernel.org every two
> weeks or so anyway, so anybody who uses stupid protocols (rsync or http)
> will end up downloading everything anew anyway.
What about the "logarithmic packs" idea someone (Pasky?) posted on the git mailing
list: pack from last week, pack from last month except last week, pack from
two months, pack from four months, pack from last year...
> And kernel.org will probably start doing automatic repacking, since the
> current situation just means that some people don't repack on their own,
> and have tens of thousands of loose objects.
...with *.keep to keep archive packs it can be even automated somewhat.
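One possible reading of the logarithmic scheme, as a sketch only: the bucket boundaries below (7 days, 30 days, then doubling) are my assumption based on the list above, not a worked-out proposal from the original posting.

```python
def pack_bucket(age_days, first=7, second=30):
    """Return the index of the pack an object of the given age would live in.

    Bucket 0 holds objects younger than `first` days, bucket 1 those
    younger than `second` days, and every later bucket doubles the bound.
    """
    if age_days < first:
        return 0
    bound, index = second, 1
    while age_days >= bound:
        bound *= 2
        index += 1
    return index

ages = [3, 10, 45, 100, 400]
buckets = [pack_bucket(age) for age in ages]
```

The appeal for file-level transports is that the old, large packs are rewritten rarely, so most fetches only have to download the small recent ones.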
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 19:04 ` Johannes Schindelin
2006-12-12 19:15 ` Jakub Narebski
@ 2006-12-12 19:26 ` Nicolas Pitre
2006-12-12 19:32 ` Johannes Schindelin
1 sibling, 1 reply; 22+ messages in thread
From: Nicolas Pitre @ 2006-12-12 19:26 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, Randal L. Schwartz, git
On Tue, 12 Dec 2006, Johannes Schindelin wrote:
> On Tue, 12 Dec 2006, Nicolas Pitre wrote:
>
> > On Tue, 12 Dec 2006, Johannes Schindelin wrote:
> >
> > > On Tue, 12 Dec 2006, Linus Torvalds wrote:
> > >
> > > > But since the thing needs mirroring for non-git uses too, and since
> > > > rsync generally _works_ apart from the slight race-condition issue,
> > >
> > > ... and git would probably change the pack structure (i.e. which objects
> > > are in which packs, or even loose) which would be too bad for all those
> > > HTTP leechers ...
> >
> > I don't see how that would be more of a concern than the current
> > situation with occasional repacks.
>
> Oh well. I did not want to get bashed for something which is probably no
> problem,
Sorry, far be it from me to sound as if I were bashing you.
> but I suspected that the two mirror machines could get out of
> sync, which could well mean that the new packs would have to be downloaded
> _twice_. As I said, probably no problem.
In theory that should not happen since all mirrors would get the same
updates in the same steps. But in practice if one mirror fails to get
updated for whatever reason then the next time around it could have a
bigger pack instead of two smaller ones for the same set of objects.
> But it would become a non-problem when the HTTP transport would learn to
> read and interpret the .idx files, basically constructing thin packs from
> parts of the .pack files ("Content-Range:" comes to mind)...
Woooh.
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 19:26 ` Nicolas Pitre
@ 2006-12-12 19:32 ` Johannes Schindelin
2006-12-12 19:48 ` Linus Torvalds
2006-12-12 19:50 ` Nicolas Pitre
0 siblings, 2 replies; 22+ messages in thread
From: Johannes Schindelin @ 2006-12-12 19:32 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Linus Torvalds, Randal L. Schwartz, git
Hi,
On Tue, 12 Dec 2006, Nicolas Pitre wrote:
> On Tue, 12 Dec 2006, Johannes Schindelin wrote:
>
> > But it would become a non-problem when the HTTP transport would learn
> > to read and interpret the .idx files, basically constructing thin
> > packs from parts of the .pack files ("Content-Range:" comes to
> > mind)...
>
> Woooh.
Does that mean "Yes, I'll do it"? ;-)
Ciao,
Dscho
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 19:32 ` Johannes Schindelin
@ 2006-12-12 19:48 ` Linus Torvalds
2006-12-12 20:04 ` Jakub Narebski
2006-12-12 20:26 ` Johannes Schindelin
2006-12-12 19:50 ` Nicolas Pitre
1 sibling, 2 replies; 22+ messages in thread
From: Linus Torvalds @ 2006-12-12 19:48 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Nicolas Pitre, Randal L. Schwartz, git
On Tue, 12 Dec 2006, Johannes Schindelin wrote:
> On Tue, 12 Dec 2006, Nicolas Pitre wrote:
>
> > On Tue, 12 Dec 2006, Johannes Schindelin wrote:
> >
> > > But it would become a non-problem when the HTTP transport would learn
> > > to read and interpret the .idx files, basically constructing thin
> > > packs from parts of the .pack files ("Content-Range:" comes to
> > > mind)...
> >
> > Woooh.
>
> Does that mean "Yes, I'll do it"? ;-)
Umm. I hope it means "Woooh, that's crazy talk".
You do realize that then you need to teach the http-walker about walking
the delta chain all the way up? For big pulls, you're going to be a lot
_slower_ than just downloading the whole dang thing, because the delta
objects are often just ~40 bytes, and you've now added a ping-pong latency
for each such small transfer.
You don't need to download many such small ranges, and suddenly the few
hundred ping-pongs that got you a few tens of kB of data took longer than
just downloading a big stream efficiently that got you everything.
One big reason the native git protocol is efficient is that it's largely a
streaming protocol (apart from the early ref-walking, but even that tries
to stream as much as possible, rather than having a back-and-forth latency
for each query)
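The latency argument is easy to put into numbers. The figures below (300 sequential requests, a 100 ms round trip, a 10 MB pack at 1 MB/s) are assumptions picked for illustration, not measurements.

```python
def ranged_fetch_seconds(n_requests, rtt_s):
    """Lower bound for sequential range requests: one round trip each."""
    return n_requests * rtt_s

def stream_fetch_seconds(total_bytes, bytes_per_s, rtt_s):
    """One round trip to start, then the transfer is bandwidth-bound."""
    return rtt_s + total_bytes / bytes_per_s

# Assumed numbers: 300 small dependent requests at 100 ms round-trip time.
ranged = ranged_fetch_seconds(300, 0.1)
# Assumed numbers: one 10 MB stream at 1 MB/s over the same link.
streamed = stream_fetch_seconds(10_000_000, 1_000_000, 0.1)
```

Under these assumptions the few hundred tiny requests take roughly three times as long as streaming the entire pack, even though they carry only a few kB of data.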
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 19:32 ` Johannes Schindelin
2006-12-12 19:48 ` Linus Torvalds
@ 2006-12-12 19:50 ` Nicolas Pitre
1 sibling, 0 replies; 22+ messages in thread
From: Nicolas Pitre @ 2006-12-12 19:50 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, Randal L. Schwartz, git
On Tue, 12 Dec 2006, Johannes Schindelin wrote:
> Hi,
>
> On Tue, 12 Dec 2006, Nicolas Pitre wrote:
>
> > On Tue, 12 Dec 2006, Johannes Schindelin wrote:
> >
> > > But it would become a non-problem when the HTTP transport would learn
> > > to read and interpret the .idx files, basically constructing thin
> > > packs from parts of the .pack files ("Content-Range:" comes to
> > > mind)...
> >
> > Woooh.
>
> Does that mean "Yes, I'll do it"? ;-)
Absolutely not. ;-) I know next to nothing about HTTP to start with.
It just looks like a crazy idea that might actually work.
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 19:48 ` Linus Torvalds
@ 2006-12-12 20:04 ` Jakub Narebski
2006-12-12 20:26 ` Johannes Schindelin
1 sibling, 0 replies; 22+ messages in thread
From: Jakub Narebski @ 2006-12-12 20:04 UTC (permalink / raw)
To: git
Linus Torvalds wrote:
>
>
> On Tue, 12 Dec 2006, Johannes Schindelin wrote:
>> On Tue, 12 Dec 2006, Nicolas Pitre wrote:
>>
>>> On Tue, 12 Dec 2006, Johannes Schindelin wrote:
>>>
>>>> But it would become a non-problem when the HTTP transport would learn
>>>> to read and interpret the .idx files, basically constructing thin
>>>> packs from parts of the .pack files ("Content-Range:" comes to
>>>> mind)...
>>>
>>> Woooh.
>>
>> Does that mean "Yes, I'll do it"? ;-)
>
> Umm. I hope it means "Woooh, that's crazy talk".
>
> You do realize that then you need to teach the http-walker about walking
> the delta chain all the way up? For big pulls, you're going to be a lot
> _slower_ than just downloading the whole dang thing, because the delta
> objects are often just ~40 bytes, and you've now added a ping-pong latency
> for each such small transfer.
>
> You don't need to download many such small ranges, and suddenly the few
> hundred ping-pongs that got you a few tens of kB of data took longer than
> just downloading a big stream efficiently that got you everything.
While I think the problem is much better solved by having "archive" pack(s)
and a "current" pack, perhaps with always downloading the whole delta it
would be feasible?
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 19:48 ` Linus Torvalds
2006-12-12 20:04 ` Jakub Narebski
@ 2006-12-12 20:26 ` Johannes Schindelin
1 sibling, 0 replies; 22+ messages in thread
From: Johannes Schindelin @ 2006-12-12 20:26 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Nicolas Pitre, Randal L. Schwartz, git
Hi,
On Tue, 12 Dec 2006, Linus Torvalds wrote:
> On Tue, 12 Dec 2006, Johannes Schindelin wrote:
> > On Tue, 12 Dec 2006, Nicolas Pitre wrote:
> >
> > > On Tue, 12 Dec 2006, Johannes Schindelin wrote:
> > >
> > > > But it would become a non-problem when the HTTP transport would learn
> > > > to read and interpret the .idx files, basically constructing thin
> > > > packs from parts of the .pack files ("Content-Range:" comes to
> > > > mind)...
> > >
> > > Woooh.
> >
> > Does that mean "Yes, I'll do it"? ;-)
>
> Umm. I hope it means "Woooh, that's crazy talk".
>
> You do realize that then you need to teach the http-walker about walking
> the delta chain all the way up? For big pulls, you're going to be a lot
> _slower_ than just downloading the whole dang thing, because the delta
> objects are often just ~40 bytes, and you've now added a ping-pong latency
> for each such small transfer.
Two points:
- For loose objects, the HTTP walker does exactly that. This is the normal
case for "just a few objects since the last fetch". It will _never_ be the
case for the initial clone!
- Usually, the object fetching can be parallelized, because you want
multiple objects which are in disjoint delta chains. And for these, you
can say something like "Content-Range: 15-31,64-79,108-135" IIRC.
You could even fetch sensible chunks, i.e. cut only at multiples of 512 to
make the transport more efficient, and only fetch the parts which are
_still_ missing.
So, a crazy idea, yes. But a feasible one. Just not crazy enough to be
tempting for me (I use the git protocol whenever possible, too).
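A sketch of the chunking idea, with one caveat: in HTTP/1.1 the request header is "Range: bytes=...", and "Content-Range" is what the server sends back. The helper names are made up; the first three spans are the ones from the example above, and the fourth is added to show a second chunk.

```python
def aligned_ranges(spans, block=512):
    """Round (start, end) byte spans outward to block boundaries and merge them.

    Ends are inclusive, as in an HTTP Range header.
    """
    rounded = sorted(
        (start // block * block, (end // block + 1) * block - 1)
        for start, end in spans
    )
    merged = []
    for start, end in rounded:
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous chunk: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(chunk) for chunk in merged]

def range_header(spans):
    """Format inclusive byte spans as the value of a Range request header."""
    return "bytes=" + ",".join(f"{start}-{end}" for start, end in spans)

wanted = [(15, 31), (64, 79), (108, 135), (4000, 4100)]
chunks = aligned_ranges(wanted)
exact_header = range_header(wanted)
aligned_header = range_header(chunks)
```

Here the three small spans collapse into a single 512-byte chunk, so one request covers them all. A server answers a multi-range request with a multipart/byteranges body, which the client would still have to parse.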
Ciao,
* Re: git-pull from git.git - no remote ref for pu or next?
2006-12-12 17:56 ` Linus Torvalds
2006-12-12 18:09 ` Johannes Schindelin
2006-12-12 18:51 ` Nicolas Pitre
@ 2006-12-12 21:26 ` Eric Wong
2 siblings, 0 replies; 22+ messages in thread
From: Eric Wong @ 2006-12-12 21:26 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Randal L. Schwartz, git
Linus Torvalds <torvalds@osdl.org> wrote:
> > And then it mysteriously fixed itself a few minutes later.
> > Is there some sort of publishing failure, or intermittent race condition?
>
> But because the public sites just mirror using rsync, and aren't really
> aware of git repositories etc at that stage, what can happen is that a
> mirroring is on-going when Junio does a push, and then the changes to the
> "refs/" directory might get rsync'ed before the "objects/" directory does,
> and you end up with the public sites having references to objects that
> don't even _exist_ on those public sites any more.
>
> And once the mirroring completes, the issue just goes away, which explains
> why it just magically works five minutes later.
If kernel.org isn't using it already, I've found the --delay-updates
option of rsync works reasonably well and can cut down the
race-condition window. It does use more memory and disk space, however.
atomic-rsync (a perl front-end distributed with the rsync source) takes
even more disk space but works across an entire subdirectory all at once
(with 2 renames)
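The "2 renames" idea can be sketched as follows. This is my own illustration of the general technique, not the atomic-rsync script itself: the new copy is staged next to the live directory, then swapped in. The directory and file names are invented.

```python
import os
import tempfile

def swap_directory(live, staged):
    """Replace directory `live` with `staged` using two renames."""
    old = live + ".old"
    os.rename(live, old)     # rename 1: move the current tree aside
    os.rename(staged, live)  # rename 2: the new tree appears in one step
    return old

with tempfile.TemporaryDirectory() as root:
    live = os.path.join(root, "git.git")
    staged = os.path.join(root, "git.git.new")
    # Build a "live" tree and a fully transferred "staged" tree.
    for path, tip in ((live, "old-tip"), (staged, "new-tip")):
        os.makedirs(path)
        with open(os.path.join(path, "ref"), "w") as fh:
            fh.write(tip)
    swap_directory(live, staged)
    with open(os.path.join(live, "ref")) as fh:
        visible_tip = fh.read()
    staged_gone = not os.path.exists(staged)
```

Readers never see a half-copied tree, at the price of keeping a second full copy on disk. There is still a very short window between the two renames in which the live path does not exist.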